I deployed a Helm chart onto an isolated server, and the HTTPS POST it makes to the kube-apiserver while setting up a self-signed certificate is failing with this error:

curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to kubernetes.default.svc:443

Anyone seen it before? This is the “POST”:

    echo "Creating a secret for the certificate and keys"
    STATUS=$(curl -ik \
      -o ${TMP_DIR}/output \
      -w "%{http_code}" \
      -X POST \
      -H "Authorization: Bearer $TOKEN" \
      -H 'Accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
      "kind": "Secret",
      "apiVersion": "v1",
      "metadata": {
        "name": "spark-webhook-certs",
        "namespace": "'"$NAMESPACE"'"
      },
      "data": {
        "ca-cert.pem": "'"$ca_cert"'",
        "ca-key.pem": "'"$ca_key"'",
        "server-cert.pem": "'"$server_cert"'",
        "server-key.pem": "'"$server_key"'"
      }
    }' \
      https://kubernetes.default.svc/api/v1/namespaces/${NAMESPACE}/secrets

The error occurs during the self-signed certificate setup, in a .sh script that is invoked by a Docker image command. The script is here: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/hack/gencerts.sh

I know the script isn't reaching the end, because it fails to create the secret it's trying to POST. Where do y'all think I should look to start troubleshooting? I've posted additional info, including screenshots, here: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/926

1 Answer

I would try to update the contents of gencerts.sh to get some more context on the error:

  1. Add the -v or --verbose option to the curl command.
  2. Invoke the curl command under strace.

Both of those options will send more output to stderr, so you should be able to inspect it in your logs and get a better idea of the failure mode. Fair warning: strace will generate a lot of output.
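
As a rough sketch of those two options, assuming a POSIX shell with curl and strace available in the container: the function names and the log directory argument below are made up for illustration and are not part of gencerts.sh. TOKEN is the same bearer token variable the script already uses.

```shell
#!/bin/sh
# Hypothetical helpers (not from gencerts.sh): rerun a request with more
# diagnostics and keep stderr in a log file for later inspection.

debug_curl() {
  url="$1"
  log_dir="${2:-/tmp}"
  # -v prints the TLS handshake and the request/response headers on stderr.
  curl -v -k -sS -o /dev/null \
    -H "Authorization: Bearer ${TOKEN:-}" \
    "$url" 2> "${log_dir}/curl-verbose.log"
}

trace_curl() {
  url="$1"
  log_dir="${2:-/tmp}"
  # -f follows child processes, -e trace=network keeps only socket-related
  # syscalls, -o writes the trace to a file instead of stderr.
  strace -f -e trace=network -o "${log_dir}/curl-strace.log" \
    curl -k -sS -o /dev/null "$url"
}

# Usage (inside the cluster):
#   debug_curl "https://kubernetes.default.svc/api/v1/namespaces/${NAMESPACE}/secrets"
#   trace_curl "https://kubernetes.default.svc/api/v1/namespaces/${NAMESPACE}/secrets"
```

Limiting strace to network syscalls keeps the output manageable. Depending on the curl version, the verbose log can include the Authorization header, so treat it as sensitive. strace may also need extra privileges inside a container.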

Another source of information would be the kube-apiserver logs. You'll need to enable collection of master logs by adjusting the configuration of your cluster. You should expect every API request to be logged by kube-apiserver.
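
For example, on a self-managed cluster where kube-apiserver runs as a static pod (an assumption; managed clusters usually expose control plane logs through the provider's logging service instead), something like the following could pull the logs and narrow them to the secrets endpoint. The function names are hypothetical, and request lines only show up if the API server's log verbosity or audit logging is configured to record them.

```shell
#!/bin/sh
# Assumption: kubeadm-style cluster, kube-apiserver pod labelled
# component=kube-apiserver in the kube-system namespace.
apiserver_logs() {
  # --tail=-1 returns all lines; with a label selector the default is 10.
  kubectl -n kube-system logs -l component=kube-apiserver --tail=-1
}

# Keep only lines that mention the secrets endpoint of one namespace.
secret_requests() {
  ns="$1"
  grep -F "/api/v1/namespaces/${ns}/secrets"
}

# Usage:
#   apiserver_logs | secret_requests "$NAMESPACE"
```

If nothing matches while the script is failing, that suggests the request never reached the API server.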

The first question here is whether the request is received by the control plane at all. To troubleshoot this, I would get a shell on a container inside the cluster and try to recreate the curl request that gencerts.sh is making. There is some information on accessing the cluster API without kubectl in the kubernetes docs.
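
A minimal sketch of that, following the pattern in the Kubernetes docs for calling the API from inside a pod. The function name probe_api is hypothetical, and the service account paths are the standard mount location, which I'm assuming applies to your pod. It sends a read-only GET to the same path gencerts.sh POSTs to, so it exercises DNS, the TCP connection, and the TLS handshake without creating anything.

```shell
#!/bin/sh
# Run this from a shell inside a pod in the cluster, for example one started
# with `kubectl run ... --rm -it --image=<an image that has curl> -- sh`.

probe_api() {
  sa_dir="${1:-/var/run/secrets/kubernetes.io/serviceaccount}"
  api="${2:-https://kubernetes.default.svc}"
  ns=$(cat "${sa_dir}/namespace")
  token=$(cat "${sa_dir}/token")
  # --cacert verifies the API server against the cluster CA instead of
  # skipping verification with -k; -v shows where the connection stops.
  curl -v --cacert "${sa_dir}/ca.crt" \
    -H "Authorization: Bearer ${token}" \
    "${api}/api/v1/namespaces/${ns}/secrets"
}

# Usage:
#   probe_api
```

The pod's service account may not be allowed to list secrets, so a 403 here is fine: it still shows the request reached the API server. An SSL_ERROR_SYSCALL from this pod as well would point at the network path rather than at gencerts.sh.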

3 Comments

I’ve wanted to do that; the only issue is that I don’t have access to the Dockerfile that builds the image containing gencerts.sh. I’ll need to think about whether I can base an image off that image and somehow remove and replace the file.
I added some info on collecting kube-apiserver logs which should tell you whether the failing request is even making it through (I suspect it is not). Also, rather than manipulate the image, you could try to replicate the failure in an arbitrary container that you create for troubleshooting.
You are right: k8s.io/client-go/informers/factory.go:134: Watch close - *v1beta1.ValidatingWebhookConfiguration total 0 items received. But the kube-apiserver logs don't give me much info either. I also see the following in the kubectl logs:
1) unmounted volumes=[webhook-certs], unattached volumes=[webhook-certs sparkoperator-1590335342-token-49pmd]: timed out waiting for the condition
2) "webhook-certs" : secret "spark-webhook-certs" not found
What do you think of github.com/jetstack/cert-manager/issues/1425? I'm seeing "spark-webhook-certs timeout" followed by a 404 later.
