
I have a weird issue in my EKS cluster.

I have a deployment that uses a dedicated service account. The service account has an annotation referencing an IAM role in AWS:

eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxx:role/prod-us-west-2-default-xxxxx 
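For context, the full service account manifest looks roughly like this (the name, namespace, and role ARN are placeholders standing in for the redacted values above):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                  # placeholder name
  namespace: default
  annotations:
    # IRSA: lets pods using this service account assume the IAM role
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/prod-us-west-2-default-example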

For some reason the deployment will not scale up. When I look at the events, this is what I see:

14s  Warning  FailedCreate  job/xxxxx-shared-db-migration  Error creating: Timeout: request did not complete within requested timeout - context deadline exceeded
32m  Warning  FailedCreate  job/xxxxx-shared-db-migration  Error creating: Post "https://172.16.112.21:443/api/v1/namespaces/default/pods":

The describe output for the deployment shows:

Conditions:
  Type            Status  Reason
  ----            ------  ------
  Available       False   MinimumReplicasUnavailable
  Progressing     False   ProgressDeadlineExceeded
  ReplicaFailure  True    FailedCreate

Obviously there are no pod/container logs or errors, since the pod never even gets created.

If I remove the IAM role annotation from the service account, the deployment starts without an issue.

Any ideas? Any help would be appreciated.

1 Answer


The problem was with Kubernetes admission webhooks.

I had recently installed the Datadog agent, and it turns out it requires port 8000 to be open for inbound traffic. If the port is not open, the EKS control plane's calls to the Datadog admission webhook time out, and pod creation requests fail with the deadline errors above.

The minute I deleted the webhook, the issue resolved itself. After contacting Datadog support I learned about the port 8000 requirement.
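For anyone hitting the same symptom, the registered admission webhooks can be inspected and, as a last resort, removed with kubectl. The webhook configuration name used below (datadog-webhook) is an assumption; check the list output for the actual name in your cluster:

```shell
# List admission webhooks registered in the cluster; these are invoked
# by the API server on pod creation, so a broken one blocks new pods.
kubectl get mutatingwebhookconfigurations
kubectl get validatingwebhookconfigurations

# Inspect the suspect webhook (hypothetical name) to see its
# failurePolicy and timeoutSeconds, which control how failures behave.
kubectl describe mutatingwebhookconfiguration datadog-webhook

# As a last resort, deleting the configuration unblocks pod creation.
kubectl delete mutatingwebhookconfiguration datadog-webhook
```

A less destructive fix is to open port 8000 from the control plane to the nodes in the cluster security group, so the webhook becomes reachable again instead of being removed.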
