
(Reposted from original post at: https://stackoverflow.com/questions/73012913/kubernetes-pull-from-image-private-network-fails-to-respect-etc-hosts-of-serv as this is a more appropriate place to ask the question)

I am running a small 3-node test Kubernetes cluster (set up with kubeadm) on Ubuntu Server 22.04, with Flannel as the network fabric. I also have a separate private GitLab server, with its container registry set up and working.

The problem I am running into is that when I apply a simple test deployment YAML, it fails to pull the image from the private GitLab server.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: platform-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: platform-service
  template:
    metadata:
      labels:
        app: platform-service
    spec:
      containers:
        - name: platform-service
          image: registry.example.com/demo/platform-service:latest

Ubuntu Server: /etc/hosts (the relevant line)

192.168.1.30 registry.example.com 

The Error

Failed to pull image "registry.example.com/demo/platform-service:latest": rpc error: code = Unknown desc = failed to pull and unpack image "registry.example.com/demo/platform-service:latest": failed to resolve reference "registry.example.com/demo/platform-service:latest": failed to do request: Head "https://registry.example.com/v2/demo/platform-service/manifests/latest": dial tcp xxx.xxx.xxx.xxx:443: i/o timeout

The 'xxx.xxx.xxx.xxx' is the address on my external network, for which a domain name exists in public DNS. However, all of my internal machines are set up to use the internal address for that name, and 'registry.example.com' stands in for my own domain.

Also to note:

docker pull registry.example.com/demo/platform-service:latest 

Run from the command line of the server, this works perfectly fine. It only fails when the image is pulled through the Kubernetes deployment YAML.

The problem

The network and the hosts file on the server are configured correctly, yet the image pull does not use the IP configured in /etc/hosts when I apply the deployment. Instead it uses a public IP that belongs to a different server, and the pull times out because that public-facing server is not set up as a registry.

When I run kubectl apply -f platform-service.yaml, why does it not respect the hosts file of the server, and is there a way to configure hosts inside Kubernetes?

(If this problem is not clear, I apologize. I am quite new and still learning the terminology, which may be why Google is not helping me with this problem.)

The closest S/O I could find is:

https://stackoverflow.com/questions/62940403/kubernetes-not-able-pull-image-from-private-registry-having-private-domain-point

(SO Answer #1): suggests hostAliases, but that applies to the pod itself, not to pulling the image. Also, my installation was done through apt (the package manager) rather than snap. The rest of the answer suggests changing the distribution, and I would rather keep my current setup than change it.

--- Update(s):

  1. I have narrowed down the problem (I believe) to needing settings in containerd, but have not yet found how to make its host resolution match the server's /etc/hosts file.
  2. I created a second Kubernetes cluster, using k3s instead of kubeadm (instructions found at https://computingforgeeks.com/install-kubernetes-on-ubuntu-using-k3s/), and am encountering the same problem.

Update

Adding a hosts entry to CoreDNS did not work either: (https://stackoverflow.com/questions/65283827/how-to-change-host-name-resolve-like-host-file-in-coredns)

kubectl -n kube-system edit configmap/coredns 
...
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    hosts custom.hosts registry.example.com {
        192.168.1.30 registry.example.com
        fallthrough
    }
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
...

I deleted the CoreDNS pods (so that they are recreated), and the image pull for the deployment still fails against the external IP address instead of the internal address.

  • I've always modified the CoreDNS configmap to point to a dnsmasq server that I install on one of the cluster nodes. All I do is change forward . /etc/resolv.conf to forward . <dnsmasq_ip>. And for good measure, I also configure systemd-resolved to point to the same dnsmasq server. It hasn't failed me yet! Commented Aug 21, 2024 at 7:40

1 Answer

After going through many different solutions and a lot of research and testing, the answer turned out to be very simple.

Solution in my case

The /etc/hosts file MUST contain the host for the registry (and possibly the entry for the gitlab instance as well) on EVERY node of the cluster including the master node.

192.168.1.30 registry.example.com
192.168.1.30 gitlab.example.com # Necessary in my case, not sure required
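A minimal sketch of adding those entries idempotently, so the same snippet can be run on every node without creating duplicates. The helper name ensure_host_entry is hypothetical (it is not part of Kubernetes or containerd), and the IP and host names are the ones from this question. It only appends a missing name; it does not correct a name that is already mapped to a different IP.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already mapped by some non-comment line. Defaults to /etc/hosts,
# which requires root to modify.
ensure_host_entry() {
  local ip="$1" name="$2" file="${3:-/etc/hosts}"

  # awk exits 0 if NAME appears as a hostname (field 2 or later) on a
  # line that is not a comment; inline "# ..." comments are ignored.
  if awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        {
          for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break
            if ($i == n) found = 1
          }
        }
        END { exit found ? 0 : 1 }' "$file"; then
    return 0   # already present, nothing to do
  fi

  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on every node including the control plane):
#   ensure_host_entry 192.168.1.30 registry.example.com
#   ensure_host_entry 192.168.1.30 gitlab.example.com
```

The optional third argument exists so the function can be tried against a scratch copy of the hosts file before touching the real one.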

Once I added those entries on each of the 2 worker nodes, the cluster attempted to pull the image and failed with a credentials error (which I was expecting to see once the hosts issue was resolved). From there I was able to add the credentials, and now the image pulls fine from the private registry rather than from the public-facing server.
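To confirm that a node resolves the registry to the internal address before retrying the deployment, a quick check along these lines can be run on each node. This is only a sketch: resolves_to is a hypothetical helper, and it assumes getent consults /etc/hosts first (the usual nsswitch setup on Ubuntu), which approximates but does not guarantee what the container runtime sees.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare the first address getent returns for a
# name with the address we expect, and report the result.
resolves_to() {
  local name="$1" expected="$2" actual

  # getent prints "ADDRESS NAME..." per line; keep the first address.
  actual="$(getent hosts "$name" | awk '{ print $1; exit }')"

  if [ "$actual" = "$expected" ]; then
    echo "OK: $name -> $actual"
  else
    echo "MISMATCH: $name -> ${actual:-unresolved} (expected $expected)"
    return 1
  fi
}

# Usage (on every node):
#   resolves_to registry.example.com 192.168.1.30
```

A mismatch on any node means pods scheduled there would still try to pull from the public address.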

Bonus: Fix for credentials error connecting to private registry (not part of the original question, but part of the setup process for connecting)

After fixing the /etc/hosts issue, you will probably need to set up 'regcred' credentials to access the private registry. The Kubernetes documentation provides the steps for that part:

https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
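As a rough sketch of what that page describes, the two steps are creating a docker-registry secret and referencing it from the pod template. The function names create_regcred and attach_regcred are hypothetical wrappers, and the user name and token are placeholders to replace with your own GitLab credentials; the deployment name is the one from the question.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: create a docker-registry secret named "regcred".
# Arguments: registry host, user name, password or access token.
create_regcred() {
  kubectl create secret docker-registry regcred \
    --docker-server="$1" \
    --docker-username="$2" \
    --docker-password="$3"
}

# Hypothetical wrapper: add imagePullSecrets to a deployment's pod
# template using kubectl's default strategic merge patch.
attach_regcred() {
  kubectl patch deployment "$1" \
    -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"regcred"}]}}}}'
}

# Usage (placeholders in angle brackets are yours to fill in):
#   create_regcred registry.example.com <gitlab-user> <gitlab-token>
#   attach_regcred platform-deployment
```

Patching is used here only to keep the example short; adding imagePullSecrets directly to the deployment YAML and re-applying it has the same effect.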

  • Are you saying this works with containerd as the container runtime? From what I have noticed so far, it works with Docker, but not with containerd. Commented Apr 5, 2023 at 15:25
