Likely a duplicate of #96, but I'm opening a new issue as requested here since that one is old/possibly no longer relevant.
On 6/6/24 I observed DNS resolution for a storage service in our cluster taking multiple seconds to complete, and I am now observing it again. Here are my notes from the first occurrence.
First, I made a request from within our app container to the public hostname of the service:
root@app-55f755fc8c-88gvj:/app# time curl --location 'https://<hostname>/api/v1/ping'
pong
real 0m5.196s
user 0m0.020s
sys 0m0.006s
I then tried the cluster internal hostname:
/app # time curl --location '<service>.<namespace>.svc.cluster.local:443/api/v1/ping'
pong
real 0m 7.52s
user 0m 0.00s
sys 0m 0.00s
Making a request to the IP address was very fast:
/app # time curl --location '<IP>:443/api/v1/ping'
pong
real 0m 0.00s
user 0m 0.00s
sys 0m 0.00s
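One way to confirm that the delay sits in name resolution rather than in the connection or the service itself is to have curl report its own timing breakdown. This is a minimal sketch, assuming a curl build that supports the `--write-out` timing variables; the function name `curl_timing` is made up for illustration.

```shell
# Print curl's timing breakdown for one request.
# Each value is measured from the start of the request:
#   time_namelookup - until name resolution finished
#   time_connect    - until the TCP connection was established
#   time_total      - until the whole transfer finished
curl_timing() {
  curl --silent --show-error --location --output /dev/null \
    --write-out 'dns=%{time_namelookup}s connect=%{time_connect}s total=%{time_total}s\n' \
    "$@"
}
```

Running `curl_timing 'https://<hostname>/api/v1/ping'` and then `curl_timing '<IP>:443/api/v1/ping'` from the app container should show whether the multi-second gap appears in the `dns=` field.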
I believe this was an issue with kube DNS timing out for the following reasons:
I updated the app deployment to use dnsPolicy: Default instead of dnsPolicy: ClusterFirst as an experiment. After doing this, requests to the public hostname of this service from within the app container were fast. My understanding, based on these docs, is that with dnsPolicy: Default the pod skips kube DNS and inherits the name resolution configuration of the node it runs on.
Following the DNS debugging guide, I observed the following unexpected logs in the kube-dns pods:
When I deleted the kube-dns pods and allowed the deployment to recreate them, the problem seemed to resolve, at least for the time being.
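For anyone retracing these steps, the dnsPolicy experiment and the kube-dns checks above can be sketched with kubectl roughly as follows. This is a sketch under assumptions, not necessarily the exact commands used: the function names are made up for illustration, the deployment name passed to `set_dns_policy` is a placeholder, and the `k8s-app=kube-dns` label is the one used in the DNS debugging guide, so it may differ on other clusters.

```shell
# Build a merge patch that sets dnsPolicy on a deployment's pod template.
# $1 = dnsPolicy value, e.g. Default or ClusterFirst.
dns_policy_patch() {
  printf '{"spec":{"template":{"spec":{"dnsPolicy":"%s"}}}}' "$1"
}

# Apply the patch to a deployment; this triggers a rollout of new pods.
# $1 = deployment name (placeholder), $2 = dnsPolicy value.
set_dns_policy() {
  kubectl patch deployment "$1" --type merge --patch "$(dns_policy_patch "$2")"
}

# Show recent logs from every container in the kube-dns pods.
# The label selector is an assumption taken from the DNS debugging guide.
kube_dns_logs() {
  kubectl logs --namespace=kube-system --selector=k8s-app=kube-dns \
    --all-containers --prefix --tail=100
}

# Delete the kube-dns pods so that their deployment recreates them.
recreate_kube_dns() {
  kubectl delete pod --namespace=kube-system --selector=k8s-app=kube-dns
}
```

For example, `set_dns_policy <deployment> Default` switches the app to the node's resolver configuration, and `set_dns_policy <deployment> ClusterFirst` switches it back once the experiment is done.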
Unfortunately, I'm not sure how to reproduce this issue. This similar issue suggests this error may occur when a deployment disconnects from the kube API server: cert-manager/cert-manager#4685 (comment)
I didn't observe any restarts for the kube-dns pods:
➜ ~ k get po -n kube-system
NAME READY STATUS RESTARTS AGE
...
kube-dns-f65b59b6b-bkqw9 4/4 Running 0 31h
kube-dns-f65b59b6b-v72bv 4/4 Running 0 2d10h
k8s version:
➜ ~ k version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.10-gke.1075001