metrics-server fails to get metrics from recently started nodes #1571
Labels: kind/support, triage/accepted
What happened:
The metrics-server fails with the following error:
E0916 14:23:37.254021 1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.34.50.99:10250/metrics/resource\": remote error: tls: internal error" node="ip-10-34-50-99.eu-west-1.compute.internal"
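For what it's worth, the failing scrape can be reproduced outside of metrics-server by proxying to the kubelet's resource metrics endpoint through the API server (a minimal sketch; the node name is copied from the error above and would differ per node):

```sh
# Ask the API server to proxy to the kubelet's resource metrics endpoint;
# while the node is in the failing window this may fail with a similar TLS error.
kubectl get --raw "/api/v1/nodes/ip-10-34-50-99.eu-west-1.compute.internal/proxy/metrics/resource"
```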
What you expected to happen:
The metrics-server can get metrics from nodes successfully.
Anything else we need to know?:
This problem happens when the autoscaler (we use Karpenter) adds or removes nodes. For a brief period the new node fails to serve metrics on the `metrics/resource` endpoint, causing the HPA to record many `FailedToGetResourceMetric` events.
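Those events can be listed directly; the command below is only a generic way to surface them, not a capture from our cluster:

```sh
# List FailedToGetResourceMetric events recorded against HPAs in all namespaces
kubectl get events --all-namespaces --field-selector reason=FailedToGetResourceMetric
```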
Environment:
Kubernetes distribution (GKE, EKS, Kubeadm, the hard way, etc.):
EKS 17
Container Network Setup (flannel, calico, etc.):
Note: This issue is not network related
Kubernetes version (use `kubectl version`):
spoiler for Metrics Server manifest:
spoiler for Kubelet config:
spoiler for Metrics Server logs:
spoiler for Status of Metrics API:
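The Metrics API status can be re-checked with commands along these lines (a generic sketch, not necessarily the exact commands used for the spoiler above):

```sh
# Confirm the metrics.k8s.io APIService is registered and reports Available,
# then query node metrics directly through the aggregated API.
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
```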
/kind bug