
error: Metrics API not available #1282

Open
halohsu opened this issue Jul 1, 2023 · 33 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@halohsu

halohsu commented Jul 1, 2023

What happened:

kubectl get pods --all-namespaces 
NAMESPACE     NAME                                       READY   STATUS             RESTARTS       AGE
kube-system   calico-kube-controllers-85578c44bf-526bd   1/1     Running            0              89m
kube-system   calico-node-4x7zk                          1/1     Running            0              80m
kube-system   calico-node-6bfnp                          1/1     Running            5 (84m ago)    119m
kube-system   calico-node-79tnt                          1/1     Running            0              71m
kube-system   calico-node-h99hx                          1/1     Running            0              82m
kube-system   calico-node-r4dk4                          1/1     Running            0              83m
kube-system   calico-typha-866bf4ccff-xb4kl              1/1     Running            0              89m
kube-system   coredns-5d78c9869d-gbhnw                   0/1     CrashLoopBackOff   39 (10s ago)   159m
kube-system   coredns-5d78c9869d-zklwl                   0/1     CrashLoopBackOff   39 (16s ago)   159m
kube-system   etcd-k0.xlab.io                            1/1     Running            2              159m
kube-system   kube-apiserver-k0.xlab.io                  1/1     Running            0              159m
kube-system   kube-controller-manager-k0.xlab.io         1/1     Running            0              159m
kube-system   kube-proxy-8wrl7                           1/1     Running            0              71m
kube-system   kube-proxy-9d5xs                           1/1     Running            0              82m
kube-system   kube-proxy-ksq4n                           1/1     Running            0              83m
kube-system   kube-proxy-r926v                           1/1     Running            0              159m
kube-system   kube-proxy-w954b                           1/1     Running            0              80m
kube-system   kube-scheduler-k0.xlab.io                  1/1     Running            0              159m
kube-system   metrics-server-7866664974-bzt4j            1/1     Running            0              2m29s
kubectl apply -f metrics-server.yaml
kubectl top node
error: Metrics API not available

What you expected to happen: kubectl top node should show node metrics.

Anything else we need to know?: I am using the latest release of the metrics-server manifest.

Environment:

  • Kubernetes distribution (GKE, EKS, Kubeadm, the hard way, etc.): Kubeadm on my local servers.

  • Container Network Setup (flannel, calico, etc.): Calico

  • Kubernetes version (use kubectl version):

kubectl version -o yaml
clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T09:47:40Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
  • Metrics Server manifest:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls=true
        image: registry.k8s.io/metrics-server/metrics-server:v0.6.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
  • Kubelet config:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <MyKey>
    server: https://k0.xlab.io:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: system:node:k0.xlab.io
  name: system:node:k0.xlab.io@kubernetes
current-context: system:node:k0.xlab.io@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:k0.xlab.io
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
  • Metrics server logs:
I0701 16:17:05.854367       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0701 16:17:06.442804       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0701 16:17:06.442814       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0701 16:17:06.443407       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0701 16:17:06.443418       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0701 16:17:06.444083       1 secure_serving.go:267] Serving securely on [::]:4443
I0701 16:17:06.444097       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0701 16:17:06.444102       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0701 16:17:06.444104       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0701 16:17:06.444098       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
W0701 16:17:06.444137       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0701 16:17:06.544671       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0701 16:17:06.544688       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0701 16:17:06.544698       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
  • Status of Metrics API:
Name:         v1beta1.metrics.k8s.io
Namespace:    
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2023-07-01T16:17:04Z
  Resource Version:    20032
  UID:                 bb670fc2-666f-4617-ac5b-4405bbb2328c
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2023-07-01T16:17:04Z
    Message:               failing or missing response from https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1: Get "https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

/kind bug
@k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 1, 2023
@halohsu
Author

halohsu commented Jul 2, 2023

Help me pls!!

@yangjunmyfm192085
Contributor

Hi @bluemiaomiao, has the issue been solved?
metrics-server needs two scrape cycles to provide metrics.
If kubectl top node still shows no metrics after two scrape cycles, please continue to provide the logs of metrics-server.

@halohsu
Author

halohsu commented Jul 6, 2023

@yangjunmyfm192085 No metrics are being served yet, and I haven't investigated what is happening internally.

@masazumi9527

I have an almost identical problem! I used the Helm chart to install the metrics server on one master and one worker, and kubectl top node doesn't work, just like for @bluemiaomiao.

Setting hostNetwork: true in values.yaml works around it, but why doesn't metrics-server work over the CNI (pod) network?

@dashpole

/assign @yangjunmyfm192085
/triage accepted

@k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 27, 2023
@yangjunmyfm192085
Contributor

Hi @bluemiaomiao @masazumi9527, could you help provide more metrics-server logs?
From the previous log, the metrics-server itself is working normally.
It just looks like the APIService is not reachable:
Message: failing or missing response from https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1: Get "https://10.104.75.22:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

@masazumi9527 did hostNetwork: true solve your issue?
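
For anyone hitting the same FailedDiscoveryCheck, a few commands usually narrow it down (a sketch; the names assume the default kube-system install shown in this issue):

# Is the aggregated API registered, and is it Available?
kubectl get apiservice v1beta1.metrics.k8s.io

# Does the Service actually have endpoints behind it?
kubectl -n kube-system get endpoints metrics-server

# Can the API server reach the metrics API end to end?
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

# Recent metrics-server logs
kubectl -n kube-system logs deploy/metrics-server --tail=50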

@aws-apradana

It does not work for me either. The metrics-server pods are running, I have set the --kubelet-insecure-tls flag, and error: Metrics API not available still shows when I run kubectl top node or kubectl top pod.

However, when I query kubectl get --raw /api/v1/nodes/ip-172-31-7-243/proxy/metrics/resource, it does return something like this:

# HELP container_cpu_usage_seconds_total [ALPHA] Cumulative cpu time consumed by the container in core-seconds
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-6p4n2"} 11.147192766 1691263859507
container_cpu_usage_seconds_total{container="coredns",namespace="kube-system",pod="coredns-5d78c9869d-kd62l"} 10.973078388 1691263849797
...
...

@yangjunmyfm192085
Contributor

[quoting @aws-apradana's report above]

Can you provide the logs of the metrics-server?

@Kikyo-chan

Kikyo-chan commented Aug 7, 2023

I also had the same problem:

[screenshots omitted]

@henzbnzr

https://www.youtube.com/watch?v=0UDG52REs68

@brosef

brosef commented Aug 22, 2023

Default containerPort is wrong in the latest release - #1236. Try overriding that to 10250.
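
If you installed via the Helm chart, that override is a single value (a minimal sketch; containerPort is the same chart value referenced in later comments in this thread):

# values.yaml (sketch)
containerPort: 10250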

@LeoShivas

I encounter the same issue. Fresh install of Kubernetes, and applying https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml or deploying with Helm with default values still gives these errors:

Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       app.kubernetes.io/instance=metrics-server
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=metrics-server
              app.kubernetes.io/version=0.6.4
              helm.sh/chart=metrics-server-3.11.0
Annotations:  meta.helm.sh/release-name: metrics-server
              meta.helm.sh/release-namespace: default
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2023-09-22T20:01:52Z
  Managed Fields:
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:meta.helm.sh/release-name:
          f:meta.helm.sh/release-namespace:
        f:labels:
          .:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
          f:helm.sh/chart:
      f:spec:
        f:group:
        f:groupPriorityMinimum:
        f:insecureSkipTLSVerify:
        f:service:
          .:
          f:name:
          f:namespace:
          f:port:
        f:version:
        f:versionPriority:
    Manager:      helm
    Operation:    Update
    Time:         2023-09-22T20:01:52Z
    API Version:  apiregistration.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .:
          k:{"type":"Available"}:
            .:
            f:lastTransitionTime:
            f:message:
            f:reason:
            f:status:
            f:type:
    Manager:         kube-apiserver
    Operation:       Update
    Subresource:     status
    Time:            2023-09-23T18:42:54Z
  Resource Version:  283501
  UID:               2739dbe3-a6b0-4e50-a91a-dc7497af7658
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       default
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2023-09-22T20:01:53Z
    Message:               failing or missing response from https://10.110.138.40:443/apis/metrics.k8s.io/v1beta1: Get "https://10.110.138.40:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

There is no error in the container logs:

I0922 20:02:29.383074       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0922 20:02:31.376931       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0922 20:02:31.376974       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0922 20:02:31.377020       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0922 20:02:31.377033       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0922 20:02:31.377063       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0922 20:02:31.377069       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0922 20:02:31.377415       1 secure_serving.go:267] Serving securely on :10250
I0922 20:02:31.377452       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0922 20:02:31.377869       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0922 20:02:31.378482       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0922 20:02:31.477787       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0922 20:02:31.477820       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0922 20:02:31.477907       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
Stream closed EOF for default/metrics-server-76c55fc4fc-5hdpv (metrics-server)

@henzbnzr

Did you check this step and edit the metrics-server file?
Link: https://www.youtube.com/watch?v=0UDG52REs68

@LeoShivas

Hi @henzbnzr,

Thank you.

But, first, I don't want to use the --kubelet-insecure-tls option.

Second, this option should go in the args section, not in the command one.

Third, even when trying this option, I still get the unable to load configmap based request-header-client-ca-file: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication" error.

The extension-apiserver-authentication config map exists:

NAMESPACE                                     NAME                                                 
default                                       kube-root-ca.crt                                     
kube-node-lease                               kube-root-ca.crt                                     
kube-public                                   cluster-info                                         
kube-public                                   kube-root-ca.crt                                     
kube-system                                   cilium-config                                        
kube-system                                   coredns                                              
kube-system                                   extension-apiserver-authentication                   
kube-system                                   kube-apiserver-legacy-service-account-token-tracking 
kube-system                                   kube-proxy                                           
kube-system                                   kube-root-ca.crt                                     
kube-system                                   kubeadm-config                                       
kube-system                                   kubelet-config                                       

However, I don't know if it's relevant, but the error says request-header-client-ca-file while the ConfigMap contains a requestheader-client-ca-file cert (the names differ by one dash).
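
That Get "https://10.96.0.1:443/..." failure points more at pod-to-API-server connectivity than at RBAC or the ConfigMap name. A quick check (a sketch; the pod name and image are just examples):

kubectl -n kube-system run api-check --rm -it --restart=Never \
  --image=curlimages/curl --command -- curl -k -m 5 https://10.96.0.1:443/healthz

If that times out from a pod but the same URL responds from the node, the CNI (Cilium here) or a network policy is blocking the path.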

@ata666

ata666 commented Oct 11, 2023

I also have the same problem, and it is still unresolved. Please help me, thank you.
My configuration snippet is as follows: I added hostNetwork: true and also added --kubelet-insecure-tls. The pod runs normally and no errors are reported (the log is below). However, when executing kubectl top pod, the error error: Metrics API not available is still reported. Please help me, thank you.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.4

[root@master metrics-server]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-66f779496c-cmndx 1/1 Running 0 19d
coredns-66f779496c-ztmmb 1/1 Running 0 19d
etcd-master 1/1 Running 10 (36d ago) 48d
kube-apiserver-master 1/1 Running 10 (36d ago) 48d
kube-controller-manager-master 1/1 Running 13 (19d ago) 48d
kube-proxy-4rdfh 1/1 Running 10 (36d ago) 48d
kube-proxy-lv2gq 1/1 Running 3 (19d ago) 48d
kube-proxy-pzskd 1/1 Running 2 (36d ago) 47d
kube-scheduler-master 1/1 Running 12 (19d ago) 48d
metrics-server-59dc595f65-spbh7 1/1 Running 0 12m

[root@master metrics-server]# kubectl logs metrics-server-59dc595f65-spbh7 -n kube-system
I1011 13:35:27.396842 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1011 13:35:28.011295 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1011 13:35:28.011318 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1011 13:35:28.011389 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1011 13:35:28.011406 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1011 13:35:28.011424 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1011 13:35:28.011432 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1011 13:35:28.011509 1 secure_serving.go:267] Serving securely on [::]:4443
I1011 13:35:28.011544 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I1011 13:35:28.011988 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1011 13:35:28.012148 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I1011 13:35:28.111522 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1011 13:35:28.111650 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1011 13:35:28.111672 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file

[root@master metrics-server]# kubectl get apiservices | grep metrics
v1beta1.metrics.k8s.io kube-system/metrics-server False (FailedDiscoveryCheck) 48s

[root@master metrics-server]# kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apiregistration.k8s.io/v1","kind":"APIService","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"v1beta1.metrics.k8s.io"},"spec":{"group":"metrics.k8s.io","groupPriorityMinimum":100,"insecureSkipTLSVerify":true,"service":{"name":"metrics-server","namespace":"kube-system"},"version":"v1beta1","versionPriority":100}}
  creationTimestamp: "2023-10-16T03:48:25Z"
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
  resourceVersion: "9122129"
  uid: df0b8054-7456-4158-8b96-78b1dad148d0
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:

[root@master metrics-server]# kubectl top node
error: Metrics API not available

@AymenFJA

AymenFJA commented Oct 18, 2023

This worked for me, thanks to @NileshGule:

  1. Deploy the metrics server ([deploy metrics server](https://gist.github.com/NileshGule/8f772cf04ea6ae9c76d3f3e9186165c2#deploy-metrics-server)):

$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
  2. Open the file in editor mode:
$ k -n kube-system edit deploy metrics-server
  3. Under the containers section, add only the command part:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
  4. Check if the metrics-server is running now:
$ k -n kube-system get pods
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-9d57d8f49-d26pd   1/1     Running   3          25h
canal-5xf7z                               2/2     Running   0          11m
canal-mgtxd                               2/2     Running   0          11m
coredns-7cbb7cccb8-gpnp5                  1/1     Running   0          25h
coredns-7cbb7cccb8-qqcs6                  1/1     Running   0          25h
etcd-controlplane                         1/1     Running   0          25h
kube-apiserver-controlplane               1/1     Running   2          25h
kube-controller-manager-controlplane      1/1     Running   2          25h
kube-proxy-mk759                          1/1     Running   0          25h
kube-proxy-wmp2n                          1/1     Running   0          25h
kube-scheduler-controlplane               1/1     Running   2          25h
metrics-server-678d4b775-gqb65            1/1     Running   0          48s
  5. Now try the top command:
 $ k top node
controlplane $ k top node
NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
controlplane   85m          8%     1211Mi          64%       
node01         34m          3%     957Mi           50%     
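
The same edit can be applied non-interactively (a sketch; it appends the flag to the existing args instead of adding a command: block):

kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'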

@MahiraTechnology

MahiraTechnology commented Oct 25, 2023

I am seeing the same issue on Kubernetes 1.27 with metrics-server 0.6.3. I have opened #1352 for the same.
I am able to run top node and top pod commands.

@jangalapallisr

jangalapallisr commented Nov 8, 2023

I had a similar issue: metrics-server is up and running, but the top command does not work as expected and reports "error: Metrics API not available" on Kubernetes 1.28 with Calico as the pod network.
My container runtime is CRI-O, and Kubernetes was installed with kubeadm (v1.28.2) on Ubuntu machines (5-node cluster).
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.3
Calico: v3.26.1
metrics-server:v0.6.4
Since Calico is my CNI plugin, I just added the two lines below to my metrics-server deployment, with reference to https://datacenterdope.wordpress.com/2020/01/20/installing-kubernetes-metrics-server-with-kubeadm/

- --kubelet-insecure-tls ---> added under the container's args (spec.template.spec.containers[0].args)
hostNetwork: true ---> added at the pod spec level (spec.template.spec), as a sibling of containers
After adding the two lines above to the metrics-server deployment, the top command started working, because the metrics-server pod could communicate with the API server; otherwise you may end up seeing readiness probe failures for the metrics-server deployment. See the sketch below.

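For reference, the two additions land roughly here in the Deployment (a sketch of only the relevant part of the pod template; the image and other fields are unchanged):

spec:
  template:
    spec:
      hostNetwork: true            # pod-spec level, sibling of containers
      containers:
      - name: metrics-server
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-insecure-tls   # added flag
        # ...remaining args unchanged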

@theten52

This worked for me, thanks to @NileshGule:
[quoting the step-by-step fix from @AymenFJA's comment above]

Thanks so much for this! It works for me!

@LeoShivas

IMHO, if you ended up adding --kubelet-insecure-tls to work around your problem, that means you haven't resolved the root cause.
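
For kubeadm clusters, the documented alternative to --kubelet-insecure-tls is to have the kubelets request serving certificates signed by the cluster CA instead of self-signing them (a sketch; it requires restarting kubelet on each node and approving the resulting CSRs):

# in the kubelet configuration (kubelet-config ConfigMap / /var/lib/kubelet/config.yaml)
serverTLSBootstrap: true

# then, after restarting kubelet, approve the pending serving CSRs
kubectl get csr
kubectl certificate approve <csr-name>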

@mehdi-aghayari

mehdi-aghayari commented Jan 8, 2024

Hi,
I added a new worker to RKE2 v1.26.11, but the metrics server is not working only for the new worker-03, as shown by the command below:

kubectl top nodes
NAME         CPU CPU% MEMORY MEMORY%
worker-02  400m 25% 800Mi 37%
worker-03   <UNKNOWN> <UNKNOWN> <UNKNOWN> <UNKNOWN>

Also, the output of the command below does not contain worker-03:
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

I also configured the metrics-server deployment with the following:

      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        - --v=9

There is also this log in the metrics-server pod:
I0108 14:09:01.842852 1 decode.go:86] "Failed getting complete node metric" node="worker-03" metric=&{StartTime:0001-01-01 00:00:00 +0000 UTC Timestamp:2024-01-08 14:08:59.764 +0000 UTC CumulativeCpuUsed:0 MemoryUsage:0}

Any advice would be appreciated.

(I also asked this on Stack Overflow.)

@Tusenka

Tusenka commented Jan 22, 2024

I have the same error and got Readiness probe failed: HTTP probe failed with statuscode: 500
[screenshots omitted, including the output of k version -o yaml]

@davidwincent

This worked for me:

kustomization.yaml

resources:
  - https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.1/components.yaml
patches:
- target:
    kind: Deployment
    labelSelector: "k8s-app=metrics-server"
  patch: |-
    - op: replace
      path: /spec/template/spec/containers/0/args
      value:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
    - op: replace
      path: /spec/template/spec/containers/0/ports
      value:
        - containerPort: 4443
          name: https
          protocol: TCP
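
Applied with kubectl's built-in kustomize support (assuming the file above is saved as kustomization.yaml in the current directory):

kubectl apply -k .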

@HFourier

[quoting @Tusenka's readiness probe failure report above]

I get the same problem, how to fix it?

@HFourier

HFourier commented Apr 16, 2024

I get the same problem, how to fix it?

I have solved it. I found the metrics-server was not running on the master node. When I added the master node name in the YAML and restarted metrics-server, it worked.

spec:
  nodeName: <your master node name>
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=10250
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    command:
    - /metrics-server
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP
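
A less rigid variant of the same idea, used further down in this thread, pins the pod by node role instead of by node name (a sketch; it assumes the control-plane nodes are schedulable or that the pod tolerates their taint):

spec:
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""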

@tienhuynh17

[quoting @HFourier's nodeName fix above]

This works for me. Thank you!

@chenjilan123

[quoting @HFourier's nodeName fix and @tienhuynh17's reply above]

Thanks, wonderful. This worked for me and solved the problem.

@edernucci

[quoting @HFourier's nodeName fix above]

Amazing solution, thanks for sharing. Working on Hetzner Cloud, Kubernetes 1.30.

@ambarishsatheesh-rs

[quoting @HFourier's nodeName fix above]

What's the reason for this? Is there some documentation about why metrics-server needs to be on a control-plane node?

@jmutai

jmutai commented Aug 2, 2024

I struggled to solve the issue and documented the steps after finding the solution - https://computingforgeeks.com/fix-error-metrics-api-not-available-in-kubernetes/

@RazaGR

RazaGR commented Aug 4, 2024

With the Helm chart, this solved it:

containerPort: 4443
hostNetwork:
  enabled: true
defaultArgs:
  - --cert-dir=/tmp
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  - --secure-port=4443
  - --kubelet-insecure-tls
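
For reference, those values can be applied with something like the following (a sketch; it assumes the chart from the kubernetes-sigs Helm repo):

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server \
  -n kube-system -f values.yaml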

@francisco-mata

Verify that your kube-apiserver pods in kube-system are communicating properly with the aggregated API; proxy settings can get in the way. I had to add no_proxy to /etc/kubernetes/manifests/kube-apiserver.yaml so it could reach the APIs properly, and that fixed my issue; I added 10.0.0.0/8 to cover my services subnet.
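
Roughly, the exclusion goes into the static pod manifest like this (a sketch; the env var name and the CIDR are just this example, and it only matters if HTTP(S) proxy variables are already set there):

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
  - name: kube-apiserver
    env:
    - name: NO_PROXY
      value: "10.0.0.0/8,127.0.0.1,localhost"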

I used the Helm chart (replicas == 3 to enable HA) and also used a nodeSelector with node-role.kubernetes.io/control-plane: "". Since my masters are also workers, the metrics servers are locked to control-plane nodes only.

Hope this will help you.

@atrakic

atrakic commented Dec 23, 2024

I can confirm helm install works after adding additional config (see below).

  • kind config:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  # Patch kubeadm to add node labels for accepting ingress.
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
  • values.yaml
metrics-server:
  #replicas: 3
  podDisruptionBudget:
    enabled: true
    maxUnavailable: 1
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""
  containerPort: 4443
  hostNetwork:
    enabled: true
  defaultArgs:
    - --cert-dir=/tmp
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --secure-port=4443
    - --kubelet-insecure-tls

Without editing values.yaml I got the following log entries:

I1223 20:19:33.839235       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
E1223 20:19:48.737196       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:19:55.583459       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1223 20:20:03.747902       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:20:05.584782       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
I1223 20:20:15.583868       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1223 20:20:18.741813       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:20:25.580648       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1223 20:20:33.747969       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:20:35.580721       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
I1223 20:20:42.669951       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1223 20:20:49.030270       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:20:52.680057       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
I1223 20:21:02.679072       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1223 20:21:03.749176       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:21:12.675822       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1223 20:21:18.746595       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="helm-charts-control-plane"
I1223 20:21:22.676680       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve

My setup

kind version
kind v0.26.0 go1.23.4 darwin/arm64

helm: metrics-server 3.8.2
