We should reserve compute resources for system daemons #3178

Open
sameh-farouk opened this issue Jun 14, 2021 · 0 comments


Description

By default, pods can consume all the available capacity on a node. This is a problem because nodes typically run quite a few system daemons that power the OS (sshd, udev, etc.) and Kubernetes itself. Unless resources are set aside for these system daemons, pods and system daemons compete for resources, which leads to resource starvation on the node.

Without RAM and other resources set aside, the kubelet will happily let pods use it all up, and when we then try to SSH in to debug why the node has become slow and unstable, we may not be able to.

It is recommended to configure the kubelet Node Allocatable feature based on the workload density on each node.

Implementation

More info about this can be found here:
https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/
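
As a rough sketch of what the configuration could look like, the kubelet accepts `--kube-reserved`, `--system-reserved` and `--eviction-hard` flags. The sizes below are hypothetical placeholders, not measured values for our nodes, and the helper name `kubelet_reserve_flags` is made up for illustration. How the flags reach the kubelet depends on the distribution (on k3s they would presumably go through `--kubelet-arg`; that part is an assumption to verify).

```shell
# Hypothetical reservation sizes; tune them to the workload density of each node.
KUBE_RESERVED="cpu=100m,memory=256Mi"
SYSTEM_RESERVED="cpu=100m,memory=256Mi"
EVICTION_HARD="memory.available<100Mi"

# Print the kubelet flags that set aside the reservations above, one per line.
kubelet_reserve_flags() {
  printf '%s\n' \
    "--kube-reserved=${KUBE_RESERVED}" \
    "--system-reserved=${SYSTEM_RESERVED}" \
    "--eviction-hard=${EVICTION_HARD}"
}

kubelet_reserve_flags
```

On a 1 CPU / 2 GB node like ours, even small reservations take a noticeable share, so the values need to be picked per node size.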

More

Node Allocatable = Node Capacity - kube-reserved - system-reserved - eviction-threshold

On my VDC Kubernetes cluster, describing any node gives something like this:
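
To make the formula concrete, here is a small worked calculation in Ki. The capacity is the 2034804Ki reported for the node in this issue; the three reservation sizes (256Mi, 256Mi, 100Mi) are hypothetical, and `node_allocatable_ki` is just an illustrative helper.

```shell
# Node Allocatable (Ki) = capacity - kube-reserved - system-reserved - eviction-threshold
# Arguments: capacity, kube-reserved, system-reserved, eviction-threshold (all in Ki).
node_allocatable_ki() {
  echo $(( $1 - $2 - $3 - $4 ))
}

# 256Mi = 262144Ki and 100Mi = 102400Ki (hypothetical reservation sizes).
node_allocatable_ki 2034804 262144 262144 102400   # prints 1408116
```

With these assumed values, pods would be limited to about 1.34Gi of memory instead of the full 1.94Gi.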

kubectl describe node k3os-14247
...
Capacity:
  cpu:                1
...
  memory:             2034804Ki
Allocatable:
  cpu:                1
...
  memory:             2034804Ki
...

Note that the node's Allocatable is the same as its Capacity, because we don't leave any room for system daemons.

This may or may not be the reason for something like the following. Note the RESTARTS value:
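
The same comparison can be scripted instead of reading `kubectl describe` output by eye. This is only a sketch: the helper names `node_memory` and `check_reservation` are made up, and it only looks at memory.

```shell
# Print a node's memory capacity and allocatable, separated by a space.
node_memory() {
  kubectl get node "$1" \
    -o jsonpath='{.status.capacity.memory} {.status.allocatable.memory}'
}

# Report whether any memory is being held back from pods on the given node.
check_reservation() {
  out=$(node_memory "$1")
  capacity=${out% *}      # text before the space
  allocatable=${out#* }   # text after the space
  if [ "$capacity" = "$allocatable" ]; then
    echo "no reservation: allocatable equals capacity ($capacity)"
  else
    echo "reserved: capacity=$capacity allocatable=$allocatable"
  fi
}
```

For example, `check_reservation k3os-14247` would check the node shown above.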

root@zosv2-04:/sandbox/code/github/threefoldtech/js-sdk# kubectl get pods -A
NAMESPACE     NAME                                         READY   STATUS      RESTARTS   AGE
...
kube-system   local-path-provisioner-7ff9579c6-mgwnn       1/1     Running     60         10d
...
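
One way to check whether the restarts are memory related is to read the reason recorded for the container's last termination. This is a sketch with a made-up helper name (`last_termination_reason`); it assumes the pod has a single container and simply prints whatever reason the pod status holds, so it does not by itself prove that resource starvation is the cause.

```shell
# Print the reason the first container of a pod was last terminated.
# Arguments: namespace, pod name.
last_termination_reason() {
  kubectl get pod "$2" -n "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}
```

For example, `last_termination_reason kube-system local-path-provisioner-7ff9579c6-mgwnn` inspects the pod listed above.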
@sameh-farouk sameh-farouk added this to the later milestone Jun 14, 2021
@sameh-farouk sameh-farouk added this to Accepted in JS-SDK 11.6 via automation Jun 20, 2021
@sameh-farouk sameh-farouk moved this from Accepted to Backlog in JS-SDK 11.6 Jun 20, 2021