Solution: OS Patching

Performing system upgrades is likely something you want to automate, but let's do everything manually as an exercise.

Node Patching

For all nodes (you likely want to start with a worker node, just to be sure the updates don't break something horribly):

  1. Drain the Node of all possible Pods
    # on control plane node (or anywhere you can access kubectl)
    kubectl get nodes
    # this command can fail - pay attention if it does! See https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/#the-eviction-api
    kubectl drain NODE_NAME --ignore-daemonsets
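    # if the drain is blocked by a PodDisruptionBudget, list PDBs to see which workload (and team) owns it
    kubectl get poddisruptionbudgets --all-namespaces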
  2. Hop into the node and perform system upgrades
    vagrant ssh NODE_NAME
    
    sudo -i
    
    # list held packages - should include all kube* binaries and container runtime
    apt-mark showhold
    
    # patch everything not in that list, requires confirmation
    apt-get update && apt-get upgrade
    
    # informational only; view logs for patched software versions
    awk '$3=="upgrade"' /var/log/dpkg.log
    
    # optional - only if some patches require a restart
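    # assumes an Ubuntu base box: packages that need a restart create this flag file
    [ -f /var/run/reboot-required ] && echo "reboot required"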
    reboot
  3. Allow workloads to be scheduled on this node again
    # validate node state
    kubectl get node NODE_NAME
    kubectl describe node NODE_NAME
    
    # allow scheduling again
    kubectl uncordon NODE_NAME
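    # optional - confirm that Pods are being scheduled onto the node again
    kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=NODE_NAME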

Conclusions

Patching is relatively straightforward; just be sure (as with ANY maintenance procedure on any node...) that you drain the Node first so you don't interrupt your workloads! To ensure workloads in the cluster (especially stateful ones!) can handle node outages, make sure your customers know about and use PodDisruptionBudgets where necessary. If draining a node fails due to a PDB, you'll need to work directly with the application owner and decide when/how to evict that Pod so the upgrade can proceed. As an aside, this is one great reason to have a useful labeling strategy for all primitives; I'd also recommend putting team contact info (mailing list/on-call number) in a standardized annotation. With that in place, you can easily migrate to an automated patching strategy - one that automatically notifies users when their Pods are blocking cluster maintenance. Slick!
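
For example, a team might pair a PDB with a contact annotation along these lines (a minimal sketch; the app name, label, annotation key, and contact address are placeholders):

    # keep at least one replica of "payments" up during voluntary disruptions such as drains
    kubectl create poddisruptionbudget payments-pdb --selector=app=payments --min-available=1

    # standardized contact annotation so automated patching can notify the owning team
    kubectl annotate deployment payments example.com/owner-contact="payments-oncall@example.com"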

If you've already got an automated patching strategy and only want Kubernetes-safe draining/rebooting/uncordoning of all the nodes in your cluster, check out the Kubernetes Reboot Daemon (Kured) by Weaveworks.