WIP: Implement system-reserved-compressible #5408
Draft
+197
−0
TODO: Before Review
What I did
This PR enables system-reserved-compressible enforcement by default for all new OpenShift 4.21+ clusters to allow better CPU allocation for system reserved processes through cgroup-based enforcement.
Template Changes:
- Add systemReservedCgroup: /system.slice to the default kubelet configuration for all node types (master, worker, arbiter)
- Add system-reserved-compressible to enforceNodeAllocatable alongside pods in the kubelet template files

Performance Profile Compatibility:
The kubelet cannot simultaneously enforce both systemReservedCgroup and --reserved-cpus (used by Performance Profiles in the Node Tuning Operator). To resolve this conflict, I added logic in the Kubelet Config Controller (pkg/controller/kubelet-config/helpers.go) to:
- Detect when reservedSystemCPUs (--reserved-cpus) is set
- Clear systemReservedCgroup and set enforceNodeAllocatable to ["pods"] only in this scenario

This approach leverages the fact that --reserved-cpus already supersedes system-reserved, making systemReservedCgroup enforcement redundant in PerformanceProfile scenarios.

Validation:
- Check that systemReservedCgroup matches systemCgroups when both are user-specified

How to verify it
For New OCP 4.21+ Clusters:
cat /etc/kubernetes/kubelet.conf | grep -A2 systemReservedCgroup
cat /etc/kubernetes/kubelet.conf | grep -A3 enforceNodeAllocatable
systemReservedCgroup: /system.slice
enforceNodeAllocatable:
For Clusters with Performance Profiles:
cat /etc/kubernetes/kubelet.conf | grep systemReservedCgroup
cat /etc/kubernetes/kubelet.conf | grep enforceNodeAllocatable
- systemReservedCgroup is NOT present (empty/cleared)
- enforceNodeAllocatable only contains ["pods"]
- Kubelet starts successfully without errors
journalctl -u kubelet | grep -iE "system-reserved|reserved-cpus"
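
For reviewers who want to relate the expected state above to the controller behavior, here is a minimal Go sketch of the rule being verified. It is not the code from pkg/controller/kubelet-config/helpers.go: the kubeletConfig struct is a trimmed-down stand-in for the upstream KubeletConfiguration type, and the function names reconcileReservedCPUs and validateSystemReservedCgroup are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// kubeletConfig is a hypothetical, trimmed-down stand-in for the upstream
// KubeletConfiguration type. Only the fields discussed in this PR are kept.
type kubeletConfig struct {
	SystemCgroups          string   // systemCgroups
	SystemReservedCgroup   string   // systemReservedCgroup
	ReservedSystemCPUs     string   // reservedSystemCPUs (--reserved-cpus)
	EnforceNodeAllocatable []string // enforceNodeAllocatable
}

// reconcileReservedCPUs (hypothetical name) clears systemReservedCgroup and
// limits enforcement to pods when reserved CPUs are set, since the two
// settings cannot be enforced together. Other configs are left untouched.
func reconcileReservedCPUs(cfg *kubeletConfig) {
	if cfg.ReservedSystemCPUs == "" {
		return
	}
	cfg.SystemReservedCgroup = ""
	cfg.EnforceNodeAllocatable = []string{"pods"}
}

// validateSystemReservedCgroup (hypothetical name) rejects a config in which
// both cgroup fields are set but point at different cgroups.
func validateSystemReservedCgroup(cfg *kubeletConfig) error {
	if cfg.SystemReservedCgroup == "" || cfg.SystemCgroups == "" {
		return nil // nothing to compare unless both are specified
	}
	if cfg.SystemReservedCgroup != cfg.SystemCgroups {
		return fmt.Errorf("systemReservedCgroup %q must match systemCgroups %q",
			cfg.SystemReservedCgroup, cfg.SystemCgroups)
	}
	return nil
}

func main() {
	// Default 4.21+ config combined with a Performance Profile that reserves CPUs.
	cfg := &kubeletConfig{
		SystemReservedCgroup:   "/system.slice",
		ReservedSystemCPUs:     "0-3",
		EnforceNodeAllocatable: []string{"pods", "system-reserved-compressible"},
	}
	reconcileReservedCPUs(cfg)
	fmt.Printf("systemReservedCgroup=%q enforceNodeAllocatable=%v\n",
		cfg.SystemReservedCgroup, cfg.EnforceNodeAllocatable)

	// User-specified cgroups that disagree should fail validation.
	bad := &kubeletConfig{SystemCgroups: "/custom.slice", SystemReservedCgroup: "/system.slice"}
	fmt.Println("validation error:", validateSystemReservedCgroup(bad))
}
```

Keeping the override and the validation as separate functions in this sketch mirrors the split in the description: the override applies only when reserved CPUs are set, while the validation applies only to user-specified values.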
For OCP 4.20 to 4.21 Upgrades:
Description for the changelog
Enable system-reserved-compressible enforcement by default in new OCP 4.21+ clusters. The kubelet now enforces CPU limits on system daemons via systemReservedCgroup (/system.slice), improving CPU allocation for system reserved processes on nodes with high CPU counts. Automatically disables systemReservedCgroup enforcement when Performance Profiles with reserved-cpus are used to prevent conflicts. Existing OCP 4.20 clusters upgrading to 4.21+ will preserve their current behavior via migration MachineConfig.
Related: