Add RKE2 Basic Resource Profiling (#202)
* Add resource profiling for RKE2
* Fix broken zh anchors
* Cleanup comments on script

Signed-off-by: Derek Nola <[email protected]>
Showing 6 changed files with 111 additions and 6 deletions.
@@ -0,0 +1,49 @@
---
title: Resource Profiling
---

This section captures the results of tests to determine minimum resource requirements for RKE2.

## Scope of Resource Testing

The resource tests were intended to address the following problem statements:

- On a single-node cluster, determine the legitimate minimum amount of CPU and memory that should be set aside to run the entire RKE2 server stack, assuming that a real workload will be deployed on the cluster.
- On an agent node, determine the legitimate minimum amount of CPU and memory that should be set aside for the kubelet and RKE2 agent components.

### Environment and Components

| Arch | OS | System | CPU | RAM | Disk |
|------|----|--------|-----|-----|------|
| x86_64 | Ubuntu 22.04 | AWS c6id.xlarge | Intel Xeon Platinum 8375C CPU, 4 cores, 2.90 GHz | 8 GB | NVMe SSD |

The tested components are:

* RKE2 v1.27.12 with all packaged components enabled, using Canal as the CNI
* [Kubernetes Example Nginx Deployment](https://kubernetes.io/docs/tasks/run-application/run-stateless-application-deployment/)

### Methodology

`systemd-cgtop` was used to track systemd cgroup-level CPU and memory utilization:

- `system.slice/rke2-server.service` tracks resource utilization for both RKE2 and containerd components.
- `system.slice/rke2-agent.service` tracks resource utilization for the agent components.

Utilization figures were based on 95th percentile readings from steady-state operation on nodes running the described workloads, giving an upper bound on typical resource usage.
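
As a rough illustration of the 95th percentile step, the sketch below parses a few lines in the `cgroup, tasks, CPU %, memory (bytes)` layout and computes the percentile with linear interpolation between the two nearest readings, which is the default behaviour of `numpy.percentile`. The sample readings and the helper names `parse_samples` and `percentile` are hypothetical and exist only for this example; they are not measurements from the test runs.

```python
# Hypothetical readings in the layout: cgroup, tasks, CPU %, memory (bytes).
# The values are invented for illustration, not taken from the test runs.
SAMPLE = """\
system.slice/rke2-server.service 120 - 1048576000
system.slice/rke2-server.service 120 10.0 1048576000
system.slice/rke2-server.service 121 12.0 2097152000
system.slice/rke2-server.service 121 14.0 3145728000
system.slice/rke2-server.service 122 16.0 4194304000
system.slice/rke2-server.service 122 30.0 5242880000
"""


def parse_samples(text, cgroup="rke2-server"):
    """Return (cpu_percent, memory_mb) lists for lines matching the cgroup."""
    cpu, memory_mb = [], []
    prefix = f"system.slice/{cgroup}"
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].startswith(prefix):
            continue
        # The first reading has no CPU value yet, so it is skipped entirely.
        if fields[2] == "-":
            continue
        cpu.append(float(fields[2]))
        memory_mb.append(int(fields[3]) / 1024 ** 2)
    return cpu, memory_mb


def percentile(values, q):
    """Percentile with linear interpolation between the nearest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no readings to summarize")
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


cpu, memory_mb = parse_samples(SAMPLE)
print(f"95th percentile CPU: {percentile(cpu, 95):.2f}%")
print(f"95th percentile memory: {percentile(memory_mb, 95):.2f} MB")
```

In this made-up sample, the single 30% CPU spike pulls the 95th percentile well above the median, which is why the percentile is read as an upper bound on typical usage rather than as an average.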

### RKE2 Server with a Workload

These are the requirements for a single-node cluster in which the RKE2 server shares resources with a [simple workload](https://kubernetes.io/docs/tasks/run-application/run-stateless-application-deployment/).

| System | CPU Core Usage | Memory |
|--------|----------------|--------|
| Intel 8375C | 17% of a core | 4977 MB |

### RKE2 Cluster with a Single Agent

These are the baseline requirements for an RKE2 cluster with one RKE2 server node and one RKE2 agent node, but no workload.

| Node | System | CPU Core Usage | Memory |
|------|--------|----------------|--------|
| Server | Intel 8375C | 18% of a core | 4804 MB |
| Agent | Intel 8375C | 5% of a core | 3590 MB |
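
To put the measured figures in context, the following sketch expresses each reading as a share of the test node's total capacity (4 cores, 8 GB). It assumes that 8 GB corresponds to 8192 MB, consistent with the 1024-based conversion used by the profiling script; the helper `node_share` is hypothetical and only performs the arithmetic.

```python
# Test node capacity from the Environment and Components table.
NODE_CORES = 4
NODE_MEMORY_MB = 8 * 1024  # assumption: 8 GB treated as 8192 MB

# (CPU as % of one core, memory in MB), copied from the result tables.
MEASUREMENTS = {
    "single-node server with workload": (17, 4977),
    "server, no workload": (18, 4804),
    "agent, no workload": (5, 3590),
}


def node_share(cpu_pct_of_core, memory_mb,
               cores=NODE_CORES, total_mb=NODE_MEMORY_MB):
    """Convert a reading into percentages of the whole node's CPU and memory."""
    cpu_share = cpu_pct_of_core / cores
    memory_share = memory_mb / total_mb * 100
    return cpu_share, memory_share


for name, (cpu, memory) in MEASUREMENTS.items():
    cpu_share, memory_share = node_share(cpu, memory)
    print(f"{name}: {cpu_share:.2f}% of node CPU, "
          f"{memory_share:.1f}% of node memory")
```

On this hardware, memory rather than CPU is the dominant cost in every scenario: each reading uses only a few percent of the node's CPU but a large fraction of its memory.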
@@ -0,0 +1,56 @@
# Script used to parse the output of systemd-cgtop and report the 95th percentile
# CPU and memory usage, with an optional histogram of the readings
# Used for resource profiling page in the RKE2 documentation
# Generate input file using the following command:
# systemd-cgtop system.slice/rke2-server.service --raw -b -n 1200 -d 250ms > systemd_cgtop_output.txt
# Pulls data every 0.25s for 5 minutes

# Data arrangement is:
# cgroup name, # of tasks, CPU %, MEM usage (bytes)

# import matplotlib.pyplot as plt
import re
import numpy

tasks = []
cpu_usage = []
memory_usage = []
input_file = "systemd_cgtop_output.txt"
cgroup = "rke2-server"

# Build the pattern once; re.escape guards against regex metacharacters in the cgroup name
regex = r'system\.slice/' + re.escape(cgroup)

with open(input_file, 'r') as infile:
    # Iterate over each line in the input file
    for line in infile:
        if re.search(regex, line):
            # Split the line into fields
            fields = line.split()
            # first entry for cpu is blank, so we skip it
            if fields[2] == "-":
                continue
            tasks.append(int(fields[1]))
            cpu_usage.append(float(fields[2]))
            memory_usage.append(int(fields[3]))

# Convert memory usage to megabytes
memory_usage = [usage / 1024 ** 2 for usage in memory_usage]

tasks_avg = numpy.average(tasks)
cpu_95th = numpy.percentile(cpu_usage, 95)
memory_95th = numpy.percentile(memory_usage, 95)
print(f'Average Number of Tasks: {tasks_avg:.2f}')
print(f'95th Percentile CPU Usage: {cpu_95th:.2f}%')
print(f'95th Percentile Memory Usage: {memory_95th:.2f} MB')

# Optional Plotting
# plt.hist(cpu_usage, bins=20, alpha=0.7, label='CPU Usage')
# plt.hist(memory_usage, bins=20, alpha=0.7, label='Memory Usage')
# plt.axvline(memory_95th, linestyle='dashed', linewidth=1, label=f'95th Percentile Memory ({memory_95th:.2f} MB)')

# plt.xlabel('Usage')
# plt.ylabel('Ticks')
# plt.title('Systemd-cgtop Resource Usage Histogram')
# plt.legend()
# plt.show()