Refactoring:
- explicit values file for privileged direct method,
- hide (into docs directory) "unprivileged" direct method (and fixes),
- remove unnecessary mounts (mcfg, /dev/cpu, /dev/mem for privileged access),
- add instructions to collection methods,
- fixes (extra builder) for building the local development image,
- silent mode
- move collection methods to the top
ppalucki committed May 21, 2024
1 parent 17a9198 commit 6ed6bdf
Showing 10 changed files with 124 additions and 50 deletions.
57 changes: 42 additions & 15 deletions deployment/pcm/README.md
@@ -4,12 +4,21 @@ Helm chart instructions

### Features:

- Configurable as non-privileged container (value: `privileged=false` / default) and privileged container,
- Support for bare-metal and VM host configurations (files: [values-metal.yaml](values-metal.yaml), [values-vm.yaml](values-metal.yaml)),
- Configurable as non-privileged container (value: `privileged=false`, default) and privileged container,
- Support for bare-metal and VM host configurations (files: [values-metal.yaml](values-metal.yaml), [values-vm.yaml](values-vm.yaml)),
- Ability to deploy multiple, differently configured releases side by side to handle different kinds of machines (bare-metal, VM) at the [same time](#heterogeneous-mixed-vmmetal-instances-cluster),
- Controllable set of metrics and method of collection (RDT, uncore), support direct (msr) and indirect (Linux abstractions perf/resctrl) counter accesses (file: [values-indirect.yaml](values-indirect.yaml)).
- Linux Watchdog handling (controlled with `PCM_KEEP_NMI_WATCHDOG`, `PCM_NO_AWS_WORKAROUND`, `nmiWatchdogMount` values).
- Deploy to own namespace with "helm install ... **-n pcm --create-namespace**"
- Silent mode (value: `silent=false`, default)

The chart supports the following metric collection methods, which differ in the interfaces used and the access required:

| Method                  | Used interfaces | Default | Notes                                                                                                              | Instructions                                                               |
|-------------------------|-----------------|---------|--------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------|
| unprivileged "indirect" | perf, resctrl   | v       | recommended, missing metrics: energy metrics (TODO link to issues/PR or node_exporter/rapl_collector)              | `helm install pcm .`                                                       |
| privileged "indirect"   | perf, resctrl   |         | not recommended, insecure, no advantages over unprivileged, missing metrics: energy metrics                        | `helm install pcm . --set privileged=true`                                 |
| privileged "direct"     | msr             |         | not recommended, insecure and requires the msr module to be preloaded on the host                                  | `helm install pcm . -f values-direct-privileged.yaml`                      |
| unprivileged "direct"   | msr             |         | not recommended, requires msr module and access to /dev/cpu and /dev/mem (non-trivial, e.g. using 3rd-party plugins) | [link for detailed documentation](docs/direct-unprivileged-deployment.md) |

For more information about direct/indirect collection methods, see [here](#metric-collection-methods-capabilities-vs-requirements).
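
As a rough illustration of how the table rows map to commands, the sketch below wraps three of the methods in a small shell function. The function name `pcm_install_cmd` and the method keywords are hypothetical and not part of the chart; the release name `pcm` and chart path `.` follow the `helm install pcm .` form (release name first, then chart) used in the rest of this README. The unprivileged "direct" method is left out because it depends on the third-party plugins described in the linked document.

```shell
# Hypothetical helper (not part of the chart): print the helm command
# for a given collection method without executing it.
pcm_install_cmd() {
  case "$1" in
    indirect)            echo "helm install pcm ." ;;
    indirect-privileged) echo "helm install pcm . --set privileged=true" ;;
    direct-privileged)   echo "helm install pcm . -f values-direct-privileged.yaml" ;;
    *)
      # Unknown keyword: report on stderr and signal failure to the caller.
      echo "unknown method: $1" >&2
      return 1
      ;;
  esac
}
```

Calling `pcm_install_cmd direct-privileged` only prints the command, so it can be reviewed before being run.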

@@ -47,7 +56,7 @@ helm install ... --set nfd=true --set podMonitor=true
### Requirements

- Full set of metrics (uncore/UPI, RDT, energy) requires bare-metal or .metal cloud instance.
- /sys/fs/resctrl has to be mounted on host OS (for default indirect deployment method),
- /sys/fs/resctrl has to be mounted on host OS (for default indirect deployment method)
- the pod must be allowed to run with privileged capabilities (SYS_ADMIN, SYS_RAWIO) in the given namespace; in other words, Pod Security Standards must allow running at the privileged level,

```
@@ -78,12 +87,14 @@ More information here: https://kubernetes.io/docs/tutorials/security/ns-level-ps
#### 1) (Optionally) mount resctrl filesystem (for RDT metrics) and unload "msr" kernel module for validation

```
echo 0 > /proc/sys/kernel/perf_event_paranoid
mount -t resctrl resctrl /sys/fs/resctrl
```

For validation to verify that all metrics are available without msr, unload "msr" module from kernel:
To validate that all metrics are available without msr, unload the "msr" module from the kernel and restore perf_event_paranoid to its default value:
```
rmmod msr
echo 2 > /proc/sys/kernel/perf_event_paranoid
```
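
The host state that the default indirect method relies on can be checked before deploying. The following is a minimal sketch, assuming a Linux host; the function name `check_indirect_prereqs` is hypothetical, and the two file paths are parameters (defaulting to the real `/proc` files) so that the check can also be tried against copies.

```shell
# Hypothetical pre-flight check for the default "indirect" method.
# $1: mounts table (default /proc/mounts)
# $2: perf_event_paranoid file (default /proc/sys/kernel/perf_event_paranoid)
check_indirect_prereqs() {
  local mounts_file="${1:-/proc/mounts}"
  local paranoid_file="${2:-/proc/sys/kernel/perf_event_paranoid}"
  local rc=0

  # /proc/mounts columns: device, mount point, filesystem type, options, ...
  if awk '$2 == "/sys/fs/resctrl" && $3 == "resctrl" { found = 1 } END { exit !found }' "$mounts_file"; then
    echo "resctrl: mounted"
  else
    echo "resctrl: NOT mounted"
    rc=1
  fi

  # Only report the current setting; which value is acceptable depends on the deployment.
  echo "perf_event_paranoid: $(cat "$paranoid_file")"
  return "$rc"
}
```

Run without arguments it inspects the live host; a non-zero return code means the resctrl filesystem is not mounted at `/sys/fs/resctrl`.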

#### 2) Create kind based Kubernetes cluster
@@ -123,11 +134,24 @@ bash kind-with-registry.sh
Check that resctrl is available inside kind node:
```
docker exec kind-control-plane ls /sys/fs/resctrl/info
# expected output:
# L3_MON
# MB
# ...
```


and, optionally, that the local registry is running (to be used with locally built pcm images, more details [below](#development-with-local-images-and-testing)):
```
docker ps | grep kind-registry
# expected output:
# e57529be23ea registry:2 "/entrypoint.sh /etc…" 3 weeks ago Up 3 weeks 127.0.0.1:5001->5000/tcp kind-registry
```

Export kind kubeconfig as default for further kubectl commands:
```
kind export kubeconfig
kubectl get pods -A
```

#### 3) (Optionally) Deploy Node Feature Discovery (nfd)
@@ -200,9 +224,9 @@ promtool query instant http://127.0.0.1:8001/api/v1/namespaces/default/services/

### Deploy alternative options

#### Direct as privileged container
#### Direct (msr access) as privileged container
```
helm install pcm . -f values-direct.yaml --set privileged=true
helm install pcm . -f values-direct-privileged.yaml
```
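
Because the direct method needs the msr module loaded on the host, it can help to confirm that before installing. A minimal sketch, assuming a Linux host: the function name `msr_module_loaded` is hypothetical, and the modules file is a parameter that defaults to `/proc/modules`.

```shell
# Hypothetical check: is the "msr" kernel module listed as loaded?
# /proc/modules has one module per line, with the module name in the first column.
# Note: a kernel with msr support built in (not as a module) is not listed there.
msr_module_loaded() {
  local modules_file="${1:-/proc/modules}"
  awk '$1 == "msr" { found = 1 } END { exit !found }' "$modules_file"
}
```

For example, `msr_module_loaded || echo "msr module is not loaded"` prints a hint when the module is missing.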

#### Homogeneous bare metal instances cluster (full set of metrics)
@@ -243,14 +267,21 @@ wget https://kind.sigs.k8s.io/examples/kind-with-registry.sh
bash kind-with-registry.sh
```

2) Build docker image and upload to local registry
2) Build docker image and upload to local registry (from project root directory)
```
docker build . -t localhost:5001/pcm-local
docker push localhost:5001/pcm-local
# or with single line
# optionally create buildx based builder
mkdir -p ~/.docker/cli-plugins
curl -sL https://github.com/docker/buildx/releases/download/v0.14.0/buildx-v0.14.0.linux-amd64 -o ~/.docker/cli-plugins/docker-buildx
chmod +x ~/.docker/cli-plugins/docker-buildx
docker buildx create --driver docker-container --name mydocker --use --bootstrap
# or with single line (from deployment/pcm/ directory)
# Build local image for tests/development + fix /pcm/resctrl mounting (assuming project was configured with cmake previously):
(cd ../.. ; (cd build ; make -j pcm pcm-sensor-server) ; docker build . -t localhost:5001/pcm-local && docker push localhost:5001/pcm-local; docker run -ti --rm --name pcmtest --entrypoint bash localhost:5001/pcm-local -c "pcm 2>&1 | head -5" )
# Warning: we're using a patched Dockerfile (TODO to be removed, because the "build" directory conflicts with the existing root "build" directory, and for caching ability)
(cd ../.. ; (cd build ; make -j pcm pcm-sensor-server) ; docker build . -t localhost:5001/pcm-local && docker push localhost:5001/pcm-local)
```

3) When deploying pcm to the kind cluster, use values to switch to the local pcm-local image
@@ -274,12 +305,8 @@ kubectl exec -ti ds/pcm -- bash
kubectl logs ds/pcm
```

#### Metric collection methods (capabilities vs requirements)
### Metric collection methods (capabilities vs requirements)

| Method | Used interfaces | default | Notes |
|---------------|------------------------------------------------------------| -------- | ------------------------------------------------------------------------------------- |
| indirect | perf, resctrl | v | missing energy metrics, |
| direct | msr | | requires msr module and access to /dev/cpu (non trivial) or privileged access |


| Metrics | Available on Hardware | Available through interface | Available through method |
4 changes: 2 additions & 2 deletions deployment/pcm/docs/direct-unprivileged-deployment.md
@@ -28,7 +28,7 @@ helm install smarter-device-plugin --create-namespace --namespace smarter-device
kubectl get node kind-control-plane -o json | jq .status.capacity
# Install pcm helm chart in unprivileged mode with extraResources for cpu and memory devices.
helm install pcm . --set privileged=false -f values-direct.yaml -f values-smarter-devices-cpu-mem.yaml
helm install pcm . -f docs/direct-unprivileged-examples/values-direct-unprivileged.yaml -f docs/direct-unprivileged-examples/values-smarter-devices-cpu-mem.yaml
```

##### b) Device injection using NRI plugin device-injection
@@ -63,5 +63,5 @@ docker exec kind-control-plane systemctl restart containerd
docker exec kind-control-plane systemd-run -u device-injector /device-injector -idx 10 -verbose
docker exec kind-control-plane systemctl status device-injector
helm install pcm-device-injector . --set privileged=false --set hostPort= --set debugSleep=true -f values-opcm-local-image.yaml -f values-device-injector.yaml
helm install pcm . -f docs/direct-unprivileged-examples/values-direct-unprivileged.yaml -f docs/direct-unprivileged-examples/values-device-injector.yaml
```
File renamed without changes.
@@ -1,9 +1,19 @@
# Warning: this file is to be used for direct unprivileged access, which requires a 3rd-party plugin
# e.g. device-injector NRI or smarter-devices-cpu-mem
privileged: false

# Switch to using MSR
PCM_NO_MSR: 0 # use MSR
PCM_NO_PERF: 1 # do not use Linux perf
PCM_USE_UNCORE_PERF: 0 # also use MSR for uncore
PCM_NO_RDT: 0 # Collect RDT data
PCM_USE_RESCTRL: 0 # using MSR (no resctrl)
resctrlHostMount: false # with MSR resctrl mount is not needed

# RDT metrics will be used by direct msr programming
resctrlHostMount: false
resctrlInsideMount: false

# sys and pci mounts are required for uncore PMU devices discovery
sysMount: true # /pcm/sys is required
pciMount: true # /pcm/proc/bus/pci is required

1 change: 1 addition & 0 deletions deployment/pcm/templates/_helpers.tpl
@@ -62,6 +62,7 @@ securityContext:
add:
- SYS_ADMIN
- SYS_RAWIO
#- PERFMON
{{- end }}
{{- end }}

66 changes: 38 additions & 28 deletions deployment/pcm/templates/daemonset.yaml
@@ -54,6 +54,14 @@ spec:
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
{{- include "pcm.securityContext" . | nindent 8 }}
{{- if .Values.silent }}
command:
- "/usr/local/bin/pcm-sensor-server"
- "-p"
- "9738"
- "-r"
- "-silent"
{{- end -}}
{{- if .Values.debugSleep }}
command:
- /usr/bin/sleep
@@ -63,7 +71,7 @@ spec:
command:
- /bin/bash
- -c
- "/usr/local/bin/pcm 2 -r -nc -nsys"
- "/usr/local/bin/pcm 2 -r -nc -nsys{{ if .Values.silent }} -silent{{ end }}"
{{- end -}}
{{- if .Values.resctrlInternalMount }}
# Ugly hack to mount resctrl inside only for baremetal when we want use resctrl abstraction and is not mounted on HOST: TBC conflicts with
@@ -116,14 +124,14 @@ spec:
protocol: TCP
{{- end }}
volumeMounts:
{{- if .Values.privileged }}
- mountPath: /pcm/dev/cpu
name: dev-cpu
readOnly: false
- mountPath: /pcm/dev/mem
name: dev-mem
readOnly: false
{{- end }}
# {{- if .Values.privileged }}
# - mountPath: /pcm/dev/cpu
# name: dev-cpu
# readOnly: false
# - mountPath: /pcm/dev/mem
# name: dev-mem
# readOnly: false
# {{- end }}
{{- if .Values.pciMount }}
- mountPath: /pcm/proc/bus/pci
name: proc-pci
@@ -136,26 +144,27 @@ spec:
{{- if .Values.nmiWatchdogMount }}
- mountPath: /pcm/proc/sys/kernel/nmi_watchdog
name: nmi-watchdog
readOnly: true # RW?
readOnly: true # RW? # TODO
{{- end }}
{{- if .Values.resctrlHostMount }}
- mountPath: /sys/fs/resctrl
name: sysfs-resctrl
{{- end }}
{{- if .Values.mcfgMount }}
- mountPath: /pcm/sys/firmware/acpi/tables/MCFG
name: sys-acpi
readOnly: true
{{- end }}
# TODO: to be removed, already handled by /sysMount
# {{- if .Values.mcfgMount }}
# - mountPath: /pcm/sys/firmware/acpi/tables/MCFG
# name: sys-acpi
# readOnly: true
# {{- end }}
volumes:
{{- if .Values.privileged }}
- name: dev-cpu
hostPath:
path: /dev/cpu
- name: dev-mem
hostPath:
path: /dev/mem
{{- end}}
# {{- if .Values.privileged }}
# - name: dev-cpu
# hostPath:
# path: /dev/cpu
# - name: dev-mem
# hostPath:
# path: /dev/mem
# {{- end}}
{{- if .Values.sysMount }}
- name: sysfs
hostPath:
@@ -171,11 +180,12 @@ spec:
hostPath:
path: /proc/sys/kernel/nmi_watchdog
{{- end }}
{{- if .Values.mcfgMount }}
- name: sys-acpi
hostPath:
path: /sys/firmware/acpi/tables/MCFG
{{- end }}
# TODO: to be removed, already handled by /sysMount
# {{- if .Values.mcfgMount }}
# - name: sys-acpi
# hostPath:
# path: /sys/firmware/acpi/tables/MCFG
# {{- end }}
{{- if .Values.resctrlHostMount }}
- name: sysfs-resctrl
hostPath:
16 changes: 16 additions & 0 deletions deployment/pcm/values-direct-privileged.yaml
@@ -0,0 +1,16 @@
#### Tuning for "direct" privileged access
privileged: true

# Switch PCM to use msr access always
PCM_NO_MSR: 0 # use MSR
PCM_NO_PERF: 1 # do not use Linux perf
PCM_USE_UNCORE_PERF: 0 # also use MSR for uncore
PCM_NO_RDT: 0 # Enable RDT metrics ...
PCM_USE_RESCTRL: 0 # but using MSR (no resctrl filesystem)

# with a privileged container, additional mounts aren't required
resctrlHostMount: false # with MSR resctrl mount is not needed
resctrlInsideMount: false
sysMount: false
pciMount: false
mcfgMount: false
1 change: 1 addition & 0 deletions deployment/pcm/values-vm.yaml
@@ -1,5 +1,6 @@
#### ================ Tuning for VM ================
nmiWatchdogMount: true

# Disable RDT because it is not available on VM instances
PCM_NO_RDT: 1
resctrlHostMount: false
17 changes: 13 additions & 4 deletions deployment/pcm/values.yaml
@@ -18,6 +18,10 @@ imagePullSecrets: {}
# Configures SecurityContext as not privileged (by default), so SYS_ADMIN/SYS_RAWIO capabilities are required for running the pod
privileged: false

# Run pcm in silent mode (additional -silent argument to pcm-sensor-server binary)
# Removes some debug output (like warnings about inability to open some /sys... /proc... files)
silent: false

### -------------- Required OS affinity -------
# Should only run on Linux
nodeSelector:
@@ -29,10 +33,15 @@ probes: false
### ================ Metrics configuration ======================

### -------------- Metrics: Uncore ------------
# required for uncore metrics, only in baremetal, not available for VM
mcfgMount: false
sysMount: false
pciMount: false
# Mounts section
# NOTE: only required for direct mode
# required for uncore metrics discovery and working only in baremetal, not available for VM
sysMount: false # mounts host /sys into container /pcm/sys/
pciMount: false # mounts host /proc/bus/pci into container /pcm/proc/bus/pci/

# NOTE: this is only required for direct unprivileged mode (?)
# TODO: to be removed (already covered by sysMount)
#mcfgMount: false # mounts hosts: /sys/firmware/acpi/tables/MCFG -> /pcm/sys/firmware/acpi/tables/MCFG

### linux Perf (indirect) vs msr(direct)
# Let's try "indirect" as default