Commit d441c6d

Add instructions for running the example driver on GKE
1 parent e82f291 commit d441c6d

4 files changed: 72 additions, 1 deletion

README.md

Lines changed: 43 additions & 0 deletions
@@ -383,6 +383,49 @@ Finally, you can run the following to cleanup your environment and delete the
 ./demo/delete-cluster.sh
 ```
 
+## Installing the example driver on a GKE cluster
+It is also possible to run the example driver on a GKE cluster. For this, we
+will use the pre-built image for the kubelet plugin, so there is no need
+to build anything. All that is needed is a Google Cloud Platform account,
+the gcloud CLI and Helm.
+
+To keep things simple and identical to the Kind example, we will use a
+single-node GKE cluster.
+
+CDI must be enabled in containerd for the DRA driver to work. CDI has been
+enabled by default in GKE since 1.32.1-gke.1489001, so we will create
+a cluster in the rapid channel to make sure we get a recent version.
+
+Since DRA is still a beta feature, we need to explicitly enable it
+when the cluster is created.
+
+First, create a GKE cluster with gcloud.
+```bash
+gcloud container clusters create dra-example-driver-cluster \
+  --location=us-central1-c \
+  --release-channel=rapid \
+  --num-nodes=1 \
+  --enable-kubernetes-unstable-apis=resource.k8s.io/v1beta1/deviceclasses,resource.k8s.io/v1beta1/resourceclaims,resource.k8s.io/v1beta1/resourceclaimtemplates,resource.k8s.io/v1beta1/resourceslices
+```
+
+Once the cluster is ready, we can install the DRA driver using Helm.
+
+The kubelet plugin in the example driver is set up to run with priority class
+`system-node-critical`. On GKE, pods are by default restricted from running
+with this priority class, so we need to use a ResourceQuota to allow it. The
+Helm chart supports this; we just have to enable it.
+
+```bash
+helm upgrade -i \
+  --create-namespace \
+  --namespace dra-example-driver \
+  --set=resourcequota.enabled=true \
+  dra-example-driver \
+  deployments/helm/dra-example-driver
+```
+
+The examples in `demo/gpu-test{1,2,3,4,5}.yaml` work just as they do with Kind.
+
 ## Anatomy of a DRA resource driver
 
 TBD
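
The README text above depends on two details that are easy to get wrong by hand: the node version must be at least 1.32.1-gke.1489001 for CDI to be on by default, and the `--enable-kubernetes-unstable-apis` value is a long comma-separated list. The following is a minimal sketch of two shell helpers for checking both. It is not part of this commit; the function names `gke_version_at_least` and `dra_unstable_apis` are hypothetical, and the version parser assumes the `MAJOR.MINOR.PATCH-gke.BUILD` shape quoted in the README.

```bash
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if GKE version $1 >= GKE version $2.
# Assumes both look like 1.32.1-gke.1489001 (an optional leading "v" is dropped).
# Exits with 2 if either argument does not have exactly four numeric fields.
gke_version_at_least() {
  awk -v have="${1#v}" -v want="${2#v}" 'BEGIN {
    nh = split(have, h, /[^0-9]+/)
    nw = split(want, w, /[^0-9]+/)
    if (nh != 4 || nw != 4) exit 2
    for (i = 1; i <= 4; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Hypothetical helper: prints the value passed to
# --enable-kubernetes-unstable-apis in the README, built from its four resources.
dra_unstable_apis() {
  group_version="resource.k8s.io/v1beta1"
  apis=""
  for resource in deviceclasses resourceclaims resourceclaimtemplates resourceslices; do
    apis="${apis:+$apis,}${group_version}/${resource}"
  done
  printf '%s\n' "$apis"
}

# Example use against a live cluster (not run here):
#   v=$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')
#   gke_version_at_least "$v" 1.32.1-gke.1489001 || echo "CDI may not be enabled by default"
```

The comparison is done numerically field by field, so a build number such as `1500000` sorts after `1489001` regardless of string length.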

deployments/helm/dra-example-driver/Chart.yaml

Lines changed: 3 additions & 1 deletion
@@ -25,4 +25,6 @@ version: 0.0.0-dev
 # It is recommended to use it with quotes.
 appVersion: "v0.1.0"
 
-kubeVersion: "1.32.x"
+# The "-0" suffix is to make sure the chart works on GKE clusters, which use versions in
+# the format 1.32.1-gke.1234567.
+kubeVersion: "1.32.x-0"
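
The reason for the `-0` suffix is worth spelling out. As far as I understand Helm's SemVer handling, a constraint without a pre-release part skips versions that carry one, and a GKE version such as 1.32.1-gke.1234567 parses with `gke.1234567` as its pre-release. The sketch below only illustrates that parsing step; `semver_prerelease` is a hypothetical helper, not Helm's implementation.

```bash
#!/bin/sh
# Hypothetical helper: prints the SemVer pre-release part of a version
# (the text after the first "-", with any "+build" metadata removed),
# or an empty line if the version has no pre-release part.
semver_prerelease() (
  v=${1#v}      # drop an optional leading "v"
  v=${v%%+*}    # drop build metadata, if any
  case "$v" in
    *-*) printf '%s\n' "${v#*-}" ;;
    *)   printf '\n' ;;
  esac
)

# Example use:
#   semver_prerelease 1.32.1-gke.1234567   # the GKE suffix counts as a pre-release
#   semver_prerelease 1.32.1               # upstream versions have none
#
# To check the chart constraint itself without a cluster, `helm template`
# accepts a --kube-version flag (not run here):
#   helm template dra-example-driver deployments/helm/dra-example-driver \
#     --kube-version 1.32.1-gke.1234567
```

Because the GKE suffix is treated as a pre-release, `1.32.x` alone would reject these clusters, while `1.32.x-0` opts the constraint in to pre-release versions.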
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+{{- if .Values.resourcequota.enabled }}
+apiVersion: v1
+kind: ResourceQuota
+metadata:
+  name: {{ include "dra-example-driver.fullname" . }}-resourcequota
+  namespace: {{ include "dra-example-driver.namespace" . }}
+spec:
+  hard:
+    pods: {{ .Values.resourcequota.pods }}
+  {{- with .Values.resourcequota.scopeSelector.matchExpressions }}
+  scopeSelector:
+    matchExpressions:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{- end }}
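
To make the template easier to review, here is a rough sketch of the manifest it should render with the chart's default values. It assumes that both the fullname and namespace helpers resolve to `dra-example-driver` for the install command in the README, which this diff does not confirm. The function name `render_resourcequota` is hypothetical, and the exact key order and list indentation produced by `toYaml` may differ.

```bash
#!/bin/sh
# Hypothetical helper: prints an approximation of the rendered ResourceQuota.
#   $1: release fullname (assumed), $2: namespace (assumed),
#   $3: pod limit (defaults to 10, the chart default in values.yaml)
render_resourcequota() {
  cat <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ${1}-resourcequota
  namespace: ${2}
spec:
  hard:
    pods: ${3:-10}
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values:
      - system-node-critical
      - system-cluster-critical
EOF
}

# Example use (client-side validation only, not run here):
#   render_resourcequota dra-example-driver dra-example-driver \
#     | kubectl apply --dry-run=client -f -
```

The quota applies only to pods whose priority class is in the listed values, so it admits up to `pods` such pods in the namespace rather than limiting every pod.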

deployments/helm/dra-example-driver/values.yaml

Lines changed: 11 additions & 0 deletions
@@ -87,3 +87,14 @@ webhook:
     # The name of the service account to use.
     # If not set and create is true, a name is generated using the fullname template
     name: ""
+
+resourcequota:
+  enabled: false
+  pods: 10
+  scopeSelector:
+    matchExpressions:
+    - operator: In
+      scopeName: PriorityClass
+      values:
+        - system-node-critical
+        - system-cluster-critical
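
Since `resourcequota.enabled` defaults to `false` and `resourcequota.pods` to `10`, a GKE install has to override at least the first. The sketch below wraps the README's Helm command in a function that can also raise the pod limit. The function name `install_dra_example_driver` and the `HELM` environment variable are hypothetical conveniences; setting `HELM=echo` prints the command instead of running it.

```bash
#!/bin/sh
# Hypothetical wrapper around the README's install command.
#   $1: value for resourcequota.pods (defaults to 10, as in values.yaml)
# HELM may be set to another binary, or to "echo" for a dry run.
install_dra_example_driver() {
  "${HELM:-helm}" upgrade -i \
    --create-namespace \
    --namespace dra-example-driver \
    --set=resourcequota.enabled=true \
    --set=resourcequota.pods="${1:-10}" \
    dra-example-driver \
    deployments/helm/dra-example-driver
}

# Example use:
#   HELM=echo install_dra_example_driver 20   # print the command only
#   install_dra_example_driver                # install with the default limit
```

The chart path is relative, so the function is meant to be called from the repository root, as the README's own command is.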
