update docs
archlitchi committed Jun 24, 2024
1 parent 5d370fc commit 8d6af35
Showing 231 changed files with 13 additions and 2,423 deletions.
100 changes: 0 additions & 100 deletions .bashrc

This file was deleted.

11 changes: 0 additions & 11 deletions .profile

This file was deleted.

55 changes: 13 additions & 42 deletions README.md
@@ -1,21 +1,17 @@
# DCU vGPU device plugin for HAMi
# Mock device plugin for HAMi

## Introduction
This is a [Kubernetes][k8s] [device plugin][dp] implementation that enables the registration of Hygon DCU in a container cluster for compute workloads. With the appropriate hardware and this plugin deployed in your Kubernetes cluster, you will be able to run jobs that require AMD DCU. It supports DCU virtualization by using hy-virtual, provided by dtk.
This is a [Kubernetes][k8s] [device plugin][dp] implementation that registers, on each node, virtual devices that the scheduler would normally ignore (e.g. gpu-memory, gpu-cores). After deployment, these resources appear in node.status.allocatable and node.status.capacity.


## Prerequisites
* dtk >= 24.04
* hy-smi == v1.6.0


## Limitations
* This plugin targets Kubernetes v1.18+.

## Deployment
```
$ kubectl apply -f k8s-dcu-rbac.yaml
$ kubectl apply -f k8s-dcu-plugin.yaml
$ kubectl apply -f k8s-mock-rbac.yaml
$ kubectl apply -f k8s-mock-plugin.yaml
```

## Build
@@ -26,40 +22,15 @@ docker build .
## Examples

```
apiVersion: v1
kind: Pod
metadata:
name: alexnet-tf-gpu-pod-mem
labels:
purpose: demo-tf-amdgpu
spec:
containers:
- name: alexnet-tf-gpu-container
image: ubuntu:20.04
workingDir: /root
command: ["sleep","infinity"]
resources:
limits:
hygon.com/dcunum: 1 # requesting a GPU
hygon.com/dcumem: 2000 # each dcu require 2000 MiB device memory
hygon.com/dcucores: 15 # each dcu use 60% of total compute cores
```

## Validation

Inside container, use hy-virtual to validate

```
source /opt/hygondriver/env.sh
hy-virtual -show-device-info
```

There will be output like these:
```
Device 0:
Actual Device: 0
Compute units: 9
Global memory: 2097152000 bytes
Allocatable:
...
memory: 769189866507
nvidia.com/gpu: 20
nvidia.com/gpucores: 200
nvidia.com/gpumem: 65536
nvidia.com/gpumem-percentage: 200
pods: 110
...
```

## Maintainer
1 change: 0 additions & 1 deletion example/README.md

This file was deleted.

17 changes: 0 additions & 17 deletions example/default_use.yaml

This file was deleted.

15 changes: 0 additions & 15 deletions example/exlusive_use.yaml

This file was deleted.

16 changes: 0 additions & 16 deletions example/pod/alexnet-cpu.yaml

This file was deleted.

19 changes: 0 additions & 19 deletions example/pod/alexnet-gpu.yaml

This file was deleted.

Binary file removed example/pod/k8s-plugin.tar
Binary file not shown.
23 changes: 0 additions & 23 deletions helm/amd-gpu/.helmignore

This file was deleted.

25 changes: 0 additions & 25 deletions helm/amd-gpu/Chart.yaml

This file was deleted.

40 changes: 0 additions & 40 deletions helm/amd-gpu/README.md

This file was deleted.

4 changes: 0 additions & 4 deletions helm/amd-gpu/templates/NOTES.txt

This file was deleted.
