# If the server can access the external network, nvidia-container-toolkit can be installed directly by following the official guide:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
# If the server cannot access the external network, first download the offline installation package of nvidia-container-toolkit from the GitHub releases page:
https://github.com/NVIDIA/nvidia-container-toolkit/releases
# Then unzip it, enter the package directory, and use sudo apt install ./* to install all the packages, as follows (only tested on Ubuntu):
root@edgenode:~/release-v1.16.0-rc.1-experimental/packages/ubuntu18.04/amd64# pwd
/root/release-v1.16.0-rc.1-experimental/packages/ubuntu18.04/amd64
root@edgenode:~/release-v1.16.0-rc.1-experimental/packages/ubuntu18.04/amd64# ls
libnvidia-container1_1.16.0~rc.1-1_amd64.deb libnvidia-container-tools_1.16.0~rc.1-1_amd64.deb nvidia-container-toolkit-operator-extensions_1.16.0~rc.1-1_amd64.deb
libnvidia-container1-dbg_1.16.0~rc.1-1_amd64.deb nvidia-container-toolkit_1.16.0~rc.1-1_amd64.deb
libnvidia-container-dev_1.16.0~rc.1-1_amd64.deb nvidia-container-toolkit-base_1.16.0~rc.1-1_amd64.deb
root@edgenode:~/release-v1.16.0-rc.1-experimental/packages/ubuntu18.04/amd64# sudo apt install ./*
4. Configure Docker or Containerd to use nvidia-runtime
# After nvidia-container-toolkit is installed, nvidia-ctk can be used to configure the nvidia runtime.
# docker
sudo nvidia-ctk runtime configure --runtime=docker --set-as-default
# containerd
sudo nvidia-ctk runtime configure --runtime=containerd --set-as-default
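For reference, the docker variant of the command above rewrites /etc/docker/daemon.json to register the nvidia runtime and make it the default. The exact contents can vary by toolkit version, so treat this as an illustrative sketch rather than the definitive file:

```json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
```

The containerd variant makes the analogous change in /etc/containerd/config.toml (setting the default runtime to nvidia). In either case, restart the daemon afterwards (step 5), e.g. `sudo systemctl restart docker` or `sudo systemctl restart containerd`.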
10. After the application is successfully deployed, you can verify whether the GPU is available on the corresponding node
# Enter the application container on the node and call torch.cuda.is_available(). A return value of True indicates that the GPU driver is working inside the container.
# docker
root@nano-desktop:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e7e3804626a5 853b58c1dce6 "tail -f /dev/null" 53 seconds ago Up 45 seconds k8s_container-1_test-gpu-arm64-nano-7f8fd7f79f-hzvp5_default_64fb7a90-b0e6-4b46-a34f-8a06b24b9169_0
root@nano-desktop:~# docker exec -it e7e3804626a5 /bin/bash
root@test-gpu-arm64-nano-7f8fd7f79f-hzvp5:/# python3
Python 3.8.10 (default, Nov 14 2022, 12:59:47)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license"for more information.
>>> import torch
>>> torch.cuda.is_available()
True
# containerd
root@edgenode:~# crictl ps
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
de1f1e60abc0a 0dd75116a8ce8 2 minutes ago Running container-1 0 6beffb412af3f test-gpu-6bfbdc9449-jfbrl
root@edgenode:~# crictl exec -it de1f1e60abc0a /bin/bash
root@test-gpu-6bfbdc9449-jfbrl:/workspace# python3
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license"for more information.
>>> import torch
>>> torch.cuda.is_available()
True
How to join an NVIDIA GPU node in KubeEdge
1. Install the GPU Driver First
2. Install Docker or Containerd
3. Install Nvidia-Container-Toolkit
4. Configure Docker or Containerd to use nvidia-runtime
5. Restart Docker or Containerd.
6. Join the KubeEdge Node
7. Deploy the daemonset (k8s-device-plugin)
8. Verify whether the GPU information is reported successfully
9. Deploy an application for testing
10. After the application is successfully deployed, you can verify whether the GPU is available on the corresponding node
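For steps 7-9 above, a minimal sketch of what the manifests look like. All names, image tags, and comments here are illustrative assumptions, not taken from the original issue; see the NVIDIA/k8s-device-plugin README for the exact DaemonSet manifest URL for your version:

```yaml
# Step 7: deploy the k8s-device-plugin DaemonSet using the manifest from the
# NVIDIA/k8s-device-plugin repository (check its README for the current URL):
#   kubectl create -f <nvidia-device-plugin manifest>
#
# Step 8: verify the GPU is reported in the node's allocatable resources:
#   kubectl describe node <node-name> | grep nvidia.com/gpu
#
# Step 9: a minimal test pod requesting one GPU (pod name and image are examples)
apiVersion: v1
kind: Pod
metadata:
  name: test-gpu
spec:
  containers:
  - name: container-1
    image: nvcr.io/nvidia/pytorch:23.10-py3   # any CUDA-enabled image works
    command: ["tail", "-f", "/dev/null"]      # keep the container running
    resources:
      limits:
        nvidia.com/gpu: 1                     # request one GPU from the device plugin
```

Once the pod is Running, step 10 below can be carried out by exec-ing into it and checking torch.cuda.is_available().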