We want to make sure one cannot request more AMD GPUs than they should by setting certain environment variables (e.g. `HIP_VISIBLE_DEVICES` / `ROCR_VISIBLE_DEVICES`).

I am not sure whether this is an issue as of today; we cannot verify it since we don't currently have a box with more than one AMD GPU.
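To illustrate the concern, a hypothetical Pod spec along these lines would request a single AMD GPU from the device plugin while setting `HIP_VISIBLE_DEVICES` to try to address additional devices (the pod/container names and image are placeholders; `amd.com/gpu` is the AMD device plugin's resource name):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-escape-test        # hypothetical name for illustration
spec:
  containers:
    - name: rocm-test
      image: rocm/rocm-terminal  # example ROCm image, adjust as needed
      env:
        - name: HIP_VISIBLE_DEVICES
          value: "0,1"           # attempts to make two GPUs visible...
      resources:
        limits:
          amd.com/gpu: 1         # ...while only one GPU is actually requested
```

The question is whether the runtime honors the env variable here and exposes GPU 1 even though the scheduler only allocated one device.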
For context: it is possible to expose access to all NVIDIA GPUs on the host by setting the `NVIDIA_VISIBLE_DEVICES=all` environment variable on the Pod. Luckily, we were able to work around it by setting `--set deviceListStrategy=volume-mounts` for the `nvdp/nvidia-device-plugin` Helm chart, along with these configs in the `/etc/nvidia-container-runtime/config.toml` file:
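For reference, the settings typically paired with the volume-mounts device-list strategy look like the following (a sketch based on the NVIDIA Container Toolkit options; the exact values the reporter used were not included above):

```toml
# /etc/nvidia-container-runtime/config.toml (relevant keys only)
# Ignore NVIDIA_VISIBLE_DEVICES coming from unprivileged containers...
accept-nvidia-visible-devices-envvar-when-unprivileged = false
# ...and honor only the device list the device plugin mounts into the container.
accept-nvidia-visible-devices-as-volume-mounts = true
```

With this combination the runtime derives the device list from plugin-managed mounts rather than from an env variable the Pod author controls.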