
GPU support under Ubuntu #223

Open
kushnirm opened this issue May 14, 2018 · 8 comments


kushnirm commented May 14, 2018

OS is Ubuntu 16.04. Nvidia drivers installed and working fine. Nvidia drivers and CUDA work fine in nvidia-docker. Using driver 384.111 and CUDA 9.0 for testing. Slurm+shifter working fine.

But, under shifter, I can't get GPU integration to work quite right. When running an image with nvidia-docker, drivers and utilities like nvidia-smi are available and work. When running the same container via shifter they are not.

If I make a copy of /usr/lib/nvidia-384 to my siteFs and set PATH and LD_LIBRARY_PATH, nvidia-smi returns the expected output. However, CUDA demo apps (e.g. deviceQuery) fail with:

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
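Worth noting: error 35 (cudaErrorInsufficientDriver) is often misleading inside a container. To my understanding, NVIDIA's release notes list driver 384.81 as the minimum for the CUDA 9.0 runtime, so driver 384.111 should be sufficient, and the error more likely means the runtime cannot find a usable libcuda.so inside the container at all. A quick sanity-check sketch (assumes GNU `sort -V`; the version numbers are the ones from this issue):

```shell
# Sketch: CUDA 9.0's runtime is believed to need driver >= 384.81
# (assumption from NVIDIA release notes). 384.111 satisfies that,
# so error 35 here likely means libcuda.so is not visible in the
# container, not that the driver is genuinely too old.
required=384.81     # assumed minimum driver for CUDA 9.0
installed=384.111   # driver version reported in this issue
lowest=$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
    echo "driver $installed is new enough for CUDA 9.0"
else
    echo "driver $installed is too old for CUDA 9.0"
fi
```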

Thanks,
Michael

@kushnirm (Author)

On further review, it looks like some of the GPU-related bind mounts are not being created automatically. I found the contrib/gpu/activate_gpu_support.sh script, but I am not sure when, where, or how it is invoked.

Please advise.

Thanks,
Michael

kushnirm changed the title from "GPU support under Linux" to "GPU support under Ubuntu" on May 23, 2018
scanon (Member) commented Jun 28, 2018

Michael,

We don't have a GPU system to test with at NERSC. Let me ping some of the CSCS folks and see if they can comment.

@uvNikita (Contributor)

I have the same question: how is contrib/gpu/activate_gpu_support.sh intended to be used?

@uvNikita (Contributor)

Found the commit that removed the code which was using this script (c5e66cc), but I don't see any replacement for this functionality.

kushnirm (Author) commented Aug 20, 2018 via email

scanon (Member) commented Aug 20, 2018

NERSC will hopefully be able to help more directly on this in the near future.

sk2991 commented Aug 23, 2018

We are also facing the same issue. With /usr/lib/nvidia-384 loaded into the container, nvidia-smi shows the GPUs present on the node. But when we try to execute the deviceQuery and nbody benchmarks, they throw the same error:
CUDA driver version is insufficient for CUDA runtime version, Result = FAIL

Is there another way to test GPUs with shifter and Slurm integration?

@uvNikita (Contributor)

After digging through the sources and git history, it seems that the plan is to replace the old GPU support with the new module system; see doc/modules.rst and doc/config/udiRoot.conf.rst.

So, we added these lines to our config:

module_nvidia_siteEnvAppend=LD_LIBRARY_PATH=/opt/udiImage/modules/nvidia PATH=/nvidia-bin PATH=/cuda/bin
module_nvidia_siteFs=/usr/bin:/nvidia-bin;/usr/local/cuda:/cuda
module_nvidia_copyPath=/usr/lib64/nvidia

After this, users can start jobs which require nvidia libraries by specifying shifter --module nvidia.
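For the Slurm side, a hypothetical batch script might look like the sketch below. The image name, gres line, and binary path are placeholders, and this assumes the module_nvidia_* entries above are in udiRoot.conf and that the site's Slurm/shifter SPANK integration is in place:

```shell
#!/bin/bash
#SBATCH --image=docker:nvidia/cuda:9.0-devel
#SBATCH --gres=gpu:1
# Hypothetical usage sketch: the nvidia module here refers to the
# module_nvidia_* config entries above; deviceQuery stands in for
# any CUDA binary staged in the job's working directory.
srun shifter --module nvidia ./deviceQuery
```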
