Failed to initialize NVML: GPU access blocked by the operating system #9938

Open

loliq opened this issue Apr 10, 2023 · 19 comments

@loliq

loliq commented Apr 10, 2023

Windows Version

Windows 10 [19045.2728]

WSL Version

1.1.6.0

Are you using WSL 1 or WSL 2?

  • WSL 2
  • WSL 1

Kernel Version

Linux version 5.15.90.1-microsoft-standard-WSL2

Distro Version

Ubuntu 22.04

Other Software

No response

Repro Steps

I installed WSL2 in Windows 10 [19045.2728].
[screenshot]

In Windows, the “nvidia-smi” output is:
[screenshot]

but in WSL2, the output is:
[screenshot]

Below are the solutions I have tried that didn't work:

  1. log in with administrator privileges
  2. update the driver to the latest version
  3. install the CUDA toolkit

My file list in C:\Windows\System32\lxss\lib is:

[screenshot: directory listing of C:\Windows\System32\lxss\lib]
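
For reference, the same check can be done from inside the distro: the Windows driver's user-mode GPU libraries are normally mounted at /usr/lib/wsl/lib. A minimal sketch, assuming the default WSL2 GPU setup:

# Inside the WSL2 distro: list the GPU libraries exposed by the Windows driver
ls -l /usr/lib/wsl/lib/

# The WSL build of nvidia-smi lives in the same directory
/usr/lib/wsl/lib/nvidia-smi

# The paravirtualized GPU device node should also be present
ls -l /dev/dxg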

Expected Behavior

nvidia-smi outputs the information related to my GPU, and I can use the GPU inside the WSL2 environment.

Actual Behavior

This is what happens when I launch nvidia-smi inside Ubuntu 22.04:

nvidia-smi

Failed to initialize NVML: GPU access blocked by the operating system
Failed to properly shut down NVML: GPU access blocked by the operating system

Diagnostic Logs

No response

@fschvart

I have the exact same problem

@benhillis benhillis added the GPU label Apr 10, 2023
@OneBlue
Collaborator

OneBlue commented Apr 11, 2023

/logs

@microsoft-github-policy-service
Contributor

Hello! Could you please provide more logs to help us better diagnose your issue?

To collect WSL logs, download and execute collect-wsl-logs.ps1 in an administrative powershell prompt:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1

The script will output the path of the log file once done.

Once completed, please upload the output files to this GitHub issue.

Click here for more info on logging

Thank you!

@fschvart

Hi,
In my case the issue was that WSL doesn't support A100 GPUs

@loliq
Author

loliq commented Apr 12, 2023

Hi, In my case the issue was that WSL doesn't support A100 GPUs

Thank you very much, I guess that was the problem. I tried another machine that uses a 3060 and it works.

@loliq
Author

loliq commented Apr 12, 2023

Hello! Could you please provide more logs to help us better diagnose your issue? […]

Thank you for replying; the log file is attached:
WslLogs-2023-04-12_11-08-54.zip

@lpdink

lpdink commented Apr 13, 2023

I have encountered a similar problem. nvidia-smi works well in WSL2, but it doesn't work properly in a Docker container started inside WSL2, failing with "Failed to initialize NVML: GPU access blocked by the operating system".
[screenshot]
I use the official image provided by PyTorch and am confident that docker-ce and nvidia-container-toolkit have been installed correctly. In fact, when I use the same installation script on a native Ubuntu system, the GPU in the container works well.
Here is my system version info:
[screenshots]
Looking forward to your reply, thank you in advance for your help.
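
For reference, the in-container check that fails here looks like the following (the CUDA image tag is just an example; any CUDA-enabled image should behave the same way):

# First confirm the GPU is visible in WSL2 itself
nvidia-smi

# Then check whether a container can see it; this is where the NVML error shows up
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi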

@bsekachev

Exactly the same problem as @loliq has.
Yesterday it worked fine; today it does not work anymore.

Windows updated last night.
Installed: KB5025239, KB2267602, KB890830

@alf-wangzhi

I have encountered a similar problem. nvidia-smi works well in WSL2, but it doesn't work properly in a Docker container started inside WSL2, failing with "Failed to initialize NVML: GPU access blocked by the operating system". […]

I encountered the same problem. Has it been resolved? Does this error mean that the GPU cannot be used?

@lpdink

lpdink commented Apr 18, 2023

Yeah, just see #9962 @alf-wangzhi

@alf-wangzhi

Thank you so much, it means a lot @lpdink

@anton-petrov

The problem was solved, time to close this issue.

@qwqawawow

qwqawawow commented Jun 15, 2024

I got this error on my machine

WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26091.1-240325-1447.ge-release
Windows version: 10.0.22631.3737

and the kernel is 6.6.32-microsoft-standard-WSL2, compiled by myself.
Any suggestions?
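
One thing worth checking with a self-compiled kernel: WSL2 GPU paravirtualization relies on the dxgkrnl driver from Microsoft's WSL2-Linux-Kernel tree, so a quick sanity check (a sketch; the last command assumes /proc/config.gz is enabled in your kernel config) is:

# The paravirtualized GPU device node should exist
ls -l /dev/dxg

# Look for dxg driver messages from boot
dmesg | grep -i dxg

# Confirm the custom kernel was built with the dxgkrnl driver enabled
zgrep -i DXGKRNL /proc/config.gz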

@bert-jan

I have the same issue with an NVIDIA A16.

WSL version: 2.1.5.0
Kernel version: 5.15.146.1-2
WSLg version: 1.0.60
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.3570

@Zhiwei-Zhai

I have the same issue, with:

System: Windows 10, 22H2, 19045.4651, GPU: NVIDIA Tesla V100
Ubuntu-22.04

WSL version: 2.3.13.0
Kernel version: 6.6.36.3-1
WSLg version: 1.0.63
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.19045.4651

@Rahman2001

Here, I found a solution in #9962 that says:

Inside the file /etc/nvidia-container-runtime/config.toml change no-cgroups from true to false

It worked for me. Hope it does for you too.
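
If you prefer to apply that change from the shell, something like the following should work (a sketch; the path is the NVIDIA Container Toolkit's default config location, and Docker needs a restart for the change to take effect):

# Flip no-cgroups from true to false in the NVIDIA container runtime config
sudo sed -i 's/no-cgroups = true/no-cgroups = false/' /etc/nvidia-container-runtime/config.toml

# Restart Docker so the runtime picks up the new setting
sudo service docker restart   # or: sudo systemctl restart docker

# Re-test GPU access inside a container (example image tag)
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi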

@ankrwu

ankrwu commented Nov 4, 2024

Inside the file /etc/nvidia-container-runtime/config.toml change no-cgroups from true to false

@Zhiwei-Zhai

I have the same issue, with: […]

Tried everything, but it's still not working. WSL does not support Tesla GPUs.

[screenshot]

@umarbutler

Very bizarre, this happened to me randomly, but it was fixed after restarting...
