I am trying to get JAX to use the GPU on my new ThinkPad, which has an "NVIDIA RTX 4000 Ada 12 GB" graphics card.
No matter which combination of CUDA driver (12.4), cuDNN, jax, and jaxlib I try, the best result is either a) "No GPU/TPU found, falling back to CPU." or b) "failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error".
When I run the Data Sampler section from https://github.com/PredictiveIntelligenceLab/ImprovedDeepONets/blob/main/Stokes/PI_DeepONet_Stokes.ipynb, I get errors like the following:
a)
Installation:
pip install jaxlib==0.4.7+cuda12.cudnn88 jax==0.4.7 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
Run:
runfile('/home/saumya/NeuralN/Op Net/ImprovedDeepONets/Stokes/PI_DeepONet_Stokes-Copy1', wdir='/home/saumya/NeuralN/Op Net/ImprovedDeepONets/Stokes')
2024-03-19 11:48:27.682846: I external/xla/xla/service/service.cc:168] XLA service 0x8dd95c0 initialized for platform Interpreter (this does not guarantee that XLA will be used). Devices:
2024-03-19 11:48:27.682867: I external/xla/xla/service/service.cc:176] StreamExecutor device (0): Interpreter,
2024-03-19 11:48:27.689135: I external/xla/xla/pjrt/tfrt_cpu_pjrt_client.cc:218] TfrtCpuClient created.
2024-03-19 11:48:29.450971: E external/xla/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
2024-03-19 11:48:29.450988: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:168] retrieving CUDA diagnostic information for host: saumya-TP-GPU
2024-03-19 11:48:29.450991: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:175] hostname: saumya-TP-GPU
2024-03-19 11:48:29.451052: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:199] libcuda reported version is: 550.54.14
2024-03-19 11:48:29.451064: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:203] kernel reported version is: NOT_FOUND: could not find kernel module information in driver version file contents: "NVRM version: NVIDIA UNIX Open Kernel Module for x86_64 550.54.14 Release Build (dvs-builder@U16-A24-2-2) Thu Feb 22 01:44:50 UTC 2024
GCC version: gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04)
"
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
b)
Installation:
pip install jaxlib==0.4.9+cuda12.cudnn88 jax==0.4.9 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
Run:
2024-03-19 12:10:31.130411: I external/xla/xla/service/service.cc:168] XLA service 0x6a1d490 initialized for platform Interpreter (this does not guarantee that XLA will be used). Devices:
2024-03-19 12:10:31.130427: I external/xla/xla/service/service.cc:176] StreamExecutor device (0): Interpreter,
2024-03-19 12:10:31.134477: I external/xla/xla/pjrt/tfrt_cpu_pjrt_client.cc:433] TfrtCpuClient created.
2024-03-19 12:10:50.428065: E external/xla/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
2024-03-19 12:10:50.428083: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:168] retrieving CUDA diagnostic information for host: saumya-TP-GPU
2024-03-19 12:10:50.428086: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:175] hostname: saumya-TP-GPU
2024-03-19 12:10:50.428143: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:199] libcuda reported version is: 550.54.14
2024-03-19 12:10:50.428156: I external/xla/xla/stream_executor/cuda/cuda_diagnostics.cc:203] kernel reported version is: NOT_FOUND: could not find kernel module information in driver version file contents: "NVRM version: NVIDIA UNIX Open Kernel Module for x86_64 550.54.14 Release Build (dvs-builder@U16-A24-2-2) Thu Feb 22 01:44:50 UTC 2024
GCC version: gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04)
"
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
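Both runs report libcuda 550.54.14 while the kernel version lookup returns NOT_FOUND, even though the quoted driver version file contents include 550.54.14. Below is a minimal sketch for comparing the two by hand. It assumes the version file lives at `/proc/driver/nvidia/version` (the usual Linux location, not confirmed by the log), and `kernel_module_version` is a hypothetical helper name:

```python
import re
from pathlib import Path
from typing import Optional

# Assumed location of the driver version file; the log only quotes its contents.
VERSION_FILE = Path("/proc/driver/nvidia/version")


def kernel_module_version(contents: str) -> Optional[str]:
    """Return the first dotted version that follows 'Kernel Module', or None."""
    match = re.search(r"Kernel Module\b.*?\b(\d+(?:\.\d+)+)\b", contents)
    return match.group(1) if match else None


# Contents as quoted in the log above.
sample = (
    "NVRM version: NVIDIA UNIX Open Kernel Module for x86_64 550.54.14 "
    "Release Build (dvs-builder@U16-A24-2-2) Thu Feb 22 01:44:50 UTC 2024\n"
    "GCC version: gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04)\n"
)
libcuda_version = "550.54.14"  # "libcuda reported version" from the log

# Compare the kernel module version with the user-space library version.
print(kernel_module_version(sample) == libcuda_version)

# On a machine with the driver loaded, check the live file as well.
if VERSION_FILE.exists():
    print(kernel_module_version(VERSION_FILE.read_text()))
```

If the two values match, as they do for the strings in the log, the NOT_FOUND line may come from how the "Open Kernel Module" wording is parsed rather than from a real version mismatch. That is an assumption, not something the log confirms.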
My system:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
$ nvidia-smi
Tue Mar 19 12:21:40 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 ERR! Off | 00000000:01:00.0 N/A | N/A |
|ERR! ERR! ERR! N/A / N/A | 14MiB / 12282MiB | N/A Default |
| | | ERR! |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
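The `nvidia-smi` table above shows `ERR!` where the GPU name, fan, temperature, performance state, and MIG mode should be, which suggests the problem sits at the driver level rather than only in JAX. Below is a small sketch for flagging such fields in captured output. `err_fields` is a hypothetical helper, and the sample rows are copied from the table:

```python
from typing import List, Tuple


def err_fields(smi_output: str) -> List[Tuple[int, int]]:
    """Return (line_number, count) for each line that contains 'ERR!'."""
    hits = []
    for number, line in enumerate(smi_output.splitlines(), start=1):
        count = line.count("ERR!")
        if count:
            hits.append((number, count))
    return hits


# The three affected rows from the nvidia-smi output above.
sample = "\n".join([
    "| 0 ERR! Off | 00000000:01:00.0 N/A | N/A |",
    "|ERR! ERR! ERR! N/A / N/A | 14MiB / 12282MiB | N/A Default |",
    "| | | ERR! |",
])

hits = err_fields(sample)
total = sum(count for _, count in hits)
print(hits, total)
```

To check a live system, the same function can be fed the saved text of an `nvidia-smi` run; any non-empty result means the driver could not read at least one field.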
Python version:
$ whereis python | tr ' ' '\n' | grep ^/ | sort
/home/saumya/anaconda3/envs/OpNet/bin/python
$ python --version && python3 --version
Python 3.9.18
Python 3.9.18
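Since the installs above pin jax and jaxlib to the same release, it can help to confirm what actually ended up in the `OpNet` environment. Below is a minimal sketch using only the standard library. `installed_versions` and `base_version` are hypothetical helper names, and the script makes no assumption about which versions are installed:

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def base_version(version: str) -> str:
    """Drop a local suffix such as '+cuda12.cudnn88'."""
    return version.split("+", 1)[0]


found = installed_versions(["jax", "jaxlib"])
print(found)

# Only compare when both packages are present in this environment.
if all(found.values()):
    print(base_version(found["jax"]) == base_version(found["jaxlib"]))
```

Run it with the interpreter listed above (`/home/saumya/anaconda3/envs/OpNet/bin/python`) so that it inspects the same environment the notebook uses.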