Guiding Windows users through the WSL install process #449

Open
Jon77Ruler opened this issue Jan 29, 2025 · 0 comments

As Meridian has just been opened to the public, I suspect a wave of people (like myself) from a Windows background are going to be beating their heads against the wall trying to make Meridian work on GPUs.

Firstly: Meridian works (well enough) in Windows on CPU. I've installed it and run it on CPU, but the run time is prohibitively long without GPU support, and TensorFlow (which Meridian runs on) no longer supports GPUs on native Windows; 2.10 was the last release that did.

Which means turning to WSL2.

I need some help please as either:

Here are my steps so far (and hopefully this may later act as a user guide for Windows analysts).
Any command starting with > is run in Windows PowerShell, one starting with $ is run in the Ubuntu WSL2 instance, and one starting with [1] is run in JupyterLab.

Install the latest (standard) NVIDIA Game Ready drivers, but nothing else 'special'.
Activate WSL2 in Windows 11 as a feature and add an Ubuntu instance if one hasn't been created by default. In PowerShell:

> wsl --install

You should get a new Ubuntu icon in the start menu; click on that. Install Anaconda (https://docs.anaconda.com/anaconda/install/):

$ wget https://repo.anaconda.com/archive/Anaconda3-2024.10-1-Linux-x86_64.sh
$ bash ~/Anaconda3-2024.10-1-Linux-x86_64.sh

Say yes to the various options that pop up.
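GPU passthrough needs WSL2 rather than WSL1, so once Python is available inside the Ubuntu instance it may be worth confirming which one it is running on. The sketch below is an illustration only: `detect_wsl` is a hypothetical helper, and it assumes that Microsoft-built WSL2 kernels report a release string containing both "microsoft" and "WSL2" (custom kernels may not).

```python
import platform


def detect_wsl(release: str) -> str:
    """Classify a kernel release string as 'wsl2', 'wsl1', or 'not-wsl'."""
    lowered = release.lower()
    if "microsoft" not in lowered:
        return "not-wsl"
    # Heuristic (assumption): Microsoft's WSL2 kernels carry a "WSL2" suffix
    return "wsl2" if "wsl2" in lowered else "wsl1"


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string
    print(detect_wsl(platform.release()))
```

The same information is available from the Windows side with `> wsl -l -v`, which lists each distribution with its WSL version.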

Create an environment "meridian" using compatible Python (3.12)

(base)$ conda create -n meridian python=3.12

Install JupyterLab

(base)$ conda install -c conda-forge jupyterlab

Activate the new conda environment and install ipykernel. Use ipython to register a named kernel, e.g.

(base)$ conda activate meridian
(meridian)$ conda install ipykernel
(meridian)$ ipython kernel install --user --name=<any_name_for_kernel>
(meridian)$ conda deactivate
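Because JupyterLab is launched from the base environment while Meridian lives in the meridian environment, the notebook only sees Meridian if it uses the kernel registered above. A minimal sketch for checking which interpreter each registered kernel launches follows. `list_user_kernels` is a hypothetical helper, and the directory in the `__main__` block is the usual per-user kernelspec location on Linux, assuming `JUPYTER_DATA_DIR` and `XDG_DATA_HOME` are not set.

```python
import json
from pathlib import Path


def list_user_kernels(kernels_dir: Path) -> dict[str, str]:
    """Map each registered kernel name to the interpreter its kernel.json launches."""
    kernels = {}
    if not kernels_dir.is_dir():
        return kernels
    for spec_file in sorted(kernels_dir.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        # argv[0] is the executable Jupyter starts for this kernel
        argv = spec.get("argv") or ["<unknown>"]
        kernels[spec_file.parent.name] = argv[0]
    return kernels


if __name__ == "__main__":
    # Assumed default location for kernels installed with --user on Linux
    default_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    for name, python_path in list_user_kernels(default_dir).items():
        print(f"{name}: {python_path}")
```

If the kernel's interpreter path does not point inside the meridian environment, the notebook is running against a different Python than the one Meridian was installed into.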

Install build-essential so that you can install Meridian

(base)$ sudo apt-get update && sudo apt-get upgrade -y
(base)$ sudo apt autoremove -y
(base)$ sudo apt install build-essential -y

Close down the WSL window and open it back up.

(base)$ conda activate meridian
(meridian)$ pip install --upgrade google-meridian

After a long time it should successfully compile and install.

(meridian)$ conda deactivate

Then run (base)$ jupyter lab to start things up.

Now: this is where I'm stuck.
If I now start JupyterLab and run the first few lines to check for GPUs, the count comes back as zero. I get warnings such as "Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU".
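That warning usually concerns the CUDA runtime libraries TensorFlow tries to load rather than the GPU driver itself, so it can help to rule the driver out first. The sketch below checks whether the driver libraries are visible inside the Ubuntu instance. It assumes the Windows NVIDIA driver exposes them under `/usr/lib/wsl/lib`, which is where WSL2 normally mounts them, and `find_libraries` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: under WSL2 the Windows NVIDIA driver exposes its Linux libraries here
WSL_DRIVER_DIR = Path("/usr/lib/wsl/lib")


def find_libraries(directory: Path, prefixes: list[str]) -> dict[str, bool]:
    """Report, per library prefix, whether any file in `directory` starts with it."""
    names = [p.name for p in directory.iterdir()] if directory.is_dir() else []
    return {prefix: any(n.startswith(prefix) for n in names) for prefix in prefixes}


if __name__ == "__main__":
    # libcuda is the driver library; libnvidia-ml is the one nvidia-smi relies on
    results = find_libraries(WSL_DRIVER_DIR, ["libcuda.so", "libnvidia-ml.so"])
    for lib, found in results.items():
        print(f"{lib}: {'found' if found else 'MISSING'}")
```

If both are found, the driver side is probably fine and the missing pieces are more likely the CUDA libraries that TensorFlow expects in the Python environment.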

If I go into my meridian environment and install TensorFlow with its CUDA dependencies
(meridian)$ pip install tensorflow[and-cuda]

then the following in Python / JupyterLab gives warnings but "works":

[1] import tensorflow as tf

2025-01-29 16:02:28.969520: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-01-29 16:02:28.977839: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:479] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-01-29 16:02:28.989958: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:10575] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-01-29 16:02:28.989977: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1442] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-29 16:02:28.999165: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-01-29 16:02:29.490521: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

[2] print("GPUs: ", len(tf.config.list_physical_devices('GPU')))
GPUs: 1
2025-01-29 16:02:30.614133: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2025-01-29 16:02:30.732499: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2025-01-29 16:02:30.732545: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
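With a GPU detected, one optional setting that may be worth trying before running anything heavy is memory growth. By default TensorFlow reserves nearly all GPU memory on first use; with memory growth enabled it allocates on demand instead. This is a sketch under assumptions, not a confirmed fix for anything in this issue: `enable_gpu_memory_growth` is a hypothetical wrapper around the real `tf.config.experimental.set_memory_growth` call, and the import is guarded so the snippet also runs where TensorFlow is absent.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand.

    Returns the number of GPUs configured, or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must run before TensorFlow initializes the GPU; afterwards it raises RuntimeError
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    print("GPUs configured:", enable_gpu_memory_growth())
```

In a notebook this belongs in the first cell after a kernel restart, before any model or tensor is created.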

I can run the dummy script to a point, e.g.

# check if GPU is available
import tensorflow as tf
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9  # total RAM visible to the WSL instance, in GB
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("Num CPUs Available: ", len(tf.config.list_physical_devices('CPU')))

Your runtime has 8.2 gigabytes of available RAM

Num GPUs Available: 1
Num CPUs Available: 1

But the part where we sample from the posterior always crashes my kernel after about a minute, and judging by Task Manager the GPU never gets going:

%%time
mmm.sample_prior(500)
mmm.sample_posterior(n_chains=2, n_adapt=500, n_burnin=500, n_keep=1000)

(I've switched n_chains between 7 and 2 to test whether the number of chains is the problem.)
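A kernel that dies without a Python traceback is often a sign that the process ran out of memory, although that is only a guess for this case. The 8.2 GB reported earlier is what the WSL2 instance sees, which can be less than the host's RAM because WSL2 caps the VM's memory (adjustable through the `memory` setting under `[wsl2]` in `.wslconfig`). The sketch below reads `/proc/meminfo` so the figures can be watched while sampling runs; `parse_meminfo` is a hypothetical helper.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, float]:
    """Parse /proc/meminfo content into {field: gigabytes} for kB-valued fields."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Values are listed in kB, which in /proc/meminfo means units of 1024 bytes
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0]) * 1024 / 1e9
    return fields


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        info = parse_meminfo(meminfo.read_text())
        for key in ("MemTotal", "MemAvailable", "SwapFree"):
            print(f"{key}: {info.get(key, float('nan')):.1f} GB")
```

If `MemAvailable` falls towards zero shortly before the kernel dies, host memory rather than the GPU is the likelier bottleneck.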

Jon77Ruler changed the title from "Guiding Windows users through the install process" to "Guiding Windows users through the WSL install process" on Jan 30, 2025