Trying to run PeakVI model returns mpi4py error #2836

Closed
Sun-storm opened this issue Jun 10, 2024 · 2 comments

@Sun-storm

Whenever I try to run the model, or (after having run the model somewhere else) try to load it with model = scvi.model.PEAKVI.load(model_dir, adata=adata), I get the error shown below. I've tried installing Open MPI and quite a few other things, but nothing seems to work.

python
Python 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:50:58) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import tempfile
>>> from pathlib import Path
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import pooch
>>> import scanpy as sc
>>> import scvi
>>> import torch
>>> os.chdir('/home/.../data')
>>> adata = scvi.data.read_h5ad('combined_seurat_object.h5ad')
>>> model_dir = os.path.join("model/peakvi_trained", "model.pt")
>>> model = scvi.model.PEAKVI.load(model_dir, adata=adata)
/services/tools/scvi-tools/1.1.2/lib/python3.12/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")

--------------------------------------------------------------------------------
<stdin> 1 <module>
1

_base_model.py 662 load
_, _, device = parse_device_args(

_utils.py 101 parse_device_args
connector = _AcceleratorConnector(accelerator=accelerator, devices=devices)

accelerator_connector.py 152 __init__
self.cluster_environment: ClusterEnvironment = self._choose_and_init_cluster_environment()

accelerator_connector.py 421 _choose_and_init_cluster_environment
if env_type.detect():

mpi.py 71 detect
from mpi4py import MPI

ImportError:
libmpi.so.40: cannot open shared object file: No such file or directory

Versions:

1.1.3
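
For anyone hitting the same traceback: the last frame is `from mpi4py import MPI` failing because `libmpi.so.40` cannot be loaded. A minimal sketch for checking that in isolation, outside scvi-tools, might look like the following. The function name `diagnose_mpi4py` is made up for illustration, and `ctypes.util.find_library` may not see libraries that live only inside a conda environment, so treat a `None` there as a hint rather than proof.

```python
import importlib
import importlib.util
from ctypes.util import find_library


def diagnose_mpi4py():
    """Report whether mpi4py is installed and whether its MPI module imports.

    Hypothetical helper for illustration; not part of scvi-tools or Lightning.
    """
    report = {
        "mpi4py_installed": importlib.util.find_spec("mpi4py") is not None,
        "mpi_importable": False,
        # Name of a system MPI shared library if the loader can find one,
        # otherwise None (may miss libraries local to a conda environment).
        "libmpi_found": find_library("mpi"),
        "error": None,
    }
    if report["mpi4py_installed"]:
        try:
            # Same import that fails in the traceback above.
            importlib.import_module("mpi4py.MPI")
            report["mpi_importable"] = True
        except ImportError as exc:
            report["error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in diagnose_mpi4py().items():
        print(f"{key}: {value}")
```

If `mpi4py_installed` is true but `mpi_importable` is false, the environment contains an mpi4py build whose MPI runtime library cannot be loaded, which would be consistent with the ImportError above.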

@Sun-storm Sun-storm added the bug label Jun 10, 2024
@canergen
Member

This seems to be a problem with the CUDA installation on that specific workstation: torch can't initialize it. Can you check that nvidia-smi works on the command line and that calls like the following succeed:
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.current_device()
0
>>> torch.cuda.device(0)
<torch.cuda.device at 0x7efce0b03be0>
>>> torch.cuda.get_device_name(0)
'GeForce GTX 950M'

If these don't work and you can't set up a new environment with only a CUDA-enabled torch installed, you might want to check with your system administrator or update your CUDA drivers yourself.
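
The checks suggested above can also be collected into one small script, which is convenient on a shared cluster where an interactive session is awkward. This is only a sketch: `cuda_report` is a hypothetical helper name, and it assumes nothing beyond the `torch.cuda` calls already listed plus a lookup of `nvidia-smi` on the PATH.

```python
import shutil


def cuda_report():
    """Collect the basic CUDA checks into a dict (hypothetical helper)."""
    report = {
        # True if an nvidia-smi executable is on PATH; this does not run it.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "torch_installed": False,
        "cuda_available": False,
        "devices": [],
    }
    try:
        import torch
    except ImportError:
        # torch is missing entirely; nothing more to check.
        return report

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        # One name per visible GPU, as returned by torch.cuda.get_device_name.
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

An empty `devices` list together with `cuda_available: False` would point to the driver or CUDA setup rather than to scvi-tools itself.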

@canergen
Member

canergen commented Jul 5, 2024

Closed due to inactivity.

@canergen canergen closed this as completed Jul 5, 2024