Failed to import graphbolt due to libgraphbolt_pytorch_2.3.0.post300.so #7438

Open
Rhett-Ying opened this issue May 30, 2024 · 17 comments
Labels
bug:confirmed Something isn't working

Comments

@Rhett-Ying
Collaborator

🐛 Bug

Somehow post300 is appended to the target .so name, so the library cannot be found: the expected name is libgraphbolt_pytorch_2.3.0.so.
See more details in https://discuss.dgl.ai/t/filenotfounderror-cannot-find-dgl-c-graphbolt-library-in-dgl-2-2-1-and-pytorch-2-3-0/4419
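
A minimal sketch of how such a mismatch can arise, assuming the loader embeds the PyTorch version string in the file name more or less unchanged. The helper `expected_library_name` is hypothetical and is not taken from the DGL source; dropping a local build tag such as `+cu121` is likewise an assumption.

```python
def expected_library_name(torch_version: str) -> str:
    """Build a shared-library file name from a PyTorch version string.

    Hypothetical helper: assumes a local build tag such as "+cu121" is
    dropped and the rest of the string is embedded in the name unchanged.
    """
    public_version = torch_version.split("+")[0]
    return f"libgraphbolt_pytorch_{public_version}.so"


# A version string that carries a ".post300" suffix keeps it in the name ...
print(expected_library_name("2.3.0.post300"))
# ... so it differs from the name built for plain 2.3.0.
print(expected_library_name("2.3.0"))
```

Under these assumptions the two calls produce different file names, which is consistent with the lookup failing even though a library for 2.3.0 exists.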

To Reproduce

Steps to reproduce the behavior:

  1. conda install DGL package for torch 2.3.0.

Expected behavior

Environment

  • DGL Version (e.g., 1.0): DGL 2.2.1
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): PyTorch 2.3.0
  • OS (e.g., Linux):
  • How you installed DGL (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Additional context

@alexbarghi-nv

I'm also seeing this bug - any update?

@Rhett-Ying
Collaborator Author

This issue is not handled yet.

@alexbarghi-nv

I think I've partly figured out the source of this bug - I tried installing again with PyTorch from the pytorch channel instead of conda-forge and that resolved the issue. There's probably a different version string or something similar in the conda-forge distribution which is causing this.

@Davidxswang

I am also seeing this bug. I installed dgl from pip (2.2.1+cu121) for torch 2.2.2+cu121. I also tried torch 2.3.x, and it does not work either.

@Rhett-Ying
Collaborator Author

> I am also seeing this bug. I installed dgl from pip (2.2.1+cu121) for torch 2.2.2+cu121. I also tried torch 2.3.x, and it does not work either.

Did you install torch with pip or conda from conda-forge?

@Rhett-Ying
Collaborator Author

This seems to be a common issue. We could add a regex check when loading graphbolt.
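
A minimal sketch of what such a check might look like, assuming it means matching the PyTorch version string against a regular expression and keeping only the leading `major.minor.patch` part. The name `base_release` is hypothetical and this is not the fix adopted in DGL.

```python
import re

# Matches the leading "major.minor.patch" release of a version string.
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)")


def base_release(version: str) -> str:
    """Return the "major.minor.patch" prefix of a version string.

    Suffixes such as ".post300" or "+cu121" are ignored. Raises ValueError
    when the string does not start with three dot-separated numbers.
    """
    match = _RELEASE_RE.match(version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return ".".join(match.groups())


for raw in ("2.3.0.post300", "2.2.2+cu121", "2.3.1"):
    print(raw, "->", base_release(raw))
```

Normalizing the version this way would let the loader look for the same file name regardless of which channel supplied PyTorch, at the cost of treating a post-release as binary compatible with its base release, which is itself an assumption.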

Rhett-Ying self-assigned this Jun 13, 2024
Rhett-Ying added the bug:confirmed (Something isn't working) label Jun 13, 2024

@Davidxswang

> > I am also seeing this bug. I installed dgl from pip (2.2.1+cu121) for torch 2.2.2+cu121. I also tried torch 2.3.x, and it does not work either.
>
> Did you install torch with pip or conda from conda-forge?

I installed torch with pip.

@Rhett-Ying
Collaborator Author

@Davidxswang could you share your pip install command? Also, what versions do `pip list | grep torch` and `torch.__version__` report in your case?

@Davidxswang

Davidxswang commented Jun 13, 2024

> @Davidxswang could you share your pip install command? Also, what versions do `pip list | grep torch` and `torch.__version__` report in your case?

pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121

pytorch-lightning            2.2.1
torch                        2.2.2+cu121
torch_geometric              2.5.2
torchaudio                   2.2.2+cu121
torchdata                    0.7.1
torchmetrics                 1.3.2
torchvision                  0.17.2+cu121
In [2]: torch.__version__
Out[2]: '2.2.2+cu121' 
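
The same information can also be collected from inside Python. This is a minimal sketch using the standard-library `importlib.metadata` module, assuming the packages were installed under the distribution names `torch` and `dgl`; the helper `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions most relevant to this issue.
for name in ("torch", "dgl"):
    print(f"{name}: {installed_version(name)}")
```

This reads package metadata only and does not import `torch` or `dgl`, so it still works when `import dgl` itself fails.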

@Silhouettes-of-U

Silhouettes-of-U commented Jun 14, 2024

I got a similar error, but without post300 appended:

  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/distributed/dist_graph.py", line 11, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.3.1.so

@rfrs

rfrs commented Jun 14, 2024

  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/distributed/dist_graph.py", line 11, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.3.1.so

We are also facing the same issue with version 2.3.1.
Any progress? Thank you.

@Rhett-Ying
Collaborator Author

> I got a similar error, but without post300 appended:
>
> FileNotFoundError: Cannot find DGL C++ graphbolt library at /root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.3.1.so

This is expected, as DGL 2.2 does not support torch 2.3.1 yet. The latest supported torch version is 2.3.0.

@Rhett-Ying
Collaborator Author

@rfrs This is expected, as DGL 2.2 does not support torch 2.3.1 yet. The latest supported torch version is 2.3.0.

@jbm-composer

jbm-composer commented Jun 26, 2024

I installed using the 2.3.x version, cuda 12.1, conda (from the DGL website), with torch 2.3.0, and I'm seeing:

>>> import dgl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/james/src/jbm/dgl/python/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/home/james/src/jbm/dgl/python/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/home/james/src/jbm/dgl/python/dgl/distributed/dist_graph.py", line 12, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /home/james/src/jbm/dgl/python/dgl/graphbolt/libgraphbolt_pytorch_2.3.0.so

I tried with conda sourcing from both conda-forge and pytorch, btw

I also (just) tried uninstalling 2.3.x and installing 2.2.x, but got the same error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/james/src/jbm/dgl/python/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/home/james/src/jbm/dgl/python/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/home/james/src/jbm/dgl/python/dgl/distributed/dist_graph.py", line 12, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /home/james/src/jbm/dgl/python/dgl/graphbolt/libgraphbolt_pytorch_2.3.0.so

@Livvi

Livvi commented Jul 8, 2024

Is there any news on this? I'm also currently failing to set up the dgl library and always get the same DGL C++ graphbolt library error shown above.

@Rhett-Ying
Collaborator Author

> Is there any news on this? I'm also currently failing to set up the dgl library and always get the same DGL C++ graphbolt library error shown above.

Could you list the files that exist under //dgl/graphbolt/ after installation?
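
One way to produce that listing, as a minimal sketch: the helper `list_graphbolt_libraries` is hypothetical, and the directory to pass in is whatever graphbolt directory appears in your own error message.

```python
from pathlib import Path


def list_graphbolt_libraries(graphbolt_dir):
    """Return the sorted names of libgraphbolt_pytorch_*.so files in a directory."""
    pattern = "libgraphbolt_pytorch_*.so"
    return sorted(path.name for path in Path(graphbolt_dir).glob(pattern))


# Usage: pass the graphbolt directory reported in the FileNotFoundError, e.g.
# print(list_graphbolt_libraries("<your site-packages>/dgl/graphbolt"))
```

Comparing the returned names with the file name in the error message shows which PyTorch versions the installed DGL package actually ships libraries for.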

@Rhett-Ying
Collaborator Author

Is *.post300 a post-release version with additional bug fixes? https://discuss.pytorch.org/t/why-torch-version-returns-2-3-1-post300/206486
