I am trying to compile Transformer Engine (TE) on a Slurm cluster because containers aren't fully supported there (MPI issues).
My setup is like this:
All the variables echo well. I can build megatron-lm and apex in this environment without problems, but not TE.
Error:
conda/envs/megatron/lib/python3.10/site-packages/torch/include/ATen/cudnn/cudnn-wrapper.h:3:10: fatal error: cudnn.h: No such file or directory
    3 | #include <cudnn.h>
      |          ^~~~~~~~~
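Since the compiler reports that it cannot find `cudnn.h`, one thing that may help narrow this down is checking whether the header exists anywhere in the environment at all. The following is only a rough diagnostic sketch, not part of the original setup: the helper names `candidate_include_dirs` and `find_cudnn_header` are hypothetical, and the environment variable names and the `nvidia/cudnn/include` wheel layout are assumptions about common installs that may not match this cluster.

```python
import os
import sys
from pathlib import Path


def candidate_include_dirs(env=None, prefix=None):
    """Return directories that might hold cudnn.h, in the order they are checked."""
    env = os.environ if env is None else env
    prefix = Path(sys.prefix if prefix is None else prefix)
    dirs = []
    # Variables often used to point at a cuDNN or CUDA install.
    # Assumed names: which of them a given build actually reads may differ.
    for var in ("CUDNN_PATH", "CUDNN_HOME", "CUDA_HOME", "CONDA_PREFIX"):
        root = env.get(var)
        if root:
            dirs.append(Path(root) / "include")
    # The include directory of the active Python environment.
    dirs.append(prefix / "include")
    # Assumed layout of pip-installed cuDNN wheels inside site-packages.
    for entry in sys.path:
        if entry.endswith("site-packages"):
            dirs.append(Path(entry) / "nvidia" / "cudnn" / "include")
    return dirs


def find_cudnn_header(dirs):
    """Return the directories from `dirs` that actually contain cudnn.h."""
    return [d for d in dirs if (d / "cudnn.h").is_file()]


if __name__ == "__main__":
    dirs = candidate_include_dirs()
    hits = find_cudnn_header(dirs)
    if hits:
        for d in hits:
            print(f"found cudnn.h in {d}")
    else:
        print("cudnn.h not found in any of:")
        for d in dirs:
            print(f"  {d}")
```

If the script reports a directory, that directory could be made visible to the compiler, for example by adding it to `CPATH`, which GCC searches for headers. Whether the TE build picks up any of the variables listed above should be verified against its installation instructions. If nothing is found, the cuDNN development headers are probably not installed in this environment.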