
Evaluating a fine-tuned InternVL-76B model on 8x 40G GPUs fails with CUDA out of memory. Why? #735

Open
starevelyn opened this issue Jan 20, 2025 · 6 comments

@starevelyn

The command I ran:

    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python run.py \
        --data MMMU_DEV_VAL \
        --model InternVL2-76B-sft \
        --verbose \
        --work-dir results/

Inference started normally, but after a few questions it began to fail with:

(screenshots: CUDA out of memory tracebacks)

Why is this happening? With 8 GPUs at 40+ GB each, memory should not be running out.

@PhoenixZ810
Collaborator

Hi, MMMU produces fairly long responses during inference, and the 76B model itself takes a lot of GPU memory, so 8x 80G is generally the safe setup for inference. You can run nvidia-smi while inference is going to watch memory usage in real time; this is very likely an out-of-memory situation.
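For example, a minimal polling loop (just a sketch; the 5-second interval and the CSV fields queried are arbitrary choices) that logs per-GPU memory while the evaluation runs:

    import subprocess
    import time

    # Poll nvidia-smi for per-GPU memory usage; stop with Ctrl-C.
    while True:
        out = subprocess.run(
            ['nvidia-smi',
             '--query-gpu=index,memory.used,memory.total',
             '--format=csv,noheader'],
            capture_output=True, text=True,
        )
        print(out.stdout.strip())
        time.sleep(5)

If any GPU climbs to its 40 GB limit right before the crash, that confirms the OOM diagnosis.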

@PhoenixZ810 PhoenixZ810 self-assigned this Jan 20, 2025
@starevelyn
Author

(screenshot: device-mismatch traceback)

After switching to 8x 80G GPUs, it still fails with: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm), skipping this combination.
Is this also caused by insufficient memory? Inference had already reached 20%...

@starevelyn
Author

(screenshot)

@starevelyn
Author

(screenshot: GPU memory monitoring)
From the monitoring, the GPU memory is not even fully used.

@PhoenixZ810
Collaborator

Hi, the split_model function is what splits the model across multiple GPUs. Check the split step in the internvl init, and print device_map and visible_devices to confirm they match your local setup.
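A sketch of what that debug print could look like, assuming a VLMEvalKit-style InternVL wrapper where split_model() returns a Hugging Face device_map dict (the commented lines mark where it would go inside the model init; exact names may differ in your local copy):

    import os
    import torch

    # What this process can actually see.
    print('CUDA_VISIBLE_DEVICES:', os.environ.get('CUDA_VISIBLE_DEVICES'))
    print('torch.cuda.device_count():', torch.cuda.device_count())

    # Inside the internvl init, right after the split step, e.g.:
    # device_map = split_model(model_path)
    # print('device_map:', device_map)
    # Any module mapped to 'cpu' (or 'disk') would explain the
    # "cpu and cuda:0" mismatch raised from wrapper_CUDA_bmm.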

@PhoenixZ810
Collaborator

You need to set AUTO_SPLIT=1 at the front of the command line; only then is the split_model function used to split the model across GPUs.
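Applied to the command from the original post, that would be (same flags as before, only the environment variable added):

    AUTO_SPLIT=1 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python run.py \
        --data MMMU_DEV_VAL \
        --model InternVL2-76B-sft \
        --verbose \
        --work-dir results/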
