
OOM issue, GPU is an A100 40G #42

Open
gongye19 opened this issue May 8, 2024 · 5 comments

@gongye19 commented May 8, 2024

With llama factory I can SFT llama3-8B using deepspeed zero2, but with this framework I get OOM under deepspeed zero2 even with a batch size of 1.
Training with zero3 becomes very slow, and this warning appears:

2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
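
For reference, this is the kind of ZeRO-2 config I was planning to try next, with CPU optimizer offload added (a sketch only; the file name and the exact values are my guesses, not the config shipped with this repo):

```python
# Hypothetical ZeRO-2 config with CPU optimizer offload (illustrative only).
# The "auto" values assume the HuggingFace Trainer DeepSpeed integration
# fills them in from the training arguments.
import json

ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```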

@fe1ixxu (Owner) commented May 12, 2024

The OOM issue could be because llama3 has a 128K vocab size, while llama2's is 32K.
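
As a back-of-the-envelope check (illustrative numbers only, assuming the logits are upcast to fp32 for the loss), the logits tensor alone scales linearly with vocab size:

```python
# Rough estimate of fp32 logits memory for one forward pass:
# batch * seq_len * vocab_size * 4 bytes.
def logits_gib(batch, seq_len, vocab_size, bytes_per_elem=4):
    return batch * seq_len * vocab_size * bytes_per_elem / 1024**3

print(logits_gib(1, 4096, 32_000))   # llama-2 (32K vocab):  ~0.49 GiB
print(logits_gib(1, 4096, 128_256))  # llama-3 (128K vocab): ~1.96 GiB
```

On top of that, the input embedding and lm_head weights, their gradients, and their optimizer states also grow with the vocab size.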

@gongye19 (Author)

> The OOM issue could be because llama3 has a 128K vocab size, while llama2's is 32K.

I tried deepseek-7B and hit the same issue.

@fe1ixxu (Owner) commented May 14, 2024

The deepseek vocab size is also large -- 100K. The memory I used for training llama-2 was 64GB with 8/16 GPUs. Maybe you want to try using FSDP.
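
If you go the FSDP route, a rough sketch through the HF Trainer arguments could look like the following (assuming the training script accepts standard TrainingArguments; the exact fsdp_config key names vary a bit across transformers versions):

```python
# Rough sketch of switching from DeepSpeed to FSDP via the HF Trainer.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    gradient_checkpointing=True,   # large activation-memory saver on 40G cards
    bf16=True,
    fsdp="full_shard auto_wrap",
    fsdp_config={"transformer_layer_cls_to_wrap": "LlamaDecoderLayer"},
)
```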

@gongye19 (Author)

> The deepseek vocab size is also large -- 100K. The memory I used for training llama-2 was 64GB with 8/16 GPUs. Maybe you want to try using FSDP.

Thanks. Right now I first do SFT with llama factory, then run CPO with your framework. Is that workable?

@moore3930

Technically, 1 GPU (80G) should be enough to fine-tune LLaMA 7B with LoRA, but it always seems to OOM under your codebase unless we use 8 GPUs. I am just wondering why it uses so much memory.
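
For comparison, a minimal single-GPU LoRA setup along these lines normally fits a 7B model well under 80G (a sketch with a placeholder model name and LoRA hyperparameters, not this repo's defaults):

```python
# Minimal single-GPU LoRA setup; model name and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",     # placeholder 7B checkpoint
    torch_dtype=torch.bfloat16,
).to("cuda")
model.gradient_checkpointing_enable()

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()   # only a small fraction should be trainable
```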
