
require_grad #81

Open
1764758458 opened this issue Jun 14, 2024 · 6 comments

Comments

@1764758458

[screenshot: 微信图片_20240614110705 — warning printed during training]

@YingHuTsing (Collaborator)

Hi, this warning has no impact on the training process or model performance. Our training also prints this warning.

@1764758458 (Author)

Thank you very much for your answer! But the loss stays at 0, so the trainer assumes optimization has converged and training stops early.
[screenshot: QQ截图20240614181801]

@ggcr commented Jun 23, 2024

@1764758458

Make sure you are using the correct conv_version flag.

  • --conv_version phi for Phi-2, StableLM, Qwen-1.5
  • --conv_version llama for TinyLlama, OpenELM
  • --conv_version gemma for Gemma
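The mapping above can be sketched as a small lookup helper — a minimal illustration of ggcr's list, not part of the TinyLLaVA codebase (the function name and dict are hypothetical; only the model-to-flag pairs come from the comment):

```python
# Hypothetical helper: pick the --conv_version value for a given LLM backbone,
# following the mapping listed in the comment above.
CONV_VERSIONS = {
    "phi-2": "phi",
    "stablelm": "phi",
    "qwen-1.5": "phi",
    "tinyllama": "llama",
    "openelm": "llama",
    "gemma": "gemma",
}

def pick_conv_version(model_name: str) -> str:
    """Return the conversation-template flag for a backbone, or raise if unknown."""
    key = model_name.lower()
    if key not in CONV_VERSIONS:
        raise ValueError(f"No known --conv_version for model: {model_name}")
    return CONV_VERSIONS[key]

print(pick_conv_version("tinyllama"))  # prints "llama"
```

Passing a template that does not match the backbone's chat format is a common way to end up with degenerate loss values like the one reported above.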

@Daming-W commented Jul 2, 2024

> But the loss stays at 0, so the trainer assumes optimization has converged and training stops early.

I am facing the same error. Have you resolved this?
My recipe is clip-vit & dinov2-vit MoF with Vicuna-7b as the LLM.

@1764758458 (Author)

> I am facing the same error. Have you resolved this? My recipe is clip-vit & dinov2-vit MoF with Vicuna-7b as the LLM.

This happened to me when I ran with phi; after switching to tinyllama, training was normal.

@Daming-W commented Jul 2, 2024

> This happened to me when I ran with phi; after switching to tinyllama, training was normal.

Just solved it! On my side, changing --fp16 True to --bf16 True in the training script fixed it. There is a similar issue and solution in the DeepSpeed repo.
ref: microsoft/DeepSpeed#4017
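A plausible reason the fp16→bf16 switch helps (an assumption based on the linked DeepSpeed issue, not stated in this thread): float16 has a much narrower dynamic range than bfloat16, so large intermediates overflow to inf and small gradients underflow to 0, which can collapse the reported loss. A minimal NumPy sketch of the range difference (NumPy has no native bfloat16, so float32 stands in here — bfloat16 shares float32's 8-bit exponent, hence its range):

```python
import numpy as np

# float16: ~5.96e-8 is the smallest subnormal, ~65504 the largest finite value.
overflowed = np.float16(70000.0)   # beyond float16's max -> inf
underflowed = np.float16(1e-8)     # below float16's smallest subnormal -> 0.0
print(overflowed, underflowed)

# float32 (same exponent width as bfloat16) represents both values fine.
print(np.float32(70000.0), np.float32(1e-8))
```

Values that survive in bf16/fp32 but vanish or blow up in fp16 match the symptom above: a loss pinned at 0 under --fp16 that becomes normal under --bf16.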
