Does sat support saving checkpoints in fp16 or bf16? #160

Open
xxxwuwq opened this issue Jan 19, 2024 · 5 comments

Comments

@xxxwuwq

xxxwuwq commented Jan 19, 2024

Does sat support saving checkpoints in fp16 or bf16?

@1049451037
Member

Saving a checkpoint preserves the original parameter dtype. Do you mean you want to train the model in fp32 but save it in fp16? If you train a model in fp16, it will be saved in fp16 by default.

@xxxwuwq
Author

xxxwuwq commented Jan 19, 2024

Yes. If that were supported, I could choose what I need. When I fine-tune with cogvlm, it only seems to support saving checkpoints in fp32, which needs 60GB+ of storage per model; fp16 may be enough. The official API doesn't seem to support this.
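
As a rough check on the storage figure above: checkpoint size scales with bytes per parameter, so going from fp32 (4 bytes) to fp16 or bf16 (2 bytes) should roughly halve it. The sketch below assumes the checkpoint holds only model weights (no optimizer state), and `checkpoint_size_gb` is a hypothetical helper, not part of sat.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2}


def checkpoint_size_gb(size_gb, src="fp32", dst="fp16"):
    """Estimate checkpoint size after changing the parameter dtype.

    Assumes the file is dominated by model weights, so size is
    proportional to bytes per element.
    """
    return size_gb * BYTES_PER_ELEMENT[dst] / BYTES_PER_ELEMENT[src]


print(checkpoint_size_gb(60))  # 30.0
```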

@1049451037
Member

1049451037 commented Jan 19, 2024

The cogvlm finetune saves the model in bf16, unless you train it in fp32.

@1049451037
Member

Saving the model in a dtype different from the one you train with is memory-consuming, because you need a copy of the whole model to do the conversion.
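
To illustrate the point about the extra copy, here is a minimal sketch in plain PyTorch, assuming the checkpoint is an ordinary state dict of tensors. `cast_state_dict` is a hypothetical helper, not a sat API, and sat's own checkpoint layout may differ.

```python
import torch


def cast_state_dict(state_dict, dtype=torch.float16):
    """Return a new dict whose floating-point tensors are cast to `dtype`.

    Non-floating tensors (e.g. integer buffers) and non-tensor values
    are passed through unchanged.
    """
    converted = {}
    for name, value in state_dict.items():
        if torch.is_tensor(value) and value.is_floating_point():
            # When the dtype differs, .to(dtype) allocates a new tensor,
            # so the source and the cast copy coexist in memory.
            converted[name] = value.to(dtype)
        else:
            converted[name] = value
    return converted


if __name__ == "__main__":
    # Toy stand-in for a model trained in fp32.
    model = torch.nn.Linear(4, 2)
    fp16_state = cast_state_dict(model.state_dict(), torch.float16)
    print({k: v.dtype for k, v in fp16_state.items()})
```

While `converted` and the original state dict are both alive, memory holds the full-precision weights plus the cast copy, which is the overhead described above.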

@xxxwuwq
Author

xxxwuwq commented Jan 20, 2024

thanks a lot
