I am deeply interested in your Optimum-TPU project.
Currently, I am planning to fine-tune the Llama 3.1 and 3.2 models for my native language and a specific domain, using a fairly large dataset (approximately 60B tokens).
I am using Google TPU Pods, but I have been facing significant challenges in implementing model parallel training from scratch, saving unified checkpoints in the safetensors format, setting up appropriate logging, and configuring hyperparameters.
While exploring solutions, I came across the Optimum-TPU project, which seems incredibly useful. However, I noticed that it currently only supports up to Llama 3.
Are there any plans to extend support to Llama 3.1 and 3.2 for fine-tuning?
I strongly hope that future updates will include support for these versions as well.
Thank you for considering this request!
Hi @DimensionSTP !
We do not support Llama 3.1 or 3.2 yet, but we should add that support before the end of the year.
Having said that, if all you want is to fine-tune these models, you can probably just follow the steps in our Llama fine-tuning example, and it should work (though this is still untested).
For serving/inference you would still need better sharding support, but for fine-tuning it should be fine.