Add Olmo3 #445
|
MLX port of PR |
|
Any progress on this? Is there anything missing? |
|
Afaik there is no model or config to test this? Would be good to test the model before landing it. |
|
I can provide you with a model. What do you need? |
|
Great! Access to a Hugging Face repo with the model safetensors, config, and tokenizer would be ideal. |
|
You can use https://huggingface.co/shanearora/2025-sep-a-base-model-with-yarn to get yourself going. The tokenizer is https://huggingface.co/allenai/dolma2-tokenizer. The given model is not instruction-tuned and it's somewhat early in pretraining but you should expect it to produce long rambling continuations if the model is implemented correctly. |
This is looking good (though I'm not an mlx dev)! Just some minor corrections.
|
Perfect, thanks, I’ll update it tomorrow! |
|
@awni @2015aroras the implementation is finished and can be merged! training:inference |
Thanks for the addition! LGTM.