Add Olmo3 implementation #16015
Conversation
I used the model conversion example for testing. I got the following results when using bf16 on shanearora/2025-sep-a-base-model, modified to have yarn rope scaling enabled.
Also, below is the allenai/OLMo-2-0425-1B with fp32.
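For reference, enabling YaRN rope scaling on an HF checkpoint is typically done through the `rope_scaling` block in the model's `config.json`. The values below are illustrative placeholders, not the settings used in this test:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 4096
  }
}
```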
Co-authored-by: Sigbjørn Skjæret <[email protected]>
@2015aroras What tool are you using to compare the conversion?
@pwilkin I am using the model conversion tools inside this repo. These have been created to help make sure HF to llama.cpp conversion is accurate. The logs above are from the Model logits verification step.
Ah, that's nice, I haven't used that specific one yet :)
All the check failures seem to be unrelated to this change. Before merging master again, an iOS check was failing instead. So imo this is ready to merge.
Yes, sorry for the delay, just a minor cosmetic change and we'll merge. :)
Co-authored-by: Sigbjørn Skjæret <[email protected]>
* Add HF to gguf conversion logic for Olmo3

* Add Olmo3 implementation

* Update rope comment

* Fix indentation

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Apply suggestion from @CISC

Co-authored-by: Sigbjørn Skjæret <[email protected]>

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
This PR adds support for the upcoming Olmo 3. The main architectural differences from Olmo 2 are:
Since the architecture is very similar to Olmo 2, this PR opts to merge Olmo 3 changes into the Olmo 2 implementation (similar to vllm-project/vllm#24534). I can create a separate Olmo 3 implementation instead if preferred.
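For readers unfamiliar with the YaRN rope scaling mentioned above, the sketch below shows the frequency blending at its core: high-frequency RoPE dimensions keep their original frequencies, low-frequency ones are interpolated (divided by the scale factor), with a linear ramp in between. This is a simplified illustration of the general technique, not the llama.cpp or Olmo 3 implementation, and all parameter values are illustrative defaults:

```python
import math

def yarn_rope_freqs(head_dim, base=10000.0, scale=4.0,
                    orig_ctx=4096, beta_fast=32.0, beta_slow=1.0):
    """Blend original and position-interpolated RoPE inverse frequencies
    using a YaRN-style ramp. Defaults are illustrative, not Olmo 3's."""
    def find_dim(num_rotations):
        # Dimension index whose wavelength completes `num_rotations`
        # full turns over the original context length.
        return (head_dim * math.log(orig_ctx / (num_rotations * 2 * math.pi))
                / (2 * math.log(base)))

    low = max(math.floor(find_dim(beta_fast)), 0)
    high = min(math.ceil(find_dim(beta_slow)), head_dim // 2 - 1)

    freqs = []
    for i in range(head_dim // 2):
        inv = base ** (-2.0 * i / head_dim)  # standard RoPE inverse frequency
        # ramp = 0 keeps the original (extrapolated) frequency,
        # ramp = 1 uses the fully interpolated one (divided by scale).
        ramp = min(max((i - low) / max(high - low, 1), 0.0), 1.0)
        freqs.append(inv * (1.0 - ramp) + (inv / scale) * ramp)
    return freqs
```

With these defaults, the first (highest-frequency) dimension keeps its original frequency of 1.0, while the last dimension is fully interpolated.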