🚀 DASLab GGUF Quantization Toolkit #16035
mkleinegger started this conversation in Show and tell
🚀 Announcing the DASLab GGUF Quantization Toolkit
We're excited to release the first open-source toolkit that brings GPTQ + EvoPress to the GGUF format, enabling heterogeneous quantization based on importance. The result: higher-quality models at the same file size.
What's inside
Why it matters
Unlike standard uniform quantization, our toolkit optimizes precision where it matters most.
Critical layers (e.g. attention) can use higher precision, while others (e.g. FFN) compress more aggressively.
With EvoPress search + GPTQ quantization, these trade-offs are discovered automatically.
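To make the "same file size" point concrete, here is a minimal sketch of the arithmetic behind heterogeneous quantization. It is not the toolkit's API: the layer groups, parameter counts, and nominal bit widths are made-up illustrative values, and `average_bits` is a hypothetical helper. Real GGUF quantization types also carry per-block overhead, which this ignores.

```python
# Hypothetical parameter counts per layer group (illustrative only,
# not taken from any real model).
layer_params = {
    "attn": 1_000_000,
    "ffn": 2_000_000,
}


def average_bits(bits_per_layer, params):
    """Return the parameter-weighted average bit width across layer groups."""
    total_bits = sum(bits_per_layer[name] * count for name, count in params.items())
    return total_bits / sum(params.values())


# Uniform assignment: every layer group gets the same nominal bit width.
uniform = {"attn": 4, "ffn": 4}

# Heterogeneous assignment: more bits for attention, fewer for FFN.
mixed = {"attn": 6, "ffn": 3}

print("uniform:", average_bits(uniform, layer_params), "bits/weight")
print("mixed:  ", average_bits(mixed, layer_params), "bits/weight")
```

Both assignments average the same number of bits per weight, so the resulting files would be roughly the same size. Which assignment preserves more model quality is the question the search step is meant to answer.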
Results
Below are zero-shot evaluations. Full benchmark results are available in the repo.
Resources
DASLab GGUF Quantization Toolkit (GitHub Repo Link)
We'd love your feedback, contributions, and experiments! 💡