Add support for gemma 3n #1248
base: master
Conversation
Introduces the LLamaFlashAttentionType enum and integrates flash attention configuration into LLamaContextParams. Adds support for diffusion-based models in SafeLlamaModelHandle. Updates NativeApi and SafeLLamaContextHandle with new adapter metadata and sequence state methods. Syncs the llama.cpp submodule.
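To make the shape of the change concrete, here is a minimal sketch of how the flash-attention switch described above might look on the managed side. The member names (`LLamaFlashAttentionType`, the `FlashAttention` field, and the `Auto`/`Disabled`/`Enabled` values) are assumptions drawn from this description and from llama.cpp's corresponding native enum, not necessarily the exact names in the merged code.

```csharp
// Sketch only: names are assumed from the PR description, not the final LLamaSharp API.

/// <summary>
/// Managed mirror of llama.cpp's flash-attention switch.
/// </summary>
public enum LLamaFlashAttentionType
{
    /// <summary>Let the backend decide based on hardware and model support.</summary>
    Auto = -1,

    /// <summary>Never use flash attention.</summary>
    Disabled = 0,

    /// <summary>Always use flash attention.</summary>
    Enabled = 1,
}

public struct LLamaContextParams
{
    // ... existing native fields (context size, batch size, etc.) ...

    /// <summary>Flash-attention configuration passed through to the native context parameters.</summary>
    public LLamaFlashAttentionType FlashAttention;
}
```

A caller would then set `FlashAttention = LLamaFlashAttentionType.Auto` when building the context parameters and let the native layer pick the implementation per hardware.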
Thanks for putting this together. I've started a build here to produce new binaries; once those are ready we can update the csproj and run the tests.
Hi @martindevans, I guess I need to hardcode the binary file URL for now. Once the changes are tested and ready to merge, we can restore the original URL, right?
I have changed it, assuming I had to hardcode it. If anything else is required, do let me know.
Sorry for the delay. I'll create a new release on https://github.com/SciSharp/LLamaSharpBinaries shortly; then you can just put the ID of that release into the csproj file.
OK, I've created https://github.com/SciSharp/LLamaSharpBinaries/releases/tag/86587da. You can put that release ID into the csproj here and it should auto-download on build (probably best to do a clean and rebuild to be sure).
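For readers following along, that step boils down to editing a single property in the project file so the build downloads the matching binaries. The property name below is purely illustrative; the actual name is whichever one LLamaSharp's existing csproj and build targets define for the binary release.

```xml
<!-- Illustrative only: the real property name comes from LLamaSharp's existing csproj/build targets. -->
<PropertyGroup>
  <BinaryReleaseId>86587da</BinaryReleaseId>
</PropertyGroup>
```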
Will do it sometime soon.
Hi @martindevans, a few tests are failing, but I checked out the last commit before my contribution and even there the tests are failing... am I missing something?