
Conversation

krisbiradar


Introduces the LLamaFlashAttentionType enum and integrates flash attention configuration into LLamaContextParams. Adds support for diffusion-based models in SafeLlamaModelHandle. Updates NativeApi and SafeLLamaContextHandle with new adapter metadata and sequence-state methods. Syncs the llama.cpp submodule.
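
For context, a minimal sketch of what the new flash-attention surface might look like on the C# side. The type names LLamaFlashAttentionType and LLamaContextParams come from the description above; the enum values and the field name are assumptions, mirroring llama.cpp's `llama_flash_attn_type`, and are not copied from this PR:

```csharp
// Sketch only: enum values assumed to mirror llama.cpp's
// llama_flash_attn_type, not taken from this PR.
public enum LLamaFlashAttentionType
{
    /// <summary>Let llama.cpp decide whether to use flash attention.</summary>
    Auto = -1,

    /// <summary>Flash attention disabled.</summary>
    Disabled = 0,

    /// <summary>Flash attention enabled.</summary>
    Enabled = 1,
}

// Hypothetical view of how the enum might be exposed on the native
// context parameters struct (field name assumed, other fields elided).
public struct LLamaContextParams
{
    /// <summary>Requested flash attention behaviour for this context.</summary>
    public LLamaFlashAttentionType flash_attn_type;
}
```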
@martindevans
Member

Thanks for putting this together. I've started a build here to produce new binaries, once those are ready we can update the csproj and run the tests.

@krisbiradar
Author

Hi @martindevans, I guess I need to hardcode the binary file URL for now; once the changes are tested and ready to merge, we can restore the original URL, right?

@krisbiradar
Author

I've changed it, assuming I had to hardcode it. If anything else is required, do let me know.

@martindevans
Member

Sorry for the delay, I'll create a new release on https://github.com/SciSharp/LLamaSharpBinaries shortly, then you can just put the ID of that release into the csproj file.

@martindevans
Member

Ok, I've created https://github.com/SciSharp/LLamaSharpBinaries/releases/tag/86587da. You can put that release ID into the csproj here and it should auto-download on build (probably best to do a clean rebuild to be sure).
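
For anyone following along, the requested csproj change would look roughly like the following. The property name `BinaryReleaseId` is a guess at the convention, so check LLamaSharp's actual csproj before copying; only the release tag `86587da` comes from the comment above:

```xml
<!-- Hypothetical sketch: the real property name in LLamaSharp's csproj may differ. -->
<PropertyGroup>
  <!-- ID of the SciSharp/LLamaSharpBinaries release to fetch on build -->
  <BinaryReleaseId>86587da</BinaryReleaseId>
</PropertyGroup>
```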

@krisbiradar
Author

Will do it in some time.

@krisbiradar
Author


Hi @martindevans, a few tests are failing, but I checked out the last commit before my contribution and the tests fail there too... am I missing something?
