
Releases: EAddario/llama.cpp

b5995

26 Jul 06:52
9b8f3c6
musa: fix build warnings (unused variable) (#14869)

Signed-off-by: Xiaodong Ye <[email protected]>

b5994

25 Jul 22:12
c7f3169
ggml-cpu : disable GGML_NNPA by default due to instability (#14880)

* docs: update s390x document for sentencepiece

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit e086c5e3a7ab3463d8e0906efcfa39352db0a48d)

* docs: update huggingface links + reword

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 8410b085ea8c46e22be38266147a1e94757ef108)

* ggml-cpu: disable ggml-nnpa compile flag by default

fixes #14877

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 412f4c7c88894b8f55846b4719c76892a23cfe09)

* docs: update s390x build docs to reflect nnpa disable

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit c1eeae1d0c2edc74ab9fbeff2707b0d357cf0b4d)

---------

Signed-off-by: Aaron Teo <[email protected]>

b5971

23 Jul 14:07
221c0e0
ci : correct label refactor->refactoring (#14832)

b5964

22 Jul 22:27
acd6cb1
ggml : model card yaml tab->2xspace (#14819)

b5942

19 Jul 17:16
9008328
imatrix : use GGUF to store importance matrices (#9400)

* imatrix : allow processing multiple chunks per batch

* perplexity : simplify filling the batch

* imatrix : fix segfault when using a single chunk per batch

* imatrix : use GGUF to store imatrix data

* imatrix : fix conversion problems

* imatrix : use FMA and sort tensor names

* py : add requirements for legacy imatrix convert script

* perplexity : revert changes

* py : include imatrix converter requirements in toplevel requirements

* imatrix : avoid using designated initializers in C++

* imatrix : remove unused n_entries

* imatrix : allow loading mis-ordered tensors

Sums and counts tensors no longer need to be consecutive.

* imatrix : more sanity checks when loading multiple imatrix files

* imatrix : use ggml_format_name instead of std::string concatenation

Co-authored-by: Xuan Son Nguyen <[email protected]>

* quantize : use unused imatrix chunk_size with LLAMA_TRACE

* common : use GGUF for imatrix output by default

* imatrix : two-way conversion between old format and GGUF

* convert : remove imatrix to gguf python script

* imatrix : use the function name in more error messages

* imatrix : don't use FMA explicitly

This should make comparisons between the formats easier
because this matches the behavior of the previous version.

* imatrix : avoid returning from void function save_imatrix

* imatrix : support 3d tensors with MUL_MAT

* quantize : fix dataset name loading from gguf imatrix

* common : move string_remove_suffix from quantize and imatrix

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* imatrix : add warning when legacy format is written

* imatrix : warn when writing partial data, to help guess dataset coverage

Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.

* imatrix : avoid loading model to convert or combine imatrix

* imatrix : avoid using imatrix.dat in README

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
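With this change, `llama-imatrix` writes GGUF by default. A minimal usage sketch (file names are placeholders, and the exact flags are assumed from the llama.cpp imatrix tool, not taken from this release note):

```shell
# Compute an importance matrix over a calibration text file;
# the output now defaults to the GGUF format.
./llama-imatrix -m model-F16.gguf -f calibration.txt -o imatrix.gguf

# The release also mentions two-way conversion between the legacy
# format and GGUF, e.g. combining/re-saving an existing file:
./llama-imatrix --in-file imatrix.gguf -o imatrix.dat
```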

b5939

19 Jul 12:23
f0d4d17
Documentation: Update build.md's Vulkan section (#14736)

* Documentation: Rewrote and updated the "Without docker" portion of the Vulkan backend build documentation.

* Documentation: Reorganize build.md's Vulkan section.

b5921

17 Jul 11:15
086cf81
llama : fix parallel processing for lfm2 (#14705)

b5906

16 Jul 08:07
4b91d6f
convert : only check for tokenizer folder if we need it (#14704)

b5890

13 Jul 17:20
982e347
quantize : fix minor logic flaw in --tensor-type (#14572)
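For context, `--tensor-type` lets `llama-quantize` override the quantization type for tensors matching a pattern. A hedged sketch of a typical invocation (pattern, file names, and types are illustrative placeholders, not from this release note):

```shell
# Quantize to Q4_K_M overall, but keep ffn_down tensors at Q6_K.
./llama-quantize --tensor-type "ffn_down=q6_k" \
    model-F16.gguf model-Q4_K_M.gguf Q4_K_M
```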

b5873

11 Jul 19:54
f5e96b3
model : support LiquidAI LFM2 hybrid family (#14620)

**Important**
LFM2 was [merged](https://github.com/huggingface/transformers/pull/39340) into transformers, but has not yet been released.
To convert to GGUF, install transformers from source:
```shell
pip install "transformers @ git+https://github.com/huggingface/transformers.git@main"
```
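Once transformers is installed from source, conversion follows the usual llama.cpp flow. A sketch, assuming a local Hugging Face checkout of the model (the path and output names here are placeholders):

```shell
# Convert the Hugging Face checkpoint to GGUF with llama.cpp's
# converter script (run from the llama.cpp repository root).
python convert_hf_to_gguf.py /path/to/LFM2-checkpoint \
    --outfile lfm2-f16.gguf --outtype f16
```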