Releases: EAddario/llama.cpp
b5476
releases : enable openmp in windows cpu backend build (#13756)
b5373
scripts : fix compare-llama-bench.py show parameter (#13514)
b5343
docs : Fix typo in InternVL3 model name (#13440)
b5269
llama : move end-user examples to tools directory (#13249)

* llama : move end-user examples to tools directory

---------

Co-authored-by: Xuan Son Nguyen
b5215
model : Nomic Embed Text V2 with Mixture-of-Experts (MoE) architectur…
b5200
llama-bench : Add `--override-tensors` arg (#12922)

* Add --override-tensors option to llama-bench
* Correct llama-bench --override-tensors to --override-tensor
* llama-bench: Update --override-tensors parsing to match --tensor-split, appear in test matrix.
* Make new llama-bench util functions static to fix Ubuntu CI
* llama-bench: Correct -ot corner cases (no -ot calls, leading and trailing empty -ot spans, etc.)
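
The corner cases named in the last bullet (no `-ot` value at all, leading and trailing empty spans) can be illustrated with a small parser. This is a minimal sketch and not the llama-bench implementation: the function name `parse_override_spans` is hypothetical, and the argument syntax is an assumption (commas separate test-matrix entries, semicolons separate `pattern=buffer` pairs within one entry).

```python
from typing import List, Tuple

Override = Tuple[str, str]  # (tensor name pattern, buffer type)


def parse_override_spans(arg: str) -> List[List[Override]]:
    """Split an override argument into one list of overrides per span.

    ASSUMED syntax (not confirmed by the release notes):
      - ',' separates spans, one per test-matrix entry
      - ';' separates 'pattern=buffer' pairs inside a span
    An empty span yields an empty list, meaning "no overrides" for that entry.
    """
    spans: List[List[Override]] = []
    for span in arg.split(","):
        overrides: List[Override] = []
        for pair in span.split(";"):
            if not pair:
                # Empty piece: happens for empty spans and stray ';'
                continue
            pattern, sep, buffer_type = pair.partition("=")
            if not sep or not pattern or not buffer_type:
                raise ValueError(f"expected 'pattern=buffer', got {pair!r}")
            overrides.append((pattern, buffer_type))
        spans.append(overrides)
    return spans
```

Under these assumptions, an argument such as `",blk=CPU"` produces an empty first span followed by one override, so a run without overrides and a run with them can share one test matrix. The pattern is split at the first `=`, which is a simplification for illustration.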
b5191
llama : fix K-shift with quantized K and BLAS backend (#13113)
b5156
clip : refactor, add `image_manipulation` and `llava_uhd` classes (#1…
b5146
llama : recognize IBM Granite 3.3 FIM tokens (#12988)

Granite's FIM tokens are very similar to Qwen's; the only difference is that they use an underscore instead of a dash, for example <fim_middle> instead of <fim-middle>.

Opening tokenizer_config.json in ibm-granite/granite-3.3-8b-base shows:

```
"<fim_prefix>",
"<fim_middle>",
"<fim_suffix>",
"<fim_pad>",
...
"<reponame>",
```
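
The underscore-versus-dash difference can be handled by a single matcher that accepts either separator. The following is a minimal sketch, not the code from the commit: `fim_role` is a hypothetical helper, and the set of roles it recognizes is taken only from the token names listed above.

```python
import re
from typing import Optional

# Accept both "<fim_middle>" (underscore) and "<fim-middle>" (dash).
_FIM_TOKEN = re.compile(r"^<fim[_-](prefix|middle|suffix|pad)>$")


def fim_role(token: str) -> Optional[str]:
    """Return 'prefix', 'middle', 'suffix' or 'pad' for a FIM token, else None."""
    match = _FIM_TOKEN.match(token)
    return match.group(1) if match else None


# Token strings as listed in the tokenizer_config.json excerpt above.
granite_tokens = ["<fim_prefix>", "<fim_middle>", "<fim_suffix>", "<fim_pad>", "<reponame>"]
roles = {token: fim_role(token) for token in granite_tokens}
```

Here `roles` maps the four FIM tokens to their roles and `<reponame>` to `None`, since it is a special token but not a FIM one.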
b5142
opencl: split `ggml-opencl.cl` into multiple files and cleanup (#12886)

* opencl: refactor - split the kernel files
* opencl: split more kernels into separate files
* opencl: specify subgroup size instead of querying it
* opencl: refine Adreno cl compiler version parsing
* opencl: skip some kernels not used by Adreno on old compilers
* opencl: refine logic for selecting Adreno kernels
* opencl: refine Adreno cl compiler version
* opencl: cleanup preprocessor for kernels
* opencl: consider Adreno CL compiler on Windows
* opencl: add final newline for `mul_mv_f16_f16.cl`

---------

Co-authored-by: Shangqing Gu