Releases: ngxson/llama.cpp

b6769

15 Oct 14:33
17304cb

server : fix img token logs (#16595)

b6768

15 Oct 13:12
3e3cb19

llama-quant: add support for mmproj (#16592)

* llama-quant: add support for mmproj

* Update src/llama.cpp

Co-authored-by: Georgi Gerganov <[email protected]>

* check prefix instead

* small fix

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b6767

15 Oct 12:20
5acd455

CUDA: Changing the CUDA scheduling strategy to spin (#16585)

* CUDA set scheduling strategy to spinning for cc121

* Using prop.major and prop.minor, include HIP and MUSA

* Exclude HIP and MUSA

* Remove trailing whitespace

Co-authored-by: Johannes Gäßler <[email protected]>

* Remove empty line

Co-authored-by: Johannes Gäßler <[email protected]>

---------

Co-authored-by: Johannes Gäßler <[email protected]>

b6766

15 Oct 10:15
554fd57

server : fix mtmd checkpoints (#16591)

b6765

14 Oct 18:04
fa882fd

metal : avoid using Metal's gpuAddress property (#16576)

* metal : avoid using Metal's gpuAddress property

* metal : fix rope kernels buffer check

b6764

14 Oct 17:56
ffa0590

vulkan: Add ACC_TYPE_VEC2 implementation (#16203)

Signed-off-by: Stefan Savic <[email protected]>
Co-authored-by: Stefan Savic <[email protected]>

b6763

14 Oct 15:18
120bf70

CUDA + openCL: fix bug in accessing rms_norm->src while doing fusion …

b6762

14 Oct 14:17
4258e0c

vulkan: Support FA with K/V in F32 (#16543)

b6761

14 Oct 13:22
7ea15bb

vulkan: Improve build time for MSVC (#16545)

Enable CMP0147 so custom build steps (invoking vulkan-shader-gen) are run in parallel.

Enable /MP so source files are compiled in parallel.

b6760

14 Oct 12:44
9c7185d

CUDA: enable FA for FP32 KV cache (#16546)