Commit: showing 1 changed file with 1 addition and 1 deletion.
Submodule llama.cpp updated: 25 files
+5 −0      .github/PULL_REQUEST_TEMPLATE/pull_request_template.md
+2 −4      .github/workflows/server.yml
+14 −0     CONTRIBUTING.md
+0 −29     README.md
+14 −12    convert-hf-to-gguf.py
+0 −19     examples/alpaca.sh
+0 −15     examples/gpt4all.sh
+51 −7     examples/imatrix/imatrix.cpp
+0 −18     examples/llama2-13b.sh
+0 −18     examples/llama2.sh
+1 −1      examples/server/public/index-new.html
+16 −14    examples/server/server.cpp
+3 −3      flake.lock
+57 −31    ggml-cuda.cu
+17 −4     ggml-cuda/common.cuh
+10 −10    ggml-cuda/fattn-common.cuh
+1 −1      ggml-cuda/fattn-tile-f16.cu
+1 −1      ggml-cuda/fattn-vec-f16.cuh
+3 −3      ggml-cuda/fattn-wmma-f16.cuh
+95 −0     ggml-cuda/mma.cuh
+2 −1      ggml-cuda/mmq.cu
+548 −139  ggml-cuda/mmq.cuh
+77 −10    ggml-cuda/quantize.cu
+16 −1     ggml-cuda/quantize.cuh
+7 −4      ggml-sycl.cpp