Compile bug: macOS Vulkan build fails #10923
Comments
I'm able to compile tags up to, but not including, b4273; the errors above start there and persist through the latest tag. With a build of tag b4272, the model just generates "@@@" symbols when I run `./llama-cli -m ~/Downloads/Llama-3.2-3B-Instruct-f16.gguf -p "Hey how are you?" -n 128 --n-gpu-layers 40`.
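For what it's worth, a `git bisect` between those two tags could pin down the first bad commit. This is only a sketch, not steps anyone in the thread ran; the `-DGGML_VULKAN=ON` flag is assumed from the backend in use:

```sh
# Hypothetical bisect sketch between the last good and first bad tags.
git bisect start b4273 b4272                 # bad tag first, then good tag
cmake -B build -DGGML_VULKAN=ON              # assumed Vulkan build configuration
git bisect run cmake --build build --config Release
# git bisect run marks a commit bad when the command exits non-zero, so the
# first commit that fails to compile is reported automatically. You may need
# to re-run the configure step if CMake options change between commits.
```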
@soerenkampschroer please try #10927.
@jeffbolznv Thank you, that works! (Compilation log attached.)
I do still have the issue of models only generating "@@@@@" as mentioned above when running on the GPU, but that may just be my environment. BTW: I was able to get my AMD GPU working with Metal when building the repo yesterday. The version from brew didn't work on my GPU, so I thought I'd share just in case it's a side effect you're not aware of. Performance isn't great at about 5 tokens/s compared to 10 tokens/s when running on the CPU, but it's something 👍
Thanks for checking. I don't know what would be causing the corruption. Is this using MoltenVK? Can you run `test-backend-ops` and see if any are failing?
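For anyone following along, a minimal way to run the backend op tests and capture their verbose output might look like this; the paths assume a standard CMake build tree configured for Vulkan, so adjust to your setup:

```sh
# Assumes the project was configured and built with the Vulkan backend enabled,
# e.g. cmake -B build -DGGML_VULKAN=ON && cmake --build build --config Release
./build/bin/test-backend-ops 2>&1 | tee test-backend-ops.log
```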
No problem, happy to help where I can. I'm using MoltenVK; I installed it through brew. I ran `test-backend-ops` and Vulkan is indeed failing. It's very verbose and I'm not sure where to begin. (Output of `test-backend-ops` attached.)
I reinstalled MoltenVK, and while `test-backend-ops` still passes and fails the same tests (and overall still fails), inference does now work. There's still one issue, though: every other query, it descends into madness after answering the first query. It will either hallucinate follow-up questions or repeat phrases, words, or letters. Either way, when it works, it's insanely fast: I'm up from 7 tokens/s on the CPU to 45 tokens/s using Vulkan. This is all using gemma-2-9b-it.Q5_K_M.gguf.
* vulkan: build fixes for 32b. Should fix #10923
* vulkan: initialize some buffer/offset variables
Git commit
eb5c3dc
Operating systems
Mac
GGML backends
Vulkan
Problem description & steps to reproduce
The build process fails with errors in `ggml_vk_host_get`.
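For context, a minimal sketch of the kind of build that hits this failure, assuming MoltenVK installed via Homebrew; package name and flags may need adjusting for your environment:

```sh
# Hypothetical reproduction sketch; adjust to your environment.
brew install molten-vk                  # Vulkan implementation on macOS
cmake -B build -DGGML_VULKAN=ON         # enable the ggml Vulkan backend
cmake --build build --config Release    # fails in ggml_vk_host_get on the affected tags
```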
First Bad Commit
No response
Relevant log output