See: geerlingguy/raspberry-pi-pcie-devices#779
Idle power consumption with the desktop active is about 14.8W.
### Qwen2.5 14b

```sh
$ ./build/bin/llama-bench -m models/Qwen2.5-14B-Instruct-Q4_K_M.gguf -n 128 -p 512,4096 -pg 4096,128 -ngl 99 -r 2
```
TODO
TODO RESULT
The full system was using about TODOW during the run.
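All three runs use the same llama-bench invocation; only the model passed to `-m` changes. For reference, here is an annotated version of the command (flag meanings as I understand llama.cpp's llama-bench options):

```sh
# Same llama-bench invocation for every model in this issue; only -m changes.
#   -p 512,4096   prompt-processing tests at 512 and 4096 tokens (pp512, pp4096)
#   -n 128        text-generation test of 128 tokens (tg128)
#   -pg 4096,128  combined test: 4096-token prompt followed by 128 generated tokens
#   -ngl 99       offload (up to) 99 layers to the GPU
#   -r 2          run each test twice; results are reported as mean ± stddev
./build/bin/llama-bench -m models/Qwen2.5-14B-Instruct-Q4_K_M.gguf \
  -n 128 -p 512,4096 -pg 4096,128 -ngl 99 -r 2
```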
### gpt-oss 20b

```sh
$ ./build/bin/llama-bench -m models/gpt-oss-20b-F16.gguf -n 128 -p 512,4096 -pg 4096,128 -ngl 99 -r 2
```
TODO
TODO
The full system was using about TODOW during the run.
### Llama 3.2 3b

```sh
$ ./build/bin/llama-bench -m models/Llama-3.2-3B-Instruct-Q4_K_M.gguf -n 128 -p 512,4096 -pg 4096,128 -ngl 99 -r 2
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 131072 | int dot: 1 | matrix cores: KHR_coopmat
build: 12bbc3fa (6715)
```
| model | size | params | backend | ngl | test | t/s |
|---|---|---|---|---|---|---|
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | pp512 | 489.15 ± 0.33 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | pp4096 | 423.47 ± 0.96 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | tg128 | 29.80 ± 0.56 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | pp4096+tg128 | 283.15 ± 5.57 |
The full system was using about 78.5W during the run.
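As a rough back-of-the-envelope figure (my own arithmetic, assuming the 78.5W reading also applies to the tg128 portion of the run), whole-system energy per generated token works out to roughly:

$$
\frac{78.5\,\text{W}}{29.80\,\text{tok/s}} \approx 2.6\,\text{J/token}
\qquad
\frac{(78.5 - 14.8)\,\text{W}}{29.80\,\text{tok/s}} \approx 2.1\,\text{J/token above idle}
$$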