This isn't Raspberry Pi/ARM related, but seeing this reminded me of some RX 470s I have sitting in a bin. I bought them about 8 years ago for ETH mining but had since written them off as e-waste. Following the blog instructions, I managed to run the Llama-3.2-3B-Instruct-Q4_K_M model, and my RX 470 gets about 20 tokens/s, roughly half the speed of the RX 6500 XT from the Pi benchmarks.
I decided to pair the GPU with an even older Intel Ivy Bridge CPU. One problem I ran into: when llama.cpp is compiled on a modern system and then transferred over, it crashes with `Illegal instruction (core dumped)`, because the Ivy Bridge CPU is old enough to be missing many newer CPU extensions. It took some trial and error to find the right compile options, but I eventually got it working, so I published a Docker image that lets me deploy it with one command: https://github.com/kth8/llama-server-vulkan
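For anyone hitting the same crash: the usual cause is that llama.cpp's CMake build enables instruction sets the build machine supports (AVX2, FMA, AVX-512) that Ivy Bridge lacks, while Ivy Bridge does support plain AVX and F16C. A sketch of a build configured for that CPU generation follows; the exact set of `GGML_*` options varies between llama.cpp versions, so check `cmake -LH` for the flags your checkout actually exposes:

```shell
# Build llama.cpp with the Vulkan backend, targeting a pre-AVX2 CPU.
# GGML_NATIVE=OFF stops CMake from tuning for the *build* machine's CPU;
# the remaining flags disable extensions Ivy Bridge does not have.
cmake -B build \
    -DGGML_VULKAN=ON \
    -DGGML_NATIVE=OFF \
    -DGGML_AVX=ON \
    -DGGML_AVX2=OFF \
    -DGGML_FMA=OFF \
    -DGGML_AVX512=OFF
cmake --build build --config Release -j
```

The key point is `GGML_NATIVE=OFF`: with it left on (the default), the compiler emits whatever instructions the build host supports, which is exactly what produces `Illegal instruction` on an older deployment target.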
Here are the benchmarks using llama-bench:
| model | size | params | backend | ngl | test | t/s |
|---|---|---|---|---|---|---|
| llama 1B Q4_K - Medium | 762.81 MiB | 1.24 B | Vulkan | 99 | pp512 | 353.81 ± 0.19 |
| llama 1B Q4_K - Medium | 762.81 MiB | 1.24 B | Vulkan | 99 | pp4096 | 527.48 ± 0.22 |
| llama 1B Q4_K - Medium | 762.81 MiB | 1.24 B | Vulkan | 99 | tg128 | 60.83 ± 0.07 |
| llama 1B Q4_K - Medium | 762.81 MiB | 1.24 B | Vulkan | 99 | pp4096+tg128 | 375.29 ± 0.24 |

| model | size | params | backend | ngl | test | t/s |
|---|---|---|---|---|---|---|
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | pp512 | 203.06 ± 0.63 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | pp4096 | 179.68 ± 0.28 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | tg128 | 25.65 ± 0.15 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | Vulkan | 99 | pp4096+tg128 | 123.08 ± 0.11 |
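For reference, tables in this shape come straight out of `llama-bench`; an invocation along these lines should reproduce the test columns above (the model path is a placeholder):

```shell
# pp512/pp4096 = prompt processing, tg128 = token generation,
# pp4096+tg128 = combined test; -ngl 99 offloads all layers to the GPU.
llama-bench -m Llama-3.2-3B-Instruct-Q4_K_M.gguf \
    -ngl 99 -p 512,4096 -n 128 -pg 4096,128
```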