0.16.0

Latest

b4rtaz released this 05 Sep 17:18

v0.16.0

5f5adaf

This version adds support for Qwen3 MoE (Mixture of Experts) models on CPU. Vulkan support will be added in a future release.

The performance of MoE models is quite impressive: Qwen3-30B-A3B-Q40 achieves 13.04 tok/s during prediction on 4× Raspberry Pi 5 (8 GB). Check details here.
