0.16.0

@b4rtaz released this 05 Sep 17:18 · commit 5f5adaf

This version adds support for Qwen3 MoE models on CPU. Vulkan support will be added in a future release.

The performance of MoE models is quite impressive: Qwen3-30B-A3B-Q40 achieves 13.04 tok/s during prediction on 4× Raspberry Pi 5 (8 GB). Check details here.