This version adds support for Qwen3 MoE models on CPU. Vulkan support will be added in a future release.
MoE models perform well on small clusters: Qwen3-30B-A3B-Q40 achieves 13.04 tok/s during prediction on 4× Raspberry Pi 5 (8 GB). Check details here.