For reference:
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           pp512 |        288.90 ± 0.67 |
| qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           tg128 |         16.58 ± 0.05 |

| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           pp512 |        615.94 ± 2.23 |
| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           tg128 |         42.85 ± 0.61 |

Klein 4B completes a 1024px generation in 72 seconds.
For reference: