
There are plenty of smaller (quantized) models that fit comfortably on your machine! On an M4 with 24GB it's already possible to run 8B quantized models without trouble.

I'm benchmarking runtime and memory usage for a few of them: https://aukejw.github.io/mlx_transformers_benchmark/
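
For reference, a minimal sketch of running one of these 4-bit 8B models with the mlx-lm package (pip install mlx-lm); the checkpoint name, prompt, and max_tokens here are just illustrative, not taken from the benchmark repo:

    # Sketch: run a 4-bit quantized 8B model via mlx-lm on Apple silicon.
    # Checkpoint name and generation settings are illustrative.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
    text = generate(
        model,
        tokenizer,
        prompt="Explain quantization in one sentence.",
        max_tokens=128,
        verbose=True,  # recent mlx-lm versions also print tokens/sec and peak memory
    )
    print(text)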


