Hacker News
Improve Rust Compile Time by 108X (burn.dev)
8 points by nathanielsimard on Jan 16, 2025 | 1 comment


During the last iteration of CubeCL, we refactored the matrix multiplication GPU kernel to work with many different configurations and element types.

The goal was to improve performance and flexibility: use Tensor Cores when available, perform bounds checks when necessary, support any tensor layout without allocating new memory to transpose the matrices beforehand, and make many other improvements.
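To illustrate the layout point, here is a minimal CPU-side sketch of how strides let a matmul read a transposed matrix without copying it. It is not CubeCL's kernel or API: `MatrixView`, `row_major`, `transposed`, and `matmul` are hypothetical names made up for this example, and a real GPU kernel would be tiled and far more involved.

```rust
/// Read-only 2D view over a flat buffer, described by strides.
/// Hypothetical type for illustration only; not part of CubeCL.
struct MatrixView<'a> {
    data: &'a [f32],
    rows: usize,
    cols: usize,
    row_stride: usize,
    col_stride: usize,
}

impl<'a> MatrixView<'a> {
    /// Wrap a contiguous row-major buffer.
    fn row_major(data: &'a [f32], rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols);
        Self { data, rows, cols, row_stride: cols, col_stride: 1 }
    }

    /// Transpose by swapping shape and strides; the buffer is shared, not copied.
    fn transposed(&self) -> Self {
        Self {
            data: self.data,
            rows: self.cols,
            cols: self.rows,
            row_stride: self.col_stride,
            col_stride: self.row_stride,
        }
    }

    /// Element at (r, c), located through the strides.
    fn get(&self, r: usize, c: usize) -> f32 {
        self.data[r * self.row_stride + c * self.col_stride]
    }
}

/// Naive matmul that only touches its inputs through `get`,
/// so it works for any layout the strides can express.
/// The output is a freshly allocated row-major buffer.
fn matmul(lhs: &MatrixView, rhs: &MatrixView) -> Vec<f32> {
    assert_eq!(lhs.cols, rhs.rows);
    let mut out = vec![0.0f32; lhs.rows * rhs.cols];
    for i in 0..lhs.rows {
        for j in 0..rhs.cols {
            let mut acc = 0.0f32;
            for k in 0..lhs.cols {
                acc += lhs.get(i, k) * rhs.get(k, j);
            }
            out[i * rhs.cols + j] = acc;
        }
    }
    out
}
```

With a 2x3 view `a` built by `MatrixView::row_major`, calling `matmul(&a, &a.transposed())` computes A times its transpose while both operands read from the same buffer; only the output is allocated.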

Performance is greatly improved, and the kernel now works better with many different matrix shapes. However, I think we created an atrocity in terms of compilation speed: simply compiling a few matmul kernels took close to 2 minutes, even with incremental compilation.

So we fixed it! I took the time to write a blog post about our solutions, since I believe it can be useful to Rust developers in general, even if not all of the techniques apply to your projects.

Feel free to ask any questions here, about the techniques, the process, the algorithms, CubeCL, whatever you want!



