For the newest models, unless you quantize the crap out of them, you’re going to be swapping blocks even with a 5090, which slows things down anyway. At least you’d be able to train on them at full precision with a decent batch size.
That said, I can’t imagine there’s enough of a market there to make it worth it.
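To put rough numbers on the quantize-or-swap point, here's a back-of-the-envelope sketch. The 20B parameter count is a made-up example, not any specific model, and this only counts the weights. Activations, optimizer state, and framework overhead all come on top, so training needs a lot more than what's shown here.

```python
def weight_gib(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


# The RTX 5090 ships with 32 GB of VRAM; treated as 32 GiB for a rough comparison.
VRAM_GIB = 32


def fits(params_billion: float, bits_per_param: int, vram_gib: float = VRAM_GIB) -> bool:
    """True if the weights alone fit in VRAM (ignores activations and overhead)."""
    return weight_gib(params_billion, bits_per_param) <= vram_gib


if __name__ == "__main__":
    # Hypothetical 20B-parameter model at a few precisions.
    for bits in (16, 8, 4):
        size = weight_gib(20, bits)
        print(f"{bits}-bit: {size:.1f} GiB, fits in {VRAM_GIB} GiB: {fits(20, bits)}")
```

Under those assumptions, the 16-bit weights alone already overflow the card, which is where the block swapping comes from, while the 8-bit and 4-bit versions fit with room to spare.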