Apple M3/M4 silicon is certainly good in some ways, but the bottleneck is often the lack of CUDA software support and the price (you could buy >4 times the raw GPU performance with a dual RTX 5090 desktop). =3
The key features of the M3 Ultra are 512GB of shared GPU/CPU RAM and ultra-fast LAN over peripheral cabling.
Once an NVIDIA card has cached a model in its VRAM, it no longer gets hit with the cost of copying data over the bus.
Yet as many people have noticed, who cares if the M3 Ultra takes four times as long if the faster alternative simply won't fit the larger models? YMMV =3
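To put rough numbers on the "won't fit" point, here is a back-of-the-envelope sketch. The function names, the 1.2x overhead factor for KV cache and runtime buffers, and the example model sizes are my own assumptions for illustration, not measurements. It also treats the two 32GB RTX 5090 cards as one 64GB pool, and pretends all 512GB of shared memory is usable by the GPU, which is optimistic on both sides.

```python
# Rough check of whether a model's weights fit in GPU-accessible memory.
# All figures are approximate; the overhead factor is a guess.

DUAL_5090_GB = 2 * 32   # two RTX 5090 cards, 32 GB VRAM each
M3_ULTRA_GB = 512       # top M3 Ultra shared memory configuration


def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """True if weights plus an assumed overhead margin fit in memory_gb."""
    return weights_gb(params_billion, bits_per_weight) * overhead <= memory_gb


if __name__ == "__main__":
    # Hypothetical model sizes (billions of parameters) and quantization levels.
    for params, bits in [(70, 4), (70, 16), (405, 4)]:
        size = weights_gb(params, bits)
        print(f"{params}B @ {bits}-bit: ~{size:.1f} GB weights, "
              f"dual 5090: {fits(params, bits, DUAL_5090_GB)}, "
              f"M3 Ultra: {fits(params, bits, M3_ULTRA_GB)}")
```

Under these assumptions a 70B model at 4-bit fits on either setup, while 70B at 16-bit or 405B at 4-bit only fits in the 512GB pool. That is the case where the slower machine wins by default.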
Apple M3 Ultra (GPU - 80 cores) scores 7235.31
NVIDIA GeForce RTX 5090 Laptop GPU scores 7931.31
Note that NVIDIA cards are limited by their fixed VRAM, unlike Apple silicon with its shared memory pool, which also tends to be less I/O constrained. YMMV
https://www.youtube.com/watch?v=d8yS-2OyJhw
https://www.youtube.com/watch?v=Ju0ndy2kwlw