From this writeup it does sound like the architecture of the AMD GPU makes it a bit harder to optimize. It also seems like the AMD approach may scale better long term. It has 8 chiplets rather than the 2 in the Nvidia offering, along with all the associated cache and memory locality woes.
The future will probably see more chiplets rather than fewer, so I wonder if dealing with the complexity here will pay more dividends in the long run.