nvtop or nvidia-smi gives you a good macro overview, but I've personally found that utilization (EDIT: as reported by nvidia-smi) is actually a poor proxy for how fast your workload can run, beyond confirming that a GPU is indeed being used.
I agree that utilization as reported by nvidia-smi is a poor proxy for performance. FWIW, I've found that for the same architecture the power consumption reported in nvtop very often correlates super nicely with training performance, and peak performance is always at peak power consumption. Agreed on your advice about tuning your architecture details, but once that's fixed and you're debugging simpler things like memory usage, batch size, or dataloading bottlenecks, the raw power metric is typically a quick proxy. I find temperature is a second useful macro metric: you want to be at max power draw and max allowed temperature at all times, but not exceed the temperature at which you throttle.
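In case it's useful, here's a minimal sketch of the kind of power/temperature polling I mean, using the nvidia-ml-py (pynvml) bindings. The "90% of the power limit" threshold is just an arbitrary number for illustration, not a standard:

```python
# Minimal sketch: poll GPU power draw and temperature via NVML (pip install nvidia-ml-py).
# The 90%-of-power-limit threshold below is an arbitrary illustrative choice.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
power_limit_w = pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000.0  # NVML reports milliwatts

try:
    while True:
        power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0
        temp_c = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        near_cap = power_w >= 0.9 * power_limit_w
        print(f"power {power_w:6.1f} W / {power_limit_w:.0f} W  temp {temp_c} C  "
              f"{'near power cap' if near_cap else 'headroom left'}")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```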
That's hard to argue with. Power draw is, of course, a direct measure of hardware utilization, but it doesn't translate very well into a measure of GPU code efficiency.
Often you can squeeze out another order of magnitude of performance by rewriting the kernel, and the power draw will stay capped at whatever the maximum is the whole time. I'd say GPU power consumption is interesting if you're CPU bound and struggling to feed the GPU enough data and/or tasks.
FLOPs utilization is arguably the industry-standard efficiency metric right now, and it should be a good first approximation of how much performance is left on the table.
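For what it's worth, here's a back-of-the-envelope sketch of what I mean by FLOPs utilization (often called MFU). The 6 * params * tokens figure is the usual rough transformer approximation, and the model size, throughput, and peak-FLOPs numbers are placeholders you'd swap for your own run and hardware:

```python
# Back-of-the-envelope model FLOPs utilization (MFU) sketch.
# All numbers below are placeholders; plug in your own model size, throughput, and GPU peak.

def model_flops_utilization(n_params: float,
                            tokens_per_second: float,
                            peak_flops_per_second: float) -> float:
    """Approximate MFU for a dense transformer.

    Uses the common ~6 * parameters FLOPs-per-token estimate for a
    forward+backward pass (ignoring attention FLOPs for simplicity).
    """
    achieved_flops = 6.0 * n_params * tokens_per_second
    return achieved_flops / peak_flops_per_second

# Example with made-up numbers: a 7B-parameter model pushing 3,000 tokens/s
# on a GPU with a nominal 312 TFLOP/s dense BF16 peak.
mfu = model_flops_utilization(n_params=7e9,
                              tokens_per_second=3_000,
                              peak_flops_per_second=312e12)
print(f"MFU ~ {mfu:.1%}")  # a low percentage here suggests performance left on the table
```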
But if you mean that the reported utilization in nvtop is misleading, I completely agree (as someone who uses it daily).
I've been meaning to dig into the source/docs to see what's going on. Power usage seems to be a more reliable indicator of actual hardware utilization, at least on NVIDIA gear.
Thanks. Some people were having random problems installing WSL on their systems, and I found this was the easiest solution (though based on their card models, they appeared to have much older machines).
There is no need to install Docker Desktop just to run nvidia-smi in WSL; the Windows directory containing the nvidia-smi binary is mounted inside a WSL instance and added to PATH automatically on instance startup.
As an aside: there is no need to install Docker Desktop just to use Docker containers in WSL either, unless you want a Windows GUI to manage your containers. Just follow the official documentation for installing Docker in your Linux distro of choice, or simply run `sudo apt install docker.io` in the default WSL Ubuntu distro. Docker will work just fine with an up-to-date WSL.
Further aside: it's possible to have both Docker Desktop and the normal Linux docker.io package installed in WSL. They work in isolation; the easy way to know which is active is to check whether Docker Desktop is running. I wouldn't recommend this setup, though...
If you're here because you're interested in AI performance, I'd recommend instead https://docs.nvidia.com/nsight-compute/NsightComputeCli/inde... to profile individual kernels, Nsight Systems for a macro view (https://developer.nvidia.com/nsight-systems), and the PyTorch profiler if you're not authoring kernels directly but using something like PyTorch: https://pytorch.org/tutorials/recipes/recipes/profiler_recip...
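A minimal sketch of the PyTorch profiler workflow from that last recipe; the tiny model and random input here are toy placeholders for your actual workload:

```python
# Toy torch.profiler example; the small model and random input are placeholders.
import torch
from torch.profiler import profile, record_function, ProfilerActivity

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU()).cuda()
x = torch.randn(64, 1024, device="cuda")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
             record_shapes=True) as prof:
    with record_function("forward_pass"):
        model(x)

# Table of the most expensive ops by GPU time; export_chrome_trace gives a
# timeline you can open in chrome://tracing or Perfetto for the macro view.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
prof.export_chrome_trace("trace.json")
```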