I had the GPU working with 3.0.1 and then upgraded. Now the startup output says it's using the GPU, but generation is very slow and the Windows Task Manager doesn't show any spike in GPU usage. Auto1111 is also installed in WSL and works fine there (GPU usage spikes when generating, and generation is fast). I've recreated the venv several times with no success. Running PyTorch test code inside the venv works fine: the CUDA device is recognized, and moving a basic tensor to the GPU succeeds.
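A check along these lines is the kind of PyTorch test I mean. This is a minimal sketch rather than the exact code I ran; the function name `cuda_report`, the matrix size, and the repeat count are illustrative assumptions. It goes slightly further than moving a tensor by timing a matrix multiply on the GPU, which should also produce a visible spike in the GPU graph if the device is really being used.

```python
import time

try:
    import torch
except ImportError:  # report a missing install instead of crashing
    torch = None


def cuda_report(size=2048, repeats=10):
    """Return a dict describing whether PyTorch can see and use the GPU.

    `size` and `repeats` are arbitrary illustrative values.
    """
    report = {"torch_installed": torch is not None}
    if torch is None:
        return report

    report["torch_version"] = torch.__version__
    # None here means a CPU-only PyTorch build was installed in this venv.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report

    report["device_name"] = torch.cuda.get_device_name(0)

    # Time a few matrix multiplies on the GPU; synchronize so the timing
    # covers the actual GPU work and not just the kernel launches.
    x = torch.randn(size, size, device="cuda")
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        y = x @ x
    torch.cuda.synchronize()
    report["matmul_seconds"] = time.perf_counter() - start
    report["result_device"] = str(y.device)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Run it with the venv activated so it uses the same PyTorch build as the application. A `built_with_cuda` value of `None`, or `cuda_available: False`, would point to a CPU-only install in that venv.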
How can I debug this further?
Thanks!