#Error in loading state_dict while using CUDA/ROCm


prisma widget

I am currently trying to get Whisper up and running on my machine.
CUDA is available and is the default device the model chooses to run on.
I checked with whisper --help.
While using the CPU (AMD Ryzen 9 7900X3D) works like a charm, using the GPU (AMD Radeon RX 7900 XTX) doesn't work at all.
The following code:

import whisper
model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(f' The text in video: \n {result["text"]}')

raises this error:

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Whisper:
    While copying the parameter named "encoder.blocks.0.attn.query.weight", whose dimensions in the model are torch.Size([384, 384]) and whose dimensions in the checkpoint are torch.Size([384, 384]), an exception occurred : ('HIP error: invalid device function\nHIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing HIP_LAUNCH_BLOCKING=1.\nCompile with `TORCH_USE_HIP_DSA` to enable device-side assertions.\n',).

I don't know how to fix this.

hallow girder BOT

Hi @prisma widget. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

Based on the information provided in the extracts, it seems that you're encountering an issue with loading the state_dict for the Whisper model while using a GPU. This could be due to a variety of reasons, including compatibility issues with your specific GPU, issues with the CUDA or ROCm versions, or problems with the PyTorch installation.

Here are a few potential solutions based on the information from the extracts:

  1. Check your CUDA or ROCm version: Make sure that the CUDA or ROCm version you're using is compatible with the version of PyTorch you have installed. For example, in one of the extracts, a user was able to resolve a similar issue by uninstalling and reinstalling PyTorch with a specific CUDA version. Note that the cu116 index below serves builds for NVIDIA CUDA; on an AMD GPU you would need a ROCm build of PyTorch whose ROCm version supports your card:

    pip uninstall torch
    pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
    
  2. Check your PyTorch installation: Make sure that PyTorch is correctly installed and can see your GPU. You can do this by running the following commands in Python:

    import torch
    print(torch.cuda.is_available())
    print(torch.cuda.device_count())
    print(torch.cuda.get_device_name(0))
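
    These commands should print True, the number of GPUs available, and the name of your GPU, respectively. On a ROCm build torch.cuda.is_available() also returns True, so passing this check does not by itself prove that kernels can actually run on the GPU.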

  3. Use a ROCm build with your AMD GPU: If you're using an AMD GPU, PyTorch needs to be a ROCm build rather than a CUDA build. According to one of the extracts, PyTorch for ROCm intentionally reuses the existing torch.cuda interfaces, so you can keep using the cuda device name if ROCm is properly installed.
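
To tell whether the installed PyTorch is a ROCm build or a CUDA build, a small check along these lines may help. This is only a sketch: `describe_backend` and `report_installed_backend` are hypothetical helper names introduced here, while `torch.version.hip` and `torch.version.cuda` are attributes that PyTorch sets to None when the build lacks that backend.

```python
def describe_backend(hip_version, cuda_version, gpu_available):
    """Summarise which GPU backend a PyTorch build targets.

    hip_version and cuda_version are the values of torch.version.hip and
    torch.version.cuda; each is None when the build lacks that backend.
    """
    if hip_version is not None:
        backend = f"ROCm/HIP build ({hip_version})"
    elif cuda_version is not None:
        backend = f"CUDA build ({cuda_version})"
    else:
        backend = "CPU-only build"
    status = "GPU visible" if gpu_available else "no GPU visible"
    return f"{backend}, {status}"


def report_installed_backend():
    """Describe the PyTorch build installed in the current environment."""
    import torch  # imported lazily so describe_backend works without torch

    return describe_backend(
        torch.version.hip, torch.version.cuda, torch.cuda.is_available()
    )
```

Calling `print(report_installed_backend())` in the same environment that runs Whisper shows which kind of build is in use. For an AMD card the result should start with "ROCm/HIP build".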

If none of these solutions work, you might want to consider reaching out to the OpenAI community or the PyTorch community for more specific assistance.
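
Since the original post reports that the CPU works, a possible stopgap is to probe the GPU with a tiny operation and fall back to the CPU if it fails. The sketch below assumes the failure surfaces as a RuntimeError (as in the traceback above) or as an AssertionError (which some CPU-only builds raise). `choose_device`, `tiny_gpu_op`, and `load_whisper_with_fallback` are hypothetical helper names; `whisper.load_model` does accept a `device` argument.

```python
def choose_device(gpu_probe):
    """Return "cuda" if gpu_probe() runs without error, otherwise "cpu"."""
    try:
        gpu_probe()
    except (RuntimeError, AssertionError):
        # HIP/CUDA kernel failures surface as RuntimeError; some builds
        # without GPU support raise AssertionError instead.
        return "cpu"
    return "cuda"


def tiny_gpu_op():
    """Run one small kernel on the GPU to check that kernels can launch."""
    import torch

    x = torch.ones(2, device="cuda")
    # Copying the result back to the host forces any asynchronously
    # reported kernel error to be raised here.
    (x + x).cpu()


def load_whisper_with_fallback(name="base"):
    """Load a Whisper model on the GPU if the probe passes, else on the CPU."""
    import whisper

    device = choose_device(tiny_gpu_op)
    return whisper.load_model(name, device=device), device
```

With this, `model, device = load_whisper_with_fallback("base")` would load the model on the CPU whenever the GPU probe fails. It does not fix the underlying ROCm problem, but it keeps transcription working in the meantime.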

Note: you can ask me a follow-up question by @-mentioning me again :speech_balloon: