#H100 multi-GPU settings

shell geyser

When I try to load weights from a checkpoint into my custom model on multiple GPUs, the weights are not loaded and the progress bar stalls.
I am using H100 x 7 on RunPod. When I ran the same thing on my local server (A6000 x 6), it worked fine.
Do you have any idea what could be causing this?
