# Why are secure cloud pods so slow?


supple igloo
#

I'm pretty sure I just wasted a few hours trying to find a decent pod that isn't being bottlenecked by its other hardware. I only managed to find one pod a few days ago that was giving me 3 it/s while training a model, and it was a community pod.

trim forum
#

What kind of model are you training? Are you using Kohya_ss or something else? What kind of GPU are you using and which region of secure cloud are you using?

supple igloo
#

been trying to train a LoRA for SDXL, been trying 3090s, 4090s, tried an A100

#

haven't been picking a region, any suggestions?

trim forum
#

Do you know which region you're getting when you're auto assigned one?

supple igloo
#

says CZ

#

most of the time

trim forum
#

Okay, I'll run some tests. I'm not sure whether it's the slow disk that's causing the slowdown.
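
For anyone who wants to check the disk themselves, here is a rough sketch of a sequential write/read test using only the Python standard library. The function name `measure_disk_throughput` and the default sizes are made up for illustration, and the read figure may be inflated by the OS page cache, so treat the numbers as a coarse comparison between pods rather than a proper benchmark.

```python
import os
import tempfile
import time


def measure_disk_throughput(directory, size_mb=256, chunk_mb=4):
    """Write, then read back, a temporary file in `directory`.

    Returns (write_mb_per_s, read_mb_per_s). Hypothetical helper for
    illustration; the read may be served from the page cache.
    """
    chunk_bytes = chunk_mb * 1024 * 1024
    chunk = os.urandom(chunk_bytes)  # random data, so compression can't help
    n_chunks = size_mb // chunk_mb

    # Timed sequential write, forced to disk with fsync.
    with tempfile.NamedTemporaryFile(dir=directory, delete=False) as f:
        path = f.name
        start = time.perf_counter()
        for _ in range(n_chunks):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
        write_seconds = time.perf_counter() - start

    # Timed sequential read of the same file, then clean up.
    try:
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(chunk_bytes):
                pass
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)

    total_mb = n_chunks * chunk_mb
    return total_mb / write_seconds, total_mb / read_seconds
```

Point it at the directory where the dataset and checkpoints live (for example `measure_disk_throughput("/workspace")`, assuming that is where the pod's volume is mounted) and compare the result across pods.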

supple igloo
#

someone suggested in another thread it could be old CPUs

#

but they get away with it because they only show vCPU count and nothing else

trim forum
#

Yeah, the CPU can have some impact, but once everything is loaded and training starts, it should be mostly using the GPU, not the CPU. If I check the CPU usage while training is in progress, it's very low, while GPU utilization is basically maxed out.
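
A minimal sketch of that check, in case it helps: it samples GPU utilization through `nvidia-smi` and compares it with the 1-minute CPU load average. It assumes a Linux pod with `nvidia-smi` on the PATH; the function names are made up for illustration. Keep in mind that the utilization figure only reports how much of the sample period a kernel was running, so a high value does not rule out every bottleneck.

```python
import os
import subprocess


def parse_gpu_utilization(csv_text):
    """Parse one integer percentage per GPU from nvidia-smi CSV output."""
    return [int(line) for line in (l.strip() for l in csv_text.splitlines())
            if line.isdigit()]


def sample_utilization():
    """Return (per-GPU utilization %, CPU load per core). Unix only."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    # 1-minute load average divided by core count; near or above 1.0
    # suggests the CPU is saturated.
    load_per_core = os.getloadavg()[0] / (os.cpu_count() or 1)
    return parse_gpu_utilization(result.stdout), load_per_core
```

Running `sample_utilization()` a few times during training gives a quick picture: GPU values near 100 with a low load per core point away from the CPU, while low GPU values with a high load per core point toward it.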