#Help Reducing Cold Start


tacit token
#

Hi, I've been working with RunPod for a couple of months and it has been great.
I know the image is only downloaded once, and I know there are two options for optimization: embedding the model in the Docker image, or using a network volume, which is less flexible since it is located in only one region. I'm embedding my model in the Docker image and also running scripts to cache the loading, config, and downloads.

I'm using the whisper-large-v3 model with my own code, since it has a lot of optimizations. The cold start without FlashBoot is between 15 and 45 seconds. My goal is to reduce this time as much as possible without depending on a high request volume.

In this case, would a network volume with a specific available GPU reduce the cold start? I'm having trouble understanding whether a network volume would do the trick. Does the cold start cover only loading my container, or also loading the model? Would network storage fix this?
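
One way to answer the "container only, or also the model?" question is to time each startup phase separately inside the worker. The sketch below is only an illustration: `load_model` and `first_inference` are hypothetical placeholders (they just sleep), and would need to be replaced with the real whisper-large-v3 loading and transcription code. Whatever part of the 15 to 45 seconds is not accounted for by these phases is spent before the Python process starts (scheduling, container start).

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations, keyed by phase name.
timings = {}


@contextmanager
def timed(phase):
    """Record how long the enclosed block takes, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = time.perf_counter() - start


def load_model():
    # Hypothetical placeholder: swap in the real model loading code.
    time.sleep(0.05)
    return object()


def first_inference(model):
    # Hypothetical placeholder: swap in a real warm-up transcription.
    time.sleep(0.01)
    return "ok"


with timed("model_load"):
    model = load_model()

with timed("first_inference"):
    result = first_inference(model)

for phase, seconds in timings.items():
    print(f"{phase}: {seconds:.2f}s")
```

If `model_load` dominates, storage and CPU speed are the things to look at; if it is small, the delay is mostly outside the code and a network volume is unlikely to help.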

still sequoia
#

I believe you've already implemented the optimal solution by embedding the model within the Docker image. This ensures fast access since the image is stored locally on the host machine. Using a network volume would likely slow things down due to the additional network data transfer. One option to consider is setting up an active worker. This setup could reduce cold start times, and since it's 40% cheaper when idle, it is a cost-effective solution. Additionally, for applications that are CPU-intensive at startup, using a CPU with a higher clock speed might improve performance. It’s worth testing to see if that helps.

twin gyro
prisma mountain
#

network volumes perform poorly in both Serverless and Pods. 😦

twin gyro
#

That is correct. But I am trying to figure out the discrepancy in model loading speed between Pods and Serverless for the same machine type. I also believe that the CPU is being throttled in Serverless.
Another explanation would be that the network volume is being "mounted" to the Pod and is no longer accessed over the network. But this is just a guess.
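
A rough way to test the guess above is to read the same model file from each location and compare throughput on a Pod and on a Serverless worker. This is a minimal sketch using only the standard library; the function name `read_throughput_mb_s` is made up for this example, and the paths in the trailing comments are assumptions that depend on where the volume and model are actually mounted. Note that a second read of the same file is usually served from the OS page cache, so only the first read on a fresh worker reflects real storage speed.

```python
import os
import tempfile
import time


def read_throughput_mb_s(path, chunk_size=16 * 1024 * 1024):
    """Read a file sequentially and return the observed throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total / (1024 * 1024) / max(elapsed, 1e-9)


# Self-contained demo on a small temporary file (4 MiB of random bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    demo_path = tmp.name

speed = read_throughput_mb_s(demo_path)
print(f"{demo_path}: {speed:.0f} MiB/s")
os.remove(demo_path)

# In a real worker, point it at the model weights instead (assumed paths):
# read_throughput_mb_s("/runpod-volume/models/model.safetensors")  # network volume
# read_throughput_mb_s("/app/models/model.safetensors")            # baked into image
```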

prisma mountain
twin gyro
#

Thank you, will do!

tacit token
still sequoia
still sequoia
tacit token
sharp moss
#

Reset? Resetting / restarting workers isn't really necessary.