#Problems with Serverless

whole urchin
Hello everyone,

I had fun setting up pods, and everything worked as planned. Now I am trying out serverless for our company.

I want to deploy qwen3-30B. The problem is that the model needs to be downloaded every time a worker starts. How can I fix this? Is the answer to use a network volume with the model on it?

And if that is the answer, how do I deploy it correctly? Do I just add the volume to my endpoint, so that the first worker pulls the model for all other running and future workers to use?

cedar bisonBOT