Hello everyone,
I had fun setting up pods, and everything worked as planned. Now I am trying out serverless for our company.
I want to deploy Qwen3-30B. The problem is that the model needs to be downloaded every time a worker starts. How can I fix this? Is the answer to use a network volume with the model on it?
And if that is the answer, how do I set it up correctly? Do I just attach the volume to my endpoint, so that the first worker pulls the model and all other running and future workers can use it?