#how to host 20 GB models + FastAPI code on serverless
Create a serverless handler, build it into a Docker image, then make a template from that image so it contains both the handler and the model
Or put the model in RunPod's network storage, attach it to your endpoint, and have your template load the model from there
But FastAPI seems to cover only the API part. What about the model runner, i.e. the software that actually runs the model for inference and training?
Maybe HF Transformers, TensorFlow, or PyTorch
It's PyTorch + TensorFlow
Alright, then you can create a Docker image template with a serverless handler that runs that code on the GPU
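A minimal sketch of what such a handler could look like, assuming the `runpod` Python SDK (`pip install runpod`) is installed in the image. `run_inference` and the `prompt` input field are hypothetical placeholders for the real PyTorch/TensorFlow code, not something from this thread:

```python
def run_inference(prompt: str) -> str:
    """Hypothetical stand-in for the real PyTorch/TensorFlow inference call."""
    return f"echo: {prompt}"


def handler(job: dict) -> dict:
    """Handle one serverless job; the request payload arrives under job["input"]."""
    job_input = job.get("input", {})
    prompt = job_input.get("prompt")  # "prompt" is an assumed field name
    if prompt is None:
        return {"error": "missing 'prompt' in input"}
    return {"output": run_inference(prompt)}


def main() -> None:
    """Start the worker loop. Call this at the bottom of the file in the image."""
    import runpod  # imported here so the handler can be tested without the SDK

    runpod.serverless.start({"handler": handler})
```

`main()` is left uncalled here so the handler can be imported and tried locally, e.g. `handler({"input": {"prompt": "hi"}})`. In the Docker image the script would end with a call to `main()`.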
Yeah, that's why I thought I'd keep the model out of the Docker image
There's an alternative to that:
put the model in network storage and access it from the endpoint
It will be mounted at /runpod-volume, like in the docs
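A rough sketch of loading the model from that mount once per worker and reusing it across requests. Only the `/runpod-volume` mount point comes from this thread. The `MODEL_DIR` environment variable, the `models/my-model` subdirectory, and the `loader` callable are assumptions made for illustration:

```python
import os
from pathlib import Path

# Mount point mentioned above; everything below it is a hypothetical layout.
VOLUME_ROOT = Path("/runpod-volume")


def resolve_model_dir(env=os.environ, volume_root=VOLUME_ROOT) -> Path:
    """Pick the model directory, allowing an override via MODEL_DIR (assumed name)."""
    override = env.get("MODEL_DIR")
    if override:
        return Path(override)
    return volume_root / "models" / "my-model"


_model = None  # module-level cache, so a warm worker loads the model only once


def get_model(loader, model_dir=None):
    """Return the cached model, calling loader(path) on first use only."""
    global _model
    if _model is None:
        if model_dir is None:
            model_dir = resolve_model_dir()
        _model = loader(model_dir)
    return _model
```

With PyTorch, `loader` could be something like `lambda p: torch.load(p / "model.pt")`, where `model.pt` is again a made-up file name. Calling `get_model(...)` inside the handler keeps the 20 GB read off the per-request path after the first call.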
Perfect, that sounds like a plan
Thanks
Can you share links for the network storage access and deployment too, please?
Alright, but try using the search or the AI in the bottom right next time, it's cool
Good luck building, bro
Don't use FastAPI on serverless, it's already an API layer.
And most likely don't put your models in your image, definitely use network drives
Why not
Just several reasons to prefer smaller images. It will work, but with a lot of overhead
Network drives are about 1 million percent slower than baking things into the image, so I don't know why you are saying it's better, because you are wrong.
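The speed disagreement above can be checked rather than argued. A small sketch that times a sequential read of one file using only the standard library. The two paths in the usage note are hypothetical, and no result is implied either way:

```python
import time


def read_throughput(path, chunk_size=16 * 1024 * 1024):
    """Read a file sequentially; return (bytes_read, MiB per second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    return total, total / (1024 * 1024) / max(elapsed, 1e-9)
```

Run it inside a worker against the same weights file in both places, e.g. `read_throughput("/runpod-volume/models/my-model/model.pt")` versus a copy baked into the image. Only the first read of each file is meaningful, since later reads may be served from the OS page cache.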