#what am i doing wrong, serverless workers optimization
I don't know, this Discord is dead
Sometimes just downloading from GitHub can be rough. If you can help it, downloading in parallel can be useful
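A rough sketch of what downloading in parallel could look like, using only the Python standard library. The names `fetch_one` and `download_all` are made up for illustration, and it assumes each URL ends in the file name you want on disk:

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_one(url, dest):
    """Download a single URL to dest (blocking) and return the destination path."""
    urllib.request.urlretrieve(url, str(dest))
    return dest


def download_all(urls, dest_dir, fetch=fetch_one, max_workers=4):
    """Download every URL into dest_dir using a small thread pool.

    `fetch` is injectable so the function can be tried without network access.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the last path segment of each URL is the desired file name.
    jobs = [(url, dest_dir / url.rstrip("/").split("/")[-1]) for url in urls]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the input order and re-raises the first failure it hits.
        return list(pool.map(lambda job: fetch(*job), jobs))
```

Threads are enough here because the work is network-bound. Whether it actually helps depends on where the bottleneck is, so it is worth timing it with a couple of `max_workers` values.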
Is there like a blueprint for the best cold start optimization?
I want to keep the min workers at 0
I already have the models on a network volume, in the same datacenter location
but it's like 40 GB of models
I didn't know sending this could achieve something
But maybe you can try opening a support ticket too if you'd like
Just have a constant flow of requests coming in
The more requests, the better your FlashBoot start will be
Try putting the models on Hugging Face, then use the model cache feature (the model field in your endpoint settings)
I think it's not well documented yet, but it's worth trying to optimize cold starts
@keen storm Is the model cache feature different from (and more efficient than) downloading the models in the Dockerfile with FlashBoot enabled?
Yes, it should be faster
You don't have to bake the models into the image, which means faster and easier builds and pushes, but I think its usage isn't well documented. You can try it
@keen storm When using the cached model feature in the endpoint settings, should I remove the commands that download the associated Hugging Face files (model, text encoders, VAE, etc.) from my Dockerfile?
Yep, you can access it inside your worker at a specific path
@keen storm What should be the specific path? There is no information in the online documentation.
If I use cached model and remove the initial download from Dockerfile, ComfyUI cannot find my model files in /comfyui/models/
Right..
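One possible workaround, as a sketch only: symlink whatever the cache provides into the folder ComfyUI scans, before ComfyUI starts. The actual cache path is not given anywhere in this thread, so `cache_dir` is a placeholder you have to fill in yourself, and `link_models` is a made-up helper name:

```python
from pathlib import Path


def link_models(cache_dir, target_dir,
                suffixes=(".safetensors", ".ckpt", ".pt", ".gguf")):
    """Symlink every model file found under cache_dir into target_dir.

    Returns the list of links that were created. Entries that already
    exist in target_dir are left untouched.
    """
    cache_dir, target_dir = Path(cache_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sorted(cache_dir.rglob("*")):
        # is_file() follows symlinks, so link-based cache layouts are picked up too.
        if src.is_file() and src.suffix in suffixes:
            link = target_dir / src.name
            if not link.exists() and not link.is_symlink():
                link.symlink_to(src.resolve())
                created.append(link)
    return created
```

You would call it once at worker startup, for example `link_models(cache_dir, "/comfyui/models/checkpoints")`, and repeat it with a different target folder and suffix list for text encoders or VAE files. Symlinks avoid copying 40 GB on every cold start, but this flattens the directory layout, so two files with the same name would collide.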