#Has anyone successfully deployed a serverless instance using wan2.1 to generate i2v?
quite scarce, i'm guessing it's because you have network storage attached, isn't it?
that specific dc has fewer of the gpu you chose, so i'd suggest moving to another dc
yeah. got a volume attached to it indeed. eu-se-1 seems to have less choice too. regardless of dc, i can't seem to configure it right tho.
i'm trying to run wan2.1 on a serverless setup, but all the tutorials online just run it on a RunPod pod. so i'm asking for advice on setting it up on serverless. For example this seems to be the most popular tutorial: https://www.youtube.com/watch?v=HAQkxI8q3X0 ... every other tutorial seems to deploy pods too
sorry if i'm asking such noob questions lol. I guess i have no other choice but to make my own docker image
it does comfyui on pods
that guy created a template for pods i see, but for serverless you need an image with handler.py code written specifically for serverless
yeah i just figured that out recently. so i've been asking if someone has already made a handler.py for this purpose. couldn't seem to find one online
do you wanna use comfyui?
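A minimal sketch of what such a handler.py could look like, assuming the RunPod Python SDK (`runpod.serverless.start`) and a hypothetical job input with a `workflow` key. The key names and the response shape are made up for illustration, and the actual ComfyUI call is left as a placeholder.

```python
# handler.py: minimal RunPod serverless handler sketch.
# Assumption: the job input carries a ComfyUI workflow under "workflow".
# That key name and the response fields are hypothetical.


def validate_input(job_input):
    """Return (ok, error_message) for the incoming job input."""
    if not isinstance(job_input, dict):
        return False, "input must be a JSON object"
    if "workflow" not in job_input:
        return False, "missing 'workflow'"
    return True, None


def handler(job):
    """Entry point the RunPod worker calls once per request."""
    job_input = job.get("input")
    ok, error = validate_input(job_input)
    if not ok:
        return {"error": error}
    # Placeholder: queue the workflow in ComfyUI here and collect the video.
    return {"status": "accepted", "node_count": len(job_input["workflow"])}


def start_worker():
    """Hand the handler to the RunPod SDK; blocks and serves requests."""
    import runpod  # imported lazily so the handler can be tested without the SDK

    runpod.serverless.start({"handler": handler})
```

In the real image the file would end with a call to `start_worker()`, and the container command would run `python handler.py`.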
this is the worker you can use
yeah i'm trying to use comfyui
i've used this one earlier. its comfyui version seemed to be missing the WanImageToVideo nodes.
thanks for trying to help tho.
actually.. comfyui doesn't include all the custom community nodes
so you gotta install community nodes there, read the readme
got it... i'll get back with the results here if i succeed
any updates?
Hey, it's been a while. I decided to run regular pods to install all my comfyui stuff on a network volume. Now I'm trying to run it with serverless instances. Still in the process tho
yeah i wish comfyui was more API friendly
I'm hopeful it'll get there
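For what it's worth, ComfyUI does expose an HTTP API that a handler can drive: `POST /prompt` queues an API-format workflow and `GET /history/{prompt_id}` returns its outputs once finished. A rough sketch using only the standard library, assuming ComfyUI runs in the same container on its default port 8188. The function names are hypothetical.

```python
import json
import urllib.request
import uuid

COMFY_URL = "http://127.0.0.1:8188"  # assumption: default local ComfyUI address


def build_prompt_payload(workflow, client_id=None):
    """Wrap an API-format workflow dict in the body expected by POST /prompt."""
    return {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}


def queue_workflow(workflow, base_url=COMFY_URL):
    """Send the workflow to ComfyUI and return the prompt_id it reports."""
    body = json.dumps(build_prompt_payload(workflow)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["prompt_id"]


def fetch_history(prompt_id, base_url=COMFY_URL):
    """Return the history entry for a prompt; empty until the job has finished."""
    with urllib.request.urlopen(f"{base_url}/history/{prompt_id}") as response:
        return json.loads(response.read())
```

A handler would call `queue_workflow`, then poll `fetch_history` until the prompt id shows up in the result.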
Yes, I successfully ran Wan 2.1 image2video 720p on H100s behind a serverless endpoint.
It's quite tricky to set up and took me 2-3 days and lots of docker rebuilds
Right now I'm optimizing to see if I can push the cost down for a single i2v - 720p - 5sec generation
@loud plover 2-3 days is impressive man.. it took me weeks. i'm finally getting there lol
I'm curious which specific step takes the most time
If you need help you can also post here or in a new #1185337101307367535 post
I've already been able to push down the execution time quite well.
CausVid LoRA + SageAttention + cuBLAS + FP16 accumulation + torch.compile
Running the Wan 2.1 14B 720p model at max res (1280 x 720) for a 5 sec clip takes around 3 minutes on an H100 (80GB Pro)
Also, preloading the models in the Dockerfile instead of loading them from a network volume helps tremendously in cutting down execution time. That was my biggest time saver: from 8-9 minutes down to 3 minutes
What's your average execution time on the highest quality (720p 14B model)?
The Docker builds take a lot of time. Like I said earlier, I'm preloading / baking the model in the Dockerfile, moving it straight into the /models folder of comfyui. So on each build it has to upload a 100 GB+ docker image and then Runpod has to download / extract the image
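To make the baked-in vs. network-volume choice explicit at startup, a worker could check where the models actually live before launching ComfyUI. A small sketch, where both directory paths are assumptions (adjust them to your image layout and volume mount):

```python
import os

BAKED_MODELS_DIR = "/comfyui/models"         # hypothetical path inside the image
VOLUME_MODELS_DIR = "/runpod-volume/models"  # hypothetical path on a network volume


def resolve_models_dir(candidates=(BAKED_MODELS_DIR, VOLUME_MODELS_DIR)):
    """Return the first candidate directory that exists and is not empty."""
    for path in candidates:
        if os.path.isdir(path) and os.listdir(path):
            return path
    raise FileNotFoundError(f"no models found in any of: {', '.join(candidates)}")
```

Listing the baked path first means a network volume is only used as a fallback, so a fully baked image never waits on volume reads.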
Wow that's good to know. I'm not aiming for high quality. Currently generating a 640x640 video in 240-260 secs with wan vace + causvid + sageattention
I'm using network volume, so gpu choices are limited
Don't use network volume. I understand it's more convenient to have a storage volume to add / remove things from.
But there are two downsides when using it:
- Very slow cold-starts from loading the Wan and text encoder model which adds unnecessary billed execution time with every single request.
- Limited locations like you said
It's much better to just preload the models and bake them in the Docker image (see image).
Sure.. building, uploading and initialising the worker takes longer. But once your image is up and running you can benefit from Runpod's FlashBoot technology which instantly preloads the model in the GPU memory. This seriously cuts down your billed time
Haven't worked with vace much yet. Can Wan vace do a 240-260 sec video? That's impressive
What's the workflow like? Is it just creating 5 sec clips and then grabbing the last frame to continue for another iteration?
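As a back-of-the-envelope check on what the shorter execution time means for billing, using the 8-9 minute and 3 minute figures from earlier in the thread. The per-second price below is a made-up placeholder, not a real RunPod rate.

```python
def cost_per_job(execution_seconds, price_per_second):
    """Billed cost of one request, assuming billing is linear in seconds."""
    return execution_seconds * price_per_second


def savings_ratio(slow_seconds, fast_seconds):
    """Fraction of billed time saved by going from slow to fast."""
    return 1 - fast_seconds / slow_seconds


PRICE_PER_SECOND = 0.001  # hypothetical $/s, replace with your endpoint's rate

slow = cost_per_job(8.5 * 60, PRICE_PER_SECOND)  # network volume, about 510 s
fast = cost_per_job(3 * 60, PRICE_PER_SECOND)    # baked image, about 180 s
print(f"slow: ${slow:.2f}  fast: ${fast:.2f}  saved: {savings_ratio(510, 180):.0%}")
```

Whatever the actual rate is, going from roughly 510 s to 180 s removes about 65% of the billed time per request.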
Or is it doing the 240 sec segment in one shot?
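The first option in that question (short clips chained by their last frame) could be sketched like this. `generate_clip` is a hypothetical callable that takes a start image and returns a list of frames, and the sketch assumes each clip's first frame repeats its start image.

```python
def chain_clips(first_image, generate_clip, iterations):
    """Generate `iterations` clips, seeding each one from the previous last frame."""
    frames = []
    seed_image = first_image
    for _ in range(iterations):
        clip = generate_clip(seed_image)
        if not clip:
            raise ValueError("generate_clip returned no frames")
        # Later clips start on the seed frame again, so drop that duplicate.
        frames.extend(clip if not frames else clip[1:])
        seed_image = clip[-1]
    return frames
```

Nothing in the thread confirms that anyone here uses this approach. It only illustrates the idea being asked about.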
Thanks for the valuable information. I'll experiment with it then
Oh the workflow is very basic, it's the one from the comfy tutorials
Could you link it? I really wonder how to do longer segments other than 5 seconds
Tho i had to modify it a little to add causvid and other loras
Thanks, do you generate 240 seconds in 720p?
Because even with causvid, sageattn, torchcompile and all the optimizations, this would take ages even on an H100 unless I'm doing something wrong
for 720p with causvid+6 steps i'm getting around 350s
@loud plover
anyway, if anyone is looking to deploy serverless comfyui+wan take a look at my repo: https://github.com/atumn/runpod-wan
Thank you for marking this question as solved!
@loud plover Hey! Would it be possible to get your starting Dockerfile + handler.py? I'm trying to make my own Wan2.1 img2video serverless endpoint using my own custom nodes, models, LoRAs etc., but I want to pre-bake them into the image just like you did, then deploy it to a RunPod serverless endpoint.
Much appreciated. I've been experimenting for the past 1-2 weeks and have been running into problems. Thank you!
Can I ask where you are building your images? I'm trying on runpod, but it fails due to time or image size
Github actions can work
or set up your own server that runs the actions
depot.dev too
How did you set up the SageAttention? I have struggled to install SageAttention v2.2 for a few weeks now. I tried installing it in Dockerfile, but it constantly gives the "No CUDA runtime is found" error
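One possible cause, stated as an assumption rather than a diagnosis: `docker build` steps normally run without GPU access, so a build that probes for a GPU finds none. PyTorch's extension builder reads `CUDA_HOME` and `TORCH_CUDA_ARCH_LIST` from the environment; whether SageAttention v2.2's build honors the latter should be checked against its own setup script. A small diagnostic that can be run inside the build to see what the compiler toolchain will find:

```python
import os
import shutil


def cuda_build_report(env=None):
    """Summarize the CUDA-related build settings visible in the environment."""
    env = os.environ if env is None else env
    return {
        # nvcc is needed to compile CUDA extensions, even with no GPU present
        "nvcc_on_path": shutil.which("nvcc", path=env.get("PATH")) is not None,
        "CUDA_HOME": env.get("CUDA_HOME"),
        # target architectures, useful when no GPU can be detected at build time
        "TORCH_CUDA_ARCH_LIST": env.get("TORCH_CUDA_ARCH_LIST"),
    }


if __name__ == "__main__":
    for key, value in cuda_build_report().items():
        print(f"{key}: {value}")
```

If `nvcc_on_path` is false, the base image is probably a CUDA runtime image rather than a devel one, which would be worth ruling out first.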
@sour trellis Same. I mean I did manage to build with Sage, but the gen time went up not down.