#Has anyone successfully deployed a serverless instance using wan2.1 to generate i2v?


long crescent
#

I tried most of the comfyui+wan templates, but they are all for RunPod pods. Resources for creating a serverless instance for this purpose seem quite scarce too. Halp pls?

hexed glacier
#

that specific dc has fewer of the gpu of your choice, so i'd suggest moving to another dc

long crescent
hexed glacier
#

clarify?

#

whats wrong

long crescent
#

i'm trying to run wan2.1 on a serverless setup, but all the tutorials online just run it on a RunPod instance, so i'm asking for advice on setting it up on serverless. For example this seems to be the most popular tutorial: https://www.youtube.com/watch?v=HAQkxI8q3X0 ... every other tutorial seems to deploy pods as well

RunPod is a cloud platform that provides easy access to GPU instances for running AI models.

This video takes you through setting up a complete video generation package that includes Wan2.1 txt2vid, img2vid and vid2vid in one click using a RunPod template.

To deploy the template:
https://runpod.io/console/deploy?template=758dsjwiqz&ref=uyjfcr...

#

sorry if i'm asking such noob questions lol. I guess i have no other choice but to make my own docker image

hexed glacier
#

it does comfyui on pods

#

that guy created a template for pods i see, but for serverless you need an image with handler.py code specific to serverless
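For reference, a RunPod serverless worker is a container whose entrypoint registers a handler function with the `runpod` Python SDK. A minimal sketch of what such a `handler.py` could look like, assuming the job input carries `image_url` and `prompt` fields. Those field names, the `run_i2v` stub and the `start_worker` helper are illustrative assumptions, not taken from any existing worker:

```python
"""handler.py: minimal RunPod serverless handler sketch (illustrative only)."""


def run_i2v(image_url: str, prompt: str) -> str:
    """Hypothetical stand-in for the real ComfyUI / Wan2.1 call.

    A real worker would queue a workflow here and return the video location.
    """
    return f"video-for:{image_url}"


def handler(job: dict) -> dict:
    """Called once per request; the request body arrives under job["input"]."""
    job_input = job.get("input") or {}
    image_url = job_input.get("image_url")  # assumed field name
    prompt = job_input.get("prompt", "")    # assumed field name
    if not image_url:
        return {"error": "missing 'image_url' in input"}
    return {"video": run_i2v(image_url, prompt)}


def start_worker() -> None:
    """Register the handler with the RunPod SDK (needs `pip install runpod`)."""
    import runpod  # imported lazily so the handler can be tested without the SDK

    runpod.serverless.start({"handler": handler})
```

The container command would then call `start_worker()`, for example via `python -c "import handler; handler.start_worker()"`. Keeping the SDK import inside that function is only a convenience for testing the handler locally.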

long crescent
#

yeah i just figured that out recently. So i've been asking if someone has already made a handler.py for this purpose. couldn't seem to find one online

hexed glacier
#

do you wanna use comfyui

#

this is the worker you can use

long crescent
#

yeah i'm trying to use comfyui

long crescent
#

thanks for trying to help tho.

hexed glacier
#

actually.. comfyui doesn't include all the custom community nodes

#

so you gotta install community nodes there, read the readme

long crescent
#

got it... i'll get back with the results here if i succeed

long crescent
# obtuse comet any updates?

Hey, it's been a while. I decided to run regular pods to install all my comfyui stuff on a network volume, and then try to run it with serverless instances. Still in the process tho

obtuse comet
long crescent
#

I'm hopeful it'll get there

loud plover
#

Yes, I successfully ran Wan 2.1 image2video 720p on H100s behind a serverless endpoint.

It's quite tricky to set up and took me 2-3 days and lots of docker rebuilds
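With a worker behind a serverless endpoint, requests go through RunPod's HTTP API instead of the ComfyUI web UI. A rough sketch of submitting a job, assuming the `https://api.runpod.ai/v2/<endpoint_id>/run` route and a hypothetical input schema; the endpoint ID, API key and input fields are placeholders:

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def build_run_request(endpoint_id: str, api_key: str, job_input: dict) -> urllib.request.Request:
    """Build (but do not send) an async /run request for a serverless endpoint."""
    body = json.dumps({"input": job_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint_id}/run",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_job(endpoint_id: str, api_key: str, job_input: dict) -> dict:
    """Send the request and return the decoded JSON response (includes the job id)."""
    request = build_run_request(endpoint_id, api_key, job_input)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Since a 720p generation runs for minutes, submitting with `/run` and then polling `/status/<job_id>` is probably a better fit than the blocking `/runsync` route.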

#

I'm right now in the process of optimizing to see if I can push the cost down for a single i2v - 720p - 5sec generation

long crescent
#

@loud plover 2-3 days is impressive man.. it took me weeks. i'm finally getting there lol

hexed glacier
#

If you need help you can also post here or in new #1185337101307367535 post

loud plover
#

I've already been able to push down the execution time quite well.
CausVid lora + SageAttention + Cublas + FP16 accum. + Torch Compile

Running the Wan 2.1 14B 720p model at max res (1280 x 720) for a 5 sec clip takes around 3 minutes on an H100 (80GB Pro)

Also, preloading the models in the Docker image instead of loading them from a network volume helps tremendously in cutting down execution time. That was my biggest time saver: from 8-9 minutes down to 3 minutes
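One way to get the weights into the image is a small download script that the Dockerfile runs at build time. A minimal sketch under stated assumptions: the model root, file names and URLs below are placeholders, not the real Wan 2.1 locations, and `plan_downloads` / `download_all` are made-up helper names:

```python
"""download_models.py: fetch model weights at image build time (illustrative sketch)."""
import urllib.request
from pathlib import Path

# Hypothetical ComfyUI model root inside the image; adjust to the real layout.
MODEL_ROOT = Path("/comfyui/models")

# Placeholder entries: (subfolder, filename, source URL). None of these URLs are real.
MODELS = [
    ("diffusion_models", "wan2.1_i2v_720p.safetensors", "https://example.com/wan2.1_i2v_720p.safetensors"),
    ("text_encoders", "text_encoder.safetensors", "https://example.com/text_encoder.safetensors"),
]


def plan_downloads(root: Path, models: list) -> list:
    """Return (url, destination) pairs, skipping files that already exist."""
    plan = []
    for subfolder, filename, url in models:
        destination = root / subfolder / filename
        if not destination.exists():
            plan.append((url, destination))
    return plan


def download_all(root: Path = MODEL_ROOT, models: list = MODELS) -> None:
    """Download every missing file so the weights end up in an image layer."""
    for url, destination in plan_downloads(root, models):
        destination.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, destination)
```

A Dockerfile step along the lines of `RUN python -c "import download_models; download_models.download_all()"` would run it during the build, so the worker never fetches weights at request time.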

loud plover
loud plover
long crescent
#

Wow that's good to know. I'm not aiming for high quality. Currently generating a 640x640 video in 240-260 secs with wan vace + causvid + sageattention

#

I'm using network volume, so gpu choices are limited

loud plover
# long crescent I'm using network volume, so gpu choices are limited

Don't use a network volume. I understand it's more convenient to have a storage volume to add / remove things from.
But there are two downsides to using it:

  • Very slow cold starts from loading the Wan and text encoder models, which adds unnecessary billed execution time with every single request.
  • Limited locations, like you said

It's much better to just preload the models and bake them into the Docker image (see image).
Sure.. building, uploading and initialising the worker takes longer. But once your image is up and running you can benefit from RunPod's FlashBoot technology, which instantly preloads the model into GPU memory. This seriously cuts down your billed time
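A handler-side detail that goes with baked-in models: load them once per worker process rather than once per request, so warm requests skip the load entirely. A minimal sketch, where `load_pipeline` is a hypothetical stand-in for whatever actually loads the Wan and text encoder weights, and the `prompt` field is an assumed input name:

```python
import functools
import time


def load_pipeline() -> dict:
    """Hypothetical stand-in for the expensive model load (Wan + text encoder)."""
    time.sleep(0.01)  # pretend this takes a long time
    return {"loaded_at": time.monotonic()}


@functools.lru_cache(maxsize=1)
def get_pipeline() -> dict:
    """Load the pipeline on first use and reuse it for every later request."""
    return load_pipeline()


def handler(job: dict) -> dict:
    """Per-request entry point; only the first call on a worker pays the load cost."""
    pipeline = get_pipeline()
    prompt = (job.get("input") or {}).get("prompt", "")  # assumed field name
    return {"prompt": prompt, "pipeline_loaded_at": pipeline["loaded_at"]}
```

Calling `get_pipeline()` once at import time, before the worker starts accepting jobs, would move the load out of the first request as well.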

loud plover
#

What's the workflow like? Is it just creating 5 sec clips and then grabbing the last frame to continue for another iteration?
Or is it doing the 240 sec segment in one shot?

long crescent
long crescent
loud plover
long crescent
#

Tho i had to modify it a little to add causvid and other loras

loud plover
#

Thanks, do you generate 240 seconds in 720p?

Because even with causvid, sageattn, torchcompile and all the optimizations, this would take ages even on an H100, unless I'm doing something wrong

long crescent
#

for 720p with causvid+6 steps i'm getting around 350s
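To compare runs like these on cost rather than time, the number to watch is billed seconds multiplied by the per-second rate of the GPU in use. A small sketch; the rate is a made-up placeholder, not a real RunPod price, and rates differ per GPU type:

```python
def cost_per_generation(billed_seconds: float, price_per_second: float) -> float:
    """Cost of one request: billed execution time multiplied by the GPU rate."""
    return billed_seconds * price_per_second


def time_saved_fraction(before_seconds: float, after_seconds: float) -> float:
    """Fraction of billed time removed by an optimization."""
    return 1.0 - after_seconds / before_seconds


PLACEHOLDER_RATE = 0.001  # hypothetical $/second, for illustration only

# A 350 s generation at the placeholder rate:
example_cost = cost_per_generation(350, PLACEHOLDER_RATE)

# Going from 8.5 minutes (midpoint of 8-9) down to 3 minutes:
example_saving = time_saved_fraction(8.5 * 60, 3 * 60)
```

At a fixed rate, the drop from 8-9 minutes to 3 minutes mentioned earlier in the thread works out to roughly a 63-67% cut in billed time per request.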

long crescent
long crescent
merry coralBOT
lilac lion
#

@loud plover Hey! Would it be possible to get your starting Dockerfile + handler.py? I'm trying to make my own Wan2.1 img2video serverless endpoint using my own custom nodes, models, loras etc., but I want to pre-bake them into the image just like you did, then deploy it to a RunPod serverless endpoint.

It would be really appreciated. I've been experimenting for the past 1-2 weeks and keep running into problems. Thank you!

exotic estuary
hexed glacier
#

or set up your own server that runs the actions

#

depot.dev too

sour trellis
molten idol
#

@sour trellis Same. I mean I did manage to build with Sage, but the gen time went up not down.

potent viper
#

@sour trellis