#Issue with Multiple instances of ComfyUI running simultaneously on Serverless


heavy hound
#

Hello,
I am using Runpod Serverless and deploying ComfyUI using this repo: https://github.com/blib-la/runpod-worker-comfy?tab=readme-ov-file#bring-your-own-models
For the Server, this repo is being used: https://github.com/comfyanonymous/ComfyUI
I am deploying via a Docker image, and both repos are baked into the image.

When I run 2-3 workers via the API, the ComfyUI server starts and responds as usual.
The problem appears when more requests arrive at once. For example, when more than 5 requests come in and more than 5 workers are activated, the ComfyUI server runs into trouble and does not start.

I understand that starting the ComfyUI server is handled by the ComfyUI code itself, but if that code were at fault, even a single worker should fail, and that is not the case. With few workers everything works fine; as soon as the number of workers increases, the ComfyUI server does not start.

I would appreciate it if anyone could take a look.
Thank You


topaz scroll
heavy hound
#

@topaz scroll I am doing the same thing.
The blib-la repo starts a local ComfyUI server and then sends requests to it.
When there are fewer than 3-4 workers, everything works fine.
The API becomes reachable after 15-20 retries (retries happen every few milliseconds, which is the default behavior).

But when there are more than 5 workers, the API is not reachable and the server does not start.
For example, out of 5 workers only 2-3 manage to start the server; the others keep retrying until the maximum number of retries is reached.
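
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: the function names, the retry count, the delay, and the `http://127.0.0.1:8188` address in the usage note are all assumptions.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_server(url, retries=500, delay_s=0.05, probe=http_probe, sleep=time.sleep):
    """Probe `url` up to `retries` times, sleeping `delay_s` between attempts.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed (the "max retries reached" case from the thread).
    """
    for attempt in range(1, retries + 1):
        if probe(url):
            return attempt
        sleep(delay_s)
    return None
```

Calling `wait_for_server("http://127.0.0.1:8188")` and logging the returned attempt number on each worker would show whether the slow workers are merely late or never come up at all.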

topaz scroll
#

That's odd. Each worker should be an island unto itself. I'm not sure how having more workers would impact the function of any single worker. How are you handling models?

heavy hound
#

@topaz scroll That is my thinking too: each worker is independent.
One silly thought I had was that the port might be occupied, so I generated a random port for each worker's server, but the issue remains the same.

I have set up the models, LoRAs, and custom nodes inside the Docker image; no network volume is attached.
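
For reference, one common way to pick a per-worker port is to let the operating system hand out a free one rather than choosing a random number. This is only a sketch of that idea; `pick_free_port` and `comfy_command` are hypothetical helpers, and the thread does not show how the ports were actually generated.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0.

    Note: the port is released when the socket closes, so another process
    could in principle grab it before the server binds to it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]


def comfy_command(port):
    """Build the argv for launching ComfyUI on the given port (assumed layout)."""
    return ["python", "main.py", "--port", str(port)]
```

Since each worker runs in its own container, a port clash between workers is unlikely to be the cause, which matches the result reported above.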

topaz scroll
#

Have you checked when this is happening which workers they are running on?

#

The port should be wholly contained inside the Docker container; it doesn't even touch the host ports.

#

Maybe if the host was completely out of ports???

heavy hound
#

I have a Python script for testing.
I just send 5 requests at once to my endpoint.
The RunPod endpoint then assigns each request to a worker.

Can you please explain what you mean by checking the workers?
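
A burst test like the one described above might look roughly like this. It is a sketch under assumptions: the endpoint ID, API key, and payload are placeholders, the URL shape follows RunPod's public `/run` and `/runsync` routes as I understand them, and `submit_job` and `send_burst` are hypothetical names.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_BASE = "https://api.runpod.ai/v2"  # assumed base URL for serverless endpoints


def submit_job(endpoint_id, api_key, payload, mode="run"):
    """POST one job to the endpoint; `mode` is "run" (async) or "runsync"."""
    req = urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/{mode}",
        data=json.dumps({"input": payload}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def send_burst(n, submit):
    """Call submit(i) for i in range(n) concurrently.

    Returns (results, errors), each a list of (index, value) pairs, so that
    failed requests can be counted separately from successful ones.
    """
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(submit, i) for i in range(n)]
        for i, fut in enumerate(futures):
            try:
                results.append((i, fut.result()))
            except Exception as exc:  # record the failure and keep going
                errors.append((i, exc))
    return results, errors
```

For example, `send_burst(5, lambda i: submit_job(ENDPOINT_ID, API_KEY, workflow))` would fire five requests at once, where `ENDPOINT_ID`, `API_KEY`, and `workflow` are values you supply.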

topaz scroll
#

If you go into Serverless, you can select Workers and see the status of all the workers assigned to your endpoint.

#

What do you have set for max workers?

heavy hound
topaz scroll
#

So, are all your requests in the IN_PROGRESS state? Or are you using RUNSYNC?

heavy hound
heavy hound
topaz scroll
#

50 would be nice, all I could get from them was 35.

heavy hound
#

I am running some other endpoints; that's why I need those.

topaz scroll
#

So, you can see from the logs that the ComfyUI API is timing out?

heavy hound
#

Yes

topaz scroll
#

As long as you are paying RP enough I'm sure they will continue to give more... I am not currently spending anything, in development.

#

I do everything async, but I have no such issues... Although I currently only have Flux Schnell, Flux Dev, SD3, and SDXL. I don't have anything custom.

heavy hound
topaz scroll
#

I would expect some of that while it syncs up. Are you running in a specific region?

heavy hound
topaz scroll
#

Yeah, don't see how that would change anything.

heavy hound
topaz scroll
#

Do you block out any regions?

heavy hound
topaz scroll
#

I am currently blocking EU-* and US-OR as they have had issues reported.

#

still have seen no update about them getting it working

heavy hound
#

By the way, you might say there is some bug specific to my endpoint.
But I also tested on a separate test endpoint.
The issue remains the same.

topaz scroll
#

Do you have a local GPU you can test on?

heavy hound
heavy hound
topaz scroll
#

If you have a local GPU, you can use the docker compose file from the repo to run it in local API mode.

#

Just people talking about it on this server

heavy hound
heavy hound
#

Can you please explain this in more detail?

topaz scroll
#

I would try blocking those regions I mentioned and testing again, and open a ticket with RP.

#

It shouldn't matter how many workers are running; each worker should be an isolated entity.

heavy hound
heavy hound
#

@topaz scroll I tried blocking the EU and US-OR regions, but the issue persists. However, I have noticed that the error rate is lower today.
For example, if I send 10 requests, sometimes all 10 complete and sometimes 2-3 fail.
So the issue appears to be internal to RunPod.

Thank you for taking the time to look at this.

eternal creek
#

use the async endpoint instead. Look here

#

The issue is that your requests are sometimes passing the limit.

#

Use a polling mechanism and check every few seconds to see if your requests are ready.

#

I am using the exact same serverless ComfyUI API to power my application. I do bulk processing of images and I ran into the exact same issue. This is how I fixed it. I can now run hundreds of images in one shot effortlessly, even with just 3 active GPUs.

#

It's much more performant anyway to use the polling system. And if you build an app on, say, serverless in the future, it won't affect your serverless function limits. Using runsync and waiting for a bunch of requests to return is not very efficient.

Modern frameworks like Next.js have a 10s limit on function timeouts, and serverless function time is one of the biggest cost factors.

#

And it does not affect the speed at all. I feel it is faster now. I have not done tests to verify that, but it's definitely not slower.

#

I'm using 3x 4090s and ripping hard. I saw little to no performance increase from using bigger GPUs.

heavy hound
#

@eternal creek Hello,
Thank you so much.
Your point totally makes sense.
I have two questions, though:
1- Even though async performs better, for sync they say the limit is 2000 requests per 10 seconds. I believe I haven't even crossed 100.
2- Using async, what time interval do you use between status checks?
I know it depends on the task being performed, but what is your suggestion?

eternal creek
#

Because it's one request at a time and comfyui doesn't use batching, you don't really gain much from the extra ram.

#

Set the polling to a 1-second interval. It is very performant. I don't see a difference at all.
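
The submit-then-poll pattern described above could be sketched like this. It is an illustration under assumptions, not the poster's code: the `/status/{job_id}` route and the status names follow RunPod's public API as I understand it, and `fetch_status` and `poll_until_done` are hypothetical helpers.

```python
import json
import time
import urllib.request

# Statuses assumed to mean the job will not change any further.
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def fetch_status(endpoint_id, api_key, job_id):
    """GET the current status document for one job (assumed URL shape)."""
    req = urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_until_done(job_id, get_status, interval_s=1.0, timeout_s=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status(job_id) every `interval_s` seconds until a terminal status.

    Raises TimeoutError if the job is still pending after `timeout_s` seconds.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status(job_id)
        if status.get("status") in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")
        sleep(interval_s)
```

A caller would submit to the async route first, then run something like `poll_until_done(job_id, lambda j: fetch_status(ENDPOINT_ID, API_KEY, j))` with their own endpoint ID and key.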

#

My requests come back in the same time. I know those limits are fudged. I had to figure it out through trial and error

heavy hound
eternal creek
#

Okay, cool, thanks. I will test that out. Where are you adding the flag?

heavy hound
#

Where you run python main.py in the ComfyUI directory, write python main.py --gpu-only instead.

#

You can also see other flags, like --highvram.
Run python main.py --help to list them.

eternal creek
#

Thanks, will check that out. I don't know much about ComfyUI.

If anyone can make an inpainting version of this worker that uses flux it would be AMAZING!

I want to be able to just pass in a masked photo, with a prompt and get something back.

My Starlink + Docker Hub connection is just super slow for some reason; I can't effectively push these big images.

heavy hound
#

Why not make a pod and run your ComfyUI experiments there?
Install the models with wget onto a network volume so that you can reuse them later.

eternal creek
#

Sorry, I don't want to hijack your thread, but

The issue is that you need to put everything in the Docker image, otherwise it takes too long to initialize in serverless. I need it to scale from zero on serverless, because our SaaS infra demands that. It's quite complicated. I have tried.

It might also be a bit premature I don't see any official/popular flux based workflows for inpainting up yet.

safe pecan
#

Yeah, I've been experimenting with my own serverless ComfyUI setup today and that was my experience as well. A Flux Docker image without extra quantization and whatnot takes forever to build and ends up at 40+ GB, but once it's set up, images typically go from zero to loaded and generated within 20 seconds.

heavy hound
weak inlet
#

It's best to use webhooks where possible; polling is inefficient in general.
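
As a rough illustration of the webhook approach, the request body can carry a callback URL so the result is pushed to you instead of polled. This sketch assumes the top-level `webhook` field from RunPod's request format and assumes the callback body carries `id`, `status`, and `output` fields; `build_run_request` and `parse_webhook_body` are hypothetical helpers.

```python
import json


def build_run_request(workflow_input, webhook_url):
    """Build the JSON body for an async run that reports back to `webhook_url`."""
    return {"input": workflow_input, "webhook": webhook_url}


def parse_webhook_body(raw_body):
    """Extract the fields of interest from a webhook POST body.

    The field names are an assumption; missing fields come back as None.
    """
    data = json.loads(raw_body)
    return {
        "job_id": data.get("id"),
        "status": data.get("status"),
        "output": data.get("output"),
    }
```

The trade-off is that a webhook needs a publicly reachable receiver, whereas polling works from any client.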

weak inlet
topaz scroll
eternal creek
#

Is there a serverless endpoint for cogvideo5b yet? I see camenduru has one with a GUI. I'm looking for just an API service: send a request, receive a video back. If anyone finds one, please holler.

weak inlet
topaz scroll
eternal creek
topaz scroll
#

I use runpod.serverless.progress_update to send updates in real time but it will NOT update the final status. For the final update you have to include all the data in what you return from the handler. You can either check that data with a STATUS or get the info from a webhook. <-- all of this is assuming you are using async RUN method.
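
The pattern described above, with intermediate updates sent through `runpod.serverless.progress_update` and the final data placed in the handler's return value, might be sketched as follows. `make_handler`, `start_worker`, and `run_workflow` are hypothetical names, and the message strings are placeholders.

```python
def make_handler(progress_update, run_workflow):
    """Build a job handler that reports progress and returns the final payload.

    `progress_update(job, message)` is expected to behave like
    runpod.serverless.progress_update; `run_workflow` is a stand-in for
    whatever actually drives ComfyUI.
    """
    def handler(job):
        workflow = job["input"]
        progress_update(job, "workflow queued")
        images = run_workflow(workflow)
        progress_update(job, "workflow finished")
        # Progress updates do not set the final status, so everything the
        # client needs must be in the returned value.
        return {"images": images}

    return handler


def start_worker(run_workflow):
    """Wire the handler into the RunPod SDK (imported lazily; needs `runpod`)."""
    import runpod

    handler = make_handler(runpod.serverless.progress_update, run_workflow)
    runpod.serverless.start({"handler": handler})
```

Passing `progress_update` in as an argument keeps the handler testable without the SDK installed; in a real worker only `start_worker` would be called.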

eternal creek
#

Yeah, I am just using polling currently with RUN. Works flawlessly. I'm not really fazed by a few extra requests popping off to check if it's ready. Hardly consequential in my current pipelines.

mossy sapphire
zealous epoch
eternal creek
sudden breach
#

You can do that with GitHub Actions too, or with other CI/CD pipeline providers.