Greetings. After updating from yesterday's version of SwarmUI, for some unknown reason, Qwen Image Edit's runtime increased from 1 minute 15 seconds to 3-4 minutes. Rolling back to the previous version ([992a4bed: LTXV Latent Upscale basic support]) helps. I'm attaching screenshots (Before 1 and 2, After 1 and 2) with logs.
#Longer generation time for Qwen Edit 2511 after update
not a single possibly relevant change happened between the latest and that commit. More likely you're just on the edge of available memory limits so incidental changes alter your free memory space just enough to significantly change performance arbitrarily
there's been some minor UI tweaks, some documentation, and some edge case handling on very specific configs that don't apply to you
Make sure you aren't near your VRAM and/or RAM limits
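One way to check the VRAM side of this is to read the used and total memory from `nvidia-smi`. The sketch below assumes an NVIDIA GPU with `nvidia-smi` on the PATH; the helper names and the 90% threshold are hypothetical choices for illustration, not anything SwarmUI itself provides.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB lines (nvidia-smi CSV) into (used, total) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        # Skip anything that is not two plain integers (e.g. "[N/A]").
        if len(parts) != 2 or not all(part.isdigit() for part in parts):
            continue
        rows.append((int(parts[0]), int(parts[1])))
    return rows


def near_limit(used_mib, total_mib, threshold=0.90):
    """True when usage is at or above the threshold fraction (0.90 is an arbitrary pick)."""
    return used_mib / total_mib >= threshold


def query_gpu_memory():
    """Ask nvidia-smi for memory figures; return [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for index, (used, total) in enumerate(query_gpu_memory()):
        status = "NEAR LIMIT" if near_limit(used, total) else "ok"
        print(f"GPU {index}: {used}/{total} MiB ({status})")
```

Running it once while idle and once mid-generation shows how much headroom is left; system RAM would need a separate check.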
I understand and accept your arguments. It's just strange that yesterday, with identical settings, generations in Qwen took about 1 minute, and I could do them in batches while watching YouTube videos in the background. Today, it takes 3-5 minutes per generation, and during it, I can't even move the mouse. Everything is so slow. I haven't changed anything at all, haven't updated the system, drivers, or anything else. I'll continue experimenting, maybe I'll find the cause.
Weird shit actually.
Yes, this is the case, but why didn't it happen before? Never!
idk same thing happened to me yesterday or the day before, I just switched some of the text enc models for gguf and resolved it that way
Kinda sucks tho because it breaks my normal Qwen/Z-Image combination
Wow, can you please tell me where exactly I can find those text encoders?
I can't find any info in generation history or in options
It shows up in the comfyui tab but isn't listed in your gens because when you use a default parameter, it doesn't show it
So should I change them somehow and bring back the old ones, or what?
What?
Or just somehow switch my encoders to the GGUF ones?
You select it here, in the UI or change the default
Thanks i'll try!
And how did you determine among the pile of these files that you need exactly Qwen2.5-VL-7B-Instruct-Q4_K_M?
experience with the models
I got an error with this text encoder
It will do that if you're trying to use it with z-image because it's not the right one for z-image
z-image uses qwen_3_4b
I'm not using z-image, it is pure Qwen 2511 + lightning lora + anime2real lora 🙁
It is better now, but still not ~1 minute generation time
Yeah, that's the fp8 model, which is the original one that is twice the size of the gguf
If the gguf text enc doesn't work, that may be something monkey needs to look at.
I'll also try Qwen2.5-VL-7B-Instruct-Q6_K.gguf
Are all these text encoders compatible with Qwen Edit or regular Qwen?
obv it works with qwen edit because you just posted a gen with it
Not quite, I used one I found in the folder, idk, maybe it is the default one? But it is not a gguf
I don't know how to explain that the fp8 and gguf are the same model, in different formats. It isn't different for qwen edit and qwen image
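To make the "same model, different formats" point concrete, here is a rough size estimate. It assumes about 7 billion weights (read off the "7B" in the file name) and uses approximate bits-per-weight figures for the GGUF quantizations; real files will differ somewhat because of metadata, mixed-precision layers and the vision component.

```python
def approx_size_gib(param_count, bits_per_weight):
    """Estimate the weight storage in GiB for a given precision."""
    return param_count * bits_per_weight / 8 / 1024**3


# Assumption: ~7e9 weights, taken from "7B" in the model name.
PARAM_COUNT = 7e9

# Assumption: rough average bits per weight for each format.
FORMATS = {
    "fp8": 8.0,
    "Q6_K (approx.)": 6.6,
    "Q4_K_M (approx.)": 4.8,
}

if __name__ == "__main__":
    for name, bits in FORMATS.items():
        print(f"{name:>18}: ~{approx_size_gib(PARAM_COUNT, bits):.1f} GiB")
```

The weights encode the same network in every case; only the storage precision, and therefore the memory footprint, changes.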
I understand. But no luck with Qwen2.5-VL-7B-Instruct-Q6_K (same result as with Qwen2.5-VL-7B-Instruct-Q4_K_M).
@abstract basalt do you maybe have an idea why this is so, oh mighty guru?
Omg it was 30 seconds yesterday!
I figured out the problem, and I'm shocked. It turns out that if you don't assign the LoRA to BASE, the generation takes a long time. If you do, it takes 30 seconds. I don't know why there's such a huge difference; I'm not an expert on this matter.
You realize that your first screenshots are cold gens (the first of the session) and this one is a warm gen?
Cold always takes longer because the files have to be moved to ram and vram
In my situation it doesn't matter. If you don't apply Base to the LoRA, the generation will be slow, whether it's the first, second, or hundredth. But if you apply it, it will always be fast, and the VRAM problem will magically disappear.
But how can I apply BASE in the prompt? Like this, or some other way?
lora:qwen/Qwen-Image-Edit-2511-Lightning-8steps-V1.0-fp32:base
using an input image made it so that you are genning in a smaller resolution
due to the option "smart image prompt resizing"
higher res = slower gen and more vram usage
scratch that
confused myself
every single slow gen you posted is a cold gen though
and every fast one a warm one
I know you said it doesn't make a difference, but you might've misunderstood
just test it: restart swarm, do any gen and it will be slow, press the gen button again and it will be fast
<base><lora:...>
but everything after <base> will only be used for the base model, so careful if you have a refiner or segment etc
honestly not too sure what you mean by that, but disabling a lora will make the first gen after that take a bit longer
and this fix doesn't make sense, but good for you if it works 😄
can just say that every screenshot you posted has nothing out of the ordinary
Really?
It just works! (c)
on a cold gen the split between gen time and prep time isn't really reliable
always look at what the sum is
and the sum on that one isn't out of the ordinary for you
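The "compare the sum" advice can be written down as a tiny calculation. The numbers below are made up purely for illustration (they are not taken from the screenshots), and the function names are hypothetical.

```python
def total_seconds(prep_seconds, gen_seconds):
    """Total wall time for one generation: prep time plus gen time."""
    return prep_seconds + gen_seconds


def slowdown(baseline_total, candidate_total):
    """How many times slower the candidate run is than the baseline run."""
    return candidate_total / baseline_total


if __name__ == "__main__":
    # Made-up example: two runs with a very different prep/gen split.
    run_a = total_seconds(prep_seconds=40.0, gen_seconds=20.0)
    run_b = total_seconds(prep_seconds=5.0, gen_seconds=55.0)
    print(f"run A: {run_a:.0f}s, run B: {run_b:.0f}s, "
          f"ratio: {slowdown(run_a, run_b):.2f}x")
```

In this made-up case the split looks very different while the totals match, which is why the split alone can mislead on a cold gen.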
I don't understand. I demonstrated a cold generation in 49 seconds and the same one in 3 minutes. Is there really no difference or what? I think there is, and I found a solution. It turned out to be strange and unexpected, but it works.
I can disable @base and it will take as long as 5 minutes
The reason is unclear, but without @base the video memory simply gets clogged and that's it, generation takes much longer.
35 seconds, yahoo! 🙂
In warm it is 26 seconds even.
This is interesting.
Are you saying it generates faster if you assign the lora to base?
Yes, it does
Although that's not entirely accurate: generation runs normally when BASE is set, but when it isn't, the video memory fills up and generation is much slower.
I am writing through an online translator, my English is not good enough.
Maybe it will work for WAN 2.2 too? Have to test later