ComfyUI for Intel Arc using IPEX | Intel Insiders Community | Page 10

earnest grotto Jun 19, 2025, 7:37 AM

#

@reef ivy do you want to explain your steps for getting compile working on windows?

somber trellis Jun 19, 2025, 7:43 AM

#

main spear Jun 19, 2025, 7:44 AM

#

I can see now what I need, I have to run it on a WSL with intel's Triton Backend

#

This is just the universe saying "Get an NVIDIA GPU......."

formal tusk Jun 19, 2025, 5:44 PM

#

I just use a symlink

earnest grotto Jun 19, 2025, 6:22 PM

#

there is no point making symlinks when it's a feature already supported by comfy by default

#

if you can learn to make a symlink, you can edit a yaml file and add many locations much more easily
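For reference, a minimal sketch of what such a YAML entry could look like, built from Python so the paths are easy to change. The key names (`base_path`, `checkpoints`, `loras`, `vae`) follow the `extra_model_paths.yaml.example` file that ships with ComfyUI as far as I recall; the section name `my_models` and the `D:/AI/models` path are made up, so compare with the example file in your own install before using it.

```python
def build_extra_model_paths(section, base_path, subfolders):
    """Return YAML text describing one extra model location.

    section, base_path and the subfolder names are illustrative only;
    check the key names against extra_model_paths.yaml.example.
    """
    lines = [f"{section}:", f"    base_path: {base_path}"]
    for kind, folder in subfolders.items():
        lines.append(f"    {kind}: {folder}")
    return "\n".join(lines) + "\n"


yaml_text = build_extra_model_paths(
    "my_models",      # hypothetical section name
    "D:/AI/models",   # hypothetical shared model folder
    {"checkpoints": "checkpoints", "loras": "loras", "vae": "vae"},
)
print(yaml_text)
```

The printed text would go into `extra_model_paths.yaml` next to ComfyUI's `main.py`; more sections can be added the same way for more locations.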

main spear Jun 19, 2025, 7:12 PM

#

My problem right now is that when I make a wsl it doesn't read my gpu even though I have all the requirements

earnest grotto Jun 19, 2025, 7:31 PM

#

does clinfo show your gpu
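A rough way to script that check, assuming `clinfo` is on the PATH and that `clinfo -l` prints one `Device #N: name` line per device (the sample text here is my own approximation of that layout, and the device name is just an example):

```python
import re
import shutil
import subprocess


def parse_clinfo_devices(text):
    """Pull device names out of `clinfo -l` style output."""
    return [m.group(1).strip() for m in re.finditer(r"Device #\d+:\s*(.+)", text)]


def list_opencl_devices():
    """Run `clinfo -l` if installed; return [] when it is missing."""
    exe = shutil.which("clinfo")
    if exe is None:
        return []
    result = subprocess.run([exe, "-l"], capture_output=True, text=True)
    return parse_clinfo_devices(result.stdout)


# Approximate sample of the listing format, not captured from a real machine.
sample = (
    "Platform #0: Intel(R) OpenCL Graphics\n"
    " `-- Device #0: Intel(R) Arc(TM) B580 Graphics\n"
)
print(parse_clinfo_devices(sample))
```

An empty list from `list_opencl_devices()` inside WSL would point at a missing compute runtime rather than at ComfyUI itself.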

#

you installed with my script?

earnest grotto Jun 19, 2025, 7:32 PM

#

earnest grotto @reef ivy do you want to explain your steps for getting compile work...

Also, ^
pretty sure you can get compile working on native windows. aaron had it working if i recall correctly. i just dunno how

main spear Jun 19, 2025, 7:42 PM

#

From what I've been reading and what Grok and ChatGPT have told me, torch.compile doesn't work on native Windows. I have your script working on native Windows; I haven't tried installing it on WSL2 though

#

The problem is clinfo doesn't read my gpu

earnest grotto Jun 19, 2025, 7:44 PM

#

grok and chatgpt probably told you outdated info. compile wasn't a thing on windows in general (intel or nvidia) till recently. and i highly doubt they'd give you anything accurate about intel

earnest grotto Jun 19, 2025, 7:46 PM

#

main spear The problem is clinfo doesn't read my gpu

did you install the compute runtime?

#

there are a bunch of extra things you probably will need to install on wsl

#

i have not touched wsl in quite a while

main spear Jun 19, 2025, 7:47 PM

#

I couldn't find the compute runtime or level zero for windows, it was all for Linux and Ubuntu

earnest grotto Jun 19, 2025, 7:47 PM

#

yes, you're using wsl.

main spear Jun 19, 2025, 7:48 PM

#

I meant to get compile working on native

#

I have OneAPI installed as well because I thought those came in the package but they do not

earnest grotto Jun 19, 2025, 7:49 PM

#

search aaron's messages in this thread regarding compile on windows

main spear Jun 19, 2025, 7:58 PM

#

#1193952640225267802 message

#

Here?

earnest grotto Jun 19, 2025, 8:03 PM

#

yes

#

if you're using my installer, 2.8 is nightly/experimental, dunno which i called it

#

you can run the script again from the same location as before and pick a different pytorch version. it will install the different version, faster than installing it anew

reef ivy Jun 19, 2025, 8:17 PM

#

earnest grotto @reef ivy do you want to explain your steps for getting compile work...

#1193952640225267802 message this is how I got it working. Not sure if it's still necessary; you can just try calling setvars when starting your environment, and if that doesn't work, try that stuff

main spear Jun 19, 2025, 8:24 PM

#

@reef ivy How did you get Wan vids to finish in less than 20 mins? I switched to LTX because it's faster, but Wan has the quality I want. I tried some different workflows and they were all taking 1 hr, sometimes longer

#

What did I miss? Do I need to add it to path?

📎 message.txt

#

oh it didn't show all the code

#

```
  File "C:\Users\chris\OneDrive\Documents\ComfyUI\Comfy_Intel\cenv\lib\site-packages\triton\runtime\build.py", line 74, in _build
    raise RuntimeError("Failed to find C++ compiler. Please specify via CXX environment variable.")
torch._inductor.exc.InductorError: RuntimeError: Failed to find C++ compiler. Please specify via CXX environment variable.

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

print("Execution finished")
Execution finished
```

reef ivy Jun 19, 2025, 9:35 PM

#

main spear @reef ivy How did you get Wan vids to finish in less than 20mins? I ...

Lower frame count and fp8 quant (if 14b i2v); GGUF is slower and less consistent in speed. Try 49 frames; resolution also makes a difference. There's also teacache and other techniques, and CausVid for lower steps and lower cfg now.

#

Check out the fusion x models also. I haven't tried them personally, but they have CausVid and all that stuff merged. This is for 14b btw.

main spear Jun 19, 2025, 10:34 PM

#

I think I got this working now, I just added the call to oneAPI's setvars in activate.bat
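A quick way to see whether a C++ compiler is visible from the environment ComfyUI starts in, which is what the "Failed to find C++ compiler" error is about. This is only a sketch: the candidate names (`icx`, `cl`, `g++`, `clang++`) are my assumption for illustration, not Triton's actual search order.

```python
import os
import shutil


def find_cxx(env=None, which=shutil.which,
             candidates=("icx", "cl", "g++", "clang++")):
    """Return a C++ compiler visible to this process, or None.

    An explicit CXX variable wins; otherwise the first candidate
    found on PATH is returned. The candidate list is illustrative.
    """
    env = os.environ if env is None else env
    explicit = env.get("CXX")
    if explicit:
        return explicit
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


print("C++ compiler:", find_cxx() or "not found (was setvars called?)")
```

Run it from the same activated environment; if it reports nothing before setvars is called and a compiler path after, the activate.bat change is doing its job.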

somber trellis Jun 20, 2025, 3:43 AM

#

@earnest grotto

#

Index tts post that was in chinese

reef ivy Jun 20, 2025, 5:36 AM

#

If it only needs the call then they added the needed files to the driver(which is what I thought but wasn't sure since I already did it lol)

earnest grotto Jun 20, 2025, 6:40 AM

#

somber trellis @earnest grotto

sad 🤷‍♂️

earnest grotto Jun 20, 2025, 6:56 AM

#

Did i post that i was trying to get instantcharacter to work here? i guess i didn't, can't see scrolling up
well, it works now
https://cdn.discordapp.com/attachments/1091563787749953647/1385342584980770907/ComfyUI_00060_.png?ex=6855b852&is=685466d2&hm=9f291d995163f3c287ecd9d38a93e9d0b07c9cb82aa4c6e4ecde27db251af4b3&

#

I wonder if I should look into how comfy does inference and try to shove it in that. only the transformer, dinov2 siglip and the ipadapter remain done by diffusers, everything else is native
and for the better, flan gave me a prettier result, at least for this seed. regular t5 was having finger issues and bg was a bit worse

somber trellis Jun 20, 2025, 11:17 AM

#

earnest grotto Did i post that i was trying to get instantcharacter to work here? i guess i did...

InstantID solaire of astora

#

I wonder how versatile it is

earnest grotto Jun 20, 2025, 11:18 AM

#

somber trellis InstantID solaire of astora

whut?

somber trellis Jun 20, 2025, 11:18 AM

#

earnest grotto whut?

Does InstantID not work by using a reference image ?

#

Actually lemme set it up my end

earnest grotto Jun 20, 2025, 11:19 AM

#

pretty sure all the -ID things are for faces specifically

somber trellis Jun 20, 2025, 11:20 AM

#

earnest grotto pretty sure all the -ID things are for faces specifically

Welp

reef ivy Jun 20, 2025, 1:08 PM

#

Anybody tried vace or phantom for wan2.1 with one frame? One might work decent for consistent characters.

main spear Jun 23, 2025, 6:46 AM

#

With the 1.3b VACE t2v I can get pretty good quality in about 10-20 min; the 14b VACE takes about 40 min to 1 hr, but the loras I use work with VACE even though they don't specifically say they do on Civitai. I also use it in conjunction with CausVid. I've been trying to get it to work with teacache and torch.compile, but sometimes it works and sometimes it doesn't, and when it does work it doesn't really shorten anything. I think that's partly because CausVid is already shortening it as much as it can, but I'm still tinkering with it

reef ivy Jun 23, 2025, 3:54 PM

#

main spear With the 1.3b vace t2v I can get pretty good quality in about 10-20min, the 14b v...

how many frames and resolution? GGUF or FP8? Try the fp8 scaled models from comfy as well. t2v might be a little slower than i2v (which i use) but I don't think it should take 20 minutes. Also put --reserve-vram 7 into the command line for comfy. (you can try different numbers but 7 is usually the sweet spot if on 8gb gpu like me)
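As a small sketch of that launch line: `--reserve-vram` is the ComfyUI flag named above and takes a value in GB, while `main.py` as the entry point and the helper names are my assumptions. The "left over" figure is rough arithmetic for intuition only, not how ComfyUI accounts for memory internally.

```python
import sys


def comfy_command(reserve_vram_gb, main_script="main.py"):
    """Build a launch command with --reserve-vram set (value in GB)."""
    return [sys.executable, main_script, "--reserve-vram", str(reserve_vram_gb)]


def left_for_models_gb(total_gb, reserve_vram_gb):
    """Rough VRAM left for models after the reserve, floored at zero."""
    return max(total_gb - reserve_vram_gb, 0)


print(" ".join(comfy_command(7)))
print(left_for_models_gb(8, 7), "GB left on an 8 GB card with 7 reserved")
```

A high reserve on a small card pushes most of the model into system RAM, which is the behaviour being recommended here for Arc.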

wicked fulcrum Jun 23, 2025, 4:19 PM

#

main spear With the 1.3b vace t2v I can get pretty good quality in about 10-20min, the 14b v...

@somber trellis for WAN 2.1 Vace are you using IPEX + PyTorch or straight PyTorch ie 2.7?

main spear Jun 23, 2025, 8:14 PM

#

wicked fulcrum @somber trellis for WAN 2.1 Vace are you using IPEX + PyTorch or straigh...

I have nightly installed so I can use torch.compile, but I haven't really seen a big difference with it, so I might use the other version that is in Vik's installer

earnest grotto Jun 23, 2025, 8:38 PM

#

What do you mean by doesn't work, broken result or errors
IIRC, teacache and causvid might not mix

#

Also there's magcache which should be better than teacache

#

(Not that it would mix if it doesn't, but it'd be better in cases where teacache is effective. I haven't tried magcache)

main spear Jun 23, 2025, 8:45 PM

#

I'll try to recreate the error it gives me. I think it's a memory problem but idk. I tried asking Grok to see if it was fixable but it ended up causing other errors, so I reinstalled Comfy and haven't used wan2.1 yet. I have 48gb of RAM and I'm using the Arc B580 with 12gb VRAM; I can see it uses the VRAM for most processes and occasionally it'll use the CPU for other things like positive and negative prompts

main spear Jun 23, 2025, 9:04 PM

#

Okay, maybe it was a compatibility issue with nightly, I installed the stable version in your installer and the issue isn't appearing anymore

reef ivy Jun 23, 2025, 11:11 PM

#

Some nightly builds can randomly break since it's actively being developed every day

#

2.7 stable should support torch.compile now iirc

main spear Jun 23, 2025, 11:19 PM

#

reef ivy 2.7 stable should support torch.compile now iirc

How do you get it working? I tried just following the same steps as the nightly build but got errors

#

specifically when using the WanVaceToVideo node

#

which was weird because I didn't even have it connected to that node

reef ivy Jun 24, 2025, 5:39 AM

#

Native nodes? Or the wanwrapper? For wan wrapper you have to do a minor code edit.

main spear Jun 24, 2025, 6:16 AM

#

reef ivy Native nodes? Or the wanwrapper? For wan wrapper you have to do a minor code edi...

The wanwrapper

reef ivy Jun 24, 2025, 1:47 PM

#

main spear The wanwrapper

#1193952640225267802 message follow those instructions if you are on A series, battlemage shouldn't need it but maybe try it anyway. If you are on battlemage probably post your errors here and maybe one of us can help.

#

Also, you can try the latest nightly build and it might be fixed now if it was working before with the wrapper.

somber trellis Jun 24, 2025, 2:31 PM

#

wicked fulcrum @somber trellis for WAN 2.1 Vace are you using IPEX + PyTorch or straigh...

Ipex+Pytorch. Pretty sure both can do wan though without issue

wicked fulcrum Jun 25, 2025, 5:28 PM

#

WAN 2.1 Vace 14B GGUF Q3 - 512x512, 30 steps. Takes 35-40 minutes on A770

#

This is using PyTorch 2.7, no IPEX, and the AI Playground install. Wondering if this is in line with what people are experiencing for times: 70.46 s/it. Video gen is pretty good considering the lower res, but it is slow. Any thoughts?
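The two numbers quoted (70.46 s/it and 30 steps) line up with the 35-40 minute total. A quick check, ignoring model loading and VAE decode:

```python
def total_minutes(seconds_per_it, steps):
    """Sampling time in minutes, ignoring model load and VAE decode."""
    return seconds_per_it * steps / 60


print(round(total_minutes(70.46, 30), 1))  # 35.2
```

So nearly all of the wall time is the sampler itself, which is where the speedups discussed in this thread apply.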

earnest grotto Jun 25, 2025, 5:55 PM

#

I don't do videos, but there are a lot of speedups you can employ. Notably:
causvid - can be a lora kinda like lcm extracted from the finetune, or a finetune
teacache/magcache - those also apply to flux and other purely image models, decent speedups, magcache is a little newer and might not be as supported yet
self-forcing - basically cfg+step distillation, like flux schnell
I have this link for a self-forcing lora. No idea how effective it is
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors

#

https://github.com/welltop-cn/ComfyUI-TeaCache

#

teacache slightly lowers quality (and might break if used with the other speedups). magcache supposedly retains much more quality with similar speedup

wicked fulcrum Jun 25, 2025, 7:00 PM

#

Using teacache now and not sure it is speeding it up. Need to test to be sure

reef ivy Jun 26, 2025, 1:43 AM

#

Teacache will speed it up, also torch.compile. Teacache works by skipping steps so it speeds up as it goes, usually starts after the third step
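To make "skipping steps" concrete, here is a toy loop in the spirit of that idea. It is not TeaCache's actual code: the real thing works on model tensors and, as I understand it, estimates how much the output would change from the timestep-modulated input. Here plain numbers stand in for the inputs, and `threshold` and `expensive_step` are stand-ins.

```python
def run_with_step_cache(inputs, expensive_step, threshold):
    """Toy step-skipping loop.

    Recompute only when the accumulated relative change of the input
    since the last full compute reaches `threshold`; otherwise reuse
    the cached result.
    """
    outputs, computed = [], 0
    cached, prev, drift = None, None, 0.0
    for x in inputs:
        if prev is not None:
            drift += abs(x - prev) / max(abs(prev), 1e-8)
        if cached is None or drift >= threshold:
            cached = expensive_step(x)  # full compute, refresh the cache
            computed += 1
            drift = 0.0
        outputs.append(cached)          # skipped steps reuse the cache
        prev = x
    return outputs, computed


outs, n = run_with_step_cache([1.0, 1.01, 1.02, 1.5, 1.51], lambda x: x * 2, 0.1)
print(n, "of 5 steps computed")
```

A higher threshold skips more steps and drifts further from the uncached result, which matches the quality-for-speed trade described in this thread.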

#

Also there is the merged fusion model that can run at fewer steps with even better quality (have not tried it myself yet though)

rustic sonnet Jun 26, 2025, 3:32 PM

#

https://x.com/bfl_ml/status/1938257909726519640?t=U33R5RsvF8gddYKbWY3BhQ&s=19

Black Forest Labs (@bfl_ml)

High quality image editing no longer needs closed models

We release FLUX.1 Kontext [dev] - an open weights model for proprietary-level image editing performance. Runs on consumer chips.

✓ Open weights available
✓ Best in-class performance
✓ Self-serve commercial licensing

somber trellis Jun 26, 2025, 4:17 PM

#

wicked fulcrum WAN 2.1 Vace 14B GGUF Q3 - 512x512, 30 steps. Takes 35-40 minutes on A770

Why are you using a Q3.

#

These lower GGUF quants for these models tend to actually be slower.

#

Q8 is always the way to go for image/video models.

#

also use teacache

#

In other news.

#

flux kontext dev released

rustic sonnet Jun 26, 2025, 4:19 PM

#

So if I want to use comfyUI on Meteorlake IGPU, what would be the steps to install it

#

With the 4GB issue mitigation i mean

wicked fulcrum Jun 26, 2025, 4:33 PM

#

somber trellis Q8 is always the way to go for image/video models.

I'll try Q4 and Q8. My worry was that higher-memory models would cause memory sharing to kick in and dramatically slow down inferencing

somber trellis Jun 26, 2025, 4:34 PM

#

wicked fulcrum I'll try Q4 and Q8. My worry was, higher memory models would cause memory s...

With reserve vram properly set, this actually doesn't seem to be the case.
In fact, memory swapping between sys and vram in arc's case seems to improve inferencing time.

wicked fulcrum Jun 26, 2025, 4:43 PM

#

somber trellis With reserve vram properly set, this actually doesn't seem to be the case. In fa...

Hmm interesting. That's not been my experience. But I've been focused elsewhere the last couple of months. At least with drivers from earlier this year, my experience was that when models are at the edge of GPU memory and shared memory kicks in, the copy process interrupts GPU compute time and can slow down inferencing 10X... with reserve vram set.
But I'll try higher vram models, and clear my assumptions. Thanks for the tip

somber trellis Jun 26, 2025, 4:47 PM

#

Originally on 2.3.1 I used a --reserve-vram of 4.0, but as the drivers and comfyui itself changed I've had to increase it.

#

It was 4. Then 6.

#

Now it's 8.

#

#

Okay. Wow.

#

earnest grotto Jun 26, 2025, 5:12 PM

#

rustic sonnet https://x.com/bfl_ml/status/1938257909726519640?t=U33R5RsvF8gddYKbWY3BhQ&s=19

finally

earnest grotto Jun 26, 2025, 5:14 PM

#

rustic sonnet So if I want to use comfyUI on Meteorlake IGPU, what would be the steps to insta...

You can use my script #1193952640225267802 message
Or AI Playground should also work
If you do use my script, make sure it tells you it's detected the iGPU right. I don't have an iGPU and am not entirely clear on what's what with them, but it should work

rustic sonnet Jun 26, 2025, 5:14 PM

#

earnest grotto You can use my script https://discord.com/channels/554824368740630529/1193952640...

Thanks mate

earnest grotto Jun 26, 2025, 5:14 PM

#

Here's a link to AI playground https://game.intel.com/us/stories/introducing-ai-playground/

Intel Gaming Access

Bob Duffy

Introducing AI Playground

wicked fulcrum Jun 26, 2025, 5:15 PM

#

somber trellis

Is this Kontext dev?

wicked fulcrum Jun 26, 2025, 5:17 PM

#

rustic sonnet So if I want to use comfyUI on Meteorlake IGPU, what would be the steps to insta...

MTL-H?

somber trellis Jun 26, 2025, 5:21 PM

#

wicked fulcrum Is this Kontext dev?

Yes.

rustic sonnet Jun 26, 2025, 5:23 PM

#

wicked fulcrum MTL-H?

Yes

#

185H

#

Got 32GB ram

somber trellis Jun 26, 2025, 5:30 PM

#

so i just bigbrained something

#

combining both redux and flux kontext is pretty powerful

earnest grotto Jun 26, 2025, 5:45 PM

#

are you asking because you want to try kontext?

somber trellis Jun 26, 2025, 5:46 PM

#

earnest grotto are you asking because you want to try kontext?

no i just downloaded it

#

im testing it out right now

wicked fulcrum Jun 26, 2025, 5:46 PM

#

@rustic sonnet As Vik said, either follow his instructions to install ComfyUI or install AI Playground. MTL-H should work with most models. But the 4 gig memory chunk is part of Alchemist. A series GPUs like Arc in MTL-H have this limit. SDXL, Flux, LTXV, WAN 2.1 should all be good

Note AI Playground provides you ComfyUI as a backend where it's launched by AI Playground. With AI Playground running you can launch ComfyUI in a browser by going to localhost:49000.

You can change the version of ComfyUI in the AI Playground backend manager. I believe by default it installs 3.3. You can update to the latest, which is 3.42. Doing this can break some AI Playground features, e.g. LTX-Video image-to-video requires you to add a field value to its workflow

Anytime you reinstall ComfyUI or AI Playground, it wipes out ComfyUI. So back up or move the ComfyUI models directory out before reinstalling through AI Playground
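If the browser shows nothing at that address, a small check like this can tell whether anything is answering on the port. Port 49000 is the one quoted above; the helper names are mine, and the check only says that something responded over HTTP, not that it is ComfyUI.

```python
import urllib.error
import urllib.request


def comfy_url(host="localhost", port=49000):
    """URL of the ComfyUI page started by AI Playground (port from the message)."""
    return f"http://{host}:{port}/"


def is_reachable(url, timeout=2.0):
    """True if something answers HTTP at url, False if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # a server answered, just not with a success status
    except OSError:
        return False  # URLError and timeouts are OSError subclasses


if __name__ == "__main__":
    print(comfy_url(), "reachable:", is_reachable(comfy_url()))
```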

earnest grotto Jun 26, 2025, 5:49 PM

#

somber trellis no i just downloaded it

I mean bionic, sorry

somber trellis Jun 26, 2025, 5:50 PM

#

bruh

#

#

Flux Kontext Dev + Flux Redux = This

#

#

The prompt

#

was literally just

#

3D Hyperrealism

somber trellis Jun 26, 2025, 6:12 PM

#

rustic sonnet Jun 26, 2025, 6:16 PM

#

earnest grotto I mean bionic, sorry

Yea that's my intention, well I won't be at home for a while.
So just asking for future reference when I get home

earnest grotto Jun 26, 2025, 6:18 PM

#

hmm
i expect a single image to take 5 minutes with that... or more?

#

teacache probably doesn't support it

#

yet

somber trellis Jun 26, 2025, 6:28 PM

#

earnest grotto teacache probably doesn't support it

teacache works with kontext

#

in fact it's already been quantized to q8

#

and works with most of the same nodes normal flux dev would

#

including redux

#

#

#

just like normal flux you can use dynamic thresholding

#

allowing you to use negative prompts with flux at the cost of speed

#

wicked fulcrum Jun 26, 2025, 6:52 PM

#

somber trellis allowing you to use negative prompts with flux at the cost of speed

Have you tried image to image editing using text prompts. "Remove sunglasses" etc

somber trellis Jun 26, 2025, 6:58 PM

#

wicked fulcrum Have you tried image to image editing using text prompts. "Remove sunglasses" e...

#

only at 30% completion

rustic sonnet Jun 26, 2025, 6:58 PM

#

earnest grotto hmm i expect a single image to take 5 minutes with that... or more?

Not a problem

somber trellis Jun 26, 2025, 7:03 PM

#

wicked fulcrum Have you tried image to image editing using text prompts. "Remove sunglasses" e...

#

Definitely works

#

seems whatever transition node i used didnt line em up proper

somber trellis Jun 26, 2025, 7:28 PM

#

oh i shoulda mentioned i did it in cfg 2 instead of 2.5

#

its why it looks more grainy and faded

earnest grotto Jun 26, 2025, 7:32 PM

#

somber trellis seems whatever transition node i used didnt line em up proper

are you sure the input's resolution is divisible by 8?

somber trellis Jun 26, 2025, 7:39 PM

#

earnest grotto are you sure the input's resolution is divisible by 8?

The input resolution is not always divisible by 8 since im using a 2x upscaler on multiple different aspect ratios, then it gets downsampled to 1MP resolution

#

sometimes it gives weird odd-integered resolutions

earnest grotto Jun 26, 2025, 7:39 PM

#

that's probably why your images shift a bit

somber trellis Jun 26, 2025, 7:39 PM

#

I probably should use the upscale to closest SDXL resolution node

#

that would instantly fix the problem

#

earnest grotto Jun 26, 2025, 7:40 PM

#

comfy will automatically pad if it's not perfectly divisible by 8
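To check a resolution before it goes in, a sketch like this shows how far a size is from a multiple of 8 and what the nearest aligned size would be. The helper names are made up, and this is not ComfyUI's own padding code.

```python
def snap_to_multiple(value, multiple=8):
    """Round to the nearest multiple, never below one multiple."""
    return max(multiple, int(round(value / multiple)) * multiple)


def padding_needed(value, multiple=8):
    """Pixels to add to reach the next multiple (0 if already aligned)."""
    return (-value) % multiple


for side in (1215, 832, 833):
    print(side, "->", snap_to_multiple(side), "pad", padding_needed(side))
```

Resizing to the snapped size before sampling avoids the padding entirely, which is the effect of the "closest SDXL resolution" node mentioned in this exchange.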

somber trellis Jun 26, 2025, 7:40 PM

#

i bet they couldve made a much more close to artwork gordan if they just put more work into his model

earnest grotto Jun 26, 2025, 7:41 PM

#

this can sometimes result in grey lines with sdxl. with flux, i dunno what the vae decodes that to but probably also greyish or pinkish

somber trellis Jun 26, 2025, 7:41 PM

#

https://github.com/jamesWalker55/comfyui-various

#

this repo has a resize node which auto upscales/downsamples images to the closest sdxl resolution

#

also these nodes are good for memory control

reef ivy Jun 26, 2025, 7:44 PM

#

somber trellis allowing you to use negative prompts with flux at the cost of speed

How much speed loss?

somber trellis Jun 26, 2025, 7:44 PM

#

reef ivy How much speed loss?

40-50%

#

its quite a bit slower

#

but you get far more control

#

reef ivy Jun 26, 2025, 7:45 PM

#

Oof, a little too much for me. I guess for editing it would be worth it though

somber trellis Jun 26, 2025, 7:45 PM

#

#

simply allows you to use cfgs higher than 1

#

and for normal cfg models it can allow you to go above normal cfg limits

#

effectively operating as an anti-burn node

reef ivy Jun 26, 2025, 7:46 PM

#

Does the hyper loras work? For lower steps. Or forget the name

somber trellis Jun 26, 2025, 7:46 PM

#

On Flux Kontext? No clue.

reef ivy Jun 26, 2025, 7:46 PM

#

With flux i get images pretty quick with minimum quality loss with those turbo loras etc.

somber trellis Jun 26, 2025, 7:47 PM

#

Which turbo lora though

#

The Flux Alpha Turbo Lora?

#

Or Hyper Flux?

reef ivy Jun 26, 2025, 7:47 PM

#

I honestly forget, I tried both and one was better than the other. Will have to check later

#

Been a minute since i used it

somber trellis Jun 26, 2025, 7:50 PM

#

#

much better

tough wharf Jun 26, 2025, 8:31 PM

#

Is it just me? After updating the driver on my A770 16g from 6734 to 6881, all T2Image generations turned to shit,
HOWEVER video generation has sped up so much, like:
6881: around 8s/it - 10s/it
6734: around 80s/it - 200s/it
(note: I'm using the new Self Forcing model from Kijai)
Wth happened (I'm not complaining tho)

earnest grotto Jun 26, 2025, 8:32 PM

#

you were running out of vram

#

you are not running out of vram now

#

keep track of your vram usage, use smaller models if you're running out

#

most programs use vram. they use a little, but if you have a lot of chromium tabs open those can start to add up to ~1gb

#

discord also uses something like 400mb I think?

tough wharf Jun 26, 2025, 8:36 PM

#

earnest grotto keep track of your vram usage, use smaller models if you're running out

wait, was the message for me?
But yea, I'm not running out of VRAM, it seems the model fits everything on the GPU (12g-14g of 16g) when running the wf
note: I previously used the Self Forcing model on driver 6734

earnest grotto Jun 26, 2025, 8:36 PM

#

yes

#

you are not running out now

#

you probably were before

tough wharf Jun 26, 2025, 8:37 PM

#

Very nice, Kudos to Intel for this update 😭 🙏

earnest grotto Jun 26, 2025, 8:41 PM

#

show what's up with the images

tough wharf Jun 26, 2025, 8:57 PM

#

tough wharf Is it just me, After Driver update of my A770 16g from 6734 to 6881 all T2Image ...

I think it's because I'm running out of VRAM on T2I
The artifacting was my mistake; changing the sampler from Euler_A_Cfg++ to Euler A seems to have fixed the issue,
but the significant slowdown in T2I generation is very visible
before updating my driver (6734), T2I was pretty fast on this workflow, like 140-180s per image.
All images below are with the 6881 driver

earnest grotto Jun 26, 2025, 8:59 PM

#

Once the shared goes above 0, yea you're running out

tough wharf Jun 26, 2025, 8:59 PM

#

This is now with Euler A sampler, it fixed the artifacting issue

#

I havent changed the settings on this workflow apart from the sampler

earnest grotto Jun 26, 2025, 9:00 PM

#

use the tiled ksampler custom node, having upscaled with some esrgan model before it, 4x-ultrasharp2 is recent and good

tough wharf Jun 26, 2025, 9:01 PM

#

This one?

earnest grotto Jun 26, 2025, 9:01 PM

#

either of them

#

whichever one

tough wharf Jun 26, 2025, 9:01 PM

#

Ill make a sample workflow

earnest grotto Jun 26, 2025, 9:02 PM

#

your image -> upscale image using model (the model being 4x-ultrasharp2) -> tiled vae encode -> tiled ksampler

#

wonder if kontext also benefits from flan t5

tough wharf Jun 26, 2025, 9:13 PM

#

I feel like my gpu is not being fully utilized?

earnest grotto Jun 26, 2025, 9:14 PM

#

Compute dropdown

#

What ARE these goggles man

earnest grotto Jun 26, 2025, 9:16 PM

#

earnest grotto Did i post that i was trying to get instantcharacter to work here? i guess i did...

instantchar for comparison, the goggles actually look like goggles. But i'm gonna hope it's just an unlucky seed or something

tough wharf Jun 26, 2025, 9:20 PM

#

earnest grotto Compute dropdown

do you mean manually setting the model compute dropdown?

earnest grotto Jun 26, 2025, 9:21 PM

#

sorry, i was a bit vague

earnest grotto Jun 26, 2025, 9:21 PM

#

tough wharf do you mean manually setting the model compute dropdown?

these

tough wharf Jun 26, 2025, 9:21 PM

#

earnest grotto What ARE these goggles man

I can see you also use A770, how fast can you generate images?

earnest grotto Jun 26, 2025, 9:21 PM

#

depends on many things

#

i don't want to oom right now so i've opted for q5 which slows it down

#

actually what model even

tough wharf Jun 26, 2025, 9:22 PM

#

Oh it seems to be utilizing it then

earnest grotto Jun 26, 2025, 9:22 PM

#

i'm doing flux kontext rn

tough wharf Jun 26, 2025, 9:25 PM

#

LMAO

tough wharf Jun 26, 2025, 9:26 PM

#

earnest grotto i'm doing flux kontext rn

I heard Flux is mainly used for realistic images
Im not particularly interested with realistic image generations
so Im sticking to sdxl or illustrious

#

It took 15mins ICANT

earnest grotto Jun 26, 2025, 9:26 PM

#

tough wharf LMAO

You essentially just made a brand new image at an oversized resolution, and that's what happens when you do that (with sdxl/1.5)
Lower denoise (and steps since you won't need as many), e.g. to 0.5, your steps are already kinda low though. generally 28 is good, so i'd go for 14
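The step suggestion is just the usual step count scaled by the denoise value. As a rule-of-thumb sketch (this mirrors the reasoning in the message, not anything the sampler computes internally):

```python
def scaled_steps(full_steps, denoise):
    """Steps to use for a partial denoise, at least 1."""
    return max(1, round(full_steps * denoise))


print(scaled_steps(28, 0.5))  # 14
```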

#

different tiling strategies are faster or slower but have drawbacks

#

random is slowest but generally looks the best

#

you can also bump up the tile sizes to 1024x1024 since sdxl

#

ah, i didn't see there's nothing before it

#

you need to generate some image, use the tiled ksampler to upscale it, not to generate it outright
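To illustrate why tiling keeps VRAM down on a large upscaled image: only one tile-sized region is sampled at a time. A minimal sketch of splitting an image into overlapping tiles follows; the 1024 tile size comes from the message above, while the 64-pixel overlap and the simple grid layout are my assumptions and not the custom node's actual strategy.

```python
def tile_boxes(width, height, tile=1024, overlap=64):
    """Return (left, top, right, bottom) boxes covering the whole image.

    Neighbouring tiles share `overlap` pixels so seams can be blended.
    """
    stride = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            boxes.append(
                (left, top, min(left + tile, width), min(top + tile, height))
            )
    return boxes


boxes = tile_boxes(2048, 2048)
print(len(boxes), boxes[0], boxes[-1])
```

Each box is at most 1024x1024, so the sampler's memory use is tied to the tile size and not to the full upscaled resolution.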

earnest grotto Jun 26, 2025, 9:29 PM

#

earnest grotto What ARE these goggles man

flan t5

tough wharf Jun 26, 2025, 9:30 PM

#

earnest grotto you need to generate some image, use the tiled ksampler to upscale it, not to ge...

Ah, so thats why
so i assume the Tiled ksampler is best used for upscaling no?

earnest grotto Jun 26, 2025, 9:31 PM

#

yes

tough wharf Jun 26, 2025, 9:33 PM

#

But still the Images generation seems to really have been hit after the update to 6881 driver
This is with a workflow I used to test for a quick image before I updated my driver, usually it takes below 1 min to finish the execution

#

this took 200s

earnest grotto Jun 26, 2025, 9:40 PM

#

tough wharf I heard Flux is mainly used for realistic images Im not particularly interested ...

Ideally, a good model will be able to do any task, and won't be "realistic" or "anime". Sadly we're not there yet. Any base model will skew towards being able to only do realistic stuff competently. With loras you can kinda get an actually decent style but it still lacks a lot of other knowledge you'd be getting from an anime finetune.
Inpainting models however are a massive step up in understanding, due to the context of the rest of the image being inpainted. Flux fill can inpaint pretty well.
And evidently kontext is also not too bad
Base SDXL is worse than flux.
Expect a Lumina 2 finetune by onoma (illustrious) soon™ and sd3.5 large by cagliostro (animagine) "q1 to q2 this year" (april to september)

earnest grotto Jun 26, 2025, 9:41 PM

#

tough wharf I can see you also use A770, how fast can you generate images?

1.95it/s just making a 1216x832 image with a lora with animagine 4.0 opt, euler ancestral

tough wharf Jun 26, 2025, 9:43 PM

#

earnest grotto your image -> upscale image using model (the model being 4x-ultrasharp2) -> tile...

the Tiled Method worked (Encode > Tiled Ksampler > Decode >)
The Vram stayed around 8gb to 10gb while on Tiled Ksampler

tough wharf Jun 26, 2025, 9:44 PM

#

earnest grotto Ideally, a good model will be able to do any task, and won't be "realistic" or "...

Ive been seeing the buzz with kontext lately, How good and fast is it in your A770?

earnest grotto Jun 26, 2025, 9:45 PM

#

many ways to get it faster or slower

#

i am using a q5 quant because i don't want to oom. that makes it slower than q8 or fp8

#

you can just do it without ooming but doing larger images+cfg starts eating vram

#

flux can do larger or smaller images without breaking, unlike SDXL, but only up to a certain point. since then I've seen at least 1 paper with an even better method for getting models to generalize to other resolutions even better than rotary embeddings so that can be something you can expect from future models

#

I'm not using teacache, yet, since i just wanted to see if it works

#

i get 10s/it

tough wharf Jun 26, 2025, 9:50 PM

#

earnest grotto i get 10s/it

Thats fast
With Teacache will it be faster I wonder?

earnest grotto Jun 26, 2025, 9:50 PM

#

usually teacache is a ~2-3x speedup

tough wharf Jun 26, 2025, 9:53 PM

#

I tried a workflow back then using teacache with wan2.1-14b, q5 k_m. It took around an hour I think, then it crashed because of OOM

#

I think I will settle for self forcing models for now, since they're fast
but the downside is it's 1.3b, so there aren't many loras I can play with

#

I hope they release a wan14b self forcing soon

earnest grotto Jun 26, 2025, 10:25 PM

#

5.45s/it with q8

#

should try q4

#

with q8 and teacache, 2.6s/it

#

but on the brink of ooming

earnest grotto Jun 26, 2025, 10:54 PM

#

earnest grotto should try q4

7.1s/it
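Putting the Kontext timings from these messages side by side (q5 10 s/it, q8 5.45 s/it, q8 with teacache 2.6 s/it, q4 7.1 s/it), the teacache gain on q8 works out to roughly 2.1x, at the low end of the "~2-3x" estimate. A small sketch of that arithmetic:

```python
# Seconds per iteration as reported in the messages above.
measurements = {"q5": 10.0, "q8": 5.45, "q8 + teacache": 2.6, "q4": 7.1}


def speedup(baseline_s_per_it, new_s_per_it):
    """How many times faster the new setting is than the baseline."""
    return baseline_s_per_it / new_s_per_it


print(round(speedup(measurements["q8"], measurements["q8 + teacache"]), 2))
print(round(speedup(measurements["q5"], measurements["q8"]), 2))
```

Note that q4 and q5 are both slower than q8 here, which fits the earlier remark that lower GGUF quants tend to be slower on these models.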

earnest grotto Jun 26, 2025, 11:54 PM

#

On more testing, q4 is a bit too fried

reef ivy Jun 26, 2025, 11:56 PM

#

tough wharf Very nice, Kudos to Intel for this update 😭 🙏

Try --reserve-vram 7

#

Seems arc is slower when the model is not swapping to system ram, especially with gguf models

#

Or try kijais nodes with my little code edit and use block swap

#

#1193952640225267802 message

#

Also can't use gguf with kijai but 16gb should be enough for fp8 models

tough wharf Jun 27, 2025, 12:25 AM

#

reef ivy Try --reserve-vram 7

Hi, I have no problems with video generation

#

Like I said, the update to 6881 seems to have fixed the issue of video generation slowing down or simply going OOM

tough wharf Jun 27, 2025, 12:28 AM

#

tough wharf this took 200s

what I'm having a problem with now is image generation; if you scroll back a little, I mentioned that the upscale nodes have gotten slow (5s/it)
Idk how to explain it any further than that
It's just like that with my I2I or T2I workflows since the update to the 6881 drivers

#

tho, Im not complaining since I want to generate videos too with my A770 😅

#

idk If I can share the videos I generated here, its a bit nsfw

tough wharf Jun 27, 2025, 12:32 AM

#

reef ivy Try --reserve-vram 7

Ill try this later bro

reef ivy Jun 27, 2025, 12:58 AM

#

Yeah, not the place for nsfw stuff. I will check out comfy again soon and see what has changed.

earnest grotto Jun 27, 2025, 1:06 AM

#

Seems CFG is needed to make anime goggles

somber trellis Jun 27, 2025, 3:03 AM

#

earnest grotto What ARE these goggles man

transparent versions of egads glasses

#

i hope you know who egad is

somber trellis Jun 27, 2025, 3:04 AM

#

earnest grotto On more testing, q4 is a bit too fried

anything but q8 isnt worth it (for visual models)

#

LLMs are generally fine at 6_K, any lower and they start to get that lobotomized feel

#

I don't know if 6_K is any good on visual models

#

I think it's best if we just wait for a low-perplexity, low-bit quantization method such as BitNet b1.58, but without the need for a full architecture re-train

#

that will revolutionize our capabilities

#

low-bit quants like 1.58-bit will allow us to run much larger model sizes, which in turn lowers perplexity as parameter count increases

#

tough wharf Jun 27, 2025, 3:21 AM

#

reef ivy Try --reserve-vram 7

Just tried this, it seems to have fixed the issue of USDU being slow, wth
I also tried it with V2V self forcing (I haven't noticed anything different with the --reserve-vram parameter)

#

Thank you Vik, Aaron
Im back on the game now

somber trellis Jun 27, 2025, 4:14 AM

#

tough wharf Thank you Vik, Aaron Im back on the game now

earnest grotto Jun 27, 2025, 12:33 PM

#

somber trellis

Yellow filter :(

queen remnant Jun 27, 2025, 6:30 PM

#

any time i try to update comfyui either from comfyui manager or with the script it does this and never updates

#

i haven't made any manual changes myself and i've actually had to delete and reinstall it to get it to update in the past

earnest grotto Jun 27, 2025, 6:45 PM

#

type in `git stash` and press enter

#

In a command prompt in comfyui's folder

#

I'll see why the script struggles to do that later

valid escarp Jun 27, 2025, 6:49 PM

#

any good youtube vids or courses to get started learning comfyui?

queen remnant Jun 27, 2025, 6:49 PM

#

i just deleted the file it was whining about and that seemed to get it to update but im guessing there are some changes made to the file in the comfy intel install script

earnest grotto Jun 27, 2025, 6:52 PM

#

queen remnant i just deleted the file it was whining about and that seemed to get it to update...

yes.

earnest grotto Jun 27, 2025, 6:52 PM

#

valid escarp any good youtube vids or courses to get started learning comfyui?

https://docs.comfy.org/get_started/first_generation

ComfyUI

Getting Started with AI Image Generation - ComfyUI

This tutorial will guide you through your first image generation with ComfyUI, covering basic interface operations like workflow loading, model installation, and image generation

#

I don't like most youtube videos on this topic.

valid escarp Jun 27, 2025, 6:56 PM

#

thank you

earnest grotto Jun 27, 2025, 7:21 PM

#

Probably worth noting here that these are a thing
https://huggingface.co/fal/Wojak-Kontext-Dev-LoRA
https://huggingface.co/fal/Plushie-Kontext-Dev-LoRA
https://huggingface.co/fal/Broccoli-Hair-Kontext-Dev-LoRA

#

#

Also normal flux dev loras might also work

earnest grotto Jun 27, 2025, 7:48 PM

#

wonder why bfl didn't train it with flan

earnest grotto Jun 28, 2025, 2:51 AM

#

queen remnant any time i try to update comfyui either from comfyui manager or with the script ...

I can't reproduce whatever issue with updating comfy you say my script had

queen remnant Jun 28, 2025, 2:56 AM

#

strange

#

i could try re-installing git to see if that fixes it

#

idr when i installed it

earnest grotto Jun 28, 2025, 2:58 AM

#

what does the script say

#

how do you know it doesn't work

earnest grotto Jun 28, 2025, 3:01 AM

#

somber trellis I don't know if 6_K is any good on visual models

Down to Q4 has been fine for me with regular flux. aaron did some tests too. q3 starts to have noticeable artifacts and q2 is completely broken. Visual artifacts from too much quantization seem to usually manifest as blurriness, broken edges and lack of texture
The 2 images I posted were with Q5

reef ivy Jun 28, 2025, 3:10 AM

#

With all my testing so far (LLMs, video and image models), q4 is the cutoff for good quality. It has slight degradation, usually not noticeable without a comparison, but once you go down to q3 it gets burnt. However, the Fusion Wan model may have decent quality with q3 but I have not tested it myself.

earnest grotto Jun 28, 2025, 3:59 AM

#

for flux kontext specifically i was having some issues with q4

#

i also haven't tried the 2 sd 3.5s and their quants

earnest grotto Jun 28, 2025, 4:37 AM

#

earnest grotto 5.45s/it with q8

On linux, I could run Q8 kontext without ooming with 5.45s/it

#

Now on windows, Q8 kontext again. Without reserve-vram, it spills out ~1.5gb into shared memory and ends up with ~19s/it

#

with --reserve-vram 15, I'm at 4.7gb of vram and 7.45s/it

#

According to task manager. this seems pretty wrong but oh well

#

--reserve-vram 7, 11.4 gb used and 6.46s/it

#

--reserve-vram 3, 15.6gb used and approaching those linux speeds at 5.6s/it

#

also stumbled on a lucky seed where the goggles finally look like goggles without needing cfg

#

@reef ivy Do you wanna test a few different --reserve-vram values and note the flux quant you're using and the speed and vram usage you get? (+how much shared memory is used if it is)

#

seems to me more like it helped you not run out of vram, and with that, speed improved

#

I suspect reserve-vram 11 will give me about 8gb usage, let's find out

#

8.4, about there. 19-x it is
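
The "19-x" rule of thumb above can be checked against the numbers posted in this thread. This is a minimal Python sketch, assuming the Task Manager readings quoted above (Windows, Q8 kontext). The constant 19 is only the poster's empirical guess for this one setup, not something ComfyUI defines, and `estimate_vram_gb` is a hypothetical helper name.

```python
# Measurements quoted above (Windows, Q8 Kontext): reserve value -> (GB used, s/it).
# The s/it for --reserve-vram 11 was not reported, so it is left as None.
MEASURED = {
    15: (4.7, 7.45),
    11: (8.4, None),
    7: (11.4, 6.46),
    3: (15.6, 5.6),
}

BASELINE_GB = 19  # empirical constant from the thread, specific to this setup


def estimate_vram_gb(reserve_gb: float, baseline_gb: float = BASELINE_GB) -> float:
    """Rule-of-thumb VRAM usage: baseline minus the reserved amount."""
    return baseline_gb - reserve_gb


for reserve, (used, s_per_it) in sorted(MEASURED.items()):
    est = estimate_vram_gb(reserve)
    speed = f"{s_per_it} s/it" if s_per_it is not None else "speed n/a"
    print(f"--reserve-vram {reserve:>2}: estimated {est:4.1f} GB, "
          f"measured {used:4.1f} GB (off by {used - est:+.1f}), {speed}")
```

On these four data points the estimate stays within about 1 GB of the measured value. Other models, quants or cards would likely need a different baseline.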

reef ivy Jun 28, 2025, 6:10 AM

#

I may take some time and test it but probably not too soon. I will say latest drivers and pytorch have changed things a lot, i guess they are working on how memory is managed. Honestly if we could get a native block swap it would be great for intel users

somber trellis Jun 28, 2025, 10:59 AM

#

earnest grotto Probably worth noting here that these are a thing https://huggingface.co/fal/Woj...

#

somber trellis Jun 28, 2025, 11:48 AM

#

#

earnest grotto Jun 28, 2025, 1:47 PM

#

Updated to latest driver, not seeing any change in either vram usage or speed

somber trellis Jun 28, 2025, 3:27 PM

#

earnest grotto Updated to latest driver, not seeing any change in either vram usage or speed

identical to me as well in terms of inference

#

I actually found a repo that implements some level of bitsandbytes support on top of XPU

#

https://github.com/bitsandbytes-foundation/bitsandbytes-intel

#

but it does seem to still work on the current nightlies

#

tested it via the repo

#

i dont know if it will be beneficial however

#

I looked further into the bitsandbytes issue

#

and theyre working on a multi-support backend (pytorch custom operator integration)

#

https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1545

#

minimal requirements are promising

#

bnb 4-bit quantization and dequant as the minimum requirements for this

#

https://github.com/bitsandbytes-foundation/bitsandbytes/pull/1544

#

nvm i found the multi backend refactor

#

this might actually have full-on support as an experimental backend

#

i like how i bring this up now

#

and i search this discord channel and find 10 other people mentioning the same link

#

I got it to build from source just fine

#

torch 2.8.0 nightly

#

built bitsandbytes-0.47.0.dev0

#

oh hey i dont get the cuda121.dll error anymore

rose fern Jun 28, 2025, 3:45 PM

#

reef ivy Jun 28, 2025, 3:54 PM

#

somber trellis *oh hey i dont get the cuda121.dll error anymore*

yeah, i think I posted this a while back. It removes the error but last i checked still no 4bit support for intel.

rose fern Jun 28, 2025, 4:22 PM

#

earnest grotto Jun 28, 2025, 7:40 PM

#

Hmm... Having mixed results anime-ifying images with kontext

#

earnest grotto Jun 28, 2025, 8:09 PM

#

It loves to remove large HUD elements sometimes 🤔

somber trellis Jun 29, 2025, 5:11 PM

#

earnest grotto It loves to remove large HUD elements sometimes 🤔

That's why I think using Redux with it makes sense.

#

Since you're getting additional context with it via a clip vision model

#

#

With a node like this, you can control the strength of it too to prevent it from overwriting everything else.

#

#

kratos with the wojak lora worked pretty well

#

i mean it works pretty well in general (thats me)

earnest grotto Jun 29, 2025, 5:15 PM

#

a bit too detailed imo

somber trellis Jun 29, 2025, 5:15 PM

#

earnest grotto a bit too detailed imo

there are both super-detailed wojaks

earnest grotto Jun 29, 2025, 5:15 PM

#

and not enough wacky faces

somber trellis Jun 29, 2025, 5:15 PM

#

and super simple wojaks

#

#

i mean i can go full DURR

earnest grotto Jun 29, 2025, 5:16 PM

#

i know but, it's detailed in the wrong way

#

doesn't feel right

somber trellis Jun 29, 2025, 5:16 PM

#

#

scarface

#

#

I think it did great on walter

#

but it kept his coat and hat the same

queen remnant Jun 29, 2025, 5:16 PM

#

queen remnant any time i try to update comfyui either from comfyui manager or with the script ...

interesting i got an OOM error and saw it was fixed in the next release, but this time trying to update through comfyui-manager gives me a completely different error:

...
RuntimeError: Native API failed. Native API returns: 38 (UR_RESULT_ERROR_OUT_OF_HOST_MEMORY)

Prompt executed in 00:13:14
[ComfyUI-Manager] Failed to checkout 'master' branch.
repo_path=F:\AI-NVMe\Comfy_Intel\ComfyUI
Available branches:
        master
ComfyUI update failed

[ComfyUI-Manager] Queued works are completed.
{'update-comfyui': 1}

After restarting ComfyUI, please refresh the browser.

~~sorry for interrupting lol~~

📎 log.txt

earnest grotto Jun 29, 2025, 5:17 PM

#

queen remnant interesting i got an OOM error and saw it was fixed in the next release, but thi...

You ran out of memory / windows driver is buggy

somber trellis Jun 29, 2025, 5:17 PM

#

gguf ops.py

#

RuntimeError: Native API failed. Native API returns: 38 (UR_RESULT_ERROR_OUT_OF_HOST_MEMORY)

earnest grotto Jun 29, 2025, 5:17 PM

#

If you were not looking at your vram, then you ran out

queen remnant Jun 29, 2025, 5:17 PM

#

yea

earnest grotto Jun 29, 2025, 5:18 PM

#

Don't run out

somber trellis Jun 29, 2025, 5:18 PM

#

im still running the 2.8 nightlies

#

because it's still faster

#

than 2.7.10 ipex

#

anything under --reserve-vram 8 for me causes problems

#

for really big loads

earnest grotto Jun 29, 2025, 5:19 PM

#

Use --reserve-vram x to use less vram, x being some number. my estimate, 19-x actual vram usage with a simple kontext workflow, down to about 4.7gb of vram at 15. obviously, slower, but not by much. stick with q8

queen remnant Jun 29, 2025, 5:19 PM

#

hence why i was trying to update from 0.3.42 to 0.3.43, i was trying to get kontext to work and it ate huge chunks of memory every iteration until it eventually ran out

somber trellis Jun 29, 2025, 5:19 PM

#

earnest grotto Use --reserve-vram x to use less vram, x being some number. my estimate, 19-x ac...

15, though slower

#

is the most stable out of all of them for anything you wanna nitpick with

earnest grotto Jun 29, 2025, 5:19 PM

#

kills sdxl performance

somber trellis Jun 29, 2025, 5:19 PM

#

and the performance loss is around 40% compared to --reserve-vram 6.0

somber trellis Jun 29, 2025, 5:20 PM

#

earnest grotto kills sdxl performance

well if you can load the model onto vram fully then yeah reserve vram is gonna tank inference speed if it can no longer fit the model

#

for some reason as well btw

#

2.7.10 really doesnt like big workloads

#

even with --reserve-vram 8 it ends up stuttering my PC at high vram usage

#

idk about 2.3.1 since i havent touched it in forever

queen remnant Jun 29, 2025, 5:23 PM

#

oof

#

this was on top of all of my vram being used on my a770 😭

somber trellis Jun 29, 2025, 5:24 PM

#

queen remnant this was on top of all of my vram being used on my a770 😭

what model is it that youre trying to run

#

kontext dev?

queen remnant Jun 29, 2025, 5:24 PM

#

queen remnant hence why i was trying to update from 0.3.42 to 0.3.43, i was trying to get kont...

kontext

earnest grotto Jun 29, 2025, 5:25 PM

#

queen remnant oof

Show the workflow

#

e.g. you probably don't need the t5 at fp16/bf16. or can use a q8 gguf of it

somber trellis Jun 29, 2025, 5:26 PM

#

i use the q8 t5

#

i also use the text-enhanced clip_L

#

https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14

queen remnant Jun 29, 2025, 5:27 PM

#

just an example workflow with the model loader swapped out with a unet loader for a q4_k_m gguf

earnest grotto Jun 29, 2025, 5:29 PM

#

welp... close some browser tabs i guess

#

there are smaller quants of the t5

#

imo q4 kontext gets a bit fried

queen remnant Jun 29, 2025, 5:29 PM

#

im certain its an issue with comfyui itself, i was just having trouble updating is all

#

ran `git pull origin master` and that seemed to do something so i'll update here if its fixed after running the script again

earnest grotto Jun 29, 2025, 5:32 PM

#

running out of ram is a buy more sticks, close other programs or use smaller models issue

somber trellis Jun 29, 2025, 5:34 PM

#

earnest grotto It loves to remove large HUD elements sometimes 🤔

earnest grotto Jun 29, 2025, 5:35 PM

#

I don't actually mind that it removed the binos

#

I do mind that it looks fried

#

Some different prompting (russian soldiers in a desert, blablabla) got it to be a bit less fried

#

am testing colorization rn. i think they didn't train it to colorize with colored dots which is pretty sad

#

mmm... driver crashed

somber trellis Jun 29, 2025, 5:42 PM

#

#

#

💥

earnest grotto Jun 29, 2025, 5:44 PM

#

If the gif compressed it too hard, this is the original

queen remnant Jun 29, 2025, 5:50 PM

#

queen remnant ran `git pull origin master` and that seemed to do something so i'll update here...

can confirm this completely fixed the issue 👍
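
For anyone hitting the same update failure, the two commands mentioned in this thread (`git stash` earlier, `git pull origin master` here) can be wrapped in a small Python sketch. The function name `update_comfyui` and the `ComfyUI` path are hypothetical, and the default is a dry run that only lists the commands rather than touching the repo.

```python
import subprocess
from pathlib import Path
from typing import List

# The two commands mentioned in the thread, in the order they would be run.
UPDATE_COMMANDS: List[List[str]] = [
    ["git", "stash"],                     # set aside local changes that block the update
    ["git", "pull", "origin", "master"],  # fetch and merge the latest ComfyUI
]


def update_comfyui(repo_dir: Path, dry_run: bool = True) -> List[str]:
    """Run (or, with dry_run, only list) the update commands inside repo_dir."""
    ran = []
    for cmd in UPDATE_COMMANDS:
        ran.append(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if git reports a failure.
            subprocess.run(cmd, cwd=repo_dir, check=True)
    return ran


if __name__ == "__main__":
    # Hypothetical path; point this at your own ComfyUI folder.
    print(update_comfyui(Path("ComfyUI"), dry_run=True))
```

Note that `git stash` only sets local changes aside; `git stash pop` would bring them back if they turn out to be needed.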

#

hes so handsome

earnest grotto Jun 29, 2025, 6:05 PM

#

Various prompts and these dots, vs colors in prompt

earnest grotto Jun 29, 2025, 6:22 PM

#

really getting the feeling that this needs more training, loras.

somber trellis Jun 29, 2025, 11:00 PM

#

ComfyUI-euler-2.0-30-2025-06-29_20-28-38-0001.webp

earnest grotto Jun 29, 2025, 11:15 PM

#

are you making the images yellow on purpose or is this just an ironic moment for bfl

#

90% sure kontext's license should also have that little clause about not training on outputs...

somber trellis Jun 29, 2025, 11:17 PM

#

earnest grotto are you making the images yellow on purpose or is this just an ironic moment for...

#

I don't see the yellowing you speak of.

earnest grotto Jun 29, 2025, 11:18 PM

#

somber trellis

lots of the images you've posted are very yellow

earnest grotto Jun 29, 2025, 11:18 PM

#

somber trellis

yellow lighting

earnest grotto Jun 29, 2025, 11:19 PM

#

somber trellis

this one a lot

earnest grotto Jun 29, 2025, 11:19 PM

#

somber trellis

brown, but that's dark yellow, really

somber trellis Jun 29, 2025, 11:19 PM

#

earnest grotto this one a lot

the original image is in a yellow room lmao

earnest grotto Jun 29, 2025, 11:19 PM

#

i've played skyrim, i know, but it's yellower

somber trellis Jun 29, 2025, 11:20 PM

#

well that image had no LUTs or filmgrain applied to it

earnest grotto Jun 29, 2025, 11:20 PM

#

they're all slightly yellow or brown

somber trellis Jun 29, 2025, 11:20 PM

#

it was a direct kontext dev output

earnest grotto Jun 29, 2025, 11:20 PM

#

i'm not saying you did anything, i'm implying bfl trained on chatgpt outputs

somber trellis Jun 29, 2025, 11:20 PM

#

#

prompt biases

earnest grotto Jun 29, 2025, 11:22 PM

#

you could probably just make the negative be "make the image yellow" and I'd expect that to get rid of it

#

but it's a peculiar sight nonetheless

earnest grotto Jun 30, 2025, 9:37 AM

#

@fleet cape What do you mean by "optimized"

somber trellis Jun 30, 2025, 4:19 PM

#

im going back to linux again

#

bluescreen galore on windows as of recent

late glen Jun 30, 2025, 9:23 PM

#

What version of torch are you all using? I can't seem to get any version of flux kontext inc quants working for my a770. I've been using 2.7.1 + IPEX, is this not advised?

earnest grotto Jun 30, 2025, 9:33 PM

#

late glen What version of torch are you all using? I can't seem to get any version of flux...

What doesn't work and did you install using my script

late glen Jun 30, 2025, 10:26 PM

#

earnest grotto What doesn't work and did you install using my script

I'll give it a go. Haven't had a lot of issues with other models but this one crashes my system.

somber trellis Jun 30, 2025, 10:49 PM

#

earnest grotto What doesn't work and did you install using my script

Your script is windows-specific correct?

earnest grotto Jun 30, 2025, 10:50 PM

#

somber trellis Your script is windows-specific correct?

no

#

both windows and linux

somber trellis Jun 30, 2025, 10:50 PM

#

hmm

#

It errored out on endeavourOS (ArchLinux) for me

earnest grotto Jun 30, 2025, 10:50 PM

#

with what

somber trellis Jun 30, 2025, 10:51 PM

#

i assume it works with normal windows and debian

#

Like why the hell is that bot able to remove pastes here

#

in a support and questions channel

earnest grotto Jun 30, 2025, 10:56 PM

#

dm it

somber trellis Jun 30, 2025, 10:58 PM

#

nvm

#

missing prerequisites for the script lmao

#

works now

earnest grotto Jun 30, 2025, 11:02 PM

#

i disabled 2.3 for linux because it was broken for me and i didn't figure out the issue at the time. in case you're wondering why it's not there

somber trellis Jul 1, 2025, 1:06 AM

#

earnest grotto i disabled 2.3 for linux because it was broken for me and i didn't figure out th...

no matter im going back to windows again

#

even with Proton-GE, mordhau gets less than half the framerate it does on windows 11

#

and that is one of the games i play most

earnest grotto Jul 1, 2025, 1:07 AM

#

yeah that do be the linux gaming performance with more demanding games

#

with arc

somber trellis Jul 1, 2025, 1:07 AM

#

windows really do be the only choice for games of that caliber huh

#

i am sad

#

id post sad seal gif but tenor doesnt work in here so

#

baikal-seals-always-look-like-theyre-about-to-cry-or-they-v0-fpb2rgu815yd1.png

#

i do this every 6 months excited about trying linux again

#

only for something really stupid to break like openssl

#

that even pacman-static is like "hell nah i aint touchin this"

earnest grotto Jul 1, 2025, 1:13 AM

#

i am even sadder about kontext

#

undertrained, over-lobotomized, or idk what they did but i'm losing hope. an ad for their training service? an ad for pro?

#

it's like altering/removing text and watermarks is the only thing it can do well

#

i haven't tried virtual try-on/swapping clothes/etc., but given I saw there's a lora for that on civit, that gives me the feeling it's bad at that too

somber trellis Jul 1, 2025, 1:26 AM

#

ComfyUI-euler-2.0-8-2025-06-29_21-39-11-0001.webp

ComfyUI-euler-2.2-8-2025-06-29_22-20-04-0001.webp

#

Its not the greatest model no

#

its a flux that is worse at text to image but is better at image to image and not even by much

#

we already have models that outpace it closedsource

#

but ye 🤷‍♂️

earnest grotto Jul 1, 2025, 1:47 AM

#

somber trellis its a flux that is worse at text to image but is better at image to image and no...

It's that but also not merely that. They pretty clearly trained on 4o outputs a model that is competing with 4o, all while still having this nice little tidbit in their license: You may not use the Output to train, fine-tune or distill a model that is competitive with the FLUX.1 [dev] Model or the FLUX.1 Kontext [dev] Model
If the model is undertrained, them (fal, but still) having a lora trainer service ready on day one, not even training code, is even worse. We're supposed to train it ourselves to a decent state, and at that, who knows, maybe they'll get to make a better dataset from what the community finetunes it/makes loras with? For their whole spiel about NSFW in the license, I got an almost NSFW output because it's so bad at anime-ifying a furry rabbit person it made that into a regular skin person, not that I care about NSFW but evidently they do.

#

In the meantime, there was a new 3B model that popped up, Ovis-U1, claiming to be able to describe images, do text to image, and do edits
Tried their HF demo with images which IMO kontext anime-ified the best, and... I found out afterwards that it uses the sdxl vae. sad.

#

More yellow. People just can't stop training on 4o

#

I'm still hopeful that maybe there's something I'm missing with kontext but man...

#

I should also try cosxl again, though IIRC it was fairly bad

earnest grotto Jul 1, 2025, 2:40 AM

#

oh, i should try ovis with kontext's big fails like that hl2 playground screenshot

#

ah, and ostris seems to have local training for it

somber trellis Jul 1, 2025, 7:27 PM

#

changed up my kontext workflow a bit

#

changed versions of redux to the higher quality reflux redux model

#

#

initial load image upscale chain

#

dynamic thresholding and teacache for both higher quality and faster processing

#

disabled fluxkontextimagescale node

#

recommended by reddit because it was actually causing worse outputs

#

just gotta limit it to appropriate image sizes under a megapixel

#

ComfyUI-euler-2.0-20-2025-07-01_16-54-07-0001.png

#

#

also have two saving options, one that utilizes a LUT and filmgrain and one for clean output

#

ComfyUI-euler-2.0-20-2025-07-01_17-01-55-0001.png

#

also set cfg to 2, guidance to 2.

#

#

#

somber trellis Jul 1, 2025, 8:01 PM

#

earnest grotto Jul 1, 2025, 10:18 PM

#

Something feels uncanny about the lighting but this is definitely better than the overly yellow/brown results

earnest grotto Jul 1, 2025, 10:40 PM

#

@teal monolith @rustic sonnet Still alive?

reef ivy Jul 2, 2025, 12:52 AM

#

late glen What version of torch are you all using? I can't seem to get any version of flux...

Is your comfy up to date?

somber trellis Jul 2, 2025, 3:04 AM

#

rustic sonnet Jul 2, 2025, 3:09 AM

#

earnest grotto @teal monolith @rustic sonnet Still alive?

Well comfy doesn't even load the model(was getting OOM errors)
SDnext worked but also kept crashing because of OOM

earnest grotto Jul 2, 2025, 3:11 AM

#

rustic sonnet Well comfy doesn't even load the model(was getting OOM errors) SDnext worked but...

What version of the model did you download, or you don't know
In either case, get one of the Q5 GGUFs, Q4 might start to get fried but also might not, not sure if it's just kontext's own issues at this point
https://huggingface.co/bullerwins/FLUX.1-Kontext-dev-GGUF/tree/main

bullerwins/FLUX.1-Kontext-dev-GGUF at main

rustic sonnet Jul 2, 2025, 3:12 AM

#

I used Q4 GGUF

earnest grotto Jul 2, 2025, 3:13 AM

#

Hmm

rustic sonnet Jul 2, 2025, 3:13 AM

#

How much Vram does Q4 use for you?

earnest grotto Jul 2, 2025, 3:14 AM

#

I can make Q8 use 4.7. But I also have 48gb of regular RAM to spare

#

Right now, 34gb of RAM + 10.6 VRAM, with lots of chromium tabs, discord, Noita, buncha things open

#

I'll try Q4 in a bit

rustic sonnet Jul 2, 2025, 3:15 AM

#

earnest grotto I'll try Q4 in a bit

Tell me how much system ram and vram it takes

earnest grotto Jul 2, 2025, 3:27 AM

#

7gb ram with everything closed. 11.3gb vram and 16.6gb ram with q4_k_m

reef ivy Jul 2, 2025, 3:32 AM

#

rustic sonnet Well comfy doesn't even load the model(was getting OOM errors) SDnext worked but...

Are you using "--reserve-vram x", with x being the number of GB to reserve?

#

With regular flux, 8gb vram and 32gb ram I have been able to run it.

rustic sonnet Jul 2, 2025, 3:33 AM

#

Well I've got only 18GB Ram usable (including vram)

earnest grotto Jul 2, 2025, 3:34 AM

#

what happened to the other 14

reef ivy Jul 2, 2025, 3:34 AM

#

Yeah not enough probably. I think my system can reserve 24gb when adding ram to vram

rustic sonnet Jul 2, 2025, 3:40 AM

#

earnest grotto what happened to the other 14

System takes up the rest

#

Well I guess 19GB usable

#

When it hits 20GB the system starts lagging so much

#

And hitting swap memory more

reef ivy Jul 2, 2025, 4:06 AM

#

Did you set that manually? I get 24gb shared video memory with 32gb ram and 8gb vram

rustic sonnet Jul 2, 2025, 4:11 AM

#

I didn't set it manually

reef ivy Jul 2, 2025, 1:49 PM

#

Strange, what gpu do you have?

rustic sonnet Jul 2, 2025, 2:01 PM

#

I'm trying to run it on my MTL-H IGPU

civic charm Jul 2, 2025, 2:10 PM

#

Flux Dev with SDNQ UINT4 fully fits into A770's 16 GB VRAM on SDNext with offload mode = none / no offload / everything is on the GPU

#

Only issue is the VAE decode

#

reducing the vae tile size to 512 works

reef ivy Jul 2, 2025, 3:24 PM

#

rustic sonnet I'm trying to run it on my MTL-H IGPU

Oh okay, thats why. Adding ram is the best option, or trying even smaller quants but they burn too much below q4

earnest grotto Jul 2, 2025, 3:47 PM

#

civic charm Flux Dev with SDNQ UINT4 fully fits into A770's 16 GB VRAM on SDNext with offloa...

it's kontext, and an iGPU that shares 32gb ram with the rest of the system, but something's not quite right so it's closer to 20gb in practice (??? did microsoft do something?)
i could get <32gb of vram+ram but something's wrong here

#

my vram usage on windows has also been slightly higher. i could do q8 kontext on linux and not run out, barely 15gb, but same thing on windows and i run out

#

though i also noticed sdxl bumped up to ~2.3it/s 🤔

#

need a bit more testing, that was 832x1216

rustic sonnet Jul 2, 2025, 4:40 PM

#

earnest grotto it's kontext, and an iGPU that shares 32gb ram with the rest of the system, but ...

What i wanted to say is the OS and all the other apps are using 11-12GB, so comfy only gets the remaining 20GB

earnest grotto Jul 2, 2025, 4:44 PM

#

You should be able to get that down to at least 7-8gb, unless there's something you absolutely want open

#

it can get down to ~2-4gb afaik but that needs some debloating

#

and idk, maybe you do use cortana
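
The memory budget being discussed can be worked through with the figures posted in this thread. This is a rough Python sketch that assumes the Q4_K_M footprint measured on a discrete card (7gb RAM idle, then 11.3gb VRAM plus 16.6gb RAM) carries over unchanged to an iGPU that shares 32GB with the system, which may not hold in practice. The helper names `workload_gb` and `headroom_gb` are hypothetical.

```python
def workload_gb(vram_gb: float, ram_in_use_gb: float, ram_idle_gb: float) -> float:
    """Combined footprint: VRAM plus the RAM added on top of the idle baseline."""
    return vram_gb + (ram_in_use_gb - ram_idle_gb)


def headroom_gb(total_gb: float, other_usage_gb: float, needed_gb: float) -> float:
    """Memory left over on a shared-memory system (negative means it does not fit)."""
    return total_gb - other_usage_gb - needed_gb


# Q4_K_M Kontext numbers reported earlier in the thread on a discrete card.
needed = workload_gb(vram_gb=11.3, ram_in_use_gb=16.6, ram_idle_gb=7.0)
print(f"approx. combined footprint: {needed:.1f} GB")

# 32 GB shared between system and iGPU, with different amounts used by OS/apps.
for other in (12.0, 8.0, 4.0):
    spare = headroom_gb(total_gb=32.0, other_usage_gb=other, needed_gb=needed)
    verdict = "fits" if spare >= 0 else "does not fit"
    print(f"OS/apps at {other:4.1f} GB -> headroom {spare:+5.1f} GB ({verdict})")
```

Under these assumptions the workload needs roughly 21 GB, so it only fits once the OS and other apps are trimmed well below the 11-12GB mentioned above.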

earnest grotto Jul 2, 2025, 5:24 PM

#

earnest grotto You should be able to get that down to at least 7-8gb, unless there's something ...

discord uses a fair bit of ram
vscode can use up a lot of ram, especially if you have many things open
krita doesn't use much ram
dunno what browser you use, i don't think chrome was good at keeping tabs around. I use vivaldi and it can unload tabs without closing them, or remember them when it's opened without loading them all

reef ivy Jul 2, 2025, 5:36 PM

#

intel arc windows memory management is terrible. Only --reserve-vram helps or block swap in custom nodes.

somber trellis Jul 3, 2025, 2:10 AM

#

#

I mean, this is pretty cool.

#

#

The ability for it to take multiple images or a spreadsheet as context

earnest grotto Jul 3, 2025, 2:15 AM

#

It can do much more interesting interactions but you also need to beat it over the head and get lucky, evidently

somber trellis Jul 3, 2025, 2:15 AM

#

i think using ic-light with it might be a good idea though

earnest grotto Jul 3, 2025, 2:16 AM

#

Ideally you wouldn't even need ic-light

#

You'd just be able to relight with kontext itself

somber trellis Jul 3, 2025, 2:16 AM

#

That however requires a second pass, wouldn't it.

#

Ic-light v1 being sdxl based might be a faster choice

earnest grotto Jul 3, 2025, 2:19 AM

#

somber trellis That however requires a second pass, wouldn't it.

yes? what's the issue

somber trellis Jul 3, 2025, 2:19 AM

#

earnest grotto yes? what's the issue

having a second pass isnt the issue

#

the time it takes to re-run kontext again on my hardware

#

i could just use

#

a flux schnell lora

#

i only say this because it takes 13s/it to run kontext with all the stuff i am using with it.

earnest grotto Jul 3, 2025, 2:21 AM

#

I get about ~1:15 for a 1mp image with teacache. there doesn't seem to be much point in more steps and cfg doesn't save it when it refuses to work
I think it would take me about as long to load sdxl and gen with it as to gen with already-loaded kontext

somber trellis Jul 3, 2025, 2:21 AM

#

earnest grotto I get about ~1:15 for a 1mp image with teacache. there doesn't seem to be much p...

cfg wont work without a thresholding node

#

flux is inherently designed to only use cfg1 and requires its fluxguidance nodes to handle it

#

its the same reason why no negative prompts work on it either

#

#

but yeah

#

only problem is youre doubling inference time with the thresholding node

#

the benefit, however, is that you can now use a negative prompt with flux and also go above CFG 1.

#

earnest grotto Jul 3, 2025, 2:26 AM

#

somber trellis cfg wont work without a thresholding node

Not what I'm referencing, and you don't need any special nodes for CFG to have an effect when the model isn't being stupid

somber trellis Jul 3, 2025, 2:28 AM

#

earnest grotto Not what I'm referencing, and you don't need any special nodes for CFG to have a...

certain seeds completely stop the model from outputting any edits

#

other than that the model has been editing for me fine

#

i was initially optimistic that this model would be better than base flux

#

but it doesn't seem like it is; it just seems like the same quality level but far better-tuned for image-to-image

earnest grotto Jul 3, 2025, 2:34 AM

#

Anime-ifying an image. No CFG, 2.0 CFG "make the image brighter", 2.0 CFG "make the image darker". No thresholding or anything else
CFG is just interpolating 2 predicted noises (linear combination but basically that). You don't need any special nodes to do that. Dynamic thresholding might only be better for you because the images are more contrasty.
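
To make the "linear combination" point concrete, here is a minimal sketch of the usual classifier-free guidance formula in plain Python. The function name `cfg_combine` and the toy numbers are illustrative assumptions. A real sampler applies this to tensors, and this is not ComfyUI's actual code.

```python
from typing import List


def cfg_combine(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise.

    'uncond' is the prediction for the negative (or empty) prompt,
    'cond' is the prediction for the positive prompt.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy "noise predictions" (made-up values, just to show the behaviour).
neg_pred = [0.0, 1.0, 2.0]
pos_pred = [1.0, 1.0, 0.0]

print(cfg_combine(neg_pred, pos_pred, 1.0))  # scale 1: exactly the positive prediction
print(cfg_combine(neg_pred, pos_pred, 2.0))  # scale 2: pushed further away from the negative
```

At scale 1 the negative prediction cancels out entirely, so a negative prompt only has an effect above 1. Above 1 this is strictly an extrapolation past the positive prediction rather than an interpolation, and the sampler generally needs both predictions at every step.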

somber trellis Jul 3, 2025, 2:36 AM

#

earnest grotto Anime-ifying an image. No CFG, 2.0 CFG "make the image brighter", 2.0 CFG "make ...

it isnt just for CFG if it actively allows you to use and work with negative prompts

earnest grotto Jul 3, 2025, 2:36 AM

#

When it refuses, there is no trickery you can do with its prediction to make it do what you want

earnest grotto Jul 3, 2025, 2:37 AM

#

somber trellis it isnt just for CFG if it actively allows you to use and work with negative pro...

"make the image brighter" and "make the image darker" are the negative prompts (the images are respectively darker and brighter)

somber trellis Jul 3, 2025, 2:40 AM

#

earnest grotto "make the image brighter" and "make the image darker" are the negative prompts (...

I am very confused by what you're trying to tell me.

#

So you have a normal ksampler workflow with those negative prompts in it

#

and it gave you the same outcome as if you would put it in positive?

#

(also this image didnt work properly on this seed)

#

Because I'm not using a samplecustomadvanced node workflow

earnest grotto Jul 3, 2025, 2:42 AM

#

somber trellis and it gave you the same outcome as if you would put it in positive?

The darker image had a negative prompt of "make the image brighter"

#

And the insanely bright image had a negative prompt of "make the image darker"

somber trellis Jul 3, 2025, 2:43 AM

#

But I thought the point of using DynamicThresholding was because Flux is a distilled model.

earnest grotto Jul 3, 2025, 2:43 AM

#

It can help, I'm not saying to not use it

#

It will most likely help with higher cfg. I've been sticking with low values

somber trellis Jul 3, 2025, 2:46 AM

#

I might start doing the opposite and just go for an all-performance workflow

#

flux alpha 8-step lora and all

earnest grotto Jul 3, 2025, 2:46 AM

#

I do not think your generic negative prompt warrants cfg

#

It's also kinda nonsense for kontext. You'd want it to do the opposite edit, don't just throw words at it, though it does kinda work when you throw words at it

somber trellis Jul 3, 2025, 2:47 AM

#

the prompt doesnt work at all without it for normal flux, i dont know about kontext

#

also i used redux to have myself not need to manually prompt the image, by utilizing a clip vision encoder

#

of course adding that in as well would probably give me better results however

earnest grotto Jul 3, 2025, 2:49 AM

#

my cases where cfg makes it conform better to the positive have been kinda slim. usually i want more conformity when it's failing to edit, but then no cfg will save it

#

it's so wrong i can see it on the literal 1st step

#

"Place the girl in the left image together with the ones in the right image, while maintaing the composition and their look." (and many other variations of this prompt tried, incl. "image #1/2", getting left/right wrong, as well as differently colored backgrounds or different order)

#

I tried being specific with hair and eye color instead of "her", "girl", etc.

#

I guess I haven't tried too many seeds, only 2-3 or so

#

I'm gonna try thresholding with colorization to see if that's any better, might help there more 🤔

somber trellis Jul 3, 2025, 2:53 AM

#

it just seems like certain seed values don't give the model the noise it wants

#

and it just completely botches the job, giving you a near identical output to the input

#

or a blurry mess

earnest grotto Jul 3, 2025, 2:54 AM

#

the 2nd image is the 1st step

#

what its prediction then looks like

#

the blur is normal

#

but you can basically tell that the end result is going to be unchanged or borderline unchanged

#

ostris had some peculiar artifacts when training a lora for it (that went away after more training). so I wonder if loras will save it

somber trellis Jul 3, 2025, 2:56 AM

#

earnest grotto but you can basically tell that the end result is going to be unchanged or borde...

quite literally a blur of the two images side-by-side

#

lmao

earnest grotto Jul 3, 2025, 2:56 AM

#

https://www.youtube.com/watch?v=WSWubJ4eFqI

YouTube

Ostris AI

How to Train a Flux.1 Kontext LoRA with AI Toolkit

Training a big head LoRA with AI Toolkit. Download this big head LoRA https://huggingface.co/ostris/kontext_big_head_lora

AI Toolkit - https://github.com/ostris/ai-toolkit
Runpod - https://runpod.io?ref=h0y9jyr2
Support me - https://ostris.com/support
Comfy Workflow - https://gist.github.com/jaretburkett/4d43238cb567eba3e32e776323ecb740

#

(Don't get your hopes too high, this trainer has no block swap, we'll be waiting for kohya)

somber trellis Jul 3, 2025, 2:56 AM

#

#

earnest grotto Jul 3, 2025, 3:00 AM

#

somber trellis quite literally a blur of the two images side-by-side

The blurring is normal, it's what diffusion models all predict early on

somber trellis Jul 3, 2025, 3:01 AM

#

earnest grotto The blurring is normal, it's what diffusion models all predict early on

earnest grotto Jul 3, 2025, 3:01 AM

#

all its anime outputs are oddly bright, like how the realistic ones are oddly yellowish

#

i wonder what dataset that brightness came from

#

Here's an example: 1 step out of 20 with juggernaut xl v9, and all 20, both with 1 cfg

#

The "negative" (empty) prompt prediction is much blurrier and greyer -> when subtracting, it essentially gives more contrast/color to the image, but we subtract enough so that doesn't destroy it -> the model can make a better prediction and you get a generally better result
Kontext, being distilled, is going to have much brighter predictions, so subtracting them can burn the image much more easily, which is why you normally want to use thresholding I guess. But it's only more easily, it's not guaranteed it'll burn them
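
A rough sketch of the subtraction described above. This is the standard classifier-free guidance formula, not Comfy's actual code; the `cfg_combine` name and the toy numbers are made up for illustration.

```python
def cfg_combine(cond, uncond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# Toy "pixel" predictions in a 0..1 range (made-up numbers).
# The empty-prompt prediction is greyer, i.e. closer to 0.5.
cond = [0.2, 0.8]
uncond = [0.4, 0.6]

# Scale 1 just gives back the conditional prediction.
print(cfg_combine(cond, uncond, 1.0))
# A higher scale pushes values away from the grey prediction,
# and here they leave the 0..1 range, which is the "burn".
print(cfg_combine(cond, uncond, 4.0))
```

Clamping or rescaling those out-of-range values is roughly the job thresholding is meant to do.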

somber trellis Jul 3, 2025, 6:23 AM

#

earnest grotto Jul 5, 2025, 4:43 AM

#

Took a gander at what more kontext loras people have trained, and things are looking very very promising

#

#

just a funny thing i found as i was browsing

#

but there are good not so memey loras

earnest grotto Jul 5, 2025, 9:10 PM

#

#

Omnigen 2 is pretty good

#

omnigen 2 is 4b

#

both omnigen 2 and ovis u1 are apache 2

#

ovis seems to be getting more attention despite being worse in general. i can see that sdxl vae smoothing people's images out

#

sadly i expect both will be ignored like lumina 2

somber trellis Jul 6, 2025, 2:17 AM

#

earnest grotto ovis seems to be getting more attention despite being worse in general. i can se...

ovis with high guidance seems to be latching onto words better than default guidance

earnest grotto Jul 6, 2025, 2:18 AM

#

ovis is sdxl vae and omnigen 2 is some 16ch vae, dunno if flux or sd3 or other

#

i didn't prompt the text in the text box for any of them

somber trellis Jul 6, 2025, 2:19 AM

#

kontext only won in this case

#

because its a bigger model

#

a finetune of ovis i bet would compete

earnest grotto Jul 6, 2025, 2:19 AM

#

One ovis dev said they intend to make a bigger version https://github.com/AIDC-AI/Ovis-U1/issues/1#issuecomment-3017636930

GitHub

Very good job, may I ask how much VRAM and time it takes to run, and whether it supports editing multiple images and text

somber trellis Jul 6, 2025, 2:20 AM

#

If they make a 12b model

earnest grotto Jul 6, 2025, 2:20 AM

#

However IMO omnigen 2 currently just beats it

somber trellis Jul 6, 2025, 2:20 AM

#

like flux

earnest grotto Jul 6, 2025, 2:20 AM

#

Given the chroma dev's findings, flux doesn't even need to be 12b due to wasting ~3b params on nearly pointless stuff

#

IDK why ovis is getting more attention than omnigen when both are apache 2

somber trellis Jul 6, 2025, 2:22 AM

#

ovis higher guidance gets quite close to the base image

#

at least thats what your comparison shows

earnest grotto Jul 6, 2025, 2:23 AM

#

yes. thing is, the default guidance was too unrelated to the image, so I included multiple pics of me progressively cranking it up. they're multiple because it also gets fried

somber trellis Jul 6, 2025, 2:23 AM

#

however the pillow and blanket in the back are overcontrasted and have weird black splotches

earnest grotto Jul 6, 2025, 2:24 AM

#

I don't want to give the wrong impression that it's fried by default. in this case, it might be that it's undertrained on anime, however at the same time they had ghibli style as one of their prompts, so...

somber trellis Jul 6, 2025, 2:24 AM

#

🤷‍♂️

#

and kontext though getting the closest was just a lucky gen right

earnest grotto Jul 6, 2025, 2:25 AM

#

Here's the default guidance (this kinda worked for non-anime images)
And a bit higher than the last image. absolutely fried

earnest grotto Jul 6, 2025, 2:25 AM

#

somber trellis and kontext though getting the closest was just a lucky gen right

Yeah. I am just surprised at how close it got, perfect text and everything

#

You've seen the unlucky HL2 image that did not want to get animeified at all

somber trellis Jul 6, 2025, 2:25 AM

#

yep

#

ive got a few wojak failures

#

lmao

earnest grotto Jul 6, 2025, 2:26 AM

#

However the lucky gen shows there's room to easily train it better

somber trellis Jul 6, 2025, 2:26 AM

#

kontext is still most capable but if a model of similar size comes out with an apache license

#

that will change

#

lets see if ovis makes a bigger model

#

i myself am still kinda sad that there still arent any models opensource that can voiceclone as well as indextts, but with emotion

#

chatterbox sucks at maintaining dialects/accents, dia is still not good

#

openaudio s1 mini (FishTTS) is also meh

earnest grotto Jul 6, 2025, 2:31 AM

#

I don't want bigger models

#

slower for everything, harder to train, for questionable benefits

#

there are ways to make the models better without ballooning parameter counts

#

https://arxiv.org/abs/2504.10188 https://arxiv.org/abs/2504.05741

arXiv.org

Efficient Generative Model Training via Embedded Representation Warmup

Diffusion models excel at generating high-dimensional data but fall short in training efficiency and representation quality compared to self-supervised methods. We identify a key bottleneck: the underutilization of high-quality, semantically rich representations during training notably slows down convergence. Our systematic analysis reveals a cr...

arXiv.org

DDT: Decoupled Diffusion Transformer

Diffusion transformers have demonstrated remarkable generation quality, albeit requiring longer training iterations and numerous inference steps. In each denoising step, diffusion transformers encode the noisy inputs to extract the lower-frequency semantic component and then decode the higher frequency with identical modules. This scheme creates...

#

These are just 2 examples.

#

Both of these claim training speedups AND higher quality (FID isn't exactly quality but that's good)

#

I don't know every single paper out there. As you know, SD1.5 and SDXL struggle with changing resolutions, and Flux is better. This is due to an architectural improvement (rotary positional embeddings) rather than parameter counts. In that vein, I had seen a paper that touted even better generalization to higher resolution, but I forgot what it was

#

mm, another comes to mind, i think there was a paper about doing the diffusion at a lower and progressively increasing resolution, too

earnest grotto Jul 6, 2025, 5:23 AM

#

earnest grotto I don't know every single paper out there. As you know, SD1.5 and SDXL struggle ...

may have been this https://arxiv.org/html/2503.18719v1

reef ivy Jul 9, 2025, 12:08 PM

#

Apparently the wanwrapper got gguf support. Probably won't be able to test for who knows how long though.

wicked fulcrum Jul 12, 2025, 1:22 AM

#

FYI New Blog post on how to get Flux.1 Kontext and Wan 2.1 VACE working with AI Playground v2.5.5 beta
https://game.intel.com/us/stories/update-ai-playground-to-run-flux-1-kontext-and-wan-2-1-vace-on-intel-ai-pcs/

reef ivy Jul 13, 2025, 1:58 PM

#

I am seeing people using the 14b wan model to make images now and claiming its better than flux. Not sure exactly what lora or special models they are using though if any.

somber trellis Jul 13, 2025, 2:48 PM

#

reef ivy I am seeing people using the 14b wan model to make images now and claiming its b...

https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32.safetensors

#

Those two.

#

https://www.reddit.com/r/StableDiffusion/comments/1lu7nxx/wan_21_txt2img_is_amazing/

#

I definitely wouldn't call it better than flux.

reef ivy Jul 13, 2025, 4:58 PM

#

Would need to check it out, think its also much faster as well.

#

Gonna try the gguf with block swap now that kijai added support to the wrapper. Might even make gguf faster or at least more consistent

#

Gonna be a while before I can try though likely

somber trellis Jul 13, 2025, 11:22 PM

#

https://www.reddit.com/r/LocalLLaMA/comments/1lyy39n/indextts2_the_most_realistic_and_expressive/

#

Looks like IndexTTS2 is coming out soon.

earnest grotto Jul 17, 2025, 11:52 AM

#

New hidream edit model that works at 1mp instead of 0.5mp, and potentially other improvements? https://huggingface.co/HiDream-ai/HiDream-E1-1

HiDream-ai/HiDream-E1-1 · Hugging Face

craggy hinge Jul 20, 2025, 7:25 AM

#

Hi all, are there any 3D generation options on IPEX or in other ARC-compatible libs?

earnest grotto Jul 21, 2025, 3:02 PM

#

craggy hinge Hi all, are there any 3D generation options on IPEX or in other ARC-compatible l...

IPEX is not necessary anymore.
Trellis works. I'd expect Hunyuan 2.0 and 2.1 to work.
However, personally I've just been using Hunyuan 2.5 (which is online, a service, although still free)

#

Since you talk about IPEX, you probably want to install Comfy using my script: #1193952640225267802 message
Which does a few things like add Disty's hijacks to Comfy so random nodes that hardcode references to "cuda" instead of using Comfy's device getter still work
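
To illustrate what a node that hardcodes "cuda" runs into, here's a toy sketch of remapping device strings. It is not Disty's actual hijack code (that isn't shown in this chat); `remap_device` and its `target` default are hypothetical names made up for this example. The only fact assumed is that PyTorch names the Intel GPU backend "xpu".

```python
def remap_device(device, target="xpu"):
    """Rewrite "cuda" or "cuda:N" device strings to the target backend."""
    text = str(device)
    if text == "cuda" or text.startswith("cuda:"):
        # Keep any ":N" index suffix, swap only the backend name.
        return target + text[len("cuda"):]
    return text  # anything else (e.g. "cpu") passes through untouched

for name in ("cuda", "cuda:0", "cpu"):
    print(name, "->", remap_device(name))
```

Nodes that ask Comfy for the device instead of hardcoding a string never need this kind of patching.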

#

You can use Hunyuan 2.5 here https://3d.hunyuan.tencent.com

#

earnest grotto Jul 21, 2025, 3:42 PM

#

Recently they added a remesher as well but I'm not liking the topology it produces, and the models 2.5 outputs are already kinda broken. This doesn't save much time from just using the model as a quick base

#

Although it is promising

craggy hinge Jul 22, 2025, 7:44 PM

#

earnest grotto Since you talk about IPEX, you probably want to install Comfy using my script: h...

Great work, it runs way faster than common IPEX. But I failed to install Trellis or Hunyuan 2.1. The base example workflow for 2.0 works fine. It's probably best to just wait for things to get better, I think.

earnest grotto Jul 22, 2025, 9:58 PM

#

Yeah I don't think current local 3d model gen is good enough yet

#

But hopefully soon

somber trellis Jul 24, 2025, 1:09 AM

#

driver 6972 seems to give garbled outputs when using a basic flux gguf workflow in comfy

#

Nevermind. This might not be a driver issue, as 6913 which worked for me before is now doing the same thing.

formal tusk Jul 24, 2025, 4:25 PM

#

I've had some weirdness for the last week-ish across the board

ripe pivot Jul 25, 2025, 8:04 PM

#

updating to 0.2.2 somehow makes it this slow. which one am I supposed to update to? nightly or stable?

#

that's very slow for imagegen

earnest grotto Jul 25, 2025, 8:45 PM

#

ripe pivot updating to 0.2.2 somehow makes it this slow, which one where I suppose to updat...

Scroll up and show a screenshot

ripe pivot Jul 26, 2025, 4:39 AM

#

earnest grotto Scroll up and show a screenshot

#

do I somehow prompt to my cpu or what?

earnest grotto Jul 26, 2025, 8:53 AM

#

ripe pivot

Please show everything ComfyUI shows in the console

earnest grotto Jul 26, 2025, 8:53 AM

#

ripe pivot do I somehow prompt to my cpu or what?

yes

ripe pivot Jul 26, 2025, 10:08 AM

#

earnest grotto Please show everything ComfyUI shows in the console

On the boot?

earnest grotto Jul 26, 2025, 1:36 PM

#

ripe pivot On the boot?

yes

ripe pivot Jul 26, 2025, 1:53 PM

#

earnest grotto yes

earnest grotto Jul 26, 2025, 2:26 PM

#

ripe pivot

Is this screenshot with the ksampler or some random custom node

ripe pivot Jul 26, 2025, 2:31 PM

#

ksampler, then again it works fine on previous version

#

upgrading to 0.2.2 changes thing

earnest grotto Jul 26, 2025, 2:31 PM

#

ok, i'll check that out in a few hours

earnest grotto Jul 26, 2025, 6:03 PM

#

ripe pivot upgrading to 0.2.2 changes thing

I can't reproduce this, using the stable pytorch

#

it's possible you just hit a broken nightly and when you tried again there's a new nightly with that bug fixed

#

use the stable pytorch

ripe pivot Jul 26, 2025, 6:36 PM

#

earnest grotto use the stable pytorch

nothing changes even on stable pytorch

earnest grotto Jul 26, 2025, 6:37 PM

#

The first sample is always going to take a bit

ripe pivot Jul 26, 2025, 6:38 PM

#

The 1st step takes 2 minutes. Before the update, in that amount of time it was already on the detailer inpaint step

earnest grotto Jul 26, 2025, 6:39 PM

#

Show the workflow
Which shortcut are you launching with

ripe pivot Jul 26, 2025, 6:39 PM

#

regular one

#

earnest grotto Jul 26, 2025, 6:40 PM

#

earnest grotto Is this screenshot with the ksampler or some random custom node

.

#

you are using a random custom node

#

there is nothing regular about that workflow

ripe pivot Jul 26, 2025, 6:42 PM

#

I mean it got stuck on the ksampler before any other nodes. I'd understand if it got stuck on the custom nodes

earnest grotto Jul 26, 2025, 6:43 PM

#

I see what the issue is, but in the future I need you to tell me when something is or isn't a custom node

ripe pivot Jul 26, 2025, 6:44 PM

#

tbf I don't understand which ones are default and which are custom

#

on reforge the detailer is kind of provided by default iirc, so I assumed it's just a default tool

earnest grotto Jul 26, 2025, 6:45 PM

#

#

If something is a custom node, that on the top right is where it comes from

ripe pivot Jul 26, 2025, 6:47 PM

#

so every node that has a label on the right is custom, right?

earnest grotto Jul 26, 2025, 7:22 PM

#

Ok, script updated

#

Should be fixed now

earnest grotto Jul 26, 2025, 7:23 PM

#

ripe pivot so every node that has a label on the right is custom, right?

yes

#

besides that, comfyui does not have face detection by default.

#

also most dedicated face detection models are bad for anime

#

they will often fail to detect anime faces

#

and you don't even need it that much for anime anyways, anime faces don't change that much with resolution, what will break with anime more often is smaller eyes not whole faces and doing the equivalent of hires fix is 99% of the time enough

earnest grotto Jul 26, 2025, 8:00 PM

#

and your last tag isn't a thing

ripe pivot Jul 26, 2025, 8:26 PM

#

That detection works fine. Doing hires fix in one workflow makes generation take longer, and the detailer is mainly there to reduce inconsistency in the pupils, especially when the character is quite niche

ripe pivot Jul 26, 2025, 8:27 PM

#

earnest grotto and your last tag isn't a thing

I had a gist, but it appears on the autotag so eh might as well try to see how the noise looks like

ripe pivot Jul 27, 2025, 8:25 AM

#

earnest grotto Should be fixed now

still the same

ripe pivot Jul 27, 2025, 8:43 AM

#

does disabling iGPU matter?

earnest grotto Jul 27, 2025, 11:54 AM

#

no

earnest grotto Jul 27, 2025, 12:12 PM

#

ripe pivot still the same

This with 0.2.3 now?

earnest grotto Jul 27, 2025, 12:31 PM

#

Did you install fresh? Also, DM me the model_management.py file (inside the comfy folder)

ripe pivot Jul 27, 2025, 2:05 PM

#

I just ran the script. Since it downloaded everything, I presume that's a fresh install.

earnest grotto Jul 27, 2025, 2:07 PM

#

hmm, everything looks fine now

earnest grotto Jul 27, 2025, 2:07 PM

#

ripe pivot still the same

Can you scroll up in this and show what's there

ripe pivot Jul 27, 2025, 2:12 PM

#

earnest grotto Can you scroll up in this and show what's there

earnest grotto Jul 27, 2025, 2:12 PM

#

more up

ripe pivot Jul 27, 2025, 2:12 PM

#

earnest grotto Jul 27, 2025, 2:13 PM

#

you don't need comfyui-manager

#

you can save those 18 seconds of loading

#

comfy has a built in manager now

#

ah, ok, I think I see the new issue

ripe pivot Jul 27, 2025, 2:17 PM

#

do they? then again I don't know how to uninstall those

earnest grotto Jul 27, 2025, 2:31 PM

#

ripe pivot do they? then again I don't know how to uninstall those

you can just delete its folder in custom_nodes

#

ok, fixed. you can redownload the script but i haven't changed the version number

ripe pivot Jul 28, 2025, 2:30 PM

#

Thanks it works fine now, the decode is slower than before but I assume that's the intel driver update ruining something again

formal tusk Jul 28, 2025, 4:37 PM

#

Anyone working with the new Wan2.2 yet?

reef ivy Jul 28, 2025, 7:55 PM

#

Nope, wanna try the 5b but everyone seems still focused on 14b. Hope to get to poke around with stuff again soon

earnest grotto Jul 28, 2025, 8:27 PM

#

you mean 24b?

reef ivy Jul 28, 2025, 9:59 PM

#

think it's still 14b model right?

somber trellis Jul 28, 2025, 10:11 PM

#

https://huggingface.co/collections/multimodalart/wan-22-688767e313337b434ed55112

#

@reef ivy Yes.

#

earnest grotto Jul 28, 2025, 10:20 PM

#

@reef ivy No. It's a 27B model that has 14B active at a time (what the A is for). So you will need more regular RAM at least.

somber trellis Jul 28, 2025, 10:20 PM

#

That's why it's A14B.

#

14 billion activated parameters during inference.

#

It will take the same processing power to run as a 14b model, but require more ram like vik said

#

@earnest grotto I just looked at the gguf quants for a14b wan2.2

#

For some reason it's 500 megabytes smaller than wan 2.1.

#

Hm. It's an MoE model with 2 experts.

#

Wan2.2 introduces Mixture-of-Experts (MoE) architecture into the video generation diffusion model. MoE has been widely validated in large language models as an efficient approach to increase total model parameters while keeping inference cost nearly unchanged. In Wan2.2, the A14B model series adopts a two-expert design tailored to the denoising process of diffusion models: a high-noise expert for the early stages, focusing on overall layout; and a low-noise expert for the later stages, refining video details. Each expert model has about 14B parameters, resulting in a total of 27B parameters but only 14B active parameters per step, keeping inference computation and GPU memory nearly unchanged.

#

I'm a moron I just understood how it works

#

It requires both the lownoise and highnoise models to function. If that's the case, then the actual model size combining the two is around 30.8 gigabytes for the Q8 GGUF versions of the models. It however only needs one loaded at a time to function, essentially making it the same resource requirements as Wan2.1, excluding storage.
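
Back-of-the-envelope version of the above, assuming the two Q8 files are roughly equal in size (that even split is an assumption, the chat only gives the combined figure):

```python
# Combined size comes from the message above; the even split is assumed.
TOTAL_Q8_GB = 30.8   # high-noise + low-noise Q8 GGUF files together
NUM_EXPERTS = 2

per_expert_gb = TOTAL_Q8_GB / NUM_EXPERTS

print(f"on disk (both experts): {TOTAL_Q8_GB:.1f} GB")
print(f"loaded at any one time: ~{per_expert_gb:.1f} GB")
```

So storage doubles, while the amount that has to be resident for a given step stays at about one expert's worth.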

#

It's a separated MoE model.

#

https://comfyanonymous.github.io/ComfyUI_examples/wan22/

#

https://docs.comfy.org/tutorials/video/wan/wan2_2

#

https://huggingface.co/collections/QuantStack/wan22-ggufs-6887ec891bdea453a35b95f3

reef ivy Jul 29, 2025, 2:21 AM

#

earnest grotto @reef ivy No. It's a 27B model that has 14B active at a time (what t...

oh now I get what people were talking about when using some 2.1 models as a second pass, seems this is similar to how sdxl was when released with 2 models as one? interesting.

coarse whale Jul 29, 2025, 1:25 PM

#

Question: I installed ComfyUI through @earnest grotto's script some months ago and it still works great, but I was wondering, should I update the install eventually? The manager, for example, says that there are more recent versions of ComfyUI. Can I update through there, or should I do it through the script somehow?

Or is it fine using an older version?

earnest grotto Jul 29, 2025, 1:25 PM

#

download a newer version of the script and run it from the same location you last ran it

coarse whale Jul 29, 2025, 1:26 PM

#

I dont need to delete anything?

earnest grotto Jul 29, 2025, 1:26 PM

#

generally, comfyui updates add native support for newer models (e.g. kontext or possibly wan2.2)

earnest grotto Jul 29, 2025, 1:26 PM

#

coarse whale I dont need to delete anything?

you don't need to

coarse whale Jul 29, 2025, 1:26 PM

#

Great, thank you so much!

earnest grotto Jul 29, 2025, 1:26 PM

#

if you don't need newer models you don't need to update

earnest grotto Jul 29, 2025, 3:16 PM

#

torch compile decided to work on linux and i got a pretty nice boost for lumina 2, from 1.6s/it -> ~0.95s/it

formal tusk Jul 29, 2025, 3:24 PM

#

somber trellis https://huggingface.co/collections/QuantStack/wan22-ggufs-6887ec891bdea453a35b95...

I'm a bit confused on how the GGUF format is used for the high / low noise - how are these used in a workflow? As I understand it, in the normal configuration the two are combined in a MOE model and loaded as a single safetensor model? How does this work when the MOE is split?

somber trellis Jul 29, 2025, 3:28 PM

#

formal tusk I'm a bit confused on how the GGUF format is used for the high / low noise - how...

The comfy examples page has a normal non-gguf workflow you can swap the loaders out for gguf nodes.

formal tusk Jul 29, 2025, 3:31 PM

#

somber trellis The comfy examples page has a normal non-gguf workflow you can swap the loaders ...

Does one just load one or the other for high/low? Is there a sequential process or a dual model loader?

somber trellis Jul 29, 2025, 3:32 PM

#

#

ignore the preview, it's broken

#

It loads the high-noise model first, inferences, sends the latents from the first ksampler to the second where the low-noise model gets loaded.
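
A minimal sketch of that hand-off, with a stand-in sampler instead of real models. The function names, the 20 steps, and the halfway switch point are all assumptions for illustration; the chat doesn't say where the workflow switches experts.

```python
def split_steps(total_steps, switch_at):
    """Return (start, end) step ranges for the high-noise and low-noise passes."""
    if not 0 < switch_at < total_steps:
        raise ValueError("switch_at must fall strictly inside the step range")
    return {"high_noise": (0, switch_at), "low_noise": (switch_at, total_steps)}

def run_two_stage(latent, total_steps, switch_at, sample):
    """Call sample(model_name, latent, start, end) once per expert, in order."""
    for name, (start, end) in split_steps(total_steps, switch_at).items():
        # The output latent of the first pass becomes the input of the second.
        latent = sample(name, latent, start, end)
    return latent

# Stand-in sampler that only records what was called (no real model involved).
log = []
def fake_sample(name, latent, start, end):
    log.append((name, start, end))
    return latent + [name]

result = run_two_stage([], total_steps=20, switch_at=10, sample=fake_sample)
print(log)
print(result)
```

Only one expert is needed per pass, which is why the other one can stay unloaded.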

formal tusk Jul 29, 2025, 3:33 PM

#

Thank you

somber trellis Jul 29, 2025, 3:33 PM

#

I might try to use the q4 a14b models though.

#

Q8 is just a bit too big on the A770.

reef ivy Jul 29, 2025, 3:57 PM

#

You can use gguf in the wrapper now, you just need to do that one edit i posted a while back to get it to work on intel (unless something changed, I haven't updated yet). Block swap seems more reliable than just using the reserve vram command

#

Wrapper usually makes everything a bit easier to use tbh.

coarse whale Jul 29, 2025, 7:55 PM

#

What could this be for?
"Native API failed. Native API returns: 20 (UR_RESULT_ERROR_DEVICE_LOST)"

somber trellis Jul 29, 2025, 8:05 PM

#

earnest grotto Jul 29, 2025, 8:13 PM

#

coarse whale What could this be for? "Native API failed. Native API returns: 20 (UR_RESULT_ER...

Your driver crashed

#

You probably ran out of vram

#

Or, drivers can be a bit iffy on windows still.

#

If you haven't restarted your pc in a while, time to do so

#

win+ctrl+shift+b won't save you

#

I've mostly only had issues when kontexting for a while. I guess I used to have issues after about 200-1000 sdxl images but i kinda stopped doing that so i dunno if that's still a thing

earnest grotto Jul 29, 2025, 8:21 PM

#

somber trellis

something happened at the start

somber trellis Jul 29, 2025, 8:46 PM

#

earnest grotto something happened at the start

looks like burn at the beginning

somber trellis Jul 29, 2025, 11:01 PM

#

coarse whale Jul 30, 2025, 9:36 AM

#

earnest grotto If you haven't restarted your pc in a while, time to do so

Yeah restarting it fixed it, it does come back after circa 10 images, with illustrious, but it's only with resolution >1500x1500 px, so i guess that's the limit I have

craggy hinge Jul 30, 2025, 5:55 PM

#

Hello again, everyone. Please tell me, is there any way to use two ARC 770s?

late glen Jul 31, 2025, 5:06 AM

#

craggy hinge Hello again, everyone. Please tell me, is there any way to use two ARC 770s?

I believe in a limited capacity? https://github.com/pollockjj/ComfyUI-MultiGPU

GitHub

GitHub - pollockjj/ComfyUI-MultiGPU: This custom_node for ComfyUI a...

This custom_node for ComfyUI adds one-click "Virtual VRAM" for any GGUF UNet and CLIP loader, managing the offload of layers to DRAM or VRAM to maximize the latent space of your c...

reef ivy Jul 31, 2025, 10:57 AM

#

I wonder how the 48gb b60 will work

somber trellis Aug 1, 2025, 12:16 AM

#

74 seconds per step on 640x480x74 wan 2.2 A14B, total 20 steps. Takes 25+ minutes to generate.

somber trellis Aug 1, 2025, 12:42 AM

#

41 seconds per step, 640x480x81, total of 6 steps. Took less than eight minutes.

#

Using the lightx2v T2V 14B Wan 2.1 lora at 2 strength.
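
Quick arithmetic on the two runs above, using only the per-step times and step counts reported. The gap between these numbers and the reported totals is presumably model loading and decoding, though that's a guess.

```python
def sampling_minutes(seconds_per_step, steps):
    """Pure sampling time in minutes, ignoring load/encode/decode overhead."""
    return seconds_per_step * steps / 60

full = sampling_minutes(74, 20)  # 20-step run
fast = sampling_minutes(41, 6)   # 6-step run with the lightx2v lora

print(f"{full:.1f} min vs {fast:.1f} min of pure sampling")
```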

#

https://huggingface.co/QuantStack/FLUX.1-Krea-dev-GGUF/tree/main

#

https://bfl.ai/announcements/flux-1-krea-dev

#

Supposedly the purpose of this model is to overcome the "AI Look"

somber trellis Aug 1, 2025, 3:57 AM

#

cursive hull Aug 1, 2025, 7:13 AM

#

how can i use GGUF? in Comfy the GGUF loader is not popping up, using Arc A750

reef ivy Aug 1, 2025, 1:16 PM

#

I think its a custom node you need to get from the manager

formal tusk Aug 1, 2025, 7:45 PM

#

cursive hull how can i use GGUF? in Comfy the GGUF loader is not popping up, using Arc A750

https://github.com/city96/ComfyUI-GGUF

GitHub

GGUF Quantization support for native ComfyUI models - city96/ComfyUI-GGUF

boreal maple Aug 2, 2025, 8:00 AM

#

Hi, I am looking for help installing ComfyUI on my laptop with an Intel Arc GPU, but all I am getting is errors and a CPU-only version

#

🎯 Summary: Why It Doesn't Work (Yet)

| 🔧 Component | ❌ Problem |
| --- | --- |
| PyTorch (on Windows) | GPU (XPU) backend still experimental or missing |
| IPEX (Intel Extension for PyTorch) | Current public builds mostly support CPU only |
| ComfyUI | Designed for CUDA backend, no out-of-the-box support for Intel GPU |
| Your GPU (Arc) | Based on Xe HPG, not yet fully integrated into PyTorch workflows |
| TorchDirectML / OpenVINO | Works partially for inference, but not supported by ComfyUI pipeline |

🕯️ The Major Roadblock in One Line:

Intel Arc GPUs lack stable, official PyTorch/XPU backend support on Windows, and ComfyUI doesn't yet support Intel's alternate GPU paths (SYCL, OpenVINO, DirectML).

✅ What Is Working Right Now?

Your CPU can run PyTorch + ComfyUI reliably.

You can install optimized CPU builds (e.g., torch==2.3.0) using Intel’s IPEX.

You can do inference, just a little slower.

reef ivy Aug 2, 2025, 11:49 AM

#

boreal maple Hi, i am looking for help to install compfyui in my laptop using Intel powered A...

What is your gpu? Try Vik's script in the pinned comments #1193952640225267802 message

#

Chatgpt is wrong as well, xpu is built into latest pytorch.
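
A quick way to check this yourself, as a minimal sketch. `pick_device` is a made-up helper name; the `torch.xpu.is_available()` call is the part that comes from PyTorch, and the `hasattr` guard is there in case an older build lacks `torch.xpu`.

```python
import importlib.util

def pick_device():
    """Return "xpu" if PyTorch sees an Intel GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch isn't installed at all
    import torch
    # Recent PyTorch builds ship the xpu backend; older ones may not have it.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"

print(pick_device())
```

If this prints "cpu" on a machine with an Arc card, the usual suspects are a CPU-only PyTorch wheel or a missing driver/runtime.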

tough wharf Aug 2, 2025, 5:24 PM

#

I wonder if some of yall are playing with wan2.2 on Arc cards
How's the speed and output?

reef ivy Aug 2, 2025, 8:58 PM

#

Haven't had time, haven't even been able to mess with fusionx and all the speed-up loras and models for wan2.1.

earnest grotto Aug 2, 2025, 8:59 PM

#

@final mirage Run comfy with --reserve-vram 8, and say what happens

final mirage Aug 2, 2025, 9:33 PM

#

I'm having issues installing the IPEX support as described here

https://github.com/comfyanonymous/ComfyUI

Running these commands (as described in the docs)

pip install torch==2.3.1.post0+cxx11.abi torchvision==0.18.1.post0+cxx11.abi torchaudio==2.3.1.post0+cxx11.abi intel-extension-for-pytorch==2.3.110.post0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

I get the following errors:

ERROR: No matching distribution found for torch==2.3.1.post0+cxx11.abi

Any ideas what could be wrong?

GitHub

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. - comfyanonymous/ComfyUI

reef ivy Aug 2, 2025, 10:44 PM

#

final mirage I'm having issues installing the IPEX support as described here https://github....

Use this way to install #1193952640225267802 message

#

Or try ai playground https://discord.com/channels/554824368740630529/1243956384052285560

reef ivy Aug 3, 2025, 1:38 AM

#

Seems like 5b wan2.2 might be worse than 1.3b 2.1. I think it is a t2v and i2v in one? Might be why its worse trying to do both(could be wrong only been able to browse discord for examples)

final mirage Aug 3, 2025, 12:17 PM

#

So I managed to install the latest IPEX, now I'm getting the following

RuntimeError: XPU out of memory. Tried to allocate 115.99 GiB (GPU 0; 7.75 GiB total capacity; 6.05 GiB already allocated; 6.39 GiB reserved in total by PyTorch)

What's going on here? 115.99gb? I'm trying to run a gguf model under 7 gb with a ltx2v Lora. Before IPEX i could run it without running out of memory. Now I'm getting the above after IPEX installation.

Any ideas what could be causing the explosion of memory allocation?

reef ivy Aug 3, 2025, 12:58 PM

#

You will probably need to show the entire error. This is a workflow you previously used on pytorch I assume?

final mirage Aug 3, 2025, 1:53 PM

#

reef ivy You will probably need to show the entire error. This is a workflow you previou...

I'll do some more tests later today and post more of the error. I don't remember there being more than this. But I'll check next time.

And yes. Same workflow and same hardware as before I installed IPEX. Had no vram issues then.

reef ivy Aug 3, 2025, 3:35 PM

#

there can be issues if installing ipex on top of pytorch or vice versa, it's best to always make a clean environment if you didn't. If you updated anything it could be an issue with comfy or the nodes themselves. also make sure you have the --reserve-vram in command line with whatever amount works best

coarse whale Aug 3, 2025, 3:47 PM

#

Question, png saves with comfyui look a bit washed out/desaturated compared to how they look in the web preview. Why is that? The thumbnail in Windows looks like the web interface, but when opened with the Windows viewer or Photoshop it's washed out.

#

earnest grotto Aug 3, 2025, 3:52 PM

#

final mirage So I manage to install latest IPEX, now I'm getting the following ```RuntimeErr...

Show the workflow

earnest grotto Aug 3, 2025, 3:52 PM

#

coarse whale Question, png saves with comfyui looks a bit washed out/desaturated compared to ...

Show the workflow

coarse whale Aug 3, 2025, 3:54 PM

#

this is weird even copy pasting from photoshop to discord the colors change

#

maybe i have some problems with the color profiles in windows

#

every time I copy paste from snipping tool to some application it changes the colors, I don't think it's comfyui's fault

#

It was a wrong color profile set up on Windows, sorry!

nocturne fjord Aug 3, 2025, 4:46 PM

#

reef ivy Seems like 5b wan2.2 might be worse than 1.3b 2.1. I think it is a t2v and i2v ...

The model is unusable below 720p resolution.

reef ivy Aug 3, 2025, 4:58 PM

#

so far nobody has been able to fix the face issues with loras, they just get destroyed with 5b model apparently. Might be worth doing a second pass with the 1.3b maybe?

final mirage Aug 3, 2025, 5:36 PM

#

earnest grotto Show the workflow

Is there a good way to share workflows? (I'm new to comfyui)

earnest grotto Aug 3, 2025, 5:41 PM

#

final mirage Is there a good way to share workflows? (I'm new to comfyui)

workflow > export
or send a screenshot

reef ivy Aug 3, 2025, 6:28 PM

#

If it's not a custom one just link where you got it. I assume it is a downloaded one you are using since you are doing comparisons

earnest grotto Aug 3, 2025, 7:08 PM

#

reef ivy If its not a custom one just link where you got it. I assume it is a downloaded...

no

#

Don't link it or anything like that. Send or show the exact workflow you ran.

#

oops, ping

#

oh well

somber trellis Aug 5, 2025, 2:02 PM

#

https://huggingface.co/city96/Qwen-Image-gguf/tree/main

#

Ding ding ding.

somber trellis Aug 5, 2025, 5:19 PM

#

Kinda wish there were ways to make it as fast as possible on arc.

#

rn it takes 5 minutes with the q8 gguf

earnest grotto Aug 5, 2025, 7:16 PM

#

things can most likely be faster with int8 or int4, just those aren't available in comfy

#

also there was a better alternative to teacache that popped up recently with supposedly almost zero degradation, ~~however iirc unlike teacache or such, it was not training free~~ not sure actually

#

https://arxiv.org/html/2507.17135v1

#

apparently it works both with no cfg and decently high cfg? interesting

#

i wonder if sdnext has support for it with disty's quantization

civic charm Aug 5, 2025, 7:57 PM

#

earnest grotto i wonder if sdnext has support for it with disty's quantization

sdnext has support for qwen-image in dev branch

#

but you need to patch this line until diffusers fixes it: https://github.com/huggingface/diffusers/issues/12066

GitHub

Qwen Image incorrect device assignment during prompt encode · Issu...

Describe the bug In the new QwenImagePipeline method _get_qwen_prompt_embeds it does following: txt_tokens = self.tokenizer( txt, max_length=self.tokenizer_max_length + drop_idx, padding=True, trun...

#

this is for the balanced offload to work

#

qwen image doesn't really want to go below 6 bits

#

You also need to enable dynamic attention, either through compute settings or via the ipex force attention env var

#

intel's flash attention just fails to run with qwen-image

somber trellis Aug 5, 2025, 9:08 PM

#

civic charm intel's flash attention just fails to run with qwen-image

lmao im still using normal pytorch attention in comfyui

#

flash attention would be a nice speed boost

glossy gyro Aug 8, 2025, 8:16 AM

#

Is there a relevant docker for Comfyui IPEX?

earnest grotto Aug 8, 2025, 8:19 AM

#

are you asking because you want simple setup for yourself or because you have 10 computers with arc and want to deploy to all of them

#

also, ipex by itself is not very relevant anymore. in fact, I think it's getting discontinued?

#

#intel-arc message

#

You can ask vipitis what he was talking about. to me, it just sounds plausible enough

glossy gyro Aug 8, 2025, 8:24 AM

#

earnest grotto are you asking because you want simple setup for yourself or because you have 10...

I wanted to simplify the installation.
IPEX is discontinued, but what's the replacement?

earnest grotto Aug 8, 2025, 8:24 AM

#

regular pytorch

#

#1193952640225267802 message

#

Here's a script to install comfyui for you.

#

You need conda and git installed and that's it.

#

And well, working graphics drivers of course

#

On linux, that entails working clinfo at least
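
The prerequisites listed above (conda, git, and on Linux a working clinfo) can be checked before running any install script. A minimal sketch, assuming the tools are expected on `PATH` under exactly those names; `find_tools` and `missing_tools` are hypothetical helper names:

```python
import shutil

# Tool names taken from the messages above; that they are on PATH under
# exactly these names is an assumption.
REQUIRED_TOOLS = ("conda", "git", "clinfo")


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that could not be found on PATH."""
    return [tool for tool, path in find_tools(tools).items() if path is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("conda, git and clinfo were all found on PATH")
```

Finding `clinfo` on `PATH` only shows that the tool is installed. Whether it actually lists the GPU still has to be checked by running it.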

glossy gyro Aug 8, 2025, 8:33 AM

#

earnest grotto https://discord.com/channels/554824368740630529/1193952640225267802/127546481113...

Ok, I just have a Linux machine, and in any case I need to build Docker with this script.
I just thought, maybe there is a ready solution, as for Ollama

earnest grotto Aug 8, 2025, 8:33 AM

#

glossy gyro Ok, I just have a Linux machine, and in any case I need to build Docker with thi...

> and in any case I need to build Docker with this script

why

glossy gyro Aug 8, 2025, 8:36 AM

#

earnest grotto > and in any case I need to build Docker with this script why

My NAS setup is built around Docker, with a dedicated container for each application.
I could use LXC instead, but that would break the uniformity of the setup

#

hmm, found this project
https://github.com/YanWenKun/ComfyUI-Docker/tree/main/xpu-test
I hope this is what I need

glossy gyro Aug 8, 2025, 3:42 PM

#

glossy gyro hmm, found this project https://github.com/YanWenKun/ComfyUI-Docker/tree/main/xp...

I tested it and it works, without using IPEX directly. There are ready-made Docker images

reef ivy Aug 10, 2025, 9:23 PM

#

Has there been anything notable in the pytorch-xpu development? Last big thing was triton/torch.compile. I heard flash attention was coming but apparently it's not actually making anything faster afaik.

civic charm Aug 10, 2025, 10:20 PM

#

Flash atten is already here with PyTorch 2.7 and made it much faster

#

But base PyTorch performance is still awful compared to ipex 2.3

#

So flash attention wasn't enough of a speedup to close the gap

#

Also installing IPEX on top of PyTorch 2.6 / 2.7 / 2.8 halves your performance for some reason

#

And using non-blocking even once makes things slow down to a halt on Intel and corrupts your data

#

non-blocking is supposed to make things faster, not slower

#

This wasn't an issue with IPEX 2.3

reef ivy Aug 10, 2025, 11:41 PM

#

So this is just slowdown for pytorch in general or intel/xpu/ipex specific slowdown?

civic charm Aug 11, 2025, 9:09 AM

#

Intel specific

#

PyTorch slowdown is fixed on all others with PyTorch 2.6

#

And 2.7 runs much faster than everything before it on AMD

earnest grotto Aug 13, 2025, 1:41 PM

#

civic charm Also installing IPEX on top of PyTorch 2.6 / 2.7 / 2.8 halves your performance f...

When training a lumina 2 lora*, with 2.9 nightly and 2.8 I get ~11.1s/it, with 2.8+IPEX I get ~8.7s/it

#

*without the TE, 12 rank, lotsa random 1MP resolutions, 1 batch size, and with cache clearing before and after backward() because for some reason it both speeds it up AND reduces vram usage AND makes it so I don't randomly crash with a pseudo-OOM issue

#

the same crashing was also happening when I tried to train ace-step

#

I'll try inference later but, I'm pretty sure my 2.8/2.9 performance is basically identical to 2.3+ipex

lunar thicket Aug 13, 2025, 2:15 PM

#

earnest grotto When training a lumina 2 lora*, with 2.9 nightly and 2.8 I get ~11.1s/it, with 2...

2.8+IPEX sounds appreciably faster then?

earnest grotto Aug 13, 2025, 2:16 PM

#

when training, a lumina 2 lora specifically, and on linux

#

for me

#

IIRC disty had some issues with SDXL lora training performance too. For me personally, I've had SDXL lora training performance go all over the place for seemingly no reason. Fresh boot, 5-6s/it, reboot, 5 again, reboot again, finally the expected 2.3s/it
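
Because these figures move around so much from run to run, it may help to time several iterations and report the spread, not one number. A minimal sketch using only the standard library. `seconds_per_iteration` is a hypothetical helper, and passing something like `torch.xpu.synchronize` as `sync` is an assumption about the setup, not something established in this thread.

```python
import statistics
import time


def seconds_per_iteration(step, iterations=10, warmup=2, sync=None):
    """Time `step` repeatedly and return (median, fastest, slowest) in seconds.

    `sync` is an optional callable that blocks until queued device work has
    finished, so that asynchronous kernels are included in the measurement.
    """

    def run_once():
        step()
        if sync is not None:
            sync()

    # Warm-up iterations are run but not recorded (first-call overheads).
    for _ in range(warmup):
        run_once()

    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)

    return statistics.median(timings), min(timings), max(timings)


if __name__ == "__main__":
    # Stand-in workload; replace with one training or sampling step.
    median, fastest, slowest = seconds_per_iteration(lambda: sum(range(100_000)))
    print(f"median {median:.4f}s/it (min {fastest:.4f}, max {slowest:.4f})")
```

Reporting the median alongside the fastest and slowest iteration makes it easier to tell a genuinely slow configuration from a single bad run.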

#

I'll poke inference in comfy later but generally, I'm pretty sure my 2.3+ipex and 2.8/2.9 performance was the same. haven't tried 2.8+ipex for inference

#

on windows my training speeds were about 25% slower?

lunar thicket Aug 13, 2025, 2:19 PM

#

This is why I was wondering about AI Playground moving off IPEX in the other channel. Feel like I am seeing mixed messages on IPEX vs native pytorch. But maybe it's not an issue for that workload

reef ivy Aug 13, 2025, 6:22 PM

#

2.8 ipex seems pretty new, might not have been out back then. Might only be linux and training as well.

#

Quick question for anyone who would know, does ipex eventually get upstreamed to pytorch or are they separate?

civic charm Aug 13, 2025, 7:47 PM

#

We might get a 2.9 release but 2.8 really is the last IPEX release.

#

It is pure PyTorch from now on.

lunar thicket Aug 13, 2025, 7:59 PM

#

Ahh well, that settles that 👍

bold owl Aug 14, 2025, 1:37 AM

#

civic charm We might get a 2.9 release but 2.8 really is the last IPEX release.

Wait, I know this is the ComfyUI thread, but does this affect Ollama serve also?

civic charm Aug 14, 2025, 1:38 AM

#

Isn't ollama c++ / llamacpp based?

#

IPEX is for PyTorch

earnest grotto Aug 14, 2025, 3:29 AM

#

inference speeds do look pretty bad though at a glance

#

2.15-2.2s/it for lumina 2 with 2.8+ipex, vs 1.6-1.7s/it with 2.9

#

1.8-1.85it/s for sdxl with 2.8+ipex vs 2.20-2.25it/s with 2.8

#

so yeah, substantially slower
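
The two comparisons above use different units (s/it for lumina 2, it/s for sdxl), so it can help to convert both to seconds per iteration before comparing. A small sketch of that arithmetic; taking the midpoint of each quoted range is my simplification, and the function names are illustrative only.

```python
def to_seconds_per_iteration(value, unit):
    """Convert a speed reading to seconds per iteration."""
    if unit == "s/it":
        return value
    if unit == "it/s":
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit!r}")


def midpoint(low, high):
    """Midpoint of a quoted range (a simplification of the reported numbers)."""
    return (low + high) / 2.0


def slowdown_percent(slow, fast):
    """How much longer `slow` takes per iteration than `fast`, in percent."""
    return (slow / fast - 1.0) * 100.0


# lumina 2: 2.15-2.2 s/it with 2.8+ipex vs 1.6-1.7 s/it with 2.9
lumina = slowdown_percent(
    to_seconds_per_iteration(midpoint(2.15, 2.2), "s/it"),
    to_seconds_per_iteration(midpoint(1.6, 1.7), "s/it"),
)

# sdxl: 1.8-1.85 it/s with 2.8+ipex vs 2.20-2.25 it/s with 2.8
sdxl = slowdown_percent(
    to_seconds_per_iteration(midpoint(1.8, 1.85), "it/s"),
    to_seconds_per_iteration(midpoint(2.20, 2.25), "it/s"),
)

print(f"lumina 2: 2.8+ipex is about {lumina:.0f}% slower per iteration")
print(f"sdxl:     2.8+ipex is about {sdxl:.0f}% slower per iteration")
```

With midpoints, that works out to roughly a third more time per iteration for lumina 2 and roughly a fifth more for sdxl.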

earnest grotto Aug 14, 2025, 3:57 AM

#

civic charm Isn't ollama c++ / llamacpp based?

apparently go

#

well, nonetheless, no ipex for go either

lunar thicket Aug 14, 2025, 12:36 PM

#

earnest grotto 2.15-2.2s/it for lumina 2 with 2.8+ipex, vs 1.6-1.7s/it with 2.9

1.6-1.7 with 2.9 seems good?

earnest grotto Aug 14, 2025, 12:38 PM

#

It can get down to 1s/it with compile, probably even faster with other things like int8 if comfy had some support for that like sdnext does
though something's wrong with compile and if I keep changing resolutions, or... Prompt? It adds 100ms every time and gets slower and slower and I've even reached 3s/it. weird stuff.

bold owl Aug 14, 2025, 1:00 PM

#

civic charm IPEX is for PyTorch

Just going off the web site https://github.com/intel/ipex-llm?tab=readme-ov-file, wasn't sure.

earnest grotto Aug 14, 2025, 3:55 PM

#

IPEX's discontinuation is with the goal of Intel putting the optimizations (or other features) directly into Pytorch
ipex-llm will continue to exist in some form I'm sure. it's also weird, not sure how exactly they integrate it into ollama/llamacpp and it seems you need custom intel builds to use it? 🤔

#

ipex-llm apparently existed prior to ipex as bigdl. still based on pytorch, among other things? so, it can become bigdl again, who knows

civic charm Aug 14, 2025, 4:07 PM

#

i still don't understand why it was renamed to ipex-llm

#

It doesn't really have much to do with ipex

somber trellis Aug 14, 2025, 9:31 PM

#

ipex versions on windows always seemed to be significantly slower and less operable than the normal pytorch nightlies for xpu

#

I can't run wan 2.2 properly on 2.8.1+ipex but I can with the 2.8/2.9 nightly

#

there are exceptions though

#

index tts runs better with the ipex version than the non-ipex version

somber trellis Aug 14, 2025, 10:39 PM

#

current nightly's torchaudio isnt working for some reason so I'm on the stable 2.8 release

nocturne fjord Aug 15, 2025, 7:18 AM

#

Does anyone know if this model could work on an Intel GPU? https://huggingface.co/tencent/Hunyuan-GameCraft-1.0

tencent/Hunyuan-GameCraft-1.0 · Hugging Face

earnest grotto Aug 15, 2025, 7:24 AM

#

With block swapping, most likely
What I'm more concerned with is if it's actually supported in Comfy and how you will feed its inputs, especially given it will be nowhere near real time

nocturne fjord Aug 15, 2025, 10:54 AM

#

earnest grotto With block swapping, most likely What I'm more concerned with is if it's actuall...

I think ComfyUI is not suitable for this kind of model. I mean, real-time rendering is not in the "spirit" of ComfyUI.

earnest grotto Aug 15, 2025, 10:56 AM

#

This is not going to be anywhere near realtime on any consumer hardware

#

let alone an intel gpu

#

The model is tested on a machine with 8GPUs.
Minimum: The minimum GPU memory required is 24GB but very slow.
Recommended: We recommend using a GPU with 80GB of memory for better generation quality.

#

They most likely recommend 80GB not because you need 80GB but because those 8 GPUs are 8 H100s

somber trellis Aug 15, 2025, 7:37 PM

#

@earnest grotto Isn't there another worldmodel released that they distilled for consumer gpus?
I need to find it.

#

https://x.com/TencentHunyuan/status/1956309202437296538

#

Found it.

#

Doubt it will run well on our hardware though.

#

not gamecraft tho

earnest grotto Aug 15, 2025, 7:54 PM

#

people will optimize it

#

it just won't be fast enough, at least on intel

#

i'm sure there's always a guy with a 5090 around the corner

reef ivy Aug 15, 2025, 11:37 PM

#

I wonder if intel could ever get something like sage attention?

earnest grotto Aug 16, 2025, 10:53 AM

#

Technically possible for regular sage attention
2 and 3 use fp4/int4 hardware, speedup comes from that which I don't think current Intel GPUs have? So, on that basis, those won't be around for current gen. Maybe celestial though, I'd expect them to have 4 bit hardware
Realistically...? As things are right now, I don't see it happening

earnest grotto Aug 16, 2025, 11:11 AM

#

Decided to test out Wan 2.2 with some anime. Honestly not as bad as I was expecting

#

about 343 seconds per

#

incl. 4-step lora

reef ivy Aug 16, 2025, 3:17 PM

#

I think only 50 series has 4bit support but older nvidia still get sage attention 2 speedup i believe. Could be int4 but I think Intel can do that? Probably wrong about that though

#

Wan seems pretty much on par with, or really close to, paid models. For anime it seems good from what I have seen, though it might need loras or finetunes, not sure

earnest grotto Aug 16, 2025, 3:24 PM

#

it needs loras because I do not intend on waiting for 1500 seconds instead

#

(25 minutes)

earnest grotto Aug 16, 2025, 3:30 PM

#

reef ivy I think only 50 series has 4bit support but older nvidia still get sage attentio...

I could be wrong on int4. don't remember

civic charm Aug 16, 2025, 5:15 PM

#

reef ivy I think only 50 series has 4bit support but older nvidia still get sage attentio...

RTX 2000, RX 7000 and anything after them has INT4
RTX 5000 series removed INT4 support and added FP4 instead

#

A770 does have INT8 and INT4 too

#

But INT8 via onednn / mkldnn quantized matmul (what pytorch uses) runs 2x slower than 16 bit for some reason

#

OneDNN is for the GPU and MKLDNN is for the CPU but the behavior is exactly the same on both

#

CPU runs INT8 2x slower than FP32 with MKLDNN
GPU runs INT8 2x slower than BF16 / FP16 with OneDNN

#

The A770 is supposed to run INT8 2x faster than 16 bit, not 2x slower

earnest grotto Aug 16, 2025, 5:33 PM

#

civic charm But INT8 via onednn / mkldnn quantized matmul (what pytorch uses) runs 2x slower...

that's pretty sad

somber trellis Aug 16, 2025, 11:37 PM

#

earnest grotto Decided to test out Wan 2.2 with some anime. Honestly not as bad as I was expect...

I have been messing with wan 2.2 a14b myself for a while now

#

on arc, I've been getting better results using the older lightx2v lora at 3 strength at high noise and 1 strength at low noise

#

horror btw ^

#

these gens however are 8 steps, taking 3 minutes and 36 seconds per sampling pass, so 7 minutes 12 seconds for inference alone, not including clip text encode or prompting (if you're using an llm like i am)

bold owl Aug 17, 2025, 2:16 AM

#

somber trellis these gens however are 8 steps, taking 3 minutes and 36 seconds per inference, t...

How interesting is the workflow?

somber trellis Aug 17, 2025, 3:13 PM

#

bold owl How interesting is the workflow?

It's actually just the native comfyui workflow with gguf and lora loaders

#

one lora is used

#

the old wan 2.1 lightx2v lora

#

strength 3 on high noise, strength 1 on low noise

somber trellis Aug 17, 2025, 10:04 PM

#

upbeat crow Aug 17, 2025, 11:01 PM

#

somber trellis

please keep it to 2 at a time, automod doesnt like it if its too much. Sorry it took this long to untime out

somber trellis Aug 17, 2025, 11:16 PM

#

upbeat crow please keep it to 2 at a time, automod doesnt like it if its too much. Sorry it...

no its fine

#

👍

#

It already warned me and I ignored it

#

lmao

#

I've mainly been rather interested in messing with wan though

#

It's really quite a good local model

reef ivy Aug 18, 2025, 12:14 AM

#

Wan is amazing, can't wait to try out all the new stuff, haven't been able to mess around with it since before the fusionx finetune was released months ago.

earnest grotto Aug 18, 2025, 3:34 PM

#

I increase resolution from 400k pixels to 500k pixels and time more than doubles from roughly 5 minutes to 11+ minutes. yeesh.
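
One rough way to sanity-check that jump is to compare it with simple scaling laws. The sketch below only does the arithmetic on the numbers quoted above (400k to 500k pixels, roughly 5 to 11 minutes). Treating run time as a power of pixel count is an assumption of mine, and the helper names are illustrative.

```python
import math


def predicted_time(time_before, size_before, size_after, exponent):
    """Run time expected if time scaled as (pixel count) ** exponent."""
    return time_before * (size_after / size_before) ** exponent


def scaling_exponent(size_before, size_after, time_before, time_after):
    """Exponent that a pure power law would need to explain the observed change."""
    return math.log(time_after / time_before) / math.log(size_after / size_before)


pixels_before, pixels_after = 400_000, 500_000
minutes_before, minutes_after = 5.0, 11.0  # approximate figures from the message

linear = predicted_time(minutes_before, pixels_before, pixels_after, 1)
quadratic = predicted_time(minutes_before, pixels_before, pixels_after, 2)
observed = scaling_exponent(pixels_before, pixels_after, minutes_before, minutes_after)

print(f"linear scaling would predict    {linear:.2f} min")
print(f"quadratic scaling would predict {quadratic:.2f} min")
print(f"observed change needs an exponent of about {observed:.1f}")
```

Even quadratic scaling would predict well under 11 minutes for a 25% increase in pixels, so the slowdown is steeper than compute cost alone would suggest under this simple model. The thread does not establish the cause.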
