#ComfyUI for Intel Arc using IPEX

signal patio
#

changed nodes from pulid to pulid-flux-enhanced and now it works

#

how can i fix this?

earnest grotto
#

run my script again to update the hijacks

signal patio
#

also is it possible to run wan 2.1 on arc a770 16gb?

reef ivy
#

it's possible

#

Still no torch.compile on Windows. Do you guys think the oneAPI basekit will be required again whenever it becomes functional? I'm not 100% sure, but it seemed like the speed went down when calling setvars. Speed seems the same now.

reef ivy
#

added skip layer guidance and enhance a video to workflow, runs even faster with better quality.

#

for some reason wan loves slow motion for this image

signal patio
reef ivy
#

Well, no sageattention or torch.compile (if on Windows). You can use the workflow in the png I just posted if you want to

#

I am on an a750 so you probably could use a larger model if you want to.

quartz kelp
#

Torch 2.8.0.dev has a good performance improvement in SDXL speed on my A580 with comfyui. I can now generate a 1024x1024 image in 17 seconds instead of 35+ seconds.

ripe pivot
#

still very slow for i2v

reef ivy
#

Although that is way slower than me

#

Are you using my workflow?

ripe pivot
#

yeah

reef ivy
#

B580 right? Thats crazy slow, what size model are you using?

#

I use Q4_k-s

ripe pivot
#

Q8. Surely it wouldn't slow it from 10 minutes per vid to 10 minutes per it, right?

reef ivy
#

Also what length?

reef ivy
#

Try different models, maybe q6 or q5? Also fp8 is the same size as q8 and could be faster maybe

ripe pivot
#

Also, does everything have to be in sync with each other? I think my text encoder is something like fp16.

reef ivy
#

Try a smaller model maybe, try each step lower till you get decent speed. If all are slow then maybe something is up with drivers or b580.

reef ivy
#

yeah that may be it, I am using the fp8 encoder, it's probably half the size

#

don't forget that 12gb vram is still small for AI, it's better than 8 but you will still need quants for most stuff

#

go for best quality that can fit on your system

#

in my workflow the clip is put straight onto the cpu so the model itself uses the vram instead. If you get a model that can fit into your 12gb it should be faster.
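A rough way to sanity-check which quant might fit: weight size is roughly parameter count times bits per weight. The sketch below assumes a 14B-parameter model (the Wan 2.1 14B size mentioned elsewhere in this thread) and approximate bits-per-weight figures for the GGUF formats. Those figures and the 2 GB headroom are assumptions for illustration, not measured values, and real files also carry metadata and some higher-precision layers.

```python
# Rough weight-size estimate: parameters x bits per weight / 8.
# The bits-per-weight values are approximations (assumption).
APPROX_BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "fp8": 8.0,
    "Q8_0": 8.5,    # 8-bit values plus a per-block scale
    "Q4_K_S": 4.5,  # approximate
}


def estimate_weights_gb(num_params: float, fmt: str) -> float:
    """Approximate size of the weights in GB (10^9 bytes)."""
    return num_params * APPROX_BITS_PER_WEIGHT[fmt] / 8 / 1e9


def fits_in_vram(num_params: float, fmt: str, vram_gb: float,
                 headroom_gb: float = 2.0) -> bool:
    """True if the weights plus a hypothetical working headroom fit in VRAM."""
    return estimate_weights_gb(num_params, fmt) + headroom_gb <= vram_gb


if __name__ == "__main__":
    for fmt in APPROX_BITS_PER_WEIGHT:
        size = estimate_weights_gb(14e9, fmt)
        print(f"{fmt:7s} ~{size:5.1f} GB, fits in 12 GB: {fits_in_vram(14e9, fmt, 12.0)}")
```

Anything that does not fit gets offloaded, which is usually where the big slowdowns come from, so stepping down one quant at a time is a reasonable way to find the limit.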

reef ivy
#

what sampler are you using? Last I checked fastest I got was 1.3 s/it with sdxl

#

You are on a580, right, not b580?

somber trellis
quartz kelp
reef ivy
#

I will try this in a little bit, I expected the a770 to be faster but not the a580.

reef ivy
#

nope, with that workflow it's even slower for me. I wonder if it's the model you all are using, my sdxl models are old as dirt.

#

also, could be drivers. What are you all on? The latest?

#

with wavespeed added I can get 1.3it/s, still slower than you all get lol
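Both s/it and it/s show up in this thread, and since they are reciprocals the same number means very different speeds. A small helper for comparing reports, as a sketch (the function names are made up for illustration):

```python
def to_seconds_per_iteration(value: float, unit: str) -> float:
    """Normalise a speed report to seconds per iteration."""
    if value <= 0:
        raise ValueError("speed must be positive")
    if unit == "s/it":
        return value
    if unit == "it/s":
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit!r}")


def total_seconds(value: float, unit: str, steps: int) -> float:
    """Estimated sampling time for a given number of steps."""
    return to_seconds_per_iteration(value, unit) * steps


if __name__ == "__main__":
    # The same "1.3" reads very differently depending on the unit.
    print(f"1.3 s/it, 20 steps: {total_seconds(1.3, 's/it', 20):.1f} s")
    print(f"1.3 it/s, 20 steps: {total_seconds(1.3, 'it/s', 20):.1f} s")
```

This only covers the sampling loop. Model loading, text encoding and VAE decode add to the wall-clock time of a full generation.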

signal patio
# signal patio correct?

@earnest grotto plz help, im scared if i do something wrong the whole comfy will stop workin lol

earnest grotto
#

Please install inside your user folder, not in C:

signal patio
#

its trying to install new copy

#

in my user profile folder

signal patio
earnest grotto
#

go into the comfy folder in ComfyUI, delete ipex_to_cuda, run the installer script again.

#

or git clone https://github.com/Disty0/ipex_to_cuda into that folder.

signal patio
#

is there any way to speed up?

reef ivy
#

Right above your first post

#

@somber trellis @quartz kelp what driver are you guys on?

signal patio
reef ivy
#

should just be able to click the image, right click and choose save as.

#

you can also try and download the mp4 video

#

I think it should save the workflow as well

signal patio
quartz kelp
reef ivy
#

latest drivers I am even slower now lol.

somber trellis
#

im also currently on 6647

reef ivy
#

just tried latest and 6647, both are slower. Latest driver had my clocks going all over the place also

#

guess this memory issue is only on the a750 and not the a580...:(

quartz kelp
reef ivy
#

dpmpp_sde karras is almost 200% slower lol

#

euler is about .20s/it slower

#

--bf16-unet --use-pytorch-cross-attention --disable-ipex-optimize --reserve-vram 7.0

#

maybe reserve vram needs to change for the latest drivers

#

since I'm swapping drivers, I may try a real old one from pre-battlemage era

#

although i may have to reinstall arc control dunno

#

seems reserve-vram 6 takes me back to 1.3s/it. Probably would slow down wan, but speed up smaller models maybe?
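Since the best --reserve-vram value seems to differ per card, driver and model, it can help to generate the launch commands for a few candidates and try them one by one. A minimal sketch, using only the flags quoted above. The list of values to try is just an example, and `main.py` is assumed to be in the current ComfyUI folder:

```python
import sys

# Flags copied from the launch line quoted in this thread.
BASE_FLAGS = ["--bf16-unet", "--use-pytorch-cross-attention", "--disable-ipex-optimize"]


def build_launch_command(reserve_vram_gb: float, python_exe: str = sys.executable) -> list:
    """Build the ComfyUI launch command for one --reserve-vram value."""
    return [python_exe, "main.py", *BASE_FLAGS, "--reserve-vram", str(reserve_vram_gb)]


if __name__ == "__main__":
    for value in (4.0, 6.0, 7.0):  # example values, adjust for your card
        print(" ".join(build_launch_command(value)))
```

Each printed line can be pasted into a command prompt with the environment activated. Running the same workflow twice per value, and ignoring the first run, gives a fairer comparison.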

reef ivy
#

Interesting, at reserve 4.0 i am at 1.05s/it with sdxl

#

And wan speed slows by double.

ripe pivot
#

wonder if I should upgrade to 6651

earnest grotto
#

Use 6314.

ripe pivot
#

even for windows?

earnest grotto
ripe pivot
earnest grotto
#

On Linux, the relevant versions of things are the kernel, the intel compute runtime, and probably others. WSL has a small set of specific kernels it works with

#

You really don't sound like you're on Linux, so just use 6314.

ripe pivot
#

to install, I just download the particular driver and execute it right? I don't have to DDU or something like that

reef ivy
#

If it affects speed I may roll back myself tbh

earnest grotto
#

But uh... On Linux, the driver has started to completely crash if I'm doing anything in Blender, so for now I'm sitting on windows till I finish the thing I wanna model

ripe pivot
#

can't downgrade. ugh, I guess I'll just stick on 6647. I don't really want the DDU hassle just to gamble for speed

reef ivy
#

Issue I am seeing atm, is my version of level zero is 1.20.1 and the closest on there is 1.20.2

ripe pivot
earnest grotto
# ripe pivot b580

6314 is for alchemist only, not battlemage. If you're on battlemage, the oldest driver with support is the one after 6319. Driver quality dropped since battlemage released, I don't think there's much you can do for now

reef ivy
#

you may be better with latest, and if you have any issues then roll back

earnest grotto
#

if you want more speed, use AIPG's comfy, it has a special battlemage build of ipex for 2.3, as battlemage hadn't released back then.

reef ivy
#

Although, not sure of compatibility with newer models and nodes

reef ivy
#

I appear to have gotten torch.compile to work in windows, although not sure if there is a benefit.

#

Inference time before torch.compile for iteration 9: 216.99881553649902 ms Inference time after torch.compile for iteration 9: 203.9179801940918 ms
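To put those two timings in perspective, the relative gain can be worked out directly. A quick sketch using the numbers from the message above:

```python
def percent_faster(before_ms: float, after_ms: float) -> float:
    """Percentage of time saved per iteration after a change."""
    return (before_ms - after_ms) / before_ms * 100.0


if __name__ == "__main__":
    before = 216.99881553649902  # ms, before torch.compile
    after = 203.9179801940918    # ms, after torch.compile
    # Roughly 6% less time per iteration for this test.
    print(f"{percent_faster(before, after):.1f}% less time per iteration")
```

That is a single iteration of one small test, so it says little about how larger models behave.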

reef ivy
#

this will only work with 2.8 xpu nightly with the triton xpu

#

If there is a way to download the level zero stuff with the windows installer of oneapi, I couldn't find it.

reef ivy
#

Seem to get between 15% and 25% speed increase. But lots of trial and error with all the comfy nodes and settings getting it to work

somber trellis
#

My speeds seem to be worse with compile than without.

reef ivy
#

doesn't seem to work great with sdxl, it is working with Wan though. A little speed boost in Flux as well

#

also, different compile nodes seem to work better than others, teacache compile node with inductor and default seemed to work well for flux

#

Although I am probably not on the latest nightly, i am still on the one from yesterday

#

I think the bigger the model the more difference it makes

#

for wan I use compile on transformer blocks only, which helped speed it up faster than normal I think.

#

Wan has gone from about 12min with my workflow to as low as 8min with an average of around 9 with torch compile

somber trellis
#

It seems the best one is wavespeed's compile node.

reef ivy
#

I will try that one again, I had trouble with it and trying to change settings

somber trellis
#

specifically "Compile Model+"

#

The one with the default of inductor, not velocator for the backend

reef ivy
#

I think wavespeed as a whole is better than teacache, but dev doesn't update often so it's old

somber trellis
#

Honestly I'm not even sure if the wavespeed compile node is even working...

reef ivy
#

if it's faster (or slower) it is.

#

If using sdxl, honestly I didn't find anything that helped the speed other than like wavespeed or teacache, i think the model is too small for it to help and might hurt it instead.

#

Flux got a small gain, and Wan can get a big gain

reef ivy
#

Using a couple techniques for speed, and only did 20 steps so could improve with more. Haven't tried a 50 step output yet

rustic sonnet
#

Unrelated to Comfy but since Pytorch nightly now installs Triton for windows, did anyone try writing any kernels with it?

reef ivy
#

That's way above my knowledge, but I did manage to get compile to work

rustic sonnet
earnest grotto
#

I think @somber trellis ran into that somewhere up in the chat?

#

Was it activating the basekit or what

reef ivy
#

Download the level zero files and add them to the oneapi compiler folder, then call oneapi setvars like normal.

reef ivy
#

As far as speed, it only seems to increase with larger models. But it does work

somber trellis
#

just drag and drop the lib and include folders into C:\Program Files (x86)\Intel\oneAPI\compiler\latest as aaron said before

#

torch compile doesn't seem worth messing with just due to the compilation time it takes, and the errors it needs to error out

#

what would've been better is getting torch 2.6.10+IPEX compilation to work with compile nodes

rustic sonnet
#

Thanks for the help, i guess my dream of writing Triton kernels in windows has to wait

reef ivy
#

I tried adding the visual studio c++ compiler to path like nvidia users do but it only worked with oneapi

rustic sonnet
reef ivy
#

I think at the moment it is still in infancy, I think full support will come. Tbh, I think Intel is the only one with official triton wheels for windows

rustic sonnet
#

Yea at least officially it seems to be only intel. There are forks that work on Windows for Nvidia tho

reef ivy
#

Yeah, they also get sage attention

#

I am hoping for flash attention soon (i hope it's faster than the pytorch attention anyway). I found some posts on github of it being worked on

rustic sonnet
#

Not yet in Pytorch tho

reef ivy
#

Nice

#

I think it also may be in OpenVINO

#

Not sure though

somber trellis
#

i think we'd all be using it

reef ivy
#

OpenVINO has issues though. Last I checked it used much more resources than ipex.

#

Would love to try it in comfy, i believe there was a fork years ago

somber trellis
#

Well they're definitely trying.

#

i dont think torch compile likes ipex_to_cuda hijacks

reef ivy
#

Some settings don't work in the custom nodes in comfy

#

inductor and default or reduce overhead seem to work.

#

max-autotune-no-cudagraphs can pull errors

somber trellis
#

teacache also has eager and aot_eager

reef ivy
#

Eager will work, but I think it is slower than inductor. Also if you have the option to compile on transformer blocks only, i think that helps with speed. I only see it on the wan nodes though.

#

If you have wan installed I think you will see the difference (on second gen as the compile adds to the time). Flux as well should get a speed bump
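Because the first generation includes the compile time, a fair comparison needs at least one warmup run that is not counted. A minimal timing sketch along those lines. `benchmark` is a made-up helper, and the callable passed in is assumed to block until the work is really finished (on a GPU that usually means synchronising the device before returning):

```python
import time
from statistics import mean


def benchmark(fn, warmup: int = 1, runs: int = 3) -> float:
    """Average wall-clock seconds per call, ignoring the warmup calls."""
    for _ in range(warmup):
        fn()  # first call(s) include one-time costs such as compilation
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return mean(timings)


if __name__ == "__main__":
    # Stand-in workload; replace with the thing you actually want to time.
    def workload():
        sum(i * i for i in range(200_000))

    print(f"average: {benchmark(workload):.4f} s per run")
```

Comparing the same workflow with and without the compile node this way shows whether the gain survives once the compile cost is out of the picture.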

#

if using a lora, you will need the model patch order node from kj nodes

reef ivy
#

For flux, you don't need the model patcher node for lora if using the teacache compiler I don't think. Although with a detailer lora the output is different, but i think it's still working. The patcher node just causes massive slowdown for me.

rustic sonnet
reef ivy
#

Is there a hardware limitation with alchemist?

#

looks like no speed increase for us, wish we could get the ipex 2.1 speed back 😦

reef ivy
#

Yeah, looks like it's a hardware limitation with lack of flash attention support altogether.

tiny bolt
#

i dont think its a hardware limitation

#

thats why first movers can hardly be surpassed. researchers do it for free for you

civic charm
#

only hardware dependency of flash atten is wmma / tensor cores

#

a770 has tensor cores

#

i don't see any reason why it is a hardware limit

civic charm
#

Welp, being a 32 bit gpu hits here too

reef ivy
somber trellis
#

after many hours of reinstalling windows 11 and all of my packages

#

upscale model no longer crashes me anymore (again)

#

I wish I knew why it happened. What I find odd is this "Display Port Lost" blackscreen happened during upscaling and randomly while running certain games.

reef ivy
#

Crashing might be memory related, the display port I dunno.

somber trellis
#

Its a full on blackscreen, pc becomes entirely unresponsive until restarted

reef ivy
#

Do you see any ram spikes before it happens? I've had the system freeze when ooming before, usually it's going into page file. (not sure why it would do that on an a770 though).

somber trellis
reef ivy
#

triton was updated in nightly, dunno what has been changed though.

ripe pivot
#

oh new driver actually improved the it/s, nice I guess

reef ivy
ripe pivot
#

6647 to 6651, I'm just waiting for whql really

reef ivy
#

anybody know if this torch.compile error is fixable torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised: RuntimeError: self and mat2 must have the same dtype, but got Half and Float happens when using the fp8 weight type with some wan2.1 models when using the load diffusion.

civic charm
#

That error is because torch.compile is bypassing my ipex hijacks

#

And ComfyUI errors out without my hijack

reef ivy
#

Would that just be torch.compile altogether or maybe because it's a custom node?

civic charm
#

it is from the node, it shouldn't need my hijacks in the first place

reef ivy
#

thanks, I found another compiler node that works. My guess is that one of the settings I have set probably bypasses comfy settings

verbal crow
#

hey guys, could you tell me how the B580 performs on AI? I made a topic here #1354844590519353384, but some people told me to ask here

#

if you could answer me there it would be nice

earnest grotto
# ripe pivot b580

Would you like to run just a basic sdxl clip->ksampler->vae decode thing, 1024*1024, etc., and tell us your performance and pytorch version?

late glen
#

anyone on the 2.8 nightly release for pytorch?

ripe pivot
earnest grotto
#

If you don't know how that works - open the image in browser, download it, and drag and drop the file over comfyui in your browser

#

then run

#

With 6647 on windows, with my A770 16GB, the slower pytorch 2.6, I get 1.22it/s with this workflow

ripe pivot
earnest grotto
#

Should also be able to download it with comfyui manager

#

I'll be off

#

I should switch back to 2.3, forgot i was using 2.6 on windows

#

yucky

ripe pivot
verbal crow
ripe pivot
verbal crow
#

interesting

ripe pivot
#

the latest driver really improves the gen speed

verbal crow
#

im looking to get a b580 and i want to know how fast it is on ai, that is why i asked

#

that is 1024x1024?

ripe pivot
#

yes

ripe pivot
#

but hey, b580 is amazing when it works. you just have to expect it to not just works™

verbal crow
#

inspiring 💀

#

i would kill for 1it/s 😭

#

im on gtx 1650

ripe pivot
verbal crow
#

omg finally some numbers 😭 🙏

#

i was going to get a 3060, but looks like my mind is locked on the b580 100% now

ripe pivot
#

well keep in mind that the webUI that they're using is probably different. but even with that 27 secs instead of 12 sec probably is better, strictly for image genning tho.

verbal crow
#

yea

#

anything is easily better than my current gpu

#

i could get a 3060 for around the same price of the b580 here. the 3060 is nvidia, so it would be easier to use ai stuff, but the power of the b580 over it makes me choose the b580

ripe pivot
#

I mean sure, but it won't be convenient using comfyUI if you're used to regular A1111 and its derivatives. and not to mention it's kinda sus if you're gaming with intel on older games (I've got to use vulkan for dota2 because it dips to 10 fps or so on dx11 lol)

verbal crow
#

i dont mind having some extra steps running stuff and i dont play older games, so i guess im good

reef ivy
# verbal crow i would kill for 1it/s 😭

b580 will be a major upgrade over that lol. Only thing is the setup will be a little more difficult if you're new, but Vik's script and Intel AI Playground are really good to get started.

#

12gb will be really good for entry level too, i mean I am on 8 and getting by (barely lol)

earnest grotto
#

1.85it/s with 2.8

earnest grotto
#

a 16gb one would be good too for your case

muted dagger
ripe pivot
#

Sdxl

#

Uh google directs me to msi article iirc

muted dagger
#

very wrong numbers

#

there is no way in earth 3060 can pass 3080

ripe pivot
#

Eh, they probably ran the test in a way that bloated it so that it needed 12 GB of VRAM minimum

muted dagger
#

i think so

#

because i have 2080ti and 3080, and my friend had a 3060 and its not even close in flux

#

it takes 45 seconds with 25 steps with 2080 ti and around 1 and half minutes in 3060

#

I wish Gpu comparison channels also did tests with comfyui

ripe pivot
#

It's a pretty niche interest. You're probably better off finding the benchmark comparisons in east asian videos ironically (provided that you can type the moonrune keyword) lmao

muted dagger
#

i guess they're using sdxl base with a refiner or something

signal patio
#

guys whats the best pytorch for windows? my setup is soooooo slow
sometimes its 4s/it but 90% of the time its 9-24
but im using same workflow every day

signal patio
earnest grotto
#

Flux is slow

signal patio
#

but how is it possible? yesterday it was 3.5s/it now it's 32

earnest grotto
#

restart pc

#

close things using vram

summer oxide
#

I have a question: is it still correct as of the end of March 2025 to follow the installation procedure in this post on a windows 11 24h2 + intel arc a770 16gb pc configuration? Or has something changed?

somber trellis
#

even flux 4_0 is slow

somber trellis
#

It does it for you.

earnest grotto
#

@summer oxide ^

buoyant gulch
#

This content is no longer available. I can't download the requirements.txt file now. Could someone kindly help me?

buoyant gulch
summer oxide
buoyant gulch
# summer oxide Thanks for the helpful advice!

But when I run this script, it returns this: Script version: 0.1.6p
Unknown or potentially incorrectly detected GPU:

Please report this issue.

I guess this script is for a discrete graphics card, but I have an integrated Intel chip( Thinkbook Pro 14 Ultra 225H), and I just want to use this card. How can I do it?

earnest grotto
#

The built-in windows driver updater does NOT work, if you were thinking of using that

#

Also, you can try AI Playground

#

Say your driver version
And open powershell, type in Get-WmiObject Win32_VideoController, press enter, and show what is shown

buoyant gulch
earnest grotto
#

Ok, I see the issue, I'll fix that, hold on

buoyant gulch
#

Thanks a lot!

earnest grotto
#

hmm

#

this is an arrow lake igpu, i wonder if it's supported at all

#

Get AI playground and tell me if it works

#

@buoyant gulch

#

wait i'm getting confused, it should be supported

buoyant gulch
#

Ok, playing this ground for the first time.

muted dagger
#

it says coming soon

earnest grotto
#

I updated the script, should work now for arrow lake

buoyant gulch
#

I installed it on D drive, is there a problem?

earnest grotto
# muted dagger

@buoyant gulch Arrow lake is not supported yet for AIPG, sorry

#

I updated my script. Download it again, try again with it

buoyant gulch
#

Anyway, try your script, it's only a problem of time……thanks!

buoyant gulch
#

Everything is OK except this file. Tried three times, still can't download this file……

earnest grotto
#

everything is probably fine and you can probably run it without issue, but show the whole list
also consider trying the experimental pytorch which will be much faster, but might not work

buoyant gulch
earnest grotto
#

hmm

earnest grotto
buoyant gulch
#

Nightly is installed successfully.

earnest grotto
#

It will install, the question is will it work afterwards

#

download some model and test

buoyant gulch
#

Entered 8188, and downloading sd1.5 model.

#

Thanks man! LexThumbsUp you are the goat!

reef ivy
earnest grotto
reef ivy
#

You can also add teacache or wavespeed, or the distilled flux

earnest grotto
#

wdym by distilled flux, it's already distilled

reef ivy
#

whoops, i am thinking of hunyuan. i think flux has a de-distilled model which is maybe slower lol

#

There are hyper flux and turbo flux LoRAs, you'd have to experiment to see which is better

#

@signal patio you can try this workflow, added teacache and torch.compile (on windows it requires extra steps to set compile up, but it only helps a little so it can be ignored). I have 2 sd upscalers as well if needed. Detail daemon could be bypassed as well for a little more speed but less detail. lora loader for multiple loras, 8 steps is for the turbo lora etc. Also launch comfy with --reserve-vram 7 (or try different numbers, they help with speed in windows and gguf models)

earnest grotto
reef ivy
#

updated to pytorch 2.6 as well

buoyant gulch
#

I can run Flux_pruned_fp32 (16.6G)on my ThinkBook Pro Ultra 255H, checkpoint version workflow in 3-4 minutes to generate this 896 X 1152 picture!

vestal jungle
#

I’ve got image generation working. Curious if there is a consensus on the most capable video workflow for a750

buoyant gulch
#

But after a while, I got an error and it exited. After restarting comfyui using python main.py, I got this: No module named 'yaml'. I created a new conda environment and reinstalled, still not working, same error.

earnest grotto
buoyant gulch
#

Can I install comfyui manager? I think there is no problem with it , because I run flux after installing comfyui manager and using it install a missing node.

earnest grotto
#

Sure. My script also gives you the option to install comfyui manager, and a few others, if you don't know how

buoyant gulch
#

I'm testing it. This laptop has USB4 and TGX ports, and I can use a GPU docking station. I will try to solve these minor things by myself. Thanks for your contribution to the community!LexThumbsUp

#

SDXL only needs 10 seconds, so it's a good choice for this laptop.

reef ivy
earnest grotto
reef ivy
#

Not personally but dan did a couple with the 1.3b model and they looked pretty good. Should have decent prompt adherence too

vestal jungle
#

Is LTX just not as good or not compatible with Intel cards?

reef ivy
#

Ltx is okay for closeup videos without too much movement. It's the fastest. The new 1.3b wan fun model may be better, it just came out yesterday but I haven't tried it yet.

#

Wan2.1 14b i2v videos with my current setup take about 10-12 minutes for 480p at 49 frames(3secs)

vestal jungle
#

I don't really have a frame of reference, so I guess that's good...?

#

I think I'm getting 720p still images with 20 steps at around 30 seconds. Not quite sure how that translates, but sounds about right. I assumed video would be a little easier and just the first image would be the most computationally intensive, but that was an assumption.

reef ivy
#

Nope, video is harder. 720p with an a750 if possible will probably take over an hour maybe much more.

#

and that's probably for like 2 seconds maybe, 33 frames. (with wan 14b).

#

The smaller models like 1.3 and LTX may be able to do 720p at a reasonable speed but lower quality

vestal jungle
#

Yikes. I'm sure I'll start with 480. I've noticed some prompts give really bad ugly results and others come out almost perfect for images. I used to think that was just the capability of the software, but finally learning how to do prompts a little bit. Haven't even stumbled through Workflows yet. LoL

reef ivy
#

Prompting is very important, especially for older models like sd1.5 and SDXL

#

flux does pretty good with natural language, LTX requires an LLM tbh

vestal jungle
#

Well...I'm downloading LTX now just to test it out and start with lower expectations. WDYM it needs a LLM? ...like to help write effective prompts? How so?

reef ivy
#

Yeah, they have one built into their workflow now called prompt enhancer. 0.95 is the latest model

#

I have my own custom one for my old ltx workflow that used ollama

#

Ltx can get decent results for simple up-close animations and non-complex movements.

vestal jungle
#

Well...that's if I could get it to install correctly after downloading 40+ GB. 😵‍💫

vestal jungle
#

At least it appears every other node and feature is fine except all the ones that start with LTXV

#

When I try ComfyUI Manager, I get "Failed to Clone Repo" and when I try manually it's missing several nodes. I wonder if something happened to the Git or if he's trying to monetize it now.

reef ivy
#

40gb for ltxv? That doesn't sound right.

vestal jungle
reef ivy
#

and missing nodes should be installable with the manager, unless the workflow you're using is old, as some nodes may have been removed for ltx.

muted dagger
#

which one is better at quality wavespeed or teacache? i really cant see major difference between them

reef ivy
#

I think wavespeed but its not compatible with wan yet and has way less frequent updates. If there is no major difference for you use the one thats fastest

somber trellis
#

Teacache is generally more stable with outputs for flux than wavespeed is. I've been getting blurry outputs using default 0.120 recommended threshold on the first block cache node for wavespeed.

#

Though I think both caching methods seem to have a chance to straight up fail on my end.

earnest grotto
#

I've had blurry flux outputs without either

#

I kinda doubt it's their problem

vestal jungle
reef ivy
vestal jungle
#

I guess I have some digging to do because it's been several weeks since I went through all that and I don't remember what I did. LoL! I think I used miniconda3, but I don't remember anything about how I created the environment. 😅
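If it's unclear which environment a given prompt is actually using, Python can report it. A small sketch using only the standard library. Note that `sys.prefix != sys.base_prefix` detects venv-style environments only, while a conda environment is easier to spot from the interpreter path that gets printed:

```python
import sys


def in_virtual_environment() -> bool:
    """True when running inside a venv (sys.prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Path of the running interpreter plus a hint about the environment type."""
    kind = "venv" if in_virtual_environment() else "base or conda interpreter"
    return f"{sys.executable} ({kind})"


if __name__ == "__main__":
    print(describe_interpreter())
```

Running this from the same prompt that launches ComfyUI shows where packages installed with pip will end up.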

reef ivy
#
```
python -m venv "name of your environment"
"environment name"\Scripts\activate
```
reef ivy
#

If so i think you can run it again, and control+c to cancel and it will exit into your environment

#

Also did you try install missing custom nodes in comfyui manager? It usually works

vestal jungle
#

I tried the manager. It doesn't find any missing nodes. When I load the I2V workspace I get the missing nodes error notice. The manager has been useless for me so far. 🤷‍♂️

#

I did end up using Vik's script during my original install.

#

I generally run the instance from windows with the start_lowvram.bat from either you or him or someone around here. LoL

vestal jungle
earnest grotto
#

Running batch files directly will cause that to happen
The shortcut my script makes launches it with extra arguments for the command prompt to not close (also helps in case comfyui crashes or something)

#

The alternative is to launch a command prompt yourself and run the batch file from that, rather than double clicking

vestal jungle
#

So from there I try .\scripts\activate ...I'm not a python dev. LoL

#

Do I run
cd custom_nodes/ComfyUI-LTXVideo && pip install -r requirements.txt
or
.\python_embeded\python.exe -m pip install -r .\ComfyUI\custom_nodes\ComfyUI-LTXVideo\requirements.txt
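Either form can work, but only if the pip that runs belongs to the same Python that launches ComfyUI. One way to avoid guessing is to go through the running interpreter itself. A sketch, assuming it is started from the ComfyUI folder inside the environment ComfyUI uses (the requirements path is the one from the message above):

```python
import sys
from pathlib import Path


def pip_install_requirements_command(requirements: Path) -> list:
    """Command that installs a requirements file into the *current* interpreter."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


if __name__ == "__main__":
    req = Path("custom_nodes") / "ComfyUI-LTXVideo" / "requirements.txt"
    cmd = pip_install_requirements_command(req)
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

If pip reports everything as already satisfied but the node still fails to import, the install most likely went into a different Python than the one ComfyUI runs with.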

earnest grotto
vestal jungle
#

Hmm. I just get requirement already satisfied since I already did this from a GitCMD instead of this environment before. Ugh.

earnest grotto
#

What nodes are missing, what workflow

#

Are there any custom nodes saying they failed to load

vestal jungle
earnest grotto
#

Show what comfyui prints in the console when you launch it

vestal jungle
#

The whole thing or just this part?
0.0 seconds (IMPORT FAILED): D:\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-LTXVideo

earnest grotto
#

Whole thing

#

if it's getting deleted by the bot, dm it to me

vestal jungle
earnest grotto
#

in the command prompt, in comfyui's folder

#

run again, say if it's still broken

vestal jungle
#

(D:\Comfy_Intel\cenv) D:\Comfy_Intel\ComfyUI>git pull
Updating 96d891cb..2d17d891
error: Your local changes to the following files would be overwritten by merge:
comfy/model_management.py
Please commit your changes or stash them before you merge.
Aborting

earnest grotto
vestal jungle
#

0.1 seconds: D:\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-LTXVideo
... but still says missing nodes in the GUI. Let me check something...

#

Manager still doesn't detect missing nodes from GUI. Bummer.

#

Let me see if the Lightricks Git workflows work instead of the ones from the comfyui-wiki

reef ivy
#

I think Ltx nodes got completely reworked so if the workflow is old those nodes may not exist

vestal jungle
#

It defaulted to a checkpoint I didn't even have instead of just being blank. v0.9.5 instead of my v0.9.0 took me a minute to spot. LoL

reef ivy
#

yeah, most workflows do that

vestal jungle
#

I assume workflows go in the user/default/workflows directory. That's where I put the ones from the wiki. I just left the ones from Lightricks in the custom nodes where it was installed and searched to open from there.

#

...maybe at this point I should move my conversation over to a ComfyUI channel or server. Thanks @earnest grotto for helping me get this resolved.

#

Just noticed my GPU is at 0%. When generating images it's usually 50-100%. Hmm Seems it probably crashed and froze.

reef ivy
#

Seems there is an issue with manager now where it can't auto detect some nodes, I think the "workaround" is to manually search for them and install them.

signal patio
#

@earnest grotto if the first run after boot is 3 s/it, the 2nd is 9 and the 3rd is 20+, is it because VRAM is full? tired of rebooting to run at normal speed. Do i have any alternative? thx

earnest grotto
reef ivy
#

--reserve-vram 7 or test other numbers

ripe pivot
#

Probably need to set clear vram nodes once or twice on the prompt

signal patio
ripe pivot
#

Either that's a typo or it is faster, which is normal since the first run after boot is slower due to loading the models etc

signal patio
ripe pivot
#

Yeah

signal patio
#

after fresh boot generation is fast
but when i try to run it again it becomes slower and slower
until 20s + /it

#

rebooting makes it fast again

#

clearing vram didnt help, unfortunately

reef ivy
#

I don't have his version installed at the moment, you can probably add it to the start up shortcut he has if you right click and edit. It might be in there already, if so you can play with the numbers. I think --reserve-vram 7 was good for a770

#

Or you can start it up manually in a command prompt from the comfy folder: activate the environment first (conda activate .\cenv, or .\cenv\Scripts\activate if it's a venv), then run python main.py --bf16-unet --use-pytorch-cross-attention --disable-ipex-optimize --reserve-vram 7.0. I think that's the name of the environment

earnest grotto
#

Edit the start_lowvram batch file to also have --reserve-vram 6.0 or whatever other number

lament shale
#

which option is better now? native pytorch 2.6 with xpu support or IPEX? I'm looking through the git repo and wondering if the native support is decent enough? (P.S. I'm using arc a750)

lament shale
#

So its better than ipex after all? Alrighty

lament shale
# earnest grotto Use 2.8

what's this error with ComfyUI manager? I cloned the repo in the custom_nodes section and am using a venv now.
`(comfyui_env) D:\Extra\ComfyUI>python main.py --bf16-unet
Failed to execute startup-script: D:\Extra\ComfyUI\custom_nodes\ComfyUI-Manager\prestartup_script.py / No module named 'async_timeout'

Prestartup times for custom nodes:
0.1 seconds (PRESTARTUP FAILED): D:\Extra\ComfyUI\custom_nodes\ComfyUI-Manager

Traceback (most recent call last):
File "D:\Extra\ComfyUI\main.py", line 134, in <module>
import comfy.utils
File "D:\Extra\ComfyUI\comfy\utils.py", line 20, in <module>
import torch
File "D:\Extra\ComfyUI\comfyui_env\lib\site-packages\torch\__init__.py", line 1002, in <module>
raise ImportError(
ImportError: Failed to load PyTorch C extensions:
It appears that PyTorch has loaded the torch/_C folder
of the PyTorch repository rather than the C extensions which
are expected in the torch._C namespace. This can occur when
using the install workflow. e.g.
$ python setup.py install && python -c "import torch"

This error can generally be solved using the `develop` workflow
    $ python setup.py develop && python -c "import torch"  # This should succeed
or by running Python from a different directory.`
#

where do I run $ python setup.py develop && python -c "import torch"

earnest grotto
#

install comfyui with my script

#

.

lament shale
earnest grotto
lament shale
#

okay, thanks bro. I guess I have to choose Nightly for torch 2.8

earnest grotto
#

Yes

ripe pivot
earnest grotto
#

if you installed the nodes the script lets you install, running it and choosing them again updates them as well

lament shale
# earnest grotto Yes

the command line inside the script doesn't include the --pre arg when choosing Nightly, so it installs 2.6 instead of the pre-release 2.8
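
A minimal sketch of what that flag changes, assuming the installer ends up calling pip. The helper `build_torch_install_cmd` and the nightly index URL are illustrative assumptions, not taken from the script itself; pip's `--pre` option tells it to also consider pre-release and development versions.

```python
import sys

# Index URLs are assumptions for illustration: the stable XPU wheel index
# and the nightly XPU wheel index.
STABLE_XPU_INDEX = "https://download.pytorch.org/whl/xpu"
NIGHTLY_XPU_INDEX = "https://download.pytorch.org/whl/nightly/xpu"


def build_torch_install_cmd(nightly):
    """Return a pip command line (as a list) for installing torch with XPU support."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if nightly:
        # --pre lets pip consider pre-release/dev builds such as 2.8.0.dev.
        cmd.append("--pre")
    cmd += ["torch", "torchvision", "torchaudio"]
    cmd += ["--index-url", NIGHTLY_XPU_INDEX if nightly else STABLE_XPU_INDEX]
    return cmd


# Print rather than execute, so nothing gets installed by accident.
print(" ".join(build_torch_install_cmd(nightly=True)))
```

Printing the command first makes it easy to see whether `--pre` is actually present before anything is installed.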

earnest grotto
#

hm, ok

#

fixed

lament shale
#

weird

earnest grotto
#

Then installs on top of that

lament shale
#

oh, I was confused for a second

#

can I change the torch version by rerunning the script after install?

earnest grotto
#

yes

#

I don't think there's much reason to switch off 2.8 unless you found some bug

#

With 2.3, I got 1.9it/s. 2.6, 1.22it/s. 2.8, 1.85it/s

#

2.5 is about the same as 2.6

#

You can use 2.3 if you want to test out my unfinished stuff for getting the 3dpack to work

lament shale
#

hm, it works fine and very fast. Around what Vik said. But i'm also getting this error:
from accelerate.utils.memory import clear_device_cache ImportError: cannot import name 'clear_device_cache' from 'accelerate.utils.memory' (D:\Extra\Comfy_Intel\cenv\lib\site-packages\accelerate\utils\memory.py)

ComfyUI works and runs just fine even with this

primal hatch
#

when i click run then facing this problem .... other models works fine but problem with the flux models

#

can anyone help me to solve this ?

earnest grotto
#

show command prompt that comfy is running in and say gpu

primal hatch
#
Checkpoint files will always be loaded safely.
F:\COOMFY UI\Comfy_Intel\cenv\lib\site-packages\torchvision\io\image.py:14: UserWarning: Failed to load image Python extension: 'Could not find module 'F:\COOMFY UI\Comfy_Intel\cenv\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
[W331 23:02:25.000000000 OperatorEntry.cpp:162] Warning: Warning only once for all operators,  other operators may also be overridden.
  Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::_cummax_helper(Tensor self, Tensor(a!) values, Tensor(b!) indices, int dim) -> ()
    registered at C:\Jenkins\workspace\IPEX-WW-BUILDS\private-gpu\build\aten\src\ATen\RegisterSchema.cpp:6
  dispatch key: XPU
  previous kernel: registered at C:\Jenkins\workspace\IPEX-WW-BUILDS\private-gpu\build\aten\src\ATen\RegisterCPU.cpp:30476
       new kernel: registered at C:\Jenkins\workspace\IPEX-WW-BUILDS\ipex-gpu\build\Release\csrc\gpu\csrc\aten\generated\ATen\RegisterXPU.cpp:2971 (function operator ())
ipex_init: (True, None)
Total VRAM 15931 MB, total RAM 32694 MB
pytorch version: 2.5.1+cxx11.abi
Set vram state to: LOW_VRAM
Device: xpu
Using pytorch attention
ComfyUI version: 0.3.27
ComfyUI frontend version: 1.14.6
[Prompt Server] web root: F:\COOMFY UI\Comfy_Intel\cenv\lib\site-packages\comfyui_frontend_package\static

Import times for custom nodes:
   0.0 seconds: F:\COOMFY UI\Comfy_Intel\ComfyUI\custom_nodes\websocket_image_save.py

Starting server

To see the GUI go to: http://127.0.0.1:8188
got prompt

(F:\COOMFY UI\Comfy_Intel\cenv) F:\COOMFY UI\Comfy_Intel\ComfyUI>C:\Users\Admin\AppData\Local\Packages\MicrosoftWindows.Client.Core_cw5n1h2txyewy\TempState\ScreenClip\{65D38567-438B-42EC-AD4E-7CE461080E66}.png```
earnest grotto
primal hatch
earnest grotto
# primal hatch

Get an older driver and try again
Also install comfyui using my script

#

^

primal hatch
earnest grotto
#

ok, get an older driver and try again

primal hatch
earnest grotto
#

yes

#

6314 or 6319 specifically

primal hatch
#

okay let me check !

earnest grotto
#

and say if anything changes

primal hatch
#

sure

#

there is no version of 6314 or 6319

earnest grotto
#

damn, too old now

#

hmm

earnest grotto
# primal hatch

Download the Q4 flux models (you can use my script again), use the workflow below the script, say if it still crashes

reef ivy
#

You can try and find older drivers on techpowerup. I hoard them because of this lol

reef ivy
primal hatch
#

@reef ivy

somber trellis
#

4 is the 2.8 nightly

#

3 is 2.6, stable but slower

#

2 is 2.5 ipex, even slower

#

and 1 being 2.3.110 is still the "fastest"

#

but supports the least things

earnest grotto
#

No, it installs 2.5 so it's 2.5

somber trellis
earnest grotto
#

It's not the latest IPEX

earnest grotto
ripe pivot
#

I'm pretty sure my internet works fine

earnest grotto
ripe pivot
earnest grotto
ripe pivot
#

do I just extend the ss?

earnest grotto
primal hatch
earnest grotto
earnest grotto
#

it installed 2.7.0 🤔

#

--pre it is, then

#

download script again, run again

#

0.1.8

ripe pivot
#

still got the same error

earnest grotto
#

Download script again, run again

ebon wolf
#

how to do it for intel arc 370M ?

hard kraken
#

Can a770 run flux.1 model? I tried the nf4 one, vram usage roofed to 22gb then system hung, but seems nvidia can run it with 4gb vram..

earnest grotto
hard kraken
#

So no need to pass different arguments running main.py?

#

Dev or Schnell or either

earnest grotto
earnest grotto
earnest grotto
#

turn off comfyui. open a command prompt in comfyui's folder
type
git checkout e1da98a
press enter. restart comfyui, say if it still crashes

reef ivy
#

There is actually bitsandbytes arc support now, but just not for nf4. I don't know if they ever plan to add it or not.

reef ivy
#

Heads up, latest Comfy updates removed reroute nodes, so a lot of workflows will be weird looking and probably won't work on older versions etc.

earnest grotto
#

hwut

#

what are they replaced with

#

surely comfyanonymous/litegraph didn't just... literally remove reroutes???

#

maybe just a bug?

hard kraken
earnest grotto
#

though i'm not sure, it's possible q5 or higher is better than nf4 anyways

reef ivy
# hard kraken Thanks. What kind of bitsandbytes are support? Maybe a link?

Don't think there is any functionality that would be useful in comfyUI. I have it installed just to stop the errors I was getting https://github.com/bitsandbytes-foundation/bitsandbytes/tree/multi-backend-refactor?tab=readme-ov-file#bitsandbytes-multi-backend-alpha-release-is-out

reef ivy
#

I find that for wan2.1 models GGUF is just too inconsistent with speed. It is slower in general, but sometimes it is much slower, like twice as slow. I've switched to fp8: even though it hits the RAM heavier, it is still faster and more consistent in speed.

civic charm
#

nf4 is supposed to run faster than fp16 / bf16

earnest grotto
#

hmm, need to find where i put that panhard crab pic i had masked out, new 3d model generator showed up

#

If anyone wants it

#

(Color if alpha is naively removed is black)

earnest grotto
#

trellis's hf space | triposg's hf space

#

last one flipped

#

gonna try without simplifying

#

Really not sure what to think. Need good retopo. Nvidia's thing looked great.

#

ran out of free. people are saying it's better with anime than trellis 🤔

#

might poke to see if it works locally later i guess

earnest grotto
wet barn
#

cmd

lament shale
# reef ivy Heads up, latest Comfy updates removed reroute nodes, so alot of workflows will ...

is that why some packages don't import on comfy? I've been getting this one since I installed 3 days ago
Cannot import D:\Extra\Comfy_Intel\ComfyUI\custom_nodes\comfyui-brushnet module for custom nodes: Failed to import diffusers.loaders.peft because of the following error (look up to see its traceback): cannot import name 'clear_device_cache' from 'accelerate.utils.memory' (D:\Extra\Comfy_Intel\cenv\lib\site-packages\accelerate\utils\memory.py)
And another package didn't work with the same memory.py error

earnest grotto
#

works without any additional hassle

#

first run took 90 seconds, different seed 72
peak vram usage with the example workflow was 12.3gb

reef ivy
reef ivy
# earnest grotto

Doesn't seem like any of these models would be good for complex geometry with moving parts like vehicles etc.

earnest grotto
#

It's complex and has all sorts of parts - antennas, cannons, whatever

formal tusk
primal hatch
#

@ripe pivot How did you solve this problem?

ripe pivot
#

running the newest script

primal hatch
ripe pivot
#

haven't tried flux

rustic sonnet
#

Huh

#. Both eager mode and torch.compile is supported. The feature torch.compile is also supported on Windows from PyTorch* 2.7 with Intel GPU, refer to How to Use Inductor on Windows with CPU/XPU <https://pytorch.org/tutorials/prototype/inductor_windows_cpu.html>_.

earnest grotto
#

nice

primal hatch
earnest grotto
ripe pivot
#

comfy_extras.chainner_models is deprecated and has been replaced by the spandrel library. I got UR error with this

primal hatch
earnest grotto
#

You did not use the "original flux models"

#

the fp8 ones are not that

#

fp8 is not the best quantization there is

#

it might be mildly faster (?) than the gguf ones

primal hatch
#

Q4 is around 6.3gb but i used a flux dev model which was around 22gb

earnest grotto
#

q6 probably gets close if not better. but really not sure about that

reef ivy
earnest grotto
#

you were just spending twice as long loading the model, to then convert it to fp8

reef ivy
#

There is an 11gb fp8 model out there

#

For flux, there is also a bigger fp8 as well

earnest grotto
reef ivy
#

As for quality I felt they were all basically the same down to q4 gguf, fp8 is fastest but --reserve-vram used to get back some speed for gguf

earnest grotto
#

which, even with an ssd, takes some time

primal hatch
#

Which one do you think has better image quality?

earnest grotto
#

probably q6, definitely q8

#

not like you are forced to use q4, especially if you have more vram

#

but also the differences get extremely minor

reef ivy
#

I feel like the loss is minimal down to q4. Its 95% the same quality as q8 and probably close to fp8

#

But use the biggest that can fit on your GPU with decent speed. Q8 or q6 probably for 16gb

#

Also try the reserve vram command for speed

primal hatch
#

what is that ? Every time I open comfy it shows that step by step 5 to 80

reef ivy
#

Q3 quants are where quality gets burnt, haven't seen a good q3 of any model ever.

#

Q2 should just not exist lol

reef ivy
#

Not 100% sure though

primal hatch
#

reverse vram

primal hatch
reef ivy
# primal hatch reverse vram

Add --reserve-vram 6 to the startup parameters. Play with the number from 4 up depending on your vram amount. Check the pinned comments here
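
A rough sketch of how the launch line could be assembled, using only flags quoted in this thread. The helper name `comfy_launch_args` is hypothetical, and the right reserve value depends on the card, so treat 6 as a starting point rather than a recommendation.

```python
def comfy_launch_args(reserve_vram_gb=None, lowvram=False):
    """Build the argument list for `python main.py` from flags quoted in this thread."""
    args = ["main.py", "--bf16-unet", "--disable-ipex-optimize"]
    if lowvram:
        args.append("--lowvram")
    if reserve_vram_gb is not None:
        if reserve_vram_gb < 0:
            raise ValueError("reserve_vram_gb must be non-negative")
        # ComfyUI takes the amount to keep free as a number of gigabytes.
        args += ["--reserve-vram", str(float(reserve_vram_gb))]
    return args


# Show the resulting command line for a 6 GB reserve.
print(" ".join(["python"] + comfy_launch_args(reserve_vram_gb=6)))
```

Changing the single number and re-launching is the whole tuning loop: raise it if generation slows down over time, lower it if models stop fitting.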

hard kraken
#

thank you vik. finally have flux gguf working. used about 9gb vram on Q5_0

reef ivy
#

Well that explains why it didn't make it faster lol.

signal patio
#

Can someone explain why i can't find missed nodes of ACE++ ?
tried manual install but no luck

primal hatch
#

2.3+ipex or nightly? which is better and faster ?

primal hatch
earnest grotto
#

or idk, maybe they build their torches during the day

#

but i doubt it / or it's day in your timezone

ripe pivot
#

I got this error on using hiresfix, does my ipex_to_cuda has a problem or what?

ripe pivot
#

I don't know man, it seems like you're better off asking directly on the github of whoever made the nodes instead

reef ivy
#

I check the git repository for commit updates and to see if anyone else has the issue for problem nodes

primal hatch
#

Need Help !

#

custom node import failed

#

@earnest grotto can you help me?

primal hatch
#

This problem has been happening since I updated comfy ui a few hours ago

signal patio
# signal patio Can someone explain why i can't find missed nodes of ACE++ ? tried manual instal...

i found what the problem was: requirements.txt is blank, it contains nothing. But in the same folder i found repo_requirements.txt and used it. My comfy setup crashed after that, so i decided to run @earnest grotto's script again. It worked, Comfy is back to life, but the ACE++ nodes are still not working...
The first error was - No module named 'scepter' so i put only this module name inside requirements.txt and it installed everything, but with 1 error again:

transparent-background 1.3.3 requires albucore==0.0.16, but you have albucore 0.0.23 which is incompatible.

@earnest grotto need your help boss

repo_requirements.txt:
huggingface_hub
diffusers
transformers
torch>=2.4.1
xformers>=0.0.27.post2
gradio>=4.44.1
scepter
ms_swift

hard kraken
#

curious i'm running flux1 gguf q5 on a770, speed's at 8.2s/it, that normal?

earnest grotto
hard kraken
#

using ur workflow,python main.py --disable-ipex-optimize ^
--lowvram --bf16-unet

#

torch is on 20250327 nightly

reef ivy
#

Also add --reserve-vram 6

earnest grotto
#

No, doesn't seem right, I'm getting ~2x faster speed

#

WITH things going on in the background

#

4.6s for q4, 5.6 for q5_1

earnest grotto
hard kraken
earnest grotto
#

This is not using my script

#

Or you are not launching using the shortcut it makes, or something else

hard kraken
#

workflow looks exactly the same? what shortcut?

earnest grotto
#

my script makes a shortcut, that I expect you to launch comfyui using

#

a shortcut that you can move around anywhere, put on your desktop, in your start menu, whatever

#

which you can not do with a batch script

#

or a python script

hard kraken
#

i run your blue balloon and got 7.3s

earnest grotto
#

.

#

It's the very first linked file in that message

hard kraken
#

i see the script, it's using the same python ./main.py --bf16-unet --disable-ipex-optimize --lowvram

#

so no need to reserve vram?

earnest grotto
hard kraken
#

so i need to use ipex_to_cuda?

earnest grotto
#

Yes

#

3.11 might also be slower, though i doubt it

hard kraken
#

I git cloned it. Do I need to run any .py to enable it?

shrewd plaza
#

is it normal for a flux Dev Q8 GGUF image to take ages to generate on Ultra 9 285K with Arc A770?

earnest grotto
shrewd plaza
earnest grotto
#

Generate at a lower resolution or use one of the smaller quants

shrewd plaza
earnest grotto
#

@shrewd plaza Install ComfyUI using my script instead ^.

lament shale
#

Ipex isn't needed with torch nightly Anyway right?

reef ivy
earnest grotto
hard kraken
reef ivy
wicked fulcrum
earnest grotto
#

Don't be too pessimistic
17B, at q4, 8.5gb, that should fit on an a770
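
The arithmetic behind that estimate, as a back-of-the-envelope sketch. It assumes a flat number of bits per weight; real GGUF quants also store scaling data, so actual files come out somewhat larger, and text encoders, VAE and activations are not counted here.

```python
def weight_size_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# A 17B model at a few nominal precisions.
for bits in (4, 8, 16):
    print(f"17B at {bits} bits/weight: ~{weight_size_gb(17, bits):.1f} GB")
```

The same function gives a quick feel for which quant of any model has a chance of fitting in 12 or 16 GB before downloading it.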

#

And even, it makes me wonder
@reef ivy Have you had success running the less quantized wan versions on your a750 with kijai's block offloading?

reef ivy
#

If I get 64gb of ram it should probably be fine

earnest grotto
#

the 14b wan?

reef ivy
#

Yeah, 14b

#

I2v i should add, haven't tried t2v

earnest grotto
#

impressive

#

though probably also too slow for my liking

hard kraken
#

Ipex_to_cuda doesn't work with sam2, just curious what custom nodes does this hijack work with?

earnest grotto
#

show the stack trace

ripe pivot
#

many of the upscalers don't seem to work with the hijack

formal tusk
#

As I understand it, there are some changes to IPEX in the latest pytorch XPU nightly 2.8 - is the hijack appropriate still?

reef ivy
#

might be faster on a770 with more vram

earnest grotto
#

Use the hijacks always because there are always people who hardcode .to('cuda') in their nodes
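
To illustrate the idea only: a toy sketch of what redirecting a hardcoded device string looks like. `remap_device` is a hypothetical helper, not the actual ipex_to_cuda code, which patches torch functions rather than plain strings.

```python
def remap_device(device, target="xpu"):
    """Rewrite 'cuda' / 'cuda:N' device strings to the target backend."""
    if device == "cuda":
        return target
    if device.startswith("cuda:"):
        # Keep the device index, e.g. 'cuda:0' becomes 'xpu:0'.
        return target + device[len("cuda"):]
    # 'cpu', 'xpu' and anything else pass through unchanged.
    return device


for name in ("cuda", "cuda:0", "cpu"):
    print(name, "->", remap_device(name))
```

A node that calls `.to('cuda')` never asks which backend is present, which is why the redirect has to happen underneath it.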

formal tusk
reef ivy
#

Gguf is slower and inconsistent with speed but uses less resources

hard kraken
#

shocking, upscale image using model node simply sends my computer to poweroff crash...

earnest grotto
#

odd, that specific model works perfectly fine for me

#

i've had some issues with some denoising models i think, but nothing like that

hard kraken
#

just came back from the 10th time power-off crash XD

earnest grotto
#

ah, hm
"4x-UltraSharp" and "ESRGAN-UltraSharp-4x" should be the same thing

hard kraken
#

i tried 4xFaceupdat, 4x_nmkd-siax-200k ... all crash

earnest grotto
#

Do you wanna try a different driver

hard kraken
#

i'm on 6632

#

resizable bar was the only other thing that would cause a poweroff crash, so I had it disabled...

earnest grotto
#

what's your psu wattage and what specs

reef ivy
#

I think dan was having this exact issue

somber trellis
#

So there's a model named Hidream now

#

17b parameters, gonna wait for a gguf version

earnest grotto
#

this your local gen or just an example pic

somber trellis
#

this is a huggingface gen

#

using a nf4 quantized version of the model

#

specifically the fast version

hard kraken
earnest grotto
hard kraken
#

A770

earnest grotto
#

a770 has even higher tbp

#

but go install different drivers instead

hard kraken
#

And I ran 4x esrgan that's default in sdnext without issue

earnest grotto
#

In general, keep your drivers up to date. I've often had issues that only happen on one specific driver then are fixed in the next one, which with intel's usual release schedule is after 1.5 weeks i think

#

6647 is old

hard kraken
#

6734 same crash. Now getting antique 6314

hard kraken
#

6314 same crash

#

don't see any power spike before crash either

ripe pivot
#

I think the b580 is only capable of a 1.5x upscale on 1024x1024 comfortably

#

1.7 pushes it into the UR error more often than not, and 2.0 will do the power off crash on me

earnest grotto
#

This is not upscaling using some diffusion model

#

It's not running out of vram

#

I haven't stared at gpu usage but I wonder if it could cause power draw to peak and shut down, especially if the cpu also gets involved

#

Given the official minimum recommendation for a b580 is 600w, let alone an a770 that needs more...

#

Guess it would depend on cpu but i doubt it's some ultra low power one

hard kraken
#

I just tested on ultrasharp 4x, input of 230x154 works without crash, but 269x179 would crash the system... this is much less than what you guys have.

hard kraken
#

when the 230x154's upscaled, GPU registered a 2 sec spike of 90W (2 sec is the resolution of hwinfo), not sure if this justifies the psu theory?

#

what node do you use with 4xUltrasharp?

hard kraken
#

update: I hacked the upscale node so that it uses CPU and it can upscale to 3xxx*2xxx image no problem, although takes a whopping whole minute

earnest grotto
hard kraken
#

Nightly. Was using dev0330, just tried 0409 same thing

earnest grotto
hard kraken
#

ran furmark for an hour. no crash.

earnest grotto
#

welp, guess it's (probably?) not a psu issue then

signal patio
#

guyz why any controlnet workflow ends up with this error?

reef ivy
#

So latest drivers add one api to the install? I guess this makes it unnecessary to manually install the level zero stuff for torch.compile now?

earnest grotto
#

You don't want to untick it, but if someone really wants to not be able to use Blender's Cycles, do anything AI, probably do anything OpenCL-related or whatever else, they can untick it and save like a gigabyte or who knows

hard kraken
#

curious, anyone's using A770 and have no problem with ESRGAN upscale? may I ask what version your spandrel is ?

earnest grotto
#

I have no problems yes

reef ivy
ripe pivot
#

  File "E:\Comfy_Intel\ComfyUI\execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "E:\Comfy_Intel\ComfyUI\execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "E:\Comfy_Intel\ComfyUI\execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)
  File "E:\Comfy_Intel\ComfyUI\execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "E:\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\impact_pack.py", line 1314, in doit
    current_latent = upscaler.upscale_shape(step_info, current_latent, new_w, new_h, temp_prefix)
  File "E:\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\core.py", line 1744, in upscale_shape
    latent_upscale_on_pixel_space_with_model_shape2(samples, scale_method, self.upscale_model,
  File "E:\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\core.py", line 1456, in latent_upscale_on_pixel_space_with_model_shape2
    pixels = vae_decode(vae, samples, use_tile, hook, tile_size=tile_size, overlap=overlap)
  File "E:\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\core.py", line 1386, in vae_decode
    pixels = nodes.VAEDecode().decode(vae, samples)[0]
  File "E:\Comfy_Intel\ComfyUI\nodes.py", line 287, in decode
    images = vae.decode(samples["samples"])
  File "E:\Comfy_Intel\ComfyUI\comfy\sd.py", line 524, in decode
    samples = samples_in[x:x+batch_number].to(self.vae_dtype).to(self.device)
  File "E:\Comfy_Intel\ComfyUI\comfy\ipex_to_cuda\hijacks.py", line 223, in Tensor_to
    return original_Tensor_to(self, device, *args, **kwargs)
RuntimeError: Native API failed. (UR_RESULT_ERROR_UNKNOWN)

#

it's so annoying that the upscaler works one day and breaks the next

reef ivy
#

Did you update something?

ripe pivot
#

No

azure lily
#

Need help with an error
Intel ARC 530M

ComfyUI error

earnest grotto
#

@azure lily Install comfyui using my script #1193952640225267802 message

signal patio
#

@earnest grotto can u help me, how to fix it?

hard kraken
#

I simply changed it to fp32

azure lily
earnest grotto
#

And show the stack trace

reef ivy
#

Likely you installed ipex 2.3? If so update to latest pytorch 2.8. if you need 2.3 you have to add some parameters to the comfyui management file for florence to work iirc.

#

#1193952640225267802 message

earnest grotto
#

My script does that, but only for some ipex/torch versions, so I wanna know which is the broken one

somber trellis
#

I seem to be hitting a 6gb vram limit with this node

#

erroring out with dynamic_scaled_dot_product

#

The repo works, it's just not letting me generate more than a few words then OOMing at 6gb VRAM. It doesn't seem to be allocating any further than that

#
    return original_scaled_dot_product_attention(query, key, value, attn_mask=attn_mask, dropout_p=dropout_p, is_causal=is_causal, **kwargs)
  File "C:\Users\dbs_5\Comfy_Intel\ComfyUI\comfy\ipex_to_cuda\attention.py", line 116, in dynamic_scaled_dot_product_attention
    hidden_states = original_scaled_dot_product_attention(query, key, value, attn_mask=attn_mask, dropout_p=dropout_p, is_causal=is_causal, **kwargs)
RuntimeError: Unknown exception```
somber trellis
#

Any help would be appreciated, as I really do like messing with this model.

#

It's very effective at cloning for what it is.

somber trellis
#

I don't know what changed because I was able to use all 16gb of my vram no problem beforehand

#

in case I did anything wrong I had re-updated my install with the latest 0.1.7p script

#

same issue

#

probably one of the scripts within this node repo is the problem

reef ivy
#

Iirc there were two sets of comfy nodes for that and one I couldn't get to work for the life of me and the other worked but not great.

late glen
#

City96 just released hidream quants

earnest grotto
somber trellis
#

same erro

#

r

#
onednn_verbose,v1,common,error,ocl,Error during the build of OpenCL program. Build log:
,src\gpu\intel\ocl\ocl_gpu_engine.cpp:169
onednn_verbose,v1,primitive,error,ocl,errcode -42,CL_INVALID_BINARY,src\gpu\intel\ocl\ocl_gpu_engine.cpp:264,src\gpu\intel\ocl\ocl_gpu_engine.cpp:264```
earnest grotto
#

if it still breaks, try a different driver

somber trellis
#

Fully operational now.

#

Thank you, Vik.

#

👍

earnest grotto
#

ah, listened to the thing

somber trellis
#

it seems that voicemeeter still crashes and blackscreens on 24h2, and thats whats been causing my blackscreens (i think)

#

That sucks though, because voicemeeter is what I constantly use.

#

I seem to be right, too. The moment I swapped to my hyperx as the default I instantly stopped blackscreening.

#

And right as I say that, it restarts itself...

#

guess it isnt that

reef ivy
#

God I hope windows doesn't end up forcing 24h2 on me.

somber trellis
#

I don't get what causes the blackscreen either. I stress-tested my PC for an hour and it was fine.

#

It's something to do with how ComfyUI and the games I play allocate memory I think.

#

I don't think it's only driver related either, as it's happened on earlier drivers too.

#

*With a fresh safe-mode DDU uninstall.

reef ivy
#

22h4 has many issues for some reason

earnest grotto
somber trellis
#

hasn't crashed since

#

probably something i did

somber trellis
#

welp that didnt even fix it either

earnest grotto
#

Bot got angry that I posted too many fumo images at once and timed me out

#

So, single image

#

I'm not very happy with the results honestly

#

This is with TripoSG

wicked fulcrum
earnest grotto
#

I just downloaded the full gguf, will try in a bit... If I oom with that, I'll be trying dev tomorrow

reef ivy
earnest grotto
#

I didn't oom but turns out the gguf versions of the text encoders don't want to load, so I guess I'll pick up tomorrow either way, with normal versions of those

somber trellis
#

Even my CPU is only spiking at like 60C

#

Also Hidream Full Q8_0 runs fine with reserve-vram 8.0 at 9s/it

wicked fulcrum
somber trellis
wicked fulcrum
somber trellis
#

No. I just took the comfy examples workflow and swapped it to GGUF.

earnest grotto
#

~38gb RAM peak, 14GB VRAM peak

#

very scientifically and objectively measured through staring at task manager with fingers on ctrl and c

#

3.5 minutes (~11.5s/it)

earnest grotto
#

dev is ~8.3s/it

#

got down to 7.12s/it

hard kraken
#

tried to mimic the cube with flux with 8steps

earnest grotto
#

I have no idea why lumina 2 is noisy

earnest grotto
#

Wonder if it's jpeg artifacts it's trying to emulate 🤔

somber trellis
#

set to 16 instead

earnest grotto
#

I'm not using any reserve-vram

somber trellis
#

i oomed without it

earnest grotto
#

I didn't 🤷‍♂️

somber trellis
#

im starting to think my slicing mustn't be working right or something

#

then again im also getting 9s/it and not 11

#

🤷‍♂️

earnest grotto
earnest grotto
#

Any of you guys poked lanpaint? I guess I'll have to try it more myself, since I haven't had much luck but their results look good

nocturne fjord
copper void
earnest grotto
copper void
#

that makes a bit more sense, guess i'll need to upgrade from 32GB ram before i can get started on f.lux. thank you for the advice

earnest grotto
#

you probably will. but RAM is cheap. a shame VRAM isn't so easy to get more of

earnest grotto
#

The safetensors file is meant to only "contain the weights" - for comfyui-compatible format, we will try to prepare it as soon as possible.

#

🤔

reef ivy
#

Anybody looked at framepack? Not sure if it's open source yet to make work on xpu. I am a bit out of the loop as I haven't done much in the last week, which is like a decade in AI time lol. Also, Ltxv 0.9.6 distilled was released and looks better and faster than the older models from what I've seen (still no hunyuan or wan though).

wicked fulcrum
#

Anyone using LTXV 9.6. Compared to 9.5 it seems to produce artifacts, but maybe 9.6 requires something special Im missing

nocturne fjord
wicked fulcrum
nocturne fjord
wicked fulcrum
#

Ok I'll look for a 0.9.6 workflow to see what the settings should be.
I just moved from 0.9.1 to 0.9.5 and that's a huge difference.

nocturne fjord
#

It would also be better to combine it with a VLM.

civic charm
reef ivy
late glen
# wicked fulcrum Anyone try this on Arc yet?

I've messed around with it a bit, the results are pretty decent. I'm running an A770 on torch nightly 2.8, and I have to use a node that clears the cache after every run for it to not fail. Nature related imagery has a lot of issues with watermarks being included. It is a promising model, with any luck we will see some LORAs.

hard kraken
#

after 32min... already with teacache

reef ivy
hard kraken
#

default wan2.1 template 512x512 33length q6

reef ivy
#

You have reserve-vram 7(or another number) in your launch parameters? For gguf it's necessary for decent speed.

#

I use wan2.1 fp8 scaled now, and it's much faster (quality is a bit lower than gguf though from my experience)

hard kraken
#

is that reserve-vram 7 to make sure not to overspill into shared gpu memory?

hard kraken
#

using q5 to keep it within vram, now within 10 min, but seems worse too

earnest grotto
reef ivy
#

also you don't have to use 7 you can mess around with it, I found that 7 for my a750 is the most stable speed wise now for gguf models in recent drivers/pytorch.

reef ivy
signal patio
#

can someone help with why flux controlnet is not working? i also can't use flux inpaint :((( really need those features

signal patio
#

@earnest grotto man plz help

earnest grotto
signal patio
earnest grotto
signal patio
hard kraken
earnest grotto
#

Workflow image

#

Image being inpainted

signal patio
#

thanks man! you were right, after i used another workflow canny is working. trying to set up depth but can't find a depth workflow 😦 it's strange that comfy has a canny node by default but no depth one

also about inpainting, your workflow is working, i'm waiting till it finishes. is it ok that it's extremely slow? 15s/it

#

it didn't work, i received the same image in the end. i guess i'm missing something, and it should work faster than that

tiny bolt
#

will i see the end of haruhi before i die DoggoGrin

earnest grotto
earnest grotto
signal patio
#

my prompt is cat eyes, but it's not working
and it takes 771 seconds on my a770 16gb pytorch nightly, installed using your script

earnest grotto
# signal patio how long did it take?

120 seconds with 30 steps
probably would've worked fine with 20 steps, and generally wavespeed's first block cache also works decently, so it can drop down to ~30-60 seconds

signal patio
earnest grotto
#

don't set the guidance crazy high
what gpu, what launch arguments for comfy, are the models stored on an ssd or hdd

#

ah, and your image is 2000*2000

#

the image I used is 1mp

#

Here is a workflow that uses brushnet's cut for inpaint node and optionally kjnodes' color match

#

My script installs those if you chose to install the extra nodes

#

if flux is struggling to follow your prompt, set the cfg to 3 or so. keep in mind this will slow things down and if you're running out of vram, which you might also have been with a 4mp image, it will be even slower and you might crash

#

hence why, the cut for inpaint node

#

If flux doesn't make what you want even with cfg, then it might just be that flux can't do what you want it to

signal patio
signal patio
earnest grotto
signal patio
honest hull
#

Pytorch 2.7 + xpu is now supporting sdpa

#

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu

reef ivy
#

getting this error with florence in comfy now. The operator 'aten::_conv_depthwise2d' is not currently implemented for the XPU device. Please open a feature on https://github.com/intel/torch-xpu-ops/issues. You can set the environment variable `PYTORCH_ENABLE_XPU_FALLBACK=1` to use the CPU implementation as a fallback for XPU unimplemented operators. WARNING: this will bring unexpected performance compared with running natively on XPU.
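
A small sketch of the workaround the error message suggests. The variable name comes from the message itself; setting it before torch is imported is the cautious assumption here, and the CPU fallback will be slow for any operator it catches.

```python
import os

# Enable the CPU fallback for operators that are not implemented on XPU.
# Set this before importing torch, on the assumption that it is read at import time.
os.environ["PYTORCH_ENABLE_XPU_FALLBACK"] = "1"

# import torch  # import torch only after the variable is set

print("PYTORCH_ENABLE_XPU_FALLBACK =", os.environ["PYTORCH_ENABLE_XPU_FALLBACK"])
```

The same effect can be had by setting the variable in the command prompt or shortcut that launches ComfyUI, which avoids editing any Python files.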

#

looks like kijai made it so florence loads to cpu first for some reason, gonna try and revert the florence nodes

wet barn
#

Has anyone tested ReActor/InstantID or anything else that needs torchvision on an Intel Core Ultra 7 258V iGPU? I can't install those things!

civic charm
#

I guess we should see what ipex 2.7 speed will be like

#

Also the bf16/fp16 reduction flag with math sdpa doesn't seem to do anything with intel

#

For context, sdxl at 1024x1024:
ipex 2.1.4: 2.0 it/s
torch 2.7 / 2.8: 1.4 it/s
openvino: 2.4 it/s
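To put those numbers side by side, a small sketch that turns it/s into time per image and speed relative to ipex 2.1.4. The 30-step count is an assumption for illustration only; the it/s figures are the ones quoted above.

```python
# it/s figures quoted above for sdxl at 1024x1024
rates = {"ipex 2.1.4": 2.0, "torch 2.7 / 2.8": 1.4, "openvino": 2.4}


def summarize(rates, steps, baseline_key):
    """Return seconds per image and speed relative to the baseline backend."""
    baseline = rates[baseline_key]
    return {
        name: {
            "seconds": steps / its,       # sampling time only, no load/VAE time
            "relative": its / baseline,   # 1.0 means same speed as baseline
        }
        for name, its in rates.items()
    }


summary = summarize(rates, steps=30, baseline_key="ipex 2.1.4")
for name, row in summary.items():
    print(f"{name}: {row['seconds']:.1f} s for 30 steps, {row['relative']:.2f}x")
```

By this arithmetic torch 2.7 / 2.8 runs at about 0.7x the old ipex speed and openvino at about 1.2x.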

earnest grotto
#

also wow, hidream has really soaked in a bunch of very random anime characters, I wish it knew artstyles better instead

#

It knows saber, and it has very faint knowledge of megumin, nothing that looks similar at all, but mentioning her makes it generate a big pointy witch hat, long boots and a dark cloak

earnest grotto
#

I'll try with sdnext later, but even on windows I was getting ~1.85 it/s, though that was on comfy

#

I've had cases where some kernels inexplicably tank performance

civic charm
#

I got up to 1.8 it/s by moving the model from GPU to CPU and then back to GPU at the start of every generation

#

This doesn't make sense

#

Also using non_blocking=True when moving any model kills the performance

#

It drops to 2 s/it (or 0.5 it/s)

#

It's supposed to make things faster when moving the text encoders from gpu to cpu, not slower

reef ivy
#

Any clue as to what caused the slowdown? just pytorch regression or something intel specific?

#

If things were compatible i'd still be on 2.1.4

civic charm
#

this one is intel specific

reef ivy
#

Specific pytorch 2.7 fixes in latest windows driver for b series and core ultra related to torch.compile. no mention of alchemist

earnest grotto
#

Will test later, but if someone else wants to beat me to it

#

claims to beat 11labs

#

we love claims

reef ivy
#

Nice, hope it's not just more smoke

reef ivy
#

So, is the use-bf16 command arg still necessary? I notice that without it the quantization nodes actually work (although the output is distorted), and normal gens seem fine so far (you can also select bf16 now in most model load nodes)

earnest grotto
#

bf16 was needed because of a bug where iirc generation flat out didn't work due to a data type mismatch. you should prefer to use bf16 because it's ever so slightly faster. but otherwise it's whatever

reef ivy
#

Fp16 seems to give a black image (at least with wan and kijai nodes). bf16 and converting to fp8 work, but the conversion is distorted

reef ivy
#

so far looks like the black output issue is only with Kijai nodes, I have been able to quantize a few models so far with no issue.

civic charm
#

IPEX 2.7

civic charm
#

Still can't be imported with glibc 2.41

#

I was able to get it to work by using patchelf --clear-execstack $lib but pytorch 2.7 + ipex 2.7 is slower than just pytorch 2.7?

#

It drops the performance to 1 it/s

sweet rapids
#

cant go past this point

One or more packages have failed to download: intel_extension_for_pytorch\s+2.5.10+bmg
Please run the script again and ensure your internet connection is working.

earnest grotto
#

1 more frenda

#

Oops, I should've posted the frenda elsewhere. Oh well, here's fine too, since it doesn't have that creepy smile. I wonder how much synthetic data they used and if it might actually be better to use higher quality ai gens 🤔

#

If the model they release is actually this good, this will be pretty big

#

Since uh, it's only on their service for now 😛

earnest grotto
#

I updated the script to fix that.

#

You can download the script again and replace the old one. You don't need to because it should work anyways.

#

Show what the specific errors are with the impact pack

somber trellis
#

The node widgets for the load and run dia models don't even have corresponding widgets. The run dia model node is looking for a purple model widget, but the load dia node only has a grey model widget

#

I went looking for other repos. The customdia repo doesn't work whatsoever and straight up just throws a bunch of tensor shape errors

earnest grotto
#

Oh boy

somber trellis
#

I wanted to try it too. It had lower VRAM requirements than spark while promising better performance

#

Going by online reviews though, something tells me it really isn't that good

earnest grotto
#

they have a hf space if you want to assess quality without comfy

somber trellis
#

also the intro of this is just funny

#

"Ugh this coffee tastes like cardboard (moans)"

#

But the moan ends up sounding more like a birdcall

earnest grotto
#

that was some pained screeching

#

well the repo does say it doesn't handle those well

somber trellis
#

That coffee musta had the worst aftertaste in the last millennium

#

ok dude put (off putting laughter) as one of the prompts

#

And the AI is like in the corner of the room in the clip yelling "OFF PUTTING LAUGHTER"

#

in the most sarcastic, cliche voice

#

oh i just realized it sounds like JERMA

#

LMAO

#

Ok I'm not gonna lie, it's hilarious

#

i kinda wanna mess with it

reef ivy
#

I think it's in kijai's nodes, but you have to edit his files to get wan to run on arc at all (change float64 to float32)

sweet rapids
# earnest grotto I updated the script to fix that.

I'm getting the above error.

Prestartup times for custom nodes:
2.5 seconds: C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\comfyui-manager

Checkpoint files will always be loaded safely.
ipex_init: (True, None)
Total VRAM 11874 MB, total RAM 32607 MB
pytorch version: 2.8.0.dev20250427+xpu
Set vram state to: LOW_VRAM
Device: xpu
Using pytorch attention
Python version: 3.10.16 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:19:12) [MSC v.1929 64 bit (AMD64)]
ComfyUI version: 0.3.30
ComfyUI frontend version: 1.17.11
[Prompt Server] web root: C:\Users\COMFY\anaconda3\Comfy_Intel\cenv\lib\site-packages\comfyui_frontend_package\static

Loading: ComfyUI-Impact-Pack (V8.14.2)

[Impact Pack] Failed to import due to several dependencies are missing!!!!
Traceback (most recent call last):
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\nodes.py", line 2128, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\__init__.py", line 46, in <module>
    raise e
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\__init__.py", line 28, in <module>
    import cv2
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\cenv\lib\site-packages\cv2\__init__.py", line 181, in <module>
    bootstrap()
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\cenv\lib\site-packages\cv2\__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\cenv\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed while importing cv2: The specified module could not be found.

earnest grotto
#

What is the "above error"

reef ivy
#

probably auto deleted

sweet rapids
#

Cannot import C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack module for custom nodes: DLL load failed while importing cv2: The specified module could not be found.

Loading: ComfyUI-Impact-Subpack (V1.3.1)

Traceback (most recent call last):
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\nodes.py", line 2128, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Subpack\__init__.py", line 23, in <module>
    imported_module = importlib.import_module(".modules.{}".format(module_name), __name__)
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\cenv\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Subpack\modules\subpack_nodes.py", line 3, in <module>
    from . import subcore
  File "C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Subpack\modules\subcore.py", line 3, in <module>
    import cv2
ImportError: DLL load failed while importing cv2: The specified module could not be found.
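Both tracebacks stop at the same place: `import cv2` is found in the env but its native library fails to load. A small sketch for telling "package not installed" apart from "installed but broken", run with the same Python that launches comfy. `diagnose_import` is a hypothetical helper, not part of ComfyUI or the Impact Pack.

```python
import importlib
import importlib.util


def diagnose_import(module_name):
    """Say whether a module is missing, broken at load time, or fine."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return "not installed"
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # The package files exist, but loading them (e.g. a native DLL) failed.
        return f"installed at {spec.origin} but failed to load: {exc}"
    return f"ok ({spec.origin})"


print("cv2:", diagnose_import("cv2"))
```

If it reports "installed ... but failed to load", the problem is likely with the opencv install in that env or a DLL it depends on rather than with the Impact Pack itself, though that is a guess from the error text alone.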