#ComfyUI for Intel Arc using IPEX


somber trellis
#

install numpy 1.26.4

#

and install ltx-video's dependencies to match its transformers requirement

#

I simply go through the dependencies and install compatible packages following their versions

#

but ye the speed diff between the two versions dont really matter anymore

#

It's literally around 10-12% perf loss with the added command for inferencing flux

#

but i can run instantir image restoration now

reef ivy
#

It should be able to run mmaudio now too I hope.

somber trellis
#

9s/it on flux with dynamic thresholding AND dpmpp_2m sgm uniform

#

that isnt too bad

reef ivy
#

I think I got 6s/it with just base flux at 1024x1024 last I tried.

somber trellis
#

unlike 2.5.1 the inference speed is stable at around 9s

#

2.3 liked to go between 7 and 13

#

and 2.5.1 started at 19s then went down to 8

reef ivy
#

2.3 was unstable unless I ran the --reserve-vram 6.0 command

somber trellis
#

i still use reserve vram 4.0

#

not sure if i should on 2.6.0 though

reef ivy
#

trying ltx video and seems like the speed is also extremely slow, or it's not working properly. Hasn't gone past ?it/s in about a minute

somber trellis
#

wh

reef ivy
#

Yeah somethings wrong, 174.65s/its lol

somber trellis
#

ye it doesnt take that long for me

#

4s/it with pag from ltxtricks

reef ivy
#

it should go about 4 or 5s/it with the length I am using

somber trellis
#

your reserve vram command is why

#

ltxvideo gguf

#

you should be using that

reef ivy
#

I can try without it, it helped before on 2.3

somber trellis
#

ye but ur using it on a model capable of fitting onto your vram

#

you use reserve vram 6.0 on an 8gb card right
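
A rough sketch of the arithmetic behind that question. It assumes `--reserve-vram` simply subtracts the reserved amount from what ComfyUI treats as usable, which matches the flag's description but ignores driver overhead and shared-memory fallback. The function name is made up for illustration.

```python
def usable_vram_gb(total_gb: float, reserved_gb: float) -> float:
    """Return the VRAM left for models after the reserved amount is set aside."""
    if total_gb < 0 or reserved_gb < 0:
        raise ValueError("sizes must be non-negative")
    # Never report a negative budget if the reservation exceeds the card size.
    return max(total_gb - reserved_gb, 0.0)


# An 8 GB card with --reserve-vram 6.0, and a 16 GB card with --reserve-vram 4.0.
print(usable_vram_gb(8.0, 6.0))
print(usable_vram_gb(16.0, 4.0))
```

On an 8 GB card a 6.0 reservation leaves very little for the model itself, which is the concern raised here.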

#

i run the fp8 version of all the checkpoints i have

#

ltxvideo fp8 works well

reef ivy
#

yeah, but it typically uses as much as needed. And goes faster when not fully loaded into vram

#

I have fp16

#

and the full model, both worked fast on ipex 2.1.4 btw

#

full model got slow on 2.3.....and I'm guessing on 2.6 now fp16 doesn't work. Seems they are messing up the vram allocation of lower vram cards with each update. sigh

#

yeah, makes no difference. Still slow. Going back to ipex 2.3 I guess lol

#

well before that, let me try and update comfy.

#

If I could get stuff to work on 2.1.4 I'd never update tbh

#

well updating is better at least, now only 22.33s/its.

#

still a lot slower than 4s/it lol

somber trellis
#

what workflow

reef ivy
#

custom, using ltx and perturbed attention with img2img. I also have florence and ollama but bypassing atm (florence still runs fine).

#

could also be the windows drivers + ipex, there is still a memory issue with 8gb cards and ipex. Usually solved by --reserve-vram in comfy

#

could try different --reserve settings.

#

also thinking of trying wsl2 again

#

linux was generally faster anyway last i tried

#

could try reverting drivers again too, but since they got rid of arc control would have to ddu

somber trellis
#

when you spoke about a florence fix that confused me since I got the promptgen 2.0 model working great

#

It was an issue with the tokenizer i believe, I simply swapped it out with the base florence 2 large's json

#

normal florence worked outta the box for me

#

2.6 works fine at least

#

we finally have a decent version other than 2.3

#

that and with ipex-llm supporting ollama 0.4.6

#

i can use a majority of llms

reef ivy
#

yeah, it's unusable for me sadly. flux is also much slower, going about 12s/its with the q4 model.

#

Since all the new stuff requires it, I may try linux wsl2 and pray that it isn't terrible as well lol.

#

really want to try the mmaudio stuff.

reef ivy
#

can also add my own commands to the prompt for more control over camera and any motion

#

get stuff like this now

#

test for camera zoom. also added its own random title card lol

#

with stg I do notice that things like blinking are less realistic but the image all around is more stable.

somber trellis
#

now im confused

#

im getting 9s/it on ltxvideo with pag

reef ivy
#

might be inconsistent vram usage like 2.3

#

also wonder if it's the latest drivers with the battlemage update

#

I have completely forgotten how to use wsl2 and linux lol. Will need a refresher

#

you know what, do you still need libuv for pytorch?

#

And could that cause issues with it?

somber trellis
#

id assume so since ipex 2.3 did

#

then again this isnt ipex

#

🤷‍♂️

#

just tested the 2.5.1 ipex version and thats slightly slower than 2.6

#

and 2.6 is half as fast as 2.3

reef ivy
#

okay, so it's not just me with the speed decrease.

#

Might just be torch itself, also nightly builds I think are updated all the time so something could break

#

this is the version I got 2.6.0.dev20241215+xpu

somber trellis
#

im assuming there isnt much we can do to speed up 2.6

#

idk

reef ivy
#

i may try and make a new comfy env, without libuv just to see lol

kind parrot
#

Why is there an error during the computation?

I have followed the steps above and reached this stage, but an error occurs.

Who can help me?

somber trellis
#

pip install numpy==1.26.4
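
A quick way to check whether the pin took effect. This is only a sketch, and it assumes the error comes from a NumPy 2.x install, which the reply implies but the thread does not state outright. The helper names are illustrative.

```python
from importlib import metadata


def numpy_major(version: str) -> int:
    """Extract the major version number from a string like '1.26.4'."""
    return int(version.split(".")[0])


def needs_downgrade(version: str) -> bool:
    """True when the installed NumPy is 2.x or newer."""
    return numpy_major(version) >= 2


try:
    installed = metadata.version("numpy")
    print(installed, "-> downgrade needed:", needs_downgrade(installed))
except metadata.PackageNotFoundError:
    print("numpy is not installed in this environment")
```

Run it inside the same env ComfyUI uses, since installing torch afterwards can pull in a different NumPy.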

kind parrot
reef ivy
#

I guess torch installs numpy on its own every time. Ipex doesn't seem to afaik

#

a question, would python 3.11 be better to use with 2.6? Or is 3.10 still the recommendation?

honest hull
#

I have been using 3.11, and we are using 3.11 for AI Playground

somber trellis
#

Can't wait for the ai playground comfy backend

honest hull
honest hull
honest hull
somber trellis
#

2.5.1 ipex and 2.6 is sadly half as fast

honest hull
somber trellis
#

Intel arc a770 LE.

honest hull
#

Pytorch 2.6+xpu is slower than IPEX and that is expected, some of the IPEX optimization is not yet upstreamed

somber trellis
#

Pytorch 2.3.1 ipex is faster than 2.5.1 IPEX and 2.6.0+xpu

#

At least in comfyui this seems so.

honest hull
#

with a normal SD1.5 workflow? I can give it a try

somber trellis
#

I use Flux and LTXVideo currently

honest hull
#

could you share your flux workflow file

somber trellis
#

It utilizes a couple different nodes.

honest hull
#

or an image

somber trellis
#

Here.

#

bruh

#

That workflow I use is capable of good outputs.

honest hull
#

so on A770, with ipex 2.3 you are getting ? second per image?

#

I can try to replicate and troubleshoot for ipex 2.5

somber trellis
#

luckily ipex-llm works fine in administrator mode with phi-4 4_k_m. generates 256 tokens in half a minute

#

flux dev fp8 on pytorch 2.3.1 ipex is doing 6-7s/it

honest hull
#

oh dev fp8

somber trellis
#

yep

honest hull
#

have you tried using Q4/Q8 flux?

#

quality is good and it's faster

somber trellis
#

with --reserve-vram 4.0 both equal the same speed

#

on 2.3.1 that is

#

they arent far off but yes q8_0 is more accurate

#

its more faithful to fp16

#

keep in mind

#

this is a workflow that uses

honest hull
#

I would say Q8 quality is 95% similar to FP16.. minor detail differences

somber trellis
#

dynamic thresholding and a dpmpp_2m sampler

somber trellis
#

fp8 does affect minor things

#

same with gguf fp8 but it's a much more complex architecture than simple fp8 conversion

#

I should probably be using it

reef ivy
#

so in a fresh env without libuv I am getting 11s/it with ltx video. Still way slower than the 4/5s/it but much faster. Dunno if it's a fresh install or the libuv. I am going to try python 3.11 next

honest hull
# somber trellis

couldn't find the merge string custom node, do you know which one was it?

somber trellis
reef ivy
#

having issues getting pytorch 2.6 to work with comfy and python 3.11.

honest hull
#

@somber trellis couple things to improve ipex 2.5 speed. I also noticed that 2.5 is a bit slower than 2.3.

  1. add torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True) to model_management.py in comfyUI
  2. pull latest update for ipex_to_cuda from disty's repo to be compatible with ipex 2.5
  3. it should now improve perf
honest hull
reef ivy
#

I am going to try again, but it's failing at the management.py have added the hijacks already. Might have been the pip3 that was in the install instructions at the pytorch repo, so trying pip now.

honest hull
#

oh did you pull the latest hijack from Disty's repo?

reef ivy
#

I did yesterday

#

Going to try 3.11 and see if it increases the speed

honest hull
#

I ran pytorch 2.6.0dev yesterday in a fresh env on ComfyUI it worked fine so

#

ensure to add torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)

reef ivy
#

yeah I have that, going to try once everything installs

#
```python
try:
    from ipex_to_cuda import ipex_init
    ipex_init()
    torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)
    _ = torch.xpu.device_count()
    xpu_available = torch.xpu.is_available()
except Exception:
    pass
```
reef ivy
#

nope, still massive slowdown compared to 2.3. 5s/it to 12s/it with this ltx workflow. I do think that not having libuv installed increased the speed from 22s/it though so that may help someone. Dunno why the speed decrease is so massive, will try flux next to see if it's just ltx.

#

also --reserve-vram 6.0 is still necessary or it's even slower.

#

also having install issues with the vlm nodes, so I may try this one more time tomorrow. and see if I just have done something wrong.

#

Flux is slower, but not nearly as bad. 9s/it now for 1024x1024, think it was 6s before.
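
Since s/it figures get compared a lot in this thread, here is a small sketch of how a 6s/it to 9s/it change translates into percentages. The function name is made up. Note that "extra time per iteration" and "lower throughput" are different numbers for the same change.

```python
def slowdown(old_s_per_it: float, new_s_per_it: float) -> tuple[float, float]:
    """Return (extra time per iteration in %, throughput loss in %)."""
    extra_time = (new_s_per_it / old_s_per_it - 1.0) * 100.0
    throughput_loss = (1.0 - old_s_per_it / new_s_per_it) * 100.0
    return extra_time, throughput_loss


extra, loss = slowdown(6.0, 9.0)
print(f"{extra:.0f}% more time per iteration, {loss:.0f}% lower throughput")
```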

#

yeah, about 6s/it on ipex 2.3 with flux, and this is a bit slower than it was with ipex 2.1.4 also.

#

Another question, would pytorch still require conda? Or could it be created in a regular python env?

honest hull
#

regular env is sufficient

#

it’s still in the early stage of being fully upstreamed but usability is the main focus right now 😉

civic charm
#

Also don't mix PyTorch and IPEX on the same venv

#

PyTorch Triton gets very upset when you install one and then change to other
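
A sketch of a check for that situation. The heuristic is an assumption, not an official rule: it flags an env where `intel-extension-for-pytorch` is installed next to a nightly torch build (version strings like `2.6.0.dev20241215+xpu`). Function names are illustrative.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_mixed(torch_version, ipex_version) -> bool:
    """Heuristic: IPEX sitting next to a nightly/dev torch build suggests a mixed venv."""
    if torch_version is None or ipex_version is None:
        return False
    return ".dev" in torch_version


torch_v = installed_version("torch")
ipex_v = installed_version("intel-extension-for-pytorch")
print("torch:", torch_v, "| ipex:", ipex_v, "| mixed:", looks_mixed(torch_v, ipex_v))
```

If it reports a mix, a fresh venv is probably safer than uninstalling in place.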

civic charm
#

If active is false, then message will return the reason

#
```python
ipex_active, message = ipex_init()
print(f"IPEX Active: {ipex_active} Message: {message}")
```
#

@earnest grotto do we still have the memory leaks with wsl2 + empty_cache? I would like to remove that hijack if it is not an issue anymore.

earnest grotto
#

I will check in a bit

earnest grotto
civic charm
#

Can you test ipex 2.5 and PyTorch?

#

Currently that hijack is disabled with PyTorch

earnest grotto
#

where do i get ipex 2.5

earnest grotto
civic charm
# earnest grotto where do i get ipex 2.5
pip install torch==2.5.1+cxx11.abi torchvision==0.20.1+cxx11.abi intel-extension-for-pytorch==2.5.10+xpu oneccl_bind_pt==2.5.0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
earnest grotto
#

ah, wonder why they still haven't added that to the install guide

civic charm
#

US mirror is down, so probably that

earnest grotto
#

incredible, the basekit and the pti still completely break everything if both are installed

reef ivy
earnest grotto
#

Welp, whatever PTI/pytorch-dev versions are downloadable, are not compatible with the current nightly or ipex 2.5
torch 2.5 works at least

#

let's see if it leaks

earnest grotto
# civic charm Can you test ipex 2.5 and PyTorch?

Pytorch 2.5 also leaks
I think the comment on the issue saying it's probably a driver/wsl bug not an ipex/torch bug is right, the amount of memory used torch reports changes with only +-16 bytes if at all

#

And it still errors out about running out of memory with... This time ~12-11gb of vram free

somber trellis
#

since trying out 2.6 with instantir i can safely say its just better to mimic what instantir does with flux, using an upscaler controlnet combined with redux on pytorch 2.3 for speed

silk umbra
#

does the b580 work with comfyui yet

reef ivy
#

I was getting 4s/it at this point with flux gguf at 1024x1024 with a lora. not sure if it was the older ipex or the older drivers. But with 2.3 the fastest I get is 6.40s/it with a lora loaded.

reef ivy
#

Maybe it needs 2.5 or higher?

silk umbra
#

i tried @earnest grotto 's script to install comfyui and it didn't work when i tried to diffuse

silk umbra
reef ivy
#

Might need later ipex maybe, also his script might not account for battlemage and install the right ipex.

silk umbra
#

how do i update my ipex

earnest grotto
#

There is no right ipex

#

You can try 2.5. But again, I'd expect that to not work in the end

silk umbra
#

just keep waiting?

reef ivy
#

They shipped without ipex support?? wtf, glad I didn't pick it up lol

reef ivy
# silk umbra just keep waiting?

2.5 isn't officially released but the whls are available to download. I couldn't get pip to work so I had to make a requirements file. Maybe somebody can make a pip install snippet for the windows version, I think the one disty posted is linux.

silk umbra
#

i see

reef ivy
#

you could also try this #1193952640225267802 message try and go into the conda env and install that.

#

I saw bmg benchmarks for AI, I guess they only used openvino?

reef ivy
# silk umbra i see

maybe this will work python -m pip install https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torch-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torchvision-0.20.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torchaudio-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

#

Activate your env and run that

reef ivy
#

you aren't in your env, you activated the base env. cenv is your env

reef ivy
#

hmm, i ran it to see if it would run for me and it started installing. Maybe it's a cache issue.

#
somber trellis
reef ivy
#

oh you're on 3.11?

silk umbra
#

i think?

reef ivy
#

I didn't download the files for 3.11, the links will be different.

silk umbra
reef ivy
#

you can replace everywhere it says cp310 with cp311 i think it should work
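
That swap can be scripted. A minimal sketch, assuming the links only differ in the `cp310` tag and that wheels for the other Python version actually exist on the server, which is not guaranteed. The helper name is made up.

```python
import sys


def retarget_wheel_url(url: str, major: int, minor: int) -> str:
    """Swap the CPython tag in a wheel URL, e.g. cp310 -> cp311."""
    new_tag = f"cp{major}{minor}"
    # Wheel names carry the tag twice (python tag and ABI tag), so replace both.
    return url.replace("cp310", new_tag)


url = (
    "https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/"
    "ipex_stable/xpu/torch-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl"
)
# Uses the interpreter that runs this script, so run it from the activated env.
print(retarget_wheel_url(url, sys.version_info.major, sys.version_info.minor))
```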

somber trellis
#

You could also just

#

install the 3.11 wheels

reef ivy
#

I manually went in and found all the links myself, I don't know the pip code for them

somber trellis
#

i can give him it

silk umbra
#

wait

#

replacing 310 with 311 worked

somber trellis
#

or torch2.6.0+xpu

silk umbra
#

whatever gets this working

reef ivy
#

He is on battlemage, so we aren't sure it's supported

#

I suggested ipex just in case

somber trellis
#

well if they want speed they download 2.3.1

reef ivy
#

I think he said it didn't work, although it could have been because of the wrong python version for Vik's script?

somber trellis
reef ivy
#

Yeah

somber trellis
#

if you want slower speed by 25% and better compatibility

#

you want 2.6.0

reef ivy
silk umbra
#

testing the thing you told me to do rn

#

will try that after if this doesn't work

somber trellis
#

pip uninstall torch torchaudio torchvision intel-extension-for-pytorch && pip install --pre torch torchaudio torchvision --index-url https://download.pytorch.org/whl/nightly/xpu

somber trellis
#

that is an uninstall on purpose

#

read the full command

reef ivy
#

oh okay. I see

#

I wonder what the speed will be for battlemage tbh

silk umbra
#

ok that didn't work

somber trellis
#

What was the error

reef ivy
#

if it says something didn't exist then it was probably the file name being wrong again. If you were doing the Ipex method still.

silk umbra
#

ok i reinstalled using viks script with python 3.10

#

anything else i need to do?@reef ivy

reef ivy
#

Is it working?

silk umbra
#

wait i need to install flux first oop

reef ivy
#

if you get it working, let me know your speed and image size

silk umbra
honest hull
somber trellis
honest hull
#

--force-reinstall ensures that pip pulls the package again and reinstalls it

#

--upgrade would work when versions are different

earnest grotto
#

@honest hull Do you know if battlemage is supported yet and by what

midnight mauve
#

I am a newbie to intel arc, while following the pinned message of @somber trellis of installing comfyui on my laptop ( intel ultra 9 h series and intel arc a370m) I got this error. Someone pls help.

reef ivy
midnight mauve
#

@reef ivy please brief me steps. That would be helpful.

earnest grotto
#

.

midnight mauve
#

Do you mean this?

earnest grotto
#

yes

midnight mauve
#

Thank you

reef ivy
#

Just a heads up, there are 3 ways to install ipex for comfyui in the pinned comments to choose from. Vik's is the easiest imo

somber trellis
earnest grotto
#

must've happened today

reef ivy
#

hunyuan 4bit gguf with new fast lora from kijai, "Will smith eating spaghetti in a cozy restaurant by the mountains" 512x512 14s/it with lora, 11s/it without 6 steps

#

6 steps without lora

somber trellis
#

me over here trying to run 4_k_m and failing miserably

reef ivy
# somber trellis how

I think you should be able to get my workflow from those images. Probably the vae decode and I use a fp8 version of the llm

#

There is a gguf of the llm too but don't know how to load it as a clip, gguf clip has no option hunyuan

somber trellis
#

i have the fp8 scaled version of llava llama3

reef ivy
#

Whoops read that wrong lol

somber trellis
#

i typed that wrong

#

you read it right despite that

reef ivy
#

I can't get the gguf to work, i used the fp8 safetensor for lava lama

#

It loads into the standard dual clip i believe

#

About 8gb, I offload directly to cpu but you can probably handle it with 16gb

somber trellis
#

ope

#

ok uhb

#

your workflow works but the official workflow does not

reef ivy
#

I will say hunyuan is waaaay better with natural looking human movements. Quality is lower than ltx img2vid but probably better than txt2video. At that res might even be faster than cog video tbh

somber trellis
#

neither the one from hunyuan video wrapper or the official comfycore implementation workflow works

reef ivy
#

The kijai workflows don't work for me either

somber trellis
#

idk how you got

#

16s/it

#

im getting 20s on torch 2.6.0, so im assuming you arent on 2.6.0

reef ivy
#

I am on 2.3 ipex

#

I swapped back due to speed

somber trellis
#

instantir and certain upscalers are broken on 2.3

#

im probs not gonna use instantir tho

#

compared to flux redux its really not good

reef ivy
#

I have a 2.6 env if I need it

somber trellis
#

ye it works YAY I HAVE A VIDEO MODEL THAT WORKS WELL

reef ivy
#

That is better than my outputs lol

#

Yeah it is nice, just need img2vid

#

Oh, ltx 1.0 arriving next week if nothing goes wrong

#

Official word from CEO

midnight mauve
#

@earnest grotto Comfyui working 🔥. 1.25its/s for 512x512 with 20 sampling steps and 8its/s for 1024x1024 with 70 sampling steps
So much thanks. @reef ivy Thanks for pointing out. 👍

But still there are some errors while running Comfyui. Is there anything need to be done or ignorable

earnest grotto
midnight mauve
#

Thanks for info.
Not yet I am a newbie still exploring.
Any suggestion?

earnest grotto
#

nothing much besides the ones the script gives you the option to install

midnight mauve
#

Okay 👍👍

reef ivy
reef ivy
#

So far running extremely slow for me

reef ivy
#

bfloat16 error at vae decode, sigh. Model is about half as fast as the bf16 version and doesn't seem to work with the default workflow with intel.

reef ivy
# somber trellis

with pytorch 2.6 I seem to be getting the same speed as 2.3 now after disabling lowvram in the command line. I am also not running it in conda but a regular python env. Could just be ltx video, but for some reason it is faster today.

somber trellis
#

btw I can run hunyuan 4_k_m on my card at 256x256 121 frames

#

horror in the palm of my hands

#

however getting good results at this res is a lot harder

#

ill probably stick to 512x320 49frames

reef ivy
#

Nice, I wonder if using a video2video with cogx or ltx could help upscale if used with an image and the ir stuff. Movement in hunyuan is unmatched

#

Also try the new ltx model and the new workflow, i got a mismatched bfloat error on vaedecode. I was able to use the previous model vae but that likely ruins the new output

reef ivy
#

Comfy UI was updated though

somber trellis
#

If 2.6.0 is truly faster and equal to 2.3 @reef ivy

#

I'll swap back

#

it has compatibility fixes I want

#

theres a newer torch 2.6.0 for today's nightly so there might be a change there

reef ivy
#

If you want, try and make a python env for comfy with 2.6 and run it with the --reserve-vram but omit the --lowvram. gonna try flux in a minute and see if it's not just ltx video.

somber trellis
#

well its 8s/it on ltxvideo for me

#

on the latest nightly 2.6.0

#

just got the same vae decode error

#

rip ig

#

back to 2.3.1 i go

#

It's 3.5 seconds faster per it on 2.3.1 on ltxvideo but the same vae decode error occurs

#

the issue is the loaded dtype regarding the vae decode error

#

its loading the vae as f32 i think

#

ltxvideo 0.9.1 was made as a bfloat16 model unlike its original which was in float32

#

I'm fairly certain it's an issue with the code handling the vae's dtype

reef ivy
#

I dunno, I can run the bf16 model fine in it, there is something with the new vae itself. I have gotten the new model to work with the vae from the old one, but extracting the new one still gives the same message.

somber trellis
#

oh no i can run both the bf16 and fp32 dtypes fine

#

its the vae decode that breaks

#

ye like you said

reef ivy
#

You can also choose float32 in the model loader

somber trellis
#

the vae included with the model doesnt seem to work for us

reef ivy
#

Yeah, I have set the vae to run on CPU and it hasn't errored yet but its taking forever so far

#

They said something about the vae and compatibility but their workflow uses the default vae decoder.

somber trellis
#

i just did a 1-step run with set vae device to cpu

#

nothing's outputting lol

#

it's just stuck

#

well at least hunyuan works

reef ivy
#

Yeah, I had to quit out.

somber trellis
#

pity i cant do 5 seconds

reef ivy
#

could try hunyuan 2 flux img2img 2 ltx image/video to video.

somber trellis
#

well currently ltx no woky

reef ivy
#

i mean 0.9.0

somber trellis
#

I've tested 0.9.0 enough to know that it is very inflexible as a model

reef ivy
#

only issue with it is wonky movement with img2video

somber trellis
#

It's good when you get a source

#

like an image or video

reef ivy
#

Yeah, that's why I say use a hunyuan video and a flux image with the img/video to video workflow.

#

so take the first frame of hunyuan do an img2img upscale with flux, then take both into ltxvideo. Might get a decent output with good movement? may try it later

#

really wish we could use the new model though, it seems pretty good

somber trellis
#

hes runnin get em

reef ivy
#

got a bunch like that, but then got this one

somber trellis
#

ye but its at 256x256

#

its barely holding together

reef ivy
#

Seems 512x512 works best for me

somber trellis
#

Maybe I should go down to 4_0 instead of 4_k_m

#

the ram usage difference is big enough

reef ivy
#

wide screen also bugs out for me, but this could be just the fast lora issue

#

kijai said it's experimental. Waiting for a full quant of the fasthunyuan model

somber trellis
#

oh actually hes kinda terrifying

#

his hand is the sword

reef ivy
#

ai can't tell which way he's facing

#

I use the 4_0 quant

honest hull
#

welcome to GenAI era 😉

somber trellis
#

ok with 4_k_m i can actually do 384x256 at 121f

#

at 21s/it

#

aka 2 minutes for 5 seconds
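
A back-of-the-envelope check of those numbers. Two values here are assumptions and not stated in the message: 24 fps playback and 6 sampling steps (the step count another run in this thread used). Function names are illustrative.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Length of the generated clip in seconds."""
    return frames / fps


def sampling_seconds(s_per_it: float, steps: int) -> float:
    """Time spent in the sampler, ignoring model load and VAE decode."""
    return s_per_it * steps


# 121 frames at an assumed 24 fps, 21 s/it at an assumed 6 steps.
print(f"clip length: {clip_seconds(121, 24.0):.2f} s")
print(f"sampling time: {sampling_seconds(21.0, 6) / 60:.1f} min")
```

Under those assumptions the figures line up with roughly 2 minutes of sampling for about 5 seconds of video.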

#

dude is morphin

#

but his walk is consistent

#

its like he noticed his helmet disfigured and put his hand up to fix it

reef ivy
#

honestly not bad

#

closeup shots will likely do better at the lower res

somber trellis
reef ivy
#

lol

#

so yeah, flux is still much slower on 2.6 but I got ltx running about the same, maybe it's because of the lower resolution and it slows down after a certain size/vram usage

honest hull
#

🙂 we now also have Ai Playground 2.0 alpha build released. feedback is welcome and look forward to community members adding new workflows for non-technical users to experience the cutting edge models

somber trellis
#

you guys added ipex-llm to that

#

correct

honest hull
#

we added llama.cpp as experimental support

somber trellis
#

This was more regarding if comfyui would be able to use sym_int4

#

It'd honestly be even more nice if intel had some method of using svdquant

#

that quantization method is something that would be to die for on models like hunyuan

honest hull
#

ah. so you are looking for something like a custom comfyUI node — intel neural compressor model loader ?

somber trellis
#

Something that can load models into sym_int4 format. Video models or image models alike.

#

A node that could do that would be very convenient

#

it would be a great alternative to gguf int4
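
For anyone unfamiliar with the term, here is a generic sketch of symmetric 4-bit quantization in plain Python. It is only an illustration of the idea and not ipex-llm's actual sym_int4 layout. Real formats typically quantize in small blocks with one scale per block, while this uses a single scale for the whole list.

```python
def quantize_sym_int4(weights):
    """Quantize a list of floats to signed 4-bit integers with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Symmetric: zero maps to zero and the largest magnitude maps to +/-7.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 4-bit integers."""
    return [q * scale for q in quantized]


weights = [0.7, -0.3, 0.12, 0.0]
q, scale = quantize_sym_int4(weights)
print(q)
print(dequantize(q, scale))
```

The small value 0.12 comes back as roughly 0.1, which is the kind of rounding error that quantization trades for memory.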

honest hull
#

let me check with team and see what we could do

somber trellis
reef ivy
#

have tried setting the vae to bf16 and fp32 and it still errors on vae decode. Gonna try fp16, have no idea what's happening atm.

reef ivy
#

RuntimeError: Input type (float) and bias type (struct c10::BFloat16) should be the same

honest hull
#

there seems to be updates in comfyUI repo regarding hunyuan vae, have you tried updating

reef ivy
#

The vae error is for the new ltx 0.9.1 but I can check and see if there is another update.

earnest grotto
#

@silk umbra

silk umbra
#

Ooo

earnest grotto
#

aipg 2 has bmg support

open junco
#

i followed all the steps suggested to install and make ComfyUI work with an Intel graphics card...

can anyone help with installation steps ... i followed all suggested steps.. but getting these weird errors.

earnest grotto
#

@open junco ^

open junco
#

thanks a lot @earnest grotto for quick reply..

... will try if present attempt fails..
presently, someone suggested AI playground from Intel ...
seems its working .. not sure which one is better though... (it seems to be using python version 3.11.10 )

post installation .. i tried to copy checkpoints etc.. to comfyUI .. and loaded it independently... seems its using A750 GPU .. for now...

earnest grotto
#

aipg is recommended
it uses comfyui as a backend but I'm not sure if it exposes it for you to use yet

#

but if you also just want something simple to start generating images asap, would recommend

open junco
#

regarding this..
"it uses comfyui as a backend but I'm not sure if it exposes it for you to use yet"

was able to make it work with GPU.. for now atleast..

but yeah.. im not sure if it has future implications 😛 .. (i mean if that causes any other problems)

somber trellis
earnest grotto
#

since this is a comfyui thread, presumably you want a nodal mess like Blender or Houdini or Unreal or whatever else

open junco
earnest grotto
#

and comfyui provides the spaghetti

earnest grotto
#

if you just want images, get aipg 2.0

open junco
#

thank you!!

reef ivy
#

new update for ltx vae, going to see if it works

#

Nope, ltx still fails at vae decode with RuntimeError: Input type (float) and bias type (struct c10::BFloat16) should be the same Sucks

#

So my guess is either the model or the vae is loading at fp32? I've tried setting vae to fp32 maybe I should try the model?

somber trellis
#

ltx 0.9.1 is force loaded as bf16

reef ivy
#

it has an option for fp32, I've also set it in the comfy ui args, but nothing so far.

#

bf16 should work anyway, since we force that on intel already. Don't get it tbh. tried forcing vae to bf16 as well.

#

When i researched this error, it was usually fixed with --no-half and --no-half-vae, which basically forced fp32 precision.

#

I've separated the vae and I have gotten the model to work with the 0.9.0 vae but not sure how much that affects quality

reef ivy
#

with the help of chat gpt I got it to work, although maybe with errors lol. Added this to causal_conv3d.py: x = x.to(self.conv.weight.dtype), right before x = self.conv(x) and return x

#

oh can't post the entire thing,

earnest grotto
reef ivy
#

yeah, I guess about 20 lines of code is the cut off

#

I notice the speed fluctuates a bit, probably going between float types during generation?

earnest grotto
#

@upbeat crow @reef ivy

upbeat crow
#

Yeah.. I dunno why

#

Sorry guys

#

@reef ivy can you send me a dm of the whole thing, i wanna see if I can repost it

reef ivy
#

okay hold on

upbeat crow
#

```python
def forward(self, x, causal: bool = True):
    if causal:
        first_frame_pad = x[:, :, :1, :, :].repeat(
            (1, 1, self.time_kernel_size - 1, 1, 1)
        )
        x = torch.concatenate((first_frame_pad, x), dim=2)
    else:
        first_frame_pad = x[:, :, :1, :, :].repeat(
            (1, 1, (self.time_kernel_size - 1) // 2, 1, 1)
        )
        last_frame_pad = x[:, :, -1:, :, :].repeat(
            (1, 1, (self.time_kernel_size - 1) // 2, 1, 1)
        )
        x = torch.concatenate((first_frame_pad, x, last_frame_pad), dim=2)
    # my edit
    x = x.to(self.conv.weight.dtype)
    # end of edit
    x = self.conv(x)
    return x
```
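
To see why that one-line cast helps, here is a small standalone sketch of the same mismatch. It uses float64 weights with a float32 input as a stand-in for the bf16/float32 case so it runs on any CPU. The mechanism is the same but the dtypes are an assumption for the demo. It returns None if PyTorch is not installed.

```python
try:
    import torch
    from torch import nn
except ImportError:  # keep the sketch runnable on machines without PyTorch
    torch = None


def run_demo():
    """Return (failed_without_cast, output_dtype_with_cast), or None without PyTorch."""
    if torch is None:
        return None

    conv = nn.Conv3d(1, 1, kernel_size=3, padding=1).double()  # weights in float64
    x = torch.zeros(1, 1, 4, 4, 4, dtype=torch.float32)        # input in float32

    try:
        conv(x)
        failed = False
    except RuntimeError:
        failed = True  # dtype mismatch between input and weights/bias

    # The workaround from the thread: match the input to the layer's weight dtype.
    y = conv(x.to(conv.weight.dtype))
    return failed, str(y.dtype)


print(run_demo())
```

The cast hides the symptom at that layer. It does not explain why the input arrives in a different dtype in the first place.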

#

==============================

#

its letting me post it.

earnest grotto
#

code block?

earnest grotto
civic charm
#

Duped the conv2d fix for conv1d and conv3d too

reef ivy
#

okay, I will in a few

reef ivy
upbeat crow
#

so discord probably limits code block

reef ivy
#

figure it's easier to read and copy that way.

#

yeah, that's what I was thinking

upbeat crow
#

sorry guys

reef ivy
civic charm
#

It casts inputs to model data type

#

Is the model in FP32 or FP16?

reef ivy
#

bf16

#

Not sure what the newer model is, but it's also probably bf16, however the new vae is likely fp32 or something.

#

It's only slow in the new ltx workflow, in the original workflow it's fine

#

So it's something with their nodes

#

can't get the latest model to work in old workflows

#

With using the old workflow, the new model has good speed but gets this error at vae decode now Expected all tensors to be on the same device, but found at least two devices, cpu and xpu:0! (when checking argument for argument self in method wrapper_XPU_out_addmm_out) on new workflow it works but is about 2x slower

reef ivy
#

Getting this error constantly now on all workflows with ltx. But it's random, sometimes it works fine after a restart

#

So messing around I got speed back by changing cpu to xpu in this code, in their loader_node.py

#

and it still got deleted lol. hold up let me make a text file

#

my theory is when it loaded into cpu, it kept the fp32 weights when it shouldn't.

#

or maybe not, still slower at higher frames. with their new nodes, and xpu error on old ones.

#

crazy speed difference at same settings, old nodes vs new

#

first one at 5 steps and less frames seemed on par with how it should be, but once upped the steps and frames it went down exponentially

#

okay I get it, probably not offloading model so slows down over time.

reef ivy
#

getting a workflow together but it may need a node I had to add xpu support too in order to work. The node they built does something that makes everything way slower, I think it has something to do with mixed precision. It's noodle soup atm, i may post something tomorrow if I get it decent enough.

tiny bolt
#

someone is lazy and didnt write device agnostic code

#

is my guess
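
For context, "device agnostic" roughly means asking the runtime which accelerator exists instead of hard-coding "cuda" or "cpu". A minimal sketch, assuming a PyTorch build that may or may not expose `torch.xpu`. The function name is made up and this is not how the LTXVideo nodes are written.

```python
try:
    import torch
except ImportError:  # fall back cleanly when PyTorch is not installed
    torch = None


def pick_device() -> str:
    """Return 'xpu', 'cuda' or 'cpu' depending on what this machine exposes."""
    if torch is None:
        return "cpu"
    xpu = getattr(torch, "xpu", None)  # absent on builds without XPU support
    if xpu is not None and xpu.is_available():
        return "xpu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print("using device:", device)
# A node would then call tensor.to(device) rather than tensor.to("cuda").
```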

somber trellis
#

it's just the ltxvideo custom nodes

#

I'd say we really just have to wait until 0.9.1 is supported by the official nodes so that we have device agnostic code to utilize

#

other than that im kinda just waiting for info on how to or if we can utilize a comfyui workflow in ai playground

somber trellis
#

@reef ivy disty updated the ipex-to-cuda repo

#

something regarding ltxvideo was changed and now im running it

#

0.9.1 is a huge improvement seemingly

#

but dead straight, hunyuan 4_0 beats it by a mile

#

It's just so much more data even if it is quantized

#

LTXV 0.9.1 With pag and stg

reef ivy
#

Hunyuan just doesn't have img2video, so no control over output. But it is good

#

With the speed of the official workflow its better to use cogvideo at that point

#

would something like this work if added to the hijacks? torch.cuda.device_count = torch.xpu.device_count. I basically had to change this to force the vae to load onto xpu instead of cpu. Otherwise it would get the cpu and xpu error at vae decode when not using their official nodes.

#

it's noodle soup with a bunch of unused bypassed nodes since i am still experimenting but you can see where I am going with it. Bypassing their model configuration node allows me to make 121frame videos at around 6s/it, as opposed to like 50s/it for 49 sec with their stuff. Just not sure about the stg quality as I am using a different sigma node. Guess for anybody interested lol.

honest hull
#

what if you add reserve-vram 8.0 ?

reef ivy
#

8? I can try it and see, I have reserve-vram 6.0

#

Doesn't seem to help with their official workflows. It's something with their model configuration node and the new vae. I read about some nvidia users OOMing with the workflow even with the older model, so likely related. Only issue with my new workflow is you have to edit that custom node (force/set vae type) for xpu support

#

Another thing with their workflow I noticed, increasing the step count also decreases the iteration speed for some reason.

honest hull
#

when models are too big and it starts using shared GPU memory fallback

reef ivy
#

Yeah, there is an issue with the newer drivers for windows with memory and a750, and this is compounded by a similar issue with ipex that starts with 2.3.1. 2.1.4 doesn't have the issue, and drivers from 5971 can load larger models without --reserve-vram

#

if you follow this convo I had from the pinned post you can see where the driver issue started and I found the reserve-vram fix. #1193952640225267802 message

#

so actually while it doesn't actually fix the issue, it does speed up the generation with their workflow. from 60s/it to 30s/it with reserve vram 8.0. #1193952640225267802 message also speed didn't decrease with step increase this time. However it's still like 5x slower than it should be with their workflow

honest hull
#

if you are on A770 you can even try reserve vram 16

reef ivy
#

I am on a750

honest hull
#

doesn’t hurt to try lol.. 😆 the beauty of pathfinding

reef ivy
#

lol that is true

#

no change for me, but maybe it will help somebody on an a770. I will continue to work on my workflow as an alternative, only issue will be the custom node edit.

#

with mine you can get 6s/it with 121 frames

reef ivy
#

So far no quant for 0.9.1 ltx, I think it is already bf16 it's only around 5gb. It uses a new vae structure which is what is giving everyone problems

#

Once there is a quant I will try it out.

somber trellis
#

Pretty sure I can convert its safetensors

#

welp nvm

#

i dont think i actually can

reef ivy
#

and are you using the official workflow?

somber trellis
#

this is how i have my workflow laid out

reef ivy
#

damn, must be the vram issue and reserve-vram doesn't help. Although maybe I will try and add --lowvram back and see if that helps. I get 6s/it with the new model and my workflow at 121f and like 30-50s/it with the official one.

#

nope, their nodes are just slow, specifically the model configurator. For my workflow you have to force the vae onto gpu to retain speed, but that requires an edit of a specific custom node that most people would likely have trouble doing.

#

but going from 6s/it to 30-40s/it is crazy.

silk umbra
#

how did intel get flux working so fast in the battlemage reveal

#

meanwhile it's slow for me

wicked fulcrum
# silk umbra how did intel get flux working so fast in the battlemage reveal

Do the following

  1. Restart your PC
  2. Launch AI Playground
  3. Go to Settings, workflow
  4. Select Flux.1-Schnell me Q4
  5. Be sure resolution is 896x896. At 0.8MP
  6. Set number of images to 4
  7. Enter a prompt

The following should happen at these relative times

Load model: 0:25s
Generate 1st image: 20s
2nd-4th image: 12s each

If performance degrades over time, Restart the ComfyUI and AI Playground backend in Basic settings

somber trellis
#

Everything else should be comfycore

reef ivy
#

I will have to take a look at that one, been awhile since I messed with hunyuan. Also taking a break with holiday stuff.

gusty goblet
#

dear sir:How can I use Python to send an image to ComfyUI via its API and then receive the generated image in response?

earnest grotto
#

You want to batch queue up images? Make your own prompt generator? You wanna make your own UI for Comfy?
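
If it's the plain "send image, get image back" case, a minimal sketch using only the standard library could look like the one below. It assumes a ComfyUI server on the default port 8188 (AI Playground uses a different one) and a workflow exported in API format. The endpoints used are /upload/image, /prompt, /history and /view; the helper names are made up for this example.

```python
import json
import os
import time
import urllib.parse
import urllib.request
import uuid

SERVER = "http://127.0.0.1:8188"  # assumption: default ComfyUI address


def encode_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body holding one file; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_image(path: str) -> str:
    """POST the file to /upload/image and return the name ComfyUI stored it under."""
    with open(path, "rb") as f:
        body, ctype = encode_multipart("image", os.path.basename(path), f.read())
    req = urllib.request.Request(
        SERVER + "/upload/image", data=body, headers={"Content-Type": ctype}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["name"]


def run_workflow(workflow: dict, poll_seconds: float = 1.0) -> list[bytes]:
    """Queue an API-format workflow, wait for it, and return the output images as bytes."""
    payload = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode()
    req = urllib.request.Request(
        SERVER + "/prompt", data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        prompt_id = json.load(resp)["prompt_id"]

    # /history/<id> returns an empty object until the job has finished
    while True:
        with urllib.request.urlopen(f"{SERVER}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            break
        time.sleep(poll_seconds)

    images = []
    for node_output in history[prompt_id]["outputs"].values():
        for img in node_output.get("images", []):
            query = urllib.parse.urlencode(
                {
                    "filename": img["filename"],
                    "subfolder": img.get("subfolder", ""),
                    "type": img.get("type", "output"),
                }
            )
            with urllib.request.urlopen(f"{SERVER}/view?{query}") as resp:
                images.append(resp.read())
    return images
```

Usage would be roughly: load the exported workflow JSON, set the LoadImage node's input to the uploaded name, e.g. `workflow["10"]["inputs"]["image"] = upload_image("input.png")` (the node id "10" is hypothetical and depends on your workflow), then call `run_workflow(workflow)` and write the returned bytes to files.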

honest hull
#

Happy holidays everyone!

reef ivy
#

Happy holidays!

signal patio
#

ERROR: torch-2.1.0a0+cxx11.abi-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.

can someone help me plz, how to fix it?

signal patio
earnest grotto
#

What OS, what python version

#

why are you installing an outdated torch and presumably ipex

#

you can just install AIPG, it installs comfyUI

#

Or if you want a simple installer for just comfyUI, I have a script ^

signal patio
signal patio
earnest grotto
#

For AIPG, you will need to probably inspect element or look in the command prompt somewhere for the actual port to connect to, as it's not the comfyui default 8188

#

Or there might be some button in the ui now, not sure

signal patio
earnest grotto
#

ComfyUI, like the other webuis, is a server that runs on your own computer that you connect to through the browser, where you see the actual UI
Same for A1111, SDNext and so on
AIPG (and SwarmUI) run a Comfy server, but have their own UI and communicate with that server themselves
But you can also just connect to it directly

#

In that vein, you can also open it up (depends on webui) and connect through your phone

#

Not very convenient for ComfyUI specifically, though usable

#

Or if you portforward properly, can connect from outside your lan as well

signal patio
#

ill stick with clasic comfyui interface, because i have my workflows from previous rtx gpu

earnest grotto
#

ComfyUI includes the UI

#

As in... so does AIPG

#

You can connect (just to a different port) to use comfyui as usual

#

but if you don't wanna do that, there's my script too

#

you need conda, git, any python. if you have python, the script will tell you where to get conda and git from

earnest grotto
#

.

signal patio
#

is this ok?

earnest grotto
#

probably fine

#

hmm

#

do you have your pc's locale set to japanese or arabic or something else

#

russian

signal patio
#

this was the reason of this errors?

earnest grotto
#

yes

signal patio
#

im stuck there or its done?

earnest grotto
#

If it does not say that something went wrong or that it's done, then it's not done

signal patio
earnest grotto
#

wait more

#

it's downloading

#

probably ipex, off the cn mirror which was slow-ish but not super slow, and right now it could just be giga slow

#

mm, bmg added

#

yeah uh, china is REALLY slow right now apparently

signal patio
#

need to add progress bar to the script if its real lol

earnest grotto
#

it's temporary

#

normally you'd download from us. but something happened and we couldn't access that, even the official guide had the cn mirror in it

#

looks like it's back though

signal patio
#

lemme run a proxy

earnest grotto
#

That's not going to change anything

#

I'm going to update the script to download off us, fix the broken text encoding and use battlemage since that's finally a thing

signal patio
#

im from Spain now lol

#

but nothing changed

#

Intel blocked Russia, seems like this is the main reason?

earnest grotto
#

@signal patio stop the script. Download it again. Run it again (preferably same directory)

#

It's probably gonna take a whole day to finish (the one you were running)

#

the stuff it did already download, will be verified then that's that
It's not gonna redownload things

signal patio
#

😢

#

why doesn't the official repo do everything for us

#

i mean comfyui

earnest grotto
signal patio
# earnest grotto what

I mean, for rtx all u need to do is just download folder of comfy and run. Sad that users of other hardware can’t do it that easy.

earnest grotto
#

everything should be there now to allow a simple-ish download-and-run install for comfy

#

it will be slow, though that's a pytorch issue, they did something dumb with 2.5

#

sdnext is a simple install

#

but it's not comfy

earnest grotto
#

???
so 2.3 works on us but access is denied to 2.5... odd

earnest grotto
signal patio
earnest grotto
# signal patio

Either
a) Run the script in a folder that does not need admin privileges
or b) Run powershell as admin, type in Set-ExecutionPolicy Unrestricted, then press enter. Run script from anywhere.

midnight mauve
#

Is there any updated instruction to work on comfyui in the intel arc by @somber trellis in pinned comment.

someone please help.

I get 4gb error

earnest grotto
#

@midnight mauve ^

reef ivy
#

Does the script edit and re-edit the model management file when installed now?

earnest grotto
#

well, it edited it before as well, just didn't refresh that when running again

#

though after trying out 2.5 on windows more... damn that is SLOW

#

literally 2x slower (~1.7s/it vs ~3.1s/it) for sdxl

#

but it's the only one with battlemage wheels

#

not that I have a battlemage gpu

earnest grotto
#

@vocal swan Do you want to install comfyui using my script, and say if that works?

#

This one

#

First with 2.3

vocal swan
#

Okay 1 min

vocal swan
#

sorry for the wait, seems to be working great, installed without any problems and the gens are much quicker at 3it/s

earnest grotto
#

did you choose pytorch 2.3 or 2.5

vocal swan
#

b580, sdxl, pytorch 2.5 as i just used the link u refered to and it installed 2.5

earnest grotto
#

ok, there's no earlier pytorch for battlemage, rip

vocal swan
#

why, is 2.3 better ?

earnest grotto
#

expect, hopefully, ~1.5-2.0 it/s in 3 months or somewhere around there

earnest grotto
#

it's faster

#

worse compatibility, some things are broken (notably stable cascade)

vocal swan
#

hm alright, either way thanks a lot for helping me out I thought i was completely unable to do anything

earnest grotto
#

well, the problem ultimately still isn't solved

vocal swan
#

Yeah I suppose, just wanted to get sd running anyway

reef ivy
earnest grotto
civic charm
#

PT 2.3 vs PT 2.5 on SD 1.5 for example:

signal patio
#

i thought its the same as windows

earnest grotto
# signal patio its better on linux?

You should assume by default that on linux for any GPU (incl. AMD, Nvidia), compute will be better (let's say 20% faster) and gaming worse (let's say more than 20% slower)

civic charm
#

On Linux;
Transformer models like SD3 and Flux have no change in speed

#

UNets like SDXL have a 1.5x to 2x slowdown

#

SDXL goes from 1.6 it/s to 1.0 - 0.8 it/s

#

PT 2.3 vs PT 2.5
Both on Linux

#

literally 2x slower (~1.7s/it vs ~3.1s/it) for sdxl

Linux is 3x faster than Windows on both cases if you use this metric from Vik

#

I am using Arch Linux with Linux kernel 6.12.6

signal patio
#

I was thinking about to start using linux since deadlock works well with proton
I need comfyui for flux and latest photoshop but i think its a pain in the ass to get it working

#

My gpu is a770 16gb

civic charm
#

You are going to have a bad time with any adobe software on linux

signal patio
#

Seems like VM is the only way

#

All i need from adobe is inpainting, it’s better to build a workflow for that and stop using Photoshop 🤣

civic charm
#

Check out Krita with Stable Diffusion plugin

#

Krita + SD is very powerful tool

signal patio
#

Thx ima definitely check that out

#

What about support in other things in linux for arc? Its not that good as amd?

civic charm
#

Both AMD and Intel are plug & play on Linux

#

Just don't use Linux kernel 6.8 on both

#

AMD has severe stability issues and Intel either doesn't work with stable diffusion or works but 2x to 4x slower than normal on Linux 6.8

#

6.12 is stable and fast on both

civic charm
#

Ubuntu 24.10 uses 6.11

#

6.11 is fine on Intel

signal patio
#

Yea but switching from windows to ubuntu is strange for me, because ubuntu is kinda windows in Linux world 😂
Its ok for people who never tried linux distros before

earnest grotto
#

but nonetheless the same thing I did was faster on linux

#

with sd xl base 1.0, with a barebones workflow with the default ksampler, I get
0.67s/it with 2.3 ||1.48 it/s|| windows

#

1.22s/it with 2.5+IPEX, windows again (comfy has the allow bf16 sdp enable by default)

#

0.71s/it with 2.3 ||1.40it/s|| linux with 6.5.0-44-generic

#

0.73s/it with 2.3 ||1.36it/s|| linux with 6.11.11-061111-generic

#

i think that's about as much as I'm willing to test
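
Since both s/it and it/s get thrown around in this thread, here is a small sketch for converting between them and for working out a slowdown factor. The numbers plugged in are the s/it values from the messages above; the function names are just for illustration.

```python
def its_per_second(seconds_per_it: float) -> float:
    """Convert seconds per iteration to iterations per second."""
    return 1.0 / seconds_per_it


def slowdown(baseline_s_per_it: float, other_s_per_it: float) -> float:
    """How many times slower `other` is than `baseline` (both given in s/it)."""
    return other_s_per_it / baseline_s_per_it


# s/it values quoted above for SDXL base 1.0 with the default KSampler
runs = {
    "2.3, windows": 0.67,
    "2.5+IPEX, windows": 1.22,
    "2.3, linux 6.5": 0.71,
    "2.3, linux 6.11": 0.73,
}

if __name__ == "__main__":
    baseline = runs["2.3, windows"]
    for label, s_per_it in runs.items():
        print(
            f"{label}: {s_per_it:.2f} s/it = {its_per_second(s_per_it):.2f} it/s, "
            f"{slowdown(baseline, s_per_it):.2f}x vs 2.3 on windows"
        )
```

Keep in mind that a higher it/s is faster while a higher s/it is slower, so "2x slower" means s/it doubles or it/s halves.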

#

welp, flux was notably faster on linux at one point, and I still have some fairly bad vram-leak-like issues on windows (native, not wsl) which simply don't happen on linux

#

and for these removing empty cache makes them worse

reef ivy
#

Does wsl still have the memory leak?

#

Might try and run comfy from wsl and update to 2.6 or 2.5. i do remember giving up some speed but windows was fast enough before. 2.1.4 was the last fast ipex in windows.

earnest grotto
civic charm
#

We finally got temp reading in Linux with 6.13 kernel. It took way too long for an essential feature :/

civic charm
#

GPU is not fully utilized with UNet models on PT 2.5

slender zodiac
#

Exciting times 🤖

somber trellis
#

Are there commands besides torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True) that can assist in performance on 2.6.0?

earnest grotto
somber trellis
#

It was recommended by Li to add it 🤷‍♂️

earnest grotto
#

comfyui didn't have it back then (maybe)

#

it has it now (definitely)
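
For anyone who still wants to set it by hand (for example in a standalone script outside comfy), a guarded sketch could look like this. It assumes the setter only exists on newer PyTorch builds, so it checks before calling; `enable_reduced_precision_sdp` is a made-up name.

```python
def enable_reduced_precision_sdp() -> bool:
    """Turn on fp16/bf16 reduction in the math SDP backend if this build supports it.

    Returns True if the flag was set, False if PyTorch is missing or too old.
    """
    try:
        import torch
    except ImportError:
        return False
    setter = getattr(torch.backends.cuda, "allow_fp16_bf16_reduction_math_sdp", None)
    if setter is None:
        return False  # older PyTorch without this switch
    setter(True)
    return True


if __name__ == "__main__":
    print("flag set:", enable_reduced_precision_sdp())
```

Inside ComfyUI this should not be needed, since, as said above, comfy already sets it.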

somber trellis
#

Alrighty then ig 🤷‍♂️

#

I can't use InstantIR or certain upscalers on 2.3

#

and 2.6 is slow and i am sad

#

but it worky so yknow what ok

#

models like NomosUni 2x doesn't work on 2.3

#

super-fast compact upscaling model

earnest grotto
#

well, blame the pytorch foundation i guess

#

i wonder if they're understaffed like blender or what

somber trellis
#

I can't be completely unhappy though. Being capable of generating 512x384 on hunyuan with fastlora 4_0 at 20s/it 49 frames is ok

#

combine that with an upscaler like uni

#

which takes 2 seconds to upscale 49 frames

earnest grotto
#

i don't think that video is worth 16 minutes of wait

somber trellis
#

15 minutes

#

that video took 2 minutes

#

fasthunyuan allows inference to be done in 6 steps

#

Even on FP8 fasthunyuan I can generate a 512x384 49f video in 3 minutes

earnest grotto
#

honestly even 2 minutes might be too long

somber trellis
#

It isn't. LTXVideo takes more time.

earnest grotto
#

i mean in the sense of, it's simply not worth it, quality is too bad

#

it's a still image with shaky cam but because it goes through a VAE it has fat artifacts

#

and it's some big artifacts

somber trellis
#

doesnt help that it's interpolated with rife

#

and upscaled

earnest grotto
#

you can just make an image with flux and shaky cam it with anything else

somber trellis
earnest grotto
#

high fidelity, chainlink probably won't shuffle around though i'm not sure if it'd still look sensible

earnest grotto
somber trellis
#

hunyuan likes long prompt videos

#

but unlike ltx, can do short-prompts like that one

#

Also I'm generating at a far lower resolution and total framerate than the model is designed for (and its fasthunyuan, 6 steps instead of 30)

#

keep it in mind

#

I hope LTXVideo becomes more usable.

reef ivy
#

Ltx is really good with img2video, you just need good prompts

somber trellis
reef ivy
#

Also video2video

somber trellis
#

hunyuan can do good text to video

#

if done right

reef ivy
#

Yeah, it's the best at text2video. Problem is consistency for me, but they do have lora training now

honest hull
#

so yeah, it’s included in comfyUI now

earnest grotto
#

huh nice

honest hull
civic charm
#

We need Flash Attention on PyTorch : )

somber trellis
civic charm
#

2.3.1 CPU

somber trellis
#

Damnit

#

I just want a fast pytorch that supports newer stuff

#

welp

tiny bolt
#

i remember flash attention isnt just some matrix operations in pytorch DoggoGrin the need of cuda means sycl is needed to implement it on arc gpus

civic charm
#

It needs direct access to GPU caches

#

It works by executing attention exclusively in the GPU cache

midnight mauve
#

Can someone help me how to do a text to video models in vik's script

#

I tried hunyuan but got an error related to bitsandbytes

signal patio
#

hey @civic charm 👋
can u help me to figure out why i can't get 4k 144hz under arch on my a770? same setup on windows was working fine
just installed latest arch (6.12.7-arch1-1) + Gnome
connected to my 4K 144Hz monitor with display port

Using journalctl -b | grep drm I'm receiving this error: archlinux kernel: [drm] DisplayID checksum invalid, remainder is 190

xrandr --listproviders gives me 0 providers

signal patio
#

Also fans is not spinning on my gpu

#

Wtf 😬

civic charm
#

Also i use Wayland

#

Another issue i have ran into is DP ports doesn't support custom refresh rates on my A770. HDMI overclocks to 75 Hz fine.

signal patio
#

I just reinstalled everything from 0 again, but with kde now
same problem - 60hz refresh rate is maximum

civic charm
earnest grotto
signal patio
civic charm
earnest grotto
#

my script works on linux as well, unless you're on battlemage, then i'm not sure

#

ok, alchemist

#

hmmm

#

ok nevermind

#

something's broken

earnest grotto
#

either do what that tells you (create a venv, or get conda and make a conda environment, or whatever else)
or... I'll look into fixing whatever broke with my script 🤔

#

python -m venv your_venv_name

#

newer linices have python 3.12. you will want to have 3.11

#

conda can install any python. otherwise, deadsnakes ppa

signal patio
#

I’ve never seen this, should I do something with this venv every time I want to run comfy and when I stop using it?

earnest grotto
#

you should in general have every python-using thing have its own venv

#

it avoids conflicts.

#

some things need specific versions of specific packages. if they use the system python and install there, things will override one another's requirements and break

#

some bigger projects will outright bundle their own python

#

blender does this

signal patio
#

Thank u for explaining! I should create this venv and put pytorch command in a specific folder? Or it doesn’t matter?

earnest grotto
#

the venv is a folder
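
To make "the venv is a folder" concrete, here is a small sketch that does the same thing as python -m venv your_venv_name, but from Python via the standard venv module. `create_venv` is a made-up helper name and the folder name is whatever you choose.

```python
import sys
import venv
from pathlib import Path


def create_venv(target: str, with_pip: bool = True) -> Path:
    """Create a virtual environment in `target` and return the path to its Python."""
    venv.EnvBuilder(with_pip=with_pip).create(target)
    # Windows puts the interpreter in Scripts\, everything else in bin/
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return Path(target) / bin_dir / exe


if __name__ == "__main__":
    python_path = create_venv("your_venv_name")
    print("venv python:", python_path)
```

Everything installed with that interpreter (for example `your_venv_name/bin/python -m pip install ...`) stays inside the folder, so it can live inside or outside the comfy directory, and deleting the folder removes the whole environment.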

signal patio
#

So i need to create it inside comfy folder?

earnest grotto
#

@signal patio Just use my script ^

signal patio
earnest grotto
#

it's for both

signal patio
earnest grotto
#

?

signal patio
#

Im starting thinking that this is impossible to run comfy on arc

earnest grotto
#

whut

signal patio
#

Everything is so complicated

earnest grotto
#

things are getting simplified

signal patio
earnest grotto
#

if the pytorch foundation didn't massively slow down 2.5 it would be much simpler

earnest grotto
#

a bunch of various driver packages are probably missing

#

or is there no bash in bin on arch

#

do clinfo and say what happens

signal patio
#

Sorry for photos, discord is not working for me

earnest grotto
#

that's all? it doesn't detect the arc gpu

signal patio
#

Yup, thats all

earnest grotto
signal patio
#

I think its time to remove arch forever

#

I dont know how to call people who creates a distro with a font without japanese, cyrillic and etc by default
This shit is sooooooo dumb

earnest grotto
#

ubuntu i'm using has cyrillic working in the terminal

signal patio
#

Yea, ubuntu is way more comfortable

#

But

#

Its bloated with telemetry

#

Fedora is the same

earnest grotto
#

idk, i don't care about telemetry, I use discord already and it has no ads

open junco
earnest grotto
#

i've removed it

#

you just run the python script

signal patio
#

Running powershell from administrator

#

Restriction policy unrestricted now, but the problem is the same

earnest grotto
#

At this point though, Python/Windows's fault.

#

I'm making some workaround

#

@signal patio run python, run this, and show me what the outputs are for this:

import sys
import subprocess
print(sys.getdefaultencoding())
print(sys.getfilesystemencoding())
print(subprocess.check_output("chcp", shell=True))
earnest grotto
#

type python in the command prompt. Press enter. Copypaste the code above. Press enter.

signal patio
earnest grotto
#

Press enter after typing in python.

#

@signal patio

signal patio
earnest grotto
#

so python is cooked, jesus

#

alright wait a bit

signal patio
earnest grotto
#

This is an issue with python in general.

#

It's working "correctly"

signal patio
#

🤣🤣🤣

#

Working fine, like Arch 🤣🤣🤣

earnest grotto
#

@signal patio Download the script again. run it again.

#

It should work now. If you see funny symbols instead of proper text, no idea

signal patio
#

BUT

#

When i run it

earnest grotto
#

Just wait it out

#

This is normal

#

You are also going to see it every time

#

Wait harder

signal patio
earnest grotto
#

List your full specs

signal patio
#

Intel Core i5 12500
Intel Arc A770 16gb
16gb RAM

signal patio
earnest grotto
#

that's bad but that's not the issue

#

are you using the igpu for anything

signal patio
earnest grotto
#

@signal patio Download script again, run script again

#

in the future consider disabling igpu but this should now work with the igpu enabled

earnest grotto
#

You can run it in the same place as last time, it'll update whatever's already installed

reef ivy
#

hey @earnest grotto @civic charm is there anything I would need to add to comfyui with wsl2 to mitigate the memory leak issue? Or is it already in the hijacks?

#

I do remember using like tmalloc or something before, but I don't fully remember anymore

civic charm
#

tcmalloc will help with the system ram leaks

signal patio
#

Can someone help me to run comfy on arc a770? Its been a week and i still cant do it. Tried official comfy guide from github, tried Vik script, tried Arch, Ubuntu and nothing 😢

reef ivy
#

Thanks disty, will try wsl 2 today or tomorrow

reef ivy
#

Common things: make sure you have the right python for vik's script (3.10 I believe), and make sure you call the oneapi stuff every time you start comfy

reef ivy
#

What is the command you are using to open comfy? I am unfamiliar with this error, but it is happening with the hijacks edit in model_management.py (I think). Are you calling the oneapi files when opening comfy?

#

this could be wrong, but you need to run something like this source /opt/intel/oneapi/setvars.sh everytime you start comfy in linux if using ipex I believe.

signal patio
#

Idk any other ways to run it. My previous card was rtx3060 and i was installing comfy in 1 min just downloading a folder from github

earnest grotto
# signal patio

After it breaks, type in:
python -c "import torch; import intel_extension_for_pytorch; print([torch.xpu.get_device_properties(i).name.lower() for i in range(torch.xpu.device_count())])"

#

press enter
Show what it says
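
The same check as a small script, in case the one-liner is awkward to paste. This is only a sketch: it treats IPEX as optional and returns an empty list when PyTorch or an XPU device is not available. `list_xpu_devices` is a made-up name.

```python
def list_xpu_devices() -> list[str]:
    """Return the names of the XPU devices PyTorch can see (empty list if none)."""
    try:
        import torch
    except ImportError:
        return []
    try:
        # Older setups need IPEX imported before torch.xpu is usable
        import intel_extension_for_pytorch  # noqa: F401
    except (ImportError, OSError):
        pass
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return []
    return [xpu.get_device_properties(i).name for i in range(xpu.device_count())]


if __name__ == "__main__":
    devices = list_xpu_devices()
    if not devices:
        print("no XPU devices visible to PyTorch")
    for index, name in enumerate(devices):
        print(index, name)
```

If both the iGPU and the Arc card show up, the index printed here tells you which one is which.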

earnest grotto
signal patio
earnest grotto
# signal patio

OK, sorry. Type in set ONEAPI_DEVICE_SELECTOR=, press enter, then the above

earnest grotto
#

weird

#

How about then,
python -c "import torch; import intel_extension_for_pytorch; print(torch.xpu.device_count())"

signal patio
#

Maybe

earnest grotto
#

you posted the same screenshot

signal patio
signal patio
earnest grotto
#

remove 1 bracket, sorry

#

at the end

#

python -c "import torch; import intel_extension_for_pytorch; print(torch.xpu.device_count())"

earnest grotto
#

@signal patio

#

set ONEAPI_DEVICE_SELECTOR=

#

then press enter

signal patio
earnest grotto
#

No.

earnest grotto
#

Then press enter

#

No comma after it. No pasting the rest after it.

signal patio
earnest grotto
#

Whatever, do you want to just disable the iGPU from device manager?

earnest grotto
signal patio
earnest grotto
#

It's an IPEX/OneAPI bug

#

Intel peeps worked around it somehow with AIPG but that doesn't quite seem to work for me.

signal patio
#

i should disable igpu in bios?

earnest grotto
#

In device manager.

#

bios in case you want to disable it for linux too

signal patio
#

only

earnest grotto
#

Go into Device Manager and disable it.

signal patio
#

disabled

earnest grotto
#

@signal patio Restart PC so we're sure the disabling takes effect. Download script again. Run again.

signal patio
#

U are legend

#

Thank you for your help, idk i hope Intel will make this easier in the future

#

Because for now it was something like Souls-like quest

earnest grotto
signal patio
#

Now, everything is the same as on nvidia cards? All nodes is working?

earnest grotto
#

Everything in comfy by default, and the custom nodes my script installed, works

#

Of course, not absolutely everything works. For example, the various 3D-related custom nodes, rely on explicitly nvidia-only things (e.g. nvdiffrast)

#

But you're probably not using those

#

The vast majority of things work

reef ivy
earnest grotto
#

there is 2.5 yes

reef ivy
#

Issues with pytorch itself, it's apparently slower on all vendors with intel being the worst afaik.

#

2.6 nightly build is a bit faster but it's still being worked on and is also not as fast as 2.3.

signal patio
#

Guys, how do I install PuLID for flux? All the guides say that I should use the comfyui/python embedded/ folder but I don't have one

reef ivy
#

not sure, my guess is that would be your venv folder. I think that python embedded is for the standalone comfyui.

#

also from what i'm reading on reddit "This workflow currently works with ComfyUI versions 0.2.3 or lower, as the new one uses an incompatible Python version "

paper root
#

Hello,
I’ve been trying to install ComfyUI for 3 days. My graphics card is Intel Arc B580.
I managed to reach a certain point, but I’m encountering an issue. When I click "Generate," after loading the model, the process stops at the clip_text_encode step. How can I resolve this?

reef ivy
#

are you using --bf16-unet

paper root
# reef ivy are you using ```--bf16-unet```

I was trying this way, but the same error persists. I tried --gpu-only and --force-fp16 vs. but still the same.
Python version: 3.10.16, PyTorch version: 2.1.0a0+cxx11.abi

reef ivy
#

you need bf16 for intel not fp16 for comfy

#

your ipex is also old

#

also, maybe try a different workflow or model.

reef ivy
#

So if I update the wsl 2 kernel to 6.2+ do I still need to manually install the intel drivers? Seems they aren't being picked up by torch, but not sure if installing should be necessary. edit seems you do, probably something for loading the windows drivers?

#

also, trying torch 2.6

signal patio
#

I get this error

reef ivy
#

It probably doesn't support python 3.13.1, most webui stuff is on 3.10 or 3.11

#

Also, make sure you are in your env

#

It may also be able to install from the comfyui manager, most nodes can unless they are brand new

#

On another topic, I forgot how much of a pain linux and wsl 2 were lol. Been at this all day. Probably should just nuke my wsl env and start from scratch tbh.

reef ivy
#

took all day to realize that 2.6 doesn't need oneapi and pulls an error if it's called. 🙂

earnest grotto
#

do NOT install the pti if you have the basekit installed

#

and vice versa

#

@reef ivy

#

@signal patio Run comfy with the shortcut. press ctrl+c, this will shut it down. install whatever you need to install, the command prompt will persist with the environment activated.

reef ivy
#

Yeah, I am running it through a venv but was calling one api by default from my old ipex install. Got it running but installing comfy nodes as we speak

#

2.5 needs one api files but not 2.6 nightly apparently

reef ivy
#

keep getting this error now with a bunch of nodes: 2 active drivers ([<class 'nvidia.CudaDriver'>, <class 'intel.XPUDriver'>]). There should only be one.

reef ivy
#

seems to be an issue with 2.6 and triton.

reef ivy
#

Can't get any video stuff to work yet, but flux is getting about 7.45s/it with wsl2 and pytorch 2.6. I believe that is just 1 sec slower than 2.3 but not sure off the head atm.

#

ram usage is still crazy with wsl it seems though.

nocturne fjord
#

I can't get Hunyuan to work I get a completely black output. I tried with 2.3 and 2.5 ipex

#

I use kijai/ComfyUI-HunyuanVideoWrapper

#

Too bad that Node block_Swap is not natively integrated into comfyUI

reef ivy
signal patio
#

How to fix it?

signal patio
tribal hare
signal patio
#

Seems like my comfy is dead

#

Anybody knows how to fix it?

signal patio
# signal patio

i used this command and it fixed BUT i have the same problem as @tribal hare now. Its loading forever

reef ivy
#

Might be an issue with that node, check the github and see of others have the issue.

#

I noticed comfy taking a while to load when I was testing wsl 2 so likely something wrong with comfy commit. I thought it was a wsl 2 issue but I hadn't went back to my windows env to test

somber trellis
reef ivy
#

New pytorch? Is it fast again? I couldn't get 2.6 to work right in wsl, triton issues could be because ipex was installed at one point on the main install.

#

It did seem faster than windows for when it did work in flux at least.

somber trellis
#

🤷‍♂️ it seems around the same as 2.6

civic charm
#

Only fix is a new venv

reef ivy
#

I never had it in the venv but it was installed in wsl 2 itself. Might just have to nuke the whole thing

earnest grotto
#

venv isolates from your main python

proper axle
#

Can someone tell me how to fix this problem? I used the tutorial below to install it.

proper axle
#

like this

#

I use the flux model. When the program progresses to the clip nodes, it will report the UserWarning as shown in the figure, and then comfyui will terminate and exit

earnest grotto
#

how much ram, what gpu

#

what quantization for the t5

proper axle
#

sorry, I don't quite understand what quantization for T5 is.

earnest grotto
#

extremely old.

#

how did that even work? wild

#

AIPG includes comfyui, and it will set things up for you

#

I'd suggest you use that and use its comfy

#

install it, run it, ctrl+shift+i, see what comfy's port is

#

IF you don't want to use AIPG, I have a script that will install comfyui for you, as well as some addons, patches, whatever

#

.