#ComfyUI for Intel Arc using IPEX
install numpy 1.26.4
and install ltx-video's dependencies to match its transformers requirement
I simply go through the dependencies and install compatible packages following their versions
but ye the speed diff between the two versions dont really matter anymore
It's literally around 10-12% perf loss with the added command for inferencing flux
but i can run instantir image restoration now
It should be able to run mmaudio now too I hope.
9s/it on flux with dynamic thresholding AND dpmpp_2m sgm uniform
that isnt too bad
I think I got 6s/it with just base flux at 1024x1024 last I tried.
unlike 2.5.1 the inference speed is stable at around 9s
2.3 liked to go between 7 and 13
and 2.5.1 started at 19s then went down to 8
2.3 was unstable unless I ran the --reserve-vram 6.0 command
trying ltx video and seems like the speed is also extremely slow, or it's not working properly. Hasn't gone past ?it/s in about a minute
wh
Yeah somethings wrong, 174.65s/its lol
it should go about 4 or 5s/it with the length I am using
I can try without it, it helped before on 2.3
ye but ur using it on a model capable of fitting onto your vram
you use reserve vram 6.0 on an 8gb card right
i run the fp8 version of all the checkpoints i have
ltxvideo fp8 works well
yeah, but it typically uses as much as needed. And goes faster when not fully loaded into vram
I have fp16
and the full model, both worked fast on ipex 2.1.4 btw
full model got slow on 2.3.....and I'm guessing on 2.6 fp16 now doesn't work. Seems they are messing up the vram allocation of lower vram cards with each update. sigh
yeah, makes no difference. Still slow. Going back to ipex 2.3 I guess lol
well before that, let me try and update comfy.
If I could get stuff to work on 2.1.4 I'd never update tbh
well updating is better at least, now only 22.33s/its.
still a lot slower than 4s/it lol
what workflow
custom, using ltx and perturbed attention with img2img. I also have florence and ollama but bypassing atm (florence still runs fine).
could also be the windows drivers + ipex, there is still a memory issue with 8gb cards and ipex. Usually solved by --reserve-vram in comfy
could try different --reserve settings.
also thinking of trying wsl2 again
linux was generally faster anyway last i tried
could try reverting drivers again too, but since they got rid of arc control would have to ddu
when you spoke about a florence fix that confused me since I got the promptgen 2.0 model working great
It was an issue with the tokenizer i believe, I simply swapped it out with the base florence 2 large's json
normal florence worked outta the box for me
2.6 works fine at least
we finally have a decent version other than 2.3
that and with ipex-llm supporting ollama 0.4.6
i can use a majority of llms
yeah, it's unusable for me sadly. flux is also much slower, going about 12s/its with the q4 model.
Since all the new stuff requires it, I may try linux wsl2 and pray that it isn't terrible as well lol.
really want to try the mmaudio stuff.
I put the photo through florence2 large, then i run the prompt through Ollama with qwen2.5 7b with a system prompt for ltx videos and it seems to really help make videos stable with good movement. Can even do camera moves now, it's not perfect but much better
can also add my own commands to the prompt for more control over camera and any motion
get stuff like this now
test for camera zoom. also added it's own random title card lol
with stg I do notice that things like blinking are less realistic but the image all around is more stable.
might be inconsistent vram usage like 2.3
also wonder if it's the latest drivers with the battlemage update
I have completely forgotten how to use wsl2 and linux lol. Will need a refresher
you know what, do you still need libuv for pytorch?
And could that cause issues with it?
id assume so since ipex 2.3 did
then again this isnt ipex
🤷‍♂️
just tested the 2.5.1 ipex version and thats slightly slower than 2.6
and 2.6 is half as fast as 2.3
okay, so it's not just me with the speed decrease.
Might just be torch itself, also nightly builds I think are updated all the time so something could break
this is the version I got 2.6.0.dev20241215+xpu
i may try and make a new comfy env, without libuv just to see lol
Why is there an error during the computation?
I have followed the steps above and reached this stage, but an error occurs.
Who can help me?
numpy is not available
pip install numpy==1.26.4
Thank you for your reply. I will conduct further testing on my end.
I guess torch installs numpy on it's own every time. Ipex doesn't seem to afaik
a question, would python 3.11 be better to use with 2.6? Or is 3.10 still the recommendation?
I have been using 3.11, and we are using 3.11 for AI Playground
Can't wait for the ai playground comfy backend
https://github.com/comfyanonymous/ComfyUI/pull/6069 also created a PR to comfyUI README to update Intel GPU installation instruction.
updated and simplified ComfyUI readme for Intel GPUs.
Intel GPU Support Now Available in PyTorch 2.5 https://pytorch.org/blog/intel-gpu-support-pytorch-2-5/
more info can be found: https://pytorch.o...
will be out soon 😉 I think community folks will like it as we created a way to allow developers/community members to add curated workflows for others to use.
I already use torch 2.3.110 for ipex as directed from https://pytorch-extension.intel.com/installation?platform=gpu
yeah but the old instruction requires installation of the oneAPI base toolkit.. which is no longer needed
2.5.1 ipex and 2.6 is sadly half as fast
hm.. that shouldn't happen. which card are you on?
Intel arc a770 LE.
Pytorch 2.6+xpu is slower than IPEX and that is expected, some of the IPEX optimization is not yet upstreamed
Pytorch 2.3.1 ipex is faster than 2.5.1 IPEX and 2.6.0+xpu
At least in comfyui this seems so.
with a normal SD1.5 workflow? I can give it a try
I use Flux and LTXVideo currently
could you share you flux workflow file
It utilizes a couple different nodes.
or an image
so on A770, with ipex 2.3 you are getting ? second per image?
I can try to replicate and troubleshoot for ipex 2.5
luckily ipex-llm works fine in administrator mode with phi-4 4_k_m. generates 256 tokens in half a minute
flux dev fp8 on pytorch 2.3.1 ipex is doing 6-7s/it
oh dev fp8
yep
with --reserve-vram 4.0 both equal the same speed
on 2.3.1 that is
they arent far off but yes q8_0 is more accurate
its more faithful to fp16
keep in mind
this is a workflow that uses
I would say Q8 quality is 95% similar to FP16.. minor detail differences
dynamic thresholding and a dpmpp_2m sampler
depends on what tokens are being used for inference because of the weights themselves i assume
fp8 does affect minor things
same with gguf fp8 but it's a much more complex architecture than simple fp8 conversion
I should probably be using it
so in a fresh env without libuv I am getting 11s/it with ltx video. Still way slower than the 4/5s/it but much faster. Dunno if it's a fresh install or the libuv. I am going to try python 3.11 next
couldn't find the merge string custom node, do you know which one was it?
LogicUtils
having issues getting pytorch 2.6 to work with comfy and python 3.11.
@somber trellis couple things to improve ipex 2.5 speed. I also noticed that 2.5 is a bit slower than 2.3.
- add torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True) to model_management.py in comfyUI
- pull latest update for ipex_to_cuda from disty's repo to be compatible with ipex 2.5
- it should now improve perf
what's the issue? should also apply disty's hijacks
I am going to try again, but it's failing at the management.py have added the hijacks already. Might have been the pip3 that was in the install instructions at the pytorch repo, so trying pip now.
oh did you pull the latest hijack from Disty's repo?
I ran pytorch 2.6.0dev yesterday in a fresh env on ComfyUI it worked fine so
ensure to add torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)
yeah I have that, going to try once everything installs
```python
# in model_management.py (torch is already imported there)
try:
    from ipex_to_cuda import ipex_init
    ipex_init()  # apply disty's hijacks
    torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)
    _ = torch.xpu.device_count()
    xpu_available = torch.xpu.is_available()
except Exception:
    pass
```
nope, still massive slowdown compared to 2.3. 5s/it to 12s/it with this ltx workflow. I do think that not having libuv installed increased the speed from 22s/it though so that may help someone. Dunno why the speed decrease is so massive, will try flux next to see if it's just ltx.
also --reserve-vram 6.0 is still necessary or it's even slower.
also having install issues with the vlm nodes, so I may try this one more time tomorrow. and see if I just have done something wrong.
Flux is slower, but not nearly as bad. 9s/it now for 1024x1024, think it was 6s before.
yeah, about 6s/it on ipex 2.3 with flux, and this is a bit slower than it was with ipex 2.1.4 also.
Another question, would pytorch still require conda? Or could it be created in a regular python env?
Pytorch2.6+xpu from upstream doesn’t require conda env or oneAPI base toolkit
regular env is sufficient
it’s still in the early stage of being fully upstreamed but usability is the main focus right now 😉
Also don't mix PyTorch and IPEX on the same venv
PyTorch Triton gets very upset when you install one and then change to other
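a rough way to check whether a venv ended up with both, just a sketch: it reads the installed versions with importlib.metadata and flags the case where IPEX sits next to an upstream "+xpu" torch build. The "+xpu" suffix rule is an assumption taken from the version strings quoted in this thread (2.6.0.dev20241215+xpu for upstream, 2.5.1+cxx11.abi for the IPEX pairing), and looks_mixed is a made-up helper name, not part of torch or ipex.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def looks_mixed(torch_version, ipex_version):
    """Heuristic (assumption): IPEX installed next to an upstream '+xpu' torch."""
    if torch_version is None or ipex_version is None:
        return False
    return torch_version.endswith("+xpu")


if __name__ == "__main__":
    torch_v = installed_version("torch")
    ipex_v = installed_version("intel-extension-for-pytorch")
    print(f"torch: {torch_v}  ipex: {ipex_v}")
    if looks_mixed(torch_v, ipex_v):
        print("upstream torch+xpu and ipex are both installed, consider a fresh venv")
```

if it flags your env, making a new venv is probably less painful than trying to uninstall your way out.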
ipex_init returns active and message
If active is false, then message will return the reason
ipex_active, message = ipex_init()
print(f"IPEX Active: {ipex_active} Message: {message}")
@earnest grotto do we still have the memory leaks with wsl2 + empty_cache? I would like to remove that hijack if it is not an issue anymore.
I will check in a bit
Still leaks with ipex 2.3, 24.04, latest driver
Should I test newer?
where do i get ipex 2.5
I'll test the nightly pytorch
pip install torch==2.5.1+cxx11.abi torchvision==0.20.1+cxx11.abi intel-extension-for-pytorch==2.5.10+xpu oneccl_bind_pt==2.5.0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
ah, wonder why they still haven't added that to the install guide
US mirror is down, so probably that
incredible, the basekit and the pti still completely break everything if both are installed
seems super slow, surprised they released it tbh. I thought they were done with Ipex with the upstream.
Welp, whatever PTI/pytorch-dev versions are downloadable, are not compatible with the current nightly or ipex 2.5
torch 2.5 works at least
let's see if it leaks
Pytorch 2.5 also leaks
I think the comment on the issue saying it's probably a driver/wsl bug not an ipex/torch bug is right, the amount of memory used torch reports changes with only +-16 bytes if at all
And it still errors out about running out of memory with... This time ~12-11gb of vram free
since trying out 2.6 with instantir i can safely say its just better to mimic what instantir does with flux, using an upscaler controlnet combined with redux on pytorch 2.3 for speed
does the b580 work with comfyui yet
I was getting 4s/it at this point with flux gguf at 1024x1024 with a lora. not sure if it was the older ipex or the older drivers. But with 2.3 the fastest I get is 6.40s/it with a lora loaded.
It doesn't work? Shouldn't it be able to use ipex?
Maybe it needs 2.5 or higher?
i tried @earnest grotto 's script to install comfyui and it didn't work when i tried to diffuse
where do i install 2.5
Might need later ipex maybe, also his script might not account for battlemage and install the right ipex.
how do i update my ipex
There is no right ipex
You can try 2.5. But again, I'd expect that to not work in the end
oh ok
just keep waiting?
They shipped without ipex support?? wtf, glad I didn't pick it up lol
2.5 isn't officially released but the whls are available to download, i couldn't get pip to work so I had to make a requirements file, maybe somebody can make a pip install snippet for the windows version, I think the one disty posted is linux.
i see
you could also try this #1193952640225267802 message try and go into the conda env and install that.
I saw bmg benchmarks for AI, I guess they only used openvino?
maybe this will work python -m pip install https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torch-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torchvision-0.20.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torchaudio-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
Activate your env and run that
you aren't in your env, you activated the base env. cenv is your env
hmm, i ran it to see if it would run for me and it started installing. Maybe it's a cache issue.
try this, make a text file and add
```
torch @ https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/torch-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl
intel_extension_for_pytorch @ https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/intel_extension_for_pytorch-2.5.10%2Bxpu-cp310-cp310-win_amd64.whl
```
name it something like ipex_requirements.txt. Paste it into the main comfyui folder. Activate your env then navigate to comfy and type into the cmd prompt pip install -r ipex_requirements.txt
cp310 amd64 for python 3.10 on windows
oh your on 3.11?
i think?
I didn't download the files for 3.11, the links will be different.
can i just uninstall 3.11 and use 3.10?
you can replace everywhere it says cp310 with cp311 i think it should work
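if you don't want to edit the links by hand, something like this does the cp310 to cp311 swap. It's only a sketch: retag and current_tag are made-up helper names, and whether cp311 wheels actually exist at the rewritten URLs is not confirmed here, so pip may still 404 on them.

```python
import sys

# The two wheel links posted above, built for Python 3.10 (cp310) on Windows.
BASE = "https://intel-optimized-pytorch.s3.cn-north-1.amazonaws.com.cn/ipex_stable/xpu/"
URLS = [
    BASE + "torch-2.5.1%2Bcxx11.abi-cp310-cp310-win_amd64.whl",
    BASE + "intel_extension_for_pytorch-2.5.10%2Bxpu-cp310-cp310-win_amd64.whl",
]


def current_tag():
    """Wheel tag for the running interpreter, e.g. 'cp311' on Python 3.11."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def retag(url, new_tag, old_tag="cp310"):
    """Swap the '<tag>-<tag>' pair in a wheel filename for another Python tag."""
    return url.replace(f"{old_tag}-{old_tag}", f"{new_tag}-{new_tag}")


if __name__ == "__main__":
    for url in URLS:
        print(retag(url, current_tag()))
```

run it with the python from your env and paste the printed links into the requirements file.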
I manually went in and found all the links myself, I don't know the pip code for them
i can give him it
do you want to be using ipex2.5.1
or torch2.6.0+xpu
whatever gets this working
I think he said it didn't work, although it could have been because of the wrong python version for Vik's script?
vik's script uses 2.3.1 doesnt it
Yeah
if this works, I would suggest trying vik's script again but uninstalling 3.11 and using 3.10 and see if it works, 2.3ipex is faster atm
pip uninstall torch torchaudio torchvision intel-extension-for-pytorch && pip install --pre torch torchaudio torchvision --index-url https://download.pytorch.org/whl/nightly/xpu
*install in first command
no
that is an uninstall on purpose
read the full command
ok that didn't work
if it says something didn't exist then it was probably the file name being wrong again. If you were doing the Ipex method still.
ok i reinstalled using viks script with python 3.10
anything else i need to do?@reef ivy
Is it working?
wait i need to install flux first oop
if you get it working, let me know your speed and image size
hi it works, i get 8.95 seconds for 20 steps on sdxl
if you add --force-reinstall at the end there is no need to do pip uninstall
im actually p sure --pre --upgrade will just do it too
hm.. not necessarily.. for example if you accidentally installed the MTL-H ipex wheels on a desktop A770 system, it wouldn’t upgrade the wheels because the versions are the same
force reinstall ensures that it pulls the package again and reinstalls it
--upgrade would work when versions are different
@honest hull Do you know if battlemage is supported yet and by what
I am a newbie to intel arc, while following the pinned message of @somber trellis of installing comfyui on my laptop ( intel ultra 9 h series and intel arc a370m) I got this error. Someone pls help.
For new users I recommend using viks script and let it do the heavy lifting for you.
@reef ivy please brief me steps. That would be helpful.
.
yes
Thank you
Just a heads up, there are 3 ways to install ipex for comfyui in the pinned comments to choose from. Vik's is the easiest imo
I looked at the ipex gpu install website and they updated it from 2.3.1 to 2.5.1
must've happened today
hunyuan 4bit gguf with new fast lora from kijai, "Will smith eating spaghetti in a cozy restaurant by the mountains" 512x512 14s/it with lora, 11s/it without 6 steps
6 steps without lora
how
me over here trying to run 4_k_m and failing miserably
I think you should be able to get my workflow from those images. Probably the vae decode and I use a fp8 version of the llm
There is a gguf of the llm too but don't know how to load it as a clip, gguf clip has no option hunyuan
i have the fp8 scaled version of llava llama3
Whoops read that wrong lol
I can't get the gguf to work, i used the fp8 safetensor for lava lama
It loads into the standard dual clip i believe
About 8gb, I offload directly to cpu but you can probably handle it with 16gb
I will say hunyuan is waaaay better with natural looking human movements. Quality is lower than ltx img2vid but probably better than txt2video. At that res might even be faster than cog video tbh
neither the one from hunyuan video wrapper or the official comfycore implementation workflow works
The kijai workflows don't work for me either
idk how you got
16s/it
im getting 20s on torch 2.6.0, so im assuming you arent on 2.6.0
instantir and certain upscalers are broken on 2.3
im probs not gonna use instantir tho
compared to flux redux its really not good
I have a 2.6 env if I need it
That is better than my outputs lol
Yeah it is nice, just need img2vid
Oh, ltx 1.0 arriving next week if nothing goes wrong
Official word from CEO
@earnest grotto Comfyui working 🔥. 1.25it/s for 512x512 with 20 sampling steps and 8it/s for 1024x1024 with 70 sampling steps
So much thanks. @reef ivy Thanks for pointing out. 👍
But still there are some errors while running Comfyui. Is there anything need to be done or ignorable
all of that is ignorable
have you installed any other custom nodes
Thanks for info.
Not yet I am a newbie still exploring.
Any suggestion?
nothing much besides the ones the script gives you the option to install
Okay 👍👍
well not 1.0 but new model 0.9.1 https://huggingface.co/Lightricks/LTX-Video/tree/main gonna play around with it today
So far running extremely slow for me
bfloat16 error at vae decode, sigh. Model is about half as fast as the bf16 version and doesn't seem to work with the default workflow with intel.
with pytorch 2.6 I seem to be getting the same speed as 2.3 now after disabling lowvram in the command line. I am also not running it in conda but a regular python env. Could just be ltx video, but for some reason it is faster today.
Update, potentially
btw I can run hunyuan 4_k_m on my card at 256x256 121 frames
horror in the palm of my hands
however getting good results at this res is a lot harder
ill probably stick to 512x320 49frames
Nice, I wonder if using a video2video with cogx or ltx could help upscale if used with an image and the ir stuff. Movement in hunyuan is unmatched
Also try the new ltx model and the new workflow, i got a mismatched bfloat error on vaedecode. I was able to use the previous model vae but that likely ruins the new output
I am not sure its the same install from the other day, just no lowvram command. It is also not inside a conda env so that may be helping as well
Comfy UI was updated though
If 2.6.0 is truly faster and equal to 2.3 @reef ivy
I'll swap back
it has compatability fixes I want
theres a newer torch 2.6.0 for today's nightly so there might be a change there
If you want, try and make a python env for comfy with 2.6 and run it with the --reserve-vram but omit the --lowvram. gonna try flux in a minute and see if it's not just ltx video.
well its 8s/it on ltxvideo for me
on the latest nightly 2.6.0
just got the same vae decode error
rip ig
back to 2.3.1 i go
It's 3.5 seconds faster per it on 2.3.1 on ltxvideo but the same vae decode error occurs
the issue is the loaded dtype regarding the vae decode error
its loading the vae as f32 i think
ltxvideo 0.9.1 was made as a bfloat16 model unlike it's original which was in float32
I'm fairly certain it's an issue with the code handling the vae's dtype
I dunno i can run the bf16 model fine in it, there is something with the new vae itself. I have gotten the new model to work with the vae from the old one, but extracting the new one still gives same message.
oh no i can run both the bf16 and fp32 dtypes fine
its the vae decode that breaks
ye like you said
You can also choose float32 in the model loader
the vae included with the model doesnt seem to work for us
Yeah, I have set the vae to run on CPU and it hasn't errored yet but its taking forever so far
They said something about the vae and compatibility but their workflow uses the default vae decoder.
i just did a 1-step run with set vae device to cpu
nothing's outputting lol
it's just stuck
well at least hunyuan works
Yeah, I had to quit out.
pity i cant do 5 seconds
could try hunyuan 2 flux img2img 2 ltx image/video to video.
well currently ltx no woky
i mean 0.9.0
I've tested 0.9.0 enough to know that it is very inflexible as a model
only issue with it is wonky movement with img2video
Yeah, that's why I say use a hunyuan video and a flux image with the img/video to video workflow.
so take the first frame of hunyuan do an img2img upscale with flux, then take both into ltxvideo. Might get a decent output with good movement? may try it later
really wish we could use the new model though, it seems pretty good
got a bunch like that, but then got this one
Seems 512x512 works best for me
Maybe I should go down to 4_0 instead of 4_k_m
the ram usage difference is big enough
wide screen also bugs out for me, but this could be just the fast lora issue
kijai said it's experimental. Waiting for a full quant of the fasthunyuan model
welcome to GenAI era 😉
ok with 4_k_m i can actually do 384x256 121f
at 21s/it
aka 2 minutes for 5 seconds
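quick sanity check on that math, as a sketch. Two numbers here are assumptions and not stated in the message: 6 sampling steps (the step count mentioned earlier for the fast lora runs) and 24 fps output.

```python
def total_seconds(sec_per_it, steps):
    """Wall-clock sampling time, ignoring model load and vae decode."""
    return sec_per_it * steps


def clip_seconds(frames, fps):
    """Length of the resulting video clip."""
    return frames / fps


# 21 s/it and 121 frames are from the messages above;
# steps=6 and fps=24 are assumptions for illustration.
gen = total_seconds(21, 6)
clip = clip_seconds(121, 24)
print(f"{gen / 60:.1f} min to generate a {clip:.1f} s clip")
```

with those assumptions it comes out to roughly 2 minutes of sampling for about 5 seconds of video, which matches.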
dude is morphin
but his walk is consistent
its like he noticed his helmet disfigured and put his hand up to fix it
lol
so yeah, flux is still much slower on 2.6 but I got ltx running about the same, maybe it's because of the lower resolution and it slows down after a certain size/vram usage
🙂 we now also have Ai Playground 2.0 alpha build released. feedback is welcome and look forward to community members adding new workflows for non-technical users to experience the cutting edge models
question
you guys added ipex-llm to that
correct
we added llama.cpp as experimental support
This was more regarding if comfyui would be able to use sym_int4
It'd honestly be even more nice if intel had some method of using svdquant
that quantization method is something that would be to die for on models like hunyuan
ah. so you are looking for something like a custom comfyUI node — intel neural compressor model loader ?
Something that can load models into sym_int4 format. Video models or image models alike.
A node that could do that would be very convenient
it would be a great alternative to gguf int4
yes, and better performance for sure
let me check with team and see what we could do
have tried setting the vae to bf16 and fp32 and still errors on vae decode. Gonna try fp16, have no idea what's happening atm.
nice I will check it out tomorrow.
RuntimeError: Input type (float) and bias type (struct c10::BFloat16) should be the same
there seems to be updates in comfyUI repo regarding hunyuan vae, have you tried updating
The vae error is for the new ltx 0.9.1 but I can check and see if there is another update.
@silk umbra
Ooo
aipg 2 has bmg support
i followed all the steps suggested to install and make ComfyUI work with Intel graphics card...
can anyone help with installation steps ... i followed all suggested steps.. but getting these weird errors.
@open junco ^
thanks a lot @earnest grotto for quick reply..
... will try if present attempt fails..
presently, someone suggested AI playground from Intel ...
seems its working .. not sure which one is better though... (it seems to be using python version 3.11.10 )
post installation .. i tried to copy checkpoints etc.. to comfyUI .. and loaded it independantly... seems its using A750 GPU .. for now...
aipg is recommended
it uses comfyui as a backend but I'm not sure if it exposes it for you to use yet
but if you also just want something simple to start generating images asap, would recommend
regarding this..
"it uses comfyui as a backend but I'm not sure if it exposes it for you to use yet"
was able to make it work with GPU.. for now atleast..
but yeah.. im not sure if it has future implications 😛 .. (i mean if that causes any other problems)
this "workflow mode" they have in the alpha, no clue how to actually access the comfy backend
I have not installed aipg 2.0 so I'm not familliar with what it does or doesn't let you do
since this is a comfyui thread, presumably you want a nodal mess like Blender or Houdini or Unreal or whatever else
sry, was just trying to share the observation .. if only it helped..
first time here. .. plz. take no offense..
and comfyui provides the spaghetti
oh ok..
if you just want images, get aipg 2.0
Ah!! i see. thanks..
would eventually need to try some videos... so i guess i will have to try the above steps.. in that case..
thank you!!
new update for ltx vae, going to see if it works
Nope, ltx still fails at vae decode with RuntimeError: Input type (float) and bias type (struct c10::BFloat16) should be the same Sucks
So my guess is either the model or the vae is loading at fp32? I've tried setting vae to fp32 maybe I should try the model?
ltx 0.9.1 is force loaded as bf16
it has an option for fp32, I've also set it in the comfy ui args, but nothing so far.
bf16 should work anyway, since we force that on intel already. Don't get it tbh. tried forcing vae to bf16 as well.
When i researched this error, it was usually fixed with --no-half and --no-half-vae, which basically forced fp32 precision.
I've separated the vae and I have gotten the model to work with the 0.9.0 vae but not sure how much that effects quality
here is the full error for anybody curious
with the help of ChatGPT I got it to work, although maybe with errors lol. added this line to causal_conv3d.py: x = x.to(self.conv.weight.dtype), right before the final x = self.conv(x) / return x
oh can't post the entire thing,
It gets deleted?
yeah, I guess about 20 lines of code is the cut off
I notice the speed fluctuates a bit, probably going between float types during generation?
@upbeat crow @reef ivy
Yeah.. I dunno why
Sorry guys
@reef ivy can you send me a dm of the whole thing, i wanna see if I can repost it
okay hold on
def forward(self, x, causal: bool = True):
    if causal:
        first_frame_pad = x[:, :, :1, :, :].repeat(
            (1, 1, self.time_kernel_size - 1, 1, 1)
        )
        x = torch.concatenate((first_frame_pad, x), dim=2)
    else:
        first_frame_pad = x[:, :, :1, :, :].repeat(
            (1, 1, (self.time_kernel_size - 1) // 2, 1, 1)
        )
        last_frame_pad = x[:, :, -1:, :, :].repeat(
            (1, 1, (self.time_kernel_size - 1) // 2, 1, 1)
        )
        x = torch.concatenate((first_frame_pad, x, last_frame_pad), dim=2)
    #my edit
    x = x.to(self.conv.weight.dtype)
    #end of edit
    x = self.conv(x)
    return x
==============================
its letting me post it.
code block?
were you posting it in a code block or anything like that
Can you try updating ipex_to_cuda
Duped the conv2d fix for conv1d and conv3d too
okay, I will in a few
yeah, i posted it as a code block
so discord probably limits code block
sorry guys
it works, only thing about the fix I found is that inference speed on the base model also slows down by about 2s/it
bf16
Not sure what the newer model is, but it's also probably bf16, however the new vae is likely fp32 or something.
It's only slow in the new ltx workflow, in the original workflow it's fine
So it's something with their nodes
can't get the latest model to work in old workflows
With using the old workflow, the new model has good speed but gets this error at vae decode now Expected all tensors to be on the same device, but found at least two devices, cpu and xpu:0! (when checking argument for argument self in method wrapper_XPU_out_addmm_out) on new workflow it works but is about 2x slower
Getting this error constantly now on all workflows with ltx. But it's random, sometimes it works fine after a restart
So messing around I got speed back by changing cpu to xpu in this code, in their loader_node.py
and it still got deleted lol. hold up let me make a text file
my theory is when it loaded into cpu, it kept the fp32 weights when it shouldn't.
or maybe not, still slower at higher frames. with their new nodes, and xpu error on old ones.
crazy speed difference at same settings, old nodes vs new
first one at 5 steps and less frames seemed on par with how it should be, but once upped the steps and frames it went down exponentially
okay I get it, probably not offloading model so slows down over time.
getting a workflow together but it may need a node I had to add xpu support too in order to work. The node they built does something that makes everything way slower, I think it has something to do with mixed precision. It's noodle soup atm, i may post something tomorrow if I get it decent enough.
which comfy's core components are fine
it's just the ltxvideo custom nodes
I'd say we really just have to wait until 0.9.1 is supported by the official nodes so that we have device agnostic code to utilize
other than that im kinda just waiting for info on how to or if we can utilize a comfyui workflow in ai playground
@reef ivy disty updated the ipex-to-cuda repo
something regarding ltxvideo was changed and now im running it
0.9.1 is a huge improvement seemingly
but dead straight, hunyuan 4_0 beats it by a mile
It's just so much more data even if it is quantized
LTXV 0.9.1 With pag and stg
I have it working with his hijacks, just the official workflow is incredibly slow. Like 2x or3x slower. I have made it work in another workflow but had to add xpu code to the force/set vae nodes from comfy extras
Hunyuan just doesn't have img2video, so no control over output. But it is good
With the speed of the official workflow its better to use cogvideo at that point
would something like this work if added to the hijacks? torch.cuda.device_count = torch.xpu.device_count I basically had to change this to force the vae to load onto xpu instead of cpu. Otherwise it would get the cpu and xpu error at vae decode when not using their official nodes.
force/set vae device node from here https://github.com/city96/ComfyUI_ExtraModels
it's noodle soup with a bunch of unused bypassed nodes since i am still experimenting but you can see where I am going with it. Bypassing their model configuration node allows me to make 121frame videos at around 6s/it, as opposed to like 50s/it for 49 sec with their stuff. Just not sure about the stg quality as I am using a different sigma node. Guess for anybody interested lol.
what if you add reserve-vram 8.0 ?
8? I can try it and see, I have reserve-vram 6.0
Doesn't seem to help with their official workflows. It's something with their model configuration node and the new vae. I read about some nvidia users oom'ing with the workflow even with the older model, so likely related. Only issue with my new workflow is you have to edit that custom node force/set vae type for xpu support
Another thing with their workflow I noticed, increasing the step count also decreases the iteration speed for some reason.
could be similar issue on Arc, reserve-vram arg seems to be the workaround for Arc GPU with minimal performance impact
when models are too big and it starts using shared GPU memory fallback
Yeah, there is an issue with the newer drivers for windows with memory and a750, and this is compounded with a similar issue with ipex that starts with 2.3.1. 2.1.4 doesn't have the issue, and drivers from 5971 can load larger models without --reserve-vram
if you follow this convo I had from the pinned post you can see where the driver issue started and I found the reserve-vram fix. #1193952640225267802 message
so actually while it doesn't actually fix the issue, it does speed up the generation with their workflow. from 60s/it to 30s/it with reserve vram 8.0. #1193952640225267802 message also speed didn't decrease with step increase this time. However it's still like 5x slower than it should be with their workflow
if you are on A770 you can even try reserve vram 16
I am on a750
doesn’t hurt to try lol.. 😆 the beauty of pathfinding
lol that is true
no change for me, but maybe it will help somebody on an a770. I will continue to work on my workflow as an alternative, only issue will be the custom node edit.
with mine you can get 6s/it with 121 frames
maybe I missed the above convo, did you use quantized model?
So far no quant for 0.9.1 ltx, I think it is already bf16 it's only around 5gb. It uses a new vae structure which is what is giving everyone problems
Once there is a quant I will try it out.
I could give you the fp8 version
Pretty sure I can convert its safetensors
welp nvm
i dont think i actually can
how fast is 0.9.1 going for you?
and are you using the official workflow?
Comfycore nodes for ltxvideo don't work. You need to use the comfyui-ltxvideo workflow with normal cliploader for t5xxl
this is how i have my workflow laid out
I got them working but had to edit a force vae to device node to support xpu. What is your speed? Their workflow is unusably slow for me, even with the older model
With STG and PAG its 7s/it at 768x512 121f
damn, must be the vram issue and reserve-vram doesn't help. Although maybe I will try and add --lowvram back and see if that helps. I get 6s/it with the new model and my workflow at 121f and like 30-50s/it with the official one.
nope, their nodes are just slow, specifically the model configurator. For my workflow you have to force the vae onto gpu to retain speed, but that requires an edit of a specific custom node that most people would likely have trouble doing.
but going from 6s/it to 30-40s/it is crazy.
how did intel get flux working so fast in the battlemage reveal
meanwhile it's slow for me
Do the following
- Restart your PC
- Launch AI Playground
- Go to Settings, workflow
- Select Flux.1-Schnell me Q4
- Be sure resolution is 896x896. At 0.8MP
- Set number of images to 4
- Enter a prompt
The following should happen at these relative times
Load model: 0:25s
Generate 1st image: 20s
2nd -4th image 12s each
If performance degrades over time, Restart the ComfyUI and AI Playground backend in Basic settings
ahhhhh thank you so much
your hunyuanvideo workflow you gave me should use the hunyuandecode node from the hunyuanvideowrapper
Everything else should be comfycore
I will have to take a look at that one, been awhile since I messed with hunyuan. Also taking a break with holiday stuff.
Dear sir: how can I use Python to send an image to ComfyUI via its API and then receive the generated image in response?
What are you actually specifically looking to accomplish
I can't find some convenient documentation. You can look at AIPG's code, or this Blender addon to integrate comfyui https://github.com/AIGODLIKE/ComfyUI-BlenderAI-node
And someone has also made some wrapper for the API https://github.com/SaladTechnologies/comfyui-api
The actual requests should be fairly simple
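A rough sketch of those requests, stdlib only. It assumes the routes used in ComfyUI's own API example scripts (`/upload/image`, `/prompt`, `/history/<id>`, `/view`), a server on the default port 8188, and a workflow exported in API format. All function names here are made up for illustration, so treat this as a starting point rather than the official client.

```python
import json
import os
import time
import urllib.parse
import urllib.request
import uuid

SERVER = "http://127.0.0.1:8188"  # ComfyUI default; AIPG runs it on another port


def build_multipart(field, filename, data, content_type="image/png"):
    """Encode one file as multipart/form-data; returns (body, Content-Type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_image(path, server=SERVER):
    """Send an input image; returns the name the server stored it under."""
    with open(path, "rb") as f:
        body, ctype = build_multipart("image", os.path.basename(path), f.read())
    req = urllib.request.Request(
        f"{server}/upload/image", data=body, headers={"Content-Type": ctype}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["name"]


def queue_prompt(workflow, server=SERVER):
    """Queue an API-format workflow dict; returns its prompt id."""
    body = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode()
    req = urllib.request.Request(
        f"{server}/prompt", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]


def wait_for_images(prompt_id, server=SERVER, poll=1.0):
    """Poll the history until the prompt is done; returns its image records."""
    while True:
        with urllib.request.urlopen(f"{server}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            outputs = history[prompt_id]["outputs"]
            return [img for node in outputs.values() for img in node.get("images", [])]
        time.sleep(poll)


def view_url(image, server=SERVER):
    """Build the download URL for one image record from the history."""
    query = urllib.parse.urlencode(
        {"filename": image["filename"], "subfolder": image["subfolder"], "type": image["type"]}
    )
    return f"{server}/view?{query}"
```

Typical use would be: call `upload_image`, write the returned name into the `inputs` of your LoadImage node in the workflow dict, call `queue_prompt`, then `wait_for_images` and fetch each `view_url(...)` with `urllib.request.urlopen`. Which node id holds the LoadImage node depends on your exported workflow.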
I will give it a try. Thank you for giving me great advice.
What are you trying to accomplish
You want to batch queue up images? Make your own prompt generator? You wanna make your own UI for Comfy?
Happy holidays everyone!
Happy holidays!
ERROR: torch-2.1.0a0+cxx11.abi-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
can someone help me plz, how to fix it?
What is your platform
trying to install comfyui for my a770
What OS, what python version
why are you installing an outdated torch and presumably ipex
you can just install AIPG, it installs comfyUI
Or if you want a simple installer for just comfyUI, I have a script ^
what is AIPG? its more simple way to do it?
Specifically the 2.0 version
thank u! ima try that rn
For AIPG, you will need to probably inspect element or look in the command prompt somewhere for the actual port to connect to, as it's not the comfyui default 8188
Or there might be some button in the ui now, not sure
wait there is other UI? not comfyui?
ComfyUI, like the other webuis, is a server that runs on your own computer that you connect to through the browser, where you see the actual UI
Same for A1111, SDNext and so on
AIPG (and SwarmUI) run a Comfy server, but have their own UI and communicate with that server themselves
But you can also just connect to it directly
In that vein, you can also open it up (depends on webui) and connect through your phone
Not very convenient for ComfyUI specifically, though usable
Or if you portforward properly, can connect from outside your lan as well
ill stick with clasic comfyui interface, because i have my workflows from previous rtx gpu
ComfyUI includes the UI
As in... so does AIPG
You can connect (just to a different port) to use comfyui as usual
but if you don't wanna do that, there's my script too
you need conda, git, any python. if you have python, the script will tell you where to get conda and git from
can u share it plz?
.
probably fine
hmm
do you have your pc's locale set to japanese or arabic or something else
russian
yes
im stuck there or its done?
If it does not say that something went wrong or that it's done, then it's not done
nothing happening
wait more
it's downloading
probably ipex, off the cn mirror which was slow-ish but not super slow, and right now it could just be giga slow
mm, bmg added
yeah uh, china is REALLY slow right now apparently
need to add progress bar to the script if its real lol
it's temporary
normally you'd download from us. but something happened and we couldn't access that, even the official guide had the cn mirror in it
looks like it's back though
lemme run a proxy
That's not going to change anything
I'm going to update the script to download off us, fix the broken text encoding and use battlemage since that's finally a thing
im from Spain now lol
but nothing changed
Intel blocked Russia, seems like this is the main reason?
.
@signal patio stop the script. Download it again. Run it again (preferably same directory)
It's probably gonna take a whole day to finish (the one you were running)
the stuff it did already download, will be verified then that's that
It's not gonna redownload things
what
I mean, for rtx all u need to do is just download folder of comfy and run. Sad that users of other hardware can’t do it that easy.
everything should be there now to allow a simple-ish download-and-run install for comfy
it will be slow, though that's a pytorch issue, they did something dumb with 2.5
sdnext is a simple install
but it's not comfy
???
so 2.3 works on us but access is denied to 2.5... odd
looks like comfy has this by default at least
Either
a) Run the script in a folder that does not need admin privileges
or b) Run powershell as admin, type in Set-ExecutionPolicy Unrestricted, then press enter. Run script from anywhere.
Is there any updated instruction to work on comfyui in the intel arc by @somber trellis in pinned comment.
someone please help.
I get 4gb error
@midnight mauve ^
Does the script edit and re-edit the model management file when installed now?
yes, including different edits for the different ipex versions
well, it edited it before as well, just didn't refresh that when running again
though after trying out 2.5 on windows more... damn that is SLOW
literally 2x slower (~1.7s/it vs ~3.1s/it) for sdxl
but it's the only one with battlemage wheels
not that I have a battlemage gpu
@vocal swan Do you want to install comfyui using my script, and say if that works?
This one
First with 2.3
Okay 1 min
sorry for the wait, seems to be working great, installed without any problems and the gens are much quicker at 3it/s
a770? that with sdxl or flux or what
did you choose pytorch 2.3 or 2.5
b580, sdxl, pytorch 2.5 as i just used the link u refered to and it installed 2.5
ok, there's no earlier pytorch for battlemage, rip
why, is 2.3 better ?
expect, hopefully, ~1.5-2.0 it/s in 3 months or somewhere around there
yes
it's faster
worse compatibility, some things are broken (notably stable cascade)
hm alright, either way thanks a lot for helping me out I thought i was completely unable to do anything
well, the problem ultimately still isn't solved
Yeah I suppose, just wanted to get sd running anyway
Yeah, its unusable with a750. Might as well use directml or something lol
it's not as bad on linux
Anything that uses CLIP is broken too. (SD 1.5, SDXL, SD3 etc)
PT 2.3 vs PT 2.5 on SD 1.5 for example:
its better on linux?
i thought its the same as windows
You should assume by default that on linux for any GPU (incl. AMD, Nvidia), compute will be better (let's say 20% faster) and gaming worse (let's say more than 20% slower)
On Linux;
Transformer models like SD3 and Flux have no change in speed
UNets like SDXL have a 1.5x to 2x slowdown
SDXL goes from 1.6 it/s to 1.0 - 0.8 it/s
PT 2.3 vs PT 2.5
Both on Linux
literally 2x slower (~1.7s/it vs ~3.1s/it) for sdxl
Linux is 3x faster than Windows on both cases if you use this metric from Vik
I am using Arch Linux with Linux kernel 6.12.6
I was thinking about to start using linux since deadlock works well with proton
I need comfyui for flux and latest photoshop but i think its a pain in the ass to get it working
My gpu is a770 16gb
You are going to have a bad time with any adobe software on linux
Seems like VM is the only way
All i need from adobe is inpainting, it’s better to build a workflow for that and stop using Photoshop 🤣
Thx ima definitely check that out
What about support in other things in linux for arc? Its not that good as amd?
Both AMD and Intel are plug & play on Linux
Just don't use Linux kernel 6.8 on both
AMD has severe stability issues and Intel either doesn't work with stable diffusion or works but 2x to 4x slower than normal on Linux 6.8
6.12 is stable and fast on both
Ubuntu 24.04.1 LTS uses Linux 6.8
Ubuntu 24.10 uses 6.11
6.11 is fine on Intel
Yea but switching from windows to ubuntu is strange for me, because ubuntu is kinda windows in Linux world 😂
Its ok for people who never tried linux distros before
ah, i think my test was with a more complex workflow that slowed things down. masked conditioning and such
but nonetheless the same thing I did was faster on linux
with sd xl base 1.0, with a barebones workflow with the default ksampler, I get
0.67s/it with 2.3 ||1.48 it/s|| windows
1.22s/it with 2.5+IPEX, windows again (comfy has the allow bf16 sdp enable by default)
0.71s/it with 2.3 ||1.40it/s|| linux with 6.5.0-44-generic
0.73s/it with 2.3 ||1.36it/s|| linux with 6.11.11-061111-generic
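Since the chat keeps switching between s/it and it/s, here is a tiny conversion sketch using the four numbers above. The function names are made up; the inputs are the measurements quoted in the chat, and the computed it/s may differ from the quoted ones in the last digit.

```python
def its_per_second(seconds_per_it):
    """Convert seconds per iteration into iterations per second."""
    return 1.0 / seconds_per_it


def slowdown(baseline_s_per_it, other_s_per_it):
    """How many times slower 'other' is than 'baseline' (both in s/it)."""
    return other_s_per_it / baseline_s_per_it


# s/it figures quoted above for SDXL base 1.0 with the default ksampler
runs = {
    "2.3 windows": 0.67,
    "2.5+IPEX windows": 1.22,
    "2.3 linux 6.5": 0.71,
    "2.3 linux 6.11": 0.73,
}

baseline = runs["2.3 windows"]
for label, s_per_it in runs.items():
    print(f"{label}: {its_per_second(s_per_it):.2f} it/s, "
          f"{slowdown(baseline, s_per_it):.2f}x vs 2.3 windows")
```

On these numbers 2.5 on windows comes out at roughly 1.8x the time per step of 2.3, and the two linux runs are within about 10% of 2.3 on windows.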
i think that's about as much as I'm willing to test
welp, flux was notably faster on linux at one point, and I still have some fairly bad vram-leak-like issues on windows (native, not wsl) which simply don't happen on linux
and for these removing empty cache makes them worse
Does wsl still have the memory leak?
Might try and run comfy from wsl and update to 2.6 or 2.5. i do remember giving up some speed but windows was fast enough before. 2.1.4 was the last fast ipex in windows.
There is a workaround to help reduce leaking with wsl but I believe ultimately it's a windows driver bug
We finally got temp reading in Linux with 6.13 kernel. It took way too long for an essential feature :/
For SDXL, power usage goes down from 190W to 120W with PT 2.5
GPU is not fully utilized with UNet models on PT 2.5
Exciting times 🤖
incredible
Are there commands besides torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True) that can assist in performance on 2.6.0?
That will not change performance, because it's already in comfyui
Is that why your script doesn't contain it?
It was recommended by Li to add it 🤷♂️
Alrighty then ig 🤷♂️
I can't use InstantIR or certain upscalers on 2.3
and 2.6 is slow and i am sad
but it worky so yknow what ok
models like NomosUni 2x don't work on 2.3
super-fast compact upscaling model
well, blame the pytorch foundation i guess
i wonder if they're understaffed like blender or what
I can't be completely unhappy though. Being capable of generating 512x384 on hunyuan with fastlora 4_0 at 20s/it 49 frames is ok
combine that with an upscaler like uni
which takes 2 seconds to upscale 49 frames
i don't think that video is worth 16 minutes of wait
that didnt take
15 minutes
that video took 2 minutes
fasthunyuan allows inference to be done in 6 steps
Even on FP8 fasthunyuan I can generate a 512x384 49f video in 3 minutes
honestly even 2 minutes might be too long
i mean in the sense of, it's simply not worth it, quality is too bad
it's a still image with shaky cam but because it goes through a VAE it has fat artifacts
and it's some big artifacts
you can just make an image with flux and shaky cam it with anything else
high fidelity, chainlink probably won't shuffle around though i'm not sure if it'd still look sensible
this is a bit more justified
has to do with prompt
hunyuan likes long prompt videos
but unlike ltx, can do short-prompts like that one
Also I'm generating at a far lower resolution and total framerate than the model is designed for (and its fasthunyuan, 6 steps instead of 30)
keep it in mind
I hope LTXVideo becomes more usable.
Ltx is really good with img2video, you just need good prompts
Also video2video
Yeah, it's the best at text2video. Problem is consistency for me, but they do have lora training now
https://github.com/comfyanonymous/ComfyUI/pull/6069
the allow sdp reduction is added after this PR and the oneAPI PR merged to comfyUI 🙂
updated and simplified ComfyUI readme for Intel GPUs.
Intel GPU Support Now Available in PyTorch 2.5 https://pytorch.org/blog/intel-gpu-support-pytorch-2-5/
more info can be found: https://pytorch.o...
so yeah, it’s included in comfyUI now
huh nice
https://github.com/comfyanonymous/ComfyUI/pull/6112
This PR to be exact
OpenVINO gives 2.4 it/s
We need Flash Attention on PyTorch : )
What pytorch version is OpenVINO on right now?
2.3.1 CPU
i remember flash attention isnt just some matrix operations in pytorch
the need of cuda means sycl is needed to implement it on arc gpus
It needs direct access to GPU caches
It works by executing attention exclusively in the GPU cache
Can someone help me how to do a text to video models in vik's script
I tried hunyuan but got an error related to bitsandbytes
hey @civic charm 👋
can u help me to figure out why i can't get 4k 144hz under arch on my a770? same setup on windows was working fine
just installed latest arch (6.12.7-arch1-1) + Gnome
connected to my 4K 144Hz monitor with display port
Using journalctl -b | grep drm I'm receiving this error: archlinux kernel: [drm] DisplayID checksum invalid, remainder is 190
xrandr --listproviders gives me 0 providers
I haven't run into this issue
Also i use Wayland
Another issue i have run into is DP ports don't support custom refresh rates on my A770. HDMI overclocks to 75 Hz fine.
I just reinstalled everything from 0 again, but with kde now
same problem - 60hz refresh rate is maximum
Wayland with GNOME*
you got an error or you got a warning
show it
How did u install comfy on arch? I cant install pytorch
pip install torch==2.5.1+cxx11.abi torchvision==0.20.1+cxx11.abi intel-extension-for-pytorch==2.5.10+xpu oneccl_bind_pt==2.5.0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
my script works on linux as well, unless you're on battlemage, then i'm not sure
ok, alchemist
hmmm
ok nevermind
something's broken
either do what that tells you (create a venv, or get conda and make a conda environment, or whatever else)
or... I'll look into fixing whatever broke with my script 🤔
python -m venv your_venv_name
newer linices have python 3.12. you will want to have 3.11
conda can install any python. otherwise, deadsnakes ppa
I’ve never seen this, should I do something with this venv every time i want to run comfy and when i stop using it?
you should in general have every python-using thing have its own venv
it avoids conflicts.
some things need specific versions of specific packages. if they use the system python and install there, things will override one another's requirements and break
some bigger projects will outright bundle their own python
blender does this
Thank u for explaining! I should create this venv and put pytorch command in a specific folder? Or it doesn’t matter?
the venv is a folder
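To make "the venv is a folder" concrete, a small sketch with Python's stdlib `venv` module, which is what `python -m venv` runs. The folder name `comfy_venv` and the helper names are made up. It builds the venv in a temporary directory so nothing is left behind, and skips pip to keep it quick (`python -m venv` on the command line includes pip by default).

```python
import sys
import tempfile
import venv
from pathlib import Path


def make_venv(parent, name="comfy_venv"):
    """Create a virtual environment folder under 'parent' and return its path."""
    env_dir = Path(parent) / name
    venv.create(env_dir, with_pip=False)  # with_pip=False: no pip, just the interpreter
    return env_dir


def venv_python(env_dir):
    """Path of the interpreter that lives inside the venv folder."""
    sub = "Scripts/python.exe" if sys.platform == "win32" else "bin/python"
    return Path(env_dir) / sub


with tempfile.TemporaryDirectory() as tmp:
    env = make_venv(tmp)
    print((env / "pyvenv.cfg").exists())  # the marker file every venv contains
    print(venv_python(env))               # run this interpreter to "use" the venv
```

Where the folder sits does not matter to Python. Putting it inside or next to the comfy folder is just a convenience so you remember which project it belongs to.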
So i need to create it inside comfy folder?
@signal patio Just use my script ^
I thought its for windows
it's for both
No way
?
Im starting thinking that this is impossible to run comfy on arc
whut
Everything is so complicated
things are getting simplified
if the pytorch foundation didn't massively slow down 2.5 it would be much simpler
you don't have clinfo
a bunch of various driver packages are probably missing
or is there no bash in bin on arch
do clinfo and say what happens
that's all? it doesn't detect the arc gpu
Yup, thats all
here's a guide for installing stuff. it's for ubuntu. i don't know how much of it, if any, applies to arch https://dgpu-docs.intel.com/driver/client/overview.html
I think its time to remove arch forever
I dont know how to call people who creates a distro with a font without japanese, cyrillic and etc by default
This shit is sooooooo dumb
ubuntu i'm using has cyrillic working in the terminal
Yea, ubuntu is way more comfortable
But
Its bloated with telemetry
Fedora is the same
idk, i don't care about telemetry, I use discord already and it has no ads
the powershell script is old and outdated
i've removed it
you just run the python script
Just installed ubuntu and all software for it from intel website
Steam - crashes, no 144hz on all of the distros i tried. Clinfo - 0 platforms on arch/ubuntu.
Back to Windows, i still can’t install comfy with your script? Can u help me? How to pass this?
What is breaking
@signal patio ^
Running powershell from administrator
Restriction policy unrestricted now, but the problem is the same
This is a completely different issue.
At this point though, Python/Windows's fault.
I'm making some workaround
@signal patio run python, run this, and show me what the outputs are for this:
import sys
import subprocess
sys.getdefaultencoding()
sys.getfilesystemencoding()
subprocess.check_output("chcp", shell=True)
How to run python?
type python in the command prompt. Press enter. Copypaste the code above. Press enter.
How? This is fresh installed windows + latest 3.13 python
@signal patio Download the script again. run it again.
It should work now. If you see funny symbols instead of proper text, no idea
Installed without any errors
BUT
When i run it
Just wait it out
This is normal
You are also going to see it every time
Wait harder
List your full specs
Intel Core i5 12500
Intel Arc A770 16gb
16gb RAM
16gb ram is not enough?
No, im connected with DP in my dGPU
@signal patio Download script again, run script again
in the future consider disabling igpu but this should now work with the igpu enabled
You can run it in the same place as last time, it'll update whatever's already installed
Nothing changes
hey @earnest grotto @civic charm is there anything I would need to add to comfyui with wsl2 to mitigate the memory leak issue? Or is it already in the hijacks?
I do remember using like tmalloc or something before, but I don't fully remember anymore
the hijacks disable torch gc on wsl2 for this
tcmalloc will help with the system ram leaks
Can someone help me to run comfy on arc a770? Its been a week and i still cant do it. Tried official comfy guide from github, tried Vik script, tried Arch, Ubuntu and nothing 😢
Thanks disty, will try wsl 2 today or tomorrow
What error are you getting?
Common things: make sure you have the right python for vik's script (3.10 i believe), and make sure you call the oneAPI stuff every time you start comfy
What is the command you are using to open comfy? I am unfamiliar with this error, but it is happening with the hijacks edit in model_management.py (I think). Are you calling the oneapi files when opening comfy?
this could be wrong, but you need to run something like this source /opt/intel/oneapi/setvars.sh everytime you start comfy in linux if using ipex I believe.
Vik’s script creates a file shortcut to run comfy
Idk any other ways to run it. My previous card was rtx3060 and i was installing comfy in 1 min just downloading a folder from github
After it breaks, type in:
python -c "import torch; import intel_extension_for_pytorch; print([torch.xpu.get_device_properties(i).name.lower() for i in range(torch.xpu.device_count())])"
press enter
Show what it says
He has the iGPU enabled, since it's an older one this causes a problem with OneAPI/IPEX. One of the two. Workaround (implemented in AIPG) is to just tell which GPUs OneAPI should use directly. Something didn't work out when it does that.
OK, sorry. Type in set ONEAPI_DEVICE_SELECTOR=, press enter, then the above
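For context, a sketch of what "tell OneAPI which GPUs to use directly" could look like. `ONEAPI_DEVICE_SELECTOR` takes values of the form `backend:indices`, for example `level_zero:0`. The helper name, the "name contains arc" filter, and the device list are assumptions for illustration, not what the script or AIPG actually does, and the sketch also assumes the indices you get from listing devices line up with the Level Zero ordering.

```python
import os


def build_selector(device_names, keep=lambda name: "arc" in name.lower(),
                   backend="level_zero"):
    """Build an ONEAPI_DEVICE_SELECTOR value that keeps only matching devices.

    Returns None when nothing matches, so the caller can leave the variable unset.
    """
    indices = [str(i) for i, name in enumerate(device_names) if keep(name)]
    return f"{backend}:{','.join(indices)}" if indices else None


# Hypothetical device list: an iGPU at index 0 and the Arc card at index 1.
names = ["Intel(R) UHD Graphics 770", "Intel(R) Arc(TM) A770 Graphics"]

selector = build_selector(names)
if selector:
    # Has to be set before torch / IPEX is imported in the same process.
    os.environ["ONEAPI_DEVICE_SELECTOR"] = selector
print(selector)  # level_zero:1
```

Typing `set ONEAPI_DEVICE_SELECTOR=` in the command prompt, as asked above, clears the variable again so every device is visible for the diagnostic.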
weird
How about then,
python -c "import torch; import intel_extension_for_pytorch; print(torch.xpu.device_count())"
you posted the same screenshot
my bad
remove 1 bracket, sorry
at the end
python -c "import torch; import intel_extension_for_pytorch; print(torch.xpu.device_count())"
Then this
No.
Copypaste this
Then press enter
No comma after it. No pasting the rest after it.
Whatever, do you want to just disable the iGPU from device manager?
I will update the script again
all i need from it to be able too use it in case my main gpu wont let me boot
It's an IPEX/OneAPI bug
Intel peeps worked around it somehow with AIPG but that doesn't quite seem to work for me.
i should disable igpu in bios?
im on windows
only
Go into task manager and disable it.
@signal patio Restart PC so we're sure the disabling takes effect. Download script again. Run again.
U are legend
Thank you for your help, idk i hope Intel will make this easier in the future
Because for now it was something like Souls-like quest
It will definitely be getting easier in the future
Now, everything is the same as on nvidia cards? All nodes is working?
Everything in comfy by default, and the custom nodes my script installed, works
Of course, not absolutely everything works. For example, the various 3D-related custom nodes, rely on explicitly nvidia-only things (e.g. nvdiffrast)
But you're probably not using those
The vast majority of things work
Some nodes will require 2.5 but most will still work. Its a speed vs compatibility choice. Not sure if vik added 2.5 to the script or not. 2.3 is most well rounded speed wise and 2.5 is slowest.
there is 2.5 yes
Why 2.5 is slower?
Issues with pytorch itself, it's apparently slower on all vendors with intel being the worst afaik.
2.6 nightly build is a bit faster but it's still being worked on and is also not as fast as 2.3.
Guys how to install PuLID for flux? All the guides say that i should use the comfyui/python embedded/ folder but i dont have one
not sure, my guess is that would be your venv folder. I think that python embeded is for the standalone comfyui.
also from what i'm reading on reddit "This workflow currently works with ComfyUI versions 0.2.3 or lower, as the new one uses an incompatible Python version "
Hello,
I’ve been trying to install ComfyUI for 3 days. My graphics card is Intel Arc B580.
I managed to reach a certain point, but I’m encountering an issue. When I click "Generate," after loading the model, the process stops at the clip_text_encode step. How can I resolve this?
are you using --bf16-unet
I was trying this way, but the same error persists. I tried --gpu-only and --force-fp16 etc. but still the same.
Python version: 3.10.16, PyTorch version: 2.1.0a0+cxx11.abi
you need bf16 for intel not fp16 for comfy
your ipex is also old
also, maybe try a different workflow or model.
So if I update the wsl 2 kernel to 6.2+ do I still need to manually install the intel drivers? Seems they aren't being picked up by torch, but not sure if installing should be necessary. edit seems you do, probably something for loading the windows drivers?
also, trying torch 2.6
Hey! Maybe u know how to install reactor?
I get this error
It probably doesn't support python 3.13.1, most webui stuff is on 3.10 or 3.11
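A quick way to check this before installing a node is to compare the interpreter version against what the node is likely to support. The supported set below is just the rule of thumb from the chat (3.10 or 3.11), not something read from ReActor's requirements, and the function name is made up.

```python
import sys

# Rule of thumb from the chat: most webui stuff targets 3.10 or 3.11.
LIKELY_SUPPORTED = {(3, 10), (3, 11)}


def is_likely_supported(version_info=sys.version_info):
    """True if the major.minor version is in the rule-of-thumb set."""
    return tuple(version_info[:2]) in LIKELY_SUPPORTED


print(sys.version.split()[0], is_likely_supported())
```

Run it with the python from your comfy environment, not the system one, since those can be different versions.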
Also, make sure you are in your env
It may also be able to install from the comfyui manager, most nodes can unless they are brand new
On another topic, I forgot how much of a pain linux and wsl 2 were lol. Been at this all day. Probably should just nuke my wsl env and start from scratch tbh.
took all day to realize that 2.6 doesn't need oneapi and pulls an error if it's called. 🙂
do NOT install the pti if you have the basekit installed
and vice versa
@reef ivy
@signal patio Run comfy with the shortcut. press ctrl+c, this will shut it down. install whatever you need to install, the command prompt will persist with the environment activated.
Yeah, I am running it through a venv but was calling one api by default from my old ipex install. Got it running but installing comfy nodes as we speak
2.5 needs one api files but not 2.6 nightly apparently
keep getting this error now 2 active drivers ([<class 'nvidia.CudaDriver'>, <class 'intel.XPUDriver'>]). There should only be one. with a bunch of nodes
seems to be an issue with 2.6 and triton.
Can't get any video stuff to work yet, but flux is getting about 7.45s/it with wsl2 and pytorch 2.6. I believe that is just 1 sec slower than 2.3 but not sure off the head atm.
ram usage is still crazy with wsl it seems though.
I can't get Hunyuan to work I get a completely black output. I tried with 2.3 and 2.5 ipex
I use kijai/ComfyUI-HunyuanVideoWrapper
Too bad that Node block_Swap is not natively integrated into comfyUI
Try this workflow. Just download the image and drag into comfy should work #1193952640225267802 message
Thanks
How to fix it?
And this 🤣
FETCH DATA from: https://api.comfy.org/nodes?page=1&limit=1000
is causing the ComfyUI tab to reload very slowly. How can I prevent this?
i used this command and it fixed BUT i have the same problem as @tribal hare now. Its loading forever
Might be an issue with that node, check the github and see of others have the issue.
I noticed comfy taking a while to load when I was testing wsl 2 so likely something wrong with comfy commit. I thought it was a wsl 2 issue but I hadn't went back to my windows env to test
New pytorch? Is it fast again? I couldn't get 2.6 to work right in wsl, triton issues could be because ipex was installed at one point on the main install.
It did seem faster than windows for when it did work in flux at least.
🤷♂️ it seems around the same as 2.6
Do not mix IPEX and PyTorch
Only fix is a new venv
I never had it in the venv but it was installed in wsl 2 itself. Might just have to nuke the whole thing
venv isolates from your main python
Can someone tell me how to fix this problem? I used the tutorial below to install it.
what problem
like this
I use the flux model. When the program progresses to the clip nodes, it will report the UserWarning as shown in the figure, and then comfyui will terminate and exit
sorry, I don't quite understand what quantization for T5 is.
https://github.com/comfyanonymous/ComfyUI/discussions/476 I used this installation tutorial
extremely old.
how did that even work? wild
AIPG includes comfyui, and it will set things up for you
I'd suggest you use that and use its comfy
install it, run it, ctrl+shift+i, see what comfy's port is
IF you don't want to use AIPG, I have a script that will install comfyui for you, as well as some addons, patches, whatever
.