#ComfyUI for Intel Arc using IPEX


karmic jasper
#

cause i drop that in comfy and nothing loads

#

tried another one, that also brings nothing into comfyui, i don't get it

#

yeah, none of the json files bring anything in. not sure whats wrong

#

still getting this

reef ivy
karmic jasper
#

it's nowhere

reef ivy
#

Also make sure you select everything and make sure it loads, sometimes the workflow has the wrong name for files or you have them in a different place or different name etc

reef ivy
# karmic jasper yeah its not that

I have no idea, you can try downloading the cogvideox nodes in manager first maybe. Everything loads fine for me. Also what browser are you using? Sometimes that can have issues with webui's. I am on firefox, and have used chrome/edge(though not with cogvideo yet)

#

Oh, and make sure comfy is up to date

karmic jasper
brisk blade
#

Curious. How are you running this? native linux install? Windows? WSL?

reef ivy
#

I am in native windows, but the 4gb issue is with arc in general.

#

The hijacks bypass it somehow (have to ask disty about that)

earnest grotto
earnest grotto
reef ivy
karmic jasper
reef ivy
# karmic jasper Yes, the flux one I saved works fine

did you install the cogvideoxwrapper nodes? you can do it in the manager, or git pull. I honestly have absolutely no clue why these would work different for you. Even if you don't have them installed they should show up red.

karmic jasper
reef ivy
#

I have no clue, only thing with me is they aren't centered so you have to zoom out compared to other workflows. If comfy is updated I have no idea.

karmic jasper
#

yeah, i zoomed out and everything

#

lets see if i can update comfy

#

nope, its latest version

reef ivy
#

Try and git pull in the comfy folder

reef ivy
karmic jasper
karmic jasper
#

Gotta generate an image to test it with

karmic jasper
#

ok, it seems to be working!

#

Thank you mate, much appreciated

reef ivy
#

no doubt, glad it worked.

karmic jasper
#

something about format

reef ivy
#

This affects all the video output, not just cog video

#

Start your venv manually, then navigate to ...ComfyUI\custom_nodes\ComfyUI-VideoHelperSuite and run pip install -r requirements.txt. Make sure you're in that folder; you don't want to install the default comfy requirements.

#

This should install the needed encoder in the venv, I tried updating the node through comfy but this was the only way it worked for me
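A minimal sketch of the same step driven from Python, in case typing it by hand keeps going wrong. The function names are made up for illustration, and the node folder is passed in because the path above is elided. It assumes the script is run with the venv's own interpreter, so pip installs into that venv.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_cmd(requirements: Path) -> list:
    """Build a pip command bound to the interpreter that is currently running."""
    # sys.executable keeps the install inside whichever venv launched this script.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


def install_node_requirements(node_dir: Path) -> None:
    """Install one custom node's requirements.txt, never ComfyUI's top-level one."""
    requirements = node_dir / "requirements.txt"
    if not requirements.is_file():
        raise FileNotFoundError(requirements)
    # cwd is the node folder, mirroring "make sure you're in that folder".
    subprocess.run(pip_install_cmd(requirements), cwd=node_dir, check=True)
```

Usage would be something like `install_node_requirements(Path(r"<your ComfyUI>\custom_nodes\ComfyUI-VideoHelperSuite"))`, with the placeholder replaced by your own install location.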

native jackal
#

Hi Guys, I just bought a laptop with Intel Core Ultra 7 155H CPU with 32gb RAM. I just generated an image on SD3.5 medium model with the example workflow provided by stability AI and it took 1 hours and 12 minutes to generate! Is there a way to accelerate this to more reasonable waiting times (like 10 minutes)? I appreciate the help!

karmic jasper
native jackal
#

I was afraid this would be the answer 😂 Anyway thank you for quick reply 👍

earnest grotto
#

I don't think 1 hour is quite right regardless 🤔

earnest grotto
native jackal
#

I installed ipex and then followed the instructions on the first post of this channel but got this error when try to pip install -r requirements-ipex-ultra.txt. Any help is highly appreciated 🙏

earnest grotto
native jackal
#

3.11.0

reef ivy
nocturne fjord
#

Hello, has anyone managed to get ComfyUI's Flux Train to work with a 770? I get the following error at the start of training: "RuntimeError: FP64 data type is unsupported on current platform."

#

Loading: ComfyUI-Inspire-Pack (V1.6)

Total VRAM 15931 MB, total RAM 32127 MB
pytorch version: 2.3.1+cxx11.abi
Set vram state to: NORMAL_VRAM
Device: xpu

Loading: ComfyUI-Manager (V2.51.9)

ComfyUI Revision: 2804 [cc9cf6d1] | Released on '2024-10-31'

#

Python main.py --auto-launch --bf16-unet --disable-ipex-optimize --fast with hijack

reef ivy
#

Only pytorch version that supports arc is 2.5 or ipex afaik

#

And fp64 isn't supported natively, have the same issue trying to use frame interpolation ATM.

nocturne fjord
#

I guess it's not a good idea to install pytorch 2.5 because it's much too recent?

reef ivy
#

Only issue I have heard is that it is slower

#

But it supports intel natively

#

I am still on ipex though, as speed is important to me lol

rocky sedge
#

i am using comfyUi with stability matrix

#

and currently using ultra 7 258v with intel arc

#

Can anyone suggest ways to make better use of resources?

reef ivy
#

Found a workflow for extended videos in cogvideox-fun models, used the 5b-fun since 2b is kinda nonsense so far. Real low res but I may try a video upscale node. With fun models you can have a start and end image, not sure how to do it with the regular one, and this workflow is crazy looking lol.

#

Honestly, pretty cool for a video on an a750 intel arc gpu that couldn't even generate images over a year ago. Progress is crazy.

earnest grotto
#

Has this changed with 2.5?

earnest grotto
#

Cuz when I tested with earlier 2.5, it was still borked

reef ivy
#

You don't have the t5 model loaded , you have the vit L loaded twice

reef ivy
#

The cpu offload option gave me errors like it wasn't supported. You got the 5b gguf to work? They also gave me errors, I did just update maybe I will try again. Also haven't actually tried the gguf clip models are they faster with same precision?

earnest grotto
#

So, no one else either can get flux loras working?

reef ivy
earnest grotto
reef ivy
#

Tried detail lora's and turbo lora's

#

So far yeah, but I think not all loras work

earnest grotto
#

Do you wanna test out a lora for me? Are you using 2.5?

reef ivy
#

No, I am on Ipex.

#

I can try the lora, but I am using Ipex

#

Also, is there a way to update comfyui without having to re-add the ipex hijacks each time?

#

I keep forgetting then get the 4gb error lol

earnest grotto
reef ivy
#

I tried a bunch of git stash commands and it "worked" but completely broke the file lol.

earnest grotto
earnest grotto
reef ivy
#

Should lora be full strength?

earnest grotto
#

yes

#

the expected result is not a pure black image

reef ivy
earnest grotto
#

so it works

#

nice

#

i'll go plop my hacks on github later

reef ivy
#

Maybe issue with pytorch 2.5?

earnest grotto
#

I had tried with both 2.5 and 2.3, and it was pure black

#

May be something fixed with windows drivers now, idk

#

actually, training might still not work on windows, I'll have to try and see

reef ivy
earnest grotto
#

This was both on linux and windows. I only train on linux, I don't use wsl and training on windows natively hasn't worked for a while

#

actually i may have only tried linux, i don't remember anymore, i'll try again

reef ivy
#

Strange, it could be the driver I guess.

#

Nice! How many steps did you use? Just a heads up you can get decent quality with like 10 steps.

reef ivy
#

bf16 is the only one that works, and the fp8 converter thing doesn't work either so no speed boosts for us at all(at least none that worked for me). I need to try the cpu offload again, it also gave errors last I tried.

wicked fulcrum
#

All, we're looking at ComfyUI as an optional backend to AI Playground.
With this we'd commit workflows that integrate well with AI Playground and provide value added features.

I'm checking if this community would be interested in submitting workflows for us to test and review for this.

If there's interest, I'll create a ComfyUI workflow thread here, for shared workflows.

earnest grotto
#

What about workflows that need specific custom nodes?

#

And/or potentially even models

#

Specifically:
Upscaling with upscale models (tons of those, different results), like realesrgan-x4plus, the nodes for this are in by default but no model; good for textures and anime
Upscaling with SUPIR, needs nodes and model, good for normal realistic images
Inpainting with powerpaint, needs nodes and model, uses a non-inpainting SD1.5 model, does object removal much better than regular inpainting models, can be better than regular models besides that but that stuck with me

wicked fulcrum
#

We'll probably look at creating a manifest for a workflow, everything needed for it, with a user controlled option to download and install.

The harder part is input types images, masks etc. If AI Playground already has the input then we can map. If it doesn't then that would be harder to implement

reef ivy
#

What would be cool is to have a toggle to view the nodes and edit etc. If you ever did music there is a program called reason where you can toggle and view everything like a hardware setup and mess with the connections then flip back to ui.

earnest grotto
#

@reef ivy Do you use Linux/WSL and if yes, would you like to test out training a flux lora through comfyui on your 8gb gpu?

#

xpu-smi is telling me my vram usage is consistently below 8gb. It is only getting polled something like every 20 seconds but it certainly got me thinking

#

i kinda don't think there's a vram equivalent/argument for ulimit to test myself 🤔

reef ivy
#

I have wsl but haven't used it in almost a year, also no clue how to train

reef ivy
#

https://blog.comfy.org/mochi-1/ Mochi in comfy, there is a 9gb model under low ram solutions and a fp8 clip, someone ran it on a 3060 12gb.

Comfy Org Blog

We are excited to announce that ComfyUI now has optimized support for Genmo’s latest model, Mochi! This integration brings state-of-the-art video generation capabilities to the ComfyUI community, even if you're working with consumer-grade GPUs.

The weights and architecture for Mochi 1 (480P) are open and available, and Mochi 1

reef ivy
#

Seems like it will take at least an hour to generate at 30 steps, but seems to be working so far. Not sure I want to wait that long

reef ivy
#

it works

#

Takes like an hour on a750 with default settings. I ran it with fewer steps and lower res and it sped up a little, took like 17 minutes, but at lower res the output was bad.

#

make sure you update comfy

#

and use the workflow they provide, should be able to click and drag that photo

#

I'm not sure, I update manually and have an old script that doesn't have that option.

#

Just have to re-edit the file for hijacks each time

earnest grotto
#

It does, just re-run it

#

Also updates the custom nodes it installs, if they're already installed

#

It doesn't update edited files, i should go fix that

reef ivy
#

If you get it working let me know how fast it goes, it's real slow on a750. And i think I read it's also slow on amd, but not sure what they were using.

#

so far fewer frames doesn't seem to speed up generation much if at all, and lower res only does to a point. So it's much different from cogvideo

#

might let it sit for an hour and see how good it is, but don't feel like it now

#

Are you running their workflow? or the one from mochiwrapper? I am using the one they posted

#

I am using the scaled version

earnest grotto
#
  if bias is not None: bias = bias.to('cpu')
  o = torch._scaled_mm(inn.to('cpu'), w.to('cpu'), out_dtype=input.dtype, bias=bias, scale_a=scale_input, scale_b=scale_weight).to(inn.device)

there, quick dirty hack

#

might not work since with fp8 i know things might also not be implemented for cpu

reef ivy
#

I just changed this to 5 steps because it's so slow but here is the workflow, it is default 30.

#

you should also be able to drag that video I posted earlier after downloading it #1193952640225267802 message

#

that resolution is super low though lol

#

also, it's much faster now. Maybe I was having some issues when i tried, sometimes comfy goes slow until I restart it.

#

oom'd at vae, need to use tiled probably

earnest grotto
#

yeah some bot here deletes things

#

no idea why

reef ivy
#

If a piece of code is too long it auto deletes, probably for security reasons.

earnest grotto
#

rip dan

reef ivy
#

It deletes with like more than a couple lines

reef ivy
earnest grotto
#

I've posted longer that didn't get deleted

reef ivy
#

It's been deleting almost everything for me, if it's longer than like 2 lines maybe even one line sometimes

earnest grotto
#

What he was posting was fairly short, and I think sometimes he posted my thing copypasted without the code block?

reef ivy
#

even with code block it deletes for me sometimes

#

it's a recent issue

earnest grotto
reef ivy
#

Who should we contact to get him unbanned?

earnest grotto
#

no idea who specifically

#

204342691964780546

#

IDK how i'd turn that into a mention, dammit discord

somber trellis
#

Wow.

earnest grotto
#

Well that was resolved fast

reef ivy
#

They deleted all your posts too it seems?

earnest grotto
#

normal for a ban

somber trellis
#

It was three lines, 277-280 of ops.py

#

Was I to replace that with your code.

earnest grotto
#

either way, replace the 2nd line you posted which does have bias, with

  o = torch._scaled_mm(inn.to('cpu'), w.to('cpu'), out_dtype=input.dtype, bias=bias.to('cpu'), scale_a=scale_input, scale_b=scale_weight).to(inn.device)
#

and 4th one which doesn't have bias with

  o = torch._scaled_mm(inn.to('cpu'), w.to('cpu'), out_dtype=input.dtype, scale_a=scale_input, scale_b=scale_weight).to(inn.device)
#

try again

#

might still not work, stuff is unimplemented for fp8

somber trellis
#

I should've just posted an image.

#

There, that look correct?

earnest grotto
#

i wonder, do spambots post code that often? because I have not seen any in a different server I'm in that gets spambots fairly often

earnest grotto
#

might also be giga slow though 🤔

somber trellis
#

17.80it/s

#

on bf16

#

set to fp8 weights via the 'load diffusion model' node.

#

Because I remembered

upbeat crow
#

If anyone gets banned in the future please just contact a mod, we can fix the issue immediately. sorry for auto mod.

reef ivy
# somber trellis

wow, that is pretty fast tbh. I can't get the vae to work, tiled vae isn't working. Gonna try the mochi decode node

reef ivy
upbeat crow
reef ivy
#

okay, thanks a lot. Appreciate it

reef ivy
#

can't get the vae to work, tiled vae gives errors and the mochiwrapper nodes refuse to install. Guess low res is all i can do

reef ivy
#

I cant get it to even install at all, will try another day.

somber trellis
#

If anyone else can get cogvideo 5b working that'd be cool.

#

I wish we could run mochi but that's 24gb vram

#

🤷‍♂️

reef ivy
#

5b regular can only run at a set res and frames

sour depot
upbeat crow
#

Just checking in, have issues cleared up??

#

In this chat?

earnest grotto
upbeat crow
#

We changed limits in this channel only, it should help but still curious

somber trellis
#

@earnest grotto The latest IPEX version has issues with Florence2 and reading filepaths

#

the florence2node by kijai

#

Works fine on the previous version.

earnest grotto
#

ipex doesn't read files

#

what do you mean

somber trellis
#

I'll show you the error.

#

It's probably some dependency ipex uses, but I don't know why.

devout tangle
#

How do i install comfy and llms using pytorch 2.5.1 instead of ipex

#

Is it better than ipex?

civic charm
#

Replace import ipex with this:

try:
    import transformers # ipex hijacks transformers and makes it unable to load a model
    backup_get_class_from_dynamic_module = transformers.dynamic_module_utils.get_class_from_dynamic_module
    import intel_extension_for_pytorch as ipex
    ipex.llm.utils._get_class_from_dynamic_module = backup_get_class_from_dynamic_module
    transformers.dynamic_module_utils.get_class_from_dynamic_module = backup_get_class_from_dynamic_module
except Exception:
    pass
somber trellis
civic charm
#

find the import intel_extension_for_pytorch in model management

somber trellis
#

oh ye

#

👍

somber trellis
civic charm
#

after this: transformers.dynamic_module_utils.get_class_from_dynamic_module = backup_get_class_from_dynamic_module

somber trellis
#
try:
    import transformers # ipex hijacks transformers and makes it unable to load a model
    backup_get_class_from_dynamic_module = transformers.dynamic_module_utils.get_class_from_dynamic_module
    import intel_extension_for_pytorch as ipex
    ipex.llm.utils._get_class_from_dynamic_module = backup_get_class_from_dynamic_module
    transformers.dynamic_module_utils.get_class_from_dynamic_module = backup_get_class_from_dynamic_module
    from ipex_to_cuda import ipex_init
    ipex_init()
    xpu_available = True
except Exception:
    pass
civic charm
#

yep

somber trellis
#

👍

#

works now

civic charm
somber trellis
#

Now I can do big funny auto-I2V cog workflow

#

slo

#

w

earnest grotto
#

do any of these local video models produce decent results
the online stuff has been pretty disappointing to look at
show us your results

reef ivy
reef ivy
civic charm
#

4 GB issue is a hardware issue

#

Alchemist is a 32 bit architecture

uncut bronze
#

Is it still recommended to use IPEX as explained in the original post, or is there a better method to get ComfyUI to run by now?

earnest grotto
#

Seems pretty good?

uncut bronze
#

It actually does. I have Comfy running with Ipex already though. But I would love it for a second install to try the hijacks

#

I had some issues with bf16 and the fooocus nodes and with torch audio

uncut bronze
# earnest grotto Seems pretty good?

Okay, I have no clue how to use Disty's hijack. Where do I have to put the hijack? And do I need to run different requirements for them or update pytorch or anything?

earnest grotto
#

You git clone the hijacks repo in comfyui's comfy folder, find where intel_extension_for_pytorch is imported in model_management.py and edit that so it also does from ipex_to_cuda import ipex_init and ipex_init() right afterwards
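Since the edit has to be redone after every ComfyUI update, here is a rough sketch of a re-runnable patcher for that step. It works on the text of `model_management.py` and assumes the import line reads exactly `import intel_extension_for_pytorch as ipex`; the function name `apply_hijack` is invented for this example, and the real file may differ between ComfyUI revisions.

```python
import re

# Matches the IPEX import line and remembers its indentation.
IPEX_IMPORT = re.compile(
    r"^(?P<indent>[ \t]*)import intel_extension_for_pytorch as ipex[ \t]*$",
    re.MULTILINE,
)
# The two lines described in the message above.
HIJACK_LINES = ("from ipex_to_cuda import ipex_init", "ipex_init()")


def apply_hijack(source: str) -> str:
    """Return source with the hijack lines added right after the IPEX import."""
    if "ipex_init()" in source:
        return source  # already patched, so running twice is harmless
    match = IPEX_IMPORT.search(source)
    if match is None:
        raise ValueError("import of intel_extension_for_pytorch not found")
    indent = match.group("indent")
    addition = "".join(f"\n{indent}{line}" for line in HIJACK_LINES)
    return source[: match.end()] + addition + source[match.end():]
```

To use it, read `model_management.py` as text, pass the contents through `apply_hijack`, and write the result back. Keep a backup copy of the file first, since this is only a sketch.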

uncut bronze
#

Thanks. Uff, with oneAPI installation and all, that's a lot more complicated 😄

earnest grotto
#

windows or linux

reef ivy
earnest grotto
#

oh, i saw it but forgot

#

I'll edit the script later

reef ivy
#

Cool, seems like this will work with pytorch as well? If not using ipex

earnest grotto
devout tangle
#

Do you have a script for linux

earnest grotto
#

@devout tangle ^
Windows only for now.

devout tangle
earnest grotto
earnest grotto
#

restart whatever is using the ram

#

what kernel

devout tangle
#

I didn't install out of tree gpu driver, just the regular one that came with the distro

swift aurora
#

anyone know a relatively painless fix for the numpy-problem?

earnest grotto
#

probably you want numpy==1.26.4
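A small check along those lines, assuming the "numpy problem" is the usual one where a NumPy 2.x install breaks wheels built against NumPy 1.x (1.26.4 being the last 1.x release). The helper names are made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_numpy_version() -> Optional[str]:
    """Return the installed NumPy version string, or None if NumPy is missing."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


def numpy_needs_pin(installed: Optional[str]) -> bool:
    """True when NumPy is missing or is a 2.x (or later) release."""
    if installed is None:
        return True
    return int(installed.split(".")[0]) >= 2


if __name__ == "__main__":
    version = installed_numpy_version()
    if numpy_needs_pin(version):
        print(f"numpy is {version}; try: python -m pip install numpy==1.26.4")
    else:
        print(f"numpy {version} looks fine")
```

Run it inside the same environment ComfyUI uses, otherwise it reports on the wrong NumPy.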

earnest grotto
devout tangle
earnest grotto
#

yes

devout tangle
#

It's not convenient, it must be a bug of some sort

earnest grotto
#

it is a bug yes

#

are you launching with --lowvram, --highvram, etc.?

civic charm
swift aurora
#

Suggestion; If possible, add pre-requisite part to the pinned script-post mentioning that you need to have anaconda/forge/miniconda installed. Just for clarity's sake

honest hull
#

you have iGPU enabled?

somber trellis
#

Also, I'd like to mention that the flux toolkit loras (depth and canny) work on arc, but will not load if the main model is loaded in fp8 dtype with a separate lora. It results in a black screen.

honest hull
#

tried it with --bf16-unet ?

somber trellis
#

...You are asking common-sense questions, Li.

#

Yes.

#

But it's always good to make sure.

#

🤷‍♂️

honest hull
#

haha my bad.. been trying to help troubleshooting non-stop these days..

somber trellis
#

The flux fill fp8 model works great

#

so inpainting and outpainting on arc is no problem

#

I assume the main flux canny and depth models (non-lora) would work but nobody has converted them to FP8 yet

#

and I don't want to install two 23 gigabyte files lmao

honest hull
#

one thing that we could experiment is to use the gguf model

#

instead of FP8

somber trellis
#

GGUF works fine

#

It's just slower

#

But more accurate

#

higher precision than base fp8

somber trellis
#

So I made my own fp8 versions and they work

#

This script I found off of Kijai's stuff is nice to have.

honest hull
somber trellis
#

That'd literally cripple the bandwidth I have.

#

🤷‍♂️

earnest grotto
#

try and see

#

i know lowvram improved my speed, and running the t5 off the cpu could be even faster than shuffling it around

reef ivy
#

It will still use more than 5gb vram if it needs to (At least it seems to sometimes), it just makes the gguf models go faster. (on a750 anyway)

earnest grotto
#

it attempts to leave that much vram to the rest of the system, not to reserve it for comfyui itself

honest hull
#

yeah i can even get fp16 gguf model running with decent speed when launching with reserve vram 4.0

#

without it, it keeps swapping with DRAM for smaller chunks and taking much longer time
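To make the "leave that much for the rest of the system" reading concrete, here is the arithmetic as a tiny sketch. It is only an illustration of the idea described above, not ComfyUI's actual memory accounting; the function name is invented, and the 15931 MB figure is the total VRAM from the startup log posted earlier in this thread.

```python
def comfy_vram_budget_mb(total_vram_mb: int, reserve_gb: float) -> int:
    """VRAM left for ComfyUI after setting reserve_gb aside for everything else."""
    reserved_mb = int(reserve_gb * 1024)
    # Never report a negative budget if the reserve exceeds the card's VRAM.
    return max(total_vram_mb - reserved_mb, 0)


if __name__ == "__main__":
    print(comfy_vram_budget_mb(15931, 4.0))  # 16 GB card from the log above
    print(comfy_vram_budget_mb(8192, 4.0))   # nominal 8 GB card
```

So a reserve of 4.0 on an 8 GB card would leave roughly half the card for models, which fits the observation that usage can still go above what you might expect.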

#

for NV users they seem to have similar memory management techniques. for example on a 4060 8GB you would see only 7GB out of 8GB is being used for Flux.1 Q8 running, and can notice the increase in system RAM usage while inferencing.

earnest grotto
#

What's up with the larger than usual amount of mysteriously vanishing messages now

reef ivy
#

Nvidia added a cpu offload option to the drivers a while back. It can be toggled on/off in the drivers, and there's also a comfyui command for it, I think.

honest hull
somber trellis
#

But I removed my messages since they were at like 2 AM for me

earnest grotto
somber trellis
#

yes i know i said it here lmao

earnest grotto
#

saw something in digital-art, went to see, nothing
the rvc thread now had a notification, but nothing

somber trellis
#

This node errors out 🤷‍♂️ I want to use this over SUPIR since it's better overall at image restoration

earnest grotto
#

lower the resolution, say what happens

#

use all vram-reducing options you can

somber trellis
#

at a resolution of around 480x240 it still failed

#

🤷‍♂️

earnest grotto
#

use the lowest possible resolution
I have no idea

somber trellis
#

0.1 megapixel scale causes a tensor a and b mismatch due to it literally being too small

#

rip

red niche
#

RuntimeError: Numpy is not available?

#

using numpy==1.26.4

#

@earnest grotto

earnest grotto
red niche
earnest grotto
#

that's a pretty old ipex

red niche
earnest grotto
#

Install using my script

#

.

red niche
#

should I remove all previous modules?

earnest grotto
#

Put the script in some random folder. Run it. It will create everything for you.

#

Whatever you had previously won't matter.

red niche
#

nice script

#

feels very welcoming to use

#

for people like me

earnest grotto
#

that's the point yes

somber trellis
#

So I tried out GGUF models with reservevram enabled

#

They equal the speed of FP8 models

#

but are actually more accurate to fp16

#

🤷‍♂️

somber trellis
#

ltxvideo works with comfycore base nodes, but not nodes from ltxvideo-comfyui

#

Faster than flux too btw, at 2.5S/IT.

sly trench
#

I got this error when running ComfyUI. I used a-One-fan's setup file. Can anyone give me some advice? Thanks so much!

!!! Exception during processing !!! The program was built for 1 devices
Build program log for 'Intel(R) Arc(TM) 140V GPU (16GB)':
-11 (PI_ERROR_BUILD_PROGRAM_FAILURE)

earnest grotto
#

You can run it again, show what it says

reef ivy
#

You may need to disable your IGPU when using arc for AI, not 100% sure though there might be workarounds now.

earnest grotto
#

The issue here is pretty likely that I didn't expect Intel to have a dedicated GPU called "140V"

#

And it probably installs the igpu version of ipex

#

which is also partially why i made it spit the name of the GPU and if it decided it's dedicated or integrated back to the user when installing

reef ivy
#

oh that's battlmag-- i mean xe2 mobile. Not sure what that runs on as far as ipex

earnest grotto
#

wait, so it is integrated

earnest grotto
honest hull
#

140V is the Core Ultra series 2 integrated GPU

#

need to replace the ipex wheels with

#

conda install libuv
python -m pip install numpy==1.26.4 torch==2.3.1.post0+cxx11.abi torchvision==0.18.1.post0+cxx11.abi torchaudio==2.3.1.post0+cxx11.abi intel-extension-for-pytorch==2.3.110.post0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/us/

earnest grotto
#

I get the GPU by basically looking at powershell's Get-WmiObject Win32_VideoController, which is fairly descriptive but I don't think gives info on generation and such

#

maybe it does but i'm on windows rn

sly trench
# honest hull 140V is the Core Ultra series 2 integrated GPU

You’re absolutely right. I’m running this on a laptop with Core Ultra 7 258V (32GB RAM) and no dGPU.

I installed IPEX 2.3.110 following Intel’s instructions. Then I ran pip install -r requirements.txt in ComfyUI directory.

Server initially reported missing modules like opencv-python, which I installed individually. After resolving those, everything worked perfectly without any errors.

I haven’t tried hijacks yet. I’m newbie and not sure what it can do.

sly trench
earnest grotto
#

I am the One fan in question

#

I can just hardcode a check for "140V" right now but i want to see if there's a better way to do it

sly trench
honest hull
#

--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/us <--- lnl/us at the end instead of xpu/us or mtl/us

#

multi-AOT for more devices should be WIP. once that's done then we can use the same wheel for different devices

#

technically it should still work if you use other AOT wheels, it would take a long time for the first image generation to compile kernels for that device.. tho

earnest grotto
#

I mean in the name string of the GPU, at least the one that powershell command spits out, so i can identify which to download for
as i'm pretty sure that's not the only lunar lake igpu

honest hull
#

oh

#

AFAIK,

#
  1. MTL = "Intel(R) Arc(TM) Graphics"
  2. LNL = "Intel(R) Arc(TM) 1**V GPU"
  3. ACM Arc = "Intel(R) Arc(TM) A*** Graphics"
#

maybe using regex to filter "Intel(R) Arc(TM) A" and " Intel(R) Arc(TM) 1** "

earnest grotto
#

hmm, what about something like the A60

#

2 or 3 digits? 🤔

honest hull
#

A60 uses the same wheel as A770

#

so if name starts with Intel(R) Arc(TM) A then download the A770 wheels???

earnest grotto
#

Hmm, I guess I'll do that, thanks
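The name matching discussed above could look roughly like this. The three name patterns and the `lnl/us`, `mtl/us`, `xpu/us` index suffixes come from the messages in this thread; the function names, the optional "Pro " prefix, and the 2-or-3-digit handling for cards like the A60 are assumptions for illustration, not what the install script actually does.

```python
import re
from typing import Optional

INDEX_BASE = "https://pytorch-extension.intel.com/release-whl/stable"

# Order matters: the specific patterns are tried before the bare MTL name.
_RULES = (
    (re.compile(r"^Intel\(R\) Arc\(TM\) 1\d{2}V GPU"), "lnl"),  # Lunar Lake iGPU
    (re.compile(r"^Intel\(R\) Arc\(TM\) (?:Pro )?A\d{2,3}\w* Graphics"), "xpu"),  # Alchemist dGPU
    (re.compile(r"^Intel\(R\) Arc\(TM\) Graphics"), "mtl"),  # Meteor Lake iGPU
)


def wheel_channel(gpu_name: str) -> Optional[str]:
    """Map a Win32_VideoController name to a wheel channel, or None if unknown."""
    name = gpu_name.strip()
    for pattern, channel in _RULES:
        if pattern.search(name):
            return channel
    return None


def extra_index_url(gpu_name: str) -> Optional[str]:
    """Build the --extra-index-url value for the detected GPU, if recognised."""
    channel = wheel_channel(gpu_name)
    return None if channel is None else f"{INDEX_BASE}/{channel}/us/"
```

Returning `None` for unknown names lets the caller print the GPU name back to the user and ask, instead of silently picking the wrong wheels.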

earnest grotto
#

welp, that was odd, my SSD decided it should load a model for ~700 seconds

#

Oh well

#

@sly trench Download the script again. Should say 0.0.7p now. Run again. Should work now.

earnest grotto
#

damn, from 8s/it to 14s/it, 2.5 man

earnest grotto
#

damn... something in 2.3 causes the flux trainer i'm using to save zeroed out loras, but it works with 2.5

earnest grotto
#

i'm turning into shadow the hedgehog here

civic charm
#

Probably because of BF16's rounding to zero issue thanks to its lack of precision

#

And random IPEX corruptions don't help either

#

Cache the CLIP embeds on 2.5

#

Do not run CLIP on IPEX

sly trench
#

The script identified the device as a Meteor Lake iGPU.

#

And I have no dGPU

earnest grotto
sly trench
earnest grotto
sly trench
#

You mean this?
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\websocket_image_save.py
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI_TiledKSampler
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-GGUF
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI_ExtraModels
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\rgthree-comfy
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-KJNodes
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\comfyui_controlnet_aux
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\comfyui-inpaint-nodes
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\comfyui-tooling-nodes
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-BrushNet
C:\Users\Lenovo\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-SUPIR

#

I'm using it as Local Server for Krita AI Diffusion.

sly trench
# earnest grotto This

I'm not using ComfyUI on webUI.
I'm using it as Local Server for Krita AI Diffusion plugin.
Yesterday when I ran it on my LNL laptop everything worked perfectly fine.
But I can't get it to run on the MTL laptop in any way.

#

Have you ever encountered this error? What can I do to fix it.

#

Thank you very much.

earnest grotto
#

Yes. I will look into it in a bit, I'm on linux rn

sly trench
earnest grotto
# civic charm Cache the CLIP embeds on 2.5

on some more testing, the lora is full of zeroes during training and remains so after backward-ing, and after whatever else
same embeds for 2.3 and 2.5, works with bf16 on 2.5
I remember now, the 2.5 build I had done didn't have whatever change they did with the attention to slow everything down, it was running at the same speed as 2.3

#

gonna try 2.6

earnest grotto
#

welp, from 8 @ 2.3, to 14 @ 2.5, to 11.5 @ 2.6

#

I guess they improved a bit with 2.6

#

not quite 8 but oh well

nocturne fjord
#

Hi, who managed to get bitsandbytes working on the arc gpu?

earnest grotto
nocturne fjord
#

in kohya

#

They ask for v2.4.0+ (ipex) but xpu version is not available

earnest grotto
#

I think it's safe to say no one has bitsandbytes working.
they've probably just copypasted the cpu ipex because they plan to support gpus as well, but haven't gotten that to work yet?
2.5 ipex looks to be in the works. if we assume the guide is actually true anyways you will likely need to build it yourself, which is gonna take a while.
You can then get the multi-backend bnb and test, but it says some things are not supported so I'd really expect whatever is needed for training to be among them

#

There are other ways to get faster training. What python version are you using? What model are you training? Windows or Linux?

nocturne fjord
#

I am using version 2.6.1dev+XPU

earnest grotto
#

what model are you training?

nocturne fjord
#

Flux dev fp8

#

512x512

earnest grotto
#

windows? linux?

nocturne fjord
earnest grotto
#

linux will be faster

nocturne fjord
#

I tried without much success on Linux ubuntu 24

earnest grotto
#

how many s/it are you getting on windows and with what gpu

nocturne fjord
#

A770 7s/its

earnest grotto
#

hmm, that's pretty fast

nocturne fjord
#

In fact I am also looking to train with a higher resolution.

#

I am limited to 512x512

earnest grotto
nocturne fjord
#

I think bitsandbytes can save me some memory

#

I need to enter images of width 512x1024

#

train*

earnest grotto
#

Ah, you're not splitting the model or fluxgym doesn't do that

#

Is that it?

nocturne fjord
#

I tried the --split option but it requires more than 64 gb ram

earnest grotto
#

Probably a fluxgym issue

nocturne fjord
#

I think the best solution is to run simpletuner, another script with more optimizations

#

They use Quanto

#

But I couldn't get it to work.

earnest grotto
#

If you build a version of pytorch 2.5 with xpu support from before they did whatever they did with slowing down attention, it'd probably be faster

#

i don't have such a build, and no guarantees, it's a gamble

earnest grotto
nocturne fjord
#

Ok, I will continue to experiment and give feedback.

sharp goblet
#

Why does this happen? Conda is whitelisted in firewall, port is unused, it has admin perms. Thx for the help in advance and sorry if this isn't the correct channel

earnest grotto
warm radish
#

What changed with Pytorch 2.5 having XPU support? Is IPEX optional? ( I see pytorch detects an xpu device without it installed). It is unclear whether some ipex versions are tied to certain torch versions.

earnest grotto
#

2.6 looks to improve on that performance regression

warm radish
earnest grotto
#

No. The performance regression with 2.5 is for every GPU.

#

Intel or otherwise

#

Except supposedly, H100s?

warm radish
#

torch.OutOfMemoryError: XPU out of memory. Tried to allocate 5.59 GiB. GPU 0 has a total capacity of 28.66 GiB.

earnest grotto
#

Did you install ComfyUI with my script, which will apply disty's hijacks?

#

If yes, I have no idea why currently

warm radish
#

no, I am just trying plain pytorch, just setting up to see the basics work at all

#

do you know if there are other discords/channels where such xpu stack related discussions are held (not necessarily in gen-ai context)?

#

I heard there is one, this was the first hit for intel-discord

earnest grotto
#

no idea

sharp goblet
earnest grotto
# warm radish Where is the iGPU max memory set? Is it BIOS? I see it OOMs with >4G requests ev...

You seem like the more active mtl user here
Open ComfyUI/comfy/ipex_to_cuda/hijacks.py
Go to line 7, device_supports_fp64 = torch.xpu.has_fp64_dtype() if hasattr(torch.xpu, "has_fp64_dtype") else torch.xpu.get_device_properties("xpu").has_fp64
below it, add
print(f"\n\n\nfp64: {device_supports_fp64}\n\n\n")
run comfyui, you will see a lot of black space and in the middle of it "fp64: True" or "fp64: False", show which it is

OR

alternatively you can activate the conda environment, go into python and do what that does

#

preferably you'd do the latter as I'd probably want to try out a few other commands
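
A minimal sketch of the "go into python" route, assuming the same check as the hijacks.py line quoted above. The helper name `device_supports_fp64` is reused here only for illustration, and the two stub objects are stand-ins, not real devices. On the older IPEX-based installs, `torch.xpu` may only exist after `import intel_extension_for_pytorch`, so treat the last part as a best-effort probe.

```python
from types import SimpleNamespace


def device_supports_fp64(xpu) -> bool:
    """Same logic as the quoted hijacks.py line, applied to any torch.xpu-like object."""
    if hasattr(xpu, "has_fp64_dtype"):
        return bool(xpu.has_fp64_dtype())
    return bool(xpu.get_device_properties("xpu").has_fp64)


# Stand-in objects (not real devices) that exercise both code paths.
newer_api = SimpleNamespace(has_fp64_dtype=lambda: False)
older_api = SimpleNamespace(
    get_device_properties=lambda device: SimpleNamespace(has_fp64=True)
)
print("stub with has_fp64_dtype:", device_supports_fp64(newer_api))
print("stub with device properties:", device_supports_fp64(older_api))

# Best-effort probe of the real install; run this inside the activated environment.
try:
    import torch

    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        print(f"fp64: {device_supports_fp64(xpu)}")
    else:
        print("torch imported, but no XPU device is visible")
except (ImportError, OSError):
    print("torch is not importable in this environment")
```

Doing it this way leaves hijacks.py untouched, so there is nothing to undo afterwards.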

civic charm
#

There are these too:

#

If you still want to use attention slicing, use the IPEX_FORCE_ATTENTION_SLICE=1 env var

#

Xe2 should support 64 bit

#

So it shouldn't have 4GB issues and FP64 issues anymore

earnest grotto
#

ah, this is making me realize
with the fp64 emulation that's shaping up, we're probably gonna reach a point where alchemist can do the fp64 data type but can't allocate more than 4gb?

civic charm
#

Pytorch 2.5's FP64 emulation causes exactly this (manually enabled)

#

So have to use force attention slicing env var as well
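
A small sketch of setting that env var from Python before launching, assuming ComfyUI is started as a child process. Only the variable name `IPEX_FORCE_ATTENTION_SLICE` comes from the messages above; the helper `comfy_env` and the commented launch line are hypothetical.

```python
import os


def comfy_env(base_env=None, force_attention_slice=True):
    """Return a copy of the environment with IPEX_FORCE_ATTENTION_SLICE set or removed."""
    env = dict(os.environ if base_env is None else base_env)
    if force_attention_slice:
        env["IPEX_FORCE_ATTENTION_SLICE"] = "1"
    else:
        env.pop("IPEX_FORCE_ATTENTION_SLICE", None)
    return env


env = comfy_env({"PATH": "/usr/bin"})
print(env["IPEX_FORCE_ATTENTION_SLICE"])

# Hypothetical launch, run from the ComfyUI folder:
# import subprocess, sys
# subprocess.run([sys.executable, "main.py"], env=comfy_env(), check=True)
```

Building a copy of the environment keeps the setting scoped to the one process instead of changing it for the whole shell.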

earnest grotto
tribal hare
#

I was using ComfyUI without any issues and generated several images. After stopping the server, I tried to restart it, but this happened.😭

earnest grotto
earnest grotto
# earnest grotto You seem like the more active mtl user here Open ComfyUI/comfy/ipex_to_cuda/hija...

@sly trench
Open ComfyUI/comfy/ipex_to_cuda/hijacks.py
Go to line 7, which is device_supports_fp64 = torch.xpu.has_fp64_dtype() if hasattr(torch.xpu, "has_fp64_dtype") else torch.xpu.get_device_properties("xpu").has_fp64
make a new line below it, add
print(f"\n\n\nfp64: {device_supports_fp64}\n\n\n")
run comfyui, you will see a lot of blank space and in the middle of it "fp64: True" or "fp64: False", show which it is
then undo the new line and text you added

tribal hare
somber trellis
#

just realized the non-comfyui vanilla nodes for ltxvideo work in arc, just not with --lowvram enabled.

#

having --reserve-vram 4.0 does entirely mitigate the issue

reef ivy
#

Comfy should go to lowvram automatically if needed, I think; it's just slower than enabling it by default. Maybe --reserve-vram speeds up the process so you don't need to enable it manually.

#

I want to update my drivers, but I don't think the vram issue is fixed yet. The reserve-vram option helps in comfy, but I'm not sure there are alternatives for other applications.

reef ivy
#

LTX in comfyui, 50 steps with a default prompt/workflow. There are some tricks to getting more movement and better video. It's super fast, 2.77 s/it or thereabouts. There are some quants, but they ran much slower for me (probably due to --reserve-vram)

#

This was img2video, generated the image with flux with the default prompt.

#

might try my darth vader and compare to cogvideox lol

sly trench
#

Additionally, your 0.0.7 version is working fine on an LNL laptop

earnest grotto
sly trench
#

I have both MTL and LNL

rustic sonnet
#

MTL has native FP64 support

earnest grotto
rustic sonnet
earnest grotto
#

ah
seems like an odd choice of what to support

earnest grotto
#

man, hopefully i don't get some really weird 700 second long load time again

#

nice, I didn't
wonder why my ssd decided to do that that one time

#

@sly trench Ok, I've updated the script, download it again, put it where you put it last time and run it again, it updates when you do that, faster than reinstalling. Regardless of that, it should work now

reef ivy
#

I wonder if it would be possible to get the 4gb fix committed to comfy? Its annoying to redo it each update tbh

earnest grotto
#

Especially now that some fp64 emulation has made it to 2.5

reef ivy
#

Could having multiple conda environments in windows slow stuff down even if they aren't both running? Edit: seems I had to reinstall the libuv/ipex files into the conda env; even the quant models are faster now.

honest hull
reef ivy
#

honestly, very cool. I hope we will be able to import our own workflows and edit them in comfy (seems like it from the demo).

reef ivy
#

latest cogvideox updates broke all my workflows for it. No clue how to fix it

reef ivy
#

finally figured out what was wrong, but now all outputs are garbage. If anybody tries out cogvideox again let me know if you can get decent output.

reef ivy
#

There is something majorly wrong with the latest ipex in windows in terms of speed: it's often 5x slower, but sometimes it's the same speed. I realized I had installed an older ipex, which was why it was fast, but it's not compatible with Florence 2, so I upgraded and now speed is ridiculously slow half the time. Is this issue known? I know I reported it many months ago. I could update drivers but haven't, since there is a memory issue in the latest ones.

#

for instance, same prompt and settings for ltx video. last ipex version it's 2.55s/it all the time

reef ivy
#

ipex, latest version 2.3xx

#

I am on older drivers, but it was a problem months ago as well. There is a memory issue for the a750 in the latest drivers now, so I haven't updated.

#

also, ipex hasn't updated since then either

#

frustrating since some new stuff like the florence VLM won't work on the older version

earnest grotto
#

there is a 2.5 ipex in progress
if you're using my script, I poked into adding 2.5 and with it 2.1.40 but gave up somewhat halfway through when I couldn't get 2.5 working on windows without the basekit, I can fully add 2.1.40 as an option

reef ivy
#

I hear pytorch is still slower, probably about what I am getting in 2.3

#

I could try it and see I guess

#

yeah, now it's back to 11 s/it lol, it's weird

#

2.1.4 is fast, but it isn't compatible with all the newer stuff

#

Now it's back to 2.44s/it during the same generation? it's like it's changing randomly.
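
To put numbers on "changing randomly", one option is to pull the rate out of each progress line and look at the spread. This is a rough sketch, assuming tqdm-style progress text that ends in `s/it` or `it/s`; the function names are made up for illustration, and the sample lines are hand-written around the figures quoted in this thread rather than copied from a real log.

```python
import re
import statistics

# Matches a rate such as "2.44s/it" or "4.00it/s".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(s/it|it/s)")


def seconds_per_iteration(line):
    """Return the last rate found in a progress line as seconds per iteration, or None."""
    matches = RATE_RE.findall(line)
    if not matches:
        return None
    value, unit = matches[-1]
    rate = float(value)
    if unit == "s/it":
        return rate
    return 1.0 / rate if rate > 0 else None


def summarize(lines):
    """Collect min, median and max seconds per iteration over all lines that carry a rate."""
    samples = [s for s in map(seconds_per_iteration, lines) if s is not None]
    if not samples:
        return None
    return {
        "count": len(samples),
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


# Hand-written sample lines using rates mentioned above (2.44, 11 and 2.55 s/it).
sample_lines = [
    " 10%|#         | 5/50 [00:12<01:50, 2.44s/it]",
    " 20%|##        | 10/50 [01:07<07:20, 11.00s/it]",
    " 30%|###       | 15/50 [01:20<01:29, 2.55s/it]",
]
print(summarize(sample_lines))
```

A large gap between the median and the max within one generation points at intermittent slowdowns rather than a uniformly slower build.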

earnest grotto
#

2.5 is slower and also has bugfixes

#

e.g. if you want stable cascade, that works on 2.5

#

doesn't on 2.3

#

2.6 is somewhat faster than 2.5, haven't tested too much

#

i think it's still slower than 2.3, just not almost-2x-slower

reef ivy
#

I may try 2.6 out, I think ipex 2.3 is just bugged for a750 in windows. I may also try new drivers, but last time it wouldn't even run flux without oom.

#

I can get the whls off github? Don't think i've ever installed straight pytorch tbh lol

earnest grotto
#

of course, since it's nightlies, it's probably possible that you get one where someone did a whoopsie and everything is completely broken

reef ivy
earnest grotto
#

yeah 2.5 is the preview

reef ivy
#

for the 4gb fix, what should I alter?

#

just take out import ipex?

earnest grotto
reef ivy
#

maybe that's this

earnest grotto
#

most likely

honest hull
#

it could also be a driver/windows issue.. I noticed that when shared GPU memory is being used by > 200MB.. it drops perf by like 4x.. but on comfy you could try adding --reserve-vram 6.0

honest hull
earnest grotto
# reef ivy just take out import ipex?

comfyui and disty's hijacks are made to work with xpu support in 2.5
you should move init_ipex above any ipex importing, still in the try-catch block
I'd assume the transformers load issue isn't present in 2.5 since there's no ipex

reef ivy
reef ivy
# earnest grotto most likely

couldn't get either version of pytorch to work at all. Installed oneAPI, called the environment setup, kept getting different errors. "The procedure entry point" one is the main one.

reef ivy
#

But ipex2.3 is a regression from 2.1.4, in windows with comfyui on a750 anyway

wicked fulcrum
# reef ivy There is something majorly wrong with the latest ipex in windows in terms of spe...

I'm seeing something similar when loading and unloading models that take you to the edge of memory.
It might be a memory leak combined with that driver fix. Remember the driver fix that fixed SDXL on the A750? It looks like that driver starts to swap memory while still under max memory. This shared memory mode interrupts and slows compute.
Getting to max memory seems to happen as you load and unload models. 10-20% of memory never clears and causes a memory bottleneck

reef ivy
#

So far, I've only experienced it with 2.3 ipex; 2.1.4 doesn't seem to suffer from it. At least for ComfyUI, I've found that the sweet spot seems to be --reserve-vram 5.0, which keeps most of the speed and stays consistent. At 4.0 it randomly goes slow and then fast again. At 6.0 the speed decreases more with no other benefit. I have only tested LTX video and Flux so far though. Maybe the memory fix they are working on will fix it
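
A sketch of how the three settings compared above could be launched one after another. The `--reserve-vram` flag is the one discussed in this thread; the helper `comfy_command`, the `main.py` entry point and the bare `python` executable are assumptions about a typical ComfyUI checkout, not something taken from the install script.

```python
import sys


def comfy_command(reserve_vram_gb=None, main_script="main.py", python=sys.executable):
    """Build a ComfyUI launch command, optionally reserving some VRAM (in GB)."""
    cmd = [python, main_script]
    if reserve_vram_gb is not None:
        cmd += ["--reserve-vram", f"{reserve_vram_gb:.1f}"]
    return cmd


# Print the commands for the three values compared above.
for gb in (4.0, 5.0, 6.0):
    print(" ".join(comfy_command(gb, python="python")))
```

Running the same workflow once per command and noting the s/it for each keeps the comparison to one variable at a time.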

reef ivy
#

Can IPEX and IPEX-LLM be run from the same env, or would they conflict with each other? I'm going to install Ollama to run LLMs in comfy, but I'm not sure if I should run it in the same env or not.

#

Also is there any issue with running multiple conda environments? Would resources get stuck or slow down etc.

wicked fulcrum
reef ivy
#

Okay thanks. I was thinking that is what ai playground was doing. But wasn't sure. Also seems ipexllm wants 3.11 while ipex wants 3.10

#

Gonna try and just install ollama in its own environment and see how it goes with comfy. It'll probably be easier when ai playground integrates comfy

wicked fulcrum
#

I believe AI Playground is on 3.10.11 @honest hull can check me on that

honest hull
#

we are on 3.11

lucid lily
#

Hello everyone, I recently purchased an A770 graphics card, and I want to run comfyui on it. I followed the pinned instructions to install comfyui and run it, but many newer features are not available. I found that Intel® Extension for PyTorch* v2.3.110+xpu was released a few months ago, and the requirements.txt in the pinned tutorial is somewhat outdated. Is there an updated version of the requirements?

earnest grotto
#

@lucid lily Use this script ^

reef ivy
lucid lily
#

@earnest grotto Thank you. I used the script you provided for the installation, and everything went relatively smoothly. However, after completing the installation, when I launch ComfyUI using the shortcut, I receive the following error:
Error loading "J:\Comfy_Intel\cenv\lib\site-packages\torch\lib\torch_xpu_ops_aten.dll" or one of its dependencies.
I checked and found that this file is not problematic, but I'm not sure why it failed to load successfully.

lucid lily
#

@earnest grotto Um... I tried reinstalling from scratch, and this time it started without any errors. It's working now, very strange.😅

earnest grotto
#

hmm

#

bet something didn't install properly

rough crystal
#

I installed AI Playground through the official website. Everything works as expected

reef ivy
civic charm
#

Triton itself doesn't work on Windows

reef ivy
#

Seems they got it working for windows in that post? But it needs cuda stuff

reef ivy
#

Anybody on a770 tried the hunyuan model?

somber trellis
#

Me. I can't get it to work effectively at all, and at lower resolutions gives me tensor errors 🤷

#

LTXVideo works with PAG.

earnest grotto
#

They don't want people in europe using their models so I'm not trying out another okish image generator 😔

somber trellis
#

it's an actually competitive one

#

and it can run on a 3090

earnest grotto
#

that's ok-ish to me

somber trellis
#

16gb

#

of vram

#

we cant tho

#

for some reason

#

funny ah 32-bit architecture

earnest grotto
#

i think the images the current best models like flux generate, are ok-ish

#

nevermind that each has its own drawbacks

somber trellis
#

do you not tinker with realism or guidance models

#

there are ways to pull out details

#

like uh...

earnest grotto
somber trellis
#

well fluxtapoz has PAG for flux

#

and SEG (Smoothed Energy Guidance)

#

they can help improve prompt adherence, combined with stuff like perpnegguider

#

which is designed to help further follow prompting

earnest grotto
somber trellis
#

They are just nodes that make it easier to manipulate. More UI friendly.

#

After all, ComfyUI is down to the bones a visual scripting language of sorts.

earnest grotto
#

I'm kinda getting tired of bad fingers at this point honestly

somber trellis
#

Yes, it's slower.

#

I kinda wish we just had better models.

#

LTXVideo 0.9 is great for how small it is 🤷‍♂️

earnest grotto
#

furthermore, in stacking a lot of the almost-placebo improvements like PAG, SAG, perpneg, AYS and so on, I've sometimes had cases where the model starts failing to denoise

somber trellis
#

I've not gotten blackscreened images from that, but for some reason I do from the lora nodes when used with flux.

#

Including LoraLoaderModelOnly.

earnest grotto
#

Not black. Just images with some leftover noise in them

somber trellis
#

Oh.

earnest grotto
#

in spots

somber trellis
#

🤷‍♂️

#

I don't seem to have that issue

#

This was one I did yesterday with all of the nodes

#

this is pixelwave flux btw

earnest grotto
#

that was with SDXL, maybe it's less likely with flux but I'm hesitant to use flux much due to how slow it is and how biased it is towards real life photos with strong depth of field

somber trellis
#

you can definitely get past that problem with flux

earnest grotto
somber trellis
#

Well you don't.

#

Any model above SDXL imo really isn't a requirement anymore

earnest grotto
#

but, i don't want abstract shapes or eldritch non-people

somber trellis
#

🤷‍♂️

#

I wonder what you look for.

#

Do you want full artistry?

earnest grotto
#

it was one of the big things SAI were touting with the original SD3

#

I like the SDXL finetunes. But then, there's no SD3.5 finetunes, and I don't think flux finetuning has progressed much

somber trellis
#

nvidia sana looks nice for how fast it is

#

if only it worked properly on arc

#

because it kept giving me terrible outputs with the wip node

earnest grotto
#

It is not supposed to work properly on anything other than nvidia because nvidia explicitly want to forbid that

#

in their license

somber trellis
#

Well someone is doing it

#

So I gotta hope they get it fully operational

#

thats the node i tried

reef ivy
earnest grotto
# somber trellis Do you want full artistry?

Ideally, I'd want a good 3d model generator and then I can do as I please with the models, but that's still far off
For image generators, I want

  • Coherent, symmetric, 5-fingered 5-toed humans (none do this)
    And by extension, other coherent things too. Guns? Tanks? Most of my attempts with controlnet, like putting in the effort to pose an openpose skeleton, just made me facepalm, finetune couldn't do crossed legs properly and even with controlnet just fused them together
  • action (can any do 2 people fighting? I think no)
  • Styles and some reference sheet for them
    most anime finetunes do this indirectly. PonyXL author wanted to be malicious about this. Base Flux struggles. SD3.5 looks to do styles, but I don't want to be finding out what it can or can't do
    In general, I'd like to see the actual captions used for 100-1000 random varied images in these models' datasets. Surely that's not too much to ask for?
  • Zero snr or something close enough (this means the model can generate very dark or bright images)
  • A fast inpainting model that's better than SD1.5
    among other things
#

SD1.5 still feels like the best for inpainting

#

I haven't tried the new Flux controlnets

#

So, it might tick that box of not failing to follow them

earnest grotto
#

I tried it

#

As a reference, the 2nd image is from a playstation 2 game.

somber trellis
#

ok instant ps2-level models

#

g

#

g

reef ivy
#

I'd say you'll get what you want within the next year or so. Although if it stays open by then who knows

somber trellis
#

3d modelling will obsolete itself if we can use an inpainter to Draw a base model image t-pose

earnest grotto
somber trellis
#

A lot of these will revolutionize in 2025.

earnest grotto
somber trellis
#

I feel it.

#

Let's just hope uh

#

🤷‍♂️

#

ye

reef ivy
#

Kijai updated cog video nodes and now almost nothing works right anymore. Been messing with ltx and can get decent results sometimes.

earnest grotto
#

I'm not gonna dunk on it too much since the hf space errored out when trying to do textures over 1024^2 or simplify less than 0.95, but...

somber trellis
#

LTX 0.9 btw, theyre gonna release the 1.0

#

which hopefully fixes the long-prompting issue

#

which they know

reef ivy
#

Yeah, hopefully they train human movement into that one. Also the need to add noise to make the image worse so it animates

somber trellis
#

but ltxvideo is already OP for video-to-video

reef ivy
#

I haven't tried video to video yet

somber trellis
#

darn

earnest grotto
somber trellis
#

thats videotovideo via ltxtricks

earnest grotto
#

why hand not green >:(

somber trellis
#

i dunno but its pretty close

reef ivy
#

probably the denoise value

somber trellis
#

the other two nodes in ltxtricks

#

are cool

#

one of them essentially does what an application you might know in the past did

reef ivy
#

image+video to video looks really good from the example. Although it was pretty much a perfect one to one drawing

somber trellis
#

a youtuber got famous for it

#

lol i cant remember that software

#

Ah, I remember.

#

Ebsynth.

#

from 2019 lol

earnest grotto
# somber trellis https://www.youtube.com/watch?v=0RLtHuu5jV4

ah, i don't know that
or is that the thing that joel haver uses https://www.youtube.com/watch?v=c6MW-qdNoYA

somber trellis
#

it is the thing joel haver uses

#

its rotoscoping + ebsynth

#

he says so in the comments of the vid i posted

reef ivy
#

That was an amazing video tbh, I only know of ebsynth from a1111 extension, never tried it though. I think animatediff replaced it

reef ivy
somber trellis
#

aaron

#

use the pag node from ltxtricks

#

it slows it down a bit but greatly helps consistency

#

Todays world sucks

reef ivy
#

The stg nodes? Yeah, I use them. I was able to get them to work with img2video as well. They help consistency but also slow down movement (at least with img2video). More steps also slow down movement, not sure if that is only with the stg nodes though.

#

People use the Detail Daemon nodes as well, but for img2video I could not get them to work properly.

somber trellis
#

ltxvideo can give some pretty interesting and atmospheric outputs tho

reef ivy
#

I need to try some more txt2video stuff. Need to setup my llm to prompt for it though.

#

Running florence2 to ollama to ltx feels really cool tbh. The vram issue pops up sometimes with ipex 2.3 though

somber trellis
#

with underscores removed

#

fed that into api llm

#

thats what I use to img2img usually, that and combine it with flux depth/redux

reef ivy
#

I need to check that out.

#

On another note cogvideox fun models just don't seem to work anymore with the new workflow for img2img. Either just black output or misty noise that vaguely looks like the image. Gguf still doesn't work for cogvideo either

somber trellis
reef ivy
torpid moat
#

has anyone tried to install sd3.5 large and succeeded, if so i need help

formal tusk
#

Hi Bob and Community. First: thanks to Intel for all the effort they put into establishing themselves as an additional GPU developer and vendor; I also own an Arc A770. So my question directly to Bob:
Why does Intel spend so little effort on getting all those nice tools running smoothly on Linux as well, since all those big models are trained on it on the big five's side?
It takes a huge amount of effort to get this GPU running ComfyUI, since there are far fewer instructions and hints for setting it up on Debian than there are, e.g., for Windows.
And, please, no offense, but the Python version in those Intel tools is still 3.10, which is somewhat outdated relative to the latest Debian releases.
So you always have to look for older pip wheels and the like to download just to get it anywhere near working.
Will there be better instructions for getting this running?
Regards.

reef ivy
reef ivy
earnest grotto
# formal tusk Hi Bob and Community. 1st: Thanks to INTEL to all the effort they take to establ...

Many things do not work with the latest python version, including some webuis themselves, this is normal
It's assumed that if you're on linux, you can figure things out yourself, otherwise you wouldn't be using linux
You can get older python from the deadsnakes ppa, or use conda and make a conda environment with an older python
https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=gpu&version=v2.3.110%2Bxpu&os=linux%2Fwsl2&package=conda
You'll likely want 2.3 for the best performance ^
Note that the pip instructions here are broken, at the very least there is no conda prefix with pip
OR, you can get 2.6 for more compatibility and fewer bugs https://pytorch.org/docs/stable/notes/get_start_xpu.html (nightly)

civic charm
#

Everything works fine with Python 3.11, it is just that the UIs are very stubborn in staying with Python 3.10

#

The only issue is Python 3.12

#

Not many things support 3.12 yet
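
The version claims in this exchange can be written down as a quick check to run before installing anything. This is only a sketch of what is said here (3.10 and 3.11 work, 3.12 is the problem case); treating anything below 3.10 as "too old" is an added assumption, and `python_status` is a hypothetical helper.

```python
import sys


def python_status(version_info=sys.version_info):
    """Classify a Python version against the claims made in this thread."""
    major_minor = (version_info[0], version_info[1])
    if major_minor in ((3, 10), (3, 11)):
        return "ok"
    if major_minor >= (3, 12):
        return "too new"
    # Assumption: nothing in the thread covers versions older than 3.10.
    return "too old"


print(f"Python {sys.version_info[0]}.{sys.version_info[1]}: {python_status()}")
```

If the result is "too new", the deadsnakes PPA or a conda environment with an older Python, both mentioned above, are the suggested ways out.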

honest hull
#

many projects still use numpy < 2.0 too.. above pip installs would install latest numpy and it might also break projects
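
A small sketch for spotting the numpy problem after an install, using only the standard library. The helper names are made up for illustration; the check simply reads the installed version and flags 2.x. If it does flag 2.x, `pip install "numpy<2"` is the usual way to pin it back.

```python
from importlib import metadata


def major_version(version):
    """Return the leading major number of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def numpy_note(version):
    """Describe whether an installed numpy version is likely to break older projects."""
    if version is None:
        return "numpy is not installed"
    if major_version(version) >= 2:
        return f"numpy {version} is 2.x; projects that need numpy < 2.0 may break"
    return f"numpy {version} is below 2.0"


def installed_numpy_version():
    """Look up the installed numpy version without importing numpy itself."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


print(numpy_note(installed_numpy_version()))
```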

earnest grotto
sly trench
quartz kelp
#

Now that the B580 is released I have a question about it, does it also have a 4GB memory block allocation limit?

reef ivy
#

I don't think so, but someone will have to test. Xe2 shouldn't need it

civic charm
#

IPEX 2.5 is listed on the intel repo

#

Seems like the US mirror is giving access denied error on every ipex version

#

CN mirror works but very slow

reef ivy
#

nice, will try it out and see if it works. Maybe coincide with b580 launch

civic charm
reef ivy
#

Haven't seen DSL in a long time lol

civic charm
#

Now it is giving me the expected speeds to China

#

Probably was a caching issue

civic charm
#

IPEX 2.5 is a little bit slower but completely acceptable for the accuracy improvements

reef ivy
#

How much slower? on windows 2.3 is already slower than 2.1.4 for me, so it might be faster lol

civic charm
#

I am using a custom model arch

#

Went from 2.6 s/it to 2.8 s/it

#

But now i am able to run CLIP on the GPU without corruptions

reef ivy
#

Yeah, that's not bad at all. I am downloading now. I guess there is no need for a oneapi update?

#

Nice, so stable 3.5 should work now

civic charm
#

Linux installs those from pip after ipex 2.3

#

idk what happens on the windows side

reef ivy
#

I couldn't get it to pip install, so I made a requirements file. Might not work then. On Windows you typically have to install oneAPI, afaik.

#

so far seems way slower in windows, but I do have a new driver update waiting which could improve things. Also issues with memory in latest drivers so could be faster on older one lol.

#

flux went from about 5 or 6 s/it at 1024 to 12.9 s/it, and ltx from 2 s/it to 82 s/it lol.

#

gonna update the drivers and try again, then if that is worse still might try older driver, then go back to 2.3 or 2.1 if it's still terrible

earnest grotto
#

requirements files are just stuff to install with pip, laid out in a file so you don't have to type them out
source control is more convenient, whatever else

reef ivy
#

I tried to copy the pip install from the other one and input the new links, but it wouldn't work for me. It might have needed the entire url for each file, like I did in the requirements file.

reef ivy
#

ipex 2.5 is unusably slow in windows, at least with comfy ui

civic charm
#

are you using ipex 2.5 or pytorch 2.5?

reef ivy
#

ipex

#

My guess is it's a compounding issue from 2.3 and current drivers with a750 memory allocation.

#

2.3 is slower than 2.1.4 but it can be mitigated with reserve-vram 6.0, nothing seems to help with ipex 2.5

#

can't view vram usage anymore with the new arc control thing, so maybe it's running on cpu or something? seemed too fast for that with flux though.

honest hull
reef ivy
#

I am on windows😭

honest hull
#

works on windows 😉

reef ivy
#

Ohh didn't know that, thanks!!

honest hull
#

use xpu-smi.exe dump to view the metrics available.. most of them are available but some might report N/A as the tool is developed for Data Center GPUs

#

xpu-smi.exe dump -m 0,18 -d 0 should show you the GPU utilization as well as memory used in MB unit
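
For calling this from a script, a minimal sketch that builds exactly the command above and checks whether the tool is on PATH. The helper name is hypothetical and nothing here parses the tool's output, since the output layout is not shown in this thread.

```python
import shutil


def xpu_smi_dump_command(device=0, metrics=(0, 18), executable="xpu-smi"):
    """Build the xpu-smi dump command for the given device and metric ids."""
    metric_list = ",".join(str(m) for m in metrics)
    return [executable, "dump", "-m", metric_list, "-d", str(device)]


cmd = xpu_smi_dump_command()
print(" ".join(cmd))

# shutil.which also resolves xpu-smi.exe on Windows through PATHEXT.
if shutil.which(cmd[0]) is None:
    print("xpu-smi was not found on PATH; pass its full path as executable")
```

The resulting list can be handed to `subprocess.run` while a generation is in progress to watch utilization and memory use.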

honest hull
#

when you pip install ipex it also installs the oneAPI dependencies. for ipex 2.5.110 it should be installing dpcpp-cpp-rt==2025.0.4 mkl-dpcpp==2025.0.4 etc

reef ivy
#

It worked it was just like 100% slower in everything for me in windows. Probably a750 related

#

2.1.4 is the most stable and fastest, but it is no longer compatible with a lot of new stuff

somber trellis
#

still on 2.3 myself

#

trying to get comfy_extramodels sana to work

#

I don't seem to be getting the greatest results.

#

Maybe there's a problem.

#

I've tried it at multiple CFGs, on euler simple

reef ivy
#

Haven't tried that, but it's geared specifically for nvidia altogether, so maybe that's why. Could also be the clip issue that 2.5 might fix

primal hatch
#

Someone please give me the bat file for installing comfy ui

earnest grotto
#

@primal hatch

silk umbra
#

the script doesn't work with the b580

silk umbra
#

in fact i can't get comfyui to work on my b580 at all

#

is it too new?

earnest grotto
#

it might indeed be too new

silk umbra
#

ahhh so wait?

earnest grotto
#

I'll edit the script to install the same thing for battlemage as it does for alchemist, but you'll have to find out if that works or not, and I lean towards no

silk umbra
#

yeah

earnest grotto
#

Also I guess I'll have to check if it can be downloaded from the us at all anymore or they've just moved to china since my script downloads from the US

#

I'll be uh, waiting on a windows update in the meantime though

#

So don't expect that in 10 minutes

silk umbra
#

okie

civic charm
#

I hope US will be back up soon

#

CN connections are slow even if it works properly

silk umbra
#

@earnest grotto wait the ps1 file works

earnest grotto
#

That's from when I decided to be very generous with detecting the GPU

#

That will install alchemist stuff

silk umbra
#

ohhh.....

silk umbra
earnest grotto
#

I don't know

#

Discord crashed due to windows updating

#

???

#

I assume that battlemage support will need to be added to pytorch, and IPEX, and I assume that it will not be retroactively added for 2.3 which is not as laggy as 2.5 or 2.6

silk umbra
#

yeah the ps1 script didn't bring me much further

earnest grotto
#

I don't think this is a battlemage issue

#

I'll check it out

silk umbra
#

okie

earnest grotto
#

@silk umbra @sly trench Updated

silk umbra
#

nice

earnest grotto
# silk umbra nice

You can just download the new version and replace the old one with it. Run it again, it will not have the same error again but I'd still expect it to not work

silk umbra
#

ye did it

earnest grotto
#

If it doesn't work, you wanna follow some of my directions instead after that?

silk umbra
#

@earnest grotto sure

#

WAIT

earnest grotto
#

It's just loading

silk umbra
#

IT WORKS

earnest grotto
silk umbra
#

no not yet

#

i usually use comfy with swarmui

earnest grotto
#

well then you don't know if it works

#

so generate one

silk umbra
#

ok downloading flux rn

somber trellis
#

Where do you get the whl for ipex 2.5.10?

#

Actually, should I even bother going from pytorch 2.3

tiny bolt
#

not officially released

#

find it amusing how these people discover it so quick

earnest grotto
#

well, since the us one is down, it's probably worth taking a look into the cn one to see if there's anything new

earnest grotto
tiny bolt
#

it works for me tho

#

ah okay i see what you mean

earnest grotto
#

cn might've sped up since disty posted though

#

it was slow but it wasn't too slow

somber trellis
#

If it takes me half a minute to download it

#

i think its ok

somber trellis
#

but it doesnt exactly work

#

🤷‍♂️

#

lol

silk umbra
earnest grotto
silk umbra
# earnest grotto What does the command prompt say

Failed to validate prompt for output 9:

  • CheckpointLoaderSimple 4:
    • Value not in list: ckpt_name: 'Flux\flux1-schnell-fp8.safetensors' not in []
      Output will be ignored
      invalid prompt: {'type': 'prompt_outputs_failed_validation', 'message': 'Prompt outputs failed validation', 'details': '', 'extra_info': {}}
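
The `not in []` part of that message suggests the loader saw an empty list of checkpoints. A quick way to see what is actually on disk is sketched below, assuming the usual `ComfyUI/models/checkpoints` folder and the `.safetensors` / `.ckpt` extensions; the helper name is made up, and it prints forward slashes where ComfyUI on Windows shows backslashes.

```python
from pathlib import Path

CHECKPOINT_SUFFIXES = {".safetensors", ".ckpt"}


def list_checkpoints(checkpoint_dir):
    """Return checkpoint files under checkpoint_dir as sorted relative paths."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in CHECKPOINT_SUFFIXES
    )


# Assumed default location, relative to where this is run.
found = list_checkpoints("ComfyUI/models/checkpoints")
print(found if found else "no checkpoints found; the loader's list would be empty")
```

An empty result here would match the empty list in the error, which points at the model location rather than at the GPU.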
earnest grotto
#

Show your workflow

#

And please post screenshots, not copy text

silk umbra
#

im using swarmui

earnest grotto
#

Do it in comfyui then

silk umbra
#

do you have a workflow i can try

#

idk comfy very well

earnest grotto
#

Bottom of this post

#

Download the image. Drag-and-drop it onto comfyui in the browser

silk umbra
#

oh ok

earnest grotto
#

Did you download flux with my script or one of the various all-in-one variants

earnest grotto
#

Use the workflows from that then

silk umbra
#

@earnest grotto

earnest grotto
#

Yeah that's what was expected

#

Your swarmui issue is unrelated

#

I'd bet in a week's time support will be here

silk umbra
#

okie

reef ivy
#

Feel like speed and stability has regressed from each version after 2.1.4 tbh

#

could also be a skill issue on my part lol

somber trellis
#

I wanted to use it in conjunction with ipex 2.5.10

#

but idek if that works

#

also uh

#

phi-4 released

reef ivy
#

Yeah, I did it with ipex 2.5. I just found the links for each and copied them to a requirements.txt file. Ipex likely still needs the special version of each

somber trellis
reef ivy
#

I wonder how true those benchmark results are tbh.

#

from user feedback seems on par with qwen2.5 14b

somber trellis
#

when you realize you just had to restart your pc when properly re-installing ipex-llm ollama

#

now it workin like a charm with phi-4

reef ivy
#

Will they release a 7b model?

somber trellis
#

🤷‍♂️ No clue yet. This model's brand new.

reef ivy
#

There is a new video2audio model, but apparently only works on 2.5.1 pytorch

somber trellis
#

If there was any real reason to swap to it, that is...

reef ivy
somber trellis
#

Yep, far slower indeed.

#

Ipex didn't seem to help much either, I guess everything's still in its testing phase

#

imma keep with 2.3

#

nothing seems better atm

wicked fulcrum
#

Sneak peek coming to AI Playground 2.0 with ComfyUI workflows and support for Flux.1

civic charm
#

Use the torch from ipex

reef ivy
#

Intel should make some comfy nodes for llms, then I won't have to run it through ollama if using it in a workflow with ai playground

somber trellis
#

ok well 2.5.1 works on the a770 le no problem it's just half speed like aaron said

#

rip

earnest grotto
#

pytorch 2.6 is faster while still having fixes that came with 2.5

#

if that is what you're looking for

#

otherwise can just stick with 2.3 for now

reef ivy
reef ivy
#

I may start up another wsl2 environment and see if I can get more speed in windows.

honest hull
#

just pip install torch torchvision with the extra url that points to pytorch.org

#

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu
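
After that install, a quick sanity check is to ask torch whether it can see an XPU at all before loading any model. A minimal sketch, assuming a torch build that exposes `torch.xpu`; `describe_xpu` is a hypothetical helper and `fake_torch` is a stand-in used only to show the output format.

```python
from types import SimpleNamespace


def describe_xpu(torch_module):
    """Summarize XPU visibility for a torch-like module."""
    xpu = getattr(torch_module, "xpu", None)
    if xpu is None:
        return "this torch build exposes no torch.xpu module"
    if not xpu.is_available():
        return "torch.xpu is present, but no XPU device is available"
    return f"torch {torch_module.__version__}: {xpu.device_count()} XPU device(s) available"


# Stand-in module (not a real install) to show the output format.
fake_torch = SimpleNamespace(
    __version__="0.0-stub",
    xpu=SimpleNamespace(is_available=lambda: True, device_count=lambda: 1),
)
print(describe_xpu(fake_torch))

# Best-effort check of the real install.
try:
    import torch

    print(describe_xpu(torch))
except (ImportError, OSError):
    print("torch is not importable in this environment")
```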

reef ivy
#

I will give it one more try with this install command, thanks.

honest hull
#

would recommend testing it with a simple standalone script first and a fresh python env..

for example

  1. create env and pip install torch torchvision using above link
  2. pip install diffusers transformers accelerate
  3. run below code snippet
```python
from diffusers import AutoPipelineForText2Image, DEISMultistepScheduler
import torch

pipe = AutoPipelineForText2Image.from_pretrained('lykon/dreamshaper-8', torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DEISMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("xpu")

prompt = "portrait photo of muscular bearded guy in a worn mech suit, light bokeh, intricate, steel metal, elegant, sharp focus, soft lighting, vibrant colors"

generator = torch.manual_seed(33)

image = pipe(prompt, generator=generator, num_inference_steps=25).images[0]
image.save("./image.png")
```
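
Before running the full diffusers script, it may help to confirm that the fresh env actually sees the GPU. A minimal sketch, assuming a torch build that exposes `torch.xpu`; the helper name `describe_xpu` is made up for illustration and is not part of torch or ComfyUI:

```python
import importlib.util


def describe_xpu():
    """Report whether this Python env has a torch build that can see an XPU device."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "xpu_available": False,
        "device_count": 0,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # torch is not installed in this environment
    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    xpu = getattr(torch, "xpu", None)  # may be missing on builds without XPU support
    if xpu is not None and xpu.is_available():
        info["xpu_available"] = True
        info["device_count"] = xpu.device_count()
    return info


if __name__ == "__main__":
    print(describe_xpu())
```

If `xpu_available` comes back `False` here, the problem is likely in the torch install or driver stack rather than in ComfyUI.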
honest hull
#

also add this to ComfyUI/comfy/model_management.py if you are using pytorch 2.6 nightly. it improves performance by about 1.5x .. The perf issue with upstream Pytorch seemed to be not yet fixed in 2.6 nightly either (happens on other vendors too)

reef ivy
#

do the ipex hijacks still work in windows or am i doing something wrong? With latent upscale I get this error instead of an oom: Current platform can NOT allocate memory block with size larger than 4GB! Tried to allocate 4.25 GiB (GPU 0; 7.75 GiB total capacity; 4.19 GiB already allocated; 4.35 GiB reserved in total by PyTorch)

#

this is what I have now with the florence fix as well

```python
try:
    import transformers  # ipex hijacks transformers and makes it unable to load a model
    backup_get_class_from_dynamic_module = transformers.dynamic_module_utils.get_class_from_dynamic_module
    import intel_extension_for_pytorch as ipex
    ipex.llm.utils._get_class_from_dynamic_module = backup_get_class_from_dynamic_module
    transformers.dynamic_module_utils.get_class_from_dynamic_module = backup_get_class_from_dynamic_module
    from ipex_to_cuda import ipex_init
    ipex_init()
    xpu_available = True
except Exception:
    pass
```

civic charm
lucid lily
#

Some ComfyUI nodes need fp64, but my arc a770 doesn't have it. Any solutions?

earnest grotto
#

@lucid lily ^

#

Or you edit model_management.py as shown right above you

#

and you clone disty's hijacks into the comfy folder

lucid lily
#

add here?

earnest grotto
#

You find the try/except block that does import intel_extension_for_pytorch as ipex and you replace it with what you see above
I think the except might've had something other than pass in it, but not sure

lucid lily
#

OK, I got it

#

still reports the same error😅

earnest grotto
#

@lucid lily Please link the exact custom node and copypaste the full stack trace, and link any models if they need to be separately downloaded

somber trellis
earnest grotto
#

probably after ipex_init()?

somber trellis
#

well considering youre not using IPEX in 2.6.0

earnest grotto
#

I want to know too, seems odd if that's for a different backend

somber trellis
#

wouldnt you just import ipex_init() and thats it

#

then that command ig

#
```python
try:
    from ipex_to_cuda import ipex_init
    ipex_init()
    torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)
    _ = torch.xpu.device_count()
    xpu_available = torch.xpu.is_available()
except Exception:
    pass
```
lucid lily
#

@earnest grotto any instructions?🥲

earnest grotto
#

Do this

lucid lily
somber trellis
#

well im on 2.6.0

#

i wanted to see if 2.6.0 would fix my issue with sana in comfyui outputting improper outputs

#

Same workflow on arc

#

on both 2.3.0 and 2.6.0

#

also torch 2.6 with that command makes the performance equal to 2.3

lucid lily
#

@somber trellis can you run any custom_node like semantic segmentation which uses fp64 when using cuda with ARC?

earnest grotto
#

@somber trellis you have some spare nvidia gpu? you wanna run some stuff for me with nvdiffrast, after some time?

earnest grotto
#

Support for fp64 will come. It's already in an experimental state in 2.5/2.6
If you want things fixed now, do what I said

#

And I'll look into the custom node and I can put patching it into my installer script or tell you what to edit or whatever else

silk umbra
#

wait did you guys get stable diffusion working on arc b580

lucid lily
earnest grotto
#

Link the custom node

#

Link any models if they need to be separately downloaded

lucid lily
#

Did I misunderstand?😅

#

I found that any node which uses the Semantic Segmentation preprocessor in comfyui reports an fp64 error when using Arc

somber trellis
#

i have a 1060 but its not installed lmao

#

Are there any other commands I should add to model_management other than torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)

lucid lily
#

@earnest grotto I replaced every float64/double datatype in these nodes with float/float32, but still got the same error report😅
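
When a manual float64-to-float32 replacement still leaves the same error, one possible reason is that some fp64 uses were missed, or that fp64 arrives implicitly (NumPy creates float64 arrays by default, so tensors built from them inherit that dtype). A rough sketch for listing the explicit mentions in a custom node folder; the helper name `find_fp64_uses` and the regex are illustrative assumptions and will not catch implicit cases:

```python
import re
from pathlib import Path

# Explicit spellings of double precision commonly seen in torch/numpy code.
FP64_PATTERN = re.compile(r"float64|\.double\(\)|torch\.double|DoubleTensor")


def find_fp64_uses(root):
    """Return (file, line number, line text) for every explicit fp64 mention under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        for lineno, line in enumerate(text.splitlines(), start=1):
            if FP64_PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical path; point it at the custom node you are debugging.
    for file, lineno, line in find_fp64_uses("custom_nodes"):
        print(f"{file}:{lineno}: {line}")
```

Dependencies that the node imports from site-packages are not covered unless you point the scan at them too.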

somber trellis
#

instantir works on torch 2.6+xpu at 4s/it

somber trellis
#

Ipex-to-cuda fixes that issue for you in most cases but certain custom nodes you might need to modify it manually

earnest grotto
#

the hijacks don't cover every scenario

reef ivy
#

yeah, hijacks updated same error. Might just be something with the latent upscale nodes that don't get hijacked? Gonna try 2.6 later on fingers crossed it works this time

civic charm
#

resolution?

#

You can't split it below 4GB after a point
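
To illustrate why slicing stops helping past a point: splitting only works while a single slice still fits under the per-allocation limit. A toy sketch, assuming the work can be split along one axis into equally sized items; `plan_slices` and its parameters are hypothetical and this is not how the hijacks are actually implemented:

```python
import math

GIB = 1024 ** 3


def plan_slices(batch, bytes_per_item, limit_bytes=4 * GIB):
    """Return how many chunks are needed so each stays within limit_bytes, or None if impossible."""
    if bytes_per_item > limit_bytes:
        return None  # a single item already exceeds the limit; slicing on this axis cannot help
    items_per_chunk = limit_bytes // bytes_per_item
    return math.ceil(batch / items_per_chunk)


if __name__ == "__main__":
    print(plan_slices(8, 1 * GIB))  # eight 1 GiB items
    print(plan_slices(2, 5 * GIB))  # each item is bigger than the limit
```

At higher resolutions the smallest indivisible piece grows, which is when the second case kicks in.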

reef ivy
#

It was going from 768*512 to 2x that, I would have expected an oom rather than the 4gb message. It's an ltx vid though. I have been trying to find an upscale method that doesn't take forever

earnest grotto
#

You do oom

#

4.25+4.19>7.75
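
The arithmetic above can be spelled out using the numbers from the error message. A small sketch; the function name and the 4 GiB block limit constant are taken from or assumed from that message, not from any PyTorch API:

```python
BLOCK_LIMIT_GIB = 4.0  # single-allocation limit quoted in the error message


def diagnose_allocation(request_gib, allocated_gib, total_gib, block_limit_gib=BLOCK_LIMIT_GIB):
    """Check a requested allocation against the per-block limit and the total capacity."""
    needed = request_gib + allocated_gib
    return {
        "exceeds_block_limit": request_gib > block_limit_gib,
        "exceeds_total_capacity": needed > total_gib,
        "shortfall_gib": round(max(0.0, needed - total_gib), 2),
    }


# Numbers from the error: tried 4.25 GiB, 4.19 GiB already allocated, 7.75 GiB total.
result = diagnose_allocation(4.25, 4.19, 7.75)
print(result)
```

Both checks fail for this request, so the 4GB message and an out-of-memory condition are not mutually exclusive here.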

reef ivy
#

But oom usually has another message, not the 4gb one. When I got that before I didn't have the hijacks working. It's strange

somber trellis
#

not sure if i wanna use 2.3.0 or 2.6.0

#

or ipex 2.5.1

#

2.6.0/2.5.1 is not far from 2.3.0 with the cuda mem eff sdpa fix

reef ivy
#

is there a way to get it to run with comfy? Getting not compiled with cuda errors

#

I got it to work in a fresh env by itself with the test script, but comfy won't recognize xpu or something

somber trellis
#

with 2.6.0?

reef ivy
#

Yeah

somber trellis
#

In model_management.py you need to remove all ipex-related code for xpu

reef ivy
#

okay thanks

somber trellis
#
```python
try:
    from ipex_to_cuda import ipex_init
    ipex_init()
    torch.backends.cuda.allow_fp16_bf16_reduction_math_sdp(True)
    _ = torch.xpu.device_count()
    xpu_available = torch.xpu.is_available()
except Exception:
    pass
```
#

for me i just use this

#

that third line helps with performance
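
If you want to try that performance line outside of model_management.py, or on a torch version that may not have the switch, a guarded variant could look like this. A sketch only: the wrapper name `enable_reduced_precision_sdp` is made up, and whether the setting helps on a given setup is something to measure rather than assume:

```python
def enable_reduced_precision_sdp():
    """Try to allow fp16/bf16 reductions in the math SDP backend; return True if it was set."""
    try:
        import torch
    except ImportError:
        return False  # torch is not installed in this environment
    setter = getattr(torch.backends.cuda, "allow_fp16_bf16_reduction_math_sdp", None)
    if setter is None:
        return False  # this torch build does not expose the switch
    setter(True)
    return True


if __name__ == "__main__":
    print("reduced-precision SDP enabled:", enable_reduced_precision_sdp())
```

Checking with `getattr` avoids hiding unrelated failures behind a bare `except Exception: pass`.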

reef ivy
#

also, did you get a lot of dependency errors?

somber trellis
#

I got some 2025 oneapi dependency errors that I fixed by installing either their 2025.0.2 or 2025.0.1 counterparts

#

otherwise I have torch, torchaudio and torchvision for torch 2.6.0
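
When comparing setups to chase dependency errors, it can help to print exactly which versions are installed in the env. A minimal sketch using only the standard library; the package list is an assumption and can be extended with whichever oneAPI runtime packages pip complains about:

```python
from importlib import metadata

# Assumed list of relevant distributions; adjust to your environment.
PACKAGES = ("torch", "torchvision", "torchaudio", "intel-extension-for-pytorch")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```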

reef ivy
#

this is what I am getting.