#ComfyUI for Intel Arc using IPEX


sweet rapids
#

Cannot import C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Subpack module for custom nodes: DLL load failed while importing cv2: The specified module could not be found.

Loading: ComfyUI-Manager (V3.31.12)

[ComfyUI-Manager] network_mode: public

ComfyUI Version: v0.3.30-16-g83d04717 | Released on '2025-04-28'

Import times for custom nodes:
0.0 seconds: C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\websocket_image_save.py
0.0 seconds (IMPORT FAILED): C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Subpack
0.0 seconds (IMPORT FAILED): C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\ComfyUI-Impact-Pack
0.5 seconds: C:\Users\COMFY\anaconda3\Comfy_Intel\ComfyUI\custom_nodes\comfyui-manager

I installed the requirements for both nodes that are not loading and reinstalled opencv-python. I also downgraded numpy to 1.26, as I couldn't reinstall opencv. Still the same error.

reef ivy
#

try reinstalling from comfy manager?

sweet rapids
#

It's stuck on reinstalling, so that's not working either.

#

It did work on my Tesla P4 on the second PC. On the new one with the B580 it does not. I even copied the ComfyUI folder over at the beginning and patched the files for Intel, but no luck.

reef ivy
#

try uninstalling the node then installing from scratch, maybe delete the node folder if it doesn't auto delete.

earnest grotto
#
Q: Import fails on Windows: ImportError: DLL load failed: The specified module could not be found.?
A: If the import fails on Windows, make sure you have Visual C++ redistributable 2015 installed. If you are using older Windows version than Windows 10 and latest system updates are not installed, Universal C Runtime might be also required.
Windows N and KN editions do not include Media Feature Pack which is required by OpenCV. If you are using Windows N or KN edition, please install also Windows Media Feature Pack.
somber trellis
#

Yep, none of these dia nodes seem to work.

#

🤷‍♂️

#

RIP intel i guess

#

for now

earnest grotto
#

i will check it out now, after i finish my wasabi tuna pate pesto and slop sausage sandwiches

sweet rapids
#

I bet that's the one: "Windows N and KN editions do not include Media Feature Pack which is required by OpenCV. If you are using Windows N or KN edition, please install also Windows Media Feature Pack."

earnest grotto
#
somber trellis
earnest grotto
#

the wasabi was so strong the power went out

somber trellis
#

Dia's voice cloning seems worse than Spark's

#

At least based on the videos ive seen

#

base tts is funny

earnest grotto
#

alright, whoever made the custom dia nodes has made a few mistakes

#

100% nothing to do with intel

somber trellis
#

lmao

earnest grotto
#

i mean, i'm still having intel issues

#

running out of vram still causes pretty big problems (linux)

#

and with the compute runtime headaches i've had recently, xpu-smi isn't working so i don't have a way to monitor vram on linux, not that i'd want to stare at the vram number constantly ready to press ctrl+c

sweet rapids
#

I was able to load impact pack thank you.

#

so it was Windows N's fault

earnest grotto
#

man, there's quite a lot broken with those dia nodes

reef ivy
earnest grotto
#

This one's a fair bit better, though it doesn't use comfy's device management and so ends up using the cpu.
Hopefully, after fixing this, there won't be more issues

#

nope...

earnest grotto
#

This was supposed to have the fitnessgram thing as prompt
I will try splicing some audio with 2 random speakers if that works 🤔

#

This is with FP32. Their BF16 implementation is broken, let's see if fixing it gets me a further 2x speedup

earnest grotto
#

it wasn't any faster

#

i don't know if it used less vram. i'd hope so.

#

time to splice audio

somber trellis
somber trellis
#

Still, literal fitnessgram jumpscare

earnest grotto
#

Fix for intel + bf16 support, but i dunno if there's any point in using bf16 or if i hacked it in badly

somber trellis
#

Those are stats for the base repo for the 4090

earnest grotto
#

It breaks for me on linux with 2.8, and I don't have 2.7 or others set up

#

I don't recall if 2.5 had compile working

#

well, let's test i guess

earnest grotto
#

Could also just be a nightly bug, but then i was on an old nightly and it's present on a fresh nightly too...

#

this night nightly

#

😔

#

If you do get it working on a newer torch, since this was on 2.5, you'd get a faster result than 0.07x anyways, as 2.5 is slow

somber trellis
#

RuntimeError: Unknown exception

earnest grotto
#

No stack trace or anything else?

somber trellis
#

Yep. Right here.

earnest grotto
#

Guess I'll try on windows tomorrow

#

Try other torches 🤷‍♂️

somber trellis
#

Alright, I'll try the non-nightly non-ipex one first

#

aka torch 2.7.0+xpu

#

Ipex Torch 2.6.10 works

#

on fp32

#

and not fp16 or bf16

earnest grotto
#

Ok, I think hacking even more things to BF16 might have given me a mild boost from ~0.07x to 0.08x

somber trellis
#

RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: struct c10::Half and value.dtype: struct c10::Half instead.

earnest grotto
#

But it's pushed now

somber trellis
#

k ill pull

earnest grotto
#

do a git pull in the custom node's folder and try again

earnest grotto
somber trellis
#

0.1

#

around 8 tokens

earnest grotto
#

that's a bit faster than 0.07x

#

2.3 also doesn't work, too old, missing something new added in torch but i forgot what

somber trellis
#

bf16 uses 3gb less vram

#

generate step 86: speed=7.314 tokens/s, realtime factor=0.085x

#

generate step 172: speed=8.023 tokens/s, realtime factor=0.093x
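
For reference, the two numbers in these log lines are tied together: dividing the speed by the realtime factor gives roughly 86 tokens per second of audio in both lines (7.314 / 0.085 ≈ 86.0, 8.023 / 0.093 ≈ 86.3). A minimal Python sketch of that arithmetic follows. The ~86 tokens-per-audio-second constant is inferred from these logs, not taken from Dia's documentation, and the function names are made up for illustration.

```python
# Tokens the model must generate per second of output audio.
# Assumption: inferred from the log lines above (speed / realtime factor),
# not taken from Dia's documentation.
TOKENS_PER_AUDIO_SECOND = 86.0


def realtime_factor(tokens_per_second: float) -> float:
    """Seconds of audio produced per second of wall-clock time."""
    return tokens_per_second / TOKENS_PER_AUDIO_SECOND


def generation_seconds(audio_seconds: float, tokens_per_second: float) -> float:
    """Estimated wall-clock time needed to generate a clip of the given length."""
    return audio_seconds / realtime_factor(tokens_per_second)


if __name__ == "__main__":
    for speed in (7.314, 8.023):
        print(f"{speed} tokens/s -> {realtime_factor(speed):.3f}x realtime")
    # A 10-second clip at 8 tokens/s:
    print(f"{generation_seconds(10, 8.0):.1f} s to generate 10 s of audio")
```

At about 8 tokens/s, ten seconds of audio would take a little under two minutes under this assumption.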

earnest grotto
#

😐

#

oh well

somber trellis
#

and my relative speed is almost identical to fp32

#

so i mean

#

less vram is less vram

#

Prompt executed in 86.33 seconds

earnest grotto
#

how much do you use with fp32 or with fp16

somber trellis
#

generate step 86: speed=8.839 tokens/s, realtime factor=0.103x

earnest grotto
#

if it didn't take so long to generate i'd probably have an easier time figuring out what to tweak since they sound a bit sped up, same in yours too

somber trellis
#

realtime factor fp32

#

usage is around 13gb

earnest grotto
somber trellis
#

i know lol

#

the difference is like 3-4gb of vram when jumping from bf16 to fp32

#

at the cost of like .008 or so realtime factor

#

RuntimeError: Native API failed. Native API returns: 20 (UR_RESULT_ERROR_DEVICE_LOST)

#

FP16 caused a device lost error

#

File "C:\Users\dbs_5\Comfy_Intel\ComfyUI\custom_nodes\comfyUI-customDia\dia\audio.py", line 78, in apply_audio_delay
bos_tensor = torch.tensor(bos_value, dtype=audio_BxTxC.dtype, device=device)

reef ivy
#

If you have the "use bf16" command arg set, maybe it is converted to bf16 both times?

earnest grotto
#

Those are for comfy; custom nodes have to be coded to actually respect those (or comfy's model management, and so on).
This node wasn't.
Others aren't either, and a good amount of others have their own data type selectors too

#

And that's also kinda what disty's hijacks are for, for when people don't use comfy's model management and just do .to('cuda' if torch.cuda.is_available() else 'cpu')
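
The hard-coded pattern quoted above never considers Intel's `xpu` device, which is why such nodes end up on the CPU unless something patches them. Here is a minimal sketch of a more tolerant fallback, assuming a PyTorch build that may or may not expose `torch.xpu` (hence the `hasattr` guard). `pick_device` is a hypothetical helper, not something from ComfyUI or the hijacks; inside ComfyUI itself, `comfy.model_management.get_torch_device()` is the usual way to ask for the device.

```python
def pick_device() -> str:
    """Return the name of the best available torch device: 'cuda', 'xpu' or 'cpu'."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed at all; nothing but the CPU to fall back to.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"
    # torch.xpu only exists on builds with Intel GPU support, so guard the lookup.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

A node would then call something like `model.to(pick_device())` instead of the two-way CUDA/CPU choice.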

somber trellis
#

Dia's cloning doesn't seem to be that great, even if the base tts is pretty good.

#

lol

#

at least not on my end

#

Do you have to specify [S1] in the audio transcript?

#

okay

#

Yes, you do actually need to specify speaker in the audio transcript

#

It INSTANTLY got better quality

#

The emotion is rather

#

nice

earnest grotto
#

yeah

somber trellis
#

Also, there's no way to run out of VRAM with the way it actually seems to generate tokens

earnest grotto
#

my first audio clip was with fitnessgram, without specifying speakers

#

the spliced audio was with speakers specified

somber trellis
#

I just put the entire "What the F did you say about me you lil" in

#

as speaker 1

#

and its doin it

earnest grotto
#

Sure, but at ~0.1x that's going to take like 20 minutes

somber trellis
#

8 tokens a second at realtime 0.1

#

yeah

#

thats half of the transcript

#

it did it in four minutes

earnest grotto
#

that's quite the speed reading. it would make a good news anchor

#

it's most likely caused by one of the parameters you can set

#

but i dunno what some better values with sane speaking speeds are

#

presumably lower top p and lower cfg but idk

somber trellis
#

what

#

why did it become a 2005 movie commercial

#

at the beginning

#

"Coming next on DVD!"

#

What's the default temperature for dia supposed to be?

#

top k is 30

#

top p is 0.95

#

temp is 1.3

#

checked the official space to get that answer

#

Also, this node is missing the "speed factor" that the official space has.

earnest grotto
somber trellis
#

🤷‍♂️

#

normal

#

slowed

earnest grotto
somber trellis
#

In terms of audio quality though

#

if it can nail the voice

#

it nails it well

#

LOL

#

I don't get why it's doing this.

#

I wonder if the issue is audio type.

#

Maybe it hates .ogg

#

It did the fitnessgram voice completely fine

earnest grotto
#

If you have silence at the start of the file, that can be it

#

i doubt the file types themselves affect much, they're all converted to raw data by comfy

earnest grotto
#

Wonder if this would mean even bigger speed increase for intel

#

Since surely it can at least bump up to ~0.3x

somber trellis
#

im getting somewhere

#

top_k 15

#

top_k 10

#

This model seems super-sensitive to data outside the cloned audio.

#

fast speech problem isn't just me, it seems.

#

yep

lament shale
#

Are you genning audio?

somber trellis
#

We are, but I'm still having issues getting it to work well. It can clearly clone voices but I think there are major issues with the nodes

earnest grotto
hard kraken
#

Feels like I spend 90% of my time debugging or finding workarounds for the Intel GPU, and only get to spend 10% of my time on AI

#

Ok to be fair, a smaller part of debugging time is not intel related

reef ivy
#

Mostly because everything is made for nvidia. 90% of stuff can run just fine, but node creators aren't taking intel into mind at all. Hijacks tend to handle most things though. I'd say Intel may be ahead of AMD in compatibility.

hard kraken
#

Really? I'd be happy to know in which apps. I keep seeing repos like vLLM that give direct support to CUDA/ROCm only.

reef ivy
#

ROCm itself is only available on select GPUs, and afaik the Windows implementation is still hit or miss even when it is available.

#

XPU is built straight into pytorch, so it's usually just cuda to xpu.

#

There is ZLUDA, but it's unofficial and also hit or miss, afaik anyway

hard kraken
#

You're right. Happier now

reef ivy
#

If the 24gb bmg gpu is real then Intel will have a big leg up on AMD, obviously though nothing can catch up to nvidia atm.

civic charm
#

Converted the ipex hijacks to be pip compatible.
You can install it directly with pip and import it in any Python code:

pip install git+https://github.com/Disty0/ipex_to_cuda

Using the repo via git cloning still works.

earnest grotto
#

Nice!

pale night
#

does anyone have a link on how to properly get comfyui working in ubuntu? i've tried too many of what's on the web and running into some issue with torch seeing my card ... came upon this channel as a last resort because someone on reddit mentioned Vik's script but that's for windows ...

earnest grotto
#

though i've tested it less on linux. and since you're on linux, shouldn't be an issue to fix up whatever issues do come up

#

run it and say what's wrong

hard kraken
#

Reactor faceswap is so unstable on my computer. Just tried a 1 min, 750-frame swap; it crashed and rebooted at around frame 740.

#

Worked with 370 frames

reef ivy
#

What would the code be to import ipex_to_cuda? Could it be imported into custom nodes that don't play nice with it when it runs through comfy itself?

#

I have issues with Kijai's nodes and I think it's because they bypass some stuff in comfy to do their own thing, but I'm not sure.

earnest grotto
civic charm
#

If you try to import ipex_to_cuda again after it is activated once, it will just skip it.

reef ivy
#

I really hate the auto delete, here is a sample of an issue with torch compile with his nodes I get

#

!!! Exception during processing !!! AttributeError: 'method_descriptor' object has no attribute 'module'

from user code:
File "H:\Stable diffusion\ComfyUI\comfy\ipex_to_cuda\hijacks.py", line 223, in Tensor_to
return original_Tensor_to(self, device, *args, **kwargs)

#

i also got issue with float64 but fixed it manually by just converting to 32

#

I saw that it called the hijacks but couldn't figure out why it wouldn't work when other compile nodes do.

earnest grotto
#

@pale night Still alive?

reef ivy
#

seems like --reserve-vram is behaving differently and now it's putting more of the model on vram (which makes it slower on a750 for some reason).

#

yeah, changed the values to something lower and it is now sped back up. Could be a run to run variance or just something different in the latest pytorch/comfyui/or driver. But it definitely is not behaving like it did before at the same values.

civic charm
#

Probably the same issue i am experiencing on intel with pytorch >= 2.5

#

Keeping the model weights on the GPU is slower than moving them from the CPU to the GPU and then back to the CPU, even though there is plenty of free VRAM available

#

Everything on the GPU gives 1.4 it/s, while moving back and forth gives 1.8 it/s

#

ipex 2.1 is still ahead with 2.0 it/s

reef ivy
#

I have definitely noticed that for a while now, if my gpu uses more vram the inference speed is much slower. --reserve-vram helps by reserving more vram for the system which usually uses less of my gpu vram when generating.

#

I think it only runs faster if the entire model fits on the gpu with at least 2gb to spare

civic charm
#

I use A770 so it has 4 GB free VRAM left when everything is in GPU yet it still runs slower than moving the weights back and forth

reef ivy
#

hmm, that may be the same with a750, it has been a while since I tried a model that fit into 8gb completely (months)

reef ivy
#

just searched, looks like it's slower? Must have missed the posts about this earlier.

signal patio
#

How to install comfy on arch to use it with a770 16gb?

reef ivy
#

Vik's script should work afaik.

snow gyro
#

Does anyone have Framepack working in Comfy and/or the standalone app? Thank you.

signal patio
#

Arch Linux | Arc A770 16GB

reef ivy
reef ivy
signal patio
earnest grotto
# signal patio Arch Linux | Arc A770 16GB

Don't install clinfo just because the script needs it. Install the compute runtime and everything else properly (which includes clinfo). I dunno how that works for arch so i can't help you there.
After that, you are supposed to run clinfo yourself and verify that the GPU you want to be used is listed.
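
The "run clinfo yourself and verify" step can also be scripted. This is a small sketch, assuming `clinfo` is on the PATH and that its `-l` listing names the card with the word "Arc"; the helper names and the keyword are illustrative, and the exact output layout differs between systems.

```python
import shutil
import subprocess
from typing import List, Optional


def matching_lines(clinfo_output: str, keyword: str) -> List[str]:
    """Return the lines of `clinfo -l` output that mention the keyword (case-insensitive)."""
    return [line.strip() for line in clinfo_output.splitlines()
            if keyword.lower() in line.lower()]


def find_gpu(keyword: str = "arc") -> Optional[List[str]]:
    """Run `clinfo -l` and return matching device lines, or None if clinfo is missing or fails."""
    if shutil.which("clinfo") is None:
        return None
    result = subprocess.run(["clinfo", "-l"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return matching_lines(result.stdout, keyword)


if __name__ == "__main__":
    found = find_gpu()
    if found is None:
        print("clinfo is not installed or failed to run")
    elif not found:
        print("clinfo ran, but no matching GPU is listed")
    else:
        print("\n".join(found))
```

An empty result with clinfo installed usually points at the compute runtime rather than at ComfyUI.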

civic charm
signal patio
#

iGPU disabled, necessary packages installed

sudo pacman -S intel-compute-runtime level-zero-headers level-zero-loader base-devel git python-pip python-virtualenv

earnest grotto
#

is there /bin/bash on arch

#

just type that in, then --version

signal patio
civic charm
#

you should disable the device, not the driver

signal patio
signal patio
civic charm
earnest grotto
#

at this point, why not just disable the igpu in windows or in the bios

signal patio
earnest grotto
#

That's the point, yes
Why do you even want to pass it through

civic charm
#

Just need to block Linux from using it

#

Disabling the driver is one way of doing it

#

Proper way is device-id blacklist

earnest grotto
#

Things still don't work super well when it's enabled in windows
Two birds one bios/driver disable

#

I can try adding the blacklisting AIPG did in oneapi to my script as well, wonder if that will work well

signal patio
#

Photoshop is the only thing that isn't available on Linux for me; there are no alternatives if you're a designer. I have a MacBook for design work, but sometimes I need to be able to use it on a desktop

formal tusk
reef ivy
#

13b Ltxv Model and upscaler released #1158434286765084834 message

reef ivy
#

well can't get the q8 kernels to install without cuda, god I hate cuda tbh.

#

apparently the fp8 will only run with 40series and above, guess will have to wait for gguf?

somber trellis
#

🤷‍♂️ Still blackscreening and cold-restarting from using comfyui

reef ivy
#

Does it happen during anything else?

reef ivy
#

Not sure if also requires the cuda kernel stuff but hopefully not

#

What res?

nocturne fjord
#

For gguf it is necessary to make some modifications in the comfyui code because it does not officially support it. The instructions are in the huggingface page of the model

#

I also published an extension in comfyui you can find it by searching "ComfyUI-LTX13B-Blockswap"

#

by setting the parameter to 40 blocks it only consumes 7 GB of VRAM while maintaining a reasonable speed, however 32 GB of RAM is necessary.

somber trellis
#

If I'm able to run Oblivion Remastered at ultra settings at 60FPS with my GPU getting into the 70-80's

#

And if I'm able to stresstest my hardware without fault

#

Then I literally don't know why comfyui specifically is causing this

civic charm
#

What sampler are you using? Running torch.linalg.solve was causing a hard crash of the GPU in the ipex 2.0 days.

#

UniPC uses linalg.solve

somber trellis
#

even Illustrious 2.0 and other SDXL-based models are crashing me

#

on latest whql rn

hard kraken
#

I could run AAA games fine and FurMark for an hour, and still got crash-reboots on all kinds of models. Then switching to a tier-1 750w psu saved my day

somber trellis
#

I've been able to generate in the past without issue.

#

And I've been using the card for a while now, too.

#

It shouldn't be crashing like this running SDXL.

hard kraken
#

well, i didn't think it was psu either, the models it was crashing on, sometimes they worked too.

#

and i was on 550w

earnest grotto
#

Do a power test with this.

somber trellis
#

I just checked coretemp and my 13600k's performance cores are clocked to 5ghz

somber trellis
#

I set the core power limit of my A770 to 95w

#

It ran, but the moment I opened discord I got a "SYSTEM THREAD EXCEPTION" bluescreen

hard kraken
#

Anyone got Kijai's FramePack working?

#

Mine just gets stuck; progress stays at 0%

somber trellis
#

Both at once had spiked to a max of 200w and i was still running

#

I am so... confused.

#

Also it freezes a minute in. The image on-screen right now is it stuck at 1:45.

#

It just self-terminates after.

reef ivy
#

500w is definitely pushing it imo. It may be that it is degrading a bit so you don't have the head room you used to have which is why it worked without issue before and is having issue now.

#

Also, pro BMG gpus are officially announced

signal patio
# civic charm Proper way is device-id blacklist

Hey! Can u plz help me how to do that? Its been a week and i still can’t do it with guides and chatgpt 😢
I also tried GPU Passthrough Manager and it shows NO iGPU in the list…i have intel core i5 12500 with UHD770

earnest grotto
#

set ONEAPI_DEVICE_SELECTOR=level_zero:0 to make only the 0th Level Zero device visible to pytorch (the selector takes the form backend:device)

#
import torch
import intel_extension_for_pytorch as ipex

# filter out non-Arc devices
supported_ids = []
for i in range(torch.xpu.device_count()):
    props = torch.xpu.get_device_properties(i)
    if 'arc' in props.name.lower():
        supported_ids.append(str(i))

print(','.join(supported_ids)) 

code that lists out the ids of arc devices
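
The indices printed by that snippet can be fed into `ONEAPI_DEVICE_SELECTOR`. Here is a hedged sketch, assuming the `backend:indices` selector syntax with the `level_zero` backend, assuming the indices line up with the runtime's own enumeration, and assuming the variable is set before `torch` is imported so the runtime sees it. `build_selector` is a hypothetical helper.

```python
import os
from typing import Iterable


def build_selector(device_ids: Iterable[int], backend: str = "level_zero") -> str:
    """Build a value such as 'level_zero:0,2' for ONEAPI_DEVICE_SELECTOR."""
    ids = ",".join(str(i) for i in device_ids)
    if not ids:
        raise ValueError("at least one device id is required")
    return f"{backend}:{ids}"


if __name__ == "__main__":
    # Must happen before torch (and the SYCL runtime) is loaded in this process.
    os.environ["ONEAPI_DEVICE_SELECTOR"] = build_selector([0])
    print(os.environ["ONEAPI_DEVICE_SELECTOR"])
```

Setting the variable in the launcher script or shell before starting ComfyUI has the same effect.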

reef ivy
reef ivy
#

so checking out the new ltxv model, tbh it doesn't seem much better than the distilled model just much slower. May have a use case for upscaling but not sure yet. LTXV just seems good for videos without much movement, anatomy gets crazy wonky just like the older models so far. takes between 7-8 minutes with compile and teacache. GGUF q4_ks

#

skyreels 1.3b wan 2.1 model i2v, takes about the same amount of time as the larger ltx model

earnest grotto
somber trellis
#

@earnest grotto Ok uh good news

#

it was never my psu that was the issue

#

my thermal paste yellowed

#

I am currently running a power test right now with my arc clocked to allow the max core wattage, while also running the CPU

#

both spiked to 200w and i survived this time

earnest grotto
#

@halcyon swallow Install ComfyUI using my script #1193952640225267802 message or manually edit in Disty's hijacks

earnest grotto
somber trellis
#

Well the power test is stable now

#

im no longer crashing

earnest grotto
#

bb may have had more drives or other things

somber trellis
#

but id like to mention that coretemp readings

#

while on the power test

#

are in the 90's post-thermal paste replacement

#

however the readings on occt state no higher than 90

earnest grotto
#

yeah that's pretty high, especially for a cpu

somber trellis
#

this might be an issue with the cpu overclocking preset on my bios

#

was fine before

tiny bolt
#

you tried new bios for raptor lake?

#

well, for the mobo. but you get what i mean

somber trellis
#

I have an MSI-Z690-A.

somber trellis
#

set CPU lite load to 110 mohms (intel default), and also followed an undervolt guide for my a770.

#

should be fine now

ember surge
#

Hi guys, I'm looking for image generation models to use with ComfyUI and a B580. Do you have a list that I can start with? I tested a couple of models:
sd_xl_turbo
v1-5-pruned-emaonly-fp16

civic charm
ember surge
reef ivy
#

If anybody wants to get wanvideowrapper nodes working you do have to make an edit to the model.py in wanvideo/modules folder. Find line 69 and change torch.float64 to torch.float32. There are some other float64 calls but so far haven't run into trouble. If it's slow use the blockswap node as it's likely due to this #1193952640225267802 message

reef ivy
#

since I can use the wrapper now with decent speed, I can mess around with vace. I find it's actually pretty close to the 14b wan model in i2v. I also added some of the DG lora's for motion/quality and they do seem to work with it (as well as skyreels 1.3b). Native vace can't be used with teacache so the speed isn't worth it, but the wrapper doesn't have this issue. If you don't want to use the 14b this seems like a pretty good alternative for speed and vram. This is in the wrapper so you would need to make that edit I mentioned to use this workflow. Face gets wonky but it's probably some settings I need to tweak etc.

earnest grotto
# ember surge Both it'll be great!

https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
Realistic model. Works with natural language. Get Q5, if you want to do bigger images or CFG get Q4
https://huggingface.co/city96/FLUX.1-schnell-gguf/tree/main
Faster (fewer steps) but lower quality version.
https://huggingface.co/YarvixPA/FLUX.1-Fill-dev-gguf/tree/main
Inpainting model. Heavy, good.
Separately, you will need, for any of them:
https://huggingface.co/Kijai/flux-fp8/blob/main/flux-vae-bf16.safetensors
https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp8_e4m3fn.safetensors
https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors

Separately, check out https://bfl.ai/models/flux-tools and consider if you want others.

https://civitai.com/models/833294/noobai-xl-nai-xl
For anime images. If you just want to make images without anything extra, get the v-pred-1.0-version. If you want canny and other things, get epsilon-pred-1.0-version, and you can get controlnets from here:
https://civitai.com/models/929685/noobai-xl-controlnet
You can see what each is for from the images.

#

If you've installed with my script, it can download some of those for you

signal patio
#

@earnest grotto how can i see the progress of installation and image generation like it was on windows?

Something is wrong here, it takes forever to install any node

earnest grotto
#

i don't use comfy manager or anything like that, I install what I need to manually

signal patio
earnest grotto
#

if by "progress of image generation" you mean the image previews and not progressbars, I don't do those either, they take up more vram and slow things down for no benefit

signal patio
earnest grotto
#

that's not any different and is always shown

signal patio
#

Something is broken then

earnest grotto
#

you have not queued any prompts in your screenshot

signal patio
signal patio
#

Someone help me, I can't install any node.
Is it OK that it shows Python 3.10? I thought 3.12 was the minimum

earnest grotto
# signal patio

Install spandrel or comfy_extras' requirements like it tells you to

signal patio
earnest grotto
signal patio
earnest grotto
#

Same thing

earnest grotto
#

idk why the built-in manager is slow. it is. you don't need to use it.

#

i'm trying it now and it just doesn't work, so, don't use it

#

my script installs those fine

#

i'm pretty sure the old comfyui manager extension also works fine

signal patio
#

Thank u for your help, as always

earnest grotto
#

comfy has been having some nearly broken updates recently

#

the new mask editor for some reason sometimes makes half-strength brush dabs, panning with middle mouse will spam new nodes under the image node and non-scrollwheel zoom shortcuts are gone

#

IIRC aaron said at one point recently the reroutes were all broken

#

so, keep that in mind

#

nothing i can do about it

signal patio
#

Do u think its possible that in the future comfy will be working with intel/amd videocards like out the box?

earnest grotto
#

well, i can make the script install an older version of comfy but that feels weird

earnest grotto
#

for intel*

#

the amd situation on windows is not good, i'm not aware what's up with ZLUDA, I think specific drivers were needed and so on so that won't be easy for AMD people

#

like on windows, on linux my script also makes a shortcut for you

#

it's the .desktop file

somber trellis
#

Qwen3 30B-A3B runs pretty well at the Q4_K_M quant on the a770.

#

@earnest grotto One of the nodes in your 0.1.8.1p script installs an old version of accelerate and protobuf

#

I'd recommend force upgrading those in your script

earnest grotto
#

Hmm, I'll see which one it is later, alright

#

What old versions if you recall

somber trellis
#

0.38.0 of accelerate, latest is 1.6.0

#

Reinstalled it just now, let me look

earnest grotto
#

ok, so brushnet installs an old accelerate

somber trellis
#

yep

#

that was what i was about to paste but it didn't copy lmao

#

Collecting accelerate<0.32.0,>=0.29.0 (from -r ./ComfyUI/custom_nodes/ComfyUI-BrushNet/requirements.txt (line 2))
Using cached accelerate-0.31.0-py3-none-any.whl.metadata (19 kB)

#

Collecting protobuf<5,>=4.25.3 (from mediapipe->-r ./ComfyUI/custom_nodes/comfyui_controlnet_aux/requirements.txt (line 14))
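
The `Collecting accelerate<0.32.0,>=0.29.0` line is what explains the downgrade: pip picks the newest release inside that range, so an already-installed newer accelerate would be replaced. Below is a simplified sketch of the range check. It handles only plain numeric versions and the `<`, `<=`, `>`, `>=`, `==` operators, whereas real pip follows the full PEP 440 rules, and the helper names are made up.

```python
import operator
import re
from typing import Tuple

_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
        ">=": operator.ge, "==": operator.eq}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.31.0' into (0, 31, 0). Only plain numeric versions are supported."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, specifier: str) -> bool:
    """Check a version against a comma-separated specifier such as '<0.32.0,>=0.29.0'."""
    candidate = parse_version(version)
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|<|>)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](candidate, parse_version(bound)):
            return False
    return True


if __name__ == "__main__":
    spec = "<0.32.0,>=0.29.0"
    for version in ("0.31.0", "1.6.0"):
        print(version, satisfies(version, spec))
```

Upgrading accelerate after the node's requirements are installed, as suggested above, sidesteps the pin at the cost of a pip conflict warning.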

#

im also using the IPEX release of 2.7.10

#

manually installing it atm since your script doesnt include it, but thats easy to do 👍

#

customdia errors on the 2.8.0 nightly

earnest grotto
#

no idea about protobuf

#

doesn't seem like any of them ask for an older one specifically

somber trellis
#

always been recommended to upgrade and ignore protobuf conflicts from what ive seen

earnest grotto
#

could be from accelerate but i doubt

#

Updated

somber trellis
#

i think the main thing ive been waiting for now

#

is straight up just the transformers version of diatts

earnest grotto
#

Wow, so they are probably actually training it more

#

I expect whatever issues dia had initially need more training to fix and I expect no one ever trains more unless you're SAI about to burn down

#

Or an anime finetune

#

Good to see

reef ivy
#

Yeah, comfy has been breaking a lot of custom nodes lately with major updates. Some get rolled back to allow node makers to fix things, but obviously not all nodes are up to date. They recently moved the UI/UX to a separate GitHub repo altogether as well. Comfy is pretty good about fixing things quickly though, but I always check the commits before updating.

#

Also with comfy manager, there are separate builds you can choose (default/dev) etc, and with each one some nodes won't update and some just don't exist at all.

#

It's easy to just choose a different one but it's not really intuitive and I just figured it out myself tbh.

somber trellis
#

imma just stick with spark for now though until the transformers dia comes out

#

current dia is way too inconsistent on cloning, and right now it just has a really high tendency to offset the output to the end of the audio clip, or just straight up ignore what you put in the text prompt.

#

oh lovely the bot is removing my hashes again

earnest grotto
#

dm me

reef ivy
somber trellis
#

Spark-TTS does better currently when it comes to cloning

somber trellis
#

The Dia TTS discord also mentions it being inconsistent or offset

#

Either that or too fast.

reef ivy
#

Yeah, I was also reading that it's technically built more for having 2 speakers, like a podcast, so that may be part of the issue. Not really sure what to make of it; it probably just needs to cook longer.

steel quarry
#

Hi, I am trying to install it now, but I am getting a lot of errors about the versions of PyTorch, NumPy, etc. Even though it gets installed, it doesn't run ComfyUI in the last step. Can you please help me with this installation?

steel quarry
earnest grotto
somber trellis
#

yeah dont mind me ltxv 13b is here

wicked fulcrum
somber trellis
#

it can do both

#

Supports I2V | T2V | V2V | Video Extension.

#

also comes with a spatial and temporal upscaler

earnest grotto
#

very nice

steel quarry
#

can you please let me know why this is happening! Thanks again for your help!!

steel quarry
#

want me to copy the entire loading of comfyUI and paste it here?

earnest grotto
#

Ok

steel quarry
#

I am completely new to this btw, so half the items are greek and latin to me... kindly guide me

earnest grotto
#

You probably ran out of ram, vram or both

#

given you have very little of either

#

Use a different model.

steel quarry
#

8gb vram is not sufficient

#

?

#

ok. will try

earnest grotto
#

It's enough for just flux
For this controlnet, probably not

steel quarry
#

ok. understood

earnest grotto
#

Especially since it's not quantized

steel quarry
#

i found one that's quantized, will it be possibly better to use considering my situation?

earnest grotto
#

If you're doing depth or canny, flux tools has flux versions specifically for those, and people have made quantized versions of those that might work with just 8gb of vram

#

but no guarantees

#

16gb of ram is already kinda low for gaming, and for ai... it's even lower

#

Use some SDXL models instead like juggernaut

steel quarry
#

oh ok. understood... i wanted to try this in this laptop to see how much can this handle... thanks for your feedback

#

will do my homework on SDXL models

earnest grotto
#

my script has a bunch included for convenient download, you probably want juggernaut 9 for realistic stuff and noobai vpred for anime

#

which is also another benefit with sdxl. no anime flux.

earnest grotto
#

To whoever's writing and presumably getting their messages with a stack trace deleted by the bot, posting it again will get it deleted again, explain your issue

paper bolt
#

umm what am i supposed to do from here , i tried installing before it got messed up

earnest grotto
#

@paper bolt Install using my script #1193952640225267802 message

paper bolt
paper bolt
#

@earnest grotto I downloaded anaconda and the script, and downloaded the py 10.11 too. Now what do I do? (apologies, I am an amateur)

earnest grotto
#

or open a command prompt there (e.g. shift+right click the folder's background, cmd here) and run it through that

paper bolt
reef ivy
# somber trellis

How has it been for you? I've gotten pretty poor results and it's not very fast for the quality. Get better results with Wan2.1 1.3b and vace, or skyreels v2 1.3b, with less resources and better speed.

#

Might just be my example is just too complex for it, but being 13b it should be better than a 1.3b model tbh.

paper bolt
earnest grotto
paper bolt
earnest grotto
#

Just run the script again and select that

earnest grotto
paper bolt
#

oh

#

@earnest grotto but i press this

earnest grotto
paper bolt
paper bolt
#

why isnt the download progressing

#

@earnest grotto

#

IT WORKED

lament shale
#

Nice

paper bolt
paper bolt
formal tusk
formal tusk
paper bolt
#

ltx img to vid

somber trellis
somber trellis
#

maybe more steps woulda helped

nocturne fjord
#

It works with the native workflow

reef ivy
#

Is there an I2V version?

nocturne fjord
reef ivy
#

Nice, 3 min for 5 seconds is something for wan.

nocturne fjord
reef ivy
#

Yeah, with Vace and DG 1.3 I think I can get 4 seconds in like 5 to 6 minutes, i2v workflow from kijai (just using first frame no last frame)

reef ivy
# somber trellis

I don't know if it would help or not, I feel like LTXV always give that 3d scan model look like it was trained on animations with those or something.

#

There seems to be a causvid 1.3b model but it seems huge for 1.3b. Could use that with Vace for I2v maybe

nocturne fjord
wicked fulcrum
#

Anyone have LTXV 0.9.7 working on Arc in ComfyUI for text to video?
If so please share workflow

reef ivy
#

Haven't tried t2v but the workflow from the gguf thread should work, i2v worked with the older model haven't tried distilled yet

#

You do need to have a recent comfyui update to use the gguf as support had to be added

wicked fulcrum
#

I have 0.9.6 distilled img2vid working. Can't get 0.9.7 distilled working. Model fails on load. I assume too big. Haven't tried the GGUF version

#

I assume the same GGUF required nodes for Flux GGUF... or something even more special?

reef ivy
#

I think the fp8 distilled works without that special fp8 node so if that is enabled, make sure to disable that (it would error out on anything but 40-50series nvidia anyway). It's needed for the dev to work but they claim the distilled will run native now.

reef ivy
#

super big day on movie models. Moviegen1.1 released, finetune of wan14b that does 1080p, and Vace 14b released and the final version of Vace 1.3b released today. Kijai has them all, you need his nodes to use his vace version ( have to edit his nodes to get them to work on arc) https://huggingface.co/Kijai/WanVideo_comfy/tree/main and what file to edit in the wanwrapper #1193952640225267802 message

#

Not sure if moviegen is even useful for 480p, which is the highest I could do on 8gb (maybe 12 could do 720p? 16gb should be able to)

#

Samples from moviegen look amazing though (quality wise)

paper bolt
#

LoadFramePackModel
Native API failed. Native API returns: 2147483646 (UR_RESULT_ERROR_UNKNOWN)

what is this error about

paper bolt
#

NVM FIXED IT CTRL+C

placid violet
#

hello, does vikscript use the new ipex 2.7?

earnest grotto
#

you can choose between 2.3 with ipex, 2.5 with ipex, stable torch without ipex and the nightly torch without ipex

#

stable should be 2.7 now i guess

#

you can change at any time

#

there is no 2.3 for battlemage

reef ivy
#

There is something called flex attention now, i think its meant for training but seems to be getting used for some of the new wan stuff by kijai. Haven't tried it so no clue if its even supported on Arc. edit* only supported on cuda and cpu atm

reef ivy
wicked fulcrum
#

Float for turning an image into singing video
https://note.com/bakushu/n/n1f884f4f40fd

note(ノート)

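I was impressed by how naturally I could "have a video AI sing a Japanese song generated by a music AI", so I wrote an article about it. Song generated with Suno: I used a song made with the AI music generation service Suno. The version is v4.0, and the prompt was "mellow cute chip tune, female...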

somber trellis
#

@earnest grotto I found another good TTS that seems to do slightly better than spark in cloning

#

IndexTTS

somber trellis
#

like

#

a lot better

#

im testing it on hf rn but there is a comfy node by billwuhao

#

Id also like to mention

#

this hf demo doesnt use transcribed audio

#

it failed here a bit

#

but

#

it definitely became nazeem

earnest grotto
#

I will definitely check it out later

somber trellis
#

it has spacing issues

earnest grotto
#

I am doing this right now

#

Blender nodes in Comfy, if you've used Blender

somber trellis
#

I have weightpainted in the past in exchange for some things

#

such as a combine metrocop reskin

#

I was given these metrocop textures custom-made as a gift for the exchange

#

and the custom node for it works

#

IT GENERATES TEXT SO MUCH FASTER

somber trellis
reef ivy
#

With any of these models can you prompt for emphasis and emotion etc?

earnest grotto
#

Doesn't seem like it

#

In fact in my tests, index tts recognized the emotion and removed it

#

You might be able to kiiinda get around it with punctuation...

#

But ultimately, for fine control rvc seems to still be king

#

Hmm, there might be some control over emotion, going off their demos

#

no idea how that happens tho

somber trellis
reef ivy
#

I asked this in the arc forum, but would a b580 actually be slower than an a750 if using a model that goes over vram? I'm thinking it would because it is only 8lanes of pcie. And since there seems to be an issue where having the whole model on vram is slower than swapping, would it actually be faster at all if it did fit on vram?
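A rough back-of-the-envelope sketch of the PCIe question above. The assumptions are not from the chat: both cards on PCIe 4.0 at roughly 1.97 GB/s usable per lane, x8 on the B580 and x16 on the A750, and a made-up 4 GB of weights swapped per step. `transfer_seconds` is a hypothetical helper, and real swap cost also depends on drivers and how the offloading overlaps with compute.

```python
# Ideal PCIe transfer-time estimate (assumed PCIe 4.0 figures, not measured).
PCIE4_GB_PER_S_PER_LANE = 1.969  # 16 GT/s with 128b/130b encoding


def transfer_seconds(gigabytes: float, lanes: int) -> float:
    """Ideal time to move `gigabytes` over a PCIe 4.0 link with `lanes` lanes."""
    return gigabytes / (PCIE4_GB_PER_S_PER_LANE * lanes)


if __name__ == "__main__":
    spill_gb = 4.0  # hypothetical amount of weights swapped per step
    for name, lanes in (("B580 (x8)", 8), ("A750 (x16)", 16)):
        seconds = transfer_seconds(spill_gb, lanes)
        print(f"{name}: {seconds:.3f} s to move {spill_gb} GB")
```

Under these assumptions the x8 link takes twice as long per swapped gigabyte, so whether the B580 ends up slower overall would depend on how much of the step time is transfer rather than compute.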

earnest grotto
#

the "slower in memory than swap" issue could be alchemist only

#

if you're contemplating a b60 24gb, intel is also promising more robust drivers. but i'd hold off on the dual gpu one until we see non-llm stuff on it, which we likely will and it's not coming out soon either way

last dock
#

Sooo i'm new to this community in terms of messaging, but I've been a lurker for a bit for any issues I've come across. That aside, I'm not sure what improved but I have ComfyUI installed and everything works perfectly. I was using torch 2.5.1 for around a month or so because I couldn't get 2.6 to work.

Now at this very moment, using my Arc A580, I've got torch 2.7 installed (xpu) and the speed has improved drastically. Namely doing 1920x1080 currently at 00:22sec-00:23sec per image. Before with torch 2.5.1 it was this fast but at a resolution of 768x1024, any higher took more than a minute to generate.

Not sure if it's my install or what but man am I happy to be alive.

#

And I just tested an SD 1.5 model at 1920x1080 and the "RuntimeError: Current platform can NOT allocate memory block with size larger than 4GB!" bug seems to not be present anymore.

I'll finally check back this week on video generation
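For a sense of why 1920x1080 on SD 1.5 could run into a 4GB single-allocation limit like the one quoted above, here is a small estimate of one unsliced self-attention score tensor at the top UNet resolution. Everything in it is an assumption rather than something stated in the chat: an 8x VAE downscale, 8 attention heads, fp16 scores, and that the failing allocation was this tensor at all. `attention_scores_gib` is a hypothetical helper.

```python
# Size of an unsliced attention score tensor (tokens x tokens per head).
# Assumed, not from the chat: 8x VAE downscale, 8 heads, 2-byte elements.
def attention_scores_gib(width: int, height: int, heads: int = 8,
                         bytes_per_element: int = 2, vae_scale: int = 8) -> float:
    """Return the size in GiB of the score tensor for one image."""
    tokens = (width // vae_scale) * (height // vae_scale)
    return tokens * tokens * heads * bytes_per_element / 1024**3


if __name__ == "__main__":
    for w, h in ((768, 1024), (1920, 1080)):
        print(f"{w}x{h}: {attention_scores_gib(w, h):.2f} GiB")
```

If the estimate is in the right ballpark, the tensor grows with the square of the pixel count, which would explain why higher resolutions hit the limit unless attention is sliced into smaller chunks.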

earnest grotto
earnest grotto
#

.

reef ivy
# earnest grotto the "slower in memory than swap" issue could be alchemist only

I believe it is a pytorch issue, but not sure if it also affects battlemage as the sample size is too low to ask people who would know. There is an update to XPU in pytorch that I think is released in nightly that changes how gpu/cpu is handled but have no idea if it would affect this issue. https://github.com/intel/torch-xpu-ops/commit/02cc63fa440b28b51765fce7a1eabb3a06a98bfa

GitHub

Refer https://github.com/pytorch/pytorch/pull/147820
https://github.com/pytorch/pytorch/pull/150398
To launch kernels on the current stream and reduce the CPU overhead
introduced by recordStream,...

#

Probably will end up on nvidia at some point if no b700 comes out, or wait till celestial. Still need an all in one gpu, not just a pro one with low power etc. May grab an a770 at some point and sell the a750 also as I really want to avoid nvidia lol.

earnest grotto
#

The built-in manager or the addon/custom node

#

That's not a missing node

somber trellis
civic charm
civic charm
#

Also nvtop finally got proper Intel ARC support

earnest grotto
#

judging by the B580's speeds

earnest grotto
#

no xpu-smis behind the scenes or anything else?

civic charm
#

I am on Arch Linux with kernel 6.14

civic charm
#

Zluda is tricky but works on native Windows with Cuda PyTorch on RX 6000 and RX 7000

earnest grotto
earnest grotto
#

halfway through season 2. jesus.

earnest grotto
#

anime equivalent of drakengard 1 honestly

tiny bolt
#

just skip endless august

earnest grotto
#

I've almost finished the 5th out of the 8

#

sunk cost...

tiny bolt
#

i gave up at the third iteration DoggoGrin

earnest grotto
#

finally something sane to monitor vram with

#

thanks!

reef ivy
#

I just like the way intel is handling it tbh, maybe it's easier since they have less gpu's I dunno.

earnest grotto
#

Well, I guess the whole thing should work now?

earnest grotto
civic charm
maiden basin
#

I am getting this error after installing per the instructions. What would cause this?

earnest grotto
#

Install using my script #1193952640225267802 message

maiden basin
#

Will do. Thanks Vik

somber trellis
maiden basin
#

That last voice sounds like the Gluron news caster lol

maiden basin
somber trellis
maiden basin
#

I was able to get the Kokoro workflow installed. I think.

somber trellis
maiden basin
#

Is there any way to add the parameter --cpu to the launch .bat? It seems to crash when I do it.

#

Kokoro doesn't want to play with the xpu

reef ivy
#

Try the multi-gpu custom nodes and select CPU on Model loader maybe.

earnest grotto
maiden basin
#

Lemme see if I can generate the error.

maiden basin
maiden basin
maiden basin
#

How would one implement that in this configuration?

earnest grotto
maiden basin
#

Also, out of curiosity. Should I be using the lowVRAM.bat to launch for an A770 16GB? It seems to force half precision (BF16). I think the A770 excels at Half Precision anyway

earnest grotto
# maiden basin Also, out of curiosity. Should I be using the lowVRAM.bat to launch for an A770 ...

There is basically no point for you to be using fp32 instead of bf16, it's half the performance and twice the vram for no gain.
bf16 is a bit faster, let's say 10-20% on Intel than fp16.
the bf16 launch argument is a bit of a holdover. comfy should be using bf16 without it but i'm not completely sure.
"low vram" means that text encoders don't sit around in vram doing nothing. it frees up vram so you can instead, say, work with a larger image or a model that naturally takes more vram, without running out

reef ivy
#

On a750 lowvram seemed to slow everything down, using --reserve-vram seemed to do something similar but actually increase speed for some models
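On the fp32 vs bf16 point above, the "twice the vram" part is plain arithmetic on bytes per weight. A minimal sketch, where the ~2.6 billion parameter count for an SDXL UNet is an approximate figure assumed for illustration and `weights_gib` is a hypothetical helper. It only counts weights, not activations or text encoders.

```python
# Memory needed just to hold model weights at different precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weights_gib(n_params: float, dtype: str) -> float:
    """Return GiB needed to store `n_params` weights in the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    sdxl_unet_params = 2.6e9  # approximate, assumed for illustration
    for dtype in ("fp32", "bf16"):
        print(f"{dtype}: {weights_gib(sdxl_unet_params, dtype):.2f} GiB")
```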

somber trellis
reef ivy
#

it's probably common to these models, but the fact that it adds breaths is kinda crazy to me.

somber trellis
#

WAN 2.1 VACE 14B Q8 GGUF works fine with teacache on arc (512x512 resolution)

#

5 minutes a gen

reef ivy
#

They fixed teacache for vace in native? Will give it a try

reef ivy
somber trellis
#

Equivalent TTS quality to IndexTTS. Clone quality isn't there.

#

it made my imperial watchguard irish

#

or british

reef ivy
#

I wonder if you can prompt for accent?

somber trellis
#

indextts for comparison

reef ivy
#

Sounds just like skyrim lol, on one hand I personally kind of want to make a voice different than a real one.

#

I wish you could mix multiple voices etc

somber trellis
#

i cant be mad though

#

this is very close to the quality of old elevenlabs

reef ivy
#

Yeah, that is really good. You could also use dia then use cloning on the output I guess.

hard kraken
#

took 243s, weird thing is both cpu/gpu never worked higher than 30% ... during the sampler stage

earnest grotto
#

ace-step might have loras trainable. I guess I'll investigate soon™️
ace-step itself looks to be absolutely bonkers quality, and that speed... damn

#

40 seconds generation for 150 seconds of music

hard kraken
#

XPU out of memory. Tried to allocate 4.47 GiB. GPU 0 has a total capacity of 15.56 GiB. Of the allocated memory 4.78 GiB is allocated by PyTorch, and 29.08 MiB is reserved by PyTorch but unallocated. Please use empty_cache to release all unoccupied cached memory.
what to do?

reef ivy
#

If so then probably run his script again, if it still happens post the entire error.

hard kraken
#

if XPU's already recognized, why need cuda hijacks?

reef ivy
#

That's what it does, it changes calls for cuda to calls for xpu, also it gets past the 4gb limit issue with alchemist (not an issue with battlemage). I assume you are on alchemist as that is the error you get.

#

Just assume everyone is only coding for cuda unless you are making your own code
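The idea of "changing calls for cuda to calls for xpu" can be illustrated with a toy device-name remapper. This is only a sketch of the concept and not the real hijack code, which patches PyTorch functions rather than strings; `remap_device` is a hypothetical helper.

```python
# Toy illustration of redirecting CUDA device requests to XPU.
# NOT the actual hijack implementation; it only shows the renaming idea.
def remap_device(device: str) -> str:
    """Map 'cuda' or 'cuda:N' to 'xpu' or 'xpu:N'; leave other devices alone."""
    if device == "cuda" or device.startswith("cuda:"):
        return "xpu" + device[len("cuda"):]
    return device


if __name__ == "__main__":
    for requested in ("cuda", "cuda:0", "cpu"):
        print(requested, "->", remap_device(requested))
```

Code written only with CUDA in mind keeps asking for "cuda", and a layer like this answers with the Intel device instead, which is why it is still needed even when the XPU itself is detected.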

maiden basin
#

What settings are you guys using for WAN? All I get is this lol.

#

Well, it was animated GIF lol.

#

Here it is. It was supposed to be about a pig who befriended a robot.

reef ivy
maiden basin
earnest grotto
#

@woeful sable You can get my script from here #1193952640225267802 message (click this link or see this thread's pins)

#

This is ComfyUI. You work with nodes. If you don't like that, don't use that, which is why I suggested sdnext and aipg instead

somber trellis
earnest grotto
#

Doesn't quite match the stanley parable voice

earnest grotto
#

that one's closer

somber trellis
#

I went and maxed out top_k, top_p and temp

#

just to see if that'd actually change it in any meaningful way

#

It seems as if the voices are far less artifacted

earnest grotto
#

I had really funny results with neco arc. I should try again

#

Am waiting on qwen 2.5 omni to caption some music files but i'm using cpu because vulkan causes massive artifacts

somber trellis
#

Do you know why there's such a long pause without much usage

#

Sometimes it happens and sometimes it doesn't

earnest grotto
#

no, other than try lowering the temperature

#

but then yeah that can be a kinda worse result

somber trellis
#

lower temp, top_p and top_k means less variety which means more monotone outputs

#

🤷‍♂️

#

token prediction be like
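How temperature, top_k and top_p narrow the choice of the next token can be sketched in plain Python. This is a generic sampler written for illustration, not the code of any TTS node mentioned here; `sample_token` and its defaults are hypothetical.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Sample an index from `logits` using temperature, top-k and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Lower temperature sharpens the distribution, higher flattens it.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    # Most likely tokens first.
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]  # keep only the k most likely tokens
    kept, cumulative = [], 0.0
    for prob, index in ranked:
        kept.append((prob, index))
        cumulative += prob
        if cumulative >= top_p:  # stop once the nucleus covers top_p
            break
    indices = [index for _, index in kept]
    probs = [prob for prob, _ in kept]
    return rng.choices(indices, weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0, 0.5]
    print([sample_token(logits, temperature=0.7, top_k=2) for _ in range(10)])
```

With small top_k or top_p only the few most likely tokens survive, which matches the "less variety" observation, while maxing them out lets unlikely tokens through.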

earnest grotto
#

part of why i'm looking forward to nari labs' hopefully now fixed dia 1.5 or 2 or whatever they will call it

#

it had both good clone quality and emotion but just broke so damn easily...

somber trellis
#

Yeah, I noticed that if you word-for-word the clone with dia

#

it's near identical

#

When I tried trailing off using the original text to preserve quality, it crashed.

woeful sable
earnest grotto
#

Let's sort out sdnext or ai playground first

somber trellis
somber trellis
earnest grotto
#

the stanley parable VA suddenly sounds like the globglogabgalab VA to me

somber trellis
#

Does this sound better to you?

somber trellis
#

Kevan Brighting being the voice actor for the parable, and Tony Halstead being Mr. Globgogabgalab.

#

They give the same vibe, though.

#

Same voice feel.

#

Though the timbre of gabgalab is much higher

earnest grotto
earnest grotto
#

No nans so far

reef ivy
#

That guy makes some dope nodes, always seemed too heavy for me though i think he is on a 90 series nvidia.

earnest grotto
#

ace-step should work on an a750 and i kinda doubt this node is heavy enough to change anything, though i haven't tried since I'm still training

reef ivy
#

Yeah, I may check it out. Mostly his audio reactive nodes with animatediff and depth etc.

earnest grotto
#

hmm, looking into it ace-step might need just a bit over 8gb in comfy...

#

probably doable with b580 but maybe not a750?

#

not sure

#

also my first lora test went pretty badly

warped briar
#

Can anyone help me with this error? When I run: python main.py --bf16-unet, I get this error

earnest grotto
somber trellis
#

Has anyone messed with Bagel?

somber trellis
#

@earnest grotto

#

I don't think these nodes are working properly at all on our hardware.

#

It's 229 seconds of gpt_gen_time at 10% cpu and gpu load, barely being utilized.

earnest grotto
#

what are "these nodes"

earnest grotto
somber trellis
tiny bolt
#

still a classic. shame the series died with kyoani (and of course the author's laziness)

somber trellis
#

Also the gpt gen time (one of the index-tts models) takes almost 4 minutes for what seems to be no reason, since the cpu and gpu both idle at 10% usage

signal patio
#

How can i fix that? Arch Linux, A770 16GB

earnest grotto
somber trellis
#

Also I'm an idiot and realized the assert error message was due to me going above the mel token limit the index-tts node has built in.

#

just increasing it resolved the issue lmao

earnest grotto
# somber trellis

argh
it's good but it makes me wish even more that ace-step lora training actually worked with ~100 files, it feels like it overfits crazy fast while also not quite getting there

ripe pivot
#

You can make lora with intel? It seems like there's no tutorial for it out there

earnest grotto
#

I have successfully trained loras for sdxl and flux dev. My ace-step one ended up bad, presumably due to the nature of the training or my hyperparameters rather than any other issues
I have finetuned tortoise tts successfully
I tried finetuning pixart sigma however it broke. but i might've set the learning rate waaaaaay too high, i don't remember

#

What do you want to make a lora for

ripe pivot
#

some obscure character and only used in-game screenshots or something

earnest grotto
#

what model

earnest grotto
#

And what style do you want to make images in after that

ripe pivot
#

anime, and I assume illustrious is sdxl infrastructure right?

earnest grotto
# ripe pivot anime, and I assume illustrious is sdxl infrastructure right?

I'd suggest noobai vpred if you want to just generate, and noobai epspred if you want controlnets
Instructions here for setting up kohya https://discord.com/channels/554824368740630529/1162821590153699438
Get taggui: https://github.com/jhc13/taggui/releases and use it instead of kohya's captioner. Add your triggerword then use wd-swinv2-tagger-v3, then manually go through the images and make sure the automatic tags are sane.
If it's just in game screenshots, your lora will slightly overfit to making things 3d. You can mess with which lora blocks take effect in comfy, you can do masked training (paint out with alpha the not-character parts of the images), or you can prompt that away, generate some images you think are good enough, and train a new lora on those.
Resize the images to 1mp. kohya should have a thing to do that for you.
Show some pics of your dataset before you start training
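For the "resize to 1mp" step, a small sketch of picking target dimensions that keep the aspect ratio. It assumes "1mp" means about 1024x1024 pixels and snaps to multiples of 64, both of which are assumptions made here; `one_megapixel_size` is a hypothetical helper and is not part of kohya, which has its own bucketing.

```python
import math


def one_megapixel_size(width: int, height: int,
                       target_pixels: int = 1024 * 1024, step: int = 64):
    """Return (width, height) near `target_pixels`, keeping aspect ratio.

    Images already at or below the target are not upscaled.
    """
    scale = min(1.0, math.sqrt(target_pixels / (width * height)))
    new_w = max(step, round(width * scale / step) * step)
    new_h = max(step, round(height * scale / step) * step)
    return new_w, new_h


if __name__ == "__main__":
    for size in ((2048, 2048), (1920, 1080), (3840, 2160)):
        print(size, "->", one_megapixel_size(*size))
```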

GitHub

Tag manager and captioner for image datasets. Contribute to jhc13/taggui development by creating an account on GitHub.

#

Don't remember what disty's example config's settings were, but keep the learning rate and aim for ~2000-3000 steps. Make it save a lora about every 500 steps.

ripe pivot
#

do you have to use linux or can you do it on native windows?

earnest grotto
#

If you decide to use vpred, kohya's gui should have some toggles to tell it it's a vpred model. tell me if it doesn't

earnest grotto
#

It should

ripe pivot
#

well, with assumption it should. how do I install the Intel Compute Runtime on windows? Is there any downloadable .exe for it?

earnest grotto
#

By installing your drivers.

#

It's included in those

#

You most likely already have it

earnest grotto
#

@ripe pivot Any luck?
Forgot to mention but you probably also want to caption images with "3d" or such

ripe pivot
earnest grotto
# ripe pivot not yet, still trying to compile the images. How many images do you reckon for 2...

You set your epochs to such a number that you train for about 2000-3000 steps. If you have 10 images, that's 200-300 epochs. If you have 100, that's 20-30 epochs and so on.

You should use at least 30-40 images for decent results. The images should be varied, which is why I'd like to see your dataset before you start training
You can flip them, however you should only do that if your character is symmetric. For example, armbands, earpieces, prosthetics and so on are not symmetric.
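The epoch arithmetic above, as a small sketch. `epochs_for_steps` is a hypothetical helper; the `batch_size` and `repeats` parameters are additions for illustration and default to 1 so the numbers match the examples in the message.

```python
import math


def epochs_for_steps(target_steps: int, n_images: int,
                     batch_size: int = 1, repeats: int = 1) -> int:
    """Return how many epochs are needed to reach about `target_steps` steps."""
    steps_per_epoch = math.ceil(n_images * repeats / batch_size)
    return math.ceil(target_steps / steps_per_epoch)


if __name__ == "__main__":
    for images in (10, 40, 100):
        low = epochs_for_steps(2000, images)
        high = epochs_for_steps(3000, images)
        print(f"{images} images: {low}-{high} epochs")
```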

#

Flipping with kohya makes it so you can't cache the latents and the vae has to stay loaded and take up vram and time, I suggest to flip them some other way

ripe pivot
earnest grotto
#

Did you install comfy with my script

#

@ripe pivot

#

If you did, run it, exit it (ctrl+c), install kohya's requirements.txt and use it from that environment

ripe pivot
#

wait how do you install .txt

civic charm
#

kohya's requirements are ancient, just use the comfy venv as is

earnest grotto
#

if you have moved to inside kohya's folder

ripe pivot
#

how do you do that again? I forgot how to do the cmd movement with script convenience

earnest grotto
#

You have a folder in C:\Users\You\Documents\Folder, with the file requirements.txt in it (C:\Users\You\Documents\Folder\requirements.txt)
your command prompt is in C:\Users\You\Documents
you can:

  1. refer to the file directly
    pip install -r "C:\Users\You\Documents\Folder\requirements.txt"
  2. refer to the file relative to where the cmd is
    pip install -r ".\Folder\requirements.txt"
  3. move to that folder and install the file directly
    cd C:\Users\You\Documents\Folder OR cd .\Folder
    pip install -r requirements.txt
    . means current folder. .. means go up one folder.
    If the folder is in another drive, let's say E: and the cmd is somewhere in C:, to move the terminal there (not for referring to the file), you need to EITHER first do E: then cd E:\wherever..., or just do cd /d E:\wherever...

If you can't do this, I'll add kohya to my installer script later
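The three options above (absolute path, relative path, cd first) all point at the same file. A small sketch with Python's `pathlib` shows that, using the same example folder names as the message; the variable names are made up for illustration, and `PureWindowsPath` only manipulates path text, so it runs on any OS without touching the disk.

```python
from pathlib import PureWindowsPath

# Where the command prompt currently is.
cwd = PureWindowsPath(r"C:\Users\You\Documents")

# 1. Absolute path: works no matter where the prompt is.
absolute = PureWindowsPath(r"C:\Users\You\Documents\Folder\requirements.txt")

# 2. Relative path: resolved against the current folder ("." is dropped).
relative = cwd / r".\Folder\requirements.txt"

# 3. After "cd Folder", the bare file name is enough.
after_cd = (cwd / "Folder") / "requirements.txt"

if __name__ == "__main__":
    print(absolute == relative == after_cd)
    # ".." means one folder up from wherever you are.
    print((cwd / "Folder").parent == cwd)
```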

ripe pivot
#

uh do I need another file or what

earnest grotto
#

ok, you have to do it from inside kohya's folder

ripe pivot
#

it's been stuck for 1 hour now

earnest grotto
#

you are downloading pytorch with cuda support

#

@ripe pivot stop doing whatever you're doing, go collate, caption, flip, whatever your images, I'll add kohya to my script

earnest grotto
#

Untested kinda, it launches kohya but that's as far as i tested and fixed for now

#

More testing and fixing later probably unless i lucked out and this works

earnest grotto
#

Native windows OOMs, useless for training sdxl

#

i wonder if a different pytorch/ipex version can save it, but i doubt that

#

hmm, 2.3 might work

#

this is a bit troubling to look at

#

but it's going somewhere

earnest grotto
#

seems to have settled on ~4.64s/it, with a dataset of all the acceptable sdxl resolutions

#

i wonder if it's came making it slow or just windows

hard kraken
#

this doesn't work with xpu nightlies?

earnest grotto
#

Did not work with stable pytorch. I didn't test with the nightlies

#

I am using 2.3 now

#

Almost done training, less than an hour remaining and I can see if it produces garbage or actually works, but given the loss is still sane and slowly going down it's probably gonna be fine

#

It's in my script. So you can easily just choose what version you want

#

I still have some other things to test out before I'll move it to the main branch

civic charm
#

Last time i tried xpu nightlies, it gave me 30s/it to a minute per iteration

earnest grotto
#

i get a pretty comfortable 1.7-2.5s/it for an sdxl lora with adam, 12 dim 1 batch size no TE, with some not super recent 2.8, depending on dataset or some magic randomness because i've had different speeds if I just reboot multiple times

hard kraken
#

why comfy/sdnext works with nightlies but not kohya?

#

by 2.3 you mean ipex or xpu?

earnest grotto
#

it works with nightlies on linux, for me. it does not work with stable on windows, because it uses too much VRAM and runs out. I have not tested nightlies on windows. it uses way more vram than i'd like even with 2.3, see the screenshot above. IIRC on linux I have about 4-5GB free with 1 batch size

earnest grotto
hard kraken
#

right, i forgot

earnest grotto
#

it's also possible the newer ipexes use less vram than plain pytorch but i haven't tested

hard kraken
#

so 2.3.1 is better than the 2.1.4 stipulated in the kohya requirements?

earnest grotto
#

when the requirements were written, 2.1 was the newest

#

that's why it says 2.1

#

it has not been updated since

hard kraken
#

ah.. i saw file was last updated 3 months ago, thought i had to stick to old ones

earnest grotto
#

they might have moved it around then i guess, idk

#

anything below 2.5 also does not work for flux lora training

#

not that i'm even gonna attempt that on windows

#

it was some bug rather than an OOM

earnest grotto
# earnest grotto https://raw.githubusercontent.com/a-One-Fan/ComfyUI-Intel-Installer-Script/refs/...

You don't have to stick to any specific version, you can use whichever you want
In general i'm pretty sure most of kohya's requirements are specific just because they froze what worked for them and didn't want to test out different versions of random dependencies, which is fair, i wouldn't wanna do that either. it worked on linux when I ignored their requirements and just manually installed what was missing on top of comfy

civic charm
#

Tried 2.7 and 2.8 again.
18 s/it with them
2.3 is 4.5 s/it
batch size = 2
gradient checkpointing = true
optimizer = adamw
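To put the s/it figures above in perspective, a small sketch converting seconds per iteration into wall-clock training time. The 3000-step count is taken from the ~2000-3000 step target suggested earlier in the thread, and `training_hours` is a hypothetical helper.

```python
def training_hours(steps: int, seconds_per_it: float) -> float:
    """Return total training time in hours for `steps` iterations."""
    return steps * seconds_per_it / 3600


if __name__ == "__main__":
    steps = 3000  # upper end of the step target suggested earlier
    for label, s_per_it in (("2.3", 4.5), ("2.7 / 2.8", 18.0)):
        print(f"torch {label}: {training_hours(steps, s_per_it):.2f} h")
```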

earnest grotto
civic charm
#

almost full but i am on linux

#

pytorch shouldn't be able to use gtt

earnest grotto
#

hmm, yeah

#

i'll try with batch size 2 in a bit

earnest grotto
#

lora is sane and working

#

@ripe pivot

#

||leonardo da vinci style lora||

civic charm
#

batch size 1 is 16 s/it

#

1-2 gb free

ripe pivot
#

I hope 12gb is enough

earnest grotto
#

or you can train for sd 1.5

civic charm
earnest grotto
#

still getting lower and lower, 2.34s/it

#

with this torch

#

and this kohya hash e5e8be05fe0475a04e61ef668afffc632aa178f5

#

6.14.8-061408-generic

#

1 batch size only

#

running the training scripts a bit more directly

earnest grotto
#

2 batch size, ~4s/it but i don't want to keep waiting for it to plateau

reef ivy
#

Haven't checked 2.8 nightly in over a month but it was still slower than 2.1.4 last I tried.

earnest grotto
#

I've been commenting them out so far. I wonder if those break the hijacks enough to kill performance?

#

I haven't tested which pytorch/ipex versions torch.xpu.List/Tuple do exist for so no pr yet

#

and i guess kohya's hard requirement for numpy 2.1 is bad for intel, 1.26.4 is needed here too

#

and huh, I oomed with the same config that worked on windows

#

very strange

civic charm
#

sd-scripts version isn't compatible with 2.7+

earnest grotto
#

huh
why not a submodule

#

oh well

#

alright

#

oops, i guess i forgot gradient checkpointing

#

that's why it was using so much vram

#

speed from the gui is still worse than i'd like (~6s/it on linux), I wonder if I'm missing other arguments but progress i guess

#

@ripe pivot Sorry, I lied, yes you can train an sdxl lora with your 12gb b580

#

and given windows' offloading, flux might be feasible but slow and low res and whatever else

civic charm
#

IPEX 2.3 achieves 2.5 s/it with 9.2 gb vram usage
PT 2.7.1 achieves 12 s/it with 11.5 gb vram usage and only 25% GPU usage
RX 7900 XTX with PT 2.7.1 achieves 1.2 s/it with 8.7 gb vram usage

command:

accelerate launch --num_cpu_threads_per_process=2 "/mnt/DataSSD/AI/Apps/ipex/sd-scripts/sdxl_train_network.py" --bucket_no_upscale --bucket_reso_steps=64 --cache_latents --cache_latents_to_disk --cache_text_encoder_outputs_to_disk --full_bf16 --gradient_checkpointing --keep_tokens="12" --learning_rate="0.0001" --logging_dir="/mnt/DataSSD/AI/anime_image_dataset/old_characters/hakurei_reimu/log" --lr_scheduler="constant" --lr_scheduler_num_cycles="100" --max_data_loader_n_workers="0" --max_grad_norm="1" --resolution="1024,1024" --max_train_steps="5000" --mixed_precision="bf16" --network_alpha="3" --network_dim=24 --network_module=networks.lora --network_train_unet_only --optimizer_type="AdamW" --output_dir="/mnt/DataSSD/AI/anime_image_dataset/old_characters/hakurei_reimu/model" --output_name="hakurei_reimu" --pretrained_model_name_or_path="/mnt/DataSSD/AI/models/sd-webui/Stable-diffusion/SDXL/icedcoffeeil_V30.safetensors" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="fp16" --seed="123456789" --train_batch_size="2" --train_data_dir="/mnt/DataSSD/AI/anime_image_dataset/old_characters/hakurei_reimu/img" --unet_lr=0.0001 --sdpa
earnest grotto
#

for some reason the gui's "print training command" doesn't actually print the command for me and instead prints the toml file. oh well, i do have a command directly on linux

#
accelerate launch --num_processes=1 --num_machines=1 --num_cpu_threads_per_process=8  "/mnt/C_SSD/VMs/Kohya/newer_kohya/kohya_ss/sd-scripts/sdxl_train_network.py" --mixed_precision="bf16"  --bucket_reso_steps=32 --cache_latents --cache_latents_to_disk --caption_extension=".txt" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --full_bf16 --gradient_checkpointing --huber_c="0.1" --huber_schedule="snr" --learning_rate="0.0004" --logging_dir="/mnt/C_SSD/VMs/Kohya/log"  --bucket_no_upscale --loss_type="l2" --lr_scheduler="constant" --lr_scheduler_num_cycles="8" --max_data_loader_n_workers="0" --max_grad_norm="1" --resolution="2048,2048" --max_train_steps="3216" --min_snr_gamma=5 --min_timestep=0 --mixed_precision="bf16" --network_alpha="12" --network_dim=12 --unet_lr=0.0002 --network_module=networks.lora --no_half_vae --optimizer_type="AdamW" --output_dir="/mnt/C_SSD/VMs/Kohya/Datasets/leonardo_da_vinci/model/" --output_name="noobai_1_0_vpred_leonardo_linux_part1_0" --pretrained_model_name_or_path="/mnt/D_SSD/common_models/Stable-diffusion/noobaiXLNAIXL_vPred10Version.safetensors" --save_every_n_epochs="1" --save_model_as=safetensors --save_precision="bf16" --network_train_unet_only --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --train_batch_size="2" --train_data_dir="/mnt/C_SSD/VMs/Kohya/Datasets/leonardo_da_vinci/scaled/" --sdpa --caption_dropout_rate="0.0" --v_parameterization --zero_terminal_snr
reef ivy
#

Have we ever learned why later versions of pytorch/ipex have such a slowdown (for arc specifically)?

hard kraken
civic charm
#

IPEX 2.7 was even slower than PT 2.7 on inference, haven't tried it on training

earnest grotto
#

linux speeds don't entirely match windows. i've had things slower on windows be faster on linux and with training i had it vice versa, on linux with gradient checkpointing the config i squeezed out of the gui with whatever else i forgot was 4s/it but windows 3s/it, both with pytorch 2.7

hard kraken
earnest grotto
hard kraken
#

i tried to install it in windows, but tensorflow has a tensorflow-io-gcs-filesystem dependency that's only available on linux and mac

main spear
#

Hi, is there an updated install guide? I tried following the install guide pinned up top but I can't get it to work. I'm on a B580 if that makes a difference

reef ivy
main spear
#

I looked at that but it just opened a text page on my browser rather than a download file

#

Do i need to copy/paste that somewhere specific?

#

nvm, reread the instructions and just had to right click and save

main spear
#

\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?

#

Is this normal?

#

It loads successfully, but just want to make sure I didn't make a mistake

#

also this doesn't show automatically, where do i enable this?

#

It generates images correctly so I'm assuming its running as intended

reef ivy
#

For better info on using comfyui itself check the wiki on their github. Probably just download some workflows or use the samples to get started

main spear
#

Okay, the checkpoint I'm using has an optional extension. Is there a specific folder in Comfy that I use so the git command installs it in the correct place?

main spear
#

are only SD1.5 models compatible?

earnest grotto
earnest grotto
earnest grotto
#

be more specific.

earnest grotto
main spear
# earnest grotto "extension"?

I figured it out, the model I used was using a diffusion model but they called it an extension in the description. I was also exploring CivitAI and looking at all the models. I just remembered that in the video guide he said SD1.5, so I wasn't sure if others would work. I'm just testing to see what works and what doesn't now. Though a guide to get txt2vid or img2vid working with WAN would be good.

main spear
#

nvm, i figured it out

earnest grotto
#

Trying to get instantcharacter to work locally. I wonder what's up with the image

#

hmm, wonder if it's flan

#

guess i'll find out some other time

valid escarp
#

I am trying to install using anaconda and i am getting this error: Checkpoint files will always be loaded safely.
C:\Users\anuma\anaconda3\envs\GenAI\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\Users\anuma\anaconda3\envs\GenAI\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
Traceback (most recent call last):
File "C:\comfy\ComfyUI\main.py", line 129, in <module>
import execution
File "C:\comfy\ComfyUI\execution.py", line 14, in <module>
import comfy.model_management
File "C:\comfy\ComfyUI\comfy\model_management.py", line 221, in <module>
total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
File "C:\comfy\ComfyUI\comfy\model_management.py", line 172, in get_torch_device
return torch.device(torch.cuda.current_device())
File "C:\Users\anuma\anaconda3\envs\GenAI\lib\site-packages\torch\cuda\__init__.py", line 1026, in current_device
_lazy_init()
File "C:\Users\anuma\anaconda3\envs\GenAI\lib\site-packages\torch\cuda\__init__.py", line 363, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

#

mine is intel arc graphics, so i used the steps pinned for this chat, yet i'm getting this error about CUDA, which is NVIDIA's i guess?
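
The traceback above ends in `torch.cuda.current_device()`, which raises on a PyTorch build without CUDA. A minimal sketch for checking which backends the installed torch actually exposes, assuming a PyTorch version that ships the `torch.xpu` module; `describe_backends` is a hypothetical helper name, not something from ComfyUI or the install script:

```python
def describe_backends():
    """Report which GPU backends the installed PyTorch can see."""
    try:
        import torch
    except Exception:
        # torch is missing or failed to load (for example a DLL problem on Windows)
        return {"torch_importable": False, "xpu": False, "cuda": False}

    # torch.xpu is only present in newer or XPU-enabled builds, so look it up defensively
    xpu = getattr(torch, "xpu", None)
    return {
        "torch_importable": True,
        "xpu": bool(xpu is not None and xpu.is_available()),
        "cuda": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    print(describe_backends())
```

If `xpu` and `cuda` both come back false on an Arc card, the torch in that environment is most likely a CPU-only build, which would match the error above.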

earnest grotto
#

@valid escarp Install using my script ^

#

The steps at the very start of this chat are old and IIRC should be pretty outdated now

signal patio
#

@earnest grotto

#

Why can't it find my gpu?

signal patio
#

clinfo shows 0 platforms

#

Wtf

signal patio
#

Fixed it by force probe

signal patio
#

Installation script working fine but still cant get comfy working

#

Deleted old .cenv result is the same

#

@earnest grotto plz help

earnest grotto
#

if non-ipex, do
python -c "import torch; print(torch.xpu.device_count())"

signal patio
earnest grotto
#

if the bot deleted it, dm me

signal patio
earnest grotto
#

yes

earnest grotto
#

i think it didn't install right but i'll look later, doing other stuff rn

earnest grotto
#

at the end

#

add it

signal patio
#

0

#

Is result

earnest grotto
# signal patio Installation log

you can just press enter without typing anything in to proceed with the default option (You don't need to type "N" if the default is the "N" option, etc.)
Do this from that conda environment:
SYCL_UR_TRACE=1 python -c "import torch; print(torch.xpu.is_available())"
and show what errors pop up
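
The same check can be run from a small Python script, which makes it easier to capture the full output for pasting here. This is only a sketch: `traced_env` and `run_xpu_check` are hypothetical helper names, and it has to be started with the Python from that conda environment.

```python
import os
import subprocess
import sys

# The one-liner from the message above, plus the device count
CHECK = "import torch; print(torch.xpu.is_available(), torch.xpu.device_count())"


def traced_env(base=None):
    """Copy an environment mapping and set the tracing variable used above."""
    env = dict(os.environ if base is None else base)
    env["SYCL_UR_TRACE"] = "1"
    return env


def run_xpu_check(python=sys.executable):
    """Run the check in a child process and capture everything it prints."""
    result = subprocess.run(
        [python, "-c", CHECK],
        env=traced_env(),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    code, out, err = run_xpu_check()
    print("exit code:", code)
    print(out)
    print(err)
```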

#

also search for "libstdc++" inside the conda environment (the cenv folder), with the file explorer, and show what shows up
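
If the file explorer search is awkward, a short script can list the same files. A minimal sketch; `find_libstdcxx` is a hypothetical helper, and the folder to pass in is wherever your cenv folder actually lives:

```python
import sys
from pathlib import Path


def find_libstdcxx(env_dir):
    """Return every file under env_dir whose name contains 'libstdc++'."""
    root = Path(env_dir)
    # Symlinks are included on purpose: the versioned .so names are often links
    return sorted(
        p for p in root.rglob("*libstdc++*") if p.is_file() or p.is_symlink()
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    for path in find_libstdcxx(sys.argv[1]):
        print(path)
```

Run it as `python find_libstdcxx.py <path to cenv>` (the script file name is also just an example) and paste what it prints.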

earnest grotto
earnest grotto
signal patio
signal patio
earnest grotto
# signal patio

If you don't know how to activate it then how was this screenshot taken?

#

Did you just make a brand new environment yourself with that name?

signal patio
#

I'm completely lost 😆 and i don't know how to answer you

#

Im just trying to do something

earnest grotto
#

source ./Comfy_Intel/start_lowvram.sh
press ctrl+c to stop comfy.
afterwards, you should be in the conda environment.

signal patio
earnest grotto
signal patio
#

Seems like there is no environment

earnest grotto
#

Does the /home/pavelatti/Applications/Comfy_Intel/cenv folder actually exist?

signal patio
earnest grotto
earnest grotto
signal patio
earnest grotto
#

source-ing it and just running it directly are not the same

#

source it like i said
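
Why the difference matters: running the script directly starts a child process, and any environment it activates disappears when that child exits, while `source` runs the same lines inside your current shell. A rough illustration of the child-process half, using Python processes instead of shells; `DEMO_ENV_ACTIVE` is a made-up variable name used only for this demo:

```python
import os
import subprocess
import sys

MARKER = "DEMO_ENV_ACTIVE"  # hypothetical variable name, only for this demo


def child_sets_variable():
    """Set MARKER inside a child process, then report what the parent sees."""
    child_code = (
        "import os; "
        f"os.environ[{MARKER!r}] = '1'; "
        f"print(os.environ[{MARKER!r}])"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    )
    seen_in_child = result.stdout.strip()
    # The parent's environment is untouched by whatever the child did
    seen_in_parent = os.environ.get(MARKER)
    return seen_in_child, seen_in_parent


if __name__ == "__main__":
    print(child_sets_variable())
```

The child sees the variable and the parent does not, which is the same reason the conda environment is gone after running `./start_lowvram.sh` without `source`.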

signal patio
#

I ran comfy, stopped it and then pasted your command

#

What did i do wrong?

earnest grotto
#

You did not type in source ./start_lowvram, you just typed in ./start_lowvram

earnest grotto
#

and show the results

valid escarp
#

@earnest grotto I just saw your message. I created Setup_ComfyUI_Intel.py by pasting in your script. I am a noob at this, so don't mind me asking: 1. Where should I put this py file? 2. Should I run it in the conda environment, like this: (GenAI) C:\comfy\ComfyUI> python Setup_ComfyUI_Intel.py ?

earnest grotto
signal patio
#

🥲🥲🥲

#

Fkin hate comfy for this

earnest grotto
# signal patio

Delete this one. run comfy again with the shortcut/start shell script. say what happens.

earnest grotto
#

the files. not the folder

signal patio
earnest grotto
#

i'll make the script do this automatically in the future. for now, delete these files next time you use it on linux.

valid escarp
#

i already have this from the old instructions i followed. should i delete this and run ur py?

earnest grotto
#

When you run the script, it will explain to you what will happen.

valid escarp
#

I gave Y here coz why not install the stuff

main spear
#

@earnest grotto Do you have anything on CivitAI? Like models, workflows, etc..

earnest grotto
#

What do you mean by do i have

#

recommendations, or if i've personally uploaded things

main spear
#

Both

earnest grotto
#

some models I'd recommend are included in my script. flux, flux fill, noobai epspred, noobai vpred, animagine 4.0, ponyxl, dreamshaper 8 inpainting.
loras are imo too specific for what you're doing. find ones for what you want. i don't do realistic stuff so no recommendations on more general loras for things like fixing flux's incessant DoF
i have no broad recommendations for workflows. IMO you should learn how things work and make your own workflows mostly. there's the example workflow that comes with the script, for using flux. here's some workflows for inpainting with brushnet/powerpaint: #1303399518263443507 message
i haven't uploaded anything to civit. i might at some point, i've trained a pretty large amount of loras now

#

for the most part i train my own loras and rarely use others'

#

Example: Style lora for 2nd pic + character lora for 3rd pic
Highly unlikely 2nd pic style has any loras. character from 3rd pic might but I prefer knowing exactly what the dataset is and how it's captioned

#

i think the tons of mixes on civit are pointless

main spear
#

Separate question: I have 2 SSDs, one on the motherboard and one connected via SATA, roughly 2tb total, and I've noticed that some models, specifically WAN diffusion models, can be massive. I have a 5tb HDD. Can I install and run ComfyUI on that HDD without much loss in performance? So far I just swap the diffusion models from one drive to the other as I need them, but I can tell this can easily run through 1tb of storage

earnest grotto
#

Things will load substantially slower from an hdd

#

This can take a while with larger models like t5 or flux

#

on the other hand, loras are individually small. you can put them on your hdd

#

comfy itself is pretty small and lightweight

#

your virtual environment is bigger (a few gb) but i don't think hdd would make a big difference for it
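
For a rough feel of the load-time difference, file size divided by sustained read speed gives a lower bound. A back-of-the-envelope sketch; `load_seconds` is a hypothetical helper, and the throughput numbers and the 20 GB size in the usage example are placeholder assumptions, not measurements from anyone's drives:

```python
def load_seconds(size_gb, throughput_mb_s):
    """Estimate seconds to read a file of size_gb at a sustained throughput."""
    return size_gb * 1024 / throughput_mb_s


if __name__ == "__main__":
    # Placeholder figures: replace with speeds you measure on your own drives
    assumed_speeds = {"hdd": 150, "sata_ssd": 500}
    for name, mb_s in assumed_speeds.items():
        print(name, round(load_seconds(20, mb_s)), "seconds for a 20 GB file")
```

Real loads are slower than this if the files are fragmented or something else is using the drive at the same time.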

main spear
#

Is there a way to add my HDD so that Comfy can read the models that are stored there?

#

with keeping comfy itself installed on my ssd

earnest grotto
main spear
#

perfect I think thats exactly what I need

main spear
#

So, I edited the yaml and it loaded comfyui correctly and it said it added the paths, but when I go to find it in ComfyUI I can't. That link says it should be located under the "Help" button but nothing shows up

#

This part specifically is what I mean, nothing below feedback is showing

#

nevermind, the dev mode was disabled in the UI

#

nope that wasn't it, still not showing

#

Okay, I guess that's a little outdated. They just load with the other models and I guess I don't need to do the Open>extra_model_paths.yaml
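
For anyone else doing this, here is a sketch of what such an entry can look like, written as a small Python script that prints the YAML text. The section name `hdd_models`, the base path `E:/ComfyModels`, and the chosen subfolders are assumptions for illustration; compare against the `extra_model_paths.yaml.example` file that ships with ComfyUI before relying on it.

```python
# Folder types to expose from the extra drive (an illustrative subset)
FOLDERS = ("checkpoints", "diffusion_models", "loras", "vae")


def render_extra_model_paths(base_path, section="hdd_models", folders=FOLDERS):
    """Build the text of one extra_model_paths.yaml section."""
    lines = [f"{section}:", f"    base_path: {base_path}"]
    # Each entry maps a model type to a subfolder under base_path
    lines += [f"    {name}: {name}/" for name in folders]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical HDD location; forward slashes also work on Windows
    print(render_extra_model_paths("E:/ComfyModels"))
```

After restarting ComfyUI, models in those folders should show up in the normal loader dropdowns alongside the ones on the SSD.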

valid escarp
#

so is it done?

valid escarp
#

thanks a lot for the support @earnest grotto ❤️

main spear
#

@earnest grotto Do you perhaps know how to fix this? I have Visual Studio installed but I'm not entirely sure how to fix this

#

nevermind, I realize my mistake.

main spear
#

@earnest grotto Do you use the torch.compile nodes? I've been trying to get them working with no success

earnest grotto
#

no, i don't