#ComfyUI for Intel Arc using IPEX
I like z image's realistic fidelity, bet you could get some good images out of it
you could get them even gooder given it's 6b
training loras with it is all but guaranteed locally doable without needing 64gb+ of RAM that's getting more and more expensive
just need training code
Yeah, the realism's already there.
I really have to wonder now how much worse flux klein will be than this
z image kinda blows flux 2 away in realism standards
total hunyuan image 3.0 situation
I feel how this guy looks with this model.
EAT
goin home to cook him rn
now we just need z image edit
Supposedly, this is a poop atronach.
The lizard has magic poop powers.
da bear wizard
are you adding the black borders
ngl, i need to see side by sides to know what models are actually better now. I haven't used anything since flux for images, if something is better and faster that'd be amazing.
Z image turbo is better, faster and half the size of flux 1
No one wants to wait 5 minutes per image (or more). You're not using flux 2, and it's debatable if the results are better than z image anyways
When it comes to realism it seems to be... mildly worse than z image?
flux 2 just seems like the kind of model made so you can automatically generate a lot of images for your product to sell on amazon
not for a normal person to use
it's probably more accurate to say z image is slightly slower than sdxl while being better quality than flux
and again, without torch compile. lumina benefitted a whole lot more from compiling when i poked on linux than I think sdxl does
supposedly it's based on lumina. how much, is that just the text encoder part, idk
just a typical case of dataset & training methodology (& architecture?) > moar parameters
it's so wild that flux2 is almost 5.3x the size of z-image
and like 10x slower?
despite also being a distill itself
Illustrious Lumina and Animaestro aren't even out yet!
Damn
If this materializes before them...
So wild
What kind of filesize are we talking with this zimage model? This usable for the low VRAM contingent?
6B model
4B text encoder
at fp8, each parameter takes 8 bits (1 byte)
i.e. 6GB at fp8, 12GB at bf16
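Rough sketch of that arithmetic, if it helps. It counts weights only (latents, activations and everything else using VRAM come on top), and the helper name is made up for illustration:

```python
# Back-of-envelope weight size: parameters x bytes per parameter.
# Weights only; latents, activations and other VRAM users are not included.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2, "fp16": 2, "fp32": 4}


def weight_size_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[dtype]


for name, params in (("diffusion model", 6), ("text encoder", 4)):
    for dtype in ("fp8", "bf16"):
        print(f"{name}: {params}B at {dtype} ~ {weight_size_gb(params, dtype):.0f} GB")
```

The 6B and 4B figures are the ones quoted above; swap in other parameter counts to size up other models.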
regardless of low vram, offloading is a thing too
both of them are cancelled for a while already
damn
I guess everyone is talking about Z image rn?
yeah, I just read the messages lol
this model is actually insane
for such low vram use and fast speeds to produce THAT quality?
the only issue I get is that my comfyui doesn't want to update thru comfy manager
thanks!
well those aren't really relevant to you i think, you wanted more realistic stuff right
z image itself does that
i personally am interested in those because the best (local) anime models are still sdxl finetunes even after so long and so many other new models
disty might be interested if he doesn't know about those 2 already, and there's some other people here who liked doing anime stuff too
i guess novelai is good but it's not local
How are you generating images under 30 seconds? I'm new to ComfyUI. I'm starting it with "python main.py --fp8_e4m3fn-unet" and have set weight_dtype - fp8_e4m3fn in the workflow. It still takes 216 seconds to generate 1024x1024 image with 20s/it. I'm using Arc 770M with 16GB Vram and 64GB ram. 9 steps.
--fp8_e4m3fn-unet
don't do that
amount of time it takes depends on model and other things, what model
z image?
yes
if you're using z image and want it in fp8, do that from the selector
in the node to load the diffusion model
if you fp8 sdxl i'd expect it gets cooked
run with --lowvram as well and say what happens
Do I need to download different model? I see only bf16 here: https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files/diffusion_models
I see fp8 is here: https://huggingface.co/drbaph/Z-Image-Turbo-FP8/tree/main I guess I will give it a try
Still slow if I switch it here
@hard sphinx
You don't want to use the fp8 launch argument you do because of this
if you download and use the fp8 model, it will load a bit faster
your 20s/it times are not because of that
--lowvram helped, now it is 2.3s/it
with fp8, bf16 is still slow, but I guess 16GB is not enough
if you want block swapping, --reserve-vram 7
at bf16, it's 12gb model + latents + windows and your screen + browsers or whatever else, you are very close to running out of vram
there is no point in having the model cast to bf16 entirely if it's the fp8 model, though i'm not sure how you'd even do that
arc has no fp8 hardware iirc, or if it does it is not used in pytorch. comfy has its own special stuff that makes normally fp8-incompatible GPUs work
while stuff is stored in fp8, calculations are done in bf16
you can't squeeze out more precision from the fp8 model by upcasting it to bf16, that precision is gone
it's just for doing maths that the gpu can do
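To make the "that precision is gone" point concrete, here is a small simulation in plain Python. It is only a sketch: round_to_e4m3 is a hypothetical helper that mimics rounding to fp8 e4m3fn (3 mantissa bits, largest finite value 448, saturating instead of overflowing), not what comfy actually runs internally.

```python
import math


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest value representable in fp8 e4m3fn (simulated)."""
    if x == 0.0 or math.isnan(x):
        return x
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)      # e4m3fn has no inf; clamp to the largest finite value
    _, exp = math.frexp(mag)      # mag = m * 2**exp with 0.5 <= m < 1
    exp = max(exp - 1, -6)        # exponent of the leading bit; -6 is the smallest normal exponent
    step = 2.0 ** (exp - 3)       # 3 mantissa bits -> 8 steps per power of two
    return sign * min(round(mag / step) * step, 448.0)


original = 0.3
stored = round_to_e4m3(original)  # what an fp8 checkpoint would keep
upcast = float(stored)            # widening the type does not change the value
print(original, stored, upcast)   # the rounding error from storage is still there
```

Upcasting only changes the container the number sits in so the GPU can do the maths; the stored value stays at the rounded one.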
ok, thanks for explaining, really appreciate
just keep using the fp8 model and keep --lowvram
there is practically no reason to not have --lowvram
it makes text encoders run on the cpu. 99% of them are so fast that's something like a 2 second difference (total time). only exception is qwen image, the not plus (2509) one. which, with 2509 now, it's redundant so whatever
I was wondering how much the offloading of clip/text to cpu mattered to speed
Good to know 👍
I updated comfyui by rerunning the setup script (to try z image), and when i try to run my old workflow with tiled vae decode i get this now:
AttributeError: 'NoneType' object has no attribute 'cudaHostRegister'
@earnest grotto And since this is only the turbo model released, we still have the normal and edit models to wait for which is gonna be awesome.
I'm ready for it.
I wonder how it will stack up against qwen edit
Same company different divisions(?)
If it can fill in the gaps that qwen edit can't do, without taking 5 minutes per image like flux2 that'll be pretty nice
or hell, even if it can't, i'll be able to train it at least
for loras, i really don't think making edit datasets is too hard
for the scale of a whole model yeah, but for loras, mmm
like, surely an edit model will be able to learn how to totally kill any sort of depth of field and blurring through blurry images -> impossibly sharp but still realistic 3d renders
the blender foundation has released and still releases a lot of free scenes
if you've seen this specific classroom before
2.82, yes
yeah
I used blender for a period of time when modding bonelab
made a gun, weightpainted
etc
nice
in return for that i essentially got my very own custom metrocop textures from someone as a trade for weightpainting halo models for them
hl2's usp always has me expecting usps to be whiter and more matte
yeah
https://www.youtube.com/watch?v=ah3BFn0AvQI I asked you a while ago for a ranni voice
I'm happy with the result
neco arc singing videos are truly a miracle
kinda odd that no one has made dagoth sing but oh well
i did a skyfall 007 cover with TF2 spy but they took it down lmao
even though they took it off I still have it ofc
There are parts that can be fixed, like the fact that I couldn't separate the top and bottom voices from the middle and near-end of this song
But I think the spy voice is pretty good
best parts are definitely the beginning and end
Z image just floors flux2, man.
tried to update from the new script and now i get black images generation and this error:
C:\Modding C\Stable Diffusion\Comfy_Intel\ComfyUI\nodes.py:1594: RuntimeWarning: invalid value encountered in cast
img = Image.fromarray(np.clip(i, 0, 255).astype(np.uint8))
b580?
update the script if you haven't (2.5 now), run the script again, install torch 2.8. don't use stable or nightly, those currently break with battlemage and produce black images. Hopefully soon™ this will be fixed in the nightly, which will eventually be the stable
the script should highlight 2.8, if it doesn't, tell me
you can also just press enter to pick the default option
ok cool i was on stable
the script was highlighting stable btw
0.2.5p
thanks that fixed it!
updated script, now should highlight the correct option and also inform you if there's a newer version
Just got Z_Image running on my A580, how much faster is the B580?
I'll look into getting it set up with my B580 when I get home tomorrow. Then I can compare, if no other B580 owner does first
launch comfy with --lowvram --reserve-vram 7
if you're using my script, the lowervram shortcut does this
I have a B580 coming in the mail today, I was just wondering if anyone knew off the top of their head. Thanks for offering though.
You are highly likely running out of vram. You can make the a580 not run out of vram and then I'd expect about 4-8s/it?
And I'd expect the b580 to have... 2-4s/it?
Extremely rough numbers
I got the generation time down to 30 seconds using --novram and --reserve-vram 4.0
Use the arguments I told you
Is it possible to use QwenVL on Intel Arc in ComfyUI? KSampler node works fine and it runs on GPU, but QwenVL node from AILab only uses cpu. Is there other node that does not rely on cuda?
Ah, so I take it Comfy's implementation is fried. I got ~9s/it
their sampling is fried too
you have to hijack in diffusers and run diffusers sampling to get rid of the comfyui noise on z-image: https://github.com/erosDiffusion/ComfyUI-EulerDiscreteScheduler
damn, that's pretty sad for comfy
by the way, what are your lumina 2 speeds
I was getting 1.6s/it without compile and 1.0s/it with
Ah, nevermind on the 9s. Comfy's new nodes are just crazy broken, that was at 2048^2 when the nodes were set to 1024^2. Actual speed at 1024 is 2.11s/it, still slower than diffusers though
Exact same speeds as z image
Make it 2x slower for CFG
anyone still using ipex-llm? does it work with latest ipex 2.8 ?
Ipex llm doesn't actually use ipex, and ipex-llm is no longer being developed
i swapped to lm studio
since it's just a llama-cpp wrapper
z-image on a770 with sdnq int8 matmul + torch.compile
1.75 s/it is the normal speed
im running the q8 gguf version and on an a770 it's 3s/it
A770 has only 2x INT8, 4x is not possible
city96 gguf is slower than ram offloading btw
you should use ram offloading instead
So just straight safetensors?
🤷‍♂️
yes
if you want to use real gguf, use sdcpp or llamacpp instead
city96 gguf isn't gguf
anyone tried it out on B580 yet?
get a chance to install that new GPU?
guess i gotta live with city96 in comfyui
all the models i have in comfy are gguf
I guess comfy fixed their stuff
This is outdated now. I originally wrote a somewhat wordy post comparing the performance of the two and speculating on why my card (2060 6GB) was faster on sdcpp. Since then comfy pushed a fix which remedied the performance discrepancy, and outperforms both by a wide margin.
Why not try it out yourself? Nothing bad's gonna happen
It's basically SDXL speed (when Comfy's working properly), I doubt that'd be much different for the b580 and even if it were a bit slower than sdxl it won't be by a lot and it's still a good model
comfy stopped working the moment i updated it
lmao
I'm getting invalid argument errors for the clip text encode from multigpu
the latest update broke ggufmultigpu nodes
nvm im getting an invalid argument at ksampler now using normal safetensor models
driver/pytorch/oneapi/... bug
intel issue
most likely
could be down to specific pytorch versions, have to test
Alright, let me swap to 2.9.1 stable then.
steam decided to pop up the hardware survey as i was updating drivers
yep 2.9.1 is working, with gguf models
yeah 2.10 bug then
torch.count_nonzero is broken
by the error I'm guessing someone is indeed like it says calling their kernel with the wrong number of arguments (or they made it take the wrong number), i'm sure they'll fix it
rather than it being completely unrelated like other errors
pytorch 2.9, a770, bf16 z-image with
--reserve-vram 11 (~6gb actually reserved on windows): ~2.6-2.7s/it
--reserve-vram 7 (~2gb): ~2.32s/it
--reserve-vram 6 (IIRC I've seen AIPG use this at one point, IMO it's too low at ~1GB): ~2.06s/it
no reserve-vram: 1.76s/it
no reserve-vram, comfy's own fp8_e4m3fn: ~2.07s/it
no reserve-vram, fp8_e4m3fn_fast: ~2.07s/it
I guess they did optimize it to be as fast as diffusers now that it's 1.76s/it
hopefully they fix or fixed the sampler too
will be interesting to check out comfy's own training nodes once the base model pops up
Oh I intend to, I know nothing bad will happen. I am asking out of curiosity.
I don't get on my desktop every day, and I haven't had a free moment to go tinkering lately. So I just haven't gotten around to it myself.
updated comfyui, ran the 'image_z_image_turbo' workflow it has by default and using all default settings. 2.31s/it rate for 1024x1024 9 steps
pretty darn fast and an impressive generation
Crucially, with pytorch 2.8
Huh, actually, going back to it I'm seeing variability in generation speed. Went as fast as 1.77s/it
Running 8 batches now to get a feel for the average.
as fast as 1.59s/it
using the lower vram script (--reserve-vram 11 iirc) led to great stability but slower generation. I believe it was around 2.4s/it. But using the normal script I was getting crashes occasionally and would have to reboot comfy.
The good news is that I can confirm the black image error is still not happening with the latest drivers, same as with yesterday's version
(maybe) black image generation fixed for Battlemage in latest nightly pytorch
If you want better performance you can lower the reserve vram amount, however i've upped it because intel still has some iffy memory management bugs
and even with it this high, i've had issues, and I've seen others having issues too. the driver(?) randomly decides to die sometimes, which will break comfy
going as low as 7 is generally ok. 6 gets risky. on windows, imagine you are reserving 5gb less than the number, so 7 would be reserving 2gb for the rest of the OS, and 6 would be 1gb
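That rule of thumb as a tiny sketch. The 5gb offset is just what's reported in this chat, not a documented constant, and the function name is made up:

```python
import platform

WINDOWS_OFFSET_GB = 5.0  # assumption taken from the chat, not a documented value


def effective_reserve_gb(flag_value: float, system: str = "") -> float:
    """Estimate how much VRAM a --reserve-vram value really leaves free."""
    system = system or platform.system()
    if system == "Windows":
        return max(flag_value - WINDOWS_OFFSET_GB, 0.0)
    return flag_value  # on other systems, assume the flag means what it says


for value in (11, 7, 6):
    reserved = effective_reserve_gb(value, "Windows")
    print(f"--reserve-vram {value} on Windows ~ {reserved:.0f} GB actually reserved")
```

Going the other way, a value that worked on Windows would be roughly that number minus 5 on a system where the flag behaves as written.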
I plan to experiment with it 👍 and find a good middle ground
if you are on linux, try 6, 2 and 1 directly
hmm, I should make the script change the reserved amount when used on linux
I am. I plan to try around 6 to begin and see how it goes
When running SDXL I just use the normal one with no reserve, but z image is a bit chonkier
z image works with this
flux 2 also works with the q8_k_xl quant of mistral 2506 instruct
all the gguf quants for llms in comfy are now that quantization unless it doesnt exist, then its q8
but for text gen in lm studio, q4_k_xl
i would do q8_k_xl ministral quants in lm studio but it's too big to fit in vram
I finally got everything working well but using SDXL and a bunch of models based on it I can't get crowds or multiple people, 9/10 pictures doesn't get more than 1 subject when I enter such prompts. Any model recommendations?
Also the instructions I followed didn't download and use any clips just checkpoints and VAE is that fine?
Because SDXL checkpoints have them integrated. They are AIO models, all in ones.
If you're looking to do newer models, Z-image, Qwen Image and flux 2 are choices you could look at.
alright, thanks
do those also have everything integrated?
holy 64GB flux 2
tried using Z-image. seems to need 17gb VRAM and crashes.
"torch.OutOfMemoryError: XPU out of memory. Tried to allocate 76.00 MiB. GPU 0 has a total capacity of 11.59 GiB. Of the allocated memory 17.78 GiB is allocated by PyTorch, and 99.50 MiB is reserved by PyTorch but unallocated. Please use empty_cache to release all unoccupied cached memory."
any solutions?
Same issue with 9 steps
Did you install comfyui through vik's setup script? It should already have --reserve-vram, ipex_to_cuda and such to prevent OOMs.
It should allow you to run most of these models with offload.
Also, I'm using gguf q8 models to halve the vram requirements.
yeah im no expert but the --lowvram parameter i believe is supposed to solve this problem by segmenting the workload and offloading part to CPU
A bunch of stuff kept giving errors so I updated to the latest versions. I'll try adding reservevram and lowvram on launch and try it out again later
ok so lowvram alone didn't give an error but the whole thing crashed and disconnected
Same when adding reserve vram 2
I personally use --reserve-vram 8 but i don't mind the slight slowdown over ooming
How much ram do you have
Oh, nevermind
16 is too little
do get the q8 or an fp8 version
I'm getting edged over here. Fp8 was going well but crashed at vae decode
get the text encoder in fp8/q8 as well. if you're still running out of ram, get lower quants or run comfy with --cache-none. This makes it reload the respective model every time then unload it immediately when done, this will make things much slower but you won't run out of ram
the vae is like 200mb at bf16. and if you quantized it any lower it'd probably break
Do what I said
Get lower quants, and if you're still crashing then --cache-none
You are not running out of ram because the vae is too big, it's most likely because the latents are too much for your ram and there's not much you can do about that, other than use a lower resolution
or maybe tiled decode but i kinda doubt tiled would change much
I just used cached none before moving on and it worked. 2 s/it btw.
I'll try to see how much less vram i can reserve before it crashes again
Thanks a bunch
Congrats! Oof 16gb of ram yeah that was crucial info. GJ getting it running on that
Hi! I’d like to ask for your help. I’m getting the error message shown in the picture. Could you tell me step by step what I need to do? ComfyUI was installed before and it worked, but now I’m stuck. I tried to fix it based on the error message, but it didn’t work.
pip install av
Worked for me
I spent wayyyy too long doing absolutely unnecessary stuff for like 3 days trying to fix it. All it took was one line. 🥲
Thanks a lot for the reply. I tried it, but I’m still getting an error message. It looks like the comfyui-frontend-package isn’t installed. Can I install this separately, or something else?
Try downloading the latest versions of everything. That solved some of my problems. I don't remember exactly what I downloaded the latest versions of, but I remember doing that for intel's pytorch extension and the requirements-ipex.text
This was one thing I did
python -m pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/xpu
python -m pip install intel-extension-for-pytorch==2.8.10+xpu --index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
Thanks! It’s looking much better now, but there’s still some kind of issue. I’m leaving the screenshot here.
I remember this error specifically. read the instructions above the error
Go to the file location it tells you and run the python.exe as admin
Then copy paste the install command
I think I might have run through the exact same sequence of problems, I just don't remember it very well 😂
Thanks a lot! It is working!! 😁 😎 Have a good day, Bro!

so I removed lowvram and reserve vram altogether and it still works
it's gone down from 130 seconds to 78
batch size of 4 that is
or is that because I was using the same prompt over and over without reloading the UI
Nvm those were all tests on the first run
Forgor
Still though it works just with python main.py now
you shouldn't need reserve-vram if you're using the model at fp8 or the q8 quant, or lower, since it will be small enough to fit in vram
lowvram is redundant if you have cache-none
If kijai has nodes I would recommend them for block swap, you usually just have to edit a line of code for A-series to work, but battlemage shouldn't have issues at all
Few days ago I've asked if there is anything similar to Qwen3 VL custom node that can run on Intel GPU. Turns out it is super simple to modify ComfyUI-QwenVL node to use XPU instead of cuda. Now it runs super fast compared to cpu 🙂
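For anyone wanting to do the same with another node, the usual pattern looks something like this. It's a generic sketch, not the actual ComfyUI-QwenVL code (pick_device is a made-up helper), and it assumes a PyTorch build that ships the torch.xpu module:

```python
def pick_device() -> str:
    """Return "xpu", "cuda" or "cpu", preferring an Intel GPU when PyTorch sees one."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed, nothing to pick from
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(f"would load the model on: {device}")
# In a custom node this would stand in for hardcoded "cuda" strings,
# e.g. model.to(device) instead of model.to("cuda").
```

Nodes that hardcode "cuda" in several places need each of those swapped, so searching the node's source for "cuda" is a reasonable first step.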
anyone have problems running z image with lora? it returns only black images for me if i use it with a lora
try with q8 ZIT
zit = z image turbo so i assume he means fp8 or q8 gguf
fp8 is what I'm using, i can try that gguf version i never used it
the images i take the workflow from use the fp8 i think
not fp8, q8
is the only q8 the gguf one, or there is a "normal" one? I tried it but i get this error: "size mismatch for x_pad_token: copying a param with shape torch.Size([3840]) from checkpoint, the shape in current model is torch.Size([1, 3840])." when loading the gguf model
Update comfy and the gguf custom nodes
the new script 0.2.6p doesn't open for me, just the cmd window flashes for a brief time and closes. I think i have all requirements already (python, miniconda) as I used the old version of the script in the past
Right click the background of the folder the script is in, open a command prompt.
python ./Setup-Comfy-Intel.py
And show what it says
you forgot ./
i had it in the first
thanks, is running now!
Is it a new dependency or did I break something?
it's something i expected to exist by default in python
also should I leave stable or 2.8?
apparently it's not in by default
use stable
Thank you again, now the lora work with the normal fp8 model!
apparently their diffusers PR isn't accepted yet and they rushed it a bit. also said PR has a hard requirement for flash attention even though they state it and triton are optional ("recommended")
and autodownloading doesn't work
damn
I tried to run z-image template from comfy but got this error..
How did you install Comfy?
CUDA in those error messages is a reference to Nvidia cards
it's ok, i just code it over now it works
what pytorch are you using? xpu? have the ipex to cuda hijacks?
disty fixed it
nice
certainly getting undercooked vibes from that image
huh, so it fries at 2.5 cfg
1.0 CFG, seed 42
1.5 CFG
1.75 CFG
2.0 CFG
2.5 CFG
1.5 CFG, seed 43
+<copyright>...
<copyright>..., 2.0 CFG
<>, 2.0 CFG, 50 steps
rip. worse than neta
will local anime models ever move on from sdxl finetunes. jesus
getting an urge to try out HDM and compare
lumina 2 itself? i can definitely see that
zit seems to make the lumina 2 architecture work though
maybe i'm just holding onto my hopes that onoma's cancelled lumina 2 finetune would've worked out well. even when very undertrained i feel it was promising
unlike neta
zit is fixed and improved lumina2
it is not lumina2
different architectures
zit also runs 2x faster than lumina2 even though it is almost 3x larger
i skimmed the 2 papers and the 2 diagrams look near identical, except different rope which they say is for image editing, makes sense, and of course different text encoder, dimensions of stuff and training etc.
hidden dim of 3840 is pretty cool for intel with int8 i guess
dunno. i'm kinda out of my depth
Hi! I’m experimenting with Z-Image. I can run one generation at 1.94 it/s, but the next one only runs at 70–80 s/it. Can you tell me what I need to change so I can consistently get the same performance?
how much ram, what gpu
are either or both z-image and qwen fp8 or bf16 or
don't use the new nodes
they're broken
most of those nodes could have different values than what they show
and a lot of things are not visible (e.g. the setting to not change seed every time is missing with nodes 2.0)
Should I try another workflow?
turn them off
they're purely visual. the visual is broken and worse
It is off
it was off, or you changed it to off now
It was off
open task manager. go to the 2nd tab. select your GPU. move task manager somewhere onscreen
run the workflow to when it slows down to 70s as you say, show pics of both task manager and the whole workflow
don't crop out task manager. the ram is normally visible even when you have the gpu selected
With --lowvram i have this feature 😅 I already tried it, but because of this I’m not using it.
in the future - if your vram spills out into shared memory, you're running out of vram
but i think this can happen when i don't have much space on disk
you're using it at fp8 anyways. just download some pre-existing fp8 quant of the model
you can also make a workflow with purely a model loader plugged into the save model node and just set it to fp8
besides not running out of ram, it will also load much faster since ukno, model will be half the size
Thank you for your help! The downloads are in progress. I can’t test it today anymore, but I’ll definitely follow the advice and continue this way.
no longer need to use --disable-ipex-optimize flag?
it's not needed yes
lol zimage lora training speed with musubi when i run out of vram is comparable to lumina 2 with kohya when i don't (though with ~0.6mp images vs 1mp)
165 steps later and the broken memory management kills me. dammit
So I installed using Vik's script in a new location. Downloaded the nightly (2.10.0.dev20251205+xpu) and ran 52 images of the default SDXL workflow with default settings and prompt. The one using the base model and refiner.
No black images, with B580
Looking like they truly fixed that problem 👍
Same thing on RX 7900 XTX runs at 1.38 it/s
A770 with INT8 is very close to RX 7900 XTX
Damn that's fast
speedup seems to be so big, z-image with compile and int8 with cfg will likely match lumina 2 without
that base model is getting ever so exciting
makes me think i should set up wsl2 and use linux compile
compile speedup depends on model
I still should though. Shouldn't take long either, since it just means I'll move my comfyui setup to wsl
lumina, z-image and probably newbie but you don't wanna use that model all get a very big boost from compile alone
z-image is also structured in a way that makes intel's int8 perform pretty well with it
z-image should be 3800-something idk scroll up or check their paper
qwen and flux are 3072 or so
@earnest grotto Well I have comfyui set back up through linux, with torch 2.9.1+xpu
Are there any specific things needed to be done?
I'm currently getting slower speeds on the Q8 Z-image model I'm using, from 3 to 3.6 seconds.
it's the comfy examples z-image workflow with ggufmultigpu loading nodes
--bf16-unet --disable-ipex-optimize --lowvram --reserve-vram 8.0
why are you using reserve-vram with zimage
cuz i dont just use z-image
also, reserve-vram works correctly on linux, it's off by 5gb on windows. if you were using 8 on windows, that'd be 3 on linux
Oh.
What compile node are you using?
2.6s/it with no reserve vram
on the gguf q8
i haven't used any recently since i haven't used linux in a bit. use the default compile node.
I swapped back to windows due to compile errors on 2.9.1. Probably something I caused, but it's all good.
Speeds were basically the same non-compile, don't know what I expected.
Left a zimage lora to train overnight for ~3600 steps.
it crashed again
this time at 460 steps
amazing
so much for assuming I didn't need to have a script to constantly restart training running anymore
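A restart wrapper like that can be pretty small. Minimal sketch below; the command is a placeholder, and it only helps if the trainer saves checkpoints and resumes from them on its own (that part is an assumption, check the trainer's options):

```python
import subprocess
import sys
import time


def run_until_success(cmd, max_restarts=20, delay=10.0):
    """Run cmd, restarting after every non-zero exit. Returns the attempts used."""
    for attempt in range(1, max_restarts + 2):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} exited with code {result.returncode}", file=sys.stderr)
        if attempt <= max_restarts:
            time.sleep(delay)  # give the driver a moment before relaunching
    raise RuntimeError(f"gave up after {max_restarts} restarts")


if __name__ == "__main__":
    # Placeholder command; swap in the real training invocation.
    attempts = run_until_success([sys.executable, "-c", "print('training step')"], delay=0)
    print(f"finished after {attempts} attempt(s)")
```

Capping the restarts keeps it from looping forever when the crash is something a relaunch can't fix, like a broken config.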
mm... even with the dedistill, the turbo model is way too unstable
rip (well, maybe better captioning will save it)
Hi, I have the B580. I was wondering if I can run the Z-image model within AI Playground, or is it better to just follow the instructions above, which gave me a connection error, probably because of low VRAM?
AI playground has comfyui
The instructions at the very top of this channel are outdated and not necessary, things have been simplified
okay! so which model do i need to download for my b580
and all the dependencies for comfy ui are already installed right?
Yes when you install AIPlayground it installs ComfyUI completely and simply.
AIPlayground is actually just a simple GUI stacked on top of ComfyUI
do i need any workflow file? I put all those 3 files in comfyUI folder then how do i work with it
Comfy has a z-image workflow built in, by default
Here's one with its group node ungrouped. replace the bf16 model with the fp8 one.
How much ram do you have
16GB DDR5 6000MHz
and B580
aipg
Get versions of these that will fit in your ram. close everything else open to free up ram. Consider getting more ram... In 2 years when the insane ram prices go down?, if they go down... Given micron's exiting, things are looking pretty grim. 16 was already kinda low for gaming nevermind AI, wonder if newer games will get more reasonable on RAM usage or what
For Qwen, don't get IQ, only Q3, Q6 or Q8. not sure if the IQ ones will work
Lower number after Q = less ram, less quality. Q8 is fastest, Q4 is second, all the other ones like Q3 or Q5 are slower.
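For a feel of what the number after Q means in size terms, here's a rough estimate for the two simplest GGUF types. Real files come out a bit different since some tensors stay at higher precision, and the K and IQ variants use other block layouts that this sketch doesn't cover; the helper name is made up:

```python
# Bytes per block of 32 weights for the two simplest GGUF types:
# Q8_0 = 32 int8 values + one fp16 scale, Q4_0 = 32 packed 4-bit values + one fp16 scale.
BLOCK_BYTES = {"Q8_0": 34, "Q4_0": 18}
BLOCK_WEIGHTS = 32


def gguf_size_gb(params_billions: float, quant: str) -> float:
    """Approximate size in GB if every weight used the given block type."""
    return params_billions * BLOCK_BYTES[quant] / BLOCK_WEIGHTS


for quant in BLOCK_BYTES:
    bits = BLOCK_BYTES[quant] * 8 / BLOCK_WEIGHTS
    print(f"{quant}: {bits} bits/weight, 6B model ~ {gguf_size_gb(6, quant):.2f} GB")
```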
Also, consider upping your pagefile
can't seem to use the GGUF files, it says only safetensors are needed
Use them with the gguf loader nodes
from here?
Install the gguf custom node https://github.com/city96/ComfyUI-GGUF
got this then tried to install comfyui manager then got another error
[Installation Errors] 'ComfyUI-Manager': This action is not allowed with this security level configuration.
I dunno how to fix that within AIPG's Comfy, sorry
I have a script that installs comfy for you, standalone. 3rd pin in the channel
@earnest grotto it seems pytorch 2.10+xpu is released under stable now
and it's also got invalid argument errors like the nightly
rip
i guess i should be more proactive in reporting issues like this
ah, well, good
https://github.com/pytorch/pytorch/issues/170166
until then i'll just use pip3 install torch==2.9.1+xpu torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu
its pinned
literally the third message
Not working 🥹
Oh.
Are you getting invalid_argument errors in comfyUI?
^
The current version of the script installs the latest stable (2.10.0+xpu) if you choose stable. Both nightly and stable right now are broken on ComfyUI.
In order to remedy this, you need to downgrade the version of pytorch installed by the installation script.
pip3 install torch==2.9.1+xpu torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu
You need to go into the comfy_intel directory, enable the cenv and install torch2.9.1+xpu
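If you're not sure which build ended up installed, a check along these lines can tell you. It's a sketch: the "reported broken" list is only what this chat says (2.10 on xpu), and the helper names are made up. In a real environment the string to check comes from torch.__version__.

```python
def parse_torch_version(version: str):
    """Split a string like '2.9.1+xpu' into ((2, 9), 'xpu')."""
    base, _, local = version.partition("+")
    major, minor = base.split(".")[:2]
    return (int(major), int(minor)), local


# Assumption: the only entry is what this chat reports as broken with ComfyUI.
REPORTED_BROKEN = {(2, 10)}


def reported_broken(version: str) -> bool:
    release, local = parse_torch_version(version)
    return local == "xpu" and release in REPORTED_BROKEN


for v in ("2.9.1+xpu", "2.10.0+xpu"):
    print(v, "-> reported broken here" if reported_broken(v) else "-> no report here")
```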
This is very confusing. I don't understand much of it, but thank you.
Where are you stuck in terms of installing via the script?
so can I just use z image directly in ai playground or how does that work?
I don't believe so, because AIPlayground is curated so unless an update has gone out to add z image in the fields, it won't be there.
But AI Playground has a 'launch comfyui' button that sends you to the standard interface. It will work from there
I havent used AI Playground in several months. I noped out of Windows officially
when i run that bat file it shows error
That's not a bat file, it's a python file, and you're copypasting it into the command prompt
Do what the instructions say, download it, don't copypaste it into the command prompt, and run it from the command prompt
Or set your python up such that it can run the script by double clicking
which one should i select?
3
^
Double click somewhere on the workflow's background, type in gguf, use the diffusion model and text encoder gguf loader nodes
i have downloaded bf16 already, Do I have to download that gguf fp8?
Sorry, I mistook you for the other person
Just set the model to load at fp8
crashing
What CPU, what GPU
Ryzen 5700x ARC A770
How much RAM
32GB
Close everything that's eating up ram and start over
Especially if your browser is hogging 20gb
Open task manager, go to the 2nd tab, select the GPU, show a screenshot without cropping such that the RAM usage is also visible just before and after the crash
.
https://huggingface.co/drbaph/Z-Image-Turbo-FP8/tree/main
Download and use this already-fp8 model instead of the bf16 one. say what happens
both are same?
use e4m3fn
now working fine with e5m2
but what was the problem with bf16?
you ran out of ram and have no pagefile
there's a spike in ram usage while loading a model. that's when you ran out.
specifically, 32gb is fine, you should just go set some proper pagefile, unless you're extremely worried about your SSDs/HDDs
where is this button? I can't find it, I'm on version v2.6.0 tho
Click these red areas in order from top to bottom
Gear, Image, Workflow
Then that blue Open ComfyUI link will be there (right above the workflow button)
thank you
z image works flawlessly on my A750 and 16GB RAM
do you think it's worth trying q8.gguf version over fp8.safetensors?
q8 gguf has less perplexity (meaning it's more true to fp16) than fp8, but it's slower
Both are very usable
but is it visibly better than fp8?
or just different
could you tell which one is which if you didn't know which is which?
i know there's a difference between let's say q8 and q6
but I don't know how fp8 compares to q8
I'm just curious
I don't exactly know how quantizations work so I don't know what technically I lose
sorry if my questions are annoying
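A toy comparison of what the two 8-bit formats throw away, as a sketch rather than the real loaders. The block-wise function is modelled on GGUF's Q8_0 layout (blocks of 32 values sharing one scale, which is stored as fp16 in the real format and kept as a plain float here). The fp8 function is a simplified e4m3 rounding with 3 mantissa bits and a maximum of 448. The Gaussian "weights" are an assumption standing in for real model weights.

```python
import math
import random

def q8_0_roundtrip(values, block_size=32):
    """Block-wise int8: one scale per block, integers in [-127, 127]."""
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        scale = amax / 127 if amax else 1.0
        out.extend(round(v / scale) * scale for v in block)
    return out

def fp8_e4m3_roundtrip(x):
    """Simplified e4m3 rounding: 3 mantissa bits, clamp to 448, no NaN handling."""
    if x == 0.0:
        return 0.0
    x = max(-448.0, min(448.0, x))
    _, exp = math.frexp(x)       # x = m * 2**exp with 0.5 <= |m| < 1
    exp = max(exp, -5)           # below 2**-6 the spacing stops shrinking
    step = 2.0 ** (exp - 4)      # gap between neighbouring representable values
    return round(x / step) * step

def mean_abs_error(original, restored):
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)

rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in for real weights

q8_err = mean_abs_error(weights, q8_0_roundtrip(weights))
fp8_err = mean_abs_error(weights, [fp8_e4m3_roundtrip(w) for w in weights])
print(f"q8_0 error: {q8_err:.5f}  fp8 e4m3 error: {fp8_err:.5f}")
```

On this toy data the block-wise int8 error comes out smaller, because every value in a block gets 8 bits of evenly spaced precision relative to the block's largest value, while fp8 spends 4 of its bits on the exponent. The price is the extra scale per block and the dequantization step at run time, which is a plausible reason for q8 being somewhat slower.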
anime is probably not great for a comparison like this
Actually a very illuminating example tbh
always found it crazy that the smaller gguf's are slower than a q8 quant with most models.
You are not going to be using the model for anime, and these non-anime art models are very weak at anime/art. Flux dev needed cfg to make a darker skinned anime girl as another example. Yet, I expect that it can generate a realistic image fine without forgetting the skin color at lower quants
also i think longer text is more affected by lower quants, this speech bubble isn't super long
since it's weaker at longer text too, naturally
.
To even get it to generate that skin color in the first place you have to go out of your way and actually use cfg for a model that was not intended to be cfg'd
nevermind very much not intended for art
z-image is at least a bit better in that regard
but it's still ultimately not an anime model
And, different models respond differently to quantization, you can't quite compare them the same way. E.g. Wan 2.2 I2V's low noise model you can quantize down to q4 with basically no loss in quality
the high noise model otoh had more of a difference, with anime, but hey, there's no anime video model
Another tts voice cloning model, Cosy Voice 3
https://www.modelscope.cn/studios/FunAudioLLM/Fun-CosyVoice3-0.5B
Huggingface space but chinese ^
Topmost text box: text to generate
2 radio buttons: 3 second fast inference | instruction-guided
Box to drop the text file is in english. Don't use audio >10s for the 3 second inference?
Textbox below it is automatic transcription, and last textbox is instruction (for shouting, etc.?) (only applicable when instruction-guided)
GLM-tts as well btw
cant get it running tho
havent found any good alternatives to Vibevoice yet
and that model is slow
I tested both fp8 and q8 version and q8 is like 3 seconds slower than fp8 on average on my PC
44 vs 49 seconds
but q8 does a visibly better job at generating text
there are two Q8 and two FP8 images
just by looking at the small text you can tell which one is which
I think i found that with most models q8 and fp8 were about the same, but stuff like q4 would be much much slower than both. i guess because of decompression
guys i tried to use seedvr2 but I can't download custom nodes it says "User provided device_type of 'cuda', but CUDA is not available." in comfyui manager. I don't know how to go around it
aw man
how many ComfyUIs do I have to install
the one from the AI playground won't work?
How did you install it?
if you just grabbed something and ran it, it probably defaulted to CUDA. Because that is 98% of the userbase
So (guessing here) you installed something that installed expecting CUDA and now CUDA is not present
AIPlayground is not intended for you to go installing custom stuff its just supposed to be click and play. You can expand the available ComfyUI options but you're on your own with it.
Vik has created the most curated Intel GPU option for running ComfyUI that exists anywhere. It just requires a little more work to setup vs AIPG which is just an exe
idk if the workflow is good
i just copied it from reddit
@lunar thicket
I saw someone here using SeedVR2
They probably used Vik's script lol
Just grabbing random stuff off reddit is likely gonna give errors 👍
@somber trellis how did you set up SeedVR2 upscaler?
Probably best to post in the ai playground section for that version
Unsure what I am doing, trying to generate anything and it keeps kicking back with out of memory errors.
Is it using any of my memory?
loaded partially; 0.00 MB usable, 0.00 MB loaded, 1639.41 MB offloaded, 168.76 MB buffer reserved, lowvram patches: 0
Exception during processing !!! level_zero backend failed with error: 39 (UR_RESULT_ERROR_OUT_OF_DEVICE_MEMORY)
Show the workflow, explain how you installed comfy and say what arguments you're launching it with
Attaching workflow...
Followed the pins and when installing
pip install -r requirements-ipex.txt (if using Intel Arc dGPU ie A750, A770)
It gave an error, saw your script, ran it and installed ComfyUI correctly and it's currently "running" with the simple image generation template " v1-5-pruned-emaonly-fp16 "
what driver version
restart your pc, try again, say if it's still broken
Driver version of what? and ill restart
your gpu
Restarted and still kicks back with the same error
loaded partially; 0.00 MB usable, 0.00 MB loaded, 1639.41 MB offloaded, 168.76 MB buffer reserved, lowvram patches: 0
level_zero backend failed with error: 39 (UR_RESULT_ERROR_OUT_OF_DEVICE_MEMORY)
Intel® Arc™ B580 Graphics
Version
32.0.101.6739
and what pytorch version did you pick when installing
stable?
There is an update available for the gpu and yes, i did all the default the script had.
When you launch comfy, in its command prompt it says, among other things:
ipex_init: (True, 'Skipping IPEX hijack')
Total VRAM 15931 MB, total RAM 49075 MB
pytorch version: 2.9.1+xpu
Set vram state to: LOW_VRAM
Device: xpu:0 Intel(R) Arc(TM) A770 Graphics
what pytorch version does it say
ipex_init: (True, None)
Total VRAM 11944 MB, total RAM 32592 MB
pytorch version: 2.10.0+xpu
Set vram state to: LOW_VRAM
Device: xpu:0 Intel(R) Arc(TM) B580 Graphics
run the script again from the same location, but choose 2.8
no need to delete anything, it will just change the pytorch version, it will be faster than installing from scratch
say if it works then
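When asking for help it can save a round trip to pull the relevant lines out of the startup output yourself. A minimal sketch, assuming the log lines look exactly like the ones pasted above; `summarize_startup` is a made-up helper name, not part of ComfyUI.

```python
import re

STARTUP_LOG = """\
ipex_init: (True, None)
Total VRAM 11944 MB, total RAM 32592 MB
pytorch version: 2.10.0+xpu
Set vram state to: LOW_VRAM
Device: xpu:0 Intel(R) Arc(TM) B580 Graphics
"""

def summarize_startup(log: str) -> dict:
    """Extract the PyTorch version, VRAM/RAM totals and device name."""
    torch_match = re.search(r"pytorch version: (\S+)", log)
    mem_match = re.search(r"Total VRAM (\d+) MB, total RAM (\d+) MB", log)
    dev_match = re.search(r"Device: (\S+) (.+)", log)
    return {
        "torch": torch_match.group(1) if torch_match else None,
        "vram_mb": int(mem_match.group(1)) if mem_match else None,
        "ram_mb": int(mem_match.group(2)) if mem_match else None,
        "device": dev_match.group(2).strip() if dev_match else None,
    }

info = summarize_startup(STARTUP_LOG)
print(info)
```

The PyTorch version and the device name are the two fields that decide which version to pick when re-running the script.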
ok, thanks
oo boy did they break a new nightly for B580 😅
Not nightly, running stable
Oh did they launch 2.10 officially? Neat
i am running on one of the dev releases of 2.10 still and it's working well for me (B580)
Looks like it works, thanks 
do you intend to do anime art or realistic stuff
Unsure yet might be a little of both
models that can do one well struggle hard at the other
I was mostly just trying to find out why it wasn't producing anything, so now I guess ill poke around a bit
If you have some suggestions for me to grab ill do so and try them out
for realistic stuff i recommend z-image. has workflow in comfy.
for anime i recommend NoobAI v-pred 1.0 with EQ-VAE. you can look that one up, I won't link it
each is prompted differently. z-image takes natural language, the anime model takes booru tags + a few extra tags not in boorus (masterpiece, best quality, great quality, low quality, worst quality / year 2024/2023/etc., see noobai's readme)
Awesome, ill try those two out thanks for the help and the script vik!
haha I see 
Maybe I try noobai over illustrious?
vpred means the model can generate a dark/bright image. regular epspred tends towards making a greyish image though there are ways to remedy this, like noise offsetting. the downside is noobai vpred can also go crazy a bit and make blown out images, however that's the default noobai vpred.
the eq vae is less noisy than the normal sdxl vae, which makes the model train better. (doubt this is meaningful for lora training)
the noobai vpred with the eq vae model someone finetuned, doesn't have the broken coloration issue and is pretty decent
if you're using any illustrious loras, 99% chance they will not be working for this model. i train pretty much all the loras i use myself
(this one is with some edits afterwards)
And one without any loras. This might've been with regular vpred, don't remember
Is there a guide to getting torch compile working through linux/WSL?
I am rather frustrated that a 3060 can get near 5x my speeds on certain text to speeches, because they have cudagraphs.
It works by default, what are you doing, what error are you getting
ah, hm, it works for image gen
sdxl was formerly broken but apparently works now
On Linux only though, right? Not Windows.
Aaron had it working on windows. Things are partially set up on windows however I gave up trying to get their compiler to compile with the right flags and to figure out which of the various C++ compilers the basekit includes I should be using. Let's assume it doesn't work on windows
If compile works for z-image/lumina for you, with the newest torch, on linux, then whatever doesn't work is probably a bug you won't be fixing
I didn't ever get compile working on linux.
well, i guess 2.10 is broken for alchemist so yeah use 2.9
Install comfy with my script and tell me what is not working when you try to compile, with the default compile node
you don't need that
Oh.
only windows theoretically needs it for compile to work
i got it working
thats with the q8 gguf of z image
use the same environment for whatever you were doing with tts
you should be able to just ctrl+c or ctrl+z and be in the environment
i needed a lot of new variables to be set for it to work
qwen-image-layered is out
with GGUFs
Their HF space
hmm
If you want to get the same error, https://huggingface.co/spaces/Qwen/Qwen-Image-Layered
no comfy just yet 🤔
god i hate the sd subreddit
it's like 80% of posts about amd are condensed ragebait misinformation, and I even saw some of that spilling over into intel too
Had to edit some files, don't actually remember off the top. Have to do a search as it never got pinned. No clue if latest pytorch/oneapi still work though
not a oneapi issue
they most likely made it worse with newer triton/torch
it gives wrong arguments to each compiler and i didn't wanna bother
Do you have any experience with comfy and AMD cards? I see posts about DirectML and such. Asking for a friend who has an older AMD card.
How older
I don't have personal experience, neither do people in the SD sub. I'm just not colossally misinformed
If it's an rx 580 or something then yeah sure, give up on it
Radeon Vii It was purchased for some desktop work but now the AI things are here they want to try out things. They've put in some considerable time and effort into getting things working. However, it's really hard going.
They have the DirectML working to a point but everything is so focused towards Cuda it's a PITA
I was hoping for some solutions that would be more in line with the IPEX Intel ARC we have here.
Searching the internet is just like searching through $#!+ for some things
Is the person willing to dual boot Linux
They are very competent (PhD in engineering)
I don't know if the vii is supported but it probably is, for windows in particular tell the person to try ZLUDA
SDNext has a zluda guide
If not, ROCm on native Linux
VII is also a bit oldish so i'm not super sure on what it supports
Thanks for the input, much appreciated.
damn, zluda does go as far back as including the rx 580
the guide doesn't name the vii and i don't know gfxwhat it is but, yeah
I recommended the card to him a while ago because it has some nice memory bandwidth for the tasks he has. It was also very reasonably priced for a 16GB card with 1TB/s memory.
yeah it was a decent choice back then
if only amd had been more proactive about compute support
I also have a couple Radeon Vii semi-retired in workstations. I should be able to test things out for him.
If you do, no wsl for rocm.
Zluda on native windows, or rocm on native linux
90% chance rocm on linux works
it will probably perform better than zluda
man
bleach post for that amd post i saw
New sneak peek at AI Playground 3.0 with release of 3.0.0 alpha. See in AI Playground General thread
https://discordapp.com/channels/554824368740630529/1245461432141873245/1451701978643038221
directml is dead and is total pain when it works
directml only supports fp32
and is very slow
use native rocm on linux
radeon vii should be one of the oldest cards still supported by rocm
it has fp16 and int8 but no bf16 support
llms should work very well on it via vulkan
1 tb/s is still very competitive
Preliminary result
Most likely the leftover noise is something that will be fixed
strongly considering just going to sdnext for this one though, or doing it in a standalone script with sdnq. all the extra performance will help. yeesh.
A little better. Still has leftover noise though, even at 30 steps. Yeesh. Getting the feeling that this is undertrained
Isn't that only for rx 9000s and some 7000s?
yeah only for newer Gen GPUs, and some integrated GPUs
yoooooo ltx has lightning loras already trained
(also fp8 scaled version of the model is there too)
Doesn't work in comfy. I have a feeling that the block swapping/reserve-vram is once again broken
Gonna try with GGUFs https://huggingface.co/unsloth/Qwen-Image-Edit-2511-GGUF
works
looks better than 2509, as expected
ah, not issue-free though. that ghosting that was present in Q4 and below of 2509 seems to be happening with Q6 of 2511 now (only sometimes??? when asked to upscale? not sure)
Ah, the fluxkontextmultireferencelatentmethod node set to index_timestep_zero helps
I guess this node just got renamed to "edit model reference method"
How does 2511 compare in terms of style transfer @earnest grotto
Let's start off with the negative. There is still no zero shot style transfer. First row - the 2 reference images, "Transform the image of the woman in white clothes with red eyes into the style of the image with the woman with long curly hair and blue eyes"
The 2 other rows - 4 different seeds
And the positive. You probably already know these images. Whatever they did, they've slightly veered away from that disgusting sloppy look most post-SDXL (even base SDXL) models have when asked for "anime", besides just better quality in general.
Still gave venom an eye when his eye was closed but hey, it didn't remove the eyepatch
rip stylegan
I really hope this sort of stuff doesn't just get forgotten or whatever
Anyways, in general the model is just better
It didn't get worse or something
You can just go download the ggufs, the new lightning lora, add the new reference latent method node after the text encode, and done
RuntimeError: Given normalized_shape=[3584], expected input with shape [*, 3584], but got input of size[2, 77, 2560]
Im using the GGUF q8 model, probably missing something in either the workflow or maybe an outdated node
You might need to update the node and comfy
I actually already did a git reset --hard, git pulled the current repo and then reinstalled via your setup script, overwriting the pytorch to 2.9.1+xpu
Fixed the issue. Improperly named mmproj for qwen 2.5 vl.
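For anyone hitting the same message: it says a normalization layer built for a last dimension of 3584 was handed a tensor whose last dimension is 2560, so the tensor came from a component of a different width than the layer expects. The sketch below only mimics that shape rule in plain Python; `check_layer_norm_input` is a hypothetical helper, not a PyTorch function.

```python
def check_layer_norm_input(input_shape, normalized_shape):
    """Mimic the rule behind the error: trailing dims must equal normalized_shape."""
    tail = list(input_shape[-len(normalized_shape):])
    if tail != list(normalized_shape):
        raise ValueError(
            f"expected trailing dims {list(normalized_shape)}, "
            f"got {tail} from input of size {list(input_shape)}"
        )
    return True

# Shapes taken from the error message above.
try:
    check_layer_norm_input([2, 77, 2560], [3584])
    mismatch = None
except ValueError as err:
    mismatch = str(err)

print(mismatch)
```

A width mismatch like this usually points at the wrong or wrongly matched file being loaded rather than at the sampler settings, which fits the misnamed mmproj found here.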
Friends with the b580 using comfyui and qwen image edit, which quant are you using and how much ram do you have? I'm maxing out vram with the Q3_K_L
How much ram do you have
I updated ai playground to the latest version and now when I try to generate an image using z image turbo in ComfyUI I'm getting this error. Can anyone help?
Post in #1243956384052285560 sections
Looks like you aren't alone
okay thanks
16gb system DDR5 🤯. But another 16gb stick coming tomorrow, I’ll be interested to see if that makes any difference
Man, thats like 300+ dollars now lol. I should have just upgraded my ram early this year smh
Only 16 is low yes
When you get more, up your pagefile as well. you should be able to at least use q4
also, --reserve-vram 7
Show the whole error
Hope it's ok i ask here since it's a niche forum. My CD 'directory' path seems to do nothing and conda can't find my requirements file. How do I proceed?
to cd across a drive, either use powershell, do cd /D letter:wherever/, or first cd letter: then cd to the folder.
In either case, the instructions you're following from the top of this thread are very outdated. I have a script that will install comfy for you, 3rd pin in this thread.
#1193952640225267802 message
@umbral nimbus
Thanks so much! swift answer!
Hi. I've been out of touch with open-source developments for a while. Currently, what's the best generative video model that works flawlessly with Intel Arc A770? I tried AI playground but it doesn't work. It just gives me "level_zero backend failed with error: 45 (UR_RESULT_ERROR_INVALID_ARGUMENT)"
AI Playground 3.0 installs a broken version of pytorch (2.10) that doesn't work with alchemist GPUs.
Eventually, that version will be fixed. Here's an issue you can look at to see when it might get fixed. https://github.com/pytorch/pytorch/issues/170166
If you want to use comfy with a working pytorch (2.9), I have a script that sets it up for you #1193952640225267802 message
You will need some modern version of python, doesn't matter which, git, and conda installed (ideally miniconda in your user directory)
@left glacier
Thank you. I'll try that.
@somber harbor ^ Link and some instructions there
2.8, or stable?
Use what the script tells you is default. If you have an alchemist gpu like an a770, that's 2.8.
You can also just press enter without typing in anything to get the default
Has anyone installed SeedVR2 in comfyui? I'm using torch 2.11.0.dev20260103+xpu and everything I tried with it has worked so far except this.
It doesn't import. The Upscaler node seems to work but the LoadDitModel and LoadVAEModel nodes are not.
I wouldnt want to download a random zip file, but if youre willing to test it on a sandbox
comfy-kitchen/backends/eager/__init__.py
line 30,
all_devices = frozenset({"cuda", "cpu"})
->
all_devices = frozenset({"cuda", "cpu", "xpu"})
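Why a one-word change is enough, as a sketch: if the backend only accepts device types found in that set, an "xpu" device is rejected until the set includes it. Only the two `frozenset` lines come from the patch above; `backend_accepts` is a hypothetical stand-in for whatever check the package actually performs.

```python
# The two sets are copied from the patch; the function is an assumption.
ALL_DEVICES_BEFORE = frozenset({"cuda", "cpu"})
ALL_DEVICES_AFTER = frozenset({"cuda", "cpu", "xpu"})

def backend_accepts(device_type: str, all_devices: frozenset) -> bool:
    """Return True when the device type is on the allow-list."""
    return device_type in all_devices

print(backend_accepts("xpu", ALL_DEVICES_BEFORE))  # rejected before the patch
print(backend_accepts("xpu", ALL_DEVICES_AFTER))   # accepted after the patch
```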
I'll see about reporting it
I imagine it'll be fixed in less than a day. There's already a PR for it and other bugs, if anyone wants to keep track https://github.com/Comfy-Org/comfy-kitchen/pull/8
Merged
some examples of it seem decent
i've been away from comfyui for a long while and am having trouble updating comfyui to a newer version via comfyui manager after running a new version of vik's installation script on my ancient install (0.2.1p -> 0.2.6.1p). hitting "Update ComfyUI" just dumps the following into the console and does nothing:
[ComfyUI-Manager] Failed to checkout 'master' branch.
repo_path=F:\AI-NVMe\Comfy_Intel\ComfyUI
Available branches:
master
ComfyUI update failed
[ComfyUI-Manager] Queued works are completed.
{'update-comfyui': 1}
After restarting ComfyUI, please refresh the browser.
i can't add, update, or remove custom node packages through comfyui manager until i am updated as stated by this message:
[Security Alert] ComfyUI outdated. Installations blocked (update allowed).
Update ComfyUI for normal operation.
and if i recall correctly there are patches that are done during the installation that i would have to reapply manually if i were to force an update, so im a little stuck on what i should do
any help would be appreciated
The code block you posted is not output of the script
What does the script say
Did you run it from the same location you did the first time
The script has no "update comfyui"
poor wording mb
"Update ComfyUI" is in ComfyUI Manager
i can run the script again to get the output
The script will update comfy for you, don't use comfyui manager for that
does the script install a specific version of comfyui or is it meant to be the latest release/dev build
it installs/updates to the latest
oh im way more outdated than i thought
?
the latest release for comfyui is listed as 0.7.0 on github, it shows 0.3.43 here
rerunning your script and posting the logs here in a moment
apparently im on a detached head
would running git switch main inside the comfyui folder and running the script again be a bad idea? im not too familiar with git
open a command prompt inside the ComfyUI folder (e.g. right click the background when you have the folder open)
do git branch and tell me what it says
PS F:\AI-NVMe\Comfy_Intel\ComfyUI> git branch
* (HEAD detached from v0.3.42)
master
do git reset --hard and tell me what happens
PS F:\AI-NVMe\Comfy_Intel\ComfyUI> git reset --hard
HEAD is now at 170c7bb9 Fix contiguous issue with pytorch nightly. (#8729)
ok, run script again now
said the same on trying to pull again in the script and is still the same version after running it
what does git branch say now
PS F:\AI-NVMe\Comfy_Intel\ComfyUI> git branch
* (HEAD detached from v0.3.42)
master
same as before
git reset --hard
git pull
git status
PS F:\AI-NVMe\Comfy_Intel\ComfyUI> git reset --hard
HEAD is now at 170c7bb9 Fix contiguous issue with pytorch nightly. (#8729)
PS F:\AI-NVMe\Comfy_Intel\ComfyUI> git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 5 (delta 3), reused 5 (delta 3), pack-reused 0 (from 0)
Unpacking objects: 100% (5/5), 2.33 KiB | 44.00 KiB/s, done.
From https://github.com/comfyanonymous/ComfyUI
fc0cb10b..c0c9720d master -> origin/master
You are not currently on a branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.
git pull <remote> <branch>
PS F:\AI-NVMe\Comfy_Intel\ComfyUI> git status
HEAD detached from v0.3.42
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitBACKUP/
SDXL_trainer_output/
comfy/ipex_to_cuda/
huggingface/
loras_tags.json
nothing added to commit but untracked files present (use "git add" to track)
git checkout master
git pull
git status
ok, run the script again
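The reason `git pull` refused to do anything is visible in the `git branch` output: the starred entry is a detached HEAD, not a branch, so there was nothing for git to merge into until `git checkout master`. A small sketch that reads that output; `current_branch` is a made-up helper and the sample text is the output pasted above.

```python
GIT_BRANCH_OUTPUT = """\
* (HEAD detached from v0.3.42)
  master
"""

def current_branch(git_branch_output: str):
    """Return the checked-out branch name, or None when HEAD is detached."""
    for line in git_branch_output.splitlines():
        if line.startswith("* "):
            name = line[2:].strip()
            return None if name.startswith("(HEAD detached") else name
    return None

print(current_branch(GIT_BRANCH_OUTPUT))  # detached, so no branch name
print(current_branch("* master\n"))       # what it should look like after checkout
```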
Are nodes 2.0 enabled?
If they are, consider disabling them
at least for now
they're buggy
I wonder when torch compile on windows will ever be fixed.
Is there still a speed increase compared to older pytorch with torch.compile? Or is it about relatively the same?
For me, Z-image was about 30-40% faster on compile when i tried it out via wsl2
@earnest grotto I got ltx-2 working
10 minutes per video gen tho (for 720p 24fps)
lol and i just realized i had it still set for the normal model (20 steps) when i should be using 8
not even just that, I'm on cfg 4 and not 1 for the distill
How much ram/vram needed
should be doable with pretty low vram, but ram is a concern now
Wish I had upgraded early last year smh, oh well. Probably be cheaper to get a 16gb or used 20/24gb gpu now lol
Interested to know what the render time is once you correct the steps
This is still on A770?
20 second 640x480 video took almost 9 minutes
4-5minutes for 5 seconds
It'd be faster if I had an fp8 gemma 12b cuz a nice percent of that time is the clip text encode
gonna be honest, all the voices i've seen from ltx are atrocious
or i guess heard
the video matching the audio is pretty good and there being audio is good but man, how is it such a step back from even TTSes from 5 years ago
good thing it has audio input i guess
agreed
im not very sure if im excited over it
or more excited over the wan 2.2 lora svi pro's infinite video capability
what's the ram usage
reserve vram 8 set
They plan to release a 2.1 that will have better audio, and a bigger increment like a 2.5 that will better preserve details in motion, better latent space
I'll wait for that, then. Until then wan is better overall imo
until then
imma jury rig a workflow I found for SVI pro
so I can use T2V alongside SVI pro's I2V infinite video
@earnest grotto
I think there are examples where LTX-2 can provide some interesting shots
The screen is black, then a slow push-in reveals a desolate corridor where dark crimson blood crusts the concrete floor, glistening with wet sheen beneath cracked walls littered with rust flakes. Flickering fluorescent tubes cast long, trembling shadows over dried blood residue, painting the scene in cold, uneven light. From 30 centimeters above the ground, a medium-low angle camera tracks a skeleton—smooth, pale white bone like porcelain, devoid of joints or tissue—moving at a deliberate 0.5 meters per second. Its spine is arched at 45 degrees; its femurs elongate and shift in unnatural sequence, the right hip rotating ten degrees clockwise before the left hip follows. As it steps forward, each movement ripples through the dried blood, followed by deep bone clicks and groans that echo every three seconds—low-frequency vibrations beneath a steady electrical hum. Faint drops of blood fall at five per minute, silent but inevitable, like punctuation in a slow, unresolved sentence. No voice, no sound beyond the ambient bleed of wetness and mechanical resonance—the skeleton moves with quiet inevitability through the silence.
i'm seeing some very cool ltxv audio examples on the banodoco discord. Not able to keep up with it, but think I saw someone talking about voice cloning
apparently video continuation is also possible
and jesus what is this lora
Surprising that it can do seemingly infinite (at least 2 minutes) video without any burn-in or other such issues
good stuff
no clue how to do that yet
I kept ooming on gguf
This time it might be a comfy issue
fp8 works fine, so i assume its to do with the gguf nodes themselves
Try third party nodes like kijais, I remember a while back native gguf nodes had issues in comfy from what I heard
"Hmm yes surely comfy implemented Blender's group nodes correctly this time :^)"
^ Clueless
Changing a subgraph and finding out that the other 5 are unchanged is unreal
kinda stinks that it takes me 12 minutes per 20 second 640x480 gen 🙁
but it works
12 minutes for 20 sec's is amazing tbh. it's usually 4 or 5 sec's for wan
I can generate 5 second 0.3mp (640x480) video with wan 2.2 in around... 3-4 minutes?
well that is on track with Wan if you just scale up to 20 seconds
yeah, 20secs would be way more than 12minutes though.
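The numbers quoted above, put side by side as plain arithmetic. This only does the naive linear scaling; as noted, real generation time tends to grow faster than linearly with clip length, so the Wan figure for 20 seconds is a lower bound at best.

```python
def minutes_per_video_second(minutes: float, video_seconds: float) -> float:
    """Generation minutes spent per second of output video."""
    return minutes / video_seconds

# Figures as quoted in the thread, all at 640x480.
ltx_rate = minutes_per_video_second(12, 20)      # LTX-2: 12 min for 20 s
wan_rate_low = minutes_per_video_second(3, 5)    # Wan 2.2: 3 min for 5 s
wan_rate_high = minutes_per_video_second(4, 5)   # Wan 2.2: 4 min for 5 s

# Naive linear extrapolation of Wan to a 20 second clip.
wan_20s_low = wan_rate_low * 20
wan_20s_high = wan_rate_high * 20

print(f"LTX-2: {ltx_rate:.2f} min per video second")
print(f"Wan 2.2 scaled linearly to 20 s: {wan_20s_low:.0f}-{wan_20s_high:.0f} min")
```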
so it seems i can do 1280x720x484 frames
on distilled, with res_2m sampler, two pass (8 step and 3 step) with second pass using ic detailer lora
Just barely too, but this is with --reserve-vram 16
Not ridiculously slow, either. Might actually take 12-15 minutes as well.
Okay, well second pass is taking 8 minutes per step. LOL
oh that's a lot better
swapped to ggufmultigpu nodes (which work for ltx-2 now it seems)
(generation took 30 minutes at 1280x720 20 seconds)
With the distilled gguf and multigpu nodes, 640x480 20 seconds takes almost 10 minutes
has just compared z-image turbo in ai playground 3 with z-image comfyui nvidia same prompt, a770 vs 3090 -> 17 and 10 sec, not that bad!
ok creepy
weird models are always great with horror
Seems like I need to edit my system prompt, though. The LLM I use to enhance prompts is adding the speech in the output prompt.
How much does the video change with resolution and the same seed? IMO, that wouldn't be a bad time for a final cut if you can do lower res versions.
No clue, but it's not like the quality of 640x480 outputs are immediately bad
It's an odd, mixed bag.
Like it's actually
quite movie-like in some shots
Once/if some loras come out the quality will get better. Same with wan 2.1 when it first released.
flux 2 klein 9b base is gated. off to a great start.
The 4B one looks to not be gated
They really couldn't resist slapping that license on both 9Bs huh
4b distilled | 4b base
Colorize the image. The girl has blonde hair and blue eyes. Her uniform is dark green. She has gloves. She's sitting on a brown couch.
The uh, skin-colored pants seem to be an issue with quite a few models
(For that matter the gloves are too but I decided I will specify them this time. From my experience with qwen, prompting "she has gloves" instead of something like "the gloves are black" is likely more prone to producing additional glovey artifacts)
Mm, a bit like qwen it's a bit better at higher resolution
Yeah, at 2mp the pants are mostly green
Welp, I'd say it's overall much worse than qwen for editing.
Very touchy when it comes to CFG amounts
Not that I like CFG as it's mostly done now anyways
Upscaling, it has the tendency to fry the image at 5.0, 2.5 works. Edits where it needs to generate more new stuff from scratch though, needs 5.0 otherwise blurs
I wonder if z-image-omni-base will once again drop really soon now and shame bfl again
can anyone help on this error?
Create a virtual environment named comfy-env
python3 -m venv comfy-env
Activate the virtual environment
source comfy-env/bin/activate
@fiery current @slender zodiac Use my script that automates it for you and does some extra patching to Comfy ^
Did you activate the env?
Also, https://github.com/Comfy-Org/ComfyUI/commit/2108167f9f70cfd4874945b31a916680f959a6d7
3 hours ago, comfy support for z-image-omni-base
@earnest grotto
Thanks for the script
I'm just old and stuck in the past before script times
sooooonnnn™©™
Comfy can be installed without my script now, however my script adds disty's hijacks which help for custom nodes
And also certain pytorch versions break with certain GPUs
or right now in particular pytorch 2.10 breaks with alchemist, so my script handles that
used to be that 2.9 and 2.10 would break with battlemage iirc
Is there a standard workflow for benchmark?
Just wanted to check everything is ok 😊
any sdxl model at 1024x1024 with euler ancestral and more than 20 steps, 1 batch size
for a770, you should expect ~2.45it/s
For b580 it was 3.4it/s I think
run twice (two prompts/seeds, don't restart comfy). measure second run
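The same procedure written down as a sketch, in case someone wants to time things outside the UI. `run_once` is a hypothetical callable that performs one full generation of `steps` sampling steps; the first call is thrown away, presumably because it includes model loading and warm-up.

```python
import time

def iterations_per_second(steps: int, seconds: float) -> float:
    """Sampling steps completed per second."""
    return steps / seconds

def benchmark(run_once, steps: int) -> float:
    """Run twice and time only the second run."""
    run_once()                        # first run: discarded
    start = time.perf_counter()
    run_once()                        # second run: measured
    elapsed = time.perf_counter() - start
    return iterations_per_second(steps, elapsed)

# Stand-in workload so the sketch runs on its own.
fake_rate = benchmark(lambda: time.sleep(0.01), steps=25)
print(f"{fake_rate:.1f} it/s (fake workload)")
```

As a sanity check on the reference number: at about 2.45 it/s, a 25 step run should take a little over 10 seconds.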
Thanks 😊
b580 should be able to hit around 4it/s with pytorch 2.10
im still running one of the nightlies from December
is it possible to make a script that will modify comfyui portable installation to run on intel arc? in theory
The script makes a portable installation
I don't intend to make it use python venvs instead of conda, for now
Hey Vik maybe a question you have answered before but: if you run the script on an existing installation location, say you wanted to change pytorch versions or update something, does it wipe your models?
No
for safety ive just been moving my location whenever running the script again, but I figured that was probably the case. Thanks 👍
You can plop an existing comfy inside the file structure the script makes, you won't lose anything other than possibly any manual edits to comfy (and only comfy)'s code
Don't recall if it will install requirements for all present custom nodes or only the ones i've set it to install (if you choose so)
That being said, I keep my models in a separate external folder, for other web UIs or such
I haven't used any other tools that would benefit from such an arrangement
well I managed to fix it by manually adding 2 packages I traced from the error log and then installing the default requirements.txt that comes with comfy. But thanks for the script though.
Had latest pytorch been fixed yet?
Seems its fixed but not added as a commit for some reason?
if you really need the portable version... for now you can just download the nvidia portable.7z and modify the batch file at ComfyUI_Windows_Portable/update/update_comfyui_and_python_dependencies.bat
change 'cu130' to 'xpu', save the file and run the batch file to update the pytorch dependencies.
ideally we would work with comfy to get it integrated into github workflows for releases
If you have an alchemist GPU, like an a770, you will also need to specifically use pytorch 2.9 because 2.10 is currently broken
And, this won't cover certain custom nodes (e.g. ultimate SD upscale will be very slow)
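The cu130 to xpu edit described above can also be done with a few lines of Python instead of a text editor. This is a sketch under assumptions: the sample line is hypothetical and only illustrates the substitution, since the real contents of the batch file are not shown in this thread. The caveats about pytorch 2.10 on alchemist and about custom nodes still apply.

```python
def retarget_to_xpu(batch_text: str) -> str:
    """Swap the CUDA wheel tag for the XPU one everywhere in the text."""
    return batch_text.replace("cu130", "xpu")

# Hypothetical sample line, not copied from the real batch file.
SAMPLE_LINE = "pip install torch --extra-index-url https://download.pytorch.org/whl/cu130"

patched = retarget_to_xpu(SAMPLE_LINE)
print(patched)

# Applying it to the real file would look like this (path as given in the thread):
# from pathlib import Path
# bat = Path("ComfyUI_Windows_Portable/update/update_comfyui_and_python_dependencies.bat")
# bat.write_text(retarget_to_xpu(bat.read_text()))
```

Keeping a copy of the original batch file before overwriting it makes it easy to go back.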
I've installed the torch but it says this
@quasi cypress You have not installed torch. You can just install comfy using my script ^
what to do
Show the whole command prompt
You don't have permission to create your Comfy install in C:/Windows/system32.
And you don't want to do that anyways.
IDK how you're running the script to convince it to try to set up inside system32, but, don't
Right click the background in the documents folder, "Open command prompt/powershell/terminal here", "python ./Setup_Comfy_Intel.py"
Check to make sure that this is where you saved the file and that's how it's named.
I've installed with the script then clicked the icon that appeared after the installation
hmm
same error
tnx! i also found portable release one chinese guy did (on github) YanWenKun. He did some tweaks to vanilla portable
im getting like 60-70% performance with it of my rtx 3090
Run the script again from the same location, this time choose stable pytorch instead of 2.9. It will install faster as it will only change the version. Say if it's still broken
How many it/s with the 3090 with sdxl at 1024x1024
i'll give exact numbers tmrw, but i compared with z-image turbo 1024x1024
right now i can say however, that comfyui version bundled with AIP 3.0 still performs better, 17 vs 23 sec for 1024x1024 but could be they used somehow modified z-image workflow
models seems the same, output quality same, same steps but somehow AIP3 is faster
What pytorch version are you using
portable using 2.9.1+xpu and AIP i downgraded to 2.9.0
@quasi cypress Well, did it work?
in AIPG we added --lowvram as launch arg, so that could improve performance as well
timezone diff sorry couldnt reply, yes bro it did thank you so much
i noticed that, and re-configured to "--highvram" back again, thought you added it to be on a safe side with b570 and a750?
may i ask you another question then ) testing "advanced chat" i noticed that it always crashes when trying to load large models - which theoretically should fit in a770 vram, but when it goes even slightly above 12gb vram, it crashes