#ComfyUI Intel Arc performance


quasi orbit
#

I just started experimenting with Comfy and the Wan2.2 model.

I managed to get everything working and it's utilizing the GPU. But I get the feeling I'm getting very low performance compared to a similarly tiered Nvidia card.

Here are my specs:

Intel Arc A750, 8GB VRAM
Intel i7 13700K
32GB RAM
AI running on a fast Samsung SSD 970 EVO Plus drive.
Running ComfyUI with PowerShell on Windows (not WSL2)

I tried running a Wan2.2 14B Q3 K M gguf model paired with the LightX2V 14B 480p LoRA model, which fits in my GPU VRAM.

Tried running a test with 16fps and 120 length

It runs, but I'm getting 700-900s/i, which feels very slow compared to someone running more or less the same workflow who gets something like 20-30s/i on his Nvidia RTX 4060, also with 8GB VRAM (he was running 80 length).

Feels like something is off with this huge gap in performance?
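
For a rough sense of scale, the two reports can be put on the same footing. This is a minimal sketch using only the figures quoted above. It assumes "s/i" means seconds per sampling iteration and that iteration cost grows about linearly with clip length, which is an optimistic simplification since attention cost tends to grow faster than that. The helper name is made up for illustration.

```python
def seconds_per_iteration_per_frame(sec_per_it: float, length: int) -> float:
    """Normalize a reported s/it figure by the clip length in frames."""
    return sec_per_it / length

# Figures from the post above, as (low, high) ranges.
arc_a750 = (700.0, 900.0)   # s/it at length 120
rtx_4060 = (20.0, 30.0)     # s/it at length 80

arc_norm = [seconds_per_iteration_per_frame(s, 120) for s in arc_a750]
rtx_norm = [seconds_per_iteration_per_frame(s, 80) for s in rtx_4060]

# Smallest and largest slowdown consistent with the reported ranges.
min_ratio = arc_norm[0] / rtx_norm[1]
max_ratio = arc_norm[1] / rtx_norm[0]
print(f"A750 is roughly {min_ratio:.1f}x to {max_ratio:.1f}x slower per frame")
```

Under that assumption the gap stays well over an order of magnitude even after allowing for the longer clip, so the difference in length alone does not explain it.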

scenic glacier
#

They are likely using sage attention and torch.compile. You can get torch.compile working on Intel, but it takes a little manual installation. Also, the RTX 4060 is a much faster GPU than an A750; the A750 should be comparable to a 3060 without any Nvidia optimizations.
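
Before chasing optimizations, it may be worth confirming that PyTorch sees the Arc card at all. A minimal sketch, assuming a recent PyTorch build that ships the native `torch.xpu` backend. The helper names `pick_device` and `maybe_compile` are hypothetical, and `torch.compile` on Intel may still need the manual setup mentioned above, so the wrapper falls back to the uncompiled module if compilation is unavailable.

```python
import importlib.util

def pick_device() -> str:
    """Return the best available backend name, or "no-torch" if PyTorch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "no-torch"
    import torch
    # torch.xpu is the Intel GPU backend in recent PyTorch releases.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"

def maybe_compile(module):
    """Try torch.compile; return the module unchanged if it cannot be set up."""
    try:
        import torch
        return torch.compile(module)
    except Exception:
        # Assumption: on this install the compiler stack is not usable.
        return module

device = pick_device()
print("device:", device)
```

If this prints `cpu` on a machine with an Arc card, the slowdown is more likely an installation problem than a missing optimization.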

zinc locust
#

sage attention and torch compile are not the difference between 900 seconds and 30 seconds

scenic glacier
#

Could be the VRAM stuff as well. --reserve-vram would likely help, or using the block swaps in Kijai's nodes.
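
To see why VRAM could be the issue, here is a back-of-the-envelope budget. It is only a sketch: the 14B parameter count comes from the model name, while the figure of about 3.9 bits per weight for Q3_K_M is an assumption borrowed from typical llama.cpp numbers for LLM weights and may not match the Wan GGUF exactly. The 1 GB reserve is likewise just an example value (ComfyUI's `--reserve-vram` takes an amount in GB).

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8.0

def headroom_gb(vram_gb: float, reserved_gb: float,
                params_billion: float, bits_per_weight: float) -> float:
    """VRAM left for activations, LoRA, etc. after weights and the reserve."""
    return vram_gb - reserved_gb - weights_gb(params_billion, bits_per_weight)

# Assumed values: 8 GB card, 1 GB reserved, 14B params at ~3.9 bits/weight.
w = weights_gb(14, 3.9)
h = headroom_gb(8, 1.0, 14, 3.9)
print(f"weights: ~{w:.1f} GB, headroom: ~{h:.1f} GB")
```

If the headroom comes out near zero, activations and the LoRA have nowhere to go, which is the situation where `--reserve-vram` or block swapping would be expected to matter.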

zinc locust
scenic glacier
#

I would like to mess with Kijai's nodes and the gguf models, haven't ever had a chance. The block swap lets you more directly control how much RAM/VRAM to use.

quasi orbit
#

Is there any new information on the B60 GPUs? I'm trying to find pricing, release dates, etc., but can't find any. Might get a B60 dual GPU, depending on the pricing and availability in Europe.

scenic glacier
#

Q3* so probably Q5

outer perch
#

I'm speaking out of turn here, as I've only recently picked up an A770 with the intent to try and get a better understanding of how AI technologies work. I already had an A380, but I am also anxiously waiting for the new pro cards. I imagine they're being held up by the software stack. IPEX works well but I think is incomplete. I can't get parallelism working between the two intel cards in my system. And now, if I'm reading right, engineering seems to be pushing AI Playground, while still having IPEX in active development supposedly.

It certainly feels like Intel is trying to attract the engineering/enterprise market rather than building a hobbyist user base. Like I get it, enterprise is where the money is at and Intel needs as much help as they can get right now. But I feel like it would be reasonable to develop parallel use cases.

And while it may come off as ungrateful, I'm just relaying observations and conjecture. I have no way of knowing what's actually going on and I could be entirely incorrect.

scenic glacier
#

Honestly, I'm not even sure Intel knows what it wants to do. This new philosophy that everything needs to be immediately profitable is not looking too good for anything new or innovative from Intel. I'm glad we are still seemingly getting anything graphics related right now, but I have no clue what they will do in the future with their gaming/consumer graphics cards.

zinc locust
# outer perch I'm speaking out of turn here, as I've only recently picked up an A770 with the ...

> IPEX works well but I think is incomplete
You don't need IPEX anymore.
> I can't get parallelism working between the two intel cards in my system.
As this is a comfy thread - Comfy doesn't do multi-GPU by default. And image gen isn't super multi-GPU friendly. You'd need to start writing your own script, and then, you'd probably want T5/CLIP on the A380 and the rest on the A770 and to just do them in sequence like that.
> engineering seems to be pushing AI Playground, while still having IPEX in active development supposedly
I don't understand what this statement is supposed to imply. AI Playground is completely unrelated to IPEX. It's a convenient frontend for ComfyUI, diffusers and llamacpp.
> It certainly feels like Intel is trying to attract the engineering/enterprise market rather than building a hobbyist user base
Not sure how you got to this conclusion from what you said before but, every big tech company wants the enterprise market because it's more profitable
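
The sequential split described above (text encoders on the small card, everything else on the big one) could be planned roughly like this. It is a sketch only: the helper `assign_stages` and the stage names are hypothetical, and which index maps to which card (`xpu:0` as the A770, `xpu:1` as the A380) is an assumption that would need checking on the actual system.

```python
def assign_stages(devices: list[str]) -> dict[str, str]:
    """Map pipeline stages to devices: text encoders go on the second card if present."""
    if not devices:
        raise ValueError("need at least one device")
    main = devices[0]
    secondary = devices[1] if len(devices) > 1 else main
    return {
        "text_encoder": secondary,   # T5/CLIP run first, on the smaller card
        "diffusion_model": main,     # the heavy model gets the larger card
        "vae": main,
    }

# Assumed ordering: xpu:0 = A770, xpu:1 = A380.
plan = assign_stages(["xpu:0", "xpu:1"])
for stage in ("text_encoder", "diffusion_model", "vae"):
    print(f"{stage:16s} -> {plan[stage]}")
```

In a real script each model would be moved to its device with `.to(device)`, and the text embeddings copied over to the main card before sampling starts. Nothing here runs the two cards in parallel; the stages simply execute one after another.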

outer perch
#

I assume you're implying that llama.cpp has native support for Arc, so IPEX isn't needed anymore. I can't get multi-GPU processing working in any LLM system. I noticed today that Intel seems to be favoring vLLM, which I haven't tried yet.

I understand this is a ComfyUI thread, and I apologize for speaking out of turn in it.

AI Playground is a front end for all of these systems is it not? And it's Windows only.

My comments come from various Intel statements that I've found. Perhaps this is more stream of consciousness, and feel free to ignore me. My point was simply a wishlist. I know development is hard. I just feel like Intel as a company would be more motivated to expand compatibility and documentation with a little more urgency.

zinc locust
#

llama cpp does not use pytorch, nevermind ipex

#

py torch is for py thon

#

not c++

#

there's a bunch of llama cpp versions that use sycl, opencl and whatever else

scenic glacier
#

it can use ipex-llm afaik, not sure if ai playground is using it though.

zinc locust
#

also i might be wrong about llama cpp, i don't exactly remember if this is what ai playground was using, but it's something in the same vein. many other LLM inference servers, like vllm as you say, or ollama or others

#

ai playground installs a bunch of things for you and conveniently lets you use them. you can use comfyui, llama cpp, whatever else by yourself

scenic glacier
zinc locust
#

you can hack it to work on linux. but if you're on linux, you can also just get llama cpp yourself

outer perch
#

Thanks @scenic glacier Aaron, I have tried that version. The container detects both of my installed cards, but prioritizes my 770 over my 380 and won't share vram between them which is what I was hoping for. I'm successfully running ollama-vulkan which is slow, but does what I've asked it to, which is use models that are a little larger.

@zinc locust True! But I'm not trying to run this in a command line with one model. I'm trying to use OpenWebUI because what interests me more is running models and comparing output. Maybe this is possible, but it's beyond my skill level.

#

On topic, I'm running SD.Next which is working well enough to play with. I wasn't aware that generative AI did not see benefits of multigpu processing, so that's neat.

outer perch
#

sorry, I used generative ai wrong, I meant image gen

zinc locust
#

also multi gpu for training image gen is a thing. just not very much for inferencing

outer perch
#

That makes sense to me.

knotty gazelle
#

Hello 🙂 i am returning to ComfyUI after a long break.

Successfully compiled pytorch 2.8.0 and intel_extensions. Is there anything more I need to add to increase Comfy performance? 🙂

scenic glacier
#

Install with viks script and you should have everything you need for performance I think.