#SDNext WebUI on Intel ARC
try https://genai.edge.devcloud.intel.com/ as you get OpenVINO env there
That asks me to enroll and forwards to a broken website
consumer or developer account?
maybe they auto ban everything that contains "webui" like colab once did
but you should get that fixed
Finally getting around to testing the new ipex and seems memory utilization is higher now, barely able to do 1080*1080 with sdxl with sdp. Subquad seems no longer supported by attention slicing(thought it was before?)
Okay figured out the memory problem, it was the ipex optimize stuff.
Are you on dev?
Memory optimizations got disabled by default since they had too much impact on performance
I am on the main branch
I copied over my config so that's probably why
does the new diffusers controlnet not work with arc? Keep getting xpu errors
might be just some of them, got canny to run. Have to figure out how to use it I guess.
Which ones?
MediaPipe Face preprocessor doesn't work
SegmentAnything preprocessor is fixed
Everything else seems to be working
Input / Output
Open pose wouldn't load at all and gave an error; depth models gave an error and crashed the UI. I will try again and see if I just did something wrong
Should mention this was only on sdxl, didn't try 1.5
Just main branch, I will check out dev branch next launch and see
wanna try out controlnet toys someday, but ipex really doesn't like me 
I've never gotten it to work outside of WSL, and y'all know how buggy that is
Uninstall torch. And then follow the steps on the installation guide
Now all you need to do is install python and sd.next should install everything for you in a venv
The comfy UI fork on here should do the same as well.
No need to install the oneapi manually anymore I don't think.
oh seriously? bless 
Yeah it's windows native so it should work easily
well, I tried.
and this when i tried to run pip install -r requirements.txt
all of this is inside a fresh miniconda venv
oh, I tried looking for an appropriate python version on the readme, didn't see any anywhere
Wipe your venv folder and use webui.bat --use-ipex with Python 3.11 or 3.10
how should I manage python version without conda?
Ipex guide says the python version
set PYTHON=python3.11
webui.bat --use-ipex
Or uninstall Python 3.12
I only have 3.12 in the conda, so that should be fine. lemme check when's the last time I updated my system wide python though, since i've been using conda since the last time I screwed myself over with python versions
looks like 3.10
not exploding itself on that spot again
Also you need to remove your venv folder after changing major python versions
venv will still be the old version otherwise
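A rough way to check whether an existing venv needs wiping: a venv created by Python's standard `venv` module records its interpreter version in `pyvenv.cfg`. The sketch below reads that line and compares it with the version you want. The helper names are made up for illustration, and it assumes the file has a `version = X.Y.Z` line.

```python
import re

SUPPORTED = {(3, 10), (3, 11)}  # the versions recommended in this thread

def venv_python_version(cfg_text):
    """Return (major, minor) from the text of a venv's pyvenv.cfg, or None."""
    match = re.search(r"^version\s*=\s*(\d+)\.(\d+)", cfg_text, re.MULTILINE)
    return (int(match.group(1)), int(match.group(2))) if match else None

def venv_needs_wipe(cfg_text, wanted):
    """True when the venv was built with a different major.minor than wanted."""
    return venv_python_version(cfg_text) != wanted

# Example pyvenv.cfg contents for a venv that was created with Python 3.12.
sample = "home = /usr/bin\ninclude-system-site-packages = false\nversion = 3.12.2\n"
print(venv_python_version(sample))       # (3, 12)
print(venv_needs_wipe(sample, (3, 10)))  # True
```

In practice you would pass in the contents of `venv/pyvenv.cfg` from the SD.Next folder instead of the sample string.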
wiped
was about to wipe the whole git root again out of habit
looks like launch successful
will see if I can get a proper image out or not
a lil funny lookin' but hey, that's definitely an image
better 
looks like its running!
no wsl here, just native windows
yeah only problem is only diffusers runs at full speed
feels very nice
do upscalers run at all though? I get a crash every time I try to use img2img or highres second pass
I haven't tested the new ipex in original backend yet, might be full speed in windows now.
INT4 for iGPUs is interesting
INT8 and INT4 on GPUs didn't work with 2023.2
Still doesn't work
It works if we use HETERO:GPU,CPU
But doesn't use 4 bit or 8 bit
oh wow Ive been running in wsl this whole time
advantages vs disadvantages of running sd next in Windows vs wsl?
advantages of wsl:
????none that I know of, its buggy and has memory issues
could be slightly faster but I don't know for sure
can't say for sure on the latest ipex, but advantage for wsl2 before was that it was about 20% faster. And advantage for windows was no need to run wsl lol. I was using both at one point wsl for 1.5 and native for sdxl. Since the speed difference for sdxl wasn't that great
Diffusers speed is the same at least, so I'd recommend native
Works with new nncf
diffusers in native is about as fast as original in wsl. Might be faster now though, I need to test sd1.5. So it's like 6it/s in windows vs 9it/s in wsl in diffusers(in previous ipex anyway).
Added 8 bit Quantization support with OpenVINO
https://github.com/vladmandic/automatic/commit/db6fd95351eb402676daa948fd39ff60b3f8993f
does SDNext still work over local network by default? I'd assume so, but I triple checked everything else to make sure it was the right address and accessible, so I figure I come ask as a sanity check
Just trying to access from phone so I can gen while walking around the house 
If it is bound to an address other than the localhost loopback, it should.
localhost:7860 works fine on the host machine, i run a couple of other services that are connecting remotely just fine (koboldcpp, jellyfin), but this one just times out for some reason
Always? You might need to bind it to your local IP.
what do you mean 'always'?
Does it work sometimes or never?
Your initial statement is unclear.
ahh. I haven't tried accessing SDNext remotely before, last time I would've was on A1111 a long time ago
I can currently access jellyfin though, I just put host address:8096
which works the same as localhost:8096 on the host machine. but while localhost:7860 works for SDNext on the host machine, host address:7860 on the phone does not connect
Yeah IDK it must be a configuration issue.
Add --listen
It will reject any IP other than localhost by default
ahh, so it is a setting, thank you
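To tell "not listening on the LAN" apart from a firewall or typo, a plain TCP connection test is enough. This is a minimal sketch using only the standard library; `port_reachable` is a hypothetical helper, and the demo listener just imitates a service bound to the loopback address only.

```python
import socket

def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demo: a listener bound only to the loopback address, which is what the
# default (no --listen) behaviour described above amounts to.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(port_reachable("127.0.0.1", demo_port))  # the loopback connection works
listener.close()
```

Run from another machine on the network, `port_reachable("<host LAN address>", 7860)` should only return True once the server was started with `--listen`.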
Experimenting with 1024x1024 curse
FP32 works fine
FP16 NaNs out
BF16 has the curse
Only happens with Alchemist GPUs
(GPU Max works fine)
Upcasted attention to FP32 then downcasted it to BF16 and now it NaNs out too
Using BMM instead of SDPA works fine
There is something wrong with Alchemist + SDPA combination
but there is slight hints of the curse with BMM too
Created an issue:
https://github.com/intel/intel-extension-for-pytorch/issues/519
i'm assuming controlnet is still not supported in diffusers yet?
TLDR
RuntimeError: Device type XPU is not supported for torch.Generator() api.
(This is on windows)
Dropped IPEX 2.0 support
Update to 2.1
It has its own ControlNet tab
Can it install that itself in its own venv, or do I still need to do that myself in my conda
Ok, I guess not
tbh i havent touched windows in a while but a fresh install seems to result in this error?
im unsure if ive done anything wrong as i just installed a fresh windows
--reinstall
Pushed a hotfix for this, run --reinstall again
tensordict 0.3.0 installs torch 2.2.0 for some reason
Locked it to 0.2.1
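If you want to see whether your own venv ended up with the pinned version, `importlib.metadata` can report what is installed. A small sketch, with hypothetical helper names; only the `tensordict` pin comes from the messages above.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def pin_mismatches(pins, lookup=installed_version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    return {
        name: (wanted, lookup(name))
        for name, wanted in pins.items()
        if lookup(name) != wanted
    }

# The pin mentioned above; run this with the venv's python to check it.
print(pin_mismatches({"tensordict": "0.2.1"}))
```

An empty dictionary means the pin is satisfied; anything else shows the wanted and the found version side by side.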
the oneapi base toolkit is no longer necessary, right?
oh yeah k ill try that
Yep
nice, thanks
ohhhh that's interesting, is diffusers capable of everything original could do now?
i find diffusers is behaving a fair bit different on image outputs, but it could be some diffuser settings maybe. I have more cursed resolutions, and some LORAs behave very differently
tbh i cant seem to replicate sdxl outputs from civitai, tho i know it changes the resolution a bit, but everything seems unsaturated and slightly blurry, tho maybe ive set something wrong
In diffusers you have to manually set your sampler settings. I honestly don't fully understand it, but if you were using ones with karras ++ or whatever, they have the actual settings you have to set. It's probably more versatile but definitely more confusing, especially if you follow guides that use a1111 etc.
I... think I have the sampler settings all set the same?
some definitely get closer than others, but I seem to get more artifacting in diffusers
<original backend | diffusers (copied settings with move to prompt) >
seems to me that LORAs are particularly getting messed up. If I don't use them, then the output is generally fine
eugh nevermind, some prompts become vastly degraded if I'm not comparing UniPC sampler
Prompt weighting actually works with Diffusers backend
Be careful with spamming ()
Doing this (((((())))) will burn your image for sure
oh i do a lot of prompt weighting 
so does the prompt weighting effect just kick in a lot sooner? It seems my normal loras at 1.0 will bake an output in no time, but lower weights don't feel like they have as much impact
Loras shouldn't have any effect
Do you use Karras scheduler?
Not using it will cause artifacts
Also there is diffusers native lora loader too
"LoRA use alternative loading method" in Extra networks settings
It is disabled by default
Got that enabled, as for LORAs... they seem to be having an effect without enabled that setting?
its just kinda blown out compared to my usual stuff, so i cut all weight numbers by half and the results are looking better?
I won't go over 1.2 most of the time
1.5 if you really want that prompt
Also downweighting is working on Diffusers
Original backend doesn't work with downweighting
downweighting?
weight < 1
0.5 etc
for prompts
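For anyone auditing a long prompt, here is a sketch that pulls out explicit weights and labels them with the rule-of-thumb caps quoted above (1.2 normally, 1.5 at most). It assumes the A1111-style `(text:weight)` syntax and ignores bare nested parentheses; it is not SD.Next's actual prompt parser.

```python
import re

# Matches one "(text:weight)" group without nested parentheses.
WEIGHTED = re.compile(r"\(([^():]+):(-?\d+(?:\.\d+)?)\)")

def explicit_weights(prompt):
    """Return [(text, weight)] for every (text:weight) group in the prompt."""
    return [(text.strip(), float(weight)) for text, weight in WEIGHTED.findall(prompt)]

def classify(weight, soft_cap=1.2, hard_cap=1.5):
    """Label a weight using the caps suggested in the thread."""
    if weight < 1.0:
        return "downweighted"
    if weight <= soft_cap:
        return "normal"
    return "strong" if weight <= hard_cap else "likely to burn"

prompt = "a castle, (fog:0.5), (golden light:1.2), (sharp detail:1.8)"
for text, weight in explicit_weights(prompt):
    print(f"{text}: {weight} -> {classify(weight)}")
```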
Lora loader should be the same on both backends
oh, that's how I usually use loras, is sub 1
oftentimes negative number
oh, alternative loading method setting has.... weird effects
Artifacting happened to me too rn
Refreshing the page fixed it
Don't know why and how it happened
i turned down all of my lora weights and now everything is outputting pretty decently. I didn't turn on the alternative loader, but i'm using downweighting to something around 0.2 on some and getting fairly good results. Slider type loras i can run at full 1.0 and scale as I normally would, including to the negative weights
IPEX Optimize Model gets messy with Loras
Creates duplicate UNet for some reason
Hey! My apologies if this should have been posted someplace else, but most of my efforts have been focused on getting SD.Next working, so I figured this was the way to go. Anyway, here's where I'm at:
I have an i5-13400F, 64 gig of RAM, and an A750 (so 8 gig of VRAM). I realize that this is an upper-poverty-level PC, so I need to accept limitations. I'm relatively new to this stuff. I want to generate 1024x1024 images with SD XL (or a derived model), if possible.
I've been able to get Fooocus up and running using its default model (Juggernaut XL) with the directions from the appropriate thread in this channel. It hasn't been without its hiccups - I still don't have upscaling working - but it produces good results at speeds that I think are fine. It uses every last byte of my VRAM and without throwing "out of VRAM" errors. However, I want way more customization than Fooocus offers.
I've tried both A1111 and SD.Next, but haven't had a ton of luck with either one; I've gotten further in SD.Next, so I've focused most of my efforts there. I'm currently focusing on using the same model as Fooocus (Juggernaut XL), both to change as few variables as possible and to eliminate the "refining" step, which was giving me its own set of problems. To summarize my current status in SD.Next:
- If I don't use --medvram or --lowvram, but I enable all of the other assorted VRAM-saving settings (Move base model to CPU when using refiner, move base model to CPU when using VAE, move refiner model to CPU when not in use, VAE slicing, VAE tiling, and Attention slicing), it will generate 512x512 images with acceptable speed, but throws errors when trying to generate 640x640 (RuntimeError: Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)). Looking at my VRAM usage, it seems to be running out.
- Using --medvram along with the other VRAM-saving settings (VAE slicing, VAE tiling, and Attention slicing) generates 640x640 with acceptable speed. However, it throws (what seem to be) the same error at 768x768. Again, it does seem to max out VRAM.
- Using --lowvram (which, to my understanding, implies all the other VRAM-saving settings) will generate 1024x1024 with no errors, at about 30% of the speed of Fooocus, while only using about 18-20% of my VRAM.
Does anyone have any ideas about how to get the best of both worlds? I'd love to be able to get SD.Next's customizability (and am happy to use something other than SD.Next) while getting performance comparable to Fooocus.
Native Windows? Make sure you have the IPEX Optimize options turned off; you should be able to go above 1024 (1024x1536 is about my max on an A750) with medvram, no problem. Also make sure you're using the Scaled-Dot-Product attention optimization, running at FP16 or BF16, and do NOT have the full precision options on.
Thanks! Yes, native Windows, but I'm up for switching to WSL if it works better. Re: IPEX, I want to make sure I understand what you mean. I installed according to SD.Next's directions on GitHub, here: https://github.com/vladmandic/automatic/wiki/Installation
Those directions tell me to use the --use-ipex command line argument, so that's what I've done. By default, this does enable all of the "IPEX Optimize for Intel GPUs" options under "Compute Settings". I've tried disabling these per your reply (while still using --use-ipex), and I haven't seen any difference. I've checked and I'm using "Scaled-Dot-Product" under "Attention optimization method", I'm not using either of the "Full precision" options, and I'm running at BF16 (as I heard this was faster than FP16 on our cards).
Disable ipex optimize, not sure if you need the use ipex command but either way disable ipex optimize in native windows
Yup, I disabled the IPEX optimizations, but it didn't make any noticeable difference. I do need --use-ipex; I tried a separate fresh install without it and it doesn't work.
Use medvram
Also try fp16 and see what happens
Also disable the move to cpu stuff(for me I didn't have enough system ram but see if that helps as well)
Yup, was already using medvram. The move to CPU stuff is already disabled, because it's incompatible with medvram and lowvram. I just tried FP16, and I'm having some odd, possibly memory-leaky issues with it, as it will consistently generate one image at 512x512 without a problem, then crash when I try to generate another. BF16 will generate 512x512 all day long.
Odd, are you on the main branch and up to date?
Also you just let sdnext install itself and didn't install ipex yourself right? Just to make sure, if you are not in a venv and using your own install of ipex there could be issues with the latest version (haven't personally installed it in windows and just let sdnext do it)
Maybe post your settings and disty can help you further. Post error messages as well.
Yup, just reinstalled today. And yeah, it's running in its own venv and it installed its own ipex. Is there a way to export your settings for posting? What's the preferred way to do that?
Just screenshot with the snippet tool
Okay, thanks! I'll do that a little later; at the moment I'm a little frustrated and blowing off steam doing something else, but I'll no doubt be in the mood again in an hour or two. I'll take extensive screenshots.
It could also be your VAE for SDXL; try turning it off and see what happens
You used to have to do a hack for the VAE, but I have just kept it the same, so I don't know if it was updated for SDXL.
I'm using Juggernaut XL, because that's what I had working in Fooocus, which already has the VAE fix baked-in.
recommended settings for SD NEXT in windows? (Arc A770 16gb)
fresh install, 1st time running through windows once I get my output paths the way I want them. just wondering if its good to go without things like attention slicing. mainly just going to run sdxl
disable live previews
not sure how to disable that
question: if I launch through the shortcut for webui.bat do i need to add the --use-ipex arg?
Search for live
0 disables it
Disabling them gives a minor speed and VRAM boost
i love my live previews =w=
with an 8gb card though, might not be able to afford it as easily
@proper cradle Hey Disty have you tried Stable Cascade with ipex?
yup, getting similar corruption
Happens on dev cloud with gpu max too
Reminds me of the early stable diffusion on ipex output
@proper cradle Do you want to fix a typo in modules/ipadapters.py
SDXL IP Adapters are typo'd as SXDL
Anyone having any issues getting the SDXL refiner to work recently?
Someone posted an issue with refiner about sdnext in the a111 thread
I'll put a log here
Refiner just flat out doesn't work for me
Seems that person is getting artifacts, but I don't get any refiner effect
I do get an image at least
Try downloading the diffusers version from Models -> Huggingface tab
It is failing to load the refiner
Both of my GPUs are busy so I can't test anything rn
might test on linux later; either way, it used to work, it just stopped working "now"
on that note, ip adapter no worky either
reproduced refiner
ipadapter seems to be working fine
do you have sequential offload enabled
alright
Only with lowvram
should be around 1.5it/s maybe s/it's. Did you try those options I posted in the other thread?
#1193952640225267802 message
rest of the options seems to be same
I am on an older commit btw, so some options look different on yours I see.
default might be giving you fp32 speed, try setting model variant to fp16
also restart the webui after changing vram settings
make sure no-half and full precision are also not checked. And use fp16 or bf16, not fp32
changed the settings now
gonna see how much it changed
same/just a lil better
2.40 to 2.50s/it lol
maybe i will try the older commits when I have time
High memory utilization: GPU=80% RAM=47% {'ram': {'used': 14.01, 'total': 29.66}, 'gpu':
{'used': 6.19, 'total': 7.75}, 'retries': 0, 'oom': 0}
Processed: images=1 time=215.15 its=0.23 memory={'ram': {'used': 12.26, 'total': 29.66}, 'gpu':
{'used': 4.41, 'total': 7.75}, 'retries': 0, 'oom': 0}
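The percentages in the "High memory utilization" line are simply used divided by total. A quick check of that arithmetic, reading the dictionary straight from the log text (the two log lines are joined into one string here; the `utilization` helper is just for illustration):

```python
import ast

# The dictionary from the "High memory utilization" line, joined onto one line.
stats_text = ("{'ram': {'used': 14.01, 'total': 29.66}, "
              "'gpu': {'used': 6.19, 'total': 7.75}, 'retries': 0, 'oom': 0}")

def utilization(stats):
    """Return the used/total percentage for each memory pool, rounded."""
    return {
        pool: round(100 * stats[pool]["used"] / stats[pool]["total"])
        for pool in ("ram", "gpu")
    }

print(utilization(ast.literal_eval(stats_text)))  # {'ram': 47, 'gpu': 80}
```

That matches the `GPU=80% RAM=47%` in the warning, so the GPU was close to full during generation and dropped back to 4.41 of 7.75 afterwards.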
what are you using as sampler btw
DPM++ 2M Karras is 2.50s/it
Euler A is 1.30it/s
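Note that the two readings above use different units (s/it versus it/s), so they are easy to misread. A tiny sketch that normalises both to it/s; the function name is made up for this example.

```python
def to_iterations_per_second(value, unit):
    """Normalise a speed reading to it/s; unit is 'it/s' or 's/it'."""
    if unit == "it/s":
        return value
    if unit == "s/it":
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit}")

readings = {"DPM++ 2M Karras": (2.50, "s/it"), "Euler A": (1.30, "it/s")}
for sampler, (value, unit) in readings.items():
    print(f"{sampler}: {to_iterations_per_second(value, unit):.2f} it/s")
```

So 2.50 s/it is 0.40 it/s, which makes the Euler A reading a bit over three times faster in this comparison.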
oh yeah, I forgot about that. I used Euler and the Unipc one, one of the karras ones was always slower, but I am not well versed on whats what with those in Sd.next.
How can I get an older commit of the git repository?
uhhhh you have to get the commit head, and then... commands... i always get lost when doing it so I'd just ask chatgpt 
I feel like my generations are really slow still... And when I check other people's diffuser settings etc. I dont have most of the stuff in their screenshots
git checkout
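Spelled out a little more: `git log --oneline` lists commit hashes, and `git checkout <hash>` switches the working tree to one of them. Below is a hedged sketch that wraps the second step; the helper names are hypothetical, and the hash is only the commit linked earlier in this thread, used as an example.

```python
import re
import subprocess

def checkout_command(commit):
    """Build the git command that switches the working tree to a commit."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit):
        raise ValueError("expected a 7-40 character hexadecimal commit hash")
    return ["git", "checkout", commit]

def checkout(commit, repo_dir="."):
    """Run the checkout inside repo_dir (the SD.Next folder)."""
    subprocess.run(checkout_command(commit), cwd=repo_dir, check=True)

# Print the command instead of running it, so this is safe to try anywhere.
print(" ".join(checkout_command("db6fd95351eb402676daa948fd39ff60b3f8993f")))
```

Checking out the branch name again (for example `git checkout master`) returns to the latest version.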
I have a feeling SDNext doesn't support this, is that right?
Or maybe a diffusers limitation?
sdnext was forked from a1111, and a1111 supported this a long long time ago, so I'd be surprised if it got removed? i guess it is possible like you said to have been removed with diffusers, but i dont like diffusers for several reasons still so I'm on old backend
the prompt manipulation syntax is pretty janky anyway, but is fun to play with anyway
If it worked in a1111 before sd.next was forked it should probably work in original backend. Original is basically whatever a1111 did to get it to work in webui afaik.
I have reinstalled SDNext and getting this error, when i use try to run it.
any ideas?
ok, its running now. 512x512 with 2,3it/s with an A380
Thats good for A380
SD1.5 running fine. SDXL crashes but dont know why.
512x768 is about 1.3it/s quite ok for the small A380.
too little vram
you have to use --medvram or --lowvram
have --medvram active
all ipex optimize options?
yep
still crashing. maybe SDXL is too much for A380.
ok, its running now. with 3,2s/it
what is the "normal" it/s for a750 with 768x1024? any ideas? with dpm++ 2m karras
dpm thing is a bit slower than Euler btw last time i checked
yeah it is, but gives better images imo
also i cant seem to change between VAEs
NotImplementedError: Cannot copy out of meta tensor; no data!
the error I'm getting
well fml. i cant seem to get it working properly lol. a1111 with openvino straight up doesnt work, sdnext is acting weird, foocus doesnt work at all either... idk
if anyone can help hmu please I'm down to hop on call and get whatever help I can get
Those are because of --lowvram
That area had improvements in dev branch
i had tried medvram later on... now i cant generate images lol.
You might want to try NNCF with --medvram (tho just --medvram should work too)
NNCF compresses the model into 8 bit without losing much quality and halves the model size in VRAM
but it is slower to run
Also disable IPEX Optimize if you use Loras
IPEX Optimize is disabled by default in dev branch now
Also NNCF will trigger model reload with Loras
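The "halves the model size" part is just bytes per weight: 16-bit weights take 2 bytes each and 8-bit weights take 1. A back-of-the-envelope sketch, using a made-up parameter count and counting weights only (no activations or other overhead):

```python
def weights_size_gib(parameters, bits_per_weight):
    """Approximate size of the weights alone, in GiB."""
    return parameters * bits_per_weight / 8 / 1024**3

# Hypothetical parameter count, chosen only to illustrate the ratio.
params = 2_500_000_000
for label, bits in (("FP16/BF16", 16), ("8-bit", 8)):
    print(f"{label}: {weights_size_gib(params, bits):.2f} GiB")
```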
can you explain what is NNCF by any chance?
this.
It's in Compute Settings
Also done this since I use them
yep
what is the average it/s for a750 do you have any ideas by any chance?
7~ it/s with SD 1.5 512x512
SD 1.5?
Yes sir
HyperTile is a must for SD 1.5
RuntimeError: Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES) -5 (PI_ERROR_OUT_OF_RESOURCES)
launched with --medvram
for arc gpus? I could generate up to 1.8it/s with RTX3050ti in 768x1024
With NNCF?
Yes sir
NNCF will make it significantly slower
Not worth it for SD 1.5
I thought you were using SDXL
No I didnt even have chance to try SDXL I've been having issues since I bought this card for SD
It generates sometimes, sometimes doesnt, or generates slow as **
EinopsError: Error while processing rearrange-reduction pattern "b c (nh h) (nw w) -> (b nh nw) c h w".
Input tensor shape: torch.Size([1, 512, 64, 64]). Additional info: {'nh': 8, 'nw': 6}.
Shape mismatch, can't divide axis of length 64 in chunks of 6
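What the message is saying: the pattern splits a 64x64 tensor into `nh` by `nw` tiles, and 64 is not divisible by 6. The sketch below reproduces that check and shows one possible way to pick a tile count that does divide evenly. It is an illustration only, not how HyperTile actually chooses its tile counts.

```python
def largest_divisor_at_most(length, wanted):
    """Return the largest tile count <= wanted that divides length evenly."""
    for count in range(min(wanted, length), 0, -1):
        if length % count == 0:
            return count
    return 1

# Values from the error above: a 64x64 tensor split into nh=8 by nw=6 tiles.
height, width, nh, nw = 64, 64, 8, 6
print(height % nh, width % nw)             # 0 4 -> the width split fails
print(largest_divisor_at_most(width, nw))  # 4
```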
I get random errors like this now and then even though I havent changed anything and just adding to queue
RuntimeError: Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)
Back to square one
I literally changed nothing lmao
Defaults with only Unet enabled is good
512x512
Those are generally ipex or driver issues
even though i changed nothing?
like i said, its giving random errors at random times
lol... I cant figure it out I tried nearly all the webui's from forge to foocus
Probably better off using Linux or at least WSL
i tried wsl too
Native Windows is still pretty new
I'll send the error I get in wsl
Command: /home/mercury/automatic/venv/bin/python3 -u launch.py --use-ipex --medvram
Uptime: 62.413761 s
Segmentation fault
Traceback (most recent call last):
File "/home/mercury/automatic/venv/bin/ipexrun", line 8, in <module>
sys.exit(main())
File "/home/mercury/automatic/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/launcher.py", line 109, in main
cpu_run_main_with_args(args)
File "/home/mercury/automatic/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/launch/launch.py", line 447, in run_main_with_args
launcher.launch(args)
File "/home/mercury/automatic/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/launch/launcher_multi_instances.py", line 322, in launch
raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command 'taskset -c 0,2,4,6,8,10 /home/mercury/automatic/venv/bin/python3 -u launch.py --use-ipex --medvram' returned non-zero exit status 139.
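The "exit status 139" at the end matches the "Segmentation fault" line: by the usual shell convention, a status above 128 means the process was killed by signal (status - 128), and signal 11 is SIGSEGV. A short sketch of that decoding, with a hypothetical helper name:

```python
import signal

def describe_exit_status(status):
    """Translate a shell-style exit status into a signal name when possible."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return f"unknown signal {status - 128}"
    return f"exit code {status}"

print(describe_exit_status(139))  # SIGSEGV
```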
./webui.sh --use-ipex
this is how i run it
i thought it doesnt set up it for ipex if i dont --use-ipex
Try;
DISABLE_IPEXRUN=1 ./webui.sh
Did you follow the guide and install the necessary packages?
And do you have an iGPU?
disabled from bios
Intel iGPU will conflict
Try this
aight lemme try that again, deleted and reinstalling ubuntu wsl
mercury@DESKTOP-3EH5PI7:~/automatic$ DISABLE_IPEXRUN=1 ./webui.sh
Create and activate python venv
Error: [Errno 13] Permission denied: '/home/mercury/automatic/venv'
Error: Cannot activate python venv
ok wait i think i figured out why this is lol
yeah, that was on me
started once crashed
mercury@DESKTOP-3EH5PI7:~/automatic$ DISABLE_IPEXRUN=1 ./webui.sh
Segmentation fault
then when trying to start again gave this error
it doesnt start anymore
For those on Arch, don't update to Linux 6.8 until Intel Compute Runtime adds support for the new DRM Xe driver.
I downgraded to Linux LTS 6.6 for now.
Wait why, isn't the new driver not enabled by default?
Noticed that too
Exporting these worked:
NEOReadDebugKeys=1 OverrideGpuAddressSpace=48
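If you launch from a script rather than a shell, the same two variables can be added to the child process environment. A minimal sketch; the variable names and values are the ones quoted above, while the helper and the launch line in the comment are illustrative assumptions.

```python
import os

# Variables quoted in the thread as a workaround on Linux 6.8.
WORKAROUND_ENV = {"NEOReadDebugKeys": "1", "OverrideGpuAddressSpace": "48"}

def with_workaround(base_env=None):
    """Return a copy of the environment with the workaround variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(WORKAROUND_ENV)
    return env

env = with_workaround()
print({key: env[key] for key in WORKAROUND_ENV})
# To launch with it: subprocess.run(["./webui.sh", "--use-ipex"], env=env)
```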
I haven't upgraded to 6.8 yet (still waiting on the Fedora kernel) so I don't have any first-hand experience.
just came in to ask this
unfortunately I'm stuck on 6.8 because of bcachefs having a fairly critical bug in earlier versions
Export these
Putting this here, as a lot of coders are here. Seems there is interest in using Arc mGPU for AI workloads from tinygrad and they put out a bounty for XMX.
https://twitter.com/__tinygrad__/status/1770145884862546282?s=46&t=FGbX5gLUn0Bqa1heNNuXaQ
We are putting resources into Intel now. multiGPU training worked out of the box, and haven't seen a lick of driver instability. The A770 card is currently slow though, we need to add XMX support ($400 bounty added)
Have 32GB Ram but XL Models keep freezing my PC because it takes up all my Ram. Anyone knows how to solve this?
nothing can solve it lol
more ram
either that or dont offload ur models
if u have enough vram u can maybe do it
32gb should be enough tho
atleast i dont have much issue
How much memory on your GPU
16 (A770)
Uncheck any of the offload to cpu stuff, make sure you are using the recommended optimizations as well. And If on windows native turn off ipex optimize
Anyone else on windows wanna confirm if the euler sampler (and probably other non-default samplers) break SDXL?
No clue if you know anything about it, but the error I was getting as of late was caused by the intel-compute-runtime package being out of date on Arch, thus having to resort to utilizing an AUR version
I'm not at home right now to tell you the exact error it was throwing before but I can get it to you just for kicks and giggles
(moral of the story: nice job not maintaining the latest packages, Arch devs)
I do a lot of SDXL and have not run into a memory issue. Are you doing latent upscale at some multiple of 1024? What resolution are you working at?
need to apply attention slicing
scaled_dot_product_attention, apply the workaround to slice the memory allocation chunks to be less than 4GB
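To get a feel for why slicing is needed: the attention score buffer grows with heads x query tokens x key tokens, and the idea is to process heads in groups small enough that each buffer stays under the limit. This is a rough sketch of that arithmetic only. It assumes the limit is 4 GiB, that slicing happens across heads, and the shapes are hypothetical; it is not the actual workaround code in SD.Next.

```python
import math

LIMIT_BYTES = 4 * 1024**3  # the 4GB allocation limit mentioned above (assumed GiB)

def attention_slices(heads, q_tokens, k_tokens, bytes_per_element=2):
    """Number of head-wise slices that keeps each score buffer under the limit."""
    per_head = q_tokens * k_tokens * bytes_per_element
    if per_head >= LIMIT_BYTES:
        raise ValueError("one head alone exceeds the limit; slice the query axis too")
    heads_per_slice = (LIMIT_BYTES - 1) // per_head
    return math.ceil(heads / heads_per_slice)

# Hypothetical shapes: 8 heads in a 2-byte dtype (FP16 or BF16).
print(attention_slices(8, 16384, 16384))  # 2
print(attention_slices(8, 4096, 4096))    # 1
```

In the first case the full buffer would be exactly 4 GiB, which is not under the limit, so it has to be split into two slices.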
Getting this when I have FreeU enabled:
Intel MKL FATAL ERROR: Cannot load mkl_def.2.dll.
tried --reinstall, no change.
Hmm, guess someone had the same issue with another repo https://discord.com/channels/554824368740630529/1222726813558636585
IPEX 2.1.20 fixed the 1024x1024 curse 
https://github.com/intel/intel-extension-for-pytorch/releases
This is most likely from the Windows packages not fully including MKL
WHOA
Requires wiping the venv folder
Sycl fails if you skip this step
Didn't update Windows since I don't have a Windows environment to test how to include MKL with Windows
ah, still using custom builds for windows, huh
:(
wonder what's still broken with the regular ones
They don't have MKL and DPCPP with them
SDNext downloads them from pypi on Linux
And webui.sh adds them to PATH
Seems like memory leak issues got better too
So disabled ipexrun by default now
ipexrun causes more issues than it fixes
How's performance with the new IPEX version?
same
So perf is same but less buggy
Neat
That's always welcome
I'm new here. Do we just run installer.py to update IPEX?
You're on windows, right?
Yes sir
I found this intel github
This website introduces Intel® Extension for PyTorch*
I'll read and try to assimilate
You'll likely need to either wait for null to do another build, or to install the OneAPI base toolkit and update ipex manually
You know some basic python usage? You can see the install instructions below, replace them in install.py, and --reinstall
i wonder if there's some dependency issues
Waiting is probably for the best.
libuv requires conda
and ipex on windows needs libuv
So we need custom packages to include libuv with ipex or make the user install conda, libuv, mkl, dpcpp manually
not interested in including conda in SDNext
pretty understandable, yes
Don't have SD Next installed at the moment so, need to go to intel github and update IPEX, then reinstall SD Next (windows) right?
u dont have to do anything with ipex for windows
its setup for you
windows ipex hasnt been updated yet i think
Is the windows version of one api fixed now? Also can we access the old versions yet?
The one SDNext uses hasn't been updated
Can look at this file here https://github.com/vladmandic/automatic/blob/ad7d0bbf6a58af3cb1cbe21823a1644df262e0d1/installer.py#L561 at lines 561 and below, once the ipex there goes from 2.1.10 to 2.1.20 it's updated
Or, I'd assume someone might say something here
Probably disty
Given that on Linux 2.1.20 still leaks under heavy usage, or sometimes not so heavy usage... There's likely still bugs to be fixed
Is there something I'm missing with inpainting - isn't only masked area supposed to be much faster? It still takes as long as for the full image, and if that image is 4k^2 i kinda don't want to wait that long and need to resort to cutting the image up
pixart alpha (Diffusers/PixArt-alpha/PixArt-XL-2-1024-MS [b89adadecc])'s negative prompt seems to be put in 1 more list than it should be, and the pipeline fails making it lowercase
Hacking a fix in diffusers since idk where automatic puts it in a list, it produces an image... broken :(
Bright scene, aerial view,ancient city, fantasy, gorgeous light, mirror reflection, high detail, wide angle lens.
111 seed, 4.5 cfg scale, 20 steps, everything else cfg set to 0, default sampler, karras and low order, 1 clip skip
on the other side, what the demo gives me
wonder if it could be medvram or vae tiling
let me fetch a bigger log that isn't an annoying crop
Probably that VAE failing to load
I don't think non SD VAE loading is supported
And PixArt doesn't specify a VAE with its examples either?
I saw in an old closed issue vlad posted some pixart alpha result that didn't look garbled
so it has worked before, on nvidia
i will poke a bit more, vae tiling and medvram weren't the cause
Fp16 result is definitely better, that was probably it
Trying out the other demo prompts here, looks pretty good
is pixart sigma too new for sdnext? or too new in general
I see some of their HF repos have 0 downloads
dev branch
and click on it from the reference models
My UI is completely different
It looks better, how can I use it
the new gallery UI I saw is already a massive improvement
though I wish it was sorted newest first instead of oldest first
Still in dev, don't create bug reports for this in the main repo. use the extension repo instead
https://github.com/BinaryQuantumSoul/sdnext-ui-ux
I haven't really made any reports, and most of them anyways seem to boil down to ipex issues instead
probably only this is worth reporting
Looking at their github, they seem to have uploaded incorrect/broken models twice
So I would not be surprised if this is a pixart issue, unless you or someone else got it running?
I don't know what happened but my A770 can't run any SYCL or OpenCL rn
it gets stuck with 100% single core CPU usage
clpeak gets stuck
ipex gets stuck
opengl apps seems to run fine
vkpeak seems to run fine
some vulkan apps refuses to start
some works
I got a GPU lock-up and this happened after reboot
tho many packages were upgraded before the reboot
Tried going back to Linux 6.6 LTS but still the same thing
tried using docker to see if it is an Arch Linux issue but still the same thing
it was fun while it lasted
(venv) disty:~ $ python
Python 3.10.14 (main, Mar 25 2024, 23:39:34) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import intel_extension_for_pytorch
No ROCm runtime is found, using ROCM_HOME=' '
>>> a = torch.randn((10,10,10))
>>> a = a.to("xpu")
>>> print(a)
(gets stuck here)
GPU usage maxes out but pulls only 84W
Also maxes out a single CPU core
GPU usage drops after 10~ seconds but still stuck with 100% single core CPU usage
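Since the interpreter session above hangs instead of failing, it can help to run the same smoke test in a child process with a timeout, so the terminal you are working in stays usable. A sketch using only `subprocess`; the helper name and the 60 second default are arbitrary choices, and a process stuck inside the driver may not always be killable this way.

```python
import subprocess
import sys

# The same steps as the interactive session above, as a one-shot script.
XPU_SMOKE_TEST = (
    "import torch, intel_extension_for_pytorch\n"
    "print(torch.randn((10, 10, 10)).to('xpu'))\n"
)

def run_with_timeout(code, seconds=60):
    """Run Python code in a child process; report 'ok', 'failed' or 'hung'."""
    try:
        result = subprocess.run([sys.executable, "-c", code],
                                capture_output=True, timeout=seconds)
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if result.returncode == 0 else "failed"

print(run_with_timeout(XPU_SMOKE_TEST))
```

"failed" covers ordinary errors such as a missing package, while "hung" corresponds to the behaviour described in this thread.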
clpeak:
disty:~ $ clpeak -p 1
dlerror: /opt/intel/oneapi/compiler/latest/linux/lib/x64/libintelocl.so: cannot open shared object file: No such file or directory
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Arc(TM) A770 Graphics
Driver version : 24.09.28717.12 (Linux x64)
Compute units : 512
Clock frequency : 2400 MHz
Global memory bandwidth (GBPS)
float : (gets stuck the same way here)
I wonder if this is a firmware issue
Have you tried completely powering off the PC and leaving it like that for a longer amount of time
How do Arc firmware updates work on linux anyways, the windows driver installer updates the firmware but I haven't heard much about that on linux
tried both
left it off 10 mins~
Tried updating the firmware manually with igsc
Used these
Do you wanna try 6.5
disty:~ $ sudo igsc fw version
Device: FW Version: DG02_1.3257
I have the same version
Vulkan apps that refuse to launch throw these errors:
MESA-INTEL: warning: cannot initialize blitter engine
OpenGL runs fine
vkpeak runs fine
or according to https://dgpu-docs.intel.com/driver/client/overview.html, 6.2 or 5.19?
I am on 6.5 without too many issues
And, well, make an #1028170137158811659 thread
Installed linux-hardened since it is still on 6.8.4
And it works fine
6.8.5 and 6.6.27 are broken
reported
For downgrading;
linux 6.8.4: https://archive.archlinux.org/packages/l/linux/linux-6.8.4.arch1-1-x86_64.pkg.tar.zst
linux-headers 6.8.4: https://archive.archlinux.org/packages/l/linux-headers/linux-headers-6.8.4.arch1-1-x86_64.pkg.tar.zst
after pulling from master, models no longer load
6.9 kernel, and I think i had generation working with the export environment variables
did something new break?
(just checked on dev branch, happens as well, perhaps due to symlinked models dir? just gets stuck on "load model")
not due to symlink
same issue as the message above
Last working kernel is 6.8.4
Got it, thanks
is it related to the mkl and mkl-cpp package version on your env?
Same with 2024.1 and 2024.0
Same with ipex 2.1.10 and 2.1.20
clpeak doesn't even use these
Yeah, something is not OK in the latest kernels (stable, lts and rc). ComfyUI, automatic etc. were working two weeks ago (6.8.3 ish, 6.6.24), now they sit with 100% cpu on the python process. No apparent error message. Quickly tested kernel 6.6.24 and everything works as expected.
6.8.6 dropped, don't see anything relevant in the changelog
That doesn't work either
Can someone please tell me the process to install SDNext Webui? @proper cradle
No luck with 6.8.7 or 6.6.28 either here.
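The kernel reports so far can be collected into a small lookup. A rough sketch: the table only contains versions named in this thread, everything else comes back as "unknown", and the function names are invented for illustration.

```python
import re

# Versions reported in this thread only. Nothing here is an official list.
REPORTED = {
    (6, 6, 24): "working",
    (6, 8, 4): "working",
    (6, 6, 27): "broken",
    (6, 6, 28): "broken",
    (6, 8, 5): "broken",
    (6, 8, 6): "broken",
    (6, 8, 7): "broken",
}


def parse_kernel_release(release):
    """Turn a release string such as '6.8.4-arch1-1' into (6, 8, 4)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = m.groups()
    return (int(major), int(minor), int(patch or 0))


def reported_status(release):
    """Return 'working', 'broken' or 'unknown' based on the reports above."""
    return REPORTED.get(parse_kernel_release(release), "unknown")
```

On Linux the running kernel can be checked with `reported_status(platform.release())` after `import platform`.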
any way to set executable run on GPU not iGPU?
typically you'd need to set that with arguments in the software itself, otherwise it'll just go through your active system devices, and the only way to remove from that list would be to disable the device anyway
very annoying when you want to use immature ai utilities like that llm intel repo posted in intel blog
SDnext is very convenient how it takes care of the necessary venv stuff, manages the nitty gritty settings, and all i need to do to run it even on windows is shell webui.bat
try --device-id 1
unless you're on a laptop, you should probably just disable the iGPU, are you using it for something?
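To see which index the dGPU and the iGPU get before picking a value for `--device-id`, the devices PyTorch can see can be listed. A minimal sketch, assuming a PyTorch build that exposes `torch.xpu` (older ipex setups may need `import intel_extension_for_pytorch` first); that these indices match what `--device-id` expects is an assumption, not something confirmed here.

```python
def list_xpu_devices():
    """Return [(index, name), ...] for visible XPU devices, or [] if none."""
    try:
        import torch
    except ImportError:
        return []
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return []
    return [(i, xpu.get_device_name(i)) for i in range(xpu.device_count())]


for index, name in list_xpu_devices():
    print(index, name)
```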
I get an issue seemingly random lately, where during boot up of webui.bat, it'll just say "press any key to continue..." and exit out
its not even always the same point in the script either
oh, i tried a few more times and my whole computer blue screened with kernel error, so maybe it's something on the driver side 
Have you been generating a lot of images without turning the PC off? There seems to be some serious driver-side issue that gets progressively worse
Though I haven't had bluescreens, I've had my screen flash black if I generate too many images, with sdnext or with comfy
and then that's it, no more generating images until reboot
sometimes "too many" being 7 images with a lora
before things get critically unusable
on linux it's better, since there the driver doesn't seem to die, though there are still leaks, at least I can have sdnext continuously killed and restarted when it leaks to a too high amount
if you can dual boot, I'd suggest it
I haven't tried WSL but I wouldn't be surprised if it's the same situation as native windows
idk if it ever got fixed, but WSL had some nasty memory leaking and i once woke up to a 200gb pagefile
i do have dualboot, but the problem i run into is that i want to draw while generating, and use it as a corrections tool. my drawing software only runs in windows so its not very useful to dual boot
I also can't get it to run in linoox for some reason
Yeah, I am in a similar position
There's a pretty swanky addon that integrates comfyui into blender, but is windows only
blender + SD does look fun
I've wanted to try it for quick demoing background composition, just to see if something works before i spend all the time doing it
still busted on 6.8.9
Man, is it just me or is the base Pony model fried
Are you using non default clip skip with pony?
Pony doesn't work with anything other than the default
i'm not getting broken images, but the results I am getting are just meh and look a bit fried
Let me generate something
I am using it with the default clip skip
score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, source_anime, 1girl, aqua hair, aqua eyes, twintails, necktie, boots, detached sleeves, sitting, holding, book, bench, outdoors, blue sky, sun, grass, tree, forest
7.0 cfg scale, euler, 27 sampling steps, 0.7 rescale guidance
yes karras, no dynamic threshold, yes low order, no rescale beta
2580050041
At this point I'm wondering if the model is just bad and I need to be putting extra effort to salvage it, or what
I've seen the aua stuff, I've seen the dude admit he did lossy compression on his dataset multiple times
Sometimes when I write an extremely long prompt, in specific cases, it will start approaching some decent looking style that isn't a watercolor filter that can't be prompted away
dtype, and the resulting image from a 2 clip skip, so I keep the clip skip at 1 which was the default, right?
had a much better and smoother experience with animagine
cranking the cfg up helps a bit
That looks like a normal Pony gen
Default for SDXL is already -2
So use 1
1 = -2
2 = -3
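The mapping above written out as a tiny helper. Only the 1 and 2 cases come from the messages above; extending the same pattern to larger values is an assumption, and the function name is made up.

```python
def clip_skip_to_sdxl_layer(clip_skip):
    """Map the UI clip skip value to the SDXL hidden layer index (1 -> -2)."""
    if clip_skip < 1:
        raise ValueError("clip skip starts at 1")
    return -(clip_skip + 1)


print(clip_skip_to_sdxl_layer(1))  # -2, the SDXL default
print(clip_skip_to_sdxl_layer(2))  # -3
```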
Pony is extremely fragile
Hm, I assumed so, thanks
i use a branch from animagine that's been working very nicely, i seem to have forgotten the name though. quite reliable and seems to perform well on various benchmarking tests that check for bias
I keep seeing people on reddit singing high praises over pony, to the point i noticed disty even got downvoted when he mentioned animagine, so i really started wondering if there was something i was TOTALLY missing with that model
i have managed to finagle a decent result out of it every once in a while, but it's been a headache
It's just that pony has the best nsfw content
7th AnimeXL A is pretty good
Sure, but many people want to say that it's not just that
i think i'm using... anima pencil XL
Another update:
https://github.com/vladmandic/automatic/discussions/3175
nice ui
looking good!
:( so sotediffusion needs pytorch 2.2 in sdnext? is that right?
trying it on comfy, it ooms
actually, let me try medvram or lowvram
welp, still OOMing
tried with --fp8_e4m3fn-text-enc --fp8_e4m3fn-unet, and... it seems like a good thing that the 4gb limit triggered here
Probably not exciting but I think SD.Next just added Sd3-mid support
You can't run wuerstchen on ARC
https://github.com/intel/intel-extension-for-pytorch/issues/529
Oh noes
I have problems with animatediff and SDNext. Every time I try to make a short video my VRAM and system RAM are filled to the limit: 16GB on an A770 and 29GB of 32GB system RAM. On my 3090 the 24GB is also completely used.
I use SD 1.5 models and the 1.5 V3 animate module.
how can i use controlnet? when i try to apply the change it didn't apply
Controlnet is installed and enabled by default... Why did you disable it? Enable it again, apply, restart the webui
Use the Control tab
Extension will not work so that's why it is disabled
It got automatically disabled after installation! I try to enable it! When i apply and restart sd next it gets automatically disabled again!
Sorry yes, my bad
It should still work by default, just use it from the control tab like disty says
SDNext has controlnet built-in
Controlnet is built into sdnext, so the webui extension doesn't work anymore.
On that, have you guys added the new sdxl control net aio thing?
HOW CAN I ENABLE ControlNet???
Sorry, I was wrong. You don't need to enable it. Just use it. From the control tab.
BRO HOW??
You can't use any random resolution you want
I'm also not really seeing any blur
Models are trained for specific resolutions and aspect ratios
Going above the total number of pixels will result in repeating patterns
Going below will result in images getting progressively more fried
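One way to pick a size for a given aspect ratio is to keep the total pixel count near what the model was trained on. A rough sketch, assuming about 512x512 for SD 1.5 and 1024x1024 for SDXL, and assuming sides should snap to multiples of 64 (a common convention, not something stated above); `fit_resolution` is a made-up helper.

```python
import math


def fit_resolution(aspect_w, aspect_h, target_pixels, multiple=64):
    """Return (width, height) near target_pixels for the given aspect ratio."""
    ratio = aspect_w / aspect_h
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(fit_resolution(1, 1, 1024 * 1024))   # SDXL, square
print(fit_resolution(16, 9, 1024 * 1024))  # SDXL, widescreen
print(fit_resolution(2, 3, 512 * 512))     # SD 1.5, portrait
```

Because of the snapping, the pixel count only lands near the target, not exactly on it.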
Using a different webui isn't going to make that look any better
This is how image generators are
They don't produce super great results
furthermore, "photorealistic" means something that approximates realism
the actual realistic things, are just photos
You're using a 1.5 model. There is no refiner. The refiner doesn't matter anyways.
The refiner is an SDXL thing.
It is not good enough.
i have an a750, it can't handle sdxl models
If you specifically don't like faces and hands, you can try the adetailer extension https://github.com/Bing-su/adetailer
This is not magic that will fix every issue with AI image generators
You have to understand that, for the time being, the quality isn't that great and you will need to manually touch things up
Inpainting, editing with gimp/krita, whatever
You can go have a poke with ideogram, which is the current state of the art, best image generator, if all you want is to just type in a prompt, and see the issues it also still has https://ideogram.ai/t/explore
no i want to edit like here
do u have that pose links
thx for help again
For example
None currently can count
i see
asking for a specific number of inanimate objects, as opposed to spider legs that it has learned tend towards 4+4, is even worse
you can do text to video with 16
I wouldn't bother
not good enough currently
unless maybe you just want some looping wavy hair or something very trippy
Works fine on a750, just need to use vram optimizations like medvram
Also, images are low res so you have to upscale for crisper detail.
Ever since driver 5448 SDXL speed has slowed on 8GB arc cards from 1.4it/s to 8s/it, and vram usage is the cause.
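The two numbers above use different units (it/s and s/it), so the size of the slowdown is easier to see after converting both to the same one. A small sketch with a made-up helper:

```python
def to_it_per_s(value, unit):
    """Convert a speed given in 'it/s' or 's/it' to iterations per second."""
    if unit == "it/s":
        return float(value)
    if unit == "s/it":
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit!r}")


before = to_it_per_s(1.4, "it/s")
after = to_it_per_s(8, "s/it")  # 0.125 it/s
slowdown = before / after
print(f"{slowdown:.1f}x slower")  # 11.2x slower
```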
I will have to test but that doesn't seem right, have you filed a ticket with intel anywhere?
How can i use upscaler on arc gpu?( can i use Ultimate upscale?)
Is that possible?
guys how can i use an sdxl model on sd.next?
i downloaded the model, added the folder and selected it already, but there is no refiner section on my screen
they say to add these settings, but how do i add them
how can i change the interface to look like these settings
its been a while since I touched sd.next but i believe that you have to click
Apply Settings and either restart or shutdown server.
Also, if you plan on using SDXL, you'll need to modify backend support to diffusers
Backend Support
problem is there are no options to change the settings
if i change it somehow then of course i will apply and restart man
Wrong panel
Settings -> User Interface Options
Tho the refiner isn't needed most of the time
It isn't worth it
And that wiki page is kinda outdated now
You can skip them and just use the default settings
oh thx
is there an updated guide for installing? I haven't used SDNext for a while but thinking about getting back to it
Pretty sure its as simple as git pull now, but also haven't tried in a while. I don't think you even need to install oneapi anymore
get python 3.10
git clone https://github.com/vladmandic/automatic
cd ./automatic
python launch.py --use-ipex
Windows install is good now if I remember so I think ill be going through that instead of wsl
FLUX with NNCF INT8 compression on A770
how is it compared to fp8
int8 pretty much has the same precision as bf16
(for models that stay around 0-1)
non quanted fp8 is pretty much a dumpster fire
This is INT8 for example with cascade
these two are fp8
FLUX BF16 vs INT8
damn, that's pretty good
I have a feeling issues like that are probably more pronounced with smaller models
Think we will get flux on 8gb cards anytime soon? (Not counting nvidia which uses system ram through drivers)
I'm getting the impression that this is becoming a thing on windows for intel
That being said, the amount of time I've seen people mention waiting... at 5+ minutes for an image, I think it's a better idea to just pay for the online stuff, at least for the time being
We'll see where finetuning does or doesn't go 🤔
personally I'm not super convinced throwing more parameters at the problem is going to solve image generation, or at least not by itself
while this is a big improvement in anatomy, it still struggles, you can even see it in disty's pics
difference between BF16 and INT8 is pretty noticeable
You mean... For LLMs?
i mean the wifus above
wouldnt trade quality for speed
if you compare fp16 vs fp32, they are just different. i wouldnt say fp16 is worse
This difference is fairly minor compared to BF16 vs fp8e4m3fn
(kinda source trust me bro)
fp8 changes the whole composition (from what I've seen from people on reddit, when browsing?)
now that's noticeable
this here is what, slightly different frills, maybe a watermark maybe not?
quality is pretty much the same, both have different errors in the details
both have... that leg
presumably changing the whole composition when quantizing means it's worse
no one will sit down and grade a mixture of 1000+ images blind individually for a good sample size of the quality and those lovely errors we all like to see which I don't think any aesthetics score predictors can catch, maybe some bad anatomy models and that's about it
it's just a minor detail, it also has better frills, but at the same time int8 doesn't have a fried watermark, the particles near the magic spell are more sensible and the fingers are a bit better
realistically you can't run the model at bf16 on an a770 16gb in a timely fashion, assuming windows has offloading now which I think it does
you need ~22gb
if int8 is this close to bf16 it might as well be way way better than fp8
BF16 only has 7 bits for the fraction (mantissa) part
it is an inferior format compared to fp16 for diffusion use
int8 is pretty close to bf16
fp8 is comparable to int4
diffusion needs the fraction part, the integer part stays around 0 most of the time
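The precision argument above can be checked on plain numbers. A toy sketch, not how NNCF or any real quantizer works: it rounds a value to bf16 (7 mantissa bits) and to a symmetric int8 grid with a single assumed scale of 1/127 for a 0-1 range, then prints both absolute errors. Real int8 schemes usually pick scales per channel or per group.

```python
import struct


def to_bf16(x):
    """Round a float to the nearest bfloat16 value (ties to even). No NaN handling."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # rounding bias for the 16 dropped bits
    bits &= 0xFFFF0000                   # keep sign, 8 exponent and 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def to_int8_and_back(x, scale):
    """Quantize to a symmetric int8 grid with the given scale, then dequantize."""
    q = max(-127, min(127, round(x / scale)))
    return q * scale


for x in (0.001, 0.1, 0.5, 0.9):
    bf16_err = abs(to_bf16(x) - x)
    int8_err = abs(to_int8_and_back(x, 1 / 127) - x)
    print(f"x={x}: bf16 error {bf16_err:.6f}, int8 error {int8_err:.6f}")
```

Near the top of the range the worst case errors are within about a factor of two of each other (2^-9 for bf16 between 0.5 and 1, 1/254 for this int8 grid), while bf16 stays finer for small magnitudes.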
also if you are doing a large scale training, avoid bf16 unless you want a blurry model that can't learn details
use fp16, or fp32 if you can't use fp16
fp32 / bf16
fp32 to int8 / bf16 to int8
fp32 to int4 / bf16 to int4
failed on int2
Using SDNext with an H100 80GB.
INT8 using NNCF
INT4 and INT2 using Optimum Quanto
There's an fp32 version of flux?
wonder if int4 is small enough to work on 8gb GPUs
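A back of the envelope estimate for the weights alone, assuming roughly 12 billion parameters for the Flux transformer (an assumption not stated in this thread) and ignoring text encoders, VAE and activations:

```python
def weights_gib(num_params, bits_per_param):
    """Size of the raw weights in GiB for a given bit width."""
    return num_params * bits_per_param / 8 / 2**30


FLUX_PARAMS = 12e9  # assumption: about 12B parameters in the transformer

for name, bits in (("bf16", 16), ("int8", 8), ("int4", 4)):
    print(f"{name}: {weights_gib(FLUX_PARAMS, bits):.1f} GiB")
```

Under that assumption bf16 comes out at about 22 GiB, which matches the ~22gb figure above, int8 at about 11 GiB and int4 at about 6 GiB before any overhead.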
So flux in sdnext works? Dev build i am guessing right?
My build is so old I may have to fresh install lol
Not sure if lcm is working properly in image to image, the output just looks like a low step count from any sampler. Seems to work with controlnet and text to image though. Although i could be missing a setting in img2img, the UI is def not intuitive anymore.
is animatediff broken or do I need to do something in settings? Keep getting crystallized images. maybe a windows thing? i had sdnext download the models itself so maybe that's it too. dunno. also tried multiple models, lcm and regular etc. Latest master repo
seems to just be txt2img giving the error. Also the program doesn't disable the animatediff pipeline on its own, you have to unload and reload the models manually or it will keep trying to use animatediff even when set to none and extensions closed.
how the heck did you get an h100
a cloud instance provided by fal.ai
ah i see
Do we have a guide or tutorial on how to run Flux on ARC A770? I've been busy and trying to catch up with the latest stuff
Where might I find the most "optimized" SDNext config options for SDXL? I run into memory issues past around 1000px in width and performance issues throughout (512x768 at like 2s/it).
Try medvram
it seems like balanced offload causes the "expected all tensors to be on the same device" error for me
actually it seems that everything on recent dev/master branch builds is busted
sd 1.5 models just produce noise/sparkles/weird busted-VAE type stuff and SDXL runs absurdly slowly
ipex optimize also broken. I do have latest toolkit
Is anyone able to run the SD 3.5 medium model? Every time I try to run it I get a few errors that amount to "incomplete model" and "model not loaded"
Ah, thank you so much
2 of them : )
Disty is #1 for arc support IMO. We would be so screwed without his work. Intel should cut a check lol
Resolution locking of OpenVINO is finally eliminated: https://github.com/vladmandic/automatic/commit/19989971891c8508eaa42db6e3f6318a3a91f1ac
Welp, not completely. You can use up to 3 different resolutions without re-compiling.
You can use base + hires and have 1 extra resolution to use without recompiling
SDXL 1024x1024 on A770 with OpenVINO
that's pretty fast
Did open vino get faster? I thought it was like 2s/it in linux iirc.
It was always faster than IPEX but not this fast
Old speed i remember was 1.8 it/s
My guess is new OpenVINO has Flash Attention support because i don't run into the 4gb alloc issues anymore on OpenVINO
howdy folks, been about 5 months absent, what kinds of developments have there been lately/in the last half year ish?
Flux, cogvideo, ltx video, hunyuan video, mochi video, mmaudio (video to audio), and honestly more than i can remember. As far as intel goes, we have built-in xpu support in pytorch 2.5 and newer but it's much slower (especially on windows)
2.5 is slower in my experience but it's much easier to get set up
I am going to try and set up wsl2 and hope that the speed increase from linux mitigates the speed decrease from pytorch
how long did it take to compile? it was taking minutes for SDXL awhile ago
Around a minute
@proper cradle i have this error what can i do ?
(B580 user ^)
that's not sdnext that's a1111
^ sdnext
^ when running for the first time, run with --use-ipex
^ ai playground
#1193952640225267802 message
^My script that installs comfy (comfy is also in AI playground, will be faster for battlemage through AI playground)
Ok i'll try your link, but do you know how i can fix the problem in A1111 @fickle plume
by not using A1111
sdnext is just plain better than a1111
and not just that, i think a1111 is starting to get a bit stale now...?
this is right order
there is no order, the 3 things I linked you are all 3 different web UIs
you pick one and you install one
Automatic (SDNext) is the one this thread is for
AI Playground is Intel's official web UI (runs using comfy, also does LLMs)
which one is best to use with the Realisticvision 6.0 checkpoint?
sorry for my stupid question but i just started with this type of program and i'm so confused @fickle plume
all of them work
there are screenshots of the UI of each
you look at the screenshots and decide
This is SDNext
In the bottom image, the top right is AIPG (images)
This is comfyui
Ok i choose SD next
@fickle plume
--use-cuda       Use nVidia CUDA backend (autodetected by default)
--use-rocm       Use AMD ROCm backend (autodetected by default)
--use-ipex       Use Intel OneAPI XPU backend (autodetected by default)
--use-openvino   Use Intel OpenVINO backend
--use-zluda      Use ZLUDA
--use-directml   Use DirectML
which one do i need to choose and where do i put it?
.
the instructions most likely say you will be typing in install.bat or whatever, and you type in --use-ipex after it and before pressing enter
.\webui.bat --use-ipex
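For anyone scripting the launch, the backend flags listed above can be kept in a small table. A sketch only: the flag names come from the help output quoted above, while `BACKEND_FLAGS` and `launch_command` are invented names and not part of SDNext.

```python
# Backend name -> launcher flag, taken from the help output quoted above.
BACKEND_FLAGS = {
    "cuda": "--use-cuda",
    "rocm": "--use-rocm",
    "ipex": "--use-ipex",
    "openvino": "--use-openvino",
    "zluda": "--use-zluda",
    "directml": "--use-directml",
}


def launch_command(backend, launcher=".\\webui.bat"):
    """Build the argument list for launching with the chosen backend."""
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unknown backend: {backend!r}")
    return [launcher, BACKEND_FLAGS[backend]]


print(" ".join(launch_command("ipex")))
```

The resulting list can be passed to `subprocess.run` from the SDNext folder.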
SDNext will install PyTorch nightlies on Windows, do they work with Battlemage too?
Last commit to A1111 is half a year ago, it is pretty much dead
Also this is the Intel ARC wiki: https://github.com/vladmandic/automatic/wiki/Intel-ARC
Or the new docs: https://vladmandic.github.io/sdnext-docs/Intel-ARC/
@proper cradle idk why i have the whole path
It should be the .exe file
Remove the venv folder in SDNext
You can set the python path for SDNext like this:
set PYTHON=path_to_the_exe
You have to remove the existing venv folder in SDNext's folder if you want to change Python after the venv is already created
My folder looks like this. I use webui-user.bat, which looks like that
There is no COMMANDLINE_ARGS in SDNext, add them next to the webui.bat like this:
call webui.bat --use-ipex
Remove the second call
Yes, also there is a stray D in --use-ipex
Do you have OneAPI installed?
no
https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=windows&oneapi-win=offline do i need to install this?
No idea honestly; I don't have a battlemage gpu
I know the battlemage-specific ipex wheels do work - there's one for 2.5, and AIPG gets one for 2.3 from somewhere
I think I had trouble getting the nightlies to work on windows anyways last time, but it could've been a one-off thing
i think i need visual studio install first
@proper cradle after installing, what do i need to do?
"C:\Program Files (x86)\Intel\oneAPI\compiler\2025.0\env\vars.bat"
"C:\Program Files (x86)\Intel\oneAPI\ocloc\2024.2\env\vars.bat"
.\webui.bat --use-ipex
type it in 1 line or 3 separate ones? @proper cradle
3 lines
ok
C:\Program Files (x86)\Intel\oneAPI\compiler\2025.0\env\vars.bat webui-user.bat too like this ?
or in file ?
with call in front or without?
now it starts and closes automatically, nothing runs
change the path_to_python.exe with the actual python file
Probably is inside that folder in the error log
I am guessing the name of the exe is python.exe
Tip: you can copy a file and it will paste its full path into the terminal
same error
missing set before PYTHON=
Probably this, i am not really familiar with Windows bat scripts
well, same for the other ones too?
it's basically windows' export
oh, git and such were in the previous ones
does it have an equivalent to source? might need to replace the calls to oneapi
set PYTHON=C:\Users\suncho0\AppData\Local\Programs\Python\Python310\python.exe
and again ....
call is close enough, should work
Remove the venv folder and try again with this added
set TORCH_COMMAND="torch torchvision --index-url https://download.pytorch.org/whl/test/xpu"
Try this:
set TORCH_COMMAND=torch==2.6.0+xpu torchvision==0.21.0+xpu --index-url https://download.pytorch.org/whl/test/xpu
finally starts but i didn't see any image
What is the dtype set to?
setting -> compute
That is an out of memory error
SDXL shouldn't use 26 GB
In SD.Next
You can also try removing the config.json file to reset settings
If it went to CPU fallback at some point it will set FP32 / no-half
It should be Auto or BF16
why does it tell me here that the card uses 26 of 12?
Try this
make it 1 time
FP32 will use a lot more memory
ok i will delete 1 more time config json file and run it again
pfff finally works
it thinks my card has 26 gb VRAM, how can i set it to 12
That counts the shared RAM
From System RAM
I don't know if Intel has an option to disable system memory fallback
and what do i need to do now, any suggestions?
Reported VRAM is the real 12GB so it shouldn't be a problem on SDNext's memory management
What problem are you having rn?
Go to System and go to System Info and send a screenshot
Dtype is correct, BF16
Try adding this to webui-user.bat:
set IPEX_FORCE_ATTENTION_SLICE=1
I guess flash atten support for battlemage is not in yet, i will enable this variable by default on battlemage on sdnext side
ok for now i will just see what i can do and modify some settings in the ui and wait for your info, please @ me if you have something new
@proper cradle @fickle plume Thank you very very much for your time and support in getting me started !!!!!!!!!
Does anyone know how to get it to run on the igpu instead of the cpu? On my computer it's gpu 1 and I was wondering if anyone knew how to switch it to hopefully make it faster than 30 s/it
Refactored the Dynamic Attention SDPA, it doesn't corrupt the outputs with very small slicing rates or very large sequence lengths anymore: https://github.com/vladmandic/automatic/commit/a770b1c888261e6f0cd7a731a3deca3397caf47c
New Dynamic Atten SDPA with SD 1.5 at 768x1280.
500 MB limit vs 4 GB limit:
2.3 GB Total VRAM usage vs 4.5 GB total VRAM usage
offload mode = none
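For a feel of why a lower limit means more slicing, the size of the attention score matrix can be estimated and divided by the limit. This is only an illustration and not the actual Dynamic Attention code. It assumes batch 1, 8 heads, fp16 scores and an 8x latent downscale for SD 1.5 at 768x1280; the function names are made up.

```python
import math


def attention_scores_bytes(batch, heads, q_len, kv_len, bytes_per_elem=2):
    """Bytes needed to hold the full query x key score matrix."""
    return batch * heads * q_len * kv_len * bytes_per_elem


def slices_needed(total_bytes, limit_bytes):
    """Number of slices so that each slice fits under the limit."""
    return max(1, math.ceil(total_bytes / limit_bytes))


tokens = (768 // 8) * (1280 // 8)  # 96 * 160 latent positions
total = attention_scores_bytes(1, 8, tokens, tokens)

print(f"{total / 2**30:.2f} GiB of scores")
print(slices_needed(total, 500 * 2**20), "slices at a 500 MB limit")
print(slices_needed(total, 4 * 2**30), "slice at a 4 GB limit")
```

Under these assumptions the full score matrix is about 3.5 GiB, so it fits under the 4 GB limit in one piece but has to be split into 8 slices at 500 MB.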
Very nice work!
Hey im running into a weird error with SDNext with openvino, is this the correct place to ask?
yes
Okay so i just installed it and the live previews makes it look like it works, but when it finishes its just a blank/black image and then this pops up in the bottom column.
I'm using a B580 on a Ryzen 5700x if that matters
2025-01-26 12:55:05.027 ERROR sd processing_helpers Decode: sample=(1280, 1080, 3) invalid=4147200 dtype=float32 vae=torch.float32 upcast=True failed to validate
You need this VAE if it is an SDXL model: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors
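A quick check on the numbers in that log line: the `invalid` count equals the total number of elements in the sample, so the whole decoded image failed validation, not just a few pixels. That `invalid` counts non-finite values (NaN or inf) is an assumption here, and `count_invalid` is a made-up helper showing what such a count would look like.

```python
import math


def count_invalid(values):
    """Count values that are NaN or infinite."""
    return sum(1 for v in values if not math.isfinite(v))


shape = (1280, 1080, 3)  # sample=(1280, 1080, 3) from the log line
total_elements = shape[0] * shape[1] * shape[2]
reported_invalid = 4147200  # invalid=4147200 from the log line

print(total_elements == reported_invalid)  # True
print(count_invalid([0.5, float("nan"), float("inf")]))  # 2
```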
Same thing still unfortunately, I double checked that it's selected in the settings
I did, i did a new install and then tried Ipex and now it just works, thanks for the tips tho, I appreciate it
Removed IPEX and fully switched to pure PyTorch 2.6.0+xpu with the latest update.
Also Python 3.12 is now fully supported by SDNext.
Also there is now a --use-nightly flag if you want to download nightly PyTorch.
How's the performance?
Same as ipex 2.5
Nice
OpenVINO also updated to PyTorch 2.6 and mismatched shapes errors on too many resolution changes are finally fixed for good
https://github.com/vladmandic/sdnext/commit/1acbabb276f2a9d63670a91c47934cd341fb53e2
Upscaler compile is enabled by default now
@proper cradle https://x.com/XYang2023/status/1888529378856043007
BMG is supported on SDNext already?
I don't have a BMG card myself so can't comment about the speed.
I would offer to try it if I had one, but I don't. I got a 258V though
I am not sure what the speed is of the b580 at all, only a few people with it using any AI stuff.
It's not really much better than that A770 for it so most existing enthusiasts aren't upgrading
Yeah, I am likely one of the only a750 owners who mess around with ai a lot lol. I was thinking about grabbing the b580 as a stop gap but seems it has a lot of other issues atm too.
Docker image for Intel ARC:
https://github.com/vladmandic/sdnext/wiki/Intel-ARC#docker-installation
https://hub.docker.com/r/disty0/sdnext-ipex
base img
venv will be created and installed on the first run
docker my love
intel compute runtime my hate
Compared to A750, I'd think the B580 should have less problems
Docker image for OpenVINO:
https://github.com/vladmandic/sdnext/wiki/OpenVINO#docker-installation
base img
using a B580 with sdnext at the moment, anyone who is curious about battlemage AI just let me know and I'll give it a try
what does AssertHandler::printMessage mean?
it happens after ive had the webui open for a long time
and then it just closes
happens when i click on generate after ive had the webui opened for a long time
does it happen because my computers not that powerful? i think it happens less when i lower the resolution and stuff
but if its open for a long time it does do it pretty easily regardless of the resolution
i have an arc a750 and 16 gigs of ram
and an i5 10500
and im on windows
and also sometimes, after this it also displays a python.exe error
something along the lines of referenced memory at (address) could not be read
and i think it happens more when i used longer prompts
Try running memtest to see if your ram is dying or not
Also 16 GB RAM is too low / barely enough
It will help but still barely enough
Tho it won't help if your RAM is dying
This one is concerning
ill run a memtest
i have one stick of ram thats new and one stick thats old
both are same tho
corsair vengeance
both the same speed as well even though motherboard doesnt let it run in full speed
the next 2 stick that im supposed to get, should they be corsair vengeance as well? or am i allowed to get different ones
Should be fine as long as the memory chip is the same
My PC has 2x 16 gb Crucial and 2x 32 gb Kingston
Both have micron memory chips
Both are 3200 MHz CL16
do speeds matter? i have xmp on currently otherwise i think it defaults to 2133
it should be same
or you will have to run all of them at the speed of the slower one
Does this image work in Windows 11? or does windows docker not have a way to pull the gpu in?
found a way to get the gpu through, but still had weird errors. tried running in windows and I don't know what i'm doing anymore apparently 'cause I can't even get default models running with a clean install 
queue cannot be constructed, engine did not start, lots of fun errors with no real clue as to what i'm missing
guess i'll just go back to drawing manually for today
Is it not working without docker?
