#SDNext WebUI on Intel ARC


proper cradle
#

I used that

#

OpenVINO fails because of the virtualization layer

#

IPEX works fine

grave condor
proper cradle
grave condor
#

consumer or developer account?

proper cradle
#

Free tier

#

Didn't upgrade anything

grave condor
#

edge gets you openvino, ai/hpc gets you ipex etc

proper cradle
#

Tried to run IPEX with SDNext before and that caused this

grave condor
#

maybe they auto ban everything that contains "webui" like colab once did

#

but you should get that fixed

keen marsh
#

Finally getting around to testing the new ipex, and it seems memory utilization is higher now; barely able to do 1080x1080 with sdxl with sdp. Sub-quad no longer seems to be supported by attention slicing (thought it was before?)

#

Okay figured out the memory problem, it was the ipex optimize stuff.

proper cradle
#

Are you on dev?

#

Memory optimizations got disabled by default since they had too much impact on performance

keen marsh
#

I am on the main branch

#

I copied over my config so that's probably why

#

does the new diffusers controlnet not work with arc? Keep getting xpu errors

keen marsh
#

might be just some of them, got canny to run. Have to figure out how to use it I guess.

proper cradle
#

MediaPipe Face preprocessor doesn't work

#

SegmentAnything preprocessor is fixed

#

Everything else seems to be working

#

Input / Output

keen marsh
#

Open pose wouldn't load at all and gave an error, depth models gave an error and crashed the ui. I will try again and see if I just did something wrong

#

Should mention this was only on sdxl, didn't try 1.5

proper cradle
#

Dev branch?

keen marsh
#

Just main branch, I will check out dev branch next launch and see

restive parcel
#

wanna try out controlnet toys someday, but ipex really doesn't like me DinaKEK

#

I've never gotten it to work outside of WSL, and y'all know how buggy that is

grave condor
#

Uninstall torch. And then follow the steps on the installation guide

keen marsh
#

The comfy UI fork on here should do the same as well.

#

No need to install the oneapi manually anymore I don't think.

tall grove
#

Yeah it's windows native so it should work easily

restive parcel
#

well, I tried.

restive parcel
#

and this when i tried to run pip install -r requirements.txt

#

all of this is inside a fresh miniconda venv

proper cradle
#

Don't use miniconda

#

Don't use Python 3.12

restive parcel
#

oh, I tried looking for an appropriate python version on the readme, didn't see any anywhere

proper cradle
#

Wipe your venv folder and use webui.bat --use-ipex with Python 3.11 or 3.10

restive parcel
#

how should I manage python version without conda?

proper cradle
#

webui.bat --use-ipex

#

Or uninstall Python 3.12

restive parcel
#

I only have 3.12 in the conda, so that should be fine. lemme check when's the last time I updated my system wide python though, since i've been using conda since the last time I screwed myself over with python versions

#

looks like 3.10

#

not exploding itself on that spot again 👍

proper cradle
#

Also you need to remove your venv folder after changing major python versions

#

venv will still be the old version otherwise
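
#

For anyone hitting the same thing: a rough way to tell whether a venv is stale is to compare the Python version recorded in its `pyvenv.cfg` with the interpreter you are about to launch. This is only a sketch. It assumes the venv was created by the standard `venv` module (which writes a `version = ...` line), and the helper names are made up for illustration.

```python
import sys


def venv_python_version(cfg_text):
    """Return (major, minor) recorded in pyvenv.cfg text, or None if absent."""
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        # The stdlib venv module writes "version"; virtualenv writes "version_info".
        if sep and key.strip() in ("version", "version_info"):
            parts = value.strip().split(".")
            return int(parts[0]), int(parts[1])
    return None


def venv_is_stale(cfg_text, running=sys.version_info[:2]):
    """True when the venv was built with a different major.minor Python."""
    recorded = venv_python_version(cfg_text)
    return recorded is not None and recorded != tuple(running)


# Hypothetical pyvenv.cfg content from a venv created under Python 3.12.
example_cfg = "home = /usr/bin\ninclude-system-site-packages = false\nversion = 3.12.1\n"
print(venv_python_version(example_cfg))
print(venv_is_stale(example_cfg, running=(3, 10)))
```

To check a real install you would read the file first, for example `Path("venv/pyvenv.cfg").read_text()`, and wipe the `venv` folder if the function reports a mismatch.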

restive parcel
#

wiped 👍

#

was about to wipe the whole git root again out of habit

#

looks like launch successful TakoBless

#

will see if I can get a proper image out or not

#

a lil funny lookin' but hey, that's definitely an image

#

better GGNodders

#

looks like its running!

tall grove
#

ur not using wsl or anything?

#

oh discord didnt update nvm

restive parcel
#

no wsl here, just native windows

tall grove
#

yeah only problem is only diffusers runs at full speed

restive parcel
#

feels very nice

#

do upscalers run at all though? I get a crash every time I try to use img2img or highres second pass

tall grove
#

i think?

#

been a while since i used it

keen marsh
#

I haven't tested the new ipex in original backend yet, might be full speed in windows now.

safe plume
proper cradle
#

INT4 for iGPUs is interesting

#

INT8 and INT4 on GPUs didn't work with 2023.2

#

Still doesn't work

#

It works if we use HETERO:GPU,CPU

proper cradle
#

But doesn't use 4 bit or 8 bit

onyx moth
#

oh wow Ive been running in wsl this whole time

#

advantages vs disadvantages of running sd next in Windows vs wsl?

restive parcel
#

advantages of wsl:
????none that I know of, its buggy and has memory issues

#

could be slightly faster but I don't know for sure

keen marsh
tall grove
#

Diffusers speed is the same atleast so I'd recommend native

proper cradle
keen marsh
#

diffusers in native is about as fast as original in wsl. Might be faster now though, I need to test sd1.5. So it's like 6it/s in windows vs 9it/s in wsl in diffusers (in the previous ipex anyway).

restive parcel
#

does SDNext still work over local network by default? I'd assume so, but I triple checked everything else to make sure it was the right address and accessible, so I figure I come ask as a sanity check

#

Just trying to access from phone so I can gen while walking around the house DinaKEK

green pagoda
restive parcel
#

localhost:7860 works fine on the host machine, i run a couple of other services that are connecting remotely just fine (koboldcpp, jellyfin), but this one just times out for some reason

green pagoda
restive parcel
#

what do you mean 'always'?

green pagoda
#

Your initial statement is unclear.

restive parcel
#

ahh. I haven't tried accessing SDNext remotely before, last time I would've was on A1111 a long time ago

#

I can currently access jellyfin though, I just put host address:8096

#

which works the same as localhost:8096 on the host machine. but while localhost:7860 works for SDNext on the host machine, host address:7860 on the phone does not connect

green pagoda
#

Yeah IDK it must be a configuration issue.

proper cradle
#

It will reject any other IP other than localhost by default
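
#

The reasoning here: a server bound to a loopback address (localhost, 127.0.0.1) only accepts connections from the same machine, which fits localhost:7860 working on the host while host address:7860 times out from the phone. The sketch below just classifies a bind address; `reachable_from_lan` is a made-up helper and `192.168.1.50` is an example address. In SDNext the launch option for this is most likely `--listen`, but treat that as an assumption and check `--help` on your version.

```python
import ipaddress


def reachable_from_lan(bind_host: str) -> bool:
    """Return True if a server bound to bind_host can accept LAN connections."""
    if bind_host == "localhost":
        return False
    addr = ipaddress.ip_address(bind_host)
    # Loopback (127.0.0.0/8, ::1) is host-only.
    # 0.0.0.0 and :: mean "all interfaces", so other devices can connect.
    return not addr.is_loopback


for host in ("127.0.0.1", "localhost", "0.0.0.0", "192.168.1.50"):
    print(host, reachable_from_lan(host))
```

Even with a non-loopback bind address, a firewall on the host can still block the port, so this only covers the binding side.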

restive parcel
#

ahh, so it is a setting, thank you

proper cradle
#

Experimenting with 1024x1024 curse

#

FP32 works fine

#

FP16 NaNs out

#

BF16 has the curse

#

Only happens with Alchemist GPUs

#

(GPU Max works fine)

proper cradle
#

Upcasted attention to FP32 then downcasted it to BF16 and now it NaNs out too

#

Using BMM instead of SDPA works fine

#

There is something wrong with Alchemist + SDPA combination

#

but there is slight hints of the curse with BMM too

proper cradle
#

Dev branch merged to master:

#

Diffusers is the default backend now.

restive parcel
#

i'm assuming controlnet is still not supported in diffusers yet?

fickle plume
#

TLDR
RuntimeError: Device type XPU is not supported for torch.Generator() api.

#

(This is on windows)

proper cradle
#

Update to 2.1

proper cradle
fickle plume
#

Ok, I guess not

tall grove
#

tbh i havent touched windows in a while but a fresh install seems to result in this error?

#

im unsure if ive done anything wrong as i just installed a fresh windows

proper cradle
#

tensordict 0.3.0 installs torch 2.2.0 for some reason

#

Locked it to 0.2.1

fickle plume
#

the oneapi base toolkit is no longer necessary, right?

tall grove
#

oh yeah k ill try that

fickle plume
#

nice, thanks

restive parcel
proper cradle
#

Even more

#

Some A1111 extensions can break tho

fickle plume
restive parcel
#

i find diffusers is behaving a fair bit different on image outputs, but it could be some diffuser settings maybe. I have more cursed resolutions, and some LORAs behave very differently

tall grove
#

tbh i cant seem to replicate sdxl outputs from civitai, tho i know it changes the resolution a bit, but everything seems unsaturated and slightly blurry, tho maybe ive set something wrong

keen marsh
restive parcel
#

I... think I have the sampler settings all set the same?

#

some definitely get closer than others, but I seem to get more artifacting in diffusers

#

<original backend | diffusers (copied settings with move to prompt) >

#

seems to me that LORAs are particularly getting messed up. If I don't use them, then the output is generally fine

#

eugh nevermind, some prompts become vastly degraded if I'm not comparing UniPC sampler

proper cradle
#

Prompt weighting actually works with Diffusers backend

#

Be careful with spamming ()

#

Doing this (((((())))) will burn your image for sure
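
#

To put a number on why stacking parentheses burns the image: under the A1111-style convention each pair of `()` multiplies the token weight by 1.1. Whether the Diffusers backend parser uses exactly the same factor is an assumption here, and `nested_weight` is just an illustrative helper.

```python
def nested_weight(depth: int, factor: float = 1.1) -> float:
    """Effective weight after wrapping a token in `depth` pairs of parentheses."""
    return factor ** depth


# Print the effective weight for one to six nested pairs.
for depth in range(1, 7):
    print(depth, round(nested_weight(depth), 3))
```

Five nested pairs already works out to about 1.61, so writing an explicit weight such as `(word:1.2)` is easier to control than counting brackets.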

restive parcel
#

oh i do a lot of prompt weighting DinaKEK

#

so does the prompt weighting effect just kick in a lot sooner? It seems my normal loras at 1.0 will bake an output in no time, but lower weights don't feel like they have as much impact

proper cradle
#

Do you use Karras scheduler?

#

Not using it will cause artifacts

#

Also there is diffusers native lora loader too

#

"LoRA use alternative loading method" in Extra networks settings

proper cradle
restive parcel
#

its just kinda blown out compared to my usual stuff, so i cut all weight numbers by half and the results are looking better?

proper cradle
#

I won't go over 1.2 most of the time

#

1.5 if you really want that prompt

#

Also downweighting is working on Diffusers

#

Original backend doesn't work with downweighting

restive parcel
#

downweighting?

proper cradle
#

weight < 1

#

0.5 etc

#

for prompts

#

Lora loader should be the same on both backends

restive parcel
#

oh, that's how I usually use loras, is sub 1

#

oftentimes negative number

#

oh, alternative loading method setting has.... weird effects

proper cradle
restive parcel
#

👍

proper cradle
#

Artifacting happened to me too rn

#

Refreshing the page fixed it

#

Don't know why and how it happened

restive parcel
#

i turned down all of my lora weights and now everything is outputting pretty decently. I didn't turn on the alternative loader, but i'm using downweighting to something around 0.2 on some and getting fairly good results. Slider type loras i can run at full 1.0 and scale as I normally would, including to the negative weights

proper cradle
#

Creates duplicate UNet for some reason

spice knoll
#

Hey! My apologies if this should have been posted someplace else, but most of my efforts have been focused on getting SD.Next working, so I figured this was the way to go. Anyway, here's where I'm at:

I have an i5-13400F, 64 gig of RAM, and an A750 (so 8 gig of VRAM). I realize that this is an upper-poverty-level PC, so I need to accept limitations. I'm relatively new to this stuff. I want to generate 1024x1024 images with SD XL (or a derived model), if possible.

I've been able to get Fooocus up and running using its default model (Juggernaut XL) with the directions from the appropriate thread in this channel. It hasn't been without its hiccups - I still don't have upscaling working - but it produces good results at speeds that I think are fine. It uses every last byte of my VRAM and without throwing "out of VRAM" errors. However, I want way more customization than Fooocus offers.

I've tried both A1111 and SD.Next, but haven't had a ton of luck with either one; I've gotten further in SD.Next, so I've focused most of my efforts there. I'm currently focusing on using the same model as Fooocus (Juggernaut XL), both to change as few variables as possible and to eliminate the "refining" step, which was giving me its own set of problems. To summarize my current status in SD.Next:

#
  1. If I don't use --medvram or --lowvram, but I enable all of the other assorted VRAM-saving settings (Move base model to CPU when using refiner, move base model to CPU when using VAE, move refiner model to CPU when not in use, VAE slicing, VAE tiling, and Attention slicing), it will generate 512x512 images with acceptable speed, but throws errors when trying to generate 640x640 (RuntimeError: Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)). Looking at my VRAM usage, it seems to be running out.

  2. Using --medvram along with the other VRAM-saving settings (VAE slicing, VAE tiling, and Attention slicing) generates 640x640 with acceptable speed. However, it throws (what seem to be) the same error at 768x768. Again, it does seem to max out VRAM.

  3. Using --lowvram (which, to my understanding, implies all the other VRAM-saving settings) will generate 1024x1024 with no errors, at about 30% of the speed of Fooocus, while only using about 18-20% of my VRAM.

Does anyone have any ideas about how to get the best of both worlds? I'd love to be able to get SD.Next's customizability (and am happy to use something other than SD.Next) while getting performance comparable to Fooocus.

keen marsh
spice knoll
# keen marsh Native windows? Make sure you have the ipex optimize stuff turned off, you shoul...

Thanks! Yes, native Windows, but I'm up for switching to WSL if it works better. Re: IPEX, I want to make sure I understand what you mean. I installed according to SD.Next's directions on GitHub, here: https://github.com/vladmandic/automatic/wiki/Installation

Those directions tell me to use the --use-ipex command line argument, so that's what I've done. By default, this does enable all of the "IPEX Optimize for Intel GPUs" options under "Compute Settings". I've tried disabling these per your reply (while still using --use-ipex), and I haven't seen any difference. I've checked and I'm using "Scaled-Dot-Product" under "Attention optimization method", I'm not using either of the "Full precision" options, and I'm running at BF16 (as I heard this was faster than FP16 on our cards).

keen marsh
spice knoll
#

Yup, I disabled the IPEX optimizations, but it didn't make any noticeable difference. I do need --use-ipex; I tried a separate fresh install without it and it doesn't work.

keen marsh
#

Use medvram

#

Also try fp16 and see what happens

#

Also disable the move to cpu stuff(for me I didn't have enough system ram but see if that helps as well)

spice knoll
#

Yup, was already using medvram. The move to CPU stuff is already disabled, because it's incompatible with medvram and lowvram. I just tried FP16, and I'm having some odd, possibly memory-leaky issues with it, as it will consistently generate one image at 512x512 without a problem, then crash when I try to generate another. BF16 will generate 512x512 all day long.

keen marsh
#

Odd, are you on the main branch and up to date?

#

Also you just let sdnext install itself and didn't install ipex yourself right? Just to make sure, if you are not in a venv and using your own install of ipex there could be issues with the latest version (haven't personally installed it in windows and just let sdnext do it)

#

Maybe post your settings and disty can help you further. Post error messages as well.

spice knoll
#

Yup, just reinstalled today. And yeah, it's running in its own venv and it installed its own ipex. Is there a way to export your settings for posting? What's the preferred way to do that?

keen marsh
#

Just screenshot with the snippet tool

spice knoll
#

Okay, thanks! I'll do that a little later; at the moment I'm a little frustrated and blowing off steam doing something else, but I'll no doubt be in the mood again in an hour or two 🤣 I'll take extensive screenshots.

keen marsh
#

It could also be your vae for sdxl, try turning it off and see what happens

#

You used to have to do a hack for the vae, but I have just kept it the same so don't know if it was updated for sdxl.

spice knoll
#

I'm using Juggernaut XL, because that's what I had working in Fooocus, which already has the VAE fix baked-in.

onyx moth
#

recommended settings for SD NEXT in windows? (Arc A770 16gb)

#

fresh install, 1st time running through windows once I get my output paths the way I want them. just wondering if its good to go without things like attention slicing. mainly just going to run sdxl

fickle plume
onyx moth
#

not sure how to disable that

#

question: if I launch through the shortcut for webui.bat do i need to add the --use-ipex arg?

fickle plume
#

Disabling them gives a minor speed and VRAM boost

restive parcel
#

i love my live previews =w=
with an 8gb card though, might not be able to afford it as easily

spiral junco
#

@proper cradle Hey Disty have you tried Stable Cascade with ipex?

proper cradle
spiral junco
#

yup, getting similar corruption

proper cradle
#

Happens on dev cloud with gpu max too

keen marsh
#

Reminds me of the early stable diffusion on ipex output

fickle plume
#

@proper cradle Do you want to fix a typo in modules/ipadapters.py
SDXL IP Adapters are typo'd as SXDL

fickle plume
#

Anyone having any issues getting the SDXL refiner to work recently?

keen marsh
fickle plume
#

Refiner just flat out doesn't work for me

#

Seems that person is getting artifacts, but I don't get any refiner effect

#

I do get an image at least

proper cradle
#

Try downloading the diffusers version from Models -> Huggingface tab

#

It is failing to load the refiner

#

Both of my GPUs are busy so I can't test anything rn

fickle plume
#

might test on linux later; either way, it used to work, it just stopped working "now"

#

on that note, ip adapter no worky either

proper cradle
#

ipadapter seems to be working fine

#

do you have sequential offload enabled

fickle plume
#

yes

#

i oom on windows otherwise (when trying to use ipadapter specifically)

proper cradle
#

Try dev branch

#

there were improvements in that area

fickle plume
#

alright

dense wyvern
#

Is it normal with A750 to get 2.50 to 2.60s/it on 1024x1024 sd1.5?

#

Seems odd tbh

keen marsh
#

Only with lowvram

dense wyvern
#

--medvram is mine

#

lowvram is even worse

keen marsh
#

should be around 1.5 it/s, or maybe that was s/it. Did you try those options I posted in the other thread?

dense wyvern
keen marsh
#

#1193952640225267802 message

dense wyvern
#

rest of the options seems to be same

keen marsh
#

I am on an older commit btw, so some options look different on yours I see.

#

default might be giving you fp32 speed, try setting model variant to fp16

#

also restart the webui after changing vram settings

#

make sure no-half and full precision are also not checked. And use fp16 or bf16, not fp32

dense wyvern
#

changed the settings now

#

gonna see how much it changed

#

same/just a lil better

#

2.40 to 2.50s/it lol

#

maybe i will try the older commits when I have time

#

High memory utilization: GPU=80% RAM=47% {'ram': {'used': 14.01, 'total': 29.66}, 'gpu':
{'used': 6.19, 'total': 7.75}, 'retries': 0, 'oom': 0}
Processed: images=1 time=215.15 its=0.23 memory={'ram': {'used': 12.26, 'total': 29.66}, 'gpu':
{'used': 4.41, 'total': 7.75}, 'retries': 0, 'oom': 0}
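
#

The percentages in that "High memory utilization" line can be reproduced from the dict that follows it. A small sketch, assuming the `used`/`total` values are in the same unit (presumably GB); `utilization` is a hypothetical helper, not something from SDNext.

```python
def utilization(mem: dict) -> dict:
    """Percent used per device, skipping counters like 'retries' and 'oom'."""
    return {
        name: round(100 * stats["used"] / stats["total"])
        for name, stats in mem.items()
        if isinstance(stats, dict)
    }


# Values copied from the log line above.
log_memory = {
    "ram": {"used": 14.01, "total": 29.66},
    "gpu": {"used": 6.19, "total": 7.75},
    "retries": 0,
    "oom": 0,
}
print(utilization(log_memory))
```

This gives 47 for RAM and 80 for the GPU, matching the `GPU=80% RAM=47%` in the message.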

dense wyvern
#

DPM++ 2M Karras is 2.50s/it
Euler A is 1.30it/s
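
#

Careful comparing those two numbers directly, since one is seconds per iteration and the other is iterations per second. A minimal conversion sketch (the function name is made up):

```python
def to_it_per_s(value: float, unit: str) -> float:
    """Normalize a speed reading to iterations per second."""
    if unit == "it/s":
        return value
    if unit == "s/it":
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit}")


dpm = to_it_per_s(2.50, "s/it")    # DPM++ 2M Karras reading from above
euler = to_it_per_s(1.30, "it/s")  # Euler A reading from above
print(round(dpm, 2), round(euler, 2), round(euler / dpm, 2))
```

So 2.50s/it is 0.4it/s, which makes Euler A a bit over three times faster in this run.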

keen marsh
#

oh yeah, I forgot about that. I used Euler and the Unipc one, one of the karras ones was always slower, but I am not well versed on whats what with those in Sd.next.

dense wyvern
#

How can I get an older commit of the git repository?

restive parcel
#

uhhhh you have to get the commit head, and then... commands... i always get lost when doing it so I'd just ask chatgpt cri

dense wyvern
#

I feel like my generations are really slow still... And when I check other people's diffuser settings etc. I dont have most of the stuff in their screenshots

grave condor
#

git checkout

fickle plume
#

I have a feeling SDNext doesn't support this, is that right?

fickle plume
#

Or maybe a diffusers limitation?

restive parcel
#

the prompt manipulation syntax is pretty janky anyway, but is fun to play with anyway

keen marsh
#

If it worked in a1111 before sd.next was forked it should probably work in original backend. Original is basically whatever a1111 did to get it to work in webui afaik.

blazing trellis
#

I have reinstalled SDNext and getting this error, when i use try to run it.

#

any ideas?

blazing trellis
#

ok, its running now. 512x512 with 2,3it/s with an A380

ember orchid
#

Thats good for A380

blazing trellis
#

SD1.5 running fine. SDXL crashes but dont know why.

#

512x768 is about 1.3it/s quite ok for the small A380.

proper cradle
#

you have to use --medvram or --lowvram

blazing trellis
#

have --medvram active

proper cradle
#

Disable IPEX Optimize

#

You don't have enough vram to hold SDXL on vram

blazing trellis
#

all ipex optimize options?

proper cradle
#

yep

blazing trellis
#

still crashing. maybe SDXL is too much for A380.

blazing trellis
#

ok, its running now. with 3,2s/it

dense wyvern
#

what is the "normal" it/s for a750 with 768x1024? any ideas? with dpm++ 2m karras

tall grove
#

dpm thing is a bit slower than Euler btw last time i checked

dense wyvern
#

yeah it is, but gives better images imo

#

also i cant seem to change between VAEs

#

NotImplementedError: Cannot copy out of meta tensor; no data!
the error I'm getting

dense wyvern
#

well fml. i cant seem to get it working properly lol. a1111 with openvino straight up doesnt work, sdnext is acting weird, fooocus doesnt work at all either... idk

#

if anyone can help hmu please I'm down to hop on call and get whatever help I can get

proper cradle
#

That area had improvements in dev branch

dense wyvern
proper cradle
#

You might want to try NNCF with --medvram (tho just --medvram should work too)

#

NNCF compresses the model into 8 bit without losing much quality and halves the model size in VRAM

#

but it is slower to run
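
#

The "halves the model size" part is just bytes per weight: 2 bytes at fp16/bf16 versus 1 byte at 8 bit. A back-of-the-envelope sketch; the 2.6 billion parameter count is an illustrative assumption (roughly an SDXL-sized UNet), and it ignores the small overhead of quantization scales.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weights_gib(n_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


n_params = 2_600_000_000  # hypothetical parameter count, for illustration only
for dtype in ("fp32", "fp16", "int8"):
    print(dtype, round(weights_gib(n_params, dtype), 2), "GiB")
```

Activations and the VAE still need their own memory on top of this, so the real VRAM saving is smaller than the weight saving.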

#

Also disable IPEX Optimize if you use Loras

#

IPEX Optimize is disabled by default in dev branch now

proper cradle
dense wyvern
proper cradle
#

It's in Compute Settings

dense wyvern
proper cradle
#

Enable Model and Text Encoder

#

VAE isn't really worth it

dense wyvern
#

Also done this since I use them

proper cradle
#

yep

dense wyvern
#

what is the average it/s for a750 do you have any ideas by any chance?

proper cradle
#

7~ it/s with SD 1.5 512x512

dense wyvern
#

768x1024 didnt generate

#

512x512 gives 2.62it/s

proper cradle
#

SD 1.5?

dense wyvern
#

Yes sir

proper cradle
#

HyperTile is a must for SD 1.5

dense wyvern
#
RuntimeError: Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES) -5 (PI_ERROR_OUT_OF_RESOURCES)
#

launched with --medvram

proper cradle
#

768x1024 is way too much for SD 1.5 architecture

#

HyperTile is different

dense wyvern
#

for arc gpus? I could generate up to 1.8it/s with RTX3050ti in 768x1024

proper cradle
dense wyvern
#

Yes sir

proper cradle
#

NNCF will make it significantly slower

#

Not worth it for SD 1.5

#

I thought you were using SDXL

dense wyvern
#

No I didnt even have chance to try SDXL I've been having issues since I bought this card for SD

#

It generates sometimes, sometimes doesnt, or generates slow as **

#
EinopsError:  Error while processing rearrange-reduction pattern "b c (nh h) (nw w) -> (b nh nw) c h w".
 Input tensor shape: torch.Size([1, 512, 64, 64]). Additional info: {'nh': 8, 'nw': 6}.
 Shape mismatch, can't divide axis of length 64 in chunks of 6

I get random errors like this now and then even though I havent changed anything and just adding to queue
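
#

What that EinopsError is saying: the 64x64 tensor is being split into nh=8 by nw=6 tiles, and 64 does not divide evenly by 6. A plain-Python sketch of the same check, with hypothetical helper names (this is not HyperTile's own code):

```python
def valid_tile_counts(length: int, max_tiles: int = 16) -> list:
    """Tile counts that divide an axis of the given length evenly."""
    return [n for n in range(1, max_tiles + 1) if length % n == 0]


def tiling_problems(height: int, width: int, nh: int, nw: int) -> list:
    """Describe every axis that cannot be split into the requested tiles."""
    problems = []
    if height % nh:
        problems.append(f"height {height} is not divisible by nh={nh}")
    if width % nw:
        problems.append(f"width {width} is not divisible by nw={nw}")
    return problems


# Shape and tile counts taken from the error message above.
print(tiling_problems(64, 64, nh=8, nw=6))
print(valid_tile_counts(64))
```

For a length of 64 only power-of-two tile counts work, which is why nw=6 fails while nh=8 is fine.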

proper cradle
#

HyperTile VAE is enabled?

#

VAE part can cause these

#

Only enable Unet

dense wyvern
#

RuntimeError: Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)

#

Back to square one

#

I literally changed nothing lmao

dense wyvern
#

How should I make the settings?

proper cradle
#

Defaults with only Unet enabled is good

proper cradle
dense wyvern
#

even though i changed nothing?

#

like i said, its giving random errors at random times

#

lol... I cant figure it out I tried nearly all the webui's from forge to fooocus

proper cradle
#

Probably better off using Linux or at least WSL

dense wyvern
#

i tried wsl too

proper cradle
#

Native Windows is still pretty new

dense wyvern
#

I'll send the error I get in wsl

#
Command: /home/mercury/automatic/venv/bin/python3 -u launch.py --use-ipex --medvram
Uptime: 62.413761 s
Segmentation fault
Traceback (most recent call last):
  File "/home/mercury/automatic/venv/bin/ipexrun", line 8, in <module>
    sys.exit(main())
  File "/home/mercury/automatic/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/launcher.py", line 109, in main
    cpu_run_main_with_args(args)
  File "/home/mercury/automatic/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/launch/launch.py", line 447, in run_main_with_args
    launcher.launch(args)
  File "/home/mercury/automatic/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/cpu/launch/launcher_multi_instances.py", line 322, in launch
    raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command 'taskset -c 0,2,4,6,8,10 /home/mercury/automatic/venv/bin/python3 -u launch.py --use-ipex --medvram' returned non-zero exit status 139.
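#

Side note on "exit status 139": by the usual shell convention a status above 128 means the process was killed by signal number status minus 128, and 139 - 128 = 11, which lines up with the "Segmentation fault" line in the log. A tiny decoder sketch (the function name is made up):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Decode a shell-style exit status into a readable description."""
    if status > 128:
        try:
            # Statuses above 128 conventionally encode 128 + signal number.
            return f"killed by {signal.Signals(status - 128).name}"
        except ValueError:
            return f"exit status {status} (no matching signal)"
    return f"exited normally with status {status}"


print(describe_exit_status(139))
```

On Linux this reports SIGSEGV, so the Python traceback from ipexrun is only the wrapper noticing that the real process crashed.
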
proper cradle
#

1st, use ./webui.sh

#

That sets a lot of stuff for you

#

It won't work without it

dense wyvern
#

./webui.sh --use-ipex

#

this is how i run it

#

i thought it doesnt set up it for ipex if i dont --use-ipex

proper cradle
#

Try;

DISABLE_IPEXRUN=1 ./webui.sh
proper cradle
dense wyvern
#

Yes sir

proper cradle
#

And do you have an iGPU?

dense wyvern
#

disabled from bios

proper cradle
#

Intel iGPU will conflict

proper cradle
dense wyvern
#

aight lemme try that again, deleted and reinstalling ubuntu wsl

#
mercury@DESKTOP-3EH5PI7:~/automatic$ DISABLE_IPEXRUN=1 ./webui.sh
Create and activate python venv
Error: [Errno 13] Permission denied: '/home/mercury/automatic/venv'
Error: Cannot activate python venv
#

ok wait i think i figured out why this is lol

#

yeah, that was on me

#

started once crashed

#
mercury@DESKTOP-3EH5PI7:~/automatic$ DISABLE_IPEXRUN=1 ./webui.sh
Segmentation fault
#

then when trying to start again gave this error

#

it doesnt start anymore

proper cradle
#

For those on Arch, don't update to Linux 6.8 until Intel Compute Runtime adds support for the new DRM Xe driver.

#

I downgraded to Linux LTS 6.6 for now.

green pagoda
proper cradle
#

Noticed that too

#

Exporting these worked:
NEOReadDebugKeys=1 OverrideGpuAddressSpace=48

green pagoda
#

I haven't upgraded to 6.8 yet (still waiting on the Fedora kernel) so I don't have any first-hand experience.

broken grail
#

unfortunately I'm stuck on 6.8 because of bcachefs having a fairly critical bug in earlier versions

broken grail
#

Will do, testing now

#

Where'd you find those?

proper cradle
broken grail
#

ah I see

#

and it's working now, thank you very much

ember orchid
#

Putting this here, as a lot of coders are here. Seems there is interest in using Arc mGPU for AI workloads from tinygrad and they put out a bounty for XMX.
https://twitter.com/__tinygrad__/status/1770145884862546282?s=46&t=FGbX5gLUn0Bqa1heNNuXaQ

We are putting resources into Intel now. multiGPU training worked out of the box, and haven't seen a lick of driver instability. The A770 card is currently slow though, we need to add XMX support ($400 bounty added)

small mason
#

Have 32GB Ram but XL Models keep freezing my PC because it takes up all my Ram. Anyone knows how to solve this?

tall grove
#

nothing can solve it lol

#

more ram

#

either that or dont offload ur models

#

if u have enough vram u can maybe do it

#

32gb should be enough tho

#

atleast i dont have much issue

small mason
keen marsh
#

Uncheck any of the offload to cpu stuff, make sure you are using the recommended optimizations as well. And If on windows native turn off ipex optimize

fickle plume
#

Anyone else on windows wanna confirm if the euler sampler (and probably other non-default samplers) break SDXL?

upper pine
#

No clue if you know anything about it, but the error I was getting as of late was caused by the intel-compute-runtime package being out of date on Arch, thus having to resort to utilizing an AUR version

#

I'm not at home right now to tell you the exact error it was throwing before but I can get it to you just for kicks and giggles
(moral of the story: nice job not maintaining the latest packages, Arch devs)

ember orchid
# small mason 16 (A770)

I do a lot of SDXL and have not run into a memory issue. Are you doing latent upscale at some multiple of 1024? What resolution are you working at?

spiral junco
#

scaled_dot_product_attention, apply the workaround to slice the memory allocation chunks to be less than 4GB
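
#

To see why slicing matters, it helps to estimate the size of the attention score matrix. The numbers below are assumptions for illustration: 8 heads, 16384 query and key tokens (a 128x128 latent), 2 bytes per element, and the 4GB limit read as 4 GiB. The helpers are hypothetical and only do the arithmetic; they are not the actual workaround code.

```python
GIB = 2**30


def attention_scores_bytes(batch_heads: int, q_tokens: int, k_tokens: int, bytes_per_el: int) -> int:
    """Size of the (batch*heads, q, k) attention score tensor in bytes."""
    return batch_heads * q_tokens * k_tokens * bytes_per_el


def min_slices(total_bytes: int, limit_bytes: int = 4 * GIB) -> int:
    """Smallest slice count that keeps every chunk strictly below the limit."""
    return total_bytes // limit_bytes + 1


total = attention_scores_bytes(batch_heads=8, q_tokens=16384, k_tokens=16384, bytes_per_el=2)
print(total / GIB, "GiB ->", min_slices(total), "slices")
```

With these assumed shapes the tensor is exactly 4 GiB, so it has to be split into at least two chunks. In practice the slices follow tensor dimensions, so the real count can be higher than this lower bound.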

fickle plume
#

Getting this when I have FreeU enabled:

Intel MKL FATAL ERROR: Cannot load mkl_def.2.dll.
tried --reinstall, no change.
proper cradle
#

Didn't update Windows since i don't have a Windows environment to test how to include MKL with Windows

fickle plume
#

ah, still using custom builds for windows, huh
:(

#

wonder what's still broken with the regular ones

proper cradle
#

SDNext downloads them from pypi on Linux

proper cradle
#

Seems like memory leak issues got better too

#

So disabled ipexrun by default now

#

ipexrun causes more issues than it fixes

safe plume
proper cradle
#

same

safe plume
#

So perf is same but less buggy

proper cradle
#

yes

#

inpainting crashes should be fixed too

safe plume
#

Neat

green pagoda
#

That's always welcome

minor plank
#

I'm new here 🤖 Do we just run Installer.py to update IPEX?

fickle plume
minor plank
#

Yes sir 🤖

#

I found this intel github

#

I'll read and try to assimilate 🤖

fickle plume
#

You know some basic python usage? You can see the install instructions below, replace them in install.py, and --reinstall
i wonder if there's some dependency issues

minor plank
#

Waiting is probably for the best. 🤖

proper cradle
#

and ipex on windows needs libuv

#

So we need custom packages to include libuv with ipex or make the user install conda, libuv, mkl, dpcpp manually

#

not interested in including conda in SDNext

fickle plume
#

pretty understandable, yes

onyx moth
#

Don't have SD Next installed at the moment so, need to go to intel github and update IPEX, then reinstall SD Next (windows) right?

tall grove
#

u dont have to do anything with ipex for windows

#

its setup for you

#

windows ipex hasnt been updated yet i think

keen marsh
#

Is the windows version of one api fixed now? Also can we access the old versions yet?

fickle plume
#

Probably disty 😛

fickle plume
#

Is there something I'm missing with inpainting - isn't only masked area supposed to be much faster? It still takes as long as for the full image, and if that image is 4k^2 i kinda don't want to wait that long and need to resort to cutting the image up

fickle plume
#

pixart alpha (Diffusers/PixArt-alpha/PixArt-XL-2-1024-MS [b89adadecc])'s negative prompt seems to be put in 1 more list than it should be, and the pipeline fails making it lowercase

#

Hacking a fix in diffusers since idk where automatic puts it in a list, it produces an image... broken :(
Bright scene, aerial view,ancient city, fantasy, gorgeous light, mirror reflection, high detail, wide angle lens.
111 seed, 4.5 cfg scale, 20 steps, everything else cfg set to 0, default sampler, karras and low order, 1 clip skip
on the other side, what the demo gives me

#

wonder if it could be medvram or vae tiling

#

let me fetch a bigger log that isn't an annoying crop

#

Probably that VAE failing to load

proper cradle
#

I don't think non SD VAE loading is supported

#

And PixArt doesn't specify a VAE with its examples either?

fickle plume
#

I saw in an old closed issue vlad posted some pixart alpha result that didn't look garbled

#

so it has worked before, on nvidia
i will poke a bit more, vae tiling and medvram weren't the cause

proper cradle
#

fp16 vs bf16?

#

also fyi, you don't have jemalloc installed for ipexrun

fickle plume
#

bf16

proper cradle
#

bf16 can reduce the quality of some models

#

try setting fp16 and restart

fickle plume
#

Fp16 result is definitely better, that was probably it 👍

#

Trying out the other demo prompts here, looks pretty good

#

is pixart sigma too new for sdnext? or too new in general
I see some of their HF repos have 0 downloads 😛

proper cradle
#

and click on it from the reference models

fickle plume
#

i am on dev, i'll try it out

#

what do you mean by reference models?

proper cradle
#

Main menu

fickle plume
#

My UI is completely different

proper cradle
#

Networks tab

#

I am using a beta ui rn

fickle plume
#

It looks better, how can I use it

fickle plume
#

downloading it now

fickle plume
#

though I wish it was sorted newest first instead of oldest first

proper cradle
fickle plume
#

I haven't really made any reports, and most of them anyways seem to boil down to ipex issues instead

fickle plume
fickle plume
#

Looking at their github, they seem to have uploaded incorrect/broken models twice

#

So I would not be surprised if this is a pixart issue, unless you or someone else got it running?

proper cradle
#

I don't know what happened but my A770 can't run any SYCL or OpenCL rn

#

it gets stuck with 100% single core CPU usage

#

clpeak gets stuck

#

ipex gets stuck

#

opengl apps seems to run fine

#

vkpeak seems to run fine

#

some vulkan apps refuse to start

#

some work

proper cradle
#

tho many packages were upgraded before the reboot

#

Tried going back to Linux 6.6 LTS but still the same thing

#

tried using docker to see if it is an Arch Linux issue but still the same thing

chrome bone
#

it was fun while it lasted 😢

proper cradle
#
(venv) disty:~ $ python
Python 3.10.14 (main, Mar 25 2024, 23:39:34) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import intel_extension_for_pytorch
No ROCm runtime is found, using ROCM_HOME=' '
>>> a = torch.randn((10,10,10))
>>> a = a.to("xpu")
>>> print(a)
(gets stuck here)
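
Since the hang above never returns to the prompt, one way to probe for it without losing the shell is to run the same steps in a child process with a timeout. This is a minimal sketch, not something from the thread: `run_with_timeout` and `XPU_SMOKE_TEST` are hypothetical names, and a process stuck inside the kernel driver may not always be killable this way.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_s: float) -> str:
    """Run a Python snippet in a child process.

    Returns "ok" if it exits with code 0, "error" if it exits with a
    non-zero code, and "hang" if it does not finish within timeout_s.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return "hang"
    return "ok" if result.returncode == 0 else "error"


# Same steps as the REPL session: move a small tensor to "xpu" and read it back.
# The IPEX import mirrors the session; it yields "error" if IPEX is not installed.
XPU_SMOKE_TEST = (
    "import torch\n"
    "import intel_extension_for_pytorch\n"
    "a = torch.randn((10, 10, 10)).to('xpu')\n"
    "print(a)\n"
)
```

Calling `run_with_timeout(XPU_SMOKE_TEST, 60)` from the venv would then report `hang` on an affected setup instead of blocking forever.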
#

GPU usage maxes out but pulls only 84W

#

Also maxes out a single CPU core

#

GPU usage drops after ~10 seconds but it's still stuck with 100% single core CPU usage

#

clpeak:

disty:~ $ clpeak -p 1
dlerror: /opt/intel/oneapi/compiler/latest/linux/lib/x64/libintelocl.so: cannot open shared object file: No such file or directory

Platform: Intel(R) OpenCL Graphics
  Device: Intel(R) Arc(TM) A770 Graphics
    Driver version  : 24.09.28717.12 (Linux x64)
    Compute units   : 512
    Clock frequency : 2400 MHz

    Global memory bandwidth (GBPS)
      float   : (gets stuck the same way here)
fickle plume
fickle plume
#

Do you wanna try 6.5

proper cradle
#
disty:~ $ sudo igsc fw version
Device: FW Version: DG02_1.3257
fickle plume
proper cradle
#

Vulkan apps that refuse to launch throw these errors:

MESA-INTEL: warning: cannot initialize blitter engine
#

OpenGL runs fine

#

vkpeak runs fine

fickle plume
proper cradle
#

Installed linux-hardened since it is still on 6.8.4

#

And it works fine

#

6.8.5 and 6.6.27 are broken

#

reported

broken grail
#

after pulling from master, models no longer load

#

6.9 kernel, and I think i had generation working with the export environment variables

#

did something new break?

#

(just checked on dev branch, happens as well, perhaps due to symlinked models dir? just gets stuck on "load model")

#

not due to symlink

proper cradle
#

Last working kernel is 6.8.4

broken grail
#

Got it, thanks

proper cradle
#

Latest LTS also doesn't work

#

Kernels that have the Spectre BHI fix don't work.

spiral junco
#

is it related to the mkl and mkl-cpp package version on your env?

proper cradle
#

Same with ipex 2.1.10 and 2.1.20

proper cradle
scarlet ingot
#

Yeah, something is not OK in the latest kernels (stable, lts and rc). ComfyUI, automatic etc. was working two weeks ago (6.8.3 ish, 6.6.24), now they sit with 100% cpu on the python process. No apparent error message. Quickly testing kernel 6.6.24 and everything works as expected.

broken grail
#

6.8.6 dropped, don't see anything relevant in the changelog

proper cradle
#

That doesn't work either

rocky onyx
#

Can someone please tell me the process to install SDNext WebUI? @proper cradle

fickle plume
scarlet ingot
#

No luck with 6.8.7 or 6.6.28 either here.
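
The kernel reports so far in the thread can be collected into a small lookup. This is only a sketch of what people reported here, not an authoritative compatibility table: `reported_status` is a hypothetical helper, patch levels nobody tested are extrapolated or marked unknown, and the picture may change once a fix lands.

```python
import re


def parse_release(release: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a string like '6.8.4-arch1-1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def reported_status(release: str) -> str:
    """Return 'working', 'broken' or 'unknown' based on reports in this thread."""
    major, minor, patch = parse_release(release)
    if (major, minor) == (6, 8):
        # 6.8.3-ish and 6.8.4 were reported working; 6.8.5, 6.8.6 and 6.8.7 broken.
        # Other 6.8.x patch levels are extrapolated from those reports.
        return "working" if patch <= 4 else "broken"
    if (major, minor) == (6, 6):
        # 6.6.24 was reported working; 6.6.27 and 6.6.28 broken.
        if patch <= 24:
            return "working"
        if patch >= 27:
            return "broken"
    return "unknown"
```

On Linux, `reported_status(platform.release())` (with `import platform`) would check the running kernel.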

summer ocean
#

any way to make an executable run on the GPU instead of the iGPU?

restive parcel
#

very annoying when you want to use immature ai utilities like that llm intel repo posted in intel blog

#

SDnext is very convenient how it takes care of the necessary venv stuff, manages the nitty gritty settings, and all i need to do to run it even on windows is shell webui.bat

fickle plume
restive parcel
#

I get an issue seemingly random lately, where during boot up of webui.bat, it'll just say "press any key to continue..." and exit out

#

its not even always the same point in the script either

#

oh, i tried a few more times and my whole computer blue screened with kernel error, so maybe it's something on the driver side DinaKEK

fickle plume
#

Though I haven't had bluescreens, I've had my screen flash black if I generate too many images, with sdnext or with comfy
and then that's it, no more generating images until reboot

#

sometimes "too many" being 7 images with a lora

#

before things get critically unusable

#

on linux it's better, since there the driver doesn't seem to die, though there are still leaks, at least I can have sdnext continuously killed and restarted when it leaks to a too high amount

#

if you can dual boot, I'd suggest it
I haven't tried WSL but I wouldn't be surprised if it's the same situation as native windows

restive parcel
#

i do have dualboot, but the problem i run into is that i want to draw while generating, and use it as a corrections tool. my drawing software only runs in windows so its not very useful to dual boot
I also can't get it to run in linoox for some reason

fickle plume
restive parcel
#

blender + SD does look fun

#

I've wanted to try it for quick demoing background composition, just to see if something works before i spend all the time doing it

broken grail
#

still busted on 6.8.9

fickle plume
#

Man, is it just me or is the base Pony model fried

proper cradle
#

Pony doesn't work with anything other than the default

fickle plume
#

i'm not getting broken images, but the results I am getting are just meh and look a bit fried
Let me generate something
I am using it with the default clip skip

fickle plume
#

score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, source_anime, 1girl, aqua hair, aqua eyes, twintails, necktie, boots, detached sleeves, sitting, holding, book, bench, outdoors, blue sky, sun, grass, tree, forest

#

7.0 cfg scale, euler, 27 sampling steps, 0.7 rescale guidance
yes karras, no dynamic threshold, yes low order, no rescale beta
2580050041

#

At this point I'm wondering if the model is just bad and I need to be putting extra effort to salvage it, or what

#

I've seen the aua stuff, I've seen the dude admit he did lossy compression on his dataset multiple times

#

Sometimes when I write an extremely long prompt, in specific cases, it will start approaching some decent looking style that isn't a watercolor filter that can't be prompted away

#

dtype, and the resulting image from a 2 clip skip, so I keep the clip skip at 1 which was the default, right?

#

had a much better and smoother experience with animagine

#

cranking the cfg up helps a bit

proper cradle
proper cradle
#

So use 1

#

1 = -2
2 = -3
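
Read as a rule, the two lines above say that the clip skip setting N selects the text encoder hidden state at index -(N + 1). A tiny sketch of that mapping follows; `clip_skip_to_layer_index` is a hypothetical name, and whether the same convention holds for every model type is an assumption.

```python
def clip_skip_to_layer_index(clip_skip: int) -> int:
    """Map the UI clip skip value to a hidden-state index, per the message above.

    clip skip 1 -> -2 (penultimate layer), clip skip 2 -> -3, and so on.
    """
    if clip_skip < 1:
        raise ValueError("clip skip starts at 1")
    return -(clip_skip + 1)
```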

fickle plume
restive parcel
#

i use a branch from animagine that's been working very nicely, i seem to have forgotten the name though. quite reliable and seems to perform well on various benchmarking tests that check for bias

fickle plume
#

I keep seeing people on reddit singing high praises over pony, to the point i noticed disty even got downvoted when he mentioned animagine, so i really started wondering if there was something i was TOTALLY missing with that model

#

i have managed to finagle a decent result out of it every once in a while, but it's been a headache

proper cradle
#

It's just that pony has the best nsfw content

fickle plume
restive parcel
#

i think i'm using... anima pencil XL

proper cradle
fickle plume
#

nice ui

upper pine
#

looking good!

fickle plume
#

:( so sotediffusion needs pytorch 2.2 in sdnext? is that right?

#

trying it on comfy, it ooms
actually, let me try medvram or lowvram
welp, still OOMing

fickle plume
#

tried with --fp8_e4m3fn-text-enc --fp8_e4m3fn-unet, and... it seems like a good thing that the 4gb limit triggered here

keen marsh
#

Probably not exciting but I think SD.Next just added Sd3-mid support

proper cradle
fickle plume
#

Oh noes

proper cradle
#

Tho SoteDiffusion used all 3 GPU vendors in its training process

#

An RGB model : )

blazing trellis
#

I have problems with AnimateDiff and SDNext. Every time I try to make a short video, my VRAM and system RAM are filled to the limit: 16GB on an A770 and 29GB of 32GB system RAM. On my 3090, the 24GB is also completely used.

#

I use SD 1.5 models and the 1.5 V3 animate module.

merry inlet
#

how can i use controlnet? when i try to apply a change it doesn't apply

fickle plume
proper cradle
#

Extension will not work so that's why it is disabled

merry inlet
fickle plume
#

Sorry yes, my bad
It should still work by default, just use it from the control tab like disty says

#

SDNext has controlnet built-in

keen marsh
#

On that, have you guys added the new sdxl control net aio thing?

merry inlet
fickle plume
merry inlet
#

BRO HOW??

proper cradle
#

It is right next to the image tab

proper cradle
fickle plume
modern cedar
#

here it is @fickle plume, i tried different models, same result: pic quality is very bad

fickle plume
#

You can't use any random resolution you want

#

I'm also not really seeing any blur

modern cedar
#

oh thats the problem?

#

i did ur res, still bad

fickle plume
#

Models are trained for specific resolutions and aspect ratios
Going above the total number of pixels will result in repeating patterns
Going below will result in images getting progressively more fried
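
To make the "total number of pixels" point concrete, here is a rough sketch that picks a width and height for a given aspect ratio while staying near the pixel budget a model was trained on. The base sizes (512 for SD 1.5, 1024 for SDXL) are the commonly cited ones, snapping to a multiple of 64 is an assumption, and `fit_resolution` is a hypothetical helper rather than anything in SDNext.

```python
import math


def fit_resolution(aspect_ratio: float, base_side: int = 512, multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) close to base_side**2 pixels for width/height = aspect_ratio."""
    target_pixels = base_side * base_side
    height = math.sqrt(target_pixels / aspect_ratio)
    width = aspect_ratio * height

    def snap(value: float) -> int:
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)
```

For example, `fit_resolution(2 / 3, base_side=512)` gives a portrait size for an SD 1.5 model, and `fit_resolution(1.0, base_side=1024)` gives the square SDXL size.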

fickle plume
#

This is how image generators are

#

They don't produce super great results

#

furthermore, "photorealistic" means something that approximates realism

#

the actual realistic things, are just photos

modern cedar
#

look at face lul

#

i dont have refiner section as u have

fickle plume
#

You're using a 1.5 model. There is no refiner. The refiner doesn't matter anyways.

#

The refiner is an SDXL thing.

#

It is not good enough.

modern cedar
#

i have a750 cant handle sdxl model

fickle plume
#

This is not magic that will fix every issue with AI image generators

#

You have to understand that, for the time being, the quality isn't that great and you will need to manually touch things up

#

Inpainting, editing with gimp/krita, whatever

modern cedar
#

i see

#

so for an sdxl model you need at least an a770, right? because of the 16gb vram

fickle plume
#

You can go have a poke with ideogram, which is the current state of the art, best image generator, if all you want is to just type in a prompt, and see the issues it also still has https://ideogram.ai/t/explore

modern cedar
#

do u have that pose links

#

thx for help again

fickle plume
#

It's on the right

#

➡️

#

this right

fickle plume
#

None currently can count

modern cedar
#

i see

fickle plume
#

asking for a specific number of inanimate objects, as opposed to spider legs that it has learned tend towards 4+4, is even worse

modern cedar
#

how is text to video going, you think?

#

maybe need 24gb vram right? 😅

fickle plume
#

you can do text to video with 16
I wouldn't bother

#

not good enough currently

#

unless maybe you just want some looping wavy hair or something very trippy

keen marsh
#

Also, images are low res so you have to upscale for crisper detail.

dreamy loom
keen marsh
#

I will have to test but that doesn't seem right, have you filed a ticket with intel anywhere?

merry inlet
#

How can i use upscaler on arc gpu?( can i use Ultimate upscale?)

merry inlet
modern cedar
#

guys how can i use sdxl model on next?

#

i downloaded the model, added it to the folder and already selected it, but there is no refiner section on my screen

#

they say to add these settings, but how do i add them?

modern cedar
pastel geode
modern cedar
#

problem is there are no options to change the settings

#

if i change it somehow then of course i will apply and restart man

proper cradle
#

Settings -> User Interface Options

#

Tho the refiner isn't needed most of the time

#

Not worth it

#

And that wiki page is kinda outdated now

#

You can skip them and just use the default settings

modern cedar
#

oh thx

onyx moth
#

is there an updated guide for installing? I haven't used SDNext for a while but thinking about getting back to it

keen marsh
#

Pretty sure its as simple as git pull now, but also haven't tried in a while. I don't think you even need to install oneapi anymore

fickle plume
proper cradle
#

otherwise you will get mkl not found errors

onyx moth
#

Windows install is good now if I remember so I think ill be going through that instead of wsl

proper cradle
#

FLUX with NNCF INT8 compression on A770

fickle plume
#

how is it compared to fp8

proper cradle
#

int8 pretty much has the same precision as bf16

#

(for models that stay around 0-1)

#

non quanted fp8 is pretty much a dumpster fire

#

This is INT8 for example with cascade

#

these two are fp8

#

FLUX BF16 vs INT8

fickle plume
#

damn, that's pretty good

fickle plume
keen marsh
#

Think we will get flux on 8gb cards anytime soon? (Not counting nvidia, which uses system ram through drivers)

fickle plume
#

I'm getting the impression that this is becoming a thing on windows for intel

#

That being said, given the wait times I've seen people mention... 5+ minutes for an image, I think it's a better idea to just pay for the online stuff, at least for the time being
We'll see where finetuning does or doesn't go 🤔

#

personally I'm not super convinced throwing more parameters at the problem is going to solve image generation, or at least not by itself
while this is a big improvement in anatomy, it still struggles, you can even see it in disty's pics

chrome bone
#

difference between BF16 and INT8 is pretty noticeable

fickle plume
#

You mean... For LLMs?

chrome bone
#

i mean the waifus above

#

wouldnt trade quality for speed

#

if you compare fp16 vs fp32, they are just different. i wouldnt say fp16 is worse

fickle plume
#

This difference is fairly minor compared to BF16 vs fp8e4m3fn

#

(kinda source trust me bro)

#

fp8 changes the whole composition (from what I've seen from people on reddit, when browsing?)

#

now that's noticeable

#

this here is what, slightly different frills, maybe a watermark maybe not?

#

quality is pretty much the same, both have different errors in the details

#

both have... that leg

fickle plume
chrome bone
#

the jawline is different! thats turning a waifu to my aunt

#

in comparison

fickle plume
# chrome bone the jawline is different! thats turning a waifu to my aunt

it's just a minor detail, it also has better frills, but at the same time int8 doesn't have a fried watermark, the particles near the magic spell are more sensible and the fingers are a bit better
realistically you can't run the model at bf16 on an a770 16gb in a timely fashion, assuming windows has offloading now which I think it does
you need ~22gb
if int8 is this close to bf16 it might as well be way way better than fp8
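
The ~22gb figure lines up with simple arithmetic on weight storage. A minimal sketch follows, assuming the FLUX transformer has roughly 12 billion parameters (the commonly cited size, not stated in the thread) and counting weights only, so text encoders, VAE and activations come on top. `weights_gib` is a hypothetical helper.

```python
def weights_gib(num_params: float, bits_per_param: int) -> float:
    """Size of the raw weights in GiB for a given parameter count and bit width."""
    return num_params * bits_per_param / 8 / 2**30


FLUX_PARAMS = 12e9  # assumption: roughly 12B parameters in the transformer

for name, bits in (("bf16", 16), ("int8 / fp8", 8), ("int4", 4)):
    print(f"{name:>10}: {weights_gib(FLUX_PARAMS, bits):5.1f} GiB")
```

Under those assumptions bf16 lands a little above 22 GiB and the 8-bit formats at about half of that, which is why the 8-bit variants are the ones that fit on a 16gb card.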

proper cradle
#

BF16 only has 7 bits for the fractional (mantissa) part

#

it is an inferior format compared to fp16 for diffusion use

#

int8 is pretty close to bf16

#

fp8 is comparable to int4

#

diffusion needs the fractional part, the integer part stays around 0 most of the time
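
The precision argument can be checked with plain Python by rounding a float32 to bf16, which keeps the sign, the 8 exponent bits and only the top 7 mantissa bits. This is an illustrative sketch (`round_to_bf16` is a hypothetical helper; NaN and infinity are not handled), and the int8 step shown assumes a symmetric quantizer over [-1, 1], which may differ from what NNCF actually does.

```python
import struct


def round_to_bf16(x: float) -> float:
    """Round a value to the nearest bf16-representable number (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias, then drop the low 16 bits of the float32 pattern.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Spacing between neighbouring representable values just above 1.0.
print("fp16 step near 1.0:", 2.0**-10)  # 10 mantissa bits
print("bf16 step near 1.0:", 2.0**-7)   # 7 mantissa bits
print("int8 step on [-1, 1]:", 1 / 127)  # assumption: symmetric 8-bit grid

# 1 + 2**-8 sits exactly halfway between two bf16 values and rounds back to 1.0.
print(round_to_bf16(1.0 + 2.0**-8))
```

The bf16 step near 1.0 and the int8 step on [-1, 1] come out almost identical, which is consistent with the claim that int8 is close to bf16 for values that stay in that range.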

#

also if you are doing a large scale training, avoid bf16 unless you want a blurry model that can't learn details

#

use fp16, or fp32 if you can't use fp16

proper cradle
#

fp32 / bf16

#

fp32 to int8 / bf16 to int8

#

fp32 to int4 / bf16 to int4

#

failed on int2

#

Using SDNext with an H100 80GB.
INT8 using NNCF
INT4 and INT2 using Optimum Quanto

fickle plume
#

There's an fp32 version of flux?

#

wonder if int4 is small enough to work on 8gb GPUs

keen marsh
#

So flux in sdnext works? Dev build i am guessing right?

#

My build is so old I may have to fresh install lol

keen marsh
#

Not sure if lcm is working properly in image to image, the output just looks like a low step count result from any sampler. Seems to work with controlnet and text to image though. Although i could be missing a setting in img2img, the UI is def not intuitive anymore.

keen marsh
#

is animatediff broken or do I need to do something in settings? Keep getting crystallized images. maybe a windows thing? i had sdnext download the models itself so maybe that's it too. dunno. also tried multiple models, lcm and regular etc. Latest master repo

keen marsh
#

seems to just be txt2img giving the error. Also the program doesn't automatically disable the animatediff pipeline, you have to unload and reload the models manually or it will keep trying to use animatediff even when it is set to none and the extension is closed.

broken grail
proper cradle
broken grail
#

ah i see

acoustic lagoon
#

Do we have a guide or tutorial on how to run Flux on ARC A770? I've been busy and trying to catch up with the latest stuff

broken grail
#

Where might I find the most "optimized" SDNext config options for SDXL? I run into memory issues past around 1000px in width and performance issues throughout (512x768 at like 2s/it).

keen marsh
#

Try medvram

broken grail
#

it seems like balanced offload causes the "expected all tensors to be on the same device" error for me

broken grail
#

actually it seems that everything on recent dev/master branch builds is busted

#

sd 1.5 models just produce noise/sparkles/weird busted-VAE type stuff and SDXL runs absurdly slowly

#

ipex optimize also broken. I do have latest toolkit

small mason
#

Is anyone able to run the SD 3.5 medium model? Every time I try to run it I get a few errors that amount to "incomplete model" and "model not loaded"

proper cradle
#

SD 3.5 files are not a complete checkpoint

#

You have to load them as UNETs

small mason
#

Ah, thank you so much

chrome bone
#

disty's projects made it to official intel channel DoggoThumbsup

proper cradle
#

2 of them : )

keen marsh
#

Disty is #1 for arc support IMO. We would be so screwed without his work. Intel should cut a check lol

proper cradle
#

You can use base + hires and have 1 extra resolution to use without recompiling

proper cradle
#

SDXL 1024x1024 on A770 with OpenVINO

fickle plume
#

that's pretty fast

keen marsh
#

Did OpenVINO get faster? I thought it was like 2s/it in linux iirc.

proper cradle
#

Old speed i remember was 1.8 it/s

#

My guess is new OpenVINO has Flash Attention support because i don't run into the 4gb alloc issues anymore on OpenVINO

restive parcel
#

howdy folks, been about 5 months absent, what kinds of developments have there been lately/in the last half year ish?

keen marsh
#

Flux, cogvideo, ltx video, hunyuan video, mochi video, mmaudio (video to audio), and honestly more than i can remember. As far as intel goes, we have built-in xpu support in pytorch 2.5 and newer but it's much slower (especially on windows)

green pagoda
#

2.5 is slower in my experience but it's much easier to get set up

keen marsh
#

I am going to try and set up wsl2 and hope that the speed increase from linux mitigates the speed decrease from pytorch

spiral junco
proper cradle
#

Around a minute

chrome junco
#

@proper cradle i have this error what can i do ?

fickle plume
#

(B580 user ^)

chrome junco
#

@fickle plume yes b580

#

Where am i going wrong?? It generates images on the CPU, not the GPU, how can i fix this?

fickle plume
#

^ sdnext

#

^ when running for the first time, run with --use-ipex

#

^ ai playground

#

#1193952640225267802 message

#

^My script that installs comfy (comfy is also in AI playground, will be faster for battlemage through AI playground)

chrome junco
#

Ok, i'll try your link, but do you know how i can fix the problem in A1111? @fickle plume

fickle plume
#

by not using A1111

#

sdnext is just plain better than a1111

#

and not just that, i think a1111 is starting to get a bit stale now...?

chrome junco
#

is this the right order?

fickle plume
#

there is no order, the 3 things I linked you are all 3 different web UIs

#

you pick one and you install one

#

Automatic (SDNext) is the one this thread is for

#

AI Playground is Intel's official web UI (runs using comfy, also does LLMs)

chrome junco
#

which one is best to use with the Realisticvision 6.0 checkpoint?

#

sorry for my stupid question but i just started with this type of program and i'm so confused @fickle plume

fickle plume
#

there are screenshots of the UI of each

#

you look at the screenshots and decide

#

This is SDNext

#

In the bottom image, the top right ↗️ is AIPG (images)

#

This is comfyui

chrome junco
#

Ok i choose SD next

#

@fickle plume --use-cuda Use nVidia CUDA backend (autodetected by default)
--use-rocm Use AMD ROCm backend (autodetected by default)
--use-ipex Use Intel OneAPI XPU backend (autodetected by default)
--use-openvino Use Intel OpenVINO backend
--use-zluda Use ZLUDA
--use-directml Use DirectML
which one do i need to choose and where do i put it?

fickle plume
#

the instructions most likely say you will be typing in install.bat or whatever, and you type in --use-ipex after it and before pressing enter

proper cradle
proper cradle
chrome junco
#

@proper cradle idk why i have the whole path

proper cradle
#

It should be the .exe file

#

Remove the venv folder in SDNext

#

You can set the python path for SDNext like this:
set PYTHON=path_to_the_exe

#

You have to remove the existing venv folder in SDNext's folder if you want to change Python after the venv is already created

chrome junco
#

My folder looks like this. I use the webui-user.bat, which looks like this

proper cradle
#

There is no COMMANDLINE_ARGS in SDNext, add them next to the webui.bat like this:
call webui.bat --use-ipex

chrome junco
#

and get this result after run it

#

like this ?

proper cradle
#

Remove the second call

chrome junco
proper cradle
#

Yes, also there is a stray D in --use-ipex

proper cradle
chrome junco
#

no

chrome junco
#

ok let me install it

fickle plume
chrome junco
#

i think i need visual studio install first

#

@proper cradle after install what need to do ?

proper cradle
#
"C:\Program Files (x86)\Intel\oneAPI\compiler\2025.0\env\vars.bat"
"C:\Program Files (x86)\Intel\oneAPI\ocloc\2024.2\env\vars.bat"
.\webui.bat --use-ipex
chrome junco
#

type it in 1 line or 3 separate lines? @proper cradle

proper cradle
#

3 lines

chrome junco
#

ok

proper cradle
#

You can probably add them to webui-user.bat too

#

(The first 2)

chrome junco
#

C:\Program Files (x86)\Intel\oneAPI\compiler\2025.0\env\vars.bat webui-user.bat too like this ?

#

or in file ?

proper cradle
#

in the file

#

So you don't have to run that line every single time

chrome junco
#

with call in front or without?

proper cradle
#

before the call

#

on a different line

chrome junco
proper cradle
#

yes but don't forget the "

#

Spaces will cause issues without "

chrome junco
#

now it starts and closes automatically, nothing runs

proper cradle
#

Try adding call before the oneapi stuff

chrome junco
proper cradle
#

PYTHON=path_to_python.exe

#

Point it to the exe file, not the folder

chrome junco
proper cradle
#

replace path_to_python.exe with the path of the actual python file

proper cradle
#

I am guessing the name of the exe is python.exe

#

Tip: you can copy a file and it will paste its full path into the terminal

chrome junco
fickle plume
#

missing set before PYTHON=

proper cradle
fickle plume
#

well, same for the other ones too?

fickle plume
#

oh, git and such were in the previous ones

proper cradle
#

does it have an equivalent to source? might need that to replace the calls to oneapi

fickle plume
#

set PYTHON=C:\Users\suncho0\AppData\Local\Programs\Python\Python310\python.exe
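
Pulling the advice from the last few messages together (a `set` in front of `PYTHON=`, a quoted `call` for each oneAPI env script, and the flags placed after `webui.bat`), the resulting file can be sketched as below. Python is used here only to render the text. `render_webui_user_bat` is a hypothetical helper, and the exact layout is an assumption pieced together from the thread, not an official template.

```python
def render_webui_user_bat(python_exe: str, env_scripts: list[str], webui_args: list[str]) -> str:
    """Build the contents of a webui-user.bat following the advice in this thread."""
    lines = [f"set PYTHON={python_exe}"]
    for script in env_scripts:
        # Quotes are needed because the oneAPI paths contain spaces.
        lines.append(f'call "{script}"')
    # SDNext has no COMMANDLINE_ARGS; flags go right after webui.bat.
    lines.append("call webui.bat " + " ".join(webui_args))
    return "\n".join(lines) + "\n"


print(render_webui_user_bat(
    r"C:\path\to\python.exe",  # placeholder, point it at the real python.exe
    [
        r"C:\Program Files (x86)\Intel\oneAPI\compiler\2025.0\env\vars.bat",
        r"C:\Program Files (x86)\Intel\oneAPI\ocloc\2024.2\env\vars.bat",
    ],
    ["--use-ipex"],
))
```

If Python is changed after the venv already exists, the venv folder still has to be removed first, as mentioned earlier.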

chrome junco
#

and again ....

fickle plume
chrome junco
#

all my loading is this

proper cradle
#

Remove the venv folder and try again with this added

#
set TORCH_COMMAND="torch torchvision --index-url https://download.pytorch.org/whl/test/xpu"
chrome junco
#

incorrect url @proper cradle

proper cradle
#

Try this:

set TORCH_COMMAND=torch==2.6.0+xpu torchvision==0.21.0+xpu --index-url https://download.pytorch.org/whl/test/xpu
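
After installing an XPU build of PyTorch like the one above, a quick check from inside the venv can confirm whether the GPU is visible at all before launching the UI. This is a small sketch that assumes a PyTorch version exposing `torch.xpu` (recent XPU builds do); `list_xpu_devices` is a hypothetical helper.

```python
def list_xpu_devices() -> list[str]:
    """Return the names of the XPU devices PyTorch can see, or an empty list."""
    try:
        import torch
    except ImportError:
        return []  # torch is not installed in this environment
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return []  # CPU-only build, or no usable Intel GPU / driver
    return [xpu.get_device_name(index) for index in range(xpu.device_count())]


print(list_xpu_devices())
```

An empty list would suggest that generation is going to fall back to the CPU.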
chrome junco
#

it finally starts but i don't see any image

proper cradle
#

That is strange, can you open it in a new tab?

#

Might be a timeout

chrome junco
#

after trying to prompt something with realistic or juggernaut it just freezes and tells me

proper cradle
#

What is the dtype set to?
setting -> compute

#

That is an out of memory error

#

SDXL shouldn't use 26 GB

chrome junco
#

dtype is ?

#

i have 32gb ddr 4 3600

proper cradle
#

You can also try removing the config.json file to reset settings

#

If it went to CPU fallback at some point it will set FP32 / no-half

#

It should be Auto or BF16

chrome junco
#

why does it tell me here that the card uses 26 of 12?

chrome junco
#

make it 1 time

proper cradle
#

FP32 will use a lot more memory

chrome junco
#

ok i will delete 1 more time config json file and run it again

#

pfff it finally works

#

it thinks my card has 26 gb VRAM, how can i set it to 12?

proper cradle
#

That counts the shared RAM

#

From System RAM

#

I don't know if Intel has an option to disable system memory fallback

chrome junco
#

and what do i need to do now? any suggestions?

proper cradle
#

Reported VRAM is the real 12GB so it shouldn't be a problem on SDNext's memory management

#

What problem are you having rn?

chrome junco
#

same error after trying to run a prompt

proper cradle
chrome junco
proper cradle
#

Dtype is correct, BF16

#

Try adding this to webui-user.bat:

set IPEX_FORCE_ATTENTION_SLICE=1
chrome junco
#

now it uses little memory

#

gpu 4 of 12, ram 3.6 of 32

proper cradle
#

I guess flash atten support for battlemage is not in yet, i will enable this variable by default on battlemage on sdnext side

chrome junco
#

ok, for now i will just see what i can do and adjust my settings in the ui and wait for your info, please @ me if you have something new

#

@proper cradle @fickle plume Thank you very, very much for your time and support in helping me get started!!!

small mason
#

Does anyone know how to get it to run on the igpu instead of the cpu? On my computer it's gpu 1 and I was wondering if anyone knew how to switch it to hopefully make it faster than 30 s/it

proper cradle
#

New Dynamic Atten SDPA with SD 1.5 at 768x1280.

500 MB limit vs 4 GB limit:
2.3 GB Total VRAM usage vs 4.5 GB total VRAM usage
offload mode = none
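
A back-of-the-envelope estimate shows why the limit matters so much at this resolution. The sketch below is not SDNext's actual implementation. It assumes the largest SD 1.5 self-attention runs at the latent resolution (image size divided by 8) with 8 heads, fp16 scores and batch size 1, and `attention_slices` is a hypothetical helper.

```python
import math


def attention_slices(batch: int, heads: int, q_len: int, k_len: int,
                     bytes_per_el: int, limit_bytes: int) -> int:
    """Number of slices needed so one attention score tensor fits under limit_bytes."""
    score_bytes = batch * heads * q_len * k_len * bytes_per_el
    return max(1, math.ceil(score_bytes / limit_bytes))


# 768x1280 image -> 96x160 latent -> 15360 tokens at the largest attention layer.
tokens = (768 // 8) * (1280 // 8)

print("4 GB limit:  ", attention_slices(1, 8, tokens, tokens, 2, 4 * 2**30), "slice(s)")
print("500 MB limit:", attention_slices(1, 8, tokens, tokens, 2, 500 * 2**20), "slice(s)")
```

Under these assumptions the full score tensor is about 3.5 GiB, so it just fits under a 4 GB limit in one piece, while a 500 MB limit splits it into several smaller chunks and peak VRAM drops accordingly.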

fickle plume
#

Very nice work!

mighty wind
#

Hey im running into a weird error with SDNext with openvino, is this the correct place to ask?

mighty wind
#

Okay so i just installed it and the live previews makes it look like it works, but when it finishes its just a blank/black image and then this pops up in the bottom column.

I'm using a B580 on a Ryzen 5700x if that matters

2025-01-26 12:55:05.027 ERROR sd processing_helpers Decode: sample=(1280, 1080, 3) invalid=4147200 dtype=float32 vae=torch.float32 upcast=True failed to validate
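
One detail in that log line is worth spelling out: the `invalid` count equals the total number of elements in the sample, so every value in the decoded image failed validation, not just a few. A quick arithmetic check follows; treating "invalid" as NaN or otherwise non-finite values is an assumption here, and `invalid_fraction` is a hypothetical helper.

```python
from math import prod


def invalid_fraction(shape: tuple[int, ...], invalid: int) -> float:
    """Fraction of elements reported invalid for a tensor of the given shape."""
    return invalid / prod(shape)


# Values copied from the log line above.
print(prod((1280, 1080, 3)))
print(invalid_fraction((1280, 1080, 3), 4147200))
```

A fraction of 1.0 matches the fully black output and suggests the whole decode went wrong rather than a handful of pixels.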

mighty wind
#

Same thing still unfortunately, I double checked that its selected in the settings 😦

proper cradle
#

Did you reload the model after this?

#

OpenVINO will need to recompile

mighty wind
#

I did, i did a new install and then tried Ipex and now it just works, thanks for the tips tho, I appreciate it 😄

proper cradle
#

Removed IPEX and fully switched to pure PyTorch 2.6.0+xpu with the latest update.

Also Python 3.12 is now fully supported by SDNext.

#

Also there is now a --use-nightly flag if you want to download nightly PyTorch.

proper cradle
green pagoda
#

Nice

proper cradle
proper cradle
#

Upscaler compile is enabled by default now

remote nacelle
proper cradle
#

I don't have a BMG card myself so can't comment about the speed.

green pagoda
#

I would offer to try it if I had one, but I don't. I got a 258V though

keen marsh
#

I am not sure what the speed is of the b580 at all, only a few people with it using any AI stuff.

green pagoda
#

It's not really much better than that A770 for it so most existing enthusiasts aren't upgrading

keen marsh
#

Yeah, I am likely one of the only a750 owners who mess around with ai a lot lol. I was thinking about grabbing the b580 as a stop gap but seems it has a lot of other issues atm too.

proper cradle
#

base img

#

venv will be created and installed on the first run

restive parcel
#

docker my love

fickle plume
#

intel compute runtime my hate

ember orchid
proper cradle
#

base img

drowsy phoenix
#

using a B580 with sdnext at the moment, anyone who has curiosities about battlemage AI just let me know and I'll give it a try

teal hinge
#

what does AssertHandler::printMessage mean?

#

it happens after ive had the webui open for a long time

#

and then it just closes

#

happens when i click on generate after ive had the webui opened for a long time

teal hinge
#

does it happen because my computers not that powerful? i think it happens less when i lower the resolution and stuff

#

but if its open for a long time it does do it pretty easily regardless of the resolution

#

i have an arc a750 and 16 gigs of ram

#

and an i5 10500

#

and im on windows

teal hinge
#

and also sometimes, after this it also displays a python.exe error

#

something along the lines of referenced memory at (address) could not be read

#

and i think it happens more when i used longer prompts

proper cradle
# teal hinge

Try running memtest to see if your ram is dying or not

#

Also 16 GB RAM is too low / barely enough

teal hinge
#

how much should be enough

#

another 16?

proper cradle
#

32

#

Yes

teal hinge
#

ok

#

does switching to linux help?

proper cradle
#

It will help but still barely enough

proper cradle
teal hinge
#

ill run a memtest

#

i have one stick of ram thats new and one stick thats old

#

both are same tho

#

corsair vengeance

#

both the same speed as well even though motherboard doesnt let it run in full speed

#

the next 2 stick that im supposed to get, should they be corsair vengeance as well? or am i allowed to get different ones

proper cradle
#

Should be fine as long as the memory chip is the same

#

My PC has 2x 16 gb Crucial and 2x 32 gb Kingston

#

Both have Micron memory chips

#

Both are 3200 MHz CL16

teal hinge
#

do speeds matter? i have xmp on currently otherwise i think it defaults to 2133

proper cradle
#

or you will have to run all of them at the speed of the slower one

restive parcel
restive parcel
#

found a way to get the gpu through, but still had weird errors. tried running in windows and I don't know what i'm doing anymore apparently 'cause I can't even get default models running with a clean install DinaKEK

#

queue cannot be constructed, engine did not start, lots of fun errors with no real clue as to what i'm missing

#

guess i'll just go back to drawing manually for today

fickle plume