#SDNext WebUI on Intel ARC


onyx moth
#

that would be somewhere in bios right

pastel geode
onyx moth
#

MAG B760 TOMAHAWK WIFI (MS-7D96)

pastel geode
#

I have mine enabled cuz I usually look at it to see if deep link is actually working on Handbrake

onyx moth
#

i'll poke around in the bios. also, I have no idea what deep link or handbrake is

pastel geode
onyx moth
#

omw to bios thank you so much

pastel geode
#

just to be on the safe side, make sure that you have no monitor(s) connected to ur motherboard.

onyx moth
#

I don't understand but I found it here

pastel geode
onyx moth
#

Yeah

pastel geode
#

yea so from my experience, once you disable it, your task manager will only show ur dgpu

onyx moth
#

If this fixes it ur my hero

pastel geode
#

also, if u check device manager, under display adapters, youll only have the A770 after disabling it that is.

onyx moth
pastel geode
#

looks good so far, hopefully it works on u ^^

#

otherwise, Im sorry but it was worth a shot :>>. you can enable ur igpu again using the same process

onyx moth
#

IM IN

#

I don't know why I'm still using my phone for screen shots. You are the man. And disty and everyone else thx

pastel geode
#

there are some settings you might have to change. Im not really familiar with them

#

disty has some screenshots but u gotta scroll up for them. Have fun playing around with it ^^

onyx moth
#

it works bros

#

i even upscaled it by accident

#

OpenVINO doesn't hires fix yet..

#

I love them but I feel like there's more capability here

pastel geode
#

nice nice! but if you want to use sdxl, refiner, etc. youll have to modify ur settings.

onyx moth
#

yeah, I'm gonna stick to 1.5 for a while till I get used to the differences in the ui

#

I'm a slow learner but I learn

onyx moth
#

also interested in learning this out of curiosity

proper cradle
#

SDNext WebUI on Intel ARC

grave condor
#

I searched through the repo and seems like the ControlNet pipelines are from an external plugin

keen marsh
onyx moth
#

my lifes greatest achievement

#

now i can link all the big folders to the other drive

chrome bone
#

i just install wsl on the other drive

keen marsh
#

I didn't know you could install wsl 2 on another drive, and how did you link another drive in wsl2??

onyx moth
chrome bone
#

installing wsl2 on another drive is quite convoluted.. and you can even install it on usb drive if you prefer. theres absolutely no down side about it (except it will load slow)

onyx moth
#

oh my bad you werent asking me

chrome bone
#

pretty sure he is

onyx moth
#

im stoned that should explain it

onyx moth
#

the only folder i cant get link to work right is outputs, so i just did it for a few things like sd models, vae folder, loras and embeddings for now

onyx moth
#

that was an easy one

#

just right click in windows to make a shortcut

pastel geode
keen marsh
pastel geode
#

I think you can delete the FILE created by ln -s so they are linked until you delete it.

keen marsh
#

interesting, if there is a way to link to other drives in wsl2 you can add that link inside the options in both sd.next and a1111 I believe. I'm going to look this up so I can free up more space on my main drive and just use the same models for wsl2 and native windows.

onyx moth
#

yes I already had the models in a folder on my d: drive, but the engine is in c: drive in wsl. so using the link puts a file in the folder inside wsl that points it to the folder on d: drive
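A minimal sketch of that linking step in Python, roughly what `ln -s` does. The folder names are hypothetical and temporary directories stand in for the models folder on d: and the webui folder inside wsl:

```python
import os
import tempfile

def link_folder(target_dir: str, link_path: str) -> str:
    """Create a symlink at link_path pointing to target_dir (like `ln -s target_dir link_path`)."""
    if os.path.islink(link_path):
        os.unlink(link_path)  # removes only the link, the target folder is untouched
    os.symlink(target_dir, link_path, target_is_directory=True)
    return os.path.realpath(link_path)

# Demo: temporary directories stand in for the real d: drive and webui folders.
with tempfile.TemporaryDirectory() as root:
    models_on_d = os.path.join(root, "d_drive_models")    # hypothetical models folder
    webui_models = os.path.join(root, "webui_models_link")  # hypothetical link location
    os.makedirs(models_on_d)
    link_folder(models_on_d, webui_models)
    # A file saved in the real folder shows up through the link.
    open(os.path.join(models_on_d, "model.safetensors"), "w").close()
    visible = os.listdir(webui_models)
    print(visible)
```

Deleting the link later only removes the pointer file, not the models it points to.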

pastel geode
#

space-efficient 👍

onyx moth
onyx moth
onyx moth
#

does anybody know how to enable the extension browser? aka load available extensions through the webui? thanks

keen marsh
#

It should be in the extensions tab almost the same as a1111, might have an option to show all or something but I am not in the webui atm to be 100%.

onyx moth
proper cradle
#

Manage extensions, refresh

restive parcel
#

I haven't checked back in a hot minute, is the windows install less complex now/intel uploaded the complete whls this time?

#

or do we still gotta self compile/share with a neighbour on those and the other hacky parts?

grave condor
#

I haven't seen any claim that they uploaded new wheels

#

but you can install it really easily and do JIT

#

compiling wheels with AOT took me 5.5 hours

onyx moth
keen marsh
#

I think it's usable if you use it for a long time and don't mind a 15-minute startup. However you can use openvino now, which only requires one file edit for native windows

restive parcel
novel sphinx
#

Openvino backend really likes a lot of system ram @.@, using xl models while it compiles into the openvino format i see it use 100% of ram and start using the pagefile, i have 32gb btw, could be why xl models take 5-6 mins on the first generation while it compiles because it has to use my pagefile

keen marsh
#

not sure about the a1111 webui

novel sphinx
#

yeah im using sd.next it just works better imo than the a1111, is more streamlined

restive parcel
#

openvino is working for sdxl?

#

I DO have tons of ram to spare...

pastel geode
#

Is there a known issue in sd.next regarding ipex and sdxl refiner?

novel sphinx
#

openvino is working for sdxl models yes, disty has shown it though i have encountered something odd, i get this error C:\Users\KingOfMemes\automatic\modules\processing.py:584: RuntimeWarning: invalid value encountered in cast
sample = sample.astype(np.uint8) when trying to run at a res thats not 1024x1024 in my case 1024x1536, not running out of resources or anything but i get that error and a black image but 1024x1024 works fine

novel sphinx
onyx moth
#

wait ipex works with sdxl right

novel sphinx
#

Yes

#

With or without refiner tho from most of what ive seen the refiner is pretty unnecessary

pastel geode
# novel sphinx not that im aware of, what issue are you having?

I followed the instructions here on github

  • VRAM Optimization: Chose option 1
  • did Fixed FP16 VAE

BUT for Using SD-XL

everything works but (optional) Finally select the sd_xl_refiner_1.0.safetensors file from the Refiner dropdown.

I get the error in the picture with --ipex and then it kills the server. Without ipex, it doesnt send me the error but kills the server either way

Main Issue: Fails to load the refiner

GitHub

SD.Next: Advanced Implementation of Stable Diffusion - vladmandic/automatic

pastel geode
proper cradle
#

Not enough for refiner

#

Increase it to at least 24

pastel geode
keen marsh
#

So with openvino you need to have double the space for your cache folder it seems? Since it makes a new model each time. Any way to have it only compiled once and remember the cached model when choosing models?

proper cradle
#

It will use the cached model

#

Also different resolutions compiles to different models

#

Each model can do only one resolution

keen marsh
#

Interesting, well I think I will stick to Ipex as I don't have enough space or memory atm.

grave condor
#

you technically don't need the default model or onnx intermediate after you've done the openVINO IR conversion

onyx moth
#

I got another newbie question if anybody got a sec - since ipex is run in wsl, does the output save location have to also be in wsl? or can I get it to save outputs on a different drive

#

aka a folder on my d: drive desktop

keen marsh
pastel geode
# onyx moth aka a folder on my d: drive desktop

I think there is an easier way. If you go
settings > Image Paths, you should be able to set the directory for the images. Im just not sure if you have to include mnt relative to wsl or just the drive letter D: for example.

onyx moth
#

o crap

#

i changed the diffusers backend from original

#

didnt like it because couldnt use controlnet from the webui built in

#

so changed it back

#

now some settings are missing

#

aka in settings I dont see a few things including the options for SDXL

#

should i reinstall the whole dam thing?

chrome bone
#

yeah i think theres a bug where once you changed to diffuser you cannot enable back some extensions

#

you dont have to reinstall the whole thing though

#

just delete config.json and ui-config.json

onyx moth
#

thank you thank you

onyx moth
#

still doesnt show it

#

all good clean restarts feel good

keen marsh
#

SDXL will only run in diffusers, and Controlnet should work for diffusers just not sdxl?

onyx moth
#

@pastel geode Hey bro you were right, just add /mnt/d/ in the output destinations to output to the drive I want

#

in image paths
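A small sketch of that drive-letter to `/mnt/` mapping, assuming the default WSL automount location (`/mnt/<drive letter>`). The function name is made up for illustration:

```python
import re

def windows_to_wsl_path(path: str) -> str:
    """Convert a Windows path such as D:\\Desktop\\outputs to /mnt/d/Desktop/outputs."""
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", path)
    if not match:
        raise ValueError(f"not a drive-letter path: {path!r}")
    drive, rest = match.groups()
    rest = rest.replace("\\", "/").strip("/")  # use forward slashes, drop trailing slash
    base = f"/mnt/{drive.lower()}"
    return f"{base}/{rest}" if rest else base

print(windows_to_wsl_path(r"D:\Desktop\outputs"))  # /mnt/d/Desktop/outputs
```

Inside wsl the `wslpath` command does the same conversion.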

onyx moth
#

is this something I have to revert if I want to go back to 1.5 sometime? edit: I meant VAE Upcasting

proper cradle
#

Disable VAE Upcast on Diffusers SD 1.5 too

#

SD 1.5 doesn't need a special VAE when VAE Upcast is set to false

#

Only base SDXL VAE is broken with it and needs a fixed VAE

#

Also Diffusers works with SD 1.5 too

#

And Diffusers is faster and uses less VRAM

onyx moth
#

would controlnet work in 1.5 w diffusers

proper cradle
onyx moth
#

oh ok, thanks for all the guidance btw

#

im getting a lot of sparkly noise in xl

proper cradle
#

Use 1080x1080

onyx moth
#

oh ok I had no idea

onyx moth
#

any tips on where to go from here? haha

#

had some pretty good qualities coming out of 1.5 - haven't really got a feel for the xl samplers yet

proper cradle
#

Euler a with more CFG scale

#

8-12 CFG

onyx moth
#

does sdxl have DPM++ 2M SDE Karras?

chrome bone
#

ask them to implement it

proper cradle
#

DPM 2M is the base and rest are an option in the settings

#

But Euler a is the best sampler for SDXL

onyx moth
#

@proper cradle If you made a YouTube channel with various tutorials related to the arc a770 I'd subscribe day one

onyx moth
#

what's the process for changing back the checkpoint to a 1.5 sd? Last night something broke trying it

proper cradle
#

Just choose an SD 1.5 model

#

If you want to use ControlNet, set backend to original from Diffusers Settings.
If you don't use ControlNet, use diffusers. Diffusers is better than original backend.

onyx moth
#

and the refiner I just leave selected on sdxl refiner right? After I changed that to none all kinds of errors came up and I couldn't do anything any more

#

I forgot what order I did things that broke everything. Think I switched the refiner off then the checkpoint then the VAE

restive parcel
#

yeah I've never been able to get 1.5 to work after I've used sdxl setup, it just breaks for some reason I haven't figured out yet

proper cradle
#

I go with Refiner = None (I never use Refiner tbh) then VAE, then model

#

Also set Diffusers Pipeline to Autodetect from Diffusers Settings

#

If you switch to original backend, do a full restart

#

Also enable the disabled extensions if you go back to the original backend

onyx moth
#

thank you

#

i fixed it without having to reinstall WHICH I ALMOST DID

keen marsh
#

From what I am reading controlnet 1.1 should support diffusers?

onyx moth
#

uncheck autoload model on startup

#

i had sdxl refiner as the checkpoint and it would load up and break everything, system wouldnt let me change anything related to models and nothing worked

#

wait

#

'StableDiffusionXLPipeline' object has no attribute 'parameterization'

#

still broken

keen marsh
onyx moth
keen marsh
#

what are your diffuser settings?

#

Also, is this Ipex or Openvino?

restive parcel
#

need that obscenely fat gerbil

onyx moth
#

ok its fixed after a restart

onyx moth
#

and i meant to inpaint this but got sidetracked with xl

restive parcel
#

HIS HAAAAAANDS

onyx moth
#

anybody have methods of getting multiple subjects? I keep getting hybrids.

#

here is an example

chrome bone
#

you cant afaik

#

you have to inpaint it

#

like selecting the raptor behind and prompt raptor

keen marsh
onyx moth
#

1.5, xl both

restive parcel
#

finally got around to setting up the conda on windows and oneapi and all the fun stuff. used default whl, but during the long windup to first render, I just crash to Native API -997 InaPensiveMore

#

oh, i managed to get a couple of images out, but they looked awful, and then I changed backend and it crashed again annieSnap

keen marsh
#

try and run this in python "import torch
import intel_extension_for_pytorch
torch.xpu.is_available()"
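A slightly more defensive sketch of the same check, for running as a script instead of in the interactive prompt. It assumes only that `torch.xpu` exposes `is_available()` and `device_count()` once IPEX is imported; `xpu_status` is a made-up helper name:

```python
def xpu_status() -> str:
    """Report whether the Intel XPU device is visible to PyTorch."""
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (registers the xpu device with torch)
    except Exception as exc:  # missing package, DLL load failure, version mismatch...
        return f"import failed: {exc}"
    xpu = getattr(torch, "xpu", None)
    if xpu is None:
        return "torch has no xpu module"
    if not xpu.is_available():
        return "xpu not available"
    return f"xpu available, {xpu.device_count()} device(s)"

print(xpu_status())
```

If this prints an import failure, the problem is in the environment (oneapi vars, wrong venv) and not in the webui.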

#

Also using the default whl should take about 15 minutes to start up the first time.

restive parcel
#

yeah it takes a really long time to start up. it just usually crashes halfway, and if it does work, it spits out mostly garbage

restive parcel
#

curiously though, when I went to import intel pytorch, i got

#

not sure if that can cause issues

keen marsh
#

Yeah, its looking for cuda in an nvidia folder? You have multi-gpus in the system?

#

Make sure to run it in the oneapi env, call all 3 vars each run

restive parcel
#

though I did have the cuda toolkit for back when I did use the card

#

a li'l ol' 1050 ti

onyx moth
#

this is necessary if I want to upscale 2x right? since I have to use XL at 1080 instead of 1024

keen marsh
onyx moth
#

@proper cradle since I'm generating at 1080 x 1080 should I go 1920 x 1080 for landscapes or something else? thanks!

keen marsh
restive parcel
#

although, if it was actually using cuDNN somehow, I'm not sure how it managed to actually generate a pair of images on the gpu

#

confusing, but I'll try to get it back to running state

#

oh wait, i'm a moron, i was trying to run the wrong sd install

#

not before mind you, just this time

restive parcel
#

removed all nvidia toolkit stuff, we still getting -997

chrome bone
#

native pytorch can use device "cuda" just fine but not for xpu

#

this is where ipex comes in

#

now that intel dev made its way into pytorch maintainers (i think this is how its called) we can expect to use xpu without ipex in a not so distant future... hopefully

keen marsh
#

It is a bit confusing since they call it intel extension for pytorch, yet short hand it IPEX for intel Pytorch extension. 🤷‍♂️

onyx moth
#

aw crap inpainting crashes me

#

probably have to set to a 1.5 checkpoint for inpainting

restive parcel
#

yeah I'll try removing and readding it, though the package does seem correct on the git version number

onyx moth
#

RuntimeError: Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES) -5 (PI_ERROR_OUT_OF_RESOURCES)

#

um help thanks lol

#

im seeing this

#

the normal (correct pipeline) was selected

#

happening when it tries to load up the refiner i guess

proper cradle
#

Needs more luck tho

#

And you can 2x from 1920x1080 with Attention Slicing enabled

#

A770 16 GB can go up to 4096x4096

onyx moth
onyx moth
proper cradle
#

Also Refiner is pretty much useless if you are using a finetuned model

onyx moth
#

yeah I went back to base to try it just couldnt get it to work again

proper cradle
#

Refiner generally makes images worse with finetuned models

onyx moth
#

...can refiner be finetuned for a finetuned model??

proper cradle
#

Can be but no one did

keen marsh
onyx moth
#

is the 1024 x 1024 curse for the arc fixable? genning from 1920 x 1080 needs a lot of luck proportions get messed up all the time etc

#

I mean is it something in the works/being looked at etc

onyx moth
#

also can someone please give me any tip you can about inpainting

#

thats what I got out of this

#

CrystalClearXL btw

proper cradle
# onyx moth is the 1024 x 1024 curse for the arc fixable? genning from 1920 x 1080 needs a l...

Try 1024x1536 or 1536x1024. Only 1024x1024 is cursed.
Or try these:

resolutions = [
        # SDXL Base resolution
        {"width": 1024, "height": 1024},
        # SDXL Resolutions, widescreen
        {"width": 2048, "height": 512},
        {"width": 1984, "height": 512},
        {"width": 1920, "height": 512},
        {"width": 1856, "height": 512},
        {"width": 1792, "height": 576},
        {"width": 1728, "height": 576},
        {"width": 1664, "height": 576},
        {"width": 1600, "height": 640},
        {"width": 1536, "height": 640},
        {"width": 1472, "height": 704},
        {"width": 1408, "height": 704},
        {"width": 1344, "height": 704},
        {"width": 1344, "height": 768},
        {"width": 1280, "height": 768},
        {"width": 1216, "height": 832},
        {"width": 1152, "height": 832},
        {"width": 1152, "height": 896},
        {"width": 1088, "height": 896},
        {"width": 1088, "height": 960},
        {"width": 1024, "height": 960},
        # SDXL Resolutions, portrait
        {"width": 960, "height": 1024},
        {"width": 960, "height": 1088},
        {"width": 896, "height": 1088},
        {"width": 896, "height": 1152},
        {"width": 832, "height": 1152},
        {"width": 832, "height": 1216},
        {"width": 768, "height": 1280},
        {"width": 768, "height": 1344},
        {"width": 704, "height": 1408},
        {"width": 704, "height": 1472},
        {"width": 640, "height": 1536},
        {"width": 640, "height": 1600},
        {"width": 576, "height": 1664},
        {"width": 576, "height": 1728},
        {"width": 576, "height": 1792},
        {"width": 512, "height": 1856},
        {"width": 512, "height": 1920},
        {"width": 512, "height": 1984},
        {"width": 512, "height": 2048},
]
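
A rough sketch of picking an entry from a list like this by aspect ratio. The helper name and the shortened sample list are only for illustration; pass the full list in practice:

```python
def closest_resolution(target_w, target_h, candidates):
    """Return the candidate whose width/height ratio is nearest to the target ratio."""
    target_ratio = target_w / target_h
    return min(candidates, key=lambda r: abs(r["width"] / r["height"] - target_ratio))

# A few entries copied from the list above.
sample = [
    {"width": 1024, "height": 1024},
    {"width": 1344, "height": 768},
    {"width": 1216, "height": 832},
    {"width": 832, "height": 1216},
    {"width": 768, "height": 1344},
]

# 1920x1080 is 16:9, and the nearest entry in the sample is 1344x768.
print(closest_resolution(1920, 1080, sample))  # {'width': 1344, 'height': 768}
```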
onyx moth
#

thanks Disty

restive parcel
#

I forget what exactly it is right now though, since I haven't gotten sdnext working again yet...

onyx moth
#

1536x1024 working nicely

#

just having too much fun sorry

broken grail
#

hamster man. . . .

broken grail
#

If I were to try my hand at getting some basic form of training working, what would be the most straightforward approach?

#

Reimplementing gradscaler?

proper cradle
#

Training works on the dev branch

proper cradle
broken grail
#

hmm

#

training starts and proceeds for a dozen or so steps, before throwing a TypeError, log attached. This is with BF16, going to try FP16.

#

this actually happened when I tried the gradscaler patch from 1.5

#

on 1.5

#

nope, needs bf16...damn

broken grail
#

it appears to be an issue with preview image generator

#

commenting out these lines lets it proceed until instructed to generate a preview image

#

that patch, plus cranking up interim images to some hilariously large number, fixes the crashes. let me check the output of the training

#

okay, training is working correctly outside image generation during training

restive parcel
#

wowie, training was a primary reason I wanted an arc a770 for originally, exciting if it can work now (in a dumb enough manner for me to understand)

proper cradle
#

Training is for the original backend

proper cradle
#

It will use that if it's available

broken grail
#

Yeah this is all reasonably fresh

onyx moth
#

by the way, can I use sdxl through original backend? is diffusers just a way to conserve ram or vram or somethin?

keen marsh
onyx moth
#

how are people having access to new samplers? I thought it was a diffusers difference

keen marsh
proper cradle
#

SDP is the default optimizer now.
And this 32 batch size, 20 steps, 512x512 generation took 40 seconds:

#

And speed on original backend with SDP is 7.5 it/s

onyx moth
#

intel drivers just dropped

keen marsh
#

did something change with styles? Seems my csv file isn't being read right anymore

keen marsh
#

on windows SDP uses about 2gb more vram than Sub-quadratic: about 6gb with sub-quad and 8.5gb with SDP at 960x544. Not great for the a750. This is using Ipex in original backend.

#

Also was garbage collection disabled? seems to hold more vram now and I don't see the option anymore.

proper cradle
onyx moth
#

whats garbage collection

restive parcel
# onyx moth whats garbage collection

it's a term referring to how memory is managed in computing. useless data doesn't need to stick around in ram/vram, but it's a pain in the neck to manage removal manually when developing software. For this reason, there are a lot of libraries made that are really good at handling it automatically, and a lot of higher level languages just handle it for you by default. If garbage collection doesn't work correctly though or if it's turned off, memory will just fill and fill and fill until the program is terminated.

proper cradle
chrome bone
#

garbage collection is an action to empty memory for objects that are no longer referenced. iirc in python it is automatically done when a certain threshold is met but you can call it manually. examining if objects are referenced in general lowers your perf so only do it when necessary
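
A minimal sketch of calling it manually with Python's standard `gc` module. The `Node` class is just a toy to create a reference cycle, which is the kind of garbage the collector exists to find:

```python
import gc

class Node:
    def __init__(self):
        self.other = None

def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a  # a and b keep each other alive after the function returns

print(gc.get_threshold())  # allocation thresholds that trigger automatic collection

gc.disable()          # keep the automatic collector out of the demo
gc.collect()          # start from a clean state
make_cycle()          # the two Node objects are now unreachable garbage
freed = gc.collect()  # manual collection returns the number of unreachable objects found
gc.enable()
print(freed)
```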

onyx moth
#

DPM 2M 50 step cfg 7 sdxl

proper cradle
#

Also training on Kohya SS GUI is working now:
#1142086112014237857 message

onyx moth
grave condor
#

is there a good way to benchmark hardware? I got PVC access but only 10GB of storage but want to try stable Diffusion inference

restive parcel
#

man, this is weird. it hits embeddings and then declares it can't find the device

#

the device listed up here

#

also listed here when running sycl-ls, so oneapi access is exposed

#

also still getting error -997 if I try to infer anything. This is on latest version, just deleted config and ui-config to ensure i'm starting mostly fresh, although this was already basically fresh the first time I got the error a week or two ago

chrome bone
#

native api failed mostly means oom afaik

restive parcel
#

it has its own error call for oom, i think -5?

#

I've got plenty of overhead though. loading orange mix on a770 16gb with 64gb system ram

chrome bone
#

you can probably monitor vram usage while you use it

#

16gb is nothing for sdxl if you dont utilize all memory saving tricks you have

proper cradle
#

Are you running native Windows?

#

I've never tested native Windows

chrome bone
#

yep i think he is.

restive parcel
#

I'm trying to run windows yep. vram doesn't go over 20%

#

it's literally failing before even hitting render on anything

#

some of the others have no problem, so I'm just wondering if I screwed the setup somewhere

restive parcel
proper cradle
#

Try diffusers backend

#

start with --backend diffusers if you can't get into settings.

restive parcel
#

well, I was having issues with diffusers backend too, but I'll give it a try now since I'm having a slightly different issue than I was before

proper cradle
#

Also OpenVINO will be better on Windows

#

Create a new install and use --use-openvino

restive parcel
#

switching to diffusers crashed my card and i had to reboot 🫠

#

what are the disadvantages of open-vino? I heard there were still some issues/features that don't work

#

I want to do upscaling and training at some point

proper cradle
#

resolutions more than 768x768 on SD 1.5 and 1024x1536 on SDXL

#

Compiling models takes time and RAM

restive parcel
#

If I'm using a conda venv, should I not use the same venv as ipex install?

proper cradle
#

Do not mix them

#

OpenVINO needs cpu pytorch

restive parcel
#

thanks

proper cradle
#

Also SDNext has full Python 3.11 support now

restive parcel
restive parcel
#

well, I certainly get an image which is much better than I was getting on ipex

#

Do LORAs work with openvino at all?

#

the system log seems to suggest it is, but I see no effect from it

#

also, vram use keeps climbing, and the quality of outputs gets lower over time...

proper cradle
restive parcel
#

weird, my loras didn't affect the output at all

proper cradle
#

Did you change lora loading method?

#

Only the diffusers original works

restive parcel
#

I left on default because I don't know that setting

proper cradle
restive parcel
#

testing same seed with and without lora, i get very minor change

#

Without lora

#

with lora

#

having the lora set to maximum weight

#

its a pretty influential lora, and this isn't the effect it should have

onyx moth
#

how to enable DPM++ 3M SDE Exponential??

keen marsh
grave condor
#

They fixed some of the duplicated VRAM usage in openVINO (if you export via optimum/onnx)

#

There is a large patch coming 2023.1. Sometime next week most likely

#

it's meant to improve GPU inference performance for language models. But might bring more improvements with it

#

also might be crucial to some MTL products that Intel wants to show at InnovatiON. So that's the proper deadline

keen marsh
#

does sdxl controlnet work in sdnext? I just read they updated the webui version.

restive parcel
#

holy crap

#

I tried using 3 loras

#

wait that's.... ok, i misread the digits. that's only 5 gb

restive parcel
#

ahh, i finally managed to get the lora working, ish

#

the result is still awful 'cause I think my vae's are actually not getting used, and its 1.4 s/it and almost all of my vram annieSnap

onyx moth
#

i didn't mess anything up did i?

keen marsh
onyx moth
#

i dont know how to go back

chrome bone
#

you can do git checkout <branch>

onyx moth
#

my bad i forgot to mention i dont know much about any of this

#

if it wasnt obvious lol

chrome bone
#

DoggoGrin ..

#

you need to replace <branch> with branch name

#

so suppose you were on dev, you can run git checkout dev

#

heres a list of branch available for this repo

onyx moth
#

whats the branch of this thread

chrome bone
#

you can also checkout to a specific commit (which is a snapshot of a repo)

#

like this

onyx moth
#

whats the uh. the branch for sd.next

chrome bone
#

you can choose from master, dev, and fix_init_latent_diffusers

#

the other 2 is stale and i doubt theres any use case

onyx moth
#

so if i just leave it on master is the best then

chrome bone
#

for normies yeah

onyx moth
#

im somewhat of a normie myself

chrome bone
onyx moth
#

thank you btw

keen marsh
south viper
#

I see you're the one writing the doc. I take it you're on python 3.11 with arch so you need to use --experimental. it looks like --use-ipex doesn't work on Arch as well. I would like to know exactly how you got it running cause last time I tried i had to do the whole intel_extension_for_pytorch rigmarole and it would never succeed, between pip being an actual a**hole about non-package manager packages, to compiled binary deps not being happy it never worked.

#

🙃

onyx moth
#

I see a lot of XL images on civitai using DPM++ 2M SDE Karras and DPM++ 3M SDE Karras but not seeing them available here. Is that an IPEX limitation? Are these people using Comfy or something?

#

I know that @proper cradle said DPM 2M is the base and rest are an option in the settings but the images being produced are using normal scheduling.

restive parcel
#

I tried using comfy extension on ipex and it just hard crashes my server launch cri

#

i'll try to just install comfy on its own and see if I can run it, 'cause if I can't run comfy on its own either then i probably have something messed up in my oneapi install or something

proper cradle
#

Did you install OneAPI, intel-compute-runtime and rest of the necessary packages?

proper cradle
#

You can also ask SD.Next issues on SD.Next discord server

restive parcel
#

oh, there's an sdnext server?

proper cradle
#

Yep, link is on the readme

restive parcel
#

sweet, I'll hop on tomorrow

spiral junco
#

@proper cradle have you seen this red/green pixel dot artifact on sdxl? it seems to be it's related to invisible-watermark

proper cradle
#

That's already fixed?

#

Like 2 months ago

spiral junco
#

I was working on the A1111 fork and ran into this issue for some reason. but after overriding pipe.watermark then issue seems to be fixed

#

not sure why

proper cradle
#

It needs to be manually disabled in the code

#

idk what A1111 does

spiral junco
#

add_watermark=False?

proper cradle
#

It's disabled in SDNext for 2 months already

#

If you still have this issue then don't use the original base VAE

spiral junco
#

thanks for the info 😄

onyx moth
#

oh uh hey. is 1 to 2 it/s normal with the arc a770 on sdxl?

proper cradle
#

1.85 it/s at 1024x1024

onyx moth
#

That's about what I'm getting. Nice.

onyx moth
#

hey disty, out of curiousity im just wondering if the 1024 curse is something that might be fixed in the future

proper cradle
#

It reduced to 1024x1024 only after a few updates to IPEX itself and the compute runtime stack

onyx moth
#

I see, I'm glad to know there is/has been progress with that

broken grail
#

such a weird bug

keen marsh
#

Sdxl doesn't want to work at all in native windows anymore with ipex. Error diffuser model not loaded. Maybe some setting I have is wrong its been a while

#

Also getting massive memory usage and slow down of system like back with wsl2

#

Seems like a memory leak? Python isn't even using it. Oh I seez it reset itself to fp32 again lol, why does it do this again?

#

Still won't load any sdxl model with xl backend for some reason. Actually nothing will load in diffusers.

keen marsh
#

okay figured it out, Ipex optimize causes the problem. but now native API failed error. 😦 Might have to pull an older commit. okay, shutdown fixed that lol

proper cradle
#

Put the return model on wrong place

keen marsh
#

I keep getting an api error when loading a Vae or the refiner etc that requires a complete Ui shutdown to work again. I used to be able to swap out no problem, only had to restart when changing the medvram/lowvram setting etc

proper cradle
#

Disabling IPEX Optimize and Compile is the same?

keen marsh
#

I have them both disabled

#

It seems to be hanging on loading the Vae or something. I tried to swap vaes and it it just says it's loading in the cmd window, but nothing in the webui, and if I load a model it will load that and then say it's loading the vae still.

proper cradle
#

Only change with the latest update on IPEX was IPEX Optimize and Compile

#

All i did was moving them to a function so i can call compile with OpenVINO

keen marsh
#

I don't know, I haven't used it in a while it could be something from a while ago. Also could be just windows ipex stuff

onyx moth
#

same, still happening

#

when trying to change checkpoint. also, it used to say in the command prompt the details of the checkpoint being loaded, now it only mentions the VAE. And when prompting it always says loading vae approx instead of the checkpoint (baked vae or not)

#

like it doesn't say loading weights about it

proper cradle
#

Pushed another fix

#

Now there shouldn't be any difference functionality-wise

proper cradle
#

Unloading logic was broken

#

fixed it

tall grove
#

am attempting to setup via docker on fedora as icba to setup intel stuff there. Also docker desktop wont connect to the gpu as it is setup without sudo lol (atleast docker-ce works when using sudo)

#

hasnt crashed yet so it may work hmhm

tall grove
#

idk i think sd xl works uhh

#

is this a good speed with an a770?

#

tbh i need more ram lol

tall grove
#

works well enough i am satisfied

restive parcel
#

LMAO

proper cradle
#

Attention Slicing is turned on by default on IPEX now

#

New speed you should expect at 1024x1024 is 1.70 it/s

#

Slicing Scaled Dot Product to workaround 4GB limit made it slower than Sliced Attention

tall grove
#

yeah i think i get that

grave condor
#

has there been a solution to the 4GB thing?

proper cradle
#

Slicing the attention layer did the trick

proper cradle
#

And it's still faster than sub-quad

keen marsh
#

Any changes with the new Openvino update? Any optimizations etc, or compatibility issues?

grave condor
#

lots of improvements in that release. but I suppose wait a few days for all other libraries to incorporate those

proper cradle
#

Updating OpenVINO to latest one caused a slight performance loss on my end

novel sphinx
#

Reading the notes, it looks like openvino no longer needs to convert the models? That could be pretty big

onyx moth
#

oh? that was a big issue for me. because changing even the resolution had to let it convert and the cache would get huge fast

grave condor
novel sphinx
#

Still a huge improvement should lower the time to compile the models and such

grave condor
#

yes, the torch.compile backend is interesting to compare to ipex now. But it will take a few days for optimum updates to properly change their backends etc

cobalt abyss
#

The 4GB VRAM limit on Arc GPUs seems to be related to the memory usage of the attention layers in Stable Diffusion models.

Some key points:

  • The attention mechanism is very memory intensive, since it calculates relationships between all tokens in the input.

  • For higher resolutions like 1024x1024, the attention layers start exceeding 4GB of VRAM usage on Arc GPUs.

  • This was causing crashes or failures to load models at 1024x1024 and above.

  • A technique called attention slicing was implemented to work around this issue. It splits the attention calculation into smaller chunks to reduce peak memory usage.

  • After adding attention slicing, Arc GPUs can now run 1024x1024 resolutions at decent speeds of around 1.7 iterations/sec.

  • But the 4GB limit still remains as an architectural restriction of current Arc GPUs. So higher resolutions like 1536x1536 remain out of reach for now.

  • The attention slicing approach helps make 1024x1024 feasible, but doesn't fundamentally lift the 4GB barrier. Further memory optimization work is still needed for higher resolutions.

So in summary, attention slicing was an optimization to make 1024x1024 possible by reducing memory usage, but the fundamental 4GB VRAM limit on Arc still poses challenges for resolutions above 1024x1024.

is it true? Then where did I see people saying Arc can generate upto 4096x

cobalt abyss
#

@proper cradle Based on my research, it seems there are a few key points:

  1. Intel Arc GPUs currently have a 4GB limit on memory allocations per buffer in OpenCL and Level Zero APIs. This is due to the "stateful addressing model" used and 32-bit integer overflow in index calculations.

  2. It is possible to workaround this limitation by:

  • Using the CL_MEM_ALLOW_UNRESTRICTED_SIZE_INTEL flag in OpenCL API calls to disable the error and allow larger allocations. However, this can lead to incorrect results if the kernel is not compiled properly.

  • Passing the ze_relaxed_allocation_limits_exp_desc_t struct in Level Zero API calls to disable the size restriction.

  • Compiling kernels with -cl-intel-greater-than-4GB-buffer-required (OpenCL) or -ze-opt-greater-than-4GB-buffer-required (Level Zero) build flags. This puts the kernel into "stateless addressing model" to support larger buffers.

  3. Even with workarounds, there seem to be issues with performance degradation and incorrect results when using buffers over 4GB. This points to deeper architectural limitations in supporting large allocations efficiently.

  4. The 4GB per buffer limit severely restricts the usable VRAM capacity, especially on high memory cards like the A770 16GB. Many applications require larger allocations.

  5. Other GPU vendors like Nvidia and AMD support full VRAM utilization with large single buffer allocations.

Overall, the 4GB allocation limit needs to be revisited in Intel GPU drivers and hardware. Larger memory sizes should be supported natively through 64-bit indexing and removing arbitrary limits. This will allow more software to take advantage of the available VRAM capacity. Workarounds help but have drawbacks. Long term, Intel GPUs should match the memory flexibility provided by competitors.

proper cradle
#

You can generate 4096x4096 with attention slicing on IPEX

#

OpenVINO still hits that 4GB limit because you can't use Attention Slicing with OpenVINO
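
For anyone unfamiliar with the term, attention slicing can be sketched in plain Python as below. This only illustrates the idea (score a few query rows at a time so the full score matrix never exists at once). It is not the SDNext or IPEX implementation, and the function names are made up for the example.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """softmax(q @ k^T / sqrt(d)) @ v using lists of lists."""
    scale = 1.0 / math.sqrt(len(q[0]))
    # The whole score matrix (len(q) x len(k)) is held in memory here.
    scores = [[scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
              for q_row in q]
    weights = [softmax(row) for row in scores]
    dim = len(v[0])
    return [[sum(w * v_row[j] for w, v_row in zip(w_row, v)) for j in range(dim)]
            for w_row in weights]

def sliced_attention(q, k, v, slice_size):
    """Same result, but only slice_size query rows are scored at a time,
    so the largest score matrix is slice_size x len(k)."""
    out = []
    for start in range(0, len(q), slice_size):
        out.extend(attention(q[start:start + slice_size], k, v))
    return out
```

Softmax is applied per query row, so splitting along the query axis does not change the output. It trades one large allocation for several smaller ones.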

proper cradle
#

And performance drop is not too bad
1.85 to 1.75 it/s at 1024x1024

#

Also 4GB limit isn't an issue with LLMs since they are not a single block of a model.

#

Models with layers (almost every model) are fine unless a single layer goes above 4GB (and you probably won't be able to fit that model into 16 GB VRAM even if you could allocate more than 4GB).

#

4096x4096 with SD 1.5 12.5 s/it

#

4096x4096 with SDXL 10.8 s/it

#

These are direct generations without any upscaling

proper cradle
#

And we were already using pre-release OpenVINO 2023.1.0

#

So no big changes for Stable Diffusion WebUI

#

There are a few changes in the WebUI tho

#

Loras are working now (but needs to recompile with Lora)
And caching isn't broken anymore

cobalt abyss
cobalt abyss
proper cradle
#

SD 1.5 at 512x512

#

SDXL needs 48GB RAM to compile with OpenVINO

#

OpenVINO supports iGPUs too

#

and CPU performance is double if you use CPU Backend with OpenVINO

#

On my R7 5800 X3D at 512x512
6 s/it normal
3 s/it OpenVINO

cobalt abyss
#

double !!

proper cradle
#

But i would recommend 1080x1080 because Intel ARC is cursed at 1024x1024

cobalt abyss
proper cradle
cobalt abyss
#

still.. i would love if openvino compile on low ram

cobalt abyss
proper cradle
#

New model for each resolution and Lora used

#

3.5 GB per SD 1.5 model
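
A back-of-the-envelope estimate of how fast the cache grows, using the 3.5 GB figure above. It assumes every resolution and Lora combination you actually use gets compiled and cached once for a single checkpoint. That assumption, and the function name, are illustrative only.

```python
def estimated_cache_gb(resolutions, lora_setups, per_model_gb=3.5):
    """Cache size if each (resolution, Lora setup) pair is compiled once.

    lora_setups counts distinct Lora configurations, including 'no Lora'.
    """
    return resolutions * lora_setups * per_model_gb

if __name__ == "__main__":
    # Four resolutions, each used with no Lora and with two different Loras.
    print(estimated_cache_gb(resolutions=4, lora_setups=3), "GB")
```

Four resolutions with three Lora setups would come to about 42 GB under this assumption.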

restive parcel
proper cradle
restive parcel
#

Do loras balloon ram or vram requirements massively? I tried using 3 of them and hit memory out every attempt

#

64gb ram, a770 LE

cobalt abyss
proper cradle
#

It will recompile a new model with Lora

proper cradle
proper cradle
#

OpenVINO will work with Diffusers method

restive parcel
#

i dont think I was using diffusers method?

#

just openvino

#

though i can't recall the settings of that

proper cradle
#

Also --use-openvino will lock you in Diffusers backend

restive parcel
#

oh so it automatically uses diffusers

#

shame only 1 lora i guess, i usually use 2 for style and 1 for negative

#

but hey, having the openvino method at all is fantastic

proper cradle
#

Also VAE decoding step is almost instant on OpenVINO

#

This is a big improvement over IPEX

cobalt abyss
cobalt abyss
restive parcel
#

I wonder if I could like, garbage cleanup the openvino model cache folder at a certain size so it doesn't eat my whole drive
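
One possible way to do that kind of cleanup is a small script that deletes the oldest files until the folder is under a size budget. This is a sketch, not an SDNext feature. The cache folder path is whatever your install uses, and it assumes the backend simply recompiles when a cached file is missing, which has not been verified here.

```python
import os

def prune_cache(cache_dir, max_bytes):
    """Delete least-recently-modified files until the folder is at or under max_bytes.

    Returns the list of deleted paths.
    """
    files = []
    for root, _dirs, names in os.walk(cache_dir):
        for name in names:
            path = os.path.join(root, name)
            stat = os.stat(path)
            files.append((stat.st_mtime, stat.st_size, path))
    total = sum(size for _mtime, size, _path in files)
    deleted = []
    for _mtime, size, path in sorted(files):  # oldest first
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        deleted.append(path)
    return deleted
```

If a cached model is made of several files, deleting only some of them could leave a broken entry, so pruning whole subfolders may be the safer variant.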

cobalt abyss
proper cradle
cobalt abyss
restive parcel
#

comfy is pretty cool for folk who come from node based workflows

#

though sadly I never had a really good success rate in running comfy myself

proper cradle
cobalt abyss
proper cradle
#

Generation times are slower because of OBS recording

proper cradle
cobalt abyss
restive parcel
#

its a cool concept, but i have much less reason to use nodes for ai than for say, blender

proper cradle
#

Diffusers library is already easy enough to use as bare code without any UI.

#

ComfyUI is just a UI above that.

cobalt abyss
cobalt abyss
#

could help out with visualizing complex multistep prompts

keen marsh
#

There are supposedly new compile optimizations in the new OpenVINO; not sure if other things need to be updated as well to support it, like vipitis said, for it to help Stable Diffusion yet.

#

ComfyUI is slower than sd.next with ipex, it doesn't seem to have any optimizations

proper cradle
#

We were using the pre-release OpenVINO, not the old one

keen marsh
#

Oh, interesting. Didn't know that. So basically already as optimized as it can be.

grave condor
#

the full release is ahead of the prerelease for all the LLM stuff. They used openVINO for torch compile as dynamo backend with their own stable Diffusion fork

keen marsh
#

non stable diffusion related, Disty do you think it would be possible to run your Koboldai United Ipex in native windows? Or are some dependencies not available? (if this is the wrong place to talk about it let me know)

proper cradle
#

It should be possible

#

I didn't try adding Windows support

proper cradle
#

Only thing left is adding play-ipex.bat file for Windows

keen marsh
#

oh wow, amazing! Appreciate the work man, forreal.

keen marsh
#

Is it just me or is SD.next using a lot more vram than it did before (maybe a month ago)? Using --lowvram and it's taking about 4.8gb of vram now and just keeping it. I think it used to take about 2 or something. Can't generate much with --medvram anymore either, but I am not sure how big a file I was able to do before. I might just be misremembering. Anybody on a750, what is the largest you can generate on medvram?

restive parcel
#

windows?

keen marsh
#

Yeah, or Linux/wsl2. I tried to update native Linux version and everything broke on me lol, wsl2 works for me.

#

1024*1536 is about where I cap at with medvram now, 1080p ooms

proper cradle
#

Using FP16 fixed VAE?

keen marsh
#

Using the fp16 vae (workaround) that was suggested a month ago or so. Where you created a folder etc

proper cradle
#

It's on by default now

keen marsh
#

Okay will do, thanks

#

Immediately out of memory with medvram with that disabled.

keen marsh
#

Even if I disable the vae completely, if i turn off attention slicing it runs out of memory at 1536*1024 which runs with it enabled.

paper horizon
plain jay
#

getting this error

paper horizon
grave condor
#

libuv seems to be a dependency, even if you don't run Ipex in a conda env

keen marsh
#

Diffusers is about even, so my current workflow is sdxl in native windows and sd1.5 original backend in wsl2.

broken grail
#

anyone here experiment with v-prediction or "vpred" models on diffusers backend? seems like it needs a different scheduler

broken grail
#

aye, I did try that; got an error about it being v-prediction and not v_prediction so I manually set it in config.json, but model still gave bad results

i gotta get some work done today so I'll have to try again later
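
The naming error itself can be handled with a small helper like the one below, which normalizes the `prediction_type` value in a scheduler config dict. It is a sketch under assumptions: the set of accepted values is the one Diffusers schedulers commonly use, the helper name is made up, and it only fixes the spelling. It would not explain the bad results mentioned above.

```python
VALID_PREDICTION_TYPES = {"epsilon", "v_prediction", "sample"}

def normalize_prediction_type(config):
    """Return a copy of a scheduler config dict with a cleaned-up prediction_type."""
    fixed = dict(config)
    value = str(fixed.get("prediction_type", "epsilon"))
    value = value.strip().lower().replace("-", "_")
    if value not in VALID_PREDICTION_TYPES:
        raise ValueError(f"unsupported prediction_type: {value!r}")
    fixed["prediction_type"] = value
    return fixed

if __name__ == "__main__":
    print(normalize_prediction_type({"prediction_type": "v-prediction"}))
```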

keen marsh
proper cradle
#

Only SD 2.0 uses it iirc

#

So it can be safely ignored.

keen marsh
#

Have you guys noticed that there seems to be much less variation with the same prompt in sdxl? Like even with different seeds it will be a very similar painting or almost the same exact same person. Seems the same with the few finetuned models i have tried as well.

proper cradle
#

Most of them are overtrained with too small dataset

#

Only model worth trying for anime stuff are AstreaPixie's models imo

#

Rest are merges of merges based on CounterfeitXL which is overtrained with too small dataset

keen marsh
#

This is with the base model too though

#

My guess is its a side effect of better understanding of prompt

#

Less happy accidents but better fine tune control maybe

chrome bone
#

the larger the model the more training is needed. i'd say its undertrained instead

#

by that i mean the model does not generalize well. a more diverse training sample is needed

proper cradle
#

Pushed some optimizations to SDP attention on dev branch.
Speed is back to normal now. (8.2 it/s on SD 1.5 and 1.8 it/s on SDXL)

minor plank
#

I want to get into the generative AI business. It looks like fun and seems very interesting.

paper horizon
#

for batch size = 16, native windows is as good as WSL/linux

proper cradle
#

512x512 at batch size 1 is CPU bottlenecked both on OpenVINO and IPEX

#

And it's not isolated to SDNext

#

Barebone Diffusers scripts are the same

#

Also IPEX performance caught up to OpenVINO with the new Diffusers

#

Both of them take 2.75 seconds to generate a 512x512 image

plain jay
#

If I want 4K images, will it work with 16 GB RAM, an Arc A750, and a Ryzen 3600, or will it crash?

proper cradle
#

Direct 4096x4096 generation?

plain jay
#

yup or should i scale up ?

proper cradle
#

4096x4096 will most likely crash with 8GB VRAM

#

--lowvram can help but it's too slow

#

A770 16GB can generate direct 4096x4096

plain jay
#

what about 3840 x 2160 ?

#

on arc a750 , i have a750

tall grove
#

u can try generating?

proper cradle
#

SDNext with IPEX, SDXL 3840x2160:

Default settings used 12GB.

--medvram used 10GB
--lowvram used 4GB (half the speed and needs more RAM to offload)
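
The numbers above can be turned into a quick lookup for which mode might fit a given card. This only encodes the three measurements reported for SDXL at 3840x2160. The helper name and the optional headroom parameter are made up for the example, and real usage will vary with model, sampler and driver.

```python
# VRAM use reported above for SDXL at 3840x2160 on SDNext with IPEX (GB),
# ordered from least to most aggressive offloading.
REPORTED_VRAM_GB = [
    ("default", 12),
    ("--medvram", 10),
    ("--lowvram", 4),
]

def least_aggressive_mode_that_fits(vram_gb, headroom_gb=0.0):
    """Return the first mode whose reported use fits in vram_gb, else None."""
    for mode, used in REPORTED_VRAM_GB:
        if used + headroom_gb <= vram_gb:
            return mode
    return None

if __name__ == "__main__":
    print("8 GB card:", least_aggressive_mode_that_fits(8))
    print("16 GB card:", least_aggressive_mode_that_fits(16))
```

Going by these figures alone, an 8 GB card would need --lowvram for this resolution, at the cost of roughly half the speed.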

keen marsh
proper cradle
#

Used Github releases as file hosting

onyx moth
#

Hi, can you clarify this for me? I'd like to use DPM++ SDE Karras

#

so for the checkboxes would this be it?

#

The default? Or do I pick dpmsolver++

#

with this

proper cradle
#

dpmsolver does nothing with DPM SDE

proper cradle
onyx moth
#

oh ok

#

I get it thanks Disty

proper cradle
proper cradle
#

Can someone with an iGPU or multi GPUs try setting this environment value and try OpenVINO?
OPENVINO_TORCH_BACKEND_DEVICE="MULTI:GPU.1,GPU.0"
I am curious if it will use both GPUs.
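
If editing launch scripts feels risky, the variable can also be set from a small Python wrapper that starts the WebUI with a modified environment. This is a sketch: the helper name is made up, and the launch command in the comment is an assumption about a typical Linux install.

```python
import os

def with_openvino_device(device, base_env=None):
    """Return a copy of the environment with OPENVINO_TORCH_BACKEND_DEVICE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENVINO_TORCH_BACKEND_DEVICE"] = device
    return env

# Hypothetical launch (script name and working directory depend on your install):
#   import subprocess
#   subprocess.run(["./webui.sh", "--use-openvino"],
#                  env=with_openvino_device("MULTI:GPU.1,GPU.0"))
```

Passing a copy of the environment keeps the change local to that one process, so nothing has to be edited or exported globally.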

restive parcel
#

i have openvino sdnext set up in a conda venv, if i can just tweak something in there

restive parcel
#

while I figure out what part of the init file is safe to edit if any, I gave a Lora a go, nice to see its working nicely now, although only one at a time

proper cradle
#

or
VALUE webui.bat

proper cradle
restive parcel
#

tried without lora

#

this is on a770 LE + i5-13600k

#

oh yeah, i noticed that while using openvino mode, i can't switch models anymore. it errors out about a config and then goes back to what i was already using

restive parcel
#

removing the current model entirely let it pick a new default, but i couldn't just unload the checkpoint and pick a new one from the dropdown

#

oh nvm, it died on loading

proper cradle
#

Run this

set "OPENVINO_TORCH_BACKEND_DEVICE=GPU"
#

Also OpenVINO device logging is in dev branch:

restive parcel
proper cradle
#

I can't reproduce the model switching issue

#

Try removing the cache folder

restive parcel
#

hmm, removing cache still has the issue

#

I tried all of the models I have on hand, and only the one i started on (AOM) works

#

Anythingv5, CounterfeitXL, and SDXL don't load

proper cradle
#

Did you change the default settings?

#

There was a use dict from current model setting that can cause this but i can't find it

#

Yep it is

restive parcel
#

oh wtf, it actually was using that

proper cradle
restive parcel
#

i wonder if i changed that while debugging another issue from before

proper cradle
#

Probably mixed that up with model selection

restive parcel
#

I was having a vae loading issue of some sort iirc, so I just set everything i could to point to the model I was using

proper cradle
#

Multiple Loras are working with the new diffusers

#

It's currently in dev branch

#

Kurokawa Akane + Hoshino Ai

restive parcel
#

oh? will that work for openvino as well?

proper cradle
restive parcel
#

ooooh

proper cradle
#

IPEX was already working with sequential apply method

proper cradle
proper cradle
#

Device name works with CPUs too

proper cradle
tall grove
#

is that 512x512 1.5?

proper cradle
#

Yes

#

IPEX and OpenVINO have the same performance in dev branch

tall grove
#

has it got faster

true spade
#

i need help for installation... any one ??

proper cradle
proper cradle
# true spade i need help for installation... any one ??
true spade
#

i dont know.. i mess it up every time..

#

my a1111 currently using cpu, 😦

tall grove
#

stop using a1111

true spade
#

i am so confused dude.

#

if i share my screen is there any one can guide me??

tall grove
#

first

#

u using ubuntu?

#

ill be honest the easiest way to do it is on linux with ubuntu

#

follow this right?

true spade
#

E: Unable to locate package intel-oneapi-compiler-dpcpp-cpp
E: Unable to locate package intel-oneapi-mkl

proper cradle
true spade
#

N: Ignoring file 'oneAPI.listsudo' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
E: Unable to locate package intel-oneapi-compiler-dpcpp-cpp
E: Unable to locate package intel-oneapi-mkl
aipriom@DESKTOP-IVNLUFL:~$

tall grove
#

u didnt do it in the correct order?

#

do it one line at a time

true spade
#

sudo apt-get install -y gpg-agent wget
wget -qO - https://repositories.intel.com/graphics/intel-graphics.key |
sudo gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg
echo 'deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc' |
sudo tee /etc/apt/sources.list.d/intel.gpu.jammy.list
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor |
sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update && sudo apt upgrade -y

tall grove
#

if it still doesnt work i can help u do the docker way as it just requires properly setting up docker-ce to work

#

then u can easily setup the web ui with 1 cmd, hopefully...

#

atleast it was that simple for me

true spade
#

i installed it

tall grove
#

u got docker?

true spade
#

sudo usermod -aG docker username ???

tall grove
#

dont bother with the group thing for now

#

just make sure when running docker in sudo it works

#

run that cmd and it should setup everything

#
```
docker run -it \
--device /dev/dri \
-v ~/docker-mount/sd-webui:/sd-webui \
-v deps:/deps \
-v huggingface:/root/.cache/huggingface \
-p 7860:7860 \
--name sd-server \
nuullll/ipex-arc-sd:latest
```
#

that cmd right in the link

true spade
#

@tall grove

#

is it okay?

tall grove
#

u have to wait a while

#

um if thats windows idk about that working properly

true spade
#

ok brother thanks for your support

#

actually bro my nvidia was good.. arc fucks my head. 😦

tall grove
#

setting it up is mostly fine, but if ur using wsl it may be problematic. Native windows support seems to be coming soon with pre-compiled aot wheels but it will perform a bit worse i think?

true spade
#

look i'm not a developer. i don't even have 2% knowledge of this... but nvidia was user friendly....

tall grove
#

well yeah arc never touted a complete support for sd

true spade
#

i bought this arc last month..

tall grove
#

but it works with a little work

#

nvidia is always what u get if u care about cuda stuff

#

though intel is working on ipex and their various things like openvino its still all new stuff

true spade
#

nd i faced unbeatable issues .. which i dont even imagine...

tall grove
#

for games?

true spade
#

my first issue was (game shut down)

#

then i faced issue on Adobe premiere pro

#

u cnt blv my pc got stuck while color grading.... fully hang..

#

😦

#

naw this...

#

actually i m sweating after thinking that i have to use this gpu next 2years.

tall grove
#

?? whats the issue lol

#

stable diffusion works fine it just requires a bit more setup

#

the same is true for amd

true spade
#

i dont know about amd.

tall grove
#

amd does not work on windows native aswell

#

if u wanted stable diffusion performance with ease of use you go with nvidia, but intel is also about a 3060 ti in performance and totally fine running on a linux distro, and with native aot wheels coming it should do fine (i hope) on windows with degraded performance. If you just want 1.5 there is also openvino

#

the setup ur doing rn is similar to nvidia its just that you need to install it on linux for it to work, and add the packages required for ipex and whatnot to run

#

you cant ask for much more except making it do this for you i guess

#

the docker ur using does this environment setup for you, thus the single command

#

intel arc will always be a rather precarious card because of it being first gen, and naturally with drivers that have a host of things needing work doing. Choosing intel as a long term choice can be fine if you understand that some things may not work, and if they dont it can take a while for it to get fixed

proper cradle
#

Using the docker image is actually easier than Nvidia

true spade
#

yeah its true..

tall grove
#

I mean it is a single command once u have docker installed so it cant get more complicated than that

true spade
#

why didn't you guys tell me about docker before...

tall grove
#

idk native is a bit easier to use once its done i guess

#

u have to start the docker up when u want to use it, but you can rename the container to a nickname to make this easier

#

it does bind to a directory so u can easily add ur models and whatnot to the container so it isnt much different

#

if u need help doing that its quite easy to do

#

i just named the container sdnext and so whenever i want it i just start up sdnext with docker run or something

true spade
tall grove
#

ignore

proper cradle
#

Do you have an iGPU?

tall grove
#

oh is it an actual error

#

tbh i think hes doing it in wsl so idk if that does anything weird

true spade
#

yes i have igpu

proper cradle
#

Let it do its thing and eventually crash

#

After that disable your iGPU from device manager

true spade
#

then??

proper cradle
#

restart

#

and it should work

keen marsh
proper cradle
#

VM disks generally need manual pruning

keen marsh
#

Yeah, it doesn't free it even uninstalled had to manually find the vm files and delete them, its a pita in windows

#

Wsl2 by itself is fine though, can run a cmd line and prune it

restive parcel
#

i tried docker in native linux, and i just got a lot of errors somehow DinaKEK

#

never did manage to get it working

tall grove
#

yeah i got the same error and it fixed itself when switching to docker-ce and using sudo

restive parcel
#

I tried that too, and using sudo just gave me a different error

tall grove
#

weird lol

restive parcel
#

I'm new to docker so its entirely possible i screwed something up, but I do find it amusing that I can still screw up something so turn-key as docker DinaKEK

keen marsh
#

I can also say, i never got any stable diffusion docker to work either, it just ate my ssd space lol

restive parcel
#

I feel like docker for windows wouldn't perform well in this case simply because docker (to my knowledge) still relies on the current system kernel, and since its on windows its not any better than running it as a VM due to how there are specific kernel needs for the card with the current oneapi environment. I could easily be wrong, but that's my interpretation of the issues for VMs and containers with arc

keen marsh
unborn flume
#

I installed this, it works fine, but I cannot generate with sdxl models. It gives "Error loading model weights" error. Does this setup not support sdxl or am I having another problem? @proper cradle

proper cradle
#

Use the Diffusers backend

unborn flume
# proper cradle https://github.com/vladmandic/automatic/wiki/SD-XL

I have A770 16GB. I changed Execution backend from original to diffusers and diffusers pipeline to Stable Diffusion XL. It didn't work at first, but after restarting the server few times, it started working. But I'm getting problematic results. I hope I can figure it out.

proper cradle
#

Try anything other than 1024x1024 like 1080x1080 or 1024x1536

broken grail
#

I wonder if there are any other cursed resolutions

unborn flume
proper cradle
#

Also DPM 2M isn't good with SDXL

#

Best one is Euler a with higher CFG

#

DPM SDE is good but slow

unborn flume
minor plank
#

I'm about to start my SD journey.

Anything I should do or avoid?

Things you wish that you had known before starting on your journey?

9900k
32GB RAM
A770 LE 16GB
Lots of NVMe

tall grove
#

use linux if u can bout it

#

32gb is enough for me so u should be fine

keen marsh
#

It's so weird that I have never noticed any problem with 1024 res at all. I wonder if it's because i am only on an a750? Or that I use med-vram usually?

onyx moth
#

1024x1344 works nicely for me

restive parcel
# proper cradle DPM SDE is good but slow

dpm sde very nice, the convergence is powerful enough that I can usually use less than a third or even a quarter as many samples as I would euler to get a coherent result

#

I feel like I do use more vram with it though?

chrome bone
#

euler is a pretty dated solver

#

heres a comparison table made by creators of DPM++, the value in table is FID score (lower is better), NFE is number of steps

restive parcel
#

if I'm reading this right, isn't DDIM doing really well here? not that I'd complain, DDIM my beloved

chrome bone
#

yeah it is. but DPM SDE is stochastic and that gives your picture more variability

keen marsh
restive parcel
#

I havent really had good performance from SDXL models in the first place, so I just went back to 1.5

#

need good anime models before i have any use of it

keen marsh
#

It def has a ways to go, but it has some advantages with high res off the bat. I can do higher resolution in it than with 1.5 without ooms. I have to tile upscale just to get the base res that xl gives.

keen marsh
#

So i tried to use a lora with ipex, and it immediately shut down webui. Then I tried to move to dev branch and it shut down just trying to load a sdxl model, now I switched back and it shuts down as soon as I load an xl model in diffusers. No errors just shuts down, tried debug too. Edit I see now, it deleted my custom folder location in the settings.

#

Nope, still just shuts down. lol, wtf?

keen marsh
#

can't load any models or vae's, it just crashes. Went back to an older commit and the same. No clue what broke, guess I have to reinstall again. okay got original backend to load up finally.

#

doesn't even tell me what file just line 198

proper cradle
#

try removing --use-ipex if you are on Linux

#

ipexrun can cause random errors

#

Welp you are on Windows

keen marsh
#

I will see, I downloaded a whole new folder to try and see. There were some new extensions from the dev branch that may have broken it i dunno

keen marsh
#

well I am completely at a loss now, redownloaded everything from scratch and it's still just quitting when trying to generate an image. Maybe I will try a system restart?

keen marsh
#

okay, just tried comfy ui and it seemed to crash as well, might be an issue with the latest drivers.

proper cradle
#

iGPU is disabled? Don't really know what else.

keen marsh
#

Don't have one, I think it's the driver. I actually hadn't generated anything on it since I updated yesterday

#

I also may try the whl you guys compiled to see if mine is just too old

keen marsh
#

So yeah, it was the latest drivers. They somehow break something in Ipex

proper cradle
#

HyperTiling with IPEX

#

HyperTile almost fixes the 1024x1024 curse

keen marsh
#

Is this a new extension?

novel sphinx
#

So multi gpu with igpu does weird things and i do not recommend, it appears to load the model into the vram on the dgpu but then inferences on just the igpu? The igpu usage ping pongs between 100 and 0 with no dgpu usage besides vram being shown. Its obviously much slower than just using the dgpu at like 2.5s/it vs the almost 11it/s i get on just the a770

#

I think it's prioritizing the igpu

proper cradle
proper cradle
proper cradle
#

Multi GPU option on SDNext uses MULTI:GPU.0,GPU.1 and so on

#

If you check Remove iGPU option, it will remove GPU.0 from the list

#

But it will probably end up ignoring the iGPU with MULTI:GPU.1,GPU.0 since iGPU is too slow to be useful.
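
The device list described above can be sketched like this. It mirrors the behaviour as described in this chat (GPU.0 treated as the iGPU and dropped when the option is checked), not the actual SDNext source. The function name and the single-device fallback format are assumptions.

```python
def multi_device(gpu_count, remove_igpu=False, igpu_index=0):
    """Build an OpenVINO device string such as 'MULTI:GPU.0,GPU.1'.

    igpu_index=0 assumes the iGPU is enumerated as GPU.0, as described above.
    """
    indices = [i for i in range(gpu_count)
               if not (remove_igpu and i == igpu_index)]
    if not indices:
        raise ValueError("no GPU left after removing the iGPU")
    names = [f"GPU.{i}" for i in indices]
    # With a single device left there is nothing to combine (assumed format).
    return names[0] if len(names) == 1 else "MULTI:" + ",".join(names)

if __name__ == "__main__":
    print(multi_device(2))
    print(multi_device(2, remove_igpu=True))
```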

novel sphinx
#

even just using the dgpu alone i barely see any usage

#

yeah setting 1 then 0 just uses the arc i see no igpu usage

keen marsh
novel sphinx
#

actually using multi with igpu second is still worse than just igpu

#

it uses more system ram like that

keen marsh
novel sphinx
#

im getting errors with hypertile but im using openvino

#

so it might not be compatible

proper cradle
#

OpenVINO doesn't work with any slicing/tiling

novel sphinx
#

ahh that explains it

#

RuntimeError: Check 'backward_compatible_check || in_out_elements_equal' failed at src\core\src\op\reshape.cpp:433:
While validating node 'opset1::Reshape Reshape_57436 (input1[0]:f32[2,77,768], Constant_57435[0]:i32[2]) -> (f32[?,?])'
with friendly_name 'Reshape_57436':
Requested output shape [8192,320] is incompatible with input shape [2,77,768]

#

this is what i get

proper cradle
#

Apparently Dynamic Slicing i added for IPEX actually makes generating faster at higher resolutions.

(Both of them are with HyperTile.)
Dynamic Slicing: Time: 1m 15.78s
No Dynamic Slicing: Time: 1m 32.74s
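
To put the two timings side by side, a small helper can parse the `1m 15.78s` style strings and compute the saving. The helper names are made up for the example.

```python
import re

def parse_duration(text):
    """Convert a string like '1m 15.78s' or '13s' to seconds."""
    match = re.fullmatch(r"\s*(?:(\d+)m)?\s*(?:(\d+(?:\.\d+)?)s)?\s*", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0.0)
    return minutes * 60 + seconds

def percent_faster(fast_seconds, slow_seconds):
    """Percentage of wall time saved by the fast run relative to the slow one."""
    return (slow_seconds - fast_seconds) / slow_seconds * 100.0

if __name__ == "__main__":
    sliced = parse_duration("1m 15.78s")
    unsliced = parse_duration("1m 32.74s")
    print(f"{percent_faster(sliced, unsliced):.1f}% less wall time")
```

For the two runs above that works out to roughly 18% less wall time with Dynamic Slicing.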

novel sphinx
#

thats pretty good

novel sphinx
#

Definitely the new driver in windows breaks ipex somehow

#

Openvino is fine

#

Just reverted from 4885 to 4826 and ipex works

#

Hypertile is faster goes from 8.2 it/s to 9.5 it/s, neat

keen marsh
novel sphinx
#

yeah and has the starfield update but yeah, i might still update and just use the openvino tho its more limited, been testing diff things today

#

openvino on win is still a bit faster too

#

more resource intensive tho

keen marsh
novel sphinx
#

Oh idk i dont have wsl setup anymore

keen marsh
#

Hopefully they see my thread and fix it soon.

novel sphinx
#

I mean it uses the same driver but idk, i wonder what causes the issue, seems random

keen marsh
#

I have both, wsl for sd1.5 and native for sdxl, better workflow than trying to reactivate plugins constantly

keen marsh
novel sphinx
#

Yeah it just loads everything fine you click generate and after a few secs it just doesn't do anything says press any key to continue and that's that

gaunt breach
#

Followed the install instructions for Sd next (ipex)

#

During the first run/instal I run into an error.

#

I need help getting ipex working in the python venv.

running it will just produce an error like ModuleNotFoundError: No module named 'intel_extension_for_pytorch'

#

I'm doing this on Linux

grave condor
#

then ipex is not installed in that env
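
A quick way to confirm that is to run a short check with the venv's own Python. The script below uses only the standard library. Run it with the interpreter inside the venv, otherwise it reports on the wrong environment.

```python
import importlib.util
import sys

def module_available(name):
    """True if `name` can be imported by the interpreter running this script."""
    return importlib.util.find_spec(name) is not None

if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("environment prefix:", sys.prefix)
    print("ipex importable:", module_available("intel_extension_for_pytorch"))
```

If the interpreter path printed is not inside the venv folder, the package was probably installed into a different Python.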

proper cradle
#

20 steps 512x768 + 20 steps 1024x1536 with hires took 13 seconds with HyperTile

#

20 steps 512x512 takes 2.2 seconds

keen marsh
proper cradle
#

More efficient

#

5.5 vs 4.5

gaunt breach
#

Why did https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ work? while https://developer.intel.com/ipex-whl-stable-xpu did not

proper cradle
#

It's updated to this URL on dev branch

#

Old one was causing issues randomly

gaunt breach
#

Alright, good to know, thanks

proper cradle
#

This is with IPEX

grave condor
#

could the callback to show inference step be done more often? So you really get like a 20fps experience?

keen marsh
#

Update by command " wsl --update --pre-release "

proper cradle
#

Did a sampler XYZ

keen marsh
#

Native windows Ipex still broken on latest drivers 😦

fickle plume
#

tried to do vlad under wsl

proper cradle
fickle plume
#

final

#

linux works fine

proper cradle
#

Cause sometimes previews randomly break on original backend.

proper cradle
#

Try diffusers with attention slicing enabled

grave condor
keen marsh
#

Good news, both the driver team and ipex team are looking into it though.

keen marsh
grave condor
#

hm, I got issues trying language model inference via accelerate that I clearly hadn't before. And now worry it's the driver. What was the last known good driver? 4672?

fickle plume
keen marsh
fickle plume
#

i am using 4878 beta, and while vlad is downloading i wonder if the whql is any different

#

so many drivers

#

updating the repo fixed it... interesting

#

oh well

#

wonder if i broke something trying to get sdxl to work

#

well, sdxl still has things just die under wsl
welp, i guess i'll stick to ubuntu, i have a feeling i don't want to have headaches trying to train either

#

Killed
descriptive

proper cradle
fickle plume
#

48gb

#

it does go pretty high

proper cradle
#

24 GB should be barely enough for SDXL + Refiner

#

WSL RAM = System RAM / 2
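
As a worked example of that rule, the snippet below computes the default cap and renders a `.wslconfig` that raises it. The `[wsl2]` section and `memory` key are standard WSL settings. The 40 GB value is only an example, and the helper names are made up.

```python
def default_wsl_memory_gb(system_ram_gb):
    """Default WSL2 memory cap as described above: half of system RAM."""
    return system_ram_gb / 2

def wslconfig_text(memory_gb):
    """Contents for %UserProfile%\\.wslconfig raising the WSL2 memory cap."""
    return f"[wsl2]\nmemory={memory_gb}GB\n"

if __name__ == "__main__":
    print("default cap on a 48 GB machine:", default_wsl_memory_gb(48), "GB")
    print(wslconfig_text(40), end="")
```

After saving the file, running `wsl --shutdown` and reopening WSL applies the new limit.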

fickle plume
#

damn, why

proper cradle
fickle plume
#

ooh, thanks!

keen marsh
fickle plume
#

I had some other issues getting native windows to work at all, but I'll have to check again I guess

keen marsh
fickle plume
#

Does training also work natively or no idea

#

Well, training a lora

#

I'm in the convenient situation of actually having lightweight games i can play while training
Mindustry is a pretty enjoyable factoriolike

keen marsh
#

Just expect about 20% decrease in speed with aot whls, the regular whls will give faster generation technically but a 10-15min startup time lol
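
The trade-off can be put into numbers with a break-even estimate. This sketch takes the 20% slowdown and a 10 minute startup from the message above. The 8 it/s baseline is just an illustrative figure, and AOT startup is assumed to be negligible.

```python
def break_even_iterations(jit_speed_its, aot_slowdown=0.20, startup_seconds=600.0):
    """Iterations per session after which the regular (JIT) wheels win overall.

    Below this count the AOT wheels finish sooner despite being slower per step.
    """
    aot_speed = jit_speed_its * (1.0 - aot_slowdown)
    extra_seconds_per_iteration = 1.0 / aot_speed - 1.0 / jit_speed_its
    return startup_seconds / extra_seconds_per_iteration

if __name__ == "__main__":
    iterations = break_even_iterations(8.0)
    print(round(iterations), "iterations, about", round(iterations / 20), "images at 20 steps")
```

With these inputs the regular wheels only pay off after about 19,200 iterations in one session, which is about 960 images at 20 steps.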

restive parcel
#

guess i'm glad i've been too lazy to update my arc driver since before starfield updates, a lot of things seem to have been breaking

keen marsh
novel sphinx
#

Welcome to the Arc experience 🙃

#

Dual booting has never worked for me with an Arc card. I've tried several distros, and once whatever Linux OS I try installs and reboots, I never get a display and the light on my monitor just flickers on and off over and over. Tried Ubuntu, EndeavourOS, Manjaro, same thing every time. Also tried with secure boot enabled and disabled and with the iGPU enabled and disabled to see if any of that was creating conflicts, and no luck.

restive parcel
keen marsh
novel sphinx
#

I tried both on the same SSD as Windows and on a separate one, same exact thing. Makes me think it's possibly something with GRUB, idk. And all the distros had a 6.2 kernel or newer.

proper cradle
#

I was already using Arch Linux before I bought my A770, and the card worked fine without me doing anything.

novel sphinx
#

I haven't tried Linux on its own because I unfortunately need Windows for certain games and such, but I've never had success with dual booting

tall grove
#

u must use new kernels with arc

#

fedora works for me on a partition on the same drive

#

linux is always plug and play as the drivers are in the kernel

#

well might be igpu weirdness but it shouldnt be that much of an issue

novel sphinx
#

I was using new kernels as i stated along with turning off my igpu

keen marsh
#

The easiest way is to just unhook your Windows drives while you install; then you can also use that USB on any computer

novel sphinx
#

Yeah, I def ain't going through all that lol. My Windows drive is an NVMe on my board. I honestly don't have a real need for native Linux, it's just something I experimented with to no success.

fickle plume
#

๐Ÿ˜

#

when i make it generate an image on windows, it just flat out crashes

#

no message, no nothing

fickle plume
#

ok, even with everything fresh i still get this weird crash

keen marsh
keen marsh
fickle plume
#

Oh, so that is what's broken

#

I wasn't expecting such a.. straight up, nothing said crash

keen marsh
#

So far wsl2 still works

keen marsh
#

Anybody gotten the IP-Adapter in ControlNet to work on Arc? I keep getting a dtype error.

keen marsh
#

Wow fast, awesome work man.

keen marsh
#

updated dev branch but now TypeError: 'Logger' object is not callable

proper cradle
#

Logs?

keen marsh
#

Went back a bunch of commits and they were all broken. I don't remember which one I was on before I updated, but I'm back on master and it works. Let me update again and get the log.

#

Oh, and I also get a different error with "--use-ipex", which is what I always use. I'm now getting a different error without it too. It's now an error in UI extensions.

proper cradle
#

Someone with the same error opened a bug report on SDNext

#

"b55bb8c still works when I switch to it.
74fb8fb breaks it."

keen marsh
#

I just got this one. The other error was different, the TypeError about Logger, but I'm not getting it now. I will try that commit and see.

keen marsh
#

Something with the UI again, not sure, maybe it's my settings

proper cradle
#

try removing config.json

#

It will use 1080x1080 if you select 1024x1024

keen marsh
#

Okay, deleting the UI-config fixed it. Gonna try and edit that file so I can try the IP-Adapter.
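
A gentler take on the same fix, sketched in shell: move the settings files aside rather than deleting them, so they can be restored later. The file names `config.json` and `ui-config.json` and their location in the SD.Next folder are assumptions here, and `reset_sdnext_config` is a hypothetical helper name.

```shell
# Hypothetical helper: move SD.Next settings files aside so the UI
# regenerates defaults on the next start.
# Assumption: config.json and ui-config.json live in the directory
# passed as $1 (defaults to the current directory).
reset_sdnext_config() {
    dir="${1:-.}"
    for f in config.json ui-config.json; do
        if [ -f "$dir/$f" ]; then
            mv "$dir/$f" "$dir/$f.bak"
            echo "moved $f to $f.bak"
        fi
    done
}

# Usage, from the SD.Next folder:
# reset_sdnext_config .
```

Renaming a `.bak` file back restores the old settings if the reset turns out not to be the cause.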

proper cradle
#

Dev branch finally merged

#

Now Windows should be supported with:

webui.bat --use-ipex
novel sphinx
#

Looks great

upper pine
#

Disty and Nuullll, THANK YOU for all you do. Been having a blast with SD on Linux and I love seeing it work on intel GPUs

green pagoda
#

I'm trying to run this but running into the following issue with TensorFlow:

2023-10-19 20:06:49.362030: F tensorflow/c/experimental/stream_executor/stream_executor.cc:808] Non-OK-status: stream_executor::MultiPlatformManager::RegisterPlatform( std::move(cplatform)) status: INTERNAL: platform is already registered with name: "XPU"

Any ideas?

#

I'm thinking it might be the fact that I have two Intel GPU devices, but I cannot say for certain.

novel sphinx
#

Must have igpu disabled

green pagoda
#

How do you do that?

novel sphinx
#

In bios

green pagoda
#

I cannot do that, the host system is a laptop.

green pagoda
#

What exactly is using TensorFlow?

green pagoda
#

This is really cursed.

proper cradle
proper cradle
#

You can ignore that error if everything else runs fine.

proper cradle
#

Seems like OpenVINO works with AMD GPUs on Windows too.

#

Speeds are comparable to Shark.

grave condor
#

OpenVINO has an experimental NVIDIA GPU plugin as well

#

plus it can do ONNX runtime which should work across everything

proper cradle
#

OpenVINO can run on anything then

grave condor
#

and openVINO as a torch.compile backend should avoid requiring the whole openVINO IR conversion and saving that model somewhere

#

Many options and I haven't tried them all, as I currently struggle with implementation. Once I get into running a lot of evals, I will take a closer look there.

green pagoda
proper cradle
#

I don't know if there is a laptop with Intel ARC dGPU.

proper cradle
#

Use OpenVINO if you want to use non ARC GPUs.

green pagoda
proper cradle
#

Try xpu_VISIBLE_DEVICES=1 ./webui.sh --use-ipex

#

You can't really disable iGPU with that config

green pagoda
#

Thanks.

green pagoda
#

TensorFlow still registers two XPU platforms.

proper cradle
green pagoda
#

Well it crashes so…

proper cradle
#

If not, try:

ONEAPI_DEVICE_SELECTOR="ext_oneapi_level_zero_gpu:1"

Or:

ONEAPI_DEVICE_SELECTOR="ext_oneapi_level_zero_gpu:0"
#

try these with sycl-ls

green pagoda
#

Now it is only seeing a single device, but still registering two XPU platforms.

green pagoda
#

Yes.

proper cradle
#

and what error are you having?

green pagoda
proper cradle
green pagoda
#

Yes.

#

Maybe I was not clear about that.

proper cradle
#

source /opt/intel/oneapi/setvars.sh

proper cradle
green pagoda
#
sycl-ls
```ansi
[opencl:cpu:0] Intel(R) OpenCL, Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz 3.0 [2023.16.7.0.21_160000]
[opencl:gpu:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics 3.0 [23.35.27191.9]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) HD Graphics 630 3.0 [23.35.27191.9]
[opencl:acc:3] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2023.16.7.0.21_160000]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.27191]
[ext_oneapi_level_zero:gpu:1] Intel(R) Level-Zero, Intel(R) HD Graphics 630 1.3 [1.3.27191]
```
proper cradle
#
ONEAPI_DEVICE_SELECTOR=level_zero:0 ./webui.sh
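
The index in that selector can be read off the `sycl-ls` listing instead of guessed. A minimal sketch, assuming `sycl-ls` prints Level-Zero entries in the `[ext_oneapi_level_zero:gpu:N]` form shown in this thread (other oneAPI releases may format them differently). The `arc_level_zero_index` name is hypothetical.

```shell
# Hypothetical helper: read `sycl-ls` output on stdin and print the
# Level-Zero GPU index of the first device whose name contains "Arc".
arc_level_zero_index() {
    sed -n 's/^\[ext_oneapi_level_zero:gpu:\([0-9][0-9]*\)\].*Arc.*/\1/p' | head -n 1
}

# Usage (needs the oneAPI environment loaded first):
# source /opt/intel/oneapi/setvars.sh
# idx=$(sycl-ls | arc_level_zero_index)
# ONEAPI_DEVICE_SELECTOR="level_zero:${idx}" ./webui.sh --use-ipex
```

This only narrows which SYCL device is visible. Going by the log that follows, it may not be enough to stop the TensorFlow "XPU" registration abort.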
green pagoda
#

2023-10-20 13:01:03.512410: I itex/core/wrapper/itex_gpu_wrapper.cc:35] Intel Extension for Tensorflow* GPU backend is loaded.
2023-10-20 13:01:03.587475: W itex/core/ops/op_init.cc:58] Op: _QuantizedMaxPool3D is already registered in Tensorflow
2023-10-20 13:01:03.597085: I itex/core/devices/gpu/itex_gpu_runtime.cc:129] Selected platform: Intel(R) Level-Zero
2023-10-20 13:01:03.597432: I itex/core/devices/gpu/itex_gpu_runtime.cc:154] number of sub-devices is zero, expose root device.
2023-10-20 13:01:03.683572: F tensorflow/c/experimental/stream_executor/stream_executor.cc:808] Non-OK-status: stream_executor::MultiPlatformManager::RegisterPlatform( std::move(cplatform)) status: INTERNAL: platform is already registered with name: "XPU"
Aborted (core dumped)