#SDNext WebUI on Intel ARC

grave condor
#

the prebuilt wheels for windows are available

keen marsh
#

Nice!

chrome bone
#

thats fast

grave condor
#

can't try it myself right now, but please try to just use the command they have (maybe exclude torchvision) to see if it throws any errors

keen marsh
#

I will, wsl 2 just won't work for me. Wonder if it's because their python version is ahead of mine

grave condor
#

there are wheels for python38 all the way through 311

keen marsh
#

Hopefully I can get this running native... it's installing AlchemistWorry

grave condor
#

exciting times... And I am away from my computer for 3 whole days. so much work to catch up to in the next few weeks

pastel geode
#

I’m scared to mess up my versions, virtual environments and paths😅.
I’ll do this once there is a clear set of instructions for both windows and wsl

grave condor
#

I have asked for an article that addresses consumer GPUs for native windows with this new release

chrome bone
#

very excited

grave condor
#

please confirm it's working. I have already closed the issue because I need to sleep

keen marsh
#

Now I need to learn how to set the one api environment in windows lol

chrome bone
#

the basekit has windows version, you can choose to install only dpcpp and mkl

#

ill toy with it after lunch

keen marsh
#

I have it, but I think i need to set it proper to use

chrome bone
keen marsh
#

i think it was setvars in linux

grave condor
#

try to run %ONEAPI_ROOT%compiler\env\vars.bat

#

and the one for mkl

#

see the sanity check at the bottom of the Windows installation instructions. Maybe exclude torchvision

grave condor
#

maybe need to run it with call, not sure.

keen marsh
#

should this command run in windows? "python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.version); print(ipex.version); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];" " I pull an error with it

chrome bone
#

version

#

stupid discord

grave condor
#

put the code between ticks ` to avoid discord markdown formatting
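
For reference, the same sanity check can be written as a short script, which sidesteps both the Discord formatting and the quoting on the Windows command line. A minimal sketch, assuming the attribute names from the one-liner above (`torch.__version__`, `ipex.__version__`, `torch.xpu.device_count()`); the function name `xpu_report` is made up for illustration, and a failed import is reported instead of crashing:

```python
def xpu_report():
    """Collect the version and device lines the one-liner would print."""
    lines = []
    try:
        import torch
    except (ImportError, OSError) as err:
        # OSError covers DLL load failures such as "[WinError 126]".
        return [f"torch failed to import: {err}"]
    lines.append(f"torch {torch.__version__}")

    try:
        import intel_extension_for_pytorch as ipex
    except (ImportError, OSError) as err:
        lines.append(f"ipex failed to import: {err}")
        return lines
    lines.append(f"ipex {ipex.__version__}")

    xpu = getattr(torch, "xpu", None)  # only present on XPU-enabled builds
    if xpu is None:
        lines.append("torch.xpu is missing, so this is not an XPU build")
        return lines
    for i in range(xpu.device_count()):
        lines.append(f"[{i}]: {xpu.get_device_properties(i)}")
    return lines


if __name__ == "__main__":
    print("\n".join(xpu_report()))
```

Saved as a file and run from the prompt where the oneAPI vars were set, it should show at a glance which of the three steps (torch, ipex, device listing) is the one failing.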

grave condor
keen marsh
#

yes

grave condor
#

try a pip show intel_extension_for_pytorch

keen marsh
#

package not found 😦

#

it installed though, let me try again

grave condor
#

but the pip install didn't throw any errors?

keen marsh
#

WARNING: Package(s) not found: intel_extension_for, pytorch
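
The warning lists two names, `intel_extension_for` and `pytorch`, which suggests the package name got split in two on its way into `pip show`. One way to check without retyping it in a shell is from Python's standard library. A minimal sketch; the helper name `installed_version` is made up for illustration:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "torchvision", "intel-extension-for-pytorch"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it with the same python that the webui uses, otherwise it reports on a different site-packages.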

grave condor
#

turns out the devs have been using C:/oneAPI as the installation path, not C:/Program Files (x86)/Intel/oneAPI, so they never ran into the whitespace issues and the errors that return only about half the path

#

do a pip show torch and hope it gives you torch2.0.1+git#hash

keen marsh
#

Version: 2.0.0a0+gitc6a572f

#

i get this warning when reinstalling DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063

GitHub

Towards #11715 What is changing pip has historically allowed many arbitrary strings as versions and version specifiers. Going forward, all versions and version specifiers will need to conform to PE...

chrome bone
#

you can ignore this for now, this is just used to run sde samplers

keen marsh
#

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.14.1 requires torch==1.13.1, but you have torch 2.0.0a0+gitc6a572f which is incompatible.

grave condor
#

yeah, get rid of torchvision for now, it's optional. try to just get IPEX working and then see if you really need it

#

you can do pip uninstall torchvision

keen marsh
#

Name: intel-extension-for-pytorch
Version: 2.0.110+gitba7f6c1
AlchemistArt

#

that seemed to work.

grave condor
#

try to import it in python

keen marsh
#

still get the same module error AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\backend_with_compiler.dll"

chrome bone
#

hey vipitis, did you change the intel basekit directory to one without white space?

keen marsh
#

SD.next ipex install needs to change for windows, it's trying to install the linux library. @proper cradle

grave condor
#

It's 5.20 am and the matches start at 15.30, I really need to fall asleep now.

Good luck on this, but it sounds really promising. If I am home early enough tomorrow, I might try around a bit myself. But I do hope that someone figures it out and either posts a proper guide or they fix it to make the experience more seamless.

#

see if they push an update to docs or the release branch here

keen marsh
#

good night bro, appreciate the help

#

so far got sd.next installing with ipex in windows, hopefully I can get it to work.

#

nope, almost ' OSError: [WinError 126] The specified module could not be found. Error loading "H:\Stable diffusion\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependenc '

chrome bone
#

did you try this

#

im moving basekit's directory now. time never passes so slow

keen marsh
#

ahhh, I did in linux lol. I forgot

#

I'm not sure if use-ipex is running the one api environment, it may be because it's used to linux. I will see if this update fixes things first

chrome bone
#

ill try to get ipex running first, then see whats needed to be adapted in sdwebui

coral mulch
#

How do you actually activate the oneapi environment in Windows?

keen marsh
coral mulch
#

Yep, I remembered now.

keen marsh
#

I hit setvars.bat, it runs and seems to load everything, but it closes fast. Not sure if it's sticking?

coral mulch
#

"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"

#

You run it

#

in a command prompt.

#

You don't click the bat.

keen marsh
#

do you run it each time?

#

and keep the prompt open?

coral mulch
#

You have to setvars with openvino as well

#

So I'm assuming you do.
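
On the "is it sticking" question: environment variables set by a script live only in the process that ran it (and that process's children), which is why double-clicking the bat leaves nothing behind once its window closes. A small sketch of that behaviour, using a child Python process as a stand-in for the bat's window; `DEMO_ONEAPI_VAR` is a made-up variable name, not something setvars.bat actually sets:

```python
import os
import subprocess
import sys

# Child process: sets a variable in its own environment and prints it.
CHILD_CODE = (
    "import os; "
    "os.environ['DEMO_ONEAPI_VAR'] = 'set-in-child'; "
    "print(os.environ['DEMO_ONEAPI_VAR'])"
)


def run_child():
    """Return (value the child saw, value the parent sees afterwards)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True, text=True, check=True,
    )
    child_value = result.stdout.strip()
    parent_value = os.environ.get("DEMO_ONEAPI_VAR")  # untouched by the child
    return child_value, parent_value


if __name__ == "__main__":
    child, parent = run_child()
    print(f"child saw: {child!r}, parent sees afterwards: {parent!r}")
```

So the bat has to run in the same command prompt that later starts python, and again in every new prompt.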

keen marsh
#

I used open vino once with the first ever version it supported stable diffusion with

coral mulch
#

You can probably modify whatever starts up oneapi for linux with the directory of the bat for windows

#

Let's see what happens.

keen marsh
#

finally figured it out, now lets see if it works lol

#

Same error... 😦

coral mulch
#
"C:\Users\dbs_5\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its
dependencies.
#

Same error as you, Aaron.

keen marsh
#

https://github.com/intel/intel-extension-for-pytorch/issues/394 hmm closest I can find to our issue. they post a link to the new windows version in the last reply

GitHub

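Describe the issue Environment: Windows 11 Enterprise. GPU: Intel A770M. Compiled and installed intel-extension-for-pytorch and pytorch locally without errors. At runtime it reports: raise err OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\ws\PycharmProjects...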

coral mulch
#

Visual Studio 2022

#

Install Desktop development with C++ in the Workloads tab. 2 separate options MSVC and Windows SDK are required.

#

I am using VS2019

keen marsh
#

In the Linux install they "export" the mkl and dpcpp, I wonder if that means add to path

#

I am on 2022

coral mulch
#

aka

#

I do not think VS version is the reason at all then.

#

Lol

#

Hmm.

keen marsh
#

I am also not on the latest driver, maybe that could be it

coral mulch
#

I am.

restive parcel
#

good luck folks, i already burnt out all my debugging neurons over the last 3 days CeobeConfused

keen marsh
#

Lol, i have had a month off and came back at the best time

restive parcel
#

if i can't even get it running in linux native, I daren't attempt to run something this fresh in windows

coral mulch
#

Well I got it running on linux, until it blackscreened a few times.

keen marsh
#

I spend more time debugging than using the programs lol

#

In wsl2 there is something in my environment it doesn't like, i haven't tried native Linux yet

#

Maybe the python version in the venv needs to match with windows?

restive parcel
#

well, i used automatic on nvidia quite a bit

#

arc on the other hand has been a lot of work just to match what I could do before

#

...on an old 4gb card

#

we're slowly improving though

keen marsh
#

Yeah, if ipex can work in windows it's a game changer. full UI support at full speed. It would be second to nvidia as amd can only use directml or shark.

keen marsh
#

noticed the torch versions are different for windows and linux, don't know if that is the issue. linux is 2.0.1a and windows is 2.0.0a

chrome bone
#

building torch now.. the prebuilt couldnt be imported

#

i wonder why

keen marsh
#

make sure you use the right prebuilt links, the old ones were wrong and for linux. It's importing fine for me

chrome bone
#

.................... -.-

keen marsh
chrome bone
#

so this ^ is correct?

#

ill halt the build then

keen marsh
#

it should be, you can check the link and see if those win files show up for you if it's a cache issue.

chrome bone
#

can you import ipex too?

#

i get errors

keen marsh
#

try and uninstall torch vision,

#

pip show intel_extension_for_pytorch
Name: intel-extension-for-pytorch
Version: 2.0.110+gitba7f6c1
Summary: Intel® Extension for PyTorch*

chrome bone
#

how come this would interfere with ipex...?

keen marsh
#

I am thinking torchvision doesn't work on windows for intel...

chrome bone
#

i'd think all of them work for intel, but some configuration isnt right

keen marsh
#

I dunno, but when I did it, it worked for me. You can scroll up a bit to where I was troubleshooting.

#

also think lack of torchvision is why sd.next isn't loading? not sure.

chrome bone
#

it never launched successfully for me, windows or linux

#

ill just let computer do its thing while i watch some anime

keen marsh
#

I got it to launch in linux, but it wouldn't output lol

chrome bone
novel sphinx
#

Building on my pc too, my poor 11700k doin its hardest work ever lol

chrome bone
#

yeah my 5700x is at 100% load too

#

lol

#

i bought 8 cores gotta use em all

keen marsh
#

For what its worth i compiled xpu10 on wsl2 limited to 3 cores lol. Took a couple hours

#

I am going to try and run it outside the venv

novel sphinx
#

3 threads sounds awful lol. Im like 80% done at 13 mins in

keen marsh
#

lol, it was the only way to get it to finish. it kept running out of resources.

novel sphinx
#

That's rough

keen marsh
#

well got it to load outside the virtual environment, forced me to fp32 but just errors out again.

#

think it doesn't see the gpu

novel sphinx
#

I got sdxl to work in wsl with the new ipex using sequential cpu offload, was only using like 1.5gb vram, it was pretty slow tho like 2s/it vs 1.8 it/s i was getting before without sequential cpu offload. Using the model shuffle it starts generating but ultimately crashes and never finishes. Could be related to using the refiner i need to test without

keen marsh
#

I'm just trying basic 1.5 right now.

novel sphinx
#

That is if i can get it to load without the weird fp32 thing that seems to be a hit or miss. Sometimes it all loads sometimes it doesnt. 1.5 worked for me no issues tho

#

I think its somewhat slower than before however like marginally

#

I was getting over 8it/s previously seems to be closer to 7 now

#

For sd1.5 512x512

keen marsh
#

maybe try a different optimization method

novel sphinx
#

Sub quadratic seems to be the best still

keen marsh
#

I was hoping sdp would work.

#

Well i'm going to sleep, will maybe try compiling from source tomorrow.

chrome bone
#

finally!!!

#

now i need to find a way to make sdwebui use it

proper cradle
chrome bone
chrome bone
#

so i used --debug flag to install packages and manually replaced torch. been stuck here for a while. the package loading time takes forever. probably a loop somewhere that deters model loading(?)

tall grove
#

it loaded natively?

chrome bone
#

yes. on windows

chrome bone
#

managed to get jbaboval to launch, it is slow as hell too lol i wonder what went wrong

chrome bone
#

it takes forever to do these operations.. i wonder what exactly the problem is

#

removing ipex.optimize reduces loading time by 33 seconds, for both embeddings and ckpt

proper cradle
#

90 seconds is still too long :/

tall grove
#

Lol what is it doing

chrome bone
#

it is not just loading time, the time it takes to initialize an inference is also veeeeeeery long and i went out of patience before the first iteration is finished

tall grove
#

Make an issue on the GitHub?

chrome bone
#

there must be something wrong in the process, im finding out what that is

chrome bone
#

ive pinned down the problem to this

#

getting expected shape takes 90 seconds... while id like to go further its bed time now

grave condor
#

that's a hack. You embed a small string and then look at its size instead of looking at the config or whatever. If this is really slow, then actually embedding something will be way slower.

#

Perhaps they do some caching/compiling upon first inference? Not too sure

proper cradle
#

Only custom AOT IPEX builds don't do that.

coral mulch
#

Outcome is the same.

keen marsh
#

also, how do you get the latest python 3.10 in windows? Conda is 3.10.12 but my version is 3.10.6 so maybe that's the conflict.

#

well, it works now on new fork, so nevermind lol.

#

well it defaulted to fp32 and cpu though I think

#

about 10s/it using cpu lol

tall grove
#

speed

keen marsh
#

@proper cradle i think the word torchvision in the install for windows isn't supposed to be there, pulls an error don't think torch vision is part of the windows ipex wheels yet

proper cradle
#

But we need it

keen marsh
#

I mean its just pulling error and not downloading anything

#

Anybody have luck with compiling from source? I wonder if they just uploaded the cpu version

#

I can get it to run in Windows but only at full precision and using cpu. System shows torch.cpu autocast or something like that

proper cradle
#

what does torch.xpu.is_available() return?
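
A quick way to answer that, and to see why the UI might be falling back to fp32 on CPU, is to probe for the device explicitly. A minimal sketch, assuming IPEX makes `torch.xpu` usable once it is imported; the helper name `pick_device` and the fp16-on-XPU / fp32-on-CPU pairing are illustrative assumptions, not SD.Next's actual logic:

```python
def pick_device():
    """Return ("xpu", "float16") when an XPU is usable, else ("cpu", "float32")."""
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (imported for its side effects)
    except (ImportError, OSError):
        # Missing packages or a DLL load failure both end up on the CPU path.
        return "cpu", "float32"

    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu", "float16"
    return "cpu", "float32"


if __name__ == "__main__":
    device, dtype = pick_device()
    print(f"device={device} dtype={dtype}")
```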

coral mulch
novel sphinx
#

hmm compiling from source and using those packages doesnt seem to change anything

proper cradle
novel sphinx
proper cradle
#

Can you run this manually?

pip install --upgrade torch==2.0.0a0 torchvision intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
#

Cuda version of torchvision is working fine on my end

novel sphinx
#

didnt throw any errors just said i already had everything installed

#

OSError: [WinError 127] The specified procedure could not be found. Error loading
"C:\Users\KingOfMemes\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of
its dependencies.

#

still throws this error trying to run webui.bat though

coral mulch
#

^ That is the error that I have as well, and aaron also has it.

novel sphinx
#

happened after manually replacing the files with the ones compiled from source as well

proper cradle
#

And IPEX worked fine with normal torch 2.0.1

novel sphinx
#

works fine on wsl

keen marsh
coral mulch
#

That's why the normal pytorch distribution doesn't work

keen marsh
#

torch needs a patch to work with ipex iirc

novel sphinx
#

Note: The patched PyTorch 2.0.1a0 is required to work with Intel® Extension for PyTorch* on Intel® graphics card for now.

keen marsh
#

and they uploaded 2.0.0 AlchemistHmm

tall grove
#

intel confuse

novel sphinx
#

also torchvision can be removed from the installer line on windows, it installs torchvision anyway without it and including the line throws errors for whatever reason, prob because its trying to pull it from the intel dev page and torchvision doesnt exist on there for windows yet

tall grove
#

Tbh the windows release does say experimental

novel sphinx
#

Once they get torch to 2.0.1a0 on the Intel dev page i think itll work fine so theres that

keen marsh
#

link me to those notes I will make a thread I guess

novel sphinx
#

Its noted on the ipex github

keen marsh
#

However, the source should compile the right one; if it doesn't, they may not even have it. If that's the case, how is any of it experimental lol

tall grove
#

Is there an issue for this on github?

keen marsh
#

Not yet

proper cradle
#

Can you run this?

pip install --upgrade torch==2.0.0a0 torchvision==0.15.2 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
keen marsh
#

ERROR: Cannot install torch==2.0.0a0 and torchvision==0.15.2 because these package versions have conflicting dependencies.

proper cradle
#
pip install --upgrade torch==2.0.0a0 torchvision==0.15.1 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
keen marsh
#

made a support thread

keen marsh
#

I've noted it doesn't seem to exist on the support thread

proper cradle
keen marsh
#

so hopefully they respond sometime in the next week

#

let me try outside of the conda environment

#

ERROR: Cannot install torch==2.0.0a0 and torchvision==0.15.1 because these package versions have conflicting dependencies.

novel sphinx
#

The conflict is caused by:
The user requested torch==2.0.0a0
torchvision 0.15.2 depends on torch==2.0.1

#

what my error reads

proper cradle
#

That's why it works on Linux then

novel sphinx
#

yeah i guess for windows we wait for torch 2.0.1 from intel

keen marsh
#

😂 😭

proper cradle
#
pip install --upgrade torch==2.0.0a0 torchvision==0.15.0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
novel sphinx
#

seems promising though

#

The conflict is caused by:
The user requested torch==2.0.0a0
torchvision 0.15.0 depends on torch==2.0.0

#

it has to be from intel because intel releases patched torch versions to work with ipex

#

so we need torchvision and torch from intel devs for windows
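
The reason none of the torchvision pins resolve is that `==` pins compare the public version exactly: the `+git...` local label is ignored, but the `a0` pre-release tag is not, so `2.0.0a0+gitc6a572f` is neither `2.0.0` nor `2.0.1`. A simplified sketch of that rule (it only strips the local label and compares strings, which is far less than pip's real version handling; `satisfies_pin` is a made-up helper):

```python
def satisfies_pin(installed, pinned):
    """Simplified '==' check: drop the installed local label ('+...') unless
    the pin itself carries one, then require an exact string match."""
    if "+" not in pinned:
        installed = installed.split("+", 1)[0]
    return installed == pinned


if __name__ == "__main__":
    intel_torch = "2.0.0a0+gitc6a572f"
    for pin in ("2.0.0a0", "2.0.0", "2.0.1"):
        print(f"torch=={pin}: {satisfies_pin(intel_torch, pin)}")
```

Only the `2.0.0a0` pin matches, while the torchvision releases in the errors above ask for `2.0.0` or `2.0.1`.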

keen marsh
#

Guess no way to bootleg and copy the linux files to windows right? lol

proper cradle
#

Native Linux or WSL?

keen marsh
#

well, even to use it on wsl2 seems like I have to wipe my environment again sigh

novel sphinx
#

Thats weird I haven't had any issues on wsl side

keen marsh
#

Yeah, it's my environment. I think it's because it wants to install python 3.10.12 in the environment when I have 3.10.6, not sure why that is probably something I did a long time ago. Might try and run the launch.py outside it though

novel sphinx
#

i never installed or used conda for anything it isnt needed

keen marsh
#

I hacked that thing together after making millions of mistakes lol

proper cradle
keen marsh
#

yeah, I didn't even know what conda was I just followed the instructions at first

#

had to set another python to default etc etc, learned a lot though and got it working but likely causing this error

novel sphinx
#

The conflict is caused by:
The user requested torch==2.0.0a0
torchvision 0.15.1 depends on torch==2.0.0

keen marsh
#

Conda seems needed for windows, I don't know how to install that package otherwise.

#

Not sure it makes a difference, might be why I can get into Sd.next now even with just cpu

novel sphinx
#

you dont need the pkg-config or libuv

#

just ignore that whole part

keen marsh
#

Without it I got that module error I think

#

It's only installed in conda anyway

#

Guess i will try wsl2 again later

novel sphinx
#

yeah works great on wsl2 with good speeds and stable inference

tall grove
#

Still leaks memory?

keen marsh
# tall grove Still leaks memory?

They fixed that already, you can use it as long as you don't exceed your memory capacity with too high a resolution; if you do, you will still have to restart.

coral mulch
#

So far this is what I've done to set up the WSL2 version of vladmantic

#
```
wget -qO - https://repositories.intel.com/graphics/intel-graphics.key | \
  sudo gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg
echo 'deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc' | \
  sudo tee  /etc/apt/sources.list.d/intel.gpu.jammy.list
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update && sudo apt upgrade -y
```
#

sudo apt-get install intel-opencl-icd intel-level-zero-gpu level-zero intel-media-va-driver-non-free libmfx1 libgl-dev intel-oneapi-compiler-dpcpp-cpp intel-oneapi-mkl numactl python3-pip python3-venv git unzip liblzma-dev

#

git clone the automatic repo

#

run it with --use-ipex

#

And then modify the webui settings based on the SDXL Vladmantic wiki, installing the fp16 vae fix

#

That's it

tall grove
#

Native?

coral mulch
#

No.

#

WSL2 only

#

On windows, we still have DLL issues

#

For some reason regarding a minor pytorch version difference

keen marsh
#

Native needs updates from the intel team

tall grove
#

When native intel beats amd lol

keen marsh
#

Also, anybody compiled from source? Does source also compile wrong pytorch?

tall grove
#

Think there was issues with it

keen marsh
#

If so it probably just wont work

tall grove
#

Well they should reply in couple of days so hopefully its just a mistake

#

Seems like they done minimal testing for this

novel sphinx
#

Assuming the compiled version of torch is the same as in the wheel, I got the same dll error regardless. According to their docs the gpu needs 2.0.1a0 rather than 2.0.0a0, so hopefully that gets updated soon and then all should be good. The wsl environment is still good tho. I'd be curious whether native windows would have any speed benefit compared to wsl; if not, native windows is cool but wouldn't really offer anything over the wsl version, as it's not really any easier to set up

keen marsh
#

trying to compile from source now, realized I wasn't running the one api environment before and that's why it never did anything

chrome bone
#

good morning everyone!

#

for anyone whos as confused as me, i understand your pain

#

this is what i did so far to get it working on windows

#
  • install intel dpcpp compiler 2023.2.0 + mkl 2023.2.0 + compiler classic 2023.2.1
  • uninstall all existing python version
  • install latest miniconda
  • check all 4 boxes during installation
  • run in cmd:
    {oneAPI ROOT}\setvars.bat
    {oneAPI ROOT}\mkl\2023.2.0\env\vars.bat
    {oneAPI ROOT}\compiler\2023.2.0\env\vars.bat
    conda install pkg-config libuv
    python -m pip install torch==2.0.0a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
#

after that you can just import torch and ipex and use xpu as device to calculate on your gpu

#

note: some packages in webui will update dependencies and cause dll errors, i cannot tell which package that is. not a problem for sd.next though, but you need to use debug flag to install before using --use-ipex

#

apparently theres also some bug there, which i hope to figure out today

#

and compile from source did not work for me, i think i got sth missing

keen marsh
#

will give it a shot, compile ended with the same error I was getting in the webui lol

chrome bone
#

did you compile successfully?

#

go to ipex folder to see if there is dist folder

#

a whl should be there if you compile successfully

grave condor
#

I only found dist folders in the torch dir, not vision or ipex

#

And I didn't catch any errors as a lot was output to console (and it recursively cloned like 170 repositories)

#

so compiling torch from source works (perhaps check which exact version you end up with).

#

since the pip command and the prebuilt wheels have torch2.0.0a+git while it said that you need torch2.0.1a, that could be the issue

#

There is a new GitHub issue (not mine this time), but it's also the weekend. So perhaps we won't see any fixes before monday.

#

unless #software message you know who to email

restive parcel
#

I was so close to getting the linux native finally running, and then I somehow managed to uninstall bash and have thus ruined a third linux install DinaKEK

chrome bone
#

torch 2.0.0a works fine, not that it is bug free but it can launch

grave condor
#

I am home in maybe 30 minutes. Maybe I sit down to try around myself for a bit

keen marsh
# chrome bone did you compile successfully?

I will check but i got this error
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\Users\mymin\AppData\Local\Programs\Python\Python310\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependencies.

chrome bone
#

yes i can

#

mind if you check which torch/ipex version you have? the intel prebuilt should work

keen marsh
#

nice, I am going to give that config a shot. It's weird that you need to uninstall python to use conda, it's supposed to be separate and run virtual env's.

chrome bone
#

i agree

#

maybe it will work if you dont use conda, but i dont want to add more headaches

keen marsh
#

i've tried both, didn't work. But i never uninstalled python

#

well it worked, but on cpu

chrome bone
#

thats not very helpful... cpu works with unpatched torch too lol

keen marsh
chrome bone
#

thats fine too

#

torchvision is not necessary i think

keen marsh
#

okay, now webui says "no python"

chrome bone
#

did you check all 4 boxes when installing conda?

keen marsh
#

yes, do I need to run all var.bat files each time?

coral mulch
#

Well I seem to be

#

in a predicament

chrome bone
#

yes. but you can make another bat to automate that for you (perhaps integrate to webui too)
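
The wrapper idea could look roughly like this from Python instead of a bat. A sketch only: the oneAPI root below is a placeholder, the `launch.py --use-ipex` invocation is the one mentioned in this thread, and `build_launch_command` is a made-up helper that just builds the cmd line rather than running it:

```python
from pathlib import PureWindowsPath


def build_launch_command(oneapi_root, app_args):
    """Chain setvars.bat and the app in one cmd line so both share an environment."""
    setvars = PureWindowsPath(oneapi_root) / "setvars.bat"
    # Quotes keep a path with spaces (e.g. under Program Files) in one piece.
    return f'call "{setvars}" && python {" ".join(app_args)}'


if __name__ == "__main__":
    # Placeholder root; adjust to wherever oneAPI is installed.
    cmd = build_launch_command(r"C:\oneAPI", ["launch.py", "--use-ipex"])
    print(cmd)
    # On Windows this string could be handed to subprocess.run(cmd, shell=True).
```

Chaining with `&&` matters here: the variables set by the bat are only visible to commands that run in that same cmd session.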

coral mulch
#

RuntimeError: Native API failed. Native API returns: -1 (PI_ERROR_DEVICE_NOT_FOUND) -1 (PI_ERROR_DEVICE_NOT_FOUND)

#

Every time I run an image gen, my entire PC blackscreens

#

Discord closes.

#

On WSL2*

keen marsh
coral mulch
#

│ /home/dan9070/automatic/modules/call_queue.py:34 in f

│ 33 │ │ │ try:
│ ❱ 34 │ │ │ │ res = func(*args, **kwargs)
│ 35 │ │ │ │ progress.record_results(id_task, res)
│ │
│ /home/dan9070/automatic/modules/txt2img.py:64 in txt2img
│ │
│ 63 │ if processed is None:
│ ❱ 64 │ │ processed = processing.process_images(p)
│ 65 │ p.close()
│ │
│ ... 6 frames hidden ...
│ │
│ /home/dan9070/automatic/modules/sd_hijack_utils.py:17 in <lambda>

│ 16 │ │ │ orig_func = getattr(resolved_obj, func_path[-1]) │
│ ❱ 17 │ │ │ setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **k
│ 18 │ │ self.init(orig_func, sub_func, cond_func)

│ /home/dan9070/automatic/modules/sd_hijack_utils.py:28 in call

│ 27 │ │ else:
│ ❱ 28 │ │ │ return self.__orig_func(*args, **kwargs)
│ 29

keen marsh
chrome bone
#

webui.bat

keen marsh
chrome bone
#

oh yeah i used debug flag first

#

using --use-ipex seems unable to find python. too bad i know nothing about powershell script

#

fwiw debug flag is how you set up sd.next for nv/amd cards

keen marsh
#

So wait, are you using directml then?

chrome bone
#

but it will download unpatched torch in venv so i had to manually replace em

keen marsh
#

Okay, trying launch.py right now. It hooks into the default python and doesn't need the venv

chrome bone
#

that works too, i guess, but i never tried

keen marsh
#

if you want to run it then "python launch.py --use-ipex" is how to launch.

chrome bone
#

this is what i did:

  • set up with debug flag
  • remove unpatched torch and add patched torch and ipex
  • run the bat with --use-ipex flag
keen marsh
#

I may give it a shot if this doesn't work. I think conda isn't reading as default python even after being added to path. it's looking for python folder

tall grove
#

oh damn is it close to working

keen marsh
#

not for me lol "OSError: [WinError 127] The specified procedure could not be found. Error loading
"C:\Users\mymin\miniconda3\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its
dependencies." 😦

restive parcel
#

especially with discord/other webapps

#

youtube/google calendar/discord do not like arc

keen marsh
#

i guess that messed something up with the extension, it's no longer found in python

#

😭

chrome bone
#

yeah try not to skip venv

#

because you dont know what messed up your "working" ipex

keen marsh
#

Yeah, gonna start over....again

#

You just copied the miniconda python folder over?

chrome bone
#

just patched torch and ipex lib

keen marsh
#

How do you do that in the venv?

chrome bone
#

after setting up using debug flag

#

go to automatic\venv\Lib\site-packages and delete unpatched torch (2 folders in total)

#

then add patched torch and ipex (4 folders in total)

#

your patched torch and ipex can be found in your conda directory

#

thats not very elegant but it works
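
The manual swap above could be scripted along these lines. A sketch under assumptions: the thread does not name the folders, so this guesses that each package is a folder plus a `-<version>.dist-info` sibling, and `swap_packages` and both site-packages paths are placeholders to adapt:

```python
import shutil
from pathlib import Path


def swap_packages(src_site, dst_site, names=("torch", "intel_extension_for_pytorch")):
    """Replace each named package folder (and its dist-info) in dst with src's copy."""
    src_site, dst_site = Path(src_site), Path(dst_site)

    def owned(folder, name):
        # Exact match or "<name>-<version>.dist-info"; skips torchvision, torchsde, ...
        return folder.name == name or (
            folder.name.startswith(name + "-") and folder.name.endswith(".dist-info")
        )

    copied = []
    for name in names:
        for old in [d for d in dst_site.iterdir() if d.is_dir() and owned(d, name)]:
            shutil.rmtree(old)  # drop the unpatched copy
        for new in [d for d in src_site.iterdir() if d.is_dir() and owned(d, name)]:
            shutil.copytree(new, dst_site / new.name)
            copied.append(new.name)
    return sorted(copied)
```

Here `src_site` would be the conda site-packages holding the patched builds and `dst_site` the `automatic\venv\Lib\site-packages` folder; it deletes folders, so try it on a copy of the venv first.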

restive parcel
keen marsh
#

you run the debug outside of oneapi environment? I still get python error

chrome bone
#

inside

#

strange.. lemme try again

#

i think i know what the problem is now

#

i mean why so slow

keen marsh
#

I wonder if I set python to the miniconda in the webui-user.bat file I copied from automatic1111?

chrome bone
#

there is no need to do this

restive parcel
#

when you create a venv, doesn't it copy your python version anyway?

chrome bone
#

the moment you checked 4 boxes you call the python that comes with conda

#

like this

keen marsh
chrome bone
#

now lemme download a fresh copy..

restive parcel
#

well, for this, I am glad linux has pyenv. Now if only my bash didn't magically disappear after cloning sdnext...

keen marsh
#

yeah, i get the python error each time. maybe if I replace the files first?

chrome bone
keen marsh
#

' H:\Stable diffusion\automatic>call webui.bat --debug
Couldn't launch python

exit code: 103

stderr:
No Python at '"C:\Users\mymin\AppData\Local\Programs\Python\Python310\python.exe'

Launch unsuccessful. Exiting.
Press any key to continue . . . '

chrome bone
#

can you do a screen cap in cmd? just run python

tall grove
#

isnt that just saying you dont have python lol

keen marsh
chrome bone
#

hmm..

keen marsh
chrome bone
#

that is interesting

tall grove
#

its trying to use normal python

#

"AppData\Local\Programs\Python\Python310\python.exe" does this exist?

chrome bone
#

it should point to conda python..

keen marsh
#

same error

grave condor
#

experiencing the same thing where the dll throws an OS error

#

but I did manage to install it via the wheels and it showed up in my pip list

keen marsh
#

I think i figured it out, I installed by user and not system.

#

one more time...

grave condor
#

I believe it's a path/vars thing according to the WinErr 126, so maybe having the conda env really helps

#

perhaps it's the stupid windows file path with spaces in them that messes it all up once more

chrome bone
#

i installed conda by user, though system might help? (i did it once and it may have changed some config which otherwise wouldnt have)

keen marsh
#

well, now it won't see conda when I install it like that... 😦

chrome bone
#

i found out why it is so slow now

#

whoever wrote this note knows it's an unfinished product; the problem is slow start time

restive parcel
chrome bone
#

now i needa find a way to not trigger this thing

proper cradle
#

I can hijack the torch.asarray function

chrome bone
#

do you have windows lol. i dont think reshaping helps

chrome bone
#

ill keep finding what the problem is then.. i tried to understand what ids are (i think they are one hot vectors just vocab id) because printing ids return nothing. im not too sure whats going on

#

fwiw this is the line that executes so slow

keen marsh
#

okay, think i figured it out. You have to start everything within a conda environment.

#

well back where I started lol OSError: [WinError 127] The specified procedure could not be found. Error loading "H:\Stable diffusion\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependencies.

#

let me try and copy the files

#

man, it still just wants to use cpu.

chrome bone
#

did you start with --use-ipex after replacing libs?

keen marsh
#

yes, but it looks like it replaced the files. let me try again

restive parcel
#

@proper cradle do you know how to actually view Arc gpu status on linux? I want to keep an eye on memory and such, but can't figure out what actually works with arc

proper cradle
#

xpu-smi
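
For keeping an eye on memory from a script, something like the sketch below might do. It assumes the `xpu-smi stats -d <id>` subcommand is available in your xpu-smi build; the helper names `xpu_smi_stats_cmd` and `read_xpu_stats` are made up for illustration, and the output is returned as raw text rather than parsed.

```python
import shutil
import subprocess
from typing import List, Optional


def xpu_smi_stats_cmd(device_id: int = 0) -> List[str]:
    """Build the argv for a one-shot stats query on one device."""
    return ["xpu-smi", "stats", "-d", str(device_id)]


def read_xpu_stats(device_id: int = 0) -> Optional[str]:
    """Return xpu-smi's text output, or None if the tool is not on PATH."""
    if shutil.which("xpu-smi") is None:
        return None
    result = subprocess.run(
        xpu_smi_stats_cmd(device_id),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit still returns whatever was printed
    )
    return result.stdout


if __name__ == "__main__":
    print(read_xpu_stats(0) or "xpu-smi not found on PATH")
```

Running it in a loop with a sleep would give a crude memory monitor while a generation is in progress.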

restive parcel
#

thank

keen marsh
#

okay, finally seems to read my gpu but it's just hung at starting...AlchemistSad

#

got this again now... lol: RuntimeError: expected scalar type Float but found Half. Think this is what I got in wsl2?

chrome bone
#

Huh… which line is throwing this error

keen marsh
#

going to try debug now

restive parcel
#

for some reason, my server is just trying to render something when it starts up annieDerp

keen marsh
#

I think my error was from restarting the ui instead of shutting down, not sure yet. it's just hanging on the first generation

restive parcel
#

ohey, I think i'm finally getting an image

#

although i guess it can still black out...

keen marsh
#

check in the system and see if it's using cpu

restive parcel
#

~3s/it, 1024^2 SDXL + refiner

#

I don't have the gpu memory monitor set up yet so I'm not sure what its on

#

although, I think the debug stats number here means 4.9 gb used?

#

linoox of course, but it took me enough days to get it working DinaKEK

#

i get pi out of memory error well before that number reaches 15.91 though so idk

keen marsh
#

nope still get an error. format got weird when moving to second screen lol

coral mulch
#

or maybe

#

I am wrong

#

🤷‍♂️

keen marsh
#

windows is working for RC, I ALMOST have it working lol

#

I am just trying to get sd1.5 to work before I even think about sdxl

chrome bone
#

that error is an easy fix

#

you just convert it to fp32

#

float usually refers to fp32

#

half = fp16

coral mulch
#

When using sequential offloading on WSL2 vladmandic, I cannot use loras. Images are black.

keen marsh
#

I tried in linux and it didn't work, also fp32 is about half as fast

restive parcel
# keen marsh windows?

naw, linux native. I killed it a lot of times so it was actually much harder than WSL for some reason

keen marsh
coral mulch
#

Otherwise his it/s don't match up

#

If I try to run it without sequential on WSL2, it blackscreens. This only ever happened since the IPEX update.

restive parcel
#

I turned on all of the optimizations because I was getting out of resource error, i'll see which ones i can turn back off

coral mulch
#

Which makes me very sad.

keen marsh
#

Yeah, use the other one. But you need to run that - sed sequence disty posted in here a while back. and I think you need the vae hack

#

although that was all in torch1.13

coral mulch
#

the command was for cpu offload

#

not sequential offload

keen marsh
#

yes, use that for faster speed the other just gave black images for me

#

seems like this takes 5 minutes just to start trying to make an image...

#

trying fp32 now no half etc

keen marsh
#

5minutes later.....expected scalar type Float but found Half

chrome bone
#

but that shouldnt be necessary.. i didnt have to modify anything

keen marsh
#

okay only thing I can think of is to update my drivers to latest beta.

chrome bone
#

my problem is it is sloooow

keen marsh
#

I was getting the same error on wsl2,

#

yeah, it's taking me 5 minutes the first time just to get an error

#

after that I can get an error quick though lol

#

in wsl2 the error was quick

chrome bone
#

oh so if i waited long enough id get that error?

#

nice

#

good to know

keen marsh
#

if you haven't gotten an image yet, wait about 5 minutes and see if it outputs an image or error

chrome bone
#

but even if theres no error, being slow requires a fix

keen marsh
#

I don't get any indication it's doing anything except the starting button in the ui

chrome bone
#

i aint waiting all day to see my waifus

keen marsh
#

lmao

#

gonna try the drivers

novel sphinx
#

Try enabling vae tiling, with vae tiling, vae slicing and attention slicing i can generate sdxl on wsl2 without using sequential at around 1.7it/s

#

Vae tiling in my experience is just enough to avoid the out of resource error which crashes the gpu driver forcing the black screen

coral mulch
#

@restive parcel This new version of ipex

#

It all works for me now

#

keep device precision type to fp16 but vae and diffusers to fp32

keen marsh
#

I think you also need the patched vae as well, then set it to fp32 or rename it

novel sphinx
#

Yeah, use the patched vae rename. I just left vae and diffusers set to default

keen marsh
#

I wonder if this will take me as long as it took me to figure out the webui in wsl2 when I first started, think it took a month lol

keen marsh
#

I think the patch is more for speed, the original will work but take more vram and take longer i think

novel sphinx
#

Well youre using sdxl that already has the patched vae baked in

coral mulch
chrome bone
#

it requires a separate download afaik

keen marsh
#

5 minutes and no error yet, debug says Sampler: Euler a {'uses_ensd': True} and it's still "starting"

coral mulch
#

Btw @keen marsh with it on or off I get the same speed

chrome bone
#

2 minutes in im still waiting it to throw an error

coral mulch
#

Take off the LORA and I get 1.78IT/S

keen marsh
#

I only used the one without the embedded vae, but I only have 8gb of vram

coral mulch
#

It seems the vae fix on the new IPEX is no longer required?

keen marsh
chrome bone
#

ill do a screen cap and report

keen marsh
#

I am also on fp32 now so it may take me even longer to error out lol

restive parcel
#

weird, i'm still getting average 3s/it

chrome bone
#

as soon as i switch device to cpu the inference time is sped up to 0 seconds

#

originally took 90s on xpu

keen marsh
#

try the settings disty posted in this thread

coral mulch
keen marsh
#

getting the it/s message 🙂

#

it's not moving though lol, let me set another 5 minute timer

coral mulch
chrome bone
#

5 minutes in it still doesnt give me the error i deserve

keen marsh
#

i GOT AN IMAGE IN WINDOWS ON THE GPU

restive parcel
#

ayo????

coral mulch
coral mulch
chrome bone
#

but... what did it cost?

restive parcel
#

this is our LX-99

keen marsh
#

it actually didn't take 6s/it when it started though, it was actually lightning fast, probably 6it/s

chrome bone
#

nice

keen marsh
#

well maybe 3 given it's fp32

chrome bone
#

so something in the initial stage is slowing the process

keen marsh
#

lets see if second generation takes as long

chrome bone
#

i wonder what that is

#

raise an issue on gh. you are my hero now

keen marsh
#

yup, 1.7it/s. Goes right away after 10 minute load lmao

#

problem is I don't know what causes the slow down? I guess it was the thing you found?

chrome bone
#

i hv zero idea whats happening 🤷‍♂️

keen marsh
#

also, this is 512x512 sd1.5 revanimated I think

chrome bone
#

10 mins in still does nothing

#

i give up, probably should put my time to better use

keen marsh
#

are you getting messages in the debug

chrome bone
#

nope, im using jbaboval

#

finally!

keen marsh
#

sd.next you can run the debug flag

#

I have a customized fork of that one too, I messed around with that for a long time lol

chrome bone
#

yeah same

keen marsh
#

swapping to bf16 didn't cause any loading. Also turning off live preview sped me up from 2.7 to 3.5 it/s

coral mulch
#

I'm gonna wait for the windows version to be fully figured out

chrome bone
keen marsh
#

not sure how sdxl will do with this.

chrome bone
#

it takes forever to initialize, but once it starts the speed is good

#

thats a full 20 mins

keen marsh
#

i wonder if it will be like that every startup, or a one time deal? I wish I could see what was happening

chrome bone
#

pretty sure its every startup

#

you can switch model and try

keen marsh
#

yeah, looks like swapping models did it lol

#

it loads fast but takes forever to start using it...

chrome bone
#

oof

#

we are making the world a better place - imagine how many SD enjoyers who will use SD.next on their arc igpus

keen marsh
#

lmao, yeah. I don't mind though, I even have reddit followers now lol

#

okay second model took about 3 minutes to start

#

it/s seem to be going up too for some reason

chrome bone
#

thats good

restive parcel
#

meanwhile I can hardly get the thing to spit out an image to me even going back to "normal" SD mode DinaKEK

#

now its complaining something about 'DiffusionSampler' not having an attribute 'sample'

keen marsh
restive parcel
#

when in doubt reboot i guess

keen marsh
#

You follow the instructions disty posted?

#

you can put all those options at the top and change them on the fly

restive parcel
#

aight, rebooting did the trick

#

yeah i changed em but didn't restart, so I guess that was it

#

still feels super slow for some reason... even regular SD at 512^2 is giving me 1.54 it/s

#

must've left some optimization on i shouldn't...

#

wait

#

full precision was turned on?

#

that might be it

#

hmmm 2 it/s...

keen marsh
#

try opt-sub optimization

#

or split if on an a770

restive parcel
#

it feels like my outputs are very wrong....

#

even without a ton of prompts, I really don't expect this from Abyss Orange Mix

keen marsh
#

what are your steps, cfg and resolution?

restive parcel
#

20 steps, 6 cfg, 512^2

keen marsh
#

make sure you change your vae as well

restive parcel
#

anythingv3 vae

#

not sure if I'm screwing up something here maybe

keen marsh
#

Try bf16

restive parcel
#

what is bf16?

keen marsh
#

I dunno, i think it's what arc uses

#

I was getting errors when I updated the UI before and fp16 wouldn't work, google search said arc didn't fully support fp16 yet and bf16 worked for me

restive parcel
#

scary kid

keen marsh
#

model compile backend set to ipex seemed to fix the slow model startup (or it just fixes itself by the third model)

keen marsh
restive parcel
#

not at all. the colors are very cold, and looks nothing like you typically get from orange mix models

keen marsh
#

same vae?

restive parcel
#

same vae

#

oh well, guess i'll try fiddling with it later

keen marsh
#

there is an option "apply color correction to match original colors" i think it may only be img2img but worth a shot

grave condor
# restive parcel what *is* bf16?

BF16 is still a 16bit number, but the bits (sign, exponent, mantissa) are allocated differently to fp16 with the idea of being more efficient for machine learning tasks. it's natively supported in most modern accelerators.

#

there is also tf32 which isn't even 32bit (it's 19 bits)

#

bf16 is similar to fp32 in range but reduced precision and half the footprint

#

you don't need to be perfectly exact for machine learning (but you do for scientific computing and simulations etc, which is why there is fp64)
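
To make the range vs precision trade-off concrete, here is a small stdlib-only sketch. It assumes plain truncation when converting to bf16 (real hardware usually rounds to nearest), and the helper names `to_fp16` and `to_bf16` are just for illustration.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round-trip through IEEE half precision (1 sign, 5 exponent, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # Python refuses to pack values beyond fp16's range; treat that as overflow to inf.
        return math.copysign(math.inf, x)


def to_bf16(x: float) -> float:
    """Keep only the top 16 bits of the float32 pattern (1 sign, 8 exponent, 7 mantissa bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (0.1, 1.0, 100000.0):
        print(value, "fp16:", to_fp16(value), "bf16:", to_bf16(value))
```

fp16 keeps more digits for small values but overflows on the large one, while bf16 stays finite there with a coarser result, which is the "fp32 range, reduced precision" point above.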

keen marsh
#

Tried it again and its back to giving me the float error. Also takes 5 minutes to start again

proper cradle
#

4096x4096 with SD 1.5

#

Time taken: 4m 39.67s |

GPU active 10408 MB reserved 14720 MB | System peak 10029 MB total 16288 MB

restive parcel
#

wow

proper cradle
#

SDXL 4096x4096 with 5MB VRAM left available

#

And it oom'd when i pressed the send button

#

SDXL 4096x4096:

Time taken: 3m 56.45s |

GPU active 11865 MB reserved 15978 MB | System peak 10830 MB total 16288 MB

keen marsh
#

for some reason the windows version seems to download the wrong version of torch.

proper cradle
#

It will either fail with windows ipex torch or it will download the cuda torch with it rn

keen marsh
#

I have gotten it to work by replacing the files, just once so far though.

#

have to wait 5-10 minutes before I can see if it works this time

mellow sparrow
#

Hi guys, I used the WSL method of getting vladmandic up and running and it's working with older models, but SDXL 1.0 isn't working. I just get a black screen. Any tips on getting it working?

#

I have an a770 16GB & 48GB RAM

novel sphinx
#

are you using the vae fix and what diffusers settings are you using? I recommend looking at the vladmandic github wiki for the sdxl section

#

since wsl by default allocates half your ram (24gb in your case), that should be enough, but just for safety you could make a .wslconfig and up that to 32gb for wsl. i know 24gb is the absolute minimum, as i see up to like 23gb usage, so 24gb could be cutting it close. i only have 32gb ram so i really can't allocate more than 24gb myself
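
A minimal sketch of what that .wslconfig could contain, generated from Python so the value is easy to change. The `[wsl2]` section and `memory` key are the standard WSL settings and the file normally lives at %UserProfile%\.wslconfig; the helper name `build_wslconfig` is hypothetical, and WSL needs a `wsl --shutdown` before a new limit applies.

```python
import configparser
import io


def build_wslconfig(memory_gb: int) -> str:
    """Return the text of a .wslconfig that caps WSL2 memory at memory_gb."""
    cfg = configparser.ConfigParser()
    cfg["wsl2"] = {"memory": f"{memory_gb}GB"}
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # Print instead of writing, so nothing on disk is touched by accident.
    print(build_wslconfig(32))
```

Copy the printed text into the file by hand, or write it out yourself once you are happy with the number.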

keen marsh
#

so today windows seems to only want to use full precision on the model, yesterday I got half precision working after subsequent runs.

keen marsh
mellow sparrow
#

Thanks Nick, am I supposed to use the Fixed vae along with the fixed checkpoint?

novel sphinx
#

No, just use the standard 1.0 checkpoint with the fixed vae, I've used the fixed checkpoint but i have found better stability with just the normal checkpoint and separate fixed vae

mellow sparrow
#

It started working for me but it's extremely haphazard in terms of the results. When prompting for a photo of a grizzled old man, it returns an illustration or cartoon a lot of the time. Not sure if I have it set up correctly. Adding negative prompts

#

for cartoon, illustration, monochrome doesnt seem to work at least not fully

#

also only 512x512 is working for me. Thanks for the wsl tip. I thought it was just openly sharing memory between host and guest os but it looks like by default it was capped to 16gb. Now i'm seeing 22gb in use...

keen marsh
#

compiled torchvision pulls an error when loading, definitely doesn't fix the start up speed problem in native windows

keen marsh
#

okay, in wsl2 it seems negative embeddings cause float errors in the latest ipex.

#

Can anybody else use negative embeddings with fp16 or bf16? Seems they need full precision now

keen marsh
#

While it seems ipex 2.0 is good for sdxl, it seems like it's broken on sd1.5. Seems autocast has issues with half precision. this is in wsl2 btw.

chrome bone
chrome bone
#

ipex 2.0 works fine in wsl

proper cradle
#

Also if you are on FP32, set precision to FP16 and do a full restart

keen marsh
#

It will run, but any textual inversions pull the scalar float error unless at full precision, which eats up my ram and vram

proper cradle
#

Did you git pull to the latest version?

keen marsh
#

I may just wipe the distro and just do it all over from scratch to see if something got messed up somewhere

#

yeah

#

did a git pull and reinstall to make sure

proper cradle
#

Replicated with BF16

#

Autocast is getting worse with each update :/

#

1.13.110 autocast was worse than 1.13.10
And now 2.0.1 autocast is worse than 1.13.110

keen marsh
#

Yeah i remember hearing that before. I think i started in bf16 as well, maybe the settings didn't change when i tried fp16

#

tried importing the torch vision from wsl2 to native windows, seems it does the same slow loading lol

keen marsh
#

Nice! I will check it out

keen marsh
#

first sdxl 1024x1024 produced in native windows. sequential cpu offload, 2s/it. going to try model cpu offload but not sure if I can get it to work

#

not sure if those negative prompts are working lol

grave condor
#

and it seems like doing call doesn't really activate it, you can just launch the batch file

#

so essentially just run "%ONEAPI_ROOT%setvars.bat"
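
Since spaces in the install path keep coming up, here is a tiny sketch for checking where setvars.bat ends up and whether it needs quoting. It assumes %ONEAPI_ROOT% is set by the base kit installer; whether the value ends in a backslash on a given install is not something this thread confirms, so the join handles both. The helper names are made up.

```python
import ntpath
import os


def setvars_path(oneapi_root: str) -> str:
    """Join the oneAPI root and setvars.bat using Windows path rules."""
    return ntpath.join(oneapi_root, "setvars.bat")


def has_whitespace(path: str) -> bool:
    """True if the path would need quoting in cmd."""
    return any(ch.isspace() for ch in path)


if __name__ == "__main__":
    root = os.environ.get("ONEAPI_ROOT")
    if root is None:
        print("ONEAPI_ROOT is not set; is the base kit installed?")
    else:
        script = setvars_path(root)
        note = " (contains spaces, quote it)" if has_whitespace(script) else ""
        print(f"{script}{note}")
```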

#

but now I get WinError 127 instead of 126, so some steps forward

keen marsh
#

also 15min+ to start the first generation on sdxl vs about 5-8min on sd 1.5

grave condor
#

I will try a conda env as they suggest on Monday. perhaps I also change the installation directory so it doesn't break anything with those whitespaces.

#

I wanted to move to Python 3.11 anyway but my workstation will only come together in October,

keen marsh
#

@proper cradle this seems to work in windows 'torch==2.0.0a0 torchvision==0.15.2a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu' . Might need someone else to check it as well, but seems to install the proper torch.

keen marsh
chrome bone
#

yeah i saw that too

#

i wonder if theyll upload a new prebuilt

keen marsh
#

hopefully, I remember setting that when building from source in wsl2

chrome bone
#

honestly i cannot wait xD might as well try again following another guide

keen marsh
#

I also fixed the torchvision error I was getting in conda, but it didn't make a difference and likely doesn't matter

chrome bone
#

yep. as said, its mostly for image augmentation and stuff that are training related

keen marsh
#

I see the setting for AOT in the bat file, but the code is different from the sh file for linux and I don't know what i'm looking at lol

#

yeah, if you want it was basically import conda libjpeg and libpng before compiling from source, probably because the wheels aren't for windows.

proper cradle
chrome bone
#

i expect them to upload a recompiled version because just now my build fails again

proper cradle
#

I mean the torchvision doesn't throw error?

chrome bone
#

i did not have any issue with that

#

but that could be because i hvnt been using the training functions. on wsl my torchvision was never imported successfully but i had no issue generating pictures

proper cradle
#

Training doesn't work on Linux either

#

ESRGAN upscaler uses torchvision

#

And webui.py itself tries to import torchvision

chrome bone
#

well i never use ESRGAN upscaler so id never know, tried it back in march and it failed miserably

proper cradle
#

Anyway if it doesn't break the pip install then it's fine for now.

chrome bone
#

yep, ill post here if intel decided to upload a working ipex

keen marsh
#

I have tried to compile overnight with aot, not sure if I did it right but will report if it works later on

keen marsh
#

use_aot_devlist pulls an error, wonder if they have the windows version listed. I see options for IGPU in windows

#

might just end up stuck waiting 10-15 minutes before starting

chrome bone
#

are you using their bat?

keen marsh
#

I used the one vipitis posted

#

I've successfully compiled with it before, failed once too though lol

#

this time it goes all the way through, then fails at the aot-devlist option

chrome bone
#

lemme see..

#

you need to specify AOT device too otherwise it wont be too useful

keen marsh
#

I used it like you are supposed to on linux "compile.bat root root "aot"

chrome bone
#

you can omit the quotation mark

keen marsh
#

USE_AOT_DEVLIST='ats-m150'

#

I just put that, I was just using shorthand to explain

#

that's supposedly arc, but maybe its only linux

chrome bone
#

it goes like ___.bat ___.var ___.var ats-m150

#

i needa find his script, not sure what he changed

#

does your oneapi root contain white space?

keen marsh
#

yes, it's in programs.

#

Why they put white spaces in the default programs folder, I dunno lol

#

I have compiled before, honestly I'm not sure which folder is the dpcpp root, i forget which worked. I tried "C:\Program Files (x86)\Intel\oneAPI\dpcpp-ct\latest" this time, last time I tried "C:\Program Files (x86)\Intel\oneAPI\compiler\latest" but ipex was busted (this was without AOT)

#

So i guess that could also be it, I wish I didn't have to compile the entire thing each time

chrome bone
#

i simply changed it to c:\oneapi, compilation still fails nonetheless

open sundial
#

Hey ho! Was just hoping someone might know something about StableSwarm and potentially using Intel Arc GPU's that you have as a secondary device for simultaneous batch generations across different GPU brands?

chrome bone
#

do you mind uploading his script again?

keen marsh
#

Try the file Vipitis posted

#

it's on here still

#

#1127742927347666964 message

chrome bone
#

i found it

keen marsh
#

which one is dpcpp root folder? lol

#

I don't want to mess up because of that, it takes forever

#

They tell you where it is in the linux instructions lol

chrome bone
#

yeah so to use his script you can just run compile_bundle.bat 1 2 ats-m150. he simply replaced arguments with %ONEAPI_ROOT% env variable

keen marsh
#

no need for USE_AOT_DEVLIST?

chrome bone
#

nope

#

place this bat in a folder first

keen marsh
#

"C:\Program Files (x86)\Intel\oneAPI\dpcpp-ct\latest" this where my dpcpp root is right?

chrome bone
#

does not matter, you can run echo %ONEAPI_ROOT% to see where your base kit is installed

keen marsh
#

okay, thanks

chrome bone
#

this is what intel developers shouldve done

keen marsh
#

"compile_bundle.bat" "C:\Program Files (x86)\Intel\oneAPI\dpcpp-ct\latest" "C:\Program Files (x86)\Intel\oneAPI\mkl\latest" ats-m150

chrome bone
#

this compile_bundle.bat 1 2 ats-m150. 1 2 are just placeholders, you can put whatever you want inside

keen marsh
#

that's what I got, oneapi root shows my one api folder in windows but not the dpcpp root

#

so "compile_bundle.bat 1 2 ats-m150 " should work just like that?

chrome bone
#

yep

keen marsh
#

okay wow, lol. thanks I will give it a shot

chrome bone
#

good luck, wish ill see a working ipex when i wake up next morning

keen marsh
#

If I get a working wheel I can try and upload it someplace

#

also, in case anybody missed it, the whl files are in the dist folders

#

took me a minute to find them lol

grave condor
#

also I seem to have found a solution for myself that doesn't require any conda env. Intel has a oneAPI env setup extension for VSCode

keen marsh
#

So far so good, torch has compiled and installed, no errors yet with ipex

tall grove
keen marsh
#

been stuck here for a while now '[1047/1049] Linking CXX shared library csrc\gpu\intel-ext-pt-gpu.dll'

grave condor
#

it's always this dll

#

But it's like a 600mb binary file so it does a lot

keen marsh
#

it's been stuck for over an hour, should I just cancel?

grave condor
#

is your CPU showing any utilization?

keen marsh
#

here are some warnings I got before

keen marsh
grave condor
#

press enter in the command prompt

keen marsh
#

15% ocloc.exe

#

nothing

grave condor
#

you can try to press Ctrl+c and hope it prompts you instead of aborting

keen marsh
#

no errors, but it's never taken this long

grave condor
#

And then type no

keen marsh
#

okay, well I said no, and it seems to just have finished? ran a sanity check and everything came out right it seems.

#

guess I will test it out and see what's happening lol

tall grove
#

Intel be shaking rn

keen marsh
#

nope, no whl for ipex.

#

probably just read my previous install

#

I wish there was a way to just compile ipex, everything else is fine

grave condor
#

go into the compile bundle script and change it

ember orchid
keen marsh
#

So the issue is something to do with sycl, I believe this is to convert c++, its compiling I guess but only using 15% cpu. Not sure it wont error out eventually. 🤷‍♂️

#

Its possible I am missing a necessary download somewhere, or there is a reason they didn't use aot in the wheel files.

proper cradle
#

Diffusers backend is way more stable than the original backend after i patched the attention functions.

#

Here is a SD 1.5 at 512x1024 with Batch Size 32:

#

And SDXL at 1024x1024 with Batch Size 16:

keen marsh
#

Nobody else tried to compile yet? Maybe it's my amd cpu and opencl? 😭

chrome bone
#

ive removed torch and torchvision

#

though theres no reason to think it would suddenly work if you keep doing the same thing over and over again

grave condor
#

I had it all working but then removed conda, as I wasn't using the conda env anyway. But now it doesn't work anymore. I will probably just keep conda but not use it. perhaps that is required without obviously stating it

chrome bone
#

no idea why they took all the time to develop but not compile properly for users 💀

keen marsh
#

nope, read that wrong. My bad

keen marsh
#

Okay, got it to compile! however it pulls an error now in python. will try and see if it will work anyway

#

nope! lol 😭

#

It might be because I only compiled ipex by itself...

chrome bone
#

hmm shouldnt be the case

keen marsh
#

ImportError: cannot import name 'register_backend_for_device' from 'torch._inductor.codegen.common'

#

I think torch does a patch during the compile if I am reading it right

chrome bone
#

yes but that was during compilation of torch

#

ill leave my pc compiling while i go get my pizza

#

so you have a wheel file for ipex?

#

upload it somewhere i can give it a try

keen marsh
#

I don't see an intel_extension folder in the pytorch folder so my guess is it's patching into the other folder

keen marsh
chrome bone
#

yeah it shouldnt be there

grave condor
#

arc gpus are still "cuda" not to break stuff... and now that might be breaking stuff for me -.-

proper cradle
#

use torch.xpu.device

coral mulch
#

1. What are your current settings other than what is mentioned on the wiki?

#

2. How do I use your commits unless they're already in the main branch?

grave condor
#

any success with compiling AOT? waiting 12 minutes every time I restart the kernel or run a script seems unreasonable.

The PyTorch webinar is on Wednesday https://software.seek.intel.com/techdecoded-webinars and I will ask a lot of questions.

chrome bone
#

my attempts have been futile

keen marsh
chrome bone
#

if it compiles fine it probably is working

keen marsh
#

ImportError: cannot import name 'register_backend_for_device' from 'torch._inductor.codegen.common'

#

I tried it in stable diffusion, same error

chrome bone
#

for whatever reason ninja keeps throwing ~~shuriken ~~errors at me

grave condor
#

so you seemingly can't compile ipex on its own.

#

I will try my chances tomorrow (should I get out of bed before practice), and otherwise I hope to get answers during the webinar... Or just more and more GitHub issues.

#

Seems like there is at least one guy appointed to windows native (hopefully has a team behind them)

keen marsh
#

Maybe, I did compile the other two first but ended up quitting out and then compiling ipex by itself later.

#

I posted about it on github

#

I may leave it on overnight again.

#

completely wipe all the folders and recompile I guess

grave condor
#

I mean, it works for me right now. And without doing anything it's at least twice as fast as CPU. I can work with it. But the itrex (Intel extension for transformers) repo is filled with examples and even crazier inference deployments than I was expecting. I did manage to get xpu as my accelerate accelerator, but it's not being used correctly by device_map="auto" for pipelines. so I will try a few more things tomorrow. I didn't expect device="xpu" to be working

keen marsh
#

yeah, for me it's just the first run. takes about 15 minutes on average

#

it takes up way less resources than wsl2 and I can run it native.

proper cradle
coral mulch
#

The moment I attempt to generate without model cpu offload or sequential, I get either an out of resources or device not found error.

proper cradle
#

Don't use those options if you don't have enough system ram

#

It will put the model into the system ram

coral mulch
#

How is 32gb of ram not enough

proper cradle
#

SDXL Base

#

8-16 GB with those options

coral mulch
#

I have this set in diffuser settings with the fixed vae properly named.

proper cradle
#

Did you load different models?

coral mulch
#

No, I shutdown WSL, then started up the webui. Let it load the fixed VAE and base model

proper cradle
#

IPEX doesn't clean up system memory without ipexrun

#

Also what is your fixed vae?
diffusion_model.safetensors or diffusion_mode.fp16.safetensors?

coral mulch
#

diffusion_mode.fp16.safetensors, but I had the previous rename while I had the VAE set to fp32

proper cradle
#

Loading variant doesn't have any effect other than looking for a specific filename tho

#

So you don't have to rename

coral mulch
proper cradle
#

Upcast option changes the running dtype

proper cradle
coral mulch
#

Yes.

proper cradle
#

Is this in your logs?
Launching ipexrun launch.py...

#

Cause ipexrun fixes some memory leaks

coral mulch
#

Launching ipexrun launch.py...

#

Yes.

#

But there are errors past it.

proper cradle
#

These are fine:

#

intel_extension_for_pytorch.cpu.launch.launch - INFO - Use 'auto' => 'tcmalloc' memory allocator.
Is this line the same on your end?

coral mulch
#

2023-08-08 14:49:44,058 - intel_extension_for_pytorch.cpu.launch.launch - WARNING - Neither of ['tcmalloc', 'jemalloc'] memory allocator is found in ['/home/dan9070/anaconda3/envs/IPEX/lib/', '/home/dan9070/automatic/venv/lib/',

proper cradle
#

sudo apt install libgoogle-perftools-dev

coral mulch
#

2023-08-08 14:54:17,113 - intel_extension_for_pytorch.cpu.launch.launch - INFO - Use 'auto' => 'tcmalloc' memory allocator.

#

Set.

proper cradle
#

Let's see if it will help

coral mulch
#
```
--use-ipex' returned non-zero exit status 139.
```
keen marsh
#

Wsl will take all the vram you give it and more

proper cradle
keen marsh
#

It will take all the ram too

keen marsh
#

I could run base model though, no refiner

#

At least in ipex 1.13

coral mulch
#

That is indeed what I am doing.

proper cradle
#

Removing --use-ipex will disable ipexrun if it doesn't work on WSL

coral mulch
#
```
[W Device.cpp:120] Warning: XPU Device count is zero! (function initGlobalDevicePoolState)
Segmentation fault
```
proper cradle
#

Rest are still autodetect

coral mulch
#

2023-08-08 15:00:29.766463: E itex/core/devices/gpu/itex_gpu_runtime.cc:173] Can not found any devices. To check runtime environment on your host, please run itex/tools/env_check.sh.

keen marsh
#

Oneapi at the latest?

coral mulch
#

Yes.

keen marsh
#

Ipex pull any errors outside the venv?

coral mulch
#
```
2023-08-08 15:03:16,824 - intel_extension_for_pytorch.cpu.launch.launch - WARNING - 'intel' OpenMP runtime is not found in ['/home/dan9070/anaconda3/envs/IPEX/lib/', '/home/dan9070/automatic/venv/lib/', '/home/dan9070/.local/lib/', '/usr/local/lib/', '/usr/local/lib64/', '/usr/lib/', '/usr/lib64/', '/usr/lib/x86_64-linux-gnu/'].
```
#

Why do i not have OpenMP.

keen marsh
#

Its pulling the ipex cpu?

coral mulch
keen marsh
#

Python import torch import intel_extension_for_pytorch torch.xpu.is_available()

#

Try that outside the venv

coral mulch
#

syntax error near unexpected token `(' lmao

grave condor
#

you need to use ; for newline

#

can -c

#

python -c "import torch; import intel_extension_for_pytorch; print(torch.xpu.is_available())"
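
If the one-liner gets unwieldy, the same check can go in a small script that does not blow up when a piece is missing. This is only a sketch: `xpu_report` is a made-up helper name, and it relies on nothing beyond `torch.xpu.is_available()`, `torch.xpu.device_count()` and `torch.xpu.get_device_properties()`, which are the calls already used in this thread.

```python
import importlib
import importlib.util


def xpu_report() -> dict:
    """Collect what can be imported and whether an XPU device is visible."""
    report = {"torch": False, "ipex": False, "xpu_available": False, "devices": []}
    if importlib.util.find_spec("torch") is None:
        return report
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        return report
    report["torch"] = True
    if importlib.util.find_spec("intel_extension_for_pytorch") is None:
        return report
    try:
        # A broken DLL shows up here as an OSError (the WinError 126/127 case).
        importlib.import_module("intel_extension_for_pytorch")
    except (ImportError, OSError):
        return report
    report["ipex"] = True
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        report["xpu_available"] = True
        report["devices"] = [
            str(xpu.get_device_properties(i)) for i in range(xpu.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in xpu_report().items():
        print(f"{key}: {value}")
```

Run it once outside the venv and once inside; if the two reports differ, the venv is picking up a different torch or ipex build.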

keen marsh
#

My bad typing on phone lol

proper cradle
#

sudo apt remove libgoogle-perftools-dev

#

sudo apt install libjemalloc2

keen marsh
#

alright, fully compiled from source and I get the same error. What I do notice is the git# number is different than the prebuilt wheels, maybe this is causing a conflict with the pytorch version? or maybe it's compiling the wrong ipex?

coral mulch
# proper cradle `sudo apt install libjemalloc2`

This returns 2023-08-08 15:41:59,126 - intel_extension_for_pytorch.cpu.launch.launch - WARNING - Neither of ['tcmalloc', 'jemalloc'] memory allocator is found in ['/home/dan9070/anaconda3/envs/IPEX/lib/', '/home/dan9070/automatic/venv/lib/', '/home/dan9070/.local/lib/', '/usr/local/lib/', '/usr/local/lib64/', '/usr/lib/', '/usr/lib64/', '/usr/lib/x86_64-linux-gnu/'].

#

Same as the initial

proper cradle
#

sudo apt install jemalloc-dev

coral mulch
#

2023-08-08 15:43:32,426 - intel_extension_for_pytorch.cpu.launch.launch - INFO - Use 'auto' => 'jemalloc' memory allocator.

#

It's libjemalloc-dev

#

Which worked

#

Sadly however

#

both alloc modules give the same error

#
```
--use-ipex' returned non-zero exit status 139.
```
#

I just did a brand new install, and only set default parameters.
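
A note on reading that status: shells report a process killed by signal N as exit status 128 + N, so 139 corresponds to signal 11, a segmentation fault. That points at a crash in native code rather than a Python exception. A small sketch of the decoding (the helper name is made up):

```python
import signal


def describe_exit_status(status):
    """Translate a shell-style exit status into a readable description."""
    if status > 128:
        try:
            return f"killed by {signal.Signals(status - 128).name}"
        except ValueError:
            return f"exit status {status} (no matching signal)"
    return f"exit status {status}"


print(describe_exit_status(139))
```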

keen marsh
coral mulch
#

The person in that issue post has a completely different error to mine.

keen marsh
#

okay fellas, I FINALLY figured out what was wrong. My wheels were fine; there is an added line in the `__init__.py` file of the Intel extension that imports something that doesn't exist. I got it running on Windows: UI loaded in 26.s and first inference was almost instant.

#

What is a good place to upload a wheel file?

coral mulch
#

If you got the WebUI fully running and generating on windows, I want to get my hands on that fast.

#

@keen marsh Upload it to dropbox or google drive

keen marsh
#

Also if you want to use this in the sd env, just delete the two ipex folders in the venv and copy over the main ones. And if you have a wheel already just edit out that line

#

Good news is that it loads much faster, bad news is that so far it's about 2it/s slower. maybe it still needs time to warm up I dunno, or maybe the comment does something. let me know how it works for you

#

I guess I also should have said to run it in the oneAPI environment as well... lol, well I'm sure most of y'all know that one.

#

Also, haven't tested it with SDXL yet

coral mulch
#

I get this error like I did before "C:\Users\dbs_5\OneDrive\Desktop\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependencies.

keen marsh
#

You have to edit the `__init__.py` file and comment out that line

coral mulch
#

oH.

keen marsh
#

1.) Run pip install --force-reinstall intel_extension_for_pytorch-2.0.110+git9fccbf1-cp310-cp310-win_amd64.whl

2.) Locate the `__init__.py` file in your pip-installed intel_extension_for_pytorch folder:

"your_python_directory\Lib\site-packages\intel_extension_for_pytorch\__init__.py"

3.) Comment out line 100:

#from . import _inductor
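
If you would rather not count to line 100 by hand, the edit can be sketched as a small script. This is an illustration under assumptions, not an official fix: it matches the import by its text instead of its line number, since the line number may differ between wheels, and the helper names are made up. It writes a `.bak` copy next to the file before changing anything.

```python
import importlib.util
from pathlib import Path

TARGET = "from . import _inductor"


def installed_init_path(package="intel_extension_for_pytorch"):
    """Return the path of the package's __init__.py without importing it, or None."""
    spec = importlib.util.find_spec(package)
    return spec.origin if spec else None


def comment_out_import(init_path, target=TARGET):
    """Comment out every line that is exactly `target`; return how many lines changed."""
    path = Path(init_path)
    original = path.read_text(encoding="utf-8")
    lines = original.splitlines(keepends=True)
    changed = 0
    for i, line in enumerate(lines):
        if line.strip() == target:
            indent = line[: len(line) - len(line.lstrip())]
            lines[i] = f"{indent}# {line.lstrip()}"
            changed += 1
    if changed:
        # Keep a backup so the edit is easy to undo.
        path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
        path.write_text("".join(lines), encoding="utf-8")
    return changed
```

Usage would be something like `comment_out_import(installed_init_path())`, run once in the conda environment and once with the WebUI venv's Python, since both copies need the change.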

coral mulch
#

Thank you.

keen marsh
#

No doubt, let me know how it goes. So far it traded off load time for speed for me

#

It's still not bad, but 4 it/s vs 6 it/s at 512*512 on SD 1.5
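
To put those numbers in per-image terms, here is a quick back-of-the-envelope calculation. The 20-step count is an assumption for illustration; nobody in the thread stated their step count.

```python
def seconds_per_image(steps, it_per_s):
    """Sampling time for one image, ignoring model load and VAE decode."""
    return steps / it_per_s


STEPS = 20  # assumed step count, not from the thread

native = seconds_per_image(STEPS, 4.0)    # native Windows figure quoted above
previous = seconds_per_image(STEPS, 6.0)  # earlier figure quoted above
slowdown = native / previous

print(f"{native:.2f}s vs {previous:.2f}s per image, {slowdown:.2f}x slower")
```

So at these rates the drop from 6 it/s to 4 it/s means each image takes about one and a half times as long to sample.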

#

Will try sdxl in a bit

coral mulch
#

Commented out line 100 in both the main site package in my conda environment and the one inside the venv after I had already overwritten it with the conda one I installed

#

Same error.

chrome bone
chrome bone
coral mulch
#

The one on the IPEX git?

#

Yeah.

chrome bone
#

and make sure you set env variables first:
{ONEAPI_ROOT}\setvars.bat
{ONEAPI_ROOT}\mkl\2023.2.0\env\vars.bat
{ONEAPI_ROOT}\compiler\2023.2.0\env\vars.bat
{ONEAPI_ROOT}\compiler\2023.2.1\env\vars.bat

#

2023.2.1 is optional i believe

#

in the future these can be integrated to sd.next

#

ill look into it after i had my lunch

keen marsh
#

Try running in conda, I only get that error outside of the oneAPI environment

#

There is no env or var in 2023.2.1 for me

#

I use latest as it's a symlink to the newest version, so when you upgrade it still works
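
To check which of these scripts actually exist on a given install, something like the sketch below might help. It assumes the layout discussed above (`setvars.bat` at the root, and `<component>\latest\env\vars.bat` for the compiler and MKL) and assumes `ONEAPI_ROOT` is already set by the installer. The function name is made up.

```python
import os
from pathlib import Path


def find_env_scripts(oneapi_root, components=("compiler", "mkl"), version="latest"):
    """Return {script path: exists?} for the oneAPI environment scripts."""
    root = Path(oneapi_root)
    candidates = [root / "setvars.bat"]
    candidates += [root / c / version / "env" / "vars.bat" for c in components]
    return {str(p): p.is_file() for p in candidates}


root = os.environ.get("ONEAPI_ROOT")
if root:
    for script, exists in find_env_scripts(root).items():
        print(("found   " if exists else "missing ") + script)
else:
    print("ONEAPI_ROOT is not set in this shell")
```

Pass `version="2023.2.0"` instead of the default to check a specific release folder.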

coral mulch
#

Did that command, did setvars.

#

Same .dll error.

keen marsh
#

Can you screenshot your environment when you load?

coral mulch
#

What do you mean "When I load"

#

Lol

#

I open up Anaconda, run the environment I made for IPEX in python 3.10

#

aka conda activate IPEX

#

I go to the oneapi folder and activate all three needed setvars

#

go to the automatic folder and run webui.bat --use-ipex

keen marsh
coral mulch
#

With line 100 of the site-packages intel_extension_for_pytorch `__init__.py` commented out.

#

I have that, yes.

#

Mine>

   Initializing Visual Studio command-line environment...
   Visual Studio version 16.11.28 environment configured.
   "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\"
   Visual Studio command-line environment initialized for: 'x64'
:  advisor -- latest
:  compiler -- latest
:  dal -- latest
:  debugger -- latest
:  dev-utilities -- latest
:  dnnl -- latest
:  dpcpp-ct -- latest
:  dpl -- latest
:  ipp -- latest
:  ippcp -- latest
:  mkl -- latest
:  tbb -- latest
:  vtune -- latest
:: oneAPI environment initialized ::
keen marsh
#

Are you using Conda?

coral mulch
#

Yes.

keen marsh
#

Did you delete the old python?

coral mulch
#

What do you mean "Delete the old python" lmao

keen marsh
#

I had to delete python3 completely and set conda to path

coral mulch
#

Why would you even have to do that? The conda environment has its own version of python set.

#

conda create -n IPEX python=3.10
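
One way to settle which Python is actually in play is to print it from inside the session that launches the WebUI. A minimal sketch (helper names are made up): the wheel tag has to match the `cp310` part of the wheel file name, and the executable path shows whether conda's interpreter or another install was picked up from PATH.

```python
import os
import sys


def wheel_tag(version=None):
    """Return the CPython wheel tag, e.g. 'cp310' for Python 3.10."""
    major, minor = (version or sys.version_info)[:2]
    return f"cp{major}{minor}"


def interpreter_report():
    """Collect the facts that distinguish conda, venv and system interpreters."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_venv": sys.prefix != sys.base_prefix,
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "wheel_tag": wheel_tag(),
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```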

keen marsh
#

Why, who knows

keen marsh
#

So I take it you never got it to work in windows at all right? If so that's likely why

keen marsh
#

Well, SDXL is just generating black images for me.

#

okay, got model cpu offload to work! sequential didn't work. (was the opposite before)

#

well 3 days of compiling and messing with stuff, time for a break. Hope those wheel files work for yall.

chrome bone
#

woohoo

#

finally

#

4 it/s

#

cant say it has impressed me but it runs

#

and it uses less sysram as you dont need to host a vm

#

now i needa find out if the function i commented out has anything to do with the slow speed

#

probably not, ill see what i can do nonetheless

restive parcel
#

very cute for an H model

chrome bone
#

huh they can do cuter things

#

too bad i cant post em here

restive parcel
#

wink

chrome bone
restive parcel
#

yeah, I'd like to get back up and running with something, but I've successfully ruined all working setups I've had

chrome bone
#

you can start new!

restive parcel
#

I spent 3 days doing that

chrome bone
#

i promise it wont be (too) hard

restive parcel
#

not sure what I'm not doing right, but on my linux setup I'm getting weird outputs

#

and slow inference speeds

chrome bone
#

do it in native windows!

#

who needs linux

restive parcel
#

maybe soon, I'm waiting for y'all to baby proof that before I ruin my system

keen marsh
# chrome bone 4 it/s

I found out how to compile from different commits, as the one they use seems to be from 2 weeks ago, so that could be why speed is slower. It's also in another branch so I am not even sure it would work lol. But I have compiled enough lol

chrome bone
#

yeah tryna compile that stuff gives me mental breakdown

#

they want market share they should make it user friendly

keen marsh
#

It seems like a small team

chrome bone
#

i bet it is big

#

but their focus is mostly on pvc/linux

keen marsh
#

Yeah, definitely.

chrome bone
#

at argonne there are 10000s of pvc that need to be run

keen marsh
#

Consumer ai is pretty new, developers will still likely prefer Linux

chrome bone
#

yep thats totally understandable

#

nonetheless im very excited it finally runs lol

keen marsh
#

Yeah, i can even run refiner now

#

Sdxl seems to run about the same speed

chrome bone
#

yeah they did a pretty good job optimizing sdxl, lemme test diffuser backend for sd1.5 too

keen marsh
#

Will it work with standard models?

chrome bone
#

sure

#

(half sure)

#

ok it breaks

#

not really

#

speed is good

#

6it/s

#

vs original backend

#

but i like original better

keen marsh
#

Image output was bad?

chrome bone
#

the girls are still very cute from what i can tell

keen marsh
#

Lol

chrome bone
#

yeah commenting out `from . import _inductor` in `__init__.py` should have no effects whatsoever

#

i dont see other codes depending on this import

#

so this is it, 4it/s

keen marsh
#

it may just be the commit, I have the last two both are slower. The one in the prebuilt is from 2 weeks ago and is in a branch

#

could be AOT on windows too

#

I think it states it may have a performance penalty? I could be making that up in my head though lol

chrome bone
#

hmm idk, i expect its perf to improve over time

#

after all its first release in windows

keen marsh
#

It will likely be better when they fix their whl's

#

It's actually full speed, just the initial load is 10 minutes
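
Since the first run turned out to be so different from the rest, it helps to time the first call separately from the steady state when comparing it/s numbers. A generic sketch, where `step` stands in for one sampling iteration (a placeholder, not an SD.Next function). On a real GPU you would likely also need to synchronize the device before reading the clock, which is left out here.

```python
import time


def benchmark(step, iters=10):
    """Return (first_call_seconds, steady_state_it_per_s) for a callable."""
    start = time.perf_counter()
    step()  # first call includes one-time compile/warm-up cost
    first = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(iters):
        step()
    steady = time.perf_counter() - start
    rate = iters / steady if steady > 0 else float("inf")
    return first, rate


first, rate = benchmark(lambda: sum(range(10_000)))
print(f"first call: {first:.4f}s, steady state: {rate:.1f} it/s")
```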