#SDNext WebUI on Intel ARC
Nice!
thats fast
can't try it myself right now, but please try to just use the command they have (maybe exclude torchvision) to see if it throws any errors
I will, wsl 2 just won't work for me. Wonder if it's because their python version is ahead of mine
there are wheels for python38 all the way through 311
Hopefully I can get this running native... it's installing
exciting times... And I am away from my computer for 3 whole days. so much work to catch up to in the next few weeks
I’m scared to mess up my versions, virtual environments and paths😅.
I’ll do this once there is a clear set of instructions for both windows and wsl
I have asked for an article that addresses consumer GPUs for native windows with this new release
please confirm it's working. I have already closed the issue because I need to sleep
Now I need to learn how to set the one api environment in windows lol
the basekit has windows version, you can choose to install only dpcpp and mkl
ill toy with it after lunch
I have it, but I think i need to set it proper to use

i think it was setvars in linux
try to run %ONEAPI_ROOT%compiler\env\vars.bat
and the one for mkl
see the sanity check at the bottom of the Windows installation instructions. Maybe exclude torchvision
ROOT
maybe need to run it with call, not sure.
should this command run in windows? "python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];" " I pull an error with it
put the code between ticks ` to avoid discord markdown formatting
What error do you end up with? module not found?
yes
try a pip show intel_extension_for_pytorch
but the pip install didn't throw any errors?
WARNING: Package(s) not found: intel_extension_for, pytorch
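A minimal sketch of the same check as `pip show`, done from Python with the standard library so that nothing has to be imported (and no DLL has to load) just to see what is installed. The helper name `installed_versions` is made up for illustration; the package names are the ones discussed above.

```python
from importlib import metadata


def installed_versions(names):
    """Return {name: version or None} without importing the packages."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    wanted = ["torch", "intel_extension_for_pytorch", "torchvision"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter (venv, conda env, or system python) that the webui uses, otherwise it reports on a different site-packages.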
turns out the devs have been using C:/oneAPI as Installation path not C:/Program Files (x86)/Intel/oneAPI and therefore didn't run into the issues with white spaces and getting errors that return like half the path
do a pip show torch and hope it gives you torch2.0.1+git#hash
Version: 2.0.0a0+gitc6a572f
i get this warning when reinstalling DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
you can ignore this for now, this is just used to run sde samplers
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.14.1 requires torch==1.13.1, but you have torch 2.0.0a0+gitc6a572f which is incompatible.
yeah, get rid of torchvision for now, it's optional. try to just get IPEX working and then see if you really need it
you can do pip uninstall torchvision
try to import it in python
still get the same module error AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\backend_with_compiler.dll"
hey vipitis, did you change intel basekit directory to without white space
SD.next ipex install needs to change for windows, it's trying to install the linux library. @proper cradle
I didn't myself, but I also wasn't successful. They acknowledged the issue, so I hope they test it beyond just the batch script
It's 5.20 am and the matches start at 15.30, I really need to fall asleep now.
Good luck on this, but it sounds really promising. If I am home early enough tomorrow, I might try around a bit myself. But I do hope that someone figures it out and either posts a proper guide or they fix it to make the experience more seamless.
see if they push an update to docs or the release branch here
good night bro, appreciate the help
so far got sd.next installing with ipex in windows, hopefully I can get it to work.
Also, if using in windows edit the installer.py and change the ipex install to the one from the windows version 'torch==2.0.0a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu' until Disty and them change it officially for windows
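One way the installer change could look, as a rough sketch only: pick the pip arguments by platform. `ipex_pip_args` is a hypothetical helper, not SD.Next's actual installer code, and the Linux spec is left as a parameter because it is not quoted here.

```python
import sys

# Windows spec quoted from the message above. It does not list torchvision.
WINDOWS_IPEX_SPEC = (
    "torch==2.0.0a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 "
    "-f https://developer.intel.com/ipex-whl-stable-xpu"
)


def ipex_pip_args(linux_spec, platform=sys.platform):
    """Pick pip arguments for IPEX based on the OS (hypothetical helper)."""
    spec = WINDOWS_IPEX_SPEC if platform.startswith("win") else linux_spec
    return ["install", "--upgrade", *spec.split()]
```

The resulting list can be handed to `python -m pip`, for example through `subprocess.run([sys.executable, "-m", "pip", *args])`.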
nope, almost ' OSError: [WinError 126] The specified module could not be found. Error loading "H:\Stable
diffusion\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependenc '
ahhh, I did in linux lol. I forgot
I'm not sure if use-ipex is running the one api environment, it may be because it's used to linux. I will see if this update fixes things first
ill try to get ipex running first, then see whats needed to be adapted in sdwebui
How do you actually activate the oneapi environment in Windows?
Build and run your first sample project using the Intel® oneAPI Base Toolkit for Windows*
Yep, I remembered now.
I hit setvars.bat it runs and seems to load everything but it closes fast not sure if it's sticking?
"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
You run it
in a command prompt.
You don't click the bat.
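The reason clicking the bat does nothing lasting is that the variables it sets only live inside the cmd session that ran it, and that window closes when the script ends. A sketch, Windows only, of capturing that environment from Python so it can be handed to another process; the function names are made up for illustration.

```python
import subprocess


def parse_set_output(text):
    """Turn the output of cmd's `set` command into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def oneapi_environment(setvars_path):
    """Run setvars.bat and `set` in one cmd session and return the result.

    Windows only: relies on cmd.exe being the shell.
    """
    command = f'call "{setvars_path}" >nul && set'
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return parse_set_output(result.stdout)
```

With the path quoted above, `oneapi_environment(r"C:\Program Files (x86)\Intel\oneAPI\setvars.bat")` returns a dict that can be passed as `env=` to `subprocess.run` when starting the webui.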
I used open vino once with the first ever version it supported stable diffusion with
You can probably modify whatever starts up oneapi for linux with the directory of the bat for windows
Let's see what happens.
```
"C:\Users\dbs_5\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependencies.
```
Same error as you, Aaron.
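The error text says "or one of its dependencies", so the DLL file itself may be present while something it links against is not found. A small sketch that separates those two cases; `probe_dll` is a hypothetical helper, and a failed load here may not behave exactly like the load that happens when the package is imported.

```python
import ctypes
import os


def probe_dll(path):
    """Try to load a DLL and report (ok, detail) instead of raising."""
    if not os.path.isfile(path):
        return False, f"file does not exist: {path}"
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # The file is there, so the loader failed on the file or a dependency.
        return False, f"file exists but failed to load: {exc}"
    return True, "loaded"
```

Point it at the `intel-ext-pt-gpu.dll` path from the error message, once in a plain command prompt and once after running `setvars.bat`, and compare the results.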
https://github.com/intel/intel-extension-for-pytorch/issues/394 hmm closest I can find to our issue. they post a link to the new windows version in the last reply
Visual Studio 2022
Install Desktop development with C++ in the Workloads tab. 2 separate options MSVC and Windows SDK are required.
I am using VS2019
In the Linux install they "export" the mkl and dpcpp, i wonder if that means add to path
I am on 2022
I am also not on the latest driver, maybe that could be it
I am.
good luck folks, i already burnt out all my debugging neurons over the last 3 days 
Lol, i have had a month off and came back at the best time
if i can't even get it running in linux native, I daren't attempt to run something this fresh in windows
Well I got it running on linux, until it blackscreened a few times.
I spend more time debugging than using the programs lol
In wsl2 there is something in my environment it doesn't like, i haven't tried native Linux yet
Maybe the python version in the venv needs to match with windows?
lmao, ain't that a mood
well, i used automatic on nvidia quite a bit
arc on the other hand has been a lot of work just to match what I could do before
...on an old 4gb card
we're slowly improving though
Yeah, if ipex can work in windows it's a game changer. full UI support at full speed. It would be second to nvidia as amd can only use directml or shark.
Maybe create another distro
noticed the torch versions are different for windows and linux, don't know if that is the issue. linux is 2.0.1a and windows is 2.0.0a
make sure you use the right prebuilt links, the old ones were wrong and for linux. It's importing fine for me
.................... -.-
' python -m pip install torch==2.0.0a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu '
it should be, you can check the link and see if those win files show up for you if it's a cache issue.
try and uninstall torch vision,
pip show intel_extension_for_pytorch
Name: intel-extension-for-pytorch
Version: 2.0.110+gitba7f6c1
Summary: Intel® Extension for PyTorch*
how come this would interfere with ipex...?
I am thinking torchvision doesn't work on windows for intel...
i'd think all of them work for intel, but some configuration isnt right
I dunno, but when I did it it worked for me . You can check up a little bit when I was trouble shooting.
also think lack of torchvision is why sd.next isn't loading? not sure.
it never launched successfully for me, windows or linux
ill just let computer do its thing while i watch some anime
I got it to launch in linux, but it wouldn't output lol
Building on my pc too, my poor 11700k doin its hardest work ever lol
For what its worth i compiled xpu10 on wsl2 limited to 3 cores lol. Took a couple hours
I am going to try and run it outside the venv
3 threads sounds awful lol. Im like 80% done at 13 mins in
lol, it was the only way to get it to finish. it kept running out of resources.
That's rough
well got it to load outside the virtual environment, forced me to fp32 but just errors out again.
think it doesn't see the gpu
I got sdxl to work in wsl with the new ipex using sequential cpu offload, was only using like 1.5gb vram, it was pretty slow tho like 2s/it vs 1.8 it/s i was getting before without sequential cpu offload. Using the model shuffle it starts generating but ultimately crashes and never finishes. Could be related to using the refiner i need to test without
I'm just trying basic 1.5 right now.
That is if i can get it to load without the weird fp32 thing that seems to be a hit or miss. Sometimes it all loads sometimes it doesnt. 1.5 worked for me no issues tho
I think its somewhat slower than before however like marginally
I was getting over 8it/s previously seems to be closer to 7 now
For sd1.5 512x512
maybe try a different optimization method
Sub quadratic seems to be the best still
I was hoping sdp would work.
Well i'm going to sleep, will maybe try compiling from source tomorrow.
Windows installation has this too:
conda install pkg-config libuv
I am trying to do it, but all I can get is some source code that I don't know how to deal with, which I downloaded from http://pkgconfig.freedesktop.org/releases/.
so i used --debug flag to install packages and manually replaced torch. been stuck here for a while. the package loading time takes forever. probably a loop somewhere that deters model loading(?)
it loaded natively?
yes. on windows
managed to get jbaboval to launch, it is slow as hell too lol i wonder what went wrong
it takes forever to do these operations.. i wonder what exactly the problem is
removing ipex.optimize reduces loading time by 33 seconds, for both embeddings and ckpt
Turn it off from the settings
90 seconds is still too long :/
Lol what is it doing
it is not just loading time, the time it takes to initialize an inference is also veeeeeeery long and i went out of patience before the first iteration is finished
Make an issue on the github?
there must be something wrong in the process, im finding out what that is
ive pinned down the problem to this
getting expected shape takes 90 seconds... while id like to go further its bed time now
that's a hack. You embed a small string and then look at its size instead of looking at the config or whatever. If this is really slow, then actually embedding something will be way slower.
Perhaps they do some caching/compiling upon first inference? Not too sure
IPEX does that on Linux too.
Only custom AOT IPEX builds don't do that.
Already have this installed.
Outcome is the same.
you install it with Conda? I can't figure out how to call conda without clicking on the exe for it
also, how do you get the latest python 3.10 in windows? Conda is 3.10.12 but my version is 3.10.6 so maybe that's the conflict.
well, it works now on new fork, so nevermind lol.
well it defaulted to fp32 and cpu though I think
about 10s/it using cpu lol
speed
@proper cradle i think the word torchvision in the install for windows isn't supposed to be there, it pulls an error. don't think torchvision is part of the windows ipex wheels yet
But we need it
I mean its just pulling error and not downloading anything
Anybody have luck with compiling from source? I wonder if they just uploaded the cpu version
I can get it to run in Windows but only at full precision and using cpu. System shows torch.cpu autocast or something like that
what does torch.xpu.is_available() return?
conda create -n name* python=3.10
hmm compiling from source and using those packages doesnt seem to change anything
What error?
ERROR Error running pip: install --upgrade torch==2.0.0a0 torchvision
intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f
https://developer.intel.com/ipex-whl-stable-xpu
Can you run this manually?
pip install --upgrade torch==2.0.0a0 torchvision intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
Cuda version of torchvision is working fine on my end
didnt throw any errors just said i already had everything installed
OSError: [WinError 127] The specified procedure could not be found. Error loading
"C:\Users\KingOfMemes\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of
its dependencies.
still throws this error trying to run webui.bat though
^ That is the error that I have as well, and aaron also has it.
happened after manually replacing the files with the ones compiled from source as well
And IPEX worked fine with normal torch 2.0.1
works fine on wsl
torch.xpu.is_available()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'torch' has no attribute 'xpu'
This isn't true for me because pytorch needs to be linked with IPEX properly to have XPU support
That's why the normal pytorch distribution doesn't work
my guess is the torch they have for windows doesn't support gpu, which may be why it's a different version. i no longer get the error now though, it just uses cpu
torch needs a patch to work with ipex iirc
Note: The patched PyTorch 2.0.1a0 is required to work with Intel® Extension for PyTorch* on Intel® graphics card for now.
and they uploaded 2.0.0 
intel confuse
also torchvision can be removed from the installer line on windows, it installs torchvision anyway without it and including the line throws errors for whatever reason, prob because its trying to pull it from the intel dev page and torchvision doesnt exist on there for windows yet
Tbh the windows release does say experimental
Once they get torch to 2.0.1a0 on the Intel dev page i think itll work fine so theres that
link me to those notes I will make a thread I guess
Its noted on the ipex github
However, the source should compile the right one. If it doesn't, they may not even have it. If that's the case, how is any of it experimental lol
Is there an issue for this on github?
Not yet
Can you run this?
pip install --upgrade torch==2.0.0a0 torchvision==0.15.2 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
ERROR: Cannot install torch==2.0.0a0 and torchvision==0.15.2 because these package versions have conflicting dependencies.
pip install --upgrade torch==2.0.0a0 torchvision==0.15.1 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
made a support thread
same error. I don't think we can install torchvision on windows, at least not from intel
I've noted it doesn't seem to exist on the support thread
This should install the normal torchvision
so hopefully they respond sometime in the next week
let me try outside of the conda environment
ERROR: Cannot install torch==2.0.0a0 and torchvision==0.15.1 because these package versions have conflicting dependencies.
The conflict is caused by:
The user requested torch==2.0.0a0
torchvision 0.15.2 depends on torch==2.0.1
what my error reads
That's why it works on Linux then
yeah i guess for windows we wait for torch 2.0.1 from intel
😂 😭
pip install --upgrade torch==2.0.0a0 torchvision==0.15.0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
seems promising though
The conflict is caused by:
The user requested torch==2.0.0a0
torchvision 0.15.0 depends on torch==2.0.0
it has to be from intel because intel releases patched torch versions to work with ipex
so we need torchvision and torch from intel devs for windows
Guess no way to bootleg and copy the linux files to windows right? lol
Native Linux or WSL?
well, even to use it on wsl2 seems like I have to wipe my environment again sigh
Thats weird I haven't had any issues on wsl side
Yeah, it's my environment. I think it's because it wants to install python 3.10.12 in the environment when I have 3.10.6, not sure why that is probably something I did a long time ago. Might try and run the launch.py outside it though
i never installed or used conda for anything it isnt needed
I hacked that thing together after making millions of mistakes lol
This worked with base torch 2.0.0
yeah, I didn't even know what conda was I just followed the instructions at first
had to set another python to default etc etc, learned a lot though and got it working but likely causing this error
The conflict is caused by:
The user requested torch==2.0.0a0
torchvision 0.15.1 depends on torch==2.0.0
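Why pip refuses every combination above: `2.0.0a0` is a pre-release of `2.0.0`, so it does not satisfy an exact `torch==2.0.0` pin, and it is not `2.0.1` either. A simplified sketch of that comparison follows. It is plain string matching, not a full PEP 440 implementation; the pins are copied from the pip messages in this thread and the function names are made up.

```python
def public_version(version):
    """Drop the local label after '+': '2.0.0a0+gitc6a572f' -> '2.0.0a0'."""
    return version.split("+", 1)[0]


def satisfies_exact_pin(installed, pinned):
    """Simplified '==' check: string equality after dropping the local label."""
    return public_version(installed) == pinned


# torchvision version -> torch version it pins, as reported by pip above.
TORCHVISION_PINS = {
    "0.14.1": "1.13.1",
    "0.15.0": "2.0.0",
    "0.15.1": "2.0.0",
    "0.15.2": "2.0.1",
}


def compatible_torchvision(installed_torch):
    """List the torchvision versions whose torch pin the installed torch meets."""
    return [
        tv
        for tv, pin in TORCHVISION_PINS.items()
        if satisfies_exact_pin(installed_torch, pin)
    ]
```

For the Intel Windows wheel, `compatible_torchvision("2.0.0a0+gitc6a572f")` comes back empty, which matches what pip reports.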
Conda seems needed for windows, I don't know how to install that package otherwise.
Not sure it makes a difference, might be why I can get into Sd.next now even with just cpu
Without it I got that module error I think
It's only installed in conda anyway
Guess i will try wsl2 again later
yeah works great on wsl2 with good speeds and stable inference
Still leaks memory?
They fixed that already, you can use it as long as you don't exceed your memory capacity with too high resolution, if you do you will have to restart still.
So far this is what I've done to set up the WSL2 version of vladmantic
```
wget -qO - https://repositories.intel.com/graphics/intel-graphics.key | \
sudo gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg
echo 'deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc' | \
sudo tee /etc/apt/sources.list.d/intel.gpu.jammy.list
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update && sudo apt upgrade -y
```
sudo apt-get install intel-opencl-icd intel-level-zero-gpu level-zero intel-media-va-driver-non-free libmfx1 libgl-dev intel-oneapi-compiler-dpcpp-cpp intel-oneapi-mkl numactl python3-pip python3-venv git unzip liblzma-dev
git clone the automatic repo
run it with --use-ipex
And then modify the webui settings based on the SDXL Vladmantic wiki, installing the fp16 vae fix
That's it
Native?
No.
WSL2 only
On windows, we still have DLL issues
For some reason regarding a minor pytorch version difference
Native needs updates from the intel team
When native intel beats amd lol
Also, anybody compiled from source? Does source also compile wrong pytorch?
Think there was issues with it
If so it probably just wont work
Well they should reply in couple of days so hopefully its just a mistake
Seems like they've done minimal testing for this
Assuming the compiled version of torch is the same as in the wheel, I got the same dll error regardless. According to their docs it needs to be 2.0.1a0 rather than 2.0.0a0 for the gpu to work, so hopefully that gets updated soon and then all should be good. The wsl environment is still good tho. I'd be curious whether native windows would have any speed benefit compared to wsl. If not, having native windows is cool but wouldn't really offer anything over the wsl version, as it's not really any easier to set up
trying to compile from source now, realized I wasn't running the oneAPI environment before and that's why it never did anything
good morning everyone!
for anyone who's as confused as me, i understand your pain
this is what i did so far to get it working on windows
- install intel dpcpp compiler 2023.2.0 + mkl 2023.2.0 + compiler classic 2023.2.1
- uninstall all existing python versions
- install latest miniconda
- check all 4 boxes during installation
- run in cmd:
{oneAPI ROOT}\setvars.bat
{oneAPI ROOT}\mkl\2023.2.0\env\vars.bat
{oneAPI ROOT}\compiler\2023.2.0\env\vars.bat
conda install pkg-config libuv
python -m pip install torch==2.0.0a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu
after that you can just import torch and ipex and use xpu as device to calculate on your gpu
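A minimal smoke test for that last step, assuming the patched torch and IPEX wheels are installed and the oneAPI vars scripts have been run in the same session. `xpu_smoke_test` is a made-up name; it returns a status string instead of raising, so it is safe to run on a machine where the stack is incomplete.

```python
def xpu_smoke_test():
    """Return a short status string; does not raise if the stack is missing."""
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (adds torch.xpu)
    except (ImportError, OSError) as exc:
        # OSError covers the WinError DLL failures seen in this thread.
        return f"import failed: {exc}"
    if not hasattr(torch, "xpu") or not torch.xpu.is_available():
        return "torch imported, but no XPU device is available"
    x = torch.ones(2, 2, device="xpu")
    total = (x @ x).sum().item()  # forces real work on the device
    props = torch.xpu.get_device_properties(0)
    return f"xpu ok: {props}, matmul sum = {total}"


if __name__ == "__main__":
    print(xpu_smoke_test())
```

If this reports "import failed" while `pip show` lists both packages, the problem is likely in the environment (PATH, vars scripts) rather than in the install itself.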
note: some packages in webui will update dependencies and cause dll errors, i cannot tell which package that is. not a problem for sd.next though, but you need to use debug flag to install before using --use-ipex
apparently theres also some bug there, which i hope to figure out today
and compile from source did not work for me, i think i got sth missing
will give it a shot, compile ended with the same error I was getting in the webui lol
did you compile successfully?
go to ipex folder to see if there is dist folder
a whl should be there if you compile successfully
I only found dist folders in the torch dir not vision or ipex
And I didn't catch any errors as a lot was output to console (and it recursively cloned like 170 repositories)
so compiling torch from source works (perhaps check which exact version you end up with).
since the pip command and the prebuilt wheels have torch 2.0.0a+git while it said that you need torch 2.0.1a, which could be the issue
There is a new GitHub issue (not mine this time), but it's also the weekend. So perhaps we won't see any fixes before monday.
unless #software message you know who to email
I was so close to getting the linux native finally running, and then I somehow managed to uninstall bash and have thus ruined a third linux install 
torch 2.0.0a works fine, not that it is bug free but it can launch
I am home in maybe 30 minutes. Maybe I sit down to try around myself for a bit
I will check but i got this error
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\Users\mymin\AppData\Local\Programs\Python\Python310\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependencies.
you can run on gpu?
this
yes i can
mind if you check which torch/ipex version you have? the intel prebuilt should work
nice, I am going to give that config a shot. It's weird that you need to uninstall python to use conda, it's supposed to be separate and run virtual envs.
i agree
maybe it will work if you dont use conda, but i dont want to add more headaches
i've tried both, didn't work. But i never uninstalled python
well it worked, but on cpu
thats not very helpful... cpu works with unpatched torch too lol
okay, I got it to read true now. I got a response from the developers as well. The version uses 2.0.1a0 code even though the version number isn't updated, since it's experimental, and torchvision is not available yet.
okay, now webui says "no python"
did you check all 4 boxes when installing conda?
yes, do I need to run all var.bat files each time?
yes. but you can make another bat to automate that for you (perhaps integrate to webui too)
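A sketch of generating such a wrapper bat. Each script is invoked with `call`, because a batch file that starts another batch file without `call` does not return to the caller. The vars script locations and the `2023.2.0` version are the ones listed earlier in this thread; `build_launcher` and the output file name are hypothetical.

```python
from pathlib import PureWindowsPath


def build_launcher(oneapi_root, version="2023.2.0", launch="webui.bat --use-ipex"):
    """Return the text of a .bat that sets up oneAPI and then starts the webui."""
    root = PureWindowsPath(oneapi_root)
    scripts = [
        root / "setvars.bat",
        root / "mkl" / version / "env" / "vars.bat",
        root / "compiler" / version / "env" / "vars.bat",
    ]
    lines = ["@echo off"]
    lines += [f'call "{script}"' for script in scripts]  # quotes survive spaces
    lines.append(f"call {launch}")
    return "\r\n".join(lines) + "\r\n"
```

Writing it out could look like `open("run-ipex.bat", "w", newline="").write(build_launcher(r"C:\Program Files (x86)\Intel\oneAPI"))`, saved next to `webui.bat`.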
RuntimeError: Native API failed. Native API returns: -1 (PI_ERROR_DEVICE_NOT_FOUND) -1 (PI_ERROR_DEVICE_NOT_FOUND)
Every time I run an image gen, my entire PC blackscreens
Discord closes.
On WSL2*
okay thanks, I have a bat running the setvars, I will add the rest
│ /home/dan9070/automatic/modules/call_queue.py:34 in f
│
│ 33 │ │ │ try:
│ ❱ 34 │ │ │ │ res = func(*args, **kwargs)
│ 35 │ │ │ │ progress.record_results(id_task, res)
│ │
│ /home/dan9070/automatic/modules/txt2img.py:64 in txt2img
│ │
│ 63 │ if processed is None:
│ ❱ 64 │ │ processed = processing.process_images(p)
│ 65 │ p.close()
│ │
│ ... 6 frames hidden ...
│ │
│ /home/dan9070/automatic/modules/sd_hijack_utils.py:17 in <lambda>
│
│ 16 │ │ │ orig_func = getattr(resolved_obj, func_path[-1]) │
│ ❱ 17 │ │ │ setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **k
│ 18 │ │ self.init(orig_func, sub_func, cond_func)
│
│ /home/dan9070/automatic/modules/sd_hijack_utils.py:28 in call
│
│ 27 │ │ else:
│ ❱ 28 │ │ │ return self.__orig_func(*args, **kwargs)
│ 29
hmm, are you running from webui.bat or launch.py? it seems it keeps wanting to point to my old python version
webui.bat
oh yeah i used debug flag first
using --use-ipex seems unable to find python. too bad i know nothing about powershell script
fwiw debug flag is how you set up sd.next for nv/amd cards
So wait, are you using directml then?
but it will download unpatched torch in venv so i had to manually replace em
Okay, trying launch.py right now. It hooks into the default python and doesn't need the venv
that works too, i guess, but i never tried
this is what i did:
- set up with debug flag
- remove unpatched torch and add patched torch and ipex
- run the bat with --use-ipex flag
I may give it a shot if this doesn't work. I think conda isn't reading as default python even after being added to path. it's looking for python folder
oh damn is it close to working
not for me lol "OSError: [WinError 127] The specified procedure could not be found. Error loading
"C:\Users\mymin\miniconda3\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its
dependencies." 😦
this is why I no longer wish to even toy with WSL method anymore. its just hugely unstable to my whole OS, easy crash pretty often
especially with discord/other webapps
youtube/google calendar/discord do not like arc
i guess that messed something up with the extension, it's no longer found in python
😭
just patched torch and ipex lib
How do you do that in the venv?
after setting up using debug flag
go to automatic\venv\Lib\site-packages and delete unpatched torch (2 folders in total)
then add patched torch and ipex (4 folders in total)
your patched torch and ipex can be found in your conda directory
thats not very elegant but it works
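The manual swap above, sketched as a script. The folder names are an assumption: the thread only gives counts, and this guesses that they are the package folder plus its `-<version>.dist-info` folder for `torch` and for `intel_extension_for_pytorch`. It defaults to a dry run that only lists what it would do.

```python
import shutil
from pathlib import Path

# Assumed name prefixes; the thread only says "2 folders" and "4 folders".
PREFIXES = ("torch", "intel_extension_for_pytorch")


def matches(path, prefixes=PREFIXES):
    """True for 'torch' and 'torch-<ver>.dist-info' style dirs, not 'torchvision'."""
    name = path.name
    return path.is_dir() and any(
        name == p or name.startswith(p + "-") for p in prefixes
    )


def replace_packages(source_site, venv_site, dry_run=True):
    """Remove matching folders from venv_site, then copy them from source_site."""
    source_site, venv_site = Path(source_site), Path(venv_site)
    actions = []
    for old in sorted(venv_site.iterdir()):
        if matches(old):
            actions.append(f"remove {old.name}")
            if not dry_run:
                shutil.rmtree(old)
    for new in sorted(source_site.iterdir()):
        if matches(new):
            actions.append(f"copy {new.name}")
            if not dry_run:
                shutil.copytree(new, venv_site / new.name)
    return actions
```

Here `source_site` would be the conda `site-packages` directory and `venv_site` would be `automatic\venv\Lib\site-packages`. Check the dry-run list before passing `dry_run=False`.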
this is everything I would get yelled at for vaguely suggesting i've tried on a tech help forum, I love it
you run the debug outside of oneapi environment? I still get python error
inside
strange.. lemme try again
i think i know what the problem is now
i mean why so slow
I wonder if I set python to the miniconda in the webui-user.bat file I copied from automatic1111?
there is no need to do this
when you create a venv, doesn't it copy your python version anyway?
the moment you checked 4 boxes you call the python that comes with conda
like this
it should but it's looking for my old python version
now lemme download a fresh copy..
well, for this, I am glad linux has pyenv. Now if only my bash didn't magically disappear after cloning sdnext...
yeah, i get the python error each time. maybe if I replace the files first?
' H:\Stable diffusion\automatic>call webui.bat --debug
Couldn't launch python
exit code: 103
stderr:
No Python at '"C:\Users\mymin\AppData\Local\Programs\Python\Python310\python.exe'
Launch unsuccessful. Exiting.
Press any key to continue . . . '
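One possible cause, assuming `webui.bat` is starting through an existing `venv` folder: a venv records the interpreter it was created from in `pyvenv.cfg` under the `home` key, and it stops working once that interpreter is uninstalled. A sketch that reads the key and checks whether the directory still exists; `venv_base_python` is a made-up helper.

```python
from pathlib import Path


def venv_base_python(venv_dir):
    """Return (home, exists) from a venv's pyvenv.cfg, or (None, False)."""
    cfg = Path(venv_dir) / "pyvenv.cfg"
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "home":
            home = value.strip()
            return home, Path(home).is_dir()
    return None, False
```

If `home` points at the removed `Python310` folder, deleting the `venv` directory and letting the launcher recreate it with the current python is probably the simplest fix.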
can you do a screen cap in cmd? just run python
isnt that just saying you dont have python lol
hmm..
I only have miniconda python as it won't work otherwise
that is interesting
its trying to use normal python
"AppData\Local\Programs\Python\Python310\python.exe" does this exist?
it should point to conda python..
yes, deleting now.
same error
experiencing the same thing where the dll throws an OS error
but I did manage to install it via the wheels and it showed up in my pip list
I believe it's a path/vars thing according to the WinErr 126, so maybe having the conda env really helps
perhaps it's the stupid windows file path with spaces in them that messes it all up once more
i installed conda by user, though system might help? (i did it once and it may have changed some config which otherwise wouldnt have)
well, now it won't see conda when I install it like that... 😦
i found out why it is so slow now
whoever wrote this note knows its an unfinished product the problem is slow start time

now i needa find a way to not trigger this thing
Does this work?
torch.asarray(np.array([foo]), device="xpu", dtype=torch.int)
I can hijack the torch.asarray function
do you have windows lol. i dont think reshaping helps
Nope
ill keep finding what the problem is then.. i tried to understand what ids are (i think they are one hot vectors just vocab id) because printing ids return nothing. im not too sure whats going on
fwiw this is the line that executes so slow
okay, think i figured it out. You have to start everything within a conda environment.
well back where I started lol OSError: [WinError 127] The specified procedure could not be found. Error loading "H:\Stable
diffusion\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its
dependencies.
let me try and copy the files
man, it still just wants to use cpu.
did you start with --use-ipex after replacing libs?
yes, but it looks like it replaced the files. let me try again
@proper cradle do you know how to actually view Arc gpu status on linux? I want to keep an eye on memory and such, but can't figure out what actually works with arc
xpu-smi
thank
okay, finally seems to read my gpu but it's just hung at starting...
RuntimeError: expected scalar type Float but found Half got this again now....lol. Think this is what I got in wsl2?
Huh… which line is throwing this error
going to try debug now
for some reason, my server is just trying to render something when it starts up 
I think my error was from restarting the ui instead of shutting down, not sure yet. it's just hanging on the first generation
ohey, I think i'm finally getting an image
although i guess it can still black out...
check in the system and see if it's using cpu
~3s/it, 1024^2 SDXL + refiner
I don't have the gpu memory monitor set up yet so I'm not sure what its on
although, I think the debug stats number here means 4.9 gb used?
linoox of course, but it took me enough days to get it working 
i get pi out of memory error well before that number reaches 15.91 though so idk
nope still get an error. format got weird when moving to second screen lol
windows?
windows is working for RC, I ALMOST have it working lol
I am just trying to get sd1.5 to work before I even think about sdxl
that error is an easy fix
you just convert it to fp32
float usually refers to fp32
half = fp16
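A tiny illustration of the Float/Half naming and the cast being suggested, assuming torch is installed (the function returns `None` otherwise). It shows only the dtype conversion on small tensors, not where the webui needs the cast.

```python
def mixed_precision_demo():
    """Cast a Half tensor to Float before mixing it with a Float tensor."""
    try:
        import torch
    except ImportError:
        return None
    weights = torch.randn(4, 4)             # float32, reported as "Float"
    activations = torch.randn(4, 4).half()  # float16, reported as "Half"
    out = weights @ activations.float()     # convert to fp32 before the matmul
    return str(out.dtype)
```

In the webui the equivalent is done through the precision settings rather than by editing code.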
When using sequential offloading on WSL2 vladmantic, I cannot use loras. Images are black.
I tried in linux and it didn't work, also fp32 is about half as fast
naw, linux native. I killed it a lot of times so it was actually much harder than WSL for some reason
you should be able to get 1.5-2 it/s
Pretty sure he's using sequential like I am
Otherwise his it/s don't match up
If I try to run it without sequential on WSL2, it blackscreens. This only ever happened since the IPEX update.
I turned on all of the optimizations because I was getting out of resource error, i'll see which ones i can turn back off
Which makes me very sad.
Yeah, use the other one. But you need to run that - sed sequence disty posted in here a while back. and I think you need the vae hack
although that was all in torch1.13
yes, use that for faster speed the other just gave black images for me
seems like this takes 5 minutes just to start trying to make an image...
trying fp32 now no half etc
I think that robot is a bit..nsfw lol
5minutes later.....expected scalar type Float but found Half
but that shouldnt be necessary.. i didnt have to modify anything
okay only thing I can think of is to update my drivers to latest beta.
my problem is it is sloooow
I was getting the same error on wsl2,
yeah, it's taking me 5 minutes the first time just to get an error
after that I can get an error quick though lol
in wsl2 the error was quick
if you haven't gotten an image yet, wait about 5 minutes and see if it outputs an image or error
but even if theres no error, being slow requires a fix
I don't get any indication it's doing anything except the starting button in the ui
i aint waiting all day to see my waifus
Try enabling vae tiling, with vae tiling, vae slicing and attention slicing i can generate sdxl on wsl2 without using sequential at around 1.7it/s
Vae tiling in my experience is just enough to avoid the out of resource error which crashes the gpu driver forcing the black screen
@restive parcel This new version of ipex
It all works for me now
keep device precision type to fp16 but vae and diffusers to fp32
I think you also need the patched vae as well, then set it to fp32 or rename it
Yeah, use the patched vae and rename it. I just left vae and diffusers set to default
I wonder if this will take me as long as it took me to figure out the webui in wsl2 when I first started, think it took a month lol
Actually, it seems not.
So.
I think the patch is more for speed; the original will work but take more vram and take longer, I think
Well youre using sdxl that already has the patched vae baked in
Nick he means the fp16 fixed version.
it requires a separate download afaik
5 minutes and no error yet, debug says Sampler: Euler a {'uses_ensd': True} and it's still "starting"
yeah theres definitely a problem there
2 minutes in im still waiting it to throw an error
I only used the one without the embedded vae, but I only have 8gb of vram
This is merge model for: 1. 100% stable-diffusion-xl-base-1.0 and 100% stable-diffusion-xl-refine-1.0 https://huggingface.co/stabilityai/stable-dif...
It seems the vae fix on the new IPEX is no longer required?
it will take at least 5, i got 3 server alive debug messages last time
ill do a screen cap and report
I am also on fp32 now so it may take me even longer to error out lol
weird, i'm still getting average 3s/it
as soon as i switch device to cpu the inference time is sped up to 0 seconds
originally took 90s on xpu
try the settings disty posted in this thread
getting the it/s message 🙂
it's not moving though lol, let me set another 5 minute timer
5 minutes in it still doesnt give me the error i deserve
i GOT AN IMAGE IN WINDOWS ON THE GPU
ayo????
Well this is good news.
but... what did it cost?
this is our LX-99
yeah it took me about 8 or 9 minutes just now.
it actually didn't take 6 s/it when it started though, it was actually lightning fast, probably 6 it/s
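The thread switches between s/it (seconds per iteration) and it/s (iterations per second), and the two are easy to mix up. A minimal sketch of the conversion, assuming a constant sampling speed; the function names are illustrative only and do not come from SD.Next:

```python
def to_it_per_s(value: float, unit: str) -> float:
    """Normalize a speed reading to iterations per second."""
    if value <= 0:
        raise ValueError("speed must be positive")
    if unit == "it/s":
        return value
    if unit == "s/it":
        # seconds per iteration is just the reciprocal
        return 1.0 / value
    raise ValueError(f"unknown unit: {unit!r}")


def seconds_for_steps(steps: int, it_per_s: float) -> float:
    """Estimated sampling time, ignoring model load and warm-up."""
    return steps / it_per_s


# 3 s/it and 6 it/s differ by a factor of 18, not 2
print(to_it_per_s(3.0, "s/it"), to_it_per_s(6.0, "it/s"))
print(seconds_for_steps(20, 1.7))
```

This only covers the sampling loop; the multi-minute delay before the first step that people report here is a separate cost.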
nice
well maybe 3 given it's fp32
so something in the initial stage is slowing the process
lets see if second generation takes as long
yup, 1.7it/s. Goes right away after 10 minute load lmao
problem is I don't know what causes the slow down? I guess it was the thing you found?
i hv zero idea whats happening 🤷♂️
also, this is 512x512 sd1.5 revanimated I think
are you getting messages in the debug
sd.next you can run the debug flag
I have a customized fork of that one too, I messed around with that for a long time lol
yeah same
swapping to bf16 didn't cause any loading. Also turning off live preview sped me up from 2.7 to 3.5 it/s
I'm gonna wait for the windows version to be fully figured out
not sure how sdxl will do with this.
it takes forever to initialize, but once it starts the speed is good
thats a full 20 mins
i wonder if it will be like that every startup, or a one time deal? I wish I could see what was happening
yeah, looks like swapping models did it lol
it loads fast but takes forever to start using it...
oof
we are making the world a better place - imagine how many SD enjoyers will use SD.next on their arc igpus
lmao, yeah. I don't mind though, I even have reddit followers now lol
okay second model took about 3 minutes to start
it/s seem to be going up too for some reason
thats good
meanwhile I can hardly get the thing to spit out an image to me even going back to "normal" SD mode 
now its complaining something about 'DiffusionSampler' not having an attribute 'sample'
you need to change all that back to original and stable diffusion.
when in doubt reboot i guess
You follow the instructions disty posted?
you can put all those options at the top and change them on the fly
aight, rebooting did the trick
yeah i changed em but didn't restart, so I guess that was it
still feels super slow for some reason... even regular SD at 512^2 is giving me 1.54 it/s
must've left some optimization on i shouldn't...
wait
full precision was turned on?
that might be it
hmmm 2 it/s...
it feels like my outputs are very wrong....
even without a ton of prompts, I really don't expect this from Abyss Orange Mix
what are your steps, cfg and resolution?
20 steps, 6 cfg, 512^2
make sure you change your vae as well
Try bf16
what is bf16?
I dunno, i think it's what arc uses
I was getting errors when I updated the UI before and fp16 wouldn't work, google search said arc didn't fully support fp16 yet and bf16 worked for me
scary kid
model compile backend set to ipex seemed to fix the slow model startup (or it just fixes itself by the third model)
is this more in line with what you used to get?
not at all. the colors are very cold, and looks nothing like you typically get from orange mix models
same vae?
there is an option "apply color correction to match original colors" i think it may only be img2img but worth a shot
BF16 is still a 16bit number, but the bits (sign, exponent, mantissa) are allocated differently to fp16 with the idea of being more efficient for machine learning tasks. it's natively supported in most modern accelerators.
there is also tf32 which isn't even 32bit (it only uses 19 bits)
bf16 is similar to fp32 in range but reduced precision and half the footprint
you don't need to be perfectly exact for machine learning (but you do for scientific computing and simulations etc, which is why there is fp64)
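To make the bf16 vs fp16 trade-off concrete, here is a small standard-library sketch. It assumes the usual layouts (fp16: 1 sign, 5 exponent, 10 mantissa bits; bf16: 1 sign, 8 exponent, 7 mantissa bits) and models bf16 by simply truncating a float32 to its top 16 bits, which is a simplification since real hardware normally rounds. The helper names are made up for illustration:

```python
import struct


def float32_bits(x: float) -> int:
    """Raw 32-bit pattern of x stored as an IEEE 754 float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bf16_round_trip(x: float) -> float:
    """Keep only the top 16 bits of the float32 pattern (truncation)."""
    bits = float32_bits(x) >> 16
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


def fp16_round_trip(x: float) -> float:
    """Store x as IEEE half precision and read it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_overflows(x: float) -> bool:
    """True if x is too large to be stored as fp16."""
    try:
        struct.pack("<e", x)
    except OverflowError:
        return True
    return False


# Range: bf16 can hold 70000 (coarsely), fp16 cannot hold it at all
print(bf16_round_trip(70000.0), fp16_overflows(70000.0))
# Precision: fp16 keeps more digits than bf16 for small values
print(fp16_round_trip(1.001), bf16_round_trip(1.001))
```

The first line shows the fp32-like range of bf16, the second shows the precision it gives up for that range.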
Tried it again and its back to giving me the float error. Also takes 5 minutes to start again
Pushed this:
https://github.com/vladmandic/automatic/commit/489d0382cfd95ba4420d94dce4d2432c0cb54dd8
IPEX doesn't generate NaNs above 2032x2032 with diffusers anymore.
4096x4096 with SD 1.5
Time taken: 4m 39.67s |
GPU active 10408 MB reserved 14720 MB | System peak 10029 MB total 16288 MB
wow
SDXL 4096x4096 with 5MB VRAM left available
And it oom'd when i pressed the send button
SDXL 4096x4096:
Time taken: 3m 56.45s |
GPU active 11865 MB reserved 15978 MB | System peak 10830 MB total 16288 MB
for some reason the windows version seems to download the wrong version of torch.
That's because of torchvision
It will either fail with windows ipex torch or it will download the cuda torch with it rn
I have gotten it to work by replacing the files, just once so far though.
have to wait 5-10 minutes before I can see if it works this time
Hi guys, I used the WSL method of getting vladmandic up and running and it's working with older models, but SDXL 1.0 isn't working. I just get a black screen. Any tips on getting it working?
I have an a770 16GB & 48GB RAM
are you using the vae fix and what diffusers settings are you using? I recommend looking at the vladmandic github wiki for the sdxl section
since wsl by default allocates half your ram (24gb in your case), that should be enough, but just for safety you could make a .wslconfig and up that to 32gb for wsl. I know 24gb is the absolute minimum, as I see up to like 23gb usage, so 24gb could be cutting it close. I only have 32gb ram so I really can't allocate more than 24gb myself
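The .wslconfig suggestion above can be sketched as follows. This assumes the standard `[wsl2]` section with a `memory=` key and that the file lives in the Windows user profile folder; the half-of-RAM default is taken from the message above, and the helper names are illustrative. The sketch only prints the text instead of writing the file:

```python
def default_wsl_memory_gb(total_ram_gb: int) -> int:
    """Default WSL2 allocation as described in the thread: half of system RAM."""
    return total_ram_gb // 2


def render_wslconfig(memory_gb: int) -> str:
    """Text for a minimal .wslconfig that raises the WSL2 memory cap."""
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    return f"[wsl2]\nmemory={memory_gb}GB\n"


# 48 GB machine: 24 GB by default, raised to 32 GB as suggested
print(default_wsl_memory_gb(48))
print(render_wslconfig(32))
```

After saving the output as `.wslconfig`, WSL has to be shut down (`wsl --shutdown`) and restarted for the new limit to apply.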
so today windows seems to only want to use full precision on the model, yesterday I got half precision working after subsequent runs.
I will try and compile from source again and see if patched torchvision makes a difference
Thanks Nick, am I supposed to use the Fixed vae along with the fixed checkpoint?
No, just use the standard 1.0 checkpoint with the fixed vae, I've used the fixed checkpoint but i have found better stability with just the normal checkpoint and separate fixed vae
It started working for me but it's extremely haphazard in terms of the results. When prompting for a photo of a grizzled old man, it returns an illustration or cartoon a lot of the time. Not sure if I have it set up correctly. Adding negative prompts for cartoon, illustration, monochrome doesn't seem to work, at least not fully
also only 512x512 is working for me. Thanks for the wsl tip. I thought it was just openly sharing memory between host and guest os but it looks like by default it was capped to 16gb. Now i'm seeing 22gb in use...
compiled torchvision pulls an error when loading, definitely doesn't fix the start up speed problem in native windows
okay, in wsl2 it seems negative embeddings cause float errors in the latest ipex.
Can anybody else use negative embeddings with fp16 or bf16? Seems they need full precision now
While it seems ipex 2.0 is good for sdxl, it seems like it's broken on sd1.5. Seems autocast has issues with half precision. this is in wsl2 btw.
Did you try prompts other people are using? I tend to generate bad images when using short prompts
Can't replicate
ipex 2.0 works fine in wsl
Also if you are on FP32, set precision to FP16 and do a full restart
not for me 😦
It will run, but any textual inversions pull the scalar float error unless at full precision, which eats up my ram and vram
Did you git pull to the latest version?
I may just wipe the distro and just do it all over from scratch to see if something got messed up somewhere
yeah
did a git pull and reinstall to make sure
Replicated with BF16
Autocast is getting worse with each update :/
1.13.110 autocast was worse than 1.13.10
And now 2.0.1 autocast is worse than 1.13.110
Yeah i remember hearing that before. I think i started in bf16 as well, maybe the settings didn't change when i tried fp16
tried importing the torch vision from wsl2 to native windows, seems it does the same slow loading lol
It was FP32 and BF16 for me.
Fixed them:
https://github.com/vladmandic/automatic/commit/a22862d5c6da68c463e2b82f2a187efe858bf125
Nice! I will check it out
first sdxl 1024x1024 produced in native windows. sequential cpu offload, 2s/it. going to try model cpu offload but not sure if I can get it to work
not sure if those negative prompts are working lol
https://www.intel.com/content/www/us/en/docs/oneapi/programming-guide/2023-0/use-the-setvars-script-with-windows.html found this document
and it seems like doing call doesn't really activate it, you can just launch the batch file
so essentially just run "%ONEAPI_ROOT%setvars.bat"
but now I get WinError 127 instead of 126, so some steps forward
I have only gotten it to run by deleting python and running everything in conda, you can probably just set up a conda env and do it though.
also 15min+ to start the first generation on sdxl vs about 5-8min on sd 1.5
I will try a conda env as they suggest on Monday. perhaps I also change the installation directory so it doesn't break anything with those whitespaces.
I wanted to move to Python 3.11 anyway but my workstation will only come together in October,
@proper cradle this seems to work in windows 'torch==2.0.0a0 torchvision==0.15.2a0 intel_extension_for_pytorch==2.0.110+gitba7f6c1 -f https://developer.intel.com/ipex-whl-stable-xpu' . Might need someone else to check it as well, but seems to install the proper torch.
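The install line above can be assembled programmatically so the pinned versions stay in one place. This is only a sketch: the version pins and the wheel index URL are copied from the message, and nothing here confirms they are still current. It prints the command rather than running it:

```python
import sys

# Values copied from the message above, not verified against Intel's index
IPEX_WHEEL_INDEX = "https://developer.intel.com/ipex-whl-stable-xpu"
PINNED_PACKAGES = [
    "torch==2.0.0a0",
    "torchvision==0.15.2a0",
    "intel_extension_for_pytorch==2.0.110+gitba7f6c1",
]


def build_pip_command(python: str = sys.executable) -> list:
    """Return the pip invocation as an argument list (no shell quoting needed)."""
    return [python, "-m", "pip", "install", *PINNED_PACKAGES, "-f", IPEX_WHEEL_INDEX]


print(" ".join(build_pip_command()))
# To actually install: subprocess.check_call(build_pip_command())
```

Using `sys.executable -m pip` makes sure the packages land in the interpreter (venv or conda env) that is currently active, which is the usual source of "wrong torch" confusion.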
@chrome bone Found out why first inference is so slow https://github.com/intel/intel-extension-for-pytorch/issues/399#issuecomment-1667108663
hopefully, I remember setting that when building from source in wsl2
honestly i cannot wait xD might as well try again following another guide
I also fixed the torchvision error I was getting in conda, but it didn't make a difference and likely doesn't matter
yep. as said, its mostly for image augmentation and stuff that are training related
I see the setting for AOT in the bat file, but the code is different from the sh file for linux and I don't know what I'm looking at lol
yeah, if you want it, it was basically: get libjpeg and libpng from conda before compiling from source, probably because the wheels aren't for windows.
This works right @chrome bone
it... works? i mean it works, but it is slow because ipex is not compiled with AOT specified, see: https://github.com/intel/intel-extension-for-pytorch/issues/399#issuecomment-1667108663
i expect them to upload a recompiled version because just now my build fails again
I mean the torchvision doesn't throw error?
i did not have any issue with that
but that could be because i hvnt been using the training functions. on wsl my torchvision was never imported successfully but i had no issue generating pictures
Training doesn't work on Linux either
ESRGAN upscaler uses torchvision
And webui.py itself tries to import torchvision
well i never use ESRGAN upscaler so id never know, tried it back in march and it failed miserably
Anyway if it doesn't break the pip install then it's fine for now.
yep, ill post here if intel decided to upload a working ipex
I have tried to compile overnight with AOT, not sure if I did it right but will report if it works later on
USE_AOT_DEVLIST pulls an error, wonder if they have the windows version listed. I see options for IGPU in windows
might just end up stuck waiting 10-15 minutes before starting
are you using their bat?
I used the one vipitis posted
I've succesfully compiled with it before, failed once too though lol
this time it goes all the way through, then fails at the aot-devlist option
I used it like you are supposed to on linux "compile.bat root root "aot"
you can omit the quotation mark
USE_AOT_DEVLIST='ats-m150'
I just put that, I was just using shorthand to explain
that's supposedly arc, but maybe its only linux
it goes like ___.bat ___.var ___.var ats-m150
i needa find his script, not sure what he changed
does your oneapi root contain white space?
yes, it's in programs.
Why they put white spaces in the default programs folder, I dunno lol
I have compiled before, honestly I'm not sure which folder is the dpcpp root, i forget which worked. I tried "C:\Program Files (x86)\Intel\oneAPI\dpcpp-ct\latest" this time, last time I tried "C:\Program Files (x86)\Intel\oneAPI\compiler\latest" but ipex was busted (this was without AOT)
So i guess that could also be it, I wish I didn't have to compile the entire thing each time
i simply changed it to c:\oneapi, compilation still fails nonetheless
Hey ho! Was just hoping someone might know something about StableSwarm and potentially using Intel Arc GPU's that you have as a secondary device for simultaneous batch generations across different GPU brands?
do you mind uploading his script again?
i found it
which one is dpcpp root folder? lol
I don't want to mess up because of that, it takes forever
They tell you where it is in the linux instructions lol
yeah so to use his script you can just run compile_bundle.bat 1 2 ats-m150. he simply replaced arguments with %ONEAPI_ROOT% env variable
no need for USE_AOT_DEVLIST?
"C:\Program Files (x86)\Intel\oneAPI\dpcpp-ct\latest" this is where my dpcpp root is, right?
does not matter, you can run echo %ONEAPI_ROOT% to see where your base kit is installed
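The `echo %ONEAPI_ROOT%` check can also be done from Python, together with the whitespace check that came up earlier in the thread. A minimal sketch; the function name and the returned dictionary shape are made up for illustration:

```python
import os


def inspect_oneapi_root(env=None) -> dict:
    """Report where ONEAPI_ROOT points and whether the path contains whitespace."""
    env = os.environ if env is None else env
    root = env.get("ONEAPI_ROOT", "")
    return {
        "root": root,
        "is_set": bool(root),
        # Paths like "C:\Program Files (x86)\..." were reported to break the build scripts
        "has_whitespace": any(ch.isspace() for ch in root),
    }


print(inspect_oneapi_root())
```

If `is_set` is false, the oneAPI environment script has not been run in the current shell.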
okay, thanks
this is what intel developers shouldve done
"compile_bundle.bat" "C:\Program Files (x86)\Intel\oneAPI\dpcpp-ct\latest" "C:\Program Files (x86)\Intel\oneAPI\mkl\latest" ats-m150
this compile_bundle.bat 1 2 ats-m150. 1 2 are just placeholders, you can put whatever you want inside
that's what I got, oneapi root shows my one api folder in windows but not the dpcpp root
so "compile_bundle.bat 1 2 ats-m150 " should work just like that?
yep
okay wow, lol. thanks I will give it a shot
good luck, wish ill see a working ipex when i wake up next morning
If I get a working wheel I can try and upload it someplace
also, in case anybody missed it, the whl files are in the dist folders
took me a minute to find them lol
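For anyone else hunting for the built wheels, a short sketch that walks a build directory and lists every `.whl` sitting in a folder named `dist`. The build directory you pass in is whatever folder you ran the compile script from; the function name is illustrative:

```python
from pathlib import Path


def find_wheels(build_root) -> list:
    """Return all *.whl files whose parent folder is named 'dist', sorted by path."""
    root = Path(build_root)
    return sorted(p for p in root.rglob("*.whl") if p.parent.name == "dist")


# Example: list wheels under the current directory
for wheel in find_wheels("."):
    print(wheel)
```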
dpcpp folder is called compiler
also I seem to have found a solution for myself that doesn't require any conda env. Intel has a oneAPI env setup extension for VSCode
So far so good, torch has compiled and installed, no errors yet with ipex

been stuck here for a while now '[1047/1049] Linking CXX shared library csrc\gpu\intel-ext-pt-gpu.dll'
it's been stuck for over an hour, should I just cancel?
is your CPU showing any utilization?
here are some warnings I got before
it's like 8% when it was at 100% before
press enter in the command prompt
you can try to press Ctrl+c and hope it prompts you instead of aborting
no errors, but it's never taken this long
And then type no
okay, well I said no, and it seems to just have finished? ran a sanity check and everything came out right it seems.
guess I will test it out and see what's happening lol
Intel be shaking rn
nope, no whl for ipex.
probably just read my previous install
I wish there was a way to just compile ipex, everything else is fine
go into the compile bundle script and change it
So the issue is something to do with sycl, I believe this is to convert c++, its compiling I guess but only using 15% cpu. Not sure it wont error out eventually. 🤷♂️
Its possible I am missing a necessary download somewhere, or there is a reason they didn't use aot in the wheel files.
Diffusers backend is way more stable than the original backend after i patched the attention functions.
Here is a SD 1.5 at 512x1024 with Batch Size 32:
And SDXL at 1024x1024 with Batch Size 16:
Nobody else tried to compile yet? Maybe it's my amd cpu and opencl? 😭
ive removed torch and torchvision
though theres no reason to think it would suddenly work if you keep doing the same thing over and over again
I had it all working but then removed conda, as I wasn't using the conda env anyway. But now it doesn't work anymore. I will probably just keep conda but not use it. perhaps that is required without obviously stating it
no idea why they took all the time to develop but not compile properly for users 💀
yeah, i've started it with just ipex. i mean it hasn't actually errored out, opencl is running and using cpu. it's just super slow. I was going to see if it was just slow, but man it's been a few hours now lol.
nope, read that wrong. My bad
Okay, got it to compile! however it pulls an error now in python. will try and see if it will work anyway
nope! lol 😭
It might be because I only compiled ipex by itself...
hmm shouldnt be the case
ImportError: cannot import name 'register_backend_for_device' from 'torch._inductor.codegen.common'
I think torch does a patch during the compile if I am reading it right
yes but that was during compilation of torch
ill leave my pc compiling while i go get my pizza
so you have a wheel file for ipex?
upload it somewhere i can give it a try
I don't see an intel_extension folder in the pytorch folder so my guess is it's patching into the other folder
yeah, but it pulls an error.
yeah it shouldnt be there
arc gpus are still "cuda" not to break stuff... and now that might be breaking stuff for me -.-
torch.device(number) is cuda
use torch.xpu.device
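One way to avoid the bare-number ambiguity described above is to always spell out the device type. A minimal sketch, assuming IPEX is installed so that `"xpu"` is a valid device type; the helper name is hypothetical and the torch lines are left as comments so the snippet runs without torch:

```python
def device_string(index: int = 0, backend: str = "xpu") -> str:
    """Build an explicit device string such as 'xpu:0' instead of a bare index."""
    if index < 0:
        raise ValueError("device index must be non-negative")
    return f"{backend}:{index}"


print(device_string(0))

# With torch and intel_extension_for_pytorch imported, the string can be used as:
#   device = torch.device(device_string(0))
#   model = model.to(device)
```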
Question.
- What are your current settings other than what is mentioned on the wiki
- How do I use your commits unless they're already in the main branch
any success with compiling AOT? waiting 12 minutes every time I restart the kernel or run a script seems unreasonable.
The PyTorch webinar is on Wednesday https://software.seek.intel.com/techdecoded-webinars and I will ask a lot of questions.
my attempts have been futile
I got it to compile, it took like 6 hours and I get an error when importing ipex in python.
if it compiles fine it probably is working
ImportError: cannot import name 'register_backend_for_device' from 'torch._inductor.codegen.common'
I tried it in stable diffusion, same error
for whatever reason ninja keeps throwing ~~shuriken ~~errors at me
so you seemingly cant compile ipex on it's own.
I will try my chances tomorrow (should I get out of bed before practice), and otherwise I hope to get answers during the webinar... Or just more and more GitHub issues.
Seems like there is at least one guy appointed to windows native (hopefully has a team behind them)
Maybe, I did compile the other two first but ended up quitting out and then compiling ipex by itself later.
I posted about it on github
I may leave it on overnight again.
completely wipe all the folders and recompile I guess
I mean, it works for me right now. And without doing anything it's at least twice as fast as CPU. I can work with it. But the itrex (Intel extension for transformers) repo is filled with examples and even crazier inference deployments than I was expecting. I did manage to get xpu as my accelerate accelerator, but it's not being used correctly by device_map="auto" for pipelines. So I will try a few more things tomorrow. I didn't expect device="xpu" to be working
yeah, for me it's just the first run. takes about 15 minutes on average
it takes up way less resources than wsl2 and I can run it native.
Move Model options, VAE slicing, VAE tiling and FP16 VAE with force upcast set to false
I have all of these set, but loading just the base model takes almost all of my 32gb of sysram
The moment I attempt to generate without model cpu offload or sequential, I get either an out of resources or device not found error.
Don't use those options if you don't have enough system ram
It will put the model into the system ram
How is 32gb of ram not enough
I have this set in diffuser settings with the fixed vae properly named.
Did you load different models?
No, I shutdown WSL, then started up the webui. Let it load the fixed VAE and base model
IPEX doesn't clean up system memory without ipexrun
Also what is your fixed vae?
diffusion_model.safetensors or diffusion_model.fp16.safetensors?
diffusion_model.fp16.safetensors, but I had the previous rename while I had the VAE set to fp32
Loading variant doesn't have any effect other than looking for a specific filename tho
So you don't have to rename
Which is why the .fp16 was added. It won't load the safetensors unless it's specifically named diffusion_model.fp16.safetensors for fp16.
Set VAE loading variant to FP32
Upcast option changes the running dtype
Are you using --use-ipex?
Yes.
Is this in your logs?
Launching ipexrun launch.py...
Cause ipexrun fixes some memory leaks
These are fine:
intel_extension_for_pytorch.cpu.launch.launch - INFO - Use 'auto' => 'tcmalloc' memory allocator.
Is this line the same on your end?
2023-08-08 14:49:44,058 - intel_extension_for_pytorch.cpu.launch.launch - WARNING - Neither of ['tcmalloc', 'jemalloc'] memory allocator is found in ['/home/dan9070/anaconda3/envs/IPEX/lib/', '/home/dan9070/automatic/venv/lib/',
sudo apt install libgoogle-perftools-dev
2023-08-08 14:54:17,113 - intel_extension_for_pytorch.cpu.launch.launch - INFO - Use 'auto' => 'tcmalloc' memory allocator.
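Before launching, you can check whether the system linker can see either allocator. This is an approximation only: the log above shows ipexrun searching its own list of directories, while this sketch relies on the standard library's `ctypes.util.find_library`, so the two can disagree. The function name is illustrative:

```python
from ctypes.util import find_library


def find_allocators(names=("tcmalloc", "jemalloc")) -> dict:
    """Map each allocator name to the library the system linker finds, or None."""
    return {name: find_library(name) for name in names}


for name, lib in find_allocators().items():
    print(f"{name}: {lib if lib else 'not found'}")
```

If both come back as not found, installing `libgoogle-perftools-dev` (tcmalloc) as suggested above is the first thing to try.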
Set.
Let's see if it will help
--use-ipex' returned non-zero exit status 139.
Wsl will take all the vram you give it and more
He is running out of system memory
It will take all the ram too
That's strange
Tried --reinstall?
That is indeed what I am doing.
Removing --use-ipex will disable ipexrun if it doesn't work on WSL
[W Device.cpp:120] Warning: XPU Device count is zero! (function initGlobalDevicePoolState)
Segmentation fault
Rest are still autodetect
2023-08-08 15:00:29.766463: E itex/core/devices/gpu/itex_gpu_runtime.cc:173] Can not found any devices. To check runtime environment on your host, please run itex/tools/env_check.sh.
Oneapi at the latest?
Yes.
Ipex pull any errors outside the venv?
2023-08-08 15:03:16,824 - intel_extension_for_pytorch.cpu.launch.launch - WARNING - 'intel' OpenMP runtime is not found in ['/home/dan9070/anaconda3/envs/IPEX/lib/', '/home/dan9070/automatic/venv/lib/', '/home/dan9070/.local/lib/', '/usr/local/lib/', '/usr/local/lib64/', '/usr/lib/', '/usr/lib64/', '/usr/lib/x86_64-linux-gnu/'].
Why do i not have OpenMP.
Its pulling the ipex cpu?
Python import torch import intel_extension_for_pytorch torch.xpu.is_available()
Try that outside the venv
syntax error near unexpected token `(' lmao
you need to use ; for newline
can -c
python -c "import torch; import intel_extension_for_pytorch; print(torch.xpu.is_available())"
My bad typing on phone lol
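The one-liner is easier to debug as a small script that reports an import failure instead of crashing. A sketch, assuming torch and IPEX are (or should be) installed in the active environment; it uses only the `torch.xpu` calls that appear in the sanity check quoted earlier in this thread, and the function name is made up:

```python
def xpu_report() -> list:
    """Collect a few lines describing the torch/IPEX install and visible XPU devices."""
    try:
        import torch
        import intel_extension_for_pytorch as ipex  # registers the torch.xpu backend
    except (ImportError, OSError) as exc:
        # OSError covers DLL load failures such as the WinError 126/127 seen above
        return [f"import failed: {exc}"]

    lines = [
        f"torch {torch.__version__}, ipex {ipex.__version__}",
        f"xpu available: {torch.xpu.is_available()}",
    ]
    for i in range(torch.xpu.device_count()):
        lines.append(f"[{i}]: {torch.xpu.get_device_properties(i)}")
    return lines


for line in xpu_report():
    print(line)
```

Run it once inside the venv and once outside to see which environment is actually picking up the XPU build.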
alright, fully compiled from source and I get the same error. What I do notice is the git# number is different than the prebuilt wheels, maybe this is causing a conflict with the pytorch version? or maybe it's compiling the wrong ipex?
This returns 2023-08-08 15:41:59,126 - intel_extension_for_pytorch.cpu.launch.launch - WARNING - Neither of ['tcmalloc', 'jemalloc'] memory allocator is found in ['/home/dan9070/anaconda3/envs/IPEX/lib/', '/home/dan9070/automatic/venv/lib/', '/home/dan9070/.local/lib/', '/usr/local/lib/', '/usr/local/lib64/', '/usr/lib/', '/usr/lib64/', '/usr/lib/x86_64-linux-gnu/'].
Same as the initial
sudo apt install jemalloc-dev
2023-08-08 15:43:32,426 - intel_extension_for_pytorch.cpu.launch.launch - INFO - Use 'auto' => 'jemalloc' memory allocator.
It's libjemalloc-dev
Which worked
Sadly however
both alloc modules give the same error
--use-ipex' returned non-zero exit status 139.
I just did a brand new install, and only set default parameters.
I think you need to export to path https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/10117
Same error as above.
The person in that issue post has a completely different error to mine.
okay fellas, I FINALLY figured out what was wrong. My wheels were fine, there is an added line in the __init__.py file in the intel extension that called for something that doesn't exist. I got it running on windows, UI loaded in 26.s and first inference was almost instant.
What is a good place to upload a wheel file?
If you got the WebUI fully running and generating on windows, I want to get my hands on that fast.
@keen marsh Upload it to dropbox or google drive
im excited 
Also if you want to use this in the sd env, just delete the two ipex folders in the venv and copy over the main ones. And if you have a wheel already just edit out that line
Good news is that it loads much faster, bad news is that so far it's about 2it/s slower. maybe it still needs time to warm up I dunno, or maybe the comment does something. let me know how it works for you
I guess I also should have mentioned to run it in the oneapi environment as well...lol well i'm sure most of yall know that one.
Also, haven't tested it with SDXL yet
Downloaded your .whl, put the anaconda3 environment's IPEX folders inside the venv folder for SD
I get this error like I did before "C:\Users\dbs_5\OneDrive\Desktop\automatic\venv\lib\site-packages\intel_extension_for_pytorch\bin\intel-ext-pt-gpu.dll" or one of its dependencies.
You have to edit the init file and comment out that line
oH.
1.) Run pip install --force-reinstall intel_extension_for_pytorch-2.0.110+git9fccbf1-cp310-cp310-win_amd64.whl
2.) Locate the __init__.py file in your intel extension for pytorch folder:
"your_python_directory\Lib\site-packages\intel_extension_for_pytorch\__init__.py"
3.) Comment out line 100:
#from . import _inductor
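Steps 2 and 3 can be scripted. The sketch below matches the import by its text rather than by line number, on the assumption that the line may not sit at exactly line 100 in every build; the function name is hypothetical, and the path you pass in is the `__init__.py` from step 2. Back the file up first:

```python
from pathlib import Path

TARGET = "from . import _inductor"


def comment_out_import(init_path, target: str = TARGET) -> bool:
    """Comment out every line that is exactly `target`; return True if the file changed."""
    path = Path(init_path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = False
    for i, line in enumerate(lines):
        if line.strip() == target:
            # Preserve the original indentation in front of the comment marker
            indent = line[: len(line) - len(line.lstrip())]
            lines[i] = f"{indent}# {line.lstrip()}"
            changed = True
    if changed:
        path.write_text("".join(lines), encoding="utf-8")
    return changed


# Usage (path is an example, adjust to your environment):
# comment_out_import(r"your_python_directory\Lib\site-packages\intel_extension_for_pytorch\__init__.py")
```

Running it a second time is harmless, because an already commented line no longer matches the target text.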
Thank you.
No doubt, let me know how it goes. So far it traded off load time for speed for me
It's still not bad, but 4 it/s vs 6 it/s at 512x512 sd1.5
Will try sdxl in a bit
Commented out line 100 in both the main site package in my conda environment and the one inside the venv after I had already overwritten it with the conda one I installed
Same error.
i got this error too, use intel prebuilt torch, it can work with ipex aaron uploaded
Intel prebuilt torch?
python -m pip install --force-reinstall torch==2.0.0a0 -f https://developer.intel.com/ipex-whl-stable-xpu
and make sure you set env variables first:
{ONEAPI_ROOT}\setvars.bat
{ONEAPI_ROOT}\mkl\2023.2.0\env\vars.bat
{ONEAPI_ROOT}\compiler\2023.2.0\env\vars.bat
{ONEAPI_ROOT}\compiler\2023.2.1\env\vars.bat
2023.2.1 is optional i believe
in the future these can be integrated to sd.next
ill look into it after i had my lunch
Try running in conda, i only get that error outside of one api
There is no env or var in 2023.2.1 for me
I use latest as its a symlink to the newest version so when you upgrade it still works
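Using `latest` instead of a fixed version number, the list of environment scripts can be generated like this. A sketch only: the folder layout (`mkl\<version>\env\vars.bat`, `compiler\<version>\env\vars.bat`) is taken from the messages above, and `ntpath` is used so the Windows-style paths come out the same on any OS. The function name is illustrative:

```python
import ntpath


def oneapi_env_scripts(oneapi_root: str, version: str = "latest") -> list:
    """Return the setvars/vars.bat paths mentioned in the thread for a given oneAPI root."""
    # ONEAPI_ROOT often ends with a backslash; strip it so join() behaves predictably
    root = oneapi_root.rstrip("\\/")
    return [
        ntpath.join(root, "setvars.bat"),
        ntpath.join(root, "mkl", version, "env", "vars.bat"),
        ntpath.join(root, "compiler", version, "env", "vars.bat"),
    ]


for script in oneapi_env_scripts("C:\\oneAPI\\"):
    print(script)
```

Each printed path still has to be run in the same command prompt that later starts the webui, since the variables only live in that shell.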
Can you screen shot your environment when you load
What do you mean "When I load"
Lol
I open up Anaconda, run the environment I made for IPEX in python 3.10
aka conda activate IPEX
I go to the oneapi folder and activate all three needed setvars
go to the automatic folder and run webui.bat --use-ipex
With line 100 of the site-packages intel extension for pytorch __init__.py commented out.
I have that, yes.
Mine>
Initializing Visual Studio command-line environment...
Visual Studio version 16.11.28 environment configured.
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\"
Visual Studio command-line environment initialized for: 'x64'
: advisor -- latest
: compiler -- latest
: dal -- latest
: debugger -- latest
: dev-utilities -- latest
: dnnl -- latest
: dpcpp-ct -- latest
: dpl -- latest
: ipp -- latest
: ippcp -- latest
: mkl -- latest
: tbb -- latest
: vtune -- latest
:: oneAPI environment initialized ::
Are you using Conda?
Yes.
Did you delete the old python?
What do you mean "Delete the old python" lmao
I had to delete python3 completely and set conda to path
Why would you even have to do that. The conda environment has its own version of python set.
conda create -n IPEX python=3.10
Why, who knows
#1127742927347666964 message this is what I did
So I take it you never got it to work in windows at all right? If so that's likely why
Well, SDXL is just generating black images for me.
okay, got model cpu offload to work! sequential didn't work. (was the opposite before)
well 3 days of compiling and messing with stuff, time for a break. Hope those wheel files work for yall.
woohoo
finally
4 it/s
cant say it has impressed me but it runs
and it uses less sysram as you dont need to host a vm
now i needa find out if the function i commented out has anything to do with the slow speed
probably not, ill see what i can do nonetheless
very cute for an H model
wink

yeah, I'd like to get back up and running with something, but I've successfully ruined all working setups i've had 
you can start new!
I spent 3 days doing that 
i promise it wont be (too) hard
not sure what I'm not doing right, but on my linux setup I'm getting weird outputs
and slow inference speeds
maybe soon, I'm waiting for y'all to baby proof that before I ruin my system 
I found out how to compile from different commits, as the one they use seems to be from 2 weeks ago, so that could be why speed is slower. It's also in another branch so I am not even sure it would work lol. But I have compiled enough lol
yeah tryna compile that stuff gives me mental breakdown
they want market share they should make it user friendly
It seems like a small team
Yeah, definitely.
in argonne theres 10000s of pvc needed to be run
Consumer ai is pretty new, developers will still likely prefer Linux
yeah they did a pretty good job optimizing sdxl, lemme test diffuser backend for sd1.5 too
Will it work with standard models?
sure
(half sure)
ok it breaks
not really
speed is good
6it/s
vs original backend
but i like original better
Image output was bad?
the girls are still very cute from what i can tell
Lol
yeah commenting out from . import _inductor in __init__.py should have no effects whatsoever
i dont see other codes depending on this import
so this is it, 4it/s
it may just be the commit, I have the last two both are slower. The one in the prebuilt is from 2 weeks ago and is in a branch
could be AOT on windows too
I think it states it may have a performance penalty? I could be making that up in my head though lol
