#Running invoke on Linux on AMD GPU (6900xt)
562 messages · Page 1 of 1 (latest)
bump
bump
I'm on ubuntu 22.10 with a 6750XT and InvokeAI runs fine.
Ping me if you need help when you're installing it
@vital cargo I have the same problem. I tried both the automatic and the manual installation; the script detects the GPU, but when I run it, it uses the CPU.
I'm on Ubuntu 22.04, with the latest version of ROCm and the Invoke source from GitHub.
I'm busy for the next 2 days but I'd be available to check for a solution with you @vagrant spire Thursday(?)
I think I found something: I cannot download PyTorch for ROCm 5.3, so I'm trying to use 5.1.1. Anyway, it's ok for me (:
The installer already uses it. I removed everything, purged the pip cache, and I'm installing again.
Ok, I reinstalled and it downloaded ROCm 5.1.1, but when it starts it seems to be loading CUDA and I don't understand why...
UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
return torch._C._cuda_getDeviceCount() > 0
I also checked whether ROCm is installed, and from rocm-smi everything seems ok...
======================= ROCm System Management Interface =======================
================================= Concise Info =================================
GPU Temp (DieEdge) AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU%
0 60.0c 32.0W 500Mhz 1000Mhz 0% auto 264.0W 7% 0%
============================= End of ROCm SMI Log ==============================
This is for AMD GPUs on Linux where ROCm is installed but no GPU is recognized when starting invoke.sh, which fails with the error "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
If you have ROCm installed and rocminfo reports a GPU, you can add
"HSA_OVERRIDE_GFX_VERSION=10.3.0"
before running invoke.sh.
For example:
$ HSA_OVERRIDE_GFX_VERSION=10.3.0 /home/invokeai/invoke.sh
This will allow the script to run and use your GPU.
This is on my setup: Ubuntu 22.04 and ROCm 5.3.3 with an RX 5700XT GPU.
This is what I had to do to get it running on Ubuntu with a 5700XT.
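For anyone who prefers to launch from a script, here is a minimal Python sketch of the same idea. The helper names env_with_gfx_override and launch are made up for illustration, and the invoke.sh path is simply the one from the example above. It only sets the variable for the launched process and does not change anything system-wide.

```python
import os
import subprocess


def env_with_gfx_override(version="10.3.0", base_env=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    Hypothetical helper: the variable name and value come from the thread,
    everything else is illustrative.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = version
    return env


def launch(command, version="10.3.0"):
    """Run `command` (a list of arguments) with the override applied
    to that one process only, and return its exit code."""
    return subprocess.run(command, env=env_with_gfx_override(version)).returncode


# Example call (path taken from the message above, adjust to your install):
# launch(["/home/invokeai/invoke.sh"])
```

Whether 10.3.0 is the right value for a given card is something the thread only confirms for the 5700XT and 5600XT; treat it as a starting point.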
I already tried, but nothing so far...
I think I need to check the ROCm version I installed, maybe downgrade it.
I have ROCm 5.4.1, maybe it's better to roll back? @red gulch
I ran rocminfo again:
Agent 1
Name: AMD Ryzen 7 2700X Eight-Core Processor
Agent 2
Name: gfx1030
So far...
I compiled the latest PyTorch, but Invoke removed it and installed 1.13; that's fine.
I rolled back to 5.3.3 and even Python doesn't detect the GPU in a simple script:
import torch
# ROCm builds of PyTorch expose the GPU through the "cuda" device API
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
x = torch.rand(5, 3)
print(x)
this is sad
I found something.
I installed with sudo pip3 torch [...] and ran the script with sudo, and now I can see the GPU with it,
so I think it's something related to permissions that I missed or configured badly (:
I used 5.3.3 from the ROCm site. After you run the first script on the installation page for Ubuntu, scroll down to the section for a single install; there is an option for multiple use cases:
sudo amdgpu-install --usecase=hiplibsdk,rocm
I removed everything, installed 5.3.3 and ran that as well. I also removed my user from the video group and re-added it, and now I can use PyTorch from Python.
I tried the manual installation with no luck, even with the string you shared. Now I'm trying the automatic install script, we'll see.
I used auto myself
It might give you the same error at the end of the auto script when it attempts to set up invoke
seems ok, I'm downloading the dataset again, but this time it wrote cuda instead of CPU
I'm sending you good linux and amd vibes, the dozens of us need all the help we can get
AMD is a hard platform, but I wanted to test this AI. I'm so curious, I didn't expect to have to fight to get it running xD
I installed Linux again after a very long time and it seems to run right now.
Now I need to investigate how to configure this https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
now I'm using the GPU (:
idk why I don't need your string, maybe because it isn't required with the 6900XT, who knows?
One thing I noticed is that I must recompile OpenCV for patchmatch.patch_match to fix the warnings/errors. I did it before, but I removed everything to start again, so I'll do it later.
If anyone needs it, fixing the MIOpen warning is pretty easy:
sudo apt list miopenkernels*
From the list, select the version you want to use. I installed miopenkernels-gfx1030-36kdb because my GPU is the 6900XT
(
not fixed yet, I need to work on it because sometimes it rebuilds the SQLite DB
found something here: https://github.com/ROCmSoftwarePlatform/MIOpen/issues/1606
)
I fixed the Patchmatch warnings in an easier way than compiling OpenCV:
sudo apt install libopencv-dev
That put opencv4.pc in the pkgconfig folder instead of opencv.pc,
so I decided to link it:
cd /usr/lib/x86_64-linux-gnu/pkgconfig/
sudo ln -s opencv4.pc opencv.pc
Then I relaunched Invoke and it compiled and loaded correctly.
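Before creating the link, it can help to confirm which of the two files is actually present. The sketch below is only a read-only check; the function name opencv_pc_status is hypothetical, and the directory is the one used in the commands above, which may differ on other distributions.

```python
from pathlib import Path


def opencv_pc_status(pkgconfig_dir):
    """Report which OpenCV pkg-config files exist in `pkgconfig_dir`.

    Returns "ok" if opencv.pc is there, "needs-symlink" if only
    opencv4.pc is there, and "missing" if neither is found.
    """
    directory = Path(pkgconfig_dir)
    if (directory / "opencv.pc").exists():
        return "ok"
    if (directory / "opencv4.pc").exists():
        return "needs-symlink"
    return "missing"


# Directory taken from the commands above (Ubuntu on x86_64).
print(opencv_pc_status("/usr/lib/x86_64-linux-gnu/pkgconfig"))
```

A result of "needs-symlink" matches the situation described above; "missing" would suggest libopencv-dev is not installed yet.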
@red gulch I also fixed the manual installation.
I removed the torch lines from requirements.txt (I could also have edited them, but I checked it before)
-r environments-and-requirements/requirements-base.txt
# Get hardware-appropriate torch/torchvision
#--extra-index-url https://download.pytorch.org/whl/rocm5.2 --trusted-host https://download.pytorch.org
#torch>=1.13.1
#torchvision>=0.14.1
-e .
then I ran:
pip --python invokeai install --use-pep517 --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.3
and after:
pip --python invokeai install --use-pep517 .
So I can use the 5.3 version.
Great to hear! Glad everyone is working to get AMD working since they can't seem to get their mess together and fix it.
It's still sad that AMD doesn't have proper support on Windows, and in general it's a pain to use on Linux too xD
I'm pretty happy with the 6900xt, but software side, Nvidia is still better imho
100% I have a 6800xt in my main rig and it runs great but always feels like it's just on the verge of a crash or driver reset.
Hey!
Can you please write what steps helped you get rid of the error 'HIP initialization: Unexpected error from hipGetDeviceCount()' ?
I'm using Ubuntu 20.04 and ROCm 5.4.3 with a 6600XT GPU. I already tried the manual and automatic installs but got the same result: Unexpected error from hipGetDeviceCount()
I will be very grateful for your help!
Hey, I removed ROCm 5.4.3, installed 5.3, and used the PyTorch command as I wrote in my message
THIS IS FOR MANUAL INSTALL.
I removed the cmd for torch from requirements.txt (I could also edit it, but I checked it before)
-r environments-and-requirements/requirements-base.txt
# Get hardware-appropriate torch/torchvision
#--extra-index-url https://download.pytorch.org/whl/rocm5.2 --trusted-host https://download.pytorch.org/
#torch>=1.13.1
#torchvision>=0.14.1
-e .
then I run:
pip --python invokeai install --use-pep517 --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.3
and after:
pip --python invokeai install --use-pep517 .
So I can use 5.3 version
Did you add your user to video group?
Yes
rocm-smi output? @river sonnet
@vagrant spire Still trying to install ROCm 5.3.3..
AMD's repo.radeon.com has horrible download speed and the connection often fails, so the amdgpu-install -y --accept-eula --usecase=rocm command can't finish downloading
can you show me nano /etc/apt/sources.list.d/rocm.list?
deb [arch=amd64] https://repo.radeon.com/rocm/apt/5.3.3 jammy main
I have 22.04; you need to replace jammy with the 20.04 codename (focal)
deb [arch=amd64] http://repo.radeon.com/rocm/apt/5.3.3 focal main
That is correct. For the use case, it's also better to use the complete one, like @red gulch said:
sudo amdgpu-install --usecase=hiplibsdk,rocm
if I'm not wrong
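To make the jammy/focal difference concrete, here is a small sketch that builds the rocm.list line for a given Ubuntu release. The function name rocm_apt_line and the lookup table are illustrative; the line format and the 5.3.3 repository path are copied from the messages above, and only the two releases mentioned in this thread are covered.

```python
# Ubuntu release -> codename, limited to the two releases in this thread.
UBUNTU_CODENAMES = {"20.04": "focal", "22.04": "jammy"}


def rocm_apt_line(ubuntu_release, rocm_version="5.3.3"):
    """Build the `deb` line for /etc/apt/sources.list.d/rocm.list."""
    codename = UBUNTU_CODENAMES[ubuntu_release]
    return (
        "deb [arch=amd64] "
        f"https://repo.radeon.com/rocm/apt/{rocm_version} {codename} main"
    )


print(rocm_apt_line("20.04"))
print(rocm_apt_line("22.04"))
```

The script only prints the line; writing it into rocm.list still needs sudo and a following sudo apt update.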
anyway, from Italy I don't have many problems with the AMD repo
perhaps this problem appears only in Ukraine
It's not a good time right now in Ukraine unfortunately
Agree
I'll try it tomorrow - maybe repo.radeon.com will work better
Thank you for your help!
np, I'm here. Now I'm trying to play with GPT-2, but ROCm is hard, mainly with older code
Hey, @vagrant spire !
I finally installed ROCm 5.3.3 (with use cases as @red gulch said - sudo amdgpu-install --usecase=hiplibsdk,rocm )
hey
Can you please provide more info about this operation: "removed the cmd for torch from requirements.txt (I could also edit it, but I checked it before)"
How did you do that?
I'm following the manual installation instructions https://invoke-ai.github.io/InvokeAI/installation/020_INSTALL_MANUAL/#linux-install-with-a-runtime-container
And now I'm on the step where I should run the command pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/rocm5.2. Am I correct that I should make these changes at this stage?
P.S. I apologize in advance for some stupid questions, I don't have much technical knowledge in this area
edit your requirements.txt
-r environments-and-requirements/requirements-base.txt
-e .
I left only those two
pip --python invokeai install --use-pep517 --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.3
pip --python invokeai install --use-pep517 .
it's better to use 5.3 with 5.3
oh, they changed the manual, nice
if you need something else, let me know
maybe it's better to use this command:
pip install InvokeAI --use-pep517 --extra-index-url https://download.pytorch.org/whl/nightly/rocm5.3
should be the same, but they changed a lot in the manual installation, so I need to test it (:
I tested the command and it installs the "normal" torch instead of the ROCm version
so...
first you must run
pip install --use-pep517 --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.3
then
pip install InvokeAI --use-pep517
I'm installing Invoke again so I'm sure that it's working
I'm downloading only SD1.5 model
So I should manually create requirements.txt and add this
-r environments-and-requirements/requirements-base.txt -e .
into it
But there is no such file environments-and-requirements/requirements-base.txt
https://github.com/invoke-ai/InvokeAI/issues/2499#issuecomment-1416415056
"As of 2.3.0 you no longer need to use requirements.txt, and support has been removed for requirements.txt and environment.yaml."
so I can just run these two commands, without any prerequisites and changes, right?
Right, with the new manual it isn't required anymore, from what I saw
should be enough:
pip install --use-pep517 --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.3
then
pip install InvokeAI --use-pep517
I'm trying the first generation right now
ok it's working
@river sonnet delete the invoke folder, start again and use only those 2 commands
<@&1020322764311253052> can someone update the manual with:
pip install --use-pep517 --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.3
and after that command
pip install InvokeAI --use-pep517
because AMD hates its users a bit, and with the normal command you still get the CUDA version and not ROCm
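A quick way to see which build pip actually installed is to look at torch.version from inside the virtual environment. This is a minimal sketch, assuming only that torch.version.hip is set on ROCm builds and torch.version.cuda on CUDA builds; the function name torch_backend is made up for this example.

```python
def torch_backend():
    """Return "rocm", "cuda", "cpu" or "not-installed" for the torch
    package visible to the current Python interpreter."""
    try:
        import torch
    except ImportError:
        return "not-installed"
    # torch.version.hip is a version string on ROCm builds, None otherwise.
    if getattr(torch.version, "hip", None):
        return "rocm"
    # torch.version.cuda is a version string on CUDA builds, None otherwise.
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu"


print(torch_backend())
```

If this prints "cuda" or "cpu" after installing InvokeAI, the ROCm wheel was probably replaced or never installed, which is the situation described above.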
Yeah amd are just about useless with this.
mainly it's unfair, they merged CUDA into the main package and not ROCm :/
TensorFlow is the same: they merged tf-gpu into the main tf, but ROCm is another package and only for 2.x, so I'm rewriting some code to test GPT-2 fine-tuning
Godspeed to you!
@river sonnet any update?
Download in progress
We're here for you! I struggled through multiple OS installs and week of installing and reinstalling rocm distros,
There is no longer a requirements.txt. We've changed the build system.
yes, I know, I reinstalled and tested, check the topic where I mentioned the maintainer group
I tried to generate an image, but it is still using the CPU instead of the GPU
which model do you have?
cpu
6600 XT
mobile?
no
can you show me the groups command?
do you have 20.04? If I remember correctly you should have 2 groups for ROCm
the second should be render, I'll check the docs and let you know
yes, it's 20.04
sudo usermod -a -G render $LOGNAME
add this as well
and run the little script again
I think you will also need to use an env variable, as shown by @red gulch
HSA_OVERRIDE_GFX_VERSION=10.3.0 invokeai --web
or something like that
After adding your user to the render group be sure to reboot as well.
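The group check can also be scripted. The sketch below uses only the Python standard library (grp and os); the function name missing_gpu_groups is hypothetical, and the two group names are the ones mentioned in this thread. It reports on the groups of the running process, which is also why a re-login or reboot is needed after usermod.

```python
import grp
import os


def missing_gpu_groups(required=("video", "render"), user_gids=None):
    """Return the names in `required` that the current process does not
    belong to (or that do not exist on this system)."""
    if user_gids is None:
        user_gids = set(os.getgroups()) | {os.getegid()}
    missing = []
    for name in required:
        try:
            gid = grp.getgrnam(name).gr_gid
        except KeyError:
            # The group is not defined on this system at all.
            missing.append(name)
            continue
        if gid not in user_gids:
            missing.append(name)
    return missing


print("missing groups:", missing_gpu_groups() or "none")
```

An empty result does not prove that ROCm works, it only rules out this one permission problem.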
Grats! Good work @vagrant spire
@vagrant spire - It looks like there was a lot that went into getting this to finally work. Thanks for this! I wonder if you could write a brief summary of the necessary steps for the next version of the installation documentation?
ok, I'll do it again and write down every step as soon as possible
@true shard those are the steps I did to launch it. ROCm is also a pain because the AMD manual isn't so clear
Thanks for the steps, i'll try this too on a fresh install! As it has been a real pain until now. The GPU is working but the image resulting is a grey square, so not that much of a Win.
I also managed to run Stable Diffusion on windows using nod-ai/Shark and a special version of the v22 AMD Drivers (note: updated v23 drivers supports now IREE compiler using MLIR interface on Vulkan, just saw this, maybe there is a way there too to use InvokeAI through that on windows in a next future?)
Just a question, which version of the AMDGPU drivers did you install to make it work ?
ROCm 5.3. I tried 5.4 but last time I didn't get it working, so I'm staying with 5.3 until PyTorch supports 5.4 (:
About Windows: yes, MS tried to implement a layer to let AMD run on Windows too, but you lose performance (there is another webUI that runs smoothly on Windows, both native and in bash)
About the grey square, I think it's related to @red gulch's command to add before launching Invoke. Which GPU do you have?
Sapphire Pulse RX 5600XT 6gb
You mean HSA_OVERRIDE ? Yeah, i used this
6GB is very little memory for running AI models
but like I said, you need to use:
HSA_OVERRIDE_GFX_VERSION=10.3.0 invokeai --web
did you get grey images on Windows?
It works "fine" on windows with the way I found, there are issues with the software. But I got Images, yes (I mean actual images, not grey square) I suspect a wrong install of ROCm or the driver or kernel module though, on linux.
I only said that it's very little memory
sometimes I have problems with my 6900XT; in fact I'm using Google Colab to train audioLM
I saw people online who used an RX580 just fine, so the 5600 XT shouldn't be a problem xD
on windows, it took 30sec, for a 512x512, Euler scheduler, AnythingV3 model, CFG 6, 50 steps. Compared to 20 min, I say it is progress xD. Well it took 15 min for a first compile of I don't know what, but that's once. Not too bad I'd say for a 6gb :p
I think that the first run loads the model into VRAM
I see
Doesn't explain why grey square on linux (for now) and not on windows with the alternative software and drivers :p
uhm tbh I got a few when I didn't set it up properly, so I think there is something incorrect in your config
I'll have to try your way and I hope that'll solve it for the author of this post too
mainly I got grey squares with 5.4
Hmm, know what? I installed 5.4.1 now that you mention it ^^;
I also started with the latest version, but then I rolled back to 5.3 because I saw that PyTorch supports it and 5.2
The InvokeAI installer installs 5.2 no ?
Hm, because at setup I chose AMD GPU, and during the setup process I saw ROCm 5.2
me too
I used the manual installation, but I got the normal torch and not torch-rocm; because of that I install torch-rocm manually before Invoke
and the pain is also there for other types of models... I trained GPT-2 as a chatbot, but I edited the code for ROCm, because all the code was for normal torch
Seems like a lot of fun there xD
yes, I like the AI stuff tbh. I'm not good enough to write a model from the ground up, but I'm starting to see how they work. I'd like to test musicLM, but it is based on audioLM and requires a lot of training steps, and I'm studying how to get enough CU to train it without spending too much xD
Especially now that the pricing is still quite high. Well, it's better, but still...
cloud isn't so expensive, but it still is for a hobby project
Agree ^^
@grim sierra did you test?
I didn't, I'm at work. I might be able to do that this weekend or on Friday :)
There is a launch argument for Float32 that cleared up the grey block issue on linux
Really?? What is it ?
The first render was a solid grey picture with my card. I had to go into invoke.init and add the "--precision float32" argument
No quotes of course
This fixed the issue on my 5700xt
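For those who would rather not edit the file by hand, here is a small sketch that adds the argument only if it is not already there. It assumes the init file holds one command-line argument per line, which is an assumption on my part, and the function name ensure_argument is made up. The path to invoke.init is not given in this thread, so it has to be passed in.

```python
from pathlib import Path


def ensure_argument(init_path, argument="--precision float32"):
    """Append `argument` on its own line unless an identical line exists.

    Returns True if the file was changed, False if it already had the line.
    Assumes one argument per line; back up the file before using this.
    """
    path = Path(init_path)
    lines = path.read_text().splitlines() if path.exists() else []
    if any(line.strip() == argument for line in lines):
        return False
    lines.append(argument)
    path.write_text("\n".join(lines) + "\n")
    return True


# Example call with a placeholder path (replace with your real init file):
# ensure_argument("/path/to/invoke.init")
```

Running it twice is harmless, the second call leaves the file untouched.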
I'll test this evening!!
You might just be saving me from reinstalling a Linux distro for the 4th time
Best of luck, once I'm home I can get the exact line i added it at.
I've gone from Ubuntu to manjaro, back to Ubuntu multiple times, amd has made this such a mess.
True!
I have to reinstall Automatic1111 for the deforum support. I had issues with it on Ubuntu, hopefully it will go a bit smoother this time.
May Tux be with you!
AMD ROCm is a little bit messy, but from what I read on their GitHub there isn't much budget for it, so...
Meh, I might retry that after a fresh install. Looks like I broke something: I'm trying to reconfigure invokeai first but already have issues lol. Hope that the author of the main post had more luck than me
Update: yup, I broke something. The UI starts, the model loads, but AttributeError: 'Combine' ... Anyhow, I have to do a new install
Is it better on Arch or Ubuntu?
I did a new install, following your steps. However, I got an issue with dkms while the kernel modules were being built. Don't understand...
I've installed all the rest: AMD drivers, ROCm 5.3.2 (maybe I should have gotten 5.3 or 5.3.3, will uninstall tomorrow) etc. up to invokeai; then at configure: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!". Then the installer aborted. I'm desperate now. Ubuntu is 22.04.1, kernel is 5.15 something.
I got that error installing mine. Go ahead and start up invokeai with the HSA command in front of it. It will finish installing, with the override in effect for the installer
I did use 5.3.3 in my install, it seemed to work the best
I need to check my version tbh
But from what I saw, the 6800/6900XT are the only two GPUs that have fewer problems with ROCm. I hope that the pro version will still be better than those
Package: rocm-libs
Version: 5.3.3.50303-99~22.04
and I'm using source
https://repo.radeon.com/rocm/apt/5.3.3 jammy/main amd64 Packages
will uninstall later today and install 5.3.3
Did you guys have the DKMS issue?
Setting up amdgpu-dkms (1:5.18.2.22.40.50303-1504718.22.04) ...
Loading new amdgpu-5.18.2.22.40-1504718.22.04 DKMS files...
Building for 5.15.0-60-generic
Building for architecture x86_64
Building initial module for 5.15.0-60-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/amdgpu-dkms.0.crash'
Error! Bad return status for module build on kernel: 5.15.0-60-generic (x86_64)
Consult /var/lib/dkms/amdgpu/5.18.2.22.40-1504718.22.04/build/make.log for more information.
dpkg: error processing package amdgpu-dkms (--configure):
 installed amdgpu-dkms package post-installation script subprocess returned error exit status 10

this afternoon I'll go back on Linux and I can help better if you can stay on
Once I'm home I'll take a look and see if I get the same issue
When I tried to install in 12/2022 (Ubuntu (non LTS) 20.10 + RX 6900) I had to use the "--no-dkms" option. Otherwise it threw me a bunch of errors which I didn't know what to make of / how to fix.
So I installed with: "sudo amdgpu-install --usecase=rocm,hip,mllib --no-dkms"
same, but it tries to build the kernel module
and it worked? I mean SD on GPU? Anyhow, I'll try that too if there is still an issue. I'm gonna try like "with the error"
which Ubuntu version are you using?
with the rx6900, yes. (just updated to invoke 2.3.0 yesterday. I think i'm on rocm 5.4.0 by now, but need to check. On windows atm.)
Here are my notes how I installed the rocm driver: #1053043950262427648 message
Jammy
22.04 ok
Thank you for that, I'll check it
I did the same too, but I don't have the problem with --no-dkms. I checked the amdgpu-install version:
amdgpu-install/jammy,jammy,now 22.20.50205-1511376~22.04 all [installed]
to check yours, apt list amdgpu-install
amdgpu-install/jammy,jammy,now 5.3.50303-1504718.22.04 all [installed]
I don't understand, what should I do? I've downloaded the rocm package from the ROCm website
(well, with the command line they give for the download)
Should I leave ROCm installed?
It does something, but no progress on the invokeai bash window ^^;
Did you download the amdgpu-install from AMD ?
yes, but I also have their repository
one minute, I'm checking the docs
they finally updated the docs for 5.4, with 22.04 in them
can you run this command?
cd /etc/apt/sources.list.d && ls
amdgpu.list amdgpu.list.save amdgpu-proprietary.list amdgpu-proprietary.list.save rocm.list rocm.list.save
my rocm.list:
deb [arch=amd64] https://repo.radeon.com/rocm/apt/5.3.3 jammy main
my amdgpu.list:
deb https://repo.radeon.com/amdgpu/latest/ubuntu jammy main
before, I was using 22.20.5 instead of latest; that's why my amdgpu-install is different from yours
now I updated and upgraded after adding latest, and I'll test Invoke after a reboot
My rocm.list:
deb [arch=amd64] https://repo.radeon.com/rocm/apt/5.3.3 jammy main
My amdgpu.list
deb https://repo.radeon.com/amdgpu/5.3.3/ubuntu jammy main
ok, is still working
^^
change the amdgpu one, that's better
so the .deb from amd website I guess ?
about ROCm idk, like I said, I'm staying with 5.3 until PyTorch supports 5.4
no, you can edit it directly with nano
sudo nano amdgpu.list
and replace 5.3.3 with latest
then run:
sudo apt update
sudo apt upgrade
sudo apt install amdgpu-install
this way I didn't get any issues, only a few warnings about missing firmware
damn, forgot to uninstall amdgpu-install
it should be upgraded by the upgrade command
Ok, it tries to build kernel module
yes, on my side too. If you see W:, those are warnings
Woooooooooooooh!! I only got this
Can't load /var/lib/shim-signed/mok/.rnd into RNG
40E7961E1E7F0000:error:12000079:random number generator:RAND_load_file:Cannot open file:../crypto/rand/randfile.c:106:Filename=/var/lib/shim-signed/mok/.rnd
The rest just compiled
strange thing
indeed
did you already install invoke?
something related to openssl
yes, i did and it loads, but still "Preparing" to render
maybe it compiles something I don't know of ^^
first run is always slow
check the console
i'm gonna wait a bit
maybe is downloading something
nope
2023-02-16T16:42:58+0100: Disconnected from server
and in the console, it is
terminal, sorry, not the console on the UI
^^; TBH, on windows, with the other solution I got, it takes 20 min for the first compile. However, I got nothing in the terminal ^^;
Are u kidding me xD
just updated xD
i'm not sure I should ^^;
idk I do everything with terminal xD
it just popped-up ^^
It had been 5 years since I used Linux other than bash on Windows 10. I have to admit I must say ty to AMD for forcing me to go back to Linux XD
I'd like to test the 7900 XTX in the future
me too, but the budget is not there xD
and i'd have too much of a CPU bottleneck
yeah, I already have some problems with my 2700X and 6900XT
I'm waiting for the all-new X3D, but changing the whole platform is expensive right now
lol, last time I checked I picked a new AMD AM5 socket CPU (7600 I think), not the most expensive motherboard, 32GB of DDR5, and a ~700W power supply, and I was around 800€ xD
I was literally on my ass
I need 64GB of RAM and I'd like to buy a mobo with at least 3 PCIe 5 M.2 slots
that is expensive
hu hu, very ^^;
I'm starting to get discouraged. Nothing is happening: I tried without --precision float32 and it crashed, then used float32 again, and now it is doing something, I don't know what. Nothing appears in the terminal and the server is disconnected from the webui xD
AND... Ubuntu upgraded it to ROCm 5.4
was not working before that anyway
It's been half-an-hour, i think it is not doing anything actually π
It is soooooooooo annoying to be that close and nothing works
uhm
I don't understand why you have ROCm 5.4
run this
apt show rocm-libs -a
mine is:
Package: rocm-libs
Version: 5.3.3.50303-99~22.04
Priority: optional
Section: devel
Maintainer: ROCm Libs Support [email protected]
Installed-Size: 13,3 kB
Depends: hipblas (= 0.52.0.50303-99~22.04), hipfft (= 1.0.9.50303-99~22.04), hipsolver (= 1.5.0.50303-99~22.04), hipsparse (= 2.3.2.50303-99~22.04), miopen-hip (= 2.18.0.50303-99~22.04), rccl (= 2.12.12.50303-99~22.04), rocalution (= 2.1.2.50303-99~22.04), rocblas (= 2.45.0.50303-99~22.04), rocfft (= 1.0.18.50303-99~22.04), rocrand (= 2.10.9.50303-99~22.04), rocsolver (= 3.19.0.50303-99~22.04), rocsparse (= 2.3.3.50303-99~22.04), rocm-core (= 5.3.3.50303-99~22.04), hipblas-dev (= 0.52.0.50303-99~22.04), hipcub-dev (= 2.10.12.50303-99~22.04), hipfft-dev (= 1.0.9.50303-99~22.04), hipsolver-dev (= 1.5.0.50303-99~22.04), hipsparse-dev (= 2.3.2.50303-99~22.04), miopen-hip-dev (= 2.18.0.50303-99~22.04), rccl-dev (= 2.12.12.50303-99~22.04), rocalution-dev (= 2.1.2.50303-99~22.04), rocblas-dev (= 2.45.0.50303-99~22.04), rocfft-dev (= 1.0.18.50303-99~22.04), rocprim-dev (= 2.10.9.50303-99~22.04), rocrand-dev (= 2.10.9.50303-99~22.04), rocsolver-dev (= 3.19.0.50303-99~22.04), rocsparse-dev (= 2.3.3.50303-99~22.04), rocthrust-dev (= 2.10.9.50303-99~22.04), rocwmma-dev (= 0.7.0.50303-99~22.04)
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 1.000 B
APT-Sources: https://repo.radeon.com/rocm/apt/5.3.3 jammy/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack
I wish I knew
I'll get back to it but I need a break from it
I think it updated the source list with 5.4
mhm strange, I still have 5.3.3
I think I'm doomed to not use this xD
@vagrant spire I've tried your way and @daring egret's (no-dkms), with ROCm set to 5.3.3 but still with the latest amdgpu-install.
InvokeAI starts, but when I try to use any model (here stable-diffusion 1.5), the UI disconnects from the server after like 2-3 minutes and I'm left with a Terminal loading and showing nothing and a GPU running at 100% for a very long time without anything outputted.
- Did you add your user to the video/render groups?
- If you haven't yet since the rocm installation, try rebooting (I know, I know...).
- After reboot, what does "rocminfo" show? Does it list your gpu?
- Yes
- Yes, can retry though
- Attached (Yup, it does)
Ok, so you have a RX 5600 XT ?
indeed
Did you try this stuff with "export HSA_OVERRIDE_GFX_VERSION=10.3.0"
I did
I can retry by putting it in front before the call to invoke ai
instead of the webui (i.e. don't press 2, but press the number for the developer console)
Can you do that? Enter the developer console?
i.e. start the invoke.sh script, don't press 2 for the web-ui, instead press the number for the developer mode/console
hm, i did the install manually, I don't have invokeai.sh =/
ok
i'm not entirely familiar with the current instructions for manual install
you probably need to activate a virtual environment?
to start the invoke web ui
indeed
ok, please activate the virtual environment
It is activated
then start a "python" shell/console inside that virtual environment. i.e. type "python"
then "import torch"
inside that python shell
and finally type: "torch.cuda.is_available()"
it should return "True"
if it doesn't, then we have a problem with the rocm installation. And if it does return True, then your GPU should work properly and we need to find the error with the invoke AI installation.
Hm, grumpf
"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)
ok. try the following:
activate the virtual environment. type: "export HSA_OVERRIDE_GFX_VERSION=10.3.0"
and then everything above inside that same virtual environment: i.e. "python", "import torch", and "torch.cuda.is_available()"
"True"
ok!
that's already something ^^;
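Since the check above can abort the whole Python shell (the "core dumped" case), one option is to run it in a child process and compare the result with and without the override. This is a minimal sketch under that assumption; probe_gpu is a hypothetical helper name, and it treats any non-zero exit (missing torch, or a hard abort such as hipErrorNoBinaryForGpu) as "failed".

```python
import os
import subprocess
import sys

# One-liner executed in the child interpreter.
PROBE = "import torch; print(torch.cuda.is_available())"


def probe_gpu(extra_env=None, python=sys.executable, timeout=120):
    """Run the torch availability check in a separate process.

    Returns "available", "unavailable", "failed" or "timeout".
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    try:
        result = subprocess.run(
            [python, "-c", PROBE],
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if result.returncode != 0:
        # ImportError or a crash inside the HIP runtime both end up here.
        return "failed"
    return "available" if result.stdout.strip().endswith("True") else "unavailable"


print("without override:", probe_gpu())
print("with override:   ", probe_gpu({"HSA_OVERRIDE_GFX_VERSION": "10.3.0"}))
```

Run it with the Python of the activated virtual environment, otherwise it probes a different torch installation (or none).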
So every time before you start the invokeai webui, make sure that after you activate the virtual environment, you enter the command "export HSA_OVERRIDE_GFX_VERSION=10.3.0"
then start the webui
and lets see if we have more success
what I do is : "HSA_OVERRIDE_GFX_VERSION=10.3.0 invokeai --web"
should it be "export HSA_OVERRIDE_GFX_VERSION=10.3.0 invokeai --web" ?
I'm afraid I'm not a linux guru. I assume that "HSA_OVERRIDE_GFX_VERSION=10.3.0 invokeai --web" does the same as entering the export command before calling the web ui
so if you did that already that way, we probably only confirmed that that command really is necessary
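On the question of the two forms: as far as I understand shell behaviour, putting the assignment in front of a command sets the variable for that one command only, while export sets it for everything started later from the same shell; they are alternatives rather than something to combine on one line. The Python sketch below imitates the difference with child processes; child_sees is a made-up helper name.

```python
import os
import subprocess
import sys

# The child just reports what it sees for the variable.
CHILD = "import os; print(os.environ.get('HSA_OVERRIDE_GFX_VERSION', 'unset'))"


def child_sees(env):
    """Start a child Python process with `env` and return what it prints."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


# Start from an environment where the variable is definitely not set.
base = {k: v for k, v in os.environ.items() if k != "HSA_OVERRIDE_GFX_VERSION"}

# Like `HSA_OVERRIDE_GFX_VERSION=10.3.0 command`: only this child gets it.
one_off = child_sees({**base, "HSA_OVERRIDE_GFX_VERSION": "10.3.0"})

# A later command started from the unchanged environment does not see it.
later = child_sees(base)

print("one-off child:", one_off)
print("later child:  ", later)
```

Either way invokeai receives the variable, so both approaches described above should behave the same for the web UI.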
lemme try again :p
So when you start the webui. It lists some stuff in the terminal window. It should print something like ">> Using device_type cuda"
I get to the UI, i set a prompt, choose 11 steps to be quick, and after
">> ESRGAN Parameters: False
Facetool Parameters: False
Setting Sampler to k_euler (EulerDiscreteScheduler)"
it seems to do nothing but keeps my GPU busy
it is
the first time after start up, it can take some time (i.e. 1 or 2 minutes maybe more)
if it doesn't crash, I would suggest to wait quite a bit just to be sure
The thing is that after 1-2 minutes, the UI shows disconnected and the console shows "Disconnected from server"
while terminal shows still the same message about setting sampler and nothing after it
i'm rebooting to my ubuntu and check how long it takes for me
If you have amdgpu-install 5.4 is fine, you need to check the other command I sent you about rocm-libs
Package: rocm-libs
Version: 5.3.3.50303-99~22.04
Priority: optional
Section: devel
Maintainer: ROCm Libs Support [email protected]
Installed-Size: 13,3 kB
Depends: hipblas (= 0.52.0.50303-99~22.04), hipfft (= 1.0.9.50303-99~22.04), hipsolver (= 1.5.0.50303-99~22.04), hipsparse (= 2.3.2.50303-99~22.04), miopen-hip (= 2.18.0.50303-99~22.04), rccl (= 2.12.12.50303-99~22.04), rocalution (= 2.1.2.50303-99~22.04), rocblas (= 2.45.0.50303-99~22.04), rocfft (= 1.0.18.50303-99~22.04), rocrand (= 2.10.9.50303-99~22.04), rocsolver (= 3.19.0.50303-99~22.04), rocsparse (= 2.3.3.50303-99~22.04), rocm-core (= 5.3.3.50303-99~22.04), hipblas-dev (= 0.52.0.50303-99~22.04), hipcub-dev (= 2.10.12.50303-99~22.04), hipfft-dev (= 1.0.9.50303-99~22.04), hipsolver-dev (= 1.5.0.50303-99~22.04), hipsparse-dev (= 2.3.2.50303-99~22.04), miopen-hip-dev (= 2.18.0.50303-99~22.04), rccl-dev (= 2.12.12.50303-99~22.04), rocalution-dev (= 2.1.2.50303-99~22.04), rocblas-dev (= 2.45.0.50303-99~22.04), rocfft-dev (= 1.0.18.50303-99~22.04), rocprim-dev (= 2.10.9.50303-99~22.04), rocrand-dev (= 2.10.9.50303-99~22.04), rocsolver-dev (= 3.19.0.50303-99~22.04), rocsparse-dev (= 2.3.3.50303-99~22.04), rocthrust-dev (= 2.10.9.50303-99~22.04), rocwmma-dev (= 0.7.0.50303-99~22.04)
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 1.000 B
APT-Sources: https://repo.radeon.com/rocm/apt/5.3.3 jammy/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack
I'm on ROCm 5.4.2 btw, and before that on 5.3.0. So I don't think it has to be some specific version.
I had grey images with 5.4, so I rolled back to 5.3
maybe a combination of CPU/MB/GPU, which for some reasons don't work for this altogether
No, I don't think so, its only config problems
Which graphic drivers do you have, the open or proprietary one ?
(invokeai is sucking 25% of my CPU and 2.5GB of Ram xD)
Uhm, I need to check, because I installed a minimal Ubuntu with the install-drivers option checked
If you see True from the Python code, the way you installed ROCm is not the problem
Because it is working
That's what I think, maybe it is doing something and I just have to wait
If invokeai still doesn't create/show your image after several minutes, what are the latest entries in your terminal window?
GPU is working and has loaded the model on the memory
well it clearly tried to use your gpu to create the image
Can you show us also the terminal where you launch InvokeAI?
So I can check if there are problems with installation
The three last line since the message from the console of the web UI (2023-02-16T20:05:54+0100:Disconnected from server)
And the other console?
at launch of invokeai ?
does the rx5600 have 6GB Memory?
yes
works fine on windows with nod-ai/Shark and latest amd drivers (well, until it doesn't lol)
Everything looks just fine
When I tested 1 month ago, it only worked on a specific version, not the latest; nice that they updated
(Well, unfortunately, it is not working on windows with InvokeAI)
Because it requires writing specific code to use DirectML from Microsoft
And that cuts down performance xD
GPU is working his ass out :
Anyway, you need to follow that GitHub link and install what is missing
Yes, it's better to install that too
Because it's required for the inpainting feature, if I'm not wrong
for inpainting and outpainting
but that's only when you already have an image, right?
Will install patchmatch tomorrow I think, for now I'll have a pizza. I'm frustrated that this is not working. Thank you for your time today!
Yes, it shouldn't be a big deal, but it's better to try everything at this point
so true ^^
I didn't follow the InvokeAI docs, but you can find how I installed it in this topic; follow that guide, which I believe matches what I did
My only remaining suggestions would be to try a fresh (automated) installation for InvokeAI.
I can do that if I just delete the manual one ?
It can be a solution, but I edited requirements.txt to make it work; now they've changed how it gets installed, so you should check whether it installs torch or torch-rocm
Yep
good to know
if you have the diskspace and don't want to delete the old installation completely, because you want to try more with it later or something, you can just rename the old folder
while you try the automated installation
or you install the automated installation to a different folder
you don't have to delete the manual installation if you don't want to
Before launching the auto install, do a pip cache purge
noted !
And keep an eye on torch: if it downloads plain torch 1.13 instead of the ROCm build, you will not use the GPU
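A quick way to check which torch build ended up in the venv is sketched below. It assumes the wheel came from the PyTorch package index, where ROCm builds carry a local version tag such as `1.13.1+rocm5.2`; the helper name `rocm_version_from_torch` is made up for this example and is not part of torch or InvokeAI.

```python
from typing import Optional


def rocm_version_from_torch(version: str) -> Optional[str]:
    """Return the ROCm version encoded in a torch version string, or None.

    Assumes the "+rocmX.Y" local version tag used by the wheels on the
    PyTorch index; a plain "1.13.1" or a "+cu117" tag is not a ROCm build.
    """
    _, _, local = version.partition("+")
    if local.startswith("rocm"):
        return local[len("rocm"):]
    return None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        rocm = rocm_version_from_torch(torch.__version__)
        print("torch version:", torch.__version__)
        print("ROCm build:", rocm if rocm else "no (CPU or CUDA wheel)")
        # ROCm builds still report the GPU through the torch.cuda API.
        print("GPU visible:", torch.cuda.is_available())
```

Run it with the Python from the InvokeAI venv, not the system one. If it reports no ROCm build, the installer pulled the wrong wheel.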
I think that's why I took the manual way
But I think the autoinstaller asks whether you have AMD, NVidia or CPU
I'll check the install script and I'll edit it then
It did for me too, so I guess that shouldn't be a problem.
Should be the same; the important thing is that it installs the ROCm build, and 5.2 is the stable version
I found this article, where they benchmark different GPUs for stable diffusion: https://www.tomshardware.com/news/stable-diffusion-gpu-benchmarks
And it says that the 5600 "failed", so they didn't benchmark it. Whatever "failed" means. Maybe they couldn't get it to run either.
I checked the install script on GitHub and it seems they install the ROCm torch first and then install invoke, so maybe it can work
4090 is a monster, the 7900 xtx is impressive, mine is very bad xD
6GB of VRAM isn't much, plus the 5xxx series isn't very good at ML stuff
You can find some reddit posts or other sites on google claiming that all sorts of AMD GPUs work, for example this one: https://www.reddit.com/r/StableDiffusion/comments/ww436j/howto_stable_diffusion_on_an_amd_gpu/
But it's ~6 months old, and the versions of some libraries, etc. changed in the meantime. For example it also states that the RX480 8GB should work (which I have in an old PC), and it absolutely did not work for me in December 2022, no matter what I tried.
One specific reason seems to be that AMD stated that they no longer support the latest features for rocm on older gpus, and then some other developers dropped support for these gpus entirely in their libraries.
So even If you find older articles stating that the 5600 should work, that might no longer be the case.
Thanks (btw, I installed patchmatch and... nope, not at all, still same behaviour)
Thanks for those links
Well, strange thing is that I can produce stuff in 30 sec on windows with shark and amd drivers
Time to buy a 7900xtx then xD
soooo, i guess that's all for nvidia supremacy
Because it uses vulkan and in some way emulates the process
loool, I wish, my wallet is trying to kill itself xD
Vulkan is great
Thanks to Microsoft there is a way to use AMD GPUs on windows, but then it has to be implemented in the code and I don't know how difficult that is tbh
I need to check how it works (:
Sorry it's not vulkan but dx12
Right, vulkan is on Linux (thanks valve btw)
(they managed to use vulkan and other tricks to enable us to play DirectX games)
Meh vulkan is everywhere, thanks to AMD xD
Then everyone joined the project, Microsoft too
I can't wait to install the official desktop version of steam os 3 (when valve offers it to the world)
Ha ha, yeah, thanks AMD!
Don't know, I need to look into SteamOS because I'm using windows for gaming
I'm using windows for gaming too. I have a steam deck and tbh, it is very good
Anyway the main problem is that DirectML uses a custom init for the device and doesn't share the CUDA calls like ROCm does. I don't understand why they don't use GPU() instead of cuda()
SteamOS 3, because the original one was based on debian, now it is arch.
Really, I don't know, been awhile since I put my head in all of this stuff ^^;
Don't think so. But it's been a while away from Linux for me too. I had dual boot etc, then I just kept windows
Arch is... Well, Arch
And Ubuntu is Debian based if I'm not mistaken
Been a long time
No ok I'm sorry arch is arch
Now if I can I'll try DirectML on windows, then I'll look into how hard it is to implement in invokeai
I think, really, you'll become a god for a lot of windows users
I'm not that good, it depends on the invokeai codebase
Trying to install again after the graphics driver from amdgpu-install --usecase=graphics, then the auto install of invokeai. Wish me luck; after that, I give up
never give up
I investigated for 3 days before finding a way to make it run xD
I just discovered that you downloaded invokeai from pip instead of using the one you downloaded as source
so it's not a proper manual installation
I'm following the code, I'm stuck on encoding the prompt, because I need to load the device in another way, but then someone decided to turn off the electricity on the whole block
I followed the guide on their wiki. Indeed, I didn't clone the repository
Now, I did the auto install, after the AMD graphics installation, and now I'm back to (grey) square 1. I'm adding the argument --precision float32.
It works... But... HIP out of memory, cannot generate image
I'm sooooo frustrated. It works on windows with some magic trickery, but here... it just doesn't. I even see the image progress in the UI xD But... nope
I'll look at the HIP stuff tomorrow. Weird thing is, if I try to make 256x256, the system tells me that 0 images were created, but the image is present in the folder xD
I'll try the argument "--free_gpu_mem" that could help I think, found it here : https://invoke-ai.github.io/InvokeAI/features/CLI/#list-of-arguments-recognized-at-the-command-line
Funny thing is, the requirement for invokeai is a 4GB VRAM GPU
I think that depends on the model too
makes sense
Warning: '--free_gpu_mem' is not yet supported when generating image using model based on HuggingFace Diffuser.
Ok, I tried an embedding and I didn't get the message about HIP, and it generated an image in 256x256 in 6 seconds. Trying with 512x512, and... worked xD Ok so I should not use the provided stable diffusion model, that's ok
Today is a good day
(so free_gpu_mem is needed for me)
However, I might disconnect and reconnect between each model change.
For now a 512x512, 50 iterations, k_euler_a, CFG 9, takes 30sec and generates fine, with free_gpu_mem and precision float32.
I call that a win
yep
Thanks a lot for your support @vagrant spire and @daring egret !! Your help was very helpful !
I might also try with ROCm 5.4 to see if there are any improvements
I need ONNX format for models, so I hope that in the future invoke will support those types of models
fingers crossed! never used ONNX.
I know that one option to use Stable Diffusion on windows was to convert to ONNX, but that's all I know
This is fixed in upcoming 2.3.1
Hi @vagrant spire
Sorry to bother you again..
A few months ago you helped me install InvokeAI on Ubuntu, for a 6600 XT video card. Back then we were installing version 2.3.0. I recently tried to update to the latest version and again image generation doesn't use the video card, only the CPU.
Perhaps you have already tried to install the latest versions of InvokeAI and already know some new nuances that arise during installation?
I can do that
tomorrow I'll try to install the new version and I'll let you know, if that's ok for you
Thank you for helping me again!
I'm updating ubuntu... it's sad that with kernel 6 I must use the repo for 22.10 instead of LTS 22.04
@river sonnet I need more time because I was testing kernel 6 for other stuff and ROCm ofc doesn't support it yet
tomorrow I'll reinstall ubuntu 22.04 so I can check the install script
@river sonnet I'm installing with the automated script from 2.3.5. So far I see that it installed pytorch in the correct version and I'm downloading models (again xD). I'll let you know in a while
I finished the setup and downloads, but it starts with CPU -.-
Of course! I'll wait as long as it takes
Yeah, same story
I don't think it's an InvokeAI issue, it's still ROCm, because it doesn't use the GPU even with simple Python code, so I need to investigate
a little update... it seems the last update of 22.04 broke the dkms drivers, because rocm 5.3.3 uses kernel 5.18 and ubuntu uses 5.19
so I need to understand how to fix the issue, because rocm 5.4 doesn't seem to work with pytorch: 5.4 uses torch 2.x instead of 1.13 xD
next time, Nvidia for these purposes
please write down a guide on how you installed everything, because my video card is of the same generation, but I don't understand how to make the AI run on the video card
you can find all the guides and tests in this topic, but it seems broken rn
https://www.tomshardware.com/news/amd-rocm-comes-to-windows-on-consumer-gpus <--- I hope it will be a game changer, when? idk
for now with rocm 5.5 no windows support nor 7900 XTX O_O https://github.com/RadeonOpenCompute/ROCm/issues/2128
@tiny thunder you can find a method here #1049702853490511872 message
the problem I have now is with the kernel drivers, which didn't compile with the latest 22.04 version I downloaded from ubuntu.com, so I need to check how to fix it
rocm 5.4 never worked with torch 1.13 on my conf
without the kernel dkms I cannot use torch, so I need to work on it
I read through what's written here in general, and that was the only useful thing I found in the messages. I installed 22.04.2 LTS and ROCm 5.4.2 from AMD's official site. Maybe I'm using the wrong software versions?... Honestly, I've already reread so much and am completely confused
when I tested it before, 5.4 didn't work with torch 1.13 @tiny thunder
about 22.04: the latest version, 22.04.2 with the updated kernel, doesn't work on my conf; I need to find the old 22.04 with kernel 5.18 or roll back to it
AMD is pretty bad with ROCm unfortunately
maybe use 20.04 LTS and rocm 5.3.3?
yes, I used 20.04 LTS and rocm 5.3.3 and everything worked
-_- i have no mouth and i must scream... that's about how I feel)))) I hope it works. Did you use the manual installation?
rn I haven't installed invoke
because I have problems with dkms, but I noticed that the automated script has improved a lot
in the past I used the manual mode because it was the only way to get rocm torch xD
ok) I'll try it tomorrow
I think you are trying to make it work with a 6600XT in the discussion here atm, right? So dunno how helpful this is. But with a 6900XT it worked for me without a problem when I installed everything a few days ago with ubuntu 22.04.2 lts and rocm 5.5
I cannot run it with 22.04.2 and rocm 5.3.3, I'll try with rocm 5.5 @daring egret
@daring egret I just installed on fresh ubuntu 22.04.2 ROCm 5.5 and with simple python script:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
and a venv with pytorch 1.13 and rocm5.2 (the latest that uses 1.13.x) I get:
UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.)
return torch._C._cuda_getDeviceCount() > 0
cpu
so it doesn't work anymore with the latest 22.04.2, at least on my machine
ok, I fixed it, now 22.04 also requires the render group
@tiny thunder and @river sonnet follow this guide https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.5/page/Prerequisites.html
with 5.3.3 you don't need the render group, but now it's mandatory for 22.04 too
maybe you can fix it: remove all the amdgpu/rocm stuff and try to reinstall following the guide for ROCm 5.5
I confirm that the automated script also works after following the 5.5 installation guide
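Since the missing render group was the cause here, a small check like the one below can confirm whether the current login session is actually in both groups. This is only a sketch using Python's standard `grp` and `os` modules; the helper names are made up for illustration, and group changes made with `usermod` only show up after logging out and back in (or rebooting).

```python
import grp
import os

# Groups the ROCm 5.5 prerequisites page asks for on Ubuntu 22.04.
REQUIRED_GROUPS = ("render", "video")


def missing_groups(current, required=REQUIRED_GROUPS):
    """Return the required group names that are absent from `current`."""
    present = set(current)
    return [name for name in required if name not in present]


def current_group_names():
    """Names of the groups the running process belongs to."""
    names = []
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # gid with no entry in the group database; skip it
            continue
    return names


if __name__ == "__main__":
    missing = missing_groups(current_group_names())
    if missing:
        print("Not in group(s):", ", ".join(missing))
        print("Try: sudo usermod -a -G " + ",".join(missing) + " $LOGNAME")
        print("then log out and back in, or reboot.")
    else:
        print("This session is in both render and video.")
```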
In general, I tried to install it all in different ways and honestly, I got desperate today. Maybe you can tell me what I'm doing wrong. Let's take a fresh Ubuntu 22.04.2 LTS as a basis. Reinstalling is not a problem because it is on the M.2 SSD. After that, I install the updates that are offered and reboot. Then I take the instructions from this site (https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.5/page/How_to_Install_Rom.html#_How_to_Install) and do the following -
sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
sudo apt-get install ./amdgpu-install_5.5.50500-1_all.deb
Next is
sudo amdgpu-install --usecase=hiplibsdk,rocm
Then I do the following from this site -
https://docs-amd-com.translate.goog/bundle/ROCm-Installation-Guide-v5.5/page/Prerequisites.html?_x_tr_sl=auto&_x_tr_tl=ru&_x_tr_hl=ru
sudo apt list --installed | grep -E 'wget|gnupg2|gawk|curl'
Next
sudo usermod -a -G render $LOGNAME
sudo usermod -a -G video $LOGNAME
Further
echo 'ADD_EXTRA_GROUPS=1' | sudo tee -a /etc/adduser.conf
echo 'EXTRA_GROUPS=video' | sudo tee -a /etc/adduser.conf
echo 'EXTRA_GROUPS=render' | sudo tee -a /etc/adduser.conf
Then I follow this instruction and install everything from this instruction automatically -
https://invoke-ai.github.io/InvokeAI/installation/010_INSTALL_AUTOMATED/#installation
sudo apt update
sudo apt install -y python3 python3-pip python3-venv
sudo update-alternatives --install /usr/local/bin/python python /usr/bin/python3.10 3
Next
sudo apt update && sudo apt install -y libglib2.0-0 libgl1-mesa-glx
And then I just have to run the automated installer and select the 2nd item. Where did I make a mistake or what didn't I do?
That's all correct, just reboot after installing rocm
not working :C I'll try again tomorrow from the beginning
which GPU have you got? @tiny thunder
yes. there is also a GPU integrated in the processor (Ryzen 5 5600G), but I disabled it in the BIOS and it does not appear in the system.
ok, but what have you got? 6900xt or less?
6750xt
Maybe it doesn't work on my graphics card? Because I have already reinstalled it 2 times with your correction about rebooting
I want to cry :C I decided to reboot 2 more times just like that. And it worked after the 2nd time. Eh?
YAY. I WIN IT!!!!!
Congrats
When generating, it just crashes without errors... I'm going to drink)
do you know what these errors are?
@tiny thunder try to add:
sudo amdgpu-install --usecase=opencl
the same errors
RROR_COMPILATION (6)
MIOpen(HIP): Error [BuildHip] HIPRTC status = HIPRTC_ERROR_COMPILATION (6), source file: naive_conv.cpp
MIOpen(HIP): Warning [BuildHip] /tmp/comgr-ad3b08/input/naive_conv.cpp:39:10: fatal error: 'limits' file not found
#include <limits> // std::numeric_limits
^~~~~~~~
1 error generated when compiling for gfx1030.
terminate called after throwing an instance of 'miopen::Exception'
what(): /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/MLOpen/src/hipoc/hipoc_program.cpp:304: Code object build failed. Source: naive_conv.cpp
/home/puma/invokeai/invoke.sh: line 51: 5618 Aborted (core dumped) invokeai --web $PARAMS
maybe something went wrong?
it's something related to HIP
Uh-huh. now it doesn't crash. now it just freezes.... AHAHAHAHAHA
okay. it generates black pictures
did you use: export HSA_OVERRIDE_GFX_VERSION=10.3.0 ?
yep
without this, it crashes
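To avoid forgetting the export in a new terminal, the override can also be applied by a small launcher. The sketch below assumes the value 10.3.0 mentioned in this thread and takes the launch script path as a command-line argument; `env_with_gfx_override` and `launch` are hypothetical helper names, and whether invoke.sh forwards extra arguments is not confirmed here.

```python
import os
import subprocess
import sys


def env_with_gfx_override(base_env, gfx_version="10.3.0"):
    """Copy `base_env` and set HSA_OVERRIDE_GFX_VERSION on the copy."""
    env = dict(base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


def launch(script_path, extra_args=()):
    """Run the launch script with the override set; return its exit code."""
    env = env_with_gfx_override(os.environ)
    return subprocess.run([script_path, *extra_args], env=env).returncode


if __name__ == "__main__":
    # Usage (path is an example): python launch_invoke.py ~/invokeai/invoke.sh
    if len(sys.argv) > 1:
        sys.exit(launch(sys.argv[1], sys.argv[2:]))
```

This does the same thing as typing `export HSA_OVERRIDE_GFX_VERSION=10.3.0` before invoke.sh, just without changing the shell's own environment.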
π€
come on))) see how cool it generates black squares
how?
I don't remember
This?
nope it's only a list of generated images
try with:
--precision float16
or
--precision float32
how many models do you keep loaded?
try also this flag:
--free_gpu_mem
it's slower but maybe it helps
but I still don't think it's related to those settings
When generating, other lines appear. Maybe some driver was installed badly?
for me it could be precision at this point tbh
Rephrase. I don't quite understand you
try to use the precision flag when you start the program; I read a few months ago that precision causes some trouble with the 6700
Where is it?...
when you start invoke.sh, add --precision float32 or --precision float16
it should change the precision
I just didn't quite understand where to add it, because if I just type it on the command line, it doesn't know this command
uhm
can you show me the content of invoke.sh?
because I'm not in linux rn and I try to remember
This?
we need to find where the settings file is
can you screenshot the content of invokeai pls?
ok, give me the invoke.init
I did it
it didn't help. changed this setting already
π€
I think this driver got installed wrong somewhere
have you got miopen-hip?
????
I don't know, I'm trying to understand why it doesn't generate
And what is it?
it can be something related to the HIP problem you have, but not sure
when you generate an image, what do you get in the terminal?
uhm strange
++
maybe try to download or use another model?
smaller one
to understand if at least it can work
I already did. Maybe the problem is that if I don't use this command in the developer console - pip install torch torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
Then my video card does not work. Maybe that's the problem?
π€
I'll try to just reinstall. I will make a backup of what is already there
native
ok, let's reinstall then
install only ROCm, then we will check the settings for invoke
reinstall working...
π