#tensorflow gpu support on arch

little galleon
#

I'm trying to use my RTX 3060 on my Arch Linux system but I am unable to do so. The TensorFlow docs say that only Ubuntu is officially supported, so I'm unsure how to set up my GPU on my Arch machine.

native zodiac
#

are you installing tensorflow globally or in a venv? @little galleon

#

It's supported alright

native zodiac
#

do you have a requirements.txt?

little galleon
#

yep

#

tf version 2.12.0

native zodiac
#

within the venv just do pip install -r requirements.txt

#

that import statement failed because it was not installed globally

#

I don't like global installs (too messy)
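
A quick way to see whether a failed import is a "wrong interpreter" problem is to ask Python which environment it is running in. This is only a minimal sketch using the standard library; `interpreter_report` is a hypothetical helper name, not something from the project discussed here.

```python
import importlib.util
import sys


def interpreter_report(module_name="tensorflow"):
    """Describe the running interpreter and whether it can see a module."""
    return {
        # Path of the Python binary actually executing this code.
        "executable": sys.executable,
        # Inside a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # find_spec returns None when a top-level module is not importable.
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


report = interpreter_report()
print(report)
```

If `in_venv` is False, or `executable` points somewhere other than the project's venv, the editor is probably still using the global interpreter.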

little galleon
native zodiac
#

rerun the python file

#

if tensorflow was listed in that file it will work straight away

little galleon
native zodiac
#

is that DataSpell?

little galleon
#

the text editor?

#

its vscode

native zodiac
#

oh, you might need to restart that one

little galleon
#
tensorflow == 2.12.0
opencv-python == 4.7.0.72
pillow == 9.5.0
ipynb == 0.5.1
import_ipynb == 0.1.4
matplotlib == 3.7.1
pandas == 2.0.2
scikit-learn == 1.3
seaborn == 0.12.2
native zodiac
#

DS and Pycharm work just fine, at least for me

little galleon
native zodiac
#

the editor

little galleon
#

still doesnt work

native zodiac
#

is this a repo or a project of yours?

little galleon
#

repo yeah

#

dyw github?

native zodiac
#

mind sharing the repo?

little galleon
#

ofc

native zodiac
#

Yeah I want to see if they left instructions, sometimes they are not straightforward

little galleon
#

its my code lmfaoo

#

mb sorry shoulda clarified

native zodiac
#

well shit look at you, I heard about this project in another discord

little galleon
#

wait tf

#

by who lol

native zodiac
#

I'm in a few AI discords where projects are shared

little galleon
#

damn thats funny

native zodiac
#

but you did it right

#

hold on let me try it real quick

little galleon
#

bet

#

btw i put req.txt in this folder called setup

#

so j like cd setup then pip install

#

also if u have the invite link to the discord where they were talking abt this would u mind sharing

#

im curious to see what they were talking abt lol

native zodiac
#

it was in HuggingFace I think

#

I have to double check, I've been nerding out with AI stuff the last few months

little galleon
#

ill give it a look lol

native zodiac
#

what file did you call btw

little galleon
#

its in /models/regressions/models

#

LFW.ipynb

#

though you can run any of them lol

native zodiac
#

I didn't look at your project yet lol

#

it's all very hidden

little galleon
#

lmfaoo yeah im not the best at organizing

native zodiac
#

I think I found the problem, hold on, let me confirm it is the issue

#

can you do pacman -Q | grep cuda please

#

if it returns empty you're missing the cuda-toolkit

little galleon
#
cuda 12.3.1-2
python-pycuda 2022.2.2-4
python-tensorflow-opt-cuda 2.15.0-6
tensorflow-opt-cuda 2.15.0-6
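
The package list above can also be read programmatically, which helps when comparing system packages against what is installed in a venv. This is a rough sketch: `parse_pacman_query` and `upstream_version` are hypothetical helpers, and the sample text is just the four lines pasted above.

```python
def parse_pacman_query(output: str) -> dict[str, str]:
    """Map package name -> version from `pacman -Q` style "name version" lines."""
    packages = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            name, version = parts
            packages[name] = version
    return packages


def upstream_version(pacman_version: str) -> str:
    """Drop the trailing "-pkgrel" part, e.g. "12.3.1-2" -> "12.3.1"."""
    return pacman_version.rsplit("-", 1)[0]


sample = """\
cuda 12.3.1-2
python-pycuda 2022.2.2-4
python-tensorflow-opt-cuda 2.15.0-6
tensorflow-opt-cuda 2.15.0-6
"""

installed = parse_pacman_query(sample)
print(upstream_version(installed["cuda"]))                        # 12.3.1
print(upstream_version(installed["python-tensorflow-opt-cuda"]))  # 2.15.0
```

Note that these are the system-wide packages. A venv created without system site packages will not see `python-tensorflow-opt-cuda` and uses whatever TensorFlow version pip installed instead.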

native zodiac
#

it's not recognizing your GPU, that's the issue

#

I have another thing I wanna try real quick, I followed your instructions for the venv

#

I normally put that inside the project folder

little galleon
#

uhuh

#

this is where mine is

little galleon
native zodiac
#

the code

little galleon
#

huh

#

thats odd

native zodiac
#

add this under the import for tf: print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU'))) and it will return 0
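
A slightly fuller version of that check is sketched below. It assumes a TensorFlow 2.x install recent enough to have `tf.sysconfig.get_build_info()`; `gpu_report` is a hypothetical helper name, and the function returns None when TensorFlow is not importable so it can be run in any interpreter.

```python
def gpu_report():
    """Summarize the TensorFlow build and visible GPUs, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Build info is a dict; the CUDA keys may be absent on CPU-only builds.
    build = tf.sysconfig.get_build_info()
    return {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
        "num_gpus": len(tf.config.list_physical_devices("GPU")),
    }


report = gpu_report()
if report is None:
    print("tensorflow is not importable in this interpreter")
else:
    print(report)
```

If `built_with_cuda` is True but `num_gpus` is 0, the wheel supports GPUs and the problem is more likely missing or mismatched CUDA libraries than the code itself.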

little galleon
#

yeah thats what i have rn

#

oh wait can u git pull

native zodiac
#

I can git pull

little galleon
#

i think i forgot to push that

#

whoops

#

but yeah when i run that code i get 0

#
Num GPUs Available:  0
2024-01-12 19:25:44.841520: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-01-12 19:25:44.860821: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1956] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
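
The "Cannot dlopen some GPU libraries" warning means the dynamic loader could not find one or more shared libraries. One way to narrow down which ones is to try loading them directly. This is a sketch only: `can_dlopen` is a hypothetical helper, and the library names in `candidates` are an assumption about what a CUDA 11-era TensorFlow build looks for, so replace them with the names from your own log.

```python
import ctypes


def can_dlopen(names):
    """Try to load each shared library by name and report success per name."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the loader cannot find or load the library.
            results[name] = False
    return results


# Assumed sonames, not taken from the log above.
candidates = ["libcudart.so.11.0", "libcublas.so.11", "libcudnn.so.8"]

results = can_dlopen(candidates)
for name, ok in results.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

A library that shows as missing here while a different major version is installed system-wide points at a version mismatch between the TensorFlow wheel and the installed CUDA packages.
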
native zodiac
#

@little galleon after a lot of prodding around, I can't get it to work within a venv

#

there's not a lot of docs on Arch but it seems like global installs work just fine, of course that's never recommended

little galleon
#

hmm thats strange

#

lemme try that

native zodiac
#

also, pip install tensorflow[and-cuda]: this doesn't exist

little galleon
#

yeah i think its only for ubuntu

native zodiac
#

not sure what tensorflow are thinking but that doesn't exist and they recommend that too

native zodiac
#

nothing

little galleon
#

so j global install ig

#

idk what in the venv is causing it to mess up

#

do i have to like pacman install cuda inside the venv

native zodiac
#

I'll keep going at it, someone said that conda is better for it but I never use conda outside of Win env

little galleon
#

ya same

#

i can try conda tho

native zodiac
#

it's in the CUDA quick install guide

#

I'll spin up a VM with arch and solo test it, I forgot what else works since I do have one project pulling tensorflow globally but I don't remember if it's just a CPU-only setup

#

I'll keep you posted, shouldn't take long

little galleon
#

ok tysm

little galleon
#

ok i got the code to recognize the gpu

#

the issue was cuda and tf were not compatible

#

so i upgraded to tf v2.15 and it works
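
The mismatch described here can be sketched as a small lookup. The CUDA and cuDNN versions below are what I recall from TensorFlow's tested build configurations table for these two releases, so treat them as an assumption and check the official table. `TESTED_BUILDS` and `same_cuda_major` are hypothetical names, and comparing only the CUDA major version is a rough heuristic, not a guarantee.

```python
# Assumed values; verify against TensorFlow's tested build configurations.
TESTED_BUILDS = {
    "2.12": {"cuda": "11.8", "cudnn": "8.6"},
    "2.15": {"cuda": "12.2", "cudnn": "8.9"},
}


def minor_key(version: str) -> str:
    """Reduce "2.12.0" to "2.12" so it can be looked up in the table."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def same_cuda_major(tf_version: str, installed_cuda: str):
    """Return True/False if the CUDA major versions match, None if TF is unknown."""
    expected = TESTED_BUILDS.get(minor_key(tf_version))
    if expected is None:
        return None
    return expected["cuda"].split(".")[0] == installed_cuda.split(".")[0]


print(same_cuda_major("2.12.0", "12.3.1"))  # False
print(same_cuda_major("2.15.0", "12.3.1"))  # True
```

With the pinned tensorflow 2.12.0 and the system's cuda 12.3.1 the check fails, which is consistent with the GPU only being detected after the upgrade.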

#

but now im getting another error

#
JIT compilation failed.
     [[{{node sequential/batch_normalization/batchnorm/Rsqrt}}]] [Op:__inference_train_function_1290]
native zodiac
#

I upgraded and I couldn't get that to change

#

Let me go back to my env, did you push those changes?

little galleon
#

nah i didnt want to change the req file

#

just remove all the versions from it

#

and delete your venv

#

then reinstall all the packages
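
Removing the version pins by hand works, but it can also be scripted. This is a minimal sketch assuming simple `name == version` style lines like the requirements file shown earlier; `strip_pins` is a hypothetical helper and it does not handle environment markers or URLs.

```python
import re


def strip_pins(requirements_text: str) -> str:
    """Return the requirements with version specifiers removed from each line."""
    unpinned = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        # Leave blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            unpinned.append(line)
            continue
        # Keep everything before the first version operator.
        name = re.split(r"\s*(?:==|>=|<=|~=|!=|<|>)", stripped, maxsplit=1)[0]
        unpinned.append(name)
    return "\n".join(unpinned)


sample = "tensorflow == 2.12.0\nopencv-python == 4.7.0.72\nscikit-learn == 1.3\n"
print(strip_pins(sample))
```

Unpinned installs pull the newest release of every package, so results can differ from one reinstall to the next. Pinning again once a working combination is found keeps the setup reproducible.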

native zodiac
#

lol I didn't want to do all that

little galleon
#

its like 30 seconds 💀

#

sry i dual booted to windows

native zodiac
#

I've been tinkering so much my brain is all over the place

#

I do too

#

I just like Arch more

little galleon
#

imma try getting it to work there

little galleon
#

it j breaks lol

native zodiac
#

.s rs

brittle sentinel BOT
native zodiac
#

I need that sorry lol

little galleon
#

huh

#

dyw me to run smth?

native zodiac
#

I couldn't remember the paste export thing, I get other issues compared to yours

#

that's what I get on a new venv and new install, that was not there before

little galleon
#

the NUMA node error i got

#

not too sure what the others mean

native zodiac
#

you've run this model before? how CPU intensive is it?

little galleon
#

the model is light

#

the hyperparam tuning is what's intensive

native zodiac
#

unless tensorflow fixed that, if something was not intensive enough it would stay on CPU instead of going to GPU
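
Whether an op actually lands on the GPU can be checked directly instead of guessed. As far as I know, TensorFlow 2.x prefers the GPU by default for ops that have a GPU implementation whenever a GPU is visible, regardless of model size. This sketch assumes eager execution (the TF 2.x default); `where_does_matmul_run` is a hypothetical helper that returns None when TensorFlow is not importable.

```python
def where_does_matmul_run():
    """Run a tiny matmul and return the device string of the result tensor."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    a = tf.random.uniform((2, 2))
    b = tf.matmul(a, a)
    # In eager mode the result tensor records the device that produced it.
    return b.device


placement = where_does_matmul_run()
print(placement if placement is not None else "tensorflow is not importable here")
```

A device string ending in `GPU:0` means even this tiny op ran on the GPU. `CPU:0` on a machine with a GPU suggests TensorFlow cannot see the GPU at all.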

little galleon
#

i mean if i need to run intensive models at some point

little galleon
#

@native zodiac i messed around with it some more but nothing seems to work

#

i think the cuda, cudnn and tf versions are just incompatible

native zodiac
little galleon
#

have u used tf and arch w gpu support before though

native zodiac
#

globally not through venv

#

though I'm not sure if it was GPU or not, tf is CPU first

native zodiac
#

checked it, it was CPU only, hmm

little galleon
#

ok then im not too sure

#

ppl said it has compatibility

#

but theres abs no docs on how

native zodiac
#

You're not alone here either, the ubuntu way should work with arch just fine

#

I'll see how it works with Ubuntu later today and let you know