#I'm new to Oobabooga and LLMs, please bear with me: How do I run GGUF models?

raven apex
#

How do i download them too? This is what im currently doing (the image)

but then i get this error trying to load it with transformers:

Traceback (most recent call last):
File "/app/modules/ui_model_menu.py", line 213, in load_model_wrapper

shared.model, shared.tokenizer = load_model(selected_model, loader)
File "/app/modules/models.py", line 87, in load_model

output = load_func_map[loader](model_name)
File "/app/modules/models.py", line 141, in huggingface_loader

config = AutoConfig.from_pretrained(path_to_model, trust_remote_code=params['trust_remote_code'])
File "/venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1082, in from_pretrained

config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 644, in get_config_dict

config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 699, in _get_config_dict

resolved_config_file = cached_file(
File "/venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 360, in cached_file

raise EnvironmentError(
OSError: models/TheBloke_OpenHermes-2.5-Mistral-7B-GGUF does not appear to have a file named config.json. Checkout 'https://huggingface.co/models/TheBloke_OpenHermes-2.5-Mistral-7B-GGUF/None' for available files.

#

i dont understand

junior elbow
#

try setting your loader to llama.cpp

raven apex
#

Just did that and the new error is

Traceback (most recent call last):
File "/app/modules/ui_model_menu.py", line 213, in load_model_wrapper

shared.model, shared.tokenizer = load_model(selected_model, loader)
File "/app/modules/models.py", line 87, in load_model

output = load_func_map[loader](model_name)
File "/app/modules/models.py", line 250, in llamacpp_loader

model_file = list(Path(f'{shared.args.model_dir}/{model_name}').glob('*.gguf'))[0]
IndexError: list index out of range
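
The IndexError above comes from taking element [0] of an empty glob result: the model folder exists, but there is no *.gguf file inside it. A minimal sketch of the same lookup with an explicit check (`find_gguf` is a hypothetical helper for illustration, not part of the webui):

```python
from pathlib import Path


def find_gguf(model_dir: str, model_name: str) -> Path:
    """Return the first .gguf file in <model_dir>/<model_name>, or raise a clear error."""
    folder = Path(model_dir) / model_name
    candidates = sorted(folder.glob("*.gguf"))
    if not candidates:
        # Same situation as list(folder.glob('*.gguf'))[0] on an empty list,
        # but reported with a message that names the folder.
        raise FileNotFoundError(f"no .gguf file found in {folder}")
    return candidates[0]
```

If this raises, the fix is to get an actual .gguf file into that folder before loading with llama.cpp.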

junior elbow
#

put .gguf at the end of that bottom download

#

then try

raven apex
#

alright trying, its downloading

junior elbow
#

ok

raven apex
#

for some odd reason that just broke my instance lmao

#

like oooga booga just turned off

junior elbow
#

ahh

#

so lets do it manually then

raven apex
#

alright let me get another gpu lmao

junior elbow
#

?

raven apex
#

using a cloud gpu

#

it legit broke the whole thing it wont connect anymore so im getting another

junior elbow
#

can you access the files on it

raven apex
#

yes

#

i can ssh

#

and do stuff

junior elbow
#

go into the models folder and delete the Hermes folder, something like this path: "F:\AI_MAIN\Avael-AI\models\TheBloke_WizardLM-7B-uncensored-GGML"

#

(i changed my stuff so its a bit different)

raven apex
#

im getting a new cloud instance so nothing will be on it, blank slate

#

so that won't exist

junior elbow
#

ahh ok, you shouldnt need to but ooba breaks often at first when you're new to it so reinstall is easiest yeah

raven apex
#

yeah the instance wouldn't boot up anymore

#

haha

#

crazy

#

okay on my new instance now

#

what do we do first

#

blank slate

junior elbow
#

so cd into the models folder

#

or whatever cd is on your system

#

and paste this

raven apex
#

okay doing that right now

#

done its cloned

junior elbow
#

now cd into that folder

raven apex
#

don

#

done

junior elbow
#

now go and download the file from the model page manually and plop it in there if you can

#

the .gguf file you want i mean

raven apex
#

wait how do i download the file i want

junior elbow
#

should be this

raven apex
#

it would download it on my local machine though

#

thats fine?

junior elbow
#

ahhh are you not able to upload it to the cloud?
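
An alternative to uploading from the local machine is to pull the file straight onto the cloud box. A rough sketch using only the standard library: the `resolve/main` URL pattern is how the Hugging Face hub serves single files, but the repo id and filename in the usage comment are assumptions inferred from the folder name in this thread, so check them against the model page first.

```python
import urllib.request
from pathlib import Path


def hf_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_gguf(repo_id: str, filename: str, dest_dir: str) -> Path:
    """Download a single file into dest_dir and return its local path."""
    dest = Path(dest_dir) / filename
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(hf_file_url(repo_id, filename), dest)
    return dest


# Usage (assumed names, run on the cloud instance itself):
# download_gguf("TheBloke/OpenHermes-2.5-Mistral-7B-GGUF",
#               "openhermes-2.5-mistral-7b.Q4_0.gguf",
#               "models/TheBloke_OpenHermes-2.5-Mistral-7B-GGUF")
```

That way the multi-gigabyte file never has to pass through a home connection or Dropbox.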

raven apex
#

let me see

#

i can connect google drive and upload it

#

or dropbox

#

okay downloading what i want

junior elbow
#

what system is its terminal running?

#

unless you can just place it directly in the models without needing to cd

raven apex
#

Linux

junior elbow
#

mv [source] [destination]

#

should be the command

#

destination would be inside the gguf folder, sorry if im a bit iffy with my words, i dont know much coding stuff and struggle to make sentences

#

should look like this
mv [.gguf File source] F:\AI_MAIN\Avael-AI\models\TheBloke_WizardLM-7B-uncensored-GGML
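
On a Linux instance the destination would be a Linux path rather than the F:\ example. The same move, sketched in Python with hypothetical paths (`move_into_model_folder` is an illustrative helper, not an ooba function):

```python
import shutil
from pathlib import Path


def move_into_model_folder(gguf_file: str, model_folder: str) -> Path:
    """Move one .gguf file into the model's folder, creating the folder if needed."""
    src = Path(gguf_file)
    if src.suffix != ".gguf":
        raise ValueError(f"expected a .gguf file, got {src.name}")
    dest_dir = Path(model_folder)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the final destination path as a string
    return Path(shutil.move(str(src), str(dest_dir / src.name)))


# Usage (hypothetical paths):
# move_into_model_folder("/workspace/testing/model.Q4_0.gguf",
#                        "/src/models/OpenHermes-2.5-Mistral-7B-GGUF")
```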

raven apex
#

its still downloading

#

so mv the gguf to models folder

junior elbow
#

you put it into the folder you downloaded

#

you should already be cd-ed into it since i had you do that

raven apex
#

ah okay gotcha

#

alright

#

but the original folder already has this file no?

junior elbow
#

no

#

i thought so too, but each number you see is a separate Model version

#

so you have to choose 1 of them you want, after you get the folder
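
The separate files in a GGUF repo are usually the same model at different quantization levels (Q4_0, Q5_K_M, Q8_0 and so on), and the tag sits right before the .gguf extension. A small sketch of picking one by tag; the filenames are examples and the helper names are made up for illustration:

```python
import re
from typing import Iterable, Optional

# Matches a trailing ".<QUANT>.gguf", e.g. ".Q4_0.gguf" or ".Q5_K_M.gguf"
QUANT_RE = re.compile(r"\.(Q\d+_[A-Z0-9_]+)\.gguf$", re.IGNORECASE)


def quant_tag(filename: str) -> Optional[str]:
    """Return the quantization tag in a GGUF filename, or None if there is none."""
    match = QUANT_RE.search(filename)
    return match.group(1).upper() if match else None


def pick_quant(filenames: Iterable[str], wanted: str) -> Optional[str]:
    """Return the first filename whose quantization tag equals `wanted`."""
    for name in filenames:
        if quant_tag(name) == wanted.upper():
            return name
    return None
```

Only the one chosen file needs to end up in the model folder, not the whole set.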

raven apex
#

AHHH that makes sense

junior elbow
#

yeah, im using gguf models and they were different than im used to but they are so fast on my system due to me being able to split the usage up across my hardware

#

well so fast being not 5 minutes for 1 generation

#

(its around 3seconds to 2 minutes )

raven apex
#

LMAO yeah yeah, i have been trying to use another model, a 70B, and it runs weirdly slow on 2x RTX 4090s

#

so i was trying to run it as GGUF but ran into well

#

this issue

#

lmao

#

uploading to dropbox now

junior elbow
#

70B needs both ram and Vram
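
A back-of-the-envelope way to see why: weight size is roughly parameters times bits per weight. The sketch below assumes about 4.5 bits per weight for a Q4_0 file and a made-up 2 GB allowance for context and buffers, so treat the results as rough estimates, not measurements.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_vram(n_params: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """Rough check: weights plus an assumed overhead must fit in VRAM."""
    return model_size_gb(n_params, bits_per_weight) + overhead_gb <= vram_gb


print(model_size_gb(70e9, 16))       # 70B at fp16: about 140 GB
print(model_size_gb(70e9, 4.5))      # 70B at ~4.5 bits: about 39 GB
print(fits_in_vram(70e9, 4.5, 10))   # a single 10 GB card: False
print(fits_in_vram(70e9, 4.5, 48))   # 2x 24 GB cards: True, but only just
```

Whatever does not fit in VRAM has to sit in system RAM and run on the CPU, which is where the slowdown comes from.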

raven apex
#

damn

#

well

junior elbow
#

i have a 70B model

raven apex
#

maybe i can do maybe 30B or 13B max

junior elbow
#

wanna see how long it took to output

raven apex
#

show me

#

dropbox says its gonna take an hour to upload oof

junior elbow
#

oh wait it already closed, but it took 768 seconds for 1 prompt

#

which is like 11 minutes or something

#

12 minutes and 48 seconds
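
The conversion, as a quick sketch (`format_duration` is just an illustrative helper):

```python
def format_duration(seconds: int) -> str:
    """Split a number of seconds into whole minutes and leftover seconds."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes} min {secs} s"


print(format_duration(768))  # 12 min 48 s
```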

raven apex
#

jesus christ

junior elbow
#

im running it on a 3080 though so eh, i expected it

#

my specs are beefy but not enough for a 70B

raven apex
#

yeah i wonder what it would be like on 2x 4090's

junior elbow
#

i plan on doing that

#

but i need money first cause 4090s are like 4 kidneys worth of cash nekoDead

raven apex
#

LMAOOO that’s why I’m using cloud GPU

#

1TB 2x 4090’s decent cpu at 80 cents an hour

#

I use it for about $2 a day or $60 a month

#

That’s if I consistently use it

#

Realistically I only end up spending $30

junior elbow
#

i run locally (like you) but i wanna have it be free once set up

#

and i also cant run through cloud cause i have files being sent and stuff since i have back end api stuff turning it into a discord bot

#

that and its file size would suuuuuuuck to move

raven apex
#

jesus

junior elbow
#

(some generation models(the stuff you are getting now) are double that BY THEMSELVES)

raven apex
#

yeah the models are huge

#

like a whole harddrive for 1 model

#

its honestly been a headache but im a software engineer so im used to tinkering

#

ive only gotten 2 models to work so far

junior elbow
#

like... i want but cant have :(

raven apex
#

its such a hassle setting up local llms (especially so since im new) but its worth it i know it

#

not locked in to some proprietary api, can use it whenever i want, no down time etc

#

the freedom

junior elbow
#

which makes them infinitely more valuable to the user cause of that

#

hows the upload going btw

raven apex
#

can make a model generate a bomb if need be

#

but thats not why i want uncensored local

#

just tired of being told what i can and can't do

#

im tired of GPT's idiot responses

#

i want a model fine tuned for coding

raven apex
#

ill be at work in 30 min so we can resume this when i get back in 6 hours

junior elbow
raven apex
#

chat GPT does this with code watch:

#

Building a blah blah blah is a complex and enduring task requiring blah blah blah

#

like

#

then makes a list of shit

#

like please just give me the code

#

i do not care

junior elbow
#

ohhhh, well some models still do that but you can tweak that in the uhh, can i see your params tab real fast, i have nowebui enabled

#

if the AI is started

junior elbow
#

im mostly around all the time, so just ping me and ill help with what i can. do note, im not someone with deep knowledge on ooba, just someone who had to reinstall THE WHOLE THING about 7 to 8 times so ive learned a bit

junior elbow
raven apex
#

LMAOO

#

7 or 8 time

#

times

#

id cry

#

do you have a good understanding of params?

#

For example why does changing max_tokens make the model broken/ say stupid shit

junior elbow
#

if you use the Ban_EOS_Tokens thing it can produce very very long fever dreams

#

however its useful as well in some cases where you need a long message and it just, stops early. that option prevents it
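
A toy picture of what that option does, assuming the usual meaning (the end-of-sequence token can never be picked, so only the max token limit stops generation). This is not the webui's sampler: the vocabulary, the random "model" and the function name are all made up for illustration.

```python
import random

EOS = "<eos>"
VOCAB = ["the", "model", "keeps", "talking", EOS]


def generate(max_new_tokens: int, ban_eos_token: bool, seed: int = 0) -> list:
    """Pick random tokens until EOS is chosen or the token limit is reached."""
    rng = random.Random(seed)
    output = []
    for _ in range(max_new_tokens):
        # With the ban on, EOS is removed from the candidates entirely
        choices = [tok for tok in VOCAB if not (ban_eos_token and tok == EOS)]
        token = rng.choice(choices)
        if token == EOS:
            break  # the model "decided" it was finished
        output.append(token)
    return output


print(len(generate(50, ban_eos_token=False)))  # stops whenever EOS comes up
print(len(generate(50, ban_eos_token=True)))   # always runs to the full 50
```

With the ban on, a larger max token setting means more forced text after the point where the model would have stopped, which is one plausible source of the rambling.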

#

changing the Context changes how the AI reacts and responds

raven apex
#

I hope I can get the hang of this local llm stuff so it’s second nature

#

I don’t want to go back to proprietary models

#

Tired of the restrictions

#

And I will pay $60 a month vs $20 GPT sub if it means privacy, locality and control

junior elbow
#

in the params tab there is also instruction template... you might be able to understand it better than i given your experience but this is where you can really make the difference

#

there in the middle is context that the AI receives

#

if you can learn custom system messages or learn to change the premade ones you can finely tune its output, without fine tuning the model itself

#

always use chat-instruct or instruct btw, i find them to be much better

raven apex
#

BOOGA

junior elbow
#

LOL

raven apex
junior elbow
#

yeah, i dont know why, i know some models even outright say not to use chat. i know others get it to work but for me i just prefer the others

raven apex
#

Yeah true

raven apex
junior elbow
#

Sure!

#

was gonna ask the same thing LOL

raven apex
#

Beat you to it haha

#

You are familiar with code/python or no?

#

Also: what model is your go to for chatting? Since I’m relatively new to the open source llm space I literally do NOT have a clue what model makes for good conversation, facts, code etc

#

And I’m hesitant to believe benchmarks since they can always train on the benchmarks

junior elbow
#

i can read it and change premade code but i can not code from scratch, i am however very interested in learning it so thats what im doing now. i have a coder that is writing a discord bot using ooba and im reading the code and learning from that

junior elbow
#

but you have 2 4090s to play with so it should be faster :p

raven apex
#

I’m at work btw sorry for not replying

raven apex
#

You’ll be able to code from scratch in no time

junior elbow
#

thanks

junior elbow
#

can you access the cloud from work LOL

#

AI generation at work to do your duties :p

raven apex
#

Need to see if the Dropbox finished uploading

#

Okay nice it finished uploading

#

I get home in 2 hours so can finish it up before sleep and stuff

junior elbow
#

sleep? whats that

raven apex
#

You will be on tmr at around 5 EST?

#

I’ll be home from work then

#

I go in early tmr

junior elbow
#

yeah, im up most times

raven apex
#

Okay cool haha

junior elbow
#

ill be up when you get home even

raven apex
#

im home

#

lets work

#

/src/models/OpenHermes-2.5-Mistral-7B-GGUF

junior elbow
#

ok

raven apex
#

getting a new instance rq and redoing the other steps

#

but the dropbox upload is complete

#

is

junior elbow
#

just let me know what is going on

raven apex
#

i dont get it

junior elbow
#

/src/models/OpenHermes-2.5-Mistral-7B-GGUF

raven apex
#

alright and for the location in dropbox?

#

i put it in a folder named testing

#

says it doesn't exist

junior elbow
#

ahh

#

do the file path

#

so workspace/testing/fileName

raven apex
#

/workspace/testing/openhermes-2.5-mistral-7b.Q4_0

#

still says it doesn't exist
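
One thing worth checking here, as a guess and not a diagnosis: the path above has no .gguf extension, and the Dropbox folder may not actually be mounted under /workspace. A small sketch that reports which of those it is (`explain_missing` is a hypothetical helper):

```python
from pathlib import Path


def explain_missing(path_str: str) -> str:
    """Say whether a path exists and, if not, what is nearby."""
    path = Path(path_str)
    if path.exists():
        return f"found: {path}"
    if not path.parent.is_dir():
        return f"folder does not exist: {path.parent}"
    # Look for files that start with the same name, e.g. with an extension added
    similar = sorted(p.name for p in path.parent.iterdir()
                     if p.name.startswith(path.name))
    if similar:
        return f"not found, but similar names exist: {similar}"
    return f"not found, and nothing in {path.parent} starts with {path.name!r}"


# Usage:
# print(explain_missing("/workspace/testing/openhermes-2.5-mistral-7b.Q4_0"))
```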

junior elbow
#

can you screen share to VC? or would it doxx you

#

i dont use Drop box so im not sure how it works