Im new to OoogaBooga and LLM's please bare with me: How do i run GGUF models? | Text Generation WebUI | Page 1

raven apex Mar 13, 2024, 5:02 PM

#

How do i download them too? This is what im currently doing (the image)

but then i get this error trying to load it with transformers:

Traceback (most recent call last):
File "/app/modules/ui_model_menu.py", line 213, in load_model_wrapper

shared.model, shared.tokenizer = load_model(selected_model, loader)
File "/app/modules/models.py", line 87, in load_model

output = load_func_maploader
File "/app/modules/models.py", line 141, in huggingface_loader

config = AutoConfig.from_pretrained(path_to_model, trust_remote_code=params['trust_remote_code'])
File "/venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1082, in from_pretrained

config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 644, in get_config_dict

config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 699, in _get_config_dict

resolved_config_file = cached_file(
File "/venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 360, in cached_file

raise EnvironmentError(
OSError: models/TheBloke_OpenHermes-2.5-Mistral-7B-GGUF does not appear to have a file named config.json. Checkout 'https://huggingface.co/models/TheBloke_OpenHermes-2.5-Mistral-7B-GGUF/None' for available files.

#

#

i dont understand

junior elbow Mar 13, 2024, 5:10 PM

#

raven apex

try setting your loader to llama.cpp

raven apex Mar 13, 2024, 5:11 PM

#

junior elbow try setting your loader to llama.cpp

Just did that and the new error is

Traceback (most recent call last):
File "/app/modules/ui_model_menu.py", line 213, in load_model_wrapper

shared.model, shared.tokenizer = load_model(selected_model, loader)
File "/app/modules/models.py", line 87, in load_model

output = load_func_maploader
File "/app/modules/models.py", line 250, in llamacpp_loader

model_file = list(Path(f'{shared.args.model_dir}/{model_name}').glob('*.gguf'))[0]
IndexError: list index out of range

#

junior elbow Mar 13, 2024, 5:13 PM

#

put .gguf at the end of that bottom download

#

then try

#

raven apex Mar 13, 2024, 5:14 PM

#

alright trying, its downloading

junior elbow Mar 13, 2024, 5:15 PM

#

ok

raven apex Mar 13, 2024, 5:18 PM

#

for some odd reason that just broke my instance lmao

#

like oooga booga just turned off

junior elbow Mar 13, 2024, 5:18 PM

#

ahh

#

so lets do it manually then

raven apex Mar 13, 2024, 5:18 PM

#

alright let me get another gpu lmao

junior elbow Mar 13, 2024, 5:18 PM

#

?

raven apex Mar 13, 2024, 5:19 PM

#

using a cloud gpu

#

it legit broke the whole thing it wont connect anymore so im getting another

junior elbow Mar 13, 2024, 5:19 PM

#

can you access the files on it

raven apex Mar 13, 2024, 5:19 PM

#

yes

#

i can ssh

#

and do stuff

junior elbow Mar 13, 2024, 5:20 PM

#

go into the models tab and delete the Hermes folder "F:\AI_MAIN\Avael-AI\models\TheBloke_WizardLM-7B-uncensored-GGML" like this path

#

(i changed my stuff so its a bit different)

raven apex Mar 13, 2024, 5:21 PM

#

im getting a new cloud instance so nothing will be on it, blank slate

#

so that won't exist

junior elbow Mar 13, 2024, 5:21 PM

#

ahh ok, you shouldnt need to but ooba breaks often at first when your new to it so reinstall is easiest yeah

raven apex Mar 13, 2024, 5:21 PM

#

yeah the instance wouldn't boot upo anymore

#

haha

#

crazy

#

okay on my new instance now

#

what do we do first

#

blank slate

junior elbow Mar 13, 2024, 5:22 PM

#

so cd into thhe models tab

#

or whatever cd is on your system

#

and paste this

#

git clone https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF

raven apex Mar 13, 2024, 5:23 PM

#

okay doing that right now

#

done its cloned

junior elbow Mar 13, 2024, 5:23 PM

#

now cd into that folder

raven apex Mar 13, 2024, 5:24 PM

#

don

#

done

junior elbow Mar 13, 2024, 5:25 PM

#

now go and download the file of the place manually and plop it in there if you can

#

the .gguf file you want i mean

raven apex Mar 13, 2024, 5:26 PM

#

wait how do i download the file i want

junior elbow Mar 13, 2024, 5:26 PM

#

should be this

raven apex Mar 13, 2024, 5:27 PM

#

it would download it on my local machine though

#

thats fine?

junior elbow Mar 13, 2024, 5:27 PM

#

ahhh are you not able to upload it to the cloud?

raven apex Mar 13, 2024, 5:27 PM

#

let me see

#

i can connect google drive and upload it

#

or dropbox

#

okay downloading what i want

junior elbow Mar 13, 2024, 5:30 PM

#

what system is its terminal?

#

unless you can just place it directly in the models without needing to cd

raven apex Mar 13, 2024, 5:31 PM

#

Linux

junior elbow Mar 13, 2024, 5:33 PM

#

mv [source] [destination]

#

should be the command

#

distination would be inside the gguf folder, sorry if im a bit iffy with my words, i dont know much coding stuff and struggle to make sentences

#

should look like this
mv [.gguf File source] F:\AI_MAIN\Avael-AI\models\TheBloke_WizardLM-7B-uncensored-GGML

raven apex Mar 13, 2024, 5:37 PM

#

junior elbow distination would be inside the gguf folder, sorry if im a bit iffy with my word...

no its okay i appreciate your help

#

its still downloading

#

so mv the gguf to models folder

junior elbow Mar 13, 2024, 5:38 PM

#

you put it into to folder you downloaded

#

you should already be cd-ed into it since i had you do that

junior elbow Mar 13, 2024, 5:40 PM

#

junior elbow git clone <https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF>

its this folder

raven apex Mar 13, 2024, 5:41 PM

#

ah okay gotcha

#

alright

#

but the original folder already has this file no?

junior elbow Mar 13, 2024, 5:41 PM

#

no

#

i thought so too, but each number you see is a seperate Model version

#

so you have to chose 1 of them you want, after you get the folder

raven apex Mar 13, 2024, 5:43 PM

#

AHHH that makes sense

junior elbow Mar 13, 2024, 5:44 PM

#

yeah, im using gguf models and they were different than im use to but they are so fast on my system do to me being able to split the usage up on my hardware

#

well so fast being not 5 minutes for 1 generation

#

(its around 3seconds to 2 minutes )

raven apex Mar 13, 2024, 5:44 PM

#

LMAO yeah yeah, i have been trying to use other model 70B runs weirdly slow on x2 RTX 4090's

#

so i was trying to run it as GGUF but ran into well

#

this issue

#

lmao

#

uploading to dropbox now

junior elbow Mar 13, 2024, 5:45 PM

#

70B needs both ram and Vram

raven apex Mar 13, 2024, 5:45 PM

#

damn

#

well

junior elbow Mar 13, 2024, 5:45 PM

#

i have a 70B model

raven apex Mar 13, 2024, 5:45 PM

#

maybe i can do maybe 30B or 13B max

junior elbow Mar 13, 2024, 5:45 PM

#

wanna see how long it took to output

raven apex Mar 13, 2024, 5:45 PM

#

show me

#

dropbox says its gojnna take an hour to upload oof

junior elbow Mar 13, 2024, 5:46 PM

#

oh wait it already closed, but it took 768 seconds for 1 prompt

#

which is like 11 minutes or something

#

12 minutes and 48 seconds

raven apex Mar 13, 2024, 5:47 PM

#

jesus christ

junior elbow Mar 13, 2024, 5:47 PM

#

im runnning it on a 3080 though so eh, i expected it

#

my specs are beefy but not enough for a 70B

raven apex Mar 13, 2024, 5:47 PM

#

yeah i wonder what it would be like on 2x 4090's

junior elbow Mar 13, 2024, 5:48 PM

#

i plan on doing that

#

but i need money first cause 4090s are like 4 kidneys worth of cash nekoDead

raven apex Mar 13, 2024, 5:48 PM

#

LMAOOO that’s why I’m using cloud GPU

#

1TB 2x 4090’s decent cpu at 80 cents an hour

#

I use for about $2 a day or $60 a month

#

That’s if I consistently use it

#

Realistically I only end up spending $30

junior elbow Mar 13, 2024, 5:50 PM

#

i run locally (like you) but i wanna have it be free once set up

#

and i also cant run through cloud cause i have files being sent and stuff since i have back end api stuff turning it into a discord bot

#

thats and its file size would suuuuuuuck to move

raven apex Mar 13, 2024, 5:54 PM

#

junior elbow and i also cant run through cloud cause i have files being sent and stuff since...

yeah true

raven apex Mar 13, 2024, 5:54 PM

#

junior elbow thats and its file size would suuuuuuuck to move

LMAOOO

#

jersus

junior elbow Mar 13, 2024, 5:56 PM

#

(some generation models(the stuff you are getting now) are double that BY THEMSELVES)

raven apex Mar 13, 2024, 5:57 PM

#

junior elbow (some generation models(the stuff you are getting now) are double that BY THEMSE...

lord

#

yeah the models are huge

#

like a whole harddrive for 1 model

#

its honestly been a headache but im a software engineer so im used to tinkering

#

ive only gotten 2 models to work so far

junior elbow Mar 13, 2024, 5:58 PM

#

like... i want but cant have :(

raven apex Mar 13, 2024, 5:58 PM

#

its such a hassle setting up local llms (espeically so since im new) but its worth it i know it

#

not locked in to some propreity api can use whenever i want no down time etc

#

the freedom

junior elbow Mar 13, 2024, 5:59 PM

#

raven apex not locked in to some propreity api can use whenever i want no down time etc

plus running locally means its not going through the STUPID filters that hosted ones are, granted you can get models that do that locally but... why?

#

which makes them infinitly more valuable to the user cause of that

#

hows the upload going btw

raven apex Mar 13, 2024, 6:05 PM

#

junior elbow plus running locally means its not going through the STUPID filters that hosted ...

exactly

#

can make a model generate a bomb if need be

#

but thats not why i want uncensored local

#

just tired of being told what i can and can't do

#

im tired of GPT's idiot responses

#

i want a model fine tuned for coding

raven apex Mar 13, 2024, 6:06 PM

#

junior elbow hows the upload going btw

says 30 minutes

#

ill be at work in 30 min so we can resume this when i get back in 6 hours

junior elbow Mar 13, 2024, 6:06 PM

#

raven apex im tired of GPT's idiot responses

oh well.... you can still very much get fever dreams, however you can tweak the values of stuff so its less likely

raven apex Mar 13, 2024, 6:07 PM

#

junior elbow oh well.... you can still very much get fever dreams, however you can tweak the ...

By stupid responses i mean long stuff

#

chat GPT does this with code watch:

#

Building a blah blah blah is a complex and enduring task requring blah blah blah

#

like

#

then makes a list of shit

#

like please just give me the code

#

i do not care

junior elbow Mar 13, 2024, 6:08 PM

#

ohhhh, well some models still do that but you can tweak that in the uhh, can i see your params tab real fast, i have nowebui enabled

#

if the AI is started

junior elbow Mar 13, 2024, 6:09 PM

#

raven apex ill be at work in 30 min so we can resume this when i get back in 6 hours

ok!

#

im mostly around all the time, so just ping me and ill help with what i can. do note, im not someone with deep knowledge on ooba, just someone who had to reinstall THE WHOLE THING about 7 to 8 times so ive learned a bit

junior elbow Mar 13, 2024, 6:15 PM

#

junior elbow ohhhh, well some models still do that but you can tweak that in the uhh, can i s...

on the second pic, im unsure which is the bots and which is the user, but one of them are

raven apex Mar 13, 2024, 6:18 PM

#

junior elbow im mostly around all the time, so just ping me and ill help with what i can. do ...

okay cool thank you

raven apex Mar 13, 2024, 6:18 PM

#

junior elbow im mostly around all the time, so just ping me and ill help with what i can. do ...

Lmao

#

LMAOO

#

7 or 8 time

#

times

#

id cry

#

do you have a good understanding of params?

#

For example why does changing max_tokens make the model broken/ say stupid shit

junior elbow Mar 13, 2024, 6:24 PM

#

raven apex For example why does changing max_tokens make the model broken/ say stupid shit

its how many many tokens it can output

#

if you use the Ban_EOS_Tokens thing it can very very long fever dreams

#

however its useful as well in some cases where you need a long message and it just, stops early. that option prevents it

#

changing the Context changes how the AI reacts and responds

raven apex Mar 13, 2024, 6:29 PM

#

junior elbow its how many many tokens it can output

Ahhhh okay I get it

#

I hope I can get the hang of this local llm stuff so it’s second nature

#

I don’t want to go back to propriety models

#

Tired of the restrictions

#

And I will pay $60 a month vs $20 GPT sub if it means privacy, locality and control

junior elbow Mar 13, 2024, 6:30 PM

#

in the params tab there is also instruction template... you might be able to understand it better than i given your experince but this is where you can really make the difference

#

there in the middle is context that the AI recieves

#

if you can learn custom system messages or learn to change the premade you can finely tune its output, without fine tuning the model itself

#

always use chat-instuct or instruct btw, i find them to be much better

raven apex Mar 13, 2024, 6:39 PM

#

junior elbow in the params tab there is also instruction template... you might be able to und...

Very interesting I’ll definitely look into that after I understand each format of LLMS and get them to work without breaking oooga boots

#

BOOGA

junior elbow Mar 13, 2024, 6:40 PM

#

LOL

raven apex Mar 13, 2024, 6:42 PM

#

junior elbow always use chat-instuct or instruct btw, i find them to be much better

Yeah same chat gives gibberish and cuts off sometimes

junior elbow Mar 13, 2024, 6:43 PM

#

yeah, i dont know why, i know some models even outright say not to use chat. i know others get it to work but for me i just prefer the others

raven apex Mar 13, 2024, 6:46 PM

#

Yeah true

raven apex Mar 13, 2024, 6:47 PM

#

junior elbow yeah, i dont know why, i know some models even outright say not to use chat. i k...

Might sound akward but you are really cool bro and I appreciate your help a lot, after this maybe we can chat outside and get to know each other?

junior elbow Mar 13, 2024, 6:47 PM

#

Sure!

#

was gonna ask the same thing LOL

raven apex Mar 13, 2024, 6:47 PM

#

Beat you too it haha

#

You are familiar with code/python or no?

#

Also: what model is your go to for chatting? Since I’m relatively new to the open source llm space I literally do NOT have a clue what model makes for good conversation, facts, code etc

#

And I’m hesitate to believe benchmarks since they can always use training data on the benchmakrks

#

Marks

junior elbow Mar 13, 2024, 6:49 PM

#

i can read it and change premade code but i can not code from scratch, i am however very interested in learning it so thats what im doing now. i have a coder that is writing a discord bot using ooba and im reading the code and learning from that

junior elbow Mar 13, 2024, 6:50 PM

#

raven apex Also: what model is your go to for chatting? Since I’m relatively new to the ope...

most models by the Bloke are good, i mostly use this one now cause its SO FAST
Wizard-Vicuna-7B-Uncensored-GGUF

junior elbow Mar 13, 2024, 6:53 PM

#

junior elbow most models by the Bloke are good, i mostly use this one now cause its SO FAST W...

granted my PC is beefy-ish

#

but you have 2 4090s to play with so it should be faster :p

raven apex Mar 13, 2024, 10:44 PM

#

junior elbow granted my PC is beefy-ish

Sheesh nice PC

#

I’m at work btw sorry for not replying

raven apex Mar 13, 2024, 10:45 PM

#

junior elbow most models by the Bloke are good, i mostly use this one now cause its SO FAST W...

Ah so bloke models are good okay okay

raven apex Mar 13, 2024, 10:45 PM

#

junior elbow i can read it and change premade code but i can not code from scratch, i am howe...

That’s a start bro

#

You’ll be able to code from scratch in no time

junior elbow Mar 13, 2024, 10:46 PM

#

thanks

junior elbow Mar 13, 2024, 10:46 PM

#

raven apex I’m at work btw sorry for not replying

no worries

#

can you access the cloud from work LOL

#

AI generation at work to do your duties :p

raven apex Mar 13, 2024, 10:48 PM

#

junior elbow AI generation at work to do your duties :p

Yep I can access the cloud haha

#

Need to see if the Dropbox finished uploading

#

Okay nice it finished uploading

#

I get home in 2 hours so can finish it up before sleep and stuff

junior elbow Mar 13, 2024, 10:50 PM

#

sleep? whats that

raven apex Mar 13, 2024, 10:53 PM

#

junior elbow sleep? whats that

LMAOOO you don’t sleep huh

#

You will be on tmr at around 5 EST?

#

I’ll be home from work then

#

I go in early tmr

junior elbow Mar 13, 2024, 10:54 PM

#

yeah, im up most times

raven apex Mar 13, 2024, 10:54 PM

#

Okay cool haha

junior elbow Mar 13, 2024, 10:54 PM

#

ill be up when you get home even

raven apex Mar 14, 2024, 1:10 AM

#

junior elbow ill be up when you get home even

yo

#

im home

#

lets work

#

/src/models/OpenHermes-2.5-Mistral-7B-GGUF

junior elbow Mar 14, 2024, 1:14 AM

#

ok

raven apex Mar 14, 2024, 1:15 AM

#

getting a new instance rq and redoing the other steps

#

but the dropbox upload is complete

#

is

junior elbow Mar 14, 2024, 1:17 AM

#

just let me know what is going on

raven apex Mar 14, 2024, 1:20 AM

#

i don tget it

#

junior elbow Mar 14, 2024, 1:21 AM

#

/src/models/OpenHermes-2.5-Mistral-7B-GGUF

raven apex Mar 14, 2024, 1:22 AM

#

alright and for the location in dropbox?

#

i put it ina. folde rnamed testing

#

says it doesn't exist

#

junior elbow Mar 14, 2024, 1:22 AM

#

ahh

#

do the file path

#

so workspace/testing/fileName

raven apex Mar 14, 2024, 1:23 AM

#

/workspace/testing/openhermes-2.5-mistral-7b.Q4_0

#

still says it doesn't exist

junior elbow Mar 14, 2024, 1:23 AM

#

can you screen share to VC? or would it doxx you

#

i dont use Drop box so im not sure how it works

#Im new to OoogaBooga and LLM's please bare with me: How do i run GGUF models?