# Silly Tavern API issue

median dune
#

I know this has been beaten to death, but I cannot connect oobabooga's API to Silly Tavern. I have the api and openai options selected in the settings, and I've pasted the 127.0.0.1:5000 address into Silly Tavern's API field, but it still tells me it's not connected.

dark ruin
#

do not enable api

#

thats how i helped fix the other person's issue

#

the boolean command-line flag for api is not needed

#

only openai api

median dune
# dark ruin only openai api

I was looking at the one you referred to as well. I only had openai checked, but it was still a no-go. I'm also doing all of this on Linux, btw. I copied the blue address into the API field in Silly Tavern and it still would not connect.

dark ruin
#

simply cuz he had the ST version before openai api was introduced

#

make sure you dont have ST legacy api enabled i guess

#

i dont know why this would even be an issue

#

you could also try enabling bypass check and try to gen an output anyway

#

dunno if that will work

median dune
dark ruin
#

Idk if that's used anymore but see how you go

median dune
dark ruin
#

yeah ST does

#

not text-gen

#

that should be the easiest step lol

#

could it be a firewall thing? its all local so it shouldnt be

#

i would get in jlllll or someone who knows a lot to have a look

median dune
dark ruin
#

does chat completion work?

#

try a reverse proxy

median dune
dark ruin
#

nah on ST

#

well that too

#

just to make sure

median dune
#

I can get it to work on oobabooga, but no ST.

dark ruin
#

ahh well thats sus

#

so ST wont work at all for even proxies?

median dune
dark ruin
#

alright see if this works

#

thats a tunnel ive opened to my port 5000 for ooba

#

chuck that in your text completion on ST

#

thats what it looks like

#

may need to put /v1 on the end

#

shouldnt need to tho
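
Whether the address needs `/v1` on the end depends on the client. One way to take ST out of the picture is to hit the endpoint directly. A minimal Python sketch, assuming text-generation-webui's OpenAI-compatible API is listening on 127.0.0.1:5000 and answers `GET /v1/models` (the helper names `normalize_base_url` and `probe` are made up for illustration):

```python
import json
import urllib.error
import urllib.request


def normalize_base_url(address: str) -> str:
    """Return the address with a scheme and exactly one trailing /v1."""
    address = address.strip().rstrip("/")
    if "://" not in address:
        # A bare host:port such as 127.0.0.1:5000 gets a plain-HTTP scheme.
        address = "http://" + address
    if not address.endswith("/v1"):
        address += "/v1"
    return address


def probe(address: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base>/models answers with valid JSON."""
    url = normalize_base_url(address) + "/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            json.load(response)
            return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return False
```

If `probe("127.0.0.1:5000")` returns True while ST still shows no connection, the problem is more likely on the ST side; if it returns False, the API itself is probably not reachable.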

#

this should test if its ST being annoying or if its the linux ooba for whatever reason

#

it works on my end

median dune
#

I'm still struggling to understand what you are trying to have me do.

dark ruin
#

thats what it should look like

#

so you see it worked

median dune
dark ruin
#

alright so ST for you is working fine

median dune
dark ruin
#

that means text-gen-ui is screwing around with its api

dark ruin
#

are you using radeon GPU?

#

no reason to be on linux otherwise

median dune
dark ruin
#

yeah sounds like a pain

#

i have a friend who went radeon 7900xtx

#

hes had it for months, same time as me, and he still cant run stuff

#

hes tried like 10 times lol

#

he hasnt tried the simple stuff recently tho

#

like koboldcpp and ollama

median dune
dark ruin
#

ok thats good

#

like full gpu offloading working good?

median dune
#

AMD is generally better on Linux from what I've heard.

dark ruin
#

oh it definitely is for now haha

median dune
# dark ruin like full gpu offloading working good?

How do I know if it is? I got psyfighter 13b running faster than on my windows 1080ti PC currently. I don't think that's the problem. I think it's my settings. Should I have it like this, or is there something else I'm missing?

dark ruin
#

show me your loading model settings?

#

this stuff

median dune
#

Wait, I think I got it fixed.

#

@dark ruin Holy fuck... I got it working. Another question, do you know how to get Eurake 1.3 working on oobabooga?

dark ruin
#

ill have to pop you over to my friend so you can help him lol

#

What is eurake? you mean euryale or something?

#

if it's 70B, you will need a gguf quant file of it

#

with 20gb vram and 64gb ram you should be able to run a Q4_K_M quant, but slowly
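
A rough back-of-the-envelope check of that estimate, as a Python sketch. The figure of about 4.8 bits per weight for a Q4_K_M quant is an approximation, the function names are hypothetical, and real usage also needs room for context on top of the weights:

```python
def quant_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params_billion * bits_per_weight / 8


def placement(size_gb: float, vram_gb: float, ram_gb: float) -> str:
    """Say where weights of this size could live, ignoring context overhead."""
    if size_gb <= vram_gb:
        return "fits in VRAM"
    if size_gb <= vram_gb + ram_gb:
        return "needs partial offload (VRAM + system RAM)"
    return "does not fit"


size = quant_size_gb(70, 4.8)  # assumed ~4.8 bits/weight for Q4_K_M
print(round(size, 1), placement(size, vram_gb=20, ram_gb=64))
```

Under these assumptions a 70B model comes out at roughly twice the 20 GB of VRAM, so most of it would sit in system RAM, which is why it would run but slowly.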

median dune
#

I followed this video. I also realized I had it connected, but it would go green and say none. The issue was that I didn't have a model loaded in oobabooga.
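
The same check can be scripted. A sketch in Python, assuming text-generation-webui exposes `GET /v1/internal/model/info` and reports the string `None` as `model_name` when nothing is loaded (both are assumptions about the running version, and the helper names are hypothetical):

```python
import json
import urllib.request


def model_is_loaded(info: dict) -> bool:
    """Return True if the model-info payload names an actual model."""
    name = str(info.get("model_name") or "").strip()
    return name != "" and name.lower() != "none"


def fetch_model_info(base_url: str = "http://127.0.0.1:5000") -> dict:
    """Fetch the model-info payload from the assumed endpoint."""
    url = base_url.rstrip("/") + "/v1/internal/model/info"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)
```

`model_is_loaded(fetch_model_info())` returning False would match the state described here: the connection is fine, but no model has been loaded yet.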

dark ruin
#

LOL

#

oh well

median dune
# dark ruin oh well

It looks like this when you put that info into ST. I didn't think it worked because it said none. It wasn't until the video showed how it looked on his end that I connected the dots.

#

Also, wow, the 13b model is so much faster on this than my windows machine.

median dune
dark ruin
#

the whole model MUST fit on the vram

#

gguf is cpu and system ram, with the option to offload layers onto vram to speed things up

median dune
dark ruin
#

lol thats good as

#

so the idea is to offload as many layers of the model onto vram as possible

#

without overflowing the vram

#

cuz you need some space for context

#

i would start with 20 layers and increase that until it fails to load
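
That trial-and-error loop can be sketched in Python. `try_load` is a hypothetical callback standing in for whatever actually loads the model with a given layer count (for example, reloading it by hand in the web UI); it is not a real text-generation-webui function, and the step size of 5 is an arbitrary choice:

```python
from typing import Callable


def find_max_gpu_layers(
    try_load: Callable[[int], bool],
    total_layers: int,
    start: int = 20,
    step: int = 5,
) -> int:
    """Return the largest tested layer count that loaded, or 0 if none did."""
    best = 0
    layers = min(start, total_layers)
    while layers <= total_layers:
        if not try_load(layers):
            break  # first failure: keep the last count that worked
        best = layers
        if layers == total_layers:
            break  # whole model already offloaded
        layers = min(layers + step, total_layers)
    return best


# Stand-in loader for demonstration: pretend 33 layers is the VRAM limit.
def fake_loader(n_layers: int) -> bool:
    return n_layers <= 33


print(find_max_gpu_layers(fake_loader, total_layers=40))
```

Since context also needs VRAM, it is probably safer to settle a little below the last count that loaded rather than right at the limit.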

median dune
dark ruin
#

have a look now at the llama.cpp loader

#

it will have n-gpu-layers

#

thats where you set it

median dune
dark ruin
#

web ui

median dune
#

or here?

dark ruin
median dune
dark ruin
#

as many as you can fit but not too many

#

if you can monitor vram usage that would help a lot

#

start with 20 layers and increase if that works

#

you want to aim for around 1.5-2 tokens/second

dark ruin
#

if its too many layers on gpu, it will be very slow

#

if too few layers, it will be very slow