#SillyTavern API issue
do not enable api
thats how i helped fix the other person's issue
the boolean command-line flag for api is not needed
only openai api
I was looking at the one you referred to as well. I only had openai checked off, but it was still a no go. I am doing all of this on linux, btw. I copied the blue address into the API field in SillyTavern and it still would not connect.
a friend of mine on linux had outdated ST and when he updated ST it worked
simply cuz he had the ST version before openai api was introduced
make sure you dont have ST legacy api enabled i guess
i dont know why this would even be an issue
you could also try enabling bypass check and try to gen an output anyway
dunno if that will work
I just downloaded it yesterday and I even updated it. I think I may also have to edit the CMD_FLAGS.txt file too.
Idk if that's used anymore but see how you go
I literally installed this a few days ago. But I'm pretty sure it updates through start.sh file when you open it up through the terminal. I am still having no luck with this.
yeah ST does
not text-gen
that should be the easiest step lol
could it be a firewall thing? its all local so it shouldnt be
i would get in jlllll or someone who knows a lot to have a look
I was talking about ST, there is an update file I've already run on oobabooga.
on oobabooga?
no, chat won't work.
I can get it to work on oobabooga, but no ST.
it doesn't work if you don't have an api running. You can connect it to openai, but I refuse to pay for it.
alright see if this works
thats a tunnel ive opened to my port 5000 for ooba
chuck that in your text completion on ST
thats what it looks like
may need to put /v1 on the end
shouldnt need to tho
this should test if its ST being annoying or if its the linux ooba for whatever reason
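The same test can be run without ST by asking the API directly. This is a minimal sketch, assuming the backend serves an OpenAI-compatible API on port 5000 (the port mentioned above) and answers on the standard `/v1/models` path. `normalize_base_url` and `list_models` are hypothetical helper names, not part of ST or text-gen-webui.

```python
import json
import urllib.request


def normalize_base_url(url: str) -> str:
    """Strip whitespace and trailing slashes, then make sure the URL ends in /v1."""
    url = url.strip().rstrip("/")
    return url if url.endswith("/v1") else url + "/v1"


def list_models(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/models and return the decoded JSON body.

    Assumes an OpenAI-compatible server. Raises an OSError subclass
    (for example urllib.error.URLError) if nothing is listening.
    """
    endpoint = normalize_base_url(base_url) + "/models"
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `list_models("http://127.0.0.1:5000")` on the machine running the backend should return JSON if the API is up. If it raises instead, the problem is on the backend side rather than in ST.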
it works on my end
I'm still struggling to understand what you are trying to have me do.
Ok, that worked, for your system I should say.
alright so ST for you is working fine
I even messaged the default bot and it messaged back
that means text-gen-ui is screwing around with its api
yeah it showed up here
are you using radeon GPU?
no reason to be on linux otherwise
Yes, I am. An XFX Radeon RX 7900 XT with 20 GB of VRAM, to be precise.
yeah sounds like a pain
i have a friend who went radeon 7900xtx
hes had it for months, same time as me, and he still cant run stuff
hes tried like 10 times lol
he hasnt tried the simple stuff recently tho
like koboldcpp and ollama
on oobabooga I can chat on that with no issues. I just can't connect it to ST
AMD is generally better on Linux from what I've heard.
oh it definitely is for now haha
How do I know if it is? I got psyfighter 13b running faster than my windows 1080ti PC currently. I don't think that's the problem. I think it's my settings. Should I have it like this or is there something else I'm missing?
that doesnt tell me anything, but you shouldnt need anymore from there
show me your loading model settings?
this stuff
Wait, I think I got it fixed.
@dark ruin Holy fuck... I got it working. Another question, do you know how to get Eurake 1.3 working on oobabooga?
nice one!
ill have to pop you over to my friend so you can help him lol
What is eurake? you mean euryale or something?
if it's 70B, you will need a gguf quant file of it
with 20gb vram and 64gb ram you should be able to run a Q4_K_M quant, but slowly
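A rough back-of-the-envelope check of that claim, as a sketch only. It assumes a Q4_K_M quant (the "quant 4 KM" idea above) averages about 4.8 bits per weight, which is an approximation, and it ignores context and runtime overhead. `quant_size_gb` is a hypothetical helper name.

```python
def quant_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# Assumption: roughly 4.8 bits per weight for a Q4_K_M quant.
size = quant_size_gb(70, 4.8)
vram_gb, ram_gb = 20, 64  # figures quoted in the message above

print(f"approx. weight size: {size:.0f} GB")
print("fits in VRAM alone:", size <= vram_gb)
print("fits in VRAM + RAM:", size <= vram_gb + ram_gb)
```

Under these assumptions the weights are about twice the size of the VRAM, so most layers stay in system RAM, which is why it runs but slowly.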
In this tutorial, I show you how to use the Oobabooga WebUI with SillyTavern to run local models with SillyTavern.
I followed this video. I also realized I had it connected, but it would go green and say none. The issue was, I didn't have a model loaded when on oobabooga.
it looks like this when you put that info on ST. I didn't think it worked because it said none. It wasn't until the video showed how it looked on his end that I connected the dots.
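The two signals described here (green light, model shown as "none") mean different things, and a small sketch makes the distinction explicit. It assumes the backend reports the literal text "None" when no model is loaded, as seen in this thread. `connection_status` is a hypothetical helper, not an ST function.

```python
from typing import Optional


def connection_status(reachable: bool, model_name: Optional[str]) -> str:
    """Turn 'is the API answering' plus 'reported model name' into a readable status."""
    if not reachable:
        return "not connected: API is not answering"
    # Assumption: an unloaded backend reports None, an empty string, or "None".
    if model_name is None or model_name.strip().lower() in ("", "none"):
        return "connected, but no model is loaded in the backend"
    return f"connected, model loaded: {model_name}"


print(connection_status(True, "None"))
print(connection_status(True, "psyfighter-13b"))
```

So a green light with "none" points at loading a model in oobabooga first, not at the URL or the API settings.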
Also, wow, the 13b model is so much faster on this than my windows machine.
I downloaded the gptq version, so if I download the gguf version, it should work?
thats good news
gptq and exl2 are gpu only
the whole model MUST fit on the vram
gguf is cpu and system ram, with the option to offload layers onto vram to speed things up
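The rule of thumb in the last three messages can be written down as a tiny decision helper. This is only a sketch: `suggest_format` is a hypothetical name, and the 2 GB reserved for context is an assumed placeholder, not a measured figure.

```python
def suggest_format(model_size_gb: float, vram_gb: float,
                   context_reserve_gb: float = 2.0) -> str:
    """Suggest a model format from file size and available VRAM.

    gptq/exl2 need the whole model (plus some room for context) in VRAM;
    gguf can keep layers in system RAM and offload only part to the GPU.
    """
    if model_size_gb + context_reserve_gb <= vram_gb:
        return "gptq/exl2"
    return "gguf"


print(suggest_format(7.3, 20))   # small model, fits in VRAM with room to spare
print(suggest_format(42.0, 20))  # larger than VRAM, needs RAM plus offload
```

A gguf file also works when the model fits entirely in VRAM; the helper just reports the GPU-only formats as the first choice in that case.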
I got a 32 × AMD Ryzen 9 5950X 16-Core Processor and 125gb of ram.
lol thats good as
so the idea is to offload as many layers of the model onto vram as possible
without overflowing the vram
cuz you need some space for context
i would start with 20 layers and increase that until it fails to load
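The "start at 20 and go up until it fails" procedure is just a simple upward search. Here is a sketch of it, where `try_load` is a hypothetical stand-in for loading the model by hand with a given layer count and noting whether it succeeded; the step size of 5 is also an assumption.

```python
from typing import Callable


def find_max_gpu_layers(try_load: Callable[[int], bool], total_layers: int,
                        start: int = 20, step: int = 5) -> int:
    """Return the largest tried layer count that loaded, or 0 if the first try failed."""
    best = 0
    n = min(start, total_layers)
    while n <= total_layers:
        if not try_load(n):
            break  # first failure: keep the last count that worked
        best = n
        if n == total_layers:
            break  # whole model is already on the GPU
        n = min(n + step, total_layers)
    return best


# Fake loader for illustration: pretend anything above 43 layers overflows VRAM.
print(find_max_gpu_layers(lambda n: n <= 43, total_layers=80))
```

Because context also needs VRAM, it is safer to settle a step below the highest count that loads than to run right at the limit.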
I'm downloading a gguf model now, but where would I adjust that, in parameters?
is this on the web ui settings or in the file directory?
web ui
that one
how many layers would I set it to?
as many as you can fit but not too many
if you can monitor vram usage that would help a lot
start with 20 layers and increase if that works
you want to aim for around 1.5-2 tokens/second
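To check a run against that target, divide the number of generated tokens by the time it took. A minimal sketch with hypothetical helper names; the 1.5 tokens/second floor is the lower end of the range quoted above.

```python
def tokens_per_second(tokens_generated: int, elapsed_seconds: float) -> float:
    """Generation speed in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tokens_generated / elapsed_seconds


def meets_target(rate: float, low: float = 1.5) -> bool:
    """True if the rate reaches the lower end of the suggested range."""
    return rate >= low


rate = tokens_per_second(90, 50.0)  # e.g. 90 tokens in 50 seconds
print(f"{rate:.2f} tokens/second, usable: {meets_target(rate)}")
```

If the rate falls below the target, try offloading more layers; if the model stops loading, back off again.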
ok