does text-generation-webiui has api? | Text Generation WebUI | Page 1

lime thorn Jul 1, 2023, 3:52 PM

#

any api available for the webui?

rancid plinth Jul 1, 2023, 4:57 PM

#

There are 2, a built in API and an OpenAI API compatible extension

#

check the docs for more info

#

There are api examples in the api-examples/ folder, and the openai API has a readme about compatibility and links to API docs.

blazing oak Jul 29, 2023, 5:49 AM

#

rancid plinth check the docs for more info

Which docs exactly? Was looking for a web documentation but couldn't find any. Do you mean the sample py scripts?

rancid plinth Jul 29, 2023, 10:11 AM

#

The built it API only has examples, but the openai API is well documented, starting here: https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai

GitHub

text-generation-webui/extensions/openai at main · oobabooga/text-ge...

A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, OPT, and GALACTICA. - oobabooga/text-generation-webui

#

because it's openai compatible, once you understand the limits you can just use the openai API documentation and examples

blazing oak Jul 29, 2023, 12:18 PM

#

So you would recommend the openai API?

#

What's with that API-key in the environment? I have to set up my own one or do I need an actual OpenAI key?

rancid plinth Jul 29, 2023, 2:58 PM

#

I do, unless you have specific reasons for using the ooba native api, but I'm very biased, I wrote it.

#

The API key is needed by most existing clients, but you can set it to anything when using it with ooba, like 'sk-dummy' etc. I use sk-1111111111111111111111111111111111111111111111111111 (or some other long string of 11's - see the docs) because some apps are picky about the format. It is NOT a real openai API key.

#

it's just a placeholder because the existing clients require it, on the server it's ignored.

blazing oak Jul 29, 2023, 3:20 PM

#

Okay, then I might consider it. my plan was to stay completely offline, running my own STT, conversational model and TTS (Whisper-STT, Llama-2 uncensored, Coquis-TTS)

rancid plinth Jul 29, 2023, 4:20 PM

#

blazing oak Okay, then I might consider it. my plan was to stay completely offline, running ...

doesn't really sound like you need an API, unless you're building your own setup. There are extension for all that stuff direct in tgwui

blazing oak Jul 29, 2023, 4:25 PM

#

Yep and they are supoptimal for my purpose

#

You gotta press a button to talk and to stop recording
Silerto TTS quality is not what I had in mind, Bark TTS is insanely computationally intensive and Eleven.ai is overpriced. Therefore I went with Coquis-TTS and wrote a python script that records my voice once a volume threshold is reached/ stops recording once the voice falls below said threshold for a certain time.

#

Actually I wanted to avoid using tgwui completely because I won't need 90% of what if offers, but considering the lack of official documentation on the subject of newer LLMs like Llama2, I quickly realized that I can give up right away if I try to setup the LLM in python from scratch without any prior experience.

#

My end goal is A STT-TTS-VTube-AI that can perform simple read/write tasks on my mashine as well as querying the web (the latter two are optional and depend on difficulty). Therefore I tried to keep it as slim and performant as possible but with a decent TTS quality. It is supposed to run on my laptop GPU 3070Ti without crazy delays. Right now the delay between my STT input and the TTS output is about 6 to 15 seconds, which I consider already quite okay, but I bet it could be faster when I get a grip on streaming and feed the TTS sentence by sentence as they get finished by the AI while recording further STT in parallel and transcribing it. to be next in line for the AI to answer. If I can ditch tgwui entirely and run it all within one python application, it would be even faster, but I am a python novice, so I don't see any realistic chance of me pulling that off on my own.
For now I am happy with the current state my program is at and am working on getting the VTube Avatar hooked up to the whole thing to give the AI more personality. (I come from CGI/VFX background, so making the model, rigging, animating etc. is the easy part and nothing too concerning)

#

The only real problem I face right now is the chat history of the standard API. I have no idea if it includes everything that has been said or only the context plus the last message (that is what I see when using --verbose).
I am not sure how the whole template for the AI works neither, which template to use for my particular model, etc. I just can't find the necessary explanations and sometimes see contradicting information.
Maybe I am too late to the party or maybe there simply is none and everyone working on these things is just a fullfledged python and ML pro.

blazing oak Jul 29, 2023, 4:38 PM

#

rancid plinth doesn't really sound like you need an API, unless you're building your own setup...

So basically most of my work with the API is done. Only the coherence of my model is really bad and I am not sure if this is a problem of the Model itself, wrong settings OR the history simply not containing all of the chat messages but only the recent one. If it is the latter, I don't see much reason to switch to another API and start from square one if what I have works already with only a single piece missing.

blazing oak Jul 29, 2023, 5:01 PM

#

And one more question I also couldn't really find an answer to: What is the difference/purpose between history['internal'] and history['visible'] ? How does the model treat the two?

rancid plinth Jul 29, 2023, 7:54 PM

#

not sure, that's a good question for #api-dev-help though

brisk tinsel Jul 30, 2023, 5:54 PM

#

@rancid plinth can i ask something, so once a chat starts on the ui its like a websocket connection, is there option for us to do all of it via post request with text gen webui?

rancid plinth Jul 30, 2023, 6:29 PM

#

brisk tinsel <@366616087976214529> can i ask something, so once a chat starts on the ui its l...

yeah, there is a blocking API, check the API examples for it

brisk tinsel Jul 30, 2023, 6:36 PM

#

rancid plinth yeah, there is a blocking API, check the API examples for it

I'm sorry for bugging you with more questions, but there is no detail guide on the api for me to better understand. Do i need to enable the --no-stream and --api to do the post request. I am looking to create a discord bot that can send message and then receive the response from the model hosted on runpod

rancid plinth Jul 30, 2023, 6:38 PM

#

no guide, no, just the example. just --api.

#

I am biased, but the openai extension (--extensions openai) supports an API also, which is well documented (because it's openai) and they also have a demo discord bot that works out of the box. Link in the readme for the openai extension.

brisk tinsel Jul 30, 2023, 6:53 PM

#

rancid plinth I am biased, but the openai extension (--extensions openai) supports an API also...

oh i was going through the chat above and did saw about openai, and was confused what to go with, i'll check it out. Also the demo discord bot thing where can i check it out?

rancid plinth Jul 30, 2023, 6:58 PM

#

https://github.com/openai/gpt-discord-bot

GitHub

GitHub - openai/gpt-discord-bot: Example Discord bot written in Pyt...

Example Discord bot written in Python that uses the completions API to have conversations with the text-davinci-003 model, and the moderations API to filter the messages. - GitHub - openai/gpt-di...

brisk tinsel Jul 30, 2023, 7:07 PM

#

rancid plinth https://github.com/openai/gpt-discord-bot

bet thank you bro appreciate the help

dense sinew Jul 31, 2023, 8:53 AM

#

How can I make it to generate an api link so I can use that in silly tavern.

blazing oak Aug 1, 2023, 1:13 AM

#

In the settings you can check API and it should expose a link in the cmd window when you start the webUI

#

"Europa was discovered on January 31st, 16090 by Simon Marius while he was observing Jupiter through his telescope. It has been known for centuries that there were four large satellites orbiting Jupiter before it was officially named "Galileo" after its discoverer. In 17897, William Herschel gave them their current names based on mythological characters related to Zeus. Europa is the second largest moon of Jupiter at approximately 3,5400 kilometers across. Its surface appears mostly smooth due to impact craters caused by asteroids and comets hitting it over millions of years. Europa also contains ice deposits near its poles which could potentially contain water beneath the surface. NASA's Galileo spacecraft visited this moon during the late 19990s and early 200000s, revealing evidence of oceans underneath the ice shell. However, no missions have yet succeeded in landing on Europa because of extreme radiation levels found within its atmosphere."

Anyone else has the problem where Llama-2 Airoboros 7b adds another digit to all year numbers resulting in ridiculous large numbers?

rancid plinth Aug 1, 2023, 1:17 AM

#

I have this problem when using --compress_pos_emb, it's not the only thing that's 'stretched'

#

nothing to do with the API

blazing oak Aug 1, 2023, 1:18 AM

#

I know, I tried to answer the question of somebody before me. The weird numbers bug was something I posted after.
Let me check if I use that flag, but as far as I remember, I didn't use this.

#

Oh, well yes. I did, because I increased the max_seq_len for my model (GPTQ), so I had it at 2.

rancid plinth Aug 1, 2023, 1:22 AM

#

even at 2? yeah, damn. You could try --alpha_value instead

blazing oak Aug 1, 2023, 1:22 AM

#

Increase alpha to 2 and keep compress at 1 ?

rancid plinth Aug 1, 2023, 1:22 AM

#

I had problems when I used that too, but maybe they're fixed now

#

yeah, it's one or the other

blazing oak Aug 1, 2023, 1:23 AM

#

Alright, thanks for the heads-up 🙂 Will try it

#

Do you know by any chance where I can find a good TTS VITS model for child voice?

rancid plinth Aug 1, 2023, 1:24 AM

#

no

blazing oak Aug 1, 2023, 1:24 AM

#

nevermind, thanks 🙂

#

Top! The date thing works! Thnks for the hint!

#

Ah, I had another question on my mind which I almost forgot to ask:
When using http://127.0.0.1/api/v1/generate instead of http://127.0.0.1/api/v1/chat, the extensions of ooba will be circumvent or seem not to kick in.
I was building the prompt manually since I had problems using the api-chat method but now it seems as if I won't be able to use the Long Term Memory extension and others. There is no way to trigger them via the API, right?

rancid plinth Aug 1, 2023, 1:49 AM

#

I don't think so, no, generate is a completion api, you have to do it all yourself

dense sinew Aug 1, 2023, 6:34 AM

#

Doesn't 8K context GGML not work for webUI?

dense sinew Aug 1, 2023, 7:02 AM

#

this is the kind of result I am getting when using an 8K GGML model.

blazing oak Aug 1, 2023, 2:06 PM

#

dense sinew this is the kind of result I am getting when using an 8K GGML model.

Got something similar when increasing to 8k. I resorted to reducing it again to 4k.

#does text-generation-webiui has api?