#ad_discordbot (Fork of Fork of xNul's bot)
found an interesting tts
but not sure if it is for tgwui
it got the tag
I was having some trouble with the "is typing..." management I incorporated in each Task() object.
This is my new approach to managing it... this ensures that there is only one running "is typing..." task, driven by the state of the event flag
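A minimal sketch of the idea, not the bot's actual code: one background loop keeps the indicator alive while an `asyncio.Event` is set, and calling `start()` twice never spawns a second loop. `TypingManager` and `send_typing` are hypothetical names; `send_typing` stands in for whatever coroutine triggers the indicator.

```python
import asyncio

class TypingManager:
    """Keeps at most one 'is typing...' loop alive, gated by an event flag."""

    def __init__(self, send_typing, interval=5.0):
        self._send_typing = send_typing   # hypothetical coroutine that shows the indicator
        self._interval = interval         # seconds between refreshes (assumed value)
        self._active = asyncio.Event()
        self._task = None

    def start(self):
        self._active.set()
        # Only spawn a loop if none is currently running.
        if self._task is None or self._task.done():
            self._task = asyncio.create_task(self._run())

    def stop(self):
        # The loop notices the cleared flag on its next wake-up and exits.
        self._active.clear()

    async def _run(self):
        while self._active.is_set():
            await self._send_typing()
            await asyncio.sleep(self._interval)
```

Every Task() can then call `start()` and `stop()` on the same manager instead of owning its own typing loop.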
after digging around hard, found this chinese? project
Off topic
absolutely not
it gives ideas
worked
so
can you add emotion recognition? ^-^
can the bot stream?
i mean if a discord bot has the ability to stream
or share screens...
i think ive never seen a bot doing that...
i found a part that could be useful for you, this guy splits the text into chunks by detecting . ! ?
and send the text to tts
by my calculations, this makes tgwui only need about 4 tokens per second to speak fluently.
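Roughly what that chunking could look like, as a sketch only (the linked project's real code is not shown here). Complete sentences are peeled off the text received so far and could be handed to TTS while the rest keeps generating. `pop_complete_sentences` is a made-up helper name.

```python
import re

# Split after ., ! or ? when followed by whitespace.
_SENTENCE_END = re.compile(r'(?<=[.!?])\s+')

def pop_complete_sentences(buffer):
    """Return (complete_sentences, remainder) for the text received so far."""
    parts = _SENTENCE_END.split(buffer)
    remainder = parts.pop()  # the last piece may still be mid-sentence
    return [p for p in parts if p], remainder

sentences, rest = pop_complete_sentences("Hello there. How are you? I am")
# sentences can go to TTS now; rest waits for more tokens
```

The remainder has to be flushed once generation ends, since a final sentence with no trailing whitespace is never split off.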
i exploded
Chunking text is planned already, won't be a problem
TTS - I don't plan on marrying one TTS API yet so it's not planned
Improved TTS at the very bottom of my list
what about this
discord bots cannot send video in a voice channel
any way to make the bot play the tts on the computer too?
it is possible using pyaudio library.
But no the feature is not part of the bot.
How would you want that implemented (if it's put on the todo list)
Like tied to a specific user so only some users can trigger TTS on the desktop?
It would be weird to hear replies to conversations you're not a part of.
Also joining discord VC already has that functionality.
What feature are you looking for that couldn't be done over discord?
discord has a delay
i want to plug it into a 2d model
and lip sync or something
if i do it by discord, i have a delay sending the audio to discord and receive and generate and send to discord
I ran STT to textgen to TTS and the tts model was running on Google colab at the time.
I used discord for voice receive and output.
It wasn't that bad!
Most of the latency probably came from colab and sending voice data across multiple pcs.
I think its just that streaming needs to be figured out and it would be an awesome experience!
5 seconds latency from voice input start to tts reply
Yea, I see, youd want the video synced to the audio
speaking but lip not moving at the beginning
that's it
One hacky idea is to create a virtual webcam and stream that to discord using your user account (if other people want to use it)
i just thought that if the bot can send the audio to discord why it cant just play it
smart
you got my idea
Discord doesnt provide all the features of user accounts to bots
I imagine streaming video is computationally expensive for discord, so they dont want 1000s of bots spamming video feeds
Voice on the other hand is insanely small in comparison
An idea I once had was creating a few reaction images/gifs for an avatar and having the chatbot send those as emotional states change.
Not the same as lip sync, but a step in that direction ^^
Discord doesn't allow user bots, that's risky for account termination
Again, spam and people abusing the system
Using an account to stream video from a program doesnt involve automating the discord account, I think that's fine
¯\_(ツ)_/¯
this is the game i talked about :v
how can i plug this https://github.com/legekka/GanyuTTS into the bot with api
how do i plug something via api
at the moment the bot uses tgwui to handle tts.
The bot doesnt make the tts calls itself.
it's the tgwui extension that hooks into the text generation pipeline that returns audio.
To get it working with the bot as is right now, one could make an extension for tgwui.
Or wait for api usage to be implemented
Yes, APIs make things easy to use.
Its just some stuff needs to be shuffled around in the bot I believe.
So that it doesnt use tts from an extension and api at the same time
hmm, that could just be part of the setting.
Like instead of "Edgetts" or "alltalk"
it could be "api"
and another key specifies the url
Maybe another would be needed to specify the schema (Alltalk api will differ from Ganyu for example)
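A sketch of how those settings might look. None of these keys exist in the bot today; the key names, the URL and both payload layouts are invented for illustration and are not taken from the AllTalk or GanyuTTS APIs.

```python
# Hypothetical settings: an "api" client plus keys for the URL and the schema.
tts_settings = {
    "tts_client": "api",                     # instead of "edge_tts" or "alltalk"
    "api_url": "http://127.0.0.1:8000/tts",  # placeholder URL
    "api_schema": "example_b",               # which payload layout to use
}

# One payload builder per schema, since each API expects different field names.
# These field names are made up; a real schema must follow the target API's docs.
PAYLOAD_BUILDERS = {
    "example_a": lambda text, voice: {"text_input": text, "voice": voice},
    "example_b": lambda text, voice: {"text": text, "speaker": voice},
}

def build_tts_request(settings, text, voice):
    """Return (url, json_payload) for the configured schema."""
    builder = PAYLOAD_BUILDERS[settings["api_schema"]]
    return settings["api_url"], builder(text, voice)
```

Supporting another TTS server would then mean adding one builder rather than touching the generation code.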
right not all the apis are the same
f
@terse folio do you control the bot.py?
like know how is the code
where is each thing
Reality has the authority to change anything in the bot, but they're not super familiar with everything in it. Especially now that I just revised most of the code
Reality wrote the history manager, and really helped with improving settings management
i suggest a file that divides the code in several part (tell the line of big chunks)
6000 lines is something that i cant ask chat gpt
27k words is nearly half of a book
theres a function for sending the tts in the discord_utils.py file iirc.
you can search that function name to see where its used.
Yes the bot has a zillion features and so also has many lines
way too many line for a beginner (gpt user)
gpt handles around 30k characters
about 600 lines
using tags to trigger tts would be an interesting solution
already using the existing "do this after that" type logic
Tags just need a tts function implemented
just something that i did not understand
what the difference between that and the /speak
brain isnt braining
sorry guys need more detail
/speak is a command on discord, a way to interact with the bot.
tags are a sort of programming/scripting language where you can put various parts together.
I suggested using the tag system to implement API driven tts to leverage the already existing code instead of adding new code that decides if tts should be generated on each generation
did not understand "instead of adding new code that decides if tts should be generated on each generation"
in a conversation every reply should have a tts
at the moment the bot doesn't handle TTS at all.
The bot tells TGWUI to load the TTS extension.
That extension hooks into the text generation part of the code so when you generate text, it also outputs an audio file.
yeah
To use an API for TTS, we need to start handling the TTS (which we were not)
so it's not a simple swap.
tag to handle stuffs thats outside of tgwui
Thing is, the tags system already works when text is being generated.
It runs other code before/after generation
So instead of reimagining how the bot generates text.
just make a little addon to the tags to do tts there
the bot just runs TGWUI
which is the webui
you could disable the webui, there's a flag for it
so all the time i could just type 127.0.~~~ and change some settings?
not bot settings
tgwui settings, yes
Oh, you were asking why there isn't a webui for the bot
#1154970156108365944 message
this might be your answer
i just want to use tgwui with the bot
some ui to make setting more accessible is good though
it should work.
The webui doesn't work for me with the bot running oddly, I don't know what the issue is
that is what im asking
you'd have to ask @halcyon quarry about that
the api is working after his fixes
just not sure if it is working working
oh wait
i didnt plug the ethernet to the server ._ .
there is no such a message that tells you the url
and still can open the webui
I have the same issue, I'm not sure what the issue is
nah api working but not working
Will need more details than that :3
something is already using port 5000
do you have another copy of tgwui running/another bot?
just have the message bug and not working
the webui has no message and not working
and i still dont understand why complex memory wont work
cant ask gpt bcs the code is 6000lines long ._ .
does it work with normal tgwui?
My guess is addiscordbot bypasses something internally where the complex memory extension would hook into
at the beginning extensions that have setup in their code dont work
and what altoiddealer did was ignore and instead of skipping retry
fixed edgetts but
complex memory is strange
these are his words
/v1/chat/completions i found it
the bot has a release now
Hi
@halcyon quarry how about stop "is typing..." for a few second if the message is taking too much time
There's two reasons the bot doesn't use the API
not about that
is about why api of the bot isnt working
as you fixed the openai extension
there should be a api
and one issue, extension loading twice
That's a feature
alright
answer this one then
I don't like the API.
The API does not return a TTS response, so I can't manipulate it - send it to voice channels /etc.
I monkeypatch load_extensions in order to update extension params anytime we want such as custom TTS voices etc - can't do it while using the API.
The API is extremely strict about the payload. I don't like it
They mean that the api doesn't work with the bot running
same for me, when using my other apps after working on the bot
i have to shut it down and start tgwui normally
I don't really know what I'm doing, per se
Most of the TGWUI code comes from the TGWUI server.py file
Part of that code could block additional instances / the API / etc
it's possible something is going wrong that breaks some extensions
Also, I'm pretty sure the webui itself doesn't load?
(at least I remember having an issue, maybe it was that the port was different)
i'll check again later
It's likely because the bot uses the shared module
also you 100% can write your own extension that returns whatever you want over api.
I've done it before!
I think it was before the openai extension,
I made an endpoint to stream text per sentence, also using uploaded regex as a stopping string on server side that could match on those sentences/the full string
It was to match things like "if the next sentence doesn't start with AI: then stop"
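A rough sketch of that kind of server-side rule, assuming the response has already been split into sentences. `truncate_at_stop` is a made-up name and this is not the endpoint described above, only the matching idea.

```python
import re

def truncate_at_stop(sentences, stop_pattern):
    """Keep sentences until one matches stop_pattern, then drop the rest."""
    stop = re.compile(stop_pattern)
    kept = []
    for sentence in sentences:
        if stop.search(sentence):
            break
        kept.append(sentence)
    return kept

# "Stop if the next sentence doesn't start with AI:" as a negative lookahead.
kept = truncate_at_stop(["AI: hi.", "AI: sure.", "User: ok"], r"^(?!AI:)")
```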
Since I decided to start working on the new behaviors, I've been digging myself deeper and deeper into a hole I may never find my way out of
Ohno!
the ever lasting recursive hole of adding more detail!
run while you still can
what did i miss
whoa
work on things from a more general/broad perspective, so you don't lose vision of where the feature should be going
get yourself a nice todo list ^^
whoa
Btw, I like Obsidian.md + Kanban plugin for todolists
"todo 1 year ago"
procrastinated too long
can make categories and move notes around/mark them as complete ^^
im having a ton of tabs lately...
closed a lot of them already
so what gpu are you using?
I'm using a 2080ti
But I'm also working on a DNS like system to easily connect other pcs without bothering with networking/ips and ports
so a lot of these models would be distributed
?
I'm still stuck making the things I added recently work
DNS is short for Domain name system.
That is how your browser can connect to "google.com" and some DNS host will resolve "google.com" is actually 142.250.65.206 and then send your packets there.
My idea is similar, running little programs on one's pc that registers your machine as a worker to a central point where you can search by worker type without caring about ips
and all that tunnelling gets resolved automatically
On top of this, I can write queries.
Like: Connect me to 3 PCs running stable diffusion that have controlnet and are using model Y
im using remote desktop
Awesome ^^
Yea, so you know how with cloud computing you don't really have to deal with scaling yourself.
you set up an image and tell the system to start new instances when you need more.
Here, you can connect more machines any time to the system similar to a home hosted cloud
Yea, I read about that, impressive the laptop runs at such little power draw!
Might have to look into that myself one day!
still not getting the idea
servers can add more machines to get more speed
but to be able to connect you need pay a lot
and not getting much
I do tests on my pc,
I sometimes use my old laptop to run some models on CPU.
I can also use other PCs from the family if they're not busy.
Issue is, normally you have to change the ip/port in your app to point to those machines if you decide to change where a model is hosted.
parsec can remote control without dealing with ip
I'm not talking about remote desktop
im super comfortable using it
thats true
what about vpn
if you dont need tooo real time vpn is a solid option
Vpn is a means of connecting to another network securely,
Yes that's great for security, and could also be used with the system possibly.
The system acts as a list of known workers (your pc hosting a model)
As your pc runs the worker client, it becomes part of the network where it gets a load of requests to process until it disconnects any time
(very simplified example)
instead of:
config:
- SD: 10.0.2.1:7680
- tts: 10.0.2.3:10000
- textgen: 10.0.2.2:5000
you just:
config:
- SD: "jugernaut model"
- tts: "use xtts"
- textgen: "solar10b"
And the system figures out what PC to route your requests to
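A toy version of that lookup, to make the idea concrete. It is an in-memory sketch under my own assumptions (the names `Worker` and `Resolver` are made up) and leaves out networking, tunnelling and health checks entirely.

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    address: str                  # e.g. "10.0.2.1:7680"
    kind: str                     # "SD", "tts", "textgen", ...
    model: str
    features: set = field(default_factory=set)

class Resolver:
    """Central list of known workers, searchable by what they can do."""

    def __init__(self):
        self._workers = []

    def register(self, worker):
        self._workers.append(worker)

    def unregister(self, address):
        self._workers = [w for w in self._workers if w.address != address]

    def find(self, kind, model=None, features=(), limit=None):
        matches = [
            w for w in self._workers
            if w.kind == kind
            and (model is None or w.model == model)
            and set(features) <= w.features
        ]
        return matches[:limit]   # limit=None returns every match
```

A query like "3 PCs running stable diffusion that have controlnet and are using model Y" becomes `resolver.find("SD", model="Y", features=["controlnet"], limit=3)`, and the caller never sees an IP unless it asks for `worker.address`.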
oohhhhh
thats why you said dns
i can remember my internal ip adress
and you can change the port to ez numbers
yup, you absolutely could do it manually.
My goal is to make it so I can turn on any pc, start the program on any pc and it knows what it can do and is already processing data!
you want a redirect url or a website that has everything
you are goin to use like the tgwbui or something like the discord bot
yes, sort of like a redirect.
you send your textgen/tts/imagegen request to the central point with some other parameters like what model you'd prefer ... etc
And it picks the best device for the job and sends it back.
I will be working on a tgwui extension so it can join home hosted versions of this network
i want a chatbot, server: gotcha, activating laptop1?
wake on lan would be a cool feature.
I wonder how you'd go about security with that.
wake on lan is not that hard
how do you get past the login and start up programs?
idk, never done this
if you mean turn on the computer it is easy
yea, but I mean before you start running programs, you need to be logged in as a user right?
follow a simple yt tutorial
will put that on my todo list
you dont have to log in
windows it self has a feature
like if i turn on the pc execute this file
works in the background
i think
i still can access to tgwui with the laptop locked
more like:
You have the chatbot running on your pc, but a friend comes over and offers to add some extra compute to run image generation as well.
The chatbot being subscribed to events from the "Resolver" system, sees that it now has image generation ability, and now lets you prompt images.
Good to know, that would be useful!
I was thinking of traveling sometimes and wanting to get textgen loaded on my pc haha
maybe you have to leave your computer on 24/7
i think the logging after a reboot is special
2.4
I wish
Sleep/hibernate isn't the best option, it breaks a lot of programs for me
hibernate never broke something for me
might have been sleep then, okay
so i never turn off my computer
hibernate unloads everything onto the drive
so i dont think i will break anything
even someone changed the cpu during hibernate
Yea, keeps a snapshot of everything
it is cool ^^
but remote desktop is really useful, suppose that something went wrong, you can just fix it
and gonna try something...
yea, each tool has their own use case
alright
programs can start in the background after a reboot without login
program something with Windows Task Scheduler
send a magic packet and everything done
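For the magic packet step, a minimal sketch: a Wake-on-LAN packet is six 0xFF bytes followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. The broadcast address and port 9 below are common defaults I am assuming, and the target machine needs Wake-on-LAN enabled in its BIOS/NIC settings.

```python
import socket

def build_magic_packet(mac):
    """Six 0xFF bytes followed by the 6-byte MAC address repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the packet on the local network over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Calling `send_magic_packet("00:11:22:33:44:55")` with the laptop's real MAC would wake it, after which the Task Scheduler job takes over.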
and remote desktop to check what is wrong, and you can travel with tgwui
Mhmm, mhmm ^^
just connected to my tgwui with mobile network using vpn
another good thing about laptops is that the system wont use vram if you have a igpu
well desktops could
but i prefer cpu with f
:v
yea, my pc was the first time I built a pc without an igpu, was confused why the monitor wasn't working haha.
was many years ago
didn't know that was a thing back then
running 4k monitors or something?
not a 4k
but got a lot of stuffs opened
ah, makes sense
I get around 1.3-1.4gb idling with no games/video editor open
typically using vscode
sus
yup yup
is flask api famous
flask and Django are the 2 most well known python web frameworks
what api do alltalk use?
Personally I think flask is simpler and to the point, but the alternative is probably more feature packed
their own api?
it uses fastapi, which runs on uvicorn which is a newer??? python webserver
also faster than flask
it's probably compiled to C++? or C, i forgot, in the backend.
no, the uvicorn webserver
Some python libraries like Numpy are compiled and can leverage C speeds that are really really fast
That's why it doesn't really matter what language we build our AI apps in
well, if you use Python
it's using compiled libraries, so you get the benefit of both worlds!
Coding speed, and execution speed
https://github.com/skshadan/TTS-RVC-API
this is fastapi?
oh coqui
if you can go to ..../docs it will tell you
for example
fastapi creates a documentation page from your code.
and also helps validate your requests to adhere to your schemas
do you have a project with fastapi support?
sure, tgwui openai api extension is one of them btw
you can go to localhost:5000/docs
to get the tgwui docs
so how i plug it? :v
hmm?
I checked out alltalk and found that it uses the same library as I.
But no, I wrote my own little api for TTS generation based on a project from github
alltalk and coqui are pretty similar maybe
coqui is part of TTS (lib on PyPI)
TTS is used by alltalk
I believe coqui was an early version/is XTTS?
which is one of the engines supported by TTS (the lib)
it's the one with the frog logo iirc
yes, coqui created the TTS lib
and xtts
okay, that makes sense about their relationship now
gonna try easier stuffs
like this one :D
Alltalk looks easy imo, it has a standalone installer and can be ran as an extension for tgwui.
I tested it and it worked fine
but this is not alltalk
i just dont like alltalk
this looks good and i like vits
this would be a step after tts generation.
RVC is a voice changer
one of the reasons to try first https://github.com/Arondight/vits_api_tts
Sure!
try it with tgwui standalone first so you can rule out any potential bugs as being from the extension
i just believed the readme and almost killed my self
i hope this is easy
backup just in case ._ .
compiling things is always scary, especially on windows,
had a lot of bad experiences with things constantly failing.
turned out I had some paths wrong, it was using some wrong version of buildtools and couldn't find required libraries
ill tell you that it has a step by step instruction
looks like 100% beginner friendly
who is really friendly was chatgpt ;-;
was chatgpt able to help you install it?
when projects have short codes ._ .
ah, was going to say that's impressive for something it never seen before
the longest was 800 lines
i copy the log and send it to chatgpt with the code
and magic
Hmmm
when you dont know about code you cant select a part of the code and send it to chatgpt
have you ever seen that kind of folder in github?
git doesn't download it properly?
the repo
the project is cool but pain
at least i ve done it :)
git clone --recurse-submodules <url>
thats why it was a betrayal
if you dont know what are you doing (completely new)
i searched "how to git clone links to sub git repos"
mhm, unfortunately a lot of programs aren't that friendly.
I've spent months learning/researching something to get some things working in the past...
yea big pain
with simplified explanation and showcase
you can decode it with subtitles
awesome :)
hmm, i'd expect the "deployment package" to already have a ready made environment in it
buddy could show me the quick deployment first
definitely gonna go for the quick one
ohno compiler errors
you cut off the error above
So I finally have the new behavior timing working correctly. Perfectly.
It's just a bit lame at the moment, until I update it further... streaming responses, replying to multiple messages at once, etc
"replying to multiple messages at once"
nah its fine
https://github.com/Artrajz/vits-simple-api got simple in its name i see a hell long readme
Pushed some updates to Main
This includes the thing that made the RVC model output work for edge_tts
did you get my message about the mp3 file actually being a wav from rvc?
kinda confused what the code screenshots are about
you have no voices?
it should be using the api
https://github.com/Arondight/vits_api_tts this is the extension
It isn't populating any voices
which should using the api to do something with https://github.com/Artrajz/vits-simple-api
yes, but the program is complaining about something missing
I really don't know what you mean by this... are you saying it is producing a .wav and saving it in an .mp3 container?
it's producing a wav, just named ".mp3"
you can rename files to any extension, doesn't mean they'll work.
but many do.
like if you rename a .gif file to .png, or jpg?, it will open as a still frame
i know i know betrayal
I've never used the repo, I have no idea what's going on there
Good eye on this lol
never gonna believe "for beginners" again
that codec error is what led me to checking the file with ffprobe
The suspicious thing is the other outputs are < 100kb while the output from the RVC model is > 2MB
yea
our good pal chatgpt helped add a check to identify and fix improperly saved files before processing
what's the link to the rvc extension you're using that's bugged
It now saves that to wav before processing correctly
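A check along those lines can be done without ffprobe by reading the file header: WAV files start with `RIFF`, four size bytes, then `WAVE`. This is a sketch of the idea and not the check that was actually added to the bot; both function names are made up.

```python
from pathlib import Path

def looks_like_wav(path):
    """True if the file starts with a RIFF/WAVE header, whatever its extension says."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"

def fix_extension(path):
    """Rename a mislabeled WAV file to .wav; returns the (possibly new) path."""
    path = Path(path)
    if path.suffix.lower() != ".wav" and looks_like_wav(path):
        new_path = path.with_suffix(".wav")
        path.rename(new_path)
        return new_path
    return path
```

Running `fix_extension` on the RVC output before processing would avoid the codec error, since the decoder then gets a filename that matches the real container.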
maybe there's a setting to change the output format
our good pal chatgpt is on his second try ;-;
It's this extension https://github.com/Unorthodox-oddball/text-generation-webui-edge-tts
it is the fork
the issue
now i remembered everything
ideally, this stuff should always save as wav
quality loss is minimal, and it's not like tts is super high quality to begin with
depends on the use case
like sending over network
would want some compression
you have to mention https://github.com/Unorthodox-oddball/text-generation-webui-edge-tts bcs the original one has problems
yes that is the link I just shared
ayo
i mean here
or somewhere
I'm pretty bad at updating the Wiki consistently. Will do that now, thanks for the nudge
for some reason the original one is on https://github.com/oobabooga/text-generation-webui-extensions
The RVC stuff seems pretty interesting, I may start using edge myself
sounds like it's coming from a tin can but the voice model you shared definitely sounded like the source character
Edna Mole
Finally fixed the continue task from endlessly typing (and possibly others)
i think the extension was fine, i have to start vits then tgwui but i can find models ;-;
gonna kill myself
said any vits model, try huggingface
i already downloaded 2 different model
1 randomly and stole one from this guy
going for the third one
still no model
i downloaded the release
idk what is wrong
gonna try renaming
readme example
Any thoughts on llama 3.1 8B for the discord bot. I am setting up a new mini server with one of those p102-100 10GB GPUs under PopOS (Seems their preconfigured nvidia drivers works out of the box for it).
Was thinking with its 128k context limit, if it was any good for Q5 with 20k+ context for memorizing a lot of conversations for the discord bot, and coherent enough to use.
thats more on the model than on the bot
if you are satisfied with its performance in tgwui then there shouldn't be any problems
For the first time, making a post on reddit for the bot.
https://www.reddit.com/r/StableDiffusion/comments/1efaecp/ad_discordbot_not_an_ad_please_read/
Appreciate any upvotes / supporting comments
started working on streaming text responses
Yes!!
Whipped up a nifty new Tag useful to those using the bot as a SD tool, last_img_payload
It can be used in two ways - in both ways it is applied first before other payload related tags (can be updated).
last_img_payload_dict: true will use the entire last image payload as the current payload.
last_img_payload_dict: ['cfg_scale', 'seed', 'positive_prompt', etc] - As a list, will use only those specific values from the last payload.
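To illustrate the two modes, here is a small sketch of how such a tag value could be applied. `apply_last_img_payload` is a hypothetical helper, not the bot's real function, and the payload keys are just examples.

```python
def apply_last_img_payload(current, last, setting):
    """setting=True copies the whole last payload; a list copies only the named keys."""
    if setting is True:
        return dict(last)
    if isinstance(setting, list):
        merged = dict(current)
        merged.update({key: last[key] for key in setting if key in last})
        return merged
    return dict(current)   # tag absent or false: leave the payload alone

last = {"seed": 1234, "cfg_scale": 7, "positive_prompt": "a cat"}
current = {"seed": -1, "cfg_scale": 5, "positive_prompt": "a dog"}

reuse_all = apply_last_img_payload(current, last, True)
reuse_seed = apply_last_img_payload(current, last, ["seed"])
```

Applying it before the other payload tags, as described, means later tags can still override whatever was carried over.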
Personally, I'll be using this with flow tag to do upscaling / regenerating txt2img / etc
There were quite a few hidden nuances when doing this, yeesh.
As character specific settings, yes
#1154970156108365944 message
Those are all valid parameters you can use for the character's edge_tts settings
not that bad being 8b
I'm waiting on an adapter, but I'll be trying out some setup in the next few days and give my discord bot an upgrade.
New/old hardware change
Added Gradio interface for settings management to the to-do list. Will get streaming responses working first
do you guys use cmd or powershell?
i see no much difference
it feels so good
First attempt at streaming response lol
to hear that again
Except it sent a message after every update from chatbot_wrapper
understood
I'm not sure how streaming works internally, but I think in a very old version of tgwui, streaming would send the entire context back with the next token
Yes if a print statement is added in there it prints an ever increasing text string
What I plan on doing is detecting sentence completions / line breaks, and use a character setting value to roll random and see if the message should get chunked
oof,
you'll want to add some note of the last token, and split the text at that point or something.
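Since each streaming update carries the whole response so far, tracking how much has already been seen is enough to get only the new part. A minimal sketch (the generator name is made up, and it assumes each update extends the previous one):

```python
def new_text_only(updates):
    """Yield only the part of each cumulative update that has not been seen yet."""
    seen = 0
    for full_text in updates:
        delta = full_text[seen:]
        seen = len(full_text)
        if delta:
            yield delta

deltas = list(new_text_only(["He", "Hello", "Hello wor", "Hello world"]))
```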
if you are not planning to make it be able to plug into a tts
edit pretty much solves it?
then cant you make the bot edit the message?
For simplicity I plan on logging the complete response, and collecting all the chunked message IDs as related_ids
?
(backend talk)
ill let you cook
The bot can edit its own messages yes
Which it currently does for Regenerate Replace, or Edit in History
from what i see the last line is the finished reply, if streaming responses you mean updating the message as they are generated, then you can just make the bot edit the message and it is pretty much done
This text streaming stuff is going to work, just need to fiddle around with it more.
Then overcome the challenges of all the other recent crap I added (possible delayed response behavior, typing speed, etc)
true
Just wanted to share funny outcome
I had put some code in that was supposed to detect line breaks but it didn't work so it did send a discord message for every new token
@valid crypt I was thinking about building a custom long term memory system from scratch. It seems like you've exhausted a lot of options looking for something decent, have you had any luck at all or should I prioritize that long term memory system?
This is how that made discord feel inside
Yeah.. I had a problem earlier with chat history from streaming where it was saving a new chat history in the array everytime a new word was streamed in the same sense..
@keen palm had chimed in to remind me that I added a pair of "tags" (in the Tags system) to mimic complex memory
Only downside is they are not shared with the actual complex memory extension, they're managed separately
Just had an interesting idea... if complex memory is enabled, it could try finding the complex memory file for that character and import them in tags format
Fun idea but I wouldn't want it to write the changes to file
Tags system version is more comprehensive anyway, you can have comma separated list of trigger words
for tgwui you can use complex memory, but for the bot you better use the tag system
i plan to try more extensions but, complex memory seems the easiest and good enough but you cant use the original one but one of the forks
OK well the tags system method is more comprehensive in that you can use similar trigger phrases in separate memories and have one of them trump the other
For example you could have a trigger for a memory that is Bob and then you could have a different memory with the trigger phrase Bob is sad or something like that and have it trump the more generic, casual one
Using microphone to text by the way
Still waiting for the relationship type triggers, so if Bob AND Doug are triggered, that specific memory will be given to the bot
Multiple trigger phrase
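One possible way to get both behaviors, sketched under my own assumptions and not how the Tags system is implemented: every trigger in a memory must appear (the AND case), and the most specific match wins, with specificity approximated by total trigger length.

```python
def pick_memory(message, memories):
    """memories: list of (triggers, text); ALL triggers of a memory must appear."""
    text = message.lower()
    best = None
    for triggers, memory in memories:
        if all(t.lower() in text for t in triggers):
            score = sum(len(t) for t in triggers)   # longer / more triggers = more specific
            if best is None or score > best[0]:
                best = (score, memory)
    return best[1] if best else None

memories = [
    (["Bob"], "generic Bob memory"),
    (["Bob is sad"], "sad Bob memory"),
    (["Bob", "Doug"], "Bob and Doug memory"),
]
```

With these, "Bob is sad today" picks the sad memory over the generic one, and a message naming both Bob and Doug picks the relationship memory.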
Will add that to the to do list and high up
I think I would have to disable text insertion or replacement mechanisms for that scenario
Except of course for insert before and after context as well as
Prefix prompt or suffix
Right. You would get some rather wonky outcomes if you didn't
Bigtime
The setting to control streaming
In the background, there will be more weight factored for line breaks versus sentence completions.
I'll also have to flag for code blocks / italics / other syntax so it doesn't screw it up (sigh)
Making good progress (I think)
for this fork of complex memory https://github.com/Imitationman/complex_memory
i would say it is pretty easy to convert it to tag format
the file is in the character folder of the extension
as the bot dont have webui yet, auto convert would be pretty good
I did have the thought of asking ChatGPT to make a little utility .bat file that I could include, where you could drag/drop a complex memory file onto it and have it convert it to tag format
Or vice-versa (convert tags to complex memory file)
Would just be annoying if the syntax is different from fork to fork
i have to say that others are just crap
the original one dont work
a newer fork mess around with the character card
and do a lot of *** to it
in comparison, this one has a independent file in the extension folder with this format
Yes, same reason I wouldn't want to write tags to character card
When you save data it removes all comments and reformats it all
you dont know what did the extension do to my character card
random order
For the bot, you can compare /internal/activesettings.yaml against settings_templates/base_settings.yaml
context at the beginning then blah blah blah and name at the end
oof
@terse folio I imagine you've dealt with this scenario before, and ChatGPT (along with myself) are being bolts...
print("check_resp:", check_resp)
if check_resp.endswith('. '):
    print("Ends with Period Space")
I've printed the value of the ever increasing check_resp and also printed each character as they are being received - as far as I can tell, the string ends with ". " at some point, yet it never triggers
models dont normally output tokens with trailing whitespace, check if it ends with a period
I'm starting to think that the space must be prefixed on the next token
yes
the LLM can distinguish tokens that are parts of other words vs words on their own because there are different tokens for text chunks that start with a space
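A small demo of why the `'. '` check never fires. The token list is made up, but it mimics the behavior described above: the space arrives glued to the front of the next word, so the growing string ends with `.` at one step and with a word at the next.

```python
def ends_sentence(partial_resp):
    # Ignore trailing spaces, then compare against the bare punctuation mark.
    return partial_resp.rstrip(" ").endswith((".", "!", "?"))

# Made-up token stream; the leading spaces mimic how many tokenizers attach
# the space to the start of the following word.
tokens = ["Hi", " there", ".", " How", " are", " you", "?"]

partial = ""
old_check_fired = False
hits = []
for tok in tokens:
    partial += tok
    old_check_fired |= partial.endswith(". ")   # the original check
    if ends_sentence(partial):
        hits.append(partial)
```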
I'm having a bit of a struggle trying to come up with a logical way to skip a check cycle if the string ends with X - (last token) for this syntax, and string ends with X for this syntax
Mainly, trying to add more weight to chunk the text if \n\n but it always adds one at a time
I'll need an example I think, also need to sleep, will have to figure that out in a bit
alright so just checking for . isn't too shabby.
last_checked = ''
def check_should_chunk(partial_resp):
    nonlocal last_checked
    chance_to_chunk = bot_behavior.chance_to_stream_reply
    chunk_syntax = ['\n\n', '\n', '.']
    check_resp:str = partial_resp[len(last_checked):]
    for syntax in chunk_syntax:
        if check_resp.endswith(syntax):
            last_checked = check_resp
            # Ensure markdown syntax is not divided
            if not patterns.check_markdown_balanced(last_checked):
                return False
            # Roll probability to chunk
            if syntax == '\n\n':
                print("Double newline matched")
                chance_to_chunk = chance_to_chunk * 1.5
            elif syntax == '\n':
                print("Newline matched")
                chance_to_chunk = chance_to_chunk * 1.0
            elif syntax == '.':
                print("Period ending matched")
                chance_to_chunk = chance_to_chunk * 0.5
            print("chance to chunk:", chance_to_chunk)
            return check_probability(chance_to_chunk)
    return False
It can currently match \n or . and factor the probability.
But \n\n will never match
because after it matches and I roll probability, I make sure it doesn't check the same string again
you would have to do some waiting to check if it's a new line on the next iteration,
yea that's not a fun problem
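One way to do that waiting, as a sketch only: decide about a boundary one update late, once the next piece of text shows whether the newline run continues. The function name and the `kind` labels are made up, and the caller would still have to handle a boundary at the very end of the stream.

```python
def boundaries_with_lookahead(updates):
    # Yields (split_position, kind) one update late, so a double newline
    # can be told apart from a single one. `updates` are cumulative strings.
    prev = ""
    for text in updates:
        if text == prev:
            continue
        delta = text[len(prev):]
        if prev.endswith("\n") and not delta.startswith("\n"):
            # The newline run has ended; now its length is known.
            kind = "double_newline" if prev.endswith("\n\n") else "newline"
            yield len(prev), kind
        elif prev.endswith(".") and not delta.startswith("\n"):
            yield len(prev), "period"
        prev = text

# Made-up stream where newlines arrive one token at a time.
stream = ["Hi", "Hi.", "Hi.\n", "Hi.\n\n", "Hi.\n\nNext",
          "Hi.\n\nNext line", "Hi.\n\nNext line\n", "Hi.\n\nNext line\nEnd"]
found = list(boundaries_with_lookahead(stream))
```

The cost is one token of latency per boundary, and the probability roll would then happen per reported `kind` instead of on whatever the string ends with right now.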
Alright, if I can't solve it in like the next 20 mins it's going on the indefinite TODO XD
yeah I'm giving up on double newline
Welp, I have it now successfully sending the message in chunks as it is generating, based on chance_to_stream_reply when it encounters a line break or sentence completion.
fun part is now making everything else work with this...
Yo Illysaviel is back in the Forge saddle for real
I tried pushing this commit months ago when he was apparently on hiatus from Forge, and it languished for over a month before I decided to cancel and push it to the dev2 branch (other code owner was maintaining)
This time, 6 hour wait
wonder if he saw Flux and was like oh boy time to play with normal image gen tools again for a bit
Whoa, you're also alive
(well other than last week I was MIA because i was at ICML in Vienna)
Certainly! And I haven't been frequenting #general much myself
(it was very funny hanging around with the BFL crew as we got a best paper award for SD3, knowing sd3 was about to get rekt by Flux)
SD3.1
@calm rain Swarm basically dead now right? Do you miss it at all?
what?
Swarm is like
-the- interface to use for image gen
you'd be fuckin nuts to use anything else these days
day-1 flux support yo
I'll have to check out Flux now that you mention it, man I'm out of it apparently
@calm rain Anyway congrats on the award! I'm checking out Flux ASAP
just released today
info to use it in the swarm discord announcements https://discord.gg/q2y38cqjNw
My code is a bit messy with this message chunking :S
difficult to organize everything
I have a spare P40 24GB I'll see about getting it to work on it later and hooked in for Flux*
Ok I definitely have the response streaming working well now for default settings (no intentional delays, etc)
It is only creating one "Hmessage" (internal history management), which is the complete response.
It collects each sent message ID in the related_ids list and ensures that the most recent sent message is allocated to the main id attribute.
This allows all our crazy functions to manage all the separately sent messages
Such as the history reactions feature, but apparently it does not let the bot just blaze through and react to 10 messages instantly, it allows like 5 at a time then pauses a few seconds. Nothing I can do about that - users may wish to disable that for streaming responses.
I will be pushing this feature today
Also fixed some bugs along the way
What is the history reactions feature?
If messages are "hidden" (not in chat history, only in the bots internal history) it will react to those messages with the 'hidden' emoji.
Will also react to messages that are continued or regenerated
So with this message streaming, if you trigger a tag that hides the interaction in history, it will go through and react to all the separate message chunks
Ahhhh, yes! Excellent, excellent
It's really crazy how all these features work
If you use edit in history on one of the chunks, it will delete them all and just create one message
This history manager from Reality is so good
Because of ratelimits,
I wonder if theres another way to indicate what messages are related without editing/reacting to them
All messages could be sent as embeds instead, with different embed colors
would be interesting if they could have no other stylized effect besides the color beam on the side...
I'm going to skip message streaming for Regenerate Replace task (for now). There's a lot of code to overhaul to make it work correctly.
actually might have a solution...
yes, have that figured out
Now seems like only "Continue" needs some updating...
ugh. Realized it is sending them on a race condition...
ok fixed it with asyncio.Event()
what was the race condition?
chatbot_wrapper() is imported from TGWUI - synchronous function
So I'm using run in executor to prevent blocking the discord heartbeat
I see, and run in executor spawns a thread, which causes the race condition?
the output perhaps?
meanwhile, as it is chunking the messages and sending them - the llm_gen() function was sometimes returning before the last message chunk could send - and collect its message ID
I really don't know what I'm doing, I trusted ChatGPT and it worked for the most part
Looks something like this:
def callback(chunk, is_last=False):
    # Schedule the coroutine on the event loop from the worker thread
    asyncio.run_coroutine_threadsafe(process_chunk(chunk, is_last), loop)
# Offload the synchronous tasks to a separate thread
await loop.run_in_executor(None, process_responses, callback)
The responses were always streaming from chatbot_wrapper() - the process was always just waiting for the complete output before proceeding
Now, I'm actually checking for triggers to chunk the message, processing them as they are chunked
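The race described above (the generating function returning before the last chunk has been sent) can be avoided by keeping the futures that `asyncio.run_coroutine_threadsafe` returns and awaiting them before returning. The following is a rough sketch, not the bot's code: `process_chunk` and `process_responses` here are simplified stand-ins, and the chunk text plays the role of the message ID.

```python
import asyncio


async def llm_gen(chunks):
    loop = asyncio.get_running_loop()
    sent_ids = []   # filled in as each chunk is "sent"
    pending = []    # futures for every scheduled send

    async def process_chunk(chunk, is_last=False):
        await asyncio.sleep(0.01)  # stands in for the Discord send
        sent_ids.append(chunk)

    def callback(chunk, is_last=False):
        # Runs in the worker thread, so schedule onto the event loop
        fut = asyncio.run_coroutine_threadsafe(
            process_chunk(chunk, is_last), loop)
        pending.append(fut)

    def process_responses(cb):
        # Stand-in for the synchronous generation loop
        for i, chunk in enumerate(chunks):
            cb(chunk, is_last=(i == len(chunks) - 1))

    await loop.run_in_executor(None, process_responses, callback)
    # Do not return until every scheduled send has completed
    await asyncio.gather(*(asyncio.wrap_future(f) for f in pending))
    return sent_ids


result = asyncio.run(llm_gen(["a", "b", "c"]))
```

Without the final `gather`, `llm_gen` could return while the last send is still in flight, which matches the symptom of the last message ID going missing.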
does the chatbot_wrapper return an iterator?
where you can "for token in streamed_response"
idk, I'm pretty happy with it atm, it is working
because I have a bit of code to run iterators in executor too
it's in the aio_utils I think
I'm looking forward to the day you get your hands dirty in this project again
ok the damn solution actually didn't work
asyncio.Event(), with set() and wait() and clear()
I tried searching but I don't think the code is committed
would like to look at what's going on
I'm going to commit now
Took that bit out
There's a bit of clusterf**k here and there
Pushed to streaming_replies branch
So there's 2 known bugs at the moment:
- Sometimes, the message ID for the last chunk is not captured because llm_gen() returns and gets to send_responses() before the last message chunk actually sends
- It's possible that queued message chunks work right now but probably not. No time to test atm (if bot_behavior.responsiveness < 1.0)
Logic of chunking:
- Whenever the 'chunk reply' syntax is encountered
  - it checks to see if markdown is balanced. If not, it marks that string as ignore.
  - otherwise, it rolls probability based on chance to stream reply.
- if a message gets chunked:
  - A flag is set in the Task that chunking is happening.
  - Each msg chunk creates a temp HMessage so it can use send_long_message() to assign and capture IDs while it handles the chunk.
  - All those IDs get appended to the parent Task - so this keeps increasing as messages are chunked.
- At the send_responses() function:
  - If no chunking, it will send a response and use the IDs for HMessage creation (as normal)
  - If the responses were chunked, it fetches the chunked msg IDs list and creates the HMessage using them.
I had added some print statements and realized that sometimes the ID in send_response_chunk() (for the last msg chunk) would print after the HMessage assignments in send_responses()
Unfortunately, I'm going to keep this off Main branch for now.
But this is working quite well, if anyone wants to try out the streaming_replies branch, you'll be in for a treat
ah, yes, chatbot wrapper does return an iterator
you can make this async, and update your chain of functions to also be async
If you can give more details, that would be great
I would love to make this async
Or, play around with it and fix it for me XD I've been at this for a long time and getting brain rot
if you're not working in that area I can update it
I've been putting off work work for the past 2 hours, now trying to do my actual work in the next 20 mins before office closes
D: ohno!
alright
lets say streaming is cut off half way, should these variables be set to the current string?
need to figure this out.
I guess it depends what part of the iterator exited.
could move that closing code outside the iterator
I'm letting self.last_resp and self.tts_resp collect the full responses
I understand that,
would it break anything to set those iteratively as the code generates the reply?
I think it already does?
Er I understand
Nope, no problem
So long as when it's done processing they are complete
Thank you so much
Seems like 2 days ago would have been ideal time to add Swarm support to the bot
What with the extreme hype on Flux
ui?
testing Reality's fix now for streaming responses...
Noooo it doesn't work
Will consult the great and powerful chatgpt
Oh you're around
i was in a call the last hour, im here now
ahh
i was afraid of that
i'll push an update
so, that's because the asyncio loop isn't really created until you do bot.run()
but I already create a loop at the top of the file for my coro handler
so they're mismatching oops
my hope was that asyncio.get_event_loop or whatever would use the same one
The other loops are in on_ready()
good point, I wonder if I can just hook into client.loop instead
I don't know much about loops, but they seem like something to limit unless necessary
different kind of loop, I mean the asyncio loop that is how async code works in python
basically dedicates a thread?
yea
because "async" runs single threaded, every async function you await, is actually pushing stuff to a loop, where it runs the next thing for a little bit until it reaches another await statement, then switching to the next thing in the asyncio loop
that's how you can get this "threaded like" behavior, out of a single thread
its nothing you need to worry about yourself, asyncio loop is just a term for where the async stuff runs
so issue is, you can't await something from a different loop inside another
I see so in this case it is hooking into one loop but it is async'ing for code defined in another loop
or something
merged it
yea
async functions are (probably still) secretly iterators
in a super old version of python before async became a standard feature, it was a library.
And you had to use yield from function() instead of await.
yield from is the same as writing:
for i in generator():
    yield i
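A quick way to see the equivalence for plain iteration (names here are made up for the demo):

```python
def inner():
    yield 1
    yield 2


def with_yield_from():
    # Delegates to inner() and re-yields everything it produces
    yield from inner()


def with_loop():
    # The manual spelling of the same thing
    for i in inner():
        yield i


same = list(with_yield_from()) == list(with_loop())
```

For simple iteration the two behave the same. `yield from` additionally forwards `send()`, `throw()` and the generator's return value, which the plain loop does not.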
awesome!
I first learned of yield when I first asked ChatGPT about how to handle this - but that didn't seem like the right answer... it was not what you did.
Then noticed yield over in chatbot_wrapper()
knowing that's how it works,
you can understand how stuff like task.cancel works.
it's just telling the async function (which is an iterator) to stop iterating at that point in the loop
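A small sketch of that cancellation behavior, with a made-up `worker` coroutine. Cancelling the task raises `asyncio.CancelledError` inside the coroutine at the `await` it is currently paused on, so the rest of the loop never runs:

```python
import asyncio


async def worker(log):
    try:
        for i in range(100):
            log.append(i)
            await asyncio.sleep(0.01)  # cancellation lands at an await
    except asyncio.CancelledError:
        log.append("cancelled")
        raise  # re-raise so the task is marked as cancelled


async def main():
    log = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0.035)  # let it run a few iterations
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log


log = asyncio.run(main())
```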
im curious to see how close it got.
but yes, it probably wouldn't have given you the best answer, because we had* a different problem here
converting a sync generator to an async one
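For reference, one common way to turn a blocking sync generator into an async one is to run each `next()` call in the default executor. This is a sketch under assumptions: `iterate_in_executor` and `fake_wrapper` are hypothetical names, and this is not necessarily how the aio_utils helper mentioned earlier is written.

```python
import asyncio

_DONE = object()  # sentinel marking generator exhaustion


async def iterate_in_executor(make_generator):
    """Wrap a blocking generator so it can be used with `async for`."""
    loop = asyncio.get_running_loop()
    gen = make_generator()
    while True:
        # next(gen, _DONE) runs in a worker thread, so the event loop
        # (and the Discord heartbeat) keeps running in the meantime
        item = await loop.run_in_executor(None, next, gen, _DONE)
        if item is _DONE:
            break
        yield item


def fake_wrapper():
    # Stand-in for a synchronous streaming generator
    for word in ["Hello", " there", "."]:
        yield word


async def collect():
    return [piece async for piece in iterate_in_executor(fake_wrapper)]


pieces = asyncio.run(collect())
```

Because each chunk is awaited in turn, the caller can send a message chunk and capture its ID before asking for the next piece, which sidesteps the callback ordering problem.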
In my test, learned that I need to add another condition to look out for when deciding to chunk
eheh, yea had some issues like this in my tests of similar text chunking!
Splitting with newlines usually works well
like your double \n\n idea
that's definitely the easy route
what's the goal?
to split sentences?
To make it flexible enough to behave like a variety of different types of people
Some people write one sentence at a time. Er well... hmm I suppose the character card could just handle that
Yeah, that's a good point...
I'll just split on \n\n - if ppl want their characters to send many short messages, the context needs to dictate it
ahh, maybe you could add something onto the system prompt
like always write your outputs like AI: "text"
and tell it each "" is its own message and it's okay to send multiple at once
then you can use regex to select all content in those quotes to send as each message
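A minimal sketch of that regex idea, assuming the model wraps each message in straight double quotes and never nests them; `split_quoted_messages` is a hypothetical name:

```python
import re


def split_quoted_messages(output):
    """Return the contents of each "..." pair, in order."""
    return re.findall(r'"([^"]*)"', output)


raw = 'AI: "Hey!" "How are you?" This has been a conversation about greetings.'
messages = split_quoted_messages(raw)
```

Anything outside the quotes, such as trailing commentary, is dropped.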
Too complicated XD
i can share some of my code for that
on top of that I was using grammar to enforce it. but that's not so necessary
Appreciated, but not interested in that handling
Oke!
maybe, maybe it could be an optional alternate method for chunking
I'll think about it
if you want to go that route, you could define chunking as a user editable pattern somehow
(not sure how it was implemented)
Yep thats the idea. I already made this method adjustable
interesting
chance_to_stream_reply: 0.5 # Chance trigger sending partial message as it is generating. Range: 0.0 - 2.0. 0.0 = Never split reply / 1.0 = Splits very often / 2.0 = Always splits for any trigger
stream_reply_triggers: ['\n', '.'] # If you may want to adjust triggers. default: ['\n', '.']
afk for a bit
actually, this would need more than just modifying the chunking pattern.
you also need to change the chat template for tgwui to include the quotes
hmmm, on 2nd thought, the quotes might not be needed that much.
I should test that out later.
but on the other hand the quotes are useful for filtering out LLM censoring or the random stuff it says afterwards like "this has been a conversation with X about XYZ..."
stopping strings should handle that (mostly)
Testing that TTS is still working - then pushing to main
chunked messages are working with my dumb responsiveness queueing behavior but it's not synchronizing well with typing. Good enough for now.
Welp, this is unfortunate.
The responses don't output until the TTS finishes generating
- Payload is sent
- Alltalk or other TTS generates audio
- Text responses begin streaming
I wonder if alltalk has a setting for that...
I don't know how a tts engine would know when to return the "visible" portion of the generated text that contains the audio file
since there is only one audio file that is just streamed during generation?
if you stop generation per sentence and do it that way it would work.
but ultimately using a TTS api would solve this.
I think alltalk splits the output into multiple chunks
then generates the audio files for each, then merges the audio
Even not
interesting!
Yea, the result can jump a bit drastically between sentences and such
even without narrator
Well, streaming TTS responses will have to be on hold. Sad
Streaming replies is now committed to Main!
I may double back and attempt to detect double line breaks ('\n\n') in order to increase their strength, but for now '\n' will suffice for normal strength
"." has half strength. It can be omitted via the setting.
I added a check that prevents chunking after a list number like
1. Won't chunk after the period
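A check like that could look roughly like the following. This is only a guess at the approach, not the bot's actual pattern: it treats a period as a list marker when the line so far consists of nothing but digits.

```python
import re

# A line that contains only digits followed by the final period
LIST_NUMBER = re.compile(r'(?:^|\n)\s*\d+\.\Z')


def ends_with_list_number(text):
    """True if the trailing period belongs to a list number like '1.'"""
    return bool(LIST_NUMBER.search(text))


checks = [
    ends_with_list_number("Steps:\n1."),   # list marker
    ends_with_list_number("It costs 5."),  # ordinary sentence
    ends_with_list_number("Done."),        # ordinary sentence
]
```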
@terse folio in the case of chatbot_wrapper(), when you initially asked if it returns an iterator, is it because it uses "yield"?
Any other types of iterators that a synchronous function could return?
I don't remember exactly which is which, but I think it's pretty interchangeable, will look it up later.
But when you use the yield keyword, that creates a generator/iterator
What's unique about those is if you stop it midway using the break keyword, it will stop processing the code.
for example:
# No way to cancel, returns full result at once
def test():
    output = []
    for i in range(10):
        output.append(i)
    return output
x = test()

def test():
    for i in range(10):
        yield i
x = test() # this didn't actually run the code yet
# later we can access the items by using
for i in x:
    print(i)
# or this will iterate over the generator, and put it in a list.
x = list(test())
one other thing, with that for i in x example.
if you were to try iterating over x afterwards, it would come up empty, since the generator is already exhausted (calling next() on it raises StopIteration)
so if you want to go over items multiple times, cast it to a list first
Although you didn't show an example with break. I imagine that would be
for i in range
    if i == 6:
        break
    yield i
outside the generator
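To make the "outside the generator" point concrete, here is a small sketch where the consumer does the `break`. The `log` list is only there to show how far the generator got:

```python
def count_up(log):
    for i in range(10):
        log.append(f"producing {i}")
        yield i


log = []
received = []
for i in count_up(log):
    if i == 3:
        break  # the consumer stops here
    received.append(i)
```

The generator produced 0 through 3 and then simply stopped; the values 4 through 9 were never computed.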
Talked about it here while demonstrating something else
was actually just about to ask this, yeah that's more interesting
Yup!
So at some point I may have a listener that would break for the output, such as if the author of the request is typing or just edited their message
yea, you can do that pretty easily
Easily is a thing of the past now that my dumbass incorporated the delayed response behaviors
I think there's a version of asyncio.wait_for for receiving discord events. I forgot what it's called.
but put that in a list with your text streaming function and asyncio.gather it? there's probably an option to return when one or the other finishes instead of waiting for both
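The "return when one or the other finishes" option is `asyncio.wait(..., return_when=asyncio.FIRST_COMPLETED)` rather than `asyncio.gather`. In discord.py the event side would likely be `Client.wait_for`, but the sketch below uses a plain stand-in coroutine so it runs without Discord; `stream_text` and `user_interrupts` are made-up names.

```python
import asyncio


async def stream_text(sent):
    for chunk in ["one", "two", "three", "four"]:
        await asyncio.sleep(0.05)  # stands in for generating + sending
        sent.append(chunk)


async def user_interrupts():
    await asyncio.sleep(0.02)      # stands in for waiting on a Discord event
    return "typing"


async def main():
    sent = []
    stream = asyncio.create_task(stream_text(sent))
    interrupt = asyncio.create_task(user_interrupts())
    done, pending = await asyncio.wait(
        {stream, interrupt}, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()              # stop whichever one lost the race
    await asyncio.gather(*pending, return_exceptions=True)
    return sent, interrupt in done


sent, interrupted = asyncio.run(main())
```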
ohnoo
I think it's close to being manageable but I'm handling something wrong
There's a lot of moving parts to consider or ignore and it's difficult to make it all functional without being overhanded
I set up 2 checkpoints after queuing the message, where the function is supposed to analyze the message task and adjust timing for come_online() and/or istyping() , but it's still being a bit clunky. Had refined it many times
In an effort to keep forward progress, I think I made the relationships of Task(), Message(), IsTyping(), etc a bit more complicated than necessary
It's history repeating itself though, my progress has been like that since I got into it - write a bunch of code that works, then spend double that time going back and doing it the right way
Back when I was coding a fighting game character for MUGEN, the triggering system was like this:
triggerall: (must evaluate to True for any other triggers to work)
trigger0: Some condition
trigger0: Another condition. Both trigger0 AND triggerall must satisfy
trigger1: Some other condition. requires triggerall and trigger1 to be True
etc
triggerall was optional. Any number of the other triggers were allowed.
Ideally, I would mimic this for Tag triggering, but the entire format would have to be overhauled to support it.
So the best I can do "easily" will be like:
trigger (or trigger0): 'some text,another text'
trigger1: 'another trigger text'
trigger2: 'yet another'
etc
And all of the triggers would have to be found in the search text to consider the entire tag as "Matched"
Working on this now
Good progress on this.
I'm pretty confident that the revised match_tags() handles multiple triggers now. Just need to comb through the rest and make sure there wasn't some oddball thing regarding triggers
ok there were a few more sections that needed updating... trumping, expanding triggers. All good now.
@keen palm It turns out the logic can allow insert_text and positive_prompt tag values to be applied (insert the text where it is matched). I'm going to allow it, and print a warning that multiple triggers were matched but it will use the matched phrase from the first trigger definition.
idk if you use this or not, but just a heads up
I only use suffix_context, but that's handy to know
it's pushed to branch multi_trigger_logic - and seems to be working great but I don't have time to thoroughly test atm
got this https://github.com/Arondight/vits_api_tts working maybe and it seems like it got streaming response, gonna be tough to plug it into the bot but first gonna test it out
someone know what is wrong?
Is the error in the bot?
Or when using TGWUI
tgwui
if tgwui dont work, bot is dead as h***
chatgpt says that its because im using windows and it is something wrong with the file name
omg colons
Haha
Something you can add is a setting like "pass through: true/false" to the triggers.
Where if disabled it will not run all triggers leading up to that one.
Passthrough: false: (trigger all + trigger1) activates trigger1
passthrough: true: (trigger all + trigger1) activates them both independently
the extension is working but i only found a japanese vits model that worked
and the quality is pretty poor
I'm not using triggerall per se - it is simply collecting all keys that start with trigger and ensuring they all match.
I like your idea though, to add another param which dictates whether all triggers must match, or any can match.
From your explanation it sounded like you were doing multiple steps of matching
"Cat + Dog" = response 1
"Cat" = response 2
"Dog" = response 3
But what if cat+dog was set to pass through backwards?
You could get:
"cat + dog" = response 1, response 2, response 3 (Or any of the triggers that also allow being used standalone)
Actually, that standalone part would already be implemented as part of tags I believe
for some reasons most vits model are not detected by the https://github.com/Artrajz/vits-simple-api
but those model that works are not very good
at least i cant find many
maybe have to train myself ;-;
After a few drinks and thinking about it, there's no need for the pass through type tag because the trigger definitions are each essentially comma separated lists of triggers.
Things like this are already possible due to the "trumps" tag
Except now cat + dog is possible
You can have two tag definitions that have identical triggers, but add some unique trigger to one like "djdhd" and set it as the "trumps" param on the other
I do this with a bunch of tags for "illustration" LORAs, paired with the "random" param.
Just use alltalk tts and be happy
interesting, okay!
Am finding some minor bugs I overlooked
- trigger1: 'draw'
  trigger2: 'cow'
search text: 'draw a cow'
This tag definition will be matched
search_text: 'a cow could draw'
Also matched
The triggers don't have to be incrementing or anything, they just need to start with 'trigger'
These are valid:
- triggermybutt: 'some test,another text'
  triggeryourmom: 'mom is nice,mom sucks'
search text: 'mom sucks, but she passes some test'
...Matched!
Ahh very nice
And as long as one from each of the triggers matches, then the tag will be triggered
- trigger: 'generic_sw,star wars'
- trigger: 'star wars'
  trigger2: 'boba fett'
  trumps: generic_sw
search_text: Star Wars movie with Boba Fett
Result: only the second one applies
yes, one phrase from each key
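The matching rule described above (every key starting with "trigger" must contribute at least one matching phrase) can be sketched like this. It is an illustration only: `tag_matches` is a hypothetical function, it uses plain case-insensitive substring matching, and it ignores `trumps`, `on_prefix_only` and the other tag params the real match_tags() deals with.

```python
def tag_matches(tag, search_text):
    """True if every trigger* key has at least one phrase in the text."""
    text = search_text.lower()
    trigger_keys = [key for key in tag if key.startswith("trigger")]
    if not trigger_keys:
        return False
    for key in trigger_keys:
        # Each value is a comma separated list of alternative phrases
        phrases = [p.strip().lower() for p in tag[key].split(",")]
        if not any(p and p in text for p in phrases):
            return False  # this key had no matching phrase
    return True


draw_cow = {"trigger1": "draw", "trigger2": "cow"}
odd_keys = {"triggermybutt": "some test,another text",
            "triggeryourmom": "mom is nice,mom sucks"}
results = [
    tag_matches(draw_cow, "draw a cow"),
    tag_matches(draw_cow, "a cow could draw"),
    tag_matches(draw_cow, "draw a dog"),
    tag_matches(odd_keys, "mom sucks, but she passes some test"),
]
```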
Very useful. I might have to update!
Updating should never be a problem now really
You'd be helping me a lot by updating, because if something does break I'd love to know and fix it
I won't be able to do much today, at least not for a while
Bah!
I just recently embraced this variable for swap_llmmodel feature
Now gotta go back to the old way (capturing and storing it)
Whoopsy!
Well, the bot is completely non-responsive, so that's a good start
Okay, there. It required a few restarts for some reason
I've got the trigger1 and trigger2 thing to be working somewhat, but it doesn't work whenever the triggers match other tags as well.
So let's say that you have a tag that triggers on Bob, and another one that triggers on Doug. Both of those provide the bot with context about those two characters.
But also, there's a tag with trigger1 for Bob and trigger2 for Doug, which provides additional context about their relationship.
That tag doesn't get triggered, but the individual ones do.
So the text says "Doug ... Bob..." and the ones with only one trigger are working, but not the one with Doug and Bob?
I had tested with two tags - one with a single trigger "draw" and another with two triggers "draw" and "cow" and they were both triggering together and applying
Did you use the updater script?
Not that important, I suppose git pull would also do the job. The script is good if we add any requirements but we haven't recently
I did use the script, yes
And yes, based on my testing, the tag with multiple triggers wasn't working. At least it doesn't seem to be injecting context appropriately
@keen palm Could you put a print statement here? Ctrl+F to process_prompt_formatting()
print("format_prompt:", format_prompt)
Alriiiighty
Good cuz I didn't want to do that
in process_llm_payload_tags()
print("context:", self.llm_payload['state']['context'])
Orrrrr
wanna share your tag definition? I can check it's formatted right (probably is)
I'll test again
I'm testing too
Getting the toddler up too, and she's a bit needy
I'll test first
Ugh yes... definitely something not right
I must have a "break" statement in the wrong spot, so when it matches a tag it is stopping checking the rest.
Will fix this tonight
yes found it
Fixed
Merged to main
I have a "background task queue" for processes that may take some time, but which do not use VRAM / interfere with main task processing
I just set all "react to messages" functions to use this (the "history reactions" feature).
This solves the issue of it taking too long when it sends a bunch of message chunks. Now, there's no rush, it can go ahead and react to the messages while handling other tasks.
Wondering if there's any other time consuming processes I'm overlooking that should get this treatment...
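For readers curious what a background queue like this can look like, here is a minimal sketch built on `asyncio.Queue`. All names are made up, and the bot's real queue presumably carries Task objects rather than bare callables.

```python
import asyncio


async def background_worker(queue):
    # Pull jobs forever; each job is a zero-argument callable
    # that returns a coroutine
    while True:
        job = await queue.get()
        try:
            await job()
        finally:
            queue.task_done()


async def main():
    queue = asyncio.Queue()
    reacted = []
    worker = asyncio.create_task(background_worker(queue))

    async def react(message_id):
        await asyncio.sleep(0.01)  # stands in for adding a reaction
        reacted.append(message_id)

    # The main task just enqueues and moves on
    for message_id in (101, 102, 103):
        queue.put_nowait(lambda m=message_id: react(m))

    await queue.join()             # only needed here so the demo can finish
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return reacted


reacted = asyncio.run(main())
```

A single worker processes reactions one at a time, which also spreads them out a little and may help with the rate limit pauses mentioned earlier.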
add stt and blow up our mind
The bot just reacted with the 'hidden' emoji on a prompt that should not be hidden at all
If you check the log in internal/history/the channel number/the character name was the message logged as hidden?
also, was it just a regular message request? Or was it a regenerate / continue?
Was an image generated?
It is marked as hidden, yes. I see the issue though. I triggered my IdeaBot character, which sets things as hidden.
That shouldn't happen though
I have on_prefix_only:true for that tag
Does it have one trigger key? Or does that have trigger1, trigger2 ?
Just one
- trigger: 'Idea'
  search_mode: user
  on_prefix_only: true
  swap_character: IdeaBot
  load_history: -1
  save_history: false
Checking it out now
I reproduced the issue... so now need to see why the code isn't working as I expected...
I'll just comment out tag for now
think I figured it out...
yep - pushing it now. Had a flaw in the logic
Did some additional testing to ensure I handled it right
It checks each trigger phrase in each trigger key.... when a phrase was matched, but failed on_prefix_only I was letting it continue to try the next trigger phrase.
But I was not setting the flag that the trigger key was unmatched as a whole
Now, it checks if the matched phrase is the last one.... if it fails the prefix check, it sets that flag. Otherwise lets it continue
@keen palm Pushed
I will break update a bit later
I've been trying out some different models for RP on the discord bot with some general character persona, but I'm getting tired of the constant breaks... Is there some specific model that doesn't include these breaks:
Everytime I add a new end phrase, a new one pops up
Llama 3 based models?
Make sure you are on latest TGWUI too, they've improved support for llama 3 models
Welp, pretty damn easy... just added a new /prompt command with just one additional option for now which is begin_reply_with
It's working just fine and as expected
Fuck yes
Adding a tag for it now, too.
aaaaaand that's implemented now too
Pushed to main!
What's the tag?
Same begin_reply_with - If a tag is triggered with that, it will use the value to Continue from
Just added it to the Wiki now https://github.com/altoiddealer/ad_discordbot/wiki/Tags#text-generation-tags
I'm going to start looking into per-guild characters now...
Seems like the name can be customized per server
But not avatars... avatar must be shared
might need to add a command to change the bots avatar
yeah I think this is the way
Eh. nvm I'm pushing it off for now
Let's see... on the to-do list is: discord based conditional Tags
Can't remember exactly what we had in mind for this....
Ah ok I remember now.
Like this
And a few more... Roles...
What about roles?
Oh, there was also the idea to use the channel info as additional character context
lol, it just added a period
Nothing more had to be said
that's funny, I get some unexpected ends sometimes too when trying to guide responses
I'm going to have to do a more serious test
I've noticed that Continue almost always fails to add anything new, as well.
it's often like that in the webui as well.
if the llm has written a few messages for the character, it will keep returning end tokens because it thinks it's done.
(since that's where it stopped the first time)
what works for me is adding an extra newline and saying continue, then it usually adds 1-2 more sentences
I meant, like another conditional tag for something like for_role_name_only
Ahh yes
Ah OK you can get the Role ID from the Server Settings... just nowhere to get it easily from the chat window
I'll make these possible to be comma separated
I'm open to suggestions for more
With the new "begin_reply_with" tag param, I had a very interesting idea for a "Flows" definition. The idea I had was for something equivalent to the "alternating prompts" feature in SD WebUI, but for TGWUI
By triggering a low "max_new_tokens" setting, swapping character context, and recycling the previous responses with variable {llm_0} in "begin_reply_with" - this could have drastically different characters piecing together one lengthy response
One could make a true Jekyll and Hyde type character in this way
interesting idea ^^
Some voice changes mid sentence would be cool as well
Ya know... that may be a hack to stream TTS... instead of chunking the text, just stop generating, then feed it back with continue
Er... yeah... wouldn't work great actually
I don't think it's possible to change the prompt while it's generating with the current webui.
Would need to limit tokens to make TGWUI cut the gen short to trigger the TTS
which could be wonky
I'll see if there's some mechanism to trigger a Stop generating
Do the tts engines that support multiple voices have tags you can send them?
like "this is a test <voice B><speed 0.5> and now in a new voice!"
Alltalk has the separate narrator voice, which applies to certain syntax
There may be something like that with other engines though
hmm hmmmm
this guy's code might help, this project has stream tts through openai api just im not sure if it is located here https://github.com/zixiiu/Digital_Life_Server/blob/master/GPT/GPTService.py
¯\_(ツ)_/¯
I could fairly easily stream TTS if I chose a specific TTS api like alltalk TTS API, but to keep the bot flexible for different TTS clients and let TGWUI manage the extension - would need to do some hack to make it generate partial responses
If there is some kind of "stop generating" function (I'll look for it), it could be the answer
There's no parameter to make TGWUI stop at a line completion / newline, only hard token count
what about stopping strings?
you can append an extra stopping string ontop of the user payload
With the way I had set it up, with a "chance to stream a reply" - I'd have to axe that and make it all or none
Hmm, hmm, alright
But I like the way it is, adding some randomness to when it decides to split message
yea, ultimately tts would be hooked into separately somehow.
I wonder if the extensions expose functions one could directly access.
if so, you'd also have to figure out how to unhook tts generation from the text-gen pipeline so it doesn't do it twice