#ad_discordbot (Fork of Fork of xNul's bot)


terse folio
#

since you ask users to specify the tts engine "edge/alltalk" in the config file, this can be used to decide which TTS wrapper class to use that will translate between them
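
A minimal sketch of that suggestion, assuming the config exposes the engine under a key called `tts_engine`. The class names, the `build_request` method and the payload field names are all hypothetical placeholders, not the real edge or AllTalk parameters.

```python
from abc import ABC, abstractmethod


class TTSWrapper(ABC):
    """Common interface the bot would call, whatever engine is configured."""

    @abstractmethod
    def build_request(self, text: str, voice: str) -> dict:
        """Translate the bot's generic arguments into engine-specific ones."""


class EdgeWrapper(TTSWrapper):
    def build_request(self, text: str, voice: str) -> dict:
        # Placeholder field names, not the real edge TTS parameters.
        return {"engine": "edge", "text": text, "voice": voice}


class AlltalkWrapper(TTSWrapper):
    def build_request(self, text: str, voice: str) -> dict:
        # Placeholder field names, not the real AllTalk parameters.
        return {"engine": "alltalk", "text_input": text, "character_voice": voice}


# Maps the value from the config file to the wrapper class that handles it.
WRAPPERS = {"edge": EdgeWrapper, "alltalk": AlltalkWrapper}


def wrapper_from_config(config: dict) -> TTSWrapper:
    """Pick the wrapper named by the (assumed) `tts_engine` config key."""
    name = str(config.get("tts_engine", "")).lower()
    try:
        return WRAPPERS[name]()
    except KeyError:
        raise ValueError(f"Unsupported tts_engine: {name!r}") from None
```

Supporting another engine would then mean adding one class and one entry in `WRAPPERS`, without touching the calling code.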

halcyon quarry
#

I'll think about trying to monkey patch it

#

I downloaded mcmonkey's SwarmUI yesterday - downloaded the smallest Flux model (Schnell) and generated an image using the default bare bones settings (1024 x 1024, a mere 20 steps, Euler sampler).

I'm using a 4070ti (12GB Vram) - Goddamned picture took 2 minutes, and was shit because 20 steps isn't enough.

#

Flux is a huge milestone for open source but only if you've got a freakin supercomputer

#

Although I understand most LLM models are also restricted to supercomputers

valid crypt
#

if i can set it below 50 steps i can wait

#

and depends on the quality

terse folio
#

i've seen people mention 2bit quants for that on reddit, perhaps you can get even smaller with the Schnell version at 2 bits?
but the quality will probably be as bad as you expect from 2 bits

valid crypt
halcyon quarry
#

I think by the time "good" quantized versions of Flux are available to the little guys, SD3.1 could be around by then and likely will be better by comparison (probably will not hold a candle to the normal Flux models)

valid crypt
valid crypt
#

¯_(ツ)_/¯

halcyon quarry
#

3.0 is very good at some things, but the fools intentionally poisoned the model and as a result it has a lot of issues

halcyon quarry
#

They recently made an announcement that they fucked up and are going to release a model that isn't borked

valid crypt
#

the biggest change ive noticed is the text and sd3 has pretty good result

#

¯_(ツ)_/¯

halcyon quarry
#

Have you not seen the results of "woman laying in the grass"?

keen palm
#

It's like some kind of eldritch horror

valid crypt
#

?

halcyon quarry
#

Don't let the preview fool you

valid crypt
#

isnt that something with the prompt

#

on civit i see normal results

#

i like more the bottom

#

and xl is top

halcyon quarry
#

https://stability.ai/news/license-update

Improving Model Quality

Before we released SD3 Medium, our initial testing indicated that it was, in most cases, a much better base model compared to SDXL, in terms of prompt adherence, diversity, detail, and overall quality. However, the community quickly identified some critical quality issues mainly related to body poses and words that were too rarely seen in the training set. To address these concerns, we have focused on two key areas:

Continuous Improvement: SD3 Medium is still a work in progress. We aim to release a much improved version in the coming week

valid crypt
#

now i understand

halcyon quarry
#

Can't find it, but there were also some other threads where people had compared identical prompts / settings shared by Stability AI, and could not reproduce results anywhere close, even when using their API for larger models

#

When SD3 was released these posts were endless

valid crypt
#

xd

#

fixed or not yet

halcyon quarry
#

not yet but if they're actually going to do it, shouldn't be too much longer

#

I had seen an update shared recently which acknowledges that they are still working on it

vestal python
# halcyon quarry Flux is a huge milestone for open source but only if you've got a freakin superc...

I run flux.dev on comfyui currently with my 3080 10GB at 1024x1024, 100 seconds for 20 steps using fp8_e5m2 with the t5xxl_fp16 clip, and I have gotten really amazing results. I haven't bothered to try schnell yet though, and the dev version generations don't look right anecdotally below 12 steps.

I can give the SwarmUI a try later this week for it too. I'm not sure if there's an API for comfyui or swarmUI since still just testing the model, but definitely an interest for my users just to run 20 step.

My SDXL I always had setup for them in 70 steps 1024x1024 default which is I think just a tad faster.

#

Very different than a1111 or forgeui...

halcyon quarry
#

On the bot to-do list is support for this

halcyon quarry
#

hmm... Tags based on "role" seems to require additional privileged intents ('member')

#

Seems to be more trouble than it's worth

halcyon quarry
#

May need to require that intent at some point for something more important...

#

got the guild IDs and channel ID tags working though

halcyon quarry
#

indeed, the hack I had in mind wouldn't work after all, because it would generate TTS for the entire message not just the continued response

vestal python
#

The whole:

"This
This is
This is a
This is a message!"

Issue?
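
To illustrate the issue being described: when a stream delivers growing snapshots of the whole reply, anything that consumes it (TTS, for instance) has to work out which part is new. A minimal sketch, with `new_text_only` as a hypothetical helper name rather than anything from the bot:

```python
def new_text_only(previous: str, current: str) -> str:
    """Return the part of `current` that was not already in `previous`."""
    if current.startswith(previous):
        return current[len(previous):]
    # The stream was edited or restarted, so treat everything as new.
    return current


snapshots = ["This", "This is", "This is a", "This is a message!"]
spoken = ""
deltas = []
for snapshot in snapshots:
    deltas.append(new_text_only(spoken, snapshot))
    spoken = snapshot

print(deltas)  # ['This', ' is', ' a', ' message!']
```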

valid crypt
#

stream response not working properly

#

you can try to pull latest version

#

or maybe he pushed the wrong one ._ .

vestal python
#

Is that a thing now? Sending some TTS server the /chat/completions streaming message instead for streaming voice? Sounds like a boss move for some TTS project.

valid crypt
#

no

#

is only for text

#

for now

#

he is trying to make it work for tts

#

he could do it for one in particular but he wants a more universal fix

#

¯_(ツ)_/¯

vestal python
#

I have some ideas about how to handle the chunks that come through /chat/completions, but I'm not sure how something like Alltalk TTS handles them, or if it'd even be fast enough..

valid crypt
#

just tell him your suggestions, these are his words

vestal python
#

More than likely he'll have a working solution first x.x It's part of my list of Speech features I need to fix back up for my example lightweight website.

#
  • fix my basic STT-to-TTS
  • Hands-free 'Alexa' like mode for transcribing a key word to prompt a speech request
  • Streaming speech with an abort feature.
halcyon quarry
#

What I was thinking of yesterday, was trying to stop the generation every time it would split - which would result in it only generating so much TTS - then updating the payload with that response and use "Continue"

#

But after more consideration, realized it would regenerate the entire TTS response

vestal python
#

I just finished my testing bot for the website under Hermes-2-Theta-Llama-3-8B-32k.i1-Q5_K_M.gguf, to continue working on it while the partial version is live.. I'll have more of a look at the TTS streaming. Supposedly you can't stop the streaming per the Alltalk notes while mid-stream?

halcyon quarry
#

This bot doesn't use the TGWUI API, nor any particular TTS API

vestal python
#

But I see what you mean combining chunks together to make less of a constant I think?

halcyon quarry
#

It lets TGWUI's extension handling invoke the TTS stuff - which can't be chunked

#

The text responses can (and are) chunked - but the TTS comes all as one response

#

So the viable options to enable that are to use a dedicated API, which limits the current flexibility of clients.
Or I need to possibly monkey patch something in TGWUI to make it return partial TTS responses
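
For the dedicated-API option, one common approach is to buffer the streamed text and hand complete sentences to the TTS engine as they finish. The following is only a sketch under that assumption. It is not how the bot or TGWUI works, and `sentences_from_stream` is a hypothetical name.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at ., ! or ? followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Buffer streamed text and yield each sentence once it is complete."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = SENTENCE_END.split(buffer)
        # Everything but the last part is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence could then be sent to the TTS engine on its own, so audio for the first sentence can start before the text reply has finished generating.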

halcyon quarry
#

@calm rain I've embraced the footer as you use in your bot 🤓

#

(although I know you include the image within the embed)

vestal python
#

Any info on this? I got my 2nd discord bot running and he's hallucinating a lot and came up with some 'persona' google doc links.

vestal python
#

Overhauled a lot of stopping strings. Almost no issues now x.x until he finds a new phrase to abuse

halcyon quarry
#

@terse folio quick FYI, in case the gears start turning again for this project, send_long_message() now returns a list of sent msg IDs and the last message object. It no longer handles assigning Bot HMessage IDs, which is now handled outside the function.

terse folio
valid crypt
#

this could be the only tts we will need

#

it literally has every tts

halcyon quarry
#

I'd rather not have send_long_message() returning an HMessage, the IDs list, and a discord Message, but just the latter 2

halcyon quarry
valid crypt
#

it might

terse folio
halcyon quarry
terse folio
#

what do you mean sending?

halcyon quarry
#

Nah, just going to delete and replace the messages. Don't want to spend too much time figuring out how to programmatically detect when the new message response will exceed the permitted text per message, and decide to edit or send for multiple message blocks

#

Better terminology would be "passing a dummy HMessage"

halcyon quarry
#

dummy_hmsg = local_history.new_message(save=False)
dummy_hmsg = await send_long_message(text, whatever, dummy_hmsg)

#

then collect its ID and related IDs... nahhh 😛

terse folio
#

oh why do you need to pass a dummy hmessage into the send_long_message function?

#

to get the channel id?

halcyon quarry
#

For occasions where I need the sent message IDs but it is not sending messages that I want in history whatsoever

#

But like I said, I changed it up now - I just get the IDs back from the function and assign them to the HMessage after

terse folio
#

I see i see

halcyon quarry
#

id = sent_msg_ids.pop(-1)
related_ids = sent_msg_ids
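
The two lines above can be wrapped into a small helper. This is a sketch only: `HMessageStub` is a hypothetical stand-in for the bot's real HMessage class, and `assign_sent_ids` is an invented name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HMessageStub:
    """Hypothetical stand-in for the bot's history message object."""
    id: Optional[int] = None
    related_ids: List[int] = field(default_factory=list)


def assign_sent_ids(hmessage: HMessageStub, sent_msg_ids: List[int]) -> None:
    """Use the last sent message as the primary ID, earlier chunks as related."""
    if not sent_msg_ids:
        return
    ids = list(sent_msg_ids)  # copy so the caller's list is not mutated
    hmessage.id = ids.pop(-1)
    hmessage.related_ids = ids
```

Copying the list first avoids a subtle side effect of the inline version, where `pop` also shortens the list the caller still holds.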

#

One thing I was struggling with yesterday and gave up - still thinking for a solution...
Is trying to run the bot with privileged intents, then falling back to "False" if an exception occurs.
Like, I don't want to screw up someone's bot because I want to have a for_roles_only tag - which needs members intent

#

The intents are set while creating the client (bot) object at the beginning of the script, then all the commands are set, yadda yadda, then finally at the runner is when it may error due to intents

#

I can catch the correct exception but don't think I can simply update an intent without recreating the client object again
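
Since the intents are fixed when the client object is created, a try / fall back approach has to rebuild the client for each attempt. A library-agnostic sketch of that loop follows. The assumption (not confirmed in this chat) is that with discord.py, `denied` would be `discord.PrivilegedIntentsRequired` and `start` would create the client, register all commands and call its `run` method. `run_with_fallback` itself is a hypothetical helper.

```python
from typing import Callable, Iterable, Type, TypeVar

T = TypeVar("T")


def run_with_fallback(
    start: Callable[[T], None],
    candidates: Iterable[T],
    denied: Type[BaseException],
) -> T:
    """Try each candidate intents value in order; return the one that worked.

    `start` must build a brand-new client for the value it is given, because
    intents cannot be changed on an existing client.
    """
    last_error = None
    for intents in candidates:
        try:
            start(intents)
            return intents
        except denied as error:
            last_error = error  # these intents were refused; try the next
    raise RuntimeError("No candidate intents were accepted") from last_error
```

The cost of this design is that everything attached to the client (commands, event handlers) has to live inside `start` so it can be registered again on the second attempt.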

terse folio
#

I don't think you can update intents, i'm pretty sure that's part of the login payload

#

I would create a little wiki for users on how to set intents for their bot in the developer panel

#

it's just a few clicks!

halcyon quarry
#

There is a privileged intent that I've been setting for a long time, message_content, which is required and which I had no exception / custom log error message for - which now I do, but yes I will shortly be updating my install steps to explicitly explain setting that

#

The members one, if I can't find an elegant solution to try / fall back, I'm just going to forget that for now since it's such a minor feature

terse folio
halcyon quarry
#

I do have the intents defaults, which sets true for all non-privileged intents (there are only 3 it needs explicit permission for)

halcyon quarry
terse folio
#

no no, I mean the privileged intents on the developer panel,
you can access them all iirc while your bot is small

halcyon quarry
#

OH

#

If it is > 75 guilds then it gets tricky

terse folio
#

and there isn't really a reason not to enable the basics like message content/members/roles...

most discord bots will use this information in some way.
like moderation bots will need member/role info as well as message content for moderation

terse folio
halcyon quarry
halcyon quarry
#

Been overhauling the Post active settings feature all morning, to not suck

#

Adding a command to set a settings channel per server (like voice channels)

#

It will create a list of message IDs for each settings category, and index it for the channel ID

#

When updating, will iterate over the stored IDs and fetch/delete the messages, then replace the list after sending the new settings
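
The bookkeeping described above could look roughly like this. It is a synchronous sketch with hypothetical names. The real bot would use async Discord calls, which are represented here by the `delete_message` and `send_settings` callbacks.

```python
from typing import Callable, Dict, List

# channel_id -> settings category -> IDs of the messages currently posted
PostedSettings = Dict[int, Dict[str, List[int]]]


def replace_category(
    store: PostedSettings,
    channel_id: int,
    category: str,
    delete_message: Callable[[int, int], None],
    send_settings: Callable[[int, str], List[int]],
) -> None:
    """Delete the old messages for one category, post new ones, store their IDs."""
    categories = store.setdefault(channel_id, {})
    for message_id in categories.get(category, []):
        delete_message(channel_id, message_id)
    # send_settings returns the IDs of the freshly sent messages.
    categories[category] = send_settings(channel_id, category)
```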

halcyon quarry
#

Made a lot of progress but still needs some debugging

valid crypt
#

alright fixed one that wasn't working, and amazing results

#

but not every model has the same error

halcyon quarry
#

The Post Active Settings feature will actually be quite nice, since it will post a

Header for each setting

And config.yaml will now allow the feature to be customized to only include certain settings.

#

Will help seeing TAGS much easier - planning to extract them from all sources and collect as one setting block, under subheadings like Character Tags and Imgmodel Tags etc

halcyon quarry
#

It's working O_O

#

When using the command - if changing the server's settings channel from one to another, it will delete all the settings from the previous channel, then post all settings to the new channel.
If it's the first channel set, it just posts all the settings

halcyon quarry
#

It's done - just need to update the code to use it more often (currently only triggering from the new slash command, or when changing imgmodels)

#

@terse folio Wondering if you have a solution for this...
my new /set_server_settings_channel command is working like the voice channel one - it has a prepopulated list of channels to choose from.
Big difference here though... it seems to be limited to 10 channels. D'ya know offhand a good solution for that?

#

this is on dev branch if you wanna check it out

terse folio
halcyon quarry
#

Ahhhh interesting

#

I’ll include that in the field description

terse folio
halcyon quarry
#

Not to toot my own horn but I think I have some pretty slick logic going on in this post active settings feature

#

Actually... yes to toot my own horn 😛

#

🚃

halcyon quarry
#

This feature also showcases how good the send_long_message() function is at chunking code blocks

#

which I had spent a lot of time on back in the day

halcyon quarry
#

Also patching a pretty big bug with 'history reactions' feature

#

it's not reacting to the last message if sent in msg chunks. Also error if sending one message.
Resolved.
Pushing to Main

#

Pushed to main - start posting those settings XD

halcyon quarry
#

This feature is triggered to run in the background too, does not interfere with main tasks

halcyon quarry
#

Just took the time to revamp this

halcyon quarry
#

Wiki is coming along

valid crypt
#

a video or something will definitely bring more users

#

i literally found this project by joining this server and asking in general

#

i suggest changing this a little

#

it explains very well the project but not good at attracting people

#

hmmmmm

valid crypt
#

i think the majority that uses tgwui is because they want it to be local

#

so i suggest adding the keyword "local" :D

valid crypt
#

or maybe a post in oobabooga reddit

valid crypt
halcyon quarry
#

Thanks for the nudge - I just updated the main description to:

Discord bot which transforms your servers into hubs for limitless local AI-driven interaction and content creation. Features cutting-edge tools for professionals, and unlocks creative fun for casual users. Integrates text-generation-webui and Stable Diffusion Web UIs.

valid crypt
valid crypt
#

someday my bot will have a perfect voice...

vestal python
#

That new Forge update with NF4 quant for Flux.Dev is interesting news.

#

Might make the speed for my discord bot Ecne's generations decent.

vestal python
#

Just tested on my RTX 3080 10GB card:

normal simple prompt
1024x1024 20 step
7641MiB / 10240MiB

fp8 = 4.88 s/it
NF4 = 1.4 s/it

100sec versus 31sec generations. Very good. Will test more later.
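
The reported totals line up with the per-iteration speeds. A quick check of the arithmetic, assuming the few seconds left over are fixed overhead outside the sampling loop:

```python
def sampling_seconds(seconds_per_it: float, steps: int) -> float:
    """Time spent in the sampling loop alone."""
    return seconds_per_it * steps


fp8 = sampling_seconds(4.88, 20)  # about 97.6 s of the reported 100 s
nf4 = sampling_seconds(1.4, 20)   # about 28 s of the reported 31 s
speedup = 4.88 / 1.4              # roughly 3.5x faster per iteration

print(f"fp8 {fp8:.1f}s, NF4 {nf4:.1f}s, speedup {speedup:.2f}x")
```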

halcyon quarry
#

Indeed, for me I went from 2 mins to 40 seconds

#

Unfortunately forge API is in shambles for now, can’t really reap the benefits via the bot just yet

vestal python
#

I noticed as well

terse folio
#

That duck looks an awful lot like me 👀
joking about the blank screen being a reflection

halcyon quarry
#

You guys update anytime recently?

Any feedback on new features like the streaming replies, etc?

keen palm
#

I have updated recently, but I haven't done that. I have used start_reply_with to good success, though.

halcyon quarry
#

If you have any thoughts on something that should obviously be added to the new /prompt command, I'm all ears

keen palm
#

I never use that, actually. I find it easier to just type everything out without using the slash command.

halcyon quarry
#

yeaaaaah, since you're familiar with the instant tags syntax

halcyon quarry
#

before I forget again, I'm going to add to the TODO list a command to create Tags

#

And yes, like save locally (reusable)

keen palm
#

To an entirely separate json?

halcyon quarry
#

yes

#

I'll be sure to have a config option for the feature

#

I was thinking that I could make these tags available only to the user who created them

keen palm
#

How difficult would it be to limit tag creation to certain roles?

halcyon quarry
#

Well if its a slash command, then you can just adjust command permissions in Server Settings > Integrations > Your bot

#

So very easy

#

Ever have any issues with the instant tags syntax?

keen palm
#

Oh I see. Yeah, so ideally the tag would be useable by everyone (at least for my purposes), but limiting the slash command behind a role would be nice

#

Nope

halcyon quarry
#

sweet

#

Maybe I could have some mechanism for admin to click a button to approve it for global use?
I could just envision one person setting up all sorts of tags that drive everyone else crazy lol

#

eh, maybe just config option again... to set whether they are for the user who created it only, or works globally

halcyon quarry
#

2 new additions to the /prompt command

#

termonilogy 😎

keen palm
#

I don't know what kind of response you'll get with expert termonilogy

halcyon quarry
#

One more option for the command

halcyon quarry
#

@keen palm Something you may appreciate... I noticed that the param begin_reply_with was omitting the text it was continuing from.

I've fixed that. Now, it only omits that text from the reply when using the continue context command

#

Pushed updates to /prompt command

halcyon quarry
#

Anyone here use Docker for anything? There's someone asking me if it would ever be possible to run this bot with Docker... from my quick research looks a lot like "no"

terse folio
#

Docker is just a tool for running containers (virtual machines?) basically

there's a docker image for tgwui somewhere.
it just needs to be updated to include pulling from your bot's repo as well I assume.

I'm not really sure how docker images are created.
if it's like a script of commands to create the vm, or if you package up the frozen snapshot of that vm...

#

it should be possible

halcyon quarry
#

Tomorrow, I'm going to attempt to clean up the internal settings management, to subsequently achieve per-guild settings and perhaps even characters

#

only caveat with characters would be avatars

halcyon quarry
#

One problem I need to solve… some command options are based on settings… so would need to be able to register guild specific commands… 🤔

halcyon quarry
#

Nvm this is a non-issue.

halcyon quarry
#

Making good progress on per-guild settings

keen palm
#

Hmm. I can't get [[begin_reply_with:]] to actually output anything past the initial phrase

halcyon quarry
#

I'm reviewing this closely... may have screwed up the logic a little bit

#

yeah I can see what I fudged up...

#

testing...

keen palm
#

What's the issue?

halcyon quarry
#

I overlooked 2 lines of my existing code, when adding code to tweak the behavior of "continue" (which is what this tag uses)

#

the original code and new code kind of negated each other 😛

#

seems good now

#

Btw I'm very excited about the per server settings... I think I did a very solid job on reworking the core settings management

#

Again- wouldn't be possible without Reality's contributions - the database code is stellar.

#

@keen palm The fix is pushed to main

#

I think per server characters will also be a thing - there's a few more hurdles to jump for that one

#

the setting would require using a shared avatar though

keen palm
#

Which is only really a problem when people want different pictures of their AI waifu bots

halcyon quarry
#

The send_user_image tag could help but adds a lot of line breaks

keen palm
#

Hmm, it still isn't actually continuing even after updating

halcyon quarry
#

hum

#

definitely working for me

keen palm
#

My bot is dumb 😦

#

Apparently I gave it too much to start with and thought it should be the end

#

Weird. I'll have to tinker with it some more, but I got it to work that time.

halcyon quarry
#

Hmm I’ll test with longer inputs

#

Oh yeah, it’s certainly possible that the LLM just didn’t add anything

keen palm
#

This model seems to not like continuing in general

halcyon quarry
#

You can always use Edit History cmd, add something open ended like “Then, “ then Continue cmd

keen palm
#

I did one attempt at begin_reply_with ending with "that " and it still didn't continue

#

Actually the one with "that " worked but the one with "against " didn't

halcyon quarry
#

you could double-check the internal history log to ensure it isn’t just the sent discord message being sent short

#

Or, the Edit History command will show the logged text by default

keen palm
#

Yeah it's not cut short. It's just the model being weird.
Regenerating the prompt with "that " showed that sometimes it continues and sometimes it doesn't.

halcyon quarry
#

I might even have per server characters working...

#

unless there's some catastrophic detail I'm overlooking, this settings update is going very well

#

I think if the Stable Diffusion WebUI setting for "keep multiple models in VRAM" is enabled, this may even allow per-server imgmodel handling

keen palm
#

Ain't nobody got that much VRAM

halcyon quarry
halcyon quarry
#

Yay, coded everything in... idk, 12 hours? Debugging time

halcyon quarry
#

Reality is it a big deal if I willy nilly import discord in modules?

#

Or does the entire bot script as a whole, only import it once regardless of import statements in modules?

#

sorry I could just ask chatgpt this 😛

#

yes chatgpt says it's cached on first import so NBD
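
This is easy to confirm: Python keeps every imported module in `sys.modules`, and later imports reuse the cached object. The sketch uses the standard-library `json` module as a stand-in for `discord`, so it runs anywhere.

```python
import importlib
import sys

import json                # the first import executes the module and caches it
first = sys.modules["json"]

import json as json_again  # later imports only look up the cache
second = importlib.import_module("json")

# All three names refer to one and the same module object.
same_object = first is json_again is second
print(same_object)  # True
```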

terse folio
halcyon quarry
#

Learned a harsh lesson last night with circular imports… wasted about 2 hours

halcyon quarry
#

Well not a total waste as I did learn stuff

halcyon quarry
#

Just realized I needed to add this condition to "change_X" tags processing (such as 'change_character'):
and not is_direct_message(self.ictx)

#

I should probably add a setting to control whether images may be generated in DMs via tags

halcyon quarry
#

Welp - finally got everything working again - without per_server_settings enabled 😛

#

Time to see how that goes now...

halcyon quarry
#

So far so good, but only have this bot on one server. Need to test my other one

halcyon quarry
#

Note to self: allow anything via DM if bot owner

halcyon quarry
#

@terse folio you may be interested to know: Most of the idiocy of my settings management was resolved by simply importing BaseFileMemory from database to main. I also resolved the multiple instances of Config by importing BaseFileMemory to shared and initializing Config there

#

That had me thinking, oh, I can just initialize all sorts of main bot class instances in shared… but ended up creating unsolvable circular import hell

terse folio
#

Interesting, nice!

And yes, whenever you find yourself having circular imports one solution is to make a 3rd file that stands outside the previous ones which you import. (In this case shared.py)
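
A small self-contained demonstration of that pattern. The three module names are hypothetical stand-ins for shared.py, database.py and the main script, and they are written to a temporary folder only so the example can run as a single file.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# `demo_shared` imports neither of the other two, so `demo_main` and
# `demo_database` can both depend on it without forming a cycle.
FILES = {
    "demo_shared.py": """
        class Config:
            def __init__(self):
                self.values = {"per_server_settings": False}

        config = Config()  # the single shared instance
    """,
    "demo_database.py": """
        from demo_shared import config

        def is_per_server():
            return config.values["per_server_settings"]
    """,
    "demo_main.py": """
        from demo_shared import config
        import demo_database

        config.values["per_server_settings"] = True
        result = demo_database.is_per_server()
    """,
}

with tempfile.TemporaryDirectory() as folder:
    for name, source in FILES.items():
        Path(folder, name).write_text(textwrap.dedent(source))
    sys.path.insert(0, folder)
    try:
        main = importlib.import_module("demo_main")
    finally:
        sys.path.remove(folder)

# The change made in demo_main is visible in demo_database: one Config instance.
shared_result = main.result
```

The rule of thumb is that the shared module must never import from the modules that depend on it. Once it does, the cycle is back.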

halcyon quarry
#

Ok in addition to all this cool settings crap - now also allowing the bot owner to use most commands via DM

terse folio
#

Nice!

halcyon quarry
#

Starting to run out of bugs, I'm digging deep now

#

This is going to be a super sweet update

#

I had to add a little inbetween function for all 16 instances of ‘local_history = get_history_for(ictx.channel)’ in order to first get the character and mode (which now has ictx.guild.id as an arg), to subsequently pass to the history function

#

I first modified the custom history functions (get/set history for) to accept “ictx” instead of only the channel ID - but then there were some instances the parent class was calling with just the id (channel id)

#

Just a funny little story 🤗 it’s working and that’s all that matters

halcyon quarry
#

Just need to test the per-guild characters feature a bit more before I push this to main

halcyon quarry
#

The per server characters is totally working

#

I'm pushing this now - in addition to expanding the settings management, I've also cleaned up a lot of redundant settings crap and fixed a number of bugs along the way

#

It's much more efficient now

#

AND - there is a separate option to allow per-server-imgmodel-settings. I tested this and it will work if you have enough VRAM. In my case, I can use a separate SD 1.5 model in 2 separate servers

#

Will I continue using an SD 1.5 model in 2 servers? Heck no - but it does work!

#

Pushed to MAIN

valid crypt
#

different character same photo... i still dont understand why it would be useful 🙃

#

and what about the name

halcyon quarry
#

The names are unique!

valid crypt
#

wasnt there a cd?

#

it is going to be same name same photo diff brain

halcyon quarry
#

I still have the bot create a "delayed update task" to ensure profiles do not update more than once every 10 minutes
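
The timing part of such a delayed task can be sketched as a pure function. The names are hypothetical; only the 10 minute figure comes from the message above.

```python
from typing import Optional

PROFILE_COOLDOWN = 10 * 60  # seconds, i.e. once every 10 minutes


def delay_before_update(last_update: Optional[float], now: float,
                        cooldown: float = PROFILE_COOLDOWN) -> float:
    """Seconds to wait so that updates are at least `cooldown` apart."""
    if last_update is None:
        return 0.0  # never updated before, so no wait is needed
    return max(0.0, cooldown - (now - last_update))
```

A background task could then sleep for the returned number of seconds (for example with `asyncio.sleep`) before applying the profile change.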

#

I'll review this to see if I can reduce it when only changing name

valid crypt
#

wait

#

what about changing bodies

#

2bot one program

#

is that a thing?

#

instead of opening 2 adbot, just 1

halcyon quarry
#

Anyway - it is allowing different names in each guild, and it is managing their settings separately. The only caveat is indeed that they need to share image

valid crypt
#

right

halcyon quarry
#

The config file has a field for a shared avatar image

valid crypt
#

different server can have diff name

halcyon quarry
#

if the image is not included there, it will just change the image to the new character every time

#

One more thing I updated... it now allows png, jpg and gif.
It was originally only allowing png

valid crypt
#

uhh .gif

#

bot can have nitro?

#

XD

terse folio
#

Oh yea, that's cool, I saw some bots with animated profiles, and banners recently!

valid crypt
#

what

terse folio
#

rythm i think

halcyon quarry
#

Well, the animated gif will probably be a static image if the bot is not nitro

halcyon quarry
#

Idk how the heck that works, I just looked up "discord avatar supported formats" - and revised my code to allow those 3 it says are supported

terse folio
valid crypt
#

whaaaaaaaaaaaaaat

terse folio
#

im not sure if there are restrictions on who can upload banners (like if it needs to be verified)

halcyon quarry
#

Interesting, hey maybe the bot can use animated gifs now XD

valid crypt
#

bot has more rights than most users...

terse folio
# valid crypt bot has more rights than most users...

mhm,
funny thing is some people have asked if they are allowed to use discord bots as user accounts because of that

Which I think you can do? Because you're not automating a user account.
Just using the bot api to run a bot which is fine

valid crypt
#

well anyways, i remember that there was a setting to make the bot reply to another bot, i thought it was running 2 adbots @halcyon quarry am i right?

valid crypt
terse folio
#

sounds like the random continuation setting?

valid crypt
#

ye

#

i dont think it would be useful because they are going to use double vram which i dont have...

#

im thinking that instead of running 2 adbots, plug 2 tokens

terse folio
#

if it's running off the same instance of tgwui, it wouldn't have to.

But i'm not sure that was a bot feature

valid crypt
#

it could be funny

#

and easier to do

#

like man vram are expensive

terse folio
#

no, not at the moment,
running 2+ discord bots isn't a simple task

Also you'd need to figure out some communication layer internally between them.

The best solution that would fix a lot of things is to create an AD_Discord bot extension that hosts an api with everything it needs from tgwui.

Then you could start up as many discord bots as you want that connect to that extension and take turns sharing the vram

valid crypt
#

oof

terse folio
#

but an extension is no easy task either,
Because you'll now have separate configs.
One for tgwui (what tts extension to load) for example
And other for the bot (character, history... info)

#

it's something I want to attempt later, but i'm still busy until the end of this month

valid crypt
#

no wait wrong

#

ill burn my computer if it get infected

#

i trust!

#

feels like good?

terse folio
#

it's smart to test things in a vm of course,
but it being on github is a pretty okay indicator that it could be safe most of the time
As others can look through the code.

#

I would run it on a new bot account

#

in case there's a possibility it could grab tokens

#

wow 58k downloads,
Its probably fine ^^

valid crypt
#

its sooo cool

#

i got a email and number

#

uhhh f

terse folio
#

don't login as a user (discord doesn't like that)

valid crypt
#

it really has nitro

terse folio
#

I feel like that would be client side

#

but, interestinggg

valid crypt
#

a bot can create server :O

terse folio
#

back before discord was super strict about client side mods, I saw some that mimic some features of nitro.

Like letting you send custom emojis as plain text (others would need the mod to see them)

terse folio
valid crypt
#

i joined that server :O

#

can add friend tho

terse folio
#

as a bot account?

valid crypt
#

cant group chat

#

*cant

#

cant*

#

cant start dm out of no where f

terse folio
#

yea, what's cool is the discord.py docs have user endpoints too,
Added for completion, but are labeled that they wont work for bots.

Can be confusing sometimes for those who don't realise

valid crypt
#

imagine the server owner is a bot XD

terse folio
valid crypt
terse folio
#

amazing :3

halcyon quarry
#

So much would not be possible

valid crypt
#

bot might be able to use the camera

#

just i dont have a camera

terse folio
valid crypt
#

screenshare dont work

halcyon quarry
#

Your database and history manager made Settings() instances and management “easy” (just time consuming to ensure all my little features play nice together )

terse folio
#

Also if you have OBS, you can use the OBS webcam as a camera ^^

valid crypt
#

i can press camera but i dont have

halcyon quarry
#

OBS is the GOAT

terse folio
#

Oh yes and it gets even better, being able to program for it via obs-websockets and a python interface!

valid crypt
#

awww dont work

#

you can press the button

#

and thats all

halcyon quarry
#

I thought Reality already clarified recently that bots can’t use video in VC

valid crypt
#

i literally logged in a bot account

halcyon quarry
#

Yes, so can’t use video 😛

valid crypt
#

and i could press the button 😢

terse folio
#

the button lied to us!

valid crypt
#

cant accept invites tho

halcyon quarry
#

So practical

#

There’s an absolute fuckton of shit that discord bots can do

terse folio
halcyon quarry
#

The docs are huge, every time I need to look something up it’s a scavenger hunt

terse folio
#

Oh interesting, this must be new!
maybe came out when discord started the bot verification thing because that also has server count limits

valid crypt
#

it can upload 50mb+

terse folio
#

yea, bots have their own upload limits

#

I think it's 25 for normal users, 50mb for bots as you said, and 500mb for nitro?

valid crypt
#

got it 100mb

halcyon quarry
#

I may be wrong but I think uploads were recently reduced…

#

Or announced to be reduced…

terse folio
#

awh!

#

hope not

valid crypt
#

the nitro is fake 😢

#

cant change the banner

#

but the profile photo can be gif

#

:)

terse folio
#

bot banners might be changed from the developer panel

#

like you would change the aboutme section for a bot

valid crypt
#

true

terse folio
#

yup, there's a button to upload them there

valid crypt
terse folio
#

that might not be a bot feature, just a feature of the client if used in user mode

valid crypt
#

i closed it and it got deleted by my av

terse folio
halcyon quarry
#

I love how I pour my heart into this and slowly lose github stars lol

keen palm
#

Some nice github stars you got there. Would be a shame if something were to...happen to them....

halcyon quarry
#

if any of you guys end up trying out the per-server settings/characters features, I'd be very interested in feedback

keen palm
#

I've only got one active server, but I can try at some point.

halcyon quarry
#

Same - I only tested enough to confirm it works, along with most features... anyone using the bot in multiple servers could end up finding something I missed

valid crypt
#

im doing it right?

#

gonna believe and try first

#

forgot to make activate true :P

#

some small error but working ^_^

valid crypt
#

there is no description xd

#

idk how google grab info but google is skipping the most important part

#

although it reminds me of ad astra

halcyon quarry
valid crypt
#

nah just got the speaker name wrong

#

the code itself has a problem encoding " ' "

#

and always output %26%23x27 when there is '

#

gpt fixed it
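
A minimal sketch of what that kind of fix might look like, assuming the cause is that the text arrives HTML-escaped (`'` as `&#x27;`) and is then percent-encoded for the API URL. The helper name `encode_for_tts` is made up for illustration and is not taken from the extension:

```python
import html
from urllib.parse import quote


def encode_for_tts(text: str) -> str:
    """Undo HTML escaping first, then percent-encode the text for a URL."""
    plain = html.unescape(text)   # "&#x27;" becomes "'"
    return quote(plain, safe="")  # "'" becomes "%27"


escaped = "it&#x27;s fine"
print(quote(escaped, safe=""))   # percent-encodes the entity itself (%26%23x27...)
print(encode_for_tts(escaped))   # percent-encodes the real apostrophe
```

Unescaping before encoding means the TTS server receives an actual apostrophe instead of the literal entity text.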

#

and for some reason changing segment_size to 0 makes the tts much better

#

but the parameter is not included in the extension's code, so i asked gpt to add it :)

halcyon quarry
#

I’m starting to get tempted to re-publish the project under a new name, the crappy name likely does have a huge impact

valid crypt
#

never done that before

#

dont know how to do :V

halcyon quarry
#
  • Create a GitHub account
  • Go to the repo and "fork" it, giving the fork a name such as the original name with -marcos added to the end
#
  • You can make a duplicate of the main branch by going to the "branches" page, clicking new branch and call it something like fix bugs
#

I use the GitHub desktop app, but I think from this point all you would need to do is (in the GitHub page) click Add file to upload the updated files, which I think will trigger a commit (which updates the existing file)

#

Finally, from the top bar Pull requests > New pull request and it will let you compare two branches.
In this case, you would want to compare your fixed branch, to that other guys main branch

#

Then that guy can approve the PR and it is officially committed

#

I submitted these 2 PRs to sd-webui-forge this morning

valid crypt
#

ok

#

and one more thing

halcyon quarry
#

It sounds like a lot of steps, but once you do it once or twice it gets easier

valid crypt
#

i always forget the command to make bot join voice channel

halcyon quarry
#
  • The character file needs to have use_voice_channel: True setting.
  • In config.yaml you need to have the extension name plugged in (you probably do)
  • In your server you need to use /set_server_voice_channel
#

about use_voice_channel: True I think I'll make that True by default

#

(like the 'in character menu' setting)

valid crypt
#

yeah

#

but

#

there was a command to toggle

halcyon quarry
#

The command works only if you launched the bot with the TTS enabled

valid crypt
#

i remember it was toggle tts or something but cant find it now

halcyon quarry
#

/toggle_tts

#

If you don't see the command, you may need to restart your discord client

#

or try the /sync command

valid crypt
#

i added those settings to the character card

#

i need to add them to another files?

halcyon quarry
#

Well, each character

#

that you want to use on voice

#

I'm changing that default now...

valid crypt
#

like i dont see the command

halcyon quarry
#

Launch the bot with TTS enabled and all that

#

close discord

#

open discord

valid crypt
#

what do you mean with tts enabled

#

like it is sending audio files

#

but i only added some setting to the character card

halcyon quarry
#

When the bot starts up, it registers commands

#

That command does not register unless the bot was started with TTS enabled

#

If you opened your discord client while the bot was online and not connected - the command will not be there.
Reloading the bot with TTS enabled does register the command, but it likely will not appear in the discord window until you restart it

#

I'm testing now... allowing that command to always be there, but just saying "Can't use it" if no TTS client is in the config file

#

this will prevent it from mysteriously not being there when deciding to launch the bot with TTS

valid crypt
#

:p

#

what if the bot didnt detect my tts

#

rebooted

#

logged in with browser

#

still missing

halcyon quarry
#

Ok... last things to check
In config.yaml in the tts_settings, is it play_mode 2? (default)

valid crypt
#

yes

#

didnt touch that

halcyon quarry
#

Ahh ok, I think I know

valid crypt
#

👀

halcyon quarry
#

I have it currently set up so it needs to be in the 'supported clients' list...

#

confirm the name of that tts client and I'll add it

#

Actually, scratch that

#

I have another idea

valid crypt
#

👁️ 👄 👁️

halcyon quarry
#

I added a 'fallback client' logic when initializing extensions

#

It will consider setting up an extension that ends with "_tts" so long as it doesn't find another one that is in the supported clients list
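
Roughly, the fallback described above could be sketched like this. It is not the bot's actual code: the function name and the contents of `SUPPORTED_TTS_CLIENTS` are assumptions for illustration only.

```python
# Hypothetical list; the real supported clients list lives in the bot.
SUPPORTED_TTS_CLIENTS = ["alltalk_tts", "edge_tts"]


def pick_tts_client(extensions, supported=SUPPORTED_TTS_CLIENTS):
    """Prefer a supported client; otherwise fall back to any '*_tts' extension."""
    fallback = None
    for name in extensions:
        if name in supported:
            return name  # a supported client always wins
        if fallback is None and name.endswith("_tts"):
            fallback = name  # remember the first unknown TTS-looking extension
    return fallback


print(pick_tts_client(["vits_api_tts"]))
print(pick_tts_client(["vits_api_tts", "alltalk_tts"]))
```

An extension that does not end with "_tts" and is not in the list is ignored, so the function returns `None` when nothing qualifies.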

valid crypt
#

👍

halcyon quarry
#

Pushed to Main:

  • Any extension ending with "_tts" can now be used, including voice channels.
  • Characters are no longer required to set use_voice_channel in order to use VC. The bot will join as long as it is otherwise configured and the character does not explicitly disable VC with use_voice_channel: False
  • /toggle_tts will now always register, even if TTS is disabled.... to prevent the command missing in discord UI when TTS IS enabled.
#

@valid crypt give it a shot and lmk if it works out

valid crypt
#

ok

#

works

valid crypt
#

/sync reloads extension?

#

or a way to reload extension without closing

valid crypt
#

my first pull request 🥳

terse folio
# valid crypt /sync reloads extension?

sync reloads the discord commands iirc for your client

The bot can't be hot-reloaded without restarting the bot.
It would have to have been built with a cogs like system

halcyon quarry
#

Yeah, I wasn't so sure if it would make commands appear which were missing from the UI (despite those commands being registered)

valid crypt
#

i was wondering why vits_api_tts is so slow, and changing the power plan makes it faster

#

asked gpt to add some debug codes and

2024-08-23 02:20:42 [INFO] Using CPU on 13th Gen Intel(R) Core(TM) i5-13600KF with 14 cores and 20 threads. Total memory: 32GB [in ModelManager.log_device_info:181]
#

using cpu

#

huh

valid crypt
#

yeah definitely using cpu

#

gpt fixed it but not very clean

#

i like it tho

halcyon quarry
#

Chatgpt turning everyone into coders 🤗

valid crypt
#

i feel the speed is very improved, until i see the log

#

daaaam

#

cpu was taking 8s at max power

halcyon quarry
#

Huge!

valid crypt
#

i think you should make this default

halcyon quarry
#

I noticed that TGWUI made that default recently

valid crypt
#

i just wanted to say that

#

even tgwui made it default

halcyon quarry
#

I’ll likely do that

valid crypt
#

and a little detail

valid crypt
#

i have to toggle it and toggle back to let bot join

#

¯_(ツ)_/¯

#

bruh having the respond to it self set to 10% i got 3 replies, 0.1^3 of probability .-.

#

that rare enough

terse folio
visual dagger
#

hi

#

how are you doing guys

valid crypt
halcyon quarry
#

Not that it would change the randomness but you may be able to tweak the prompt a bit

#

Such as adding a system message or prefixing the prompt or something

#

The chance to reply to itself feature actually takes its last reply and uses it as the user prompt so it may be effective but also may confuse the history or something

#

I think that’s how it works I really haven’t looked too much into it

halcyon quarry
# visual dagger how are you doing guys

There’s been a lot of new major features since you last sent anything here, such as streaming responses and per-server settings (including characters), among other things

visual dagger
halcyon quarry
# visual dagger wdym by server settings including chars?

If you enable both new settings per_server_settings and per_server_characters, it will manage characters completely independently in each server. So in one server if you use /character and choose a new character, only that specific server will change to using that character.
The only caveat to this is that the avatar cannot be set independently. I have an additional config option for avatar_image so you can specify a dedicated shared image. If unset, the avatar for the bot will change in all servers every time a new character is loaded

#

This is all possible due to a big overhaul in settings management

visual dagger
#

you guys added ton of features and i need bunch to catch up on now

halcyon quarry
#

Reality hasn't added anything recently, been just little old me 🙂 But they gave me incredible tools to work with

visual dagger
#

hustling! lol

#

any way for decision making?

halcyon quarry
#

The bot does work with many extensions, so you could always try using an extension that adds that

#

At minimum, I kind of recently added some new "tags" for prefix_context and suffix_context, meant to mimic the complex memory extension

visual dagger
#

giving it final options like, [agree, disagree, middle ground, wrong, true, yes, no]

the llm in the end will choose one of those but before that there gotta be an analysis that happens

#

before the final decision

halcyon quarry
#

I'd mentioned this before, but you could use a flows tag to essentially pass the history along with a background prompt to a dedicated character context, to make a decision that would then give a specific prompt to your normal character

#

You could have the flows tag trigger for every user message, or only on certain key words

visual dagger
#

you mean sending it to a second llm? that will be the actual decision maker?

halcyon quarry
#

For example if it triggers your flows tag, flow_step 1 may be swap_character: Decision Maker who has a dedicated context with example dialogue like system_message: You make decisions, valid responses are X Y and Z, user: an example of a prompt, Decision Maker: X, user: another example of a prompt, Decision Maker: Z, <YOUR PROMPT GETS INSERTED BY THE format_prompt tag>

#

flow_step 2 may then be format_prompt: {llm_0}, representing what the last character just wrote

#

etc

visual dagger
#

those are good tools but the problem is when it comes to the decision making part

#

the llm acts stupid

halcyon quarry
#

You'll need to read my comments in dict_tags.yaml and in my Wiki, look over my examples, look at all the possible tags, look at the Variables

visual dagger
#

i tried dozen prompts for decision making

halcyon quarry
#

It could be a matter of prompting

#

Here's what most people do: Make a huge long winded explanation of how the LLM behaves

#

I recommend trying having literally nothing else except for some solid example dialogue

#

and like, a concise system message that You make decisions

visual dagger
#

you mean a cloned character but the cloned version will be customised for only decision making?

#

the system prompt will be fine-tuned for only that

halcyon quarry
#

You need to see all the crazy crap my Tags System can do

#

You can make all of that apply to a flow step

#

completely different character context, different parameters, manipulating the history it sees, manipulating the prompt, etc etc

#

Hiding or keeping that characters response in history, yadda yadda

visual dagger
#

the data I mean examples

halcyon quarry
#

You see that as a negative, but it's actually stronger that it is biased... towards responding the way you actually want it to

visual dagger
#

should I include examples of each option? hmm..

halcyon quarry
#

Good example dialogue will have all random user prompt scenarios that are not going to be an exact copy of a prompt you or your users will actually make

#

But similar

#

like, in the ballpark

visual dagger
halcyon quarry
#

The benefit of using example dialogue only is you can use all those tokens for it without the huge long winded explanations

#

Fit as many examples as you can for diverse scenarios, and if you are very smart about it you can really emphasize how that character makes its decisions

#

without having to explain it

#

You can see the example characters I provided... they work

visual dagger
#

for decision making?

halcyon quarry
#

Well, Imgmodel_Selector

#

When my flows tag is triggered, it works - it picks appropriate image models depending on the prompt

visual dagger
#

that's a nice classifier

halcyon quarry
#

Yes, I also made one for selecting Aspect Ratios, which worked well too... unsure where the heck I misplaced that character...

visual dagger
#

the problem with dialogues is that it will link unrelated dialogues together, just because they came afterwards

this is what I mean

ai

halcyon quarry
#

This is giving me an idea for a new Tag... which is possible thanks to Reality's history manager

#

something like filter_history_for

#

Adding another layer of history manipulation. Such that, you could show that Decision Maker character its previous responses to your prompts

#

Without showing those prompts/replies to the main context

#

Currently, the save_to_history tag is the main thing to prevent sharing unwanted history among contexts

#

yep, definitely adding this tag

halcyon quarry
#

Logging new item to history management

#

The upcoming tag filter_history_for will search both name and impersonated_by

#

and collect those exchanges

visual dagger
#

my phone died on me in the middle of me writing

#

😐

visual dagger
#

that can be both good and bad, if all the decisions it made in the past were bad then rip

#

it's same as providing examples that you talked about

#

I have alot of experimentation to do to get this decision making process working, and more importantly making it choose the correct decision

visual dagger
halcyon quarry
#

There are other tags for manipulating history, load_history could restrict the depth of history to only X # of exchanges
Using /reset_conversation wipes history.

halcyon quarry
#

@terse folio hoping you might take a look at something... I don't 💯 understand how history works

#

usage:

filtered_history = self.history.filter_history_for_names(names_list)
i_list, v_list = filtered_history.render_to_tgwui_tuple()
self.llm_payload['state']['history']['internal'] = i_list
self.llm_payload['state']['history']['visible'] = v_list
#

The thing I'm uncertain of, is if it matters what order I collect the hmessages

#

Although... yeah... .render_to_tgwui_tuple() won't work on a list of hmessages

halcyon quarry
#

I should never call for backup until after I’ve showered - because that’s when I always solve my problems

keen palm
#

Just do all your coding in the shower

valid crypt
#

waterproof laptop and good to go

halcyon quarry
#

I’m going to get the message pairs and collect to a list… if multiple bot replies do smart things to get the right one… then, iterate over a copy of history using the collected list and pop any messages not in the list

halcyon quarry
valid crypt
#

i have better brain while 💩

terse folio
halcyon quarry
#

Thanks - I think I’ve got the right idea now though #1154970156108365944 message

terse folio
#

The cost in performance when instancing classes vs lists is negligible
(Lists are classes anyway)
so dont worry about doing things that way!
there are still some optimization routes we could take: Like implementing "Slots" for history so the class only reserves memory for a set list of attributes
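
For anyone unfamiliar with the "Slots" idea, here is a generic illustration (not the bot's history classes; `SlimMessage` is a made-up name). Declaring `__slots__` fixes the set of attributes, so instances carry no per-object `__dict__`:

```python
class SlimMessage:
    # Only these attributes can ever exist on an instance.
    __slots__ = ("uid", "text")

    def __init__(self, uid: int, text: str) -> None:
        self.uid = uid
        self.text = text


msg = SlimMessage(1, "hello")
try:
    msg.extra = "not allowed"  # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(rejected, hasattr(msg, "__dict__"))
```

The trade-off is flexibility: attributes can no longer be attached ad hoc, which matters if other code adds fields to messages at runtime.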

halcyon quarry
#

Thanks!

terse folio
visual dagger
#

hey @terse folio how is it going

terse folio
terse folio
halcyon quarry
#

I may add another bool attribute like “manipulated_reply” or “from_tampered_history” to convey that history was manipulated to produce the reply

#

Could be good for logging at least, maybe for a future filtering method

halcyon quarry
#

I think this should do it

#

although, seems like the custom .append() method is eager to save the history

#

ah ok, this is the answer

terse folio
# halcyon quarry ah ok, this is the answer

the _ prefixing the attribute name is to indicate it's supposed to be treated like a private variable (not meant to be modified externally)

the reason being, history.append does checks to make sure the added object is an hmessage type.

#

How about adding a "nosave" kwarg to .append!
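
A sketch of how such a kwarg might look, using stand-in classes (`MiniMessage` and `MiniHistory` are hypothetical, and the real `.append()` may do more than this):

```python
class MiniMessage:
    def __init__(self, text: str) -> None:
        self.text = text


class MiniHistory:
    def __init__(self) -> None:
        self._items = []      # treated as private, as described above
        self.save_count = 0   # stand-in for writing history to disk

    def save(self) -> None:
        self.save_count += 1

    def append(self, item, nosave: bool = False) -> None:
        # Keep the type check so only message objects enter the history.
        if not isinstance(item, MiniMessage):
            raise TypeError("history only accepts MiniMessage objects")
        self._items.append(item)
        if not nosave:
            self.save()


history = MiniHistory()
history.append(MiniMessage("temporary copy"), nosave=True)
history.append(MiniMessage("normal message"))
print(len(history._items), history.save_count)
```

Going through `.append(..., nosave=True)` keeps the validation while skipping the eager save, instead of reaching into `_items` directly.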

terse folio
halcyon quarry
#

look good?

terse folio
#

yea!

halcyon quarry
#

Here's what the tags end looks like

terse folio
#

interesting, also something I could implement later is running slices on a history object

#

Like this could become history[-num:] if that's of any use

#

I take that back, different stuff going on as this is after extracting pairs, looks good ^^

halcyon quarry
#

I improved the get_history_pair() method too...
Added the check in the red box

#

If/when there can be multiple possible user messages for a bot reply, will need to expand that part

#

forgot break

#

(should only be one max anyway)

terse folio
#

nvm

#

I read it as all bot replies in the history at first

#

I get it now

halcyon quarry
#

It's been working well for the App Commands (regen / continue / toggle as hidden) but it may have been occasionally getting a less than perfect bot reply

terse folio
#

User1: 1+1
User2: 2+3
Bot: 2
Bot: 5

Theoretically this should output in pairs:
1+1 = 2
2+3 = 5

halcyon quarry
#

I do have a separate method to return all possible replies as lists

terse folio
halcyon quarry
#

rather than this one which filters it down to one pair

terse folio
#

A mix of both could be to gather all the replies from the first function, and "\n".join() them into a single reply to be used as a pair
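
As a tiny sketch of that idea (the function name is invented here, and plain strings stand in for message objects):

```python
def merge_replies(user_text, bot_replies):
    """Collapse every bot reply to one user message into a single pair."""
    return user_text, "\n".join(bot_replies)


pair = merge_replies("1+1", ["2", "which is two"])
print(pair)
```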

halcyon quarry
#

Uh oh, the gears are turning
Might need to come burn some of that idea fuel XD

terse folio
#

got a few more days of really busy stuff!

halcyon quarry
#

The filter.... I think it's working, my server is too dead to test effectively lol

terse folio
#

human coordination isn't always easy,
I have a couple test discord accounts for that!

halcyon quarry
#

Oh right I can test it for the impersonated_by attribute (from triggered 'swap_character')

terse folio
#

you can login to one on a webbrowser, or alternate official discord client like Discord Canary

halcyon quarry
#

There is an issue with the method I used for filter history...

#

maximum recursion depth exceeded in comparison

I'm not sure why this is causing an infinite recursion...

#

Could appending items to the copy of history, be causing them to also append to the history it was copied from?

#

Ok well for some reason self.fresh() is actually just giving me a copy of history, not an empty history instance

#

seems like it should be giving me empty history but here is what is printed

#

fixed by copy.deepcopy() 🤓

#

The error I'm getting is due to comparing HMessage objects to each other

#

it bugs out when they match 🙂

#

What chatgpt suggests (and what I'll do) is make a set of the message IDs that it already appended, and check hmessage ID against that
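
The set-of-IDs approach could look something like this. Plain dicts stand in for HMessage objects, and the `"id"` key is an assumption for the sketch:

```python
def collect_unique(messages):
    """Keep the first occurrence of each message, compared by ID only."""
    seen_ids = set()
    unique = []
    for message in messages:
        if message["id"] in seen_ids:
            continue  # already collected, skip without comparing objects
        seen_ids.add(message["id"])
        unique.append(message)
    return unique


msgs = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}, {"id": 1, "text": "a"}]
print([m["id"] for m in collect_unique(msgs)])
```

Because only IDs are compared, the message objects' own `==` is never invoked.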

terse folio
#

the hmessage should be using the uid attribute for matching

#

make sure you have a unique one for unique messages

halcyon quarry
#

resolved it

terse folio
#

like for discord

#

but if it works, great!

halcyon quarry
#

I actually generate fake IDs for messages that are not sent to discord 😛

terse folio
#

I see I see

halcyon quarry
#

and whenever there are multiple IDs, the extras are all 'related_ids'

#

So yeah, for some reason if hmessage == hmessage: triggers an infinite recursion

terse folio
#

do you have a code snippet that causes that?

#

nvm

#

i never implemented an __eq__ function it seems

#

because dataclasses do it for you, and there was some bug with labeling which attributes i wanted to be used for eq checking

#

in the meantime, you could get around it by defining your own .equals() function that compares the uids or ids
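
A sketch of both options, under the assumption (not confirmed in this chat) that the recursion comes from messages referencing each other: the auto-generated dataclass `__eq__` compares every field, so two separately copied messages that point at each other's partners can keep comparing back and forth. `Msg` and its fields are hypothetical stand-ins for HMessage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Msg:
    uid: int
    text: str
    # Excluded from the generated __eq__ so linked messages are not walked.
    reply_to: Optional["Msg"] = field(default=None, compare=False, repr=False)

    def equals(self, other: "Msg") -> bool:
        """Explicit comparison on the unique ID only."""
        return self.uid == other.uid


# Two independent copies of the same linked pair of messages.
a1, b1 = Msg(1, "hi"), Msg(2, "hello")
a1.reply_to, b1.reply_to = b1, a1
a2, b2 = Msg(1, "hi"), Msg(2, "hello")
a2.reply_to, b2.reply_to = b2, a2

print(a1 == a2, a1.equals(a2), a1.equals(b2))
```

With `compare=False` on the linking field, `==` only looks at `uid` and `text`; `.equals()` is the narrower check that ignores everything except the ID.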

halcyon quarry
#

Welp, in my testing I realize another nice tag to add would be include_hidden_history

#

Currently, the only ways to include hidden history are:

  • to toggle it back to visible
  • or perform an App Command on a hidden item, it toggles it temporarily 😛
halcyon quarry
#

Added new tag: include_hidden_history

I implemented this by adding a new argument for render_to_tgwui_tuple()

if include_hidden_history:
    include_hidden = True
# Render history for payload
i_list, v_list = history_to_render.render_to_tgwui_tuple(include_hidden)
#

I first tried deepcopying self.local_history._items and iterating over the hmessages, but that actually still referenced the original attributes and they were getting modified

terse folio
halcyon quarry
#

In this case, was much more efficient to just expand the render method

#

There are methods that copy history… unsure how unique those copies are…
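
For reference, this is the general difference between a shallow and a deep copy in Python. It is a generic illustration with plain dicts and lists, not the bot's history object, whose own copy methods may behave differently:

```python
import copy

original = {"items": [{"text": "hi", "hidden": True}]}

shallow = copy.copy(original)    # new outer dict, same inner objects
deep = copy.deepcopy(original)   # fully independent structure

shallow["items"][0]["hidden"] = False  # also changes the original
deep["items"][0]["text"] = "changed"   # does not touch the original

print(original["items"][0])
```

If edits to a "copy" show up in the original, the nested objects are most likely still shared.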

valid crypt
#

i found out that json can't have comments D: does yaml have comments?

terse folio
#

yaml can use # comments

halcyon quarry
#

Besides most of the code I'm sharing here is from .py

terse folio
#

Also depending on how one's code reads the json, you might be able to get away with adding extra key:values that the program ignores

like "comment":"your text here"
but yaml would be better suited ^^
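
A small sketch of the "comment" key workaround using the standard `json` module. The config keys shown are only examples:

```python
import json

raw = '{"comment": "your text here", "play_mode": 2}'

settings = json.loads(raw)
settings.pop("comment", None)  # drop the pseudo-comment before using the config

print(settings)
```

This only works if the program tolerates unknown keys or strips them as above; JSON itself has no comment syntax.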

terse folio
halcyon quarry
#

eh, don't sweat it - we'll do it if the need arises 🙂

#

almost needed it, but not quite 😄

visual dagger
terse folio
#

Yea, tuning your prompt to do something perfectly can be a pain, especially with smaller models

halcyon quarry
#

There is an LLM therapist model called Carl

#

Probably a better one by now… that was > year ago

valid crypt
#

didnt know that vits is better on linux

#

f

#

gonna kill myself

halcyon quarry
#

Isn’t everything better on Linux?

keen palm
#

Even gaming in some cases!

valid crypt
keen palm
#

It's not a matter of opinion. There are Linux-based OSs built for gaming (SteamOS, Bazzite, Batocera), and some games run objectively better that way.

valid crypt
#

guy

#

ubuntu 22 or ubuntu 24

#

gpt says that 22 is more recommended 👌

keen palm
#

For what

valid crypt
#

training

valid crypt
#

gonna pause vits training for a few days :v

#

if i want to add stt to the bot is modifying an extension or bot.py

#

stt might be too hard for me

#

to add streaming tts is modifying the extension or the bot.py

#

?

halcyon quarry
#

I'm really not sure what's involved with STT, but thanks for the reminder.
I may look into that in the coming days

halcyon quarry
#

I need to see how whisper_stt works, and figure out how to implement it

valid crypt
#

so what i have to modify to add streaming tts?

halcyon quarry
#

You'd need to implement a separate API call to a TTS specific API, with the responses that are yielded from llm_gen() to the process_response_chunk() function

#

such as alltalk_tts's API

#

in the correct format, with the parameters, etc

valid crypt
#

😵‍💫

halcyon quarry
#

and then the responses from the TTS API would need to use the voice client code

valid crypt
#

i let chat gpt to do the tech part

#

i think that we are not speaking the same english

#

😵‍💫

halcyon quarry
#

It's unfortunate that the "continue" function has the TTS extensions generate the entire bot reply, and not just for the continued text

valid crypt
#

does that affect the speed?

halcyon quarry
#

Yes, but it also returns the filepath to the generated audio which is the entire response

valid crypt
#

if it affects the speed, then it is tough

#

i thought that you could make a code to do some subtraction

halcyon quarry
#

Let's say I have max new tokens set to 2.
Example:

User: How's it going?
Bot: G
TTS Audio: G

Continue request:
Bot: Go
TTS Audio: Go

Continue request:
Bot: Goo
TTS Audio: Goo

#

hmm

#

There probably is a way to trim the subsequent audio files, based on the length of the previous

#

But it will generate the entire reply for every time it sends a response chunk
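
The "subtraction" idea could be sketched on the text side, before anything is sent to TTS. This is only an illustration with an invented helper name; it does not trim audio files, and it assumes the continued reply starts with the previous reply:

```python
def new_portion(previous: str, full: str) -> str:
    """Return only the text added since the previous reply."""
    if full.startswith(previous):
        return full[len(previous):]
    return full  # reply was rewritten rather than continued; keep all of it


print(repr(new_portion("G", "Go")))
print(repr(new_portion("Go", "Goo")))
```

Sending only the new portion to the TTS engine would avoid regenerating audio for text that was already spoken.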

valid crypt
#

if the continue dont affect the output speed, you could just write a code to delete the previous response or smth

#

but

#

if it restarts everytime...

halcyon quarry
#

It does affect the output speed because it has to generate the entire response each time, yes

valid crypt
#

thats the problem

#

but i remember that when i press regenerate, starts where it was and wasnt taking too long

halcyon quarry
#

I'll make a deal with you.
If you can get python code up and running that uses the Alltalk TTS API, with the parameters and everything, I'll integrate that code into the bot

valid crypt
#

my vits tts uses api

halcyon quarry
#

When I looked at the alltalk TTS documentation for the API, it looked very complicated

valid crypt
#

the extension is only to do a api call

halcyon quarry
#

Well if you can send me functioning .py files that use the API directly I'll see what I can do

valid crypt
#
GitHub

This is an extension of text-generation-webui in order to generate audio using vits-simple-api. - Arondight/vits_api_tts

GitHub

A simple VITS HTTP API, developed by extending Moegoe with additional features. - Artrajz/vits-simple-api

valid crypt
halcyon quarry
#

I'll look into it, but it looks pretty goddamned complicated lol

valid crypt
#

it says simple api tho

halcyon quarry
#

uh huh

#

simple nuclear physics

valid crypt
#

also i feel vits like, very fast, mimics voices very good, but training it feels like eating s*

#

there is a ez trainin repo but 22050hz is pain (for me)

halcyon quarry
#

I'm outta here, I will look into that.
What are the odds that this vits is bugged like the other one?

valid crypt
#

?

halcyon quarry
#

nvm

valid crypt
#

it works fine

#

just need the correct model

halcyon quarry
#

I guess Artrajz probably the original author, and the other one the guy ported as an extension

#

the extension was bugged, you pushed a PR that will languish unmerged for an eternity XD

valid crypt
#

not a deadly bug

#

just flaws

#

1 big flaw

#

1 small flaw

#

very simple to fix

#

1st flaw was ' was read as #26x27; (something like that)
2nd flaw was adding a parameter improves quality

#

this is the gui

#

accessing the url below generates the audio

valid crypt
#

streaming responses dont work???, but since i discovered that i wasnt using gpu, it takes too little time to need that function

terse folio
halcyon quarry
#

I’m going to give a good try at monkeypatching whatever I need to, to get the desired result, either from Continue function or making message chunking occur in patched function

#

Maybe patch chatbot_wrapper

valid crypt
#

i said i was pausing vits but man it is addictive

#

terminal says nothing but gpu is working

#

i hope everything is fine

#

my pc is dying

#

so laggy

#

if training at 22khz im out of ram

#

i dont think i could survive 48khz

halcyon quarry
#

committing Spongedoku

halcyon quarry
#

I've duplicated chatbot_wrapper() locally, imported the few required functions... now should be able to customize it...

valid crypt
#

🧐

#

(i have no idea)

keen palm
#

Hmmm, bot seems to be having some issues

#

got a 503 error on load, but now it's just freezing....

#

Other bots not working as well, so seems discord broke some things with that update.

terse folio
#

Login issue?, any chance your token was changed?
but try updating discord.py

halcyon quarry
# valid crypt 🧐

Chatbot_wrapper is TGWUI internal function that accepts the payload and returns the response

#

It also interacts with the extensions - I’m going to see if I can customize it when TTS and response streaming are enabled, to not send the entire response to the extension each time it splits

#

This will be optional, it could have adverse effects with other extensions. It will otherwise use the unmodified function

keen palm
halcyon quarry
#

already making good progress

halcyon quarry
#

I think streaming TTS responses will be a thing soon

#

There's a variety of ways that extensions can modify normal behavior.
alltalk_tts (and I suspect all other TTS extensions) change the state['stream'] variable to False.
This makes it so it must wait until complete generation is done before it lets extensions modify the "output".

I changed the behavior to force the stream flag to True, and now it at least lets the text stream... and at the end it, processes the full TTS.

I think I know what needs to be done to make it process TTS for each chunk...

halcyon quarry
#

I have it working, but something is making alltalk just hang sometimes... may need to add some arbitrary sleep or something.

valid crypt
halcyon quarry
#

In a way I already have it generating TTS after each response chunk

#

Just need to clean it up

#

And make it not hang and crash 🙂

#

If I can’t figure it out the guy from alltalk may have some insight, super nice guy

halcyon quarry
#

Noticed character is not automatically joining voice channel on startup... will resolve that as well

#

Ok, the TTS did not hang and error here at other location. I may have an outdated alltalk install or something at other location

halcyon quarry
#

For anyone wondering how I am making this work, I've extracted this code from chatbot_wrapper() (TGWUI internal function) which is the very last code it executes, after all text has been generated:

    output['visible'][-1][1] = extensions_module.apply_extensions('output', output['visible'][-1][1], state, is_chat=True)
    yield output

This is what actually triggers the TTS extensions to generate the TTS.
After the audio generates, this yields the "visible" response... when TTS is enabled, the message includes the path to the audio file.

  • I've extracted the code (no longer executes at the end of chatbot_wrapper() )
  • I now execute it on my end every time the bot decides to chunk text, with the chunked text in the arguments
if should_chunk:
    last_checked = ''
    already_chunked += partial_response
    audio_path = extensions_module.apply_extensions('output', partial_response, state=self.llm_payload['state'], is_chat=True)
    print("audio_path:", audio_path)
#

The TTS extensions only use the "state" dict for applying character name to filename, etc.

halcyon quarry
#

Figured out why bot was not joining VC on startup. Fixed

#

Also noticed that per_server_characters was not handling Voice Channels correctly. Fixed,

#

One more unrelated bug to fix (noticed Regenerate was broken) then I'm pushing

halcyon quarry
# terse folio nvm

There must be some new thing I added to HMessage objects that is triggering the infinite recursion when comparing them, because old functions that used to work via comparison are now also triggering the error.

I'm doing what you suggested which is adding an .equals() method to compare ids

#
    def equals(self, hmsg:"HMessage"):
        return self.id == hmsg.id
#

something is super screwed up, ugh

#

just going to git checkout until I find where things went South

halcyon quarry
#

Ok, it turns out that the way I solved the one infinite recursion bug, created a different one

#

(go figure)

halcyon quarry
#

yes, History() did not like when I used copy.deepcopy on it

#

solved! Ok everything is looking very good now

#

PUSHED STREAMING TTS REPLIES TO MAIN

#

@keen palm @valid crypt

valid crypt
#

gotcha

keen palm
#

TTS still isn't something I've dabbled with at all.

valid crypt
#

died

halcyon quarry
#

hmm... so I'm testing edge_tts and see this one is not working

valid crypt
#

AttributeError: 'Task' object has no attribute 'streamed_tts'

21:33:59.618 #2846 ERROR [bot.main]: An error occurred while processing "on_message" request: 'Task' object has no attribute 'streamed_tts'
21:33:59.618 #3875 ERROR [bot.main]: An error occurred while processing task on_message: Type is not JSON serializable: AttributeError
TypeError: Type is not JSON serializable: AttributeError

#

something like that

halcyon quarry
#

Fix for that coming

#

Pushed

valid crypt
#

thats fast

halcyon quarry
#

Well, realized my mistake pretty quick after you said it 😛

valid crypt
#

code runs smoothly

#

just suspect that it isnt streaming

#

is it on by default?

halcyon quarry
#

The character needs to have chance_to_stream_reply behavior, some value > 0.0

#

The TTS streaming happens in sync with the text streaming

#

When it splits a sentence, it generates that portion of the TTS and plays it

valid crypt
#

what does the value do?

#

bigger number smaller chunk?

halcyon quarry
#

If 1.0 it will split on every period or line break

#

0.5 means 50% chance to split on one of those
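
A rough sketch of how a chance value like this could gate the split. `should_split` and its `rng` parameter are hypothetical names for illustration; the bot's actual check may look different.

```python
import random


def should_split(text, chance, rng=random):
    """Return True if the streamed text should be cut off here."""
    at_boundary = text.endswith((".", "\n"))
    # rng.random() is in [0.0, 1.0), so chance=1.0 always passes
    # and chance=0.0 never does.
    return at_boundary and rng.random() < chance
```

Because the check is a plain numeric comparison, an integer `1` behaves the same as `1.0`.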

valid crypt
#

understood

#

1 and 1.0 are the same thing?

halcyon quarry
#

Yep

#

hmm

valid crypt
#

working 👀

halcyon quarry
#

working for you?

valid crypt
#

did 3 tries

#

2 working

#

1 same message 3 times

halcyon quarry
#

edge_tts is failing, because it calls some async function which apparently is not allowed while TGWUI is generating text
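
One common way this kind of failure shows up in Python, sketched as an assumption since nothing above confirms it is what edge_tts hits: a synchronous wrapper that calls `asyncio.run()` is refused when it runs on a thread that already has an event loop running. `fake_edge_tts` and `sync_tts` are hypothetical stand-ins.

```python
import asyncio


async def fake_edge_tts(text):
    """Hypothetical stand-in for an async TTS call."""
    await asyncio.sleep(0)
    return f"audio:{text}"


def sync_tts(text):
    """Synchronous wrapper, the shape an extension hook might have."""
    coro = fake_edge_tts(text)
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "never awaited" warning
        raise


async def bot_handler():
    loop = asyncio.get_running_loop()
    try:
        sync_tts("hi")  # called on the loop thread: refused
        direct = "ok"
    except RuntimeError as exc:
        direct = str(exc)
    # In a worker thread no loop is running, so asyncio.run works.
    threaded = await loop.run_in_executor(None, sync_tts, "hi")
    return direct, threaded


direct_result, threaded_result = asyncio.run(bot_handler())
```

The same wrapper fails on the loop thread and succeeds in a worker thread, which is one reason to check where the TTS call actually runs.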

valid crypt
#

did another try: no tts

#

tts might have died

halcyon quarry
#

You're using that vits tts?

valid crypt
#

ye

#

i only can say that it is inconsistent

halcyon quarry
#

I'm getting very consistent results with alltalk, at least

#

Is there no rhyme or reason for the bits that aren't working?

valid crypt
#

bot sometimes sends 3 times the same audio

#

sometimes all correct and a full version

halcyon quarry
#

If you could walk me through that extension I could try reproducing it

valid crypt
#
GitHub

A simple VITS HTTP API, developed by extending Moegoe with additional features. - Artrajz/vits-simple-api

GitHub

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion - Plachtaa/VITS-fast-fine-tuning

GitHub

This is an extension of text-generation-webui in order to generate audio using vits-simple-api. - marcos33998/vits_api_tts-marcos

#

i remember that some dependencies cant be installed automatically so you have to install it manually but i think you wont have problems

#

these are what you need, they are under configs folder and pretrained models

terse folio
# halcyon quarry yes, History() did not like when I used copy.deepcopy on it

ah yes,
because there are references to itself

A hmessage contains its parent
and history contains the hmessages.
That way they can reference each other.

hmmm, now that I say that out loud I believe there's a bug with the history copy method.
It should create a new timeline and each hmessage item should point to the new history, not old.

i'll write that down on my todo
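
A sketch of the copy behaviour described above, with each copied message pointing at the new history instead of the old one. The attribute names (`_items`, `.history`) and both class bodies are assumptions for illustration, not the bot's real classes.

```python
import copy


class HMessage:
    def __init__(self, id, text):
        self.id = id
        self.text = text
        self.history = None  # set when added to a History


class History:
    def __init__(self):
        self._items = []

    def append(self, hmsg):
        hmsg.history = self
        self._items.append(hmsg)

    def copy(self):
        new_history = History()
        for hmsg in self._items:
            clone = copy.copy(hmsg)    # shallow: does not follow .history
            new_history.append(clone)  # re-point the clone at the new history
        return new_history


old = History()
old.append(HMessage(1, "hello"))
new = old.copy()
```

After `old.copy()`, the two timelines hold separate message objects, and neither side refers back to the other.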

halcyon quarry
#

For what I was doing, which was making a copy of history... I just used copy.copy() (like originally), then set its _items to []
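
Roughly what that looks like, as a sketch with a hypothetical stand-in `History` class and helper name. The detail worth noting is that `_items` gets rebound to a new list on the copy.

```python
import copy


class History:
    def __init__(self, name):
        self.name = name
        self._items = []


def empty_copy(history):
    fresh = copy.copy(history)  # shallow: shares _items with the original
    fresh._items = []           # rebind to a new list; .clear() would
                                # also empty the original's list
    return fresh


original = History("channel-1")
original._items.append("message")
fresh = empty_copy(original)
```

The copy keeps the other attributes but starts with no messages, and the original's list is untouched.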

#

things are working fine now, may be some other stuff not working 100%

halcyon quarry
#

I’m using the microphone so this might not come out good

#

Now that I have a custom version of chatbot_wrapper, I don't think that we need to use the custom async-for-partial thingy and can instead just make it an asynchronous function

#

I'm wondering if the inconsistent results I get from the TTS streaming are due to the way we are making it asynchronous

halcyon quarry
#

I might need some kind of thread lock and unlock thing added to it I don’t know

terse folio
#

if the tts code is inside an asyncio loop.run_in_executor call

#

it's in another thread and shouldnt affect the bot
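
For reference, a minimal sketch of that pattern. `blocking_tts` is a hypothetical stand-in for a slow synchronous TTS call; `loop.run_in_executor` with `None` uses the event loop's default thread pool.

```python
import asyncio
import threading
import time


def blocking_tts(text):
    """Stand-in for a slow, synchronous TTS call (hypothetical)."""
    time.sleep(0.05)
    return f"audio for {text!r}", threading.current_thread().name


async def main():
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    audio, worker = await loop.run_in_executor(None, blocking_tts, "hello")
    return audio, worker, threading.current_thread().name


audio, worker_thread, loop_thread = asyncio.run(main())
```

The blocking call runs on a worker thread, so the event loop thread stays free to handle other bot events while it waits.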

#

what kind of bug are you getting?

halcyon quarry
#

It’s outside 🤓

#

Trying not to move all my text chunking logic inside that function (I am processing TTS every time text gets chunked)

#

But starting to look like I may have to

#

Er, actually it is still in the loop.run_in_executor block

terse folio
#

Use a generator that iterates on the output chunks, that way you can keep them as separate functions but one is technically inside the other

#

but tts probably doesn't need to know about previous iterations
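
A sketch of the generator idea: the chunking logic stays in its own function, and the TTS side just consumes chunks one at a time without seeing earlier ones. `iter_chunks`, `speak_stream` and the `tts` callback are hypothetical names, and the probability roll is left out to keep it short.

```python
def iter_chunks(tokens):
    """Yield a chunk each time the buffered text ends a sentence or line."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.endswith((".", "\n")):
            yield buffer
            buffer = ""
    if buffer:
        yield buffer  # whatever is left when the stream ends


def speak_stream(tokens, tts):
    """Feed each chunk to `tts` as soon as it is available."""
    return [tts(chunk) for chunk in iter_chunks(tokens)]


# str.strip stands in for a real TTS call here.
spoken = speak_stream(["Hi", " there.", " Bye", "."], tts=str.strip)
```

Because the generator is lazy, each chunk reaches `tts` as soon as it is complete, before the rest of the tokens are read.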

halcyon quarry
#

When you get an opportunity to look at it it will make much more sense

terse folio
#

Okie!

halcyon quarry
#

I’m making TGWUI do something it doesn’t want to, which is run apply_extension() at times it usually doesn’t

valid crypt
#

ill test why sometimes tts stops working

#

absolutely no idea...

valid crypt
#

i thought there was some kind of cooldown but no???
i thought that there a limit of messages but no???
i thought that after a while disables the tts but no???
i thought i have to manually generate a speech but no???

halcyon quarry
#

what?

#

Everything I said before about TTS streaming not working is out the window now

#

This new idea that popped into my head the other day is definitely working

#

Just need to smooth out some of the wrinkles

#

I know you don't like alltalk but for a proof of concept you could try it and see that is at least working 100%

#

That author is extremely dedicated and constantly improving their code, so their efficiency and code structure can probably be credited for why this technique is working well for it

#

I think the solution I came up with could be further tweaked to make it work consistently for all TTS extensions

valid crypt
#

🤯

#

vits is not working at 100% it is like 90%, if i dont count the problem that the tts stops working
i suspect that the api call has a rate limit and the limit disappears after a manual generation so might not be your fault

halcyon quarry
#

I do have some weird bug that I can't quite put my finger on, where it is uploading a file twice

#

What's odd about it is that the function that uploads TTS files like shown here, also sends them to voice channel.
But, this voice clip is not playing twice on the voice channel

#

It's very odd that this clip uploads twice

valid crypt
#

just very rare

halcyon quarry
#

probably something to do with the custom code I added to make edge_tts work 😛

#

because of that one model output being a bugged format

#
with io.BytesIO() as buffer:
    if file.endswith('wav'):
        audio = AudioSegment.from_wav(file)
    elif file.endswith('mp3'):
        audio = AudioSegment.from_mp3(file)
    else:
        audio = None  # nothing to export for an unsupported format
        log.error(f'TTS generated unsupported file format: {file}')
    if audio is not None:  # previously `audio` was undefined on this path
        audio.export(buffer, format="mp3", bitrate=f"{bit_rate}k")
        mp3_file = File(buffer, filename=mp3_filename)

need to have chatgpt review

valid crypt
#

i think it is some kind of protection of vits simple api, as i run tts and llm on different devices...

#

vits simple api has a config file and i wonder why it wasnt using gpu

halcyon quarry
#

Ok also found a flaw in my logic

#

Could account for the 10% of the time not working for you

#

Currently it is only triggering a TTS if the text response gets split at least once
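
A small sketch of that flaw and the obvious fix, using a hypothetical `tts_segments` helper that splits on every period: if only split points produce TTS, a reply that never splits produces nothing, so the leftover text has to be flushed at the end.

```python
def tts_segments(text, flush_remainder=True):
    """Split on '.' and return the pieces that would be sent to TTS."""
    segments, buffer = [], ""
    for char in text:
        buffer += char
        if char == ".":
            segments.append(buffer)
            buffer = ""
    if buffer and flush_remainder:
        segments.append(buffer)  # text after the last split, or all of it
    return segments


silent = tts_segments("Sure thing", flush_remainder=False)  # the flawed path
spoken = tts_segments("Sure thing")                         # with the flush
```

With the flush in place, a reply with no split at all still ends up as one TTS segment.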

valid crypt
#

how i didnt notice that

#

🤯

halcyon quarry
#

testing my fix

#

and also found the issue causing the extra file upload

#

man I'm sloppy :p

#

And yes my fix worked for the no-split replies

#

Pushed the fixes to main

halcyon quarry
#

Now hopefully you have 100% success with that

valid crypt
#

wanna hear a small detail to be improved?

#

ofc if you want

#

the message after resetting the conversation has no tts :3

halcyon quarry
#

Oh, the greeting

valid crypt
#

yeees

#

the greeting

halcyon quarry
#

Could make an option for that

#

Sure, I'll do it

#

Glad you asked because I found some flawed logic in reset conversation

valid crypt
#

:)

halcyon quarry
#

The command queues a task... but it was actually replacing history with a fresh copy immediately before queueing the task
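
A sketch of the ordering problem, using a hypothetical `Bot` class with a plain deque as the task queue: the reset should happen when the queued task runs, not at the moment the command is received.

```python
from collections import deque


class Bot:
    def __init__(self):
        self.history = ["old message"]
        self.tasks = deque()

    def reset_command(self):
        # Only queue the work; touching self.history here would wipe it
        # while earlier queued tasks may still need it.
        self.tasks.append(self._do_reset)

    def _do_reset(self):
        self.history = []

    def run_next(self):
        self.tasks.popleft()()


bot = Bot()
bot.reset_command()
history_while_queued = list(bot.history)
bot.run_next()
history_after_task = list(bot.history)
```

History stays intact while the reset is waiting in the queue and is only replaced once the task itself executes.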

#

Also noticed this which is wrong... should be text not text_visible

#

testing if new setting works...

#

eyyy also found a bug in continue() while adding this

#

Pushed fixes / new TTS Greeting setting, to MAIN

#

Was able to use the existing speak_task() function

valid crypt
halcyon quarry
#

Showing that greeting message is making tts

valid crypt
#

thought that was showing the bug

halcyon quarry
#

The bug was that at the end of continue function it was processing any TTS response.
But, it already processed all TTS responses by that point

#

would pretty much just replay the last one

valid crypt
#

nice

halcyon quarry
#

Seeing if I can improve the message chunking detection...

Mainly, to allow \n\n (double linebreak) to be detected and given more weight.
Current code triggers on \n or . before having an opportunity to consider \n\n

#

I had given up on it back then, giving it another go now

halcyon quarry
#

Aaaand I'm giving up again. So complicated

halcyon quarry
#

Unfortunately no - the reply streaming would definitely be of higher caliber but I’ve got so many variables involved already then to make the shorter triggers be temporarily ignored, then factor a match offset, blah blah… can’t seem to make it work right

keen palm
#

Well, I withdraw my previous gif

valid crypt
#

chunking detection would be hard to make it better

#

keeping it random achieves the 80%

#

i dont think that's a high priority if you cant do it easily

#

let me try something...

halcyon quarry
#

What I was aiming for was this kind of logic:

chunk_syntax = ['\n\n', '\n', '.']

if matched_syntax == '\n\n':
  chance_to_chunk = chance * 1.5
elif matched_syntax == '\n':
  chance_to_chunk = chance  # unchanged; a bare `pass` would leave this undefined
elif matched_syntax == '.':
  chance_to_chunk = chance * 0.5

return check_probability(chance_to_chunk)
#

But it is very complicated to try checking for \n\n because \n or . trigger before it can happen.

When it rolls probability for a trigger, I make it ignore that text for future checks.
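
A runnable version of the weighting idea, as a sketch. The `WEIGHTS` table, the cap at 1.0, and the body of `check_probability` are assumptions for illustration; only the multipliers come from the snippet above.

```python
import random

# Multipliers taken from the snippet above; '\n' keeps the base chance.
WEIGHTS = {"\n\n": 1.5, "\n": 1.0, ".": 0.5}


def chance_to_chunk(matched_syntax, chance):
    # Cap at 1.0 so a boosted chance stays a valid probability.
    return min(1.0, chance * WEIGHTS[matched_syntax])


def check_probability(probability, rng=random):
    """Stand-in for the bot's helper of the same name (assumed behaviour)."""
    return rng.random() < probability
```

With a base chance of 0.5 this gives 0.75 for a double newline, 0.5 for a single newline, and 0.25 for a period.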

valid crypt
#

you mean
this and

this?

halcyon quarry
#

Yes.

LLMs like to use double newlines like this

#

Wish I could add more weight to make it split on double newlines, compared to single newlines

valid crypt
#

what does it look like now and what do you want it to look like?

halcyon quarry
#

The third code block it gave actually worked for detection, even though I didn't really understand the logic of its solution.
Then I couldn't get it to actually split at the right place in the text.

valid crypt
#

can you give me an example?

#

of how the result should be and what is it giving

halcyon quarry
#

Hey here's some text that is generating.\n\nNow here's a new paragraph.

Because the tokens being generated are like "ng", ".\n", "\nN", "ow",... the code is not checking via .endswith(syntax) because it could never match \n\n

#

The code is instead checking a 4 character window to see if \n\n is anywhere within it

valid crypt
#

i understand why it is not working now

halcyon quarry
#

hat , t is , is g, gen, etc etc

valid crypt
#

but what do you want it to do

halcyon quarry
#

So eventually it will find either .\n\nN or g.\n\n or \n\nNo

valid crypt
#

like how should the results be

halcyon quarry
#

All of which are true

#

so then it needs to offset the text splitting based on where it matched in the window

#

But I also have 2 other variables tracking the text chunking

#

1 which stores all text that has already been chunked
Another which holds the text it is currently figuring out where to chunk

#

It doesn't matter if it splits before or after "\n\n" because discord clears the white space

#

it's catching the little bits before and after

#

due to the "sliding window" of text it is analyzing
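
One possible way around the offset bookkeeping, sketched here as a different technique from the 4-character window: hold back any run of boundary characters that touches the end of the buffer, since the next token might still turn `\n` into `\n\n`. Once the run is followed by other text, classify it by its strongest delimiter and split right after it. `split_stream` and `classify` are hypothetical names, and the probability roll is left out.

```python
import re

RUN = re.compile(r"[.\n]+")  # a contiguous run of boundary characters


def classify(run):
    """Name the strongest delimiter found inside a boundary run."""
    if "\n\n" in run:
        return "\n\n"
    if "\n" in run:
        return "\n"
    return "."


def split_stream(tokens):
    """Yield (delimiter, chunk) pairs from a stream of text tokens."""
    pending = ""
    for token in tokens:
        pending += token
        while True:
            m = RUN.search(pending)
            # Wait if there is no run yet, or if the run touches the end
            # of the buffer: the next token might extend it.
            if m is None or m.end() == len(pending):
                break
            yield classify(m.group()), pending[:m.end()]
            pending = pending[m.end():]
    if pending:
        m = RUN.search(pending)
        yield (classify(m.group()) if m else None), pending


tokens = ["Hey here's some text that is generati", "ng", ".\n", "\nN",
          "ow here's a new paragraph."]
chunks = list(split_stream(tokens))
```

With the example tokens, the first chunk ends right after the double newline and is labelled `\n\n`, so it could be given the heavier weight. This sketch treats every period as a boundary, so ellipses and decimals would need extra handling.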

valid crypt
#

is this what you want it to do?
2linebreak| --> 2linebreak(message 1)
| 2linebreak(message 2)
2linebreak|