ad_discordbot (Fork of Fork of xNul's bot) | Text Generation WebUI | Page 19

halcyon quarry Oct 9, 2024, 3:48 PM

#

but obviously it would sound a bit jarring for it to generate each word separately

halcyon quarry Oct 9, 2024, 4:04 PM

#

Do you do this with other software?

visual dagger Oct 9, 2024, 4:25 PM

#

halcyon quarry Do you do this with other software?

I didn't implement a tts mechanism for me yet

#

I'm still theorarising of how to approach this

#

the poblem is I will need both the llm and the tts model working at the same time for real time

#

VRAM issue

#

maybe 2 gpus will help? but I don't hsve 2 gpus 😭

#

I might get a used cheap one

halcyon quarry Oct 9, 2024, 4:27 PM

#

right well this is something I certainly couldn't solve 😛

visual dagger Oct 9, 2024, 4:28 PM

#

the theory is there, it's possible to make it real time

#

it has been done before on yt, fully locally

#

but there is a hardware problem + software problem

#

both should be optimised for real time use

#

somehow

#

and I remember @terse folio did an experiment where you can cut off the AI, the script reality made adjusts the end of the chat history a bit to make it clear to the llm that it got cut off

#

LLM: so what I was saying is-
Me: No no shut up

someth like this, but there is more to it

halcyon quarry Oct 9, 2024, 4:31 PM

#

In the works

#

But when, unsure 😛

visual dagger Oct 9, 2024, 4:38 PM

#

halcyon quarry In the works

AGI .gguf when??

valid crypt Oct 9, 2024, 5:02 PM

#

i dont think realtime tts is possible, by sentence would be the maximun
this is really simple to prove, to be more acerated, bring 5 person and make them say a word with no context, or you can think how to say the word with no context

#

im sure that it is impossible to sound natural(continuing?)

#

would sound like cutting

valid crypt Oct 9, 2024, 5:05 PM

#

visual dagger ``` LLM: so what I was saying is- Me: No no shut up ``` someth like this, but t...

the interrupt feature is cool

#

and for stt is almost a must

visual dagger Oct 9, 2024, 5:20 PM

#

valid crypt i dont think realtime tts is possible, by sentence would be the maximun this is ...

why not possible?

#

if a sentence is 10 tokens, and you getting at least 10 tokens per second, then the voice will start almost instantly with 1s delay
(talking abt the LLM generation)

#

or 2s delay adding the tts process too

#

1-2s delay between you submitting the text prompt and the voice playing

halcyon quarry Oct 9, 2024, 5:23 PM

#

I think Marcos just means the whole generating one word at a time, the words not flowing well, etc

visual dagger Oct 9, 2024, 5:23 PM

#

ah yes I get it

#

that wont be nice yeah

halcyon quarry Oct 9, 2024, 5:24 PM

#

There would need to be a lot of research into it like the way they are able to get some degree of reasonable consistency in video generation now

visual dagger Oct 9, 2024, 5:24 PM

#

but a whole sentence will be possible

#

if the quality is bad then we should find a better tts

halcyon quarry Oct 9, 2024, 5:24 PM

#

previous generation would need to guide future

visual dagger Oct 9, 2024, 5:24 PM

#

inconsistency

visual dagger Oct 9, 2024, 5:25 PM

#

halcyon quarry previous generation would need to guide future

voice2voice?

valid crypt Oct 9, 2024, 5:25 PM

#

visual dagger but a whole sentence will be possible

thats is the streaming tts feature

visual dagger Oct 9, 2024, 5:25 PM

#

make sense, like taking the last 0.5 second as a starting point for the next generation

visual dagger Oct 9, 2024, 5:25 PM

#

visual dagger make sense, like taking the last 0.5 second as a starting point for the next gen...

to make the voice consistent

halcyon quarry Oct 9, 2024, 5:25 PM

#

No I mean the models and implementation etc would need to be some new tech that does not exist yet

#

Where it looks back a bit or knows a bit what's coming in order to still generate one word at a time but not sound like random trash

visual dagger Oct 9, 2024, 5:26 PM

#

real time doesn't have to be generating speech word for word

#

if the llm is capable of 10 t/s or more then the tts model can crunch a whole sentence

#

and while the user is listening to the first sentence, the second sentence will be ready already

#

and will be played next automatically

#

queuing

halcyon quarry Oct 9, 2024, 5:28 PM

#

uh huh, OH! Like the bot already does?

visual dagger Oct 9, 2024, 5:28 PM

#

between speeches (sentences) there will be no delays

#

the only delay is the first time

#

1s or 2s

halcyon quarry Oct 9, 2024, 5:28 PM

#

Audio is queued as it generates

visual dagger Oct 9, 2024, 5:29 PM

#

halcyon quarry Audio is queued as it generates

yup yup

halcyon quarry Oct 9, 2024, 5:29 PM

#

It does not pause and wait for a sentence to be spoken before generating the next sentence

visual dagger Oct 9, 2024, 5:29 PM

#

so you already implemented that?

halcyon quarry Oct 9, 2024, 5:29 PM

#

As soon as the TTS is generated, it simultaneously plays it while generating the next text and subsequentyly more TTS

#

which can be finished by the time the sentence is spoken

#

yes

visual dagger Oct 9, 2024, 5:29 PM

#

what a champion

halcyon quarry Oct 9, 2024, 5:30 PM

#

If you can get the text and TTS fast enough it will stream it nicely

valid crypt Oct 9, 2024, 5:30 PM

#

visual dagger so you already implemented that?

thats why i understood word by word, bcs buddy already mentioned the feature

#

here

visual dagger Oct 9, 2024, 5:31 PM

#

halcyon quarry If you can get the text and TTS fast enough it will stream it nicely

I did some testing on my own, I made something like that a week ago, and it was playing all audio chunks at the same time which is not what I want bcz it gets too noisy etc.. so I figured I should make a queue mechanism and behold it worked, the only delay was the first secs

#

.
but we are greedy, so how can we reduce that first delay to almost zero? any ideas?

valid crypt Oct 9, 2024, 5:34 PM

#

i suggested to set the first split trigger % to 100%

halcyon quarry Oct 9, 2024, 5:35 PM

#

valid crypt Oct 9, 2024, 5:36 PM

#

visual dagger . but we are greedy, so how can we reduce that first delay to almost zero? any i...

there's no much thing to do, its already splitting the response and generate by chunks, if you want it to be faster, you need better hardware

halcyon quarry Oct 9, 2024, 5:36 PM

#

valid crypt i suggested to set the first split trigger % to 100%

This is sensible, might add an option for this

visual dagger Oct 9, 2024, 5:37 PM

#

valid crypt there's no much thing to do, its already splitting the response and generate by ...

there might be, some string manupilation? or something else

#

there has to be a way, there is always a way

halcyon quarry Oct 9, 2024, 5:38 PM

#

well what he proposes is string manipulation 😛

#

We are manipulating string

visual dagger Oct 9, 2024, 5:38 PM

#

but did it reduce the initial delay?

halcyon quarry Oct 9, 2024, 5:38 PM

#

Currently sentences are split via chance_to_stream

#

so it will roll dice and split or not split

#

Marcos said it could be a good idea to make it guaranteed to split on the first sentence completion or whatever, then use the normal logic to roll random on other factors

visual dagger Oct 9, 2024, 5:39 PM

#

does it play the very very first ms once it's generated?

halcyon quarry Oct 9, 2024, 5:40 PM

#

I didn't explicitly time it but probably?

visual dagger Oct 9, 2024, 5:40 PM

#

halcyon quarry Marcos said it could be a good idea to make it guaranteed to split on the first ...

so a special method for onky the first sentence?

#

different than the rest of the reply

halcyon quarry Oct 9, 2024, 5:40 PM

#

right

#

I could do it right now and it'll be done in 10 mins really

visual dagger Oct 9, 2024, 5:41 PM

#

and how much the first delay will be?

halcyon quarry Oct 9, 2024, 5:41 PM

#

nothing complicated about treating the first "chance to stream" differently from the rest

#

however long it takes to generate that text + the TTS

#

1 sentence

visual dagger Oct 9, 2024, 5:42 PM

#

a human delay to first token (lol) is about 0.5-1 secs

halcyon quarry Oct 9, 2024, 5:42 PM

#

you'll be welcome to configure it to send and generate each word separately

#

will just sound like dogshit, but user preference is fine 🙂

visual dagger Oct 9, 2024, 5:43 PM

#

word by word isnt gonna cut ut

#

it

#

the goal is to reduce the first delay without affecting the quality

#

or affecting it slightly

halcyon quarry Oct 9, 2024, 5:44 PM

#

yes, please come back and let us know if and when you solve this problem

visual dagger Oct 9, 2024, 5:44 PM

#

I have an idea

#

generating 100 of starter phrases and saving them to disk, and always using those starters (text) and also the speech therefore real time voice

#

you know the "start with" concept in ooba?

#

those 100 voice starters will be pregenerated and saved for future use

#

you can allow the user to provide a list of starters and click a button "Generate all & save to disk"

#

then give the user those options/boxes to tick

Use saved voice starters
Use saved text starters

halcyon quarry Oct 9, 2024, 5:48 PM

#

Well, adding a play_audio tag parameter could be a good start

#

I could implemented the same logic I have for send_user_image which can accept either a direct image file, or a folder to randomly choose from

visual dagger Oct 9, 2024, 5:49 PM

#

oh and also cache first sentences

#

so in the future when the first sentence matches an already saved first senetence then play it right awya

#

caching is cool to reduce latency

halcyon quarry Oct 9, 2024, 5:50 PM

#

That's getting a bit too niche

#

for the bot at least

visual dagger Oct 9, 2024, 5:51 PM

#

halcyon quarry That's getting a bit too niche

I've been battling llm repetition for the last year or so

#

believe me it's not

#

alot of models at some point will output the exact same first sentence

halcyon quarry Oct 9, 2024, 5:52 PM

#

I do already have a tag parameter begin_reply_with - I imagine it will not actually generate that text

#

So you are already welcome to proactively prefix the LLM's reply with a specific string

visual dagger Oct 9, 2024, 5:53 PM

#

halcyon quarry I could implemented the same logic I have for `send_user_image` which can accept...

yeah you can just copy paste that function and editing a bit

#

just a food for thought

halcyon quarry Oct 9, 2024, 5:54 PM

#

I'll think about it

visual dagger Oct 9, 2024, 5:54 PM

#

you can target the most repetitve sentences ever

#

oh wait the user have to

#

but if you're the user, lol, then you just generate the most probable first sentences

halcyon quarry Oct 9, 2024, 5:55 PM

#

in the meantime, a good way to battle repetition could be to add some preconfigured randomness to your prompts, in the background

visual dagger Oct 9, 2024, 5:55 PM

#

I tried man

#

nope

#

some models are very stubborn, no matter what you do they insist on repeating

halcyon quarry Oct 9, 2024, 5:56 PM

#

I also have the llm_param_variances tag where you can preconfigure ranges for different parameters, and each generation it will randomly select values within those ranges

visual dagger Oct 9, 2024, 5:56 PM

#

visual dagger some models are very stubborn, no matter what you do they insist on repeating

llama and mistral models are one of them

halcyon quarry Oct 9, 2024, 5:57 PM

#

There's a lot of tools in here that should be able to put prompting and generations through the blender

visual dagger Oct 9, 2024, 5:57 PM

#

halcyon quarry I also have the `llm_param_variances` tag where you can preconfigure ranges for ...

it doesnt cause a whole history reevaluation?

halcyon quarry Oct 9, 2024, 5:57 PM

#

I don't think so

#

But you will get a whole history reevaluation if someone writes something in a different channel ("per-channel history" setting)

visual dagger Oct 9, 2024, 5:58 PM

#

btw a way to battle inconcistency in voice is adding a little bg noise, like winds or whatever the user chooses, the human brain can't tell the difference

halcyon quarry Oct 9, 2024, 5:58 PM

#

that's kind of a funny idea

visual dagger Oct 9, 2024, 5:58 PM

#

bcz the winds are consistent

#

yeah I tried it it works

halcyon quarry Oct 9, 2024, 5:59 PM

#

there's probably some functions to mix audio together and split on the original length of one

visual dagger Oct 9, 2024, 5:59 PM

#

halcyon quarry there's probably some functions to mix audio together and split on the original ...

ffmpeg!!!

halcyon quarry Oct 9, 2024, 5:59 PM

#

or loop one to the length of the other, then mix and split

visual dagger Oct 9, 2024, 5:59 PM

#

visual dagger ffmpeg!!!

is faster than python libs

halcyon quarry Oct 9, 2024, 6:00 PM

#

LLM says "I'm at the beach" and it mixes in the sound of waves crashing and seagulls cawing

visual dagger Oct 9, 2024, 6:00 PM

#

halcyon quarry or loop one to the length of the other, then mix and split

or yk just play the winds on loop forever once the webui is accessed

halcyon quarry Oct 9, 2024, 6:00 PM

#

I like it 😛

visual dagger Oct 9, 2024, 6:00 PM

#

visual dagger or yk just play the winds on loop forever once the webui is accessed

I think alot of people will wnat that

#

like a game

#

no need to merge audios

halcyon quarry Oct 9, 2024, 6:01 PM

#

oh right

visual dagger Oct 9, 2024, 6:01 PM

#

halcyon quarry LLM says "I'm at the beach" and it mixes in the sound of waves crashing and seag...

we're getting to silly tavern territory, but seriously the amount of features they have is insane

#

alot of effort went to it

halcyon quarry Oct 9, 2024, 6:02 PM

#

Never used it

visual dagger Oct 9, 2024, 6:02 PM

#

me too, but just looking from outside it is very cool

halcyon quarry Oct 9, 2024, 6:03 PM

#

u trying out my LLM streaming feature?

visual dagger Oct 9, 2024, 6:03 PM

#

ooba doesnt work for me since I broke it like earlier

#

conda issues

halcyon quarry Oct 9, 2024, 6:04 PM

#

fix it 😛

visual dagger Oct 9, 2024, 6:04 PM

#

I tried alot to make it work

#

I'm now a free man

#

my webui is any ui

#

I use llama.cpp and koboldcpp nowadays

halcyon quarry Oct 9, 2024, 6:05 PM

#

You're here looking for an excuse to fix tgwui

visual dagger Oct 9, 2024, 6:05 PM

#

and alot of scripting with python

#

I tried again and again but it just doesnt install

#

tried to fix those conda env issues but still no hope

#

screw it I will make my own webui

halcyon quarry Oct 9, 2024, 6:07 PM

#

Just run the 1 click installer and be done with it

visual dagger Oct 9, 2024, 6:07 PM

#

brother

#

do you think I didnt?

halcyon quarry Oct 9, 2024, 6:09 PM

#

it installs and runs from its own miniconda

visual dagger Oct 9, 2024, 6:09 PM

#

I know

#

it didnt work

#

I spent hours trying to fiz ereos

halcyon quarry Oct 9, 2024, 6:10 PM

#

Idk how this can go wrong for anyone

visual dagger Oct 9, 2024, 6:10 PM

#

fix errors*

#

you want to make a game like tab?

#

I can contribute a bit, but can't promise doing the whole thing

halcyon quarry Oct 9, 2024, 6:11 PM

#

wdym

visual dagger Oct 9, 2024, 6:11 PM

#

some sort of a game interface

#

it will be beside the chat tab

halcyon quarry Oct 9, 2024, 6:12 PM

#

The bot needs a settings interface but settings interfaces are a big PITA to code

visual dagger Oct 9, 2024, 6:12 PM

#

like with visuals

halcyon quarry Oct 9, 2024, 6:12 PM

#

would do that before thinking of a game interface

visual dagger Oct 9, 2024, 6:13 PM

#

we can make something from scratch

halcyon quarry Oct 9, 2024, 6:13 PM

#

not interested 🙂

visual dagger Oct 9, 2024, 6:14 PM

#

https://tenor.com/view/rejected-stamp-gif-12255531

Tenor

#

dont upvote it

#

: /

valid crypt Oct 9, 2024, 6:32 PM

#

visual dagger then give the user those options/boxes to tick - Use saved voice starters - Use...

i think i suggested similar ideas, i think it was prepare a bunch of voices or sounds that's always normal the begin a reply with, like umm, uhhh, ehhh, i think..., i... and randomly choose one and play it at the beginning, pretty much with 2s we get enough time to process the rest, 1s of audio 1s of silence feels natural

visual dagger Oct 9, 2024, 6:35 PM

#

valid crypt i think i suggested similar ideas, i think it was prepare a bunch of voices or s...

you tried it?

valid crypt Oct 9, 2024, 6:36 PM

#

just suggested

#

:p

halcyon quarry Oct 9, 2024, 6:42 PM

#

I'd think something more like "Let me think about that..." Or "Well, let's see..." or "hmmmmmm...."

#

Unless its a frat bro LLM char "Yo like, uhhh..."

valid crypt Oct 9, 2024, 7:15 PM

#

just for inspiration, should be customizable though

halcyon quarry Oct 10, 2024, 1:49 AM

#

I'm going to fix that progress bar tomorrow, no matter what

#

the image gen progress bar that likes to vanish

terse folio Oct 10, 2024, 8:06 AM

#

It's okay to have a second of silence,
People need time to think.

While people talk to us in voice, we are building up an output response in our minds, just not speaking it yet.

That's why it seems like humans can respond so quickly, because they have been thinking about it during the whole sentence.

if the llm is fast enough, you could do this all quickly at the end.
if not, it may be better to generate a little big mid sentence so the llm is more prepared for when you're done speaking and can get to tts immediately

I need to rebuild that system

halcyon quarry Oct 10, 2024, 1:12 PM

#

I figured out the reason why the progress bar poofs. Got it on the first guess really... should've debugged it sooner

halcyon quarry Oct 10, 2024, 1:36 PM

#

apparently the API call can report measurable "progress" before it will return a positive "job count"

terse folio Oct 10, 2024, 1:37 PM

#

that's interesting, hmm

#

Does it include the current job count in the progress data?

halcyon quarry Oct 10, 2024, 1:37 PM

#

I fixed it though

#

Now I just ignore job count - if it returns progress data it keeps going

#

I'll see if I can improve it a bit more such as updating the message if it stalls, etc

terse folio Oct 10, 2024, 1:41 PM

#

yea, some sort of timeout for how long it takes for progress to change.
but if a model is being offloaded to ram, it could take too long to make meaningful progress

halcyon quarry Oct 10, 2024, 3:05 PM

#

Pushed fixed and improved SD Img Gen Embed

#

If I can figure out how to effectively use the "Cancel" api endpoint I may add a Cancel button to the embed

terse folio Oct 10, 2024, 3:09 PM

#

going to also want to make sure the button is disabled on completion so users cant mess with others' generations

halcyon quarry Oct 10, 2024, 3:10 PM

#

ah yes... well it could check if the user is the original user. Or I could make it separate and ephemeral

terse folio Oct 10, 2024, 3:10 PM

#

I mean, even if it is the original user,
A malicious user could do an image request then cancel it after completion

#

because SD doesn't know what job you are canceling, it just knows to stop everything

#

similar to tgwui

halcyon quarry Oct 10, 2024, 3:11 PM

#

Not quite

#

The cancel endpoint seems to want a specific event ID

terse folio Oct 10, 2024, 3:11 PM

#

oh nice!

halcyon quarry Oct 10, 2024, 3:11 PM

#

which I believe will be present in the progress data...

terse folio Oct 10, 2024, 3:13 PM

#

okay, that's pretty good ^-^

valid crypt Oct 10, 2024, 5:02 PM

#

^-^

halcyon quarry Oct 11, 2024, 2:52 AM

#

Just made a very nice commit for SD Forge API handling

#

https://github.com/lllyasviel/stable-diffusion-webui-forge/pull/2027

GitHub

API Improvements: Modules Change AND Restore override_settings by a...

Improve API Modules Change:
Previously, the complete file path was required for each module specified in the 'forge_additional_modules' list.
'forge_additional_modules&#...

#

Forge API users will rejoice

halcyon quarry Oct 11, 2024, 11:35 AM

#

Mainly Illyasviel had removed the functionality of “override_settings” whilst revamping the code for Flux support, but never plugged it back in

valid crypt Oct 11, 2024, 2:55 PM

#

forge can use flux?

halcyon quarry Oct 11, 2024, 3:09 PM

#

Yep!

#

And I can take a fair share of credit for it working via API

valid crypt Oct 11, 2024, 4:32 PM

#

🤯

halcyon quarry Oct 11, 2024, 4:38 PM

#

There is example code in more recent versions of the bot (dict_imgmodels, basesettings) showing Forge specific values that can be used for managing Flux models

halcyon quarry Oct 11, 2024, 11:20 PM

#

Forge got Flux, then I waited… and no one fixed the api related code… so took a stab at it and made it happen (a few times now)

valid crypt Oct 12, 2024, 8:06 AM

#

:v

halcyon quarry Oct 12, 2024, 7:56 PM

#

I’m going to add 2 new tag params:

name - will be used in cmd print statements, as well as for…
if_tags_matched - a new condition, a list of tag names. If any tags with a name are matched the condition will be true

halcyon quarry Oct 12, 2024, 10:07 PM

#

For the latter, the main benefit will be if you have a crap ton of triggers for a tag, but you want another tag to trigger on those triggers but maybe another condition… or just want 2 separate tags with same triggers, can just use name trigger

tepid needle Oct 13, 2024, 6:24 AM

#

Quick question
Can the bot connect to a comfyui server?

terse folio Oct 13, 2024, 6:39 AM

#

tepid needle Quick question Can the bot connect to a comfyui server?

#1154970156108365944 message
I think that or something similar is on the todo list

tepid needle Oct 13, 2024, 6:39 AM

#

Noted
Thanks for the quick response

halcyon quarry Oct 13, 2024, 12:12 PM

#

tepid needle Quick question Can the bot connect to a comfyui server?

That is literally all it can do right now, but I didn’t code anything in to actually use it yet 🤣

#

Er, I’m working to make it work with SwarmUI which is a frontend for Comfy

#

Might also just work for Comfy directly

#

My PR at Forge got merged so override_settings now works for Forge again

halcyon quarry Oct 13, 2024, 2:28 PM

#

Pushed new setting to define imgmodel filters per-server

#

Enhances usefulness of the per_server_imgmodels setting

#

#

I'll likely add another code block for per_channel_filters because why not

#

Not necessarily to split SFW / NSFW bot uses, but can help with only enabling relevant models, like if you have a server dedicated to cartoons and such can keep realistic modls out, etc

valid crypt Oct 13, 2024, 3:04 PM

#

me while changing settings

#

https://tenor.com/view/velma-glasses-cant-find-my-glasses-scooby-doo-gif-11995608

Tenor

tepid needle Oct 13, 2024, 8:53 PM

#

im loving the discord bot API

#

took me a bit to set up the response formatting and configuration and figuring out the yaml settings but after that it works like a charm

#

im using a llama 3.1 70B exl2 4.0 model and image generation alongside it
fits snugly within my specs

#

the tag system is very fun

valid crypt Oct 13, 2024, 9:48 PM

#

:) leave him a star

halcyon quarry Oct 14, 2024, 1:46 AM

#

@tepid needle thanks! Glad you're enjoying it. Open to any feedback or suggestions

#

The tags system was this eureka idea I had to wrap up most of my existing features into one package, with easy expansion/enhancement/all that. I was in way over my head when I got into coding it, and its one of those things I'll always be proud of.

tepid needle Oct 14, 2024, 1:51 AM

#

the tag system works wonderfully with image generation

#

#

minty swaps in and translates their prompts into something usable for image generation

#

really, really good
makes it intuitive and easy for the user who doesnt know how to work with image generation

#

the only thing I wished was that image generation swapped to a different model depending on a tag

halcyon quarry Oct 14, 2024, 1:53 AM

#

Well that can be done

tepid needle Oct 14, 2024, 1:53 AM

#

Huh, then I did something wrong then cause I tested it out using selfie with a character

halcyon quarry Oct 14, 2024, 1:54 AM

#

Coincidentally, earlier today I noticed my comment for that param was not correct... well, it was correct at one point

#

The value for the swap_imgmodel or change_imgmodel tag param should be the "model name"

#

The model names can be fetched from the sd-models endpoint

#

its basically the filename except any subdirectories willbe prefixed before it

tepid needle Oct 14, 2024, 1:56 AM

#

so to use an example
lets say I want to swap over to the 'juggernautXL_juggXIByRundiffusion.safetensors [33e58e8668]'

would I type it like this in the character file?

swap_imgmodel: 'juggernautXL_juggXIByRundiffusion.safetensors [33e58e8668]'

halcyon quarry Oct 14, 2024, 1:56 AM

#

Such as /sdxl/leosamsHelloworldXL_helloworldXL70 - the model name is:
sdxl_leosamsHelloworldXL_helloworldXL70

tepid needle Oct 14, 2024, 1:56 AM

#

ah

#

I see

#

so thats the proper formatting

halcyon quarry Oct 14, 2024, 1:57 AM

#

I'll fix that comment now

#

swap_imgmodel: juggernautXL_juggXIByRundiffusion would be correct unless its in a subdir

tepid needle Oct 14, 2024, 1:58 AM

#

theyre not, this is the file structure
\stable-diffusion-webui-reForge\models\Stable-diffusion

halcyon quarry Oct 14, 2024, 1:58 AM

#

yep, then as I said should work

tepid needle Oct 14, 2024, 1:58 AM

#

got it, I'll try that out later

#

Greatly appreciate the help

halcyon quarry Oct 14, 2024, 1:59 AM

#

appologies for the confusion

tepid needle Oct 14, 2024, 2:00 AM

#

man, I cant wait to get that settled in
now on key words I can swap to a specific model with the proper Lora to match an aesthetic

#

fun stuff

halcyon quarry Oct 14, 2024, 2:01 AM

#

Yep! Lots of fun stuff to do with it

#

What amazes me is that when I promote this no one even looks lol

#

Baffling really, but I'm plugging away all the same

tepid needle Oct 14, 2024, 2:04 AM

#

thats crazy
the tag system makes it really intuitive for a user who has no idea how prompting works to get what they want

#

i can just throw the bot into a server and not be pinged to get a specific prompt for an image

#

similar prompt but from two different users using different keywords
the tags activate and the proper Loras are used

halcyon quarry Oct 14, 2024, 2:06 AM

#

So looking at my code - it looks like it should actually work either way

#

whether using juggernautXL_juggXIByRundiffusion.safetensors [33e58e8668] or juggernautXL_juggXIByRundiffusion

tepid needle Oct 14, 2024, 2:06 AM

#

got it

halcyon quarry Oct 14, 2024, 2:07 AM

#

Noice

#

In Forge... my instructions will fail though because for whatever reason, the UI does not show the [HASH] if it was calculated

#

You're using reforge though, that value from the UI should match the value in the API call

tepid needle Oct 14, 2024, 2:10 AM

#

im updating my repos before testing this out

halcyon quarry Oct 14, 2024, 2:12 AM

#

It tries matching these values marked in the left

#

here in Forge the damn dropdown values dont match anything

#

I guess I could also check if filename.endswith(value)

#

Another thing to consider, is trying to come up with your own prompting characters - using the same approach as the M1nty-SDXL character

#

just need to provide different example responses

tepid needle Oct 14, 2024, 2:16 AM

#

fuck yeah it worked

#

it didnt change back to the previous model but it did swap

halcyon quarry Oct 14, 2024, 2:18 AM

#

I'm not going to dive in but I believe that it is now just prepared to load that model again on the next request

#

If not - then I have something to fix 🔧

#

Did you use the swap_imgmodel param?

tepid needle Oct 14, 2024, 2:19 AM

#

i did generate an image after it

#

yep

#

halcyon quarry Oct 14, 2024, 2:19 AM

#

I'll look into it tomorrow

#

outta time 😛

tepid needle Oct 14, 2024, 2:20 AM

#

ditto
gotta get ready for work tomorrow

#

but with this i can use the tag system to swap to the appropriate model

halcyon quarry Oct 14, 2024, 2:42 AM

#

Some more cool image stuff to check out:

Flows tag is very powerful once you get it, check out the few examples I provided. For example, you can gen an image then use it as input to gen a second image, such as img2img and/or as a controlnet input.
I have an explanation in the “tips” folder on how to make some advanced workflows… the bot has a very elaborate method of choosing random images from nested directories. If you have a number of different image inputs in the same tag (img2img, controlnet inputs, reactor input, etc), all set to randomly select, it will try finding the others in the same directory

#

Such as, if I have a number of directories of “person posing with product” - and each directory has a package of the same inputs except they are different poses - the bot is able to basically pick one of those directories and apply all the matching inputs

tepid needle Oct 14, 2024, 2:45 AM

#

oooh, I see

halcyon quarry Oct 14, 2024, 2:46 AM

#

So I could have someone pose with the product in all different ways, then make some inpainting masks / etc for each one

tepid needle Oct 14, 2024, 2:48 AM

#

ill first need to figure out these other parts of stable diffusion because i started two days ago reading up on stuff

halcyon quarry Oct 14, 2024, 2:48 AM

#

There’s a lot to digest!

tepid needle Oct 14, 2024, 2:48 AM

#

what I would like is to use the upscaler for higher quality images when specified by the user

#

which im assuming can be done with the flows tag

halcyon quarry Oct 14, 2024, 2:49 AM

#

I’d recommend using the Flows tag for that - and use a series of incremental scale ups

#

Exactly

tepid needle Oct 14, 2024, 2:50 AM

#

i have something similar in comfyui so I get the idea

halcyon quarry Oct 14, 2024, 2:50 AM

#

There’s a cool extension you can get called Loopback Scaler

#

You’ll find it in extensions list

tepid needle Oct 14, 2024, 2:50 AM

#

found it

halcyon quarry Oct 14, 2024, 2:51 AM

#

Basically you’d do something like double your resolutions, then use 4+ steps with a medium-low denoise

#

I believe I added a custom payload param in the bot, “scale”

#

(Might be tag param)

tepid needle Oct 14, 2024, 2:54 AM

#

found it

#

gonna set that to 2 for double the resolution

halcyon quarry Oct 14, 2024, 2:54 AM

#

Well with Flows tag you’d wanna do something like
Step 1: scale 1.3
Step 2: scale 1.2
Scale 1.1

#

Or similar

#

Note that it will round dimensions to 64px precision

#

I may add a tag to adjust rounding precision

tepid needle Oct 14, 2024, 3:31 AM

#

interesting tool

halcyon quarry Oct 14, 2024, 3:39 AM

#

I got some great results using the xinsir Tile controlnet model with that - along with Reference controlnet

#

This allows a high denoise value

#

While keeping very close composition

tepid needle Oct 14, 2024, 3:44 AM

#

im getting the hang of it

halcyon quarry Oct 14, 2024, 4:04 AM

#

Definitely my favorite way to upscale

halcyon quarry Oct 14, 2024, 12:03 PM

#

You should consider switching over to Forge from ReForge

tepid needle Oct 15, 2024, 1:35 AM

#

What are the benefits?

halcyon quarry Oct 15, 2024, 2:29 AM

#

tepid needle What are the benefits?

Mainly, Forge supports Flux

#

The model loading/memory management may be marginally better than ReForge as well

#

I have a 4070ti (12gb vram) and quantized versions of Flux generate about the same timeframe as XL

tepid needle Oct 15, 2024, 2:54 AM

#

oh really

#

ill give it a shot then

valid crypt Oct 15, 2024, 6:55 PM

#

visual dagger a human delay to first token (lol) is about 0.5-1 secs

i got the answer, Doherty Threshold, the time must be <400ms and you can lengthen this time by giving any response/feedback 😄

visual dagger Oct 15, 2024, 9:14 PM

#

valid crypt i got the answer, Doherty Threshold, the time must be <400ms and you can lengthe...

what's that I didnt understand wdym

valid crypt Oct 15, 2024, 9:15 PM

#

search Doherty Threshold

visual dagger Oct 15, 2024, 9:17 PM

#

I did

#

"addicted to the website"

#

I like that, lol

valid crypt Oct 15, 2024, 9:17 PM

#

wut

#

ill search for you :)

visual dagger Oct 15, 2024, 9:17 PM

#

https://www.youtube.com/watch?v=Lef9blQ2cGk

YouTube

Anushka Bhagchandani

Doherty Threshold - Laws of UX

Link to presentation: https://docs.google.com/presentation/d/1fX4eLFmeTWG_rX0_3RQG57ZDmtqwuplcfR4hv7HLnio/edit?usp=sharing

Link to Medium article: https://medium.com/@anushkasb/19-laws-of-ux-doherty-threshold-d9759a874119

▶ Play video

visual dagger Oct 15, 2024, 9:19 PM

#

valid crypt i got the answer, Doherty Threshold, the time must be <400ms and you can lengthe...

so how would you go about implementing that, exactly?

valid crypt Oct 15, 2024, 9:19 PM

#

it is just a theory

#

like a goal

visual dagger Oct 15, 2024, 9:19 PM

#

a game theory?

#

jk

valid crypt Oct 15, 2024, 9:20 PM

#

the goal is to make it <400ms

visual dagger Oct 15, 2024, 9:20 PM

#

hums?

#

hmmmmmmm...

#

voice humming? or sound effects

valid crypt Oct 15, 2024, 9:21 PM

#

to lengthen the time could be hmmm

visual dagger Oct 15, 2024, 9:21 PM

#

premade sounds that can be played right away, while the actual generation is being baked in the backend

valid crypt Oct 15, 2024, 9:23 PM

#

Doherty Threshold only says that under 400ms we dont feel the waiting and by giving any kind of feedback can lengthen those 400ms

visual dagger Oct 15, 2024, 9:23 PM

#

you mean we got like 0.4 secs for free?

#

I mean if the actual generation is late a bit it's alright because we can play a premade sound even after 0.4 secs

valid crypt Oct 15, 2024, 9:24 PM

#

this tells us that we have to process the first sentece+tts in less than 800ms in worst scenarios

visual dagger Oct 15, 2024, 9:24 PM

#

possible right?

#

I mean if the humming is long enough

valid crypt Oct 15, 2024, 9:25 PM

#

visual dagger I mean if the actual generation is late a bit it's alright because we can play a...

yes, if we dont get the response in 0.4 we can play a sound to get another 0.4

visual dagger Oct 15, 2024, 9:25 PM

#

or whatever premade sounds we choose

#

if they are long engough

valid crypt Oct 15, 2024, 9:25 PM

#

if humming is 1s which i think is alright in total we have 1.8s

visual dagger Oct 15, 2024, 9:27 PM

#

1.8s of free time to generate (speech) the actual llm response

#

seems cool

#

you just add the humming or whatever default starters we choose to the llm response before hitting start

#

so in the end the llm response will make sense bcz the voice and the llm response are matched perfectly

valid crypt Oct 15, 2024, 9:31 PM

#

🤗

#

demanding a toggle to trigger tts for the first split :V

visual dagger Oct 15, 2024, 9:35 PM

#

you will choose a random starter to play?

#

from the list?

#

Example list

hmm...
hmm.. {user}
{user}
I think...
So...
You know what?
I... hmm..
umm...

#

starters/premade starter voices

valid crypt Oct 15, 2024, 9:38 PM

#

valid crypt demanding a toggle to trigger tts for the first split :V

this is a different approach to make the bot better, as setting the aplit to 100% you get bombed by discord, to a lower % it a matter of luck

valid crypt Oct 15, 2024, 9:39 PM

#

visual dagger Example list ``` hmm... hmm.. {user} {user} I think... So... You know what? I......

that looks good

visual dagger Oct 15, 2024, 9:40 PM

#

valid crypt this is a different approach to make the bot better, as setting the aplit to 100...

wdym set the split to 100%?

valid crypt Oct 15, 2024, 9:40 PM

#

have you tried the streaming tts feature?

halcyon quarry Oct 15, 2024, 9:40 PM

#

This guy doesn’t even use text generation web UI

visual dagger Oct 15, 2024, 9:41 PM

#

ooba doesnt want me

#

: /

#

ooba hates me

halcyon quarry Oct 15, 2024, 9:41 PM

#

Yeah I’ll add the parameter to trigger the first TTS split I’ve just been busy

#

I’ve been pushing some pretty significant Paul requests to forge the past few days

visual dagger Oct 15, 2024, 9:42 PM

#

valid crypt that looks good

you can allow the users to provide a custom list, then click a button "Generate Starters", all those starters will be voiced and cached for the future

#

just ideas for you guys

valid crypt Oct 15, 2024, 9:46 PM

#

visual dagger Example list ``` hmm... hmm.. {user} {user} I think... So... You know what? I......

after thinking this idea, if we set start reply with "Hmmm"
we start a reply, play a preprocessed "Hmmm", after generating the reply it will have another "Hmmm", so in total we will hear 2 "Hmmm"

visual dagger Oct 15, 2024, 9:49 PM

#

sometimes it's better to control the users behavior than the code

#

so add a note "Preferebly write long starters for good performance"

#

a note under the feature

#

longer starters will give you more headroom, more secs, more time to work with

halcyon quarry Oct 18, 2024, 4:56 PM

#

Sometimes I'll get an error message when it tries saving History to file.
This was reported before.

Pushed an update that resolves this

valid crypt Oct 18, 2024, 6:24 PM

#

idk if you were referring the problem i encountered before but indeed there was a problem with the history file

halcyon quarry Oct 18, 2024, 6:25 PM

#

Probably it

halcyon quarry Oct 20, 2024, 12:59 AM

#

Pushed an update

Tags can now have a "name" parameter
For the moment, this doesn't do much, only:
- prints the name for some tag logging
- Is now required for tags which include a "persist" param.

#

Coming soon, new condition which will be True if the value matches the name of a matched tag

#

#

Using 'name' to log/check 'persist' tags sits much nicer with me, than what I was having to do in order to capture the tag value and compare it for equality

terse folio Oct 20, 2024, 1:47 AM

#

halcyon quarry ## Pushed an update - Tags can now have a "`name`" parameter - For the moment, t...

That's awesome!
I think that would have been great to test a theory a while ago while debugging what was up with the censoring bypass.

#

For easier at-a-glance readability, every log for tags could be displayed as [TAGS | {name}] since it's a stored attribute that gets passed around ^^

halcyon quarry Oct 20, 2024, 8:32 PM

#

I might update the 'trumps' behavior to operate on 'name' instead of 'triggers'

halcyon quarry Oct 21, 2024, 1:37 AM

#

Pushed update adding new condition 'only_with_tags'

only_with_tags is a list of tag names (the new name param)
This condition is only True if one of the named tags was matched.

#

Condition will also be True for any named tags that are persistently applied
If a named tag was matched, then trumped, this tag won't trigger.

tepid needle Oct 21, 2024, 6:57 AM

#

@halcyon quarry one quick question before I forget
Is it possible to allow a tag, when triggered, to input a randomly chosen set of values for image generation?

halcyon quarry Oct 21, 2024, 12:34 PM

#

Yes it’s the img_param_variances tag @tepid needle

#

You predefine the ranges for each setting you want randomized

#

For number values integers and floats it does not use the value that it picks rather it will add or subtract the selected value from whatever the default value is like if you have 30 steps edit choose is five image will generate with 35 steps

#

I have some comments included for the tag for him so go check it out

#

Voice input

valid crypt Oct 22, 2024, 6:48 PM

#

halcyon quarry - Condition will also be True for any named tags that are persistently applied -...

😵‍💫 So I have a trigger, the trigger must match and then if the name matches too then activates the persist tag?

#

im 😵‍💫

halcyon quarry Oct 22, 2024, 6:49 PM

#

As long as the tag with persist also has a name parameter, it will work

#

Instead of retaining a copy of the entire tag value, as I was doing, it is now only retaining the name

#

During the tag matching, it fetches any persistent names that were captured.
As it iterates over the tags, if a tag has a matching name it will be automatically applied

valid crypt Oct 22, 2024, 6:53 PM

#

the persist tag must have a name
if the trigger matches ✅
if the name matches ✅

halcyon quarry Oct 22, 2024, 6:53 PM

#

The name does not have to match anything for the tag to trigger in the first place

#

It's only used to re-match the tag

valid crypt Oct 22, 2024, 6:54 PM

#

you mean that before it retains the content of the tag and now it retains the name and looks for the content?

#

😵‍💫

halcyon quarry Oct 22, 2024, 6:54 PM

#

Yes, before it made a copy of the entire tag and kept it

#

Now when a persistent tag is matched, it just captures the name

#

It applies the entire tag

#

Look, as far as you are concerned all you need to do is slap a name on it and it will behave as it has been

valid crypt Oct 22, 2024, 6:56 PM

#

what are the benefits?

halcyon quarry Oct 22, 2024, 6:56 PM

#

I manipulate the tags as they are being processed

#

by taking a snapshot of the tag value, then trying to compare it again later for equality, I had to mdofy a lot of code

#

so instead of capturing something like {'trigger': 'some text', 'should_gen_image': true, 'insert_text': 'some shit', 'text_replace_method': insert'} etc etc, then trying to match it later

#

Now I just capture: persist_tag_names: ['some shit', 'another persistent tag']

#

These are "tupled" with the number of remaining persistency

#

so it can be deducted by 1 every time until zero

valid crypt Oct 22, 2024, 7:09 PM

#

ah

#

benefits are for the coder

halcyon quarry Oct 22, 2024, 7:10 PM

#

benefits are for the code itself!

#

The way I had it was not sustainable

valid crypt Oct 23, 2024, 3:03 PM

#

healthy code healthy dev

halcyon quarry Oct 23, 2024, 3:09 PM

#

I found a bug which in some cases, caused a number of tags to be completely skipped, when using the /image command.

#

Just pushed a fix for that

valid crypt Oct 23, 2024, 3:27 PM

#

is civit half down for you?

halcyon quarry Oct 23, 2024, 3:38 PM

#

civit seems fully OK as far as I can tell

valid crypt Oct 23, 2024, 3:49 PM

#

i have a big problem then

#

💀

halcyon quarry Oct 23, 2024, 3:51 PM

#

Your network or work network?

valid crypt Oct 23, 2024, 3:52 PM

#

looks fine

halcyon quarry Oct 23, 2024, 3:53 PM

#

if you are at work they may have put some specific block or something

#

try a different browser

valid crypt Oct 23, 2024, 5:22 PM

#

i bet my intel cpu is dying

#

sd 3.5 🥳

valid crypt Oct 23, 2024, 6:04 PM

#

when i look

#

when i look away and look back

#

ha got you

halcyon quarry Oct 29, 2024, 3:39 PM

#

tepid needle it didnt change back to the previous model but it did swap

Finally got around to resolving this

#

one line I needed 😄

tepid needle Oct 29, 2024, 4:27 PM

#

Wooo

valid crypt Oct 30, 2024, 6:00 PM

#

👏

valid crypt Oct 31, 2024, 10:21 PM

#

https://tenor.com/view/happy-halloween-halloween-happy-holidays-trick-or-treat-let's-get-spooky-gif-10939093450278559085

Tenor

halcyon quarry Nov 1, 2024, 12:28 AM

#

Happy Halloween to you too

terse folio Nov 1, 2024, 12:31 AM

#

Happy Halloween!

visual dagger Nov 6, 2024, 6:42 PM

#

ded 💀

#

#

it feels so empty in here. hello!

terse folio Nov 6, 2024, 6:56 PM

#

Hello!

halcyon quarry Nov 6, 2024, 6:56 PM

#

Hi!

terse folio Nov 6, 2024, 6:57 PM

#

wave wave How's the project going?
^-^

halcyon quarry Nov 6, 2024, 6:57 PM

#

Been on break 🙂

terse folio Nov 6, 2024, 6:58 PM

#

That's a mood, I've been focusing on a lot of life stuff lately

visual dagger Nov 7, 2024, 1:21 PM

#

halcyon quarry Been on break 🙂

get back to work 😐

#

I don't pay you 0$ per month for nothing

valid crypt Nov 8, 2024, 3:57 PM

#

😶

terse folio Nov 8, 2024, 7:41 PM

#

visual dagger I don't pay you 0$ per month for nothing

Wow that's more than me, need to ask for a raise

visual dagger Nov 8, 2024, 11:28 PM

#

terse folio Wow that's more than me, need to ask for a raise

you get -100$

#

congrats!!

halcyon quarry Nov 16, 2024, 11:41 PM

#

This is actually pretty huge... Panchovix (SD ReForge) was able to bring the lora control extension into the Forge memory handling, which has been in demand since the initial release of Forge

#

https://github.com/Panchovix/stable-diffusion-webui-reForge/issues/36

#

I'll be pushing an update in the next day or so to allow ReForge to use my automatic loractl scaling feature

#

which is currently only enabled for A1111

halcyon quarry Nov 18, 2024, 3:16 AM

#

Ok well he made some other changes that screwed it up and now he has the feature back in dev lol

valid crypt Nov 21, 2024, 7:02 PM

#

donwloading reforge...

halcyon quarry Nov 21, 2024, 7:34 PM

#

It's probably the best UI if you don't care about Flux, and are iffy about Comfy / Swarm

#

This was the commit I tested where the "lora ctl" feature was still working.
https://github.com/Panchovix/stable-diffusion-webui-reForge/commit/1e950bc12e3a4f690ef6afcf96826e02e4d24ee9

#

he definitely broke it on the next commit or one of the next ones

calm rain Nov 21, 2024, 7:35 PM

#

don't be iffy about swarm

#

assimilate

halcyon quarry Nov 21, 2024, 7:36 PM

#

I began adding Swarm support to the bot, but have been tied up with a video game lately / lost motivation atm

#

So far all it can do is detect if Swarm is running and capture the session id and that's it 😆

valid crypt Nov 21, 2024, 7:46 PM

#

XD

#

im looking for mativation too

valid crypt Nov 21, 2024, 8:18 PM

#

valid crypt donwloading reforge...

i wanted to try a model that for some reason dont work for forge

#

forge

halcyon quarry Nov 30, 2024, 12:44 PM

#

Ok so now the loractl feature ACTUALLY WORKS in ReForge

#

I’ll push a quick update today to allow the bot’s auto loractl scaling feature to be enabled with ReForge

#

Actually works very well if you want to try setting up a whole crapton of tags with Loras, and triggering multiple

halcyon quarry Nov 30, 2024, 7:14 PM

#

Pushed an update which allows loractl with ReForge

halcyon quarry Dec 3, 2024, 3:24 PM

#

Pushed an edge case minor update

The bot can now apply the aspect ratio from an img2img image, by using the value 'from img2img' for the aspect_ratio tag parameter
This is useful for applying multi-controlnet, and other multiple-image-input tags using "random directory" method. The subdirectories can now have different resolutions.

valid crypt Dec 3, 2024, 3:52 PM

#

never tried img2img :v

halcyon quarry Dec 6, 2024, 3:04 AM

#

You can do a number of things with img2img via Tags

#

the most recent image generated is retained in the user images/temp location - so you can use '__temp/temp_img_0.png' as a valid img2img

#

Ya know, to use the last image as input. Can make a tag for that

halcyon quarry Dec 8, 2024, 12:45 PM

#

Been daydreaming about how to make a very flexible integration of ComfyUI, with the bot’s tags system. It would be super cool to be able to run all sorts of workflows, prompt for required inputs, handle whatever the response is whether image video audio etc

#

text2video, img2video, vid2vid - all very accessible now to anyone with a 3060+ with acceptable quality and generation time

valid crypt Dec 9, 2024, 11:12 PM

#

nice dream :)

#

awww comfyui

#

man there are too many uis :v

valid crypt Dec 25, 2024, 1:53 PM

#

https://tenor.com/view/merry-xmas-merry-xmas-funny-gif-1223698705761373793

Tenor

halcyon quarry Dec 25, 2024, 5:40 PM

#

🎅

terse folio Dec 27, 2024, 6:39 AM

#

ducky_santa

valid crypt Dec 31, 2024, 7:40 PM

#

https://tenor.com/view/2025-new-year-gif-13449429702471646689

Tenor

valid crypt Jan 3, 2025, 10:37 PM

#

what happened to alltalks :O

#

its amazing

halcyon quarry Jan 4, 2025, 6:37 PM

#

Still works for me

#

Someone just posted this on resdit… supposedly the web search extension works?

#

https://www.reddit.com/r/Oobabooga/s/Rn0taOmg4j

From the Oobabooga community on Reddit: Install LLM_Web_search | Ma...

Explore this post and more from the Oobabooga community

#

Don’t have time to check it out myself atm

valid crypt Jan 5, 2025, 10:55 AM

#

i remember that what didnt work was when you plug it to the bot

#

although ive never tried adding it

valid crypt Jan 5, 2025, 8:25 PM

#

@halcyon quarry please add support for the remote version of alltalks v2

#

https://tenor.com/view/spongebob-squarepants-begging-pretty-please-beg-on-your-knees-pray-for-mercy-gif-10678931350545522063

Tenor

halcyon quarry Jan 5, 2025, 8:29 PM

#

@valid crypt sounds like you've tested out alltalk v2? Does it work with the bot?

#

I havent played with it yet

valid crypt Jan 5, 2025, 8:30 PM

#

halcyon quarry <@323088470241312774> sounds like you've tested out alltalk v2? Does it work wi...

nope

#

not even for tgwui :v

#

the remote version works

halcyon quarry Jan 5, 2025, 8:30 PM

#

seems like there is a "TGWUI Remote Extension" that alltalk v2 is compatible with

valid crypt Jan 5, 2025, 8:30 PM

#

but with bot just no response

halcyon quarry Jan 5, 2025, 8:31 PM

#

so its not a feature of alltalkv2 per se

#

https://github.com/erew123/alltalk_tts/wiki/Text‐generation‐webui-Remote-Extension

GitHub

Text‐generation‐webui Remote Extension

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, D...

#

ah

#

it is part of alltalk...

valid crypt Jan 5, 2025, 8:32 PM

#

the is the message

#

and of course i tried and failed :)

halcyon quarry Jan 5, 2025, 8:33 PM

#

ah yeah I think I also saw that message recently... and stuck with v1

#

which still works fine

#

I'll look into this a bit more

valid crypt Jan 5, 2025, 8:35 PM

#

:)

burnt patrol Jan 7, 2025, 7:18 AM

#

Thanks

remote thistle Jan 12, 2025, 6:10 AM

#

Hello! I've ran into an issue that I suspect is my own doing but I'm kind of at a loss. At some point I created a conflict in the installer environment, so opted to just reinstall the whole stack (oobabooga, ad_discordbot and all) rather than try to figure that out. Should have been a clean wipe, but now I'm getting this error when I try to interact with the bot.

ERROR [bot.__main__]: An error occurred in llm_gen(): 'static_cache' Traceback (most recent call last): File "/home/mole/text-generation-webui-main/ad_discordbot/bot.py", line 2003, in llm_gen async for chunk in process_responses(): File "/home/mole/text-generation-webui-main/ad_discordbot/bot.py", line 1953, in process_responses async for resp in generate_in_executor(func): File "/home/mole/text-generation-webui-main/ad_discordbot/modules/utils_asyncio.py", line 161, in generate_in_executor result, is_done = await loop.run_in_executor(None, get_next_generator_result, gen) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mole/text-generation-webui-main/installer_files/env/lib/python3.11/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mole/text-generation-webui-main/ad_discordbot/modules/utils_asyncio.py", line 29, in get_next_generator_result result = next(gen) ^^^^^^^^^ File "/home/mole/text-generation-webui-main/ad_discordbot/modules/utils_tgwui.py", line 342, in custom_chatbot_wrapper for j, reply in enumerate(generate_reply(prompt, state, stopping_strings=stopping_strings, is_chat=True, for_ui=for_ui)): File "/home/mole/text-generation-webui-main/modules/text_generation.py", line 42, in generate_reply for result in _generate_reply(*args, **kwargs): File "/home/mole/text-generation-webui-main/modules/text_generation.py", line 97, in _generate_reply for reply in generate_func(question, original_question, seed, state, stopping_strings, is_chat=is_chat): File "/home/mole/text-generation-webui-main/modules/text_generation.py", line 305, in generate_reply_HF if state['static_cache']: ~~~~~^^^^^^^^^^^^^^^^ KeyError: 'static_cache'

I know it's an issue on the bot's end as the oobabooga client on its own works just fine. But other than that I haven't a clue. Also tested it using one of the known good example characters. Same thing.

This'll probably end up being silly, so I accept any teasing coming my way. 😛 I just want to get the bot working.

terse folio Jan 12, 2025, 7:03 AM

#

remote thistle Hello! I've ran into an issue that I suspect is my own doing but I'm kind of at ...

This isn't your fault, it's likely that you're using a version of tgwui that is more modern than the ad_discordbot

TGWUI is expecting a certain variable from the bot that the bot doesn't know to provide.
I'm running on a pretty outdated version of tgwui, not going to take the risk updating just yet!

But here's how you can patch that:
In tgwui/ad_discordbot/dict_base_settings.yaml
you can add static_cache: False
as one of the options under llmstate > state

remote thistle Jan 12, 2025, 7:05 AM

#

Could be! I had installed tgwui a while back to originally use with oobabot before making the switch, so it's very much possible that what I grabbed a second time is a bit too new. I'll try that patch and see if it works!

terse folio Jan 12, 2025, 7:06 AM

#

If after this, you continue to have similar errors about something missing from the state variable you can find the defaults under tgwui/modules/shared.py

remote thistle Jan 12, 2025, 7:17 AM

#

Looks like that worked perfectly! It's up and working just like it was before. Thanks for your help!

valid crypt Jan 12, 2025, 5:28 PM

#

terse folio This isn't your fault, it's likely that you're using a version of tgwui that is ...

must be something with a fresh installed tgwui, i updated like a week ago and i have no problems :p

valid crypt Jan 12, 2025, 5:33 PM

#

halcyon quarry I'll look into this a bit more

i discovered something, it actually does something, but idk why theres no audio

halcyon quarry Jan 12, 2025, 5:36 PM

#

Need to update the bot to add static_cache

#

This is the first time in awhile that ooba added new params that didn’t fallback to a default value when not in the payload

halcyon quarry Jan 12, 2025, 5:58 PM

#

updating to latest TGWUI and testing...

#

yes, works fine with this key added.

#

pushed! 2 lines! 😄

#

Yes I see that the original alltalk_tts is not working on the latest TGWUI

#

this is very unfortunate, the dev was so dedicated

#

yes seems like we need to go to the dev v2 version...

valid crypt Jan 12, 2025, 6:15 PM

#

halcyon quarry yes seems like we need to go to the dev v2 version...

yes you need

#

although even v2 doesnt work ._ .

#

but it is amazin

#

g

halcyon quarry Jan 12, 2025, 6:31 PM

#

this is crazy, the standalone app takes like 20 minutes to install

valid crypt Jan 12, 2025, 6:49 PM

#

is it too much or too fast

halcyon quarry Jan 12, 2025, 7:25 PM

#

what is the good xtts model again? 2.0.2?

valid crypt Jan 12, 2025, 7:31 PM

#

?

#

i used the default one

#

in the past

halcyon quarry Jan 12, 2025, 7:31 PM

#

Mistakenly did not retain my model

valid crypt Jan 12, 2025, 7:33 PM

#

the base model can be easily downloaded here

halcyon quarry Jan 12, 2025, 7:33 PM

#

Right, but if memory serves me right 2.0.3 was considered a step back…

valid crypt Jan 12, 2025, 7:35 PM

#

the quality does not matter if there's nothing

#

i had 2.0.2 and never used 2.0.3

valid crypt Jan 13, 2025, 8:01 PM

#

i think that the whisper stt stoped working after the update

halcyon quarry Jan 13, 2025, 10:19 PM

#

That sucks if a number of extensions just suddenly don’t work anymore

valid crypt Jan 13, 2025, 10:39 PM

#

i have an old tgwui in my another pc and i tried it and it worked, but the latest didnt work

#

you can try if it works for you

terse folio Jan 13, 2025, 10:40 PM

#

Seems the newest version breaks a lot of things including the openAi extension

valid crypt Jan 13, 2025, 10:40 PM

#

i might get an old tgwui just for alltalk wihtou remote

valid crypt Jan 13, 2025, 10:41 PM

#

terse folio Seems the newest version breaks a lot of things including the openAi extension

i saw ooba updating that extension 4 days ago

#

maybe wasnt ooba but if ppl updated it so it should work right?

terse folio Jan 13, 2025, 10:42 PM

#

I'll have to check it out soon, but just saw someone having an issue with the static_cache variable not being present

valid crypt Jan 13, 2025, 10:43 PM

#

ask them if they have it fresh installed or updated

#

i updated and im fine

#

😁

terse folio Jan 13, 2025, 10:44 PM

#

That's great to hear ^^

halcyon quarry Jan 14, 2025, 3:36 AM

#

I’ve still been mostly engrossed in this game I’m playing but plan on testing alltalk v2 / remote - more, soon

twin thunder Jan 14, 2025, 4:15 AM

#

I think I forgot to send a message but i've been submitting issues on the github for while now

terse folio Jan 14, 2025, 4:18 AM

#

valid crypt ask them if they have it fresh installed or updated

it was a fresh install for docker

twin thunder Jan 14, 2025, 4:18 AM

#

this seems incorrect...

terse folio Jan 14, 2025, 4:19 AM

#

What's this from?
TGWUI updater?
was there an updater for the bot? it's been a while

twin thunder Jan 14, 2025, 4:19 AM

#

just the update_windows.bat from the folder

#

oh crap I gotta redo my api keys

terse folio Jan 14, 2025, 4:20 AM

#

that's odd, wonder what happened.
You could try deleting the .git folder if something is missing and have it redownload

twin thunder Jan 14, 2025, 4:20 AM

#

the correct answer is i'm stupid and didn't install it right

#

nvm I installed it right what is happening

#

I'm just gonna retry everything but through gitbash instead of terminal

terse folio Jan 14, 2025, 4:23 AM

#

Tgwui is supposed to create it's own environment, that might come with it's own version of git in the conda environment

#

I'm not sure if that would affect things

twin thunder Jan 14, 2025, 4:23 AM

#

im not using tgwui

terse folio Jan 14, 2025, 4:23 AM

#

ahh, what are you using?

twin thunder Jan 14, 2025, 4:24 AM

#

windows???

#

like I just installed the repo from github

terse folio Jan 14, 2025, 4:24 AM

#

Tgwui is short for text-gen webui ^-^

twin thunder Jan 14, 2025, 4:24 AM

#

OH

#

lol i'm stupid

terse folio Jan 14, 2025, 4:24 AM

#

no worries!

twin thunder Jan 14, 2025, 4:24 AM

#

should update that while i'm at it tho

terse folio Jan 14, 2025, 4:25 AM

#

twin thunder just the update_windows.bat from the folder

is this an updater for the ad_discord bot, or the webui?

im not sure if the bot has its own updater script

twin thunder Jan 14, 2025, 4:25 AM

#

ad_discord bot

terse folio Jan 14, 2025, 4:25 AM

#

ah, okay, will check on that

#

are you updating from a very old version of the ad_discord bot?

twin thunder Jan 14, 2025, 4:26 AM

#

the updater seems to be working but i'm getting another error

terse folio Jan 14, 2025, 4:26 AM

#

hm, looks like the wrong path was put in the script, that should be pointing to

textgen_webui/installer_files...

#

ad_discordbot is meant to be inside textgen_webui

twin thunder Jan 14, 2025, 4:27 AM

#

oh whoops

#

I just deleted all my models

terse folio Jan 14, 2025, 4:27 AM

#

ohno!

twin thunder Jan 14, 2025, 4:27 AM

#

thats fine i know what they were

terse folio Jan 14, 2025, 4:27 AM

#

I would recommend moving your models to a seperate folder

#

and launching the webui with an argument to tell it to read from that folder

#

Like a symlink

#

I personally keep my models on a dedicated drive ^^

twin thunder Jan 14, 2025, 4:28 AM

#

oh yeah, i should move them onto my m.2 once it gets here

terse folio Jan 14, 2025, 4:28 AM

#

that way you can move things around without transfering large files or deleting things

#

PS: I wouldn't recommend storing large projects on your desktop

#

windows has to load all those files as your pc boots up
and can slow things down

twin thunder Jan 14, 2025, 4:29 AM

#

boot time hasn't really been an issue but thats good to know

#

oh my god

#

the web ui version i had downloaded was 1.18

terse folio Jan 14, 2025, 4:31 AM

#

Wow, that's suprizing, usually it defaults to latest for downloads

twin thunder Jan 14, 2025, 4:31 AM

#

no, i'm getting the latest now, I had 1.18 on my pc before

terse folio Jan 14, 2025, 4:31 AM

#

ahh, okay

#

~~but I could walk you through what needs to be changed~~
fixed ^^

halcyon quarry Jan 14, 2025, 4:34 AM

#

There’s no bugs with the bot atm oobabooga

terse folio Jan 14, 2025, 4:35 AM

#

great!

halcyon quarry Jan 14, 2025, 4:35 AM

#

I had pushed the update yesterday to add that one new key

#

Although I should really make a new “Release”

#

@twin thunder be sure to see the current install instructions for the bot

twin thunder Jan 14, 2025, 4:44 AM

#

I was being quite silly the whole time (did not clone the repo to the right spot)

#

what the hell

#

how many models did I have installed

#

I just freed 150gb of space

terse folio Jan 14, 2025, 4:47 AM

#

😅

#

nice

twin thunder Jan 14, 2025, 4:49 AM

#

what's ya'lls prefered models?

terse folio Jan 14, 2025, 4:50 AM

#

I'm a few months behind on the latest stuff, but I found that I could manage to fit gemma 27b at 2bits on my gpu which worked surprisingly well.
But I also use some llama3 8b finetune Hathor_Tahsin for simpler things

halcyon quarry Jan 14, 2025, 5:00 AM

#

I’ve been running the same model forever… NeuralBeagle 7b

#

Great model

valid crypt Jan 14, 2025, 2:58 PM

#

literally XD

valid crypt Jan 14, 2025, 6:54 PM

#

halcyon quarry I’ve still been mostly engrossed in this game I’m playing but plan on testing al...

guess what I've found after installing alltalkv2 as an extension

#

https://tenor.com/view/hide-the-pain-harold-gif-10576512

Tenor

halcyon quarry Jan 14, 2025, 6:56 PM

#

works just fine eh?

valid crypt Jan 14, 2025, 6:56 PM

#

that.. looks like remote...

#

the extension is the same? just using the same environment I think, but I have to try if it works!

#

i'll just ask why the start up was tgwui mode and then remote...

#

never mind, bot can't even load

valid crypt Jan 15, 2025, 9:13 PM

#

yea stick to 2.0.2, 2.0.3 i feel a slight improvement but it takes me 70% more time...

burnt patrol Jan 15, 2025, 11:16 PM

#

valid crypt yea stick to 2.0.2, 2.0.3 i feel a slight improvement but it takes me 70% more t...

Of TGWUI?

terse folio Jan 15, 2025, 11:16 PM

#

I believe those versions refer to XTTS models

valid crypt Jan 15, 2025, 11:49 PM

#

burnt patrol Of TGWUI?

xtts

burnt patrol Jan 16, 2025, 12:07 AM

#

valid crypt xtts

Ah

sullen plover Jan 17, 2025, 8:48 PM

#

hey guys how do i stop people dm'ing my bot?

#

i gotta do all this seems a bit over the top? To disable DMs for your bot while using the ad_discordbot plugin, you can modify its behavior based on its structure and configuration. Below are steps and examples for implementing this functionality:

Modify the Message Event in ad_discordbot
The ad_discordbot plugin processes messages through a message event listener. You can add a check to ignore DMs. Look for the section in the code handling the on_message event or similar and update it to include a guild check.

Example:
python
Copy
Edit
@bot.event
async def on_message(message):
# Ignore messages sent in DMs
if message.guild is None: # DM channels don't belong to any guild
return
# Continue processing messages in servers
await bot.process_commands(message)
2. Bot Settings for Scope Restriction
Check if ad_discordbot has configuration settings or a config.json file to define bot behavior. If such a file exists, look for options to disable or restrict DM responses.

Ignore DMs Globally
If ad_discordbot uses decorators for command definitions (e.g., @bot.command()), you can add a global DM filter to enforce the restriction across all commands.

Example:
Modify or wrap the command logic:

python
Copy
Edit
def no_dm_check(ctx):
return ctx.guild is not None # Allow only messages from guilds

@bot.command()
@commands.check(no_dm_check)
async def my_command(ctx):
await ctx.send("This command only works in servers.")
4. Update the ad_discordbot Core Logic
You may need to update ad_discordbot's source to handle this at a higher level:

Locate the part of the code where the bot reads incoming messages or processes events.
Implement a DM filter as shown in the examples above.
5. Redirect DM Senders (Optional)
If you want to send a polite response to DM users instead of silently ignoring them, you can modify the behavior to include a reply.

Example:
python
Copy
Edit
@bot.event
async def on_message(message):
if message.guild is None: # Check if the message is from a DM
await message.author.send("I do not respond to direct messages. Please use the bot in a server.")
return
await bot.process_commands(message)
6. Testing and Validation
Restart the bot after making changes.
Test it by sending DMs and ensuring the bot does not respond.
Ensure commands work correctly in servers.
If you encounter specific issues with ad_discordbot integration or need help pinpointing where to add these changes in its structure, provide snippets of its core processing logic, and I can assist further.

#

also stop it replying to other ai's bots too

terse folio Jan 17, 2025, 8:57 PM

#

sullen plover also stop it replying to other ai's bots too

pretty sure there were settings ^^

#

chance_to_reply_to_other_bots in base_settings.yaml

#

ah, it wasn't that texting was disabled in dms, just some commands

#

~~it shouldn't be too hard to add a setting for that and an extra if statement in the on_message

I wont be able to test it as i'm using an older version of TGWUI and dont want to update yet but can make a branch for you to try in a few hours.
Busy atm~~

Edit: there is actually a setting, different file

#

discord > direct_messages > allow_chatting in config.yaml
and can disable all commands in dms too with the next setting allowed_commands

sullen plover Jan 17, 2025, 9:10 PM

#

thanks checking now 🙂

#

i dont see this in my config.yaml discord > direct_messages > allow_chatting in config.yaml
and can disable all commands in dms too with the next setting allowed_commands

#

reply_to_itself: 0.0 # 0.0 = never happens / 1.0 = always happens
chance_to_reply_to_other_bots: 0.0 # Chance for bot to reply when other bots speak in main channel
reply_to_bots_when_addressed: 0.0 # Chance for bot to reply when other bots mention it by name
only_speak_when_spoken_to: true # This value gets ignored if you're talking in the bot's main channel
ignore_parentheses: true # (Bot ignores you if you write like this)
go_wild_in_channel: true # Whether or not the bot will always reply in the main channel
conversation_recency: 600

terse folio Jan 17, 2025, 9:50 PM

#

sullen plover reply_to_itself: 0.0 # 0.0 = never happens / 1.0 = always happens...

You can put code in clode blocks by surrounding them in 3 backticks ```code```

#

chance_to_reply_to_other_bots already being 0 and still happening might be a bug, interesting

#

what kind of bot does it reply to?
Are these bots mentioning the AdDiscordbot?

terse folio Jan 17, 2025, 9:52 PM

#

sullen plover i dont see this in my config.yaml discord > direct_messages > allow_chatting in ...

It's possible you have an older version of the config.
I'm not sure how updating it works

sullen plover Jan 17, 2025, 9:57 PM

#

thanks 🙂

halcyon quarry Jan 18, 2025, 4:13 AM

#

Just run the updater bat file

terse folio Jan 18, 2025, 4:15 AM

#

what I mean is, do the configs get updated too?
because they're editable

valid crypt Jan 18, 2025, 11:28 AM

#

i think configs dont get updated, only the first launch will copy from example, but after that, you have to copy or editing them manually

halcyon quarry Jan 18, 2025, 1:50 PM

#

The config templates get updated-
On startup, the bot compares user settings to the settings templates. Any missing user settings default to what is in the templates, while warning in the cmd window

marsh harness Jan 26, 2025, 12:10 AM

#

Love all of you

#

Have a good year

twin thunder Jan 30, 2025, 4:06 AM

#

getting this error while using ExLlama as a loader (i've already posted a issue on github about it)

terse folio Jan 30, 2025, 4:09 AM

#

twin thunder getting this error while using ExLlama as a loader (i've already posted a issue ...

A new parameter has been added, these instructions will work the same ^-^ #1154970156108365944 message

twin thunder Jan 30, 2025, 4:10 AM

#

ah, just downgrade or wait for ad-discord bot to update

terse folio Jan 30, 2025, 4:11 AM

#

Sure, but I'm also not sure if updating the bot will add the missing paramater to the settings file as it's meant to be editable.

That means it's probably in the gitignore file.
Ill have to look into that

twin thunder Jan 30, 2025, 4:15 AM

#

adding the missing parameters to the config appears to have worked! thanks!

valid crypt Feb 1, 2025, 10:43 PM

#

im playing https://huggingface.co/spaces/TTS-AGI/TTS-Arena
and im impressed with kokoro, gonna plug it to the bot

TTS Arena - a Hugging Face Space by TTS-AGI

terse folio Feb 1, 2025, 11:00 PM

#

Oh nice!
I wonder what the license is like on that and if it supports cloning/finetuning
~~ahh personal only~~
... actually not sure, maybe it's just the demo

valid crypt Feb 1, 2025, 11:05 PM

#

idk if it can be finetuned, but if the quality is good, rvc is your best friend

valid crypt Feb 1, 2025, 11:40 PM

#

found this but i cant make it work https://github.com/h43lb1t0/KokoroTtsTexGernerationWebui

GitHub

GitHub - h43lb1t0/KokoroTtsTexGernerationWebui: An extension to use...

An extension to use Kokoro TTS in text generation webui - h43lb1t0/KokoroTtsTexGernerationWebui

halcyon quarry Feb 2, 2025, 12:41 PM

#

twin thunder getting this error while using ExLlama as a loader (i've already posted a issue ...

I closed this issue as completed because I added the parameters 🤓 just update

#

Ah sorry just noticed this is old comment, mb

halcyon quarry Feb 2, 2025, 12:43 PM

#

valid crypt found this but i cant make it work https://github.com/h43lb1t0/KokoroTtsTexGerne...

Working in TGWUI?

valid crypt Feb 2, 2025, 12:49 PM

#

Not even in tgwui

#

From the comments it is supposed to prepare everything at the first startup, but mine didn't, maybe there's something that I had to do but only programmers would know

halcyon quarry Feb 2, 2025, 1:07 PM

#

I think there’s been a number of significant changes on TGWUI side recently… maybe try on a version as old as latest commit from extension

#

If it works, and can pinpoint commit that breaks it, that would be a good place to start fixing it

valid crypt Feb 2, 2025, 3:39 PM

#

Tried

#

The extension does nothing

#

You may try it, I can't spot the bug and chatgpt doesn't help

halcyon quarry Feb 2, 2025, 8:26 PM

#

Are you using the correct model v1.0?

halcyon quarry Feb 2, 2025, 9:47 PM

#

I looked at the Issues, and the author had just closed one 5 days ago… idk I’d expect that the extension should be doing what it claims to be doing with an active dev

valid crypt Feb 2, 2025, 10:55 PM

#

look at that

#

i believe

halcyon quarry Feb 2, 2025, 10:56 PM

#

ah very cool

valid crypt Feb 2, 2025, 11:03 PM

#

👍 wasnt my problem

#

i remember that the extension refers to it self as KokoroTtsTexGernerationWebui, and to be recognized as a tts extension must have _tts at the end

#

I suspect that my ISP cut off my internet, it should have no limits but I downloaded the deepseek r1 to try if it works...

valid crypt Feb 3, 2025, 12:05 AM

#

Nah my ISP is down not my fault

#

Today is not the day to try if the bot works

valid crypt Feb 3, 2025, 6:14 PM

#

valid crypt i remember that the extension refers to it self as KokoroTtsTexGernerationWebui,...

i was right, the extension refers as KokoroTtsTexGernerationWebui, not very hard to modify

valid crypt Feb 3, 2025, 6:33 PM

#

nah it doesnt work, with the bot, not a big deal as i cant plug rvc to it

visual dagger Feb 3, 2025, 9:03 PM

#

hi old fellas

#

still going hard at this.. I see

halcyon quarry Feb 3, 2025, 9:04 PM

#

not really, haven't changed much in 3 months

valid crypt Feb 3, 2025, 9:28 PM

#

valid crypt nah it doesnt work, with the bot, not a big deal as i cant plug rvc to it

as someone might guess, i tried to add rvc, but my skill is 😅

#

original

#

#

i quit coding 😓

terse folio Feb 3, 2025, 10:04 PM

#

valid crypt

sounds like missmatched framerates?
It looked like that tts generated 24kfps audio
Perhaps rvc is expecting 16k as other tts systems output?

valid crypt Feb 3, 2025, 10:10 PM

#

could be, i stole those code from https://github.com/marcos33998/edge_tts so...

GitHub

GitHub - marcos33998/edge_tts: A very simple implementation of edge...

A very simple implementation of edge_tts w/ RVC for oobabooga text-generation-webui. - marcos33998/edge_tts

#

😋

valid crypt Feb 3, 2025, 10:15 PM

#

terse folio sounds like missmatched framerates? It looked like that tts generated 24kfps aud...

i did something weird, idk what happened, but as what my teacher said, if it works dont touch it

#

gonna make it open source 👍

#

im ready to get roasted 👍

#

gonna fing a way to plug it to the bot 👏

halcyon quarry Feb 3, 2025, 10:17 PM

#

It should just work probably

valid crypt Feb 3, 2025, 10:18 PM

#

halcyon quarry It should just work probably

i already tried, no audio

halcyon quarry Feb 3, 2025, 10:18 PM

#

You need to make sure to put the correct extension name in the config file. And add the relevant parameters to your character file

valid crypt Feb 3, 2025, 10:18 PM

#

alltalks remot didnt work too

#

older tts is like text book examples, i can see, but these new tts, i dont see

terse folio Feb 3, 2025, 10:20 PM

#

valid crypt i did something weird, idk what happened, but as what my teacher said, if it wor...

i could take a look if the code isn't too long.
But depending on how you copied the code from edge_tts you shouldn't have a problem as the tts result is resampled? then saved as 44k which I imagine gets imported into rvc.

halcyon quarry Feb 3, 2025, 10:20 PM

#

Right so you need to just see the example parameters in the minty character and replicate it in your own character with the parameters

valid crypt Feb 3, 2025, 10:22 PM

#

valid crypt older tts is like text book examples, i can see, but these new tts, i dont see

that is older tts, new tts are just 🙈

halcyon quarry Feb 3, 2025, 10:37 PM

#

Eh the params are hiding in the code somewhere

valid crypt Feb 3, 2025, 10:38 PM

#

thats the problem

valid crypt Feb 3, 2025, 10:43 PM

#

terse folio i could take a look if the code isn't too long. But depending on how you copied ...

if you consider ~300lines as short do it https://github.com/marcos33998/KokoroTtsTexGernerationWebui_tts
it should work for everyone I believe

GitHub

GitHub - marcos33998/KokoroTtsTexGernerationWebui_tts: An extension...

An extension to use Kokoro TTS in text generation webui+RVC - marcos33998/KokoroTtsTexGernerationWebui_tts

#

let me do a final test, as ive only checked that the preview workds ._ .

#

it works, im not touching that

halcyon quarry Feb 4, 2025, 3:31 PM

#

You could've fixed the spelling issue ya know

#

gerneration lol

valid crypt Feb 4, 2025, 3:31 PM

#

i didnt see that one XD

halcyon quarry Feb 4, 2025, 3:32 PM

#

valid crypt Feb 4, 2025, 4:04 PM

#

it is too late to fix, theres a lot of files that uses the path with the wrong name, i'm not touching that

halcyon quarry Feb 4, 2025, 4:08 PM

#

Ok so I looked into it, and so far, all the other TTS extensions had added a string to the internal response in the format of:
'audio src="file/(.*?\.(wav|mp3))" - This is the regex that captures it

#

Looking into the code of this, it actually returns a string such as this example:
<audio controls><source src="file/path/to/audio/123456.wav" type="audio/mpeg"></audio>

#

As a quick test @valid crypt you could edit the bot file shared/utils_shared.py - find the audio_source = ... (in SharedRegex)

#

Replace that line with this:
audio_src = re.compile(r'src="file/([^"]+\.(wav|mp3))"', flags=re.IGNORECASE)

#

I believe the bot would then be able to play back the TTS response

valid crypt Feb 4, 2025, 4:12 PM

#

got it

halcyon quarry Feb 4, 2025, 4:13 PM

#

in other words it is formatting using source src= instead of what seemed to be standardized... audio src=

#

https://github.com/h43lb1t0/KokoroTtsTexGernerationWebui/blob/89334fddb3b6e2b1a216f75c522adcc6275be75f/script.py#L71

#

In your fork, you could also instead try just changing this to audio src= and see how it behaves in TGWUI (via the UI) / the bot

#

I believe this is the actual correct answer... I think source src= is like the generic catch-all for extra file types that can be appended to the internal response.

#

Since my regex also requires the extension to be mp3 or wav, I should be able to safely make this change (drop the "audio") without falsely trying to potentially process other response types as audio

valid crypt Feb 4, 2025, 4:20 PM

#

i didnt work with my fork, maybe is because it has no default selected voice, all talk should work, wait for me

#

my code's problem

#

all talk works XD

#

👍

valid crypt Feb 4, 2025, 6:41 PM

#

valid crypt I suspect that my ISP cut off my internet, it should have no limits but I downlo...

the drive that had been used as virtual ram went wrong, it should pretty new and high end

valid crypt Feb 4, 2025, 6:46 PM

#

valid crypt the drive that had been used as virtual ram went wrong, it should pretty new and...

got surpassed by an old high end + chipset m.2 slot 😓

valid crypt Feb 4, 2025, 6:57 PM

#

valid crypt i didnt work with my fork, maybe is because it has no default selected voice, al...

something is not right with mine

#

not my problem, his extension has problems 😠

valid crypt Feb 4, 2025, 7:44 PM

#

help 😭

valid crypt Feb 4, 2025, 8:46 PM

#

@halcyon quarry 🥹

halcyon quarry Feb 4, 2025, 8:47 PM

#

Help do what? lol

valid crypt Feb 4, 2025, 8:53 PM

#

the extension

#

no audio

#

😭

#

i changet the output modifier to ```def output_modifier(string, state):
# Escape and clean the text
string_for_tts = html.unescape(string).replace('*', '').replace('`', '')

# Generate audio file
msg_id = run(string_for_tts, rvc_params=RVC_PARAMS)

# Create relative path from webui root directory
audio_path = pathlib.Path(__file__).parent / 'audio' / f'{msg_id}.wav'

# Get relative path from webui working directory
relative_path = os.path.relpath(audio_path, start=os.getcwd())

# Convert to web-style path and add cache busting
web_path = f"file/{relative_path.replace(os.sep, '/')}?v={int(time.time())}"

# Add audio element with proper relative path
return f'{string}<audio controls><source src="{web_path}" type="audio/mpeg"></audio>'```

making it use relative path and accessble from local network but i still dont know why bot dont work

halcyon quarry Feb 4, 2025, 9:01 PM

#

Like I said the bot currently does not expect source src= it expects audio src=

#

on ur last line

valid crypt Feb 4, 2025, 9:06 PM

#

i changed it alr

#

so alltalk works now

halcyon quarry Feb 4, 2025, 9:06 PM

#

Does it generate TTS, and save a local version of the output? Amd just fail to play it?

valid crypt Feb 4, 2025, 9:09 PM

#

webui works

#

with bot, it did generate the file

#

but no playing

halcyon quarry Feb 4, 2025, 9:27 PM

#

ok this seems to be the problem here

#

in bot.py search for audio_src - there are 2 instances

#

Youll see something like:

                if 'audio src=' in vis_resp_chunk:
                    audio_format_match = patterns.audio_src.search(vis_resp_chunk)

#

Try removing that first condition, and then nudge all the lines below it so they are indented correctly

#

            def apply_extensions(chunk_text:str, was_streamed=True):
                vis_resp_chunk:str = extensions_module.apply_extensions('output', chunk_text, state=self.llm_payload['state'], is_chat=True)
                audio_format_match = patterns.audio_src.search(vis_resp_chunk)
                if audio_format_match:
                    stream_replies.streamed_tts = was_streamed
                    setattr(self.params, 'streamed_tts', was_streamed)
                    self.tts_resp.append(audio_format_match.group(1))

valid crypt Feb 4, 2025, 9:29 PM

#

🫡

halcyon quarry Feb 4, 2025, 9:29 PM

#

Well you can ignore the one in speak_task()

#

but that would change to

            audio_format_match = patterns.audio_src.search(vis_resp_chunk)
            if audio_format_match:
                self.tts_resp.append(audio_format_match.group(1))

#

This should work, on the assumption that you also updated the thing in Shared Regex in utils_shared.py as I had said earlier

#

audio_src = re.compile(r'src="file/([^"]+\.(wav|mp3))"', flags=re.IGNORECASE)

valid crypt Feb 4, 2025, 9:31 PM

#

?

#

ah

halcyon quarry Feb 4, 2025, 9:32 PM

#

right this was some dumb oversight of mine

valid crypt Feb 4, 2025, 9:32 PM

#

?

#

first one correct?

halcyon quarry Feb 4, 2025, 9:32 PM

#

valid crypt Feb 4, 2025, 9:32 PM

#

oki

halcyon quarry Feb 4, 2025, 9:33 PM

#

An easy way to shift the indents is to highlight all the lines and press Ctrl+[

#

To nudge them to the right, Ctrl+]

valid crypt Feb 4, 2025, 9:34 PM

#

check?

halcyon quarry Feb 4, 2025, 9:34 PM

#

yep looks good

#

And make sure that the regex pattern is updated in utils_shared.py

valid crypt Feb 4, 2025, 9:39 PM

#

didnt work?

#

i checked

#

#

i think it didnt work

halcyon quarry Feb 4, 2025, 9:43 PM

#

Add this print statement
print("RESPONSE:", vis_resp_chunk)

#

#

When you use the bot, it will print the extra crap that the extension adds to the response -

#

then I'll ask ChatGPT why the regex pattern is not finding it

valid crypt Feb 4, 2025, 9:44 PM

#

🫡

#


You&#x27;re right on track with your dream city floating above the clouds - that&#x27;s an amazing concept! Now, let&#x27;s add some more features to make it even more incredible.

Here are a few ideas:

* A network of sky gardens and vertical farms to provide fresh produce for its inhabitants.
* An advanced transportation system using hyperloops or vacuum tubes to transport people quickly and efficiently throughout the city.
* A unique waste management system that converts trash into energy, water, and nutrients for the ecosystem.

Now it&#x27;s your turn! What features would you add to this floating city?

(I&#x27;ll wait patiently for your response)<audio controls><source src="file/extensions/KokoroTtsTexGernerationWebui_tts/audio/8b837f97-4ac1-421f-ab3c-7cae1ed10050.wav?v=1738705660" type="audio/mpeg"></audio>```

halcyon quarry Feb 4, 2025, 9:49 PM

#

?v=1738705660" - it has to do with this bit at the end I'm sure

valid crypt Feb 4, 2025, 9:50 PM

#

idk what is that

halcyon quarry Feb 4, 2025, 9:50 PM

#

Try with this regex

#

audio_src = re.compile(r'src="file/([^"]+\.(wav|mp3)(\?[^"]*)?)"', flags=re.IGNORECASE)

#

actually

#

this is the one

#

audio_src = re.compile(r'src="file/([^"]+\.(wav|mp3))\b', flags=re.IGNORECASE)

#

This will ignore the extra query that appeats after the file extension

valid crypt Feb 4, 2025, 9:52 PM

#

🫡

#

check

#

works

halcyon quarry Feb 4, 2025, 9:56 PM

#

Pushed the changes!

#

Thanks for helping to debug that

valid crypt Feb 4, 2025, 9:59 PM

#

wait, i dont know if i did something wrong, only the first audio is being played

halcyon quarry Feb 4, 2025, 10:00 PM

#

Try disable tts streaming

#

I may have made it so only confirmed clients can steam

valid crypt Feb 4, 2025, 10:01 PM

#

not by that mean, literally only the first audio is being played

#

the first of all

#

ps: audios are generated

halcyon quarry Feb 4, 2025, 10:05 PM

#

You mean like an old file that was generated in a previous session? The oldest file in the directories?

valid crypt Feb 4, 2025, 10:10 PM

#

valid crypt not by that mean, literally only the first audio is being played

the extension doesnt delete its audio files, although only the first audio was played, next messages's tts are generated

#

as i'm having a lot of network problem with my laptop lately could be my fault

halcyon quarry Feb 4, 2025, 10:11 PM

#

If you have TTS streaming feature enabled it could be due to it

#

Otherwise, maybe your network issue. Otherwise, you could probably further debug it with print statements at that apply_extensions()

valid crypt Feb 4, 2025, 10:18 PM

#

valid crypt works

worse, only that time and that audio worked

#

i think is the split of the extension

#

the extension it self has a split function as kokoro only supports 500tokens

halcyon quarry Feb 4, 2025, 10:31 PM

#

Xtts works the same. All talk splits the text into Individual sentences

valid crypt Feb 4, 2025, 11:57 PM

#

gonna give a deep test another day 😪

#

it seems to work...

valid crypt Feb 8, 2025, 9:01 PM

#

after testing the bot with a lot of hello, it is scared of it XD

#

anyways, the tts works good

halcyon quarry Feb 8, 2025, 9:02 PM

#

Yeah my bot isn’t happy when I write “test” over and over

#

Eyyy nice, tell me if you had to mess with anything beyond just updating the bot

#

With the changes you helped me debug

valid crypt Feb 8, 2025, 9:04 PM

#

i didnt update, i only changed everything you told me, i have to officially update it now

halcyon quarry Feb 8, 2025, 9:05 PM

#

May have to delete the files to fetch fresh.

#

Or if using the github desktop app, right click discard changes

valid crypt Feb 8, 2025, 9:06 PM

#

halcyon quarry Or if using the github desktop app, right click discard changes

that is something that i should start using

valid crypt Feb 8, 2025, 10:42 PM

#

1h trying to fix tts, im the stupid one with 2 tts extensions XD

#

works perfectly

halcyon quarry Feb 9, 2025, 1:53 PM

#

You should try to push your changes (RVC support?) to the main project

halcyon quarry Feb 9, 2025, 2:11 PM

#

I could verify kokoro as a supported TTS extension

valid crypt Feb 10, 2025, 5:12 PM

#

halcyon quarry You should try to push your changes (RVC support?) to the main project

to complex for the little bro to merge, hi has exams, and i have 5000 suspicious lines of code XD

valid crypt Feb 10, 2025, 5:13 PM

#

valid crypt to complex for the little bro to merge, hi has exams, and i have 5000 suspicious...

rvc is 🔥

#

although only 300 are mine :)

halcyon quarry Feb 10, 2025, 5:17 PM

#

Ah well

valid crypt Feb 10, 2025, 5:20 PM

#

mine has _tts suffix and works too

#

I have to say, they are a good combination :) i'm proud of myself 😎

halcyon quarry Feb 10, 2025, 5:35 PM

#

I'll look into adding support but the thing that's going to be a bummer is if there are actually no parameters

#

I was trying to figure out if they were hidden somewhere but couldn't find them

#

Couldn't find them via printing TGWUI code either - I think the extension independently manages its parameters

valid crypt Feb 10, 2025, 5:36 PM

#

haha told you, the only parameters you can find are my rvc params hahaha

#

although idk if i used them correctly ._ .

halcyon quarry Feb 10, 2025, 5:37 PM

#

the other TTS extensions set parameters to TGWUI's shared.args class

valid crypt Feb 10, 2025, 9:01 PM

#

insane improvements

#

only 3 is real

#

kokoro is just no emotion

#

old ones, also the good thing about 4 is its insane speed of less than a seconds :v

halcyon quarry Feb 11, 2025, 2:20 PM

#

Wow! That is very very good

#

Number 1 quite good

#

4 is pretty good indeed given you say it processes fast AF

valid crypt Feb 11, 2025, 4:26 PM

#

2: 3.5x speed (with rvc)
1: 11x speed
4: 40x (lol)
5: 3.5x
a text that should be 24s long and divide by the average of 5 tries

halcyon quarry Feb 11, 2025, 4:36 PM

#

4 is very impressive except at the end sounds like "young one" instead of pronouncing it correctly "woman"

#

So what, you used kokoro for all these?

valid crypt Feb 11, 2025, 5:02 PM

#

only 2 is rvc

#

1 is our good friend alltalk

halcyon quarry Feb 11, 2025, 11:55 PM

#

Alltalk ftw

#

You’re making me jump through all hoops to make other extensions work but just need alltalk 🤗

valid crypt Feb 12, 2025, 12:02 AM

#

the misconception is that alltalk wasnt that good, I only asked for edge tts and vits, and these days I asked for kokoro and alltalkv2, but by fixing kokoro, now alltalkv2 works 😏

halcyon quarry Feb 12, 2025, 4:05 PM

#

Pushed some changes regarding /image cmd

The user's prompt would be part of the embed. Now, it is sent as normal text along with the embed so it can actually be copied when using discord on mobile (can't copy embed text on mobile).
Added another selection for the use_llm option - to automatically prefix the prompt with Provide an image prompt description (without any preamble or additional text) for:

#

valid crypt Feb 12, 2025, 9:09 PM

#

what file of the bot is related to user input? and what line of the bot.py?

halcyon quarry Feb 12, 2025, 9:10 PM

#

I didn't feel like over-complicating this new option, if that's what you're inquiring about.

#

Can optionally just write the full prompt without a preconfigured prefix 😛
Or use the tags system to prefix prompts

valid crypt Feb 12, 2025, 9:12 PM

#

i want to steal some stt/asr code and plug it with some black magic

halcyon quarry Feb 12, 2025, 9:12 PM

#

Personally, this quick prefix stops the LLM from begining the reply with "Sure! Here's an image prompt:"

#

Although, this just made me realize it would be a great idea to add a "Generate image" option to the /prompt command, even if redundant to some degree

valid crypt Feb 12, 2025, 9:16 PM

#

i was asking to make my life easier, i want to add stt but maybe i can use another program and with a little bit of inspiration make bot think that it was a message form user and 🥳

halcyon quarry Feb 12, 2025, 9:17 PM

#

The bot code is relatively easy to navigate... relatively.

#

lemme see...

valid crypt Feb 12, 2025, 9:18 PM

#

i mean 7000 lines is not very friendly, maybe just the name of those modules?

#

like i never thought that to fix tts i have to touch shared utils :O

halcyon quarry Feb 12, 2025, 9:18 PM

#

There's a number of ways users can input... now the main listener is def on_message()

#

It determines if the bot should reply or not.
If so, it creates a task and queues it.

#

The TaskManager class processes the tasks

valid crypt Feb 12, 2025, 9:22 PM

#

everything will be in bot.py?

halcyon quarry Feb 12, 2025, 9:22 PM

#

For user message type tasks, it will run one of these code blocks

#

There's modules that simplify/streamline a lot of the code used in these main blocks

#

For instance lots of tags related code is in the tags.py module

valid crypt Feb 12, 2025, 9:24 PM

#

not touching that very soon

halcyon quarry Feb 12, 2025, 9:25 PM

#

I comment so much stuff becomes I'll completely forget why the heck I do anything without it lol

#

Want a list of codes to look at if you seriously want to try helping add STT?

#

that one is likely a slippery one...

valid crypt Feb 12, 2025, 9:28 PM

#

dont expect fancy results from me

halcyon quarry Feb 12, 2025, 9:29 PM

#

Welcome to hear any proposal on, what your thoughts were on actually handling it.

#

like, a TGWUI extension?
Native discord functions?

valid crypt Feb 12, 2025, 9:30 PM

#

i was thinking at tgwui but im not sure if it is going to work

#

as whiper stt is not working anymore ._ .

#

if extension can directly do inputs and its compatible with the bot, i could think that way

halcyon quarry Feb 12, 2025, 9:31 PM

#

Well since the bot is designed to run on its own TGWUI instance and not via API, I'm not sure exactly how the extension could be beneficial...

#

What does it do that discord voice input cannot?

valid crypt Feb 12, 2025, 9:32 PM

#

right now what im thinking is just make it work, and leave all the problem for a further future

halcyon quarry Feb 12, 2025, 9:33 PM

#

There likely just needs to be some research on how voice input from voice channel can be captured appropriately to text

#

A listener function that uses discord code

valid crypt Feb 12, 2025, 9:33 PM

#

make a separated program, a second bot just for audio input, steal some code, make fake inputs to the main bot

halcyon quarry Feb 12, 2025, 9:34 PM

#

I'd likely just need to add some new "Task" or parameter to existing task, that will ensure no text response from bot, only play response on VC

#

I've been engrossed in this game lately, the ladder season is almost over and I'm definitely sitting out the next one

#

will be back in the saddle

valid crypt Feb 12, 2025, 9:37 PM

#

:v

valid crypt Feb 13, 2025, 2:40 PM

#

why didnt you add you bot to https://github.com/oobabooga/text-generation-webui-extensions

GitHub

GitHub - oobabooga/text-generation-webui-extensions

Contribute to oobabooga/text-generation-webui-extensions development by creating an account on GitHub.

#

ohhhh

halcyon quarry Feb 13, 2025, 2:41 PM

#

Well it's not technically an extension

#

I could ask ooba if he thinks it could be considered an exception... (disclaimer about what it actually is, etc)

valid crypt Feb 13, 2025, 2:46 PM

#

your is not too far away

halcyon quarry Feb 13, 2025, 2:47 PM

#

I'll see!

valid crypt Feb 13, 2025, 2:47 PM

#

although i never understood why your bot cant use the webui or the api

halcyon quarry Feb 13, 2025, 3:00 PM

#

If you enjoy character specific TTS settings (voices, etc), and TTS streaming - these are not possible via the API

#

Well, the TTS streaming may be possible... really not sure about that.
But definitely cannot adjust extension parameters via API.

#

Pushed small update - added new option to '/prompt' cmd

Can now force the response type (text / image / text+image)

halcyon quarry Feb 13, 2025, 4:58 PM

#

@valid crypt I submitted a PR to the extension list to add the bot, thanks for the suggestion!

valid crypt Feb 13, 2025, 5:01 PM

#

halcyon quarry If you enjoy character specific TTS settings (voices, etc), and TTS streaming - ...

I meant openai api

halcyon quarry Feb 13, 2025, 5:01 PM

#

I know - openai API is the TGWUI API

valid crypt Feb 13, 2025, 5:02 PM

#

I meant let others use the api while bot is running

halcyon quarry Feb 13, 2025, 5:03 PM

#

It may be possible to run 2 separate instances of TGWUI, if using custom flags with the bot such as unique port, etc

#

I know what you mean is like, an option to run all the UI related code as well instead of how the bot currently executes the backend code on startup

#

I have an (outdated) dev version of the bot which successfulyl uses the openai API - TGWUI launches normally and can be used in the UI simultaneously, etc

#

but this version of the bot does not launch TGWUI

#

and also complicates the settings management, and also makes some features impossible like the TTS voices

valid crypt Feb 13, 2025, 5:50 PM

#

f

halcyon quarry Feb 14, 2025, 2:59 AM

#

@valid crypt https://github.com/oobabooga/text-generation-webui-extensions?tab=readme-ov-file#ad_discordbot-altoiddealers-discordbot

GitHub

GitHub - oobabooga/text-generation-webui-extensions

Contribute to oobabooga/text-generation-webui-extensions development by creating an account on GitHub.

valid crypt Feb 14, 2025, 7:37 AM

#

👍

halcyon quarry Feb 14, 2025, 7:36 PM

#

Pushed another update for /prompt cmd

Yet another option, load_history to specify how much history to load for the interaction
The /prompt cmd can now be explicitly disabled from use in DMs via config.yaml

#

halcyon quarry Feb 15, 2025, 12:34 AM

#

What I need to add is for the bot to reply to show the user message

halcyon quarry Feb 15, 2025, 9:37 PM

#

OK - Now the bot will immediately send an embed reflecting the user's prompt and params used for /prompt cmd

#

(don't think system message is applicable for this model/mode)

halcyon quarry Feb 21, 2025, 4:12 AM

#

I had an idea for a new tag which could be pretty useful… “run_code” which would be a filename, and a companion tag “send_code_result” which would be a format to listen for and send (text, audio, video, etc)

#

Would be a bit advanced for some but would add a lot of flexibility to what the bot could actually do

valid crypt Feb 21, 2025, 6:26 PM

#

didnt understand

#

exams are driving me crazy

halcyon quarry Feb 21, 2025, 6:29 PM

#

like a user could define a tag that triggers for some phrase, and maybe I make some syntax that can optionally pass values into whatever code is being executed

#

like multiply >>678<< and >>2000<< and the tag will run a code that multiplies 2 values. Crude example.

valid crypt Feb 21, 2025, 6:30 PM

#

ahhh

halcyon quarry Feb 21, 2025, 6:31 PM

#

But it could be whatever code, could be something that generates and returns a video for instance

#

It would just add another advanced tool for users to think about using

valid crypt Feb 21, 2025, 6:33 PM

#

some model are trained to be able to use tools

#

the next step is make it an agent huh

fickle ember Feb 21, 2025, 9:11 PM

#

is it possible to specify which gpu the bot uses? i have a dual gpu setup and my main gpu does not have enough vram to load my models.

terse folio Feb 21, 2025, 9:14 PM

#

fickle ember is it possible to specify which gpu the bot uses? i have a dual gpu setup and m...

You should be able to set it in the webui in the models tab where you load your models.

Try saving settings there for how you want that model to be loaded.
You could also try the cmd_flags in ad_discordbot (was that a thing?) or just the normal cmd_flags to specify the gpu split

fickle ember Feb 21, 2025, 9:14 PM

#

terse folio You should be able to set it in the webui in the models tab where you load your ...

noted will try ty

halcyon quarry Feb 21, 2025, 9:22 PM

#

There is a CMD Flags file for both TGWUI, and one with the bot. Should be able to set the flag for this with the bots cmd flags file

fickle ember Feb 21, 2025, 9:24 PM

#

halcyon quarry There is a CMD Flags file for both TGWUI, and one with the bot. Should be able ...

what cmd flags should i use? is there somewhere i can find a list of them?

halcyon quarry Feb 21, 2025, 9:26 PM

#

On the TGWUI repo there is an expanding text labeled List of command-line flags

#

https://github.com/oobabooga/text-generation-webui

#

So Ctrl+F to jump to that and open it up

halcyon quarry Feb 21, 2025, 10:51 PM

#

@fickle ember welcome to the channel btw, let me know if you have any feedback on the bot 🤗

fickle ember Mar 3, 2025, 4:46 AM

#

halcyon quarry <@1276221874275352646> welcome to the channel btw, let me know if you have any f...

ive been using the bot for some time now. I have some feedback.

I notice that as conversations stretch the AI more or less loses its personality and forgets details about itself which are specified in the character.yaml files
The bot is unable to identify what youre talking about when you reply to someone and talk to the bot. i reviewed the console and it only seems to read the message outright not taking the message being replied to into account, in some cases this information would be vital.

halcyon quarry Mar 3, 2025, 4:52 AM

#

Thanks! In regards to 1. this is not exclusive to the bot; this will happen in the webui as well. You can try using a system message, or maybe even limit chat history

fickle ember Mar 3, 2025, 4:53 AM

#

halcyon quarry Thanks! In regards to 1. this is not exclusive to the bot; this will happen in ...

can you explain what a system message is and how i can set one up? i appreciate you getting back to me so quickly.

halcyon quarry Mar 3, 2025, 4:55 AM

#

About 2.- if you are suggesting that the bot might get an automatic prefix to the message like “(user X is replying to user Y’s original message which was ‘blah blah blah’)” then this could be an interesting idea, assuming I can get the message content from replies (would have to check into this)

fickle ember Mar 3, 2025, 4:56 AM

#

halcyon quarry About 2.- if you are suggesting that the bot might get an automatic prefix to th...

i think if you could figure out a way to do this, it would make the bots conversation abilities way better and flow more naturally like another user.

halcyon quarry Mar 3, 2025, 4:57 AM

#

System message is only applicable if your model’s template supports it and you’re in that mode (ei: chat template, or instruct template). The TGWUI code chooses the most appropriate template automatically so you’ll have to look into it a bit to see what template is loading for the model

#

Chat instruct mode might also help… i believe this prefixes your prompts with an instruction

fickle ember Mar 3, 2025, 5:01 AM

#

appreciated. i will try this.

halcyon quarry Mar 3, 2025, 5:02 AM

#

I’ll def look into that idea, hadn’t considered that before. I do have some other things on the backburner to make it behave more natural as well

fickle ember Mar 3, 2025, 5:03 AM

#

halcyon quarry I’ll def look into that idea, hadn’t considered that before. I do have some oth...

how long do you think it will take to get some of these ideas out? my friends and i are really enjoying the bot

halcyon quarry Mar 3, 2025, 5:03 AM

#

A few features I want to add at once all under a “server mode” setting

#

Honestly, within a month or two. Been engrossed in this game that has a ladder season which doesn’t have a fixed date but ending relatively soon

#

Ttyl though goin to be now

#

Bed*

valid crypt Mar 3, 2025, 6:17 PM

#

fickle ember ive been using the bot for some time now. I have some feedback. 1. I notice tha...

for problem 1 i think it could be either too much context or cutting context, the model that you are using also affects the quality, i would like to know about the thing in the image

halcyon quarry Mar 3, 2025, 6:18 PM

#

for image generation I retain zero chat history for this reason - chat history will incrementally make the responses get worse and worse from desired result

valid crypt Mar 4, 2025, 7:56 PM

#

finally got some time to do stt, gonna start with the ez way, another bot only for stt, right now gonna go with whisper although these are interesting too.
(~~https://github.com/modelscope/FunASR~~)
https://github.com/FunAudioLLM/SenseVoice
https://github.com/k2-fsa/sherpa-onnx

GitHub

GitHub - FunAudioLLM/SenseVoice: Multilingual Voice Understanding M...

Multilingual Voice Understanding Model. Contribute to FunAudioLLM/SenseVoice development by creating an account on GitHub.

GitHub

GitHub - k2-fsa/sherpa-onnx: Speech-to-text, text-to-speech, speake...

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC...

#

it's a really good way, being really simple, and should be really effective as users can create a private text channels and let bots chat there, while in voice channel feels like totally nothing weird :v

#

not the most elegant way but yeah

halcyon quarry Mar 4, 2025, 8:01 PM

#

Two bots then?

valid crypt Mar 4, 2025, 8:02 PM

#

some day when i get better i might be able to fuse them

#

another benefit of this is the environment? and i can use another machine for it :)

halcyon quarry Mar 4, 2025, 8:07 PM

#

Very cool

valid crypt Mar 4, 2025, 9:36 PM

#

uhhhh

#

as you are using discord.py i think i should go with the extension...

valid crypt Mar 4, 2025, 9:58 PM

#

bruh NotImplementedError: aead_xchacha20_poly1305_rtpsize

#

#

bruh

#

nah, big win

halcyon quarry Mar 4, 2025, 10:17 PM

#

So is it working?

valid crypt Mar 4, 2025, 10:21 PM

#

no more error but im still fighting for it

#

the fork works

#

i think

#ad_discordbot (Fork of Fork of xNul's bot)

Pushed fixed and improved SD Img Gen Embed

Pushed new setting to define imgmodel filters per-server

Pushed an update that resolves this

Pushed an update

Pushed update adding new condition 'only_with_tags'

Pushed an update which allows loractl with ReForge

Pushed an edge case minor update

Pushed some changes regarding /image cmd

Pushed small update - added new option to '/prompt' cmd

Pushed another update for /prompt cmd