ad_discordbot (Fork of Fork of xNul's bot) | Text Generation WebUI | Page 20

terse folio Mar 4, 2025, 10:47 PM

#

I can get you a working example when I get home.
but it wasnt that bad to get working iirc

valid crypt Mar 4, 2025, 11:18 PM

#

🥳

valid crypt Mar 5, 2025, 12:34 AM

#

https://tenor.com/view/good-night-gif-5479725380357410088


terse folio Mar 5, 2025, 12:59 AM

#

I'm not sure about the encryption stuff, will have to test it

terse folio Mar 5, 2025, 6:21 AM

#

I think I spammed discord's ratelimits too much tonight.
I'm stripping out all the unnecessary stuff like external libs and complexity for an example to echo your voice back to you

halcyon quarry Mar 5, 2025, 1:31 PM

#

The next code I’m going to add is to make it possible to run the bot in a non-TGWUI environment for the image generation capabilities only

#

Most of the code to allow this already exists in the bot

#

I basically just need to update the batch files that launch the bot so that they create a new virtual environment if the text generation one is not found, install the few requirements that the bot currently relies on from the text generation environment, and finally skip trying to import text generation modules when it’s configured for image generation only
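The "skip the imports" part of that plan could look roughly like the sketch below. It is only an illustration: `load_text_modules`, `load_optional` and the `image_only` flag are hypothetical names, not the bot's actual code, and the real module names would come from the TGWUI environment.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def load_text_modules(image_only, module_names):
    """Import text generation modules unless running in image-only mode.

    `image_only` is a hypothetical config flag; nothing is imported when it is set.
    """
    if image_only:
        return {}
    loaded = {}
    for name in module_names:
        module = load_optional(name)
        if module is not None:
            loaded[name] = module
    return loaded
```

With a guard like this, the rest of the bot can check whether a module is present in the returned dict instead of assuming the text generation environment exists.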

valid crypt Mar 5, 2025, 8:44 PM

#

terse folio I think I spammed discord's ratelimits too much tonight. I'm stripping out all t...

i didn't have time today

#

still waiting for it though

valid crypt Mar 5, 2025, 10:42 PM

#

im getting it! a little slow, it should run on gpu though, but the model is the base whisper which shouldnt be too slow even on cpu...

valid crypt Mar 5, 2025, 11:02 PM

#

a bunch of bugs

#

well one big bug

halcyon quarry Mar 5, 2025, 11:03 PM

#

Ok I see that this bot writes your text

terse folio Mar 5, 2025, 11:03 PM

#

valid crypt im getting it! i little slow, it should run on gpu thought, but the model is the...

that's awesome!
you got it hearing you so far

valid crypt Mar 5, 2025, 11:15 PM

#

alr the thing works, it is pretty fast. what should happen is that it detects 1s of silence to chunk and 2s of silence to do the final stt and send the message. the timing works, but only when you speak again.
its something like: i spoke (3 years later),
then a little bit before i speak again: oh, it has been more than 2s, send the message. then: yes yes marcos im hearing you

#

i spoke 2025-03-06 00:04:50,083 - INFO - Started hearing from Marcos
I spoke for 2.38s but the transcribe log is much after 2025-03-06 00:04:58,946 - INFO - Transcribing chunk for Marcos with audio length 2.38 seconds
It happened a couple of milliseconds before i spoke 2025-03-06 00:04:59,181

```
2025-03-06 00:04:58,946 - INFO - Transcribing chunk for Marcos with audio length 2.38 seconds
2025-03-06 00:04:58,946 - INFO - Starting transcription for temp_323088470241312774_199565.wav
2025-03-06 00:04:59,181 - INFO - Started hearing from Marcos
2025-03-06 00:05:00,388 - INFO - Transcription completed for temp_323088470241312774_199565.wav:  Hello, hello, hello
2025-03-06 00:05:02,837 - INFO - Sending transcription for Marcos:  Hello, hello, hello
```
and also the silence log disappeared here D:
it feels like if i dont speak it pauses
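One possible reading of the log above is that the silence check only runs when a new voice packet arrives, so nothing happens while nobody is talking. A minimal sketch of the alternative, assuming the check is driven by a timer instead of by packets, is below. `SilenceTracker` is a hypothetical helper, not part of any library mentioned here, and the 1s / 2s values are just the ones from the message above.

```python
import time


class SilenceTracker:
    """Tracks time since the last voice packet using a clock, not packet arrival."""

    def __init__(self, chunk_after=1.0, final_after=2.0, clock=time.monotonic):
        self.chunk_after = chunk_after    # seconds of silence before chunking
        self.final_after = final_after    # seconds of silence before the final STT
        self.clock = clock
        self.last_audio = None            # None means nobody has spoken yet
        self.chunked = False

    def on_audio(self):
        """Call whenever a real voice packet arrives."""
        self.last_audio = self.clock()
        self.chunked = False

    def poll(self):
        """Call on a fixed timer (for example every 100 ms).

        Returns "chunk", "final" or None.
        """
        if self.last_audio is None:
            return None
        silent_for = self.clock() - self.last_audio
        if silent_for >= self.final_after:
            self.last_audio = None
            self.chunked = False
            return "final"
        if silent_for >= self.chunk_after and not self.chunked:
            self.chunked = True
            return "chunk"
        return None
```

The point of the design is that `poll()` keeps firing during silence, so the "final" event does not have to wait for the next word.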

#

gonna continue tomorrow

#

i hope i get it done before sunday

terse folio Mar 5, 2025, 11:23 PM

#

have you looked into live whisper repos?
They do something like collecting audio as it comes in and adding it to a buffer (like 30s)
Process on that and return the output of high confidence tokens.
And trim the audio/text.

There are some issues with this too, like if it doesn't hear you finish your sentence it might continue retrying to process the same buffer over and over until it gets your final words.
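The buffer-and-trim idea described above can be sketched as follows. This is a simplified stand-in, not the code of any live whisper repo: instead of real token confidences it commits only the words that two consecutive transcription passes agree on, and `RollingBuffer` / `StreamingCommitter` are hypothetical names.

```python
from collections import deque


class RollingBuffer:
    """Keeps only the most recent `max_seconds` of audio samples."""

    def __init__(self, max_seconds=30.0, sample_rate=16000):
        self.samples = deque(maxlen=int(max_seconds * sample_rate))

    def extend(self, pcm):
        self.samples.extend(pcm)


def common_prefix(a, b):
    """Return the leading items that both lists share."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out


class StreamingCommitter:
    """Commits only the words that two consecutive hypotheses agree on."""

    def __init__(self):
        self.committed = []
        self.previous = []

    def update(self, hypothesis):
        """`hypothesis` is the word list transcribed from the whole current buffer.

        Returns the words that became stable on this pass.
        """
        agreed = common_prefix(self.previous, hypothesis)
        new_words = agreed[len(self.committed):]
        self.committed.extend(new_words)
        self.previous = hypothesis
        return new_words
```

The retry problem mentioned above shows up here too: the last words are never committed until a later pass repeats them, so an unfinished sentence stays pending.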

#

What would be cool is if we can run the speech detection model that preprocesses before being sent to whisper to do the cutoffs more intelligently.

#

If you're going to do turn based voice chat,
I recommend starting with voice messages because there you don't have to worry about pauses and it could be a cool interface!
The user decides when they're done

valid crypt Mar 6, 2025, 2:25 PM

#

terse folio If you're going to do turn based voice chat, I recommend starting with voice mes...

didnt understand

#

i did think about realtime whisper, and instead of sending everything it hears, only start when there's a keyword like **hey** and send the message when there's a **that's it**
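A rough sketch of that keyword idea, working on already transcribed text segments, might look like this. `KeywordGate` is a hypothetical helper and the plain substring match is an assumption made for brevity (it would also trigger on words like "they"), so a dedicated wake-word library would likely do better.

```python
class KeywordGate:
    """Collects text between a start keyword and an end phrase."""

    def __init__(self, start_word="hey", end_phrase="that's it"):
        self.start_word = start_word
        self.end_phrase = end_phrase
        self.recording = False
        self.parts = []

    def feed(self, segment):
        """Feed one transcribed segment; return the finished message or None."""
        text = segment.strip()
        if not self.recording:
            idx = text.lower().find(self.start_word)
            if idx == -1:
                return None  # ignore everything until the keyword is heard
            self.recording = True
            text = text[idx + len(self.start_word):]
        self.parts.append(text.strip())
        joined = " ".join(p for p in self.parts if p)
        end = joined.lower().find(self.end_phrase)
        if end == -1:
            return None
        # End phrase heard: reset and hand back what was said in between.
        self.recording = False
        self.parts = []
        return joined[:end].strip(" ,.")
```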

#

if the problem is for the future me, it not my problem 👍

#

gonna leave more features for the future me

terse folio Mar 6, 2025, 5:03 PM

#

valid crypt didnt understand

I was thinking the idea is pretty similar, but just taking a slightly different route to achieve.

Also that's a good idea to use a trigger keyword, that can save a lot on processing.
There are some libraries dedicated for that too

valid crypt Mar 6, 2025, 10:03 PM

#

getting a somehow working version but just takes random time to send the message

#

also gonna fix that

valid crypt Mar 6, 2025, 11:28 PM

#

im dying

terse folio Mar 6, 2025, 11:44 PM

#

what's going on?

valid crypt Mar 6, 2025, 11:44 PM

#

fixed audio processing getting paused, but takes random time around 1-12s

#

and i cant get it fixed

#

but anything else is pretty good

#

take a look if you want, gonna check https://github.com/davabase/whisper_real_time and https://github.com/ufal/whisper_streaming

📎 discord-whisper-bot.zip

GitHub

GitHub - davabase/whisper_real_time: Real time transcription with O...

Real time transcription with OpenAI Whisper. Contribute to davabase/whisper_real_time development by creating an account on GitHub.

GitHub

GitHub - ufal/whisper_streaming: Whisper realtime streaming for lon...

Whisper realtime streaming for long speech-to-text transcription and translation - ufal/whisper_streaming

terse folio Mar 6, 2025, 11:51 PM

#

I could take a look at it,
i have a few ideas, like how do you check that it has been silent for a certain amount of time with background noise

#

I think that 2nd one https://github.com/ufal/whisper_streaming is what I used

#

wow looks like a lot going on in the bot

#

one idea is that the audio processing stops when there's silence, have you tried printing when silence is detected?

#

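```python
from discord.ext import voice_recv

# callback and vc (the connected voice client) are defined elsewhere in the bot
sink_cb = voice_recv.BasicSink(callback)
sink_silence = voice_recv.SilenceGeneratorSink(sink_cb)
vc.listen(sink_silence)
```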

Wrapping the voice sink in a silence generator will create silent packets when the user stops talking so the silence detection loop has silent packets to work with

#

i havent tested running it yet

valid crypt Mar 7, 2025, 12:08 AM

#

i tried to implement silence detection, but there is something stopping it

#

the silence threshold i made is nearly useless

terse folio Mar 7, 2025, 12:11 AM

#

Silence thresholds aren't easy, they will have to be tweaked for everyone's mic

Since you're using the whisper-stream library, you can perhaps track based on whether the sentence has ended
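One common way to avoid a hand-tuned threshold per mic is to track the background noise level and compare each frame against it. The sketch below assumes 16-bit signed PCM frames in native byte order; `AdaptiveSilenceGate` and its default values are hypothetical and would still need tuning.

```python
import array
import math


def rms(pcm_bytes):
    """Root mean square level of a 16-bit signed PCM frame."""
    samples = array.array("h", pcm_bytes)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class AdaptiveSilenceGate:
    """Treats a frame as silent when it is close to the running noise floor."""

    def __init__(self, margin=2.0, smoothing=0.05, initial_floor=100.0):
        self.margin = margin            # how far above the floor counts as speech
        self.smoothing = smoothing      # how quickly the floor follows quiet frames
        self.noise_floor = initial_floor

    def is_silent(self, frame):
        level = rms(frame)
        silent = level < self.noise_floor * self.margin
        if silent:
            # Only quiet frames update the floor, so speech does not raise it.
            self.noise_floor += self.smoothing * (level - self.noise_floor)
        return silent
```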

#

Also whisper large v3 has an issue with never finishing sentence punctuation iirc.

#

large v2 works great

valid crypt Mar 7, 2025, 12:15 AM

#

actually whisper base is really fast

#

the silence detection should work

#

but for some reason there is a 10s waiting time and i dont know why it is there

valid crypt Mar 7, 2025, 12:34 AM

#

i think there's something about the way it retrieves and processes voice data that makes it really slow?

#

i think thats it, i normally disable turbo to save energy and not waste power, so the cpu runs at a low frequency

#

shouldn't be a big problem for gaming and other stuff, but it is for my python code 😓

#

i hate my life

#

me--> 😪

halcyon quarry Mar 7, 2025, 3:35 AM

#

🤗

terse folio Mar 7, 2025, 5:23 AM

#

Running with the tiny model it seems fine, responds within a second of finishing talking

marsh harness Mar 7, 2025, 12:34 PM

#

With the reasoning models, is it the stopping strings that need to be modified/updated so that the bot doesn't keep speaking beyond its initial response, as well as not outputting </think> at the end of its responses, or are those two separate issues?

#

Even going through deepseek r1's release, I haven't seen any examples of what would need to be specified in order to remove the </think> from the text output. I get that these models are meant to be run where you can see the thinking/context window so you otherwise wouldn't see that, but it outputs it in ooba and when speaking to the bot on discord.

#

Both deepseek r1 and now QwQ do it, and it makes sense if it's related to the thinking/reasoning tokens. Bartowski has both exl2 quants and ggml files on HF for QwQ if you haven't tried it out yet.

halcyon quarry Mar 7, 2025, 1:46 PM

#

Of course it could be used as a stopping string if you wanted

#

If the responses are generally too verbose, there are other settings for that
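For the `</think>` question, the two things can be handled separately in post-processing. The sketch below is not how the bot or TGWUI does it; `strip_reasoning` and `truncate_at_stop` are hypothetical helpers that show the difference between removing reasoning tags and cutting at a stopping string.

```python
import re

# A complete reasoning block, including any whitespace that follows it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
# A leftover opening or closing tag with no matching partner.
_STRAY_TAG = re.compile(r"</?think>")


def strip_reasoning(text):
    """Remove complete <think>...</think> blocks, then any stray tags."""
    text = _THINK_BLOCK.sub("", text)
    text = _STRAY_TAG.sub("", text)
    return text.strip()


def truncate_at_stop(text, stopping_strings):
    """Cut the text at the earliest stopping string, if one is present."""
    cut = len(text)
    for stop in stopping_strings:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```

Using `</think>` itself as a stopping string would end generation at the tag, while `strip_reasoning` keeps whatever comes after it.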

marsh harness Mar 7, 2025, 2:20 PM

#

👍

valid crypt Mar 7, 2025, 7:01 PM

#

valid crypt shouldn't be a big problem for gaming and others stuffs but my python code 😓

i think it wasnt my fault, the extension depends a lot on the cpu to decrypt, making it slow if the cpu is running at a low clock speed

#

but man, if i dont remove the cap for clock speed, 35w for doing nothing and 40~70w for moving my mouse is

#

https://tenor.com/view/hearties-wcth-when-calls-the-heart-sc-heart-home-coughing-gif-25504753


valid crypt Mar 7, 2025, 7:57 PM

#

these days my pc is crashing, i think my intel cpu is cooked

fickle ember Mar 7, 2025, 10:20 PM

#

is it possible to make the bot capable of being user installed

#

to make it usable in dms

valid crypt Mar 7, 2025, 10:48 PM

#

it is usable in dms

#

in your setting file you can turn on dm

#

config.yaml

#

although it is not user-installed, it can still be used

valid crypt Mar 8, 2025, 1:24 AM

#

@terse folio i added a new asr engine somehow and broke all my loggings, not fixing that as it works just fine i think :)
https://github.com/marcos33998/asr_discordbot/tree/MoreEngine

GitHub

GitHub - marcos33998/asr_discordbot at MoreEngine

A speech-to-text (STT) bot for Discord that joins voice channels and transcribes conversations. - GitHub - marcos33998/asr_discordbot at MoreEngine

#

now the question is how do i plug it to https://github.com/altoiddealer/ad_discordbot

GitHub

GitHub - altoiddealer/ad_discordbot: Discord bot which transforms y...

Discord bot which transforms your servers into hubs for limitless local AI-driven interaction and content creation. Features cutting-edge tools for professionals, and unlocks creative fun for casua...

#

https://tenor.com/view/think-meme-thinking-memes-memes-2024-gif-6703217797690493255


halcyon quarry Mar 8, 2025, 3:21 AM

#

fickle ember is it possible to make the bot capable of being user installed

Go into your Discord Developer portal, then Apps, Bot > OAuth2. Be sure that you check off pretty much all the checkboxes when generating the bot invite URL.

#

And yes as Marcos said you’ll also need to enable that one setting in the config.yaml

#

By default most commands are hardcoded as disabled for users via DM, but the bot owner (you) can use almost all cmds via DMs

halcyon quarry Mar 8, 2025, 3:25 AM

#

valid crypt <@226121791670583296> i added a new asr engine somehow and broke all my loggings...

So this is working pretty much exactly how you want?

#

I could try to make some time to integrate the feature

valid crypt Mar 8, 2025, 12:09 PM

#

works just fine, with some flaws that i dont think it is my problem

#

the time it takes to collect all the packets is demonic

#

im confused if it is my cpu, the extension or discord's fault

valid crypt Mar 8, 2025, 1:20 PM

#

the water of retrieving audio is deeper than i thought

valid crypt Mar 8, 2025, 4:24 PM

#

@halcyon quarry can you try those benchmark scripts and give the results? and is it possible to somehow use partially pycord? also how hard would it be to move to pycord?

requirements are pynacl for both i think
and discord-ext-voice-recv @ git+https://github.com/Aviana/discord-ext-voice-recv for the discord.py one

i got really good results with pycord and demonic ones with discord.py

GitHub

GitHub - Aviana/discord-ext-voice-recv: Voice receive extension pac...

Voice receive extension package for discord.py. Contribute to Aviana/discord-ext-voice-recv development by creating an account on GitHub.

valid crypt Mar 8, 2025, 4:30 PM

#

valid crypt <@670018869418786816> can you try those benchmark scripts and give the results? ...

both use `import discord` 😓

valid crypt Mar 8, 2025, 5:59 PM

#

valid crypt <@670018869418786816> can you try those benchmark scripts and give the results? ...

discord.py uses !benchmark and pycord uses /benchmark

📎 discord.py.txt 📎 pycord.txt

halcyon quarry Mar 8, 2025, 8:01 PM

#

ad_bot already installs pynacl btw

valid crypt Mar 8, 2025, 8:12 PM

#

ad_bot cant have pycord 💀

terse folio Mar 8, 2025, 8:22 PM

#

pycord is a different discord bot library.
But discord.py was the first in python and probably still the most feature packed.

But yes there are other libraries in other languages that do voice receive out of the box

terse folio Mar 8, 2025, 8:23 PM

#

valid crypt <@226121791670583296> i added a new asr engine somehow and broke all my loggings...

also looking briefly at the utils.asr module, I can't find where "asr_manager" is defined

valid crypt Mar 8, 2025, 8:25 PM

#

forgot to upload that one 😱

#

should work now

#

pycord is made on top of discord.py, the voice recording was supported in discord.py but dev abandoned it while pycord even improved it

valid crypt Mar 8, 2025, 8:32 PM

#

terse folio pycord is a different discord bot library. But discord.py was the first in pytho...

i dont want to modify libraries but, both using discord...

terse folio Mar 8, 2025, 8:36 PM

#

what happened to the previous extension on discord.py?

valid crypt Mar 8, 2025, 8:36 PM

#

?

#

if you mean the original and not the fork, it was pretty much abandoned, and a few months later discord deprecated the decryption method it supported

#

tho the fork added support for the new encryption

#

and it is super slow

#

at least for me

halcyon quarry Mar 8, 2025, 8:53 PM

#

Sure seems like the decryption method is supported in discord.py

#

https://discordpy.readthedocs.io/en/stable/whats_new.html?highlight=chacha20#id2

terse folio Mar 8, 2025, 8:54 PM

#

oh I see,
I guess I havent updated yet or something

I was trying to get an echo test version of voice receive working, but joined the voice channel too much and discord locked me out for a day Xd.
hadn't really gotten there to test latency

halcyon quarry Mar 8, 2025, 8:58 PM

#

hmm

#

yeah my discord.py was outdated as well, I activated the TGWUI venv and used pip install discord.py -U which updated it to current, which includes the support for that encryption

#

So if we are to add this feature I'd just need to update requirements.txt to ensure it specifies v2.5 or newer
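In requirements.txt that would be a specifier like `discord.py>=2.5`. As a hedged extra, a startup check could warn when an older version is still installed. The helper names below are hypothetical and the version parsing is deliberately simple (leading digits of each dot-separated part).

```python
from importlib import metadata


def parse_version(version):
    """Turn '2.5.0' into (2, 5, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum=(2, 5)):
    """True when the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def installed_discord_py_version():
    """Return the installed discord.py version string, or None if absent."""
    try:
        return metadata.version("discord.py")
    except metadata.PackageNotFoundError:
        return None
```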

terse folio Mar 8, 2025, 9:01 PM

#

halcyon quarry So if we are to add this feature I'd just need to update requirements.txt to ens...

if the requirements.txt will replace the current installation, yea that should work ^^

halcyon quarry Mar 8, 2025, 9:03 PM

#

yes, my updater scripts first do a git pull, then install from requirements.txt

valid crypt Mar 8, 2025, 9:05 PM

#

halcyon quarry Sure seems like the decryption method is supported in discord.py

so discord.py restored support for voice receiving

#

?

halcyon quarry Mar 8, 2025, 9:06 PM

#

You said it was aead_xchacha20_poly1305_rtpsize yes? That's the method that the changelog says was implemented in discord.py 2.5

valid crypt Mar 8, 2025, 9:10 PM

#

i found this https://github.com/Rapptz/discord.py/pull/9288#issuecomment-1785942942

GitHub

Support for receiving audio from voice channels by Sheppsu · Pull R...

This PR turned into an extension module for reasons outlined in this comment, so use that.
This PR is based off #6507. I made a new branch and PR for reasons outlined in this comment on my old PR.
...

#

pure chaos

halcyon quarry Mar 8, 2025, 9:13 PM

#

That Issue is from 2023 my dude

#

#

Seems that this encryption method is implemented as of merely 3 weeks ago

valid crypt Mar 8, 2025, 9:14 PM

#

why would a guy update an extension when the main project actually does what the extension does

#

😓

#

they are playing with my mind

halcyon quarry Mar 8, 2025, 9:15 PM

#

Well let's say I made a bitching extension that does something that doesn't work in the main project

#

Now some time later, I imagine they want to just update a few blocks of code rather than rewrite or discontinue the extension

#

Anyway, the way to correctly capture / encrypt/decrypt voice channel data in discord.py is likely documented here:
https://discordpy.readthedocs.io/en/stable/index.html

#

the methods etc are probably very similar if not identical to how it works in this extension you found

terse folio Mar 8, 2025, 9:20 PM

#

valid crypt why would a guy update an extension when the main project actually does what the...

discord.py doesn't have voice recv because he couldnt figure out a good standard for it yet. so someone else made an extension

valid crypt Mar 8, 2025, 9:21 PM

#

terse folio discord.py doesn't have voice recv because he couldnt figure out a good standard...

researching again whether it has support now, so i can get rid of the extension

halcyon quarry Mar 8, 2025, 9:21 PM

#

Just took a quick look through the asr bot extension and it's not a crapton of lines or anything...

valid crypt Mar 8, 2025, 9:23 PM

#

asr bot extension?

halcyon quarry Mar 8, 2025, 9:24 PM

#

https://discordpy.readthedocs.io/en/stable/api.html#voice-related

#

This is what you said works yes?
<#1154970156108365944 message>

valid crypt Mar 8, 2025, 9:25 PM

#

that is my humble bot, using the extension to retrieve packets

halcyon quarry Mar 8, 2025, 9:25 PM

#

Ok I see that there is not a method like receive_audio_packet only a send_audio_packet

valid crypt Mar 8, 2025, 9:26 PM

#

could be encryption for sending

halcyon quarry Mar 8, 2025, 9:26 PM

#

actually...

valid crypt Mar 8, 2025, 9:26 PM

#

s

halcyon quarry Mar 8, 2025, 9:27 PM

#

might be part of this Opus library that its referencing all over in this section

#

https://discordpy.readthedocs.io/en/stable/api.html#audiosource

terse folio Mar 8, 2025, 9:32 PM

#

halcyon quarry might be part of this Opus library that its referencing all over in this section

discord.py only has voice send (think music bots)

Danny (the author) has stated he's not adding voice receive yet.
There's a discord.py discord if you need more info ^^

#

a few years ago

halcyon quarry Mar 8, 2025, 9:35 PM

#

Aight 🙂 Looks like the extension here (https://github.com/imayhaveborkedit/discord-ext-voice-recv) could just be added as a dependency if we implement whisper?

terse folio Mar 8, 2025, 9:35 PM

#

halcyon quarry Aight 🙂 Looks like the extension here (<https://github.com/imayhaveborkedit/di...

Can confirm this one works

valid crypt Mar 8, 2025, 9:36 PM

#

terse folio Can confirm this one works

doesn't work for me

#

i get 2025-03-08 22:31:46 - __main__ - ERROR - Benchmark failed: aead_xchacha20_poly1305_rtpsize

halcyon quarry Mar 8, 2025, 9:36 PM

#

I made this image way way way back when TGWUI was juuuuuuust getting off the ground

terse folio Mar 8, 2025, 9:36 PM

#

okay, will do a sanity check as it's been a moment since i updated ^^

#

i never got my echo test working for other reasons, but I should be able to get it to write to a wav file

halcyon quarry Mar 8, 2025, 9:37 PM

#

Here's a quick LTX-Video image2video I just executed in about 1 minute on my 4070ti (12gb vram)

valid crypt Mar 8, 2025, 9:37 PM

#

the one that works for me is https://github.com/Aviana/discord-ext-voice-recv

valid crypt Mar 8, 2025, 9:38 PM

#

halcyon quarry Here's a quick LTX-Video image2video I just executed in about 1 minute on my 407...

XD

halcyon quarry Mar 8, 2025, 9:39 PM

#

I'm extremely interested in actually getting the comfy UI / swarm support added in, for users to easily configure execution of various workflows via the bot and send to channel the expected output

#

The one thing that's a bit of a bummer about current video generation is that the models generate the whole video as essentially a full length diffusion process - not a sequential process that could be paused and resumed

terse folio Mar 8, 2025, 9:41 PM

#

That's fascinating to me too, i did a little research into it and found that you can upload custom comfyui nodes via api to be processed?

That would be cool for things like assigning entire workflows to tags

halcyon quarry Mar 8, 2025, 9:41 PM

#

So those sort of requests would stall the bot bigtime in a busy server

#

Yes, I believe there's some extreme flexibility for using the Comfy API

terse folio Mar 8, 2025, 9:42 PM

#

halcyon quarry So those sort of requests would stall the bot bigtime in a busy server

maybe think about some distribution.
Like giving the option for people to add multiple api urls for multiple comfyui servers

And have the bot pick the least used/free one
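A minimal sketch of that "pick the least used one" idea is below. `EndpointPool` is a hypothetical helper that only counts requests the bot itself has in flight; it does not query the servers, and the URLs in the usage note are placeholders.

```python
class EndpointPool:
    """Hands out the endpoint with the fewest requests currently in flight."""

    def __init__(self, urls):
        self.active = {url: 0 for url in urls}

    def acquire(self):
        """Pick the least busy endpoint and mark one more request on it."""
        url = min(self.active, key=self.active.get)
        self.active[url] += 1
        return url

    def release(self, url):
        """Mark one request on `url` as finished."""
        if self.active.get(url, 0) > 0:
            self.active[url] -= 1
```

Usage would be something like `url = pool.acquire()` before sending a workflow to a server such as `http://localhost:8188`, followed by `pool.release(url)` once the result comes back.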

#

can help there when the time comes

halcyon quarry Mar 8, 2025, 9:44 PM

#

Some pretty ambitious thoughts going through my head lately in regards to managing the generative endpoints

#

I've currently got some pretty fixed and rigid definitions for what an image request payload entails -- I was thinking how it would be much better to make it so the current payload stuff I have is like an example template, but that the bot could accept, process and send whatever the user defined without any errors.

valid crypt Mar 8, 2025, 9:47 PM

#

terse folio Can confirm this one works

somehow pycord actually connected using xsalsa20_poly1305_lite, why does this work???

halcyon quarry Mar 8, 2025, 9:47 PM

#

Doing this wouldn't really be too hard but I'd want to make like, a dedicated method to filter some of the client specific features I've written for A1111 / Forge / Reforge

terse folio Mar 8, 2025, 9:48 PM

#

halcyon quarry I've currently got some pretty fixed and rigid definitions for what an image req...

yea, apis change, people might want to use other tools.
Checking requests is useful if you know where it's going.

having the bot return the error the api returns should suffice ^^

terse folio Mar 8, 2025, 9:49 PM

#

valid crypt somehow pycord actually connected using xsalsa20_poly1305_lite, why this works??...

No clue tbh, I never had to mess with encryption stuff

halcyon quarry Mar 8, 2025, 9:49 PM

#

I think it already does this... I just have a lot of micromanagement going on that I could cut back on

#

But mainly, the overhaul I need would make it very very simple to manage payloads and settings for various APIs

terse folio Mar 8, 2025, 9:51 PM

#

Oooh, now I get the encryption error

#

weird that it worked a couple days ago...

halcyon quarry Mar 8, 2025, 9:51 PM

#

like a user directory for storing payloads, then just having one setting in config/py to specify the one you'll be using, or something

terse folio Mar 8, 2025, 9:52 PM

#

interesting, be safe with that.
Storing payloads in json/yaml files could mess things up if they aren't escaped properly
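One way to sidestep the escaping concern is to never build payload text by string formatting: parse the stored template first, then set user values as data and let the serializer do the escaping. The sketch below assumes JSON templates; `load_payload` and `fill_payload` are hypothetical names, not the bot's existing functions.

```python
import json
from pathlib import Path


def load_payload(path):
    """Read a stored payload template from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def fill_payload(template, overrides):
    """Return a copy of the template with user values applied as data.

    Values are assigned to keys, never spliced into JSON text, so quotes,
    braces and backslashes in a prompt cannot break the structure.
    """
    payload = json.loads(json.dumps(template))  # deep copy via a JSON round trip
    payload.update(overrides)
    return payload
```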

halcyon quarry Mar 8, 2025, 9:56 PM

#

Well I currently already do something like this, with basesettings.yaml

#

BTW - I've been using a lot of ComfyUI workflows lately, there's a ton of cool crap you can do without having to actually play with the spaghetti

valid crypt Mar 8, 2025, 9:58 PM

#

comfyui is cool

halcyon quarry Mar 8, 2025, 9:58 PM

#

Followed by, lots of additional cool crap you can do once you do know how to play with the spaghetti

valid crypt Mar 8, 2025, 10:01 PM

#

terse folio Running with the tiny model it seems fine, responds within a second of finishing...

you tried right?

terse folio Mar 8, 2025, 10:01 PM

#

yes, yesterday or the day before, everything was working.
And my other test bot was working.

today I'm having the encryption issue

#

I'll try out the fork later

halcyon quarry Mar 8, 2025, 10:06 PM

#

The default example i2v workflow for LTX-Video includes this prompt enhancing LLM that wrote a very good caption for this image

valid crypt Mar 8, 2025, 10:07 PM

#

terse folio I'll try out the fork later

try the benchmark too, mine is very slow, now that you have to switch to the fork you will suffer like me

halcyon quarry Mar 8, 2025, 10:07 PM

#

So my input was the image, and the prompt "a caveman reading something on his computer monitor"

valid crypt Mar 8, 2025, 10:07 PM

#

valid crypt discord.py uses !benchmark and pycord uses /benchmark

here

valid crypt Mar 8, 2025, 10:08 PM

#

halcyon quarry The default example i2v workflow for LTX-Video includes this prompt enhancing LL...

when i use image generation i still use words,words,asdasd,asd,asd,asd

#

what i find interesting about comfyui is that you can plug in a lot more things and have super complex workflows, i even saw people selling comfyui workflows 💀

halcyon quarry Mar 8, 2025, 10:11 PM

#

Yeah it’s the YouTubers all going full blown Patreon lol

#

It’s insanity really, anyone who can invest an hour or so just tinkering can figure out how to connect different workflows together and stuff

#

It’s so simple now that you can create Groups of nodes

#

You can toggle a complex feature on and off by just Bypass Group now

terse folio Mar 8, 2025, 10:29 PM

#

I had a similar view, and still do kinda...
My family often told me I should create tutorials or classes for friends. (random topics like programming)
And I often replied like "anyone could figure this out with a few hours of research"
or "it's really nothing special"
But sometimes we forget how niche the things we know are.

Here for example, a lot of us have some programming knowledge, and the node networks of comfyui are reminiscent of programming where one thing pipes into another.

But, especially nowadays where AI tools are mainstream, everyone wants to do it.
I can understand why people sell shortcuts.
But I appreciate it when there's an open source solution that you can compile yourself for free or download a working version for a donation

valid crypt Mar 8, 2025, 10:34 PM

#

people are getting dumber with tiktok

terse folio Mar 8, 2025, 10:36 PM

#

Yea, I can see that with ai tools as well, taking away reason to do your own problem solving 😭

valid crypt Mar 8, 2025, 10:37 PM

#

i still remember when my teacher said that the secondary school nowadays is

#

https://tenor.com/view/muppet-show-muppets-gonzo-chickens-chicken-gif-2112801932139156383


#

🤪

halcyon quarry Mar 9, 2025, 12:08 AM

#

Memebenders

fickle ember Mar 10, 2025, 3:35 AM

#

valid crypt

like this?

fickle ember Mar 10, 2025, 3:44 AM

#

halcyon quarry Go into your Discord Developer portal, then Apps, Bot > 0auth. Be sure that you...

it returns invalid scopes. im looking on the discord documentation what kinds of scopes i need but its a bit confusing

halcyon quarry Mar 10, 2025, 3:49 AM

#

fickle ember it returns invalid scopes. im looking on the discord documentation what kinds of...

Make sure you did everything noted here in Step 2 of the installation instructions
https://github.com/altoiddealer/ad_discordbot?tab=readme-ov-file#installation

#

Also check your server settings - Integrations > Bot

#

And Roles. Make sure the bot has privileges

fickle ember Mar 10, 2025, 3:58 AM

#

this is for making it userinstallable yes?

halcyon quarry Mar 10, 2025, 3:59 AM

#

It could be installed without any intents, scopes, etc, but then it’s not very useful

fickle ember Mar 10, 2025, 3:59 AM

#

i kinda got it to show up in another server its not installed in, albeit it didnt work fully

halcyon quarry Mar 10, 2025, 4:01 AM

#

In the developer portal, Bot, OAuth2 (I mentioned this yesterday)

#

This is where you generate an invitation link for your bot to join a server

#

The link changes as you check off the various permissions you want to allow the bot to have

fickle ember Mar 10, 2025, 4:02 AM

#

yeah

halcyon quarry Mar 10, 2025, 4:03 AM

#

So check off Bot, which expands a new section, and check basically everything. Copy paste the link in browser and invite the bot to the server. Repeat for additional servers

#

You may also need to give the bot a Role with more permissions like, Send Messages etc

#

In the server settings, or channel settings, etc
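For reference, the link the portal generates is just a URL with the client id, scopes and a permissions bitfield. A minimal sketch of building one by hand is below. It covers the server invite link only (not user install), `build_invite_url` is a hypothetical helper, the client id is a placeholder, and only four permission bits are listed as examples.

```python
from urllib.parse import urlencode

# A few permission bits from Discord's permission bitfield, as examples.
PERMISSIONS = {
    "view_channel": 1 << 10,
    "send_messages": 1 << 11,
    "connect": 1 << 20,
    "speak": 1 << 21,
}


def build_invite_url(client_id, scopes=("bot", "applications.commands"), permissions=()):
    """Build a bot invite URL from scopes and named permissions."""
    bits = 0
    for name in permissions:
        bits |= PERMISSIONS[name]
    query = urlencode({
        "client_id": client_id,
        "scope": " ".join(scopes),
        "permissions": bits,
    })
    return "https://discord.com/oauth2/authorize?" + query
```

An "invalid scopes" error usually points at the `scope` part of this URL, so comparing a hand-built link with the generated one can help narrow it down.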

valid crypt Mar 10, 2025, 2:39 PM

#

fickle ember like this?

you always can look for examples at ad_discordbot/settings_templates
or https://github.com/altoiddealer/ad_discordbot/tree/main/settings_templates

halcyon quarry Mar 10, 2025, 2:51 PM

#

He has the settings templates in correct place and all that, for sure

#

the bot copies them automatically if the user did not already

valid crypt Mar 11, 2025, 10:32 PM

#

wanted to fork pycord and change its name, and misclicked and did a PR 😓 i had never used codespaces or done a merge with vs code :v

#

got blocked really quick XD

#

also i think i misclicked a lot of things 😦

valid crypt Mar 11, 2025, 11:20 PM

#

how can i have pycord and discord.py at the same time?

#

this is killing me

#

i really dont want to use the extension of discord.py ;-;

#

actually i believe that they can be installed together

#

dont tell me that they really can be together .-.

#

dreaming too much f

#

actually got an idea, gonna try it tomorrow

halcyon quarry Mar 12, 2025, 2:34 AM

#

gowron3

valid crypt Mar 12, 2025, 3:52 PM

#

valid crypt actually got an idea, gonna try it tomorrow

@halcyon quarry is it okay for you to add another bot for stt?

#

although discord.py's extension could be added to the same bot, i'd prefer adding a separate bot with pycord as it is faster for me

halcyon quarry Mar 12, 2025, 4:10 PM

#

Does the bot send the transcribed text to the channel? If so, then my bot should already be able to handle it... although I'd probably want to add a tag like "regex_text" that could be a regex string to update the user text. In this case, ignore or modify the prefix that other bot adds

valid crypt Mar 12, 2025, 4:14 PM

#

what i was thinking is a deeper connection, but if a separated bot sending the transcription is good enough i will be happier, right now my bot sends displayname: transcribed text
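The "regex_text" idea could be as small as the sketch below, which strips a `displayname: ` prefix from the STT bot's message. The tag does not exist yet as far as this conversation shows, so `apply_regex_text`, `split_speaker` and the pattern are all hypothetical. The pattern also assumes names contain no colon and are at most 32 characters, and it would strip any short "word:" prefix, so real use may want a stricter pattern.

```python
import re

# Hypothetical pattern: a short name without colons, then ": " at the start.
SPEAKER_PREFIX = r"^\s*[^:\n]{1,32}:\s*"


def apply_regex_text(user_text, pattern, replacement=""):
    """Apply one regex substitution to the incoming user text."""
    return re.sub(pattern, replacement, user_text, count=1)


def split_speaker(message):
    """Return (speaker, text); speaker is None when there is no prefix."""
    match = re.match(r"^\s*([^:\n]{1,32}):\s*(.*)$", message, re.DOTALL)
    if match is None:
        return None, message
    return match.group(1).strip(), match.group(2)
```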

halcyon quarry Mar 12, 2025, 4:15 PM

#

Just trying to stay flexible 🤗

#

besides a customizable Regex tag has been on my mind for some time

valid crypt Mar 12, 2025, 4:17 PM

#

is there tag to stop tts?

#

actually, for voice chat anything should stop the tts

#

and to make it better, add "you were interrupted" to the context
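A rough sketch of that interruption behaviour is below. It assumes the voice client exposes `is_playing()` and `stop()` as discord.py's `VoiceClient` does; `interrupt_tts`, the history format and the note text are hypothetical and not taken from the bot.

```python
def interrupt_tts(voice_client, history, note="(You were interrupted while speaking.)"):
    """Stop TTS playback if it is running and record the interruption.

    `history` is a hypothetical list of {"role", "content"} dicts that gets
    sent to the LLM as context on the next turn.
    Returns True when playback was actually interrupted.
    """
    if voice_client is not None and voice_client.is_playing():
        voice_client.stop()
        history.append({"role": "system", "content": note})
        return True
    return False
```

Calling this whenever a new voice message arrives would give the "anything should stop the tts" behaviour, and the appended note lets the model know its last reply was cut off.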

halcyon quarry Mar 12, 2025, 4:22 PM

#

Suppose I need a "should_tts" tag

valid crypt Mar 12, 2025, 4:30 PM

#

i'll try to add multi speaker for now

valid crypt Mar 12, 2025, 7:44 PM

#

valid crypt i'll try to add multi speaker for now

had to move to pycord as the bot was made for discord.py so... multi speaker will be done before april 👍

#

also i'll have to move a lot of things to config but my bot is lightning fast now https://github.com/marcos33998/asr_discordbot/tree/pycord-MoreEngine

GitHub

GitHub - marcos33998/asr_discordbot at pycord-MoreEngine

A speech-to-text (STT) bot for Discord that joins voice channels and transcribes conversations. - GitHub - marcos33998/asr_discordbot at pycord-MoreEngine

terse folio Mar 14, 2025, 3:21 AM

#

valid crypt also i'll have to move a lot of thing to config but my bot is lightning fast now...

lightning fast in the transcriptions?
That's cool, I wonder what the difference was between pycord and the former?

valid crypt Mar 14, 2025, 4:59 PM

#

terse folio lightning fast in the transcriptions? That's cool, I wonder what the difference ...

At least for me discord.py was super slow

#

Before, it took me 5x the time to collect all the packets

#

My hypothesis is lack of optimisation

terse folio Mar 14, 2025, 5:10 PM

#

makes sense makes sense,
That's something i'm interested in testing timings with when i have free time

But i'm happy it's working good for you now 😸

valid crypt Mar 14, 2025, 5:19 PM

#

Although the biggest problem it has now is recording the whole voice channel in a single file

halcyon quarry Mar 14, 2025, 5:19 PM

#

I'm done with my video game vacation now so I'll be making progress again

halcyon quarry Mar 15, 2025, 2:21 AM

#

Result of the recent new image command option, LLM gave a very good prompt for Flux

#

This NeuralBeagle model will never cease to amaze me

valid crypt Mar 15, 2025, 8:33 PM

#

bot at the top :O

halcyon quarry Mar 15, 2025, 11:32 PM

#

Interesting, I had made a PR to add the bot but I definitely did not put it at the top

#

Either ooba checked out the bot and thought it deserved the recognition? Or someone else sneaky moved it in a PR lol

halcyon quarry Mar 16, 2025, 12:05 AM

#

Yeah, he totally moved it up

#

oobabooga ❤️

terse folio Mar 16, 2025, 12:26 AM

#

Congrats, that's really cool :)
The best way to get beaten to something,
oh misread a little, I thought you were considering the PR but someone did it before you

halcyon quarry Mar 16, 2025, 12:27 AM

#

I'm really stoked about it

#

That repository has instructions for requesting to have an extension added to the list, which is simply to clone the repo, update the list and send it as a PR.

I did this, but I had inserted my bot way down the list, just below Oobabot

#

It looks like shortly after, he decided to reorganize to sort what he thought to be most noteworthy to the top

terse folio Mar 16, 2025, 12:31 AM

#

that's epic!

halcyon quarry Mar 16, 2025, 12:32 AM

#

I only feel bad you don't have more recognition, you wrote some of the most impressive features that it offers

#

Well that's in my control eh 😛

terse folio Mar 16, 2025, 12:37 AM

#

I was glad to make some friends here, and support something cool ^^

halcyon quarry Mar 16, 2025, 12:42 AM

#

I’m grateful, very. I hope you’re also proud of this ascension of the bot in that list

#

Could not be in that coveted spot without you

#

I remember every contribution you made, notably you sorted out my chaotic single file making my life so much easier

terse folio Mar 16, 2025, 12:57 AM

#

halcyon quarry I’m grateful, very. I hope you’re also proud of this ascension of the bot in th...

Absolutely!

terse folio Mar 16, 2025, 12:58 AM

#

halcyon quarry I remember every contribution you made, notably you sorted out my chaotic single...

It's wonderful to see those parts still being used and expanded upon,

anyway, it's time for me to sleep for now, take care!

halcyon quarry Mar 16, 2025, 1:11 AM

#

@valid crypt thanks for pointing this out I’m literally drinking champagne over this

#

Mind is blown

valid crypt Mar 16, 2025, 1:39 AM

#

🫡

halcyon quarry Mar 16, 2025, 3:11 AM

#

halcyon quarry Mar 16, 2025, 1:12 PM

#

I think it could make sense to do the following:

- ship the bot with text and image generation each disabled by default.
- make it possible for the text generation to be run in 2 modes: API mode, or the custom TGWUI integration

#

While the bot is configured for anything besides text gen enabled + TGWUI integration, do not attempt to activate and rely on the TGWUI venv. Instead create own venv.

#

I suppose these would be better controlled via CMD Flags

halcyon quarry Mar 16, 2025, 4:42 PM

#

Yes, making some good progress on this... borrowed a lot of code from TGWUI setup so that it will create a venv and install requirements

#

I’m going to have it detect and ask how to handle venv setup

valid crypt Mar 16, 2025, 11:14 PM

#

got multi speaker working, https://github.com/marcos33998/asr_discordbot/tree/multiSpeaker , the next step would be filter tts from bot, or a more general blacklist, and packing responses, as i think it would be better on my side than in bot's side

#

i need a tag to stop tts on new message, and add "you were interrupted while speaking" or custom text to the context

valid crypt Mar 16, 2025, 11:34 PM

#

wait a second, why bot is replying to my bot when i speak but not when bot speaks? :O magic

valid crypt Mar 17, 2025, 12:18 AM

#

valid crypt i need a tag to stop tts on new message, and add "you were interrupted while spe...

Actually, what needs to happen is stopping tts if the user speaks, so either you add voice detection or your bot has to communicate somehow with mine

#

by taking a shower, i think this can be done with tags!

#

a tag to pause tts, to continue and to abort

halcyon quarry Mar 17, 2025, 1:46 AM

#

I feel a bit over my head with adjusting this installation procedure...

halcyon quarry Mar 17, 2025, 2:07 AM

#

In this process, I noticed that a requirement of the bot, pydub, has been expected to be present in the TGWUI environment -- but TGWUI does not seem to install this by default. It seems to only get installed by some common extension we've been using

#

In any case, I figured out the few requirements needed if not relying on TGWUI environment, and they're all now in requirements.txt

halcyon quarry Mar 17, 2025, 2:23 AM

#

valid crypt a tag to pause tts, to continue and to abort

I don’t think this is a viable solution

valid crypt Mar 17, 2025, 10:56 AM

#

A simple stop tts would be fine for now

valid crypt Mar 17, 2025, 11:00 AM

#

halcyon quarry I don’t think this is a viable solution

In my bot there are some false alarms, although they are filtered afterwards; to make the process fast, pausing the tts is imo the best way

halcyon quarry Mar 17, 2025, 11:34 AM

#

I’m sure someone has thought of this but I was thinking how it would be really cool if there was a such thing as future conditioning, when generating text “in the middle”

#

Something like this must already exist, but I just haven’t heard of it

#

The bot’s history management allows generating in the middle but only by omitting the future text. Would be neat if there was a mode where including it had an influence

halcyon quarry Mar 17, 2025, 12:09 PM

#

Eh nvm this is achievable by summarizing the future text and just better prompting including it

valid crypt Mar 17, 2025, 9:18 PM

#

at least a simple, fast tag to stop tts pls

halcyon quarry Mar 17, 2025, 11:13 PM

#

Alright, I’ll see about doing this tonight

valid crypt Mar 17, 2025, 11:38 PM

#

halcyon quarry Suppose I need a "should_tts" tag

i also want this :p

halcyon quarry Mar 18, 2025, 12:39 AM

#

@valid crypt https://github.com/altoiddealer/ad_discordbot/tree/tts_tags

#

Working on the "should_tts" tag now

#

I added this tag...
toggle_vc_playback: str

Changes playback in guild's voice channel where tag is triggered. Use with 'for_guild_ids_only' condition for selective control. Valid values: 'stop', 'pause', 'resume', 'toggle' (pauses or resumes)
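
A rough sketch of how the four valid values could map onto playback states. The state names (`playing`, `paused`, `idle`) and the function `next_playback_state` are assumptions for illustration; the bot's real handling of the voice client may differ, in particular when nothing is playing.

```python
VALID_ACTIONS = {"stop", "pause", "resume", "toggle"}


def next_playback_state(state: str, action: str) -> str:
    """Return the playback state after applying a toggle_vc_playback value.

    state is one of 'playing', 'paused' or 'idle' (hypothetical names).
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported toggle_vc_playback value: {action!r}")
    if action == "stop":
        return "idle"
    if state == "idle":
        # Nothing is queued, so pause/resume/toggle have nothing to act on.
        return "idle"
    if action == "pause":
        return "paused"
    if action == "resume":
        return "playing"
    # 'toggle' pauses when playing and resumes when paused.
    return "paused" if state == "playing" else "playing"
```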

halcyon quarry Mar 18, 2025, 1:42 AM

#

Ok I just added a should_tts tag as well - which is only useful if using value false to prevent TTS on the current interaction

#

@valid crypt Please let me know if these additions work expectedly, as I think they should!

valid crypt Mar 18, 2025, 3:14 PM

#

ok updating now

valid crypt Mar 18, 2025, 3:30 PM

#

toggle_vc_playback could have should_gen_text: false as default

#

actually with that off, it does not work

halcyon quarry Mar 18, 2025, 3:38 PM

#

toggle_vc_playback is not intended to have any influence on TTS generation - simply affecting the current playback state in the voice channel

valid crypt Mar 18, 2025, 3:38 PM

#

but it triggers generation

halcyon quarry Mar 18, 2025, 3:39 PM

#

You mean, the absence of this tag does not result in TTS generation?

#

Adding this tag makes TTS generation happen explicitly?

#

I have the other should_tts tag which is intended to explicitly prevent TTS from generating

valid crypt Mar 18, 2025, 3:41 PM

#

a tag triggering toggle results in toggling the tts, which means that it works, but the tag itself triggers text generation, and adding the tag that turns off generation makes the toggle not work anymore

halcyon quarry Mar 18, 2025, 3:45 PM

#

Please tell me how you feel the logic should be applied

valid crypt Mar 18, 2025, 3:48 PM

#

my intention was that with the tag the vc could be toggled, but the message that contains the tag should not generate text

halcyon quarry Mar 18, 2025, 3:49 PM

#

What happens if you include both tags?

valid crypt Mar 18, 2025, 3:49 PM

#

i thought it could be done with this tag but it makes the toggle stop working

halcyon quarry Mar 18, 2025, 3:50 PM

#

toggle_vc_playback and should_tts

#

toggle_vc_playback: stop
should_tts: false

valid crypt Mar 18, 2025, 3:50 PM

#

halcyon quarry ```toggle_vc_playback: stop should_tts: false```

that is something that i was going to try later

halcyon quarry Mar 18, 2025, 3:51 PM

#

Right well I think you're expecting toggle_vc_playback to do more than I think it should do

valid crypt Mar 18, 2025, 3:52 PM

#

😓 just it shouldn't take it as a message

halcyon quarry Mar 18, 2025, 3:53 PM

#

Maybe you have it triggering on the wrong condition

valid crypt Mar 18, 2025, 3:54 PM

#

there must be audio playing to be toggled, but by sending the toggle, you get a new audio (new response)

valid crypt Mar 18, 2025, 3:54 PM

#

halcyon quarry Maybe you have it triggering on the wrong condition

#

with the comment it works, removing the # it doesnt work

valid crypt Mar 18, 2025, 3:57 PM

#

halcyon quarry Right well I think you're expecting `toggle_vc_playback` to do more than I think...

my intention was let my asr bot to pause the tts if it was receiving audio packets

#

should tts

halcyon quarry Mar 18, 2025, 4:13 PM

#

oops

#

Sorry :< Just pushed the fix for that

#

Line 1673 in bot.py needed an await
tts_sw = await self.check_tts_before_llm_gen()
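
One plausible reason a missing `await` misbehaves here: calling an `async def` without awaiting it returns a coroutine object, which is always truthy, so a later `if tts_sw:` check would pass no matter what the function would have returned. The stand-in function below only borrows the name and is not the bot's real code.

```python
import asyncio


async def check_tts_before_llm_gen() -> bool:
    # Stand-in body (assumption): pretend the check decided "no toggle needed".
    return False


async def main():
    # Without await: a coroutine object, which is truthy regardless of the result.
    not_awaited = check_tts_before_llm_gen()
    looks_true = bool(not_awaited)
    not_awaited.close()  # tidy up so Python does not warn about it never being awaited

    # With await: the actual boolean returned by the function.
    awaited = await check_tts_before_llm_gen()
    return looks_true, awaited


result = asyncio.run(main())
```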

valid crypt Mar 18, 2025, 4:16 PM

#

valid crypt should tts

i hope that tag can be triggered by the bot itself too :v

halcyon quarry Mar 18, 2025, 4:21 PM

#

erm

#

It wouldn't make sense because the TTS would have already been generated

#

They are generated simultaneously

#

The tags system does not currently review and apply tags to every response "chunk" as it is generating text - it applies tags after the response has completed generating

valid crypt Mar 18, 2025, 4:23 PM

#

oh

halcyon quarry Mar 18, 2025, 4:24 PM

#

That'd be a bit of a tricky one to implement...

valid crypt Mar 18, 2025, 4:24 PM

#

add it to the to do list then, no hurry 😋

#

is that?

halcyon quarry Mar 18, 2025, 4:26 PM

#

I do have a special system in place for reviewing response chunks for "censored" text

#

Could slip in the special handling for TTS here

#

at the expense of, even more code complexity

#

It's currently running this code here for every response chunk. When initially building tags, it looks specifically for censor tags and keeps separate tabs on them for this function

#

Would have to do the same - make a list of should_tts tags as they are being built, then look for trigger text in every response chunk
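
A small sketch of the idea described above: keep a list of trigger phrases collected while tags are built, then check the response as it streams in. Scanning the accumulated text rather than each chunk alone catches triggers that are split across chunks. The function name and the use of plain substring matching are assumptions, not the bot's censor-tag code.

```python
from typing import Iterable, List, Optional


def first_trigger_in_stream(chunks: Iterable[str], triggers: List[str]) -> Optional[str]:
    """Return the first trigger phrase seen in the streamed text, else None."""
    lowered = [t.lower() for t in triggers]
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        haystack = buffer.lower()
        for original, needle in zip(triggers, lowered):
            if needle in haystack:
                # A caller could stop or skip TTS as soon as this returns.
                return original
    return None
```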

valid crypt Mar 18, 2025, 4:32 PM

#

no hurry :p

valid crypt Mar 18, 2025, 4:39 PM

#

halcyon quarry Line 1673 in bot.py needed an `await` ` tts_sw = await self.check_tts_bef...

changed manually and there is tts, this is my second time booting it, the first time there was nothing.

halcyon quarry Mar 18, 2025, 4:43 PM

#

Ok I solved one of 2 bugs there

#

this error here is due to the TTS streaming feature...

#

Try adding this little bit here which I think should resolve it

#

#

not sure if this will actually resolve it

#

Was TTS streaming working on the main branch?

#

with the new remote Alltalk v2?

#

It's possible that this feature is bugged for the new AllTalk (I haven't had time to test it out yet)

valid crypt Mar 18, 2025, 4:53 PM

#

halcyon quarry with the new remote Alltalk v2?

streaming works

valid crypt Mar 18, 2025, 4:53 PM

#

halcyon quarry

ill try that

halcyon quarry Mar 18, 2025, 4:55 PM

#

If that fails,

#

Add a print statement here and print chunk

#

print("chunk:", chunk)

#

Also one here - to print vis_resp_chunk

#

print("vis_resp_chunk:", vis_resp_chunk)

#

In any case I should probably add a line here before searching for the audio pattern: if vis_resp_chunk:

valid crypt Mar 18, 2025, 5:02 PM

#

ok

halcyon quarry Mar 18, 2025, 5:02 PM

#

I did add this line now... which will prevent error but won't fix unexpected bug

valid crypt Mar 18, 2025, 5:06 PM

#

there is no chunk

#

only vis

halcyon quarry Mar 18, 2025, 5:07 PM

#

It generated 0 tokens

valid crypt Mar 18, 2025, 5:08 PM

#

only when i trigger should tts

#

only the first time 😓

halcyon quarry Mar 18, 2025, 5:10 PM

#

Are you sure this only happens when using should_tts?

valid crypt Mar 18, 2025, 5:11 PM

#

============================================================
C:\text-generation-webui-main\installer_files\env\Lib\site-packages\llama_cpp_cuda_tensorcores\llama.py:1240: RuntimeWarning: Detected duplicate leading "<|begin_of_text|>" in prompt, this will likely reduce response quality, consider removing it...
  warnings.warn(
Llama.generate: 1334 prefix-match hit, remaining 99 prompt tokens to eval
llama_perf_context_print:        load time =    2194.38 ms
llama_perf_context_print: prompt eval time =     769.07 ms /    99 tokens (    7.77 ms per token,   128.73 tokens per second)
llama_perf_context_print:        eval time =    4166.61 ms /    45 runs   (   92.59 ms per token,    10.80 tokens per second)
llama_perf_context_print:       total time =    5032.58 ms /   144 tokens
Output generated in 5.04 seconds (8.93 tokens/s, 45 tokens, context 1465, seed 984739868)
vis_resp_chunk: <audio src="file/C:\text-generation-webui-main\extensions\alltalk_remote_tts\Ganyu_20250318-180936.wav" controls autoplay></audio>I&#x27;ve been thinking about our last conversation, and the way you described the view from the mountain still brings a smile to my face.
18:09:36.446 #3170   INFO [bot.__main__]: Marcos: "remain_silence"

halcyon quarry Mar 18, 2025, 5:11 PM

#

everywhere that I have this code, I see no evidence that it should affect text generation at all

valid crypt Mar 18, 2025, 5:12 PM

#

i typed correctly right?

halcyon quarry Mar 18, 2025, 5:13 PM

#

tags don't update as you modify the file though - use a command like /character to refresh tags

#

Or reload the bot

#

You have that defined correctly

valid crypt Mar 18, 2025, 5:14 PM

#

i wrote that a long time ago, and in the dict tag file

#

and rebooted a lot of times

halcyon quarry Mar 18, 2025, 5:16 PM

#

hmm..

valid crypt Mar 18, 2025, 5:16 PM

#

turning off streaming? or changing tts?

halcyon quarry Mar 18, 2025, 5:17 PM

#

try updating this block of code I have at line 1650

#

Replace with this block

    async def check_tts_before_llm_gen(self:Union["Task","Tasks"]) -> bool:
        # Toggle TTS off if not sending text, or if triggered by Tags
        if (not self.params.should_send_text) or (self.params.should_tts == False and tts.enabled):
            return await tts.apply_toggle_tts(self.settings, toggle='off')
        # Conditions which are only valid for guild interactions
        if hasattr(self.ictx, 'guild') and getattr(self.ictx.guild, 'voice_client', None):
            # Toggle TTS off if interaction server is not connected to Voice Channel
            if not voice_clients.guild_vcs.get(self.ictx.guild.id) and int(tts.settings.get('play_mode', 0)) == 0:
                return await tts.apply_toggle_tts(self.settings, toggle='off')
        return False

#

I might know the real issue here...

valid crypt Mar 18, 2025, 5:19 PM

#

yes of course my skill issue :v

#

just a joke

halcyon quarry Mar 18, 2025, 5:19 PM

#

I don't think the extension params are controlling AllTalk

#

What happens if you try using a different voice with the /speak command? Does it speak using a different voice?

#

Or, if you use a different voice filename in the character file?

valid crypt Mar 18, 2025, 5:22 PM

#

i remember that you didnt add /speak for all talk, anyways ill go with kokoro

halcyon quarry Mar 18, 2025, 5:23 PM

#

I did add /speak for alltalk

valid crypt Mar 18, 2025, 5:23 PM

#

the remote

halcyon quarry Mar 18, 2025, 5:24 PM

#

Does that still reside in a directory called alltalk_tts in the extensions folder?

valid crypt Mar 18, 2025, 5:24 PM

#

it is called alltalk_remote

#

what i did was changing it to alltalk_remote_tts and it works

halcyon quarry Mar 18, 2025, 5:25 PM

#

try changing it to alltalk_tts

valid crypt Mar 18, 2025, 5:28 PM

#

kokoro, there is tts,

halcyon quarry Mar 18, 2025, 5:29 PM

#

We didn't figure out how to control kokoro via extension params either

#

Try edge_tts or try renaming alltalk

valid crypt Mar 18, 2025, 5:30 PM

#

renaming all talk

#

i remember that streaming broke edge

halcyon quarry Mar 18, 2025, 5:31 PM

#

These are the "supported" ones:
'alltalk_tts', 'coqui_tts', 'silero_tts', 'elevenlabs_tts', 'edge_tts', 'vits_api_tts'

valid crypt Mar 18, 2025, 5:31 PM

#

i could try vits later

#

renaming is not very...

halcyon quarry Mar 18, 2025, 5:35 PM

#

hum

#

When I have time I need to try getting alltalk remote working on my end

#

see what's up with that...

#

I expect that the TTS will correctly be prevented when using tts apps that respect the extension parameters

#

alltalk TTS remote may have different labels for the extension params

valid crypt Mar 18, 2025, 5:41 PM

#

i somehow killed vits-simple-api let me fix it

valid crypt Mar 18, 2025, 6:01 PM

#

alr, idk why it died, i got to reinstall it, and i hope that everything will be fine

halcyon quarry Mar 18, 2025, 6:35 PM

#

Same!

valid crypt Mar 18, 2025, 6:43 PM

#

i dont know wt is going on with this

halcyon quarry Mar 18, 2025, 6:56 PM

#

I'd be looking into this now if I wasn't super busy

valid crypt Mar 18, 2025, 7:00 PM

#

alr vits working (just the tts)

#

actually we can delay should_tts , and make the toggle tag not trigger a response :v

#

i think i didnt change anything but, does the bot play the tts locally???

#

it is a feature that i would like to have though

halcyon quarry Mar 18, 2025, 7:06 PM

#

toggle_vc_playback applies specifically to the voice channel

#

should_tts applies to the TTS generation entirely

valid crypt Mar 18, 2025, 7:07 PM

#

here buddy, python is playing the tts

#

i was wondering why i was hearing echo, it was discord and python

halcyon quarry Mar 18, 2025, 7:08 PM

#

lmao

valid crypt Mar 18, 2025, 7:09 PM

#

was that a new feature only with vits?

halcyon quarry Mar 18, 2025, 7:09 PM

#

Probably

valid crypt Mar 18, 2025, 7:10 PM

#

its your bot buddy

halcyon quarry Mar 18, 2025, 7:10 PM

#

Ok I thought you meant, that the actual vits code when triggered by TGWUI, might open some python player

#

b/c tgwui is running vits code when TTS is triggered as an extension

valid crypt Mar 18, 2025, 7:11 PM

#

vits is running on my main pc and bot on my remote pc

#

and the remote is playing locally the tts

halcyon quarry Mar 18, 2025, 7:11 PM

#

🤷‍♂️

#

never heard of this

valid crypt Mar 18, 2025, 7:13 PM

#

~~for now the should_tts is not very working~~

#

as it shouldn't generate the tts from the beginning

#

that was on me

#

typo

#

vits works

halcyon quarry Mar 18, 2025, 7:14 PM

#

As in, the tag works correctly with vits yes?

valid crypt Mar 18, 2025, 7:14 PM

#

double tested, vits works

#

lemme reboot and triple test it

halcyon quarry Mar 18, 2025, 7:16 PM

#

The bot can only modify TTS behavior if the extension parameters are valid

#

as defined in base_settings or your character file

valid crypt Mar 18, 2025, 7:17 PM

#

vits ✅

#

also tested that vc toggle gives error if there is no audio

#

vc toggle + should tts = 0 tokens

valid crypt Mar 18, 2025, 7:25 PM

#

valid crypt

the toggle should be able to combine with this

halcyon quarry Mar 18, 2025, 7:26 PM

#

Please elaborate

valid crypt Mar 18, 2025, 7:27 PM

#

i should be able to pause the tts without triggering text generation

halcyon quarry Mar 18, 2025, 7:28 PM

#

You can

valid crypt Mar 18, 2025, 7:28 PM

#

how

halcyon quarry Mar 18, 2025, 7:28 PM

#

You may be correct

valid crypt Mar 18, 2025, 7:29 PM

#

this is what happened when i used should_gen_text: false

#

basically nothing

#

it does not trigger text generation but same for the tag to pause

#

i suppose that you will fix it so, i'll be adding those features to the asr bot

#

👍

halcyon quarry Mar 18, 2025, 7:30 PM

#

ehhhh

valid crypt Mar 18, 2025, 7:30 PM

#

DDD:

halcyon quarry Mar 18, 2025, 7:32 PM

#

yeah I'll figure something out

valid crypt Mar 18, 2025, 7:33 PM

#

👍

halcyon quarry Mar 18, 2025, 8:53 PM

#

I need to review whether there are any other tags irrelevant to text generation, and process them regardless

valid crypt Mar 18, 2025, 9:13 PM

#

fixed and pushed?

#

i finished mine https://github.com/marcos33998/asr_discordbot

halcyon quarry Mar 19, 2025, 2:10 AM

#

Probably tomorrow 😆

#

Still trying to make progress on “unrequire TGWUI” logic

#

Bat file is getting pretty complex but almost have the installation/launch logic figured out

valid crypt Mar 19, 2025, 6:29 PM

#

oh

halcyon quarry Mar 19, 2025, 6:31 PM

#

I'm taking steps towards the bot being used in either of 2 ways:

1. With TGWUI integration (as it is currently)
2. As a Standalone where TGWUI is not required, but can be used via API

Image generation capabilities and other bot functions will work in either setup

#

The logic I'm adding into the launcher script:

1. It is checking for a txt file that confirms whether the bot is installed, which will specify the conda environment.
2. If the file is not found it is assuming it is the first run of the bot.
2a. It will detect if the bot is nested in TGWUI. If so, it will have both install options.
2b. If TGWUI is not detected, it will mention that TGWUI was not found for an integration option, and only provide Standalone option.
3. Depending on the option, it will activate the appropriate environment, check for the bot's requirements there and install as necessary.
3a. For TGWUI integrated, the bot will not create its own environment.
3b. For Standalone, the bot will download git / Miniconda as necessary and install them automatically and create an environment - in the same fashion that TGWUI does.
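
The decision steps above can be sketched roughly as follows. The real launcher is a batch script; this Python version only mirrors the branching, and the function names and the option labels `tgwui` / `standalone` are assumptions for illustration.

```python
from typing import List, Optional


def install_options(marker_exists: bool, tgwui_detected: bool) -> Optional[List[str]]:
    """Return the install options to offer, or None if the bot is already installed.

    marker_exists: whether the txt file recording the install was found.
    tgwui_detected: whether the bot appears to be nested inside TGWUI.
    """
    if marker_exists:
        return None  # not a first run, nothing to ask
    if tgwui_detected:
        return ["tgwui", "standalone"]
    return ["standalone"]


def needs_own_environment(mode: str) -> bool:
    """Only the standalone mode creates its own environment."""
    if mode not in ("tgwui", "standalone"):
        raise ValueError(f"unknown install mode: {mode!r}")
    return mode == "standalone"
```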

#

I plan on segregating the TGWUI integrated features such as Extension management, and anything else that won't be compatible with API.

#

Finally - I'm replacing the Updater scripts with update wizards just like TGWUI has. The wizard will have option to switch from TGWUI to Standalone (vice versa if TGWUI is detected)

valid crypt Mar 19, 2025, 8:57 PM

#

halcyon quarry I plan on segregating the TGWUI integrated features such as Extension management...

if you are able to do that, as it can use any open ai compatible apis, something that would be amazing would be adding support for vlm :V

#

the only extension adding vlm in tgwui is https://github.com/RandomInternetPreson/Lucid_Vision which is not very cool :/

GitHub

GitHub - RandomInternetPreson/Lucid_Vision: This extension enhances...

This extension enhances the capabilities of textgen-webui by integrating advanced vision models, allowing users to have contextualized conversations about images with their favorite language models...

#

also i dont think it would work now

halcyon quarry Mar 19, 2025, 8:59 PM

#

My ultimate vision is that the bot can host any number of APIs, so long as user wants to duplicate a block of code and fill in some lines

#

use the tags system to call whatever API and do whatever with the response

#

Need to inch my way in that direction and it starts with still recommending, but un-requiring TGWUI

valid crypt Mar 19, 2025, 9:00 PM

#

i want to turn my lights on :p

halcyon quarry Mar 19, 2025, 9:02 PM

#

ad_discordbot has ya covered 👍

valid crypt Mar 19, 2025, 9:02 PM

#

i think there is a open source api for that 👍

#

idk if this works https://github.com/google-home/google-home-api-sample-app-android but the idea is definitely interesting and has a lot of potential

GitHub

GitHub - google-home/google-home-api-sample-app-android: A sample a...

A sample app that showcases the basic capabilities of the Google Home APIs. - google-home/google-home-api-sample-app-android

halcyon quarry Mar 19, 2025, 11:32 PM

#

From the TGWUI installer:

@rem figure out whether git and conda needs to be installed
call "%CONDA_ROOT_PREFIX%\_conda.exe" --version >nul 2>&1
if "%ERRORLEVEL%" EQU "0" set conda_exists=T

@rem (if necessary) install git and conda into a contained environment

Realizing that this startup script actually doesn't install git though 😛

#

at the end it calls another script which does, but still

halcyon quarry Mar 20, 2025, 5:31 PM

#

Sorry I didn't finish up the TTS stuff yet

valid crypt Mar 20, 2025, 5:31 PM

#

no hurry

valid crypt Mar 20, 2025, 11:44 PM

#

technically 2 bots can log into the same account but cant use the same voice channel

#

hmmmmm

#

why stt if no tts :(

#

bro is back after 1 year - 3 days https://github.com/imayhaveborkedit/discord-ext-voice-recv

GitHub

GitHub - imayhaveborkedit/discord-ext-voice-recv: Voice receive ext...

Voice receive extension package for discord.py. Contribute to imayhaveborkedit/discord-ext-voice-recv development by creating an account on GitHub.

halcyon quarry Mar 21, 2025, 12:44 AM

#

I'm actually glad to be splitting up these "generic tags" processing, the few things being added here are essentially duplicates in both llm tag processing and img tag processing functions

#

Will just call this ahead of those with the 'phase' as positional argument

#

@valid crypt I pushed the changes to the same branch

#

https://github.com/altoiddealer/ad_discordbot/tree/tts_tags

#

This now processes "generic" tag matches immediately after matching them, and before handling LLM and Img specific functions

#

So you should now be able to toggle the voice channel playback regardless of whether anything is being generated or not

#

this handles the following tags: flow, toggle_vc_playback, send_user_image and persist
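
A minimal sketch of the split described here: matched tags are divided into "generic" ones that can be handled right away and the rest, which stay with the LLM or image phase. The set of generic keys comes from the message above; the function name and the dict-based tag shape are assumptions.

```python
from typing import Any, Dict, Tuple

# Tag keys named in the message above as being handled generically.
GENERIC_TAG_KEYS = {"flow", "toggle_vc_playback", "send_user_image", "persist"}


def split_generic_tags(matched: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split matched tag values into (generic, phase_specific)."""
    generic = {k: v for k, v in matched.items() if k in GENERIC_TAG_KEYS}
    specific = {k: v for k, v in matched.items() if k not in GENERIC_TAG_KEYS}
    return generic, specific
```

Handling the generic part first is what would let a playback toggle take effect even when `should_gen_text` is false.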

#

If you confirm it handles the VC playback expectedly I'll push it to main

valid crypt Mar 21, 2025, 4:01 PM

#

got it

#

oops

#

wrong channel :v no one saw anything

halcyon quarry Mar 21, 2025, 4:11 PM

#

rats, missed it

valid crypt Mar 21, 2025, 4:32 PM

#

at least with alltalk remote should tts doesnt work, and when a new message is sent the current audio cant be paused

valid crypt Mar 21, 2025, 4:32 PM

#

valid crypt at least with alltalk remote should tts doesnt work, and when a new message is s...

basically if bot is generating the next message, the current tts cant be paused

halcyon quarry Mar 21, 2025, 4:33 PM

#

right, the issue with that is the extension params must have changed or something, or the directory name is messing it up

#

the bot controls the TTS by hijacking TGWUI's extension loader and updating parameters

#

I should be able to figure that one out - this is an alltalk-remote specific issue though.

valid crypt Mar 21, 2025, 4:41 PM

#

pause and resume tags are pretty good

#

https://tenor.com/view/perfect-great-excellent-gif-15034512

halcyon quarry Mar 21, 2025, 4:44 PM

#

Thanks a lot for beta testing these improvements

#

And the suggestions, all good ones

halcyon quarry Mar 21, 2025, 7:52 PM

#

@valid crypt if you want to help debug this alltalk thing

#

#

        print("EXTENSION ARGS:", shared.args.extensions)
        print("EXTENSION ARGS:", shared.settings)

valid crypt Mar 21, 2025, 7:53 PM

#

🫡

#

erm, line?

#

well gonna search

halcyon quarry Mar 21, 2025, 7:53 PM

#

def on_ready

#

Did you have to do anything special to get the remote thing working, or just follow the steps carefully?

valid crypt Mar 21, 2025, 7:56 PM

#

nothing special

halcyon quarry Mar 21, 2025, 7:57 PM

#

I'm going to go ahead and try getting things up and running on my end since I have a little bit of time on my fingertips

valid crypt Mar 21, 2025, 7:57 PM

#

settings are changed either through webui or directly in alltalk

halcyon quarry Mar 21, 2025, 7:59 PM

#

Ok so in your basesettings or in your character, etc

#

There is the bot's custom extension support

#

If you have an alltalk_tts dictionary key just try renaming it to alltalk_remote_tts

valid crypt Mar 21, 2025, 8:00 PM

#

?

#

i dont think that would work, those settings are for v1 and im using v2 remote

halcyon quarry Mar 21, 2025, 8:11 PM

#

try it

valid crypt Mar 21, 2025, 8:12 PM

#

oki

#

should_tts doesn't work

halcyon quarry Mar 21, 2025, 8:16 PM

#

rats

halcyon quarry Mar 21, 2025, 8:35 PM

#

These are the correct extension args Im pretty sure

#

But I need to update the bot to ensure certain keys behave correctly...

halcyon quarry Mar 22, 2025, 1:21 AM

#

I’m working out some more kinks, like adding exceptions for when current method to “get voices” fails

#

Plan to add TTS API.

#

It seems like the remote extension ignores settings defined from settings.yaml, and only applies changes via gradio

halcyon quarry Mar 22, 2025, 3:27 PM

#

https://github.com/erew123/alltalk_tts/issues/571

GitHub

Configuring default settings Specifically for TGWUI Remote Extensio...

Heya, I'm also a developer (a discord bot) - I had integrated support for the original alltalk v1 in my bot (via TGWUI). Preface (jump down to ANYWAY! to skip this) Currently, my bot relies on ...

#

I expect the author of alltalk will clarify whether the alltalk v2 remote extension can be controlled in the same way as other extensions (via TGWUI extension arguments).

valid crypt Mar 22, 2025, 4:24 PM

#

:V

valid crypt Mar 23, 2025, 10:33 PM

#

ive noticed something funny, at least for all talk remote, an api request to tgwui will trigger tts XD

halcyon quarry Mar 24, 2025, 2:10 AM

#

I have API working for /speak command now

#

@valid crypt I just pushed changes to that tts branch, which prevents the script from crashing when trying to collect voices for the /speak command

#

Also a few api settings added to config.yaml

#

The bot can now use the /speak command with alltalk v2.

#

You can safely rename the extension folder to anything with the phrase 'alltalk' in it and it should behave the same (alltalk, alltalk_remote, etc)

#

These additions are kind of a hotfix - I have much bigger plans for API stuff, it's really going to be an overhaul on the bot

#

gotta change my focus back to the install process, update wizard scripts, etc

valid crypt Mar 24, 2025, 2:54 PM

#

halcyon quarry @valid crypt I just pushed changes to that tts branch, which prevents t...

i have experienced some sudden crashes, but never thought that the command was the culprit

halcyon quarry Mar 24, 2025, 2:58 PM

#

The one you shared in the past when the directory was named "alltalk_tts" occurred because it tried importing a function that didn't exist in the new alltalk v2, and I did not have it in a "try / except" block

#

I revised the logic so that if there is a specified tts voices endpoint, it will first attempt to collect the voices using it. If it fails, it will try using the original methods I had. If that fails, it now just disables the voices option in /speak cmd

#

rather than just crashing and burning
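A minimal sketch of that fallback order, not the bot's actual code: `collect_voices`, `from_endpoint` and `from_legacy_import` are hypothetical names, and the two source functions are stand-ins for the voices endpoint and the original collection methods.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger(__name__)

def collect_voices(sources: list[Callable[[], list[str]]]) -> Optional[list[str]]:
    """Try each voice source in order; return None if every one fails."""
    for source in sources:
        try:
            voices = source()
            if voices:
                return voices
        except Exception as e:
            log.warning(f"Voice source {source.__name__} failed: {e}")
    return None  # the caller disables the voices option instead of crashing

# Hypothetical stand-ins for the endpoint lookup and the older import-based method
def from_endpoint() -> list[str]:
    raise ConnectionError("endpoint unreachable")

def from_legacy_import() -> list[str]:
    return ["female_01.wav", "male_01.wav"]

voices = collect_voices([from_endpoint, from_legacy_import])
voices_option_enabled = voices is not None
```

Here the endpoint fails, so the second source supplies the voices; if both failed, `voices_option_enabled` would simply be `False`.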

valid crypt Mar 24, 2025, 3:18 PM

#

after some testing, i think if there is a message with no gen text, and gets deleted after some milliseconds, the input that comes afterward is not detected

valid crypt Mar 24, 2025, 3:18 PM

#

valid crypt after some testing, i think if there is a message with no gen text, and gets del...

let me test further, i got that by using my asr bot

#

example, my bot sends pause tag after receiving audio packets and deletes it

valid crypt Mar 24, 2025, 3:23 PM

#

valid crypt after some testing, i think if there is a message with no gen text, and gets del...

if the one that comes after is not a tag, then there is a cooldown of 1 seconds?

halcyon quarry Mar 24, 2025, 3:26 PM

#

I don't really understand what you're describing

#

Whenever the bot is triggered to interact in any way, such as a message request, etc - it collects the information it needs, creates a task, and queues it.

It then processes each task sequentially

#

It can queue up new tasks while it is processing the current task
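Roughly what that looks like with an `asyncio.Queue`. This is only an illustrative sketch under assumptions: the task names and the `worker` function are made up here, and the bot's real task objects carry much more information.

```python
import asyncio

async def worker(queue: asyncio.Queue, done: list[str]) -> None:
    # Process tasks strictly one at a time, in arrival order
    while True:
        name = await queue.get()
        await asyncio.sleep(0.01)  # stand-in for text generation, TTS, etc.
        done.append(name)
        queue.task_done()

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    done: list[str] = []
    consumer = asyncio.create_task(worker(queue, done))
    # New tasks can be queued while the worker is still busy with an earlier one
    for name in ("message_1", "speak_cmd", "message_2"):
        queue.put_nowait(name)
        await asyncio.sleep(0.005)
    await queue.join()   # wait until everything queued has been processed
    consumer.cancel()
    return done

processed = asyncio.run(main())
```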

valid crypt Mar 24, 2025, 3:29 PM

#

i dont know either :v

#

just my bot is getting ignored

halcyon quarry Mar 24, 2025, 3:30 PM

#

There is a behavior setting called "chance to reply to other bots"

#

maybe it's at 0.0?

valid crypt Mar 24, 2025, 3:31 PM

#

valid crypt Mar 24, 2025, 3:32 PM

#

halcyon quarry maybe it's at 0.0?

i thought that too, i dont have anything so it should use default, the test is made after adding reply to 1

#

but if i speak longer which means sending the text later, works

#

weird

halcyon quarry Mar 24, 2025, 3:35 PM

#

By default, chance_to_reply_to_other_bots: 0.0

#

In dict base settings

valid crypt Mar 24, 2025, 3:35 PM

#

valid crypt

it also isn't consistent

valid crypt Mar 24, 2025, 3:35 PM

#

halcyon quarry By default, ` chance_to_reply_to_other_bots: 0.0`

that is weird too because before it was often working too

halcyon quarry Mar 24, 2025, 3:38 PM

#

Here is the code that applies the vc playback tag

#

    async def toggle_playback_in_voice_channel(self, guild_id, action='stop'):
        if self.guild_vcs.get(guild_id):
            guild_vc: discord.VoiceClient = self.guild_vcs[guild_id]
            if action == 'stop' and guild_vc.is_playing():
                guild_vc.stop()
                log.info(f"TTS playback was stopped for guild {guild_id}")
            elif action in ('pause', 'toggle') and guild_vc.is_playing():
                guild_vc.pause()
                log.info(f"TTS playback was paused in guild {guild_id}")
            elif action in ('resume', 'toggle') and guild_vc.is_paused():
                guild_vc.resume()
                log.info(f"TTS playback resumed in guild {guild_id}")

#

If the value is stop and something is currently playing, it will stop.
If the value is pause or toggle and it's currently playing, then it will pause.
If the value is resume or toggle and it is currently paused, then it will resume.

valid crypt Mar 24, 2025, 3:39 PM

#

also i was trying to get as little latency as possible so i changed the stream chance to 2 and it is not always splitting (exclamation mark should split)

#

same here

halcyon quarry Mar 24, 2025, 3:45 PM

#

I'm contemplating this now

valid crypt Mar 24, 2025, 3:45 PM

#

halcyon quarry Here is the code that applies the vc playback tag

i think is the text gen tag, so ill test deeper

halcyon quarry Mar 24, 2025, 3:47 PM

#

The text splitting is super complicated btw lol

#

What makes it super complicated, is that longer syntax such as \n\n will never trigger without some very complicated logic

valid crypt Mar 24, 2025, 3:49 PM

#

halcyon quarry The text splitting is super complicated btw lol

got you more example

halcyon quarry Mar 24, 2025, 3:51 PM

#

I made a system where it creates a little "window of text" to check (it is only evaluating like 5 characters at a time).
If it matches on a shorter syntax (like ".") it will set a flag to not split the text, and wait one more iteration.
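A simplified sketch of the "wait one more iteration" idea, not the bot's real splitter: `split_stream` and its default syntax list are assumptions, and it checks the end of a growing buffer rather than a fixed-size window. A short match such as a single newline is held back for one character so that a longer syntax such as `\n\n` still gets a chance to match.

```python
from typing import Iterable, Iterator

def split_stream(chars: Iterable[str],
                 syntaxes: tuple[str, ...] = ("\n\n", "\n", ".", "!", "?")) -> Iterator[str]:
    """Yield segments from a character stream, preferring the longest matching syntax."""
    buffer = ""
    pending = ""  # a short match being held back for one more character
    for ch in chars:
        buffer += ch
        if pending:
            grown = [s for s in syntaxes
                     if len(s) > len(pending) and s.startswith(pending) and buffer.endswith(s)]
            if not grown:
                # The short match was final after all: split just before the new character
                yield buffer[:-1]
                buffer = ch
            pending = ""
        matched = max((s for s in syntaxes if buffer.endswith(s)), key=len, default="")
        if not matched:
            continue
        if any(len(s) > len(matched) and s.startswith(matched) for s in syntaxes):
            pending = matched   # could still grow into a longer syntax, so wait
        else:
            yield buffer        # nothing longer is possible, so split now
            buffer = ""
    if buffer:
        yield buffer

segments = list(split_stream("One\n\nTwo\nThree"))
```

With this input the double newline is kept together as one split point, while the single newline after "Two" only splits once the next character shows it will not grow.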

#

Test something for me, we'll just increase the window a little bit

#

#

print("matched syntax:", syntax, "window:", check_window)

#

Try increasing the window size here to 3 or 4

#

I can test this out myself in a bit but right now I'm working on something important work related

valid crypt Mar 24, 2025, 3:54 PM

#

ok

#

didn't work, if it isn't for the perfection i dont think this is a must (at least not a priority)

halcyon quarry Mar 24, 2025, 4:00 PM

#

I'll have to look into it a bit more, I thought I had it figured out 100% but apparently not

valid crypt Mar 24, 2025, 4:01 PM

#

this time worked, looks like that it isn't very consistant

valid crypt Mar 24, 2025, 4:02 PM

#

valid crypt this time worked, looks like that it isn't very consistant

not actually, there is another .

#

i think ive discovered something

halcyon quarry Mar 24, 2025, 4:16 PM

#

What did you discover? 😛

valid crypt Mar 24, 2025, 4:18 PM

#

erm

#

if the sentence has "rain" or "pain" in it, it doesn't work when the message was sent by my bot

#

what logic is this .-.

halcyon quarry Mar 24, 2025, 4:19 PM

#

I suppose you have those associated with tags?

valid crypt Mar 24, 2025, 4:19 PM

#

but if i send it myself it works

#

???

halcyon quarry Mar 24, 2025, 4:23 PM

#

#

You could put a print before the return to see if it is ignoring messages

valid crypt Mar 24, 2025, 4:25 PM

#

i logged into my bot's account

#

:O

halcyon quarry Mar 24, 2025, 4:25 PM

#

I don't think I have anything hardcoded for "ai"

#

#

This is where it decides whether it will reply to a message or not

valid crypt Mar 24, 2025, 4:27 PM

#

only happens to my bot

#

what do i have to print?

halcyon quarry Mar 24, 2025, 4:27 PM

#

Ahhh

valid crypt Mar 24, 2025, 4:27 PM

#

halcyon quarry I don't think I have anything hardcoded for "ai"

i can, my bot cant

#

😱

#

discrimination

#

XD

halcyon quarry Mar 24, 2025, 4:28 PM

#

I think I'm starting to understand what's happening here...

#

If you change reply_to_bots_when_addressed to 1.0 I believe it will solve this

#

I need to revise the logic to avoid this issue

#

I think this logic sort of makes sense if it is matching the whole word for the bot name, but it's not. It is triggering if the bot's name is anywhere in the text string at all

valid crypt Mar 24, 2025, 4:32 PM

#

halcyon quarry If you change `reply_to_bots_when_addressed` to 1.0 I believe it will solve this

:O

halcyon quarry Mar 24, 2025, 4:33 PM

#

Like if you only want it to reply to another bot which said "Hey ai, tell me about"

#

It's currently rolling probability for reply_to_bots_when_addressed if the bot's name (ai) is literally anywhere in the text

valid crypt Mar 24, 2025, 4:34 PM

#

fixed

#

XD

halcyon quarry Mar 24, 2025, 4:39 PM

#

Could you please put it back to 0.0, and update line 6645 with this?

if message.author.bot and re.search(rf'\b{re.escape(last_character.lower())}\b', text) and main_condition:

#

#

this should now only trigger on whole words

#

You could test with rain, pain, etc - and also "Hey ai, what's up?"
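For reference, a small standalone sketch of substring matching versus whole-word matching. `name_is_addressed` is a hypothetical helper, not a function from the bot, and it uses `re.IGNORECASE` so the casing of the name and the text does not matter.

```python
import re

def name_is_addressed(bot_name: str, text: str) -> bool:
    """True only when bot_name appears in text as a whole word, ignoring case."""
    pattern = rf"\b{re.escape(bot_name)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# A plain substring check fires on any word that merely contains the name
substring_hit = "ai" in "it might rain today"

# The whole-word check ignores "rain" and "pain" but still catches a direct address
whole_word_rain = name_is_addressed("ai", "it might rain today")
whole_word_hey = name_is_addressed("AI", "Hey ai, what's up?")
```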

valid crypt Mar 24, 2025, 4:51 PM

#

ok

valid crypt Mar 24, 2025, 4:57 PM

#

halcyon quarry Could you please put it back to 0.0, and update line 6645 with this? ` i...

never triggers

#

#

halcyon quarry Mar 24, 2025, 5:01 PM

#

Ok - I had it wrong 😛

#

nvm... ehhhh

#

I'll play around with this myself

#

@valid crypt Try with the bots actual casing like is it "AI" ?

valid crypt Mar 24, 2025, 5:13 PM

#

halcyon quarry Mar 24, 2025, 5:13 PM

#

how about Hey AI dude

valid crypt Mar 24, 2025, 5:13 PM

#

halcyon quarry Mar 24, 2025, 5:13 PM

#

you did change that setting back to 0.0 right?

valid crypt Mar 24, 2025, 5:14 PM

#

i removed it

halcyon quarry Mar 24, 2025, 5:15 PM

#

and you restarted the script?

valid crypt Mar 24, 2025, 5:15 PM

#

i rebooted and reselected 👍 i dont think i missed

halcyon quarry Mar 24, 2025, 5:16 PM

#

thanks 🙂

#

alright I'll have to tinker with it later on

valid crypt Mar 24, 2025, 10:46 PM

#

i didnt know that tags could take effect in real time

valid crypt Mar 24, 2025, 11:07 PM

#

by tweaking to get maximun performance i've noticed that the token per second is wrong, it includes time that the tts takes to generate the tts

#

tts off

#

:P

#

average 3s to get the first stream tts, arrrgh, i want it to be faster and faster

valid crypt Mar 24, 2025, 11:21 PM

#

valid crypt by tweaking to get maximun performance i've noticed that the token per second is...

i also have some suspicion of tts making the generation slower only because of those 0.5 seconds, im not that slow to take 2 more seconds, something is making everything slower

halcyon quarry Mar 25, 2025, 12:31 AM

#

Pushed the recent changes to Main

halcyon quarry Mar 25, 2025, 12:36 AM

#

valid crypt by tweaking to get maximun performance i've noticed that the token per second is...

The tokens / sec is printed from TGWUI code, and is not available for the bot to modify easily

valid crypt Mar 25, 2025, 12:37 AM

#

i really think that it is being affected

halcyon quarry Mar 25, 2025, 12:37 AM

#

likely!

#

well, just the TTS model being loaded means less ram and/or vram available to TGWUI (I think that's how it works)

valid crypt Mar 25, 2025, 12:37 AM

#

valid crypt tts off

although this is 33 but in reality is around 20

valid crypt Mar 25, 2025, 12:39 AM

#

halcyon quarry well, just the TTS model being loaded means less ram and/or vram available to TG...

tts is one a different machine, so if the token/s reflects the real time it shouldnt be 6

halcyon quarry Mar 25, 2025, 12:39 AM

#

aha

#

Well you have 1000 more tokens in context

valid crypt Mar 25, 2025, 12:40 AM

#

valid crypt i also have some suspicion of tts making the generation slower only because of t...

also this one, if the generation is completed in 0.5 and tts completed in 0.2s there is no way to hear the tts after 2s

#

from tgwui, with tts, toggled off tts

valid crypt Mar 25, 2025, 12:53 AM

#

valid crypt from tgwui, with tts, toggled off tts

if tts is turned off, there is some delay at sending the message to discord, if completely disabled it is lightning fast

#

ill dig deeper tomorrow

valid crypt Mar 25, 2025, 1:07 AM

#

halcyon quarry You can safely rename the extension folder to anything with the phrase 'alltalk'...

did you test it?

halcyon quarry Mar 25, 2025, 1:08 AM

#

🤔

valid crypt Mar 25, 2025, 1:12 AM

#

valid crypt did you test it?

Renamed it to alltalk_remote and there's nothing about tts

#

No building for /speak or anything

#

Well I'm dying right 😪

#

Now

halcyon quarry Mar 25, 2025, 1:13 AM

#

I just pushed an update

#

I did overlook a few little things 😛

#

Note that the bot currently requires the extension name to be in config.yaml - even if you have TGWUI flags to launch it

#

so maybe you updated the folder name but not the name in config.yaml

valid crypt Mar 25, 2025, 7:43 AM

#

halcyon quarry so maybe you updated the folder name but not the name in config.yaml

did both

#

halcyon quarry Mar 25, 2025, 10:36 AM

#

There’s a key under “tts_settings” offhand I think it is “tts_client” which is expecting one string value

#

You can launch more extensions with the bot via CMD_FLAGS

valid crypt Mar 25, 2025, 4:01 PM

#

valid crypt did both

my fault

#

https://tenor.com/view/sweats-gif-25346666

Tenor

valid crypt Mar 25, 2025, 4:17 PM

#

lets say, streaming tts works, but it modifies the token/s from tgwui, and tts makes bot much slower

halcyon quarry Mar 25, 2025, 4:21 PM

#

I think that typically, you'll generate all the text while it maximizes VRAM usage, then memory moves around as the TTS model gets to do its thing.
But with streaming we need to jump back and forth between both models so the memory gets shifted around more

valid crypt Mar 25, 2025, 4:26 PM

#

halcyon quarry I think that typically, you'll generate all the text while it maximizes VRAM usa...

in my case, i have two machines so there's no way to have bottle neck

#

,something that i've noticed is that pause tag works just fine but it takes around 0.05 to 0.2s, while my discord ping is 4ms

halcyon quarry Mar 25, 2025, 4:26 PM

#

ah yeah derp

#

keep forgetting you have 2 machines

#

But yes, I'm sure TGWUI starts determining the tokens /sec at the beginning of text generation, and stops at the end of the entire job, but with the TTS streaming the bot hijacks normal behavior adding all the TTS time to the total time.

halcyon quarry Mar 25, 2025, 4:28 PM

#

valid crypt ,something that i've noticed is that pause tag works just fine but it takes arou...

There's likely nothing I can do about this

valid crypt Mar 25, 2025, 4:31 PM

#

could be python or too much code, although i think that tts api could improve the speed

#

how do i log the time when the bot gets the tts?

halcyon quarry Mar 25, 2025, 4:32 PM

#

There are limits to how fast discord can accept data from a single source - and these are applied automatically

#

So, when we are sending text as well as a "pause()" or "stop()" cmd, etc - these are all automatically throttled

#

If you search for " def apply_extensions" you'll find the function that applies the tts

#

could add a print statement there before and after the process is called

#

print("TIME START:", time.time())

print("TIME END:", time.time())

valid crypt Mar 25, 2025, 4:37 PM

#

?

halcyon quarry Mar 25, 2025, 4:37 PM

#

Yep thats good

#

might also want to add

#

start = time.time()
print("TIME START:", start)

#

end = time.time()
print("TIME END:", end)

#

print("TIME SPENT:", end - start)
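The same measurement can be wrapped in a small reusable helper. This is just a sketch: `timed` and the `timings` dict are made-up names, and the `time.sleep` call stands in for whatever step is being measured, such as the TTS call.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, results: dict):
    """Record and print how long the wrapped block took, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start
        print(f"{label}: {results[label]:.3f}s")

timings: dict = {}
with timed("tts", timings):
    time.sleep(0.05)  # stand-in for the step being timed
```

`time.perf_counter()` is used instead of `time.time()` because it is meant for measuring short intervals.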

valid crypt Mar 25, 2025, 4:42 PM

#

the instance i hear the audio it gets printed, where 0.02s is from network (actually lower), 0.73s from alltalk server round it to 1.25s there is alot of time missing

#

hmmm

valid crypt Mar 25, 2025, 4:43 PM

#

halcyon quarry might also want to add

the math dont need to be very exact, by seeing it i know that it took around 2s :P

halcyon quarry Mar 25, 2025, 4:44 PM

#

yep

valid crypt Mar 25, 2025, 4:44 PM

#

api might help

#

if you didnt know what would happen when i try to use speak, i suppose that it is meant to work

📎 message.txt

valid crypt Mar 25, 2025, 4:55 PM

#

valid crypt if you didnt know what would happen when i try to use speak, i suppose that it i...

last time when i made something to use all talk i was using /v1/audio/speech

halcyon quarry Mar 25, 2025, 5:08 PM

#

I had successfully used the speak command with my setup

#

with alltalk remote

#

need to look into your error when I have a sec and see what's going on

halcyon quarry Mar 25, 2025, 10:50 PM

#

Ok I see I do have an issue with some of the logic initializing the TTS extension

#

particularly when it does not end with _tts

#

fixing this

halcyon quarry Mar 25, 2025, 11:23 PM

#

@valid crypt I have it fixed nice now

#

will be pushing it in a sec

#

#

I changed it so that whatever the heck extension is set in config.yaml as the tts client, it's gonna load it up.

#

If any additional TTS extensions try loading from flags, etc - it's going to warn that only one client can load and it will only load the configured tts extension

#

Might revisit this logic later when I get the API suite all worked out

valid crypt Mar 25, 2025, 11:30 PM

#

discord just did an ui overhaul .-.

#

i think that it still doesnt work

#

📎 message.txt

#

how did you name your alltalk, did you put any params?

halcyon quarry Mar 25, 2025, 11:34 PM

#

Just pushed the fixes to Main

valid crypt Mar 25, 2025, 11:34 PM

#

i updated already

#

oh

#

you mean now?

#

there was another update that confused me :v

valid crypt Mar 25, 2025, 11:37 PM

#

valid crypt

i just got the same error

#

to make sure

📎 message.txt

halcyon quarry Mar 25, 2025, 11:40 PM

#

All should be good in the update I pushed 6 mins ago

#

"Improve TTS / Extension loading"

valid crypt Mar 25, 2025, 11:41 PM

#

main or not main

halcyon quarry Mar 25, 2025, 11:41 PM

#

Main

#

I deleted the TTS branch

valid crypt Mar 25, 2025, 11:42 PM

#

then idk why it doesnt work

#

i touched this

#

halcyon quarry Mar 25, 2025, 11:44 PM

#

Where is that last screenshot from?

valid crypt Mar 25, 2025, 11:44 PM

#

valid crypt i touched this

config

valid crypt Mar 25, 2025, 11:44 PM

#

valid crypt

tgwui settings.yaml

halcyon quarry Mar 25, 2025, 11:47 PM

#

🤷‍♂️

#

Can't see how you could be getting that error

valid crypt Mar 25, 2025, 11:48 PM

#

and i dont have params in character

halcyon quarry Mar 25, 2025, 11:48 PM

#

My setup is basically the same. Same, no params in character.

#

That log message isn't 100% accurate

#

Does your startup look similar to the screenshot I posted?

#

Loading your configured TTS extension "alltalk_remote"

#

It will only print this if the extension is tts.client, and the key that speak command is trying to access is that value

valid crypt Mar 25, 2025, 11:51 PM

#

i just renamed it back with _tts

halcyon quarry Mar 25, 2025, 11:53 PM

#

try as alltalk_remote

valid crypt Mar 25, 2025, 11:57 PM

#

📎 message.txt

#

the first try got new error this time

#

but ended with the same

#

~~i missed something~~ nothing

#

are you using the xtts?

#

got this

valid crypt Mar 26, 2025, 12:01 AM

#

valid crypt are you using the xtts?

im not

#

also i gtg 💤 i wish, my bot could be added to your readme as temporal stt solution, i tried my best... it sends tts tags!
~~at least ill dream it~~

halcyon quarry Mar 26, 2025, 12:22 AM

#

Yeah I’m using xtts so that could be part of the issue…

halcyon quarry Mar 26, 2025, 12:33 AM

#

valid crypt got this

Upon reviewing the handling for speak cmd, that warning just means no voice was selected and there was no voice param. But this shouldn't yield an actual error if you have alltalk running with RVC or whatever

#

the api request will just be request = {'text_input': self.text} which is the minimum required information needed for a successful response

halcyon quarry Mar 26, 2025, 10:52 AM

#

Do you know any very simple extensions that respect default args set in TGWUI’s settings.yaml? I think I just need to review how that’s applied, tweak alltalk remote code and send a PR

#

Ok offhand I think it was like setup() or atsetup() - just need to take a peek there in the remote…

valid crypt Mar 26, 2025, 3:42 PM

#

halcyon quarry Do you know any very simple extensions that respect default args set in TGWUI’s ...

never heard of a extension reading tgwui's setting.yaml

halcyon quarry Mar 26, 2025, 4:05 PM

#

Could be the other way around > TGWUI sets the extensions parameters so long as they are formatted expectedly

#

in any case it's the reason why for alltalkv1, edge, vits, etc - the bot is able to update parameters on the fly

#

(because you can set parameter values in TGWUI settings.yaml and they actually take effect)

valid crypt Mar 26, 2025, 4:14 PM

#

then edge tts

#

?

halcyon quarry Mar 26, 2025, 4:42 PM

#

eh... I'll figure it out

#

@valid crypt Is the /speak command working, aside from that warning message?

valid crypt Mar 26, 2025, 4:42 PM

#

vit simple api too

#

if you did something after i went to sleep idk

halcyon quarry Mar 26, 2025, 4:44 PM

#

No, but as far as I can tell it should be sending the minimal API request and should not yield an actual error

#

such as an invalid voice etc... if none selected

valid crypt Mar 26, 2025, 6:05 PM

#

i wont be able to test anything for a few days as my intel cpu is definitely cooked...

valid crypt Mar 26, 2025, 6:05 PM

#

valid crypt i wont be able to test anything for a few days as my intel cpu is definitely coo...

actually i can :v

#

forgot that i have a second machine 😓

valid crypt Mar 26, 2025, 6:33 PM

#

my second machine also gives the same error and it is not working, the no voice selected happens when the extension or the name is alltalk_remote without _tts

#

you could try downloading a vits model

valid crypt Mar 26, 2025, 7:14 PM

#

brain aint braining, im downloading xtts :v

#

i'll do a clean install of the bot

valid crypt Mar 26, 2025, 7:15 PM

#

valid crypt brain aint braining, im downloading xtts :v

same error

halcyon quarry Mar 26, 2025, 7:17 PM

#

Can't reproduce on my end

#

The only thing you shared that was different from what I have, is I do not include the extension in TGWUI's settings.yaml

#

I only put it in the bot's config.yaml

valid crypt Mar 26, 2025, 7:23 PM

#

halcyon quarry The only thing you shared that was different from what I have, is I do not inclu...

ahh

#

let me try that

#

that might be the only difference, i just did a clean install and got the same error so...

halcyon quarry Mar 26, 2025, 7:25 PM

#

I hadn't said anything because I was thinking if that could happen when I was looking at the code but it didn't seem likely...

valid crypt Mar 26, 2025, 7:27 PM

#

failed ;-;

#

i literally did a clean install, entered the token set voice channel, add the extension name in config... hmmm

#

wait, i removed alltalk from settings but still loading it

valid crypt Mar 26, 2025, 7:33 PM

#

valid crypt wait, i removed alltalk from settings but still loading it

from bot config

halcyon quarry Mar 26, 2025, 7:33 PM

#

Yes, thats what I said - I only load the TTS from bot's config

#

If that's actually the solution to your issue I should be able to prevent that kind of conflict...

valid crypt Mar 26, 2025, 7:34 PM

#

still dont work

#

📎 message.txt

valid crypt Mar 26, 2025, 7:36 PM

#

valid crypt

literally fresh install maybe i missed something that i have to do?

halcyon quarry Mar 26, 2025, 7:37 PM

#

I'm super busy but Ill see if I can help for a sec

valid crypt Mar 26, 2025, 7:37 PM

#

from the fresh install i only added the extension in config this time as it is on my second machine i didnt change the ip

halcyon quarry Mar 26, 2025, 7:38 PM

#

            loop = asyncio.get_event_loop()
            if tts.api_mode == True:
                request = {'text_input': self.text}
                print("tts.client:", tts.client)
                print("tts_args:", tts_args)
                client_args:dict = tts_args[tts.client]

#

#

hmm

#

Alright, I see the problem finally

#

Honestly unsure why I didn't get error when I tested

#

Actually, the reason I didn't get error was because I chose a voice each time

valid crypt Mar 26, 2025, 7:44 PM

#

halcyon quarry ``` loop = asyncio.get_event_loop() if tts.api_mode == T...

the fix?

valid crypt Mar 26, 2025, 7:45 PM

#

valid crypt the fix?

im lil dumb

halcyon quarry Mar 26, 2025, 7:45 PM

#

I can test this myself later but I believe if you just replace the whole process_speak_args() function with this,

#

it should do the trick

#

async def process_speak_args(ctx: commands.Context, selected_voice=None, lang=None, user_voice=None):
    try:
        tts_args = {tts.client: {}}
        if lang:
            if tts.client == 'elevenlabs_tts':
                if lang != 'English':
                    tts_args[tts.client].setdefault('model', 'eleven_multilingual_v1')
                    # Currently no language parameter for elevenlabs_tts
            else:
                tts_args[tts.client].setdefault(tts.lang_key, lang)
                tts_args[tts.client][tts.lang_key] = lang
        if selected_voice or user_voice:
            tts_args[tts.client].setdefault(tts.voice_key, 'temp_voice.wav' if user_voice else selected_voice)
        elif tts.client == 'silero_tts' and lang:
            if lang != 'English':
                tts_args = await process_speak_silero_non_eng(ctx, lang) # returns complete args for silero_tts
                if selected_voice: 
                    await ctx.send(f'Currently, non-English languages will use a default voice (not using "{selected_voice}")', ephemeral=True)
        elif tts.client in tgwui.last_extension_params and tts.voice_key in tgwui.last_extension_params[tts.client]:
            pass # Default to voice in last_extension_params
        elif f'{tts.client}-{tts.voice_key}' in shared.settings:
            pass # Default to voice in shared.settings
        else:
            await ctx.send("No voice was selected or provided, and a default voice was not found. Request will probably fail...", ephemeral=True)
        return tts_args
    except Exception as e:
        log.error(f"Error processing tts options: {e}")
        await ctx.send(f"Error processing tts options: {e}", ephemeral=True)

halcyon quarry Mar 26, 2025, 8:04 PM

#

did it work?

#

Actually I just realized an even easier fix, I know it will work

#

I just pushed it now

valid crypt Mar 26, 2025, 8:12 PM

#

it worked, and i trust your easier fix, how do you get the audio file with the api, same as capturing the scr?

#

hijacking the extension?

halcyon quarry Mar 26, 2025, 8:12 PM

#

Nope

#

er

#

When using the /speak command the bot uses an API call and gets the response directly.
When chatting normally, TGWUI runs the remote extension which makes the API call which returns the audio file, and it's detected in the response from TGWUI same as all the others

#

Not really hijacking the extension per se - more like hijacking the entire TGWUI chatbot wrapper function

#

Normally it will only apply TTS extensions after the complete text generation

#

But I monkeypatch that function, and apply TTS extensions any time the bot wants to split text
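A toy illustration of that kind of monkeypatch. Everything here is a stand-in: the `webui` namespace, `chatbot_wrapper`, and `apply_tts` are hypothetical and only mimic the shape of wrapping a generator function so an extra step runs on every chunk.

```python
import types

# Hypothetical stand-in for a module that exposes a streaming wrapper function
webui = types.SimpleNamespace()

def chatbot_wrapper(prompt: str):
    """Original behaviour: yield text pieces as they are generated."""
    for piece in ("Hello there. ", "How are you?"):
        yield piece

webui.chatbot_wrapper = chatbot_wrapper
spoken: list[str] = []

def apply_tts(text: str) -> None:
    spoken.append(text)  # stand-in for running the TTS extension on one chunk

_original = webui.chatbot_wrapper

def patched_wrapper(prompt: str):
    # Same output as the original, but TTS is applied to each piece as it arrives
    for piece in _original(prompt):
        apply_tts(piece)
        yield piece

webui.chatbot_wrapper = patched_wrapper  # the monkeypatch itself
reply = "".join(webui.chatbot_wrapper("hi"))
```

Callers still go through `webui.chatbot_wrapper`, so they see the same text, while the patched version has already handed each piece to the TTS step.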

valid crypt Mar 26, 2025, 8:22 PM

#

hmmm, it feels slow to be directly, the first video is raw, the second one i added black screen after completing the generation and when it sends the audio, which is when the warn appeared

#

halcyon quarry Mar 26, 2025, 8:24 PM

#

What's the problem?

valid crypt Mar 26, 2025, 8:24 PM

#

slow

#

just that

halcyon quarry Mar 26, 2025, 8:25 PM

#

Enable Deepspeed 😛

#

Alright, maybe it's due to the discord message sending over and over

valid crypt Mar 26, 2025, 8:27 PM

#

halcyon quarry Alright, maybe it's due to the discord message sending over and over

the generation part took 1.14s where 0.27s is used to generate, and i started the timer when bot received the /speak

valid crypt Mar 26, 2025, 8:28 PM

#

valid crypt the generation part took 1.14s where 0.27s is used to generate, and i started th...

halcyon quarry Mar 26, 2025, 8:28 PM

#

Try deleting these 3 lines

#

in async def process_speak_args

#

Beyond that, there's nothing I can really do to improve the speed

valid crypt Mar 26, 2025, 8:28 PM

#

sliming down >:D

halcyon quarry Mar 26, 2025, 8:29 PM

#

well, until I have true API stuff implemented

valid crypt Mar 26, 2025, 8:29 PM

#

halcyon quarry Beyond that, there's nothing I can really do to improve the speed

D:

halcyon quarry Mar 26, 2025, 8:29 PM

#

The bot functions don't add any time

valid crypt Mar 26, 2025, 8:29 PM

#

halcyon quarry well, until I have true API stuff implemented

i have one written for all talk

halcyon quarry Mar 26, 2025, 8:29 PM

#

it's just generational tasks and discord interactions that slow things down

valid crypt Mar 26, 2025, 8:32 PM

#

import json
import requests
from threading import Thread

# Configure the OpenAI-compatible (AllTalk) TTS endpoint URL (change IP and port as needed)
# Note: voice_path and notice() are used below but are not shown in this excerpt
openai_tts_url = "http://192.168.1.14:7851/v1/audio/speech"

def generate_openai_tts(text: str, voice: str = "nova", speed: float = 1.0,
                          model: str = "any_model_name", response_format: str = "wav") -> bool:
    """
    Call the OpenAI-compatible (AllTalk) TTS endpoint to generate speech audio.
    
    Parameters:
        text: The text input (max 4096 characters).
        voice: The voice to use. Supported values: 'alloy', 'echo', 'fable', 'nova', 'onyx', 'shimmer'.
        speed: Playback speed (between 0.25 and 4.0; default 1.0).
        model: Model identifier (currently ignored but required).
        response_format: Audio format (e.g. 'wav').

    Returns:
        True if audio was successfully saved, False otherwise.
    """
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed
    }
    headers = {"Content-Type": "application/json"}
    try:
        resp = requests.post(openai_tts_url, data=json.dumps(payload), headers=headers)
        if resp.status_code == 200:
            with open(voice_path, "wb") as f:
                f.write(resp.content)
            return True
        else:
            notice(f"Error: {resp.status_code} - {resp.text}")
            return False
    except Exception as e:
        notice(f"OpenAI TTS request error: {e}")
        return False

        # (separate excerpt: one branch of the TTS backend selection)
        elif tts_menu.get() == "OpenAI TTS":
            # Use the new OpenAI-compatible API endpoint
            # You can customize the voice and speed here if desired.
            if generate_openai_tts(text, voice="nova", speed=1.0):
                play_voice()

i might have some import missing

valid crypt Mar 26, 2025, 8:33 PM

#

valid crypt ```import json import requests from threading import Thread # Configure the Ope...

    def play_mp3_th():
        pg.init()
        try:
            pg.mixer.music.load(voice_path)
            pg.mixer.music.play()
            while pg.mixer.music.get_busy():
                pg.time.Clock().tick(1)
            pg.mixer.music.stop()
        except Exception:
            pass
        pg.quit()

    Thread(target=play_mp3_th).start()

#

100% ~~not ~~stolen 👍

halcyon quarry Mar 26, 2025, 8:38 PM

#

I've got big plans my dude

#

big big plans

valid crypt Mar 26, 2025, 8:39 PM

#

halcyon quarry Try deleting these 3 lines

i dont have that but i suppose it is the same

halcyon quarry Mar 26, 2025, 8:39 PM

#

oops yeah I started typing this warning

#

yeah clear those 2 lines

#

I'm deleting this message

#

I just deleted those 2 lines and pushed it

valid crypt Mar 26, 2025, 8:44 PM

#

deleting those i get double warns but works just fine

valid crypt Mar 26, 2025, 8:47 PM

#

valid crypt deleting those i get double warns but works just fine

the first is when bot receives the audio and the second one when bot sends the audio

halcyon quarry Mar 26, 2025, 8:48 PM

#

I'll try to debug this later

#

What warning?

valid crypt Mar 26, 2025, 8:49 PM

#

actually this is useful for me :p ill pass the footage

#

halcyon quarry Mar 26, 2025, 8:51 PM

#

ugh

valid crypt Mar 26, 2025, 8:51 PM

#

there is definitely 0.5s of room to improve, if the second delay is not discord that would be another 1s

halcyon quarry Mar 26, 2025, 8:52 PM

#

That error is not discord so there's no improvement

#

that's just a warning from the bot

valid crypt Mar 26, 2025, 8:53 PM

#

i wrote delay, it took 1s to send the audio file

halcyon quarry Mar 26, 2025, 8:54 PM

#

Ok - by "There is no improvement" I mean, when using TGWUI the way the bot is now

valid crypt Mar 26, 2025, 8:54 PM

#

oh ok

halcyon quarry Mar 26, 2025, 8:55 PM

#

One more thing....

#

I noticed that your print statements included that your character will go idle

#

Offhand I don't think the responsiveness setting / go idle crap would add delay to this processing

valid crypt Mar 26, 2025, 9:00 PM

#

valid crypt hmmm, it feels slow to be directly, the first video is raw, the second one i add...

im using this one

#

it has no responsiveness setting

valid crypt Mar 26, 2025, 11:52 PM

#

Trying to fix for one last, I messed up with my system and I only can boot into safe mode, hope I can fix it

#

lesson of the day, do restore point in different drives and never touch display drivers

halcyon quarry Mar 27, 2025, 12:33 PM

#

When I have a chance, I’ll update the extension argument handling, so it doesn’t spam warnings when there’s really no problem

#

When I had coded this feature, the main focus was handling extensions with XTTS, and I was doing a lot of shooting from the hip to actually get it working fast

#

Still a few rough edges to smooth out

valid crypt Mar 27, 2025, 3:06 PM

#

valid crypt hijacking the extension?

i was wondering because if it's using the api directly it didnt even need to load any extension

halcyon quarry Mar 27, 2025, 3:06 PM

#

The bot code currently cannot directly support TTS API for normal requests - it's the remote extension for TGWUI that is doing the api calls

#

It is using the API for the /speak command

#

Need to update a crapton of logic in the bot and I don't have the time to do that at the moment

valid crypt Mar 27, 2025, 3:08 PM

#

alr

halcyon quarry Mar 27, 2025, 3:08 PM

#

Think about it though, it wouldn't save any time

#

Well

#

I'd have to include an option to simultaneously generate text and TTS

#

and the only sane reason for a user to do this would be if they have dedicated computer for each

valid crypt Mar 27, 2025, 3:11 PM

#

what i was wondering is why is it taking ~~so much~~ (0.5s) to do a call, while by using the web (remote) it is almost instant

halcyon quarry Mar 27, 2025, 3:11 PM

#

The bot could then go ahead and happily buzz along generating the text uninterrupted - and on the bot end, I'd need for the responses to wait for TTS gen to complete so it can deliver both chunks at once

valid crypt Mar 27, 2025, 3:11 PM

#

same for the extension, the text is generated for 0.5s and then it send the call

halcyon quarry Mar 27, 2025, 3:12 PM

#

Yes - that's because there is no stop and go

#

generates 100% text, generates TTS - sends both

#

Bot, back and forth back and forth

#

It will take a very sophisticated framework to allow both at once, with dedicated computers

#

due to the streaming responses feature

#

otherwise, very simple

#

The current framework is already very complicated - so it's going to be quite a can of worms overhauling it

#

Not saying it won't happen but it's going to be a little while - and I am moving in that direction
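
(A rough sketch of the "both at once" idea above, for anyone following along. `generate_text_chunks` and `synthesize` are hypothetical stand-ins that just sleep; nothing here is taken from the bot or TGWUI. It starts TTS for each chunk while text keeps generating, then waits so each text chunk is delivered together with its audio.)

```python
import asyncio

async def generate_text_chunks():
    """Hypothetical text generator that yields one sentence at a time."""
    for chunk in ["Hello there.", "How are you?", "Goodbye."]:
        await asyncio.sleep(0.01)  # pretend the model is generating
        yield chunk

async def synthesize(chunk):
    """Hypothetical TTS request, e.g. to a second machine."""
    await asyncio.sleep(0.03)  # pretend the TTS server is working
    return f"audio<{chunk}>"

async def pipeline():
    tasks = []
    async for chunk in generate_text_chunks():
        # Kick off TTS for this chunk without pausing text generation.
        tasks.append((chunk, asyncio.create_task(synthesize(chunk))))
    # Wait for each TTS job so text and audio are delivered as a pair, in order.
    return [(chunk, await task) for chunk, task in tasks]

results = asyncio.run(pipeline())
for text, audio in results:
    print(text, "->", audio)
```

With streaming responses in the mix, the real thing would need far more bookkeeping than this, which is the point made above.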

valid crypt Mar 27, 2025, 3:14 PM

#

actually i can try all in the same machine, and then i'll tell if there's a lot of performance bottleneck or an actual problem

halcyon quarry Mar 27, 2025, 3:15 PM

#

If you stick a print statement anywhere in the bot code, you'll see that code execution is almost instant, everywhere.
The only slowdowns are discord interaction and generative tasks

valid crypt Mar 27, 2025, 3:19 PM

#

valid crypt ?

if i want to print when does it do the api call, where would it be, in the all talk extension? i want to generate text entirely and no streaming to test the latency

#

i also want to print when all talk receives the api call

halcyon quarry Mar 27, 2025, 3:21 PM

#

search for chatbot_wrapper in TGWUI/modules/chat.py

#

wayyy at the bottom, it will trigger TTS immediately before returning the complete text response

#

#

ad discordbot applies this logic whenever the bot decides to split text, instead of waiting until 100% text generated

valid crypt Mar 27, 2025, 3:25 PM

#

aww tgwui

#

was searching in ad bot :p

halcyon quarry Mar 27, 2025, 3:25 PM

#

If you disable the bot's streaming text setting in config.yaml

#

it will behave the same as TGWUI

#

You can find the modified version in the bot by also searching chatbot_wrapper

valid crypt Mar 27, 2025, 3:27 PM

#

right, before blaming your bot i should test tgwui first :P

#

where i print when it finishes?

#

tgwui

halcyon quarry Mar 27, 2025, 3:35 PM

#

in the screenshot I shared, the first print is when text is done generating and it is about to request TTS

#

the second print is when it's going to return the text and audio.
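
(A minimal sketch of the two-print timing approach described here. `generate_text` and `request_tts` are hypothetical placeholders, not functions from TGWUI or the bot; in practice the prints would go around the real calls near the end of `chatbot_wrapper`.)

```python
import time

def timed(label, func, *args):
    """Run func(*args), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s")
    return result, elapsed

# Hypothetical stand-ins for the real generation and TTS steps.
def generate_text(prompt):
    time.sleep(0.05)
    return f"reply to {prompt}"

def request_tts(text):
    time.sleep(0.02)
    return b"fake-audio-bytes"

text, text_seconds = timed("text generation", generate_text, "hello")
audio, tts_seconds = timed("TTS request", request_tts, text)
```

Comparing the two numbers shows whether the wait is in text generation or in the TTS round trip.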

valid crypt Mar 27, 2025, 3:36 PM

#

halcyon quarry

i didnt see the second one sorry

#

alr tgwui has a huge problem

halcyon quarry Mar 27, 2025, 3:41 PM

#

oh?

valid crypt Mar 27, 2025, 3:41 PM

#

around 2.5s and the tts is done in 1.05 like with the bot

valid crypt Mar 27, 2025, 3:42 PM

#

valid crypt around 2.5s and the tts is done in 1.05 like with the bot

copy pasted the print :v

halcyon quarry Mar 27, 2025, 3:42 PM

#

🤓

#

Bot checks out

#

the huge optimizations that you'll see in open source crap is like, different computational algorithms and stuff

#

the bot is just a whole bunch of normal simple braindead operations wrapped around these complex generative models

#

I've been messing around with it enough to know, there are no intentional pauses anywhere unless user wants to play around with responsiveness

valid crypt Mar 27, 2025, 3:50 PM

#

i think that the text is generated and saved instantly, but the way it uses the extension is not very efficient

#

nope, alltalk is slow i think, the last thing is the time when it does the call

valid crypt Mar 27, 2025, 6:39 PM

#

there is something making everything slower, the first print is when it receives the api call, and the second one is when it sends

valid crypt Mar 27, 2025, 6:55 PM

#

ayo i found a way to do this lightning fast but idk how

#

example of tgwui api call

#

the part that is taking extra time is completing the tts to send or just generation completed

#

but this only happens using the tgwui api call

#

instead, if you use the alltalk :7851 to generate tts it is done instantly and skips the completing process

valid crypt Mar 27, 2025, 7:02 PM

#

valid crypt instead, if you use the alltalk :7851 to generate tts it is done instantly and s...

valid crypt Mar 27, 2025, 7:03 PM

#

valid crypt

lightning fast

halcyon quarry Mar 29, 2025, 12:00 AM

#

ok so I finally have the installer changes hammered out. With this framework the bot could theoretically support additional deep integrations like the TGWUI one

halcyon quarry Mar 29, 2025, 12:18 AM

#

This bit was definitely not my forte

halcyon quarry Mar 29, 2025, 12:42 AM

#

Wizard 🌈

halcyon quarry Mar 29, 2025, 2:18 AM

#

OK - I can't say for sure if these are bug free or not but I updated all the launchers and replaced all the updaters with wizards

#

This is on the "unrequire TGWUI" branch https://github.com/altoiddealer/ad_discordbot/tree/unrequire_tgwui

halcyon quarry Mar 29, 2025, 11:10 AM

#

Marcos you only use windows yes?

valid crypt Mar 29, 2025, 11:11 AM

#

im a windows boy

halcyon quarry Mar 29, 2025, 11:11 AM

#

✋

#

Tried linux in the past to more efficiently run a little budget HTPC (home theater PC) and it was a very painful experience

valid crypt Mar 29, 2025, 11:14 AM

#

XD, i only use wsl if windows is not supported

halcyon quarry Mar 29, 2025, 11:16 AM

#

Are you able to try running the WSL launcher on that branch?

#

That’s the one I most expect to be broken lol

valid crypt Mar 29, 2025, 11:19 AM

#

valid crypt i wont be able to test anything for a few days as my intel cpu is definitely coo...

my cpu is missing, so i would like to try when my cpu arrives

halcyon quarry Mar 29, 2025, 11:19 AM

#

Ah yeah

valid crypt Mar 29, 2025, 11:20 AM

#

but i can test with windows 🙂

halcyon quarry Mar 29, 2025, 11:20 AM

#

The windows one will work up until it actually loads the bot haha

#

The hard part is done though I can probably finish the job tonight or tomorrow

#

On first run it determines if the parent directory is TGWUI. If it is, it gives 2 options: integrated install (uses TGWUI environment and will enable more features) or Standalone (create and use own env)

#

Also included a method to check if parent is a fork of TGWUI and allow that

#

The update wizard includes an option to switch the install from TGWUI/Standalone
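
(A sketch of what the first-run check could look like. The marker names `server.py` and `modules` are an assumption about what identifies a TGWUI folder, and `choose_install_mode` is a hypothetical helper; the bot's actual launcher logic is not shown in this chat.)

```python
from pathlib import Path
import tempfile

# Assumed markers for a TGWUI (or fork) directory; the real check may differ.
TGWUI_MARKERS = ("server.py", "modules")

def looks_like_tgwui(directory: Path) -> bool:
    """True if every assumed marker exists inside the directory."""
    return all((directory / marker).exists() for marker in TGWUI_MARKERS)

def choose_install_mode(bot_dir: Path, prefer_integrated: bool = True) -> str:
    """Offer 'integrated' only when the parent looks like TGWUI, else 'standalone'."""
    if looks_like_tgwui(bot_dir.parent) and prefer_integrated:
        return "integrated"
    return "standalone"

# Demo in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "server.py").write_text("")
    (root / "modules").mkdir()
    bot = root / "ad_discordbot"
    bot.mkdir()
    mode_inside = choose_install_mode(bot)
    mode_opt_out = choose_install_mode(bot, prefer_integrated=False)

    bare = root / "elsewhere" / "ad_discordbot"
    bare.mkdir(parents=True)
    mode_outside = choose_install_mode(bare)

print(mode_inside, mode_opt_out, mode_outside)
```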

#

The standalone install will not be able to do text generation until I slap in a TGWUI api method. The ultimate plan is for the bot to accept any configured software for api calls for a number of specific purposes

#

(text gen, img gen, video gen, tts gen, etc) so long as the API has a get method for it or the user predefines the expected payload structure
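
(One way a user-predefined payload structure might look, purely as illustration. `API_CONFIG`, its keys, and `build_payload` are hypothetical names, and the payload fields are placeholders rather than any real API's schema; only the port number comes from this chat.)

```python
import copy

# Hypothetical user config: one entry per purpose, each with a payload template.
API_CONFIG = {
    "tts": {
        "url": "http://127.0.0.1:7851",  # base address only; endpoint path left out
        "payload": {"text": "{prompt}", "voice": "{voice}", "stream": False},
    },
}

def build_payload(purpose, **values):
    """Fill the string placeholders in the configured template for one purpose."""
    template = copy.deepcopy(API_CONFIG[purpose]["payload"])
    return {
        key: value.format(**values) if isinstance(value, str) else value
        for key, value in template.items()
    }

payload = build_payload("tts", prompt="hi there", voice="default")
print(payload)
```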

halcyon quarry Mar 29, 2025, 11:56 AM

#

Also planning to add a user “command builder” that can utilize configured apis

#

So a user could easily configure their own custom command like “/set_tts_voice” etc

#

Or a command for a specific comfyui workflow like “/comfy_wan_img2vid”

halcyon quarry Mar 29, 2025, 12:56 PM

#

@valid crypt about recommending your fork of STT, if you could write something up like a word doc, I could probably do that. Needs to include how to install, troubleshooting, etc

#

In terms of usage with the bot, or any deviation from normal instructions from your project page

valid crypt Mar 29, 2025, 2:02 PM

#

my readme is pretty complete, i dont think i missed anything, although i can improve a little

#

i could make a bat to install or something

#

ill look into those

valid crypt Mar 29, 2025, 3:24 PM

#

halcyon quarry @valid crypt about recommending your fork of STT, if you could write so...

cant be easier, and no troubleshooting it just works :p

#

some bugs

#

forgot to include usage but i think this is enough

#

also you have to mention that these are for those tts tags

halcyon quarry Mar 29, 2025, 3:47 PM

#

Could you share some example tags, like the minimum for it to play nicely with my bot?

valid crypt Mar 29, 2025, 3:48 PM

#

valid crypt also you have to mention that these are for those tts tags

i only have that as it is a simple bot to do transcription...

#

as i use it with /main i didnt include ping or a field to add the name of the bot

#

i could add replacing to do some voice command, something like if i say "create image", replace it with the tag as no asr will include things like _ (create_image), but the tag can adapt to the transcription.

valid crypt Mar 29, 2025, 4:01 PM

#

valid crypt i could add replacing to do some voice command, something like if i say "create ...

shut up --> stop tag
or
stop tag: 'shut up'
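
(A small sketch of the phrase-to-tag replacement idea. The tag strings below are placeholders, not the bot's real tag syntax, and `VOICE_COMMANDS` and `apply_voice_commands` are hypothetical names.)

```python
import re

# Hypothetical mapping from spoken phrase to the tag that should replace it.
VOICE_COMMANDS = {
    "shut up": "[[stop]]",
    "create image": "[[create_image]]",
}

def apply_voice_commands(transcript, commands=VOICE_COMMANDS):
    """Replace each known spoken phrase (case-insensitive, whole words) with its tag."""
    result = transcript
    for phrase, tag in commands.items():
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        # A function replacement avoids any escape handling in the tag text.
        result = pattern.sub(lambda match, tag=tag: tag, result)
    return result

print(apply_voice_commands("Shut up please"))
print(apply_voice_commands("please create image of a cat"))
```

Keeping the phrases in a plain mapping means the tag can adapt to whatever the transcription actually produces, since ASR output will not contain underscores.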

#

¯_(ツ)_/¯

valid crypt Mar 29, 2025, 7:36 PM

#

i found out why my all talk is taking more time, all because of me being dumb

halcyon quarry Mar 29, 2025, 10:01 PM

#

Do tell

valid crypt Mar 29, 2025, 10:01 PM

#

tell?

halcyon quarry Mar 29, 2025, 10:02 PM

#

Yeah, tell me 😛

valid crypt Mar 29, 2025, 10:06 PM

#

didnt understand

halcyon quarry Mar 29, 2025, 10:23 PM

#

Just curious what you did that caused slowdown with alltalk

valid crypt Mar 29, 2025, 10:25 PM

#

rvc 💀

halcyon quarry Mar 30, 2025, 1:14 AM

#

Seems like I'm picking a good time to try making TGWUI API possible.

terse folio Mar 30, 2025, 1:21 AM

#

That's cool!

halcyon quarry Mar 30, 2025, 2:15 AM

#

Hit a bit of a snag with my new install method

#

On this branch

#

I'm not sure what exactly is causing this, but now when I import TGWUI's modules.shared - the args parser in that module is now parsing my bot's args

#

On the plus side, the bot will now successfully launch and handle image generation tasks etc - without TGWUI.

#

Just need to debug this particular issue here and the TGWUI integration should also be back in place

#

This has something to do with how the bot initializes... previously, os.getcwd() would return the directory to TGWUI, but now even running from TGWUI env os.getcwd() is returning root dir of the bot

#

I was also parsing args in bot.py but I moved that to utils_shared.py

halcyon quarry Mar 30, 2025, 11:03 AM

#

Ah okay… the bot code was originally popping them as they were read in. I tweaked it, there’s probably a straggler.
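
(A sketch of the "pop the bot's args before anyone else parses them" pattern, using only standard `argparse`. The flag names are made up for the demo, and `parse_other_args` is a stand-in for a module that parses the command line when imported; it is not TGWUI's actual parser.)

```python
import argparse

def parse_bot_args(argv):
    """Parse only the bot's own flags; return (namespace, leftover args)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--bot-token")                 # hypothetical bot flag
    parser.add_argument("--limit-history", type=int)   # hypothetical bot flag
    return parser.parse_known_args(argv)

def parse_other_args(argv):
    """Stand-in for another module's parser, which rejects flags it does not know."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--listen", action="store_true")
    return parser.parse_args(argv)

argv = ["bot.py", "--bot-token", "abc", "--listen", "--limit-history", "5"]
bot_args, leftover = parse_bot_args(argv[1:])

# Only the leftovers are handed on. In a real launcher this would be
# sys.argv = [sys.argv[0]] + leftover, done before the import that parses args.
other_args = parse_other_args(leftover)
print(bot_args, leftover, other_args)
```

A straggler flag left in the argument list is exactly what would make the second parser choke, which matches the symptom described above.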

halcyon quarry Mar 30, 2025, 11:44 AM

#

Oobabooga actually reopened 4 issues I opened that were closed as Stale

calm rain Mar 30, 2025, 11:44 AM

#

every single closed-as-stale issue was reopened

#

i got like 50 notifs

halcyon quarry Mar 30, 2025, 11:45 AM

#

Ah - that’s very cool

calm rain Mar 30, 2025, 11:45 AM

#

i guess he noticed that a lot of the closed-as-stale issues were wrong and wanted to unfuck it a bit

#

just need somebody with a massive amount of patience given access to manually close them, to then go through and actually sort things out

halcyon quarry Mar 30, 2025, 11:46 AM

#

I remember feeling a little dismayed when no response, they were legitimate issues

valid crypt Mar 30, 2025, 12:45 PM

#

2.5k issues is 💀

halcyon quarry Mar 30, 2025, 12:55 PM

#

Yeah at the time I think it was like > 1k as well

calm rain Mar 30, 2025, 1:07 PM

#

"victim of its own success" situation - repo gets far more issue posts than the one guy running it has time to manage

#

need either a team or a dedicated ~~crazy person~~ extremely high patience individual to sort it

#

ComfyUI has the same problem
-# (for a bit of time I was in charge of solving it re comfy but I wasn't enough and I'm not on that team anymore so rip)

halcyon quarry Mar 30, 2025, 1:14 PM

#

Meanwhile I get excited when an issue is reported - evidence someone is using the bot lol

#

I was starting to feel something like a team member in Forge, I had solved a few medium difficulty issues. But interest waned when I couldn’t get Illyasviel to review something… the project is like dead to him immediately after he added flux support. I improved the logic of module handling for model changes, they would have just merged it but I changed enough that I felt it warranted the author’s blessing

#

I can’t even imagine how good Forge would be if Illyasviel kept plugging away at it from time to time, the guy’s a genius

#

I heard there’s this cool app called SwarmUI too

#

gowron1

halcyon quarry Mar 30, 2025, 1:54 PM

#

@calm rain just curious to know - did you come up with the idea to use the calculations to factor rounding precision of image resolutions? Or did you basically lift that from trainer code / stability / etc?

#

So simple yet such an effective method for applying nice res values. Love it

calm rain Mar 30, 2025, 2:15 PM

#

halcyon quarry @calm rain just curious to know - did you come up with the idea to us...

it's literally how the resolutions for training data in many models (including SDXL but also pretty much most models out there) get selected. The base set of resolutions available in the UI are literally just the SDXL trainset res list. The unique bit in Swarm is just the UI to select it, and reusing the same math to estimate good values for models that work at other scales (since they almost certainly calculate train res's the same way anyway)
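
(A sketch of the kind of math being described: hold the pixel count near a target and snap each side to a fixed step. The step of 64 and the rounding rule are assumptions for illustration; this is not Swarm's code or any trainer's exact bucketing logic.)

```python
import math

def bucket_resolution(aspect_ratio, target_side=1024, step=64):
    """Return (width, height) near target_side**2 pixels, each a multiple of step."""
    target_area = target_side * target_side
    width = math.sqrt(target_area * aspect_ratio)   # exact width for this ratio
    height = width / aspect_ratio                    # exact height for this ratio

    def snap(value):
        # Round to the nearest multiple of step, never below one step.
        return max(step, int(round(value / step)) * step)

    return snap(width), snap(height)

square = bucket_resolution(1.0)
wide = bucket_resolution(16 / 9)
print(square, wide)
```

Changing `target_side` is how the same math would be reused to estimate values for models that work at other scales.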
