#ad_discordbot (Fork of Fork of xNul's bot)


halcyon quarry
#

pretty simple

#
        active_settings = copy.deepcopy(bot_active_settings.get_vars())
        if not active_settings:
            bot_active_settings.init_activesettings()
terse folio
#

Use the load_defaults function that runs on class start, that's designed for creating the default variables

#

I'll make the switch when I get back

halcyon quarry
#

OK!

#

I'm going to hop off for the night or else I'll be in the doghouse

terse folio
#

back, and okay

halcyon quarry
#

Next (I can probably figure this out), I think I'll put all user settings files in a ‘settings_templates’ folder, with a txt instruction to copy them up one level to put them into effect

#

With this and the ‘internal’ change, I think users could git pull without issues, maybe just miss out on features if they don’t pay attention

terse folio
#

need to write an updater

#

little python script maybe that runs install of requirements, git pull and all

terse folio
#

active_settings generates when switching characters

halcyon quarry
#

Ideally, it will have empty dicts when it isn’t found - then when botsettings initializes it will assign all the default values during update_settings()

#

Which is where I showed the snippet earlier

terse folio
#

it does now, but initializing with defaults isn't necessary because the bot will fill out what it needs anyway

#

for example I didn't have SD online

#

imgmodel was left as {}
but everything else was populated

#

Since users won't be modifying this file, I think that's fine

#

the code already handles defaults with the .get statements, right?

#

or is there something else I'm missing

halcyon quarry
#

Before I changed it I was getting error during character loader, the update_dict function errors because the llmstate dict doesn’t exist yet

terse folio
#

yes, just needed to load the defaults ^^

#

same happened with the starboard
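
For reference, a minimal sketch of the failure described above and why loading defaults first avoids it. `update_dict` and `load_settings` here are hypothetical stand-ins, not the bot's actual functions, and the key names are only illustrative.

```python
import copy

# Hypothetical defaults; the real bot's keys may differ.
DEFAULTS = {"llmstate": {"state": {}}, "imgmodel": {}}

def update_dict(target: dict, updates: dict) -> dict:
    """Recursively merge `updates` into `target` (illustrative stand-in)."""
    for key, value in updates.items():
        if isinstance(value, dict):
            # Raises KeyError if `target` has no such section yet.
            update_dict(target[key], value)
        else:
            target[key] = value
    return target

def load_settings(saved: dict) -> dict:
    """Start from a copy of the defaults, then layer saved values on top."""
    settings = copy.deepcopy(DEFAULTS)
    return update_dict(settings, saved)

# With defaults loaded first, a nearly empty settings file is fine.
settings = load_settings({"llmstate": {"state": {"temperature": 0.7}}})

# Without defaults, the nested section is missing and the merge fails.
try:
    update_dict({}, {"llmstate": {"state": {"temperature": 0.7}}})
    failed = False
except KeyError:
    failed = True
```

Sections that never receive a value, like `imgmodel` here, simply stay as `{}`.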

halcyon quarry
#

Then yes, all is well now 🤗

#

Well almost well 🙂 Did you do anything in regards to the error messages?

terse folio
#

Which ones?

#

There's just a mention of file not found, but that's fine, it will be created when it needs it

#

User doesn't have to do anything

halcyon quarry
#

FileNotFound - but this is expected, so I don’t think we should flag as error right away

terse folio
#

okay

halcyon quarry
#

As I mentioned earlier I think it should first give a logging info or maybe warning, but if file fails to create then error

terse folio
#

added a "missing_okay" flag to load_file

#

Also, at the top of the function data = None is defined.

It's actually okay without this because in all states of the if statement "data" will be assigned.

And the "with" statement doesn't change the scope of the variable.
So it just passes down until it's returned ^^

I understand there could be fears about this causing bugs, but it's all good!
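
A rough sketch of what a `missing_okay` flag could look like, assuming a JSON file to stay within the standard library (the bot's real `load_file` and file format may differ). It also shows the scoping point: `data` is assigned in every branch, including inside the `with` block, so no `data = None` is needed at the top.

```python
import json
import logging
import os
import tempfile

def load_file(path: str, missing_okay: bool = False) -> dict:
    """Load a JSON file; hypothetical stand-in for the bot's load_file."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)  # assigned inside the `with` block...
    elif missing_okay:
        logging.info("%s not found; it will be created when needed.", path)
        data = {}
    else:
        logging.error("%s not found.", path)
        data = {}
    return data  # ...and still visible here

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "statistics.json")
    missing = load_file(path, missing_okay=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"llm": {}}, f)
    loaded = load_file(path)
```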

#

Python is interesting like that.
some variables can leak out in unexpected ways

#

Like if you were to write

for i in range(10):
    pass

Then later write

print(i)

This would actually print "9" and not cause an error.

#

Many compiled, statically typed languages on the other hand refuse to compile code that reads a variable before it's been assigned 😸

halcyon quarry
#

Well pass does basically nothing

terse folio
#

yes, pass is just a placeholder for "do something here"

#

in my example

halcyon quarry
#

I’m just contemplating your indent level on the print statement chiharu

terse folio
#

Exactly, it doesn't seem obvious at first because the indents are wrong

#

But python treats the variable roughly like this

i = None
for i in range(10):
    pass  # do stuff
print(i)

The "i = None" is only there for illustration, python doesn't actually create it

#

Of course, this only works if it's ensured that the loop will at least iterate once

#

because if the loop never runs, i is never defined

#

In the case of the if statements though,
if both "if" and "elif" and "else" initialise "i"
then it will never cause a variable error
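
A small sketch of both cases discussed above: the loop variable outliving the loop, and the error when the loop never runs. The function names are made up for the example.

```python
def last_item(items):
    for item in items:
        pass
    return item  # works only if the loop ran at least once

def last_item_safe(items, default=None):
    item = default  # explicit default covers the empty case
    for item in items:
        pass
    return item

result = last_item(range(10))

try:
    last_item([])
    error_name = None
except UnboundLocalError as exc:
    # Inside a function, the unassigned name raises UnboundLocalError,
    # which is a subclass of NameError.
    error_name = type(exc).__name__

safe_result = last_item_safe([])
```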

halcyon quarry
#

Ahhh so you're saying that after a loop, the last key, value, whatever would still be the variable outside the loop?

terse folio
#

Yes, but if you're working with data where it's possible the loop will be empty, you'll want to set a default value

#

Like you did here

halcyon quarry
#

Of course

#

I think I only fixed that recently chiharu

#

Now I know better for things like that

#

And I was having a lot of errors with missing embeds when trying to delete/edit them in exceptions

#

Until I set them as None right away

#

and of course ‘if embed:’
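
The pattern described here, sketched without discord.py: `send_embed` is a hypothetical stand-in for whatever call creates the embed message.

```python
def send_embed(should_fail: bool) -> str:
    """Pretend to send a message; raises to simulate a failed request."""
    if should_fail:
        raise RuntimeError("send failed")
    return "embed-object"

def run_task(should_fail: bool) -> list:
    log = []
    embed = None  # defined up front so the except block can always read it
    try:
        embed = send_embed(should_fail)
        log.append("sent")
        raise ValueError("a later step failed")
    except Exception:
        if embed:  # only clean up if the embed was actually created
            log.append("deleted embed")
        else:
            log.append("nothing to delete")
    return log
```

Without the `embed = None` line, the first case below would hit an unassigned variable inside the `except` block.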

terse folio
#

yea, the order you assign things can also be confusing

#

like a variable you're going to use has to be created before you use it.
But if creating functions/classes, they can be defined after
IF you use them within another function that is used later.

#
def main():
    test()

def test():
    pass
main()
#

This can also affect typehints too!

#

So, python lets you write a typehint in quotes, which lets it be evaluated later
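
A short sketch of quoted type hints (forward references). The class names are invented for the example.

```python
class Node:
    # "Node" is quoted because the class is still being defined here.
    def link(self, other: "Node") -> "Node":
        self.next = other
        return other

def make_registry() -> "Registry":  # Registry is defined further down
    return Registry()

class Registry:
    pass

reg = make_registry()
# The annotation is stored as a plain string until something resolves it.
hint = Node.link.__annotations__["other"]
```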

halcyon quarry
#

What is a typehint btw

terse folio
#

This would work for your IDE to know what class "db" takes

terse folio
#

Python doesn't enforce this

#

For example

halcyon quarry
#

This is very new for me

terse folio
#

if I comment out the typehint

#

python doesn't know what "self.llm" is

#

because it doesn't exist yet

#

it's loaded from a function where a lot could happen

#

it could be loaded as a NoneType, maybe an int, maybe something else

#

there's uncertainty here

#

made a mistake in the test function

halcyon quarry
#

Heh

terse folio
#

forgot to add self as a variable

#

Wow, python is actually smarter than that ^^
it evaluated the chain of functions to figure out that self.llm will be initialized as a "_Statistic"

halcyon quarry
#

I understand… if you put a type hint, VScode and the like will make better suggestions and it will be more apparent what values are expected

terse folio
#

This is because, self.load_defaults() will always run in __init__

#

so python is safe to assume what self.llm is

terse folio
#

new example

#

let's assume self.llm is a dict from the statistics file

#

data.pop will return the value of llm, or {}
so there's a possibility it won't be a dict according to python

halcyon quarry
#

Well, the dict

terse folio
#

because of this, python doesn't give me a hint

#

by telling python it will be a dict after importing from the file

#

I now get all the dict hints 😸
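
Roughly what that annotation looks like, as a sketch. The `Statistics` class and its `record` method are hypothetical; only the `data.pop` line mirrors the discussion.

```python
class Statistics:
    def __init__(self, data: dict):
        # The ": dict" tells the editor what to expect after loading,
        # so dict methods show up in autocomplete.
        self.llm: dict = data.pop("llm", {})

    def record(self, key: str) -> int:
        self.llm[key] = self.llm.get(key, 0) + 1
        return self.llm[key]

stats = Statistics({})
count = stats.record("messages")

# The hint is not enforced at runtime: a wrong value is stored as-is.
wrong = Statistics({"llm": 5}).llm
```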

halcyon quarry
#

Aha

#

Is it good practice to use them pretty much everywhere?

terse folio
#

when it's not obvious what a variable is, yes

#

but if you define a variable like
i = 10

python knows it's an int and will give you the typehints for it

#

thing is, I didn't write this code, I don't know what is being passed into all these functions and what comes out.
So it can be harder to debug

halcyon quarry
#

Seems like you figured out a lot pretty fast

terse folio
#

So I began tracing back variables to their source and documenting what they are and what code should return

halcyon quarry
#

What does -> do?

terse folio
#

typehints are like spellcheck for your code ^-^

halcyon quarry
#

Ahh

terse folio
#

def f(arg: type) -> output:

halcyon quarry
#

If it returns many things that probably doesn’t look so hot
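
A sketch of `->` with more than one return value, using the standard `typing` module. The functions are made up for illustration.

```python
from typing import Optional, Tuple

def parse_size(text: str) -> Tuple[int, int]:
    """Several return values are annotated as a tuple."""
    width, height = text.split("x")
    return int(width), int(height)

def find_user_id(name: str, users: dict) -> Optional[int]:
    """Optional[int] means 'an int or None'."""
    return users.get(name)

def double(value: int) -> int:
    return value * 2

size = parse_size("1024x768")
not_found = find_user_id("nobody", {})
# Hints are not checked at runtime, so a str slips through here.
unchecked = double("ab")
```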

terse folio
#

Typehints aren't always purely for debugging

#

if you take a discord command for example

#

when you create an arg for the command, you specify a typehint

#

discord.py reads the typehints and creates code in the backend to enforce them.
they're called converters

halcyon quarry
#

What’s interesting is that I’ve done an absolute massive crapton of coding with chatgpt but it never provided code with typehints

terse folio
#

so if you defined input:int

and ran a command /test 100
It would return an int in python, instead of a string "100"

#

the same works for "discord.User"
you could write a user's name, and discord would figure out to convert that arg to a discord.User type
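
The general idea of reading type hints to convert arguments can be sketched with the standard `inspect` module. This is only an illustration of the concept, not discord.py's actual converter code; `call_with_conversion` and `add_one_command` are invented names.

```python
import inspect

def call_with_conversion(func, *raw_args: str):
    """Convert each raw string to the type named in the parameter's hint."""
    params = inspect.signature(func).parameters.values()
    converted = []
    for raw, param in zip(raw_args, params):
        hint = param.annotation
        # Parameters without a hint are passed through unchanged.
        converted.append(raw if hint is inspect.Parameter.empty else hint(raw))
    return func(*converted)

def add_one_command(amount: int, label: str):
    return amount + 1, label

# The raw text "100" arrives in the function as the int 100.
result = call_with_conversion(add_one_command, "100", "hello")
```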

halcyon quarry
#

Ahhh

terse folio
#

but it's technically correct in the sense it shows you what the arg is 😸

terse folio
#

just in an unofficial way

halcyon quarry
#

Well I understand there’s probably other libraries with similar hints then

halcyon quarry
#

I do have to run now for sure, thanks for the hints 😉

terse folio
#

Cya!

#

Super secret cool thing:
You can also assign custom data to functions!

#

Discord.py also uses this, that's how it creates commands/events internally.
It's the way it figures out how to route the right event to the right function
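
A toy version of the idea: a decorator attaches custom data to a function, and a dispatcher later looks for it. The `event` decorator and `Bot` class below are invented for the example; discord.py's real internals are more involved.

```python
def event(name: str):
    def decorator(func):
        # Functions are objects, so extra attributes can be attached.
        func.event_name = name
        return func
    return decorator

class Bot:
    @event("message")
    def handle_message(self, text):
        return f"got {text}"

    @event("ready")
    def handle_ready(self, _payload):
        return "ready"

    def dispatch(self, name, payload=None):
        # Route the event to whichever method carries the matching tag.
        for attr in dir(self):
            method = getattr(self, attr)
            if callable(method) and getattr(method, "event_name", None) == name:
                return method(payload)
        return None

bot = Bot()
reply = bot.dispatch("message", "hi")
```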

#

Merged 👍

halcyon quarry
#

Just brushed teeth, picked up phone, see you had more info 🤗 good stuff

#

I’m looking through the commits… I don’t think you were caught up with main?

terse folio
#

is main different from dev?

halcyon quarry
#

Could’ve sworn I did change bot_statistics.llm_statistics to just bot_statistics.llm

terse folio
#

You did, I also changed it to bot_statistics.llm

#

maybe both of us doing it made it revert?

#

i'll have to check

halcyon quarry
#

Ok I guess I just didn’t get there yet

terse folio
#

no no, you did make the change, but something went wrong if it's reverted again

#

both main and dev use bot_statistics.llm for me

terse folio
#

Yea, but after merging (next commits)
the code matched so it wasn't listed in the PR

halcyon quarry
#

whew

terse folio
#

Look at the files changed tab

#

because the change I made already equalled the change you made (after all the commits were applied)

halcyon quarry
#

Will get a better look tomorrow was just taking a sneak peek

#

Gnight for real lol

halcyon quarry
#

You went through the Recent Changes in the readme and removed what looks like 40 line break instances, lol

#

Ahhh man, I try and avoid modifying most of the user setting files unless it's essential - because updating the bot is still an unpleasant task.

#

That is quite the cleanup though I must say, must have eradicated 1k spaces

terse folio
#

i just ran a regex on the folder
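
The exact regex isn't shown in the chat; a typical trailing-whitespace cleanup looks something like this sketch.

```python
import re

# One or more spaces/tabs directly before the end of a line.
TRAILING_WS = re.compile(r"[ \t]+$", re.MULTILINE)

def strip_trailing_whitespace(text: str) -> str:
    return TRAILING_WS.sub("", text)

cleaned = strip_trailing_whitespace("line one   \nline two\t\n\nlast  ")
```

In Markdown, two trailing spaces force a line break, so running this over a README changes how it renders, which matches the removed line breaks noticed above.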

#

:3

#

users don't have to update those files, the cleanup only needs to apply to new people who freshly download it
that's fine

halcyon quarry
#

So long as they actually follow the install / update instructions

#

if they just clone it into textgen-webui and try git pull, they'll have to back up their user files first

terse folio
#

I think you should look into making the bot an extension for tgwi, that way people could just run git pull and it updates with no need to move files

halcyon quarry
#

I'm aiming to change the instructions back to that (clone into TGWUI)

#

Now that 'internal' and those settings are created and not part of the package, that's half the battle

terse folio
#

👍

halcyon quarry
#

@terse folio
My idea is to move the user settings files to a settings_templates directory.

In the main directory will be included a file called:
User settings are in settings_templates directory.txt

In that directory will be included files called:
Copy these files up to ad_discordbot directory.txt
Do not edit these files.txt

Then I modify the bot code to ensure it works without the user setting files.

Let me know if you have a better idea

terse folio
#

it's worth a shot, what I also see is using gitignore to ignore changes to configs when running git pull?
But it downloads them on git clone.

Not sure if it would say there's a merge conflict

halcyon quarry
#

I believe what would happen is that it would just ignore the files

#

The downside to this is the users won't know anything changed in those files when it happens, unless they visit the project page

#

For quite some time now, I've been careful to use .get() and even the config transition code I did - to ensure the script runs for users stubborn to update the settings files

terse folio
#

great!

halcyon quarry
#

(As is good practice, you're doing it too of course)

#

I'm going to start working on that idea next, unless you have any better idea

#

Have some work to do 😛

#

work work

terse folio
#

I'm currently working on a wrapper for xtts so I can use my finetuned model from an api.
The webui I was testing is a bit of a mess :P

halcyon quarry
#

FYI - you could probably use it via our bot

#

the finetuned model

halcyon quarry
#

for instance, Alltalk_TTS has a parameter that works with TGWUI to specify the model

terse folio
#

I'll search alltalk, I don't think it comes with tgwi by default

halcyon quarry
#

Fairly certain the behavior is enabled by default

terse folio
#

Also another thing I want to implement is holding the audio files in ram only.
I don't need a disk full of generated audio 😸
already deleted 1 GB from tests

halcyon quarry
#

y'know I have a setting for that right?

#

in config.yaml

terse folio
#

haven't checked it out yet

halcyon quarry
#

I think I have one to delete the output...

#

text-generation-webui\extensions\alltalk_tts\confignew.json
These parameters can be included in your character file under extensions key

#

And the bot will load them

#

Here is one default parameter:
"tts_model_name": "tts_models/multilingual/multi-dataset/xtts_v2"

terse folio
# halcyon quarry I think I have one to delete the output...

it would be cleaner to just never write the file, it's more performant to keep things in ram because you're not limited by read/write speeds.
And I like to move the AI stuff to an HDD so i'm not wasting SSD cycles on huge amounts of generated content

But that's a thing I'm willing to implement myself if it doesn't exist, no worries!

halcyon quarry
#

Yeah seems like alltalk does not have a setting to keep file in ram, seems like it wants to always write the output.
There is a setting to delete the output. But I don't think you want to use that as it will likely delete it before the bot plays it 😛

terse folio
#

with most programs you can tell them to write to an io.BytesIO() object.
This is an in-memory file-like object, like a temporary file in ram ^-^

#

from there you can convert it to bytes and send over a webapi to your application (in my case)
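
A sketch of writing audio to memory instead of disk, using only the standard `io` and `wave` modules. The silent clip stands in for real TTS output, and `silent_wav_bytes` is a made-up helper.

```python
import io
import wave

def silent_wav_bytes(seconds: float = 0.1, rate: int = 22050) -> bytes:
    """Build a short silent WAV entirely in RAM and return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    # The buffer stays open after the wave writer closes.
    return buffer.getvalue()

payload = silent_wav_bytes()

# The bytes can be read back the same way, again without touching disk.
with wave.open(io.BytesIO(payload), "rb") as wav:
    frame_rate = wav.getframerate()
```

`payload` is what would be sent over the web API.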

#

in the meantime, happy to test with it!

#

Does alltalk support piper models?

#

I want to look into alternatives to Xtts because it doesn't permit commercial use.
just want to feel less restrictions on whatever I do

halcyon quarry
#

I don't know all too much about it, don't know anything about piper

terse folio
#

Seems not, piper is only mentioned in issues when talking about feature requests

halcyon quarry
#

If memory serves me right piper was good but more vram required

terse folio
#

interesting, okay

halcyon quarry
#

I had zero interest in TTS until the advent of xtts

#

even then I'm just barely interested 😛

#

So easy to add voices though - idk if you noticed but our /speak command allows users to attach their own voice clip to generate with

#

can be mp3 or wav

terse folio
#

interesting, is that saved?

halcyon quarry
#

Nope, one time use

#

need to drag n drop every time 😛

#

You may lose sleep to learn it writes a temp file then plays it and removes it 😄

terse folio
#

sometime on the todolist, could add a feature to save a voice permanently or just for the current session.
And assign that voice to a name.

terse folio
#

Gradio writes temp files when you upload an audio as well

halcyon quarry
#

Its good enough as is IMO

#

borderline excessive

terse folio
#

I could be overreacting about SSD write cycles, not sure,
I just am in the habit of building optimized things :)

halcyon quarry
#

We should strive for perfection, indeed

#

Beautiful work on the updates

#

It's perfect. For a sec I thought it should say something about creating the files, but I don't think users need to be told this

#

Actually, I do want a message, but only for when the internal directory is created

terse folio
#

Fewer instructions means fewer points for users to screw it up 😸

halcyon quarry
#
import logging
import os

class SharedPath:
    dir_root = 'ad_discordbot'
    dir_internal = os.path.join(dir_root, 'internal')
    # Log only on first run, before the directory is created
    if not os.path.exists(dir_internal):
        logging.info('Creating dir "/internal/" for persistent settings not intended to be modified by users.')
    os.makedirs(dir_internal, exist_ok=True)
#

seem about right?

#

certainly works

terse folio
#

Sure

#

Streaming with alltalk is nice
But it feels like all the generation is half the speed of the other webui (even though deepspeed is enabled)

#

maybe it's doing some other processing to clean up the audio on top

#

also that's a cool UI for alltalk, nice settings

halcyon quarry
#

Oh yeah, the developer is in love with the project

#

It's a beautiful thing

terse folio
#

Xtts-webui:
resemble enabled: 24s
disabled: 11s

Alltalk: 17s

#

all with the same text prompt

#

but with shorter prompts, it's pretty consistent with speeds
and sometimes faster ^^

halcyon quarry
#

eek

#

@terse folio you may appreciate this

terse folio
#

I can't imagine why it would fail, because if path already exists, it will skip.

maybe a permission error.
but if that was the case, os.makedirs would raise an exception and not reach the log code.

#

Nice idea though!

#

Reusing code 😸

halcyon quarry
#

I'll just axe the second bit then

terse folio
#

I'm not sure if you can run the function like that

halcyon quarry
#

it's working

terse folio
#

because it doesn't belong to an instance yet

#

I see

#

cool cool

halcyon quarry
#

wildcard message doesnt appear b/c I have it symlinked from Forge extensions folder

terse folio
#

Random python info :P

There are decorators that let you call methods on a class without needing to create an instance of it.
like @classmethod
or @staticmethod

classmethod lets you define a "cls" arg instead of "self"
so you can use other functions in the class outside of an instance.

A static method isn't passed "self" or "cls", so it has no automatic access to other parts of the class iirc
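
A quick sketch of both decorators. The `Paths` class is invented for the example.

```python
class Paths:
    root = "ad_discordbot"

    @classmethod
    def internal(cls) -> str:
        # `cls` is the class itself, so class attributes and other
        # class-level methods are reachable without an instance.
        return cls.join(cls.root, "internal")

    @staticmethod
    def join(*parts: str) -> str:
        # No `self` or `cls` here: it behaves like a plain function
        # that happens to live inside the class.
        return "/".join(parts)

# Both are called on the class, no instance needed.
internal_dir = Paths.internal()
joined = Paths.join("a", "b")
```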

halcyon quarry
#

I've used cls before, but very little experience with it

#

Only lately getting any experience with self method

halcyon quarry
#

Now to offload these user settings...

#

Yes, on startup we're just going to go ahead and copy the files from the settings_templates directory

halcyon quarry
#

...surprisingly easy!

#

yoooo here we goooooo

#

I mean, let's gooooooo!

terse folio
#

Woooo!

halcyon quarry
#

This is a beautiful day for the bot, couldn't be possible without you

#

🍾

terse folio
#

Looks great!

halcyon quarry
#

Guess I should include a warning for if the src_path does not exist

terse folio
#

I don't think the code would be running in that case

#

that warning would have to be at the top of bot.py before you import other files

halcyon quarry
#

Well in this case the src_path is the file in settings_templates

#

so if the file does not exist in the main path, and also not in the settings_templates, raise an error
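
A sketch of that startup check, assuming the layout described in the chat (a `settings_templates` folder next to the user files). `ensure_user_file` is a hypothetical helper name, not the bot's own function.

```python
import os
import shutil
import tempfile

def ensure_user_file(name: str, root: str,
                     templates_dir: str = "settings_templates") -> str:
    """Copy a settings file from the templates folder if it is missing."""
    dest_path = os.path.join(root, name)
    if os.path.exists(dest_path):
        return dest_path  # never overwrite the user's own copy
    src_path = os.path.join(root, templates_dir, name)
    if not os.path.exists(src_path):
        raise FileNotFoundError(f"{name} not found in {root} or {templates_dir}")
    shutil.copy2(src_path, dest_path)
    return dest_path

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "settings_templates"))
    with open(os.path.join(root, "settings_templates", "config.yaml"), "w") as f:
        f.write("key: value\n")
    created = os.path.basename(ensure_user_file("config.yaml", root))
    copied_ok = os.path.exists(os.path.join(root, "config.yaml"))
    try:
        ensure_user_file("missing.yaml", root)
        raised = False
    except FileNotFoundError:
        raised = True
```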

terse folio
#

oh, thought you were talking about the ad_discordbot folder 😸

halcyon quarry
#

lol

#

Error - bot no exist 😛

terse folio
#

yea, but bot.py is still there all alone!

halcyon quarry
#

x_Copy all files from 'settings_templates' to here.txt
x_Copy all files up to 'ad_discordbot'.txt

terse folio
#

I think naming the folder "settings_templates" is obvious
just need to mention to copy them up a level

halcyon quarry
#

Committed to Main

Everyone may now git clone the project into /text-generation-webui/ and be able to git pull moving forward without issues!

#

@viral lagoon a long time ago you suggested making the user settings as templates - finally got around to that. Thanks for the tip 🙂

keen palm
#

What'd you add on this update?

halcyon quarry
#

Just need to replace bot.py and /modules/

Reality went nuts clearing up trailing spaces everywhere (why many files changed) 😛

Other than that, you could move your installation temporarily so that you can git clone https://github.com/altoiddealer/ad_discordbot into /text-generation-webui/

Then copy your settings files and /internal/ back to the new install.

Moving forward you can git pull without issues

keen palm
#

Hmmm, I keep getting this at the end of every response:

@keen palm" Reply Delete
Three different models producing the same thing

halcyon quarry
#

if you turned on server_mode in config, turn it off

#

Lmk if it's already off

keen palm
#

That's not an option in config

halcyon quarry
#

Good 😛

#

Try reset conversation

#

Maybe one time it decided to output that, and now it's mimicking itself

keen palm
#

Yeah, that seems to be the case. Bad bot

#

I'm testing a new model, and it loves using emojis and winks

#

I...don't know how I feel about that

halcyon quarry
#

😉

keen palm
#

I think it's hitting on me

halcyon quarry
#

My character sent a pretty impressive image involving a dildo

halcyon quarry
#

pylance is absolutely amazing

#

idk how I went this far without it

halcyon quarry
#

A user sent an Issue that they are using the bot on multiple servers, and that it is sending TTS to the one voice channel regardless of server 😆

#

I'm pushing the fix for this, just thought it was pretty funny haha

#

Apparently they have the bot on 3 servers at once

vestal python
#

Yes, I remember that one. I had a user asking about tomatoes and it'd pop up in the private voice channel I set.

halcyon quarry
#

I'm resolving this by checking if i.guild.voice_client == voice_client (variable representing the VC bot is connected to).

#

If not, it sets tts_resp = None and behaves like only text response was received
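
The check can be sketched like this, with `SimpleNamespace` objects standing in for the discord.py interaction and guild. `filter_tts` is a made-up name and the bot's actual code may be structured differently.

```python
from types import SimpleNamespace

def filter_tts(i, voice_client, tts_resp):
    """Drop the TTS response unless it belongs to the connected guild."""
    guild = getattr(i, "guild", None)
    if guild is None or guild.voice_client != voice_client:
        return None  # behave as if only a text response was received
    return tts_resp

connected_vc = object()  # stand-in for the bot's active voice connection
same_server = SimpleNamespace(guild=SimpleNamespace(voice_client=connected_vc))
other_server = SimpleNamespace(guild=SimpleNamespace(voice_client=None))

kept = filter_tts(same_server, connected_vc, "reply.wav")
dropped = filter_tts(other_server, connected_vc, "reply.wav")
```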

#

To disable the TTS from processing would require reloading extensions constantly though

keen palm
#

You have a plan in place for putting user name in prompts, right?

halcyon quarry
#

Yes

keen palm
#

That info is being sent to the LLM, though

halcyon quarry
#

Not yet - when I mentioned server_mode earlier I forgot that's only on the dev branch atm

keen palm
#

Well, I mean, when the bot responds, it responds with my user name

halcyon quarry
#

Oh - that's automated and the bot does not know it is @ mentioning you

#

But the user who wrote the messages is dynamically assigned to the user1 parameter

#

The LLM does see each user's name as the user

#

The @ mentioning occurs if the bot is not responding to the same person consecutively

keen palm
#

I'm not talking about the @ mentioning, though

halcyon quarry
#

LLM sees each username

#

If you ask the bot, what is my name?

It will (probably) reply with it - unless you use the custom thing I have for stopping strings

keen palm
#

Yeah, I've tested that before, and it works.

halcyon quarry
#

From example character:

  # Stopping strings you may include which this bot will dynamically replace:
  # "name1" (the user's name)
  # "name2" (the character's name)
  custom_stopping_strings: '"### Assistant","### Human","</END>","\nname1:","\nname2:"'
  stopping_strings: '"### Assistant","### Human","</END>","\nname1:","\nname2:"'
#

Oh, it will only stop if the name follows a \n newline from my example
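
One way the placeholder replacement could work, as a sketch only. How the bot actually parses this setting isn't shown in the chat, and `build_stopping_strings` is a hypothetical helper.

```python
import ast

def build_stopping_strings(raw: str, user_name: str, char_name: str) -> list:
    """Parse the quoted, comma-separated setting and fill in the names."""
    # Wrapping in brackets lets literal_eval read it as a list of strings,
    # which also turns the two characters "\n" into a real newline.
    strings = ast.literal_eval(f"[{raw}]")
    return [s.replace("name1", user_name).replace("name2", char_name)
            for s in strings]

raw = r'"### Assistant","### Human","</END>","\nname1:","\nname2:"'
stops = build_stopping_strings(raw, "Alice", "M1nty")
```

Because of the leading newline, `"\nAlice:"` only matches when the name starts a new line, as noted above.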

keen palm
#

I haven't yet gotten to test out whether the bot can associate a user name with a particular game character. My guess is it will get confused.

#

I definitely need to put in some stop strings, though

halcyon quarry
#

What we were discussing earlier, about names... Will only have any effect if/when the new Behaviors are implemented - when multiple messages may be merged to a single prompt

#

(optional).

keen palm
#

Why are there custom_stopping_strings and stopping_strings settings?

halcyon quarry
#

no clue but they're both required by TGWUI

keen palm
#

Classic

halcyon quarry
#

@terse folio did something dumb - ended up deleting and recreating the dev branch

keen palm
#

Anything in the works for delete/replace last response?

halcyon quarry
#

On the todo list

halcyon quarry
#

I’ll update the one and only pinned comment to include the actual todo list

#

(Soon)

terse folio
#

Had a cool little idea you could also implement using Flows.

An ability for the TTS to change how it speaks based on the text.
Maybe an LLM would decide what emotion the text is meant to convey and pull the correct reference audio for that speaker+emotion.
Same for other tasks like whispering or talking louder.

keen palm
#

I decided to put our RPG character information into the Gamemaster context, like so:

  • [Character name], played by [user name]: [Information about character]

And so far, generally speaking, the bot can differentiate the character based on the user that spoke to it

halcyon quarry
#

I have no experience with this, but absolutely, if there's a syntax for that then a character could have specialized context to apply the syntax

#

Pinned msg has been updated with ToDo list

#

It's probably much longer than that

halcyon quarry
#

I was wondering why the heck my Aspect Ratio helper character kept giving the same answer no matter what

#

it's because I forgot to put mode: chat - so it saw zero context

#

It's the grammar string, grammar is awful.

keen palm
#

That is one thin image

terse folio
#

infinitely thin :O

halcyon quarry
#

Works good without the grammar

#

It picked the correct ratio

#
  - trigger: 'draw,generate'
    insert_text: ''
    insert_text_method: replace
    search_mode: user
    on_prefix_only: true
    save_history: False
    load_history: -1
    swap_character: M1nty-SDXL
    should_send_text: false
    should_gen_image: false
    flow:
      - flow_base:
          save_history: False
          load_history: -1
      - flow_step: Ask LLM for best Aspect Ratio
        format_prompt: '{llm_0}'
        swap_character: '_Aspect_Ratio_Selector'
        should_send_text: false
        should_gen_image: false
      - flow_step: Gen image with the LLM's selected AR
        format_prompt: '{llm_1}'
        aspect_ratio: '{llm_0}'
        should_gen_image: true
        should_gen_text: false
        should_send_text: true
#

Don't actually need to use the last step there, could just use the variable to gen image on the same step as prompting the AR helper

#

nvm

#

I need to better document it - the variables get their values updated immediately before the next flow step

#

Anyway, from now on I'm letting the AR selector get a piece of the action for all image requests

visual dagger
#

hey

#

how are you doing guys?

halcyon quarry
#

good!

#

There has been a lot of great progress these past few days

#

The install and update instructions have changed, and will likely remain as they are now for awhile

#

The bot can now be git cloned into the text-gen-webui directory, and you can use git pull to update without conflicts due to modified user setting files

terse folio
#

This is epic, I put together an XTTS api server.
You can upload audio files to create latents that will be stored during the session.
You then can use those latents to generate tts.

And everything stays in ram ^-^
I could port this to work in TGWI as well, it would just create a webserver like the openai Extension.
But this is eventually one less thing we need to worry about needing to hook directly into tgwi for.

#

Working on this also taught me how to fix the documentation of the openai extension for the transcription endpoint which requires a file upload.
It currently has no documentation because it's a little complicated to have both

halcyon quarry
#

👏

#

Can params be modified on the fly?

#

Meanwhile I added a simple function to toggle TTS activate on/off.
Also set the “loading extension X” to warn once (per extension) so it doesn’t spam it when modifying extension args.

terse folio
# halcyon quarry Can params be modified on the fly?

I haven't checked the full list, but I copied what the xttswebui had.
Temperature, speed, topk, topp,...

And I got some warnings about needing to use num-beams, so will add that too.

It has a low vram mode, but I'll add an endpoint to load/move the model to CPU.

Now that I think about it, there should also be an optional timeout to keep the model in memory for a few seconds after the request so you could process multiple in one go without waiting to move from ram.

#

Another idea I had was an endpoint to the tgwi api to list the active extensions.

This could be useful to find out what other web servers are running in tgwi for a client to call.
Like your function that picks what tts client to use.

#

An external bot could check what tts models are running and call the correct urls

halcyon quarry
#

Well yeah, in time the function may be updated to do just that, if your vision turns out like that

terse folio
#

I think no one ever used the transcript endpoint with the openai extension.
having to install python packages that weren't part of the requirements 🤔

terse folio
#

yay got that working!

halcyon quarry
#

Almost time to open your own resources thread chiharu

terse folio
#

😸

#

Just took a look at alltalk's streaming, looks like a straight forward implementation!

terse folio
#

so I have audio streaming, but something is terribly wrong with pyaudio and it just plays sound extremely loud.
The audio is int16, all the values are correct.
So weird

halcyon quarry
#

is that in lieu of FFmpeg?

#

from the bot code:

        # Otherwise, play immediately
        source = discord.FFmpegPCMAudio(file)
terse folio
#

PCM is uint8 iirc, gen_stream sends int16 (same as alltalk)
normal gen sends float32

But this isn't the issue, i can write the stream/generation to a file and play it.
But using pyaudio to stream it will blow out my headphones haha
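
A sample-format mismatch is one common cause of distorted or very loud playback, though nothing in the chat confirms it is the cause here. As a sketch, converting float samples in the range -1.0 to 1.0 into 16-bit PCM bytes with the standard `struct` module looks like this (`float_to_int16_pcm` is a made-up helper).

```python
import struct

def float_to_int16_pcm(samples) -> bytes:
    """Clip floats to [-1.0, 1.0] and pack them as little-endian int16."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)

pcm = float_to_int16_pcm([0.0, 0.5, -1.0, 2.0])
decoded = struct.unpack("<4h", pcm)
```

Whatever plays the stream has to be opened with the same sample width, channel count and rate as the data being fed to it.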

#

The bot code would have to support playing from an iterator, or live buffer

halcyon quarry
#

tried asking ChatGPT or the like?

#

always worth a shot 😛

terse folio
#

not yet, this feels pretty obscure, searched a lot

terse folio
#

I tried asking a few things to ChatGPT, wasn't of much help

Tomorrow I plan to start up alltalktts and use the streaming api and see if I get the same corrupted wav issues.
The code was nearly identical.
Cya for now!

halcyon quarry
#

adieu!

halcyon quarry
#

Pushed an update to improve TTS handling

#

If the bot is on multiple servers, TTS generation is now handled gracefully.

#

Additionally, it no longer generates TTS when a tag is triggered with should_send_text: False

halcyon quarry
#

I accidentally broke Cont and Regen 2 days ago.
Fixed

#

(they still don't work correctly... coming soon)

halcyon quarry
#

I'd gone through and cleaned up the user and channel references.
Apparently on_message(), the commands, etc, are happy with i.author.display_name - except for those two App commands which do not have an author attribute

#

must be i.user.display_name for those
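
A small helper could cover both cases. This is a sketch with `SimpleNamespace` stand-ins for a message-like object (has `author`) and an interaction-like object (has `user`); `get_display_name` is a hypothetical name.

```python
from types import SimpleNamespace

def get_display_name(ctx) -> str:
    """Return the display name whether ctx has .author or .user."""
    user = getattr(ctx, "author", None) or getattr(ctx, "user", None)
    return user.display_name

message_like = SimpleNamespace(author=SimpleNamespace(display_name="Alice"))
interaction_like = SimpleNamespace(user=SimpleNamespace(display_name="Bob"))

from_message = get_display_name(message_like)
from_interaction = get_display_name(interaction_like)
```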

halcyon quarry
#

About to hit 500 commits 👀

halcyon quarry
#

@terse folio thanks to pylance, I noticed that one of the original imports from the source project was not used at all… 'torch' 😆

#

(Sounds expensive)

#

I'm afk didn’t try this yet, but if I clone the repo anywhere but TGWUI will pylance list all the missing requirements?

terse folio
#

it's imported in the tgwi i'm sure, so it's not adding any load times as packages are only loaded once and using import just provides a connection to the already loaded lib ^^
But yes, if the bot was standalone from tgwi then torch would add 1-2 seconds of load time!
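
The caching can be seen with any module. This sketch uses `json` in place of `torch` so it runs anywhere.

```python
import sys

def import_json():
    # After the first import, this is just a lookup in sys.modules.
    import json
    return json

first = import_json()
second = import_json()

# Both calls return the one cached module object.
is_cached = first is second is sys.modules["json"]
```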

halcyon quarry
#

I need to test if it runs without TGWUI

terse folio
#

oop, this stuff still hasn't been fixed, will add to todo later

halcyon quarry
#

I made a lot of updates past two days, mostly cleanup. But added a nifty TTS toggler

halcyon quarry
#

I had refactored all the code to not actually require either program but did not actually test lol

#

I forget where but I noticed one false hint you wrote and fixed it 😛

#

idk maybe it was correct now that I think of it… forget it 😛

#

I cleaned up the user / user_name so it’s more obvious what variable is what

#

Also making better use of the ‘params’ dict that gets passed down

terse folio
#

Yea, it doesn't know what to do with .mention

#

I think that's because .mention is a property not an attribute

#

typehints are for types, they don't take instances/variables.

str is a type, but str() is a string and wont work.

#

    @property
    def mention(self) -> str:
        """:class:`str`: Returns a string that allows you to mention the given user."""
        return f'<@{self.id}>'
#

While User.mention has output typehints, it doesn't seem that pylance lets you use those as the typehint for other items.

terse folio
#

The way I do typehints is for the less obvious things.

For example in the past there were arguments called "user", which would lead me to believe they were of the discord.User type.
They were not, so I marked them as str.
And in the most recent commits, you change the argument to user_name which is much more descriptive ^-^

#

Another thing I noticed was removing user+channel for i

"i" is a pretty commonly used variable for for-loops, like for i in ....
This could cause bugs later on where i gets replaced with the last item in the for-loop like I talked about earlier.

So I would replace i with ctx
or inter short for interaction if that's what it is

halcyon quarry
#

Let me know if you’re working on anything so I don’t inadvertently start working on the same thing

#

I will change all those interaction variables to just inter because that is a good point you make

terse folio
#

I'll fix that myself later

halcyon quarry
#

Oh interesting

terse folio
#

Some of your functions can be activated from on_message, and interactions.

So we can use a Union[discord.Message, discord.Interaction] typehint

#

which means this or that

#

I'll look more into that, not exactly sure, I didn't trace it.
It could be context, and not a message type
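
Roughly what that could look like. This is only a sketch: the `Fake*` classes are stand-ins so it runs without discord.py, and `get_user_name` is a hypothetical helper. In real code the hint would be `Union[discord.Message, discord.Interaction]`:

```python
from dataclasses import dataclass
from typing import Union


# Stand-ins for discord.User / discord.Message / discord.Interaction
@dataclass
class FakeUser:
    display_name: str


@dataclass
class FakeMessage:
    author: FakeUser  # messages expose the sender as .author


@dataclass
class FakeInteraction:
    user: FakeUser  # interactions expose the sender as .user


# "This or that": the argument may be either type
Context = Union[FakeMessage, FakeInteraction]


def get_user_name(ctx: Context) -> str:
    # Pick the right attribute depending on which type came in
    if isinstance(ctx, FakeInteraction):
        return ctx.user.display_name
    return ctx.author.display_name
```

With the Union hint, the type checker knows both shapes are allowed and can flag a bare `ctx.author` on the interaction branch.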

halcyon quarry
#

When I was making my coffee this morning I decided how I want to resolve one little mess... the split on_message_gen and _hybrid_llm_img_gen.

I'll be merging those together as hybrid_llm_img_task - and probably divide it to 2-4 subfunctions

#

currently on_message_gen leads to hybrid_llm_img_gen and nothing else leads directly to the latter (and never will)

terse folio
#

Another pretty big change I want to make is remove many of the large if statements.

like if SD_enabled
define x,y,z...

I think it would be better to define all the functions and bot commands in main.
But have the if statement in the interaction to tell the user this command is disabled.

This would cut down on the need to restart your discord client after enabling SD or another feature.
Because all the commands would already be there.
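
Something like this, as a rough sketch. The `config` dict and `image_command` are hypothetical stand-ins, not the bot's real command code. The command always exists, and the check happens when it is used:

```python
import asyncio

# Hypothetical config flag; the real bot reads its own settings
config = {"sd_enabled": False}


async def image_command(prompt: str) -> str:
    # The command is always defined; the feature check lives inside it
    if not config["sd_enabled"]:
        return "The /image command is currently disabled."
    return f"Generating image for: {prompt}"
```

Flipping `config["sd_enabled"]` at runtime changes the behavior without re-registering anything.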

halcyon quarry
#

Yes, I've been intending to make that simple little change.

This actually won't all happen if at the beginning, it does not toggle sd_enabled based on the client check

#

Just need to undo that dumb toggle

#

I put it there because I was originally collecting all imgmodels at init - but now that's fixed

#

its just one line wreaking havoc, just delete it 😛

You want the honors or shall I?

terse folio
#

Even with TGWI,
We could create a TGWI class, give it a "connect" function which would run all the imports and toggle a flag "TGWI enabled"
You then could access TGWI features through that class.

So all those functions could also be there not hidden behind an if statement ^^
just running as "just in time" where it imports when it needs it.

I'll probably work on that myself since I have experience there.
I have a large library of all my useful code utils, and I keep some machine learning stuff in there too.

But importing Torch is expensive, can add a lot of wait time!
So I use JIT importing there too!
To only import torch when a model is being loaded.
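
A bare-bones sketch of the idea. `LazyBackend` is a made-up name, and it takes any module name so it can be tried without torch installed:

```python
import importlib


class LazyBackend:
    """Imports a module only when connect() is first called."""

    def __init__(self, module_name: str):
        self.module_name = module_name
        self.module = None
        self.enabled = False

    def connect(self) -> bool:
        # Run the (possibly expensive) import just in time, and only once
        if self.module is None:
            try:
                self.module = importlib.import_module(self.module_name)
                self.enabled = True
            except ImportError:
                # Module missing: leave the feature flagged as disabled
                self.enabled = False
        return self.enabled
```

Features would then go through the instance (for example `backend.module.some_function(...)` after `backend.connect()`), so nothing needs to hide behind a top-level if statement.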

halcyon quarry
#

We do want to disable all the commands and everything, if the config file specifically has it disabled

#

I understand - the way I have it works but isn't the clean way to handle it

terse folio
halcyon quarry
#

I'm going to add a check in the sd_api() function - when there is a successful response, if SD_CLIENT is None, it will go fetch the actual client name.
(If it is None that means it was not online during startup)
This variable isn't required for anything to work, its just to enhance the user experience

#

will function*

#

Since its currently a global variable I'm just going to remove the return at init and set it in get_sd_sysinfo, to more easily call the function from sd_api()

#

I'm updating the get_settings_dict() from the main settings so it may return a top level dict instead of the entire settings dict

#
    def get_settings_dict(self, key=None):
        if key:
            return self.settings.get(key, {})
        else:
            return self.settings
#

(not like it's used much... yet)

#

fixed those bad behavior assignments in init...

#

Added a method in ImgModel() to refresh the extension support if SD WebUI is found to be online later on

    def refresh_enabled_extensions(self):
        self.init_sd_extensions()
        imgmodel_dict = bot_settings.get_settings_dict('imgmodel')
        merge_base(self.img_payload, imgmodel_dict['payload'])
#

slight change to that

#
    def refresh_enabled_extensions(self):
        self.init_sd_extensions()
        imgmodel_dict = bot_settings.get_settings_dict('imgmodel')
        new_payload = merge_base(self.img_payload, imgmodel_dict['payload'])
        update_dict(bot_active_settings['imgmodel'], new_payload)
        bot_active_settings.save()
#

I know how much you despise love the save function 🤓

terse folio
#

Also check out what alltalk did with the ability to install standalone or in the webui.
They have a script.py in the main dir that acts as an entry point to tgwi to start it.

You could do the same for the bot.
And people wouldn't have to move any files. just git clone.

I did attempt to start this, but found it overwhelming with the amount of functions that mimicked the startup of tgwi.

halcyon quarry
#

update_dict(bot_active_settings['imgmodel']['payload'], new_payload)

#

I looked at the documentation for the Alltalk API and it looked way too complicated for me to screw around with, when current method is working simple and effectively

terse folio
# halcyon quarry I know how much you ~~despise~~ love the save function 🤓

haha it's all good,
I just don't want to save on every assignment like this:

settings.set(test=1)
settings.set(name='someone')
settings.set(character='example')

This would trigger 3 saves in a row
So it's best to set the attributes through settings['test'] = 1, settings['name'] = 'someone' ... etc

and at the end save it ^-^
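
To illustrate the difference, here is a toy `Settings` class (an assumption for the example, not the bot's actual class) that just counts how often it would write to disk:

```python
class Settings(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.save_count = 0

    def save(self):
        # Stand-in for writing the file to disk
        self.save_count += 1

    def set(self, **kwargs):
        # Convenience setter that saves on every call
        self.update(kwargs)
        self.save()


# Saving on every assignment: three writes
eager = Settings()
eager.set(test=1)
eager.set(name='someone')
eager.set(character='example')

# Plain item assignment, then one save at the end: one write
batched = Settings()
batched['test'] = 1
batched['name'] = 'someone'
batched['character'] = 'example'
batched.save()
```

Both end up holding the same values; only the number of writes differs.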

halcyon quarry
#

I know the top level dictionary imgmodel kind of sucks for what it is - we can change that if you come up with some brilliant migration lol

#

I don't care enough though it can stay

terse folio
halcyon quarry
#

It can just run as an extension in TGWUI - you can leave the form field blank in config.yaml, but use the --extensions flag in CMD_FLAGS

#

TTS will work

#

(I think 😛 )

terse folio
#

hmm interesting, haven't tried

halcyon quarry
#

I think it tries to match the tts_client value from extensions list...

#

nah. hmm...

#

Can easily update that

terse folio
#

Are we talking about ad_discordbot or alltalk?

#

I see

halcyon quarry
#

I just need to add the extension args as a source to check for the value

#

That field in config.yaml is meant to just be a shortcut to make it plain and simple

#

I mean, it will enable the extension if its in CMD_Flags - the /speak command will probably just be bugged

#

ANYWAY - fixing that 😛

#

good catch

terse folio
#

I haven't dug around config.yaml too much, not sure which field you mean.
I'll read this back again later, have some things to do outside ^^

halcyon quarry
#
...
  tts_settings:
    extension: ''
    api_key: ''
...
terse folio
#

I think we were talking about different things :)

halcyon quarry
#

Maybe 😛

#

You're probably talking about how to handle things if/when we transition to use TGWUI API

#

some of the hacks currently in place may still work

#

or maybe can be tweaked

terse folio
#

not exactly.
When I installed alltalk a few days ago, I noticed it prompted me if it was being installed as an extension or standalone.

Which tells me the same code could work for either!

So what if we did something similar for the bot.
Creating a script.py with a setup function that starts up the bot when tgwi loads with the --extensions bot flag.

From the bot (as an extension) you can still access all the internals of tgwi and the internals of other extensions like for tts!
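
A very rough sketch of what such a script.py might contain. As far as I know, tgwi calls an extension's `setup()` once when the extension is imported; `run_bot` here is a hypothetical placeholder for the bot's real entry point:

```python
import threading


def run_bot():
    # Hypothetical entry point; the real bot would start its Discord client here
    print("bot started")


bot_thread = None


def setup():
    # Start the bot in a background thread so the web UI keeps loading
    global bot_thread
    if bot_thread is None:
        bot_thread = threading.Thread(target=run_bot, daemon=True, name="ad_discordbot")
        bot_thread.start()
    return bot_thread
```

The guard on `bot_thread` keeps a second call from starting the bot twice.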

halcyon quarry
#

If it's something you believe in, then feel free to make it happen, I don't quite understand 😛
Maybe you mean like, the ability to toggle Deepspeed or a few other settings that can't be toggled from shared params

#
    tts_client = ''

    # Initialize shared args extensions
    for extension in shared.settings['default_extensions']:
        shared.args.extensions = shared.args.extensions or []
        if extension not in shared.args.extensions:
            shared.args.extensions.append(extension)
            # Get supported TTS client found in TGWUI CMD_FLAGS
            if extension in supported_tts_clients:
                tts_client = extension
#

added that bit at the end to snag the client name
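
One possible shape for "check more than one source", sketched with plain lists instead of the real `shared` objects. `find_tts_client` is a hypothetical helper and the extension names are example values only:

```python
def find_tts_client(extension_sources, supported):
    # Look through every source (config field, default extensions, CMD_FLAGS args...)
    for source in extension_sources:
        for extension in source or []:  # tolerate None for an unset source
            if extension in supported:
                return extension
    return ''


# Example values only
supported_tts_clients = ['alltalk_tts', 'coqui_tts']
default_extensions = ['gallery']
cmd_flag_extensions = ['openai', 'alltalk_tts']

tts_client = find_tts_client([default_extensions, cmd_flag_extensions], supported_tts_clients)
```

Unlike the snippet above it, this does not skip an extension just because it is already present in one of the lists.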

terse folio
halcyon quarry
#

There will be some guess and check, for sure

halcyon quarry
#

The Controlnet and Reactor options in the /image command were initialized depending on whether they were responsive to an API check

#

Revised that to just use config settings (so they can be used if SD client launched later)

#

cnet_data was a global variable used by the image cmd.
Now its just fetched if the controlnet option was used in /image

terse folio
#

just got around to testing alltalk streaming api.
It also writes a corrupted wav file.
Trying to play it says there's 0 seconds.
But same as before, could open it in vscode and play it there. Hmm.

That's fine I guess, just need to figure out what's wrong with pyaudio now!

halcyon quarry
#

get_settings_dict() was also dumb method from Settings() - the way it's structured can just as easily write the direct value

#

Ditched that

halcyon quarry
#

Welp, all the features of SD are now working after starting it later

vestal python
#

I'm bringing my 2 bots back online now that my dev one is complete. My dev server has a free rtx 3060 12gb, and seeing about setting up a SD server on there for the bots to use.

#

The one with 35GB Vram.. What Model should run that now? I was using Mixtral on it for the longest time. Is there anything better now?

halcyon quarry
#

ugh

#

I'm fairly certain the bot would successfully load extensions if they were in CMD_FLAGS.... doesn't seem to be happening now

#

I'm blaming @terse folio who must be the culprit

#

either that, or I borked it when I monkeypatched load_extensions()

#

I think maybe Reality is to blame here...

#

😄

#

Oh yes they are

#

maybe not... ugh

#

I give up for now. the monkeypatch is probably what is preventing extensions from CMD_FLAGS from loading

halcyon quarry
#

I'm making it so that the value for Image Models in Tags and dict_imgmodels.yaml can be either the title OR sd_model_checkpoint

#

title is the filename minus .safetensors, and is prefixed with {subdir}_ for each subdirectory
sd_model_checkpoint is the exact value including the hash from SD WebUI model list
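
The lookup could be as simple as this sketch. `find_imgmodel` and the sample entries are made up for illustration, the hashes are placeholders, and the real model list comes from SD WebUI:

```python
def find_imgmodel(value, models):
    # Accept either the short title or the exact sd_model_checkpoint string
    for model in models:
        if value in (model.get("title"), model.get("sd_model_checkpoint")):
            return model
    return None


# Illustrative entries only
models = [
    {"title": "sdxl_juggernaut", "sd_model_checkpoint": "sdxl/juggernaut.safetensors [0123abcd]"},
    {"title": "dreamshaper", "sd_model_checkpoint": "dreamshaper.safetensors [89ef4567]"},
]
```

Either `find_imgmodel("sdxl_juggernaut", models)` or the full checkpoint string resolves to the same entry.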

vestal python
halcyon quarry
#

Does your client have flags --api --listen ?

vestal python
#

that's probably it. forge does some weird things, but I might have it down now

halcyon quarry
#

I use forge

#

If you switched to their dev2 branch, then you won't have this issue

#

(they merged my PR)

vestal python
#

15:41:29.181 #3656 INFO [bot.main]: Dundell2 used "/image": "anime style asian man with hat"
15:41:29.181 #3202 ERROR [bot.main]: An error occurred in img_gen_task(): 'payload'

#

I'll keep attempting some things

halcyon quarry
#

You may have made an error migrating old settings

#

I’m on the road right now but I’ll try cloning a fresh copy of the bot to see if there’s any issues

#

I’m always updating my own personal instance as I go so I may have overlooked something

#

Who knows

vestal python
#

No luck with both bots. One has additional issues, but that might be something to do with no Characters, but they both have the payload error. My forge ui server does show api enabled, and allowed for local use on 192.168.1.249 port 7861

#

But that's it for me for a while. I did get both bots at least running for now, which is great

halcyon quarry
#

I'll check it out momentarily

vestal python
#

The third bot, the 70B I might keep just on api though. Too slow for discord

halcyon quarry
#

working out one little kink on fixing the change/swap imgmodels

#

It's a complicated function

#

Got it resolved. Now to see about this payload thing...

vestal python
#

so i am connected at least

halcyon quarry
#

Try again

#

That message occurs if the menus changed while your discord client is running

vestal python
#

17:20:42.234 #687 INFO [bot.main]: Bot is ready
17:21:29.690 #4150 INFO [bot.main]: Dundell2 used "/imgmodel": "Juggernaut-X-RunDiffusion-NSFW"
17:21:30.076 #4079 ERROR [bot.main]: Error guessing selected imgmodel data: [WinError 3] The system cannot find the path specified: 'C:\forgeui\webui\models\Stable-diffusion\Juggernaut-X-RunDiffusion-NSFW.safetensors'
17:21:30.103 #634 WARN [bot.main]: One or more "tags" are improperly formatted. Please ensure each tag is formatted as a list item designated with a hyphen (-)
17:21:30.606 #2026 INFO [bot.main]: Image model changed to: Juggernaut-X-RunDiffusion-NSFW
17:21:47.262 #3656 INFO [bot.main]: Dundell2 used "/image": "rubber ducky on a lake"
17:21:47.262 #1439 ERROR [bot.main]: Error getting tags: can only concatenate list (not "NoneType") to list
17:21:47.264 #3137 ERROR [bot.main]: Error matching tags for img phase: list indices must be integers or slices, not str
17:21:47.266 #3202 ERROR [bot.main]: An error occurred in img_gen_task(): cannot access local variable 'key' where it is not associated with a value

halcyon quarry
#

Ok Im booting up fresh install

#

Ok I am also getting the payload error

#

that is no good

#

The error is because I never thought of how to handle the first image model

#

when none has ever been loaded via the bot
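
For reference, a bare `'payload'` in the log is what a KeyError looks like when printed. A sketch of a defensive read, with a hypothetical `get_payload` helper and a simplified settings layout:

```python
def get_payload(active_settings: dict) -> dict:
    # Fall back to empty dicts when no image model has been saved yet
    return active_settings.get("imgmodel", {}).get("payload", {})


fresh_install = {"imgmodel": {}}  # nothing loaded via the bot yet

try:
    fresh_install["imgmodel"]["payload"]  # direct indexing fails on a fresh install
    error_text = ""
except KeyError as e:
    error_text = str(e)  # the same text that shows up after "img_gen_task():"
```

`get_payload(fresh_install)` returns an empty dict instead of raising.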

#

All those other errors you have are due to something misformatted in your tags

#

erm

#

hmm

#

Is that path valid? C:\forgeui\webui\models\Stable-diffusion\Juggernaut-X-RunDiffusion-NSFW.safetensors

vestal python
#

It's all new server stuff. I can take a closer look once I get home. I'd like to set up two options for users, 30 step and 70 step models, and add in some different sdxl models later on. It'd be neat

#

C:\forgeui\webui\models\Stable-diffusion\Juggernaut-X-RunDiffusion-NSFW.safetensors

#

Might be a C: directory issue?

halcyon quarry
#

My family came home so can’t look again for a few hours

#

I may be missing an os.path.join() that could cause issues on non-Windows systems

vestal python
#

I hear you, my son is in my lap with an apple watching youtube. Trying to work with one arm

halcyon quarry
#

My wife wants to strangle me when son is home and I’m on computer lol

#

I need to see if there’s an api call to get current imgmodel from SD

#

Also need to prompt for bot token if it’s not set, and save it. Instead of saying set it manually and exiting
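
Roughly this flow, sketched with a plain dict. `ensure_token`, the `"token"` key, and the injectable `ask` / `save` callables are all assumptions for the example, not the bot's actual config API:

```python
def ensure_token(config: dict, ask=input, save=lambda cfg: None) -> str:
    # Use the stored token when there is one
    token = config.get("token", "").strip()
    if not token:
        # Otherwise prompt once, store the answer, and persist it
        token = ask("Enter your Discord bot token: ").strip()
        config["token"] = token
        save(config)
    return token
```

Passing `ask` and `save` as arguments keeps the prompt testable; by default it falls back to the built-in `input`.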

halcyon quarry
#

What I can say is that from a fresh install with nothing changed besides my bot token, I could change imgmodel then prompt images

halcyon quarry
#

Here’s something that’s going to change…
The image model is no longer going to be explicitly saved to activesettings. When the bot starts, if the field is blank, it will use whatever the current model is.

#

This error with payload is due to a flaw in the new activesettings initialization

halcyon quarry
#

@terse folio another huge thanks - this new framework you set up is super easy to work with now that I get the gist of it

#

Just deleted that big lump of Config at the beginning, replaced with simple block in database.py
Migration is working to convert old config.py to config.yaml
It's loading.
And I added a prompt for bot token, and now can use that simple config.save()
Beautiful

halcyon quarry
#

@vestal python I tried reproducing the Image Model error you encountered, and I did. Then, I did resolve it
I added a hyphen to the model name and got the same error.
But the problem is not due to the filename including hyphens.
The error was due to Forge having stored information about the model, and then I changed it without relaunching Forge

#

After I closed and relaunched Forge, it loaded the model without errors

visual dagger
#

hello again

halcyon quarry
#

I made a lot of good progress today on bug fixes and ease of use

visual dagger
halcyon quarry
#

I'm probably going to take it easy this weekend, though

visual dagger
#

yeah take some deep breath and go on a vacation with your AI friend, lol

#

any news on an open source Her (GPT4-o replika) or not yet?

halcyon quarry
#

Update pushed

terse folio
visual dagger
#

hey Reality, how is it going

terse folio
terse folio
visual dagger
#

like 3 devs doing the job instead of just one miserable dev 😭

halcyon quarry
#

3 devs! Now there's a juicy thought

visual dagger
#

lol

terse folio
#

Yes, but there's a limit to how fast that can go.
Batches will be faster with multiple machines yes, but you'll still need to wait the initial time for the first generation to come in.

Also there's the drawback that sending custom trained models could be an expensive task, they're 1.5Gb each

halcyon quarry
#

This project has 3 devs - Reality counts for 2, for how efficient they are

terse folio
#

a central hub splits tasks out to workers
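
As a toy picture of that, here is a hub handing tasks to a small pool. Threads stand in for separate machines and `generate` is a placeholder job, so this is an assumption-heavy sketch rather than a real distributed setup:

```python
from concurrent.futures import ThreadPoolExecutor


def generate(task: str) -> str:
    # Placeholder for the real work (a TTS or image generation request)
    return f"done:{task}"


def run_hub(tasks, workers=3):
    # The hub splits tasks across workers and collects results in task order
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(generate, tasks))
```

A batch finishes sooner with more workers, but each single task still takes as long as one worker needs.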

#

I actually should look into if there's a way to do parallel streaming/generation with xtts.
Because it has some interesting model stuff going on, I'm not sure how i'd accomplish that.

Like your ability to use a custom speaker for each generation

visual dagger
#

I get you, there is a limit of how fast things can go, like 3 devs, 2 can get the job done, so there is no need for the third one

terse folio
#

With text generation, McMonkey explained to run each token in batch, which makes sense.
I think Xtts also generates voice tokens?

visual dagger
terse folio
visual dagger
#

faster results vs more results, why you made that distinction?

terse folio
#

It depends on your usecase.
Like some people might value the TTS output coming in really fast, immediately so the bot can respond in real time like a human.

as for more results, this is necessary if you are running a large service, with 100s... 1000s of active users generating voice

#

Because at the moment, Xtts only generates one request at a time

#

Same with TGWI actually, text generation is locked to one at a time.

#

I looked into making it parallel, but found some roadblocks, like the model class for Exllamav2 was coded to ignore parallel requests.
I'm not sure if this was on purpose or not for the sake of caching, but that's something that needs testing

halcyon quarry
#

signing off, have a great night. Don't report any bugs tonight or I won't sleep well 😛

terse folio
#

Sleep well!

visual dagger
#

so in theory you can chunk things up and make multiple gpus work on it

#

and get faster results

terse folio
#

Yup, absolutely

#

I talked to someone who was running 3 instances of tgwi to utilize all their GPUs

visual dagger
#

oh

terse folio
#

Not for big models, they were small, like 7Bs.
But this was for a web service

visual dagger
#

so is tgwi a fork of a tts package?

terse folio
visual dagger
#

putting his hand on his face in a moment of silence

terse folio
#

No worries ahah

visual dagger
#

so it was easy to setup for him? just plug and play

visual dagger
#

and are we talking text gen only? or also xtts

terse folio
terse folio
visual dagger
#

was getting faster results?

#

faster inference

#

about voice, if you have bunch of gpus or a relatively mid speed gpu you can do some tricks here and there to make the illusion of a "realtime" conversation, probably openai done that with gpt4o

terse folio
#

I didn't ask, but from my own experience running multiple things on a single GPU, yes.

If you run multiple processing tasks on a GPU, they fight for the same Cuda cores, even if they both can fit in memory.

Like running text generation would utilize 90-100% of cuda in task manager iirc.
Same for Stable diffusion, so they'd slow each other down

terse folio
visual dagger
#

making things seamless and smooth

terse folio
visual dagger
#

one gpu I mean

visual dagger
terse folio
#

they're designed for parallel processing!
But the little issue is that some programs are so intense they use all the Gpu's resources

visual dagger
#

conflicts between programs

terse folio
#

TTS generation uses about 20% of my gpu while generating.

It's just not a big enough model to use all the cores perhaps.
I don't know the specifics of what's going on there

visual dagger
#

i think it's possible to get a gpu dev (or God knows what the job title 😅 is) to make a customised script that takes advantage of everything in an optimal fashion

terse folio
#

that is usually how it works

#

Every task tries to complete as fast as possible

#

Like if you're running a videogame, the frames render as fast as they can. (this is also based on how fast the CPU can put out frame data too)

But you optionally have the setting to limit the framerate if you wish

visual dagger
#

programs stealing resources "NO IT'S ALL MINE. AND ONLY ME"

terse folio
#

😸 haha

#

on the CPU, you can define priorities to certain programs

terse folio
#

where it lets those higher priority programs run as fast as they need, then in the downtime between instructions/waits run everything else.
(that's just a guess)

visual dagger
#

you think it's possible to have a mid range npu, gpu or whatever in the future that enables crazy stuff while being affordable to the average consumer?

#

something on rtx 4090 level on ai, but relatively cheap and affordable

#

cheap to mid range

terse folio
#

I've heard people are working on such things.
From custom chips for processing transformer type models, to using analogue for the matrix multiplication since it's okay to lose some accuracy with ML.

Not sure when they will be for public consumption.
I imagine that could take some time, either having to build some sort of interface to existing motherboards, or create your own.
How that works is a bit beyond me

#

maybe such a thing would work with the normal PCIe lanes that your gpu uses.

visual dagger
#

you think Nvidia will let those ppl/companies sell freely?

terse folio
#

Competition would be nice, hopefully that would lower prices for everyone.
Not sure, they could also buy the company.
lots could happen
just hoping for the best

visual dagger
terse folio
#

I agree ^^

#

more customers, lower entry barrier

visual dagger
#

you either give the consumer more reasons or fewer reasons to buy

#

if you also need to change the mb, you will give it much thought

terse folio
#

Mhm, such a company could also go the route of selling to datacenters

visual dagger
#

hopefully things change for the better soon

visual dagger
#

like groq is doing now

terse folio
#

mhm, and scary to think about those businesses trying to stop people from running their own models.
Trying to create restrictions, hmm

visual dagger
#

yeah things aren't looking good

#

sure we get 100b and 200b opensource models

#

but who's gonna run that?

#

just 2 or 3 ppl and the rest are other businesses

#

I'd like to see a never-ending trend of small models emerging, 7b and below

#

and how to stick them together to achieve a gpt4 or 5 level of results

terse folio
#

Checked the logit probabilities, and they were all around the same for each choice, showing that the model has no idea

visual dagger
#

like downloading only the experts that you want to use and are useful for your usecase

terse folio
visual dagger
#

each expert is a 3b model

visual dagger
#

not the math, but the higher level logic

#

that can be applied to alot of other affordable solutions

terse folio
#

I feel like all the models would have to be loaded in vram for it to be fast.
Because having one model decide which expert to use, then loading that from ram to vram would take too long.

And perhaps this switching of experts can happen per token? not sure

visual dagger
#

and the user can choose the experts

terse folio
visual dagger
visual dagger
#

imagine 3b models highly focused on a specific task

#

laser focused

#

all of them are finetunes

terse folio
#

Yup, I used a 124m gpt2 model finetuned for turn based conversation by Facebook iirc.
It was way too overfitted 😸

#

but for speed, that is what I needed

visual dagger
#

a 3b that can do ml with python, and just that, no other python expertise rather than only this

terse folio
#

we didn't have quantization back then

visual dagger
#

like someone was using gpt4 to tag some text, then found out that it was a waste of money and moved to gpt3.5

#

it was enough to do the job

terse folio
terse folio
visual dagger
terse folio
#

Some simple things like extracting a mentioned email address in a text doesn't need a whole llm query.
I remember this example from somewhere

visual dagger
#

and those jackpots will get popular

visual dagger
#

but not all ppl know that

#

you can simply use a python script for that

#

no llm or nlp model needed
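
For example, a few lines like this would do for the simple cases. The pattern is a deliberately loose approximation, not a full address validator:

```python
import re

# Loose pattern: local part, "@", domain, and a final dot-suffix of 2+ letters
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list:
    # Return every email-looking substring, in order of appearance
    return EMAIL_RE.findall(text)
```

It will miss unusual but valid addresses, which is usually fine for pulling a mentioned address out of a chat message.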

visual dagger
#

that's important, the coding part. if we can somehow make llms generate code that helps them it will be really good

#

generating code on demand depending on the task without the user prompting for a code

#

like email extraction, the model can just go "hmm.. I don't need to waste my resources for that, I can just make and run a simple python script for that"

terse folio
#

Mhmm!
and I'd love to see LLMs having the ability to work on large code projects that are more than a little snippet.
But that's a lot of context space needed

visual dagger
#

or you give the llm access to a giant database of small snippets of code

#

and the llm just chooses the best one

#

no need for code generation at all

#

just choose the best one and run it

visual dagger
visual dagger
#

if offline just use it right away

#

scripts are tagged ofc, categorised and classified

terse folio
# visual dagger like email extraction, the model can just go "hmm.. I don't need to waste my res...

if an llm is making the decision here, it could probably be faster to just get the emails using the llm.
As it would spend more time writing a script than outputting the list of emails.
But perhaps the context size of processing long texts for the email would cost more time than just writing the script.

Maybe some caching for tasks.
If task = find email -> create code or run existing code.
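
A sketch of that caching idea. `tool_cache` and `get_tool` are hypothetical names, and `generate_tool` stands in for whatever expensive step (such as an LLM call) would write the code:

```python
tool_cache = {}


def get_tool(task: str, generate_tool):
    # Create the tool only the first time a task is seen, then reuse it
    if task not in tool_cache:
        tool_cache[task] = generate_tool(task)  # the expensive step
    return tool_cache[task]
```

The second request for the same task skips generation entirely and goes straight to the stored tool.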

Not sure what kind of project that would be. It was a strange example

visual dagger
#

we can use a ready to go code instead of generating from scratch

#

just a quick search on the database

#

if the llm finds anything useful it uses it, if nothing is useful then the llm goes to generate it from scratch and run it

#

but if the database is really massive, I doubt that there will be nothing useful

terse folio
#

Yup that pretty much describes AI using tools ^-^

visual dagger
#

the irony in this is we can make llms make the entire db

#

lol

#

you're basically getting free inference

#

coz all the code is pregenerated and stored on the db

terse folio
#

Of course, it's just caching!

visual dagger
#

yup and you can make a highly focused db that suits your usecase

#

just set things up and let the llm create ton of code while you are sleeping at night

#

this code can be used later

#

it has to be saved, categorised, classified the right way, to be retrieved later

#

maybe rag can help

terse folio
#

😸 I like the "just in time" approach.
wasting less resources, because you're only generating what you need the first time you need it

visual dagger
#

ye and the llm won't even generate that code again

#

I mean, it just has to detect the right code, then it will be run instantly

#

like this

the best code for the task is "generateImage.py"
running the file..
#

you see, there is no need to torture the llm (generating the same py file again)... it will just run right there and then

terse folio
#

Yup!

visual dagger
#

and the llm can also manipulate the code using other code instead of generating all that

#

sometimes a simple .replace is all you need

#

lol

#

if you know any crazy someone who made something like that I will have to know about it, so let me know : )

#

ai agents looks similar, but there gotta be a better way to merge llms with code

#

so they can handle a large db of coding tools/snippets with almost no inference time needed

terse folio
#

First part to work on is the ability to recognise and use tools properly,

I've experimented with using embedding to try matching text to tools.
But the best results came from using 13B models with multishot prompts.
Still made mistakes with some complicated chat logs because llms have trouble with time.

Thinking something the user asked a while ago is the current task.

visual dagger
terse folio
#

yes

#

because if that doesn't work, then the database and all wont matter

visual dagger
#

time is confusing sometimes

visual dagger
#

tagging the tools

like this os, system package, system cmds

terse folio
# visual dagger oh yeah time is...

thing is, sometimes the task depends on multiple messages in the history.
So I can't just cut it off.

Maybe you're having a conversation about a pet, the bot asks questions about it, you describe it.
at the end asking to create an image.
It would have to take into account all the previous information.

terse folio
#

On another note, I discovered that the speaker in Xtts is the last step in the audio processing.

Maybe one could generate the audio tokens at the same time, then loop over the results applying the correct voices to each.

But that would require editing the generation code/internals, I really don't want to do that!
But possible, one day!

visual dagger
#

or sceneDetailsIdentifier

#

a script that read the logs and extract relevant info

terse folio
#

mhm, breaking tools down into more tool calls.
would help!
Yea, but I have an example where the bot calls the wrong tool as it's stuck on the previous one.
I'll see if I have a screenshot

visual dagger
#

okk

terse folio
#

#general message
Not sure if I posted a broken example here.
But this is the kind of tests I was doing.

And I see now why the LLM got confused.
because one of the tools is "chatbot"
And I guess it doesn't think chatbot should be used as much, and tries to use different tools.
Especially if you mention something, it might mistake it as user info to save or something continuously.

#

Also just noticed we're in the ad_discordbot thread 😸
Hope @halcyon quarry wont mind

visual dagger
#

hmm

#

did you limit the tools?

terse folio
#

If I could get grammar working, I bet results would be better

terse folio
visual dagger
#

like "the only tools you can use are xyz rtg bvc "

terse folio
#

no, just through multishot prompting,
during tests I also like to see the new function ideas the llm might come up with if it ever does.

#

Like that idea of cats[0] and cats[1] I wouldn't have thought of that

visual dagger
terse folio
visual dagger
#

oh k

#

I get it

terse folio
#

Another thing I did is include
system: used tool (generate image... whatever)
in the chat log

So the llm knows it already ran that

#

It came with its pros and cons, some models got confused by this, some got the point

#

hahah had this saved in my last run of that python notebook

visual dagger
#

Something I did lately might help you: I made a dict where the pairs are trigger_word, fact

like this

{'Alex': "Alex is John's childhood friend, is living near the park", 'Potato': 'John hates potatoes'} etc..

so when I prompt the llm

hey John, lets visit Alex

I have a script that detects any trigger words, then adds a note as a fact

hey John, lets visit Alex (Note: this is a fact, Alex is John's childhood friend etc... )

the llm's answers are way better this way
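
A minimal sketch of the trigger-word trick described above, assuming a plain dict of trigger -> fact and whole-word, case-insensitive matching. The names `FACTS` and `annotate_prompt` are made up for illustration and are not taken from any script mentioned in this chat.

```python
import re

# Hypothetical trigger -> fact table (same pairs as the example above).
FACTS = {
    "Alex": "Alex is John's childhood friend, is living near the park",
    "Potato": "John hates potatoes",
}


def annotate_prompt(prompt: str, facts: dict) -> str:
    """Append a '(Note: ...)' for every trigger word found in the prompt."""
    notes = []
    for trigger, fact in facts.items():
        # Whole-word, case-insensitive match. Allowing a plural "s"/"es"
        # suffix is an assumption of this sketch.
        if re.search(rf"\b{re.escape(trigger)}(e?s)?\b", prompt, re.IGNORECASE):
            notes.append(f"(Note: this is a fact, {fact})")
    return " ".join([prompt, *notes])


print(annotate_prompt("hey John, lets visit Alex", FACTS))
```

A prompt with no trigger words comes back unchanged, so the function can sit in front of every message.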

same with the potato. even if I include that John hates potatoes in the system prompt, nothing happens, I mean the llm acts dumb

terse folio
#

Yup that's another problem, getting LLMs to listen to RAG.
Also nice, the dict thing sounds similar to this bot's tag system!

#

I like that idea, a simple solution like we talked about earlier

visual dagger
#
llm: I brought some potatoes
john: potatoes!!
llm: yes they are delicious

without the trick ^

llm: I brought some potatoes
john: potatoes!! (Note: This is a fact, John hates potatoes)
llm: oh no, sorry I forgot you don't like potatoes

with the trick ^

#

this is really powerful

#

I didn't try giant text facts

but small sentences as facts

#

I don't think a giant wall of text will work

#

but a tiny fact will

#

brief and straight to the point

terse folio
#

Yea, that's what that "save_user_info" idea is about.
Building these key value pairs while chatting so they can be called upon later

visual dagger
#

the insane thing is, it becomes normal, meaning I don't notice it, it just works on the backend

#

automatically

terse folio
#

Amazing!

visual dagger
#

if the script detects any trigger words in my prompt it adds notes before sending it to the llm

terse folio
#

Could also scan the llm's output and put in these annotations, maybe have it rerun the generation

#

maybe it comes up with something that wasn't in the original prompt, but that thing has some important context

visual dagger
#

I didn't get it, explain a bit

terse folio
#

Something more roleplay like, where you might be moving through a space.
The LLM decides there should be a potato store for some reason to your left.

in the 2nd pass, the LLM learns John doesn't like potatoes, and corrects its response to say "oh no, john has found himself by a potato store now"

(of course the bot wouldn't generate "potato store" without being told about the potatoes in the first place, but I can't think of a better example atm)

#

Like letting the LLM "remember" while it's talking
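
A rough sketch of that two-pass idea, under the assumption that the model is just a callable taking a prompt and returning text. `generate_with_recall` and `fake_llm` are hypothetical names; the stub stands in for a real generation call so the flow can be followed end to end.

```python
from typing import Callable


def generate_with_recall(prompt: str, facts: dict,
                         llm: Callable[[str], str]) -> str:
    """Generate once, scan the draft for trigger words, regenerate if any apply."""
    draft = llm(prompt)
    # Facts whose trigger shows up in the draft but was not in the prompt.
    recalled = [
        fact for trigger, fact in facts.items()
        if trigger.lower() in draft.lower()
        and trigger.lower() not in prompt.lower()
    ]
    if not recalled:
        return draft
    notes = " ".join(f"(Note: this is a fact, {fact})" for fact in recalled)
    # Second pass: same prompt plus the notes the first draft surfaced.
    return llm(f"{prompt} {notes}")


def fake_llm(text: str) -> str:
    """Stub model for illustration only."""
    if "John hates potatoes" in text:
        return "Oh no, John has found himself by a potato store."
    return "There is a potato store to your left."


print(generate_with_recall("John walks down the street",
                           {"potato": "John hates potatoes"}, fake_llm))
```

The cost is a second generation whenever the draft surfaces a new trigger, so this only makes sense where latency is not critical.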

halcyon quarry
#

I’ve been missing out on some deep talk here

terse folio
#

Was sleep good?

#

:P

halcyon quarry
#

I love potatoes

terse folio
#

They are nice

vestal python
#

Just got home, I'll reboot all 4 machines and try again

visual dagger
halcyon quarry
#

So a little bird may have told him

visual dagger
halcyon quarry
#

Good updates

terse folio
visual dagger
#

I forgot something that might be important: the script is set to mention the fact only once within the last X chats

I mean the fact will be mentioned only IF it wasn't already mentioned in the last 4 chats (or any number)

so 2 conditions here: the presence of the trigger word potatoes, and the fact not being mentioned in the last 4 chats

I do this to avoid repeating the fact, which might lead to an overreaction: if the fact is mentioned a lot in the last 4 chats, the character will overreact

like this

I am very sorry, *cries* I will never give you potatoes again *hysterically crying*
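
The two conditions can be sketched like this, assuming the chat history is a list of message strings and that a fact counts as "mentioned" when its exact text appears in a message. `should_add_note` and the default `window` of 4 are illustrative choices, not the actual script.

```python
def should_add_note(trigger: str, fact: str, message: str,
                    history: list, window: int = 4) -> bool:
    """True only if the trigger is in the message AND the fact is absent
    from the last `window` messages of the history."""
    triggered = trigger.lower() in message.lower()
    # Guard window <= 0, since history[-0:] would return the whole list.
    recent = history[-window:] if window > 0 else []
    recently_mentioned = any(fact in past for past in recent)
    return triggered and not recently_mentioned


history = ["hi", "potatoes!! (Note: this is a fact, John hates potatoes)", "sorry"]
print(should_add_note("potato", "John hates potatoes", "more potatoes?", history))
```

Once four newer messages have pushed the annotated one out of the window, the same call returns True again and the note is re-added.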

#

😂😂

terse folio
#

Absolutely!
I'd have some sort of flag set on the triggers that indicates it's currently in use until that message with the (note: here) is deleted then it releases

#

also wow, overreaction!
would be funny to encounter that while browsing the web, repeatedly trying to finetune what your search is because the bot isn't searching the right things

#

unexpected in that context

visual dagger
#

that's insane

terse folio
#

but yes, some simple regex would work on those results!

visual dagger
#

then you can add them to the dict and get a never ending loop of improvements

#

key, value pairs
trigger, fact pairs

terse folio
#

This is just some fancy code to create the system prompt through random shuffling and picking random examples.
But this is how the examples are defined.

visual dagger
#

a bug in this that you might encounter is saving temporary facts

#

I am sick

terse folio
visual dagger
#

this will be saved and you will always be sick in the eyes of the llm

terse folio
#

that's an interesting idea

visual dagger
#

you will save a temporary fact as permanent

#

you are not always sick

#

so that's a bug

#

a solution is to update/remove those facts

#

with a new one

terse folio
#

mhm,
Another idea I have been playing around with is figuring out how to build a RAG database that also considers time, favoring more recent information
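
One possible way to favor recent information, sketched under the assumption that each stored entry already has a similarity score from some retriever and a known age. The exponential half-life decay and the names `recency_score` and `rank` are choices made for this example, not something from an existing RAG library.

```python
def recency_score(similarity: float, age_seconds: float,
                  half_life_seconds: float = 7 * 24 * 3600) -> float:
    """Scale a similarity score so it halves every `half_life_seconds` of age."""
    decay = 0.5 ** (age_seconds / half_life_seconds)
    return similarity * decay


def rank(entries: list) -> list:
    """entries: (text, similarity, age_seconds) tuples. Most relevant first."""
    ordered = sorted(entries, key=lambda e: recency_score(e[1], e[2]), reverse=True)
    return [text for text, _sim, _age in ordered]


print(rank([
    ("John is sick", 0.9, 30 * 24 * 3600),  # strong match, a month old
    ("John is well now", 0.8, 3600),        # weaker match, an hour old
]))
```

With a one-week half-life the month-old entry loses to the fresh one even though its raw similarity is higher.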

visual dagger
#

I am well now, woow it was a rough 3 days

#

the sickness "fact" will be deleted

#

or updated with "John was sick but he's well now"

terse folio
visual dagger
#

if you generalise over all facts, that might cause bugs (that's how I would do it)

I would make the llm check every fact against each other to delete or update the contradictory ones, which would cause a mess

#

so your way of doing it is better, narrowing things down

#

i wonder how tags can help in this?

halcyon quarry
#

Tags can help with anything 🌈

visual dagger
#

Tags to rule them all

#

Tags can help in classifying things

terse folio
#

Mhm, would want to have them stored in a tree,
maybe the prompt should be more sophisticated like include the user's name.

save_user_info('cats', '2')
actually saves to dict(users[-1]: {cats: 2})
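
A small sketch of the per-user storage, assuming the user name is passed in explicitly rather than taken from `users[-1]`. The `user_info` store and this three-argument signature are assumptions for the example.

```python
from collections import defaultdict

# user name -> {key: value}; a nested dict standing in for the "tree".
user_info = defaultdict(dict)


def save_user_info(user: str, key: str, value: str) -> None:
    """Store (or overwrite) one key/value pair under the given user."""
    user_info[user][key] = value


save_user_info("John", "cats", "2")
save_user_info("John", "health", "sick")
# A newer fact with the same key replaces the old one.
save_user_info("John", "health", "was sick but is well now")
print(dict(user_info))
```

Keying facts this way also gives a cheap answer to the temporary-fact problem: saving under the same key updates the fact instead of piling up contradictions.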

halcyon quarry
#

How do we feel about a ‘regex_prompt’ tag?

visual dagger
terse folio
#

Could open up a lot of possibilities, I like it

halcyon quarry
#

Good news is you don’t need to - LLM knows this stuff

terse folio
#

For optimization, you could also let people decide the range they want the regex to run on.
The whole chat? or just the last message(s)?
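
A sketch of what a configurable range could look like, assuming the history is a list of message strings. `find_in_history` and its `last_n` parameter are hypothetical and are not part of the bot's tag system.

```python
import re
from typing import Optional


def find_in_history(pattern: str, history: list,
                    last_n: Optional[int] = None) -> list:
    """Return messages matching `pattern`, searching the whole chat
    (last_n=None) or only the last `last_n` messages."""
    if last_n is None:
        scope = history
    else:
        scope = history[-last_n:] if last_n > 0 else []
    regex = re.compile(pattern, re.IGNORECASE)
    return [msg for msg in scope if regex.search(msg)]


chat = ["I love potatoes", "hello", "potato store"]
print(find_in_history(r"potato", chat))
print(find_in_history(r"potato", chat, last_n=1))
```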

halcyon quarry
#

Well that’s the beauty of the tags system

terse folio
#

oh that's already part of it? ^-^
still haven't messed around with it!

halcyon quarry
#

Well, I haven’t added any tags yet to Edit History but that’s pretty interesting thought there

visual dagger
#

edit history?

halcyon quarry
#

tags either have a condition (trigger and/or RNG) or they don't (apply always). Either to user msg, llm reply or both

visual dagger
#

okk

halcyon quarry
#

Reality suggested that regex could be applied to the entire conversation and I thought perhaps they meant including history 😛

visual dagger
#

I thought we were talking about tags like image tags etc

halcyon quarry
#

Not in this channel chiharu

#

Or at least not while I’m awake

visual dagger
#

nothing naughty :/

halcyon quarry
#

This channel is nsfw no worries

terse folio
#

Aye, figured out what was wrong with pyaudio!
It was the audio format.
I had int16 correct, width 2.
Issue was that pyaudio assumed it's an unsigned int, only 0 to 65535
But what we want is -32768 to 32767

Figured this out by hovering over the open function and realising the default value is wrong.
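
To see why the signedness matters, here is a standard-library sketch that reads the same two bytes both ways. It deliberately does not touch the pyaudio API; it only illustrates the int16 vs uint16 interpretation.

```python
import struct

raw = bytes([0x00, 0x80])  # one little-endian 16-bit sample

(as_signed,) = struct.unpack("<h", raw)    # signed int16: -32768..32767
(as_unsigned,) = struct.unpack("<H", raw)  # unsigned uint16: 0..65535

print(as_signed, as_unsigned)
```

The same bytes are the quietest possible negative peak in one reading and a mid-range positive value in the other, which is the kind of mismatch that turns audio into noise.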

visual dagger
#

I didn't mean that 😭

terse folio
halcyon quarry
#

I had a feeling it was going to be something like that

terse folio
#

Regex substitution would also be cool

#

but don't know what usecase it would have yet

#

It's good to have a library of tools!

halcyon quarry
#

The use case I have already personally, is that I have an effective prompt to get the chat context to write a pretty nice image prompt, with chat history intact. But they always say Oh blah blah blah and this that, here’s your prompt:

#

grammar is failing to help

#

Regex would trim out everything before “prompt” easy
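
A sketch of that trim, assuming the model ends its preamble with a marker like "here's your prompt:". `extract_image_prompt` is a made-up helper name, and the pattern would need adjusting to whatever wording the model actually uses.

```python
import re


def extract_image_prompt(reply: str) -> str:
    """Drop everything up to and including the last 'prompt:' marker.
    If no marker is found, return the reply as-is (stripped)."""
    # Greedy '.*' with DOTALL backtracks to the LAST 'prompt:' in the reply.
    match = re.search(r"^.*\bprompt\b\s*:\s*", reply, re.IGNORECASE | re.DOTALL)
    return reply[match.end():].strip() if match else reply.strip()


reply = "Oh blah blah blah and this that, here's your prompt: a cat on a sofa, soft light"
print(extract_image_prompt(reply))
```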

#

It happens like clockwork

terse folio
#

I would take this from the other side and have the image tag just extract the image prompt.

Because that whole text prompt "generate an image" is temporary, so no need to change it

visual dagger
#

Reality I think categorising the facts might help in narrowing down

it's like folder - sub folder - sub sub folder relationship

Character

  • John
    • School Related
      • Grades
        • 90/100 on math
        • 80/100 on english
        • 60/100 on geography
      • Club
        • Joined basketball club
        • Has a match next June 22nd
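
The folder / sub folder idea maps naturally onto nested dicts. A minimal sketch, with `facts` and `lookup` as illustrative names only:

```python
facts = {
    "John": {
        "School Related": {
            "Grades": ["90/100 on math", "80/100 on english", "60/100 on geography"],
            "Club": ["Joined basketball club", "Has a match next June 22nd"],
        }
    }
}


def lookup(tree: dict, *path: str):
    """Walk the tree one category at a time; return None if a step is missing."""
    node = tree
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


print(lookup(facts, "John", "School Related", "Club"))
```

Narrowing by path means only the relevant branch has to be shown to the LLM, instead of every fact about the character.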
terse folio
visual dagger
#

yeah totally automatically

visual dagger
#

generated

halcyon quarry
#

The text related tag behaviors can apply to either user msg or llm resp

visual dagger
#

all saved in a yaml or json file, or in actual folders with text files

halcyon quarry
#

Anyway, more tools more power

visual dagger
#

pawah

visual dagger
terse folio
#

Someone wrote a fork/patch or something for discord.py that enables voice receive.

I used this to create a talking status indicator.
But I think it's capturing the raw wav data for each user.

if we decode that we could use whisper to STT to LLM to TTS 😸

#

I also think I remember seeing some forks for Whisper that enable streaming of text from stt.

#

so the response times could be pretty fast if done right

#

last time I did something like that was with Gpt2-tiny haha.
That was a lightning fast model.
As well as a really fast tts.

And the small voice rec model I used was pretty bad with the outputs, so model got confused a lot

visual dagger
#

I suppose they chunk the audio in silent moments

#

bla blabla ----- so yeah let's go there ----- yeh yeah

---- representing silence
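
A toy sketch of chunking on silence, assuming the audio is a list of integer samples and that "silence" simply means a run of samples below an amplitude threshold. Real voice activity detection is more involved; `split_on_silence`, `threshold` and `min_silence` are illustrative.

```python
def split_on_silence(samples: list, threshold: int = 500,
                     min_silence: int = 3) -> list:
    """Split samples into chunks separated by at least `min_silence`
    consecutive quiet samples. Quiet samples themselves are dropped."""
    chunks, current, quiet_run = [], [], 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            # Close the current chunk once the pause is long enough.
            if quiet_run == min_silence and current:
                chunks.append(current)
                current = []
        else:
            quiet_run = 0
            current.append(s)
    if current:
        chunks.append(current)
    return chunks


print(split_on_silence([1000, 2000, 0, 0, 0, 0, 3000, 0, 4000]))
```

A single quiet sample between 3000 and 4000 is shorter than `min_silence`, so those two stay in the same chunk.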

terse folio
visual dagger
#

hmm but it's too late when the bot or agent speaks in real time

#

like a human, you can't erase what you already said

#

loudly

terse folio
#

yup, you'd have to track what the bot said as it's talking so it can handle interruptions

terse folio
#

Also you wouldn't be streaming text to the llm, that's a lot of prompts per second

#

it would be done in chunks when needed

visual dagger
#

why stt > llm > tts?

vestal python
#

Ok so my Ogma bot is now working. This line was in the errors; I think my old character yaml was part of the problem:

WARN [bot.main]: ** No extension params for this character. Reloading extensions with initial values. **

#

swapped him to your example minty yaml and he worked

terse folio
vestal python
#

Oh, I should get back to working on STT->TTS

visual dagger
terse folio
visual dagger
#

repeating back what was already said.. but why?

vestal python
#

Transcripts on transcripts

visual dagger
terse folio
#

interesting, doing evaluation tests of stt models?

visual dagger
#

but I guess it's all tokens at the end of the day

#

predicting the next token (voice chunks)

terse folio
# visual dagger but I guess it's all tokens at the end of the day

Like LLaVA for example, the model is trained to accept the embeddings of an image.

Instead of doing something like:
Image -> Blip (captioning) -> LLM

or:
Image -> Blip for initial caption -> LLM (with prompt ask about image) -> Blip (ask it questions) -> LLM

#

I've done that with Blip, it didn't work amazing, but it was something!

vestal python
#

I did get the 'Hey Chat' to work in a webapp example, to then send a speech request -> whisper -> LLM -> Alltalk -> text+speech back. Hands free response which wasn't terrible with llama 3 8B 4Q

#

But... Y'know talking is overrated. I barely have conversations all day to begin with..

terse folio
#

similarly, speech (the tone of your voice, the emotions...) can be encoded as a vector.
That could be fed to the LLM that's finetuned for that
Again, just guessing, I didn't read too much into that model

visual dagger
terse folio
terse folio
#

A LOT of special tokens, like 800 of them haha

halcyon quarry
visual dagger
halcyon quarry
#

As long as we see no errors 😅

visual dagger
#

am I a token 😐?

terse folio
#

If you appear enough in a training dataset, maybe there is a special token just for you :P

#

Happened to a few people actually

visual dagger
#

yaay my token

halcyon quarry
#

Good news fellas, the bot now prompts for your discord bot token if it’s not in config.yaml

terse folio
#

those usernames got pruned from the training data because it was garbage to train on haha.

halcyon quarry
#

And saves it !

terse folio
#

and then the LLM lost meaning to those names, and they became glitch tokens
(from a youtube video about this topic)

visual dagger
#

hey openai, I demand you remove this chat from your training data. now!!

#

guys say hi to openai

terse folio
visual dagger
#

😭

halcyon quarry
#

The only manual step now (besides setting up discord bot technically) is moving bot.py

terse folio
#

we can do better than that, make it 0 steps!

#

Soon™️ ad_discordbot-the extension

visual dagger
#

lol

vestal python
#

Well...I'll need to spend part of this weekend to create my character yaml's, and learn more about that. Swapping to M1nty surprised my users 'Who's this bot??' doggokek

terse folio
halcyon quarry
#

Did he speak of cryogenics? 😛

terse folio
#

just to make sure about this

#

because binary options can be confusing

vestal python
#

No, but lots of Tech comments

terse folio
#
if (
    ('change_llmmodel' in tag)
    and (
        not llm_payload_mods.get('change_llmmodel')
        or llm_payload_mods.get('swap_llmmodel')
    )
):
#

this is how python interprets it

halcyon quarry
# terse folio

I did this so I could initialize the mods as an empty dict

terse folio
#

i'm not familiar enough with tags to know why "not change_model" but "yes swap_character"

halcyon quarry
#

I want to ensure that the first tag processed with either change_X or swap_X is the one that will have effect

#

Because I’ve taken care to keep their priority in check

visual dagger
#

Reality is it possible to make a group chat of multiple characters, each one having their own system prompt etc.. that doesn't break?

halcyon quarry
#

If you can write it better, please 😛

visual dagger
#

like if you provide enough details for every character in system prompt

#

and swap system prompts aka swap characters

terse folio
terse folio
halcyon quarry
#

I was originally capturing both tags, then just prioritizing “change” over “swap”

visual dagger
terse folio
#

okay I think I see,
with the first item

If flow is a tag, but not part of the params, add it to params.

visual dagger
#

maybe bcz one character/system prompt/agent generates all the group chat?

#

that's the problem?

halcyon quarry
terse folio
#

But with swap/change.
You have some possible bugs I think.

What happens if swap_character is in llm_payload.
But not change_character?

The next line of code would indicate that it writes tag[change_character] to llm payload

#

and then on the next if statement, you have the same condition, which will then write "swap_character" to llmpayload

#

so to get both, you only need "swap_character"

halcyon quarry
#

If X and not (A or B)
If either A or B is in the dict, it will be ignored
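
The grouping is easy to misread because `not` binds tighter than `and`, which binds tighter than `or`. A small truth-table sketch, with `as_intended` and `misread` as illustrative names:

```python
def as_intended(x: bool, a: bool, b: bool) -> bool:
    # "If X and not (A or B)": skip when either A or B is already set.
    return x and not (a or b)


def misread(x: bool, a: bool, b: bool) -> bool:
    # Without the inner parentheses Python groups this as
    # (x and (not a)) or b, which is a different condition.
    return x and not a or b


for a in (False, True):
    for b in (False, True):
        print(f"A={a!s:5} B={b!s:5} intended={as_intended(True, a, b)!s:5} "
              f"misread={misread(True, a, b)}")
```

The two versions disagree whenever B is set, which is exactly the case under discussion.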

terse folio
halcyon quarry
#

Woops

terse folio
#

it can be confusing to write sometimes!

halcyon quarry
#

Had me second guessing for a sec actually lol

#

Pretty sure I have that right…

terse folio
#

But, if you want one or the other, you can use if, elif statements.
It will run the first, and pass

or skip the first because it doesn't match, and run the next

#

your code is written as

If X and (not A) or (B)

#

if B exists, the 2nd half will always be True

#

ohh

#

sorry

#

you're right

#

it's git's highlight that messed me up

halcyon quarry
#

🙂

#

Made me look again

terse folio
#

Sorry for that confusion 😅

halcyon quarry
#

Planned changes:

  • Will stop saving sd_model_checkpoint and vae to activesettings. This was necessary before I added api model loading.
  • Consolidate on_msg_gen and hybridllmimggen to hybrid_llm_img_task as mentioned before
  • regex_prompt tag
  • all things on TODO list (📍pinned msg)
terse folio
#

ahaha just noticed this

halcyon quarry
#

🍺

#
  • replace all instances of i with inter
terse folio
#

i'll take a look at that now

halcyon quarry
#

I was contemplating using ictx in places where I send ctx and i to same place

terse folio
#

ctx is short for context, it would be fine to use that meaning interaction.
it's the context of the interaction/command/message!

halcyon quarry
#

So far, haven’t needed to do anything limited to one or the other

#

i and ctx objects have a few differences

terse folio
#

yea, I'm just talking about the name

halcyon quarry
#

yep 😛

terse folio
#

import gradio as gr
import torch

from modules import chat, shared
from modules.text_generation import (
    decode,
    encode,
    generate_reply,
)

params = {
    "display_name": "AD Discord Bot",
    "is_tab": False,
}


def setup():
    """
    Gets executed only once, when the extension is imported.
    """
    pass
#

this is the entry point to TGWUI

#

we could put the bot.run() call in setup()

halcyon quarry
#

I only want to distinguish the labels in case some day I do have a reason to handle one different from the other

terse folio
#

spawn it in a new thread
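
A sketch of what that could look like, assuming the blocking bot call is wrapped in a function. `run_bot` is a placeholder, not the bot's real entry point, and returning the thread from `setup()` is only done here so the example can be checked; the extension hook itself does not need a return value.

```python
import threading


def run_bot() -> None:
    """Placeholder for the blocking bot.run() call (hypothetical)."""
    print("bot running")


def setup() -> threading.Thread:
    """Called once when the extension is imported; starts the bot on its
    own thread so the web UI's main thread is not blocked."""
    thread = threading.Thread(target=run_bot, name="ad_discordbot", daemon=True)
    thread.start()
    return thread


bot_thread = setup()
bot_thread.join(timeout=5)
```

Using `daemon=True` means the thread will not keep the process alive when the host application exits; whether that is the right shutdown behavior for the bot is a design choice to revisit.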

terse folio
#

this also works for base classes, like checking if a discord.Member is a discord.abc.User, which is True.

terse folio
#

But I think that's covered by prompting the user in cmd now

halcyon quarry
#

That could be good bc apparently it’s not intercepting TGWUI flags anymore

#

from the file

terse folio
#

this separates the flags because I don't think there would be a way to inject a new arg parser into TGWUI if this is an extension

#

Since extensions load after that file runs

halcyon quarry
#

Sounds good to me!

#

Anything you’d like to do, it’s our shared canvas to paint upon

terse folio
#

I believe this does the same

#

using the elif statement it will ignore the other if the first matches

halcyon quarry
#

Leave that as I had it though 😛 Pretty sure that won’t work there

terse folio
#

Sure!

halcyon quarry
#

it is iterating one tag key at a time, so first one is change_char. Ok, now its added. Next key is swap_char…. Also added!

#

Because it’s not checking that the other is added yet
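
A sketch of the "first one wins" rule, assuming tags arrive as a list of dicts already sorted by priority. `collect_mods` and its `kind` parameter are hypothetical, and this uses key membership where the bot's own code uses `.get()`.

```python
def collect_mods(tags: list, kind: str = "character") -> dict:
    """Keep only the first change_<kind> / swap_<kind> seen across all tags."""
    change_key, swap_key = f"change_{kind}", f"swap_{kind}"
    mods = {}
    for tag in tags:
        for key, value in tag.items():
            if key not in (change_key, swap_key):
                continue
            # Skip if EITHER key was already captured, so a later
            # swap cannot sneak in after an earlier change (or vice versa).
            if change_key in mods or swap_key in mods:
                continue
            mods[key] = value
    return mods


print(collect_mods([{"change_character": "A", "swap_character": "B"}]))
```

Because the check looks at both keys on every iteration, a plain `elif` is not enough here: the keys are visited one at a time, so each one has to know whether the other was already added.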

terse folio
#

Yup

#

Hmm, is 8mb still the max?

#

I thought they raised the limit to 25 for everyone

#

maybe that doesn't apply to bots

halcyon quarry
#

It’s probably old ChatGPT info