ad_discordbot (Fork of Fork of xNul's bot) | Text Generation WebUI | Page 8

terse folio May 25, 2024, 5:09 AM

#

I see I see

halcyon quarry May 25, 2024, 5:09 AM

#

But, I have not come close to that on very long responses

terse folio May 25, 2024, 5:10 AM

#

Also, not sure about pydub.
But with ffmpeg you can specify a size limit and it will pick the bitrate

#

I think pydub uses ffmpeg in the backend
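
A note on the size-limit idea above: as far as I know, ffmpeg's `-fs` option stops writing once the limit is reached rather than choosing a bitrate, so the usual workaround is to derive the bitrate from the target size and the clip length. A minimal sketch; the function name and the 5% container-overhead margin are assumptions, not something from the bot:

```python
def bitrate_for_target_size(target_bytes: int, duration_s: float, overhead: float = 0.05) -> int:
    """Return an audio bitrate in kbit/s that should keep a clip under target_bytes."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # Reserve a small margin for container overhead (assumed, not measured).
    usable_bits = target_bytes * 8 * (1.0 - overhead)
    # Never go below 8 kbit/s, which is already barely usable for speech.
    return max(8, int(usable_bits / duration_s / 1000))
```

The result could then be passed to pydub or ffmpeg as e.g. `f"{kbps}k"`.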

halcyon quarry May 25, 2024, 5:10 AM

#

This is working pretty good

terse folio May 25, 2024, 5:10 AM

#

^-^

#

Also there is a token limit on tts, about 1000 in/out

halcyon quarry May 25, 2024, 5:11 AM

#

Getting the TTS up and running was fun

halcyon quarry May 25, 2024, 5:14 AM

#

terse folio Also there is a token limit on tts, about 1000 in/out

That’s a hard limitation to get around… now. Juice not really worth the squeeze

terse folio May 25, 2024, 5:14 AM

#

halcyon quarry That’s a hard limitation to get around… now. Juice not really worth the squeeze

that's quite a lot of text!

#

just split it into paragraphs, or sentences

#

I was bringing this up more to say that I bet it would be hard to reach the discord file limit with what Xtts can output
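
The "split it into paragraphs, or sentences" suggestion could look roughly like this. It is only a sketch: it counts characters as a stand-in for the TTS token limit, and `split_for_tts` and `max_chars` are made-up names.

```python
import re

def split_for_tts(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars, preferring paragraph and sentence breaks."""
    chunks: list[str] = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        current = ""
        # Split after sentence-ending punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if not sentence:
                continue
            candidate = f"{current} {sentence}".strip()
            if len(candidate) <= max_chars:
                current = candidate
                continue
            if current:
                chunks.append(current)
            # A single sentence longer than the limit is hard-wrapped.
            while len(sentence) > max_chars:
                chunks.append(sentence[:max_chars])
                sentence = sentence[max_chars:]
            current = sentence
        if current:
            chunks.append(current)
    return chunks
```

Each chunk can then be sent to the TTS engine separately and the audio played back in order.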

halcyon quarry May 25, 2024, 5:15 AM

#

Ah

#

Yeah, definitely

#

Remove the limit if you’re there 🙂

#

It’s probably 25MB like you said anyway

terse folio May 25, 2024, 5:16 AM

#

the limit is probably a good idea on the wav version

#

But may as well always convert to mp3, the quality loss is minimal given the tts output isn't the best to begin with

#

and saves users on data if that's ever a worry! / faster upload

halcyon quarry May 25, 2024, 5:17 AM

#

Looked it up, ur right about 25mb

terse folio May 25, 2024, 5:19 AM

#

😸
wow, 18 minutes of music comes out to 16mb mp3

#

yup yup, can remove it!

#

wrote the mp3 to a buffer, so no need to worry about the commented-out mp3 deletion code ^^

#

Discord accepts BytesIO types 😸
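
For anyone following along, writing the mp3 to an in-memory buffer and uploading it might look like the sketch below. It assumes pydub (with an ffmpeg binary available) and discord.py; the helper names and the `tts.mp3` filename are invented for illustration and are not the bot's actual code.

```python
import io

DISCORD_UPLOAD_LIMIT = 25 * 1024 * 1024  # the 25MB figure quoted in the chat

def fits_upload_limit(buffer: io.BytesIO, limit: int = DISCORD_UPLOAD_LIMIT) -> bool:
    """Return True if the buffer's contents are within the upload limit."""
    return buffer.getbuffer().nbytes <= limit

def wav_to_mp3_buffer(wav_path: str, bitrate: str = "128k") -> io.BytesIO:
    """Encode a WAV file to MP3 in memory, so there is no temp file to delete."""
    from pydub import AudioSegment  # imported lazily; needs pydub and ffmpeg
    buffer = io.BytesIO()
    AudioSegment.from_wav(wav_path).export(buffer, format="mp3", bitrate=bitrate)
    buffer.seek(0)  # rewind so the uploader reads from the start
    return buffer

async def send_tts(channel, wav_path: str) -> None:
    """Upload the encoded audio; `channel` is assumed to be a discord.py messageable."""
    import discord
    buffer = wav_to_mp3_buffer(wav_path)
    if fits_upload_limit(buffer):
        await channel.send(file=discord.File(buffer, filename="tts.mp3"))
```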

#

hmm, looking at the play in voice channel function, I wonder how one would achieve audio streaming

#

there must be a function to add to the voice channel

halcyon quarry May 25, 2024, 5:27 AM

#

You mean merging audio sources?

terse folio May 25, 2024, 5:27 AM

#

yes, xtts streaming sends out little chunks that might cut off words

#

so it needs to be added to some queue

#

i'll look more into how music discord bots work soon

#

might get some answers there

halcyon quarry May 25, 2024, 5:30 AM

#

Currently, the bot puts the audio to queue and moves along, and it plays on VC. If another TTS resp is generated right away it plays immediately after the first finishes
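
The queue behaviour described above (enqueue, move along, play clips back to back) can be sketched with `asyncio.Queue`. This is an illustration, not the bot's implementation; `play` is a hypothetical coroutine that returns once a clip has finished in the voice channel.

```python
import asyncio

async def playback_worker(queue: asyncio.Queue, play) -> None:
    """Play queued clips strictly one after another."""
    while True:
        clip = await queue.get()
        try:
            if clip is None:  # sentinel: stop the worker
                return
            await play(clip)
        finally:
            queue.task_done()

async def demo() -> list[str]:
    played: list[str] = []

    async def fake_play(clip: str) -> None:
        await asyncio.sleep(0)  # stand-in for waiting on the voice client
        played.append(clip)

    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(playback_worker(queue, fake_play))
    for clip in ("first.wav", "second.wav", None):
        queue.put_nowait(clip)  # the producer enqueues and moves along
    await worker
    return played
```

Streamed XTTS chunks could be fed into the same kind of queue, as long as each item is a complete piece of audio.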

terse folio May 25, 2024, 5:30 AM

#

I see I see

halcyon quarry May 25, 2024, 5:31 AM

#

idk if you saw what I wrote yesterday but there’s a guy with this bot on 3 servers at once lol

terse folio May 25, 2024, 5:32 AM

#

I did read that, nice!
some distributed stuff going on

halcyon quarry May 25, 2024, 5:32 AM

#

he said it has some odd responses due to the shared history/context but generally works well

terse folio May 25, 2024, 5:32 AM

#

oh 3 discord servers?

halcyon quarry May 25, 2024, 5:33 AM

#

Ya

terse folio May 25, 2024, 5:33 AM

#

yes the chat should be separate

#

interesting that happens

#

maybe creating a bot_history per guild

halcyon quarry May 25, 2024, 5:33 AM

#

Channel specific history will handle that

terse folio May 25, 2024, 5:33 AM

#

mhm

halcyon quarry May 25, 2024, 5:34 AM

#

It’s not too complicated I don’t think

#

It’ll be easy 😛 Had too many distractions these past few days to make it happen

terse folio May 25, 2024, 5:35 AM

#

halcyon quarry It’s not too complicated I don’t think

shouldn't be, just replace instances of bot_history with

history_channels.setdefault(channel.id, bot_history())

#

actually would rather write that as an if statement,
I think python will evaluate bot_history() and create an instance each time regardless.

#

This could be costly if you do something expensive in the __init__ of that class, like loading files or other tasks
just to keep in mind for the future
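
The point about `setdefault` is easy to check: the default argument is evaluated before the call, whether or not the key exists. A small demonstration, with `BotHistory` as a stand-in for the bot's history class:

```python
class BotHistory:
    """Stand-in for the bot's history class; counts how often it is constructed."""
    created = 0

    def __init__(self) -> None:
        BotHistory.created += 1
        self.messages: list[str] = []

history_channels: dict[int, BotHistory] = {}

def get_history_eager(channel_id: int) -> BotHistory:
    # BotHistory() is evaluated before setdefault runs, even if the key exists.
    return history_channels.setdefault(channel_id, BotHistory())

def get_history_lazy(channel_id: int) -> BotHistory:
    # Only construct a new object when the channel has no history yet.
    if channel_id not in history_channels:
        history_channels[channel_id] = BotHistory()
    return history_channels[channel_id]
```

`collections.defaultdict(BotHistory)` would give the same lazy behaviour with less code.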

halcyon quarry May 25, 2024, 5:37 AM

#

‘Unique_id’ is the datetime when the conversation began - for file saving. If TGWUI supports subdirs in char logs that would be nice

#

For the people who care about loading this history from the webui

#

(Make subdir for each channel)

terse folio May 25, 2024, 5:38 AM

#

why not use the unique_id + channel_id in the file name?

#

if you can set the names

halcyon quarry May 25, 2024, 5:38 AM

#

Certainly can

terse folio May 25, 2024, 5:38 AM

#

also you could add an os.sep into the name, it probably would create a subdir

#

maybe throw an error dir not found. hmm

halcyon quarry May 25, 2024, 5:39 AM

#

if subdir would be the os.path.exists channel OK

#

or whatever 😛

terse folio May 25, 2024, 5:40 AM

#

https://stackoverflow.com/questions/62051016/discord-py-play-voice-from-pyaudio

Stack Overflow

Discord.py Play Voice From PyAudio

I'm trying to play live-audio from my microphone into my Discord bot, but I can't convert the stream from PyAudio into an acceptable format Discord can play.
I have tried:

async def play_audio_in_...

#

looks like this would support streaming to vc

#

just write packets as they come in, and specify the correct encoder
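
As far as I know, a discord.py `AudioSource.read()` has to return 20 ms of 16-bit 48 kHz stereo PCM per call (3,840 bytes) and an empty bytes object at the end. The re-framing part of "write packets as they come in" could be sketched like this; `PCMFrameBuffer` is a made-up helper that a custom `AudioSource` subclass could delegate to, and XTTS output would likely need resampling to that format first.

```python
import threading

FRAME_BYTES = 3840  # 20 ms of 16-bit, 48 kHz, stereo PCM: 960 samples * 2 channels * 2 bytes

class PCMFrameBuffer:
    """Collect arbitrarily sized PCM chunks and hand them out as fixed 20 ms frames."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._lock = threading.Lock()  # the voice player reads from its own thread
        self._closed = False

    def feed(self, chunk: bytes) -> None:
        with self._lock:
            self._data.extend(chunk)

    def close(self) -> None:
        with self._lock:
            self._closed = True

    def read(self) -> bytes:
        """Return one frame, silence while waiting for data, or b"" once drained and closed."""
        with self._lock:
            if len(self._data) >= FRAME_BYTES:
                frame = bytes(self._data[:FRAME_BYTES])
                del self._data[:FRAME_BYTES]
                return frame
            if self._closed:
                # Pad the final partial frame with silence, then signal the end.
                if self._data:
                    frame = bytes(self._data).ljust(FRAME_BYTES, b"\x00")
                    self._data.clear()
                    return frame
                return b""
            return b"\x00" * FRAME_BYTES
```

Returning silence while the generator is still running keeps playback alive instead of ending the stream on the first gap.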

halcyon quarry May 25, 2024, 5:42 AM

#

Essentially, to start playing TTS sooner than waiting to complete the full generation yes?

#

Gnight 😛 get some sleep!

terse folio May 25, 2024, 5:54 AM

#

halcyon quarry Essentially, to start playing TTS sooner than waiting to complete the full gener...

yes, exactly
it starts within a fraction of a second!

visual dagger May 25, 2024, 9:58 AM

#

hello

#

any anti repetition prompts that worked for you guys?

#

system prompts or one shot prompts or whatever

halcyon quarry May 25, 2024, 10:58 AM

#

The bot has a Dynamic Prompting feature.
You can put txt files in the wildcards directory and use them like
##instruct me in a ##mindset manner, just ##examples. Please ##final_instructions

#

The sd-dynamic-prompts extension ships with a buttload of wildcard txt files, maybe you can come up with one effective Dynamic Prompt with varying results

#

While testing you can use tags to freeze history

#

You could also use the llm_param_variances tag feature to randomize params and look out for improvements

#

You could also make a Flow where it simply loops the dynamic prompt X times such as flow_loops: 10 and the LLM will be prompted 10 times with the randomized prompt/params

terse folio May 25, 2024, 11:19 AM

#

if my live STT ever survives,
we can call it "BooginSTT" in spirit of Oogabooga

#

I got livestreaming audio over websockets into whisper working!
sadly I can't run whisper+TTS+llm on the same machine to do some cooler tests right now

#

I'm going to implement silence tracking so it doesn't have to process constantly.

#

then im sure it could be adapted to a discord bot
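
Silence tracking along those lines could be as simple as an RMS threshold over incoming chunks. A rough sketch, assuming 16-bit little-endian mono PCM; the threshold and chunk counts are arbitrary placeholder values, not tuned numbers:

```python
import math
import struct

def rms(pcm: bytes) -> float:
    """Root-mean-square level of 16-bit little-endian mono PCM."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)

class SilenceTracker:
    """Report when enough consecutive quiet chunks have passed to end an utterance."""

    def __init__(self, threshold: float = 500.0, max_quiet_chunks: int = 10) -> None:
        self.threshold = threshold
        self.max_quiet_chunks = max_quiet_chunks
        self._quiet = 0
        self._speaking = False

    def update(self, pcm: bytes) -> bool:
        """Feed one chunk; return True once when speech is followed by enough silence."""
        if rms(pcm) >= self.threshold:
            self._speaking = True
            self._quiet = 0
            return False
        self._quiet += 1
        if self._speaking and self._quiet >= self.max_quiet_chunks:
            self._speaking = False
            return True
        return False
```

Whisper would then only be invoked when `update()` returns True, instead of processing constantly.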

visual dagger May 25, 2024, 11:50 AM

#

halcyon quarry The bot has a Dynamic Prompting feature. You can put txt files in the wildcards ...

so by dynamic prompt you mean it's the same system prompt but worded differently?

#

like there is multiple versions of the system prompt

halcyon quarry May 25, 2024, 11:52 AM

#

The feature can select a random line from the txt file and swap it in

visual dagger May 25, 2024, 11:53 AM

#

terse folio if my live STT ever survives, we can call it "BooginSTT" in spirit of Oogabooga

that's quite a name haha

halcyon quarry May 25, 2024, 11:53 AM

#

I have a wiki article on the feature

visual dagger May 25, 2024, 11:54 AM

#

so hmm.. adding a relatively medium to big section of text or removing it from the system prompt will help?

#

I mean like example

sometimes we will include a "more info" section and sometimes we don't include it (we remove it)

halcyon quarry May 25, 2024, 11:56 AM

#

Currently, the ‘custom_system_prompt’ parameter cannot use dynamic prompting.

#

But your actual prompt can

visual dagger May 25, 2024, 11:57 AM

#

I mean... does it help? like will it work to reduce the repetition

visual dagger May 25, 2024, 11:57 AM

#

visual dagger I mean like example sometimes we will include a "more info" section and sometim...

what I explained here

#

thanks for help

halcyon quarry May 25, 2024, 11:58 AM

#

You’re here in our project channel so I’m offering answers in terms of how this project could help

visual dagger May 25, 2024, 11:58 AM

#

btw thank you guys @terse folio @halcyon quarry

it's so wonderful and fun discussing these stuff with you guys and trying it

visual dagger May 25, 2024, 12:00 PM

#

halcyon quarry You’re here in our project channel so I’m offering answers in terms of how this ...

I didn't understand the dynamic prompting feature

#

what I got from it, is including or removing sections

#

or better say a chunk of text

visual dagger May 25, 2024, 12:01 PM

#

halcyon quarry The bot has a Dynamic Prompting feature. You can put txt files in the wildcards ...

this

halcyon quarry May 25, 2024, 12:02 PM

#

Just blasting it out there in case it gives you any ideas

visual dagger May 25, 2024, 12:03 PM

#

so I understood how it works correctly?

halcyon quarry May 25, 2024, 12:04 PM

#

Yeah it can replace your ##wildcard-syntax (representing wildcards/wildcard-syntax.txt) with a randomly selected value

visual dagger May 25, 2024, 12:04 PM

#

ok cool 🔥

halcyon quarry May 25, 2024, 12:04 PM

#

The value could be a single character or a book of text

visual dagger May 25, 2024, 12:05 PM

#

got it got it, thanks

halcyon quarry May 25, 2024, 12:05 PM

#

Well, the values it randomly selects from must be separated by new lines in the txt file.

visual dagger May 25, 2024, 12:06 PM

#

it chooses a random line

#

from the text file then inject it

halcyon quarry May 25, 2024, 12:06 PM

#

Alternatively you can use {{this syntax|this or that syntax|another syntax like this}} to pick something at random

#

Yes, try it

#

It returns a copy of the updated text to the channel

#

So you can see what random values were applied (also visible in cmd)
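
To make the two syntaxes concrete, here is a minimal sketch of how `##wildcard` and `{{a|b|c}}` substitution could work. It is not the bot's actual implementation; the function name, the regexes, and the choice to leave unknown wildcards untouched are all assumptions.

```python
import random
import re
from pathlib import Path
from typing import Optional

WILDCARD = re.compile(r"##([\w-]+)")
VARIANT = re.compile(r"\{\{([^{}]+)\}\}")

def apply_dynamic_prompt(prompt: str, wildcard_dir: Path, rng: Optional[random.Random] = None) -> str:
    """Expand ##name wildcards and {{a|b|c}} variants with random picks."""
    rng = rng or random.Random()

    def from_file(match: re.Match) -> str:
        # ##name maps to <wildcard_dir>/name.txt, one candidate value per line.
        path = wildcard_dir / f"{match.group(1)}.txt"
        if not path.is_file():
            return match.group(0)  # leave unknown wildcards untouched
        lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
        return rng.choice(lines) if lines else match.group(0)

    def from_variants(match: re.Match) -> str:
        return rng.choice(match.group(1).split("|"))

    return VARIANT.sub(from_variants, WILDCARD.sub(from_file, prompt))
```

With a `wildcards/mindset.txt` containing `calm` and `playful` on separate lines, `"in a ##mindset manner"` becomes one of the two at random on each call.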

#

Good thing we're discussing this because just noticed the wildcards has a bug due to directory not being initialized correctly with the filesystem...

#

fixed that

vestal python May 25, 2024, 1:46 PM

#

On the road for the next 2 days and all the new servers wiped and setup now... I didn't take the time to add in a new wireguard vpn x.x I'll at least grab the example character.yaml and get that done. After all that my 3 discord bots should be working as intended.. Or 2 plus an alltalk server with rtx 2070 maxq

halcyon quarry May 25, 2024, 2:24 PM

#

The session_history variable is currently just one dict.

If autoload_history: True, the value is set by TGWUI load_latest_history() based on current state['character_menu'] and state['mode']

I think the correct course of action will be to just snag those two values, create a loop for all channels the bot is currently a member of, copy then update the value of character_menu to prefix it with the value of those channels. Then assign each history by channel id in a chan_session_history variable

#

We'll collect a list of channel IDs the bot has message permissions for on startup.

#

    def collect_all_bot_channels(self):
        for guild in client.guilds:
            for channel in guild.text_channels:  # Only consider text channels
                # Check if the bot has send msg permissions for the channel
                if channel.permissions_for(guild.me).send_messages:
                    self.bot_channels.append(channel.id)

halcyon quarry May 25, 2024, 2:56 PM

#

    def load_history(self):
        # Base lookup values taken from the bot's current LLM state
        current_state = bot_settings.settings['llmstate']['state']
        values_for_history = {'character_menu': current_state['character_menu'],
                              'mode': current_state['mode']}
        for channel_id in self.bot_channels:
            # Shallow copy (needs `import copy`) so each channel gets its own lookup values
            chan_values_for_history = copy.copy(values_for_history)
            chan_values_for_history['character_menu'] = f'({channel_id}_{chan_values_for_history["character_menu"]})'
            # load_latest_history() is the TGWUI function mentioned above
            channel_history = load_latest_history(chan_values_for_history)
            self.session_history[channel_id] = channel_history

#

Pushed WIP to new branch per_chan_histories - done for probably the day

halcyon quarry May 25, 2024, 4:41 PM

#

More settings.
If any channels are specified in history_channels then it will not dynamically use all available channels

halcyon quarry May 26, 2024, 1:44 AM

#

To use TGWUI's save/load history functions with per-channel histories - the chatlogs will have to be saved into directories suffixed with the channel ID.
Meaning, there will be a 'Character_xxxxxxxxx' directory for each channel with its own history management.
These will not appear in the TGWUI chatlog dropdown without renaming the directory to match the character's name

#

Making a lot of progress on this

halcyon quarry May 26, 2024, 2:55 AM

#

calling it quits for the night, but per-channel histories are probably coming tomorrow/Monday. Very close.

visual dagger May 26, 2024, 3:50 AM

#

heh

#

hey

#

@halcyon quarry thank you for yesterday, I think it's working, I'm noticing less repetitions

#

but I still need to do more testing

terse folio May 26, 2024, 10:58 AM

#

"Note: Can't delete outputs uploaded to channel. (../extensions/coqui_tts/outputs)"
interesting, why's that?
Os error? like file in use?

#

if so, you could put the files you want to delete in a yaml file, then check if they exist on the next load and purge them at that time.
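
The "remember what to delete, purge on next load" idea might be sketched as follows. JSON from the standard library is used here purely as a stand-in for the YAML file suggested above, and both function names are hypothetical.

```python
import json
from pathlib import Path

def queue_for_deletion(manifest: Path, file_path: Path) -> None:
    """Record a file that could not be deleted now so it can be purged later."""
    pending = json.loads(manifest.read_text()) if manifest.exists() else []
    if str(file_path) not in pending:
        pending.append(str(file_path))
    manifest.write_text(json.dumps(pending, indent=2))

def purge_pending(manifest: Path) -> list[str]:
    """Try to delete every recorded file; keep the ones that still fail."""
    if not manifest.exists():
        return []
    still_pending = []
    for name in json.loads(manifest.read_text()):
        try:
            Path(name).unlink(missing_ok=True)
        except OSError:
            still_pending.append(name)  # e.g. still in use; retry on the next startup
    manifest.write_text(json.dumps(still_pending, indent=2))
    return still_pending
```

`purge_pending()` would run once at startup, before any new TTS output is generated.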

halcyon quarry May 26, 2024, 11:00 AM

#

terse folio if so, you could put the files you want to delete in a yaml file, then check if ...

Help 😆

terse folio May 26, 2024, 11:01 AM

#

halcyon quarry Help 😆

I'm just not sure what the issue is, I don't have coqui installed yet

halcyon quarry May 26, 2024, 11:01 AM

#

You nailed it

terse folio May 26, 2024, 11:01 AM

#

put alltalk in tgwi to test my update to the file sending

halcyon quarry May 26, 2024, 11:03 AM

#

If my error reads like that, then I never got around to making it use the tts_client variable (originally only supported coqui)

terse folio May 26, 2024, 11:03 AM

#

hmm I wonder if it has to do with tgwi maybe using that outputs folder like a "static" folder on a webserver.
So it can link to those files in the webui

terse folio May 26, 2024, 11:04 AM

#

halcyon quarry If my error reads like that, then I never got around to making it use the tts_cl...

i'm 3 commits behind, you might have changed it

halcyon quarry May 26, 2024, 11:04 AM

#

They can be deleted after new file generated though if memory serves me right

#

Nah haven’t touched tts code in a loooong time

terse folio May 26, 2024, 11:06 AM

#

oh nvm, i could install coqui pretty easily, it's already part of the extensions
and its only requirement is "tts" which alltalk also uses

#

Lol thought something was wrong with the api.
That's an odd thing for the LLM to reply to for "test"

#

anyway, file upload works as normal!

halcyon quarry May 26, 2024, 11:09 AM

#

Lol

#

Too much ChatGPT response in that training

terse folio May 26, 2024, 11:10 AM

#

Yea, it's an older llama2 model for the fast load times during testing

halcyon quarry May 26, 2024, 11:11 AM

#

Definitely makes sense

#

Model quality has come such a long way since then

#

Assuming you mean like, when llama2 was new

#

I have mixed feelings on llama3

terse folio May 26, 2024, 11:15 AM

#

Lately I've been using Solar10b, a mix of Mistral, and Llama3-8b.

they have been pretty good for the random tests I do

halcyon quarry May 26, 2024, 11:15 AM

#

Sounds good need to check that out

terse folio May 26, 2024, 11:15 AM

#

halcyon quarry Assuming you mean like, when llama2 was new

I don't think it was new, it's a finetune, but yes, probably a few months old.

#

Mistral was the first above 2k context model I tried, so that opened up a lot of possibilities!
Then llama3 came around at 8k context by default which matched the settings I used for mistral anyway.

#

Llama3 performed pretty okay with tool selection (shown in that screenshot yesterday)

visual dagger May 26, 2024, 11:19 AM

#

any useful 32k llama3 8b finetune?

#

all extended context ones I tried break at some point

terse folio May 26, 2024, 11:20 AM

#

visual dagger any useful 32k llama3 8b finetune?

I think you could expand it to 32k yourself using the alpha/rope slider

#

and yes, in many people's experiences after around 8k tokens things start to break down.

visual dagger May 26, 2024, 11:20 AM

#

what's that rope thing? I keep hearing about it

terse folio May 26, 2024, 11:21 AM

#

It's in the model load tab, 1 sec

visual dagger May 26, 2024, 11:21 AM

#

terse folio and yes, in many people's experiences after around 8k tokens things start to bre...

the farthest I could go is at like 13k-16k and after that things break badly

#

even for 200k finetune or 64k

#

all of them break after 16k

#

I couldnt get any finetune to survive until 32k at least

terse folio May 26, 2024, 11:22 AM

#

visual dagger even for 200k finetune or 64k

Yep!
I wonder how those needle in a haystack evaluations work if it starts breaking so fast

#

#general message

alpha 4 for 4x
2.5 for 2x

visual dagger May 26, 2024, 11:23 AM

#

4x the context?

terse folio May 26, 2024, 11:23 AM

#

yea

#

this slider

#

of course, this will degrade the quality the more you expand it

#

even if you don't use all the context

#

Kind of like quantizing the model to fit into a smaller space I assume.
This compresses tokens to fit in a smaller space
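
For context on what the slider does: I believe the alpha value applies NTK-aware RoPE scaling, which raises the rotary base frequency rather than touching the weights. A rough sketch under that assumption; the default base of 10000 and head dimension of 128 match Llama 2 style models and are assumptions here (Llama 3 ships with a larger base).

```python
def ntk_rope_base(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    """NTK-aware scaled RoPE frequency base: base * alpha ** (d / (d - 2))."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return base * alpha ** (head_dim / (head_dim - 2))
```

A larger base stretches the positional wavelengths so more positions fit into the range the model saw in training, which is also why quality drops as alpha grows.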

halcyon quarry May 26, 2024, 11:34 AM

#

Even though I’m not going to use it personally I’m very excited about the per channel histories

#

This will be a feature that definitely sets us apart from the rest

terse folio May 26, 2024, 11:35 AM

#

😸

halcyon quarry May 26, 2024, 11:39 AM

#

I realized that model and character changes could be mysterious happenings, but what I’m going to do is just use the history_channels list to channel.send the embed. The message for each embed will depend on whether the channel is in i.guild.channels; if it isn’t, something like ‘User in another server changed character’

#

Or if in same guild, ‘{user} in {channel_name} changed character’

#

I’ll offload this to a background task and maybe add asyncio.sleep if there’s a buttload of channels
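
The background broadcast with a pause between batches could look something like this. It is a sketch: `announce_change` and its parameters are invented, it sends plain text rather than an embed, and the channel objects only need an async `send()` method.

```python
import asyncio

async def announce_change(channels, message: str, delay: float = 0.5, batch: int = 5) -> int:
    """Send `message` to every channel, pausing between batches; returns the number sent."""
    sent = 0
    for index, channel in enumerate(channels, start=1):
        try:
            await channel.send(message)
            sent += 1
        except Exception as error:  # a failing channel should not stop the rest
            print(f"Could not announce in {channel!r}: {error}")
        if index % batch == 0:
            await asyncio.sleep(delay)  # breathe between batches when there are many channels
    return sent
```

Calling it through `asyncio.create_task(announce_change(...))` keeps the command handler from waiting on the whole loop.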

terse folio May 26, 2024, 11:52 AM

#

it would be best to handle characters/system prompts per conversation

visual dagger May 26, 2024, 11:56 AM

#

halcyon quarry I’ll offload this to a background task and maybe add asyncio.sleep if there’s a ...

does the tool handle multiple prompts/responses in parallel?

#

if yes, how does it not affect the pc's performance? I mean it will use double or more resources than usual I assume

halcyon quarry May 26, 2024, 11:58 AM

#

terse folio it would be best to handle characters/system prompts per conversation

That’s true… a whole nother can of worms

#

The bot is going to handle one request at a time for now

#

But it is going to have an option to manage history separately per channel

#

Which means multiple history variables so some additional ram usage but it’s just text so probably nothing significant

#

It’s almost done really

halcyon quarry May 26, 2024, 12:40 PM

#

The only way this bot could allow multiple characters in a per-channel discord context is to disable avatar/display_name changes.

#

On a per-guild basis, those could be updated

terse folio May 26, 2024, 12:42 PM

#

~~webhoooooks~~

#

someday™️

halcyon quarry May 26, 2024, 12:45 PM

#

The multiple character thing I think is too much effort / too specialized, when the user could have multiple bot instances. I think I’d rather improve support for that concept

terse folio May 26, 2024, 12:47 PM

#

at some point I want to build a new bot base where all these tools can be plugged in as hot swappable modules 😸

Tts support, images, textgen,..

halcyon quarry May 26, 2024, 12:47 PM

#

Really just need to put like, root_dir as a variable in config.yaml

terse folio May 26, 2024, 12:48 PM

#

halcyon quarry Really just need to put like, root_dir as a variable in config.yaml

for what?
What if a user moves the tgwi folder?

halcyon quarry May 26, 2024, 12:48 PM

#

I meant so there could be a ad_bot1 folder and ad_bot2 folder

terse folio May 26, 2024, 12:50 PM

#

a lot of settings will be the same

#

like if users want tts, so that could be shared

halcyon quarry May 26, 2024, 12:50 PM

#

Well they will literally be the same

terse folio May 26, 2024, 12:50 PM

#

just the "active settings" need to be separate per bot

#

if using different characters

halcyon quarry May 26, 2024, 12:51 PM

#

If separate bot instances could be completely independent of each other, it could be easier for user to share the settings they want to share

terse folio May 26, 2024, 12:51 PM

#

oh I see

#

well then you have the problem of vram

#

running multiple copies of tgwi

halcyon quarry May 26, 2024, 12:53 PM

#

It would be for someone with 2+ gpu using one for each instance

#

I have no clue I’m spitballing here lol

#

Multiple character support sounds like a nightmare

vestal python May 26, 2024, 1:18 PM

#

.> I could test it out on my 5GPU server if you want.

terse folio May 26, 2024, 2:45 PM

#

Was talking about pattern matching in another channel.
This could be of benefit to you too!

Letting users build patterns that can pre/post match on user/llm text

#

based on this list of patterns in the database.

#

The difference here is this runs through a recursive tree a token at a time to find the first best match
instead of running a match for every entry.
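
The general idea of walking a token tree instead of testing every pattern can be illustrated with a small trie. This is only a guess at the shape of it: the sketch is iterative and returns the longest match, which may differ from the recursive "first best match" described above, and `PatternTrie` is a made-up name.

```python
class PatternTrie:
    """Token trie: walk the input once instead of testing every pattern separately."""

    def __init__(self) -> None:
        self.children: dict[str, "PatternTrie"] = {}
        self.value = None  # payload stored where a pattern ends

    def add(self, pattern: str, value) -> None:
        node = self
        for token in pattern.lower().split():
            if token not in node.children:
                node.children[token] = PatternTrie()
            node = node.children[token]
        node.value = value

    def longest_match(self, tokens: list[str], start: int = 0):
        """Return (value, length) of the longest pattern starting at tokens[start], or (None, 0)."""
        node, best = self, (None, 0)
        for offset, token in enumerate(tokens[start:], start=1):
            node = node.children.get(token.lower())
            if node is None:
                break
            if node.value is not None:
                best = (node.value, offset)
        return best
```

Lookup cost depends on the length of the matched prefix rather than on the number of stored patterns.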

halcyon quarry May 26, 2024, 3:50 PM

#

It’s your canvas to paint upon my friend

#

Just make sure potatoes is spelled correctly

keen palm May 26, 2024, 3:52 PM

#

Reality is Dan Quayle confirmed

halcyon quarry May 26, 2024, 3:55 PM

#

You can make your first tag 😮

terse folio May 26, 2024, 4:03 PM

#

Lol, I thought it was interchangeable I've seen people spell them both ways

halcyon quarry May 27, 2024, 12:58 AM

#

Instead of sending Change embeds to history_channels (from /character, /llmmodel, /imgmodel, Tags), I'll send them to main_channels. So this will be an enhancement now, separate from the per-channel history thing

#

Are there better alternatives? Yep, but I want to get a pretty good solution published for now

terse folio May 27, 2024, 1:04 AM

#

an /announcements possibly

halcyon quarry May 27, 2024, 1:05 AM

#

Excellent solution

#

Now that I already coded this thing lol

terse folio May 27, 2024, 1:06 AM

#

it's just copypaste the main command and switch what database entry it uses

#

database.main_channels -> update_channels

#

you can select a piece of text and use Ctrl D to duplicate cursor

#

easy rename :)

halcyon quarry May 27, 2024, 1:07 AM

#

How about we just add an announcement_channel to config.yaml under the discord key

terse folio May 27, 2024, 1:08 AM

#

halcyon quarry How about we just add an `announcement_channel` to `config.yaml` under the `disc...

but there will have to be one per server right?

halcyon quarry May 27, 2024, 1:08 AM

#

Ahhh yes

#

How did I make it this far without you? Sheesh

#

Now that is a nifty trick

#

thanks for that advice

terse folio May 27, 2024, 1:09 AM

#

😸

#

one of my favorite keybinds

halcyon quarry May 27, 2024, 1:10 AM

#

One of my new favs already

#

Beautiful. I like this solution a lot

terse folio May 27, 2024, 1:13 AM

#

Mhm!
gives people the option to opt in

halcyon quarry May 27, 2024, 1:13 AM

#

Especially b/c we don't have to do too much

#

Now it's on them 😎

#

🥂

halcyon quarry May 27, 2024, 2:20 AM

#

Time to start testing the finished per-channel histories code...

terse folio May 27, 2024, 2:21 AM

#

hope that works!

halcyon quarry May 27, 2024, 2:25 AM

#

Didn't 😛 need to see what the dealio is...

#

I probably can't do this

terse folio May 27, 2024, 2:27 AM

#

i don't use setdefault too often, not sure

#

it's worth using an ipynb file (python notebook)
for live testing ideas like that.

Just those lines about setdefault and see if they work the way you think they do

halcyon quarry May 27, 2024, 2:32 AM

#

another thing I did not take into consideration, is that instruct mode doesn't save logs to any subdirectories

#

ok my problem was simple

#

I was initializing with placeholder lists in the session history variable. So these were just adding new lists to the existing placeholders.
Derp

terse folio May 27, 2024, 2:41 AM

#

interesting

#

also to save you some code:

#

just set the default channel to "internal", or 0 if not using per_channel

#

maybe you reference session_history in other parts of the code, and this could remove some possible bugs if an if statement is forgotten
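
The "default channel" suggestion means every caller goes through the same lookup and no `if per_channel` branch can be forgotten. A minimal sketch, assuming the usual `{'internal': [], 'visible': []}` history shape; the class and key names are illustrative only.

```python
DEFAULT_KEY = 0  # single shared history when per-channel mode is off

class SessionHistory:
    """Keep one {'internal': [], 'visible': []} history per key."""

    def __init__(self, per_channel: bool) -> None:
        self.per_channel = per_channel
        self._histories: dict[int, dict[str, list]] = {}

    def _key(self, channel_id: int) -> int:
        # With per-channel mode off, every channel maps to the same key.
        return channel_id if self.per_channel else DEFAULT_KEY

    def get(self, channel_id: int) -> dict[str, list]:
        key = self._key(channel_id)
        if key not in self._histories:  # created on demand
            self._histories[key] = {"internal": [], "visible": []}
        return self._histories[key]

    def reset(self, channel_id: int) -> None:
        """Clear only the history that this channel maps to."""
        self._histories.pop(self._key(channel_id), None)
```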

halcyon quarry May 27, 2024, 2:57 AM

#

progress

terse folio May 27, 2024, 3:02 AM

#

No worries about the channel ids
They have no personal information ^-^
All someone could extract from one is the date/time the channel was created.

#

Also awesome

#

Actually all the discord IDs are built like this, users, roles, channels, servers.

Only user ids can be resolved to a username.
This allows bots to ban users by ID, even if they have never joined your server.

#

this inspired me to make my own snowflake (unique id) thing 😸

halcyon quarry May 27, 2024, 3:07 AM

#

I like to exercise precautions like this

terse folio May 27, 2024, 3:08 AM

#

Absolutely!

#

perfectly reasonable

viral lagoon May 27, 2024, 3:13 AM

#

Fancy JS to convert snowflake (id) to a date: `new Date((Number(id) / 4194304) + 1420070400000)`

#

I've had this as a command on my bot, it's super handy
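
The same conversion in Python, for reference. Dividing by 4194304 is a right shift by 22 bits, and 1420070400000 is the Discord epoch (2015-01-01 UTC) in milliseconds; the function names are just for illustration.

```python
from datetime import datetime, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z

def snowflake_to_ms(snowflake: int) -> int:
    """The bits above the lowest 22 hold milliseconds since the Discord epoch."""
    return (int(snowflake) >> 22) + DISCORD_EPOCH_MS

def snowflake_to_datetime(snowflake: int) -> datetime:
    """Creation time of the user, channel, role or server the ID belongs to."""
    return datetime.fromtimestamp(snowflake_to_ms(snowflake) / 1000, tz=timezone.utc)
```

Python integers are arbitrary precision, so the shift is exact for 64-bit IDs.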

halcyon quarry May 27, 2024, 3:16 AM

#

New announce channel thing that sends embeds to all announce channels for character, imgmodel and llmmodel changes - is working

#

Per channel histories is still a bit bugged, I'm outta time though

terse folio May 27, 2024, 3:16 AM

#

yea that's a big change to make

halcyon quarry May 27, 2024, 3:16 AM

#

Hopefully have that all buttoned up tomorrow

#

it's close - probably one or two lines wrong 😛
It's saving a new log file on every message. Otherwise it seems to be working

terse folio May 27, 2024, 3:20 AM

#

hmm that is an odd sounding bug

halcyon quarry May 27, 2024, 3:21 AM

#

I wrote a lot of lines, just overlooked something. Head's spinning at the moment.
It is managing the history separately for both channels I'm testing in

terse folio May 27, 2024, 3:21 AM

#

that's good progress!

halcyon quarry May 27, 2024, 3:21 AM

#

loading the correct one on message, continuing along, saving to correct thing, etc

#

For chat mode, it is creating a new subdirectory for each channel

#

I may not need to do this... need to rethink

terse folio May 27, 2024, 3:23 AM

#

interesting, if tgwi can't read them, you may as well save them somewhere else
and in the structure you want

halcyon quarry May 27, 2024, 3:24 AM

#

Rather than trying to dissect the code I'm just seeing how I can tweak the data to continue using imported history managing functions

#

Just need to doublecheck my logic to see if I need to do these separate folders...
Anyway, calling it a night

#

pretty damn close

terse folio May 27, 2024, 3:25 AM

#

a weird hack would be to copy the bot profile for each channel (when sending messages)
so tgwi can import it.
But also name them like TMP_<name>_<channel>
so the user knows they can delete the characters if they want to change something in the base character file

halcyon quarry May 27, 2024, 3:27 AM

#

I'll figger it out tomorrow 😛 Pushing what I have in case you want to look at it

terse folio May 27, 2024, 3:28 AM

#

maybe, maybe, I feel like I'll be too distracted by my project haha

vestal python May 27, 2024, 2:18 PM

#

You use forge. Is this correct to use the branch instead? I'm about to get back to working on getting my bots and sdxl working properly, and then see about adding in some features from textgen I was thinking.

halcyon quarry May 27, 2024, 3:13 PM

#

vestal python You use forge. Is this correct to use the branch instead? I'm about to get back ...

dev2 is a few commits ahead of Forge main.
There's not likely to be much progress on the dev2 branch before Forge becomes irrelevant... which may happen if A1111 merges a number of open PRs that will make it more Forge-like in terms of performance

#

https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/15821

#

The ControlNet API will not work with this bot unless you are on dev2, or if you just use my branch 😛
https://github.com/altoiddealer/stable-diffusion-webui-forge-altoids/tree/restore_control_type

keen palm May 27, 2024, 3:16 PM

#

Per channel history hasn't been pushed yet, has it?

halcyon quarry May 27, 2024, 3:16 PM

#

It's almost done, hoping to push that tonight or at latest tomorrow

#

As a family man, difficult to say when I'll have time 😛

keen palm May 27, 2024, 3:17 PM

#

I know the feeling

#

I get approximately -20 minutes of free time daily

halcyon quarry May 27, 2024, 7:06 PM

#

I think I've got this per channel histories thing nailed down

#

If the feature is enabled, it will initialize the history variable / log file with all available text channels

#

If /reset_conversation is used in a channel, only that channel's history gets reset

#

This will be exclusive for the bot, cant just load these logs up into TGWUI. Although maybe could make a function to split it to separate files if needed

keen palm May 27, 2024, 7:13 PM

#

Do you know what happens if you try to load that log in TGWUI?

halcyon quarry May 27, 2024, 7:14 PM

#

Probably error

#

Expected file structure is a dictionary with two lists. This log will be a dictionary of dictionaries with 2 lists each

#

I need to do a bit more testing later on to ensure all is working completely well

#

It should correctly autoload history, as well as honor the keep/reset setting for character changes

#

I also have a feeling that ‘visible’ list has no purpose for the bot purposes… will likely keep it empty from now on

#

(For per-channel history mode)

#

May add a utility script to convert the multichannel log into multiple files, which could duplicate the internal lists for visible lists

#

Definitely will - chatgpt will tell me how to do a little bat file or something to drag n drop onto

keen palm May 27, 2024, 7:42 PM

#

Regenerate is still broken?

halcyon quarry May 27, 2024, 7:42 PM

#

One more thing… probably going to change it from “channel id” to “guild - channel name”

halcyon quarry May 27, 2024, 7:42 PM

#

keen palm Regenerate is still broken?

Well it works just not in any flexible manner

keen palm May 27, 2024, 7:43 PM

#

I get "IndexError: list index out of range" when I try to regenerate

halcyon quarry May 27, 2024, 7:43 PM

#

Did you write anything in current session?

keen palm May 27, 2024, 7:43 PM

#

Oh, you know what, that'd be it

halcyon quarry May 27, 2024, 7:44 PM

#

Requires at least 1 history entry to work

#

When I fix it there will be no error… it’s on the official todo list 🙂 pinned msgs

keen palm May 27, 2024, 7:44 PM

#

I'm also trying to regenerate a response that I specifically had it not save

#

Hwhoops!

halcyon quarry May 27, 2024, 7:49 PM

#

I may need to have 2 versions of Regenerate

#

One that uses history and one that uses recent messages (includes messages not in history)

keen palm May 28, 2024, 1:29 AM

#

Do you know of any way to suppress the LLM response to a prompt? I am just looking for a way to add new information without the AI responding and clogging up history.

halcyon quarry May 28, 2024, 1:29 AM

#

[[should_gen_text:False]]

#

erm

#

that probably won't work...

#

I need to run atm

#

So you basically want to incrementally prepare your prompt?

keen palm May 28, 2024, 1:37 AM

#

I already tried that one. Doesn't work.

#

Basically yeah

halcyon quarry May 28, 2024, 1:38 AM

#

You could do something like this for the moment…
[[state:{max_new_tokens:1}]]

keen palm May 28, 2024, 1:38 AM

#

I did use [[state:{max_new_toke...yeah

halcyon quarry May 28, 2024, 1:38 AM

#

I think 0 = error

keen palm May 28, 2024, 1:38 AM

#

Correct

halcyon quarry May 28, 2024, 1:40 AM

#

Adding to my TODO list before I forget:
user variable assignments
I plan on adding a feature for users to be able to assign variables on demand for whatever purposes.
This would solve it

#

Idk the syntax yet but you could just set each chunk as a variable, then when you're ready send your prompt prefixed with the variables

#

May have syntax to add to an existing variable

#

I have it now only creating the channel dictionaries in the history on-demand.
It is now also using the server + channel name as the dictionary key

#

Also not collecting Visible for multi-channel mode

#

Also suffixing filenames with multiple-history

#

Need to do more bug testing... probably push this to Main tomorrow.

If anyone wants to help with debugging, please try this branch (or just grab bot.py and modules/database.py - and add new setting to config.yaml)
https://github.com/altoiddealer/ad_discordbot/tree/per_chan_histories

terse folio May 28, 2024, 2:06 AM

#

the reason you originally collected visible was to get the audio file url from the html tag.

I'm not sure if theres another way to get it when running the generate text pipeline?

halcyon quarry May 28, 2024, 2:06 AM

#

It still gets it - it just doesn’t hang onto it now

terse folio May 28, 2024, 2:06 AM

#

I see, yea good point, it's only needed the one time

halcyon quarry May 28, 2024, 2:07 AM

#

I think the only reason to store it, is to be able to load the log file in TGWUI

#

But that won’t be possible in this logging format.

#

Well it will be possible after post-processing log file

#

@terse folio do you know if server/channel names can have any wacky values that could mess up this dict structure?

terse folio May 28, 2024, 2:15 AM

#

halcyon quarry @terse folio do you know if server/channel names can have any wacky va...

anything encoded in json is safely escaped.
I think you can use special characters in guild names.
Channel names too I think.
they might have been more limited in the past, but we can use emojis and all in them.

Anyway, I would recommend using the guild id/channel ids in the dict structure because if someone changes the name of the channel then your data is gone.

#

Also, json doesn't like int keys iirc.
so guild/channel ids should be converted to str before querying the json data.

halcyon quarry May 28, 2024, 2:18 AM

#

The way I have it set up, when a user sends a message it uses the guild name and channel name from the interaction as the positional arg to “get history”. If the key doesn’t exist it returns the 2 empty lists

#

But yes I see your point - the history would suddenly be reset after channel name change

#

Ok what I’ll do is put everything under the channel.id key

#

I’ll just add 2 dict keys under that

#

Server name and channel name
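
A minimal sketch of the layout being discussed, assuming one entry per channel keyed by the channel ID as a string, with the guild and channel names stored only for reference. The function names and the `guild` / `channel` / `history` keys are hypothetical and are not taken from the bot's code; `internal` and `visible` are the two lists mentioned above.

```python
import json

def get_history(histories, channel_id):
    """Return the stored history for a channel, or two empty lists if none exists."""
    entry = histories.get(str(channel_id))  # JSON object keys are always strings
    if entry is None:
        return {"internal": [], "visible": []}
    return entry["history"]

def save_exchange(histories, channel_id, guild_name, channel_name, user_text, bot_text):
    """Create the channel entry on demand and append one user/bot exchange."""
    entry = histories.setdefault(str(channel_id), {
        "guild": guild_name,
        "channel": channel_name,
        "history": {"internal": [], "visible": []},
    })
    # Names are refreshed on every write, so renaming a channel never orphans its history.
    entry["guild"], entry["channel"] = guild_name, channel_name
    entry["history"]["internal"].append([user_text, bot_text])
    return entry

histories = {}
save_exchange(histories, 1234567890, "My Server", "general", "hi", "hello!")
restored = json.loads(json.dumps(histories))  # simulate writing and reloading the log
print(get_history(restored, 1234567890))
```

Converting the ID with `str()` on both read and write matters because an integer key written through `json.dumps` comes back as a string after reloading.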

terse folio May 28, 2024, 2:22 AM

#

you dont really need the guild Id

channel id should be unique .
The message fetch function uses the channel id, and message id for example

halcyon quarry May 28, 2024, 2:23 AM

#

Correct- I just want to include guild name / channel name for easy reference

terse folio May 28, 2024, 2:23 AM

#

for the folders?

halcyon quarry May 28, 2024, 2:23 AM

#

for logging

terse folio May 28, 2024, 2:24 AM

#

okay, yea

#

because if writing to a folder you need to filter the characters, as some are invalid in a file path.

#

<>:"/\|?* are the invalid ones on Windows, I think

halcyon quarry May 28, 2024, 2:25 AM

#

Good points
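
A small sketch of the filtering step mentioned above, assuming guild and channel names are only used to build file or folder names. `safe_filename` is a hypothetical helper, not part of the bot. It does not handle reserved Windows device names such as `CON` or `NUL`.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def safe_filename(name, replacement="_"):
    """Replace characters that are invalid in a Windows file name."""
    cleaned = _INVALID_CHARS.sub(replacement, name)
    # Windows also rejects names that end in a dot or a space.
    cleaned = cleaned.rstrip(". ")
    return cleaned or "unnamed"

print(safe_filename("My Server: #general/chat?"))
```

Emojis and other non-ASCII characters are left alone here, since they are valid in file names on current Windows, macOS, and Linux file systems.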

#

I’ll be sure to mention it to my pal ChatGPT when I request the utility 😛

#

I don’t speak with him much these days I know most of what I’m doing now

visual dagger May 28, 2024, 1:22 PM

#

terse folio based on this list of patterns in the database.

about this

#

does it go and collect all mentioned keywords' facts?

#

like this

the only trigger is this word 'potatos'
the input/user msg is I don't like potatos btw

the backend work collects those

{potatos: John don't like potatos}

which will trigger 'John' fact because 'John' is mentioned here

{John: John is 32 years old and works as a teacher}

{work: John is a teacher at LLM University}

{LLM University: LLM University is a university specialised in everything AI/ML}

So a user input I don't like potatos btw

caused this many facts to get retrieved, even though the following facts' keywords aren't mentioned; they got triggered because they are mentioned in other facts.

+ John don't like potatos
+ John is 32 years old and works as a teacher
+ John is a teacher at LLM University
+ LLM University is a university specialised in everything AI/ML

halcyon quarry May 28, 2024, 1:31 PM

#

From what you are saying, you can already do something with the tags system like

trigger: 'John'
format_prompt: |
  + John is 32 years old and works as a teacher\n
  + John is a teacher at LLM University\n
  + LLM University is a university specialised in everything AI/ML\n
  {prompt}

trigger: 'potatos'
format_prompt: |
  + John don't like potatos\n
  {prompt}

#

When using the format_prompt tag, your actual prompt gets inserted anywhere you include {prompt} in the value

visual dagger May 28, 2024, 1:32 PM

#

halcyon quarry From what you are saying, you can already do something with the tags system like...

it goes on a never ending spiral of collecting facts?

#

but what if it got too much, like an overload of info

#

tokens inflation

halcyon quarry May 28, 2024, 1:34 PM

#

Maybe I don't understand.

What I'm suggesting is that you put the facts into dict_tags.yaml in this format

visual dagger May 28, 2024, 1:34 PM

#

halcyon quarry From what you are saying, you can already do something with the tags system like...

but in this example you already predefined them

#

they don't get triggered automatically

halcyon quarry May 28, 2024, 1:34 PM

#

Alright, so you are talking about capturing facts automatically - not so much triggering them automatically

visual dagger May 28, 2024, 1:35 PM

#

uhm.. my wording is bad I guess

halcyon quarry May 28, 2024, 1:35 PM

#

As it stands, the bot is great at triggering things 😛

Capturing things - can still be improved

visual dagger May 28, 2024, 1:35 PM

#

but I'm debating if that's a good idea or not

#

since if I don't stop it it might literally collect all the info

#

assuming the db is huge

#

ALL will be triggered

halcyon quarry May 28, 2024, 1:37 PM

#

Well, if whatever the feature is uses the Tags system - I have a trumps tag that allows a matched trigger to be cancelled by another matched trigger

visual dagger May 28, 2024, 1:37 PM

#

maybe setting a max 'depth' can help

#

like a max 5 spiralling facts

halcyon quarry May 28, 2024, 1:38 PM

#

- trigger: some trigger

- trigger: another trigger
  trumps: some trigger

text: Let's talk about some trigger and another trigger

Only another trigger has effect
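
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the bot's implementation: it assumes plain substring matching and a `trumps` value that names exactly one other trigger, and `apply_trumps` is a hypothetical name.

```python
def apply_trumps(tags, text):
    """Return the matched tags, dropping any whose trigger is trumped by another matched tag."""
    lowered = text.lower()
    matched = [tag for tag in tags if tag["trigger"].lower() in lowered]
    # Triggers cancelled by some other tag that also matched.
    trumped = {tag["trumps"].lower() for tag in matched if "trumps" in tag}
    return [tag for tag in matched if tag["trigger"].lower() not in trumped]

tags = [
    {"trigger": "some trigger"},
    {"trigger": "another trigger", "trumps": "some trigger"},
]
result = apply_trumps(tags, "Let's talk about some trigger and another trigger")
print([tag["trigger"] for tag in result])
```

A tag only cancels another one when both match, so a message containing just "some trigger" still takes effect.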

visual dagger May 28, 2024, 1:43 PM

#

let me explain what I'm talking about, I think I made things confusing

the user msg is only this hey, Alex is calling

this user msg will trigger (by trigger I mean will cause a fact to be retrieved)

(keyword: fact pairs)

Alex: Alex and John are childhood friends

^ this fact have the keyword John

which will cause this to be retrieved

John: John is a 32 years old teacher and have one kid called Charlie

^ this fact have the keyword Charlie

which will cause this to be retrieved

Charlie:Charlie is the son of John, Charlie is a 2 year old boy, Charlie have a sister called Martha

and you know the rest, Martha info will get retrieved and it goes on and on

halcyon quarry May 28, 2024, 1:44 PM

#

This is what you want? Cascading effect?

visual dagger May 28, 2024, 1:44 PM

#

halcyon quarry This is what you want? Cascading effect?

not exactly

#

still debating if it is a good idea or not

#

bcz it will cause an inflation of tokens

#

so in theory it might even retrieve the whole db

halcyon quarry May 28, 2024, 1:45 PM

#

There could be another user setting like max_depth so like max_depth: 5 it could only look for 4 additional facts before stopping

visual dagger May 28, 2024, 1:46 PM

#

and parallel facts, lol

#

hey, Alex is calling, I will sit with Charlie while you answer

#

you see, we got multiple paths here, Alex is mentioned and also Charlie

#

so yeah

halcyon quarry May 28, 2024, 1:48 PM

#

facts could be collected in a set(), which omits duplicates

visual dagger May 28, 2024, 1:48 PM

#

that's an idea to remove repetitive facts

halcyon quarry May 28, 2024, 1:48 PM

#

before being appended to the prompt

visual dagger May 28, 2024, 1:48 PM

#

but you might end up with totally different topics triggered

#

which means you won't have duplicates

#

it's possible to summarise the similar facts I guess

#

instead of John is a teacher and John works at LLM university

that can be summarised to John is a teacher at LLM university

#

summarising or collecting repetitive/too similar facts, grouping them together

#

maybe there has to be a mechanism to score the retrieved facts and get rid of the less relevant ones

and keeping like the top 5 most relevant, highest quality ones or something

halcyon quarry May 28, 2024, 1:54 PM

#

Quite frankly, this is all above me, so let's wait for Reality - and I wouldn't count too hard on them implementing such a feature

visual dagger May 28, 2024, 1:54 PM

#

you don't have to

halcyon quarry May 28, 2024, 1:54 PM

#

We're in brainstorming phase 🙂

visual dagger May 28, 2024, 1:54 PM

#

I'm just

#

not sure about it

#

if it is a good idea or not

#

bcz as I said I might end up retrieving the whole db lol

#

and this excessive retrieval might cause excessive repetition

#

bcz the llm sees the same facts again and again

#

and the llm sees a lot of them, so... they have more weight than the user prompt

#

it will be like 70% attention to facts and 30% attention to actual user prompt

halcyon quarry May 28, 2024, 3:01 PM

#

Just finished up revising all the history code

terse folio May 28, 2024, 3:01 PM

#

visual dagger like this the only trigger is this word 'potatos' the input/user msg is `I don'...

I like this idea where information is called on recursively.
That's kind of what I was talking about with the LLM generating text, and grabbing information based on triggers in its own text, then updating its answers!

terse folio May 28, 2024, 3:03 PM

#

halcyon quarry From what you are saying, you can already do something with the tags system like...

Yes, the tags system should be able to do retrieval like that!
What I was offering was to work on a library for matching more complex patterns that could be used to extract facts to be used in the tags system.
(In the end this would be its own github repo you could install into the bot as a library so it can always be up to date with features ^^)

terse folio May 28, 2024, 3:06 PM

#

visual dagger let me explain what I'm talking about, I think I made things confusing the user m...

John: Hey, Alex is calling
Matching John + Alex as part of the name: message pair, interesting ^^

halcyon quarry May 28, 2024, 3:06 PM

#

New logging format for per-channel histories

terse folio May 28, 2024, 3:07 PM

#

halcyon quarry This is what you want? Cascading effect?

You'd set a recursion limit on it ^-^
Like pass a int variable down the function and subtract one each time.

halcyon quarry May 28, 2024, 3:07 PM

#

No, not me 😛

terse folio May 28, 2024, 3:07 PM

#

halcyon quarry There could be another user setting like `max_depth` so like `max_depth: 5` it c...

ah yup, you got that :)

terse folio May 28, 2024, 3:07 PM

#

halcyon quarry New logging format for per-channel histories

nice!

visual dagger May 28, 2024, 3:08 PM

#

terse folio I like this idea where information is called on recursively. That's kind of what...

yeah thank you and @halcyon quarry for that, I started playing with this idea of words triggering facts/memory

halcyon quarry May 28, 2024, 3:08 PM

#

Actually, just rearranged it so that the guild and channel are assigned before the history list

#

so, slight difference 😛

visual dagger May 28, 2024, 3:08 PM

#

visual dagger yeah thank you and @halcyon quarry for that, I started playing with this i...

and it's useful but kinda hard to get right

terse folio May 28, 2024, 3:09 PM

#

visual dagger yeah thank you and @halcyon quarry for that, I started playing with this i...

This is something I'm pursuing too, running into the same problems while brainstorming, "what is a good point to stop, it might miss crucial information"

Ultimately, the facts would be trained into the LLM, so it knows such concepts are out there somewhere.
Because RAG could always leave something out by accident and the LLM would never know.
But fine-tuning is costly for such small things.

terse folio May 28, 2024, 3:11 PM

#

halcyon quarry Actually, just rearranged it so that the guild and channel are assigned before t...

Dicts historically had no guaranteed order, it's just a table of keys:values.

Back in python3.5 if you put items in a dict they would not stay in the same location 😸
Free to move around as they wanted.

But yes, more recent versions (3.7+) keep them in the order you inserted the data.

So don't expect dicts to always come out exactly the same way you put them in when translating them to other formats, such as exporting a json/yaml file.
(but from my experience they usually respect order when importing to python)
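
A quick check of the ordering point, assuming Python 3.7 or later, where dicts are guaranteed to keep insertion order. The keys in `entry` are made up for illustration. The standard `json` module writes keys in the dict's order and reads them back in file order, unless `sort_keys=True` is passed.

```python
import json

# Guild and channel are assigned before the history list, as in the rearranged log.
entry = {"guild": "My Server", "channel": "general", "history": []}

round_tripped = json.loads(json.dumps(entry))
print(list(round_tripped))            # same key order as `entry`

alphabetical = json.loads(json.dumps(entry, sort_keys=True))
print(list(alphabetical))             # keys sorted by name instead
```

Other tools that read the same file are free to ignore the order, so nothing should depend on it.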

halcyon quarry May 28, 2024, 3:12 PM

#

nbd 😛

terse folio May 28, 2024, 3:13 PM

#

Yup, just little bugs I ran into in the past ^^

visual dagger May 28, 2024, 3:13 PM

#

terse folio `John: Hey, Alex is calling` Matching John + Alex as part of the name: message p...

that msg was supposed to be Mark talking to John about Alex like this

Mark: hey, Alex is calling, I will baby sit Charlie until you finish *hands him the phone*

but you gave me new ideas now, lol, including the name of the speaker as a trigger too, maybe that will be helpful for NPCs? or a group chat of characters? that will be interesting

halcyon quarry May 28, 2024, 3:14 PM

#

Hopefully TGWUI load_latest_history() isn't too complex to reconstruct / patch because it is not happy when it finds this new structure

terse folio May 28, 2024, 3:15 PM

#

visual dagger that msg was supposed to be Mark talking to John about Alex like this `Mark: he...

Pattern matching doesn't know who Mark is talking to,
How would it know to query John+Alex?

terse folio May 28, 2024, 3:16 PM

#

halcyon quarry Hopefully TGWUI `load_latest_history()` isn't too complex to reconstruct / patch...

leaving lots of comments to help walk others through the workflow is great!

visual dagger May 28, 2024, 3:16 PM

#

terse folio This is something I'm pursuing too, running into the same problems while brainst...

it's tricky to get it right, like you might retrieve facts that are misleading to the llm, like it will confuse it

terse folio May 28, 2024, 3:16 PM

#

Yea, this is where we'd need an llm!

#

do some preprocessing on the text.
like Convert the text to use 3rd person pronouns instead of 1st, 2nd person (eg: you, i)

#

And then the complex ways that sentences work sometimes

#

like "Get the groceries" has an implied "you" at the start.

#

"Mark, go get the groceries"

#

Then tags could be run on these processed texts

visual dagger May 28, 2024, 3:21 PM

#

terse folio Pattern matching doesn't know who Mark is talking to, How would it know to quer...

it's recursive as you said

Alex will be set as a triggering word, that will trigger this fact to be retrieved Alex and John are childhood friends

# Pairs in this format
TriggerWord: Fact

it's a dict of this ^

so because John is in this fact, another fact will be triggered which is John works as a teacher and has a son called Charlie

and this fact will trigger the retrieval of another fact which is Charlie is a 2 year old

#

not because John is in this fact, no, but because John is set as a triggering word

{John: John is a teacher,

Alex: Alex and John are childhood friends

Charlie: Charlie is a 2 year old}

terse folio May 28, 2024, 3:23 PM

#

but why should Alex trigger Alex+John.
what if the conversation is only between Alex and Mark?

#

it would bring up John

visual dagger May 28, 2024, 3:23 PM

#

so the fact can actually not mention anything about the triggerword

#

it's a trigger to fact relationship

#

like a var to value

#

the value can be anything

#

and can be even any random stuff

#

but when the var is called the value will get printed

#

like that

terse folio May 28, 2024, 3:25 PM

#

I see, but there would be multiple entries for the relationships
Alex: Alex+John, Alex+Mark, Alex+Charlie..

visual dagger May 28, 2024, 3:26 PM

#

terse folio but why should Alex trigger Alex+John. what if the conversation is only between ...

it's a never ending loop of triggering facts down deeper and deeper which might lead to retrieving the whole db

that's why I'm still thinking if it's a good idea or not?

terse folio May 28, 2024, 3:27 PM

#

you can detect loops by checking if a fact has already been visited
Just add them to a set and check if the current recursion item is in that set.

And cut it off there because it would do a circular import of the data again

#

Then as AltoidDealer mentioned, using a set to keep only the unique items
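
The ideas in this thread (cascading triggers, a depth limit, and a visited set) can be sketched together as below. This is only an illustration under simple assumptions: triggers are single words matched case-insensitively, facts live in a plain dict, and `collect_facts`, `FACTS`, and this reading of `max_depth` (number of retrieval levels) are all hypothetical.

```python
import re

FACTS = {
    "Alex": "Alex and John are childhood friends",
    "John": "John is a 32 year old teacher and has one kid called Charlie",
    "Charlie": "Charlie is the son of John and has a sister called Martha",
    "Martha": "Martha is the daughter of John",
}

def collect_facts(text, facts, max_depth=5):
    """Retrieve facts triggered by the text, then by those facts, for up to max_depth levels."""
    visited = set()     # triggers already expanded, which prevents loops and duplicates
    collected = []      # facts in the order they were retrieved
    frontier = [text]
    for _ in range(max_depth):
        next_frontier = []
        for chunk in frontier:
            words = set(re.findall(r"\w+", chunk.lower()))
            for trigger, fact in facts.items():
                if trigger not in visited and trigger.lower() in words:
                    visited.add(trigger)
                    collected.append(fact)
                    next_frontier.append(fact)
        if not next_frontier:
            break       # nothing new was triggered, so stop early
        frontier = next_frontier
    return collected

print(collect_facts("hey, Alex is calling", FACTS, max_depth=2))
```

With `max_depth=2` only the Alex and John facts come back. Raising the limit lets the cascade continue to Charlie and Martha, which is where the token growth discussed above comes from.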

visual dagger May 28, 2024, 3:27 PM

#

terse folio I see, but there would be multiple entries for the relationships Alex: Alex+Joh...

the one I explained cares only about one single word trigger

but you are right, maybe having multiple words / conditions is the way to go

terse folio May 28, 2024, 3:28 PM

#

visual dagger the one I explained cares only about one single word trigger but you are right,...

It's a start to use one word, just make sure to keep your code open to changes, thinking about the future :)

visual dagger May 28, 2024, 3:30 PM

#

terse folio you can detect loops by checking if a fact has already been visited Just add th...

not that it will retrieve duplicated facts, but this method that I explained might retrieve the whole db bcz every word triggers a fact and every fact might have a trigger word and it goes on and on

#

you can make it stop by doing a max_depth concept

#

but still idk how that will work, it might ruin llm responses

#

confusing it

terse folio May 28, 2024, 3:34 PM

#

visual dagger not that it will retrieve duplicated facts, but this method that I explained mig...

A lot of these facts would be hand picked,
If you're using auto collection, you could have the auto collected facts written to a file, and you delete the ones you dont want.
Then tell it to import.

Maybe somehow create a blacklist from creating the same garbage facts again "the sun is hot"

visual dagger May 28, 2024, 3:35 PM

#

terse folio It's a start to use one word, just make sure to keep your code open to changes, ...

there is a problem, one word trigger can be needed sometimes, like assuming we did only this

Food+Potatos: John hates potatos (if both those words are present, this fact will be retrieved)

but what if down the line in the story, this happened

John looks around and he finds out someone selling potatos

bcz we limited the fact of potatos to two words and not one, this fact won't be mentioned here

#

we will wait for someone to mention food before this fact is retrieved

#

and by that time John might have already eaten the potatos, poor John

terse folio May 28, 2024, 3:36 PM

#

create multiple!

#

create a hierarchy

#

"food+potatoes" will run before "potatoes"

#

The pattern matching thing works the same way.
It first checks if an exact match for your text exists, then it starts testing wildcards until it finds a match.
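
One way to picture the hierarchy, as a rough sketch and not the pattern-matching library itself: each rule lists the words that must all be present, and rules that require more words are ranked ahead of more general ones. `match_triggers` and the rule data are hypothetical.

```python
import re

def match_triggers(text, rules):
    """Return facts whose required words all appear in the text, most specific rule first."""
    words = set(re.findall(r"\w+", text.lower()))
    hits = [(keys, fact) for keys, fact in rules.items()
            if all(key.lower() in words for key in keys)]
    # Rules that require more words count as more specific and are listed first.
    hits.sort(key=lambda hit: len(hit[0]), reverse=True)
    return [fact for _, fact in hits]

rules = {
    ("potatoes",): "The town is known for potatoes",
    ("john", "potatoes"): "John hates potatoes",
}
print(match_triggers("John looks around and finds someone selling potatoes", rules))
print(match_triggers("someone is selling potatoes", rules))
```

A caller could keep only the first result for a strict "most specific wins" policy, or keep the whole list when both facts are useful.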

visual dagger May 28, 2024, 3:38 PM

#

terse folio "food+potatoes" will run before "potatoes"

but what is the reason?

#

it will be the same outcome

#

so just do potatos only

terse folio May 28, 2024, 3:38 PM

#

you could have different information mentioned

#

John+potato not the same as potato

visual dagger May 28, 2024, 3:39 PM

#

visual dagger it will be the same outcome

bcz it's an or condition

terse folio May 28, 2024, 3:39 PM

#

i'm not sure why "food+potato" was an example, what that was

visual dagger May 28, 2024, 3:39 PM

#

food is not a topic

#

potatos also not a topic

#

they are both literal words

#

strictly

keen palm May 28, 2024, 3:40 PM

#

Psst: potatoes

terse folio May 28, 2024, 3:40 PM

#

O:

visual dagger May 28, 2024, 3:40 PM

#

o:

terse folio May 28, 2024, 3:40 PM

#

visual dagger they are both literal words

I mean, what kind of sentence do you expect to match that contains both "food" and "potatoes"?

#

"John is getting potatos"

visual dagger May 28, 2024, 3:40 PM

#

I run out of examples

#

: (

#

yeah food + potatos seems weird but can't think of any other example

terse folio May 28, 2024, 3:41 PM

#

I see, In that case need to create a file of good examples that you think would break/cause bugs!
Use those to finetune your code!

visual dagger May 28, 2024, 3:44 PM

#

I'm trying to generalise as much as possible, making things autonomous and general means working for any scenario

#

without me fixing llm mistakes every now and then

terse folio May 28, 2024, 3:45 PM

#

mhm, what I mean is while you figure that out, you should have test cases that you can quickly run to see how the logic works out

visual dagger May 28, 2024, 3:45 PM

#

but it seems like the system should be very robust and self sufficient

#

fixing its own bugs somehow and updating false info

#

and facts etc..

terse folio May 28, 2024, 3:46 PM

#

Here's an example

John+Potato(s) - John hates potatoes
Potato(s) - The town you all live in is actually known for potatoes!

These are 2 different pieces of information that can be triggered in different contexts

visual dagger May 28, 2024, 3:46 PM

#

terse folio mhm, what I mean is while you figure that out, you should have test cases that y...

like benchmarking?

terse folio May 28, 2024, 3:46 PM

#

benchmarking your code progress to make sure you're bug free 😸

#

like that if you change something, it doesn't break previous tests

terse folio May 28, 2024, 3:47 PM

#

terse folio Here's an example John+Potato(s) - John hates potatoes Potato(s) - The town you...

in this example it wouldn't matter to pick one or the other,
it would be fine to display both tags

visual dagger May 28, 2024, 3:47 PM

#

how do you deal with the llm forcing including all facts in one response?

#

instead of being more natural

terse folio May 28, 2024, 3:48 PM

#

I usually think about how my mind would handle it.

#

Maybe using embedding to match how related the tag contents are to the conversation

#

if it goes too far off track into unrelated territory it would return lower scores

visual dagger May 28, 2024, 3:49 PM

#

terse folio Maybe using embedding to match how related the tag contents are to the conversat...

relation, relative, scoring, yeah but like it's gotta be more complex than that

#

like there gotta be a strict logic

#

that handles or decides what is relevant or not

#

based on A B C D factors

#

taking multiple stuff into consideration

#

like how a human being makes a decision.. yk?

you rely on your understanding of space, time, self, current situation, if there are other entities (people) around you etc

#

you process all those different categories of things before making a decision

#

of let's say screaming

terse folio May 28, 2024, 3:52 PM

#

visual dagger like how a human being makes a decision.. yk? you rely on your understanding of sp...

well this depends more on your environment you're building.
If it's just a discord chat, no you don't have access to know what direction the speakers are facing to understand an implied "you" when you tell the person you are facing something.

#

time and space, yes I agree,
But we are more limited depending on where we talk

visual dagger May 28, 2024, 3:53 PM

#

it's more of a game that I am working on

#

rather than only chat

terse folio May 28, 2024, 3:53 PM

#

if you're doing roleplay with moving around like your game, then yes

#

in that case you could have the tags look in different locations

#

some in chat, some based on location, some based on time

#

like if your current location matches "the town" it can mention it's known for potatoes

visual dagger May 28, 2024, 3:54 PM

#

do you have any idea on how to craft such a scoring system?

#

that takes many factors into consideration

#

or maybe just use another llm?

#

it feels like I should have a phd in psychology sometimes lol😭

#

or study the humans

terse folio May 28, 2024, 3:57 PM

#

It's not something I tried.
But I was going for a similar idea where I planned to use Embedding models.

I think the idea I had was retrieval of information, then highlighting the sentences that are most related (that's why I created the zoom levels thing)

visual dagger May 28, 2024, 3:57 PM

#

who is up for a scientific experiment 😐

visual dagger May 28, 2024, 3:58 PM

#

terse folio It's not something I tried. But I was going for a similar idea where I planned t...

what decides the most relevant?

terse folio May 28, 2024, 4:02 PM

#

visual dagger it feels like I should have a phd in psychology sometimes lol😭

yup yup I know.

Using negative words such as "not" "do not generate this"
Gives the LLM a reason to try doing exactly what you don't want.
(At least smaller ones)

Similarly if you tell a person to not imagine a chicken, they have to do so to understand what you mean.

Maybe we should do things in multiple steps?
Like how the person first calculates the meaning of what you said, then decides to act on it.
I have no idea what that would look like though.

Haha, it reminds me of people saying "be careful with that", and then you are suddenly more likely to slip up and fall with the item.

LLM agents seem to do that, working on a problem in multiple steps.

keen palm May 28, 2024, 4:04 PM

#

https://tenor.com/view/matrix-oracle-vase-premonition-speakintoexistence-gif-17768235848752040284

visual dagger May 28, 2024, 4:06 PM

#

terse folio yup yup I know. Using negative words such as "not" "do not generate this" Give...

"be careful with that"

"careful? what? looking away for a second then oops falls"

"i told you"

"don't tell me next time, you distracted me : /"

#

dc lagging

#

i got a half baked idea, scoring facts based on priorities

#

but priorities of who? the dev? the characters?

#

idk

terse folio May 28, 2024, 4:08 PM

#

visual dagger what decides the most relevant?

When trying to do something completely automated, don't expect perfect results every time.
Anyway, the idea was to search based on the current conversation, this would give you the topic embedding.

From there search the database subset on the same conversation and find the topic item that best answers the question/adds context/whatever.

Once this context is added, generate the reply.
Embed that reply, then compare to the sentences of the context and try to find a quote you could cite.

If the score is too low, we could assume the bot hallucinated an answer.
if there's a good match, it can be confident the answer has truth
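
The citation check in the last step could look roughly like this. To stay self-contained, the sketch swaps in a toy bag-of-words vector for a real embedding model, so the scores only reflect word overlap. `embed`, `cosine`, `best_citation`, and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def best_citation(reply, context_sentences, threshold=0.3):
    """Return (sentence, score) for the closest context sentence, or (None, score) if too weak."""
    reply_vec = embed(reply)
    scored = [(cosine(reply_vec, embed(s)), s) for s in context_sentences]
    score, sentence = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return (sentence, score) if score >= threshold else (None, score)

context = ["John is a teacher at LLM University", "Charlie is a 2 year old boy"]
print(best_citation("John works as a teacher at the university", context))
print(best_citation("The weather is nice today", context))
```

A reply that cannot be tied to any context sentence comes back with `None`, which is the signal that the answer may have been hallucinated.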

terse folio May 28, 2024, 4:08 PM

#

visual dagger i got a half baked idea, scoring facts based on priorities

hmmmm maybe one should make a machine learning model to finetune these priorities :P

#

that's a bit beyond me right now, but i'd look into Keras, they had some simple examples, like classification

visual dagger May 28, 2024, 4:09 PM

#

terse folio When trying to do something completely automated, don't expect perfect results e...

I understand but this way you are counting on only current conv

terse folio May 28, 2024, 4:10 PM

#

visual dagger I understand but this way you are counting on only current conv

the more conv you count the less accurate your search would be because it's averaging more and more data from the past that might not be relevant anymore

#

for embedding

#

tag matching uses if statements, so it's more accurate in making sure the tag exists

visual dagger May 28, 2024, 4:11 PM

#

what is important? and how do you know that?

#

like how do you extract that from the chat history

#

conclusions

terse folio May 28, 2024, 4:11 PM

#

I don't!
would have to build a test program and figure that out

visual dagger May 28, 2024, 4:12 PM

#

and you might need to go short term vs long term, like this topic is important currently but might not be in the following weeks

#

topic let's say a temporary sickness

#

after some days-weeks it will go down on the list of priorities

#

until it is completely removed

terse folio May 28, 2024, 4:14 PM

#

Lets say you have a home assistant.
And you're having a conversation with someone about what you want to eat.

During this conversation you raise your voice slightly and say "search for restaurants serving X"

We need some logic for the home assistant to figure out you're talking to it, and not the other person.

Most solve this with a trigger word "okay google"
Others might listen to a change in tone of your voice (like it being raised to project across the house)

How do we do this in text?
for the bot to understand you want to search the web while talking.
That's one of my goals

visual dagger May 28, 2024, 4:15 PM

#

hmm umm...

#

text only?

terse folio May 28, 2024, 4:15 PM

#

visual dagger and you might need to go short term vs long term, like this topic is important cu...

mhmm, I'm thinking of including a timestamp in all embedded items where they can be sorted after query

terse folio May 28, 2024, 4:15 PM

#

visual dagger text only?

Like we are here on discord

#

if i say, search the web about it.
You would understand i'm talking about "search the web about remembering things that is sorted by time"

visual dagger May 28, 2024, 4:16 PM

#

I was thinking of the assistant having a camera so when you do point to it with your finger and say search for xyz, it's a clear sign you are talking to your oc

#

pc/assistant

#

https://tenor.com/view/thats-it-yes-thats-it-that-right-there-omg-that-thats-what-i-mean-gif-17579879

terse folio May 28, 2024, 4:16 PM

#

visual dagger I was thinking of the assistant having a camera so when you do point to it with ...

that's another interesting solution,
home assistant was just an example because they have a similar problem of trying to detect when it's being spoken to

#

😸

visual dagger May 28, 2024, 4:17 PM

#

visual dagger https://tenor.com/view/thats-it-yes-thats-it-that-right-there-omg-that-thats-wha...

yes yes that one, No No go back go back

#

lol

visual dagger May 28, 2024, 4:18 PM

#

terse folio that's another interesting solution, home assistant was just an example because ...

I think I can do that with openpose and a good gpu that I don't have : /

visual dagger May 28, 2024, 4:18 PM

#

terse folio mhmm, I'm thinking of including a timestamp in all embedded items where they can...

but how that relates to "search"?

terse folio May 28, 2024, 4:19 PM

#

if you do a search online, you might value more recent information over decades old

#

because the tools might have changed if researching a code problem

#

BUT, there is still plenty good information over 12 years old that has helped me tons!

#

so we can't just set a cut off date on search results

#

This is related to the "feeling sick" temporary memory

#

how do we decide what should be kept longterm, and what short

visual dagger May 28, 2024, 4:20 PM

#

terse folio if i say, search the web about it. You would understand i'm talking about "searc...

the assistant is lacking context about you?

Sam Altman has been saying lately (paraphrasing) that yes, the models will get better, but more importantly they will have more context about you

(me saying this) which gives the illusion of intelligence, but they just know more and understand you more

#

and memory at play here

terse folio May 28, 2024, 4:22 PM

#

Yea, that's right,
that's kind of what I was saying with fine tuning the model.

Baking that knowledge into the model.
The reason I mentioned fine tuning instead of dumping all the context was because consumer gpus have limited context

visual dagger May 28, 2024, 4:22 PM

#

terse folio how do we decide what should be kept longterm, and what short

exactly, how?

#

how do you decide?

#

and if you assume all info is important then how do you store it in a nice way so you can retrieve it easily later?

terse folio May 28, 2024, 4:23 PM

#

visual dagger exactly, how?

I don't know yet, maybe I will do some sort of weighting?

Like the further back in time a result is, the less it is worth.
But also consider the similarity scores.

Score(Score bias) * time(Time bias) = final score

and we fine tune these biases

So maybe, some item with really good score from 10 years ago would have enough weight to show up as a result.

#

I think that would work

#

because if there are no new results about that topic that are good matches, the old one would be the next best one to show
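
A minimal sketch of the weighting idea above, assuming an exponential time decay and treating the two biases as exponents. The function name, the one-year half-life and the sample items are all made up for illustration; nothing here comes from existing bot code.

```python
from datetime import datetime, timedelta


def final_score(similarity, created_at, now,
                score_bias=1.0, time_bias=1.0, half_life_days=365.0):
    """Combine similarity and recency as similarity**score_bias * recency**time_bias."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    # recency is 1.0 for a brand new item and halves every `half_life_days`
    recency = 0.5 ** (age_days / half_life_days)
    return (similarity ** score_bias) * (recency ** time_bias)


def rank(results, now, **biases):
    """Sort results (dicts with 'similarity' and 'created_at') best first."""
    return sorted(
        results,
        key=lambda r: final_score(r["similarity"], r["created_at"], now, **biases),
        reverse=True,
    )


now = datetime(2024, 5, 28)
results = [
    {"id": "old-great-match", "similarity": 0.95, "created_at": now - timedelta(days=3650)},
    {"id": "new-weak-match", "similarity": 0.40, "created_at": now - timedelta(days=7)},
]

# A small time bias lets a strong 10 year old match still win;
# a large time bias makes recency dominate.
gentle = rank(results, now, time_bias=0.1)
strict = rank(results, now, time_bias=1.0)
print(gentle[0]["id"], strict[0]["id"])
```

Tuning `score_bias` and `time_bias` is the "fine tune these biases" step; the decay shape itself is just one possible choice.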

visual dagger May 28, 2024, 4:24 PM

#

terse folio Yea, that's right, that's kind of what I was saying with fine tuning the model. ...

it will be really cool for a model to stay up to date with you, how?

just set things up so the model gets finetuned at night when you are asleep

#

how wonderful is this?

halcyon quarry May 28, 2024, 4:25 PM

#

I like how this channel is the most productive channel on the server lately

visual dagger May 28, 2024, 4:25 PM

#

I don't have the gpu for it, and even if I did, finetuning is not that easy to get right

#

@halcyon quarry 🫡

terse folio May 28, 2024, 4:25 PM

#

visual dagger I don't have the gpu for it, and even if I did, finetuning is not that easy to g...

I saw one person in here doing that.
finetuning a lora often.

#

with all previous chats

terse folio May 28, 2024, 4:27 PM

#

visual dagger it will be really cool for a model to stay up to date with you, how? just set th...

I've been collecting my discord data from my own servers, so there's a lot of data about me that I could use to test this kind of thing!

But I think some pre-processing is needed before saving facts.
using embedding to match conversations might not be so efficient because important info could be spread across many messages and we need to condense that down to save context

visual dagger May 28, 2024, 4:27 PM

#

terse folio I don't know yet, maybe I will do some sort of weighting? Like the further ago ...

you see this is the problem we might be facing, we're looking at things from one dimension, it's hard to calculate all the dimensions and account for them all, I'm just a human being 😭

#

am not an ASI

terse folio May 28, 2024, 4:28 PM

#

Mhmm,
Should look into how hybrid vector databases work.
That would give me better insight into how to do the weighted by time thing efficiently

visual dagger May 28, 2024, 4:28 PM

#

terse folio I saw one person in here doing that. finetuning a lora often.

woow that's cool, a lightweight lora might help

#

I hope that it worked

terse folio May 28, 2024, 4:29 PM

#

A database like Chroma supports putting tags in your items to search on.
But I don't think they optimised this for high speed.
So it's doing a comparison on every row in the db?
I could be wrong!

visual dagger May 28, 2024, 4:30 PM

#

terse folio I've been collecting my discord data from my own servers, so there's a lot of da...

yeah you have to set things up to a point of not needing your intervention, it will be hard to reach such a level of everything going right

visual dagger May 28, 2024, 4:31 PM

#

terse folio A database like Chroma supports putting tags in your items to search on. But I d...

not sure, but ty for the info, I didn't know that a vector db can also include tags

terse folio May 28, 2024, 4:31 PM

#

after I get my other project done, i'll put some time into the pattern matching!
That gives us a logical way of extracting info

visual dagger May 28, 2024, 4:34 PM

#

about similarity, the assistant should bring up your medicine if you forgot it, so... similarity won't help, since you forgot

but that thing "medicine" is one of the top priorities

but there is a problem, when you are not sick anymore, the llm/system/logic should drop that priority and make the deduction on its own

that

John not sick = no need for medicine

no need for medicine = drop it from the priorities list

terse folio May 28, 2024, 4:36 PM

#

visual dagger not sure, but ty for the info, I didn't know that a vector db can include also t...

Hmm, speaking of hybrid databases.
I read about them briefly once.

They work by doing the embedding vector search.
then another search using one-hot vectors, which contain the actual tokens instead of being compressed.
This lets you search for exact words unlike embedding models that are more about meaning.

Anyway, what I'm thinking is:
Create a sql database, and query the top N rows that match certain tags, or time ranges...
This will return a list of IDs

In the embedding database, query another top N matches based on meaning similarity.

Do a set intersection to find the items that are returned by both sides.
Then score them by going through each result at a time.

Around 1000 items is still fast enough for millisecond response!
so we could use around 10k results and have a decent trade off of accuracy and speed.

visual dagger May 28, 2024, 4:36 PM

#

so things gotta be dynamic somehow, if the entity (llm+system/logic) heard you saying no I didn't forget my medicine for like 7 days in a row, the entity should conclude that there is no point in reminding you since it's getting annoying

#

so... it should go down on the priorities list a bit

#

but not much

terse folio May 28, 2024, 4:37 PM

#

visual dagger about similarity, the assistant should bring up your medicine up if you forgot i...

yes having some background knowledge that is just always present when talking to that user.
This could be updated with temporary info such as state of health

terse folio May 28, 2024, 4:39 PM

#

visual dagger so things gotta be dynamic somehow, if the entity (llm+system/logic) heard you ...

The least work might be to add a counter for each bit of context.
Like "user is sick (mentioned 6 times)"

The llm could have a system message to use a tool to delete a piece of context if it thinks you are telling it to stop and that number is high enough.
because no way you are fitting a week of context in there!
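
A rough sketch of the counter idea, assuming facts are stored as plain strings. The class name, the `forget` tool and the `min_mentions` threshold are hypothetical; how the LLM would actually call the tool is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ContextFacts:
    """Always-present background facts, each with a mention counter."""
    counts: dict = field(default_factory=dict)

    def mention(self, fact):
        # Bump the counter every time the fact comes up again
        self.counts[fact] = self.counts.get(fact, 0) + 1

    def render(self):
        # Compact lines for the prompt, e.g. "user is sick (mentioned 6 times)"
        return [f"{fact} (mentioned {n} times)" for fact, n in self.counts.items()]

    def forget(self, fact, min_mentions=5):
        """Hypothetical tool for the LLM: only deletes facts mentioned often enough."""
        if self.counts.get(fact, 0) >= min_mentions:
            del self.counts[fact]
            return True
        return False


facts = ContextFacts()
for _ in range(6):
    facts.mention("user is sick")
print(facts.render())
```

The counter stands in for a week of raw chat that would never fit in context: the prompt only carries one short line per fact.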

visual dagger May 28, 2024, 4:39 PM

#

terse folio Hmm, speaking of hybrid databases. I read about them briefly once. They work by...

they work with two types of data at once?

terse folio May 28, 2024, 4:40 PM

#

visual dagger they work with two types of data at once?

Vector DB outputs items: [1, 2, 5, 6]
SQL outputs: [2, 3, 4, 5]

Intersection: [2, 5]
For item in intersection:
do some final weighting

return sorted(items)

#

But consider each database would return 10k/20k+ items
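
The intersection step above could be sketched like this, assuming the vector search returns a similarity per item id and the SQL query returns only ids. `hybrid_rank` and the sample scores are illustrative, and the "final weighting" here is just the similarity itself.

```python
def hybrid_rank(vector_hits, sql_ids, limit=10):
    """Keep only ids returned by both searches, best similarity first.

    vector_hits: dict mapping item id -> similarity score from the embedding search
    sql_ids:     iterable of item ids that passed the tag / time-range filter
    """
    shared = set(vector_hits) & set(sql_ids)  # set intersection of both result lists
    ranked = sorted(shared, key=lambda item_id: vector_hits[item_id], reverse=True)
    return ranked[:limit]


vector_hits = {1: 0.91, 2: 0.85, 5: 0.88, 6: 0.40}
sql_ids = [2, 3, 4, 5]
print(hybrid_rank(vector_hits, sql_ids))  # [5, 2]
```

Set intersection is roughly linear in the number of ids, so pulling 10k to 20k candidates from each side should stay cheap compared with the queries themselves.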

visual dagger May 28, 2024, 4:41 PM

#

terse folio The least work might be to add a counter for each bit of context. Like "user is ...

the assistant also shouldn't listen to you all the time, like if you don't want to take your medicine and you said "no I am not taking it, stop reminding me" the assistant should in this case find more sneaky ways to convince you to take it? maybe?

halcyon quarry May 28, 2024, 4:41 PM

#

Quick report that I did have to copy/paste 2 loading functions and modify them, to get per channel histories working correctly

#

And it is working correctly

terse folio May 28, 2024, 4:42 PM

#

visual dagger the assistant also shouldn't listen to you all the time, like if you want to not ...

that would be up to the character card if we want to keep it general and as little work as possible

terse folio May 28, 2024, 4:42 PM

#

halcyon quarry Quick report that I did have to copy/paste 2 loading functions and modify them, ...

Woo!

visual dagger May 28, 2024, 4:43 PM

#

terse folio that would be up to the character card if we want to keep it general and as litt...

yeah but character card isn't enough, maybe a decision making system prompt will help?

#

then integrate the result/decision into the main chat

#

hmm.. so by the entity having priorities it should have values... and one of the values is your well being

#

that... is interesting, it won't base its decisions on randomness but on actual values and priorities

#

not stupid censorship that the big AI players implement

#

but kind of a natural version of that

#

that is based on your best interest in mind

terse folio May 28, 2024, 4:55 PM

#

visual dagger hmm.. so by the entity having priorities it should have values... and one of th...

that's a lot of work to write out every decision/value!
I would first do testing to see if the LLM could intuit those values on its own based on a character card

#

Maybe add tweaks on top of that for what the LLM can't do on its own

visual dagger May 28, 2024, 5:01 PM

#

terse folio that's a lot of work to write out every decision/value! I would first do testing to...

not writing everything down.. even if you tried you can't, there are unlimited possibilities, so it's not possible to write everything

but you can write the main values, and you can use the character card for that, the character card might be enough but still it might cause negativity/positivity bias

#

[character card]
"you are an AI assistant that cares about the user. ... etc"

but bcz of this ^ it will happily agree that you drop your medicine (which might be dangerous), it's not helping in this case, it's doing more harm than good

halcyon quarry May 28, 2024, 5:04 PM

#

Reality I'm having a little issue that maybe you have an idea how to fix...

terse folio May 28, 2024, 5:04 PM

#

halcyon quarry Reality I'm having a little issue that maybe you have an idea how to fix...

Sure

terse folio May 28, 2024, 5:05 PM

#

visual dagger [character card] "you are an AI assistant that cares about the user. ... etc" b...

I think doing experiments and adjusting the character card is the way to go.

Release it into the wild!
like a discord server, collect user opinions and adjust.

maybe create multiple versions of the card, and see who likes which better

terse folio May 28, 2024, 5:06 PM

#

halcyon quarry Reality I'm having a little issue that maybe you have an idea how to fix...

Has anything been pushed recently? Dev seems empty

halcyon quarry May 28, 2024, 5:06 PM

#

This is failing to find the channel ID key in the dictionary even though it is there:

    async def get_channel_history(self, i=None):
        # If per-channel history
        if self.per_channel_history_enabled:
            if not self.session_history.get(i.channel.id):

This gives me an error when trying to convert the channel ID to a string.
I believe the string version of the channel ID would find the value in the dictionary.

    async def get_channel_history(self, i=None):
        # If per-channel history
        if self.per_channel_history_enabled:
            chankey = str(i.channel.id)
            if not self.session_history.get(chankey):

terse folio May 28, 2024, 5:06 PM

#

ahh, a new branch, missed that

terse folio May 28, 2024, 5:07 PM

#

halcyon quarry This is failing to find the channel ID key in the dictionary even though it is t...

remember to convert to string

#

halcyon quarry May 28, 2024, 5:07 PM

#

Yes - help

#

😛

terse folio May 28, 2024, 5:07 PM

#

JSON will convert int keys to strings.
It doesn't support non string keys in dictionaries
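
A small demonstration of that behaviour with the standard `json` module. The channel id is a made-up stand-in, and the dict is a simplified version of what the bot saves.

```python
import json

channel_id = 1234567890  # stand-in for an integer Discord channel id
history = {channel_id: ["hello"]}

# Saving and loading again turns the int key into a str key
restored = json.loads(json.dumps(history))

print(list(restored.keys()))          # ['1234567890']
print(restored.get(channel_id))       # None, the int key is gone
print(restored.get(str(channel_id)))  # ['hello']
```

This is why a lookup with the raw integer id can work during a session but miss after the history has been reloaded from a log file.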

visual dagger May 28, 2024, 5:08 PM

#

per my testing, no matter how much you change the system prompt it's not enough, the llm gets lost. I tried adjusting the system prompt a lot, but I didn't get results like I did from implementing logic and/or adding facts etc.. that helped more than adjusting the system prompt

making a system/logic is the way to go, the llm is just a cog/part of the overall system

#

system or entity

terse folio May 28, 2024, 5:08 PM

#

halcyon quarry Yes - help

Sorry, what's the issue?

#

what's the error?

halcyon quarry May 28, 2024, 5:08 PM

#

terse folio remember to convert to string

Please see second chunk of what I wrote

#

The error is simply the channel ID value

terse folio May 28, 2024, 5:09 PM

#

we need better tracebacks

#

add a print(traceback.format_exc()) before that error message line

#

and import traceback if not already

#

because the code looks fine to me

#

it could be getting the channel history, but another part of the code is causing issues
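
A sketch of the traceback suggestion, assuming the bot has a broad `try`/`except` that currently prints only the exception value. The two functions are hypothetical stand-ins, not the bot's real code.

```python
import traceback


def risky_lookup(history, key):
    # Raises KeyError when the key (e.g. an int channel id) is missing
    return history[key]


def handle_message(history, key):
    try:
        return risky_lookup(history, key)
    except Exception as e:
        # Full stack trace first, so the failing line is visible
        print(traceback.format_exc())
        # The existing short message on its own shows only the value (the channel id)
        print(f"An error occurred: {e!r}")
        return None


handle_message({"123": []}, 123)
```

With only the short message, a `KeyError` prints as just the missing key, which matches the "the error is simply the channel ID value" symptom.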

halcyon quarry May 28, 2024, 5:11 PM

#

Nah - shortly after, it adds a duplicate copy of the channel key to the history

#

then starts adding to it

terse folio May 28, 2024, 5:13 PM

#

just downloaded the code

halcyon quarry May 28, 2024, 5:13 PM

#

This issue only occurs if I load previous history

#

That's outdated

#

1 sec

#

Pushed update

#

The problem only occurs if autoload_history: true and you had a previous conversation while per_channel_history: true

vestal python May 28, 2024, 5:15 PM

#

I'm going to end up setting my discord bots up, but further work on them is on hold :x About to lose my current job, but they're looking to maybe hire me for AI stuff. So I need to finish my webapp design as a proof-of-concept for them, and then it's back to finishing the discord bot and the back end textgen webui implementations..

halcyon quarry May 28, 2024, 5:16 PM

#

If you see load_bot_history() (line 4802) - it is correctly getting the latest history as either a single format or as a per-channel format

#

self.session_history is being assigned the dictionary value of the history file - if you print it, you will see the prior channel ID as one of the top level keys

#

When a message is sent init_llm_payload() function is supposed to get the correct existing channel history via get_channel_history()

But it is not matching that key value

#

Pretty sure it will match so long as the channel.id can be converted to a string...

terse folio May 28, 2024, 5:22 PM

#

what happens when you run get_channel_history with multi channel disabled?

#

oh

#

nvm

halcyon quarry May 28, 2024, 5:22 PM

#

😛

#

I have literally everything else working...

terse folio May 28, 2024, 5:23 PM

#

well ill test the bot and see what happens!

halcyon quarry May 28, 2024, 5:24 PM

#

I'm not getting an error if I copy it first like this...

#

        if self.per_channel_history_enabled:
            chankey = copy.copy(i.channel.id)
            print("chankey", chankey)
            chankey = str(chankey)
            print("chankey", chankey)

#

It's also not printing it with '' around it though...

#

terse folio May 28, 2024, 5:25 PM

#

print(repr(chankey))

#

yes, that's an int

#

copy.copy(channel.id) returns an int

halcyon quarry May 28, 2024, 5:26 PM

#

I'll try to f-string it

terse folio May 28, 2024, 5:27 PM

#

#

printing a string won't show the quotes

#

using repr() will show them though

halcyon quarry May 28, 2024, 5:27 PM

#

ah thats good to know

terse folio May 28, 2024, 5:28 PM

#

in an fstring you can use f'{var!r}'

#

as a shortcut to repr()

#

fun fact, you can define your own __str__ and __repr__ functions in a class
so when printing it, or using repr on it, they can give custom results!

#

Discord.py does this with user objects

str(user) returns user.name
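
A short illustration of `print` versus `repr()` and of custom `__str__` / `__repr__` methods. The `Member` class is a made-up stand-in, not discord.py's actual user class.

```python
class Member:
    """Hypothetical stand-in for a user object."""

    def __init__(self, name, member_id):
        self.name = name
        self.member_id = member_id

    def __str__(self):
        # Used by print() and str()
        return self.name

    def __repr__(self):
        # Used by repr() and the !r conversion in f-strings
        return f"Member(name={self.name!r}, member_id={self.member_id!r})"


chankey = str(42)
print(chankey)          # 42   (no quotes, looks like an int)
print(repr(chankey))    # '42' (quotes show it is a str)
print(f"{chankey!r}")   # '42' (same thing via the f-string shortcut)

user = Member("folio", 7)
print(user)             # folio
print(repr(user))       # Member(name='folio', member_id=7)
```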

halcyon quarry May 28, 2024, 5:29 PM

#

This doesn't seem to be the issue - I'm missing something in my logic heh

terse folio May 28, 2024, 5:29 PM

#

Does it return the correct history?

halcyon quarry May 28, 2024, 5:36 PM

#

It's just the first exchange after loading that is not being handled correctly

#

This is weird.

#

See:

    async def get_channel_history(self, i=None):
        # If per-channel history
        if self.per_channel_history_enabled:
            print("history:", self.session_history)
            chankey = copy.copy(i.channel.id)
            print("chankey", chankey)
            chankey = str(chankey)
            print(repr(chankey))
            if not self.session_history.get(i.channel.id):
                print("Not matched")

#

Here is the printed history:

#

#

It should 'get' that key but it is not

#

ok... maybe this is because it is nested down another level?

#

no - idk

terse folio May 28, 2024, 5:40 PM

#

you're not using the string channel id, you create the string version, but forgot to swap it in the .get()

#

if not self.session_history.get(str(i.channel.id)) should do the trick ^^

halcyon quarry May 28, 2024, 5:41 PM

#

Yeah well,,, lets see

#

😛

#

been back and forth with things... think I already tried it correct...

#

Now I'm back to that error. yikes

#

Alright.

#

I got it

#

the error was because later I tried returning the value with the i.channel.id key so that was a keyerror

terse folio May 28, 2024, 5:44 PM

#

#

something's wrong up here, I think history didn't get loaded

#

#

not sure if related

#

but there might be some recursion in the history

#

or this is it just failing over and over

halcyon quarry May 28, 2024, 5:47 PM

#

I am not caught up with Dev

#

which could be the issue

terse folio May 28, 2024, 5:48 PM

#

Where is history saved btw?

#

oh, logs?

halcyon quarry May 28, 2024, 5:49 PM

#

It is kept in the session_history variable, which gets written to and saved after every LLM response to yes, logs

#

I think I have this business worked out...

#

At the start of manage_history() I also just need to convert the channel.id to str

#

YES - Did resolve this mess

#

Now it is correctly getting the dict.

terse folio May 28, 2024, 5:54 PM

#

👍

#

cleared the logs and still getting issues

halcyon quarry May 28, 2024, 5:55 PM

#

hum

#

Did you merge this with dev?

terse folio May 28, 2024, 5:56 PM

#

no, just using the history branch

#

hmm I think I unintentionally fixed it

halcyon quarry May 28, 2024, 6:01 PM

#

Did find another error which had to do with the imported start_new_chat()

#

I need to just skip this and instead init history as an empty dict

terse folio May 28, 2024, 6:01 PM

#

another part of the code was not running because of a forgotten await
So that's returning a coroutine object instead of the actual history.

And llm_gen is complaining that history contains that object when trying to copy it.
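
A minimal reproduction of the missing-`await` problem. Calling an `async def` function without `await` hands back a coroutine object and never runs the body. The function below is a hypothetical stand-in for the real reset function, and the returned dict shape is an assumption.

```python
import asyncio


async def reset_session_history():
    """Hypothetical stand-in for the bot's async reset function."""
    return {"internal": [], "visible": []}


async def main():
    forgotten = reset_session_history()       # no await: coroutine object, body not run
    is_coroutine = asyncio.iscoroutine(forgotten)
    forgotten.close()                         # tidy up so Python does not warn about it

    awaited = await reset_session_history()   # with await: the actual dict
    return is_coroutine, awaited


is_coroutine, awaited = asyncio.run(main())
print(is_coroutine, awaited)
```

If the un-awaited object ends up stored in the history, anything that later tries to copy or serialize the history trips over it, far away from the line that caused it.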

halcyon quarry May 28, 2024, 6:02 PM

#

...where? 😄

terse folio May 28, 2024, 6:02 PM

#

ill push it,
the reset_history function

#

huh, how do I get it to import the previous history?

#

on restart

halcyon quarry May 28, 2024, 6:04 PM

#

In config.yaml need to enable autoload_history

#

await bot_history.reset_session_history(ctx)
Ahh yes this was missing await
Woops!

terse folio May 28, 2024, 6:04 PM

#

Pushed

#

have them all true

halcyon quarry May 28, 2024, 6:06 PM

#

When you start up and send a message / get a response, it should save a new log file.

If you close and open the bot again, send message, that previous log should reflect 2 messages

terse folio May 28, 2024, 6:06 PM

#

the history is empty on start

#

it creates a new log

halcyon quarry May 28, 2024, 6:08 PM

#

I'm going to fix this one tiny thing and push what I have - maybe you still have the channel.id trying to match the string

#

(which I did not push the fix)

terse folio May 28, 2024, 6:08 PM

#

ah right, yes
I undid that for the commit!

halcyon quarry May 28, 2024, 6:09 PM

#

Checking your modifications before I push...

#

load_bot_history() does not need to be awaited

terse folio May 28, 2024, 6:10 PM

#

I converted it to an async function for consistency so it's easier to remember

halcyon quarry May 28, 2024, 6:11 PM

#

Before I screw something up

#

Here's my version
It does not have your merges in it

📎 bot.py

terse folio May 28, 2024, 6:12 PM

#

wish discord had a copy text from file button

#

The cool thing about git is you can undo commits!

#

it's fine

#

how come per channel history trims the history shorter? for flows

halcyon quarry May 28, 2024, 6:15 PM

#

I have a separate copy of history that includes all messages including those ignored by history

#

Which you are probably going to cringe about 🙂

terse folio May 28, 2024, 6:17 PM

#

interesting interesting,
Also noticed get_channel_history is called once and put to llm payload
But bot_history.session_history is accessed many times with [internal]

#

Am I missing something? How does that work?

halcyon quarry May 28, 2024, 6:18 PM

#

If the key does not exist when get_channel_history is called, the key is initialized.
Otherwise, the history for that key is returned

#

After llm_gen, the prompt/reply and channel_id are sent.
It again finds the correct key and manages everything accordingly
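
The get-or-initialize behaviour described above might look roughly like this. It assumes the key is always normalized to `str` and that each channel holds `internal` and `visible` lists (the two names mentioned elsewhere in this chat). `BotHistory` here is a simplified stand-in, not the bot's real class.

```python
class BotHistory:
    """Simplified per-channel history store keyed by string channel ids."""

    def __init__(self, session_history=None):
        self.session_history = session_history if session_history is not None else {}

    def get_channel_history(self, channel_id):
        # Normalize once, so int ids from Discord match str keys loaded from JSON
        chankey = str(channel_id)
        # Returns the existing entry, or creates an empty one on first use
        return self.session_history.setdefault(
            chankey, {"internal": [], "visible": []}
        )


loaded = {"123": {"internal": [["hi", "hello"]], "visible": [["hi", "hello"]]}}
bot_history = BotHistory(loaded)

existing = bot_history.get_channel_history(123)   # int id finds the "123" key
fresh = bot_history.get_channel_history(456)      # unknown channel gets initialized
print(sorted(bot_history.session_history))        # ['123', '456']
```

Because the same dict object is returned each time, appending to it after `llm_gen` updates the stored history without a second lookup.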

terse folio May 28, 2024, 6:20 PM

#

okay, I see there are some if statements around about "single" or "multichan"
This is a little complicated to follow

halcyon quarry May 28, 2024, 6:20 PM

#

I added that bit because it will load the latest history file it finds... which may not be structured correctly for the current mode

#

If the mode doesn't match the history that is loaded, it basically resets...
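
One way the mode check could work, sketched under the assumption that a single-mode log has `internal` and `visible` at the top level while a per-channel log nests them under channel keys. Both function names are hypothetical and the real loader may decide differently.

```python
def history_mode(history):
    """Guess which mode a loaded history dict was saved in."""
    if "internal" in history and "visible" in history:
        return "single"
    return "multichan"


def load_or_reset(history, per_channel_enabled):
    """Keep the loaded history only if its structure matches the current mode."""
    wanted = "multichan" if per_channel_enabled else "single"
    if history_mode(history) != wanted:
        # Structure does not fit the current mode: start fresh
        return {} if per_channel_enabled else {"internal": [], "visible": []}
    return history


single_log = {"internal": [["hi", "hello"]], "visible": [["hi", "hello"]]}
multi_log = {"123": {"internal": [], "visible": []}}

print(load_or_reset(single_log, per_channel_enabled=True))    # {}
print(load_or_reset(multi_log, per_channel_enabled=True))     # kept as loaded
```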

terse folio May 28, 2024, 6:22 PM

#

Got it

halcyon quarry May 28, 2024, 6:22 PM

#

could be possible to try harder to find the most recent history that does match

#

but seems complicated

terse folio May 28, 2024, 6:24 PM

#

interesting, with everything in str() it still makes a new history file

#

i'll check it out later after you push the commit

halcyon quarry May 28, 2024, 6:24 PM

#

Ok give me a minute to try merging your changes into this 😛

terse folio May 28, 2024, 6:25 PM

#

just adding await to reset_session_history in 3 places was really important

halcyon quarry May 28, 2024, 6:26 PM

#

OK I see that was it

#

Pushed

#

reverted load_bot_history to normal function 🙂

terse folio May 28, 2024, 6:29 PM

#

on another note, I was working with some complicated history stuff for one of my own projects!
I did the idea I talked about, using a Message class for each history item.

That contains attributes like is the message finished, when was it created and so on.

Because i'm trying to work with real time messages through voice, handling interruptions and continuing messages that are not at the front of the history!
I'll share when things are more solid.
It had some really tough bugs at first

halcyon quarry May 28, 2024, 6:31 PM

#

For consistency, I'm going to change the str version of channel_id to chankey

terse folio May 28, 2024, 6:31 PM

#

I saw, was updating some other spots where it was missed as well

halcyon quarry May 28, 2024, 6:31 PM

#

Well I have it convert to string, but kept the variable name as channel_id

#

and yes will update the hints to str

terse folio May 28, 2024, 6:32 PM

#

should update typehints too on that, maybe trace it back to the source

halcyon quarry May 28, 2024, 6:33 PM

#

OK will convert channel to str before sending to manage_history

terse folio May 28, 2024, 6:33 PM

#

yup just did that :P

halcyon quarry May 28, 2024, 6:34 PM

#

We're probably both making literally the exact same changes...

terse folio May 28, 2024, 6:34 PM

#

ill undo my changes and focus on some other things I noted

halcyon quarry May 28, 2024, 6:36 PM

#

Thanks for all your help, again

#

Pushed

#

Are you messing with send_char_greeting_or_history() at all?

terse folio May 28, 2024, 6:39 PM

#

no, trying to figure out why guild_name comes up as a guild object in history

#

this happened while I was testing earlier

#

oh, that might have been my mistake

halcyon quarry May 28, 2024, 6:40 PM

#

That looks like the channel.id thing maybe

terse folio May 28, 2024, 6:40 PM

#

I removed the str() from channel.id earlier to test what the bot does without code modifications

#

and when using Ctrl+D I also selected the str of guild and channel name by accident without realizing

#

all good

halcyon quarry May 28, 2024, 6:41 PM

#

I'm updating send_char_greeting_or_history() to work with the per chan history

#

er actually...

#

yeah, we're not sending anything if its per-channel 😛

terse folio May 28, 2024, 6:42 PM

#

to make things easy to read,
I would set a default channel id like "0" if it's using single channel mode.
This way you only need to write the code once!

halcyon quarry May 28, 2024, 6:43 PM

#

I think the main reason anyone would willingly keep it all as one, is for immediate compatibility with TGWUI

#

If I reformat it under a key then that won't work either 😛

terse folio May 28, 2024, 6:45 PM

#

I see

#

wonder why TGWUI doesn't support custom names for logs

halcyon quarry May 28, 2024, 6:45 PM

#

It does - it just does not support this dictionary structure

#

I add the custom name to the logs just for easy identification

#

So at a glance you can tell which are incompatible

terse folio May 28, 2024, 6:47 PM

#

i would create 2 logs

halcyon quarry May 28, 2024, 6:47 PM

#

On that note - going to see what ChatGPT has to say about writing a utility function

terse folio May 28, 2024, 6:47 PM

#

one for the custom stuff (guild/channel name), and one for the messages per channel

halcyon quarry May 28, 2024, 6:47 PM

#

If I delete the guild name and channel name, it's still incompatible

#

It's because the visible and internal are nested under a key

#

The alternative is to have separate log files for every channel

#

I'm going to see about a utility function now to extract these logs into individual ones

terse folio May 28, 2024, 6:49 PM

#

halcyon quarry It's because the `visible` and `internal` are nested under a key

you could use the file name as the channel key

terse folio May 28, 2024, 6:49 PM

#

halcyon quarry The alternative is to have separate log files for every channel

yes, this is what I would do

#

to make it readable for the user, you could save it like:

channel.id_character_guildname[:10]_channelname[:20]

#

so you don't have infinitely long file names haha

halcyon quarry May 28, 2024, 6:51 PM

#

At this point, it is staying as is

terse folio May 28, 2024, 6:51 PM

#

but make sure to clean up the guild/channel names

#

yea
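
A sketch of the per-channel file naming with cleaned-up names. The regex, the `.json` extension and the 30 character cap on the character name are assumptions; only the `[:10]` and `[:20]` truncations come from the suggestion above.

```python
import re


def clean_name(name, max_len):
    """Replace anything that is not a letter, digit, '_' or '-' and truncate."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    return cleaned[:max_len]


def log_filename(channel_id, character, guild_name, channel_name):
    parts = [
        str(channel_id),               # unique part, keeps files from colliding
        clean_name(character, 30),
        clean_name(guild_name, 10),    # guildname[:10]
        clean_name(channel_name, 20),  # channelname[:20]
    ]
    return "_".join(parts) + ".json"


print(log_filename(123, "Assistant", "My Cool Server!!", "general/chat"))
# 123_Assistant_My_Cool_Se_general_chat.json
```

Leading with the channel id means the readable parts can be truncated or cleaned aggressively without risking two channels sharing a file.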

halcyon quarry May 28, 2024, 6:52 PM

#

Just thinking of that approach is giving me a headache - and I like having it all bundled up nicely into one log file

terse folio May 28, 2024, 6:57 PM

#

Yea, I can imagine

terse folio May 28, 2024, 7:13 PM

#

here's a few things that will take a while to fix

#

discord.Interaction uses i.user
commands.Context uses i.author

#

you can get the correct version like this

#

or maybe just pass the username down as a different variable

#

maybe ill add a function to get username
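
A possible shape for that helper, written with duck typing so it runs without discord.py installed. `get_user` is a hypothetical name, and the `SimpleNamespace` objects only mimic the one attribute that matters on each type.

```python
from types import SimpleNamespace


def get_user(ictx):
    """Return the invoking user from an Interaction-like (.user) or
    Context/Message-like (.author) object."""
    user = getattr(ictx, "user", None)
    if user is None:
        user = getattr(ictx, "author", None)
    if user is None:
        raise AttributeError(
            f"{type(ictx).__name__} has neither 'user' nor 'author'"
        )
    return user


# Stand-ins for discord.Interaction and commands.Context / discord.Message
interaction_like = SimpleNamespace(user=SimpleNamespace(display_name="folio"))
context_like = SimpleNamespace(author=SimpleNamespace(display_name="quarry"))

print(get_user(interaction_like).display_name)  # folio
print(get_user(context_like).display_name)      # quarry
```

Routing every lookup through one function means the `user` / `author` difference only has to be remembered in a single place.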

halcyon quarry May 28, 2024, 7:24 PM

#

Got the utility script working...

terse folio May 28, 2024, 7:24 PM

#

#

you have way too many types feeding into these functions!

#

Message, interaction, context

#

All with their slight little changes

halcyon quarry May 28, 2024, 7:26 PM

#

more details in the screenshot than intended

#

About the author / user thing

#

I don't think I actually have any errors atm

#

but, I know could be confusing 🙂

terse folio May 28, 2024, 7:28 PM

#

I can't believe it works 😸

#

just in a constant state of almost errors

halcyon quarry May 28, 2024, 7:29 PM

#

The thing that is odd is that i.author isn't documented... but seems to work universally EXCEPT for the Regen and Cont App Commands

terse folio May 28, 2024, 7:29 PM

#

discord.Interaction doesn't have an author attribute

#

but hybrid commands send out commands.Context which does

#

only your menus won't work with i.author

halcyon quarry May 28, 2024, 7:30 PM

#

I'll repeat- everything is working with i.author except for the Cont and Regen commands

terse folio May 28, 2024, 7:30 PM

#

ill check if that's still true, my discord.py is a bit older

halcyon quarry May 28, 2024, 7:30 PM

#

Those are the only 2 places I replaced it with i.user.display_name

#

The documentation doesn't seem to reflect it - was just a happy accident I found when I simply tried replacing all instances with i.author.display_name

#

on_message, all the commands, etc

terse folio May 28, 2024, 7:35 PM

#

I think it's just a near miss,
I see nothing in the code about an author attribute

#

impressed it has worked so long!

halcyon quarry May 28, 2024, 7:36 PM

#

yeah like, 2 weeks (that's when I made the sweeping update)

#

Like I said, I accidentally found that it worked pretty universally

#

They have so many aliases and crap for things, it's no wonder they overlooked something like this

#

Pushed the commit including the utility if you want to check it out

terse folio May 28, 2024, 7:38 PM

#

halcyon quarry They have so many aliases and crap for things, it's no wonder they overlooked some...

some classes might have a property that evaluates to an alias.
same for base classes.
But i checked all that out 🤔

#

ill be pushing my changes to :dev

halcyon quarry May 28, 2024, 7:39 PM

#

I know you just recently pushed a number of things to dev, a few days ago

terse folio May 28, 2024, 7:39 PM

#

put in a little warning log to see if it ever triggers for i.user

halcyon quarry May 28, 2024, 7:39 PM

#

I need help merging that and the per-channel stuff 🤯

#

This per channel stuff is good to go I think

terse folio May 28, 2024, 7:40 PM

#

😸

#

nvm, forgot menus are defined differently

halcyon quarry May 28, 2024, 8:00 PM

#

yeah, this is still a bit bugged unfortunately. Dammit

#

It seems like I do need to collect visible.

#

maybe...

#

it errors when using extensions that want to modify history... such as TTS

terse folio May 28, 2024, 8:06 PM

#

interesting, they shouldn't be related

halcyon quarry May 28, 2024, 8:07 PM

#

When I turn on alltalk_tts I'm getting this error

#

terse folio May 28, 2024, 8:08 PM

#

it wants to grab the last message and I guess replace it with the html tag

halcyon quarry May 28, 2024, 8:09 PM

#

I'll try collecting visible and see what happens

#

I think I just need to initialize with 2 empty sublists

#

yep

#

no error now

#

derp

#

Yep, no error if I initialize with 2 empty sublists

terse folio May 28, 2024, 8:17 PM

#

To explain some of my changes:

Typehints are classes (types), and Python classes use PascalCase, for example ExampleClass.
So you can expect the last item of a typehint to start with a capital letter.
discord.message - was a module (a file)
discord.Message - is the class you wanted 😸

I think the same rule applies to the earlier case of using discord.User.mention as a typehint.
If User.mention was a class, this would have worked ^-^

Also discord.py is structured weirdly,
discord.ext.commands is like a separate module from discord.

So discord.ext.commands.Context didn't work as a typehint.
But
from discord.ext import commands
commands.Context does.

#

pushed some cleaning!

halcyon quarry May 28, 2024, 8:20 PM

#

Alright I'll see if I can figure out how to merge the channel stuff into this

terse folio May 28, 2024, 8:20 PM

#

That may be a big merge!

halcyon quarry May 28, 2024, 8:20 PM

#

think I just need to open the two files in split view

#

then just be very careful

terse folio May 28, 2024, 8:21 PM

#

you should be able to create a pull request, it will merge as much as it can, and then give you a diff view to combine

#

create a new branch ofc to do the merge

halcyon quarry May 28, 2024, 8:21 PM

#

Yeah, good idea. Will duplicate branch

terse folio May 28, 2024, 8:23 PM

#

Also found out aiohttp has a .request method!
That just lets you pass the request_method like get/post as a variable ^-^

halcyon quarry May 28, 2024, 8:24 PM

#

I see that you did change the labelling to ictx yes?

terse folio May 28, 2024, 8:24 PM

#

For the mixed ones

#

i was changed to message where it was only coming from on_message

#

i was changed to inter in interaction only functions

#

but there are some cases where all 3 are possible,
so I called that ictx

#

and gave that its own typehint alias CtxInteraction
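
A minimal sketch of what an alias like `CtxInteraction` could look like, assuming it is a `Union` of the three possible sources. The three classes below are stand-ins for `discord.Message`, `discord.Interaction` and `commands.Context` so the snippet runs without discord.py; the real definition in the repo may differ, and `source_kind` is a hypothetical helper.

```python
from typing import Union


# Stand-ins for discord.Message, discord.Interaction and commands.Context
# (assumption: the real alias is built from those three classes).
class Message:
    pass


class Interaction:
    pass


class Context:
    pass


# One name for "any of the three", used where a function can be reached
# from on_message, a slash command, or a prefix command.
CtxInteraction = Union[Context, Interaction, Message]


def source_kind(ictx: CtxInteraction) -> str:
    """Report which of the three sources a mixed 'ictx' argument came from."""
    if isinstance(ictx, Interaction):
        return "interaction"
    if isinstance(ictx, Context):
        return "context"
    return "message"
```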

halcyon quarry May 28, 2024, 8:32 PM

#

I need to run, but I duplicated dev to dev2, then tried merging per_chan_histories - and went through and resolved the few conflicts

#

If you want to try out dev2 that would be great - I'll be checking it out later as well

halcyon quarry May 28, 2024, 9:25 PM

#

Ya missed one!

        if bot_will_do['should_gen_text']:
            # build llm_payload with defaults
            llm_payload = await init_llm_payload(i, user_name, text)

terse folio May 28, 2024, 9:25 PM

#

it created some duplicate functions

#

should do a merge side by side

halcyon quarry May 28, 2024, 9:26 PM

#

I see,,, announce changes

terse folio May 28, 2024, 9:26 PM

#

and the function below it

halcyon quarry May 28, 2024, 9:26 PM

#

and yep

#

change char task

terse folio May 28, 2024, 9:27 PM

#

halcyon quarry Ya missed one! ``` if bot_will_do['should_gen_text']: # buil...

huh, wonder why it shows different, maybe a duplicate function?

#

oh nvm

#

that's one I edited haha

#

it might have been added in the merge

#

because I would have caught that with error checking

#

this is helpful before doing a push 😸

halcyon quarry May 28, 2024, 9:29 PM

#

I got that bit straightened out...

#

(announce, change char task, etc)

#

Pushed fixes to dev2 🙄

#

Looking pretty good now...

terse folio May 28, 2024, 10:34 PM

#

happy that worked, I was messing around trying to figure out how to merge 2 branches without automatically merging.
Like making everything count as a conflict for manual resolution

#

Can compare 2 branches and get the diff view I want, just not to select left/right or edit them in the previews

halcyon quarry May 28, 2024, 11:07 PM

#

There is the compare files thing in VS Code

#

It’s what I should’ve done really

terse folio May 28, 2024, 11:11 PM

#

File compare seems to only work to edit the latest, which is great.
But you need 2 files to compare in the first place

Maybe if you cloned the repo into 2 folders running different branches it could be done that way allowing edits

halcyon quarry May 28, 2024, 11:14 PM

#

Yeah that’s what I meant

halcyon quarry May 28, 2024, 11:42 PM

#

So I had changed visible to have 2 empty sublists and it worked

#

What I didn’t realize was that it fails on the next gen lol

#

I’m going to test adding empty strings so the internal and visible lists have matching lengths

halcyon quarry May 29, 2024, 12:26 AM

#

Yep - appending empty strings resolves any error.

#

I suppose at all times, the internal and visible lists are expected to have matching lengths
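
A small sketch of the fix described above, assuming the history is a dict with `internal` and `visible` lists of `[user, bot]` pairs. `pad_visible` is a hypothetical helper, not the bot's actual code.

```python
def pad_visible(history: dict) -> dict:
    """Pad the shorter list with ['', ''] pairs so both lists line up."""
    internal = history.setdefault("internal", [])
    visible = history.setdefault("visible", [])
    target = max(len(internal), len(visible))
    for pairs in (internal, visible):
        # Append empty exchanges until this list reaches the target length.
        while len(pairs) < target:
            pairs.append(["", ""])
    return history


history = {
    "internal": [["hi", "hello"], ["how are you?", "fine"]],
    "visible": [],  # nothing collected yet
}
pad_visible(history)
print(len(history["internal"]) == len(history["visible"]))
```

Padding on every update, rather than only at initialization, is what covers the "fails on the next gen" case.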

halcyon quarry May 29, 2024, 1:02 AM

#

doing one last run of vigorous testing then pushing this bad boy to main

halcyon quarry May 29, 2024, 1:21 AM

#

Yeah we are a go

#

New setting in config.yaml enabling each channel to have its own separate chat history.

Compatible with the other history settings. The only downside really is that the log file cannot be directly loaded into text-generation-webui. However, a utility can be found in new directory /utils/ where users can Drag n Drop one of the _multiple-history.json files and it will split it into compatible logs.

New /announce command

The command allows users to set channels as "announce channels". If any announce channels are defined, Model changes and character changes will be announced there instead of in interaction channels. This makes it easier to inform users in all channels about things changing with the bot.

keen palm May 29, 2024, 1:59 AM

#

How's regenerate coming along? 🙂

halcyon quarry May 29, 2024, 1:59 AM

#

Might be next on the list

halcyon quarry May 29, 2024, 2:39 AM

#

Ok I thought of a very good idea for handling Regenerate/Continue - which is also going to simplify my “Recent messages” handling (different from History)

#

I'm going to collect message.ids into lists matching the same structure as History and account for ones not in history. Will be able to get the corresponding messages needed / edit history for the commands
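
One possible shape for that idea, as a sketch only: a list of message IDs with the same `[user, bot]` pair layout as the history, so an ID can be mapped back to its exchange. The IDs, the layout and `find_exchange` are all assumptions for illustration.

```python
from typing import List, Optional

# History as [user_text, bot_text] pairs.
history = [["hi", "hello"], ["tell me a joke", "knock knock"]]
# Same shape, holding the Discord message IDs instead of the text
# (made-up IDs for the example).
message_ids = [[101, 102], [103, 104]]


def find_exchange(ids: List[List[int]], target_id: int) -> Optional[int]:
    """Return the index of the exchange containing target_id, if any."""
    for index, pair in enumerate(ids):
        if target_id in pair:
            return index
    return None


# Regenerate on bot message 104 would need to rewrite history[1].
index = find_exchange(message_ids, 104)
if index is not None:
    print(history[index])
```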

visual dagger May 29, 2024, 4:48 AM

#

hey again fellas

#

@keen palm how is it going with the game?

keen palm May 29, 2024, 12:33 PM

#

Not going anywhere right now. Regeneration is kind of the last piece of the puzzle needed for the bot

halcyon quarry May 29, 2024, 12:45 PM

#

I’ll be working on that today

keen palm May 29, 2024, 1:49 PM

#

Do you know of a way to put information into context without adding it to the ongoing history? The Complex Memory extension (which seems to be b0rked now) did that with keywords, where the information would be injected into the beginning of the context. Silly Tavern does a similar thing with world books. Same with Novel AI and their lorebooks.
The issue with using tags to inject information via trigger words is the information remains in the history, so it can rather quickly overwhelm the available context, especially if it's more commonly triggered information.

halcyon quarry May 29, 2024, 2:05 PM

#

Unsure if I mentioned this to you or someone else, but I plan on adding User Variable assignments.

So you could assign something to a variable, then use the variable in your prompt

#

It's on the TODO List™️

keen palm May 29, 2024, 2:28 PM

#

You did mention that, but I don't know if that would fix the inherent problem

halcyon quarry May 29, 2024, 2:29 PM

#

Ah yeah... you're right about that

#

I think what I need to do is add specific handling for when the state tag is used - if it includes custom_system_prompt, add the values together rather than update the existing value

#

Or, make yet another tag like add_system_prompt which will add to whatever the current custom_system_prompt is
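
A rough sketch of the set-versus-append difference, assuming the state is a plain dict. `apply_state_tags` is hypothetical, and `add_system_prompt` is only the tag name floated above, not an existing feature.

```python
def apply_state_tags(state: dict, tags: dict) -> dict:
    """Return a copy of state with system prompt tags applied."""
    new_state = dict(state)
    if "custom_system_prompt" in tags:
        # Current behavior: the tag replaces whatever was there.
        new_state["custom_system_prompt"] = tags["custom_system_prompt"]
    if "add_system_prompt" in tags:
        # Proposed behavior: the tag is appended to the current prompt.
        current = new_state.get("custom_system_prompt", "")
        extra = tags["add_system_prompt"]
        new_state["custom_system_prompt"] = f"{current}\n{extra}".strip()
    return new_state


state = {"custom_system_prompt": "Be concise."}
print(apply_state_tags(state, {"add_system_prompt": "Bob is a raging alcoholic"}))
```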

#

To my understanding, the system prompt handles the behavior you're seeking

keen palm May 29, 2024, 2:32 PM

#

Yeah, there has to be something for that

#

Maybe need to dig into the code for complex memory and see how that does it

halcyon quarry May 29, 2024, 2:32 PM

#

You could now use custom_system_prompt but it will set it rather than combine with others

#

[[state:{custom_system_prompt:Bob is a raging alcoholic}]] What's Bob been up to lately?

keen palm May 29, 2024, 2:35 PM

#

I'll give that a try and see how it works

halcyon quarry May 29, 2024, 2:35 PM

#

Adding to to-do list: Append system prompts

keen palm May 29, 2024, 2:42 PM

#

That does say in the command window that the state has changed, but the system prompt is ignored

halcyon quarry May 29, 2024, 2:44 PM

#

Chat mode or instruct mode?

#

Shouldn't matter... hum, seems like the default chat_template_str defines how the system prompt should be handled.

#

If it's having no effect then that's a bit puzzling, unless the chat template is being ignored or something

#

I had to fix a few things but now I'm onto Regenerate

keen palm May 29, 2024, 2:49 PM

#

This is the context modification code in complex_memory, BTW:

context_injection_string = ('\n'.join(context_injection)).strip()

if memory_settings["position"] == "Before Context":
    state["context"] = f"{context_injection_string}\n{state['context']}\n"
elif memory_settings["position"] == "After Context":
    state["context"] = f"{state['context']}\n{context_injection_string}\n"

return generate_chat_prompt(user_input, state, **kwargs)

halcyon quarry May 29, 2024, 2:50 PM

#

The difference between this and custom_system_prompt (if it were working), is that custom_system_prompt would vanish from history

#

actually, nvm

#

obviously context does not get written to history 😛

#

Instead of dicking around with system prompt I'll just copy paste this as inject_context

#

open to suggestions on how to name it

keen palm May 29, 2024, 2:52 PM

#

For tags purposes?

halcyon quarry May 29, 2024, 2:54 PM

#

I'd need to create two tags in any case such as
inject_context_text: The thing you want injected
inject_context_mode: before (or after)

Instead I'll just do this
prefix_context: your text
suffix_context: your text
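
A minimal sketch of how the two tags could be applied, modeled on the complex_memory snippet quoted earlier. `apply_context_tags` and the dict shapes are assumptions; the point is that only the context for this one generation changes, and nothing is written to history.

```python
def apply_context_tags(state: dict, tags: dict) -> dict:
    """Return a copy of state with prefix/suffix text wrapped around context."""
    new_state = dict(state)
    context = state.get("context", "")
    prefix = tags.get("prefix_context", "").strip()
    suffix = tags.get("suffix_context", "").strip()
    # Skip empty pieces so no stray blank lines end up in the context.
    parts = [part for part in (prefix, context, suffix) if part]
    new_state["context"] = "\n".join(parts)
    return new_state


state = {"context": "Bob is a bartender."}
tags = {"prefix_context": "Bob hates Mondays."}
print(apply_context_tags(state, tags)["context"])
```

Returning a copy keeps the original state untouched, so the injected text is gone again on the next message unless the tag triggers again.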

#

Yes, will just add params to the Tags system

keen palm May 29, 2024, 2:55 PM

#

That would be the easiest, yes

halcyon quarry May 29, 2024, 2:55 PM

#

Quite easy

keen palm May 29, 2024, 2:56 PM

#

I don't know whether prefix or suffix is better in use, but whatever

halcyon quarry May 29, 2024, 2:56 PM

#

Just gives it more or less priority than your actual context

keen palm May 29, 2024, 2:58 PM

#

Yeah, I know. Just not sure which is preferable. Maybe important character information should be weighted higher than current context

halcyon quarry May 29, 2024, 3:04 PM

#

May need to guess and check that - I'll add both

keen palm May 29, 2024, 3:05 PM

#

Right, I'd play around with it a bit

visual dagger May 29, 2024, 3:16 PM

#

keen palm Do you know of a way to put information into context without adding it to the on...

you can make a python script that edits the history and removes the junk (memory info) after generation

#

but you have to include the memory info in a way that's easy to remove later

#

something like sections?

#

+ info 1....
+ info 2....
etc ....

###John response
I will be right back

This is the full prompt. So after the generation you can easily remove/edit the memory info using .split()
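
The split idea above could look something like this, as a sketch. It assumes `###` never appears inside the memory lines; `build_prompt` and `strip_memory` are made-up names.

```python
from typing import List

MARKER = "###"


def build_prompt(memory_lines: List[str], turn: str) -> str:
    """Put '+ ...' memory lines first, then the marked conversation part."""
    memory = "\n".join(f"+ {line}" for line in memory_lines)
    return f"{memory}\n\n{MARKER}{turn}"


def strip_memory(prompt: str) -> str:
    """Drop everything before the first marker, keeping the marker itself."""
    if MARKER not in prompt:
        return prompt
    return MARKER + prompt.split(MARKER, 1)[1]


prompt = build_prompt(["info 1", "info 2"], "John response\nI will be right back")
print(strip_memory(prompt))
```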

keen palm May 29, 2024, 3:19 PM

#

It would be far better to have that info not ever added to history, though

visual dagger May 29, 2024, 3:20 PM

#

I think there is a post processing thingy in ooba, you gotta check the source code

#

I think you can clean things up before adding the response to the history

halcyon quarry May 29, 2024, 3:22 PM

#

I'm adding this feature today.

visual dagger May 29, 2024, 3:23 PM

#

you are DA GOAT

#

@keen palm NPCs... did you try this in any way shape or form?

keen palm May 29, 2024, 3:26 PM

#

In what way do you mean?

visual dagger May 29, 2024, 3:29 PM

#

like hmm.... interacting with a random new character, a one off interaction?

#

and also maybe saving that NPC to interact with later

#

or any other way

keen palm May 29, 2024, 3:30 PM

#

I haven't done that so much with this bot, but I have certainly done so previously when using complex_memory.
With the bot, when the context injection thing gets up and running, saving an NPC for later would be as simple as creating a new tag for it.

#

But interacting with a random new character temporarily I have absolutely done countless times.

visual dagger May 29, 2024, 3:30 PM

#

keen palm I haven't done that so much with this bot, but I have certainly done so previous...

wait but how you will do it?

#

generate that NPC on the fly?

#

on real time?

halcyon quarry May 29, 2024, 3:31 PM

#

Predefined

visual dagger May 29, 2024, 3:32 PM

#

keen palm But interacting with a random new character temporarily I have absolutely done c...

automatically? without the need for you to manually set things up about the NPC?

visual dagger May 29, 2024, 3:32 PM

#

halcyon quarry Predefined

: /

#

ouch

halcyon quarry May 29, 2024, 3:33 PM

#

You know by now that the bot can't currently save information locally. Don't hold your breath: this feature may only come if Reality figures it out, or once I finish my todo list and start looking for new things to try

visual dagger May 29, 2024, 3:33 PM

#

saving things is easy, one .py script away

#

or function

halcyon quarry May 29, 2024, 3:34 PM

#

The most it can save and call back atm is recent messages, and eventually user variables

keen palm May 29, 2024, 3:34 PM

#

visual dagger automatically? without the need for you to manually set things up about the NPC?

It depends on the prompt, but yeah, the LLM is able to create a character that it then uses for interactions. Information about the NPC is fleeting, though, unless it's stored somehow for later

visual dagger May 29, 2024, 3:34 PM

#

the problem is the logic, saving things for the sake of saving won't result in good results

#

but there gotta be a reason for it to exist (the saving feature)

#

like how it will be used in an effective way?

visual dagger May 29, 2024, 3:35 PM

#

keen palm It depends on the prompt, but yeah, the LLM is able to create a character that i...

wdym by fleeting?

keen palm May 29, 2024, 3:36 PM

#

Unless the NPC information is stored in a lorebook or something similar, then it gets pushed back in context until it's forgotten, basically

visual dagger May 29, 2024, 3:36 PM

#

@halcyon quarry do you think of making an LLM conditioned function in the bot?

#

ifLLM("John is a teacher")

#

it will return true or false

halcyon quarry May 29, 2024, 3:37 PM

#

No
