#ad_discordbot (Fork of Fork of xNul's bot)
You need to set up characters, plus play around with settings of the llm
Do you have a template for it?
There are example characters in the examples folder
Also if you scroll up in this channel you'll find some files with it too
As for settings - I dont have a template
Loras - can I somehow use them from their default folder or I have to copy them?
actually i was having that problem too but i thought it was my fault, ill try to find where is the problem later
The bot also has a built in dynamic “stopping string” of the username of who makes the request - example usage is in the characters
This can definitely help stop the LLM from answering itself as you
I haven’t used Loras before but they should work normally, just need to use cmd flags
I have it like half done, sorry I went back to my report builder project. My niece was asking about some way to work on researching topics for her criminal law extra credit, and reimaging my spare laptop for her to use. I'm working on finishing that bundled into pyinstaller into a single .exe for ease of use.
Send the topic, key phrases, and guidance on what you want, and have it use several methods to scour the internet for resources and build a finalized .pdf report with references.
As for the discord bot.py script, I think breaking it down almost works. What I have working, everything except the image generations, is roughly bot.py at 2,000 lines, then + 1,500 + 1,000 and several smaller modules between 150~250.
No I've just had my 2 1/2 yr old son and work keeping me busy a lot
.
well that fixes the problem, but aren't you curious why it happens?
As I was testing, I did clean install and it was having the problem, now I cloned my main bot and updated and it is having the issue so...
pretty sure that this happened during the api branch period
It’s on TGWUI end
The reason this issue flared up with the bot is because TGWUI changed how the param was supposed to be formatted
So our stopping strings suddenly stopped working
So check the formatting of the param - if you are copy/pasting old formatting the stopping strings aren’t working at all
before it was working although ive only tried with 1 message
It was working because the bot’s stopping strings were actually taking effect
But TGWUI changed the format for the param
i have to see if it wasn't a coincidence
Lowkey I just copied the example stuff and asked chatgpt to remake it for me fr
That kinda worked for me
Just put your personality and context thingy and ask chatgpt to make it yours
And then put it on the characters and use that one
Also @halcyon quarry will there be memory feature and internet search in the future update for it?
i actually don't remember what version this is, but it's very old and running with the latest tgwui; it doesn't have the api and it's missing a key in dict base, which is top_n_sigma
Need help making bot do the magic for gguf model. Other than in dict imgmodel, do i need to change something else?
I did this preset there and still get error about clips (i think it still treats the model as SDXL).
- filter: [gguf, chroma]
exclude: [xl, turbo, sd15]
payload:
width: 1024
height: 1024
cfg_scale: 4.0
distilled_cfg: 1.0
sampler_name: euler
scheduler: beta
vae: ae.safetensors
clip1_gguf: clip_l.safetensors
clip2_gguf: t5xxl_fp8_e4m3fn.safetensors
override_settings:
forge_additional_modules:
- ae.safetensors
- clip_l.safetensors
- t5xxl_fp8_e4m3fn.safetensors
comfy_delete_nodes: []
tags:
- tag_preset_name: CHROMA Payload
Ok, guess_model false helped a bit, however i get black square instead of image now... lol
Are there any other settings that can override these presets?
both chroma and flux gguf give me the same square. sdxl and sd work as intended... argh..... @halcyon quarry almighty, please help lol
are those the vae / clip / textencoders you have?
Tags could override those but you would see it in cmd
yes, trying the last thing and I will get back to you
Yeah, I am stupid. I forgot to move my payload to the folder with payloads... Thus it couldn't choose the correct scheduler and sampler, as well as vae and encoders i guess... So yeah, make sure that payload files are there...... 🤦♂️
Ok apparently that was not the issue as now I am having black square again after 1 good image lol
Tbh I’ve also had random black square images here and there when running flux from Forge
Distilled cfg of 1.0 is very low - you have it flip flopped with regular cfg
Would make more sense if cfg_scale: 1.0 / distilled 4.0
Its like a requirement for chroma these cfg's. The thing is scheduler still doesnt stick. It shows this line:
WARNING:root:Sampler Scheduler autocorrection: "Euler" -> "Euler", "None" -> "Automatic"
Weird, but I did imgmodel once more to make sure it's chroma that's chosen and now it works. Maybe that's because I have guess model turned off and the model needed to be reloaded...
Guess model must be enabled for per-model / type overrides
If it’s disabled, it just uses whatever is in base_settings > payload
I just did a lot of focused updates in the imgmodel settings handling, lots of testing etc
One thing to note is that the “filters / excludes” stack with multiple matches, so if you included [flux, gguf] and the model you select has both flux and gguf in the name + path, it scores higher than just [flux] or [gguf] only
And if you include [gguf, chroma] in the “exclude” for other presets, it would make those presets get negative points for each match
I’ll look into it when I get back from vaca. Maybe something is bugged with my settings management here after all
Huh, ok
Basically it compares the selected model name (which includes subdirs) against each preset’s filter/exclude and scores the presets
Most model filenames correctly include the model type/class anyway
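The scoring described above could be sketched roughly like this. This is only an illustration of the idea, not the bot's actual code: the function names, the +1 / -1 weights, the tie-breaking, and the example model filename are all assumptions.

```python
def score_preset(model_name: str, preset: dict) -> int:
    """+1 for each filter keyword found in the model name, -1 for each exclude keyword."""
    name = model_name.lower()  # the name is assumed to include subdirs
    hits = sum(1 for kw in preset.get("filter", []) if kw.lower() in name)
    misses = sum(1 for kw in preset.get("exclude", []) if kw.lower() in name)
    return hits - misses


def pick_preset(model_name: str, presets: list) -> dict:
    """Return the highest-scoring preset (ties go to the earliest one in the list)."""
    return max(presets, key=lambda p: score_preset(model_name, p))


# Hypothetical presets; the "name" key is only here to make the result readable.
presets = [
    {"name": "SDXL", "filter": ["xl"], "exclude": ["gguf", "chroma"]},
    {"name": "Flux", "filter": ["flux"], "exclude": []},
    {"name": "Chroma GGUF", "filter": ["gguf", "chroma"], "exclude": ["xl", "turbo", "sd15"]},
]

model = "Chroma/chroma-unlocked-v38-Q8_0.gguf"  # made-up filename
best = pick_preset(model, presets)
print(best["name"])  # Chroma GGUF
```

With this sketch the Chroma preset gets 2 points (both filter words match), Flux gets 0, and SDXL gets -2 because both of its exclude words match.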
so far so good, so payloads loaded correctly thank god. And chroma now does too. Need to figure out best settings for it now, but thats different story lol
I haven’t really checked it out yet, lots of buzz about it but I know the model isn’t fully trained yet
Kind of reminds me the brief period peeps were finetuning XL 0.9 beta, with 1.0 just around the corner
Yeah, i remember that time lol but im just curious like that) not bad, but im still trying to figure out schedules and samplers. And cfg is killing me
Honestly got so tired of flux that I got back to sdxl and illustrious. Flux just killed creativity in me
I only use flux for certain things that require higher fidelity, coherent text, etc. Can’t easily get away from the gen speed and still great quality of XL
Oh yeah, totally, sdxl still rules for most applications
@halcyon quarry is there any way to make the ai have memory?
So it remembers previous interaction and I do not need to summarize everything for it
Maybe there’s TGWUI extensions for that - there’s some Tags for memory stuff (prefix_context / suffix_context) but I don’t think it’s what you’re after
Those tags are more for temporarily supplementing your context with specific details, triggered by keywords like mentioning a certain person / place / thing / etc
So far the only thing I did was add entries to character file. That has some success. But for tgwui stuff better ask the main discord I think
you can try these extensions
this is a fork but works (tested like 3months ago) https://github.com/Sonic2kDBS/long_term_memory
this is the original, you can explore it https://github.com/wawawario2/long_term_memory
and this is a more complex one that may not work https://github.com/brucepro/Memoir
How do I set it up?
these are tgwui extensions, you can follow their readme, they are mostly configured in the webui and usually work with the bot if they work in the tgwui
so when i tried to use the extension it gave me this error
Are you installing these in the correct directory? That’s two extensions printed in cmd where TGWUI can’t find script.py
I installed it in the extensions folder in tgwui, and installed the requirements using the update_wizard_linux.sh
seems like the latest tgwui is having dependency conflict with the extension, unlikely to be fixed
What tgwui version do you have that works with it?
And if possible could you send that version?
mine is broken too because i updated it here :V
the v2.1 im pretty sure it would work but it might be too old
so you can actually try v3.0
you should definitely try memoir first to see if that satisfies your needs as it is still in development
although i ve never touched it :p
It’s currently possible to do something like this with the bot:
- have a Tag which includes a “flow” tag param - which always triggers (or has a trigger word like “think” or “remember”)
- The tag also has should_gen_text: false (skip LLM gen initially)
- Flow tag first step: prompt the LLM to summarize the recent conversation. (should_send_text: false / save_history: false)
- flow second step: ‘format_prompt: {user_1}’ (will use your initial prompt) / prefix_context: {llm_0} (will prefix the context with the summarization from flow step 1)
I think in the flow tag (flow_base param) you could actually use the “trumps” tag param to prevent triggering the same Flow tag again
Basically this would make all your requests a 2-step iteration
Ill try to come up with something
are these extensions enabling vision and memory?
on a discord bot right
It should be but tgwui broke most of the extensions
have you tried the memoir one
Not yet but I'll try to use it
💔
Tried to ask the bot to describe the image and it just says some random stuff
sounds like it didnt work
i forgot how to install extensions in the ui
i prolly need to put it in a folder again
I just put them in the folder and just install the requirements with the update wizard
Assuming you use windows open the update_wizard_windows.bat
wtf i already had memoir installed
And select the install requirements
now i have two
Delete the other one ig
i figured that out. i wonder if one's updated or something
they have two different names tho so
im running the update wizard
@late pivot i got memoir to load in docker using qdrant
i got it to load, not work
yet
getting it to run with ad_discordbot is the big problem it seems
hooray
progress
the outputs are not useful at all but it is working
@late pivot
You might have a malformed “tag” - I’ll see about improving error reporting there
eh so far it hasnt been an issue at all in terms of the character cards functioning
i do need to find out why the responses are nonsensical
Where that error is occurring, I think you’re not collecting any Tags at all, so can’t use the system without fixing that
I can fix that a bit
sanitized version. im not really calling for any tags
lots of prompt fuckery to try and simulate memory
Interesting character!
doesn't work for me to install the requirements
btw @halcyon quarry how do i install image generation again?
- move a payload from examples to user/payloads depending on the software (Forge / Swarm / Comfy / etc). Adjust the payload as you like.
- recommend moving Prompt Enhancer char from examples to user/char (or tgwui char dir)
- in config.yaml > imggen > enabled
- in dict_api_settings.yaml, ensure the API is enabled, and the name in the bot_functions section is paired. Ensure the txt2img / img2img endpoints reference the correct payload file
Which one is better?
i downloaded comfy instead
I recommend using the modeltypes + loras one
Check that it’s referenced in the api settings file. You may need to configure dict_imgmodels especially if you are using flux / chroma
You can run whatever workflows you want at all via the bot (triggered from Tags) but this is advanced usage - frankly I need to simplify this for user convenience (high on my to-do list)
I had kind of a eureka moment when thinking about the upcoming “user commands” feature - I’m going to be able to release this much sooner than I thought
Initially, it’s going to be limited to just executing API workflows, though. But that was the main motivation for the feature anyway
can it use this model?
https://huggingface.co/Meina/MeinaMix_V11
Gonna work no prob
It’s an SD 1.5 finetune just make sure dict_imgmodels will correctly identify it as a 1.5 model (put it in a SD15 subdir or something - “filter” needs to match part of the dir or filename)
Personally can’t imagine using an SD 1.5 model tho - there should be some comparable style XL models which would be superior
twas a community project so it was written by like 12 people over like 30 iterations to get the simulated memory just right
Ngl @halcyon quarry I think setting up memory is difficult, could you add a feature where it would just look at chat history to remember the memory?
The whole point of memory is to reduce chat context, so if you want the bot to remember the chat then just increase your context, although in the future he is going(?) to add a tag that creates tags, which would do the memory thing a lot better
How do I increase it?
For now if you have some specific memory like a birthday you can add a tag manually
Traditionally in the model tab inside tgwui, pretty sure it still works
But increasing context would make it consume more VRAM and make it slower when you have large context
How do I add the tag instead
in character or in dict tag
- trigger: 'Birthday'
  search_mode: userllm #Change it if you need
  suffix_context: "Your birthday is celebrated on December 2nd"
also you have persist tag that makes it reapply for more interactions
you can check all of them in dict tag template
Also in the github Wiki
Is ts normal?
No
that's a common failure mode in LLMs, and you need good parameter config to combat it. That's autoregressive collapse, where an LLM has said "mmmm" a little bit, and when it's asked to predict the most likely next token... maybe more "mmm", and once it's done that two or three times, it's just actually most likely to continue so it biases more and more and never stops itself
Flux Kontext is crazy
First try
This is also only 5-steps using a Flux Schnell LoRA, as someone described here - https://www.reddit.com/r/StableDiffusion/comments/1lm2k1o/flux_kontext_what_ggufs_to_use_with_12_gbs_of_vram/
This took like 30 seconds with 12GB Vram
amazin
think I may have found the issue with the chat settings
yes, I inadvertently messed up a text gen setting when updating some imggen stuff
Pushed an update that should resolve strange LLM output
What was happening: the mode param was being changed from a valid one (chat, instruct, etc) to txt2img (invalid). I reviewed all other settings management, everything else looks as expected.
I imagine TGWUI would fallback to some default mode when receiving an invalid mode param 🤷♂️
@valid crypt let me know when you have a chance to test 🤗
🎉
Won’t break anything ever again 🤞
Btw what's the new update?
I fixed what may have been causing your LLM to write mmmmmmmmmm
Or at least part of that issue
Which one to edit?
Or replace
Just run updater
I had messed up a setting handling on the backend
Coincidentally at the exact same time, TGWUI changed stopping string formatting, so I blamed that without realizing I fudged something
Will the settings get overwritten?
Can you tell me just which part to modify instead so I don't have to reconfigure it?
his updates would never touch user settings 🤞
the fix was about the ai replying for the user
literally this one
Yeah but which file?
you would never touch bot.py right? 😁
The user directory is git ignored
you can safely run the updater without losing any settings
Stumbled across this workflow on Reddit where you can use 2 images with Flux Kontext - and I'll be damned it works
I wager by the end of this week it should be relatively easy for users to create a command that would execute this workflow by requesting 2 images and a prompt
And other workflows, img2vid etc
The user commands feature is really not far off from completion
@late pivot oh geez
Could you correct this on your end real quick in one_click.py - let me know if the updater then works
works fine
What would you like to do?
A) Update the bot
B) Revert local changes to repository files with "git reset --hard"
C) Switch to standalone environment (remove TGWUI integration)
N) Nothing (exit)
Input> A
- Updating the local copy of the repository with "git pull"
git: 'remote-https' is not a git command. See 'git --help'.
Command '. "/run/media/OneVloth/HDD/text-generation-webui-main/installer_files/conda/etc/profile.d/conda.sh" && conda activate "/run/media/OneVloth/HDD/text-generation-webui-main/installer_files/env" && git pull --autostash' failed with exit status code '1'.
Exiting now.
Try running the start/update script again.
(/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/installer_files/env) [OneVloth@arch ad_discordbot]$
i presume these outputs may have been caused by that bug
Even though it failed haha
Yes, could have been the cause
It was an oversight on my end when I was updating something related to img gen
it happens
iirc the only devs on this are you and like one or two other people right?
Basically me 😛
yeah
i noticed that. i try to be as vocal as i can about how things are going for me cuz of that
All feedback very much appreciated
🫡
I'm not sure exactly what's going on here
For now, you can just git pull
The updater script is mainly to ensure that new dependencies are installed while updating - no new dependencies since the big "API" feature update
git is not a command
do you have git installed?
did u add it to path?
NGL, I copied almost verbatim the TGWUI updater script logic / code except adding my integrated/standalone logic
I’ll take a look tomorrow and see if our friend Ooba changed something in his script that I should be mirroring
@halcyon quarry can you give me the fix you made for the bot.py instead?
since the updater kinda broken
You can just download the raw file from GitHub and replace it locally
Or just use git pull from the ad_discordbot dir
@halcyon quarry btw an error when i tried to use the new one
Trying to activate Conda from: /run/media/OneVloth/HDD/text-generation-webui-main/installer_files/conda/bin/conda
Conda activated successfully.
Traceback (most recent call last):
File "/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/bot.py", line 42, in <module>
from modules.utils_misc import check_probability, fix_dict, set_key, deep_merge, update_dict, sum_update_dict, random_value_from_range, convert_lists_to_tuples,
ImportError: cannot import name 'consolidate_prompt_strings' from 'modules.utils_misc' (/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/modules/utils_misc.py)
(/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/installer_files/env) [OneVloth@arch ad_discordbot]$ ./start_linux.sh
Welcome to ad_discordbot
Checking for existing Conda installation...
The bot can be integrated with your existing text-generation-webui environment.
[A] Integrate with TGWUI Recommended
[B] Create and use own environment
[N] Nothing, exit script
Enter A, B, or N: A
Trying to activate Conda from: /run/media/OneVloth/HDD/text-generation-webui-main/installer_files/conda/bin/conda
Conda activated successfully.
Traceback (most recent call last):
File "/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/bot.py", line 42, in <module>
from modules.utils_misc import check_probability, fix_dict, set_key, deep_merge, update_dict, sum_update_dict, random_value_from_range, convert_lists_to_tuples,
ImportError: cannot import name 'consolidate_prompt_strings' from 'modules.utils_misc' (/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/modules/utils_misc.py)
(/run/media/OneVloth/HDD/text-generation-webui-main/ad_discordbot/installer_files/env) [OneVloth@arch ad_discordbot]$ ./update_wizard_linux.sh
You seem to be on a much older version of the bot than I expected
So just replacing bot.py is no good - you need to git pull
Seems like Git was not installed correctly on your system
Have you ever tried running the TGWUI updater script? Do you not get the same error?
Update: New command /change_main_api
- Allows changing the "main" API for a bot function during runtime
For example, can change from SD Forge / Swarm / Comfy / etc - during runtime
This will also work for TTS clients.
This will also work for TextGen clients whenever I expand support for that
The tgwui updater always works
How that's possible makes no sense to me - I've compared the TGWUI updater against mine, and your copy/paste from the cmd error -
That error has to do with conda being compiled incorrectly. But in this case, my script is activating TGWUI conda environment and executing identical command, so I don't see how this could succeed with TGWUI but not from my script
@vestal python Thanks again for fixing the Linux launcher script - any chance you could test run the current linux updater script, and see if you get an error?
Yeah let me grab everything later and test again on my spare Ubuntu that has the GTX 1080ti running 570+12.8 drivers and that Qwen 3 14B
I'll just do a fresh git clone of both and see.
I'm getting nowhere fast trying to pyinstall this report builder project for my niece. Might just set it up and tell her to double-click the .bat file when she wants to research a topic.
Upgrading the machine now. My son is sick so pretty much clingy all night watching bear documentaries rather than projects.
OK had to reimage the whole computer there..
Anyways Ubuntu 24.04 LTS, nvidia 570.169 driver with cuda 12.8
Install python3 and pip
Git cloned tgwui and ran start_Linux.sh, and downloaded Qwen 3 14B and tested
Then git cloned the ad_discordbot and simply chmod +x start_Linux.sh and ran it selecting tgwui, autoloaded the qwen 3 14B and up and running. Although no character found as I haven't added one yet.
Roughly 20t/s down to 12t/s for 10k context.
With a gtx 1080ti limited to 180w
That's above and beyond what I thought you would have to do lol
Can you try running the update_linux.sh script?
Yeah update_wizard_Linux.sh
Option A works
🫡
@late pivot See if anything our friend Dundell2 wrote here gives you any ideas on anything you might be executing incorrectly (albeit he is on Ubuntu not arch)
I lost a lot of vscode small projects that never went anywhere, but I'll just have to start over. Btw did you see the note that they're adding back Gemini 2.5 Pro for 100/day requests for free again?
Very useful
Otherwise I'll have to think on this a bit more, but I couldn't think of anything that could be causing that issue
Good to know! I hardly have time to do anything as I'm just coding all the time haha
I just reworked all my internal ComfyUI subclassed behavior to be more flexible... while fixing a few other bugs along the way
Here is an example of a ComfyUI request using the generalized configuration "Steps" in my API system
I've now made a call_comfy step as a shortcut to handle all this automatically
Pushed an update
This update adds the following "Steps" that can be executed via run_workflow and response_handling:
- load_data_file: loads a .json / .yaml / .yml as a dictionary
- send_content: sends text or files to Discord interaction channel
- call_comfy: Massively improves user experience for executing ComfyUI workflows.
Some other minor bugfixes / logical improvements also made.
These new Steps are detailed in the Wiki (https://github.com/altoiddealer/ad_discordbot/wiki/APIs-‐-StepExecutor)
I've successfully tested and executed Video ComfyUI workflows with the new steps - very easy.
The icing on the cake will be the next update, which will be the User Commands feature
Heck might even have this feature available tonight
Bah. Not quite.
I'm calling it the Custom Commands feature now
That under
git clone --branch user_commands https://github.com/altoiddealer/ad_discordbot
?
Hey so the way the discord bot responds... Is it just directly from the chat generations?
I assume so, was just wondering.
What I'm thinking is just force brackets use <chatting> Only text to display in Chat</chatting>
For larger models like 14B and above this shouldn't be a problem and just have 3 attempts to use chatting brackets to supply a response to discord. To remove all this potential fluff.
The response could be altered by tag definitions, but typically no it just sends the exact response
Hold off for the moment, I'm very close to squishing the last little bugs
(I may be wrong about Tags being able to alter the response if it is not being used for image generation)
It's just something I do with my report and podcaster projects. A lot of "Tool Calling" with brackets for specific answers, only looking for the last bracket pair in a response for the answer. That way you avoid anything stuck in <think> and the fluff around normal generations.
I'm just working on the user_commands branch with nothing changed just some character yamls set. I can take a look in a bit though.
I'm about to push a commit to user_commands that basically makes it work top-to-bottom - need to test one more time...
I need to clean up the config file a bit with better comments, writeup the Wiki page, before I merge it to Main
Pushed that now - I invite you to try it out
Yeah I can pull it
It may be confusing until I add some better examples.
The way it works:
- Creating the commands is mostly straightforward in the new dict_commands.yaml config file.
- What needs explanation really is how steps factor into how the commands actually do things
- Each value from the command can optionally be pre-processed through StepExecutor (by including a steps list)
- All of the option values (processed or not) are collected to a "context" dictionary.
- If the command has steps, a StepExecutor is executed where its context is prepopulated with the data
Each key name in the context dictionary, is the option name.
The way the whole StepExecutor system works is that, any string values with syntax like "{my_first_option}" become automatically resolved to the actual data value from context
In the case of executing Comfy payloads, the payload just needs to have "{placeholders}" where each option value is going to be injected.
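As a rough illustration of the "{placeholder}" resolution described above (a sketch only, not the actual StepExecutor code), something like the following would walk a payload and swap in values from the context dictionary. The function name, the regex, the choice to leave unknown placeholders untouched, and the example payload are all assumptions.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def resolve(value, context: dict):
    """Recursively replace "{key}" placeholders with values from context."""
    if isinstance(value, dict):
        return {k: resolve(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, context) for v in value]
    if isinstance(value, str):
        whole = PLACEHOLDER.fullmatch(value)
        if whole and whole.group(1) in context:
            # The string is exactly one placeholder: keep the original type (float, etc).
            return context[whole.group(1)]
        # Otherwise substitute inside the string; unknown placeholders are left as-is.
        return PLACEHOLDER.sub(lambda m: str(context.get(m.group(1), m.group(0))), value)
    return value


# Hypothetical context built from command options, and a made-up payload fragment.
context = {"rembg_image": "cat.png", "threshold": 0.5}
payload = {
    "1": {"inputs": {"image": "{rembg_image}", "threshold": "{threshold}"}},
    "2": {"inputs": {"text": "bg removed from {rembg_image}", "seed": "{seed}"}},
}
resolved = resolve(payload, context)
print(resolved["1"]["inputs"])  # {'image': 'cat.png', 'threshold': 0.5}
```

Keeping the original type when the whole string is a single placeholder means a numeric option stays numeric in the payload instead of turning into a string.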
Let me know when it's pushed and I can test it out and start working on the other idea of just some precise_chat option
But there's a growing number of things that StepExecutor can do
I'll try to keep it minimal interference and some toggle to a module import script
I need to look and see if I do allow Tags to modify LLM response - I can easily add a "Regex" tag and that syntax would be easy to capture
I might be out of the loop too - is <chatting> standard or something? I don't quite grasp what you're aiming for
No, just some setup additions in a system prompt when toggled, like say /precise_chat is turned on. The custom system prompt tells the llm what these brackets are for, and when and how to use them, so basically every conversation response it gives will be within these chatting brackets. Then we use a python script to extract the response: first ignore any text before </think>, then look for the last pair of <chatting></chatting> and use that as the discord bot response. If it's missing, retry the request up to 3 times; if it fails 3 times, respond with the full text.
This is just a way for the AI to be more in character and to not break immersion, but also to allow it to print that extra fluff outside of the brackets so we're not messing with its context.
Example would be like this. The response would be the first paragraph wrapped in chatting brackets.
The second paragraph is just fluff and be outside of the brackets.
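A minimal sketch of the extraction and retry logic described above, assuming the tag names from the messages and a caller-supplied generate function (hypothetical, standing in for whatever actually calls the LLM). This is not the code from the fork being discussed.

```python
import re
from typing import Callable, Optional

CHAT_RE = re.compile(r"<chatting>(.*?)</chatting>", re.DOTALL)


def extract_chat(raw: str) -> Optional[str]:
    """Return the text of the last <chatting> pair after the last </think>, or None."""
    # rpartition returns the whole string as the tail when </think> is absent.
    tail = raw.rpartition("</think>")[2]
    matches = CHAT_RE.findall(tail)
    return matches[-1].strip() if matches else None


def precise_reply(generate: Callable[[], str], max_attempts: int = 3) -> str:
    """Retry generation until a <chatting> pair is found, else return the full text."""
    raw = ""
    for _ in range(max_attempts):
        raw = generate()
        reply = extract_chat(raw)
        if reply is not None:
            return reply  # stop as soon as brackets are found
    return raw  # all attempts failed: fall back to the last full response


sample = (
    "<think>maybe <chatting>draft</chatting></think>\n"
    "<chatting>Hi!</chatting> note <chatting>Hello there</chatting> fluff"
)
print(extract_chat(sample))  # Hello there
```

Returning from inside the loop as soon as a pair is found avoids running all 3 attempts when the first response was already usable.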
Aha
Yes, so in terms of the Custom Commands feature, you could look at line 4878 in bot.py
line 4947 is the callback that is executed when a command is launched.
Currently this queues a task that is run at line 3008
And the logic is currently to exclusively use the StepExecutor https://github.com/altoiddealer/ad_discordbot/wiki/StepExecutor
I have not put any consideration yet into creating command config options that could ultimately update bot settings, etc
My number one priority with this feature is to make it so anyone with a few brain cells to rub together, could arbitrarily create discord commands for whatever ComfyUI workflows they want, with customizable inputs and output processing
This is working - which is super friggen cool
Whoops 
clean but I need to check the logs backend and see exactly what it's creating. Takes twice as long it feels to give me a "precise answer"
Yeah adding logs to see what's being inputted/outputted raw text to modules/precise_logs/timestamp_logs.txt
See how it's being handled... but seems to be working and removes the fluff at least for tgwui backend setup so far.
Created a fork for the feature to test with and fix up some things. Like if two <chatting> pairs are present, using the last pair of brackets for the output is generally the right call.
Have one issue where I'm not tracking whether chatting brackets were found, so it just loops 3 attempts and then processes it... Little annoying (it's the way I'm gutting the response and then checking for the brackets. Just an order of operations issue)
So the point of this is to alleviate “thinking” from being included in the response?
To go a step further and any non-conversational content.
Anything that's 3rd person, thinking, etc
I haven’t used the newer LLM models much - so with a system prompt they’ll divide their “chat” response from all sorts of other internal monologues?
Well yknow even older models might benefit with the horrendous hallucinations by instructing it to a direct response using only the given brackets and we gut out the rest in a script. Something something
Conditional Output Generation / "Soft Tooling"
Newer LLMs like Qwen 3 14B have a hard time not thinking, and this also helps strip the thinking and any hallucinated parts of the conversation and/or internal monologue
One thing it doesn't solve is endless looping 
Yeah learning more about it... How to keep it fresh in the LLM's mind to use it correctly since just keeping it in system prompt loses memory on how to use it, so a simple added 15 tokens to every request seems to help.
Going to spend all night once my boy is asleep to fine tune this
Had an issue where the bots keep making up stories like
(Bot message here)
Then said my name like
Vloth : (story here)
It basically made me say something I didn't on the bot messages
And then somehow keep going on making up the story replying to itself
Yeah that's normal and depends on the model.
Mythomax l2 13b
My method I'm testing hopefully fixes that, but might be also model dependent on its capabilities.
Is there a way to fix it?
mythomax is a bit of an older model
You can add user plus the : character as a stop token, but eventually you'll reach like 50 stop patterns
You can use the custom stopping strings from the example M1nty character - it should prevent the LLM from impersonating you
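To illustrate the idea of a username-based stopping string, here is a small sketch that cuts a finished response at the first stop pattern. Real stopping strings are handled by the backend during generation, so this is only a client-side approximation; the function names and the exact "\nName:" pattern are assumptions, not the bot's or TGWUI's format.

```python
def build_stopping_strings(user_name: str, extra: tuple = ()) -> list:
    """Stop when the model starts a new line speaking as the user."""
    return [f"\n{user_name}:", *extra]


def truncate_at_stop(text: str, stops: list) -> str:
    """Cut the text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


stops = build_stopping_strings("Vloth")
raw = "Sure, here you go.\nVloth: thanks!\nBot: you're welcome"
print(truncate_at_stop(raw, stops))  # Sure, here you go.
```

Generating the pattern from the requesting user's name keeps the list short, instead of growing toward dozens of hand-written stop patterns.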
I'm using Qwen 3 14B, uses around 10GB's Vram for Q4. What's your vram?
GTX 1080 ti
11gb
That's perfect that's what I'm using
Qwen 3 14B Q4 with 14000 context at q8_0 no-mmap and such
Restricted power to 185w to keep the wattage down and it only costs roughly 10% speed, down to 20t/s writes, which is more than enough
So far the output has been very clean with some tweaks.
Now just need to work in the external tool calling my research program and be set for now..
commands:
  - command_name: remove_bg
    description: Remove the background from an image, with optional threshold value
    options:
      - name: rembg_image
        description: The image to remove the background from
        type: attachment
        required: true
        steps:
          - save:
              file_path: comfy-input # This should be a symlink from comfyUI input directory, to your output directory.
              timestamp: false
              returns: name # Extract filename from save result dict
      - name: threshold
        description: Adjusts the strength of the edge detection
        type: string
        required: false
        choices: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
        steps:
          - type: float
    # All selection data will be initialized into context (can be used to resolve payload fields, etc)
    steps:
      - load_data_file: comfy_workflows/comfy_remove_bg.json
      - call_comfy:
          client: ComfyUI
          message: Removing background from image
          file_path: comfy
I'm going to hold off on pushing to Main still - tomorrow is when it's coming
Is ‘/imgmodel’ printing “There are no Imgmodels available”?
Be sure to either use the correct name for the “main” Imggen in the top section of dict_api_settings.yaml
OR - try ‘/change_main_api’ - if you initialize with the wrong main API
Just put the image file in same dir as the character file, with identical filename
Yes, and I put the file in the characters files, with identical name too
Did you enable the config setting “per_server_characters” ?
Oh yeah I forgot
I'll do it later since I'm away atm
No no, if you did, that would explain it
Bc then you need to explicitly set the image in config.yaml
Make sure to use /character and select the character
It may be delayed 10 minutes btw - but this would be printed in cmd
Discord limitation
The bot logs the time it last successfully changed display name/pfp - when changing character it checks the time and creates a delayed update task if < 10 mins since last change (the actual character data is always updated immediately)
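The cooldown check described above can be sketched roughly like this. It assumes the last successful change is stored as a Unix timestamp. The names `PROFILE_COOLDOWN_SECONDS` and `seconds_until_profile_update` are made up for illustration:

```python
import time
from typing import Optional

PROFILE_COOLDOWN_SECONDS = 10 * 60  # the 10-minute window mentioned above


def seconds_until_profile_update(last_change: Optional[float], now: float) -> float:
    """Return how long to wait before changing display name / pfp again."""
    if last_change is None:
        return 0.0  # never changed before, so update immediately
    elapsed = now - last_change
    return max(0.0, PROFILE_COOLDOWN_SECONDS - elapsed)


now = time.time()
print(seconds_until_profile_update(None, now))       # no previous change
print(seconds_until_profile_update(now - 120, now))  # changed 2 minutes ago
```

A non-zero result would be the delay for the scheduled update task, while the character data itself is switched right away.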
Now that the bot is very capable of producing and trying to send large files, I need to add a config option to allow designating a file hosting API to automatically use (send a download link) when content > 25MB.
I will also be adding a "Step" to offload step execution to the background (release the task from the queue).
I'm reaching a point where I think Qwen 3 14B is just not that good at taking instructions.
Like I can get a response now that is good and clean like 95% of the time, but it just takes 1 1/2 ++ minutes to get it.
This is the last log I have. Any ideas what could be missing? The refiner pass has no character persona and relies on the default settings in the I think dict_base_settings.yaml versus what's preset in the character.yaml
Just sometimes it keeps talking until the 1024 token limit is reached
even with repetition penalty set to 1.3
It’s probably a prompting issue?
If the character file has no state settings, it would be entirely from dict_basesettings
Just add a print statement for the payload, before chatbot wrapper
Here’s an idea for you - you can have a custom character context that is tailored to simply update responses to filter out bullshit
Then just use a “flow” tag to always pass the LLM’s reply to it to edit
I'm going to swap to Qwen 3 30Ba3 and also Try something that's never really failed me for relative tasks, QwQ-32B and see how they handle it first.
I've tested out a file upload API https://gofile.io/ and this has helped a lot with figuring out the logic for a generalized upload files via API logic
I have some nice methods for this now, also added an upload_files "step" which is similar to call_api step but is specialized for preparing the payload.
If you want users to be able to use the command on the other server, then simply update the permissions on that server
If you don't want them to be able to use the command on that server, then I believe it is working as intended
Ok so I simplified my little side project, and Qwen 3 30Ba3 UD XL Q4 is really good with this additional instruction.
The issue is I need to up it past 1024 token length since thinking can take up 90% of the tokens
Upping the limit to 2048 tokens for now seems to help keep it from running out and causing it to redo the request.
Yeah Qwen 3 14B was crummy like 40% success rate with unwanted text sometimes included, whereas Qwen 3 30Ba3 so far 75%+ success rate, no unwanted text which is the whole purpose, and double the t/s speed with P40 24GB GPU.
Does anyone have gemma 3 params that work
Usually model card on HF has suggested settings?
So this is pretty cool... the tasks are all executed within Semaphores, and apparently just calling semaphore.release() will free the slot without interrupting task execution or cleanup
So this is a simple solution to "offload" a task which just has low compute / waiting objectives ahead of it (uploading files -> sending them to discord)
yep, works perfectly.
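A small self-contained sketch of the early-release idea, assuming a plain `asyncio.Semaphore` with one slot. The task names and timings are invented for the demo and this is not the bot's queue code:

```python
import asyncio

events: list[str] = []


async def run_task(name: str, sem: asyncio.Semaphore, busy: float, tail: float) -> None:
    await sem.acquire()
    released = False
    try:
        await asyncio.sleep(busy)   # the part that really needs the slot
        sem.release()               # free the slot early; this task keeps running
        released = True
        events.append(f"{name} released slot")
        await asyncio.sleep(tail)   # low-compute tail (e.g. uploading / sending files)
        events.append(f"{name} finished")
    finally:
        if not released:            # avoid releasing twice on the error path
            sem.release()


async def main() -> None:
    sem = asyncio.Semaphore(1)
    await asyncio.gather(
        run_task("A", sem, busy=0.05, tail=0.45),
        run_task("B", sem, busy=0.05, tail=0.05),
    )


asyncio.run(main())
print(events)
```

The `released` flag matters because `asyncio.Semaphore` is not bounded: releasing twice would silently add an extra slot. When the acquire happens through `async with`, the context manager releases again on exit, so this manual acquire/release form is easier to reason about.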
@terse folio you may be interested in knowing this (if not already :P)
Finally back working on the little side project... So just gathering logs on what messes up, why it messed up and any solutions on fixing, and if it's fully interacting with your features or if I accidentally made it just outside with the added module.
Then see how it works. Right now I do only process (Or at least I think I do) context added to history that are successful.
(Went back to a Creative/Refiner Design. A 2-pass system that lets the character persona enabled call with context build the response, and the refiner has no character persona with basic instruction-tuned settings and only contains the response from creative to check for the chatting tags, reducing errors a lot.)
Still hit/miss.
I just can't seem to have the bot output text the same way it does in webui. It works just fine there, but it's not the same with the bot. And I've tried keeping every setting I can the same
Updated the bot recently? I had identified an issue that could be the source of this about a week ago - if you have not updated since then, please try updating
Trying that now
Friend called me and distracted me for a few hours
For a short while I had accidentally introduced a bug that set an invalid value for TGWUI’s “mode” param (chat / instruct / etc)
I see
I figured it out about a week ago
I had chat-instruct in my yaml file for my character and it's what I use in the webui too
Right - it was simply getting overridden with an invalid value though 😛
So that makes sense if that was causing an issue/not doing anything.
Ah
Well I'm very sleep deprived right now but
Fudged up something in my internal management
Why is it listening on that port? Is there a specific reason? I'm messaging the bot and getting no responses from it because of that
You must have a cmd arg flag somewhere, maybe something in your TGWUI config?
Ah
Let me try turning off flash attention and streaming llm and see if it works now since I updated
I put those in the cmd flags because I thought I might need to have everything working the same way it does in the webui
While trying everything I could think of and do before I came here
Now I have no clue. Removed those flags and it's still there.
Only noticed it after I updated.
It didn't even seem like putting --flash-attn and --streaming-llm did anything different when I put it in the cmd flags to begin with, since it seems to take the settings from tgwui when it initializes
Idk I guess I just messed something up with the bot because it's just listening on random ports every time I start it now
Maybe it’s something that happened with TGWUI update? The bot doesn’t print that it’s listening on anything
That’s coming from tgwui internals
In your TGWUI check user/settings.yaml - check your default behaviors
I checked settings.yaml and didn't see anything that referenced a port, so I don't know. I even just deleted the settings file and tried it, no change
On the webui it starts with the usual 7k port
Scratch that
When I load the model in tgwui it's listening on a random port as well, so it's something that's happening when the model is loading for some reason
It does what looks to be the chat template, and then just does that main server is listening thing
LLM responds in the webui still but idk
I'm guessing it was either a tgwui update or something I messed up so badly I couldn't even tell you how. Because I really didn't change anything other than params like temp etc in tgwui and my character yaml file
Maybe you mixed up the bots CMD_FLAGS.txt with the TGWUI one?
Or something? That’s the other place you can see ‘--listen’ (TGWUI flags)
I'll just try fresh installing the bot again after I wake up and see if that works
Because right now the only thing I can imagine is that putting --flash-attn and --streaming-llm is what messed it up
This wasn't happening before I updated though, I'm pretty sure. And I hadn't done anything else. But it happens when the model starts in tgwui too, so I dunno if that'll fix it
I don't remember if I updated tgwui instead of the bot because I wasn't paying attention and that's what caused the change or not, but that could be likely
I said this before but I'm pushing this feature to Main tomorrow afternoon. Just need to update one little thingamajig
I give up for now. Wow, learning about a lot of issues with the models I'm using for this task I had in mind. Whereas gemini flash 2.5 works almost flawlessly... I have an extra RTX 3060 12gb coming in at the end of the week. I'm going to work on my project I haven't updated in a month and pick back up next week with Hunyuan 80B13a now that it seems added to llama.cpp
Got anything for me to test I can use my Qwen 3 14B system to prop up quickly
Thanks for the offer - I'm working out this last little bug right now then I'll just be pushing this branch to Main.
Had already extensively tested this branch for awhile
Actually, just realized there probably isn't a bug and I'm just an idiot XD
That's ok, I've been running into my own stupidity repeatedly the past few days 
I revised the logic for collecting image results from the main image generation task, and was wondering why the images were failing to include metadata
But I was testing with ComfyUI which I haven't yet supported metadata injection for 🤓
Pushed the new Custom Commands feature to Main
- Includes various minor bug fixes found along the way
- Adds a new upload_large_files API role that can be assigned.
  - ALL files that the bot attempts to send will instead be uploaded if they exceed 10MB (bot will send download links).
  - If not configured, the script will warn before sending the files inevitably fails.
  - Includes generated images, TTS audio files, and any outputs from API system (i.e. ALL OUTPUTS).
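A rough sketch of the size check behind this, assuming the 10MB limit is counted in binary megabytes. `OutgoingFile`, `split_by_size` and the threshold constant are illustrative names, not the bot's API:

```python
from dataclasses import dataclass

UPLOAD_THRESHOLD_BYTES = 10 * 1024 * 1024  # assumed meaning of "10MB"


@dataclass
class OutgoingFile:
    name: str
    size_bytes: int


def split_by_size(files: list[OutgoingFile], threshold: int = UPLOAD_THRESHOLD_BYTES):
    """Separate files that can be attached directly from ones to upload."""
    attach = [f for f in files if f.size_bytes <= threshold]
    upload = [f for f in files if f.size_bytes > threshold]
    return attach, upload


files = [
    OutgoingFile("image.png", 2 * 1024 * 1024),
    OutgoingFile("video.mp4", 40 * 1024 * 1024),
]
attach, upload = split_by_size(files)
print([f.name for f in attach], [f.name for f in upload])
```

Files in the `upload` list would go to the configured hosting API and be replaced by download links in the Discord message.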
That's all really, but this is a very stable release I recommend hopping on
Have fun developing your own custom slash commands - I'll add additional post-processing to that eventually but for now it is limited to the actions of StepExecutor https://github.com/altoiddealer/ad_discordbot/wiki/stepexecutor
So according to chatgpt, I did update tgwui instead of the bot and that's what was causing my issue. Was too tired and wasn't paying attention.
I never enabled any --listen or --api flags anywhere or anything last night so I was really confused.
From my little understanding of what's happening, it seems that it'd now require api use?
But I saw this in the dict_api_settings.yaml; I don't know if that means it wouldn't work currently or if that's referring to something else
I'm completely lost and have no idea what I'm even saying so
I don't know if it's just the flash attention, or what. I had it on when using a previous model and had no issue. I tried turning it off with this model, but in the terminal, it says the model won't run without it due to quantization. So I don't know if it's just the model, an ooba update, flash attention, or what. I never used --api or --listen at any point in either the bot's cmd flags or tgwui's cmd flags.
The bot doesn’t use the API but eventually will
API status of TGWUI shouldn’t actually have any effect
What problems are you having exactly?
Oof. Gotta fix something that broke with recent TGWUI update
Wow, I bet the bot is so far beyond where it was when I last looked at it!
Great job 
PS: tasks like sending files to discord are already async
I've been coding my ass off the past 3-4 months lol
It's come a LONG way
Had a power outage and power just came back... but.
I looked into it further before that, and supposedly it's just a model issue. The llama.cpp version isn't compatible with the model, so it isn't fully loading and that's what's causing the issue. That's what I read at least.
https://github.com/oobabooga/text-generation-webui/issues/7006
Yeah there's been a number of changes with llama.cpp in TGWUI in the past weeks/months
I prefer Exllama myself so I haven't been affected
I used to use only exllama, but a lot of the latest (uncensored) models were only in gguf that I could find quantized, so I switched to llama.cpp. It also helped with trying out larger models by using offloading/layers. I liked that I could at least load and try out larger models in lm studio by using that and couldn't do that with exllama at the time in the webui. So I felt stupid never having tried out llama.cpp in the webui until i'd found that out
I saw there's an exllamav3 now so I might look into that
There were a few startup issues from recent TGWUI changes - which I resolved.
I haven’t looked too deeply into what models are supported with exl3 but it’s not as flexible as exl2
I found some exl3 models and tried them out. The generations start off decent, but gradually start degrading and then lose context as the message it's sending gets longer. Like it just chooses completely random tokens in the end. And eventually, everything just turns into completely incoherent gibberish with more messages sent.
I imagine that the same would happen with GGUF?
There must be some misconfiguration of parameters
Or the models just suck 😛
Nope. I actually found a gguf model that's the best performing I've used yet. Just won't work with the bot, only works in the webui because of the port thing.
I've updated to the most recent version of TGWUI and cannot reproduce what is happening with the port thing
There must be an extension or CMD flag responsible for that
How much vram do you have? maybe this is something I could try to reproduce
Some models just have some kind of llama.cpp error and it causes that to happen due to incompatibility
I have 12gb
Link me to the model and I'll take a look
I can just find a lower quant that'll run in 12gb so you can test it and see
Yeah
as far as I'm aware, the model / parameter handling, etc, is all using TGWUI internal code
Pretty sure that's the exact one I downloaded
I like some of these quant descriptions lol
IQ3_XS probably better
Welp I'll go with IQ3_XS
Every gemma or mistral small quant in gguf that I've tried has done the same thing with the port. I tried out gemma 3 27b in exl3 and it worked with the bot. I just probably need to tinker with parameters until it functions properly without messing up.
But that gguf mistral small model performed better than this exl3 gemma3 https://huggingface.co/MetaphoricalCode/gemma-3-27b-it-abliterated-exl3-5bpw-hb6
Tried a gemma 3 gguf and the quality wasn't as good as that mistral small. Both still did the port thing though.
So yeah, it is also doing that on my end - printing that an HTTP server is listening
I believe it is executing that here in TGWUI code https://github.com/oobabooga/text-generation-webui/blob/6338dc0051e02c29dcd4c9ee9d7fc6da20423cda/modules/llama_cpp_server.py#L21
I'm going to test with some print statements but, offhand, it seems to me that when you use the Web UI, it passes an argument for the server_path that would bypass that print statement...
The output was just fine from my test.
@marsh harness sent you a friend request so I can DM you a bit more on this
Just accepted
I noticed that TGWUI actually prints the same thing about listening to server
via the WebUI
I think that this must be required behavior for llama.cpp to function now
@valid crypt @fickle ember pinging that the Custom Commands feature was finished up and shipped
at this point it's quite easy to use multimodal LLM via a command + comfyui
So it wouldn't work with chatting with the character but you can make a command to have such a model respond to an image input + input prompt, you can have cmd options for other input params, etc
I have some big new projects to look into then
Thanks for the ping this is big news
It’s everything I dreamed of initially, and then some. It worked out amazing
The processing of individual option selections, and the final command execution, are currently limited to what StepExecutor can do (this is well documented) - so let me know if you have good ideas for new “Steps”
I’m always open to ideas and suggestions especially good ones 🤓
Some ideas I already have in my head:
- Special handling for expected option names, such as “use_llm” (which would behave the same as the option in native “/image” cmd)
- Special handling for certain option values like “if it’s a string that starts with the word “No” or “False” make it evaluate to the boolean value “False”
Nvm I just need a “use_llm” Step 😛
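For reference, the "No"/"False" idea floated above could look something like this. It is purely a sketch of the proposal, with `coerce_option_value` as a hypothetical helper name:

```python
def coerce_option_value(value):
    """Treat strings whose first word is "No" or "False" as the boolean False."""
    if isinstance(value, str):
        words = value.split()
        # Strip trailing punctuation so "No, thanks" still counts
        if words and words[0].strip(".,!:;").lower() in {"no", "false"}:
            return False
    return value


print(coerce_option_value("No, skip the LLM"))
print(coerce_option_value("Notebook"))
```

Matching on the first whole word keeps values such as "Notebook" or "Nothing" untouched.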
i need to find a way to enable vision with this
comfy ui seems like a good bet to start on that
definitely
This works good https://github.com/miaoshouai/ComfyUI-Miaoshouai-Tagger
This works good https://github.com/kijai/ComfyUI-Florence2
This works good https://github.com/EvilBT/ComfyUI_SLK_joy_caption_two/blob/main/readme_us.md
These all have example workflows you can just fire right up and start using
Added support for assigning a “Cancel” endpoint for image gen
Makes a Cancel Generation button appear while generating. Interaction user or bot owners can click it
Eh I’ll have to test some other button styles for this one
yeah much better
Ok so my inlaws were watching something on TV and I learned about this guy
altoid ross ulbricht
The government is probably stalking me by username proximity
Alright just realized this went down 12+ years ago
Hahahahaha
Yeah man, you're good. He even got pardoned from prison by Trump, so he's a free man now
I finalized that "Cancel" support - and tested with Forge / Comfy / Swarm
all working as expected
I've decided for the time being to not refactor all the textgen code to support "main TextGen" APIs.
Just sticking with the TGWUI integration for now.
Other text gen can be used via the API system (Tags, Custom Commands)
Next up - finally going to build on that “server mode” I teased.
Probably going to add another “Behaviors” block called “scheduling” where you can define timeframes for internal prompts, “coming online”/“going to sleep”/ etc
how do i fix this error?
Launch TGWUI - load the model successfully - press “Save Settings” button - when you use only the “--model” flag in bot CMD_FLAGS it should load successfully
Or you can add more launch flags
how do i add the launch flags?
Just open that txt file there’s examples
On TGWUI repo they have a list of CMD flags on the main page
btw @halcyon quarry after updating the ad discord bot yesterday the character personality is wrong
like it doesnt follow it
how do i reset it?
like make it not remember previous interaction and forces it to follow the personality?
The higher your model’s context length value is set at (in model settings of TGWUI) the more of your chat history is kept and therefore distances the original context
So reduce that and also reduce “truncation_length”
so where do i go to reduce it?
On the model loader tab - reduce the context length then save settings. The bot respects these settings
You could instead again use launch flags
Truncation length would be in your character file (if included there) otherwise bot’s base settings yaml
If you want to do this selectively based on trigger phrases, you can use Tags to manipulate history as needed
See “limit_history” (it may be “load_history” forget offhand)
does the hierarchy matter?
like where its positioned
or can i put it anywhere inside of state:?
like this
Order doesn’t matter - that’s good
That makes the setting character specific though if you change characters it will use setting in that char or base settings
You may need to mirror chat_size too (may be prompt_size again I forget)
Idk what the difference is with truncation length but typically these values mirror each other
@halcyon quarry could you consider adding a feature where if someone changes the LLM model it will also send a hello prompt to make sure it works?
btw how do i fix this?
It looks like the context length is lower than truncation length
Both are same value though?
1024 on both of them
There's the chat parameters, and then there's the model loader parameters.
The latter is managed either via TGWUI Models tab (when you load the model > Save Settings), or via CMD flags
If you load each of your models in TGWUI and "Save Settings" for each, you shouldn't have any problems.
The only thing is that I don't currently have "per-LLMModel Settings" handling yet (payload parameters).
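To make the two kinds of settings concrete, here is a tiny sanity check one could run by hand. It assumes you know the context length the model was loaded with and the truncation_length sent with each request. `check_truncation` is a hypothetical helper, not something the bot or TGWUI provides:

```python
from typing import Optional


def check_truncation(loader_context_length: int, truncation_length: int) -> Optional[str]:
    """Return a warning if the chat parameter exceeds what the loader allows."""
    if truncation_length > loader_context_length:
        return (
            f"truncation_length ({truncation_length}) exceeds the loaded "
            f"context length ({loader_context_length})"
        )
    return None


print(check_truncation(loader_context_length=1024, truncation_length=1024))
print(check_truncation(loader_context_length=512, truncation_length=1024))
```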
Alright
The proper thing to do would be to generalize most of the logic for the ImgModel settings into a base class like "ModelSettings", and the ImgModel settings/LLMModel settings would basically just be another instance with a few overrides
@calm rain Got a small gripe with Swarm API.
I use payload param donotsave: true because my system will save the output.
I handle Swarm output in this subclassed method:
async def resolve_image_data(self, item:dict, index: int) -> bytes | str:
    image:str = item['image']
    if image.startswith('View/'):
        # Is a path that needs to get bytes from server
        response = await self.request(endpoint=image, method='GET', retry=0, timeout=10)
        return response.body
    else:
        # Is base64 string
        return split_at_first_comma(image)
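For anyone unfamiliar with the base64 branch, here is a guess at what a helper of that kind does, assuming the string is a data URL such as `data:image/png;base64,<payload>`. `strip_data_url_prefix` and `decode_image` are illustrative names and not the bot's actual helper:

```python
import base64


def strip_data_url_prefix(data: str) -> str:
    """Return everything after the first comma, or the input if there is none."""
    _, sep, payload = data.partition(",")
    return payload if sep else data


def decode_image(data: str) -> bytes:
    """Decode a base64 image string, with or without a data URL prefix."""
    return base64.b64decode(strip_data_url_prefix(data))


raw = b"\x89PNG fake bytes"
data_url = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
print(decode_image(data_url) == raw)
```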
My gripe is that the base64 output does not include any metadata
And it's not like it couldn't support it, because A1111/Forge/ReForge return base64 with metadata included
@hard cobalt This is pretty funny btw - I've revised the image generation code so many times over the years now and managing the pnginfo was always a PITA. I've always assumed this step you coded here originally was required https://github.com/mercm8/chat-llama-discord-bot/blob/main/bot.py#L1032-L1041
(calling A1111 post_pnginfo endpoint)
Turns out the info is already in the result, it just gets erased when opening the image data and saving it.
So when you fetched the png info via API, it was actually an exact copy of what was already baked in
Huh
All that was needed is something like this after opening the image data:
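from typing import Optional
from PIL import Image, PngImagePlugin

def get_pnginfo_from_image(image: Image.Image) -> Optional[PngImagePlugin.PngInfo]:
    # Collect the text metadata already present on the opened image
    if not image.info:
        return None
    pnginfo = PngImagePlugin.PngInfo()
    for key, value in image.info.items():
        pnginfo.add_text(key, value)
    return pnginfo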
I seem to remember being annoyed that it wasn't in there
Well that's a lot cleaner
Still using A1111, or moved on to forge?
It returns base64 data of the image and metadata is included.
Then we decode and open it before writing it. But the metadata gets wiped if you don't use the pnginfo= argument when writing. So all that was needed was to collect it from the image data so that it could be passed in that arg
Rip
Slowly but surely I'm running out of "things I've been pulling my hair out over for nothing" lol
100% on to Forge / Comfy / Swarm
Well I'm using ooba API and comfyui now, complete fresh starts
btw in case you missed it, the bot has some very good ComfyUI support now
What you would be most interested in, is that I made a feature to create your own slash commands.
My main motivation for it was to be able to easily run comfyUI workflows remotely - it's very simple to configure the command options which will set payload values.
It can execute any ComfyUI workflows you can think of, and you get to define what command options will override defaults in that workflow
@hard cobalt TRY IT 
I'd be glad to either walk you through a first command setup, or literally just whip it up for you for a specific workflow
Very quick and easy to set'em up once you get the gist of it
that's this'n user setting
Although I think that, by default, all image returns via the API should include metadata
nbd though
Confirming that worked
Btw @halcyon quarry the ai still not following the personality I gave
How do I make it forget everything instead?
I already tried to create another channel and chatting there and it's still not following the personality
Make sure the mode parameter is chat or chat-instruct - if it is instruct then the 'context' parameter and a few others are ignored
Where do I check that?
It would be in your character file, in that state dict. If it is not there, then it would default to the mode in base_settings.yaml
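The lookup order described above can be sketched as follows, assuming both the character file and the base settings expose a "state" dict with a "mode" key. The function names are made up for illustration:

```python
def resolve_mode(character_state: dict, base_state: dict) -> str:
    """The character's state wins; otherwise fall back to the base settings."""
    return character_state.get("mode") or base_state["mode"]


def context_is_used(mode: str) -> bool:
    # Per the message above, 'instruct' ignores the character 'context'
    return mode in ("chat", "chat-instruct")


base_state = {"mode": "chat"}  # stand-in for the base settings file
print(resolve_mode({"mode": "instruct"}, base_state))  # character overrides base
print(resolve_mode({}, base_state))                    # falls back to base
```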
I recommend updating the bot if you have not recently
I updated it 2 days ago
Let me know if your mode was already chat
On startup it also checks for the mode and prints what mode the TGWUI payload is using in cmd
Do you not have any issues when running this model via the Web UI?
ComfyUI support got some nice improvements over the past few days.
- The output now includes the metadata so you can drag/drop it to load the workflow (which includes all resolved placeholder values).
- Since the recommended payload / internal bot code expects specific nodes to be deleted (in order to easily change models between different architectures) the script will now also reroute "Any Switch" nodes that only have 1 input.
- Basically, output metadata is very nice now.
Tried to install and run the discord bot thing (not tech savvy) and got the latest version without the text gen portion. Just the standalone thing to test. I ran start bat file, added discord token from discord developers site, added bot to discord server for friends to try but don't get response. Am I missing something? It also says chat history is being stored somewhere but when I check the folder I don't see anything
If it's Standalone then there is no text generation 🙂
Any replies you get from the bot will be generated content from other features, such as image generation
So the web gen ui thing does need to be installed along your bot thing?
It's still possible to get Text Generation outputs if you have a text generation API available, although there's a few extra hoops to jump through.
But yes I strongly recommend installing TGWUI.
You'll likely have to git clone and install again from the TGWUI folder, after you install it (as detailed in the instructions)
Note that, last night, I finally started working on a "Getting Started" page in my Wiki, so if you have any questions I'll gladly help out
Understood. I saw the word standalone and thought it wasnt necessary. I'll give it another shot! 👏🏼 If not i'll wait and take a look at that wiki for reference when you publish
The Standalone exists for anyone who just wants to use the bot for the image generation features. I added that option because I've busted my ass on this for awhile and at least half of my features are for image generation
Didn't want to limit the potential userbase on a strict TGWUI requirement (which there was for a very long time until just a few months ago)
Standalone or not, the bot has a system to interface with virtually any API-enabled software
At some point, TGWUI removed "None" from the models list. The /llmmodel command was expecting "None" to exist - it would remove it and add it as the value for the "Unload Models" button.
This was not working as expected after the change.
I updated this code so the Unload LLM Model button works correctly.
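One way to make that kind of code tolerant of either models list, sketched with hypothetical names (`UNLOAD_VALUE`, `split_model_choices`) and made-up model filenames:

```python
UNLOAD_VALUE = "None"  # value used for the unload button, per the message above


def split_model_choices(models: list[str]) -> tuple[list[str], str]:
    """Return selectable models plus the unload value, whether or not
    the backend still lists "None"."""
    selectable = [m for m in models if m != UNLOAD_VALUE]
    return selectable, UNLOAD_VALUE


print(split_model_choices(["None", "model-a.gguf", "model-b.gguf"]))
print(split_model_choices(["model-a.gguf", "model-b.gguf"]))
```

Filtering instead of calling `list.remove("None")` avoids an error when the entry is absent.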
Pushed another nice convenience improvement
- Automatically fallback to another valid "main" API client if the one named is offline
Currently limited to known Imggen APIs
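A minimal sketch of the fallback selection, assuming the bot knows which configured clients are currently online. `pick_main_client` is a hypothetical name and the client list is just an example:

```python
from typing import Optional


def pick_main_client(preferred: str, online: dict[str, bool]) -> Optional[str]:
    """Use the named client if it is online, else the first other online one."""
    if online.get(preferred):
        return preferred
    for name, is_online in online.items():
        if is_online:
            return name
    return None  # nothing is reachable


clients = {"Swarm": False, "SD Forge": True, "ComfyUI": True}
print(pick_main_client("Swarm", clients))
```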
It was on chat and I changed it to chat-instruct
But now there's another problem where the bot respond takes 10 minute where it usually takes only 1 minute
If you have other high GPU software running, such as image generation with a model loaded, you’ll get massive slowdown if the LLM model is too big
If you tinkered with the “responsiveness” behavior (character behavior setting) this could cause a delayed response
Since I use both img gen and LLM running side by side I opt for a smaller LLM model
I only changed the responsiveness to 0.0 and I don't even have image generation because I don't really generate image or know how to set up one
Default is 1.0 right? 🤓
Yeah
The responsiveness setting is if you want the bot to behave more like a human where it may “be busy”
Since the ai didn't follow the personality I tried to set it to 0.9
0.0
It generates text in 1 minute, and then randomly takes 10 minutes
If you enjoy this concept, that’s actually what I’m expanding on now
What I’m adding is a feature where the bot can respond to multiple messages as “one prompt”
Also - it may pause writing to read new messages, which can affect the remainder of the response
Btw does the current version have memory?
Like reading old message to catch up
Or get information
Over the past few months, part of this API system included an internal variable management system
The framework is there for me to implement something like “user variables” (it’s on my pinned TODO list)
You can already pre-define information to inject via the Tags system
Using the prefix/suffix_context tag
Wish I had another guy like Reality who is eager to contribute 🤓 Got spoiled temporarily
Btw @halcyon quarry why is the ai not following the {{char}} message?
Like following the speech pattern I have put up
Like for some reason it just talked normally instead of following the speech pattern
The "{{char}} : hello
It just doesn't follow it like that
If you want it to literally include its name before each response, you’ll need an instruction to do that in the context
If you mean, it’s just not behaving like it’s depicted in the context (example dialogue) it could be faulty settings
Or maybe the model is just not good at roleplay
Dial in your settings, etc, in the Web UI
Bc it’s easier to tweak them there
Once its behaving the way you like, you could save those settings as a Preset and just use the “preset” param
Or just copy each one over
Alr
Made some good progress on this Wiki page
https://github.com/altoiddealer/ad_discordbot/wiki/Getting-Started
This shows what I mean about the fallback API client thing
(Have "Swarm" configured as my "Main ImgGen" client, but it's offline and SD Forge is online)
If I launch Swarm, I can just use /toggle_api (enable it) followed by /change_main_api
@halcyon quarry do you know what model is great at following personality?
since Mythomax could be broken
nothing works at all even following the example one
even tried to update it too
you can try gemma3 qwen2.5 and llama3.1
I changed model, it worked for a moment until i restart it
It just somehow changed the personality when I restarted the start_linux.sh
And now it's hallucinating by adding more user there
I haven't modified anything btw
I just restarted it because I wanted to play some games without lagging and now it just does that
So after I use reset_conversation 3 times now it follows the personality
Probably bad params. As I recommended before, dial in the settings in the WebUI then port them over
I haven't used WebUI in a long time, where do I go?
Just launch it > Models tab > load a model > then youll be hopping back and forth between the Chat tab and Params tab
On the params what do I do?
Like what do I dial?
The model you downloaded - if you go to the Model Card on Huggingface or wherever you got it, it will often include some recommended parameters
You could try asking ChatGPT or similar for suggestions. Or you can just try a few of the “presets” in TGWUI
Or go for Temperature, min_p, repetition penalty, top_k, and just move these up/down and see the difference
You could also prompt ChatGPT or similar for an explanation of LLM model parameters
TGWUI repo may have explanation of parameters as well
Lastly, some models just don’t work well in Chat mode - you may need to switch to Instruct mode, and consolidate your context into an instruction/ system_message
Tried using the bot again but get this. Not sure what it means. I do have TextGen installed and works but when running the start bat file this pops up. Did i miss a step from the wiki? Reinstalling the whole folder doesn't work 🙏🏼
It reads text-generation-webui\text-generation-webui\ad_discordbot - Do you have text-generation-webui installed within a nested duplicate directory name?
Ah i just noticed that. Maybe thats whats causing the issue
Also -
You'll have less chance of trouble if you just git clone the TGWUI repo instead of specifying main branch
Is that also why i may not be seeing the option to integrate it?
indeed!
Try just renaming the directory to remove -main
If you see the option to install with TGWUI integration - I suggest running TGWUI to make sure renaming the directory doesn't have some kind of adverse effect
and that you are able to update it without issue
Did you install TGWUI as "portable" version?
No not the portable one. The other one
So did you try renaming the directory to just "text-generation-webui" ?
Yes. I ran it and updated and that seemed to work fine. I did run the start bat file again and it did show the option to integrate but then it just had this pop up and close with nothing happening
Try deleting this file.
When you try again, try opening a cmd in the bot directory and type start_windows.bat
This will keep the window open if there is a problem rather than immediately closing the window
I see that this is the only file. Is that because i haven't run the bat file or am i still missing files?
that's expected
I'm not a genius at setup wizards but from my testing, I didn't have any trouble.
Because you had the first problem may have messed something up
Yeah... I'll probably start over since i'm getting this again. At least i know how the first few issues started so maybe a fresh install can fix all that
If you copy/paste that text for me in a code block "```"
three backticks on each side
I might be able to ask ChatGPT what's up 🤓
Microsoft Windows [Version 10.0.26100.4652]
(c) Microsoft Corporation. All rights reserved.
C:\Users\GoodOlMavis\text-generation-webui\ad_discordbot>start_windows.bat
Welcome to ad_discordbot
The bot can be integrated with your existing text-generation-webui environment.
[A] Integrate with TGWUI Recommended
[B] Create and use own environment
[N] Nothing, exit script
Enter A, B, or N: a
Trying to activate Conda from: "C:\Users\GoodOlMavis\text-generation-webui\installer_files\conda\condabin\conda.bat"
Conda activated successfully.
Traceback (most recent call last):
  File "C:\Users\GoodOlMavis\text-generation-webui\ad_discordbot\one_click.py", line 393, in <module>
    check_env()
  File "C:\Users\GoodOlMavis\text-generation-webui\ad_discordbot\one_click.py", line 112, in check_env
    if os.environ["CONDA_DEFAULT_ENV"] == "base":
       ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "<frozen os>", line 714, in __getitem__
KeyError: 'CONDA_DEFAULT_ENV'
C:\Users\GoodOlMavis\text-generation-webui\ad_discordbot>
Oh i forgot the "
np this is good
Sorry for the trouble 🙏🏼
ChatGPT suggests trying this...
If you open one_click.py - jump to def check_env()
Replace the whole thing with this:
def check_env():
    # If we have access to conda, we are probably in an environment
    conda_exist = run_cmd("conda", environment=True, capture_output=True).returncode == 0
    if not conda_exist:
        print("Conda is not installed. Exiting...")
        sys.exit(1)
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix is None:
        print("Conda environment not detected. Please activate an environment. Exiting...")
        sys.exit(1)
works on my end but it's already installed no problem 🤓
Did that and ran it again. Looks like it says conda is activated but then under it it says otherwise
We're gonna try one last thing...
Yes. I see that TGWUI must have also had this issue because they changed the lines here (my launcher is 95% based from it)
Replace it with this
def check_env():
    # If we have access to conda, we are probably in an environment
    conda_exist = run_cmd("conda", environment=True, capture_output=True).returncode == 0
    if not conda_exist:
        print("Conda is not installed. Exiting...")
        sys.exit(1)
    # Ensure this is a new environment and not the base environment
    if os.environ.get("CONDA_DEFAULT_ENV", "") == "base":
        print("Create an environment for this project and activate it. Exiting...")
        sys.exit(1)
You should be fine then
I'll be updating this right now
I think so 👏🏼
Yep 👍
You can revert that change and update the bot, once you are set up
I just updated that line
I also did add a lot more info to my "Getting Started" page on the Wiki https://github.com/altoiddealer/ad_discordbot/wiki/getting-started
ive noticed problem with conda, some mess with license and authentication
i think it is something like each windows with conda installed will have a unique key, and if you switch computer (portable ssd, etc), you cant use the environment
TGWUI one_click.py had
if os.environ["CONDA_DEFAULT_ENV"] == "base":
...which assumes that this environment variable exists.
With this error from GoodOlMavis I noticed that TGWUI recently updated the line to safely get the var if it exists:
if os.environ.get("CONDA_DEFAULT_ENV", "") == "base":
I had caught up with a lot of other changes but apparently that one slipped by
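A minimal sketch of the difference between those two lines. It uses a plain dict in place of os.environ so it behaves the same on any machine; the helper names are only illustrative and are not part of one_click.py.

```python
def is_base_env_strict(env):
    # Old form: raises KeyError when CONDA_DEFAULT_ENV is not set
    return env["CONDA_DEFAULT_ENV"] == "base"


def is_base_env_safe(env):
    # Updated form: falls back to "" when the variable is missing
    return env.get("CONDA_DEFAULT_ENV", "") == "base"


no_conda_var = {}                          # variable missing, as in the traceback
base_env = {"CONDA_DEFAULT_ENV": "base"}   # conda "base" environment active

try:
    is_base_env_strict(no_conda_var)
    strict_raised = False
except KeyError:
    strict_raised = True

print("strict lookup raised KeyError:", strict_raised)
print("safe lookup, var missing:", is_base_env_safe(no_conda_var))
print("safe lookup, base env:", is_base_env_safe(base_env))
```

The safe form treats a missing variable the same as "not the base environment", which is why the launcher can carry on instead of crashing.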
./start_linux.sh: line 8: [[: invalid regular expression [!#$%&()*+,;<=>?@[]^{|}~]': Invalid content of {}
Welcome to ad_discordbot
is this normal?
it gets invalid regular expression but it works fine
Try deleting this block and let me know:
# Check for special characters in installation path
if [[ "$(pwd)" =~ [!#\$%\&\(\)\*\+,\;\<\=\>\?@\[\]\^\`\{|\}\~] ]]; then
    echo "WARNING: Special characters were detected in the installation path!"
    echo "This can cause the installation to fail!"
fi
I think ChatGPT wrote this for me and so, probably fudged it 🙂
If it solves the issue I'll push it to main
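For comparison, here is a rough sketch of the same path check in Python, where the brackets can be escaped inside the character class without tripping the regex parser. The character set is copied from the shell block above; `has_special_chars` is a made-up helper name, not something in the bot.

```python
import re

# Character set taken from the shell check above; [ and ] are escaped
SPECIAL_CHARS = re.compile(r"[!#$%&()*+,;<=>?@\[\]^`{|}~]")


def has_special_chars(path):
    """Return True if the path contains any character from SPECIAL_CHARS."""
    return SPECIAL_CHARS.search(path) is not None


for candidate in ("/home/user/text-generation-webui/ad_discordbot",
                  "/home/user/my bot (test)/ad_discordbot"):
    if has_special_chars(candidate):
        print("WARNING: special characters detected in", candidate)
    else:
        print("OK:", candidate)
```

Hyphens, underscores and spaces are not in the set, so a normal text-generation-webui path passes.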
Btw @halcyon quarry I'm still struggling with the AI taking 10 minutes to respond after I set responsiveness to 1.0
And it sometimes prints out just 1 emoji and nothing else
If you go into ad_discordbot/internal/settings/activesettings.yaml - does the behavior reflect responsiveness: 1.0 ?
I'll try that later
Just want to ensure the setting adjustment actually took effect - which it should
If it actually takes 10 minutes to generate because you loaded a model that is way way too big and is performing a significant amount of inference in system RAM/CPU then I can't help there
I'm using mythomax 13b
What quantization?
It usually generates in 1 minute
Q4KM
But now it sometimes takes 10 minutes to generate
And your graphics card?
It's broken so I'm currently using my igpu
Ryzen 5 3400g or Vega 11
But it usually just generates in a minute
And then randomly takes 10 minutes
that seems like a pretty big model for your situation
Yes but usually it's 1 minute
And it's slowly getting slower
It just randomly goes to 10 minutes after a 1 minute response time
If you retain chat history at all, then the amount of memory required increases every generation
is this right?
so what do i check?
Added a bunch of print statements
back up your bot.py
and use this one until we see where it gets stuck
My intuition is that the model is generating slow
Any of you guys using ComfyUI should immediately check out https://github.com/nunchaku-tech/ComfyUI-nunchaku/tree/main
This has a unique quantization method for Flux dev / Flux Kontext that is super duper good
hiiighly recommend swarm for nunchaku models
nunchaku is a pain in the ass to install
swarm automates it
they're working on wan too which will be awesome when it exists
The PITA part was installing the requirements then it still erroring. Then figuring out that a wheel needs to be installed as well
They have a custom workflow with a node that installs the correct wheel though - which worked like a charm
Mind is blown on how good this quant is
Didn’t try Kontext nunchaku yet but I'm sure that’s also gonna be a mind blown situation
is this normal?
900 seconds for 2 words
What hardware are you using @late pivot
That sort of time is something id expect from trying to run a model on a cpu rather than a gpu
ryzen 5 3400g
used to be just 1 minute but it just randomly generates for 900 seconds
one of the memory is broken
Literally anything would be better than cpu mode
need to be replaced
The vram on your graphics card is broken?
yes
ryzen 5 3400g has integrated graphics
and on linux i dont need to install any drivers
I think you should abandon this endeavor until you repair your gpu, otherwise you should expect exactly what you got and little better
I hope thats not rude
Unless someone with more knowledge than me knows some sort of cpu wizardry
since if im not wrong ai can only be used with nvidia gpu only
Anything is better than cpu
If you can load it, by all means try
Im actually unsure of this
I would not expect integrated graphics to do much more than make sure your screen turns on
thats just the old one ngl
newer integrated graphics can actually play games
yeah
Maybe if you quantize the model to an insane degree?
You should look into some highly quantized low parameter models
You can get surprisingly good results if you set it up right
Hopefully if its small and efficient enough of a model it might hold up to give you responses in a timely manner without being too dumb
Im talking 6b or less
Have you tried using actual cpu mode rather than the igpu?
how do i do that?
Im not sure as ive never had an issue (i own 4 graphics cards)
Upgraded my computer 2.5 years ago, wifey still uses that in arguments that I get everything I want but she can’t (all she wants is a new kitchen, nbd)
In the model tab, all your possible options are there
But yeah you are pretty much locked out of the open source AI game when your graphics card kicks the bucket
im planning to fix it myself
what are the model that you recommend that is similar to mythomax 13b?
Try running something with 3b or 1b in the name for starters 😛
Low quant version too
i meant for the rx 6700xt
im planning to fix it
I’ve got a 4070ti and I try to stick with models around 7-10b for the speed.
No clue
Better off screenshotting your Model tab
Also I don’t know what’s best for you anyway
Should I buy 3 rtx 5060 low profile or just fix my rx 6700 xt?
Idk!
Heeey. Been a while, fell off the radar. Hows it going? Anything I can help testing or something?)
If you remember I've had troubles with my setup not applying the correct payload sometimes. Basically, if you do exact match - models in subfolders have the folder name prepended to the model name lol, so instead of sdxl_intorealism the bot was looking for SDXL_sdxl_intorealism. So maybe that will help someone some day lol
Heya!
If you haven't updated in awhile, there's some very amazing new features
Last one was back in June lol
lotta bug fixes along the way as well
Since starting btw never went back to ui itself. Amazing bot, and I want to say again great thanks lol
Oh nice, sounds awesome
Oooooh, its / commands. Niiiiice
Once you get the gist of it, there's really a ton of practical uses that you can whip up commands for
I made it with ComfyUI in mind, you can literally make a command for any workflow at all with sensible options to set payload values
The only real limitation there is that file inputs need to be less than 10mb unless the user has Nitro
So like, if you want to attach a "driving video" for an iv2v workflow that would fail if the vid is > 10MB unless user has Nitro
I spent a lot of time making sure the Wiki covers everything
The "StepExecutor" system is what actually makes things happen from the inputs
@smoky cedar also, I've deeply reworked how the bot sends content. Now, it's possible to have a designated "Upload API" which will automatically be used if the bot tries sending content that exceeds 10MB
So the bot would instead send a download link
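Roughly, the decision looks something like the sketch below. This is only an illustration of the idea, not the bot's actual code: `choose_delivery` and its return values are invented names, and the 10MB figure is the limit mentioned above.

```python
DISCORD_UPLOAD_LIMIT = 10 * 1024 * 1024  # 10MB, the limit mentioned above


def choose_delivery(size_bytes, upload_api_configured):
    """Decide how a file of the given size could be delivered (hypothetical helper)."""
    if size_bytes <= DISCORD_UPLOAD_LIMIT:
        return "attach"        # small enough to send as a normal attachment
    if upload_api_configured:
        return "upload_link"   # hand off to the Upload API, reply with a link
    return "too_large"         # nothing configured to handle the overflow


print(choose_delivery(2 * 1024 * 1024, upload_api_configured=False))
print(choose_delivery(50 * 1024 * 1024, upload_api_configured=True))
print(choose_delivery(50 * 1024 * 1024, upload_api_configured=False))
```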
The api_settings.dict includes a suggested API / config - which is free and easy to set up
Holy, nice
There was a need for such a thing now that the bot can easily be configured to generate large files
Other than i2v what else can it be used for?
Like, anything lol
Some things might be a little tedious to create the options for like, if you put a bunch of sound files in one of your user folders, and then make a bunch of options for each sound file, and when the user selects it it plays it on the VC.
So the option values in this case would take a sec to manually copy/paste into the config
Hm, ok
Once you read over Custom Commands / Step Executor (and APIs if you have the time) - these things all work very intuitively
Some things that you might think would be tricky like, the save step - the method I made for that is super flexible. You can pass this step a file path, bytes, base64, and in most cases it can correctly figure out the file type and such.
It returns a dictionary that can be directly chained into a send_content step or, if save is the last step, the bot will just send whatever content it detects
Also, there is a dedicated step for ComfyUI call_comfy that makes it super simple
Will read on it today. Sounds complicated lol
If you have a comfy workflow or something you want to try out, I could walk you through it
And you'll go, holy crap that's simple 😛
I made a utility as well to make it easier to prepare ComfyUI workflows for the bot
The bot has also been updated to work with Swarm
which should be significantly less frustrating than Comfy
but giving access to the stuff Comfy can do that Forge cannot
I'm also primarily a Forge user, but there's just such amazing shit going on in Comfy
Sorry, what do you mean?
Didnt find something that would lure me in lol
Swarm has integrated support for most of the video generation stuff, support for the newer models (Flux Kontext, Chroma, etc)
Oh i see. Didnt delve in to i2v with bot yet. Doing it separately in pinokkio btw
In terms of the bot, the only things it can't do easily for Swarm / Comfy is the custom ReActor face swap and ControlNet handling, both hardcoded for A1111/Forge/ReForge and too complicated for me to want to resolve
With the custom commands feature, it's a breeze
If you find some comfy workflow to face swap, can make a /face_swap command in like 5 minutes
flux kontext, same. super ez
The video workflows are pretty complicated to understand b/c there's a million nodes and spaghetti everywhere, but once you see where the model loader is, encoder/vae/etc, image input, there's not many values that need tweaking.
Ah yeah, one more thing I added...
bot now supports a "cancel" endpoint. Just copy/paste it from the settings template for Forge.
If the endpoint is there it makes a Cancel button appear when generating
Fuck, titanic work!
Would it be possible to make these custom commands something the bot can trigger itself to do using key words in messages?
For example "hey phantom can you do X?" And have this be automatic
I have a "run_workflow" tag that can do anything in StepExecutor
Custom Commands just gives you the ability to clearly define multiple inputs for the system. I need to take a closer look and see what bot variables are available when using that tag (prompt, etc)
Eh, it's one of those things that I'm not sure when is the appropriate timing to actually execute a run_workflow tag.
I'm thinking of creating yet another "conditional tag" where you can change the default timing for actually executing certain tags.
This would mainly work for "generic tags" that are not specifically tailored for LLM / Img Gen
I suppose I should just add another parameter for call_api and run_workflow tags - since these are really the only two that timing can make a difference
How things work currently:
- The bot tries to match Tags after the user writes.
  - Immediately applies "generic tags"
  - Applies LLM related Tags > Updates task variables > Generates text
- Tries to match Tags for LLM's response
  - Immediately applies "generic tags"
  - If image generation is triggered: applies Imggen Tags > Updates task vars > Generates image
Since I lumped in call_api and run_workflow as "generic tags" a number of variables can't be used.
By the way, I also updated the logic for the image model matching. It used to only give a preset ONE point if any of the includes text was matched (-1 for excludes).
The way it works now, it gives a point for EACH text string that matches in the file
Basically, you shouldn’t need to use exact match anymore
Just include the model name as an “includes” and that preset will get another point
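A small sketch of that scoring idea, under a couple of assumptions: matching is a case-insensitive substring check, and each matched "excludes" string costs one point. `score_preset` and the example presets are made up for illustration and are not the bot's real function or config.

```python
def score_preset(model_name, includes, excludes=()):
    """Score a preset against a model filename (hypothetical helper)."""
    name = model_name.lower()
    # One point for EACH includes string found in the filename
    score = sum(1 for text in includes if text.lower() in name)
    # Assumption: one point off for each excludes string found
    score -= sum(1 for text in excludes if text.lower() in name)
    return score


presets = {
    "generic_sdxl": {"includes": ["sdxl"], "excludes": ["pony"]},
    "intorealism": {"includes": ["sdxl", "intorealism"], "excludes": []},
}

model = "SDXL_sdxl_intorealism.safetensors"
scores = {name: score_preset(model, p["includes"], p["excludes"])
          for name, p in presets.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

With per-string points the more specific preset wins even though the folder prefix is part of the name, so an exact match is not needed.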
@halcyon quarry can you help me on something?
so the ai will sometimes stop following the personality after some period of time
like after a few messages it will not follow it unless i do /reset_conversation
can you tell me whats wrong with it?
The context gets further away
Bad params maybe too
Bad model maybe too
I really don’t think this is any result of the bot script
It means there's a bug in the history manager! (not good)
I have a patch for this
I pushed an update that prevents this error
Thanks for the bug report
Nice
How do I make the context short?
Reduce context size (model loader param), truncation length
What do you recommend?
You can also try chat-instruct mode, but you need to set the instruction which is intended to remind the LLM about its personality or whatever. I do not have experience with this
TBH you are much much better off asking in General
Dude nice model 👍🏻
it depends, but ollama's default is 2048, as long as you dont feel that it is forgetting a lot you can try
I use 2048 always as well
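To picture what a truncation length does to chat history, here is a rough illustration. It is not TGWUI's actual code: `trim_history` is a made-up helper, and counting words stands in for a real tokenizer.

```python
def trim_history(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined size fits max_tokens.

    count_tokens defaults to a word count, a crude stand-in for a tokenizer.
    """
    kept = []
    used = 0
    # Walk from newest to oldest and stop once the budget would be exceeded
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["hello there friend", "how are", "fine"]
print(trim_history(history, max_tokens=3))
```

A smaller budget means less text to process on every generation, at the cost of the model forgetting older messages sooner.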
Hello I am having a small problem when I am running "start_windows.bat" when I open and select option A it closes immediately.
I am getting"
Trying to activate Conda from: "C:\Users\molin\text-generation-webui\installer_files\conda\condabin\conda.bat"
The system cannot find the path specified.
The system cannot find the file C:\Users\molin\text-generation-webui\ad_discordbot\installer_files__conda_tmp_20678.txt.
Failed to run 'conda activate "C:\Users\molin\text-generation-webui\installer_files\env"'.
Failed to activate the conda environment. Exiting...
"
I got it working now, but i have a question. can i use a loRA?
Yeah, should work
I haven’t been keen enough with text gen LoRAs to actually test it personally but I’m pretty sure the correct code is in place for that
Made a “custom command” for flux kontext on my end
If you know what caused the issue / what steps fixed it, that would be good info!
I believe the issue was with Conda. what I did to fix it was use the update windows bat and then relaunched the start win. and that seemed to fix it.
How do I get access to it
It’s detailed here:
https://github.com/altoiddealer/ad_discordbot/wiki
Related topics are the “flexible API System” and “StepExecutor”
I’m going to add the flux kontext one I screenshotted as another example, but I already have some examples in the dict_commands.yaml
Hey, sorry, for another question lol hadn't gotten to new system there, but for reactor - is it possible to make it work with a menu where you can choose the face models for it? Im still using key:value method, but I have about 100 models, and I dont remember all lol
If you are familiar with ComfyUI at all, it's possible to set up your own face swapping workflow, then create a "custom command" where the face model can be selected from a dropdown
As it is, you can create "tags" for each face model which could automatically load up the correct one via trigger phrase
Ok, I see. Thanks)
Each "tag" can literally be this simple:
- trigger: 'Some person'
  reactor: some_person.safetensors
Yeah, thats what I thought. Still wont remember who is who without looking at list lol
Maybe just clean up your naming? 😛
Urgh 😂
But yeah, tags are awesome dude
I haven't done it yet but you can create a "custom command" that works for Forge
Lemme whip something up real quick as an example...
I should make a dedicated "call_sdwebui" step now that works like the "call_comfy" step
Anyway, the idea for the custom command is to define an option for each payload field you want to be able to be modified.
Then the "main" steps will use load_data_file (this file), followed by call_api
At this phase, any key in the __overrides__ dict that matches a key in the "Context" will be updated.
Regardless of whether they're matched or not, all values from __overrides__ get injected into the payload
essentially you set them up as defaults, and the user selections will override them.
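Here is how I'd picture that merge, as a sketch only. It assumes flat dicts; `apply_overrides` and the example keys are illustrative names, not the bot's internals.

```python
def apply_overrides(payload, overrides, context):
    """Merge override defaults into a payload, letting context values win.

    Hypothetical helper: assumes flat (non-nested) dicts.
    """
    resolved = dict(overrides)
    # Keys present in both overrides and context take the user's value
    for key in resolved:
        if key in context:
            resolved[key] = context[key]
    # Every override key is injected, whether or not the user set it
    result = dict(payload)
    result.update(resolved)
    return result


payload = {"prompt": "a cat", "steps": 20}
overrides = {"steps": 30, "cfg_scale": 7}        # defaults from the data file
context = {"steps": 12, "unrelated": "ignored"}  # what the user picked

print(apply_overrides(payload, overrides, context))
```

Context keys that are not listed in the overrides never reach the payload, which keeps the command limited to the fields you chose to expose.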
Oh wow. Once im home ill tinker with that. Thanks
Look into it, but I'm going to go ahead and create the convenience step in the next day or two, to call_sdwebui
You should still take a stab at setting up your command, just don't go crazy with the "main" steps.
Almost got this step whipped up
In this process found some bugs!
Such as, discord Attachments were not being handled well. They were not usable unless passed to a "save" step. Now they'll be decoded to bytes by default.
Had a decode_base64 step but not an encode_base64 step
Alright I've also kicked it up a notch by adding dynamic Select menu creation
via prompt_user step
Merged the improvements to Main
I just need to document the new Steps/usage, tomorrow
This is really cool btw - the whole /image command can essentially be duplicated in a custom commands definition
mcmonkey I know you’re reading this I didn’t just add convenience “step” for SDWebUI it’s also for Swarm 🤗
bah. Wanted to allow sending a Modal for requesting text input from user, but that needs to be handled immediately (race condition)
can't just defer the response either
yep, there is a 3 second window to send a Modal lol
I've updated the Wiki to reflect the new/updated steps:
- call_imggen
- prompt_user
- dict
- list
Added yet another step
- comfy_delete_nodes
Finally got around to testing Wan 2.2 and it is super duper good
Most importantly, it has stunning breast physics 🏆
I updated the handling for run_workflow tag. This now processes after the task is finished, so it has access to all of the butt variables from the task
BOT variables
Team nunchaku strikes again by releasing their quantized version of the new Flux Krea
https://huggingface.co/nunchaku-tech/nunchaku-flux.1-krea-dev
Flux Krea Dev versus Flux Dev - same seed and prompt. Generated in 4 seconds!
I tried out the Wan 2.2 t2i workflows which are super impressive but they take like a minute to render
I’m excited to hear about the first confirmed user trying the custom commands feature - let me know if you get around to it! The files I shared are drop-in ready just need to populate the list of face models
If there is an API endpoint to fetch all reactor models (i need to check) an even better way would be to have a “call_api” step to fetch the model list, then use that for a Select menu options in “prompt_user” step (making it dynamic). It also automatically overflows into more dropdowns (25 options in each dropdown)
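The overflow part is simple to sketch: Discord select menus hold at most 25 options, so a long model list gets split into groups of 25. `chunk_options` and the fake model names are just for illustration.

```python
MAX_SELECT_OPTIONS = 25  # Discord's per-menu option limit


def chunk_options(options, size=MAX_SELECT_OPTIONS):
    """Split a list of options into consecutive groups of at most `size`."""
    return [options[i:i + size] for i in range(0, len(options), size)]


# Pretend list of face models, e.g. fetched from an API
face_models = [f"face_{n:03d}.safetensors" for n in range(60)]
menus = chunk_options(face_models)
print([len(menu) for menu in menus])
```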
Added "if" / "elif" / "else" steps
- Allows only executing "steps" branches based on condition
- Restricted to some relatively simple operands