#Anyone know of any tools to test a bot at scale?


rich vault
#

My bot is at a point where it's too big for its own good and I'm on the verge of shutting it down. The bot works completely fine while testing locally; as soon as it goes to prod it breaks, and it's because my caching/presence event is being blasted a million times a second. I was testing on prod yesterday and got hit with the "you are out of shards for the day, resets at blah blah"

old lagoon
#

So you want to just send a fuck ton of events locally right?

rich vault
#

yes

#

specifically triggering the presence update event

old lagoon
#

Hmm, could call the state.py method with example payloads

rich vault
#

how would i go about doing that tho and at scale

old lagoon
#

Can you find out roughly how many events your bot is getting per cluster?

rich vault
#

i can try when my discord decides to reset my shards

#

latest count of just pure members, was around 11 million

old lagoon
#

Christ that'd do it

old lagoon
#

might work, not sure if it does anymore

#

Do you know exactly what the issue is? Cos I'd suggest making a couple of example payloads, throwing some random ids in them, and asyncio.gather-ing them into your state.py event method
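A minimal sketch of that idea, assuming you just want to hammer a handler locally with fake presence events. The payload shape is loosely modelled on Discord's PRESENCE_UPDATE gateway data; `fake_presence_payload`, `handle_presence`, and `blast` are hypothetical names, and `handle_presence` is a stand-in you would swap for whatever entry point you actually want to load test.

```python
import asyncio
import random
import time


def fake_presence_payload() -> dict:
    """Build a PRESENCE_UPDATE-shaped dict with random ids."""
    return {
        "user": {"id": str(random.randint(10**17, 10**18))},
        "guild_id": str(random.randint(10**17, 10**18)),
        "status": random.choice(["online", "idle", "dnd"]),
        "client_status": {"desktop": "online"},
        "activities": [
            {"type": 4, "name": "Custom Status", "id": "custom", "state": "discord.gg/example"}
        ],
    }


async def handle_presence(payload: dict) -> None:
    """Hypothetical stand-in: replace with the code you want to stress."""
    await asyncio.sleep(0)  # yield to the event loop like a real handler would


async def blast(n_events: int, batch_size: int = 1_000) -> float:
    """Feed n_events fake payloads to the handler in batches; return events/second."""
    started = time.perf_counter()
    for offset in range(0, n_events, batch_size):
        count = min(batch_size, n_events - offset)
        await asyncio.gather(*(handle_presence(fake_presence_payload()) for _ in range(count)))
    elapsed = time.perf_counter() - started
    return n_events / elapsed if elapsed > 0 else float("inf")
```

Running `asyncio.run(blast(100_000))` gives a rough events-per-second figure to compare against what prod sees. Batching keeps the number of in-flight coroutines bounded instead of creating them all at once.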

rich vault
#

it would prob help if i cached all the data beforehand and then enabled the event

#

cause after about 3 mins of the event being active it nukes itself.

old lagoon
#

Hahaha yes

rich vault
old lagoon
#

Orrrrr I have something for you to reduce load, but first. How up to date do you need presence data?

#

Is there a difference between having it within 30s of starting and within 5 mins?

#

Like I spose slower boot but being fine would also be nice right

rich vault
#

soo pretty much anytime someone adds a specific custom status to their status the bot adds a role to them or removes it if they remove it.

rich vault
old lagoon
#

Then something like this maybe?
Whenever a presence event comes in, you throw it into an asyncio.Queue and call it done.

Then somewhere else you define a method that handles a presence event with your logic, caching and whatever. When you start your bot you simply run a few of these methods as asyncio.Tasks in the background, forever consuming from the queue

#

It'd take some playing to figure out how many consumers to have at once without causing a massive bottleneck / backpressure issues but it could be a nice way to introduce managed processing to something which currently seems like it struggles to handle the bursts of events from things like startup

#

I'm not 100% sure it'd work, but based on how you described your issue it's what I'd guess at

rich vault
#

what exactly would i put into the queue itself

#

cause my presence event looks like

async def on_presence_update(before: disnake.Member, after: disnake.Member):
    settings = await get_or_create_settings(after.guild)
    _embed = await get_or_create_embed(before.guild)

    if settings is None:
        return

    role1 = after.guild.get_role(settings.role1)
    role2 = after.guild.get_role(settings.role2) if settings.role2 else None

    if role1 is None:
        return

    if not after.guild.me.guild_permissions.manage_roles:
        return

    if settings_should_skip(after, settings):
        return

    if should_handle_vanity(after, settings):
        return await handle_vanity(before, after, settings, role1, role2, _embed)

    if should_handle_removed_vanity(after, settings):
        await handle_removed_vanity(before, after, settings, role1, role2, _embed)

so i would assume implementing the queue wouldn't be too bad

old lagoon
#
queue.put_nowait((before, after))
#

Then just move everything you currently have to a method which runs as a task with the information being fetched from the queue

#
while True:
    data = await queue.get()
    # Your stuff here
    queue.task_done()
rich vault
#

how often should i run the task

old lagoon
#

You could also refactor your code so that the if ... return statements are right at the start, to avoid unneeded processing in some cases

old lagoon
#

You start X tasks that all handle shit from the queue for the lifetime of your bot

old lagoon
#

You'll need multiple consumers as well likely, otherwise the queue will forever grow and never empty. But that'll take some tinkering

#

I.e. (average queue insertions per X) * (average runtime of your method) + 1 or some shit like that
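As a rough worked example of that estimate (it is essentially Little's law: concurrency needed ≈ arrival rate × time per item), here is a small helper. The function name and the numbers are made up for illustration; measure your own rates before trusting any figure.

```python
import math


def consumers_needed(events_per_second: float, seconds_per_event: float) -> int:
    """Estimate how many concurrent consumers keep the queue from growing.

    arrival rate * average handling time = average number of items being
    handled at once; the +1 is the spare worker from the rule of thumb.
    """
    return math.ceil(events_per_second * seconds_per_event) + 1


# Hypothetical numbers: 500 presence events/s, 0.25 s per event
# (database lookups plus a role update) -> 125 busy workers, plus one spare.
print(consumers_needed(500, 0.25))
```

This only holds while the handler spends its time awaiting I/O. If the work is CPU-bound, more tasks on one event loop will not help.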

#

Could be worth a shot

rich vault
#

yeee im giving it a go rn.

#

i learned python an unconventional way so this is somewhat new to me lmao.

rich vault
old lagoon
worldly shoal BOT
#

suggestions/bot.py line 578

async def load(self):

main.py line 57

await bot.load()
old lagoon
#
async def on_presence_update(before: disnake.Member, after: disnake.Member):
    # Producer: just enqueue and return so the event handler stays cheap
    queue.put_nowait((before, after))

async def process():
    while True:
        before, after = await queue.get()
        try:
            # Cheapest check first, before anything is awaited
            if not after.guild.me.guild_permissions.manage_roles:
                continue

            settings = await get_or_create_settings(after.guild)
            if settings is None:
                continue

            # Needs settings, so it has to come after the fetch
            if settings_should_skip(after, settings):
                continue

            role1 = after.guild.get_role(settings.role1)
            role2 = after.guild.get_role(settings.role2) if settings.role2 else None
            if role1 is None:
                continue

            _embed = await get_or_create_embed(before.guild)

            if should_handle_vanity(after, settings):
                # continue, not return: a return would end this worker for good
                await handle_vanity(before, after, settings, role1, role2, _embed)
                continue

            if should_handle_removed_vanity(after, settings):
                await handle_removed_vanity(before, after, settings, role1, role2, _embed)
        finally:
            # Mark the item done only once it has actually been handled
            queue.task_done()

async def load():
    # Start five consumers that run for the lifetime of the bot
    for _ in range(5):
        asyncio.create_task(process())
#

Something like that

#

Otherwise go profile your prod bot and figure out exactly what your bot's dying from and what needs speeding up 😛

#

Prolly that tho

#

I assume you also already have speedups installed?

rich vault
#

yeee i do

#

ima give this a go

old lagoon
#

Yea give it a go, see what happens

#

You should also log the queue size every so often to see if you're keeping up or need to start more tasks to consume from it
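One way to do that logging, as a minimal sketch: a background task that samples `qsize()` on an interval and logs the change since the last sample. The logger name and `monitor_queue` are hypothetical; a steadily positive change means the consumers are falling behind.

```python
import asyncio
import logging

log = logging.getLogger("presence_queue")  # hypothetical logger name


async def monitor_queue(queue: asyncio.Queue, interval: float = 30.0) -> None:
    """Log the queue depth and its change since the previous sample, forever."""
    previous = queue.qsize()
    while True:
        await asyncio.sleep(interval)
        current = queue.qsize()
        log.info(
            "presence queue size=%d (change %+d since last sample)",
            current,
            current - previous,
        )
        previous = current
```

Start it once next to the consumers, for example `asyncio.create_task(monitor_queue(queue))`, and keep a reference to the task.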

#

Should help alleviate the pressure from burst updates

#

Other than that you'll likely still have other issues from stuff

rich vault
old lagoon
#

Shouldddd be fine to let it rip I reckon

#

Essentially your workers will consume at a constant rate, and as long as that rate is higher than the input rate she'll be all good. It should handle bursts a lot better, because the only effect of a burst is that events sit in the queue for a while. And given you don't need it to be realtime, it can work through them as it pleases

#

That's the theory at least

rich vault
#

while i have you here, i noticed you made a post about this before but thing is im not using a proxy lmao

#

legit spamming

#

until it finally connects

#

im assuming its because all my shards within the clusters are all trying to connect at once?

#

ima let the bot run for like an hour and see what happens. right now im not getting any logging from the presence event. all cmds work fine, but the functions within the presence event aren't being called, so we will see what happens.

old lagoon
#

Yea idk tbh without looking into it further I cant remember

rich vault
#

💀

rich vault
#

thats just one cluster also

rich vault
#

uhh????

#

its just climbing lmao

subtle ocean
#

oh my god

old lagoon
#

May uh, may want more yes

#

If you want lil more complicated you could have a set amount of 'forever' workers and then scale as appropriate lol

#

Can you try do some maths to find out how many events are added per second? And how long it takes for one loop to complete

#

Like im not surprised its climbing lol

rich vault
#

well i had 5, bumped it to 15. bot was working fine in my server and the queue was down to 0, but some servers weren't working at all (as in roles not being added for custom status)? changed some things, reset the bot, and now it's not working but the queue is at 0?

#

im so lost because all cmds work fine

#

but anything inside the presence update event is a hit or miss

old lagoon
#

I mean tasks suppress errors so maybe that's it
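To make that concrete: an exception raised inside a fire-and-forget asyncio task is stored on the task rather than raised anywhere visible, so an uncaught error in a `while True` consumer quietly ends that consumer. A minimal sketch of a worker that logs failures and keeps going (`handle_event` and the logger name are hypothetical stand-ins):

```python
import asyncio
import logging

log = logging.getLogger("presence_worker")  # hypothetical logger name


async def handle_event(item) -> None:
    """Hypothetical stand-in for the real presence logic."""
    if item == "bad":
        raise ValueError("simulated failure")


async def worker(queue: asyncio.Queue, processed: list) -> None:
    while True:
        item = await queue.get()
        try:
            await handle_event(item)
            processed.append(item)
        except Exception:
            # Log the traceback and stay alive for the next item.
            # CancelledError is not an Exception subclass, so cancel() still works.
            log.exception("presence worker failed on %r", item)
        finally:
            queue.task_done()
```

With this in place a bad event shows up in the logs instead of shrinking the worker pool one task at a time.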

rich vault
#

do i dare enable debug in logger

#

wtf

#

why is it dispatching events already?

#

this is the second i start the bot it floods logs

#

and it shouldn't be

subtle ocean
#

Try deleting pycache

rich vault
#
@plugin.load_hook(post=True)
async def wait_until_ready():
    await plugin.bot.wait_until_ready()
    #bunch of junk
    logging.info("Cached guilds. Starting presence updates. Took %.2fs", time.monotonic() - started_at)
    plugin.bot.add_listener(on_presence_update, "on_presence_update")

the listener gets added here

#
[2024-03-04 09:06:29,483] DEBUG [disnake.client.dispatch:750] Dispatching event socket_event_type
[2024-03-04 09:06:29,483] DEBUG [disnake.client.dispatch:750] Dispatching event raw_presence_update
[2024-03-04 09:06:29,484] DEBUG [disnake.client.dispatch:750] Dispatching event presence_update
[2024-03-04 09:06:29,489] DEBUG [disnake.gateway.received_message:553] For Shard ID 43: WebSocket Event: {'t': 'PRESENCE_UPDATE', 's': 5129, 'op': 0, 'd': {'user': {'id': '1129948038643843084'}, 'status': 'dnd', 'guild_id': '1071267739580252170', 'client_status': {'desktop': 'dnd'}, 'broadcast': None, 'activities': [{'type': 0, 'timestamps': {'start': 1709536768000}, 'state': ' Speeding on Pillbox Hill', 'session_id': '9fa180939190229eadcf98c578d6a896', 'name': 'TPLA', 'id': '22a55f840028e879', 'details': 'Players: 164/260 | Queue: 0 Players', 'created_at': 1709543189426, 'buttons': ['Tebex', 'Discord'], 'assets': {'large_text': 'TPLA', 'large_image': '1178868371194904627'}, 'application_id': '1144568003195850803'}]}}
[2024-03-04 09:06:29,489] DEBUG [disnake.client.dispatch:750] Dispatching event socket_event_type
[2024-03-04 09:06:29,489] DEBUG [disnake.client.dispatch:750] Dispatching event raw_presence_update
[2024-03-04 09:06:29,490] DEBUG [disnake.client.dispatch:750] Dispatching event presence_update
[2024-03-04 09:06:29,490] DEBUG [disnake.gateway.received_message:553] For Shard ID 43: WebSocket Event: {'t': 'PRESENCE_UPDATE', 's': 5130, 'op': 0, 'd': {'user': {'id': '743956688926801960'}, 'status': 'dnd', 'guild_id': '778438158605615115', 'client_status': {'desktop': 'dnd'}, 'broadcast': None, 'activities': [{'type': 0, 'timestamps': {'start': 1709542604000}, 'state': 'In A Squad', 'party': {'size': [4, 4], 'id': '549744c92c72032bbd5fd4fedab33f6a'}, 'name': 'Fortnite', 'id': 'baa5df061b353164', 'details': 'Battle Royale - 25 Remaining', 'created_at': 1709543189444, 'assets': {'small_text': 'Tier 100', 'small_image': '443127519386927104'}, 'application_id': '432980957394370572'}, {'type': 0, 'session_id': 'd0f00b9e0f7910626de37c1d622eddde', 'name': 'Rainbow Six Siege', 'id': '9ba7c6776a719ec4', 'flags': 1, 'details': 'in MENU', 'created_at': 1709539055002, 'assets': {'large_image': '446301881636225042'}, 'application_id': '445956193924546560'}]}}
[2024-03-04 09:06:29,490] DEBUG [disnake.client.dispatch:750] Dispatching event socket_event_type
[2024-03-04 09:06:29,490] DEBUG [disnake.client.dispatch:750] Dispatching event raw_presence_update
[2024-03-04 09:06:29,492] DEBUG [disnake.client.dispatch:750] Dispatching event presence_update
[2024-03-04 09:06:29,539] DEBUG [disnake.gateway.received_message:553] For Shard ID 43: WebSocket Event: {'t': 'PRESENCE_UPDATE', 's': 5131, 'op': 0, 'd': {'user': {'id': '1069411509698035713'}, 'status': 'online', 'guild_id': '1058361754016546878', 'client_status': {'web': 'online'}, 'broadcast': None, 'activities': [{'type': 4, 'state': 'discord.gg/member-service', 'name': 'Custom Status', 'id': 'custom', 'created_at': 1709543189459}]}}
[2024-03-04 09:06:29,540] DEBUG [disnake.client.dispatch:750] Dispatching event socket_event_type
[2024-03-04 09:06:29,540] DEBUG [disnake.client.dispatch:750] Dispatching event raw_presence_update
[2024-03-04 09:06:29,540] DEBUG [disnake.client.dispatch:750] Dispatching event presence_update
old lagoon
#

There are default listeners for things iirc

old lagoon
rich vault
# old lagoon There are default listeners for things iirc

so the

[2024-03-04 09:06:29,540] DEBUG [disnake.client.dispatch:750] Dispatching event raw_presence_update
[2024-03-04 09:06:29,540] DEBUG [disnake.client.dispatch:750] Dispatching event presence_update

are just fired soon as the bot starts?

#

cause its blowing up logs lol

worldly shoal BOT
#

disnake/state.py line 971

def parse_presence_update(self, data: gateway.PresenceUpdateEvent) -> None:

disnake/client.py line 749

def dispatch(self, event: str, *args: Any, **kwargs: Any) -> None:
old lagoon
#

It will attempt to dispatch the event, and thus log, regardless of if you have listeners or not
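If the goal is debug output from your own code without the per-event flood, one option is to raise the level on just the noisy loggers. This sketch assumes the library's loggers are named after its modules, as the `disnake.client` and `disnake.gateway` prefixes in the log lines suggest; adjust the names if your log format shows something else.

```python
import logging

logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)  # debug output for everything by default

# Assumption: logger names match the module prefixes seen in the log lines.
for noisy in ("disnake.client", "disnake.gateway"):
    logging.getLogger(noisy).setLevel(logging.INFO)
```

Your own loggers inherit DEBUG from the root logger, while the dispatch and raw websocket messages are filtered out.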

rich vault
#

ahhh i see

#

so yea debug does nothing for me then lmao

old lagoon
#

Indeed

rich vault
#

adding roles/sending a message to a channel

#

like global?

old lagoon
#

It could be

#

I havent hit it but I'd assume you'd see

rich vault
#

yeah at this point im contacting dev support lmao

#

to see whats going on

old lagoon
#

Aight gl

rich vault
#

appreciate the help, def gonna keep the queue implementation

rich vault
# old lagoon Aight gl

Any idea what could cause the events to work on some clusters but not others? im using ur cluster/shard setup so all clusters run the exact same code, but only some are giving roles/sending notifications when a user changes their custom status and other clusters just don't at all. on all clusters, the commands/interacting with the bot work fine.

#

queue size on the non-working clusters is active, as it's constantly consuming the queue so the event/function itself is working.

old lagoon
#

Not a clue ngl

#

Seems weird as

#

Might need some hella debugging

rich vault
# old lagoon Not a clue ngl

figured it out. now it's just a matter of processing the queue faster cause look at this shit.

{
    "cluster7": {
        "size": 284302
    },
    "cluster5": {
        "size": 337271
    },
    "cluster4": {
        "size": 273814
    },
    "cluster9": {
        "size": 252700
    },
    "cluster3": {
        "size": 239151
    },
    "cluster6": {
        "size": 274352
    },
    "cluster2": {
        "size": 219467
    },
    "cluster8": {
        "size": 268732
    },
    "cluster1": {
        "size": 0
    }
}
#

thats w 50 consumers

#

a cluster

#

and its just rapidly growing 💀

rich vault
#

should I spawn more clusters?

#

cause right now, each cluster is about a million users

old lagoon
#

More consumers!

#

Maybe I make you a smart one

rich vault
#

yeee so it boils down to consumers. i upped clusters and spawned 100 consumers, and more clusters are working. most are at 0 qsize but some are still constantly climbing. assuming its just more active users on some clusters than others

#

ive got a lot of power soooo just a matter of optimizing it all

old lagoon
#

Yea, you could build something to auto scale consumers

#

If it's not keeping up, spawn more for a bit

#

That kinda vibe

rich vault
#

yeee im just do sumn like

for _ in range(plugin.bot.queue.qsize() // 2):
    asyncio.create_task(process())

thats as far as my brain can math tbf

#

how would I cancel/delete em after the queue is brought down some

#

how shit is something like this?

    plugin.bot.queue.put_nowait((before, after))
    if plugin.bot.queue.qsize() > 5000:
        for _ in range(plugin.bot.queue.qsize() // 2):
            asyncio.create_task(process())
        return
    else:
       loop = asyncio.get_event_loop()
       tasks = list(asyncio.all_tasks(loop))
       for t in tasks[100:]:
           t.cancel()
old lagoon
#

Well the current ones, at least as I provided them, are forever loops, which means it won't work

#

You need a way to signal to the task to return for cancel to actually cancel

rich vault
#

fuck me

old lagoon
#

It's fairly simple

#

Tbh

rich vault
old lagoon
#

Hmm actually given you await a lot you could likely get away with a simple check

#
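cancel_set: set[int] = set()
worker_ids: list[int] = []

async def load(size: int):
    for i in range(size):
        asyncio.create_task(process(i, cancel_set))
        worker_ids.append(i)

...

async def process(_id: int, cancel: set[int]):
    while True:
        # Checked once per loop, so a worker exits after its current item
        if _id in cancel:
            cancel.discard(_id)
            return

        before, after = await queue.get()
        ...  # process stuff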
#

And just add ids to cancel set when you want em to die on next loop

rich vault
#

or is that just another set/list of tasks that's been created

old lagoon
#

So if you wanna kill half the consumers, iterate over half the list

#

It could also be

#
workers: list[tuple[int, asyncio.Task]] = []
workers.append((i, asyncio.create_task(process(i, cancel))))

...
i, task = workers.pop()
task.cancel()
cancel.add(i)
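Putting the scaling idea together, here is a minimal sketch of a pool that keeps references to its consumer tasks so it can grow and shrink. One note on cancellation in asyncio: `Task.cancel()` raises `CancelledError` inside the task at its next `await`, so a `while True` worker that awaits `queue.get()` does stop when cancelled. `WorkerPool`, `scale_to`, `autoscale`, and the `per_worker` ratio are all hypothetical names and numbers, not anything from the thread's libraries.

```python
import asyncio


class WorkerPool:
    """Tracks consumer tasks so they can be added or cancelled later."""

    def __init__(self, queue: asyncio.Queue, handler) -> None:
        self.queue = queue
        self.handler = handler  # async callable taking one queue item
        self.tasks: list[asyncio.Task] = []

    async def _worker(self) -> None:
        while True:
            item = await self.queue.get()  # an idle worker is cancelled here
            try:
                await self.handler(item)
            finally:
                self.queue.task_done()

    def scale_to(self, target: int) -> None:
        """Spawn or cancel workers until exactly `target` are tracked."""
        while len(self.tasks) < target:
            self.tasks.append(asyncio.create_task(self._worker()))
        while len(self.tasks) > target:
            self.tasks.pop().cancel()

    def autoscale(self, minimum: int, maximum: int, per_worker: int = 100) -> None:
        """Aim for one worker per `per_worker` queued items, within [minimum, maximum]."""
        wanted = self.queue.qsize() // per_worker + 1
        self.scale_to(max(minimum, min(maximum, wanted)))
```

Calling `pool.autoscale(minimum=50, maximum=500)` every few seconds from a background task is one way to drive it. Only the pool's own tasks are ever cancelled, unlike iterating over `asyncio.all_tasks()`, which also contains the bot's internal tasks. A worker cancelled in the middle of handling an item drops that item, which may or may not be acceptable for presence updates.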
rich vault
#

@old lagoon im ab to glaze 😂 but dude you have helped me so much from code help and your libs. my bot lowkey be fucked without your help. its all working now, and this actually gave me an idea for premium version of my bot too with the queue system.

old lagoon
#

Let me guess, it runs zonis and function cooldowns 😛
Possibly alaric as well if you use mongo as a backend