#.loadAll() blocks all shards?

simple plaza
#

Okay, I have finally isolated my problem which I've opened multiple questions about.

The problem is this call:

await client.stores.get('commands').loadAll();

(my old bad way of getting all the pieces and reloading them in a loop does the same)

Requirements for reproduction:
  • 20+ shards with ~15k guilds (not entirely sure that it's required actually)
  • active users using commands
  • run the code on all of them

What happens:

  1. About half of the shards complete the process within seconds.
  2. All clients "freeze" for 10-20 seconds; the shards do not respond to any Discord message or interaction. However, the event loop does not seem to be blocked, because independent functions still run.
  3. All shards resume normal operation and the rest of the loadAlls finish.
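
One way to check whether the event loop really stays responsive during a reload is to measure how late a zero-delay timer fires. The following is a minimal sketch, not part of the original report; `measureEventLoopLag` and `busyWait` are hypothetical helper names, and `busyWait` only stands in for whatever synchronous work a reload might trigger.

```javascript
const { performance } = require('node:perf_hooks');

// Resolves with how many milliseconds late a 0 ms timer fired.
// A large value means synchronous work kept the event loop busy.
function measureEventLoopLag(syncWork) {
  return new Promise((resolve) => {
    const scheduledAt = performance.now();
    setTimeout(() => resolve(performance.now() - scheduledAt), 0);
    syncWork(); // runs to completion before the timer callback can fire
  });
}

// Hypothetical stand-in for expensive synchronous code (e.g. a heavy require()).
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

measureEventLoopLag(() => busyWait(50)).then((lagMs) => {
  console.log(`timer fired ${lagMs.toFixed(1)}ms late`);
});
```

If the reported lag stays near zero while the bot appears frozen, the stall is probably not synchronous CPU work in that process.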

djs: 14.11.0
node: 16.13.1
sapphire/framework: 4.4.3

simple plaza
#

A workaround for now is to wait 3 seconds between reloading on each shard... This seems like it causes a small freeze each time, but it's less disruptive
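
The staggered workaround described above could be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: `staggeredReload` and the `reloadShard` callback are hypothetical names, and `reloadShard` would wrap whatever mechanism already triggers `loadAll()` on a single shard.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls reloadShard(id) for each shard in turn, pausing delayMs between shards.
// reloadShard is a hypothetical callback; it is not a discord.js or Sapphire API.
async function staggeredReload(shardIds, reloadShard, delayMs = 3000) {
  const timings = [];
  for (const [index, id] of shardIds.entries()) {
    if (index > 0) await sleep(delayMs); // no pause before the first shard
    const startedAt = Date.now();
    await reloadShard(id);
    timings.push({ id, ms: Date.now() - startedAt });
  }
  return timings;
}
```

Recording the per-shard duration makes it easier to see whether one shard is consistently slower than the rest.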

low olive
#

@vagrant frost you're more knowledgeable about this kinda stuff

simple plaza
#

I might have been falling for a red herring this whole time. It might well be that each loadAll only blocks its own client.

vagrant frost
#

At this point, bomi, share your bot's entire code

#

Also, what does your bot do that requires the sharder? Practically speaking, the sharding manager leads to countless easily avoidable headaches that could be solved by not using it at all. SM is for a tiny fraction of bots, and if you're using it because of sharding requirements, it's not needed. @subtle moth runs 13k guilds just fine in one process, using internal sharding alone

simple plaza
#

Multithreading

#

I'm not sure what you mean by entire here, sorry

vagrant frost
#

You're pointing at the wrong thing

#

What you have is a feedback loop: shards sending messages to other shards, those sending to the rest, and so on, spamming broadcasts from each shard to the rest of the shards without stopping

simple plaza
#

1 sec ill boil it down as much as possible

vagrant frost
#

Active users using commands has no effect here because Sapphire loads pieces atomically: the old piece is only unloaded when the new piece is ready and inserted
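
The idea of an atomic swap can be illustrated with a toy store. This is explicitly not Sapphire's implementation, only a sketch of the principle described above; `ToyStore` and its `build` factory are hypothetical.

```javascript
// A toy store mapping name -> piece. Not Sapphire's implementation.
class ToyStore {
  constructor() {
    this.pieces = new Map();
  }

  // build() is a hypothetical factory that may throw while constructing the new piece.
  reload(name, build) {
    const next = build();        // if this throws, the old piece stays registered
    this.pieces.set(name, next); // single step: old piece replaced by the ready one
    return next;
  }

  run(name, ...args) {
    return this.pieces.get(name).run(...args);
  }
}

const store = new ToyStore();
store.reload('ping', () => ({ run: () => 'pong v1' }));
store.reload('ping', () => ({ run: () => 'pong v2' }));
console.log(store.run('ping')); // prints "pong v2"
```

Because the replacement happens in one `Map.set` call, a command invoked during a reload finds either the old piece or the new one, never a missing entry.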

#

It's not a problem with Sapphire because otherwise a lot more people would have run into this, especially those who use the HMR plugin, as it spams reloads

#

And ShardingManager may be unreliable at times, but it doesn't cause full app freezes

simple plaza
#

If I had a feedback loop, it'd produce multiple console logs, right?

vagrant frost
#

You're running other stuff that's called at the same time as the pieces reload. Check the code at the top level of your commands and the onLoad methods; you have something that's very CPU-intensive

simple plaza
#

The reason I say that it's reliant on active users is that I have the same code running on another bot account, and it does not happen there. I have not managed to get it to happen once with that account... T_T

vagrant frost
#

I also recall you did `delete require.cache[]`s (albeit in the wrong order, doing it after commands rather than before)

#

It's possible that whatever modules you're reloading, they're running very expensive operations

vagrant frost
#

I can't say anything for sure. I know neither Sapphire nor Discord.js is the issue, but something in your code is

#

Something in your code is being called when pieces reload, and it's incredibly expensive and/or has an infinite loop that never breaks for some reason

simple plaza
#

But here's the most basic boiled down version

const startTime = Date.now();
await botclient.stores.get('commands').loadAll();
await message.channel.send(`LoadAll took ${Date.now() - startTime}ms`);
return;

(This is in a command)
For unknown reasons (which I think is user activity), it sometimes takes 8000ms

vagrant frost
#

For example you could have multiple modules depending on each other and reading their state, and if you load them out-of-order, you end up executing logic with a cache state of finished rather than a cache state of uninitialised, and that kind of (impure) behaviour can lead to infinite loops

#

btw you shouldn't use botclient.stores, but container.stores

vagrant frost
# vagrant frost I also recall you did `delete require.cache[]`s (albeit the wrong order, doing i...

Keep in mind as well that you can run into split cached modules, resulting not only in a memory leak, but also in duplicated systems. For example, if some files load your database init function, and commands do too, then you reload the database init module (making it re-initialize) and then reload commands without reloading the rest, you end up running 2 database pools/connections simultaneously. If you depend on them to behave like they're the same object, with its internal queues and everything, things can go south really fast
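
The split-module effect can be reproduced with two throwaway files. This is a minimal sketch; the file names `db.js` and `command.js` and the `pool` object are hypothetical stand-ins for a real database module.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Write two throwaway modules to a temp directory (hypothetical file names).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'split-cache-'));
const dbPath = path.join(dir, 'db.js');
const commandPath = path.join(dir, 'command.js');
fs.writeFileSync(dbPath, 'module.exports = { pool: { createdAt: process.hrtime.bigint() } };');
fs.writeFileSync(commandPath, "module.exports = { db: require('./db.js') };");

const command = require(commandPath); // loads db.js and keeps a reference to it

// Reload only db.js: drop its cache entry and require it again.
delete require.cache[require.resolve(dbPath)];
const freshDb = require(dbPath);

// command.js still holds the old instance, so two "pools" now coexist.
console.log(command.db.pool === freshDb.pool); // prints "false"
```

Anything that required the module before the cache entry was deleted keeps its old copy until it is reloaded as well.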

simple plaza
#

Yeah, it "should" take 300-400ms, as it does whenever it doesn't take 8000+ms

vagrant frost
#

Maybe check that, whether or not you have a very heavy or a long-running onLoad hook
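
A rough way to look for a slow hook is to time each one separately. The sketch below uses plain stand-in objects rather than real Sapphire pieces; `timeOnLoadHooks` and `samplePieces` are hypothetical names. Calling `onLoad()` by hand on real pieces would run the hook an extra time, so this is only meant for a test setup.

```javascript
const { performance } = require('node:perf_hooks');

// Awaits each piece's onLoad() in turn and reports the slowest first.
// `pieces` is any iterable of [name, piece]; the pieces here are plain stand-in objects.
async function timeOnLoadHooks(pieces) {
  const results = [];
  for (const [name, piece] of pieces) {
    if (typeof piece.onLoad !== 'function') continue;
    const startedAt = performance.now();
    await piece.onLoad();
    results.push({ name, ms: performance.now() - startedAt });
  }
  return results.sort((a, b) => b.ms - a.ms);
}

// Hypothetical pieces: one cheap hook, one that waits 40 ms.
const samplePieces = new Map([
  ['ping', { onLoad() {} }],
  ['stats', { onLoad: () => new Promise((resolve) => setTimeout(resolve, 40)) }],
]);

timeOnLoadHooks(samplePieces).then((results) => {
  for (const { name, ms } of results) console.log(`${name}: ${ms.toFixed(1)}ms`);
});
```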

simple plaza
#

oh my god

#

Yeah, the times it took 8000ms were when the require cache was empty

#

I've red-herringed everyone because I deleted some old junk on the test bot

#

Loading that old junk is what took 7600ms in loadAll
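
The difference between a cold and a warm require cache can be shown with a small timing helper. This is a sketch under assumptions: `timeRequire` is a hypothetical helper, and `junk.js` is a made-up module whose top-level code spins for about 50 ms to imitate expensive load-time work.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { performance } = require('node:perf_hooks');

// Returns how long a require() call takes, in milliseconds.
function timeRequire(file) {
  const startedAt = performance.now();
  require(file);
  return performance.now() - startedAt;
}

// Hypothetical "old junk" module whose top-level code spins for about 50 ms.
const slowModule = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'cold-require-')), 'junk.js');
fs.writeFileSync(slowModule, 'const end = Date.now() + 50; while (Date.now() < end) {} module.exports = {};');

const coldMs = timeRequire(slowModule); // cache empty: top-level code runs
const warmMs = timeRequire(slowModule); // cache hit: top-level code is skipped
console.log({ coldMs, warmMs });
```

Timing individual modules this way, right after clearing the cache, helps narrow down which file accounts for most of a slow `loadAll()`.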

#

Sorry about that, lesson learnt I hope

#

Thanks for all the help

#

I'm not sure which message to mark as the solution

vagrant frost
#

Whichever the "old junk on the test bot" falls into

serene gullBOT
#

Thank you for marking this question as solved!

Question Message ID

1110485677973446757

Solution Message ID

1110496862227660824