#thread_local cache

narrow rampart
#
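class someBigAndCostlyShit {
public:
  someBigAndCostlyShit() {} // does this prevent eng from constructing?
  ~someBigAndCostlyShit();  // defined below, once q has been declared

  void initialize()
  {
    eng.seed(currentTime());
  }

private:
  std::mt19937_64 eng;
};

ThreadSafeQueue<someBigAndCostlyShit> q;

someBigAndCostlyShit::~someBigAndCostlyShit()
{
  // Copy *this into the cache; enqueue(this) would store a pointer
  // to an object that is in the middle of being destroyed.
  q.enqueue(*this);
}

void threadfunc()
{
  thread_local someBigAndCostlyShit a;
  std::optional<someBigAndCostlyShit> val = q.pop();
  if (val.has_value())
  {
    a = *val;
  }
  else
  {
    a.initialize();
  }
}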

I want to implement a class that caches itself in a queue when a thread exits, so it doesn't have to be constructed again when a new thread calls threadfunc. How does this look?
I'm afraid that thread_local someBigAndCostlyShit a; will be expensive anyway because of the default constructor of mt19937_64
I'm not sure which random number generator to use, or whether my solution even makes sense
I want to have a random function from x to y, but my server is async and I guess a lot of new threads can spawn and die often, so creating these all the time would be costly
that's why I want to cache them
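
For reference, an empty constructor body does not stop the eng member from being default-constructed. One way to defer the cost is to keep the engine in a thread_local std::optional and build it on first use. The sketch below assumes that approach; thread_engine and random_between are hypothetical names that do not appear in the thread, and seeding from std::random_device is an assumption standing in for currentTime().

```cpp
#include <cstdint>
#include <optional>
#include <random>

// Hypothetical helper (not from the thread): returns this thread's engine,
// constructing and seeding it only on the first call made by that thread.
inline std::mt19937_64& thread_engine()
{
    // An empty optional holds no engine, so nothing is built until emplace().
    thread_local std::optional<std::mt19937_64> eng;
    if (!eng) {
        std::random_device rd; // assumption: stands in for currentTime()
        eng.emplace(rd());
    }
    return *eng;
}

// Uniform integer in the closed range [lo, hi]; requires lo <= hi.
inline std::int64_t random_between(std::int64_t lo, std::int64_t hi)
{
    std::uniform_int_distribution<std::int64_t> dist(lo, hi);
    return dist(thread_engine());
}
```

A handler would simply call random_between(x, y). Each thread pays for engine construction once, and no queue or lock is involved.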

radiant jewel
#

Thread-local storage is a lot slower on Windows for some reason, btw. There's a major performance difference compared to Linux

#

There are certainly faster random number generators as well, I also don’t like the use of a global variable

#

What matters is how expensive that object creation is compared to the execution time of the thread

narrow rampart
#

I assume rng generators per thread would be expensive so I came up with this

#

and I plan to run on linux anyways

#

and for a global variable well I dont see any other option

radiant jewel
#

You can inject a dependency?

#

A reference to the queue

narrow rampart
#

sounds shitty

narrow rampart
#

I could also make the variable static in the function itself

narrow rampart
#

and its a waste of time to make a new generator to get 1/2 numbers

#

so I could just reuse it

radiant jewel
#

The locking in your thread-safe queue will probably cancel out whatever you save

#

Because 1000 threads are trying to acquire the lock

#

Which I assume your queue has

narrow rampart
#

yeah

#

but they wouldnt be all 1000 at once

radiant jewel
#

Well I think it’s a lot of premature optimization, you still haven’t told me how long the thread actually runs for ?

narrow rampart
#

I think making a new generator for each thread in case where async can spawn shit ton of threads that will die shortly after is a waste

radiant jewel
#

So anything from 1 nanosecond to 10 billion years?

narrow rampart
#

yes

radiant jewel
#

And your average is also unknown ?

narrow rampart
#

Invocation of the handler will be performed in a manner equivalent to using boost::asio::io_service::post().

reef vault
#

sounds like you'd benefit from having a thread pool as well tbh

reef vault
#

might not matter here though

narrow rampart
#

I use async_accept in loop then I use async_read in loop for connections

narrow rampart
#

well I think I got the how? so just more about the sense of this

#

if my class wrapper would even work and prevent the generator from running its shit on this line thread_local someBigAndCostlyShit a;

reef vault
#

Premature optimization and whatnot aside, caching instances and creating new ones on demand is fine-ish. It's just that if you really have one instance per thread and a ton of threads, I'd much rather have a thread pool and something similar to coroutines

narrow rampart
#

I have a vector of std::thread and then I emplace back 8 threads that do io_context.run

narrow rampart
#

and from what I see that post will only invoke the handlers from the 8 threads

#

🤔

#

I actually didnt read into asio documentation

reef vault
#

then you have only 8 instances max
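
The point about a fixed number of worker threads can be checked with a small standard-library sketch. It is a stand-in for the 8 threads calling io_context.run, not asio itself. CountedEngine, engines_constructed and run_on_fixed_pool are hypothetical names introduced only for this illustration.

```cpp
#include <atomic>
#include <random>
#include <thread>
#include <vector>

// Counts how many engines were ever constructed (illustration only).
inline std::atomic<int> engines_constructed{0};

struct CountedEngine {
    std::mt19937_64 eng;
    CountedEngine() : eng(std::random_device{}()) { ++engines_constructed; }
};

// Runs `tasks` units of work on `workers` long-lived threads and returns
// how many tasks were executed. Every task uses a thread_local engine.
inline int run_on_fixed_pool(int workers, int tasks)
{
    std::atomic<int> next{0};
    std::atomic<int> done{0};
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            // Claim task indices until none are left.
            while (next.fetch_add(1) < tasks) {
                thread_local CountedEngine local; // built once per worker thread
                (void)local.eng();
                ++done;
            }
        });
    }
    for (auto& t : pool) t.join();
    return done.load();
}
```

After run_on_fixed_pool(8, 1000) the counter is at most 8, however many tasks ran, because the engine's lifetime is tied to the worker thread and not to the task.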

narrow rampart
#

I thought it could spawn threads on async operations

#

but it seems it just uses whatever thread I do io_context.run on

reef vault
#

well depending on the exact setup the bottom line is you don't want too many concurrent instances and you want them only on the threads that will use it

#

depending on what exactly you do, you can have one generator churn out packets of random numbers, push those packets into a queue, and grab packets from different threads
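
A minimal sketch of that packet idea, assuming a plain mutex and condition variable for the queue. RandomPacketQueue and its member names are hypothetical, and the packet size is left to the caller.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <queue>
#include <random>
#include <utility>
#include <vector>

// Hypothetical type: one producer fills packets, any thread can take them.
class RandomPacketQueue {
public:
    using Packet = std::vector<std::uint64_t>;

    // Producer side: fill one packet from the single engine and publish it.
    void produce(std::mt19937_64& eng, std::size_t packet_size)
    {
        Packet p(packet_size);
        for (auto& x : p) x = eng();
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(p));
        }
        cv_.notify_one();
    }

    // Consumer side: block until a packet is available, then take it.
    Packet take()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        Packet p = std::move(q_.front());
        q_.pop();
        return p;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Packet> q_;
};
```

The lock is taken once per packet instead of once per number, so larger packets mean less contention, at the cost of more memory per consumer.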

narrow rampart
#

I never did such a thing that required a thread pool so idk how many threads to use

#

assuming 1500 people sending few packets a second of sizes about 500 bytes each

fringe hatch
#

I think the real question here is: why are you spawning new threads at such high rates that creating an std::mt19937 once per thread becomes performance relevant?

sour plank
#

wait are you generating a new instance of a rng mt19937 generator when you spawn a new thread? And you are scared that generating mt19937 will be expensive?

reef vault
#

well apparently he might not be, but that's not confirmed?

narrow rampart
#

I thought asio could spawn a thread to execute the handler

#

but it just uses the threads that I gave it

fringe hatch
#

one would hope so ye ^^

narrow rampart
#

so the main idea is out of the window I guess

#

now its just to decide how many threads I will need

reef vault
#

if you don't know try hardware_concurrency or whatever

#

depends how many threads outside the pool you expect there to be
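
A small sketch of using hardware_concurrency as a starting point. std::thread::hardware_concurrency() is only a hint and may return 0 when the value cannot be determined, so a fallback is needed. default_pool_size and its fallback of 4 are assumptions made for this example.

```cpp
#include <thread>

// Hypothetical helper: number of pool threads to start with.
// hardware_concurrency() may return 0 if the count is not computable.
inline unsigned default_pool_size(unsigned fallback = 4)
{
    unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

The result would be the number of threads that call io_context.run, minus however many other busy threads the process is expected to have.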

#

.cppref hardware_concurrency

#

is that not the bot command

narrow rampart
#

is there a number beyond which adding threads doesn't improve performance

#

like above the number of cpu threads or cores

reef vault
#

yes there is, but the "exact" number depends on what you're doing

narrow rampart
#

and is there any way to approximate that number without trial and error

reef vault
#

if most threads are asleep most of the time you can "afford" to have more but it's kinda weird

#

if your threads are always active and want to churn out computation then there's not much point to go beyond what your machine can actually do

narrow rampart
#

idk to be honest what asio does under the hood with threads

#

I just know that io_context.run runs its event processing loop

reef vault
#

or more like it's detrimental because then the os has to spend more time scheduling and switching between threads

narrow rampart
#

how does windows not freeze then

reef vault
#

what do you mean

#

like the OS?

#

because it's in control

#

and gets to decide what runs or not

narrow rampart
#

from what I see in task manager I have 3536 threads running

reef vault
#

sure

#

most of them are either sleeping or do not get to do anything

#

unless your cpu is able to have 3536 threads make progress concurrently, some of them must be inactive, no questions asked

#

whether they are inactive because they have nothing to do and are waiting to be woken up/signalled, or they were forcibly dragged away from the cpu to put a different thread in its place, is unspecified

narrow rampart
#

so I should just use the number of CPU threads

#

or cores * threads

#

if I have 8 cores and 16 threads then it means I have 2 threads per core yeah?

radiant jewel
#

Depends on what type of execution they run

narrow rampart
#

so I should spawn 16 threads

radiant jewel
#

If they do performance-intensive tasks, you should only spawn as many threads as you have hardware threads

#

Since these will take over all your system resources

#

If they are just io or are stopping their execution regularly because they wait for data then spawning more is okay
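
A commonly cited rule of thumb for a first guess, before trial and error, is threads ≈ cores × (1 + wait time / compute time). It only applies to threads that actually block while waiting, and both times have to be measured. estimate_pool_size below is a hypothetical helper sketching that formula, not something from the thread or from asio.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: rule-of-thumb pool size.
// wait_time and compute_time are per-task averages in the same unit.
inline unsigned estimate_pool_size(unsigned cores, double wait_time, double compute_time)
{
    if (cores == 0) cores = 1;
    // Without usable measurements, fall back to one thread per core.
    if (compute_time <= 0.0 || wait_time < 0.0) return cores;
    double n = cores * (1.0 + wait_time / compute_time);
    return static_cast<unsigned>(std::max(1.0, std::round(n)));
}
```

With no waiting the estimate equals the core count, which matches the advice for compute-heavy threads. The more time a task spends blocked, the more threads the estimate allows.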

narrow rampart
#

yeah they are io

#

I guess I would have to do trial and error