#Optimalization through multithreading

316 messages ยท Page 1 of 1 (latest)

hexed loom
#

I want to multithread a for loop using a ThreadPool

hot jayBOT
#

When your question is answered use !solved to mark the question as resolved.

Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For tips on how to ask a good question run !howto ask.

hexed loom
#

according to the profiler I should first look at update_rockets

#
void Tmpl8::Game::update_rockets()
{
    for (Rocket& rocket : rockets)
    {
        rocket.tick();

        //Check if rocket collides with enemy tank, spawn explosion, and if tank is destroyed spawn a smoke plume
        for (Tank& tank : tanks)
        {
            if (tank.active && (tank.allignment != rocket.allignment) && rocket.intersects(tank.position, tank.collision_radius))
            {
                explosions.push_back(Explosion(&explosion, tank.position));

                if (tank.hit(rocket_hit_value))
                {
                    smokes.push_back(Smoke(smoke, tank.position - vec2(7, 24)));
                }

                rocket.active = false;
                break;
            }
        }
    }
}
#

So what I think I should do is look at the available threads on the machine, and then divide the rockets vector over them

timber jackal
#

send the pastebin link here too

hexed loom
#

This is the entire main file

timber jackal
#

Line 169:

std::vector<Tank*> nearby_tanks = tank.owner_cell->tanks;
hexed loom
#

and this is the threadpool

timber jackal
#

this doesn't have to be a copy

#

it could be a const ref to the tanks field

#
auto const& nearby_tanks = tank.owner_cell->tanks;
hexed loom
#

I see

timber jackal
#

there is only one .reserve() call on vector in your code

#

and 11 .push_backs

#

reserve memory ahead so it won't be reallocated many times

#

estimate how many items will be stored

hexed loom
#

Okay

timber jackal
#

IDK what does add_tank do, but this is most likely unsafe

#

line 90

#

you're taking a reference to a vector element

#

in a loop

#

in which you're pushing back to it

hexed loom
#
/// <summary>
/// Voegt tank toe aan een cell gebaseerd op zijn positie
/// </summary>
/// <param name="tank"></param>
void Grid::add_tank(Tank* tank) {
    Cell* cell = get_cell(tank->position);
    add_tank(tank, cell);
}

/// <summary>
/// Voegt tank toe aan cell
/// </summary>
/// <param name="tank"></param>
void Grid::add_tank(Tank* tank, Cell* cell) {
    cell->tanks.push_back(tank);
    tank->owner_cell = cell;
    // Geef de positie van de tank in de vector array aan:
    // Omdat hij aan de achterkant van de vector staat (push_back) is de positie
    // de grootte van de vector min รฉรฉn.
    tank->cell_vector_index = cell->tanks.size() - 1;
}
#

Sorry it is in dutch

timber jackal
#

yeah, it is unsafe

#
std::vector<Cell*> neighbouring_cells = grid->get_neighbour_cells(&current_tank);
std::vector<Tank*> neighbouring_enemy_tanks = grid->get_neighbouring_tanks(neighbouring_cells, current_tank.allignment);

line 113-114

#

use references here

#

same as before

hexed loom
#

using auto?

#

I am not sure what auto does, does it infer the type?

timber jackal
#

yes

#

auto is optional but mostly advised

hexed loom
#

Cool

timber jackal
#

unless you have very good reason not to use it

hexed loom
timber jackal
#

modern IDEs can show you inferred types anyway

#

also, not a performance problem, but I will mention it anyway

#

try not to repeat yourself with types

hexed loom
#

Aha

#

Yes I know what you mean

#

By making a template yes?

timber jackal
#
MergeSort sorter = MergeSort();
#

no

#
auto sorter = MergeSort();
hexed loom
#

๐Ÿ‘๐Ÿป

#

This seems a bit weird though

#

I mean I know it will be a MergeSort

timber jackal
#

no need to repeat it multiple times :p

hexed loom
#

The performance profiler, it only shows in depth one loop

#

at least I think

#

let me try that again

timber jackal
#

Ask yourself if all of the ticks have to be performed at the same rate?

#

From my experience most of game ticks can be limited to 10 times per second

hexed loom
#

I know the tanks_update one will be used a lot

timber jackal
#

I'm not talking about core gameplay mechanics

hexed loom
#

ah like that

timber jackal
#

and physics handling

#

this should tick each frame

#

but cosmetics

#

could be limited not to tick every frame

hexed loom
#

right

#

this would definitely increase performance

timber jackal
#

also prefer early return

#

same with loops and their continue;

#

nesting adds complexity and is unnecessary

#

this guy explained it well

hexed loom
#

Yea I definitely agree with this

#

but I didn't write most of this code

#

If I change it somewhere I'll have to change it everywhere

#

and I don't really wanna spend loads of time refactoring the code to make it more readable

timber jackal
#

okay I understand

hexed loom
#

I know personally though I wouldn't write it this way

#

Does the profiler make your program run slower?

timber jackal
#

generally yes

#

but in a simple programs it shouldn't really matter

hexed loom
#

alright good I thought I might've f'd something up ๐Ÿ˜…

timber jackal
#

what compiler do you use?

hexed loom
hexed loom
#

let me check

timber jackal
#

MSVC?

#

GCC?

#

Clang?

#

If you're running on MSVC you can easily adapt multithreading in certain places

hexed loom
#

I honestly am not sure, visual studio handles all of that

timber jackal
#

if you're running on a default Visual Studio setup

#

then it is MSVC

hexed loom
#

would make it a lot easier

timber jackal
#

C++17 to be exact

hexed loom
#

only the thread library and the threadpool they provided

timber jackal
#

in C++17 they implemented parallel versions of algorithms

#

yeah, but I doubt std lib is not allowed

hexed loom
#

Oh yeah sure that is allowed

timber jackal
#

you can for example run for_each in parallel

hexed loom
#

hmm

timber jackal
#
#include <execution>
#include <algorithm>

// ...
std::for_each(std::execution::par, vec.begin(), vec.end(),
  [&](VecElement const& elem) {
    // do something
  });
#

no need to manually launch threads

#

most of your loops that do not write to a common memory

#

could be rewritten this way

#

and possibly using std::scoped_locks on common data

#

example

#
void Game::update_particle_beams() {
    for (Particle_beam& particle_beam : particle_beams)
    {
        particle_beam.tick(tanks);
 
        //Damage all tanks within the damage window of the beam (the window is an axis-aligned bounding box)
        for (Tank& tank : tanks)
        {
            if (tank.active && particle_beam.rectangle.intersects_circle(tank.get_position(), tank.get_collision_radius()))
            {
                if (tank.hit(particle_beam.damage))
                {
                    smokes.push_back(Smoke(smoke, tank.position - vec2(0, 48)));
                }
            }
        }
    }
}
#

could be

#
void Game::update_particle_beams() {
    for (Particle_beam& particle_beam : particle_beams)
    {
        particle_beam.tick(tanks);
 
        //Damage all tanks within the damage window of the beam (the window is an axis-aligned bounding box)
        // ...
        std::for_each(std::execution::par, tanks.begin(), tanks.end(),
          [&](Tank& tank) {
            if (!tank.active || !particle_beam.rectangle.intersects_circle(tank.get_position(), tank.get_collision_radius()))
              return;
            
            if (!tank.hit(particle_beam.damage))
              return;
            
            auto lock = std::scoped_lock(smokes_mtx);
            smokes.push_back(Smoke(smoke, tank.position - vec2(0, 48)));
        });
    }
}
hexed loom
#

Aha

timber jackal
#

make sure to properly protect smokes vector with the smoke_mtx

hexed loom
#

nifty method

timber jackal
#

this will give you a huge performance boost if applied properly

hexed loom
#

unfortunately the profiler data isn't loading

#

after finishing

#

I will try this

#

but I am 90% sure I am supposed to use the threadspool they provided

#

even though this is clearly a better option

timber jackal
#

IDK how their thread pool looks like

#

but it should be pretty similar

hexed loom
timber jackal
#

you can even write your own impl of the for_each using the threadpool

#

pretty easily

#
template <typename Range, typename Func>
void par_for_each(ThreadPool& tp, Range& range, Func func) {
  // assuming you have an 8 core processor
  auto len = range.size();
  auto num_threads = std::min(16, range.size());
  auto step = len / num_threads;

  for (size_t i = 0; i < len; i += step)
    tp.enqueue([i, step, &func]{
      auto curr = std::next(std::begin(range), i);
      auto end = std::next(curr, step);
      for (; curr != end; ++curr)
        func(*curr);
    });
  }
}
#

hm, you should also hold futures somewhere

hexed loom
#

this is

#

maybe a little too advanced for me

#

๐Ÿ˜‚

timber jackal
#

welp

hexed loom
#

im struggling trying to lock the smokes even

#

so i make a std::mutex and I call .lock on that

timber jackal
#

this is how launching N threads over a number of elements work

timber jackal
hexed loom
#

oh like so

#

ah I understand

timber jackal
#

use:

{
  auto lock = std::scoped_lock(mutex);
  // do thing here

  // automatically unlocked
}
hexed loom
#

so you dont have to do anything with the lock?

timber jackal
#

no, scoped_lock does what its name suggests

hexed loom
#

alright

#

it unlocks when exiting that scope

timber jackal
#

yes

#

even if you write return; in between

#

so it is much safer than manually calling .lock()

hexed loom
#

and I should create the mutex then before the for_each

timber jackal
#

you should create a mutex wherever you create the thing it protects

#

but

#

if you're going for the safer and more managable approach

hexed loom
#

the smokes vector is made in the header file

timber jackal
#

which is to have the main thread

#

and then launch simple worker threads for smaller tasks

#

that don't exit the function that creates them

#

you can have a local mutex

#

in that function

#

you'd have a lot of trouble otherwise

#

because you'd have to protect the .smokes vector every time you access it

#

because there could be another thread launched in background

#

that modifies it

timber jackal
hexed loom
#

I would like to have it like that but it's the same with the early return principle ๐Ÿ˜‚

#

the whole program is designed around a single thread

#

I don't even really understand it because I didn't write it

timber jackal
#

okay okay

timber jackal
#

utilizing the threads only to run an algorithm in parallel

#

inside one function of the main thread

hexed loom
#

yes

#

that would be the Game::update method

#

I think

timber jackal
#

yes

#

and its submethods

#

I have to go now unfortunately

#

good luck :p

hexed loom
#

thanks for the help

hot jayBOT
#

@hexed loom Has your question been resolved? If so, run !solved :)

burnt nova
#

also small note, multithreading is not a magic pill to make things faster

burnt nova
hexed loom
#

debug

hexed loom
#

however I am having some issues using it

#

my thread pool is a unique pointer

#

I don't think I can then pass that into the method?

#

it doesnt even have to be a unique pointer i dont think

#

no it does

#

otherwise I get this error

#

@burnt nova would you know how I can pass the unique_pointer thread_pool into the function?

burnt nova
burnt nova
hexed loom
#

par_for_each(tanks, &set_routes);

#

Is this not how I should pass the function?

#

ah Game::set_routes

burnt nova
#

For real though in the future make profiles in release

#

Debug ones are basically useless

hexed loom
#

okay

burnt nova
#

But yeah it's indeed game::set_routes

#

Though I'd be somewhat supriced if that just works

hexed loom
#

that part works

#

For some reason though, even though THREAD_AMOUNT and thread_pool are both in the same Tmpl8 namespace, I still get errors

#

when I control click them they go to the right thing

#

and I can't seem to be able to pass them to the function directly

burnt nova
#

what is the first error that is actually thrown in the output?

#

cause the error list is nice, but a bit flawed cause it isn't ordered

hexed loom
#

I'm not sure how I

#

check in which order they occurred

burnt nova
#

the output window

#

cause a few of these could be falltrough errors

hexed loom
#

I get no output window, these are build errors

#

unless I dont understand what you mean with output window

burnt nova
#

there should be an option for build on the top there

#

and if it's not there

hexed loom
#

they show up in the same order

#

in the output window

burnt nova
#

so that is the std::min one?

hexed loom
#

yes

#

because auto num_threads = std::min(THREAD_AMOUNT, len);

#

THREAD_AMOUNT doesnt exist?

#

would it give that error?

burnt nova
#

nah, lets fix the first one first

hexed loom
#

that doesnt seem logical

burnt nova
#

what does it exactly say for the first error?

#

in the output window?

hexed loom
#

THREAD_AMOUNT is a const int

#

that doesnt matter right

burnt nova
#

no it is important

#

and what is len?

hexed loom
#

auto len = range.size();

burnt nova
#

right ๐Ÿค”

hexed loom
#
    template <typename Range, typename Func> // Template nodig voor een soort van "auto" types
    void parallel_for_each(Range& range, Func func) {
        auto len = range.size();
        auto num_threads = std::min(THREAD_AMOUNT, len);
        auto step = len / num_threads;

        // Loopt over het aantal threads dat we willen maken
        for (size_t i = 0; i < len; i += step)
            thread_pool.enqueue([i, step, &func] {
                auto curr = std::next(std::begin(range), i);
                auto end = std::next(curr, step);
                for (; curr != end; ++curr)
                    func(*curr); // * is nodig omdat curr een iterator is
            });
    }
burnt nova
#

yeah that is a size_t

hexed loom
#

the range is a reference

#

to a list of Tank

burnt nova
#

try making thread_amouth just a size_t

#

that ensure the types match

hexed loom
#

yea that fixed that error

#

thread_pool however doesnt exist, probably because it is a unique_ptr

burnt nova
#

again always work on the first error first

hexed loom
#

alright

burnt nova
#

what is that range can't be brought in hat is true as you do not capture it

hexed loom
#

The first one is the thread_pool one

burnt nova
#

right

hexed loom
#

I will need to pass it to the function, using move, I think

#

like you said

burnt nova
#

rightr where is threadpool defined?

hexed loom
#

In game.h

#

std::unique_ptr<ThreadPool> thread_pool;

#

and then in game.cpp I do

#

thread_pool = std::make_unique<ThreadPool>(THREAD_AMOUNT);

burnt nova
#

is that part of the game class?

hexed loom
#

but same namespace

burnt nova
#

right then yeah, you don't have access to it

hexed loom
#

I see

burnt nova
#

unless thread_pool is a global

#

then you might

hexed loom
#

can I make it one?

burnt nova
#

you shouldn't

hexed loom
#

okay so I should pass it

burnt nova
#

non constant globals are basically always a bad idea

#

just pass it in as an argument for the paralled_for_each realy

hexed loom
#

Alright

burnt nova
#

as a std::unique_ptr<ThreadPool>& or just a Threadpool&, and then dereference the unique_ptr when calling it

hexed loom
#

oh alright

burnt nova
#

*thread_pool

#

to make it a ref

hexed loom
#

ofcourse

burnt nova
#

also a small tip in general I would reccomend making comments in english over dutch

#

it's a bit more standard

hexed loom
#

yea I usually do

#

But for some reason they wanted it in dutch this time

#

Okay that fixed the thread_pool bug

burnt nova
#

odd, is this for buas?

hexed loom
#

but the range one is still here

burnt nova
#

[i, step, &func] vs [i, step, &func, &range]

hexed loom
burnt nova
#

yeah that is cause it's not captured in the lambda

burnt nova
hexed loom
#

no

#

im in leeuwarden

burnt nova
#

I just regoncnized the template and they generally use it for first year and entrance exams

hexed loom
#

ah

#

second year for me

burnt nova
#

and the teacher that made that template works at BUAS

hexed loom
#

cool

hexed loom
#

but now a new one

burnt nova
#

what exactly causes that error?

hexed loom
#

well

#

I actually think I know

#

it's trying to call the function I passed

burnt nova
#

good!

hexed loom
#

with part of the tanks list

#

but that function doesn't take any arguments

#

so I'll need to change it

#

hm I changed it

#
void Tmpl8::Game::set_routes(vector<Tank>& tanks)
{
    if (frame_count == 0)
    {
        for (Tank& t : tanks)
        {
            t.set_route(background_terrain.get_route(t, t.target));
        }
    }
}

#

but I still get the error

#

I think wrong type

#

yea I dont get this one

#

it should be able to call the method

#

oh wait

#

I completely read the thing wrong

#

no it's still not working

#
void Game::set_route(Tank& tank)
{
    if (frame_count == 0)
    {
        tank.set_route(background_terrain.get_route(tank, tank.target));
    }
}
#
1>D:\Programming\School\periode-2---optimalisatie-pepijn-s\game.cpp(168,60): message : see reference to function template instantiation 'void Tmpl8::parallel_for_each<std::vector<Tmpl8::Tank,std::allocator<Tmpl8::Tank>>,void(__cdecl Tmpl8::Game::* )(Tmpl8::Tank &)>(Tmpl8::ThreadPool &,Range &,Func)' being compiled
1>        with
1>        [
1>            Range=std::vector<Tmpl8::Tank,std::allocator<Tmpl8::Tank>>,
1>            Func=void (__cdecl Tmpl8::Game::* )(Tmpl8::Tank &)
1>        ]
1>grid.cpp
#

@burnt nova do you see what the problem is?

#

I would like to see what curr is but I can't debug if it's a build error

#

I think the issue is with how I am passing the method

hot jayBOT
#

This question thread is being automatically closed. If your question is not answered feel free to bump the post or re-ask. Take a look at !howto ask for tips on improving your question.