#Voxel Game Engine

west orbit
#

working on a minecraft like voxel game engine, currently stuck on world streaming lol

stone depot
#

dope

west orbit
#

it streams chunks in properly but only for rendering, the actual block data is deleted once the geometry is built

#

working on a structure that can keep the block data around without being too slow, multithreaded unloading is what im stuck on so far bjorklul

#

rendering is currently not very optimized, it has the most trivial block face culling with zero frustum culling so it runs horribly on lower end hw

#

an hd 7750 can run it with a 10 chunk render distance at ~32fps in 900p 💀

#

sandy bridge GT1 igp is ~8.8fps

#

oh, its also built to natively compile and run on both windows using MSYS2 and linux systems

west orbit
#

its got the opportunity to run much faster, im just getting basic functionality working before doing optimizations like that bjorklul

stone depot
#

yea get it working first

west orbit
#

every block on the border of a chunk has every face rendered which is killing it pretty hard kekw
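One common way around the border problem is to peek into the neighbouring chunk when a face sits on the chunk edge. A minimal sketch for the +X face only, assuming 16x16x16 chunks and that block id 0 means air; `ChunkBlocks`, `kChunkSize` and `facePosXVisible` are made-up names for illustration, not the engine's API:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed chunk size; the engine above makes this configurable.
constexpr int kChunkSize = 16;

struct ChunkBlocks {
    // 0 means air, anything else is solid (an assumption of this sketch).
    std::array<uint32_t, kChunkSize * kChunkSize * kChunkSize> blocks{};

    static std::size_t index(int x, int y, int z) {
        assert(x >= 0 && x < kChunkSize);
        assert(y >= 0 && y < kChunkSize);
        assert(z >= 0 && z < kChunkSize);
        return (static_cast<std::size_t>(y) * kChunkSize + z) * kChunkSize + x;
    }
    uint32_t at(int x, int y, int z) const { return blocks[index(x, y, z)]; }
    void set(int x, int y, int z, uint32_t id) { blocks[index(x, y, z)] = id; }
};

// Returns true if the +X face of block (x, y, z) should be drawn.
// neighborPosX is the chunk on the +X side, or nullptr if it is not loaded.
inline bool facePosXVisible(const ChunkBlocks &chunk, const ChunkBlocks *neighborPosX,
                            int x, int y, int z) {
    if (chunk.at(x, y, z) == 0) return false;                    // air has no faces
    if (x + 1 < kChunkSize) return chunk.at(x + 1, y, z) == 0;   // neighbour is in this chunk
    if (neighborPosX == nullptr) return true;                    // unknown neighbour: draw to be safe
    return neighborPosX->at(0, y, z) == 0;                       // neighbour is in the next chunk
}
```

The other five faces follow the same pattern with a different neighbour chunk and edge coordinate.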

stone depot
#

lol yea thatll do it

stone depot
west orbit
#

im high up

#

looks like maybe 20-30 meters above the ground

stone depot
#

ah alr

#

one of my favorite voxel engines has to be silvermans voxlap engine

#

its just insane

#

all on the cpu using sse

west orbit
#

oh damn

stone depot
#

it was also written in '00

west orbit
#

im using C++20 bjorklul

#

funnily enough im also using legacy opengl kekw

stone depot
#

legacy is just easier to use
i dont feel like learning all the new batching stuff (at least rn)

west orbit
#

yeah it is, anything more than what im doing will get really tedious and confusing tho

#

for simple textured geometry its pretty good if you just want to get something rendered and figure out the rest later

stone depot
#

yea youll probably have to switch over to modern gl at some point

west orbit
#

my rendering stuff is setup to be extensible so i can just add functionality to it later

#

it has 2 texturing modes and 4 vertex specification modes

#

it can be compatible with ogl 1.0 if you want bjorklul

stone depot
#

dayum

#

good abstraction

west orbit
#

ive been iterating on this engine's design over the past 5ish years bjorklul

#

learning what works and what doesnt

stone depot
#

😳

west orbit
#
        //presumably creates the GL buffers for g (counterpart to deleteBuffers)
        extern std::function<void(Geometry *g)> initBuffers;
        //syncs the data in g.vertAttribs and g.vertInds with their GL buffers
        extern std::function<void(Geometry *g)> syncBuffers;
        //clears the geometry in g.vertAttribs and g.vertInds
        extern std::function<void(Geometry *g)> clearGeometry;
        //sets up the GL state by calling the appropriate glEnableClientState and gl*Pointer functions for the attribute layout in g, should be matched with a cleanupVertexAttribLayout call
        extern std::function<void(Geometry *g)> setupVertexAttribLayout;
        //draws the geometry in g, uses whatever state GL is currently in
        extern std::function<void(Geometry *g)> draw;
        //cleans up the GL state by calling the appropriate glDisableClientState functions given the attribute layout in g
        extern std::function<void(Geometry *g)> cleanupVertexAttribLayout;
        //deletes the GL buffers
        extern std::function<void(Geometry *g)> deleteBuffers;

the interface for geometry handling is pretty much function objects that can be reassigned to different actual functions based on what vertex specification mode is desired bjorklul
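A rough sketch of how reassignable function objects like these can be switched per mode. The `Geometry` stand-in, the two backends and the `Mode` values are hypothetical and only log what was called; real backends would issue GL calls instead:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Stand-in for the engine's Geometry type; it just records which backend ran.
struct Geometry {
    std::vector<std::string> log;
};

namespace sketch {
    // Reassignable entry point, mirroring the `draw` function object above.
    inline std::function<void(Geometry *g)> draw;

    // Two hypothetical backends.
    inline void drawImmediate(Geometry *g) { g->log.push_back("immediate"); }
    inline void drawVertexArrays(Geometry *g) { g->log.push_back("vertex arrays"); }

    enum Mode : uint32_t { IMMEDIATE = 0, VERTEX_ARRAYS = 1 };

    // Points the function object at the backend for the requested mode.
    inline void setVertexSpecificationMode(uint32_t mode) {
        switch (mode) {
            case IMMEDIATE: draw = drawImmediate; break;
            case VERTEX_ARRAYS: draw = drawVertexArrays; break;
            default: assert(false && "unknown mode"); break;
        }
    }
}
```

Callers only ever call `sketch::draw(&g)`, so switching modes at runtime does not touch the render loop.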

#

rendering chunks looks like this

            for (auto it = chunkManager->chunks.begin(); it != chunkManager->chunks.end(); ++it){
                
                if(!it->second.ptr->isRenderable) {
                    continue;
                }
                
                glm::dmat4 modMat = glm::translate(glm::dvec3(it->second.ptr->worldspaceBlockLocation.x, it->second.ptr->worldspaceBlockLocation.y, it->second.ptr->worldspaceBlockLocation.z));//compute the model matrix from its position
                gl::transform::fixed::loadModelViewMat(viewMat * modMat);//load the modelview matrix with the product of the view and model matrices, this transforms the chunk into view space for rendering
                gl::geometry::setupVertexAttribLayout(&it->second.ptr->geometry);
                gl::geometry::draw(&it->second.ptr->geometry);
                gl::geometry::cleanupVertexAttribLayout(&it->second.ptr->geometry);
                
            }
#

of course you need to setup the gl state before the loop and clean it after, like all your glEnables and glDisables

#

i have void setVertexSpecificationMode(uint32_t mode); that lets you set the way geometry is rendered, this can be done even at runtime

#

however changing it from VAO mode to immediate mode requires that you either clean up all the GL state or create a new context and set it up in order for it to actually work

#

eventually im going to implement a vulkan renderer but from what ive seen of vulkan so far, it would take forever and be a giant waste of time currently

stone depot
#

yea vulkan is a massive pain

west orbit
#

the current problem is unloading regions and chunks in a multithreaded environment with minimal blocking

this is what my structures look like rn

    struct World;
    struct Chunk {
        std::atomic<bool> isLoaded = 0;
        std::atomic<int32_t> numLoaders = 0;
        //int64_t chunkLocationX, chunkLocationY, chunkLocationZ;
        Coordinates worldspaceChunkLocation;//the chunk's location in the world, in chunk coordinates
        std::unique_ptr<uint32_t[]> blocks;
    };
    struct Region {
        std::unique_ptr<AtomicSharedPtr<Chunk>[]> chunks;
    };
    struct Dimension {
        std::shared_mutex regionsMutex;
        std::map<std::array<int64_t,3>, Region> regions;
    };
    struct World {
        //the dimensions of a chunk, in block coordinates
        uint32_t chunkWidth, chunkHeight, chunkLength;
        //the dimensions of a region, in chunk coordinates
        uint32_t regionWidth, regionHeight, regionLength;
        //std::map<int32_t, Dimension> dimensions;
    };
#

AtomicSharedPtr is this thing i made bjorklul

template <class T> class AtomicSharedPtr {
    public:
        T *ptr = nullptr;
        std::atomic<uint32_t> *use_count = nullptr;
        AtomicSharedPtr() {
            ptr = new T();
            use_count = new std::atomic<uint32_t>(0);
        }
        AtomicSharedPtr(const AtomicSharedPtr &copyFrom) {
            ptr = copyFrom.ptr;
            //*copyFrom.use_count += 1;
            use_count = copyFrom.use_count;
            *use_count += 1;
        }
        ~AtomicSharedPtr() {
            if(*use_count == 0) {
                delete ptr;
                return ;
            }
            *use_count -= 1;
        }
};
stone depot
#

shared_ptr 🥲

west orbit
#

std::shared_ptr didnt do exactly what i wanted bjorklul

#

use_count is approximate in a multithreaded environment and it doesnt auto allocate either

stone depot
#

i meant it more as the whole concept of shared ownership is fishy
but its usually passable for games

west orbit
#

it works perfectly for what im using it for

stone depot
#

yea

west orbit
#

currently stuff can be used by 3 separate systems, the graphics system, the world system and the jobs system

#

putting guards to keep something from being freed if a job is still running on it is very difficult with how i have it setup

#

a shared pointer works perfectly to prevent use-after-free crashes

#

i can remove elements from a map without them being deleted if a reference to that thing still exists

#

the weird AtomicSharedPtr thing automatically allocates memory for whatever type is specified and automatically deletes it when the last copy is destroyed

#
    namespace chunk {
        size_t getBlockOffset(World *w, size_t x, size_t y, size_t z);
        void allocBlocks(World *w, Chunk *c);
        //void freeBlocks(Chunk *c);
        int8_t genTerrain(World *w, AtomicSharedPtr<Chunk> c, voxel_world::Coordinates worldspaceChunkLocation);
        //int8_t loadTerrain();
    }
    namespace region {
        size_t getChunkOffset(World *world, size_t x, size_t y, size_t z);
        void init(World *world, Region *region);
        Chunk *loadChunk(World *world, Region *region, Coordinates regionspaceChunkPosition);
        Chunk *getChunk(World *world, Region *region, Coordinates regionspaceChunkPosition);
        void unloadChunk(World *world, Region *region, Coordinates regionspaceChunkPosition);
        void unload(World *world, Region *region);
    }
    namespace dimension {
        Chunk *allocChunk(World *world, Dimension *dim, Coordinates worldspaceChunkLocation);
        int8_t loadChunk(World *world, Dimension *dim, Coordinates worldspaceChunkPosition);
        int8_t unloadChunk(World *world, Dimension *dim, Coordinates worldspaceChunkPosition);
    }

this is my current interface for each type of structure
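For reference, a function like `getBlockOffset` usually just flattens the 3D coordinate into an index into the block array. A minimal sketch; the memory layout (x fastest, then z, then y) is an assumption here, not necessarily what the engine uses, and `World` is cut down to the chunk dimensions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimal stand-in for the World struct above (only the chunk dimensions).
struct World {
    uint32_t chunkWidth, chunkHeight, chunkLength;
};

// Flattens (x, y, z) into an index into a chunk's block array.
// Layout assumed here: x varies fastest, then z, then y.
inline std::size_t getBlockOffset(const World *w, std::size_t x, std::size_t y, std::size_t z) {
    assert(x < w->chunkWidth && y < w->chunkHeight && z < w->chunkLength);
    return (y * w->chunkLength + z) * w->chunkWidth + x;
}
```

`getChunkOffset` inside a region can be written the same way with the region dimensions.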

#

most things should be done in the context of a dimension, the rest should be handled transparently

#

havent worked out all the details tho bjorklul

west orbit
#

this is an older iteration that has no multithreading and only supports immediate mode ogl kekw

stone depot
#

dat grass 🔥

west orbit
#

made it in blender bjorklul

#

this is a much earlier version where i was learning ogl

#

made a custom export script for blender and a custom loader bjorklul

#

this is simple textured geometry tho, the lighting is baked into the texture

stone depot
#

that looks real nice ngl

west orbit
#

got an opengl test from 2015 kekw
ive been working on this for a long time

#

oh damn that means its been 8 years 💀

stone depot
#

ngl ive only used opengl twice

#

once for a game and once for a paper

west orbit
#

its ass but i didnt want to learn direct3d

stone depot
#

ive never been able to really get into it for fun

#

YEA

#

i saw the boilerplate for that

#

compared to opengl

#

and you can probably guess what i picked

west orbit
#

i never even looked into d3d kekw

#

i wanted this to run natively on linux systems as well so d3d wasnt a good choice

#

itd need some kind of translation layer or something

#

and im not gonna learn a proprietary API like d3d or something, it feels like a waste of time to lock myself to a specific vendor

#

same reason why i wont touch CUDA

#

cant even use CUDA anyway, i have an amd gpu in my desktop bjorklul

stone depot
#

cudas also so fucking annoying to write

west orbit
#

id rather take my chances with vulkan compute or OCL or something

stone depot
#

msvc which "supports" it wont let you use <<<...>>> without screaming that its wrong

#

anything that doesnt try and integrate with an ide is probably a better option

west orbit
#

ill just try my hand at vulkan compute if i ever get to gpgpu stuff bjorklul

stone depot
#

vulkan 2 scary 4 me

west orbit
#

its an insane amount of boilerplate but it may not be too bad when you get that out of the way

stone depot
#

yea i wrote a simple 2d "engine" in it

#

took like a week to even get it drawing a white screen

west orbit
#

💀

#

just found my proof of concept for using multitexturing to make a texture atlas mosaic

stone depot
#

😍

west orbit
#

this was may 2021 apparently bjorklul

#

i later decided to drop the texture atlas mosaic thing tho

#

so for the needed amount of texels, 4096x4096 texture support is needed to get 16x16 block face textures

#

i may try to see if 8x8 textures are usable so that lowers the requirement to 2048x2048
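The arithmetic behind those two sizes, as a small sketch: both combinations hold the same number of face textures, 256 x 256 tiles. `tilesInAtlas` is a made-up helper, and the tile count is only inferred from the sizes quoted above:

```cpp
#include <cassert>
#include <cstdint>

// Number of square tiles that fit in a square atlas (sizes in texels).
constexpr uint64_t tilesInAtlas(uint64_t atlasSize, uint64_t tileSize) {
    return (atlasSize / tileSize) * (atlasSize / tileSize);
}

// 4096 / 16 = 256 tiles per side, 2048 / 8 = 256 tiles per side.
static_assert(tilesInAtlas(4096, 16) == 65536, "256 x 256 tiles");
static_assert(tilesInAtlas(2048, 8) == 65536, "same tile count at a quarter of the texels");
```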

#

using a mosaic of texture atlases through multitexturing could fix that whole issue but its kinda cumbersome to work with

#

version from june 2021 testing drawing models and chunks at the same time

#

stole the minecraft textures kekw

west orbit
#

so, im kinda stuck on how id do multithreaded chunk processing with my current architecture Hmmm

sinful wedge
#

have a program wide thread pool and just submit a "job" that's just a void (void) lambda that takes in everything by capture?

west orbit
#

i already have a thread pool to process a job queue that takes std::functions

#

but scheduling it may be a bit weird

sinful wedge
#

are they not asynchronous and operating on a global "world" memory pool?

#

It's an absolutely wonderful primitive

#

I wrote a wrapper around it so the ergonomics of using it are a bit better however it's amazing to structure everything around

west orbit
#

id need a chunk tick function thats launched for every chunk every server tick, however long that is

#

ill probably go for 1/20th of a second, it seemed fine on minecraft

sinful wedge
#

chunk tick? why are your chunks updating at all every tick

west orbit
#

well they dont right now kekw

#

ill need some kind of processing done on the chunks every server tick at some point

#

like growing vegetation, processing machines, mob spawning and other stuff like that

sinful wedge
#

seems a little sketch

west orbit
#

im not sure bjorklul

sinful wedge
#

idk tho I gotta bounce

#

ping me in like 90 minutes, ive got more than a few ideas, Ill draw some of them out

west orbit
sinful wedge
#

also fences are your friend

#

(most) uses of mutexes can be eliminated with the proper use of fences

#

and theyre a helluva lot faster

west orbit
#

ill look into those dorime

west orbit
#

I've been reading about fences and I still have no clue how they actually work or how you're supposed to use them yamikek

sinful wedge
#

well shit

#

I had like a paragraph typed out then I accidentally CMD + Q'd my discord

#

basically

#

are you familiar with condition variables

west orbit
#

somewhat yeah

#

used them to implement a thread safe job queue

sinful wedge
#

the things where you can set one off, whatever is looking at it checks to see if its signaled, if its not it goes back to sleep. you normally combine them with a mutex and re-check the condition after waking since they can be a bit fucky: spurious wakeups happen, especially if youve got multiple threads waiting on the same one

#

but a fence is guaranteed to only ever have one reader and one writer

#

so thus no spurious wake ups anymore

#

so you create a chain of fences, where thread 1 signals THREAD 1-2 OPERATION FENCE when its done, and thread 2 wakes up when its signaled

#

hopefully I explained it good enough

#

tldr if you know the order of mutex accesses you can use fences

#

and theyre faster

#

however doing so youre still limited to one thread at a time working on a shared resource

west orbit
sinful wedge
#

I recommend just trying shit out and using the thread sanitizers

#

actually what i've described here is best thought about using a semaphore

#

for the signalling bit

#

memory fences are a completely different ball game

#

those are for when the CPU or compiler is doing some funny instruction reordering and you need other threads to see your reads/writes in a specific order

west orbit
#

is there a specific thing i should be looking at like std::atomic_thread_fence or is that something else thinkies

sinful wedge
#

yea thats a memory fence

#

those are separate

#

are you familiar with instruction reordering?

#

this is down at the assembly level

west orbit
#

ya, for OOE cpus?

sinful wedge
#

yes

#

basically a memory fence is you telling the compiler / cpu dont reorder past this boundary

#

so nothing after can come before and nothing before can come after
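A minimal sketch of that idea with `std::atomic_thread_fence`, using a release fence on the writing side and an acquire fence on the reading side (these are one-directional, slightly weaker than the "nothing crosses in either direction" picture above). `fenceDemo` and the variable names are made up for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Publishes a plain int from one thread to another using fences.
inline int fenceDemo() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Writes before this fence cannot be reordered after the store below.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Reads after this fence cannot be reordered before the load above.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });
    producer.join();
    consumer.join();
    return seen;
}
```

Without the two fences (or equivalent release/acquire orderings on `ready`) reading `payload` here would be a data race.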

#

thats the general idea

#

theres a billion more technicalities

#

however not important rn

west orbit
#

oooh okay

#

so what is the actual thing in C++ you can use to create thread fences

sinful wedge
#

theyre called semaphores

#

theyre in C++20, in the <semaphore> header

#

std::binary_semaphore

west orbit
#

ooh i saw those on the C++ reference website but didnt look too much into them

sinful wedge
#

imma be honest idk wtf you'd use a counting semaphore for

#

but I also dont know a whole lot about concurrency in the grand scheme

#

but a binary semaphore is what youd use to implement this signalling

#

basically youd create one for each "boundary" where access is handed from one thread to another

#

you pass one reference to each

west orbit
#

ill have to look into those more lol

sinful wedge
#

THREAD 1

/// normal code
sema.release(); // sema is a shared std::binary_semaphore, this bumps its counter from 0 -> 1

THREAD 2

sema.acquire(); // tries to decrement the value from 1 -> 0. if its successful, control has been passed and the function returns. if not, the thread goes to sleep and is woken up *automagically* by the OS when the corresponding call to sema.release() adds one and passes control over

/// normal code
#

semaphores are super light weight

#

heres an example of an MPMC queue that uses semaphores to absolutely smoke mutexes

west orbit
#

i could probably use binary semaphores on a per chunk basis to ensure jobs dont clobber each other

sinful wedge
#

you have one "master" thread that constructs all of the semaphores and passes them around and all the worker threads doing their business

#

you'll want to use multiple semaphores per thread to stage operations, the same way a CPU executes multiple instructions at once. thats how you get faster than single thread performance

#

in the bottom scenario each row can be thought of as a thread and each column can be a chunk

#

so you get essentially a min(MAX_SEPARATE_CHUNK_OPERATIONS, NUMBER_OF_THREADS) times speedup

#

so say you have the lighting, block update, entity update, idk what else you get the idea

#

and you can pipeline them

west orbit
#

ya thats a really good idea

sinful wedge
#

also this just shows you some of the performance speedup that is possible a literal 4!!!!! order of magnitude increase in performance

west orbit
#

pipelining is important dorime

#

OOE cpus are great lol

sinful wedge
#

yup

#

I really do need to stop calling these fences though :P

west orbit
#

ive done some asm programming but only for a z80 kekw

sinful wedge
#

vulkan has to call them fences because of other shit

sinful wedge
west orbit
#

i thought you were talking about memory fences

sinful wedge
#

twas not :P

#

in vulkan theyre a combo of memory fence + semaphore

#

so thats why theyre called fences

#

also you might want to move to vulkan since in vulkan you can submit calls from All threads

west orbit
#

im doing opengl for compatibility reasons bjorklul

sinful wedge
#

none of this master thread callback bullshit

sinful wedge
west orbit
#

also vulkan has a lot of boilerplate that makes it hard for me to get through it kekw

sinful wedge
#

vulkan has literally no state iirc

#

oh well

#

actually

west orbit
#

oh yeah opengl is awful, being a global state machine is the worst thing kekw

#

its so confusing

sinful wedge
#

there are a lot of callbacks that are state

#

but other than that

#

nothing

west orbit
#

i need to make something that can construct a render ready vulkan context

#

setup swapchains and images and whatever it is

sinful wedge
#

here

#

let me link you my repo

#

its (fairly?) well sectioned out

#

I'm refactoring it rn to dynlink vulkan

#

however each bit is logically separated from each other

#

if I remember when I finish dynlinking vulkan

#

ill publish my new repo and all youll need is a
lib/libfmt.a and lib/libglfw3.a to build it

#

and ill get rid of libfmt once it becomes standard in gcc

#

but yea Ive been doing all of the vulkan boilerplate because I want to get back into graphics but just couldnt stomach working with opengl again

west orbit
#

ive been mostly doing legacy ogl so far, all i have implemented rn is geometry rendering, textures and fog

#

just to get something rendered so i can develop the rest of the engine

sinful wedge
#

My one gripe with vulkan is that certain things like say texturing are heavily optimized and as a result you really have to watch tutorials or dig into the api docs to use them

#

for instance depth buffering

#

I wish it was manually implemented in the fragment shader and accessing global memory in a uniform buffer

#

however that would be slow as balls

west orbit
#

💀

sinful wedge
#

so you have to jump through a lot of hoops to implement it

#

but the end result is fucking mad performance

west orbit
#

yeah i assume the ROPs deal with depth buffer modification pretty fast vs something a shader would do

sinful wedge
#

yes they do, but theyre only programmable by the driver so thus specific shit

west orbit
#

ya

sinful wedge
#

IMO vulkan is the dream graphics api

#

Also if you need me to explain some of the code i'd be more than happy to. I implemented it in this weird middle ground between procedural and OOP that I really like and seems to scale well, however if youre used to 10 class inheritance hierarchies then its going to be a bit different to look at

#

tldr if you need me to explain shit

#

id be more than happy to do so

west orbit
#

oh im used to weird mixes of OOP and procedural stuff kekw

#

thats generally how my engine is structured, mostly procedural stuff with simple classes for specific jobs

#

most functions take pointers to structures to operate on for instance

sinful wedge
west orbit
#

C++ lol

sinful wedge
#

references and member functions

#

IMO I avoid pointers like the plague

#

references are guaranteed ™️ to be valid

west orbit
#

i seem to deal fine with them bjorklul

#

i tried references but they throw a wrench into the syntax for my job queue

#

i dont want to wrap everything in a std::ref when i can just pass a pointer to std::bind

sinful wedge
west orbit
#

nope

sinful wedge
#

well then, pointers it is!

#

dammit

#

=]]

west orbit
#

i like them HappyGrin

#

references are useful in more specific cases to me tho lol

sinful wedge
#

IMO I like rust's balance between explicitly passing references but still having them be known good

#

also lifetimes, I miss

west orbit
#

i made a thing that automatically constructs a new thing when its constructed, then automatically destroys it when the last copy of it is destroyed

#

uses an atomic counter thats incremented when copy constructed and decremented when destroyed

sinful wedge
#

you mean a std::shared_ptr ?

west orbit
#

its kinda like that, but its easier to use when accessing std::maps

#

also use_count in a std::shared_ptr is approximate in a multithreaded environment, which was a problem

sinful wedge
#

cursed

west orbit
#

thats why i made my own better shared_ptr gigachad

#
template <class T> class AtomicSharedPtr {
    public:
        T *ptr = nullptr;
        std::atomic<uint32_t> *use_count = nullptr;
        AtomicSharedPtr() {
            ptr = new T();
            use_count = new std::atomic<uint32_t>(0);
        }
        AtomicSharedPtr(const AtomicSharedPtr &copyFrom) {
            ptr = copyFrom.ptr;
            //*copyFrom.use_count += 1;
            use_count = copyFrom.use_count;
            *use_count += 1;
        }
        ~AtomicSharedPtr() {
            if(*use_count == 0) {
                delete ptr;
                return ;
            }
            *use_count -= 1;
        }
};
#

declaring it is equivalent to making a new T

#

its almost transparent, you simply access the pointer to the thing through ptr

sinful wedge
#

I fail to see how this is different from std shared ptr

west orbit
#

you have to assign a new thing to the shared pointer when its declared

#

this does that part transparently

#

so accessing an element in a map will automatically allocate a new T in that map instead of you having to call the new operator and assign it to the map

#

also use_count is exact instead of approximate

sinful wedge
#

auto sharp = std::make_shared<std::unordered_map<std::uint32_t, std::string>>();

west orbit
#

gl::Chunk *glChunk = chunkManager->chunks[{x, y, z}].ptr;
this for instance will get the pointer of the element accessed in the map without having to check if its not there and having to assign a new gl::Chunk to it

#

itll handle that part itself

#

when the element is removed from the map, itll still exist if there is a copy of the AtomicSharedPtr somewhere

#

for example if one was bound to a std::function for a job

sinful wedge
#

this seems like normal shared_ptr stuff

#

idk tho

west orbit
#

the map access part is the thing i really wanted it for

#

the map is declared like this std::map<std::array<int64_t,3>, AtomicSharedPtr<Chunk>> chunks;

#

accessing an uninitialized element in the map will automatically construct a new Chunk

sinful wedge
#

ohh now I see the point

west orbit
#

ya

#

its slightly different than a std::shared_ptr

sinful wedge
#

indeed std shared ptr would happily give you a null

#

see I saw the difference I just didnt see why it was useful

west orbit
#

lol

sinful wedge
#

I would have done this slightly different

#

*ly

west orbit
#

i tried making a move constructor but i apparently dont know how thats supposed to work kekw

sinful wedge
#

instead of making my own shared ptr, I would have made a class World

and a function World::accessChunkAtCoords(ChunkCoordinates c)

that sees if its in the internal map and if not creates it
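A minimal sketch of that alternative, using a plain `std::shared_ptr` and `std::map::try_emplace`. The `Chunk` fields and the `unloadChunk` / `loadedCount` helpers are made up for illustration, and nothing here is thread safe on its own; a real version would need a lock around the map:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>

using ChunkCoordinates = std::array<int64_t, 3>;

struct Chunk {
    ChunkCoordinates location{};
};

class World {
public:
    // Returns the chunk at c, creating it first if it is not in the map yet.
    std::shared_ptr<Chunk> accessChunkAtCoords(const ChunkCoordinates &c) {
        auto [it, inserted] = chunks.try_emplace(c);  // inserts an empty shared_ptr if missing
        if (inserted) {
            it->second = std::make_shared<Chunk>();
            it->second->location = c;
        }
        return it->second;
    }
    // Drops the map's reference; the chunk lives on while other copies exist.
    void unloadChunk(const ChunkCoordinates &c) { chunks.erase(c); }
    std::size_t loadedCount() const { return chunks.size(); }

private:
    std::map<ChunkCoordinates, std::shared_ptr<Chunk>> chunks;
};
```

A job that still holds the returned `shared_ptr` keeps the chunk alive after `unloadChunk`, which is the same use-after-free guard described earlier.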

sinful wedge
#

are you familiar with how std vector is implemented?

#

three pointers:
pointer to beginning
pointer to end of elements
pointer to end of allocated memory

sinful wedge
#

can you see how the vector operations could be implemented from these three pointers?

west orbit
#

i assume it does pointer arithmetic to figure out the size?

sinful wedge
#

yup

#

and that gets extended to all the other stuff like accessing an element etc

#

creates references from pointer arithmetic

#

etc

#

the copy constructors / assignment operators follow the pointers down, copy all the data to a new location on the heap, and create a new vector completely independent from the first

#

the move constructors / assignment operators yoink the pointers, copy them into the new instance, and null out the pointers in the old instance, effectively making it a zombie

#

thus there is no data copied, just the three pointers
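A toy version of that three-pointer layout, to make the copy vs move difference concrete. `TinyVec` is a made-up, int-only sketch and not how any particular standard library actually implements `std::vector`:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Toy int vector built from the three pointers described above.
class TinyVec {
public:
    TinyVec() = default;
    ~TinyVec() { delete[] first; }

    // Copy: allocate new storage and copy every element.
    TinyVec(const TinyVec &other) {
        std::size_t n = other.size();
        if (n == 0) return;
        first = new int[n];
        for (std::size_t i = 0; i < n; ++i) first[i] = other.first[i];
        last = capEnd = first + n;
    }
    // Move: steal the three pointers and null out the source.
    TinyVec(TinyVec &&other) noexcept
        : first(other.first), last(other.last), capEnd(other.capEnd) {
        other.first = other.last = other.capEnd = nullptr;
    }
    // Copy-and-swap covers both copy and move assignment.
    TinyVec &operator=(TinyVec other) noexcept {
        std::swap(first, other.first);
        std::swap(last, other.last);
        std::swap(capEnd, other.capEnd);
        return *this;
    }

    void push_back(int v) {
        if (last == capEnd) grow();
        *last++ = v;
    }
    std::size_t size() const { return static_cast<std::size_t>(last - first); }
    std::size_t capacity() const { return static_cast<std::size_t>(capEnd - first); }
    int &operator[](std::size_t i) { assert(i < size()); return first[i]; }
    const int *data() const { return first; }

private:
    void grow() {
        std::size_t oldSize = size();
        std::size_t newCap = capacity() == 0 ? 4 : capacity() * 2;
        int *mem = new int[newCap];
        for (std::size_t i = 0; i < oldSize; ++i) mem[i] = first[i];
        delete[] first;
        first = mem;
        last = mem + oldSize;
        capEnd = mem + newCap;
    }
    int *first = nullptr;   // beginning of the storage
    int *last = nullptr;    // one past the last element
    int *capEnd = nullptr;  // one past the end of the allocation
};
```

After a move the source has three null pointers, so its destructor ends up calling `delete[]` on nullptr, which is a no-op.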

sinful wedge
west orbit
#

i never nulled the old one, just copied the pointers

west orbit
#

that still doesnt work bjorklul

sinful wedge
west orbit
#

oooooh

sinful wedge
#

there we go

#

implement a move constructor / assignment operator in something, ill look it over

west orbit
#

ayy that fixed it

template <class T> class AtomicSharedPtr {
    public:
        T *ptr = nullptr;
        std::atomic<uint32_t> *use_count = nullptr;
        AtomicSharedPtr() {
            ptr = new T();
            use_count = new std::atomic<uint32_t>(0);
        }
        AtomicSharedPtr(const AtomicSharedPtr &copyFrom) {
            ptr = copyFrom.ptr;
            //*copyFrom.use_count += 1;
            use_count = copyFrom.use_count;
            *use_count += 1;
        }
        AtomicSharedPtr(AtomicSharedPtr &&moveFrom) {
            ptr = moveFrom.ptr;
            use_count = moveFrom.use_count;
            moveFrom.ptr = nullptr;
            moveFrom.use_count = nullptr;
        }
        ~AtomicSharedPtr() {
            if(use_count == nullptr) {
                return;
            }
            if(*use_count == 0) {
                delete ptr;
                return ;
            }
            *use_count -= 1;
        }
};
#

had to put a check in the destructor to keep it from crashing

#

i didnt realize the destructor was called when move constructing

#

now itll skip incrementing then decrementing the atomic counter when it just needs to be moved
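One thing worth noting about the destructor above: checking `*use_count == 0` and then decrementing are two separate steps, so two threads destroying copies at the same time can both miss the zero, and the counter itself is never freed. A hedged sketch of the usual fix, where a single `fetch_sub` decides who deletes. `CountedPtr` and `ownerCount` are made-up names, and the count starts at 1 (number of owners) rather than 0 as in the class above:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>

template <class T> class CountedPtr {
public:
    // Allocates a new T, like the class above; the count starts at 1 owner.
    CountedPtr() : ptr(new T()), owners(new std::atomic<uint32_t>(1)) {}

    CountedPtr(const CountedPtr &o) : ptr(o.ptr), owners(o.owners) {
        if (owners) owners->fetch_add(1, std::memory_order_relaxed);
    }
    CountedPtr(CountedPtr &&o) noexcept : ptr(o.ptr), owners(o.owners) {
        o.ptr = nullptr;
        o.owners = nullptr;
    }
    // Copy-and-swap: the old contents are released when `o` goes out of scope.
    CountedPtr &operator=(CountedPtr o) noexcept {
        std::swap(ptr, o.ptr);
        std::swap(owners, o.owners);
        return *this;
    }
    ~CountedPtr() {
        if (owners == nullptr) return;  // moved-from, nothing to release
        // fetch_sub returns the previous value, so exactly one owner sees 1.
        if (owners->fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete ptr;
            delete owners;
        }
    }

    T *get() const { return ptr; }
    T *operator->() const { return ptr; }
    uint32_t ownerCount() const { return owners ? owners->load() : 0; }

private:
    T *ptr;
    std::atomic<uint32_t> *owners;
};
```

The explicit assignment operator matters too: once a move constructor is declared the implicit copy assignment is deleted, and without the move constructor the implicit one copies the raw pointers without touching the count.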

sinful wedge
#
AtomicSharedPtr(AtomicSharedPtr &&moveFrom) {
            ptr = moveFrom.ptr;
            use_count = moveFrom.use_count;
            moveFrom.ptr = nullptr;
            moveFrom.use_count = nullptr;
        }

This isn't threadsafe btw. what if the count was copied, incremented on the next clock cycle? nvm you fucking heap allocated the std atomic, 💀 dont do that. if that cache lookup fails its slower than a mutex

#

actually

#

wait

#

let me think about this a bit more

west orbit
#

it should work fine bjorklul
except if heap allocating an atomic variable is bad bjorklul

sinful wedge
#

well no

#

heap allocating an atomic is fine

#

however ive just never seen it

#

let me look at a reference implementation of std shared ptr

west orbit
#

its copying pointers, not the value, modifying the value itself is safe since its atomic

sinful wedge
#

I'm not too sure rn

west orbit
#

i can tell you that it at least doesnt crash my engine in its current state kekw

sinful wedge
#

yea I guess its fine

west orbit
#

ah

#

it works fine bjorklul

sinful wedge
#

yea I guess a shared ptr has to use a heap allocated atomic

#

I dont think theres any other way

#

ignore me im being dumb

west orbit
sinful wedge
#

since you have to share it across instances

west orbit
#

ya

sinful wedge
#

do you know what atomics are?

#

its actually really cool

west orbit
#

i assume atomics are some special threadsafe instructions used to modify basic types?

sinful wedge
#

yes / no

#

I'm going to use rust syntax here because I like it better and its clearer

#

where the mutex is controlling a std::uint32_t

#

you understand syntax?

#

hopefully

west orbit
#

i havent seen much rust so im assuming bjorklul

sinful wedge
#

basically its a way to encode in the type system what is being controlled

#

I like it many dont :P

#

anyway

#

Mutex<std::uint32_t>

#

so I want to increment this variable

#

so I do the following

#

lock mutex (maybe wait)
increment variable
release mutex

#

clever CPU developers have figured out absolute fucking magic I dont understand it, but basically it allows all of the cores to implicitly mutex eachother

#

i.e every core has a unique lock on every memory address as long as it only modifies one address

west orbit
#

yeah i have no clue how that would actually work kekw

sinful wedge
#

yea you dont have to

#

just know it does

west orbit
#

good sweating

sinful wedge
#

So an atomic is just a word that generalizes this notion of Mutex<std::uint32_t> but allows the compiler to do optimizations using those fancy pants cpu instructions

west orbit
#

how expensive are they anyway thinkies

sinful wedge
#

so on x86 a std::atomic<std::uint32_t> is a built in atomic because of that architecture fuckery

#

its free

#

like no if ands or buts

#

free

west orbit
#

oh, even if you have many atomic flags that need to be modified?

sinful wedge
#

;asm -O3

#include <atomic>

int main()
{
  std::atomic<int> a = 0;
  a++;
}
harsh frost BOT
#
Assembly Output
main:
  mov DWORD PTR [rsp-4], 0
  lock add DWORD PTR [rsp-4], 1
  xor eax, eax
  ret

west orbit
#

i guess the only slowdown would be if one thread tries to read/modify/write when another one is busy doing that at the same time, but thats super fast

sinful wedge
#

as you can see x86 has a special lock add instruction
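A small sketch showing the same atomic increment doing real work across threads: several threads bump one counter with no mutex and no increments get lost. `countWithThreads` is a made-up helper:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increments one shared counter from several threads without a mutex.
inline int countWithThreads(int numThreads, int incrementsPerThread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t) {
        threads.emplace_back([&counter, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i) {
                // One indivisible read-modify-write; on x86-64 this is
                // typically a lock-prefixed add like the output above.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto &th : threads) th.join();
    return counter.load();
}
```

With a plain `int` instead of `std::atomic<int>` the same loop would be a data race and would usually come up short.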

west orbit
#

yeah

sinful wedge
#

but this is what a semaphore is if you remember what I was talking about

#

its just a convenient wrapper around an atomic bool

west orbit
#

so then what exactly is a mutex then thinkies
do those have their own instruction or something

sinful wedge
#

this is "free" in quotes since it also has to flush the cache to whatever address was modified to the other cores, but as long as you only modify one atomic every like 16 instructions its free, if you do more oh no it takes like 4 cycles (rough numbers, the real cost depends on the cpu and on how contended the address is)

sinful wedge
west orbit
#

ah

sinful wedge
#

that os stuff is where all the overhead comes from

west orbit
#

so thats why they're expensive

sinful wedge
#

yup

#

its because even though your program can see the mutex, the OS has to manage the mutex since what if multiple applications try and access the same memory address

west orbit
#

oh huh

sinful wedge
#

do you get it or do you want me to explain again?

west orbit
#

i think i get it lol

#

i should probably try and see if i can get my thread pool to work using semaphores

sinful wedge
#

Imagine two programs both trying to access 0x1000, if program 1 had a mutex for that address at 0x8000, program 2 has no idea that exists, so thus the OS has to step in, and make a master look up table of mutexes and their addresses so there arent memory races

sinful wedge
#

but they dont have all of that OS stuff

west orbit
#

ya, but rn it uses mutexes when it really doesnt have to

sinful wedge
#

So when using them, they lock access to an object and not an address

west orbit
#

its just to block other threads when work is being done on the job queue that they're working with

sinful wedge
#

yup

#

and dont forget to use std::this_thread::sleep_for()

#

or if you use std::binary_semaphore

west orbit
#

oh they wait on a condition variable bjorklul

sinful wedge
#

its all handled for you :P

west orbit
#

ah

#

i may need the condition variable to wake the threads when new jobs are available

sinful wedge
#

std::binary_semaphore::acquire() 👉 👈

west orbit
#

what if the job queue is empty thinkies

sinful wedge
#

oh well youd have to ditch the queue

#

let me explain

#

// master thread
// TODO: remove the heap allocation on the binary semaphore, theres a way I just can't think of it rn
std::binary_semaphore sema{0}; // starts at 0 so the waiting job blocks until signaled

//                                  start   | end
createJob(std::function<void(void)>, nullptr, sema);
// this job waits until the first one is signaled to start
// and signals the end semaphore once its done
createJob(std::function<void(void)>, sema, sema2);
createJob(std::function<void(void)>, sema2, sema3);
// ...
#

I'd recommend channels to collect the jobs in a central spot, attaching the required semaphores there, and then passing them on to whatever processing facilities you have

west orbit
#

the jobs are currently stored in a central std::vector<std::function<int8_t()>> thats stored in a JobQueue class

sinful wedge
#

what does std::int8_t represent :P

west orbit
#

error code bjorklul

sinful wedge
#

moving on....

west orbit
#

if it returns -1 for instance, the job is put on the back of the queue

sinful wedge
#

💀 you do you imma not critique design stuff here :p

sinful wedge
#

anyway, do you get my idea?

west orbit
#

yeah im looking at it rn hmmm

sinful wedge
#

basically store a bunch of semaphores on the main thread, pass mutable references to the "jobs" and free them once theyre used

west orbit
#

rn the jobs are processed on a loose FIFO basis by worker threads that pop the job, run it, if it returns an error, its put on the back of the queue, if not, its deconstructed

sinful wedge
#

well, I'll leave you to grapple with your own design decisions, it's the best part of programming :P

west orbit
#

yeah, ive been working on this for a while now kekw

sinful wedge
#

I recommend paper and pencil

west orbit
#

also, it seems like you can have a thread wait on an atomic variable? thinkies

west orbit
#

oh it doesnt seem like that would work in the way i need it to

#

i may be able to skip waiting on the condition variable if the job queue isnt empty yet thinkies

#

that should reduce the overhead a bit

west orbit
# sinful wedge condition variables introduce an OS overhead since the OS has to wake up the thr...
#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num) {
    sem.acquire();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    *num += 1;
    sem.release();
}
int main() {
    std::binary_semaphore sem(1);
    int num = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 1024; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num));
    }
    for (int i = 0; i < 1024; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
#

just made a quick program to see if semaphores can be used for preventing race conditions and it turns out yes

#
Semaphores are also often used for the semantics of signalling/notifying rather than mutual exclusion, by initializing the semaphore with ​0​ and thus blocking the receiver(s) that try to acquire(), until the notifier "signals" by invoking release(n). In this respect semaphores can be considered alternatives to std::condition_variables, often with better performance. 

so i can probably change my jobs system from using mutexes and condition variables to binary and counting semaphores

sinful wedge
#

I did a bit more reading and you should probably use std::latch

#

Fundamentally the same just its a std lib wrapper

#

The standard library just gets bigger and bigger I swear I find something new every

#

Like three weeks

west orbit
#

wait why would i need a latch thinkies

#

arent latches/barriers used to make sure threads have reached the latch/barrier so they can all move on right after?

sinful wedge
#

std::latch is just a semaphore with differently named member functions. I was just looking and it seems that it might be more intuitive, idk though

#

semaphores and latches are literally the same :P

harsh frostBOT
#
Compiler Output
<source>:6:6: error: variable or field 'worker' declared void
    6 | void worker(std::binary_semaphore &sem, int *num) {
      |      ^~~~~~
<source>:6:18: error: 'binary_semaphore' is not a member of 'std'; did you mean 'binary_negate'?
    6 | void worker(std::binary_semaphore &sem, int *num) {
      |                  ^~~~~~~~~~~~~~~~
      |                  binary_negate
<source>:6:36: error: 'sem' was not declared in this scope
    6 | void worker(std::binary_semaphore &sem, int *num) {
      |                                    ^~~
<source>:6:41: error: expected primary-expression before 'int'
    6 | void worker(std::binary_semaphore &sem, int *num) {
      |                                         ^~~
<source>: In function 'int main()':
<source>:13:10: error: 'binary_semaphore' is not a member of 'std'; did you mean 'binary_negate'?
   13 |     std::binary_semaphore sem(1);
      |          ^~~~~~~~~~~~~~~~
      |          binary_negate
<source>:17:39: error: 'worker' was
sinful wedge
#

;compile -std=c++20

#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num) {
    sem.acquire();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    *num += 1;
    sem.release();
}
int main() {
    std::binary_semaphore sem(1);
    int num = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 1024; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num));
    }
    for (int i = 0; i < 1024; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
harsh frostBOT
#
Compiler Output
/opt/compiler-explorer/gcc-12.2.0/bin/../lib/gcc/x86_64-linux-gnu/12.2.0/../../../../x86_64-linux-gnu/bin/ld: /tmp/ccsiIQAg.o: in function ​`std::thread::thread<void (&)(std::counting_semaphore<1l>&, int*), std::reference_wrapper<std::counting_semaphore<1l> >, int*, void>(void (&)(std::counting_semaphore<1l>&, int*), std::reference_wrapper<std::counting_semaphore<1l> >&&, int*&&)':
/opt/compiler-explorer/gcc-12.2.0/include/c++/12.2.0/bits/std_thread.h:135: undefined reference to ​`pthread_create'
collect2: error: ld returned 1 exit status
Build failed
sinful wedge
#

;compile -std=c++20 -llibpthread

#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num) {
    sem.acquire();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    *num += 1;
    sem.release();
}
int main() {
    std::binary_semaphore sem(1);
    int num = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 1024; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num));
    }
    for (int i = 0; i < 1024; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
harsh frostBOT
#
Compiler Output
/opt/compiler-explorer/gcc-12.2.0/bin/../lib/gcc/x86_64-linux-gnu/12.2.0/../../../../x86_64-linux-gnu/bin/ld: cannot find -llibpthread: No such file or directory
/opt/compiler-explorer/gcc-12.2.0/bin/../lib/gcc/x86_64-linux-gnu/12.2.0/../../../../x86_64-linux-gnu/bin/ld: note to link with /lib/x86_64-linux-gnu/libpthread.a use -l:libpthread.a or rename it to liblibpthread.a
collect2: error: ld returned 1 exit status
Build failed
west orbit
#

;compile -std=c++20 -lpthread

#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num) {
    sem.acquire();
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    *num += 1;
    sem.release();
}
int main() {
    std::binary_semaphore sem(1);
    int num = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 6; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num));
    }
    for (int i = 0; i < 6; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
harsh frostBOT
#
Program Output
6
west orbit
#

ayy there we go

#

;compile -std=c++20 -lpthread

#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num, std::chrono::time_point<std::chrono::steady_clock> timePoint) {
    
    std::this_thread::sleep_until(timePoint);
    sem.acquire();
    *num += 1;
    sem.release();
}
int main() {
    std::chrono::time_point<std::chrono::steady_clock> start = std::chrono::steady_clock::now();
    std::binary_semaphore sem(1);
    int num = 0;
    int nthreads = 1024;
    std::vector<std::thread> threads;
    for (int i = 0; i < nthreads; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num, start + std::chrono::milliseconds(100)));
    }
    for (int i = 0; i < nthreads; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
harsh frostBOT
#
Compiler Output
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
west orbit
#

oh oops bjorklul

#

;compile -std=c++20 -lpthread

#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num, std::chrono::time_point<std::chrono::steady_clock> timePoint) {
    
    std::this_thread::sleep_until(timePoint);
    sem.acquire();
    *num += 1;
    sem.release();
}
int main() {
    std::chrono::time_point<std::chrono::steady_clock> start = std::chrono::steady_clock::now();
    std::binary_semaphore sem(1);
    int num = 0;
    int nthreads = 10;
    std::vector<std::thread> threads;
    for (int i = 0; i < nthreads; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num, start + std::chrono::milliseconds(100)));
    }
    for (int i = 0; i < nthreads; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
harsh frostBOT
#
Program Output
10
west orbit
#

and then you remove the semaphore acquire and release

#

;compile -std=c++20 -lpthread

#include <iostream>
#include <thread>
#include <chrono>
#include <semaphore>
#include <vector>
void worker(std::binary_semaphore &sem, int *num, std::chrono::time_point<std::chrono::steady_clock> timePoint) {
    
    std::this_thread::sleep_until(timePoint);
    //sem.acquire();
    *num += 1;
    //sem.release();
}
int main() {
    std::chrono::time_point<std::chrono::steady_clock> start = std::chrono::steady_clock::now();
    std::binary_semaphore sem(1);
    int num = 0;
    int nthreads = 10;
    std::vector<std::thread> threads;
    for (int i = 0; i < nthreads; ++i) {
        threads.push_back(std::thread(worker, std::ref(sem), &num, start + std::chrono::milliseconds(100)));
    }
    for (int i = 0; i < nthreads; ++i) {
        threads[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
#

bruh

harsh frostBOT
#
Program Output
10
west orbit
#

well it didnt break that time bjorklul

#

10 threads isnt quite enough to trip it up

west orbit
#

;compile -std=c++20 -lpthread

#include <iostream>
#include <vector>
#include <chrono>
#include <thread>
void worker(int *num, std::chrono::time_point<std::chrono::steady_clock> timePoint) {
    std::this_thread::sleep_until(timePoint);
    *num += 1;
}
int main() {
    int numThreads = 12;
    int num = 0;
    std::vector<std::thread> threadHandles;
    std::chrono::time_point<std::chrono::steady_clock> start = std::chrono::steady_clock::now();
    for (int i = 0; i < numThreads; ++i) {
        threadHandles.push_back(std::thread(worker, &num, start + std::chrono::milliseconds(100)));
    }
    for (int i = 0; i < numThreads; ++i) {
        threadHandles[i].join();
    }
    std::cout << num << "\n";
    return 0;
}
harsh frostBOT
#
Program Output
12
west orbit
#

damn

#

ah, so you can do about 12 threads bjorklul

#

anyway, it does trip up with enough threads

$ clang++ -o ./out.exe ./main.cpp; ./out.exe
993
#

should be 1024
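For comparison, a sketch of the same counter made safe with `std::atomic<int>` instead of a semaphore, so every increment is a single atomic read-modify-write. `countWithAtomic` and its parameters are names invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each thread bumps the shared counter; no increments are lost because
// fetch_add is one indivisible operation.
int countWithAtomic(int numThreads, int incrementsPerThread) {
    std::atomic<int> num{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back([&] {
            for (int j = 0; j < incrementsPerThread; ++j) {
                num.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto &t : threads) t.join();  // join makes all increments visible here
    return num.load();
}
```

Relaxed ordering is enough here because only the final count matters and `join()` already synchronizes with each thread.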

sinful wedge
#

Worlds best compiler bot

#

I'm thinking about going in the deep end with my Vulkan renderer and making it a voxel based renderer (how original!)

Ive got a couple of questions:

What's the high level overview of your engine?

How does lighting work? do you use a compute shader or just a heavily optimized cpu side per voxel lighting algorithm?

what runtime storage formats do you use so you dont copy all the data to the gpu every frame?

west orbit
#

the plan was a cpu side algo for compatibility and just making it a runnable job

#

eventually the lighting would just be done fully on the gpu in a deferred renderer

#

for simplicity's sake i wasnt going to make light level a gameplay mechanic as i have chunks along all 3 axis and that would get expensive to compute

#

at least it would be hard with sunlight, but block light wouldnt be that bad

#

making light levels not a gameplay mechanic makes it much easier to have chunks along all 3 axis, if i want just visual lighting, its much easier to just compute it for visible chunks

west orbit
#

you can then simply draw from the buffer with all the geometry stored on the gpu, it only needs to be reuploaded when there are updates to the geometry, like blocks being added or removed

#

it can also handle stuff like GL 1.1 vertex arrays, where it doesnt allocate any buffers and simply draws from the vertex data on the cpu side

west orbit
#

best i got is that it has multiple systems ive implemented that handle specific tasks like managing graphics chunks, world chunks, drawing text, and taking input and making the camera move based on that input

#

im trying to make it as modular as possible, for example the graphics chunk system depends on the world chunk system but there is no dependency the other way around

#

this would make it easier to make a server as i can simply omit stuff that isnt needed

#

i can give code snippets if you want but its all kinda undocumented yamikek

#

i was going to clean it up and properly document it when its actually something to begin with

#

rn its not much

sinful wedge
#

ok sounds like I'm thinking too much into this, I'll just go for it and probably make a thread called "poor design decisions" and chronicle my adventure there like you have

sinful wedge
#

I should be able to get voxels up and rendering by the end of this week

west orbit
#

ive been overthinking this for 8 years uwu

sinful wedge
#

my refactor should be done by like tomorrow and I want to get dynamic vertex generation

#

my problem rn is interfacing my renderer with my (dynamic vertex / voxel generator, idk what to call it, pls help)

#

I dont want them to interact

#

I want to be able to just return a list of chunks and a camera position and have it get rendered

west orbit
#

mine has that rn bjorklul
gl::chunk_manager::draw(&glChunkManager, &camera);

#

the chunk manager has a map of all the chunks that its working with and the draw function simply sets up the ogl state and renders them

sinful wedge
#

I'm thinking of storing two copies of the data, one on the renderer side and one on the voxel side, so I iterate over each chunk, hash it, and if its changed update the corresponding render chunk
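A small sketch of that hash-and-compare idea, assuming block data is a flat array of `uint32_t` IDs. `hashBlocks`, `RenderChunk`, and `syncChunk` are hypothetical names, and FNV-1a is just a short hash picked for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// FNV-1a over the bytes of each block ID; any reasonable hash would do.
std::uint64_t hashBlocks(const std::vector<std::uint32_t> &blocks) {
    std::uint64_t h = 14695981039346656037ull;
    for (std::uint32_t b : blocks) {
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (b >> shift) & 0xffu;
            h *= 1099511628211ull;
        }
    }
    return h;
}

struct RenderChunk {             // hypothetical renderer-side copy
    bool hasHash = false;        // false until the first sync
    std::uint64_t lastHash = 0;
    int rebuilds = 0;
};

// Rebuilds the render chunk only when the voxel data changed.
// Returns true when a rebuild happened.
bool syncChunk(const std::vector<std::uint32_t> &blocks, RenderChunk &rc) {
    const std::uint64_t h = hashBlocks(blocks);
    if (rc.hasHash && h == rc.lastHash) return false;
    rc.hasHash = true;
    rc.lastHash = h;
    ++rc.rebuilds;               // mesh regeneration would go here
    return true;
}
```

Hashing every chunk every frame costs a full pass over the block data, so a per-chunk dirty flag set on edit may be the cheaper option if the generator can afford to expose one.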

#

I want to have my generator completely isolated from the renderer

west orbit
#

this is what my chunk manager looks like

    typedef struct ChunkManager {
        shapes::Cube cube;
        std::map<std::array<int64_t,3>, AtomicSharedPtr<Chunk>> chunks;
        gl::Texture texture;//currently the only texture needed

        uint32_t renderDistanceXZ = 96;
        uint32_t renderDistanceYPositive = 0;
        uint32_t renderDistanceYNegative = 0;
    } ChunkManager;
sinful wedge
#

see this is almost exactly what I was thinking

#

now heres a question

west orbit
#

and then i have these functions for it

    namespace chunk_manager {
        //initializes the chunk manager opengl state and textures
        void init(ChunkManager *chunkManager);
        //must be run once per frame to stream chunks properly, syncs chunks until the specified time point, will sync chunk geometry if shouldSync is set
        void syncChunksUntil(ChunkManager *chunkManager, std::chrono::time_point<std::chrono::steady_clock> timePoint);
        //draws chunks if isRenderable is set
        void draw(ChunkManager *chunkManager, gl::transform::Camera *camera);
        //frees chunks outside of the render volume, currently does not unload chunks
        void freeDistantChunksUntil(ChunkManager *chunkManager, gl::transform::Camera *camera, voxel_world::World *world, std::chrono::time_point<std::chrono::steady_clock> timePoint);
        void loadCloseChunksUntil(JobQueue &jobQueue, gl::ChunkManager *chunkManager, gl::transform::Camera *camera, voxel_world::World *world, std::chrono::time_point<std::chrono::steady_clock> timePoint);

        //utility function for when graphics resets happen
        void freeAllChunks(JobQueue &jobQueue, gl::ChunkManager *chunkManager);
    }
sinful wedge
#

namespaces + C style OOP 💀

west orbit
#

it allows me to change functions globally at runtime gigachad

sinful wedge
#

ill stop making fun :p

west orbit
#

no need to think about messing with classes and assigning methods

#

also it makes using functions as jobs much easier

sinful wedge
#

my question was what did you call your dynamic vertex generator thing

#

rn I have this

#

but I dont like the word scene

west orbit
#

oh it does use std::bind but i find the syntax of using a class method for a job kinda oof

#

adding jobs looks like this
jobQueue.appendJob(std::bind(voxel_world::chunk::genTerrain, ...

west orbit
#

ez

sinful wedge
#

ignore the shitty async stuff

west orbit
#

i find making lambdas kind of a pain just for this bjorklul

west orbit
#

i use std::queue<std::function<int8_t()>> for my job queue

sinful wedge
#

I need to finish my channels implementation (definitely not an overlay over moodycamel::concurrentqueue :P)

west orbit
#

the return is used as a success/fail code, failed jobs are put back on the end of the queue and ones that succeeded are deconstructed

#

this allows me to make sure jobs with dependencies on other jobs are executed in the correct order
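A single-threaded sketch of that requeue-on-failure behaviour, to show how the ordering falls out. `drain` and `demoOrdering` are hypothetical names, and the two lambdas stand in for real terrain and meshing jobs.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>

// Pops jobs in FIFO order; a job returning -1 goes to the back of the queue.
// Returns how many job invocations it took to empty the queue.
int drain(std::queue<std::function<std::int8_t()>> &jobs) {
    int invocations = 0;
    while (!jobs.empty()) {
        auto job = std::move(jobs.front());
        jobs.pop();
        ++invocations;
        if (job() == -1) jobs.push(std::move(job));  // dependency not ready yet
    }
    return invocations;
}

// The mesh job depends on terrain but is queued first, so it fails once.
int demoOrdering() {
    bool terrainReady = false;
    bool meshBuilt = false;
    std::queue<std::function<std::int8_t()>> jobs;
    jobs.push([&]() -> std::int8_t {
        if (!terrainReady) return -1;   // retry later
        meshBuilt = true;
        return 0;
    });
    jobs.push([&]() -> std::int8_t { terrainReady = true; return 0; });
    return drain(jobs);
}
```

Here the mesh job runs, fails, is requeued behind the terrain job, and then succeeds, for three invocations in total. The catch is that a job whose dependency never runs will spin in the queue forever.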

#

the data structures themselves have atomic flags that are set based on what the current state of the structure is

#

that part is cumbersome but honestly idk how else i would do it bjorklul

#

something i need to do to improve chunk loading performance is pool my VBOs so the driver doesnt have to realloc every time chunks are loaded

#

itll impact gpu memory usage a bit but it should be fine, i could just make it an option uwu02

#

i fucking love settings kekw

#

so far i was going to have settings for vertex specification modes, texture modes, block face culling modes and i guess a VBO pool setting as well bjorklul

#

im also gonna implement a lot of rendering settings as well when i get to shaders

#

my draw function for chunks looks like this btw

            for (auto it = chunkManager->chunks.begin(); it != chunkManager->chunks.end(); ++it){
                
                if(!it->second.ptr->isRenderable) {
                    continue;
                }
                
                glm::dmat4 modMat = glm::translate(glm::dvec3(it->second.ptr->worldspaceBlockLocation.x, it->second.ptr->worldspaceBlockLocation.y, it->second.ptr->worldspaceBlockLocation.z));//compute the model matrix from its position
                gl::transform::fixed::loadModelViewMat(viewMat * modMat);//load the modelview matrix with the product of the view and model matrices, this transforms the chunk into view space for rendering
                gl::geometry::setupVertexAttribLayout(&it->second.ptr->geometry);
                gl::geometry::draw(&it->second.ptr->geometry);
                gl::geometry::cleanupVertexAttribLayout(&it->second.ptr->geometry);
                
            }
#
                gl::geometry::setupVertexAttribLayout(&it->second.ptr->geometry);
                gl::geometry::draw(&it->second.ptr->geometry);
                gl::geometry::cleanupVertexAttribLayout(&it->second.ptr->geometry);

these 3 are assignable std::functions for example

#

i can switch what these actually do under the hood to change what vertex specification mode it uses, even at runtime
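A stripped-down sketch of that assignable-function pattern, with strings standing in for the actual OpenGL calls. `Geometry`, `drawHook`, `VertexSpecMode`, and `setVertexSpecMode` are placeholders invented here, not the engine's real names.

```cpp
#include <cassert>
#include <functional>
#include <string>

struct Geometry { int vertexCount = 0; };   // hypothetical stand-in

// Assignable hook, in the spirit of gl::geometry::draw.
std::function<std::string(const Geometry &)> drawHook;

// Two interchangeable backends; real ones would issue GL calls instead.
std::string drawVertexArrays(const Geometry &g) {
    return "vertex arrays: " + std::to_string(g.vertexCount);
}
std::string drawVBO(const Geometry &g) {
    return "VBO: " + std::to_string(g.vertexCount);
}

enum class VertexSpecMode { VertexArrays, VBO };

// Points the hook at the backend for the requested mode.
void setVertexSpecMode(VertexSpecMode mode) {
    switch (mode) {
        case VertexSpecMode::VertexArrays: drawHook = drawVertexArrays; break;
        case VertexSpecMode::VBO:          drawHook = drawVBO;          break;
    }
}
```

Callers only ever invoke `drawHook(geometry)`, so the draw loop stays the same whichever backend was selected at startup.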

#

however switching at runtime is rather difficult as i need to either keep track of and delete all related ogl objects and reinit them, or completely reinit the ogl context itself in order for it to work right

#

it does however work perfectly on load time, just call the function to set the appropriate functions before anything is done

#

doing all of this allows me to support even GL 1.0 bjorklul

west orbit
#

its not too cumbersome as im not really shooting myself in the foot too often by forgetting to set the flags properly bjorklul

#

i did run into that problem with preventing use-after-free crashes but i solved that with ✨ smart pointers ✨

sinful wedge
#

RAII is I think my favorite thing in all of programming

#

all the benefits of GC without any of the drawbacks

west orbit
#

it cant use RAII everywhere in my engine as the last I is quite expensive sometimes kekw

sinful wedge
#

See I solve this by making most of my classes non default constructible and then just using a std::unique_ptr<RAIIWrapperClass> wrapped {nullptr}. It's a heap allocation, but idgaf

#

is there any way to delay construction of a non default constructible object on the stack?
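One option that seems to fit this question is `std::optional`, which keeps its storage inline (no heap allocation) and constructs the object only when `emplace` is called. A minimal sketch, with `Texture` and `loadLater` as made-up names:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

class Texture {                        // hypothetical RAII-style class
public:
    Texture() = delete;                // not default constructible
    explicit Texture(std::string path) : path_(std::move(path)) {}
    const std::string &path() const { return path_; }
private:
    std::string path_;
};

std::string loadLater() {
    std::optional<Texture> tex;        // storage is reserved, nothing constructed yet
    // ... other setup could happen here ...
    tex.emplace("stone.png");          // constructs the Texture in place
    return tex->path();
}                                      // ~Texture runs when tex goes out of scope
```

The destructor still runs automatically at scope exit, so the RAII guarantees are kept while construction is deferred.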

west orbit
#

dont std::map and std::unordered_map need their elements to be default constructible thinkies

sinful wedge
#

can't I just hash the uniqueptr<T>?

#

big brain

#

fill the heap

west orbit
#

or i guess

#

DefaultInsertable

#

ah so yes it must be default constructible if using the [] operator to insert elements

When the default allocator is used, this means that key_type must be CopyConstructible and mapped_type must be DefaultConstructible.
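The requirement quoted above only applies to `operator[]`. A sketch of inserting a non default constructible value with `try_emplace` instead (C++17), where `Chunk`, `ChunkKey`, and `insertWithoutDefaultCtor` are illustrative names:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>

struct Chunk {                                   // hypothetical stand-in
    explicit Chunk(int layout) : layout(layout) {}   // no default constructor
    int layout;
};

using ChunkKey = std::array<std::int64_t, 3>;

int insertWithoutDefaultCtor() {
    std::map<ChunkKey, Chunk> chunks;
    // chunks[{0, 0, 0}] would not compile here: operator[] needs Chunk().
    auto [it, inserted] = chunks.try_emplace(ChunkKey{0, 0, 0}, 7);
    assert(inserted);                            // the key was not present before
    return it->second.layout;
}
```

`try_emplace` forwards its extra arguments to the value's constructor and does nothing if the key already exists, which also avoids accidentally creating entries on lookup.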

sinful wedge
#

just dont do that 👀

west orbit
#

ah but i did anyway gigachad

#
                        gl::Chunk *glChunk = chunkManager->chunks[{x, y, z}].ptr;//get the pointer to the working chunk, operator[] also default-constructs the entry in the std::map if it doesnt exist yet
                        gl::chunk::init(glChunk, GL_VATTRIBLAYOUT_T2F_V3F);//initialize the chunk
sinful wedge
#

->chunks[{x, y, z}] ctad at its limits :P

west orbit
#

ctad? thinkies

west orbit
#

oh i just got it kekw

olive comet
#

very dope

west orbit
#

;asm -std=c++20

#include <iostream>
#include <mutex>
#include <semaphore>
void worker_sem(int *n, std::binary_semaphore &sem) {
    sem.acquire();
    *n += 1;
    sem.release();
}
void worker_mut(int *n, std::mutex &mut) {
    std::unique_lock<std::mutex> lock(mut);
    *n += 1;
}
harsh frostBOT
#
Critical error:

Embed too large.

west orbit
#

💀

#

;asm -std=c++20

#include <iostream>
#include <mutex>
#include <semaphore>
void worker_sem(int *n, std::binary_semaphore &sem) {
    sem.acquire();
    *n += 1;
    sem.release();
}
harsh frostBOT
#
Critical error:

Embed too large.

west orbit
#

bruh

#

nvm then kekw

sinful wedge
#

hear me out here

#

this does the hard part for you

#

just have each thread poll this

#

and if it fails

#

just put it to sleep

#

use std::this_thread::yield

#

I asked a question in #c-cpp-discussion about this as well since this is what im currently doing

sinful wedge
#

Nih?

west orbit
#

Not Invented Here

sinful wedge
#

Ahh

west orbit
#

its easier for me to know exactly how something will behave if i made it bjorklul

#

thats mostly why

west orbit
#

33fps on a RX 6800 kekw

#

48 chunk render distance

west orbit
#

i just thought of something thinkies

#

load and unload distances could be different thinkies
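A quick sketch of what separate load and unload distances could look like: chunks load inside the smaller radius and only unload past the larger one, so a player pacing along a boundary doesnt thrash. Function names and the XZ-only distance are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Chebyshev distance in chunks on the XZ plane (square render area).
std::int64_t chunkDistance(std::int64_t cx, std::int64_t cz,
                           std::int64_t px, std::int64_t pz) {
    const std::int64_t dx = std::llabs(cx - px);
    const std::int64_t dz = std::llabs(cz - pz);
    return dx > dz ? dx : dz;
}

// Decides whether a chunk should be resident after this update.
// Assumes loadDistance <= unloadDistance.
bool shouldBeLoaded(bool currentlyLoaded, std::int64_t distance,
                    std::int64_t loadDistance, std::int64_t unloadDistance) {
    if (distance <= loadDistance) return true;     // close enough: load it
    if (distance > unloadDistance) return false;   // far enough: unload it
    return currentlyLoaded;                        // in between: keep current state
}
```

With a load distance of 10 and an unload distance of 12, a chunk at distance 11 stays loaded if it already was, but is not loaded fresh.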

#

this is 10 chunk render distance without unloading chunks bjorklul

sinful wedge
#

if you really wanted to get fancy, detect times when threads are sleeping and put them to work doing other stuff :P

west orbit
#

i mean i could just submit other jobs bjorklul

#

but thats stuff that doesnt require that its done this frame kind of quick

#

i need to make this load from the player outwards, rn it loads in whatever direction it feels like which is usually x+ then z+
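One simple way to get player-outward loading is to collect the candidate chunk positions and sort them by squared distance before submitting jobs. A sketch on the XZ plane only, with `ChunkPos` and `loadOrder` as invented names:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using ChunkPos = std::array<std::int64_t, 3>;

// Returns every chunk position within `radius` of the player on the XZ plane,
// ordered from nearest to farthest.
std::vector<ChunkPos> loadOrder(const ChunkPos &player, std::int64_t radius) {
    std::vector<ChunkPos> order;
    for (std::int64_t x = -radius; x <= radius; ++x) {
        for (std::int64_t z = -radius; z <= radius; ++z) {
            order.push_back({player[0] + x, player[1], player[2] + z});
        }
    }
    // Squared distance avoids a sqrt and sorts identically.
    auto dist2 = [&](const ChunkPos &c) {
        const std::int64_t dx = c[0] - player[0];
        const std::int64_t dz = c[2] - player[2];
        return dx * dx + dz * dz;
    };
    std::stable_sort(order.begin(), order.end(),
                     [&](const ChunkPos &a, const ChunkPos &b) {
                         return dist2(a) < dist2(b);
                     });
    return order;
}
```

Since the offsets relative to the player never change for a given radius, the sorted offset list could also be computed once and reused every time the player crosses a chunk border.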

west orbit
#

uninitialized block data kekw

#

kinda wish i could emulate what this is doing but im not sure thats even possible kekw

sinful wedge
#

it has the same energy as turning off execution protection on a microcontroller and just letting it run wild

west orbit
#

im still stuck on world streaming

#

i have a high level idea as to how i want it to work but the specifics are very difficult to figure out properly

west orbit
#

@flint kettle this is the thread where i ramble about the design of my engine bjorklul

flint kettle
#

Oh nice

#

I will follow

west orbit
#

im still stuck on world streaming

sinful wedge
#

ramblage

#

it's a good thread

flint kettle
#

Ramblage Kek

west orbit
#

im trying to design a way to load/unload regions in a multithreaded environment but its really difficult and hard to keep track of everything

sinful wedge
#

When was the last time you just took a break, busted out some pen and paper and drew out your high level design ideas?

flint kettle
#

Maybe assign simulation for chunks per thread

west orbit
#

the high level idea is to have chunks in regions, only unload them when the entire region is unloaded, but have unloaded chunks compressed in memory until they're loaded again

sinful wedge
#

RAII seems like your friend here

west orbit
#

i need to make sure i dont clobber my memory with threads running over each other

flint kettle
#

Chunks will request ownership for singular simulation jobs maybe

#

I tried to do this but failed

flint kettle
#

Maybe store chunk state within the chunk and that chunk can float around to various threads depending on load conditions

sinful wedge
#

let me pull up the explanation again

#

its here somewhere

flint kettle
#

I suggest a job stack for each chunk basically

#

I’m so bad at wording

sinful wedge
#

#c-cpp-discussion message

flint kettle
#

I have precisely no idea what I’m talking about but somehow this is making sense to me pepekek

sinful wedge
#

that's parallelization for you!

west orbit
#

itd be way easier if it was singlethreaded kekw

#

but bruh i have 16 cores that arent going to be sitting around on my watch kekw

#

scalability was a problem i ran into when trying to play the minecraft modpack i like

sinful wedge
#

I still contend that the semaphore idea is the best since you can essentially just think about it like single thread, except instead of doing the processing on said thread, you just delegate the work for other threads to do

#

thus you think about it in a single threaded manner

west orbit
#

it seems most stuff in that is run on a single thread for processing all chunks

sinful wedge
#

Semaphores are hands down my favorite threading primitive

west orbit
#

rn the jobs reschedule themselves if the chunk isnt in the correct state

sinful wedge
#

oh

#

rescheduling is a big nono

west orbit
#

yeah kekw

#

it takes the job and puts it back on the end of the queue if the chunk isnt in the correct state

sinful wedge
#

honestly it sounds like you know where your faults are, would it be a good idea to just burn everything and start over?

flint kettle
#

Idk how to use a semaphore but it seems like the best choice based on what is described here

west orbit
#

hoping the dependent job runs at some point

flint kettle
#

If it’s anything like the “job stack” I described

#

Job stack idea fits nicely inside of brain

west orbit
#

std::queue<std::function<int8_t()>> jobs;

flint kettle
#

Implement unlinked list pepekek pepekek

west orbit
#

the job on the front of the queue is processed by the first free worker thread

west orbit
#

ya

flint kettle
#

So what if each chunk had a queue

#

And then engine did round robin simulation

sinful wedge
#

why can't you just have many threads popping off that queue

#

thats the simple solution

sinful wedge
#

????

#

then why are we having this conversation

#

what am I missing

flint kettle
#

My cpu have 99999 threads

#

Use them all

west orbit
#

its hard to get a region of chunks to not get clobbered by multiple threads

#

currently chunks are generated for building geometry and then immediately deleted

#

as i have no actual working world system

#

i have everything designed like this so far but im missing so much

    namespace chunk {
        size_t getBlockOffset(World *w, size_t x, size_t y, size_t z);
        void allocBlocks(World *w, Chunk *c);
        //void freeBlocks(Chunk *c);
        int8_t genTerrain(World *w, AtomicSharedPtr<Chunk> c, voxel_world::Coordinates worldspaceChunkLocation);
        //int8_t loadTerrain();
    }
    namespace region {
        size_t getChunkOffset(World *, size_t x, size_t y, size_t z);
        void init(World *world, Region *region);
        Chunk *loadChunk(World *world, Region *region, Coordinates regionspaceChunkPosition);
        Chunk *getChunk(World *world, Region *region, Coordinates regionspaceChunkPosition);
        void unloadChunk(World *world, Region *region, Coordinates regionspaceChunkPosition);
        void unload(World *world, Region *region);
    }
    namespace dimension {
        Chunk *allocChunk(World *world, Dimension *dim, Coordinates worldspaceChunkLocation);
        int8_t loadChunk(World *world, Dimension *dim, Coordinates worldspaceChunkPosition);
        int8_t unloadChunk(World *world, Dimension *dim, Coordinates worldspaceChunkPosition);
    }
flint kettle
#

What does clobbered mean in this context

#

Sorry I am slow with this

#

Lmao new to parallelized stuff in C++

west orbit
#

its called race conditions afaik

flint kettle
#

Oh data race

#

Yea

sinful wedge
#

wait a darn second here

#

What's actually holding all of the chunks at the highest level?

west orbit
#

like for instance, if im trying to save a region to disk but another thread wants to load a chunk in that region

sinful wedge
#

do you have vector of chunks or like a hash map or what

west orbit
# sinful wedge What's actually holding all of the chunks at the highest level?
    struct World;
    struct Chunk {
        std::atomic<bool> isLoaded = 0;
        std::atomic<int32_t> numLoaders = 0;
        //int64_t chunkLocationX, chunkLocationY, chunkLocationZ;
        Coordinates worldspaceChunkLocation;//the chunk's location in the world, in chunk coordinates
        std::unique_ptr<uint32_t[]> blocks;
    };
    struct Region {
        std::unique_ptr<AtomicSharedPtr<Chunk>[]> chunks;
    };
    struct Dimension {
        std::shared_mutex regionsMutex;
        std::map<std::array<int64_t,3>, Region> regions;
    };
    struct World {
        //the dimensions of a chunk, in block coordinates
        uint32_t chunkWidth, chunkHeight, chunkLength;
        //the dimensions of a region, in chunk coordinates
        uint32_t regionWidth, regionHeight, regionLength;
        //std::map<int32_t, Dimension> dimensions;
    };
sinful wedge
#

well thats unavoidable, you have a world after all

west orbit
#

i have a std::map to hold regions which has an array of chunk pointers

sinful wedge
#

imma draw something out on paper

#

gimme a min

west orbit
flint kettle
#

You could also literally physically map threads to chunks

#

Using some gridlike pattern
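
One way the grid idea could look, as a sketch only: tile the chunk coordinates with a small repeating grid and give each grid cell a fixed worker, so the same chunk always lands on the same thread. `workerForChunk` and its parameters are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Map a chunk's (x, z) position to a worker index in a gridW x gridL tile.
// Neighbouring chunks get different workers; the pattern repeats every tile.
inline std::size_t workerForChunk(std::int64_t cx, std::int64_t cz,
                                  std::size_t gridW, std::size_t gridL) {
    assert(gridW > 0 && gridL > 0);
    // Euclidean modulo so negative chunk coordinates tile the same way.
    auto wrap = [](std::int64_t v, std::size_t n) {
        const auto m = static_cast<std::int64_t>(n);
        return static_cast<std::size_t>(((v % m) + m) % m);
    };
    return wrap(cx, gridW) + gridW * wrap(cz, gridL);
}
```

With a 2x2 grid this would use 4 workers. It keeps per-chunk work on one thread but does nothing by itself to balance load.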

west orbit
#

also my blood sugar is a bit low rn so my brain may not be working at 100% bjorklul

west orbit
flint kettle
#

I have caffeine and insanity coursing through my brain

#

I am gods strongest soldier and Minecraft is his hardest battle pepekek

sinful wedge
flint kettle
#

Gonna go prototype stuff now

sinful wedge
#

one sec

flint kettle
#

Discord is upside down

#

Not the image

west orbit
#

an actually good minecraft engine is harder to make than one would think kekw

sinful wedge
#

ok

#

lookie here

#

The things marked with a 🚫 are data races

#

You can have chunks do internal processing on themselves from multiple separate threads

#

You cannot move chunks or insert chunks into the map without a data race occurring

#

You need to create "dummy" chunks in the map and then pass those pointers around to have their processing done separately
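
A minimal sketch of the "dummy chunk" idea, under some assumptions: the map is guarded by a plain `std::mutex`, chunks are held by `std::shared_ptr`, and `PlaceholderChunk`, `ChunkTable` and `fillChunk` are hypothetical names. The map is only touched under the lock; the slow part runs on the returned pointer with no lock held.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical simplified chunk; the real one has more fields.
struct PlaceholderChunk {
    std::atomic<bool> isLoaded{false};
    std::vector<std::uint32_t> blocks;
};

using ChunkKey = std::array<std::int64_t, 3>;

class ChunkTable {
public:
    // Insert an empty placeholder if the key is new. The bool tells the
    // caller whether it created the chunk and so owns the job of filling it.
    std::pair<std::shared_ptr<PlaceholderChunk>, bool> getOrCreate(const ChunkKey &key) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto &slot = chunks_[key];
        const bool created = (slot == nullptr);
        if (created) slot = std::make_shared<PlaceholderChunk>();
        return {slot, created};
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return chunks_.size();
    }

private:
    std::mutex mutex_;
    std::map<ChunkKey, std::shared_ptr<PlaceholderChunk>> chunks_;
};

// Runs on a worker thread with no map lock held. Stand-in for terrain gen.
inline void fillChunk(PlaceholderChunk &chunk, std::size_t blockCount) {
    assert(!chunk.isLoaded.load());
    chunk.blocks.assign(blockCount, 1u);
    chunk.isLoaded.store(true, std::memory_order_release);
}
```

Only the thread that got `created == true` should call `fillChunk`; everyone else checks `isLoaded` before reading `blocks`.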

west orbit
#

i want to limit the spawning and destruction of threads, and i also want to closely control how many threads are active

sinful wedge
#

the insertion of new chunks is fundamentally limited

#

however once theyre constructed

#

you can do whatever

#

as long as the threads stay isolated

west orbit
sinful wedge
west orbit
#

i need to clean it up because its hacked together but it does work

sinful wedge
#

just spawn like 15 threads and use a mutex and cv on a queue to wait when its empty

#

and otherwise they just pop off the next job and execute it
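
Roughly what that looks like, as a sketch with only the standard library. `JobPool` is a made-up name; workers sleep on the condition variable while the queue is empty and drain whatever is left before shutting down.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class JobPool {
public:
    explicit JobPool(std::size_t workerCount) {
        for (std::size_t i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Signals shutdown, then waits for workers to finish the remaining jobs.
    ~JobPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto &w : workers_) w.join();
    }

    void submit(std::function<void()> job) {
        assert(job != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Sleep until there is work or we are shutting down.
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return; // stopping and fully drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(); // run outside the lock so other workers can pop
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

The worker count is fixed at construction, so no threads get spawned or destroyed while the game is running.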

west orbit
sinful wedge
#

also moodycamel::concurrentqueue

#

will do this better than you ever will 😛

west orbit
sinful wedge
#

I dont wanna look at this however this is needlessly complex

west orbit
#

i prefer to make my own thing over having to learn how something else works

sinful wedge
#

let me show you my implementation

west orbit
#

again i need to clean it up

#

i need to change the cv to a counting semaphore, it provides a less confusing interface imo

sinful wedge
#

lines 190 - 285

#

wtf

#

why wont discord upload

west orbit
#

discord moment

sinful wedge
#
        template<class T>
        class Receiver 
        {
        public:

            Receiver()                          = delete;
            Receiver(const Receiver&  other)            = default;
            Receiver(Receiver&& other)                  = default;
            Receiver& operator=(const Receiver&  other) = default;
            Receiver& operator=(Receiver&& other)       = default;

            [[nodiscard]] auto receive() const noexcept
                -> std::optional<T>
            {
                std::optional<T> output = std::nullopt;

                this->mutex_ptr->lock([&output](std::deque<T>& value_safe)
                {
                    if (!value_safe.empty())
                    {
                        output = std::move(value_safe.back());
                        value_safe.pop_back();
                    }
                });

                return output;
            }

        private:

            template<class J>
            friend std::pair<Sender<J>, Receiver<J>> create() noexcept;

            Receiver(std::shared_ptr<Mutex<std::deque<T>>> ptr) noexcept 
                : mutex_ptr(ptr) {}

            std::shared_ptr<Mutex<std::deque<T>>> mutex_ptr;
        }; // class Receiver<T>

        template<class T>
        std::pair<Sender<T>, Receiver<T>> create() noexcept
        {
            auto mutex_ptr = std::make_shared<Mutex<std::deque<T>>>();

            return std::make_pair<Sender<T>, Receiver<T>>(
                Sender {mutex_ptr},
                Receiver {mutex_ptr}
            );
        }
    } // namespace mpsc
#
        template<class T>
        class Sender;

        template<class T>
        class Receiver;

        template<class T> 
        std::pair<Sender<T>, Receiver<T>> create() noexcept;



        template<class T>
        class Sender 
        {
        public:
        
            Sender()                          = delete;
            Sender(const Sender&  other)            = default;
            Sender(Sender&& other)                  = default;
            Sender& operator=(const Sender&  other) = default;
            Sender& operator=(Sender&& other)       = default;

            void send(T&& valueToSend) const noexcept
            {
                // Capture by reference: a by-value capture is const inside a
                // non-mutable lambda, so std::move would silently copy it.
                this->mutex_ptr->lock([&valueToSend](std::deque<T>& lockedQueue)
                {
                    lockedQueue.push_front(std::move(valueToSend));
                });
            }

        private:

            template<class J>
            friend std::pair<Sender<J>, Receiver<J>> create() noexcept;

            Sender(std::shared_ptr<Mutex<std::deque<T>>> ptr) noexcept 
                : mutex_ptr(ptr) {}

            std::shared_ptr<Mutex<std::deque<T>>> mutex_ptr;

        }; // class Sender<T>
#

I know NIH syndrome but look at this implementation to get ideas

#

this is just a channel

flint kettle
#

what is NIH syndrome

#

this looks fine

west orbit
flint kettle
#

oh

#

i google

west orbit
#

i personally prefer to make my own systems so i know exactly how they work, rather than learn a system i didnt make

sinful wedge
#
    class LoggerSingleton
    {
    public:

        ~LoggerSingleton() = default;

        LoggerSingleton(LoggerSingleton& )            = delete;
        LoggerSingleton& operator=(LoggerSingleton& ) = delete;


        static void sendMessage(Message&& message) noexcept
        {
            static LoggerSingleton logger {};

            logger.thread_sender->send(std::forward<Message>(message));
        } 

    private:
        LoggerSingleton() noexcept
        {
            auto [threadSender, threadReceiver] = seb::mpsc::create<Message>();

            this->thread_sender = std::move(threadSender);

            this->worker_thread = std::jthread(
                [receiver = std::move(threadReceiver)](std::stop_token token)
                {
                    // Main loop
                    while (!token.stop_requested()) {
                        std::optional<Message> val = receiver.receive();

                        if (val.has_value()) 
                        {
                            std::cout << static_cast<std::string>(val.value());
                        } 
                        else
                        {
                            // TODO: refactor to use std::condition_variable
                            std::this_thread::yield();
                        }
                    }

                    // Cleanup Loop
                    for (;;) {
                        std::optional<Message> val = receiver.receive();

                        if (val.has_value()) 
                        {
                            std::cout << static_cast<std::string>(val.value());
                        } 
                        else 
                        {
                            break;
                        }
                    }
                }
            );
        }
        std::jthread worker_thread;
        std::optional<seb::mpsc::Sender<Message>> thread_sender;

    };
#

look at this simple use

flint kettle
#

but im too Idiot to write my own things as complicated as this

#

(not that complicated maybe idk)

west orbit
sinful wedge
#

this is just a simple asynchronous logger

#

however it's obvious whats being done and how this could be easily extended into a proper thread pool

#

say I wanted 3 printing threads

#

it's not that bad

#

take a look at this

#

get some ideas

#

i gotta go

west orbit
#

i think using a system of shared and unique locks may do what i need Hmmm
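
A sketch of how that could look with `std::shared_mutex`: chunk work takes a shared lock so many threads can be inside the region at once, and unloading takes a unique lock so it waits for all of them. `GuardedRegion` and both functions are hypothetical, and it assumes each chunk slot is only written by one thread at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <utility>
#include <vector>

struct GuardedRegion {
    std::shared_mutex mutex;
    bool unloaded = false;
    std::vector<std::uint32_t> chunkSlots; // stand-in for per-chunk data
};

// Shared lock: many loaders may run at once, each on its own slot.
// Returns false if the region is already gone or the slot is out of range.
inline bool writeChunkSlot(GuardedRegion &r, std::size_t slot, std::uint32_t value) {
    std::shared_lock<std::shared_mutex> lock(r.mutex);
    if (r.unloaded || slot >= r.chunkSlots.size()) return false;
    r.chunkSlots[slot] = value;
    return true;
}

// Unique lock: blocks until every shared holder is done, then takes the data
// out so it can be written to disk without anyone else touching it.
inline std::vector<std::uint32_t> unloadRegion(GuardedRegion &r) {
    std::unique_lock<std::shared_mutex> lock(r.mutex);
    assert(!r.unloaded);
    r.unloaded = true;
    return std::exchange(r.chunkSlots, {});
}
```

This only covers threads that are actively inside the region. Jobs that are still sitting in a queue need some other way to be counted.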

#

no wait hmmm

#

yeah the system of chained semaphores is probably what i need to ensure all jobs are run before a region is unloaded for example Hmmm
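
Not the chained semaphore design itself, but a simpler stand-in for the same goal: a per-region counter of in-flight jobs that the unloader waits on. `PendingJobs` is a made-up name and this is only a sketch using a mutex and condition variable.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

class PendingJobs {
public:
    // Call when a job targeting the region is queued.
    void begin() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++count_;
    }

    // Call when that job has finished (or was cancelled).
    void end() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(count_ > 0);
            if (--count_ != 0) return;
        }
        idle_.notify_all(); // last job done, wake any waiting unloader
    }

    // Blocks until no jobs are outstanding; unload after this returns.
    void waitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return count_ == 0; });
    }

    std::size_t count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    std::mutex mutex_;
    std::condition_variable idle_;
    std::size_t count_ = 0;
};
```

New jobs for the region have to be refused once unloading starts, otherwise `waitIdle` can return and a fresh job can slip in right after.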

#

im gonna go get some food and see if that fixes my blood sugar

#

its hard to think rn

west orbit
#

there has to be a good way to do this Hmmm

#

tbh im kinda waiting for the idea to pop into my head suddenly, thats usually how i figure stuff out bjorklul

sinful wedge
#

chained semaphores seems like the best way to handle this, just be sure to actually setup multiple concurrent chains rather than just having one chain bouncing its execution between threads :P

west orbit
sinful wedge
#

also make sure you get it working in single threaded first :P

west orbit
#

yeah, i have a way to execute jobs on the main thread without spawning workers at all

#

im gonna take a break from trying to figure this out for now, depression got hands

void eagle
#

Hi

#

What are the optimisations you are currently using

#

Frustum culling is really important i hope you know

west orbit
#

So far limited block face culling is really the only optimization I've done
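
For context, a sketch of what full face culling could look like: a face is only emitted when the neighbouring block is air, and the lookup is a callback so it can reach into the adjacent chunk at the borders. `BlockLookup`, `kAir` and `countVisibleFaces` are assumptions for illustration, including the idea that block id 0 means air.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical lookup in world block coordinates; it is expected to resolve
// positions in neighbouring chunks as well as the current one.
using BlockLookup = std::function<std::uint32_t(int x, int y, int z)>;

constexpr std::uint32_t kAir = 0; // assumed id for an empty block

constexpr int kNeighborOffsets[6][3] = {
    {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1},
};

// Number of faces of the block at (x, y, z) that touch air and need geometry.
inline int countVisibleFaces(const BlockLookup &blockAt, int x, int y, int z) {
    assert(blockAt != nullptr);
    if (blockAt(x, y, z) == kAir) return 0; // air has no faces
    int visible = 0;
    for (const auto &o : kNeighborOffsets)
        if (blockAt(x + o[0], y + o[1], z + o[2]) == kAir) ++visible;
    return visible;
}
```

The catch is that the neighbouring chunk's block data has to still exist when the mesh is built, which ties back to the world structure problem.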

#

The rest is just trying to design stuff to run fast and well

#

So its not too bad on the cpu end, but its ridiculously heavy on the gpu end

#

However I know why and how to fix it

#

I'm currently stuck on making a proper world structure for world streaming, then I'm going to move onto proper full block face culling and then view frustum culling
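
A minimal sketch of the per-chunk frustum test, assuming the six frustum planes are already available with normals pointing inward. `Plane`, `Aabb` and the function names are hypothetical, and extracting the planes from the camera matrices is left out.

```cpp
#include <array>
#include <cassert>

// Points inside the frustum satisfy nx*x + ny*y + nz*z + d >= 0.
struct Plane { float nx, ny, nz, d; };

// Axis-aligned bounding box of one chunk in world space.
struct Aabb { float minX, minY, minZ, maxX, maxY, maxZ; };

// True if the whole box lies on the outside of the plane. Only the corner
// farthest along the plane normal needs to be tested.
inline bool boxFullyOutside(const Aabb &box, const Plane &p) {
    const float x = (p.nx >= 0.0f) ? box.maxX : box.minX;
    const float y = (p.ny >= 0.0f) ? box.maxY : box.minY;
    const float z = (p.nz >= 0.0f) ? box.maxZ : box.minZ;
    return p.nx * x + p.ny * y + p.nz * z + p.d < 0.0f;
}

// Conservative: may keep a few chunks that are just off screen, but never
// drops one that is actually visible.
inline bool chunkMaybeVisible(const Aabb &box, const std::array<Plane, 6> &frustum) {
    assert(box.minX <= box.maxX && box.minY <= box.maxY && box.minZ <= box.maxZ);
    for (const Plane &p : frustum)
        if (boxFullyOutside(box, p)) return false;
    return true;
}
```

Chunks that fail the test can skip their draw call entirely, which is cheap to check on the CPU.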

#

Vertex buffer pooling is also something I should probably do

#

yeah vertex buffer pooling would help speed a decent amount, seems the main thread is spending lots of time loading geometry onto the gpu, probably because it needs to keep freeing and allocating buffers each time it loads a chunk
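
The pooling idea in sketch form, kept free of actual GL calls: buffers released by unloaded chunks go onto a free list and get reused before any new one is created. `BufferPool` and `BufferHandle` are made-up names, and the `create` callback stands in for whatever really allocates a GPU buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using BufferHandle = std::uint32_t; // stand-in for a GPU buffer id

class BufferPool {
public:
    // `create` is a hypothetical callback that allocates a new GPU buffer.
    explicit BufferPool(std::function<BufferHandle()> create)
        : create_(std::move(create)) {
        assert(create_ != nullptr);
    }

    // Reuse an idle buffer if there is one, otherwise allocate a new one.
    BufferHandle acquire() {
        if (!idle_.empty()) {
            const BufferHandle handle = idle_.back();
            idle_.pop_back();
            return handle;
        }
        return create_();
    }

    // Return a buffer to the pool instead of freeing it.
    void release(BufferHandle handle) { idle_.push_back(handle); }

    std::size_t idleCount() const { return idle_.size(); }

private:
    std::function<BufferHandle()> create_;
    std::vector<BufferHandle> idle_;
};
```

It has no locking, on the assumption that it is only used from the thread that owns the GL context.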

flint kettle
#

Don’t go too far

#

Optimization can be done later

#

Scale well n stuff

west orbit
#

Both of those optimizations will become necessary very quickly yamikek

flint kettle
#

Yes quite

west orbit
#

also im quite interested to see its performance on devices like a pi4 and the atom n470 machine i have, and those are the only rendering optimizations i can really make

west orbit
#

Okay, instead of sitting around waiting for me to figure shit out, I guess I could try and clean it up so its presentable and see what you all think of it so far

#

Tbh I'm not sure how I want to license it tho thinkies

sinful wedge
flint kettle
#

Not sure, depends on what rights you want to reserve

west orbit
#

Id go with MIT for smaller things hmmm

#

With this i don't want it to be used for commercial purposes, unless I get something for that thinkies

#

Personally I don't think I'd release any paid for software using this, but if someone is gonna try and make money off of it I'd like something yamikek

#

I think the libraries I'm using are compatible with that

#

I need a license that is for FOSS but allows me to dual license for commercial purposes if I ever need to, which realistically I won't need to kekw

flint kettle
#

It exists but cannot remember

#

Forgor

west orbit
#

I could just release it initially with no license so I don't have to make an actual decision yet yamikek

flint kettle
#

Make it FOSS but you have to request source code lmao

#

Can’t just view it

#

It must be mailed to you as a binder of paper

west orbit
#

okay, the 8-9fps for 5 chunk render distance on a sandy bridge GT1 igp was not what i thought it was

#

that was with no culling at all kekw

flint kettle
#

Lmao

#

If I render five chunks I want to render it all

#

No culling

#

I paid for five chunks I’ll get five chunks

west orbit
flint kettle
#

Fuck :((( goddammit