#Marching cube help to reduce game load times / performance improvements
first i tested with a larger base chunk size and smaller World and Large Chunk sizes, which resulted in an expectedly more expensive Chunk.MarchCube() call, both at load time and when terraforming
next i tested with smaller base chunk and World chunk sizes and a bigger Large Chunk size. This resulted in a much faster load time: even though the number of individual chunks went up from 512 to 16,384, there were a lot fewer Chunk.MarchCube() calls. Same with the recorded data for terraforming, but with the profiler turned off it felt much choppier to use the smaller chunk sizes with the bigger LargeChunk.
Well first of all 27 mb of garbage for a single frame is insane
You might want to post relevant code alongside those profiler data.
and the final test, with a larger World size and smaller chunk and LargeChunk sizes, resulted in a much faster load time and was the only one where i could comfortably watch the screen and know what's happening. which is interesting cause it had the highest call count of the three tests and also requires the most gameobjects to be spawned, at a whopping 524k
!code
📃 Large Code Blocks
Use links to services like:
https://paste.mod.gg/, https://hastebin.skyra.pw/, https://paste.ofcode.org/, https://paste.myst.rs/
📃 Inline Code
Surround code with three backquotes. Not quotation marks.
To format as C#, add cs to the first line:
```cs
// Your code here
```
Add a comment with a line number if there is an error message.
wrong one :/
which one of these would you recommend?
Doesn’t matter
think thats it
though i guess World Generator if we go in order of first used to last should be first not second
It’s kinda hard to analyse this code on my phone. I will try to get back to it when I can on a proper PC.
code's a mess huh, sorry 😅
But in the mean time it might be worth your time to explore the job system and Burst. There is only so much you can do in a single thread.
Not a mess no. That website does not have syntax highlighting and ofc a phone screen is small.
Compared to my two 24 inch monitors that is
been dreading that answer but yeah think i might have to
fair
ugh these damn reference types are making this harder than i had hoped. Edit: saw NativeList, going to use that
new one fun :)
I don't have full context of your goal, but for starters the 16 MB allocation in Chunk.MarchCube can be cut down to 0 by just moving the new float[8] to the caller.
sorry?
void MarchCube(Vector3Int position, int step)
{
    float[] Cube = new float[8]; // allocated again on every call
    // ... (rest of the method not quoted in the thread)
}
ahh but move to caller?
Yes, let the caller CreateMeshData allocate just once, and pass the same one to MarchCube, instead of every call of MarchCube needing to allocate one and then throwing it away after it's done.
ahh i see
will definitely help since there were thousands of these chunks, thanks
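A minimal sketch of the "allocate once in the caller" change, in plain C# so it runs outside Unity. ChunkSketch, Size, and the corner ordering are hypothetical placeholders, not the code from the paste; only the allocation pattern is the point.

```cs
// Hypothetical stand-in for the Chunk class from the thread.
class ChunkSketch
{
    const int Size = 4; // cells per axis (assumed for illustration)
    readonly float[,,] terrainMap = new float[Size + 1, Size + 1, Size + 1];

    // Returns the number of cells visited so the sketch is easy to check.
    public int CreateMeshData()
    {
        // One 8-corner scratch buffer per chunk instead of one per cell.
        float[] cube = new float[8];
        int visited = 0;
        for (int x = 0; x < Size; x++)
            for (int y = 0; y < Size; y++)
                for (int z = 0; z < Size; z++)
                {
                    MarchCube(x, y, z, cube);
                    visited++;
                }
        return visited;
    }

    void MarchCube(int x, int y, int z, float[] cube)
    {
        // Overwrite all 8 entries so nothing stale carries over from the
        // previous cell. The corner order here is a placeholder, not the
        // standard marching-cubes corner table.
        for (int i = 0; i < 8; i++)
        {
            int cx = x + (i & 1);
            int cy = y + ((i >> 1) & 1);
            int cz = z + ((i >> 2) & 1);
            cube[i] = terrainMap[cx, cy, cz];
        }
        // ... triangulation lookup would go here ...
    }
}
```

With Size = 4 this visits 64 cells while allocating a single float[8], where the original pattern would allocate 64 of them.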
I'm not really available to fully digest your code and understand your goal, but I would guess there are probably lots of these low hanging fruits you can clean up.
Additionally, engineering is also a lot about reframing the problem. If A is a hard problem to solve but you don't actually need to do A, instead you can just do B, then that might also solve your issue.
Burst and the job system will help, but they're not a silver bullet; rewriting your code to use them won't magically make it fast.
fair, the main Chunk code wasn't designed by me so i didn't check it the most myself plus im new to coding (started september of 2024 so not even a year) so not sure what im doing quite often :/
assumed so, i did see a huge jump from when i went from a one chunk system to 2 and then from when i went from calling it all at once to giving a delay. but marching cubes as far as i know is known for being hard to optimize.
and sorry real quick i should put it before the big chunk of for loops right?
Well, this low hanging fruit is not really specific to marching cube algorithm.
like this
Yeah.
though im told jobs don't like reference data like the float[]. is there a Native array?
And this optimization doesn't just apply to managed code. Even when you are writing Burst code you would still want to minimize allocating native arrays, and this optimization will still apply.
oh wait, wouldn't doing this just undo that, since pushing it to jobs would un-reference it and cause the reallocation, no?
Basically point is that, instead of throwing away your existing code and praying a rewrite in Burst will magically fix all your problems, try to understand why your existing code is slow.
got it, thanks for the tips guess i'll need to spend awhile analysing the code itself
If you are going to write your code with Burst, you wouldn't be able to use reference type at all, you would instead be allocating native arrays and passing native arrays around, and if you naively "translate" the code you will run into the exact same performance pitfall: instead of just allocating one native array in CreateMeshData, you end up allocating a native array in every call of MarchCube and then deallocating right afterwards, killing your performance.
so not what i want to do, wonder if there's any way i can take the whole CreateMeshData into the same function so as to not need to move it about
I mean you absolutely can rewrite your existing code into Burst and you will very likely get a decent performance boost, but yes the point is that you still need to understand the problem with your existing code to know what to fix.
kinda like this https://paste.mod.gg/zhsyfehdddor/0
If you fix all the issues with your existing code, it might even be fast enough that you don't need a rewrite at all.
You don't have to inline it. All you get out of inlining is just skipping passing a parameter, which costs practically nothing.
sorry inline it?
You move the code in MarchCube directly into the caller.
i mean before the float[] was inside with the nested for loop no?
this is it in jobs and not the first code you saw
Oh I meant you can still keep CreateMeshData and MarchCube separate instead of one giant block of code.
nah if i want to have the float[] and use jobs but not reallocate the data i figured putting the whole loop into the MarchCube function would do just that
But as said, you should revert back to your old code, analyze and fix it, before considering rewriting in Burst and job.
Otherwise it's a waste of time and effort.
fair 😅
You should consider making it properly async first. From the look at the profiler screenshot, you're processing 800 chunks in one frame.
which one were you looking at?
The one at the start of the thread.
Or rather this one
#1393084510182047744 message
well makes sense there was an intentional 32 running at the same time, but would it be better to run only one at a time than chunks of them?
What do you mean by "at the same time"?
You're currently processing 800. That's what's clear from the screenshot.
I don't know how many. It depends on how much time each chunk takes.
If one takes 16ms to process, then yes.
i mean the test had 524k chunks total i wouldn't want them to run one frame per one chunk
especially when run on the same frame equates to about 1k per second (though im not sure how many it would be if it was one per frame)
sorry which Task should i use to make it wait till the last chunk was done? been using Yield but feel like thats wrong
love that this simple change boosted the fps while loading from the avg range of (5-20) to 60 - 200 depending on how many i run at once. also made it so the deep profiler didn't use so much, thanks i'll be careful to lookout for more.
Ah, OK, so it does work async.
That's the one you'd use. Other than that you'll need some custom logic.
Next, I'd look at the profiler again for more clues.
i mean yeah it does unsynchronize the calls so as to not call it all at once, but i'll need to find a sweet spot or it'll call too little at a time and take a lot of frames, or call way too many and have no frames
yeah i was trying that now that it runs with the profiler
Now maybe that's even fast enough that you don't need a full-on rewrite anymore 😄
nah it didn't do much for loading speed just increased the fps sadly
still takes 2 minutes for the 524k chunks or 25,165,824 cubes (i think its this many) to be marched :/
2 minutes might be a bit long, but 25 million cubes is a lot, especially since (if I'm understanding correctly) you are actually building meshes for them.
well not for every cube only those that are within a set value for the float[,,] TerrainMap but yeah
I'm not sure how fast you expect it to be, but consider what I've said before that engineering is a lot about reframing problems. Do you need all 500k chunks to be processed right away before player can play, or do you only need some of them (eg the ones where player spawns in) to be processed right away, and the rest can be processed in background/on demand as player move around?
well the chunks are quite small (as seen in this video) so having a lot spawned at once would be nice
Instead of drilling on how to make this solution faster, take a step back and consider if you are tackling the wrong problem.
fair
i tried with loading new chunks as the player move but that comes with a caveat that it will give lag spikes whenever the player loads new chunks (which i thought i did have an older video of but don't)
i can try again of course since the creation speed of the chunks has increased a lot
Spawning GOs is pretty slow yeah and you cannot move it to background thread, but you can always do something like "limiting spawning only x chunks per frame."
As long as player doesn't move so fast that "spawning only x chunks per frame" cannot keep up, it's irrelevant if a chunk shows up a few frames later.
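A minimal sketch of "spawn only x chunks per frame", in plain C# so it runs outside Unity. ChunkSpawnQueue, maxPerFrame, and the int chunk ids are hypothetical; in Unity, Tick would be called from Update and the spawn callback would do the actual Instantiate.

```cs
using System;
using System.Collections.Generic;

// Hypothetical per-frame budget queue; not taken from the thread's code.
class ChunkSpawnQueue
{
    readonly Queue<int> pending = new Queue<int>();
    readonly int maxPerFrame;

    public ChunkSpawnQueue(int maxPerFrame)
    {
        this.maxPerFrame = maxPerFrame;
    }

    public int PendingCount => pending.Count;

    public void Enqueue(int chunkId) => pending.Enqueue(chunkId);

    // Call once per frame. Spawns at most maxPerFrame chunks and
    // returns how many were actually spawned this frame.
    public int Tick(Action<int> spawn)
    {
        int spawned = 0;
        while (spawned < maxPerFrame && pending.Count > 0)
        {
            spawn(pending.Dequeue());
            spawned++;
        }
        return spawned;
    }
}
```

Chunks that do not fit in this frame's budget simply stay queued and show up a few frames later.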
honestly i can get it up to spawning 10 ish thousand a second, and the current biggest draw back is the await Task cause it limits whats happening but it's meant to
very poorly put together test but the idea is there
found the problem with this, it was my bad :/
after making a properly working script it runs pretty well for frames but it's doing odd things
also as far as i understand what this means
pretty sure it's the async holding the generation back but im not sure
This is probably just not revealing all the relevant info. Try adding profiler markers along your code. This way you'll be able to see parts of it in the profiler without deep profiling.
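A minimal sketch of adding profiler markers, assuming Unity's Unity.Profiling.ProfilerMarker API. ChunkProfilingSketch and the marker names are hypothetical, and the bodies are placeholders for the real marching and mesh-building calls.

```cs
using Unity.Profiling;
using UnityEngine;

// Hypothetical component; it only shows where the markers would wrap the work.
public class ChunkProfilingSketch : MonoBehaviour
{
    // Each marker shows up as a named sample in the Profiler's CPU view,
    // without needing Deep Profile.
    static readonly ProfilerMarker s_MarchMarker = new ProfilerMarker("Chunk.MarchCube");
    static readonly ProfilerMarker s_BuildMeshMarker = new ProfilerMarker("Chunk.BuildMesh");

    void Update()
    {
        using (s_MarchMarker.Auto())
        {
            // ... marching work for this frame would go here ...
        }

        using (s_BuildMeshMarker.Auto())
        {
            // ... mesh building for this frame would go here ...
        }
    }
}
```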
I would argue for implementing an ObjectPool for the chunks. The player wouldn't be able to see all the chunks all the time, so you could just have an area around the player, store the chunk data for all chunks, and recycle gameobjects to show the data of the chunks
if you do that, new chunks won't allocate new gameobjects, avoiding needing to rely on the main thread for instantiate
(would also have the side effect of not needing a gameobject per chunk)
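A minimal sketch of the pooling idea, in plain C# so it runs outside Unity. SimplePool is a hypothetical name and not Unity's built-in pool type; in a real project T would be the chunk GameObject (or a component on it), and the factory would call Instantiate.

```cs
using System;
using System.Collections.Generic;

// Hypothetical minimal pool: reuse released instances before creating new ones.
class SimplePool<T> where T : class
{
    readonly Stack<T> free = new Stack<T>();
    readonly Func<T> create;

    public SimplePool(Func<T> create)
    {
        this.create = create;
    }

    // How many instances the factory has produced so far.
    public int CreatedCount { get; private set; }

    public T Get()
    {
        if (free.Count > 0)
            return free.Pop();   // recycle an instance that went out of range
        CreatedCount++;
        return create();         // only pay for creation when the pool is empty
    }

    public void Release(T item) => free.Push(item);
}
```

When a chunk leaves the area around the player, its object is released to the pool; when a new chunk comes into range, Get hands the same object back to be refilled with the new chunk's mesh data.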
got curious how the game would run if i removed the LargeChunk and it turned out about how i expected but not to the extent (both videos have the same sizes for the Chunk and World Chunk but the video with the smaller world is the one without the largeChunks)
wonder what'll happen if i do the opposite and add another Chunk size so it goes WorldChunk to LargeChunk to SmallChunk to Chunk 🤔
nope, also way too laggy to be practical. guess the 2 chunk system was the sweet spot
i learnt IJob (don't think i can use IJobParallelFor for this code) and it made no difference for performance but made my pc get used more
code with the Jobs system https://paste.mod.gg/erehoywjbzmd/3
The complete call here basically makes the main thread wait for the job to complete, making the job redundant:
JobHandle jobHandle = marchingCubeJob.Schedule();
jobHandle.Complete();
You should schedule all the jobs first before waiting for them to complete.
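A minimal sketch of "schedule all the jobs first, then wait", assuming Unity's Jobs, Collections, and Burst packages. MarchChunkJob, ChunkJobBatchSketch, and BatchSize are hypothetical names, the job body is a placeholder, and giving every job its own output array is an assumption made so that no two jobs write to the same container.

```cs
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical stand-in for the marching-cubes job from the paste.
[BurstCompile]
struct MarchChunkJob : IJob
{
    public int chunkIndex;
    public NativeArray<float> output; // one container per job (assumption)

    public void Execute()
    {
        for (int i = 0; i < output.Length; i++)
            output[i] = chunkIndex; // placeholder for the real marching work
    }
}

public class ChunkJobBatchSketch : MonoBehaviour
{
    const int BatchSize = 32; // hypothetical batch size

    void Start()
    {
        var outputs = new NativeArray<float>[BatchSize];
        var handles = new NativeArray<JobHandle>(BatchSize, Allocator.TempJob);

        // 1) Schedule every job in the batch without waiting on any of them.
        for (int i = 0; i < BatchSize; i++)
        {
            outputs[i] = new NativeArray<float>(8, Allocator.TempJob);
            var job = new MarchChunkJob { chunkIndex = i, output = outputs[i] };
            handles[i] = job.Schedule();
        }

        // 2) Block once for the whole batch instead of once per job.
        JobHandle.CompleteAll(handles);
        handles.Dispose();

        // 3) Main-thread-only work (mesh building) reads the results here.
        for (int i = 0; i < BatchSize; i++)
        {
            // ... BuildMesh-style code would read outputs[i] here ...
            outputs[i].Dispose();
        }
    }
}
```

The worker threads can run the scheduled jobs in parallel between steps 1 and 2, which is the part that is lost when Complete() is called right after each Schedule().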
Aside from that, profile!
Complete is a bit of a confusing name: the job would still complete even without you calling it, just some time later. It would be better named BlockUntilCompletion or something.
Or CompleteRightFuckingHereRightFuckingNow()
Also I noticed you do not have BurstCompile on your job
Ok but not calling it would leave the vertices list unfilled when the next function is called, which happens right after and needs it filled out
I can try though
Yeah i added that right after sharing the scripts
Nope, throws the error "The previously scheduled job writes to NativeList vertices. You must call JobHandle.Complete before you can read data from the NativeList"
Unless you want me to somehow call all of BuildMesh(); and BuildUV(); inside the job
And yeah mesh is not a value type so i can't
Yep, you either need to schedule jobs with dependencies on previous jobs or have a separate list for each.
do you know of a way to have a job call a function after it's done? that would let the main thread go on to the next chunk and not worry about calling BuildMesh() until it knows the job is complete. i.e. have CreateMeshData end at the MarchCube job and have BuildMesh called from a separate function that runs once the job is complete, if that makes sense
I don't think there're any callbacks that the job calls. You'll need to check the handle in update, async or coroutine to see if it's complete.
That being said, it's kind of a wrong approach to using jobs.
The whole point of jobs is that you can schedule them somewhere at the start of the frame, forget about them and do some other work on the main thread, and lastly collect the results later in the frame, where it is likely complete.
fair still new to learning them so not the most knowledgeable
would async / coroutine work to let the main thread go on to other scripts while the worker thread does its thing?
sorry i mean how would i implement it
It will, yes.
Well, you'll need your whole thing to be async (or a coroutine) basically, including the method that schedules and waits for the job. Then after scheduling, spin in a while loop with a yield until the handle's IsCompleted returns true.
Though, it's not ideal since I think jobs are meant to be complete within a frame, while async/coroutine don't have that many "resume points" in the frame.
like this?
while (!jobHandle.IsCompleted)
{
    await Task.Yield();
}
Yeah. You can try that I guess.
after changing it a bit it isn't throwing an error but it's also not the fastest ~6fps
but it is spawning 2.5k per frame i think
though it also doesn't have lag spikes the framerate is consistently the same
You'll need to profile.
It's likely due to the work you do on the main thread. Or because you're not actually making it async.
fair, eating currently so it'll take me a bit
https://github.com/domenkoneski/unity-jobs-callback/blob/master/JobHelperExamples.cs take a look here for a callback example
For more complex job orchestrations we use a version of this https://github.com/Nebukam/com.nebukam.job-assist