#Netcode: Excessive GhostSendSystem:SerializeJob Time

chilly pier
#

With about 1 million entities, my GhostSendSystem SerializeJob takes about 40-50 ms.

A breakdown of my entities when testing my 8192x8192 map:

  • 900k Trees
    -- Single float2 Ghosted (+ default stuff)
    -- Archetype is 440 B
    -- Static Optimization
  • 4k NPC Structures
    -- Single float2 Ghosted (+ default stuff)
    -- Archetype is 1.2 KB
    -- Static Optimization
  • 50k NPCs
    -- Single float2 Ghosted
    -- Archetype is 6.5 KB (I need to debug why one of my BufferComps is 5.5 KB ...)
    -- Dynamic Optimization

And here's the part where people will judge me:

  • 16k Terrain Segments, each of which represents a 64x64 segment of the terrain.
    -- Each of which has a Ghosted DynamicBuffer with 4096 elements in it, each with 1 ghosted byte
    -- Archetype is only 740 B because the Buffer exceeds the allotted internal capacity and is off-chunk
    -- Static Optimization and these currently never change, and only about 12-20 of these are Relevant to a given client at a time

ecuzzillo referred me to the optimization docs: https://docs.unity3d.com/Packages/com.unity.netcode@1.0/manual/optimizations.html

I'm already using GhostRelevancyMode.SetIsRelevant and I'm explicitly managing the Relevancy of ghosts. I confirmed that SetIsRelevant is enabled and I only have 400-1200 entities relevant at a time. The set of ghosts that are relevant only changes when a client moves their camera.

If I cut my generated map down to 4096x4096, my total entity count drops to ~250k and GhostSendSystem:SerializeJob time drops to 12-15 ms, even though the number of relevant ghosts does not change when I lower the map size.

From the Docs:

Serialization cost
When a GhostField changes, we serialize the data and synchronize it over the network. If a job has write access to a GhostField, Netcode will check if the data changed. First, it serializes the data and compares it with the previous synchronized version of it. If the job did not change the GhostField, it discards the serialized result. For this reason, it is important to ensure that no jobs have write access to data if it will not change (to remove this hidden CPU cost).

It sounds like this Serialization Cost is present even when no client is eligible for receiving data.

I'm going to play around with some of the GhostSendSystemData config values to see if that helps, but I'm curious why the system is serializing and checking entities that are not relevant to any client?

chilly pier
#

I see that I'm able to get some control with:

ghostSendSystemData.ValueRW.MaxSendEntities = 1500;
ghostSendSystemData.ValueRW.MinSendImportance = 1;
ghostSendSystemData.ValueRW.IrrelevantImportanceDownScale = 50;

MaxSendEntities does seem to reduce the number of entities that are serialized as part of checking them for changes but that alone results in visual lag as entities are seemingly checked bit by bit over many frames.

A MinSendImportance > 1 introduces visual issues (at least until I adjust importance values of various ghosts), but setting this to 1 gives a baseline for the next part.

Setting IrrelevantImportanceDownScale of 50-100 makes it so the 1500 entities that are checked are generally the entities I care about. So as long as I don't move my camera around, and my MaxSendEntities is high enough, this helps.

But ... this still causes a number of issues. Moving the camera around (which sets relevancy on additional entities) is too delayed due to the downscaling of previously irrelevant entities. And on my large map there are still visual issues due to the sheer number of entities that have their irrelevant check every so often.

Since I'm manually adding entities in and out of the Relevancy Set, is there a way to have the serializer only focus on the entities within that Set and never scan other chunks until a new entity is added into the Relevancy Set?

storm basalt
#

@potent parrot is really the person to answer all these questions; i'm only guessing based on the docs / source /etc

#

are you using the distance-based importance like on that page?

#

intuitively it seems like that whole grid scheme they have going on in asteroids would be a good fit

chilly pier
#

Not Unity provided distance based. Manually managed distance checks based on terrain segments. "Chunks" (not an ECS chunk) of the map are declared relevant based on camera position.

Rather than a distance check on all entities, I only do a distance check on terrain segment bounding boxes. If a segment is nearby, I query all elements in that segment (via a ZoneId Shared Component) and manually register them all as relevant.
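
A minimal sketch of the segment-level check described above, in plain C# with no Unity or Netcode types. The class and method names are hypothetical. It assumes one terrain tile equals one world unit and that a ZoneId is a row-major segment index. Because the segments form a uniform grid, the sketch computes the overlapping index range directly rather than testing every bounding box, which should select the same segments.

```csharp
using System;
using System.Collections.Generic;

public static class ZoneRelevancy
{
    // Assumption: each terrain segment covers 64x64 world units.
    public const int SegmentSize = 64;

    // Returns the zone ids of every segment that overlaps the camera rectangle
    // after growing it by `margin` on each side.
    public static List<int> RelevantZones(
        float camMinX, float camMinY, float camMaxX, float camMaxY,
        float margin, int segmentsPerSide)
    {
        int minSx = ToSegment(camMinX - margin, segmentsPerSide);
        int maxSx = ToSegment(camMaxX + margin, segmentsPerSide);
        int minSy = ToSegment(camMinY - margin, segmentsPerSide);
        int maxSy = ToSegment(camMaxY + margin, segmentsPerSide);

        var zones = new List<int>();
        for (int sy = minSy; sy <= maxSy; sy++)
            for (int sx = minSx; sx <= maxSx; sx++)
                zones.Add(sy * segmentsPerSide + sx); // hypothetical row-major ZoneId
        return zones;
    }

    // Maps a world coordinate to a segment index, clamped to the map bounds.
    private static int ToSegment(float worldCoord, int segmentsPerSide)
    {
        int index = (int)Math.Floor(worldCoord / SegmentSize);
        return Math.Clamp(index, 0, segmentsPerSide - 1);
    }
}
```

For an 8192x8192 map, segmentsPerSide would be 128. Each returned zone id would then drive a query filtered on the ZoneId shared component, and the matching entities would be registered as relevant for that connection.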

That seems more efficient than using a distance calculation of all entities, especially since I plan on removing the Transforms of entities since I only need a float2 for my 2D game.

I posted an earlier version of this the other day (trees and NPCs are not shown here): https://www.youtube.com/watch?v=mRKP8Cj309Q

#

I imagine that using Unity's provided Distance Based Relevancy will result in the same thing I'm seeing, but with the need to do entity level distance calculations.

storm basalt
#

i'll wait for niki for this one 🙂

chilly pier
#

NP, and thanks anyway!

potent parrot
# chilly pier With about 1 million entities, my GhostSendSystem SerializeJob takes about 40-50...

Hey @chilly pier (and thanks for the CC ecuzzillo!)

> With about 1 million entities, my GhostSendSystem SerializeJob takes about 40-50 ms.
For how many connected clients? Just one?

This is unfortunately a known issue. The problem is that, for each client, we need to iterate over every single chunk containing a ghost to:

  1. Filter whether or not this chunk should even be considered.
  2. If it should, sort each chunk by its priority.

See the first for loop in GhostSendSystem.SerializeJob.GatherGhostChunks.

Thus, regardless of your settings, more ghosts = more iterations. It's a scaling issue that I raised a little while back, but we don't currently have a netcode-wide solution. Please file a bug report with your project, citing GhostSendSystem performance of 40-50ms for (presumably) one player as the bug. No promises, but, with your bug report, I think I can justify bumping the priority of fixing this. I'll let you know.

> Since I'm manually adding entities in and out of the Relevancy Set, is there a way to have the serializer only focus on the entities within that Set and never scan other chunks until a new entity is added into the Relevancy Set?
Unfortunately no. That is exactly the problem, and likely also the fix.

#

> Moving the camera around (which sets relevancy on additional entities) is too delayed due to the downscaling of previously irrelevant entities.
The solution here is to have the AABB which determines relevancy be larger than the camera's AABB. E.g. 50 units larger in each direction. That way, even if your camera moves to the right (for example), you'll have entities already set to relevant and received by the client before you need them. The minimum buffer distance is determined by camera speed (in units per second) * max expected ping (in seconds). E.g. 20 units of movement per second * 0.35s (350ms) for max ping = ~7 units.
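
The buffer-distance arithmetic above can be written as a small helper. This is a sketch in plain C# with hypothetical names, not part of Netcode:

```csharp
using System;

public static class RelevancyMargin
{
    // Distance the camera can travel while newly relevant ghosts are still in flight.
    public static float Units(float cameraUnitsPerSecond, float maxPingSeconds)
    {
        return cameraUnitsPerSecond * maxPingSeconds;
    }

    // Grows a camera extent (min, max) on one axis by the margin on both sides.
    public static (float Min, float Max) Grow(float min, float max, float margin)
    {
        return (min - margin, max + margin);
    }
}
```

With the numbers from the message, Units(20f, 0.35f) gives roughly 7 units. A larger value such as 50 simply leaves more headroom for latency spikes and faster camera movement.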

If your camera can teleport, then there will ALWAYS be a delay. It's cheesy, but you could have some full-screen blocking UI saying "calibrating" every time your camera teleports outside the range of relevancy. It's a hard problem to solve.

> Rather than a distance check on all entities
Note that, for "Distance based Importance", netcode also doesn't do a distance check on all entities. It:

  1. Puts every single entity into a spatially-located chunk (using its LocalTransform.Position to calculate a tile index).
  2. Performs a single distance calculation per spatially-located chunk.

The interesting thing is that this makes calculating relevancy easier (as entities are now spatially chunked). It still won't help much in your situation though. Your performance problem is before you even get to the importance scaling.
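
To illustrate the two steps above, here is a sketch in plain C#. It is not Netcode's actual implementation: the floor-based tile formula, the 2D simplification, and all names are assumptions made for the example.

```csharp
using System;

public static class DistanceTiles
{
    // Step 1: tile index for a position. Shift by the tile center,
    // divide by the tile size, and round down.
    public static (int X, int Y) TileOf(float x, float y, int tileSize, int tileCenter)
    {
        return ((int)Math.Floor((x - tileCenter) / tileSize),
                (int)Math.Floor((y - tileCenter) / tileSize));
    }

    // Step 2: one distance per tile instead of one per entity.
    // Measured here as the number of tiles between two tile indices.
    public static int TilesApart((int X, int Y) a, (int X, int Y) b)
    {
        return Math.Max(Math.Abs(a.X - b.X), Math.Abs(a.Y - b.Y));
    }
}
```

Every entity that lands in the same tile shares one distance result, so the cost scales with the number of occupied tiles rather than the number of entities.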

#

Oh, and just to rule out the obvious: It's 40-50ms with Burst ON, right? And is your editor in Debug or Release?

chilly pier
#

Thank you so much! Lots of good stuff there. I'm holding off on replying until I have time to truly dive in and respond properly. Just wanted to say thanks for now.

potent parrot
#

Just to add: The only viable thing you can do today to fix this problem is to reduce the total quantity of ghost chunks (by either a: reducing or b: disabling most entities).
Therefore, if I were you, I would write a system that determines how far each spatially chunked chunk is from the closest client, then disable the entire chunk if it gets further than X number of chunks away.

If you have update systems that need to operate on those chunks, it gets much trickier.
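
A sketch of the decision that such a system would make per spatial tile, in plain C# with hypothetical names. It only answers "is this tile too far from every client?"; actually disabling the entities in a tile is left out.

```csharp
using System;
using System.Collections.Generic;

public static class FarTileCulling
{
    // Number of tiles between two tile indices (Chebyshev distance).
    public static int TileDistance((int X, int Y) a, (int X, int Y) b)
    {
        return Math.Max(Math.Abs(a.X - b.X), Math.Abs(a.Y - b.Y));
    }

    // True when no client is within maxTiles of the tile.
    // Also true when there are no clients at all.
    public static bool ShouldDisable(
        (int X, int Y) tile,
        IReadOnlyList<(int X, int Y)> clientTiles,
        int maxTiles)
    {
        foreach (var client in clientTiles)
        {
            if (TileDistance(tile, client) <= maxTiles)
                return false; // close enough to at least one client
        }
        return true;
    }
}
```

In practice the re-enable threshold would probably be a little smaller than the disable threshold, so that tiles on the boundary do not flip state every frame.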

covert pagoda
#

It would be nice for us to be able to remove ghosts for all the netcode ghost systems so that it can continue to be simulated on server but not have the overhead.

#

Even a simple DisableGhost tag component, similar to the DisableRendering solution that rendering has

#

I know structural changes are not ideal but I imagine this type of disable/enabling would be infrequent and ideally could be handled by the user by batch operations based off their architecture.
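query.SetSharedComponentFilter(new sector { value = 1 }); // sector: a user-defined shared component
EntityManager.AddComponent<DisableGhost>(query); // DisableGhost: the proposed tag, not an existing Netcode type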
kind of deal

potent parrot
#

Agreed. My own view is to make relevancy available on a per chunk basis, per client, then use that to allow user code to opt in (which, in a large world, is very likely to be a tiny subset of all chunks). Open to thoughts, of course.

chilly pier
# potent parrot > Moving the camera around (which sets relevancy on additional entities) is too ...

Sorry, my household has been debilitated by a Spring Flu (apparently that's a thing).

> For how many connected clients? Just one?

Yes, I see the issue with just one client (ClientAndServer)

> The solution here is to have the AABB which determines relevancy to be larger than the camera's AABB. E.g. 50 units larger in each direction. That way, even if your camera moves to the right (for example), you'll have entities already set to relevant and received by the client before you need them.

Yes, I'm doing this already and it works well enough if I don't adjust GhostSendSystemData configs. If I try to limit MaxGhosts and try to make irrelevant ghosts less important, then it becomes difficult to balance.

> If your camera can teleport, then there will ALWAYS be a delay. It's cheesy, but you could have some full-screen blocking UI saying "calibrating" every time your camera teleports outside the range of relevancy.

Yes my camera can teleport, but for the most part it will move in a more standard way. I will certainly always have some degree of needing to load in entities and this "calibrating" idea is something that will be useful for me. Thanks for that suggestion!

> Note that, for "Distance based Importance", netcode also doesn't do a distance check on all entities. It:
>
>   1. Puts every single entity into a spatially-located chunk (using its LocalTransform.Position to calculate a tile index).
>   2. Performs a single distance calculation per spatially-located chunk.

Thanks for those details! I may have to give that a shot! Since I'm using a Shared Component that specifies the zone an entity is in, I was hoping that would help group my ghosts into chunks based on relevancy also. But it sounds like Netcode further optimizes when using the Distance based Importance functionality.

I want to consider completely removing the Unity Transform because my game is 2D. Since I'm using a mix of my own rendering + NSprites, I only need a float2 for position (plus sprite direction, animation state information, etc.). I may have to rethink the idea of removing the Unity Transform if Netcode has optimizations that leverage it.

#

> Oh, and just to rule out the obvious: It's 40-50ms with Burst ON, right? And is your editor in Debug or Release?

Correct. Burst is on, I confirm the job is bursted in the Profiler, and my editor is in Release mode. I also tested with a Standalone Build with and without dev mode (without deep profiling) and my FPS and profiler data is roughly the same.

> The only viable thing you can do today to fix this problem is to reduce the total quantity of ghost chunks (by either a: reducing or b: disabling most entities).

While I am currently testing based on my desired map size, I have accepted that I may need to reduce my scope based on optimization limits. But as for disabling entities, my game is a simulation game and it's important to fully simulate the world even if no player is looking at that part of the world.

+1 to what Tertle said:

> It would be nice for us to be able to remove ghosts for all the netcode ghost systems so that it can continue to be simulated on server but not have the overhead.

+1 to what Niki said also:

> Agreed. My own view is to make relevancy available on a per chunk basis, per client, then use that to allow user code to opt in (which, in a large world, is very likely to be a tiny subset of all chunks). Open to thoughts, of course.

My mindset was also that I would opt in entities on a per client basis. I don't mind managing this myself and I also don't mind either enabling/disabling Ghost Components or marking entities as Relevant or not (per client).

> Please file a bug report with your project, citing GhostSendSystem performance of 40-50ms for (presumably) one player as the bug. No promises, but, with your bug report, I think I can justify bumping the priority of fixing this. I'll let you know.

Will do! My repository is currently private though and you know ... my game is the next million dollar money maker and I don't want everyone stealing my super awesome ideas. Some options:

  • I'd prefer not to make it public, but I will if it will be truly helpful
  • I could setup a demo that shows only the problem, but I do not know when I'll have time for that
  • I can add access for individuals
  • I could fork my repository and delete my gameplay content and then make that public
potent parrot
#

> I may have to rethink the idea of removing the Unity Transform if Netcode has optimizations that leverage it.
You can actually modify distance importance to be 2D. Docs should explain.

> My repository is currently private...
Oh, I actually don't know if bug reports from the editor help menu make uploaded projects publicly downloadable. Obviously Unity employees have access, but I'll need to confirm re public access.

If you want to just describe the bug without a project, that's fine too. It's a very easy to repro problem (e.g. in our Asteroids sample, just up the asteroid count to 1m).

chilly pier
#

Okay I'll start with a bug report that just describes the problem. If more is needed beyond that, I'll do what I can.

chilly pier
#

Submitted via the Unity Editor. I didn't get an email confirmation and I don't see it on Issue Tracker, so I assume it needs to go through an intake process. Once I get a link to it, I'll link it here.

storm basalt
#

check spam folder?

#

@chilly pier ^

chilly pier
#

I tried out the distance based importance and it crashed my performance more than my previous examples. ECB Playback within GhostDistancePartitioningSystem was taking 80+ ms on my smaller 4096x4096 map (I didn't even test the larger map).

I feel like I must have done something wrong, but all I did was:

  • Add a GhostConnectionPosition to the SourceConnection within the Server Side of GoInGame.
  • And added logic in my Server Side Player Cursor System (which reads an IInputComponentData sent from the Client) to set the GhostConnectionPosition for that player.
  • Added the setup seen below for GhostDistanceData and GhostImportance.
if (!isGhostDistanceStuffSetup) { 
    var gridSingleton = state.EntityManager.CreateSingleton(new GhostDistanceData {
        TileSize = new int3(64, 64, 1),
        TileCenter = new int3(32, 32, 0),
        TileBorderWidth = new float3(1f, 1f, 1f),
    });
    state.EntityManager.AddComponentData(gridSingleton, new GhostImportance {
        ScaleImportanceFunction = _scaleFunctionPointer,
        GhostConnectionComponentType = ComponentType.ReadOnly<GhostConnectionPosition>(),
        GhostImportanceDataType = ComponentType.ReadOnly<GhostDistanceData>(),
        GhostImportancePerChunkDataType = ComponentType.ReadOnly<GhostDistancePartitionShared>(),
    });
    isGhostDistanceStuffSetup = true;
}
potent parrot
#

Ah, sorry, you ran into a bug that we have since found (and fixed) for the next release. Add:

        [WithNone(typeof(GhostDistancePartitionShared))]

to the struct definition for the IJobEntity: GhostDistancePartitioningSystem.AddSharedDistancePartitionJob.

#

I would also personally recommend increasing the TileBorderWidth from 1f to 4, 8, or even 16, and double check that your TileSize is appropriate for your map's scale.
You can use the Entities > Archetypes window to see how full your chunks are.

chilly pier
#

I was just about to pull the code into my local packages and make that exact change!

potent parrot
#

Yeah, sorry 'bout that!

chilly pier
#

All good, it was satisfying to find it and then look here and see you confirm what I was seeing

potent parrot
#

Oh, you should also be able to schedule the AddSharedDistancePartitionJob as ScheduleParallel. The ECB playback is unfortunately single-threaded, nothing we can do about that. But it should be significantly less common now.

chilly pier
#

So with that fixed, I'm still not having much luck with the distance based functionality. I'm seeing no change in performance compared to my original test.

Note, I don't see GhostSendSystemData config values being changed in the sample repository: https://github.com/Unity-Technologies/EntityComponentSystemSamples/tree/40f18d0e4663674c360e59154942abd8301b1957/NetcodeSamples/Assets/Samples

I tried with and without customizing GhostSendSystemData MaxSendEntities and MinSendImportance values.

  • Without customizing these values, everything seems to function as it does in my original base example. The performance GhostSendSystem SerializeJob is the same (~40ms).
  • With customized values for MaxSendEntities (2500) and MinSendImportance (1), the importance of entities seems unaffected by the Distance calculations and the entities near my player are not prioritized in any way. I'm pretty sure I have it setup correctly, but maybe I messed something up.
  • Bumping the MinSendImportance up to 20 results in clear visual lag indicating that entities near my cursor are not more important than other entities.

I think I'm going to take a step back from the Distance based calculation stuff.

#

I did confirm that the runtime inspector shows the GhostConnectionPosition component on my NetworkConnection (server side) and confirmed that the position is being updated correctly.

#

oh shoot ... my entities in the world don't actually have a position set on them 😛 I forgot that I'm not using Transforms for their position. Though that should mean they're all important when the player is near (0,0)

#

Well that explains that at least. Maybe it would work better for me if I do set their transforms appropriately.

potent parrot
#

Note that DistanceImportanceScaling does not upscale importance of nearby entities. It downscales the importance of far entities.

Instead of setting MinSendImportance, you may benefit more from setting:


        /// <summary>
        /// The minimum importance considered for inclusion in a snapshot after applying distance based
        /// priority scaling. Any ghost importance lower than this value will not be send every frame
        /// even if there is enough space in the packet.
        /// </summary>
        public int MinDistanceScaledSendImportance;

See how it's implied inside GhostSendSystem.cs:1419.

> I forgot that I'm not using Transforms for their position.
Ah, yeah. They are excluded from distance based importance downscaling entirely then. You'll need to duplicate GhostDistancePartitioningSystem to support your custom Transform types.

#

CAVEAT: GhostDistanceImportance does multiply ALL importance values by 1k, so you may actually be getting errors as a result if a subset of your entities does not go through this method at all (they won't receive that 1k boost). EDIT3: It's actually a bug yeah.

chilly pier
#

Thanks, I appreciate the support. I may spend some time thinking about how to work around the system rather than with it. I would still prefer that the ghost system only look at entities that I have declared relevant. I may see what the cost of adding/removing ghost components on the fly would be, or look into having entities that are for the simulation and then separate entities that are for ghosting and rendering and then I can spawn in the ghost/rendering entities only when a client wants to view them (though that sounds overly complicated).

potent parrot
#

No worries 🙂

> I may spend some time thinking about how to work around the system rather than with it.
Yeah, unfortunately you will have to.

> I may see what the cost of adding/removing ghost components on the fly would be,
This will likely break in horrible ways. My guess is that you can't mess with internal Netcode types at runtime, as it'll break a lot of built-in assumptions.

> or look into having entities that are for the simulation and then separate entities that are for ghosting and rendering and then I can spawn in the ghost/rendering entities only when a client wants to view them (though that sounds overly complicated).
This was my first thought too. E.g. Have a ConvertToGhost component, and when you get within range of any player:

  1. Create a Ghost version of it, copying over all relevant components as you spawn it.
  2. Delete the non-ghost version on the same frame.
  3. Patch up all Entity references.

Then do the reverse when they move out of range (+margin).

#

You wouldn't need to split by simulation though, only proximity. On the client, you need them all to be the Ghost versions for Prediction to work.
Obviously, ideally, we'd just optimize enough that we can support this scale, so you wouldn't have to do anything.

#

Actually yeah you make a good point. I think it might be more viable to have bespoke ghosts, that each collect a bunch of simulation entities, as a form of "gameplay LOD" thing. Like video compression, where you only replicate a lower fidelity, lossy version of the actual data.

worthy hamlet
#

UnityChanOops Wow. Looks really complicated. I will wait for an official solution to the scaling issue.

potent parrot
# chilly pier What exactly do you mean by this?

To give an example: Your 900k trees: You could only have a ghost for a tree that is not in a deterministic, pre-baked (or precalculated) state. Thus, 900k ghost trees becomes n modified trees.
I think Minecraft does this for its map chunk generation. Each client deterministically calculates the entities in the world, then they only replicate changes.

#

I.E. If trees can only be in X number of states, only replicate the X - 1 least common states.

#

E.G. (Simplified) If trees are normally standing, then only replicate all "chopped" trees. A replicated "chopped" tree will disable any "standing" tree that it spawns on top of (using ID, or even location).
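
A sketch of the override idea in plain C#, with hypothetical names and no Netcode types. It assumes every tree has a stable id that both server and client can derive deterministically, and that "standing" is the default state that never needs to be replicated.

```csharp
using System;
using System.Collections.Generic;

public enum TreeState : byte { Standing = 0, Chopped = 1 }

public sealed class TreeOverrides
{
    // Only trees that differ from the deterministic default are stored.
    private readonly Dictionary<int, TreeState> _overrides = new Dictionary<int, TreeState>();

    // Called when a replicated override arrives or changes.
    public void Apply(int treeId, TreeState state)
    {
        if (state == TreeState.Standing)
            _overrides.Remove(treeId); // back to default, nothing left to replicate
        else
            _overrides[treeId] = state;
    }

    // Trees without an override are in the default state.
    public TreeState StateOf(int treeId)
    {
        return _overrides.TryGetValue(treeId, out var state) ? state : TreeState.Standing;
    }

    // Number of trees that would need a ghost under this scheme.
    public int ReplicatedCount => _overrides.Count;
}
```

With 900k generated trees and a handful chopped, ReplicatedCount stays at that handful, which is the "900k ghost trees becomes n modified trees" reduction described above.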

chilly pier
#

Interesting. I hadn't yet decided if trees would always be in the same place when they respawned or if I would let trees spawn/respawn more organically. If the locations of the trees is fixed, then I could spawn/despawn ghosts that represent an override state.

What you describe is what I was leaning towards for my terrain. Currently my terrain tiles are all dictated by the server, but I was leaning towards clients deterministically knowing the starting point of the terrain, and then only replicating changes. It might lead to the changes being applied after some visual lag, but it will remove the need for me to replicate the full map.

potent parrot
#

> If the locations of the trees is fixed
Or deterministically random!

> Currently my terrain tiles are all dictated by the server, but I was leaning towards clients deterministically knowing the starting point of the terrain, and then only replicating changes.
This is a fantastic bandwidth optimization too, regardless of whether or not it's needed.

> It might lead to the changes being applied after some visual lag, but it will remove the need for me to replicate the full map.
It shouldn't have to, especially as fewer ghosts globally increases the odds of any individual ghost being replicated "on time".

gusty bolt
#

Replicating 900K trees or terrains without some random but deterministic baseline is unreasonable by default I would say.
The ideal case scenario for terrain or any generative solution is to have something that is always automatically generated on both server and client side (it can be as simple as a function approximation of the terrain/tree placement, using seeds or whatever you like), with everything sent the first time as a delta against it. That will result in good bandwidth savings (or no bandwidth at all when nothing has changed, for which we should have slightly better support).

covert pagoda
#

We'd like to be able to support 1 mill+ ghosts as well - but only need a tiny fraction to be replicated at a time but still simulating.
I'd just like a way to exclude them from the various ghost systems until they're required to be replicated/etc.
Having to duplicate everything with a ghost vs non-ghost version sounds very painful

#

Niki's chunk level relevance suggestion is probably good enough, but would need to test

#

hmm maybe 1mill is a bit excessive, let's think more 100k+

gusty bolt
#

100 players is nothing honestly. It is a pretty small workload. Unless you have a crazy character that is 20k+ on its own.
As @worthy hamlet almost has 😄
For reaching 1m ghosts you need to change a couple of things.
You can indeed use relevance per chunk, but that solves only part of the equation. You still need to go through all chunks, or all entities, anyway. Unless you build a hierarchical data structure (can be a quadtree, AABB tree, octree, whatever) and accelerate things on your side a bit.
Entities does not provide support for that at the moment.
You still need to modify the iteration loop that gathers and prioritises chunks, and go wide on that instead of on players for better scalability.
That requires splitting send into two stages:

  • Gathering phase
  • Send phase

So it is easy to put yourself in the middle of it. All of this requires changes and thought.
There are also data costs to be considered here. The bookkeeping per chunk is pretty big on the server and tracks a lot of info "per entity".

In general, though, with 1m ghosts, relevancy and scaling must be tuned and written ad hoc to fit the game's needs.

covert pagoda
#

100% at any type of crazy scale like this it has to be on the game to manage it.
i'd just like to be able to do this while integrating into netcode while avoiding package changes if possible.

#

I currently have a spatial partitioning rebuilt per frame that can handle ~200k entities individually which I use for relevancy

gusty bolt
#

@covert pagoda yes, this is what you need to do. The key factor is now being able to use this spatial partitioning of yours directly to filter out the chunks

covert pagoda
#

But if this can be sped up using chunks instead of individual entities that alone would be huge

gusty bolt
#

This is the key factor here: right now we don't provide that hook. We have a "default" strategy for that. The key here is letting users tell us which chunks we need to consider, using optimised (for their game) spatial partitioning or other accelerator structures.

#

That on top of relevancy of course.

worthy hamlet
#

Anyway, I guess the official team will need to solve the current relevancy scaling issue first. If I understand correctly, the more ghosts you add, even if most of them are outside relevancy range, the more performance still drops significantly.

gusty bolt
#

We are just speaking about possibilities here.

#

Indeed there are scalability needs and it is our responsibility to provide a good framework for handling the game scale you need

worthy hamlet
#

Oh yeah, not sure if you still remember this or not. There's a temp fix that splits each new ghost type into a new chunk, even if it's the same archetype, on both client and server. I guess ghosts of the same archetype will need to go back to sharing chunks first to make it performant and open up more optimization opportunities

chilly pier
#

Just to add to the recent conversation, I agree that replicating 1 million entities is unreasonable but selective replication should be doable.

In my case, only 400-1000 entities are relevant in my current scenario. The bottleneck I'm currently seeing is that the Netcode implementation is seemingly serializing all entities to see which should be replicated even when an entity is not relevant and therefore doesn't need to be serialized.

Outside of Netcode, I'm able to query against my 900k trees using a SharedComponent filter that acts as a ZoneId. While I'm thinking more narrowly about my particular case, it should be manageable to limit replication consideration to only entities that are currently relevant.

Thinking more generally, the request is that Netcode should provide a means for respecting relevance in a strict way. The implementation is up in the air, but likely involving either the ability to know easily if a chunk contains relevant entities, restructuring of entities to keep them organized into chunks that are relevant, or some sort of cache to avoid looping over all chunks every frame.

A risk I see is that excessive changes to what is and isn't relevant would be counter to this design.

#

Splitting the request into two:

  1. Chunks/Entities that are not relevant should not be serialized, checked for changes, or be subject to any expensive operation.
  2. It would be nice if the selection process was optimized to focus the search on chunks that contained relevant entities.

I need to do some more digging, but it looks like even item #1 would have substantial benefits for me.

Forgive me if my ignorance of the implementation is causing me to oversimplify the situation.

potent parrot
#

You're broadly right BoostHungry, except that we don't serialize those entities: It's simply the processing involved in deciding WHICH chunks to pass into the serializer that gets prohibitively expensive.

chilly pier
#

In my testing "PrioritizeChunks" is expensive, but most of the time is spent doing this:

_profilerMarkerSeven.Begin();
// TODO: This is slow...
serializeResult = serializerData.SerializeChunk(serialChunks[pc], ref dataStream,
    ref updateLen, ref didFillPacket);
_profilerMarkerSeven.End();

In my medium map example (250k entities) 3ms is spent doing "PrioritizeChunks" while 6ms is spent doing the above. On my large map (1M entities), 14 ms is spent doing "PrioritizeChunks" and 48 ms is spent doing the above.

#

If the number of relevant entities is remaining constant between my two tests, I would expect PrioritizeChunks to become more expensive with more entities, but I would expect the serialization cost to remain fairly constant. One explanation could be more entities == more fragmentation == more chunks to serialize, but I would expect my Shared Component to counter fragmentation.

#

(note, I haven't yet dove into what serializerData.SerializeChunk is doing)

potent parrot
#

> In my medium map example (250k entities) 3ms is spent doing "PrioritizeChunks" while 6ms is spent doing the above. On my large map (1M entities), 14 ms is spent doing "PrioritizeChunks" and 48 ms is spent doing the above.
Per client?? Have you enabled optimizations like Burst + Release Editor config? Those sound particularly high.

I also recommend reading the optimizations.md manual page to see optimizations for serialization.

It boils down to: Send less stuff every frame. I.e., unless you're using some known slow paths (e.g. serializing children), you're likely paying a high serialization cost simply because you're serializing a large quantity of entities. The Multiplayer > NetDbg tool should be very useful for understanding what goes into each snapshot packet.

Depending on your game's requirements, you'll have a specific budget allocated per player.

chilly pier
#

Yes, it looks to be properly Burst-compiled. Everything after the PrioritizeChunks marker is a long run of my own markers, each indicating a repeated call into the code block I pasted above.

chilly pier
# potent parrot > In my medium map example (250k entities) 3 ms is spent doing "PrioritizeChunks" ...

That looks to be an internal GitHub link. As I mentioned, I only have 400-1000 Entities Relevant and that number does not change between my different size test maps. I tried limiting the Entities sent via GhostSendSystemData configuration (e.g. MaxSendEntities and Importance factors); I wrote about that here: https://discordapp.com/channels/489222168727519232/1094078364031143936/1094083253117395015

My goal: I have 400-1000 Entities Relevant regardless of the map size. Note that Relevant Entities should be organized in Chunks correctly because Relevant Entities will share a SharedComponent ZoneId value. Increasing my overall Entity count (but not my Relevant Entity count) may result in increased cost finding Chunks that contain Relevant Entities, but should not result in increased serialization cost because it should not be serializing Entities that are not Relevant.

What I'm seeing is that increasing the overall Entity count increases the cost of Serialization related processes even though the number of Relevant Entities is unchanging (and I assume fragmentation isn't a root cause either).

worthy hamlet
# potent parrot > In my medium map example (250k entities) 3 ms is spent doing "PrioritizeChunks" ...

From what I understand, on both client and server, current Netcode stores only one ghost type per chunk, even when another ghost type has the same archetype. That makes it quite slow, because each chunk iteration covers only one ghost type instead of several at once. I think this loses a lot of optimization opportunities. Does the team have any plans to let multiple ghost types with the same archetype share a chunk again?

potent parrot
#

Ah whoops, apologies!
Looking at your profiler capture (thank you!), yeah this is very odd. I'd very much appreciate you uploading your repro project, along with quick repro steps (e.g. scene, any editor config peculiarities etc). We may have multiple places in there where we do not scale well.

chilly pier
#

Let me see if I can create a new project that trims everything down to just this specific concept and see if I still see the issue. Then that will be easier to share.

potent parrot
#

Cheers. Large project is fine too if that's too much of a faff.

chilly pier
#

I have an hour while watching a live stream at work, so I'll see if I can get a POC setup in that hour 😛

chilly pier
#

@potent parrot here you go: https://github.com/BoostHungry/NetcodeSerializeIssueRepro

Camera starts near 0,0 and it spawns ~1M "trees" expanding out to 8192x8192. You can move the camera around with WASD (+Shift) or drag with middle mouse button. Mouse scroll wheel zooms.

In the Scene view you can see "trees" spawning in as they get added as relevant and a debug log will tell you how many are relevant when new zones are declared relevant.

This trimmed down version does a bit better than my original example but still demonstrates the problem:

  • 40 FPS
  • ~25 ms for GhostSendSystem:SerializeJob (Burst)
  • 4-6 ms for "PrioritizeChunks"
#

Note, you have to select the Sample Scene and you should set the Scene view to 2D to get a good view of the "trees" spawning in/out (may need to toggle it off then on again). I just tested with a fresh Git Clone and it seems to work out of the box.

gusty bolt
# worthy hamlet From what I understand at both client and server, current netcode will only stor...

We do store ghosts with the same archetype (but a different prefab) in different chunks at the moment, yes, on both server and client.
We can lift that on the client, with some work, a little more easily than on the server. But it is absolutely doable on both sides to improve chunk utilization.
The problem is still the same though: even if they are different archetypes, if you utilise chunks badly (big entities, big components, bad splitting because of shared components, etc.) the number of chunks can still be very high.

chilly pier
#

So to me it looks like the act of checking if a chunk is relevant and if the clients have ack'd a possible relevancy change is taking the time. Out of 34087 chunks that pass through the GhostChunkSerializer.SerializeChunk call, 34078 of them are skipped due to valid irrelevancy. But despite the vast majority of these being skipped, the loop over the chunks for this process is taking 30 ms.
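
For scale, 30 ms across 34087 chunks is roughly 0.9 µs per chunk, so the cost appears to be the visit itself rather than any serialization. The sketch below is a plain C# model of that effect, not Netcode's code: `ChunkInfo`, `ScanAll`, and `ScanRelevantOnly` are hypothetical names, and the pre-built relevant index is only an assumption about what a cheaper path could look like.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a ghost chunk; not a Netcode for Entities type.
public struct ChunkInfo
{
    public int Id;
    public bool HasRelevantGhosts;
}

public static class RelevancyScanModel
{
    // Visits every chunk and skips the irrelevant ones: work is O(total chunks).
    public static int ScanAll(List<ChunkInfo> chunks, out int visited)
    {
        visited = 0;
        int serialized = 0;
        foreach (var chunk in chunks)
        {
            visited++;                              // per-chunk bookkeeping is always paid
            if (!chunk.HasRelevantGhosts) continue; // the skip still costs a visit
            serialized++;
        }
        return serialized;
    }

    // Visits only chunks listed in a pre-built index: work is O(relevant chunks).
    public static int ScanRelevantOnly(List<ChunkInfo> chunks, List<int> relevantIds, out int visited)
    {
        visited = 0;
        int serialized = 0;
        foreach (int id in relevantIds)
        {
            visited++;
            if (chunks[id].HasRelevantGhosts) serialized++;
        }
        return serialized;
    }

    public static void Main()
    {
        const int totalChunks = 34087;   // chunks reaching SerializeChunk, from the message above
        const int skippedChunks = 34078; // chunks skipped as irrelevant, from the message above
        int relevantCount = totalChunks - skippedChunks;

        var chunks = new List<ChunkInfo>(totalChunks);
        var relevantIds = new List<int>();
        for (int i = 0; i < totalChunks; i++)
        {
            bool relevant = i < relevantCount; // which chunks are relevant is arbitrary in this model
            chunks.Add(new ChunkInfo { Id = i, HasRelevantGhosts = relevant });
            if (relevant) relevantIds.Add(i);
        }

        int fullSerialized = ScanAll(chunks, out int visitedAll);
        int indexedSerialized = ScanRelevantOnly(chunks, relevantIds, out int visitedRelevant);
        Console.WriteLine($"full scan: visited {visitedAll}, serialized {fullSerialized}");
        Console.WriteLine($"relevant-only scan: visited {visitedRelevant}, serialized {indexedSerialized}");
    }
}
```

Run as a console program, the full scan reports 34087 visits for 9 serialized chunks, while the relevant-only scan reports 9 visits for the same 9 chunks.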

chilly pier
#

So I migrated all my entities to using a dual-entity setup: A long lived server side simulation entity, and a spawned-on-demand client side ghost entity.

Both GhostSendSystem and the Serialize Job are down to under 0.1 ms, generally around the 0.01 ms range.

This is much better, but it introduces the complexity of maintaining a server and a client entity (and prefab), spawning client entities, copying over initial data, and then copying dynamic data (like position) every frame.

On the positive side though, this approach has the added benefit of slimming down my archetypes since server entities don't need ghost and rendering components on them and client entities don't need all logic based components that don't contain ghosted data.

All in all, it would be nice to have support for many ghost entities with a way to completely disable ghost components, but I will proceed with this approach of managing dual entities.

I understand the complexity of needing to spawn/despawn ghosts and the need to ensure Ack of those actions. It would be nice if once a ghost entity is determined to be Ack'd as successfully despawned, it can live off in the shadows without needing inspection until a time when it is declared relevant again.
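
A minimal sketch of the per-frame copy step in the dual-entity setup described above, assuming Entities 1.0 and Netcode for Entities 1.0. `SimPosition`, `GhostPosition`, `SimulationSource`, and `CopySimToGhostSystem` are hypothetical names for illustration only; spawning and despawning the ghost entities as zones become relevant is not shown.

```csharp
using Unity.Burst;
using Unity.Entities;
using Unity.Mathematics;
using Unity.NetCode;

// Hypothetical: authoritative position on the long-lived, non-ghost simulation entity.
public struct SimPosition : IComponentData
{
    public float2 Value;
}

// Hypothetical: replicated position on the spawned-on-demand ghost entity.
public struct GhostPosition : IComponentData
{
    [GhostField] public float2 Value;
}

// Hypothetical: link from a ghost entity back to its simulation entity.
public struct SimulationSource : IComponentData
{
    public Entity Value;
}

// Runs on the server only and copies dynamic data onto the ghost entities.
[BurstCompile]
[WorldSystemFilter(WorldSystemFilterFlags.ServerSimulation)]
public partial struct CopySimToGhostSystem : ISystem
{
    [BurstCompile]
    public void OnUpdate(ref SystemState state)
    {
        // Read-only lookup into the simulation entities.
        var simPositions = SystemAPI.GetComponentLookup<SimPosition>(true);

        foreach (var (source, ghostPosition) in
                 SystemAPI.Query<RefRO<SimulationSource>, RefRW<GhostPosition>>())
        {
            // Skip ghosts whose simulation entity no longer exists or lacks the component.
            if (simPositions.TryGetComponent(source.ValueRO.Value, out var sim))
            {
                ghostPosition.ValueRW.Value = sim.Value;
            }
        }
    }
}
```

Taking `RefRW<GhostPosition>` gives this system write access to a GhostField every frame, which the docs quoted earlier flag as a change-check cost, so it may be worth restricting the query to ghosts whose source actually moved.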

gusty bolt
#

We do support disabling ghost components (actually removing them) from the prefab, both at authoring time (when possible) and at runtime (when they are processed).
It is just a matter of configuring that with the GhostComponentAttribute (via code) or on a per-prefab basis (via GhostAuthoringInspectionComponent).
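
As a rough illustration of the code route, assuming Netcode for Entities 1.0: the `PrefabType` property of `GhostComponentAttribute` controls which prefab variants keep a component. The two component types below are hypothetical. Note that this is a per-type setting applied to the prefab, not something toggled on individual instances.

```csharp
using Unity.Entities;
using Unity.NetCode;

// Hypothetical server-only logic component: kept on server ghost prefabs,
// stripped from client ghost prefabs.
[GhostComponent(PrefabType = GhostPrefabType.Server)]
public struct NpcDecisionState : IComponentData
{
    public float NextThinkTime;
}

// Hypothetical client-only presentation component: kept on client ghost prefabs,
// stripped from server ghost prefabs.
[GhostComponent(PrefabType = GhostPrefabType.Client)]
public struct TreeSwayState : IComponentData
{
    public float Phase;
}
```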

When building a server, this is all done automatically at baking time, with nothing left to strip at runtime.
When building a client, the majority of the work is also done at baking time.

Only when building a client/server build do we need to do the stripping at runtime. And indeed, that still means the prefab is loaded as-is and the components are then stripped away, creating more "archetypes" in that sense.
A possible solution is to make all these strippable components IEnableableComponent and, instead of removing them, actually disable them.
However, that may have implications for how you write jobs, queries, etc.
Today this would not work out of the box; it still requires some Netcode package changes.

chilly pier
# gusty bolt We do support disabling ghost component (actually removing them) from the prefab...

I'm not entirely following. You're talking about the prefab, but do you also mean individual instances of a prefab can have the Ghost Component disabled/removed?

I don't want to modify the prefab (and therefore all instances of that prefab). I want to remove/disable the Ghost Component on specific entities (in a way that removes those entities from checking/consideration within Ghost Systems).

To be clear, I might instantiate an entity and it should have the Ghost Component disabled at the beginning, then later that entity might need its Ghost Component enabled, then later it might need to be disabled again.