#bevy_replicon


dire aurora
#

My crate doesn't do that at all. It just makes the packet without any regard to who it gets sent to later. So even with 500000 clients it just serializes each entity/bundle once (with the exception of the new connection stuff which I need to improve cause visibility would behave similarly and change way more often)

spring raptor
#

Yes. But if you're talking about actual WorldDiff creation - it sucks, I know. I should do it differently.

dire aurora
#

Resend timer is simpler by itself, but having acks to bypass that timer would work with the changes already needed for visibility 🤔

spring raptor
dire aurora
#

Also I guess realistically things like sending the whole bundle vs just changes is kind of a later stage optimization? One with hard to measure impact since broadcasting the same to everyone could offset the benefits you get from only sending the minimum number of changes

spring raptor
dire aurora
#

Yea the current replicon approach is quite different. It would get a bit more complex with separate packets, but it should still be possible to do some kind of ack setup, that would at the very least prevent resends that aren't necessary anymore ... One case I can think of is: You stand in the town of an MMO, and 70% of the people are just AFK or only chat. Would be a bit of a waste to send their data till the end of time 🤔

#

And for time critical things that don't change often you can then just do unreliable with a resend time of 0, it would do what replicon does if we ignore the diffing

spring raptor
#

Looks like it depends on how hard it is to pack an individual message for each client?

dire aurora
#

Yea, and I guess also how expensive serialization of a message is. There's basically 3 steps:

  1. Serialize data
  2. Throw the serialized data in the necessary buffer(s)
  3. Send out those packets to the right destinations

Having per-component diffs means upping basically all of them, having visibility means you add to the second and third (tho I think we might be able to optimize the third if we make a PR on renet later), and just giving everyone the same data every time is computationally the cheapest
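
The three steps above can be sketched with the standard library alone. This is a minimal illustration, not code from either crate: `ClientId`, `serialize_position`, `queue_for` and `flush` are hypothetical names, and the `send` closure stands in for whatever the transport (for example renet) provides.

```rust
use std::collections::HashMap;

/// Hypothetical client handle; a real transport supplies its own type.
type ClientId = u32;

/// Step 1: serialize the data exactly once (here a position as little-endian floats).
fn serialize_position(x: f32, y: f32) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(8);
    bytes.extend_from_slice(&x.to_le_bytes());
    bytes.extend_from_slice(&y.to_le_bytes());
    bytes
}

/// Step 2: copy the already-serialized bytes into every buffer that needs them.
fn queue_for(buffers: &mut HashMap<ClientId, Vec<u8>>, recipients: &[ClientId], message: &[u8]) {
    for client in recipients {
        buffers.entry(*client).or_default().extend_from_slice(message);
    }
}

/// Step 3: hand each non-empty buffer to the transport, then clear it so the
/// allocation can be reused next tick.
fn flush(buffers: &mut HashMap<ClientId, Vec<u8>>, mut send: impl FnMut(ClientId, &[u8])) {
    for (client, buffer) in buffers.iter_mut() {
        if !buffer.is_empty() {
            send(*client, buffer.as_slice());
            buffer.clear();
        }
    }
}
```

Visibility only changes the `recipients` slice passed to `queue_for`; the serialization in step 1 still happens once per message.
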
spring raptor
#

Yes, I'm afraid that for an MMO you need to send the same packet to everyone

#

Wait, you don't send the same data to 100 players

#

Most likely you send data that's specific to each client

#

Like players within 100m

dire aurora
#

Well it depends, I think many MMOs broadcast to the whole shard

#

But you might also limit what you send per-client based on what it is

spring raptor
#

I know nothing about MMO, so it could be the case

dire aurora
#

Send chat messages to everyone, only send vfx to the people nearby, stuff like that

spring raptor
#

I mean I played World of Warcraft a lot, but I don't know how it works 😅

#

Wait, maybe I do

#

Online games need robust, easy to use network APIs. JAM is World of Warcraft's inter-server serialization and routing layer. This 2013 talk from Blizzard's Joe Rumsey describes how JAM came to be, and how it is used today using real world sample code from WoW and other Blizzard projects to illustrate key concepts, such as machine generated code ...

#

I don't remember what's in this video, but I think it explains how MMOs work

dire aurora
#

There's also this fun thing where MMOs don't necessarily have good netcode

spring raptor
#

True about every networking game

dire aurora
#

I've played MMOs that use TCP and didn't even enable TCP no delay. Meaning your latency goes way up if not enough stuff happens 😂
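
For reference, "TCP no delay" is the `TCP_NODELAY` socket option, which turns off Nagle's algorithm so small writes are sent right away instead of being held back to be batched. A minimal sketch with the standard library (the helper name `connect_low_latency` is made up for this example):

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};

/// Connects to `addr` and disables Nagle's algorithm on the new socket.
fn connect_low_latency(addr: SocketAddr) -> io::Result<TcpStream> {
    let stream = TcpStream::connect(addr)?;
    // Without this, small packets may wait in the kernel until more data is queued.
    stream.set_nodelay(true)?;
    Ok(stream)
}
```
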

#

@spring raptor I assume you also have some game that uses bevy_replicon?

spring raptor
#

Yes, a life simulation game I'm working on :)

dire aurora
#

Do you happen to have some idea of what things it does that my crate doesn't support yet?

spring raptor
#

I think you mentioned that you don't have support for client + server in the same app?

dire aurora
#

I don't actually know if it supports that or not, I never tried 🙃

spring raptor
#

And maybe single-player events, that's all

#

I mean generalization of events

dire aurora
#

I think a good way to go might be if I just add any missing features first, fix some of the bugs (like the missing identifiers never getting spawned), and restructure it to an entity loop. Then after that we can see how we could restructure the code to be more like a crate. Then worry about those per-client differences later, since that sounds like it's gonna be some confusing alloc hell since renet is also involved 🤔

#

Even if we do resend packets every frame, it should probably end up more efficient than the strings replicon makes right now 🤔

dire aurora
#

Ah but we should probably get a repo set up before I make any big changes, would be hard for anyone to review anything if it's just hidden away in a crate in my game's repo somewhere 😂

spring raptor
#

Most of your features are based on these generics. If we just take them - it will work.

#

One more thing that I'm afraid of: for an MMO using serde is not a good idea. You will need to serialize for every client and the best way is to just take the bytes of POD structs and write them to the socket.

#

Or use your current approach

dire aurora
#

I mean technically I do use serde with my current approach. But really it just serializes the bytes into a buffer right away

spring raptor
#

Don't you use bincode? It's what I mean, you don't just take bytes, it's smarter

#

If bincode just took bytes, you wouldn't be able to serialize any dynamic data, like Vec or String

#

Because it uses heap

#

I.e. any non-POD data
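
A small std-only sketch of the distinction being made here. The types and helpers (`Health`, `write_health`, `write_spells`, `read_spells`) are invented for illustration and do not reflect how bincode is implemented: a fixed-size struct can be written field by field as raw bytes, while heap-backed data needs a length prefix followed by its elements, because the `Vec` itself only holds a pointer, length and capacity.

```rust
/// A plain-old-data style component: fixed size, no heap pointers.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Health {
    current: u32,
    max: u32,
}

/// Fixed-size fields can be written as their raw little-endian bytes.
fn write_health(out: &mut Vec<u8>, h: &Health) {
    out.extend_from_slice(&h.current.to_le_bytes());
    out.extend_from_slice(&h.max.to_le_bytes());
}

/// Heap-backed data is written as a u32 length prefix plus each element.
fn write_spells(out: &mut Vec<u8>, spells: &[u64]) {
    out.extend_from_slice(&(spells.len() as u32).to_le_bytes());
    for id in spells {
        out.extend_from_slice(&id.to_le_bytes());
    }
}

/// Reads back what `write_spells` produced; `None` if the input is too short.
fn read_spells(input: &[u8]) -> Option<Vec<u64>> {
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(input.get(..4)?);
    let len = u32::from_le_bytes(len_bytes) as usize;
    let body = input.get(4..4 + len * 8)?;
    let mut spells = Vec::with_capacity(len);
    for chunk in body.chunks_exact(8) {
        let mut id = [0u8; 8];
        id.copy_from_slice(chunk);
        spells.push(u64::from_le_bytes(id));
    }
    Some(spells)
}
```
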

dire aurora
#

bincode uses serde's serialize/deserialize traits. the derives for those basically just make a list of "serialize X as this type of value, serialize Y as this other type of value", etc

spring raptor
#

Yes, bincode is just more compact than something like YAML

#

But I'm afraid that for many clients you need to write bytes to the socket as is

#

I.e. without copying to intermediate buffer.

#

Hm... There are some libraries that support zero-copy serialization.

#

But it's not serde

dire aurora
#

rkyv is not a great lib tho 😂

spring raptor
#

Didn't know, never used it

dire aurora
#

It's basically like grabbing the bytes and shoving them in a packet. Which means you get all the overhead too

spring raptor
#

But it looks like there's AllocSerializer and it mentions that it works with no_std...

dire aurora
#

I think speedy is even faster but it doesn't list numbers

spring raptor
#

In your game context*

dire aurora
#

Well ultimately it always depends on the game's context

spring raptor
#

BTW, I was surprised that Bevy uses std

dire aurora
#

If you're gonna serialize 50k entities with 70 bundles each you're gonna have to bring a super computer 😂

spring raptor
#

All game engines I know use pre-allocated ring buffers.

#

Even Godot

dire aurora
#

Having some alloc isn't always a big issue, as long as you can reuse it

spring raptor
#

So what is bad about such deserialization is small allocations. I mean it's better than the same amount of big allocations, but still bad.

dire aurora
#

Ring buffers have a great advantage in that you can modify both the front and back, and even have it automatically wrap. But there's plenty of usecases where that's irrelevant

spring raptor
#

You have to copy your components to them first, but maybe it's not a big deal

dire aurora
dire aurora
#

Copying components is usually fine unless it has big types or alloc types in there. If it's just a wrapper around a u64 it's basically free

#

The annoying part is that because there's a chance something has alloc types, we need to handle things like it always does

spring raptor
#

How about this. I will rewrite my crate to generics and try:
Fill many entities in a single packet + optional compression on top (via feature). And use networking tick as you advised. But will map things on client to avoid allocations on server. Does it make sense?

dire aurora
#

The mapping isn't really the source of the copy tho 🤔

spring raptor
#

But don't you need to clone the component in order to change it for client?

#

I just imagine that things like Spells(Vec<Entity>) could be not cheap to clone.

dire aurora
#

Hmmm, in that case it might have some very slight cost yes, but I think that would be a small minority of all data 🤔

spring raptor
#

If you don't like it - I won't do it, but you mentioned the performance concerns

#

I think it could help speed things up a bit.

#

And you're cloning these components into a non-reusable buffer

dire aurora
#

Well the real performance concern isn't when mapping. It's silly things like self.clone(), nothing changes there but it clones anyway 🤔

spring raptor
#

Right, but even if you will need to change things, you can avoid cloning in this case too?

dire aurora
#

The real issue is how are we going to get Entity on feature parity with what Identifier does? There would need to be some way for the receiving side to tell if the thing already has a predicted variant around

#

Same thing with ClientAuthority, since the client would share the ending code so now the server needs to do the mapping, but it can't possibly know the client's Entity values

spring raptor
spring raptor
dire aurora
#

Yea that applies in the case of the server having been the one to spawn the entity, and only having server -> client messages

#

But if the client uses the client authority to update something on the server, it'll just be like "What's this entity you're talking about? That doesn't exist". Cause the entity mapping would be on the code that receive messages

#

And if you have client prediction, use a spell, and immediately spawn the effect, then when the server sends a message about that entity, it shouldn't spawn another entity on the same spot

spring raptor
dire aurora
#

It doesn't. The server doesn't know the Entity values of a client at all

spring raptor
#

Oh, you maintain the same identifiers, I get it

dire aurora
#

Yea that's how I do it in my crate

#

Identifier is constant whenever anyone talks about the same thing

spring raptor
#

But who creates identifiers?

dire aurora
#

Anyone can create them, but ultimately they only hold meaning if the server recognizes them

spring raptor
#

But how do you make sure that they are unique?

#

If clients can create them

#

These are different machines.

dire aurora
#

Depends a bit on how you make them. In case of terrain I just throw the bytes of the chunk position in the id, so they always become the same
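
A sketch of the terrain case: if the id is derived purely from the chunk position, client and server compute the same value without exchanging anything. The 5-byte layout below (one tag byte plus two `i16` coordinates) is an assumption made for this example; the actual `Identifier` layout is not described in this conversation.

```rust
/// Hypothetical 5-byte identifier: one tag byte plus four payload bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Identifier([u8; 5]);

/// Made-up tag value that marks an identifier as belonging to terrain.
const TERRAIN_TAG: u8 = 1;

/// Builds the identifier from the chunk position alone, so every machine
/// that knows the position arrives at the same id.
fn terrain_identifier(chunk_x: i16, chunk_z: i16) -> Identifier {
    let x = chunk_x.to_le_bytes();
    let z = chunk_z.to_le_bytes();
    Identifier([TERRAIN_TAG, x[0], x[1], z[0], z[1]])
}
```
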

#

In case of spawning the player locally before the server sent state about it, I know my own client id already

#

For something like spawning a vfx the server would need to reserve some collection with IDs for you to use

spring raptor
#

And say to server "Hey, I spawned this one", please recognize it?

dire aurora
#

I don't even tell the server that, I just spawned it with the same id, so when the server sends an update for it it goes to the correct entity

spring raptor
#

But how does the server know that it goes to the correct entity?

dire aurora
#

The server doesn't really care where it goes, it just says "here's some data for this Identifier" and then the client figures out where it goes

#

For things like enemies, which the client neither predicts nor spawns, the server just uses an incrementing id. Then when the client gets a message for it, it spawns it since it doesn't exist

spring raptor
#

Sorry, don't get it :(
You spawn an entity and you predicted it. When the server updates this entity, how does the server know the id?

dire aurora
#

It's the client that predicts the id the server will use. Or more like it knows for certain what the server will use

#

Then when the server sends the data for that id, the client has the entity so it updates that instead of spawning a new one

spring raptor
#

Got it, so you send the Identifier that you spawned to the server?

dire aurora
#

In most cases it would be the server telling you what identifiers you can spawn

spring raptor
#

Sorry if I'm asking stupid questions, but I just don't understand.
So on client you send "I want to spawn X, here is what I used to spawn it" and predict it locally. Server says back "Okay, use this ID for what you just spawned" and client updates the ID?

dire aurora
#

The client never tells the server it spawned something with the Identifier. It just has some id it knows the server will use for it, and spawns it with that. Then whatever input made the client spawn that eventually arrives at the server, it spawns it too with the same id, and sends an update about it to the client. When the client receives it it realizes that thing already exists and writes to the existing entity
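
The receive side of that flow could look roughly like this. It is a std-only sketch with invented names (`ClientWorld`, `spawn_predicted`, `apply_server_update`), and a byte vector stands in for a real entity's components: an update for a known identifier is written to the existing entry, and anything unknown gets spawned.

```rust
use std::collections::hash_map::{Entry, HashMap};

/// Stand-in for the shared identifier type.
type Identifier = u32;

#[derive(Debug, PartialEq)]
enum Applied {
    UpdatedExisting,
    SpawnedNew,
}

#[derive(Default)]
struct ClientWorld {
    /// Identifier -> latest known state (a stand-in for a real entity).
    entities: HashMap<Identifier, Vec<u8>>,
}

impl ClientWorld {
    /// The client spawns its prediction under the id it knows the server will use.
    fn spawn_predicted(&mut self, id: Identifier, state: Vec<u8>) {
        self.entities.insert(id, state);
    }

    /// Server data for a known id overwrites the predicted entity instead of
    /// spawning a duplicate; unknown ids (e.g. enemies) are spawned.
    fn apply_server_update(&mut self, id: Identifier, state: &[u8]) -> Applied {
        match self.entities.entry(id) {
            Entry::Occupied(mut existing) => {
                *existing.get_mut() = state.to_vec();
                Applied::UpdatedExisting
            }
            Entry::Vacant(slot) => {
                slot.insert(state.to_vec());
                Applied::SpawnedNew
            }
        }
    }
}
```
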

spring raptor
#

But in the message above you said that you know what id server will use

dire aurora
#

In that specific case it would be how the client knows the id. The server just told you "Your next vfx ids are going to be between 2000 and 2999"

spring raptor
#

Is vfx visible for all clients?

#

Can't several clients request a vfx at the same time?

#

What ID will you use?

dire aurora
#

It told you you get 2000 to 2999, other players would have their own ranges like 0-999, 1000-1999, 3000-3999, etc

#

That's the whole reason it tells you the range of ids, so you can generate the ids when predicting your own actions, without worrying about the vfx of other players
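
One way to picture the range reservation, as a hedged sketch: the server hands out non-overlapping blocks and each client draws predicted ids from its own block. `RangeAllocator` and `PredictedIds` are hypothetical names, and the block size of 1000 just mirrors the numbers used in the messages above.

```rust
use std::ops::Range;

/// Server side: hands out non-overlapping blocks of ids.
struct RangeAllocator {
    next_start: u32,
    block_size: u32,
}

impl RangeAllocator {
    /// Reserves the next block, e.g. 0..1000, then 1000..2000, and so on.
    fn reserve(&mut self) -> Range<u32> {
        let start = self.next_start;
        self.next_start += self.block_size;
        start..self.next_start
    }
}

/// Client side: draws ids from the block the server assigned to it.
struct PredictedIds {
    remaining: Range<u32>,
}

impl PredictedIds {
    /// Returns `None` once the block is used up and a new one must be requested.
    fn next_id(&mut self) -> Option<u32> {
        self.remaining.next()
    }
}
```

Because the blocks never overlap, two clients predicting a vfx at the same moment cannot pick the same id.
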

spring raptor
#

Got it. And client says I used this ID for this effect?

#

Oh, you probably keep client inputs ordered, right?

#

So server also knows what ID was used

dire aurora
#

Yea, in practice you can use whatever available options to sync up the ids, as long as they end up matching your client prediction should work

spring raptor
#

I think it's kinda complicated and we have no idea how to avoid calling clone() on each component. And you need to implement mapping twice, once for scenes, and once for components. I.e. you can't serialize a component with entities without EntityMap trait.

Maybe we could emulate the same things with regular entities and EntityMap? You can spawn an entity on client and send what you spawned back to server. And server sends you back server ID to finish the mapping.

dire aurora
#

It would be ideal if we can, since the Identifier system is mostly a hack. But telling the server what we spawned and waiting for the response wouldn't really work since the timing on that might be off so the mapping ends up breaking

spring raptor
#

No waiting, you predict spawning and mark it with some component. And when the server sends you its ID back - you establish the mapping.

dire aurora
#

But when does it send the ID?

spring raptor
#

Maybe send it with the entity changes?

dire aurora
#

So it would basically be spawned as a component on the entity? The challenge there is, when we receive it, it'll just spawn an entity with that id. We have no real logic to decide this packet should be used to map entities, and especially not that now we need to merge the old entity if we already spawned it (because the other message arrived first)

spring raptor
#

> The challenge there is, when we receive it, it'll just spawn an entity with that id.
Why so? First we check if server sends a predicted entity. If true - finish the mapping and update values of the entity

spring raptor
#

Even if you only receive a newer message for this entity, we can guarantee that it includes the information needed to finish the mapping.

dire aurora
#

The server could send the mapping on tick 12, then an update on tick 13. And then the client might receive the one for tick 13 first

spring raptor
#

Tick 13 contains everything from tick 12

#

If tick 12 wasn't acknowledged.
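
The idea that a later tick repeats whatever an unacknowledged earlier tick carried can be sketched as a per-client history. This is an illustration under assumptions, not replicon's implementation: `ClientHistory` and its methods are invented names, and entity ids are plain `u32` values.

```rust
use std::collections::BTreeMap;

/// Per-client record of which entities changed on which tick (hypothetical layout).
#[derive(Default)]
struct ClientHistory {
    last_acked_tick: u32,
    /// tick -> entity ids that changed on that tick
    changes: BTreeMap<u32, Vec<u32>>,
}

impl ClientHistory {
    fn record(&mut self, tick: u32, entity: u32) {
        self.changes.entry(tick).or_default().push(entity);
    }

    /// Everything changed after the last acknowledged tick. A packet built from
    /// this on tick 13 also covers tick 12 as long as tick 12 was not acked.
    fn pending(&self) -> Vec<u32> {
        let mut out: Vec<u32> = self
            .changes
            .range((self.last_acked_tick + 1)..)
            .flat_map(|(_, entities)| entities.iter().copied())
            .collect();
        out.sort_unstable();
        out.dedup();
        out
    }

    /// Stale or duplicate acks are ignored; newer acks drop history that no
    /// longer needs resending.
    fn ack(&mut self, tick: u32) {
        if tick > self.last_acked_tick {
            self.last_acked_tick = tick;
            self.changes = self.changes.split_off(&(tick + 1));
        }
    }
}
```
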

dire aurora
#

If we resend things constantly like replicon does yes, but the way my crate currently works it won't send it again

#

It would essentially rely on first having a good system to handle only resending things that way when needed, then using that for the identifier, and only then can the original identifier be replaced by Entity mapping for this usecase 🤔

spring raptor
#

This is why I'm suggesting to consider reworking this. I would like to try to play with it, doesn't look like something hard to implement.

dire aurora
#

It would definitely be good to rework it, but first we'd clearly need some efficient design to loop over entities. Also to fix the overhead that would be caused by sending Entity (8 bytes vs 5 bytes for Identifier) once per bundle 🤔

spring raptor
#

> but first we'd clearly need some efficient design to loop over entities
Let me think about this one.

#

I will play with this and write you back maybe tomorrow

#

Late night for me right now

dire aurora
#

Alright ... And I guess the looping requirements would be mostly that we can efficiently check what bundles are and aren't present outside of a generics context. Since calling all the handlers for all the entities would have some big overhead ... My best idea so far is to make a system that checks for With<T> for each component in the bundle, and Without<SomeBundleMarker>, and the same thing in reverse check for With<Marker> Or<(Without<T>, ...)>, then updating some list for that entity with the bundles it has available 🤔

#

Ideally if you had 100k entities that could get replicated, but with no bundles actually present on them, it would have near-zero performance impact
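
The "list of present bundles per entity" could be as small as a bitmask. The sketch below is std-only and does not use Bevy: `PresentBundles` is a hypothetical structure that the add/remove detection systems described above would keep up to date. Entities with no bundles never get an entry, so the replication loop does not visit them at all.

```rust
use std::collections::HashMap;

type EntityId = u32;
/// Bit `i` set means the bundle registered with index `i` is fully present.
type BundleMask = u64;

#[derive(Default)]
struct PresentBundles {
    by_entity: HashMap<EntityId, BundleMask>,
}

impl PresentBundles {
    /// Called when a bundle becomes complete or incomplete on an entity.
    /// `bundle` must be below 64 in this sketch.
    fn set(&mut self, entity: EntityId, bundle: u8, present: bool) {
        let mask = self.by_entity.entry(entity).or_insert(0);
        if present {
            *mask |= 1u64 << bundle;
        } else {
            *mask &= !(1u64 << bundle);
        }
        // Drop entities that no longer have any bundle, so they cost nothing later.
        if *mask == 0 {
            self.by_entity.remove(&entity);
        }
    }

    /// Bundle indices to serialize for one entity; empty if it has none.
    fn bundles_of(&self, entity: EntityId) -> Vec<u8> {
        let mask = self.by_entity.get(&entity).copied().unwrap_or(0);
        (0..64u8).filter(|b| (mask & (1u64 << *b)) != 0).collect()
    }

    /// Number of entities the replication loop actually has to visit.
    fn replicated_entities(&self) -> usize {
        self.by_entity.len()
    }
}
```
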

spring raptor
#

Can you use Ref<Component>?

#

Never mind, unrelated

#

I probably need some sleep 😅

#

> And I guess the looping requirements would be mostly that we can efficiently check what bundles are and aren't present outside of a generics context
Makes sense.

dire aurora
#

Just added a really simple benchmark for having many bundles ... Having my original case, and 25 unused bundles registered is 219.93µs (only 8µs slower than without any of those extra bundles), while having the same bundles registered but only using those 25 (they all overlap on the same component, Number) it takes 5.7992 ms (!). Clearly there's some room for optimization here. But obviously also room for that first case to get much much slower

spring raptor
#

I have an idea about iteration over entities.

  1. For each replicated component we store component id and function that accepts EntityMut, &[u8] and performs deserialization.
  2. We can iterate over entities in exclusive system and use ReflectSerialize to serialize components without reflection and component id.
  3. And on deserialization we deserialize id first and call corresponding function from 1.

No reflection and no macro required.
It's basically @dire aurora's approach + mine with stripped reflection.
Thoughts?

#

I did some size measurement. This struct

struct MappedComponent(Entity);

takes 52 bytes with reflection(!!!) and only 8 bytes without it.

dire aurora
#

Using component id and ReflectSerialize sounds like it's still reflect to me πŸ€”

#

Also you can't actually replace the macros without losing a lot of features. Replacing #[networked(as = X)] would just create more overhead, but now the enduser has to write it and it has to get stored as a separate component

spring raptor
#

Component ID is needed if we serialize diffs. If per bundle, you can just serialize only the bundle ID. I think you're doing the same when you send bundles?
ReflectSerialize - yes, but we need to iterate over components in some form. I think when we access data from queries we do something similar?

dire aurora
#

My code never uses component id. It's not something normal non-reflect code would ever touch

spring raptor
spring raptor
dire aurora
#

Yea but that happens based on the present bundles, which I guess could be checked with component id of a marker, or it could be stored as some other component that's more direct, say a list of present bundles per entity, that way we don't need to do another archetype lookup for each possible bundle (since you'd already need to do 1 just to get a marker on there)

spring raptor
#

I probably confused you, let me rephrase it.

#
  1. For each bundle we store some sort of id and function that accepts EntityMut, &[u8] and performs deserialization.
  2. We can iterate over entities in exclusive system and use ReflectSerialize to serialize components without reflection. We will serialize all components for changed bundle (or on timeout) and put the id for 1.
  3. We receive packet and on deserialization we deserialize id first and call corresponding function from 1.

Macro is used only to generate function that serializes.
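
Points 1 and 3 amount to a table of functions keyed by id. A minimal std-only sketch, with the caveat that `FakeEntity` is a stand-in for Bevy's `EntityMut` and the handlers are written by hand here, whereas the discussion has a macro generating them:

```rust
use std::collections::HashMap;

/// Stand-in for an entity's component storage.
#[derive(Default, Debug, PartialEq)]
struct FakeEntity {
    health: Option<u32>,
    name: Option<String>,
}

type BundleId = u8;
/// Deserializes a payload into the entity; `None` means the payload was malformed.
type DeserializeFn = fn(&mut FakeEntity, &[u8]) -> Option<()>;

fn read_health(entity: &mut FakeEntity, data: &[u8]) -> Option<()> {
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(data.get(..4)?);
    entity.health = Some(u32::from_le_bytes(bytes));
    Some(())
}

fn read_name(entity: &mut FakeEntity, data: &[u8]) -> Option<()> {
    entity.name = Some(String::from_utf8(data.to_vec()).ok()?);
    Some(())
}

#[derive(Default)]
struct Registry {
    handlers: HashMap<BundleId, DeserializeFn>,
}

impl Registry {
    /// Point 1: store an id together with its deserialization function.
    fn register(&mut self, id: BundleId, handler: DeserializeFn) {
        self.handlers.insert(id, handler);
    }

    /// Point 3: a message is `[bundle id][payload...]`; read the id first,
    /// then call the matching function.
    fn apply(&self, entity: &mut FakeEntity, message: &[u8]) -> Option<()> {
        let (id, payload) = message.split_first()?;
        let handler = self.handlers.get(id)?;
        handler(entity, payload)
    }
}
```
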

#

At first I started to describe the suggestion based on my old approach. Forget about it. It's better to describe it in terms of your crate. @dire aurora ^

dire aurora
#

Point 2 is the one that confuses me. If you use a macro to serialize things, why would ReflectSerialize be relevant?

spring raptor
#

Macro is used only to generate deserialization functions.

#

Serialization will happen in single exclusive system

dire aurora
#

If you have a macro for deserialization, why wouldn't you use it to also serialize, you pass it EntityRef and it can serialize your bundle's data

spring raptor
#

Hmm... Yes, we could generate serialization function with macro too.

dire aurora
#

And point 1 and 3 is pretty much already how my code works. Only challenge there is that for looping over entities, you need an efficient way to know what is on an entity without checking around for tens or hundreds of components, or checking if any component id matches the bundle

#

Deserialization has none of that overhead cause the packet says what data it is, so at that point all you need to do is either update or spawn a component

spring raptor
#

But it's not all.

#

It's just first approximation :)

#

Now we can improve it and instead of sending data per-bundle, send multiple entities per-packet instead.

#

I want to describe my thoughts step-by-step to give you a better picture.

spring raptor
dire aurora
#

Since we can already apply_changes from a macro, we can just make a function that serializes a bundle in that macro instead ... Instead of making systems that just send a bundle to a buffering system

spring raptor
dire aurora
#

The buffering system becomes more complex per-entity tho. Because of the cases we discussed yesterday, player 1 might receive Bundle B and C, while player 2 gets A, B, and D. But you still want the entity to be kept together where possible

#

Or you'd need some other logic to make sure necessary parts repeat ... Like the Identifier case we talked about, it would need to be in every packet that talks about that entity until it's acked

spring raptor
#

Let's try to think from a different angle

Our goal is:

  1. Iterate over all entities.
  2. Serialize only changes.
  3. Pack as many entities as possible in single packet.
  4. Track acks.

Now let's think about whether we can solve this.

dire aurora
#

I think for now we can ignore 2 and 4 as implementation details. They shouldn't add any significant overhead to looping over entities

#

At least if you consider the fact that my crate already has ServerToOwner vs ServerToObserver vs ServerToAll, which creates a different message per entity for each client

spring raptor
spring raptor
dire aurora
#

When I say message I mean a part of a packet, a packet could have one, but most likely has many

#

With a per-entity approach we'd end up having 1 packet buffer per client. Which is honestly not very different from what my code does, except it can have more than 1 buffer, and has a separate set of buffers for broadcasting

spring raptor
#

Will use yours

dire aurora
#

It's the same thing really, in both cases you have a message, which is basically just a bunch of serialized bytes, and you try to make as big as possible packets with those
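
Packing messages into packets that are as full as possible can be sketched as a greedy loop. The function name `pack`, the `u16` length prefix and the packet size limit are assumptions for this example; messages that cannot fit in an empty packet are returned separately, since they would need fragmentation or a different channel.

```rust
use std::mem;

/// Greedily packs length-prefixed messages into packets of at most `max_packet` bytes.
/// Returns the packets plus any messages too large to fit in a single packet.
fn pack<'a>(messages: &[&'a [u8]], max_packet: usize) -> (Vec<Vec<u8>>, Vec<&'a [u8]>) {
    let mut packets: Vec<Vec<u8>> = Vec::new();
    let mut oversized: Vec<&'a [u8]> = Vec::new();
    let mut current: Vec<u8> = Vec::new();

    for &message in messages {
        let needed = 2 + message.len(); // u16 length prefix + payload
        if needed > max_packet || message.len() > u16::MAX as usize {
            oversized.push(message);
            continue;
        }
        // Start a new packet when this message would overflow the current one.
        if current.len() + needed > max_packet {
            packets.push(mem::take(&mut current));
        }
        current.extend_from_slice(&(message.len() as u16).to_le_bytes());
        current.extend_from_slice(message);
    }
    if !current.is_empty() {
        packets.push(current);
    }
    (packets, oversized)
}
```
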

spring raptor
#

Oh, you mean the same thing

#

Got it!

dire aurora
#

Only distinction is that Joy didn't mention the fact that some messages don't fit in a packet, because afaik Joy really hates fragmentation

dire aurora
#

If you have say 3 channels and we say the maximum reasonable amount of clients that can be handled is 100, then the number of buffers isn't a huge deal memory wise

spring raptor
dire aurora
#

Unreliable, ReliableUnordered, ReliableOrdered

#

Things that aren't time sensitive make more sense to send over Reliable channels, same thing with despawns which the server doesn't know about after it sent them

#

I also only use ReliableOrdered for 1 thing. So chat messages stay ordered

#

Which is still a thing I'm working on, building chat UI in bevy is hell :ferris_sob:

spring raptor
spring raptor
# spring raptor Let's try to think from a different angle Our goal is: 1. Iterate over all enti...

My refined thoughts:

  1. In macro you select which component will work as a marker and on registration store its component ID, all other component IDs and serialization / deserialization (good idea to generate deserialization too and support networked_as) functions for each component.
  2. Then on sending we iterate over archetypes and check if the archetype contains special markers from the macro. If it does, for each component from the bundle we iterate over the archetype's entities and serialize changed components into chunks that are stored in one big buffer. I think the bytes crate that is used in Renet will help with this.
  3. On receive use functions from 1.

Objections?

dire aurora
dire aurora
# spring raptor My refined thoughts: 1. In macro you select which component will work as a marke...
  1. You don't need to store the component ids. If you generate the serialization you can have it immediately query from EntityRef. And I don't think we should require a marker component for the bundles, ideally we'd make systems to add and remove the markers based on it having all bundle fields and just use our own marker, something like ReplicationBundle<T>. We could even have those systems just add some entry to a list of present bundles for that entity, that would speed up step 2
  2. Iterating over archetype could work, but this sounds more complex than necessary. Getting component ids and archetypes involved makes it very hard for an average user to understand what the crate is doing
  3. That's fine but not really in-scope since we pretty much already sorted that out
#

And what do you mean by "one big buffer"?

spring raptor
# dire aurora Yes but you can't do this. You need to use reliable for big messages and things ...

I think we can. Firstly we can re-serialize only if data changed (since now we track changes). Secondly we can ask the renet author to improve the library or find a better one.
We won't deploy our games for a few months, so I would implement the right logic instead of adapting to renet.

  1. Make sense, component IDs not needed. And agree about own marker.
  2. I agree that it's not very clear, but it's what Bevy does and it's definitely not worse than generating big systems with a macro. If you know a better way - suggest it.

We could use crate like bytes to pre-allocate a re-usable buffer. It allows to clone slices cheaply.
Maybe several buffers if one of them becomes filled. Need to try this out and see. Just thoughts out loud.

dire aurora
#

Preallocating reusable buffers can be done without bytes, but reusing that data without cloning it would be a different story. I don't think bytes actually supports that however

#

I assume what you're thinking is: You write all data, create bytes that point to each message those packets will contain, and then send those packets to renet

#

But afaik bytes can only point to one region of data. So the best we can do is call extend_from_slice on each buffer that needs the message. At which point it can be any type really, as long as you can get &[u8] from it

#

As for looping over the archetypes, the fastest alternative is to have a list of available bundles ready to go. But if all we do is loop over world.archetypes to find out which entities to send it's not a major issue

#

We'd already need some way to filter out anything without Replication or whatever the general marker is

spring raptor
dire aurora
#

I think we can probably just consider that part a problem for the future, since it would mostly scale by number of clients, which is usually irrelevant during development 🤔

#

Having a benchmark with like 10 or so clients that doesn't perform awful would be good enough for now I think

spring raptor
dire aurora
#

You can't query traits 🤔

spring raptor
#

I actually can, via reflection. But why would I need it?

dire aurora
#

Reflection just does it by throwing a lot of overhead at the problem

#

Either way iterating over archetypes is probably doable, just need to check if there's any unavoidable mutations in serialization

spring raptor
dire aurora
#

Look at the overhead at the bottom

spring raptor
#

Yes, I don't think that it's a lot.

dire aurora
#

I think all I had was LastSent<Bundle> and LastSent<()>, but both of those wouldn't be needed anymore I think 🤔

spring raptor
#

I will try to play with this idea and write you back. I'm better at Rust than at English 😅

#

Actually, I'm better at C++, but that's another story 😅

#

But I'm glad that we switched to Rust at work.

dire aurora
#

Yea only mutable thing is buffers and adding LastSent. We can remove the buffers resource from the world before looping, and I think if we borrow world while looping over an archetype's entities we can still borrow it again with world.entity(...)

dire aurora
#

That benchmark wouldn't actually send anything, just call some empty "serialize" functions

#

Serialize function being fn (&EntityRef, &mut Buffers) -> () I guess 🤔

spring raptor
#

But tomorrow, it's late night for me

dire aurora
#

If we can get that part working I could probably update my macro to that format and integrate it in my crate. The current code should have enough similar cases to a final solution that we should get a clear picture of the performance impact it would have (I assume that one 5ms benchmark would get a lot faster)

#

@echo lion In the issue about rooms you mention visibility for children ... How would we propagate this? ... And I feel like the approach me and Shatur are working on right now might work better with a ClientVisibility component that just holds a list of clients with new/maintained, with only removed visibility being some type of map

#

Actually rather than new/maintained, a tick value might be even better, then we only need to update for changes, never to change new to maintained 🤔
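
A rough sketch of the tick-based variant. `ClientVisibility` is the name floated in the message above, but its fields and methods here are guesses made for illustration, and the separate handling of removed visibility is left out. Storing the tick at which visibility started means "new" is derived by comparing against what the client has acknowledged, so nothing has to be flipped from new to maintained later.

```rust
use std::collections::HashMap;

type ClientId = u32;
type Tick = u32;

/// For each client that can see the entity, the tick at which it became visible.
#[derive(Default)]
struct ClientVisibility {
    visible_since: HashMap<ClientId, Tick>,
}

impl ClientVisibility {
    /// Keeps the original tick if the client could already see the entity.
    fn show(&mut self, client: ClientId, now: Tick) {
        self.visible_since.entry(client).or_insert(now);
    }

    fn hide(&mut self, client: ClientId) {
        self.visible_since.remove(&client);
    }

    /// `None`: not visible. `Some(true)`: visibility started after the last tick
    /// the client acknowledged, so the whole entity still has to be sent.
    /// `Some(false)`: already known to the client, changes are enough.
    fn is_new_for(&self, client: ClientId, last_acked: Tick) -> Option<bool> {
        self.visible_since.get(&client).map(|since| *since > last_acked)
    }
}
```
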

echo lion
# dire aurora @echo lion In the issue about rooms you mention visibility for childr...

I really doubt putting any kind of visibility container on entities can scale. For children you can iterate the hierarchy (presumably? idk how bevy hierarchies work) as an initial solution. Alternatively you could only iterate over root entities in the main replication loop, then traverse the child hierarchy of each root entity to identify children visible to different clients (although this can lead to duplication if a child entity has multiple parents).

I have not had bandwidth to follow your discussion, I'll have to look at the new implementation when it shows up to evaluate what is needed.

dire aurora
#

iirc bevy updates GlobalVisibility and GlobalTransform by starting at the root, then iterating down their chain

#

Iterating over root entities in the replication loop wouldn't be very performant at least

echo lion
#

can't be that bad, just check if an entity is a child and skip, otherwise continue

dire aurora
#

The problem is that you can apply efficient filtering of what we do and don't have to check at an archetype level, but not at an entity level (we can check it, just much more expensive)

#

Also iterating over children that way would probably make despawns on them very hard to manage

echo lion
#

One way or another you need to traverse the relationship hierarchy...

dire aurora
#

Yea, and ideally in a way where it adds no cost for people that care about efficiency and don't use hierarchies

spring raptor
#

BTW, I probably need to take a look at Bevy internals. I think iterating over archetypes and querying Serializable from components could be faster since it's less work.
I looked at the internals - Serializable is basically an abstraction over EntityRef::get, so no.

spring raptor
#

Query is also faster than EntityRef because Query caches several things, such as TypeId to ComponentId conversion and columns for table storage components.

spring raptor
#

But we can do some sort of hybrid: iterate over archetypes, get Ptr directly and call serialization functions on them (this type is convertible to T). I think that this will be even faster than Query and can be done in a single system. I will measure.

dire aurora
#

I think these would be optimizations involving unsafe, we should probably look at those later

spring raptor
#

I will measure and write you back.
I already have this code, I just need to replace my ReflectComponent with Ptr and look up the serialization function.

dire aurora
#

For now the main concern isn't really serialization performance anyway. Also do we have a way to directly get a component from an archetype? I can't find the methods for it

#

We can get ComponentId and TableRow but the only source of Ptr I see is World.get_by_id πŸ€”

spring raptor
dire aurora
#

Ah, it's Archetype -> TableId -> Table -> Table column (w/ ComponentId) -> get (w/ TableRow (from Archetype.entities())

spring raptor
#

Yep

spring raptor
dire aurora
#

I'm not sure how we would efficiently pass this to a serialize function tho

#

Ah yea, there's SparseSet too

spring raptor
dire aurora
#

But now we're just replacing one type of overhead with other types of overhead (function calls that can't be inlined in any way)

spring raptor
#

Not necessarily. Yes, we could have one function per-bundle with this approach too. But we will need to fetch pointers for all components from the bundle to call it even if they didn't change.

dire aurora
#

Well rather than "looks odd", in this case it really is odd, it would heavily complicate how things would work (both the macro and iteration) for overhead we have yet to measure

spring raptor
#

In my opinion it's not really that complicated. Serialization is similar to how my code works, but instead of using reflect, we generate serialization functions.
And deserialization will work similar to yours.
I would say that it will simplify macro.

spring raptor
dire aurora
#

The concern is not what approach is right or wrong, but that we're introducing new variables without having tested the impact of a single one

spring raptor
#

We don't do anything novel, what I described as goals is pretty standard things.

#

We just need to find a good way of doing them

dire aurora
#

But we wouldn't be measuring the performance impact of anything if we do 7 things at once

spring raptor
#

I see your point. But I think that such "standard" things are common for a reason. I bet that people already measured and came up with this solution. We can experiment with different designs, but I'm pretty sure that we'll end up with something like this.

dire aurora
#

It's just our hypothesis that iterating over archetypes and calling serialization functions is faster than having 1 system per bundle. That having loops that can't be unrolled and extra function calls is worth it if it optimizes most of EntityRef::get. Both of those could be untrue, and since the bevy internals are complex we have no way of knowing for sure ahead of time

#

Same thing could end up true when comparing some fancy type that does what we thought bytes does to just calling extend_from_bytes

spring raptor
#

And the suggested approach with Ref will be faster.

dire aurora
#

Query can be a bit complicated in general. It's very likely fetching it directly in the unsafe way is faster, but it would create other overhead (looping over a list of ComponentId and calling the serialization functions). And ofc there's room for user error, when trying to optimize things 1 small mistake can easily make things way slower

#

Which is why it would be better to test each step. First get entity iteration we know is efficient, then try to optimize fetching and serialization, then change the buffers design to something that meets the new requirements

spring raptor
dire aurora
#

One of the mentioned goals also includes being faster than the current approach, using what we can before rewriting everything makes it easier to compare (since in theory the first two steps can be done without api changes, so the benchmark would be identical)

spring raptor
# dire aurora One of the mentioned goals also includes being faster than the current approach,...

I agree with you, maybe we even misunderstood each other. I think we both agree that we need to end up with something like the mentioned goals, right? I'm not suggesting to rewrite all at once, we should start with the first one (iterating over entities).
And I found the mentioned approach with Ptr that can lead us to the goal. Suggest a better alternative for how we can iterate over entities if you don't like the solution. Or you don't think that we should iterate over entities?

dire aurora
#

Iterating over entities is fine, but the EntityRef::get would be called in the serialization code

#

So including this optimization would immediately bring in changes to serialization

spring raptor
#

So you suggest to use EntityRef::get instead of Ptr?

dire aurora
#

Well not really suggest using it, but it would be the easiest thing to implement first to just test the iteration. Ideally we'd even test the iteration with empty serialization functions

#

Tho ofc those functions do still need to get called

spring raptor
#

Oh, you mean that you are not sure if using a single system is faster than using multiple systems with queries?

dire aurora
#

Yea, tho I'm fairly sure it would be faster to some extent in that 5ms example, it's hard to say if it's 1% faster or 99% faster

#

If it's 99% faster we would basically need no extra optimizations 😂

#

And if it's somehow slower we'd need to figure out another solution that meets our requirements

spring raptor
#

We probably need to rephrase our goal of iterating over entities. We don't necessarily need to iterate over entities. We need to have access to serialized components per-entity. And leave room to track changes.

dire aurora
#

Well the most important part is probably filtering entities

spring raptor
#

So I can imagine that multiple systems could also be fine.

#

This is how your code currently works.

spring raptor
dire aurora
#

We definitely have at least 2 different types of markers. There's some general one, like Replication and then we'd have markers for the actual bundles

spring raptor
#

Why do we need the general one, btw?

dire aurora
#

Those other markers would just be there as some kind of security check, wouldn't want to try and serialize a bundle that isn't complete

#

The general one is to inform the crate that the entity is actually supposed to get networked

spring raptor
#

But it could be slower

dire aurora
#

Yea, but we can just do that with a simple separate system, A query that does With<ComponentA>, With<ComponentB>, Without<BundleMarker> should run extremely fast

#

Since most of the time there should be no archetypes that match that

#

It kind of requires having a regular system too, or we'd need to manually update a QueryState which wouldn't be very fun 😂

spring raptor
#

@dire aurora What if a user has a bundle that is a superset of another bundle?

dire aurora
#

That would work fine, it's a bit of a nonsensical thing in most usecases tho

#

There's one specific good usecase for it tho:
The smaller bundle is ServerToObserver
The bigger bundle is ServerToOwner

#

And the bigger one just has extra data other players don't need

#

We might be able to optimize those into being 1 bundle later with some extra attributes, but any optimization the user has to think about is probably a minor concern atm

spring raptor
#

These bundles sometimes confuse me 😅

#

Component-based replication is easier to understand.

#

I'm wondering if having something like Ignored<Component> would help.

#

I mean I dislike per-component replication (my original idea, and how replicon currently works) because sometimes I don't want to replicate a specific component on a specific entity.

#

And I liked bundle-based because it solves this issue.

#

But if we have a marker for bundles, I'm starting to wonder if maybe having a per-component marker would help achieve the goal?

dire aurora
#

That wouldn't really work, since there's a fairly big difference between what you can do with a fully bundle-based approach and a pseudo-bundle approach

#

The macro attributes give you a lot of extra features here, for example: replicating a rotation for one thing could be a single f32 (Y rotation usually), but for something else you might want a full Quat.
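
As a rough illustration of that point (not the crate's actual macro output): the same rotation can go on the wire as 4 bytes of yaw or as a full 16-byte quaternion. The `Quat` type and the `write_yaw`, `write_full` and `read_yaw` helpers are hypothetical stand-ins, and the yaw shortcut assumes the rotation really is purely around Y.

```rust
/// Minimal stand-in for a math library quaternion (hypothetical, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Quat {
    x: f32,
    y: f32,
    z: f32,
    w: f32,
}

impl Quat {
    /// Builds a rotation of `angle` radians around the Y axis.
    fn from_rotation_y(angle: f32) -> Self {
        let half = angle * 0.5;
        Quat { x: 0.0, y: half.sin(), z: 0.0, w: half.cos() }
    }

    /// Recovers the Y angle. Only meaningful if the rotation is purely around Y.
    fn yaw(self) -> f32 {
        2.0 * self.y.atan2(self.w)
    }
}

/// Compact form: a single f32 (4 bytes), e.g. for a player that only turns around Y.
fn write_yaw(rotation: Quat, out: &mut Vec<u8>) {
    out.extend_from_slice(&rotation.yaw().to_le_bytes());
}

/// Full form: all four f32 fields (16 bytes), for things that rotate freely.
fn write_full(rotation: Quat, out: &mut Vec<u8>) {
    for value in [rotation.x, rotation.y, rotation.z, rotation.w] {
        out.extend_from_slice(&value.to_le_bytes());
    }
}

/// Rebuilds the quaternion on the receiving side from the compact form.
fn read_yaw(bytes: [u8; 4]) -> Quat {
    Quat::from_rotation_y(f32::from_le_bytes(bytes))
}
```

Which of the two writers a bundle field uses is exactly the kind of per-context choice a component-wide setting can't express.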

#

There's also things like bundle fields you just don't send, which can save you from creating a whole lot of blueprint patterns for minor things

spring raptor
#

Just thinking out loud, I'm not suggesting to switch to component-based replication.

dire aurora
#

I think the best we could do is store per-component functions for a bundle. Since the way you send a component varies on the context the component is used in

#

A player can only rotate on the Y axis, but there might be things that need a Vec3 or full Quat

spring raptor
#

If you're talking about component-based replication

#

Or did I misunderstand you?

dire aurora
#

If you wanted to apply a pattern like this on a component-based system you'd have to store the way it's handled in each entity I think yea. That seems pretty hard to manage tho 🤔

spring raptor
#

I think this solution is kinda interesting, I'll probably keep it in mind.

#

Here is what I suggest. If you remember, some time ago I wanted to see how far I could get with reflection. Then I dropped it because reflection sucks 😅. But Ptr will work quite similarly, it's a very easy change for my crate. I can apply it and see how it goes.
And you will try to improve your crate.
And then we update benchmarks to compare. So I'm suggesting we experiment in parallel.

dire aurora
#

Hmmm ... How would you serialize if you use Ptr and no reflection? 🤔

spring raptor
#

Using function pointers that cast to specific type based on ComponentId.
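
A minimal sketch of that idea, using only the standard library: a table of type-erased serialization functions keyed by a component id. `ComponentId` here is a plain integer standing in for Bevy's type, a raw `*const u8` stands in for `Ptr`, and `SerializeRegistry`, `Health` and `Position` are made-up names for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for Bevy's `ComponentId`.
type ComponentId = usize;

/// Type-erased serializer: reads the component behind the pointer and appends its bytes.
type SerializeFn = unsafe fn(*const u8, &mut Vec<u8>);

struct Health(u32);

struct Position {
    x: f32,
    y: f32,
}

/// Safety: `ptr` must point to a valid `Health`.
unsafe fn serialize_health(ptr: *const u8, out: &mut Vec<u8>) {
    let health = unsafe { &*ptr.cast::<Health>() };
    out.extend_from_slice(&health.0.to_le_bytes());
}

/// Safety: `ptr` must point to a valid `Position`.
unsafe fn serialize_position(ptr: *const u8, out: &mut Vec<u8>) {
    let position = unsafe { &*ptr.cast::<Position>() };
    out.extend_from_slice(&position.x.to_le_bytes());
    out.extend_from_slice(&position.y.to_le_bytes());
}

#[derive(Default)]
struct SerializeRegistry {
    fns: HashMap<ComponentId, SerializeFn>,
}

impl SerializeRegistry {
    fn register(&mut self, id: ComponentId, function: SerializeFn) {
        self.fns.insert(id, function);
    }

    /// Looks up the function for `id` and calls it; returns `false` if none is registered.
    ///
    /// Safety: `ptr` must point to a value of the type registered under `id`.
    unsafe fn serialize(&self, id: ComponentId, ptr: *const u8, out: &mut Vec<u8>) -> bool {
        match self.fns.get(&id).copied() {
            Some(function) => {
                unsafe { function(ptr, out) };
                true
            }
            None => false,
        }
    }
}
```

The cast inside each function is the part that replaces reflection: the registry never knows the concrete type, only the id it was registered under.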

dire aurora
#

Ah, you just keep a list of component serialization functions?

spring raptor
#

Yep

dire aurora
#

Then you can at least do a benchmark that doesn't have the massive overhead of writing strings 😂

spring raptor
#

Exactly, I can't even serialize 10k entities 😅

dire aurora
#

Tho tbf my benchmark wasn't really bottlenecked by writing data either, 150micros for empty bundles, 200 for a fairly big bundle

spring raptor
#

Mine definitely was 😅

dire aurora
#

Tho serializing bincode is probably faster than your format too

spring raptor
#

I serialized reflects using bincode too

#

But it's still strings

dire aurora
#

Oh ... I was confused for a bit cause you sent a snippet with ron format before

#

If you're CPU bottlenecked ron is pretty awful 😂

spring raptor
#

I sent ron just to show visually what information is serialized with reflection. Serde for reflection is implemented differently than for actual components.

dire aurora
#

Yea it's not too surprising, many of the components don't even have serde::Serialize in the first place. Like Transform and Entity 😂

dire aurora
#

I think we missed something pretty obvious

#

Iterating through all archetypes in the world by itself is not super efficient, because you end up with many empty archetypes

#

From the docs:

Like tables, archetypes can be created but are never cleaned up. Empty archetypes are not removed, and persist until the world is dropped.
#

Luckily we can just steal the code to update query state to handle this efficiently 🤔

#

Well apparently not fully, since the .value() it uses is private. But we can at least use generations to only update archetype info when we need to
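
The generation trick can be sketched without Bevy at all. Since archetypes are only ever appended, the length of the list works as a "generation", and a cache only has to scan what was added since it last looked. Everything below (`World`, `Archetype`, `ReplicatedArchetypes`, the `u32` component ids) is a simplified, hypothetical model, not Bevy's real types.

```rust
/// Simplified archetype: the component ids it contains and how many entities it holds.
struct Archetype {
    components: Vec<u32>,
    entity_count: usize,
}

/// Archetypes are append-only, so the list length can act as a generation counter.
struct World {
    archetypes: Vec<Archetype>,
}

/// Cache of archetype indices that contain the replication marker component.
struct ReplicatedArchetypes {
    marker: u32,
    seen_generation: usize,
    matching: Vec<usize>,
}

impl ReplicatedArchetypes {
    fn new(marker: u32) -> Self {
        Self { marker, seen_generation: 0, matching: Vec::new() }
    }

    /// Scans only archetypes created since the last call and returns how many were scanned.
    fn update(&mut self, world: &World) -> usize {
        let start = self.seen_generation;
        let new_archetypes = &world.archetypes[start..];
        for (offset, archetype) in new_archetypes.iter().enumerate() {
            if archetype.components.contains(&self.marker) {
                self.matching.push(start + offset);
            }
        }
        self.seen_generation = world.archetypes.len();
        new_archetypes.len()
    }

    /// Iterates cached archetypes, skipping the ones that currently hold no entities.
    fn iter_non_empty<'a>(&'a self, world: &'a World) -> impl Iterator<Item = &'a Archetype> + 'a {
        self.matching
            .iter()
            .map(move |&index| &world.archetypes[index])
            .filter(|archetype| archetype.entity_count > 0)
    }
}
```

When nothing new was created, `update` is just a length comparison, which is why it can run every tick.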

dire aurora
#
Finished updating archetypes in 1.343µs. Extracted 3 archetypes with a total of 6 bundles
Finished updating archetypes in 15.239µs. Extracted 51 archetypes with a total of 118 bundles
Finished updating archetypes in 17.173µs. Extracted 52 archetypes with a total of 120 bundles
Finished updating archetypes in 25.348µs. Extracted 52 archetypes with a total of 120 bundles
Finished updating archetypes in 15.359µs. Extracted 52 archetypes with a total of 120 bundles

Well it's fast at least, and runs pretty infrequently for now (watch as relations destroy the performance)

dire aurora
#
entities send           time:   [100.97 µs 101.04 µs 101.11 µs]
                        change: [-52.373% -52.316% -52.258%] (p = 0.00 < 0.05)
                        Performance has improved.

entities receive        time:   [303.67 µs 304.30 µs 305.04 µs]
                        change: [+2.1979% +2.3406% +2.5081%] (p = 0.00 < 0.05)
                        Performance has regressed.

many unused bundles     time:   [101.01 µs 101.07 µs 101.14 µs]
                        change: [-54.064% -53.921% -53.786%] (p = 0.00 < 0.05)
                        Performance has improved.

many overlapping bundles
                        time:   [459.30 µs 460.24 µs 461.55 µs]
                        change: [-92.075% -92.057% -92.039%] (p = 0.00 < 0.05)
                        Performance has improved.

@spring raptor I did an entity iteration thing ... Besides the entity receive (which always fluctuates ~2-3% per run) everything got significantly faster ... I don't have all old features implemented yet (specifically it just sends every time without checking changes (which isn't that expensive, I benchmarked it before) and sending data to new clients (that code doesn't get hit in the benchmarks tho))

spring raptor
dire aurora
#

It updates archetypes if the generation changes. Then uses the cached data to iterate over all archetypes with the replication marker (Identifier atm) and at least 1 bundle. It just passes EntityRef along with a bunch of other fields to the bundle's serialization function

#

So no significant changes to how the serialization or buffers work so far

#

I'll have to give some thought to how to efficiently handle those changes before I pick them up. The old code used to serialize data for each new connected player, and I really want to avoid the new approach doing stuff like that (especially since gaining visibility would work the same way as new players connecting). But I also don't want broadcasting data to get extremely slow

#

It also currently still serializes all bundles as it used to, so the messages still look like this:
packet_id identifier data packet_id identifier data

spring raptor
#

Yep, I understand, it's nice to do it iteratively

#

I haven't tried coding yet today, I decided to relax over the weekend. I'll try Ptr thing tomorrow :)

granite owl
#

In no small part because I'm doing work on the next major version and so now is the time to make big breaking changes

dire aurora
#

Ah, I didn't mean that as a general statement but specific to networking. I've used rkyv in the past and got some overhead in the output on things like enums with varying variants

#

Tho I guess even with networking it's kind of dependent on the circumstances, it's not that we really need to save every byte, but rather because we pack messages into a single UDP packet, the fewer bytes each message is the fewer packets there are, and the more likely it is that things arrive completely and on time

granite owl
#

Yeah that makes sense. And I can definitely recommend bitcode as a good format for when you need to squeeze out every last byte. There is a fundamental tension between total zero-copy deser and message size, but one I think can be improved with effort

#

There are also a few techniques you can use to reduce the serialized size with rkyv but I'm the first to admit that it's pretty technical and difficult unless you have a lot of familiarity with the crate

dire aurora
#

Yea, that's always a hard one, even if in theory you make the best possible crate, if people struggle to use it effectively the average performance becomes much worse

#

It's the main reason I currently use bincode. It's not the best in any metric, but it has a good space efficiency vs effort spent ratio. Will definitely have to swap it out with something better tho. Bitcode and speedy seem the most interesting, tho with neither you get amazing results without having custom derives

granite owl
#

Speedy is generally a very good pick based on the benchmarks! Thanks for taking some time to talk about rkyv, I appreciate it very much 🙂

dire aurora
#

The benchmarks are all pretty crazy tbh. bincode is already faster than what you get in many other languages, and other libs make it look like a joke 😂

granite owl
#

It's so true, very easy to lose sight of the big picture. Some of the biggest gains come from just switching to Rust in the first place

spring raptor
#

Realized that with bundle-based replication it's impossible to support single component removals. It's quite limiting :(

dire aurora
#

Why would you want support for single component removals?

#

Also you can just have a bundle with 1 component in the rare cases where this is somehow desired behavior. After all if you can remove it by itself and still have everything work as intended that means it doesn't belong in a bundle with other components

spring raptor
dire aurora
#

Status effects might also be a bundle. Tho in general I'm not convinced networking status effects as separate components is a good idea

spring raptor
#

Design depends on the game.

dire aurora
#

Yea if it's one or two in the whole game it could work, but in that case making it a bundle with 1 component is just fine

#

But if there was no clear limit on how many status effects will get added, like in many RPGs, it would probably be better to make a more flexible system

spring raptor
#

How should bundle removals be tracked, btw? When all/any components are removed?

dire aurora
#

I think when any component is removed, tho I haven't gotten to removing components yet

#

My game literally never removes components over the network 😂

#

Does replicon handle component removals?

#

Overall shouldn't be too hard to implement it I think 🤔

spring raptor
dire aurora
#

I have despawns in my crate, but no removals. I'll probably come up with some design for visibility since it would weave into all of the things I need to tackle next 🤔

#

My despawn logic right now for example has issues like sending despawns to clients that didn't get data about it yet

dire aurora
#

@spring raptor Do you have any ideas for how acks could be handled with multiple messages yet? Having some way for clients to know when they received all data for that tick would also allow for some improvements with client prediction I think 🤔

spring raptor
dire aurora
#

The best I can think of is once we're done with everything, we check how many messages were sent to clients (including the one that's currently being buffered if it has any content), then if it's greater than 0 we send how many there were. Then the client can just keep track of how many updates it received for each tick and once it has them all send an ack and fire some event so the client can do any cleanup it wants
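
Roughly, the client side of that scheme could look like the sketch below. The message shapes (`Update`, `TickSummary`) and the `ClientAckTracker` type are invented for illustration; the only assumption taken from the description is that the server sends one extra message per tick saying how many update messages belong to it, and that messages may arrive in any order.

```rust
use std::collections::HashMap;

type Tick = u32;

/// Hypothetical wire messages, reduced to what the ack logic needs.
enum ServerMessage {
    /// One of possibly several update messages for a tick.
    Update { tick: Tick },
    /// Sent once per tick: how many `Update` messages the server sent for it.
    TickSummary { tick: Tick, update_count: u32 },
}

#[derive(Default)]
struct TickProgress {
    received: u32,
    expected: Option<u32>,
}

#[derive(Default)]
struct ClientAckTracker {
    ticks: HashMap<Tick, TickProgress>,
}

impl ClientAckTracker {
    /// Records a message and returns `Some(tick)` once every update for that tick
    /// has arrived. The caller would then send the ack and fire its "tick complete" event.
    fn on_message(&mut self, message: ServerMessage) -> Option<Tick> {
        let tick = match message {
            ServerMessage::Update { tick } => {
                self.ticks.entry(tick).or_default().received += 1;
                tick
            }
            ServerMessage::TickSummary { tick, update_count } => {
                self.ticks.entry(tick).or_default().expected = Some(update_count);
                tick
            }
        };

        let progress = &self.ticks[&tick];
        let complete = progress.expected == Some(progress.received);
        if complete {
            self.ticks.remove(&tick);
            Some(tick)
        } else {
            None
        }
    }
}
```

Ticks that never complete (because a message was lost) stay in the map here, so a real version would also need to drop entries older than some threshold.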

#

It might need to be 2 separate numbers actually. One for the stuff that actually needs these acks (and thus fires the response) and one that includes reliable messages

spring raptor
#

You don't need to use reliable messages for replication

dire aurora
#

You do, you can't do "resend till it arrives" on 200+ entities that are up to 8kB in size

#

You also need reliable for events, otherwise they just disappear (and that would be working as intended)

spring raptor
dire aurora
#

The only way to not reserialize it would be to keep it in memory, which is exactly what renet already does, and it handles acks for us

spring raptor
#

And send it twice

#

This is why we use unreliable channel in the first place

dire aurora
#

You don't put reliable on bundles that change often, that would be stupid

spring raptor
#

It's a workaround, yes

#

But smart serialization to serialize only changes is needed anyway

dire aurora
#

Skipping bundles that didn't get changed is easy, but it would be stupid for us to remake everything renet already does so we don't need to serialize things again when resending it

spring raptor
dire aurora
#

If you send changed thing over unreliable the packets just get lost

spring raptor
#

Doesn't matter, you track acks anyway

#

What's the point of using a reliable channel for replication if we serialize only actually changed data?

#

We can just keep sending Bytes to renet, it's cheaply clonable.

dire aurora
#

If you send a packet over reliable you can send it once, then forget about it until it changes again

spring raptor
#

I know how a reliable channel works, what's the point of sending some components through a reliable channel?

dire aurora
#

Currently the way replicon works is by just flooding the client with the update until it's acked. This means the message is sent every tick, when a client has high ping for up to 20 ticks, even more if the ack only gets sent if all packets arrived and there's any amount of packet loss involved

#

And the way my crate works is by sending changes only once, and then not again until it changes again

#

Any data I send on reliable is very unlikely to change if ever at all

spring raptor
#

You send data until it acks.

dire aurora
#

Reliable sends the data based on a configured timeout, not every tick

spring raptor
#

We should do the same, otherwise we just flood the channel.

dire aurora
#

Reliable also sends acks per message, not for the whole tick of updates

#

Acking a whole tick of updates gets less reliable the more data is in that update, so especially 200+ 8kB entities are things you're gonna want to exclude from that ack

dire aurora
spring raptor
spring raptor
#

But I think it's a rarely needed fine-tuning.

dire aurora
#

If you want your game to feel smooth and stable with any amount of packet loss setting such resend timers correctly would be very necessary

spring raptor
#

You rarely care about such details

#

Usually developers focus on the game first, optimizations come later.

#

Anyway, no engine provides a way to use a reliable channel for replication 😅

#

It's better to configure the timeout per-component if you really need this, just to handle everything in a similar manner

dire aurora
#

That would be per-bundle probably, but that still doesn't fix the issue of it being pointless to serialize data again that isn't gonna change

spring raptor
#

I don't get you. You shouldn't serialize any data that wasn't changed.

dire aurora
#

In this case the client never acked it, after all it's data that spans multiple packets because it's too big, it's very easy to get lost

spring raptor
#

Still don't get it. You have a single entity that contains 8kB?

#

Even if you have such a huge entity, you don't need to reserialize it, you just tell renet to send already serialized data again as other components.

dire aurora
#

Up to 8kB yes, and there are hundreds of them

spring raptor
#

If you have a hundred messages, it's several packets.

dire aurora
spring raptor
#

And if you ack per-packet it's not a problem.

spring raptor
#

But in a different way.

dire aurora
#

We can't ack per-packet cause then we need to keep track of when every client ever acked data from any packet

spring raptor
#

Why not?

spring raptor
#

Oh, you mean changes with multiple messages, not packets.

#

No, it should be a single diff, separated into packets.

#

And you can ack it partially.

#

Not sure how, but I would expect it to work this way.

dire aurora
#

Acking partially is what adds all the complexity. Now you need to track what entities are in what partial diffs and what acks were received by which clients for which entities

spring raptor
#

Yes, but it's how it usually works. We can't send huge messages.

#

We need to separate messages into packets

dire aurora
#

That's how reliable works, that's not how acking world diffs works. Most games just send data constantly, they don't send you world diffs

spring raptor
#

"Most games"
Which ones?

#

I said this before and even Joy confirmed it - you send diffs.

#

And Unreal Engine works this way. You just mark a property for replication and it magically works.

dire aurora
#

Unreal engine's networking solution is not most games. A lot of games have custom network code, and they tend to do one of 3 things:

  1. Send only changes and not care if it gets lost
  2. Send things constantly if they matter
  3. Use reliable or TCP
spring raptor
#

It doesn't mean that we should use something like this 😅

dire aurora
#

I'm just saying it makes no sense to say that most games diff per packet

#

They don't, they don't even diff at all

spring raptor
#

Then just send everything over TCP

#

It will work 😄

dire aurora
#

It won't, I've played games that use TCP, it's horrible 😂

#

You get 1 small packet drop and suddenly you get these huge hundred-millisecond lag spikes, and then the devs say they don't care cause their target audience is south korea, where they have decent internet and the distance to servers is tiny 🙃

spring raptor
#

I understand that sending diffs in a message that can be split across packets, and using reliable for stuff that you send once or rarely change, could be easier to implement.
But I would prefer to provide a solution that "just works".

#

You mark something for replication - you are good to go.

#

I had wonderful experience with UE and I would like to replicate it (pun intended)

dire aurora
#

Giving users a solution that "just works" at the cost of being impossible to optimize would be a pretty bad route

spring raptor
#

"cost of being impossible to optimize"
I don't see why it would be impossible

dire aurora
#

If you can't specify what is reliable vs unreliable and what the resend times are, how can you optimize it?

spring raptor
#

What do you want to optimize if unreliable channel will just work?

dire aurora
#

If unreliable floods the other side until it acks and does per-packet acks it would take up a significant % of the CPU and bandwidth for that

#

If you make something that "just works" and leave the user no decisionmaking you just end up making some huge compromise somewhere

spring raptor
#

We need a way to configure the delay anyway

#

UE works even for AAA games

dire aurora
#

AAA games barely work, and UE works horribly for most indie games. I'm sure you've played an unreal game that uses 100% of your GPU for potato graphics before 😂

spring raptor
#

Fortnite has excellent networking, for example.

dire aurora
#

It would be a pretty big meme if the creators of the engine couldn't make a decent game in it :')

spring raptor
#

This also means that their approach works.
For me it looks like you want to accept the compromise of using a single packet per diff, but don't want to accept a different one, using a single system to send all components :)

dire aurora
#

Which different one to use a single system to send all components?

spring raptor
#

By system I didn't mean a Bevy system

#

I meant implementation in general

dire aurora
#

Using per-packet diffs would mean we would need to:
Store which entities were in which packets sent to each client (that's a lot of allocs)
Check those lists when we receive an ack
Store the timestamps per client (more allocs)
Clean up old data if we ever figure out what is no longer necessary
Check all acks for all clients against all components for every entity

#

Compare this to a world-diff ack, where you just store the ack per-client, and you can optimize the checks by just looking at the oldest ack of any client that gets that entity (or maybe it could even be pre-processed per room or whatever)
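
A small sketch of the world-diff variant, under a few loud assumptions: ticks are plain `u32` values that never wrap, a client that has never acked counts as tick 0, and "which clients get this entity" is already available as a slice. `AckState` and its methods are made-up names.

```rust
use std::collections::HashMap;

type ClientId = u64;
type Tick = u32;

/// One number per client: the newest world-diff tick that client confirmed.
#[derive(Default)]
struct AckState {
    last_acked: HashMap<ClientId, Tick>,
}

impl AckState {
    /// Acks only move forward; a late ack for an older tick is ignored.
    fn record_ack(&mut self, client: ClientId, tick: Tick) {
        let entry = self.last_acked.entry(client).or_insert(0);
        *entry = (*entry).max(tick);
    }

    /// Oldest ack among the clients that can see an entity (0 for clients that never acked).
    fn oldest_ack(&self, viewers: &[ClientId]) -> Option<Tick> {
        viewers
            .iter()
            .map(|client| self.last_acked.get(client).copied().unwrap_or(0))
            .min()
    }

    /// The entity has to be sent if it changed after the oldest ack of any viewer.
    fn needs_send(&self, changed_tick: Tick, viewers: &[ClientId]) -> bool {
        match self.oldest_ack(viewers) {
            Some(oldest) => changed_tick > oldest,
            None => false, // nobody can see it
        }
    }
}
```

The per-entity check is a single comparison, and `oldest_ack` could be computed once per room or visibility group instead of once per entity.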

spring raptor
#

Fair point, but we probably need to take a look at how other people implement it.

dire aurora
#

There's also the option of doing acks per-bundle, but then you'd want to change your packet filling strategy from per-entity to per-bundle. But per-entity would be a significantly more stable experience, since you don't get half updated entities

#

Or you'd need some other type of grouping, which can also somehow make it more likely all entities in that group end up in the same packet

#

Time to invent network archetypes, which consist of a unique set of networked bundles 😂

spring raptor
#

You configure how important an entity is to you

dire aurora
#

How do you configure this?

spring raptor
#

Scroll to "Relevance and Priority"

dire aurora
#

That doesn't actually group them ... And relevance seems to just be the same as a marker component

spring raptor
#

Basically what to include in a packet first.

#
Sometimes there is not enough bandwidth available to replicate all relevant Actors during a single frame of gameplay. Actors therefore have a Priority value which determines which Actors get to replicate first. By default, Pawns and PlayerControllers have a NetPriority of 3.0, which makes them the highest-priority Actors in a game, while base Actors have a NetPriority of 1.0. The longer that an Actor goes without being replicated successfully, the higher its priority will get with each successive pass until it is replicated. 
dire aurora
#

Yea I guess in a way a system like that could be used

#

You'd need to sort by priority, so things with the same priority will end up in the same packets
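
Loosely following the quoted description (this is not Unreal's actual implementation), a priority pass could sort candidates by an effective priority that grows while they wait, then fill the packet budget in that order. `Candidate`, `effective_priority` and `pick_for_packet` are hypothetical names, and the "base priority times ticks waiting" formula is just one simple choice.

```rust
/// Something that wants to be replicated this tick.
struct Candidate {
    id: u32,
    base_priority: f32,
    /// Ticks since this candidate last made it into a packet.
    ticks_waiting: u32,
    /// Serialized size in bytes.
    size: usize,
}

/// Priority grows the longer a candidate goes without being sent.
fn effective_priority(candidate: &Candidate) -> f32 {
    candidate.base_priority * (1 + candidate.ticks_waiting) as f32
}

/// Sorts by effective priority (highest first) and greedily fills `budget` bytes.
/// Returns the ids that were picked; everything else waits and ages by one tick.
fn pick_for_packet(candidates: &mut [Candidate], budget: usize) -> Vec<u32> {
    // Stable sort, so candidates with equal priority keep their relative order.
    candidates.sort_by(|a, b| effective_priority(b).total_cmp(&effective_priority(a)));

    let mut used = 0;
    let mut picked = Vec::new();
    for candidate in candidates.iter_mut() {
        if used + candidate.size <= budget {
            used += candidate.size;
            candidate.ticks_waiting = 0;
            picked.push(candidate.id);
        } else {
            candidate.ticks_waiting += 1;
        }
    }
    picked
}
```

Because the sort is stable, candidates with the same priority stay next to each other, which is what makes them likely to land in the same packet.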

spring raptor
#

And easier to track acks, yes

#

It's basically an answer to how to handle this

#

No, not exactly

#

I would say that grouping should be something different.

#

Priority is based on bandwidth

#

We need grouping to solve the acking problem

dire aurora
#

Considering how bandwidth is often checked I'm pretty sure priority just affects sorting. We could make Priority<const u8> a component, then sort archetypes by that value or something 🤔

spring raptor
#

Group can have a priority 🤔

dire aurora
#

Yea that could also work, as long as we can sort archetypes

spring raptor
#

See, it's easier to take a look at what others have already implemented.

#

Even without code, just by looking at the API

dire aurora
#

It's not a very user friendly approach still ... Tho maybe it could be made better if you couldn't forget groups by accident

spring raptor
#

I think it's safe to have a global group by default.

#

In most cases it will just work.

#

Also it solves the problem when some entities are already replicated and others that depend on them are not.

#

And if users separate components into groups, they just need to make sure that related components are in the same group.

dire aurora
#

We'd want to group entities, not components

spring raptor
#

Right, right

dire aurora
#

We'd need some way to register groups, maybe in debug mode a warning when an unregistered group is found, and send messages about the number of entities that were in those groups (if any)

spring raptor
#

@dire aurora crazy idea: create Replicate<T> that holds component buffer

#

And this way you can get serialization in parallel

#

And easy serialization reuse.

dire aurora
#

Serialization in parallel is not worth it most likely. And I don't think it's a good idea to buffer the data

spring raptor
#

I meant store Bytes.

#

To avoid re-serialization

dire aurora
#

I don't think we're anywhere near the point where we'd need to worry about that

#

Also storing Bytes is useless for components, you need to copy it into a packet buffer anyway

spring raptor
#

Not if it's Bytes

#

It's cheaply clonable.

dire aurora
#

I already mentioned this before, but Bytes can only point to one contiguous section of memory

#

You can't make a Bytes out of 200 different slices or Bytes

spring raptor
#

We don't need to, just store a single Bytes for each component just to avoid cloning it into renet

dire aurora
#

But renet doesn't take components, it takes complete packets

spring raptor
#

Right, I forgot

#

Damn

dire aurora
#

But yea either way I doubt we'd need to optimize this far yet. I can serialize 1000 entities with quite a few components in 100 microseconds ... But there's still missing features and deserialization is very slow in comparison ...

spring raptor
#

Will go to sleep, have a good night / day.

I've almost finished replacing reflection with pointers; maybe I'll finish tomorrow and do some comparisons / measurements.

dire aurora
#

I really wonder what that code ends up looking like ... If we can make a simple function to get a component by component_id from an archetype it might be worth testing ... I might even be able to fit it into my current bundle serialization (the function signature is already ugly so wouldn't hurt if I add a few more fields)

spring raptor
#

@dire aurora I replaced reflection in my crate with Ptr. The benchmark shows almost a 2.5x improvement. But reflection was slow, so these numbers don't really mean much 😅
I need to rework my WorldDiff creation (the slowest part right now) and then we can try to do some meaningful comparisons.
You were interested in the code, here: https://github.com/lifescapegame/bevy_replicon/pull/37

dire aurora
#

Can you send more than 900 empty components yet? 🤔

spring raptor
dire aurora
#

It might depend on how big the packets get. I'd imagine there's a limit to how many fragments it will try to make

spring raptor
#

I only slightly changed WorldDiff to just make it work, I need to create it in a more optimal way.

Could be, yes. Wondering how much space it takes.

#

My crate had per-component replication before, so I just did a few refinements to see how @echo lion's old idea about having Ignored<T> fits and I quite like it!
I think per-component replication has some pros over the per-bundle approach:

  1. No macro required at all, everything is done via plain generics.
  2. It's very clear for user how to implement custom serialization for specific type.
  3. It's obvious how it works. What if two bundles intersect? What if two bundles on an entity represent a third bundle? What if you remove one component from a bundle, should we send a removal to clients for the whole bundle or just stop replicating it?
  4. If you want to replicate a single component, you can without creating a bundle.

The con is that we now call a function for each component. But the bundle-based approach needs other systems to run, so I think performance should be comparable.

#

Not entirely sold about per-component approach either, just saying.

dire aurora
#
  1. Having no macro also means you lose a lot of features
  2. Implementing custom serialization would be the same tho?
  3. These things should be obvious if you know how bevy bundles and queries normally work
  4. (T,) is a bundle and you didn't really need to create it (but I do still need to make a blanket impl for it)
#

Also which systems? With archetype iterations there are no systems that get registered

spring raptor
dire aurora
#

I don't insert markers, it's a waste of time since checking archetypes is super efficient

#

They rarely ever change

spring raptor
#

You iterate over all bundles and check if archetype contains all entities from bundle? Or is there a more efficient way of doing it?

dire aurora
#

I check if it contains all components

#

The efficiency doesn't matter right now. Archetypes hardly ever change, they never get cleaned up after all

spring raptor
#

Then with per-component you have fewer checks. Instead of iterating over all bundles that you replicate and checking whether the archetype contains each one, you do the check in reverse: you iterate over all of the archetype's components and check whether they are replicated.
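The two directions of the archetype check can be sketched as follows. This is only an illustration with plain integers as component IDs; `ComponentId`, `matching_bundles` and `replicated_components` are hypothetical names, not Bevy or replicon API.

```rust
use std::collections::HashSet;

/// Hypothetical component identifier; Bevy has its own `ComponentId` type.
type ComponentId = u32;

/// Bundle-first check: for every registered bundle, test whether the
/// archetype contains all of the bundle's components.
fn matching_bundles(
    archetype: &HashSet<ComponentId>,
    bundles: &[Vec<ComponentId>],
) -> Vec<usize> {
    bundles
        .iter()
        .enumerate()
        .filter(|(_, bundle)| bundle.iter().all(|id| archetype.contains(id)))
        .map(|(index, _)| index)
        .collect()
}

/// Component-first check: walk the archetype's components once and keep
/// the ones that were registered for replication.
fn replicated_components(
    archetype: &[ComponentId],
    registered: &HashSet<ComponentId>,
) -> Vec<ComponentId> {
    archetype
        .iter()
        .copied()
        .filter(|id| registered.contains(id))
        .collect()
}

fn main() {
    let archetype: HashSet<ComponentId> = [1, 2, 3].into_iter().collect();
    let bundles = vec![vec![1, 2], vec![2, 4]];
    println!("bundles: {:?}", matching_bundles(&archetype, &bundles));

    let registered: HashSet<ComponentId> = [2, 3, 7].into_iter().collect();
    println!("components: {:?}", replicated_components(&[1, 2, 3], &registered));
}
```

Either result can be cached per archetype, which is why the cost of the check itself matters little once archetypes stop changing.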

#

These checks are super tiny though.

dire aurora
#

Like I said, the check doesn't matter. It takes a few micros to execute, and stays cached for as long as no new archetypes are created. It doesn't take long for new archetypes to stop being created

spring raptor
#

Okay, 900 empty components somehow form a 32420-byte diff 😅

#

I'm pretty sure that bincode serializes these components as empty bytes.

dire aurora
#

If you serialize just an empty struct it becomes zero bytes iirc

spring raptor
#

Yes

#

Double checked it, my diff looks like this (9 components):

WorldDiff { tick: Tick { tick: 169 }, entities: {0v0: [Changed((ReplicationId(1), []))], 7v0: [Changed((ReplicationId(1), []))], 1v0: [Changed((ReplicationId(1), []))], 8v0: [Changed((ReplicationId(1), []))], 2v0: [Changed((ReplicationId(1), []))], 6v0: [Changed((ReplicationId(1), []))], 5v0: [Changed((ReplicationId(1), []))], 3v0: [Changed((ReplicationId(1), []))], 4v0: [Changed((ReplicationId(1), []))]}, despawns: [] }
#

I serialize components as Vec and pass to world diff and serialize it again, I think that's the problem.

#

Have you tried serializing empty Vecs?

dire aurora
#

Every Vec gets a 64-bit length in bincode (with the default fixed-int encoding)

spring raptor
#

Damn

dire aurora
#

Then followed by the elements
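To see how fixed-width length prefixes add up, here is a hand-rolled encoder. Everything about the layout is an assumption made for illustration: 8-byte length prefixes (my understanding of bincode 1.x's default fixed-int encoding), a 4-byte enum tag, and 8 bytes each for the entity and the replication ID. The real WorldDiff types are not shown in this conversation, so treat this as a guess that happens to reproduce the reported size.

```rust
/// Writes a byte slice with an assumed 8-byte little-endian length prefix.
fn encode_bytes(out: &mut Vec<u8>, bytes: &[u8]) {
    out.extend_from_slice(&(bytes.len() as u64).to_le_bytes());
    out.extend_from_slice(bytes);
}

/// Hypothetical per-entity layout: entity (u64), change-list length (u64),
/// enum tag (u32), replication ID (u64), then length-prefixed component bytes.
fn encode_entity(out: &mut Vec<u8>, entity: u64, replication_id: u64, component: &[u8]) {
    out.extend_from_slice(&entity.to_le_bytes());
    out.extend_from_slice(&1u64.to_le_bytes()); // one change in the list
    out.extend_from_slice(&0u32.to_le_bytes()); // tag for the `Changed` variant
    out.extend_from_slice(&replication_id.to_le_bytes());
    encode_bytes(out, component);
}

/// Hypothetical diff layout: tick (u32), map length (u64), the entities,
/// then an empty despawn list (u64 length only).
fn encode_diff(tick: u32, entity_count: u64) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&tick.to_le_bytes());
    out.extend_from_slice(&entity_count.to_le_bytes());
    for entity in 0..entity_count {
        // Every component is empty, so only the overhead is written.
        encode_entity(&mut out, entity, 1, &[]);
    }
    out.extend_from_slice(&0u64.to_le_bytes());
    out
}

fn main() {
    let diff = encode_diff(169, 900);
    println!("{} bytes for 900 empty components", diff.len());
}
```

Under these assumptions each entity costs 36 bytes of pure overhead, 8 of which are the length prefix of the empty inner Vec.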

spring raptor
#

I can serialize it manually, but it's a lot of error-prone code.

#

I will try to tweak serde.

dire aurora
#

In general that format isn't great tho. It creates more allocs for one entity than my serialization logic does probably 🤔

spring raptor
#

I think if I switch to byte slices it should work better.

dire aurora
#

Well besides the missing features my crate probably has the better basis anyway, since my original goal was to replace my old custom packets that wasted as few bytes as possible 🙃

#

Really need to change the format to something that really sends things per-entity tho. Could save some overhead when multiple bundles update at once

spring raptor
#

You mean for the future networking crate?
I think that we have different views on it.

At first I thought that my approach was just bad and I should rewrite it or help you to build yours.
But today I made a fairly small change and I think it's very close to what I would like to see. Here is what I think should be changed:

  1. World diff should be reduced in size.
  2. Instead of one single diff, I should separate it by user-defined groups.
  3. Diffs should be compressed optionally via feature.
  4. Add some customization, like send timeout, per-entity serialization rules.

That's it, everything else is in place.
dire aurora
#

I asked Joy about the compression at a packet level, and there's probably no good way to do it efficiently

#

Also reducing the size and splitting diffs aren't small changes, both of those would probably break the entire crate in every possible way 😂

echo lion
#

have you tried serde_with?

#

#[serde_as(as = "Bytes")]
bytes: Vec<u8>

This is supposed to make it serialize better

dire aurora
#

Compression libraries tend to make it very annoying to know how full your packet is

dire aurora
spring raptor
dire aurora
#

Replacing bincode is an option, but that still leaves overhead of shoving a bunch of vecs in there, which just adds extra unnecessary data

echo lion
#

32 bits for the length or something? what if you use bincode::DefaultOptions::new().serialize() so it serializes ints as varints?
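For reference, this is what variable-length integer encoding buys. The sketch below is a generic LEB128-style varint written for illustration only; bincode's own variable-length scheme may differ in detail, and `write_varint` / `read_varint` are hypothetical helpers.

```rust
/// Appends `value` as a LEB128-style varint: 7 payload bits per byte,
/// with the high bit set on every byte except the last.
fn write_varint(out: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Reads a varint from the start of `input`, returning the value and the
/// number of bytes consumed, or `None` if the input is truncated.
fn read_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A u64 never needs more than 10 bytes in this encoding.
    for (index, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * index);
        if byte & 0x80 == 0 {
            return Some((value, index + 1));
        }
    }
    None
}

fn main() {
    for value in [0u64, 127, 300, u64::MAX] {
        let mut out = Vec::new();
        write_varint(&mut out, value);
        println!("{value} -> {} byte(s), decoded {:?}", out.len(), read_varint(&out));
    }
}
```

A length of zero costs one byte instead of a fixed-width prefix, which is where most of the saving for many small or empty Vecs would come from.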

spring raptor
# echo lion have you tried serde_with?

I haven't tried anything yet, I only ditched the replication. I'm going to play with several approaches. I can always serialize manually, but I will try a more automatic approach.

spring raptor
dire aurora
#

If you don't know how big the data you are writing is, you can't fill a packet up to the limit

#

Which means you don't get the benefits of compressing it. Sending many half full packets doesn't usually get you more bandwidth

echo lion
#

you compress and then rearrange the bytes after that

#

unless you are compressing cross-entity?

#

Or use heuristics to guestimate how much compression you'll get, then back-track if it doesn't compress enough.

spring raptor
dire aurora
#

One reasonable approach is to compress only some components

#

If you use bitcode for example most data is barely gonna compress

#

But there's always gonna be exceptions where compression could get you huge improvements

#

Also reduces the amount of overhead it creates

spring raptor
#

Makes sense. I will leave it up to users, they can add any kind of compression in serialization functions.

dire aurora
#

You mean in a serde::Serialize impl?

spring raptor
spring raptor
#

Drafted a new release with these changes.
It's already better than what I had before and I would like to gather user feedback about API changes.
Tomorrow I will continue working on diff representation.

spring raptor
#

Currently I serialize WorldDiff in two steps: serialize components into bytes and then serialize them again with the corresponding entities. It was a quick and dirty solution and I see at least two disadvantages with the second serialization:

  1. It copies serialized component bytes.
  2. It includes bytes len into packet.

And I think I know how to solve it. I can manually implement serde traits for WorldDiff. This way I will be able to just put Ptr into it and serialize it once.
I did the same thing with reflection back then. But reflection was both serialized and deserialized as Box<dyn Reflect>. I can't deserialize Ptr as is, so I have to deserialize it as a different struct with OwningPtr.

spring raptor
# spring raptor Currently I serialize `WorldDiff` in two steps: serialize components into bytes ...

OwningPtr can't hold a value, it's just a pointer that is responsible for dropping. So I implemented the serde deserialization trait to deserialize into the world directly. It works, but looks so cursed... I will probably take some time to rethink or polish it.

In the meantime I reworked entity mapping to remove extra allocations. In my benchmark the client processes updates about 10% faster. But the speedup depends on the number of entities in the message.

spring raptor
echo lion
#

Could you do some intermediary traits to get the API you want?

#

With an auto implementation for stuff that implements Serialize/Deserialize

spring raptor
# echo lion With an auto implementation for stuff that implements Serialize/Deserialize

It works exactly this way, but it limits how users can customize serialization.
Here is how serialization automatically implemented:
https://github.com/lifescapegame/bevy_replicon/pull/38/files#diff-6952d46bb3064d306c840e5590fff42e4001643f9e7df462521f778421c601faR234
Users can override it, but they can only return a Serialize value for the whole struct or for a specific field. You can't serialize, for example, both transform and rotation.
It would be possible if I were able to pass a Serializer (note the r at the end) to this function instead. But due to some lifetime issues I can't. And that's only for serialization; deserialization is flexible:
https://github.com/lifescapegame/bevy_replicon/pull/38/files#diff-6952d46bb3064d306c840e5590fff42e4001643f9e7df462521f778421c601faL238

#

The issue with serializer is that for some reason I can't erase it.
The following code:

let serializer = &mut <dyn erased_serde::Serializer>::erase(serializer);

results in

the associated type `<S as client::_::_serde::Serializer>::Ok` may not live long enough
consider adding an explicit lifetime bound `<S as client::_::_serde::Serializer>::Ok: 'static`...
...so that the type `<S as client::_::_serde::Serializer>::Ok` will meet its required lifetime bounds [E0310]

This is why I return Serialize from function and pass it to serializer.

pseudo lion
#

Hey! This is a nice crate. Are there any plans to support prediction in bevy_replicon?

spring raptor
bold crow
#

Hi, I have a weird problem: with a standalone headless server no components get replicated, no client events are fired and nothing works, although the connection is fine. Does anyone know why this could be the case? I checked the network traffic and there is indeed stuff being sent.

spring raptor
bold crow
#

Yeah the example works, and my code also worked with a "client host" but not with a standalone server

#

Basically the server doesn't do anything except for replicating components and logging events:

if server {
    App::new()
        .add_plugins(MinimalPlugins)
        .add_plugins(LogPlugin::default())
        .add_plugins(ReplicationPlugins.set(ServerPlugin::new(TickPolicy::MaxTickRate(60))))
        .replicate::<Transform>()
        .replicate::<Player>()
        .add_client_event::<TestEvent>(SendPolicy::Unordered)
        .add_systems(
            Startup,
            (network::server::init_server, || info!("Server started")).chain(),
        )
        .add_systems(
            Update,
            (
                network::server::server_system,
                |mut events: EventReader<FromClient<TestEvent>>| {
                    for event in events.into_iter() {
                        info!(
                            "Received event from client {}: {:?}",
                            event.client_id, event.event
                        );
                    }
                },
            ),
        )
        .run();
}

The client successfully logs on, which the server registers, and then the server spawns a Player component which I can confirm is spawned on the server but not replicated to the client. Also, TestEvents sent from the client are not registered by the server 😦

pseudo lion
#

Did you remember to add the Replication component to your entity?

bold crow
#

Yes, It also worked with a client host so that's probably not the Problem

sharp roost
#

Having an entirely separate client prediction framework that's networking layer independent would be the dream I think

#

Oops that was meant for the previous conversation 😅

bold crow
# pseudo lion Did you remember to add the `Replication` component to your entity?

Btw, quick question regarding replication: as seen in the example you don't replicate the complete entity with all components that it's supposed to have on the client, like Sprites and stuff (right?) so you need something that "upgrades" the replicated entities to add the missing components. Why would that be any different than sending an event like "Spawn X" and then the client proceeds to spawn that entity with all its components?
Of course replicating transform components and stuff is very useful.

pseudo lion
#

The docs mention the "blueprint pattern" to deal with this problem. It's also a good pattern to ensure separation of concerns.

#

As for how its different from manually sending events, it's less manual work and leads to a nicer design

sharp roost
#

I'm just getting started with Bevy_replicon, and I have a question about what you should do with player characters. I tried spawning them on the client side, but it seems only entities spawned on the server-side are replicated (which is probably good), but that means when a client joins I spawn a new entity to be that character's player (which does make sense that the server would be responsible for in case the player had a previous location in the level), but that means the server has to tell the player which entity to control, which does not seem quite as easy to me.
My current idea is to send an event from the server to the client with the Entity ID, which the player will then map to the local entity, and then add a Control component, but this seems kind of weird, and also seems like it would require some buffering in case the event is delivered before the entity gets replicated. Is there a better way to handle this? Like a "replicate this component only to this client"?

bold crow
spring raptor
spring raptor
sharp roost
spring raptor
# sharp roost I'm just getting started with Bevy_replicon, and I have a question about what yo...

Yes, only the server replicates to clients. You need to either send an event from the client like "I want to spawn something", or spawn it on the server automatically on join and send an event from the server: "okay, you are controlling this one". It's common, I think Unreal Engine works the same way.
After you receive the "this is your character" event from the server, you locally add your marker component and anything else you need.
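A rough sketch of the client-side bookkeeping this implies, including the buffering case raised in the question (the event arriving before the entity is replicated). It uses plain `u64` IDs instead of Bevy entities, and `ControlAssigner` with its methods is entirely hypothetical, not part of bevy_replicon.

```rust
use std::collections::HashMap;

/// Hypothetical client-side state for "the server told me which entity is mine".
#[derive(Default)]
struct ControlAssigner {
    /// Server entity ID mapped to the local entity ID it was replicated as.
    server_to_local: HashMap<u64, u64>,
    /// Server IDs announced as "yours" before their entity arrived.
    pending: Vec<u64>,
    /// Local entities that should receive the control marker.
    controlled: Vec<u64>,
}

impl ControlAssigner {
    /// Called when the "this is your character" event arrives.
    fn on_control_event(&mut self, server_entity: u64) {
        match self.server_to_local.get(&server_entity) {
            Some(&local) => self.controlled.push(local),
            // The entity has not been replicated yet: remember the request.
            None => self.pending.push(server_entity),
        }
    }

    /// Called when a replicated entity is spawned locally.
    fn on_entity_replicated(&mut self, server_entity: u64, local_entity: u64) {
        self.server_to_local.insert(server_entity, local_entity);
        if let Some(position) = self.pending.iter().position(|&id| id == server_entity) {
            self.pending.swap_remove(position);
            self.controlled.push(local_entity);
        }
    }
}

fn main() {
    let mut assigner = ControlAssigner::default();
    assigner.on_control_event(42); // the event arrives first
    assigner.on_entity_replicated(42, 7); // the entity arrives later
    println!("controlled local entities: {:?}", assigner.controlled);
}
```

In a Bevy app the `controlled` list would correspond to inserting a marker component on the mapped local entity.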

spring raptor
spring raptor
bold crow
#

But what does that mean for latency, I mean you can't just send all your inputs to the server and then wait for the changed transform component to replicate back to the client, that could mean huge amounts of latency

spring raptor
#

The idea is that you don't wait for the response from the server and spawn your entity right away.

#

And when the server sends you validation back, you check if it succeeded. If not, you roll back the changes.

#

If you want more details, I would highly recommend to watch https://youtu.be/zrIY0eIyqmI?feature=shared&t=1341

In this 2017 GDC session, Blizzard's Timothy Ford explains how Overwatch uses the Entity Component System (ECS) architecture to create a rich variety of layered gameplay.

#

They also use ECS, btw :)

bold crow
#

So can you disable the replication of a single component for a single entity? Like when a player moves you send the inputs to the server but apply it locally right away, but then you get the update from the server replicating the Transform component which is now one RTT out of date

sharp roost
#

There's Ignore<T> I think

spring raptor
#

Predicted<T> currently doesn't exist, but I can provide this API for you if you are planning to create prediction logic for your game or crate.

#

The idea is that all changes go into a scoop called Predicted<T> and you apply changes from it manually.

#

If this component exists.

#

But if your game is slow-paced, like mine (a life simulation game, similar to The Sims), you may not want prediction at all.
So it depends.

spring raptor
bold crow
#

I think for my purpose it may be okay to accept the client as authoritative over their own player and don't check anything, that would enable cheaters to do bad stuff but I'm not really building the maximum security game with kernel-level anticheat here 😂

spring raptor
#

There are many things that are not deterministic.

pseudo lion
#

i'm toying around with a prediction api on top of replicon now, but no guarantees that it will lead anywhere

spring raptor
#

Also in the video above you can see an example where a stun could be mispredicted. Highly recommend watching it if you haven't

bold crow
#

Yeah you can't skip that if the client sends inputs, but if the client sends its transforms and stuff to the server you can accept that, replicate it to the other clients and proceed 😂

spring raptor
#

Oh, this could work, yes 😅

#

My general recommendation would be to focus on the game and just plan the architecture with networking in mind. You can fix and improve things later.

spring raptor
#

Maybe "layer" is a wrong word. I should have said "transport" instead.

spring raptor
sharp roost
#

So for a moving player (like platformer movement) you'd have to send an event of that movement and have the server apply it and then it'll get replicated back?

bold crow
spring raptor
# sharp roost So for a moving player (like platformer movement) you'd have to send an event of...

Absolutely not, you will have a terrible response time. You need prediction for time-sensitive things. The video linked above explains it a lot better, but tl;dr:

  1. You send input and apply it locally. And you buffer your inputs.
  2. Then you wait until the server replicates it back, apply the value from the server, and replay all inputs since the tick the server acknowledged. Plus some smoothing on top to avoid teleportation.
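The two steps above can be sketched in a few lines. This is a toy one-dimensional model, not bevy_replicon API: `Input`, `Predictor` and the idea that the server reports an acknowledged tick alongside the position are assumptions, and the smoothing step is left out.

```rust
use std::collections::VecDeque;

/// Hypothetical input: move by `dx` on the given client tick.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Input {
    tick: u32,
    dx: i32,
}

/// Toy predictor for a single one-dimensional position.
#[derive(Default)]
struct Predictor {
    position: i32,
    /// Inputs sent to the server but not yet acknowledged.
    pending: VecDeque<Input>,
}

impl Predictor {
    /// Step 1: apply the input locally right away and buffer it.
    fn apply_local(&mut self, input: Input) {
        self.position += input.dx;
        self.pending.push_back(input);
    }

    /// Step 2: adopt the server's value, drop the inputs it already covers,
    /// and replay everything newer on top of it.
    fn reconcile(&mut self, server_position: i32, acked_tick: u32) {
        self.pending.retain(|input| input.tick > acked_tick);
        self.position = server_position;
        for input in &self.pending {
            self.position += input.dx;
        }
    }
}

fn main() {
    let mut predictor = Predictor::default();
    for tick in 1..=3 {
        predictor.apply_local(Input { tick, dx: 1 });
    }
    // The server rejected the first move: position 0 as of tick 1.
    predictor.reconcile(0, 1);
    println!("position after replay: {}", predictor.position);
}
```

When the prediction was right, the replayed position equals the old one and nothing visibly changes; only mispredictions cause a correction.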
sharp roost
#

Yeahh, I meant that's what happens without prediction

#

Is there currently a way in bevy_replicon to know which tick the replicated value corresponds to? So that you'd know how many of the inputs to apply

spring raptor
spring raptor
sharp roost
sharp roost
spring raptor
spring raptor
# spring raptor Refined it, reduced amount of unsafe, all tests pass now: https://github.com/lif...

This approach is not only ugly, but also limits the serialization customization that we have. I need this flexibility because I'm planning to have a macro in the future that generates custom serialization functions with specific number precision, for example.
So I tried a manual approach. And it's not only simpler and takes less code, but also keeps all the flexibility in place!
https://github.com/lifescapegame/bevy_replicon/pull/39

Overall this change reduced packet size in my bench from 32420 to 25220. And now doors to memory reuse are open. We also have about 10% send speed improvement and 30% for receive in my benchmark. But depends on message size.

distant shore
spring raptor
#

But your current approach is totally fine. Start with working things, you will be able to iterate on it later.

spring raptor
distant shore
#

Gotcha, I'll watch that video and add this to my gameplay feel Todo list to eventually come back to lol

spring raptor
#

@dire aurora I was going to implement memory reuse, but realized that renet always requires me to copy the message.
I can write my message into a BytesMut, use freeze() to convert it to Bytes and cheaply clone that into renet. But I can't convert it back into BytesMut.
Am I missing something?

dire aurora
#

You can't pass it to renet and then alter it later. Since it holds onto it for (as far as the compiler is concerned) an arbitrary amount of time. The common case here is to just give your data to renet, and preallocate a new buffer (or pass renet a copy, which I think might be more efficient, because if cap > len converting Vec to Bytes requires extra allocation)

spring raptor
bold crow
#

Btw, what are your approaches to abstracting the networking stuff? I've just begun adding multiplayer stuff but I find myself constantly differentiating between "local" and "remote" (i.e. replicated) stuff and that really messes things up and introduces complexity & coupling that I don't necessarily want

spring raptor
bold crow
#

What? I mean the docs say "Write the same logic that works for both multiplayer and single-player." but I don't see how that would be possible, you have to explicitly upgrade replicated entities and send events between the client and server so how can they share the same code?

spring raptor
bold crow
#

Oh I see

#

thanks!

spring raptor
#

English is not my native language, so if you find any unclear things in the docs - let me know.

echo lion
#

@spring raptor in `diffs_sending_system()` you can cut out `let mut messages = Vec::with_capacity(client_diffs.len());` if you remove the RenetServer resource and call `.send_message()` in-line (my guess is you separated them due to lifetime issues).

stable jolt
echo lion
#

@stable jolt you can Ignore<T> components that shouldn't be replicated.

#

if that isn't convenient you could put all the non-replicated stuff in a separate child/parent entity

stable jolt
#

When a component is replicated all the other components of the entity will be replicated so ?

echo lion
#

Components aren't replicated specifically, it is 'if an entity has Replication then replicate all components that are eligible'.

stable jolt
#

If I want to do the other way around, like saying which component should be replicated instead of writing Ignore<T> everywhere how should I do that ?

stable jolt
#

So if an Entity has the component Replication then it will replicate all the components marked as replicated

echo lion
#

Only components that are registered to the app with .replicate::<T>() will be replicated, so you could avoid replicating components you don't want.

stable jolt
spring raptor
echo lion
#

yes

spring raptor
#

Yes, I did it because of lifetime issues :)

spring raptor
#

I currently borrow it read-only

echo lion
#

True but there isn't a whole lot you can do in parallel when a system has &World locked up.

#

You'll probably need to do that anyway once renet exposes a streaming API.

spring raptor
echo lion
#

It's just one vec alloc so not that big a deal in the grand scheme of things, but nice to shave things off where possible 🙂

spring raptor
spring raptor
#

BTW, I switched to using network ticks, will push this change soon.

echo lion
#

Ok so, I believe you can cut out the ClientDiffs completely and serialize directly into the message.

#

The only slightly hard part is updating the entities.len() and components.len() parts as you traverse the world archetypes.

spring raptor
echo lion
#

Are entities found in multiple archetypes?

echo lion
# spring raptor Yes, because of this :(

With this you just need a pointer to the place where the length is stored. The main problem is Changed and Removed diffs would have to be in separate spots of the message.
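A minimal sketch of the "remember where the length is stored" trick: reserve a fixed-width slot, write the entries while counting, then patch the slot. The helper names and the 2-byte slot are assumptions for illustration; a varint length could not be patched in place this easily because its width depends on the final value.

```rust
/// Reserves a 2-byte length slot and returns its offset in the buffer.
fn begin_len(out: &mut Vec<u8>) -> usize {
    let offset = out.len();
    out.extend_from_slice(&0u16.to_le_bytes());
    offset
}

/// Overwrites the slot at `offset` with the final count.
fn end_len(out: &mut Vec<u8>, offset: usize, count: u16) {
    out[offset..offset + 2].copy_from_slice(&count.to_le_bytes());
}

/// Writes all entities from several archetypes straight into the message,
/// without first collecting them into an intermediate list.
/// Entities are modelled as plain `u32` values here.
fn write_entities(out: &mut Vec<u8>, archetypes: &[Vec<u32>]) {
    let slot = begin_len(out);
    let mut count = 0u16;
    for archetype in archetypes {
        for entity in archetype {
            out.extend_from_slice(&entity.to_le_bytes());
            // Overflows (and panics in debug builds) above 65535 entities.
            count += 1;
        }
    }
    end_len(out, slot, count);
}

fn main() {
    let mut message = Vec::new();
    write_entities(&mut message, &[vec![1, 2], vec![3]]);
    println!("{} bytes, count bytes {:?}", message.len(), &message[..2]);
}
```

The same pattern nests: an outer slot for the entity count and an inner slot per entity for its component count.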

#

Although now that I look at it... removal trackers are on the entities where components are removed. So you could add an in-line branch for archetypes with RemovalTracker 🙂

stable jolt
echo lion
spring raptor
echo lion
#

If you map [ components : entities ] would the perf be bad for clients?

spring raptor
#

If you know how to change it, the PR is welcome. But I would suggest to wait a little bit until I push the new update.

spring raptor
spring raptor
echo lion
#

The server-side perf would be significantly better without the ClientDiff intermediary. Client-side is dominated by renet, indirections, and component deserializations.

#

It's hard to say how a different indirection scheme would affect things.

#

I will take a stab at it once your update is ready, lmk

spring raptor
stable jolt
spring raptor
spring raptor
echo lion
#

ok 🙂

#

Is it fine to limit number of entities to 2^16 = 65k?

spring raptor
spring raptor
#

Didn't realize it right away.

#

Maybe we could change iteration...

echo lion
#

Why would that increase size?

#

Isn't it components x entities either way? Or am I blind

#

Oh because component values are different per entity duh

spring raptor
#

Yes, we can't map it backwards :(
We can only include component and entity.

#

Like in tuple.

#

I'm wondering if we could change the iteration. Can we iterate over archetype entities first and then fetch info about components?

echo lion
#

What do you think about somehow reusing buffers, then copying final buffer value into message for renet?

spring raptor
#

I tried, but got a very small speedup, something like 1%

#

I used HashMap<u64, Vec<u8>> as a Local.

#

Also thought about reusing WorldDiff, but impossible because of pointers.

#

They borrow World.

echo lion
#

Hmm strange that it wouldn't be faster

spring raptor
#

You could try it on your machine, it's a very easy change.

#

Maybe it depends on the machine, not sure.

echo lion
#

I will try it!

spring raptor
echo lion
spring raptor
echo lion
#

@spring raptor the benchmark seems to be broken on master as well:

Benchmarking entities send: Collecting 50 samples in estimated 6.0184 s (3825 iterations)
thread 'main' panicked at 'assertion failed: (left == right)
left: 0,
right: 900', benches/replication.rs:37:17

#

Ok it seems you need to add a thread sleep between server/client updates in order to wait for the packets to travel.

echo lion
stable jolt
#

When spawning something like a Player for instance, it should be done on the server ? Like Replication does not make any entity right ?

echo lion
#

yes

#

it will spawn the entity on the client though

spring raptor
spring raptor
echo lion
#

Ah it should actually be a lot faster for real applications where you are replicating updates. The benchmark is making new apps for every test, so it's always testing the worst case scenario (new entities and new components).

echo lion
#

Added some benchmarks, looks like 50% faster for updating changed components on existing entities.

spring raptor
spring raptor
#

About the delay. Was surprised that it's needed because renet doesn't do anything between updates. Maybe it's because it takes time to update data in the written socket?

#

Maybe we could make the check more robust. For example, add extra updates if the message wasn't received.
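One way the "extra updates until the message arrives" idea could look, as opposed to a single fixed sleep. The sketch simulates the delayed packet with a std channel and a thread; `update_until` is a hypothetical helper, and in the real benchmark the closure would run the client app's update and inspect the world instead.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Calls `try_update` repeatedly until it yields a value or `timeout`
/// elapses, sleeping briefly between attempts.
fn update_until<T>(timeout: Duration, mut try_update: impl FnMut() -> Option<T>) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = try_update() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        thread::sleep(Duration::from_millis(1));
    }
}

fn main() {
    // Simulated "socket": the message shows up a few milliseconds later.
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(5));
        sender.send(900u32).expect("receiver should still be alive");
    });

    let received = update_until(Duration::from_secs(1), || receiver.try_recv().ok());
    println!("received: {received:?}");
}
```

The timeout keeps a genuinely lost message from hanging the test, while the common case finishes as soon as the data is there.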

stable jolt
#

I have an issue, when I try to send an Event during the Startup phase from a client it is not received by the server

spring raptor
#

We send data only in PostUpdate, unfortunately you can't send it there.

stable jolt
#

That's problematic if you want to setup things during Startup

#

I just wanted to spawn a Player entity when a Client connect to the server

spring raptor
stable jolt
#

So I have an issue cuz it doesn't work for me I think

spring raptor
#

So the issue somewhere on your side. Maybe you send it before the connection.

stable jolt
echo lion
spring raptor
echo lion
#

Until then, a simple 5ms sleep solves it. The sleep is not in the bench zone so should not cause a problem.

stable jolt
#

I think you can send events on startup
You need to send events after connection. I can't connect in startup.

Isn't it contradictory ?

spring raptor
#

I wouldn't rely on it. Instead, wait for connection and then do any kind of initialization.

spring raptor
stable jolt
spring raptor
#

So wait for connection and spawn it.

echo lion
#

I have a lot of sleeps in my server test code lol. Async and socket stuff just needs it.

stable jolt
spring raptor
#

Check out the example in the repo

#

On server there is an event when client connects. For client there is a condition.

stable jolt
#

Thx I'll try that out

spring raptor
#

Yes, most likely you want to spawn it on server and it will be replicated to client.

spring raptor
echo lion
#

The RenetServerPlugin causes ServerEvent events to be emitted, which include client connects.

spring raptor
#

@echo lion added a constant for clarity and removed two sleeps that I suspect are unnecessary. Could you please check it?

#

Because they always work on my machine for some reason.

echo lion
#

I don't see any changes

spring raptor
#

Oh, there was an amend, one second.

echo lion
#

All the sleeps in there are necessary. You need sleeps between server -> sleep -> client -> sleep -> server

#

otherwise the behavior is not guaranteed to match expectations for the test

spring raptor
#

Done

echo lion
#

Yes the constant is fine, those sleeps are needed for correctness

spring raptor
#

Hm... Are you sure that client -> sleep -> server is necessary? It sends a very tiny amount of data.

echo lion
#

the amount of data isn't the problem, it's the uncertainty about if it arrives or not

spring raptor
#

Could you check and confirm it just to be sure?

echo lion
#

it isn't deterministic

spring raptor
#

Strange, I have tests that always pass on your machine.

#

They do pass, right?

#

And they don't have any sleeps

echo lion
#

The tests pass without those two extra sleeps, but the test itself is incorrect without them.

spring raptor
#

I have a similar test for acks

echo lion
#

Future changes to the core code may silently break the tests.

spring raptor
#

Never happened to me. I don't think any of us completely understands how system sockets work 😅
I assume it's because of amount of data.

#

And not only on my machine. Even slow CI machines never fail.

echo lion
#

Well my problems are evidence that you can't leave these things to chance. Correct is better than minimized.

spring raptor
#

Just try this change out, check if it works.

echo lion
#

Like I said, the tests pass without the two extra sleeps. I added them for correctness.

spring raptor
#

It just works for you

#

So it confirms my theory. It depends on amount of data.

#

My system somehow handles it and yours doesn't. It's okay.

echo lion
#

No, if the server doesn't receive an ack then the benchmark doesn't fail. A separate test/benchmark that dealt with acks would fail if acks weren't received.

spring raptor
#

Are you talking about tests, not benchmarks?

echo lion
#

I'm talking about the benchmarks.

spring raptor
#

Oh, I get it now.

#

Okay, let's keep them then.

echo lion
#

Thanks!

#

It's possible the socket gets overloaded and slows down in the bench loops.

spring raptor
#

I noticed that we have two sleeps inside the loop in update, and don't have them in send. Is this on purpose?

#

Sorry, let me rephrase. In the benchmark, when we replicate spawns, we use only one sleep. Is this on purpose?

echo lion
#

Yes because the client is discarded after it updates

spring raptor
#

We definitely need an abstraction over the networking layer in the future :)

#

Moving to the next PR

#

Your PR reduced packet size from 25220 to 4006

#

That's impressive.

echo lion
#

wow lol

spring raptor
#

That's because I used enum and I suspect that it was serialized as u64 system.

echo lion
#

probably from serializing entities and replication ids as varints

spring raptor
#

on my system*

#

Exactly. What a stupid idea it was 😅

echo lion
#

entities could be further reduced by splitting into two u32s and serializing both as varints

spring raptor
#

Also you reduced component sizes quite nicely too, cool

echo lion
#

since they are actually two ids concatenated

spring raptor
#

Could you rebase your PR on latest master?

echo lion
#

The benches pr did not merge yet

spring raptor
#

Right, merged manually without waiting for CI.
I don't run benches on CI anyway. For now, at least.

echo lion
#

Ok fixed

spring raptor
#

Nice PR, reasonable change.

#

BTW, I used enums a long time ago because of reflection, totally unnecessary now, thanks

echo lion
#

I would not have figured it out without your recent updates exposing the Ptr<> idea and showing how to use bincode, and the hint about Local

spring raptor
#

I would not have figured it out without @NiseVoid's idea about having a map to deserialize and serialize things 😅

echo lion
#

I think we can get packet size down to 1.5-2k with that entity serialization change

#

I will update the PR

#

or we can do a new PR

spring raptor
#

Let's go with a separate PR

#

I will merge this one tomorrow, because I want to suggest some minor style/organization adjustments. It's late at night for me. Tomorrow I will have more time because today I went to visit relatives and tomorrow I have a day off

echo lion
#

sounds good

spring raptor
#

Have a good night/day!

echo lion
#

although style stuff may be more efficient to commit on top yourself

spring raptor
#

Maybe, yes. I will open a PR to your branch then

stable jolt
#

Do you know if replication would work well with physics crates like bevy_xpbd or bevy_rapier ?

echo lion
#

probably not, physics is very tightly coupled to fixed updates

#

however you can replicate events tied to physics

#

like 'start jumping'

stable jolt
#

So like every event (keyboard, mouse...) that would directly impact physics so

spring raptor
#

@stable jolt I highly recommend reading https://gafferongames.com/post/introduction_to_networked_physics/
So replication can be used too, but you need to buffer the received data and interpolate it.
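A minimal sketch of "buffer and interpolate", assuming one-dimensional positions and linear interpolation. `Snapshot` and `interpolate` are hypothetical names for illustration; the render time would typically be the current time minus a small delay, so that two snapshots surround it:

```rust
/// A received position sample tagged with the server time it describes.
#[derive(Clone, Copy)]
struct Snapshot {
    time: f32,
    position: f32,
}

/// Linearly interpolates between the two buffered snapshots that surround
/// `render_time`. Returns `None` if the buffer does not bracket that time.
fn interpolate(buffer: &[Snapshot], render_time: f32) -> Option<f32> {
    buffer.windows(2).find_map(|pair| {
        let (a, b) = (pair[0], pair[1]);
        if b.time > a.time && a.time <= render_time && render_time <= b.time {
            // Fraction of the way from snapshot `a` to snapshot `b`.
            let t = (render_time - a.time) / (b.time - a.time);
            Some(a.position + (b.position - a.position) * t)
        } else {
            None
        }
    })
}
```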

#

@echo lion Hi!
I noticed that on my machine sleep somehow affects the benchmark, and even on master I sometimes have random spikes in performance. Could you check if you have the same on your machine?

#

The spikes are about 15-30% instead of the 0-3% I usually have.

spring raptor
#

I don't mind having sleeps in benchmarks at all, but such huge spikes make it difficult to see how much performance has actually improved.
I tried reducing the amount of entities instead and added the benchmark to CI. Looks like it passes. Could you confirm that it works on your machine too?
If it doesn't work on your machine, then I assume that's because of the platform (I use GNU/Linux). In this case we need to bring back the sleeps, but apply them based on platform.
https://github.com/lifescapegame/bevy_replicon/pull/43

spring raptor
#

About your PR - I think I have an idea how to extend it to have only a single buffer per client instead of per entity. The way you wanted to do it in the first place.
I will try it now, if it turns out garbage, we will go with your PR unchanged (except a few style things).

echo lion
#

Maybe a spin sleep instead?

spring raptor
#

I'm fine with any solution that won't cause spikes and works on your machine too :)

#

Do you also have such spikes?

echo lion
spring raptor
#

That's probably because of it. Maybe apply sleeps only on macos?

echo lion
#

I'm not sure what you mean by spikes. A spin sleep is just a hot loop, which should keep the CPU hot in the scheduler. The OS sleep probably invalidates the warm-up.
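A minimal sketch of a spin sleep using only the standard library. `spin_wait` is a hypothetical helper; the actual PR may use a dedicated crate or a different loop:

```rust
use std::hint;
use std::time::{Duration, Instant};

/// Busy-waits until `duration` has elapsed instead of yielding to the OS scheduler.
fn spin_wait(duration: Duration) {
    let start = Instant::now();
    while start.elapsed() < duration {
        // Tell the CPU we are in a busy-wait loop.
        hint::spin_loop();
    }
}
```

The thread never blocks, so the wait costs a full core for its duration, which is usually acceptable in a benchmark but not in production code.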

#

I will implement an in-memory transport layer to solve it fully. It is needed anyway for WASM local player.
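A rough sketch of what an in-memory link could look like, assuming two queues stand in for the two directions of a socket. `MemoryLink` and its methods are hypothetical and not an existing renet or bevy_replicon API:

```rust
use std::collections::VecDeque;

/// Hypothetical in-memory link between one server and one client.
#[derive(Default)]
struct MemoryLink {
    to_client: VecDeque<Vec<u8>>,
    to_server: VecDeque<Vec<u8>>,
}

impl MemoryLink {
    /// Queues a packet for the client; it is available immediately.
    fn server_send(&mut self, packet: &[u8]) {
        self.to_client.push_back(packet.to_vec());
    }

    /// Queues a packet for the server; it is available immediately.
    fn client_send(&mut self, packet: &[u8]) {
        self.to_server.push_back(packet.to_vec());
    }

    /// Pops the oldest packet addressed to the client, if any.
    fn client_receive(&mut self) -> Option<Vec<u8>> {
        self.to_client.pop_front()
    }

    /// Pops the oldest packet addressed to the server, if any.
    fn server_receive(&mut self) -> Option<Vec<u8>> {
        self.to_server.pop_front()
    }
}
```

Because delivery is a queue push, a packet is guaranteed to be readable on the next receive call, so no sleep is needed between the sending and receiving side.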

spring raptor
#

By spikes I meant that sometimes I randomly get a 15-30% speedup or slowdown. Do you observe anything like this? Just try to run cargo bench about 3 times.

Let's try, feel free to open a PR.

spring raptor
#

As a temporary workaround I would go with a spin loop as you suggested.

echo lion
#

Ok new PR to spin, it seems to give much cleaner results.

spring raptor
#

Great, let me try

#

Works perfectly for me too, thank you!

#

I pushed a small change to create the sleeper only once.

echo lion
#

Ah nice

spring raptor
#

@echo lion merged. Could you pull this change to your branch?
I have no doubts that your PR increases performance significantly, I just want to compare it with my work on top of it. I almost finished it.

echo lion
#

done

#

I am working on a fix for issue #44

spring raptor
echo lion
#

I am looking at entity serialization. When the generation is zero, the entity can be serialized as u64 with 1 byte as a varint, but when the generation is non-zero it needs at least 5 bytes. If we serialize as 2 u32s then it always needs at least 2 bytes. So there is a tradeoff - assume generation zero for most entities, or assume worst-case of many entity spawn/despawn cycles?
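The byte counts in this tradeoff can be checked with a small sketch. It assumes a LEB128-style varint (7 payload bits per byte) and that the generation sits in the high 32 bits of the packed u64; bincode's own varint scheme uses different size steps, so the exact numbers there may differ. All function names are hypothetical:

```rust
/// Bytes needed by a LEB128-style varint for `value` (7 payload bits per byte).
fn varint_len(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Size when index and generation are packed into one u64
/// (generation in the high 32 bits, as assumed here).
fn packed_len(index: u32, generation: u32) -> usize {
    varint_len(((generation as u64) << 32) | index as u64)
}

/// Size when index and generation are written as two separate varints.
fn split_len(index: u32, generation: u32) -> usize {
    varint_len(index as u64) + varint_len(generation as u64)
}
```

Under these assumptions a small index with generation 0 costs 1 byte packed and 2 bytes split, while the same index with generation 1 costs 5 bytes packed and still 2 bytes split.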

echo lion
spring raptor
#

Ping me or reply if you want to reach me, I will answer faster.

echo lion
#

Ok

spring raptor
#

Did it! It's possible!
But there is some mismatch in serialization and deserialization, chasing it now.

echo lion
#

You should look at #44, it seems to be a deeper bug than I thought.

spring raptor
#

Will do right after this change

spring raptor
#

It's a lot faster than without it, but slightly slower than your version. But that's because I currently don't use varint encoding. The size was reduced from 25220 to 22510. But your PR has 4006 because of varints. I will add it now.

#

About get_info - I totally agree with the change, but let's address it in a separate PR.

echo lion
#

I don't think your version can get smaller than mine, since you are using an enum for diffkind per component but I use one or two chars per entity.

#

Test with 5 different components and see

spring raptor
echo lion
#

oh wait I see lol, hasty comment

spring raptor
#

But about entity - yes, I currently don't do it in a smart way, but it's possible, yes. I thought that you have something cooler in mind for it?

spring raptor
#

I keep static encoding temporarily for debug purposes.

#

kept*

#

Just pushed this change.

#

Results with varint encoding on my machine:

entities send           time:   [204.81 µs 204.98 µs 205.15 µs]
                        change: [+15.927% +16.516% +17.075%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 50 measurements (8.00%)
  3 (6.00%) high mild
  1 (2.00%) high severe

Benchmarking entities receive: Warming up for 3.0000 s
Warning: Unable to complete 50 samples in 5.0s. You may wish to increase target time to 8.0s, enable flat sampling, or reduce sample count to 20.
entities receive        time:   [174.15 µs 174.33 µs 174.52 µs]
                        change: [-0.9250% -0.6786% -0.4302%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 50 measurements (4.00%)
  2 (4.00%) high mild

entities update send    time:   [89.772 µs 90.205 µs 90.673 µs]
                        change: [-23.335% -22.546% -21.729%] (p = 0.00 < 0.05)
                        Performance has improved.

entities update receive time:   [75.727 µs 75.997 µs 76.301 µs]
                        change: [-12.857% -12.307% -11.748%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 50 measurements (6.00%)
  1 (2.00%) high mild
  2 (4.00%) high severe
#

Your branch compared to mine. Decrease is better for yours.

#

The packet size in your version is ~4 times smaller (16207 vs 4006). I think that if we add entity varint to my version it will be even faster.

echo lion
#

Ok nice, you were able to swap the entity and component loops

spring raptor
#

Yes! It results in a barely noticeable slowdown, I think it's worth it.

echo lion
#

The main disadvantage of your approach is entity ids are duplicated between the change and removal sections.

spring raptor
#

Feel free to review and suggest changes in my branch

spring raptor
echo lion
#

true

spring raptor
#

What I like about having a single buffer, is that it will be very fast for future streaming.

#

I copy data only because of renet.

spring raptor
echo lion
#

It is easy to preallocate size if you know the entity 🙂

#

Serialize the entity into the start of the array immediately, and record the initial start position so you can over-write it if the array is empty.

#

You just reserve one byte for setting the array length after you are done, that's all.
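A minimal sketch of this idea, assuming a one-byte entity id and a one-byte array length for brevity. `write_entity_section` is a hypothetical helper, and it truncates the buffer on an empty array instead of letting the next entity overwrite the slot, which has the same effect:

```rust
/// Appends one entity section to `buffer`: entity id, component count, component bytes.
/// The id and a placeholder count are written eagerly; if no components follow,
/// the buffer is rolled back to where the section started.
fn write_entity_section(buffer: &mut Vec<u8>, entity_id: u8, components: &[&[u8]]) {
    let start = buffer.len(); // Where this entity's section begins.
    buffer.push(entity_id); // Stand-in for a real serialized entity.
    let count_pos = buffer.len();
    buffer.push(0); // Reserve one byte for the array length.

    let mut count: u8 = 0;
    for component in components {
        buffer.extend_from_slice(component);
        count = count
            .checked_add(1)
            .expect("sketch assumes at most 255 components");
    }

    if count == 0 {
        buffer.truncate(start); // Nothing to send: drop the id and the placeholder.
    } else {
        buffer[count_pos] = count; // Patch the reserved byte with the real length.
    }
}
```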

spring raptor
echo lion
#

Let's see what happens with u64 varint for comparison with my PR, then we can split it

#

It's a clean PR, nice work 🙂

#

Ok review done, there are a few things to fix.

spring raptor
#

I worry that serializing entity right away will result in serialization of every entity in the world each time. But probably shouldn't matter since it's fast?

echo lion
#

Hopefully it's very fast

#

Another option would be to use extra-large buffers, then when you copy them into the renet message you chop out the gaps

#

idk if that's worth it

#

Alternatively, you could wait to serialize the entity until you know you need it (the first time you encounter a component to add)

#

That's probably the best option, although you may want to perf test it since it will be close

spring raptor
#

@echo lion I don't get the comment about removals size. Could you elaborate?

#

I think I do write 0 even if there are no removals, end_array should do it for me.

echo lion
#

Oh I misread it

#

For entity maps you should not write anything (entity or length) if the length is zero (you currently write the length of zero, which wastes 1 byte for every entity with no changes), for the non-entity maps you should write the length (as you are doing). The docs should make it clear why you do that for each case.

spring raptor
#

Got it, makes total sense, thank you!

spring raptor
#

@echo lion done!
Size now is 11205.

#

I think 4006 was for version with DummyComponent without usize.

#

Yes, your PR uses 11206 bytes now. One byte more 😄

#

Thanks, now it's the fastest version:

Benchmarking entities send: Warming up for 3.0000 s
Warning: Unable to complete 50 samples in 5.0s. You may wish to increase target time to 7.9s, enable flat sampling, or reduce sample count to 20.
entities send           time:   [156.91 µs 157.13 µs 157.38 µs]
                        change: [-23.223% -22.973% -22.729%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 50 measurements (4.00%)
  1 (2.00%) high mild
  1 (2.00%) high severe

Benchmarking entities receive: Warming up for 3.0000 s
Warning: Unable to complete 50 samples in 5.0s. You may wish to increase target time to 7.9s, enable flat sampling, or reduce sample count to 20.
entities receive        time:   [157.43 µs 158.59 µs 160.16 µs]
                        change: [-6.7955% -6.0621% -5.4216%] (p = 0.00 < 0.05)
                        Performance has improved.

entities update send    time:   [97.222 µs 98.642 µs 99.973 µs]
                        change: [-4.0230% -1.4704% +1.0846%] (p = 0.26 > 0.05)
                        No change in performance detected.
Found 2 outliers among 50 measurements (4.00%)
  2 (4.00%) low mild

entities update receive time:   [72.106 µs 72.457 µs 72.850 µs]
                        change: [-12.786% -11.770% -10.837%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 50 measurements (16.00%)
  4 (8.00%) high mild
  4 (8.00%) high severe

My branch compared to yours. Decrease is better for mine.

#

Now I'm wondering if we should use varint encoding for components...

echo lion
#

Congrats 🙂

#

I'm not sure about components, it's kind of situational. Maybe you could add a plugin config

#

And default to varints

spring raptor
#

Yes, sounds good to me.

#

We already can do it, btw, via replicate_with.

echo lion
#

Ah yeah I didn't look close at the injection stuff yet

spring raptor
#

We can provide replicate_with_varint or replicate_with_fixedint depending on default that we decide.

#

Not sure what should be default 😅

echo lion
#

Varint probably for games

#

Mainly ids will waste space with varints I think

#

And then if that's a problem you can just do an array of bytes

spring raptor
#

btw, I used varint for tick.

#

Is it wrong?

echo lion
#

Yeah I think so

spring raptor
#

Will remove

echo lion
#

50% of ticks will use all 32 bits
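To see why a varint tick is wasteful, here is a sketch that assumes a LEB128-style varint (7 payload bits per byte); bincode's varint format has different thresholds, but large values likewise cost more than a fixed 4 bytes. `varint_len_u32` is a hypothetical helper:

```rust
/// Bytes needed by a LEB128-style varint for a `u32`.
fn varint_len_u32(value: u32) -> usize {
    // Number of significant bits, counting zero as one bit.
    let bits = (32 - value.leading_zeros()).max(1) as usize;
    // Each output byte carries 7 payload bits, rounded up.
    (bits + 6) / 7
}
```

Under this assumption every tick at or above 2^28 takes 5 bytes, one more than a fixed-width u32, so a counter that sweeps the whole u32 range pays the extra byte most of the time.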

#

That's why you were -1 byte lol, since I didn't use varint tick

spring raptor
#

Yeah 😅

echo lion
#

Yours should be 3 bytes more than mine I think

#

Or 2

#

Yeah 2 for the removal array size

spring raptor
#

Mystery solved

spring raptor
#

Merging?

#

Will take a look at varint for components and get_info tomorrow and draft a new release.

spring raptor
echo lion
#

I will take another look in an hour

iron flare
#

can someone show me a basic example of using NetworkEntityMap?

spring raptor
# iron flare can someone show me a basic example of using NetworkEntityMap?

It's a resource that maps entities from client to server. It works automatically: when you receive an update from the server, you spawn all received entities on the client and create a mapping to the server entity. So the client contains the server ID mappings and knows which server entity corresponds to which client entity.
You can insert your own mappings via .insert.
Hope this makes it a bit clearer.
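To illustrate the idea only, here is a stand-in built on two hash maps with plain u64 ids. `EntityMapSketch` and its methods are hypothetical and are not the crate's `NetworkEntityMap` API; check the crate docs for the real signatures:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a bidirectional server/client entity map.
#[derive(Default)]
struct EntityMapSketch {
    server_to_client: HashMap<u64, u64>,
    client_to_server: HashMap<u64, u64>,
}

impl EntityMapSketch {
    /// Records that `server` and `client` refer to the same logical entity.
    fn insert(&mut self, server: u64, client: u64) {
        self.server_to_client.insert(server, client);
        self.client_to_server.insert(client, server);
    }

    /// Looks up the local entity for an id received from the server.
    fn to_client(&self, server: u64) -> Option<u64> {
        self.server_to_client.get(&server).copied()
    }

    /// Looks up the server id for a local entity, e.g. before sending an event.
    fn to_server(&self, client: u64) -> Option<u64> {
        self.client_to_server.get(&client).copied()
    }
}
```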

#

@echo lion Thanks, merging then. Great suggestions and the proposed throttle mechanism.
I will take a look at the trailing array sizes now.

Do you want to become a collaborator? If yes, I will send you invite.

spring raptor
#

About dropping trailing zeroes, is there no nice way to check if the received buffer is at the end? Only comparing the position with the length of the underlying buffer?

spring raptor
echo lion
spring raptor
echo lion
#

Oh yeah, compare position with underlying length
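A minimal sketch of that check with `std::io::Cursor`, assuming a one-byte trailing array length. `is_at_end` and `read_trailing_len` are hypothetical helpers:

```rust
use std::io::{Cursor, Read};

/// Returns `true` when every byte of the underlying buffer has been consumed.
fn is_at_end(cursor: &Cursor<&[u8]>) -> bool {
    cursor.position() >= cursor.get_ref().len() as u64
}

/// Reads a one-byte trailing array length, treating a missing byte as an empty array.
fn read_trailing_len(cursor: &mut Cursor<&[u8]>) -> std::io::Result<u8> {
    if is_at_end(cursor) {
        return Ok(0); // The sender dropped the trailing zero length.
    }
    let mut byte = [0u8; 1];
    cursor.read_exact(&mut byte)?;
    Ok(byte[0])
}
```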

echo lion
#

I wonder... could you compress entities even further by serializing the concatenation of the two varints as another varint?

spring raptor
#

@echo lion should we use replicate to serialize as varints and replicate_fixint to serialize as fixint?

echo lion
spring raptor
#

Feel free to try in my PR, you should have write access.

spring raptor