#Material.SetXXX performance

earnest matrix
#

I've a ton of rather arbitrary game objects with materials whose properties I want to animate. However, this comes with a rather significant performance cost: material.SetFloat and friends are expensive.

Is there a way to improve on this, or another way to achieve the same result? I can't find any batch or async APIs.

quiet oyster
earnest matrix
#

unfortunately I'm using URP
I think that would only improve on the Set function per material, but the problem is also that I have a lot of materials to animate

cerulean epoch
#

Expensive in what way?

quiet oyster
earnest matrix
#

More specifically:
There are disintegration effects on all the objects that make the meshes disappear. These are disintegrated by animating a radius & position for all of these objects, independently

#

the radius and position are the material properties in this case, and of course only the radius is the one changing most of the time

cerulean epoch
#

How many exactly?

earnest matrix
#

between 10 and 200 roughly?

cerulean epoch
#

Each frame I suppose?

earnest matrix
#

yes.. 😄

#

Talking about this, I think I have an idea

#

I can create a custom buffer that I fill with these animated properties on a job
Then assign to the materials an ID to read from this buffer

#

of course that removes the properties from the editor completely, but at least it could work

#

it's a lot more complicated to manage though
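
A minimal sketch of the shared-buffer idea described above, assuming a custom shader that declares `StructuredBuffer<float4> _DisintegrationData` and a per-material `_DisintegrationIndex`. Both names, the class, and the capacity are hypothetical and not from the thread. The point is that the per-frame cost becomes one `SetData` upload instead of one SetXXX call per material:

```csharp
using Unity.Collections;
using UnityEngine;

// Hypothetical sketch: one GPU buffer holds (position.xyz, radius) for every animated material.
public class DisintegrationBuffer : MonoBehaviour
{
    // Assumed shader property names, not taken from the thread.
    static readonly int DataId = Shader.PropertyToID("_DisintegrationData");
    static readonly int IndexId = Shader.PropertyToID("_DisintegrationIndex");

    const int Capacity = 8192; // arbitrary upper bound for this sketch

    GraphicsBuffer buffer;
    NativeArray<Vector4> data; // xyz = position, w = radius
    int count;

    void OnEnable()
    {
        buffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, Capacity, sizeof(float) * 4);
        data = new NativeArray<Vector4>(Capacity, Allocator.Persistent);
        Shader.SetGlobalBuffer(DataId, buffer);
    }

    void OnDisable()
    {
        buffer.Release();
        data.Dispose();
    }

    // Called once per material instance: stores the slot index on the material.
    // Returns -1 when the buffer is full.
    public int Register(Material material)
    {
        if (count >= Capacity) return -1;
        int slot = count++;
        material.SetFloat(IndexId, slot);
        return slot;
    }

    // Cheap managed write; no call into the material API.
    public void SetSlot(int slot, Vector3 position, float radius)
    {
        data[slot] = new Vector4(position.x, position.y, position.z, radius);
    }

    void LateUpdate()
    {
        // Single upload per frame covering all registered slots.
        if (count > 0) buffer.SetData(data, 0, 0, count);
    }
}
```

Because `data` is a `NativeArray`, it could also be filled from a job, as suggested above. Whether this keeps the materials SRP Batcher compatible depends on how the shader is written, so that part would need checking in the Frame Debugger.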

quiet oyster
#

Also 200 objects given 4 floats doesn't sound particularly much to me. How long does that take? Are you sure you are not creating new material instances every frame or something like that? Just pure SetXXX on existing material?

earnest matrix
#

I will make a profiler capture

quiet oyster
#

SetXXX itself is most probably just putting the value to a dictionary of sorts. I would assume that only stores the values on CPU side which are moved to the GPU only when the rendering is done

earnest matrix
#

apart from the SetXXX, there are some significant improvements to be made of course, but ignoring those, the cost remains very high

earnest matrix
#

of course profiled in release~

wow I say that but it's not true, let me redo that 🤦

#

well, somewhat surprisingly, the result is almost the same in release

quiet oyster
#

Is 0.1 ms all it takes though? One obvious thing you could do is to pack both radius and position into one Vector4, with w being the radius for example
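
A small sketch of that packing suggestion, assuming the shader exposes a single `float4` property (the name `_DisintegrationSphere` and the helper class are hypothetical). It halves the number of set calls per material, and it uses a cached ID from `Shader.PropertyToID` so the property name is not looked up by string on every call:

```csharp
using UnityEngine;

public static class DisintegrationProperties
{
    // Hypothetical property name; xyz = position, w = radius.
    static readonly int SphereId = Shader.PropertyToID("_DisintegrationSphere");

    // One SetVector call instead of a SetVector plus a SetFloat.
    public static void Apply(Material material, Vector3 position, float radius)
    {
        material.SetVector(SphereId, new Vector4(position.x, position.y, position.z, radius));
    }
}
```

In the shader, the position would then be read as `_DisintegrationSphere.xyz` and the radius as `_DisintegrationSphere.w`.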

#

Btw how many times is the inner for loop run? Do you have many materials on each of those 200 ish objects?

#

Oh wait, is the code only setting the properties for one of the objects? I thought it was doing for them all with the outer loop

reef spear
earnest matrix
#

look at the total number: 9ms (release) vs. 13ms (debug) for 100 instances
and the cost inside the update for the SetXX functions which are slightly higher in release (relatively)

I'll look at the # of loops and calls to SetXXX, but regardless of the size of the inner loop, the number of meshes/materials/properties animated won't change.

earnest matrix
reef spear
#

It's very hard to come to any conclusions from the timeline view. Use hierarchy view...

#

And deep profiling if it doesn't show the specific calls in update

earnest matrix
#

you can see the % of time this function call takes in the frame, the total ms it takes, and the breakdown of the function in the superluminal capture, is that not enough?

reef spear
#

Not enough. I trust the profiler more than superluminal, but you're not using it correctly, not providing enough info.

#

SetVector/SetFloat should just be queuing a gpu buffer update. The method itself should be pretty lightweight.

quiet oyster
#

Btw you can read transform.position directly from the model matrix in shader assuming all those renderers are on the same object/transform. Ultimately still I'm having a hard time believing SetVector call would take that long

reef spear
#

Probably each instance loops like 100+ times. 20000 calls to set a property could probably account for 9ms

#

using the profiler properly would reveal it all

quiet oyster
#

Just tried myself, for me setting a vector and a float around 6k times took 9ms. That is maybe not great either but as discussed earlier, you can get rid of one of the set calls (by combining and potentially reading the position from the matrix) and can always consider more optimized approaches (like using single buffer).

#

Now I'm interested in how many times you are setting the exact same data to the materials, in other words, how many times does the inner loop run (outerLoopIterations * innerLoopIterations) on average. If that is not something very high, I'm having a hard time trying to explain how that could take that long

pale dew
#

I wonder if using the property ID instead of its name would help, but I suspect there is a cost to updating data on the GPU now (unless it's actually queued to be updated at a later point)

quiet oyster
pale dew
#

It's hard to say, I can't find anything explaining this.
The cost could also be the constant managed to native jump and conversion to apply these changes.

Doing animation in shader only or using GPU buffers with compute shaders would be best to reduce this problem

earnest matrix
#

I thought the cost would be mostly due to native calls, though HasProperty is way cheaper, which I assume also does a native call (but I can't check now)

the call to Shaders.Contains comes out even cheaper; this is iterating over an array with 3 elements on average and comparing class instances, which shouldn't be that much slower than a hashmap addition (if that's really all SetXX was doing)

#

I don't know the # of calls yet, I'm not in the project, but I doubt I can change much about that anyway (in our worst case they all need to update)

I was mostly interested to see if there were any native APIs I could use to bypass it, but it seems not, which leaves us with the graphics buffer solution

#

Before this, we were driving the material parameter with the animator and we had no performance issues. We switched away from that for practical reasons (too many objects to animate by hand).

Perhaps another solution could be to bake the animations at edit time, though that kind of thing is never straightforward in my experience

pale dew
#

can you give more detail as to what you are "animating" by changing material parameters?

#

it may be wise to pack data into some buffer or texture and use global time in shader to produce the effect

reef spear
#

Even if it's 6k materials, it's too much to be rendered in one frame, let alone have properties modified. 🤔

earnest matrix
#

I guess the SRP batcher does some magic here? Don't know how this works under the hood. In any case I will look at the numbers again tomorrow.

earnest matrix
#

We did consider using a start + end time for the animation, which in terms of performance will work better of course. I don't remember why this wasn't an option. I will ask again.

#

They can be partially disintegrated as well and maintain that state, so there's some juggling to be done with the parameters. But that sounds feasible to me.

pale dew
reef spear
#

You could also reduce the rate at which you update the materials. Especially for objects far away. Sort of an LOD system. Or update them all in one centralized place and split over several frames.

earnest matrix
#

Yeah in any case GPU-wise we are still OK from what I could tell.

earnest matrix
reef spear
#

Is it something like vampire survivors with hundreds(more like thousands?) of units on screen?

earnest matrix
#

they are buildings actually, where many mesh parts are animated individually (for construction mostly), which is why the numbers are so high
of course artist time is a factor there too

reef spear
#

Yeah, for something like that, reducing update rate(or having an update queue with limited updates per frame) is pretty common I think.
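
A rough sketch of such an update queue with a per-frame budget, assuming each animated material only needs its latest radius pushed. The class, the `Entry` type, the `_Radius` property name, and the default budget are all hypothetical:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical sketch: round-robin over all animated materials,
// touching at most maxUpdatesPerFrame of them each frame.
public class MaterialUpdateQueue : MonoBehaviour
{
    static readonly int RadiusId = Shader.PropertyToID("_Radius"); // assumed property name

    [SerializeField] int maxUpdatesPerFrame = 500; // arbitrary budget, tune by profiling

    public class Entry
    {
        public Material Material;
        public float Radius; // latest value written by gameplay code
    }

    readonly List<Entry> entries = new List<Entry>();
    int cursor;

    public void Add(Entry entry) => entries.Add(entry);

    void Update()
    {
        // Never visit the same entry twice in one frame.
        int budget = Mathf.Min(maxUpdatesPerFrame, entries.Count);
        for (int i = 0; i < budget; i++)
        {
            if (cursor >= entries.Count) cursor = 0; // wrap around
            Entry e = entries[cursor++];
            e.Material.SetFloat(RadiusId, e.Radius);
        }
    }
}
```

With 6000 entries and a budget of 500, each material is refreshed every 12th frame. A distance-based variant could give nearby objects a higher priority than far ones.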

#

Also DOTS obviously.

earnest matrix
#

yeah that's what I did for our terrain updates and such, it's all jobified and spread out over frames

#

but there's no native material API so I got stuck on this mentally

#

there are many good suggestions in this thread 🙏

earnest matrix
reef spear
#

Generally with hundreds/thousands of updating objects, you want to go for ECS.

cerulean epoch
#

And like mentioned before at some point the animation could be handled by the shader entirely, if it happens at a predictable rate
Even if the animation in the shader would run on a global timer, you could subtract elapsed time as a per-instance offset for this material to restart the animation just for it
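
A minimal sketch of that per-instance offset, assuming the shader computes the radius itself as `max(0, _EffectTime - _StartTime) * _RadiusSpeed`. All three property names and the class are hypothetical. The script sets one global per frame and touches a material only when its animation starts:

```csharp
using UnityEngine;

// Hypothetical sketch: the shader animates the radius from a global clock,
// each material only stores when its own animation started.
public class ShaderDrivenDisintegration : MonoBehaviour
{
    // Assumed property names, not taken from the thread.
    static readonly int EffectTimeId = Shader.PropertyToID("_EffectTime");
    static readonly int StartTimeId = Shader.PropertyToID("_StartTime");
    static readonly int SpeedId = Shader.PropertyToID("_RadiusSpeed");

    void Update()
    {
        // One global write per frame, independent of the number of materials.
        Shader.SetGlobalFloat(EffectTimeId, Time.time);
    }

    // Called once when a material's effect should (re)start.
    public static void Restart(Material material, float radiusPerSecond)
    {
        material.SetFloat(StartTimeId, Time.time);
        material.SetFloat(SpeedId, radiusPerSecond);
    }

    // CPU-side mirror of the assumed shader formula, useful for gameplay checks.
    public static float RadiusAt(float now, float startTime, float radiusPerSecond)
    {
        return Mathf.Max(0f, now - startTime) * radiusPerSecond;
    }
}
```

For example, a material restarted at t = 10 s with a speed of 2 units per second has a radius of 3 at t = 11.5 s. Holding a partially disintegrated state would need one more per-material value, such as an end time or a maximum radius to clamp against.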

pale dew
#

You could have some mask texture that controls how the effect changes over time too if that helps combining meshes again.

earnest matrix
#

for now I ended up baking all the animations to a timeline animation clip, as it's the least invasive solution (it was already using timeline animations)

I think on average we had like 20-30 meshes per object, so for 100 objects, each with 2 materials, let's say 6000 SetXXX calls? this more or less matches what you measured, @quiet oyster

#

throwing this kind of thing into ECS is possible of course, but I'm not a big fan of mixing ECS and game object workflows

#

merging the meshes is definitely something to explore, though in terms of performance right now we're OK (the SRP batcher does a good job, there are only 30 batches)

earnest matrix
#

sure sure, give me half a year to refactor everything

quiet oyster
cerulean epoch
#

1-2 instances I mean
