#How do you know *what* to optimize?


obsidian blade
#

I'm on a project that is near release and was asked to do an optimization pass. That's fine, I am used to CPU optimization as the profiler shows you exactly how much time and GC a method is taking. Big number bad, small number good. Easy.

But I'm not sure what to do with the GPU info. I am familiar with the rendering module in the profiler, the frame debugger and such. I just am not sure what a good number is, where I should be looking, what is causing the most slowdown, etc.

Even though I've used Unity for years, I haven't been on a project that was either close to release or big enough for it to matter.

Any advice or pointers?

dense lodge
#

You can dive into the render calls in the CPU profiler as well. Basically the same. If there is a ton of time spent on transparent or SSAO, that is probably the culprit.
You can enable wireframes in the scene to see if there are any micro or narrow triangles, which are slower to render. MeshLOD is a nice and relatively easy solution for that.
A bit deeper is trying to keep down the draw calls, which you can check in the Frame Debugger (and see why they are not batched).
If that does not help and you've checked all basic settings (here are some tips: https://www.youtube.com/playlist?list=PLjnZTAR0BuTDNeDPy44G4bUpYRHGzy_xh), then you could use a 3rd party GPU profiler like Nsight or RenderDoc to see how long each draw call takes and act based on that.

#

That's my process when I am hired for performance optimization

obsidian blade
dense lodge
#

There is no one single solution, I'm afraid.

obsidian blade
#

But that is just CPU time and not GPU time right?

dense lodge
#

You can look into the URP (or other pipeline) calls as well, that could help.

To dive deep into the GPU I suggest profilers like Nsight and RenderDoc, but I have rarely needed them myself tbh.

obsidian blade
dense lodge
#

GPU Profilers!!!!!

real rampart
obsidian blade
# real rampart There is always some sync points in rendering that are handled on CPU side, so ...

Got it, it does seem to give some indication. Opening up RenderDoc is super helpful (though the integration seems to not be in the Unity editor any more?)
Downside is I can't tell what 'normal' is. For example, shadows are taking up ~25% of the frame time, and it looks like it is rendering the shadows for the same objects multiple times. But I can't tell if that is normal or if there is something funky going on in the project.

real rampart
gloomy frigate
#

You can go even deeper by outputting the shader debug symbols (a bit of a pain in Unity) and profiling the shader code itself.

obsidian blade
gloomy frigate
#

No. Everything I said was about PIX. Not sure if there are similar features in RenderDoc.

#

Sadly, Unity is shit when it comes to GPU profiling. Especially on SRPs.

obsidian blade
#

Okay that is what I thought

gloomy frigate
#

This is one aspect where Unreal Engine 100% wins.

#

Their in-engine profiling tools are great.

#

At least the GPU ones.

obsidian blade
#

Got it. Looking at RenderDoc at least, about 20% is shadowmaps, 30% is IndirectDraw calls, and otherwise it is just a bunch of small stuff. Which is good I guess?

#

And VFX taking 10% (oof).

gloomy frigate
#

Mm... It would be a lot more helpful if you could see what passes take how much time. Indirect draw calls for what? Is it a deferred or forward renderer?
What exactly takes the other 50%?

#

If you can make a PIX capture and share it, I could have a look and possibly provide some insights.

#

But before that, is there even a GPU bottleneck? Is your GPU time not meeting the target frame time?

obsidian blade
#

No idea, I was just given instructions to improve the performance if I can without too much work.

gloomy frigate
#

What's the GPU frame time? You should be able to see it in RenderDoc (I think).

obsidian blade
#

14500? That doesn't seem right...

gloomy frigate
#

If it's in microseconds it does. That would be 14.5ms.

#

Which is above 60 fps (roughly 69 fps), so it's not bad. The question is whether it's stable over time and across different stages of the game.
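
For reference, the unit conversion above can be written out as a small sketch. This is plain arithmetic in Python, not part of any profiler's API; the function name `frame_time_to_fps` is made up for illustration, and the 60 fps target is assumed from the discussion.

```python
def frame_time_to_fps(frame_time_us):
    """Convert a frame time in microseconds to (milliseconds, frames per second)."""
    frame_time_ms = frame_time_us / 1000.0
    fps = 1000.0 / frame_time_ms
    return frame_time_ms, fps


# The value read from the capture, assuming it is reported in microseconds.
ms, fps = frame_time_to_fps(14500)

# Frame budget for an assumed 60 fps target.
budget_ms = 1000.0 / 60

print(f"{ms:.1f} ms per frame is about {fps:.1f} fps")
print(f"60 fps budget is {budget_ms:.2f} ms, headroom is {budget_ms - ms:.2f} ms")
```

The headroom figure is the useful part: a frame that fits the budget by only a couple of milliseconds on a strong GPU may not fit at all on weaker hardware.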

obsidian blade
#

Ahh that could be it. Might be something weird with the capture because there is nothing until about 6500

gloomy frigate
#

Hmm. Yeah, that's one reason I'd recommend PIX.

obsidian blade
#

yeah just finished getting a capture with PIX

#

It is saying 6.173ms which is about what Unity shows in the stats overlay in the editor

#

Looking at the timeline is much nicer than RenderDoc's. Seems there might be some area for improvement but not a ton.

gloomy frigate
#

Yeah. That's pretty good.
Though, one thing to note is that it might not be very representative if you have a very strong GPU and your target audience might have a way weaker GPU.

#

Also, it might not represent all the game stages/camera views properly.
The way you optimize in the industry is to play through the game with the frame time counters up and check if there are places in the game that go beyond the target frame time (often an automated system that warns you is used too).
Even if the game is optimized in this specific place, it doesn't guarantee it's the same everywhere.
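
A minimal sketch of what such an automated warning could look like, assuming you already have a list of per-frame times in milliseconds recorded during a playthrough. The function name `find_budget_violations`, the `min_run` threshold, and the sample data are all hypothetical and not taken from Unity, PIX, or RenderDoc.

```python
def find_budget_violations(frame_times_ms, budget_ms=1000.0 / 60, min_run=3):
    """Return (start_index, end_index, worst_ms) for each run of at least
    min_run consecutive frames whose time exceeds budget_ms."""
    violations = []
    run_start = None
    for i, t in enumerate(frame_times_ms):
        if t > budget_ms:
            if run_start is None:
                run_start = i  # a new over-budget run begins here
        elif run_start is not None:
            # The run ended on the previous frame; keep it only if long enough.
            if i - run_start >= min_run:
                worst = max(frame_times_ms[run_start:i])
                violations.append((run_start, i - 1, worst))
            run_start = None
    # Close a run that lasts until the end of the recording.
    if run_start is not None and len(frame_times_ms) - run_start >= min_run:
        worst = max(frame_times_ms[run_start:])
        violations.append((run_start, len(frame_times_ms) - 1, worst))
    return violations


# Made-up samples: one isolated spike (ignored) and one sustained slow stretch.
samples = [6.2, 6.1, 18.0, 6.3, 17.5, 19.2, 21.0, 18.4, 6.0]
for start, end, worst in find_budget_violations(samples):
    print(f"frames {start}-{end} over budget, worst {worst:.1f} ms")
```

Requiring several consecutive slow frames is a design choice to ignore one-off hitches; lowering `min_run` to 1 would report every single spike instead.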

obsidian blade
#

Yeah, that is what I am concerned about, as I have an NVIDIA 4080 and the game is targeted more at mid-range PCs. Wish there was a better way to test than just also having the worse hardware.

You mean the "recording statistics" or whatever it is called in PIX?