IN-49838
The current design of Entities Graphics forces jobs to complete on the main thread, and this affects everything that Entities Graphics depends on. For example, Entities Graphics waits on the ECS physics jobs, so main thread time increases by however long those physics jobs take to execute. This makes it very slow, especially on performance-critical mobile platforms, where higher main thread time means lag, thermal throttling and high battery consumption. Even on a desktop with an i9-11900 CPU, main thread time spikes between roughly 1 ms and 8 ms, which is about a 7x difference. What I expect is that jobs should not affect the main thread; when they do, it defeats the purpose of jobs, which is to reduce main thread time. I think Unity really needs to prioritize solving this, because it would bring a significant performance gain.
#(IN-49838) Entities Graphics main thread stalling
Entities Graphics needs to wait for certain jobs to complete, because those jobs need to have run before the compute dispatch that uploads data to the GPU is executed. The jobs are responsible for copying the data from the ECS to a graphics buffer that can be accessed by the compute shader.
The compute dispatch needs to run before anything is rendered, so it's currently run at the end of the Entities Graphics update step which is the last guaranteed moment we have for doing it. Dispatches must be performed from the main thread, so we can't do it from a job either.
Delaying the dispatch until culling starts is a theoretical option, but culling is not guaranteed to happen every frame, so doing that would be bug prone. In some situations the rendering can get skipped, and skipping the culling callback would result in the GPU data getting corrupted as it would no longer match the ECS contents.
Why must the compute dispatch be performed from the main thread? Is that because of a limitation in the current class design, which cannot support the job system and Burst?
The threading model of Unity graphics is based around the main thread queuing up work for the render thread, and it is not currently possible to queue work from other threads. Even if you could technically do that, it would likely be nondeterministic as the ordering of the GPU work is determined by the order those commands were queued from the CPU.
In addition, Burst does not support using managed classes, but the main thread limitation applies even if you were to use a non-Burst job without this problem.
As an additional update on the Job Complete, there is a second Complete in the beginning of the Entities Graphics update step. We will investigate to see if that could be removed, but even if it is, there is another Complete at the end of update that we most likely cannot remove due to what I wrote above.
I see. From what I know, Unity is working on graphics jobs to move this work off the main thread. I guess that will address this limitation in the future?
I know that work is being done in this direction, but I can't promise anything specific regarding that.
By the way, will Unity also address the other main thread stalls in Entities Graphics, such as the one in SkinningDeformationSystem, and the main thread stalls in SRP, such as the frustum culling job?
If the main thread waits on culling jobs, it typically means it needs their results to proceed.
It tries to kick off the culling jobs a bit in advance so they have some time to run, but at some point the results of the jobs are needed to start creating the actual rendering work, and that has to start on the main thread.
🥲 I guess on the SRP side most of the logic basically runs on the main thread, unless it gets ported to DOTS one day.
There has been some work on SRPs to move some heavy processing to Burst jobs.
For example, I believe HDRP light setup is like that.
And URP Forward+ light setup.
Yeah, but one Unity staff member told me that there's call overhead when going from managed code into Burst code. So to completely solve this, it would need to be Burst-compiled starting from the SRP entry point, so that all of the logic stays in Burst.
By the way, what about the SkinningDeformationSystem main thread stalling in Entities Graphics?
I can ask the person who is responsible for that system.
SkinningDeformationSystem waits for jobs for the same reason as Entities Graphics. It needs to do a compute dispatch, and in order to do that it needs some skinning matrices that are copied by the jobs it waits for.
Since the compute dispatch has to be done on the main thread and it can't be done until the jobs are ready, the main thread needs to wait for those jobs.
@late hinge: I looked at this case again just yesterday, and I think what you could do immediately to mitigate the situation is add your own systems after the physics systems and before the EntitiesGraphicsSystem.
This way, you can schedule jobs there or do work on the main thread, and the EntitiesGraphicsSystem will not be waiting yet, since its OnUpdate() hasn't been called. Since the EntitiesGraphicsSystem waits for the physics jobs to complete, adding your main thread work between physics and that graphics system lets you make use of CPU time that would otherwise be spent waiting.
You can use the out-of-the-box AfterPhysicsSimulationGroup to add your systems at the right moment. It's made for exactly that purpose.
🤔 Do you mean I should just add systems to AfterPhysicsSimulationGroup that do useful work, so the main thread time isn't wasted stalling in EntitiesGraphicsSystem?
Yep
Because the EntitiesGraphicsSystem is effectively only waiting.
If you do other work on the main thread before that, it will wait less when its turn comes.
You might even be able to schedule some jobs that do not conflict with any of the physics jobs running at that time, and this way get some extra CPU time from the worker threads as well.
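As a rough illustration of the mitigation described above, here is a minimal sketch of a system placed after physics that schedules a non-conflicting job. It assumes Entities 1.x and Unity Physics 1.x, and it assumes the group referred to in the thread as AfterPhysicsSimulationGroup is `AfterPhysicsSystemGroup` from `Unity.Physics.Systems`; check the name against your package version. The `Cooldown` component, `CooldownTickSystem` and `CooldownTickJob` are hypothetical names invented for this example.

```csharp
using Unity.Burst;
using Unity.Entities;
using Unity.Mathematics;
using Unity.Physics.Systems;

// Hypothetical gameplay component, used only for this illustration.
// It is not read or written by any physics job, so jobs touching it
// should not have to wait on the physics job chain.
public struct Cooldown : IComponentData
{
    public float Remaining;
}

// Assumption: AfterPhysicsSystemGroup updates after the physics systems
// have scheduled their jobs and before the presentation systems
// (including Entities Graphics) run their OnUpdate.
[UpdateInGroup(typeof(AfterPhysicsSystemGroup))]
public partial struct CooldownTickSystem : ISystem
{
    [BurstCompile]
    public void OnUpdate(ref SystemState state)
    {
        // Schedule only; do not call Complete() here. The job can run on
        // worker threads alongside the still-running physics jobs.
        new CooldownTickJob
        {
            DeltaTime = SystemAPI.Time.DeltaTime
        }.ScheduleParallel();

        // Any independent main thread work could also go here, so that
        // less time is left to wait once Entities Graphics updates.
    }
}

[BurstCompile]
public partial struct CooldownTickJob : IJobEntity
{
    public float DeltaTime;

    // Counts the cooldown down to zero, clamping at zero.
    void Execute(ref Cooldown cooldown)
    {
        cooldown.Remaining = math.max(0f, cooldown.Remaining - DeltaTime);
    }
}
```

Whether this actually shortens the wait depends on the project: the job only overlaps with physics if it touches no component that the physics jobs read or write, which is worth confirming in the Profiler timeline.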