I'm working on Final Factory, and we're running into a weird performance overhead with the DOTS job worker threads. Each thread seems to consume significant CPU just by existing: the more job worker threads we have, the more CPU the system consumes. The CPU is a Ryzen 5800X3D (8C/16T) and we are using IL2CPP (but the same behavior occurs with Mono). The game is rendering windowed at 1080p, but it happens in all rendering modes.
Test Setup
We started the game in a nearly empty scene (so very few jobs or systems are doing any work) with vsync at 60 Hz, and it runs at 60 fps.
Initial State:
CPU usage is around 15%. As expected, there are 15 job worker threads (16 logical cores minus 1). Looking into the CPU usage with Visual Studio's profiler, I see heavy usage in the job worker threads (see screenshot #1). 81% of the total CPU time was from the job worker threads, with only 6% of that actually running our jobs. The overhead comes from RtlQueryPerformanceCounter and lane_try_steal.
Optimization:
I made a one-line change and set the number of job worker threads to 1 with this command:
JobsUtility.JobWorkerCount = 1;
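For anyone who wants to reproduce this, here is a minimal sketch of how such a cap could be applied at startup. The class name JobWorkerCap and the DesiredWorkerCount constant are hypothetical and only for illustration; the JobsUtility members come from Unity.Jobs.LowLevel.Unsafe.

```csharp
using Unity.Jobs.LowLevel.Unsafe;
using UnityEngine;

// Hypothetical bootstrap class; the name and the cap value are illustrative only.
public static class JobWorkerCap
{
    // Assumed value for experimentation; the test above used 1.
    const int DesiredWorkerCount = 1;

    // Runs once before the first scene loads, so the cap is in place early.
    [RuntimeInitializeOnLoadMethod(RuntimeInitializeLoadType.BeforeSceneLoad)]
    static void Apply()
    {
        // JobWorkerMaximumCount is the upper bound Unity allows on this machine.
        int max = JobsUtility.JobWorkerMaximumCount;

        // Clamp so the requested value never exceeds what the machine supports.
        int count = Mathf.Clamp(DesiredWorkerCount, 1, max);
        JobsUtility.JobWorkerCount = count;

        Debug.Log($"Job worker threads: {JobsUtility.JobWorkerCount} (max {max})");
    }
}
```

As far as I know, JobsUtility.ResetJobWorkerCount() restores Unity's default worker count, which makes it easy to compare the two states back to back.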
Now, the game still runs at 60 fps, but CPU usage drops to 2.8%. The job worker overhead drops dramatically, although it's still present (see screenshot #2).
We are not calling JobHandle.ScheduleBatchedJobs(). It looks like there's just pure overhead in the job worker threads, even though they are mostly idle. On systems with weaker CPUs, the overhead is even greater: on my laptop with a Ryzen 7 3750H, it consumes ~50% CPU, and the excess CPU usage makes the fans run really hard.
Any idea why this is happening or how we can eliminate this overhead?