#Improving performance with wgpu (compute/render synchronization)

mortal bobcat

(I've asked this question in #games-and-graphics, but this channel is probably a better place for it)

Hello everyone! I'm making a numerical simulation with wgpu that has a compute step and a render step. Currently I perform both steps synchronously in a single event loop (so rendering occurs after each compute step, similar to the boids example in the wgpu repo), but I'd like to run the compute step in a separate thread so that it can run at a fixed timestep.

I've attempted to do this already, but it doesn't seem to work. Maybe the rendering is getting in the way (as there's only one command queue in wgpu, see https://github.com/gfx-rs/wgpu/issues/1066). I don't think the compute step (or the render step, for that matter) is in any way throttling my GPU (nvtop shows at most 1 to 3% load), which is why I thought the problem might be in the precise way I'm syncing threads. I've also profiled using wgpu-profiler and identified each thread separately (see attached image: the interval between compute steps can be up to ~600 ms, but it should be at most 1/60 s ≈ 16.7 ms).

This is the branch with the new implementation: https://github.com/vini-fda/heat-wgpu/tree/feat-improve-async-execution.

As someone pointed out in my initial question, wgpu does not work with multithreaded calls (there's a Mutex barrier in the encoder), which is a bummer, but I still think the performance could be improved even without multithreading or multiple queues (e.g. lowering the interval from 600 ms to about 50 ms).

Can someone help?

stuck shoal

I had a look around and noticed that in compute.rs you are holding the lock on the profiler while you do a

device.poll(wgpu::MaintainBase::Wait);

so that's a potential bottleneck: you're waiting for GPU tasks to complete with the lock held, so the render thread can't do anything that needs the profiler until the wait returns
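To illustrate the idea, here is a minimal std-only sketch of releasing the lock before the blocking wait. It is not your actual code: `Profiler`, `wait_for_gpu`, `compute_step` and `render_lock_wait` are hypothetical names, and `thread::sleep` only stands in for the blocking `device.poll(...)` call so the example runs without a GPU.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the profiler state shared between threads.
#[derive(Default)]
struct Profiler {
    events: Vec<&'static str>,
}

/// Stand-in for a blocking wait on the GPU (assumption: in the real code
/// this would be the `device.poll(...)` call). Here it just sleeps.
fn wait_for_gpu(gpu_time: Duration) {
    thread::sleep(gpu_time);
}

/// One compute step that holds the lock only while touching the profiler,
/// never while blocked on the GPU.
fn compute_step(profiler: &Mutex<Profiler>, gpu_time: Duration) {
    {
        let mut p = profiler.lock().unwrap();
        p.events.push("compute: submitted");
    } // guard dropped here, before the blocking wait

    wait_for_gpu(gpu_time);

    // Re-acquire briefly to record the result; the guard is a temporary
    // that is dropped at the end of this statement.
    profiler.lock().unwrap().events.push("compute: finished");
}

/// Runs a compute step on another thread and measures how long the
/// "render" side has to wait to get the profiler lock in the meantime.
fn render_lock_wait(gpu_time: Duration) -> Duration {
    let profiler = Arc::new(Mutex::new(Profiler::default()));
    let for_compute = Arc::clone(&profiler);
    let compute = thread::spawn(move || compute_step(&for_compute, gpu_time));

    // Give the compute thread time to reach its blocking wait.
    thread::sleep(gpu_time / 4);

    let start = Instant::now();
    profiler.lock().unwrap().events.push("render: frame");
    let waited = start.elapsed();

    compute.join().unwrap();
    waited
}
```

Calling `render_lock_wait(Duration::from_millis(200))` should return a wait far shorter than the simulated GPU time, because the render side is not blocked behind the compute thread's wait. If the guard were kept alive across `wait_for_gpu`, the render side would instead stall for roughly the rest of that wait, which is the pattern I suspect in compute.rs.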