#Compute workload inconsistent behaviour across vendors


dense lodge
#

Hello, I have a question. I have made a Vulkan compute program (a simple port of the renderer from the “Ray Tracing in One Weekend” book) that behaves inconsistently between GPU vendors and operating systems. It works as expected on dedicated Nvidia graphics under Linux and Windows, and on integrated Intel graphics under Windows.

However, on the same Intel system under Linux, it works fine only for small inputs (small sample count and/or low geometry count). For bigger inputs, it either finishes (in a very short time) with no errors but produces an empty image, or crashes with an error saying that a single-submission command buffer was submitted twice. Moreover, the same binary run multiple times can produce both of these behaviours. I got my friend to run it on their Linux system with integrated AMD graphics, and again it works fine for small inputs, but for bigger inputs it produces partial renders, where some tiles are fully rendered and some are just empty.

What could be causing this? Why does it only happen under these driver+hardware combinations?

code available on GitHub: https://github.com/tomatih/raytracing_in_one_weekend/tree/vulkan_compute


severe rivet
#

this almost always means you are dealing with UB or incorrect sync

#

have you enabled sync validation yet?

dense lodge
#

by that do you mean the synchronization2 layer?

severe rivet
#

that is not a validation layer, that is a feature emulation layer

#

sync validation is standard validation with sync enabled

dense lodge
#

ok so the synchronization preset on the validation layer?

severe rivet
#

yes

#

also do the gpuav preset after that

dense lodge
#

ok will try that now

#

done now, both produced no warnings or errors

#

i can include the output logs if you would like to take a look yourself

severe rivet
#

then all I can say is spend loads of time in renderdoc and inspect everything

dense lodge
#

well I have found that renderdoc doesn't really work with compute only workloads

severe rivet
#

if you don't have a window you have to use the api to do a manual capture

#

you can find instructions on the doc

dense lodge
#

thanks, i have tried that before but it refused to properly find the librenderdoc shared object

#

guess i will try again

severe rivet
#

so you can't just run your app, you need to run your app inside renderdoc

dense lodge
#

oh, must have missed that somehow. Thanks for the help

dense lodge
#

Well I am no expert but the barrier setup looks correct, with a barrier before every dispatch on all buffers used within it. It looks the same for both the small working captures and the big broken captures.

severe rivet
#

inspect the buffers and the data before/after and try to see if anything looks off

#

you can also debug step any compute shader

dense lodge
severe rivet
#

yes

#

you can see literally everything you passed to a vk function too

dense lodge
#

well i can see that i have passed it in, in which tab would i see the actual contents?

severe rivet
#

you double click the buffer in question in the stage view
