#Unexpected behaviour when using buffer_reference to modify a vec4, then access a single component

copper goblet
#

I'd like some input on an issue I'm having. I can't tell whether I'm doing something wrong or whether this needs to be filed with an upstream project, and if it's the latter I have no idea who I should report the bug to.

When using buffer references as in the small GLSL compute shader I shared with this post, I get different results on NVIDIA than on other devices, all tested on Linux (Debian 11). So far, only NVIDIA outputs the expected result.
On Intel, llvmpipe, and AMD, all single-component reads always return the .x component, so the printed result is: x=11, y=11, z=11, w=11, instead of the expected: x=11, y=22, z=33, w=44.

The erroneous behaviour is quite strange and specific, and makes no sense to me. For instance, outputting .zz instead of .z returns the expected value. And if the Vector is set before the TestRef function call instead of inside, the output is as expected.

I'm on Vulkan SDK 1.3.268.0. I don't see any validation warnings or errors.
Since all those drivers except NVIDIA behave the same, it seems likely I'm doing something wrong somewhere, but I have not found what.
This happens across two different physical machines, one AMD and one Intel/NVIDIA, both on Debian. I haven't tried on other OSes yet.
It also happens in both my main project and this minimal repro, even though their general setup is quite different. For instance, my main project passes the address through a storage buffer rather than push constants, and the result is the same.

I worked out a minimal repro project, for which I shared the .c file as well in case anyone is interested in trying to reproduce this result. The shader and .c can be compiled on Linux using the .sh file I also shared. Debug Printf display should be enabled in vkconfig to see the results.
Any help is welcome! I'm quite baffled trying to make sense of this behaviour at the moment.

#

I wish the C file was smaller by the way but this is as small as I could get it. The compute shader is small though, and I believe that's the most relevant part.

#

Also note, the behaviour is not specific to debugPrintfEXT. The same results are observed when a component is returned from the function, or written to some storage buffer or variable. This was causing very strange behaviour bugs in my program that I conclusively tracked down to be coming from this behaviour. (after much pain... 🥲 )

#

If anyone tries, please change PHYSICAL_DEVICE_INDEX to 0 in the .c file to use the first physical device, I just forgot to reset it before saving. I don't think I can edit that file now that it's attached.

copper goblet
#

Probably the way I presented this is too involved, but here's the short version: This compute code:

```glsl
{
    in_Ref.Vector = vec4(11, 22, 33, 44);
    debugPrintfEXT("x=%g, y=%g, z=%g, w=%g.\n", in_Ref.Vector.x, in_Ref.Vector.y, in_Ref.Vector.z, in_Ref.Vector.w);
}
```
Where `VectorRef` is a `buffer_reference` type, this prints the following on Linux with AMD, Intel and llvmpipe (NVIDIA gives the correct result):
`x=11, y=11, z=11, w=11`
Any other swizzle gives the correct result; for instance, `in_Ref.Vector.zz` correctly returns `zz=33, 33`.
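
For readers without the attachment, here is a minimal sketch of what the complete shader around that snippet might look like. The block declaration, the `buffer_reference_align` value, the `PushConstants` block and its `Ref` member are assumptions for illustration, not the original attachment; only the body of `TestRef` comes from the post.

```glsl
#version 460
#extension GL_EXT_buffer_reference : require
#extension GL_EXT_debug_printf : enable

layout(local_size_x = 1) in;

// Assumed declaration: a buffer_reference block holding a single vec4.
layout(buffer_reference, std430, buffer_reference_align = 16) buffer VectorRef
{
    vec4 Vector;
};

// Assumed: the buffer device address is passed in as a push constant.
layout(push_constant) uniform PushConstants
{
    VectorRef Ref;
} pc;

void TestRef(VectorRef in_Ref)
{
    // Whole-vector store followed by single-component loads:
    // the pattern reported to print x=11, y=11, z=11, w=11 on AMD, Intel and llvmpipe.
    in_Ref.Vector = vec4(11, 22, 33, 44);
    debugPrintfEXT("x=%g, y=%g, z=%g, w=%g.\n",
                   in_Ref.Vector.x, in_Ref.Vector.y, in_Ref.Vector.z, in_Ref.Vector.w);
}

void main()
{
    TestRef(pc.Ref);
}
```

On the host side this would need the `bufferDeviceAddress` feature enabled and Debug Printf turned on in vkconfig, as described in the first post.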
royal flax
#

what happens if you use %f

copper goblet
#

Same result. It's not specific to debugPrintf either: I confirmed the wrong value also gets written if it's returned from the function. This was causing me actual bugs in a game.

#

I can't figure out who to take this up with either. Initially I suspected AMD's implementation of the buffer reference extension, but then I saw it on Intel and llvmpipe too. I don't know if they share a piece of code or what, but at any rate I'm confused.

#

I'd be relieved if I'm just doing something stupid with alignment or whatever, but I haven't found the issue so far. Otherwise it smells a bit like a copy/paste mistake in someone's extension implementation code that got shared around, but I'm surprised no one caught it so far if that's the case.

royal flax
#

nvidia is known to be more lax in the implementation than AMD/others

#

AMD is usually pretty close to the letter on the spec

#

so I'm feeling like you might not be following the spec closely enough but idk where or what

copper goblet
#

It's possible, just haven't found my issue yet. I'm just trying to figure out what my next step is, maybe I can try to contact the authors of the buffer_reference spec to see what they think. I just wanted to make sure there wasn't something super obvious first before wasting people's time 🙂

#

First, I could raise an issue about it on the Khronos GLSL GitHub repo. Not sure if that's appropriate, but it hosts the spec for the extension (https://github.com/KhronosGroup/GLSL/blob/master/extensions/ext/GLSL_EXT_buffer_reference.txt), so it may be the most logical place to get knowledgeable people to weigh in.

#

Otherwise, I could try to figure out how the extension is implemented in llvmpipe source code maybe and see if I can spot what's happening, but I don't know if I'll be able to navigate that or find something obvious

#

If it's like a hardware limitation with memory alignment or whatever it might not be visible at all in the source

#

Though if there's something wrong along those lines in my code, it might be worth suggesting a new validation layer check somewhere as I'm not getting any warning

#

So yeah whatever the issue is I'm interested in finding out

#

I think I'll try emailing Jeff Bolz at NVIDIA, who authored the spec, just to get his opinion on the appropriate place to pursue or discuss something like this

copper goblet
#

I emailed him a question, we'll see where that gets me 🤷‍♂️

copper goblet
#

Jeff Bolz says it could be a bug in Mesa's compiler frontend and to report it to them. Getting somewhere!

past grotto
#

Just want to confirm that I'm observing the same behavior. I compiled and ran your code: with NVIDIA the output is 11, 22, 33, 44; with the integrated Intel GPU the output is 11, 11, 11, 11.

#

The interesting thing is that if you replace the line

in_Ref.Vector = vec4(11.0, 22.0, 33.0, 44.0);

with

in_Ref.Vector.x = 11.0;
in_Ref.Vector.y = 22.0;
in_Ref.Vector.z = 33.0;
in_Ref.Vector.w = 44.0;

the output is good for both GPUs.
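
A sketch of how the two store variants could be compared in a single shader. The `WHOLE_VECTOR_STORE` macro is a hypothetical switch added for illustration, and the block and push-constant declarations are assumed, not taken from the attached repro.

```glsl
#version 460
#extension GL_EXT_buffer_reference : require
#extension GL_EXT_debug_printf : enable

// Hypothetical switch, not in the original shader:
// 1 = single vec4 store (reported wrong on AMD, Intel and llvmpipe),
// 0 = four per-component stores (reported correct on both GPUs tested above).
#define WHOLE_VECTOR_STORE 1

layout(local_size_x = 1) in;

// Assumed declarations, for illustration only.
layout(buffer_reference, std430, buffer_reference_align = 16) buffer VectorRef
{
    vec4 Vector;
};

layout(push_constant) uniform PushConstants
{
    VectorRef Ref;
} pc;

void TestRef(VectorRef in_Ref)
{
#if WHOLE_VECTOR_STORE
    in_Ref.Vector = vec4(11.0, 22.0, 33.0, 44.0);
#else
    in_Ref.Vector.x = 11.0;
    in_Ref.Vector.y = 22.0;
    in_Ref.Vector.z = 33.0;
    in_Ref.Vector.w = 44.0;
#endif
    debugPrintfEXT("x=%g, y=%g, z=%g, w=%g.\n",
                   in_Ref.Vector.x, in_Ref.Vector.y, in_Ref.Vector.z, in_Ref.Vector.w);
}

void main()
{
    TestRef(pc.Ref);
}
```

Compiling the shader once with each value of the macro and diffing the resulting SPIR-V disassembly may help when reporting the issue, since it isolates the store pattern as the only difference.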

copper goblet
#

Thanks for confirming. Yeah it's a very specific case, a lot of trivial-seeming changes alter it, but it's very consistent on multiple gpus at the same time, so I think a common compiler bug is a possibility.

#

I wonder if it's an optimizer bug given it seems to happen only if the vec4 assignment is the last operation that interacts with the vec4 in the buffer before reading a component...

#

Feels like it's shortcutting the read of a component of the just-written vec4 but gets it wrong

copper goblet
past grotto
#

Yes, it's Linux. I also thought that it could be a compiler bug. I was curious to see the assembly and put the code in Shader Playground, but for some reason the shader just freezes the page.

copper goblet
#

Someone pointed out to me that it can't be a GLSL compiler issue, and they're right: I'm only passing SPIR-V to the driver. The SPIR-V seems fine since it works on NVIDIA, so the problem has to be in the translation from SPIR-V to the shader binary. I'm not sure if that's where NIR comes in in Mesa; I need to look into it more.

copper goblet