#SSBO doesn't get written to by compute shader

olive hedge
#

Hello guys.
I'm having difficulty writing to an SSBO.
It's an array of uint32_t, and each uint32_t holds 4 uint8_t values that I pack into it.
Here is the GLSL code for it:

    ArrayOffset colorIndicesOffset = GetArrayOffset(id);
    uint packedColorIndices = colorIndices[colorIndicesOffset.indexOffset];
    packedColorIndices &= ~(0xFF << (colorIndicesOffset.byteOffset * 8));
    packedColorIndices |= (chunkIDLod << (colorIndicesOffset.byteOffset * 8));
    debugPrintfEXT("i Index: %u\n", colorIndicesOffset.byteOffset);
    colorIndices[colorIndicesOffset.indexOffset] = packedColorIndices;
    SynchronizeBufferAccess();

The indexOffset is index / 4 and the byteOffset is index % 4.
When I do this on the CPU side it works just fine, but not in the shader.
when I do

colorIndices[colorIndicesOffset.indexOffset] = packedColorIndices

and then I do

debugPrintfEXT("u Index: %u\n", colorIndices[colorIndicesOffset.indexOffset]);

it prints the original value in the array (or some garbage, idk), not the updated one.

here is the buffer declaration:

layout (std430, binding = 5) buffer coherent ChunksColorIndicesBuffer
{
    uint colorIndices[];
};

and on the cpu side, the setup of the buffer:

_chunksColorsIndicesBuffer = BoxelVulkanBuffer("Chunks Color Indices Buffer");
    _chunksColorsIndicesBuffer.CreateBuffer(sizeof(uint32_t) * MAX_CHUNKS_TOTAL / 4, VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT,
        VMA_MEMORY_USAGE_AUTO_PREFER_HOST);
    std::vector<uint32_t> initColorIndices = std::vector<uint32_t>(MAX_CHUNKS_TOTAL / 4, 0);
    _chunksColorsIndicesBuffer.CopyDataToBuffer(initColorIndices.data());

Thanks in advance for the help!

also the SynchronizeBufferAccess:

void SynchronizeBufferAccess()
{
    memoryBarrier();
    barrier();
}
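
For reference, the packing scheme described above (indexOffset = index / 4, byteOffset = index % 4, four 8-bit values per uint32_t) can be sketched on the CPU side as follows. This is a minimal sketch, not the poster's actual code: the `ArrayOffset` struct mirrors the shader's type by name only, and `getArrayOffset`, `setPackedByte`, and `getPackedByte` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the shader's ArrayOffset struct.
struct ArrayOffset {
    uint32_t indexOffset; // which uint32_t in the array
    uint32_t byteOffset;  // which byte (0..3) inside that uint32_t
};

// index / 4 selects the word, index % 4 selects the byte within it.
constexpr ArrayOffset getArrayOffset(uint32_t index) {
    return ArrayOffset{index / 4u, index % 4u};
}

// Returns `word` with the selected byte replaced by `value`.
inline uint32_t setPackedByte(uint32_t word, uint32_t byteOffset, uint8_t value) {
    assert(byteOffset < 4u);
    const uint32_t shift = 8u * byteOffset;
    word &= ~(0xFFu << shift);                      // clear the target byte
    word |= static_cast<uint32_t>(value) << shift;  // write the new byte
    return word;
}

// Reads the selected byte back out of `word`.
inline uint8_t getPackedByte(uint32_t word, uint32_t byteOffset) {
    assert(byteOffset < 4u);
    return static_cast<uint8_t>((word >> (8u * byteOffset)) & 0xFFu);
}
```

Note that this read-modify-write is only safe when a single thread touches a given word at a time, which holds for a simple CPU loop but not necessarily for parallel shader invocations.
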
oak hearth
#
void SynchronizeBufferAccess()
{
    memoryBarrier();
    barrier();
}

this is not a global sync, it is a local workgroup sync: barrier() only waits for the invocations in the same workgroup

#

also what are you using for id

olive hedge
oak hearth
#

I mean where is it coming from

olive hedge
#

From another SSBO that is an array of chunksData, where I get the id of the chunk I'm processing. Each thread processes one chunk

#

The SSBO with the chunk data is valid, I checked that too

oak hearth
#

and how are you indexing that?

olive hedge
#

But two chunks can update the same item in the array

#

Since there are 4 uint8 in that single uint32

#

And each uint8 is for a chunk

olive hedge
oak hearth
#

and what are you using for thread id?

olive hedge
oak hearth
#

k so prob just the bad sync then

olive hedge
oak hearth
#

you can't

#

either do multiple dispatches or figure out how to make it work with atomics but that is slow

#

or figure out how to make use of local workgroups

olive hedge
#

So if I have 15k chunks and I have like 1024 threads per workgroup max I need to dispatch ~15 times?

oak hearth
#

iunno but I assume your math is right 😛
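
The arithmetic in the question can be checked with a rounded-up division. This is a minimal sketch assuming one invocation per chunk and a 1D local size of 1024; `workgroupCount` and `invocationInRange` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Number of workgroups needed so that every item gets one invocation.
// Rounds up, so the last workgroup may contain idle invocations.
inline uint32_t workgroupCount(uint32_t itemCount, uint32_t localSize) {
    assert(localSize > 0u);
    return (itemCount + localSize - 1u) / localSize;
}

// Mirrors the bounds check a shader needs for the idle tail invocations.
inline bool invocationInRange(uint32_t globalId, uint32_t itemCount) {
    return globalId < itemCount;
}
```

For 15,000 chunks and 1024 threads per workgroup this gives 15 workgroups. A single vkCmdDispatch call can launch all of those workgroups at once; whether separate dispatches are needed depends on the synchronization being attempted.
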

olive hedge
#

so it's not from the fact that memory is written by multiple workgroups

olive hedge
#

Turns out writing a value like 5 works

#

But manipulating the bytes in the uint and copying the result doesn't work

#

If you have an idea

oak hearth
#

assuming your shader is correct and that renderdoc shows the right values in the buffer then that would be a sync issue

#

if renderdoc doesn't even show the right value then it is still likely your shader/math assuming your input buffers are correct

#

you should do a simple test by just setting every single voxel to be the ID and see if that works and that you get the iDs you are expecting

olive hedge
#

I'm just printing the array value before, then setting it to 5, and printing it again

#

It works like this

#

But when manipulating the bytes it doesn't

oak hearth
#

well how can we be sure you use it correctly?

#

safest is to look at the buffer's values

oak hearth
olive hedge
# oak hearth do simple math that is easy to verify

well this is the current code:

chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset] = 5;
    SynchronizeBufferAccess();
debugPrintfEXT("output: %u\n", chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset]);

and this works totally fine, whatever number I put, I get the appropriate one in the debug and on the cpu side too

#

also, manipulating the bytes of the packedColorIndices variable works just fine

#
uint packedColorIndices = chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset];
    packedColorIndices &= ~(0xFF << (colorIndicesOffset.byteOffset * 8));
    packedColorIndices |= (chunkIDLod << (colorIndicesOffset.byteOffset * 8));
#

it's the assignment

chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset] = packedColorIndices;
#

that doesn't work

#

if I manipulate the bytes in a uint and copy it to another variable in GLSL, does it copy the variable's value or the raw bytes that make up that value?

oak hearth
#

and you are sure colorIndicesOffset.indexOffset is completely unique over every single worker?

olive hedge
oak hearth
#

maybe replace colorIndicesOffset.indexOffset with gl_GlobalInvocationID.x

oak hearth
#

so it copies the value

#

if you assign a uint to a float, for example, you're not going to get the same bytes
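
The difference between copying a value and copying its bytes can be illustrated on the CPU. This is a minimal sketch, with hypothetical helper names, assuming a 32-bit IEEE 754 float; in GLSL the rough equivalents are `float(u)` for the value conversion and `uintBitsToFloat(u)` for the reinterpretation. Assigning a uint to another uint involves no conversion at all, so value and bytes are the same thing there.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion: keeps the number, changes the bit pattern.
inline float convertValue(uint32_t u) {
    return static_cast<float>(u);
}

// Bit reinterpretation: keeps the bit pattern, changes the number.
inline float reinterpretBits(uint32_t u) {
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Returns the raw bit pattern of a float, for inspection.
inline uint32_t bitsOf(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}
```

For example, converting the uint 5 yields the float 5.0, whose bit pattern is 0x40A00000, whereas reinterpreting the same uint keeps the bit pattern 0x00000005.
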

olive hedge
#

and same thing

oak hearth
#

much sus

olive hedge
#

i Index: 0
i Index: 256
i Index: 131072
u Index: 0
u Index: 0
u Index: 0

#

i is the input

#

u is the output

oak hearth
#

so yeah can you check the buffer in renderdoc so we can be sure it's not the debug print

oak hearth
#

it must be a unique index

olive hedge
#

it's like really basic rn

oak hearth
#

one thread how?

#

you need local size = 1 and dispatch also = 1

olive hedge
#

yeah yeah

oak hearth
#

btw what data type is colorIndices

olive hedge
olive hedge
oak hearth
#

hmmmmmmmmmmmmmmmmmmmmmmm

olive hedge
#

in the shader it's uint

#

I'm basically writing 4 uint8_t into each element of the array

oak hearth
#

ohhhhh

olive hedge
#

hence why the byte manipulation

oak hearth
#

yeah that should be fine as long as the type matches

olive hedge
oak hearth
#

much sus

#

well try the shader debug

olive hedge
oak hearth
#

yeah

#

renderdoc lets you debug step shaders

olive hedge
#

tried some things, still the same problem tho, can't put that variable in that buffer

oak hearth
#

try nsight then

#

and try to do something simple first like from a tutorial

#

going to be very hard to say anything useful here

oak hearth
#

if compute ray tracing then that is a simple fix

olive hedge
#

ok well

#

thanks for the help

#

appreciate it

oak hearth
#

ok maybe would be a good idea to try to disable the raytracing stuff first

#

and only do compute

#

going to be very hard to debug if you cannot hook a debugger to it

olive hedge
# oak hearth going to be very hard to debug if you cannot hook a debugger to it

alright quick update:

ArrayOffset colorIndicesOffset = GetArrayOffset(id);
    uint originalValue, desiredValue;
    uint mask = 0xFFFFFFFFu ^ (0xFFu << (8u * colorIndicesOffset.byteOffset));

    do {
        originalValue = chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset];
        desiredValue = (originalValue & mask) | (chunkIDLod << (8u * colorIndicesOffset.byteOffset));
    } while (atomicCompSwap(chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset], originalValue, desiredValue) != originalValue);

    uint extractedByte = (chunksColorIndices.colorIndices[colorIndicesOffset.indexOffset] >> (8u * colorIndicesOffset.byteOffset)) & 0xFFu;
    debugPrintfEXT("Result: %u\n", extractedByte);
#

this version of the code seems to work, it correctly updates the same element in the array

#

for some reason, the memory barriers available in GLSL have no effect whatsoever, so I had to use atomics

#

I basically create a lock with the atomic. It's not the most efficient way, but it seems to work, so I'll keep it for now I think
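
The compare-and-swap loop above has a direct CPU analogue, sketched here with `std::atomic`. This is an illustration under assumptions rather than a translation of the shader: `storeByteCas` and `storeByteAndOr` are hypothetical names, and the second function shows a loop-free alternative (GLSL has matching `atomicAnd` and `atomicOr` built-ins) that is only valid when each byte has a single writer, as in the one-byte-per-chunk layout described earlier.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Replaces one byte of a shared 32-bit word without losing concurrent
// updates to the other three bytes (retry loop, like atomicCompSwap).
inline void storeByteCas(std::atomic<uint32_t>& word, uint32_t byteOffset, uint8_t value) {
    assert(byteOffset < 4u);
    const uint32_t shift = 8u * byteOffset;
    const uint32_t mask = ~(0xFFu << shift);
    uint32_t expected = word.load(std::memory_order_relaxed);
    uint32_t desired;
    do {
        desired = (expected & mask) | (static_cast<uint32_t>(value) << shift);
        // On failure, compare_exchange_weak refreshes `expected` with the
        // current contents of `word`, so the next iteration recomputes.
    } while (!word.compare_exchange_weak(expected, desired, std::memory_order_relaxed));
}

// Alternative without a retry loop: clear the byte, then OR in the new one.
// Other threads can briefly observe the byte as zero between the two steps.
inline void storeByteAndOr(std::atomic<uint32_t>& word, uint32_t byteOffset, uint8_t value) {
    assert(byteOffset < 4u);
    const uint32_t shift = 8u * byteOffset;
    word.fetch_and(~(0xFFu << shift), std::memory_order_relaxed);
    word.fetch_or(static_cast<uint32_t>(value) << shift, std::memory_order_relaxed);
}
```

Strictly speaking the loop is a lock-free retry rather than a lock: no thread ever waits while holding anything, it just recomputes when another thread changed the word first.
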

oak hearth
#

memoryBarrier() only orders memory accesses and barrier() only syncs within a local workgroup, so neither syncs every worker, which is the very first thing I mentioned

olive hedge
#

and the threads don't seem to be affected by the barriers

oak hearth
#

so I'm guessing you are having multiple threads using the same index

oak hearth
#

yeah barriers won't help you there unless you make everything serial

#

I feel like you might be able to figure out something with the subgroups stuff cuz atomics is really the last thing you want to do

olive hedge
oak hearth
#

prob not if you order your data and workers correctly

#

like if you made it so that the entire workgroup is only working on one index
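
One way to read the suggestion about ordering data and workers is to give each worker a whole uint32_t instead of a single byte, so that no two workers ever write the same array element and no atomics are needed. The sketch below shows that idea on the CPU, with each loop iteration standing in for one worker; `packWord` and `packAll` are hypothetical names, and the layout (ids[0] in the lowest byte) is an assumption that matches the byteOffset * 8 shifts used earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs up to four 8-bit IDs into one word; ids[0] lands in the lowest byte.
inline uint32_t packWord(const uint8_t* ids, std::size_t count) {
    assert(count <= 4u);
    uint32_t packed = 0u;
    for (std::size_t b = 0; b < count; ++b) {
        packed |= static_cast<uint32_t>(ids[b]) << (8u * b);
    }
    return packed;
}

// Each iteration stands in for one worker that owns exactly one word.
inline std::vector<uint32_t> packAll(const std::vector<uint8_t>& ids) {
    std::vector<uint32_t> words((ids.size() + 3u) / 4u);
    for (std::size_t w = 0; w < words.size(); ++w) {
        const std::size_t first = 4u * w;
        const std::size_t count = std::min<std::size_t>(4u, ids.size() - first);
        words[w] = packWord(ids.data() + first, count); // single plain store
    }
    return words;
}
```

The trade-off is that each worker now handles four chunks, so this only fits if the per-chunk work can be grouped that way.
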