Iris - A Journey through OpenGL and beyond to learn Graphics | Graphics Programming | Page 3

frank sail May 4, 2023, 11:59 PM

#

Does gltfpack set a flag

wicked notch May 5, 2023, 12:00 AM

#

It's kind of naive, I just see if the glTF has basisu images in it.

#

Ideally I'd want to set a flag somewhere in the extras section

#

https://github.com/LVSTRI/Iris/blob/master/src/texture.cpp#L28
https://github.com/LVSTRI/Iris/blob/master/src/model.cpp#L65
This is the implementation by the way

#

Feel free to steal as I have stolen from others KEKW

frank sail May 5, 2023, 12:19 AM

#

I've never seen std::type_identity before
auto* ktx = std::type_identity_t<ktxTexture2*>();

#

apparently it's useful for implicit type conversions to the template type

wicked notch May 5, 2023, 12:20 AM

#

My reasoning for using that is I don't like declaring the type on the left

frank sail May 5, 2023, 12:20 AM

#

uh

wicked notch May 5, 2023, 12:21 AM

#

Call me crazy, I deserve it

frank sail May 5, 2023, 12:21 AM

#

you can just write auto* ktx = ktxTexture2*{};

#

nvm bleakekw

#

you have to write it like (int*){}

#

er

#

(ktxTexture2*){}

wicked notch May 5, 2023, 12:23 AM

#

Interesting.

frank sail May 5, 2023, 12:23 AM

#

damn, libktx looks super easy to use

wicked notch May 5, 2023, 12:23 AM

#

I guess my life is a lie then

frank sail May 5, 2023, 12:23 AM

#

for value types, you can indeed write auto foo = Foo{}; (or Foo()), which I do a lot

#

I guess pointer syntax is brain damaged, so you can't write it exactly like those
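
A minimal sketch of the spellings discussed above, assuming a C++11 or newer compiler. `Texture` is a hypothetical stand-in for `ktxTexture2`, and `identity_t` is a local equivalent of C++20's `std::type_identity_t`, spelled out so the snippet also builds on older standards. The underlying issue is that `T{}` needs `T` to be a single type name, which `Texture*` is not; an alias or an identity wrapper supplies one.

```cpp
#include <cassert>

struct Texture {};  // hypothetical stand-in for ktxTexture2

// Local equivalent of C++20's std::type_identity_t (from <type_traits>).
template <class T> struct identity { using type = T; };
template <class T> using identity_t = typename identity<T>::type;

// An alias gives the pointer type a single name, so TexturePtr{} parses.
using TexturePtr = Texture*;

// Each spelling value-initializes the pointer, which yields a null pointer.
inline bool all_spellings_are_null() {
    auto* a = identity_t<Texture*>{};  // the type_identity trick
    auto* b = TexturePtr{};            // alias plus braces
    Texture* c = nullptr;              // type on the left
    // auto* d = Texture*{};           // does not compile: Texture* is not a simple type name
    return a == nullptr && b == nullptr && c == nullptr;
}
```

The `(ktxTexture2*){}` spelling is, as far as I can tell, a C99-style compound literal that GCC and Clang accept in C++ as an extension, so it is left out of the sketch.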

wicked notch May 5, 2023, 12:24 AM

#

frank sail I guess pointer syntax is brain damaged, so you can't write it exactly like thos...

Truer words have never been spoketh

frank sail May 5, 2023, 12:28 AM

#

I learn something new about C++ every time I read your code

#

what's the advantage of defining swap for a custom type instead of just defining a move constructor and move assignment operator?

wicked notch May 5, 2023, 12:30 AM

#

I don't understand it fully either, but apparently it avoids repetition and favors ADL

frank sail May 5, 2023, 12:31 AM

#

my move semantics are cursed, lemme show you

wicked notch May 5, 2023, 12:31 AM

#

It's called the copy-and-swap idiom

frank sail May 5, 2023, 12:31 AM

#

  Buffer::Buffer(Buffer&& old) noexcept
    : size_(std::exchange(old.size_, 0)),
      storageFlags_(std::exchange(old.storageFlags_, BufferStorageFlag::NONE)),
      id_(std::exchange(old.id_, 0)),
      mappedMemory_(std::exchange(old.mappedMemory_, nullptr))
  {
  }

  Buffer& Buffer::operator=(Buffer&& old) noexcept
  {
    if (&old == this)
      return *this;
    this->~Buffer();
    return *new (this) Buffer(std::move(old));
  }

wicked notch May 5, 2023, 12:32 AM

#

I always flinch when I see a placement new

#

for some reason bleakekw

frank sail May 5, 2023, 12:32 AM

#

I did some epic spec reading with others to ensure this is actually legal code

#

iirc there is an edge case with this pattern when you have a pointer to a subobject of the object that gets destroyed

#

where the spec isn't clear on whether it's UB

#

but no one should (or can) be making pointers to members of Buffer, so it's fine here

wicked notch May 5, 2023, 12:35 AM

#

Reject the spec, embrace "it works on my machine"

frank sail May 5, 2023, 12:36 AM

#

anyways, the whole point of this is to reduce duplication in favor of spooky placement new

#

60% of the time, it works every time

wicked notch May 5, 2023, 12:37 AM

#

I'll keep my "swap" thanks KEKW

frank sail May 5, 2023, 12:37 AM

#

you need move semantics anyways, no?

#

swapping isn't the only reason I need em

#

idk, this is #bikeshed-😇 material

wicked notch May 5, 2023, 12:39 AM

#

If I didn't misunderstand that idiom, copy-and-swap works in all cases

#

Because you define only one copy assignment op, taking self by value

#

and you just swap that with *this

frank sail May 5, 2023, 12:39 AM

#

yeah, it seems to work

wicked notch May 5, 2023, 12:40 AM

#

If you pass anything that's not an rvalue (an xvalue or prvalue), it copies it

#

otherwise, move constructed

#

Then you swap it with either the copied state or the "unspecified" moved-from state (depending on whether you moved or not), which is consistent with move semantics
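
A minimal sketch of the copy-and-swap idiom as described in this exchange. `IntArray` is a hypothetical resource-owning type invented for illustration; it is not the Iris code. The single assignment operator takes its argument by value, so the caller's expression decides whether that parameter is copy-constructed or move-constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning type, used only to illustrate the idiom.
class IntArray {
public:
    explicit IntArray(std::size_t n = 0)
        : size_(n), data_(n ? new int[n]() : nullptr) {}

    IntArray(const IntArray& other)
        : size_(other.size_), data_(other.size_ ? new int[other.size_] : nullptr) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    IntArray(IntArray&& other) noexcept
        : size_(std::exchange(other.size_, 0)),
          data_(std::exchange(other.data_, nullptr)) {}

    ~IntArray() { delete[] data_; }

    // One operator covers copy and move assignment: `other` is built by the
    // copy constructor for lvalues and by the move constructor for rvalues.
    IntArray& operator=(IntArray other) noexcept {
        swap(*this, other);
        return *this;  // the old state now lives in `other` and dies with it
    }

    // Member-wise swap, found through ADL by unqualified swap(a, b) calls.
    friend void swap(IntArray& a, IntArray& b) noexcept {
        using std::swap;
        swap(a.size_, b.size_);
        swap(a.data_, b.data_);
    }

    std::size_t size() const { return size_; }
    int& operator[](std::size_t i) { return data_[i]; }

private:
    std::size_t size_;
    int* data_;
};
```

Assigning from an lvalue (`b = a;`) leaves `a` untouched, while `c = std::move(a);` leaves `a` empty. Self-assignment needs no special case, because the parameter is always a separate object.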

#

I don't know where ADL comes into play but I never understood ADL

#

Or anything about C++'s overload resolution rules

frank sail May 5, 2023, 12:41 AM

#

yeah, they're absurdly complex

#

all I know is that it enables this construct
endl(std::cout);

#

oh, it also lets you do this

    std::string a{"hello"};
    std::string b{"world"};
    swap(a, b);

#

which means you don't have to force a particular version of swap, if your thing happens to define it in its namespace/class/whatever

#

std::swap means you will get something slightly less optimal than if you implemented your own swap, since it always move-constructs a temporary instead of just swapping each member
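
A short sketch of where ADL comes in, assuming a hypothetical `gfx::Handle` type with its own `swap`. Generic code brings `std::swap` into scope as a fallback and then calls `swap` unqualified, so a `swap` declared in the argument's namespace is picked when one exists. The call counter exists only to make the choice observable.

```cpp
#include <cassert>
#include <utility>

namespace gfx {  // hypothetical namespace and type

int custom_swap_calls = 0;  // demonstration only: counts calls to gfx::swap

struct Handle { int id = 0; };

// Lives next to Handle, so argument-dependent lookup finds it.
void swap(Handle& a, Handle& b) noexcept {
    ++custom_swap_calls;
    const int tmp = a.id;
    a.id = b.id;
    b.id = tmp;
}

}  // namespace gfx

// Generic helper: std::swap is the fallback, ADL supplies better matches.
template <class T>
void exchange_values(T& a, T& b) {
    using std::swap;
    swap(a, b);  // unqualified on purpose
}
```

For `gfx::Handle` the non-template `gfx::swap` wins overload resolution; for `int` there is no associated namespace, so the call falls back to `std::swap`.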

#

maybe once they add reflection, we can finally have an optimal std::swap bleakekw

wicked notch May 5, 2023, 8:18 AM

#

Good morning friends

#

Today I discovered that cutting electricity for half a day is legal in this country

#

Can't really do anything about it if it's an issue in the electrical distribution network.

#

Anyways now that we have Indirect Drawing and Shadows figured out, I will now ponder where to go next.

#

Possible candidates are OIT and Frustum Culling

#

OIT is really interesting because it uses Linked Lists on the GPU which to me is fairly wild.

frank sail May 5, 2023, 8:29 AM

#

that's just one possible implementation of OIT

#

if you want the state of the art, you probably want one of these (second one requires RoVs which aren't on AMD)
http://momentsingraphics.de/MissingTMBOITCode.html
https://www.intel.com/content/www/us/en/developer/articles/technical/oit-approximation-with-pixel-synchronization.html

wicked notch May 5, 2023, 8:33 AM

#

But no linked lists 😦

#

Are there other things that use GPU Linked Lists? I'm very curious to try them

frank sail May 5, 2023, 8:47 AM

#

linked lists generally aren't what you want to be doing on the GPU 😄

frank sail May 5, 2023, 9:34 AM

#

wicked notch Possible candidates are OIT and Frustum Culling

may I suggest occlusion culling

#

all done in a compoot shader

#

well, there's raster occlusion culling, which uses the frogment shader

wicked notch May 5, 2023, 9:36 AM

#

Occlusion culling in compute hmm.

frank sail May 5, 2023, 9:41 AM

#

look up "hi-z occlusion culling"

#

it's a bit more complex than raster occlusion culling

#

you can use both at the same time if you wish

wicked notch May 5, 2023, 10:00 AM

#

Alright I have pondered enough, it's time to set up frustum + occlusion culling

wispy spear May 5, 2023, 4:10 PM

#

so, how did you compress sponza

#

did you just run compressonator with some flags?

#

and it produced ktx2 files out of png?

wicked notch May 5, 2023, 4:53 PM

#

I ended up not using Compressonator as it was far too overkill for my purposes, instead gltfpack is very automagic

#

```
gltfpack -i .\bistro\bistro.gltf -o .\compressed\bistro\bistro.glb -tc -tq 10 -vpf -kn -km -ke -noq
```

#

I just had to run this and boom: everything is compressed

wispy spear May 5, 2023, 4:56 PM

#

ah neat

wicked notch May 5, 2023, 9:45 PM

#

Somehow I always manage to forget GPUs are parallel machines

#

Hmm I still get occasional flickering for some reason?

#

Even though this was most likely the issue, there is still flickering once every second or so, randomly

#

Shader is just this:

void main() {
    uint index = gl_GlobalInvocationID.x;
    if (index == 0) {
        for (uint i = 0; i < u_draw_count; ++i) {
            draw_count[i] = 0;
        }
    }
    barrier();
    if (index < u_object_count) {
        const object_info_t object = objects[index];
        const mat4 local_transform = local_transforms[object.local_transform];
        const mat4 global_transform = global_transforms[object.global_transform];
        const mat4 model = global_transform * local_transform;
        if (is_object_visible(object, model)) {
            const uint slot = atomicAdd(draw_count[object.group_index], 1);
            indirect_commands[slot + object.group_offset] = object.command;
            object_shift[slot + object.group_offset].object_id = index;
        }
    }
}

#

(is_object_visible always returns true for now)

frank sail May 5, 2023, 10:13 PM

#

do you call glMemoryBarrier in your host code

wicked notch May 5, 2023, 10:14 PM

#

I have no idea what that is (so no)

frank sail May 5, 2023, 10:14 PM

#

it's necessary to make your program correct

wicked notch May 5, 2023, 10:14 PM

#

Interesting, I'll read up on that

frank sail May 5, 2023, 10:14 PM

#

it ensures incoherent reads and writes (SSBO and image stores from shaders) are visible/completed to future operations

#

so if you write some indirect commands in a shader, then do glMultiDrawElementsIndirect, you need glMemoryBarrier(GL_COMMAND_BARRIER_BIT); between them

#

otherwise the driver cannot see that the MDI command depends on the dispatch and issue the corresponding synchronization and cache flush/invalidation

wicked notch May 5, 2023, 10:18 PM

#

I see

#

So just like CPU atomics then

#

I assume "visible" and "available" mean the same thing (I mean, not that visible == available, just that visible/available on the CPU are the same on the GPU)?

frank sail May 5, 2023, 10:19 PM

#

glMakeVisible and glMakeAvailable bleakekw

#

(jk those don't exist)

wicked notch May 5, 2023, 10:19 PM

#

It's basically a cache flush

frank sail May 5, 2023, 10:19 PM

#

there is just one concept of memory visibility in opengl

#

btw, DX11 doesn't have this, so the driver has to issue conservative barriers between every pass that does incoherent writes

#

which means you can't mess up sync, but you also cannot get maximum perf

#

anyways, flickering like what you have is typically a symptom of a synchronization issue

#

you can check it by inserting glFinish or glMemoryBarrier(GL_ALL_BARRIER_BITS) after every draw/dispatch that does SSBO/image writes

wicked notch May 5, 2023, 10:24 PM

#

Very interesting

#

Why does cache become incoherent on the GPU itself though?

#

Like, the GPU is feeding itself data, how does it fail to maintain coherency?

#

Oh it's probably because each GPU SM has its own cache and any workgroup writing data is not guaranteed to be the same reading it?

frank sail May 5, 2023, 10:25 PM

#

GPU cache coherency protocols are very basic compared to CPU ones

#

and often rely on manual flushes and sync

#

idk all the details though

wicked notch May 5, 2023, 10:27 PM

#

So hold on a sec

#

I also do a depth reduce and setup shadow cascades in compute

#

...Am I supposed to insert barriers here too?

frank sail May 5, 2023, 10:27 PM

#

prob

wicked notch May 5, 2023, 10:27 PM

#

But it's been working fine up until now with no errors from the debug callback frog_sweat

frank sail May 5, 2023, 10:28 PM

#

any time you want a write to be visible or a read to have finished before the next pass that consumes the memory, you need a barrier

wicked notch May 5, 2023, 10:28 PM

#

So this:

```cpp
depth_reduce_init_shader.bind();
offscreen_attachment[1].bind_texture(0);
depth_reduce_attachments[0].bind_image_texture(0, 0, false, 0, GL_WRITE_ONLY);
camera_buffer.bind_base(1);
glDispatchCompute(depth_reduce_wgc[0].x, depth_reduce_wgc[0].y, 1);

depth_reduce_shader.bind();
for (auto i = 1; i < depth_reduce_wgc.size(); i++) {
    depth_reduce_attachments[i - 1].bind_image_texture(0, 0, false, 0, GL_READ_ONLY);
    depth_reduce_attachments[i].bind_image_texture(1, 0, false, 0, GL_WRITE_ONLY);
    glDispatchCompute(depth_reduce_wgc[i].x, depth_reduce_wgc[i].y, 1);
}
```
becomes this:
```cpp
...
glDispatchCompute(depth_reduce_wgc[0].x, depth_reduce_wgc[0].y, 1);
glMemoryBarrier();
depth_reduce_shader.bind();
for (auto i = 1; i < depth_reduce_wgc.size(); i++) {
    ...
    glDispatchCompute(depth_reduce_wgc[i].x, depth_reduce_wgc[i].y, 1);
    glMemoryBarrier();
}
```
?

frank sail May 5, 2023, 10:29 PM

#

ye

wicked notch May 5, 2023, 10:30 PM

#

Hmm I also read from the depth's sampler though, do I need a barrier here too 🤔

#

Before the first dispatch, I mean

frank sail May 5, 2023, 10:31 PM

#

wdym before the first dispatch

#

like in some draw?

wicked notch May 5, 2023, 10:32 PM

#

wicked notch So this: ```cpp depth_reduce_init_shader.bind(); offscreen_attachment[1].bind_te...

The first dispatch here, reads from a framebuffer attachment (the depth attachment, written to by a Z-prepass)

#

Should I do glMemoryBarrier(GL_FRAMEBUFFER_BARRIER_BIT);?

#

The spec isn't very clear on the rules.

frank sail May 5, 2023, 10:34 PM

#

SHADER_IMAGE_ACCESS_BARRIER_BIT: Memory accesses using shader built-in image load, store, and atomic functions issued after the barrier will reflect data written by shaders prior to the barrier. Additionally, image stores and atomics issued after the barrier will not execute until all memory accesses (e.g., loads, stores, texture fetches, vertex fetches) initiated prior to the barrier complete.

#

btw, the ref pages do not mention this critical information for some of the barrier bits, so you best refer to the spec

#

https://github.com/KhronosGroup/OpenGL-Refpages/issues/128

wicked notch May 5, 2023, 10:37 PM

#

Hm

#

Interesting

frank sail May 5, 2023, 10:37 PM

#

I guess that is to say that you need a barrier before the first dispatch as well

wicked notch May 5, 2023, 10:38 PM

#

By the way

#

We might have a situation

#

Using glFinish(); I still see occasional flickering nervous

frank sail May 5, 2023, 10:39 PM

#

spoopy

#

idk if glFinish technically makes writes visible, so maybe don't use that

#

or use glFinish+glMemoryBarrier if you're super paranoid 😄

wicked notch May 5, 2023, 10:43 PM

#

Uhhh

#

```cpp
glMemoryBarrier(GL_ALL_BARRIER_BITS);
glFinish();
```
still flickers

#

oh no

frank sail May 5, 2023, 10:43 PM

#

where did you put it

wicked notch May 5, 2023, 10:43 PM

#

After the dispatch that writes indirect commands

frank sail May 5, 2023, 10:44 PM

#

damn

#

maybe there is a race within that shader. Lemme look at it again

#

shader looks okay

wicked notch May 5, 2023, 10:46 PM

#

I pushed this broken stuff

#

https://github.com/LVSTRI/Iris/blob/master/shaders/5.0/generic_cull.comp

frank sail May 5, 2023, 10:48 PM

#

I thought atomics could only be done on variables that were neither readonly nor writeonly

wicked notch May 5, 2023, 10:48 PM

#

Compiler doesn't really complain but you are right lol

#

It doesn't really make sense for it to be writeonly

#

Same flickering though

frank sail May 5, 2023, 10:52 PM

#

I guess try debugging with nsight

#

or somehow simplifying the shader (e.g., removing the atomic and just using the global invocation id, if possible)

#

wait

#

Can you explain the loop at the beginning of your shader

wicked notch May 5, 2023, 10:54 PM

#

Since the shader increments the draw_count, I need a way to reset it between invocations

#

So that it doesn't grow to infinity and beyond

frank sail May 5, 2023, 10:54 PM

#

What do you expect barrier() to do

wicked notch May 5, 2023, 10:55 PM

#

Make all other threads wait for the first thread to finish initializing draw_count?

frank sail May 5, 2023, 10:55 PM

#

Did you know that barrier only synchronizes threads within a single workgroup

#

i.e., it is not global sync

#

so the code is probably wrong if you have more than one wg

wicked notch May 5, 2023, 10:57 PM

#

me:

frank sail May 5, 2023, 10:58 PM

#

Try clearing the buffer with this
https://registry.khronos.org/OpenGL-Refpages/gl4/html/glClearBufferSubData.xhtml
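
A toy CPU model of why the in-shader reset races, written as plain C++ and not as GL code. It assumes workgroups can be scheduled in any order and that `barrier()` only holds back the invocations of the workgroup that executes it; `run_cull` and its parameters are hypothetical names. Passing `reset_in_shader = false` models clearing the counter on the host before the dispatch, which is what the `glClearBufferSubData` suggestion amounts to.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct CullResult {
    std::uint32_t draw_count = 0;
    std::vector<std::uint32_t> object_ids;  // compacted ids, valid up to draw_count
};

// group_order lists the workgroup ids in the order they happen to run.
inline CullResult run_cull(const std::vector<std::uint32_t>& group_order,
                           std::uint32_t group_size, bool reset_in_shader) {
    CullResult r;
    r.object_ids.assign(group_order.size() * group_size, 0);
    r.draw_count = 0;  // counter state when the dispatch starts

    for (std::uint32_t group : group_order) {
        if (reset_in_shader && group == 0) {
            // "if (index == 0) draw_count = 0; barrier();"
            // Only workgroup 0 waits here; other groups may already have run.
            r.draw_count = 0;
        }
        for (std::uint32_t local = 0; local < group_size; ++local) {
            const std::uint32_t index = group * group_size + local;
            const std::uint32_t slot = r.draw_count++;  // stands in for atomicAdd
            r.object_ids[slot] = index;
        }
    }
    return r;
}
```

With two workgroups of two objects each, the order `{0, 1}` yields four draws, but `{1, 0}` with the in-shader reset yields only two because workgroup 1's increments are wiped. That order dependence is consistent with the intermittent flicker described here.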

wicked notch May 5, 2023, 10:59 PM

#

Yeah...

#

no more flickering

#

How many more absolutely vital pieces of information am I missing I wonder KEKW

frank sail May 5, 2023, 11:01 PM

#

just read the whole gl and glsl specs before continuing smart

wicked notch May 5, 2023, 11:02 PM

#

Took a whole day just to set up Frustum Culling

#

Very promising dare I say KEKW

frank sail May 5, 2023, 11:05 PM

#

usually takes me a lot longer than that 🙂

wispy spear May 5, 2023, 11:07 PM

#

and me ^3 that

wicked notch May 6, 2023, 12:14 AM

#

Alright that's a wrap, tomorrow we'll have actual frustum culling (and possibly even occlusion)

wicked notch May 6, 2023, 11:16 AM

#

By the way I only just realized that gltfpack merges primitives if they have the same node and material

#

Thankfully it's open source so I could simply add a flag and build from source

wicked notch May 6, 2023, 12:21 PM

#

```glsl
bool intersect_aabb_plane(in aabb_t aabb, in vec4 plane) {
    const vec3 normal = plane.xyz;
    const vec3 size = aabb.size.xyz;
    const vec3 center = aabb.center.xyz;
    const float radius = dot(size, abs(normal));
    return -radius <= dot(normal, center) - plane.w;
}

bool is_object_visible(in object_info_t object, in mat4 model) {
    const aabb_t aabb = object.aabb;
    const vec3 world_aabb_max = vec3(model * vec4(aabb.max.xyz, 1.0));
    const vec3 world_aabb_min = vec3(model * vec4(aabb.min.xyz, 1.0));
    const vec3 world_aabb_center = (world_aabb_max + world_aabb_min) / 2.0;
    const vec3 world_aabb_extents = world_aabb_max - world_aabb_center;

    const aabb_t global_aabb = aabb_t(
        vec4(world_aabb_min, 0.0),
        vec4(world_aabb_max, 0.0),
        vec4(world_aabb_center, 0.0),
        vec4(world_aabb_extents, 0.0));
    for (int i = 0; i < 6; ++i) {
        if (!intersect_aabb_plane(global_aabb, frustum.planes[i])) {
            return false;
        }
    }

    return true;
}
```
It was disappointingly trivial to implement...

#

Learning about AABBs with mouse picking was worth it KEKW

wicked notch May 6, 2023, 12:39 PM

#

Is it worth doing this for shadow maps too? 🤔

wicked notch May 6, 2023, 2:21 PM

#

Hmm, frustum culling shadow cascades doesn't really work sadly, I need more tolerance I guess?

wicked notch May 6, 2023, 6:45 PM

#

Turns out my intuition for culling shadow lights was completely off

#

I'm now following this scary looking paper: https://arisilvennoinen.github.io/Publications/Shadow_Caster_Culling_for_Efficient_Shadow_Mapping.pdf but it's probably better if I do occlusion culling first

wicked notch May 6, 2023, 7:21 PM

#

CHC++ uses hardware occlusion queries, but from what I'm reading they are fairly inefficient due to CPU stalling. Can conditional rendering fix this?

#

Maybe HiZ culling is the way to go?

frank sail May 6, 2023, 9:25 PM

#

wicked notch Is it worth doing this for shadow maps too? 🤔

Ye

#

Hardware occlusion queries aren't great these days since you can just write to a buffer now

wicked notch May 6, 2023, 9:39 PM

#

But how do you do occlusion culling for shadows?

#

HiZ requires depth

#

But shadows are depth

frank sail May 6, 2023, 9:41 PM

#

uh

#

you do it the exact same way as usual

wicked notch May 6, 2023, 9:42 PM

#

https://tenor.com/view/favorite-facts-equations-math-confused-gif-15404559


frank sail May 6, 2023, 9:43 PM

#

here's a classic algorithm:

1. Render objects that were marked visible to depth
2. Perform occlusion culling against depth, marking visible objects

#

if you do it all on the GPU, there is just one frame of latency between an object being marked visible, and actually being drawn

#

But you can add a third step to remove that latency

#

By simply drawing the objects whose visibility changed from 0 to 1 this frame
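
The three steps above can be sketched on the CPU with a toy one-dimensional "screen". This is only a model under stated assumptions: `Object`, `Culler`, and `frame` are hypothetical names, each object doubles as its own bounding volume, and a `std::vector<float>` stands in for the depth buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// An object covers pixels [x0, x1) at constant depth z (smaller is closer).
struct Object { std::size_t x0, x1; float z; };

struct Culler {
    std::vector<Object> objects;
    std::vector<bool> visible;  // visibility carried over from last frame
    std::size_t width;

    Culler(std::vector<Object> objs, std::size_t w)
        : objects(std::move(objs)), visible(objects.size(), false), width(w) {}

    // Runs one frame and returns the object indices drawn, in draw order.
    std::vector<std::size_t> frame() {
        std::vector<float> depth(width, std::numeric_limits<float>::infinity());
        std::vector<std::size_t> drawn;
        auto draw = [&](std::size_t i) {
            for (std::size_t x = objects[i].x0; x < objects[i].x1; ++x)
                depth[x] = std::min(depth[x], objects[i].z);
            drawn.push_back(i);
        };

        // Step 1: draw everything that was marked visible last frame.
        for (std::size_t i = 0; i < objects.size(); ++i)
            if (visible[i]) draw(i);

        // Step 2: test every object's bounds against that depth.
        std::vector<bool> now(objects.size(), false);
        for (std::size_t i = 0; i < objects.size(); ++i)
            for (std::size_t x = objects[i].x0; x < objects[i].x1; ++x)
                if (objects[i].z <= depth[x]) now[i] = true;

        // Step 3: draw objects whose visibility went from 0 to 1 this frame.
        for (std::size_t i = 0; i < objects.size(); ++i)
            if (now[i] && !visible[i]) draw(i);

        visible = now;
        return drawn;
    }
};
```

On the very first call nothing is marked visible, so the depth buffer stays empty, every object passes step 2, and step 3 draws them all. Once an occluder moves away, the object behind it is drawn in step 3 of that same frame.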

wicked notch May 6, 2023, 9:50 PM

#

I uh

#

How do you do step 2 without rendering all objects

#

You need the depth of every object to check whether the object is visible or not?

frank sail May 6, 2023, 9:58 PM

#

Step 2 depends on the implementation

#

For hi-z, it means performing the test for every object's bounding volume

#

For raster occlusion culling, it means drawing the bounding volume for every object (which is hopefully substantially cheaper than actually drawing every object)

#

Raster is cool because it's so shrimple

#

Here's a sample that implements it
https://github.com/JuanDiegoMontoya/Fwog/blob/main/example/05_gpu_driven.cpp

wicked notch May 6, 2023, 10:05 PM

#

Wow you actually render cubes

#

Incredible

#

It's not like you told me already

#

...if I'm dumb

wispy spear May 6, 2023, 10:07 PM

#

that's a cute pic

wicked notch May 6, 2023, 10:44 PM

#

Shadow frustum culling for some unknown reason does not work

#

Isn't it the same exact thing? As frustum culling for perspective projections I mean.

#

Even though I explicitly disable near plane culling, it looks like it's doing it anyways...?

#

I think

#

Nevermind I don't

frank sail May 6, 2023, 11:00 PM

#

wicked notch Isn't it the same exact thing? As frustum culling for perspective projections I ...

The only difference is where your planes are

wicked notch May 6, 2023, 11:14 PM

#

Ight it works now

wicked notch May 6, 2023, 11:15 PM

#

wicked notch ```glsl bool intersect_aabb_plane(in aabb_t aabb, in vec4 plane) { const vec...

bug was actually pretty obvious, you get one LVSTRI point if you spot it

#

Actually nevermind, that was just placebo, I'm not culling anything now nervous

#

I actually have no idea now 🤔

#

```glsl
bool is_aabb_inside_plane(in aabb_t aabb, in mat4 model, in vec4 plane) {
    const vec3 normal = plane.xyz;
    const vec3 extent = aabb.extent.xyz;
    const vec3 center = aabb.center.xyz;
    const float radius = dot(extent, abs(normal));
    return -radius <= (dot(normal, center) - plane.w);
}

bool is_object_visible(in object_info_t object, in mat4 model) {
    const aabb_t aabb = object.aabb;
    const vec3 world_aabb_min = vec3(model * vec4(aabb.min.xyz, 1.0));
    const vec3 world_aabb_max = vec3(model * vec4(aabb.max.xyz, 1.0));
    const vec3 world_aabb_center = vec3(model * vec4(aabb.center.xyz, 1.0));
    const vec3 right = vec3(model[0]) * aabb.extent.x;
    const vec3 up = vec3(model[1]) * aabb.extent.y;
    const vec3 forward = vec3(-model[2]) * aabb.extent.z;

    const vec3 world_extent = vec3(
        abs(dot(vec3(1, 0, 0), right)) +
        abs(dot(vec3(1, 0, 0), up)) +
        abs(dot(vec3(1, 0, 0), forward)),

        abs(dot(vec3(0, 1, 0), right)) +
        abs(dot(vec3(0, 1, 0), up)) +
        abs(dot(vec3(0, 1, 0), forward)),

        abs(dot(vec3(0, 0, 1), right)) +
        abs(dot(vec3(0, 0, 1), up)) +
        abs(dot(vec3(0, 0, 1), forward)));

    const aabb_t world_aabb = aabb_t(
        vec4(world_aabb_min, 1.0),
        vec4(world_aabb_max, 1.0),
        vec4(world_aabb_center, 1.0),
        vec4(world_extent, 1.0));
    const uint planes = bool(u_disable_near_culling) ? 5 : 6;
    for (uint i = 0; i < planes; ++i) {
        if (!is_aabb_inside_plane(world_aabb, model, frustum.planes[i])) {
            return false;
        }
    }

    return true;
}
```
This should be fine?

#

I mean, it works perfectly fine for a perspective projection, why not for shadows?

#

it's just culling completely visible objects for some unknown reason?

#

They aren't even z < 0

#

It's only the first cascade as well...

#

I'm lighting the Jaker beacon

#

https://tenor.com/view/wongwingchun58-gif-15613675


frank sail May 6, 2023, 11:40 PM

#

ask chatgpt what's wrong with your code bleakekw

wicked notch May 6, 2023, 11:42 PM

#

Why would you even suggest that frog_gone

frank sail May 6, 2023, 11:42 PM

#

cuz I'm a lazy bastard

wicked notch May 6, 2023, 11:44 PM

#

frank sail May 6, 2023, 11:45 PM

#

ngl I actually asked chatgpt, but I can't tell if its answer is correct nervous

#

probably because I don't understand 100% of the math in the original code

#

maybe it'll help if you walk me through the math, rubberducky style 😄

#

btw, is_aabb_inside_plane has an unused parameter

#

and arguably is_object_visible should take an aabb_t instead of an object_info_t, if all you need from it is the AABB

wicked notch May 6, 2023, 11:51 PM

#

Yeah that's me checking various things

#

Anyways the math is as follows:

1. Translate AABB's center to world space: model * vec4(center, 1)
2. Translate and correct AABB's extents (should account for rotations and scales, we use the first 3 columns of the model matrix to correct this)
3. Check if the AABB is on or inside all 6 planes (or 5 if near culling is disabled, last plane is the near plane); we basically take the signed distance from the plane's origin to the center of the AABB and check if it's within radius or more

#

dot(normal, center) gives whether the point is inside or outside the plane
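
The three steps can be sketched in plain C++ as follows. This is an illustration under assumptions, not the Iris shader: the struct names are hypothetical, and the plane is taken to be `dot(normal, p) = d` with the normal pointing toward the inside, which matches the `-radius <= dot(normal, center) - plane.w` test in the GLSL above. The quantity `dot(normal, center) - d` is the signed distance of the center from the plane.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
// Affine transform: three basis columns plus a translation (a mat4 without its last row).
struct Affine { Vec3 col0, col1, col2, translation; };
struct Aabb { Vec3 center, extent; };    // extent = half-size along each axis
struct Plane { Vec3 normal; float d; };  // dot(normal, p) = d, inside is the positive side

inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Steps 1 and 2: transform the center as a point, and rebuild the half-sizes
// from the absolute values of the basis columns so rotation and scale are covered.
inline Aabb to_world(const Aabb& box, const Affine& m) {
    const Vec3 c = box.center;
    const Vec3 e = box.extent;
    const Vec3 center{
        m.col0.x * c.x + m.col1.x * c.y + m.col2.x * c.z + m.translation.x,
        m.col0.y * c.x + m.col1.y * c.y + m.col2.y * c.z + m.translation.y,
        m.col0.z * c.x + m.col1.z * c.y + m.col2.z * c.z + m.translation.z};
    const Vec3 extent{
        std::fabs(m.col0.x) * e.x + std::fabs(m.col1.x) * e.y + std::fabs(m.col2.x) * e.z,
        std::fabs(m.col0.y) * e.x + std::fabs(m.col1.y) * e.y + std::fabs(m.col2.y) * e.z,
        std::fabs(m.col0.z) * e.x + std::fabs(m.col1.z) * e.y + std::fabs(m.col2.z) * e.z};
    return {center, extent};
}

// Step 3: the box is on or inside the plane when the signed distance of its
// center is no less than minus its projected radius along the plane normal.
inline bool on_or_inside(const Aabb& box, const Plane& p) {
    const Vec3 abs_n{std::fabs(p.normal.x), std::fabs(p.normal.y), std::fabs(p.normal.z)};
    const float radius = dot(box.extent, abs_n);
    return dot(p.normal, box.center) - p.d >= -radius;
}
```

For example, a unit cube rotated 45 degrees about Z has a world half-size of about 1.414 along X, so it still touches a plane that the unrotated cube would miss.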

#

Is there any way I can debug a compute shader?

frank sail May 7, 2023, 12:06 AM

#

no 😦

#

well, not in gl

wicked notch May 7, 2023, 12:11 AM

#

Out of pure curiosity, what did nogpt answer?

frank sail May 7, 2023, 12:35 AM

#

It said something about the computation for world_extent being wrong

wicked notch May 7, 2023, 11:06 AM

#

I have discovered

#

A thing

#

Actually multiple things.

#

First off my signs are completely broken.

wispy spear May 7, 2023, 11:06 AM

#

https://tenor.com/view/signs-gif-13324315


wicked notch May 7, 2023, 11:06 AM

#

Second, distances from the plane origins are garbage

#

Third.

#

I have no idea how to fix all this KEKW

#

Therefore I'll grab a man's best friends: pen and paper, and write down stuff.

frank sail May 7, 2023, 11:09 AM

#

how dare you disrespect man's true best friend

wicked notch May 7, 2023, 11:09 AM

#

True, pen and paper are actually a man's oldest friend.

#

A tool as old as time

wicked notch May 7, 2023, 11:25 AM

#

With inverse(view)
[0] = {iris::plane_t} {normal=[0.609994292 0 0.792405844], distance=-4.57495737}
[1] = {iris::plane_t} {normal=[0.609994292 0 -0.792405844], distance=-4.57495737}
[2] = {iris::plane_t} {normal=[0.5 0.866025447 0], distance=-2.88397455}
[3] = {iris::plane_t} {normal=[0.5 -0.866025447 -0], distance=-4.61602545}
[4] = {iris::plane_t} {normal=[-1 -0 -0], distance=-504.5}
[5] = {iris::plane_t} {normal=[1 0 0], distance=-7.4000001}

With inverse(pv)
[0] = {iris::plane_t} {normal=[-0.609995067 0 0.792405247], distance=-4.57496309}
[1] = {iris::plane_t} {normal=[-0.609995067 0 -0.792405247], distance=-4.57496309}
[2] = {iris::plane_t} {normal=[-0.500000775 0.86602503 0], distance=-4.61603069}
[3] = {iris::plane_t} {normal=[-0.500000775 -0.86602503 0], distance=-2.88398075}
[4] = {iris::plane_t} {normal=[1 0 0], distance=-504.5}
[5] = {iris::plane_t} {normal=[-1 0 0], distance=-7.4000001}

#

Why god

wicked notch May 7, 2023, 1:10 PM

#

I fixed the thing

#

I finally achieved inner peace.

#

@derhass helped me a lot, honorable mention here.

wicked notch May 7, 2023, 1:34 PM

#

Alright

#

Now occlusion culling

#

Wish me luck frog_sweat

wispy spear May 7, 2023, 1:55 PM

#

good luck

wicked notch May 7, 2023, 7:50 PM

#

Hmm, the "first frame" is very important in raster occlusion culling apparently, but I still don't understand the "core loop" very well

#

If the first frame I perform no occlusion culling, the next frame I am supposed to mark all objects that were visible the previous frame and render them?

#

What about changes though? Reprojection of the depth buffer?

wispy spear May 7, 2023, 7:57 PM

#

you can run a frame before the gameloop starts

wicked notch May 7, 2023, 7:57 PM

#

And I guess any time I can't reliably reproject 😅

frank sail May 7, 2023, 8:06 PM

#

you don't need to reproject anything frogstare

wicked notch May 7, 2023, 8:10 PM

#

thonk

frank sail May 7, 2023, 8:39 PM

#

reprojection is for if you want to reuse an old depth buffer for new object positions

#

but you don't have to do that

wicked notch May 7, 2023, 8:49 PM

#

Hmm.

#

https://tenor.com/view/pondering-pondering-my-orb-my-orb-orb-pondering-my-pondering-my-orb-orb-gif-24060364


frank sail May 7, 2023, 9:04 PM

#

I described some methods above that don't require reprojection

wispy spear May 7, 2023, 9:04 PM

#

a dogjiff.gif also doesnt require reprojection

frank sail May 7, 2023, 9:04 PM

#

I suppose the core idea is that, instead of reprojecting, you use the object visibility from last frame instead

#

Also, the first frame isn't a special case when you do this

wicked notch May 7, 2023, 9:06 PM

#

Yes, I see

#

Something like this I suppose

frank sail May 7, 2023, 9:06 PM

#

yus

wicked notch May 7, 2023, 9:07 PM

#

How do I know if an object is completely occluded though...

#

No samples pass the depth test? Doesn't that require an occlusion query 🤔

#

Oh wait that's what early Z is for right?

frank sail May 7, 2023, 9:08 PM

#

early z + ssbo write in fs

wicked notch May 7, 2023, 9:09 PM

#

If no samples pass the depth test then there will be no writes?

#

Crazy

#

Huge

frank sail May 7, 2023, 9:09 PM

#

yep

#

quite epic indeed

#

https://github.com/JuanDiegoMontoya/Fwog/blob/main/example/shaders/gpu_driven/CullVisibility.frag.glsl

wicked notch May 7, 2023, 9:09 PM

#

I can feel more brain expansion

frank sail May 7, 2023, 9:10 PM

#

technically that code is UB btw

#

since there is a race if there are multiple fragments frog_gone

wicked notch May 7, 2023, 9:10 PM

#

atomicCompareExchange?

#

Or whatever it's called in GLSL

frank sail May 7, 2023, 9:10 PM

#

atomicExchange

wicked notch May 7, 2023, 9:11 PM

#

atomicCompSwap

frank sail May 7, 2023, 9:11 PM

#

either one works

wicked notch May 7, 2023, 9:11 PM

#

atomicCompSwap(data[draw_id].is_visible, 0, 1)

#

So that you only do this once

#

efishenshy

frank sail May 7, 2023, 9:12 PM

#

writing the same value from multiple threads is technically UB according to the spec, but works on actual hw

wicked notch May 7, 2023, 9:12 PM

#

Yeah makes sense, there's no reason why it wouldn't

frank sail May 7, 2023, 9:12 PM

#

I imagine it's faster than atomics since you don't have to serialize access to the memory controller
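
For readers more familiar with CPU atomics, here is a rough `std::atomic` analogue of the two GLSL calls named above. It is not shader code, and the function names are hypothetical. Both return the value the flag held before the call, which is also what `atomicCompSwap` and `atomicExchange` return in GLSL.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Analogue of atomicCompSwap(flag, 0, 1): writes 1 only if the flag was 0.
inline std::uint32_t mark_visible_cas(std::atomic<std::uint32_t>& flag) {
    std::uint32_t expected = 0;
    // On failure, compare_exchange_strong stores the current value in `expected`.
    flag.compare_exchange_strong(expected, 1u);
    return expected;  // 0 if this call flipped the flag, otherwise the old value
}

// Analogue of atomicExchange(flag, 1): always writes 1.
inline std::uint32_t mark_visible_exchange(std::atomic<std::uint32_t>& flag) {
    return flag.exchange(1u);
}
```

Either way the flag ends up at 1 after the first call, and only the first caller observes 0.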

wicked notch May 7, 2023, 9:13 PM

#

Now what vertices do I feed the GPU?

#

Hmm

#

I suppose I could first do frustum culling, get the number of visible objects there, then build another indirect buffer with instanceCount = number of objects in frustum and draw cubes?

#

I don't really understand this...

frank sail May 7, 2023, 9:17 PM

#

wicked notch Now what vertices do I feed the GPU?

a cube lol

#

Cube hack
https://github.com/JuanDiegoMontoya/Fwog/blob/main/example/shaders/gpu_driven/BoundingBox.vert.glsl

#

The comment says 24 vertices, but I think it's actually 14

wicked notch May 7, 2023, 9:19 PM

#

1e-1

#

What

#

-1.0?

frank sail May 7, 2023, 9:19 PM

#

lmao

#

I must've been testing different epsilon values

wicked notch May 7, 2023, 9:28 PM

#

Oh god this is going to be a mess isn't it

#

If I want to do occlusion culling for shadows too, that is

#

Each cascade gets its own indirect buffer...

frank sail May 7, 2023, 9:31 PM

#

it shouldn't be too bad if you abstract it properly

wicked notch May 7, 2023, 9:31 PM

#

This piece of code is copy-pasted 5 times KEKW

#

Uhh

#

I have these two buffer bindings here:

```glsl
layout (std430, binding = 7) restrict buffer b_occlusion_draw_count {
    uint occlusion_draw_count;
};

layout (std430, binding = 8) writeonly restrict buffer b_occlusion_indirect_command {
    indirect_command_t occlusion_command;
};
```

#

Because I need atomics basically

#

Is there any way I can avoid creating a whole SSBO for a single uint?

wispy spear May 7, 2023, 9:45 PM

#

you could just use a glUniform1ui

proven laurel May 7, 2023, 9:46 PM

#

^

wispy spear May 7, 2023, 9:46 PM

#

what happened to your foo_t naming convention

wicked notch May 7, 2023, 9:46 PM

#

Hmm but I'd still need to bind this buffer to GL_PARAMETER_BUFFER

wicked notch May 7, 2023, 9:46 PM

#

wispy spear what happened to your foo_t naming convention

I changed it, it didn't make much sense for buffers and uniforms and stuff

wispy spear May 7, 2023, 9:46 PM

#

are you doing glMDICount?

wicked notch May 7, 2023, 9:46 PM

#

Yes

proven laurel May 7, 2023, 9:47 PM

#

isn't there GL_DRAW_INDIRECT_BUFFER? or does that not work with MDI?

wicked notch May 7, 2023, 9:47 PM

#

Yes but you also need to bind a buffer to tell GL how many indirect commands to consume, that's bound to GL_PARAMETER_BUFFER
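
As a side note, `GL_DRAW_INDIRECT_BUFFER` and `GL_PARAMETER_BUFFER` are two binding points, but they do not have to be backed by two separate allocations. The sketch below is only one possible arrangement, not something settled in this thread: a single buffer holding the draw count followed by the commands. The struct mirrors `DrawElementsIndirectCommand` from the OpenGL spec; `indirect_layout_t`, `make_indirect_layout` and the 16-byte header are hypothetical choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the DrawElementsIndirectCommand layout from the OpenGL spec.
struct draw_elements_indirect_command_t {
    std::uint32_t count;
    std::uint32_t instance_count;
    std::uint32_t first_index;
    std::int32_t base_vertex;
    std::uint32_t base_instance;
};
static_assert(sizeof(draw_elements_indirect_command_t) == 20, "must stay tightly packed");

struct indirect_layout_t {
    std::size_t count_offset;    // where the uint draw count lives
    std::size_t commands_offset; // where the command array starts
    std::size_t total_size;      // bytes to allocate for the whole buffer
};

// Hypothetical layout: [draw count | padding up to 16 bytes | commands...].
inline indirect_layout_t make_indirect_layout(std::size_t max_draws) {
    assert(max_draws > 0);
    // The padding is a conservative choice here, not a GL requirement.
    constexpr std::size_t header_size = 16;
    return indirect_layout_t{
        0,
        header_size,
        header_size + max_draws * sizeof(draw_elements_indirect_command_t),
    };
}
```

With a layout like this, the same buffer object could be bound to both targets, and `commands_offset` and `count_offset` passed as the `indirect` and `drawcount` arguments of `glMultiDrawElementsIndirectCount`. Whether that is any nicer than two small buffers is a matter of taste.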

wispy spear May 7, 2023, 9:48 PM

#

those are 2 buffers

proven laurel May 7, 2023, 9:48 PM

#

oh lol

wispy spear May 7, 2023, 9:48 PM

#

ye you need two

#

just looksied in the spec

proven laurel May 7, 2023, 9:48 PM

#

the buffer types are weird

wispy spear May 7, 2023, 9:48 PM

#

proven laurel May 7, 2023, 9:48 PM

#

this is effectively a counter buffer

wicked notch May 7, 2023, 9:48 PM

#

Naming couldn't be worse yes KEKW

wispy spear May 7, 2023, 9:48 PM

#

May Fiffths

#

almost to the day 🙂

proven laurel May 7, 2023, 9:49 PM

#

75% of the problems people have when learning gl would be solved with better names

#

KEKW

#

anyhow, doing hi-z culling here?

wicked notch May 7, 2023, 9:50 PM

#

Raster based, because it's more 🦐le

proven laurel May 7, 2023, 9:51 PM

#

https://tenor.com/view/shrimp-as-that-clash-royale-hee-hee-hee-haw-gif-25054781


wicked notch May 7, 2023, 9:52 PM

#

I'm currently taking the output of the frustum culling pass to draw only the AABBs that are in frustum

proven laurel May 7, 2023, 9:55 PM

#

oh you mean like

#

that opengl occlusion thing?

wicked notch May 7, 2023, 9:56 PM

#

Sorry, which thing?

proven laurel May 7, 2023, 9:57 PM

#

was thinking about the conditional rendering occlusion stuff opengl has

#

but pretty sure the perf of that ain't that great

wicked notch May 7, 2023, 9:57 PM

#

Ah no, the idea is to rasterize AABBs and check visibility with early Z

#

So you basically leverage early Z to write into an SSBO the visibility of any object

proven laurel May 7, 2023, 9:58 PM

#

right

wicked notch May 7, 2023, 9:59 PM

#

Then you just write indirect commands that pass this test (if they managed to pass frustum culling first that is)

wicked notch May 7, 2023, 10:22 PM

#

Jaker, why is your depth write disabled when rendering visible bounding boxes?

#

Ah I see.

wicked notch May 7, 2023, 10:43 PM

#

Do we use last frame's depth buffer as well hmm

#

"This pass comes after the scene pass because it relies on a depth buffer to have already been created" do we really need this though?

frank sail May 7, 2023, 10:46 PM

#

wicked notch Do we use last frame's depth buffer as well hmm

noooooo

frank sail May 7, 2023, 10:46 PM

#

wicked notch "This pass comes after the scene pass because it relies on a depth buffer to hav...

yes, otherwise you're testing occlusion against an empty depth buffer

wicked notch May 7, 2023, 10:46 PM

#

But what about this

frank sail May 7, 2023, 10:47 PM

#

which part

wicked notch May 7, 2023, 10:47 PM

#

It's a bullet point but it's all the same technique

frank sail May 7, 2023, 10:47 PM

#

the temporal coherence just means that object visibility will be almost the same from frame to frame

#

objects visible last frame are probably visible this frame, etc

wicked notch May 7, 2023, 10:48 PM

#

So rip shadow mapping occlusion culling?

#

If the depth buffer already needs to be created then we can't really cull anything can we?

#

Unless it's acceptable for shadow mapping culling to just use last frame's depth?

frank sail May 7, 2023, 10:50 PM

#

it should work for shadows, even if the frustum changes every frame

#

I think you have some confusion about exactly what data is taken from the last frame

wicked notch May 7, 2023, 10:51 PM

#

I'm very bad at tracking resources nervous

#

But I think just depth?

#

Uh

#

Not sure frog_sweat

frank sail May 7, 2023, 10:53 PM

#

lemme go through it again

#

0. clear depth buffer
1. render visible objects
2. render bounding boxes for occlusion testing
3. render objects whose visibility changed from 0 to 1 this frame (optional step to prevent one frame of pop-in when objects become visible)

#

the visibility info from step 2 is used in step 1 of the next frame

#

visibility=draw commands or whatever
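
The ordering described above can be modelled on the CPU. This is a minimal C++ sketch under simplifying assumptions: the depth test of the bounding-box pass is replaced by a precomputed `box_visible` array, and `frame_draws_t` and `run_frame` are hypothetical names rather than anything from Iris or Fwog.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct frame_draws_t {
    std::vector<std::size_t> early; // objects that were visible last frame
    std::vector<std::size_t> late;  // objects that just became visible
};

// `visibility` holds last frame's result on entry and this frame's on exit.
// `box_visible[i]` stands in for the bounding-box pass: whether any fragment
// of object i's box survived the depth test against the early draws.
inline frame_draws_t run_frame(std::vector<bool>& visibility, std::vector<bool> box_visible) {
    assert(box_visible.size() == visibility.size());
    frame_draws_t draws;
    // Render objects that were visible last frame.
    for (std::size_t i = 0; i < visibility.size(); ++i) {
        if (visibility[i]) {
            draws.early.push_back(i);
        }
    }
    // Optional catch-up pass: objects whose visibility went from 0 to 1.
    for (std::size_t i = 0; i < visibility.size(); ++i) {
        if (box_visible[i] && !visibility[i]) {
            draws.late.push_back(i);
        }
    }
    // This frame's test result feeds the next frame's early pass.
    visibility = std::move(box_visible);
    return draws;
}
```

On the GPU the two lists would be indirect command buffers written by a compute or fragment shader, but the data flow between frames is the same.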

wicked notch May 7, 2023, 10:56 PM

#

🐸 💡

#

Got it, thanks

frank sail May 7, 2023, 10:58 PM

#

the thingy with reprojection basically uses the last frame's depth buffer, and swaps occlusion testing and object rendering (so u test first)

#

which I guess is more intuitive lol

#

the problem is that reprojection is not perfect and leads to arguably worse artifacts (false occlusion) compared to one frame of lag (which, again, can be mitigated by step 3 above)

wicked notch May 7, 2023, 11:01 PM

#

About step 3

#

Can I just do this:

```glsl
void main() {
    const uint prev = visibility[i_object_id] & 0x1;
    visibility[i_object_id] = (prev << 1u) | 1;
}
```

frank sail May 7, 2023, 11:06 PM

#

uh

#

why the long shift

wicked notch May 7, 2023, 11:07 PM

#

Yeah, it's useless

frank sail May 7, 2023, 11:07 PM

#

red herring smh

wicked notch May 7, 2023, 11:08 PM

#

Does this work as in last = curr?

#

Idea is bit 1 is last and bit 0 is curr

frank sail May 7, 2023, 11:08 PM

#

Tbh I'd use an if statement just to make it obvious 😄

wicked notch May 7, 2023, 11:08 PM

#

To keep track of change

frank sail May 7, 2023, 11:10 PM

#

if (lastFrameVisibility[obj_id] == 0)
{
    objectsThatNeedToBeDrawnThisFrame[obj_id] = 1;
}
thisFrameVisibility[obj_id] = 1;

#

you can certainly reduce the number of buffers needed here
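
One way to get down to a single buffer is the "bit 1 is last, bit 0 is curr" idea from earlier. The following C++ sketch assumes the shift happens once per object at the start of the frame, so that several fragments of the same bounding box do not each shift again. All names are hypothetical, and this is not necessarily how Iris ended up doing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t curr_bit = 0x1u; // visible this frame
constexpr std::uint32_t last_bit = 0x2u; // visible last frame

// Start of a frame: move "curr" into "last" and clear "curr".
constexpr std::uint32_t begin_frame(std::uint32_t bits) {
    return (bits & curr_bit) << 1u;
}

// True when the object was hidden last frame and is visible now.
constexpr bool became_visible(std::uint32_t bits) {
    return (bits & curr_bit) != 0u && (bits & last_bit) == 0u;
}

// What a bounding-box fragment that survives the depth test would do.
// Setting the bit is idempotent, so repeating it per fragment is harmless.
inline void mark_visible(std::vector<std::uint32_t>& visibility, std::size_t object_id) {
    assert(object_id < visibility.size());
    visibility[object_id] |= curr_bit;
}
```

Objects for which `became_visible` returns true are the ones the optional catch-up pass would draw.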

wicked notch May 7, 2023, 11:11 PM

#

Yeah, I'll just make it work first

#

On my way copy pasting 5 times the same snippet (again)

frank sail May 7, 2023, 11:11 PM

#

hehe

wicked notch May 7, 2023, 11:11 PM

#

Actually screw that, I don't even know if it works only for objects in my perspective

#

Let's just test primary camera perspective first

#

It didn't work 😦

#

Wait I didn't bind the shader

#

It worksn't

#

Ah my shift is wrong

#

Goddamnit

wispy spear May 7, 2023, 11:31 PM

#

BlobhajReach

wicked notch May 7, 2023, 11:37 PM

#

It's "working" but it doesn't seem to cull anything more than frustum culling did

frank sail May 8, 2023, 12:24 AM

#

Make a debug mode that draws the bounding boxes to the screen

wicked notch May 8, 2023, 10:37 AM

#

Alright, occlusion culling works

#

Except it's slower than... a depth prepass?

wicked notch May 8, 2023, 3:57 PM

#

HiZ is looking more and more appealing... there are loads of issues with ROC apparently

#

Well, it's to be expected when dealing with bounding boxes, I wish ROC was a bit more conservative though

frank sail May 8, 2023, 4:39 PM

#

wicked notch HiZ is looking more and more appealing... there are loads of issues with ROC app...

like what

wicked notch May 8, 2023, 4:42 PM

#

I don't know the exact reason, but some flickering can be observed if you are inside an AABB

#

Sometimes not even just flickering, you get the object seemingly transparent due to its visibility changing every frame

#

Here for example

#

I am using your Fwog because I nuked ROC KEKW

frank sail May 8, 2023, 4:46 PM

#

ah

#

that artifact happens when an object occludes its own bounding box

wicked notch May 8, 2023, 4:47 PM

#

Hmm

frank sail May 8, 2023, 4:48 PM

#

a shrimple way to mitigate it is to always draw the object if the camera is very close to its bounding box

wicked notch May 8, 2023, 4:49 PM

#

wicked notch Here for example

Makes sense, what about this?

#

The object is not very close to the camera here, perhaps some offset could be applied to the AABB's position?

frank sail May 8, 2023, 4:50 PM

#

I can't see the aabb of the object in question

wicked notch May 8, 2023, 4:51 PM

#

Well... we are inside it KEKW

frank sail May 8, 2023, 4:51 PM

#

all it takes for the artifact to appear is for the camera to be inside the aabb and to not be looking at any other side of the object

#

I wonder if you can use clip planes for this

wicked notch May 8, 2023, 4:51 PM

#

Hmm

#

I already have frustum planes from frustum culling

#

I would shrimply have to invert the condition

frank sail May 8, 2023, 4:52 PM

#

Tbh it's easier to just draw the thingy if you're inside the aabb

wicked notch May 8, 2023, 4:52 PM

#

Yeah

#

But that kills occlusion culling basically KEKW

frank sail May 8, 2023, 4:53 PM

#

ideally, you wouldn't have objects with giant bounding boxes like that

wicked notch May 8, 2023, 4:53 PM

#

Many scenes merge primitives so their bounding boxes are huge

#

I de-nuked ROC

#

Ah by the way, I can't find any documentation about NSight's profiler magic words

frank sail May 8, 2023, 4:55 PM

#

wicked notch Many scenes merge primitives so their bounding boxes are huge

you'd get much finer culling if you used it on meshlets

wicked notch May 8, 2023, 4:55 PM

#

How the hell do I read this

#

What is PES+VPC

frank sail May 8, 2023, 4:56 PM

#

wicked notch Ah by the way, I can't find any documentation about NSight's profiler magic word...

https://developer.nvidia.com/blog/the-peak-performance-analysis-method-for-optimizing-any-gpu-workload/

NVIDIA Technical Blog

Louis Bavoil

The Peak-Performance-Percentage Analysis Method for Optimizing Any ...

This describes a performance triage method used to figure out the main performance limiters of a given GPU workload, using NVIDIA-specific hardware metrics.

#

vpc = viewport culling

#

idk what pes is

wicked notch May 8, 2023, 4:58 PM

#

Also PCIe throughput is reporting 16GB/s, should I be worried about that?

#

They only happen at the end of shadow map draws

frank sail May 8, 2023, 5:07 PM

#

Are you uploading something 🤔

#

Or maybe some buffer is in host memory

wicked notch May 8, 2023, 5:09 PM

#

Actually... yes lol

#

I was rewriting the whole texture buffer, object buffer, indirect buffer for each call to "perform_frustum_cull"

#

Truly incredible

frank sail May 8, 2023, 5:09 PM

#

Zoo wee mama

wicked notch May 8, 2023, 5:47 PM

#

Looks like checking if you are inside the AABB works fine

#

Or rather, I have not found any more artifacts yet.
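
The "always draw if the camera is inside the AABB" check is small enough to sketch. This C++ version assumes a world-space AABB and eye position; the `margin` parameter is an added assumption (for example, something on the order of the near-plane distance) so that boxes the camera is about to enter are treated as "inside" too. `vec3_t`, `aabb_t` and `camera_inside_aabb` are hypothetical names.

```cpp
#include <cassert>

struct vec3_t {
    float x, y, z;
};

struct aabb_t {
    vec3_t min;
    vec3_t max;
};

// Returns true when `eye` lies inside `box` grown by `margin` on every side.
// Objects for which this holds would skip the occlusion test and be drawn.
inline bool camera_inside_aabb(const aabb_t& box, const vec3_t& eye, float margin) {
    assert(margin >= 0.0f);
    return eye.x >= box.min.x - margin && eye.x <= box.max.x + margin &&
           eye.y >= box.min.y - margin && eye.y <= box.max.y + margin &&
           eye.z >= box.min.z - margin && eye.z <= box.max.z + margin;
}
```

The same comparison ports directly to GLSL with `all(greaterThanEqual(...))` and `all(lessThanEqual(...))`.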

wicked notch May 8, 2023, 5:47 PM

#

frank sail you'd get much finer culling if you used it on meshlets

By the way what's this

frank sail May 8, 2023, 6:15 PM

#

meshlets are little baby meshes

#

like 64-128 verts each

#

might be too fine though, since each aabb is like 14-24 verts

#

Meshoptimizer can emit meshlets

#

They are typically used with mesh shaders

wicked notch May 8, 2023, 6:21 PM

#

This is super interesting holy

#

Apparently it's a whole new rendering paradigm?

#

Gone are vertex -> geometry -> raster?

frank sail May 8, 2023, 6:25 PM

#

everything before fs is replaced by task and mesh shaders

#

it's pretty low level

wicked notch May 8, 2023, 6:31 PM

#

No worries, it'll only be a very small detour

#

How do you enable extensions?

frank sail May 8, 2023, 6:33 PM

#

they are automatically enabled

#

you just have to check that your implementation supports it

wicked notch May 8, 2023, 6:34 PM

#

https://tenor.com/view/20-twenty-minutes-rick-and-morty-gif-12928450


wicked notch May 8, 2023, 6:34 PM

#

frank sail they are automatically enabled

very good

frank sail May 8, 2023, 6:34 PM

#

Like this
https://github.com/JuanDiegoMontoya/Fwog/blob/main/src/Context.cpp#L122

wicked notch May 8, 2023, 7:32 PM

#

Old: model_t
New: meshlet_group_t

#

To be honest I'm not understanding much, if anything at all nervous

#

The task shader is optional apparently, which is great for me, one less thing to worry about

frank sail May 8, 2023, 7:41 PM

#

you use task shaders to cull meshlets, basically

wispy spear May 8, 2023, 7:41 PM

#

are you switching to vk already? froge_evil

wicked notch May 8, 2023, 7:42 PM

#

No way

frank sail May 8, 2023, 7:42 PM

#

just using the cursed gl nv meme shader ext

wicked notch May 8, 2023, 7:42 PM

#

I'll stay on GL for a long time

#

Why is it cursed?

wispy spear May 8, 2023, 7:44 PM

#

ah

frank sail May 8, 2023, 7:44 PM

#

cursed because probably no one uses it 🐸

wicked notch May 8, 2023, 7:44 PM

#

frog_sweat

#

I know everyone's given up on GL, but I have not

#

I couldn't help but see there's no AMD version of this mesh shader extension

#

froge_sad

frank sail May 8, 2023, 7:46 PM

#

yup

#

you only get cross platform meme shading on nv

#

I mean vk lol

proven laurel May 8, 2023, 8:08 PM

#

meme shading lol

wispy spear May 8, 2023, 8:26 PM

#

who cares about AMD and INTEL anyway

#

froge_evil

frank sail May 8, 2023, 8:26 PM

#

AMD (pejorative)

wispy spear May 8, 2023, 8:28 PM

#

heh

wicked notch May 8, 2023, 8:30 PM

#

```cpp
auto meshlet_triangles = std::vector<uint32>(max_meshlets * max_triangles * 3);
```
I stared at that 3 for a solid minute

#

Before remembering: "Wait a minute, a triangle has 3 points"

#

Incredible.

wispy spear May 8, 2023, 8:32 PM

#

tis a multi tringle

#

one per universe

frank sail May 8, 2023, 8:51 PM

#

wicked notch Before remembering: "Wait a minute, a triangle has 3 points"

smh

struct Tringle
{
  uint32_t verticeeeez[3];
};

std::vector<Tringle>(max_meshlets * max_triangles);

wicked notch May 8, 2023, 8:56 PM

#

https://tenor.com/view/demo-man-going-in-circles-eye-illuminati-triangle-gif-26507108


#

```cpp
struct meshlet_t {
    uint32 vertex_offset = 0;
    uint32 vertex_count = 0;
    uint32 triangle_offset = 0;
    uint32 triangle_count = 0;

    // these index into meshlet_group_t::vertices
    std::vector<uint32> vertices;
    // gl_PrimitiveIndicesNV I guess
    std::vector<uint8> triangles;
};

struct meshlet_group_t {
    std::vector<meshlet_t> meshlets;
    std::vector<vertex_format_t> vertices;
    std::vector<uint32> indices;
};
```
I have no idea what I'm doing KEKW

frank sail May 8, 2023, 9:00 PM

#

are you copying a sample from somewhere

#

also, I N D I R E C T I O N

wicked notch May 8, 2023, 9:00 PM

#

I'm relying on meshoptimizer's documentation which is not much

#

```cpp
constexpr auto max_vertices = 64u;
constexpr auto max_triangles = 124u;
constexpr auto cone_weight = 0.0f;
const auto max_meshlets = meshopt_buildMeshletsBound(indices.size(), max_vertices, max_triangles);
auto meshlets = std::vector<meshopt_Meshlet>(max_meshlets);
auto meshlet_vertices = std::vector<uint32>(max_meshlets * max_vertices);
auto meshlet_triangles = std::vector<uint8>(max_meshlets * max_triangles * 3);
const auto meshlet_count = meshopt_buildMeshlets(
    meshlets.data(),
    meshlet_vertices.data(),
    meshlet_triangles.data(),
    indices.data(),
    indices.size(),
    (const float32*)vertices.data(),
    vertices.size(),
    sizeof(vertex_format_t),
    max_vertices,
    max_triangles,
    cone_weight);

auto& last_meshlet = meshlets[meshlet_count - 1];
meshlet_vertices.resize(last_meshlet.vertex_offset + last_meshlet.vertex_count);
meshlet_triangles.resize(last_meshlet.triangle_offset + ((last_meshlet.triangle_count * 3 + 3) & ~3));
meshlets.resize(meshlet_count);
```
This "works"

#

i.e: it doesn't crash
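
The `(triangle_count * 3 + 3) & ~3` in the trimming step rounds the last meshlet's index bytes up to a multiple of four, presumably because each meshlet's triangle data is kept 4-byte aligned. Here is a small C++ sketch of just that arithmetic; the function names are hypothetical and nothing here calls into meshoptimizer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes used by one meshlet's triangle indices (3 x uint8 per triangle),
// rounded up to the next multiple of 4.
inline std::size_t padded_triangle_bytes(std::size_t triangle_count) {
    assert(triangle_count <= (SIZE_MAX - 3) / 3); // guard the multiply-add against overflow
    return (triangle_count * 3 + 3) & ~std::size_t{3};
}

// Size the triangle array can be trimmed to, given the last meshlet's
// offset and triangle count.
inline std::size_t trimmed_triangle_array_size(std::size_t last_triangle_offset,
                                               std::size_t last_triangle_count) {
    return last_triangle_offset + padded_triangle_bytes(last_triangle_count);
}
```

For a full 124-triangle meshlet that is 372 bytes, which is already a multiple of four, so only partially filled meshlets gain padding.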

frank sail May 8, 2023, 9:01 PM

#

oh jeez

wicked notch May 8, 2023, 9:01 PM

#

I guess the "triangles" are indices into the meshlet itself?

frank sail May 8, 2023, 9:01 PM

#

I guess cone_weight is some value to influence how faces are grouped w.r.t. their normal

#

idk, we're all guessing

wicked notch May 8, 2023, 9:02 PM

#

It's something regarding cone based culling but I dunno

#

"cone_weight should be left as 0 if cluster cone culling is not used, and set to a value between 0 and 1 to balance cone culling efficiency with other forms of culling like frustum or occlusion culling."
Whatever this means

frank sail May 8, 2023, 9:02 PM

#

yeah, ultimately you would use it to make normal-based culling better

#

but optimizing for that too much means you might group far apart vertices together, making frustum and occlusion culling worse

wicked notch May 8, 2023, 9:04 PM

#

What I would really like to know right now is whatever the hell gl_PrimitiveIndicesNV is

#

Since it's a uint8 and NVIDIA only allows up to 256 - 1 triangles in a meshlet I suppose they are indices that index into the meshlet vertices?

#

i.e: gl_MeshVerticesNV

frank sail May 8, 2023, 9:05 PM

#

When each mesh shader work group completes, it emits an output mesh
    consisting of
...
  * an array of vertex index values written to the built-in output array
      gl_PrimitiveIndicesNV, where each output primitive has a set of one,
      two, or three indices that identify the output vertices in the mesh used
      to form the primitive.

wicked notch May 8, 2023, 9:06 PM

#

So... yes?

#

gl_PrimitiveIndicesNV are indices for gl_MeshVerticesNV?

#

I'll just assume yes for now

#

I'll know if I'm wrong because I will see funny triangles (or none at all)

frank sail May 8, 2023, 9:07 PM

#

where did you see gl_MeshVerticesNV

wicked notch May 8, 2023, 9:07 PM

#

https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/

frank sail May 8, 2023, 9:07 PM

#

I don't see it in the spec

#

https://registry.khronos.org/OpenGL/extensions/NV/NV_mesh_shader.txt

wicked notch May 8, 2023, 9:07 PM

#

```glsl
out gl_MeshPerVertexNV {
    vec4  gl_Position;
    float gl_PointSize;
    float gl_ClipDistance[];
    float gl_CullDistance[];
} gl_MeshVerticesNV[];
```

#

It's this thing apparently

wispy spear May 8, 2023, 9:11 PM

#

hmm the roblox guy, the same guy who made meshoptimizer, forgor his name atm, has a video series of meshletisms going iirc

#

its vulkan, but still

wicked notch May 8, 2023, 9:14 PM

#

I'll look it up one sec

#

I have to figure out if I need the original index buffer or not

#

I'm 80% sure I don't

wispy spear May 8, 2023, 9:20 PM

#

https://www.youtube.com/watch?v=BR2my8OE1Sc&list=PL0JVLUVCkk-l7CWCn3-cdftR0oajugYvd

YouTube

Arseny Kapoulkine

niagara: Building a Vulkan renderer from scratch*

https://github.com/zeux/niagara

We will kick off the Vulkan stream series by discussing what we're going to be building and the general approach; then we'll start writing code to get a triangle on screen.


#

i think it was towards the end, meshlet culling

wicked notch May 8, 2023, 9:28 PM

#

fun fact, sponza subdivides in 3515 meshlets

#

(Assuming I didn't break any laws of physics)

wicked notch May 8, 2023, 9:55 PM

#

We are back to the origins once more.

#

For the third time KEKW

#

Jaker, could you translate into (comprehensible) english what the first parameter of glDrawMeshTasksNV does?

#

Wait it's simply an offset into gl_WorkGroupID.x

#

Why would I ever need this

frank sail May 8, 2023, 10:09 PM

#

if you want to draw a subset of meshlets

wicked notch May 8, 2023, 10:12 PM

#

Hmm

#

I lately type "hmm" a lot, I should make an emoji 🐸 🤔

#

frog_think

frank sail May 8, 2023, 10:12 PM

#

consider it a convenience parameter, like how dispatches are 3D
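
If all meshlets live in one big buffer, the `first` and `count` arguments are enough to draw one mesh's subset. The sketch below is a minimal C++ example that assumes meshlets are stored contiguously per group; `task_range_t` and `build_task_ranges` are hypothetical names. It builds the ranges with a running sum.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

struct task_range_t {
    std::uint32_t first; // index of the group's first meshlet
    std::uint32_t count; // number of meshlets in the group
};

// Turns per-group meshlet counts into (first, count) pairs via a running sum.
inline std::vector<task_range_t> build_task_ranges(const std::vector<std::uint32_t>& meshlets_per_group) {
    std::vector<task_range_t> ranges;
    ranges.reserve(meshlets_per_group.size());
    std::uint32_t first = 0;
    for (const std::uint32_t count : meshlets_per_group) {
        assert(count <= std::numeric_limits<std::uint32_t>::max() - first);
        ranges.push_back(task_range_t{first, count});
        first += count;
    }
    return ranges;
}
```

Each range would then map to one `glDrawMeshTasksNV(range.first, range.count)` call, with `gl_WorkGroupID.x` starting at `range.first` inside the shader.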

frank sail May 8, 2023, 10:13 PM

#

wicked notch Hmm

🇭🇲 🇲🇲

wicked notch May 8, 2023, 10:13 PM

#

Yeah, that's fair.

#

deccer if you please.

#

Oh shit this is garbage, one second

frank sail May 8, 2023, 10:17 PM

#

#1027528776717975592

#

reminds me of this frog_shush I made

#

the hand needs an outline though

wicked notch May 8, 2023, 10:19 PM

#

Yeah, I'll do it 'morrow.

#

I would like to see sponza before I sleepy

wicked notch May 8, 2023, 10:53 PM

#

Uh

#

------------
Internal error: assembly compile error for mesh shader at offset 30836:
-- error message --
line 1003, column 8:  error: unknown opcode modifier
-- internal assembly text --
!!NVmp5.0
OPTION NV_internal;
OPTION NV_shader_storage_buffer;
OPTION NV_bindless_texture;
GROUP_SIZE 1;
PRIMITIVE_TYPE TRIANGLES;
PRIMITIVES_OUT 124;
VERTICES_OUT 64;
# cgc version 3.4.0001, build date Apr 13 2023
# command line args:
#vendor NVIDIA Corporation
#version 3.4.0.1 COP Build Date Apr 13 2023
#profile gp5mp
#program main
#semantic b_transform_buffer : SBO_BUFFER[0]
#semantic b_meshlet_buffer : SBO_BUFFER[1]
#semantic b_vertex_buffer : SBO_BUFFER[2]
#semantic pv
#semantic meshlet.5 : __LOCAL
#var uint3 gl_WorkGroupID : $vin.CTAID : CTAID[0] : -1 : 1
#var float4 gl_MeshVerticesNV[0].gl_Position : $vout.POSITION : HPOS[32] : -1 : 1
#var float gl_MeshVerticesNV[0].gl_PointSize : $vout.PSIZE :  : -1 : 0
#var float gl_MeshVerticesNV[0].gl_ClipDistance[0] :  :  : -1 : 0
#var float gl_MeshVerticesNV[0].gl_Cul

I managed to crash NVIDIA's internal shader compiler

frank sail May 8, 2023, 10:54 PM

#

oof

wicked notch May 8, 2023, 10:55 PM

#

Removing this:

```glsl
layout (location = 0) in flat uint o_meshlet_id;
```
fixes it

#

Ah I see

#

```glsl
layout (location = 0) out t_per_vertex {
    uint meshlet_id;
} o_per_vertex[];
```

#

Where the hell do I put the flat

#

lol

#

out flat t_per_vertex doesn't work

frank sail May 8, 2023, 11:08 PM

#

idk

#

you probably can't

wicked notch May 8, 2023, 11:08 PM

#

....how do I send flat attributes to the frag shader?

frank sail May 8, 2023, 11:09 PM

#

you can write the same value three times

#

ez flat

wicked notch May 8, 2023, 11:09 PM

#

It's per vertex not per primitive sadly

frank sail May 8, 2023, 11:09 PM

#

and from the mesh shader, it should be easy

#

You have access to the whole meshlet in the mesh shader, no?

#

You should be able to do whatever you want with a little creativity

wicked notch May 8, 2023, 11:12 PM

#

So far my creativity has caused the NVIDIA internal compiler to crash 4 times

#

KEKW

#

I figured it out btw

#

```glsl
layout (location = 0) out t_per_vertex {
    flat uint meshlet_id;
} o_per_vertex[];
```
this is legal apparently

#

The spec says it's not

#

But NVIDIA is NVIDIA I guess

#

What do you mean nsight does not support debugging their own fucking extension

#

Only D3D12 Mesh Shaders are supported in NSight apparently..

wicked notch May 8, 2023, 11:45 PM

#

At least deccer's cubes work

#

But I'll leave it like this

#

There's no way I can continue with mesh shading if I can't even debug...

#

Pretty sad

wicked notch May 9, 2023, 9:28 AM

#

I once again forgor a triangle has 3 vertices

#

pog

#

Now I just have to figure out why everything borks when there's more than one meshlet group

#

While I figure that out, here's good froge

frank sail May 9, 2023, 9:40 AM

#

frogchamp

wicked notch May 9, 2023, 9:53 AM

#

#

83796 meshlets and 5942 meshlet groups (or if you are old fashioned "meshes" 😄)

#

Occupancy is dead though nervous

#

I'm hitting the memory limit 11 times out of 10 KEKW

#

```glsl
#version 460 core
#extension GL_NV_mesh_shader : require

struct meshlet_t {
    uint triangle_count;
    uint vertex_count;
    uint vertex_offset;
    uint[64] vertices;
    uint[384] triangles;
    uint mesh_index;
};

struct vertex_format_t {
    vec4 position;
    vec4 normal;
    vec4 uv;
    vec4 tangent;
};

layout (local_size_x = 1) in;

layout (triangles, max_vertices = 64, max_primitives = 124) out;

layout (location = 0) out t_per_vertex {
    flat uint meshlet_id;
} o_per_vertex[];

layout (location = 0) uniform mat4 pv;

layout (std430, binding = 0) readonly restrict buffer b_transform_buffer {
    mat4[] transforms;
};

layout (std430, binding = 1) readonly restrict buffer b_meshlet_buffer {
    meshlet_t[] meshlets;
};

layout (std430, binding = 2) readonly restrict buffer b_vertex_buffer {
    vertex_format_t[] vertices;
};

void main() {
    const uint workgroup_index = gl_WorkGroupID.x;
    const meshlet_t meshlet = meshlets[workgroup_index];
    const mat4 transform = transforms[meshlet.mesh_index];

    for (uint i = 0; i < meshlet.vertex_count; ++i) {
        const vertex_format_t vertex = vertices[meshlet.vertex_offset + meshlet.vertices[i]];
        gl_MeshVerticesNV[i].gl_Position = pv * transform * vec4(vertex.position.xyz, 1.0);
        o_per_vertex[i].meshlet_id = workgroup_index;
    }

    const uint index_count = meshlet.triangle_count;
    gl_PrimitiveCountNV = index_count;
    for (uint i = 0; i < index_count * 3; ++i) {
        gl_PrimitiveIndicesNV[i] = meshlet.triangles[i];
    }
}
```
Backup

finite yacht May 9, 2023, 10:17 AM

#

wicked notch ....how do I send flat attributes to the frag shader?

i think perprimitiveNV qualifier is what you want

wicked notch May 9, 2023, 10:19 AM

#

Interesting, I'll check it out

#

I think this struct is the culprit by the way:

```glsl
struct meshlet_t {
    uint triangle_count;
    uint vertex_count;
    uint vertex_offset;
    uint[64] vertices;
    uint[384] triangles;
    uint mesh_index;
};
```

#

Each thread loading 2KiB of data isn't ideal I think KEKW
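
To put a number on that: mirrored in C++, the struct comes to 1808 bytes, most of it triangle indices stored one per 32-bit uint. The sketch below also shows a hypothetical alternative that packs four 8-bit indices into each uint. It is only an illustration of the size difference, not the fix that was eventually used here; `meshlet_packed_t`, `store_index` and `load_index` are made-up names.

```cpp
#include <cassert>
#include <cstdint>

// Same fields as the GLSL struct: every index takes a full 32-bit uint.
struct meshlet_wide_t {
    std::uint32_t triangle_count;
    std::uint32_t vertex_count;
    std::uint32_t vertex_offset;
    std::uint32_t vertices[64];
    std::uint32_t triangles[384];
    std::uint32_t mesh_index;
};
static_assert(sizeof(meshlet_wide_t) == 1808, "roughly 1.8 KiB per meshlet");

// Hypothetical alternative: triangle indices only address the meshlet's own
// 64 vertices, so 8 bits each is enough and four fit in one uint.
struct meshlet_packed_t {
    std::uint32_t triangle_count;
    std::uint32_t vertex_count;
    std::uint32_t vertex_offset;
    std::uint32_t vertices[64];
    std::uint32_t triangles[96];
    std::uint32_t mesh_index;
};
static_assert(sizeof(meshlet_packed_t) == 656, "a bit over a third of the wide layout");

// Writes the i-th 8-bit index into its slot without touching its neighbours.
inline void store_index(meshlet_packed_t& m, std::uint32_t i, std::uint32_t value) {
    assert(i < 384u && value < 256u);
    const std::uint32_t shift = 8u * (i % 4u);
    m.triangles[i / 4u] = (m.triangles[i / 4u] & ~(0xFFu << shift)) | (value << shift);
}

// Reads the i-th 8-bit index back; the same shift-and-mask works in GLSL.
inline std::uint32_t load_index(const meshlet_packed_t& m, std::uint32_t i) {
    assert(i < 384u);
    return (m.triangles[i / 4u] >> (8u * (i % 4u))) & 0xFFu;
}
```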

finite yacht May 9, 2023, 10:27 AM

#

the bad occupancy you mean? That's probably because of the layout (local_size_x = 1) in; so it's only using 1/subgroupSize of the hardware

wicked notch May 9, 2023, 10:27 AM

#

Ah yeah I changed that, I now use 32

#

Occupancy is still horrible, though better by a factor of 2

finite yacht May 9, 2023, 10:28 AM

#

btw didnt you start learning opengl only a few months ago how can you already be messing with mesh shaders and shit thats crazy

wicked notch May 9, 2023, 10:28 AM

#

I just copied the nvidia sample KEKW

#

Also jaker helped me translate the spec into human language

frank sail May 9, 2023, 10:29 AM

#

pretty sure I was doing learnopengl shit a couple months in

wicked notch May 9, 2023, 10:29 AM

#

I am still doing that by the way, I implemented normal mapping on the way

frank sail May 9, 2023, 10:29 AM

#

I didn't have this server when I started tho

wicked notch May 9, 2023, 10:30 AM

#

Yeah, this server is a gold mine

frank sail May 9, 2023, 10:31 AM

#

wicked notch Also jaker helped me translate the spec into human language

what's funny is that I recently submitted a bug report to nvidia and got a reply basically telling me to rtfm

#

#1019779751600205955 message

cunning atlas May 9, 2023, 10:33 AM

#

finite yacht btw didnt you start learning opengl only a few months ago how can you already be...

this thread singlehandedly making me question what I've been spending my time on

frank sail May 9, 2023, 10:37 AM

#

I haven't done a bunch of the things lvstri is doing here

#

the shadows are way cooler already

#

and I haven't even touched mesh shaders

wicked notch May 9, 2023, 10:38 AM

#

You gave me the filtering algorithm though

cunning atlas May 9, 2023, 10:39 AM

#

yeah they look great :3

#

probably look back through this thread when I go back to making my shadows look better

wicked notch May 9, 2023, 10:39 AM

#

I also spent a solid week fixing bugs in my shadows nervous

cunning atlas May 9, 2023, 10:40 AM

#

and look how it paid off 😉

wicked notch May 9, 2023, 10:40 AM

#

Everything regarding shadows is here btw: https://github.com/LVSTRI/Iris/blob/master/shaders/5.0/setup_shadows.comp

#

While filtering is here: https://github.com/LVSTRI/Iris/blob/master/shaders/5.0/main.frag#L100

cunning atlas May 9, 2023, 10:41 AM

#

ty, I've saved those links for later to have a browse

#

although with what I'm currently doing I don't care about shadows too much

#

but i'll definitely take a look after

wicked notch May 9, 2023, 12:44 PM

#

I somewhat fixed the occupancy by putting everything in buffers, but it's still limited because there's too much data in each thread?

wicked notch May 9, 2023, 4:27 PM

#

What the hell is a subgroupBallot and subgroupVote

#

We're electing the next US president?

#

Ah, GL_KHR_shader_subgroup isn't even supported in OpenGL

#

There is no apparent way of scheduling work to the mesh shader from the task shader that allows culling in OpenGL without KHR_shader_subgroup

#

I guess this is as far as we go huh...

#

Pretty sad, I was having a ton of fun even without a debugger

#

Final image with Mesh Shaders, 32 threads per workgroup, 32 vertices MAX (1 to 1), 124 primitives MAX, 174777 meshlets

#

It's time to go back to our origins, with good ol' vertex shaders.

finite yacht May 9, 2023, 4:47 PM

#

wicked notch Ah, `GL_KHR_shader_subgroup` isn't even supported in OpenGL

they are supported. NVIDIA, AMD and Intel have them in OpenGL. Just not core which is sad

wicked notch May 9, 2023, 4:48 PM

#

I see, then they are buried under the second page result of google, because first page is Vulkan only KEKW

proven laurel May 9, 2023, 4:48 PM

#

finite yacht they are supported. NVIDIA, AMD and Intel have them in OpenGL. Just not core whi...

I don't think the next core opengl version will come for a long time lol

wicked notch May 9, 2023, 4:48 PM

#

I'm sure it'll come within the heat death of the universe

finite yacht May 9, 2023, 4:51 PM

#

actually one type of subgroup operations is core (since 4.6) : ARB_shader_group_vote

wicked notch May 9, 2023, 4:53 PM

#

Interesting, I'll eventually come back to mesh shaders, maybe I'll figure these out.

wicked notch May 9, 2023, 5:52 PM

#

Hi-Z time!

frank sail May 9, 2023, 6:22 PM

#

wicked notch There is no apparent way of scheduling work to the mesh shader from the task sha...

what

wicked notch May 9, 2023, 6:25 PM

#

Nevermind that, you can do culling with GL, I just refused to go past the first few google hits lol

#

I also didn't bother to look at how ballotThreadNV actually worked, turns out it's extremely useful

frank sail May 9, 2023, 6:44 PM

#

https://www.khronos.org/blog/vulkan-subgroup-tutorial

The Khronos Group

Vulkan Subgroup Tutorial

Subgroups are an important new feature in Vulkan 1.1 because they enable highly-efficient sharing and manipulation of data between multiple tasks running in parallel on a GPU. In this tutorial, we will cover how to use the new subgroup functionality.

wicked notch May 9, 2023, 6:47 PM

#

Yes indeed, unfortunately it's very different from the NV one.

#

That said, GPUs are scary.

frank sail May 9, 2023, 6:47 PM

#

gpus are epiiiiiiic

wicked notch May 9, 2023, 6:48 PM

#

GPUs are trying to take over the world, they can already feed themselves data to work with.

#

With task shaders they can even dispatch work to themselves

finite yacht May 9, 2023, 7:00 PM

#

KHR_shader_subgroup_ballot is a superset of ARB_shader_ballot which is basically NV_shader_thread_group

wispy spear May 9, 2023, 7:07 PM

#

if this continues, lustri is publishing some api more capable than vulkan, soon

#

you read it here first

wicked notch May 9, 2023, 7:09 PM

#

I doubt that, but I may or may not try to hack bindless textures into RenderDoc.

#

It's really a pain having to disable textures every time I want to debug with it.

wispy spear May 9, 2023, 7:35 PM

#

yes please 😄

#

that also reminds me i wanted to readd a texturearray path for that reason to my shit

frank sail May 9, 2023, 7:35 PM

#

you'd be a 🦵end if you did it

#

it's probably not easy to implement if baldurk has refused to do it a million times already

#

hmm, maybe you can still hack a non-complete, non-performant version in somehow

wispy spear May 9, 2023, 7:36 PM

#

jaker and i are worthy guinea pigs

wicked notch May 9, 2023, 7:58 PM

#

How do I know the size of the mips in a texture created with glTextureStorage2D(..., mips)? 🤔

#

If the base level is 1024x768, what's level 1?

frank sail May 9, 2023, 7:58 PM

#

The spec says how they're created

#

But basically it's just max(1, floor(res / 2)) for each level

wicked notch May 9, 2023, 7:59 PM

#

Very nice

#

So if I specify floor(log2(max(w, h))) + 1 mips I'll have
0 -> 1024x768
1 -> 512x384
...
9 -> 2x1
10 -> 1x1
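
The level count and per-level sizes above can be checked with a few lines of C++. This sketch assumes the usual rule quoted earlier, max(1, floor(size / 2)) per level; `extent_t`, `mip_count` and `mip_chain` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct extent_t {
    std::uint32_t width;
    std::uint32_t height;
};

// Number of levels in a full chain: floor(log2(max(w, h))) + 1.
inline std::uint32_t mip_count(std::uint32_t width, std::uint32_t height) {
    assert(width > 0u && height > 0u);
    std::uint32_t levels = 1;
    for (std::uint32_t size = std::max(width, height); size > 1u; size >>= 1u) {
        ++levels;
    }
    return levels;
}

// Size of every level: each dimension is halved (rounding down), never below 1.
inline std::vector<extent_t> mip_chain(std::uint32_t width, std::uint32_t height) {
    const std::uint32_t levels = mip_count(width, height);
    std::vector<extent_t> chain;
    chain.reserve(levels);
    for (std::uint32_t level = 0; level < levels; ++level) {
        chain.push_back(extent_t{
            std::max<std::uint32_t>(1u, width >> level),
            std::max<std::uint32_t>(1u, height >> level),
        });
    }
    return chain;
}
```

For 1024x768 this gives 11 levels, and the height passes through 3 before reaching 1, which matters for any 2x2 downsampling scheme.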

frank sail May 9, 2023, 8:02 PM

#

yeah, probably

wicked notch May 9, 2023, 8:03 PM

#

I worry about the probably, but it'll be fine

frank sail May 9, 2023, 8:06 PM

#

I almost always mince my statements so I can't be wrong smart

wispy spear May 9, 2023, 8:06 PM

#

not curry?

frank sail May 9, 2023, 8:08 PM

#

that comes later

wicked notch May 9, 2023, 8:52 PM

#

Anything that combines textureGather and textureLod?

wispy spear May 9, 2023, 8:54 PM

#

textureGatherLod

#

jk

wicked notch May 9, 2023, 8:57 PM

#

https://tenor.com/view/lies-the-batman-liar-you-lie-dont-lie-to-me-gif-23486294

finite yacht May 9, 2023, 9:08 PM

#

interesting AMD has an extension just for that

wicked notch May 9, 2023, 9:26 PM

#

There is this stupid edge that doesn't go away

#

```glsl
#version 460 core

layout (local_size_x = 16, local_size_y = 16, local_size_z = 1) in;

layout (location = 0) uniform uint u_level;

layout (binding = 0) uniform sampler2D u_in_depth;
layout (binding = 1, r32f) uniform writeonly image2D u_out_depth;


void main() {
    const uvec2 coord = gl_GlobalInvocationID.xy;
    const ivec2 size = imageSize(u_out_depth);
    if (all(lessThan(coord, size))) {
        const vec4 depth = vec4(
            textureLod(u_in_depth, ((vec2(coord) + vec2(0.0, 0.0) + 0.5) / vec2(size)), u_level).r,
            textureLod(u_in_depth, ((vec2(coord) + vec2(1.0, 0.0) + 0.5) / vec2(size)), u_level).r,
            textureLod(u_in_depth, ((vec2(coord) + vec2(0.0, 1.0) + 0.5) / vec2(size)), u_level).r,
            textureLod(u_in_depth, ((vec2(coord) + vec2(1.0, 1.0) + 0.5) / vec2(size)), u_level).r);
        imageStore(u_out_depth, ivec2(coord), vec4(max(max(depth.x, depth.y), max(depth.z, depth.w))));
    }
}
```
What's wrong here?

wispy spear May 9, 2023, 9:28 PM

#

hmm wrong addressmode perhaps in your u_in_depth?

#

clamp_to_edge vs clamp_to_border?

#

(i am just talking out of my ass here)

wicked notch May 9, 2023, 9:28 PM

#

Yeah it's clamp_to_edge

wispy spear May 9, 2023, 9:28 PM

#

what if you clamp to border and set the border color to black?

wicked notch May 9, 2023, 9:29 PM

#

Wait hold on, I'm an idiot.

#

Only the first mip is clamp_to_edge

#

Lovely, it works now

wicked notch May 9, 2023, 10:31 PM

#

Hmm, HiZ culling has the opposite problem as ROC

#

It's not very conservative

#

Mayhaps I have some errors in my implementation.

wispy spear May 9, 2023, 10:33 PM

#

what was ROC again?

wicked notch May 9, 2023, 10:33 PM

#

Raster Occlusion Culling

wispy spear May 9, 2023, 10:37 PM

#

never heard that before

frank sail May 9, 2023, 11:10 PM

#

wicked notch Only the first mip is clamp_to_edge

Wot

wicked notch May 9, 2023, 11:14 PM

#

That doesn't make a lot of sense yes, I meant to say "only the original depth buffer is clamp to edge, the actual hiz mip chain was clamp to border"

#

I should probably start using sampler objects.

frank sail May 9, 2023, 11:15 PM

#

Sampler objects are bae

wicked notch May 9, 2023, 11:31 PM

#

Hmm HiZ is super fast

#

By a factor of 2 at least

#

Still broken though KEKW

#

It's possible that D3D/Vulkan conventions are biasing this article

frank sail May 9, 2023, 11:40 PM

#

try glClipControl to change the depth range, then call the corresponding glm function to generate a new projection
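To make the difference between the two depth conventions concrete, here is a small sketch that only evaluates the depth terms of a right-handed perspective projection by hand. It assumes the usual formulation (the one I believe `glm::perspectiveRH_ZO` and `glm::perspectiveRH_NO` use, and which `glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE)` pairs with); the function names are invented for this example.

```cpp
#include <cassert>

// NDC depth of a point `dist` units in front of the camera when depth is
// mapped to [0, 1] (the range selected by GL_ZERO_TO_ONE).
inline double ndc_depth_zero_to_one(double dist, double z_near, double z_far) {
    assert(z_near > 0.0 && z_far > z_near && dist > 0.0);
    const double z_view = -dist; // right-handed: the camera looks down -Z
    const double z_clip = z_far / (z_near - z_far) * z_view
                        - (z_far * z_near) / (z_far - z_near);
    const double w_clip = -z_view;
    return z_clip / w_clip;
}

// Same point under OpenGL's default [-1, 1] depth range.
inline double ndc_depth_negative_one_to_one(double dist, double z_near, double z_far) {
    assert(z_near > 0.0 && z_far > z_near && dist > 0.0);
    const double z_view = -dist;
    const double z_clip = -(z_far + z_near) / (z_far - z_near) * z_view
                        - (2.0 * z_far * z_near) / (z_far - z_near);
    const double w_clip = -z_view;
    return z_clip / w_clip;
}
```

With the first variant the near plane lands at 0 and the far plane at 1; with the second they land at -1 and 1. Any shader that reconstructs position from depth has to match whichever one is active.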

wicked notch May 9, 2023, 11:42 PM

#

I'd have to change all my 60 shaders to use [0;1] instead of [-1;1]

#

But honestly it's worth it, I don't know who the hell thought -1,1 depth was a good idea, I hope he's repenting

wicked notch May 10, 2023, 2:04 PM

#

Finally, I switched to a sane(r) NDC system

#

No more depth [-1,1] bullshit. (HiZ is still broken though KEKW )

wicked notch May 10, 2023, 4:17 PM

#

Hmm these are the AABB's uvs that HiZ is seeing, I'm not sure I see anything wrong with them

#

Maybe the scale?

wispy spear May 10, 2023, 4:25 PM

#

https://tenor.com/view/baby-scale-omg-shookt-surprised-gif-9356799

wicked notch May 10, 2023, 6:38 PM

#

I am failing to understand HiZ

#

And the funny part is I don't know why I'm failing at understanding it, it's quite straightforward.

#

I'll start again from square 0, building the HiZ mip chain

frank sail May 10, 2023, 6:42 PM

#

are you failing to understand HiZ as a whole, or just a particular bit of its implementation?

wicked notch May 10, 2023, 6:43 PM

#

I don't know, I feel like I understand it, but when I try and apply "fixes" that I think are causing me problems, everything breaks (or nothing changes at all).

#

Also:

```glsl
const mat4 pv = u_cascade_layer == -1 ? camera.pv : cascades[u_cascade_layer].pv;
// project AABB in clip space
vec4 ndc_corner = pv * model * vec4(aabb_corners[i], 1.0);
ndc_corner.z = max(ndc_corner.z, 0.0);
ndc_corner /= ndc_corner.w;
```
Why the hell we max out ndc's Z before perspective div is a mystery

wispy spear May 10, 2023, 6:46 PM

#

when you say "i dont know" you need to say it like jimmy yang in his chinese accent... "oh.... i dont knoow"

frank sail May 10, 2023, 6:47 PM

#

wicked notch Also: ```glsl const mat4 pv = u_cascade_layer == -1 ? camera.pv : cascades[u_cas...

then why do you have it

wicked notch May 10, 2023, 6:48 PM

#

https://interplayoflight.wordpress.com/2017/11/15/experiments-in-gpu-based-occlusion-culling/

#

His fault

#

I mean, it makes sense to do that, just not before perspective division

#

We don't care about objects behind the near plane after all.

frank sail May 10, 2023, 6:51 PM

#

does it even matter if you do the clamp before or after perspective division

#

the eventual value will be 0 either way
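A quick way to check that claim is to compare both orders on a few values. This is only a sketch with hypothetical function names, assuming [0, 1] depth so the near plane sits at clip-space z = 0.

```cpp
#include <algorithm>
#include <cassert>

// Clamp clip-space z to the near plane first, then divide by w.
inline float clamp_then_divide(float z_clip, float w_clip) {
    assert(w_clip != 0.0f);
    return std::max(z_clip, 0.0f) / w_clip;
}

// Divide by w first, then clamp the resulting NDC z.
inline float divide_then_clamp(float z_clip, float w_clip) {
    assert(w_clip != 0.0f);
    return std::max(z_clip / w_clip, 0.0f);
}
```

For w > 0 the two agree: z = -0.5, w = 2 gives 0 either way, and z = 0.5, w = 2 gives 0.25 either way. They only diverge when w is negative (a corner behind the camera): z = -1, w = -2 clamps to 0 in the first order but yields 0.5 in the second, so the order may matter for AABBs that straddle the camera.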

wicked notch May 10, 2023, 6:51 PM

#

I guess

#

I'm grasping at straws here frog_gone

#

ROC was much easier froge_sad

frank sail May 10, 2023, 6:53 PM

#

ASUS ROC (Republic of Camers)

wicked notch May 10, 2023, 6:55 PM

#

Alright enough complaining, I'll get to work seriously now

#

Ah just one thing, could you find some HiZ samples for me? I can't find anything other than Niagara and the one linked above

#

I'd like to see some common ground, hopefully

frank sail May 10, 2023, 7:00 PM

#

vkguide.dev has one

frank sail May 10, 2023, 7:03 PM

#

wicked notch <https://interplayoflight.wordpress.com/2017/11/15/experiments-in-gpu-based-occl...

I like this variable name
NoofInstances

#

N o o f

wicked notch May 10, 2023, 7:12 PM

#

Is there anything equivalent to VK_STRUCTURE_TYPE_SAMPLER_REDUCTION_MODE_CREATE_INFO_EXT in OpenGL?

#

Looks like vkguide is using it as a foolproof way of reducing a depth image

#

Also that's one long struct name lol

frank sail May 10, 2023, 7:17 PM

#

no

wicked notch May 10, 2023, 7:17 PM

#

https://tenor.com/view/crying-emoji-gif-21922016

frank sail May 10, 2023, 7:19 PM

#

wicked notch Looks like vkguide is using it as a foolproof way of reducing a depth image

it's not hard to implement yourself, it'll just perform worse since you have to explicitly take four shrimples

wicked notch May 10, 2023, 7:21 PM

#

I am gigastupid then

#

Can't seem to implement a proper depth reduction

wispy spear May 10, 2023, 7:22 PM

#

just go straight to whatever you want to call your gpu architecture + api which will be the successor of vk

#

where xxReduceDepth() is a thing

wicked notch May 10, 2023, 7:23 PM

#

:(

frank sail May 10, 2023, 7:23 PM

#

wicked notch Can't seem to implement a proper depth reduction

it's ez pz
return 0;

#

or return 1; if you don't have reverse depth smart

wicked notch May 10, 2023, 7:27 PM

#

How

#

In god's holy name

#

is level 9

#

lower than level 10

#

How does this even happen

frank sail May 10, 2023, 7:33 PM

#

isn't that expected behavior

#

the final mip should be really deep since it's the deepest of all pixels

wicked notch May 10, 2023, 7:33 PM

#

Well yes, it would be

#

except I omitted one crucial detail

#

level 8 is also higher than level 9

#

Which is absolutely bonkers

frank sail May 10, 2023, 7:34 PM

#

maybe you don't handle reducing odd resolutions correctly

#

you have to do something special for those iirc
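One common way to handle odd sizes, sketched on the CPU so it is easy to check: when a source dimension is odd, the last destination texel also absorbs the leftover row or column, so no source texel is skipped. This assumes standard depth (larger = farther) reduced with max; with reversed depth the same idea applies with min. `DepthLevel` and `reduce_max` are illustrative names, not code from Iris.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side stand-in for one level of the depth pyramid.
struct DepthLevel {
    std::uint32_t width;
    std::uint32_t height;
    std::vector<float> texels; // row-major, width * height entries
};

// Max-reduce `src` into the next level, sized max(1, floor(size / 2)).
inline DepthLevel reduce_max(const DepthLevel& src) {
    assert(src.width > 0 && src.height > 0);
    assert(src.texels.size() == std::size_t(src.width) * src.height);

    DepthLevel dst{};
    dst.width = std::max<std::uint32_t>(1, src.width / 2);
    dst.height = std::max<std::uint32_t>(1, src.height / 2);
    dst.texels.resize(std::size_t(dst.width) * dst.height);

    for (std::uint32_t y = 0; y < dst.height; ++y) {
        const std::uint32_t y0 = 2 * y;
        // The last row stretches to the end of the source to cover an odd leftover.
        const std::uint32_t y1 = (y == dst.height - 1) ? src.height - 1 : y0 + 1;
        for (std::uint32_t x = 0; x < dst.width; ++x) {
            const std::uint32_t x0 = 2 * x;
            const std::uint32_t x1 = (x == dst.width - 1) ? src.width - 1 : x0 + 1;
            float deepest = src.texels[std::size_t(y0) * src.width + x0];
            for (std::uint32_t sy = y0; sy <= y1; ++sy) {
                for (std::uint32_t sx = x0; sx <= x1; ++sx) {
                    deepest = std::max(deepest, src.texels[std::size_t(sy) * src.width + sx]);
                }
            }
            dst.texels[std::size_t(y) * dst.width + x] = deepest;
        }
    }
    return dst;
}
```

With this rule a coarser level can never be smaller than any texel it covers, which is the property the level 8/9/10 comparison above was violating. Rounding the base level down to a power of two sidesteps the odd case entirely, at the cost of a resample of the original depth buffer.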

wicked notch May 10, 2023, 7:35 PM

#

yes

#

Which is why I made this:

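```cpp
// Largest power of two strictly below v (returns 0 for v <= 1).
// Assumes v <= 2^31, which holds for any texture dimension.
static auto previous_power_two(iris::uint32 v) noexcept -> iris::uint32 {
    auto r = iris::uint32(1); // unsigned, so the comparison with v is not mixed-sign
    while (r < v) {
        r <<= 1;
    }
    return r >> 1;
}
```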

wicked notch May 10, 2023, 8:07 PM

#

Mip chain is fully working now

#

Onto figuring out the dumb algorithm

wicked notch May 10, 2023, 9:26 PM

#

This is very frustrating.

#

I'll stop for now

wicked notch May 10, 2023, 10:32 PM

#

Here projectSphere seems to take a world space sphere, but that doesn't seem right does it? https://github.com/zeux/niagara/blob/4e3e21440e4b7d0699bcc5d46f2efbe1e0050946/src/shaders/drawcull.comp.glsl#L82

frank sail May 11, 2023, 3:53 AM

#

finite yacht i think `perprimitiveNV` qualifier is what you want

DX12 equivalent if you're curious
https://github.com/microsoft/DirectX-Specs/blob/master/d3d/MeshShader.md#primitive-attributes

wicked notch May 11, 2023, 8:58 AM

#

Good morning friends.

#

Day 2 of debugging HiZ, hopefully I'll have a clue what's going on by today

frank sail May 11, 2023, 8:59 AM

#

did it come to you in your dreams

wicked notch May 11, 2023, 8:59 AM

#

Sadly not

#

By the way, if you know how this works, could you give me a hint?

frank sail May 11, 2023, 9:00 AM

#

idk, never implemented it

#

I mean, I know how it works conceptually

#

do depth reduction, project object bounds (AABBs or spheres), gather some texels, etc.

wicked notch May 11, 2023, 9:03 AM

#

Could you find the original SIGGRAPH slides or the original paper? I can't seem to find it lol

frank sail May 11, 2023, 9:03 AM

#

I'll try

#

I failed

wicked notch May 11, 2023, 9:06 AM

#

I see you don't go past google hit #5 too bleakekw

frank sail May 11, 2023, 9:16 AM

#

is there something fundamental about hi-z culling that you're unsure about?

wicked notch May 11, 2023, 9:27 AM

#

frank sail is there something fundamental about hi-z culling that you're unsure about?

A bit of this

#

This is deccer cubes

#

™️

#

These are the screen space bounding rectangles that HiZ is seeing

#

Now, suppose the red cube is huge, except there's a hole in the middle that allows you to see the purple and green cubes

#

Would they get culled because the bounding box's depth is lower than in the HiZ's?

frank sail May 11, 2023, 9:29 AM

#

shouldn't be a problem

#

the bounding boxes cannot occlude other bounding boxes

#

they are simply a conservative testing volume

#

which you test against the depth buffer

wicked notch May 11, 2023, 9:30 AM

#

So the bounding rects simply serve as an "index" against the HiZ chain?

frank sail May 11, 2023, 9:30 AM

#

they serve the same purpose as the bounding volumes in ROC

#

a volume that conservatively encapsulates an object which, if visible, is an indication that the object itself is probably visible as well

#

hi-z and ROC are really just different ways of testing the bounding volumes against the depth buffer

wicked notch May 11, 2023, 9:33 AM

#

Ah I was going with the opposite (and wrong) intuition: if the volume is NOT visible then the object is culled

frank sail May 11, 2023, 9:35 AM

#

that is correct though

#

if the conservative bounding volume isn't visible, then the object itself definitely isn't

wicked notch May 11, 2023, 9:35 AM

#

Hmm I see

frank sail May 11, 2023, 9:35 AM

#

but if the bounding volume is visible, then the object is only probably visible (because the bounding volume is conservative)
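The test described above can be sketched like this, assuming standard depth (0 = near, 1 = far), a pyramid reduced with max, and four explicit taps at the corners of the projected rectangle. Both function names are hypothetical, and other implementations pick the level slightly differently (for example with floor instead of ceil).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Pick the mip level where the screen rect is at most one texel wide,
// so four corner taps cover everything it can touch.
inline int hiz_level_for_rect(float width_px, float height_px) {
    assert(width_px >= 0.0f && height_px >= 0.0f);
    const float extent = std::max(width_px, height_px);
    return extent <= 1.0f ? 0 : static_cast<int>(std::ceil(std::log2(extent)));
}

// The object is culled only if even the nearest point of its bounding volume
// lies behind the farthest occluder depth stored under the rect.
inline bool is_occluded(float box_nearest_depth, const std::array<float, 4>& hiz_texels) {
    const float farthest_occluder = *std::max_element(hiz_texels.begin(), hiz_texels.end());
    return box_nearest_depth > farthest_occluder;
}
```

A "not occluded" result only means the volume might be visible, which is why the test can keep objects that are actually hidden but should never drop one that is visible.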

wicked notch May 11, 2023, 9:36 AM

#

Alright

#

Thanks a lot for the additional insight

wicked notch May 11, 2023, 10:55 AM

#

I think I'm getting close to solving it.

#

I only need to figure out the projectSphere's weird projection thing

wicked notch May 11, 2023, 8:25 PM

#

Heh

#

Hahah

#

Aahahahha

frank sail May 11, 2023, 8:25 PM

#

frogstare

wicked notch May 11, 2023, 8:25 PM

#

I would like to order one death please

#

You would think.

#

As a human person

#

That GL_NEAREST takes the closest pixel to the UV coordinates specified in textureLod

wispy spear May 11, 2023, 8:27 PM

#

GL_CLOSEST does

frank sail May 11, 2023, 8:27 PM

#

It just truncates 🙂

wicked notch May 11, 2023, 8:27 PM

#

Except if you have mips

#

You need GL_NEAREST_MIPMAP_NEAREST

wispy spear May 11, 2023, 8:27 PM

#

badumm tsss

frank sail May 11, 2023, 8:27 PM

#

ah

#

because gl is so unhinged that you have to specify min and mip in the same parameter

#

otherwise you implicitly disable mips
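Since the driver stays silent about this, a debug-only check is easy to write by hand. The sketch below repeats the enum values from the OpenGL headers so it compiles without a loader; `min_filter_uses_mips` is a made-up helper. As far as I understand the spec, with GL_NEAREST or GL_LINEAR as the min filter only the base level is ever read, so the LOD passed to `textureLod` has no effect.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the OpenGL headers so this sketch needs no GL loader.
constexpr std::uint32_t kNearest              = 0x2600; // GL_NEAREST
constexpr std::uint32_t kLinear               = 0x2601; // GL_LINEAR
constexpr std::uint32_t kNearestMipmapNearest = 0x2700; // GL_NEAREST_MIPMAP_NEAREST
constexpr std::uint32_t kLinearMipmapNearest  = 0x2701; // GL_LINEAR_MIPMAP_NEAREST
constexpr std::uint32_t kNearestMipmapLinear  = 0x2702; // GL_NEAREST_MIPMAP_LINEAR
constexpr std::uint32_t kLinearMipmapLinear   = 0x2703; // GL_LINEAR_MIPMAP_LINEAR

// True if a GL_TEXTURE_MIN_FILTER value lets the sampler read levels other than the base.
inline bool min_filter_uses_mips(std::uint32_t min_filter) {
    switch (min_filter) {
        case kNearestMipmapNearest:
        case kLinearMipmapNearest:
        case kNearestMipmapLinear:
        case kLinearMipmapLinear:
            return true;
        case kNearest:
        case kLinear:
            return false;
        default:
            assert(false && "not a valid GL_TEXTURE_MIN_FILTER value");
            return false;
    }
}
```

For a HiZ sampler that would mean something like `glSamplerParameteri(sampler, GL_TEXTURE_MIN_FILTER, GL_NEAREST_MIPMAP_NEAREST)` on a sampler object, with an assert on `min_filter_uses_mips` wherever the sampler is bound for a pass that calls `textureLod`.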

wispy spear May 11, 2023, 8:28 PM

#

wouldntvedve happend with the right abstraction

frank sail May 11, 2023, 8:28 PM

#

fwog user #2???

wispy spear May 11, 2023, 8:28 PM

#

wicked notch May 11, 2023, 8:29 PM

#

I want to say this was a good experience

wispy spear May 11, 2023, 8:29 PM

#

OpenGL is like a diving in some closed mines experience

wicked notch May 11, 2023, 8:29 PM

#

But this really wasn't, I just hope OpenGL doesn't fail me again...

wispy spear May 11, 2023, 8:30 PM

#

crystal clear water everywhere, but as soon as you dive and cause a wave shock, shit is going to hit the fan

frank sail May 11, 2023, 8:30 PM

#

or maybe it's lvstri's ~~evil villain~~ Vulkan backstory

wispy spear May 11, 2023, 8:30 PM

#

haha

wicked notch May 11, 2023, 8:30 PM

#

By the way, in the last few months I have only had a few dozen debug messages

wispy spear May 11, 2023, 8:30 PM

#

there are usually not many anyway

wicked notch May 11, 2023, 8:30 PM

#

Not once has the debug message callback come to my help like "hey, you might want to do GL_NEAREST_MIPMAP_NEAREST"

frank sail May 11, 2023, 8:30 PM

#

Did you ever enable the synchronous thingy

wicked notch May 11, 2023, 8:30 PM

#

Yes since you told me

wispy spear May 11, 2023, 8:31 PM

#

glValidationLayer :C

wicked notch May 11, 2023, 8:31 PM

#

Haven't had the chance to use it though, since literally not a single message has happened

frank sail May 11, 2023, 8:31 PM

#

wicked notch Not once has the debug message callback come to my help like "hey, you might wan...

Well it's not an api error to do that methinks

wicked notch May 11, 2023, 8:31 PM

#

It should be

#

fucking hell

#

Don't just implicitly disable shit

wispy spear May 11, 2023, 8:31 PM

#

if the api would have known what you wanted to achieve

wicked notch May 11, 2023, 8:32 PM

#

Right, I just don't think implicitly disabling mips is a good way of handling this

#

Give me literally any message, don't just do stuff silently

wispy spear May 11, 2023, 8:32 PM

#

don't be so hard on either of you (you and OpenGL)

#

the latter will have more opportunities to bite you in the butt

wicked notch May 11, 2023, 8:33 PM

#

I can only imagine

#

By the way

#

Here's how I diagnosed this

wispy spear May 11, 2023, 8:34 PM

#

pengu jaker and i were thinking/toying with some glValidation layer thing a while ago, but it's super dead

wicked notch May 11, 2023, 8:34 PM

#

Blue is mip level, red is value sampled from the uvs of that level

wispy spear May 11, 2023, 8:34 PM

#

that thing could/should/might have picked that one up

wicked notch May 11, 2023, 8:34 PM

#

I checked level 9, and the mip was just 1.0

#

So thank you, OpenGL.

#

I'm sure this would not have happened with any saner API KEKW .

#

But it is what it is..

#

At least I have occlusion culling now

#

Actually, I have 4 different fully functional ways of doing occlusion culling KEKW

wispy spear May 11, 2023, 8:36 PM

#

noice

wicked notch May 11, 2023, 8:36 PM

#

Because yes, I rewrote the entire algorithm 4 times, in different ways

#

And they all work perfectly fine

#

Now, onto removing the over 500 lines of code purely for debugging purposes KEKW

#

After this I think I'll be taking a big chamomile cup

#

I'm kind of irritated.

frank sail May 11, 2023, 8:42 PM

#

wicked notch Right, I just don't think implicitly disabling mips is a good way of handlin...

You're using the wrong API then, my friend

wispy spear May 11, 2023, 8:53 PM

#

https://github.com/GraphicsProgramming/gl-validation-layer is the thing i was talking about

wicked notch May 11, 2023, 9:02 PM

#

I came back.

wicked notch May 11, 2023, 9:03 PM

#

wispy spear https://github.com/GraphicsProgramming/gl-validation-layer is the thing i was ta...

I checked this out too, but it looks like it's abandoned?

#

😦

frank sail May 11, 2023, 9:03 PM

#

It's merely taking a nap

wicked notch May 11, 2023, 9:03 PM

#

frank sail You're using the wrong API then, my friend

I do think sometimes that I could "just" switch to Vulkan, "just" use meshlets, "just" use meshlet culling which is super accurate and all the bleeding edge features you can name

#

Except the "just" KEKW

frank sail May 11, 2023, 9:04 PM

#

wicked notch May 11, 2023, 9:04 PM

#

I saw some Vulkan code today as a result of trying to figure out why HiZ was not working and it's very thicc

#

I don't really want to give up the convenience of GL yet.

frank sail May 11, 2023, 9:05 PM

#

Maybe you could do vkguide as a side-side project

wicked notch May 11, 2023, 9:05 PM

#

The unhingedness does not outweigh the convenience (yet)

wispy spear May 11, 2023, 9:06 PM

#

ja its kind dead 😦

frank sail May 11, 2023, 9:06 PM

#

You could always use a nice gl wrapper ahem (ignore broken docs build)
https://github.com/JuanDiegoMontoya/Fwog

wispy spear May 11, 2023, 9:06 PM

#

or switch to c# and use my experiment 😛

wicked notch May 11, 2023, 9:06 PM

#

wispy spear ja its kind dead 😦

At least I know I'm not alone fighting the non-existent debugging messages

frank sail May 11, 2023, 9:06 PM

#

wispy spear ja its kind dead 😦

But we accept contributors

wicked notch May 11, 2023, 9:07 PM

#

Anyways I have now calmed down and I'm not mad anymore at GL

#

I want to do something smol next.

#

Anti-Aliasing

frank sail May 11, 2023, 9:08 PM

#

Do something easy like cutting edge TAA

wicked notch May 11, 2023, 9:08 PM

#

Except I won't implement TAA myself, I'll just use FSR2 since it's open source (Thank you Jaker)

frank sail May 11, 2023, 9:08 PM

#

One small issue

#

Only Vulkan and DX12 backends are provided

#

So you'd have to make a gl one

wicked notch May 11, 2023, 9:08 PM

#

Why do I get the feeling it's not... small

#

Yeah

#

You work at AMD right? Just make an OpenGL version duh

frank sail May 11, 2023, 9:09 PM

#

I can make one in my personal time

#

But there won't be an official one

wicked notch May 11, 2023, 9:09 PM

#

Sad

frank sail May 11, 2023, 9:10 PM

#

If you want, I can start working on it. I've been needing an excuse to do it

wicked notch May 11, 2023, 9:10 PM

#

What do you mean it's not supported on OpenGL by the way? Isn't it just a spatio-temporal upscaling algorithm?

#

Like, why does it require Vulkan or DX12?

frank sail May 11, 2023, 9:10 PM

#

It's not just a shader you invoke. Fsr2 also needs a bunch of internal resources and stuff

wicked notch May 11, 2023, 9:13 PM

#

Hmm

#

How hard is it to port to OpenGL?

wispy spear May 11, 2023, 9:14 PM

#

from 1 to 10? 9.2 id say

frank sail May 11, 2023, 9:15 PM

#

https://github.com/GPUOpen-Effects/FidelityFX-FSR2/blob/master/src/ffx-fsr2-api/vk/ffx_fsr2_vk.cpp

frank sail May 11, 2023, 9:15 PM

#

wicked notch How hard is it to port to OpenGL?

Probably not super hard, provided you understand the requirements

wicked notch May 11, 2023, 9:17 PM

#

frank sail https://github.com/GPUOpen-Effects/FidelityFX-FSR2/blob/master/src/ffx-fsr2-api/...

...Translating Vulkan code (which I have 0 idea about) to OpenGL code (which I have at most 1 idea about) isn't exactly in my skillset

frank sail May 11, 2023, 9:18 PM

#

I am unironically willing to try it

#

I need to better understand fsr2 for my job anyways KEKW

wispy spear May 11, 2023, 9:19 PM

#

after the volume clustered renderer~~?~~!

#

Jaker:

#

https://tenor.com/view/dredd-i-knew-youd-say-that-stallone-gif-23229845

wicked notch May 11, 2023, 9:23 PM

#

?????

#

I don't really handle transparency for now so I'm safe from this thing?

frank sail May 11, 2023, 9:25 PM

#

wicked notch May 11, 2023, 9:27 PM

#

Ah yes

#

Lines

#

Yeah ok, I don't understand shit KEKW

#

Hmm where to go next I wonder

frank sail May 11, 2023, 9:27 PM

#

FXAA

wicked notch May 11, 2023, 9:28 PM

#

FXAA sounds good

frank sail May 11, 2023, 9:28 PM

#

SMAA is like improved FXAA if you're willing to suffer through a much more complicated implementation

cunning atlas May 11, 2023, 9:59 PM

#

you got it figured out at least, you've been doing great things frogapprove

wispy spear May 11, 2023, 10:13 PM

#

did you know Iris is also a female firstname in germany

#

not just the eye part

frank sail May 11, 2023, 10:14 PM

#

eyeris

wicked notch May 11, 2023, 10:22 PM

#

wispy spear did you know Iris is also a female firstname in germany

I hope I don't bring shame to any women in Germany bearing that name with my shitty OpenGL stuff then KEKW

wispy spear May 11, 2023, 10:28 PM

#

haha

#

na, dont worry

wicked notch May 11, 2023, 11:19 PM

#

Lovely edges

wispy spear May 11, 2023, 11:22 PM

#

oi that looks cool

frank sail May 11, 2023, 11:29 PM

#

@wicked notch are you following the catlike coding tutorial (I think that was the one I used)

wicked notch May 11, 2023, 11:31 PM

#

I'm using
https://developer.download.nvidia.com/assets/gamedev/files/sdk/11/FXAA_WhitePaper.pdf and
http://blog.simonrodriguez.fr/articles/2016/07/implementing_fxaa.html

frank sail May 11, 2023, 11:31 PM

#

ah

wispy spear May 11, 2023, 11:31 PM

#

+1 for sorting your messages for better readabliktliblity

frank sail May 11, 2023, 11:31 PM

#

the second link only implements half of FXAA btw

#

it doesn't do the end-of-edge search, which is pretty important for reducing geometric aliasing

wicked notch May 11, 2023, 11:33 PM

#

I see, I'll try catlike coding then

frank sail May 11, 2023, 11:33 PM

#

ye
https://catlikecoding.com/unity/tutorials/advanced-rendering/fxaa/

wicked notch May 11, 2023, 11:33 PM

#

The paper should implement everything anyways?

frank sail May 11, 2023, 11:34 PM

#

yeah

#

lol

twin bough May 12, 2023, 6:32 AM

#

How many aa techniques are there anyway

wicked notch May 12, 2023, 9:23 AM

#

Too many KEKW

wicked notch May 12, 2023, 4:22 PM

#

2 hour wait for a haircut but we got it done

#

Time to finish FXAA

wicked notch May 12, 2023, 7:06 PM

#

Visualizing AABBs when in FXAA debug mode is kinda weird

wicked notch May 12, 2023, 7:30 PM

#

It's not very... anti-aliased.

#

The AABB debug lines are pretty smooth though

wispy spear May 12, 2023, 7:31 PM

#

looks smoof to me

wicked notch May 12, 2023, 7:31 PM

#

Yeah it's definitely better

wispy spear May 12, 2023, 7:32 PM

#

this scene is probably also not a good scene to show off antialiasing

wicked notch May 12, 2023, 7:32 PM

#

I've been trying to find new scenes, no luck unfortunately 😦

wispy spear May 12, 2023, 7:33 PM

#

the DirectX samples might have one with 2 telephone poles and a wire

wicked notch May 12, 2023, 7:33 PM

#

The tree in bistro gets blurred a lot lol

wicked notch May 12, 2023, 7:33 PM

#

wispy spear the DirectX samples might have one with 2 telephone poles and a wire

I'll download that

wispy spear May 12, 2023, 7:34 PM

#

ah bistro also has those lights hanging across on wire

wicked notch May 12, 2023, 7:46 PM

#

These DirectX samples made me remember how attracted I am to Mesh Shaders

#

They are so good

#

Anyways testing scene right now

wicked notch May 12, 2023, 8:42 PM

#

After a bit more testing I found that FXAA was indeed working..

#

It's just.. well, not very good nervous

wicked notch May 12, 2023, 9:33 PM

#

Do I do TAA or do I not do TAA...

finite yacht May 12, 2023, 9:34 PM

#

Do it!

wicked notch May 12, 2023, 9:34 PM

#

I'll try asking my orb (the magic 8 ball)

#

Looks like I will do TAA

frank sail May 12, 2023, 9:35 PM

#

Sweet

#

all I ask is that you use a TAA impl that relies on a single frame of history

wicked notch May 12, 2023, 9:36 PM

#

No temporal accumulation?

frank sail May 12, 2023, 9:36 PM

#

I mean something different

#

some old TAA impls have like 5 frames of history that they reproject and test against

#

but modern impls use the last frame only, since it's faster and more stable
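A rough sketch of what a single-history-frame resolve looks like for one channel of one pixel: the reprojected history is clamped to the min/max of the current frame's local neighborhood, then blended with the current sample. This uses plain clamping, the simplest variant; the clipping described in the slides is a refinement of the same idea. All names and the blend weight are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Resolve one channel of one pixel from the current sample and one history sample.
// neighborhood_min/max come from the current frame's 3x3 neighborhood.
inline float taa_resolve(float current, float history,
                         float neighborhood_min, float neighborhood_max,
                         float current_weight) {
    assert(neighborhood_min <= neighborhood_max);
    assert(current_weight >= 0.0f && current_weight <= 1.0f);
    // Reject history that no longer resembles anything in the current neighborhood.
    const float clamped_history =
        std::min(std::max(history, neighborhood_min), neighborhood_max);
    // Exponential blend: the output becomes next frame's history.
    return clamped_history + (current - clamped_history) * current_weight;
}
```

Because the output is fed back as the next frame's history, one buffer still accumulates many frames of samples; the clamp is what keeps stale history from ghosting.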

wicked notch May 12, 2023, 9:37 PM

#

Interesting

frank sail May 12, 2023, 9:39 PM

#

https://developer.download.nvidia.com/gameworks/events/GDC2016/msalvi_temporal_supersampling.pdf

wicked notch May 12, 2023, 9:39 PM

#

It's always either Marco Salvi or Akenine-Moller

frank sail May 12, 2023, 9:39 PM

#

oh hi marc(o)

wicked notch May 12, 2023, 9:39 PM

#

They wrote 90% of the papers

wicked notch May 13, 2023, 7:59 PM

#

I went with ROC in the end

#

HiZ is too damn conservative for my tastes

#

Anyways TAA, I think I get the gist of the algorithm but there are still some parts I can't quite figure out *cough* neighborhood clipping

#

But we'll deal with that later.

finite yacht May 13, 2023, 8:28 PM

#

never tried but I think for ROC GL_NV_representative_fragment_test can be useful

#

could be worth a try, you just need to do glEnable(GL_REPRESENTATIVE_FRAGMENT_TEST_NV)

wicked notch May 13, 2023, 8:29 PM

#

I'll try it right now since it seems easy enough

#

Ah it's a performance thing, let's see..

#

Damn, this actually reduced my frametimes by 1ms lol

#

Unfortunately it's NV only..

finite yacht May 13, 2023, 8:33 PM

#

yeah. Just test if it's available before enabling it ez
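The availability check itself is just a string lookup. In the sketch below the extension list is passed in as plain strings so it runs without a GL context; in a real renderer it would be filled from `glGetIntegerv(GL_NUM_EXTENSIONS, ...)` and `glGetStringi(GL_EXTENSIONS, i)`. `has_extension` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// True if `name` appears in the list of extension strings reported by the driver.
inline bool has_extension(const std::vector<std::string>& extensions, const std::string& name) {
    assert(!name.empty());
    return std::find(extensions.begin(), extensions.end(), name) != extensions.end();
}
```

The ROC pass would then only call `glEnable(GL_REPRESENTATIVE_FRAGMENT_TEST_NV)` when `has_extension(extensions, "GL_NV_representative_fragment_test")` returns true, and fall back to the normal path elsewhere.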

#

i am surprised it makes such a big difference. Wouldn't expect drawing a few aabbs and writing to an ssbo to be that expensive
