Iris - A Journey through OpenGL and beyond to learn Graphics | Graphics Programming | Page 7

frank sail Jul 10, 2023, 11:00 PM

#

devsh bingo:
he will shill his lib to you (free space)

wispy spear Jul 10, 2023, 11:02 PM

#

hehe

#

just stand your ground

distant lodge Jul 10, 2023, 11:05 PM

#

assert your dominance by shilling your lib to him

wicked notch Jul 11, 2023, 12:24 PM

#

I think I just had a huge brain moment

#

Actually nevermind

#

It's view dependent so it won't work

#

goddamnit

#

pure sadness

#

I thought marking the biggest primitive in the cluster and checking only that would be smart, but it isn't

glass sphinx Jul 11, 2023, 12:26 PM

#

frog_think

wicked notch Jul 11, 2023, 12:26 PM

#

since meshlets are disconnected, I can't guarantee that such primitive will be in view when others are

#

What does this meeean

#

How can I do this in a way that doesn't take all my goddamn compute power

distant lodge Jul 11, 2023, 12:31 PM

#

you'd think just the AABB extents would give you a close enough read on that

wicked notch Jul 11, 2023, 12:31 PM

#

yes but they won't, because meshoptimizer prioritizes topological efficiency over locality efficiency

#

Which means clusters are very much welcome to have disconnected triangles that span the entire mesh

#

https://github.com/zeux/meshoptimizer/blob/master/src/clusterizer.cpp#L600

glass sphinx Jul 11, 2023, 12:33 PM

#

wicked notch What does this meeean

as long as the AABB is smaller than 32 in any direction in screen space pixels
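
A minimal C++ sketch of that size test, assuming the cluster's AABB has already been projected to normalized device coordinates. `NdcBounds`, `is_small_on_screen`, and the default threshold parameter are illustrative names and not taken from Iris; the sketch reads "any direction" as "both screen axes".

```cpp
#include <cassert>
#include <cmath>

// Hypothetical screen-space bounds of a cluster in NDC ([-1, 1] on both axes).
struct NdcBounds {
    float min_x, min_y, max_x, max_y;
};

// Returns true when the projected AABB is under `max_pixels` on both axes.
bool is_small_on_screen(const NdcBounds& b, float width_px, float height_px,
                        float max_pixels = 32.0f) {
    assert(width_px > 0.0f && height_px > 0.0f);
    // NDC spans 2 units across the viewport, hence the 0.5 factor.
    // std::fabs guards against min/max being swapped by the projection.
    const float extent_x = std::fabs(b.max_x - b.min_x) * 0.5f * width_px;
    const float extent_y = std::fabs(b.max_y - b.min_y) * 0.5f * height_px;
    return extent_x < max_pixels && extent_y < max_pixels;
}
```

Clusters that pass would be candidates for the software rasterizer; everything else would stay on the hardware path.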

glass sphinx Jul 11, 2023, 12:34 PM

#

wicked notch Which means clusters are very much welcome to have disconnected triangles that s...

this is good for low density meshes kinda

wicked notch Jul 11, 2023, 12:34 PM

#

glass sphinx as long as the AABB is smaller than 32 in any direction in screen space pixels

I guess that's the most I can do right now yes

distant lodge Jul 11, 2023, 12:35 PM

#

oh I get what you mean, but I have a feeling that meshopt's clusterizer doesn't work like nanite's

#

and I think zeux's opinion on locality only holds if you're not making a hybrid hw-sw rasterizer

wicked notch Jul 11, 2023, 12:36 PM

#

Yes I get that, this all translates in "I have to make my clusterizer that matches my requirements"

#

I have no idea how to though bleakekw

distant lodge Jul 11, 2023, 12:36 PM

#

you could look at nanite's, or start simple

#

maybe try converting your meshes to triangle strips

#

and then subdivide those strips into equal pieces

#

that should give you some kind of locality
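
A rough C++ sketch of the "subdivide into equal pieces" step, assuming the triangles have already been reordered to follow a strip traversal (the stripification itself is not shown). `ClusterRange` and `split_strip_order` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A contiguous run of triangles in strip order that forms one cluster.
struct ClusterRange {
    std::size_t first_triangle;
    std::size_t triangle_count;
};

// Cuts a strip-ordered triangle list into clusters of at most `max_triangles`.
std::vector<ClusterRange> split_strip_order(std::size_t triangle_count,
                                            std::size_t max_triangles) {
    assert(max_triangles > 0);
    std::vector<ClusterRange> clusters;
    for (std::size_t first = 0; first < triangle_count; first += max_triangles) {
        clusters.push_back({first, std::min(max_triangles, triangle_count - first)});
    }
    return clusters;
}
```

Because neighbouring triangles in a strip share an edge, each run tends to stay spatially compact, which is the locality being asked for here.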

wicked notch Jul 11, 2023, 12:38 PM

#

I'll try to find Unreal's clusterizer, their folder structure is kinda insane KEKW

cedar seal Jul 11, 2023, 1:06 PM

#

Wouldn't something like octree give you locality?
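
One cheap way to approximate octree-style locality without building the tree is to sort triangle centroids by a Morton (Z-order) code; this is a swapped-in technique, not something proposed in the thread. The sketch below assumes centroids are normalized to [0, 1] within the mesh bounds; `quantize10` and `morton3d` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Spreads the low 10 bits of v so that two zero bits separate each input bit.
std::uint32_t expand_bits(std::uint32_t v) {
    v = (v * 0x00010001u) & 0xFF0000FFu;
    v = (v * 0x00000101u) & 0x0F00F00Fu;
    v = (v * 0x00000011u) & 0xC30C30C3u;
    v = (v * 0x00000005u) & 0x49249249u;
    return v;
}

// Maps a coordinate in [0, 1] to a 10-bit integer cell index.
std::uint32_t quantize10(float v) {
    const float clamped = std::min(std::max(v, 0.0f), 1.0f);
    return std::min<std::uint32_t>(static_cast<std::uint32_t>(clamped * 1024.0f), 1023u);
}

// Interleaves three 10-bit cell indices into a 30-bit Morton code.
std::uint32_t morton3d(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    assert(x < 1024u && y < 1024u && z < 1024u);
    return expand_bits(x) * 4u + expand_bits(y) * 2u + expand_bits(z);
}
```

Sorting triangles by `morton3d` of their quantized centroid and then cutting the sorted list into fixed-size runs groups triangles that fall in the same octree cells.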

wicked notch Jul 11, 2023, 1:18 PM

#

Possibly, I'd have to experiment

wicked notch Jul 11, 2023, 2:01 PM

#

These are all the software rasterized clusters

#

For some reason I don't understand there are clusters that are way too big

#

Ah I wasn't abs'ing properly

#

Now it's good

#

Not gonna lie it's kinda cool to see it work in practice, even if 50% of the potentially software rasterizable clusters are hardware rasterized...

#

Looks pretty good

#

(still no culling btw)

wicked notch Jul 11, 2023, 2:44 PM

#

With everything enabled it's absurdly fast

#

You can kinda see when the rasterizer chokes because of small tris, for 100 microseconds or so vertex occupancy is great but pixel occupancy is very low, then for the next 200 microseconds everything is good

wispy spear Jul 11, 2023, 4:05 PM

#

you are an animal

#

Nanite 2.0 soon

cedar seal Jul 11, 2023, 6:57 PM

#

Frognite

wicked notch Jul 11, 2023, 6:58 PM

#

Yes

#

That'll be the name of this

cedar seal Jul 11, 2023, 7:01 PM

#

Is this iris thing somewhere in github?

wispy spear Jul 11, 2023, 7:01 PM

#

check pins

frank sail Jul 11, 2023, 7:04 PM

#

da peeens

wicked notch Jul 11, 2023, 7:23 PM

#

so

#

here's Sponza

#

very cute huh?

#

except uh

#

Perhaps I subdivided it a bit too much

#

bleakekw

frank sail Jul 11, 2023, 7:25 PM

#

you got that signature nanite look though

wispy spear Jul 11, 2023, 7:26 PM

#

now add Lumen 2.0

#

perhaps call it Nemul 🙂

wicked notch Jul 11, 2023, 7:27 PM

#

GI is terrifying

distant lodge Jul 11, 2023, 7:28 PM

#

you're right

wispy spear Jul 11, 2023, 7:29 PM

#

says the person who just reshrimplemented Nanite

frank sail Jul 11, 2023, 7:29 PM

#

wicked notch GI is terrifying

that's what my dad's gastroenterologist said

wicked notch Jul 11, 2023, 7:29 PM

#

wispy spear says the person who just reshrimplemented Nanite

I implemented only the easy part KEKW

#

The hard part is the LOD DAG

wispy spear Jul 11, 2023, 7:29 PM

#

ah

#

thats where you should also talk to devsh

wicked notch Jul 11, 2023, 7:29 PM

#

yes

wispy spear Jul 11, 2023, 7:30 PM

#

he has a little video series on his discord explaining it

wicked notch Jul 11, 2023, 7:32 PM

#

btw I'm curious to check if I effectively wasted my time, or if this is actually better

#

so I'm packaging this high poly sponza™️ for you guys to test

distant lodge Jul 11, 2023, 7:35 PM

#

oh wtf so he actually implemented his own meshleting LOD

#

I'm hyped to see if it works on my 760

#

hyped to see if I even have the video memory for it

frank sail Jul 11, 2023, 7:36 PM

#

I don't think lod is in yet

#

But I'm hyped

#

Oh you're talking about devsh

distant lodge Jul 11, 2023, 7:37 PM

#

yeah

wispy spear Jul 11, 2023, 7:42 PM

#

my 2x780ti were able to run devsh's LOD thing with 25mio tringles iirc, 1.5years ago

wicked notch Jul 11, 2023, 8:06 PM

#

https://drive.google.com/file/d/1szaZTtSy3IHWrrx84XNjFFMMWnj4Zkz0/view?usp=sharing

#

Go ahead boys, post screenshots and perf data

wispy spear Jul 11, 2023, 8:09 PM

#

only if it runs on lunix too 😛

wicked notch Jul 11, 2023, 8:09 PM

#

It's just a model

#

The high poly sponza

#

To test with your own schtuff

wispy spear Jul 11, 2023, 8:10 PM

#

ah

#

fuck

#

im blind now

wicked notch Jul 11, 2023, 8:11 PM

#

Here's my perf data

#

1.96ms to rasterize
2.22ms total with textures
no culling

wispy spear Jul 11, 2023, 8:13 PM

#

1.x GB 😛

#

frametime 41ms

wicked notch Jul 11, 2023, 8:19 PM

#

Hmm, can I see what's taking so much time

wispy spear Jul 11, 2023, 8:19 PM

#

wicked notch Jul 11, 2023, 8:19 PM

#

In NSight

raven orchid Jul 11, 2023, 8:32 PM

#

wicked notch Here's my perf data

Can nsight generate this for older gpus?

#

I sometimes get “not supported on this hardware” messages with parts of the profiler

wicked notch Jul 11, 2023, 8:34 PM

#

Actually yeah, I forgor that GPU trace only works with new GPUs

#

You should still have the frame profiler though

raven orchid Jul 11, 2023, 8:35 PM

#

yeah frame profiler should work

#

dang wish the full profiler worked 😦

#

for this mesh did you subdivide it in your thing and then export or how did you gen?

wicked notch Jul 11, 2023, 8:37 PM

#

Regular catmull-clark subdivision

frank sail Jul 11, 2023, 8:37 PM

#

wicked notch Actually yeah, I forgor that GPU trace only works with new GPUs

I had the gpu trace option available in older versions of nsight, but I never tried it with opengl

wicked notch Jul 11, 2023, 8:37 PM

#

I downloaded some library

frank sail Jul 11, 2023, 8:37 PM

#

I think demongod said it doesn't work on his pascal gpu though

raven orchid Jul 11, 2023, 8:37 PM

#

yeah with opengl it gets mad for some reason

frank sail Jul 11, 2023, 8:38 PM

#

also make sure you run nsight as admin or sudo

raven orchid Jul 11, 2023, 8:38 PM

#

well it gets mad for 2 reasons: A) it does not like pascal

frank sail Jul 11, 2023, 8:38 PM

#

that's another classic

raven orchid Jul 11, 2023, 8:38 PM

#

B) does not support anything but DX12 or VK

frank sail Jul 11, 2023, 8:38 PM

#

I can use it just fine for my opengl apps

#

just not the shader profiler

raven orchid Jul 11, 2023, 8:38 PM

#

can you profile the shaders

#

ah yeah

#

😦

#

same

frank sail Jul 11, 2023, 8:38 PM

#

gpu trace is good enough for me 90% of the time

#

e.g., if it says you're vram bottlenecked, it's generally quite obvious what lines would impact that

raven orchid Jul 11, 2023, 8:41 PM

#

yeah true, usually you can frogdelet through the code for that stage and guess the bad parts

wicked notch Jul 11, 2023, 8:54 PM

#

mr jaker

#

what do you have to say about this kind of occupancy

frank sail Jul 11, 2023, 8:55 PM

#

#

register usage seems kinda high, but I'm not sure how to interpret the graph exactly

#

oh this is my own thing lol

wicked notch Jul 11, 2023, 8:57 PM

#

bleakekw

frank sail Jul 11, 2023, 8:57 PM

#

what mesh are you drawing

wicked notch Jul 11, 2023, 8:57 PM

#

smh doesn't even recognize his own brainchild

wicked notch Jul 11, 2023, 8:57 PM

#

frank sail what mesh are you drawing

The ultrahighpoly sponza

frank sail Jul 11, 2023, 8:57 PM

#

I see

#

how many primitives

wicked notch Jul 11, 2023, 8:57 PM

#

25 million

frank sail Jul 11, 2023, 8:58 PM

#

I mean in terms of draw calls

#

Doesn't seem like too many tbh, which is good

wicked notch Jul 11, 2023, 8:58 PM

#

Ah yeah, it's very little

frank sail Jul 11, 2023, 8:58 PM

#

All I can say is that the RSM pass is the only thing I attempted to optimize

#

The rest is just a very naive deferred renderer

wicked notch Jul 11, 2023, 8:59 PM

#

25 draw calls btw

frank sail Jul 11, 2023, 9:00 PM

#

Lol the sample RSM pass is super L2 bound because it's doing random samples

raven orchid Jul 11, 2023, 9:00 PM

#

I tried to open it in blender and i don’t have enough ram for that lol

wicked notch Jul 11, 2023, 9:01 PM

#

raven orchid I tried to open it in blender and i don’t have enough ram for that lol

hehe, yeah you need too much RAM to do stuff in blender

frank sail Jul 11, 2023, 9:02 PM

#

wicked notch what do you have to say about this kind of occupancy

what resolution btw

wicked notch Jul 11, 2023, 9:02 PM

#

1080p

#

No FSR2

frank sail Jul 11, 2023, 9:02 PM

#

nice

#

fsr2 performs horribly on nv anyways

#

also, this is definitely vertex bound

wicked notch Jul 11, 2023, 9:03 PM

#

do I have your permission to implement my thing in frogfooding

frank sail Jul 11, 2023, 9:03 PM

#

what thing frogstare

#

meshlet renderer?

wicked notch Jul 11, 2023, 9:04 PM

#

yes, without hybrid sw/hw though

#

Because I can't do that in GL 😦

frank sail Jul 11, 2023, 9:04 PM

#

does that require mesh shading?

wicked notch Jul 11, 2023, 9:04 PM

#

no

#

good old vertex shader

#

I am planning to switch back to vertex shaders as well

#

I like being able to run on more than 0.01% of systems bleakekw

frank sail Jul 11, 2023, 9:06 PM

#

does your technique require any deps

#

besides maybe meshopt

wicked notch Jul 11, 2023, 9:06 PM

#

just meshopt

frank sail Jul 11, 2023, 9:07 PM

#

yeah you can add it if you want

#

though

#

it's in very early dev atm

#

I didn't originally plan frogfood as a collaborative thingy, but this feature seems lit enough that I'm willing to try

wicked notch Jul 11, 2023, 9:10 PM

#

Epique

frank sail Jul 11, 2023, 9:11 PM

#

wicked notch do I have your permission to implement my thing in frogfooding

does your thingy require 64 bit atomics or neh

wicked notch Jul 11, 2023, 9:11 PM

#

nope

#

I need that only because of hybrid sw/hw raster

frank sail Jul 11, 2023, 9:11 PM

#

oh because regular shaders

#

noice

wicked notch Jul 11, 2023, 9:11 PM

#

If I were able to write to D32 attachments from compute life would be so much easier (and no 64 bit atomics either)

frank sail Jul 11, 2023, 9:12 PM

#

you can write to a D32 attachment and then copy

#

oops I mean R32

#

it's ugly, but works

wicked notch Jul 11, 2023, 9:12 PM

#

How do you merge 2 D32 attachments?

frank sail Jul 11, 2023, 9:13 PM

#

wdym

wicked notch Jul 11, 2023, 9:14 PM

#

Uhhh

#

It could work?

#

I mean I need to store meshlet_id and primitive_id

#

so how would I update the meshlet_id and primitive_id only if the depth test passed?

frank sail Jul 11, 2023, 9:15 PM

#

cas loop bleakekw

#

but yeah just frogment shader is fine

wicked notch Jul 11, 2023, 9:16 PM

#

I could yolo it like this:

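```glsl
// Non-atomic depth test: two invocations can race between the load and the stores.
const float stored_depth = uintBitsToFloat(imageLoad(depth, position).x);
if (stored_depth > current_depth) {
    // YOLO
    imageStore(depth, position, uvec4(floatBitsToUint(current_depth), 0u, 0u, 0u));
    imageStore(visbuffer, position, uvec4((meshlet_id << 24) | primitive_id, 0u, 0u, 0u));
}
```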
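
For contrast, here is a CPU-side C++ sketch of why 64-bit atomics make this safe: depth goes in the high 32 bits and the payload in the low 32 bits, so a single atomic "min" performs the depth test and the ID write together. `pack_visibility` and `write_if_closer` are hypothetical names, and the compare-and-swap loop stands in for a 64-bit image atomic on the GPU.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Packs depth into the high 32 bits so comparing the 64-bit words compares
// depth first. Assumes depth >= 0, where IEEE-754 bit patterns sort in the
// same order as the float values they encode.
std::uint64_t pack_visibility(float depth, std::uint32_t payload) {
    assert(depth >= 0.0f);
    std::uint32_t depth_bits;
    std::memcpy(&depth_bits, &depth, sizeof(depth_bits));
    return (std::uint64_t{depth_bits} << 32) | payload;
}

// Atomic "min" via compare-and-swap: keeps the nearest fragment.
// Returns true when this fragment won the depth test.
bool write_if_closer(std::atomic<std::uint64_t>& pixel, float depth,
                     std::uint32_t payload) {
    const std::uint64_t candidate = pack_visibility(depth, payload);
    std::uint64_t current = pixel.load(std::memory_order_relaxed);
    while (candidate < current) {
        // On failure `current` is refreshed and the comparison is retried.
        if (pixel.compare_exchange_weak(current, candidate,
                                        std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}
```

The pixel would be cleared to all ones (the farthest possible value) before rasterizing.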

#

bleakekw

frank sail Jul 11, 2023, 9:17 PM

#

ship it

glass sphinx Jul 11, 2023, 9:17 PM

#

cyberpoonk

frank sail Jul 11, 2023, 9:19 PM

#

wicked notch I could yolo it like this: ```glsl const float depth = uintBitsToFloat(imageLoad...

what we need is EXT_compute_shader_interlock smart

wicked notch Jul 11, 2023, 9:29 PM

#

The only thing that's kinda meh with the old vertex shader approach is the preprocessing step

#

It takes a good third of the raster time*

#

Perhaps one could cache this

raven orchid Jul 11, 2023, 9:30 PM

#

Reprocesses all vertices each frame?

wicked notch Jul 11, 2023, 9:31 PM

#

Yeah, it builds an index buffer with meshlet_id << 7 | primitive_id & 0x7f

frank sail Jul 11, 2023, 9:31 PM

#

chonky

wicked notch Jul 11, 2023, 9:31 PM

#

The shader's super easy as well

#

```glsl
shared uint base_index[MESHLET_PER_WORK_GROUP];
shared uint base_primitive[MESHLET_PER_WORK_GROUP];
shared uint primitive_count[MESHLET_PER_WORK_GROUP];

void main() {
    const uint meshlet_base_id = gl_WorkGroupID.x * MESHLET_PER_WORK_GROUP;
    const uint meshlet_offset = gl_LocalInvocationID.x / 64;
    const uint meshlet_id = meshlet_base_id + meshlet_offset;
    const uint local_id = gl_LocalInvocationID.x % 64;
    const uint index = local_id * 3;

    if (meshlet_id < meshlet_count && local_id == 0) {
        const meshlet_t meshlet = meshlets[meshlet_id];
        base_index[meshlet_offset] = atomicAdd(o_command.index_count, meshlet.primitive_count * 3);
        base_primitive[meshlet_offset] = meshlet.base_primitive;
        primitive_count[meshlet_offset] = meshlet.primitive_count;
    }
    barrier();
    if (meshlet_id < meshlet_count && local_id < primitive_count[meshlet_offset]) {
        o_meshlet.indices[base_index[meshlet_offset] + index + 0] = (meshlet_id << 8u) | (primitives[base_primitive[meshlet_offset] + index + 0] & 0xffu);
        o_meshlet.indices[base_index[meshlet_offset] + index + 1] = (meshlet_id << 8u) | (primitives[base_primitive[meshlet_offset] + index + 1] & 0xffu);
        o_meshlet.indices[base_index[meshlet_offset] + index + 2] = (meshlet_id << 8u) | (primitives[base_primitive[meshlet_offset] + index + 2] & 0xffu);
    }
}
```
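
The packing used for each index can be checked on the CPU. This is a small C++ sketch that assumes the same 8-bit split as the `<< 8u` and `0xffu` in the shader; the function names are illustrative, and the vertex shader would perform the matching unpack.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kLocalBits = 8u;                       // low bits: meshlet-local index
constexpr std::uint32_t kLocalMask = (1u << kLocalBits) - 1u;  // 0xff

// Combines a meshlet ID and a meshlet-local index into one 32-bit index.
std::uint32_t pack_index(std::uint32_t meshlet_id, std::uint32_t local_index) {
    assert(meshlet_id < (1u << (32u - kLocalBits)));  // only 24 bits are left for the ID
    return (meshlet_id << kLocalBits) | (local_index & kLocalMask);
}

std::uint32_t unpack_meshlet_id(std::uint32_t packed) { return packed >> kLocalBits; }
std::uint32_t unpack_local_index(std::uint32_t packed) { return packed & kLocalMask; }
```

With this split the scheme tops out at about 16.7 million meshlets per draw, since the ID has to fit in 24 bits.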

wispy spear Jul 11, 2023, 9:32 PM

#

FFU 🙂

#

should have a name too

wicked notch Jul 11, 2023, 9:33 PM

#

```glsl
#define PRIMITIVE_MASK (MAX_PRIMITIVES - 1)
```
bleakekw

frank sail Jul 11, 2023, 9:33 PM

#

not too bad

wicked notch Jul 11, 2023, 9:51 PM

#

I made it better

#

One less memory load goes a long way

wispy spear Jul 11, 2023, 9:53 PM

#

how many ms does that save?

wicked notch Jul 11, 2023, 9:53 PM

#

About half

#

0.29 -> 0.15

wispy spear Jul 11, 2023, 9:53 PM

#

not bad

#

means more juice for potential denoisers/giisms later

wicked notch Jul 11, 2023, 9:54 PM

#

I feel like the driver is hiding some more juice though

#

The hardware itself too

wispy spear Jul 11, 2023, 9:55 PM

#

that is why you should write about all that

#

and make it available for others outside our GP bubble

raven orchid Jul 11, 2023, 10:03 PM

#

How does nanite do lods? For some reason I didn’t think they were anymore

distant lodge Jul 11, 2023, 10:03 PM

#

the LODs are like the bread and butter technically speaking

wicked notch Jul 11, 2023, 10:03 PM

#

raven orchid How does nanite do lods? For some reason I didn’t think they were anymore

It's clamplicated

#

Given a list of N clusters, they build a DAG where the leaves are the most detailed LODs

glass sphinx Jul 11, 2023, 10:04 PM

#

raven orchid How does nanite do lods? For some reason I didn’t think they were anymore

nanite lodding is like 66% of the point of nanite

#

virtualized geo makes it possible to draw so much

raven orchid Jul 11, 2023, 10:05 PM

#

Yeah I guess when I hear lod I think chunky stuff from older games

glass sphinx Jul 11, 2023, 10:05 PM

#

its really complex, especially the hierarchy building is very complex and involved

#

nanite can dynamically lod parts of objects even

wicked notch Jul 11, 2023, 10:06 PM

#

Then for each LOD level until we reach the root, we take M clusters (let's say M=4) with the most shared boundaries, merge them, and shrimplify the result so that it has half the triangles of the merged group. After that we split it into two clusters, and those become the parents

finite yacht Jul 11, 2023, 10:06 PM

#

frank sail what we need is EXT_compute_shader_interlock smart

finally someone sees it

wicked notch Jul 11, 2023, 10:06 PM

#

We simplify clusters in groups because otherwise we would have cracks where cluster boundaries don't match, so we try to group clusters with the most shared boundaries, and leave the outer boundaries of the group unchanged

#

This is repeated until we reach the root which is always MAX_PRIMITIVES triangles
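
To get a feel for how quickly the DAG narrows, here is a counting-only C++ sketch. It does not simplify any geometry; it just assumes each group of up to 4 clusters is halved and re-split as described above, so a group of g clusters yields about g / 2 parents. `dag_level_sizes` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the number of clusters at each level, from the leaves (index 0)
// up to the single root cluster.
std::vector<std::size_t> dag_level_sizes(std::size_t leaf_clusters,
                                         std::size_t group_size = 4) {
    assert(leaf_clusters > 0 && group_size >= 2);
    std::vector<std::size_t> levels{leaf_clusters};
    while (levels.back() > 1) {
        const std::size_t n = levels.back();
        const std::size_t full_groups = n / group_size;
        const std::size_t leftover = n % group_size;
        // Halving the triangle count of a group and re-splitting it gives
        // roughly half as many clusters, rounded up.
        const std::size_t parents =
            full_groups * ((group_size + 1) / 2) + (leftover + 1) / 2;
        levels.push_back(parents);
    }
    return levels;
}
```

Under these assumptions the cluster count roughly halves per level, so a mesh with N leaf clusters ends up with about log2(N) levels.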

raven orchid Jul 11, 2023, 10:07 PM

#

That’s super cool

#

So right now is yours rendering tons of stuff but at max lod?

wicked notch Jul 11, 2023, 10:08 PM

#

How to choose the correct cut for the DAG is even more clamplicated, it's based on screen space projected error using a quadric error metric
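
The "screen space projected" half of that can be sketched on its own. The C++ below assumes a perspective camera and a per-cluster world-space error value that the simplifier would supply; it is the standard projection of a length to pixels, not Nanite's actual metric, and `projected_error_pixels` is an illustrative name.

```cpp
#include <cassert>
#include <cmath>

// Approximate on-screen size, in pixels, of a world-space error seen at
// `distance` from a perspective camera with vertical field of view
// `fov_y_radians` and a viewport `viewport_height_px` pixels tall.
float projected_error_pixels(float error_world, float distance,
                             float fov_y_radians, float viewport_height_px) {
    assert(distance > 0.0f);
    // Pixels covered by one world unit at distance 1 from the camera.
    const float pixels_per_unit =
        viewport_height_px / (2.0f * std::tan(fov_y_radians * 0.5f));
    return error_world * pixels_per_unit / distance;
}
```

A cut through the DAG would then pick, per region, the coarsest cluster whose projected error stays under some threshold such as one pixel.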

wicked notch Jul 11, 2023, 10:08 PM

#

raven orchid So right now is yours rendering tons of stuff but at max lod?

yes, which is very suboptimal bleakekw

#

But honestly the error thing flew so high over my head that I'm thinking of not implementing that bleakekw

#

Brian Karis himself said it took over one year of full development just to get the error metric right

raven orchid Jul 11, 2023, 10:11 PM

#

Wow

#

Rnd for nanite and lumen was extreme

wicked notch Jul 11, 2023, 10:12 PM

#

https://youtu.be/eviSykqSUUw?t=1202

#

Everytime I hear this I get scared frog_gone

raven orchid Jul 11, 2023, 10:13 PM

#

Oh yeah 1 man year he said

wispy spear Jul 11, 2023, 10:14 PM

#

its supposed to be simple to be implemented

#

otherwise they wouldntve said "simplify"

wicked notch Jul 12, 2023, 3:22 PM

#

I was thinking of doing some good old shading

#

lighting, shadowing, ~~real time SDF based global illumination~~

#

But for shadows I would need to call the whole Frognite™️ (patent pending) pipeline N times

#

So multiview?

#

Do I just dispatch more workgroups (for software)?

#

And I guess I could just use native multiview for mesh shaders

wicked notch Jul 12, 2023, 9:18 PM

#

Aaaaaaaand gltfpack can't properly convert normal maps to BC5_RG

#

epic

frank sail Jul 12, 2023, 9:18 PM

#

what does it do?

wicked notch Jul 12, 2023, 9:18 PM

#

It expects the normal map to be transcoded to BC1_RGB

#

so it doesn't swizzle the green channel of the normal map to its alpha channel

frank sail Jul 12, 2023, 9:19 PM

#

can't wait for your release of 2gltf2pack

#

after you implement Frognite™️ in Frogfood®️ ofc

wicked notch Jul 12, 2023, 9:20 PM

#

I do plan on releasing Frogmen™️ too

#

(lumen but with frogs)

#

SDFDDGI caught my eye

frank sail Jul 12, 2023, 9:20 PM

#

more like Lufrog

wispy spear Jul 12, 2023, 9:24 PM

#

Froglight

wicked notch Jul 12, 2023, 9:37 PM

#

Before I'll make ridiculous breaking changes

#

Here's some stupid N dot L * base_color shading bleakekw

#

#

Next up: https://github.com/microsoft/DirectXMesh

#

zeux says "it doesn't take into consideration topology very much"

#

Which is exactly what I need KEKW

#

I wonder how many curses I will get from Khronos for integrating DirectX tools with their pristine Vulkan™️ API

#

I also need to start thinking seriously about multiview

#

I am not sure I can use VK_KHR_multiview because I don't bind any color attachments whatsoever in my entire pipeline

#

(except for output to swapchain)

#

```c
typedef struct VkRenderPassMultiviewCreateInfo {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           subpassCount;
    const uint32_t*    pViewMasks;
    uint32_t           dependencyCount;
    const int32_t*     pViewOffsets;
    uint32_t           correlationMaskCount;
    const uint32_t*    pCorrelationMasks;
} VkRenderPassMultiviewCreateInfo;
```
I'm really not sure how to use this bleakekw
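
The core of that struct is the view masks: each entry of `pViewMasks` is a bitfield for one subpass, where bit i set means view i is rendered, and `pCorrelationMasks` uses the same encoding to hint which views are spatially related. A minimal C++ sketch of building such a mask, with `make_view_mask` as a hypothetical helper (it deliberately avoids the Vulkan headers):

```cpp
#include <cassert>
#include <cstdint>

// Builds a mask with the low `view_count` bits set (views 0..view_count-1).
// View masks are 32-bit, so at most 32 views fit in one mask.
std::uint32_t make_view_mask(std::uint32_t view_count) {
    assert(view_count >= 1u && view_count <= 32u);
    return view_count == 32u ? 0xFFFFFFFFu : (1u << view_count) - 1u;
}
```

For example, `make_view_mask(4)` would go into `pViewMasks[0]` to broadcast a single subpass to four layers. The usable view count is still bounded by the implementation's `maxMultiviewViewCount` limit.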

wispy spear Jul 12, 2023, 9:44 PM

#

i also noticed something funny

#

our plugin shenanigans exposed quantization via gltf, but the plugin itself has some quantization options already

wicked notch Jul 12, 2023, 9:47 PM

#

Does blender document their quantization shenanigans

#

gltfpack is open (blender is too but good luck navigating the 10 million loc black box of stuff bleakekw )

wispy spear Jul 12, 2023, 9:47 PM

#

i have not checked tbh

#

and its the gltf plugin i was talking about not blender itself

wicked notch Jul 12, 2023, 9:48 PM

#

Ah it's a plugin

#

By the way

#

Some implementations may not support multiview in conjunction with mesh shaders, geometry shaders or tessellation shaders.

#

This is very sad 😭

wispy spear Jul 12, 2023, 9:50 PM

#

fook

wicked notch Jul 12, 2023, 9:50 PM

#

Why would you not support multiview with the feature that literally needs multiview more than any other feature bleakekw

wispy spear Jul 12, 2023, 9:51 PM

#

I have no idea what multiview is

wicked notch Jul 12, 2023, 9:51 PM

#

basically fancy gl_Layer

wispy spear Jul 12, 2023, 9:51 PM

#

ah

wicked notch Jul 12, 2023, 9:51 PM

#

if you make a framebuffer with layers, you can draw to each layer by writing to gl_Layer

#

multiview gives you gl_ViewIndex and it's read-only

wispy spear Jul 12, 2023, 9:52 PM

#

yeah, rather than employing a gs

wicked notch Jul 12, 2023, 9:52 PM

#

You set the number of views you want to render to when you set the thing up, and then all the draw commands are broadcast to each view

#

Nanite does this to render all shadow maps, for all lights, in all viewports simultaneously

wispy spear Jul 12, 2023, 9:53 PM

#

oof that sounds neat

#

something i need too at some point for my shadowisms 🙂

#

or lightprobes perhaps?

frank sail Jul 12, 2023, 9:54 PM

#

Ah so that's how they do vsm

wicked notch Jul 12, 2023, 10:11 PM

#

ye

#

I wonder how to do multiview for compute

wispy spear Jul 12, 2023, 10:11 PM

#

glMultiDispatch

frank sail Jul 12, 2023, 10:12 PM

#

It would be cool if we could schedule barriers from the GPU

wicked notch Jul 12, 2023, 10:14 PM

#

True but I don't think I need barriers

#

Perhaps I could shrimply put the unused y dimension to use

frank sail Jul 12, 2023, 10:17 PM

#

or the hidden w dimension that driver devs don't want you to know about

wicked notch Jul 12, 2023, 10:17 PM

#

The only issue I have is with memory

#

but it's only a couple million uints

#

so a couple megabytes at worst will become a couple hundred

#

With 100 multiviews
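
That estimate is plain multiplication, and the same layout gives the per-view offset if the dispatch's y coordinate is treated as the view index, as floated above. A small C++ sketch, with hypothetical names and the assumption that every view gets its own fixed-size slice of one buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Total size of a buffer that holds `entries_per_view` uints for each view.
std::size_t per_view_buffer_bytes(std::size_t entries_per_view,
                                  std::size_t view_count) {
    return entries_per_view * sizeof(std::uint32_t) * view_count;
}

// Flat index of an entry when views are laid out back to back. On the GPU,
// `view_index` would come from the workgroup's y coordinate.
std::size_t view_entry_offset(std::size_t view_index, std::size_t entry_index,
                              std::size_t entries_per_view) {
    assert(entry_index < entries_per_view);
    return view_index * entries_per_view + entry_index;
}
```

Two million uints is 8 MB for a single view and 800 MB for 100 views.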

frank sail Jul 12, 2023, 10:19 PM

#

what's the issue with multiview?

#

also, do you do culling for every view frog_sweat

wicked notch Jul 12, 2023, 10:19 PM

#

course I do it's extremely fast

#

HZB build + cull + classify doesn't even show up in gputrace right now

frank sail Jul 12, 2023, 10:20 PM

#

even for 100 views?

wicked notch Jul 12, 2023, 10:21 PM

#

Combined is less than 50 microsecs

frank sail Jul 12, 2023, 10:21 PM

#

waw

wicked notch Jul 12, 2023, 10:21 PM

#

frank sail even for 100 views?

Well uh frog_sweat

frank sail Jul 12, 2023, 10:21 PM

#

ah

wicked notch Jul 12, 2023, 10:22 PM

#

For 100 views you could estimate 5 milliseconds just to cull bleakekw

#

But 100 views is just a huge upperbound, I think unreal supports up to 128

#

But they cache their whole pipeline/scene and make it persist across frames and more shenanigans beyond my comprehension bleakekw

frank sail Jul 12, 2023, 10:25 PM

#

ouf

cedar seal Jul 13, 2023, 5:53 AM

#

You really should fork gltfpack. Most of the processing is in meshoptimizer, gltfpack is essentially almost "just" a wrapper for meshoptimizer.

#

Or even file issues / pull requests

wicked notch Jul 13, 2023, 1:33 PM

#

TODO
Test MaterialID depth buffering:

1. Rasterize visbuffer
2. Fullscreen triangle, load visbuffer per pixel and write gl_FragDepth = uintBitsToFloat(material_id);, depth test set to ALWAYS
3. Draw more fullscreen triangles, one per material, depth test set to EQUALS
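
The trick relies on reinterpreting the material ID's bits as a float and getting the same bits back. A C++ sketch of that round trip, with illustrative names and assuming a 32-bit float depth buffer:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit pattern of 1.0f. Positive floats up to 1.0 have bit patterns up to this
// value, so any ID at or below it lands inside the [0, 1] depth range.
constexpr std::uint32_t kOneAsBits = 0x3F800000u;

bool fits_in_unit_depth(std::uint32_t material_id) {
    return material_id <= kOneAsBits;
}

// CPU equivalent of writing the ID's raw bits as a float depth value.
float material_id_to_depth(std::uint32_t material_id) {
    assert(fits_in_unit_depth(material_id));
    float depth;
    std::memcpy(&depth, &material_id, sizeof(depth));
    return depth;
}

std::uint32_t depth_to_material_id(float depth) {
    std::uint32_t material_id;
    std::memcpy(&material_id, &depth, sizeof(material_id));
    return material_id;
}
```

One caveat to verify on real hardware: small IDs map to denormal floats, so the equality test only works if the depth path does not flush denormals to zero.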

wispy spear Jul 13, 2023, 7:27 PM

#

not sure if you are into bundles, but https://www.youtube.com/watch?v=6_BBgz5-H20

YouTube

Gamefromscratch

EPIC Environments Asset Bundle -- Best Unreal Humble Yet?

The Unreal Engine Mega Pack is a huge collection of high quality 3d environment assets.
https://www.humblebundle.com/software/unreal-engine-mega-pack-software?partner=gamefromscratch

This pack from Hivemind contains thousands of 3D objects and blueprints for creating a wide variety of maps, including Viking, Medieval, Harbours, Churches, Houses...


#

for the triangle counts 🙂

wicked notch Jul 13, 2023, 7:30 PM

#

epic

#

Hopefully the gltf exporter doesn't die as usual bleakekw

wispy spear Jul 13, 2023, 7:34 PM

#

heh

wicked notch Jul 14, 2023, 12:13 PM

#

wicked notch TODO Test MaterialID depth buffering: 1. Rasterize visbuffer 2. Fullscreen trian...

twas a success pog

wicked notch Jul 14, 2023, 6:22 PM

#

@frank sail

#

I drew a lil something

#

#

Feast your eyes upon my huge drawing skills

#

This is with regards to SMRT

#

As far as I've understood at least

wispy spear Jul 14, 2023, 6:25 PM

#

reads like al hamdu lillah

wicked notch Jul 14, 2023, 6:26 PM

#

KEKW

#

It's actually english but my writing skills shit

wispy spear Jul 14, 2023, 6:26 PM

#

heh, you haven't seen my handwriting

wicked notch Jul 14, 2023, 6:26 PM

#

white text is "Sun"
blue text is "Hit"
green text is "March until hit"

wispy spear Jul 14, 2023, 6:27 PM

#

ah lol

frank sail Jul 14, 2023, 6:36 PM

#

wicked notch

this diagram works for any ray traced shadow tbh

wicked notch Jul 14, 2023, 6:37 PM

#

How does this produce contact hardening though? thonk

#

Also you seem to define something like a "heightmap thickness" which I have no idea where it comes in

wicked notch Jul 14, 2023, 6:37 PM

#

wicked notch How does this produce contact hardening though? thonk

o I think I got it

#

the farther away we are, the less likely we are to hit any object?

frank sail Jul 14, 2023, 6:43 PM

#

Farther away means the ground-to-light ray has more time to diverge before hitting a blocker

#

Idk if you've played counter strike, but you might know in shooter games that peeking when you're near a corner will reveal the enemy more quickly than peeking from far away
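
That divergence can be put into numbers. A hedged C++ sketch for a distant light such as the sun, treating it as a cone with a small angular radius; `penumbra_width` is an illustrative name and the formula is the usual similar-triangles estimate, not anything SMRT-specific:

```cpp
#include <cassert>
#include <cmath>

// Width of the penumbra that a blocker edge casts on a receiver, for a
// distant light with the given angular radius (the sun's is about 0.27
// degrees, or roughly 0.0047 radians).
float penumbra_width(float blocker_to_receiver_distance,
                     float light_angular_radius_radians) {
    assert(blocker_to_receiver_distance >= 0.0f);
    return 2.0f * blocker_to_receiver_distance *
           std::tan(light_angular_radius_radians);
}
```

The width grows linearly with the gap between blocker and receiver, so shadows are sharp at the contact point and soften with distance.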

wicked notch Jul 14, 2023, 6:46 PM

#

yes

#

I get that, what about the height map thickness though?

frank sail Jul 14, 2023, 6:47 PM

#

well that's just a heuristic

#

the depth map doesn't tell us the true geometry of everything behind it, so we have to guess somehow

#

so we just say the depth map is a solid wall of N thickness with absolutely nothing behind it (except for the surface we're shading)
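
A simplified C++ sketch of that heuristic, assuming the ray has already been stepped through shadow-map space and the depth under each step has been fetched. Depths are measured from the light (smaller means closer), and `ray_is_blocked` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ray_depths[i]    : depth of the shadow ray at step i, as seen from the light
// shadow_depths[i] : shadow-map depth at the texel under step i
// The occluder is assumed to be a slab starting at the shadow-map depth and
// extending `thickness` behind it, with nothing beyond the slab.
bool ray_is_blocked(const std::vector<float>& ray_depths,
                    const std::vector<float>& shadow_depths, float thickness) {
    assert(ray_depths.size() == shadow_depths.size());
    for (std::size_t i = 0; i < ray_depths.size(); ++i) {
        const float front = shadow_depths[i];
        const float back = front + thickness;
        if (ray_depths[i] > front && ray_depths[i] < back) {
            return true;  // the ray is inside the assumed slab
        }
    }
    return false;  // the ray passed in front of or behind every slab
}
```

A thickness that is too small lets rays slip behind occluders and leak light; one that is too large turns thin objects into walls.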

wicked notch Jul 14, 2023, 6:50 PM

#

So instead of the tree we have a huge wall

frank sail Jul 14, 2023, 6:50 PM

#

unreal uses some additional heuristics that are suggested by console commands

wicked notch Jul 14, 2023, 6:50 PM

#

From the perspective of the ray at least

frank sail Jul 14, 2023, 6:50 PM

#

yeah, it's a wall with the outline of a tree

wicked notch Jul 14, 2023, 6:50 PM

#

I'm testing SMRT in unreal right now lol

#

This is SMRT with 1spp and 1rpp

#

pretty blocky

frank sail Jul 14, 2023, 6:51 PM

#

btw the UE docs don't cover the new SMRT console commands

#

they remained the same since the 5.0 release

wicked notch Jul 14, 2023, 6:52 PM

#

Now the obligatory question

#

What are the disadvantages of SMRT?

#

To me it looks like free contact hardening lol

#

Also what happens when the blocker is not in this cascade thonk

frank sail Jul 14, 2023, 6:58 PM

#

it has light leaking

frank sail Jul 14, 2023, 6:58 PM

#

wicked notch Also what happens when the blocker is not in this cascade <:thonk:72884392039337...

idk bleakekw

#

you ought to be able to trace within multiple cascades I guess

wicked notch Jul 14, 2023, 6:59 PM

#

damn

#

Tracing within multiple cascades sounds baad

#

Let's say we got 8 shadow rays

#

8 steps per cascade

#

On average we can assume the blocker has a 50% probability of being in the current cascade

frank sail Jul 14, 2023, 7:01 PM

#

what I mean is just switching to a different cascade when you go outside the bounds of one

wicked notch Jul 14, 2023, 7:01 PM

#

hm

frank sail Jul 14, 2023, 7:01 PM

#

the shadow ray is typically pretty short (even when the blocker is far away, you just terminate the ray), so it's unlikely you'll ever go between more than two

#

at least according to my mental heuristics (I haven't actually implemented this with >1 cascade lel )

wicked notch Jul 14, 2023, 7:05 PM

#

How do I know when I'm out of bounds though thonk

#

shadow_clip_pos.xy >= 1.0 or <= 0.0?

#

Or maybe when the ray is above the shadow map?

frank sail Jul 14, 2023, 7:14 PM

#

wicked notch shadow_clip_pos.xy >= 1.0 or <= 0.0?

yeah I guess
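
A minimal C++ sketch of that bounds check and cascade switch, assuming the sample has already been projected into each cascade's [0, 1] shadow-map coordinates and that cascades are ordered from finest to coarsest. `ShadowUv` and `select_cascade` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Position of a ray sample in one cascade's shadow-map coordinates.
struct ShadowUv {
    float u, v;
};

bool inside_unit_square(ShadowUv uv) {
    return uv.u >= 0.0f && uv.u <= 1.0f && uv.v >= 0.0f && uv.v <= 1.0f;
}

// Returns the index of the first cascade that contains the sample,
// or -1 when the sample lies outside every cascade.
int select_cascade(const std::vector<ShadowUv>& uv_per_cascade) {
    assert(!uv_per_cascade.empty());
    for (std::size_t i = 0; i < uv_per_cascade.size(); ++i) {
        if (inside_unit_square(uv_per_cascade[i])) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

During the march, a step that falls outside the current cascade would re-run this selection and continue in whichever cascade it returns.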

glass sphinx Jul 14, 2023, 11:12 PM

#

btw do you do entity culling on the gpu?

wicked notch Jul 14, 2023, 11:16 PM

#

yes

#

In the "cull_and_classify" step

glass sphinx Jul 14, 2023, 11:53 PM

#

froge_love manual barriers

glass sphinx Jul 15, 2023, 12:00 AM

#

wicked notch In the "cull_and_classify" step

isnt that just culling meshlets?

#

not full entities

#

im looking at cull_classify.comp

wicked notch Jul 15, 2023, 12:01 AM

#

Ah right now it's not the latest version

#

But yes, I am culling meshlets

#

meshlet instances

glass sphinx Jul 15, 2023, 12:01 AM

#

but not entities before that

wicked notch Jul 15, 2023, 12:01 AM

#

What do you mean by entity?

glass sphinx Jul 15, 2023, 12:01 AM

#

i guess im asking if you cull full meshes

#

before you cull the meshlets

wicked notch Jul 15, 2023, 12:02 AM

#

Ah, nope but good call

#

I should probably do that as well

glass sphinx Jul 15, 2023, 12:02 AM

#

i was gonna ask how if you did

#

because its actually annoying as fuck

#

because you have an asymmetric work expansion from mesh to meshlets

#

each mesh can return a different count of meshlets

#

so you need to either do an indirect draw count, where you cull in the vertex shader and discard all vertices, starting MESH_COUNT draws each having an indirect draw with PER_MESH_MESHLET_COUNT as the number of vertices

#

or you do compute work expansion

#

by doing a prefix sum and then binary search
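
The prefix-sum-plus-binary-search mapping can be sketched on the CPU. The C++ below assumes one entry per surviving mesh holding its meshlet count; on the GPU the scan would be a parallel prefix sum and each thread would run the search itself. `WorkItem`, `inclusive_prefix_sum`, and `map_thread` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

struct WorkItem {
    std::uint32_t mesh_index;       // index into the list of surviving meshes
    std::uint32_t meshlet_in_mesh;  // meshlet index local to that mesh
};

// sums[i] = total number of meshlets in surviving meshes 0..i.
std::vector<std::uint32_t> inclusive_prefix_sum(
    const std::vector<std::uint32_t>& meshlet_counts) {
    std::vector<std::uint32_t> sums(meshlet_counts.size());
    std::partial_sum(meshlet_counts.begin(), meshlet_counts.end(), sums.begin());
    return sums;
}

// Maps a flat thread index to (mesh, local meshlet) with a binary search.
WorkItem map_thread(const std::vector<std::uint32_t>& sums,
                    std::uint32_t thread_index) {
    assert(!sums.empty() && thread_index < sums.back());
    // First mesh whose running total exceeds the thread index.
    const auto it = std::upper_bound(sums.begin(), sums.end(), thread_index);
    const std::uint32_t mesh = static_cast<std::uint32_t>(it - sums.begin());
    const std::uint32_t first = mesh == 0u ? 0u : sums[mesh - 1u];
    return {mesh, thread_index - first};
}
```

The total thread count for the single dispatch is `sums.back()`, and meshes with zero meshlets are skipped naturally by the search.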

#

but soon tm we may get work graphs which solve this issue btw

wicked notch Jul 15, 2023, 12:04 AM

#

Hmm sounds hard indeed

#

Yeah, soon KEKW

glass sphinx Jul 15, 2023, 12:04 AM

#

this is exactly what work graphs would solve

#

sadge that we need to wait

glass sphinx Jul 15, 2023, 12:04 AM

#

glass sphinx this is exactly what work graphs would solve

with them you would just dispatch meshlet culls from the mesh culls

#

making it ultra simple and efficient

#

ALSO WHY IS THERE NO DISPATCHINDIRECTCOUNT

#

😿

#

@frank sail give it to me

wicked notch Jul 15, 2023, 12:05 AM

#

Wat

glass sphinx Jul 15, 2023, 12:05 AM

#

i need multi dispatch indirect

wicked notch Jul 15, 2023, 12:05 AM

#

??

frank sail Jul 15, 2023, 12:05 AM

#

why do you want that

wicked notch Jul 15, 2023, 12:05 AM

#

Wtf there is no dispatch indirect count

#

How

wicked notch Jul 15, 2023, 12:05 AM

#

frank sail why do you want that

What do you mean why

#

It's super useful

glass sphinx Jul 15, 2023, 12:05 AM

#

frank sail why do you want that

for the same reason i would want draw indirect count

frank sail Jul 15, 2023, 12:06 AM

#

just increase the size of the dispatch

wicked notch Jul 15, 2023, 12:06 AM

#

no

frank sail Jul 15, 2023, 12:06 AM

#

unless you also want indirect global barriers and such

glass sphinx Jul 15, 2023, 12:06 AM

#

if we get work graphs i wont care at all i wont use any indirect anymore only work graphs if they dont suck

glass sphinx Jul 15, 2023, 12:06 AM

#

frank sail just increase the size of the dispatch

ok how do i map the indices

#

prefix sum and binary search is the most efficient way aside from draw indirect count abuse

frank sail Jul 15, 2023, 12:07 AM

#

glass sphinx ok how do i map the indices

idk what problem you're trying to solve so idk

glass sphinx Jul 15, 2023, 12:07 AM

#

i need to map from global thread index to meshlet index and mesh index

#

each mesh not culled has different counts of meshlets

#

i need to iterate over the surviving meshes meshlets

#

so i cant just use the thread index to index meshlets

frank sail Jul 15, 2023, 12:07 AM

#

can you explain how you would use multi dispatch indirect

#

and how that is not mappable to regular indirect dispatch

#

if you need ordering, then you're basically asking for indirect barriers and command submission, which I agree would be cool and useful

glass sphinx Jul 15, 2023, 12:09 AM

#

i would make a buffer containing n dispatch indirect structs
each containing the meshlet count / workgroupsize rounded up as the x parameter, 1,1 for y and z
i would populate this in mesh culling, each surviving mesh appends to this buffer filling the dispatch values
then dispatch indirect count over the array of dispatch infos, each working to cull meshlets for a mesh
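A rough CPU-side sketch of the buffer described above, assuming one entry per surviving mesh. `DispatchArgs` mirrors the three-`uint32_t` layout of Vulkan's `VkDispatchIndirectCommand`; the function name and parameters are made up for illustration, and on the GPU this would be an atomic append from the mesh culling shader rather than a loop.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same field layout as VkDispatchIndirectCommand (x, y, z group counts).
struct DispatchArgs {
    std::uint32_t x, y, z;
};

// Hypothetical helper: one dispatch per surviving mesh, with x set to
// ceil(meshlet_count / workgroup_size) and y = z = 1.
std::vector<DispatchArgs> build_dispatch_args(
    const std::vector<std::uint32_t>& surviving_meshlet_counts,
    std::uint32_t workgroup_size) {
    assert(workgroup_size > 0);
    std::vector<DispatchArgs> args;
    args.reserve(surviving_meshlet_counts.size());
    for (const std::uint32_t count : surviving_meshlet_counts) {
        // Integer ceil-division; a zero count yields zero workgroups.
        const std::uint32_t groups = (count + workgroup_size - 1) / workgroup_size;
        args.push_back({ groups, 1u, 1u });
    }
    return args;
}
```

The missing piece the thread complains about is the consumer: there is a single-entry `vkCmdDispatchIndirect`, but no count variant that would walk this whole array.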

#

this can be done with task shaders as well btw

#

vkCmdDrawMeshTasksIndirectCountEXT or whatever

#

but it has the stupit shit with setting up drawing and all that for no reason

glass sphinx Jul 15, 2023, 12:10 AM

#

frank sail and how that is not mappable to regular indirect dispatch

how would you map global thread index to meshlet and mesh index

#

if you have n meshes

#

each have m[N] meshlets

#

this is why we have draw indirect count

wicked notch Jul 15, 2023, 12:11 AM

#

You could probably have a buffer with the counts of each meshlet and divide the global ID by some upper bound

frank sail Jul 15, 2023, 12:11 AM

#

what if you did an indirect dispatch where you use Y or Z to indicate how many meshlets you produced or whatever

wicked notch Jul 15, 2023, 12:12 AM

#

wicked notch You could probably have a buffer with the counts of each meshlet and divide the ...

Not sure if this would work

glass sphinx Jul 15, 2023, 12:12 AM

#

frank sail what if you did an indirect dispatch where you use Y or Z to indicate how many m...

wildly different sizes

#

its super slow like that

#

so you would get like 95% of the grid wasted

#

one way to fix this

#

is to simply not be gpu driven

wicked notch Jul 15, 2023, 12:13 AM

#

Unacceptable

glass sphinx Jul 15, 2023, 12:13 AM

#

and record a dispatch per mesh, and then use predicates or whatever they are called

#

to cull

#

but that is actually much slower

#

as you need an extra dispatch for all draws

frank sail Jul 15, 2023, 12:13 AM

#

I'm sure the issue is trivially solvable with another level of indirection

glass sphinx Jul 15, 2023, 12:14 AM

#

it is

#

i do a prefix sum

#

of an array containing meshlet counts

wicked notch Jul 15, 2023, 12:14 AM

#

prefix sum over the meshlet counts?

#

Hm

glass sphinx Jul 15, 2023, 12:14 AM

#

for each nonculled mesh

#

then i binary search for each thread in a fat dispatch

#

if they find two counts they are in between

wicked notch Jul 15, 2023, 12:14 AM

#

Each thread does binsearch?

glass sphinx Jul 15, 2023, 12:14 AM

#

they found their mesh

#

and can then subtract that mesh's prefix sum from their id

#

to get meshlet id
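The prefix sum plus binary search mapping described above, sketched on the CPU. This is only an illustration under assumed names (`MeshletRef`, `build_offsets`, `map_thread` are hypothetical); in the real thing `map_thread` would be the first few lines of every thread in the fat dispatch, reading the offsets from a buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical result: which surviving mesh a thread belongs to and which
// meshlet inside that mesh it should process.
struct MeshletRef {
    std::uint32_t mesh_index;
    std::uint32_t meshlet_index;
};

// Exclusive prefix sum over the meshlet counts of the surviving meshes.
// offsets[i] is the first global thread id owned by mesh i, and
// offsets.back() is the total number of threads to launch.
std::vector<std::uint32_t> build_offsets(const std::vector<std::uint32_t>& meshlet_counts) {
    std::vector<std::uint32_t> offsets(meshlet_counts.size() + 1, 0);
    std::partial_sum(meshlet_counts.begin(), meshlet_counts.end(), offsets.begin() + 1);
    return offsets;
}

// Per-thread work: binary search for the two offsets the id sits between,
// then subtract the lower one to get the meshlet index within that mesh.
MeshletRef map_thread(const std::vector<std::uint32_t>& offsets, std::uint32_t thread_id) {
    assert(offsets.size() >= 2 && thread_id < offsets.back());
    // First offset strictly greater than thread_id; the owning mesh is the one before it.
    const auto it = std::upper_bound(offsets.begin(), offsets.end(), thread_id);
    const auto mesh = static_cast<std::uint32_t>(it - offsets.begin()) - 1;
    return { mesh, thread_id - offsets[mesh] };
}
```

The search costs about log2(surviving mesh count) steps per thread, which fits the "ultra fast, at least for small entity counts" remark that follows.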

wicked notch Jul 15, 2023, 12:15 AM

#

glass sphinx if they find two counts they are in between

aha, that's genius

glass sphinx Jul 15, 2023, 12:15 AM

#

🤓 👆

wicked notch Jul 15, 2023, 12:15 AM

#

But isn't binsearch for each thread slow as fucc

glass sphinx Jul 15, 2023, 12:15 AM

#

i had to ponder the orb for that one

glass sphinx Jul 15, 2023, 12:15 AM

#

wicked notch But isn't binsearch for each thread slow as fucc

ultra fast

#

at least for small entity counts

#

another solution would be to have different expansion rates

#

so inside the mesh cull shader you do

test mesh
if meshlet count < 128 append to 128 buffer
else if meshlet count < 512 append to 512 buffer
...

#

then later, for each of these buffers, you dispatch with the number of entries in x, and y is the multiplier that gets you from the workgroup size to the buffer's meshlet count

wicked notch Jul 15, 2023, 12:17 AM

#

This sucks

glass sphinx Jul 15, 2023, 12:17 AM

#

this will probably waste around 70% worst case or so

wicked notch Jul 15, 2023, 12:17 AM

#

I like binsearch after all bleakekw

glass sphinx Jul 15, 2023, 12:18 AM

#

if you do it in power of two steps, it will be at most 50% waste, ignoring anything under 32 or whatever the warp size is
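A small sketch of why power-of-two buckets cap the waste at 50%, assuming each mesh goes into the smallest bucket that fits it and that the minimum bucket is 32 (the warp-size guess from the message). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power-of-two bucket (starting at min_capacity) that fits the mesh.
// min_capacity is assumed to be a power of two itself.
std::uint32_t bucket_capacity(std::uint32_t meshlet_count, std::uint32_t min_capacity = 32) {
    assert(meshlet_count > 0 && meshlet_count <= (1u << 31));
    std::uint32_t capacity = min_capacity;
    while (capacity < meshlet_count) {
        capacity *= 2;
    }
    return capacity;
}

// Fraction of the threads launched for this mesh that have no meshlet to work on.
double wasted_fraction(std::uint32_t meshlet_count, std::uint32_t min_capacity = 32) {
    const std::uint32_t capacity = bucket_capacity(meshlet_count, min_capacity);
    return static_cast<double>(capacity - meshlet_count) / static_cast<double>(capacity);
}
```

Any mesh above the minimum bucket has more than half of its bucket filled, because otherwise it would have fit the next smaller one. That is where the 50% bound comes from.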

#

i believe it is the best way to combine them

#

so have an if on massive meshes

#

like idk > 1024 meshlets

wicked notch Jul 15, 2023, 12:18 AM

#

Ye prolly the best

glass sphinx Jul 15, 2023, 12:19 AM

#

and put them in the buffer for big dispatch or so

#

but its so much work and so stupit

#

GIVE ME DISPATCH INDIRECT COUUUUNNNTTTT

#

it is actually more efficient to do a draw indirect count and cull in vertex shaders im pretty sure

#

just to not do all the shit inbetween

wicked notch Jul 15, 2023, 12:19 AM

#

I really fail to understand why there is no indirect count for dispatch

#

such a basic thing

frank sail Jul 15, 2023, 12:20 AM

#

because nobody needed it misinfo

wicked notch Jul 15, 2023, 12:20 AM

#

Well mr potrick needs it now (and I will be as well in the near future)

glass sphinx Jul 15, 2023, 12:20 AM

#

bleakekw

wicked notch Jul 15, 2023, 12:20 AM

#

So we'll be raiding Khronos HQ

frank sail Jul 15, 2023, 12:20 AM

#

whip out the copium bois

glass sphinx Jul 15, 2023, 12:20 AM

#

wicked notch I really fail to understand why there is no indirect count for dispatch

i believe it would be very easy to implement in my naive world view

#

if we get workgraphs this is all not important

#

they are indirect on all steroids at the same time

wicked notch Jul 15, 2023, 12:22 AM

#

Workgraphs would solve so many issues with GPU driven it's crazy

glass sphinx Jul 15, 2023, 12:22 AM

#

yes

frank sail Jul 15, 2023, 12:22 AM

#

did someone already mention doing a bunch of DispatchIndirect (up to a fixed max) on the CPU and letting the GPU populate each one

#

truly one of the GPU-driven strategies of all time

glass sphinx Jul 15, 2023, 12:23 AM

#

yep

glass sphinx Jul 15, 2023, 12:23 AM

#

glass sphinx and record a dispatch per mesh, and then use predicates of whatever they are cal...

.

#

i think its actually how it should be done if i wasnt full gpu driven

frank sail Jul 15, 2023, 12:24 AM

#

why do you need predication

glass sphinx Jul 15, 2023, 12:24 AM

#

i heard from the mountains that some vendors like it over 0 dispatches

wicked notch Jul 15, 2023, 12:24 AM

#

The olympus gods

frank sail Jul 15, 2023, 12:24 AM

#

can't you treat it like MultiDispatchIndirect from the GPU side (no count)

glass sphinx Jul 15, 2023, 12:24 AM

#

i guess 0 is fine

#

would be cool to loop

#

omg give me the command processor

#

i will program it

#

😟

frank sail Jul 15, 2023, 12:26 AM

#

abandon vulkan and become an amdgpu main

wicked notch Jul 15, 2023, 12:26 AM

#

Why stop at that

#

Expose the whole warp scheduler

#

I'll program it myself

frank sail Jul 15, 2023, 12:27 AM

#

oh, so you want to replace fixed-function bits of the hw? bleakekw

wicked notch Jul 15, 2023, 12:27 AM

#

While you're at it expose the whole memory subsystem, so I won't need CPU readback to update gpu mem pages bleakekw

glass sphinx Jul 15, 2023, 12:28 AM

#

ok ok listen to me:
i prerecord a command buffer with 1 million dispatches or some other high number, then have predication around every 100 or so.
Then i fill them as i need, enabling predicates to unlock more dispatches.
I then reuse that cmd buffer every frame

glass sphinx Jul 15, 2023, 12:28 AM

#

frank sail oh, so you want to replace fixed-function bits of the hw? <:bleakekw:10825983503...

i use my iron

frank sail Jul 15, 2023, 12:29 AM

#

wicked notch While you're at it expose the whole memory subsystem, so I won't need CPU readba...

the gpu already has virtual memory and can load pages from cpu ram

frank sail Jul 15, 2023, 12:29 AM

#

glass sphinx ok ok listen to me: i prerecord a command buffer with 1 million dispatches or os...

ok I thought you were talking about query objects

wispy spear Jul 15, 2023, 1:23 PM

#

wicked notch Jul 15, 2023, 7:09 PM

#

TODO: look at SDF tracing and probe tracing

#

Voxels scare me

#

Actually any kind of data that is supposed to be stored in a data structure that's not a simple ass array scares me bleakekw

wispy spear Jul 15, 2023, 7:18 PM

#

wicked notch TODO: look at SDF tracing and probe tracing

raven orchid Jul 15, 2023, 8:09 PM

#

@wicked notch I haven't done SDF tracing yet

#

for probe tracing are you talking about

wispy spear Jul 15, 2023, 8:10 PM

#

its about time you do 🙂

frank sail Jul 15, 2023, 8:10 PM

#

wispy spear

is dis yours

raven orchid Jul 15, 2023, 8:10 PM

#

yeah I think SDF is used for things like Godot 4

frank sail Jul 15, 2023, 8:10 PM

#

I believe the data structure is baked though

raven orchid Jul 15, 2023, 8:11 PM

#

oh dang so they're not redoing it

#

probes tend to be baked though

#

I think

#

from what I remember for things like Division 2

#

I think their probes end up using sort of offline preprocessing so that each one can cache which surfaces they can see

wispy spear Jul 15, 2023, 8:11 PM

#

frank sail is dis yours

oui

frank sail Jul 15, 2023, 8:12 PM

#

why it posted here KEKW

#

but also, very cool

wispy spear Jul 15, 2023, 8:12 PM

#

luschtri asked to have dfdx(worldpos) vischuellized

frank sail Jul 15, 2023, 8:12 PM

#

It looks super kewl

wispy spear Jul 15, 2023, 8:12 PM

#

yeah

#

disco bounding lines

#

im surprised you dont try to sell me "you need fsr2"

#

: >

wicked notch Jul 15, 2023, 8:13 PM

#

I was doing an experimentationes with dFdx

#

You can transfer the knowledge I gained to frogfooding btw

#

```glsl
mat3 TBN = mat3(0.0);
{
    const vec3[] world_positions = vec3[](
        vec3(transform * vec4(positions[0], 1.0)),
        vec3(transform * vec4(positions[1], 1.0)),
        vec3(transform * vec4(positions[2], 1.0))
    );
    const vec3 ddx_position = analytical_ddx(derivatives, world_positions);
    const vec3 ddy_position = analytical_ddy(derivatives, world_positions);
    const vec2 ddx_uv = uv_grad.ddx;
    const vec2 ddy_uv = uv_grad.ddy;

    const vec3 N = w_normal;
    const vec3 T = normalize(ddx_position * ddy_uv.y - ddy_position * ddx_uv.y);
    const vec3 B = -normalize(cross(N, T));

    TBN = mat3(T, B, N);
}
```

#

Here how I do TBN now

#

No tangents required

#

```glsl
vec3 analytical_ddx(in partial_derivatives_t derivatives, in vec3[3] values) {
    return vec3(
        dot(derivatives.ddx, vec3(values[0].x, values[1].x, values[2].x)),
        dot(derivatives.ddx, vec3(values[0].y, values[1].y, values[2].y)),
        dot(derivatives.ddx, vec3(values[0].z, values[1].z, values[2].z))
    );
}

vec3 analytical_ddy(in partial_derivatives_t derivatives, in vec3[3] values) {
    return vec3(
        dot(derivatives.ddy, vec3(values[0].x, values[1].x, values[2].x)),
        dot(derivatives.ddy, vec3(values[0].y, values[1].y, values[2].y)),
        dot(derivatives.ddy, vec3(values[0].z, values[1].z, values[2].z))
    );
}
```

With this
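For anyone following along, a CPU sketch of what the two helpers compute. It assumes `derivatives.ddx` and `derivatives.ddy` hold the screen-space derivatives of the three barycentric weights (the chat does not show `partial_derivatives_t`, so that reading is a guess). Under that assumption the derivative of any per-vertex attribute is just the weighted sum of the vertex values. `attribute_derivative` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// d_lambda holds d(lambda_i)/dx (or /dy) for the three barycentric weights.
// Since attribute = sum_i lambda_i * value_i, its derivative is
// sum_i d_lambda_i * value_i, evaluated per component.
Vec3 attribute_derivative(const Vec3& d_lambda, const std::array<Vec3, 3>& values) {
    // Sanity check (assumption): the weights sum to 1, so their derivatives sum to 0.
    assert(std::fabs(d_lambda.x + d_lambda.y + d_lambda.z) < 1e-3f);
    return {
        d_lambda.x * values[0].x + d_lambda.y * values[1].x + d_lambda.z * values[2].x,
        d_lambda.x * values[0].y + d_lambda.y * values[1].y + d_lambda.z * values[2].y,
        d_lambda.x * values[0].z + d_lambda.y * values[1].z + d_lambda.z * values[2].z
    };
}
```

Feeding it world positions gives the position derivatives used for the tangent; feeding it UVs (padded to three components) would give the UV gradients the same way.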

wispy spear Jul 15, 2023, 8:15 PM

#

is that from how to reconstruct normals out of thin air?

wicked notch Jul 15, 2023, 8:15 PM

#

No but that's the next step bleakekw

wispy spear Jul 15, 2023, 8:15 PM

#

heh

wicked notch Jul 15, 2023, 8:15 PM

#

I could actually calculate normals analytically right now

wispy spear Jul 15, 2023, 8:15 PM

#

i rember there was a blog flying around wrt to that, recently

wicked notch Jul 15, 2023, 8:16 PM

#

It's just normalize(cross(v[2] - v[0], v[1] - v[0]))
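The same one-liner as a standalone sketch, using a throwaway `Vec3` type (not from the chat). Which way the normal points depends on the winding order and the handedness convention, so the sign here is only what this particular operand order produces.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

inline Vec3 sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

inline Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f); // a degenerate triangle has no well-defined normal
    return { v.x / len, v.y / len, v.z / len };
}

// Flat (per-face) normal, same operand order as the message above.
Vec3 face_normal(const std::array<Vec3, 3>& v) {
    return normalize(cross(sub(v[2], v[0]), sub(v[1], v[0])));
}
```

For a triangle that is counter-clockwise in the XY plane when viewed from +Z, this operand order yields (0, 0, -1); swapping the two edges flips it.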

wispy spear Jul 15, 2023, 8:16 PM

#

cheeky

wicked notch Jul 15, 2023, 8:16 PM

#

So the vertex data becomes just position and UV

#

And we can quantize both of them perfectly bleakekw

#

Road to 0 byte vertex format

raven orchid Jul 15, 2023, 8:16 PM

#

dang that's pretty cool

wispy spear Jul 15, 2023, 8:16 PM

#

hehe

frank sail Jul 15, 2023, 8:16 PM

#

wicked notch So the vertex data becomes just position and UV

rip smooth norballs

wispy spear Jul 15, 2023, 8:16 PM

#

this is brainworm 2.0

#

perhaps you can smear a little dithering over it, nobody will notice non smoof norbels

frank sail Jul 15, 2023, 8:17 PM

#

wicked notch Road to 0 byte vertex format

just hardcode the scene into your shader to make it free

wispy spear Jul 15, 2023, 8:18 PM

#

powerplant.obj.vs.glsl

frank sail Jul 15, 2023, 8:18 PM

#

I think you mean, tessellate the mesh so much that multiple pixels do not share a triangle

#

that's how you get smooth face balls

wicked notch Jul 15, 2023, 8:19 PM

#

That's the objective with Nanite anyways 🚠

frank sail Jul 15, 2023, 8:19 PM

#

Eckszacktly

wicked notch Jul 15, 2023, 8:25 PM

#

I wonder if I could compute a gradient for smoothizing the normals

frank sail Jul 15, 2023, 8:26 PM

#

You have to get the neighboring faces too

wicked notch Jul 15, 2023, 8:26 PM

#

we can't use subgroup ops in frag shaders right? 😦

frank sail Jul 15, 2023, 8:26 PM

#

Which means you pass a half edge structure to the GPU bleakekw

#

There are certain subgroup ops that you can use

#

Like the quad ones

#

I think that's it

wicked notch Jul 15, 2023, 8:27 PM

#

Is there no WaveReadAcrossQuadLaneX or something like that

frank sail Jul 15, 2023, 8:28 PM

#

yeah there is

#

https://www.khronos.org/blog/vulkan-subgroup-tutorial

The Khronos Group

Vulkan Subgroup Tutorial

Subgroups are an important new feature in Vulkan 1.1 because they enable highly-efficient sharing and manipulation of data between multiple tasks running in parallel on a GPU. In this tutorial, we will cover how to use the new subgroup functionality.

#

ctrl f quad

wicked notch Jul 15, 2023, 8:29 PM

#

subgroupQuadBroadcast what a shit name

#

bleakekw

#

I thought this was a "write" operation, not a read one

wispy spear Jul 15, 2023, 8:31 PM

#

(facepalming at the name, not you)

frank sail Jul 15, 2023, 8:31 PM

#

It actually broadcasts an FM radio signal when you call it

wispy spear Jul 15, 2023, 8:31 PM

#

so you can tune in?

frank sail Jul 15, 2023, 8:32 PM

#

Ye so we can look at dem quads

wicked notch Jul 15, 2023, 8:32 PM

#

I prefer triangles

minor root Jul 15, 2023, 8:35 PM

#

did you end up figuring out the dFdx thingy

wicked notch Jul 15, 2023, 8:36 PM

#

yes

wicked notch Jul 15, 2023, 8:36 PM

#

wicked notch ```glsl mat3 TBN = mat3(0.0); { const vec3[] world_positions = vec3[]( ...

see here for results

minor root Jul 15, 2023, 8:37 PM

#

ah

#

this thing

#

its neat

#

hows perf

wicked notch Jul 15, 2023, 8:38 PM

#

minor root hows perf

ridiculous bleakekw

minor root Jul 15, 2023, 8:38 PM

#

bleakekw

wicked notch Jul 15, 2023, 8:38 PM

#

400 microseconds in total for bistro with sampling and all

#

At this point I should begin optimizing the memory bandwidth of the GPU because that's my bottleneck bleakekw

#

Multiview is coming soon

frank sail Jul 15, 2023, 8:40 PM

#

You'll port that to #1128020727380054046 right frogstare

wicked notch Jul 15, 2023, 8:40 PM

#

multiview ain't supported on GL 😭

minor root Jul 15, 2023, 8:40 PM

#

you'll port all this to #1073361699651989584 right

wicked notch Jul 15, 2023, 8:40 PM

#

It's all open sus

frank sail Jul 15, 2023, 8:40 PM

#

wicked notch multiview ain't supported on GL 😭

Isn't there an ext or am I tripping

wicked notch Jul 15, 2023, 8:40 PM

#

ye for oculus only

#

https://registry.khronos.org/OpenGL/extensions/OVR/OVR_multiview.txt

minor root Jul 15, 2023, 8:41 PM

#

wicked notch It's all open sus

nono i need more users

wicked notch Jul 15, 2023, 8:41 PM

#

Though I have no idea if any other vendor silently supports this ext bleakekw

frank sail Jul 15, 2023, 8:44 PM

#

Czech gpuopen

#

Wrong gpu website

#

I meant gpuinfo

wispy spear Jul 15, 2023, 8:45 PM

#

youll port all this to #1019740157798273024 right

#

hehe we do a little funny

wicked notch Jul 15, 2023, 8:45 PM

#

Holy shit it's supported

frank sail Jul 15, 2023, 8:46 PM

#

Wdym

wicked notch Jul 15, 2023, 8:46 PM

#

Incredible

wispy spear Jul 15, 2023, 8:46 PM

#

brainworm 3.0 unlocked

frank sail Jul 15, 2023, 8:46 PM

#

You know what to do now

wispy spear Jul 15, 2023, 8:46 PM

#

there is a OVR_multiview2 too

frank sail Jul 15, 2023, 8:47 PM

#

OVR = OpenVR

wicked notch Jul 15, 2023, 8:47 PM

#

frank sail You know what to do now

Yes, study for my upcoming exam bleakekw

wispy spear Jul 15, 2023, 8:47 PM

#

i difuger

wicked notch Jul 15, 2023, 8:47 PM

#

You go ahead and implement all the PBRisms

wispy spear Jul 15, 2023, 8:47 PM

#

you can study tomorrow evening

frank sail Jul 15, 2023, 8:49 PM

#

bad parenting deccer bleakekw

wispy spear Jul 15, 2023, 8:50 PM

#

https://tenor.com/view/bad-parents-bad-parenting-parenting-fail-bad-dad-bad-mom-gif-13534030

Tenor

bad parents


wicked notch Jul 15, 2023, 8:50 PM

#

Anyways, returning to GI one sec

#

@raven orchid How exactly do VPL work?

raven orchid Jul 15, 2023, 8:53 PM

#

wicked notch @raven orchid How exactly do VPL work?

The basic idea is that when light hits a surface it bounces off in certain ways, for example diffuse. At some point someone realized we can approximate that by spawning point lights on or near surfaces where light directly hits it

#

Then single or multi bounce lighting just becomes spawning virtual lights around the scene

#

Which I think can somewhat be related to probe based lighting too

wicked notch Jul 15, 2023, 8:55 PM

#

Hmm

raven orchid Jul 15, 2023, 8:55 PM

#

For that instead of VPLs they spawn probes

#

And probes capture light info for regions of the world

minor root Jul 15, 2023, 8:55 PM

#

that seems pretty neat

wispy spear Jul 15, 2023, 9:00 PM

#

re ovr_multiview, just found that while looking for the txt file https://forums.developer.nvidia.com/t/gl-ovr-multiview-performance-on-rtx3000/184313

wicked notch Jul 15, 2023, 9:01 PM

#

The gains are from avoiding expensive barriers and state changes

#

If you are doing basic things with no barriers multiview is unlikely to make a diff

minor root Jul 15, 2023, 9:02 PM

#

another unanswered nvidia forum post bleakekw

#

they really do not reply at all

wicked notch Jul 15, 2023, 9:02 PM

#

typical

twin musk Jul 15, 2023, 9:09 PM

#

frank sail OVR = OpenVR

akshually in this case it's oculus vr 🤓 ☝️

#

unfortunately both have been shortened to ovr

frank sail Jul 15, 2023, 9:13 PM

#

pranked

twin musk Jul 15, 2023, 9:18 PM

#

wispy spear re ovr_multiview, just found that while looking for the txt file <https://forums...

tbf that application is basically a worst case for multiview
going from singleview to multiview is only going from 2 drawcalls to 1, so there's barely any reduction in cpu overhead, and the points only have a single colour attribute that is passed straight through in the vs so there's no vertex work it can share between the two views

proven laurel Jul 16, 2023, 12:17 PM

#

wicked notch Hopefully the gltf exported doesn't die as usual bleakekw

Does it work?

#

I am considering getting it but don't want to waste money KEKW

wicked notch Jul 16, 2023, 12:18 PM

#

I forgor 💀

#

one sec

wicked notch Jul 16, 2023, 12:39 PM

#

Alright I produced a functional FBX

#

ignore blender not responding

wispy spear Jul 16, 2023, 12:44 PM

#

https://tenor.com/view/ralphyredflaggy-sad-flagmailbox-not-responding-gif-12097706

Tenor

wicked notch Jul 16, 2023, 1:17 PM

#

success

#

It's not a very high poly scene though, just 20 million instanced triangles

wispy spear Jul 16, 2023, 1:18 PM

#

what does instanced mean in this context?

#

are those wall towers the same mesh and those have been instanced?

wicked notch Jul 16, 2023, 1:18 PM

#

yes, the trees too

wispy spear Jul 16, 2023, 1:19 PM

#

ah

#

kewl

wicked notch Jul 16, 2023, 1:20 PM

#

It's also just 34MB lol

#

it doesn't have any textures sadly

#

Perhaps Unreal's FBX exporter is unable to export textures?

wispy spear Jul 16, 2023, 1:43 PM

#

yup it seems that way

#

i also had no luck so far

wicked notch Jul 16, 2023, 2:08 PM

#

triangle dust KEKW

wispy spear Jul 16, 2023, 2:20 PM

#

: )

#

how many this time?

wicked notch Jul 16, 2023, 2:26 PM

#

34 million

#

smol increase

#

normals are also a bit fucced

wispy spear Jul 16, 2023, 2:42 PM

#

looks ok to me

#

are you sure you loaded them not as srgb 😛 (i know there are no maps yet)

wicked notch Jul 16, 2023, 2:47 PM

#

https://file.io/DnATx0vUl37J here be the scene btw

#

It's stupidly small somehow lol

wispy spear Jul 16, 2023, 2:53 PM

#

uhm

#

this mesh turns my "modelviewer" black and imgui wont show up either 😄

#

4:54:54

wicked notch Jul 16, 2023, 2:55 PM

#

epic KEKW

#

renderdoc says anything useful?

wispy spear Jul 16, 2023, 2:56 PM

#

i doubt i can even take a capture 😛

wicked notch Jul 16, 2023, 2:56 PM

#

Perhaps you could display the instance ID or the primitive ID

#

here's the color func I use

#

```glsl
vec3 hsv_to_rgb(in vec3 hsv) {
    const vec3 rgb = saturate(abs(mod(hsv.x * 6.0 + vec3(0.0, 4.0, 2.0), 6.0) - 3.0) - 1.0);
    return hsv.z * mix(vec3(1.0), rgb, hsv.y);
}
```

#

You use it like this

```glsl
hsv_to_rgb(vec3(float(gl_PrimitiveID) * M_GOLDEN_CONJ, 0.875, 0.85))
```
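A C++ port of the debug color trick, handy for checking values offline. Two assumptions: `M_GOLDEN_CONJ` is taken to be the golden ratio conjugate (about 0.618), which the chat never defines, and `saturate` is taken to clamp to [0, 1]. `primitive_color` is a made-up wrapper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 {
    float x, y, z;
};

// GLSL-style mod: x - y * floor(x / y), which differs from std::fmod for negatives.
inline float glsl_mod(float x, float y) { return x - y * std::floor(x / y); }

// Port of the GLSL hsv_to_rgb above; hue wraps, so any float hue is fine.
Vec3 hsv_to_rgb(float h, float s, float v) {
    assert(s >= 0.0f && s <= 1.0f);
    const auto channel = [h](float offset) {
        const float m = glsl_mod(h * 6.0f + offset, 6.0f);
        return std::min(std::max(std::fabs(m - 3.0f) - 1.0f, 0.0f), 1.0f);
    };
    // mix(1.0, c, s): blend from white towards the pure hue by saturation.
    const auto mix_from_white = [s](float c) { return 1.0f * (1.0f - s) + c * s; };
    return { v * mix_from_white(channel(0.0f)),
             v * mix_from_white(channel(4.0f)),
             v * mix_from_white(channel(2.0f)) };
}

// Assumed value for M_GOLDEN_CONJ: the golden ratio conjugate, 1 / phi.
constexpr float kGoldenRatioConjugate = 0.6180339887f;

// Consecutive ids land on well-separated hues because of the irrational step.
Vec3 primitive_color(std::uint32_t primitive_id) {
    return hsv_to_rgb(static_cast<float>(primitive_id) * kGoldenRatioConjugate, 0.875f, 0.85f);
}
```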

wispy spear Jul 16, 2023, 2:57 PM

#

ah

#

42k nodes, its at 24k or so

#

that seem to take ages to load 🙂

wicked notch Jul 16, 2023, 2:59 PM

#

great

#

I wonder why it takes ages to load, I can load it in a few ms

wispy spear Jul 16, 2023, 3:00 PM

#

because my code is shit presumably

wicked notch Jul 16, 2023, 3:00 PM

#

incredible

wispy spear Jul 16, 2023, 3:00 PM

#

```
[16:55:33 DBG] SharpGltfMeshLoader: Loading Material MI_Fountain_Water_Inst
[17:00:08 DBG] SharpGltfMeshLoader: Loaded 46961 primitives from /home/deccer/Personal/Code/Projects/lessGravity/OpenSpace/src/OpenSpace.Main/bin/Debug/net7.0/Data/Props/bazaar.glb
```

4.5min 😄

#

ok this is weird, it has finished loading everything, but screen is black XD

#

wtf

#

its busy creating the meshpool out of those 43k meshprims 😄

#

ok i have to work on that hehe

wicked notch Jul 16, 2023, 3:03 PM

#

Ah you don't handle instancing nervous

wispy spear Jul 16, 2023, 3:03 PM

#

yes, i dont handle it

#

ie i dont check gltf extensions

wicked notch Jul 16, 2023, 3:03 PM

#

Ah no need

#

it's not using EXT_mesh_instancing

#

It's just using regular gltf node instancing

wispy spear Jul 16, 2023, 3:04 PM

#

ah

wicked notch Jul 16, 2023, 3:04 PM

#

where multiple nodes reference the same mesh

wispy spear Jul 16, 2023, 3:04 PM

#

i just iterate over all nodes

#

and meshes should be handled properly, i believe my deccer cubes work the same way

#

thanks for this fucked model : > to show me how shit my code is

wicked notch Jul 16, 2023, 3:05 PM

#

KEKW

#

I blame unreal

wispy spear Jul 16, 2023, 3:05 PM

#

na, my code is also actually shit

#

too much memory copy bs and allocation shenanigans

wicked notch Jul 16, 2023, 7:25 PM

#

Huh

#

Apparently EXT mesh shaders require the "geometryShader" feature to be enabled thonk

wispy spear Jul 16, 2023, 7:28 PM

#

ugh

wicked notch Jul 16, 2023, 8:33 PM

#

when unreal engine

wispy spear Jul 16, 2023, 8:33 PM

#

where did the other 64gig go?

#

didnt you upgrade to 128?

wicked notch Jul 16, 2023, 8:34 PM

#

Yes, but I constantly bluescreened bleakekw

#

MEMORY_MANAGEMENT or some stuff

#

Turns out my CPU's IMC did NOT like 128GB

#

smh Jaker

#

fix your CPUs

wispy spear Jul 16, 2023, 8:35 PM

#

fook

proven laurel Jul 17, 2023, 4:23 AM

#

wicked notch Turns out my CPU's IMC did NOT like 128GB

Could be a timing issue

proven laurel Jul 17, 2023, 4:24 AM

#

wicked notch triangle dust KEKW

Might need to decimate?

minor root Jul 17, 2023, 7:53 AM

#

why did you buy 64 gb of ram before checking if your cpu can handle it bleakekw

wicked notch Jul 17, 2023, 9:07 AM

#

It was on the QVL

#

I blame the QVL

wicked notch Jul 17, 2023, 9:53 AM

#

TODO

#

https://www.youtube.com/watch?v=Tx32yi_0ETY

YouTube

High-Performance Graphics

Real Time Ray Tracing of Micro Poly Geometry with Hierarchical Leve...

Real Time Ray Tracing of Micro Poly Geometry with Hierarchical Level of Detail
Carsten Benthin, Christoph Peters
Paper Session: Primitives, Surfaces & Appearance Modeling - HPG 2023 - Day 2


#

learn how the fuck they managed this

#

Hold on

#

KEKW

#

They just went: "alright no available API allows us to do efficient BVH rebuild what do we do"

#

"we obviously forgo hardware acceleration and just use Embree and make our own BVH!"

#

amazing

#

If they can make a GPU accelerated, dense BVH based on clusters

#

Why can't AMD or NVIDIA

#

bruh

proven laurel Jul 17, 2023, 11:51 AM

#

wicked notch It was on the QVL

QVL only looks at entire kits

#

which is why I meant could be a timing issue

wicked notch Jul 17, 2023, 7:05 PM

#

Alright boys

#

it is time for one of my usual detours

#

Like last time with mesh shaders, we can all see it didn't turn out into anything serious

#

It's not like I'm in a rabbit hole 9km deep into meshlets

#

Totally not that

#

Anyways this time the detour will be RayTracing!

#

After the last exam is done, I will spend day and night learning about BVHs on the GPU (doing them myself, no VK_KHR_ray_tracing)

#

Then I will re-read the paper about nanite style RT LODs

#

and finally I will try making an issue on Vulkan-Docs to see how the big brains over at Khronos will receive it

wispy spear Jul 17, 2023, 7:07 PM

#

https://tenor.com/view/community-ken-jeong-ben-chang-ill-allow-it-allowed-gif-4468552

Tenor

i'll allow it


wicked notch Jul 17, 2023, 7:08 PM

#

Also will probably buy a 7600XT

#

Because RADV

glass sphinx Jul 17, 2023, 8:11 PM

#

bleakekw the driver sink hole you will fall in gives me enough time to catch up again

#

i can draw again btw

#

i am slowly clawing back my power in the rewrite

wicked notch Jul 17, 2023, 8:13 PM

#

I saw your impl of the entity culling btw, I think I get it now

#

Amazing ideas behind it

wicked notch Jul 18, 2023, 8:52 PM

#

You guys remember the ballz

#

It's time to rewrite the raytracer, on the CPU with a proper BVH this time bleakekw

frank sail Jul 18, 2023, 9:00 PM

#

Nice balls homie, solid 8/10

wispy spear Jul 18, 2023, 9:50 PM

#

https://tenor.com/view/nice-balls-ill-have-to-gif-24823674

Tenor

wicked notch Jul 19, 2023, 6:00 PM

#

```
void trace(bvh, origin, direction) -> color {
  auto ray = { origin, direction };
  const auto max_bounces = 32;
  for (i = 0; i < max_bounces; ++i) {
    auto hit = bvh.traverse(ray);
    if (!hit) {
      break;
    }
    ray.origin = hit.point;
    ray.direction = random_direction_in_hemisphere(hit.normal);
  }
}
```

hmm
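One way the `random_direction_in_hemisphere` step from the pseudocode could look. This is a sketch, not what was actually used: it draws a uniformly distributed direction on the sphere from three Gaussian samples and flips it into the hemisphere around the normal. The `Vec3` type and the `std::mt19937` parameter are illustrative choices.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 {
    float x, y, z;
};

inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Uniform direction on the hemisphere around `normal` (assumed unit length).
Vec3 random_direction_in_hemisphere(const Vec3& normal, std::mt19937& rng) {
    assert(std::fabs(dot(normal, normal) - 1.0f) < 1e-3f);
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    Vec3 d{};
    float len = 0.0f;
    do {
        // A normalized 3D Gaussian sample is uniformly distributed on the sphere.
        d = { gauss(rng), gauss(rng), gauss(rng) };
        len = std::sqrt(dot(d, d));
    } while (len < 1e-6f); // reject the (very unlikely) near-zero vector
    d = { d.x / len, d.y / len, d.z / len };
    // Mirror directions that point below the surface.
    if (dot(d, normal) < 0.0f) {
        d = { -d.x, -d.y, -d.z };
    }
    return d;
}
```

A cosine-weighted distribution is the more common choice for diffuse bounces because it cancels the cosine term in the estimator, but uniform sampling is the simplest thing that matches the pseudocode.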

#

deep thought

wicked notch Jul 19, 2023, 7:37 PM

#

does each primitive need to store an ID to the mesh it pertains?

#

So each BVH node will contain the ID of the primitive and the ID of the mesh?

fluid jungle Jul 19, 2023, 8:26 PM

#

wicked notch You guys remember the ballz

the colors are nice, how did you pick them

wicked notch Jul 19, 2023, 8:26 PM

#

I just went on the usual adobe color picker and chose something that looked nice

fluid jungle Jul 19, 2023, 8:27 PM

#

looks very vibrant

wicked notch Jul 19, 2023, 8:27 PM

#

Also I wanted to mimic Sebastian Lague's layout so that I had a good reference image

frank sail Jul 19, 2023, 9:48 PM

#

wicked notch does each primitive need to store an ID to the mesh it pertains?

~~probably not~~

#

hmm

#

have you read these
https://jacco.ompf2.com/2022/04/13/how-to-build-a-bvh-part-1-basics/
https://madmann91.github.io/2020/12/28/bvhs-part-1.html

#

you could also ask pixelduck about the BVH format on AMD

wicked notch Jul 19, 2023, 9:50 PM

#

I have not

#

why did you not send me these when I first asked you months ago

#

KEKW

frank sail Jul 19, 2023, 9:53 PM

#

my dementia only allows me to remember a small number of things at a time

fluid jungle Jul 19, 2023, 9:56 PM

#

frank sail my dementia only allows me to remember a small number of things at a time

We're probably sharing braincells because I have the same issue

frank sail Jul 19, 2023, 10:04 PM

#

btw on AMD, an ID (often used for materials) is stored in triangle nodes

#

and each triangle node can store up to four triangles arranged as a fan (so only five total positions have to be stored)

wicked notch Jul 19, 2023, 10:07 PM

#

Damn BVHs are heavy

#

The library I'm using stores bounding box (24 bytes) and 2 indices (8 bytes)
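To put numbers on "heavy": a sketch of a 32-byte node matching the sizes quoted above (24 bytes of bounds plus two 4-byte indices). The field names and their meaning are guesses, since the library is not named, and the node-count estimate assumes a binary tree with one primitive per leaf.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: 6 floats of AABB (min xyz, max xyz) plus two indices.
struct BvhNode {
    float bounds[6];            // 24 bytes
    std::uint32_t first_index;  // hypothetical: first child or first primitive
    std::uint32_t count;        // hypothetical: primitive count, 0 for inner nodes
};

static_assert(sizeof(BvhNode) == 32, "node is expected to pack to 32 bytes");

// A full binary tree with n leaves has 2n - 1 nodes, so with one primitive per
// leaf this is an upper bound on the node storage in bytes.
std::uint64_t bvh_bytes_upper_bound(std::uint64_t primitive_count) {
    assert(primitive_count > 0);
    return (2 * primitive_count - 1) * sizeof(BvhNode);
}
```

Under those assumptions the 34 million triangle scene from earlier would need roughly 2.2 GB of nodes before any leaf merging, which is why real builders pack several primitives per leaf.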

frank sail Jul 19, 2023, 10:09 PM

#

box nodes hold "pointers" (indices) to four children as well as their bounds, stored as f16 or f32 (so there are two kinds of box nodes)

wicked notch Jul 19, 2023, 10:09 PM

#

Overall I can see how to send this thing to the GPU though

#

it's basically a flat tree stored as a vector

frank sail Jul 19, 2023, 10:10 PM

#

so I guess you're not doing hw rt

wicked notch Jul 19, 2023, 10:10 PM

#

Not yet™️

#

I first have to understand the basics before I can trust the hardware to do it right bleakekw

frank sail Jul 19, 2023, 10:10 PM

#

tru

#

for more info, check the RDNA 2/3 ISA guides and search for IMAGE_BVH_INTERSECT_RAY

#

well it might not be that much more info

minor root Jul 20, 2023, 7:43 AM

#

@wicked notch sir web wizard

#

```html
<div style="display:flex;align-items:center;">
  <input
    type="checkbox"
    id="{{component.id}}-checkbox-{{option}}"
    name="{{component.id}}"
    value="{{option}}"
    [checked]="component.val.includes(';{{option}};')"
    (change)="onCheckboxChange($event)"
    style="width: 10%; height: 30px;">
  <label for="{{component.id}}-checkbox-{{option}}" style="">{{option}}</label>
</div>
```

why the label nicely centered but the checkbox not

minor root Jul 20, 2023, 8:08 AM

#

ok bruh the input was inheriting some css that messed it up

#

thanks previous devs

#

sorry for bothering PrayGe

wicked notch Jul 20, 2023, 10:44 AM

#

The exams have finished

#

I now have endless free time

glass sphinx Jul 20, 2023, 12:11 PM

#

LVSTRI be working in the secret 25h each day

wicked notch Jul 20, 2023, 12:15 PM

#

I will admit my average sleep time this month was 3 hours

#

bleakekw

minor root Jul 20, 2023, 12:39 PM

#

nervous

finite yacht Jul 20, 2023, 12:56 PM

#

nervous

wide shadow Jul 20, 2023, 1:10 PM

#

bleakekw

wicked notch Jul 20, 2023, 1:37 PM

#

Alright for now I'll shrimply store another indirection vector

#

purpose is mesh_id = ind[prim_id]
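A small sketch of how such an indirection vector could be filled, assuming the primitives of all meshes are stored mesh after mesh in one global array. `build_prim_to_mesh` and `prims_per_mesh` are hypothetical names, not something from the project:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build `ind` so that ind[prim_id] is the index of the mesh owning that
// primitive. Assumes primitives are laid out mesh after mesh (an assumption).
inline std::vector<std::uint32_t> build_prim_to_mesh(
    const std::vector<std::size_t>& prims_per_mesh) {
    std::vector<std::uint32_t> ind;
    for (std::size_t mesh = 0; mesh < prims_per_mesh.size(); ++mesh) {
        // Append one entry per primitive of this mesh.
        ind.insert(ind.end(), prims_per_mesh[mesh],
                   static_cast<std::uint32_t>(mesh));
    }
    return ind;
}
```

If the BVH builder permutes the primitives, the lookup has to use the original primitive id rather than the permuted position.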

#

I see now why we have two BVHs bleakekw

finite yacht Jul 20, 2023, 1:46 PM

#

you have a blas for each mesh right?

wicked notch Jul 20, 2023, 1:47 PM

#

Ye that's the plan at least

finite yacht Jul 20, 2023, 1:47 PM

#

so when iterating through the BLASes and traversing each one, can't you remember the index, just like you remember the closest hit pos

wicked notch Jul 20, 2023, 1:47 PM

#

Perhaps that would be best

#

Also my "mesh" right now is a single triangle KEKW

#

I'll just try tracing this tringle for now

wispy spear Jul 20, 2023, 3:31 PM

#

soon lvstri will be snatched by some big $GPUVENDOR where he is put in the basement to work on $TECH and we will never see/hear/read from him anymore

wicked notch Jul 20, 2023, 3:41 PM

#

behold

#

A photorealistic render of a triangle in a scene with no light sources

wicked notch Jul 20, 2023, 6:54 PM

#

The primitives in the BVH have to be NDC don't they

#

No actually that's wrong

#

hmm

wicked notch Jul 20, 2023, 7:12 PM

#

I've lost count of how many times I had to draw a tringle

#

But here we are again

#

with bonus barycentric coordinates

#

I decided that everything shall be world space for simplicity

wispy spear Jul 20, 2023, 7:15 PM

#

is it srgb though?

wicked notch Jul 20, 2023, 7:18 PM

#

now it is
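For reference, a sketch of the standard linear-to-sRGB transfer function applied per colour channel. `linear_to_srgb` is an illustrative name; the `as_srgb` helper used later in this log may well be implemented differently:

```cpp
#include <algorithm>
#include <cmath>

// Standard sRGB encoding of one linear channel value in [0, 1].
inline float linear_to_srgb(float c) {
    c = std::clamp(c, 0.0f, 1.0f);
    return c <= 0.0031308f
        ? 12.92f * c                                  // linear toe near black
        : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f; // power-law segment
}
```

The encoding is applied once, right before the colour is quantized to 8 bits; all shading maths stays in linear space.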

wicked notch Jul 20, 2023, 9:02 PM

#

oh shit

#

how do we parallelize traversal

frank sail Jul 20, 2023, 9:08 PM

#

put it in a shader

wispy spear Jul 20, 2023, 9:08 PM

#

Parallel.ForEach(trasversals, trasversal => {});

#

i should be quiet, i cant do any gp really : >

frank sail Jul 20, 2023, 9:09 PM

#

also, wdym "parallelize traversal"? like you want one ray's traversal to be parallelized?

wicked notch Jul 20, 2023, 9:10 PM

#

Perhaps it's just this library I'm using, but their BVH traversal function isn't really thread safe

#

it does execute in parallel internally I think though

frank sail Jul 20, 2023, 9:10 PM

#

tf

#

how

#

traversal should be thread safe

wicked notch Jul 20, 2023, 9:11 PM

#

intersect() isn't marked const

#

so that could be why

#

it modifies internal state or something

frank sail Jul 20, 2023, 9:11 PM

#

what library is this

wicked notch Jul 20, 2023, 9:11 PM

#

https://github.com/madmann91/bvh

frank sail Jul 20, 2023, 9:11 PM

#

madmann is here btw

wicked notch Jul 20, 2023, 9:12 PM

#

actually hold on

#

I'm dumb

#

yeah I'm dumb

#

the ray required for traversal isn't const

#

but intersect is

frank sail Jul 20, 2023, 9:12 PM

#

https://tenor.com/view/mushroom-gif-23662651

wicked notch Jul 20, 2023, 9:12 PM

#

so... safe?

#

ish

#

I mean I should lock the ray bleakekw

frank sail Jul 20, 2023, 9:13 PM

#

what does the function return?

wicked notch Jul 20, 2023, 9:13 PM

#

intersect? Nothing

frank sail Jul 20, 2023, 9:13 PM

#

actually can you just tell me what file it's in

wicked notch Jul 20, 2023, 9:13 PM

#

https://github.com/madmann91/bvh/blob/master/src/bvh/v2/bvh.h#L83

#

it takes a function that is supposed to iterate over some primitives and intersect each one with the ray

frank sail Jul 20, 2023, 9:14 PM

#

ok so I guess the ray is mutable in case one of the callbacks needs to mutate it

#

rays also store tmin and tmax

wicked notch Jul 20, 2023, 9:15 PM

#

hmm yes

#

makes sense

frank sail Jul 20, 2023, 9:16 PM

#

so it should be perfectly fine to call that fn from many threads

wicked notch Jul 20, 2023, 9:17 PM

#

yep

#

I am invoking UB

#

I wonder why I didn't doubt myself before doubting the lib

#

smh

#

Also fun fact, none of the internal usages of ray change its state apparently

#

Perhaps I'm missing something?

frank sail Jul 20, 2023, 9:19 PM

#

czech the exshrimples

#

oh ok so tmin and tmax are actually the min and max distance for the ray to travel

wicked notch Jul 20, 2023, 9:20 PM

#

casual 40ms

#

to trace a single tringle

#

amazing

frank sail Jul 20, 2023, 9:23 PM

#

the example has this line which tells me that perhaps tmin and tmax get modified during traversal
https://github.com/madmann91/bvh/blob/master/test/simple_example.cpp#L98

#

so I guess it's sorta an inout ray param

wicked notch Jul 20, 2023, 9:24 PM

#

Actually, not during traversal

#

during ray-triangle intersection

#

https://github.com/madmann91/bvh/blob/master/src/bvh/v2/tri.h#L57
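To illustrate why the ray ends up being an in/out parameter, here is a self-contained Möller-Trumbore sketch in which a successful hit shrinks `ray.tmax`. The types (`Vec3`, `Ray`, `Tri`) are simplified stand-ins written for this note and are not the library's own classes:

```cpp
#include <cmath>
#include <optional>
#include <utility>

struct Vec3 { float x, y, z; };
inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray {
    Vec3 org, dir;
    float tmin = 0.0f;  // closest distance the ray may report
    float tmax = 1e30f; // farthest distance; shrinks as closer hits are found
};

struct Tri { Vec3 p0, p1, p2; };

// Moller-Trumbore test. On a hit within [tmin, tmax] this returns the
// barycentrics (u, v) and shrinks ray.tmax to the hit distance, so later
// primitives that are farther away get rejected.
inline std::optional<std::pair<float, float>> intersect(const Tri& tri, Ray& ray) {
    const Vec3 e1 = sub(tri.p1, tri.p0);
    const Vec3 e2 = sub(tri.p2, tri.p0);
    const Vec3 p = cross(ray.dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return std::nullopt; // ray parallel to triangle
    const float inv = 1.0f / det;
    const Vec3 s = sub(ray.org, tri.p0);
    const float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return std::nullopt;
    const Vec3 q = cross(s, e1);
    const float v = dot(ray.dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return std::nullopt;
    const float t = dot(e2, q) * inv;
    if (t < ray.tmin || t > ray.tmax) return std::nullopt;
    ray.tmax = t; // the only mutation: remember the closest hit so far
    return std::make_pair(u, v);
}
```

Since the only state that changes lives in the ray, giving every thread its own `Ray` is enough to make concurrent calls safe in this sketch.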

frank sail Jul 20, 2023, 9:25 PM

#

close enough 😎

wispy spear Jul 20, 2023, 9:38 PM

#

mayhaps duing tlasversar

wicked notch Jul 20, 2023, 9:51 PM

#

I wonder why not make tmin and tmax optionally atomic

#

Holy pog 4ms

frank sail Jul 20, 2023, 10:10 PM

#

wicked notch I wonder why not make tmin and tmax optionally atomic

why would you have the same ray traversing the scene multiple times in parallel

wicked notch Jul 20, 2023, 10:17 PM

#

ya got a point

#

I wonder if loading in suzanne would be a good idea

#

Damn, 10ms

#

Not bad

#

Now I'll load intel sponza bleakekw

frank sail Jul 20, 2023, 10:21 PM

#

hmm how many tris are in suzanne?

#

if intel sponza has 1000x as many tris, I expect only about a 10x decrease in perf (assuming bvh2)

wicked notch Jul 20, 2023, 10:21 PM

#

KEKW

#

to be fair

#

logN is great
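A rough back-of-the-envelope sketch of the log N argument, under the idealization of a perfectly balanced binary BVH with one primitive per leaf. `balanced_bvh2_depth` is a made-up helper; real trees are not balanced and real traversal cost also depends on node overlap and cache behaviour:

```cpp
#include <cmath>

// Depth of a perfectly balanced binary tree holding `prim_count` leaves.
// This is an idealized model, not a measurement of any real BVH.
inline double balanced_bvh2_depth(double prim_count) {
    return std::ceil(std::log2(prim_count));
}
```

In this model, going from 1,000 to 1,000,000 primitives takes the depth from 10 to 20 levels, so 1000x more triangles adds about log2(1000) ≈ 10 levels instead of multiplying the work by 1000.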

frank sail Jul 20, 2023, 10:22 PM

#

oh nice only 5x decrease

wicked notch Jul 20, 2023, 10:22 PM

#

quite nice tbh

frank sail Jul 20, 2023, 10:22 PM

#

not bad

wicked notch Jul 20, 2023, 10:22 PM

#

I'll go back to cornell box though bleakekw

frank sail Jul 20, 2023, 10:22 PM

#

this is cpu too

wicked notch Jul 20, 2023, 10:22 PM

#

ye fully CPU

#

```cpp
executor.for_each(0, height, [&](size_t start, size_t end) {
    for (auto y = start; y < end; ++y) {
        for (auto x = 0u; x < width; ++x) {
            auto color = glm::vec4(0.0f, 0.0f, 0.0f, 1.0f);
            for (auto s = 0u; s < spp; ++s) {
                // Pixel -> NDC, then unproject two depths to world space
                // to get the ray origin and direction.
                const auto u = x / static_cast<float>(width - 1);
                const auto v = y / static_cast<float>(height - 1);
                const auto uv_near = glm::vec4(glm::vec2(u, v) * 2.0f - 1.0f, 0.0f, 1.0f);
                const auto uv_far = glm::vec4(glm::vec2(u, v) * 2.0f - 1.0f, 0.1f, 1.0f);
                auto world_near = inv_pv * uv_near;
                auto world_far = inv_pv * uv_far;
                world_near /= world_near.w;
                world_far /= world_far.w;

                // Each sample owns its ray, so mutating it during intersection is fine.
                auto ray = bvh_ray(
                    as_vec3(world_near),
                    as_vec3(glm::normalize(world_far - world_near)));
                auto bary = glm::vec3(0.0f);
                auto hit = intersect(bvh, ray, [&](size_t i) {
                    if (auto prim_hit = perm_prims[i].intersect(ray)) {
                        const auto& [b_u, b_v] = *prim_hit;
                        bary = glm::vec3(b_u, b_v, 1.0f - b_u - b_v);
                        return true;
                    }
                    return false;
                });
                if (hit != -1) {
                    // Accumulate, so that dividing by spp below gives the average.
                    color += glm::vec4(bary, 0.0f);
                }
            }
            color /= spp;
            image[y * width + x] = encode_rgba(glm::vec4(as_srgb(color), 1.0f));
        }
    }
});
```

#

Amazing

frank sail Jul 20, 2023, 10:23 PM

#

now do path tracing

wicked notch Jul 20, 2023, 10:23 PM

#

soon™️

frank sail Jul 20, 2023, 10:23 PM

#

wicked notch ```cpp executor.for_each(0, height, [&](size_t start, size_t end) { for (aut...

is this a parallel executor or something

wicked notch Jul 20, 2023, 10:23 PM

#

ye

distant lodge Jul 20, 2023, 10:23 PM

#

if it's 43ms fully CPU you could totally have it go realtime on your GPU

frank sail Jul 20, 2023, 10:23 PM

#

oke that's ebic

wicked notch Jul 20, 2023, 10:23 PM

#

BVHv2's

wicked notch Jul 20, 2023, 10:24 PM

#

distant lodge if it's 43ms fully CPU you could totally have it go realtime on your GPU

I still gotta understand how to traverse the BVH on the GPU

#

but I'm slowly beginning to expand my brain mass

frank sail Jul 20, 2023, 10:25 PM

#

wicked notch I still gotta understand how to traverse the BVH on the GPU

literally the exact same as traversing on the cpu in this case

#

because madmann is using the SmallStack thingy with a fixed size

wicked notch Jul 20, 2023, 10:26 PM

#

Yeah but

#

Does AMD parallelize traversal

frank sail Jul 20, 2023, 10:26 PM

#

what does that mean

wicked notch Jul 20, 2023, 10:27 PM

#

as in, instead of

```cpp
while (!stack.is_empty()) {
    // traverse
}
```

frank sail Jul 20, 2023, 10:27 PM

#

each thread has its own ray and does its own traversal and intersection

wicked notch Jul 20, 2023, 10:27 PM

#

You do something fancier

frank sail Jul 20, 2023, 10:27 PM

#

and each thread has its own stack

wicked notch Jul 20, 2023, 10:27 PM

#

Fair enough

frank sail Jul 20, 2023, 10:27 PM

#

the shader compiler generates a traversal kernel

#

the only thing the hardware accelerates on AMD is bvh node and triangle intersection

#

the actual traversal is just regular code

wicked notch Jul 20, 2023, 10:28 PM

#

Very nice

frank sail Jul 20, 2023, 10:28 PM

#

using a "stackless" (actually fixed size stack) method
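A minimal sketch of the "one ray, one fixed-size stack per thread" traversal loop described above, written for a flat binary BVH. The node layout, `kStackSize`, and the function names are assumptions made for this example; it is not the library's or any driver's actual code:

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

struct Ray { float org[3]; float dir[3]; float tmin; float tmax; };

struct Node {
    float bmin[3], bmax[3];
    std::uint32_t first; // inner: left child index (right is first + 1); leaf: first primitive
    std::uint32_t count; // 0 for inner nodes, number of primitives for leaves
};

// Ray/box slab test. Relies on IEEE infinities when a direction component is 0.
inline bool hit_box(const Node& n, const Ray& r) {
    float t0 = r.tmin, t1 = r.tmax;
    for (int a = 0; a < 3; ++a) {
        const float inv = 1.0f / r.dir[a];
        float t_near = (n.bmin[a] - r.org[a]) * inv;
        float t_far = (n.bmax[a] - r.org[a]) * inv;
        if (t_near > t_far) std::swap(t_near, t_far);
        t0 = std::max(t0, t_near);
        t1 = std::min(t1, t_far);
    }
    return t0 <= t1;
}

constexpr int kStackSize = 64; // fixed capacity, so it also fits in shader-local memory

// Calls on_leaf(first_prim, prim_count) for every leaf whose box the ray hits.
template <typename LeafFn>
void traverse(const std::vector<Node>& nodes, const Ray& ray, LeafFn&& on_leaf) {
    std::uint32_t stack[kStackSize]; // local variable: every thread gets its own
    int top = 0;
    stack[top++] = 0; // start at the root
    while (top > 0) {
        const Node& n = nodes[stack[--top]];
        if (!hit_box(n, ray)) continue;
        if (n.count > 0) {
            on_leaf(n.first, n.count);
        } else if (top + 2 <= kStackSize) { // children are dropped if the stack is full
            stack[top++] = n.first;
            stack[top++] = n.first + 1;
        }
    }
}
```

The same loop can be ported to a compute shader almost line by line, with one invocation per ray; nothing inside a single ray's traversal is parallelized.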

wicked notch Jul 20, 2023, 10:28 PM

#

I guess traversal is inherently difficult to parallelize

#

I mean, where would you even begin

#

Each step depends on the previous

frank sail Jul 20, 2023, 10:29 PM

#

stop thinking about parallelizing traversal bleakekw

#

each thread has its own ray to worry about

wicked notch Jul 20, 2023, 10:29 PM

#

I guess I might bleakekw

#

nah jk

#

I'll do it the lame, easy way

#

Oh I got an idea

frank sail Jul 20, 2023, 10:30 PM

#

unless you have a scene with 10^100 triangles and literally only a single ray, parallelizing traversal doesn't seem very helpful

wicked notch Jul 20, 2023, 10:30 PM

#

Perhaps work expansion could help

#

each work package does the traversal for its own level

#

and dispatches more work for the next level

#

until leaves are reached
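The work-expansion idea above can be sketched on the CPU as a level-by-level traversal, where each loop iteration stands in for one dispatch. Everything here (`traverse_by_level`, the `Node` layout, the `ray_hits_node` callback) is hypothetical and only meant to show the shape of the idea:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Node {
    std::uint32_t first; // inner: left child index (right is first + 1)
    std::uint32_t count; // 0 for inner nodes, > 0 for leaves
};

// Processes the tree one level at a time. Each iteration of the outer loop
// plays the role of one dispatch that emits work for the next level.
// Returns the indices of the leaf nodes the ray reached.
template <typename HitFn>
std::vector<std::uint32_t> traverse_by_level(const std::vector<Node>& nodes,
                                             HitFn&& ray_hits_node,
                                             std::size_t* level_count = nullptr) {
    std::vector<std::uint32_t> leaves;
    std::vector<std::uint32_t> frontier{0u}; // start with the root
    std::size_t levels = 0;
    while (!frontier.empty()) {
        std::vector<std::uint32_t> next;
        for (std::uint32_t i : frontier) { // these items could run in parallel
            if (!ray_hits_node(i)) continue;
            const Node& n = nodes[i];
            if (n.count > 0) {
                leaves.push_back(i);
            } else {
                next.push_back(n.first);
                next.push_back(n.first + 1);
            }
        }
        frontier.swap(next);
        ++levels;
    }
    if (level_count) *level_count = levels;
    return leaves;
}
```

Note that this needs one pass per tree level and visits nodes without any front-to-back ordering, so it cannot use a closer hit to skip farther subtrees the way the stack-based loop can.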

#

ok I'll stop now

#

KEKW

frank sail Jul 20, 2023, 10:31 PM

#

actually that does seem kinda interesting for making memory access more coherent

#

but you'd have one dispatch per level of the bvh, and each dispatch would become increasingly incoherent as there are more nodes

#

prolly not worth tbhbh

wicked notch Jul 20, 2023, 10:32 PM

#

I mean if the big brains at NV and AMD are doing it this way then it's not worth thinking about just yet

#

anyways

#

I'm quite happy I managed to understand BVHs this quickly

#

I was expecting a more gruesome and bloody thing

frank sail Jul 20, 2023, 10:34 PM

#

what you're proposing sounds like distributing the work of a single ray's traversal to several threads. but you're already gonna have millions of rays, so you can shrimply have each thread compute one ray

wicked notch Jul 20, 2023, 10:34 PM

#

ye sounds about right

#

we gotta shade cornell box

frank sail Jul 20, 2023, 10:35 PM

#

wdym it looks pretty shaded already

#

hol up, now it's shaded

distant lodge Jul 20, 2023, 10:37 PM

#

this is what AMD devrel does to your shaders

#

they don't want you to know this

wicked notch Jul 20, 2023, 10:40 PM

#

Alright boys

#

poll time

#

Do I first shade this on the CPU or do I immediately start writing a shader

frank sail Jul 20, 2023, 10:42 PM

#

for learning purposes it's probably easier to start with the CPU

#

and you also don't have to use a shit (shading) lang while you're learning

wicked notch Jul 20, 2023, 10:42 PM

#

How much do I have to pay you for you to make a good shading language btw

frank sail Jul 20, 2023, 10:43 PM

#

uhh

#

tree fiddy

wicked notch Jul 20, 2023, 10:43 PM

#

deal

frank sail Jul 20, 2023, 10:43 PM

#

(approx)

distant lodge Jul 20, 2023, 10:43 PM

#

just write your own shading lang that will fix what's wrong with GLSL for real this time

wicked notch Jul 20, 2023, 10:44 PM

#

I have no idea how to write languages

wispy spear Jul 20, 2023, 10:44 PM

#

call it glsl 2.0

frank sail Jul 20, 2023, 10:47 PM

#

too bad shading languages still require some knowledge of graphics APIs

#

e.g., you still need a concept of resource binding

wicked notch Jul 20, 2023, 10:49 PM

#

I know what that is fortunately bleakekw

frank sail Jul 20, 2023, 10:50 PM

#

if you're using cutting-edge vulkan, at least you can use BDA and descriptor indexing to shrimplify that stuff a bit

#

but I doubt you can make anything cuda-like without also providing your own API that wraps stuff nicely

wicked notch Jul 20, 2023, 11:03 PM

#

I don't want cuda like tbh

#

I want to learn how to do RT in glsl

#

so I can use it in Iris KEKW

#

And frogfood as well

frank sail Jul 20, 2023, 11:03 PM

#

I'm just saying that shading languages suck and cuda is a much nicer environment to use

wicked notch Jul 20, 2023, 11:04 PM

#

ye true

frank sail Jul 20, 2023, 11:04 PM

#

and I'd like to be able to have something similar for graphics

wicked notch Jul 20, 2023, 11:04 PM

#

you are at AMD

frank sail Jul 20, 2023, 11:04 PM

#

without vendor lock-in bleakekw

wicked notch Jul 20, 2023, 11:04 PM

#

just pester some graphics engineer or something

#

you can use advanced tactics like:
guns
guns
more guns
intercontinental ballistic missiles (in case they escape)

frank sail Jul 20, 2023, 11:05 PM

#

the tf2 mercenary approach to persuasion

wicked notch Jul 20, 2023, 11:22 PM

#

Alright I have materials and normals

#

tomorrow we'll be doing good ol path tracing

wicked notch Jul 21, 2023, 11:00 AM

#

as it turns out

#

intersecting a bvh is hard

#

bleakekw

minor root Jul 21, 2023, 11:15 AM

#

it do be
