Iris - A Journey through OpenGL and beyond to learn Graphics | Graphics Programming | Page 23

primal shadow Sep 26, 2024, 3:56 AM

#

from unreal 5.0's docs

#

We achieve this by performing quantization in object space using a user-selectable power of two step size centered around the object origin.
It is crucial that the step size is not normalized to the bounds of the object or in other ways tied to its dimensions.
From the nanite siggraph presentation

#

So yeah, feels a bit contradictory

#

I assume they just choose a step size based on how big the object is, under the assumption that meshes that will be used together should have similar sizes, and therefore the same size-based heuristic should lead to the same quantization factor, and it'll work out fine when the meshes both get used with transforms that are multiples of the step size
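A minimal sketch of why a shared, bounds-independent step size avoids cracks, assuming a power-of-two step and a translation that is a whole number of steps. The helper names `quantize` and `dequantize` are made up for illustration and are not Unreal's code.

```python
def quantize(position, step_exponent):
    """Snap an object-space position to a grid with step 2**step_exponent.

    The grid is centered on the object origin and the step does not depend
    on the mesh bounds (hypothetical helper for illustration).
    """
    step = 2.0 ** step_exponent
    return tuple(round(c / step) for c in position)


def dequantize(cell, step_exponent):
    """Turn integer grid coordinates back into object-space positions."""
    step = 2.0 ** step_exponent
    return tuple(c * step for c in cell)


step_exponent = -4  # step = 1/16 of a unit, shared by both meshes

# A vertex on the seam between mesh A (at the origin) and mesh B,
# where B is translated by exactly 3 steps along +x.
shared_world = (1.30, 0.52, -0.77)
offset_b = (3 * 2.0 ** step_exponent, 0.0, 0.0)

in_a = shared_world
in_b = tuple(w - o for w, o in zip(shared_world, offset_b))

# Quantize in each mesh's object space, then move back to world space.
world_a = dequantize(quantize(in_a, step_exponent), step_exponent)
local_b = dequantize(quantize(in_b, step_exponent), step_exponent)
world_b = tuple(d + o for d, o in zip(local_b, offset_b))
```

Both meshes land the seam vertex on the same world-space grid point, so the seam stays closed. If either step size were derived from its mesh's bounds, the two grids would generally not line up.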

primal shadow Sep 27, 2024, 7:00 PM

#

memory allocation of 4548506711262943144 bytes failed uh oh

wicked notch Sep 27, 2024, 7:02 PM

#

imagine failing to allocate 4039 petabytes smh

#

are you stuck in 2024

primal shadow Sep 27, 2024, 7:22 PM

#

memory allocation of 4548506711262943144 bytes failed
[Inferior 1 (process 209) exited with code 011]
(gdb) bt
No stack.
(gdb)

wtf

wispy spear Sep 27, 2024, 7:32 PM

#

how many exa bytes is that

primal shadow Sep 27, 2024, 7:33 PM

#

idk what went wrong :((

primal shadow Sep 27, 2024, 8:05 PM

#

oh whoops, wasn't my code's fault

#

forgot to copy paste some asset writing code into my testing setup

buoyant summit Sep 28, 2024, 9:37 PM

#

primal shadow memory allocation of 4548506711262943144 bytes failed [Inferior 1 (process 2...

I mean what are you surprised about

#

it exited

#

normally

#

if you were to crash it, like with abort(), then it'd work

primal shadow Sep 28, 2024, 11:07 PM

#

Nooo, I finished my shaders for decoding, and all it's rendering is points D:

primal shadow Sep 28, 2024, 11:31 PM

#

hmm, so every vertex has the same position, huh...

#

oh, LOL

#

totally forgot to use the vertex_id parameter of my get_meshlet_vertex_position() function 😅

#

that would explain it

#

oh god

fiery bolt Sep 29, 2024, 12:57 AM

#

perfectly shippable

wicked notch Sep 29, 2024, 12:57 AM

#

looks bunny enough to me

#

ship it

frank sail Sep 29, 2024, 2:06 AM

#

it's a buggy

fiery bolt Sep 29, 2024, 2:07 AM

#

bugs bunny!

primal shadow Sep 29, 2024, 2:54 AM

#

I tested it on a cube, it's rendering as a 2d plane

#

heck

primal shadow Sep 29, 2024, 6:41 AM

#

Bleh I can't figure out how to fix it, as renderdoc seems to be showing my fake values in the debugger :/

wispy spear Sep 29, 2024, 8:36 AM

#

primal shadow oh god

artifishial paint simulator, i like it. that blue bunny on the right ackchually looks quite cool

#

i will steal the image and use it as the new server banner : )

primal shadow Sep 29, 2024, 11:19 PM

#

Ah I have discovered my issue, fml

#

the bitstream reader I did on the GPU does not account for when the bits of a vertex are split across two different buffer elements :/
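A minimal sketch of a reader that handles the split case, written in Python with a list of integers standing in for the GPU's `u32` buffer. The least-significant-bit-first layout and the names `read_bits` / `write_bits` are assumptions for illustration, since the chat does not show the real reader.

```python
def read_bits(words, bit_offset, bit_count):
    """Read bit_count (1..32) bits starting at bit_offset from a list of u32 words.

    Bits are stored least-significant-first within each word (an assumption).
    """
    assert 1 <= bit_count <= 32
    word_index = bit_offset // 32
    bit_in_word = bit_offset % 32

    # Bits available in the first word.
    value = words[word_index] >> bit_in_word

    bits_from_first = 32 - bit_in_word
    if bit_count > bits_from_first:
        # The value straddles two buffer elements: take the remaining
        # high bits from the start of the next word.
        value |= words[word_index + 1] << bits_from_first

    return value & ((1 << bit_count) - 1)


def write_bits(words, bit_offset, bit_count, value):
    """Matching writer, used here only to build test data."""
    for i in range(bit_count):
        if (value >> i) & 1:
            pos = bit_offset + i
            words[pos // 32] |= 1 << (pos % 32)


words = [0, 0]
write_bits(words, 0, 27, 0x5A5A5A5)  # bits 0..26, fits in word 0
write_bits(words, 27, 11, 0x6B3)     # bits 27..37, split across words 0 and 1

first = read_bits(words, 0, 27)
second = read_bits(words, 27, 11)
```

In a shader the second branch matters for another reason too: it is only taken when `bits_from_first` is below 32, so the left shift never uses a shift amount of 32.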

primal shadow Oct 1, 2024, 3:14 AM

#

Spent 3h trying to fix this, broke my brain

#

didn't figure it out

wispy spear Oct 1, 2024, 4:44 PM

#

let the brain grow together again

#

https://tenor.com/view/greys-anatomy-addison-montgomery-well-try-again-try-again-give-it-another-shot-gif-25611967

Tenor

primal shadow Oct 1, 2024, 5:16 PM

#

People in #webgpu helped me out, think I got it working

#

Shading looked a bit off though, I think I'm compressing normals too much

primal shadow Oct 1, 2024, 7:56 PM

#

I'm not compressing UVs, but so far per-meshlet buffer is not a size savings, at least for the bunny mesh

#

I guess the triangle compression I can do should improve it more

#

And streaming will give me memory savings

wide shadow Oct 1, 2024, 8:00 PM

#

compressing UVs should be safe from f32vec2 -> u32 (packed 2 halfs)
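A minimal sketch of the packing being suggested, using Python's `struct` half-float format. Putting the first component in the low 16 bits mirrors what WGSL's `pack2x16float` does; the function names here are made up for illustration.

```python
import struct


def pack_half2x16(u, v):
    """Pack two floats as IEEE half-precision values into one 32-bit integer.

    u goes in the low 16 bits, v in the high 16 bits.
    """
    lo, hi = struct.unpack("<HH", struct.pack("<ee", u, v))
    return lo | (hi << 16)


def unpack_half2x16(packed):
    """Inverse of pack_half2x16."""
    raw = struct.pack("<HH", packed & 0xFFFF, packed >> 16)
    return struct.unpack("<ee", raw)


packed = pack_half2x16(0.5, 0.25)
restored = unpack_half2x16(packed)

# Worst-case rounding error for a UV in [0.5, 1): half of the 2**-11 spacing.
uv_error = abs(unpack_half2x16(pack_half2x16(0.7, 0.0))[0] - 0.7)
```

Half floats have a spacing of 2^-11 for values in [0.5, 1), which is about two texels on a 4096-wide texture, so whether this is safe depends on texture resolution and how far UVs stray from zero.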

primal shadow Oct 1, 2024, 8:02 PM

#

I've heard very bad things about 16bit UVs

wide shadow Oct 1, 2024, 8:03 PM

#

I mean it depends on the asset bleakekw

primal shadow Oct 1, 2024, 8:24 PM

#

Ok yeah, using octahedral encode, and snorm2x16 is too inaccurate

#

Wait am I supposed to be using snorm, or unorm 🤔

#

What's the output of octahedral encode?

#

[0,1] or [-1,1]?
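For reference, a sketch of the commonly used octahedral mapping, assuming the textbook formulation rather than any particular engine's helper. In that form the encoder outputs values in [-1, 1], which suits snorm directly; a unorm format needs the extra remap shown in `to_unorm`.

```python
import math


def sign_not_zero(v):
    return 1.0 if v >= 0.0 else -1.0


def octahedral_encode(n):
    """Map a unit vector to a point in the [-1, 1] x [-1, 1] square."""
    x, y, z = n
    s = abs(x) + abs(y) + abs(z)
    u, v = x / s, y / s
    if z < 0.0:
        # Fold the lower hemisphere over the diagonals of the square.
        u, v = (1.0 - abs(v)) * sign_not_zero(u), (1.0 - abs(u)) * sign_not_zero(v)
    return u, v


def octahedral_decode(e):
    """Inverse mapping; returns a normalized vector."""
    u, v = e
    z = 1.0 - abs(u) - abs(v)
    if z < 0.0:
        u, v = (1.0 - abs(v)) * sign_not_zero(u), (1.0 - abs(u)) * sign_not_zero(v)
    length = math.sqrt(u * u + v * v + z * z)
    return u / length, v / length, z / length


def to_unorm(e):
    """Remap [-1, 1] to [0, 1] for storage in a unorm format."""
    return tuple(c * 0.5 + 0.5 for c in e)


length = math.sqrt(0.3 ** 2 + 0.5 ** 2 + 0.8 ** 2)
normal = (0.3 / length, -0.5 / length, -0.8 / length)
roundtrip = octahedral_decode(octahedral_encode(normal))
```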

wispy spear Oct 1, 2024, 8:27 PM

#

@wicked notch Mega Lights 2.0 when?

delicate rain Oct 1, 2024, 8:28 PM

#

Are they rt btw or is it not known?

#

Must be rt no?

wispy spear Oct 1, 2024, 8:29 PM

#

i have no idea, only saw the summary at gamesfromscratch

primal shadow Oct 1, 2024, 8:54 PM

#

delicate rain Are they rt btw or is it not known?

It's up on their github. It's RT based, but no ReSTIR surprisingly.

delicate rain Oct 1, 2024, 8:55 PM

#

Interesting

primal shadow Oct 1, 2024, 9:03 PM

#

https://xcancel.com/knarkowicz/status/1830232208164175898#m

Nitter

Krzysztof Narkowicz (@knarkowicz)

It's still in the research phase of finding the best solution (best tradeoffs for our use cases) and things may change, but at the moment there's no ReSTIR in MegaLights.

finite quartz Oct 1, 2024, 9:16 PM

#

delicate rain Are they rt btw or is it not known?

delicate rain Oct 1, 2024, 9:17 PM

#

Stochastic light sampling doesn't really say much

#

But yeah makes sense, thank you!

primal shadow Oct 1, 2024, 9:41 PM

#

Pray I do not quantize you further

fiery bolt Oct 1, 2024, 10:03 PM

#

automatic minecraft shader

#

ship it

fiery bolt Oct 1, 2024, 10:39 PM

#

finite quartz

but how do they RT properly with nanite meshes froge_sad

wicked notch Oct 1, 2024, 10:44 PM

#

fallback repr

fiery bolt Oct 1, 2024, 10:44 PM

#

that's not 'properly'

#

that's cringe

#

incredibly cringe

#

so cringe that i'm probably gonna do fallback repr myself

primal shadow Oct 1, 2024, 11:04 PM

#

fiery bolt but how do they RT properly with nanite meshes froge_sad

That's what I want to know

wicked notch Oct 1, 2024, 11:29 PM

#

it is what unreal does

#

you might not like it

#

but unreal doesn't care that you don't like it

fiery bolt Oct 1, 2024, 11:52 PM

#

but what if I ping the entire epic games developer github org

wicked notch Oct 2, 2024, 12:00 AM

#

hmm bold strategy

#

almost as bold as the king's gambit

fiery bolt Oct 3, 2024, 2:10 PM

#

@primal shadow just fixed another silly mistake in the edge detection, maybe try it now 🙃

ebon ruin Oct 3, 2024, 3:11 PM

#

Hello Nanite folks

#

my meshlet generator sometimes creates split meshlets where a meshlet would be in two pieces (that can be quite far from each other). Will this be problematic when implementing the LOD tree?

wicked notch Oct 3, 2024, 3:15 PM

#

yes

ebon ruin Oct 3, 2024, 3:17 PM

#

Here's an example:

#

oops

#

Screenshot_2024-10-03_at_11.17.45_AM.png

ebon ruin Oct 3, 2024, 3:18 PM

#

wicked notch yes

i see, thanks

primal shadow Oct 3, 2024, 5:13 PM

#

fiery bolt @primal shadow just fixed another silly mistake in the edge detection, ma...

Will try later tn

primal shadow Oct 4, 2024, 3:23 AM

#

https://github.com/bevyengine/bevy/pull/15643

GitHub

Per-meshlet compressed vertex data by JMS55 · Pull Request #15643 ·...

Objective

Prepare for streaming by storing vertex data per-meshlet, rather than per-mesh (that means duplicating vertices per-meshlet)
Compress vertex data to reduce the cost of this

Solution

Po...

primal shadow Oct 4, 2024, 5:44 AM

#

primal shadow Will try later tn

tomorrow*, ended up getting paged for work

primal shadow Oct 5, 2024, 3:00 AM

#

Question for people: how are you generating the LOD bounds for the group?

#

not for individual meshlets

fiery bolt Oct 5, 2024, 12:40 PM

#

for base groups, merge the bounds of the meshlets (or just calculate it directly)

#

then merge the group bounds of all meshlets for higher LODs

primal shadow Oct 5, 2024, 6:28 PM

#

Oops forgot to update it here, I figured it out

#

Yep that's what I figured out was correct

#

Except apparently getting a minimal bounding sphere of bounding spheres is a 183 page thesis xD

#

There's an open source implementation in the cgal library, but it's gpl v3 :/

#

So guess I'll go with an approximate method

fiery bolt Oct 5, 2024, 8:01 PM

#

primal shadow Except apparently getting a minimal bounding sphere of bounding spheres is a 183...

yep i just merge two at a time KEKW
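A minimal sketch of the "merge two at a time" approach. Merging two spheres is exact, but folding that over a list only gives a conservative enclosing sphere, not the minimal one the thesis computes. Spheres are `(center, radius)` tuples and the function names are made up for illustration.

```python
import math


def merge_spheres(a, b):
    """Smallest sphere enclosing spheres a and b, each given as (center, radius)."""
    (ca, ra), (cb, rb) = a, b
    d = math.dist(ca, cb)
    # One sphere already contains the other.
    if d + rb <= ra:
        return a
    if d + ra <= rb:
        return b
    radius = (d + ra + rb) / 2.0
    # Slide the center from ca towards cb until the far side of a is just covered.
    t = (radius - ra) / d
    center = tuple(pa + (pb - pa) * t for pa, pb in zip(ca, cb))
    return center, radius


def merge_many(spheres):
    """Fold merge_spheres over a list. Encloses every input, generally not minimal."""
    result = spheres[0]
    for s in spheres[1:]:
        result = merge_spheres(result, s)
    return result


inputs = [((0.0, 0.0, 0.0), 1.0), ((4.0, 0.0, 0.0), 1.0), ((2.0, 5.0, 0.0), 0.5)]
pair = merge_spheres(inputs[0], inputs[1])
merged_center, merged_radius = merge_many(inputs)
```

The result depends on the order the spheres are merged in, which is one reason it can end up looser than the true minimum.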

primal shadow Oct 5, 2024, 9:34 PM

#

Did some asset size / quality / perf comparisons between compressed per-meshlet vertex data, and a single set of uncompressed vertex data shared between all meshlets. Sadly asset size is nearly identical. On the upside, quality and perf are also identical, I can implement streaming now, and there's still room to further compress vertex data so more wins in the future hopefully. https://github.com/bevyengine/bevy/pull/15643#issuecomment-2395198350

#

UVs are completely uncompressed (64 bits), normals can probably be quantized and variable-length encoded similar to positions rather than the current 32 bits per vertex, and triangle data can be compressed with fancy triangle strip encodings (currently 24 bits per triangle, 8 bits per index * 3)

faint crane Oct 5, 2024, 10:53 PM

#

primal shadow Except apparently getting a minimal bounding sphere of bounding spheres is a 183...

Do you have a link handy? Can ~~procrastinate further on occlusion culling~~ give it a shot.

primal shadow Oct 5, 2024, 11:13 PM

#

faint crane Do you have a link handy? Can ~~procrastinate further on occlusion culling~~ giv...

https://people.inf.ethz.ch/emo/DoctThesisFiles/fischer05.pdf have fun

fiery bolt Oct 6, 2024, 1:45 AM

#

faint crane Do you have a link handy? Can ~~procrastinate further on occlusion culling~~ giv...

oh yeah that reminds me that my culling is a wee bit broken along the bottom and right edges of the screen

primal shadow Oct 6, 2024, 8:30 PM

#

I'm realizing I have no idea how nanite's error projection is supposed to work

#

Each cluster needs: bounding sphere of its group, and of its parent group

#

and then, parent group bounding spheres must strictly encompass all of their child bounding spheres

#

but then how do you mix error into this setup? Where does the error come in during the building and runtime steps?

#

I suppose the test is group_can_be_rendered = projected_sphere_radius(group.center, group.radius) < group.error_radius

#

I.e. the group has error_radius deformity from its children, so if the size of the group on screen is less than that, it's basically equivalent to its children, so it's ok to render

#

no, that doesn't seem quite right either

fiery bolt Oct 6, 2024, 8:47 PM

#

yeah me neither

#

it's a pain

#

what i was originally doing was placing a sphere on the closest point of the lod bounds and checking its projected screenspace radius

#

(and clamping to the camera)

#

but that leads to holes for some reason

#

https://github.com/SparkyPotato/radiance/blob/337436d2a0bd57bcb2f558f687fcf98a6a650150/shaders/passes/mesh/cull.slang#L238

GitHub

radiance/shaders/passes/mesh/cull.slang at 337436d2a0bd57bcb2f558f6...

Rendering things. Contribute to SparkyPotato/radiance development by creating an account on GitHub.

#

now i do this

#

but i forgot what it's actually doing bleaker_kekw

#

and it sometimes leads to double-rendering i think

fiery bolt Oct 6, 2024, 8:49 PM

#

fiery bolt but i forgot what it's actually doing bleaker_kekw

(the comment is wrong)

#

(please tell me if you come up with something better)

primal shadow Oct 6, 2024, 9:09 PM

#

Based on https://vcg.isti.cnr.it/~ponchio/download/ponchio_phd.pdf 3.6.1, it sounds like we should:

#

For each cluster, store culling bounding sphere, lod bounding sphere, and error

#

Leaf meshlets (i.e. initial set of starting meshlets): Generate the culling bounding sphere, lod bounding sphere is a copy of the culling bounding sphere, error = 0

#

And then you group meshlets, simplify the group, and split into new meshlets

#

And for each new meshlet: compute culling bounding sphere, lod bounding sphere = new sphere encompassing lod bounding sphere of all children in the group, and error = max(simplification_error, child1_error, child2_error, ...)

#

Ok so that's building, now you have for each cluster: culling sphere, lod sphere, and error
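A minimal sketch of the builder-side bookkeeping described above, under the assumption that every cluster produced from one simplified group shares that group's LOD sphere and error. `Cluster`, `make_leaf`, `make_parents` and the deliberately loose `enclosing_sphere` are hypothetical names, not taken from the thesis or from any of the linked code.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    culling_sphere: tuple  # (center, radius), tight bounds of this cluster's geometry
    lod_sphere: tuple      # (center, radius), shared by all clusters from one group
    error: float


def enclosing_sphere(spheres):
    """Conservative (not minimal) sphere around a list of (center, radius) spheres."""
    n = len(spheres)
    center = tuple(sum(c[i] for c, _ in spheres) / n for i in range(3))
    radius = max(math.dist(center, c) + r for c, r in spheres)
    return center, radius


def make_leaf(culling_sphere):
    # Leaves: LOD sphere is a copy of the culling sphere, error is zero.
    return Cluster(culling_sphere, culling_sphere, 0.0)


def make_parents(children, simplification_error, new_culling_spheres):
    """Build the clusters that result from simplifying one group of children."""
    lod_sphere = enclosing_sphere([c.lod_sphere for c in children])
    # Taking the max with the children's errors keeps error monotonic up the hierarchy.
    error = max(simplification_error, *(c.error for c in children))
    return [Cluster(cs, lod_sphere, error) for cs in new_culling_spheres]


leaves = [make_leaf(((0.0, 0.0, 0.0), 1.0)), make_leaf(((2.0, 0.0, 0.0), 1.0))]
parents = make_parents(leaves, 0.1, [((1.0, 0.0, 0.0), 1.5)])
grandparents = make_parents(parents, 0.05, [((1.0, 0.0, 0.0), 1.2)])
```

Note that the grandparent keeps an error of 0.1 even though its own simplification step only introduced 0.05, which is the monotonic behaviour the runtime test relies on.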

#

now at runtime you gotta do this

#

tight bounding sphere = meshlet cluster bounding sphere

#

...and this runtime part I'm still reading the paper to figure out

#

but anyways you do this, and also for the parent sphere somehow?

#

and then you draw if self == good && parent == bad

#

finding the minimum enclosing ball of points (for leafs) and minimum enclosing ball of balls (for all other nodes) [35]
Oh hey, they reference Fischer's thesis, ok cool so I was on the right track with that

primal shadow Oct 6, 2024, 9:31 PM

#

it seems like you're supposed to take the cluster's bounding sphere, find the closest point on the surface to the viewport, and then project a new sphere where the center is that point, and the radius is your error

#

so that tells you whether or not the current cluster has visible error, but what do you do about the parent???

#

And when you're building a BVH like nanite does, you can't involve the cluster bounding sphere at all, it has to be based solely on group/LOD data

#

So nanite must be doing something different here

#

really I think my original idea I've been using for the past year is on track

#

the cluster bounding sphere doesn't matter

#

what matters is for the cluster's group, and the cluster's parent group (group with cluster in it before simplifying),

#

given the LOD bounding sphere (located somewhere), with radius = group error, is the projected size of that group small enough such that the error is invisible?

#

The problem is, where do you choose to locate that sphere?

#

that's really the key question

#

because if you're saying radius = error, and you force error to be monotonic, then the bounding sphere projections will always be monotonic if they're located in the same spot

#

the issue is if you start moving where the bounding spheres are, then you run into issues

#

so where the heck do you choose to locate it??

primal shadow Oct 6, 2024, 10:06 PM

#

I think the way Nanite does this is not necessarily straightforward sphere projection

#

You have the group bounding sphere (encompassing all child group bounding spheres), and the group error of the cluster

#

And then you somehow project that error to the screen using the group sphere bounds

#

but it's not projecting the sphere itself? Something like that

primal shadow Oct 6, 2024, 10:09 PM

#

primal shadow now at runtime you gotta do this

Maybe it is this

#

tight bounding sphere = group bounds

#

and then you find the closest point on the group bounds to the viewport, make a new sphere centered there with radius = error, and then calculate projected size of that sphere?

fiery bolt Oct 6, 2024, 11:10 PM

#

primal shadow Maybe it _is_ this

that is using the saturated sphere (lod bounds) and placing the error sphere on the closest point on that, and then projecting it to the screen

#

unfortunately it doesn't work if you're inside the lod bound sphere

#

or it doesn't work with a bvh, idk

#

didn't work for me when i tried it

primal shadow Oct 7, 2024, 12:46 AM

#

fiery bolt unfortunately it doesn't work if you're inside the lod bound sphere

I think you can probably just force LOD 0 at that point

fiery bolt Oct 7, 2024, 12:51 AM

#

primal shadow I think you can probably just force LOD 0 at that point

they get quite large for things high up in the bvh

#

so there's holes in the mesh when the camera is inside one lod bound but not the other

#

or something like that

#

instead i calculate the distance that the error would be less than a pixel and check if the closest point on the lod sphere (or something like that) is closer or farther than that

primal shadow Oct 7, 2024, 2:41 AM

#

fiery bolt unfortunately it doesn't work if you're inside the lod bound sphere

What meshoptimizer's nanite demo is doing is returning infinity when inside the LOD sphere. That way infinity is never <= threshold, and therefore that LOD never gets selected.

#

Forcing you to pick a finer LOD

#

Think I'm going to try that with projecting an error-radius sphere on the closest point on the LOD sphere
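A rough sketch of the idea as described in these messages: measure the distance from the camera to the closest point of the LOD sphere, divide the error by it, and return infinity when the camera is inside. This is written from the description in the chat, not copied from meshoptimizer, and the function names and the meaning of `threshold` are assumptions.

```python
import math


def projected_error(lod_center, lod_radius, error, camera_position):
    """Error divided by the distance to the closest point of the LOD sphere.

    Returns infinity when the camera is inside the sphere, so the comparison
    against any finite threshold fails and this LOD is never selected.
    """
    d = math.dist(lod_center, camera_position) - lod_radius
    if d <= 0.0:
        return math.inf
    return error / d


def lod_is_acceptable(lod_center, lod_radius, error, camera_position, threshold):
    """threshold is the largest error-to-distance ratio treated as invisible."""
    return projected_error(lod_center, lod_radius, error, camera_position) <= threshold


center, radius, error = (0.0, 0.0, 0.0), 2.0, 0.01
outside = projected_error(center, radius, error, (0.0, 0.0, 10.0))
inside = projected_error(center, radius, error, (0.0, 0.0, 1.0))
```

For a perspective camera, `threshold` would typically be derived from the field of view and screen height so that it corresponds to about one pixel.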

primal shadow Oct 7, 2024, 4:58 AM

#

https://github.com/zeux/meshoptimizer/discussions/783

GitHub

Nanite.cpp LOD cut / error projection · zeux meshoptimizer · Discus...

I've been looking at improving my DAG building based on what you've done in nanite.cpp recently. One thing that stood out to me is that the way I'm using the LOD group bounds is definit...

fiery bolt Oct 7, 2024, 8:06 AM

#

primal shadow https://github.com/zeux/meshoptimizer/discussions/783

yeah I think I'm doing something similar to what zeux says nanite is doing here

#

but it's buggy so I might need to revisit that lol

primal shadow Oct 8, 2024, 4:27 AM

#

https://github.com/bevyengine/bevy/pull/15643#issuecomment-2398801204 big win for memory usage!

primal shadow Oct 8, 2024, 5:51 AM

#

LZ4 was somehow doing a shit ton of work before, considering the before/after asset size with LZ4 compression applied is basically the same

#

But memory usage is nearly halved after

primal shadow Oct 10, 2024, 2:21 AM

#

fiery bolt @primal shadow just fixed another silly mistake in the edge detection, ma...

Still on my backlog dw. Currently doing some changes to error projection and bounding spheres which should both improve LODs but also allow my converter code to work on larger meshes that it was crashing on before. After that, I'm going to go back to tweaking the builder code, and add the manual edge locking, larger meshlet groups, and probably attribute-aware simplification.

primal shadow Oct 10, 2024, 5:51 AM

#

Confusing

primal shadow Oct 10, 2024, 7:21 AM

#

Well this clearly didn't work

#

Much better, but I think it's vastly over-estimating error 😅

wispy spear Oct 10, 2024, 10:25 AM

#

primal shadow Well this clearly didn't work

you COULD start a side business and sell those as contemporary art installations

fiery bolt Oct 10, 2024, 10:30 AM

#

primal shadow

I think this is what I was describing that isn't always monotonic?

#

but which paper is that

primal shadow Oct 10, 2024, 5:28 PM

#

fn lod_error_is_imperceptible(sphere: MeshletBoundingSphere, error: f32, world_from_local: mat4x4<f32>, world_scale: f32) -> bool {
    let cp_world = world_from_local * vec4(sphere.center, 1.0);
    let r_view = world_scale * sphere.radius;
    let cp_view = (view.view_from_world * vec4(cp_world.xyz, 1.0)).xyz;

    // TODO: Handle view clipping / being inside sphere bounds
    let aabb = project_view_space_sphere_to_screen_space_aabb(cp_view, r_view);
    let screen_size = max(aabb.z - aabb.x, aabb.w - aabb.y);
    let meters_per_pixel = sphere.radius / screen_size;

    return error < meters_per_pixel;
}

Not documented and poorly named atm, but this

#

Take LOD sphere, project to screen space to get the pixel size

#

Then you do sphere.radius(?) / pixel_size

#

I.e. if your sphere has radius 10, and your pixel_size is 4

#

you get 10/4 = 2.5

#

E.g. 2.5 meters = 1 pixel on screen

#

And error is already an object-space distance in meters

#

So now if e.g. error = 3.2

#

Well 2.5 meters = 1 pixel

#

So 3.2 meters on screen is greater than 1 pixel

#

I.e. visible error

#

So it's sufficient to check that error < meters_per_pixel

#

I.e. error needs to be less than 2.5 meters so that it's less than 1 pixel on screen
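The arithmetic from this walkthrough, as a tiny sketch. It assumes the projected size is already measured in pixels; the function names follow the WGSL snippet above but are otherwise illustrative.

```python
def meters_per_pixel(sphere_radius, projected_size_px):
    """World-space length that covers one pixel, per the walkthrough above."""
    return sphere_radius / projected_size_px


def error_is_imperceptible(error, sphere_radius, projected_size_px):
    return error < meters_per_pixel(sphere_radius, projected_size_px)


mpp = meters_per_pixel(10.0, 4.0)                            # radius 10, 4 pixels
visible_case = error_is_imperceptible(3.2, 10.0, 4.0)        # 3.2 m of error
invisible_case = error_is_imperceptible(2.0, 10.0, 4.0)      # 2.0 m of error
```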

#

Since the relation between meters and pixel on screen is linear

#

I do need to handle clipping when inside the sphere bounds though. The paper covers it.

#

I'm not quite convinced on some of this though. And it feels really weird to project the LOD sphere and then compare the size to the simplification error, rather than projecting the simplification error directly.

fiery bolt Oct 10, 2024, 6:41 PM

#

the screenshot you sent above mentions something about comparing error directly with distance * some threshold thonk

#

this code looks different

primal shadow Oct 10, 2024, 6:44 PM

#

Yeah I didn't follow it

#

Because I have no idea how to handle the case where you're inside the sphere for that

#

The one from the batched multi triangulation paper

#

I also could not find code or the algorithm description for it at all, I'm giving up on that approach

fiery bolt Oct 10, 2024, 6:51 PM

#

primal shadow The one from the batched multi triangulation paper

could you send a link to it?

#

i can take a look and try to figure out what it is

primal shadow Oct 10, 2024, 6:53 PM

#

https://d-nb.info/997062789/34#page=48 sections 3.6.1 (specifically figure 3.15), and 4.2.3 on page 60

fiery bolt Oct 10, 2024, 6:58 PM

#

thanks!

fiery bolt Oct 10, 2024, 7:05 PM

#

primal shadow Since the relation between meters and pixel on screen is linear

it's not for perspective projections

#

bleakforg

primal shadow Oct 10, 2024, 7:06 PM

#

Can you help me understand what zeux is saying here then? https://github.com/zeux/meshoptimizer/discussions/783

#

UE5 Nanite computation is similar in principle, but mechanically different - it takes into account cases where the sphere is clipped by znear, and it takes camera orientation into account, so the computation is not camera rotation invariant. Conceptually I think it's the same as your reference to fig 3.15, even though I find that specific figure odd as no lines or points there connect to sphere radius 🙂 they compute the projected sphere radius in pixels, invert it (that way they get the length that would project to one pixel, using the same coordinates that the sphere is in), and compare that to the error (which is also in linear units in the same coordinates that the sphere is in).
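A simplified sketch of the "project the radius, invert it, compare to the error" idea from the quote. It assumes a perspective camera with the sphere centered on the view axis, and it ignores the near-plane clipping and camera-orientation handling that the quote attributes to Nanite. All names and parameters are illustrative.

```python
import math


def projected_radius_px(radius, distance, fov_y, screen_height_px):
    """On-screen radius in pixels of a sphere centered on the view axis."""
    if distance <= radius:
        return math.inf  # camera is inside the sphere
    cot_half_fov = 1.0 / math.tan(fov_y / 2.0)
    # Tangent of the half-angle the sphere subtends, scaled into NDC units.
    ndc_radius = radius * cot_half_fov / math.sqrt(distance * distance - radius * radius)
    return ndc_radius * screen_height_px / 2.0


def error_is_imperceptible(error, radius, distance, fov_y, screen_height_px):
    r_px = projected_radius_px(radius, distance, fov_y, screen_height_px)
    length_per_pixel = radius / r_px  # 0.0 when the camera is inside the sphere
    return error < length_per_pixel


fov = math.radians(90.0)
near_px = projected_radius_px(3.0, 5.0, fov, 1080)
far_px = projected_radius_px(3.0, 10.0, fov, 1080)
```

Doubling the distance here shrinks the projected radius by more than half, so the length that maps to one pixel depends on where the sphere sits relative to the camera.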

fiery bolt Oct 10, 2024, 7:07 PM

#

hmm

#

from what i understand, it's the same thing you're doing

#

but the inversion is probably more complex than just a div

#

it could also be calculating the distance at which the error becomes less than a pixel, and just comparing the closest point on the sphere with that

#

but the distance for 1px error depends on where the sphere is thonk

#

so there could be some normalization step to convert from an off-center sphere to a sphere in the center

primal shadow Oct 10, 2024, 7:42 PM

#

I have no clue

primal shadow Oct 10, 2024, 8:34 PM

#

fiery bolt it's not for perspective projections

Am I insane or does zeux make it sound like it's linear?

fiery bolt Oct 10, 2024, 8:44 PM

#

he does say that perspective distortion exists

#

which makes it nonlinear

primal shadow Oct 11, 2024, 12:44 AM

#

Ok lol so I forgot to multiply 0..1 by the view size 😅 , looks better now

#

Zeux also left me some more info, so I'm going to take a look at that too

primal shadow Oct 11, 2024, 1:51 AM

#

Ok I'm just stealing zeux's code

#

I give up trying to understand this

#

https://github.com/JMS55/bevy/blob/8b71b243ca51be1a59a7c40d9fbfd7a5083f04e7/crates/bevy_pbr/src/meshlet/cull_clusters.wgsl#L145

#

Not sure I handled ortho correctly but yeah

faint crane Oct 11, 2024, 2:52 AM

#

Is orthographic even useful with a Nanite implementation?

#

I guess reviewers will complain anyway, even if w=1.

primal shadow Oct 11, 2024, 3:03 AM

#

faint crane Is orthographic even useful with a Nanite implementation?

Shadow maps 🙂

#

https://github.com/bevyengine/bevy/pull/15846

GitHub

Meshlet new error projection by JMS55 · Pull Request #15846 · bevye...

New error projection code taken from @zeux's meshoptimizer nanite.cpp demo for determining LOD
Builder: compute_lod_group_data()
Runtime: lod_error_is_imperceptible()

faint crane Oct 11, 2024, 6:33 AM

#

briannaPls

#

If only I could go to the internet archive.

#

📎 Smallest_enclosing_balls_of_balls.pdf

primal shadow Oct 11, 2024, 11:43 PM

#

@fiery bolt what are you using for your error projection? Seems like you're using a method I don't understand based on the projected bounding sphere

fiery bolt Oct 11, 2024, 11:44 PM

#

i have no idea tbh

#

i did geometry on paper for it

#

idk where it went

#

froge_sad

#

it's not correct tho

primal shadow Oct 11, 2024, 11:46 PM

#

Hmm ok. Back to builder improvements.

fiery bolt Oct 11, 2024, 11:50 PM

#

you should multithread simplification

primal shadow Oct 11, 2024, 11:59 PM

#

No time for that, bevy release is very soon

#

Plan is to finish stealing zeux's error projection, steal some of the simplification improvements he had, write a hopefully faster and easy fill cluster buffers improvement, and then maybe fix SW raster if I have time

#

And then write the blog post for everything I did this cycle and help out with the rest of the release

#

Oh btw do you have a gltf -> virtual geometry mesh converter?

#

I'm curious how you're handling materials

fiery bolt Oct 12, 2024, 12:04 AM

#

primal shadow I'm curious how you're handling materials

as of now, by not

#

i don't have a renderer

#

it's just visbuf output and debug

fiery bolt Oct 12, 2024, 12:07 AM

#

primal shadow No time for that, bevy release is very soon

right i forget you have users KEKW

wicked notch Oct 12, 2024, 12:09 AM

#

"forgot you had users" is the most GP thing ever

fiery bolt Oct 12, 2024, 12:11 AM

#

no no we do it in the rust server too

#

specifically #games-and-graphics and # lang-dev

#

no that's not what i wanted discord

wicked notch Oct 12, 2024, 12:12 AM

#

discord moment

wicked notch Oct 12, 2024, 12:13 AM

#

fiery bolt specifically #games-and-graphics and # lang-dev

I claim ownership of these channels

#

they now belong to GP inc.

#

surrender or else 🔫 🐸

fiery bolt Oct 12, 2024, 12:13 AM

#

for that you must shill rust cutecatNE

wicked notch Oct 12, 2024, 12:14 AM

#

the first rule I'll implement is ban rust

fiery bolt Oct 12, 2024, 12:14 AM

#

along with a healthy dose of offtopic cat gifs

wicked notch Oct 12, 2024, 12:14 AM

#

that is allowed

fiery bolt Oct 12, 2024, 12:14 AM

#

https://tenor.com/ifnHXihE6QX.gif

Tenor

primal shadow Oct 12, 2024, 12:15 AM

#

wicked notch the first rule I'll implement is ban rust

So focused on the CPU language, you're missing the real issue of the GPU language smh

fiery bolt Oct 12, 2024, 12:15 AM

#

wicked notch the first rule I'll implement is ban rust

you're outnumbered here actually froge_evil

wicked notch Oct 12, 2024, 12:15 AM

#

primal shadow So focused on the CPU language, you're missing the real issue of the GPU languag...

I didn't know there was a difference froge_love

fiery bolt Oct 12, 2024, 12:16 AM

#

yes contrib to vcc

#

make it fully usable

wicked notch Oct 12, 2024, 12:16 AM

#

fiery bolt you're outnumbered here actually froge_evil

mfw this is my own channel and I'm outnumbered

#

I have been playing with vcc tho

#

without telling gob

#

I managed to make a transpiler

#

from clang to glsl

fiery bolt Oct 12, 2024, 12:17 AM

#

i should try to re-derive or figure out the error proj math in my code tomorrow thonk

fiery bolt Oct 12, 2024, 12:17 AM

#

wicked notch from clang to glsl

why

#

why would you

wicked notch Oct 12, 2024, 12:17 AM

#

because I didn't feel like reading the spirv spec

fiery bolt Oct 12, 2024, 12:17 AM

#

fiery bolt i should try to re-derive or figure out the error proj math in my code tomorrow ...

and figure out why it doesn't work sometimes and only sometimes

#

the spirv spec is surprisingly good ngl

wicked notch Oct 12, 2024, 12:18 AM

#

remember this is my first ever compiler bleaker_kekw

#

I know but I'm a compiler nub

fiery bolt Oct 12, 2024, 12:18 AM

#

i'm currently on my like

#

6th

#

probably 6th renderer too

#

but i think i passed my final interview for an internship at e🅱️ic despite my serious lack of braincells and knowledge froge

#

just maybe

wicked notch Oct 12, 2024, 12:22 AM

#

shit

#

we lost one

fiery bolt Oct 12, 2024, 12:22 AM

#

nono it's for lang dev not giraffics

wicked notch Oct 12, 2024, 12:22 AM

#

same thing, your soul will be sucked dry in exchange for monetary compensation

fiery bolt Oct 12, 2024, 12:23 AM

#

only for 3 months

#

if i even get the offer that is

wicked notch Oct 12, 2024, 12:23 AM

#

you will

#

you have brain

fiery bolt Oct 12, 2024, 12:23 AM

#

it's been 2 whole days since my interview

#

they've never taken more than a day to respond KEKW

wicked notch Oct 12, 2024, 12:24 AM

#

terminally online, just like us frfr

fiery bolt Oct 12, 2024, 12:24 AM

#

fr

#

60 hour work weeks

wicked notch Oct 12, 2024, 12:24 AM

#

damn

#

aren't there work laws or something where you're from bleaker_kekw

fiery bolt Oct 12, 2024, 12:33 AM

#

i mean yeah it's biscuits and tea country

#

so it can't be america levels of bad KEKW

#

in return i will get 30% the pay tho

#

(there are several banks offering the same pay for software dev interns as tesco shelf stackers)

#

oh and tesco themselves actually

wicked notch Oct 12, 2024, 12:39 AM

#

ye but 60hrs work weeks is insane

#

it's half here KEKW

#

(36)

faint crane Oct 12, 2024, 5:24 AM

#

Work at NVIDIA at that point and take the comp.

primal shadow Oct 12, 2024, 5:35 AM

#

Ok error projection finished. TODO:

Builder improvements from zeux
SW raster subpixel precision + top left rule
New fill cluster buffers

primal shadow Oct 12, 2024, 8:28 PM

#

Alright, manual vertex lock time

glass sphinx Oct 12, 2024, 8:57 PM

#

I want to do some lodding soon. I'll have to catch up here

#

how do you build the lod hierarchy levels?

wide shadow Oct 12, 2024, 9:01 PM

#

I can only point you here:
https://jms55.github.io/posts/2024-06-09-virtual-geometry-bevy-0-14/
https://jglrxavpok.github.io/

fiery bolt Oct 12, 2024, 9:01 PM

#

glass sphinx how do you build the lod hierarchy levels?

- split into meshlets
- group meshlets into groups of 8-16
- lock boundary vertices on those groups
- simplify groups
- split groups into meshlets
- pool all meshlets together and repeat
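A toy counting model of the loop above, to show how the number of meshlets shrinks per level. It tracks only triangle counts: the actual grouping by adjacency, boundary locking and simplification are replaced by the assumption that every group halves its triangle count. `build_lod_levels` and its default parameters are hypothetical.

```python
import math


def build_lod_levels(triangle_count, tris_per_meshlet=128, group_size=8):
    """Return the meshlet count at each LOD level, finest first."""
    levels = []
    meshlets = math.ceil(triangle_count / tris_per_meshlet)
    while True:
        levels.append(meshlets)
        if meshlets <= 1:
            break
        groups = math.ceil(meshlets / group_size)
        next_meshlets = 0
        for g in range(groups):
            # The last group may hold fewer than group_size meshlets.
            in_group = min(group_size, meshlets - g * group_size)
            group_tris = in_group * tris_per_meshlet
            simplified_tris = group_tris // 2  # stand-in for "simplify groups"
            # "split groups into meshlets"
            next_meshlets += math.ceil(simplified_tris / tris_per_meshlet)
        # "pool all meshlets together and repeat"
        meshlets = next_meshlets
    return levels


levels = build_lod_levels(128 * 64)
```

A real builder can stall before reaching a single meshlet, since locked boundaries limit how far each group can be simplified.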

#

where nanite slides smh

primal shadow Oct 12, 2024, 9:03 PM

#

glass sphinx how do you build the lod hierarchy levels?

My current code (still making some improvements literally today): https://github.com/JMS55/bevy/blob/42617d4abc6ec6ac4fba5c24db84d7e1f60666b5/crates/bevy_pbr/src/meshlet/from_mesh.rs#L63
Meshopt's demo: https://github.com/zeux/meshoptimizer/blob/master/demo/nanite.cpp
Nanite slides: https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf
Traverse research: https://blog.traverseresearch.nl/creating-a-directed-acyclic-graph-from-a-mesh-1329e57286e5

glass sphinx Oct 12, 2024, 9:03 PM

#

fiery bolt - split into meshlets - group meshlets into groups of 8-16 - lock boundary verti...

do i have to meet some criteria for those 8-16 groups?

fiery bolt Oct 12, 2024, 9:03 PM

#

meshlets in them should be touching or as close together as possible

glass sphinx Oct 12, 2024, 9:03 PM

#

I read the nanite slides but i still dont understand the hierarchy building

primal shadow Oct 12, 2024, 9:04 PM

#

If you use my blog post, I wouldn't copy the code exactly, make sure to reference mine/meshopt's code for the up-to-date changes. I have a new blog post coming in ~2-3 weeks that will be up to date.

glass sphinx Oct 12, 2024, 9:04 PM

#

fiery bolt - split into meshlets - group meshlets into groups of 8-16 - lock boundary verti...

how does the simplify groups step work?

#

is it N tris to N/2 tris?

fiery bolt Oct 12, 2024, 9:04 PM

#

https://github.com/SparkyPotato/radiance/blob/main/crates/asset/src/mesh/import.rs
this is my current code

fiery bolt Oct 12, 2024, 9:04 PM

#

glass sphinx is it N tris to N/2 tris?

yup

#

and you keep track of error introduced

#

and set that as the parent error for the unsimplified group, and self error for the simplified meshlets
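A minimal sketch of how those two error values get used at runtime, assuming the usual "own error is acceptable but parent's is not" test. The errors here are compared to a threshold directly; in a real renderer they would first be projected to screen space. The chain values are made up.

```python
import math


def cluster_is_drawn(self_error, parent_error, threshold):
    """A cluster is on the LOD cut when its own error is acceptable
    but the error of the coarser version above it is not."""
    return self_error <= threshold and parent_error > threshold


# (self_error, parent_error) pairs along one chain from finest to coarsest.
# Each entry's parent error is the next entry's self error; the root has no parent.
chain = [(0.0, 0.1), (0.1, 0.4), (0.4, 1.5), (1.5, math.inf)]

selected = [cluster_is_drawn(s, p, 0.5) for s, p in chain]
```

Because the errors increase monotonically up the chain, exactly one level passes the test for any non-negative threshold, which is what keeps the cut free of holes and overlaps.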

#

which reminds me i should write a blog post on error once i figure it out correctly KEKW

primal shadow Oct 12, 2024, 9:05 PM

#

glass sphinx how does the simplify groups step work?

Take ~8 meshlets, merge them into one big triangle strip, use meshopt to simplify into a new triangle strip, break apart back into new meshlets (if you simplified by ~50%, then hopefully you end up with ~4 meshlets)

fiery bolt Oct 12, 2024, 9:05 PM

#

and how i did the bvh

glass sphinx Oct 12, 2024, 9:05 PM

#

how do you guys do the hierarchical culling? With persistent threads?

fiery bolt Oct 12, 2024, 9:05 PM

#

nah, i use an indirect dispatch chain rn

#

it's fast enough

#

nanite uses a dispatch chain on pc too

glass sphinx Oct 12, 2024, 9:06 PM

#

but didnt they say they dont?

fiery bolt Oct 12, 2024, 9:06 PM

#

that's on console

primal shadow Oct 12, 2024, 9:06 PM

#

glass sphinx how do you guys do the hierarchical culling? With persistent threads?

I just do a brute-force dispatch over every cluster in the scene across all LODs atm. I'm planning to switch to hierarchical culling in a bit, once I'm done with my current round of improvements to other areas.

fiery bolt Oct 12, 2024, 9:06 PM

#

because it's technically UB

glass sphinx Oct 12, 2024, 9:06 PM

#

yea but who cares

#

loool

fiery bolt Oct 12, 2024, 9:06 PM

#

you because you don't wanna debug that shit

#

if even UE uses indirect dispatches, i will too

#

i might try persistent threads after i have streaming working

glass sphinx Oct 12, 2024, 9:07 PM

#

workgraphs cant come soon enough

fiery bolt Oct 12, 2024, 9:07 PM

#

or just use workgraphs yeah lol

glass sphinx Oct 12, 2024, 9:07 PM

#

sad that they have such horrible perf on nvidia

glass sphinx Oct 12, 2024, 9:08 PM

#

primal shadow Take ~8 meshlets, merge them into one big triangle strip, use meshopt to simplif...

meshoptimizer can build lower poly versions of a mesh?

fiery bolt Oct 12, 2024, 9:08 PM

#

i wonder if i can abuse DGC to only conditionally insert an indirect dispatch

glass sphinx Oct 12, 2024, 9:09 PM

#

you can do that with DGC

#

but you still need barriers i think

primal shadow Oct 12, 2024, 9:09 PM

#

glass sphinx meshoptimizer can build lower poly versions of a mesh?

Yeah.

glass sphinx Oct 12, 2024, 9:09 PM

#

oh

#

then this is all i want for now i think

primal shadow Oct 12, 2024, 9:09 PM

#

Try their demo, they have a bunch of configurable stuff

#

Or download bevy and play around with it

fiery bolt Oct 12, 2024, 9:10 PM

#

if i use coherent writes i don't think i'll need barriers?

glass sphinx Oct 12, 2024, 9:10 PM

#

oh shit what, meshopt has a nanite demo

fiery bolt Oct 12, 2024, 9:10 PM

#

yeah it's new

primal shadow Oct 12, 2024, 9:10 PM

#

I linked it above 😛

glass sphinx Oct 12, 2024, 9:10 PM

#

primal shadow I linked it above 😛

yee just saw that

fiery bolt Oct 12, 2024, 9:10 PM

#

primal shadow Or download bevy and play around with it

cutecatNW or download my shitty code and play around it cutecatNE

#

i have an editor too

primal shadow Oct 12, 2024, 9:11 PM

#

True, Bevy doesn't yet

fiery bolt Oct 12, 2024, 9:11 PM

#

i need to add a way to actually spawn meshes though lmao

#

you can only select and move things rn KEKW

#

oh and undo

glass sphinx Oct 12, 2024, 9:16 PM

#

hm the demo doesnt seem to have a cmake target

#

https://meshoptimizer.org/demo/ oh shit im dumb

#

i cant interact with it

glass sphinx Oct 12, 2024, 9:21 PM

#

i think i might wanna try the hierarchical lodding at some point

#

but i definitely dont want the partial streaming part. Streaming full lods is way less headache i think.

#

i saw some analysis and nanite mesh streaming is a major cause for stutter. Needs much more pcie bandwidth than tex streaming for some reason

glass sphinx Oct 12, 2024, 9:27 PM

#

fiery bolt cutecatNW or download my shitty code and play around it <...

is yours c++?

fiery bolt Oct 12, 2024, 9:28 PM

#

nope

primal shadow Oct 12, 2024, 9:28 PM

#

glass sphinx i cant interact with it

The demo isn't interactable. You can have it dump data to an obj and open in blender though.

glass sphinx Oct 12, 2024, 9:28 PM

#

rustyyy

#

ok i ll look at both your code's

fiery bolt Oct 12, 2024, 9:28 PM

#

glass sphinx i saw some analysis and nanite mesh streaming is a major cause for stutter. Need...

was it a threat interactive analysis bleaker_kekw

wicked notch Oct 12, 2024, 9:29 PM

#

nah that's actually true

#

but only because UE's streaming code is dogshit

fiery bolt Oct 12, 2024, 9:29 PM

#

lol

#

i doubt mine will be any better

glass sphinx Oct 12, 2024, 9:31 PM

#

wicked notch but only because UE's streaming code is dogshit

ue has lots of problems like that causing stutter

#

kinda a shame. Their shader comp stutter is only getting worse with time too

fiery bolt Oct 12, 2024, 9:32 PM

#

traversal stutter my beloved

#

you can make UE compile shaders on start though, can't you

glass sphinx Oct 12, 2024, 9:34 PM

#

not really, no

#

they have made it fundamentally kinda impossible with their materials

#

have to pat the back of cod team on this one. I think cod engine is probably one of the least stuttery engines. Everything is prebuilt and static, even the rendergraph.

fiery bolt Oct 12, 2024, 9:35 PM

#

e🅱️ic should hire us to fix their stutter

glass sphinx Oct 12, 2024, 9:35 PM

#

they dont care about such things

wide shadow Oct 12, 2024, 9:35 PM

#

glass sphinx ok i ll look at both your code's

There are also these repos that are worth looking into (C++)
https://github.com/jglrxavpok/Carrot/tree/rendering-improvements
https://github.com/daniilvinn/omniforce-engine/tree/meshlet-lods

glass sphinx Oct 12, 2024, 9:35 PM

#

their priority is best graphics at 30fps with 700p on consoles

#

kekw

wicked notch Oct 12, 2024, 9:36 PM

#

maybe threat interactive was right all along

fiery bolt Oct 12, 2024, 9:36 PM

#

should i do streaming or restir or should i fix my shitty code first

wicked notch Oct 12, 2024, 9:36 PM

#

why not all three

#

at the same time

fiery bolt Oct 12, 2024, 9:37 PM

#

do i like

#

write a line of code for each

#

then repeat

wicked notch Oct 12, 2024, 9:37 PM

#

no

#

parallelize

#

what are you unreal engine that does everything on a single thread?

#

grow 4 more hands and buy two more keyboards and mice

fiery bolt Oct 12, 2024, 9:38 PM

#

oh shit never thought of that

#

lemme try

#

brb

#

help this is me now

wicked notch Oct 12, 2024, 9:38 PM

#

working as intended

#

keep going

primal shadow Oct 12, 2024, 9:43 PM

#

fiery bolt should i do streaming or restir or should i fix my shitty code first

Lighting is so much harder, don't start that if you haven't done it before until you finish virtual geometry 😅

#

I also have a partially written blog post on restir I never finished, I should do that...

frank sail Oct 12, 2024, 10:04 PM

#

glass sphinx workgraphs cant come soon enough

Cross platform DGC is a thing now

glass sphinx Oct 12, 2024, 10:11 PM

#

frank sail Cross platform DGC is a thing now

they dont solve this problem afaik

frank sail Oct 12, 2024, 10:13 PM

#

Idk what problem you need to solve

#

I thought you just needed a variable number of indirect dispatches

glass sphinx Oct 12, 2024, 10:14 PM

#

for hierarchical culling youd ideally just start new threads in the same dispatch

frank sail Oct 12, 2024, 10:15 PM

#

Ah hierarchy

#

Rip

glass sphinx Oct 12, 2024, 10:15 PM

#

ah actually now that i think of it

#

Is it possible to start new dispatches immediately?

#

i think it has to flush and then do an execute indirect on the shader recorded command buffer

#

so it will have to run in passes still

wicked notch Oct 12, 2024, 10:19 PM

#

with DGC you would figure out the deepest level of the hierarchy and create the DGC commands

glass sphinx Oct 12, 2024, 10:20 PM

#

wicked notch with DGC you would figure out the deepest level of the hierarchy and create the ...

how would you figure out the deepest hierarchy level without doing all the work

#

you only know what meshlets you need after each cull phase

wicked notch Oct 12, 2024, 10:20 PM

#

no you only need to test the error

#

which admittedly is a lot of work

#

actually yeah dgc doesn't solve the issue KEKW

glass sphinx Oct 12, 2024, 10:21 PM

#

it would also be very poorly parallel

wicked notch Oct 12, 2024, 10:21 PM

#

ye

glass sphinx Oct 12, 2024, 10:21 PM

#

honestly dgc doesnt really do much at all imo

#

i dont really see a use for it in my things

wicked notch Oct 12, 2024, 10:21 PM

#

DGC is really just a budget vkCmdDispatchIndirectCount

glass sphinx Oct 12, 2024, 10:21 PM

#

omg yes

#

thats what ive been saying too

#

give me vkCmdDispatchIndirectCount

#

ffs

fiery bolt Oct 12, 2024, 10:22 PM

#

primal shadow Lighting is so much harder, don't start that if you haven't done it before until...

I mean I've done traditional RT before

frank sail Oct 12, 2024, 10:22 PM

#

Reading the spec for dgc hurt me ngl

fiery bolt Oct 12, 2024, 10:22 PM

#

same

#

I tried

#

and failed

wicked notch Oct 12, 2024, 10:22 PM

#

why it's not that bad

frank sail Oct 12, 2024, 10:22 PM

#

It reminded me of work graphs how you have to create a bunch of shit first

wicked notch Oct 12, 2024, 10:22 PM

#

ye you have indirect token layouts and command tokens

frank sail Oct 12, 2024, 10:22 PM

#

Or what feels like it from reading

wicked notch Oct 12, 2024, 10:23 PM

#

which is a bit sad, but the layout only says which commands you're gonna use

#

and how many of each of them

glass sphinx Oct 12, 2024, 10:23 PM

#

noone will use it outside of dxvk

#

hottest take

wicked notch Oct 12, 2024, 10:23 PM

#

nah

#

mild take

wicked notch Oct 12, 2024, 10:23 PM

#

glass sphinx give me vkCmdDispatchIndirectCount

it's still this

#

only more convoluted KEKW

#

it does have the added benefit of being able to also issue draw commands

#

and pipeline state changes

frank sail Oct 12, 2024, 10:24 PM

#

Budget vkCmdDrawIndirectCount froge_love

wicked notch Oct 12, 2024, 10:25 PM

#

maybe this will make UE not issue a drawcall per material

frank sail Oct 12, 2024, 10:25 PM

#

Press x to doubt

wispy spear Oct 12, 2024, 10:25 PM

#

vkCmdSetIndirectScissorsCount when

primal shadow Oct 12, 2024, 10:30 PM

#

fiery bolt I mean I've done traditional RT before

Real time though? It is very difficult to not have noise 😅

primal shadow Oct 12, 2024, 10:30 PM

#

wicked notch maybe this will make UE not issue a drawcall per material

As opposed to?

wicked notch Oct 12, 2024, 10:31 PM

#

issuing a drawcall per material?

fiery bolt Oct 12, 2024, 10:31 PM

#

primal shadow Real time though? It is very difficult to not have noise 😅

uhhh I'll figure something out maybe it can't be too hard right

#

agonyfrog

primal shadow Oct 12, 2024, 10:31 PM

#

fiery bolt uhhh I'll figure something out maybe it can't be too hard right

Hahaha yeah....

wicked notch Oct 12, 2024, 10:31 PM

#

radiance cache

#

surely radiance caching is easy

#

:clueless:

fiery bolt Oct 12, 2024, 10:32 PM

#

how about you catche deez nuts instead

primal shadow Oct 12, 2024, 10:32 PM

#

wicked notch radiance cache

Then you have slow lighting response. Also that's even harder than per pixel fun fact.

wicked notch Oct 12, 2024, 10:32 PM

#

ye you do the AMD memes

primal shadow Oct 12, 2024, 10:32 PM

#

Getting your cache to bend around corners and crap sucks

wicked notch Oct 12, 2024, 10:32 PM

#

with screen space cache and world space cache

primal shadow Oct 12, 2024, 10:32 PM

#

While not leaking light

fiery bolt Oct 12, 2024, 10:32 PM

#

just do whatever ReSTIR does

frank sail Oct 12, 2024, 10:33 PM

#

wicked notch surely radiance caching is easy

Speaking of caching gi-related things... kekwfroggified

wicked notch Oct 12, 2024, 10:33 PM

#

frank sail Speaking of caching gi-related things... kekwfroggified

shhh

primal shadow Oct 12, 2024, 10:33 PM

#

fiery bolt just do whatever ReSTIR does

RTX DI uses a neural net trained as you do inference in real time as their radiance cache, so hf with that

wicked notch Oct 12, 2024, 10:33 PM

#

I have this as my alibi

fiery bolt Oct 12, 2024, 10:33 PM

#

novidia

fiery bolt Oct 12, 2024, 10:34 PM

#

wicked notch I have this as my alibi

idk what these words mean so they're not a good enough alibi

frank sail Oct 12, 2024, 10:37 PM

#

wicked notch I have this as my alibi

I like your funny words, pasta man

fiery bolt Oct 12, 2024, 10:37 PM

#

I cooked pasta today

#

did crimes froge_love

#

the spaghet wasn't fitting in my pot so I broke it before putting it in froge_love /s

wicked notch Oct 12, 2024, 10:40 PM

#

don't mind me

#

I'm not calling in the military

#

you can stay safe in your house

frank sail Oct 12, 2024, 10:42 PM

#

https://tenor.com/view/kys-keep-yourself-safe-low-tier-god-gif-24664025

#

Lvstri after reading that

wicked notch Oct 12, 2024, 10:42 PM

#

real

fiery bolt Oct 12, 2024, 10:45 PM

#

frank sail https://tenor.com/view/kys-keep-yourself-safe-low-tier-god-gif-24664025

wicked notch Oct 12, 2024, 10:46 PM

#

avoid looking in your walls too

#

for extra safety

primal shadow Oct 13, 2024, 4:59 AM

#

Manual vertex locks are a great improvement!
Before:
https://cdn.discordapp.com/attachments/148468683998625792/1294885657310920744/image.png?ex=670ca3be&is=670b523e&hm=c17d3497e0b94cc1eff5ab4344575a13d0383ada496409d2b13e063746aab87a&
After:
https://cdn.discordapp.com/attachments/148468683998625792/1294886356039897128/image.png?ex=670ca465&is=670b52e5&hm=b4e5d06c0fa046596c9933c17a39d89ebea230229b1f62adb38a685a232dc339&

primal shadow Oct 13, 2024, 5:28 AM

#

Too much triangle cruft at the intersections still. Hopefully retrying stuck clusters in later passes works.

#

1 bunny = 1 meshlet though!

fiery bolt Oct 13, 2024, 8:49 PM

#

so i seemed to have fixed my occlusion culling

#

well, almost

#

there just seems to be a band of under-culling at the edges for some reason

fiery bolt Oct 13, 2024, 10:32 PM

#

ok i fixed that but now there's flickering but only for occluded things

#

wtf

faint crane Oct 13, 2024, 10:54 PM

#

I fixed an occlusion culling bug today where my bounding spheres were wrong since I referenced i instead of j somewhere in a nested loop or something to that effect.

craggy shale Oct 14, 2024, 12:07 PM

#

wicked notch from clang to glsl

how does that work ? based on shady or what

wicked notch Oct 14, 2024, 12:10 PM

#

yes

#

I have been playing with shady a lot in recent times

craggy shale Oct 14, 2024, 12:10 PM

#

love to hear it

#

hopefully recent commits didn't fuck your shit up too much

#

it's been refactor time for a while

wicked notch Oct 14, 2024, 12:10 PM

#

o I haven't pulled in a while

#

yea

craggy shale Oct 14, 2024, 12:10 PM

#

nervous

wicked notch Oct 14, 2024, 12:11 PM

#

welp

craggy shale Oct 14, 2024, 12:11 PM

#

yeah I'm renaming everything, moving files etc

#

sowwy

wicked notch Oct 14, 2024, 12:11 PM

#

is ok

#

it's also on me for not checking regularly

craggy shale Oct 14, 2024, 12:11 PM

#

but also having functions with super common name not namespaced whatsoever is a recipe for disaster

#

i'm prefixing everything with shd_ or _shd_

wicked notch Oct 14, 2024, 12:11 PM

#

that's v nice

craggy shale Oct 14, 2024, 12:11 PM

#

and shuffling headers arround

#

splitting ir.h in smaller more sensible bits

faint crane Oct 14, 2024, 12:12 PM

#

I have a shader compiler called Slim Shady, but not the real Slim Shady, or death to Slim Shady.

severe dome Oct 14, 2024, 4:24 PM

#

craggy shale i'm prefixing everything with `shd_` or `_shd_`

shid

craggy shale Oct 14, 2024, 4:37 PM

#

the one who says, is

severe dome Oct 14, 2024, 10:51 PM

#

is that the european version of "whoever smelt it dealt it"?

buoyant summit Oct 15, 2024, 11:07 PM

#

faint crane I have a shader compiler called Slim Shady, but not the real Slim Shady, or deat...

I'm gonna

primal shadow Oct 16, 2024, 4:39 PM

#

Finally figured out why my renderer broke with 1042 instances

#

I overflowed the 2^25 cluster limit...
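
A quick arithmetic check of how that limit gets hit. The per-instance cluster count used in the note below is hypothetical, chosen only so the numbers line up with 1042 instances; the thread does not state the real count.

```python
CLUSTER_LIMIT = 2 ** 25  # 33,554,432 addressable clusters


def total_clusters(instances, clusters_per_instance):
    """Clusters written out before any LOD selection or culling."""
    return instances * clusters_per_instance


def overflows(instances, clusters_per_instance):
    """True when the pre-culling cluster list no longer fits in the limit."""
    return total_clusters(instances, clusters_per_instance) > CLUSTER_LIMIT
```

With a hypothetical mesh of 32,202 clusters across all LODs, 1041 instances give 33,522,282 clusters (fits) and 1042 instances give 33,554,484 (overflows), because every cluster of every instance is present before culling.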

#

That was so awful to debug

glass sphinx Oct 16, 2024, 5:05 PM

#

is that cause you have all cluster instances always present

#

?

#

cause you should lod away most of them right?

#

or is that 2^25 after culling bleakekw

primal shadow Oct 16, 2024, 5:09 PM

#

glass sphinx is that cause you have all cluster instances always present

Yeah :/

#

It's pre-lod/culling

#

Ideally this becomes part of the culling/lod pass and we only write out data for the meshlets that we intend to raster...

#

I need hierarchical culling first. Also culling is the bottleneck atm, so for that reason too 🙂

glass sphinx Oct 16, 2024, 5:12 PM

#

do you take contributions?

primal shadow Oct 16, 2024, 5:18 PM

#

glass sphinx do you take contributions?

Sure! I would love help, there's so much to do 😅 . I have a whole github issue on things I need to improve. You're welcome to take up anything. Just talk to me first, so that we're on the same page.

fiery bolt Oct 16, 2024, 5:39 PM

#

hierarchical culling is great

#

I spend like a constant 0.4 ms on culling iirc

#

completely unoptimized

primal shadow Oct 16, 2024, 5:42 PM

#

What kind of culling? Multiple dispatches? And what are your inputs/outputs?

fiery bolt Oct 16, 2024, 5:51 PM

#

bvh frustum + occlusion + lod culling

#

and yeah it's a chain of dispatches

primal shadow Oct 16, 2024, 5:56 PM

#

fiery bolt and yeah it's a chain of dispatches

What are your inputs/outputs between dispatches though? I'm curious how it's set up.

#

E.g. rn I have:

Fill cluster buffers: Input list of instances, write out clusters (instance + meshlet IDs)
Culling/lod: Input list of clusters, write out visible clusters IDs
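
The two passes above can be emulated on the CPU roughly like this. The function names, the `meshlets_per_mesh` table, and the `is_visible` callback are placeholders for illustration, not the actual Bevy code.

```python
def fill_cluster_buffer(instances, meshlets_per_mesh):
    """Pass 1: expand every instance into (instance_id, meshlet_id) pairs."""
    clusters = []
    for instance_id, mesh in enumerate(instances):
        for meshlet_id in range(meshlets_per_mesh[mesh]):
            clusters.append((instance_id, meshlet_id))
    return clusters


def cull_clusters(clusters, is_visible):
    """Pass 2: keep only the clusters that pass the culling/LOD test."""
    return [c for c in clusters if is_visible(*c)]
```

The first list grows with instances times meshlets regardless of visibility, which is why a brute-force setup like this pays for every cluster before any of them are rejected.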

fiery bolt Oct 16, 2024, 6:09 PM

#

ah

#

it's pairs of instance and bvh node IDs yeah

#

and then instance and meshlet IDs to meshlet cull and output from meshlet cull
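
A CPU-side sketch of that dispatch chain, where each loop iteration stands in for one indirect dispatch. The dictionary BVH layout, the root node ID of 0, and the `node_visible` / `meshlet_visible` callbacks are assumptions for illustration, not the radiance code.

```python
def hierarchical_cull(instances, bvh, node_visible, meshlet_visible):
    """Walk the BVH level by level, then cull the surviving meshlets.

    bvh maps node_id -> {"children": [...]} for inner nodes
    or {"meshlets": [...]} for leaves; node 0 is the root.
    """
    queue = [(instance, 0) for instance in instances]
    meshlet_candidates = []
    dispatches = 0
    while queue:
        dispatches += 1                  # one BVH-cull dispatch per level
        next_queue = []
        for instance, node_id in queue:
            if not node_visible(instance, node_id):
                continue                 # whole subtree rejected at once
            node = bvh[node_id]
            if "meshlets" in node:
                meshlet_candidates.extend((instance, m) for m in node["meshlets"])
            else:
                next_queue.extend((instance, c) for c in node["children"])
        queue = next_queue
    # final meshlet-cull pass over (instance, meshlet) pairs
    visible = [pair for pair in meshlet_candidates if meshlet_visible(*pair)]
    return visible, dispatches
```

The number of BVH dispatches equals the depth actually reached, and each one only touches the pairs the previous level let through.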

primal shadow Oct 16, 2024, 6:11 PM

#

Gotcha gotcha. Thanks.

primal shadow Oct 17, 2024, 5:26 AM

#

Sigh, maybe it's finally time to try and fix my occlusion culling

#

It's gonna suck to debug though

fiery bolt Oct 17, 2024, 7:37 AM

#

yeah it does

#

the padding to 64 with granite's HZB seems to work btw

#

still not perfect though

faint crane Oct 17, 2024, 9:19 AM

#

primal shadow Sigh, maybe it's finally time to try and fix my occlusion culling

I have a theory: occlusion culling is permanently broken.

#

Or all of us here are cursed with slightly broken occlusion culling forever.

wide shadow Oct 17, 2024, 9:31 AM

#

Count me in 😔

loud crag Oct 17, 2024, 10:21 AM

#

my occlusion culling always has the issue of culling small but visible triangles

wide shadow Oct 17, 2024, 10:32 AM

#

mine functions as frustum culling bleakekw

fiery bolt Oct 17, 2024, 11:27 AM

#

mine looks like it works but there's a very small border along the bottom and right edges that has less culling for some reason

faint crane Oct 17, 2024, 11:33 AM

#

Mine is slightly not conservative when I disable Hi-Z.

#

Surely, we'll encounter enough bugs to converge on something which works?

glass sphinx Oct 17, 2024, 1:08 PM

#

fiery bolt mine looks like it works but there's a very small border along the bottom and ri...

how do you build the hiz

glass sphinx Oct 17, 2024, 1:09 PM

#

fiery bolt the padding to 64 with granite's HZB seems to work btw

thats really smart i didnt think of that yet

#

tho that doesnt scale over 4k

fiery bolt Oct 17, 2024, 1:26 PM

#

glass sphinx how do you build the hiz

uhh I stole granite's HZB gen

#

but yeah it almost works but not perfectly

primal shadow Oct 17, 2024, 4:06 PM

#

fiery bolt the padding to 64 with granite's HZB seems to work btw

What needs padding? Input depth texture? Or output?

#

I'll have to reference your code

glass sphinx Oct 17, 2024, 4:18 PM

#

primal shadow What needs padding? Input depth texture? Or output?

input. I think its single dispatch hiz gen. Single dispatch hiz gen works in two passes of 64 x 64 downsamplings per workgroup

fiery bolt Oct 17, 2024, 4:21 PM

#

primal shadow What needs padding? Input depth texture? Or output?

output

glass sphinx Oct 17, 2024, 4:22 PM

#

damn i did misinfo

fiery bolt Oct 17, 2024, 4:22 PM

#

I wonder if the issue is due to me not padding the input though

glass sphinx Oct 17, 2024, 4:22 PM

#

how does padding the output help

fiery bolt Oct 17, 2024, 4:22 PM

#

glass sphinx how does padding the output help

changes mip dimensions

#

so you have space to store the extra data generated due to NPOT

glass sphinx Oct 17, 2024, 4:23 PM

#

but why does that work for 64 padd

#

why wont it break on the higher levels

#

those will still have the downrounded div mip sizes

#

this gave me an idea tho

fiery bolt Oct 17, 2024, 4:25 PM

#

there's special handling for higher mips

#

it's goofy

glass sphinx Oct 17, 2024, 4:25 PM

#

ok

#

rn i just scale the depth image to pot then downsample single pass

#

but it feels dirty

fiery bolt Oct 17, 2024, 4:26 PM

#

yeah

#

I might just resort to that ngl

glass sphinx Oct 17, 2024, 4:28 PM

#

it does make everything very simple tho

#

also same speed, just bandwidth limited

primal shadow Oct 17, 2024, 5:31 PM

#

glass sphinx rn i just scale the depth image to pot then downsample single pass

Idk if that's conservative

glass sphinx Oct 17, 2024, 5:43 PM

#

it is conservative

#

why wouldnt it be

#

in the downsampling to pot each pixel reads 2x2 pixels of the original image. Read is using uvs so it should map as long as the pot image is not less than half the size

#

now im paranoid

fiery bolt Oct 17, 2024, 8:26 PM

#

the issue lies in mapping coordinates from the screen to your scaled pot

#

because the mapping differs for each mip

glass sphinx Oct 17, 2024, 8:31 PM

#

fiery bolt the issue lies in mapping coordinates from the screen to your scaled pot

i dont understand

#

the culling is using the pot image dimensions

#

i just scale 2560px1440p -> 2048x1024p for example with a 2x2 filter for each pixel and then downsample. The culling then uses the pot image dimensions

#

the culling doesnt need to use the screens dimensions. The mapping of pot image mips to original image potential mips doesnt matter after its downscaled

fiery bolt Oct 17, 2024, 8:34 PM

#

how do you map from screen pixel to hzb pixel?

glass sphinx Oct 17, 2024, 8:34 PM

#

you mean in the downsampling?

fiery bolt Oct 17, 2024, 8:34 PM

#

no, culling

glass sphinx Oct 17, 2024, 8:34 PM

#

i dont have to

#

why would i

fiery bolt Oct 17, 2024, 8:34 PM

#

how do you take NDC AABB and sample from your hzb

glass sphinx Oct 17, 2024, 8:34 PM

#

the culling uses the hzb dimensions

#

i calculate a ndc for hzb pot size

fiery bolt Oct 17, 2024, 8:35 PM

#

oh you're... scaling down NPOT?

glass sphinx Oct 17, 2024, 8:35 PM

#

?

fiery bolt Oct 17, 2024, 8:35 PM

#

i just scale 2560px1440p -> 2048x1024p for example with a 2x2 filter for each pixel

#

this is definitely wrong

glass sphinx Oct 17, 2024, 8:35 PM

#

what, how?

glass sphinx Oct 17, 2024, 8:35 PM

#

fiery bolt how do you take NDC AABB and sample from your hzb

why would i need to use screen dimensions in culling?

#

it doesnt matter at all what dim the culling tex has

fiery bolt Oct 17, 2024, 8:36 PM

#

yeah if your hzb is scaled so that the entire hzb maps to the entire screen

#

but how you do the mapping seems very wrong

#

do you have an overdraw debug view?

glass sphinx Oct 17, 2024, 8:37 PM

#

you seem to have a very flawed image of how i downsample

#

reading 2x2 doesnt mean there is an offset of 2 pixels for every out pixel

fiery bolt Oct 17, 2024, 8:37 PM

#

then what does it mean

#

do you just throw a min sampler and sample from UVs

glass sphinx Oct 17, 2024, 8:38 PM

#

you calculate the uv in the dst image, then gather in the original image and max/min (depending on depth dir) them all
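
A CPU sketch of a conservative NPOT-to-POT depth reduction. It assumes larger depth means farther (with reverse-Z, swap `max` for `min`), and it deliberately differs from the single 2x2 gather described above: it takes every source texel that overlaps the destination pixel's footprint, which for non-integer scale factors can be three texels per axis. The function name and list-of-rows image layout are placeholders.

```python
def downsample_conservative(src, dst_w, dst_h):
    """Reduce a depth image to dst_w x dst_h, keeping the farthest depth
    of every source texel that overlaps each destination pixel."""
    src_h = len(src)
    src_w = len(src[0])
    dst = []
    for y in range(dst_h):
        y0 = (y * src_h) // dst_h               # first overlapped source row
        y1 = -(-((y + 1) * src_h) // dst_h)     # integer ceil, exclusive end
        row = []
        for x in range(dst_w):
            x0 = (x * src_w) // dst_w
            x1 = -(-((x + 1) * src_w) // dst_w)
            row.append(max(src[sy][sx]
                           for sy in range(y0, y1)
                           for sx in range(x0, x1)))
        dst.append(row)
    return dst
```

For a 5-texel row reduced to 4 (the same 1.25 ratio as 2560 to 2048) every footprint covers two texels, but at the 1440 to 1024 ratio (45 to 32) some footprints cover three, which is the case a fixed 2x2 read can miss.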

#

i have made many debug visualizations and tested many cases and the culling never breaks from what i can tell

glass sphinx Oct 17, 2024, 8:46 PM

#

fiery bolt do you have an overdraw debug view?

#

i also had a visualization that draws the ndc for each culled object to see if its overculling (as visible ndc would mean it culled something thats actually visible)

#

but its not much to show as you just see no ndc 😮

#

this is culling off

fiery bolt Oct 17, 2024, 8:48 PM

#

hmmmm

glass sphinx Oct 17, 2024, 8:49 PM

#

glass sphinx i also had a visualization that draws the ndc for each culled object to see if i...

highly recommend this btw

#

was a massive help to fix it initially

fiery bolt Oct 17, 2024, 9:09 PM

#

glass sphinx highly recommend this btw

that would probably time out for me lmao

glass sphinx Oct 17, 2024, 9:10 PM

#

xD

fiery bolt Oct 17, 2024, 9:10 PM

#

I should do some stats though thonk

glass sphinx Oct 17, 2024, 9:10 PM

#

the debug utils in tido make up like 10-15% of all its code or so. But its so nice to have that stuff

#

turbo bikeshed

#

but also sanity

fiery bolt Oct 17, 2024, 9:11 PM

#

debug utils are insanely helpful

glass sphinx Oct 17, 2024, 9:11 PM

#

fiery bolt I should do some stats though thonk

what helps also is drawn meshlet count. If it changes each frame its badbad, if its consistent with no cam movement happy

fiery bolt Oct 17, 2024, 9:11 PM

#

like, one of the most important things when debugging

glass sphinx Oct 17, 2024, 9:11 PM

#

i ll polish up the debug draws for culling and show them later

fiery bolt Oct 17, 2024, 9:11 PM

#

glass sphinx what helps also is drawn meshlet count. If it changes each frame its badbad it i...

yeah I have flickering shit in my overdraw debug view bleaker_kekw

glass sphinx Oct 17, 2024, 9:11 PM

#

ooooh

fiery bolt Oct 17, 2024, 9:11 PM

#

idk why

#

still tryna fix that

#

the weird border and flickering are the two bugs left

#

works great otherwise

#

oh yeah also sw raster

#

just a wee bit broken

#

in that it instacrashes when enabled

glass sphinx Oct 17, 2024, 9:14 PM

#

i think it took me a few months of random insights to fully fix everything

#

i had a few very hard thinking mistakes

#

im stopping myself from doing sw raster

#

too much bikeshed went into the rasterization

#

its time for cool visuals now

wicked notch Oct 17, 2024, 10:07 PM

#

fiery bolt in that it instacrashes when enabled

usually when something is broken you don't add something else (which also turns out to be broken)

#

KEKW

fiery bolt Oct 17, 2024, 10:07 PM

#

no

#

i build entire nanite

#

every component is slightly broken

#

then i try to debug

#

bleakforg

#

oh also i got the e🅱️ic internship froge_love

#

time to slave writing code for money but full time now

wicked notch Oct 17, 2024, 10:10 PM

#

pog

#

send some here to finance my stupid decisions (like buying the upcoming intel cpus)

fiery bolt Oct 17, 2024, 10:11 PM

#

they're gonna burn themselves up

#

while being slower than the 14900k

wicked notch Oct 17, 2024, 10:12 PM

#

intel pinky promises that these ones are safe

fiery bolt Oct 17, 2024, 10:12 PM

#

https://tenor.com/bSx3t.gif

wicked notch Oct 17, 2024, 10:12 PM

#

I didn't preorder or anything just to be safe

#

I'll wait for buildzoid to post his usual rants

fiery bolt Oct 17, 2024, 10:13 PM

#

lol

#

you should give amd your money instead

wicked notch Oct 17, 2024, 10:13 PM

#

I considered that

#

then I looked that they regressed on memory latency

fiery bolt Oct 17, 2024, 10:13 PM

#

buy zen 4 tho

wicked notch Oct 17, 2024, 10:13 PM

#

which was already abysmal

fiery bolt Oct 17, 2024, 10:13 PM

#

zen 5 is just not worth it

wicked notch Oct 17, 2024, 10:13 PM

#

ye

fiery bolt Oct 17, 2024, 10:13 PM

#

buy a 7950x3d froge_love

wicked notch Oct 17, 2024, 10:13 PM

#

zen 5 is a mess

fiery bolt Oct 17, 2024, 10:14 PM

#

or an 8950x3d if it has cache on both ccxs

wicked notch Oct 17, 2024, 10:15 PM

#

I mean

#

have you seen the core to core latency

#

on zen 5 it's sometimes faster to read from ram than from cache (in another ccd)

fiery bolt Oct 17, 2024, 10:15 PM

#

lmao wtf

#

is this after the agesa patch they did

wicked notch Oct 17, 2024, 10:15 PM

#

ye there was a graph somewhere in #hardware that was absolutely funny

fiery bolt Oct 17, 2024, 10:16 PM

#

i think that was before the patch

wicked notch Oct 17, 2024, 10:16 PM

#

turns out T_cache + T_fabric >= T_ram KEKW

wicked notch Oct 17, 2024, 10:16 PM

#

fiery bolt i think that was before the patch

perchance

#

lemme find some info

primal shadow Oct 18, 2024, 6:20 AM

#

@fiery bolt for the BVH, what do BVH nodes equal? Some kind of grouping of clusters? Or of cluster groups? The nanite slides are vague on how the BVH is setup. Also how they enforce only 8 clusters per node.

fiery bolt Oct 18, 2024, 7:06 AM

#

primal shadow @fiery bolt for the BVH, what do BVH nodes equal? Some kind of groupin...

so what I do is:

leaf nodes are cluster groups
for each lod, build a normal BVH8 using SAH, while also storing max parent error and merged lod bounds
then build a BVH of the root nodes (no SAH because the AABBs are gonna be the same)
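
A rough sketch of that layout. It swaps the SAH build for a much simpler "sort along x and chunk by 8" so it stays short, and the `Node` fields and function names are made up for illustration; see the linked radiance code for the real thing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    aabb_min: tuple
    aabb_max: tuple
    max_parent_error: float
    children: list = field(default_factory=list)  # empty for leaves (cluster groups)


def merge(nodes):
    """Parent node: union of child bounds and the largest child parent error."""
    mn = tuple(min(n.aabb_min[i] for n in nodes) for i in range(3))
    mx = tuple(max(n.aabb_max[i] for n in nodes) for i in range(3))
    err = max(n.max_parent_error for n in nodes)
    return Node(mn, mx, err, list(nodes))


def build_bvh8(nodes):
    """Collapse nodes 8 at a time until one root is left (not SAH)."""
    nodes = list(nodes)
    while len(nodes) > 1:
        nodes.sort(key=lambda n: n.aabb_min[0] + n.aabb_max[0])
        nodes = [merge(nodes[i:i + 8]) for i in range(0, len(nodes), 8)]
    return nodes[0]


def build_hierarchy(lods):
    """lods: one list of leaf Nodes (cluster groups) per LOD level."""
    roots = [build_bvh8(groups) for groups in lods]
    return build_bvh8(roots)  # top-level BVH over the per-LOD roots
```

Storing the maximum parent error per node is presumably what lets traversal drop a whole subtree during LOD culling when even that largest error is already below the threshold.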

#

seems to work well enough

#

might wanna read the code tbh

#

and ask questions based on that

primal shadow Oct 19, 2024, 8:49 PM

#

My software raster is broken and idk how to debug it 😭

fiery bolt Oct 19, 2024, 9:26 PM

#

literally me

#

but mine crashes the GPU and I have no idea why

wispy spear Oct 19, 2024, 9:41 PM

#

can you use novideo aftermath?

fiery bolt Oct 19, 2024, 9:55 PM

#

wispy spear can you use novideo aftermath?

yeah all it says is 'misaligned read'

#

and only when I touch groupshared mem

#

which is agonyfrog

wispy spear Oct 19, 2024, 9:58 PM

#

did you ask in the more public channels already?

primal shadow Oct 19, 2024, 10:08 PM

#

wispy spear did you ask in the more public channels already?

#software-rasterization message

wispy spear Oct 19, 2024, 10:20 PM

#

primal shadow https://discord.com/channels/318590007881236480/362945838366064651/1297314177249...

heh, i meant the metal guy 🙂

#

but these meshlets look neat and make me jealous

primal shadow Oct 19, 2024, 10:23 PM

#

wispy spear but these meshlets look neat and make me jealous

Join my project! I need more people 😭

wispy spear Oct 19, 2024, 10:26 PM

#

i am mentally not capable yet unironically

fiery bolt Oct 21, 2024, 1:26 AM

#

i have successfully completely broken my error projection froge_love

#

i'm also somehow crashing with an MMU fault when i shrimply index my output storage image with SV_Position.xy

#

how does that even happen

wicked notch Oct 21, 2024, 1:42 AM

#

bro is finding bugs not even the hw knew it had

fiery bolt Oct 21, 2024, 1:56 AM

#

fr

#

that can't be real

#

aftermath must be tripping

primal shadow Oct 22, 2024, 6:23 AM

#

Whooh, fixed my SW rasterizer!

#

Had to force the HW rasterizer for near-clipped clusters, and add backface culling to the SW rasterizer
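
Both fixes can be sketched as small predicates. The conventions here are assumptions: `view_depth` is the distance of the cluster's bounding-sphere center along the view direction, and front faces are counter-clockwise in a y-up screen space. The function names are placeholders, not the Bevy implementation.

```python
def needs_hw_raster(view_depth, radius, near):
    """Send a cluster to the hardware rasterizer if its bounding sphere
    reaches the near plane, so the software path never has to clip."""
    return view_depth - radius < near


def is_backfacing(a, b, c):
    """Signed-area test on 2D screen positions (x, y)."""
    area = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return area <= 0  # also rejects zero-area triangles
```

A software rasterizer would skip any triangle for which `is_backfacing` returns true before touching its pixels.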

wide shadow Oct 25, 2024, 6:12 PM

#

https://github.com/zeux/meshoptimizer/releases/tag/v0.22

GitHub

Release v0.22 · zeux/meshoptimizer

This release contains many improvements to the meshoptimizer library and some gltfpack enhancements!
Notably, meshopt_simplifyWithAttributes has seen significant improvements to attribute handling,...

wicked notch Oct 25, 2024, 6:16 PM

#

zeux massive as usual

primal shadow Oct 25, 2024, 8:20 PM

#

Hello, I am collecting a list of nanite-related resources for my blog. If I've missed any of your projects, please let me know.

faint crane Oct 25, 2024, 10:08 PM

#

wide shadow https://github.com/zeux/meshoptimizer/releases/tag/v0.22

This makes it an absolute joy to use on web. Deleted so much WASM ... except for METIS. briannaPls

faint crane Oct 25, 2024, 10:08 PM

#

primal shadow Hello, I am collecting a list of nanite-related resources for my blog. If I've m...

Don't forget https://advances.realtimerendering.com/s2024/index.html#hable from Denver which credited your blog.

primal shadow Oct 25, 2024, 10:29 PM

#

faint crane Don't forget <https://advances.realtimerendering.com/s2024/index.html#hable> fro...

Oh, I suppose I could. I wanted to link virtual geometry stuff specifically, but I'll see if I can find a section to stick that in...

#

Probably the roadmap actually, software VRS is something I want to experiment with

primal shadow Oct 26, 2024, 7:26 PM

#

I updated to meshopt 0.22 and factored normal error into the LOD selection, so much better now!

wicked notch Oct 27, 2024, 11:08 PM

#

Here's a fun fact from my days as a professor's assistant

#

so I was correcting the practical exams of second year students for DSA

#

the tasks were

Create a graph data structure, load the nodes from file and implement both DFS and BFS visits
Write an algorithm to find the longest cycle in the graph
Write an algorithm that determines whether there is a Hamiltonian cycle in the graph

#

now here's the funny part, the third task is NP complete but the professor somehow missed it KEKW

#

most of the students were able to just write the bruteforce algorithm with backtracking, however our uni's computers are extremely outdated and an n factorial algorithm isn't exactly the fastest

frank sail Oct 27, 2024, 11:12 PM

#

I was asked multiple times to solve np hard problems in college. A lot of them are easy, you just gotta brute force it

wicked notch Oct 27, 2024, 11:12 PM

#

real

#

anyway this wasn't supposed to happen but oh well

loud crag Oct 27, 2024, 11:13 PM

#

funny words

fiery bolt Oct 27, 2024, 11:56 PM

#

our uni computers have 7900xs... but the intel kind, and they just upgraded them all from 3080s to 4070 ti supers????

#

like pls can we get new CPUs too

minor root Oct 28, 2024, 12:32 PM

#

frank sail I was asked multiple times to solve np hard problems in college. A lot of them a...

An exam last year had a question asking to come up with a linear algorithm for something that was equivalent to SAT TooTroll

#

(It was an intentional trick question but still lmao)

wide shadow Oct 28, 2024, 10:45 PM

#

primal shadow Hello, I am collecting a list of nanite-related resources for my blog. If I've m...

You can throw mine up there 😉
https://github.com/lukasino1214/foundation

GitHub

GitHub - lukasino1214/foundation

Contribute to lukasino1214/foundation development by creating an account on GitHub.

wispy spear Oct 28, 2024, 10:46 PM

#

needs a social preview image 😛 (its in the repo -> settings -> "social preview")

wide shadow Oct 28, 2024, 10:46 PM

#

https://tenor.com/view/breaking-bad-bryan-cranston-sad-despair-gif-25732605

Tenor

loud crag Oct 28, 2024, 10:47 PM

#

wispy spear needs a social preview image 😛 (its in the repo -> settings -> "social preview"...

froge

fiery bolt Nov 3, 2024, 3:54 AM

#

i replaced shrimple 1 buffer -> 1 indirect queue with a complex dequeue and that fixed my software rasterizer?

#

bleakforg

wispy spear Nov 3, 2024, 9:38 AM

#

nice

#

time to blog about it : >

#

https://graphicsprogramming.github.io/blog/

Blog | Graphics Programming Discord Server

Blog

fiery bolt Nov 3, 2024, 12:42 PM

#

forgi

ebon ruin Nov 7, 2024, 3:20 AM

#

Okay Nanite general

#

how come Bevy uses sphere bounds for cluster frustum culling?

#

It already has AABBs that it uses for occlusion culling

#

is the perf difference that big?

fiery bolt Nov 7, 2024, 3:50 AM

#

AABBs are usually better than spheres

#

but you need a sphere for error projection bounds

#

I use an AABB for frustum and occ cull, and spheres for error

glass sphinx Nov 7, 2024, 2:24 PM

#

spheres are faster for testing frustum

fiery bolt Nov 7, 2024, 3:44 PM

#

yes but AABBs usually lead to better culling efficiency

#

and that's worth it

glass sphinx Nov 7, 2024, 4:35 PM

#

im not convinced it is

#

show code, maybe you have some clever way to make it faster than what i did

#

From what ive seen so far, the culling gains from it do not justify the extra overhead for higher meshlet counts
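
For reference, the trade-off being argued here can be sketched on the CPU. This is a minimal illustration, not code from any of the renderers mentioned: `Plane`, `sphere_visible`, and `aabb_visible` are hypothetical names, and the planes are assumed to have unit-length normals pointing into the frustum. The sphere test is one dot product per plane; the AABB test adds three absolute values and multiplies per plane, but can reject thin boxes whose bounding sphere still pokes into the frustum.

```rust
/// A plane `dot(normal, p) + d = 0`. Assumption: `normal` is unit length
/// and points towards the inside of the frustum.
#[derive(Clone, Copy)]
pub struct Plane {
    pub normal: [f32; 3],
    pub d: f32,
}

fn dot(a: [f32; 3], b: [f32; 3]) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Sphere vs. planes: visible unless the center is more than `radius`
/// behind some plane.
pub fn sphere_visible(planes: &[Plane], center: [f32; 3], radius: f32) -> bool {
    planes.iter().all(|p| dot(p.normal, center) + p.d >= -radius)
}

/// AABB vs. planes: same idea, but the "radius" is the projection of the
/// box half-extents onto each plane normal, so it changes per plane.
pub fn aabb_visible(planes: &[Plane], center: [f32; 3], half_extent: [f32; 3]) -> bool {
    planes.iter().all(|p| {
        let r = half_extent[0] * p.normal[0].abs()
            + half_extent[1] * p.normal[1].abs()
            + half_extent[2] * p.normal[2].abs();
        dot(p.normal, center) + p.d >= -r
    })
}
```

A long, thin box lying just outside a plane is the typical case where the AABB test culls and the sphere test does not. Whether that extra culling pays for the extra ALU per meshlet is exactly what is disputed above.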

delicate rain Nov 7, 2024, 4:55 PM

#

We need obbs

#

Ups not Tido thread

fiery bolt Nov 7, 2024, 5:54 PM

#

glass sphinx show code, maybe you have some clever way to make it faster than what i did

https://github.com/SparkyPotato/radiance/blob/main/shaders%2Fpasses%2Fmesh%2Fcull.slang#L323

GitHub

radiance/shaders/passes/mesh/cull.slang at main · SparkyPotato/radi...

Rendering things. Contribute to SparkyPotato/radiance development by creating an account on GitHub.

#

do note that I have BVH culling, not just instance -> meshlet

wispy spear Nov 7, 2024, 5:55 PM

#

cute pfp : )

glass sphinx Nov 7, 2024, 9:13 PM

#

@wicked notch i reimplemented prefix sum + binary search now with devshs trick

#

its much faster than po2 buffers
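
A rough CPU-side sketch of the general prefix sum + binary search idea, for readers who have not seen it. It is not the Timberdoodle implementation and does not show the trick mentioned above; `inclusive_prefix_sum` and `expand` are hypothetical names. Each source item (for example a mesh instance) contributes some number of work items (for example meshlets), and every thread of one flat dispatch finds its source item by binary searching the prefix sums.

```rust
/// Inclusive prefix sum of the per-source work item counts.
pub fn inclusive_prefix_sum(counts: &[u32]) -> Vec<u32> {
    let mut sum = 0u32;
    counts
        .iter()
        .map(|&c| {
            sum += c;
            sum
        })
        .collect()
}

/// Maps a flat work index to `(source index, local index inside that source)`.
/// Returns `None` when `flat_index` is past the total amount of work.
pub fn expand(prefix: &[u32], flat_index: u32) -> Option<(usize, u32)> {
    // Binary search for the first source whose inclusive sum exceeds flat_index.
    let src = prefix.partition_point(|&end| end <= flat_index);
    if src == prefix.len() {
        return None;
    }
    let start = if src == 0 { 0 } else { prefix[src - 1] };
    Some((src, flat_index - start))
}
```

With counts `[3, 0, 2]` the prefix sums are `[3, 3, 5]`, so flat index 3 maps to source 2, local index 0, and the empty source 1 is skipped without any special casing. The whole expansion fits in one dispatch sized by the last prefix sum, instead of one dispatch per power-of-two bucket.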

wicked notch Nov 7, 2024, 9:13 PM

#

damn

glass sphinx Nov 7, 2024, 9:14 PM

#

the overhead of many dispatches is massive on nvidia

wicked notch Nov 7, 2024, 9:14 PM

#

that's op

glass sphinx Nov 7, 2024, 9:14 PM

#

honestly crazy to me

#

devsh made the point to me that the draw order is fucked with po2

wicked notch Nov 7, 2024, 9:14 PM

#

should we report that though

glass sphinx Nov 7, 2024, 9:14 PM

#

i think thats a big reason

#

also

wicked notch Nov 7, 2024, 9:14 PM

#

idk it feels like a driver issue

glass sphinx Nov 7, 2024, 9:14 PM

#

idk their frontend always was bad

#

also im not sure but my new binary search is much faster than my old one

#

idk why i did my old one so badly

#

now my mesh shaders reach much better occupancy

#

i have the suspicion that multiple mesh shader dispatches have high overhead and cant share resources well

#

the ISBE memory might be contested between many dispatches

wicked notch Nov 7, 2024, 9:17 PM

#

huge findings

#

I'll do binary search too then

glass sphinx Nov 7, 2024, 9:17 PM

#

i give you the code

#

https://github.com/Sunset-Flock/Timberdoodle/blob/main/src/shader_lib/gpu_work_expansion.hlsl

#

https://github.com/Sunset-Flock/Timberdoodle/blob/main/src/shader_shared/gpu_work_expansion.inl

#

this will make vsms much much faster i think

#

nuking their overhead

#

they do 32x16 dispatches atm

#

that will go down to just 16

#

but this means that mesh shader with task shaders have much larger overhead for launches

#

than normal draws

#

but still way better than compute

#

kinda a middle child

#

i can kinda see that the dispatches start after each other

#

the later buckets are emptier

#

and with bucket launch the later part is just kinda empty

primal shadow Nov 8, 2024, 2:12 AM

#

Hello, you all are getting early-access to my virtual geometry blog post for Bevy 0.15. Please read it and give me feedback! TODO: Compression section, perf comparison section, and images for all sections

https://github.com/JMS55/jms55.github.io/blob/meshlet-0.15/content/posts/2024_10_25_virtual_geometry_bevy_0_15/index.md

fiery bolt Nov 8, 2024, 2:57 AM

#

RenderDoc does not actually show results from the GPU, it's all simulated on the CPU
pretty sure it replays commands, no?

#

if it was all run on the CPU it would be incredibly slow

#

and it also dies if i device lost in the capture

frank sail Nov 8, 2024, 2:58 AM

#

it uses transform feedback to get vs output, for example

#

I think the only thing it simulates on the cpu is shader debugging

fiery bolt Nov 8, 2024, 3:01 AM

#

yeah

#

and shader debugging kinda sucks as a result

#

because other waves and workgroup threads are shrimply not simulated

primal shadow Nov 8, 2024, 3:04 AM

#

Let me remove that part then, thanks. Not sure why my output was changing every time I clicked the dispatch then.

#

I guess even if it was running on the GPU, renderdoc kept re-simulating it

fiery bolt Nov 8, 2024, 3:05 AM

#

yeah it probably replays the entire command stream when you go backwards

#

because making a copy of each buffer for each command would eat up a lot of mem

#

looks good otherwise though froge_love

frank sail Nov 8, 2024, 3:06 AM

#

there are many things that can trigger renderdoc to replay the frame

primal shadow Nov 8, 2024, 3:06 AM

#

New text:

Debugging the issue was complicated by the fact that the rewritten fill cluster buffers code is no longer deterministic. Clusters get written in different orders depending on how the scheduler schedules workgroups, and the order of the atomic writes. That meant that every time I clicked on a pass in RenderDoc to check its output, the output order would completely change as RenderDoc replayed the entire command stream up until that point.

frank sail Nov 8, 2024, 3:06 AM

#

like clicking on a different event

primal shadow Nov 11, 2024, 1:55 AM

#

fiery bolt Nov 11, 2024, 2:01 AM

#

that's some major improvement with sw raster

#

what's the test scene?

primal shadow Nov 11, 2024, 2:06 AM

#

fiery bolt that's some major improvement with sw raster

I don't think it's actually all for SW raster. Look at how much better culling got. It's probably 85% due to improving the DAG, and 15% due to SW raster.

primal shadow Nov 11, 2024, 2:06 AM

#

fiery bolt what's the test scene?

15^3 stanford bunnies arranged in a cube

fiery bolt Nov 11, 2024, 2:07 AM

#

primal shadow I don't think it's actually all for SW raster. Look at how much better culling g...

ah yeah culling improvements make sense

fiery bolt Nov 11, 2024, 2:08 AM

#

primal shadow 15^3 stanford bunnies arranged in a cube

that's 'only' 236 million tris froge_sad

#

you should try the lucy scan

primal shadow Nov 11, 2024, 2:08 AM

#

Unfortunately v0.14 can't render higher counts due to how I handle allocating some buffers

#

It runs OOM

#

So I need a test scene I can use on both

fiery bolt Nov 11, 2024, 2:09 AM

#

oh rip

primal shadow Nov 11, 2024, 2:09 AM

#

I'm going to use the megascan cliffs and show perf for that too in 0.15

fiery bolt Nov 11, 2024, 2:09 AM

#

lucy is also great for testing out import perf and how good your simplifier is

#

because there's just so many tris

#

28 million iirc

primal shadow Nov 11, 2024, 2:11 AM

#

Maybe for 0.16 😛

#

I don't have it setup

fiery bolt Nov 11, 2024, 2:12 AM

#

lol

#

you'll need to multithread generation too

#

it takes me 18 minutes to import on an (amd) 7900x bleakekw

#

with all cores being hammered throughout

primal shadow Nov 11, 2024, 2:13 AM

#

Ah. I have a 2600...

#

So I'll try that in 0.16 😛

fiery bolt Nov 11, 2024, 2:13 AM

#

probably a good idea yeah

wicked notch Nov 11, 2024, 12:00 PM

#

boys

#

we did it

#

time to party

wide shadow Nov 11, 2024, 12:14 PM

#

College is over?

wicked notch Nov 11, 2024, 12:15 PM

#

college part 1

#

part 2 is the real stuff

wide shadow Nov 11, 2024, 12:15 PM

#

Don't say masters bleaker_kekw

wicked notch Nov 11, 2024, 12:15 PM

#

yep

frank sail Nov 11, 2024, 12:16 PM

#

let's go

#

I think I had a reminder set for this day

#

where are you doing your masters?

wide shadow Nov 11, 2024, 12:18 PM

#

How much of free time until you start masters

wicked notch Nov 11, 2024, 12:28 PM

#

frank sail where are you doing your masters?

still deciding

wicked notch Nov 11, 2024, 12:28 PM

#

wide shadow How much of free time until you start masters

till sept so plenty of time to catch up with nanitebros

wide shadow Nov 11, 2024, 12:36 PM

#

Damn that's quite a lot

wicked notch Nov 11, 2024, 12:37 PM

#

ye I've decided to take some time off to avoid actually dying KEKW

cunning solstice Nov 11, 2024, 12:38 PM

#

year is actually quite short

#

I blink and it's over

wide shadow Nov 11, 2024, 12:41 PM

#

Don't ever start on that in 5ish months I have school leaving exams

glass sphinx Nov 11, 2024, 1:02 PM

#

very cool

#

you studied agriculture and animal welfare?

wide shadow Nov 11, 2024, 1:18 PM

#

glass sphinx you studied agriculture and animal welfare?

Computer science

wide shadow Nov 11, 2024, 1:36 PM

#

Basically first and second semester of college

#

You get all the jazz about CPUs, how memory works, electrical engineering, programming, databases and operating systems

glass sphinx Nov 11, 2024, 1:39 PM

#

i was betting on lvstri to care for my cows

#

fyck

fiery bolt Nov 11, 2024, 1:48 PM

#

wicked notch still deciding

do it at nanite uni

wicked notch Nov 11, 2024, 2:10 PM

#

what is the nanite uni

loud crag Nov 11, 2024, 2:55 PM

#

#bikeshed-😇

glass sphinx Nov 11, 2024, 4:17 PM

#

lmao

fiery bolt Nov 11, 2024, 4:32 PM

#

wicked notch what is the nanite uni

wherever the italian nanite dude is i guess KEKW

primal shadow Nov 12, 2024, 6:14 AM

#

Blog post for bevy 0.15 meshlet stuff is almost done

#

I kinda wish I scrapped the idea and had just done a blog post or two on some specific parts, instead of everything. It's kinda a big mess of a post, but too late to change now...

#

I think the memory compression section came out well, but not so much the rest

primal shadow Nov 12, 2024, 7:50 AM

#

@fiery bolt BVH for nanite = internal nodes point to cluster groups, and use AABBs based on cluster LOD spheres, and then leaf nodes point to clusters? Or is that wrong?

primal shadow Nov 12, 2024, 7:52 AM

#

fiery bolt so what I do is: - leaf nodes are cluster groups - for each lod, build a normal...

Oh I found this. Why do you build a separate BVH per LOD? I'm trying to think if that accelerates common culling scenarios or something.

wispy spear Nov 12, 2024, 11:42 AM

#

@wicked notch congratulazzione

fiery bolt Nov 12, 2024, 3:57 PM

#

primal shadow <@488643966502436865> BVH for nanite = internal nodes point to cluster groups, a...

leaves are groups, because they must always render together, internal nodes are just a normal SAH-optimized BVH built out of the leaves

fiery bolt Nov 12, 2024, 3:59 PM

#

primal shadow Oh I found this. Why do you build a separate BVH per LOD? I'm trying to think if...

I thought that since all LOD groups would be spatially near, if I built a BVH out of everything at once, multiple LODs would be parented by a single group, so max parent error would always be pretty high

#

so you're gonna be expanding a lot more nodes

primal shadow Nov 12, 2024, 5:20 PM

#

fiery bolt I thought that since all LOD groups would be spatially near, if I built a BVH ou...

Ahhhh that makes sense...

#

then build a BVH of the root nodes (no SAH because the AABBs are gonna be the same)
What does this mean? Literally just take 8 random nodes, group, and repeat until you have a single root node?

#

I.e. if you have 16 LODs, pick 2 sets of 8 randomly to group, and then group the two sets once more

fiery bolt Nov 12, 2024, 6:17 PM

#

primal shadow I.e. if you have 16 LODs, pick 2 sets of 8 randomly to group, and then group the...

yeah pretty sure that's what I do

#

sorting by error is probably a better idea now that I think about it lol
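
The "group 8 at a time until one root is left" step discussed here can be sketched as follows. This is an illustration only, not radiance's code: `Node` and `group_roots` are hypothetical, and the sketch sorts by error first, which is the variant suggested in the message above rather than the random grouping described earlier.

```rust
/// Hypothetical node type: a per-LOD BVH root, or an interior node that
/// groups up to 8 children and stores the max of their errors.
#[derive(Debug)]
pub enum Node {
    LodRoot { lod: u32, error: f32 },
    Interior { children: Vec<Node>, max_error: f32 },
}

impl Node {
    pub fn error(&self) -> f32 {
        match self {
            Node::LodRoot { error, .. } => *error,
            Node::Interior { max_error, .. } => *max_error,
        }
    }
}

/// Repeatedly packs nodes into parents of up to 8 children until a single
/// root remains. Returns `None` for an empty input.
pub fn group_roots(mut roots: Vec<Node>) -> Option<Node> {
    // Assumption: sort by error so roots with similar error share a parent.
    roots.sort_by(|a, b| a.error().total_cmp(&b.error()));
    while roots.len() > 1 {
        let mut next = Vec::new();
        let mut iter = roots.into_iter().peekable();
        while iter.peek().is_some() {
            let children: Vec<Node> = iter.by_ref().take(8).collect();
            let max_error = children.iter().map(Node::error).fold(f32::MIN, f32::max);
            next.push(Node::Interior { children, max_error });
        }
        roots = next;
    }
    roots.pop()
}
```

With 16 LOD roots this produces two parents of 8 and one root above them, matching the example in the question. Since only a handful of nodes are involved, the ordering is unlikely to matter much either way.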

primal shadow Nov 12, 2024, 6:19 PM

#

fiery bolt sorting by error is probably a better idea now that I think about it lol

For grouping the LODs? It probably barely matters right, it's only a few nodes

fiery bolt Nov 12, 2024, 6:20 PM

#

yeah tru

#

even lucy only has 13 LODs

#

so we don't really add too many levels

primal shadow Nov 14, 2024, 5:36 AM

#

I made this diagram to explain how meshopt's LOCK_BORDERS flag works (#2)

#

And then I realized I have no idea how it works

#

So guess I'm not using it and just gonna skip explaining it lol

#

No idea how it preserves the meshlet borders if it's just going off of the topological border

primal shadow Nov 14, 2024, 7:56 AM

#

Hello, I am once again asking for (this time final) feedback on my meshlet blogpost: https://github.com/JMS55/jms55.github.io/blob/ef1d060e11daf89e9ff68f4fdf3bd80f6b0653f2/content/posts/2024_11_14_virtual_geometry_bevy_0_15/index.md

primal shadow Nov 14, 2024, 8:13 PM

#

It's up https://jms55.github.io/posts/2024-11-14-virtual-geometry-bevy-0-15

Virtual Geometry in Bevy 0.15

primal shadow Nov 18, 2024, 7:00 AM

#

I have 0 motivation to do BVH culling after I spent so much time on virtual geometry for bevy 0.15 😬

#

Guess I need to take a break

primal shadow Nov 19, 2024, 7:45 AM

#

@fiery bolt for your BVH, for interior nodes, what bounding sphere do you use to project the error?

#

Your leaf nodes are cluster groups, with error = parent group error, and bounding sphere = parent group bounding sphere

#

And when building an interior node over those leaf nodes, you set error = max error of leaf nodes

#

But what do you set the bounding sphere to be?

#

A new bounding sphere enclosing all the leaf node bounding spheres?

#

(btw it's confusing because you use the DAG parent group LOD data, which is different than the BVH parent lol)

fiery bolt Nov 19, 2024, 10:13 AM

#

primal shadow A new bounding sphere enclosing all the leaf node bounding spheres?

yup

#

it's just all the child BVH nodes' lod spheres merged
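
A small sketch of what "merged LOD spheres plus max error" could look like for an interior node. The types and function names are hypothetical, and the pairwise merge below is conservative rather than the minimal enclosing sphere of all children.

```rust
#[derive(Clone, Copy, Debug)]
pub struct Sphere {
    pub center: [f32; 3],
    pub radius: f32,
}

/// LOD data carried by a BVH node: the sphere used for error projection
/// and the error itself.
#[derive(Clone, Copy, Debug)]
pub struct LodBounds {
    pub sphere: Sphere,
    pub error: f32,
}

/// Smallest sphere that encloses both `a` and `b`.
pub fn merge_spheres(a: Sphere, b: Sphere) -> Sphere {
    let d = [
        b.center[0] - a.center[0],
        b.center[1] - a.center[1],
        b.center[2] - a.center[2],
    ];
    let dist = (d[0] * d[0] + d[1] * d[1] + d[2] * d[2]).sqrt();
    // One sphere already contains the other.
    if dist + b.radius <= a.radius {
        return a;
    }
    if dist + a.radius <= b.radius {
        return b;
    }
    let radius = 0.5 * (dist + a.radius + b.radius);
    // Slide the center from `a` towards `b` so both spheres just fit.
    let t = (radius - a.radius) / dist;
    Sphere {
        center: [
            a.center[0] + d[0] * t,
            a.center[1] + d[1] * t,
            a.center[2] + d[2] * t,
        ],
        radius,
    }
}

/// Interior node LOD data: merged child spheres and the max child error.
pub fn interior_bounds(children: &[LodBounds]) -> Option<LodBounds> {
    let (first, rest) = children.split_first()?;
    let mut out = *first;
    for child in rest {
        out.sphere = merge_spheres(out.sphere, child.sphere);
        out.error = out.error.max(child.error);
    }
    Some(out)
}
```

Because the merged sphere and the max error are both at least as large as any child's, an interior node whose projected error is already small enough means no child below it needs to be visited.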

primal shadow Nov 19, 2024, 5:35 PM

#

Thanks! This shouldn't be too bad to implement then. Just very confusing, because there's both DAG parents and BVH parents 😅

primal shadow Nov 19, 2024, 6:40 PM

#

struct BvhNode {
    child_start_id: u32, // If meshlet, is meshlet ID, else is pointer to BvhNode
    child_count: u16, // If u16::MAX, then node is a single meshlet, else is BvhNode child count
    error: f16, // If meshlet then is group_error, else if is lod group then is parent_group_error, else is max of child's parent_group_errors
    bounding_sphere: vec4<f32>, // If meshlet then is group_bounding_sphere, else if is lod group is parent_group_bounding_sphere, else is a new bounding sphere enclosing all children bounding spheres
}

😅

fiery bolt Nov 19, 2024, 9:04 PM

#

primal shadow ```rust struct BvhNode { child_start_id: u32, // If meshlet, is meshlet ID, ...

you should have groups as leaf children, not singular meshlets

#

what I do is SOAify my BVH into a BVH8, so a u8::max is a single node child, otherwise it's meshlet count

#

('single node' => actually holds data for 8 nodes)

#

this also reduces queue memory size by 8x

#

probably the main reason I did it tbh

primal shadow Nov 19, 2024, 9:08 PM

#

fiery bolt you should have groups as leaf children, not singular meshlets

Right, but I plan to reuse the same type for the meshlets within each leaf

fiery bolt Nov 19, 2024, 9:08 PM

#

thonk

#

not entirely sure how that would work

primal shadow Nov 19, 2024, 9:09 PM

#

Each meshlet needs a bounding sphere and error anyways

fiery bolt Nov 19, 2024, 9:09 PM

#

but also a tight cull sphere and other metadata

primal shadow Nov 19, 2024, 9:09 PM

#

Because after the lod group check against parent error, you need to check the meshlets self error

primal shadow Nov 19, 2024, 9:10 PM

#

fiery bolt but also a tight cull sphere and other metadata

Yeah I have that separate

fiery bolt Nov 19, 2024, 9:10 PM

#

ah ok

#

if it works, it works lol

primal shadow Nov 19, 2024, 9:11 PM

#

Mhm

glass sphinx Nov 19, 2024, 9:30 PM

#

btw what is bvh culling?

#

isnt the dag already a tree that can be used to cull?

primal shadow Nov 19, 2024, 9:33 PM

#

glass sphinx isnt the dag already a tree that can be used to cull?

Yeah so this confused me as well. But dag traversal is a lot more expensive than rearranging it into a bvh with a root per lod level, is why I think it's done.

glass sphinx Nov 19, 2024, 9:56 PM

#

hm

fiery bolt Nov 19, 2024, 11:18 PM

#

glass sphinx isnt the dag already a tree that can be used to cull?

DAGs aren't really trees

#

they reconverge

#

so DAG traversal is more complex, as you have to ensure you don't revisit things

#

since we can already do LOD selection in parallel, we don't really have to follow the DAG, we can use any structure to accelerate it

#

thus, the BVH

wicked notch Nov 20, 2024, 1:05 AM

#

graph is hard to visit

#

bvh is ez

fiery bolt Nov 20, 2024, 1:57 AM

#

wicked notch bvh is ez

if it's so ez why don't you have it

#

froge_evil

primal shadow Nov 20, 2024, 6:12 AM

#

@fiery bolt are you doing frustum + occlusion culling in the same kernel as the LOD traversal, or do you do hierarchical LOD traversal, write all the meshlets to a buffer, and then do culling?

fiery bolt Nov 20, 2024, 11:45 AM

#

primal shadow <@488643966502436865> are you doing frustum + occlusion culling in the same kern...

I frustum and occlusion cull BVH nodes too, and also have a separate meshlet cull stage

glass sphinx Nov 20, 2024, 5:29 PM

#

im confused

#

you have to visit the dag anyway as you instantiate meshlets, no?

#

what is the dag for at all if its not used?

fiery bolt Nov 20, 2024, 5:32 PM

#

the DAG exists during build

#

but you don't have to visit it, since LOD decision can be localized if you store current and parent error

#

so now you can just build a BVH out of parent error to accelerate stuff
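
The "localized" LOD decision can be sketched like this. Everything here is a simplified assumption: the names are hypothetical, and the projection is a plain perspective divide, where `proj_scale` stands for something like `screen_height / (2 * tan(fov_y / 2))`, not the exact formula any of these renderers use. A meshlet is drawn when its own group's error is small enough on screen but its parent group's error is not.

```rust
/// Hypothetical LOD data: bounding sphere used for projection plus the
/// simplification error in world units.
pub struct LodSphere {
    pub center: [f32; 3],
    pub radius: f32,
    pub error: f32,
}

fn distance(a: [f32; 3], b: [f32; 3]) -> f32 {
    let d = [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
    (d[0] * d[0] + d[1] * d[1] + d[2] * d[2]).sqrt()
}

/// Approximate on-screen size of the error in pixels, measured at the
/// point of the sphere closest to the camera.
pub fn projected_error_px(b: &LodSphere, camera: [f32; 3], proj_scale: f32, near: f32) -> f32 {
    let d = (distance(b.center, camera) - b.radius).max(near);
    b.error * proj_scale / d
}

/// Needs only the meshlet's own group data and its parent group data,
/// so every meshlet can be tested independently and in parallel.
pub fn should_render(
    own: &LodSphere,
    parent: &LodSphere,
    camera: [f32; 3],
    proj_scale: f32,
    near: f32,
    threshold_px: f32,
) -> bool {
    projected_error_px(own, camera, proj_scale, near) <= threshold_px
        && projected_error_px(parent, camera, proj_scale, near) > threshold_px
}
```

As long as error grows monotonically from child groups to parent groups, exactly one level along any path through the DAG passes this test, which is why the DAG never has to be walked at runtime and a BVH over parent error can be used to skip whole subtrees.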

glass sphinx Nov 20, 2024, 5:34 PM

#

hmmmm

#

i see

#

but arent the parents spawning the childrens work?

fiery bolt Nov 20, 2024, 5:37 PM

#

yeah, from the BVH

glass sphinx Nov 20, 2024, 5:37 PM

#

i dont get it

#

its not a bvh tho its a dag

#

like what does the dag do then

#

at build time

fiery bolt Nov 20, 2024, 5:38 PM

#

you're converting a DAG to a BVH

#

at build time

glass sphinx Nov 20, 2024, 5:38 PM

#

is that possible

#

huh

#

i dont have intuition for that at all

fiery bolt Nov 20, 2024, 5:39 PM

#

the DAG isn't really important tbh

#

it doesn't really 'exist' at build time either

#

there's no data structure explicitly storing it

glass sphinx Nov 20, 2024, 5:39 PM

#

i see

fiery bolt Nov 20, 2024, 5:39 PM

#

it's just implicitly there with how groups relate to each other

#

but since you only care about the current group and parent group

#

you can BVH-ify it

glass sphinx Nov 20, 2024, 5:40 PM

#

ok wait

primal shadow Nov 20, 2024, 5:40 PM

#

fiery bolt I frustum and occlusion cull BVH nodes too, and also have a separate meshlet cul...

Heck? Do you have a separate tight culling sphere for every BVH node?

fiery bolt Nov 20, 2024, 5:40 PM

#

AABB

#

but yeah

#

public struct BvhNode {
    public Aabb aabbs[8];
    public f32x4 lod_bounds[8];
    public f32 parent_errors[8];
    public u32 child_offsets[8];
    public u8 child_counts[8];
}

glass sphinx Nov 20, 2024, 5:45 PM

#

@fiery bolt when a child finds out its parent is instantiated it kills itself?

fiery bolt Nov 20, 2024, 5:45 PM

#

no

#

the parent doesn't spawn the child

primal shadow Nov 20, 2024, 5:45 PM

#

fiery bolt ```java public struct BvhNode { public Aabb aabbs[8]; public f32x4 lod_b...

Why do they store 8 values at a time per node?

fiery bolt Nov 20, 2024, 5:45 PM

#

SOA

glass sphinx Nov 20, 2024, 5:45 PM

#

fiery bolt the parent doesn't spawn the child

but how do you do the lod cut then

fiery bolt Nov 20, 2024, 5:45 PM

#

so i can have one index in the queue for 8 nodes

primal shadow Nov 20, 2024, 5:45 PM

#

I see

fiery bolt Nov 20, 2024, 5:46 PM

#

nanite also does a BVH8 iirc

primal shadow Nov 20, 2024, 5:46 PM

#

So you basically just build a BVH, and then inline the 8 children into each node?

#

and then the children start/end become what?

glass sphinx Nov 20, 2024, 5:46 PM

#

fiery bolt the parent doesn't spawn the child

but the parents share children with a dag

fiery bolt Nov 20, 2024, 5:47 PM

#

primal shadow and then the children start/end become what?

it's a single index for BVH nodes, with count == 255

#

count only matters for meshlets
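
A sketch of how one slot of such a node could be decoded, based only on what is said above: a count of 255 marks a child BVH node addressed by a single index, and any other count is a run of meshlets. The Rust types are hypothetical stand-ins for the Slang struct, only the two relevant arrays are included, and treating a count of 0 as an empty slot is an assumption.

```rust
use std::ops::Range;

/// Count value that marks a slot as pointing at another BVH node.
pub const NODE_SENTINEL: u8 = u8::MAX;

/// Subset of the SoA node: 8 children packed into one node.
pub struct BvhNode8 {
    pub child_offsets: [u32; 8],
    pub child_counts: [u8; 8],
}

#[derive(Debug, PartialEq)]
pub enum Child {
    /// Assumption: unused slots are stored with a count of 0.
    Empty,
    /// Index of another `BvhNode8`.
    Node(u32),
    /// Range of meshlet indices belonging to one group.
    Meshlets(Range<u32>),
}

pub fn decode_child(node: &BvhNode8, slot: usize) -> Child {
    let offset = node.child_offsets[slot];
    match node.child_counts[slot] {
        0 => Child::Empty,
        NODE_SENTINEL => Child::Node(offset),
        n => Child::Meshlets(offset..offset + n as u32),
    }
}
```

One queue entry then covers all 8 slots of a node, which is where the 8x reduction in queue memory mentioned earlier comes from.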

fiery bolt Nov 20, 2024, 5:48 PM

#

glass sphinx but the parents share children with a dag

ok so, firstly, do you understand how it works without a BVH?

#

if you just expand to all meshlets

glass sphinx Nov 20, 2024, 5:48 PM

#

no

fiery bolt Nov 20, 2024, 5:48 PM

#

read jasmine's first blog froge_love

#

then ping me and i'll tell you how the BVH works
