#Iris - A Journey through OpenGL and beyond to learn Graphics

23407 messages · Page 24 of 24 (latest)

primal shadow
#

Single index for BVH nodes? So nodes point to only 1 other node?

#

I'm confused how you built that

fiery bolt
#

each 'node' is actually 8 nodes

#

so one index points to 8 nodes

#

each of the 8 internal nodes has one child index, which represents 8 nodes

primal shadow
#

Oh I see, child_offsets is also [8]

#

How the heck do you build that? Is there a way to take a normal BVH with 8 children per node, and create that?

fiery bolt
#

you just 'hoist' things up a level

primal shadow
#

Yeah I think I kind of see it. I think you need to do it bottom up right?
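
A minimal Python sketch of the "hoisting" idea discussed above, assuming an ordinary tree where every node has one AABB and at most 8 children. `TreeNode`, `PackedNode`, `hoist`, and the `-1` empty-slot marker are hypothetical names for illustration, not fiery bolt's actual code. It recurses top-down for brevity; the same packing could presumably be done bottom-up.

```python
from dataclasses import dataclass, field

WIDTH = 8
ZERO_AABB = ((0.0, 0.0, 0.0), (0.0, 0.0, 0.0))  # zeroed bounds for unused slots
EMPTY = -1  # hypothetical marker for an unused child slot


@dataclass
class TreeNode:
    """Ordinary BVH node: one AABB plus up to 8 children, or a leaf payload."""
    aabb: tuple
    children: list = field(default_factory=list)
    leaf_id: int = -1  # >= 0 marks a leaf (e.g. a LOD group index)


@dataclass
class PackedNode:
    """Hoisted node: stores the AABBs of its 8 children instead of its own."""
    child_aabbs: list
    child_offsets: list  # packed-node index of each child's children, or a leaf id
    child_is_leaf: list


def hoist(root):
    packed = []

    def pack(node):
        # Reserve the slot first so the root lands at index 0.
        index = len(packed)
        packed.append(None)
        aabbs, offsets, leaf_flags = [], [], []
        for child in node.children:
            aabbs.append(child.aabb)  # the child's bounds move up into the parent
            is_leaf = child.leaf_id >= 0
            leaf_flags.append(is_leaf)
            offsets.append(child.leaf_id if is_leaf else pack(child))
        pad = WIDTH - len(aabbs)  # pad to exactly 8 slots
        packed[index] = PackedNode(
            aabbs + [ZERO_AABB] * pad,
            offsets + [EMPTY] * pad,
            leaf_flags + [False] * pad,
        )
        return index

    pack(root)
    return packed
```

With this layout a single index identifies a `PackedNode`, which in turn describes 8 real nodes.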

primal shadow
#

Also i'm curious why you used AABBs and not bounding spheres btw

glass sphinx
#

btw jasmine what is your pfp

#

i am staring at it a lot but its just a mush to me i see it

primal shadow
#

Oh it's not hard to do an approximate method btw

#

I'll stick to spheres then

#

Anyways time to go redo my plan for the BVH yet again 😅

fiery bolt
#

yeah i use an approximate thing for lod spheres

#

but approximate is bad when you want the tightest bounds you can get
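
For reference, one well-known approximate method is Ritter's bounding sphere. The chat does not say which method either participant uses, so this Python sketch is only an illustration, and `ritter_sphere` is a hypothetical name. The result always encloses the points but is usually somewhat larger than the minimal sphere, which is the tightness trade-off mentioned above.

```python
import math


def ritter_sphere(points):
    """Approximate bounding sphere for a non-empty list of (x, y, z) tuples."""
    # Pick any point, find the farthest point from it, then the farthest from that.
    p0 = points[0]
    p1 = max(points, key=lambda p: math.dist(p0, p))
    p2 = max(points, key=lambda p: math.dist(p1, p))

    # Start with the sphere whose diameter is the segment p1..p2.
    center = tuple((a + b) / 2.0 for a, b in zip(p1, p2))
    radius = math.dist(p1, p2) / 2.0

    # Grow the sphere just enough to include every point still outside it.
    for p in points:
        d = math.dist(center, p)
        if d > radius:
            new_radius = (radius + d) / 2.0
            shift = (d - new_radius) / d  # fraction of the way to move toward p
            center = tuple(c + (q - c) * shift for c, q in zip(center, p))
            radius = new_radius
    return center, radius
```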

primal shadow
#

hmm

glass sphinx
#

@fiery bolt do you have locked edges?

#

like across all lods

fiery bolt
#

no

glass sphinx
#

in jasmine's blog she just tests all clusters

#

which makes sense

#

but that is not hierarchical

fiery bolt
#

yes so now

#

you can do everything in parallel right?

#

just build a BVH out of that

glass sphinx
#

i dont understand

fiery bolt
#

so you have groups with a self and parent error

glass sphinx
#

if you dont have locked edges, you must have a dag. Dag implies that each parent will potentially share children in simplification. So when a parent gets culled, survives, and then sends its children to be processed, duplicates can appear, as other parents have the same children

fiery bolt
#

dude ignore the DAG

#

it's not real

#

there's no DAG

glass sphinx
#

what?

#

no dag means locked edges

glass sphinx
#

yes

#

when grouping meshlets in each lod

fiery bolt
#

ignore build time

#

is there a DAG at runtime

glass sphinx
#

how do you build the bvh

fiery bolt
#

each LOD gets a SAH-optimized BVH

glass sphinx
#

SAH?

fiery bolt
#

where each node also stores the max parent error

#

surface area heuristic

glass sphinx
#

ok

#

what is a node

fiery bolt
#

a BVH node

#

AABB + max parent error

glass sphinx
#

each lod has a full bvh?

fiery bolt
#

yes

#

then they're combined together

glass sphinx
#

oooooooooooooooooooooooooooooooooooooooooooooooooooooooh

fiery bolt
#

in order

glass sphinx
#

now i understand

#

i thought the lod levels somehow are the bvh levels

fiery bolt
#

i mean they're combined into one BVH at the end

glass sphinx
#

so you do cull all lods at the same time

#

but each has a bvh

fiery bolt
#

i just make one BVH per LOD and then combine them to make building easier

glass sphinx
#

how

#

i can kinda see it i guess

fiery bolt
#

take the root nodes of each LOD

#

and make a BVH out of them

glass sphinx
#

yea i see it

fiery bolt
#

no SAH optimization because the AABBs are all the same

glass sphinx
#

wait no i dont see it

primal shadow
#

When building your DAG, make a list of LOD groups for each LOD.
Afterwards, build a BVH for each LOD level, with your leaf nodes being the LOD groups.
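
A rough Python sketch of the two steps described in this exchange: bucket the LOD groups by LOD, build one BVH per LOD with the groups as leaves, then build a small BVH over the per-LOD roots. Everything here is an assumption for illustration: the dict layout, the names `build_bvh` and `build_combined_bvh`, and especially the split strategy, which is a plain sort-and-chunk along x rather than the SAH optimization fiery bolt mentions.

```python
def union(a, b):
    """Smallest AABB containing AABBs a and b, each given as (min_xyz, max_xyz)."""
    (amin, amax), (bmin, bmax) = a, b
    return (tuple(map(min, amin, bmin)), tuple(map(max, amax, bmax)))


def build_bvh(items, width=8, sort=True):
    """Build a width-ary BVH over dict nodes that have an 'aabb' key; return the root.

    Not SAH: nodes are sorted by x centre and grouped in runs of `width`.
    Expects a non-empty list.
    """
    nodes = list(items)
    while len(nodes) > 1:
        if sort:
            nodes.sort(key=lambda n: n["aabb"][0][0] + n["aabb"][1][0])
        parents = []
        for i in range(0, len(nodes), width):
            chunk = nodes[i:i + width]
            aabb = chunk[0]["aabb"]
            for n in chunk[1:]:
                aabb = union(aabb, n["aabb"])
            parents.append({"aabb": aabb, "children": chunk})
        nodes = parents
    return nodes[0]


def build_combined_bvh(groups):
    """groups: dicts with 'lod', 'aabb' and 'group_id' keys."""
    by_lod = {}
    for g in groups:
        by_lod.setdefault(g["lod"], []).append(g)
    # One BVH per LOD level; the leaves are the LOD groups themselves.
    lod_roots = [build_bvh(by_lod[lod]) for lod in sorted(by_lod)]
    # A BVH over the per-LOD roots, kept in LOD order, gives a single entry point.
    return build_bvh(lod_roots, sort=False)
```

Because each LOD covers the whole mesh, the per-LOD root AABBs are nearly identical, so the top level is not sorted or optimized here.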

glass sphinx
#

but if the groups are the nodes then its a dag

fiery bolt
#

how

glass sphinx
#

multiple meshlets go into the same group

#

and many come out again

fiery bolt
#

yes that's the meshlet DAG

#

which is reduced down to a local decision

#

each group can locally decide if it should render

glass sphinx
#

nonono

#

we are talking about the bvh

#

you cant use this for a bvh as its a dag

fiery bolt
#

what is 'this'

glass sphinx
#

so

#

n meshlets in lod n form a group

#

and m meshlets in lod n+1 come out

#

i assume here youd cull lod n+1, survives then spawn work for lod n

#

this is a dag tho (cause multiple meshlets in lod n+1 point to the same group), so you cant build a bvh for this

#

actually yea you can build a bvh

#

but it will have duplicates

fiery bolt
#

literally me rn

glass sphinx
#

can you be more specific

fiery bolt
#

you said you understood and then you went back to your original misunderstanding in 20 microseconds

glass sphinx
#

agonyfrog i forgot to write lod

glass sphinx
#

but then you said you somehow combine all bvhs of all lods

fiery bolt
#

i take the root node for each LOD BVH

#

and just add them to a BVH

glass sphinx
#

huh

fiery bolt
#

both the cases are the same

glass sphinx
#

so you make a bvh over the roots?

fiery bolt
#

yeah

glass sphinx
#

why

fiery bolt
#

so i have one thing to work with lol

glass sphinx
#

i see

#

ok

#

then i get it

fiery bolt
#

but the DAG is completely irrelevant

#

it doesn't matter

#

what matters is that each group has a local and parent error
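
The chat does not spell out the local test, but the commonly described rule for this kind of LOD hierarchy is: draw a group when its own error is acceptable and its parent's error is not. A small Python sketch under that assumption, with errors already projected to pixels and `should_render` as a hypothetical name:

```python
def should_render(self_error_px, parent_error_px, threshold_px=1.0):
    """Local decision: this group is fine, but the coarser parent would not be."""
    return self_error_px <= threshold_px and parent_error_px > threshold_px


# One chain from finest to coarsest; errors grow monotonically toward the root.
errors = [0.0, 0.3, 0.8, 2.0, 5.0]
parents = errors[1:] + [float("inf")]  # the coarsest level has no parent
selected = [i for i, (s, p) in enumerate(zip(errors, parents)) if should_render(s, p)]
```

Because the errors are monotonic along the chain, exactly one level passes, and no group needs to know what any other group decided.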

glass sphinx
#

yea if all lods are separately evaluated then its irrelevant i see that

glass sphinx
#

is that how nanite does it also?

fiery bolt
#

yes

glass sphinx
#

huh

fiery bolt
#

idk how they build the BVH

glass sphinx
#

seems a bit wasteful

fiery bolt
#

but they have a BVH8

#

wasteful how

glass sphinx
#

the bvh is only for occlusion cull right?

fiery bolt
#

no

#

it's mainly for lod cull

glass sphinx
#

can it discard nodes on error?

fiery bolt
#

i just do frustum and occlusion cull because why not

glass sphinx
#

so occlusion + frustum?

glass sphinx
#

lod cull

#

how does it lod cull

fiery bolt
#

by storing max parent error

glass sphinx
#

can the bvh estimate what children are needed?

fiery bolt
#

if max parent error is less than a pixel, all groups under the current node are going to have parent error less than a pixel

#

therefore

#

they are too detailed

glass sphinx
#

aaaaah

#

ok then its not wasteful as i thought

fiery bolt
#

parent error in this case is DAG parent error
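
A Python sketch of the max-parent-error idea, assuming dict nodes where leaves (LOD groups) carry a world-space `parent_error` and inner nodes carry `children`. The projection formula is a generic perspective estimate and the default screen height and field of view are placeholders, not values from the chat.

```python
import math


def annotate_max_parent_error(node):
    """Store on each node the largest parent error of any group beneath it."""
    if "children" not in node:  # leaf = LOD group
        node["max_parent_error"] = node["parent_error"]
    else:
        node["max_parent_error"] = max(
            annotate_max_parent_error(child) for child in node["children"]
        )
    return node["max_parent_error"]


def projected_error_px(error_world, distance, screen_height_px=1080,
                       fov_y=math.radians(60)):
    """Approximate on-screen size, in pixels, of a world-space error at `distance`."""
    return error_world * screen_height_px / (
        2.0 * max(distance, 1e-6) * math.tan(fov_y / 2.0)
    )


def subtree_too_detailed(node, distance, threshold_px=1.0):
    """True when every group under `node` has a sub-threshold parent error.

    `distance` should be the closest distance from the camera to the node's
    bounds so that the test stays conservative.
    """
    return projected_error_px(node["max_parent_error"], distance) < threshold_px
```

When `subtree_too_detailed` returns true, the parents of all groups below the node are already good enough, so the whole subtree can be skipped.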

primal shadow
#

Ok so you have this node as your root. What steps are you doing? Does each thread test the 8 children, and then write which of the 8 children that pass, or?

fiery bolt
#

i split one node across 8 threads

#

so each thread tests one and writes the child if it passes

#

if it's a leaf (pointing to a group), it writes all of the meshlets into the meshlet cull input buf

#

then meshlet cull runs at the very end to check self errors and bin hw/sw

#

and frustum and occlusion cull ofc
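
The traversal described above can be sketched on the CPU as follows. This is a Python stand-in for what would be a GPU compute pass: the inner loop over `lane` plays the role of the 8 threads, `child_passes` is a hypothetical callback covering the frustum, occlusion, and LOD tests, and empty slots are marked with `-1` here (the chat instead uses zeroed nodes).

```python
def traverse(packed_nodes, groups, child_passes):
    """Walk a packed BVH8 level by level and collect meshlets for the meshlet cull pass.

    packed_nodes[i] holds 8-entry lists: 'aabbs', 'max_parent_errors',
    'offsets' and 'is_leaf'. groups[g] lists the meshlet ids of LOD group g.
    """
    meshlet_cull_input = []
    queue = [0]  # one index = one packed node = 8 children
    while queue:
        next_queue = []
        for node_index in queue:
            node = packed_nodes[node_index]
            for lane in range(8):  # on the GPU: one thread per child
                offset = node["offsets"][lane]
                if offset < 0:  # empty slot
                    continue
                if not child_passes(node["aabbs"][lane],
                                    node["max_parent_errors"][lane]):
                    continue
                if node["is_leaf"][lane]:
                    # Leaf points at a LOD group: emit all of its meshlets.
                    meshlet_cull_input.extend(groups[offset])
                else:
                    # Inner child: enqueue the packed node holding its 8 children.
                    next_queue.append(offset)
        queue = next_queue
    return meshlet_cull_input
```

The returned list corresponds to the meshlet cull input buffer; the per-meshlet self-error check and hardware/software binning would run on it afterwards.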

primal shadow
#

Ok but why split one node across 8 threads?

#

Why not just do the normal BVH stuff of having one node point to 8 children, test the node, and then if it passes add its 8 children to a buffer for the next run

#

And have one node per thread

fiery bolt
#

i did but that needs 8x more queue memory

#

so i switched to the SOA thingy

fiery bolt
#

since i have 1 node struct representing 8 real nodes, i only need 1 index in the queue to represent 8 nodes

#

if 1 index was 1 node, i would need 8 indices to represent 8 nodes

#

thus, 8x the mem
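
The arithmetic behind the 8x figure, as a tiny Python sketch. The node count and the 4-byte index size are made-up numbers for illustration only.

```python
def queue_bytes(real_nodes_in_flight, nodes_per_index, bytes_per_index=4):
    """Queue size when each queued index stands for `nodes_per_index` real nodes."""
    indices = -(-real_nodes_in_flight // nodes_per_index)  # ceiling division
    return indices * bytes_per_index


one_per_index = queue_bytes(1_000_000, 1)    # one index per real node
eight_per_index = queue_bytes(1_000_000, 8)  # one index per packed group of 8
```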

#

i should try a bvh16 thonk

primal shadow
#

E.g. given node 0 with children [1, 7], write node 0

#

And then in the next stage 8 threads pull node 0, and process 1-7

#

Although idk how you would handle < 8 children hmmm

fiery bolt
#

that's an extra dependent mem load

#

you can handle less than 8 the same way i currently do (by just having zeroed nodes)

primal shadow
#

Ok so the options are basically:

8 threads per node
Each thread processes one child of the node, doing frustum + occlusion culling (different per thread), and LOD test (the same for each thread in the node, I think?)
Each thread with a passing child writes one node index (containing 8 children of the child) to the output buffer

OR

1 thread per node
LOD test, frustum, + occlusion culling using the AABB of the node itself and not the children
If passing, write 8 indices (8 children of the node) to the output buffer

fiery bolt
#

the lod test would be different for every thread in both cases

#

might also wanna experiment with bvh16 or bvh4

primal shadow
#

god this is so complicated

#

Gotta traverse the BVH

#

then once you get a leaf, it's a lod group

#

So then iterate that to get meshlets

#

And then test the meshlets too

fiery bolt
#

you already have the meshlet test part tho

#

so it's just one more pass

primal shadow
#

Yeah, just, so many damn pieces

#

The BVH nodes, where the leaves point to LOD groups, and then the meshlets within the group...

#

And there's so many different ways you can inline different parts of the traversal or distribute it across threads

ebon ruin
#

great read

primal shadow
# ebon ruin great read

Glad you enjoyed it! I'm realizing I need a comment section on my blog so I can see if people are actually engaging with it or not 😅

#

AI? 🤔

ebon ruin
#

sorry lol

#

“Glad you enjoyed it!” immediately made me think of ai

#

that crappy snapchat ai

primal shadow
#

oh lol

glass sphinx
#

how much faster is sw raster for you guys?

#

in my scenes its around 20-30%

#

a lot worse than what nanite ppl claim with up to 3x

primal shadow
#

It's much faster than vertex shaders, but from what I've heard not much faster than mesh shaders, _at reasonable triangle sizes_

#

You'll get more improvements if you have really tiny 1 pixel triangles

#

Also the dag structure matters a lot more than the raster method

glass sphinx
#

hmm

#

ill say hw raster only then

primal shadow
#

Faster is faster. If you've already done the work, might as well keep it.

#

You should read my blog though, I covered this 🙂

fiery bolt
#

if it's already done, keep it

glass sphinx
#

id have to impl the splitting into hw and sw meshlets

#

i wonder where the nanite guys got the 3x number

#

i dont think vertex shaded meshlets are that much worse

fiery bolt
#

having to generate index buffers is slow

#

also nanite has much smaller trognles than you probably

#

sw raster is only really faster if all triangles are about 1-3 px
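
One way the hardware/software split could be decided per meshlet, sketched in Python. The heuristic (projected bounding-sphere area divided by triangle count), the field names, and the function names are all assumptions for illustration; only the roughly 3 px cutoff comes from the message above.

```python
import math


def estimated_triangle_px(radius_px, triangle_count):
    """Very rough edge length of an average triangle, in pixels."""
    area_px = math.pi * radius_px * radius_px  # projected bounding-sphere area
    return math.sqrt(area_px / max(triangle_count, 1))


def bin_meshlets(meshlets, sw_max_px=3.0):
    """Split meshlet ids into (software, hardware) raster bins by estimated triangle size."""
    sw, hw = [], []
    for m in meshlets:
        size = estimated_triangle_px(m["radius_px"], m["triangles"])
        (sw if size <= sw_max_px else hw).append(m["id"])
    return sw, hw
```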

glass sphinx
#

so they are much smaller than nanite on average

#

maybe that is a problem for sw raster

glass sphinx
#

maybe that changed

fiery bolt
#

your mesh shaders are subpixel culling right

glass sphinx
#

yes

fiery bolt
#

makes sense then

#

the hw raster and pixel quads are what's slow

#

if everything is being culled, mesh is gonna be almost as fast as sw

glass sphinx
#

im guessing the vertex shader and cull work is disproportionate in my test yea

#

cause most tris are culled

#

the actual rasterization is not a bottleneck

#

but the meshlet processing is

#

i need lods

fiery bolt
#

when things get small it hits 2-3px total per tri

glass sphinx
#

@fiery bolt how does your world format look like

fiery bolt
#

uhhhhh

#

list of instances KEKW

primal shadow
#

Taking some time off virtual geometry stuff to go back to RT lighting instead.

storm wolf
#

i think u've encouraged me to try and start compressing my meshlets

primal shadow
#

Enjoy! It's very cool.

storm wolf
#

nice, thanks!

ebon ruin
#

@primal shadow im calling you out

#

from afar your pfp looks like some sort of pokemon, the wings being ears, the girl being the mouth

primal shadow
#

Huh, weird

#

It's stolen from the rust zulip, although idt they use the icon anymore

#

And I have no idea what the original source is

glass sphinx
#

where is lvstri

wide shadow
#

When his masters comes around

#

I guess

glass sphinx
#

btw lukasino i made a renderdoc like inspector for taskgraph

#

ill put it into daxa as a util some time this quarter

wide shadow
#

👀

#

I know that you have it in timberdoodle but wasnt expecting it to get into daxa as util

glass sphinx
#

i separated it mostly from tido now so should be mostly easy

wide shadow
#

oh yeah you meant something else facepalm

glass sphinx
#

huh

#

i need to add buffer inspection

#

then its good to go i think

wide shadow
#

My brain is not braining for past few days

glass sphinx
#

yea thats what will come

#

in the util

#

it injects itself into task graph so you can inspect images in attachments

wide shadow
#

cool

fiery bolt
#

what would the use of something like that be?

glass sphinx
#

its renderdoc lite in real time

wide shadow
#

debugging features in your engine

#

very nice since my renderdoc capture is always broken somehow

glass sphinx
#

i couldnt use it since forever cause i have rt

wide shadow
#

tbf I havent used renderdoc in months

glass sphinx
#

i think i prefer the task graph viewer now anyway

#

a lot more convenient

wide shadow
#

yeah

wide shadow
#

I guess that will be hidden in task graph

#

There is already the allocator

glass sphinx
#

yep buffers are the big downside

glass sphinx
#

i hope i can extract layouts and stuff from slang

fiery bolt
#

yeah pershonally I'd just open in renderdoc and use it KEKW

glass sphinx
#

renderdoc constantly doesnt work

glass sphinx
#

doesnt work with lots of extensions

fiery bolt
#

it's better than wish.com renderdoc that I could make

wide shadow
#

If it works thats other thing

fiery bolt
#

nosight?

glass sphinx
#

nsight is so ass tho

#

also has no bda at all

fiery bolt
#

true

wide shadow
#

☝️

wide shadow
#

🤨

fiery bolt
#

I thought you didn't have buffer view support yet?

glass sphinx
#

yep

glass sphinx
#

fr tho the real time view is super duper convenient

wide shadow
#

also much better than spamming printf KEKW

fiery bolt
#

debug printf works for you? bleaker_kekw

wide shadow
#

yes

fiery bolt
#

lucky

wide shadow
#

for very long time

#

what not worky for you?

fiery bolt
#

idk it just crashes if I enable it

#

don't rember why

#

slang moment

wide shadow
#

heh

glass sphinx
#

hmm it also works for me

wide shadow
#

I just go to vkconfig get the printf preset and launch my thing and add printf to my shader

glass sphinx
#

tools, am i right

wide shadow
#

thats it

fiery bolt
#

ok maybe I should try again sometime

wide shadow
#

also update slang

#

I am updating it every few weeks

fiery bolt
#

I should also implement a pass viewer and debug view or something like you guys

wide shadow
#

lambdas at home dropped few months ago

fiery bolt
#

ifunc?

wide shadow
#

functor

#

ifunc is something new no clue KEKW

fiery bolt
#

function interface

#

my issue is that the newest slang version doesn't build on windows if I disable some targets or something

wide shadow
#

you build from source? bleakekw

#

we yoink the dlls or whatever from releases

glass sphinx
#

real

glass sphinx
#

you got to suffer like us

fiery bolt
#

in rust?

wide shadow
#

there were some rust wrappers around daxas c api

fiery bolt
#

bro I'll just use my own render graph at that point lol

wide shadow
#

I proposed to nuke daxa c api since nobody uses it

fiery bolt
#

I have my own 'RHI'

wide shadow
#

I wish but I am not touching metal

wide shadow
#

🤨

glass sphinx
#

hahahaha

loud crag
#

same goes for nabla

glass sphinx
#

🤪

loud crag
#

(dont tell devsh i said this)

wide shadow
#

they will be enlightened soon

glass sphinx
#

do you need a new vacuum? the daxa 3000 has the best ....

glass sphinx
#

btw @primal shadow if you make nanite, do you also plan to make vsms?

Vsms are kinda needed to capture the detail of the high poly geo and avoid popping.

I feel like it doesnt make much sense to have nanite geo if the shadows still pop/dont show the small details

#

rt shadows? 🫨

#

that would prob be what i would try

delicate rain
#

good luck with rebuilding the AS for the nanite changing geo

glass sphinx
#

ah forgot about that

fiery bolt
#

rtx megageometry hard req froge_love

primal shadow
#

You already have the high poly gbuffer, you can just trace direct lighting off of that and have it be good quality

glass sphinx
#

i was thinking about the shadows for direct lighting

primal shadow
#

Yeah, just RT from the gbuffer you rasterized. Why wouldn't that work?

primal shadow
#

Screen trace + RT direct lighting should be plenty.

glass sphinx
#

you defo need the screen trace for the small details

primal shadow
#

And depending on what mega geometry is it might involve scrapping the entire raster system I've written anyways 😅

glass sphinx
#

the future is chrome raytraced

primal shadow
#

I hope so

primal shadow
#

I have absolutely 0 motivation to work on stuff rn 😦

#

I have everything sketched out for the VG BVH changes, but just don't feel like sitting down and writing it

ebon ruin
#

I wonder what happened here

frank sail
#

shhh, he is eeping

faint crane
#

I see Meshoptimizer recently added meshopt_partitionClusters which could finally replace METIS.

ebon ruin
#

free nanite?

faint crane
#

Tried my hand at it. Can't get the LOD chain as deep as I'd like; many triangles are getting stuck at LOD 1.

#

Not using meshopt_buildMeshletsFlex yet but the Stanford bunny won't even simplify into LOD 1.

#

Let me see if I can compare METIS.

fiery bolt
#

i think your meshlets or groups or both are fucked

faint crane
#

Both. The initial meshlets are very rough.

faint crane
#

Initial clusters with meshopt_buildMeshlets vs meshopt_buildMeshletsFlex.

fiery bolt
#

btw when exporting did you merge vertices and shade smooth

faint crane
#

I merge vertices here. Don't know how the shading was exported here.

fiery bolt
#

that mesh looks horrid

#

you should use the original scan

#

and merge vertices in blender on import and then shade smooth

faint crane
#

Thanks. Will give it a try.

dull oyster
#

It helps a lot

frank sail
#

I hope you've been well my man

wicked notch
#

Slowly getting my life back together

buoyant summit
#

woah

delicate rain
#

He returns

faint crane
#

Looking over compression and now streaming so I can throw a trillion triangle Deccer cube in.

faint crane
#

Party's over. Hide the frog food.

ebon ruin
#

or at least significantly easier than using metis

faint crane
#

Yes. See the nanite.cpp demo example in the meshoptimizer repo.

faint crane
#

Implemented streaming to find that I still have cracks in my DAG with larger assets like Activision Caldera or Intel Jungle Ruins.

#

Curious now as to what Unreal would produce for these.
