Iris - A Journey through OpenGL and beyond to learn Graphics | Graphics Programming | Page 10

raven orchid Aug 16, 2023, 1:56 PM

#

Is this for binding individual pages? What’s the speed for binding the entire sparse texture?

wicked notch Aug 16, 2023, 2:03 PM

#

binding as in vkQueueBindSparse?

raven orchid Aug 16, 2023, 2:08 PM

#

I think I’m wondering if sparse can be attached to a frame buffer

#

Like the entire texture region (even if partially committed)

wicked notch Aug 16, 2023, 2:09 PM

#

raven orchid I think I’m wondering if sparse can be attached to a frame buffer

I think so

#

hol up lemme read the spec

#

ye there is no mention of sparse resources being incompatible with VkFramebuffer

wicked notch Aug 16, 2023, 7:51 PM

#

I have acquired a machine capable of doing sparse

#

(for legal reasons I have to disclose that I did not, in fact, steal the machine)

frank sail Aug 16, 2023, 7:53 PM

#

Grinding GP while on vacation 🫡

wispy spear Aug 16, 2023, 7:54 PM

#

frank sail Grinding GP while on vacation 🫡

i had the very same thought

wicked notch Aug 16, 2023, 7:54 PM

#

average idea of a break of a modern human

#

I also studied for exams all evening bleakekw

frank sail Aug 16, 2023, 7:54 PM

#

Wtf

wispy spear Aug 16, 2023, 7:54 PM

#

hook up with some locals

wicked notch Aug 16, 2023, 7:54 PM

#

I do have stuff to do tonight as well

#

I thank god every day that my body seemingly has limitless energy

wispy spear Aug 16, 2023, 7:55 PM

#

https://tenor.com/view/jackie-groenen-manchester-united-women-muwomen-netherlands-dutch-gif-22369524

wicked notch Aug 16, 2023, 7:55 PM

#

I don't feel tired for whatever reason bleakekw

#

maybe I am going to die

#

I am going to die soon yes

frank sail Aug 16, 2023, 7:55 PM

#

I don't feel the physical sensation of being tired, but I notice that I become dumber

wicked notch Aug 16, 2023, 7:56 PM

#

only possible explanation

wispy spear Aug 16, 2023, 7:56 PM

#

hascheesh al jaffar al jaker... i think we will get nanite 3.0 very soon

#

running on some fujitsu t450, 2gig of ram, some pentium 4 thing

#

on a custom version of mesa

#

hmm jaffar ... reminds me of these

#

https://tenor.com/view/jaffa-cake-70s-cartoon-retro-snack-gif-14412934

wicked notch Aug 16, 2023, 8:00 PM

#

I can at least rest assured that AMD performs well with sparse

#

on RADV at least

#

pixel said amdgpu shouldn't be too different as well

#

so common amd W

wispy spear Aug 16, 2023, 8:01 PM

#

then somebody needs to convince novideo to make it right on their chips too

frank sail Aug 16, 2023, 8:02 PM

#

Yeah once ue6 is dependent on lvstri's stuff, they will

wicked notch Aug 16, 2023, 8:02 PM

#

it would be cool to have an NV rep in here

#

just so I can see them and Jaker fight

wispy spear Aug 16, 2023, 8:02 PM

#

we dont have a rep, but

#

someone joined their shop a year ago or so

wicked notch Aug 16, 2023, 8:04 PM

#

wispy spear someone joined their shop a year ago or so

call them back

#

~~use force if you need to~~

#

I have air support if needed

wispy spear Aug 16, 2023, 8:05 PM

#

that opi from arm also seems to be a cool chap

#

tom olson

raven orchid Aug 16, 2023, 8:06 PM

#

wicked notch I have acquired a machine capable of doing sparse

awesome!

#

I did some experiments with OpenGL sparse virtual shadow maps today actually

wicked notch Aug 16, 2023, 8:06 PM

#

holy pog

raven orchid Aug 16, 2023, 8:06 PM

#

they seem to mostly work when used as a framebuffer target

wicked notch Aug 16, 2023, 8:06 PM

#

that's music to my ears

#

very good, did you use the ARB ext?

raven orchid Aug 16, 2023, 8:07 PM

#

yeah used the extension

#

support seems to be pretty strong for the 1060, so hopefully newer hardware is even better

wispy spear Aug 16, 2023, 8:07 PM

#

eggcellent

raven orchid Aug 16, 2023, 8:07 PM

#

but it allows me to commit and decommit, and I'm using compute to write to some readback textures which tells the CPU what to do

frank sail Aug 16, 2023, 8:08 PM

#

raven orchid they seem to mostly work when used as a framebuffer target

Mostly?

raven orchid Aug 16, 2023, 8:08 PM

#

welll

#

performance is questionable right now lol

wicked notch Aug 16, 2023, 8:08 PM

#

can I see 🥺 👉 👈

raven orchid Aug 16, 2023, 8:08 PM

#

I might see if I can try stencil buffer rejection for the pages

frank sail Aug 16, 2023, 8:08 PM

#

If it wurks

wicked notch Aug 16, 2023, 8:08 PM

#

ye, the shrimple fact that it works is hands down incredible

raven orchid Aug 16, 2023, 8:10 PM

#

I will try to capture the framebuffer

#

nsight really really hates 16k textures

#

at least on this gpu

#

#

the black parts are uncommitted

#

as the camera moves it allocs/frees pages

wicked notch Aug 16, 2023, 8:13 PM

#

raven orchid at least on this gpu

lovely

#

wrong message but aight

raven orchid Aug 16, 2023, 8:13 PM

#

what do you think about stencil rejection for pages?

wicked notch Aug 16, 2023, 8:13 PM

#

anyways, do you need to keep track of pages yourself?

raven orchid Aug 16, 2023, 8:13 PM

#

I wonder if it would be viable

frank sail Aug 16, 2023, 8:13 PM

#

Did you read the final fantasy shadows paper

raven orchid Aug 16, 2023, 8:14 PM

#

wicked notch anyways, do you need to keep track of pages yourself?

yeah this is done with a 128x128 residency texture

#

well two of them

#

one for this frame, one to look at prev frame

raven orchid Aug 16, 2023, 8:14 PM

#

frank sail Did you read the final fantasy shadows paper

saw someone linked it but I forgot to read it lol. Did they use vsm?

wicked notch Aug 16, 2023, 8:14 PM

#

raven orchid what do you think about stencil rejection for pages?

I have no idea for stencil, Froyok is far more experienced with stencil

raven orchid Aug 16, 2023, 8:14 PM

#

hmmm maybe we need to ask Froyok

wicked notch Aug 16, 2023, 8:14 PM

#

raven orchid one for this frame, one to look at prev frame

oh so two page tables, how come

raven orchid Aug 16, 2023, 8:14 PM

#

basically it seems that on OpenGL at least

#

if you read from an unalloced page, it returns all 0

#

so what if we exploit that by setting alloced pages to some value other than 0

#

and have stencil rejection regions

frank sail Aug 16, 2023, 8:15 PM

#

raven orchid saw someone linked it but I forgot to read it lol. Did they use vsm?

No but it was pretty cool

#

Anyways combining vsm with it would be cooler

raven orchid Aug 16, 2023, 8:15 PM

#

wicked notch oh so two page tables, how come

basically it just checks "if previous frame it was allocated and current frame it is as well, do nothing"

#

or "if previous frame it was not alloced and this frame it is requested, alloc it"

wicked notch Aug 16, 2023, 8:16 PM

#

raven orchid basically it just checks "if previous frame it was allocated and current frame i...

hm, perhaps my idea of keeping only one page table isn't viable

raven orchid Aug 16, 2023, 8:16 PM

#

maybe it is

#

this current impl is kind of hacked together just for testing

wicked notch Aug 16, 2023, 8:16 PM

#

do you use morton codes btw

raven orchid Aug 16, 2023, 8:16 PM

#

frank sail Anyways combining vsm with it would be cooler

maybe we need to check it. @wicked notch did you read it?

wicked notch Aug 16, 2023, 8:17 PM

#

to store the pages

raven orchid Aug 16, 2023, 8:17 PM

#

what are those froghorror

wicked notch Aug 16, 2023, 8:17 PM

#

raven orchid maybe we need to check it. @wicked notch did you read it?

only halfway through

#

but they don't do VSM, their shadow map paper is mostly for point and area lights

frank sail Aug 16, 2023, 8:17 PM

#

raven orchid what are those froghorror

Funny numbers
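
For readers with the same question: a Morton code (Z-order index) interleaves the bits of a 2D coordinate into one integer, so pages that are close in 2D tend to sit close together in memory. A minimal sketch, assuming 16-bit page coordinates; the function names are illustrative and do not come from anyone's engine in this chat:

```cpp
#include <cstdint>

// Spread the low 16 bits of v so they occupy the even bit positions.
inline std::uint32_t part1By1(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Inverse of part1By1: gather the even bits back into the low 16 bits.
inline std::uint32_t compact1By1(std::uint32_t v) {
    v &= 0x55555555u;
    v = (v | (v >> 1)) & 0x33333333u;
    v = (v | (v >> 2)) & 0x0F0F0F0Fu;
    v = (v | (v >> 4)) & 0x00FF00FFu;
    v = (v | (v >> 8)) & 0x0000FFFFu;
    return v;
}

// x goes to the even bits, y to the odd bits.
inline std::uint32_t mortonEncode(std::uint32_t x, std::uint32_t y) {
    return part1By1(x) | (part1By1(y) << 1);
}

inline std::uint32_t mortonDecodeX(std::uint32_t code) { return compact1By1(code); }
inline std::uint32_t mortonDecodeY(std::uint32_t code) { return compact1By1(code >> 1); }
```

For a 128x128 page table the codes fit in 14 bits, so page (127, 127) encodes to 16383.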

wicked notch Aug 16, 2023, 8:18 PM

#

they have a pretty interesting idea where they make a frustum per screen tile and bin depth values, then they make a close up shadow frustum per tile, based on the binned depth values

#

but I dunno how it actually works

raven orchid Aug 16, 2023, 8:18 PM

#

interesting

#

I definitely don't use those

#

the setup is

wicked notch Aug 16, 2023, 8:19 PM

#

how do you access pages then thonk

raven orchid Aug 16, 2023, 8:19 PM

#

one 16Kx16K sparse texture (hardware supported)
two 128x128 residency tables (regular textures)
a readback SSBO that the compute shader writes the changelist to

#

all I did was attach the texture to a framebuffer lol

wicked notch Aug 16, 2023, 8:20 PM

#

you use textures for page tables? froghorror

raven orchid Aug 16, 2023, 8:20 PM

#

yeah

#

just 2D int textures so I can do atomic ops

wicked notch Aug 16, 2023, 8:21 PM

#

huh

#

you do any mip mapping by chance?

raven orchid Aug 16, 2023, 8:21 PM

#

no everything is on one level for now

wicked notch Aug 16, 2023, 8:22 PM

#

I see, very nice

#

how do you index into the page tables?

#

just uv / granularity?

raven orchid Aug 16, 2023, 8:33 PM

#

yeah for the analysis part I just take the depth, convert to world space, then convert to shadow uvs and index the table that way
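
The last step of that analysis, going from a shadow UV to a page table entry, can be sketched on the CPU side roughly like this. It assumes the 128x128 table mentioned above and a row-major layout; `uvToPage` and `pageToIndex` are hypothetical names (needs C++17 for `std::clamp`):

```cpp
#include <algorithm>

struct PageCoord { int x; int y; };

constexpr int kPagesPerAxis = 128;  // 128x128 residency table (from the chat)

// Map a shadow-map UV in [0, 1] to the page that contains it.
inline PageCoord uvToPage(float u, float v) {
    int x = static_cast<int>(u * kPagesPerAxis);
    int y = static_cast<int>(v * kPagesPerAxis);
    // UV == 1.0 would land one past the last page, so clamp to the valid range.
    x = std::clamp(x, 0, kPagesPerAxis - 1);
    y = std::clamp(y, 0, kPagesPerAxis - 1);
    return {x, y};
}

// Row-major index into a flat kPagesPerAxis * kPagesPerAxis table (layout is an assumption).
inline int pageToIndex(PageCoord p) { return p.y * kPagesPerAxis + p.x; }
```

With a 16K texture and 128 pages per axis, each page covers 128x128 texels, which is the "uv / granularity" idea in integer form.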

#

I might give the stencil thing a try later

wicked notch Aug 16, 2023, 8:37 PM

#

If you wanted to do mips

#

I suppose you'd also mip map the page table right?

raven orchid Aug 16, 2023, 8:37 PM

#

hmm

#

yeah I think you're right

#

we'd probably need a page table for each mip level right?

wicked notch Aug 16, 2023, 8:38 PM

#

yeah definitely
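
A rough sketch of what "a page table per mip level" could look like in numbers. It assumes the page size in texels stays the same at every level, so each coarser level needs half as many pages per axis; all names are hypothetical (needs C++14 for the `constexpr` loops):

```cpp
constexpr int kBasePagesPerAxis = 128;  // mip 0 table is 128x128 (from the chat)

// Pages per axis at a given mip level: halves each level, never below 1.
// Assumes mip is small and non-negative (well under 31).
constexpr int pagesPerAxisAtMip(int mip) {
    const int n = kBasePagesPerAxis >> mip;
    return n > 0 ? n : 1;
}

// Number of levels until the table shrinks to a single page.
constexpr int pageTableMipCount() {
    int levels = 1;
    for (int n = kBasePagesPerAxis; n > 1; n >>= 1) ++levels;
    return levels;
}

// Total entries across all per-mip page tables.
constexpr int totalPageTableEntries() {
    int total = 0;
    for (int mip = 0; mip < pageTableMipCount(); ++mip) {
        const int n = pagesPerAxisAtMip(mip);
        total += n * n;
    }
    return total;
}
```

Under these assumptions the full chain adds only about a third on top of the base table, the same ratio as a regular mip chain.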

#

making yet another branch..

#

btw how do your VSM look

#

let me see them sharp shadows

raven orchid Aug 16, 2023, 8:48 PM

#

#

some sponza shadows

raven orchid Aug 16, 2023, 8:48 PM

#

wicked notch making yet another branch..

same this has been my experimental shadow branch

wicked notch Aug 16, 2023, 8:48 PM

#

dayum

#

they do be sharp

raven orchid Aug 16, 2023, 8:48 PM

#

not sure if it'll ever be ready to merge in

wicked notch Aug 16, 2023, 8:50 PM

#

oh btw

#

do you ever evict

raven orchid Aug 16, 2023, 8:51 PM

#

like evict a page from memory?

wicked notch Aug 16, 2023, 8:51 PM

#

yeah

raven orchid Aug 16, 2023, 8:51 PM

#

yeah right now I just have it set to

#

if a page was around last frame but not needed this frame, evict immediately

#

(not caching anything atm lol)
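
The rules described so far (keep if resident in both frames, allocate if newly requested, evict if no longer requested) amount to diffing the two residency tables. A minimal CPU-side sketch, assuming each table is flattened to one byte per page; the chat's version runs in a compute shader and writes a changelist to an SSBO, so this is only an illustration of the logic:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct PageChanges {
    std::vector<std::size_t> toCommit;    // not resident last frame, requested now
    std::vector<std::size_t> toDecommit;  // resident last frame, not requested now
};

// prev and curr hold one entry per page: 0 = not requested, nonzero = requested.
inline PageChanges diffResidency(const std::vector<std::uint8_t>& prev,
                                 const std::vector<std::uint8_t>& curr) {
    PageChanges changes;
    const std::size_t count = prev.size() < curr.size() ? prev.size() : curr.size();
    for (std::size_t i = 0; i < count; ++i) {
        const bool was = prev[i] != 0;
        const bool now = curr[i] != 0;
        if (!was && now) {
            changes.toCommit.push_back(i);
        } else if (was && !now) {
            changes.toDecommit.push_back(i);  // immediate eviction, no caching
        }
        // was && now: already resident, nothing to do.
    }
    return changes;
}
```

A caching scheme would replace the immediate `toDecommit` push with some delay or budget check.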

wicked notch Aug 16, 2023, 8:51 PM

#

huh

#

how's 🅱️erf again?

raven orchid Aug 16, 2023, 8:51 PM

#

not good

#

like it's definitely doing too much work each frame

wicked notch Aug 16, 2023, 8:52 PM

#

wait, are you bound because of raster?

#

or because of memory management

raven orchid Aug 16, 2023, 8:53 PM

#

it seems to be raster

#

I think it could be a lot better since it wastes time with non resident pages

#

which is kind of why I'm wondering if some kind of stencil method to reject invalid pages would help

#

my guess is page rejection will go a long way with performance

#

but to truly make it viable for older hardware caching would need to be really good

wicked notch Aug 16, 2023, 8:59 PM

#

hmm I'm wondering what is the probability that NV's driver is actually doing some deferred eviction kind of thing

raven orchid Aug 16, 2023, 9:00 PM

#

Yeah it’s probably pretty high

#

Hmmm good point though, better eviction strategy is definitely needed

wicked notch Aug 16, 2023, 9:00 PM

#

my idea for Vulkan was not evicting, ever

#

instead I keep track of pages internally and just overwrite stuff where things change

#

because the idea is that you allocate big chunks of memory and then you suballocate them to sparse pages

raven orchid Aug 16, 2023, 9:01 PM

#

That would work

wicked notch Aug 16, 2023, 9:01 PM

#

eviction in this case would mean deallocating the big chunk of memory

raven orchid Aug 16, 2023, 9:02 PM

#

Do you plan to just have 1 big memory pool?

#

Or multiple?

wicked notch Aug 16, 2023, 9:02 PM

#

A few

#

well I'm thinking to make the pool size 1GiB

#

or something crazy high like that

#

it's something to experiment with
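
One way to picture "allocate big chunks, then suballocate them to sparse pages" is a fixed-size slot allocator over a single pool. This is a sketch under assumptions: the 64 KiB page size in the usage note is a placeholder (the real sparse block size is implementation-dependent and has to be queried), and `PagePool` is a hypothetical class, not the allocator being written in this thread (needs C++17 for `std::optional`):

```cpp
#include <cstdint>
#include <optional>
#include <vector>

class PagePool {
public:
    PagePool(std::uint64_t poolBytes, std::uint64_t pageBytes)
        : pageBytes_(pageBytes) {
        const std::uint64_t slots = poolBytes / pageBytes;
        freeSlots_.reserve(slots);
        // Push in reverse so slot 0 is handed out first.
        for (std::uint64_t i = slots; i > 0; --i) freeSlots_.push_back(i - 1);
    }

    // Returns the byte offset of a free page inside the pool, or nothing if full.
    std::optional<std::uint64_t> allocate() {
        if (freeSlots_.empty()) return std::nullopt;
        const std::uint64_t slot = freeSlots_.back();
        freeSlots_.pop_back();
        return slot * pageBytes_;
    }

    // Hands a page back; the caller must pass an offset obtained from allocate().
    void release(std::uint64_t offset) { freeSlots_.push_back(offset / pageBytes_); }

    std::uint64_t freePages() const { return freeSlots_.size(); }

private:
    std::uint64_t pageBytes_;
    std::vector<std::uint64_t> freeSlots_;
};
```

With a 1 GiB pool and 64 KiB pages this gives 16384 slots. In Vulkan the returned offset would presumably end up as the `memoryOffset` of a sparse bind; "never evicting" then just means slots get reused while the pool itself stays allocated.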

raven orchid Aug 16, 2023, 9:03 PM

#

One really nice thing you could exploit is caching if you go that route now that I think about it

#

Since you never remove pages

wicked notch Aug 16, 2023, 9:03 PM

#

eh

#

yes

#

but that means more bookkeeping

raven orchid Aug 16, 2023, 9:03 PM

#

True

wicked notch Aug 16, 2023, 9:04 PM

#

also can you trust the GPU to not invalidate memory froghorror

finite quartz Aug 16, 2023, 9:04 PM

#

raven orchid hmmm maybe we need to ask Froyok

What's the question ? 😄

raven orchid Aug 16, 2023, 9:08 PM

#

finite quartz What's the question ? 😄

Oh yeah I think maybe it boils down to how much performance can be gained with stencil rejection. Is it similar to increases we can see with early depth rejection?

finite quartz Aug 16, 2023, 9:14 PM

#

raven orchid Oh yeah I think maybe it boils down to how much performance can be gained with s...

If the cost is on the fragment shader side (and not the raster itself), then I think so. 🤔

wicked notch Aug 16, 2023, 9:19 PM

#

I guess it's raster only

#

so maybe some other sort of culling is necessary

#

mesh shaders would be so good here

#

you can just use a task shader to cull invalid pages and dispatch only the necessary meshlets

raven orchid Aug 16, 2023, 9:22 PM

#

Yeah think this one is raster bound mainly, dang

raven orchid Aug 16, 2023, 9:22 PM

#

wicked notch you can just use a task shader to cull invalid pages and dispatch only the neces...

Oh wow

#

That’s massive actually

#

nogl

wicked notch Aug 16, 2023, 9:22 PM

#

The alternative is work expansion in compute shaders

#

which should work just as well, it's just painful to implement bleakekw

#

the core idea is the same

#

check if the page is invalid, check whether any meshlets overlap that page, if both are true cull meshlets for that page

raven orchid Aug 16, 2023, 9:23 PM

#

Yeah true that would work

#

I saw a paper a while back too where they estimate if a shadow will intersect the frustum

wicked notch Aug 16, 2023, 9:26 PM

#

the only painful thing to check for is whether said meshlet overlaps many pages
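
A sketch of that multi-page check, written as plain C++ rather than a task or compute shader. It assumes each meshlet already has a bounding rectangle in shadow UV space and that residency is a flat, row-major byte table; a meshlet is culled only when none of the pages it overlaps is resident. `UvRect` and `canCullMeshlet` are hypothetical names (needs C++17 for `std::clamp`):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct UvRect { float minU, minV, maxU, maxV; };  // meshlet bounds in shadow UV space

// Returns true when no page touched by the rect is resident, so the meshlet
// cannot contribute to any committed page and may be skipped.
inline bool canCullMeshlet(const UvRect& r,
                           const std::vector<std::uint8_t>& resident,
                           int pagesPerAxis) {
    auto toPage = [pagesPerAxis](float uv) {
        return std::clamp(static_cast<int>(uv * pagesPerAxis), 0, pagesPerAxis - 1);
    };
    const int x0 = toPage(r.minU), x1 = toPage(r.maxU);
    const int y0 = toPage(r.minV), y1 = toPage(r.maxV);
    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            if (resident[static_cast<std::size_t>(y) * pagesPerAxis + x] != 0) {
                return false;  // at least one overlapped page is resident: keep it
            }
        }
    }
    return true;
}
```

Bounds outside [0, 1] are clamped onto the border pages, which is conservative: such a meshlet may be kept when it could have been culled, but never the other way round. Large meshlets make the loop expensive, which is where a hierarchical (HZB-style) page structure would come in.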

#

iirc the nanite talk went over that

#

they also do HZB wtf

#

HZB except for memory pages froghorror

#

so much indirection

#

I lost count

raven orchid Aug 16, 2023, 9:53 PM

#

Hmmm yeah that’s also true

#

Though I do wonder how much better your perf will be since you’re using nanite 2

#

I’m testing just with regular meshes

wicked notch Aug 16, 2023, 9:59 PM

#

probably a bit better

#

the reason it works so well is due to the finer grained nature of meshlets

#

it could work just as well with regular meshes to be fair

#

just perhaps less of an impact in 🅱️erf

wicked notch Aug 16, 2023, 10:30 PM

#

it is time for me to attend a party

#

I shall ponder VSM there too

wispy spear Aug 16, 2023, 10:30 PM

#

: )

wicked notch Aug 16, 2023, 10:30 PM

#

if I'll be still lucid by then

wispy spear Aug 16, 2023, 10:30 PM

#

have some fun irl

#

nanite 2.0 can wait

wicked notch Aug 16, 2023, 10:30 PM

#

ye it's unlikely that I will ponder them bleakekw

wispy spear Aug 16, 2023, 10:31 PM

#

we need a cooler name for it too btw 🙂

wicked notch Aug 16, 2023, 10:31 PM

#

I'll leave that to my aide JStephano

wispy spear Aug 16, 2023, 10:31 PM

#

~~dropped jaker like a hot potato~~

wicked notch Aug 16, 2023, 10:31 PM

#

if Jaker is my right hand man, JStephano is my left hand man

wicked notch Aug 16, 2023, 10:31 PM

#

wispy spear ~~dropped jaker like a hot potato~~

I have two hands for a reason

frank sail Aug 16, 2023, 10:43 PM

#

wicked notch I shall ponder VSM there too

Tell everyone there about it

#

They'll be impressed

wicked notch Aug 17, 2023, 5:59 PM

#

finally

#

the thing that will destroy darian

#

not even I can load this

#

it crashes while trying to allocate 20GB of VRAM (totally reasonable and valid amount of memory)

frank sail Aug 17, 2023, 6:00 PM

#

yeah you need an AMD GPU to load that

wispy spear Aug 17, 2023, 6:00 PM

#

one of them workstation ones with 48 jigs?

#

why one file btw 🙂 why not split it into chunks?

#

"somehow"

twin bough Aug 17, 2023, 6:01 PM

#

omg

wicked notch Aug 17, 2023, 6:01 PM

#

I could

#

in fact I will

twin bough Aug 17, 2023, 6:01 PM

#

render it plz

#

i wanna see xd

wicked notch Aug 17, 2023, 6:01 PM

#

I'll send it to darian soon

#

I have a gigabit internet here, luckily the netherlands are a civilized place

twin bough Aug 17, 2023, 6:02 PM

#

seems like the future of game sizes will be bleak.. 16gb of just geometry seems...

wicked notch Aug 17, 2023, 6:02 PM

#

fuck I don't even have enough space on my drive

finite quartz Aug 17, 2023, 6:02 PM

#

frank sail yeah you need an AMD GPU to load that

🙋‍♀️

wispy spear Aug 17, 2023, 6:02 PM

#

wait until jakers vkspec llm turns hostile and generates mesh out of text

wicked notch Aug 17, 2023, 6:03 PM

#

I envy my friend so much right now, his workstation exported 30GB of FBX and converted it into 20GB of glTF, all in 2 hours

wispy spear Aug 17, 2023, 6:04 PM

#

some threadrepperino?

wicked notch Aug 17, 2023, 6:04 PM

#

ye

#

one you can't actually buy

#

a 5995WX or something

#

whatever it's called bleakekw

wispy spear Aug 17, 2023, 6:04 PM

#

ah the 64c one

#

a potential 96c one is coming soon 😉

distant lodge Aug 17, 2023, 6:17 PM

#

is that its operating temperature

wicked notch Aug 17, 2023, 6:27 PM

#

no that's the idle one

#

operating temperature is somewhere between the surface of the sun and the core of super star WR102f

wispy spear Aug 17, 2023, 6:32 PM

#

wolf rayet stars are cool shit

wicked notch Aug 17, 2023, 10:42 PM

#

alright then

#

I guess I should get started

#

may the god of sparse watch over me

wispy spear Aug 17, 2023, 10:48 PM

#

daigo_eye

wicked notch Aug 18, 2023, 4:06 PM

#

ahh yes, don't you love it when a single layered/mipmapped texture takes up 32GiB of VRAM

raven orchid Aug 18, 2023, 4:21 PM

#

Oh wow wtf. Just from one 16k mipmapped texture?

wicked notch Aug 18, 2023, 4:23 PM

#

I also layered it 32 times over bleakekw
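
The number is easy to sanity-check. The texel format is not stated, so the sketch below assumes 4 bytes per texel (for example a 32-bit depth or float format); the helper name is made up (needs C++14 for the `constexpr` loop):

```cpp
#include <cstdint>

// Bytes for a square texture array, optionally with a full mip chain.
constexpr std::uint64_t textureBytes(std::uint64_t size, std::uint64_t layers,
                                     std::uint64_t bytesPerTexel, bool mipmapped) {
    std::uint64_t texels = 0;
    for (std::uint64_t s = size; s >= 1; s /= 2) {
        texels += s * s;        // one mip level
        if (!mipmapped) break;  // base level only
    }
    return texels * layers * bytesPerTexel;
}
```

Under that assumption one 16384x16384 layer is 1 GiB at the base level, so 32 layers give exactly 32 GiB, and a full mip chain would push it to roughly 42.7 GiB. Either way, fully committing it is out of reach, which is the point of going sparse.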

glass sphinx Aug 18, 2023, 6:27 PM

#

bleakekw

wicked notch Aug 18, 2023, 10:25 PM

#

we found the random blog that has fake sparse: http://gpupro.blogspot.com/

#

(MJP did, god bless him, literal MVP)

frank sail Aug 18, 2023, 10:28 PM

#

all this talk about vsm makes me want to make my own vsm thingy 😄

wicked notch Aug 18, 2023, 10:30 PM

#

do it

#

it's the shadow technique to end all shadow techniques after all

glass sphinx Aug 18, 2023, 10:31 PM

#

wispy spear daigo_eye daigo_eye

berndt das brot

frank sail Aug 18, 2023, 10:37 PM

#

wicked notch do it

I don't wanna do it if you want to implement it in FF though

#

Since I have plenty of other things to look at

wicked notch Aug 18, 2023, 10:39 PM

#

I'm not sure

#

on one side GL's sparse is much easier

#

on another side I know it's gonna perform very bad so fake sparse will be necessary bleakekw

frank sail Aug 18, 2023, 10:40 PM

#

The impl I was thinking of wouldn't use real sparse anyway

wicked notch Aug 18, 2023, 10:40 PM

#

I'm looking at SVT's slides right now

#

and I guess I get it

#

but the UV remapping to the real texture is weird

#

the hell does this mean bleakekw

#

paging is offset galore truly

frank sail Aug 18, 2023, 10:43 PM

#

tl;dr indirection

wicked notch Aug 18, 2023, 10:44 PM

#

????

#

I like how phys_tc doesn't exist

frank sail Aug 18, 2023, 10:51 PM

#

I feel like reading code will just increase confusion at this point

#

The whole thing just seems like indirection and basic arithmetic

wicked notch Aug 18, 2023, 11:08 PM

#

ok yes it's probably less clamplicated than I expected

#

I still fail to see how to translate virtual uv to physical uv though

#

ok suppose I wanna make a physical texture that will have at most 4096 128x128 pages

#

so basically a 8192^2 R32_SFLOAT texture

#

then I take a regular 2k texture and stuff its pages in the physical texture

#

now suppose I want to get the texel at uv (1, 1)

#

This will translate to page indexed (127, 127) I think?

#

which will again translate to somewhere in the physical texture, suppose the page also gives us the uv offset into the physical texture
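
Working the numbers from this example may help. An 8192^2 physical texture with 128x128 pages holds 64x64 = 4096 pages, and a 2k virtual texture splits into 16x16 pages, so uv (1, 1) lands in page (15, 15) at in-page texel (127, 127) rather than in page (127, 127). A small sketch of that decomposition, with hypothetical names and UV 1.0 clamped to the last texel (needs C++17 for `std::clamp`):

```cpp
#include <algorithm>

struct VirtualTexel {
    int pageX, pageY;    // which page of the virtual texture
    int texelX, texelY;  // texel inside that page
};

constexpr int pagesPerAxis(int textureSize, int pageSize) { return textureSize / pageSize; }

// Split a UV on a square virtual texture into a page coordinate and an in-page texel.
inline VirtualTexel locate(float u, float v, int textureSize, int pageSize) {
    auto toTexel = [textureSize](float uv) {
        return std::clamp(static_cast<int>(uv * textureSize), 0, textureSize - 1);
    };
    const int tx = toTexel(u);
    const int ty = toTexel(v);
    return {tx / pageSize, ty / pageSize, tx % pageSize, ty % pageSize};
}
```

The page coordinate is what indexes the page table; the in-page texel is the part that has to be re-applied after the table returns the physical page location.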

frank sail Aug 18, 2023, 11:16 PM

#

wicked notch ????

I guess they're using one big texture2D for the backing memory

wicked notch Aug 18, 2023, 11:16 PM

#

yes

#

My plan is to use multiple backing textures

frank sail Aug 18, 2023, 11:17 PM

#

why

wicked notch Aug 18, 2023, 11:18 PM

#

how do I decide how big to make the backing storage

#

isn't it better to have many smol ones

frank sail Aug 18, 2023, 11:18 PM

#

Compile time constant smart

#

But yeah that's fair

wicked notch Aug 18, 2023, 11:18 PM

#

ah yes

frank sail Aug 18, 2023, 11:19 PM

#

If you have a strict budget that you can decide at init time, then surely you can get by with one big boy texture

wicked notch Aug 18, 2023, 11:19 PM

#

anywho

#

we got the correct offset into the physical page that has the sample we're looking for

#

hooray!

#

except it's just an offset, specifically it's the lower left corner of the page, we need to offset the offset by something

frank sail Aug 18, 2023, 11:29 PM

#

um here's what I was imagining

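// Sketch: translate a virtual shadow UV into a fetch from the page array.
vec2 virtual_uv = WorldPosToShadowUv();
// Which entry of the page table (one texel per virtual page) covers this UV.
ivec2 page_texel = ivec2(pagesDimensions * virtual_uv);
// Position inside that page, in [0, 1).
vec2 physical_uv = fract(pagesDimensions * virtual_uv);
// texelFetch on a 2D sampler needs an explicit mip level (0 here).
int physical_index = texelFetch(s_virtualToPhysical, page_texel, 0).x;

float shadow_depth = textureLod(s_shadowArray, vec3(physical_uv, physical_index), 0.0).x;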

#

I made the backing texture an array of pages to spice things up

wicked notch Aug 18, 2023, 11:31 PM

#

hmm

#

wouldn't it be phys_uv = fract(backingMemorySize * virtual_uv)

frank sail Aug 18, 2023, 11:34 PM

#

wicked notch Aug 18, 2023, 11:34 PM

#

are 0, 1 and 2 the backing textures

frank sail Aug 18, 2023, 11:34 PM

#

these are pages

#

we don't know or care where they (physically) are, nor their size

#

the virtual2physical texture will tell us where they are physically located

#

and I guess what you do with size depends on how you handle the backing memory

#

maybe

#

using an array for the backing memory seems the most logical here, since it doesn't give any false impression that virtual pages map directly to physical ones

wicked notch Aug 18, 2023, 11:42 PM

#

hold on

#

I must draw as well

frank sail Aug 18, 2023, 11:51 PM

#

this picture is gonna be the hottest drop of 2023

wicked notch Aug 18, 2023, 11:52 PM

#

here it comes

#

#

alright so lemme explain this shit

#

the first texture, marked with zero is the shadow map

#

let's say we are good boys and it's only 2k

#

we wanna sample in a particular texel

#

alright no prob, we go onto texture 1, the page table

#

the page table tells us the uv of the lower left corner of the physical page in the backing texture

#

very nice, let's go onto texture number 2, the humongous backing texture

#

now here's a problem, we know the page we want to sample from, but the actual texel info is lost, we cannot use the original offset because the texture sizes mismatch

#

and we don't even have an offset to begin with, since we only know the virtual uvs

#

how do we deal with translation in this case

frank sail Aug 18, 2023, 11:58 PM

#

ok I realized an error in my code

#

virtualShadowMapSize * virtual_uv
should be
pagesDimensions * virtual_uv

frank sail Aug 19, 2023, 12:01 AM

#

wicked notch now here's a problem, we know the page we want to sample from, but the actual te...

you need to get the UV into the page in the first step

wicked notch Aug 19, 2023, 12:03 AM

#

ok so we get the uv of the lower left corner of the page we are sampling in the virtual texture

#

then we subtract = offset into the page

#

right?

frank sail Aug 19, 2023, 12:04 AM

#

er

#

maybe I mean the second step

#

actually lemme think bleakekw

wicked notch Aug 19, 2023, 12:04 AM

#

pondering together

#

let's link our brains

frank sail Aug 19, 2023, 12:04 AM

#

ok so I guess you can ignore the actual size of the virtual texture here (when sampling)

#

you just need to know the UV and how many pages there are on each axis

#

I'll draw another pic

wicked notch Aug 19, 2023, 12:14 AM

#

fake sparse is painful

#

offsets are spooky

frank sail Aug 19, 2023, 12:14 AM

#

#

surely knowing the dims of all the textures involved is sufficient

wicked notch Aug 19, 2023, 12:15 AM

#

ok what's the actual uv then

#

the part we're missing is missing bleakekw

frank sail Aug 19, 2023, 12:15 AM

#

lol I didn't want to calculate it since it would be a yucky number

#

but we know where the physical page is and the offset into it, so it would not be hard

wicked notch Aug 19, 2023, 12:16 AM

#

it should be lower left corner + (0.25, 0.25) * ????

frank sail Aug 19, 2023, 12:17 AM

#

1/3 is your last number innit

wicked notch Aug 19, 2023, 12:17 AM

#

1 / num pages ye

#

we solved it

#

incredible
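
Written out, the result is: physical UV = page corner + in-page fraction * (1 / physical pages per axis). A minimal sketch of that arithmetic for a single square backing texture, assuming the page table lookup has already produced the corner UV; `Vec2` and `virtualToPhysicalUv` are illustrative names:

```cpp
#include <cmath>

struct Vec2 { float x, y; };

// pageCorner: UV of the physical page's lower-left corner, as read from the page table.
inline Vec2 virtualToPhysicalUv(Vec2 virtualUv, int virtualPagesPerAxis,
                                Vec2 pageCorner, int physicalPagesPerAxis) {
    // Fractional position inside the virtual page, in [0, 1).
    const float sx = virtualUv.x * static_cast<float>(virtualPagesPerAxis);
    const float sy = virtualUv.y * static_cast<float>(virtualPagesPerAxis);
    const Vec2 inPage{sx - std::floor(sx), sy - std::floor(sy)};
    // One physical page spans 1 / physicalPagesPerAxis of the backing texture.
    const float scale = 1.0f / static_cast<float>(physicalPagesPerAxis);
    return {pageCorner.x + inPage.x * scale, pageCorner.y + inPage.y * scale};
}
```

With an array texture as backing storage the scale and add disappear: the in-page fraction is the UV and the page table only has to return a layer index.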

frank sail Aug 19, 2023, 12:17 AM

#

how did we solve elementary skool bleakekw

wicked notch Aug 19, 2023, 12:18 AM

#

listen

frank sail Aug 19, 2023, 12:18 AM

#

I mean pass bleakekw

wicked notch Aug 19, 2023, 12:18 AM

#

we are kindergarten students

#

it's ok

frank sail Aug 19, 2023, 12:18 AM

#

frank sail

I like the random red herring of specifying that the whole VSM is 256x256 and then never mentioning page sizes again

raven orchid Aug 19, 2023, 12:19 AM

#

This is awesome lol

wicked notch Aug 19, 2023, 12:20 AM

#

ok jaker

frank sail Aug 19, 2023, 12:20 AM

#

wicked notch it should be `lower left corner + (0.25, 0.25) * ????`

anyways yeah it would be simpler to use an array texture because you wouldn't have to do this incredibly difficult + and *

raven orchid Aug 19, 2023, 12:20 AM

#

Full blown virtual texturing next I guess

wicked notch Aug 19, 2023, 12:20 AM

#

now solve linear mip transitions

frank sail Aug 19, 2023, 12:20 AM

#

erm

#

I mean

wicked notch Aug 19, 2023, 12:20 AM

#

frank sail anyways yeah it would be simpler to use an array texture because you wouldn't ha...

ikr, literally the hardest operations

frank sail Aug 19, 2023, 12:20 AM

#

wicked notch now solve linear mip transitions

once you know how to take one sample, taking more isn't difficult

wicked notch Aug 19, 2023, 12:21 AM

#

is there a spec on how shrimplers work

frank sail Aug 19, 2023, 12:21 AM

#

you can use hw filtering for intra-page filtering, then handle edge cases yourself as mjp mentioned

wicked notch Aug 19, 2023, 12:21 AM

#

like linear filtering in Vk or GL

frank sail Aug 19, 2023, 12:21 AM

#

wicked notch is there a spec on how shrimplers work

yeah, the api specs

wicked notch Aug 19, 2023, 12:21 AM

#

good

#

I thought it was magic implementation defined black boxes

#

like aniso bleakekw

frank sail Aug 19, 2023, 12:22 AM

#

Only aniso

wicked notch Aug 19, 2023, 12:22 AM

#

reinventing samplers one step at a time

#

only in GP

frank sail Aug 19, 2023, 12:22 AM

#

mipmap filtering (trilinear) is actually quite shrimple

#

you just sample each mip, maybe filter the per-mip samples, then blend them together with fract(lod)
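
That blend can be sketched as follows. `sampleMip` is a stand-in for whatever fetches (and optionally bilinearly filters) one mip level, for example a lookup through the page table, so it is an assumption rather than a real API:

```cpp
#include <cmath>
#include <functional>

// Blend the two mip levels around lod with fract(lod) as the weight.
inline float sampleTrilinear(const std::function<float(int)>& sampleMip,
                             float lod, int maxLevel) {
    lod = std::fmin(std::fmax(lod, 0.0f), static_cast<float>(maxLevel));
    const int lo = static_cast<int>(std::floor(lod));
    const int hi = lo < maxLevel ? lo + 1 : lo;     // clamp at the last level
    const float t = lod - static_cast<float>(lo);   // fract(lod)
    return sampleMip(lo) * (1.0f - t) + sampleMip(hi) * t;  // lerp
}
```

For a virtual texture the catch is that both levels must have their pages resident, otherwise the coarser one has to stand in for both.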

wicked notch Aug 19, 2023, 12:24 AM

#

good ol lerp

raven orchid Aug 19, 2023, 12:26 AM

#

wicked notch

So I guess with this one the 2048^2 represents our “true” shadow map which covers the entire scene

wicked notch Aug 19, 2023, 12:26 AM

#

wicked notch we solved it

alright @wispy spear pin this utterly deranged rambling when possible

wicked notch Aug 19, 2023, 12:26 AM

#

raven orchid So I guess with this one the 2048^2 represents our “true” shadow map which cover...

ye

#

it should've been 256^2 because then I would've actually been able to understand what was going on bleakekw

#

the only thing left to do is the allocator I guess

#

but that's a problem for tomorrow

#

tomorrow's me will surely be far more intelligent and not dumb at all, for sure

#

thank god the allocator is the same both for fake and real sparse

frank sail Aug 19, 2023, 12:39 AM

#

Real fake allocators

wicked notch Aug 19, 2023, 12:43 AM

#

how do I name this allocator

#

sparse page allocator?

#

real fake sparse page allocator bleakekw

frank sail Aug 19, 2023, 12:45 AM

#

William

wicked notch Aug 19, 2023, 12:49 AM

#

william it is

wispy spear Aug 19, 2023, 8:15 AM

#

wicked notch let's link our brains

#

Horse Conch

#

might have been a better name 😛

wicked notch Aug 19, 2023, 9:00 AM

#

Damn, I wake up to another, 2 hours long, uber detailed explanation

wispy spear Aug 19, 2023, 9:01 AM

#

hey thats devsh

frank sail Aug 19, 2023, 9:05 AM

#

wicked notch Damn, I wake up to another, [2 hours long](<https://youtu.be/WaxByO69BiI>), uber...

#

so, what heuristics are used to determine if a page is resident? do we just analyze the gbuffer each frame?

#

I guess that is easy enough

wicked notch Aug 19, 2023, 9:10 AM

#

just depth

frank sail Aug 19, 2023, 9:10 AM

#

ye

#

big brain: don't ~~mark~~ request it resident if albedo is 0 since the shadow won't be visible anyways

wicked notch Aug 19, 2023, 9:11 AM

#

step 1. unproject_depth()
step 2. to_page_index()
step 3. make_resident(page_index)
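
Step 1 is the only part with real math in it. A sketch of the unprojection, assuming OpenGL conventions (depth in [0, 1] mapping to NDC z in [-1, 1]) and a column-major inverse view-projection matrix supplied by the caller; the type and function names here are illustrative:

```cpp
#include <array>

using Mat4 = std::array<float, 16>;  // column-major, like OpenGL
struct Vec3 { float x, y, z; };

// Reconstruct a world-space position from screen UV and a depth-buffer value.
inline Vec3 unprojectDepth(float u, float v, float depth, const Mat4& invViewProj) {
    // Screen UV and depth to normalized device coordinates.
    const float ndc[4] = {u * 2.0f - 1.0f, v * 2.0f - 1.0f, depth * 2.0f - 1.0f, 1.0f};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            out[row] += invViewProj[col * 4 + row] * ndc[col];
        }
    }
    // Perspective divide brings the result back to world space.
    return {out[0] / out[3], out[1] / out[3], out[2] / out[3]};
}
```

With Vulkan's default [0, 1] depth range the z remap would be dropped. Steps 2 and 3 then transform the world position by the light's view-projection to get a shadow UV, scale it to a page coordinate, and flag that entry in the residency table.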

wicked notch Aug 19, 2023, 9:11 AM

#

frank sail big brain: don't ~~mark~~ request it resident if albedo is 0 since the shadow wo...

ah yes

frank sail Aug 19, 2023, 9:11 AM

#

so the only hard part about this is the CPU readback crap

#

unless

#

allocating/freeing pages via a compute shader (only viable with fake sparse of course)

wicked notch Aug 19, 2023, 9:12 AM

#

how the hecc are you supposed to allocate pages with compute

#

you need to call in API functions no?

frank sail Aug 19, 2023, 9:12 AM

#

atomics or somethin

#

oh I mean if you have fixed backing memory already

wicked notch Aug 19, 2023, 9:12 AM

#

hm then yes

frank sail Aug 19, 2023, 9:13 AM

#

not having to do cpu readback is compelling ngl

wicked notch Aug 19, 2023, 9:13 AM

#

but it's necessary once you run out of backing memory

#

otherwise it isn't very virtual at all bleakekw

frank sail Aug 19, 2023, 9:14 AM

#

the virtual part is remapping pages

#

and possibly pretending you have more memory than in reality

#

you can make this work with clever use of clipmaps and fixed storage I'm sure

wicked notch Aug 19, 2023, 9:15 AM

#

perchance

frank sail Aug 19, 2023, 9:16 AM

#

for instance, a worst-case where you can see the whole scene can just allocate pages for a coarser map instead of filling the entire higher-detail map

#

I guess determining which level of the clipmap to use is when the heuristics get hard

#

maybe not

#

you can just prioritize coarser clipmap levels when you need to evict

#

assuming they overlap like this

frank sail Aug 19, 2023, 9:21 AM

#

wicked notch but it's necessary once you run out of backing memory

the thing is, there's always a budget no matter what you're doing

#

with real sparse, you don't need to max out your budget immediately

#

well I guess with non-sparse you can just create separate textures that act as your memory allocations, so idk

#

but yeah I think figuring out how to handle eviction in the worst cases will be the hardest part of this

wispy spear Aug 19, 2023, 9:28 AM

#

could you use the other 10bits for that perhaps?

wicked notch Aug 19, 2023, 9:28 AM

#

the way I was thinking about memory is the same for fake and real sparse

#

with real sparse I would've allocated 256MiB VkDeviceMemory every time

#

with fake sparse I allocate 256MiB physical textures

frank sail Aug 19, 2023, 9:29 AM

#

wispy spear could you use the other 10bits for that perhaps?

ah the 10 bits thing was related to how the hardware handled rasterization internally. I wasn't clear about that

wicked notch Aug 19, 2023, 9:29 AM

#

but ye eviction will be very bad to handle

#

maybe if a block of pages hasn't been touched in X frames, evict it

#

depending also maybe on the remaining budgets, how many pages are currently active, etc

frank sail Aug 19, 2023, 9:31 AM

#

I think you should only evict if needed

#

that way you can avoid writes if nothing changed

wicked notch Aug 19, 2023, 9:31 AM

#

ye, if the remaining budget is: "a lot", then no eviction

wispy spear Aug 19, 2023, 9:31 AM

#

perhaps you could just play with different strategies

#

and see which is less weird

frank sail Aug 19, 2023, 9:32 AM

#

I reckon there needs to be an atomic queue (or multiple) for pages

wicked notch Aug 19, 2023, 9:33 AM

#

hm?

frank sail Aug 19, 2023, 9:33 AM

#

when you allocate a page, push to the queue. when you need to dealloc, pop the oldest thing

wicked notch Aug 19, 2023, 9:33 AM

#

oh, nah I just use morton codes

frank sail Aug 19, 2023, 9:33 AM

#

idk I remember they did something similar for surfel allocation in surfel GI bleakekw

wicked notch Aug 19, 2023, 9:34 AM

#

I make an array of 128x128 unsigned integers, last bit is the valid bit, each frame I clear everything to invalid and then I readback the valid pages and mark them

frank sail Aug 19, 2023, 9:38 AM

#

alright so that gives us the invisible pages I suppose

#

but that also means no reuse unless you use another bit

wicked notch Aug 19, 2023, 9:39 AM

#

what does this bit do

frank sail Aug 19, 2023, 9:41 AM

#

it tells us that something has been rendered to the page at some point

#

if the page becomes visible again, we needn't re-render the page unless the sun moved or that area of the scene changed

wicked notch Aug 19, 2023, 9:42 AM

#

ah

frank sail Aug 19, 2023, 9:42 AM

#

which is admittedly a bit of an endgame optimization

wicked notch Aug 19, 2023, 9:42 AM

#

yeah

#

lol

#

caching comes far into the future

frank sail Aug 19, 2023, 9:43 AM

#

JS already has shadow map caching, hurry up frog_whip

wicked notch Aug 19, 2023, 9:44 AM

#

I'm still conflicted on whether I should first finish real sparse or already move on with fake sparse

frank sail Aug 19, 2023, 9:45 AM

#

if you do real sparse, I'll do fake sparse

#

then we can compare

#

I'll have to go to bed now because I need to wake up and perform manual labor in 5-6 hours bleakekw

wicked notch Aug 19, 2023, 9:47 AM

#

💀

wicked notch Aug 19, 2023, 11:44 AM

#

Screen Space Shadows

#

Just because

raven orchid Aug 19, 2023, 2:58 PM

#

wicked notch maybe if block of pages hasn't been touched in X frames evict

This is what I ended up doing

#

Every frame ended up being excessive

#

The read back is an ssbo that compute atomically pushes work to for the CPU

#

Though it still might be necessary to add a sort of allocator hardware budget to prevent it from trying to dealloc too many pages during one frame
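A minimal sketch of the "evict after X idle frames, but cap the work per frame" idea discussed above. The `PageAge` struct, `collect_evictions`, and every parameter name are hypothetical and only illustrate the bookkeeping; nothing here is taken from either implementation in the chat.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-page bookkeeping: residency plus the last frame that touched it.
struct PageAge {
    bool resident = false;
    uint32_t last_touched_frame = 0;
};

// Marks resident pages that have been idle for more than `max_idle_frames` as
// non-resident and returns their indices. At most `budget` pages are evicted
// per call, so one big camera move cannot flood a single frame with deallocs.
inline std::vector<uint32_t> collect_evictions(std::vector<PageAge>& pages,
                                               uint32_t current_frame,
                                               uint32_t max_idle_frames,
                                               uint32_t budget) {
    std::vector<uint32_t> evicted;
    for (uint32_t i = 0; i < pages.size() && evicted.size() < budget; ++i) {
        PageAge& page = pages[i];
        if (page.resident && current_frame - page.last_touched_frame > max_idle_frames) {
            page.resident = false;
            evicted.push_back(i);
        }
    }
    return evicted;
}
```

Pages that exceed the budget simply stay resident and get picked up by a later call.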

wicked notch Aug 19, 2023, 3:40 PM

#

hmm another problem has arisen

#

Not every virtual texture will have the same resolution, some may be 16K, others may be 4K, so their page table sizes too will differ

#

Eh actually this is easily solvable, I just have to make yet another buffer storing each texture's page table offset in the main array
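The offset buffer described above can be sketched as a running sum over each texture's page count. This assumes square textures and square pages; `build_page_table_offsets` is a hypothetical helper name, not something from the project.

```cpp
#include <cstdint>
#include <vector>

// For each virtual texture (given as texels per side), returns the index of its
// first entry in one shared page-table array. The final element is the total
// number of pages, which is also the size the shared array needs.
inline std::vector<uint32_t> build_page_table_offsets(
        const std::vector<uint32_t>& resolutions, uint32_t page_size) {
    std::vector<uint32_t> offsets;
    offsets.reserve(resolutions.size() + 1);
    uint32_t running = 0;
    for (uint32_t resolution : resolutions) {
        offsets.push_back(running);
        // Round up so a partially covered page still gets a table entry.
        const uint32_t pages_per_side = (resolution + page_size - 1) / page_size;
        running += pages_per_side * pages_per_side;
    }
    offsets.push_back(running);
    return offsets;
}
```

With 128-texel pages, a 16K texture takes 16384 entries and a 4K texture takes 1024, so the second texture starts at index 16384.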

wicked notch Aug 19, 2023, 4:07 PM

#

why are you asking this here bleakekw

runic surge Aug 19, 2023, 4:07 PM

#

OH FUCK

wicked notch Aug 19, 2023, 4:07 PM

#

I dun have a stretchy monitor

runic surge Aug 19, 2023, 4:07 PM

#

I THOUGHT THIS WAS QUESTION

#

IM SORRY

#

bleakekw

wicked notch Aug 19, 2023, 4:07 PM

#

no prob lmao

runic surge Aug 19, 2023, 4:08 PM

#

KEKW

wicked notch Aug 19, 2023, 5:15 PM

#

Jaker was indeed right when he told me to mark reusable pages

#

too bad I'm stoopid

wicked notch Aug 19, 2023, 5:44 PM

#

```glsl
void main() {
    const ivec2 position = ivec2(gl_GlobalInvocationID.xy);
    const uvec2 size = imageSize(u_visbuffer);
    if (any(greaterThanEqual(position, size))) {
        return;
    }
    const uint depth_bits = uint((payload >> 34) & 0x3fffffff);
    if (depth_bits == 0) {
        return;
    }
    const float depth = uintBitsToFloat(depth_bits);
    const vec2 uv = (vec2(position) + vec2(0.5)) / vec2(size);
    vec4 world_position = inverse(u_camera.data.pv) * vec4(uv * 2.0 - 1.0, depth, 1.0);
    world_position /= world_position.w;
    const vec4 shadow_uv = shadow_data_ptr.pv * world_position;
    const uint shadow_tile_x = min(uint(shadow_uv.x * VSM_RESOLUTION / VSM_TILE_SIZE_X), VSM_TILE_SIZE_X - 1);
    const uint shadow_tile_y = min(uint(shadow_uv.y * VSM_RESOLUTION / VSM_TILE_SIZE_Y), VSM_TILE_SIZE_Y - 1);
    const uint shadow_tile_index = shadow_tile_x + shadow_tile_y * VSM_PAGE_COUNT;
    atomicAdd(vsm_page_req_ptr.pages[shadow_tile_index], 1);
}
```

amazing
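To make the tile arithmetic in the page request pass easier to follow, here is a CPU-side sketch of the UV-to-tile mapping. It assumes a 16k shadow map with 128-texel tiles (so 128 tiles per side) and a shadow UV already remapped to [0, 1]. The constants and function names are hypothetical, and the shader as posted may fold the remap into its matrix.

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint32_t kVsmResolution = 16384;  // assumed virtual shadow map size
constexpr uint32_t kVsmTileSize = 128;      // texels per tile side
constexpr uint32_t kVsmTilesPerSide = kVsmResolution / kVsmTileSize;  // 128

// Converts a clip-space coordinate in [-1, 1] to a UV in [0, 1].
inline float ndc_to_uv(float ndc) {
    return ndc * 0.5f + 0.5f;
}

// Maps a shadow UV to a row-major tile index, clamping so that u == 1.0 or
// v == 1.0 still lands in the last tile instead of one past the end.
inline uint32_t shadow_tile_index(float u, float v) {
    const float cu = std::clamp(u, 0.0f, 1.0f);
    const float cv = std::clamp(v, 0.0f, 1.0f);
    const uint32_t tx = std::min(uint32_t(cu * kVsmTilesPerSide), kVsmTilesPerSide - 1);
    const uint32_t ty = std::min(uint32_t(cv * kVsmTilesPerSide), kVsmTilesPerSide - 1);
    return tx + ty * kVsmTilesPerSide;
}
```

Note that the clamp bound and the row stride are both the tile count per side, which only coincides with the tile size in texels because 16384 / 128 happens to equal 128.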

wicked notch Aug 19, 2023, 8:32 PM

#

holy god

#

this is horrifying to look at

frank sail Aug 19, 2023, 8:38 PM

#

Is this real sparse

wicked notch Aug 19, 2023, 8:38 PM

#

well this is just the page request part of real sparse

#

the thing where you see which pages are needed

frank sail Aug 19, 2023, 8:38 PM

#

That's just analyzing the gbuffer eh

#

Depth

#

Wait so you aren't even touching the sparse texture here froghorror

wicked notch Aug 19, 2023, 8:39 PM

#

ye

#

mfw writing to host memory takes time

#

shocking

frank sail Aug 19, 2023, 8:40 PM

#

Ah

glass sphinx Aug 19, 2023, 8:40 PM

#

wicked notch holy god

love to see pixelwarps in your hw raster

frank sail Aug 19, 2023, 8:40 PM

#

You should probably transfer to host after this pass or something

glass sphinx Aug 19, 2023, 8:40 PM

#

🥸

wicked notch Aug 19, 2023, 8:40 PM

#

ye I'll do some transfer shenanigans

frank sail Aug 19, 2023, 8:40 PM

#

Or use device local memory

#

With host visible

wicked notch Aug 19, 2023, 8:41 PM

#

Darian said it's even more painful than HOST_CACHED bleakekw

frank sail Aug 19, 2023, 8:41 PM

#

Fake

#

Just try it bro

wicked notch Aug 19, 2023, 8:42 PM

#

alright

#

surely much better

#

but will I read correct data?

#

or rather

#

will reading from the CPU be absurdly slow?

#

I shall now test

#

welp

#

it takes half a millisecond to read 16KiB of data

#

jesus

frank sail Aug 19, 2023, 8:49 PM

#

hmm

wicked notch Aug 19, 2023, 8:49 PM

#

I think a dedicated transfer operation is necessary

frank sail Aug 19, 2023, 8:51 PM

#

Ye

#

Seems the least finicky too, since memory types can change per device

wicked notch Aug 19, 2023, 8:53 PM

#

I don't need to care about memory types though do I?

#

Like I always transfer from DEVICE_LOCAL to HOST_CACHED

wicked notch Aug 19, 2023, 9:21 PM

#

yup, dedicated transfer op is a noop basically lol

#

and reading takes 2 microseconds

#

amazing

frank sail Aug 19, 2023, 10:12 PM

#

wicked notch I don't need to care about memory types though do I?

I mean just the basic ones

wicked notch Aug 19, 2023, 10:20 PM

#

Hmm

#

once a virtual page is mapped to a physical page, should this mapping remain the same until the virtual page is evicted?

#

or rather, overwritten, not evicted

frank sail Aug 19, 2023, 10:54 PM

#

I don't see why you'd need to shuffle around allocated pages tbh

wicked notch Aug 19, 2023, 10:54 PM

#

maybe not for shadows

#

but for textures in general

#

actually, maybe for shadows too

#

I am suballocating from a 256MiB VkDeviceMemory after all, I need to free up space whenever possible

#

Say in frame 0 we draw to virtual page 0, this will map to physical page 7 or something; now suppose we never draw to virtual page 0 again, the physical page 7 will never be used again

#

If I don't "free" it

frank sail Aug 19, 2023, 10:56 PM

#

Ok so you can shuffle allocated, but unused (freed) pages

#

That makes sense

wicked notch Aug 19, 2023, 10:59 PM

#

ok good

#

all this indirection is destroying my single shared neuron

wispy spear Aug 19, 2023, 11:02 PM

#

here, have mine.

wicked notch Aug 19, 2023, 11:10 PM

#

ok the roadmap is as follows

#

step 1. get id of the virtual pages requested from the GPU
step 2. invalidate all pages
step 3. for each requested id, if it's not resident allocate new page

#

I am skeptical about the invalidate all pages step, but it's a logical operation so it's fine I guess

#

Jaker can you fact check

#

JS you're welcome to share the neuron and fact check too

frank sail Aug 19, 2023, 11:15 PM

#

Invalidate all pages seems okay

#

I think it will even work when you add caching, provided you add a little more bookkeeping

wicked notch Aug 19, 2023, 11:16 PM

#

I'm thinking of it like a linear allocator

#

that gets reset every frame

#

except already resident pages that didn't change shouldn't be touched

#

so actually hold on

frank sail Aug 19, 2023, 11:19 PM

#

I'm holding for dear life

wicked notch Aug 19, 2023, 11:20 PM

#

ok I got it

#

I keep track of 2 things

#

allocated pages and non allocated pages, for the allocated pages I also keep track if they are resident or not

#

when I get the requests from the GPU, there are a lot of things I have to do

#

while I was writing I decided that this is stupid

#

Better method: only keep track of allocated and unallocated pages

#

Now:

requested page is already allocated => do nothing
requested page is not allocated => allocate and update sparse bindings
if page wasn't requested => deallocate it
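The three rules above amount to a diff between "requested this frame" and "currently allocated". A rough sketch, assuming both are flat per-page flag arrays of equal length; `PageDiff` and `diff_pages` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct PageDiff {
    std::vector<uint32_t> to_allocate;    // these need a new sparse binding
    std::vector<uint32_t> to_deallocate;  // these only flip a page-table bit
};

// `requested[i]` is non-zero when the GPU asked for page i this frame;
// `allocated[i]` is non-zero when page i currently has backing memory.
// `allocated` is updated in place to match the new state.
inline PageDiff diff_pages(const std::vector<uint8_t>& requested,
                           std::vector<uint8_t>& allocated) {
    PageDiff diff;
    for (std::size_t page = 0; page < requested.size(); ++page) {
        const bool wanted = requested[page] != 0;
        const bool resident = allocated[page] != 0;
        if (wanted && !resident) {
            diff.to_allocate.push_back(uint32_t(page));
            allocated[page] = 1;
        } else if (!wanted && resident) {
            diff.to_deallocate.push_back(uint32_t(page));
            allocated[page] = 0;
        }
        // wanted && resident: nothing to do.
    }
    return diff;
}
```

Only a non-empty `to_allocate` list would call for a sparse bind submission in this scheme.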

#

how does this sound

raven orchid Aug 19, 2023, 11:28 PM

#

Most of the time that will be really good

#

One exception I ran into was big camera changes if you immediately dealloc

wicked notch Aug 19, 2023, 11:28 PM

#

the only time I call vkQueueBindSparse is if case 2 happens (requested page is not allocated)

raven orchid Aug 19, 2023, 11:29 PM

#

Like frame 3 might get overwhelmed with dealloc requests

wicked notch Aug 19, 2023, 11:29 PM

#

Hm

raven orchid Aug 19, 2023, 11:29 PM

#

Then frame 4 the dealloced pages get requested again

wicked notch Aug 19, 2023, 11:30 PM

#

btw deallocation in my case effectively does nothing

#

it just changes one bit in the page table

raven orchid Aug 19, 2023, 11:31 PM

#

ah ok I don't think that will be a problem then

wicked notch Aug 19, 2023, 11:31 PM

#

I should be safe right?

#

ok good

raven orchid Aug 19, 2023, 11:31 PM

#

so memory still stays around?

wicked notch Aug 19, 2023, 11:31 PM

#

ye memory and sparse bindings are untouched

raven orchid Aug 19, 2023, 11:31 PM

#

Ok I think that will be fine since it's a cheap operation

wicked notch Aug 19, 2023, 11:31 PM

#

I make a promise to not use that page again

#

if I break the promise I die

#

actually

#

I can break the promise

#

until I do caching

#

if I do caching and break the promise then I really will die

wispy spear Aug 19, 2023, 11:39 PM

#

no worries, ill resurrect you

frank sail Aug 20, 2023, 6:35 AM

#

I wonder if SDSM would even be a reasonable thing to implement with VSM

#

caching would instantly die

#

eh, it would probably suck since the whole point of SDSM is to tightly fit the shadow map to the frustum, which would play poorly with the concept of sparse allocation

twin bough Aug 20, 2023, 8:09 AM

#

speaking of shadows

#

did any of you guys try Tetrahedron Shadow Mapping yet

#

i want to learn it because it seems really cool to have omni shadows on a 1x1 grid

frank sail Aug 20, 2023, 8:11 AM

#

I have not seen it

twin bough Aug 20, 2023, 8:11 AM

#

frank sail I have not seen it

https://a-raven.github.io/PersonalWebpageCN/2019/12/21/10CS562Project5/

CS562 Project5：Tile Based Omnidirectional Shadows - FAN JIN'S PORTF...

Introduction:In this paper, I will describe the process of my implementation of Tile Based Omnidirectional Shadow Mapping, which is an effective way o

frank sail Aug 20, 2023, 8:12 AM

#

only free resource about it is some student project

twin bough Aug 20, 2023, 8:12 AM

#

i have a huge problem right now regarding omni shadows because the filter kernel leaks into the other parts of the cubemap

#

yeah 😦

#

there is a source code

frank sail Aug 20, 2023, 8:14 AM

#

ok the algorithm seems relatively shrimple

#

I rarely see user-assigned clip planes actually used for anything

twin bough Aug 20, 2023, 8:20 AM

#

the draw call generation is interesting

wicked notch Aug 20, 2023, 10:05 AM

#

frank sail caching would instantly die

ye caching is dead with SDSM

#

but we more than make up for it with clipmaps

wispy spear Aug 20, 2023, 11:21 AM

#

@wicked notch how come you only came out of hibernation half a year ago or so and not already when you joined this frog pond 2 years ago? : )

wicked notch Aug 20, 2023, 11:21 AM

#

I joined this pond 2 years ago due to my uni project I think I mentioned

wispy spear Aug 20, 2023, 11:22 AM

#

oh, i may or may not have missed that

wicked notch Aug 20, 2023, 11:22 AM

#

it was a Vk project and most of the questions I asked here at the time were things my classmates had written bleakekw

wispy spear Aug 20, 2023, 11:22 AM

#

i dont rember seeing you being active before hence the question 😄

wicked notch Aug 20, 2023, 11:22 AM

#

ye I basically wasn't active

wispy spear Aug 20, 2023, 11:22 AM

#

either way, im glad you are here

distant lodge Aug 20, 2023, 12:24 PM

#

I think the discord server panel even has a buzzword for this

#

it's called activation time or something

wicked notch Aug 20, 2023, 2:23 PM

#

we got le page table

#

no amount of debug outputs will make me not write bugs

wicked notch Aug 20, 2023, 2:54 PM

#

```cpp
#define VSM_TILE_SIZE_X 128
#define VSM_TILE_SIZE_Y 128
#define VSM_CHANNEL_SIZE 4  // bytes per texel
updates.clear();
// requested_pages[i] is non-zero when the GPU requested page i this frame
for (uint32_t page_index = 0; page_index < requested_pages.size(); ++page_index) {
  if (requested_pages[page_index] != 0) {
    if (page_table[page_index].is_allocated()) {
      continue;
    }
    auto memory = VkDeviceMemory();
    auto offset = VkDeviceSize();
    page_table[page_index] = allocate_page(page_index, &memory, &offset);
    updates.emplace_back(VkSparseMemoryBind {
      // both the offset and the size are in bytes, so both include the texel size
      .resourceOffset = VkDeviceSize(VSM_TILE_SIZE_X) * VSM_TILE_SIZE_Y * VSM_CHANNEL_SIZE * page_index,
      .size = VSM_TILE_SIZE_X * VSM_TILE_SIZE_Y * VSM_CHANNEL_SIZE,
      .memory = memory,
      .memoryOffset = offset
    });
  } else {
    page_table[page_index] = deallocate_page(page_index);
  }
}
if (!updates.empty()) {
  vkQueueBindSparse(...);
}
```

#

ok good

#

this will quickly get out of hand for textures though

wicked notch Aug 20, 2023, 5:19 PM

#

actually nevermind

#

this is completely bogus for regular textures and will work only for VSM

wicked notch Aug 20, 2023, 6:51 PM

#

Ironed out it kinda looks like this

#

```cpp
struct sparse_memory_block_t {
    VmaAllocation memory = {};
    VmaAllocationInfo info = {};
    uint64 allocations = 0;
};

struct sparse_memory_page_t {
    std::reference_wrapper<sparse_memory_block_t> block;
    uint64 offset = 0;
    uint64 size = 0;
};

struct sparse_image_memory_bind_t {
    image_subresource_t subresource = {};
    offset_3d_t offset = {};
    extent_3d_t extent = {};
    sparse_memory_page_t page;
};
```

#

I am sort of unhappy still

frank sail Aug 20, 2023, 7:11 PM

#

father

#

I crave serotonin

wicked notch Aug 20, 2023, 7:12 PM

#

go do VSM

#

plenty of dopamine and serotonin once you achieve it

wicked notch Aug 20, 2023, 9:13 PM

#

as it turns out

#

an unconditional vkAllocateMemory inside the render loop will lead to bad things

wicked notch Aug 20, 2023, 9:51 PM

#

This is the single, most fucked up, most inefficient and most ridiculously non-scalable piece of code I have ever written, in my entire life

#

but it somehow works

#

actually I take "inefficient" back

#

it's actually pretty good bleakekw

#

averaging 30usec

#

but it is still stupidly non scalable

#

I gotta separate the "page allocator" and the "block allocator"

#

very tedious overall bleakekw

#

things left to do:

build a mip chain out of the page table, to be used as a "HZB"
build meshlet lists for the hw and sw rasterizers and cull pages
actually draw the shadow map based on the info above

#

oh btw, @raven orchid how bad is clearing a 16k render target? bleakekw

raven orchid Aug 20, 2023, 10:18 PM

#

wicked notch oh btw, @raven orchid how bad is clearing a 16k render target? bleakekw

Actually pretty good!

#

Well, I’ve never had the full thing in memory though

#

But clearing the sparse target every frame hasn’t been bad so far surprisingly

raven orchid Aug 20, 2023, 10:21 PM

#

wicked notch I gotta separate the "page allocator" and the "block allocator"

Block is collection of pages or is it something else?

raven orchid Aug 20, 2023, 10:22 PM

#

wicked notch things left to do: - build a mip chain out of the page table, to be used as a "H...

I’m actually wondering about your approach to drawing the shadow map since you have combined hw and sw rasterizing. This step has been one of the worst for me

wicked notch Aug 20, 2023, 11:10 PM

#

I draw into an R32_UINT shadow map

#

and I imageAtomicMin it with floatBitsToUint(depth) for both the hardware and software paths
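The reason an integer atomic min works on depth is that non-negative IEEE-754 floats sort the same way as their raw bit patterns read as unsigned integers. A small CPU-side sketch of that trick, with `std::atomic` standing in for the image atomic. Both function names are hypothetical, and the sketch assumes depth is never negative.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// CPU equivalent of GLSL floatBitsToUint: reinterpret the float's bits.
inline uint32_t float_bits_to_uint(float value) {
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Stand-in for imageAtomicMin on one R32_UINT texel: keep the smaller value.
inline void atomic_min(std::atomic<uint32_t>& texel, uint32_t candidate) {
    uint32_t current = texel.load(std::memory_order_relaxed);
    // compare_exchange_weak reloads `current` on failure, so the loop ends as
    // soon as the stored value is already <= candidate or the swap succeeds.
    while (candidate < current &&
           !texel.compare_exchange_weak(current, candidate, std::memory_order_relaxed)) {
    }
}
```

Because both rasterizer paths go through the same atomic, the closest depth wins regardless of which path wrote it or in what order.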

#

so rip early Z

#

but I already know it won't be an issue

delicate rain Aug 20, 2023, 11:36 PM

#

I'm confused didn't you do rt a little while ago? How are you doing insano shadow stuff now 🥸

wicked notch Aug 20, 2023, 11:47 PM

#

Parkour

#

RT is currently stuck due to graph partitioning

delicate rain Aug 20, 2023, 11:52 PM

#

I'm interested where this shadow map hole leads you to, I need inspiration, I hit a bit of a slump when it comes to my shadows

frank sail Aug 21, 2023, 1:08 AM

#

We've been discussing it in here, #1090536732769927178, and #1128020727380054046

delicate rain Aug 21, 2023, 1:13 AM

#

Will make sure to check it out, thanks

#

Okay I have no idea what VSMs are nor what you talk about in any of these threads but I'm sold already

#

Give me two days to catch up lol

frank sail Aug 21, 2023, 1:27 AM

#

basically we are decoupling storage from our shadow map, allowing us to act as though the shadow map is huge (16k^2), at the expense of having to determine active pages and manage memory ourselves

delicate rain Aug 21, 2023, 1:28 AM

#

So you are pretending you have huge shadowmap but only actually draw the sampled parts into a much smaller texture which you manage manually?

delicate rain Aug 21, 2023, 1:50 AM

#

Nvm I'm starting to get it

delicate rain Aug 21, 2023, 2:06 AM

#

wicked notch ye caching is dead with SDSM

Taking the risk of sounding really stupid, could you not make your VSM span the frustum (similarly to SDSM) and then virtually reproject the tiles from last frame that are still in the new frustum this frame + draw the new tiles you need where the frustum moved?

frank sail Aug 21, 2023, 2:36 AM

#

It could work if your frustum snaps to the world grid or something

#

That way tiles can reproject cleanly

#

I reckon you'd also have to transform light space depth from the old frame into the current

delicate rain Aug 21, 2023, 8:38 AM

#

Why would you need that? The light space depth should be invariant to the frustum position no?

#

The thing that worries me the most with having small "tiles" is the amount of times I'll need to draw the scene

#

I'll prob need to yeet someones culling in order to make this viable

wicked notch Aug 21, 2023, 9:19 AM

#

delicate rain The thing that worries me the most with having small "tiles" is the amount of ti...

no, you draw the entire thing all at once

#

but you do culling per tile

#

i.e for each mesh/meshlet you determine if it overlaps any tiles that will be needed for sampling

#

if it does then you draw it

#

writes into unpaged memory get discarded with real sparse, but with fake sparse you should make sure to not write into unpaged tiles
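The per-tile culling test described above can be sketched as "does this mesh's footprint in the shadow map touch any requested tile". This brute-force version assumes the footprint has already been projected to an inclusive rectangle of tile coordinates. `TileRect` and `overlaps_requested_tile` are hypothetical names, and a mip chain over the page table would presumably replace the inner loops for large footprints.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Inclusive range of tiles covered by a mesh or meshlet in light space.
struct TileRect {
    uint32_t min_x, min_y, max_x, max_y;
};

// `requested` holds one flag per tile in row-major order; non-zero means the
// tile will be sampled this frame. Returns true if the rect touches any such tile.
inline bool overlaps_requested_tile(const TileRect& rect,
                                    const std::vector<uint8_t>& requested,
                                    uint32_t tiles_per_side) {
    const uint32_t max_x = std::min(rect.max_x, tiles_per_side - 1);
    const uint32_t max_y = std::min(rect.max_y, tiles_per_side - 1);
    for (uint32_t y = rect.min_y; y <= max_y; ++y) {
        for (uint32_t x = rect.min_x; x <= max_x; ++x) {
            if (requested[x + y * tiles_per_side] != 0) {
                return true;  // at least one needed tile: draw the mesh
            }
        }
    }
    return false;  // no needed tile is covered: cull the mesh
}
```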

delicate rain Aug 21, 2023, 9:25 AM

#

Ohhhhh

#

Hmmm I see

#

Very smart

wicked notch Aug 21, 2023, 9:26 AM

#

I dunno what kinda substances unreal devs sniff to come up with this shit but boy am I glad they do

#

KEKW

delicate rain Aug 21, 2023, 9:27 AM

#

Yeah the soda there must have a lot of minerals or smth

wicked notch Aug 21, 2023, 4:20 PM

#

I am considering using a linear allocator for this shit

#

I now realize that 250 microseconds on average is far too much on the CPU side to update just 16k pages

#

I'm also considering placing a fixed amount of pages that can be updated each frame

#

but that's if the linear allocator thingy fails

glass sphinx Aug 21, 2023, 4:52 PM

#

linear allocators are good

#

its really impressive how fast atomics are on gpus

#

its so good

wicked notch Aug 21, 2023, 7:16 PM

#

Alright this is hard

frank sail Aug 21, 2023, 7:16 PM

#

Same

wicked notch Aug 21, 2023, 7:16 PM

#

Here's how I'm thinking of solving the number of page requests per frame

twin bough Aug 21, 2023, 7:17 PM

#

That's what he said

wicked notch Aug 21, 2023, 7:27 PM

#

```glsl
layout (scalar, buffer_reference) restrict buffer b_page_table {
    uint8_t[] pages; // this is num_pages_total, for all virtual textures that exist
};

layout (scalar, buffer_reference) restrict buffer  b_page_req_table {
    uint count;
    uint[] pages;
};

void main() {
    // do whatever it is I have to do to get a page index for this frame or PAGE_INVALID if this pixel requests no pages.
    const uint page_index = find_virtual_page(...);
    if (page_index != PAGE_INVALID) {
        const uint8_t page_value = atomicExchange(page_table_ptr.pages[page_index], 1, gl_ScopeQueueFamily, gl_StorageSemanticsBuffer, gl_SemanticsAcquireRelease);
        if (page_value == 0) {
            const uint slot = atomicAdd(page_req_table_ptr.count, 1, gl_ScopeQueueFamily, gl_StorageSemanticsBuffer, gl_SemanticsAcquireRelease);
            page_req_table_ptr.pages[slot] = page_index;
        }
    }
}
```
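The same "mark once, then append" pattern can be tried out on the CPU with `std::atomic`, which may help when reasoning about the shader. This is only an analogy under assumed names (`PageRequests`, `request`); it is not the GPU code and uses 32-bit flags rather than bytes.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU stand-in for the two buffers: a per-page "seen" flag and a compacted
// list of requested page indices with an atomic element count.
struct PageRequests {
    std::vector<std::atomic<uint32_t>> seen;
    std::vector<uint32_t> pages;
    std::atomic<uint32_t> count{0};

    explicit PageRequests(std::size_t page_count)
        : seen(page_count), pages(page_count, 0) {
        for (auto& flag : seen) {
            flag.store(0, std::memory_order_relaxed);
        }
    }

    // Callable from many threads: only the first caller per page observes 0
    // from the exchange, so each page is appended to the list exactly once.
    void request(uint32_t page) {
        if (seen[page].exchange(1, std::memory_order_acq_rel) == 0) {
            const uint32_t slot = count.fetch_add(1, std::memory_order_acq_rel);
            pages[slot] = page;
        }
    }
};
```

The list never overflows because it has one slot per page and each page can claim at most one slot. With concurrent callers the order of entries is not deterministic.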

#

Ok pardon for the long time, I was thinking about it while I was writing bleakekw

#

I call in the atomics man: Wpotrick

#

please analyze this code bleakekw

wispy spear Aug 21, 2023, 7:28 PM

#

summoning @glass sphinx

glass sphinx Aug 21, 2023, 7:29 PM

#

https://tenor.com/view/superman-superhero-gif-26117061

Tenor

wicked notch Aug 21, 2023, 7:29 PM

#

I should also note that this is to make CPU readback easier by sending in only page indices that have actually been requested, instead of all the pages

#

given that the memory is deallocated automatically every frame, since I'm switching to linear allocators

glass sphinx Aug 21, 2023, 7:30 PM

#

what does find virtual page do

#

i have to go gym

#

later i help

wicked notch Aug 21, 2023, 7:31 PM

#

it detects which tile of which texture this pixel is going to sample from

#

alright monkey man, have a nice workout session

wicked notch Aug 21, 2023, 8:19 PM

#

Hmm I'm still thinking

#

Now that I have all requested pages, there is no way to guarantee order

#

so each frame I would end up doing a huge number of page requests

#

or wait, not exactly

#

the page requests would remain the same

#

But their location in memory could differ

#

Do I want this?

distant lodge Aug 21, 2023, 8:22 PM

#

that might mean some of them are colder in cache than otherwise could be

#

but I dunno if your actual access patterns make that noticeable

wicked notch Aug 21, 2023, 8:36 PM

#

Alright I was terribly wrong

#

I do need to update page bindings too with a linear allocator

frank sail Aug 21, 2023, 8:37 PM

#

distant lodge that might mean some of them are colder in cache than otherwise could be

Each page will already be a multiple of the cache line size so it doesn't matter too much where they are

wicked notch Aug 21, 2023, 9:02 PM

#

After very careful thought

#

A linear allocator is not usable for this kind of thing

#

I need some allocations to be persistent

wicked notch Aug 21, 2023, 9:26 PM

#

It may work well for fake sparse though

#

since binding ops are super cheap

wicked notch Aug 21, 2023, 10:31 PM

#

My brain is currently at max capacity, overheating and I still don't have a solution to this problem, despite thinking most of the day

#

The thing is conceptually very easy too:

Get page indices requested from the GPU
For each page in the request, check if it is resident: if it is, do nothing; if it isn't, allocate and update the sparse table. Any page that isn't requested gets deallocated

#

And yet, this performs terribly, even with a first fit allocator

#

250us on average to update a mere 500 pages

#

How do I fix this

#

I don't think a linear allocator can fundamentally work either, I need some pages to stay resident in between frames

#

I thought also about maybe a "bump allocator" with a ring buffer, but that still doesn't sound right

distant lodge Aug 21, 2023, 10:38 PM

#

kinda fallen behind this problem, but why can't you split your allocation strategy between persistent and non-persistent stuff

wicked notch Aug 21, 2023, 10:38 PM

#

I'm not sure how to do that split

#

any page can be resident for any number of frames

#

"Persistent" here means, "the gpu requested this page 2 or more times in a row"

#

the first frame page 0 is requested it is allocated, if the next frame page 0 is requested again, nothing is done (no deallocation or moving of sorts)

#

english died for a sec there bleakekw

distant lodge Aug 21, 2023, 10:42 PM

#

oh then rip

wicked notch Aug 21, 2023, 10:42 PM

#

Does a bump allocator with a ring buffer make any sense whatsoever

distant lodge Aug 21, 2023, 10:43 PM

#

maybe ¯_(ツ)_/¯

#

why not just have some central bitmask block you can check to see if a page is taken

#

I think some real OS allocators use that

wicked notch Aug 21, 2023, 10:44 PM

#

For the ring buffer you mean?

#

Also that would be one huge bitmask

distant lodge Aug 21, 2023, 10:44 PM

#

for any page allocator

#

how big are your pages?

wicked notch Aug 21, 2023, 10:44 PM

#

It doesn't really matter, but 64KiB

distant lodge Aug 21, 2023, 10:45 PM

#

idk it probably won't be that big though

wicked notch Aug 21, 2023, 10:45 PM

#

It would be 16384 bits big

distant lodge Aug 21, 2023, 10:45 PM

#

2048 bytes, 512 uints

#

per gig

wicked notch Aug 21, 2023, 10:46 PM

#

Because in one virtual shadow map there are 2^14 pages

distant lodge Aug 21, 2023, 10:46 PM

#

that's not that bad

wicked notch Aug 21, 2023, 10:46 PM

#

Hm perhaps

distant lodge Aug 21, 2023, 10:46 PM

#

2KiB/gig is a pretty small cost

#

all for an O(1) cache efficient residency czech
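The numbers quoted above check out with a little arithmetic: 1 GiB of backing memory split into 64 KiB pages gives 16384 pages, one bit each. A sketch with hypothetical constant names, plus the constant-time bit test itself (whether a set bit means "taken" or "free" is a convention the caller picks):

```cpp
#include <cstdint>

constexpr uint64_t kPageSize = 64ull * 1024;              // 64 KiB per page
constexpr uint64_t kBackingSize = 1024ull * 1024 * 1024;  // 1 GiB of backing memory
constexpr uint64_t kPageCount = kBackingSize / kPageSize; // one bit per page
constexpr uint64_t kMaskBytes = kPageCount / 8;
constexpr uint64_t kMaskWords32 = kMaskBytes / 4;

static_assert(kPageCount == 16384, "2^14 pages per GiB");
static_assert(kMaskBytes == 2048, "2 KiB of bitmask per GiB");
static_assert(kMaskWords32 == 512, "512 uints per GiB");

// O(1) lookup: pick the word, shift the wanted bit down, mask it off.
inline bool test_bit(const uint32_t* words, uint32_t page) {
    return ((words[page / 32] >> (page % 32)) & 1u) != 0;
}
```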

wicked notch Aug 21, 2023, 10:50 PM

#

So the bump allocator I think would work like this

```cpp
auto curr_ptr = bump.current();
while (is_page_allocated(curr_ptr)) {
  curr_ptr = bump.advance();
}
allocate_page(curr_ptr, ...);
```

#

Doesn't look too bad

#

Advance automatically brings the ptr back to the first slot once it has reached the end

#

And is_page_allocated is said bitmask czech

distant lodge Aug 21, 2023, 10:51 PM

#

you could probably use fancy bit intrinsics to do funny stuff there too

wicked notch Aug 21, 2023, 10:52 PM

#

do tell

#

I like funny bits

distant lodge Aug 21, 2023, 10:52 PM

#

both on the CPU and GPU there are dedicated instructions for stuff like bit count, or getting the first bit that's a 1 or the last bit that's a 1

wicked notch Aug 21, 2023, 10:52 PM

#

Ah, __builtin_clz

distant lodge Aug 21, 2023, 10:53 PM

#

so you can probably check 32 pages at a time if you're smart

#

or 64, not sure what you're targeting

wicked notch Aug 21, 2023, 10:53 PM

#

I'm targeting my system

#

which is fairly modern, so 64 KEKW

distant lodge Aug 21, 2023, 10:53 PM

#

your CPU system or GPU system

wicked notch Aug 21, 2023, 10:53 PM

#

CPU

distant lodge Aug 21, 2023, 10:54 PM

#

oh

#

so then yeah you can check 64 pages at a time

wicked notch Aug 21, 2023, 10:54 PM

#

Maybe I didn't mention this, but this allocator should run on the CPU ye

#

I got a big brained idea

#

__builtin_clz is exactly what I need holy shit

#

```cpp
const auto free = __builtin_clz(~bitmask);
```

#

or not

#

eh, nah rip

#

it's not what I need

#

there must be a fancy instruction that returns the first 0 bit

distant lodge Aug 21, 2023, 11:02 PM

#

it might be ffs

#

or maybe ctz?

wicked notch Aug 21, 2023, 11:03 PM

#

hmm

#

likely

#

let me run some high level simulations (drawing on my tablet)

distant lodge Aug 21, 2023, 11:05 PM

#

https://godbolt.org/z/3jzv663EG

#

here's a higher level simulation

#

from what I read just now, beware that ctz/clz are technically undefined when x = 0

#

so you might need to be careful with it

wicked notch Aug 21, 2023, 11:06 PM

#

Ok clz is definitely what I want

#

Suppose 0 is allocated and 1 is free, __builtin_clz(0011) returns 2, __builtin_clz(1000) returns 0 and __builtin_clz(0000) is undefined (good, it means it's completely full)

#

epic

distant lodge Aug 21, 2023, 11:08 PM

#

undefined means it'll return whatever and you can't depend on it

#

so you have to check it separately

wicked notch Aug 21, 2023, 11:08 PM

#

yeah, I'll just add a separate checc
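A sketch of that separate check, wrapping the GCC/Clang builtin so the zero case is handled explicitly. `highest_set_bit` is a hypothetical helper name. It uses the 64-bit `__builtin_clzll` variant because plain `__builtin_clz` takes a 32-bit `unsigned int`.

```cpp
#include <cstdint>

// Returns the index of the highest set bit (0 = least significant bit),
// or -1 when no bit is set. The builtin's result is undefined for 0, so that
// case must be filtered out before calling it.
inline int highest_set_bit(uint64_t mask) {
    if (mask == 0) {
        return -1;
    }
    return 63 - __builtin_clzll(mask);
}
```

On a C++20 toolchain, `std::countl_zero` from `<bit>` is an alternative that is well defined for zero (it returns the bit width of the type).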

#

so then

#

So the while loop before becomes bogus if I can check 64 things at a time

#

```cpp
for each mask in list {
  if mask == 0 { continue; }
  const auto free = __builtin_clz(mask);
  allocate_page(free);
  break;
}
```

distant lodge Aug 21, 2023, 11:11 PM

#

I wonder if there's SIMD versions of clz

wicked notch Aug 21, 2023, 11:11 PM

#

ah yes

#

loops? what are those

#

I only know 8192 bit wide vector instructions

#

```cpp
int mm256_lzcnt_si256(__m256i vec)
{
    __m256i   nonzero_elem = _mm256_cmpeq_epi8(vec, _mm256_setzero_si256());
    unsigned  mask = ~_mm256_movemask_epi8(nonzero_elem);

    if (mask == 0)
        return 256;  // if this is rare, branching is probably good.

    alignas(32)  // gcc chooses to align elems anyway, with its clunky code
    uint8_t elems[32];
    _mm256_storeu_si256((__m256i*)elems, vec);

//    unsigned   lz_msk   = _lzcnt_u32(mask);
//    unsigned   idx = 31 - lz_msk;          // can use bsr to get the 31-x, because mask is known to be non-zero.
//  This takes the 31-x latency off the critical path, in parallel with final lzcnt
    unsigned   idx = bsr_nonzero(mask);
    unsigned   lz_msk = 31 - idx;
    unsigned   highest_nonzero_byte = elems[idx];
    return     lz_msk * 8 + _lzcnt_u32(highest_nonzero_byte) - 24;
               // lzcnt(byte)-24, because we don't want to count the leading 24 bits of padding.
}
```

oh god what the fuck is this

distant lodge Aug 21, 2023, 11:16 PM

#

looks weird

wicked notch Aug 21, 2023, 11:21 PM

#

mfw gcc can't auto vectorize this shit because of control flow

#

alright that's enough bikeshedding the most optimal of clz instructions bleakekw

wicked notch Aug 21, 2023, 11:26 PM

#

wicked notch ```cpp for each mask in list { if mask == 0 { continue; } const auto free = ...

I shall remember this for tomorrow™️

wicked notch Aug 21, 2023, 11:44 PM

#

```cpp
// Convention: a set bit means the page is free, a cleared bit means allocated.
is_allocated(allocator, page) {
    index = page.index / 64;
    bit = page.index % 64;
    mask = allocator.list[index];
    return !(mask & (1ull << bit));  // 1ull: the shift must happen in 64 bits
}

deallocate(allocator, page) {
    index = page.index / 64;
    bit = page.index % 64;
    allocator.list[index] |= 1ull << bit;  // write back to the list, not a local copy
}

allocate(allocator, page) {
    for (mask& in allocator.list) {  // by reference so the cleared bit sticks
        if (mask == 0) { continue; }  // word is full, and clz(0) is undefined
        index = 63 - __builtin_clzll(mask);  // 64-bit variant of clz
        mask &= ~(1ull << index);
        page.memory = allocator.memory;
        page.offset = ...;
        return VkSparseMemoryBind(...);
    }
}

// requests[i] says whether page i is needed this frame
for (i in 0..requests.size) {
    if (requests[i] == REQUEST_PAGE_NEEDED) {
        if (!is_allocated(allocator, pages[i])) {
            updates.emplace_back(allocate(allocator, pages[i]));
        }
    } else {
        if (is_allocated(allocator, pages[i])) {
            deallocate(allocator, pages[i]);
        }
    }
}
```
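For reference, a compilable take on the bitmask allocator sketched above, keeping the chat's convention that a set bit means "free". It only tracks page indices and leaves out the Vulkan side (memory handles, offsets, sparse binds). `PageBitmask` and its members are hypothetical names, and `__builtin_clzll` assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

class PageBitmask {
public:
    explicit PageBitmask(uint32_t page_count)
        : words_((page_count + 63) / 64, 0), page_count_(page_count) {
        // Mark every real page free; padding bits in the last word stay 0
        // (allocated) so they can never be handed out.
        for (uint32_t page = 0; page < page_count; ++page) {
            deallocate(page);
        }
    }

    bool is_allocated(uint32_t page) const {
        assert(page < page_count_);
        return (words_[page / 64] & (1ull << (page % 64))) == 0;
    }

    void deallocate(uint32_t page) {
        assert(page < page_count_);
        words_[page / 64] |= 1ull << (page % 64);
    }

    // Claims one free page, or returns std::nullopt when every page is taken.
    std::optional<uint32_t> allocate() {
        for (uint32_t word = 0; word < words_.size(); ++word) {
            if (words_[word] == 0) {
                continue;  // word is full; also avoids the undefined clz(0)
            }
            const uint32_t bit = 63u - uint32_t(__builtin_clzll(words_[word]));
            words_[word] &= ~(1ull << bit);
            return word * 64 + bit;
        }
        return std::nullopt;
    }

private:
    std::vector<uint64_t> words_;
    uint32_t page_count_;
};
```

Within each 64-bit word this hands out the highest free bit first, so pages are not allocated in ascending order. That is harmless for residency tracking, but worth knowing when reading debug output.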

#

This looks very promising

#

bit ops, how lovely

glass sphinx Aug 21, 2023, 11:46 PM

#

wicked notch ```glsl layout (scalar, buffer_reference) restrict buffer b_page_table { uin...

that looks good to me

wicked notch Aug 21, 2023, 11:47 PM

#

nice, unfortunately I won't be using that 😦

#

Or well

#

all is to be seen

#

so good thing you confirmed it was good

#

@raven orchid you may wanna invest in bitops too

raven orchid Aug 21, 2023, 11:58 PM

#

Bitops?

#

Also trying to catch up on this

wicked notch Aug 21, 2023, 11:58 PM

#

ok so, my allocator's performance was not very good

#

I was originally using a first fit, free list allocator (for some goddamn reason, no wonder it was slow) to manage my virtual pages

#

turns out you can do a much better thing

#

you use a big ass bitmask to remember which pages are allocated and which are not, then when you allocate you simply go through the bitmask and find the first free slot, using a CPU intrinsic

#

deallocating is even easier, you take the index of the page within the bitmask and the index of the bit, and you set the bit back to 1

#

now this, is blazing fast

#

like nanoseconds blazing fast

raven orchid Aug 22, 2023, 12:00 AM

#

Ohhh now I see so this is specifically optimizing the problem of

#

How to find the next free slot fast

wicked notch Aug 22, 2023, 12:00 AM

#

yes exactly

distant lodge Aug 22, 2023, 12:01 AM

#

wicked notch mfw gcc can't auto vectorize this shit because of control flow

make it branchless

wicked notch Aug 22, 2023, 12:01 AM

#

I dunno how to make that branchless bleakekw

distant lodge Aug 22, 2023, 12:01 AM

#

make what specifically branchless

#

you can make all the aggregated bit checks branchless probably

wicked notch Aug 22, 2023, 12:01 AM

#

wicked notch ```cpp for each mask in list { if mask == 0 { continue; } const auto free = ...

this ig

distant lodge Aug 22, 2023, 12:02 AM

#

hmmm

#

you just need to do

#

actually nah

#

idk

#

I was thinking you could use a special sentinel value like (mask == 0) * sentinel + (mask != 0) * __builtin_clz(mask)

#

but you'd just be deferring the branch

raven orchid Aug 22, 2023, 12:03 AM

#

Hmmm

#

I think I’ll try this next if my current experiments blow up

wicked notch Aug 22, 2023, 12:04 AM

#

epic

#

did you ever get around to implementing the HZB

distant lodge Aug 22, 2023, 12:04 AM

#

https://en.wikipedia.org/wiki/SSE4
uhhh
there seems to be something for LZCNT and BSR

raven orchid Aug 22, 2023, 12:04 AM

#

No I fell into the caching rabbit hole

#

But I did increase the page group size to 32x32

wicked notch Aug 22, 2023, 12:05 AM

#

distant lodge <https://en.wikipedia.org/wiki/SSE4> uhhh there seems to be something for LZCNT ...

wot is this

distant lodge Aug 22, 2023, 12:05 AM

#

nvm, I think the scalar lzcnt just came in with SSE4; it's not that there's an SSE (vector) lzcnt

#

that's a rip

raven orchid Aug 22, 2023, 12:06 AM

#

wicked notch did you ever get around to implementing the HZB

Currently a page group still just maintains a min bounding box around invalid non-cached pages and checks against that only

distant lodge Aug 22, 2023, 12:06 AM

#

SSE4 is one of the later x86 isa extensions

wicked notch Aug 22, 2023, 12:06 AM

#

raven orchid No I fell into the caching rabbit hole

rip

distant lodge Aug 22, 2023, 12:06 AM

#

it's in AVX512 :^(

#

https://en.wikipedia.org/wiki/AVX-512

#

vplzcntd

raven orchid Aug 22, 2023, 12:07 AM

#

For the physical memory itself I lost track

#

Did you decide on a “never actually free full” strat or do you release memory sometimes?

distant lodge Aug 22, 2023, 12:08 AM

#

if you had a 32 core avx512 CPU you could do an allocation check in 1 cycle

wicked notch Aug 22, 2023, 12:08 AM

#

the current idea is to deallocate a page block when it's empty for a number of frames
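
A small C++ sketch of that release policy, under stated assumptions: `PageBlock`, `tick_block` and the frame threshold are hypothetical names and values, and "empty" is taken to mean that every slot in the block's 64-bit mask is free (all bits set).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-block bookkeeping; a set bit means "slot is free".
struct PageBlock {
    std::uint64_t free_mask = ~std::uint64_t{0};
    std::uint32_t empty_frames = 0;  // consecutive frames with no live pages
};

// Call once per frame. Returns true once the block has been empty for
// max_empty_frames consecutive frames and can be handed back to the driver.
inline bool tick_block(PageBlock& block, std::uint32_t max_empty_frames) {
    const bool empty = (block.free_mask == ~std::uint64_t{0});
    // Any allocation in the block resets the countdown.
    block.empty_frames = empty ? block.empty_frames + 1 : 0;
    return block.empty_frames >= max_empty_frames;
}
```

Waiting a few frames before freeing avoids releasing and immediately reallocating a block when page demand oscillates around a block boundary.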

distant lodge Aug 22, 2023, 12:08 AM

#

if only...

wicked notch Aug 22, 2023, 12:09 AM

#

distant lodge if you had a 32 core avx512 CPU you could do an allocation check in 1 cycle

weird way to say: "buy a fucking threadripper scrub"

distant lodge Aug 22, 2023, 12:09 AM

#

don't think those support avx512

#

do they?

wicked notch Aug 22, 2023, 12:09 AM

#

the latest arch does

#

the 7000 series I think

distant lodge Aug 22, 2023, 12:09 AM

#

simply buy one of those

wicked notch Aug 22, 2023, 12:10 AM

#

requirements to run my engine: "32 core threadripper (64 preferred) with AVX512"

#

KEKW

raven orchid Aug 22, 2023, 12:10 AM

#

Another question

#

Could your current alloc strategy be partially moved to the GPU? Like could compute essentially select the next free pages atomically and just write that as work for the cpu to

wicked notch Aug 22, 2023, 12:12 AM

#

uhh

#

let me think about it

#

ok yes

distant lodge Aug 22, 2023, 12:13 AM

#

if you can do it with this simd crap you could totally dispatch a compute shader to do this same operation

raven orchid Aug 22, 2023, 12:13 AM

#

I guess I’m thinking right now it’s probably
Gpu: I would like a page
Cpu: I will find next and alloc

Vs

Gpu: I need you to alloc this exact page
Cpu: ok

distant lodge Aug 22, 2023, 12:13 AM

#

it'd be cool too

wicked notch Aug 22, 2023, 12:13 AM

#

yes this is totally doable on the GPU now

#

the only thing I have to read back is the indices of the virtual pages I should update and the offsets into the device memory they should reside in

#

the only problem I see with this is when the block is completely full

raven orchid Aug 22, 2023, 12:15 AM

#

Oh true

#

That would be what

#

Block fault lol

wicked notch Aug 22, 2023, 12:15 AM

#

But I could issue an extra dispatch for that no prob

raven orchid Aug 22, 2023, 12:15 AM

#

Unless the gpu could somehow just do some kind of virtual block strategy

#

And the cpu would alloc the block followed by the actual requests lol

wicked notch Aug 22, 2023, 12:15 AM

#

man the world would be a better place if the GPU could allocate memory for itself

distant lodge Aug 22, 2023, 12:16 AM

#

it'd be cool if the CPU had some kind of stable handle to GPU stuff so you didn't have to readback

raven orchid Aug 22, 2023, 12:16 AM

#

Yeah 😦

distant lodge Aug 22, 2023, 12:16 AM

#

vkCmdAllocateIndirect

wicked notch Aug 22, 2023, 10:44 AM

#

18446744073709551615

#

18428729675200069631

#

stupid fast now

#

20usec on average

wispy spear Aug 22, 2023, 10:56 AM

#

how long does UE5's impl take?

wicked notch Aug 22, 2023, 10:57 AM

#

I dunno and I dun care bleakekw

#

This is different though

wispy spear Aug 22, 2023, 10:57 AM

#

hehe, fair enough

#

ah

wicked notch Aug 22, 2023, 10:57 AM

#

UE5 does it differently

#

But their code is so convoluted that I'm not basing it off UE at all

wispy spear Aug 22, 2023, 10:57 AM

#

makes sense

wicked notch Aug 22, 2023, 11:15 AM

#

For backup

wicked notch Aug 22, 2023, 12:02 PM

#

now the hard part

#

rendering this thing bleakekw

proven laurel Aug 22, 2023, 12:28 PM

#

wicked notch ```cpp const auto free = __builtin_clz(~bitmask);```

clz is part of the standard now btw.

distant lodge Aug 22, 2023, 12:33 PM

#

what's the standard one look like?

wicked notch Aug 22, 2023, 12:33 PM

#

proven laurel clz is part of the standard now btw.

ye I use that now

#

std::countl_zero

wicked notch Aug 22, 2023, 2:05 PM

#

oh no

#

oh no...

#

all hope is lost

#

it's so joever

#

sparse is slow

#

(shocking)

wispy spear Aug 22, 2023, 2:37 PM

#

wicked notch For [backup](<https://hastebin.com/share/cayivusoqa.cpp>)

wicked notch Aug 22, 2023, 4:11 PM

#

page table HZB is such a ridiculous idea

frank sail Aug 22, 2023, 4:15 PM

#

Impossible, it can't be done

wicked notch Aug 22, 2023, 4:21 PM

#

watch me do it anyways and fail miserably bleakekw

runic surge Aug 22, 2023, 7:44 PM

#

i believe lvstri

wicked notch Aug 22, 2023, 8:34 PM

#

I don't

wicked notch Aug 22, 2023, 9:27 PM

#

at least sync works now

#

timeline semaphores are fucking awesome

#

man this is so incredibly tedious

#

as it turns out when your entire renderer is hybrid sw/hw compute/mesh shader it isn't exactly trivial adding stuff bleakekw

wispy spear Aug 22, 2023, 9:30 PM

#

https://tenor.com/view/weasel-gif-5175477


wicked notch Aug 22, 2023, 9:30 PM

#

wispy spear https://tenor.com/view/weasel-gif-5175477

literally me rn

wispy spear Aug 22, 2023, 9:30 PM

#

ikr 😄

wicked notch Aug 22, 2023, 9:30 PM

#

perfectly depicts my current struggle bleakekw

wispy spear Aug 22, 2023, 9:30 PM

#

hehe

#

make sure to take a rest in between too

wicked notch Aug 22, 2023, 9:31 PM

#

yeah I'll leave this for tomorrow me

wispy spear Aug 22, 2023, 9:31 PM

#

NO EXCUSES!

#

oki

wicked notch Aug 22, 2023, 9:31 PM

#

surely tomorrow me will be far more intelligent

wispy spear Aug 22, 2023, 9:31 PM

#

sounds like my motto

frank sail Aug 22, 2023, 9:59 PM

#

Don't worry, I'll finally be at my computer tonight and my implementation will be perfect

#

Inshallah

wicked notch Aug 22, 2023, 9:59 PM

#

🙏

#

I shall await its completion habibi

wicked notch Aug 23, 2023, 1:30 PM

#

I am extremely unhappy with how I wrote my shaders, I kept thinking "eh this is fine for now"

#

Well, it isn't fine anymore bleakekw

#

and I have also determined that HLSL still isn't as good a drop-in replacement as I'd like it to be

#

so good old GLSL it is

#

hooray

wicked notch Aug 23, 2023, 1:50 PM

#

wicked notch so good old GLSL it is

Actually scratch this

#

I'm not convinced

#

GLSL sucks

#

sucks so bad

#

How much of the inline SPIR-V can I hide behind convenience functions?

glass sphinx Aug 23, 2023, 1:58 PM

#

wicked notch GLSL sucks

what dont you like about it

wicked notch Aug 23, 2023, 1:59 PM

#

functions

#

there are a million restrictions on functions, plus there are no templates which would come in handy for some stuff in my compute rasterizer

#

it's hard to generalize something when functions suck

glass sphinx Aug 23, 2023, 1:59 PM

#

since i use bda and texture handles i dont really mind it much

wicked notch Aug 23, 2023, 2:00 PM

#

I do too, but sometimes just using functions is nicer

glass sphinx Aug 23, 2023, 2:00 PM

#

i dont know what you mean

#

"just using functions"

wicked notch Aug 23, 2023, 2:01 PM

#

take a hypothetical check_hzb function

#

```glsl
struct X {
    // can't have this in a struct in GLSL
    sampler2D hzb;
    // other data
};

struct Y {
    image2D hzb;
    // other data
};

// here a template would be useful for the core part of the algorithm, taking in both X and Y
template <typename T>
vec4 get_projected_aabb_uvs(in T x) {
    ...
}

bool is_occluded(in X info) {
    // regular HZB sample check
    ...
}

bool is_occluded(in Y info) {
    // different things, page table HZB check for example
    ...
}```

glass sphinx Aug 23, 2023, 2:05 PM

#

well i have all textures virtual with handles

wicked notch Aug 23, 2023, 2:05 PM

#

all this would be possible in HLSL as far as I'm reading

glass sphinx Aug 23, 2023, 2:05 PM

#

so this is not a problem for me

wicked notch Aug 23, 2023, 2:05 PM

#

yeah your texture handles are great

#

I don't get how they work though bleakekw

glass sphinx Aug 23, 2023, 2:06 PM

#

the macros are pretty simple now

#

i generalized them back

#

so they arent typed anymore

#

just the accessor macros are

#

in shaders i mostly dont feel the need for hlsl too much yet

#

i started to dislike bda

#

because i dont have boundschecks

#

it just explodes

#

super annoying

#

so i wanna try hlsl byteaddressbuffer

wicked notch Aug 23, 2023, 2:10 PM

#

ehh it's not too bad

#

oh yeah that's nice

#

I kinda want to YOLO it and move to HLSL no questions asked

glass sphinx Aug 23, 2023, 2:11 PM

#

for example for rt i will use hlsl

#

cause nsight doesnt understand bda

#

so i couldnt debug it

wicked notch Aug 23, 2023, 2:11 PM

#

nsight doesn't even understand timeline semaphores

#

garbage

glass sphinx Aug 23, 2023, 2:11 PM

#

holy shit

#

honestly vulkan is really bad

#

dx has much better tooling sometimes

wicked notch Aug 23, 2023, 2:11 PM

#

I tried NV's own sample with timeline semaphores

#

and Nsight just hung

glass sphinx Aug 23, 2023, 2:11 PM

#

nv and amd also care more about dx i feel like

#

wtf

wicked notch Aug 23, 2023, 2:12 PM

#

ye but DX feels shite to use

#

plus windows.h

#

I dunno about DX, maybe I'll try it sometime in the future

#

but it's yucky

glass sphinx Aug 23, 2023, 2:13 PM

#

is it

#

it works in pix

#

another debugger at our whim

wicked notch Aug 23, 2023, 2:14 PM

#

hmm perchance

#

eh maybe

#

there's a 10% chance I'll try D3D12 in the near future

glass sphinx Aug 23, 2023, 2:15 PM

#

i dont think i ever will

#

way too much work

#

i can use vulkan already

wicked notch Aug 23, 2023, 2:16 PM

#

oh for sure

#

I'll probably never switch to D3D12 even if it turns out to be the greatest

#

Vk works fine for me

glass sphinx Aug 23, 2023, 2:18 PM

#

one of the few nice things in hlsl is that you can have multiple entry points in a file without macro spam

#

sometimes its also quite painful without templates

#

having member functions is also nicer to type

#

the syntax is much better in hlsl

wicked notch Aug 23, 2023, 4:01 PM

#

TODO: Read this carefully

#Iris - A Journey through OpenGL and beyond to learn Graphics