#Iris - A Journey through OpenGL and beyond to learn Graphics


glass sphinx
#

huh

wicked notch
#

uhh that was a message for myself btw KEKW

glass sphinx
#

lol

proven laurel
#

Everything except writing out mul as an actual function

glass sphinx
#

its fine

proven laurel
#

yeah it's fine but it does clutter it a bit

#

given that this is my biggest complaint about HLSL, it is a much better shading language

#

also: templates

#

Bindless in HLSL is also so much more of a joy than in GLSL

glass sphinx
#

its not tho

#

syntactically is basically the same

#

except for buffers being much worse

#

you can make macros generating the boilerplate in glsl

wicked notch
#

it's ok, we have byte address buffer

glass sphinx
#

yep

#

and for reads of globals and such also structured buffer

#

but read only

wicked notch
#

after a long and hard pondering of my orb

#

I have decided to stay with GLSL

#

it was a hard decision

#

the dealbreaker was the dxc header API being windows only (among other things of course)

#

or well, technically it should be supported on linux too?

#

but it's a hassle bleakekw

frank sail
#

least fucked shading lang situation

wide shadow
#

I just wanted to say that

wicked notch
#

yeah it seems like you can compile it

#

not sure how much of a hassle compiling it would be

#

should I try just for the memes?

glass sphinx
#

gabe made a vcpkg magic thingy you can use

wispy spear
#

fucking vcpkg

glass sphinx
#

so you don't need to build it

#

gabe is a magic man

wispy spear
#

you need 10 different tools to get shit compiled to get shit done

glass sphinx
#

you only need vcpkg cmake and ninja

wispy spear
#

and a compiler and an ide

glass sphinx
#

the holy trinity

glass sphinx
#

but vcpkg is necessary

wicked notch
#

I uh

#

think I just realized how fucked I am

#

I forgot that I had mandatory internships to do this year

wispy spear
#

👉 👌

wicked notch
#

so I just applied for an external academy, thinking "oh it would be fine"

#

so I have to work 2 jobs basically now, but that's not all

#

I thought that I would be able to skip some exams due to my high credits

wispy spear
#

if that means you have to take a break from frogite, illallowit.gif

frank sail
#

Mandatory internship to graduate 😦

wicked notch
#

as it turns out (I didn't know this until now), I cannot actually do that, they changed it last year

wispy spear
#

fook

frank sail
#

You need to talk to a counselor

wicked notch
#

therefore, I am fucked 💀

wicked notch
frank sail
#

Tell them you're working on something more important (frogfood)

wicked notch
#

I don't believe the human body was made to sleep 1 hour per week

wispy spear
#

lustri, you got sis

wicked notch
#

granted this is my fault but KEKW

wispy spear
#

still

#

shit happens sometimes

delicate rain
#

These things are always scary, but what I found is there is a lot of leeway you can get if you ask

frank sail
#

But you won't know if you don't try

wispy spear
#

counselors are also just frogs like you

wicked notch
#

ye I gotta talk with the department's head

wispy spear
#

we can write you a reference letter, if need be

wicked notch
#

because it's stupid, since last year if you accumulated a lot of credits, you could've skipped half of the internship (150 hours instead of 300) and one 6 credit exam of my choice

#

so I was working with this assumption in mind

delicate rain
#

Think about it this way, they don't want to get rid of smart ppl and you definitely are very smart, I'm sure you will find a way 💪

wicked notch
#

yeah this is a very roundabout way of me telling the VSM dream team™️ that I won't be able to work on it for a while bleakekw

wispy spear
#

#1090390868449558618 message

delicate rain
#

My semester starts soon too, we are in it for the long run

wispy spear
#

assuming you work 20hrs a week, that means 8 weeks off of frogiteisms

wicked notch
#

it's gonna be 4hrs mon-sat

#

so 24hrs/week

wispy spear
#

ah even better

#

interns are not allowed to work more than 20 in 🇩🇪

delicate rain
#

What a dream land

wicked notch
#

ye, the saturday is stupid

#

smh

wispy spear
#

still roughly 8 weeks 🙂 we will keep this channel alive

wicked notch
#

you do be forgetting something though

#

the academy I also applied to

#

that's another 20 hours

#

except uh

wispy spear
#

well

wicked notch
#

it's 9 months 💀

wispy spear
#

ah

delicate rain
#

Who needs sleep anyways

wispy spear
#

time will fly

#

it always do

wicked notch
#

I sure as hell won't be bored

wispy spear
#

MAYBE by the time you are back i can proudly show you how i got CSM working xD

distant lodge
#

just start rambling about memory paging and they'll get the idea

#

they'll know you're doing something important

wispy spear
#

you are basically Lustri Sweenie

#

Reinventing UE

wicked notch
#

if we still had covid I would've been able to WFH

#

ahh, why must we be so archaic

wispy spear
#

work from home is the wrong term mate

#

its called remote work

delicate rain
#

Recorded lectures are the best thing to happen since idk what

#

So naturally unis are trying to suppress them

#

At any cost

wicked notch
#

they won't succeed

#

I have my entire 2nd year recorded

#

just audio though

#

rip

delicate rain
#

The student black market is very strong

wicked notch
#

it's still very useful

#

Thanks to my tablet I can take notes at supersonic speed

delicate rain
#

It's like 2-3x as efficient to study recorded lectures

wicked notch
#

so my workflow has been go to class -> record and take notes -> go home and review for 1 or 2 hours

#

it works tremendously well

delicate rain
#

That is very diligent

#

I never could do that

wispy spear
#

i never studied, i got my a levels and started working after military training

delicate rain
#

I usually just fool around the entire semester and then study 10 hours a day for a month

wispy spear
#

but im also stupid af now when it comes to maffs and GP 😛

delicate rain
#

That is also true, I learn everything in a week but forget it in a week too

frank sail
wispy spear
#

ha almost

#

i became an admin because i wanted change and was able to communicate it 🙂

proven laurel
#

are they actually

#

because at my uni they are effectively permanent now for most lectures

wicked notch
#

heh

#

here it's "NOOOOOOO how dare you suggest we keep recording lectures, do you know we've done this for hundreds of years like this!!!! we must keep it like god intended!!!"

#

"there are studies suggesting recorded lectures actually kill productivity!!!"

proven laurel
#

for me it's harder to focus but at the very least losing focus is not as bad because you can just repeat the part

delicate rain
proven laurel
#

I mean from a lecturer perspective I do get it

#

doing a lecture for 5 people sucks

#

But when online is preferred there is usually a good reason for it

#

it's not plainly about comfort. In my case it's to avoid a ~1 hour commute (that's just to uni, 1 additional hour is back home)

delicate rain
#

Yeah... So dont do it, imagine how much more time you'd have for individual consultation etc

glass sphinx
#

maybe the lecturers suck

proven laurel
#

so imho it's the institutions that should fix the underlying issues that they can fix. Ofc. they cannot resettle students but there's definitely problems that they can fix. KEKW

delicate rain
#

I like recorded because I can 2x through the lecture

glass sphinx
#

all the good lectures immediately were full again in my uni time

proven laurel
#

very slow

delicate rain
glass sphinx
#

@delicate rain code

delicate rain
#

Ok dad

proven laurel
#

most courses here have pretty good attendance from what I know though

raven orchid
#

Trying to build a mental timeline, is it basically between now and 9 months from now your schedule is on fire?

wicked notch
#

20th of september and 9 months from then and I die yes

#

I am expected to come back to life in June 2024

#

I'll try to find blank spots (very hard, KEKW) to do some VSMisms

wispy spear
#

we will be here, when you are back 🙂

proven laurel
wispy spear
#

skool and half assed attempt at interning

proven laurel
#

so suffering

wicked notch
#

basically

wispy spear
#

@wicked notch i see you happily chatting away in #hardware ... wasnt there an exam today?

wicked notch
#

tomorrow

wispy spear
#

indeed, i thought 7th was today already

wicked notch
#

It's an ez exam

wispy spear
#

hehe

wicked notch
#

no sweat

wispy spear
#

right, that sounded totally different the other day

wicked notch
#

Just a bit of fear mongering for myself

#

to get my dumb brain to study

delicate rain
#

If you don't go through three stages of panic - extremely calm - panic - ... before an exam you are not doing it right

wicked notch
#

fr

#

the day before and up to 5 minutes until after the exam I am at peace

wispy spear
#

i wish you best of luck nonetheless

wicked notch
#

5 minutes and 1 nanoseconds after the exam my heartrate spikes back up to 300

wicked notch
#

Them exams don't be stopping

#

15th next

#

I am tired boss

delicate rain
#

You got this mister, how many more?

wicked notch
#

uh

#

I'm not even sure nervous

#

3 more in september only

delicate rain
#

Still weird that you got exams this late

wicked notch
#

I missed the early dates

#

or rather, I physically couldn't do them at those dates

#

Unless some of you have a method of being in two different places at the same time?

wispy spear
#

tell your electrons to figure out superpositions, ezpz

delicate rain
#

You need the time machine from Harry Potter 3

wicked notch
#

professors conveniently forget that the literal system they use to schedule exams warns them when overlapping time and dates arise

#

They gotta be doing it on purpose

runic surge
#

it's a ploy to stop lvstri from finishing its engien

delicate rain
#

Epic got a sniff of his tech and started shaking

distant lodge
#

Epic sending agents to your university to make your tests extra hard

#

it's a conspiracy

twin bough
#

They are watching u from the shadows with great interest

proven laurel
#

I have an exam as well on the 15th KEKW

pale horizon
#

I thought he meant "the 15th exam" and was like "is this getting a degree any% WR attempt?" KEKW

runic surge
#

Ez

wicked notch
#

My brain feels 80% lighter

#

but all exams have been done

#

I will wait 2 days for my brain to recover, then it's full speed ahead with VSM

wicked notch
twin bough
#

btw @wicked notch since you are here i have been meaning to ask ( i havent implemented visbuffers at all atm ): how does it play with parallax occlusion?

#

if you tried making them work together

#

also congrats on ur exams 😄

wicked notch
#

Hmm I've never done parallax occlusion mapping but from LearnOpenGL I recall it being a shrimple uv interpolation in the shading pass

#

so it should be easy enough

twin bough
#

it needs ddx/ddy for the texgrad lookup tho

wicked notch
#

yeah you can use analytical derivatives or finite differences

#

lemme get the references™️

twin bough
#

gotcha

#

thanks

wicked notch
#

we are officially back in business

#

time to refactor every single shader

wispy spear
#

that was a kwik 3 quarters of a year 😄

#

time flies

wicked notch
#

I am completing this before the 25th

#

even if it costs me everything

wispy spear
#

please dont burn yourself out my spaghetti friend

wicked notch
#

ok so I may or may not have fucked up the git repo

#

I hate it so much when git doesn't do what you tell it to do

#

me: please update this submodule from the remote
git: ok
(doesn't do anything at all, directory still empty)

#

git reset --hard did the trick

#

alright it took me 40 minutes but I have unfucked the fuckery

#

No idea what happened, git being git sometimes I guess

#

Or more likely a skill issue on my part

wispy spear
#

git submodule update --init --recursive

pale horizon
#

I added a bash script to do it. I always forget this command. So unintuitive

wispy spear
#

yeah git submodule update should do all the necessary things already by default

#

if you want to update a single submodule then you specify that somehow

wicked notch
#

I tried fetching and pulling myself with no luck

#

then with a git reset it worked

#

idk why the tree was pointing at some random ass location

wispy spear
#

hmm then there is also git submodule foreach git pull origin main/masterdependsonwhatprojectisusingwhat

#

but yeah its a pain in the ass

frank sail
wicked notch
#

This is so nice

#

all the things I had to trial and error myself

#

Imma search chinese forums from now on KEKW

wicked notch
#

ok so

#

I shall define stuff

#
  1. We want to use a light projection centered at 0 to get stable addressing (for a given rotation the same position in world space will give me the same address)
#
  2. Since we want to use a light projection centered at 0, we cannot just use the virtual UV as is, we have to wrap them around
#
  3. Given the above, for each clipmap, we draw using the translated light projection (and have it shift gradually, invalidating the new regions)
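
A minimal CPU-side sketch of the wrap-around addressing from the three points above, written in C++ so it can be tested without a GPU. The names `wrap_index` and `page_address`, the 128-pages-per-row count, and the `page_world_size` parameter are assumptions made for illustration; none of them come from the actual shaders.

```cpp
#include <cmath>
#include <cstdint>

// Assumed for this sketch: 128 virtual pages per row.
constexpr int kPagesPerRow = 128;

// Wraps any (possibly negative) page coordinate into [0, kPagesPerRow).
inline int wrap_index(int i) {
    const int m = i % kPagesPerRow;
    return m < 0 ? m + kPagesPerRow : m;
}

// Maps a position in light space (projection centred at 0) to a flat page address.
// The clipmap origin never enters the computation, so the same light-space
// position always yields the same address, wherever the clipmap window sits.
inline std::uint32_t page_address(float light_x, float light_y, float page_world_size) {
    const int px = wrap_index(static_cast<int>(std::floor(light_x / page_world_size)));
    const int py = wrap_index(static_cast<int>(std::floor(light_y / page_world_size)));
    return static_cast<std::uint32_t>(px + py * kPagesPerRow);
}
```

Positions that are a whole number of clipmap widths apart alias to the same address, which is why the regions that newly enter the window have to be invalidated.
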
raven orchid
#

I’ll put it here lol

#

With this one you can imagine the red box is the old (prev frame) projection view render which gave us [-1,1]

#

The arrow is basically a motion vector in ndc space

#

So if we have coordinates from the previous frame in [-1,1] and we apply the motion vector, anything that goes outside that range is getting wrapped around to the other side this frame

#

Meaning we can mark it dirty

#

But anything that stays inside the valid range after the motion vector is applied was visible last frame before the shift so we can skip it

wicked notch
#

What does this check look like in GLSL?

raven orchid
#

Let me switch to desktop lol

raven orchid
#
```glsl
vec2 ndc = vec2(2 * localPixelCoords) / vec2(vsmSize) - 1.0;
// Apply motion vector to NDC value
// cascadeNdcClipOriginChange = motion vector in NDC space
vec2 ndcChange = ndc - cascadeNdcClipOriginChange;

vec2 virtualUvCoords = convertLocalCoordsToVirtualUvCoords(
    vec2(localPixelCoords),
    vec2(vsmSize),
    cascade
);

vec2 virtualPixelCoords = wrapIndex(virtualUvCoords, vec2(vsmSize));

ivec2 physicalPageCoords = ivec2(wrapIndex(virtualUvCoords, vec2(numPagesXY)));
uint physicalPageIndexFlat = uint(physicalPageCoords.x + physicalPageCoords.y * numPagesXY + cascade * cascadeStepSize);

uint frameMarker = unpackFrameMarker(
    pageTable[physicalPageIndexFlat].info
);

uint dirtyBit = unpackDirtyBit(
    pageTable[physicalPageIndexFlat].info
);

// frameMarker > 0 means it has been requested
if (frameMarker > 0) {
    performBoundsUpdate = true;

    // If moving this pixel from previous to current NDC goes out of the [-1, 1] range or is on the edge,
    // it was not visible last frame before the origin shift and will be wrapped around to
    // the other side
    if (dirtyBit > 0 || ndcChange.x <= -1 || ndcChange.x >= 1 || ndcChange.y <= -1 || ndcChange.y >= 1) {
        pageDirty = true;
    }
}
```
wicked notch
#

the double questionmark worries me bleakekw

raven orchid
#

same honestly

raven orchid
#

because it starts with local pixel coords and their ndc is always [-1, 1]. So it uses the motion vector and then basically asks "does applying the motion vector result in out of bounds, meaning it has become dirty?"

wicked notch
#

does this happen during the clear stage or the allocation stage

raven orchid
#

this one is after allocation and just before the clear

#

it's marking screen regions as cached or dirty so that clear knows what to do

wicked notch
#

aight

#

so allocate -> mark dirty -> clear -> draw -> repeat

raven orchid
#

yeah, and if you include depth analyze then

#

depth analyze -> allocate -> mark dirty/cached -> clear -> draw -> repeat

wicked notch
#

alright

#

I now have a path

#

thank you

wicked notch
#

mfw all of today was spent refactoring shaders 💀

#

At least I am happy with the shader structure now

#

In the end I didn't manage to do it before the 25th rip

#

tomorrow I gotta do 9-18 between uni classes and internship froge_bleak

#

Also review when I get home

wispy spear
#

no excuses needed ❤️

pale horizon
wicked notch
#

lads I am back home

wispy spear
#

wb my frog

wispy spear
#

nihao ma

wicked notch
#

Ready to translate more Chinese blogposts bleakekw

frank sail
wicked notch
#

ye it's pinned here somewhere

#

I read the first few sections and then died inside bleakekw

#

newfound resolve is necessary

frank sail
#

VSMs first I guess hehe

wicked notch
#

at least my small brain can understand a bit of VSM

#

although some have reached out to me in my DMs talking about graph partitioning, I should thank them all and get back to work soon™️

#

one left the server sadly, not sure why

wicked notch
#
```glsl
vec2 page_corner_from_uv(in vec2 uv) {
    return (ivec2(uv * VSM_VIRTUAL_PAGE_WIDTH) / vec2(VSM_VIRTUAL_PAGE_WIDTH));
}

vec2 virtual_uv_to_physical(in vec2 virtual_uv) {
    const uvec2 virtual_page_position = uvec2(virtual_uv * VSM_VIRTUAL_PAGE_WIDTH);
    const vec2 virtual_page_corner = page_corner_from_uv(virtual_uv);
    const vec2 virtual_page_offset = virtual_uv - virtual_page_corner;
    // Scalar ratio; GLSL does not implicitly convert a float into a vec2
    const float scale_virtual_uv = float(VSM_VIRTUAL_PAGE_WIDTH) / float(VSM_PHYSICAL_PAGE_WIDTH);
    const uvec2 physical_page = u_page_table.data[virtual_page_position.x + virtual_page_position.y * VSM_VIRTUAL_PAGE_WIDTH];
    const vec2 physical_page_corner = vec2(physical_page) / VSM_PHYSICAL_PAGE_WIDTH;
    return vec2(physical_page_corner + virtual_page_offset * scale_virtual_uv);
}
```
#

I invoke the VSM gang to check whether I am going insane or not

raven orchid
#

what are VSM_VIRTUAL_PAGE_WIDTH and VSM_PHYSICAL_PAGE_WIDTH set to btw

wicked notch
#
```glsl
#define VSM_VIRTUAL_BASE_SIZE 16384
#define VSM_VIRTUAL_PAGE_SIZE 128
#define VSM_VIRTUAL_PAGE_WIDTH (VSM_VIRTUAL_BASE_SIZE / VSM_VIRTUAL_PAGE_SIZE)
#define VSM_VIRTUAL_PAGE_COUNT (VSM_VIRTUAL_PAGE_WIDTH * VSM_VIRTUAL_PAGE_WIDTH)
#define VSM_PHYSICAL_BASE_SIZE 8192
#define VSM_PHYSICAL_PAGE_SIZE 128
#define VSM_PHYSICAL_PAGE_WIDTH (VSM_PHYSICAL_BASE_SIZE / VSM_PHYSICAL_PAGE_SIZE)
#define VSM_PHYSICAL_PAGE_COUNT (VSM_PHYSICAL_PAGE_WIDTH * VSM_PHYSICAL_PAGE_WIDTH)
```
wicked notch
raven orchid
#

so does this mean

#

are the physical pages themselves still 128x128 texels? just you have 128x128 virtual pages but only 64x64 physical pages?

wicked notch
#

all the pages are 128^2

#

but the backing texture is only 8192 while the VSM is 16k

#

therefore it has less pages in a row

raven orchid
#

u_page_table.data[virtual_page_position.x * VSM_VIRTUAL_PAGE_WIDTH + virtual_page_position.y];
is this supposed to be virtual_page_position.x + virtual_page_position.y * VSM_VIRTUAL_PAGE_WIDTH?
or maybe the data is packed differently than what I used

wicked notch
#

uh yeah KEKW

raven orchid
#

hmm I'm trying to plug some stuff in for virtual_uv - virtual_page_corner but I'm getting some weirdness

#

the min value I can get is 0 and max is 1/128

#

trying to figure out if that's a me problem

#

or maybe it's not weird at all........

#

actually not weird at all. I think this code might be right!

wicked notch
#

The point is to get the offset from the corner of a virtual page and scale it to an offset for sampling the physical page
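
To plug numbers in without a GPU, here is a rough C++ port of the same idea, assuming the sizes from the `#define`s above (128 virtual pages per row, 64 physical pages per row). `Vec2`, `PageCoord` and the `std::vector` page table are stand-ins invented for this sketch, and `uv` is assumed to lie in [0, 1).

```cpp
#include <cstdint>
#include <vector>

constexpr int kVirtualPagesPerRow  = 16384 / 128; // 128
constexpr int kPhysicalPagesPerRow = 8192 / 128;  // 64

struct Vec2 { float x, y; };
struct PageCoord { std::uint32_t x, y; };

// page_table[x + y * kVirtualPagesPerRow] holds the physical page coordinate
// backing virtual page (x, y). This layout is an assumption of the sketch.
inline Vec2 virtual_uv_to_physical_uv(Vec2 uv, const std::vector<PageCoord>& page_table) {
    const int vx = static_cast<int>(uv.x * kVirtualPagesPerRow);
    const int vy = static_cast<int>(uv.y * kVirtualPagesPerRow);
    // Offset inside the virtual page, in virtual UV units: [0, 1/128).
    const float off_x = uv.x - static_cast<float>(vx) / kVirtualPagesPerRow;
    const float off_y = uv.y - static_cast<float>(vy) / kVirtualPagesPerRow;
    // A page spans 1/128 of virtual UV space but 1/64 of physical UV space.
    const float scale = static_cast<float>(kVirtualPagesPerRow) / kPhysicalPagesPerRow;
    const PageCoord p = page_table[vx + vy * kVirtualPagesPerRow];
    return { static_cast<float>(p.x) / kPhysicalPagesPerRow + off_x * scale,
             static_cast<float>(p.y) / kPhysicalPagesPerRow + off_y * scale };
}
```

With virtual page (1, 2) mapped to physical page (3, 5), the centre of the virtual page should land on the centre of the physical page, i.e. (3.5/64, 5.5/64).
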

raven orchid
#

yeah

#

i'll probably stare at it some more but the more I plug numbers in the more it seems to be 100% right

frank sail
wicked notch
#

width is a garbage name

#

both of them are 128x128

#

the difference is the number of pages in a row

frank sail
#

Oh this is the texture size?

wicked notch
#

no actually

#

it's the number of pages in a row

#

so for example 16384 / 128

frank sail
#

Alright

wicked notch
#

I need that to do funni indexing

frank sail
#

I had so many places in my code where I conflated the number of pages per row in the VSM with the size of a page because they were both 128

wicked notch
#

yeah

#

whoever invented math just had to make 16384 / 128 be equal to 128

wicked notch
#

ok I am back where I started

#

with a luxury structure to my code if I may say

#

compared to what I had previously at least (absolutely zero structurefroge_bleak)

frank sail
#

I spent a long time on the vsm abstraction before I even started implementing it

wicked notch
#

yeah makes sense

#

I can work much faster now (and spend less time going back and forth changing random shit in random shaders because it's all one big blob)

#

something something, weeks of writing can save hours of thinking

#

to be fair, thinking is hard, my poor froge-like brain can't do it

wicked notch
#

Hmmmmmmm

wispy spear
#

whats on your mind : )

wicked notch
#

I be pondering

#

Whether this is enough

```cpp
// bits [31:20]: block index (used only in the virtual allocator)
// bits [19:6]:  page index
// bits [5:0]:   bit index
using sparse_allocation = uint32;

static auto encode_sparse_allocation(uint32 block, uint32 page, uint32 bit) noexcept -> sparse_allocation {
    return
        (block & 0x0fff) << 20 |
        (page & 0x3fff) << 6 |
        (bit  & 0x003f) << 0;
}

static auto decode_sparse_allocation(const sparse_allocation& alloc) noexcept -> std::tuple<uint32, uint32, uint32> {
    return std::forward_as_tuple(
        (alloc >> 20) & 0x0fff,
        (alloc >> 6) & 0x3fff,
        (alloc >> 0) & 0x003f
    );
}
```
#

I see mr. Jaker and mr. JS encoding a lot of stuff into their page table entry

#

Setting aside the residency status and valid/dirty status that I will need as well

#

I wonder what "X offset" and "Y offset" are for

#

I should be able to reconstruct the physical page's lower left corner by just using (page * 64 + bit) / 128 and (page * 64 + bit) % 128

wispy spear
#

but?

wicked notch
#

idk

#

I feel like I'm missing something

wispy spear
#

you could try some unit tests

#

and see if you can calculate those values back and forth into the right format/encoding
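
A round-trip check along those lines could look like the sketch below. It re-declares the encode/decode pair with `std::uint32_t` so it compiles on its own, using the same shifts and masks as the snippet above; `round_trips` is a hypothetical helper added only for the test.

```cpp
#include <cstdint>
#include <tuple>

using sparse_allocation = std::uint32_t;

// Same layout as the snippet above: bits [31:20] block, [19:6] page, [5:0] bit.
constexpr sparse_allocation encode_sparse_allocation(std::uint32_t block,
                                                     std::uint32_t page,
                                                     std::uint32_t bit) noexcept {
    return (block & 0x0fffu) << 20 | (page & 0x3fffu) << 6 | (bit & 0x003fu);
}

inline std::tuple<std::uint32_t, std::uint32_t, std::uint32_t>
decode_sparse_allocation(sparse_allocation alloc) noexcept {
    return std::make_tuple((alloc >> 20) & 0x0fffu,
                           (alloc >> 6) & 0x3fffu,
                           alloc & 0x003fu);
}

// Hypothetical helper: true when every field survives an encode/decode round trip.
inline bool round_trips(std::uint32_t block, std::uint32_t page, std::uint32_t bit) {
    return decode_sparse_allocation(encode_sparse_allocation(block, page, bit)) ==
           std::make_tuple(block, page, bit);
}
```

Values that do not fit their field (for example a block index of `0x1000`) are silently masked, so the round trip fails for them; that is the case worth asserting on.
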

wicked notch
#

ye my unit tests right now are opening a python console and throwing random numbers inside bleakekw

#

I should really invest in better debugging tools

wispy spear
#

: )

#

meanwhile i try to make sense of my Fuk and see if i can make sense of the things i used in there

#

it helped to have enginekit made before with its pipelines and shit

wicked notch
#
```glsl
uvec3 decode_page_entry(in uint page) {
    return uvec3(
        (page >> 20) & 0x0fffu,
        (page >> 6) & 0x3fffu,
        (page >> 0) & 0x003fu
    );
}

// physical page llc reconstruction
const uvec3 page_entry = decode_page_entry(b_page_table.data[virtual_page]);
const uint physical_page_index = page_entry.y * 64 + page_entry.z;
const uint physical_page_x = physical_page_index % VSM_PHYSICAL_PAGE_ROW_SIZE;
const uint physical_page_y = physical_page_index / VSM_PHYSICAL_PAGE_ROW_SIZE;
const vec2 physical_page_corner = vec2(
    physical_page_x / float(VSM_PHYSICAL_PAGE_ROW_SIZE),
    physical_page_y / float(VSM_PHYSICAL_PAGE_ROW_SIZE));
```
#

it looks good from good ole python

#
```glsl
vec2 translate_virtual_uv_to_physical_uv_offset(in vec2 virtual_uv) {
    const uvec2 virtual_page_position = uvec2(virtual_uv * VSM_VIRTUAL_PAGE_ROW_SIZE);
    const vec2 virtual_page_corner = virtual_page_position / float(VSM_VIRTUAL_PAGE_ROW_SIZE);
    const vec2 virtual_page_offset = virtual_uv - virtual_page_corner;
    return virtual_page_offset * VSM_VIRTUAL_TO_PHYSICAL_UV_SCALE;
}
```
better name for this function? KEKW
wispy spear
#

yeah

#

virtual_uv_to_physical_uv_offset is also ok

#

or virtual_to_physical_uv_offset

delicate rain
#

Although I have no clue what bit is for

wicked notch
#

My page table is basically this

```cpp
std::array<uint64, P / 64>
```
#

Where each bit in a uint64 represents a page

#

it indicates whether the page is resident or not
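
A small sketch of such a residency bitset, assuming 64 x 64 physical pages. `ResidencyTable` and its member names are made up for illustration; the word index and the bit inside the word correspond to the "page" and "bit" fields of the packed allocation, so the flat page number is `page * 64 + bit`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed page count for this sketch: 64 x 64 physical pages.
constexpr std::size_t kPageCount = 64 * 64;

struct ResidencyTable {
    // One bit per page, 64 pages per word.
    std::array<std::uint64_t, kPageCount / 64> words{};

    bool is_resident(std::size_t page) const {
        return (words[page / 64] >> (page % 64)) & 1u;
    }

    void set_resident(std::size_t page, bool resident) {
        const std::uint64_t mask = std::uint64_t{1} << (page % 64);
        if (resident) {
            words[page / 64] |= mask;
        } else {
            words[page / 64] &= ~mask;
        }
    }
};
```
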

delicate rain
#

Aha I see - I have a bit (haha bit) different approach. My page table is a u32 texture and I store all my info there - some bits for meta info and then mem block idx, and xy offset inside mem block

#

Interesting though - how do you convert your VSM uvs to the index in this buffer?

wicked notch
#

why the xy offset btw, can't you reconstruct the page offset from the texel you use to fetch the page?

delicate rain
#

XY is offset into physical memory

wicked notch
# delicate rain Interesting though - how do you convert your VSM uvs to the index in this buffer...
```glsl
const vec2 virtual_shadow_uv = fract(shadow_position.xy * 0.5 + 0.5);
const ivec2 virtual_page_index = ivec2(virtual_shadow_uv * vec2(VSM_VIRTUAL_PAGE_ROW_SIZE));
if (all(lessThan(virtual_page_index, ivec2(VSM_VIRTUAL_PAGE_ROW_SIZE))) && all(greaterThanEqual(virtual_page_index, ivec2(0)))) {
    page_request_ptr.data[virtual_page_index.x * VSM_VIRTUAL_PAGE_ROW_SIZE + virtual_page_index.y] = uint8_t(1);
}
```
delicate rain
#

Might be interesting to store the residency bit in a u32 texture, you could then use the uvs directly to fetch and decode without flattening

#

I was thinking about having a separate residency texture but then was too lazy to make it that way, so now I just pack everything into the entry in vsm page table

delicate rain
delicate rain
wicked notch
#

No you're good

#

I'm dumb

#

Storing the xy offset and the flattened page index is equivalent

#

for some reason my brain couldn't comprehend that

delicate rain
#

I see, yeah you get the same range and everything

raven orchid
#

So like if the physical memory was 8192 then offset x,y would range from 0 to 63
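
The equivalence is easy to check on the CPU. A sketch assuming an 8192-texel physical texture with 128-texel pages (64 pages per row); `PageXY`, `flatten` and `unflatten` are hypothetical names:

```cpp
#include <cstdint>

constexpr std::uint32_t kPhysicalPagesPerRow = 8192 / 128; // 64

struct PageXY { std::uint32_t x, y; };

// (x, y) page offset -> flat page index, row by row.
constexpr std::uint32_t flatten(PageXY p) {
    return p.x + p.y * kPhysicalPagesPerRow;
}

// Flat page index -> (x, y) page offset; both components end up in [0, 63].
constexpr PageXY unflatten(std::uint32_t index) {
    return PageXY{ index % kPhysicalPagesPerRow, index / kPhysicalPagesPerRow };
}
```

Both forms need 12 bits here (6 + 6 for x and y, or one 12-bit index), so neither saves space in the page table entry.
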

wicked notch
#

makes sense you have to add something else to get the actual texel to sample right?

raven orchid
#

Yeah the page table entry can only get me the lower left texel of the physical page

wicked notch
#

how do you get the texel offset to add to the lower left corner?

raven orchid
#

So then it becomes lower left + offset computed from virtual coord

wicked notch
#

mod?
where exactly thonk

raven orchid
#

Modulus yeah

#

Mod(virtual pixel, 128) I think it was

wicked notch
#

ah

#

damn my brain is extra smooth today

raven orchid
#

I think we’re pretty much doing the same thing just with a little different approach haha

wicked notch
#

yeah but I just realized I can't shrimply use texture

raven orchid
#

I packed my page table like that just so I could store a lot of data without going over 32 bit per page table entry

wicked notch
#

so I need to use imageLoad & Store & friends

raven orchid
#

Dang yeah, wait are you using texture for page table or ssbo?

wicked notch
#

No, I realized while I was writing that it was impossible bleakekw

frank sail
wicked notch
#

address = index?

frank sail
#

The backed bit is basically redundant since I could use a sentinel address

frank sail
wicked notch
#

do you not need which block of memory the page is

#

uh I guess we don't

frank sail
#

There is no such concept

#

I have one thicc allocation

wicked notch
#

yeah I could just append to the page table

#

offset galore

delicate rain
#

It's the same again isn't it? You get the same range with one big chungus block or more smaller blocks

wicked notch
#

yeah

#

I could steal bits from the "memory block" portion to check whether it's dirty/visible or not

#

right now I reserve 12 bits for the memory block KEKW

frank sail
#

You could use real sparse for the backing memory and treat it like one giant block smart

#

Actually could work even with the pathetically slow API if you only do big chunks

distant lodge
#

real fake memory

wicked notch
#

no it couldn't

#

the driver has to do magic things, whether you do a few huge blocks, or many small blocks

#

it's the most garbage thing that exists

#

I dunno how it's not fixed yet

frank sail
#

Rip in pepperoni

wispy spear
#

im slightly puzzled what you guys try to do now, saky posting a video with solved addressing... jaker posted video which looks like the addressing is also solved

wicked notch
#

I'm playing catchup

#

I proposed this thing and I'm the one who's the farthest behind bleakekw

frank sail
wicked notch
#

we need a bicycle containment unit (aka bikeshed)

distant lodge
#

the bicycle halfway house

frank sail
#

the bicycle crackhouse

wispy spear
#

😄

#

i liked sakys terrain video thingy quite a lot

delicate rain
#

Thank you! I'm conflicted how I should continue now though

wispy spear
#

you share your wisdom with the 2 frogs

delicate rain
#

I thought allocators are done for everyone, is that not true?

wicked notch
#

I'll tell you once I have something drawn bleakekw

frank sail
#

My allocator is still stupid style, but it may work forever

wicked notch
#

but generally mine is good

#

and equivalent to Jaker's, except on the CPU

wispy spear
#

sounds like you should produce a write up, saky

#

API agnostic 😉

delicate rain
#

I should create my own community-project thread, I've been procrastinating on that quite a while

delicate rain
#

so I have a safe space to rant and don't have to spam other project threads 😄

wispy spear
#

spamming other threads is fine

wicked notch
#

issa fine, 'tis but a bikeshed for everyone

wispy spear
#

thats what they are there for 🙂

#

or we focus on #1128020727380054046 for impl details

#

you choose 🙂

delicate rain
#

The thing is we want to merge Patricks culling with what I do currently and we'll prob create a new project for that

wispy spear
#

who is we

wicked notch
#

the daxa cult

delicate rain
#

me and Patrick

wispy spear
#

ah

delicate rain
#

yeye

glass sphinx
#

i was summoned

wicked notch
#

hi

wispy spear
#

better get cracking

glass sphinx
#

we gained new members

#

i think daxa is actively used by at least 6 ppl now

#

and we need more

wicked notch
#

nice nice

glass sphinx
#

use daxa

delicate rain
#

every thread is Daxa ad thread for Patrick

glass sphinx
wicked notch
#

btw here's a very disturbing fact

#

this thing

#
```glsl
uvec2 virtual_to_physical_texel_offset(in vec2 virtual_uv) {
    const uvec2 virtual_page_position = uvec2(virtual_uv * VSM_VIRTUAL_PAGE_ROW_SIZE);
    const vec2 virtual_page_corner = virtual_page_position / float(VSM_VIRTUAL_PAGE_ROW_SIZE);
    const vec2 virtual_page_offset = virtual_uv - virtual_page_corner;
    return uvec2(virtual_page_offset * VSM_VIRTUAL_TO_PHYSICAL_UV_SCALE * VSM_PHYSICAL_PAGE_SIZE);
}
```
#

is equivalent to this

#
```glsl
uvec2 virtual_to_physical_texel_offset(in vec2 virtual_uv) {
    const vec2 physical_texel = virtual_uv * VSM_VIRTUAL_TO_PHYSICAL_UV_SCALE * VSM_PHYSICAL_BASE_SIZE;
    return uvec2(mod(physical_texel, float(VSM_PHYSICAL_PAGE_ROW_SIZE)));
}
```
#

math™️
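
The identity can be checked numerically. The sketch below is a one-dimensional C++ version with its own constants, since the values of the macros in the two snippets are not shown: it assumes 128 virtual pages per row, a UV scale of 2, an 8192-texel physical texture and 128-texel pages, and takes the modulo by the page size in texels. The function names are invented for the comparison.

```cpp
#include <cmath>
#include <cstdint>

// Assumed values for this sketch; the real macros may be defined differently.
constexpr float kVirtualPagesPerRow = 128.0f;  // 16384 / 128
constexpr float kPhysicalBaseSize   = 8192.0f; // texels
constexpr float kPageSizeTexels     = 128.0f;
constexpr float kUvScale            = 2.0f;    // virtual / physical pages per row

// Long form: offset from the virtual page corner, rescaled to texels.
inline std::uint32_t texel_offset_long(float virtual_u) {
    const float page = std::floor(virtual_u * kVirtualPagesPerRow);
    const float offset = virtual_u - page / kVirtualPagesPerRow;
    return static_cast<std::uint32_t>(offset * kUvScale * kPhysicalBaseSize);
}

// Short form: position in texels, modulo the page size.
inline std::uint32_t texel_offset_mod(float virtual_u) {
    const float texel = virtual_u * kUvScale * kPhysicalBaseSize;
    return static_cast<std::uint32_t>(std::fmod(texel, kPageSizeTexels));
}
```

At texel centres every intermediate value is exactly representable in a float, so the two forms agree bit for bit; for arbitrary inputs rounding right at a page boundary could in principle make them differ by one texel.
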

wispy spear
#

the other one is "more" "readable"

wicked notch
#

true

#

not sure about the perf of mod

#

perhaps it's better like this

wispy spear
#

perhaps you can add a #ifdef FASTPATH of sorts later

#

or perhaps keep the longer one as a comment/description for the modeffed one

raven orchid
#

For mine I can configure how many physical memory pools there are so I opted to include the pool index

#

Plus I have some ideas I want to try with vulkan that would need the pool index, but that’s still a ways out

wicked notch
#

readback really clamplicates stuff with VSM huh

#

I just realized that I need to have a device local only buffer for page information that is managed by the GPU and a host visible and device local buffer for page allocation information that is written by the CPU and read by the GPU

#

this is honestly, very garbage

#

or do I rgbemojiwiggled

#

effectively the CPU never reads what the GPU writes

raven orchid
#

I went the other way, like gpu writes to an atomic queue saying which pages it wants to allocate

#

That ended up being the only readback I needed
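
A CPU-side sketch of that kind of request queue, using `std::atomic` in place of a shader atomic counter. `PageRequestQueue`, its capacity parameter and its members are hypothetical; on the GPU this would presumably be a buffer with an atomically incremented count that the CPU reads back.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity queue of virtual pages that want backing memory.
template <std::size_t Capacity>
struct PageRequestQueue {
    std::atomic<std::uint32_t> count{0};
    std::array<std::uint32_t, Capacity> pages{};

    // Safe to call from many writers at once: each one reserves a unique slot.
    // Returns false when the queue is already full.
    bool push(std::uint32_t virtual_page) {
        const std::uint32_t slot = count.fetch_add(1, std::memory_order_relaxed);
        if (slot >= Capacity) {
            return false;
        }
        pages[slot] = virtual_page;
        return true;
    }

    // Number of valid entries; the raw counter may overshoot Capacity.
    // Meant to be read only after all writers have finished.
    std::size_t size() const {
        const std::uint32_t n = count.load(std::memory_order_acquire);
        return n < Capacity ? n : Capacity;
    }
};
```
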

wicked notch
#

yeah that's a benefit of real sparse™️

#

but to be fair, fake sparse also doesn't really have this problem

#

I am creating the problem myself bleakekw

#

ehh screw it

#

fixed memory budget it is

raven orchid
#

Yeah with fixed memory I just disable the readback queue lol

raven orchid
wicked notch
#

my error was thinking that CPU and GPU communication was cool and good™️

#

(it isn't)

#

GPU driven all the way dammit

raven orchid
wicked notch
#

@delicate rain

#

lord Saky, your allocator yesterday tickled me internally

#

I am very intrigued, how many workgroups do you dispatch to allocate pages?

delicate rain
#

do you mean to allocate or to find the free pages?

wicked notch
#

to allocate

delicate rain
#

it's one thread per allocation request in the AllocationRequests buffer

#

since they actually have very little work to do

wicked notch
#

How do you synchronize accesses to the page table

delicate rain
#

they just modify the vsm/meta_memory entries

#

I don't

wicked notch
#

uh

#

you

#

don't?

delicate rain
#

you don't have to

#

or am I dumb?

#

why do you think I'd need synchronization?

wicked notch
#

synchronize as in atomics and eventual CAS loopies and whatnot (or subgroup ops)

#

do you do any of these

delicate rain
#

I don't think I need to

#

because none of these threads ever write into the same texels

#

each thread gets assigned -> page requesting allocation, memory page to allocate and potentially page to deallocate

#

but these are exclusive for each thread doing the actual allocation

wicked notch
#

hmmmmmmmm

#

so the thread already knows which (physical) page to allocate

#

how do you do that

delicate rain
#

that is the purpose of the find free pages pass

#

it goes over the physical memory and finds pages which are free/not visible this frame and writes them out into a buffer

wicked notch
#

How does this one workie

delicate rain
#

I'll draw

#

gimme sec

wicked notch
#

wait

#

Uhh

#

first let me know how many workgroups you dispatch for this one

#

depending on your answer I may or may not have the answer already bleakekw

delicate rain
#

so I dispatch (num_physical_pages / subgroup_size) workgroups

wicked notch
#

welp

#

I did not have the answer after all

#

you may draw KEKW

delicate rain
#

In the yellow box is just the setup I have

#

Then below is what I use to find free/not visited pages

#

Important is that I have the meta memory texture - it holds information about the physical memory tiles (it is one u32 per tile)

#

it holds if the physical memory is free/has been visited this frame and packed coords back to the owning vsm texel (if there is one)

wicked notch
#

My brain's working rn

#

the gears are turning

delicate rain
#

To find the free/not visited pages I then:

  • dispatch a single thread per texel in this meta memory texture.
  • do some subgroupOps to find the number of free/not_visited pages in each subgroup
  • one thread from each subgroup attempts to reserve slots in the free_pages/not_visited buffers by using atomicAdd and some logic (mainly to keep the buffers from overflowing)
  • either the buffer is already full -> do nothing, or we reserved some slots in which case I write out pages equal to the amount of slots the one thread reserved earlier
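The four steps above can be modelled on the CPU. The following is only a sketch under stated assumptions, not the actual shader: `group_size` stands in for the subgroup size, a plain loop stands in for the subgroup reduction, and `FreePageList`, `find_free_pages` and `valid_count` are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of the "find free pages" pass described above.
struct FreePageList {
    std::vector<uint32_t> pages;     // fixed-capacity output buffer
    std::atomic<uint32_t> count{0};  // slots requested so far (may exceed capacity)
};

inline void find_free_pages(const std::vector<bool>& is_free,
                            uint32_t group_size, FreePageList& out) {
    const uint32_t capacity = static_cast<uint32_t>(out.pages.size());
    const uint32_t total = static_cast<uint32_t>(is_free.size());
    for (uint32_t base = 0; base < total; base += group_size) {
        const uint32_t end = std::min(base + group_size, total);
        // Step 1: count the free pages in this group (the subgroup reduction).
        uint32_t free_in_group = 0;
        for (uint32_t i = base; i < end; ++i) {
            free_in_group += is_free[i] ? 1u : 0u;
        }
        if (free_in_group == 0) continue;
        // Step 2: one "thread" reserves a contiguous range with one atomic add.
        const uint32_t first = out.count.fetch_add(free_in_group);
        if (first >= capacity) continue;  // buffer already full, do nothing
        const uint32_t reserved = std::min(free_in_group, capacity - first);
        // Step 3: write out only as many pages as slots were reserved.
        uint32_t written = 0;
        for (uint32_t i = base; i < end && written < reserved; ++i) {
            if (is_free[i]) out.pages[first + written++] = i;
        }
    }
}

// The counter can overshoot, so consumers clamp it to the capacity.
inline uint32_t valid_count(const FreePageList& list) {
    return std::min(list.count.load(),
                    static_cast<uint32_t>(list.pages.size()));
}
```

Clamping on the consumer side is one way to keep the buffer from overflowing; the original logic may handle that differently.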
#

I can also provide source/shaders if that will help

#

I need the link back to VSM from Meta memory because of the deallocation - I need to access the owning vsm entry and reset it before a new VSM entry can allocate it. I did not find a better way than to keep this double-linking

wicked notch
#

I'm pondering

#

so I was looking at your writeup again back in frogfood

#

And got some inspirationz

#

But I feel like applying it to the allocator I had in mind is impossible

#

Because it's pretty linear y'know

delicate rain
#

you mean mine?

wicked notch
#

no the one I have in mind

delicate rain
#

ah I see

#

The main problem I hope the background threads in my head figure out is how to introduce priority into this - I'd like to prefer deallocating pages with a heuristic

#

since now I just pick any not_visible page whose thread writes into the buffer first

wicked notch
#
```glsl
for (uint i = 0; i < page_table.size(); ++i) {
  const uint page = findLSB(~page_table[i]);
  page_table[i] |= (1 << page);
}
```
#

this is the gist of the allocator
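A slightly fuller CPU-side sketch of the same bitfield idea, assuming one bit per physical page and 32 pages per word. `find_lsb`, `allocate_page`, `free_page` and `kInvalidPage` are hypothetical names, and unlike the gist this version stops at the first word that has a free bit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kInvalidPage = 0xFFFFFFFFu;  // hypothetical "no page" marker

// Index of the lowest set bit; matches GLSL findLSB for nonzero input.
inline uint32_t find_lsb(uint32_t value) {
    uint32_t bit = 0;
    while ((value & 1u) == 0u) {
        value >>= 1;
        ++bit;
    }
    return bit;
}

// Claims the first free page and returns its index, or kInvalidPage if full.
inline uint32_t allocate_page(std::vector<uint32_t>& masks) {
    for (uint32_t word = 0; word < masks.size(); ++word) {
        if (masks[word] == 0xFFFFFFFFu) continue;  // every page in this word is taken
        const uint32_t bit = find_lsb(~masks[word]);
        masks[word] |= (1u << bit);
        return word * 32u + bit;
    }
    return kInvalidPage;
}

// Freeing is a single bit clear, provided the caller still knows the page index.
inline void free_page(std::vector<uint32_t>& masks, uint32_t page) {
    masks[page / 32u] &= ~(1u << (page % 32u));
}
```

This model is single-threaded; running it from many GPU threads at once would need atomics on the mask words, which is not shown here.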

delicate rain
#

The advantage of having a bitfield for residency

#

Freeing pages could be problematic with this no?

wicked notch
#

🅱️erhaps

delicate rain
# wicked notch ```glsl vec2 page_corner_from_uv(in vec2 uv) { return (ivec2(uv * VSM_VIRTUA...

I finally have the context to think about this and I feel dumb, doesn't this work as well?

f32vec2 virtual_to_physical(f32vec2 virtual_uv)
{
   const i32vec2 virtual_texel_coord = virtual_uv * VSM_VIRTUAL_RES; // VSM_VIRTUAL_RES = 16k
   const i32vec2 in_page_texel_coord = virtual_texel_coord % VSM_PAGE_SIZE; // VSM_PAGE_SIZE = 128
   const f32vec2 in_page_uv = f32vec2(in_page_texel_coord) / VSM_PAGE_SIZE;
   return in_page_uv;
}
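The same split can be done in integer texel space. This is a minimal sketch using the 16k virtual resolution and 128-texel pages from the comments above; `PageAddress` and `split_virtual_uv` are hypothetical names, and the UV is assumed to lie in [0, 1).

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kVirtualRes = 16384;  // 16k, as in the comment above
constexpr uint32_t kPageSize = 128;      // texels per page side

// Hypothetical result type: which page a texel falls in, and where inside it.
struct PageAddress {
    uint32_t page_x, page_y;    // page coordinate in the virtual page table
    uint32_t texel_x, texel_y;  // texel coordinate inside that page
};

// Assumes u and v are in [0, 1); an input of exactly 1.0 would land out of range.
inline PageAddress split_virtual_uv(float u, float v) {
    const uint32_t tx = static_cast<uint32_t>(u * kVirtualRes);
    const uint32_t ty = static_cast<uint32_t>(v * kVirtualRes);
    return {tx / kPageSize, ty / kPageSize, tx % kPageSize, ty % kPageSize};
}
```

Division gives the page and the remainder gives the in-page texel, so the two halves always recombine to the original virtual texel.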
wicked notch
#

ye

#

that's what I do now

#
```glsl
// [physical_page]: in range [0; VSM_PHYSICAL_PAGE_COUNT)
uvec2 calculate_physical_page_texel_corner(in uint physical_page) {
    const uvec3 entry = decode_physical_page_entry(physical_page);
    const uint index = entry.y * 64 + entry.z;
    const uint x = index % VSM_PHYSICAL_PAGE_ROW_SIZE;
    const uint y = index / VSM_PHYSICAL_PAGE_ROW_SIZE;
    return uvec2(
        x * VSM_PHYSICAL_PAGE_SIZE,
        y * VSM_PHYSICAL_PAGE_SIZE
    );
}

// [virtual_texel]: in range [0; VSM_VIRTUAL_BASE_SIZE)
uvec2 virtual_to_physical_texel(in uvec2 virtual_texel) {
    return virtual_texel % VSM_PHYSICAL_PAGE_SIZE;
}

// [virtual_uv]: in range [0; 1]
uvec2 virtual_to_physical_texel_offset(in vec2 virtual_uv) {
    const vec2 physical_texel = virtual_uv * VSM_VIRTUAL_TO_PHYSICAL_SCALE * VSM_PHYSICAL_BASE_SIZE;
    return uvec2(mod(physical_texel, float(VSM_PHYSICAL_PAGE_ROW_SIZE)));
}
```
#

Here's the full thing if it can help

#

I have drawn a lot of times to come up with this bleakekw

delicate rain
#

I have been staring at this for 30 minutes lol

delicate rain
wicked notch
#

I use it like this right now

#
```glsl
void main() {
    restrict b_virtual_page_table page_table_ptr = b_virtual_page_table(u_address_ptr.data[IRIS_GLSL_VSM_PAGE_TABLE_ALLOCATION_BUFFER_SLOT]);

    const uvec2 virtual_page = uvec2(gl_FragCoord.xy) / VSM_VIRTUAL_PAGE_ROW_SIZE;
    const uint virtual_page_index = virtual_page.x + virtual_page.y * VSM_VIRTUAL_PAGE_ROW_SIZE;
    const uint virtual_page_entry = page_table_ptr.data[virtual_page_index];
    if (virtual_page_entry != uint(-1)) {
        const uvec2 physical_page_corner = calculate_physical_page_texel_corner(virtual_page_entry);
        const uvec2 physical_texel = virtual_to_physical_texel(uvec2(gl_FragCoord.xy));
        imageStore(u_vsm, ivec2(physical_page_corner + physical_texel), uvec4(floatBitsToUint(gl_FragDepth), 0, 0, 0));
    }
}
```
#

context: rasterizing a single VSM, fragment shader

#

I get the lower left corner of a physical page given a physical page index

#

and then I add a [0; PAGE_SIZE) offset
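The corner-plus-offset addressing can be sketched on the CPU as below. It assumes the physical page index is already a plain linear index (the real code decodes a packed entry first); `Texel` and `physical_texel` are hypothetical names, and the row size and page size are passed in rather than taken from the real constants.

```cpp
#include <cassert>
#include <cstdint>

struct Texel {
    uint32_t x, y;
};

// Maps a virtual texel to its location in the physical texture, given the
// linear index of the physical page that backs it.
inline Texel physical_texel(uint32_t physical_page, uint32_t pages_per_row,
                            uint32_t page_size, Texel virtual_texel) {
    // Corner of the physical page inside the physical texture.
    const uint32_t corner_x = (physical_page % pages_per_row) * page_size;
    const uint32_t corner_y = (physical_page / pages_per_row) * page_size;
    // Offset inside the page, in [0, page_size).
    return {corner_x + virtual_texel.x % page_size,
            corner_y + virtual_texel.y % page_size};
}
```

For example, page 70 in a 64-page-wide texture with 128-texel pages has its corner at (768, 128).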

delicate rain
#

oh right I see the physical corner is the offset inside of the memory

wicked notch
#

ye

#

I suggest you work with texels because we can't use UVs with storage images bleakekw

delicate rain
#

I forgor that you don't actually get 128x128 texture lol

delicate rain
#

Thank you for the help master frogeheart

wicked notch
#

so

#

we have some rasterization going on

#

da physical texture™️

#

perf is very good yes yes

#

no perf issues at all KEKW

delicate rain
#

I mean

#

3ms not that bad no?

glass sphinx
wicked notch
delicate rain
#

I'm challenging myself to write as much VSM code as possible without actually rendering the shadows, since I'm very scared of the perf

wicked notch
#

@ Jaker imma need that shrimple_scene.glb

frank sail
#

hmm

frank sail
#

That's in the repo

wicked notch
wicked notch
#

yup

#

virtual shadow mapping done right

#

now onto figuring out which offset I fucked up bleakekw

wicked notch
#

how can the virtual page index be repeating itself like this thonk

#
```glsl
const view_t shadow_view = view_ptr.data[1];
const vec3 virtual_shadow_position = vec3(shadow_view.proj_view * vec4(position, 1.0));
const vec2 virtual_shadow_uv = fract(virtual_shadow_position.xy * 0.5 + 0.5);
const ivec2 virtual_page_position = ivec2(virtual_shadow_uv * vec2(VSM_VIRTUAL_PAGE_ROW_SIZE));
float shadow = 1.0;
const uint virtual_page_index = virtual_page_position.x + virtual_page_position.y * VSM_VIRTUAL_PAGE_ROW_SIZE;
const uint virtual_page_entry = page_table_ptr.data[virtual_page_index];
if (virtual_page_entry != uint(-1)) {
    const uvec2 physical_page_corner = calculate_physical_page_texel_corner(virtual_page_entry);
    const uvec2 physical_page_texel = virtual_to_physical_texel_offset(virtual_shadow_uv);
    // sample
    const float shadow_depth = uintBitsToFloat(imageLoad(u_vsm, ivec2(physical_page_corner)).x);
    shadow = shadow_depth < virtual_shadow_position.z ? 0.0 : 1.0;
}
```
Does this look correct?
frank sail
#

You forgor to multiply/divide by 128 somewhere I guess

wicked notch
#

hmm I can't see it

frank sail
#

The code is unreadable on mobile so I'll check on desktop after I eat

raven orchid
wicked notch
#

so

#

here's the thing

#

there's some repetition going on

#

Apparently, the tiles themselves use the whole VSM

#

so like

#

uv = fract(shadow_uv.xy * 0.5 + 0.5)

#

if I go uv * VSM_VIRTUAL_PAGE_ROW_SIZE

#

Then all the pages will be mapped inside the tile

#

which is wrong

#

Therefore

frank sail
wicked notch
#

yeah true

#

but damn this means my whole thing is wrong

#

how the hell did I think about it

#
```glsl
const vec3 shadow_position = vec3(shadow_view.proj_view * vec4(world_position, 1.0));
const vec2 virtual_shadow_uv = fract(shadow_position.xy * 0.5 + 0.5);
const ivec2 virtual_page_index = ivec2(virtual_shadow_uv * vec2(VSM_VIRTUAL_PAGE_ROW_SIZE));
```
#

This is wrong

#

[0;1] will be mapped to [0;128]

#

How the hell do I fix this

#

Jaker help

#

I'm lighting the Jaker beacon

frank sail
#

uhh

#

reading code is hard

#

right so you want a uv across the whole vsm that's in [0, 1]

#

then you want to multiply it by pageTableSize to see which page you are in

wicked notch
#

yes

frank sail
#

then to get the uv within a page you need to tile the whole vsm uv like so
vec2 within_page_uv = PAGE_SIZE * mod(vsm_uv, 1.0 / PAGE_SIZE);
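A one-dimensional CPU sketch of that split, assuming `PAGE_SIZE` in the line above means the number of pages per row of the page table (the same factor used to find the page). `PageUv` and `split_uv` are hypothetical names. Taking the fractional part of `uv * N` is the same quantity as `N * mod(uv, 1.0 / N)`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct PageUv {
    uint32_t page;  // which page along this axis
    float within;   // UV inside that page, in [0, 1)
};

// Assumes vsm_uv is in [0, 1) and spans the whole VSM.
inline PageUv split_uv(float vsm_uv, uint32_t pages_per_row) {
    const float scaled = vsm_uv * static_cast<float>(pages_per_row);
    const float page = std::floor(scaled);
    return {static_cast<uint32_t>(page), scaled - page};
}
```

With 128 pages per row, a UV of 33.5 / 128 lands halfway through page 33.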

wicked notch
#

yes

#

how do I make a projection matrix that spans the entire VSM

frank sail
#

well ndc already spans the whole vsm 🤔

#

and thus uv

#

doesn't matter what your projection looks like

#

maybe I am misunderstanding le problem

wicked notch
#

they're the whole VSM

#

and the texels are the individual pages

#

which is dumb

frank sail
#

the red squares look like pages

#

but they're actually whole vsms?

wicked notch
#

but they aren't bleakekw

#

yes

frank sail
#

they're supposed to be pages, right

wicked notch
#

I want them to be pages yes

frank sail
#

can you try outputting virtual_shadow_uv

#

that should be a single big gradient across the whole vsm if I understood

wicked notch
#

It won't

#

good ole fract makes it repeat foreva

#

so what I'm doing is, each tile has a shadow uv in range [0;1]

#

which is not what I want

#

or well, maybe it is what I want? dunno honestly bleakekw

frank sail
#

hrmm

#

the fract seems a little fischy

#

well the fact that it's generating a large value is odd already

#

do you have any clipmaps?

wicked notch
#

no this is all as shrimple as possible

#

no clipmaps, no culling, no nothing

frank sail
#

oh then I guess it's fine

#

the smallest vsm should be pretty small if you define it to be that way

#

my clipmap 0 is probably about that big

#

and yes it's supposed to tile

#

I guess the main thing is that when you have clipmaps you should never see several tiles of the same clipmap at once

#

if you do, you will get horribly wrong results

#

so idk if it's related to your current problem exactly, but you can try making the projection way bigger so one tile basically covers the screen

#

or temporarily clamp the UV (until you add clipmaps) so it never repeats

wicked notch
frank sail
#

ye I made it 10x10 meters

wicked notch
#

what happens if you set your clipmap 0 to a [-10;10] projection width

frank sail
#

well mine is [-5, 5] so that's pretty close eh

wicked notch
#

so this is just because my proj is too smol

frank sail
#

make it 100x100 or smth until you add clipmaps

raven orchid
#

Tbh with 16k you’ll get good results up until -512,512

#

Well…. Ok results which become good results with shadow filtering

frank sail
#

btw it compiles and runs fine on my work machine (running 7900 XTX) froge_love

raven orchid
#

Is that the newest from AMD 😮

frank sail
#

it's the most powerful

#

ignore the horrible artifacts because I haven't pushed any of the changes I've been working on (including the fix for the depth being a frame behind)

#

that is displaying the UV across a whole VSM

#

then I do lod bias = -10 to force it to the lowest clipmap so you can see the tiling more easily

frank sail
raven orchid
#

That’s awesome, I’m not fully up to date with gpu release news lately

wicked notch
#

ok it's very much less broken

#

still broken

#

once again Jaker saves the day

frank sail
#

inshallah I will turn broken code into broken code

wicked notch
#

less broken* smart

#

I shall put my brain into sleep mode

#

hopefully biology does the things it's supposed to do and the answer will come tomorrow

frank sail
#

🇬🇳

wicked notch
#

boys

#

we got em

#

It turns out I don't understand viewports bleakekw

#

but we got it nonetheless

#

minus serious temporal/biasing/peter panning/literally any other problem issues bleakekw

frank sail
wicked notch
#

showcasing all the problems bleakekw

raven orchid
#

Very nice! What clip0 radius did you end up using?

wicked notch
#

this is just a single clipmap so far

#

but I'll probably use something not too smol

#

probably 5x5 or something

wicked notch
# wicked notch

@frank sail do you have any clue as to what the hell is the final problem I showcase

#

The one where I go back and forth and the bias seems to "change" (it's a constant value)

frank sail
#

something is definitely one frame behind

#

because it only happens under motion

wicked notch
#

Hmmm

#

it's probably the page table

frank sail
#

if it's anything like my impl, you may have put all the logic before the depth buffer is written, so you read last frame's depth which is wrong when moving

#

mine didn't have the exact same artifact though so maybe you are using an old view matrix or something

#

which would cause the sampled position to be slightly wrong when transforming to light space

wicked notch
#

yep

#

I'm a fucken idiot

#

I shade before I render shadows

#

big brain

frank sail
#

lol

wicked notch
#

6429869 sync errors incoming because I swapped passes around

frank sail
#

not using vuk be like

wicked notch
#

I do be considering tbh

#

luckily I'm a Vulkan god and I fixed them in .5 nanoseconds

#

(there were no sync errors)

wispy spear
glass sphinx
frank sail
#

OpenGL my homie

glass sphinx
wispy spear
#

you can fuck up rendergraph the same way

#

idk why people keep suggesting it as the holy grail

delicate rain
#

Not if you use DAXA 🧙‍♂️

wispy spear
#

just plug in the wrong image at the wrong slot

glass sphinx
#

thats not the same at all

#

he reordered which caused a problem

#

rendergraphs handle that automagically

wispy spear
#

i doubt that they magically fix everything

#

the graph needs to be setup too

wicked notch
#

Not everything, but pass reordering do be automagic

glass sphinx
#

any order management is automatic

wispy spear
#

yeah, but not its parameters/values, is wot im saying

wicked notch
#

yeah

glass sphinx
#

that sounds trivial but it's actually very complex to do manually

wispy spear
#

yes, i get why you or others suggest rg

glass sphinx
frank sail
#

Deccer's brain is expanding from #1157025378112634972

wicked notch
#

Potrick, how do you pass BDA arrays around?

#

Like say you have this

layout (buffer_reference) buffer X {
    T[] data;
};
#

I would like to pass the data to functions around

#

Without having to redeclare the buffer block if possible

frank sail
#

I don't think you can pass the unbounded arrays, but you can pass the buffer reference

#

The uint64 or whatever

wicked notch
#

yeah.. but that requires redeclaring the buffer block

#

so sad

frank sail
#

Hmm

#

GLSL moment

delicate rain
#

can you not pass a reference to the buffer X around and just access the data inside of the function?

frank sail
#

Apparently that requires declaring the thingy twice but idk why since I haven't used bda myself

wicked notch
#

ye I wanted to use this in a header and stuff

#

then there's the qualifiers I have to think about

#

restrict, readonly, etc.

delicate rain
#

@glass sphinx come explain your glsl voodoo macro things, I tried to read them but they give me a headache honestly

glass sphinx
#

we make that for every struct we need it for

#

then we have the syntax: daxa_BufferPtr(Struct)

#

we all like it a lot so far

#

that doesnt always work tho so sometimes i declare buffer blocks manually for bda

wicked notch
#
```glsl
#define IRIS_GLSL_MESHLET_INSTANCE_BUFFER_BLOCK(name) name { \
    meshlet_instance_t[] data;                               \
}

#define IRIS_GLSL_MESHLET_BUFFER_BLOCK(name) name { \
    meshlet_t[] data;                               \
}

#define IRIS_GLSL_TRANSFORM_BUFFER_BLOCK(name) name { \
    transform_t[] data;                               \
}

#define IRIS_GLSL_VERTEX_BUFFER_BLOCK(name) name { \
    vertex_format_t[] data;                        \
}
```
Like this?
glass sphinx
#

no

#

i have to go gym now i hope this was insightful

wicked notch
#

thank you frogeheart

glass sphinx
#

its not the best but it has pointer semantics for the most part which is nice for my wallnut brain

#

also explicit and consistent

#

which is always a plus

wicked notch
#

Hmm

#

Is there any way to shorten this:

IRIS_BUFFER_ARRAY_PTR(type) type_ptr = IRIS_BUFFER_ARRAY_PTR(type)(iris_id_to_address(slot));
glass sphinx
#

that works in daxa at least

glass sphinx
#

uuuh

#

you could make a new macro

#

buffer ptr from address(type, address)

#

when I'm home I'll make an example

wicked notch
#

I have this yea

#
```glsl
#define IRIS_RW_GET_BUFFER_ARRAY(T, id) IRIS_RW_BUFFER_ARRAY_PTR(T)(iris_id_to_address(id))
#define IRIS_R_GET_BUFFER_ARRAY(T, id) IRIS_R_BUFFER_ARRAY_PTR(T)(iris_id_to_address(id))
#define IRIS_W_GET_BUFFER_ARRAY(T, id) IRIS_W_BUFFER_ARRAY_PTR(T)(iris_id_to_address(id))

#define IRIS_RW_DECLARE_BUFFER_PTR(name, T, id) IRIS_RW_BUFFER_PTR(T) name = IRIS_RW_GET_BUFFER(T, id)
#define IRIS_R_DECLARE_BUFFER_PTR(name, T, id) IRIS_R_BUFFER_PTR(T) name = IRIS_R_GET_BUFFER(T, id)
#define IRIS_W_DECLARE_BUFFER_PTR(name, T, id) IRIS_W_BUFFER_PTR(T) name = IRIS_W_GET_BUFFER(T, id)
```
glass sphinx
#

thats cool

#

i like it

wicked notch
#

Another thing I don't like is declaring slots but I guess it's the only way

#

Like this:

```glsl
#define IRIS_VIEW_BUFFER_SLOT 0
#define IRIS_MESHLET_INSTANCE_BUFFER_SLOT 1
#define IRIS_MESHLET_BUFFER_SLOT 2
#define IRIS_TRANSFORM_BUFFER_SLOT 3
#define IRIS_VERTEX_BUFFER_SLOT 4
#define IRIS_INDEX_BUFFER_SLOT 5
...
```
glass sphinx
#

do bindless only

#

much better api possibilities

wicked notch
#

These are indices inside the "address" buffer

#

I thought that was a good way of handling BDA

#

is there a better way?

glass sphinx
#

huh

#

i dont understand

#

i have a big address buffer for all buffers that i index with the buffer id

#

the buffer id is unique

#

has index and version

#

other than that i use structs for all uploads

wicked notch
#

How does the buffer ID work?

glass sphinx
#

when buffer is created it gets its id

#

in the creation function

#

that id's index indexes into the descriptor set binding and address buffer

#

so then i can take the id from cpu to gpu and use it there to get the buffer in hlsl or bda in glsl
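One way to picture an ID that "has index and version" is a generational handle table. The sketch below is an assumption about the general pattern, not daxa's actual implementation: `BufferId`, `BufferTable` and all of their members are hypothetical names, and a plain 64-bit integer stands in for the buffer device address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical handle: the index picks a table slot, the version detects reuse.
struct BufferId {
    uint32_t index;
    uint32_t version;
};

class BufferTable {
    struct Slot {
        uint64_t address = 0;  // stand-in for the buffer device address
        uint32_t version = 0;
        bool alive = false;
    };
    std::vector<Slot> slots_;
    std::vector<uint32_t> free_;  // indices of destroyed slots, ready for reuse

public:
    // The ID is handed out at creation time, as described above.
    BufferId create(uint64_t device_address) {
        uint32_t index;
        if (!free_.empty()) {
            index = free_.back();
            free_.pop_back();
        } else {
            index = static_cast<uint32_t>(slots_.size());
            slots_.push_back(Slot{});
        }
        slots_[index].address = device_address;
        slots_[index].alive = true;
        return {index, slots_[index].version};
    }

    bool is_valid(BufferId id) const {
        return id.index < slots_.size() && slots_[id.index].alive &&
               slots_[id.index].version == id.version;
    }

    // Bumping the version makes every old copy of the ID stale.
    void destroy(BufferId id) {
        if (!is_valid(id)) return;
        slots_[id.index].alive = false;
        ++slots_[id.index].version;
        free_.push_back(id.index);
    }

    // Returns 0 for stale or unknown IDs.
    uint64_t address(BufferId id) const {
        return is_valid(id) ? slots_[id.index].address : 0;
    }
};
```

On the GPU side only the index part would be needed to look up the address buffer; the version check is a CPU-side safety net in this sketch.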

wicked notch
#

Interesting

glass sphinx
#

look at the mandelbrot sample again

#

it shows how thats done

wicked notch
#

If you have buffers in flight how do you handle that?

#

Do you upload buffer IDs for all buffers?

glass sphinx
#

i dont follow

#

can you make an example?

wicked notch
#

You need the buffer ID on the GPU right?

#

I suppose you have to put the IDs in a buffer somewhere, is that correct?

glass sphinx
#

either another buffer or push constant

wicked notch
#

I see I see

#

So about the buffer IDs still

glass sphinx
#

can you vc?

#

i can probably explain it faster in a voice chat

wicked notch
#

sorry not right now 😢

glass sphinx
#

ok

#

then go ahead

wicked notch
#

but it's the last question I promise

#

You gotta collect all buffers to get their IDs right? (on the CPU side)

#

How does that work?

glass sphinx
#

collect?

wicked notch
#

something like:

```cpp
const auto buffers = get_all_buffers();
std::vector<uint32> ids;
for (const auto& [id, rest] : buffers) {
    ids.emplace_back(id);
}
upload_buffer(ids);
```
#

my question is do you do anything like get_all_buffers?

glass sphinx
#

right now buffers in daxa are the ids

#

uuh

#

why would i need that

wicked notch
#

Ahh I see

#

if your buffers are ids then you don't need that yeah

glass sphinx
#

yes

#

the bindless + shared file in daxa make resource management so fucking easy

#

its a joy

wicked notch
#

so you have something like a buffer table

#

with all buffer ids and stuff

glass sphinx
#

yes

wicked notch
#

and then you can access them as you wish

glass sphinx
#

well not ids

#

the ids index the table

#

then the table has all the metadata like bda, host ptr, creation info, blabla

#

and on gpu side i also have a smaller version of the table

#

but for a shader input i usually have a struct containing all input data. that would have constants ids addresses etc

wicked notch
#

nice

wicked notch
wispy spear
#

looks like you smuggled some part of your view matrix into the formula 🙂

wicked notch
#

ye I thought so too

#

but it's still the same even with a super shrimple view mat

#

so eye = light_dir, center = 0, up = (0 1 0)

distant lodge
#

it seems to be adding angle so it might be a sign or an inversion issue

#

the lines spin faster than your camera

wicked notch
#

ah frick yeah

#

I was adding a one that didn't belong there

#

wait no nevermind, it's still borked bleakekw

#

time to check le nsight

wispy spear
#

: D

wicked notch
#

they do be very stable now

#

I truly hate vulkan's flipped Y

#

no friggin reason to have it flipped

wicked notch
#

rip h264 encoding

#

also rip perf

frank sail
#

now your gpu trace looks like mine

raven orchid
#

What are you gonna try first for performance?

wicked notch
#

the easiest thing

#

cull meshlets that don't overlap visible pages

wicked notch
#

Readback latency is real

frank sail
#

I know how to make the latency 0 frames

#

vkDeviceWaitIdle

wicked notch
#

I wanna move to a GPU driven allocator already

#

but Saky's thing is ubercomplex

frank sail
#

Instead of writing logic on the CPU you just write a bunch of little compute shaders

#

The overhead of writing even one compute shader do be kinda annoying ngl

#

Wiring all the schtuff together

delicate rain
#

My thing is unnecessarily complex

#

I think Jakers is enough

#

Linear allocator all the way

wicked notch
#

Imma just do one GPU thread doing all the work KEKW

delicate rain
#

Yep

delicate rain
#

3 nanoseconds after mentioning something is complex Patrick swoops in from the skies

glass sphinx
frank sail
delicate rain
#

It's like 30 loc

#

To add a new shader

glass sphinx
#

i have a little 20 loc template i copy and then add things to. Most small shaders are done in like 5 minutes this way

frank sail
#

In fwog it's like 5 loc to make a shader+pipeline and it's still annoying because I still need to bind resources and stuff

glass sphinx
#

well the template is smaller, i counted the loc of adding the shader everywhere including src code and adding it to the tg, also declaring and passing parameters.

glass sphinx
#

you define the parameters once then you insert the task with the parameters thats it

frank sail
#

There is also some mental overhead in making a new file to add a tiny shader

delicate rain
#

Back in my raw Vulkan nightmare days, when adding a compute pass meant adding stuff to 10 different places

glass sphinx
#

i dont add files for every shader

#

enough daxa shilling tho

#

back to vsmisms

wicked notch
#

I'm making the kompoot shader

#

writing layout (local_size_x = 1) in; feels like a crime though 💀

frank sail
#

if it helps you sleep at night, you can get the same effect by writing nothing

wicked notch
#

before going gpu driven

#

I fixed a nasty bug

#

where I had a massive lod bias, so some geometry in the smol clipmaps would get clipped

#

now lod bias is sane (as is the first clipmap width)

wicked notch
#

Ok so thinking about the GPU allocator

#

I suppose I can use the same systems I use on the CPU

#

but I need some more state to keep track of allocated pages

#

so the visible_page_buffer needs to be doubled and [0] will contain the older visible values

#

I need this to determine transitions from non visible to visible and viceversa
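The transition detection can be sketched on the CPU like this. It is only a model under assumptions: the two vectors stand in for the two halves of the doubled visibility buffer, and the request encoding (page index shifted left by one, low bit set for allocate) is hypothetical here, as are the names `build_requests`, `kFreeBit` and `kAllocateBit`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kFreeBit = 0u;      // hypothetical request tag: free this page
constexpr uint32_t kAllocateBit = 1u;  // hypothetical request tag: allocate this page

// Compares last frame's visibility with this frame's and emits one request
// per page whose state changed. Both inputs must have the same length.
inline std::vector<uint32_t> build_requests(const std::vector<uint8_t>& previous,
                                            const std::vector<uint8_t>& current) {
    std::vector<uint32_t> requests;
    for (uint32_t page = 0; page < current.size(); ++page) {
        const bool was_visible = previous[page] != 0;
        const bool is_visible = current[page] != 0;
        if (is_visible && !was_visible) {
            requests.push_back((page << 1) | kAllocateBit);  // became visible
        } else if (!is_visible && was_visible) {
            requests.push_back((page << 1) | kFreeBit);  // became invisible
        }
    }
    return requests;
}
```

Pages whose visibility did not change produce no request, so a static camera generates an empty list.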

wicked notch
#

perchance I need two passes

wicked notch
#

mfw tomorrow is monday

#

I hate monday

wispy spear
#

i have 2 days off : >

delicate rain
#

I'm fighting with the urge to not go to uni and watch lecture from video

#

the classic

frank sail
#

Bro you can work on VSM later listenyoupieceofshit

#

Participate in class for maximum brain growth

delicate rain
#

yeah you are right

wicked notch
#

I just work on VSM in uni through parsec KEKW

frank sail
#

Your education is very important saky

#

And it is fleeting, unlike VSM which will always present problems for you to solve when you get back frogeheart

wicked notch
#

there are some useless lectures tho

frank sail
#

Fr but you're ruining my pep talk KEKW

wicked notch
#

I have this one professor who is literally reading the slides he makes

#

ok true

#

I'll keep quiet

frank sail
#

But yeah it's up to saky to judge whether that's the case

#

Just don't let your tiredness cloud your judgement

delicate rain
#

Mister Jaker saying the exact things I need to hear

#

I'll attend the lecture tomorrow like a good student

wicked notch
#

good, now that the pep talk has concluded

#

you most likely have some boring lectures, give into the VSM temptation then 😈

wispy spear
#

i can read some spaghetti recipes out loud too : ) if that helps

delicate rain
#

hahaha it's Algorithmics tomorrow - I already know most of that

#

but I need the walk to uni and back

#

been sitting in my room for far too long

wispy spear
#

walking is good to get the head free

#

make sure not to look down onto the road before you as you walk

wicked notch
#

I have my friends take up some seats for ourselves and I always have a dual screen setup on my desk

#

my tablet and my laptop are connected through parsec to my home machine

#

thankfully my uni didn't cheap out on the internet so I have zero latency

delicate rain
#

based setup

wicked notch
#

Imma send pic tomorrow

#

to show how powerful it actually is

#

I suggest this setup to anyone who likes procrastinating

delicate rain
#

I'm actually very curious what it looks like haha

wispy spear
#

when you said parsec, that reminded me of a space game which was in the making 23 years ago lol

#

called parsec, very 3d et al

raven orchid
#

School is the only time the shared vsm neuron gets a break

#

It’s been worked nearly to death

wicked notch
#

I ended up doing this

```glsl
void main() {
    restrict b_page_table_request_block page_request_ptr = b_page_table_request_block(b_addresses.data[IRIS_VSM_PAGE_REQUEST_BUFFER_SLOT]);
    restrict b_page_table_block page_table_ptr = b_page_table_block(b_addresses.data[IRIS_VSM_PAGE_TABLE_ALLOCATION_BUFFER_SLOT]);
    restrict b_vsm_allocate_request_block alloc_req_ptr = b_vsm_allocate_request_block(b_addresses.data[IRIS_VSM_PAGE_ALLOCATION_REQUEST_BUFFER_SLOT]);
    const uvec2 position = uvec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(position, uvec2(VSM_VIRTUAL_PAGE_ROW_SIZE)))) {
        return;
    }
    const uint clipmap = gl_WorkGroupID.z + 1;
    const uint virtual_index = position.x + position.y * VSM_VIRTUAL_PAGE_ROW_SIZE;
    const uint virtual_page_entry = page_table_ptr.data[virtual_index * clipmap];
    const bool is_page_allocated = is_virtual_page_backed(virtual_page_entry);
    const bool is_page_requested = page_request_ptr.data[virtual_index * clipmap] == 1;
    const uint free_bit = 0;
    const uint allocate_bit = 1;
    if (is_page_requested && !is_page_allocated) {
        const uint slot = atomicAdd(alloc_req_ptr.count, 1);
        alloc_req_ptr.data[slot] = ((virtual_index * clipmap) << 1) | allocate_bit;
    } else if (!is_page_requested && is_page_allocated) {
        const uint slot = atomicAdd(alloc_req_ptr.count, 1);
        alloc_req_ptr.data[slot] = ((virtual_index * clipmap) << 1) | free_bit;
    }
}
```
wicked notch
#

and this:

```glsl
void main() {
    restrict b_vsm_allocate_request_block alloc_req_ptr = b_vsm_allocate_request_block(b_addresses.data[IRIS_VSM_PAGE_ALLOCATION_REQUEST_BUFFER_SLOT]);
    restrict b_vsm_physical_page_table phys_page_table_ptr = b_vsm_physical_page_table(b_addresses.data[IRIS_VSM_PHYSICAL_PAGE_TABLE_BUFFER_SLOT]);
    restrict b_page_table_block page_table_ptr = b_page_table_block(b_addresses.data[IRIS_VSM_PAGE_TABLE_ALLOCATION_BUFFER_SLOT]);

    for (uint i = 0; i < alloc_req_ptr.count; i++) {
        const uint page_request = alloc_req_ptr.data[i];
        const uint virtual_clipmap_index = page_request >> 1;
        const bool is_allocation = (page_request & 0x1u) == 1;
        if (is_allocation) {
            // find first free bit in physical page table
            // FIXME [WARNING]: suppose allocation never fails
            const uint mask_count = VSM_PHYSICAL_PAGE_COUNT / 32;
            for (uint j = 0; j < mask_count; ++j) {
                const uint mask = phys_page_table_ptr.data[j];
                if (mask == ~0u) {
                    continue;
                }
                const uint bit = findLSB(~mask);
                phys_page_table_ptr.data[j] |= 1u << bit;
                page_table_ptr.data[virtual_clipmap_index] = encode_physical_page_entry(uvec2(j, bit));
                break;
            }
        } else {
            const uvec2 physical_page_entry = decode_physical_page_entry(page_table_ptr.data[virtual_clipmap_index]);
            phys_page_table_ptr.data[physical_page_entry.x] &= ~(1u << physical_page_entry.y);
            page_table_ptr.data[virtual_clipmap_index] = uint(-1);
        }
    }
}
```
#

But shadows disappeared

delicate rain
#

Either I'm blind or you never reset the allocated bit in the virtual page entries when freeing no?

#

So you free the pages memory but the page still thinks it's allocated

wicked notch
#

ye true

wicked notch
#

gpu allocator has been achieved

#

time for lunch break

wispy spear
#

bon appetito

wicked notch
#

on my way to work I am pondering

#

SDSM + VSM?

#

could this be a cursed combination, or the holy grail of shadows

#

I shall leave this thought in the back of my head for a while, need to make this thing perform well first nervous

delicate rain
#

I've also been thinking about this

#

but I don't think it works because SDSM relies on you redrawing the cascades each frame

wicked notch
#

ye it's just a little "what if no caching hehe"

delicate rain
#

I mean if we can make it fast enuf so that we redraw all each frame

wicked notch
#

I suppose a precondition for that would be absolutely balls to the walls perfect culling

delicate rain
#

Maybe we could do SDSM only for nearest X clips though

wicked notch
#

first experiment with HZB coming soon

#

jesus

#

As it turns out reducing num_clipmaps * mip_count images isn't exactly performant bleakekw

wicked notch
#

Hmmm

#

How should I store the surviving meshlets

#

one buffer per clipmap?

frank sail
#

I'm just using your multiview setup in ff lol

wicked notch
#

ye that's one buffer view kindasorta

frank sail
#

Probably wasting a lot of memory that way though

wicked notch
#

yep

frank sail
#

idc though hehe

wicked notch
#

are there some educated guesses we can do about object visibility

#

hmm I don't think so, we can't make assumptions about object visibility based only on clipmap info

frank sail
#

Yeah I think you have to assume in the worst case that every meshlet is visible to every clipmap

wicked notch
#

with one mil meshlets that's 48MiB if I have 12 clipmaps bleakekw
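The 48 MiB figure works out if each surviving-meshlet entry is a 4-byte index and "one mil" is taken as 2^20. Both are assumptions, since the entry size is not stated here. A small worked check, with the hypothetical helper `visible_list_bytes`:

```cpp
#include <cassert>
#include <cstdint>

// Worst case: every meshlet survives culling in every clipmap, so each
// clipmap needs room for one entry per meshlet.
constexpr uint64_t visible_list_bytes(uint64_t meshlets, uint64_t clipmaps,
                                      uint64_t bytes_per_entry) {
    return meshlets * clipmaps * bytes_per_entry;
}

// 2^20 meshlets * 12 clipmaps * 4 bytes = 48 MiB exactly.
static_assert(visible_list_bytes(1ull << 20, 12, 4) == 48ull * 1024 * 1024,
              "48 MiB for 2^20 meshlets");
```

With exactly 1,000,000 meshlets the same product is 48,000,000 bytes, a little under 46 MiB.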

frank sail
#

Btw I had an idea to save memory with multiview

#

Basically doing

foreach view in views
  CullMeshlets(view)
  DrawMeshlets(view)
wicked notch
#

reusing the same boofer?

frank sail
#

Ye

wicked notch
#

This will result in a one frame latency

#

but 12 times less memory

delicate rain
#

frame latency?

wicked notch
#

ye

delicate rain
#

why