#Iris - A Journey through OpenGL and beyond to learn Graphics


wicked notch
#

what's frustumSize here?

#

the width of the projection?

frank sail
#

yeah

#

like the params you supply to glm::ortho

#

couldn't figure it out so I'm gonna keep my unhinged inverse thing

delicate rain
#

I think you should project the basis from ndc virtual space into world space -> x = offset * unprojected_basis -> translate by x

#

you also need to translate by z in order for the camera to slide along the same plane no?

frank sail
#

not in view space

delicate rain
#

right but you translate the view matrix that does not mean translating in the view space no?

#

or am I dumb

frank sail
#

idk

delicate rain
#

translating the view matrix imo means "moving the camera"

#

in world space

frank sail
#

o

delicate rain
#

thats why you need to unproject from the viewspace of the vsm first

#

or multiply by the vsm view matrix to get it into vsm space translate there and then go back to world space

#

these should be identical

frank sail
#

or do glm::inverse(stableProjections[i]) * shiftedProjection * stableViewMatrix; smart

delicate rain
#

I have no clue what that does lmao

frank sail
#

it does what you described probably

#

the main thing is that it works

delicate rain
#

tru

frank sail
#

I don't like how translating a view matrix does not translate in view space

#

someone should fix that

delicate rain
#

you can do that by doing projection * view_space_translation * view_space * world_space_translation

wicked notch
#

bug fix for math.exe

wicked notch
delicate rain
#

yes because you first project into view space and then translate

#

which does translation in view space

#

I had the order wrong my bad
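The ordering point above can be checked numerically. This is a minimal sketch, assuming the column-vector convention GLM uses (the rightmost factor of a product is applied first). The tiny `Mat4` type and the helper names are hypothetical stand-ins, not GLM API, and a 90 degree rotation about Z stands in for a real view matrix.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>; // indexed as m[row][col]

inline Mat4 identity() {
    Mat4 m{}; // value-initialised to all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

inline Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[0][3] = x; m[1][3] = y; m[2][3] = z;
    return m;
}

// Rotation by +90 degrees about Z, used here in place of a view matrix.
inline Mat4 rotate_z_90() {
    Mat4 m = identity();
    m[0][0] = 0.0f; m[0][1] = -1.0f;
    m[1][0] = 1.0f; m[1][1] = 0.0f;
    return m;
}

inline Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

inline Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}
```

With these helpers, `mul(translation(1, 0, 0), view)` moves the world origin to (1, 0, 0) in view space, so the shift follows the view-space x axis. `mul(view, translation(1, 0, 0))` moves it to (0, 1, 0), because the shift happens in world space before the rotation is applied.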

wicked notch
#

Are glm::ortho's parameters in ndc units or world units?

delicate rain
#

world
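To make the "world units" answer concrete: an orthographic projection linearly maps the box you pass in onto NDC, so the parameters are in the units of whatever space the view matrix outputs (world units when the view matrix has no scale). A minimal sketch of the x-axis mapping follows; the helper names are hypothetical and this is not GLM's code, only the same formula.

```cpp
#include <cassert>

// Maps a view-space x coordinate in [left, right] to NDC x in [-1, 1],
// the same linear mapping an orthographic projection applies per axis.
inline float ortho_ndc_x(float x, float left, float right) {
    assert(right != left);
    return (2.0f * x - (right + left)) / (right - left);
}

// One NDC unit covers half the frustum width in view-space units.
inline float view_units_per_ndc_unit(float left, float right) {
    return (right - left) * 0.5f;
}
```

For a frustum spanning -50 to 50, `ortho_ndc_x(-50, -50, 50)` is -1, `ortho_ndc_x(50, -50, 50)` is 1, and one NDC unit corresponds to 50 units of the input space.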

wicked notch
#

Hmmmm

wicked notch
#

I have come to the realization that with HWVSM none of the techniques you guys use for caching work, so I'll have to invent my own

#

either that, or I go back to SWVSM

distant lodge
#

you should do vulkan 1.2 compatible SWVSM :^^^)

#

that can run on a gtx 760

#

at 60fps on bistro

wicked notch
#

In any possible case, I can't simply set gl_Position and write

#

I need to shift the ndc position by some factor and have it wrap around to raster to the correct page

wispy spear
#

i took the liberty to acquire a potential new vsm customer in #wip

raven orchid
#

Like if your internal marker says the page is cached, don't include it in hpb

#

Which means hpb culling will delete it frogdelet

wicked notch
#

the big issue is stable addressing

raven orchid
#

Oh yeah

#

I've had some success with fractional shifting of render matrices and caching

#

And then for addressing I use the untranslated version

wicked notch
#

yeah, my only hope right now is to do some unholy math

#

much more unholy than what Jaker cooked up kekkedsadge

wispy spear
#

abusing div by zero incoming 😄

wicked notch
#

ok simplest case

#

truncate camera's position

#

time to draw

#

suppose camera position = stable origin = 0

#

then view = stable_view

#

now what happens when I move one unit to the right

#

only god knows (jk I'm computing)

wispy spear
#

: )

cold sky
#

btw Light Perspective Shadow Mapping used to do some unholy trick with Z/W divide to have shit wraparound with weird perspective

#

but here you need to wraparound by 2 axes

#

in HW the only way to do that is to set up 4 separate viewports and multi-cast/cull your meshlets into them

#

ngl, SW raster (at least SW Raster Output Processing) is sounding nicer and nicer by the day

wicked notch
#

ok no matter

#

caching is dead

#

very good

raven orchid
#

lvstri for sw raster

#

didn't you say yesterday nanite only needs to call render shadows once?

wicked notch
#

yeah I'm switching back to SWVSM

#

goodbye HWVSM, it was fun while it lasted

wicked notch
#

ye they do it only once

#

multiview ftw

raven orchid
#

wow that's extremely powerful

wicked notch
#

the HWVSM branch is gone

#

I am not crying

wispy spear
#

maybe one day, while sitting on the loo, it will hit you : ) but not today hehe

frank sail
#

Even with hw sparse actually

wicked notch
#

what do you mean

frank sail
#

Idk just felt like saying stuff

#

Actually I'm probably misinfo about hw sparse there but I can't think bc I just woke up

wicked notch
#

giving me false hope smh

wicked notch
#

no hardware :(

wicked notch
#

I thought I'd take a bit of a detour and do ImGUI

#

so that I can have sexy debug visuals like Jaker and Saky

#

however there is a small issue, to render an image in ImGUI I need to give it a descriptor set with an image bound

#

...How do I make this descriptor set, I don't have access to the internal layout ImGUI creates

frank sail
#

either guess or read the source

wicked notch
#

I need the handle itself tho

#

The VkDescriptorSetLayout inside of ImGUI

#

how the hell do I do this bleakekw

#

I don't wanna create a new backend

distant lodge
#

just write your own ba-

#

nvm

wicked notch
#

@delicate rain how does daxa handle it

frank sail
#

Use OpenGL interop and use the gl backend

#

frog_shrimplele as that

wicked notch
#

This is the universe calling me back to GL innit

delicate rain
#

I believe we have our own shaders into which we pass the daxa::ImageIds

#

and samplers

wicked notch
#

so you have your own backend, epic

#

Ok since I really don't want to spend the night writing a backend for shitty ImGUI

delicate rain
#

Yeah

wicked notch
#

I'll do something cursed

delicate rain
#

Just spend the night porting to Daxa

#

😈😈

wispy spear
#

hehe the purple fits

frank sail
frank sail
#

Just find someone's code that uses it

glass sphinx
#

thats how daxa handles it

#

the imgui backends are not meant to be used really they are mostly examples

#

you should make your own

wicked notch
#

sigh

#

I will

glass sphinx
#

i love how the devil emoji is daxa colored

frank sail
#

it's not impossible to use them though bleakekw

wispy spear
#

it should also be fairly simple to make an imgui backend

#

its literally just shoving vertices and indices into a vbo/ibo, loading 2 shrimple shaders, and textures : )

glass sphinx
#

they are fine

#

but usually they start to be annoying at some point

#

for daxa it took me a few hours (i had to debug a while cause the stupit in my head)

wicked notch
#

Thank god vulkan is well designed

wispy spear
#

lol

wicked notch
#

as it turns out you don't need to create a descriptor set out of the exact handle, you just need the layouts to be compatible

wispy spear
#

that is something i noticed during my early days with bulkan

#

the validator layer didnt always slap me in the face, only occassionally

#

while i was expecting it would at all times i fook something up

#

like having uniforms and shaders not be compatible

#

i think i forgor a part in the shader, while it was described in all the descriptorisms or other way around, it didnt yell at me

delicate rain
#

I was recently creating graphics pipeline with zero color and depth attachments and it didn't say anything

#

and still wrote to the first bound output from frag shader

wispy spear
#

ah i think i had something like that too

delicate rain
#

also does not complain when you fuck up the usage flags - for example when you don't set texture as sampled it will say nothing but the sampler reads will just return nothing

#

very fun to debug haha

wispy spear
#

speaking of vkguide, i should also continue : (

delicate rain
#

yes!

wispy spear
#

memory transfer nonsense : )

frank sail
#

I do that for VSM because I just need image store

delicate rain
#

yeah but I was binding two color attachments during begin rendering

#

which should not be compatible with the pipeline no?

frank sail
#

Yeah that seems wrong

wicked notch
#

This is my first time setting up ImGui on Vulkan lol

#

how do I get rid of that shitty border around my window

#

not the viewport, the main window

#

this border

wispy spear
#

ImGui_NextWindowItemPos or something

#

or per style

#

one sec

wicked notch
#

can you pass your own style somehow

#

or is it copyrighted

frank sail
#

Certain imgui calls require registering a license

wispy spear
#

: )

#

thats my dark style, i think, the one jaker stole for frogfood too

frank sail
#

All of my imguiisms are in Gui.cpp

wispy spear
#

was about to say, perhaps steal from frogfood, because of c# isms and whatnot

frank sail
#

Including the pirated style

wispy spear
#

and dont forget that one little thing

#

style.WindowMenuButtonPosition = ImGuiDir.None;

#

to get rid of that dropdown thingy next to the window "tab"

frank sail
#

I don't get the point of that button tbh

wispy spear
#

you also need to y offset your fonts, when using fontawesome

frank sail
#

I had to use trial and error to find the correct offset agonyfrog

wispy spear
#

probably different per font too somehow

frank sail
#

It is

distant lodge
#

it's the ascent of the font probably

wicked notch
#

sick style

#

but the fix was using those two little shits

```cpp
ImGui::PushStyleVar(ImGuiStyleVar_WindowBorderSize, 0.0f);
ImGui::PushStyleVar(ImGuiStyleVar_WindowPadding, ImVec2());
```
#

now border is no more

frank sail
#

Nice

#

Needs FSR 2 though

wicked notch
#

it really does

#

this night will be spent implementing basic features

#

like AA KEKW

wispy spear
#

what is this lvstri, irisvk 2.0?

wicked notch
#

irisvk except it's not a VSM only showcase bleakekw

frank sail
#

irisvuk

wispy spear
#

irisfukall

wicked notch
#

and there's no sRGB issues at all!

#

amazing

#

there's a severe lack of shadows

#

but I'm too deep into the refactoring now

wispy spear
#

noice

#

with all these 4kisms too

distant lodge
frank sail
#

It's just a pixel offset

cold sky
frank sail
#

continue reading the convo

frank sail
#

yep

wicked notch
#
```glsl
// they are the same image
layout (rgba32f, set = 0, binding = 0) uniform writeonly image2D u_history_storage;
layout (set = 0, binding = 1) uniform sampler2D u_history_sampled;

void main() {
    const vec4 x = textureLod(u_history_sampled, gl_GlobalInvocationID / vec2(textureSize(u_history_sampled, 0)), 0);
    x = do_something(x);
    imageStore(u_history_storage, gl_GlobalInvocationID.xy, x);
}
```
is this legal
frank sail
#

If u never read after a write to that texel, I think it's fine

#

But I do not think stores are visible to samplers without a pipeline barrier

wicked notch
#

Ye I don't want to read again after I write

cold sky
cold sky
#

there's no way to mark the sampler2D as coherent so any writes to the Storage image have no guarantee of showing up

#

furthermore, you "might" accidentally tap other pixels if you have the wrong sampler set (bilinear interpolation, etc)

#

why aren't you just using the image itself ?

#

like imageLoad and then imageStore to the same location ?

frank sail
#

btw writeonly images don't need the format in the layout

cold sky
# cold sky this is UB territory

its literally "barely legal", like maaaybe maaaybe if you do all the following it will work:

  • no interpolation, literal NEAREST sampler
  • only draw a single pixel with the pipeline at any location, and have an image barrier before and after
  • image layout is GENERAL
    etc.
cold sky
frank sail
cold sky
#

but true, 99% of devices support write without format

#

so Nabla requires that feature

#

but still in Vulkan you need to add some stuff to VkStruct when making the image view IIRC

#

its not magically enabled on everything

cold sky
wicked notch
#

I could do it myself but eh

wicked notch
#

ye this is very ub

cold sky
#

how are you going to linearly interpolate ?

#

:dafuq:

wicked notch
#

Ye I can't lol

cold sky
#

you write the texels with the invocations

wicked notch
#

I'll just use two images

cold sky
#

and then you want to read at an offset of 0.5

frank sail
cold sky
#

btw, your original image would read texel values 0.5 away from pixel center, and store them to pixel center

#

this basically computes a quasi 1/2 downsample

#

this textureLod(u_history_sampled, gl_GlobalInvocationID / vec2(textureSize(u_history_sampled, 0)), 0); will give you 0.5 pixel less in each axis

#

so when globalinvID is 0,0 you end up tapping pixels {0,0} {-1,0}, {0,-1}, {-1,-1}
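The half-texel point can be checked with a bit of arithmetic. The sketch below assumes the usual bilinear footprint rule (texel centres sit at integer + 0.5 in texel space) and works on one axis only; `bilinear_tap_1d` is a hypothetical helper, not part of any graphics API.

```cpp
#include <cassert>
#include <cmath>

struct BilinearTap {
    int base_texel;  // lower texel index of the two that get blended
    float fraction;  // blend weight toward base_texel + 1
};

// For a normalised coordinate uv on an axis with `size` texels, returns
// which pair of texels a bilinear sampler would blend and with what weight.
inline BilinearTap bilinear_tap_1d(float uv, int size) {
    assert(size > 0);
    const float coord = uv * static_cast<float>(size) - 0.5f;
    const float base = std::floor(coord);
    return { static_cast<int>(base), coord - base };
}
```

With `size = 8`, invocation 0 sampled at `0 / 8` gives base texel -1 with weight 0.5, so it blends the out-of-range texel -1 with texel 0. Sampled at `(0 + 0.5) / 8` it gives base texel 0 with weight 0, which is exactly the centre of texel 0.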

wicked notch
#

yeah that was just quick way to demonstrate my issue

#

I do this now ```glsl
#version 460

layout (local_size_x = 16, local_size_y = 16) in;

layout (set = 0, binding = 0) uniform sampler2D u_velocity;
layout (set = 0, binding = 1) uniform sampler2D u_color;
layout (set = 0, binding = 2) uniform sampler2D u_history_sampled;
layout (rgba32f, set = 0, binding = 3) uniform image2D u_final_color;

void main() {
    const vec2 resolution = vec2(textureSize(u_color, 0));
    if (any(greaterThanEqual(gl_GlobalInvocationID.xy, ivec2(resolution)))) {
        return;
    }

    const vec2 uv = (vec2(gl_GlobalInvocationID.xy) + 0.5) / resolution;
    const vec2 velocity = textureLod(u_velocity, uv, 0).rg;
    const vec2 prev_uv = uv - velocity;

    const vec4 current_color = textureLod(u_color, uv, 0);
    const vec4 previous_color = textureLod(u_history_sampled, prev_uv, 0);
    imageStore(u_final_color, gl_GlobalInvocationID.xy, mix(current_color, previous_color, 0.9));
}
```

cold sky
#

ah you're doing TAA

wicked notch
#

all images are different so no UB

cold sky
#

btw use texelFetch

#

or Jaker's FSR2 on GL

#

or ask @hallow umbra for help, as he's poured ungodly amounts of time into TAA

#

maybe got some code to throw at you, esp that you're using visbuffer

frank sail
#

So even better

cold sky
#

FSR3?

#

FSR3, even better than FSR2

frank sail
#

real

wispy spear
#

FSR4 when

frank sail
hallow umbra
#

effort spent making good TAA is better spent tweaking your masks and stuff to give FSR/streamline the highest quality inputs you can

wispy spear
#

@hallow umbra what happened to your sparkly TAA world btw? its been a while since you posted pics of progress : )

hallow umbra
#

nothing, i just did everything interesting that i could think of

wispy spear
#

oh oki

#

any other projects going then?

hallow umbra
#

nop, i'm just out of ideas for anything graphics related

#

i don't feel motivated making things that someone else already did and better

#

and i investigated all techniques that i thought are underlooked

wispy spear
#

you could give virtual shadow maps a try ;P

#

assist lvstri/saky/jaker unlocking its secrets

frank sail
wicked notch
#

uhh

#

I think NV knows

#

In my frag shader, I output a zero motion vector for now

#

And DLAA starts up in "NO_MV_MODE"

#

however as soon as I put some other value in it, NGX immediately switches to "LOWRES_MV_MODE"

#

how the fuck does it know

#

I actually don't even need to put any other value in it, if I do some operation that results in 0, it also switches

#

???

wispy spear
#

driver detects access to api, and flips a switch perhaps

wicked notch
#

Ok so NV calculates motion vectors like this in their sample

#
```hlsl
void main(
    in float4 i_position : SV_Position,
    in float2 i_uv : UV,
    out float4 o_color : SV_Target0
)
{
    o_color = 0;

#if USE_STENCIL
    uint stencil = t_GBufferStencil[i_position.xy].y;
    if ((stencil & g_TemporalAA.stencilMask) == g_TemporalAA.stencilMask)
        discard;
#endif
    float depth = t_GBufferDepth[i_position.xy].x;

    float4 clipPos;
    clipPos.x = i_uv.x * 2 - 1;
    clipPos.y = 1 - i_uv.y * 2;
    clipPos.z = depth;
    clipPos.w = 1;

    float4 prevClipPos = mul(clipPos, g_TemporalAA.reprojectionMatrix);

    if (prevClipPos.w <= 0)
        return;

    prevClipPos.xyz /= prevClipPos.w;
    float2 prevUV;
    prevUV.x = 0.5 + prevClipPos.x * 0.5;
    prevUV.y = 0.5 - prevClipPos.y * 0.5;

    float2 prevWindowPos = prevUV * g_TemporalAA.previousViewSize + g_TemporalAA.previousViewOrigin;

    o_color.xy = prevWindowPos.xy - i_position.xy;
}
```
#

And I gotta say, what the fuck

#

is this

#

What is a reprojection matrix

delicate rain
#

Probably last frames clip->world

wicked notch
#

Uhh

#

maybe

#
```cpp
viewReprojection = inverse(view->GetViewMatrix()) * viewPrevious->GetViewMatrix();
reprojectionMatrix = inverse(view->GetProjectionMatrix(false)) * affineToHomogeneous(viewReprojection) * viewPrevious->GetProjectionMatrix(false);
```
#

It's whatever this does

delicate rain
#

Huuuuuh

#

Wtf is affineToHomogeneous

wicked notch
#
```cpp
template <typename T, int n>
matrix<T, n+1, n+1> affineToHomogeneous(affine<T, n> const & a)
{
    matrix<T, n+1, n+1> result;
    for (int i = 0; i < n; ++i)
    {
        for (int j = 0; j < n; ++j)
            result[i][j] = a.m_linear[i][j];
        result[i][n] = T(0);
    }
    for (int j = 0; j < n; ++j)
        result[n][j] = a.m_translation[j];
    result[n][n] = T(1);
    return result;
}
```
???????????????????????????
delicate rain
#

What's wrong with just doing vertex * prevMVP - vertex*thisMVP in your vert shader?

#

Why do you have to do this cursed solution
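For comparison, the per-vertex approach can be sketched on the CPU. This is a hedged sketch, not the DLSS sample's method: it assumes both clip-space positions are already computed (current and previous MVP applied to the same vertex), does the perspective divide before subtracting, and returns the velocity in UV units with the sign that makes `prev_uv = uv - velocity` hold. The struct and function names are hypothetical, and the y flip some APIs need is left out.

```cpp
#include <cassert>

struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Perspective divide followed by the NDC [-1, 1] to UV [0, 1] remap.
inline Vec2 clip_to_uv(const Vec4& clip) {
    assert(clip.w > 0.0f); // positions behind the camera need separate handling
    return { (clip.x / clip.w) * 0.5f + 0.5f,
             (clip.y / clip.w) * 0.5f + 0.5f };
}

// Velocity such that previous_uv == current_uv - velocity.
inline Vec2 uv_velocity(const Vec4& current_clip, const Vec4& previous_clip) {
    const Vec2 current = clip_to_uv(current_clip);
    const Vec2 previous = clip_to_uv(previous_clip);
    return { current.x - previous.x, current.y - previous.y };
}
```

Subtracting the raw clip-space vectors without the divide only gives the same answer when `w` is 1 for both positions, which holds for an orthographic projection but not for a perspective one.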

wicked notch
#

I don't

#

I do that in fact

#

This is the DLSS sample's app code

wicked notch
#

Thing is, I dunno if it is actually correct, because NV's docs tell me to do this

#

Whatever this means

#

#questions message
Here, check this out

#

In my thing stuff looks very aliased when I move around

delicate rain
#

I have no clue what's going on, why do they read the velocity from the texture only if both XY are nonzero?

wispy spear
#

perhaps negative/zero velocities need special treatment?

delicate rain
#

Isn't negative velocity just moving in the opposite direction of positive velocity? (Aka back positive forward negative or the other way around)

wicked notch
#

ye NV's docs don't mention anything about that

#

They just say this

delicate rain
#

So the reprojection matrix goes prevFrameClip -> prevFrameView-> prevFrameWorld -> thisFrameView -> thisFrameClip

wicked notch
#

btw

#

I am starting to think my derivative calculation isn't accurate enough

#

and my motion vectors are fine

wicked notch
#

Other fun fact NVSDK_NGX_VK_Feature_Eval_Params::Sharpness does apparently nothing

#

Maybe it only works for DLSS, I'm using DLAA

wicked notch
#

damn DLSS takes longer than it takes for me to rasterize the scene

#

amazing

delicate rain
#

Can you just use DLSS I thought FSR is the only one which you can freely use?

wicked notch
#

ye you can just clone it and use it

glass sphinx
#

its source available?

delicate rain
#

I doubt that

wicked notch
#

heh, it's a 33MiB DLL

#

zero source whatsoever

glass sphinx
#

uuuhm

#

is that allowed?

wicked notch
#

is what allowed?

glass sphinx
#

to use it

wicked notch
#

ye

#

as far as I know

glass sphinx
#

do they not even have headers?

wicked notch
#

Oh ye they do have headers

glass sphinx
#

ah ok

wicked notch
#

Here

wicked notch
#

man

#

AA is so nice

#

but you know what would be nicer

#

Figure out why the fuck I get unstable AA when moving around at the edges of triangles

wicked notch
#

Actually not even that

#

what the hell is wrong with DLAA

delicate rain
#

mister mister do you have code for your project using the ktx thingy?

frank sail
#

Why no fsr2

delicate rain
#

yes

wicked notch
delicate rain
#

I'm finally writing a scene loader

wicked notch
wicked notch
#

this is with DLAA applied

#

lol

wicked notch
#

I have decided that shipping spirv isn't so bad after all

frank sail
wicked notch
#

I do

frank sail
wicked notch
frank sail
wicked notch
#

For proper KTX management check out Jaker's code, he actually checks whether a texture is supercompressed, needs transcoding, etc.

wicked notch
delicate rain
wicked notch
#

There is but it's garbage

wicked notch
#

The function is ktxTexture_VkUploadEx

#

But nobody uses it because it's bad

delicate rain
#

bruuuh I'm starting to understand handmade ppl

frank sail
#

Ktx also has a GL upload function but it sucks too agonyfrog

delicate rain
#

I guess it is impossible to just take in a single pointer to a buffer into which we like to load our data

#

nono I AM THE LIBRARY I RETURN THE POINTER

#

ugh

wicked notch
#

Just memcpy my boy

delicate rain
#

yeah

#

I cope

wicked notch
wicked notch
# wicked notch

Jaker can you run FSR2 in AA mode like this? (normals only, bistro)

frank sail
#

Yes but it's not optimized for that

#

Fsr2 does a bunch of unnecessary work when you use it for 1x upscale (AA)

wicked notch
#

I'm not looking at 🅱️erf

#

Just if a correct impl of FSR2 also suffers from the same aliasing

frank sail
#

Are you asking me to test

#

Because I'm away from my PC for the next few days

wicked notch
#

oh rip

#

I'll just pull latest frogfood then

frank sail
#

Yeeeeeee

#

It's hard for me to tell if your vid shows poopy aliasing or compression artifacts (which are worsened due to discord mobile sucking)

wicked notch
#

Looks like FSR2 is the same

#

is this just a limitation of modern AA

wispy spear
#

that video looks neat nonetheless

frank sail
#

It's probably worse without any AA

wispy spear
#

the tree would probably go bonkers without

wicked notch
#

yeah definitely worse without AA

#

welp, I've ran all possible sanity checks, it looks like my impl of DLSS is without errors

#

At least, without obvious errors bleakekw

wicked notch
#

here is yet another fun fact about DLSS

#

Here's good ol sponza

#

looks pretty bad innit, well it is performance mode DLSS

#

Now here's NV's sponza

#

Also in performance mode, same resolution

#

wait lemme remove the shadows

#

There it is

#

notice any difference?

#

How in god's name is their sample's app, from which I stole all the code, look so much better

#

what in the everloving fuck

wispy spear
#

rename your shiddy.exe to whatevernvused.exe

wicked notch
#

actually

#

let me try that

#

I swear to god if something changes I'm pulling DLSS out

wispy spear
#

lol

#

im confident it willnt change anything

wicked notch
#

yeah

#

on the 0.1% chance it does tho

glass sphinx
#

😈

wispy spear
#

drivers might use hashes not filenames anyway i suppose

wicked notch
#

I'm changing the AppID

#

ok DLSS is safe

#

nothing changed

#

I'll try asking on NV's forums/discord

wispy spear
#

was worf a try

frank sail
cold sky
#

I've had the OptiX one refuse to work cause Mitsuba splatted samples to multiple pixels with a gaussian kernel

cold sky
#

and quite a lot of moire on your curtains

wicked notch
#

I disabled all post processing in the sample app btw

#

What I don't understand rn is the moiré

#

I am using the same LOD bias as they are using

frank sail
#

Did you check RenderDoc

#

Compare render and display resolutions, image formats, sampler state, etc

wicked notch
#

One thing I am noticing

#

In my stuff, I can only see the edges of objects jittered

#

Even with a 6.0x magnifier

#

In the sample app though I can see everything jittering thonk

frank sail
#

Are you jittering your view or projection matrix

wicked notch
#

The proj matrix

#
```cpp
const auto jitter = sample_jitter(_device->frame_counter().current(), _state.dlss.jitter_count);
const auto jitter_translation = glm::vec3(2.0f * jitter / glm::vec2(_state.dlss.render_resolution), 0.0f);
const auto jitter_matrix = glm::translate(glm::mat4(1.0f), jitter_translation);
view.jittered_projection = jitter_matrix * view.projection;
```
#

As nvidia tells me to do
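`sample_jitter` is not shown in the log. A Halton (2, 3) sequence is one common choice for sub-pixel jitter, so the sketch below assumes that; it is not necessarily what the project or NVIDIA's sample uses. All names here are hypothetical. The last helper shows the pixel to NDC conversion that the `2.0f * jitter / render_resolution` term above performs.

```cpp
#include <cassert>
#include <cstdint>

// Radical inverse of `index` in the given base, in [0, 1).
inline float halton(std::uint32_t index, std::uint32_t base) {
    float result = 0.0f;
    float fraction = 1.0f;
    while (index > 0) {
        fraction /= static_cast<float>(base);
        result += fraction * static_cast<float>(index % base);
        index /= base;
    }
    return result;
}

struct Jitter { float x, y; };

// Sub-pixel jitter in [-0.5, 0.5) pixels, repeating every `count` frames.
inline Jitter pixel_jitter(std::uint32_t frame, std::uint32_t count) {
    assert(count > 0);
    const std::uint32_t index = (frame % count) + 1; // index 0 would always give 0
    return { halton(index, 2) - 0.5f, halton(index, 3) - 0.5f };
}

// NDC spans 2 units across the render target, so one pixel is 2 / resolution.
inline float jitter_to_ndc(float jitter_pixels, float resolution) {
    return 2.0f * jitter_pixels / resolution;
}
```

Whatever sequence is used, every pass that reconstructs positions (for example a visibility buffer resolve) has to see the same jittered projection as the rasterization pass.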

frank sail
#

rip idk

#

I thought your bug could be that you only jitter gl_Position but not the other attributes

wicked notch
#

I solved it

#

I forgor I was using a different view struct for the visbuffer resolve

#

goddamnit

wicked notch
#

we've done it boys

frank sail
#

How

wicked notch
# frank sail How

with the power of friendship and copy pasting code from the sample app

wispy spear
#

the dethfrog had a hand in this probably 🙂

wicked notch
#

I kinda have a defcon0 situation on my hands

#

debugPrintfEXT gives me device loss 💀

wispy spear
#

6 3 8 2 5 9 2 1

#

here the launch codes

wicked notch
#

this bug is megaweird ngl

#

also DLSS is enabling the deprecated VK_EXT_buffer_device_address

#

so that's fucking up my validation layers too

#

I fear integrating DLSS should be the last possible step of any engine

#

because it makes debugging impossible bleakekw

cold sky
#

This is why FOSS is best

#

You can go in and fix that shit

wicked notch
#

ye this absolutely sucks

wicked notch
#

sigh

#

looks like I've been debugging nothing for two hours

#

!remindme 12h open debug printf issue

vivid boughBOT
#

Alright lvstri, I'll remind you about open debug printf issue in 12 hours. ID: 62513782

wicked notch
#

thanks bot

#

imma go cry myself to sleep

#

at least debugPrintfEXT works now

wicked notch
#

and just now I notice that my page table isn't actually wrapping around bleakekw

#

worst texture viewer in the world btw

wispy spear
wicked notch
#

asteroids? wym

wispy spear
#

the texture viewer thingy

#

looks cool how it changes as you move

delicate rain
#

that is because mister has no caching still

#

daily reminder to add caching mister LVSTRI

wispy spear
#

heh

wicked notch
#
```glsl
#define sampler_partially_bound decorate_with_string("update_after_bind|partially_bound")

layout (local_size_x = 16, local_size_y = 16) in;

layout (set = 0, binding = IRIS_TEXTURE_TYPE_2D_SFLOAT) sampler_partially_bound uniform sampler2D u_texture_2d_sfloat;
layout (set = 0, binding = IRIS_TEXTURE_TYPE_2D_SINT) sampler_partially_bound uniform isampler2D u_texture_2d_sint;
layout (set = 0, binding = IRIS_TEXTURE_TYPE_2D_UINT) sampler_partially_bound uniform usampler2D u_texture_2d_uint;
layout (set = 0, binding = IRIS_TEXTURE_TYPE_2D_ARRAY_SFLOAT) sampler_partially_bound uniform sampler2DArray u_texture_2d_array_sfloat;
layout (set = 0, binding = IRIS_TEXTURE_TYPE_2D_ARRAY_SINT) sampler_partially_bound uniform isampler2DArray u_texture_2d_array_sint;
layout (set = 0, binding = IRIS_TEXTURE_TYPE_2D_ARRAY_UINT) sampler_partially_bound uniform usampler2DArray u_texture_2d_array_uint;
```
ahhh yes
#

modern GLSL code

#
```cpp
static auto make_descriptor_binding_flag_from_decoration(const std::string& decoration) -> descriptor_binding_flag_t {
    const auto split = split_decoration_string(decoration);
    auto result = descriptor_binding_flag_t();
    for (const auto& each : split) {
        if (each == "update_after_bind") {
            result |= ir::descriptor_binding_flag_t::e_update_after_bind;
        } else if (each == "update_unused_while_pending") {
            result |= ir::descriptor_binding_flag_t::e_update_unused_while_pending;
        } else if (each == "partially_bound") {
            result |= ir::descriptor_binding_flag_t::e_partially_bound;
        } else if (each == "variable_descriptor_count") {
            result |= ir::descriptor_binding_flag_t::e_variable_descriptor_count;
        }
    }
    return result;
}
```
mmmm
#

love it

frank sail
#

Reflection my beloved

#

Using shading langs makes you wish for a nuclear winter

wispy spear
#

hmm the e_ is ugly too, its quite obvious that its an enum already, otherwise iris* is quite sexy code wise

wicked notch
glass sphinx
#

why do you reflect at all

#

ah its not fully bindless?

wicked notch
#

it's not 😦

#

I still have to steal your gpu table of resources

#

one day I'll be 100% bindless

wispy spear
#

hmm we should make use of discord's soundboard 😄

frank sail
#

Doesn't that require you to be in voice

wispy spear
#

no idea tbh

#

but makes sense yeah

glass sphinx
#

now i think daxas descriptor code shrunk down to like 300loc for all descriptor management and i added a lot of validation

wicked notch
#

server wide soundboard

glass sphinx
wicked notch
glass sphinx
#

👹

#

youuung maan

wicked notch
#

I know

glass sphinx
#

🟪 its time to wear purple

delicate rain
#

we send you merch

#

you write rt api for daxa

runic surge
#

now i agree with that message

wispy spear
#

that guy reminds me of my racoon, who comes visit here every once in a while 😄

glass sphinx
#

omg i love saky so much

wicked notch
glass sphinx
#

i actually look like that irl

wispy spear
#

its more like this, potrick == picard, crusher == lvstri, they even sit on daxa coloured chairs 😄

runic surge
wicked notch
#
```hlsl
static float3 RandomVectorInCone(in float3 direction, in float angle) {
    // Per-pixel, per-sample seed for the PCG generator.
    const uint3 pixelCoord = DispatchRaysIndex();
    const uint3 dispatchDimension = DispatchRaysDimensions();
    const uint pixelIndex = pixelCoord.y * dispatchDimension.x + pixelCoord.x;
    const uint sampleIndex = RayTraceCB.CurrSampleIdx;
    uint state = pixelIndex * sampleIndex;

    // Uniform sample on the spherical cap around +Z: z is uniform in [cos(angle), 1].
    const float phi = RandomPCG(state) * 2 * 3.14159265358979323846;
    const float z = RandomPCG(state) * (1 - cos(angle)) + cos(angle);
    const float x = sqrt(1 - z * z) * cos(phi);
    const float y = sqrt(1 - z * z) * sin(phi);
    // Basis around `direction`; degenerate when direction is parallel to (0, 1, 0).
    const float3 tangent = normalize(cross(float3(0, 1, 0), direction));
    const float3 bitangent = cross(direction, tangent);
    const float3x3 rotation = float3x3(tangent, bitangent, direction);
    return normalize(mul(float3(x, y, z), rotation));
}
```
#

for posterity

wicked notch
#

0.00872665

runic surge
#

Don't mind if i yoink that

wicked notch
glass sphinx
#

lvstri what gpu do you have

wicked notch
#

3070 doc

glass sphinx
#

nice

wicked notch
#

I may or may not be procrastinating on caching for my shadows with RT

#

ngl the RT API in Vulkan is super convoluted wtf

glass sphinx
#

it also has lots of options that are just not useful

#

like cpu side build

#

early days

wicked notch
#

btw @frank sail

#

I figured out a very much more shrimpler way of doing your unhinged glm::inverse(bababooey) * baba_is_you * stable_view

frank sail
#

please god yes

wicked notch
#
```cpp
const auto clip_world_position = view.stable_proj_view * glm::vec4(_camera.position(), 1.0f);
const auto uv_world_position = (glm::vec2(clip_world_position) / clip_world_position.w) * 0.5f;
const auto page_offset = glm::ivec2(uv_world_position * glm::vec2(IRIS_VSM_VIRTUAL_PAGE_ROW_SIZE));
const auto ndc_shift = 2.0f * (glm::vec2(page_offset) / glm::vec2(IRIS_VSM_VIRTUAL_PAGE_ROW_SIZE));
const auto world_page_offset = view.inv_stable_proj_view * glm::vec4(ndc_shift, 0.0f, 1.0f);
const auto world_page_offset_shift = glm::vec3(-world_page_offset);
const auto shifted_view = glm::translate(view.stable_view, world_page_offset_shift);
view.view = shifted_view;
```
frank sail
#

will analyze in a bit

wicked notch
#

Does this suffer from "if player moves too far away then Z range is fucked" problem I wonder

frank sail
#

probably

wicked notch
#

because I'm supposedly translating the view matrix to where the player is, to the nearest page

frank sail
#

I mean if you're just shifting xy then yes

#

it will suffer

wicked notch
#

rip

frank sail
#

the solution for z will be more complicated

#

you will need a per-page z offset or something

wicked notch
#

or do what saky is doing

#

which is probably a per-page z offset bleakekw

frank sail
#

when we discussed, I understood that it did not solve that problem

wicked notch
#

o

#

so saky's thing is massively more clamplicated but it doesn't solve the problem? 💀

frank sail
#

that's how I understood it, idk

wicked notch
#

welp

#

I guess we'll make full use of our god given fp32 precision

cold sky
#

for orthographic projection, fp32 makes no sense

#

use/emulate unorm32

#

with fp32 you waste 2 bits

#

and you have a logarithmic distribution of the remaining 30

wicked notch
#

how do you emulate unorm32?

#

rn I just do floatBitsToUint(gl_FragCoord.z)

frank sail
#

I mean, you don't have a real depth buffer

#

so you can use any format you want

wicked notch
#

ye but I can't do atomicMin on a unorm32 image can I

frank sail
#

you can do it on a uint32 image tho

#

unorm is just that, but with an implicit division by U32_MAX

#

I suppose you will have to do your math in fixed point to see any real benefit though
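A minimal CPU-side sketch of what "emulating unorm32" could look like: depth in [0, 1] is scaled to the full uint32 range, so representable values are evenly spaced and an integer atomicMin on the stored bits still orders depths correctly. The function names are hypothetical. The input is a double only because a float cannot hold 4294967295 exactly; on the GPU the depth would have to come from fixed-point (or otherwise higher-precision) math to gain anything, as noted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Encode a depth in [0, 1] as a 32-bit unorm stored in a plain uint32.
inline std::uint32_t depth_to_unorm32(double depth) {
    const double clamped = std::min(1.0, std::max(0.0, depth));
    // Double keeps U32_MAX exact; a float would round it up to 2^32.
    return static_cast<std::uint32_t>(clamped * 4294967295.0 + 0.5);
}

// Decode: the "implicit division by U32_MAX" done by hand.
inline double unorm32_to_depth(std::uint32_t bits) {
    return static_cast<double>(bits) / 4294967295.0;
}
```

Because the mapping is monotonic, the smaller integer always corresponds to the smaller depth, which is the property a min-based depth test relies on.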

delicate rain
wicked notch
#

pog

delicate rain
#

I have to have per page z offset

frank sail
#

ah

delicate rain
#

And my thingy

#

And a bit more logic and it still has some quirks

#

So id suggest just go for sliding along the plane bleakekw

wicked notch
#

can't we fix by translating the origin of the world somehow

#

perhaps by recreating the stable view matrix to point at another center

delicate rain
#

You need to correct the depths then

#

That's what I do

#

Uh maybe I misunderstood actually

frank sail
#

you can invalidate everything if the player goes too far from the origin (on the light-space z axis), if you want to use minimal effort

#

then you can shift the light camera

wicked notch
#

what's an invalidation every time you move 2000km

frank sail
#

good luck getting sufficient z precision

wicked notch
#

rip

frank sail
#

you could make the frustum length like 1000 units and then shift every 500 (with a buffer zone to prevent the player from constantly triggering full refreshes by moving past a threshold)
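The shift-with-a-buffer-zone idea can be sketched as a small hysteresis check on the player's light-space z. The 500 unit step comes from the message above; the struct name, the 50 unit margin, and the "return true means invalidate everything" convention are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>

struct LightDepthWindow {
    float center = 0.0f;   // light-space z the frustum is currently centered on
    float step = 500.0f;   // recentering granularity
    float margin = 50.0f;  // hypothetical buffer zone against threshold flicker

    // Returns true when the window moved, i.e. cached pages need a full refresh.
    bool update(float player_z) {
        const float distance = std::fabs(player_z - center);
        if (distance <= step * 0.5f + margin) {
            return false;  // still inside the tolerated band, keep the cache
        }
        center = std::round(player_z / step) * step;  // snap to nearest step
        return true;
    }
};
```

With these numbers the window only moves once the player is more than 300 units from the center, so walking back and forth across the 250 unit midpoint between two centers does not trigger repeated refreshes.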

delicate rain
#

Btw how do you deal with player going into negative coordinates from the origin? Won't it shift the sun camera underneath the terrain?

wicked notch
#

depth clamping smart

frank sail
#

yeah not much you can do there except make the frustum longer

#

most game content probably won't span such a huge area

#

vertically, that is

delicate rain
#

I want Star Citizen planets 🥸

frank sail
#

make a bigger frustum

wicked notch
#

full scale planets yes

delicate rain
#

My frustum size increases with each clipmap

wicked notch
#

actually

#

make the galaxy cast a shadow

frank sail
#

you don't need insane precision when you are 50,000,000 km from the surface

delicate rain
#

So it actually is fine

wicked notch
#

everything would be so much easier if we had infinite memory and infinite precision smh

wicked notch
#

raytracing is bad for my health

#

it's 3am and I am staring at path traced power plant

#

I have been staring at it for 10 minutes

#

this is a cry for help

frank sail
#

go to sleep and dream about path traced frogs

wicked notch
#

thank god tomorrow is saturday

glass sphinx
#

i should sleep

runic surge
wicked notch
#

oh yeah

#

frame time variance

#

no variance at all

#

This is even worse KEKW

#

amazing

runic surge
#

are those frame times supposed to be normal?

#

i have no clue but usually frame times don't become sinusoidal

wicked notch
#

depends on your definition of normal 💀

runic surge
#

babe wake up new frametimes just dropped

#

technically speaking

#

average frametime is 🔥

#

just ignore the 1% lows

wicked notch
#

does this mean my blocker search has not enough shrimples?

frank sail
#

Looks fine to me

wicked notch
#

me when shown literally any kind of contact hardening:

#

but perhaps the light size is too big KEKW

frank sail
#

Add sliders for sample count, width, etc

wicked notch
#

ye

twin bough
#

i have pcss too ill ask if i can share it

#

heh

#

looks mega stupid tho

frank sail
#

I'll allow it

twin bough
glass sphinx
#

nice

twin bough
#

also works for local lights

frank sail
#

I could recognize that rust texture anywhere

twin bough
#

sauce

frank sail
#

so it seems like UE5 do be making an HZB for the VSM

#

tbh I think hzb would work if you have a two-pass approach

wicked notch
#

remember that unreal's meshlets are 99% of the time smaller than a page

#

so they can do HZB per page

#

we gotta think more heavily about it bleakekw

frank sail
#

ye I'm doing that rn

#

ALSO

#

I determined that HZB is only helpful for dynamic geometry

#

if a page wasn't previously visible (when the camera moves or the light rotates), then it never had meaningful depth to cull against

#

so all geometry that touches that page must be rendered

#

anyways, here's the idea:

  1. the usual: mark & allocate visible pages, clear dirty physical pages, etc.
  2. build HPB and cull visible objects against it (visible objects are determined from step 4 of last frame)
  3. render remaining visible objects
  4. build HZB and cull objects against it
  5. render objects whose visibility changed from 0 to 1 (this is essential to avoid getting fucked by the cached nature of pages)
#

again, HZB only helps when moving geometry can cause an already-visible page to become invalidated

#

idk if geometry can move in any of our engines bleakekw

wicked notch
#

💀

frank sail
#

HPB however is useful for everything

#

and I don't think they can be cleverly combined like I originally thought

wicked notch
#

ye unfortunately

#

I wanted to go with the "idk separate them" anyways

frank sail
#

actually I think they can be merged if you put HPB in step 4

#

wait uh

#

with merged HPB+HZB in step 4, if you see a new page, objects may not be rendered to it in the first render, but they shouldn't be culled in step 4 as the HPB+HZB will be empty, which means they should be rendered in step 5

#

the idea requires storing object visibility until the next frame, which is numViews * numObjects bits of storage

#

where an object is presumably a meshlet

wicked notch
#

thank god for uint64_t KEKW

frank sail
#

well even if you have a million meshlets and 16 views, 16 million bits is only 2 MB

#

it's not as bad as trying to store the maximum number of indices for every view bleakekw

#

it's like 3 orders of magnitude less storage
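The storage estimate and the "visibility changed from 0 to 1" test from step 5 can be sketched with a packed bit array. This is only an illustration on the CPU; the class and function names are hypothetical, and a real implementation would presumably keep these bits in a GPU buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One bit per (view, object) pair; an object is presumably a meshlet.
class VisibilityBits {
public:
    VisibilityBits(std::size_t views, std::size_t objects)
        : objects_(objects), words_((views * objects + 63) / 64, 0) {}

    void set(std::size_t view, std::size_t object, bool visible) {
        const std::size_t bit = view * objects_ + object;
        assert(bit / 64 < words_.size());
        const std::uint64_t mask = std::uint64_t{1} << (bit % 64);
        if (visible) { words_[bit / 64] |= mask; } else { words_[bit / 64] &= ~mask; }
    }

    bool get(std::size_t view, std::size_t object) const {
        const std::size_t bit = view * objects_ + object;
        assert(bit / 64 < words_.size());
        return (words_[bit / 64] >> (bit % 64)) & 1u;
    }

    std::size_t size_bytes() const { return words_.size() * sizeof(std::uint64_t); }

private:
    std::size_t objects_;
    std::vector<std::uint64_t> words_;
};

// Step 5: an object must be rendered if it was culled last frame but passes now.
inline bool became_visible(const VisibilityBits& previous, const VisibilityBits& current,
                           std::size_t view, std::size_t object) {
    return current.get(view, object) && !previous.get(view, object);
}
```

For 16 views and one million objects this packs into 2,000,000 bytes, matching the 2 MB figure above.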

wicked notch
#

tru

wicked notch
#

alright it's time to switch things up a bit

#

I shall put VSM on the back burner for a while

wicked notch
#

I will be doing RT yes

#

hopefully I can help out with daxa's RT efforts

wicked notch
#

potrick while you're here

#

do you mind explaining a bit how daxa's resource table works on the C++ side

#

how are BufferIds and SamplerIds created, bound to descriptors and destroyed specifically

delicate rain
#

in this endeavor

#

I think HZB will help when you have an animated thingy which just sways for example and you are redrawing the tile each frame

wicked notch
#

I have not abandoned you guys lol

#

the VSM train is still going strong

#

'Tis but one of my usual detours

delicate rain
#

Good goood I'm also on VSM holiday

#

but I'll soon return stronger than ever

glass sphinx
#

Example for Buffer:

  • creating it gives you an id (index + version)
  • index of id indexes into cpu side array of ImplBuffers (the metadata for the buffer)
  • index indexes into a descriptor set binding array
  • when creating the buffer it's immediately written to the mega descriptor set
  • daxa only has one descriptor set that has update after bind and some other flags set to make it convenient
  • when calling destroy on the buffer it becomes a zombie
  • zombies live until all commands that were already submitted at the point in time when you call destroy are done
  • daxa checks when they are done and actually performs destructions in Device::collect_garbage
  • it uses timeline semaphores tracking submits on a cpu and gpu timeline
  • actually destroying the buffer writes a dummy in the place of the dead buffer to avoid dangling descriptors
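The lifecycle in the list above can be sketched as a small table with versioned ids and a zombie list. This is not daxa's actual code: every name here is hypothetical, the Vulkan calls are reduced to comments, and bumping the version on destroy to reject stale ids is an assumption about how the version field is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct BufferId { std::uint32_t index; std::uint32_t version; };

class BufferTable {
public:
    BufferId create() {
        std::uint32_t index;
        if (!free_.empty()) { index = free_.back(); free_.pop_back(); }
        else { index = static_cast<std::uint32_t>(slots_.size()); slots_.push_back({}); }
        slots_[index].alive = true;
        // Real code would write the descriptor at `index` into the one big set here.
        return {index, slots_[index].version};
    }

    bool is_valid(BufferId id) const {
        return id.index < slots_.size() && slots_[id.index].alive &&
               slots_[id.index].version == id.version;
    }

    // cpu_timeline: value of the most recent submit at the time of the call.
    void destroy(BufferId id, std::uint64_t cpu_timeline) {
        assert(is_valid(id));
        slots_[id.index].alive = false;
        ++slots_[id.index].version;  // stale ids stop validating (assumption)
        zombies_.push_back({id.index, cpu_timeline});
    }

    // gpu_timeline: latest value the GPU has finished, e.g. read from a
    // timeline semaphore. Zombies at or below it are safe to really destroy.
    void collect_garbage(std::uint64_t gpu_timeline) {
        std::size_t kept = 0;
        for (const Zombie& z : zombies_) {
            if (z.retire_at <= gpu_timeline) {
                // Real code would destroy the buffer and write a dummy descriptor.
                free_.push_back(z.index);
            } else {
                zombies_[kept++] = z;
            }
        }
        zombies_.resize(kept);
    }

    std::size_t zombie_count() const { return zombies_.size(); }

private:
    struct Slot { std::uint32_t version = 0; bool alive = false; };
    struct Zombie { std::uint32_t index; std::uint64_t retire_at; };
    std::vector<Slot> slots_;
    std::vector<std::uint32_t> free_;
    std::vector<Zombie> zombies_;
};
```

The point of the sketch is that an index only returns to the free list once the GPU timeline has passed the submit that could still read it, so no frames-in-flight counter is involved.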
wicked notch
#

epic

#

then to access the buffer do you use push const or something?

#

to index in the buffer table that is

glass sphinx
#

you either put it in a push constant or bind it as a uniform buffer (yeeaaa i know, i am not sure if i wanna keep them uniform buffers but daxa has them atm)

wicked notch
#

alright the refactor is coming soon™️

#

but RT first

glass sphinx
wispy spear
#

lustri, traitor

wicked notch
#

hmm

#

making a bottom level AS per meshlet

#

what could go wrong?

frank sail
#

bottom level algebra subprogram

#

btw, why

wicked notch
#

idk

frank sail
#

why not coarser granularity

wicked notch
#

I was writing code for bvh building for hardware RT

wicked notch
#

maybe meshlets

#

except meshlet triangle upper bound is 65536 triangles KEKW

runic surge
#

Wait lvstri, you use daxa now?

#

I thought you used your own custom stuff

wicked notch
#

I don't use daxa

#

I might go for it in the future

runic surge
#

perfectly balanced
Lvstri uses daxa
I stop using phobos

wicked notch
#

I think I'll start using other people's stuff after I make a render graph for my own stuff

glass sphinx
#

a mostly complete rendergraph is massive pain and work

#

thats when i ~~enslaved saky~~ saky joined. Without our combined brains it would have been impossible

#

@delicate rain tell your tg pain

delicate rain
#

It is pain

#

Took us like a month to just think of all the shit

wispy spear
wicked notch
#

Currently thinking about full bindless, but I'm wondering if I have enough descriptor set bindings

#

I need:

  • binding for everything that is VK_DESCRIPTOR_TYPE_SAMPLER
  • binding for everything that is VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE
  • binding for everything that is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
#

Possibly even one for VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER but let's not KEKW

frank sail
#

you could use mutable descriptors and make everything in the same binding

#

but I think support for that is extremely low nervous

#

inshallah we will have d3d12isms

wicked notch
#

do I actually need that

frank sail
#

ok support is actually decent on newer desktop gpus

frank sail
#

mainly it's used for porting d3d12 apps to vulkan KEKW

wicked notch
#

ye I could just do this

```glsl
// Several types may alias one binding, but each declaration needs a unique name.
layout (set = 0, binding = 0) uniform sampler sampler_table[];
layout (set = 0, binding = 0) uniform samplerShadow sampler_shadow_table[];

layout (set = 1, binding = 0) uniform image1D image_1d_table[];
layout (set = 1, binding = 0) uniform image2D image_2d_table[];
layout (set = 1, binding = 0) uniform image3D image_3d_table[];
layout (set = 1, binding = 0) uniform image1DArray image_1d_array_table[];
layout (set = 1, binding = 0) uniform image2DArray image_2d_array_table[];
...
```
frank sail
#

it also wastes memory because it essentially makes your descriptors a union

wicked notch
#

it's ok I'll just allocate enough memory that nobody's ever going to use

#

128MiB of descriptor mem sounds reasonable enough

frank sail
wicked notch
#

I believe that's what daxa does

frank sail
#

seems legit

wicked notch
frank sail
#

ye but everyone has it so eh

wicked notch
#

I'm having a cursed idea

#

so you know how textures can be streamed in and out right

#

I'm thinking of shader_resource_table.remove_resource() and shader_resource_table.insert_resource();

#

Instead of having dummy descriptors, I just use a flat hash map that keeps track of all my textures and texture ids

#

so I can remap indices on the fly

frank sail
#

I'm not sure I follow either approach lol

#

But I'm just a duck

wicked notch
#

I'm not sure either approach is doable either

#

I'm searching for the laziest, easiest way I can do this

#

leveraging the power of flat hash maps

frank sail
#

you basically just need to allocate indices for descriptors, right

#

or are you trying to solve something else

wicked notch
#

ye I just need to allocate indices for descriptors

#

and be able to remove them without shifting shit around

#

like a page allocator where 1 descriptor = 1 page

frank sail
#

no need to overthink it bleakekw

#

yeah just keep a list of free indices or something

#

a "free list" of sorts

wicked notch
#

nah slow

#

__builtin_clz baby

#

even shrimpler

#

and I can reuse the old crusty VSM page allocator I did on the CPU side

#

when I still believed in hw sparse 😭

frank sail
#

lol

#

even I'm using the dumb bit array for my VSM so I can't complain

wicked notch
#

it's a great approach

frank sail
#

but allocations are O(n), rip perf

wicked notch
#

worst case smart

frank sail
#

when you have billions of descriptors that is

wicked notch
#

yeah

#

if I have 16k descriptors then worst case it's O(256) KEKW
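A minimal sketch of the bit-array allocator being discussed, with one bit per descriptor slot and 64 slots per word. It uses `__builtin_ctzll` (the count-trailing-zeros sibling of the `__builtin_clz` mentioned above, GCC/Clang only) to find the lowest free bit; the class name and the `invalid` sentinel are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class SlotBitmap {
public:
    static constexpr std::uint32_t invalid = 0xFFFFFFFFu;

    // A set bit means "free". Bits past slot_count in the last word stay clear.
    explicit SlotBitmap(std::uint32_t slot_count)
        : free_bits_((slot_count + 63) / 64, ~std::uint64_t{0}) {
        const std::uint32_t tail = slot_count % 64;
        if (tail != 0) {
            free_bits_.back() = (std::uint64_t{1} << tail) - 1;
        }
    }

    // Linear scan over words, then one bit-scan instruction inside the word.
    std::uint32_t allocate() {
        for (std::size_t word = 0; word < free_bits_.size(); ++word) {
            if (free_bits_[word] != 0) {
                const int bit = __builtin_ctzll(free_bits_[word]);
                free_bits_[word] &= free_bits_[word] - 1;  // clear lowest set bit
                return static_cast<std::uint32_t>(word * 64 + bit);
            }
        }
        return invalid;
    }

    void release(std::uint32_t slot) {
        assert(slot / 64 < free_bits_.size());
        free_bits_[slot / 64] |= std::uint64_t{1} << (slot % 64);
    }

    std::size_t word_count() const { return free_bits_.size(); }

private:
    std::vector<std::uint64_t> free_bits_;
};
```

For 16384 slots the worst case scans 256 words, which is the "O(256)" figure above; releasing a slot never moves any other slot.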

frank sail
#

a performance disaster in vblancospeak

wicked notch
#

so more like O(sqrt(n))

frank sail
#

Wait what

#

Isn't it just n/32

wicked notch
#

ye but O(n/64) is asymptotically equivalent to O(n)

frank sail
#

Ye

wicked notch
#

so I paint a better picture by saying O(sqrt(n))

#

even though it's a lie

frank sail
#

The secret ingredient is lying

wicked notch
#

I gotta cache ids btw

#

and give them a TTL

frank sail
#

Ttl?

wicked notch
#

time to live

#

so a descriptor slot remains allocated as long as its TTL > 0

#

and it's freed when TTL = 0
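The TTL idea could look roughly like this: every access refreshes a slot's counter, and a per-frame tick decrements it and reports slots that just hit zero. The class name and the choice to return the freed slots from `tick()` are assumptions; presumably the initial TTL should be at least the number of frames in flight so the GPU is no longer reading a slot when it is recycled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class TtlSlots {
public:
    TtlSlots(std::size_t count, std::uint32_t initial_ttl)
        : ttl_(count, 0), initial_ttl_(initial_ttl) {}

    // Allocating or touching a slot refreshes its time to live.
    void acquire(std::size_t slot) { ttl_[slot] = initial_ttl_; }

    bool is_allocated(std::size_t slot) const { return ttl_[slot] > 0; }

    // Call once per frame. Returns the slots whose TTL reached zero this frame.
    std::vector<std::size_t> tick() {
        std::vector<std::size_t> freed;
        for (std::size_t i = 0; i < ttl_.size(); ++i) {
            if (ttl_[i] > 0 && --ttl_[i] == 0) {
                freed.push_back(i);
            }
        }
        return freed;
    }

private:
    std::vector<std::uint32_t> ttl_;
    std::uint32_t initial_ttl_;
};
```

The freed indices would then be handed back to whatever slot allocator is in use.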

frank sail
#

Or be a man and overwrite it while in use

#

Btw I did not implement ttl for my gpu allocator in my voxel engine and chunks would sometimes artifact for a frame or two when you modified stuff bleakekw

#

The real issue was that I had occlusion culling data that was used next frame

#

But the end result was the same

wicked notch
#

technically

#

I can update while in use

#

I'm just not sure how lenient drivers are

#

but UPDATE_AFTER_BIND actually allows it KEKW

frank sail
#

I don't think it means modifying the descriptor that is in use though bleakekw

wicked notch
#

ye that's API misuse

#

but you can just turn off the validation layers

frank sail
#

One weird trick to not have any validation errors

wicked notch
#

here's another issue, double buffered resources

#

hmm

#

actually

#

not an issue

#

I can just allocate 2 ids

#
```cpp
void init() {
    view_buffer = shader_resource_table->allocate_shared_buffer_resource<frames_in_flight>(); // std::vector<buffer_id_t>(2);
}

void render() {
    thing = shader_resource_table->get_buffer_slice({ .id = view_buffer[current_frame] }); // returns buffer_slice_t, refreshes the cache
    thing.insert(...);
}
```
#

oh god this is gonna take ages

#

I can basically just delete pipeline.cpp and remake it from scratch

#

amazing

glass sphinx
#

😈

wicked notch
#

lovely

wispy spear
glass sphinx
#

sexy

wicked notch
#

with good ol macros it's a little less

#

now comes the hard part

#

the shader resource table™️

wispy spear
#

hehe its pretty crazy how do you use all of them, must be some big ass if else block in main, no?

#

perhaps next evolution is generating the shaders to whatever you need it generate to

wicked notch
#

I make a macro that generates more macros that access these

#

the usage I'm going for is this

#
```glsl
void main() {
    vec3 payload = IRIS_STORAGE_IMAGE_2D_LOAD(uint32, image_id).xyz;
}
```
wispy spear
#

ah that do be looking daxaesque

#

or sweatshopesque

wicked notch
#

I make sure to give publicity to daxa KEKW

wispy spear
#

ah 😄

wicked notch
#

beautiful

wispy spear
#

its readable, i like it

wicked notch
#

I wouldn't say this part is especially readable but the rest is manageable at least KEKW

```glsl
#define _IRIS_ACQUIRE_COMBINED_SAMPLER(dimension, type, image_id, sampler_id) sampler##dimension(_IRIS_ACQUIRE_SAMPLED_IMAGE(dimension, type, image_id), u_sampler_table[sampler_id])
#define _IRIS_ACQUIRE_COMBINED_SAMPLER_SHADOW(dimension, type, image_id, sampler_id) sampler##dimension##Shadow(_IRIS_ACQUIRE_SAMPLED_IMAGE(dimension, type, image_id), u_sampler_shadow_table[sampler_id])
```
wispy spear
#

hehe

#

sooner or later ill have to go through something like that as well

wicked notch
#

can you return opaque types from functions in glsl

#

like

```glsl
image2D id_to_descriptor(uint id) {
    return table[id];
}
```
wicked notch
#

I can't

#

but I have achieved epic syntax regardless

#
```glsl
#define output_image iris_image_accessor(restrict_write, u_output_image_id)
#define texture_2d_sfloat iris_combined_sampler_2d(float32, u_texture_id, u_sampler_id)
#define texture_2d_sint iris_combined_sampler_2d(int32, u_texture_id, u_sampler_id)
#define texture_2d_uint iris_combined_sampler_2d(uint32, u_texture_id, u_sampler_id)
#define texture_2d_array_sfloat iris_combined_sampler_2d_array(float32, u_texture_id, u_sampler_id)
#define texture_2d_array_sint iris_combined_sampler_2d_array(int32, u_texture_id, u_sampler_id)
#define texture_2d_array_uint iris_combined_sampler_2d_array(uint32, u_texture_id, u_sampler_id)
```
#

totally not inspired by daxa KEKW

#
```glsl
#version 460
#include "bindings.glsl"

iris_declare_storage_image_descriptor_qualified(restrict_read, restrict readonly, image2D);
iris_declare_storage_image_descriptor_qualified(restrict_write, restrict writeonly, image2D);
#define input_image iris_image_accessor(restrict_read, u_input_image_id)
#define output_image iris_image_accessor(restrict_write, u_output_image_id)

layout (scalar, push_constant) restrict readonly uniform u_push_constant {
    uint u_input_image_id;
    uint u_output_image_id;
};

layout (local_size_x = 16, local_size_y = 16) in;
void main() {
    const ivec2 size = imageSize(input_image);
    if (any(greaterThanEqual(gl_GlobalInvocationID.xy, size))) {
        return;
    }
    const ivec2 position = ivec2(gl_GlobalInvocationID.xy);
    const vec4 payload = imageLoad(input_image, position);
    imageStore(output_image, position, vec4(linear_as_srgb(tonemap(payload.xyz)), 1.0));
}
```
wispy spear
#

wouldnt surprise me if daxa was inspired by sweatshop.pl 😉

wicked notch
#

god fucking damnit this is so good

#

I can literally stop thinking about anything

#

just handle = srt->allocate(); and buffer = srt->acquire(handle);

#

it's crazy

glass sphinx
#

it really is amazing

glass sphinx
#

designing and deciding on the macros was pure pain in my head

#

endless changes

#

now im happy

wicked notch
#

btw

#

what do you use for dummy descriptors

#

do you just create a 1x1 image or something

glass sphinx
#

yes

wicked notch
#

epic

glass sphinx
#

but im considering using the robustness vulkan feature stuff

wicked notch
#

it's quite sad we can't use null

glass sphinx
#

you actually can

#

but you need some feature

#

it can apparently tank perf

#

so im scared of it

#

but dx12 has it default enabled for everything afaik

#

so cant be too bad

#

robustness also makes it legal to read and write out of bounds

#

it ignores writes and on reads you get 0

wicked notch
#

device loss be gone

glass sphinx
#

REAL

#

maybe i make it optional or something idk

#

but i think its very nice that dx12 saves you and forced gpu makers to implement hw acceleration for these checks

glass sphinx
#

so desktop might not care

wicked notch
#

mobile is not real so we're good

glass sphinx
wicked notch
#

I mean if it works for D3D12 why not for Vk too

#

the hw is the same

glass sphinx
#

yea

wicked notch
#

I doubt the Vk/Dx drivers are much different either

glass sphinx
#

random insane fact: Ada Lovelace has 128 bit atomic cas

wispy spear
#

ah, i see

#

i really need to advance deeper into gpu drivenisms in order to understand all this

wicked notch
#

Hmm

#

me wonder

#

couple the ShaRT with frames in flight or not

#

shit I have a devilish idea

#

one ShaRT per frame in flight

wispy spear
#

a devshish idea?

wicked notch
#

yes

#

if one ponders the orb, one shall realize that two frames in flight might have completely different ShaRTs

wispy spear
#

make a third frame

#

where you interpolate between them

wicked notch
#

DLSS3 knockoff

wispy spear
#

oh

#

did i just reinvent it

#

i feel like i dropped into a barrel full of toxic waste and grew superpowers lol

wicked notch
#

did you figure out descriptors in Fuk btw

#

I haven't heard much lately

wispy spear
#

i will pick fuk up again over the weekend

#

all this gfx nonsense has drained me : (

wicked notch
#

take your time frogking

wicked notch
#

hmm yes

#

a lovely 512KiB

frank sail
#

thic

wicked notch
#

it's 1MiB now KEKW

wicked notch
#

hmmmmm

#

Since updating a descriptor set only comes with allocation/deallocation

#

should I make that RAII-style

wicked notch
#

I'm lighting the daxa beam

glass sphinx
wicked notch
#

mr potrick, how do you handle buffers/images in the gpu resource table that change within a frame in flight

#

so for example, say you have a camera buffer that you update every frame

glass sphinx
#

well

#

either i just alloc from a per frame staging buffer (device local or host local depending on what its used for)

#

or i make an array inside the buffer of the cam info

#

and pass the index to it

#

write the appropriate part

#

the table doesnt know anything but creation and deletion

delicate rain
#

I don't think there is any specific resource table handling to them

glass sphinx
#

the indices are 100% tied to the resources

#

so when i change a resource between frames i pass different ids to the shaders

wicked notch
#

what about differences in bindings between frames?

glass sphinx
#

expand

wicked notch
#

e.g:
frame 0: I want to allocate these two images, please update this frame's descriptor set with the two images
frame 1: I want to allocate three more images, please update this frame's descriptor set with the three images

#

afaik calling vkUpdateDescriptorSets is illegal while the set is in use

glass sphinx
#

i dont understand

#

there is one descriptor set

#

you can update slots that arent used

#

its fine if the set is in use

#

the only restriction is that the specific slots within the descriptor array arent actually used

wicked notch
#

it's fine as long as you don't touch slots that are in use?

glass sphinx
#

yes

wicked notch
#

pog

glass sphinx
#

so stale slots are free to be written

#

at any time

wicked notch
#

so for the camera buffer, staging and then vkCmdCopy?

glass sphinx
#

no just staging

#

well

delicate rain
#

You can do both

#

Depends on what you want

glass sphinx
#

well its actually basically orthogonal to daxa @wicked notch

delicate rain
#

No?

glass sphinx
#

what I do is either:

  1. have a single buffer that i copy to from staging once a frame
  2. simply use device local host visible scratch memory that i get an offset into (linear alloc) then write and pass the bda to the shader, so no staging or copy. Just instant cpu write then use on gpu, ultra fast.
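Option 2 above relies on a linear (bump) allocator handing out offsets into one mapped buffer. A minimal sketch of just the offset bookkeeping follows; the class name and the `out_of_space` sentinel are hypothetical, and the actual buffer, its mapping, and the buffer device address are left out.

```cpp
#include <cassert>
#include <cstdint>

class LinearScratch {
public:
    static constexpr std::uint64_t out_of_space = ~std::uint64_t{0};

    explicit LinearScratch(std::uint64_t capacity) : capacity_(capacity) {}

    // Returns a byte offset into the scratch buffer; alignment must be a power of two.
    std::uint64_t allocate(std::uint64_t size, std::uint64_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uint64_t offset = (head_ + alignment - 1) & ~(alignment - 1);
        if (offset + size > capacity_) {
            return out_of_space;
        }
        head_ = offset + size;
        return offset;
    }

    // Everything is released at once, typically at the start of a frame.
    void reset() { head_ = 0; }

private:
    std::uint64_t capacity_;
    std::uint64_t head_ = 0;
};
```

In use, the CPU would write the camera data at `mapped_ptr + offset` and pass `base_address + offset` to the shader. A reset is only safe once the GPU is done reading the region, so one would presumably keep one such region per frame in flight or guard it with the same timeline check used for deletions.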
glass sphinx
#

i usually dont anymore

#

bar memory is sexier

wicked notch
#

alright epic

#

many thanks to the daxa team

delicate rain
distant lodge
#

why only one set though? why not have 1 set per frame in flight

#

so you can update one while reading the others

delicate rain
#

Why would you need that

#

You always allocate new buffers into unused slots

distant lodge
#

what if next frame, I free 1 image, add its index to the freelist, and then try to write into that

wicked notch
#

you defer the "add index to freelist" part

#

but hmm

delicate rain
#

Wat

#

I don't follow

distant lodge
#

imagine I have image A bound to a slot, next frame I remove this image but also register a new one, if I add A's old index to the freelist immediately then it will be picked up as the index for the new image

delicate rain
#

If you free a buffer or an image it is only freed after GPU frame cnt catches up to the point on CPU frame cnt where it was freed

distant lodge
#

and if I have a frame in flight, something might be reading that slot

glass sphinx
#

there is also no benefit afaik

#

daxa also doesnt know what fif is

distant lodge
#

that's interesting

glass sphinx
#

it doesnt need to

#

fif are trivial with daxa

delicate rain
wicked notch
#

Ye I'm trying to design this such that FIF are trivial too

delicate rain
#

It is deferred

distant lodge
#

so you don't have any core systems that rely on FIF like deletion queues

glass sphinx
#

its checked and deferred

glass sphinx
#

it doesnt rely on fif

delicate rain
glass sphinx
#

it checks a timeline semaphore

delicate rain
#

It's too arbitrary

glass sphinx
#

there is a cpu and gpu submit timeline

delicate rain
#

I might want to use daxa for compute sim where there is no bound on fif

glass sphinx
#

destructions are deferred until the gpu catches up to the cpu at the timepoint of destruction call

glass sphinx
#

daxa still has a cleanup function that should be called once a frame to do housekeeping

#

but its not tied to fif or anything like that

delicate rain
distant lodge
#

FIF is pretty convenient for anywhere you need to do n-buffered CPU-GPU sync

#

do you do timeline semaphores for all of that

wicked notch
#

CPU-GPU timeline makes more sense though

glass sphinx
delicate rain
#

Yeah but that is the user's responsibility

#

Handling fif for readback and everything I mean

glass sphinx
#

btw im working on a tg facelift atm that will make it easier to use, more powerful AND less loc

distant lodge
#

yeah that's interesting though, I think if I had timeline based deletion logic I could pull FIF code out of my core vulkan context

delicate rain
#

Too complex for me

glass sphinx
distant lodge
#

my context just owns the counter

#

and I mainly use it to scale buffer capacity for stuff like uploading instances

glass sphinx
#

gabe forgot most things about descriptors

delicate rain
#

I did too

glass sphinx
#

good