#Fwog and co.

golden schooner
#

but it was much worse before vs2010

long robin
#

for some reason, nothing renders when I run the examples through renderdoc on my work pc

#

then renderdoc crashes when I try loading a capture 😩

shell inlet
#

on amd gpu?

long robin
#

ye

#

and by "nothing renders" I mean only the blue sky and imgui stuff shows up

shell inlet
#

well yeah iirc you are working at amd

#

silly question

long robin
#

'tis quite odd

shell inlet
#

common outcome of developing on nvidia

#

it's more lenient

long robin
#

I guess I have a UBism somewhere

#

commented out the whole RSM pass too just in case, but that ain't it

#

also note that it runs fine™️ when it's not through renderdoc

shell inlet
#

do you have ogl debug callback?

long robin
#

ye

shell inlet
#

ok that is odd

#

not that you have debug callback

#

odd is that it's not giving clues

#

nvidia's debug callback is pretty vocal about every little thing

#

even non errors

long robin
#

renderdoc is doing something funky I guess

shell inlet
#

but it's only adding stuff to spy on gapi calls and shaders

long robin
shell inlet
#

yeah very funny

long robin
#

hmm that seems suspiciously small

shell inlet
#

or maybe I haven't told you that I don't use renderdoc

long robin
#

I did make the window tiny though

#

the captures seem corrupted somehow, even a 2k capture is less than 1mb even though I have several screen-size textures

#

wtf even hello triangle fails

#

well, it doesn't crash at least

#

but it displays incorrectly

#

I'm just going to assume that my renderdoc installation is cursed since I have been having other issues with it

golden schooner
#

i can load the capture

long robin
#

does it look like my pic?

golden schooner
#

yes, no vbo/ebo bound

long robin
#

weird

#

pretty sure I explicitly call glBindVertexBuffer or whatever it's called

golden schooner
#

history of vao creation is quite sad

long robin
#

it should be one of the sub-events for the draw call you have selected

#

wait no

#

it should be higher up

#

in the scene pass

#

or base pass

golden schooner
#

RSM Scene shows valid vao+vbo et al

#

interesting

#

mesh output shows this

#

vs input is fine

#

for base geometry and rsm scene

#

maybe because of

long robin
#

sounds like bs

golden schooner
#

this also looks funny

long robin
golden schooner
#

GL_LINES is 1 i think

#

and number values are interpreted as glenums in function parameters

long robin
golden schooner
#

im also on 1.22 here

#

yep

long robin
#

for me the triangle doesn't render at all, but when I look in the capture it is black (which is also incorrect)

golden schooner
#

purple screen here

long robin
#

ye it should be purple + round_triangle

golden schooner
#

ye, just shrimple purple

long robin
#

can you look at the draw clal

#

call

golden schooner
#

the mesh is fucked

#

color attribs are all 0 and when you click through the vertices in the vsout part, it only jumps back and forth on the start/end points of that "line" you see there

#

thats also what you see when you do wireframe draw in the textureviewer

#

vertexbuffer seems to have only 1 vec

#

0, 0, 1 is in there - thats positions

long robin
#

I was able to open the capture and the vertex positions looked correct, but the colors were wrong

golden schooner
#

a_color cant be displayed

#

just shows --- -- --- here, when formatted to vec4

#

its rgba32 according to the vao

long robin
#

a_color should be RGBA8_UNORM

golden schooner
#

your a_pos is a vec2 in the shader

long robin
#

I get the same

golden schooner
#

are you playing with a 7900 XTX? 😉

golden schooner
#

😄

long robin
#

I'm on a 6800 still

#

I wouldn't be having you help me debug this problem if it was an unreleased GPU KEKW

long robin
# golden schooner

anyways, I get the same thing here, but when I look in the VAO creation I see what I intended to put there

golden schooner
#

ye thats what i just checked too and formatted the buffer like that

#

looks like 256, 256, 256 as 3 values in a_color

long robin
#

try ubyte x3

golden schooner
#

ye

#

that makes sense

long robin
#

it shows 255 255 255 for me with ubyte x3

golden schooner
#

ye same

long robin
#

tl;dr I think renderdoc is being dum dum

golden schooner
#

that seems correct

#

ye

#

i wonder if its worth showing baldur

long robin
#

mayhaps

golden schooner
#

the vao showing funny shit is not the first time in the VTX section

long robin
#

failing on hello triangle is not good

golden schooner
#

resource inintitniittitiliaization for vao 57 also looks funky

long robin
#

I see a million glVertexArrayVertexBuffer calls after the pic I showed

golden schooner
#

exactly

#

thats what im referring to

#

those cant be all calls in all frames

#

otherwise the purpose of this window wouldnt make sense

long robin
#

ye it says frame #118

golden schooner
#

i would assume you see all calls on this resource during its creation, not during its use until whenever

long robin
#

I make these glVertexArrayVertexBuffer calls at runtime though

golden schooner
#

can you create a new capture and check all the checkboxes

long robin
#

wat

golden schooner
#

capture options

long robin
#

ah

golden schooner
#

besides autostart

#

maybe that will catch some bogusism

#

02:22:22

#

😄

long robin
golden schooner
#

strange how imgui works, but the other shit not

long robin
#

yeah

#

but imgui isn't using the new opengl stuff so maybe not

golden schooner
#

ye

#

in my renderisms its using the new shit

#

eh

#

the triangle is skewed

long robin
#

wdym

golden schooner
#

its coming "out of the screen"

#

and you are looking right onto the hypotenuse

#

thats why we dont see anything

#

is that some glProvokeVertex shisms?

long robin
#

for me, it is somehow rendering even though the third vertex is invalid (I guess AMD makes it become (0,0) in that case?)

golden schooner
#

what i just sent was VS in

#

and this is how my VS Out looks like

long robin
#

my VS in is this

golden schooner
#

lol

long robin
#

eh I'm just gonna give up on this for now

#

I can report the issue lat0r, I gotta work on work stuff

golden schooner
#

please report and link the issue for updooting it

long robin
#

ok

#

ugh the repro is Application.cpp + 01_hello_triangle.cpp

#
+ all of Fwog
golden schooner
#

i do be rembering that when i had my r9 fury in this linuxism it also didnt work, just black screen

long robin
#

do you think he'll accept just the capture

#

hehe

golden schooner
#

i fink repro should also be shrimple with the hello triangle repo

#

its cmake, select hellofrog and run

#
+ capture
long robin
#

this only happens on AMD afaik

#

captures work fine for me on NV

#

I'll submit the report later™️

long robin
#

I cry

golden schooner
#

hehe just try again

#

although none of the glsl plugins worked really well at all in vs

long robin
#

I have one that worksonmymachine

modern saffron
shell inlet
#

when an accident takes place the driver is the primary suspect

long robin
#

frfr

#

I never write bugs

#

never, never

#

as if I could ever make such a mistake

modern saffron
#

Bugs are never a gpu devs fault

shell inlet
long robin
#

That was intended behavior which happened to annoy some users

#

If you can't handle me at my crashes, then you don't deserve me at my undefined behavior

shell inlet
#

lol

long robin
#

manually implementing texture bilinear interpolation is extremely cursed

#

if you land exactly on a texel center, you get an ambiguity which can lead to strange problems

#

my math was resolving it differently than textureGather was

#

I also discovered that my reprojection math is somehow off by half a texel

#

actually nvm that, I'm just trippin yo

shell inlet
#

ambiguity?

#

tripping yo?

long robin
#

when your sample lands exactly on the center of a texel, it could be within four different "boxes" for interpolation

#

take my expert drawing

#

if the texel (red) is in the magenta box, the bilerp weights would be (1, 1) (since it's on the top-right corner), but if it was in the green box, the bilerp weights would be (0, 0)

shell inlet
#

ah

long robin
#

so what I was experiencing was textureGather choosing a different box than what my math was choosing

#

so my weights were wrong
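
A small 1D sketch of the mismatch described above, in Python rather than GLSL so it can be run on its own. The texel values and the helper names are made up for illustration. It shows that both boxes give the same answer as long as the weights and the fetched texels come from the same box, and that mixing them does not.

```python
import math

def footprint_and_weight(coord):
    """One axis of a bilinear lookup: returns (base texel index, weight).

    coord is in texel units, with texel centers at i + 0.5.
    """
    shifted = coord - 0.5
    base = math.floor(shifted)
    return base, shifted - base

def lerp(a, b, t):
    return a + (b - a) * t

texels = [10.0, 20.0, 30.0, 40.0]  # hypothetical 1D texture
center_of_texel_2 = 2.5            # lands exactly on a texel center

# Box A: texels (2, 3) with weight 0. Box B: texels (1, 2) with weight 1.
base_a, w_a = footprint_and_weight(center_of_texel_2)
base_b, w_b = base_a - 1, 1.0

consistent_a = lerp(texels[base_a], texels[base_a + 1], w_a)
consistent_b = lerp(texels[base_b], texels[base_b + 1], w_b)
# Texels from box B combined with the weight computed for box A.
mismatched = lerp(texels[base_b], texels[base_b + 1], w_a)
```

Either box is fine on its own; the problem only shows up when the gathered footprint and the computed weights disagree about which box was picked.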

shell inlet
#

well yeah you snap to the centre of just one, usually top left

long robin
#

that was one of my solutions

#

the other is adding an epsilon to the uv

#

which is icky

shell inlet
#

no I mean bilinear is snapping to top left, getting the other 3 samples and interpolating using difference between snapped and original

long robin
#

ah

#

what I'm actually doing is just snapping to the texel and not using any interpolation if it's close enough

shell inlet
#

it is usually done by remapping uv to texture size and truncating to int, then the fractional part is your interpolation factor

long robin
#

ye that's what I'm doing

#

to get the weights

#

but only if the UV is not nearly on the center of a texel

shell inlet
#

why make any special cases?

long robin
#

I explained why

shell inlet
#

I don't get it then

#

why not interpolate indiscriminately

long robin
#

which four texels does textureGather give me if the provided texcoord is on the center of a texel?

shell inlet
#

it will give you the texel and zero lerp factors

long robin
#

I don't understand

#

textureGather gives you the four texels that would be selected for performing linear interpolation if you sampled at that point

#

it doesn't weigh them or anything

shell inlet
#

do you have to use gather?

#

you could do vec2 samplepos = uv*textureSize(tex, 0); ivec2 texelpos = ivec2(samplepos); vec2 lerpfactor = fract(samplepos);

#

then use texelpos to get 4 samples with texelfetch
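
A rough Python port of that four-fetch idea, so it can be checked outside a shader. The `texel_fetch` helper is a stand-in for `texelFetch` with clamp-to-edge, and the half-texel shift is an addition based on the OpenGL convention that texel centers sit at (i + 0.5) / size; neither is taken from the code being discussed.

```python
import math

def texel_fetch(image, x, y):
    """Clamp-to-edge fetch from a row-major 2D list (stand-in for texelFetch)."""
    height, width = len(image), len(image[0])
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return image[y][x]

def bilinear(image, u, v):
    """Manual bilinear filter at normalized coordinates (u, v)."""
    height, width = len(image), len(image[0])
    # Shift by half a texel so that a UV on a texel center yields weight 0.
    sx = u * width - 0.5
    sy = v * height - 0.5
    x0, y0 = math.floor(sx), math.floor(sy)
    fx, fy = sx - x0, sy - y0
    t00 = texel_fetch(image, x0, y0)
    t10 = texel_fetch(image, x0 + 1, y0)
    t01 = texel_fetch(image, x0, y0 + 1)
    t11 = texel_fetch(image, x0 + 1, y0 + 1)
    bottom = t00 + (t10 - t00) * fx
    top = t01 + (t11 - t01) * fx
    return bottom + (top - bottom) * fy
```

Because the footprint and the weights both come from the same `floor`, there is no special case for samples that land on a texel center.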

long robin
shell inlet
#

so how about dropping gather

#

it's implementation dependent

#

that branch is probably going to hurt you more than what you get as a hypothetical win from gather instead of 4 fetches

long robin
#

branch slander

#

according to that blog, the magic number I need is 1/512

shell inlet
#

not just branch itself but the used registers to evaluate it

long robin
#

assuming I'm already close to losing max occupancy due to register use

shell inlet
#

not on a rtx no

long robin
#

which is definitely something that would need to be profiled rather than guessed

shell inlet
#

well you seem to know better so I'll leave you be

long robin
#

taking four samples manually might be the best way to preserve what's left of my sanity after all

long robin
#

actually I decided to do something else entirely KEKW

#

please welcome float RejectDepth4(vec3 reprojectedUV); to the family

#

I think I'm going to have nightmares of temporal ghosts haunting me

#

holy crap when I move something in front of my eyes really fast I see ghosting

heavy cipher
golden schooner
#

ask Tear

#

😄

long robin
golden schooner
long robin
#

not enough SEO

#

you have to go to the third page of results to see my creation

golden schooner
#

heh

#

i thought i had it in the cache already and pressed return too kwikli

long robin
#

I kinda wish opengl glsl had samplerless textures (you can only call texelFetch on them) like vulkan glsl does

#

just a minor convenience

long robin
#

@shell inlet I'm unsure where in the pipeline I should put the variance calculation, but I guess itdepends. It seems like the variance can be used to drive temporal accumulation as well as the spatial blur at the end.
Maybe I should compute variance of unfiltered output (or from history?) -> temporally accumulate (using variance) -> compute variance of the output -> spatial blur (using new variance)

#

I see pros and cons in the different places to compute variance, and maybe there is no one true answer

#

I have a feeling that I need to calculate the variance twice

#

they also use a phat kernel for computing variance

#

oh wait these guys also have temporal variance nervous

#

so there is "history moments" which I guess are accumulated per-frame moments (accumulated in the same reprojection pass I guess (though the variance won't go down over time unless you do something))

#

so then with these history moments you can compute the temporal variance you use for driving the spatial blur at the end
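
A minimal sketch of the history-moments idea, assuming the moments are the per-pixel mean of the value and of its square, blended with the same alpha as the color. The function names are hypothetical and this is not taken from any particular SVGF implementation.

```python
def accumulate_moments(history, sample, alpha):
    """Blend the first and second moment of a new sample into the history."""
    m1, m2 = history
    m1 = m1 + (sample - m1) * alpha
    m2 = m2 + (sample * sample - m2) * alpha
    return m1, m2

def variance_from_moments(moments):
    """Temporal variance estimate: E[x^2] - E[x]^2."""
    m1, m2 = moments
    # Clamp because rounding can push the difference slightly below zero.
    return max(0.0, m2 - m1 * m1)
```

A pixel that keeps receiving the same value converges to zero variance, while a noisy one keeps a gap between the second moment and the squared mean, and that gap is what would drive the spatial blur.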

#

this variance stuff adds a bunch of complexity 😄

shell inlet
#

i dont remember where is everything at

long robin
#

hmm

#

it do be like that

long robin
#

I find it odd that svgf uses a constant temporal blend factor of 0.2

#

I feel like it should be driven by variance or something

shell inlet
#

no

#

why

#

variance is not the same as temporal gradient

#

variance does not help fight ghosting

long robin
#

the idea was to give more effective samples to areas with a longer history or lower variance by using a lower alpha for those places

#

I swear some implementations do something like that

#

ah, that's a-svgf

#

and I'm guessing they don't use just variance for that, judging by your response

#

well, my idea was not exactly about approximating the temporal gradient

#

it would probably make temporal overblurring worse though

golden schooner
long robin
#

Probably

golden schooner
#

they are everywhere 😄

long robin
#

There are more

golden schooner
#

there is so much interesting stuff out there

shell inlet
#

so you need to store a chunk of previous gbuffer or smth and use it to calculate new illumination with the up to date state of the scene, then compare it with illumination from previous frame

#

then the difference you have is only related to change in illumination

long robin
#

Seems 🦐le enough

shell inlet
#

too quiet

shell inlet
#

hows the progress going

long robin
#

slowly, but I think I understand almost the whole svgf paper now

#

I noticed that some impls use a bilateral filter for computing variance, while others use a separable gaussian without edge weights

#

I suppose I will try both

shell inlet
#

they prolly drop weighting to squeeze performance at the cost of losing quality

#

bilateral should be superior

long robin
#

I should note that the impl in question (which uses a separable gaussian) is specifically for shadow denoising, which means the only edge weight that matters is the depth weight

#

So they can get away with cheating more easily

shell inlet
#

yeah sounds reasonable

long robin
#

glsl has cool syntax
bool valid[2][2] = bool[2][2](bool[2](false, false), bool[2](false, false));

long robin
#

I was having a strugglelele with manual bilinear filtering (again KEKW), but I managed to solve it by swapping some numbers, and idk why. Here is the shadertoy I used to solve the issue
https://www.shadertoy.com/view/DsXXWB

shell inlet
#

you had to swap them because you swapped the passed arguments

long robin
#

dammit KEKW

#

wait no I intentionally swapped the arguments

#

the shadertoy as I sent it is the "correct" thing

#

even though the code looks wrong because of the swapped args

#

my confusion is somewhere else

shell inlet
#

vec3 Bilerp(vec3 _00, vec3 _10, vec3 _01, vec3 _11, vec2 weight)
return Bilerp(_00, _01, _10, _11, weight);
?

#

intentional?

long robin
#

yes, surprisingly

shell inlet
#

why though

long robin
#

when they are in the correct order, the output is this (the issue is with the inputs I pass to the function)

#

so I just changed the order of params because it was low effort

#

the problem is a misunderstanding elsewhere

shell inlet
#

i thought you changed the function?

#
//this really is left vertical
  vec3 bottom = mix(_00, _01, weight.x);
//this is right vertical
  vec3 top = mix(_10, _11, weight.x);
long robin
#

oh man I really messed that up

shell inlet
#

so if you make the order correct and swap in the function it will remain correct

long robin
#

yeah changing those to the correct thing fixes it
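
One cheap way to catch this kind of corner mix-up is to evaluate the function at the four corner weights, where the result has to equal the matching input. A Python sketch, with the parameter order chosen here for illustration rather than copied from the shadertoy:

```python
def mix(a, b, t):
    """Same behaviour as GLSL mix for scalars."""
    return a * (1.0 - t) + b * t

def bilerp(v00, v10, v01, v11, wx, wy):
    """vXY is the value at corner (x, y); wx blends along x, wy along y."""
    bottom = mix(v00, v10, wx)
    top = mix(v01, v11, wx)
    return mix(bottom, top, wy)
```

If the two middle arguments are swapped at the call site, or the wrong pair is mixed inside, the checks at weights (1, 0) and (0, 1) return each other's values.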

#

I appreciate you looking and confirming my dementia diagnosis frogeheart

shell inlet
#

you're welcome

shell inlet
#

wanna try fun thing?

#

use smoothstep(0.0, 1.0, weight.x) in your mix instead of weight.x

#
vec3 Bilerp(vec3 _00, vec3 _01, vec3 _10, vec3 _11, vec2 weight)
{
  vec3 bottom = mix(_00, _10, smoothstep(0.0, 1.0, weight.x));
  vec3 top = mix(_01, _11, smoothstep(0.0, 1.0, weight.x));
  return mix(bottom, top, smoothstep(0.0, 1.0, weight.y));
}
long robin
#

I'll try it after I wake up

shell inlet
#

more pixellated

#

still smooth though

long robin
#

Kinda cool effect

golden schooner
#

the right one reminds me of the first time i was playing HL1 in D3D, going super close to the rock wall

long robin
#

Yeah, it seems familiar somehow

#

Reminds me of alpha masked geometry with linear filtering

#

With those round edges

shell inlet
#

the right one is the hardware

long robin
#

Oh I misread deccer's message

#

I was talking about the left one

shell inlet
#

bicubic

#

rip gamma probably

#

#vulkan message

long robin
#

Bicubic in just 4 taps (presumably each is bilinear) mindblown

shell inlet
#

yep you need bilinear taps

long robin
#

Magic splines strike again

long robin
#

here's my crappy 7x7 estimated variance image

#

now I need to give it edge weights and actual kernel weights instead of using a box blur

long robin
#

I can't see much of a difference, maybe the plain bilerp is a bit smoother

shell inlet
long robin
#

I had to boost it a lot for it to be visible in renderdoc

#

it's actually showing [0, .2] or something

#

seems like some impls do a blur on the variance image itself prior to spatial filtering, so I might try that

shell inlet
#

denoising variance to denoise image 🤔

#

why do you think you need variance weighting anyways?

#

you are denoising indirect which is a low frequency signal

#

makes sense for direct/shadows but not so much for indirect

long robin
#

disoccluded areas have higher variance I guess

shell inlet
#

and where does the variance come into play

#

what do you use it for

long robin
#

I'm just implementing svgf

shell inlet
#

disoccluded areas have more variance because of low spp, so you could momentarily apply more blur there

#

yeah but svgf is for direct and indirect

#

you don't have noisy direct

#

svgf doesn't use variance weights for indirect as far as I remember

#

at least certainly not in quake 2 rtx

#

they use spherical harmonics instead to preserve normal map detail

long robin
#

in the paper they just split the direct and indirect and denoise them separately

#

maybe I glazed over the actual differences in denoising between them

shell inlet
#

SHs are not in the paper ye

#

well they state that variance weights are needed to preserve high frequency details

#

indirect does not generally have a considerable amount of those

long robin
#

not until there is specular indirect

shell inlet
#

specular is also separated

#

and is even harder to denoise than both of them

#

do they even cover specular in the paper? I don't remember

#

specular is a whole different beast which is a pain to even reproject

#

and since it's highly view dependent it results in extreme ghosting with any of the naive reprojection methods you use for diffuse

long robin
#

they talk about it a little

#

they mention that one of its weaknesses is overblurring reflections

shell inlet
#

I think "beyond svgf" paper talks exclusively about specular

long robin
#

the paper is 15 pages with an extra 51 pages of random path tracing info stapled to it

#

will read tho

shell inlet
#

it's considerably smoother than bilinear

long robin
#

I'd have to change a lot

#

maybe not

shell inlet
#

it's a copypaste of 3 functions and 1 call to sampleBicubic

#

should work as an alternative of texture

#

drop in

long robin
#

I'm not calling texture, so it's not drop-in

#

it's manual bilinear filtering except different (it's the one described in svgf)

shell inlet
#

ah okay I get what you are talking about

long robin
#

it's this thing in case you don't want to find the paragraph

shell inlet
#

yes I remember that

#

I think I mentioned this one time, maybe not here and not to you specifically

#

that svgf uses software bilerp like this

long robin
#

it works way better than the garbage I was using before

#

which was basically written without a reference

shell inlet
#

that's what I meant when I said reprojection is trickier than it seems

long robin
#

ye

shell inlet
#

actually I did say that to you and right in this thread

long robin
#

to the untrained eye, "software bilerp" seems equivalent to what the hardware gives you

#

tl;dr I should've just read the paper more carefully

shell inlet
#

svgf source code should also work and maybe even faster than only reading the paper

long robin
#

is the source code just falcor

shell inlet
#

reading both works best id say

#

source code is a chunk of falcor framework, maybe it's gone from nvidia developer page by now

#

in that case its just in falcor repo

long robin
#

yeah I started by reading falcor and some DX sample, but it was hard to understand without context

shell inlet
#

i still have the source code from nv dev on my home pc

long robin
long robin
#

ah, my linter is throwing a fit for some reason

long robin
#

added a new atrous pass, which doesn't look bad if I make the samples absurdly sparse (4, 8, and 16 wide)

#

I get low-frequency boiling artifacts which I'm not sure how to remove except by reducing the temporal alpha

#

going from 0.2 to 0.05 makes it a lot better in that regard

#

though the temporal lag is bad unless your FPS is absurdly high (which it is in this shrimple scene)

#

that's with 0.2 temporal alpha and the super wide spatial filter

golden schooner
#

still looks cool

long robin
#

when I add two more passes to the filter (so it's 1, 2, 4, 8, 16) it looks better, but the boiling is still there
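
To make the step sizes concrete, here is a 1D Python sketch of an a-trous cascade. It assumes a 3-tap [1/4, 1/2, 1/4] kernel with clamped edges and leaves out the edge-stopping weights entirely, so it only illustrates how the footprint grows, not the filter actually used here.

```python
def atrous_pass(signal, step):
    """One 3-tap [1/4, 1/2, 1/4] pass with the taps `step` samples apart."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - step, 0)]       # clamp at the left edge
        right = signal[min(i + step, n - 1)]  # clamp at the right edge
        out.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return out

def atrous(signal, steps=(1, 2, 4, 8, 16)):
    """Run the passes in order, each one reading the previous result."""
    for step in steps:
        signal = atrous_pass(signal, step)
    return signal
```

With steps 1, 2, 4, 8, 16 a single bright sample spreads over 31 samples on each side without leaving gaps. Starting at 4 skips the small offsets, which may be part of why the sparser variant looks rougher.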

golden schooner
#

wasnt pjbomb or the other guy rmax? talking about that boiling too some time ago?

golden schooner
#

its spaghetti code already ;D

long robin
#

I wonder if the issue is that my framerate is too high 🏎️

#

it would certainly make the boiling appear quicker

golden schooner
#

can you decrease the frequency of that boiling somehow?

long robin
#

I can lower the framerate

golden schooner
#

heh

long robin
#

I can decrease the temporal alpha so it blends more frames together

#

which removes the boiling, but adds temporal lag (moving the light makes the lighting update more slowly)

golden schooner
#

how frequent do you move a light

long robin
#

right now it's just the sun, so only when testing

golden schooner
#

or would you actually notice in an actual game using that engine?

#

when you focus on gameplay?

long robin
#

but there should be minimal temporal lag for practical purposes

golden schooner
#

of course

long robin
#

the default of 0.2 has almost unnoticeable lag since it's only about 5 frames of accumulation
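
A quick way to put numbers on that, assuming the temporal blend is a plain exponential moving average (which is how the alpha is used in this thread). The helper names are made up.

```python
def temporal_blend(history, current, alpha):
    """Exponential moving average: alpha is the weight of the new frame."""
    return history + (current - history) * alpha

def stale_fraction(frames, alpha):
    """Share of the result that still comes from data older than `frames` frames."""
    return (1.0 - alpha) ** frames
```

With alpha 0.2 roughly a third of the result is older than five frames, while with 0.05 it is still more than three quarters, which lines up with the extra lag seen when moving the light.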

#

unless your FPS is really low

#

it also results in that boiling you see

golden schooner
#

ah

#

swr is also doing some GIisms

#

or is it "just" about implementing a paper properly now?

long robin
#

you can think of the temporal blend as just having extra samples to spatially filter with rather than something that solves noise on its own

long robin
#

hmm, I feel like the variance stuff may still be helpful since the amount of noise is not constant across the image

#

It won't solve the boiling though

#

I wonder if the sampling sucks somehow, or if I just need TAA like they have in the paper

golden schooner
#

the 2nd pic shows the variance stuff in action neh?

shell inlet
#

boiling is due to variance, if you remove spatial filters and leave only temporal you will see that it's just bad, either needs longer lived history or better sampling strategies

#

surface sampling is just that bad

#

but what if we add restir

#

also are you storing first pass of atrous in the history?

#

the trick to "increase" sample count

#

by the way what if we, instead of using disk random, build a 2d discrete pdf out of the shadow map, based on "flux" intensity

#

on the fly

#

then use it to sample

#

maybe not even at full resolution to save performance

#

should still be pretty good

#

now that I think about it maybe it's not going to be pretty good 🤔

#

I'll be back after I eat

long robin
#

I tried different steps for that pass and it didn't help much

shell inlet
#

what does it look like before you blur it?

#

I bet all noisy still

#

anyways the reason why sampling whole rsm based on intensity would not help much is because we are already pretty much sampling only emissive parts since we sample only whats visible from the sun pov

#

there is no way to not sample that (i.e. sample areas in shadow)

#

there is a room for improvement in other aspect though, the cosine still makes flux variable so maybe sampling using discrete 2d invcdf could still improve sampling but not by much I think

#

then there is a third factor that influences contribution - geometry term

#

if we could sample proportionally to it then I bet the variance could be substantially reduced

#

perhaps resampling can help here

long robin
#

Well

long robin
#

With and without temporal blending (not in that order)

shell inlet
#

1spp?

#

maybe you could throw more spp at it since rsm samples are cheap

long robin
#

Yeah 1spp

#

More samples fixes it but I'd like to see if smarter filtering can help first

#

I also haven't tried quarter res yet

shell inlet
#

smarter than svgf is probably only RAE

#

but before filtering you really need to step up the convergence game of your 1spp

long robin
#

That's tough

#

I wonder if there is a trick that could improve the perf of the extra samples. Currently the sampling pattern obliterates the cache

#

Perhaps we could take multiple samples in a random small area

#

The extra samples should be cheap in theory since they are more coherent

#

But also maybe not, since cache lines can be evicted between each sample if you have enough competing threads

shell inlet
#

I bet the memory is laid out in morton order under the hood so as long as the tile that the taken samples cover can fit in the cache it's going to be somewhat cheap

#

otherwise it'd need another reload from vram to take the sample and flush the cache

#

that's just speculation I have no idea how real hardware operates

#

but that's how a software rasterizer would do

#

actually you tell me you work at amd

#

but anyhow I don't think that making samples coherent is useful because you actually want the samples to be incoherent, that way it means you are getting fresh diverse samples

long robin
#

It's laid out in tiles, and I don't believe they use a z curve

shell inlet
#

maybe a z curve per tile?

long robin
#

Take it with a grain of salt because there is a lot of conflicting info floating around so I'm not certain if I'm describing AMD hardware

long robin
shell inlet
#

iirc PS2 hw had z curve per tile

#

I've seen textures encoded like that in a ps2 title

#

which are immediately loaded into the memory from the disk

#

they call it preswizzled

long robin
#

Idk how big the tiles are either

shell inlet
#

even if today swizzling means .xyzw thing in shaders

long robin
shell inlet
#

swizzling threads hmm

#

and for ideal access of what exactly?

#

I was talking about swizzling meaning rearranging layout of texture texels

#

and preswizzled meant that textures are stored as swizzled for hardware to consume directly

long robin
#

ARmpRed8x8 is the function

#

The function I mentioned is for writing though

#

But I don't see why it wouldn't work for reading if your access pattern is predetermined

shell inlet
#

this resembles some sort of z curve

#

not exactly like the textbook morton order

#

but not entirely unlike the one either
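
For comparison, the textbook Morton order mentioned above just interleaves the bits of x and y. A Python sketch for 16-bit coordinates; it says nothing about what any real GPU does with its tiles.

```python
def part1by1(n):
    """Spread the low 16 bits of n so there is a zero bit between each."""
    n &= 0xFFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n

def morton2d(x, y):
    """Interleave bits: x takes the even bit positions, y the odd ones."""
    return part1by1(x) | (part1by1(y) << 1)
```

Neighbouring texels in a 2x2 block end up at consecutive indices, which is the property that makes small 2D footprints land close together in memory.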

long robin
#

I like how the code is unreadable (I don't)

long robin
#

I did a test

#

first image = previous setup (using one per-frame rng value for influencing xi.x and xi.y in the shader)

#

second image = two per-frame random numbers for xi.x and xi.y

#

@heavy cipher is the reason we couldn't have the xi covid variant

#

anyways, here's when I do one rng for xi.x and xi.y (respectively) instead of both at the same time 🇨🇳

#

(xi.x corresponds to sample radius and xi.y corresponds to sample angle)

#

seems like applying the same per-frame random number to both values is worse than any of these alternatives

long robin
#

post-3x3 bilateral filter (left one is rng for just xi.x, right one is rng for xi.x and xi.y)

shell inlet
#

what if you stop rotating xi with two blue noise values

#

ah wait it's temporal not spatial that you're interested in

#

I guess you might want a low discrepancy sequence instead of white random

long robin
#

a temporal LDS?

shell inlet
#

uhh

#

one dimensional lds to rotate the samples every frame

long robin
#

I'm generating two white noise values per frame and using it for all threads atm

shell inlet
#

you may call that temporal I guess

long robin
#

I could try ur idea

shell inlet
#

the effect is that you will have less samey samples every frame

long robin
#

ideally, any consecutive group of ~5 frames should have evenly distributed noise (kinda like interleaved gradient noise I guess)

shell inlet
#

because white noise has high frequencies in it and causes clumps

#

lds is more evenly spaced or spaced in a way that allows you to explore the space more effectively
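
One common 1D low-discrepancy sequence is the golden-ratio additive recurrence, sketched below in Python. Picking it here is an assumption, since the thread does not name a specific sequence, and `frame_offset` is a hypothetical helper.

```python
GOLDEN_RATIO_CONJUGATE = 0.6180339887498949  # 1 / phi

def frame_offset(frame_index):
    """Per-frame value in [0, 1) from the golden-ratio sequence."""
    return (frame_index * GOLDEN_RATIO_CONJUGATE) % 1.0

def min_circular_gap(values):
    """Smallest spacing between values when [0, 1) is treated as a circle."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(1.0 - ordered[-1] + ordered[0])
    return min(gaps)
```

Any five consecutive frames are spaced at least about 0.146 apart on the circle, whereas five white-noise values can land almost on top of each other. That matches the goal of having any short run of frames cover the range evenly.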

long robin
#

TAA impls use interleaved gradient noise so you can temporally explore space efficiently

#

or so I hear

#

however, I think a 1D LDS might be good enough

shell inlet
#

you have everything up on git?

#

I wanted to try out specular some time ago and I'm home now

#

maybe working with what's there will do fine

long robin
#

I think I have something I can push though

#

ah yes, where I add the better reprojection

shell inlet
#

I don't feel like writing complex stuff so I'll probably use some solid angle sampling from gi compendium and assume it's the specular brdf (lol)

#

pdf=brdf

#

no ggx

long robin
shell inlet
#

tell me when you're done updating the repo

long robin
#

I'm not pushing anything else today, but I have some epic changes in the works

shell inlet
#

ok then

long robin
#

I won't be touching the sampling function

shell inlet
#

I'll use what's currently there

long robin
#

alright, I'm gonna go to bed now

#

opengl with your specular sampling

shell inlet
#

it already looks so bad I'm losing motivation

#

just with perfect marched reflections

long robin
#

how bad?

shell inlet
#

also was thinking of plugging in a different brdf at first but then could not find a brdf

#

maybe I could simply kill some rays that are outside of the cone

long robin
#

I suppose specular really benefits from indirect occlusion

shell inlet
#

funny but doesn't look as bad without marching

#

just killing rays outside of the cone

#

this is not physically accurate and broken maths-wise

#

but behaves like how specular would

#

yep my marching algo is garbag

#

boosted a bit, looks like specular to me

#

300 samples allow it lol

#

really need a good way to find intersections for this to work better

long robin
#

nice

shell inlet
#

idk if I'll be working on it further

#

don't know a better marching algo

#

it used to work for screen space reflections

long robin
#

hrm

#

which algo

shell inlet
#
  float dist = 0.05;
  vec3 rayWorldPos = {0, 0, 0};
  vec2 rayUV = {0, 0};
  for(uint i = 0; i < 10; ++i)
  {
    rayWorldPos = surfaceWorldPos + R * dist;
    rayUV = ProjectUV(rayWorldPos, rsm.sunViewProj);
    float rayDepth = textureLod(s_rsmDepth, rayUV, 0.0).x;
    rayWorldPos = UnprojectUV(rayDepth, rayUV, rsm.invSunViewProj);
    dist = length(surfaceWorldPos - rayWorldPos);
  }
long robin
#

is R the reflection vector

shell inlet
#

it is

long robin
#

ok, I think I see how it works

#

you might have better luck with an algorithm that has a constant step size and a condition to break once you get close to the surface defined by rsmDepth

shell inlet
#

thought about it too, didn't try

shell inlet
#

man idk raymarching is hard

#

I've been thinking about it and the best method would be to step pixel by pixel to not overshoot

#

otherwise it's very possible that it will return wrong intersection point and if we sample around it most if not all the samples will be zero according to brdf for sharp-ish reflections

long robin
#

you can use a larger step and then perform binary search at the end to refine the hit location

shell inlet
#

if you know how, can you make it work

#

?

#

because I can't

#

I've been already thinking about other ways

#

other ways to narrow down sampling area

long robin
shell inlet
#

I don't feel like trying to understand this rn so I'll wrap it up for now

#

kinda already satisfied my curiosity by making brute force specular

shell inlet
#

albedo modulation looks weird now

#

when you disable it that is

#

probably because of temporal filter

long robin
#

what's weird about it?

shell inlet
#

nothing actually

#

I'm tripping

#

this is still brute force

#

just looking interesting

#

it only reflects what the light source is seeing

long robin
#

I somehow managed to make the driver cause a crash when linking my program

#

it can compile the shader successfully though

#

if I comment out this loop, it stops crashing

  for (int x = 0; x <= kWidth; x++)
  {
    for (int y = 0; y <= kWidth; y++)
    {
      ivec2 pos = gid + ivec2(x - kRadius, y - kRadius);
      if (any(greaterThanEqual(pos, targetDim)) || any(lessThan(pos, ivec2(0))))
      {
        continue;
      }

      float weight = kernel[x][y];
      vec3 c = texelFetch(s_input, pos, 0).rgb;
      float lum = luminance(c);
      accumSamples += weight;
      mean += lum * weight;
      meanSquared += lum * lum * weight;
    }
  }
#

kWidth is 5
kRadius is 2
kernel is a 5x5 array

#

if I change weight so it's always 1, it doesn't crash

#

wtf

shell inlet
#

uhh why is it not strictly less than

#

it's less than or equal

#

if kernel is 5x5 and width is 5

#

you are probably going to get an out of bounds?

long robin
#

ruh roh

#

yeah it no longer crashes when linking nice

#

the <= was some vestigial thing from the previous code that was there

#
  for (int x = -radius; x <= radius; x++)
  {
    for (int y = -radius; y <= radius; y++)
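#

for comparison, a small C++ sketch of the corrected bounds, assuming the same kWidth and kRadius values as above; SumKernel and the Kernel alias are made up for illustration

```cpp
#include <array>

constexpr int kWidth = 5;
constexpr int kRadius = 2;
using Kernel = std::array<std::array<float, kWidth>, kWidth>;

// Visits taps in [0, kWidth) on both axes, which corresponds to pixel
// offsets x - kRadius in [-kRadius, kRadius].
// With `<=` the loops would run to index 5, one past the end of a 5x5 array.
float SumKernel(const Kernel& kernel)
{
  float sum = 0.0f;
  for (int x = 0; x < kWidth; x++)
  {
    for (int y = 0; y < kWidth; y++)
    {
      sum += kernel[x][y];
    }
  }
  return sum;
}
```

the old radius-style loop needed `<=` because it ran from -radius to +radius inclusive, which is the same 5 taps per axis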
long robin
#

later, one thing I want to try for improving perf is making the bilateral filter separable

long robin
#

This is a certified hood glsl classic

shell inlet
#

maybe we can project a cone somehow onto the rsm to sample specular

#

basically a map from world to uv space

long robin
#

sounds like hard math

#

but doable

shell inlet
#

although physically based specular is not a cone

#

honestly I don't even remember how to sample TR CT

#

was something like reflecting view vector along the microsurface normal

#

and that normal follows the NDF

#

I'd like to just observe how samples are distributed in uv space when you discard samples outside of the cone

#

maybe can make a desmos graph

#

need some pseudo depth map manifold

#

ah I actually need to do homework actually 😔

#

bye

long robin
#

good luck

shell inlet
#

hm think about it, we are already wasting samples by sampling a circle disregarding the fact that we are only interested in samples on the hemisphere around the surface normal

#

so many samples end up being wasted

#

but what if we start by projecting the hemisphere in uv space then finding the area of that projection to use in normalization term and find the jacobian determinant of the mapping to use as pdf

#

should end up having all samples contributing something

#

writing my thoughts here so I won't forget

long robin
#

interesting thoughts

rugged notch
#

I am inside your head

long robin
#

projecting the hemisphere into uv space though

#

how does that work

#

I mean, I see why it's good

shell inlet
#

by generating samples in the hemisphere in world space then transforming them with viewproj of the rsm

long robin
#

ah smart

#

that seems obvious in retrospect

shell inlet
#

the hard part is deriving pdf and normalization factor

long robin
#

I don't really understand all the jacobian stuff other than that it's used in differential equations or something

shell inlet
#

jacobian determinant is a way to find surface element

#

surface element is the area of the differential

long robin
#

I recall something about it representing the local rate of change I think

shell inlet
#

jacobian matrix is all of the partial derivatives of a multivariate function's outputs with respect to each of its arguments

#

wikipedia has a better definition I'm sure

long robin
#

using a projected hemisphere will result in up to 50% fewer wasted samples according to my head math

long robin
shell inlet
#

they are condensed but make sense if you have understanding of underlying concepts

golden schooner
#

there goes the J again 🙂

long robin
#

wat

golden schooner
#

Jacobian Matrix

#

we talked about it yestergestern 🙂

long robin
#

In your thread?

#

I think I remember void mentioning it once in the context of IBL, idk

golden schooner
#

you made fun of void sizzling J into things all the time 🙂

long robin
#

Yes the Jacobians are an infection

#

Maybe I'll invent the counter to it called the Jakerbian

golden schooner
#

Jakerbean 😄

shell inlet
#

I didn't have to do it for what deccer had because it was an obvious spherical to cartesian map (but without radius)

#

I failed to recognize it at first

#

what's more of interest is the case of 1/pi becoming pi after going from integral to riemann sum

golden schooner
#

at least it keeps jaker on his toes

long robin
shell inlet
#

what if you write the full name?

long robin
#

it describes a video denoising technique instead

long robin
#

the video is kinda hyper to show that it works "under motion"

#

and that is with the variance-guided luminance weight

#

also with the temporal alpha turned down to 0.05 instead of 0.2

#

I can't see how they can make 0.2 not super unstable unless the sampling is really good

golden schooner
#

lol

#

i honestly didnt notice any "flicker in the distance" in the first video

long robin
#

it's easier to notice when everything is still and you aren't looking at a compressed video

golden schooner
#

you do fly around like someone having an asymptotic minute going on

#

its beautiful

#

its a huge step forward from the last one

long robin
#

I changed a bunch of stuff so I can't even remember

#

now I need to add a bunch of sliders so I can determine the ideal constants to hardcode

#

and then optimize because right now I'm doing five 5x5 bilateral filters in a row

golden schooner
#

i keep remembering to git commit things when i made a step forward, but i end up changing too much at the same place and then i cant do that anymore :3

long robin
#

I didn't want to commit anything because the code is quite messy right now

golden schooner
#

but sometimes i remember rider/clion have local history

#

you can always rewrite (change order/remove/etc) commits later, but you have a save point if you have to go back for some reason

shell inlet
shell inlet
#

that is, in comparison with what you have with dithered rsm

#

I wish I could spend some time on improving it but I am doing my homework

#

school slows down the technological progress in the world

long robin
#

I want to wrap this up soon anyways

shell inlet
#

check out the 1spp footage

long robin
#

damn lol

shell inlet
#

it's very good in terms of convergence rate

long robin
#

I wish that video wasn't 720p

shell inlet
#

your wish has been granted

#

pretend it's the same video

long robin
#

hehe

#

I see a little bit of low frequency noise still in theirs

#

at 58 seconds, the ceiling looks quite similar to the one in mine

shell inlet
#

they have shorter-lived history I believe

#

also remember the hysteresis trick

#

?

long robin
#

which one

shell inlet
#

what if you feed it back the result from a third iteration and not first

#

should have even more fake sample increase

long robin
#

interesting idea

golden schooner
#

metal sponza doesnt look bad either

long robin
#

I'm guessing the problem is the new samples not being blurred enough

#

the history is super smooth, but when you add 20% of the current frame, it's still kinda noisy

#

I think the extra blurred history only helps to an extent

long robin
#

I can crank the alpha down a lot with minimal issues except for lag when I move the light (even disocclusions aren't bad). Then I just need to add temporal gradient estimation

shell inlet
#

what if you make it blue noise

#

like use one value to rotate all xi components
don't do that it's pointless, I just did

#

also when will you commit the changes

shell inlet
#

well I ended up playing with it in my free time

#

and thought about better sampling but I'm totally out of ideas tbh, need to read more about stuff

long robin
long robin
#

mfw I had the luminance weight disabled this whole time

  float luminanceWeight = LuminanceWeight(oLuminance, cLuminance, variance, uniforms.phiLuminance);

  float weight = normalWeight * depthWeight;
#

only noticed once I added sliders for everything

heavy cipher
#

what was your cmake error jaker?

long robin
#

something probably identical to geryu's

#

I recall running vcvarsall or something like that

#

tomorrow I can try again on my work machine

#

the weird thing is that after my troubleshooting, I can create vs solutions with cmake, but not open the directory with vs

heavy cipher
#

but geryu's was from the cmdline

#

anyways

long robin
#

the output log had basically the same error

#

but yeah I need to fix these nans now

#

NaN/INF/-ve Display my beloved

#

the issue was that my variance calculation was sometimes returning a number less than 0 nervous (whose result was then fed to a sqrt)
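
#

the usual guard for that, sketched in C++ under the assumption that the variance comes from the two moments accumulated earlier (mean and meanSquared, already divided by the weight sum); StdDevFromMoments is a made-up name

```cpp
#include <algorithm>
#include <cmath>

// mean = E[x], meanSquared = E[x^2].
// Rounding can make E[x^2] - E[x]^2 slightly negative, and sqrt of a
// negative number is NaN, so clamp the variance to zero first.
float StdDevFromMoments(float mean, float meanSquared)
{
  float variance = std::max(0.0f, meanSquared - mean * mean);
  return std::sqrt(variance);
}
```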

long robin
#

the variance guiding seems to create more problems than it solves

#

some disoccluded pixels flicker (rejected by spatial denoiser) due to high variance, so you need TAA or more blurring to fix it

shell inlet
#

what are you using it for anyways

long robin
#

I just wanted to try it

shell inlet
#

try it on direct illumination

#

there is more variance in indirect always

#

unless maybe caustics

long robin
shell inlet
#

I'm currently pestering criver in #mathematics in order to try to move it to semi-directional sampling

#

so that we could use importance sampling of some sort and specular

long robin
#

oh and by variance weight I mean illuminance weight (which is variance-guided)

#

the quality is a little better in the second, particularly at a distance where indirect is higher frequency

shell inlet
#

looks colorful tho very pleasing

#

always a treat to see the indirect from curtains

long robin
#

without vs with illuminance weight

#

second is darker overall, and there is a little less overblurring

shell inlet
#

by the way when using directional sampling you have more apparent blue noise patterns

long robin
#

that's very blue

shell inlet
#

so when sampling random points on a surface there is inherently less correlation between samples?

#

idk tbh

long robin
#

I was also able to get a visible blue noise pattern (though not as strong as yours) by just randomizing xi.x per frame

shell inlet
#

now gotta wait for criver to go online and discuss the issues

#

maybe we should move the conversation over here

long robin
#

maybe

#

I see you guys are in deep discussion

#

I normally have #mathematics muted because convos in it are so long and technical

shell inlet
#

shit I accidentally pressed on visual studio that has been minimized for a few days and is now certainly entirely swapped on hdd

#

it's now going to load everything in ram

#

my hdd usage led is now always active

#

gotta love how much ram vs is hogging

#

almost like chrome

long robin
#

I found that occasionally restarting it helps

shell inlet
#

I sometimes kill its processes

#

such as vcpkgsrv.exe

#

not lethal, it simply creates fresh instances that hog less

long robin
#

I'm also not boosting the variance of recently disoccluded pixels, but I doubt that matters much 🤰

long robin
#

one thing I noticed is that the RSM resolution heavily influences the performance of sampling it (2048^2 vs 512^2)

#

the difference in indirect lighting quality is not very obvious

#

sampling is about 5x cheaper when the RSM is 512^2 compared to 2048^2

#

when the RSM resolution is too low, the indirect lighting starts looking pretty bad (128^2)

#

but at least it only takes a few ms to take 40 samples

#

I need to experiment with having multiple RSMs, i.e., having multiple indirect light casters

long robin
#

alright, I pushed everything

golden schooner
#

the first 2 pics from the last batch of pics, really looks neat

long robin
#

now I gotta optimize it so it runs on your iGPU

#

atm it's about 4-5ms on my GPU, which is too slow

#

on an iGPU it will probably be 40-50ms 😦

#

the first opt I'll try is using separable bilateral filters

#

and then doing everything at quarter res

golden schooner
#

i fink it makes sense not to target igpus with these kind of things anyway

#

GI should be off

long robin
#

Mayhaps

#

I want to run it at 4k in like 2ms though

#

What's sad is that it'd still be slower than gi-1.0 which looks way better than this

shell inlet
#

you could make use of simple irradiance maps placed around for igpus

#

and I'm not talking about preintegrated ones

#

I'm talking about dot luts

#

kinda spherical harmonics I guess

#

there was this sonic game on dreamcast that did this

#

you basically have a gradient texture that covers -1 to 1 range of the dot product of the normal and some direction

golden schooner
#

or "GI" on igpus just takes much longer, spreading it out over 30 frames or whatever

shell inlet
#

and you find the dot product map it to [0; 1] and sample from the texture to get the simple irradiance
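
#

roughly like this on the CPU side, assuming a single-channel gradient with linear filtering (a real one would be an RGB texture); Vec3, SampleGradient and DotLutIrradiance are made-up names

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Looks up a 1D gradient with linear filtering; u is clamped to [0, 1].
template <std::size_t N>
float SampleGradient(const std::array<float, N>& lut, float u)
{
  static_assert(N >= 2, "gradient needs at least two texels");
  u = std::min(std::max(u, 0.0f), 1.0f);
  float pos = u * float(N - 1);
  std::size_t i0 = std::min(static_cast<std::size_t>(pos), N - 2);
  float frac = pos - float(i0);
  return lut[i0] + (lut[i0 + 1] - lut[i0]) * frac;
}

// Remaps dot(normal, axis) from [-1, 1] to [0, 1] and samples the gradient.
template <std::size_t N>
float DotLutIrradiance(const std::array<float, N>& lut,
                       const Vec3& normal, const Vec3& axis)
{
  float u = Dot(normal, axis) * 0.5f + 0.5f;
  return SampleGradient(lut, u);
}
```

both vectors are assumed to be normalized, otherwise the dot product leaves the [-1, 1] range and just gets clamped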

#

btw turns out the rsm sampling using hemisphere thing is very hard, rest in peace

long robin
shell inlet
#

did you know you could optimize the compute light function

float sqr(float x) { return x*x; }

vec3 ComputePixelLight(vec3 surfaceWorldPos, vec3 surfaceNormal, vec3 rsmFlux, vec3 rsmWorldPos, vec3 rsmNormal)
{
  vec3 pathSegmentVec = rsmWorldPos - surfaceWorldPos;
  float cosines = max(0.0, dot(surfaceNormal, pathSegmentVec)) * max(0.0, -dot(rsmNormal, pathSegmentVec));
  // Clamp distance to prevent singularity.
  float d = max(length(pathSegmentVec), 0.03);
  // Inverse square attenuation. d^4 is due to us not normalizing the two ray directions in the numerator.
  return rsmFlux * cosines / sqr(sqr(d));
}
#

I once did that partially but you reverted the changes

#

also two cosines is not geometry term

#

geometry term is two cosines divided by squared distance
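
#

a small C++ check of that, assuming unit normals and leaving out the 0.03 distance clamp from the shader; the Vec3 helpers and both function names are made up for illustration

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 Sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Textbook form: normalize the segment, take two cosines, divide by d^2.
float GeometryTermNormalized(const Vec3& p, const Vec3& np,
                             const Vec3& q, const Vec3& nq)
{
  Vec3 v = Sub(q, p);
  float d = std::sqrt(Dot(v, v));
  Vec3 w = {v.x / d, v.y / d, v.z / d};
  float cosP = std::max(0.0f, Dot(np, w));
  float cosQ = std::max(0.0f, -Dot(nq, w));
  return cosP * cosQ / (d * d);
}

// Same value without the normalize: each unnormalized dot carries an extra
// factor of d, so the denominator becomes d^4.
float GeometryTermUnnormalized(const Vec3& p, const Vec3& np,
                               const Vec3& q, const Vec3& nq)
{
  Vec3 v = Sub(q, p);
  float d2 = Dot(v, v);
  float dots = std::max(0.0f, Dot(np, v)) * std::max(0.0f, -Dot(nq, v));
  return dots / (d2 * d2);
}
```

the second form also skips the square root entirely, since d^4 is just dot(v, v) squared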

long robin
#

I think the paper calls it the geometry term, but not in reference to an actual BRDF or anything. Maybe a better name would be angular attenuation or something

#

what's the optimization?

shell inlet
#

1 sub and 1 negation instead of 2 subs, and 2 muls instead of 3

#

d = d * d
d = d * d
vs
d = d*d*d*d

long robin
#

I would expect the shader compiler to do that one at least

shell inlet
#

maybe but I'm not relying on that

long robin
#

I don't worry too much about small arithmetic optimizations

shell inlet
#

you see another one right

#

we have to take the difference only once

long robin
#

also you can negate the dot product instead of negating the vector

shell inlet
#

oh and also float d = max(length(pathSegmentVec), 0.03);
used distance() before, where I believe a third sub would take place under the hood

long robin
shell inlet
#

I guess I'll move the sign over to the dot

long robin
#

I diffed the assembly (RDNA2) of the original and your version and they were identical

#

wait no

#

my diff tool is special

shell inlet
#

what's the change

long robin
#

one sec, windiff got messed up when I reinstalled VS on this machine

#

wow comp is so useful

Name of first file to compare: a.txt
Name of second file to compare: b.txt
Comparing a.txt and b.txt...
Files are different sizes.
shell inlet
#

lol

long robin
#

diffing assembly is poopy

#

once there is even the slightest change, everything becomes different (this is with some random vscode diff extension I found)

#

new idea is to analyze them with shae.exe

#

overall, the optimized version has 4 fewer instructions

#

and vgpr pressure goes from 31 to 29

shell inlet
#

it was not in vain

long robin
#

God I wish I could analyze OpenGL apps with RGP

long robin
#

I can send the (dis)assembly in a sec

shell inlet
#

so uhh I don't think I would like to compare them

#

what analysis you did was plenty enough

long robin
#

understandable

shell inlet
#

so are we going to implement software ray tracing in fwog

long robin
#

sw rt is something I eventually want to explore since a lot of techniques use it, but right now I have too much stuff on my plate to do it

shell inlet
#

does fwog at least have the means to implement it?

#

two level acceleration data structure is probably taking it too far though since even with rsm there are no dynamic things

#

but building one bvh for the whole scene and packing it into a buffer texture should be enough

#

ok maybe not, maybe using ssbo is better

#

buffer textures would be useful only for two level ADS

long robin
shell inlet
#

since you could exploit the bindless textures extension

#

and treat it as BDA

#

and BDA is very useful for two level ADSes

long robin
#

I see

shell inlet
#

though performance is worse than the true bindless buffers

#

but I have RTX so who cares

#

it can chew it ezpz

long robin
#

using a single giant SSBO for all BLASes is probably okay

#

one for TLAS, one for BLAS

#

it'd definitely be fine for demo purposes

shell inlet
#

probably, but code will be more convoluted

long robin
#

you'd abstract away buffer access to shrimplify it

shell inlet
#

you could implement a SSBO pool class that grows automatically like vector

#

and abstract the memory thing away at least on the host side

#

on the device side the shader might get messy

#

maybe with macros it could be saved a bit

long robin
#

instead of the TLAS pointing to a buffer, it points to a segment of the BLAS buffer

#

I don't think it would be too bad tbh
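
#

host-side sketch of what I mean, with no actual GL or Fwog calls: it assumes every BLAS gets appended to one CPU-side array that would later be uploaded as a single SSBO, and BvhNode, BlasRange and BlasPool are made-up names with a placeholder node layout

```cpp
#include <cstdint>
#include <vector>

// Placeholder node layout; a real BVH node would be packed for the GPU.
struct BvhNode
{
  float boundsMin[3];
  float boundsMax[3];
  std::uint32_t leftOrFirst;
  std::uint32_t count;
};

// What a TLAS instance would store instead of a buffer pointer.
struct BlasRange
{
  std::uint32_t firstNode;
  std::uint32_t nodeCount;
};

// CPU mirror of one big buffer holding every BLAS back to back.
class BlasPool
{
public:
  // Appends one BLAS and returns the segment it occupies in the pool.
  BlasRange Add(const std::vector<BvhNode>& nodes)
  {
    BlasRange range{static_cast<std::uint32_t>(nodes_.size()),
                    static_cast<std::uint32_t>(nodes.size())};
    nodes_.insert(nodes_.end(), nodes.begin(), nodes.end());
    return range;
  }

  const std::vector<BvhNode>& Nodes() const { return nodes_; }

private:
  std::vector<BvhNode> nodes_;
};
```

child indices inside each BLAS would stay relative to firstNode, so the shader adds the offset when it descends from the TLAS into a BLAS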

shell inlet
#

maybe, sounds good enough to me

long robin
#

actual buffer pointers would be nicer, but this is what we got

shell inlet
#

I also recommend using madmann's bvh lib

long robin
#

that's cheating 🤰

shell inlet
#

but writing your own is going to be a pain when there is a lib like that out there

#

I wouldn't call it cheating if it has more common sense

long robin
#

nih destroyed

shell inlet
#

or you're saying you'd like to make bvh building a core feature of fwog?

long robin
#

oh jeez

#

nah fwog just aims to be an opengl wrapper

shell inlet
#

examples are already something to behold though

#

ever since I came and said what if we dither-filter your rsm

long robin
#

ye I should advertise those more

shell inlet
#

there's also galunga I believe?

long robin
#

ideally, the examples can be used for learning

long robin
#

I can't find the project post

shell inlet
#

ok so

long robin
shell inlet
#

BLAS stores triangle indices
TLAS stores offsets into BVH buffer

#

wait nope

#

it stores renderables

#

renderables store offsets into BVH buffer

long robin
#

ah

shell inlet
#

ok it stores indices to renderables

#

there should be another buffer with them

#

you would have a matrix and material and BVH offset and shit per each renderable struct

#

also need to fetch tris

#

so this info too

long robin
#

there would be a lot of buffers

#

with all this indirection

shell inlet
#

you'll have to pack your geometry too then

#

just a buffer per vertex attribute type I guess

#

unless you store interleaved

#

array of structs

#

this is a limitation in the sense that your geometry will have to be all of same format or you'll have to make a buffer for all permutations

#

yeah there is a lot of things to consider

#

better make a draft in some text file or something

#

can't keep track of all of it in your mind

long robin
#

yeah

shell inlet
#

but are you set on making two level ADS?

#

I thought it's fine to have one BVH for the whole scene

#

still need to pack geometry though

long robin
shell inlet
#

you will do it because I am in this thread

#

as soon as I get free time I will start a fork and you will most likely get dragged in purely by how contagious my enthusiasm is

long robin
#

you can just make a new project that uses fwog, no need to fork

#

but you can fork if you want

shell inlet
#

true

#

but that will become my project and I don't want that, I want to contribute to fwog

long robin
#

I would happily link your project under the "users of fwog" section 😄

shell inlet
#

should be the first library with the most over the top examples

#

examples:
hello triangle
deferred rendering
reflective shadow maps with denoising
real time software ray tracing with denoising
AAA game we made for fun in our spare time

brisk narwhal
#

I like it, the asymptotic upper bound is O(n^n) for the example complexity

long robin
#

I just realized that the RSM is definitely being affected by the depth bias in the shadow pass

#

the backsides of these curtains are no longer being strongly illuminated when I turn off the depth bias (ignore incorrect shadows)

#

with depth bias

shell inlet
#

so you're writing the depth with an offset huh

long robin
#

it really do be like that

#

I guess it's time to put the offset in the shader that applies the shadow

shell inlet
#

remember there was an offset along the normal? I wonder why the paper did that

long robin
#

yeah idk why

#

maybe that will be useful now

shell inlet
#

probably not but you made me recall it

#

I kinda like how curtains are translucent though

long robin
#

it was a neat effect, but horrible to denoise

shell inlet
#

why is it horrible to denoise?

long robin
#

there were not many valid samples, since the only samples it could use were ones on the interior of the wrinkles of the curtain

#

and what samples there were, were all pretty weak since none were directly facing the surface being shaded

shell inlet
#

btw does feeding samples from third atrous not help at all?

long robin
#

I tried and it didn't seem to do much

#

I'm normally feeding back the output of the first atrous pass already

#

seemed like the history was already denoised enough, and it was the current frame's samples that were causing issues

shell inlet
#

plz add my optimization

#
#extension GL_GOOGLE_include_directive : enable

thonk

#

this is not an extension that's implemented on any desktop gpu

#

on nvidia at least

long robin
#

nvidia actually implements part of it, but that's not why I have it

shell inlet
#

well it causes a crash for me

long robin
#

I have it just so my linter doesn't complain about #includes

shell inlet
#

wait that's not the problem

#

it warns but not errors

long robin
#

ye that's fine

shell inlet
#

ur dividing float by uint

#

118 reproject2

long robin
#

isn't the uint implicitly converted to float?

#

spec says no

#

wait

shell inlet
#

changed to float(historyLength) and now theres another error

#

in bilateral now

long robin
#

When performing implicit conversion for binary operators, there may be multiple data types to which the two operands can be converted. **For example, when adding an int value to a uint value, both values can be implicitly converted to uint, float, and double. In such cases, a floating-point type is chosen if either operand has a floating-point type.** Otherwise, an unsigned integer type is chosen if either operand has an unsigned integer type. Otherwise, a signed integer type is chosen.

#

what driver are you on?

shell inlet
#

hell if I know

long robin
#

ok, what GPU?

shell inlet
#

2060

long robin
#

I'm on nv as well (3070), so that's weird

shell inlet
#

I've always had to cast to float

#

even on 1050

#

and on amd too

#

some old radeon hd series

long robin
#

odd, I never had to do it in modern glsl

#

in shadertoy I have to do it cuz it uses some ancient glsl es

#

but the spec (for glsl 4.6) says right there that this conversion is legal

shell inlet
long robin
#

I also ran the code on my AMD machine today (rx 6800) and it ran fine

shell inlet
#

look I can see depth bias causing leaking

long robin
#

shadows in gltf viewer now use fragment shader offset

shell inlet
#

spatial filter step > 1 reduces boiling

#

considerably

#

but also overblurs

long robin
#

fyi the "alpha moments" and "phi luminance" sliders do nothing atm since luminance weighting is disabled

#

somewhere around line 59 in Bilateral.h.glsl is where it's disabled (I just set the weight to 1)

shell inlet
#

svgf and 5 samples seems stable

#

I smell something coming from the case

#

must be gpu

long robin
#

made the shadows stochastic because I deleted shadow samplers

#

only for gltf viewer though

shell inlet
#

finally a use for variance weights?

#

though denoising shadows temporally will lead to ghosting no doubt

#

unless you add temporal gradients or drop history (or make alpha smaller) on sun angle change

long robin
#

I ain't denoising no shadows 😩

#

just making something serviceable that isn't blocky af

shell inlet
long robin
#

knees weak, arms are heavy

golden schooner
#

hehe

#

rRSM needs to be higher a little