#Fwog and co.


long robin
#

tbh though, I think a shrimple screen-space denoiser would work fine for this

#

since the shadow is not sparse at all

golden schooner
#

at least the jagged shadow is history

long robin
#

once you start tracing actual rays is when a denoiser is needed

shell inlet
#

but it's called denoiser not derayer

#

but I understand what you mean, you aren't warranting a variance guided filter because you don't even need to preserve penumbra if you don't even have one

#

it's a penumbruh

#

same width everywhere

#

but you know what

#

actually nvm it's more expensive maybe

long robin
#

a 3x3 bilateral filter would probably remove most of the noise

shell inlet
#

I was about to suggest cubic for shadow sampler

long robin
#

maybe I could even go down to 1spp

shell inlet
#

shadow sampler is bilinear so will work

#

maybe it won't be as slow as I think due to locality of taken samples but I don't know really

#

essentially a better 2x2 pcf

#

yes here I go again with my weird ideas

long robin
#

you could try it in your local repo

#

just go to ShadeDeferredPbr.frag.glsl and update the shadow function

shell inlet
#

no you do it

long robin
#

and add your custom cubic thingy

shell inlet
#

you have the code

long robin
#

I already deleted the shadow sampler on mine

#

it's gone forever

shell inlet
#

I am doing homework I am business

long robin
#

well I gtg to bed

shell inlet
#

goodnight

long robin
#

gn

shell inlet
#

took a break and tried it out, not much of an improvement

long robin
#

Not a bad improvement imo

golden schooner
#

looks almost as good as my shadow ;P

long robin
#

after the depth bias was removed, ultra low RSM resolutions work fine (this is 128^2)

shell inlet
#

nice

shell inlet
#

what are those triangles on the green curtain

long robin
#

shadow acne

#

the bias numbers I hardcoded for the shadows do not work in all scenarios

shell inlet
#

have you heard about slope scale bias

long robin
#

this is what I'm doing

  float constantFactor = 0.0002;
  float angleFactor = 0.001 * (1.0 - dot(-shadingUniforms.sunDir.xyz, normal));
  ...
  lightDepth += angleFactor + constantFactor;
shell inlet
#

looks like slope scale, except cosine is not linear

#

angle is

#

cosine not

long robin
#

what's the correct math for it

shell inlet
#

I don't know

long robin
#

these numbers work okay most of the time

#

I'm pretty sure there is an analytical way to compute the bias, but I'm not gonna tryhard it that much

shell inlet
#

linearly changing value would be acos(dot)/pi

#

but I'm not sure whether it's required now

#

using cosine as a slope scale just feels strange to me because it won't grow properly with slope

long robin
#

you're right, it fails at extreme angles

shell inlet
#

I need to think what correct slope would be but I need to eat first

long robin
shell inlet
#

2014

long robin
#

the second one isn't an analytical approach, but it does talk about some other stuff

#

kinda interesting

#

rip
float bias = 1.0 - clamp(dot(lightFragNormal, lightFragmentCoord), 0.0, 1.0);

#

stabilizing the shadow with temporal filtering is a good idea though

shell inlet
#

here is what you have now

#

this is what would make sense to me

#

inverse trig is banned on GPU tho so maybe there's a way to find acos using identities

#

uhh yeah I think we can find tan by using tan=sin/cos

#

can get sin from cross product and cos is the dot product

#

so it's len(cross(d,n))/dot(n,d)

#

when n,d are normalized

#

actually order of cross doesn't matter if we are taking magnitude of a vector

#

in 2d equivalent would be absolute value
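
The tangent trick above can be sketched in C++ as follows. This is a minimal sketch, assuming `n` and `d` are unit length and `d` points from the surface toward the light; `Vec3`, `slopeTan` and the `maxTan` clamp are illustrative names, not taken from the Fwog source.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

float length(Vec3 v) { return std::sqrt(dot(v, v)); }

// tan(theta) between unit vectors n and d without calling acos:
// |cross(n, d)| = sin(theta) and dot(n, d) = cos(theta).
// The result is clamped to maxTan (a hypothetical limit) because the
// tangent grows without bound as the light approaches grazing angles.
float slopeTan(Vec3 n, Vec3 d, float maxTan) {
    float sinTheta = length(cross(n, d));
    float cosTheta = dot(n, d);
    if (cosTheta <= 0.0f) return maxTan;  // grazing or facing away from the light
    return std::min(sinTheta / cosTheta, maxTan);
}
```

Unlike the `1.0 - dot(...)` factor, this value keeps growing with the slope, which is what a slope-scaled bias is supposed to track.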

#

so here is the thing

#

I'm done

long robin
shell inlet
#

why are you rdna bombing

#

where are you going to need this

shell inlet
#

edited graph added automatic bias for a given width

#

I think width would be 1/max(textureSize(shadowmap)) in the code

#

max texel width basically

long robin
#

alright, I'll try it

shell inlet
#

it does not account for quantization of the depth value itself, but I guess it would be 1/((2^precisionbits)*depthrange)
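
A rough C++ sketch of how the texel width and the depth quantization step might be combined into one bias. The graph mentioned in the chat is not available, so the formula in `automaticBias` is an assumption, as are all function names. It also assumes an orthographic light whose UV and depth axes span the same world-space extent. The quantization step is given in normalized [0, 1] depth, where one step of an N-bit unorm format is 1/(2^N - 1).

```cpp
#include <algorithm>
#include <cmath>

// Largest texel extent in UV units. For a square shadow map this equals
// 1 / max(textureSize(shadowmap)); for a non-square map the smaller
// dimension has the larger texels, hence std::min here.
float maxTexelWidth(int width, int height) {
    return 1.0f / static_cast<float>(std::min(width, height));
}

// One quantization step of an N-bit unorm depth format in normalized depth.
float unormDepthStep(int bits) {
    return 1.0f / (std::exp2(static_cast<float>(bits)) - 1.0f);
}

// Hypothetical combined bias: the depth change across one texel on a surface
// with slope tanTheta, plus one quantization step of the depth format.
float automaticBias(float tanTheta, int width, int height, int depthBits) {
    return maxTexelWidth(width, height) * tanTheta + unormDepthStep(depthBits);
}
```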

long robin
#

for unorm formats I suppose

shell inlet
#

ye

#

integer and ortho

#

perspective shadows are a pain

long robin
shell inlet
#

peter panning under extreme angles

long robin
#

ye

shell inlet
#

is it worth it

long robin
#

better than horrible acne probably

shell inlet
#

bigger resolution should make peter panning smaller

long robin
#

ye

#

still peter pans at the most extreme angles

#

I know a trick though

shell inlet
#

mr. min()?

long robin
#

float cosTheta = max(0.0, dot(incidentDir, normal) - 0.02);

#

just make the surface dark before the shadow starts peter panning hard

shell inlet
#

😱

#

this is what variance/moments shadow maps aim to solve

long robin
#

tfw leaking

shell inlet
#

also that's a hard cutoff isn't it?

#

let me graph actually

long robin
#

what's a hard cutoff

#

the cosTheta thing I'm doing is for the shading, not for the shadow bias

shell inlet
#

yes I know

#

I mean that it'll go black abruptly

#

it'll reach 0.8 and after that it'll go 0

long robin
#

I'm subtracting 0.02 from the whole thing, so it still smoothly goes to 0

#

the peak is being lost

shell inlet
#

oh yeah it'll go to 0 smoothly but it won't reach 1 then

long robin
#

could use a smoothstep near 0 or something

shell inlet
#

so you need to compensate with 1+0.02

#

1.02*dot(n,l)-0.02

long robin
#

I'd actually have to multiply by 1.02040816 to compensate

#

huge difference

#

ah nvm your math is different lel

#

tbh I'm not gonna bother

#

the peter panning is only barely noticeable in this fake scene

shell inlet
#

my math is wrong

#

you need (dot(n,l)-cutoff)/(1-cutoff)

#

division is to normalize the range [0; 1-cutoff] to [0;1]

long robin
#

i c

shell inlet
#

at this point we are straying further away from god pbr

long robin
#

it's odd because you can see that the shadow is farther back

#

ah I think it's my fault

#

nvm

#

I see what the problem is

#

the edge of the floor box is casting a shadow

#

tl;dr not an actual issue

shell inlet
#

what if you clamp the bias though would it really still cause acne?

#

min(max_bias, slope_scale_here)

long robin
#

Yeah I tried earlier

shell inlet
#

hmmm it will I think

#

ok confirmed then

#

what if we shift shadow sample uv along projected normal

#

at best 1 pixel to the side

#

would it solve acne at extreme angles

#

so that we could also reduce peter panning

#

or like check if tangent is higher than some threshold and then instead of adding depth bias shift the uv along the projected normal

#

you get what I mean?

long robin
shell inlet
#

yes and using it as offset vector

long robin
#

Ye I see how that could potentially help

shell inlet
#

projecting it is taking difference (project(p+n) - project(p))
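
One way to picture the idea is to offset the position along the normal in world space and then project it. The C++ below is only a sketch under strong assumptions: a hypothetical orthographic light (`OrthoLight`) that looks straight down the z axis and maps world x/y in [-halfExtent, halfExtent] to UV in [0, 1]. None of these names come from Fwog.

```cpp
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Hypothetical axis-aligned orthographic light looking down -z.
struct OrthoLight { float halfExtent; };

Vec2 projectToUv(const OrthoLight& light, Vec3 p) {
    return {(p.x / light.halfExtent) * 0.5f + 0.5f,
            (p.y / light.halfExtent) * 0.5f + 0.5f};
}

// Move the shading point along its unit normal by `texels` shadow-map texels
// (measured in world units) before projecting it into the shadow map.
Vec2 offsetShadowUv(const OrthoLight& light, Vec3 p, Vec3 n, int resolution, float texels) {
    float texelWorld = 2.0f * light.halfExtent / static_cast<float>(resolution);
    float k = texelWorld * texels;
    Vec3 q = {p.x + n.x * k, p.y + n.y * k, p.z + n.z * k};
    return projectToUv(light, q);
}
```

A normal perpendicular to the light direction moves the lookup by the full offset, while a normal facing the light leaves the UV unchanged, so the shift only kicks in at steep angles.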

#

huh that is stupid

#

can use inversetranspose

#

transpose(inverse(sunviewproj))

#

I did not think this through it's a crazy idea again

#

so it might not work at all

long robin
shell inlet
#

maybe shifting it towards the inverted projection

#

cuz we want the pixel to be in shadow

#

maybe adding bias AND shifting towards inverted projection

#

this is what'd happen

#

maybe multiply the projection by pixel size also

shell inlet
#

did it not work?

long robin
#

It was a response to your previous message about it being a crazy idea

shell inlet
#

ic

#

it's probably stupid

shell inlet
#

how did sponza change after adding perfect slope scale bias?

#

and does dithered shadow look better?

long robin
long robin
shell inlet
#

tried hqx shadows?

#
Hqx

In image processing, hqx ("high quality scale") is one of the pixel art scaling algorithms developed by Maxim Stepin, used in emulators such as Nestopia, FCEUX, higan, Snes9x, ZSNES and many more. There are three hqx filters defined: hq2x, hq3x, and hq4x, which magnify by factor of 2, 3, and 4 respectively. For other magnification factors, this ...

#

I mean obviously no but what if

long robin
#

interesting

shell inlet
#

there was a paper out there about vectorization of shadows too, this is similar but less fancy

#

similar in end result

#

this does not require bilinear taps

#

unlike bicubic

#

just 3x3 and a LUT for best results

#

this will likely reduce amount of pixelation

#

not as much as vectorized shadows tho

#

by the way about specular rsm

#

I think it's possible to do some sort of rejection sampling with reprojection to sample more efficiently

#

oh and probably restir too

#

or ris at least

#

I mean restir is built on top of ris so it's going to be a start

#

though it's going to require yet another temporal pass for both

#

and is probably overkill so maybe it's time to stop

long robin
#

ye I'm about done with the svgf stuff

#

I just want to optimize it and gtfo

long robin
long robin
#

I estimated

shell inlet
#

you would get penumbras all smudged without them

#

from direct illumination

long robin
#

well, it just looks smudgy in general

shell inlet
#

I don't see it

#

I see temporal lag

#

which is not related to variance estimation

#

perhaps you mean temporal gradient

long robin
#

that's wat I sayd

shell inlet
#

damn what's wrong with me

#

I'm killing myself now

#

you are absolutely right

long robin
#

I wonder why they didn't do it

#

I'd expect these people to use top tier denoising techniques

shell inlet
#

there are other methods besides temporal gradients

#

perhaps they used that but it's being overwhelmed

#

look up history clamping in ReLAX

long robin
#

april 2021 nice

shell inlet
#

no heuristic beats temporal gradients imo

#

but temporal gradients either roughly double the cost (trace twice per pixel and keep previous frame around in its entirety), or add bias (pick 1 pixel in 3x3 strata and forward project them to be shaded together with the current frame)

#

ok can also do it in half/third resolution the first way, too

#

but that kills more pixels that may be valid

#

depending on how many pixels 1 gradient pixel covers

#

obviously

long robin
shell inlet
#

what is, exactly

#

don't tell me whole thing

#

also if you love nvidia so much why don't you work at nvidia

long robin
#

da part where they use a fast history buffer

#

and clamp to neighborhood color like in TAA
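
A loose C++ sketch of the clamping idea, reduced to a single scalar channel. It is not ReLAX's actual implementation; `clampToNeighborhood` and its parameters are illustrative. It assumes the long history is clamped to the min/max of a 3x3 neighborhood taken from the fast history.

```cpp
#include <algorithm>

// Clamp the slowly accumulated history value to the range spanned by the
// 3x3 neighborhood of the fast (short) history, in the same way TAA clamps
// history color to the current frame's neighborhood.
float clampToNeighborhood(float slowHistory, const float (&fastNeighborhood)[9]) {
    float lo = fastNeighborhood[0];
    float hi = fastNeighborhood[0];
    for (float v : fastNeighborhood) {
        lo = std::min(lo, v);
        hi = std::max(hi, v);
    }
    return std::min(std::max(slowHistory, lo), hi);
}
```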

shell inlet
#

tbh I wish I was working at nvidia then I would call you my sworn rival

#

lmao that emote nonovidia

long robin
#

I actually made that one lol

shell inlet
#

I expected nothing less

long robin
#

btw I think the fast history buffer would work great for indirect diffuse

#

also wtf that page turned chinese while I wasn't looking at it

rugged notch
#

not very bing chilling of them

shell inlet
#

ching cheng hanji moment

long robin
#

it must have been autoplay or something because it was a different page than before

golden schooner
#

whats the next topic for fwog? after svgtfo?

rugged notch
#

finding love 😳

golden schooner
#

he found love already, us, frogs

rugged notch
golden schooner
#

speaking of marzipan, are you working on a froge 2.0? @rugged notch

long robin
#

maybe I will finally have the willpower to implement clustered light culling

#

nah jk KEKW

rugged notch
golden schooner
#

too bad

rugged notch
#

which is fair, she just did it on a whim and doesn't want the bother I guess

golden schooner
#

maybe jumping spider is next 🙂 can you do some inception work

rugged notch
golden schooner
#

so cute 🙂

long robin
#

that froge looks funny

long robin
#

btw @shell inlet I think svgf's default of 0.2 for alpha is shrimply crazy

#

it's in the range of what this ReLAX technique considers to be "fast history"

#

kinda funny too cuz this path tracer with ReSTIR has insanely good sampling

#

so they don't even need a very low alpha

#

except maybe for specular, idk
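
For reference, the alpha being discussed is the blend weight of an exponential moving average. A minimal C++ sketch with illustrative function names:

```cpp
#include <cmath>

// Temporal accumulation as an exponential moving average:
// alpha is the weight given to the newest sample.
float accumulate(float history, float current, float alpha) {
    return history + alpha * (current - history);
}

// Weight that a sample still carries in the history `age` frames after
// it was blended in.
float sampleWeight(float alpha, int age) {
    return alpha * std::pow(1.0f - alpha, static_cast<float>(age));
}
```

With alpha = 0.2 the history behaves roughly like an average over the last 1 / 0.2 = 5 frames, which is why it counts as a short ("fast") history.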

shell inlet
#

good sampling will leave you a 1spp image that is almost converged

#

so I don't think it's too crazy

long robin
#

that's what I'm saying

#

they don't need the crutch of a low alpha like I do

shell inlet
#

you overestimate restir

#

restir also needs a few frames to digest samples before it's good

long robin
#

maybe the original SVGF paper is indeed being wacky by suggesting 0.2 for alpha

shell inlet
#

first few frames are horrible

#

so much so that they need a disocclusion filter

long robin
#

yeah

#

ok I think I see

#

so I guess restir is really good after it converges for a few frames

#

which is what you said

golden schooner
#

so loading screen at the beginning for it to catch up a few frames think

long robin
#

turn around
loading screen

shell inlet
#

display loading screen at disoccluded regions

golden schooner
#

😄

#

or you show the scene heavily blurred, and blur gets less and less until everything is loaded in

long robin
#

they should show this DONT CARE texture from renderdoc

#

man wtf is that font

golden schooner
#

hmm we have no :novulkan:

#

looks like some CJK font

#

MS Mincho or so

long robin
#

ye

#

the thinness + consoleish typeface is the tell for me

golden schooner
#

ye and the odd angles/serifs

#

i always wonder why the latin alphabet looks so shitty in the asian fonts

long robin
#

but I'm guessing the main issue is that it's a lot harder to create these fonts

golden schooner
#

and not run into copyright issues 😉 with the other 2934824839845302482 fonts

long robin
#

I wonder if there is a way to use different fonts for different parts of unicode

#

so you could use a non sucky latin font + a CJK font

golden schooner
#

it is possible

#

you render the text runs shrimperately think

long robin
#

you're gonna give me the text runs

golden schooner
#

: D

shell inlet
#

of course it gets harder as scene complexity grows

#

each bounce becomes harder to sample efficiently and this is where resampling excels

#

but at the price of needing a few samples to get the idea of sampling domain

#

here it's analytically tweaked to importance sample direct illumination but without taking visibility term into account because that's basically impossible to do analytically

#

resampling is the numerical way to do it

#

anyways that's where small history would suffice, and sponza is not a very challenging scene in terms of complexity (most of it is directly illuminated and the rest is illuminated by 1st bounce)

#

if you want a challenging scene, see my path_tracing_nightmare.glb

long robin
#
Material getMaterial(int i) {
    if(i==0) return materials[0]; else
        if(i==1) return materials[1]; else
            if(i==2) return materials[2]; else
                if(i==3) return materials[3]; else
                    if(i==4) return materials[4]; else
                        if(i==5) return materials[5]; else
                            return materials[6];
    //return materials[i];    //webGL 2.0
}
#

send help

long robin
#

it'll probably look like ass with RSM but whatever

shell inlet
#

you asked for it already bruh

#

and I sent it

long robin
#

oh lmao

#

I wonder where I put it

shell inlet
#

#1030797522177888316 message

#

also saw this again, still funny

long robin
#

I must have forgotten to copy it to my models folder

shell inlet
#

it's designed to progressively get harder as you go deeper into the room

#

because each bounce is harder to estimate

#

blender basically dies there

#

and denoiser gets all splotchy

#

in the furthest corner that is

#

essentially a scene to test your path guiding algo

long robin
#

shadows be leaking

#

that bias

shell inlet
#

this scene is too hard for rsm

#

looks good if you don't let anyone see what's under the 1st floor ceiling

long robin
#

filtered RSM is such an improvement in this scene though

shell inlet
#

😎 👍

#

oh by the way

#

can you make sun intensity slider

#

or exposure

#

tbh you maybe want some tonemapping now

#

maybe warrants a tonemapper demo?

long robin
#

I already got the

  // tone mapping (optional)
  //finalColor = finalColor / (1.0 + finalColor);

doe

#

jk jk
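
The commented-out line above is the simple Reinhard operator. Here is a scalar C++ sketch of it combined with an exposure multiplier, which is an assumption about how the requested exposure slider could be wired in; `tonemapReinhard` is an illustrative name.

```cpp
// Scale the HDR value by an exposure factor, then compress it into [0, 1)
// with the Reinhard curve c / (1 + c). Assumes non-negative input.
float tonemapReinhard(float hdr, float exposure) {
    float c = hdr * exposure;
    return c / (1.0f + c);
}
```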

shell inlet
long robin
#

I think I'm using some slightly better tonemapper for the volumetric example already

shell inlet
#

well that was just an idea for another example which might be fun

#

I sure would like to try those out some time later for personal project

long robin
#

apparently using the ACES tonemapper without a bunch of other stuff is a bad idea (according to other people on this discord)

#

idk what the other stuff is though, and imo it looks fine on its own

shell inlet
#

I don't like aces because it's simply compressing everything to ldr

#

I want dynamic exposure change cuz its cool

long robin
#

ah

#

I have implemented that twice now

shell inlet
#

and overexposed parts that emit a glow with fft bloom

long robin
#

but not today since I gotta schleep

dire badge
shell inlet
#

I don't have an answer to this question

#

I can only speak for myself

#

if you mean used in absolute general sense, then it never was unused

#

if you mean being used in production with real time graphics such as games - I don't know

#

the fact is, FFT is cheaper than naive convolution when you have large kernels

#

and diffraction patterns are generally very large

golden schooner
#

that link came across my eyes again and i thought i'd show you 😄 then i realised i must have shown it to you already

long robin
#

Ye I skimmed through the source the first time you showed me

golden schooner
#

ah

#

and then you forgor

long robin
#

what was I supposed to rembererer

golden schooner
#

idk, i was just trying to be funny

#

they have ray and pathtracing going on

long robin
#

ye it seems quite legit

golden schooner
#

all done in openjayell

shell inlet
#

so I installed win10 to play portal rtx

#

can confirm ghosting all around on any settings

#

sometimes even portal gun leaves trails when you move the camera

long robin
#

muh broment

long robin
#

separable 5x5 vs regular 5x5 bilateral (I adjusted the settings to make the spatial blur super obvious lol)

#

hmm I must have messed something up since the first looks so much smoother

#

or maybe the separable bilateral really do be like that

shell inlet
#

is this truly 5x5

long robin
#

it's 5 passes of 5x5

#

in the first image, I separated the filter so it alternates between horizontal and vertical passes

#

so it's technically 10 smaller passes
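
The pass counts above translate into texture taps per pixel as follows. This is a small C++ sketch with illustrative function names.

```cpp
// Texture taps per pixel for `passes` iterations of a full k x k filter.
int tapsFull(int k, int passes) { return passes * k * k; }

// Same number of iterations, each split into one horizontal and one
// vertical 1D pass of k taps.
int tapsSeparable(int k, int passes) { return passes * 2 * k; }
```

For 5 iterations of 5x5 that is 125 taps against 50. A bilateral filter is not truly separable because its weights depend on the image content, so the split version is an approximation and some visual difference is expected.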

#

wait a sec

#

ah, I'm only using 4 passes in the second pic

#

here is how it should look

shell inlet
#

uncanny how pattern around the plant is very similar

#

check this and previous separable bilateral screenshot

#

I almost thought you posted same one twice

#

or did you

long robin
#

lol they're different, I promise

#

I put each image in a separate layer in paint.net so I can inspect

#

comparing images in chrome is painful

shell inlet
#

they are totally same patterns

#

everywhere

#

except one is darker

long robin
#

which pattern? the noise?

#

I made sure to fix the seed in both pics (and disable temporal accumulation)

#

so the only difference is the filter

shell inlet
#

yeah I thought you just somehow got lucky capturing similar patterns lol

long robin
#

dream luck

shell inlet
#

by the way I forgot to mention that last time I ran the example there were some issues with random history drops in the distance

#

it's probably related to depth

#

when you move the camera it randomly drops history

long robin
#

was it big sections or single pixels dropping history?

shell inlet
#

big sections

#

like slices

long robin
#

hmm, that artifact happens when the depth phi is too low

#

or anything else that causes excess sensitivity for that weight

#

I'm not seeing it on my machine, at least not in the scenes I tested

shell inlet
#

make filter step as low as possible

#

will be easier to spot

long robin
#

ye that's what I did

#

I also have a debug mode to spot disocclusions

#

in Reproject2.comp.glsl there is a #define at the top to toggle it

#

if I turn really fast I think I can see what you're talking about

shell inlet
long robin
#

ok that's weird

#

when was the last time you pulled?

shell inlet
#

maybe you are not fixing depth values after reprojection

#

I think it won't be same depth if you reproject a pixel from the side to the center

long robin
#

I wonder if the fact that I'm running at 240hz makes it harder to see it

#

it would make the depth differences smaller

#

hmm yeah I see it better at 60hz

shell inlet
#

maybe instead of using depth values use camera distance

#

those should be invariant to fragment locations certainly

#

length(world_pos - view_pos)

#

or distance(world_pos, view_pos)

#

though when camera moves depth will also change

#

for same surface point between two frames

long robin
#

it should probably be world pos then

#

I thought I could get away with linear depth, rip
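
A hedged C++ sketch of what a world-position based history test could look like. It is not the code in `Reproject2.comp.glsl`; `historyValid` and `relTolerance` are hypothetical, and scaling the tolerance by view distance is an assumption.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float dist(Vec3 a, Vec3 b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Accept the reprojected history sample when its world position is close to
// the current pixel's world position. The tolerance grows with distance from
// the camera, because position reconstructed from depth gets less precise
// far away.
bool historyValid(Vec3 currWorldPos, Vec3 prevWorldPos, Vec3 cameraPos, float relTolerance) {
    return dist(currWorldPos, prevWorldPos) <= relTolerance * dist(currWorldPos, cameraPos);
}
```

Unlike a depth comparison, the world position of a static surface point does not change when the camera moves, so the test does not get stricter with faster motion or lower frame rates.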

#

btw the separable bilateral pass is ~1.2ms and the old one is ~2.8ms at 1080p

shell inlet
#

I remember sultim also having issues with that

#

depth heuristics are a bit of a pita

long robin
#

unfortunately the separable filter seems a bit worse in unstable areas

#

seems like there is more flickering in high frequency disoccluded areas like pot edges and leaves

#

I also tried a 7x7 filter for the lulz and there was 0 difference

#

its width probably made the depth phi too low for outer samples

dire badge
long robin
#

I pushed my changes, including gui options for the sun specifically for void

long robin
#

One trick I might try for improving perf/quality is to pre-downsample the RSM so it's cheaper to sample

#

Might even allow for multiple RSMs since sampling each one is relatively cheap

shell inlet
#

did you add something more than just intensity?

#

another rotation axis maybe?

#

sun color?

long robin
#

hacked up some code to make the RSM + denoiser run at quarter (0.5^2) res, now the whole pass takes ~1ms (it takes 1.8ms at full res)

#

it flickers really hard now

#

increasing the sample count is really cheap though, and it can reduce the temporal aliasing

#

though it would probably be ideal to use legit TAA

long robin
#

if I pre-downsample those to the target resolution first, perf may improve

#

sadly glBlitFramebuffer seems to be extremely slow. I use a blit when albedo modulation is disabled, and that alone takes 2ms

#

nsight seems to make the blit take a normal amount of time when profiling

long robin
#

odd how there is no single-component 32-bit unorm texture format in gl

#

except for D32

shell inlet
#

floatBitsToUint

#

uintBitsToFloat

#

wait unorm

#

is this reasonable

long robin
#

for what I need, not really

shell inlet
#

does float not lose a lot of info that you'd get to have in unorm 32 bit

long robin
#

it gets converted to float anyways when you sample

#

what I wanted was to copy a D32 texture to an R32 unorm texture, but R32F will work exactly the same

shell inlet
#

to sample?

#

in vulkan you probably could transition depth image to a readable layout and not copy things around

long robin
#

you can sample depth textures directly in gl

#

what I'm doing is downsampling depth with a compute shader

shell inlet
#

interesting haven't tried it

#

I remember d3d9 not allowing you to do that

long robin
#

I can't write to depth formats so I have to use something else

shell inlet
#

are you building depth pyramid

long robin
#

I'm just seeing if an optimization idea will actually work

shell inlet
#

naturally hitting more cached pixels yes

#

I anticipate costs going down

golden schooner
# long robin

i like it, i also like that the blue curtain kind of emits some volumetric looking looks

long robin
#

it runs in about 0.7ms after downscaling the relevant buffers

golden schooner
#

sounds fast

long robin
#

ye but it's flickery AF now

#

I can mitigate it by cranking up the samples tho

long robin
#

if I downsample the RSM to 128^2, I can take 40 samples and the whole pass still takes only 1ms

#

quality seems practically identical on sponza too

shell inlet
#

you need something less than 30 series to truly test it

#

less even than 20 series

#

maybe some pascal

long robin
#

maybe I can get deccer to try it again on his igpu

shell inlet
#

was his igpu better than mine 710m

long robin
#

idk

#

@golden schooner what was your perf the last time you ran my rsm example?

shell inlet
#

what's ur igpu

golden schooner
#

it was in the basement

#

intel hd 4600

#

yours is better

#

ill pull

long robin
#

wait

#

I haven't pushed yet

shell inlet
#

usermememark

long robin
#

there is a bug with the regular RSM

golden schooner
#

oi

#

i like the grid

shell inlet
#

where is the bug

golden schooner
#

wrong texture slot?

long robin
#

I'm using a half res texture somewhere that a full res texture is expected, or vice versa

golden schooner
#

building up some pressure

long robin
#

sheeeit

#

somehow I went too far in the opposite direction

#

ah

shell inlet
#

jaker plz add y axis sun rotation

long robin
#

ok, after this push

shell inlet
#

I added it some time ago on a keybind

#

but it was discarded

golden schooner
long robin
#

ah I accidentally bound the wrong textures somewhere

shell inlet
#

btw you can also add moving shadowmap for it to follow camera, with a pixel perfect snap so that it doesn't float as you move around

#

maybe that's more useful for deccer though

#

fwog examples have or will have AABB fit

#

I also think it's more reasonable to place the shadowmap origin in front of the view so that it doesn't render what's not going to contribute to the shading
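
The texel snap mentioned above can be sketched in C++ like this, applied to each of the two light-space axes of the shadow map center. It assumes an orthographic shadow volume; `snapToTexel` and its parameters are illustrative names.

```cpp
#include <cmath>

// Snap one light-space coordinate of the shadow map center to a whole number
// of texels, so the map moves in texel-sized steps as it follows the camera
// and the shadow edges do not shimmer.
// worldExtent: width of the orthographic shadow volume in world units.
float snapToTexel(float lightSpaceCoord, float worldExtent, int resolution) {
    float texelSize = worldExtent / static_cast<float>(resolution);
    return std::floor(lightSpaceCoord / texelSize) * texelSize;
}
```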

golden schooner
#

i have a button in the debug window which kinda follows the camera

long robin
#

alright, bug fixed. now I just gotta write some commit messages

#

alright, pushed

golden schooner
#

still examples branch neh?

long robin
#

ye

shell inlet
#

when are you going to merge to main

long robin
#

never

shell inlet
#

interesting decision

golden schooner
#

8 fps

#

(deferred exshrimple)

shell inlet
#

I had 30 fps on 710m

long robin
#

when this rsm brainworm is "done" and I'm done with the other example stuff I will merge

long robin
shell inlet
#

maybe need to check out new version later since it's probably havier

#

heavier

golden schooner
#

ah one sec

shell inlet
#

no y rotation 😔

#

y no y

golden schooner
#

29-31 it wobbles back and forth

long robin
golden schooner
shell inlet
#

filtered is faster, that is a success

long robin
#

seems like you won't get much higher than 30fps though

shell inlet
#

you can add a checkbox for vsync

#

not very useful since you'd normally always want it on

#

otherwise gpu suffers

long robin
#

yeah I don't want to clutter the UI too much

golden schooner
long robin
#

interactive framerates

shell inlet
#

what is wrong with shadows

golden schooner
#

some shadowmapisms with steep sun angles

long robin
#

I clamped the bias

golden schooner
long robin
#

the 🅱️eter 🅱️anning was too intense if I let it run free

golden schooner
#

ah it doesnt like the jiff 🙂

long robin
#

lol

shell inlet
#

discord embeds dying again

long robin
#

tbf deccer's gif looked kinda phallic

golden schooner
heavy cipher
golden schooner
#

that made me also chuckle a little

shell inlet
golden schooner
#

if i made #1019722539116802068 NSFW it should have worked

shell inlet
#

clam too explicit

long robin
#

didn't feel like adding it for that scene

golden schooner
#

jaker needs to goto bed first

shell inlet
#

it's exclusive for gltf viewer?

long robin
#

yeah

#

an incentive to look at it

golden schooner
#

oh didnt we talk about reusable imgui controls?

#

looks like you are cutting corners again

long robin
#

the only duplicated imgui controls atm are sun controls

golden schooner
#

ok c:

long robin
#

not gonna make a header for that

#

alright, the second sun angle slider has been added

#

somehow it kinda works when the RSM is downsampled to 32x32

#

really splotchy when the sun moves though

shell inlet
#

is resolution scaling a slider?

long robin
#

no

#

it can only be changed in the source

#

RsmTechnique.cpp line 104 is where the inverse resolution scale is hardcoded

#

2 = half res, 1 = full res

golden schooner
#

maybe void can coerce you into making it a slider one day, or send a PR

long robin
#

RsmTechnique.h line 102 is where you can change the downsampled RSM size

#

it would be cool, but annoying to add because it requires remaking all the textures

shell inlet
#

would be easy to make with preprocessor and recompilation

golden schooner
#

or just 2 methods, {Destroy/Create}SizeDependentTextures()

shell inlet
#

I used to have a lot of preprocessor stuff and a recompile button in my ogl raytracer

long robin
#

I just need to make a function to construct all the textures and it will be trivial to add those sliders

long robin
#

unless you're talking about cursed dll hot loadingisms

shell inlet
#

no im talkin bout shader

long robin
#

shaders don't need to be hot loaded atm

#

though a button to toggle debug mode would be kool

golden schooner
#

Fwog the ultimate RSM testbench

long robin
#

the effect looks decent at 0.25^2 res too

#

though I kinda have to fudge the sliders (super low alpha, increase samples to 8, decrease filter width a lot)

#

and the time seems to be dominated by downsampling the gbuffer and rsm at such a small resolution

#

maybe viable if you're on an iGPU 😄

golden schooner
#

the shrimpled version really makes a difference, i think before your rehussle i had like 15fps or so

long robin
#

if you look around, you can see artifacts from me naively upscaling the illuminance to screen resolution

golden schooner
#

which example is that?

long robin
#

this is void's path tracer hell scene

golden schooner
#

a scene switcher would be neat

long robin
#

on gltf viewer

golden schooner
#

ah is that also part of the fwog repo now?

long robin
#

not yet

shell inlet
#

that is part of my engine

#

you can include it in fwog if you want, I wouldn't give the scene if I was against it

long robin
#

(1/8)^2

long robin
#

e.g., right now it is not centered at 0

#

or maybe it is

shell inlet
#

the origin is somewhere inside of it

#

somewhere convenient I'm sure

long robin
#

nvm it is well-centered

#

the problem was that I had a position for the camera hardcoded for sponza

#

how silly of me

shell inlet
#

it's a scene measured in meters also so it might be bigger than sponza

#

way bigger I think actually

long robin
#

I had to scale this one down 2x as much as sponza

shell inlet
#

one cell is most likely a meter iirc

#

that's a meter probs

golden schooner
#

i like these default textures

long robin
#

I'm taking 400 samples here and it's still performing 5x better than unfiltered RSM

shell inlet
#

bigger rmax should worsen it

golden schooner
#

maybe plain RSM is just "shit" compared to all the optimization shteps you did and do for the filtered one

long robin
#

ah this is actually (1/16)^2

shell inlet
#

surprising, but plain RSM is coherent, it's just shit in quality

#

it's faster than dithered sampling

#

the win here is that we only do 1spp and temporally accumulate, now even at low resolution

long robin
#

lol 4000 samples is 1.5ms still

#

the whole texture must be in L1

golden schooner
#

are you clamping it in code maybe?

long robin
#

no

golden schooner
#

crazy

long robin
#

if I take 40k samples per pixel it goes up to 15ms

#

but these are not true screen space samples

#

I'm sampling from a texture that's 256x smaller than the screen in terms of area

shell inlet
#

lol I noticed how you call sun angles

#

sun angle was so good they made sun angle 2

brisk narwhal
#

if sun angle is so good why isn't there sun angle 3

#

checkmate

shell inlet
#

because we adopt valve ideologies

long robin
#

hehe

#

I really need to upsample it less stupidly

shell inlet
#

you know that gives me another weird idea

brisk narwhal
#

can you linearly filter the RSM

shell inlet
#

what if we actually use surfels and do voronoi splatting

#

if rsm is so insensitive to resolution cuz it's low frequency

shell inlet
#

yes that's basically same as filtering

long robin
#

I'm guessing it would fail hard in a scene with more complexity though

#

like with foliage and stuff

#

I mean low res sampling in general

shell inlet
#

if you TAA maybe not as much

long robin
#

ye

brisk narwhal
#

At that point you could just switch to VXGI, it's probably simpler lol

long robin
#

should also do subpixel jitter on the low res samples
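
A minimal sketch of per-frame subpixel jitter for the low-res samples, assuming a Halton (2, 3) sequence is an acceptable jitter pattern. That choice is an assumption for illustration (the thread does not say which pattern would be used), and `Halton`, `Jitter` and `SubpixelJitter` are hypothetical names:

```cpp
// Radical inverse of `index` in the given base, i.e. one element of a Halton sequence.
inline float Halton(unsigned index, unsigned base) {
  float result = 0.0f;
  float fraction = 1.0f / static_cast<float>(base);
  while (index > 0) {
    result += static_cast<float>(index % base) * fraction;
    index /= base;
    fraction /= static_cast<float>(base);
  }
  return result;
}

struct Jitter {
  float x;
  float y;
};

// Offset in [-0.5, 0.5) low-res texels for a given frame.
// The pattern repeats every `period` frames; `period` must be greater than zero.
inline Jitter SubpixelJitter(unsigned frameIndex, unsigned period) {
  const unsigned i = frameIndex % period + 1;  // skip index 0, where both Halton values are 0
  return {Halton(i, 2) - 0.5f, Halton(i, 3) - 0.5f};
}
```

The offset would be added to the low-res sample position before the G-buffer lookup, so over several frames each low-res texel visits different full-res pixels underneath it.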

golden schooner
#

jaker needs to take a break from GI after this, and has other things to implement first, so that i can steal them for myisms

shell inlet
#

we brought up voxel tracing some time ago maybe worth a try eventually

brisk narwhal
#

The amount of optimizations that went in this RSM implementation is so good

#

I learned a lot just by lurking

shell inlet
#

turns out fwog thread is entertaining and educational

long robin
#

not shown, but if I crank the samples up it's not as bad

shell inlet
#

I wonder what is causing shimmers

long robin
#

reprojection rejection

shell inlet
#

history drop for sure but why is it droppong

#

lol typo

long robin
#

probably cuz high frequency stuff is sampled too sparsely at low res

#

I'm gonna try something

#

ah

shell inlet
#

mayhaps it's a given that if pixel covers a lot of different surface points it should be less strict in reprojection rules

long robin
#

I'm also not using the 3x3 bilateral reprojection heuristic that the original paper has

shell inlet
#

I guess you could test if the reprojected pixel is still within the area that it covers downsampled

#

nah that's gonna give you a lot of troubles if you are accumulating the downsampled results

#

you actually need to accumulate full res

#

then it probably won't be as troublesome

#

is that what you're doing?

long robin
#

no

#

the reprojection pass would get more expensive if I accumulate at full res

#

possibly worth it though

shell inlet
#

but that's to assure it's not failing for the final image

long robin
#

at the same time, just adding more samples is incredibly cheap and works well

golden schooner
#

would be neat to have a switch/slider somehow to change the resolution to 1/16 to 1/1 or so, then we could also run some experiments on how it performs on all sorts of hardware again

long robin
#

I'll do it tomorrow

#

but now I gotta slep

shell inlet
#

ok if you slep I homwork

#

you can also hash it to quickly find closest ones for a given pixel

brisk narwhal
#

given that the RSM sits entirely in cache at low resolutions

#

What if we could irregular Z-buffer the RSM

#

At higher resolutions

long robin
#

I did the math and a 256^2 RSM is about 5x larger than the L1 cache of a single SM on my GPU (3070), but there are dozens of SMs and each pixel won't need to access the whole RSM

long robin
#

mfw the original paper for irregular z-buffers notes that they cannot be constructed in real time on current (~2005) hardware, so they propose a GPU that can do it efficiently okey

#

the idea of using an irregular z-buffer for pixel perfect shadows is interesting though

long robin
#

I love writing code like this

if (useSeparableFilter)
{
  Fwog::Cmd::BindSampledImage(0, indirectFilteredTex, nearestSampler);
}
else
{
  Fwog::Cmd::BindSampledImage(
    0,
    (5 - int(std::log2f(float(inverseResolutionScale)))) % 2 == 0 ? indirectUnfilteredTex : indirectFilteredTex,
    nearestSampler);
}
#

it's not brittle at all and doesn't lead to many bugs

heavy cipher
#

lmao that condition

long robin
#

ping-pong algorithms with a variable number of iterations give me cancer

#

6 levels of indentation removed for your viewing pleasure

for (int i = 0; i < 5 - std::log2f(float(inverseResolutionScale)); i++)
{
  filterUniforms.stepWidth = (1 << i) * spatialFilterStep;
  filterUniformBuffer.SubDataTyped(filterUniforms);

  // The output of the first filter pass gets stored in the history
  const Fwog::Texture* in{};
  const Fwog::Texture* out{};
  if (i == 0)
  {
    in = &indirectFilteredTex;
    out = &indirectUnfilteredTexPrev;
  }
  else if (i == 1)
  {
    in = &indirectUnfilteredTexPrev;
    out = &indirectUnfilteredTex;
  }
  else if (i % 2 == 0)
  {
    in = &indirectUnfilteredTex;
    out = &indirectFilteredTex;
  }
  else
  {
    in = &indirectFilteredTex;
    out = &indirectUnfilteredTex;
  }

  Fwog::Cmd::BindSampledImage(0, *in, nearestSampler);
  Fwog::Cmd::BindImage(0, *out, 0);
  Fwog::Cmd::MemoryBarrier(Fwog::MemoryBarrierAccessBit::TEXTURE_FETCH_BIT);
  Fwog::Cmd::Dispatch(numGroups.x, numGroups.y, 1);
}
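
One way to keep the bind after the loop and the loop itself from drifting apart is to derive both from a single mapping of pass index to input and output. A minimal sketch, assuming the three textures can be stood in for by a hypothetical `Buffer` enum; `SelectPassIo` and `FinalOutput` are made-up names, not Fwog API:

```cpp
// Stand-ins for indirectFilteredTex, indirectUnfilteredTex and indirectUnfilteredTexPrev.
enum class Buffer { Filtered, Unfiltered, History };

struct PassIo {
  Buffer in;
  Buffer out;
};

// Same mapping as the if/else chain in the loop above:
// pass 0 writes into the history, later passes ping-pong between the other two.
constexpr PassIo SelectPassIo(int pass) {
  if (pass == 0) return {Buffer::Filtered, Buffer::History};
  if (pass == 1) return {Buffer::History, Buffer::Unfiltered};
  return (pass % 2 == 0) ? PassIo{Buffer::Unfiltered, Buffer::Filtered}
                         : PassIo{Buffer::Filtered, Buffer::Unfiltered};
}

// Buffer that holds the result after `numPasses` passes (numPasses >= 1).
constexpr Buffer FinalOutput(int numPasses) {
  return SelectPassIo(numPasses - 1).out;
}
```

With this, whatever reads the filtered result could ask `FinalOutput(numPasses)` instead of repeating the parity test. As posted, the parity expression and the loop agree for two or more passes, but for a single pass the loop writes to the history texture, which the parity test does not distinguish (assuming the bind is meant to read the loop's last output).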
heavy cipher
#

also replace int(log2f) with an int log

long robin
#

I thought about it

#

is there actually int log in the C++ standard library?

#

I don't see it in <cmath>

heavy cipher
#

i think not

long robin
#

maybe it's in <bit>
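
For reference, C++20's `<bit>` has no function literally named `ilog2`, but `std::bit_width(x) - 1` gives floor(log2(x)) for an unsigned `x > 0`, and `std::countr_zero(x)` gives the same value when `x` is an exact power of two. A minimal sketch; `ILog2` is a hypothetical helper name, and the fallback loop is only there so the snippet also builds before C++20:

```cpp
#include <cstdint>
#if __has_include(<version>)
#include <version>  // defines __cpp_lib_bitops when the <bit> operations are available
#endif
#ifdef __cpp_lib_bitops
#include <bit>
#endif

// floor(log2(x)) for x > 0; returns -1 for x == 0.
constexpr int ILog2(std::uint32_t x) {
#ifdef __cpp_lib_bitops
  return static_cast<int>(std::bit_width(x)) - 1;  // C++20 path
#else
  int result = -1;  // portable fallback: count how many times x can be halved
  while (x != 0) {
    x >>= 1;
    ++result;
  }
  return result;
#endif
}
```

The loop bound above could then be written as `5 - ILog2(inverseResolutionScale)`, which stays in integers and avoids the float round trip through `std::log2f`.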

rugged notch
#

reinterpret cast your int's address to a non-trivial type

#

you'll get its log

#

sometimes

heavy cipher
#

ah right, i remember now

rugged notch
#

TIL about <bit>

heavy cipher
#

its in <iostream>

int x = 6;
std::clog << x;
long robin
#

making my code unreadable for epsilon perf gain smart

heavy cipher
#

well yeah

long robin
#

too bad there is no way to fix that

heavy cipher
#

so don't write countr_zero instead of ilog2

heavy cipher
#

mfw when HR tells me "declaring jihad on warnings" is not appropriate workplace language

golden schooner
#

GL_GOOGLE_include_directive doesnt work on nvidia-470xx

long robin
#

it should be a warning at worst

golden schooner
#

it gtfs me out

long robin
#

enable: Causes the named extension to work; if the implementation does not support the extension, it only gives a warning. Fails if used with all.

golden schooner
#
/home/deccer/Private/Code/External/Fwog/build/example/02_deferred
terminate called after throwing an instance of 'Fwog::ShaderCompilationException'
  what():  Failed to compile shader source.
0(3) : warning C7508: extension GOOGLE_include_directive not supported
0(118) : error C1020: invalid operands to "/"
0(119) : error C1020: invalid operands to "/"
long robin
#

ok so the error is the implicit float conversion thing

golden schooner
#

ah ye 🙂

long robin
#

deja vu from when void encountered the exact same thing

golden schooner
#

indeed

long robin
#

however, the driver is incorrect here

#

for binary operators, if one operand is a float type and the other is an integer type, the integer one should be promoted to the float type

golden schooner
#

i cannot use a newer driver

long robin
#

I guess I'll have to change it so it works on y'alls' crusty drivers

golden schooner
#

ye 🙂

#

unless you have us sponsored the latest amd goodness, i wouldnt say no lol

long robin
#

hehe

#

I don't think I can let other people use my employee rebate sadly

golden schooner
#

nah i was just kidding

#

i dont think i will get me an amd card ;P

#

i was able to get one of my 780tis working again

long robin
#

did you bake it

golden schooner
#

xD i did

long robin
#

amazin

golden schooner
#

quite the hassle

long robin
#

it's a driver bug on god

shell inlet
#

It was always there since 200X

long robin
#

it actually works on both of my machines (rtx 3070 and rx 6800 with latest drivers, windows 10)

#

I can even pinpoint the exact part of the spec that says ur driver is bein silly

#

either way, I gotta conform to the lowest denominator 😩

shell inlet
#

Would be funny if I update mine and it still won't work

#

I have 527.56 apparently

#

it's the latest I think

#

but I had to update nvidia geforce experience just to look it up

#

I'm now going to try your examples

long robin
#

I have the previous driver version from that

shell inlet
#

why do you have to explicitly specify if gltf is binary or not

long robin
#

too lazy to check the extension automatically 🤰

#

well, I shouldn't have to check the extension

shell inlet
#

std::filesystem::path::extension

long robin
#

I mean there should be a magic number in the file to indicate it

#

only glbs have a magic number though

#

but yeah I guess I should check the extension
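
A minimal sketch of detecting binary glTF automatically. GLB files begin with the four ASCII bytes `glTF`, while JSON `.gltf` files have no magic number, so the sketch checks the magic first and falls back to the file extension only when the header cannot be read. `HasGlbMagic` and `IsBinaryGltf` are hypothetical helper names, not part of Fwog or any glTF loader:

```cpp
#include <array>
#include <filesystem>
#include <fstream>
#include <ios>
#include <string_view>

// True if the buffer starts with the GLB magic, the ASCII bytes "glTF".
inline bool HasGlbMagic(std::string_view bytes) {
  return bytes.size() >= 4 && bytes.substr(0, 4) == "glTF";
}

// Decides whether a file should be loaded as binary glTF.
// Uses the magic number when the first four bytes can be read,
// otherwise falls back to the ".glb" extension.
inline bool IsBinaryGltf(const std::filesystem::path& path) {
  std::ifstream file(path, std::ios::binary);
  std::array<char, 4> header{};
  if (file.read(header.data(), static_cast<std::streamsize>(header.size()))) {
    return HasGlbMagic(std::string_view(header.data(), header.size()));
  }
  return path.extension() == ".glb";
}
```

The result could replace the explicit binary flag the viewer currently asks for, so a mislabeled file would still load as long as its header is intact.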

shell inlet
#

it works

#

so they actually did fix that just in 2022

#

and both amd and nvidia decided to do that in 2022?

#

well can't say for amd

#

but my drivers weren't a year old

#

this is one of the reasons spirv is superior

#

compile to ir yourself and don't worry about driver not being able to compile

#

or idk link your app to compiler

long robin
#

ironically, opengl drivers are extra buggy when it comes to spirv compilation

shell inlet
#

but was spirv made for ogl at all

#

also I wonder if we'll get another ogl version

#

it seems to have been abandoned

long robin
#

it ded 💀

shell inlet
#

ur a necromancer then

#

at least we're still getting updates that fix stuff

shell inlet
#

so intel too...

long robin
#

pretty sure that was actually his nv card

heavy cipher
#

eastern thinkers say 'kalpa' - a moment of eternity - passes when deccer buys a new gpu

shell inlet
#

if ur so smart why aren't you rich

#

(C) every mom ever

golden schooner
#

yep im on intel usually, but i was able to resurrect 1 of the two 780tis last night

long robin
#

I just had a thought

#

What if I could dynamically distribute samples like in surfels GI

#

It would be cool to do the RSM at a really low res, but put more samples (or move the sample location) on edges

#

Maybe just something like VRS would work, hmm

#

I want to put jittered samples on edges so high frequency geometry doesn't die

#

I guess aggressive VRS might work

#

Given how we've seen that downsampling the RSM effect to like (1/32)^2 works

#

But then there's still the problem of filtering, which I don't want to do at full res (as VRS still resolves to a full res image)

#

I gotta try the bilateral filter upscaling strategy first though
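
A minimal sketch of what a bilateral upscale could look like for one full-res pixel, assuming the four nearest low-res taps carry a linear depth and that an exponential depth falloff is used as the edge-stopping term. The struct, the function name and the `depthSigma` parameter are hypothetical; the thread does not show the actual filter:

```cpp
#include <array>
#include <cmath>

struct LowResTap {
  float illuminance;     // low-res indirect lighting value
  float depth;           // linear depth stored alongside it
  float bilinearWeight;  // ordinary bilinear weight of this tap
};

// Upsamples to one full-res pixel: bilinear weights are attenuated by how much
// each tap's depth differs from the full-res depth, then renormalized.
inline float BilateralUpsample(const std::array<LowResTap, 4>& taps,
                               float fullResDepth, float depthSigma) {
  float weightedSum = 0.0f;
  float weightSum = 0.0f;
  for (const LowResTap& tap : taps) {
    const float depthDiff = std::abs(tap.depth - fullResDepth);
    const float w = tap.bilinearWeight * std::exp(-depthDiff / depthSigma);
    weightedSum += w * tap.illuminance;
    weightSum += w;
  }
  // Dividing by the weight sum keeps the result a convex combination of the taps.
  // If every tap was rejected, return 0 rather than dividing by (almost) zero.
  return weightSum > 1e-6f ? weightedSum / weightSum : 0.0f;
}
```

When all taps lie on the same surface this reduces to plain bilinear filtering; across a depth discontinuity the taps from the other surface contribute almost nothing, which is what avoids the blocky upscaling artifacts.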

golden schooner
#

sounds like it's worth a try

golden schooner
#

@long robin that Limbosuccessor gdc talk was really cool, i find the game itself utterly boring, but the graphicsisms look intermeresting

#

mayhaps you can post the link in #graphics-resources when you find the time

long robin
#

I can't post there

golden schooner
#

oi

#

ah

#

its gp-news where everyone can

#

let me allow mvps to post in gp res too

#

should work now?

long robin
#

Still can't

golden schooner
#

now

#

i forgor to hit save

long robin
#

Worked

golden schooner
#

no embed?

long robin
#

Idk why

#

Lol nothing embeds in that channel

golden schooner
#

ah wait

#

try again please 🙂

long robin
#

Ez

golden schooner
#

merci mon 🪑

long robin
#

The triangle noise technique was probably my favorite thing from the first talk lol

golden schooner
#

yeah

#

reminded me of how you helped me with my shadowisms 🙂

long robin
#

I need a scene with high geometric complexity for testing RSM

#

something with foliage, perhaps

shell inlet
#

spawn a bunch of randomly colored cubes

#

inside sponza

long robin
#

why does the sketchfab's search suck so bad

#

the minimum price in the filter is $3 bruh

#

can't even filter for gltfs bruh

#

I had to use google instead of sketchfab's 💩 search functionality

#

lmao I literally clicked the "free gltf assets" button on their site and was presented with paid non-gltf models

#

blender is taking literal minutes to open a 1gb obj file

#

already using 11gb of ram

shell inlet
#

hey I was thinking about Morgan McGuire's samples with foliage and forgor it had San Miguel

long robin
#

yeah I resorted to those because the websites for sharing models suck

shell inlet
#

does this work as a complex scene?

long robin
#

it certainly makes some problems more obvious

#

like the fact that I need to do the 3x3 search in reprojection

shell inlet
#

looking at pixellated trees I'd say it's a success

#

the trees have been censored

long robin
#

the upscaling filter is something of a success

#

geometry no longer gets censored when only illuminated by indirect lighting

shell inlet
#

that is amazing, you win

long robin
#

it's way too unstable at that res though, even with the smooth upscaling

#

2x upscaling seems a lot more feasible

#

just gotta perfect the reprojection pass heee

long robin
#

persistent world-space surfels would be cool for this though

#

that would completely change filtering though

shell inlet
#

yes you could also make use of software rt using them

#

cuz less rays would need to be traced for surfels

long robin
#

yeah

#

I forgor how surfel lighting is actually applied to the image though. Is it just a lookup into the surfel structure?

shell inlet
#

there is a paper by ea called gibs

#

global illumination based on surfels

long robin
#

I saw the talk

#

didn't know there was a paper

shell inlet
#

well you gotta splat them

#

some edge aware filter I'd imagine

#

wide one

long robin
#

I thought they just used the world space pos of the surface being shaded to look into the surfel acceleration structure

#

I think they had an improvement on top of that tho

#

some kind of interpolation iirc

shell inlet
#

I didn't look into how they do it specifically so I don't know

long robin
#

they fetch N surfels from the cell and weight by distance and normal

shell inlet
#

well isn't it a voronoi splatting

long robin
#

the surfel locations?

shell inlet
#

weighting by distance is essentially mixing with a convex combination

#

not sure what's with normal but I guess in the end it's all for a single weight in a convex combination

#

and if you have a set of points that you interpolate based on distance that's basically voronoi

long robin
#

I see

#

when I googled voronoi splatting the first time you mentioned it, your shadertoy thing was the only relevant result lmao

shell inlet
#

I guess I coined the term

#

but there really isn't any other way to call it that makes sense

long robin
#

it do be like that

#

convex combination is another one of those scary terms

shell inlet
#

idea is simple though

shell inlet
#

essentially a constraint on a set of non-negative weights such that a + b + c ... = 1

#

a way to achieve it is to divide each element by the sum

long robin
#

Weighted means are functionally the same as convex combinations, but they use a different notation. The coefficients (weights) in a weighted mean are not required to sum to 1; instead the weighted linear combination is explicitly divided by the sum of the weights.
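
A small sketch of that equivalence: a weighted mean divides by the sum of the weights at the end, while a convex combination uses weights that were already scaled to sum to 1. The function names are hypothetical, and the weights are assumed to be non-negative with a non-zero sum:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Scales the weights so they sum to 1 (assumes a non-zero sum).
inline std::vector<float> NormalizeWeights(std::vector<float> weights) {
  const float sum = std::accumulate(weights.begin(), weights.end(), 0.0f);
  for (float& w : weights) {
    w /= sum;
  }
  return weights;
}

// Weighted mean: arbitrary weights, explicit division by their sum.
inline float WeightedMean(const std::vector<float>& values, const std::vector<float>& weights) {
  float weightedSum = 0.0f;
  float weightSum = 0.0f;
  for (std::size_t i = 0; i < values.size(); i++) {
    weightedSum += weights[i] * values[i];
    weightSum += weights[i];
  }
  return weightedSum / weightSum;
}

// Convex combination: the weights are expected to sum to 1 already, so no division.
inline float ConvexCombination(const std::vector<float>& values,
                               const std::vector<float>& normalizedWeights) {
  float result = 0.0f;
  for (std::size_t i = 0; i < values.size(); i++) {
    result += normalizedWeights[i] * values[i];
  }
  return result;
}
```

For values `{0, 4, 8}` and weights `{1, 2, 1}`, both routes give 4: the weighted mean computes 16 / 4, and the convex combination uses the normalized weights `{0.25, 0.5, 0.25}`.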

shell inlet
#

example of a convex combo is barycentric coords

long robin
#

many image filters are convex combinations through the above definition

shell inlet
#

well it's a condition for a discrete pdf that it sums to 1

#

wait that's irrelevant

long robin
#

the term "convex combination" has a geometric implication imo

shell inlet
#

kernels just have the same condition

#

blur kernels

#

you can have whatever kernel you want for a convolution

long robin
#

yeah

#

the bilateral filters I'm using need to sum to 1

shell inlet
#

random fact did you know that we divide by pdf because estimators otherwise converge to it if you integrate 1

#

I guess you can numerically get a pdf using this property
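
One hedged reading of this: samples drawn from a sampling routine and binned with no weighting form a histogram that converges to that routine's pdf, which is the factor the usual `f(x) / pdf(x)` estimator divides out. A minimal sketch under that reading, using a made-up sampler (`x = sqrt(u)`) whose pdf is known to be `2x` so the estimate can be checked; all names here are hypothetical:

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sampling routine: x = sqrt(u) with u uniform in [0, 1)
// has the known pdf p(x) = 2x on [0, 1).
inline double SampleX(std::mt19937& rng) {
  const double u = static_cast<double>(rng()) / 4294967296.0;  // divide by 2^32
  return std::sqrt(u);
}

// Bins unweighted samples and scales the counts so the histogram integrates to 1.
// Each entry then approximates the average pdf over its bin.
inline std::vector<double> EstimatePdf(std::size_t numBins, std::size_t numSamples, unsigned seed) {
  std::mt19937 rng(seed);
  std::vector<double> histogram(numBins, 0.0);
  for (std::size_t i = 0; i < numSamples; i++) {
    const double x = SampleX(rng);
    std::size_t bin = static_cast<std::size_t>(x * static_cast<double>(numBins));
    if (bin >= numBins) {
      bin = numBins - 1;  // guard against rounding when x is very close to 1
    }
    histogram[bin] += 1.0;
  }
  const double binWidth = 1.0 / static_cast<double>(numBins);
  for (double& h : histogram) {
    h /= static_cast<double>(numSamples) * binWidth;
  }
  return histogram;
}
```

With ten bins, entry `i` should approach `2 * (i + 0.5) / 10`, the mean of `2x` over that bin. The table this produces is effectively a 1D LUT of the pdf.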

long robin
#

that's a neat fact

#

I'll try to remember that the next time I need a pdf

shell inlet
#

it's not really practical in realtime

#

maybe not at all?

long robin
#

I mean just making a fake scene to derive the pdf, then plugging in the number into the code

shell inlet
#

maybe RIS is actually exploiting this property though 🤔

#

I remember they have some weight using expected value, but maybe it's unrelated

long robin
#

if your pdf is not constant then my idea does not work

shell inlet
#

you could make a LUT for numerically estimated pdf

#

but the dimension of the LUT depends on the number of parameters of your sampling function

#

maybe if some parameters are degenerate you could exclude a dimension

long robin
#

I'd only consider it if the lut was 0-dimensional tbh

shell inlet
#

that would be an approximation that causes bias for a difficult continuous pdf

#

depending on the LUT resolution

#

well not to mention that fat LUTs are lethal for performance

#

otherwise I'm sure we'd be using LUTs for BRDFs

#

there are captured real world BRDF LUTs out there

#

what was it called I can't remember

#

basically 100 materials library

long robin
#

idk I'm not an artist

shell inlet
#

MERL BRDF database

long robin
#

yw

shell inlet
#

those are very very fat

#

the only use is pretty much research

#

to compare your analytical model against

long robin
#

neat

#

wow that 3x3 reprojection filter makes a difference

shell inlet
#

you have mastered reprojection?

long robin
#

I still haven't tried a bicubic filter like in that ReLAX talk

#

I won't try it tbh since I doubt it'll improve this much

heavy cipher
#

when did we take damage

long robin
#

by looking at the aliasing in the pic

#

it works pretty well even at 4x4 upscaling

#

still flickers a lot since there aren't enough samples placed on different surfaces

#

I want to temporally jitter the samples somehow, but idk if it's feasible

#

foliage hell

golden schooner
#

burn it

#

i really dont like that scene either, from an aeschtethical perspective

long robin
#

it's kinda ugly, but has useful geometry

#

sometimes there is black and I'm not sure why

shell inlet
#

from what pass

long robin
#

not sure

#

maybe the new thing I added to reprojection

shell inlet
#

is it consistent

long robin
#

somewhat

#

it seems to happen when there is a rejection

#

but not always

#

pretty sure I just got a bug somewhere

#

rejection seems to make areas get darker

#

it's odd because the affected areas have a long history

#

actually, no

#

low history seems to be correlated with darkening in general

#

I think it's just a lack of blurring + sparse sampling

#

except it still happens when there are 40 samples

#

probably using an index meant for a larger texture or vice versa

#

something is definitely wack, cuz it happens with 400 samples and when I only move by a pixel

#

I'm guessing some kind of UB cuz it doesn't happen in RenderDoc

shell inlet
#

real question why can I see albedo when I skip modulation

long robin
#

because it still does primary shading on the image

#

the "skip albedo modulation" is just for the output of the RSM pass, which means the indirect light

shell inlet
#

makes sense

#

another thing is that it feels like sun intensity somehow doesn't make sense with what I see on screen

#

feels like it should be way more overblown for 40

long robin
#

idk

shell inlet
#

btw

#

do you 1/pi for direct

#

I think you don't

#

well doesn't matter for first bounce

long robin
#

yeah direct is horribly not-PBR atm

#

btw this feels pretty overblown to me

shell inlet
#

I'm more interested in trying a tonemapper

long robin
#

ok, I can shoehorn one in sometime

shell inlet
#
vec3 reinhard_ext_luminance(vec3 v, float max_white_l)
{
    // Rec. 709 luminance of the input color
    float l_old = dot(v, vec3(0.2126, 0.7152, 0.0722));
    // l_new / l_old, simplified so black pixels (l_old == 0) don't divide by zero
    float scale = (1.0 + l_old / (max_white_l * max_white_l)) / (1.0 + l_old);
    return v * scale;
}
long robin
#

reinhard listenyoupieceofshit

#

ok, it's the ext version

shell inlet
#

luminance

#

only luminance is affected

long robin
#

nice

shell inlet
#

btw I think you also have reinhard

#

finalColor / (1.0 + finalColor)

long robin
#

yeah, it looks like ass though

#

it's commented out

shell inlet
#

I don't see any difference with it

#

I am maybe blind

long robin
#

should be in ShadeDeferredPbr.frag

#

it affects the image a lot for me

shell inlet
#

thats what I'm editing ye

long robin
#

here's my previous image with reinhard

shell inlet
#

how does it compare to this

vec3 reinhard_luminance(vec3 v)
{
    // Rec. 709 luminance of the input color
    float l_old = dot(v, vec3(0.2126, 0.7152, 0.0722));
    // l_new / l_old == 1 / (1 + l_old); written this way so black pixels don't divide by zero
    return v / (1.0 + l_old);
}
long robin
#

moar color

shell inlet
#

epic

#

I still prefer autoexposure

#

just crushing doesn't do it for me