#Fwog and co.
at least the jagged shadow is history
once you start tracing actual rays is when a denoiser is needed
but it's called denoiser not derayer
but I understand what you mean, you aren't warranting a variance guided filter because you don't even need to preserve penumbra if you don't even have one
it's a penumbruh
same width everywhere
but you know what
actually nvm it's more expensive maybe
a 3x3 bilateral filter would probably remove most of the noise
I was about to suggest cubic for shadow sampler
maybe I could even go down to 1spp
shadow sampler is bilinear so will work
maybe it won't be as slow as I think due to locality of taken samples but I don't know really
essentially a better 2x2 pcf
yes here I go again with my weird ideas
you could try it in your local repo
just go to ShadeDeferredPbr.frag.glsl and update the shadow function
no you do it
and add your custom cubic thingy
you have the code
I am doing homework I am business
well I gtg to bed
goodnight
gn
took a break and tried it out, not much of an improvement
Not a bad improvement imo
looks almost as good as my shadow ;P
after the depth bias was removed, ultra low RSM resolutions work fine (this is 128^2)
nice
what are those triangles on the green curtain
shadow acne
the bias numbers I hardcoded for the shadows do not work in all scenarios
have you heard about slope scale bias
this is what I'm doing
float constantFactor = 0.0002;
float angleFactor = 0.001 * (1.0 - dot(-shadingUniforms.sunDir.xyz, normal));
...
lightDepth += angleFactor + constantFactor;
what's the correct math for it
I don't know
these numbers work okay most of the time
I'm pretty sure there is an analytical way to compute the bias, but I'm not gonna tryhard it that much
linearly changing value would be acos(dot)/pi
but I'm not sure whether it's required now
using cosine as a slope scale just feels strange to me because it won't grow properly with slope
you're right, it fails at extreme angles
I need to think what correct slope would be but I need to eat first
2014
the second one isn't an analytical approach, but it does talk about some other stuff
kinda interesting
rip
float bias = 1.0 - clamp(dot(lightFragNormal, lightFragmentCoord), 0.0, 1.0);
stabilizing the shadow with temporal filtering is a good idea though
here is what you have now
this is what would make sense to me
inverse trig is banned on GPU tho so maybe there's a way to find acos using identities
uhh yeah I think we can find tan by using tan=sin/cos
can get sin from cross product and cos is the dot product
so it's len(cross(d,n))/dot(n,d)
when n,d are normalized
actually order of cross doesn't matter if we are taking magnitude of a vector
in 2d equivalent would be absolute value
so here is the thing
I'm done
btw linking this for later
https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf
edited graph added automatic bias for a given width
I think width would be 1/max(textureSize(shadowmap)) in the code
max texel width basically
alright, I'll try it
it does not account for quantization of the depth value itself, but I guess it would be 1/((2^precisionbits)*depthrange)
for unorm formats I suppose
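A minimal C++ sketch of the slope-scaled bias worked out above, assuming `n` is the unit surface normal and `d` is the unit direction toward the light. It computes tan = |cross(d, n)| / dot(n, d) and scales it by one shadow map texel width. `Vec3`, `SlopeScaledBias`, and `maxBias` are hypothetical names for illustration and are not taken from Fwog; the clamp is an added assumption so the bias stays finite at grazing angles.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

inline Vec3 Cross(Vec3 a, Vec3 b)
{
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

inline float Length(Vec3 v) { return std::sqrt(Dot(v, v)); }

// n: unit surface normal, d: unit direction toward the light.
// texelWidth: width of one shadow map texel, in the same units as the depth being biased.
// maxBias: hypothetical upper limit, also returned when the surface faces away from the light.
inline float SlopeScaledBias(Vec3 n, Vec3 d, float texelWidth, float maxBias)
{
  assert(texelWidth >= 0.0f && maxBias >= 0.0f);
  const float cosTheta = Dot(n, d);
  if (cosTheta <= 0.0f)
    return maxBias; // tan is undefined or negative here
  const float tanTheta = Length(Cross(d, n)) / cosTheta; // sin / cos, no inverse trig
  return std::min(maxBias, texelWidth * tanTheta);
}
```

The bias is zero when the light hits the surface head-on, equals one texel width at 45 degrees, and saturates at `maxBias` near grazing angles.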
seems like the technique works 
peter panning under extreme angles
ye
is it worth it
better than horrible acne probably
bigger resolution should make peter panning smaller
mr. min()?
float cosTheta = max(0.0, dot(incidentDir, normal) - 0.02);
just make the surface dark before the shadow starts peter panning hard
tfw leaking
what's a hard cutoff
the cosTheta thing I'm doing is for the shading, not for the shadow bias
yes I know
I mean that it'll go black abruptly
it'll reach 0.8 and after that it'll go 0
I'm subtracting 0.02 from the whole thing, so it still smoothly goes to 0
the peak is being lost
here is with -0.05 bias
oh yeah it'll go to 0 smoothly but it won't reach 1 then
could use a smoothstep near 0 or something
I'd actually have to multiply by 1.02040816 to compensate
huge difference
ah nvm your math is different 
tbh I'm not gonna bother
the peter panning is only barely noticeable in this fake scene
my math is wrong
you need (dot(n,l)-cutoff)/(1-cutoff)
division is to normalize the range [0; 1-cutoff] to [0;1]
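The remap above can be sketched in C++ as follows. `RemapCosTheta` is a hypothetical name; the formula is the one from the message, (dot(n,l) - cutoff) / (1 - cutoff), with a clamp at zero added.

```cpp
#include <algorithm>
#include <cassert>

// Shifts the shading term so it reaches 0 at nDotL == cutoff,
// then rescales [0, 1 - cutoff] back to [0, 1] so the peak is not lost.
inline float RemapCosTheta(float nDotL, float cutoff)
{
  assert(cutoff >= 0.0f && cutoff < 1.0f);
  return std::max(0.0f, (nDotL - cutoff) / (1.0f - cutoff));
}
```

With a cutoff of 0.02, the division is the same as multiplying by 1 / 0.98, which is about 1.0204, the factor mentioned earlier.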
i c
at this point we are straying further away from god pbr
I've noticed a different artifact. Look at the base of the red shape
it's odd because you can see that the shadow is farther back
ah I think it's my fault
nvm
I see what the problem is
the edge of the floor box is casting a shadow
tl;dr not an actual issue
what if you clamp the bias though would it really still cause acne?
min(max_bias, slope_scale_here)
Yeah I tried earlier
hmmm it will I think
ok confirmed then
what if we shift shadow sample uv along projected normal
at best 1 pixel to the side
would it solve acne at extreme angles
so that we could also reduce peter panning
or like check if tangent is higher than some threshold and then instead of adding depth bias shift the uv along the projected normal
you get what I mean?
You mean like projecting the surface normal into uv space?
yes and using it as offset vector
Ye I see how that could potentially help
projecting it is taking difference (project(p+n) - project(p))
huh that is stupid
can use inversetranspose
transpose(inverse(sunviewproj))
I did not think this through it's a crazy idea again
so it might not work at all
maybe shifting it towards the inverted projection
cuz we want the pixel to be in shadow
maybe adding bias AND shifting towards inverted projection
this is what'd happen
maybe multiply the projection by pixel size also
It was a response to your previous message about it being a crazy idea
how did sponza change after adding perfect slope scale bias?
and does dithered shadow look better?
It looks pretty much the same overall, except the backside of the curtains are no longer brightly lit
They look approximately the same as before. That scene wasn't suffering from shadow mapping artifacts much
tried hqx shadows?
In image processing, hqx ("high quality scale") is one of the pixel art scaling algorithms developed by Maxim Stepin, used in emulators such as Nestopia, FCEUX, higan, Snes9x, ZSNES and many more. There are three hqx filters defined: hq2x, hq3x, and hq4x, which magnify by factor of 2, 3, and 4 respectively. For other magnification factors, this ...
I mean obviously no but what if
interesting
there was a paper out there about vectorization of shadows too, this is similar but less fancy
similar in end result
this does not require bilinear taps
unlike bicubic
just 3x3 and a LUT for best results
this will likely reduce amount of pixelation
not as much as vectorized shadows tho
by the way about specular rsm
I think it's possible to do some sort of rejection sampling with reprojection to sample more efficiently
oh and probably restir too
or ris at least
I mean restir is built on top of ris so it's going to be a start
though it's going to require yet another temporal pass for both
and is probably overkill so maybe it's time to stop
they don't estimate those temporal gradients tho
https://youtu.be/qW6RVaIvZPk?t=118
how'd you know?
I estimated
the penumbras do be smudged
https://youtu.be/lZQJU3yek08?t=366
well, it just looks smudgy in general
I don't see it
I see temporal lag
which is not related to variance estimation
perhaps you mean temporal gradient
that's wat I sayd
I wonder why they didn't do it
I'd expect these people to use top tier denoising techniques
there are other methods besides temporal gradients
perhaps they used that but it's being overwhelmed
look up history clamping in ReLAX
april 2021 
no heuristic beats temporal gradients imo
but temporal gradients either roughly double the cost (trace twice per pixel and keep previous frame around in its entirety), or add bias (pick 1 pixel in 3x3 strata and forward project them to be shaded together with the current frame)
ok can also do it in half/third resolution the first way, too
but that kills more pixels that may be valid
depending on how many pixels 1 gradient pixel covers
obviously
this is brilliant tbh
what is, exactly
don't tell me whole thing
also if you love nvidia so much why don't you work at nvidia
da part where they use a fast history buffer
and clamp to neighborhood color like in TAA
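A rough C++ sketch of that idea, under stated assumptions: a second, quickly converging "fast" history is kept next to the long "slow" history, and the slow value is clamped to the range of fast-history values around the pixel, much like neighborhood clamping in TAA. This is a scalar illustration only; `ClampToFastHistory` and the 3x3 neighborhood are hypothetical choices, not ReLAX's actual implementation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// slowHistory: long-term accumulated value for this pixel.
// fastNeighborhood: fast-history values of the 3x3 pixels around it.
inline float ClampToFastHistory(float slowHistory, const std::array<float, 9>& fastNeighborhood)
{
  const float lo = *std::min_element(fastNeighborhood.begin(), fastNeighborhood.end());
  const float hi = *std::max_element(fastNeighborhood.begin(), fastNeighborhood.end());
  assert(lo <= hi);
  // A stale slow history that falls outside the recent range gets pulled back into it.
  return std::min(std::max(slowHistory, lo), hi);
}
```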
tbh I wish I was working at nvidia then I would call you my sworn rival
lmao that emote 
I actually made that one lol
I expected nothing less
btw I think the fast history buffer would work great for indirect diffuse
also wtf that page turned chinese while I wasn't looking at it
not very bing chilling of them
ching cheng hanji moment
it must have been autoplay or something because it was a different page than before
whats the next topic for fwog? after svgtfo?
finding love 😳
he found love already, us, frogs
speaking of marzipan, are you working on a froge 2.0? @rugged notch
sadly there was not much interest from my collaborator
too bad
which is fair, she just did it on a whim and doesn't want the bother I guess
maybe jumping spider is next, can you do some inception work
so cute
that froge looks funny
btw @shell inlet I think svgf's default of 0.2 for alpha is shrimply crazy
it's in the range of what this ReLAX technique considers to be "fast history"
kinda funny too cuz this path tracer with ReSTIR has insanely good sampling
so they don't even need a very low alpha
except maybe for specular, idk
good sampling will leave you a 1spp image that is almost converged
so I don't think it's too crazy
you overestimate restir
restir also needs a few frames to digest samples before it's good
maybe the original SVGF paper is indeed being wacky by suggesting 0.2 for alpha
yeah
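For reference, the alpha being discussed is the weight of the new sample in an exponential moving average. A small C++ sketch, with hypothetical function names, shows how quickly a history converges for a given alpha:

```cpp
#include <cassert>

// One temporal accumulation step: mix(history, current, alpha).
inline float Accumulate(float history, float current, float alpha)
{
  assert(alpha >= 0.0f && alpha <= 1.0f);
  return history + alpha * (current - history);
}

// Repeats the step for a constant input, to see how far the history gets after N frames.
inline float AccumulateFrames(float history, float current, float alpha, int frames)
{
  for (int i = 0; i < frames; i++)
    history = Accumulate(history, current, alpha);
  return history;
}
```

With alpha = 0.2, a history starting at 0 reaches about 67% of a constant input after 5 frames, so the effective window is only a handful of frames.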
ok I think I see
so I guess restir is really good after it converges for a few frames
which is what you said
so loading screen at the beginning for it to catch up a few frames 
turn around
loading screen
display loading screen at disoccluded regions
or you show the scene heavily blurred, and blur gets less and less until everything is loaded in
ye and the odd angles/serifs
i always wonder why the latin alphabet looks so shitty in the asian fonts
some of them look decent
https://en.wikipedia.org/wiki/List_of_CJK_fonts
but I'm guessing the main issue is that it's a lot harder to create these fonts
and not run into copyright issues with the other 2934824839845302482 fonts
I wonder if there is a way to use different fonts for different parts of unicode
so you could use a non sucky latin font + a CJK font
you're gonna give me the text runs
: D
https://www.shadertoy.com/view/ldBcDt
simple scene with good sampling
of course it gets harder as scene complexity grows
each bounce becomes harder to sample efficiently and this is where resampling excels
but at the price of needing a few samples to get the idea of sampling domain
here it's analytically tweaked to importance sample direct illumination but without taking visibility term into account because that's basically impossible to do analytically
resampling is the numerical way to do it
anyways that's where small history would suffice, and sponza is not a very challenging scene in terms of complexity (most of it is directly illuminated and the rest is illuminated by 1st bounce)
if you want a challenging scene, see my path_tracing_nightmare.glb
Material getMaterial(int i) {
    if(i==0) return materials[0]; else
    if(i==1) return materials[1]; else
    if(i==2) return materials[2]; else
    if(i==3) return materials[3]; else
    if(i==4) return materials[4]; else
    if(i==5) return materials[5]; else
    return materials[6];
    //return materials[i]; //webGL 2.0
}
send help
where
it'll probably look like ass with RSM but whatever
I must have forgotten to copy it to my models folder
it's designed to progressively get harder as you go deeper into the room
because each bounce is harder to estimate
blender basically dies there
and denoiser gets all splotchy
in the furthest corner that is
essentially a scene to test your path guiding algo
this scene is too hard for rsm
looks good if you don't let anyone see what's under the 1st floor ceiling
the light gets into this corner just fine 
filtered RSM is such an improvement in this scene though
oh by the way
can you make sun intensity slider
or exposure
tbh you maybe want some tonemapping now
maybe warrants a tonemapper demo?
I already got the
// tone mapping (optional)
//finalColor = finalColor / (1.0 + finalColor);
doe
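The commented-out line above is the simple Reinhard operator, color / (1 + color). A CPU-side C++ sketch of the same math, with hypothetical `Vec3` and `Reinhard` names:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Simple Reinhard: maps [0, infinity) to [0, 1) per channel.
inline float Reinhard(float c)
{
  assert(c >= 0.0f);
  return c / (1.0f + c);
}

inline Vec3 Reinhard(Vec3 color)
{
  return {Reinhard(color.x), Reinhard(color.y), Reinhard(color.z)};
}
```

An input of 1.0 maps to 0.5 and an input of 3.0 maps to 0.75, so highlights are compressed while dark values pass through almost unchanged.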
jk jk
call it hdr and implement all of those found here: https://64.github.io/tonemapping/
well that was just an idea for another example which might be fun
I sure would like to try those out some time later for personal project
apparently using the ACES tonemapper without a bunch of other stuff is a bad idea (according to other people on this discord)
idk what the other stuff is though, and imo it looks fine on its own
I don't like aces because it's simply compressing everything to ldr
I want dynamic exposure change cuz its cool
and overexposed parts that emit a glow with fft bloom
maybe I could try this fancy thing sometime
https://www.shadertoy.com/view/XljBRK
but not today since I gotta schleep
Is fft bloom finally being used?
I don't have an answer to this question
the only thing I can speak for is myself
if you mean used in absolute general sense, then it never was unused
if you mean being used in production with real time graphics such as games - I don't know
the fact is, FFT is cheaper than naive convolution when you have large kernels
and diffraction patterns are generally very large
did you take a look at https://github.com/tippesi/Atlas-Engine yet?
that link came across my eyes again and i thought i'd show you, then i realised i must have showed it to you already
Ye I skimmed through the source the first time you showed me
what was I supposed to rembererer
ye it seems quite legit
all done in openjayell
so I installed win10 to play portal rtx
can confirm ghosting all around on any settings
sometimes even portal gun leaves trails when you move the camera
muh broment
separable 5x5 vs regular 5x5 bilateral (I adjusted the settings to make the spatial blur super obvious lol)
hmm I must have messed something up since the first looks so much smoother
or maybe the separable bilateral really do be like that
is this truly 5x5
it's 5 passes of 5x5
in the first image, I separated the filter so it alternates between horizontal and vertical passes
so it's technically 10 smaller passes
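A sketch of what one of those smaller passes could look like in C++, assuming a 5-tap kernel and a depth-only edge-stopping weight. The names (`BilateralPass1D`, `depthPhi`) and the exact weight function are assumptions for illustration, not the Fwog shader. Note that a bilateral filter is not strictly separable, so alternating horizontal and vertical 1D passes only approximates the full 5x5 filter, which would explain the two results looking different.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One 1D bilateral pass over a single row (or column) of pixels.
// color and depth hold one value per pixel; depthPhi controls edge sensitivity.
inline std::vector<float> BilateralPass1D(const std::vector<float>& color,
                                          const std::vector<float>& depth,
                                          float depthPhi)
{
  assert(color.size() == depth.size() && depthPhi > 0.0f);
  static const float kernel[5] = {1.0f / 16, 4.0f / 16, 6.0f / 16, 4.0f / 16, 1.0f / 16};
  const int count = static_cast<int>(color.size());
  std::vector<float> result(color.size());
  for (int i = 0; i < count; i++)
  {
    float sum = 0.0f;
    float weightSum = 0.0f;
    for (int k = -2; k <= 2; k++)
    {
      const int j = i + k;
      if (j < 0 || j >= count)
        continue; // skip taps outside the image
      // Taps at a very different depth get a weight near zero.
      const float depthWeight = std::exp(-std::fabs(depth[j] - depth[i]) / depthPhi);
      const float weight = kernel[k + 2] * depthWeight;
      sum += weight * color[j];
      weightSum += weight;
    }
    result[i] = sum / weightSum; // the center tap always contributes, so weightSum > 0
  }
  return result;
}
```

Running this over every row and then over every column gives one separable iteration; a small `depthPhi` keeps the blur from crossing depth discontinuities.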
wait a sec
ah, I'm only using 4 passes in the second pic
here is how it should look
uncanny how pattern around the plant is very similar
check this and previous separable bilateral screenshot
I almost thought you posted same one twice
or did you
lol they're different, I promise
I put each image in a separate layer in paint.net so I can inspect
comparing images in chrome is painful
which pattern? the noise?
I made sure to fix the seed in both pics (and disable temporal accumulation)
so the only difference is the filter
yeah I thought you just somehow got lucky capturing similar patterns lol
dream luck
by the way I forgot to mention that last time I ran the example there were some issues with random history drops in the distance
it's probably related to depth
when you move the camera it randomly drops history
was it big sections or single pixels dropping history?
hmm, that artifact happens when the depth phi is too low
or anything else that causes excess sensitivity for that weight
I'm not seeing it on my machine, at least not in the scenes I tested
ye that's what I did
I also have a debug mode to spot disocclusions
in Reproject2.comp.glsl there is a #define at the top to toggle it
if I turn really fast I think I can see what you're talking about
maybe you are not fixing depth values after reprojection
I think it won't be same depth if you reproject a pixel from the side to the center
I wonder if the fact that I'm running at 240hz makes it harder to see it
it would make the depth differences smaller
hmm yeah I see it better at 60hz
maybe instead of using depth values use camera distance
those should be invariant to fragment locations certainly
length(world_pos - view_pos)
or distance(world_pos, view_pos)
though when camera moves depth will also change
for same surface point between two frames
it should probably be world pos then
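A minimal C++ sketch of a world-position history test along those lines. It assumes the world position of the current pixel and the world position stored at the reprojected history location are both available. Scaling the tolerance by camera distance is an added assumption, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float Length(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// Accepts the history sample if it lies close to the current surface point in world space.
// Unlike a depth comparison, this does not change when only the camera moves.
inline bool IsHistoryValid(Vec3 currentWorldPos, Vec3 historyWorldPos, Vec3 viewPos, float relativeTolerance)
{
  assert(relativeTolerance >= 0.0f);
  const float cameraDistance = Length(Sub(currentWorldPos, viewPos));
  return Length(Sub(currentWorldPos, historyWorldPos)) <= relativeTolerance * cameraDistance;
}
```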
I thought I could get away with linear depth, rip
btw the separable bilateral pass is ~1.2ms and the old one is ~2.8ms at 1080p
unfortunately the separable filter seems a bit worse in unstable areas
seems like there is more flickering in high frequency disoccluded areas like pot edges and leaves
I also tried a 7x7 filter for the lulz and there was 0 difference
its width probably made the depth phi too low for outer samples
and then you frogger
I pushed my changes, including gui options for the sun specifically for void
One trick I might try for improving perf/quality is to pre-downsample the RSM so it's cheaper to sample
Might even allow for multiple RSMs since sampling each one is relatively cheap
"options"?
did you add something more than just intensity?
another rotation axis maybe?
sun color?
hacked up some code to make the RSM + denoiser run at quarter (0.5^2) res, now the whole pass takes ~1ms (it takes 1.8ms at full res)
it flickers really hard now
increasing the sample count is really cheap though, and it can reduce the temporal aliasing
though it would probably be ideal to use legit TAA
tex hit rate goes down a lot in the filter pass, probably because I'm still sampling the full-res gbuffer in that
if I pre-downsample those to the target resolution first, perf may improve
sadly glBlitFramebuffer seems to be extremely slow. I use a blit when albedo modulation is disabled, and that alone takes 2ms
nsight seems to make the blit take a normal amount of time when profiling
odd how there is no single-component 32-bit unorm texture format in gl
except for D32
for what I need, not really
does float not lose a lot of info that you'd get to have in unorm 32 bit
it gets converted to float anyways when you sample
what I wanted was to copy a D32 texture to an R32 unorm texture, but R32F will work exactly the same
to sample?
in vulkan you probably could transition depth image to a readable layout and not copy things around
you can sample depth textures directly in gl
what I'm doing is downsampling depth with a compute shader
I can't write to depth formats so I have to use something else
are you building depth pyramid
I'm just seeing if an optimization idea will actually work
explained here
i like it, i also like that the blue curtain kind of emits some volumetric looking looks
it runs in about 0.7ms after downscaling the relevant buffers
sounds fast
if I downsample the RSM to 128^2, I can take 40 samples and the whole pass still takes only 1ms
quality seems practically identical on sponza too
you need something less than 30 series to truly test it
less even than 20 series
maybe some pascal
maybe I can get deccer to try it again on his igpu
was his igpu better than mine 710m
what's ur igpu
usermememark
where is the bug
wrong texture slot?
I'm using a half res texture somewhere that a full res texture is expected, or vice versa
building up some pressure
jaker plz add y axis sun rotation
ok, after this push
thats a good idea, for my shtuff too
ah I accidentally bound the wrong textures somewhere
btw you can also add moving shadowmap for it to follow camera, with a pixel perfect snap so that it doesn't float as you move around
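The texel snap can be sketched like this in C++, assuming an orthographic shadow projection and a shadow map origin already expressed in light space. `SnapToTexelGrid`, `orthoWidth`, and `resolution` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Rounds the light-space origin down to a whole number of shadow map texels,
// so the shadow map moves in texel-sized steps and edges do not shimmer as the camera moves.
inline Vec2 SnapToTexelGrid(Vec2 lightSpaceOrigin, float orthoWidth, int resolution)
{
  assert(orthoWidth > 0.0f && resolution > 0);
  const float worldUnitsPerTexel = orthoWidth / static_cast<float>(resolution);
  return {std::floor(lightSpaceOrigin.x / worldUnitsPerTexel) * worldUnitsPerTexel,
          std::floor(lightSpaceOrigin.y / worldUnitsPerTexel) * worldUnitsPerTexel};
}
```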
maybe that's more useful for deccer though
fwog examples have or will have AABB fit
I also think it's more reasonable to place the shadowmap origin in front of the view so that it doesn't render what's not going to contribute to the shading
i have a button in the debug window which kinda follows the camera
still examples branch neh?
ye
when are you going to merge to main
never
interesting decision
I had 30 fps on 710m
when this rsm
is "done" and I'm done with the other example stuff I will merge
make sure to click the button to use filtered RSM
ah one sec
29-31 it wobbles back and forth
ye cuz I can't add it that quickly
filtered is faster, that is a success
you can disable vsync by changing line 585 in deferred.cpp to this
auto appInfo = Application::CreateInfo{.name = "Deferred Example", .vsync = false};
seems like you won't get much higher than 30fps though
you can add a checkbox for vsync
not very useful since you'd normally always want it on
otherwise gpu suffers
yeah I don't want to clutter the UI too much
interactive framerates
what is wrong with shadows
some shadowmapisms with steep sun angles
I clamped the bias
the 🅱️eter 🅱️anning was too intense if I let it run free
ah it doesnt like the jiff
lol
tbf deccer's gif looked kinda phallic
that made me also chuckle a little
if i made #1019722539116802068 NSFW it should have worked
clam too explicit
what about dither
didn't feel like adding it for that scene
jaker needs to goto bed first
it's exclusive for gltf viewer?
oh didnt we talk about reusable imgui controls?
looks like you are cutting corners again
the only duplicated imgui controls atm are sun controls
ok c:
not gonna make a header for that
alright, the second sun angle slider has been added
somehow it kinda works when the RSM is downsampled to 32x32
really splotchy when the sun moves though
is resolution scaling a slider?
no
it can only be changed in the source
RsmTechnique.cpp line 104 is where the inverse resolution scale is hardcoded
2 = half res, 1 = full res
maybe void can coerce you into making it a slider one day, or send a PR
RsmTechnique.h line 102 is where you can change the downsampled RSM size
it would be cool, but annoying to add because it requires remaking all the textures
would be easy to make with preprocessor and recompilation
or just 2 methods, {Destroy/Create}SizeDependentTextures()
I used to have a lot of preprocessor stuff and a recompile button in my ogl raytracer
I just need to make a function to construct all the textures and it will be trivial to add those sliders
the shaders aren't what need to be recompiled
unless you're talking about cursed dll hot loadingisms
no im talkin bout shader
shaders don't need to be hot loaded atm
though a button to toggle debug mode would be kool
Fwog the ultimate RSM testbench
the effect looks decent at 0.25^2 res too
though I kinda have to fudge the sliders (super low alpha, increase samples to 8, decrease filter width a lot)
and the time seems to be dominated by downsampling the gbuffer and rsm at such a small resolution
maybe viable if you're on an iGPU
the shrimpled version really makes a difference, i think before your rehussle i had like 15fps or so
if you look around, you can see artifacts from me naively upscaling the illuminance to screen resolution
which example is that?
this is void's path tracer hell scene
a scene switcher would be neat
on gltf viewer
ah is that also part of the fwog repo now?
not yet
that is part of my engine
you can include it in fwog if you want, I wouldn't give the scene if I was against it
(1/8)^2
I might modify it a little if I do
e.g., right now it is not centered at 0
or maybe it is
nvm it is well-centered
the problem was that I had a position for the camera hardcoded for sponza
how silly of me
it's a scene measured in meters also so it might be bigger than sponza
way bigger I think actually
I had to scale this one down 2x as much as sponza
i like these default textures
(1/32)^2 resolution
I'm taking 400 samples here and it's still performing 5x better than unfiltered RSM
bigger rmax should worsen it
maybe plain RSM is just "shit" compared to all the optimization shteps you did and do for the filtered one
ah this is actually (1/16)^2
ye
surprising but plain RSM is coherent but is shit in quality
it's faster than dithered sampling
the win here is that we only do 1spp and temporally accumulate, now even at low resolution
are you clamping it in code maybe?
no
crazy
if I take 40k samples per pixel it goes up to 15ms
but these are not true screen space samples
I'm sampling from a texture that's 256x smaller than the screen in terms of area
because we adopt valve ideologies
your pixels are so big now they are basically almost surfels
you know that gives me another weird idea
can you linearly filter the RSM
what if we actually use surfels and do voronoi splatting
if rsm is so insensitive to resolution cuz it's low frequency
oh that's big brained
I think it would be more ideal to apply a 3x3 bilateral as I upscale so it doesn't run over edges
yes that's basically same as filtering
I'm guessing it would fail hard in a scene with more complexity though
like with foliage and stuff
I mean low res sampling in general
if you TAA maybe not as much
ye
At that point you could just switch to VXGI, it's probably simpler lol
should also do subpixel jitter on the low res samples
jaker needs to take a break from GI after this, and has other things to implement first, so that i can steal them for myisms
we brought up voxel tracing some time ago maybe worth a try eventually
The amount of optimizations that went in this RSM implementation is so good
I learned a lot just by lurking
turns out fwog thread is entertaining and educational
I wonder what is causing shimmers
reprojection rejection
probably cuz high frequency stuff is sampled too sparsely at low res
I'm gonna try something
ah
mayhaps it's a given that if pixel covers a lot of different surface points it should be less strict in reprojection rules
I'm also not using the 3x3 bilateral reprojection heuristic that the original paper has
I guess you could test if the reprojected pixel is still within the area that it covers downsampled
nah that's gonna give you a lot of troubles if you are accumulating the downsampled results
you actually need to accumulate full res
then it probably won't be as troublesome
is that what you're doing?
no
the reprojection pass would get more expensive if I accumulate at full res
possibly worth it though
but that's to assure it's not failing for the final image
at the same time, just adding more samples is incredibly cheap and works well
would be neat to have a switch/slider somehow to change the resolution to 1/16 to 1/1 or so, then we could also run some experiments on how it performs on all sorts of hardware again
ok if you slep I homwork
before I forget here's what voronoi splatting is: https://www.shadertoy.com/view/NlKGWt
you can also hash it to quickly find closest ones for a given pixel
given that the RSM sits entirely in cache at low resolutions
What if we could irregular Z-buffer the RSM
At higher resolutions
I did the math and a 256^2 RSM is about 5x larger than the L1 cache of a single SM on my GPU (3070), but there are dozens of SMs and each pixel won't need to access the whole RSM
I'd have to look it up after I sleep
mfw the original paper for irregular z-buffers notes that they cannot be constructed in real time on current (~2005) hardware, so they propose a GPU that can do it efficiently 
the idea of using an irregular z-buffer for pixel perfect shadows is interesting though
I love writing code like this
if (useSeparableFilter)
{
  Fwog::Cmd::BindSampledImage(0, indirectFilteredTex, nearestSampler);
}
else
{
  Fwog::Cmd::BindSampledImage(
    0,
    (5 - int(std::log2f(float(inverseResolutionScale)))) % 2 == 0 ? indirectUnfilteredTex : indirectFilteredTex,
    nearestSampler);
}
it's not brittle at all and doesn't lead to many bugs
lmao that condition
get a pipebomb for christmas in these 4-7 simple steps, hard to count
ping-pong algorithms with a variable number of iterations give me cancer
6 levels of indentation removed for your viewing pleasure
for (int i = 0; i < 5 - std::log2f(float(inverseResolutionScale)); i++)
{
  filterUniforms.stepWidth = (1 << i) * spatialFilterStep;
  filterUniformBuffer.SubDataTyped(filterUniforms);
  // The output of the first filter pass gets stored in the history
  const Fwog::Texture* in{};
  const Fwog::Texture* out{};
  if (i == 0)
  {
    in = &indirectFilteredTex;
    out = &indirectUnfilteredTexPrev;
  }
  else if (i == 1)
  {
    in = &indirectUnfilteredTexPrev;
    out = &indirectUnfilteredTex;
  }
  else if (i % 2 == 0)
  {
    in = &indirectUnfilteredTex;
    out = &indirectFilteredTex;
  }
  else
  {
    in = &indirectFilteredTex;
    out = &indirectUnfilteredTex;
  }
  Fwog::Cmd::BindSampledImage(0, *in, nearestSampler);
  Fwog::Cmd::BindImage(0, *out, 0);
  Fwog::Cmd::MemoryBarrier(Fwog::MemoryBarrierAccessBit::TEXTURE_FETCH_BIT);
  Fwog::Cmd::Dispatch(numGroups.x, numGroups.y, 1);
}
also replace int(log2f) with an int log
I thought about it
is there actually int log in the C++ standard library?
I don't see it in <cmath>
i think not
maybe it's in <bit>
reinterpret cast your int's address to a non-trivial type
you'll get its log
sometimes
ah right, i remember now
TIL about <bit>
its in <iostream>
int x = 6;
std::clog << x;
Great Banter shirts: https://kek.gg/u/36dhq
making my code unreadable for epsilon perf gain 
ilog2(x) is not unreadable
ah I actually want this function
https://en.cppreference.com/w/cpp/numeric/countr_zero
ilog2 is readable, but countr_zero is not unless you already know what it's being used for
well yeah
too bad there is no way to fix that
my brother in christ, you wrote the Fwog::Cmd::BindSampledImage(0, (5 - int(std::log2f(float(inverseResolutionScale)))) % 2 == 0 ? indirectUnfilteredTex : indirectFilteredTex, nearestSampler);
so don't write countr_zero instead of ilog2
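One way to get both, sketched in C++ under the assumption that `inverseResolutionScale` is always a power of two: wrap `std::countr_zero` (C++20, `<bit>`) in a descriptively named helper, with a plain loop as a fallback for older language modes. `Log2OfPowerOfTwo` and `FilterPassCount` are hypothetical names, not part of Fwog.

```cpp
#include <cassert>
#include <cstdint>
#if __cplusplus >= 202002L
#include <bit>
#endif

// Exact integer log2 for powers of two: the number of trailing zero bits.
inline int Log2OfPowerOfTwo(std::uint32_t x)
{
  assert(x != 0 && (x & (x - 1)) == 0); // must be a power of two
#if __cplusplus >= 202002L
  return std::countr_zero(x);
#else
  int result = 0;
  while (x >>= 1)
    ++result;
  return result;
#endif
}

// Mirrors the loop bound used in the filter code: 5 passes at full res, one fewer per halving.
inline int FilterPassCount(std::uint32_t inverseResolutionScale)
{
  return 5 - Log2OfPowerOfTwo(inverseResolutionScale);
}
```

The call site then reads as `FilterPassCount(inverseResolutionScale) % 2 == 0`, with no float round trip.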
inshallah that code is ""temporary""
He will be refactored alhamdullah
mfw when HR tells me "declaring jihad on warnings" is not appropriate workplace language

GL_GOOGLE_include_directive doesnt work on nvidia-470xx
it should be a warning at worst
it gtfs me out
enable: Causes the named extension to work; if the implementation does not support the extension, it only gives a warning. Fails if used with all.
/home/deccer/Private/Code/External/Fwog/build/example/02_deferred
terminate called after throwing an instance of 'Fwog::ShaderCompilationException'
what(): Failed to compile shader source.
0(3) : warning C7508: extension GOOGLE_include_directive not supported
0(118) : error C1020: invalid operands to "/"
0(119) : error C1020: invalid operands to "/"
ok so the error is the implicit float conversion thing
ah ye
deja vu from when void encountered the exact same thing
indeed
however, the driver is incorrect here
for binary operators, if one operand is a float type and the other is an integer type, the integer one should be promoted to the float type
i cannot use a newer driver
I guess I'll have to change it so it works on y'alls' crusty drivers
ye
unless you have us sponsored the latest amd goodness, i wouldnt say no lol
nah i was just kidding
i dont think i will get me an amd card ;P
i was able to get one of my 780tis working again
did you bake it
xD i did
amazin
quite the hassle
Told ya
it's a driver bug on god
It was always there since 200X

it actually works on both of my machines (rtx 3070 and rx 6800 with latest drivers, windows 10)
I can even pinpoint the exact part of the spec that says ur driver is bein silly
either way, I gotta conform to the lowest denominator
Would be funny if i update mine and it still won't work
I have 527.56 apparently
it's the latest I think
but I had to update nvidia geforce experience just to look it up
I'm now going to try your examples
I have the previous driver version from that
why do you have to explicitly specify if gltf is binary or not
too lazy to check the extension automatically 🤰
well, I shouldn't have to check the extension
std::filesystem::path::extension
I mean there should be a magic number in the file to indicate it
only glbs have a magic number though
but yeah I guess I should check the extension
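A small C++ sketch of the magic-number check, assuming the first bytes of the file have already been read into memory. Binary glTF (.glb) files start with the ASCII bytes `glTF`, while a JSON .gltf file has no such marker, so a miss here would still fall back to the extension. `LooksLikeGlb` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstring>

// Returns true if the buffer starts with the GLB magic "glTF" (0x46546C67 as a little-endian uint32).
inline bool LooksLikeGlb(const void* data, std::size_t size)
{
  return size >= 4 && std::memcmp(data, "glTF", 4) == 0;
}
```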
it works
so they actually did fix that just in 2022
and both amd and nvidia decided to do that in 2022?
well can't say for amd
but my drivers weren't year old
this is one of the reasons spirv is superior
compile to ir yourself and don't worry about driver not being able to compile
or idk link your app to compiler
ironically, opengl drivers are extra buggy when it comes to spirv compilation
but was spirv made for ogl at all
also I wonder if we'll get another ogl version
it seems to have been abandoned
it ded
pretty sure that was actually his nv card
eastern thinkers say 'kalpa' - a moment of eternity - passes when deccer buys a new gpu
yep im on intel usually, but i was able to resurrect 1 of the two 780tis last night
I just had a thought
What if I could dynamically distribute samples like in surfels GI
It would be cool to do the RSM at a really low res, but put more samples (or move the sample location) on edges
Maybe just something like VRS would work, hmm
I want to put jittered samples on edges so high frequency geometry doesn't die
I guess aggressive VRS might work
Given how we've seen that downsampling the RSM effect to like (1/32)^2 works
But then there's still the problem of filtering, which I don't want to do at full res (as VRS still resolves to a full res image)
I gotta try the bilateral filter upscaling strategy first though
sounds like it's worth a try
@long robin that Limbosuccessor gdc talk was really cool, i find the game itself utterly boring, but the graphicsisms look intermeresting
mayhaps you can post the link in #graphics-resources when you find the time
I can't post there
oi
ah
its gp-news where everyone can
let me allow mvps to post in gp res too
should work now?
Still can't
Worked
no embed?
Idk why
They also have one about TAA
https://youtu.be/2XXS5UyNjjU
In this 2016 GDC talk, Playdead's Lasse Jon Fuglsang Pedersen discusses Temporal Reprojection Anti-Aliasing in the context of INSIDE, touching on the process, the initial research, and the pleasant side-effects.
Lol nothing embeds in that channel
Ez
merci
The triangle noise technique was probably my favorite thing from the first talk lol
I need a scene with high geometric complexity for testing RSM
something with foliage, perhaps
why does the sketchfab's search suck so bad
the minimum price in the filter is $3 bruh
can't even filter for gltfs bruh
ok I think I found the free gltfs
https://sketchfab.com/3d-models?features=downloadable&sort_by=-likeCount
I had to use google instead of sketchfab's search functionality
lmao I literally clicked the "free gltf assets" button on their site and was presented with paid non-gltf models
blender is taking literal minutes to open a 1gb obj file
already using 11gb of ram
epic
hey I was thinking about the Morgan McGuire samples with foliage and forgor it had San Miguel
yeah I resorted to those because the websites for sharing models suck
does this work as a complex scene?
it certainly makes some problems more obvious
like the fact that I need to do the 3x3 search in reprojection
the upscaling filter is something of a success
geometry no longer gets censored when only illuminated by indirect lighting
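For reference, a rough sketch of what a depth-aware (joint bilateral) upsample of a low-res buffer can look like. This is not the filter actually used here; the Gaussian depth weight, the `depthSigma` parameter, the weight floor, and the `LowResSample` layout are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical low-res texel: one channel of indirect light plus linear depth.
struct LowResSample
{
  float value;
  float depth;
};

// Taps are ordered (0,0), (1,0), (0,1), (1,1); fx and fy are the bilinear
// fractions of the full-res pixel inside that 2x2 footprint.
inline float BilateralUpsample(const std::array<LowResSample, 4>& taps,
                               float fx, float fy,
                               float fullResDepth, float depthSigma)
{
  assert(depthSigma > 0.0f);
  const float bilinear[4] = {(1.0f - fx) * (1.0f - fy), fx * (1.0f - fy),
                             (1.0f - fx) * fy, fx * fy};
  float sum = 0.0f;
  float weightSum = 0.0f;
  for (int i = 0; i < 4; ++i)
  {
    // Down-weight taps whose depth differs from the full-res pixel.
    const float dz = (taps[i].depth - fullResDepth) / depthSigma;
    // The small floor keeps the filter from dividing by zero when every
    // tap is rejected; it then degrades to plain bilinear filtering.
    const float depthWeight = std::max(std::exp(-dz * dz), 1e-4f);
    const float w = bilinear[i] * depthWeight;
    sum += w * taps[i].value;
    weightSum += w;
  }
  return sum / weightSum;
}
```

On a flat surface this reduces to bilinear filtering; across a depth edge the far taps contribute almost nothing, which is what keeps indirect light from bleeding over silhouettes.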
that is amazing, you win
it's way too unstable at that res though, even with the smooth upscaling
2x upscaling seems a lot more feasible
just gotta perfect the reprojection pass 
very pog
persistent world-space surfels would be cool for this though
that would completely change filtering though
yes you could also make use of software rt using them
cuz less rays would need to be traced for surfels
yeah
I forgor how surfel lighting is actually applied to the image though. Is it just a lookup into the surfel structure?
I thought they just used the world space pos of the surface being shaded to look into the surfel acceleration structure
I think they had an improvement on top of that tho
some kind of interpolation iirc
I didn't look into how they do it specifically so I don't know
I think this is it
https://youtu.be/h1ocYFrtsM4?t=668
they fetch N surfels from the cell and weight by distance and normal
well isn't it a voronoi splatting
the surfel locations?
weighting by distance is essentially saying mixing with a convex combination
not sure what's with normal but I guess in the end it's all for a single weight in a convex combination
and if you have a set of points that you interpolate based on distance that's basically voronoi
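A small sketch of the "fetch N surfels, weight by distance and normal" idea, written as a normalized blend. The linear distance falloff, the clamped normal dot product, and the `Surfel` layout are assumptions for illustration, not what the talk or any engine actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

// Hypothetical surfel layout.
struct Surfel
{
  Vec3 position;
  Vec3 normal;      // unit length
  float radius;     // influence radius
  float irradiance; // scalar for brevity
};

// Blends the surfels near shading point p with normal n.
inline float ShadeFromSurfels(Vec3 p, Vec3 n, const std::vector<Surfel>& surfels)
{
  float weightedSum = 0.0f;
  float weightSum = 0.0f;
  for (const Surfel& s : surfels)
  {
    assert(s.radius > 0.0f);
    const Vec3 d = Sub(p, s.position);
    const float dist = std::sqrt(Dot(d, d));
    const float distWeight = std::max(0.0f, 1.0f - dist / s.radius);
    const float normalWeight = std::max(0.0f, Dot(n, s.normal));
    const float w = distWeight * normalWeight;
    weightedSum += w * s.irradiance;
    weightSum += w;
  }
  // Dividing by the weight sum turns the weights into a convex combination.
  return weightSum > 0.0f ? weightedSum / weightSum : 0.0f;
}
```

Picking only the nearest surfel instead of blending would give hard Voronoi cells; the smooth distance weight softens those cell borders.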
I see
when I googled voronoi splatting the first time you mentioned it, your shadertoy thing was the only relevant result lmao
I guess I coined the term
but there really isn't any other way to call it that makes sense
idea is simple though
essentially a constraint for a set such that a + b + c ... = 1
a way to achieve it is to divide each element by the sum
Weighted means are functionally the same as convex combinations, but they use a different notation. The coefficients (weights) in a weighted mean are not required to sum to 1; instead the weighted linear combination is explicitly divided by the sum of the weights.
example of a convex combo is barycentric coords
many image filters are convex combinations through the above definition
the term "convex combination" has a geometric implication imo
kernels just have the same condition
blur kernels
you can have whatever kernel you want for a convolution
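To make the "divide each element by the sum" step concrete, here is a tiny sketch that normalizes a 1-2-1 blur kernel and applies it. `NormalizeWeights` and `Combine` are hypothetical helper names; the weights are assumed to be non-negative, which is what makes the result a convex combination.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Divides each non-negative weight by the sum so the results add up to 1.
inline std::vector<float> NormalizeWeights(std::vector<float> w)
{
  const float sum = std::accumulate(w.begin(), w.end(), 0.0f);
  assert(sum > 0.0f && "need at least one positive weight");
  for (float& x : w)
  {
    x /= sum;
  }
  return w;
}

// Convex combination of scalar values using already-normalized weights.
inline float Combine(const std::vector<float>& values, const std::vector<float>& weights)
{
  assert(values.size() == weights.size());
  float result = 0.0f;
  for (std::size_t i = 0; i < values.size(); ++i)
  {
    result += values[i] * weights[i];
  }
  return result;
}
```

The kernel `{1, 2, 1}` becomes `{0.25, 0.5, 0.25}`, so a constant signal passes through unchanged, which is the property blur kernels want.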
random fact did you know that we divide by pdf because estimators otherwise converge to it if you integrate 1
I guess you can numerically get a pdf using this property
I mean just making a fake scene to derive the pdf, then plugging in the number into the code
maybe RIS is actually exploiting this property though
I remember they have some weight using expected value, but maybe it's unrelated
if your pdf is not constant then my idea does not work
you could make a LUT for numerically estimated pdf
but the dimension of the LUT depends on the number of parameters of your sampling function
maybe if some parameters are degenerate you could exclude a dimension
I'd only consider it if the lut was 0-dimensional tbh
that would be an approximation that causes bias for a difficult continuous pdf
depending on the LUT resolution
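One possible way to build such a LUT, sketched for a 1D case: draw many samples from the sampling function, histogram them, and scale the counts so the bins integrate to 1. The sampling function `x = sqrt(u)` (whose true density is `p(x) = 2x`) is just an assumed stand-in, and `EstimatePdfLut` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Assumed example sampling function: maps uniform u in [0, 1) to x in [0, 1)
// with density p(x) = 2x.
inline double SampleLinear(double u) { return std::sqrt(u); }

// Builds a 1D LUT of the estimated pdf by histogramming samples.
inline std::vector<double> EstimatePdfLut(std::size_t binCount, std::size_t sampleCount, unsigned seed)
{
  assert(binCount > 0 && sampleCount > 0);
  std::mt19937 rng(seed);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  std::vector<double> lut(binCount, 0.0);
  for (std::size_t i = 0; i < sampleCount; ++i)
  {
    const double x = SampleLinear(uniform(rng));
    const std::size_t bin =
      std::min(binCount - 1, static_cast<std::size_t>(x * static_cast<double>(binCount)));
    lut[bin] += 1.0;
  }
  // count / (N * binWidth) is the average density over each bin.
  const double binWidth = 1.0 / static_cast<double>(binCount);
  for (double& v : lut)
  {
    v /= static_cast<double>(sampleCount) * binWidth;
  }
  return lut;
}
```

Each entry is only the average density over its bin, which is where the resolution-dependent bias mentioned above comes from, and every extra sampling parameter adds another dimension to the table.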
well not to mention that fat LUTs are lethal for performance
otherwise I'm sure we'd be using LUTs for BRDFs
there are captured real world BRDF LUTs out there
what was it called I can't remember
basically 100 materials library
idk I'm not an artist
MERL BRDF database
yw
MERL - Mitsubishi Electric Research Laboratories
those are very very fat
the only use is pretty much research
to compare your analytical model against
you have mastered reprojection?
I still haven't tried a bicubic filter like in that ReLAX talk
I won't try it tbh since I doubt it'll improve this much
when did we take damage
by looking at the aliasing in the pic
it works pretty well even at 4x4 upscaling
still flickers a lot since there aren't enough samples placed on different surfaces
I want to temporally jitter the samples somehow, but idk if it's feasible
foliage hell
it's kinda ugly, but has useful geometry
sometimes there is black and I'm not sure why
from what pass
is it consistent
somewhat
it seems to happen when there is a rejection
but not always
pretty sure I just got a bug somewhere
rejection seems to make areas get darker
it's odd because the affected areas have a long history
actually, no
low history seems to be correlated with darkening in general
I think it's just a lack of blurring + sparse sampling
except it still happens when there are 40 samples
that's something
probably using an index meant for a larger texture or vice versa
something is definitely wack, cuz it happens with 400 samples and when I only move by a pixel
I'm guessing some kind of UB cuz it doesn't happen in RenderDoc
because it still does primary shading on the image
the "skip albedo modulation" is just for the output of the RSM pass, which means the indirect light
makes sense
another thing is that it feels like sun intensity somehow doesn't make sense with what I see on screen
feels like it should be way more overblown for 40
idk
btw
do you 1/pi for direct
I think you don't
well doesn't matter for first bounce
I'm more interested in trying a tonemapper
ok, I can shoehorn one in sometime
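vec3 reinhard_ext_luminance(vec3 v, float max_white_l)
{
    // Rec. 709 luminance; assumes l_old > 0, a pure black input divides by zero below
    float l_old = dot(v, vec3(0.2126, 0.7152, 0.0722));
    // extended Reinhard: a luminance equal to max_white_l maps to 1.0
    float numerator = l_old * (1.0 + (l_old / (max_white_l * max_white_l)));
    float l_new = numerator / (1.0 + l_old);
    // rescale the color so its luminance becomes l_new
    return v * l_new / l_old;
}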
nice
thats what I'm editing ye
here's my previous image with reinhard
how does it compare to this
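vec3 reinhard_luminance(vec3 v)
{
    // Rec. 709 luminance; assumes l_old > 0
    float l_old = dot(v, vec3(0.2126, 0.7152, 0.0722));
    // basic Reinhard applied to luminance only, so output luminance stays below 1.0
    float l_new = l_old / (1.0 + l_old);
    return v * l_new / l_old;
}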
moar color
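To compare the two curves numerically, here is a CPU-side C++ port of both GLSL snippets. The `Rgb` struct and the function names are made up for this sketch, and unlike the shader versions it returns black for zero-luminance input instead of dividing by zero.

```cpp
#include <cassert>
#include <cmath>

struct Rgb { float r, g, b; };

// Rec. 709 luminance, same coefficients as the shader code.
inline float Luminance(Rgb c) { return 0.2126f * c.r + 0.7152f * c.g + 0.0722f * c.b; }

// Rescales c so its luminance becomes lNew; black stays black.
inline Rgb ChangeLuminance(Rgb c, float lNew)
{
  const float lOld = Luminance(c);
  if (lOld <= 0.0f)
  {
    return {0.0f, 0.0f, 0.0f};
  }
  const float s = lNew / lOld;
  return {c.r * s, c.g * s, c.b * s};
}

// Basic Reinhard on luminance: L / (1 + L).
inline Rgb ReinhardLuminance(Rgb c)
{
  const float l = Luminance(c);
  return ChangeLuminance(c, l / (1.0f + l));
}

// Extended Reinhard on luminance: a luminance of maxWhiteL maps to 1.
inline Rgb ReinhardExtLuminance(Rgb c, float maxWhiteL)
{
  assert(maxWhiteL > 0.0f);
  const float l = Luminance(c);
  const float numerator = l * (1.0f + l / (maxWhiteL * maxWhiteL));
  return ChangeLuminance(c, numerator / (1.0f + l));
}
```

For a grey input with luminance 4, the basic curve gives 4 / 5 = 0.8, while the extended curve with `maxWhiteL = 4` gives exactly 1, so the extended version uses the full output range up to the chosen white point.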
