#GrubStomper29's OpenGL Sandbox
I think after shadow mapping is done i’ll implement pbr and move on from this project
i think a big source of burnout is the codebase’s mess
too much codependency
regardless i cant make this the THIRD project where i plan to implement pbr but move on before i do so
anyways my current plan going forward is to make slightly smaller projects with 1-3 month deadlines to encourage more work
I want to work with engine structure, 2D graphics, and gameplay prototypes
after watching The Phantom Menace, I’m inspired to work on swordplay. It will be challenging to get something unique, understandable, and fun up though
man i havent touched this in so long
i have no idea what specifically I need to do
i know I was working on multiview but i dont remember specifically what part
thats probably not good
Implement LPV it's what all the cool kids are doing now 
ah lovely it disappears when i enable ASAN
this has been happening ever since I made hiZTexture a 2D array
im not sure why
oh I was using a parameter wrong
anyways I just made it so there are viewBuffers containing view info like camera frustum and matrices
next I need to put them to use in the shaders
ah it seems ive forgotten that textures with a depth component format can't be used for image load/store ops
i wish the wiki page said that
im too scared to edit the wiki to say this, because I'm only 100% sure it's true, not 200%
If a tree falls in a forest and no one is around to hear it, does it make a sound?
you wont hear it, but it will move the air
the gpt nonsense in offtopic is also rule 8 violation 🙂
I really need to write shaders specific for the shadow maps
i bet using the ubershaders for them is a huge time waste
Yea
man i gotta do some witchery to convert a uint to a sampler2D
you mean a uint64
bindless handles are 64bit
if you dont mean bindless elaborate please 🙂
yeah not bindless
what im trying to do is impossible anyways since you cant put a sampler2d in a struct array
let my view index an arbitrary sampler
```glsl
struct View
{
    ...
    sampler2D hiZ_buffer;
};
```
such and so forth
and ofc
```glsl
layout () buffer ViewBuffer
{
    View views[];
};
```
im in opengl
i could be evil and hardcode it
arent those locked? iirc you cant update the contents of a texture after its made bindless
You can
i dont think so
oh okay
bindless just means you have a handle to use
It locks something but I don't remember what
maybe sampler parameters idk
i mean i just need to do glCopyImageSubData and image load/store ops
none of its state is locked or none of its state can change?
the former
splendid
"Once a handle is created for a texture/sampler, none of its state can be changed. For Buffer Textures, this includes the buffer object that is currently attached to it (which also means that you cannot create a handle for a buffer texture that does not have a buffer attached). Not only that, in such cases the buffer object itself becomes immutable; it cannot be reallocated with glBufferData. Though just as with textures, its storage can still be mapped and have its data modified by other functions as normal."
so from what I gather I can't change its dimensions or wtv but I can still do all this
i dont think you are using buffer textures (TBOs)
thats something you would use if you cant use SSBOs (like mobile platforms)
i see
another day celebrating not being a mobile dev
what a beautiful image
that's the shadow map's hzb
noice
wow my pc monitor must be really bright
this is extremely dark on mobile
Everything is usually darker on mobile idk if it's just because the darks go darker (oled) or what
Or just because we put the brightness down
When I max out the brightness it looks completely different
probably the former
I always have my phone brightness as low as possible for the sake of the battery
Yeah
this did not happen before
hzb culling is the issue
I bet it has something to do with the bindless textures
rd would be a huge help rn
i mean i can tell its not sampling the right texture
0000000100000B7D
0000000100000b7d
the texture handle and the handle referenced in the shader match up
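A minimal sketch of how a 64-bit handle like the one above splits into two 32-bit words when it is stored as a uvec2. The names `HandleWords`, `splitHandle` and `joinHandle` are made up for illustration, and the "low word in x" ordering is an assumption (it matches GLSL's packUint2x32 convention), so check it against your own buffer layout.

```cpp
#include <cstdint>

// Two 32-bit halves, mirroring a GLSL uvec2.
// Assumption: x holds the low word, y holds the high word.
struct HandleWords {
    std::uint32_t x; // least significant 32 bits
    std::uint32_t y; // most significant 32 bits
};

// Split a 64-bit bindless handle so it can sit in a uvec2 buffer member.
constexpr HandleWords splitHandle(std::uint64_t handle) {
    return { static_cast<std::uint32_t>(handle & 0xFFFFFFFFull),
             static_cast<std::uint32_t>(handle >> 32) };
}

// Rebuild the 64-bit value, e.g. to compare against what a debugger shows.
constexpr std::uint64_t joinHandle(HandleWords w) {
    return (static_cast<std::uint64_t>(w.y) << 32) | w.x;
}
```

With the handle from the log, 0x0000000100000B7D, this gives x = 0x00000B7D and y = 0x00000001.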
nvidia nsight understands bindless textures
yeah thats how i found this out
ah
wait why am I doing this twice
regardless i dont see why that would cause this issue
as expected fixing that didnt remedy it
some witch has hexed my code, ill fix it tmrw
oops
im still taking it slowly and learning scales for now
i havent yet decided on a song to learn but the cave story moonsong sound track is looking real nice
or geothermal or living waterway from the same game
still trying to fix this, I have confirmed that the correct texture is being read
my next guess is a transform error
but I don't see how
heres an interesting find
The top right is what's visible, and the left is the shadow view (which has the same view/proj as the camera for now)
the bug doesnt seem to be affecting the shadow view like it affects the camera view
for comparison the camera view hzb matches perfectly
sidenote: there are some buffer mappings that take over a millisecond each, which can account for over half the frame time. I'll have to fix that later
very interestingly if I send in my view matrix to the culling algorithm manually using a uniform mat4, the value ends up being totally different from if I use what's in the view struct
something something alignment
which doesnt make sense to me since the last vals in the struct seem perfectly fine
unless nsight graphics buffer view doesnt expose alignment issues?
@kind sparrow youre online would you happen to know anything about that
I'm not sure I understand
you're saying a uniform variable vs. a uniform buffer is making a difference?
they should be identical values but yes
meaning something is wrong with reading from the ssbo
i just used an = test within the shader 
im afk for about 30 minutes but i can check soon
what irks me is that nsight reads correct values (at least from the last few vars in the struct)
maybe its this—theres a struct within the view struct in the glsl code, maybe thats messing up alignment.
theres no struct in the nsight buffer view; i just paste in the contents of the struct, being 6 vec4s
you can format the buffer view in nsight, like you can do in render doc
idk if it supports struct within struct
double check alignment rules though
hence my pasting in the 6 vec4s
std140/std430 with ssbo
yeah its unintelligible
some really weird rule i dont really understand
ill see if replacing the struct with 6 vec4s does the job
alright so that fixed half of it
view data is correct
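For reference, a small sketch of the kind of alignment rule that tends to bite here: in std140 and std430 a vec3 is aligned to 16 bytes but only takes 12, so a second vec3 cannot start at offset 12. `ExampleGpu` is a hypothetical struct, not the actual View struct from this project, and the CPU mirror below only shows one way to make the padding explicit.

```cpp
#include <cstddef>

// CPU-side mirror of a hypothetical GLSL block member:
//   struct Example { vec3 a; vec3 b; float c; };
// vec3 has a 16-byte base alignment but a 12-byte size, so b starts at 16
// and the trailing float packs into the last 4 bytes of b's slot.
struct alignas(16) ExampleGpu {
    float a[3];  // vec3 a, offset 0
    float pad0;  // explicit padding so b lands on a 16-byte boundary
    float b[3];  // vec3 b, offset 16
    float c;     // float c, offset 28
};

// Compile-time checks that the C++ layout matches the offsets above.
static_assert(offsetof(ExampleGpu, b) == 16, "vec3 b must start at offset 16");
static_assert(offsetof(ExampleGpu, c) == 28, "float c packs right after b");
static_assert(sizeof(ExampleGpu) == 32, "array stride is a multiple of 16");
```

Without `pad0`, a tightly packed C++ struct would put `b` at offset 12, and every value after it would read shifted in the shader.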
occlusion culling still simply does not function though
i wonder if passing sampler2D as a function parameter is illegal
its not
well I got it to sample SOMETHING and this is my result
uhhhh
occlusion culling seems to do something now
but it looks like transforms are actually still broken
ill be done for the day
yeah according to jaker in #opengl i may have discovered an esoteric driver bug
@junior sparrow do you have an rtx 3050?
iirc you did
rtx 4080 super
@gilded shell
also never pass sampler2D
glsl is broken lang when it comes to this
lpotrick was really mad about it
as a func param?
yep
alright
vulkan + slang when 😈
do you have an alternate recommendation lol
nope
sorry to the sampler thing
let the opengl and glsl burn in hell
Why not
i’m good
oh believe me its been in there and there are no messages
then render a cube and slap that texture on it
const uvec2 hiZ = views[viewId].hiZ;
uvec2 hiZ = views[viewId].hiZ;
based on my sophisticated testing method, the former gives incorrect results
I think const works differently in glsl than in cpp
so i removed all the incorrect const usage but culling is still broken
it really seems like im sampling the hzb correctly this time
i believe the issue lies elsewhere
being able to place breakpoints in glsl would be a godsend
Renderdoc
Has debugging shaders
doesnt support bindless textures
My man switch to vulkán or at least daxa 😭
Alright so I set the meshlets to cull if the bounding sphere depth is calculated to be less than 0
I honestly don't remember if they should be more than or less than zero but obviously the sphere transformations are the problem if it just swaps like that
which confirms my theory—somehow I broke sphere transformations when implementing multiview
the first culprit would again be alignment issues but again every other value given by the ssbo seems perfectly fine
in other news I'm trying to implement gribb & hartmann's frustum extraction method on Scratch
that's the easy part; the hard part is calculating a projection matrix lol
in the vertex transform, I just divide x and y by z then multiply them by the half screen width divided by tan(0.5 * fov)
That gets me straight from view space to screen space, skipping clip space
so I suppose I need some sort of imaginary clip space solely for the purpose of frustum culling lol
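A rough sketch of the projection described above, plus the same math rewritten as an "imaginary clip space" test that never divides by z. It assumes the fov is the horizontal one in radians and that +z points in front of the camera; `projectToScreen` and `insideHorizontal` are illustrative names only.

```cpp
#include <cmath>

struct Screen { float x, y; };

// View space straight to screen space: divide by z, then scale by
// halfWidth / tan(fov / 2). Assumes z > 0 is in front of the camera.
Screen projectToScreen(float x, float y, float z, float halfWidth, float fovX) {
    const float scale = halfWidth / std::tan(0.5f * fovX);
    return { x / z * scale, y / z * scale };
}

// Same relationship without the divide: clip.x = x * f, clip.w = z,
// and the point is between the left and right planes when |clip.x| <= clip.w.
bool insideHorizontal(float x, float z, float fovX) {
    const float f = 1.0f / std::tan(0.5f * fovX);
    return z > 0.0f && std::fabs(x * f) <= z;
}
```

The second function is the part a frustum test needs: it works on the pre-divide values, which is what a projection matrix would hand to plane extraction.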
yeah
gribb and hartmann extracts the frustum from the viewproj
top plane of the frustum looks like this
column major
"View mat" is actually supposed to be view proj
but I have not yet constructed the proj
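A minimal sketch of Gribb and Hartmann style plane extraction, assuming a column-major matrix indexed as m[column][row] and OpenGL's default -1..1 clip volume. The types and function names are placeholders, and the planes are left unnormalized, so `planeTest` only tells you the sign, not a true distance.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>; // column-major: m[column][row]

Mat4 identity() {
    Mat4 m{}; // zero-initialized
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

// Gather one row of a column-major matrix.
Vec4 row(const Mat4& m, int r) {
    return { m[0][r], m[1][r], m[2][r], m[3][r] };
}

// Order: left, right, bottom, top, near, far. Each plane is (a, b, c, d)
// and a point is inside when a*x + b*y + c*z + d >= 0.
std::array<Vec4, 6> extractFrustumPlanes(const Mat4& viewProj) {
    const Vec4 r0 = row(viewProj, 0), r1 = row(viewProj, 1);
    const Vec4 r2 = row(viewProj, 2), r3 = row(viewProj, 3);
    auto add = [](const Vec4& a, const Vec4& b) {
        return Vec4{ a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3] };
    };
    auto sub = [](const Vec4& a, const Vec4& b) {
        return Vec4{ a[0] - b[0], a[1] - b[1], a[2] - b[2], a[3] - b[3] };
    };
    return {{ add(r3, r0), sub(r3, r0), add(r3, r1),
              sub(r3, r1), add(r3, r2), sub(r3, r2) }};
}

// Sign-only test against an unnormalized plane.
float planeTest(const Vec4& plane, float x, float y, float z) {
    return plane[0] * x + plane[1] * y + plane[2] * z + plane[3];
}
```

With an identity viewProj the top plane comes out as (0, -1, 0, 1), i.e. y <= 1, which is an easy sanity check before plugging in a real matrix.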
Oh lol, I didn't understand when you said scratch you meant the language 
I thought you mean you were writing it "from scratch"

precisely
When your capabilities (mostly in math) grow then you'll be able to channel that masochism into making more challenging stuff rather than making simpler stuff in more painful tools lol
A brilliant version of Two Step w/ Bela Fleck & The Flecktones from 4/21/02.
This was the day after the very popular #41 w/ the Flecktones which was on the Central Park bonus CD. Jeff Coffin was with the Flecktones at the time. This was 6 years before LeRoi Moore passed and Jeff filled in....
Filmed by Ethan Sinclair.
Clipped out render fro...
this is my favorite drum run ever
it sounds a bit messy because theres also Wooten's brother on the SynthAxeDrumitar
I really like the tom and snare stuff
Looks pretty legit
and crispy
indeed
Occlusion culling is fixed
what was the problem?
I use a vec4 to store cam pos + znear
to get the znear I was stupidly fetching the z-component and not the w-component
oops
yeah
I believe multiview is complete
that took far longer than it should've
its merged into git
now for shadow mapping
or I could do some optimization first

i need to figure out how to use nsight profiler
it seems like the camera's orientation is somehow affecting the shadow's hzb
ofc that could also be a bug with the frustum culling
what an odd bug, there is no apparent cause
here's what NSight shows when I move the camera around between captures
so it's not an issue with the depth buffer visualizer
I think I narrowed it down to being an issue with the view frustums
or very quite possibly... alignment issues
well i read the spec and my alignment for the view struct is indeed wrong. After fixing it, the program is even more broke
its like every line i add to this junk breaks something
I've deduced its not a sync issue
anyways after "fixing" my alignment theres some really weird flickering
I fixed an incorrect near z value and most of the flickering is gone, but it still seems the camera's view frustum is somehow interfering
yep im correct
and it looks like the bug itself is in the shader
Alright
so I think I finally understand everything there is about glsl alignment
I also figured out where the bug originates:
I can't understand why, but viewId is incorrect
but ONLY when its pulling view frustum vals
if I hardcode 0 or 1 its correct
i say this because there are zero other bugs related to viewId
the buffer values are correct
im losing it
i cant fathom this
i opened a thread in #1019726067851862097
4 minutes
I don't think that's very good
missing barrier perhaps?
not sure
Alright all bugs i know of are fixed
i dont want to look for another because im scared of finding one
so it was alignment bs again after all?
Thank you all
anyways it's finally time I start with shadow mapping
i gotta look into shadow samplers
@kind sparrow I think I saw you used those; were they nice to you?
I do, they're just very rudimentary
in srs?
Yeah
sorry for spreading misinfo 🙂
It's just a single shadow map at short range, no cascades
Oh I do use a shadow sampler actually
But I have a manual filter kernel it doesn't look this good automatically
hmm i dont rember seeing them, but they look quite neat already
so what does it do
it shades the grass : >
I think it's probably because once upon a time (or now) that was some kind of hardware accelerated thing (it probably piggybacked on whatever hardware tested against your depth buffer under normal circumstances)
one nice benefit of using a shadow sampler is that if you're using linear filtering on it, it basically gives you a little free smoothing
so you have to do less with your edge smoothing kernel thing
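A CPU sketch of roughly what a shadow sampler with linear filtering does: compare the reference depth against each of the 4 nearest texels first, then bilinearly blend the pass/fail results. This assumes a "less or equal means lit" compare and clamp-to-edge addressing; real hardware may differ in details, and `DepthMap` and `sampleShadowPcf2x2` are made-up names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct DepthMap {
    int width, height;
    std::vector<float> texels; // row-major depth values

    // Clamp-to-edge texel fetch.
    float at(int x, int y) const {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// u, v in [0, 1]. Returns the lit fraction in [0, 1].
float sampleShadowPcf2x2(const DepthMap& map, float u, float v, float refDepth) {
    // Texel centers sit at half-integer coordinates.
    const float fx = u * map.width - 0.5f;
    const float fy = v * map.height - 0.5f;
    const int x0 = static_cast<int>(std::floor(fx));
    const int y0 = static_cast<int>(std::floor(fy));
    const float tx = fx - x0;
    const float ty = fy - y0;

    // Depth compare happens per texel, before any filtering.
    auto lit = [&](int x, int y) { return refDepth <= map.at(x, y) ? 1.0f : 0.0f; };

    const float top    = lit(x0, y0)     * (1.0f - tx) + lit(x0 + 1, y0)     * tx;
    const float bottom = lit(x0, y0 + 1) * (1.0f - tx) + lit(x0 + 1, y0 + 1) * tx;
    return top * (1.0f - ty) + bottom * ty;
}
```

The key point is the order: filtering the comparison results gives a soft edge, while filtering the depths first and comparing afterwards would not.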
Yeah
I forgot I made this
in retrospect this is actually a really well designed boss fight for being a joke project
if this was made in flash and released 15 years ago you would've done numbers
instant school computer lab classic
yeah…
gosh i even make sure to show the player how an attack works before actually making it dangerous
i was smart 4 years ago
It gives you 2x2 hardware pcf
y’all better remember this when i make real games
Another developer for the roster of deccer publishing group. GmbH
GmbH?
Gesellschaft mit beschränkter Haftung
the thing im most scared to do is add a third model to the scene
Gesellschaft mit beschränkter Haftung (German: [ɡəˈzɛlʃaft mɪt bəˌʃʁɛŋktɐ ˈhaftʊŋ]; lit. 'company with limited liability') is a type of legal entity in German-speaking countries. It is equivalent to a société à responsabilité limitée (Sàrl) in the French-speaking part of Switzerland and to a Società a Garanzia Limitata ...
baking more baguettes today
they use a preferment so hopefully theyll stay fresh until monday morning when ill give one to my friends
good luck
theyre good
another potential project is buying a vocore2 and compatible screen to try and get some graphics up on that
id be writing a software renderer
looks like only the openWRT os will work on this system though
and the cpu is mips
so getting hello world up will probably be the most difficult part for me
not really
you just need a mips compiler 🙂
MIPSym is an academic tool used to teach assembly language programming.
i see
anyways it looks like csm will require that my camera have some sort of far plane
ig ill need a second projection matrix thats not infinite
Are you sure that's necessary
Seems like you could just have a manual arbitrary shadow cutoff that you use to bound the frustum
ill have to see
Yeah my CSM cuts off significantly before the far plane
This is extremely common in games with long draw distances
i see
Sometimes they fill in the far distance with screen space shadows
im using reverse z for my depth buffers
im looking at resources for light matrix calc and they partition the camera's frustum into smaller frustums, one for each shadow map
then they get the world space frustum corners
im assuming I won't need reverse z here
no
you could probably just pick a max far, and then split it into cascades
while your normal camera keeps using reverse zed
No
Yeah
M*v is how you act on column vectors which are how it's done basically everywhere
It's only ambiguous because GLSL doesn't differentiate between row and column vectors
todo: shadow map pancaking
csm is going smoothly so far
ill also probably want pcss
Yeah std::floor obviously won't have overloads for glm::vec3
That's why I have co_math::floor
Also because co_math::floor is constexpr
But std::floor isn't (yet)
coroutine based floor 😛
coroutine?
a fancy thread like mechanism, nothing to worry about, but co_... is what all the coroutine bs is prefixed with in c++
I came up with the namespace before c++20
nice
oh shot
shoot
i forgot to make a shadow mapping branch
for my next trick, ill hardcode the amount of shadow cascades!
message to future self: my debugging setup could potentially be breaking light matrix calc by basing it on itself
Lol, I did that, I wouldn't be surprised if others did too
frustum culling is so hard bruh
Is it? Just a few dot products lol, don't overengineer it
Like you have 6 planes right?
What bounding volume are you using for objects? AABBs?
csm*
no clue why i said frustum culling
tons of good info + a demo program, it should leave you with CSM you're very happy with
The demo program is the CSM Bible as far as I'm concerned
ive been following some old blog post
its good but its for d3d so ive probably messed up translating the math code
The link above is also d3d
but the concepts should carry over
double check your math I guess
CSM might be the kind of thing that is borderline impossible to debug in a methodical way until you have learned linear algebra for real
Time for a crash course in Lin-Alg @gilded shell
yeah…
i mean i can see that my view matrix calc is correct for the most part
i understand very basic fundamentals about matrices and vectors
The issue is just the fluency in changing basis and the geometry and coordinate spaces
right
If you don't have a crystal clear understanding of what's going on it's really hard to debug the values
Like even if you do it's still tricky
You need to devise contrived situations where the components are recognizable
Otherwise it's just number soup and your only recourse is doing hand calcs to compare brute force
Are you going to college in the Fall? or do you still have more HS to do?
thats the plan
no math classes yet though
what program did you apply for?
i’m going to a community college before university
and i need to see what math credits will transfer, hence me postponing that
so just ge like bio, english, etc
Hmm community college seems to be much more common in South Canada 🇺🇸 where you guys are
Maybe its a cost thing? Here you usually do one or the other
def cost
MJP's article was good but I literally missed one line in my scan of the demo code (iirc it was the depth clamp settings) and he made an amendment to the article after
so uh, check your depth clamp
university is about $30k a year here?
but community college is about $5k a year i believe
absolutely crazy that you have to pay so much for basic shit
yeah
its 500 european shekels per semester in many unis in europe
time to move to europe 🙂
switzerland looks lovely but they have mandatory conscription
and i dont think iceland has much demand for graphics engineers
Come up north, join us 🇨🇦
i’m so good
Ok I'll stop before we get into rule 8 territory
In what way does it sound bad?
you tolerate the french
Lmao
It's like 30C today so the frozen wasteland thing is a lie btw
Got a free college, maybe will qualify for free uni
30c? thats like 200f
CC is basically like university costs in other countries
thats 86F no?
Universities charge a lot of money because they fund world class research programs that are expensive although these days a lot of the money is being pilfered by admin overhead and stuff
So it's the worst of both worlds
this is doxable info btw
We already know where Jake lives lol
I live in toronto ON
find me in the 6 million people
lol
i do have a meme that ties to this give me a minute
I already know which one it is
Ah that's not the one I expected
If you study in Canada as a foreign student you will pay more, it helps keep the costs down for Canadians
i actually love celsius for baking
25c is so easy to remember and its the optimal dough rise temp
Yea obviously we use metric for everything scientific
who needs to know that
just put that jawn on the stove and crank it
That tells you what temperature the water is when it's boiling not the temperature needed to boil it
I mean same thing but reverse direction
But the US is officially a metric country we just use the imperial units in daily life
Also freezing at zero
i mean ours freezes at 0 too
It freezes at 32F
well yeah but 0F water will be frozen is what i mean
yeah its -17 C no shit
What matters is that you know what temperature the phase transitions happen at
Which metric makes easy
0 and 100
yeah
well technically 100C depends on atmospheric pressure
are you saying celsius changes or the number changes
100C at STP
i know the objective temperature will change
The boiling point changes
right
The boiling point will be like 98C or 102C or something depending on ambient pressure
yeah ig C would be pretty useless if it changed with atmospheric pressure
dont forget + C after your integrals
Yes wouldn't want someone to think your integrals were in Fahrenheit
we should really be using kelvin
Ah yes a comfortable 283.15K day
anyways, csm
i’m just using large near/far z vals for the projection matrix but bad z clipping might still be a possibility
i heard of shadow pancaking being a possible fix but havent found a resource describing what it actually is
I threw the CSM building logic from the demo into a 1x1x1 compute shader and ran it after my depth pyramid build
so I could just sample the highest mip for depth max
actually it might've been numCascadesx1x1, so 4x1x1
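A CPU sketch of why the top mip of a depth pyramid gives you the max depth: each level keeps the max of a 2x2 block, so the 1x1 level holds the max of the whole buffer. This assumes power-of-two dimensions and a non-reversed depth convention (with reverse z you would reduce with min instead). `Mip`, `downsampleMax` and `reduceToMaxDepth` are illustrative names, not the demo's actual code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Mip {
    int width, height;
    std::vector<float> depth; // row-major
};

// Build the next mip by taking the max of each 2x2 block.
// Assumes power-of-two sizes so every source texel is covered.
Mip downsampleMax(const Mip& src) {
    Mip dst{ std::max(1, src.width / 2), std::max(1, src.height / 2), {} };
    dst.depth.resize(static_cast<std::size_t>(dst.width) * dst.height);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            float m = src.depth[static_cast<std::size_t>(2 * y) * src.width + 2 * x];
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const int sx = std::min(2 * x + dx, src.width - 1);
                    const int sy = std::min(2 * y + dy, src.height - 1);
                    m = std::max(m, src.depth[static_cast<std::size_t>(sy) * src.width + sx]);
                }
            }
            dst.depth[static_cast<std::size_t>(y) * dst.width + x] = m;
        }
    }
    return dst;
}

// Reduce all the way to 1x1 and read the single remaining texel.
float reduceToMaxDepth(Mip level) {
    while (level.width > 1 || level.height > 1) {
        level = downsampleMax(level);
    }
    return level.depth[0];
}
```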
that was basically all I needed to get good CSM precision, all the other little features and strategies are just icing on the cake
that and receiver plane depth bias
anyways from 1998 to 2015, konami managed to make some of the best graphics for their hardware
mgs3 looks phenomenal for ps2
looking at my scene through the view matrix of the light cascade is so fun
the cullings real weird, probably because I just lazily set my view to the cascade view
also it looks like ctrl / will comment out a whole line, nice
I forgot that my hzb sampler is designed for perspective projection matrices. oops
yes
debug draw time
any recommendations
First thing you probably wanna do is have the option to visualize the cascades
like I have here
the cascade depths?
I have that
this is another handy debug view
lets you see where each cascade starts and ends
I use a sphere fit, so there can't be stretching
Really the only oddity i see right now is with the depth visualization
maybe its just a quirk with depth buffers but, depending on orientation, its too dark to see anything
i did not expect this from a linear depth buffer, so maybe my near/far planes for the cascade proj are off
I am swapping far and near in the ortho matrix calculation for reverse z
should I instead flip their signs rather than swap them?
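A tiny sketch of the swap vs. sign-flip question, assuming the zero-to-one ortho depth mapping is depth = (d - near) / (far - near) for a point at distance d along the view direction (this is how I understand orthoZO style matrices to behave, so treat it as an assumption). `orthoDepthZO` is a made-up helper that only models the z row.

```cpp
// Depth written by a zero-to-one orthographic projection for a point at
// distance d in front of the camera. Passing the planes in swapped order
// models "swap near and far for reverse z".
float orthoDepthZO(float d, float zNear, float zFar) {
    return (d - zNear) / (zFar - zNear);
}
```

With near = 1 and far = 100, the normal call maps 1 to 0 and 100 to 1. Swapping the arguments maps 1 to 1 and 100 to 0, which is reverse z. Negating both instead gives (d + 1) / (-99), which is negative for everything in front of the camera, so under this mapping swapping looks like the right move.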
yeah draw the frustums
wow gp'ers really found out how to bikeshed getting world space coords from g buffers
```glsl
vec3 ReconstructFragmentWorldPositionFromDepth(float depth, vec2 screenSize, mat4 invViewProj) {
    float z = depth * 2.0 - 1.0; // [0, 1] -> [-1, 1]
    vec2 position_cs = gl_FragCoord.xy / screenSize; // [0.5, screenSize] -> [0, 1]
    vec4 position_ndc = vec4(position_cs * 2.0 - 1.0, z, 1.0); // [0, 1] -> [-1, 1]
    // undo view + projection
    vec4 position_ws = invViewProj * position_ndc;
    position_ws /= position_ws.w;
    return position_ws.xyz;
}
```
do you want credit for this lol
no
this is an amalgam of all sorts of things too
ah look whos typing, school is over it seems 😄
yay
wouldnt be better 🤓 ☝️
vec2 position_cs = gl_FragCoord.xy / (screenSize - 1);
sadly not 💀
In 2 weeks I have oral exams
ive set gl's ndc z range from 0 to 1
thats what she said
- defence of my thesis 😭
so im guessing that first line is unneccasary if i could spell the word right
yeah
why's that?
because the .xy is an index so dividing that with screen size will never be 1
it will be like 0.99
i think i understand
yeah fragpos.x is 0..1919
and im assuming this works out of box with reverse z?
if we assume 1920
so for example the res is 1280 on x coord and max index is 1279 soo 1279/1280. it will get close but never reach 1
yes
at worst you will see 💀
not sure how one would debug this
debug what?
why is it called viewProj if it's proj * view
some math stuff that I dont remember
i wonder why my screen dimensions are 1440x810
🤷♂️
Cause it's evaluated right to left but we read left to right?
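A small sketch of the right-to-left point, using column vectors and a column-major matrix: in proj * view * v the view matrix touches the vector first. The "proj" here is just a uniform scale and the "view" a translation so the numbers are easy to follow; none of these helpers come from a real library.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>; // column-major: m[column][row]

// M * v for a column vector v.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[row] += m[col][row] * v[col];
    return r;
}

// A * B: each column of the result is A applied to a column of B.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int col = 0; col < 4; ++col) r[col] = transform(a, b[col]);
    return r;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    m[3] = { x, y, z, 1.0f };
    return m;
}

Mat4 scaling(float s) {
    Mat4 m{};
    m[0][0] = m[1][1] = m[2][2] = s;
    m[3][3] = 1.0f;
    return m;
}
```

Take the point (3, 0, 0), view = translation(-1, 0, 0) and proj = scaling(2). proj * view moves the point to 2 and then scales it to 4. view * proj scales to 6 first and then moves it to 5. So "viewProj" names the order the transforms are applied in, not the order they are written.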
ive been using ssbos so much i forgot the difference between them and ubos
Bruh
looks like ubos are better for smaller buffers
They lack extendable arrays and are generally faster
ssbos support more storage
extendable arrays?
ubos are only guaranteed to have 16kb at least
Shit you put in the last place in ssbo []
and that might even vary between driver versions and vendors ofc
I see
looks like ive been using ssbos where i should be using ubos
there is no "must use x over y"
I doubt you will notice perf diff
yeah
you could just use ssbos if you wanted
i probably will
im getting more comfortable with std430
and iirc vulkan doesnt have the concept of ubo vs ssbo
It's the least problematic layout
i use ubo for my global uniforms and debug settings buffers, everything else is bound as ssbo
and i group my shit in multiples of 16 bytes anyway, so no problem with std140 with ubo
It does
I havent touched opengl in years but it isnt the same exactly
oh ok
i keep forgetting that gl is by default -z forward
has nothing todo with opengl
yeah
alright i have some cascade visualization
and it seems like near shadows are functioning
somewhat
Lmao what is that
lemon with a face saying were gonna make it
ill make shadow maps, youll make an audio engine, lukasino will graduate, etc
you submitting to GP-Direct then?
this is fixed now
setting the color to the world pos is consistent and looks accurate
heres another look
I wonder if its working fine and this is just incredibly bad shadow acne, especially for the far cascade? I would love second opinions
the maps are 512x512
2048x2048 maps have identical artifacts, making me doubtful
it almost looks like the shadow maps themselves arent correctly aligned with the camera frustum
look at how that shadow just walks away at the end, which is why i believe the above
@plain mantle any thoughts
looking a lot better now
the issue is related to the near and far planes of my ortho matrix
i dont see how that could be the issue since depth clamp is enabled
using orthoZO fixes some issues
my current theory is that using reverse z is messing up gl_depth_clamp
IS THIS IT!?!?!?!?!?!?
left side down the street
thats not shadow acne
only happens for the spherical cascade code, so ill stick to tight cascades for now
finally time to use this
oh i need world space normals now
@gilded shell this is how I calculate the cascade spheres
```cpp
void camera::calculate_cascades()
{
    size_t num_cascades = 4;
    float clip_distance = (zFar - zNear);
    float min_dist = shadow_min;
    float max_dist = shadow_max;
    float lambda = shadow_lambda;
    float minZ = zNear + min_dist * clip_distance;
    float maxZ = zNear + max_dist * clip_distance;
    float range = maxZ - minZ;
    float ratio = maxZ / minZ;
    if (!manual_split)
    {
        cascade_splits.resize(num_cascades);
        for (size_t i = 0; i < num_cascades; i++)
        {
            float p = (i + 1) / static_cast<float>(num_cascades);
            float log = minZ * std::pow(ratio, p);
            float uniform = minZ + range * p;
            float d = lambda * (log - uniform) + uniform;
            cascade_splits[i] = (d - zNear) / clip_distance;
        }
    }
    cascades.clear();
    for (size_t i = 0; i < num_cascades; i++)
    {
        auto c_near = zNear + (i == 0 ? min_dist : cascade_splits[i - 1]) * clip_distance;
        auto c_far = zNear + cascade_splits[i] * clip_distance;
        auto sphere = calculate_cascade_sphere(fovy_degrees, aspect, c_near, c_far);
        auto aabb = calculate_cascade_aabb(fovy_degrees, aspect, c_near, c_far);
        cascades.emplace_back(sphere, c_near, c_far, aabb);
    }
}

bounding_sphere calculate_cascade_sphere(float fovy_degrees, float aspect, float z_near, float z_far)
{
    auto fsubn = (z_far - z_near);
    auto fandn = (z_far + z_near);
    float k = coMath::sqrt(1.0f + aspect * aspect) * tan(coMath::to_rad(fovy_degrees) / 2.0f);
    if (k * k >= fsubn / fandn)
    {
        return bounding_sphere{ vec{ 0.0f, 0.0f, z_far }, z_far * k };
    }
    else
    {
        return bounding_sphere{
            vec{ 0.0f, 0.0f, 0.5f * fandn * (1.0f + k * k) },
            0.5f * coMath::sqrt(fsubn * fsubn + 2.0f * (z_far * z_far + z_near * z_near) * k * k + fandn * fandn * powf(k, 4.0f))
        };
    }
}

AABB calculate_cascade_aabb(float fovy_degrees, float aspect, float z_near, float z_far)
{
    auto inv_cascade_persp = perspectiveLH(fovy_degrees, aspect, z_near, z_far).inverse();
    auto points = pointsFromAABB(vec{ 1.0f }, vec{ -1.0f });
    for (auto& point : points)
    {
        auto p = inv_cascade_persp * vec4(point, 1.0f);
        point = p.xyz() / p.w;
    }
    return find_AABB(points);
}
```
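The split part of the code above can be pulled out into a standalone function to play with the numbers. This is only a sketch of the same log/uniform blend, returning absolute split distances rather than the normalized ones stored in cascade_splits; `computeCascadeSplits` and its parameters are names I made up.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Far distance of each cascade between minZ and maxZ.
// lambda = 0 gives evenly spaced splits, lambda = 1 gives logarithmic ones,
// and values in between blend the two.
std::vector<float> computeCascadeSplits(float minZ, float maxZ,
                                        std::size_t count, float lambda) {
    std::vector<float> splits(count);
    const float range = maxZ - minZ;
    const float ratio = maxZ / minZ;
    for (std::size_t i = 0; i < count; ++i) {
        const float p = static_cast<float>(i + 1) / static_cast<float>(count);
        const float logSplit = minZ * std::pow(ratio, p);
        const float uniformSplit = minZ + range * p;
        splits[i] = lambda * (logSplit - uniformSplit) + uniformSplit;
    }
    return splits;
}
```

For minZ = 1, maxZ = 100 and 4 cascades, lambda = 0 puts the splits at 25.75, 50.5, 75.25 and 100, while lambda = 1 puts them near 3.16, 10, 31.6 and 100, so the log scheme spends far more resolution close to the camera.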
college websites are always, without fail, the worst websites ive had the misfortune of needing to navigate
got one for normals?
i think that came up a few times on the server, try with the searchbox
maybe i wont need it
once you have position it's just normalize(cross(dFdx(pos), dFdy(pos)))
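A CPU analogue of that one-liner, treating dFdx and dFdy as differences to the neighboring pixel's position. This is only a sketch: real derivatives are computed per 2x2 quad, and the sign of the result depends on your handedness and which way y runs on screen. `Vec3` and the helpers are placeholders.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// pos: position at this pixel; posRight / posUp: positions reconstructed
// at the neighbors one pixel over in x and y (stand-ins for dFdx / dFdy).
Vec3 normalFromNeighbors(Vec3 pos, Vec3 posRight, Vec3 posUp) {
    return normalize(cross(sub(posRight, pos), sub(posUp, pos)));
}
```

Because it only sees the rendered surface, the result is a flat face normal, and it breaks down at depth discontinuities where the neighbor belongs to a different object.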
i want to do normal displacement for shadow acne and i can just do that in viewspace before going to screen space
or that
that seems much clearer
wait
@echo token wont that be more of an approx
well yeah, it's not the actual normal, it's just based on the gradient of the surface you're rendering
plus it's subject to float precision issues
yes
alright well i dont see how it would hurt to test
bro what is this
i have bigger problems than bias bro
seems to happen near ends of the first cascade
its not outside the texture bounds
perhaps thats just where the far plane is?
yep its the far plane
Oh boy! I love having near and far planes for my shadow maps!
so my issue is objects too far are clamped to the far plane but as far as i can tell you can’t differentiate between the far plane and fragments clamped to the far plane
is there any solution to this besides deepening the far plane?
i think ill post in questions
there is a shadow discrepancy here
im shading fragments black if the depth is equal to the far plane, so that's not the issue here
ah here we go
the frag coords are too far left in the shadow map to sample anything
The far plane of the cascade or of the view frustum?
cascade
this new error is much more troublesome though
im following Alex Tardiff's frustum calc
well anything past the far plane of the cascade shouldn't be visible so it shouldn't be an issue
Can you explain the new issue?
in theory the cascade eye should be around here: glm::vec3 eye{ frustumCenter - (2.0f * radius * lightDir) };
but im using frustumCenter + lightDir because that one doesnt work
maybe I should try using + instead of - there
is lightdir the direction of the world to the light, or the world from the light?
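A sketch of why the sign depends on that convention. It assumes lightDir is a unit vector pointing from the light toward the scene (the direction the light travels); under that assumption the minus sign backs the eye up against the light. If your lightDir points toward the light instead, the plus version is the one that ends up on the light's side. `Vec3`, `cascadeEye` and `dot` are illustrative, not Alex Tardiff's actual code.

```cpp
struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Place the shadow camera 2 * radius "upstream" of the cascade center.
// Assumption: lightDir points from the light toward the scene.
Vec3 cascadeEye(Vec3 frustumCenter, Vec3 lightDir, float radius) {
    return { frustumCenter.x - 2.0f * radius * lightDir.x,
             frustumCenter.y - 2.0f * radius * lightDir.y,
             frustumCenter.z - 2.0f * radius * lightDir.z };
}
```

For a light shining straight down, lightDir = (0, -1, 0), a center at the origin and radius 5, the eye lands at (0, 10, 0), above the scene and looking down along lightDir. A quick check is that dot(center - eye, lightDir) comes out positive.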
its the red arrow?
yes
Do you have debug draw?
It helps to draw these vectors into the scene
so you can make sure things are pointing the way you think they are
im not sure how that would work in this scenario
I just have a function in my renderer to draw a wireframe line
renderer->add_wireframe_line(ray0_origin, ray0_origin + ray0_dir, vec3{ 0.0f, 0.0f, 1.0f });
its super helpful
how about i look at the scene thru the light perspective once again
alright it looks like i have shadows again but the previously mentioned bug still persists:
that blue region on the left
fragments in the near cascade trying to read out of bounds from the shadowmap
basically, my matrices are once again cucked
im gonna play roblox
ironically roblox has some of the worst shadows ive seen in a game
i cant forget about the shadows
shadows the shadows
@night shoal do you say gross like gr-oh-ss or gr-aw-ss
I think he says groß
grossss
like jake did
i dont think german has words which have an aw sounding aw in it besides aaaaaaawwwwwwwww 🙂
their render pipeline is abysmal for how OP their render team actually was/is
ah, i was watching a video where the speaker had some germanic accent but said gross like “grawss”
i wonder whats up with that
maybe because roblox is coming up on 2 decades old?
iirc the engine devs are paid very well too, like $200k
there are a lot of germans, when speaking english try to sound american, mayhaps thats why
Fun fact: not sure if this is an engine quirk or the choice of the game dev, but i notice when some objects get further away they swap to gauraud shading
i love per vertex shading
for example: this zeppelin from zeppelin wars
maybe thats actually what happens
to reduce fragment shader complexity, they just do gouraud for distant objects
gauraud 😛
mobile mobile mobile
if you've seen their earnings reports, all the shit on the devforum/wiki, and basically everything else, mobile is their biggest growth market and revenue market, so their renderer has to be mobile compatible
yeah
they have some interesting stuff to chunk and spatial hash things on the CPU, if you've ever renderdoc'd it
Alright shadow cascades are stable now
but theyre still being sampled out of bounds (the blue fragments)
im using Alex Tardiff's matrix calc code https://alextardif.com/shadowmapping.html
but I swapped the minus for a plus on this line: glm::vec3 eye{ frustumCenter + (2.0f * radius * lightDir) };
@echo token you said you implemented his code, right? Did you experience any similar problems?
I don't remember, I just remember it sucked because I copied it wrong until I got it right then it was fine
im still very undecided on my next project
I want to cap it to 6 months
either a 2d game or gcn/ps2 styled scene
in other news im finally starting to learn a song on piano
this: Track 24 of Cave Story's soundtrack.
Complete extended soundtrack download:
http://www.mediafire.com/?ijnqxzykcnz
fixed some acne and added hardware filtering. bug is still there but overall its looking a lot better now
screwing around with pcf + shadow sampling. Im not really sure what im doing but the results are a bit pretty
its pretty ugly up close
and the cascade transition is too obvious
otherwise its hot
once i fix CSM and occlusion culling, I want to fix the meshlet batching atomic bottleneck
// P00 and P11 are indices [0][0] and [1][1] of the ortho proj matrix
// assumes a symmetric ortho projection (no x/y translation terms)
bool projectSphereView(vec3 center, float radius, float znear, float P00, float P11, out vec4 aabb)
{
    // reject spheres that touch or cross the near plane
    if (center.z < radius + znear) return false;
    float minx = center.x - radius;
    float maxx = center.x + radius;
    float miny = center.y - radius;
    float maxy = center.y + radius;
    // view space -> clip space (ortho, so no divide by depth)
    aabb = vec4(minx * P00, miny * P11, maxx * P00, maxy * P11);
    // clip space -> uv space
    aabb = aabb.xwzy * vec4(0.5f) + vec4(0.5f);
    return true;
}
holy bikeshed
I need to somehow communicate to my meshlet batcher whether a view is ortho or perspective
i just hardcoded enum vals
now im wondering about occlusion culling for the shadow maps
I definitely need it, but it will take front face culling from me 😦
which has been invaluable in fighting acne
nvm i dont think it will
anyways
seems all my meshlet bounding spheres fail this test
c.z < r + znear
actually nvm
they fail elsewhere
in vblanco's old culling from vkguide 1 I think he just uniform branches at the top level of the shader
if (IsOrtho) {
    CullOrtho();
}
else {
    CullPerspective();
}
it seems much easier to just have them isolated and separate compute invocations
really the only difference for me is some calculations in the occlusion cull
so i just branch there, and since the branches are smaller i imagine theyre faster or occasionally free
easy projection_matrix[3][3] == 1.0
nah i wanted to be more explicit
make it a function
not sure what you mean
bool is_projection_matrix_ortho(f32mat4x4 projection_matrix) {
    return projection_matrix[3][3] == 1.0;
}
Sorry, I meant I want to let the user explicitly decide
yeah, much more principled than anything else
ortho occlusion culling
really only seems to cull small meshlets
it might be a bug, but I theorize the bounding spheres are just so big
and since theres no perspective to shrink them, they stay big
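A rough sketch of the theory above. Under an ortho projection a sphere's screen footprint does not depend on depth, while under perspective it shrinks roughly with 1/depth. A bigger footprint usually means a coarser Hi-Z mip gets sampled, which makes the occlusion test more conservative. All three helpers are hypothetical, the perspective width is a simple approximation rather than an exact sphere projection, and the mip pick (floor of log2 of the footprint in pixels) is an assumed common convention, not necessarily what this engine does. Whether any of this explains the behaviour here is not established.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helpers. P00 is element [0][0] of the projection matrix.

// Ortho: NDC width of a sphere is independent of its depth.
float orthoNdcWidth(float radius, float P00)
{
    return 2.0f * radius * P00;
}

// Perspective (approximation): NDC width shrinks with view depth.
// viewDepth is the positive distance along the view direction.
float perspectiveNdcWidth(float radius, float P00, float viewDepth)
{
    return 2.0f * radius * P00 / viewDepth;
}

// Assumed convention: pick the Hi-Z mip where the footprint is about one texel.
float hizMipLevel(float ndcWidth, float pyramidSizePx)
{
    const float footprintPx = ndcWidth * 0.5f * pyramidSizePx; // NDC -> UV -> pixels
    return std::floor(std::log2(std::max(footprintPx, 1.0f)));
}
```

For example, a radius 1 sphere with P00 = 0.1 covers about 102 pixels of a 1024 pixel pyramid in ortho no matter how far away it is, so it always lands on mip 6.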
@junior sparrow I need a second opinion
wdym
how do you calculate bounding spheres?
i believe whatever meshopt gives me
Do you get those from meshoptimizer? If yes then they should be fine
maybe I couldve worded that better
maybe the meshlets themselves are too big
this is from jasmine
at least spatially
there is special case for orthographic projection for hiz culling
Nah
It's more like your culling code is broken
I fixed occlusion culling. The camera and all shadow cascades have it now.
Despite that, my worst fears have been confirmed: this engine's performance is utter garbage
the vast majority of that frametime is spent on the cluster batcher
i think its finally time to learn what these mean
im gonna guess throughput is along the lines of "recruitment"
compute warp latency is about 4 million cycles throughout the cluster batch shader execution?
that doesnt sound very good
whatever "Sync Q Waiting" is, 100% is stalled
I would love for anyone to explain to me the meaning of these
what is it?
also it would be beneficial that you would add hot reloading for shaders
compute shader where I write out the index buffers of visible clusters
i have that
👀
they all try to atomicAdd the same 4 trackers though, which I imagine is the main bottleneck
one "index" for each triangle?
yeah I followed bevy verbatim there
it should be a little bit cheaper
but join the dark force with daxa and enjoy nice things 😈
next project for sure
thatd still leave me with this
that stalling and latency is probably the threads waiting their turn to write to that var
you could instanced render meshlets and discard vertex invocations
i dont think thats related to the problem
the atomicAdd tracks the count of indices
but the rendering will be a little bit slower
i imagine the instanced render would also have a "count" of some sort
only the count of visible meshlets
yeah but thats still atomicAdd for each visible meshlet
I have a lot of visible meshlets
thats already what im doing
i do also have a bitfield for meshlet visibility
that uses atomic operations but only 8 times per byte and the results are unused so im hoping its free
maybe I could come up with a solution using that?
but not as much the indices
literally 100x less
not as much as me
do you use an atomicAdd per meshlet?
I was using it
how did you drop it
ehhh I am doing things differently
I would suggest dropping the "index buffer generation" and instance render the meshlets
and some waste vertex invocations
you would have less VRAM usage and faster rendering
what does the batcher output?
just the meshlet information, i.e. which indices (vertex, micro, etc.)
and render it
it just fills a buffer with passing meshlets?
yep
sort of like a multidraw indirect buffer
you can do it directly in culling shader
yeah do you use mesh shaders?
yeah i probably shouldve considered mesh shaders since day 1
start rewrite in daxa 😈
yeah 😈
and fully replace it with mesh shaders
will having a lot of mesh shaders that contribute zero triangles be a problem?
🤨
what
will having a lot of mesh shaders that contribute zero triangles be a problem?
what do you mean by that
sorry
a lot of mesh shader invocations
because a lot of meshlets will be dropped due to culling
just dont execute mesh shaders for them
have separate culling shader where you only write out meshlets that survived and execute mesh shader for them
simple as that
alright
how do i write out surviving meshlets without using an atomicAdd to count them lol
you have to use it here
here its fine because you are not spamming it for each triangle
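A CPU-side sketch of the compaction being described: one atomic add per surviving meshlet reserves a slot, and only the meshlet id is written out, with no per-triangle index writes. SurvivorList and cullInvocation are hypothetical names, and std::atomic stands in for a GLSL atomicAdd on a buffer counter. The real visibility test (frustum plus Hi-Z) is replaced by a plain bool.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the GPU buffers written by a culling pass.
struct SurvivorList {
    std::atomic<std::uint32_t> count{0};   // plays the role of the atomic counter
    std::vector<std::uint32_t> meshletIds; // compacted ids of visible meshlets
    explicit SurvivorList(std::size_t capacity) : meshletIds(capacity) {}
};

// One call per meshlet, like one compute invocation per meshlet.
// Culled meshlets never touch the counter.
void cullInvocation(std::uint32_t meshletId, bool visible, SurvivorList& out)
{
    if (!visible) return;
    // Single atomic add: reserve the next free slot.
    const std::uint32_t slot = out.count.fetch_add(1, std::memory_order_relaxed);
    if (slot < out.meshletIds.size())
        out.meshletIds[slot] = meshletId;
}
```

The final count would then feed the indirect dispatch or draw, so later stages only run for meshlets that survived. On a GPU the slot order is not deterministic. The sequential order in this sketch is only an artifact of calling it from one thread.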
i’m already not
like i said im currently doing one atomicAdd per meshlet and its already slow
do you have the code on github?
uhhh yeah
its pinned in this thread
you can look in occluder_batch.comp
it’s cluster_batch.comp minus the culling so theres less code to look at
this is the slow part
lemme comment it out and see how we do
i really thought the atomicadd would be the slow part
yeah youre right lukasino
removing that brings frametime down to 10ms
66% reduction
get rid of it and instance render the meshlets
how are you calculating the matrices from the bounding spheres? I dont see that here
The cascade matrix?
yeah
It's not shown there yeah. I'll see if I can dig it up in a sec
It's part of another function
There's a whole snapping thing
im finding that I can fix those out of bounds issues with proj = glm::orthoZO(2 * -radius, 2 * radius, 2 * -radius, 2 * radius, 10.0f * radius, 10.0f * -radius);
but one would think i shouldnt have to multiply radius by 2 here
even multiplying by 1.5 seems to be a valid fix
I switched to 1 index per triangle eventually, and later deleted the index buffer entirely and just wrote meshlet Ids to a buffer.
Slow rendering, but way less vram usage and faster culling, yeah
Biasing is fixed. I still wanna work on the PCF
if I use textureGather I get this
with texture I get this due to interpolation
I can't decide which one I hate less
Are you using hardware pcf?
yeah
You have the sampler set to linear?
yeah
And you are using sampler2dshadow
yes
You are definitely doing something weird with your pattern cause the pixels shouldn't look so defined
Can I see your pcf shader code?
Here's the smoother one
float shadow = 0.0;
for (int x = 0; x < 2; ++x)
{
    for (int y = 0; y < 2; ++y)
    {
        // taps at -1 and +1 texel on each axis
        vec2 offset = vec2((x * 2) - 1, (y * 2) - 1) * texelSize;
        shadow += texture(shadowMap, vec4(projCoords.xy + offset, layer, projCoords.z));
    }
}
shadow /= 4.0;
this one does use the interp
i should probably be using textureOffset
Ehh it doesn't matter too much
oh yeah
for some reason textureGatherOffset offset is in texels while textureOffset offset is in UV
nvm then
Why are you multiplying the coordinates by 2?
Have you looked at mjp's shadow sampler?
There's a bunch of examples of different pcf kernels
Yeah it's smoother but I can still see the pixels
I suppose that's just a limit of the naive pcf kernel
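For reference, a CPU sketch of what a naive PCF kernel computes, assuming a plain float depth array and nearest-texel comparisons (no hardware bilinear filtering). DepthMap and naivePcf are hypothetical names, not from MJP's sample or this engine. Each tap is a binary lit/shadowed test, so a kernel with N taps can only produce N + 1 distinct shadow values. That could be one reason small kernels still show texel-sized steps.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU depth map: row-major, one depth value per texel.
struct DepthMap {
    int width;
    int height;
    std::vector<float> depth;

    // Clamp-to-edge lookup.
    float at(int x, int y) const
    {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return depth[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                     + static_cast<std::size_t>(x)];
    }
};

// Returns the fraction of taps that are lit: 1 = fully lit, 0 = fully shadowed.
// halfKernel = 1 gives a 3x3 kernel, halfKernel = 2 gives 5x5, and so on.
float naivePcf(const DepthMap& map, int texelX, int texelY,
               float receiverDepth, int halfKernel)
{
    int lit = 0;
    int taps = 0;
    for (int dy = -halfKernel; dy <= halfKernel; ++dy) {
        for (int dx = -halfKernel; dx <= halfKernel; ++dx) {
            // Lit when the receiver is not behind the stored occluder depth.
            if (receiverDepth <= map.at(texelX + dx, texelY + dy)) ++lit;
            ++taps;
        }
    }
    return static_cast<float>(lit) / static_cast<float>(taps);
}
```

Hardware PCF with a linear sampler blends the four comparison results around each tap, which is what smooths out those discrete steps.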
The one I use is labeled "OptimizedPCF" in MJP's sampler
I see
i dont think they look too bad atm
im happy with how it is right now
especially with stabilization it doesnt look half bad 🙂
all thats left is the stupid matrix calc bug
yeah that looks pretty good
makes the bistro look so much nicer now
what is next after CSM?
Maybe some SSAO?

