#GrubStomper29's OpenGL Sandbox


night shoal
#

or 1d

night shoal
gilded shell
gilded shell
#

I think after shadow mapping is done i’ll implement pbr and move on from this project

#

i think a big source of burnout is the codebase’s mess

#

too much codependency

#

regardless i cant make this the THIRD project where i plan to implement pbr but move on before i do so

#

anyways my current plan going forward is to make slightly smaller projects with 1-3 month deadlines to encourage more work

gilded shell
#

after watching The Phantom Menace, I’m inspired to work on swordplay. It will be challenging to get something unique, understandable, and fun up though

kind sparrow
#

nice

#

keep us updated

gilded shell
#

man i havent touched this in so long

#

i have no idea what specifically I need to do

#

i know I was working on multiview but i dont remember specifically what part

gilded shell
#

thats probably not good

plain mantle
gilded shell
gilded shell
#

im not sure why

#

oh I was using a parameter wrong

#

anyways I just made it so there are viewBuffers containing view info like camera frustum and matrices

#

next I need to put them to use in the shaders

gilded shell
#

oh btw heres that ups vs china post i talked about

gilded shell
#

ah it seems ive forgotten that textures with a depth component format can't be used for image load/store ops

#

i wish the wiki page said that

gilded shell
gilded shell
# gilded shell

If a tree falls in a forest and no one is around to hear it, does it make a sound?

night shoal
#

you wont hear it, but it will move the air

gilded shell
#

right

#

read the comic

night shoal
#

the gpt nonsense in offtopic is also rule 8 violation 🙂

gilded shell
#

really?

#

i dont think so

gilded shell
#

I really need to write shaders specific for the shadow maps

#

i bet using the ubershaders for them is a huge time waste

kind sparrow
#

Yea

gilded shell
#

man i gotta do some witchery to convert a uint to a sampler2D

night shoal
#

you mean a uint64

#

bindless handles are 64bit

#

if you dont mean bindless elaborate please 🙂

gilded shell
#

yeah not bindless

#

what im trying to do is impossible anyways since you cant put a sampler2d in a struct array

night shoal
#

what are you trying to do

#

(pulling information out of your nose)

gilded shell
#

let my view index an arbitrary sampler

#
```glsl
struct View
{
  ...
  sampler2D hiZ_buffer;
}
```
#

such and so forth

night shoal
#

then dont put it in a struct

#

is this still opengl? or are you back in vk?

gilded shell
#

and ofc

```glsl
layout () buffer ViewBuffer
{
  View views[];
}
```
#

im in opengl

#

i could be evil and hardcode it

night shoal
#

or make it bindless

#

its an uvec2 or uint64_t then
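
A minimal sketch of the uvec2 route mentioned above, assuming the 64-bit handle comes from something like `glGetTextureHandleARB` and is written into a buffer field declared as `uvec2`. The `HandleWords` type and both helper names are made up for illustration. As far as the `GL_ARB_bindless_texture` extension goes, the `x` component carries the low 32 bits, and the shader can turn the pair back into a sampler with a `sampler2D(uvec2)` constructor.

```cpp
#include <cstdint>

// Two 32-bit words laid out like a GLSL uvec2 (hypothetical helper type).
struct HandleWords {
    std::uint32_t x; // least significant 32 bits of the handle
    std::uint32_t y; // most significant 32 bits of the handle
};

// Split a 64-bit handle so it can be stored in a buffer field declared as uvec2.
HandleWords splitHandle(std::uint64_t handle) {
    return { static_cast<std::uint32_t>(handle & 0xFFFFFFFFull),
             static_cast<std::uint32_t>(handle >> 32) };
}

// Reassemble the 64-bit value, e.g. to compare against a debugger's buffer dump.
std::uint64_t joinHandle(HandleWords w) {
    return (static_cast<std::uint64_t>(w.y) << 32) | w.x;
}
```

For a handle such as `0x0000000100000B7D`, this stores `x = 0x00000B7D` and `y = 0x00000001`, which is the order to expect when inspecting the raw buffer as 32-bit words.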

gilded shell
#

arent those locked? iirc you cant update the contents of a texture after its made bindless

west trail
#

You can

night shoal
#

i dont think so

gilded shell
#

oh okay

night shoal
#

bindless just means you have a handle to use

west trail
#

It locks something but I don't remember what

gilded shell
#

maybe sampler parameters idk

west trail
#

Something something format

#

Wiki says "none of its state"

gilded shell
#

i mean i just need to do glCopyImageSubData and image load/store ops

gilded shell
night shoal
#

the former

gilded shell
#

splendid

#

"Once a handle is created for a texture/sampler, none of its state can be changed. For Buffer Textures, this includes the buffer object that is currently attached to it (which also means that you cannot create a handle for a buffer texture that does not have a buffer attached). Not only that, in such cases the buffer object itself becomes immutable; it cannot be reallocated with glBufferData. Though just as with textures, its storage can still be mapped and have its data modified by other functions as normal."

gilded shell
night shoal
#

i dont think you are using buffer textures (TBOs)

#

thats something you would use if you cant use SSBOs (like mobile platforms)

gilded shell
#

i see

#

another day celebrating not being a mobile dev

#

what a beautiful image

#

that's the shadow map's hzb

night shoal
#

noice

gilded shell
#

this is extremely dark on mobile

kind sparrow
#

Everything is usually darker on mobile idk if it's just because the darks go darker (oled) or what

#

Or just because we put the brightness down

#

When I max out the brightness it looks completely different

night shoal
#

probably the former

echo token
#

I always have my phone brightness as low as possible for the sake of the battery

kind sparrow
#

Yeah

gilded shell
#

this did not happen before

#

hzb culling is the issue

#

I bet it has something to do with the bindless textures

#

rd would be a huge help rn

#

i mean i can tell its not sampling the right texture

#

0000000100000B7D

#

0000000100000b7d

#

the texture handle and the handle referenced in the shader match up

night shoal
#

nvidia nsight understands bindless textures

gilded shell
night shoal
#

ah

gilded shell
#

wait why am I doing this twice

#

regardless i dont see why that would cause this issue

#

as expected fixing that didnt remedy it

#

some witch has hexed my code, ill fix it tmrw

gilded shell
#

friends art peace

plain mantle
#

Very nice art peace

#

How is the piano going these days btw?

gilded shell
#

oops

#

im still taking it slowly and learning scales for now

#

i havent yet decided on a song to learn but the cave story moonsong sound track is looking real nice

#

or geothermal or living waterway from the same game

gilded shell
# gilded shell

still trying to fix this, I have confirmed that the correct texture is being read

#

my next guess is a transform error

#

but I don't see how

#

heres an interesting find

#

The top right is what's visible, and the left is the shadow view (which has the same view/proj as the camera for now)

#

the bug doesnt seem to be affecting the shadow view like it affects the camera view

#

for comparison the camera view hzb matches perfectly

#

sidenote: there are some buffer mappings that take over a millisecond each, which can account for over half the frame time. I'll have to fix that later

gilded shell
#

maybe meshlet view ids suck

#

just an idea

gilded shell
#

very interestingly if I send in my view matrix to the culling algorithm manually using a uniform mat4, the value ends up being totally different from if I use what's in the view struct

#

something something alignment

gilded shell
# gilded shell

which doesnt make sense to me since the last vals in the struct seem perfectly fine

#

unless nsight graphics buffer view doesnt expose alignment issues?

#

@kind sparrow youre online would you happen to know anything about that

kind sparrow
#

I'm not sure I understand

#

you're saying a uniform variable vs. a uniform buffer is making a difference?

gilded shell
#

they should be identical values but yes

#

meaning something is wrong with reading from the ssbo

kind sparrow
#

How different are they

#

what are the two different versions

gilded shell
#

i just used an = test within the shader frognant

#

im afk for about 30 minutes but i can check soon

gilded shell
# gilded shell

what irks me is that nsight reads correct values (at least from the last few vars in the struct)

#

maybe its this—theres a struct within the view struct in the glsl code, maybe thats messing up alignment.

#

theres no struct in the nsight buffer view; i just paste in the contents of the struct, being 6 vec4s

night shoal
#

you can format the buffer view in nsight, like you can do in render doc

gilded shell
#

idk if it supports struct within struct

night shoal
#

double check alignment rules though

gilded shell
#

hence my pasting in the 6 vec4s

night shoal
#

std140/std430 with ssbo

gilded shell
#

yeah its unintelligible

#

some really weird rule i dont really understand

#

ill see if replacing the struct with 6 vec4s does the job
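
One way to take the guesswork out of the layout rules is to mirror the GLSL struct on the CPU and let the compiler check the offsets. The sketch below is an assumption about what such a view struct could look like (a `mat4`, six `vec4` planes, a `vec4` holding camera position plus znear, and a `uvec2` handle); the real struct in this project is not shown in the log, so every member name here is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical nested struct: 6 frustum planes, each a vec4 (16 bytes).
struct FrustumStd430 {
    float planes[6][4];
};

// Hypothetical CPU mirror of a GLSL "View" stored in a std430 array.
struct alignas(16) ViewStd430 {
    float viewProj[16];      // mat4: four vec4 columns, offset 0
    FrustumStd430 frustum;   // struct of vec4s: 16-byte aligned, offset 64
    float camPosAndNear[4];  // vec4: xyz = camera position, w = znear, offset 160
    std::uint32_t hiZ[2];    // uvec2: 8-byte aligned, offset 176
    std::uint32_t pad[2];    // explicit padding so the array stride is 192
};

// These fail at compile time if the C++ layout drifts from the offsets above.
static_assert(offsetof(ViewStd430, frustum) == 64, "frustum offset");
static_assert(offsetof(ViewStd430, camPosAndNear) == 160, "camPosAndNear offset");
static_assert(offsetof(ViewStd430, hiZ) == 176, "hiZ offset");
static_assert(sizeof(ViewStd430) == 192, "array stride");
```

Sticking to `vec4`, `mat4`, and explicitly padded pairs keeps std140 and std430 in agreement. The usual trap is `vec3`, which occupies 12 bytes but is aligned to 16 in both layouts.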

gilded shell
#

alright so that fixed half of it

#

view data is correct

#

occlusion culling still simply does not function though

#

i wonder if passing sampler2D as a function parameter is illegal

#

its not

#

well I got it to sample SOMETHING and this is my result

#

uhhhh

#

occlusion culling seems to do something now

#

but it looks like transforms are actually still broken

#

ill be done for the day

gilded shell
#

yeah according to jaker in #opengl i may have discovered an esoteric driver bug

#

@junior sparrow do you have an rtx 3050?

#

iirc you did

junior sparrow
#

rtx 4080 super

#

@gilded shell

#

also never pass sampler2D

#

glsl is broken lang when it comes to this

#

lpotrick was really mad about it

gilded shell
junior sparrow
gilded shell
#

alright

junior sparrow
#

vulkan + slang when 😈

gilded shell
#

do you have an alternate recommendation lol

junior sparrow
#

nope

gilded shell
#

sorry, i meant for the sampler thing

junior sparrow
#

let the opengl and glsl burn in hell

gilded shell
#

i cant

#

i imagine uvec2 thru function param will be fine

junior sparrow
gilded shell
#

i’m good

night shoal
#

why not, try it out and observe the debug message callback

gilded shell
#

oh believe me its been in there and there are no messages

night shoal
#

then render a cube and slap that texture on it

gilded shell
#

I can do a full screen quad very similarly

#

it seems just fine

gilded shell
#

const uvec2 hiZ = views[viewId].hiZ;

#

uvec2 hiZ = views[viewId].hiZ;

#

based on my sophisticated testing method, the former gives incorrect results

#

I think const works differently in glsl than in cpp

#

so i removed all the incorrect const usage but culling is still broken

#

it really seems like im sampling the hzb correctly this time

#

i believe the issue lies elsewhere

#

being able to place breakpoints in glsl would be a godsend

junior sparrow
#

Has debugging shaders

gilded shell
#

doesnt support bindless textures

junior sparrow
#

My man switch to vulkán or at least daxa 😭

gilded shell
#

i was considering daxa

#

i should probably debug my sphere depth vals next

gilded shell
#

Alright so I set the meshlets to cull if the bounding sphere depth is calculated to be less than 0

#

I honestly don't remember if they should be more than or less than zero but obviously the sphere transformations are the problem if it just swaps like that

#

which confirms my theory—somehow I broke sphere transformations when implementing multiview

#

the first culprit would again be alignment issues but again every other value given by the ssbo seems perfectly fine

#

in other news I'm trying to implement gribb & hartmann's frustum extraction method on Scratch

#

that's the easy part; the hard part is calculating a projection matrix lol

#

in the vertex transform, I just divide x and y by z then multiply them by the half screen width divided by tan(0.5 * fov)

#

That gets me straight from view space to screen space, skipping clip space

#

so I suppose I need some sort of imaginary clip space solely for the purpose of frustum culling lol
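
The "imaginary clip space" can be read as the same projection math stopped just before the divide. The sketch below assumes +z points away from the camera and that the fov is horizontal (both assumptions, since the Scratch code is not shown); the function names are made up.

```cpp
#include <cassert>
#include <cmath>

struct ScreenPoint { float x; float y; };

// View space straight to screen space, as described above:
// divide by z, then scale by halfScreenWidth / tan(fov / 2).
ScreenPoint projectToScreen(float vx, float vy, float vz,
                            float halfScreenWidth, float fovRadians) {
    assert(vz > 0.0f && "point must be in front of the camera");
    const float scale = halfScreenWidth / std::tan(0.5f * fovRadians);
    return { vx / vz * scale, vy / vz * scale };
}

// Same math without the divide: xClip = vx / tan(fov / 2), w = vz.
// The point lies between the left and right planes when -w <= xClip <= w.
bool insideLeftRight(float vx, float vz, float fovRadians) {
    const float xClip = vx / std::tan(0.5f * fovRadians);
    return -vz <= xClip && xClip <= vz;
}
```

Because the test compares against `w` instead of dividing by it, it still behaves for points at or behind the camera, which is the main reason to cull before the divide.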

plain mantle
#

Can I see the code?

#

Are you not culling in world space?

gilded shell
#

yeah

#

gribb and hartmann extracts the frustum from the viewproj

#

top plane of the frustum looks like this

#

column major

#

"View mat" is actually supposed to be view proj

#

but I have not yet constructed the proj
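
For reference, the Gribb & Hartmann extraction can be sketched in a few lines once a viewProj exists. This version assumes a column-major matrix and OpenGL's default -1..1 clip-space z; the type aliases and function names are illustrative, not taken from the project.

```cpp
#include <array>
#include <cmath>

using Vec4 = std::array<float, 4>;   // plane (a, b, c, d): a*x + b*y + c*z + d >= 0 is inside
using Mat4 = std::array<float, 16>;  // column-major: element (row r, column c) is m[c * 4 + r]

Vec4 row(const Mat4& m, int r) {
    return { m[0 * 4 + r], m[1 * 4 + r], m[2 * 4 + r], m[3 * 4 + r] };
}

Vec4 combine(const Vec4& a, const Vec4& b, float sign) {
    return { a[0] + sign * b[0], a[1] + sign * b[1],
             a[2] + sign * b[2], a[3] + sign * b[3] };
}

// Scale the plane so (a, b, c) has unit length; skip degenerate planes
// (an infinite far plane can produce a zero normal).
Vec4 normalizePlane(Vec4 p) {
    const float len = std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
    if (len > 0.0f) {
        for (float& v : p) v /= len;
    }
    return p;
}

struct FrustumPlanes { Vec4 left, right, bottom, top, nearPlane, farPlane; };

FrustumPlanes extractFrustum(const Mat4& viewProj) {
    const Vec4 r0 = row(viewProj, 0), r1 = row(viewProj, 1);
    const Vec4 r2 = row(viewProj, 2), r3 = row(viewProj, 3);
    return {
        normalizePlane(combine(r3, r0, +1.0f)),  // left:   w + x >= 0
        normalizePlane(combine(r3, r0, -1.0f)),  // right:  w - x >= 0
        normalizePlane(combine(r3, r1, +1.0f)),  // bottom: w + y >= 0
        normalizePlane(combine(r3, r1, -1.0f)),  // top:    w - y >= 0
        normalizePlane(combine(r3, r2, +1.0f)),  // near:   w + z >= 0 (use r2 alone for 0..1 z)
        normalizePlane(combine(r3, r2, -1.0f)),  // far:    w - z >= 0
    };
}

float signedDistance(const Vec4& plane, float x, float y, float z) {
    return plane[0] * x + plane[1] * y + plane[2] * z + plane[3];
}
```

A bounding sphere is outside the frustum when `signedDistance` for its center is below `-radius` for any plane.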

plain mantle
#

Oh lol, I didn't understand when you said scratch you meant the language KEKW

#

I thought you mean you were writing it "from scratch"

junior sparrow
night shoal
#

from scratch, in scratch

#

to scratch an itch

gilded shell
kind sparrow
#

When your capabilities (mostly in math) grow then you'll be able to channel that masochism into making more challenging stuff rather than making simpler stuff in more painful tools lol

gilded shell
#

A brilliant version of Two Step w/ Bela Fleck & The Flecktones from 4/21/02.

This was the day after the very popular #41 w/ the Flecktones which was on the Central Park bonus CD. Jeff Coffin was with the Flecktones at the time. This was 6 years before Leroi Moore passed and Jeff filled in....

Filmed by Ethan Sinclair.

Clipped out render fro...

#

this is my favorite drum run ever

#

it sounds a bit messy because theres also Wooten's brother on the SynthAxeDrumitar

#

I really like the tom and snare stuff

gilded shell
#

another bake

kind sparrow
#

Looks pretty legit

night shoal
#

and crispy

gilded shell
#

indeed

gilded shell
#

Occlusion culling is fixed

night shoal
#

what was the problem?

gilded shell
#

I use a vec4 to store cam pos + znear

#

to get the znear I was stupidly fetching the z-component and not the w-component

night shoal
#

oops

gilded shell
#

yeah

#

I believe multiview is complete

#

that took far longer than it should've

#

its merged into the git repo

#

now for shadow mapping

#

or I could do some optimization first

#

i need to figure out how to use nsight profiler

gilded shell
#

it seems like the camera's orientation is somehow affecting the shadow's hzb

#

ofc that could also be a bug with the frustum culling

#

what an odd bug, there is no apparent cause

#

here's what NSight shows when I move the camera around between captures

#

so it's not an issue with the depth buffer visualizer

#

I think I narrowed it down to being an issue with the view frustums

#

or very quite possibly... alignment issues

gilded shell
#

well i read the spec and my alignment for the view struct is indeed wrong. After fixing it, the program is even more broke

gilded shell
#

its like every line i add to this junk breaks something

gilded shell
#

I've deduced its not a sync issue

#

anyways after "fixing" my alignment theres some really weird flickering

#

I fixed an incorrect near z value and most of the flickering is gone, but it still seems the camera's view frustum is somehow interfering

#

yep im correct

#

and it looks like the bug itself is in the shader

gilded shell
#

Alright

#

so I think I finally understand everything there is about glsl alignment

#

I also figured out where the bug originates:

#

I can't understand why, but viewId is incorrect

#

but ONLY when its pulling view frustum vals

#

if I hardcode 0 or 1 its correct

gilded shell
#

the buffer values are correct

#

im losing it

#

i cant fathom this

gilded shell
#

i opened a thread in #1019726067851862097

#

4 minutes

#

I don't think that's very good

night shoal
#

missing barrier perhaps?

gilded shell
#

not sure

gilded shell
#

Alright all bugs i know of are fixed

#

i dont want to look for another because im scared of finding one

night shoal
#

so it was alignment bs again after all?

gilded shell
#

sort of

#

actually no not at all

#

i just forgot to set a buffer offset basically

night shoal
#

: )

#

im happy you sorted those out

gilded shell
#

Thank you all

#

anyways it's finally time I start with shadow mapping

#

i gotta look into shadow samplers

#

@kind sparrow I think I saw you used those; were they nice to you?

night shoal
#

i dont think demon has shadows going

#

speaking of ze devil 🙂

kind sparrow
#

I do, they're just very rudimentary

night shoal
#

in srs?

kind sparrow
#

Yeah

night shoal
#

sorry for spreading misinfo 🙂

kind sparrow
#

It's just a single shadow map at short range, no cascades

#

Oh I do use a shadow sampler actually

#

But I have a manual filter kernel it doesn't look this good automatically

night shoal
#

hmm i dont rember seeing them, but they look quite neat already

gilded shell
#

so what does it do

night shoal
#

it shades the grass : >

gilded shell
#

nah i meant the shadow sampler

#

whats the benefit over vanilla

kind sparrow
#

I forget

#

It has some intrinsic depth comparison property iirc

gilded shell
#

alright

#

that sounds too simple for opengl to have added it

echo token
#

I think it's probably because once upon a time (or now) that was some kind of hardware accelerated thing (it probably piggybacked on whatever hardware tested against your depth buffer under normal circumstances)

#

one nice benefit of using a shadow sampler is that if you're using linear filtering on it, it basically gives you a little free smoothing

#

so you have to do less with your edge smoothing kernel thing
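
The "free smoothing" comes from the sampler comparing first and filtering second. Below is a CPU-side sketch of that idea, assuming a less-or-equal comparison, clamp-to-edge addressing, and plain bilinear weights; actual hardware filtering is implementation-dependent, and all names here are made up.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct DepthMap {
    int width;
    int height;
    std::vector<float> depth; // row-major, width * height values

    // Clamp-to-edge texel fetch.
    float at(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return depth[static_cast<std::size_t>(y) * width + x];
    }
};

// 1.0 when the receiver is lit (ref <= stored depth), 0.0 when it is shadowed.
// With reverse z the comparison direction would flip.
float compareTexel(const DepthMap& map, int x, int y, float ref) {
    return ref <= map.at(x, y) ? 1.0f : 0.0f;
}

// Compare the four nearest texels, then blend the 0/1 results bilinearly.
float sampleShadowLinear(const DepthMap& map, float u, float v, float ref) {
    const float tx = u * map.width - 0.5f;   // texel centers sit at +0.5
    const float ty = v * map.height - 0.5f;
    const int x0 = static_cast<int>(std::floor(tx));
    const int y0 = static_cast<int>(std::floor(ty));
    const float fx = tx - x0;
    const float fy = ty - y0;
    const float top = (1.0f - fx) * compareTexel(map, x0, y0, ref)
                    + fx * compareTexel(map, x0 + 1, y0, ref);
    const float bottom = (1.0f - fx) * compareTexel(map, x0, y0 + 1, ref)
                       + fx * compareTexel(map, x0 + 1, y0 + 1, ref);
    return (1.0f - fy) * top + fy * bottom;
}
```

A plain sampler would blend the depths and compare once, giving a hard 0 or 1; blending the comparison results is what yields fractional shadow values along edges.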

kind sparrow
#

Yeah

gilded shell
#

I forgot I made this

#

in retrospect this is actually a really well designed boss fight for being a joke project

echo token
#

if this was made in flash and released 15 years ago you would've done numbers

#

instant school computer lab classic

gilded shell
#

yeah…

#

gosh i even make sure to show the player how an attack works before actually making it dangerous

#

i was smart 4 years ago

plain mantle
gilded shell
plain mantle
gilded shell
#

GmbH?

plain mantle
gilded shell
#

i dont speak korean

#

jake it’s a satire

plain mantle
#

Not beating the allegations 🇺🇸🦅

#

I'm kidding (mostly)

gilded shell
#

the thing im most scared to do is add a third model to the scene

night shoal
#

Gesellschaft mit beschränkter Haftung (German: [ɡəˈzɛlʃaft mɪt bəˌʃʁɛŋktɐ ˈhaftʊŋ]; lit. 'company with limited liability') is a type of legal entity in German-speaking countries. It is equivalent to a société à responsabilité limitée (Sàrl) in the French-speaking part of Switzerland and to a Società a Garanzia Limitata ...

gilded shell
#

baking more baguettes today

#

they use a preferment so hopefully theyll stay fresh until monday morning when ill give one to my friends

night shoal
#

good luck

gilded shell
#

theyre good

gilded shell
#

ap calc test was done today

#

very easy imo

gilded shell
#

another potential project is buying a vocore2 and compatible screen to try and get some graphics up on that

#

id be writing a software renderer

gilded shell
#

and the cpu is mips

#

so getting hello world up will probably be the most difficult part for me

night shoal
#

not really

#

you just need a mips compiler 🙂

gilded shell
#

i see

#

anyways it looks like csm will require that my camera have some sort of far plane

#

ig ill need a second projection matrix thats not infinite

kind sparrow
#

Are you sure that's necessary

#

Seems like you could just have a manual arbitrary shadow cutoff that you use to bound the frustum

gilded shell
#

ill have to see

plain mantle
#

Yeah my CSM cuts off significantly before the far plane

#

This is extremely common in games with long draw distances

gilded shell
#

i see

plain mantle
#

Sometimes they fill in the far distance with screen space shadows

gilded shell
#

im using reverse z for my depth buffers

#

im looking at resources for light matrix calc and they partition the camera's frustum into smaller frustums, one for each shadow map

#

then they get the world space frustum corners

#

im assuming I won't need reverse z here

night shoal
#

no

#

you could probably just pick a max far, and then split it into cascades

#

while your normal camera keeps using reverse zed

gilded shell
#

yeah

#

also

#

is vec4 * mat4 the same as mat4 * vec4?

kind sparrow
#

No

gilded shell
#

looks like i wanna do mat * vec

#

the opposite is like transposed i suppose

kind sparrow
#

Yeah

#

M*v is how you act on column vectors which are how it's done basically everywhere

#

It's only ambiguous because GLSL doesn't differentiate between row and column vectors
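
A small sketch of the difference, using plain arrays that follow GLSL's column-major convention (`m[c]` is column `c`). The helper names are made up; the point is only that `v * M` gives the same result as `transpose(M) * v`.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>; // m[c] is column c, as in GLSL

// M * v: v is treated as a column vector.
Vec4 mulMatVec(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += m[c][r] * v[c];
    return out;
}

// v * M: v is treated as a row vector, so each result is dot(v, column c).
Vec4 mulVecMat(const Vec4& v, const Mat4& m) {
    Vec4 out{};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += v[r] * m[c][r];
    return out;
}

Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            t[r][c] = m[c][r];
    return t;
}

// Translation matrix with the offset stored in the last column.
Mat4 translation(float x, float y, float z) {
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    m[3] = { x, y, z, 1.0f };
    return m;
}
```

With `translation(5, 6, 7)` and the point `(1, 2, 3, 1)`, `mulMatVec` moves the point to `(6, 8, 10, 1)`, while `mulVecMat` leaves xyz alone and piles the offset into `w`, which is rarely what a transform wants.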

gilded shell
#

todo: shadow map pancaking

#

csm is going smoothly so far

#

ill also probably want pcss

night shoal
#

i suppose glm::floor has overloads for all the glm types.... and std::floor doesnt

plain mantle
#

Yeah std::floor obviously won't have overloads for glm::vec3

#

That's why I have co_math::floor

#

Also because co_math::floor is constexpr

#

But std::floor isn't (yet)

night shoal
#

coroutine based floor 😛

gilded shell
#

coroutine?

night shoal
#

a fancy thread like mechanism, nothing to worry about, but co_... is what all the coroutine bs is prefixed with in c++

plain mantle
#

I came up with the namespace before c++20

gilded shell
#

nice

gilded shell
#

oh shot

#

shoot

#

i forgot to make a shadow mapping branch

#

for my next trick, ill hardcode the amount of shadow cascades!

gilded shell
#

message to future self: my debugging setup could potentially be breaking light matrix calc by basing it on itself

plain mantle
gilded shell
#

frustum culling is so hard bruh

plain mantle
#

Is it? Just a few dot products lol, don't overengineer it

#

Like you have 6 planes right?

#

What bounding volume are you using for objects? AABBs?

gilded shell
#

dear lord

#

sorry

gilded shell
#

no clue why i said frustum culling

echo token
#

tons of good info + a demo program, it should leave you with CSM you're very happy with

plain mantle
gilded shell
#

ive been following some old blog post

#

its good but its for d3d so ive probably messed up translating the math code

plain mantle
#

The link above is also d3d

#

but the concepts should carry over

#

double check your math I guess

kind sparrow
#

CSM might be the kind of thing that is borderline impossible to debug in a methodical way until you have learned linear algebra for real

plain mantle
#

Time for a crash course in Lin-Alg @gilded shell

gilded shell
#

i mean i can see that my view matrix calc is correct for the most part

#

i understand very basic fundamentals about matrices and vectors

kind sparrow
#

The issue is just the fluency in changing basis and the geometry and coordinate spaces

gilded shell
#

right

kind sparrow
#

If you don't have a crystal clear understanding of what's going on it's really hard to debug the values

#

Like even if you do it's still tricky

#

You need to devise contrived situations where the components are recognizable

#

Otherwise it's just number soup and your only recourse is doing hand calcs to compare brute force

gilded shell
#

yeah

#

ill probably be waiting until i get a formal college class in it

plain mantle
#

Are you going to college in the Fall? or do you still have more HS to do?

gilded shell
#

no math classes yet though

plain mantle
#

what program did you apply for?

gilded shell
#

i’m going to a community college before university

gilded shell
#

so just ge like bio, english, etc

plain mantle
#

Hmm community college seems to be much more common in South Canada 🇺🇸 where you guys are

#

Maybe its a cost thing? Here you usually do one or the other

echo token
#

so uh, check your depth clamp

gilded shell
#

university is about $30k a year here?

#

but community college is about $5k a year i believe

plain mantle
#

$30k per year thats brutal

#

University here is like $7k CAD (~5k USD)

night shoal
#

absolutely crazy that you have to pay so much for basic shit

gilded shell
#

yeah

night shoal
#

its 500 european shekels per semester in many unis in europe

#

time to move to europe 🙂

gilded shell
#

switzerland looks lovely but they have mandatory conscription

#

and i dont think iceland has much demand for graphics engineers

plain mantle
#

Come up north, join us 🇨🇦

gilded shell
#

i’m so good

plain mantle
#

Ok I'll stop before we get into rule 8 territory

gilded shell
#

yeah lol

#

i’m sure canadas not as bad as it sounds

#

provided i avoid quebec

plain mantle
gilded shell
kind sparrow
#

Lmao

plain mantle
#

It's like 30C today so the frozen wasteland thing is a lie btw

west trail
gilded shell
kind sparrow
#

CC is basically like university costs in other countries

plain mantle
kind sparrow
#

Universities charge a lot of money because they fund world class research programs that are expensive although these days a lot of the money is being pilfered by admin overhead and stuff

#

So it's the worst of both worlds

gilded shell
kind sparrow
#

We already know where Jake lives lol

plain mantle
#

find me in the 6 million people

gilded shell
#

lol

gilded shell
kind sparrow
#

I already know which one it is

gilded shell
kind sparrow
#

Ah that's not the one I expected

plain mantle
gilded shell
#

i actually love celsius for baking

#

25c is so easy to remember and its the optimal dough rise temp

plain mantle
#

also 100c easy to remember for boiling water

#

212F is so weird

kind sparrow
#

Yea obviously we use metric for everything scientific

gilded shell
#

just put that jawn on the stove and crank it

kind sparrow
#

That tells you what temperature the water is when it's boiling not the temperature needed to boil it

#

I mean same thing but reverse direction

#

But the US is officially a metric country we just use the imperial units in daily life

plain mantle
#

Also freezing at zero

gilded shell
#

i mean ours freezes at 0 too

plain mantle
#

It freezes at 32F

west trail
#

Isn't it 32 or smth

#

Completely logicalsmart

gilded shell
#

well yeah but 0F water will be frozen is what i mean

plain mantle
#

yeah its -17 C no shit

gilded shell
#

and besides you of all people should remember the number 32 lol

#

language

kind sparrow
#

What matters is that you know what temperature the phase transitions happen at

#

Which metric makes easy

plain mantle
#

0 and 100

gilded shell
#

yeah

plain mantle
#

well technically 100C depends on atmospheric pressure

gilded shell
kind sparrow
#

100C at STP

gilded shell
#

i know the objective temperature will change

plain mantle
#

The boiling point changes

gilded shell
#

right

kind sparrow
#

The boiling point will be like 98C or 102C or something depending on ambient pressure

gilded shell
#

yeah ig C would be pretty useless if it changed with atmospheric pressure

#

dont forget + C after your integrals

kind sparrow
#

Yes wouldn't want someone to think your integrals were in Fahrenheit

gilded shell
#

we should really be using kelvin

kind sparrow
#

Ah yes a comfortable 283.15K day

gilded shell
#

anyways, csm

#

i’m just using large near/far z vals for the projection matrix but bad z clipping might still be a possibility

#

i heard of shadow pancaking being a possible fix but havent found a resource describing what it actually is

echo token
#

I threw the CSM building logic from the demo into a 1x1x1 compute shader and ran it after my depth pyramid build

#

so I could just sample the highest mip for depth max

#

actually it might've been numCascadesx1x1, so 4x1x1

#

that was basically all I needed to get good CSM precision, all the other little features and strategies are just icing on the cake

#

that and receiver plane depth bias

gilded shell
#

anyways from 1998 to 2015, konami managed to make some of the best graphics for their hardware

#

mgs3 looks phenomenal for ps2

gilded shell
#

looking at my scene through the view matrix of the light cascade is so fun

#

the cullings real weird, probably because I just lazily set my view to the cascade view

gilded shell
#

also it looks like ctrl / will comment out a whole line, nice

#

I forgot that my hzb sampler is designed for perspective projection matrices. oops

night shoal
#

grubby

#

its obviously wrong

#

no need to ask if its not 🙂

gilded shell
#

are you talking about this?

night shoal
#

yes

gilded shell
#

mjp says it could be

#

and i agree

#

the ortho is tightly fit to the cascade aabb

night shoal
#

debug draw time

gilded shell
plain mantle
#

like I have here

gilded shell
#

I have that

plain mantle
#

do you see the stretch?

#

in there

gilded shell
#

yes

#

was this not correct?

plain mantle
#

this is another handy debug view

#

lets you see where each cascade starts and ends

#

I use a sphere fit, so there can't be stretching

gilded shell
#

Really the only oddity i see right now is with the depth visualization

#

maybe its just a quirk with depth buffers but, depending on orientation, its too dark to see anything

#

i did not expect this from a linear depth buffer, so maybe my near/far planes for the cascade proj are off

#

I am swapping far and near in the ortho matrix calculation for reverse z
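
To sanity-check the swap, the depth an orthographic projection writes can be computed by hand. The sketch below assumes a right-handed view space (camera looks down -z) and a 0..1 depth range, which to my understanding matches the z row that `glm::orthoRH_ZO` builds; the function name is hypothetical.

```cpp
#include <cassert>

// Depth written by a right-handed, zero-to-one orthographic projection.
// viewZ is negative in front of the camera.
float orthoDepthZO(float viewZ, float zNear, float zFar) {
    assert(zNear != zFar && "near and far planes must differ");
    return (-viewZ - zNear) / (zFar - zNear);
}
```

With near = 1 and far = 101, a point at `viewZ = -1` maps to 0 and one at `viewZ = -101` maps to 1. Passing the planes swapped, `orthoDepthZO(z, 101, 1)`, gives 1 at the near plane and 0 at the far plane, and depth stays linear in between, so a mid-range point lands at 0.5 either way.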

gilded shell
night shoal
gilded shell
#

wow gp'ers really found out how to bikeshed getting world space coords from g buffers

night shoal
#
```glsl
vec3 ReconstructFragmentWorldPositionFromDepth(float depth, vec2 screenSize, mat4 invViewProj) {
    float z = depth * 2.0 - 1.0; // [0, 1] -> [-1, 1]
    vec2 position_cs = gl_FragCoord.xy / screenSize; // [0.5, screenSize] -> [0, 1]
    vec4 position_ndc = vec4(position_cs * 2.0 - 1.0, z, 1.0); // [0, 1] -> [-1, 1]

    // undo view + projection
    vec4 position_ws = invViewProj * position_ndc;
    position_ws /= position_ws.w;

    return position_ws.xyz;
}
```
gilded shell
night shoal
#

no

#

this is an amalgam of all sorts of things too

#

ah look whos typing, school is over it seems 😄

gilded shell
#

yay

junior sparrow
junior sparrow
#

In 2 weeks I have oral exams

gilded shell
#

ive set gl's ndc z range from 0 to 1

night shoal
#

thats what she said

junior sparrow
#
  • defence of my thesis 😭
gilded shell
#

so im guessing that first line is unneccasary if i could spell the word right

night shoal
#

yeah

junior sparrow
#

it will be like 0.99

gilded shell
#

i think i understand

night shoal
#

yeah fragpos.x is 0..1919

gilded shell
#

and im assuming this works out of box with reverse z?

night shoal
#

if we assume 1920

junior sparrow
#

so for example the res is 1280 on x coord and max index is 1279 soo 1279/1280. it will get close but never reach 1

gilded shell
#

excellent

#

hope yall are ready for this

junior sparrow
#

at worst you will see 💀

gilded shell
#

not sure how one would debug this

junior sparrow
#

debug what?

gilded shell
#

why is it called viewProj if it's proj * view

junior sparrow
#

some math stuff that I dont remember

gilded shell
#

i wonder why my screen dimensions are 1440x810

junior sparrow
#

🤷‍♂️

west trail
gilded shell
#

ive been using ssbos so much i forgot the difference between them and ubos

west trail
#

Bruh

gilded shell
#

looks like ubos are better for smaller buffers

west trail
#

They lack extendable arrays and are generally faster

night shoal
#

ssbos support more storage

gilded shell
night shoal
#

ubos are only guaranteed to have 16kb at least

west trail
night shoal
#

and that might even vary between driver versions and vendors ofc

gilded shell
#

looks like ive been using ssbos where i should be using ubos

night shoal
#

there is no "must use x over y"

west trail
#

I doubt you will notice perf diff

night shoal
#

yeah

#

it might have been like that 10 years ago

gilded shell
#

yeah

night shoal
#

you could just use ssbos if you wanted

gilded shell
#

i probably will

#

im getting more comfortable with std430

#

and iirc vulkan doesnt have the concept of ubo vs ssbo

west trail
#

It's the least problematic layout

night shoal
#

i use ubo for my global uniforms and debug settings buffers, everything else is bound as ssbo

#

and i group my shit in multiples of 16 bytes anyway, so no problem with std140 with ubo

junior sparrow
#

I havent touched opengl in years but it isnt the same exactly

gilded shell
#

oh ok

gilded shell
#

i keep forgetting that gl is by default -z forward

night shoal
#

has nothing todo with opengl

gilded shell
#

yeah

#

alright i have some cascade visualization

#

and it seems like near shadows are functioning

#

somewhat

gilded shell
kind sparrow
#

Lmao what is that

gilded shell
#

lemon with a face saying were gonna make it

#

ill make shadow maps, youll make an audio engine, lukasino will graduate, etc

gilded shell
#

i am the competition

#

everyone else is cooked

plain mantle
gilded shell
#

probably not

#

its just bistro with some broken shadows

gilded shell
#

setting the color to the world pos is consistent and looks accurate

#

heres another look

#

I wonder if its working fine and this is just incredibly bad shadow acne, especially for the far cascade? I would love second opinions

#

the maps are 512x512

#

2048x2048 maps have identical artifacts, making me doubtful

#

it almost looks like the shadow maps themselves arent correctly aligned with the camera frustum

gilded shell
# gilded shell

look at how that shadow just walks away at the end, which is why i believe the above

#

@plain mantle any thoughts

#

looking a lot better now

#

the issue is related to the near and far planes of my ortho matrix

#

i dont see how that could be the issue since depth clamp is enabled

plain mantle
#

I gotta get back to you on that

#

Busy atm

gilded shell
#

using orthoZO fixes some issues

gilded shell
#

my current theory is that using reverse z is messing up GL_DEPTH_CLAMP

gilded shell
#

IS THIS IT!?!?!?!?!?!?

#

left side down the street

#

thats not shadow acne

#

only happens for the spherical cascade code, so ill stick to tight cascades for now

#

finally time to use this

#

oh i need world space normals now

plain mantle
#

@gilded shell this is how I calculate the cascade spheres:
```cpp
void camera::calculate_cascades()
{
    size_t num_cascades = 4;

    float clip_distance = (zFar - zNear);

    float min_dist = shadow_min;
    float max_dist = shadow_max;
    float lambda = shadow_lambda;

    float minZ = zNear + min_dist * clip_distance;
    float maxZ = zNear + max_dist * clip_distance;
    float range = maxZ - minZ;
    float ratio = maxZ / minZ;

    if (!manual_split)
    {
        cascade_splits.resize(num_cascades);

        for (size_t i = 0; i < num_cascades; i++)
        {
            float p = (i + 1) / static_cast<float>(num_cascades);
            float log = minZ * std::pow(ratio, p);
            float uniform = minZ + range * p;

            float d = lambda * (log - uniform) + uniform;
            cascade_splits[i] = (d - zNear) / clip_distance;
        }
    }

    cascades.clear();

    for (size_t i = 0; i < num_cascades; i++)
    {
        auto c_near = zNear + (i == 0 ? min_dist : cascade_splits[i - 1]) * clip_distance;
        auto c_far = zNear + cascade_splits[i] * clip_distance;

        auto sphere = calculate_cascade_sphere(fovy_degrees, aspect, c_near, c_far);
        auto aabb = calculate_cascade_aabb(fovy_degrees, aspect, c_near, c_far);

        cascades.emplace_back(sphere, c_near, c_far, aabb);
    }
}

```
#
```cpp
bounding_sphere calculate_cascade_sphere(float fovy_degrees, float aspect, float z_near, float z_far)
{
    auto fsubn = (z_far - z_near);
    auto fandn = (z_far + z_near);

    float k = coMath::sqrt(1.0f + aspect * aspect) * tan(coMath::to_rad(fovy_degrees) / 2.0f);

    if (k * k >= fsubn / fandn)
    {
        return bounding_sphere{ vec{ 0.0f, 0.0f, z_far }, z_far * k };
    }
    else
    {
        return bounding_sphere{
            vec{ 0.0f, 0.0f, 0.5f * fandn * (1.0f + k * k) },
            0.5f * coMath::sqrt(fsubn * fsubn + 2.0f * (z_far * z_far + z_near * z_near) * k * k + fandn * fandn * powf(k, 4.0f))
        };
    }
}

AABB calculate_cascade_aabb(float fovy_degrees, float aspect, float z_near, float z_far)
{
    auto inv_cascade_persp = perspectiveLH(fovy_degrees, aspect, z_near, z_far).inverse();

    auto points = pointsFromAABB(vec{ 1.0f }, vec{ -1.0f });
    for (auto& point : points)
    {
        auto p = inv_cascade_persp * vec4(point, 1.0f);
        point = p.xyz() / p.w;
    }

    return find_AABB(points);
}
```
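
The split loop in `calculate_cascades` can be tried in isolation. This is a simplified sketch, assuming `shadow_min = 0` and `shadow_max = 1` so the cascades span the whole view range; `cascadeSplitDepths` is a hypothetical helper name and it returns view-space depths rather than the normalized values stored in `cascade_splits`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Blend of logarithmic and uniform split depths: lambda = 0 gives uniform
// spacing, lambda = 1 gives purely logarithmic spacing.
std::vector<float> cascadeSplitDepths(float zNear, float zFar,
                                      std::size_t count, float lambda) {
    assert(zNear > 0.0f && zFar > zNear && count > 0);
    std::vector<float> splits(count);
    const float range = zFar - zNear;
    const float ratio = zFar / zNear;
    for (std::size_t i = 0; i < count; ++i) {
        const float p = static_cast<float>(i + 1) / static_cast<float>(count);
        const float logSplit = zNear * std::pow(ratio, p);
        const float uniformSplit = zNear + range * p;
        splits[i] = lambda * (logSplit - uniformSplit) + uniformSplit;
    }
    return splits;
}
```

Higher `lambda` pulls the early splits toward the camera, which spends more shadow map resolution on nearby geometry.
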
gilded shell
#

college websites are always, without fail, the worst websites ive had the misfortune of needing to navigate

plain mantle
#

I feel for you

#

My uni had a horrific website

night shoal
#

i think that came up a few times on the server, try with the searchbox

gilded shell
#

maybe i wont need it

echo token
#

once you have position it's just normalize(cross(dFdx(pos), dFdy(pos)))

gilded shell
#

i want to do normal displacement for shadow acne and i can just do that in viewspace before going to screen space

gilded shell
#

that seems much clearer

#

wait

#

@echo token wont that be more of an approx

echo token
#

well yeah, it's not the actual normal, it's just based on the gradient of the surface you're rendering

#

plus it's subject to float precision issues
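
A CPU-side sketch of what the `dFdx`/`dFdy` trick computes, assuming the two derivatives are approximated by the position differences to the neighbouring pixel in x and in y (`Vec3` and `faceNormalFromNeighbours` are hypothetical names). The result lives in whatever space the positions are in, so world-space positions give a world-space normal, and it is constant across a flat triangle rather than smoothly interpolated.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);  // degenerate when the neighbours coincide
    return {v.x / len, v.y / len, v.z / len};
}

// pos: position at this fragment; posRight / posUp: positions at the
// neighbouring fragments one pixel over in screen x and screen y.
Vec3 faceNormalFromNeighbours(Vec3 pos, Vec3 posRight, Vec3 posUp) {
    const Vec3 ddx = sub(posRight, pos);  // stands in for dFdx(pos)
    const Vec3 ddy = sub(posUp, pos);     // stands in for dFdy(pos)
    return normalize(cross(ddx, ddy));
}
```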

gilded shell
#

hmm

#

i wonder if thatll be problematic

#

hold on

#

this returns world space norm?

echo token
#

yes

gilded shell
#

alright well i dont see how it would hurt to test

gilded shell
#

@echo token its not looking so hot

#

oh sorry i messed up the calc

gilded shell
#

bro what is this

#

i have bigger problems than bias bro

#

seems to happen near ends of the first cascade

#

its not outside the texture bounds

#

perhaps thats just where the far plane is?

#

yep its the far plane

#

Oh boy! I love having near and far planes for my shadow maps!

gilded shell
#

so my issue is objects too far are clamped to the far plane but as far as i can tell you can’t differentiate between the far plane and fragments clamped to the far plane

gilded shell
#

is there any solution to this besides deepening the far plane?

#

i think ill post in questions

gilded shell
#

there is a shadow discrepancy here

#

im shading fragments black if the depth is equal to the far plane, so that's not the issue here

#

ah here we go

#

the frag coords are too far left in the shadow map to sample anything

plain mantle
gilded shell
#

cascade

#

this new error is much more troublesome though

#

im following Alex Tardiff's frustum calc

plain mantle
#

well anything past the far plane of the cascade shouldn't be visible so it shouldn't be an issue

#

Can you explain the new issue?

gilded shell
#

in theory the cascade eye should be around here: `glm::vec3 eye{ frustumCenter - (2.0f * radius * lightDir) };`

#

but im using frustumCenter + lightDir because that one doesnt work

#

maybe I should try using + instead of - there

plain mantle
#

is lightdir the direction of the world to the light, or the world from the light?

gilded shell
#

uhhh

#

its the direction the light is facing in

#

so light to world?

plain mantle
#

its the red arrow?

gilded shell
#

yes
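
A small sketch of why the sign depends on that convention, assuming `lightDir` is a unit vector pointing from the light toward the scene (`Vec3` and `cascadeEye` are hypothetical names). With that convention the eye is found by stepping backwards along the light direction, so the minus sign is the consistent one; if `lightDir` pointed from the scene toward the light instead, the same position would need a plus.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// lightDir: unit vector, light-to-world (the direction the light travels).
// Returns an eye position two radii "upstream" of the cascade centre, so
// that looking from the eye at frustumCenter looks along lightDir.
Vec3 cascadeEye(Vec3 frustumCenter, Vec3 lightDir, float radius) {
    assert(radius > 0.0f);
    const float d = 2.0f * radius;
    return {frustumCenter.x - d * lightDir.x,
            frustumCenter.y - d * lightDir.y,
            frustumCenter.z - d * lightDir.z};
}
```

For a light shining straight down, `lightDir = (0, -1, 0)`, this puts the eye above the cascade centre, which is easy to confirm with a debug line.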

plain mantle
#

Do you have debug draw?

#

It helps to draw these vectors into the scene

#

so you can make sure things are pointing the way you think they are

gilded shell
#

im not sure how that would work in this scenario

plain mantle
#

I just have a function in my renderer to draw a wireframe line

#

renderer->add_wireframe_line(ray0_origin, ray0_origin + ray0_dir, vec3{ 0.0f, 0.0f, 1.0f });

#

its super helpful

plain mantle
#

You will get it eventually

#

You are already making pretty good progress

gilded shell
#

how about i look at the scene thru the light perspective once again

#

alright it looks like i have shadows again but the previously mentioned bug still persists:

#

that blue region on the left

#

fragments in the near cascade trying to read out of bounds from the shadowmap

#

basically, my matrices are once again cucked

#

im gonna play roblox

#

ironically roblox has some of the worst shadows ive seen in a game

#

i cant forget about the shadows

#

shadows the shadows

gilded shell
#

@night shoal do you say gross like gr-oh-ss or gr-aw-ss

plain mantle
#

I think he says groß

night shoal
#

grossss

#

like jake did

#

i dont think german has words which have an aw sounding aw in it besides aaaaaaawwwwwwwww 🙂

echo token
gilded shell
gilded shell
#

maybe because roblox is coming up on 2 decades old?

#

iirc the engine devs are paid very well too, like $200k

night shoal
gilded shell
#

Fun fact: not sure if this is an engine quirk or the choice of the game dev, but i notice when some objects get further away they swap to gauraud shading

#

i love per vertex shading

#

for example: this zeppelin from zeppelin wars

night shoal
#

maybe thats actually what happens

#

to reduce fragment shader complexity, they just do gouraud for distant objects

#

gauraud 😛

plain mantle
#

I mean it totally makes sense

#

The zeppelin looks pretty good to me

echo token
#

if you've seen their earnings reports, all the shit on the devforum/wiki, and basically everything else, mobile is their biggest growth market and revenue market, so their renderer has to be mobile compatible

gilded shell
#

yeah

echo token
#

they have some interesting stuff to chunk and spatial hash things on the CPU, if you've ever renderdoc'd it

gilded shell
#

Alright shadow cascades are stable now

#

but theyre still being sampled out of bounds (the blue fragments)

#

but I swapped the minus for a plus on this line: `glm::vec3 eye{ frustumCenter + (2.0f * radius * lightDir) };`

#

@echo token you said you implemented his code, right? Did you experience any similar problems?

echo token
#

I don't remember, I just remember it sucked because I copied it wrong until I got it right then it was fine

gilded shell
#

well

#

i’m stumped

#

i might open a post in the question thread

gilded shell
#

im still very undecided on my next project

#

I want to cap it to 6 months

#

either a 2d game or gcn/ps2 styled scene

gilded shell
#

in other news im finally starting to learn a song on piano

#

this

gilded shell
#

fixed some acne and added hardware filtering. bug is still there but overall its looking a lot better now

gilded shell
#

screwing around with pcf + shadow sampling. Im not really sure what im doing but the results are a bit pretty

#

its pretty ugly up close

#

and the cascade transition is too obvious

#

otherwise its hot

#

once i fix CSM and occlusion culling, I want to fix the meshlet batching atomic bottleneck

gilded shell
#
// P00 and P11 are indices [0][0] and [1][1] of the ortho proj matrix
bool projectSphereView(vec3 center, float radius, float znear, float P00, float P11, out vec4 aabb)
{
    if (center.z < radius + znear) return false;

    float minx = center.x - radius;
    float maxx = center.x + radius;

    float miny = center.y - radius;
    float maxy = center.y + radius;

    aabb = vec4(minx * P00, miny * P11, maxx * P00, maxy * P11);
    // clip space -> uv space
    aabb = aabb.xwzy * vec4(0.5f) + vec4(0.5f);

    return true;
}
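
A C++ port of the function above for testing it on the CPU, offered as a sketch under a few assumptions: the sphere center is in view space with +z pointing into the scene, the ortho volume is symmetric so only `P00` and `P11` matter, and `Vec3`, `Vec4` and `projectSphereOrtho` are hypothetical names.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Projects a view-space bounding sphere through a symmetric orthographic
// projection and returns its screen-space rectangle in UV space.
bool projectSphereOrtho(Vec3 center, float radius, float znear,
                        float P00, float P11, Vec4& aabb) {
    assert(radius >= 0.0f);
    // Reject spheres that touch or cross the near plane.
    if (center.z < radius + znear) return false;

    const float minx = center.x - radius;
    const float maxx = center.x + radius;
    const float miny = center.y - radius;
    const float maxy = center.y + radius;

    // Without perspective there is no divide by depth: scale straight to clip space.
    const Vec4 clip{minx * P00, miny * P11, maxx * P00, maxy * P11};

    // clip space -> uv space, with y and w swapped like the GLSL `.xwzy` swizzle.
    aabb = {clip.x * 0.5f + 0.5f, clip.w * 0.5f + 0.5f,
            clip.z * 0.5f + 0.5f, clip.y * 0.5f + 0.5f};
    return true;
}
```

Because nothing is divided by depth, a sphere's footprint is the same size at any distance, which fits the later observation that large spheres stay large under an ortho view.
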
#

holy bikeshed

#

I need to somehow communicate to my meshlet batcher whether a view is ortho or perspective

gilded shell
#

i just hardcoded enum vals

#

now im wondering about occlusion culling for the shadow maps

#

I definitely need it, but it will take front face culling from me 😦

#

which has been invaluable in fighting acne

gilded shell
#

nvm i dont think it will

#

anyways

#

seems all my meshlet bounding spheres fail this test: `c.z < r + znear`

#

actually nvm

#

they fail elsewhere

echo token
#

it seems much easier to just have them isolated and separate compute invocations

gilded shell
#

so i just branch there, and since the branches are smaller i imagine theyre faster or occasionally free

junior sparrow
gilded shell
#

nah i wanted to be more explicit

junior sparrow
gilded shell
#

not sure what you mean

junior sparrow
#
```
bool is_projection_matrix_ortho(f32mat4x4 projection_matrix) {
    return projection_matrix[3][3] == 1.0;
}
```
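
A self-contained sketch of why the `[3][3]` check works, assuming standard OpenGL-style matrices stored column-major (`Mat4`, `makeOrtho`, `makePerspective` and `isOrthographic` are hypothetical names). An orthographic matrix keeps `w = 1`, so its bottom-right element is 1; a perspective matrix copies `-z` into `w`, so that element is 0.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Column-major 4x4 matrix, indexed m[column][row] like GLSL and glm.
using Mat4 = std::array<std::array<float, 4>, 4>;

// OpenGL-style orthographic projection (clip-space z in [-1, 1]).
Mat4 makeOrtho(float l, float r, float b, float t, float n, float f) {
    assert(r != l && t != b && f != n);
    Mat4 m{};  // all zeros
    m[0][0] = 2.0f / (r - l);
    m[1][1] = 2.0f / (t - b);
    m[2][2] = -2.0f / (f - n);
    m[3][0] = -(r + l) / (r - l);
    m[3][1] = -(t + b) / (t - b);
    m[3][2] = -(f + n) / (f - n);
    m[3][3] = 1.0f;
    return m;
}

// OpenGL-style perspective projection (clip-space z in [-1, 1]).
Mat4 makePerspective(float fovyRadians, float aspect, float n, float f) {
    assert(aspect > 0.0f && n > 0.0f && f > n);
    const float t = std::tan(fovyRadians * 0.5f);
    Mat4 m{};  // all zeros, so m[3][3] stays 0
    m[0][0] = 1.0f / (aspect * t);
    m[1][1] = 1.0f / t;
    m[2][2] = -(f + n) / (f - n);
    m[2][3] = -1.0f;
    m[3][2] = -(2.0f * f * n) / (f - n);
    return m;
}

bool isOrthographic(const Mat4& projection) {
    return projection[3][3] == 1.0f;
}
```
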
gilded shell
#

Sorry, I meant I want to let the user explicitly decide

junior sparrow
#

oh 💀

#

user is me so I dont care

gilded shell
#

yeah, much more about principle than anything else

gilded shell
#

really only seems to cull small meshlets

#

it might be a bug, but I theorize the bounding spheres are just so big

#

and since theres no perspective to shrink them, they stay big

#

@junior sparrow I need a second opinion

junior sparrow
#

ehhh never had this bug

#

bigger bounding box/sphere shouldnt be a problem

junior sparrow
#

how do you calculate bounding spheres?

gilded shell
#

i believe whatever meshopt gives me

junior sparrow
#

Do you get those from meshoptimizer? If yes then they should be fine

gilded shell
#

maybe I couldve worded that better

gilded shell
junior sparrow
#

this is from jasmine

gilded shell
#

at least spatially

junior sparrow
#

there is special case for orthographic projection for hiz culling

junior sparrow
#

It's more like your culling code is broken

gilded shell
#

aw

#

i thought my math checked out

gilded shell
#

I fixed occlusion culling. The camera and all shadow cascades have it now.

#

Despite that, my worst fears have been confirmed: this engine's performance is utter garbage

#

the vast majority of that frametime is spent on the cluster batcher

#

i think its finally time to learn what these mean

#

im gonna guess throughput is along the lines of "recruitment"

#

compute warp latency is about 4 million cycles throughout the cluster batch shader execution?

#

that doesnt sound very good

#

whatever "Sync Q Waiting" is, 100% is stalled

#

I would love for anyone to explain to me the meaning of these

junior sparrow
#

also it would be beneficial that you would add hot reloading for shaders

gilded shell
#

compute shader where I write out the index buffers of visible clusters

junior sparrow
#

👀

gilded shell
junior sparrow
gilded shell
#

all indices

#

3 per triangle

junior sparrow
#

💀

#

make it one

gilded shell
#

yeah I followed bevy verbatim there

junior sparrow
#

it should be a little bit cheaper

#

but join the dark force with daxa and enjoy nice things 😈

gilded shell
#

next project for sure

gilded shell
#

that stalling and latency is probably the threads waiting their turn to write to that var

junior sparrow
#

you could instanced render meshlets and discard vertex invocations

gilded shell
#

i dont think thats related to the problem

junior sparrow
#

that way you dont need to generate the index buffer

#

you eliminate the step

gilded shell
#

the atomicAdd tracks the count of indices

junior sparrow
#

but the rendering will be a little bit slower

gilded shell
#

i imagine the instanced render would also have a "count" of some sort

junior sparrow
#

only the count of visible meshlets

gilded shell
#

yeah but thats still atomicAdd for each visible meshlet

#

I have a lot of visible meshlets

gilded shell
#

i do also have a bitfield for meshlet visibility

#

that uses atomic operations but only 8 times per byte and the results are unused so im hoping its free

#

maybe I could come up with a solution using that?

junior sparrow
#

literally 100x less

junior sparrow
gilded shell
#

do you use an atomicAdd per meshlet?

junior sparrow
#

I was using it

gilded shell
#

how did you drop it

junior sparrow
#

ehhh I am doing things differently

#

I would suggest dropping the "index buffer generation" and instance render the meshlets

#

and some waste vertex invocations

#

you would have less VRAM usage and faster rendering

gilded shell
#

what does the batcher output?

junior sparrow
#

just the meshlet information with indices like vertex, micro, etc.

#

and render it

gilded shell
#

it just fills a buffer with passing meshlets?

junior sparrow
#

yep

gilded shell
#

sort of like a multidraw indirect buffer

junior sparrow
#

you can do it directly in culling shader

gilded shell
#

yeah do you use mesh shaders?

junior sparrow
#

yes

#

you can do it without with instanced rendering with normal vertex shader

gilded shell
#

yeah i probably shouldve considered mesh shaders since day 1

junior sparrow
#

start rewrite in daxa 😈

gilded shell
#

nah

#

hey i wonder if i can fully drop the meshlet batching

junior sparrow
#

yeah 😈

gilded shell
#

and fully replace it with mesh shaders

#

will having a lot of mesh shaders that contribute zero triangles be a problem?

junior sparrow
#

🤨

gilded shell
junior sparrow
#

will having a lot of mesh shaders that contribute zero triangles be a problem?
what do you mean by that

gilded shell
#

sorry

#

a lot of mesh shader invocations

#

because a lot of meshlets will be dropped due to culling

junior sparrow
#

just dont execute mesh shaders for them

gilded shell
#

well i want to do the culling in mesh shaders

#

so i can skip the batching stage

junior sparrow
#

have separate culling shader where you only write out meshlets that survived and execute mesh shader for them

#

simple as that

gilded shell
#

alright

#

how do i write out surviving meshlets without using an atomicAdd to count them lol

junior sparrow
#

you have to use it here

#

here its fine because you are not spamming it for each triangle
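
A CPU stand-in for the "write out meshlets that survived" pass, as a rough sketch rather than anyone's actual shader: each loop iteration plays the role of one compute invocation, and a visible meshlet reserves its output slot with a single atomic add (`compactVisibleMeshlets` is a hypothetical name). On a GPU the invocations run concurrently, so the order of survivors would not be deterministic the way it is here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// visible[i] != 0 means meshlet i passed culling.
// Returns the ids of surviving meshlets, packed tightly.
std::vector<std::uint32_t> compactVisibleMeshlets(
        const std::vector<std::uint8_t>& visible) {
    assert(visible.size() <= 0xFFFFFFFFu);
    std::vector<std::uint32_t> survivors(visible.size());
    std::atomic<std::uint32_t> count{0};

    for (std::size_t meshlet = 0; meshlet < visible.size(); ++meshlet) {
        if (visible[meshlet] != 0) {
            // One atomic per surviving meshlet, not one per triangle or index.
            const std::uint32_t slot =
                count.fetch_add(1, std::memory_order_relaxed);
            survivors[slot] = static_cast<std::uint32_t>(meshlet);
        }
    }

    survivors.resize(count.load());
    return survivors;
}
```

The final count is what an indirect draw or mesh shader dispatch would read to know how many meshlets to launch.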

gilded shell
#

i’m already not

#

like i said im currently doing one atomicAdd per meshlet and its already slow

junior sparrow
#

do you have the code on github?

gilded shell
#

uhhh yeah

#

its pinned in this thread

#

you can look in occluder_batch.comp

#

it’s cluster_batch.comp minus the culling so theres less code to look at

gilded shell
#

lemme comment it out and see how we do

#

i really thought the atomicadd would be the slow part

#

yeah youre right lukasino

#

removing that brings frametime down to 10ms

#

66% reduction

junior sparrow
#

get rid of it and instance render the meshlets

gilded shell
#

i probably will

#

still gotta fix shadow mapping first

gilded shell
plain mantle
#

The cascade matrix?

gilded shell
#

yeah

plain mantle
#

It's not shown there yeah. I'll see if I can dig it up in a sec

#

It's part of another function

#

There's a whole snapping thing
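
The snapping code itself is not shown in this thread. As a generic sketch of the usual idea (not plain mantle's implementation), the cascade center is projected into light space and each coordinate is rounded down to a whole shadow map texel, so the map only ever shifts by full texels as the camera moves. `snapToTexel` is a hypothetical name, and the cascade is assumed to cover `2 * radius` world units across `resolution` texels.

```cpp
#include <cassert>
#include <cmath>

// Rounds a light-space coordinate down to the nearest texel boundary.
float snapToTexel(float lightSpaceCoord, float radius, int resolution) {
    assert(radius > 0.0f && resolution > 0);
    const float texelSize = (2.0f * radius) / static_cast<float>(resolution);
    return std::floor(lightSpaceCoord / texelSize) * texelSize;
}
```

Applying this to the x and y of the light-space cascade center before building the view matrix is one common way to reduce shimmering, and it relies on the cascade radius staying constant from frame to frame.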

gilded shell
#

im finding that I can fix those out of bounds issues with `proj = glm::orthoZO(2 * -radius, 2 * radius, 2 * -radius, 2 * radius, 10.0f * radius, 10.0f * -radius);`

#

but one would think i shouldnt have to multiply radius by 2 here

#

even multiplying by 1.5 seems to be a valid fix

spring kelp
spring kelp
gilded shell
#

Biasing is fixed. I still wanna work on the PCF

#

if I use textureGather I get this

#

with texture I get this due to interpolation

#

I can't decide which one I hate less

plain mantle
#

Are you using hardware pcf?

gilded shell
#

yeah

plain mantle
#

You have the sampler set to linear?

gilded shell
#

yeah

plain mantle
#

And you are using sampler2DShadow

gilded shell
#

yes

plain mantle
#

You are definitely doing something weird with your pattern cause the pixels shouldn't look so defined

gilded shell
#

well textureGather doesnt interpolate

#

it just returns texel results

plain mantle
#

Can I see your pcf shader code?

gilded shell
# gilded shell with `texture` I get this due to interpolation

Here's the smoother one

float shadow = 0.0;

for (int x = 0; x < 2; ++x)
{
    for (int y = 0; y < 2; ++y)
    {
        vec2 offset = vec2((x * 2) - 1, (y * 2) - 1) * texelSize;

        shadow += texture(shadowMap, vec4(projCoords.xy + offset, layer, projCoords.z));
    }
}

shadow /= 4;
#

this one does use the interp

#

i should probably be using textureOffset

plain mantle
#

Ehh it doesn't matter too much

gilded shell
#

oh yeah

#

for some reason textureGatherOffset offset is in texels while textureOffset offset is in UV

#

nvm then

plain mantle
#

Why are you multiplying the coordinates by 2?

gilded shell
#

I want the offsets -1, -1 through 1, 1

#

maybe Im approaching pcf wrong
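
For comparison, a CPU sketch of a plain 3x3 PCF kernel that visits every offset from (-1, -1) through (1, 1). The loop posted above only produces the four diagonal offsets, since `(x * 2) - 1` is either -1 or 1. This assumes a conventional depth map where smaller values are closer to the light (the comparison flips with reverse-Z), point sampling with no hardware filtering, and a hypothetical `pcf3x3` helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the lit fraction of the 3x3 neighbourhood around texel (cx, cy):
// 1 means fully lit, 0 means fully shadowed.
float pcf3x3(const std::vector<float>& shadowMap, int size,
             int cx, int cy, float receiverDepth, float bias) {
    assert(size > 0);
    assert(shadowMap.size() == static_cast<std::size_t>(size) * size);
    float lit = 0.0f;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            // Clamp to the edge so border taps reuse the nearest texel.
            const int x = std::min(std::max(cx + dx, 0), size - 1);
            const int y = std::min(std::max(cy + dy, 0), size - 1);
            const float occluderDepth = shadowMap[static_cast<std::size_t>(y) * size + x];
            // Lit when nothing closer to the light covers this tap.
            lit += (receiverDepth - bias <= occluderDepth) ? 1.0f : 0.0f;
        }
    }
    return lit / 9.0f;
}
```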

plain mantle
#

Have you looked at mjp's shadow sampler?

#

There's a bunch of examples of different pcf kernels

gilded shell
#

hey I got this

plain mantle
#

Yeah it's smoother but I can still see the pixels

#

I suppose that's just a limit of the naive pcf kernel

gilded shell
#

sorry there was a little bug

plain mantle
#

The one I use is labeled "OptimizedPCF" in MJP's sampler

gilded shell
#

I see

#

i dont think they look too bad atm

#

im happy with how it is right now

#

especially with stabilization it doesnt look half bad 🙂

#

all thats left is the stupid matrix calc bug

plain mantle
#

yeah that looks pretty good

#

makes the bistro look so much nicer now

#

what is next after CSM?

#

Maybe some SSAO?

gilded shell
#

meshlet optimization

#

because this is 30 fps

#

then pbr

plain mantle
#

Ooof

#

is this just because the geometry is so intense in bistro?

#

I bet bistro would hurt my engine

#

because my occlusion culling basically won't work on bistro

#

Do you have occlusion culling?

gilded shell
#

yeah i do

#

but the method I use to batch meshlets for rendering is so bad

plain mantle
#

what does meshlet mean in this context?

#

just a piece of a mesh?

gilded shell
#

yeah

#

about 124 triangles

#

just to make culling a little more thorough

#

in my case I write every single visible triangle out in an index buffer via compute, which is really slow

#

what I need to do is write visible meshlets then use mesh shaders to draw the varying sized meshlets