#Kakadu
I get this
(Compiling...)
Ok
This
I looked at it in RenderDoc and it is the same
Initially it had 2 mips, so I thought okay that's the problem and disabled mipmap gen. for this texture
Still the same
I used image magick to dump the pixels
it says 0 255 0
for all 4 pixels
I used png crush to remove metadata that may cause opengl to misinterpret the texture contents (chatgpt's suggestion)
: )
I also printf'd the pixels immediately after stb loads them
All seems fine
I thought
Okay ImGui renders this window/texture
What if I use this texture as the diffuse map for a material
Still the same
ok so if you read the pixels in your app and they are fine it's not the image
texels*
Yeah there's something else on the opengl side of things I think
I knew you would scold me for chatgpt
so you are saying it looks wrong in renderdoc?
Yeah, give me a sec I'll take a screenshot from renderdoc
it looks different from what it looks like in paint?
Yeah
It looks like fully green 4 pixels in paint
I also created the image in gimp, krita etc. by the way
And tried bmp and jpg
I'll also share my stb loading code & glTexImage2D line to show the texture formats I'm using
i will leave this thread now
show how you create the texture
the switch selects GL_RGB
number_of_channels is filled with the value 3
I added the case GL_RGB: return GL_RGB8 to InternalFormat() yesterday to try stuff out
internal format is GL_RGB8, format is GL_RGB, pixel data type is GL_UNSIGNED_BYTE
stb returns 12 bytes:
0, 255, 0,
0, 255, 0,
0, 255, 0,
0, 255, 0
I disabled all srgb stuff as well to no avail
This is the texture by the way
Actually, can you guys import this in your engine thingys?
I don't have an engine atm
Did you nuke your repo as well?
it's in nightly zig
I haven't launched it in forever
is your data four byte aligned?
maybe set the desired channels from 0 to 4
// *channels_in_file otherwise. If desired_channels is non-zero,
// *channels_in_file has the number of components that _would_ have been
// output otherwise. E.g. if you set desired_channels to 4, you will always
// get RGBA output, but you can check *channels_in_file to see if it's trivially
in stbi_load
Hmmm
0 again
ah bjorn beat me to it
unless you absolutely know how many channels, for greyscale values for instance, where its 1, always request 4
and upload it as GL_RGBA
But stb returns 12 correct bytes
And if I tell OpenGL that this has RGB data not RGBA data
Shouldn't it work?
maybe try it? and hope for the best
for science
Just trying to understand
Which is the case this time, since I created the texture. But I'll shut up and try it
I mean it looks right as you said
idk, it's just what I saw in your code that stood out
is that address there aligned?
how does the png look like if you upload it here
Okay now its half pink half green, I'll check the values in the debugger before sharing the screenshot to make sure I didn't do something stupid
Coming
Looks small, like a little frog
XD
its all green
this is explained in the comment for the function
2x2
// output otherwise. E.g. if you set desired_channels to 4, you will always
// get RGBA output, but you can check *channels_in_file to see if it's trivially
// opaque because e.g. there were only 3 channels in the source image.
Oh yeah right
it's RGBA output, but it reports 3 channels
so that switch statement is incorrect I think
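A minimal sketch of the fix being discussed, assuming the format switch is driven by a channel count: when `desired_channels` passed to `stbi_load` is non-zero, the returned buffer has that many channels, so the switch should use that value rather than `channels_in_file`. `PixelFormat`, `ChannelsInBuffer` and `FormatForChannels` are hypothetical names standing in for the GL enums and the engine's own code.

```cpp
#include <cassert>

// Hypothetical stand-ins for GL_RED / GL_RG / GL_RGB / GL_RGBA so the sketch
// compiles without GL headers.
enum class PixelFormat { R, RG, RGB, RGBA };

// stbi_load(path, &w, &h, &channels_in_file, desired_channels):
// with a non-zero desired_channels the buffer has desired_channels components,
// whatever channels_in_file reports about the file on disk.
inline int ChannelsInBuffer(int channels_in_file, int desired_channels) {
    return desired_channels != 0 ? desired_channels : channels_in_file;
}

// Picks the upload format from the number of channels actually in the buffer.
inline PixelFormat FormatForChannels(int channels) {
    assert(channels >= 1 && channels <= 4);
    switch (channels) {
        case 1:  return PixelFormat::R;
        case 2:  return PixelFormat::RG;
        case 3:  return PixelFormat::RGB;
        default: return PixelFormat::RGBA;
    }
}
```

With a 3-channel PNG loaded with `desired_channels = 4`, this selects RGBA instead of RGB, which matches the bytes stb actually returned.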
I am a fan of just writing my engine for my chosen formats
...
I am not going to support all the image formats under the sun
I hate that it worked
Of course, but what's the point of having < 4 channels in an image then
I mean none for my engine
Could you humor me and tell me the 3 format values I need to get this work without using 4 channels? Just for education purposes of course
Hmm I don't know anything about this, maybe this is why
I don't think it is misinfo
Isn't this correct:
internal format -> GL_RGB8 (non SRGB)
format -> GL_RGB
pixel format -> GL_UNSIGNED_BYTE
Replying to this I mean
Oh it turns out some gpus do not natively support rgb formats, but they support rgba instead
From what I've just read anyway
this looks right
why not use glCreateTextures, glTextureStorage2D and glTextureSubImage2D?
glTexImage2D is a nightmare
: (
I did try glPixelStorei(GL_UNPACK_ALIGNMENT, 1); with the initial setup I mentioned
And voila
Although it's worrying this is not mentioned in the wiki
Because opengl wiki is not MSDN
Yeah but the spec is too detailed and long
thats the most halfassed explanation for using chatgpt
you should be standing in the corner
Okay how about this
and think about this
Sebastian Aaltonen uses it
hes a moron
Whaaat
I'm defending my viewpoint that if you know how to peel the bullshit away (i.e., know the domain of your question) it is an "accelerator"
its not as dense as vulkan spec
You can't hope to scour stackoverflow faster than chatgpt that's the sad truth
you cant possibly know what is and what isnt bs with chatgpt, its a fancy markov chain generator, it doesnt know what opengl is
your standing just decreased
Code it to see for yourself
or how is it called in games
Dude of course I'm not blindly copy pasting what it says, that's what I was trying to say
I use it as a faster google
oh it's to do with how texture data is sent to the gpu
that data needs to be 4 byte aligned
Seems so yeah
But as chatgpt duly notes (imagine I'm eyeing deccer without blinking while saying this lol)
It may come with performance penalties to tell gpu to unpack
So yeah RGBA it is
But I can't imagine console games using grayscale textures, LUTs etc. with RGBA format
Though they probably merge textures together etc. to utilize the otherwise unused channels
the initial value of GL_UNPACK_ALIGNMENT is 4
Specifies the alignment requirements for the start of each pixel row in memory. The allowable values are 1 (byte-alignment), 2 (rows aligned to even-numbered bytes), 4 (word-alignment), and 8 (rows start on double-word boundaries).
so if you have a 3 byte image format the row length of the pixel rectangle has to be adjusted
so that the rows are 4 byte aligned
is what I think this means
so you can change GL_UNPACK_ROW_LENGTH or GL_UNPACK_ALIGNMENT
and have your RGB
I guess is how it works
but it's just easier to use RGBA ? seems like a waste of a byte
especially for big images
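To make the alignment rule above concrete, here is a small sketch of the row stride GL assumes when it reads client memory. `UnpackRowStride` is a hypothetical helper, not a GL call; it only mirrors the rounding described in the quoted documentation, assuming GL_UNPACK_ROW_LENGTH is left at 0.

```cpp
#include <cassert>
#include <cstddef>

// Bytes GL steps between the starts of consecutive rows in client memory.
// alignment is the GL_UNPACK_ALIGNMENT value: 1, 2, 4 or 8.
inline std::size_t UnpackRowStride(std::size_t width, std::size_t bytes_per_pixel,
                                   std::size_t alignment) {
    assert(alignment == 1 || alignment == 2 || alignment == 4 || alignment == 8);
    const std::size_t row_bytes = width * bytes_per_pixel;
    // Round the row size up to the next multiple of the alignment.
    return (row_bytes + alignment - 1) / alignment * alignment;
}
```

For the 2x2 RGB texture in this thread, stb returns rows of 6 bytes, while the default alignment of 4 gives a stride of 8. GL therefore starts the second row 2 bytes late and reads past the 12-byte buffer. With an alignment of 1, or with RGBA data (8 bytes per row), the two strides agree.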
I'll definitely go the RGBA route
feels weird to be reading opengl docs again
No escape huh
vulkan documentation is so much better
unless it's brand new
well the docs are good
is this 4.6 core?
Pat is the Pat from this server right
I googled gl spec and this is the first link
Maybe mine is Pat's version (2.0) and yours is one of the previous ones
I should add the GL spec to my syllabus as well
Regardless of whether chatgpt sucks or not, a graphics programmer should be on top of the GL spec I think
I'll be job-ready in 2039 at this rate nice
gl spec is pinned in the opengl channel too
either jake or martty posted the link back then
thats the one
Thanks for wasting time with me on this useless bug btw, both of you
Back to normal mapping
that's the same spec I am using, why do you think mine is out of date?
it was just modified last in 2022
Chapter 8.4 you posted has a different structure compared to mine
I thought so but now that I look at it, it's the same. My bad 
That's probably better for the eyes, compared to ipad
Yo, I'm back. I was implementing normal mapping and realized that if I add tangents to the vertex buffer for the instanced crates/cubes in the scene (200k instance count), fps drops immediately from 144 to 77 (halved)
This is the vertex buffer
4 vectors so 64 bytes + 1 matrix which is 64 bytes = 128 bytes
I wouldn't think this would make the gpu crawl but it does
Maybe I should squeeze the tangents into the other vectors as there are unused components
quantise them my friend
Makes sense
normals & tangents: octahedral normal encoding
UVs: normalised 16-bit ints
Looks like I got some R&D to do first, thanks for the advice folks
so in total you would then have 3*4 + 2*2 + 2*2 + 2*2 + 4*1 = 28
bytes/vertex
(incl. 8-bit vertex color)
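The 28-byte tally above could correspond to a layout like the following sketch. The struct name and the assignment of each term to an attribute (position as three floats, then octahedral normal, octahedral tangent and UV as 16-bit pairs, then the 8-bit color) are assumptions for illustration.

```cpp
#include <cstdint>

// Hypothetical packed vertex layout matching 3*4 + 2*2 + 2*2 + 2*2 + 4*1 = 28.
struct PackedVertex {
    float         position[3];  // 3 * 4 bytes
    std::uint16_t normal[2];    // octahedral-encoded, 2 * 2 bytes
    std::uint16_t tangent[2];   // octahedral-encoded, 2 * 2 bytes
    std::uint16_t uv[2];        // normalized 16-bit, 2 * 2 bytes
    std::uint8_t  color[4];     // 8-bit RGBA vertex color, 4 * 1 bytes
};

// Every member ends on a 4-byte boundary, so no padding is expected.
static_assert(sizeof(PackedVertex) == 28, "expected a tightly packed 28-byte vertex");
```

Compared to the 64 bytes of per-vertex vectors mentioned earlier, this is less than half the memory to fetch per vertex.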
By the way, does it seem normal to you guys that adding the tangents alone (not utilizing them at all in any shader) would hinder performance this much?
The reason I'm asking is tangents are not instanced as opposed to the world_transform.
Matrix is costly I get that, because it's a per instance attribute, but I don't get why the addition of a mere vector attribute is causing this.
Perhaps I'm doing something stupid, other than the tangents I mean

you generally want to reduce the amount of memory you need to access as far as possible
because that is the slowest part of gpus nowadays
and I mean if you think about how many vertices there are and that you need to fetch 64 bytes for each one… that adds up really quickly
Yeah I get that, just trying to understand the limits of what I can do I guess
well for this you would use a profiler
NSight or Radeon Analyzer thingy will basically tell you that everything is stalling waiting for memory
Do you do this encoding/decoding for every material/shader in your renderer? Or do you use it for the most common "default/pbr" material?
every single one
Shader-side decoding is not that costly then I presume?
it is always better nowadays to trade compute against memory usage
Just some arithmetic instructions?
compute is insanely fast
yeah it's a bit of floating point arithmetic
the UV encoding is really simple… cast to float and then divide by 65536
octahedral encoding is slightly more involved but it's no more than like 10 loc
Thanks again, I'll read up on this a little and integrate it 
https://github.com/spnda/vk_gltf_viewer/blob/renderer_interface/shaders/mesh_common.h if you need a C++/GLSL impl it's right at the bottom
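For reference, a CPU-side sketch of the two encodings mentioned above. It is not taken from the linked file; the type and function names are hypothetical. One assumption to flag: the UV helpers use the usual unsigned-normalized convention of dividing by 65535, which is very close to the "divide by 65536" shorthand in the chat.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

inline float SignNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Maps a unit vector to a point in [-1, 1]^2 on the unfolded octahedron.
inline Vec2 OctEncode(Vec3 n) {
    const float inv_l1 = 1.0f / (std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z));
    Vec2 p{ n.x * inv_l1, n.y * inv_l1 };
    if (n.z < 0.0f) {
        // Fold the lower hemisphere outwards over the diagonals.
        p = Vec2{ (1.0f - std::fabs(p.y)) * SignNotZero(p.x),
                  (1.0f - std::fabs(p.x)) * SignNotZero(p.y) };
    }
    return p;
}

// Inverse of OctEncode; returns a normalized vector.
inline Vec3 OctDecode(Vec2 p) {
    Vec3 n{ p.x, p.y, 1.0f - std::fabs(p.x) - std::fabs(p.y) };
    if (n.z < 0.0f) {
        const float x = (1.0f - std::fabs(n.y)) * SignNotZero(n.x);
        const float y = (1.0f - std::fabs(n.x)) * SignNotZero(n.y);
        n.x = x;
        n.y = y;
    }
    const float inv_len = 1.0f / std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return Vec3{ n.x * inv_len, n.y * inv_len, n.z * inv_len };
}

// UV in [0, 1] to unsigned normalized 16-bit and back.
inline std::uint16_t PackUnorm16(float v) {
    assert(v >= 0.0f && v <= 1.0f);
    return static_cast<std::uint16_t>(std::lround(v * 65535.0f));
}

inline float UnpackUnorm16(std::uint16_t v) {
    return static_cast<float>(v) / 65535.0f;
}
```

In a real vertex format the two octahedral components would themselves be quantized to 16 bits each; that step is left out here to keep the sketch short.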
btw you can also use these mechanisms to compress your gbuffer
I'm following LOGL and am on normal mapping, so I'm not yet familiar with deferred rendering, but I'll take note 
Was reading up on the GL wiki, specifically vertex post-processing and the post-transform cache. I realized indexing may be a quicker way of getting back to 144 fps for my instanced cubes. I was using non-indexed vertices (as LOGL does).
Switched to indexed vertices and indeed, lost fps is claimed back.
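A rough sketch of what the switch to indexed vertices involves on the CPU side: exact duplicate vertices of a triangle list are collapsed and replaced by indices, so the post-transform cache has something to reuse. `Vertex`, `IndexedMesh` and `BuildIndexed` are hypothetical names, and the vertex here carries only a position and a UV for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

struct Vertex {
    float px, py, pz;  // position
    float u, v;        // texture coordinates
};

// Strict ordering so Vertex can be used as a std::map key.
inline bool operator<(const Vertex& a, const Vertex& b) {
    return std::tie(a.px, a.py, a.pz, a.u, a.v) < std::tie(b.px, b.py, b.pz, b.u, b.v);
}

struct IndexedMesh {
    std::vector<Vertex>        vertices;
    std::vector<std::uint32_t> indices;
};

// Collapses exact duplicate vertices of a triangle list into an index buffer.
inline IndexedMesh BuildIndexed(const std::vector<Vertex>& triangle_list) {
    assert(triangle_list.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<Vertex, std::uint32_t> seen;
    for (const Vertex& v : triangle_list) {
        const std::uint32_t next_index = static_cast<std::uint32_t>(mesh.vertices.size());
        const auto result = seen.insert(std::make_pair(v, next_index));
        if (result.second) mesh.vertices.push_back(v);  // first time this vertex is seen
        mesh.indices.push_back(result.first->second);
    }
    return mesh;
}
```

A cube with per-face normals and UVs goes from 36 vertices to 24 unique ones plus 36 indices this way; the result would then be drawn with an indexed draw call instead of a plain array draw.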
I'll postpone quantization for now (I'm reta... ehm old-fashioned, so I need to read up on a subject before implementing it, even if it is somewhat trivial to implement/grasp).
Back then, I thought that the textures inside this sample glb file needed to be created with mirrored repeat wrapping mode.
Now I realize that I'd thought that because I was flipping the textures vertically when loading them with stb.
I noticed that you don't flip on Y in gl_viewer demo @humble sinew. Why do you not flip on Y and why is it the correct thing to do in this case?
OpenGL and stb have inverse V texture coordinates after all.
tbh not sure
only thing I could think of is that perhaps my UVs are implicitly flipped when transforming them because of KHR_texture_transform
other than that I really don't know right now
I suspected that as well but no, the model does not have texture transform info. Gl_viewer sends 0 offsets and 1 scale to the shader every frame.
I opened the model in Maya and the Vs are in the 0, -1 range it seems
I'm confused as to what to do in my model import code
Flip or not flip
yeah but the rotation matrix might feed in the flip?
It was identity IIRC
Let me check
maybe
Yeah because there's no transform, 0 is passed as the rotation
ok fair point
This is the model by the way (one of Khronos' official sample glb models)
Wait, does gltf uv space have the V going down?
I'll check the spec
Two negatives cancel out
So uvs coming from gltf should not be flipped for OpenGL
Sorry
uvs coming from gltf are compatible with textures imported with stb*
OpenGL is irrelevant here
Both have the U going to the right and V going down
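The "two negatives cancel out" point can be checked with a tiny sketch. It assumes that v = 0 addresses the first row in memory, which is how OpenGL treats uploaded data and how glTF UVs address an image stored top-down. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Stored row addressed by texture coordinate v in [0, 1), when v = 0 maps to
// the first row in memory.
inline std::size_t StoredRowForV(float v, std::size_t height) {
    assert(v >= 0.0f && v < 1.0f && height > 0);
    return static_cast<std::size_t>(v * static_cast<float>(height));
}

// Where a given source row ends up after a vertical flip on load
// (e.g. stbi_set_flip_vertically_on_load(true)).
inline std::size_t FlippedRow(std::size_t row, std::size_t height) {
    return height - 1 - row;
}

// The matching flip on the texture coordinate.
inline float FlipV(float v) { return 1.0f - v; }
```

Sampling an unflipped image with the glTF v hits the same source row as sampling a flipped image with `FlipV(v)`. So either both the image and the UVs are flipped, or neither is; flipping only one of them is what produces the upside-down result.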
I'm back, with a new bug of course
Normal mapping is sorted out and is correct visually.
Parallax mapping is not.
Problem: Specifically, shader code converting the viewing direction vector (i.e., fragment to camera direction) from view space to tangent space is wrong somehow.
I'll share code after this message.
First, some info.:
Engine is left-handed with x right, y up and z forward.
Tangent space is right-handed with tangent and bitangent aligning with u & v and the normal being the right handed cross product of tangent and bitangent.
So to construct a TBN (converting from view space to tangent space) I do:
T = tangent in view space
N = normal in view space
B = cross( tangent, normal )
Using this TBN, normal mapping works correctly: When I direct the directional light downwards, the top part of each brick in the brick wall is lit and the bottom part is unlit/black.
So I thought the Problem conversion I mentioned above should be like this: (vs_out.tangent_to_view_space_transformation is the TBN by the way)
vec3 viewing_direction_view_space = vec3( vs_out.position_view_space.xy, -vs_out.position_view_space.z );
vs_out.viewing_direction_tangent_space = transpose( vs_out.tangent_to_view_space_transformation ) * normalize( viewing_direction_view_space );
But this does not work. What happens is the parallax effect is static even when I move the camera around (position and/or rotation).
Now the weird thing is, if I forego the transpose( TBN ) * viewing_direction_view_space thing and instead do this:
vs_out.viewing_direction_tangent_space = normalize( vec3( vs_out.position_view_space.xy, -vs_out.position_view_space.z ) );
The result is visually correct. The parallax effect works when camera angle changes.
I know there should be handedness change going from view space to tangent space and vice-versa. But TBN already accounts for that.
So I'm puzzled as to why this hack works but the correct way (transpose TBN) does not.
Here are the shaders
Also, where is Bjorn?
I haven't been following the community threads lately, but I'm around in #wip and the other channels 
Do you have an idea on how to debug this mess?
This is the viewing direction (fragment to eye) in tangent space, output as colors. It does not change with camera rotation, but is offset with camera movement (no rotation).
Okay I found out
Nothing is wrong ...
It shouldn't react to camera rotation alone
That's completely normal
Viewing direction (fragment to camera direction) is CONSTANT for tangent space when the camera rotates only
...
Tangent space viewing direction only changes when the object being viewed rotates, or the camera moves and rotates
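A small numeric sketch of why this holds. It assumes the camera transform is a pure rotation plus translation and ignores the handedness flip discussed earlier. Both the fragment-to-camera vector and the TBN basis go through the same world-to-view rotation, so the rotation cancels when moving on to tangent space. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Mat3 { Vec3 c0, c1, c2; };  // stored as three columns

inline Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Computes transpose(m) * v by taking the dot product of v with each column.
inline Vec3 MulTransposed(const Mat3& m, Vec3 v) {
    return { Dot(m.c0, v), Dot(m.c1, v), Dot(m.c2, v) };
}

// Camera orientation (camera-to-world rotation) for a yaw angle around +Y.
inline Mat3 CameraRotationY(float radians) {
    const float c = std::cos(radians), s = std::sin(radians);
    return { { c, 0.0f, -s }, { 0.0f, 1.0f, 0.0f }, { s, 0.0f, c } };
}

// Unnormalized fragment-to-camera vector in tangent space, computed by way of
// view space, the same route the vertex shader takes.
inline Vec3 ViewDirTangentSpace(Vec3 fragment_world, Vec3 camera_world,
                                const Mat3& camera_rotation, const Mat3& tbn_world) {
    // World -> view is the inverse (transpose) of the camera rotation.
    const Vec3 to_camera_view = MulTransposed(camera_rotation, camera_world - fragment_world);
    // The TBN basis vectors go through the same world -> view rotation.
    const Mat3 tbn_view{ MulTransposed(camera_rotation, tbn_world.c0),
                         MulTransposed(camera_rotation, tbn_world.c1),
                         MulTransposed(camera_rotation, tbn_world.c2) };
    // View -> tangent is the transpose of the orthonormal TBN.
    return MulTransposed(tbn_view, to_camera_view);
}
```

Calling this with two different yaw angles and the same camera position gives the same vector, while changing the camera position changes it. That matches the debug output described above.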
...
Well kudos to me for being stubborn I guess
Steep parallax mapping
With variable sample count based on view direction
And ultimately parallax occlusion mapping
i like ur editor theme 
Back after a long break
Not much happened this engine wise, I just added 3 new primitives (circle, cylinder & sphere) as a side quest and oh the engine now has a name
Kakadu
Turkish for Cockatoo
Also an icon
Calling it an engine is a stretch though but anyway
I also created a small blog/portfolio site with Hugo hosted on github.io
Did a lot of reading/watching though
Halfway through fabian giesen's graphics pipeline series
It was once ancient sumerian to me
I can't claim that I understand everything I'm reading, but more than 50% of the stuff is clicking now, which is progress
welcome back!
I'm thinking of extracting Sandbox project from my main Kakadu.sln into its own solution or at least, a Kakadu-Demos.sln, so I can have many client projects.
While I could just create a new project (or better yet, duplicate Sandbox.vcxproj) and work with that, the "engine repo" would be filled with demo projects and I don't want that.
I want to have two repos at least; Engine in its own repo and demos in their own repo(s). Demos may have individual repos as well, that's no immediate concern to me.
For example, HDR chapter of LOGL has a different, smaller scene depicted. While I could move the camera away and put the new scene there in my current Sandbox.exe, things like lights' ambient terms need to be turned off to not "disturb" this new scene for example. A messy solution at best.
I don't have a scene-graph or a scene representation in general and I don't plan on implementing it any time soon. (No more distractions!)
So I'm determined to make this happen.
Question is, do I go with two repos and individually update both? That's risky, because it will inevitably break once I update the engine repo and one of the not-so-up-to-date-anymore client projects stops working.
I could go with 2 repos and have the engine repo be a submodule inside each client. That way, clients would have the engine repo locked to a specific commit and would not break with further engine modifications.
Does this make sense?
Current structure is:
Kakadu.sln has:
Sandbox project (exe) depends on Engine project (.lib) depends on Vendor project (.lib)
I personally have no intention in creating a separate library just for the engine in the near term but am writing it hopefully in a way that is going to be possible to pull it out eventually. I am trying to get to gameplay
Valid
I mean I could (and probably should) just shut the hell up and continue, LOGL will be over soon. It's just that prior chapters largely followed the same scenes or at least I could make incremental changes to the scene for each new chapter.
While this is obviously not a problem for many chapters (why have the same scene as LOGL? Do what you want!) I want to be able to visually verify correctness, especially for color correction and HDR kind of chapters. So sticking to source scenes there is important I think.
It wouldn't really matter for, say, framebuffers for example
I think I will add point lights eventually into my asset format and have the engine handle it when it sees it in there
I don't want scene objects in the engine
So you're following LOGL
I followed a similar book
I just dumped that project that had all that and started over
It was just a pile of tutorials
Then my first attempt at Rosy I tried doing scenes the same way as I did in zig and it led to C++ issues, a really slow build, a horrible header dependency tree
So what I would do if I was in your position, and this is not a recommendation, it's just what I do, is just start over
Once done with learning
I started over about a week ago, brand new project and am close to where I was before already
Like nearly three months of work and I will probably get it done within two weeks I think
I know I should do that but I kind of drove myself into a corner by overengineering this into "my baby"
I could just take the math and foundation stuff along and start anew, you're right.
And I know there's this trend where people go through multiple renderers in their learning journey. Which makes sense on multiple fronts, for example to learn a new API.
I made my decision after at least trying to scope a refactoring
I don't know I think I should just stick to the current structure and finish LOGL with it.
Then I should probably start from scratch and incorporate DX11 maybe?
Or maybe it's time for vulkan. I've not decided yet. I'm also continuing Real Time Rendering and have lots of projects planned (shaders, techniques etc. etc.)
If you can refactor maybe do that
I determined it would take as much time as a reboot and would end up worse
Hmm makes sense
I need to get out of my comfort zone I think
So after LOGL is done, I shall be done with Kakadu
Whether or not I'll be able to let go is another story
I noticed in debug builds, the renderer wastes 2 whole seconds to get up and running
Investigating further showed that shader parsing stages are REEEALLY slow due to std::regex use (to parse #define features)
And now I'm going down the rabbit hole of why std::regex is slow -> ABI breakage promises killing c++
Fun
Now I have a firmer grasp on why there are "better c++" alternatives and more come up every day
Slides: https://docs.google.com/presentation/d/1V4JBhto3E4864_8jP0ZuOWwHBkZoti29DgeTIdU3p8k/edit?usp=sharing
[[no_unique_address]] in MSVC Discussion: https://twitter.com/MalwareMinigun/status/1103498318343073792
Final code sample: https://godbolt.org/z/96hPofz7E
Nice watch
How the f can I jump to the very first post?
Top right
Ah not present in my phone. I'll check it out from my desktop. Thanks
Here you go: #1277011356637462621 message
Neat! Thank you once again!
the fuck?
The better alternative is to not use regex
Definitely
I was lazy
Although I'll keep being lazy and try ctre
I should be done with Kakadu soon anyway
So ctre was a letdown as well
I glanced at some benchmarks and while it mostly kicks std::regex' ass, there are some cases (notably more complex patterns from what I can tell) where the gap is much smaller
And from my extremely scientific benchmarks (which is stepping over a function in VS debugger and looking at the ms elapsed)
It is somehow slightly slower than std::regex for my use case
So I dumped both of these overengineered pieces of software (std::regex is shit & ctre is probably superb in many cases but overkill for my use case anyway) and went with good old string(view) find()
It sounded like I criticized ctre but I really like the compile time part of it actually. I don't have bad things to say about it. I just shouldn't have used regex in the first place
I avoid regular expressions, cannot think of a use case that anything I ever want to do would need them
I usually do so too, as I was bitten by the exceptionally bad performance of std::regex many times before
But sometimes it is handy, especially when the source of the string (i.e., the user of the engine as the shader author in this case) is free to use any whitespace to delimit the token I'm interested in
#define POSITION_LOCATION 0 \n
Is an example string
#define POSITION_LOCATION 0\n is another
Although it isn't hard for the programmer to use regular string/string_view (or c equivalents) functions to implement this as well
Actually it is probably harder to "remember" ECMA regex syntax to come up with the pattern anyway
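A minimal sketch of the regex-free approach described above, tolerant of any run of spaces or tabs between the tokens. `FindDefineValue` is a hypothetical helper, not Kakadu's actual parser. It uses `std::string` to stay portable; `std::string_view` offers the same `find` family in C++17.

```cpp
#include <cassert>
#include <string>

// Finds "#define <name> <value>" in source and returns <value> as text,
// or an empty string if there is no such define.
inline std::string FindDefineValue(const std::string& source, const std::string& name) {
    assert(!name.empty());
    const std::string directive = "#define";
    const char* const blanks = " \t";
    std::size_t pos = 0;
    while ((pos = source.find(directive, pos)) != std::string::npos) {
        pos += directive.size();
        const std::size_t name_begin = source.find_first_not_of(blanks, pos);
        if (name_begin == std::string::npos) break;
        // Require at least one blank after "#define" and an exact name match.
        if (name_begin == pos || source.compare(name_begin, name.size(), name) != 0) continue;
        const std::size_t name_end = name_begin + name.size();
        const std::size_t value_begin = source.find_first_not_of(blanks, name_end);
        // The name must be followed by a blank; otherwise it is only a prefix
        // of a longer identifier, or the define has no value.
        if (value_begin == std::string::npos || value_begin == name_end) continue;
        const std::size_t value_end = source.find_first_of(" \t\r\n", value_begin);
        return source.substr(value_begin, value_end == std::string::npos
                                              ? std::string::npos
                                              : value_end - value_begin);
    }
    return std::string();
}
```

Like the parser described in the chat, this trades accuracy for speed: it does not skip comments and does not accept `# define` with a space after the hash.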
oh are you making an engine for other people to use?
regex is not worth the hassle
just define your shader slots in a const, those can be shared with glsl too
No I'm just an idiot unfortunately and overengineered (more like overcomplicated) this whole shader introspection bit to kingdom come
Although I did get rid of the regex
And shader init. code went down from 2.8ish seconds to 60 milliseconds on Debug
Parsing code is less accurate so if someone other than me wrote a shader, it would be probably relatively easy to break things
But since the answer to this should be a resounding no
I'm done with this
Although, if I ever were to be hired as a tools programmer ...
Oh I remembered, I did plan on eventually making a game-jam with my friends from my studio
Although I would be the one writing the shaders
I think I'll postpone that idea to focus on my studies, write more renderers hopefully
That sounds fun
HDR chapter of LOGL is ... not the best
Instead of utilizing the lighting model from previous chapters, there's a hand-wavy hard-coded phong lighting with ambient set to 0 in the shader
But I couldn't get it to look like the screenshots by just diffuse alone
He uses 0 constant and 0 linear term and 1.0 quadratic term for the attenuation. No speculars or ambient, just diffuse lighting
Aaand this is me
With the same settings
If I do crank up the ambient to match the diffuse (51,000 in rgb = 200.0f in fp) and move the light closer, I can somewhat get closer to LOGL's screenshots
Is this expected or should the diffuse be enough (I think the latter)?
I have no idea about anything HDR related, but looks cool to me
This is probably because I cut some corners veeeery half-assedly
Like inverting a cube (via winding order) to get a quick tunnel
Very quick indeed as I'm still debugging it
I really should have bundled some bs together and created a hand-wavy gameobject class so I could iterate faster
Instead of this horrible and time-consuming bs
is your code on github?
my materials all are defined by a gltf I don't have anything hard coded
well by my asset format
it includes the shader path too
I just refuse to have any hard coded geometry or materials or vertices or any of that at this point, I will add support for debug lines though
Yes
First post or the ones following it should have the link
Personal graphics framework I'm developing, to study graphics programming topics. - fauder/Kakadu
nice
I'm having banding problems for a while and I'm having trouble trying to understand whether it is normal or I f*cked something up
For example I reduced my scene to just the ground (a quad), with blinn-phong (all features removed, no shadows no parallax no nothing, just plain old blinn-phong shading and lighting model), no directional light, no point lights, just a single spot light with no ambient and specular terms, only the diffuse term set to full (white).
I also uploaded the texture of the quad as well
Switching to RGBA16F or RGBA32F did not help (unless I screwed it up somehow)
highp precision did not help (and also, I'm on desktop and I read that it is highp by default on desktop)
Maybe my normals are not normalized somehow? That's the only thing I can think of.
Although, I did hard-code the ground normals to 0,0,1 (since it gets transformed from tangent space to view space, I needed to input 0,0,1 instead of 0,1,0. My lighting is done in view space)
Any ideas?
Or is this normal? In that case what is the solution to this? Dithering?
I'm rendering to offscreen fb 0 which is in linear space (srgb encoding is off and its color attachment texture is RGBA).
Then it gets blitted to editor fb which uses srgb encoding as it is the last step in my pipeline (and also its color attachment uses a SRGBA texture)
I do no gamma related stuff manually inside any shaders.
I've turned off MSAA completely
I've rolled back to October, before I implemented instancing and the banding is still there
September as well
August
Yeah as far as Kakadu is concerned, there was banding from the beginning it seems, I just didn't notice probably.
show normals
first thing I do for any lighting problems is show normals
whatever normal is in the shader already just make that the output color * 0.5 + 0.5
if that's correct remove the half vector and just make it L dot N without the half vector
if that doesn't help make the output color the normalized L * 0.5 + 0.5
if your normals and lighting don't show banding then L dot N won't either
if your normals or lighting are to blame you can track that down
to where it starts
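The debug steps above boil down to two small operations, sketched here on the CPU for clarity. In practice they would be one-liners in the fragment shader; the names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Remaps a unit vector from [-1, 1] to a displayable [0, 1] color:
// output = d * 0.5 + 0.5.
inline Vec3 DirectionToColor(Vec3 d) {
    assert(std::fabs(Dot(d, d) - 1.0f) < 1e-3f);  // expects a normalized input
    return { d.x * 0.5f + 0.5f, d.y * 0.5f + 0.5f, d.z * 0.5f + 0.5f };
}

// Plain diffuse term with no half vector: L dot N, clamped to zero.
inline float Lambert(Vec3 n, Vec3 l) {
    return std::max(Dot(n, l), 0.0f);
}
```

Outputting `DirectionToColor` for the normal first, then for the normalized light direction, narrows down which input introduces the banding before `Lambert` combines them.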
Sure but it's just a simple quad and I even went as far as to hard-coding the normals to the up vector.
Still, you're right. Normals coming up
For this scene
Which has serious color banding
The normals
reddish is the tangent and the normal map looking color is
well the normal
They all look normal to me pun intended
View space normals
where is the banding? those normals look pretty dark and not like normals, the circle colors I mean
You mean this right
that picture is kind of small and dark I do not see banding, and the circle colors look dark for being normals
if it's * 0.5 + 0.5 the colors shouldn't be so dark no?
but I don't see banding, I am curious what the banding looks like
the wall surfaces look fine
Oh sorry the spheres in this scene represent point lights not normals
ah
You're right my bad
I'll output world space normals since reading view space normals is not that easy
can you increase the intensity maybe?
Sure thing
I changed the wood texture into my "missing" texture which is pink and more legible
I think banding is more easily visible here
Not the red light's banding, but the far, white light's banding I mean
Moving that far light even further allows the red one to show more and we can now see red light's banding too
It's almost as if I wrote a toon lighting shader
It looks quantized
ah
My Blinn-Phong is a little complex now that there are shadows, instancing, parallax support.
I'll try to create a smaller, bare-bones shader to easily debug it.
that just looks like it's the closest spot to the light on the wall?
as in the normals there and the light direction are basically facing each other
and you're taking the dot product of two parallel directions
-> <-
for illustration purposes
that hard line there between purple and pink?
Sure but the result is clamped
That makes two of us apparently
I tried recreating the same scene in Unity and there's no banding, at least not this extreme.
Hell, it looks nearly identical to learnopengl.com which is pissing me off even more
ah
Of course there is some banding in Unity as well. Seeing this and reading up online, I thought ok some banding is inevitable (at least without taking measures against it like dithering, higher precision framebuffers etc.).
But this whole tunnel thing from the HDR chapter of LOGL is throwing me off
Sometimes I feel like drowning in basic sh*t like this and think I'll never get to vulkan or compute or render graphs or whatever cool shit I want to work on
well
Buuut I'll shut up because I really don't want to cry about this for a long time and just call it a day, get back on it tomorrow and try to figure it out
vulkan is just a bunch of apis that are just more explicit than opengl and you don't get automatic sync behavior like opengl
I struggle with these sorts of things too
I struggled and still struggle with like basic frustum math
there was a moment where I was like maybe I can't do this, because I kept getting my CSM wrong
just one issue after another
and I went to bed, woke up the next day
and found a bunch of silly bugs where I just had like used the wrong value
You're right 100%, as this has been my experience studying graphics as well
Buuuut the problematic part for me is that I am still trying/hoping to work as a graphics dev some day and that puts pressure on me naturally
Would you though?
If you could I mean
From what I remember you had a cool job already
my job is very stressful
I would prefer a job where I could not be stressed out and I could work on my game dev at home as a hobby to a job where I work as a game dev and cannot do my own hobby
Ok then you definitely wouldn't
Game industry does not know work-life balance at all
Although a graphics dev would have it easier compared to a gameplay programmer for example imo
Graphics I mean*
I think maybe after I'm done paying for college tuition for my kid I will like retire from software as a job and take a low paying easy stress free job and just program as a hobby
it would be nice
Solved it
It's the f*cking attenuation
Gamma corrected lighting works better with linear attenuation
As opposed to quadratic attenuation working good with linear color space
While this is covered in the earlier gamma correction chapter of LOGL
It sure as hell would have been nice if it was mentioned in the HDR chapter as well ...
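For reference, the attenuation terms being compared, as a sketch of the usual LearnOpenGL-style formula with constant, linear and quadratic coefficients. The function name is hypothetical, and the sketch makes no claim about which falloff looks better in a given color pipeline.

```cpp
#include <cassert>

// Point-light attenuation: 1 / (Kc + Kl * d + Kq * d^2).
inline float Attenuation(float distance, float constant, float linear, float quadratic) {
    const float denominator = constant + linear * distance + quadratic * distance * distance;
    assert(denominator > 0.0f);  // with Kc = 0 the formula blows up at distance 0
    return 1.0f / denominator;
}
```

At a distance of 2, the purely quadratic setup from the HDR chapter (0, 0, 1) leaves 0.25 of the light, while a purely linear one (0, 1, 0) leaves 0.5. The quadratic falloff drops much faster with distance, which changes how quickly the tunnel goes dark.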
I'll either accomplish this or die on this hill
Now I can sleep
oh
I thought gamma correction is always the last step
this math involves doing math with gamma corrected light? is that an HDR thing?
It is, and it is implicit in my setup (no manual code in any shaders for this)
gamma correction is like a trick that tries to solve human better perception of darker light and is not good for math, like you have to remove the gamma correction first
ah
glad you solved and feel better now
and I hope you feel better now*
you got this
Yeah basically any real change in luminosity isn't perceived linearly by the human vision system but since pbr and co. is based on real light physics, it is preferable to work in linear color space, as in linear for the light. So you work in linear color space by default and only for your last rendering pass (I call it editor fbo in my renderer since the result is output to an ImGui window in the editor) you enable sRGB encoding for the framebuffer object, which does a final step of converting your linear output into the sRGB color space.
Then your display does the inverse and converts from sRGB to linear color space.
Then the light from the monitor hits your eye and your brain-eye-and-whatnot convert it to sRGB again.
(Also you'll want to create your diffuse/albedo-like maps with the SRGB format so OpenGL knows to convert it from sRGB to linear when sampled in a shader, so you can keep working on linear color space).
This is how I understood it.
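A minimal sketch of the two conversions described above, assuming the standard piecewise sRGB transfer function (the function names are made up for illustration; in the renderer itself these steps are done by the sRGB texture formats and the sRGB-encoding framebuffer, not by hand):

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded channel in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode one linear channel in [0, 1] to sRGB."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1.0 / 2.4) - 0.055


# A texel stored as sRGB 0.5 is much darker in linear space (roughly 0.21),
# which is the value the lighting math should actually work with.
mid_gray_linear = srgb_to_linear(0.5)
round_trip = linear_to_srgb(mid_gray_linear)
```

This is also why linear-space buffers look dark when viewed without the final encode step.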
I also mumble these during the day to see whether I got it down or need to study more
Hence I wanted to explain again (I swear I'm not trying to show off)
I do, but the image still differs in some aspects from the LOGL's but I'll convince myself that it is normal
I'll also remember that the creator of ImGui knows less about gamma correction than me (based on his words claiming he does not know how gamma correction works or smth like that) and chill out
And I have 5.5 hours of sleep left
oh no get some sleep
I just stay in linear mode and use the SRGB swap chain format until presentation
it does make everything look really dark in renderdoc though
Yeah you have to account for that darkening as well
I created a function to convert imgui style colors to accommodate for this
I still think there's something wrong with my lighting | gamma-correction | switch to HDR (no tonemapping yet).
One or more of these have problems I think
While this is closer to the source image on learnopengl.com, it is still off
Just to recap the gamma correction in my engine:
- No manual gamma correction in any shaders.
- SRGB encoding of OpenGL is disabled for all but the last FBO (which is the editor FBO as I call it).
- Editor FBO's color attachment is also SRGBA. Other FBOs color attachments are RGBA.
- Diffuse/albedo etc. textures are created as SRGB so they should be SRGB decoded when sampled in a shader.
I would really appreciate it if someone could give me any pointers
Or maybe I should ask in a dedicated question channel from DISCUSSIONS category or directly in #questions ?
Left two images are from learnopengl, right one is Kakadu.
The interior of the wooden tunnel is darker in logl
Also let me take a screenshot from the end of the tunnel to show the closeup of "sun" light as well
Nooo don't say that
Something fishy going on. I'll create a post in questions when I get back on pc
Have you considered that LOGL could be wrong (wouldn't be the first time)
I did, briefly
Honestly I don't know if I would be glad or sad if that was the case
I guess a better way would be to use sponza as I believe I would have lots of references all over the internet
Or Rosy ❤️
Indeed
It's still early enough that you can catch and fix this kind of bugs, so they can wait a bit
If you were in my shoes, would you be OK with it for now and move on and later fix it maybe then?
I'm not sure on how to proceed
I know the feel, but I also went several times with "good enough for now" with my engine. Sometimes you just need to free up your mind and come back to the topic at a later time.
But loading Sponza to compare is also a good idea, if you want to get rid of the "bug" (if there is one) I think that's a better way to investigate than sticking to LOGL.
Sponza or even the Khronos/GLTF test meshes will be a better comparison point because they have been vouched for by a lot of folks
Oh Khronos/GLTF makes a lot of sense yeah
OK that makes a lot of sense. I'll do some final sanity checks with sponza and khronos gltf assets and read MJP's blogs on MSAA. If I still can't correct the alleged bug after exhausting these options, I'll simply move on.
Thanks a lot! It really helps when someone other than me points me towards other options (or simply suggests that my assumptions may be incorrect)
My pleasure
I don't know anything about HDR yet
so I can't be of any help there unfortunately, would be cool to add it at some point
I'm doing some reading on msaa and signal processing in general
Then I'll take a fresh look at Kakadu and try to see if there is anything wrong with it
There are some problems with MSAA resolve off the top of my head that's for sure
But I don't think those have anything to do with color banding/smearing but that's just a hunch. Can't know until I fix it.
OK back again.
Did some reading on MSAA, HDR & gamma-correction. Matt Pettineo's blog is a treasure.
To better understand the signal processing side of his series on signal processing and MSAA, I did some light studying on signal processing but the math side of it got real complex (pun not intended), real fast. I remember some signal processing from my university days, it is fun but I think deserves proper time to study properly.
Anyway, back to tinkering with this color banding thing for a limited amount of time before moving on.
I actually noticed something working on this problem: If I replace wood textures on the walls of this tunnel from logl scene with pure color, like red, the banding can more easily be observed.
Take a look at this image for example:
Now, if I disable gamma correction entirely (i.e., disabling sRGB encoding of framebuffers & also not using any sRGB textures anywhere), look what happens:
Okay sure the image is darker since the scene was set up for a gamma-corrected setup
Let me crank up the lights a little
Let's move the "sun" light a little closer to see if there is any banding:
There is no banding
So I'm thinking this banding has to do with gamma-correction
This is somewhat verified by this: https://loopit.dk/banding_in_games.pdf
Page 10
Although I'm nowhere close to a solution other than disabling gamma-correction
Dithering is an option but if I understood it correctly, it should be done as close to the point before color quantization (i.e., just before the frag shader terminates) and also repeated for each step in the pipeline. I can't simply put dithering at the end of my MSAA resolve shader and call it a day. In fact I tried that first as it was the least effort route, and it didn't work as the slides say it wouldn't.
Is it the only way to avoid banding? Surely there's another method? I have major banding on my gl project on windows even on a so called hdr monitor which cost an arm and a leg but is only true black 400 certified, my swap chain is 16bpc and lighting accumulation is also 16 bit float
But hey on my MacBook with the crazy good hdr 1600 nit peak it's gone
I have the samsung chg70 27" which was also hdr600 if I'm not mistaken
I'm a bit pissed because no one said oh check for certification they said hdr is good and so far it's nothing worth bothering but hey the monitor is oled so it's really nice just wish it was actually compatible to my mac, it has a 1000 nit mode in the osd settings, for £650 you'd think they wouldn't lie
I can't say for sure, but for my case, none of these helped:
- Using sRGB
- Using higher precision framebuffers
- Using highp in shaders (although, it's highp by default on desktop if I'm not mistaken)
- Doing tonemapping per sample before resolve vs. after resolve.
Things that actually worked:
- Dithering
- Disabling gamma-correction altogether.
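To illustrate why dithering helps where extra precision upstream did not, here is a small sketch of noise added right before 8-bit quantization. It assumes plain uniform noise of at most half a code value; the function names and the sample value are hypothetical, and a real shader would use a per-pixel noise source instead of Python's RNG.

```python
import random


def quantize_8bit(value):
    """Round a [0, 1] float to the nearest 8-bit code."""
    return max(0, min(255, round(value * 255.0)))


def quantize_8bit_dithered(value, rng):
    """Add up to +/- half a code of uniform noise before rounding."""
    noise = rng.random() - 0.5
    return max(0, min(255, round(value * 255.0 + noise)))


rng = random.Random(1234)          # fixed seed so the run is repeatable
value = 100.3 / 255.0              # sits between codes 100 and 101

plain = [quantize_8bit(value) for _ in range(10000)]
dithered = [quantize_8bit_dithered(value, rng) for _ in range(10000)]

plain_mean = sum(plain) / len(plain)            # every pixel snaps to the same code
dithered_mean = sum(dithered) / len(dithered)   # mix of 100 and 101 averages near 100.3
```

Without noise a whole region snaps to one code and the next region to the next code, which is the visible band edge. With noise the two codes are mixed in proportion, so the eye averages them into an in-between shade.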
It's piss all in way of actual hdr brightness but the colours are good
That sucks
Hmm how would you disable gamma correction 🤣
HDR was mostly a gimmick
Yeah, my bad for not researching
Back when I purchased my monitor which is 5+ years old at this point I think
Oh believe me on a MacBook Pro like mine it's deffo not a gimmick but yeah can't beat Apple for their displays
Although, it works wonders with a playstation and an oled tv/monitor IMO
Windows has HDR problems if I recall correctly
I believe you
Yeah windows is shite in general lol
Indeed
HDR on Mac is gorgeous
Super bright also there's no such thing as hdr output on Mac, everything can be hdr 🤣
Obviously not in bit depth but like the way windows does it is stupid
Once I set my output to 16 bit in openTK banding disappeared on Mac
Literally 4 params when making the window and it fixed it but on windows it has no effect
RedBits = 16…. Etc
I basically don't call glEnable( GL_FRAMEBUFFER_SRGB ) and don't set any textures as sRGB/sRGBA.
There was no manual gamma-correction code in any of my shaders to begin with so that's already done.
I also had a function to convert my ImGui colors to sRGB when I did use sRGB. I simply commented the body of that function so ImGui works correctly as well but this one is not relevant to our discussion.
So there is no gamma correction anywhere in the rest of the renderer
I recommend this, although a huge portion of it was overkill info (for me, at this stage)
I also used this for dither: http://www.anisopteragames.com/how-to-fix-color-banding-with-dithering/
And this is with tonemapping and dithering
Is it close to LOGL? not really
Do I care
Definitely not
A wise frog told me to move on
That's what I will do
Tbh windows does one thing ok and it's gaming, if I could swap my pc for a mac which is as good in terms of gpu perf, tbh M1 Pro gpu is capable, it just doesn't scale as good as a desktop gpu heh
And there's no visual studio :p
Which is arguably a bad software but has the best debugger I've seen so far
Yeah rider is better anyway unless you need to do c++
Rider debugger is amazing heh
My work is c++ and also graphics = c++ (mostly)
Can do dx12 raytracing in c# if you want tho
Or vulkan
Tbh idk why anyone chooses dx over vulkan unless they don't wanna spend time learning it, vulkan is so cool, my goal is to move stuff over to vk eventually, c# is like the ideal language to use besides rust, c++ vulkan looks awful to work with heh
c++ anything is awful usually
I use Vulkan's C API because that's what the literal spec is
C++ with the Vulkan C API
I've set up https://github.com/cagritaskn/GoodbyeDPI-Turkey and now I'll be more frequently using Discord
Ok I'm a moron
I knew that I was a moron but now I'm absolutely sure that I am a moron
All this freaking time I've been passing GL_RGBA for the internal format
For GL_RGBA16F I mean
Woops
Should it be gl float or rgba float or idk I only use gl with openTK
In openTK it has internal format which is the real format, format which is like rgba no precision and then a type so unsigned byte would be 8bpc and half float for 16f, float for 32f
So you have the texture format, layout format and actual per channel storage precision specifier
So like thereβs rgba16f for internal, rgba for layout and halffloat for precision
Itβs strange how gl does it
I only support one format for images and it's all the same, I refuse to support multiple
I process everything with ktx
whatever garbage format goes in (jpg, png, etc) -> ktx tools -> same ktx2 format comes out
with the same command args
I do currently have an srgb problem with the output though, I don't want srgb
just haven't spent any time on it
it's a problem especially for the normal maps
The internal format should have been GL_RGBA16F (or 32) for an HDR back buffer
Framebuffer in GL speak
But I fed GL_RGBA mistakenly
The reason I couldn't catch it until now is because I could swear I'd seen GL_RGBA16F in RenderDoc
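A rough sketch of why the internal format matters here. The table pairs each sized internal format with a format/type combination it is commonly uploaded with (one valid choice, not the only one), and the two helper functions are a hypothetical model of what an 8-bit normalized channel versus a float channel keeps. It assumes the unsized GL_RGBA ended up as 8-bit normalized storage, which is typical but implementation-defined.

```python
# internal format -> (format, type) commonly used with it
TEXTURE_FORMATS = {
    "GL_RGBA8":   ("GL_RGBA", "GL_UNSIGNED_BYTE"),
    "GL_RGBA16F": ("GL_RGBA", "GL_HALF_FLOAT"),
    "GL_RGBA32F": ("GL_RGBA", "GL_FLOAT"),
}


def store_unorm8(value):
    """Model of an 8-bit normalized channel: clamp to [0, 1], 256 steps."""
    clamped = max(0.0, min(1.0, value))
    return round(clamped * 255.0) / 255.0


def store_float(value):
    """Model of a float channel: values above 1.0 survive (precision loss ignored)."""
    return value


hdr_radiance = 5.0                            # a bright light, well above 1.0
kept_in_rgba8 = store_unorm8(hdr_radiance)    # clamped, the HDR range is gone
kept_in_rgba16f = store_float(hdr_radiance)   # still available for tonemapping
```

With the clamped version, tonemapping and exposure only ever see values up to 1.0, so changing the exposure cannot recover detail in bright areas.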
Exactly
Those should be linear yeah
I delved deep into OLEDs, how consoles deal with HDR and what formats they support etc. these past days
In this 2017 GDC talk, EA's Alex Fry presents the approach the Frostbite team took to add support for HDR displays.
This in particular was a nice watch and answered many questions
Tonemapping w/ 0.1 exposure
Exposure = 1.0
Exposure = 5.0
And this is the reference from LOGL
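For reference, a sketch of exposure tonemapping at the three exposure values above. It assumes the exponential operator from the LOGL HDR chapter, `1 - exp(-hdr * exposure)`; the chat does not say which operator Kakadu actually uses, so treat that as an assumption.

```python
import math


def tonemap_exposure(hdr, exposure):
    """Map one HDR channel (>= 0) into the displayable range, approaching 1.0."""
    return 1.0 - math.exp(-hdr * exposure)


# Same three scene values under the three exposures from the screenshots.
for exposure in (0.1, 1.0, 5.0):
    mapped = [round(tonemap_exposure(v, exposure), 3) for v in (0.5, 1.0, 10.0)]
    print(exposure, mapped)
```

Low exposure keeps detail in very bright areas and crushes the dark ones; high exposure does the opposite.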
Finally I'm able to move on
Next up: Bloom
Should I make the lights' ImGui widget display [HDR] when they are bigger than 1.0 or 255 before sleeping
Ok joking I'll get to bed now
lol
May I recommend this: https://github.com/Froyok/Bloom
I was searching for what bloom to implement because I watched cherno's bloom video and he kind of dunked on logl and sascha willems' bloom tutorials.
Thanks!
I now have (limited) time to r&d on bloom a little
Found your blog post @fluid parrot. It's based on CoD's bloom if I'm not mistaken (I only skimmed it just now, because I have to go out now).
Neat!
I have this plan:
- Read logl's Bloom but do not implement it (it shows MRT in GL for example, which is useful knowledge for someone learning GL).
- Read up on CoD Advanced Warfare slides.
- Read your blog and check out your implementation.
- Finally implement it myself.
I may skip or skim over CoD slides as yours is basically the same approach if I'm not mistaken.
Awesome blog by the way 
Also, Jorge Jimenez seems to have a guest tutorial on logl on Bloom, which is the same thing I think?
Yeah my approach is the same but it has some extras. The LOGL tutorial is based on it, I don't think it was made by Jimenez however, but it is based on the same slides.
I recommend checking out my repo directly, it has the latest fixes and all while my article has some mistakes.
Oh ok thanks
Not the βdefaultβ logl tutorial but the extra one in the guest articles
It says jorge jimenez in the author section
My bad
It quotes Jorge Jimenez
But the author is Alexander Christensen
yup
Note also that my repo has a sample that you can run, if you want to try out some settings live
Nice! As my last effort on HDR proved, it definitely helps to have a reference implementation for sanity checks!
OK so that CoD presentation flew right over my head
@fluid parrot your blog was much more approachable
But I thought I would read the LOGL Phys. Based Bloom first to get my bearings and paused reading yours.
Also watched SimonDev's video on this as it was a nice round-up style video to quickly see what Unreal and Unity does (which copy CoD as well).
Also Acerola's and Cherno's videos helped fill the blanks.
After reading through LOGL, I'll get back to your blog and then check your source code.
I wanted to get the theory down (as much as I can) before coding anything, so I didn't check your source code yet
An immediate idea I had upon reading up on Bloom was to utilize compute shaders, as the rasterizer and the previous stages are unnecessary for this kind of post-processing.
It'll be a nice exercise to convert the bloom impl. from gfx to compute when I get to compute shaders eventually
You can indeed, you could check out the implementation in frogFood for an example: https://github.com/JuanDiegoMontoya/Frogfood/tree/main/data/shaders/bloom
(Tho in practice the performance benefit is almost non existant.)
Hmm I guess geometry processing & rasterizing for a single fullscreen quad (or better yet a fullscreen triangle) is indeed negligible then
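For anyone unfamiliar with the fullscreen triangle mentioned above, this sketch reproduces the usual bit trick for deriving the three clip-space corners from the vertex index alone, so no vertex buffer is needed. It is written in Python only to show the arithmetic; in practice this lives in a vertex shader, and the function name is made up.

```python
def fullscreen_triangle_vertex(vertex_id):
    """Clip-space (x, y) for vertex 0, 1 or 2 of a triangle covering the viewport."""
    u = float((vertex_id << 1) & 2)   # 0, 2, 0
    v = float(vertex_id & 2)          # 0, 0, 2
    return (u * 2.0 - 1.0, v * 2.0 - 1.0)


corners = [fullscreen_triangle_vertex(i) for i in range(3)]
```

The triangle overshoots the [-1, 1] viewport on two sides and gets clipped, which avoids the diagonal seam a two-triangle quad has.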
Still, a nice exercise
Ok, LOGL guest article on physically based bloom is not bad imo, though both the tutorial and the code is a little disorganized
But it did help me quickly grasp the basic theory
Back to yours now
Hmmm, so your method is similar to the CoD one, but during the upsample, you do not apply the tent filter to blur right @fluid parrot ?
Why and how did you make that decision?
I do apply the blur filter
Like right there
https://github.com/Froyok/Bloom/blob/6a839ab4fbe347ca382da7e50fc2d7fe05408dd6/data/shader/upsample.FRG.glsl#L40
Oh
Then I'm either misunderstanding the phrasing in your blog, or this is one of the things when you mentioned your code is more up to date compared to the blog post?
The code is more up to date yes, but I don't think this part differs much from the blog post.
Is there a part you find confusing ?
I may have misunderstood the original algorithm; I thought that tent filter in the upsampling pass was the one handling the blurring.
But from your blog post:
I use the same downsample/upsample patterns, but I removed the intermediate blur passes.
Maybe there were additional passes for blurring that I missed
Ha yeah, that part is a bit wrong indeed. In practice I found that playing with the tent filter to control the blur radius didn't lead to great results.
So instead I used the mix/lerp to adjust it.
Hmm
What is the meaning of the "intermediate blur pass" there? Is it something other than the tent filter applied during upsampling?
Unrelated question: Karis average is for reducing the flickering caused by the so-called fireflies right?
And if I understood the code correctly, this is only applied to the first downsample (from mip 0, to mip 1).
This was a misunderstanding on my side from the COD slides, based on what I knew about Unreal bloom.
In unreal after each downsample they do a separable gaussian blur. When I read the COD slides I thought they meant the same thing, since they reference Unreal talk initially. But the blur() mentioned in the slides is likely more about the tent/special kernel they do for blurring.
Exactly right yeah :)
In practice I haven't noticed the Karis partial Average to help that much. What helps the most with fireflies and spatial aliasing (notably on emissive surfaces) is temporal accumulation/filtering.
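A sketch of the Karis average idea discussed above: each sample is weighted by 1 / (1 + luminance), so a single extremely bright texel cannot dominate the downsampled result. This is simplified to one block of four samples with Rec. 709 luminance weights, both of which are assumptions; the real downsample kernels in the linked repo and the CoD slides use more taps.

```python
def luminance(rgb):
    """Rec. 709 luminance of a linear RGB color."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def plain_average(samples):
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def karis_average(samples):
    """Weight each sample by 1 / (1 + luminance) so very bright samples count less."""
    weights = [1.0 / (1.0 + luminance(s)) for s in samples]
    total = sum(weights)
    return tuple(
        sum(w * s[i] for w, s in zip(weights, samples)) / total
        for i in range(3)
    )


# Three ordinary texels and one "firefly".
block = [(0.2, 0.2, 0.2)] * 3 + [(500.0, 500.0, 500.0)]
box = plain_average(block)      # dominated by the firefly
karis = karis_average(block)    # stays close to the ordinary texels
```

As noted in the chat, this only tames the spike within one frame; it does nothing about a firefly appearing and disappearing between frames.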
You can see more about it here (the issue and then how I fixed it afterward): #1067777224528375858 message
It's not part of my repository yet, but I plan on adding it. (Needs a bit more work since you need to reproject the previous frame)
Hmmm I see
I'm obviously not going to implement the bloom from the original bloom chapter of LOGL since it is both outdated and ugly.
I've had my fair share of thresholding problems back when I worked on Circadian City (https://store.steampowered.com/app/1045430/Circadian_City/). They wanted me to implement Bloom and I was new to Unity's SRP and post-processing stuff back then and I recall struggling with thresholding to get the look right. Stuff was either over or under blooming. And it was apparent that threshold route was going to need artist intervention per-scene.
Fun story: They then later decided to not use it (bloom)... our art director didn't bother doing simple pre-production on photoshop to see if he actually wanted an effect in the game or not, and I usually had to r&d and implement stuff and usually my work went down the drain as he didn't like the output visually. I also implemented dithering for the lights which was cozy imo but of course that got axed too.
I could (and should) re-implement those in Kakadu or some other future renderer as portfolio ideas. I had a lovely WIP pixel-art reflection shader too.
Anyway, back to the topic.
I'll go with phys. based non thresholding route
But I think I'll just wing it on certain aspects
Because I've not yet internalized some problems commonly faced in post-processing
For example it does not make immediate sense to me when I hear the term "temporal aliasing"
I'm still relatively new to the image-processing side of these
And to post-processing in general actually
So for example since I don't know/experienced what "ringing" is, it doesn't make sense for me to code against it, for now.
So by winging it I mean, I'll omit certain measures against flickering etc. and have an inferior implementation by intention
You could still do thresholding, but with a progressive curve instead of something binary. Unity has a simple curve I recall, should not be too hard to find back the source.
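A sketch of what a progressive threshold can look like, compared with a binary one. It uses a quadratic "soft knee" of the kind often seen in bloom prefilters; whether it matches Unity's curve exactly is an assumption, and the function and parameter names are made up.

```python
def hard_threshold_weight(brightness, threshold=1.0):
    """Binary threshold: a pixel either contributes fully or not at all."""
    return 1.0 if brightness > threshold else 0.0


def soft_threshold_weight(brightness, threshold=1.0, knee=0.5):
    """Fraction of the pixel's color passed to bloom, ramping up around the threshold."""
    soft = brightness - (threshold - knee)
    soft = max(0.0, min(2.0 * knee, soft))        # only active inside the knee
    soft = soft * soft / (4.0 * knee + 1e-5)      # quadratic ramp
    contribution = max(soft, brightness - threshold)
    return contribution / max(brightness, 1e-5)


for b in (0.4, 0.8, 1.0, 1.2, 2.0):
    print(b, hard_threshold_weight(b), round(soft_threshold_weight(b), 3))
```

Pixels just under the threshold start contributing a little instead of popping in all at once, which tends to reduce the over/under blooming that a hard cutoff causes.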
It's easy to understand fortunately.
Imagine the fireflies: In frame A one pixel is visible and because it's very emissive it generates bloom. In frame B aliasing makes it disappear. Now you alternate and see a flicker. Spatial aliasing is similar in the sense that a shape in 3D can have its volume on screen change as you move closer/away from it.
If you imagine a neon light, like in the link I gave above: because you move, the shape gains or loses a certain amount of bright pixels. So from one frame to the other you have a sudden change in intensity because the total of emissive pixels has changed drastically. Temporal accumulation allows you to smooth out that kind of "noise" (binary on/off becomes "gray"/in between because of the interpolation).
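The "binary on/off becomes gray" effect can be sketched with an exponential moving average over frames. This assumes a fixed blend factor and a static camera, and it leaves out the reprojection of the previous frame mentioned earlier; the names are hypothetical.

```python
def accumulate(history, current, blend=0.1):
    """Keep most of the history and blend in a little of the new frame."""
    return history + (current - history) * blend


# A firefly that is visible every other frame: intensity 10, then 0, then 10...
frames = [10.0, 0.0] * 20

history = frames[0]
smoothed = []
for value in frames[1:]:
    history = accumulate(history, value)
    smoothed.append(history)

raw_swing = max(frames) - min(frames)                      # full on/off flicker
smoothed_swing = max(smoothed[-10:]) - min(smoothed[-10:]) # small ripple around the middle
```

A lower blend factor smooths more but makes the result react more slowly to real changes, which is the usual ghosting trade-off.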
But it's fair to revisit this topic later, that's also what I did. I only tackled the spatial aliasing in my bloom when it actually had a significant impact on my fog (which uses similar downsample/upsample passes).
Thanks for the insightful explanation as always
In this case, I get the general gist of it sure.
I can tell "temporal = frame to frame". It is usually the "aliasing" part that confuses me.
I ask myself, "what is being sampled here since there is aliasing". I try to understand what the signal is etc. I've failed to internalize aliasing as a concept I think.
And it would be overkill for me to focus on signal processing at this stage I believe, so I think I should do what I should have done in the HDR chapter and simply move on and later revisit these topics.
This mindset also gives me incentive to stop Kakadu soon and re-start on a fresh renderer.
If I can reassure you (or not) I know nothing about signal processing or stuff like that. I'm just an artist that is piling up tricks in a hat. 🥷
I was afraid you might say this. But you have the artistic side of you (in addition to the GP side) allowing you to directly mess with all this stuff and get a feel for it. That's a huge advantage I'd say.
Lately it feels like I studied theory more than I actually implemented stuff, so it feels as I slowed down significantly.
I want to half-assedly implement stuff for once (I'm not saying I always do everything correctly on the first try, but rather I usually (over)do it to the point where I can assure myself that I did it to the best of my ability) and then be able to come back to it with a more solid foundation later.
That's why I'm being lazy on understanding the common problems and their solutions in the context of bloom for example
I have to develop this habit of prototyping, and do it soon,
Anyway, sorry I rambled to you. And thanks for all your help again
I'm thinking "what if I re-use the existing framebuffer (contains HDR color info of the frame prior to post-processing) for bloom downsampling (and upsampling)?"
I could always keep the color attachment (i.e., texture) of the actual frame around and take care not to overwrite it
First downsample pass will sample that texture but write into mip_0 texture
So it is not overwritten during downsampling
For upsampling, the end result is the full resolution texture, so it could be overwritten if I let things be, but I could simply create another full-res texture and call that post-bloom or something I don't know
I'm not sure what good keeping the pre-postprocess texture around would do currently
So I don't necessarily need to keep it around
Not that any of this matter for my toy engine but I'm guessing a heavy-weight engine would try to reuse as much as it can
Anyway, pointless speculation at this stage
My current Resolve shader also handles the tonemapping, which worked during the HDR chapter since there were no post-processing effects at play then.
But I'll have to decouple tonemapping and MSAA resolve now, as Bloom (and other post-processing effects I may implement in the future) will need HDR.
As per MJP's MSAA blog, I know that tonemapping after resolving can actually re-introduce aliasing.
I could tonemap -> post-process -> resolve, but that is counter-intuitive; bandwidth would suffer throughout post-processing, carrying that fat 16-bit MSAA fbo around.
The solution proposed in MJP's (or Humus') blog (if I understood correctly) is to tonemap each subsample before resolving.
Then after resolving (and having a single color value, but LDR due to tonemapping), apply the inverse of the tonemapping operator to go back to HDR.
All done in the single Resolve shader.
Then I can do post-processing in HDR.
Then tone-mapping to convert to LDR.
Then finally gamma-correction to convert to sRGB.
Any suggestions/corrections @fluid parrot?
How does it work in Ombre?
Yeah tonemap -> resolve -> reverse tonemap -> rest of the rendering pipeline sounds like the right approach, that's also what I'm planning in my engine. :)
You could check this out: https://gpuopen.com/learn/optimized-reversible-tonemapper-for-resolve/
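A sketch of the tonemap -> resolve -> reverse tonemap idea on a single pixel. It assumes a simple invertible operator based on the largest channel, `c / (1 + max(c))` with inverse `c / (1 - max(c))`, in the spirit of the linked article, and a plain box filter over the subsamples; the function names are made up.

```python
def tonemap(c):
    """Compress an HDR color into [0, 1) using its largest channel."""
    m = max(c)
    return tuple(x / (1.0 + m) for x in c)


def tonemap_invert(c):
    """Undo tonemap(); valid while the largest channel is below 1."""
    m = max(c)
    return tuple(x / (1.0 - m) for x in c)


def average(samples):
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def resolve_naive(samples):
    """Average the HDR subsamples directly."""
    return average(samples)


def resolve_tonemapped(samples):
    """Tonemap each subsample, average, then go back to HDR."""
    return tonemap_invert(average([tonemap(s) for s in samples]))


# An edge pixel: three subsamples on a dim wall, one on a very bright light.
edge = [(0.1, 0.1, 0.1)] * 3 + [(100.0, 100.0, 100.0)]
naive = resolve_naive(edge)          # the light swamps the pixel
resolved = resolve_tonemapped(edge)  # a moderate value, closer to a blended edge
```

The result is still HDR, so bloom and the final tonemap can run afterwards as planned.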
Nice! Thanks a bunch
Good luck with your audio woes, following Ombre post silently
I think I'll do it the wrong way first (tonemap at the end prior to gamma correction)
Then get bloom working
Then use the operators in the amd link you shared
That way I can blog about it 
Also, I can see if it will mess with bloom or not
I vaguely remember you mentioning this (or something else, I might be wrong) killing the dynamic range and making the bloom lose intensity on your channel
Not in your channel probably or I couldn't find it
In this message
Oh ok it's really in the past and I saw it because you linked it
But it is not the tonemapping before/after resolve thingy but the clamping of the dynamic range killing the bloom, you mentioned there
#1067777224528375858 message
Yeah that wasn't for MSAA resolve
I think I'll make the editor-only behaviour of "always render into an fbo and only use the default framebuffer for display purposes" the default for both the editor and the game
What the hell
Intel arc gpus apparently do not support opengl directly, but through a compat. layer
At work
My mfc and opengl 1.1 legacy app is slooooow on this new arc b series integrated gpu on the client's laptop
And it is weird in that when I attempt to rotate by holding down the mouse button and dragging
Initially it doesn't do anything for 3-5 secs
Then it fluidly rotates, as long as I hold the button down
Maybe that compat layer kicks in ? I don't know
I should upgrade the ancient rendering code of this app to use vaos, shaders and instancing
...
I've just put arguments to Renderer constructor inside a struct, that's where it barfs
I didn't even do any template shit this time
This time I'm innocent
I have 10% faith that upgrading VS will solve it
Let's see
10% was too much
Of course it didn't
Wow, the compiler errored?
Maybe try clang?
It might have a different error message
Idk how much effort it would be to get your code to build in clang though
I had to remove C++23 stuff
I turned off warnings for clang as they contradicted msvc
I just use clang for debugging errors I can't understand via msvc
always the correct answer
I'm on C++20
Might give it a try good idea
Although I already discarded that code
But next time I get internal compiler error (I feel a fuckening nearby already) I'll give Clang a try
This happened to me last year frequently, when I abused templates (constexpr stuff mostly).
But this is the first time I've encountered it for doing something relatively simple (and non-template)
I'm making good progress. Renderer needed some rework to accommodate my recently solidified understanding of MSAA, passes, post-processing in general etc.
I'm nearly there in terms of being able to implement Bloom
Some small editor changes remain, which I took note and will do tomorrow
This
Sort of
I didn't apply it for the color picker widgets (although that would have been cool)
I use it for color formats for Textures
That gradient does not look sRGB at all by the way, so it is a good way to piss graphics programmers off as well
Also used for Framebuffers in the Renderer window of the editor
Also did some actually useful work too; Renderer did set the render state for passes/queues that don't actually have anything to render for that frame (passes with either disabled or empty queues/renderables). Got rid of that.
Also found a bug in SetRenderState(); the current framebuffer was being cleared before depth/stencil write state is set. I caught this by observing that if I disable shadow mapping, following passes do not render correctly. That's because shadow mapping enables the depth write and the subsequent passes can now actually clear the buffers.
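The ordering bug above comes down to clears honoring the write masks: in OpenGL, glClear skips the depth buffer while depth writes are disabled. Here is a toy model of that behavior (a stand-in class, not a real OpenGL binding; all names are hypothetical):

```python
class ToyFramebuffer:
    """Toy stand-in for a framebuffer's depth state, not a real GL object."""

    def __init__(self):
        self.depth_write_enabled = False  # left disabled by an earlier pass
        self.depth = 0.3                  # stale depth from the previous frame

    def clear_depth(self, value=1.0):
        # Like glClear, the clear is dropped when depth writes are masked off.
        if self.depth_write_enabled:
            self.depth = value


def begin_pass_buggy(fb):
    fb.clear_depth()                # silently does nothing
    fb.depth_write_enabled = True   # state is set too late


def begin_pass_fixed(fb):
    fb.depth_write_enabled = True   # set the write state first
    fb.clear_depth()                # now the clear actually happens


buggy = ToyFramebuffer()
begin_pass_buggy(buggy)

fixed = ToyFramebuffer()
begin_pass_fixed(fixed)
```

This also matches the symptom: with shadow mapping on, an earlier pass had already enabled depth writes, so the misplaced clear happened to work.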
So my renderdoc captures now don't show gazillions of events/passes/queues finally
sounds like a very thoughtful render pipeline set up, my opengl stuff was pretty yolo, which I couldn't replicate in vulkan, I needed something well thought out
I just tried to emulate my unity experience to some extent. Actually to the extent of my current progress. I'm sure it'll blow up in my face. It has so far and it will again. But I could re-think it and adapt it based on how it should work (as far as my current understanding goes I mean)
It slows me down for sure but at the same time it forces me to think things through harder
Ultimately I can't say it is the best way to learn or not of course
But it is a way
I think it is a smart approach
@fluid parrot Can I mention you and/or Ombre in Kakadu's Readme.md?
I'm creating a past, ongoing & planned features part. I want to mention you and Ombre in there, with regards to bloom & post-processing in general.
Also the credits part.
Sure, no problem
OMG your blog was so cool already but the new re-design is gorgeous 
Congrats to you and your team by the way on the Emmy trophy!
Now I feel even more of an impostor while hyping up the basic things I did for Kakadu as much as I can in the Readme.md, in hopes of being hired for graphics some day
But no, joking aside. It feels good to sometimes pat yourself on the back.
Ho hey thx
(That was some kind of shadow update hehe)
But don't worry, I have been (and still am) in your shoes
Yeah I wanted to link to you and Ombre and accidentally saw it
Well you certainly walked thousands of miles and I aim to follow 🫡
Okay so there's a performance regression
And there is a weird thing
But before I say that let me say that not disabling V-sync of GLFW was a huge mistake because I don't have base numbers to evaluate performance now
I tried to not fall under 144 fps which is the v-sync target as my main monitor is 144hz
After I fix this regression, I'll disable v-sync and accept whatever fps I get as the current base-line moving on.
Of course a much better idea would be to show my frame times to fellow frogs here and ask whether there is anything fishy they can catch or not.
I'll do that first 
Now the weirdness is this: I've found the offending commit by bisecting my recent git history.
The offending commit runs at 131 fps but magically goes to 144 fps on RenderDoc.
The commit before that runs at 144 fps but magically falls down to 131 fps on RenderDoc.
Classic
Well at least they both have the same GL API event count in RenderDoc
Offending commit looks fairly innocent as well
9b3c372 - Engine/Graphics/Renderer & Client Apps: Move MSAA Resolve & Tone Mapping to the Renderer completely
I wonder how exactly I fucked up
Debugging these are usually fun
I'll also add a moving avg. fps. My poor eyes can not track those digits.
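A minimal sketch of a moving-average fps counter, assuming a fixed window of recent frame times (the class name and window size are made up):

```python
from collections import deque


class MovingAverageFps:
    """Average fps over the last `window` frames."""

    def __init__(self, window=120):
        # deque with maxlen drops the oldest frame time automatically
        self.frame_times = deque(maxlen=window)

    def add_frame(self, seconds):
        self.frame_times.append(seconds)

    def fps(self):
        total = sum(self.frame_times)
        return len(self.frame_times) / total if total > 0.0 else 0.0


counter = MovingAverageFps(window=120)
for _ in range(200):
    counter.add_frame(1.0 / 144.0)   # pretend every frame took 1/144 s
average_fps = counter.fps()
```

Averaging frame times and inverting at the end is preferable to averaging per-frame fps values, since a few very short frames would otherwise skew the number upwards.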
I just moved stuff around and there is a 40 fps difference (125 vs 165 fps with v-sync turned off).
Granted, there were some non-deterministic stuff in the Renderer (uses of unordered containers instead of ordered ones & using Shader pointers as map keys are some examples) causing the objects inside passes/queues to be rendered in random order on each run
BUT this 40 fps difference is consistent between the offending commit and the one before that.
Nevertheless I tried to temporarily patch both commits to lock draw order, just to narrow down the possible list of causes
I don't have any clues yet.
Tomorrow, I'll simply divide the offending commit's diff compared to the previous into small pieces, apply each one, one by one, to see exactly what causes it.
I may also start using Tracy now that I have an excuse
So are the inverse fps values I'm getting from RenderDoc lol
I'll also give NSight another try some time
RenderDoc seemed less cluttered and easier to mess around when I last tried NSight
nsight has really helped me figure out the bottlenecks
I tried a lot of things tonight to find out the cause of the regression but to no avail
I'll go to bed after playing some games with the better half
I need fresh eyes and some brain cells to tackle it tomorrow
I tried my best to level the playing field for both before/after commits concerning the regression
Disabled everything but the instanced (200k) cubes
There is still a substantial fps difference
And it is still the inverse in RenderDoc
I looked at the draw calls line by line
They are exactly the same
I even exported the draw calls to txt to diff them
Still, same...
200k cube positions were generated randomly and I parallelized the generation code because it was so slow it was making debugging the renderer a nuisance
Calling std::random stuff inside the parallel for each body of course meant that cube positions were randomized on each run of the program
So I moved the generation out of the parallel loop and made sure that everything is deterministic and exactly the same between before vs. after commits
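Roughly what the deterministic version looks like, as a sketch: one serial generator with a fixed seed, run before any parallel work. `Vector3`, `GenerateCubePositions` and the seed value are placeholders for illustration, not the actual engine code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

struct Vector3 { float x, y, z; };

// Generates `count` positions inside a cube of half-extent `extent`.
// A fixed seed plus a single serial generator yields the same sequence on
// every run of the same build, so before/after commits see identical input.
std::vector< Vector3 > GenerateCubePositions( std::size_t count, float extent, std::uint32_t seed = 1337u )
{
    assert( extent > 0.0f );

    std::mt19937 generator( seed );
    std::uniform_real_distribution< float > distribution( -extent, extent );

    std::vector< Vector3 > positions;
    positions.reserve( count );
    for( std::size_t i = 0; i < count; i++ )
    {
        // Draw the components one by one so the consumption order is explicit.
        const float x = distribution( generator );
        const float y = distribution( generator );
        const float z = distribution( generator );
        positions.push_back( { x, y, z } );
    }
    return positions;
}
```

Anything expensive that depends on the positions (building instance matrices, for example) can still run in a parallel loop afterwards, since it only reads the already generated values.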
I'll give Tracy a shot I guess and see if some CPU-side slowdown occurred
I wonder if anyone would be willing to take a look at before vs after renderdoc captures and point me at something. Which channel should I ask this in?
Tracy is the correct tool, renderdoc is not a profiler. Tracy should immediately show you what is slow
If something is slow I use tracy and nsight profiling
Have you rebooted your pc?
Changed any windows settings?
Maybe downloaded and installed a new app? Are other games also slow?
I would look at what is using system resources
Try kakadu on another computer
When you use tracy do not run other apps besides tracy and kakadu
The tracy pdf explains a lot about performance testing btw
Good luck!
Lots of times, I've been going at this for some time now
But I have 2 commits; A before and after commit. I swap between them and fps drops by 40
So if there is something wrong with my pc, it should affect Kakadu globally I think
Yeah I've been reading it. It's extraordinarily thorough!
I don't think the problem is GL related

If I have a performance issue I just connect to tracy, I already have my program configured to run with tracy
it will just immediately point out where all the time is spent
it will directly show you in the code
I don't know how good nsight profiling is on non-nvidia hardware but on nvidia with vulkan at least it is very helpful also
if I don't already know the cause, it's no good to look at code and try to figure out a performance issue that way
Your workflow makes sense
I'll instrument my major hotspots as well
But I think I'll also add capture mode support as well
So I can run Kakadu and hit f12 or click on some ImGui button so only that frame gets sent to Tracy
It's time for me to open a question post in the forum here
I'm having weird results
Like, Renderer::Update() seems to be taking 90%+ of the frame time in Tracy.
But if I completely comment out what is in there, fps stays the same and Tracy now reports most of the frame time is spent in Renderer::Render() this time.
And to top it all, when I added gpu zones, Tracy now says that most of the time is spent in tracy::GpuCtx::Collect.
Frame time stays the same but the offender is changing
That sounds like UB
Hrm are you sure your zones are set up correctly?
I might ask on the C & C++ server instead of here
I would create a throwaway git branch and just start deleting code until the problem went away
Sounds like a tough issue sorry 
That's what I'm doing yeah 
What server is that
I'm hesitant to ask here or there tbh
Both the problem and the codebase are too complex for a simple outsider to just come barge in, take a look and offer help to begin with
Asking here, I'll probably get flagged as a spammer because this is probably not directly related to graphics, but to the c++ code in general
Asking in a c++ forum, I'd say there's just too much "stuff" for a possible helper to unpack
Well, first things first. I got too distracted with Tracy (it is fun) and forgot to actually measure both commits to compare things
Tracy apparently has a compare feature as well, maybe I could do that.
Kind of good news: This happens for the previous commit too
I'll start calling them commit A and commit B
commit A: 165 fps, working fine commit (baseline)
commit B: 125 fps, regression commit.
commit B is the next commit after commit A, naturally
And both commits' Tracy profiles look like this; Gpu context collect takes an abnormally long time
But I don't think this result is real
I'll remove gpu zones just to show why it isn't real
Now, it is billed to SandboxApplication::Update().
I'll scour Tracy docs, maybe there is a gotcha or some other trap I've fallen into
From a quick glance, no
There are many things I can do, I feel like I'm jumping all over the place
I'll try running both commits on another machine
My wife's gpu is more modern (AMD Rx 6700 Xt vs my trusty old Gtx 1080)
Since I didn't export Kakadu and try it in another PC all this time except once or twice, asset paths under Engine/ directory of course do not work out of the box, as relative paths are interpreted relative to the .exe by default.
I was thinking of properly handling that for some time now, guess that time is now.
Ran with ASAN, no errors.
This detour (like any other nowadays) is both frustrating and fruitful
I'm getting the chance to mess with NSight Graphics & Systems finally, not to mention Tracy
Superluminal also seems promising and I would have bought it had it had gpu profiling support
Nsight Systems is too cluttered (not that Nsight Graphics is any better). Tracy is so much better UI/UX wise.
Also, if I understood correctly, I should instrument my code with NVTX to better see what happens. I already did that with Tracy and don't want to sprinkle my codebase with instrumentation code
Although, NSight Graphics seems to pick on GL markers just fine
I wish Nsight Systems did as well (maybe it does but I'm missing something I don't know)
To distract myself from my abysmal progress, made the stats window into an overlay and added a graph for the fps.
Also switched to rolling avg. for fps and frame time so it is easier to follow the values.
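A rolling average like this can be kept with a small ring buffer and a running sum. This is a sketch under my own naming (`RollingAverage`, the window size), not the code in the repo:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-size rolling average over the last N samples (e.g. frame times in ms).
template< std::size_t N >
class RollingAverage
{
    static_assert( N > 0, "window must hold at least one sample" );

public:
    void Push( float sample )
    {
        assert( sample >= 0.0f ); // frame times are never negative

        if( count == N )
            sum -= samples[ next ]; // window is full: drop the oldest sample
        else
            count++;

        samples[ next ] = sample;
        sum += sample;
        next = ( next + 1 ) % N;
    }

    // Average of the samples currently in the window, 0 when empty.
    float Average() const
    {
        return count == 0 ? 0.0f : sum / static_cast< float >( count );
    }

private:
    std::array< float, N > samples{};
    std::size_t next  = 0; // slot that the next Push() overwrites
    std::size_t count = 0; // number of valid samples, at most N
    float sum = 0.0f;
};
```

Averaging frame time and deriving fps from it (1000 / average ms) tends to be steadier than averaging fps directly. The running sum can drift slightly over very long sessions; recomputing it from the buffer every so often would fix that.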
OK progress!
Got rid of asset paths resolving to D:\Source\Kakadu\blablabla and made them actually resolve into relative path (relative to .exe location) so Kakadu can now be exported
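The resolving part boils down to joining the asset path onto the executable's directory. A minimal sketch, assuming the executable directory is obtained elsewhere (`ResolveAssetPath` is a hypothetical helper, not the engine's API):

```cpp
#include <cassert>
#include <filesystem>

// Resolves an asset path against the directory containing the executable.
// How `executable_directory` is obtained is platform specific (for example
// via GetModuleFileNameW on Windows) and is left to the caller in this sketch.
std::filesystem::path ResolveAssetPath( const std::filesystem::path& executable_directory,
                                        const std::filesystem::path& relative_asset_path )
{
    assert( relative_asset_path.is_relative() );

    // lexically_normal() collapses "." and ".." purely textually,
    // without touching the file system.
    return ( executable_directory / relative_asset_path ).lexically_normal();
}
```

Resolving against the executable's directory rather than the current working directory keeps the paths valid regardless of where the program is launched from.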
Tried on a different pc with AMD Radeon 6700XT
Both commits yield the same 135 fps, on a 4K monitor (mine is 2K, and I get 165 fps for the baseline and 125 fps for the regression)
And the fps is surprisingly stable, unlike my pc where it is somewhat fluctuating
I'll update the gpu driver
Let's see what happens
Driver update didn't change anything on my pc.
Now on the one hand, I should really let this go because it's interfering with my studies.
On the other hand it is really frustrating that a seemingly trivial change introduced a 40 fps slowdown on my development PC.
I treated it as a nice challenge (maybe a blog post idea!) so far and tackled it
But profiling is buggy/problematic and large portions of cpu time are attributed to swapping buffers or tracy gpu context collect operations
Which does not seem right to me
I'm not sure what to do with NSight Systems or Graphics captures/traces either
Engine::Material msaa_resolve_material;
Engine::Renderable msaa_resolve_renderable;
Engine::Shader* shader_msaa_resolve;
All I did was to move these (and their init. code) from SandboxApplication class into the Renderer class.
I know there could be endless things causing problems and this much info. is not nearly enough to solve the problem. But I don't think anyone would take time to work on this and I can't expect anyone to.
This was the last I wrote about this issue and I'm moving on for now
Who knows maybe I'll discover what went wrong some day, while working on something else
Although I would be happy if someone ran the two executables inside and shared the fps values
Time for bed, and hopefully post-processing work tomorrow!
Update: Did more work on getting the post-processing side of things ready on the renderer side
Added a framebuffer_target_override member to RenderQueue and decided to create a new Queue for each effect
This way I can ping pong 2 framebuffers for post-processing and have each effect work on the result of the previous
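The ping-pong bookkeeping itself is small. A sketch of how the per-effect source/target pairs could be planned (`FramebufferId`, `EffectTargets` and `PlanPingPong` are made-up names; in the setup described above, the target would be what goes into each queue's framebuffer_target_override):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical handle type; in a GL renderer this would be a framebuffer object id.
using FramebufferId = unsigned int;

struct EffectTargets
{
    FramebufferId source; // framebuffer whose color attachment the effect samples
    FramebufferId target; // framebuffer the effect renders into
};

// Assigns a (source, target) pair to each effect so that every effect reads
// what the previous one wrote. `scene` holds the resolved scene color that the
// first effect reads; `ping` and `pong` are the two scratch framebuffers.
std::vector< EffectTargets > PlanPingPong( std::size_t effect_count,
                                           FramebufferId scene,
                                           FramebufferId ping,
                                           FramebufferId pong )
{
    assert( ping != pong );

    std::vector< EffectTargets > plan;
    plan.reserve( effect_count );

    FramebufferId source = scene;
    FramebufferId target = ping;
    for( std::size_t i = 0; i < effect_count; i++ )
    {
        plan.push_back( { source, target } );

        // What was just written becomes the next input;
        // the other scratch framebuffer becomes the next output.
        source = target;
        target = ( target == ping ) ? pong : ping;
    }
    return plan;
}
```

The last entry's target holds the final image, which then gets blitted or drawn to the default framebuffer.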
In other news: By now, I had people test on 3 other computers and they all have reported there is no regression
Whatever happens seems to be specific to my pc/config.
I'm not working on this anymore but I'm still having people test it on their pcs nevertheless
Otherwise I couldn't get ping pong to work in the current state as is, because post-processing pass has 1 target framebuffer, so it wouldn't work.
- I fixed some glTF import bugs with textures loaded from the model file:
Upon loading sponza.gltf and inspecting the shading I realized something was wrong with the normals, specifically the texture used was wrong.
- Model loader now properly identifies and names metallic-roughness maps as well (previously, only albedo and normal maps were named). I'm not using it yet, but this had to be done sometime, so why not now.
- Separated shadow casting and receiving (previously, a Renderable just "had shadows")
- Move all shadow map uniform setting inside the Renderer so client code no longer cares.
- Add matrix - column vector multiplication as an overload as well (my engine is row major, vectors are post-multiplied with matrices normally).
- Fixed a bug with sorting of Renderables in a Queue (camera position was wrong because I stupidly used the last row of the view matrix. Added a utility math function to get the camera world pos from the view matrix). Slight perf. regained from sorting properly.
- Added a new Debug Geometry shader to visualize geometry tangents, bitangents and normals, much like gltf sample viewer does.
- Moved Renderer member from client apps to the Engine project itself (don't know why it was there to begin with)
- Added a second overlay (first one is the top-right Frame Stats. overlay) to the viewport window called "Viewport Controls". Currently it only has one button and it is explained in the next item.
- Added some ImGui APIs to ImGuiUtility namespace: DrawArrow(), DrawShadedSphere() and DrawShadedSphereComboButton(). Last one is used for the "editor shading mode" button located on the top-leftmost overlay in the viewport. I used Unity's "Draw Mode" button as reference for this.
- Added polygon mode to render state, mainly to enable wireframe mode in the editor. (I might remove it again, because I suspect I won't use it. Reasoning explained below)
- Implemented a simple "Morph" system, which is a fancy word for tween (I don't like the word tween, so I asked ChatGPT for alternatives and liked Morph). It's implemented with std::function< void( float ) >, float being the normalized time value passed to the Morph every frame, by the Morph System, which is "ticked" every frame inside Application::Update(). Usage is nice imho, I'll share some snippets.
- Used the Morph System in a debug button: "Flash Framebuffer Clear Color". This ping-pongs the clear color of the main framebuffer between cyan and yellow 5 times in 0.75 seconds, so the user can quickly see whether parts of the screen are not drawn into. Not terribly useful, but it served as a test use case for the morph system.
- Finally bothered to fix the exception when saving a shader and trying to hot reload it: std::filesystem::last_write_time() produces the std::errc::no_such_file_or_directory error for some reason (maybe the file is deleted and re-created on save by VS). I simply ignored this error and checked again next frame and the hiccup is gone.
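For the hot reload item, one way to "ignore and retry next frame" is the non-throwing std::error_code overload of std::filesystem::last_write_time(). This is a sketch; `TryGetLastWriteTime` and `ShaderFileWatch` are hypothetical names, and the actual fix may just as well be a try/catch:

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <system_error>

// Returns the file's last write time, or std::nullopt if it cannot be queried
// right now (for example because the editor is in the middle of a save).
std::optional< std::filesystem::file_time_type > TryGetLastWriteTime( const std::filesystem::path& path )
{
    assert( !path.empty() );

    std::error_code error;
    const auto time = std::filesystem::last_write_time( path, error ); // non-throwing overload
    if( error )
        return std::nullopt;
    return time;
}

// Hypothetical per-shader polling state, checked once per frame.
struct ShaderFileWatch
{
    std::filesystem::path           path;
    std::filesystem::file_time_type last_seen{};

    // True when the file changed since the last successful query.
    // A failed query is treated as "no change yet" and retried on the next call.
    bool PollChanged()
    {
        const auto current = TryGetLastWriteTime( path );
        if( !current || *current == last_seen )
            return false;

        last_seen = *current;
        return true;
    }
};
```

Since `last_seen` starts at the default time point, the first successful poll reports a change; initializing it at shader load time avoids a spurious reload.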
Also implemented a wireframe rendering mode, in addition to geometry TBN modes
But saw that the fps dropped 30% (180 -> 135-140)
Did some research, found out why, came up with a simple plan to circumvent it and voila, same scene is rendered in wireframe, at 360 fps
I won't bother explaining it in detail here, because it'll be my first (or second, after a quick post on the Morph system) blog post hopefully
Instead
Some pictures
I used ImGui's custom rendering APIs to draw a gray sphere and a white ellipse on top, rotated roughly 45 degrees, to simulate a shaded sphere as a button
The "editor shading modes" context menu
Tex coords, shading normals and shaded wireframe are not implemented yet, because I have to tidy up the code for the others first and commit
Regular shading with Suzanne
Geometry tangents
Geom bitangents
Geom normals
My custom wireframe solution
Bonus: Sponza shaded (blinn-phong, no PBR yet).
Wanted to get shadows working but couldn't get the shadow projection volume to work right for this image. I'll get it to work when I have some free time
Sponza geom. normals
Sponza wireframe
Morph system in action
A barebones implementation
Updated every frame
Example use
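In text form, a barebones Morph system along the lines described earlier (a std::function< void( float ) > fed normalized time every frame) could look like the sketch below. The member names and the removal policy are assumptions for illustration, not the code from the screenshots:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// One running morph: a callback that receives normalized time in [0, 1].
struct Morph
{
    std::function< void( float ) > on_update;
    float duration_in_seconds;
    float elapsed_in_seconds = 0.0f;
};

class MorphSystem
{
public:
    void Add( std::function< void( float ) > on_update, float duration_in_seconds )
    {
        assert( on_update && duration_in_seconds > 0.0f );
        morphs.push_back( { std::move( on_update ), duration_in_seconds } );
    }

    // Ticked once per frame (e.g. from Application::Update()) with the delta time.
    // Note: callbacks must not call Add() in this sketch, since that could
    // reallocate the vector while it is being iterated.
    void Update( float delta_time_in_seconds )
    {
        for( auto& morph : morphs )
        {
            morph.elapsed_in_seconds += delta_time_in_seconds;
            morph.on_update( std::min( morph.elapsed_in_seconds / morph.duration_in_seconds, 1.0f ) );
        }

        // Drop finished morphs; each one has already received its final t = 1 call.
        morphs.erase( std::remove_if( morphs.begin(), morphs.end(),
                                      []( const Morph& morph )
                                      { return morph.elapsed_in_seconds >= morph.duration_in_seconds; } ),
                      morphs.end() );
    }

    std::size_t ActiveCount() const { return morphs.size(); }

private:
    std::vector< Morph > morphs;
};
```

A "flash clear color" style morph would then be a lambda that maps t to a back-and-forth blend factor and lerps between the two colors, added with a duration of 0.75 seconds.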
I was working on Bloom initially
Then during resting I thought about how to showcase bloom once it's done and loaded up suzanne and sponza
And saw that normals were problematic and then one thing led to another and yeah, bloom is postponed again
But I got some useful stuff done I think
I saw Bjorn doing some debug vis. on Twitter as well and that also gave me the idea to copy the debug vis. modes from gltf sample viewer.
Another mode: Debug vectors (normals and tangents as line segments)
