Kakadu | Graphics Programming | Page 4

astral hawk Mar 13, 2025, 8:06 PM

#

I'm wasting soooo much memory on this 40 longitude sphere but I'm not messing with tessellation yet, so it'll have to do

astral hawk Mar 13, 2025, 9:04 PM

#

Shaded wireframe and texture coordinates modes

#

Only remaining one is the "shading normals" mode, then I'm done

copper cliff Mar 13, 2025, 11:03 PM

#

Thicc wireframe

astral hawk Apr 7, 2025, 10:33 AM

#

I'm back again

#

After 3 weeks!

#

I've been focusing on the wireframe method because I wanted to blog it

#

But it had a major problem as Cirdan put it

#

Those are some thicc lines

#

And more importantly those are some inconsistently thick lines

#

So I wanted to make it so that I could specify thickness in screen-space pixels and every triangle, no matter the perspective/depth would have uniform line thickness

#

My current method worked by assigning barycentric coordinates <1,0,0>, <0,1,0> & <0,0,1> to the vertices of a triangle respectively, in a geometry shader, and letting the rasterizer do its magic so I could use the interpolated barycentric coordinates in the frag. shader to check whether any of them is below a certain threshold, i.e., whether we are near an edge.

But doing this causes the thickness to depend on the triangle size (specifically the distances between vertices and the opposing edges, since that is essentially what a barycentric coordinate encodes, in normalized form).
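To illustrate the size dependence described above, here is a minimal Python sketch (not the actual shader code). It assumes screen-space triangles given in pixels; the helper names `edge_distance_px` and `line_thickness_px` are hypothetical. A fixed barycentric threshold covers a band whose pixel width scales with the vertex-to-opposing-edge distance.

```python
import math

def edge_distance_px(a, b, p):
    """Perpendicular distance, in pixels, from point p to the line through a and b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return abs(cross) / math.hypot(bx - ax, by - ay)

def line_thickness_px(tri, threshold):
    """Pixel width of the band where vertex 0's barycentric coordinate is below threshold.

    That coordinate is 0 on the opposing edge and 1 at vertex 0, so the band
    spans threshold * (distance from vertex 0 to the opposing edge).
    """
    v0, v1, v2 = tri
    return threshold * edge_distance_px(v1, v2, v0)

# Two triangles with the same shape but different on-screen size.
small = [(0.0, 10.0), (0.0, 0.0), (10.0, 0.0)]
large = [(0.0, 200.0), (0.0, 0.0), (200.0, 0.0)]
print(line_thickness_px(small, 0.05))
print(line_thickness_px(large, 0.05))
```

With the same threshold of 0.05, the small triangle gets a band of about half a pixel while the large one gets about ten pixels, which is the inconsistency being discussed.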

So I said OK, let me calculate these vertex-to-edge distances in clip space, convert them to screen-space and use that in frag. shader. That way I could work directly with pixel-unit distance values.

Of course there are major problems with this strategy, which I didn't realize at the time, due to my lack of understanding in some important areas.

And implementing this, I saw there were problems visually: Line thickness for a triangle changed when the camera rotated. This happened for triangles where the depth varies from vertex to vertex. Triangles completely parallel to the near plane did not suffer from this problem for example.

Turns out this was the perfect challenge to deep-dive into rasterization, projection and pers. correct depth/attribute interpolation.

#

I mostly did reading/study and did very little coding during these 3 weeks

#

Scratchapixel has nice, detailed, but not too mathy (or rather, too rigorously mathy) explanations on these topics

#

So I started from scratch (pun intended) there, i.e., from the beginning.

#

Up until the end of the rasterization chapter

#

Also re-read Fabian Giesen's "A trip through the graphics pipeline", specifically the parts on rasterization and attribute interpolation

#

After all of these and some back and forth with ChatGPT (which was surprisingly insightful and didactic)

#

I realized 2 major problems

#

Doing geometric operations in clip space does not make sense semantically: It is not a Euclidean space.

#

I was calculating the vertex-to-opposing-edge distances in clip space and then converting to screen-space.

#

It became clear that I had to invert this: First convert from clip-space vertex positions to screen-space, then do all those geometric operations.
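A rough Python sketch of that ordering, assuming a viewport that spans the whole window and vertices with w > 0. The function names are hypothetical and this is not the geometry shader itself: first divide by w and apply the viewport transform, then measure distances in pixels.

```python
import math

def clip_to_screen(clip, width, height):
    """Perspective divide followed by a viewport transform (assumes w > 0)."""
    x, y, _z, w = clip
    ndc_x, ndc_y = x / w, y / w
    return ((ndc_x * 0.5 + 0.5) * width, (ndc_y * 0.5 + 0.5) * height)

def vertex_to_edge_distances(p0, p1, p2):
    """Distance in pixels from each vertex to its opposing edge.

    Uses distance = 2 * triangle area / length of the opposing edge.
    """
    twice_area = abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p1[1] - p0[1]) * (p2[0] - p0[0]))
    return (twice_area / math.dist(p1, p2),
            twice_area / math.dist(p2, p0),
            twice_area / math.dist(p0, p1))

# Clip-space triangle with differing w values, rendered to an 800x600 viewport.
clip_tri = [(-2.0, -2.0, 0.0, 2.0), (1.0, -1.0, 0.0, 1.0), (-4.0, 4.0, 0.0, 4.0)]
screen_tri = [clip_to_screen(v, 800, 600) for v in clip_tri]
print(vertex_to_edge_distances(*screen_tri))
```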

#

But this in itself was not enough to solve the varying thickness problem

#

Because

#

The default perspective correct interpolation (or hyperbolic or rational-linear interpolation) was distorting the edge distances I calculated in the geometry shader.

#

While this default behaviour of perspective correct interpolation is necessary for depth and attribute values, it is counter-productive in this case because the edge distances were already projected and transformed into screen-space.

#

There (in screen-space), simply linearly interpolating across the triangle's pixels is actually needed to produce the correct values.

#

In a way, we are accounting for the perspective twice if we both project and interpolate hyperbolically

#

At least this is how I understood it.
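The difference between the two interpolation modes can be sketched numerically. This Python snippet assumes the usual formulation where perspective-correct interpolation weights each vertex attribute by 1/w. The function names are hypothetical, and `bary` stands for screen-space barycentric weights.

```python
def interpolate_noperspective(bary, attrs):
    """Plain linear interpolation in screen space."""
    return sum(b * a for b, a in zip(bary, attrs))

def interpolate_perspective(bary, attrs, ws):
    """Perspective-correct (rational-linear) interpolation using per-vertex clip w."""
    num = sum(b * a / w for b, a, w in zip(bary, attrs, ws))
    den = sum(b / w for b, w in zip(bary, ws))
    return num / den

# Distance (in pixels) from each vertex to the edge opposing vertex 0.
edge_dist = (100.0, 0.0, 0.0)
ws = (1.0, 10.0, 10.0)          # vertex 0 is much closer to the camera
halfway = (0.5, 0.5, 0.0)       # screen-space midpoint between vertex 0 and vertex 1

print(interpolate_noperspective(halfway, edge_dist))    # the pixel distance we want
print(interpolate_perspective(halfway, edge_dist, ws))  # skewed toward the near vertex
```

In this made-up example the screen-space midpoint is 50 pixels from the edge, which linear interpolation reproduces, while the perspective-correct result is roughly 91. When all w values are equal the two agree, which fits the observation that triangles parallel to the near plane looked fine.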

#

So disabling that via noperspective interpolation qualifier in shader code

#

I got ... something.

#

Nice wireframes, all equal thickness

#

But I conveniently disabled the ground, left, and right walls

#

Because there is a MAJOR problem

#

And I'm not sure why it happens but I strongly believe it has to do with clipping

#

As long as the triangle remains in viewport, the lines are of consistent thickness

#

But if it straddles the near plane, this artifact happens

jagged pecan Apr 7, 2025, 7:09 PM

#

astral hawk I'm back again

welcome back

astral hawk Apr 9, 2025, 11:33 AM

#

Part 1 of the blog post is live!

#

https://fauder.github.io/posts/25-04-07-optimized-wireframe-rendering-part-i/

Burak Canik

Optimized Wireframe Rendering: Part I

This is my first actual blog post! I had planned to write other blog posts first, but I decided to write about a recent subject I worked on instead. So, wireframe rendering it is!
Anyway, the reason I got into this topic in the first place is because I wanted to add “Editor Shading Modes” as I call it, similar to Unity’s, to my study rende...

sharp shadow Apr 9, 2025, 4:07 PM

#

astral hawk https://fauder.github.io/posts/25-04-07-optimized-wireframe-rendering-part-i/

Hey as far as I’m aware you can use fwidth to get a consistent edge width 🙂 I’m sure that’s what I did in my unity thing when doing wireframe debug

astral hawk Apr 9, 2025, 6:52 PM

#

You're right of course

#

Part II of the blog will cover the stuff I explained above

#

Part III will be utilizing fwidth and calling it a day

#

The only "downside" to the fwidth approach as far as I can tell is you are not exactly working in screen-space values so you can't say "every line should be 4 pixels wide". But it is insignificant of course, just set a value (of whatever unit it is) you like and you're done

sharp shadow Apr 9, 2025, 7:07 PM

#

Of course, very good:)

#

Btw avoid geometry shaders, pipe it through tessellation stage instead, or just use a compute to make the barycentric coordinates, geometry shaders suck

#

fwidth is some derivation of ddx/ddy I think?

astral hawk Apr 10, 2025, 12:02 PM

#

fwidth is abs(ddx(...)) + abs(ddy(...)) I think
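As a CPU-side illustration of that definition, here is a Python sketch that approximates the derivatives with one-pixel finite differences (GPUs compute them per 2x2 pixel quad, so this is only an approximation of the idea). The helper names and the example functions are assumptions. Dividing a barycentric value by its `fwidth` gives an approximate distance from the edge in pixels, exact for axis-aligned edges and off by up to a factor of sqrt(2) for diagonal ones.

```python
def fwidth(f, x, y):
    """abs(ddx(f)) + abs(ddy(f)), using one-pixel forward differences."""
    ddx = f(x + 1, y) - f(x, y)
    ddy = f(x, y + 1) - f(x, y)
    return abs(ddx) + abs(ddy)

def pixels_from_edge(f, x, y):
    """Approximate distance to the edge where f == 0, in pixels."""
    return f(x, y) / fwidth(f, x, y)

# Barycentric-like values that are 0 on an edge and grow linearly away from it.
def horizontal(x, y):
    return y / 200.0            # edge along y = 0

def diagonal(x, y):
    return (x + y) / 200.0      # edge along x + y = 0

print(pixels_from_edge(horizontal, 10, 4))  # matches the true distance of 4 px
print(pixels_from_edge(diagonal, 3, 5))     # about 4, true distance is about 5.66 px
```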

astral hawk Apr 10, 2025, 12:03 PM

#

sharp shadow Btw avoid geometry shaders, pipe it through tessellation stage instead, or just ...

Yeah I wasn't happy with using geometry shaders in the first place. I know they are not that performant. I thought about switching to tessellation, once I get to the topic of tessellation of course 🙂 Didn't think about compute though (and I also didn't get to compute shaders yet, at least on OpenGL).

#

Thanks for solid advice by the way, much appreciated 🙏

sharp shadow Apr 10, 2025, 12:04 PM

#

yeah, I didn't realise btw that you were doing tutorials, it's good though 🙂 also not speaking from some vast knowledge or trying to be rude, I think you understood what I meant anyway but I have anxiety so yeah 😄

astral hawk Apr 10, 2025, 12:05 PM

#

I didn't think you were rude, not even for a second. Relax friend forgelove

sharp shadow Apr 10, 2025, 12:06 PM

#

btw, what do you think of my idea... if I have 50 spot lights all drawing shadows from 50 meshes.. that's 2500 draw calls for all the shadows...

Instead, I wanna collect which lights affect each object, build an instance buffer, and draw the 50 shadowmaps with a draw instanced call with 50 instances, each instance has its own shadow viewproj and tile offset/scale variables, then the only additional "cost" is pixel shader discard and vertex shader tile transform... but the cpu side will be blessed by only 50 instance draw calls 😄
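The bookkeeping behind this idea could look roughly like the following Python sketch. It is one reading of the proposal, not the actual implementation: the record layout, the square shadow atlas with `atlas_tiles_per_row` tiles per row, and all function names are assumptions. Each mesh gets one instanced draw whose instances carry a light index (for looking up that light's viewproj) plus a tile offset and scale.

```python
def build_instance_buffers(affecting_lights, atlas_tiles_per_row):
    """Map mesh id -> per-instance records, one record per light affecting that mesh."""
    scale = 1.0 / atlas_tiles_per_row
    buffers = {}
    for mesh_id, lights in affecting_lights.items():
        buffers[mesh_id] = [
            {
                "light": light,  # index into an array of per-light viewproj matrices
                "tile_offset": ((light % atlas_tiles_per_row) * scale,
                                (light // atlas_tiles_per_row) * scale),
                "tile_scale": scale,
            }
            for light in lights
        ]
    return buffers

def naive_draw_calls(affecting_lights):
    """One draw per (light, mesh) pair."""
    return sum(len(lights) for lights in affecting_lights.values())

def instanced_draw_calls(affecting_lights):
    """One instanced draw per mesh that is lit by at least one light."""
    return sum(1 for lights in affecting_lights.values() if lights)

# Worst case from the message above: 50 lights all affecting 50 meshes.
affecting = {mesh: list(range(50)) for mesh in range(50)}
buffers = build_instance_buffers(affecting, atlas_tiles_per_row=8)
print(naive_draw_calls(affecting), instanced_draw_calls(affecting))
```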

#

I'm sure it'll improve performance considerably as I'm currently cpu bound on my mac at least

#

plus opengl on mac is god awful

sharp shadow Apr 10, 2025, 12:06 PM

#

astral hawk I didn't think you were *rude*, not even for a second. Relax friend <:forgelove:...

🥺

#

honestly I admire the graphics community, everyone has their place here, I'm just doing opengl and c# so not in the major leagues heh but I've learned a lot so far and continue to learn 😄

astral hawk Apr 10, 2025, 12:15 PM

#

sharp shadow btw, what do you think of my idea... if I have 50 spot lights all drawing shadows...

You are instancing the meshes right?

#

Or the light data (viewproj etc.)?

#

Can you make it 50 spot lights and 60 meshes so I can understand easier forgderp4

astral hawk Apr 10, 2025, 12:36 PM

#

I think I understand what you're proposing: Instance the light data, not the meshes, by finding out which light affects which object and building an instance buffer based on that. I'm not familiar with tiled shadow maps but a quick research shows that it is basically spatially partitioning the shadow map. So by vertex shader workload you mean to find out which tile of the shadow map (of a light instance) a mesh belongs to I assume.

I didn't understand what you are discarding in the pixel shader though.

#

Overall, sounds solid to me though

sharp shadow Apr 10, 2025, 1:32 PM

#

Well instancing the meshes with each instance using its own viewproj

#

Btw this is your thread right? @astral hawk

astral hawk Apr 10, 2025, 1:32 PM

#

Yes

sharp shadow Apr 10, 2025, 1:32 PM

#

Do you mind if I post what I’m doing here ?

astral hawk Apr 10, 2025, 1:32 PM

#

Sure

sharp shadow Apr 10, 2025, 1:33 PM

#

Shitty but it works, rough reflections using brute force importance sampling (because I don’t have time to perfect a mip gen for pbr)

#

I’ll show rough stuff

#

Or at least 0.3 rough to give you an idea of what it should be like heh

#

I still think my metalness is inverse but I’m not sure

#

The floor doesn’t have a metallic texture so it won’t be affected by me potentially inverting the metallic

#

I’ll probs redo the pbr anyway

#

Need to send the reflection as a hdr texture too

astral hawk Apr 10, 2025, 1:35 PM

#

It looks awesome, though it's way over my head on the reflection front. But the AO looks nice I think

sharp shadow Apr 10, 2025, 1:35 PM

#

It’s sdr unsigned byte atm

#

Yeah I’m doing a massive hack

#

Getting a reflection vector and using atan and asin to convert to uv coords 😅
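For reference, a common form of that direction-to-lat-long mapping is sketched below in Python. The axis convention (y up, +z mapping to the centre of the image) is an assumption and may differ from what is used here, and `direction_to_latlong_uv` is a hypothetical name.

```python
import math

def direction_to_latlong_uv(direction):
    """Map a 3D direction to equirectangular (lat-long) UVs in [0, 1]."""
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / length, y / length, z / length
    y = max(-1.0, min(1.0, y))                      # guard asin against rounding
    u = math.atan2(x, z) / (2.0 * math.pi) + 0.5    # longitude
    v = math.asin(y) / math.pi + 0.5                # latitude
    return u, v

print(direction_to_latlong_uv((0.0, 0.0, 1.0)))  # centre of the map
print(direction_to_latlong_uv((0.0, 2.0, 0.0)))  # straight up, top row
```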

astral hawk Apr 10, 2025, 1:36 PM

#

atan in a shader frogdelet

#

You're working with Unity's SRP right?

#

How is it? Acerola started that way too if I'm not mistaken

sharp shadow Apr 10, 2025, 1:39 PM

#

nope

#

this is using opentk / gl

#

🙂

#

and yes 😄 😄 slow as fuck I know hah

#

might use that atan2 approximation from shadertoy

#

🙂

astral hawk Apr 10, 2025, 1:47 PM

#

sharp shadow this is using opentk / gl

Oh yeah you mentioned gl in mac I forgot sorry

#

I looked at your blog and saw that you tinkered with Unity's SRP

#

Nice stuff!

sharp shadow Apr 10, 2025, 2:34 PM

#

Yeah hehe I made my own pipeline

#

Which even performs much better than my opengl uni shit

#

Main fps killer is my shadowmap generation passes

sharp shadow Apr 10, 2025, 2:58 PM

#

ooof

#

looks pretty much ok 😄

#

well, that's me done for now, gotta go 😄

astral hawk Apr 10, 2025, 3:06 PM

#

sharp shadow Main fps killer is my shadowmap generation passes

So that's why you are trying to optimize it, got it 👍 Your approach sounded good to me, have you tried it yet?

astral hawk Apr 10, 2025, 3:07 PM

#

sharp shadow

That looks gorgeous. Is there image based lighting in there?

#

Also, do you have a thread here in this server?

sharp shadow Apr 10, 2025, 3:08 PM

#

Not just yet tbh I don’t wanna uproot my code yet, also not sure if I should send an array of indices to the shader as vert arrays with a ssbo of all light viewprojs in view or the matrices themselves

#

I don’t yet

#

Also yes there’s reflections from the background hdri

#

Idk if they’re correct as I need to validate stuff before I move forward

#

It’s just my uni graphics module coursework so I don’t think I should make a thread heh

astral hawk Apr 10, 2025, 3:15 PM

#

That's fair

#

Good to have you around froge_yeehaw

sharp shadow Apr 10, 2025, 3:20 PM

#

🙂

#

I’m gonna try slap a latlong to cubemap thing out and do proper mipmap generation 😭

#

Because even mipmap roughness shit looks better and runs far better than importance sampling ha

#

I’ll take a comparison pic when I get to uni lol

#

Hope it doesn’t take longer than 10 mins to get to the bus stop I need to be off at cos I’ve got a presentation at 4:45 😭😭😭

#

What I’ve done to my pbr should be illegal 🤣🤣🤣

#

However when I’ve got the metals properly integrated (need to have reflections in a separate pass, the lighting needs to be diffuse and spec and additively composited, somehow without destroying the fragility of my ropey pbr, and then reflections need to correctly composite in to the specular 😭😭😭😭😭

#

I am sooooo behind schedule

#

Still not off the bus someone best let the bus through now

astral hawk Apr 10, 2025, 5:39 PM

#

astral hawk

I did some debugging and the "moment" this bug starts to happen is when a vertex's w coordinate becomes <= 0

sharp shadow Apr 10, 2025, 6:25 PM

#

I know graphics shit well enough, I’m assuming that when it’s w <0 it’s behind the near plane then? Makes sense heh…

#

Wait is this still geometry shader pipe?

#

Maybe it’s legit being clipped?

#

Or the geometry shader can’t actually see that vert when it’s behind the near plane?

astral hawk Apr 10, 2025, 8:34 PM

#

sharp shadow Maybe it’s legit being clipped?

It sure is, I'm trying to wrap my head around why it is a problem

astral hawk Apr 10, 2025, 10:11 PM

#

First image is showing a wall that is perfectly perpendicular to the near plane (i.e., a quad rotated 90 degrees around Y axis).
The top left vertex is at world point <-15,+15,-30>
This is when the camera is at Z = -31.0 (my engine is left handed: x right y up z forward). The wireframe works because all w values are >= 0

Second image is showing the same scene, but now the camera moved to Z = -29.99. The bug is present, because now the top left vertex's w is -0.01.

The second image is telling us that the vertex-to-opposing edge distances (in screen-space pixels) I calculated in the geometry shader are all ~~zero~~ extremely close to zero.

I was thinking "Ok there is clipping, so what? The vertex-to-edge distances I calculated in the geometry shader are interpolated for the newly created clipped vertices. So why are they all zero?"

#

And then I did some plotting.

#

For the second image, the top left vertex has w = -0.01, which maps to NDC of X = 1500 and Y = 1500

#

That's absurdly large

#

No wonder this happens

#

I mean I knew that perspective distortion is a thing

#

But to see it in action, to this degree, is wild to say the least
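The magnitude is easy to reproduce with a simplified projection. This Python sketch assumes an unrotated camera looking down +z in a left-handed setup, so clip w equals view-space z, with a focal scale and aspect ratio of 1. Those values and the name `ndc_xy` are assumptions, so only the order of magnitude is meaningful and the signs depend on the real projection matrix.

```python
def ndc_xy(view_pos, focal=1.0, aspect=1.0):
    """Project a view-space point and return NDC x, y (left-handed, w = view z)."""
    x, y, z = view_pos
    clip_x = x * focal / aspect
    clip_y = y * focal
    w = z
    return clip_x / w, clip_y / w

vertex_world = (-15.0, 15.0, -30.0)
for camera_z in (-31.0, -29.99):
    # Camera sits on the z axis with no rotation, so only z needs shifting.
    view = (vertex_world[0], vertex_world[1], vertex_world[2] - camera_z)
    print(camera_z, ndc_xy(view))
```

With the camera at Z = -31 the vertex lands at a magnitude of 15 in NDC. At Z = -29.99 the tiny negative w pushes it to a magnitude of about 1500 and flips its sign, because dividing by a negative w mirrors the point.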

#

And this will conclude Part II of the wireframe rendering blog

#

In Part III I will say "just use fwidth" and call it a day

astral hawk Apr 10, 2025, 10:33 PM

#

To put things into perspective (pun not intended), for image 2 (buggy version), these are the vertex-to-opposing-edge-midpoint distances.

#

And now it's time for me to go to bed

humble sinew Apr 10, 2025, 10:37 PM

#

when I see markdown with that color scheme I instantly just think it's GPT...

astral hawk Apr 10, 2025, 10:37 PM

#

That's because it is

#

And I verified the calculations on pen and paper

#

Up until the plots

#

I didn't actually verify the edge distances but from a quick glance they seemed correct

#

If I do decide to include these in the blog post, I'll verify them

humble sinew Apr 10, 2025, 10:39 PM

#

astral hawk That's because it is

I had a little hope left

#

nogpt

astral hawk Apr 10, 2025, 10:39 PM

#

Come on, if we can't even use it for mundane calculations, what's the point of it existing forgderp4

#

It's a glorified wolfram alpha

humble sinew Apr 10, 2025, 10:45 PM

#

then what about using wolfram alpha

sharp shadow Apr 11, 2025, 6:34 AM

#

There’s nothing wrong with using ChatGPT tbh

astral hawk Apr 11, 2025, 6:40 AM

#

humble sinew then what about using wolfram alpha

Fair point. But then again, as long as I know my way around the domain and can verify/dispute the results, I’d argue there is no harm in utilizing chatgpt.

#

There are cases when I can’t verify (this was not one of those), for which I abandon chatgpt and do actual research, study actual resources.

sharp shadow Apr 11, 2025, 11:13 AM

#

yeah, legit 🙂

sharp shadow Apr 11, 2025, 11:53 AM

#

physically-somewhat-based

#

128 importance sample iters, with 128 per pixel atan2 calcs to get reflection vector -> uv coord

#

:V :V :V

astral hawk Jun 16, 2025, 11:55 AM

#

I'm back I guess

#

Or rather I will be (I hope)

#

I've been going through a burn out for 2 months now

astral hawk Jul 9, 2025, 11:34 AM

#

https://fauder.github.io/posts/2025/07/07/optimized-wireframe-rendering-part-ii/

Burak Canik

Optimized Wireframe Rendering: Part II

In the last post, we implemented a triangle-based wireframe solution using barycentric coordinates to shade the edges of each triangle. We also brought back anti-aliasing, which we previously got “for free” with the default glPolygonMode( GL_FRONT_AND_BACK, GL_LINE ) API—assuming anti-aliasing was enabled to begin with.
However, a glaring ...

#

Onto part 3 and finally closing this chapter.

astral hawk Jan 23, 2026, 10:27 PM

#

Damn it's been 6 months huh

#

I am back again

#

Finalizing cod aw style bloom as we speak but I have some questions regarding the upsampling "radius" and the karis avg.

#

During the upsampling passes, both the source texture (mip i) and the render target (mip i + 1) are known, so why does everyone use a uniform scaling "radius" to scale the delta UVs when performing the tent (or whatever) filter in the upsampling shader for bloom?
Why not simply set it to 1 texel of the input texture (the one being filtered, i.e., mip i in this scenario), which is 1 over mip i's size along each axis?

I get "artistic freedom" and have nothing against it, but isn't the "exact fit" I described the sensible default?

#

@fluid parrot Would love to pick your brain on this one

rustic belfry Jan 24, 2026, 3:44 AM

#

Welcome back !! 🎉

#

I just recently implemented this bloom

#

not sure what you mean by a uniform scaling radius in the tent filter

#

I just did exactly what the slides said to do

#

you mean the 1/16?

#

because that's what the slides said to do is why I did it

#

if your technique works better you should do it, I'm pretty sure they derived this empirically by testing different techniques

#

it says there "the filter do not map to pixels, has holes in it"

astral hawk Jan 24, 2026, 8:51 AM

#

Hi Bjorn how are you?

astral hawk Jan 24, 2026, 8:52 AM

#

rustic belfry

Yeah I followed these slides (and Froyok's blog as well).

The third bullet point mentions the radius

#

Unless I am missing something, I'd say that "radius" is already known analytically

#

for upscaling mip 0 to mip 1, it is 1 over mip 0's resolution or explicitly, it is
delta_u = 1 / mip_0_width;
delta_v = 1 / mip_0_height;

Why not simply do this and call it a day instead of using a shader uniform for this radius parameter as it is referred to in this particular slide?
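To make the "exact fit" option concrete, here is a Python sketch of the tap offsets for a 3x3 tent filter spaced exactly one texel of the sampled texture apart, with the 1/16 normalization mentioned earlier. `tent_taps` and its parameters are hypothetical names, and this only computes offsets and weights without sampling anything.

```python
def tent_taps(src_width, src_height):
    """Return (du, dv, weight) tuples for a 3x3 tent filter.

    Offsets are spaced one texel of the sampled texture apart, so no
    separate radius parameter is needed.
    """
    texel_u, texel_v = 1.0 / src_width, 1.0 / src_height
    kernel = ((1, 2, 1),
              (2, 4, 2),
              (1, 2, 1))
    return [((i - 1) * texel_u, (j - 1) * texel_v, kernel[j][i] / 16.0)
            for j in range(3) for i in range(3)]

taps = tent_taps(960, 540)
print(len(taps), sum(weight for _, _, weight in taps))
```

A user-facing radius would amount to multiplying `texel_u` and `texel_v` by some factor before building the offsets.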

rustic belfry Jan 24, 2026, 9:03 AM

#

no idea, the bloom looks good though

#

I guess you're not like going to get arrested by the call of duty police if you do it some other way

astral hawk Jan 24, 2026, 9:03 AM

#

Lol of course I just wanted to know

#

Bonus: If you do increase it beyond a certain point, you get this cool fake-bokeh-like effect

#

Do you have a repo I can look at for your bloom impl.?

#

I am having more of a problem with the Karis avg. anyway

rustic belfry Jan 24, 2026, 9:05 AM

#

no mine isn't public, but it's very similar to the logl one

#

https://learnopengl.com/Guest-Articles/2022/Phys.-Based-Bloom

#

I followed the slides only to do my implementation, and then someone later showed me that, and mine is pretty much the same

#

except mine is in C++ not some silly shader language :P

#

is what mine looks like

astral hawk Jan 24, 2026, 9:47 AM

#

Looks beautiful

astral hawk Jan 24, 2026, 9:47 AM

#

rustic belfry <https://learnopengl.com/Guest-Articles/2022/Phys.-Based-Bloom>

Yeah this one dropped the ball on karis avg. for sure

#

I was gonna reply but yours looked ok to me

#

logl one double accounted for dividing by 4 for example

#

I would appreciate it if you could show it to me or dm me if it is proprietary etc. I just want to look at someone else's code to sanity check mine

#

In the slides, there is a distinction between the full karis avg. vs. the partial karis avg.

#

They went with the partial

#

I'll explain what the hell I am blabbering about shortly

rustic belfry Jan 24, 2026, 10:15 AM

#

oh I just thought I had spammed your channel

#

I can put it back

#

suddenly a wall of code

astral hawk Jan 24, 2026, 10:16 AM

#

rustic belfry oh I just thought I had spammed your channel

You could never!

#

My channel is your channel please do spam it always

fluid parrot Jan 24, 2026, 1:20 PM

#

astral hawk Do you have a repo I can look at for your bloom impl.?

https://github.com/Froyok/Bloom

GitHub

GitHub - Froyok/Bloom: Implementation of realtime bloom post-process

Implementation of realtime bloom post-process. Contribute to Froyok/Bloom development by creating an account on GitHub.

#

Here is mine

#

I don't touch the tent filter to adjust the bloom radius, instead I play with the blending intensity/weight between each upsample pass

#

Also Karis average is only for downsampling during the first pass

#

If you do on each pass you are going to kill your HDR range

astral hawk Jan 24, 2026, 1:30 PM

#

fluid parrot Also Karis average is only for downsampling during the first pass

Yeah I did implement the whole thing according to cod aw slides and only applied karis avg. on the first downsample pass.
It is done on paper and visually satisfying but I have some questions on how to apply karis avg.

Although it'll have to wait because I have a framebuffer resizing regression atm.

I'll get back to you guys once I am more free

astral hawk Jan 24, 2026, 7:47 PM

#

fluid parrot I don't touch the tent filter to adjust the bloom radius, instead I play with th...

You do pass a pixel size uniform to the upsample shader though

#

And it is set to 1 / resolution (of the previous mip in your repo's lingo) in bloom.lua

#

I am inclined to do that as well because I don't really see the appeal in providing that as a knob to the artist

#

Your Blend makes much more sense as a knob!

#

That's question 1 answered, thanks!

#

Trying out stuff for question 2 (karis avg.) which I realize I still didn't ask properly. Sorry about that but there is a chance that I'll be able to work it out (mostly from trial and error, with Froyok and Bjorn codes as well as the logl one)

#

If I can't, then I'll properly explain the problem and ask you guys

#

That SafeHDR utility function is interesting, have you ever been bitten by this @fluid parrot ? Or is it a preemptive measure?

astral hawk Jan 24, 2026, 8:27 PM

#

Ok my karis average is the exact same as yours (except mine is needlessly verbose and yours is more compact) semantically.

#

You calculate 13 karis averages and also incorporate the spatial weights for the cod aw custom 13 tap downsample filter, which is also what I did.

#

Ours is the "full karis average" right?

#

And the partial they (cod aw) did is similar to Bjorn's?

#

I guess these are my only questions huh

#

Logl drops the ball in two areas mainly if I'm not mistaken:

1) There is a divide by 4 for blocks and there is also a multiply by 0.25 inside KarisAverage() as well, which is wrong and darkens the image I think.
2) Not sure if this is wrong per se, but inside KarisAverage they first convert the (presumably) linear HDR color values to sRGB and then do RGBToLuminance(). Even if this is right, shouldn't there be a re-conversion on the way out, back into linear HDR?

Aside from logl, no one does this from what I can tell, so I didn't either.
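For context, the Karis average is commonly described as a luminance-weighted average with weight 1 / (1 + luma). The Python sketch below is a generic illustration of that idea, not any of the implementations discussed here. It assumes Rec. 709 luminance coefficients applied to linear RGB, and the function names are hypothetical.

```python
def luminance(rgb):
    """Rec. 709 luminance of a linear RGB colour."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def karis_average(samples):
    """Average of RGB samples weighted by 1 / (1 + luminance)."""
    weights = [1.0 / (1.0 + luminance(s)) for s in samples]
    total = sum(weights)
    return tuple(sum(w * s[c] for w, s in zip(weights, samples)) / total
                 for c in range(3))

def plain_average(samples):
    return tuple(sum(s[c] for s in samples) / len(samples) for c in range(3))

# Three dim samples and one very bright "firefly".
block = [(0.1, 0.1, 0.1)] * 3 + [(100.0, 100.0, 100.0)]
print(plain_average(block))
print(karis_average(block))
```

Because the weights are normalized by their own sum in this sketch, no additional divide by the sample count is needed, and an extra factor applied on top would darken the result. The luminance is computed directly on linear values.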

#

I think I'll put Bjorn's version (partial averaging) as a shader feature as well and let the client/game decide which one to use

rustic belfry Jan 24, 2026, 8:37 PM

#

idk if what I did is correct, it's just how I got to having a bloom. I am approaching these things breadth first, so I can continue to make progress on multiple areas of my project and dive deeper into specific topics as I need to. There's a conflict between deep learning and producing something interesting, and I recognize I can probably spend the rest of my life learning about just one tiny slice of graphics programming, so when this conflict comes up making progress always wins out and I learn as much as I can along the way without ever grinding anything to a halt

astral hawk Jan 24, 2026, 8:38 PM

#

Well said

#

I on the other hand am autistic

#

That's probably an insult to actually autistic people so I'm redacting it

#

I am ... sub-optimal

rustic belfry Jan 24, 2026, 8:39 PM

#

it's all sub-optimal, it's a trade off, what you're doing is more common on this server and is a better long game

astral hawk Jan 24, 2026, 8:39 PM

#

What you did is (as far as I can tell) correct by the way

rustic belfry Jan 24, 2026, 8:39 PM

#

especially if your interest is doing this professionally

astral hawk Jan 24, 2026, 8:39 PM

#

Logl is not but yours looks like it is to me

rustic belfry Jan 24, 2026, 8:39 PM

#

this is just a hobby for me

astral hawk Jan 24, 2026, 8:40 PM

#

Yeah but the thing is there is (and must be) a balance to this game

rustic belfry Jan 24, 2026, 8:40 PM

#

yes

astral hawk Jan 24, 2026, 8:41 PM

#

How am I going to pass an interview without pbr, low-level apis etc.? Am I going to bore them to death with the particulars of a bloom technique from mid 2010s?

#

Anyone who lectures me (not saying you are lecturing but still) on this is more than welcome to because frankly I do need this told to me more often and in fact I made a pact with me that I would close the Bloom chapter this weekend for this precise reason

#

So thank you sincerely

rustic belfry Jan 24, 2026, 8:42 PM

#

those are questions you might ask @torn stream, as I understand it devsh hires graphics developers

#

there's also #1020406707488313444

astral hawk Jan 24, 2026, 8:43 PM

#

Yeah I should do that

rustic belfry Jan 24, 2026, 8:43 PM

#

sorry devsh idk if you like being pinged into threads or not 😅

#

I default to pinging

astral hawk Jan 24, 2026, 8:44 PM

#

I don't know I guess I feel like I should get through pbr and whatnot before I can ask people about career advice, given that I (apparently) am in the long haul for this one

astral hawk Jan 24, 2026, 8:45 PM

#

rustic belfry sorry devsh idk if you like being pinged into threads or not 😅

Yeah another reason I am not ping-bombarding people like Devsh, Matt Pettineo etc. lol

#

But from what I can tell, most people here are really chill and helpful

rustic belfry Jan 24, 2026, 8:45 PM

#

idk, people can set their notification settings

astral hawk Jan 24, 2026, 8:46 PM

#

Yeah I trusted that when I pinged Froyok today

#

Hopefully you are right 🙂

#

I should get accustomed to frog emojis again

#

These default ones are cringe

#

By the way how's it going with you? Last I saw you you were working on a software rasterizer?

#

Today you shared some cuda code

rustic belfry Jan 24, 2026, 8:47 PM

#

yeah I just write CUDA now, I don't have a thread, I just update my website sometimes

#

I am over shader languages

astral hawk Jan 24, 2026, 8:47 PM

#

A level of zen I aspire to

rustic belfry Jan 24, 2026, 8:48 PM

#

well, I can do whatever I want since it's just for personal projects and I don't care what anyone else thinks

astral hawk Jan 24, 2026, 8:48 PM

#

How did the rasterizer go? Was it fun and/or educational?

rustic belfry Jan 24, 2026, 8:48 PM

#

it was educational. I don't think I did very well on it and I got bored with it

#

I will likely do it again, via CUDA on the GPU though

astral hawk Jan 24, 2026, 8:49 PM

#

It seems like I won't really understand understand fabian giesen's "A trip through the graphics pipeline" without writing a rasterizer

#

Cuda/compute seems like the other half of the medallion

rustic belfry Jan 24, 2026, 8:50 PM

#

I wrote a pretty completish rasterizer, it just sucked and the perf was horrible

astral hawk Jan 24, 2026, 8:50 PM

#

Isn't that kinda the thing though? Seeing as it is a software impl.

#

Unless you optimize to the bone, do simd etc.

#

Which still would probably suck I don't know

rustic belfry Jan 24, 2026, 8:52 PM

#

well I sort of gave myself a dumb self inflicted roadblock by being on a from scratch kick at the time and I was refusing to use tracy which would have helped me solve my bottlenecks, so that was also an issue

astral hawk Jan 24, 2026, 8:53 PM

#

I see

rustic belfry Jan 24, 2026, 8:58 PM

#

are you working on your bloom on the same code base as your previous project?

#

what have you been doing for 6 months?

astral hawk Jan 24, 2026, 8:59 PM

#

Yeah everything I do I incorporate into Kakadu (also slows me down I guess but I get to work on renderer arch. and a rendering pipeline so it is not a complete waste of time imo)

#

My actual work was kicking my ass for a while so I kinda had to slow down on my graphics studies

#

But I got lots of bugfixes, a major renderer refactor, introduced fullscreen effects as distinct stuff from regular rendering etc. to pave the way for Bloom

#

Did some research into msaa etc.

#

Not the best 6 months but I am picking up the pace

#

Monday -> Starting SSAO

#

Or maybe tomorrow if I can commit everything today

#

Oh I did a ray tracer project last Friday I think

#

https://graphics.cs.utah.edu/courses/cs4600/fall2020/?prj=6

#

That was fun

#

I may do fragment shader based ray tracing like this in Kakadu perhaps

rustic belfry Jan 24, 2026, 9:07 PM

#

sorry work's been hard on you

#

good to see you're active again however

astral hawk Jan 24, 2026, 9:09 PM

#

Thanks, yeah I will force myself to be active no matter what

fluid parrot Jan 25, 2026, 7:44 AM

#

astral hawk That SafeHDR utility function is interesting, have you ever been bitten by this ...

I think it's preemptive, I did that a while ago so I don't remember exactly

fluid parrot Jan 25, 2026, 7:45 AM

#

astral hawk Ours is the "full karis average" right?

No it's the partial because it's applied in groups and not as one block
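One way to picture the "applied in groups" variant is sketched below in Python. It assumes the 13 taps have already been reduced to five group averages (centre group first) with spatial weights of 0.5 and 4 x 0.125. Those weights, the normalization by the weight sum, and the function names are assumptions for illustration rather than a transcription of anyone's shader.

```python
def luminance(rgb):
    """Rec. 709 luminance of a linear RGB colour."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def partial_karis(groups, spatial_weights=(0.5, 0.125, 0.125, 0.125, 0.125)):
    """Combine five group averages, weighting each by spatial / (1 + luminance)."""
    weights = [sw / (1.0 + luminance(g)) for sw, g in zip(spatial_weights, groups)]
    total = sum(weights)
    return tuple(sum(w * g[c] for w, g in zip(weights, groups)) / total
                 for c in range(3))

def spatial_only(groups, spatial_weights=(0.5, 0.125, 0.125, 0.125, 0.125)):
    """The same combination without the luminance term, for comparison."""
    return tuple(sum(sw * g[c] for sw, g in zip(spatial_weights, groups))
                 for c in range(3))

# Four dim groups and one corner group dominated by a firefly.
groups = [(0.2, 0.2, 0.2)] * 4 + [(50.0, 50.0, 50.0)]
print(spatial_only(groups))
print(partial_karis(groups))
```

A "full" variant would instead compute one weight per tap across the whole footprint as a single block.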

fluid parrot Jan 25, 2026, 7:46 AM

#

astral hawk Logl drops the ball in two areas mainly if I'm not mistaken: 1) There is a divi...

Logl seems wrong.
You can check out the Blender EEVEE code for a reference if you want, they do it and know what they are doing.

#

To compute luminance you should stay in a linear color space anyway

astral hawk Jan 25, 2026, 10:17 AM

#

Thanks a lot

astral hawk Jan 25, 2026, 5:06 PM

#

Coarse is what Bjorn did essentially

#

Fine is the partial karis average on 4x4 blocks, using 13 averages

#

13 karis averages*

#

Along with spatial weights

rustic belfry Jan 25, 2026, 5:40 PM

#

how many mips is that?

#

I have 7 mip levels where 0 is the source

#

with fewer mips the bloom gets really tight

#

I probably over did it but I like the result

#

if I turn the bloom intensity all the way up it looks like the whole world is glowing

astral hawk Jan 25, 2026, 6:12 PM

#

Oh yeah I set it to 4 mips (including the original reso) during implementation and forgot to increase it

#

Let's try 6-7

#

obs darkens the colors despite my effort to remedy it but it is not that bad

#

My texture viewer doesn't render the mips for some reason I know that but didn't have time to check it

#

Now that bloom is all done and pushed I can check it

rustic belfry Jan 25, 2026, 6:25 PM

#

very nice

astral hawk Jan 25, 2026, 6:25 PM

#

Thanks!

rustic belfry Jan 25, 2026, 6:26 PM

#

yeah obs does that to me too I gave up

#

trying to figure it out

astral hawk Jan 25, 2026, 6:27 PM

#

Either I remember it wrong or Nvidia captured faithfully

rustic belfry Jan 25, 2026, 6:27 PM

#

I use nvidia

#

idk

astral hawk Jan 25, 2026, 6:27 PM

#

I swapped graphics cards with my wife because my trusty old gtx 1080 couldn't drive my new monitor. Now I have a radeon 6700xt

rustic belfry Jan 25, 2026, 6:27 PM

#

screenshots look a little worse too

astral hawk Jan 25, 2026, 6:27 PM

#

rustic belfry I use nvidia

I probably remember wrong

astral hawk Jan 25, 2026, 6:27 PM

#

rustic belfry screenshots look a little worse too

Oh that's interesting

rustic belfry Jan 25, 2026, 6:28 PM

#

probably a skill issue on my part

#

I just don't care enough

astral hawk Jan 25, 2026, 6:28 PM

#

Doubt it. These things are needlessly complex nowadays

astral hawk Jan 25, 2026, 6:28 PM

#

rustic belfry I just don't care enough

None of us should, aside from Nvidia Amd Microsoft etc.

astral hawk Jan 25, 2026, 6:30 PM

#

astral hawk

Idiot me would probably waste a couple more days on why the orbs are flickering (or more like a scanline effect, I don't know how to describe it).

Logl article has a comment saying that the upsampling radius (the one I removed from my shader just like Froyok's) should be aspect-ratio aware or otherwise it could cause this.

I did notice that visually and guarded against it.

But now that I increased the mip count from 4 to 6-7, it became visible again for some reason.

#

But my answer to this is

astral hawk Jan 25, 2026, 6:30 PM

#

rustic belfry I just don't care enough

This

rustic belfry Jan 25, 2026, 6:31 PM

#

I don't see any flickering?

#

oh I do

#

hrmmmm

#

no that's just you changing the mips

astral hawk Jan 25, 2026, 6:32 PM

#

Although it does look distracting

rustic belfry Jan 25, 2026, 6:32 PM

#

I don't see any flickering

#

what do you mean?

astral hawk Jan 25, 2026, 6:32 PM

#

Closer camera will show it

#

Watch this one

rustic belfry Jan 25, 2026, 6:33 PM

#

oh, that's a sampling artifact

#

how do you sample?

#

shouldn't be square like that

#

that's why I wrote the bilinear sample that was in my code

#

to get rid of a similar artifact I had

#

but if you're just using a hardware texture you can just set the sampling?

astral hawk Jan 25, 2026, 6:34 PM

#

The sampling should be set to bilinear

#

Let me check

rustic belfry Jan 25, 2026, 6:35 PM

#

those straight lines on the edges look sus

astral hawk Jan 25, 2026, 6:36 PM

#

Indeed

rustic belfry Jan 25, 2026, 6:36 PM

#

I had a very similar issue until I fixed how I sampled

astral hawk Jan 25, 2026, 6:38 PM

#

Thanks for the tip. I will look into the sampling

#

RenderDoc shows that all mips including the original hdr color buffer use GL_LINEAR for min and mag

#

with the wrapping mode set to clamp to edge for both directions

rustic belfry Jan 25, 2026, 6:41 PM

#

hrm, what resolution is your bloom image at?

#

I use a separate image for the mips, with the same dimensions as my very large draw image (maximum physical display size dimensions), and it doesn't include mip0, I only write mips 1 through 6 to it

#

my image is way too big

#

I don't think that's it though, maybe the mip dimensions have an off by one error or something

#

this is my sampling code

__forceinline__ __device__ float4
sample_clamped(int2 uvs, int2 mip_dims, int2 src_dims, int2 mip_offset, float4 *pixels) {
  i32 x = CLAMP_MIN(CLAMP_MAX(uvs.x + mip_offset.x, mip_offset.x + mip_dims.x - 1), mip_offset.x);
  i32 y = CLAMP_MIN(CLAMP_MAX(uvs.y + mip_offset.y, mip_offset.y + mip_dims.y - 1), mip_offset.y);
  return pixels[y * src_dims.x + x];
}

__forceinline__ __device__ float4
sample_bilinear(float2 uv, int2 mip_dims, int2 src_dims, int2 mip_offset, float4 *pixels) {
  float2 texel = uv - 0.5f;
  int2 i0 = make_int2(floorf(texel.x), floorf(texel.y));
  int2 i1 = i0 + make_int2(1, 1);
  float2 f = make_float2(texel.x - i0.x, texel.y - i0.y);

  // clang-format off
  float4 s00 = sample_clamped(i0, mip_dims,          src_dims, mip_offset, pixels);
  float4 s10 = sample_clamped(make_int2(i1.x, i0.y), mip_dims, src_dims, mip_offset, pixels);
  float4 s01 = sample_clamped(make_int2(i0.x, i1.y), mip_dims, src_dims, mip_offset, pixels);
  float4 s11 = sample_clamped(i1,                    mip_dims, src_dims, mip_offset, pixels);
  // clang-format on

  return lerp(lerp(s00, s10, f.x), lerp(s01, s11, f.x), f.y);
}

I am not using a hardware texture, just a bitmap

#

__global__ void ps_upsample(
    i32 width, i32 height, float4 *pixels, float4 *mips, i32 mip_level, f32 bloom_intensity
) {
  i32 x = threadIdx.x + blockIdx.x * blockDim.x;
  i32 y = threadIdx.y + blockIdx.y * blockDim.y;

  i32 mip_width = width >> mip_level;
  i32 mip_height = height >> mip_level;
  if (x >= mip_width || y >= mip_height)
    return;
  // ... (rest of the kernel not shown)
}

__global__ void ps_downsample(i32 width, i32 height, float4 *pixels, float4 *mips, i32 mip_level) {
  i32 x = threadIdx.x + blockIdx.x * blockDim.x;
  i32 y = threadIdx.y + blockIdx.y * blockDim.y;

  int2 idx = make_int2(x, y);
  int2 src_dims = make_int2(width, height);

  i32 mip_width = width >> mip_level;
  i32 mip_height = height >> mip_level;
  if (x >= mip_width || y >= mip_height)
    return;
  // ... (rest of the kernel not shown)
}

this is how I get my mip dimensions

#

my number of threads is greater than the image dimensions

ps_internal void downsample(
    ps_ctx_t &ctx, cudaStream_t stream, i32 width, i32 height, float4 *pixels, float4 *mips
) {
  {
    for (i32 mip_level = 1; mip_level < 7; mip_level++) {

      i32 block = 16;
      i32 mip_width = width >> mip_level;
      i32 mip_height = height >> mip_level;
      dim3 dimBlock(block, block);
      dim3 dimGrid((mip_width + block - 1) / block, (mip_height + block - 1) / block);

      ps_downsample<<<dimGrid, dimBlock, 0, stream>>>(width, height, pixels, mips, mip_level);
      getLastCudaError("downsample failed");
    }
  }
}

ps_internal void
upsample(ps_ctx_t &ctx, cudaStream_t stream, i32 width, i32 height, float4 *pixels, float4 *mips) {
  {
    i32 end_mip = ctx.gpuc_ctx->debug_bloom_downsample ? 1 : 0;
    for (i32 mip_level = 5; mip_level >= end_mip; mip_level--) {

      i32 block = 16;
      i32 mip_width = width >> mip_level;
      i32 mip_height = height >> mip_level;
      dim3 dimBlock(block, block);
      dim3 dimGrid((mip_width + block - 1) / block, (mip_height + block - 1) / block);

      ps_upsample<<<dimGrid, dimBlock, 0, stream>>>(
          width,
          height,
          pixels,
          mips,
          mip_level,
          ctx.gpuc_ctx->bloom_intensifier
      );
      getLastCudaError("upsample failed");
    }
  }
}

#

dim3 dimGrid((mip_width + block - 1) / block, (mip_height + block - 1) / block);

#

notice the + block - 1

#

that avoids clipping
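The `+ block - 1` trick is a round-up integer division. A tiny host-side sketch (the helper name `ceilDiv` is made up here) shows why plain division would leave the right and bottom edges without threads:

```cpp
#include <cassert>

// Number of blocks of `block` threads needed to cover `size` pixels.
// Plain `size / block` rounds down and would drop the partial last block.
inline int ceilDiv(int size, int block) {
  assert(size >= 0 && block > 0);
  return (size + block - 1) / block;
}
```

For a 631-pixel-wide mip and 16-wide blocks, `631 / 16` gives 39 blocks (624 pixels covered, 7 columns missed), while `ceilDiv(631, 16)` gives 40. The extra threads in the last block are the ones the `x >= mip_width || y >= mip_height` early return filters out.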

#

you shouldn't use my sample_bilinear, since you have hardware textures, just showing it to you

astral hawk Jan 25, 2026, 7:00 PM

#

rustic belfry hrm, what resolution is your bloom image at?

Original hdr color buffer is 1262x1649 (apparently the size I dragged the viewport borders to be at)
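To make the "off by one" suspicion concrete, here is a small sketch that lists the mip sizes for a buffer like this one. It assumes each level is the base size shifted right by the level index (the same `>>` rule as in the CUDA code above), clamped to at least 1; `Extent` and `mipChain` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Extent { int width, height; };

// Sizes of mip levels 0..count-1, where level 0 is the source itself.
inline std::vector<Extent> mipChain(Extent base, int count) {
  assert(base.width > 0 && base.height > 0);
  assert(count > 0 && count <= 30);  // keep the shift amount well defined
  std::vector<Extent> mips;
  mips.reserve(static_cast<std::size_t>(count));
  for (int level = 0; level < count; ++level) {
    mips.push_back({std::max(1, base.width >> level),
                    std::max(1, base.height >> level)});
  }
  return mips;
}
```

With a 1262x1649 source and 7 levels this gives 1262x1649, 631x824, 315x412, 157x206, 78x103, 39x51 and 19x25. Several of these are odd, so a level is not always exactly half of the previous one, which is where rounding differences between passes can creep in.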

rustic belfry Jan 25, 2026, 7:00 PM

#

does it include mip 0?

astral hawk Jan 25, 2026, 7:00 PM

#

Mips are all separate textures

#

Original is a separate texture2d

#

Mips are all separate texture2ds as well

#

I didn't "get to" texture arrays yet lol

astral hawk Jan 25, 2026, 7:02 PM

#

rustic belfry you shouldn't use my sample_bilinear, since you have hardware textures, just sho...

Yes sure thanks anyway. Always helps to see other code

rustic belfry Jan 25, 2026, 7:03 PM

#

np, gl

astral hawk Jan 25, 2026, 9:45 PM

#

rustic belfry if I turn the bloom intensity all the way up it looks like the whole world is gl...

I missed the video

#

This is so cool

#

Whatever became of this?

rustic belfry Jan 25, 2026, 11:58 PM

#

became of what?

#

that's my current project

astral hawk Jan 26, 2026, 9:56 AM

#

Oh ok I thought you switched to something else because Rosy had a thread

#

Oh it is still there ok

astral hawk Jan 26, 2026, 3:14 PM

#

rustic belfry

This was me being idiotic

#

Renderer updates the "intrinsic" (i.e., renderer-internal) UBO's viewport_size field whenever framebuffer resizes.

I used that when calculating the "radius" (delta uv) for the upsampling shader but during bloom steps, that uniform is not updated so it is always the original viewport size.

Passing the source/input texture's resolution as a uniform to the shader and using that to calculate the radius/deltauv fixed it.
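The fix boils down to deriving the tap offset from the texture that is actually being sampled. A CPU-side sketch of that arithmetic, with hypothetical names (`texelSize`, `upsampleDeltaUv`) and not the actual Kakadu shader code:

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Size of one texel in UV space for a texture of the given resolution.
inline Vec2 texelSize(int width, int height) {
  assert(width > 0 && height > 0);
  return {1.0f / static_cast<float>(width), 1.0f / static_cast<float>(height)};
}

// Delta UV for the upsample taps, measured in texels of the SOURCE mip.
inline Vec2 upsampleDeltaUv(int srcWidth, int srcHeight, float radiusInTexels) {
  const Vec2 t = texelSize(srcWidth, srcHeight);
  return {t.x * radiusInTexels, t.y * radiusInTexels};
}
```

For a 39x51 mip, one texel is about 0.0256 wide in UV space, whereas the stale 1262x1649 viewport size gives about 0.0008, more than 30 times smaller. With an offset that small all taps land almost on top of each other, so the filter does very little smoothing, which would be consistent with the blocky look described above.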

#

There still is a little flicker but it looks normal and non-squarey

#

obs causes stutter during record sometimes

astral hawk Jan 26, 2026, 4:28 PM

#

I made a revised study plan. I am open to suggestions

#

## Study / Implementation Plan

### Phase 1 — Image-Space Effects
- **RTR**
  - Chapter 12: Image Space Effects
- **Kakadu / LOGL**
  - SSAO
  - Depth of Field
  - Motion Blur
  - Lens Flare
  - (Bloom already done)

  Use Jorge Jimenez slides for these.

---

### Phase 2 — Physically Based Lighting & Local / Global Illumination
- **RTR**
  - Chapter 9: Physically Based Shading
  - Chapter 10: Local Illumination
    - 10.1 Area Light Sources
  - Chapter 11: Global Illumination  
    - 11.3 Ambient Occlusion
- **Kakadu**
  - Implement PBR (LOGL and/or other sources)
  - Implement Area Lights

---

### (In Between)
- **Kakadu: DSA refactor across codebase**
- **Kakadu: Integrate Texture Arrays**

---

### Phase 3 — Shadow Mapping Enhancement
- **RTR**
  - Chapter 7 (relevant sections): Shadows
- **Kakadu**
  - Implement Cascaded Shadow Maps (CSM)
  - Implement smart/auto directional-light camera placement

---

### Phase 4 — Re-evaluation
- **Kakadu**
  - Revisit and refine SSAO using insights from RTR Chapter 11.3 IF NEEDED.

---

### Phase 5 — Compute
- **RTR**
  - Chapter 23 (selective revisit)
- **Kakadu**
  - Introduce SSBOs
  - Introduce compute shaders
  - Migrate an existing pass to compute
    - SSAO or blur
    - light culling or similar

---

### Phase 6 — Deferred Rendering
- **RTR**
  - Chapter 20.1: Deferred Shading
- **Kakadu / LOGL**
  - Implement Deferred Rendering
  - Integrate compute-based light culling where appropriate

---

### Phase 7 — Visibility
- **RTR**
  - Chapter 19.4: View Frustum Culling
- **Kakadu / LOGL**
  - Implement frustum culling

#

I am straying from the ordering in logl because ssao for example is something I can tackle right now as I implemented the beginnings of a fullscreen effects system in Kakadu and I only have bloom as an effect for now.

No need to put deferred in the middle of bloom and ssao I think.

#

In RTR I am chronologically at chapter 9 physically based shading. I jumped to chapter 23 (I think) to read the hardware chapter.

I am thinking about skipping 9-10-11 for now and getting image-space effects (chapter 12) out first, then going full-blown into PBR and continuing from there.

rustic belfry Jan 26, 2026, 4:42 PM

#

looks great!

astral hawk Jan 28, 2026, 11:37 AM

#

Hmm ok there is a reason deferred rendering comes before ssao in logl: per-fragment normals are a thing by default in deferred rendering, which the ssao (crysis) uses.

#

I'll just read the deferred rendering chapter for now to get an idea on the implementation details and decide then.

astral hawk Jan 28, 2026, 9:39 PM

#

I was almost brainwashed: https://www.youtube.com/watch?v=QVbOp1h-Jb4

YouTube

bazhenovc

Why you should never use deferred shading

Personal and strongly opinionated rant about why one should never use deferred shading.

Slides: https://docs.google.com/presentation/d/1kaeg2qMi3_8nQqoR3Y2Ax9fJKUYLigPLPfdjfuEGowY/edit?usp=sharing

Links:
https://github.com/zeux/meshoptimizer
https://vkguide.dev/docs/gpudriven/gpu_driven_engines/
https://vkguide.dev/docs/gpudriven/compute_culli...

#

Then I saw Devsh's comments here: #graphics-techniques message

#

Disclaimer: I shouldn't even begin to compare forward vs deferred at this stage, I know. That is a given and I fully acknowledge it. It was past working hours and I saw the vid on youtube and wanted to share.

astral hawk Feb 3, 2026, 4:24 PM

#

Seeing @fluid parrot's immensely cool splash screen I thought why not

#

Animating stuff so I can get cool images with bloom

#

Maybe this one dunno

rustic belfry Feb 3, 2026, 4:26 PM

#

wow

#

I love bloom

#

I can't wait until I have more better lighting and more stuff looks good, it adds so much

fluid parrot Feb 3, 2026, 4:32 PM

#

Bloom is the key to nice images 😎

astral hawk Feb 3, 2026, 10:48 PM

#

I wanted to animate it using the renderer itself (which initializes in 200 ms, give or take), but I can't: the splash screen animation would have to block initialization of the rest of the engine/app until it finishes, because the rest of the init code contains lots of gl calls as well and opengl is not multithreaded, right?

This could be a cool idea in a vulkan/d3d12 renderer perhaps but not for opengl if I'm thinking right.

#

If the rest of the init. code was pure cpu code without any gl dependencies then sure. Otherwise, a blocking splash screen kinda defeats the purpose imo.

rustic belfry Feb 3, 2026, 11:32 PM

#

well, you could do a little bit of extra stuff each splash screen frame instead of using a thread

#

you can use command buffers in separate threads in vulkan, and opengl contexts are not as convenient
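A minimal sketch of the "a little bit of extra stuff each splash screen frame" idea: split initialization into small steps and run a few of them between splash frames, all on the thread that owns the GL context. `InitQueue` and the names in the usage note are hypothetical, not part of Kakadu.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Queue of one-shot init steps, executed in insertion order on the caller's thread.
class InitQueue {
 public:
  void add(std::function<void()> step) { steps_.push_back(std::move(step)); }

  // Runs at most `maxSteps` pending steps; returns true once nothing is left.
  bool runSome(int maxSteps = 1) {
    assert(maxSteps > 0);
    for (int i = 0; i < maxSteps && !steps_.empty(); ++i) {
      std::function<void()> step = std::move(steps_.front());
      steps_.pop_front();
      step();
    }
    return steps_.empty();
  }

 private:
  std::deque<std::function<void()>> steps_;
};
```

The main loop would then look roughly like `while (!queue.runSome()) { drawSplashFrame(); swapBuffers(); }`, where both functions are placeholders. The animation only stays smooth if each step is short, so a long step (a big shader compile, say) would still cause a visible hitch.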

astral hawk Feb 4, 2026, 10:41 AM

#

Yeah in gl it would be diminishing returns for unnecessary hassle it seems

#

Unity gets by with a static splash screen so why not

#

We can't all be Ombre

astral hawk Feb 4, 2026, 8:47 PM

#

I'll try changing the preview image of this post/thread. Hope I don't screw it permanently

#

Yeah I nuked it lol

#

Ok it's back but it looks meh

#

Maybe this will look better due to aspect ratio

rustic belfry Feb 7, 2026, 5:20 AM

#

I think it looks good in the community projects list

#

it stands out

astral hawk Feb 18, 2026, 10:22 PM

#

Heyo

#

Update: I am working on a big refactor that was long overdue before I continue with augmenting the Renderer with newer stuff (deferred rendering is on the way).

#

Currently the engine architecture is in a weird place (by design, or at least by intentional postponing of fixing it)

#

Inside the solution, there are multiple projects:

- Vendor (has deps like stb, fastgltf, ImGui etc.) -> produces a static lib.
- Engine (has EVERYTHING else that is not client application/game specific) -> also produces a static lib and consumes vendor.lib.
  - Most notably (for the refactor), it contains ALL editor code as well, dispersed across the codebase in the form of #ifdef _EDITOR blocks. All ImGui code is editor code.
- Client applications: Currently there are 3 projects: Sandbox, HDR-Demo and Bloom Demo. There should/could be many more, but creating a new project essentially means duplicating the dir/proj of an existing one and modifying it. These are all executables by the way, and they weirdly contain the editor and the engine runtime as well as client app logic, all in one.

#

Creating projects is a pain point right now.

#

Also I for some reason took Cherno's Application base class I think (not sure, it's been a while) and have virtual functions like Update(), Initialize() etc. even though there can not be more than 1 client application for a given project, so virtual dispatch is unnecessary (although irrelevant perf wise since every frame there is 1 indirection into long-running functions like Update(), Render() etc.).

#

What I want is to do some ordered refactors:

1. Separate editor code from engine code. Do this in the form of a new Editor namespace and translation units containing all editor code. The rest of the codebase should mostly be unaware of the editor, except for the Application calling the update and render functions of the editor per frame and keeping an Editor::Context member around.

#

2. Separate the engine and the editor into 2 projects.
3. Turn engine into a dll. I am not so sure whether I should do this or not. Aside from possibly hot reloading the Engine itself, I don't see an immediate gain from this.
4. Turn editor into an executable. This is bound to happen soon. It is weird launching the client application executable when in reality you are launching the editor, but with custom logic per application.
5. Turn client applications into dlls. They could be hot reloaded, which is a must for iteration in my opinion.
6. Separate client applications into two parts: a stub executable and a dll that will contain the actual logic. See number 7 below.
7. Implement project creation from the Editor. When a new project is created, the editor will simply create a new c++ project that will house the client app logic in the form of a dll. When the client app is "built" from the editor, the editor will generate the stub executable, which will load the engine runtime dll and hand control to it. Engine runtime will initialize itself (most notably the renderer, which is the bulk of the engine atm) and then load and initialize the client app dll. Then it will enter the main loop and call into the client app dll every frame to execute custom update/render etc. logic on top of the engine runtime logic.
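One way to picture the runtime/client split described here is a plain table of function pointers that the engine runtime calls into. This is only a sketch under assumptions: `ClientApi` and `runClient` are invented names, and the pointers are supplied directly instead of being looked up from a dll (on Windows they would typically come from `LoadLibrary` and `GetProcAddress`).

```cpp
#include <cassert>

// Hypothetical set of entry points a client-app dll would export.
struct ClientApi {
  void (*initialize)(void* userData);
  void (*update)(void* userData, float dtSeconds);
  void (*shutdown)(void* userData);
};

// Engine-runtime side: initialize the client, call it once per frame, then
// shut it down. Returns the number of frames that were run.
inline int runClient(const ClientApi& api, void* userData, int frameCount, float dtSeconds) {
  assert(api.initialize && api.update && api.shutdown);
  api.initialize(userData);
  int frames = 0;
  for (; frames < frameCount; ++frames) {
    api.update(userData, dtSeconds);  // engine's own per-frame work would surround this
  }
  api.shutdown(userData);
  return frames;
}
```

Because the engine only ever sees the table, hot reloading could amount to unloading the dll, loading the rebuilt one and refilling the pointers, with any state that must survive kept behind `userData`. It also sidesteps the virtual Application base class mentioned earlier.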

#

I am not sure how heavily I will invest in this, except for number 1 because I ~~will be~~ am doing that fully.

#

I plan on implementing all of these sometime but I need to balance maintaining Kakadu with learning graphics as well (although one could argue this type of engine arch. stuff is also a responsibility of a graphics programmer, at least the renderer).

#

Let's just see where things will go from 1).

#

I do want to create a "Many Lights" project/sample as well by the way and that is what triggered this refactor that was long overdue.

#

Oh and 8) Currently no scene data is serialized to disk. Scene data is initialized via code on every run. So serialize/deserialize via a format like json maybe.

#

7 and 8 combined would make creating new projects a breeze compared to status quo.

#

#1067777224528375858 message

#

Adding this to the pile of stuff to fix after the refactor

astral hawk Feb 19, 2026, 6:50 PM

#

https://tenor.com/view/movie-one-eternity-later-gif-7900643

#

Except for the Renderer.cpp, which contains a shit load of ImGui calls, the decoupling is (half) done.

#

Though I am fully done for the day

astral hawk Mar 10, 2026, 3:36 PM

#

Update: Renderer is also rid of all editor code (except for some debug bounds checks but those are unimportant right now). I've been busy with my job again but all is good.

#

I brushed up on dynamic linking in the MSDN docs today. I will go for explicit linking so I can hot reload stuff when I want to get into that.
