#Kakadu

astral hawk
#

I'm wasting soooo much memory on this 40 longitude sphere but I'm not messing with tessellation yet, so it'll have to do

astral hawk
#

Shaded wireframe and texture coordinates modes

#

Only remaining one is the "shading normals" mode, then I'm done

copper cliff
#

Thicc wireframe

astral hawk
#

I'm back again

#

After 3 weeks!

#

I've been focusing on the wireframe method because I wanted to blog it

#

But it had a major problem as Cirdan put it

#

Those are some thicc lines

#

And more importantly those are some inconsistently thick lines

#

So I wanted to make it so that I could specify thickness in screen-space pixels and every triangle, no matter the perspective/depth would have uniform line thickness

#

My current method worked by assigning barycentric coordinates <1,0,0>, <0,1,0> & <0,0,1> to the vertices of a triangle respectively, in a geometry shader, and letting the rasterizer do its magic so I could use the interpolated barycentric coordinates in the frag. shader to check whether any of them are below a certain threshold; i.e., we are near an edge.

But doing this causes the thickness to depend on the triangle size (specifically the distances between vertices and the opposing edges, since that is essentially what a barycentric coordinate encodes, in normalized form).
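
A minimal sketch of why a fixed barycentric threshold gives size-dependent lines, assuming the triangle is already expressed in screen-space pixels. The names `Vec2`, `distanceToLine` and `edgeBandWidthPx` are made up for illustration and are not from Kakadu.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

// Perpendicular distance from point p to the infinite line through a and b.
double distanceToLine(Vec2 p, Vec2 a, Vec2 b) {
    const double ex = b.x - a.x;
    const double ey = b.y - a.y;
    const double len = std::sqrt(ex * ex + ey * ey);
    assert(len > 0.0 && "edge must not be degenerate");
    // |cross(b - a, p - a)| / |b - a|
    return std::fabs(ex * (p.y - a.y) - ey * (p.x - a.x)) / len;
}

// Width in pixels of the band where the barycentric coordinate of vertex v
// stays below `threshold`. That coordinate is 1 at v and 0 on the opposing
// edge (a, b), so it equals distance-to-edge divided by the triangle height.
double edgeBandWidthPx(Vec2 v, Vec2 a, Vec2 b, double threshold) {
    return threshold * distanceToLine(v, a, b);
}
```

With the same threshold of 0.02, a triangle that is 100 px tall gets a 2 px band while one that is 400 px tall gets an 8 px band, which is the inconsistency described above.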

So I said OK, let me calculate these vertex-to-edge distances in clip space, convert them to screen-space and use that in frag. shader. That way I could work directly with pixel-unit distance values.

Of course there are major problems with this strategy, which I didn't realize at the time, due to my lack of understanding in some important areas.

And implementing this, I saw there were problems visually: Line thickness for a triangle changed when the camera rotated. This happened for triangles where the depth varies from vertex to vertex. Triangles completely parallel to the near plane did not suffer from this problem for example.

Turns out this was the perfect challenge to deep-dive into rasterization, projection and pers. correct depth/attribute interpolation.

#

I mostly did reading/study and did very little coding during these 3 weeks

#

Scratchapixel has nice, detailed, but not too mathy (or rather, too rigorously mathy) explanations on these topics

#

So I started from scratch (pun intended) there, i.e., from the beginning.

#

Up until the end of the rasterization chapter

#

Also re-read Fabian Giesen's "A trip through the graphics pipeline", specifically the parts on rasterization and attribute interpolation

#

After all of these and some back and forth with ChatGPT (which was surprisingly insightful and didactic)

#

I realized 2 major problems

#
  1. Doing geometric operations in clip space does not make sense semantically: It is not a Euclidean space.
#

I was calculating the vertex-to-opposing-edge in clip space and then converting to screen-space.

#

It became clear that I had to invert this: First convert from clip-space vertex positions to screen-space, then do all those geometric operations.
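
The reordering can be sketched roughly like this: project each vertex to pixels first, then measure. This is only an illustration under assumptions (a viewport with its origin in a corner, all vertices in front of the camera), and `clipToScreen` / `vertexToEdgePx` are hypothetical names rather than the actual geometry shader code.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };
struct Vec4 { double x, y, z, w; };

// Clip space -> NDC (perspective divide) -> screen-space pixels.
// Only meaningful for w > 0, i.e. vertices in front of the camera.
Vec2 clipToScreen(Vec4 clip, double viewportW, double viewportH) {
    assert(clip.w > 0.0 && "vertex must be in front of the camera");
    const double ndcX = clip.x / clip.w;
    const double ndcY = clip.y / clip.w;
    return { (ndcX * 0.5 + 0.5) * viewportW, (ndcY * 0.5 + 0.5) * viewportH };
}

// Perpendicular distance in pixels from p to the line through a and b.
double distanceToLinePx(Vec2 p, Vec2 a, Vec2 b) {
    const double ex = b.x - a.x;
    const double ey = b.y - a.y;
    const double len = std::sqrt(ex * ex + ey * ey);
    assert(len > 0.0 && "edge must not be degenerate");
    return std::fabs(ex * (p.y - a.y) - ey * (p.x - a.x)) / len;
}

// Vertex-to-opposing-edge distance, measured after projection.
double vertexToEdgePx(Vec4 v, Vec4 a, Vec4 b, double viewportW, double viewportH) {
    return distanceToLinePx(clipToScreen(v, viewportW, viewportH),
                            clipToScreen(a, viewportW, viewportH),
                            clipToScreen(b, viewportW, viewportH));
}
```

The divide by `w` happens before any length is taken, so the result is a plain Euclidean distance in pixels.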

#

But this in itself was not enough to solve the varying thickness problem

#

Because

#
  2. The default perspective correct interpolation (or hyperbolic or rational-linear interpolation) was distorting the edge distances I calculated in the geometry shader.
#

While this default behaviour of perspective correct interpolation is necessary for depth and attribute values, it is counter-productive in this case because the edge distances were already projected and transformed into screen-space.

#

There (in screen-space), simply linearly interpolating across the triangle's pixels is actually needed to produce the correct values.

#

In a way, we are accounting for the perspective twice if we both project and interpolate hyperbolically

#

At least this is how I understood it.
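
To make the difference concrete, here is a small sketch comparing the two interpolation modes along one edge. It assumes `t` is the screen-space interpolation parameter and `w0`, `w1` are the clip-space w values of the two endpoints; the function names are illustrative only.

```cpp
#include <cassert>

// What `noperspective` does: plain linear interpolation in screen space.
double lerpNoPerspective(double a0, double a1, double t) {
    return (1.0 - t) * a0 + t * a1;
}

// Default (perspective-correct) interpolation: interpolate a/w and 1/w
// linearly in screen space, then divide.
double lerpPerspective(double a0, double w0, double a1, double w1, double t) {
    assert(w0 > 0.0 && w1 > 0.0 && "endpoints must be in front of the camera");
    const double num = (1.0 - t) * a0 / w0 + t * a1 / w1;
    const double den = (1.0 - t) / w0 + t / w1;
    return num / den;
}
```

For a pixel distance going from 0 to 100 across an edge with `w0 = 1` and `w1 = 4`, the screen-space midpoint should read 50, but the perspective-correct path returns 20. When both w values are equal the two agree, which matches the observation that triangles parallel to the near plane looked fine.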

#

So disabling that via noperspective interpolation qualifier in shader code

#

I got ... something.

#

Nice wireframes, all equal thickness

#

But I conveniently disabled the ground, left, and right walls

#

Because there is a MAJOR problem

#

And I'm not sure why it happens but I strongly believe it has to do with clipping

#

As long as the triangle remains in viewport, the lines are of consistent thickness

#

But if it straddles the near plane, this artifact happens

jagged pecan
astral hawk
#

Part 1 of the blog post is live!

sharp shadow
astral hawk
#

You're right of course

#

Part II of the blog will cover the stuff I explained above

#

Part III will be utilizing fwidth and calling it a day

#

The only "downside" to the fwidth approach as far as I can tell is you are not exactly working in screen-space values so you can't say "every line should be 4 pixels wide". But it is insignificant of course, just set a value (of whatever unit it is) you like and you're done

sharp shadow
#

Of course, very good:)

#

Btw avoid geometry shaders, pipe it through tessellation stage instead, or just use a compute to make the barycentric coordinates, geometry shaders suck

#

fwidth is some derivation of ddx/ddy I think?

astral hawk
#

fwidth is abs(ddx(...)) + abs(ddy(...)) I think
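
That matches the GLSL definition, `fwidth(p) = abs(dFdx(p)) + abs(dFdy(p))`. A CPU-side sketch of the idea, assuming finite differences to the neighbouring pixel stand in for the hardware derivatives; `sampleBary` is a made-up barycentric field used only for the demonstration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical barycentric coordinate as a function of pixel position.
double sampleBary(double x, double y) { return x / 200.0 + y / 100.0; }

// Emulates fwidth with forward differences to the neighbouring pixels.
double fwidthAt(double (*f)(double, double), double x, double y) {
    const double ddx = f(x + 1.0, y) - f(x, y);
    const double ddy = f(x, y + 1.0) - f(x, y);
    return std::fabs(ddx) + std::fabs(ddy);
}

double smoothstep(double e0, double e1, double x) {
    assert(e1 > e0);
    const double t = std::min(1.0, std::max(0.0, (x - e0) / (e1 - e0)));
    return t * t * (3.0 - 2.0 * t);
}

// 0 on the edge, 1 in the triangle interior. `thickness` scales the band.
double edgeFactor(double bary, double baryFwidth, double thickness) {
    return smoothstep(0.0, baryFwidth * thickness, bary);
}
```

Since `bary / fwidth(bary)` is roughly "how many pixels away from the edge am I", the `thickness` factor behaves approximately like a pixel count, though not an exact one.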

astral hawk
#

Thanks for solid advice by the way, much appreciated 🙏

sharp shadow
#

yeah, I didn't realise btw that you were doing tutorials, it's good though 🙂 also not speaking from some vast knowledge or trying to be rude, I think you understood what I meant anyway but I have anxiety so yeah 😄

astral hawk
#

I didn't think you were rude, not even for a second. Relax friend forgelove

sharp shadow
#

btw, what do you think of my idea... if I have 50 spot lights all drawing shadows from 50 meshes.. that's 2500 draw calls for all the shadows...

Instead, I wanna collect which lights affect each object, build an instance buffer, and draw the 50 shadowmaps with a draw instanced call with 50 instances, each instance has its own shadow viewproj and tile offset/scale variables, then the only additional "cost" is pixel shader discard and vertex shader tile transform... but the cpu side will be blessed by only 50 instance draw calls 😄
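
A rough sketch of what the per-instance tile transform and the discard test could look like, written as plain C++ rather than shader code. It assumes a square atlas split into `tilesPerSide` x `tilesPerSide` tiles and that the tile offset/scale are expressed in NDC units; all names here are hypothetical and the real layout may differ.

```cpp
#include <cassert>
#include <cmath>

struct Vec4 { double x, y, z, w; };
struct Tile { double scaleX, scaleY, offsetX, offsetY; };  // NDC units

// Tile rectangle for one shadow-casting light in a square atlas.
Tile tileForIndex(int index, int tilesPerSide) {
    assert(tilesPerSide > 0 && index >= 0 && index < tilesPerSide * tilesPerSide);
    const double s = 1.0 / tilesPerSide;
    const int col = index % tilesPerSide;
    const int row = index / tilesPerSide;
    // Offset is the tile centre in NDC [-1, 1].
    return { s, s, -1.0 + s * (2.0 * col + 1.0), -1.0 + s * (2.0 * row + 1.0) };
}

// "Vertex shader" part: applied to the clip position after the light's
// viewproj. The offset is multiplied by w so it survives the divide.
Vec4 remapToTile(Vec4 clip, Tile t) {
    return { clip.x * t.scaleX + t.offsetX * clip.w,
             clip.y * t.scaleY + t.offsetY * clip.w,
             clip.z, clip.w };
}

// "Pixel shader" part: fragments that land outside their own tile would
// be discarded so they cannot bleed into a neighbouring light's tile.
bool insideTile(double ndcX, double ndcY, Tile t) {
    return std::fabs(ndcX - t.offsetX) <= t.scaleX &&
           std::fabs(ndcY - t.offsetY) <= t.scaleY;
}
```

With 50 lights an 8 x 8 grid would leave 14 tiles spare; whether that packing is acceptable depends on the shadow map resolution wanted per light.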

#

I'm sure it'll improve performance considerably as I'm currently cpu bound on my mac at least

#

plus opengl on mac is god awful

sharp shadow
#

honestly I admire the graphics community, everyone has their place here, I'm just doing opengl and c# so not in the major leagues heh but I've learned a lot so far and continue to learn 😄

astral hawk
#

Or the light data (viewproj etc.)?

#

Can you make it 50 spot lights and 60 meshes so I can understand easier forgderp4

astral hawk
#

I think I understand what you're proposing: Instance the light data, not the meshes, by finding out which light affects which object and building an instance buffer based on that. I'm not familiar with tiled shadow maps but a quick research shows that it is basically spatially partitioning the shadow map. So by vertex shader workload you mean to find out which tile of the shadow map (of a light instance) a mesh belongs to I assume.

I didn't understand what you are discarding in the pixel shader though.

#

Overall, sounds solid to me though

sharp shadow
#

Well instancing the meshes with each instance using its own viewproj

#

Btw this is your thread right? @astral hawk

astral hawk
#

Yes

sharp shadow
#

Do you mind if I post what I’m doing here ?

astral hawk
#

Sure

sharp shadow
#

Shitty but it works, rough reflections using brute force importance sampling (because I don’t have time to perfect a mip gen for pbr)

#

I’ll show rough stuff

#

Or at least 0.3 rough to give you an idea of what it should be like heh

#

I still think my metalness is inverse but I’m not sure

#

The floor doesn’t have a metallic texture so it won’t be affected by me potentially inverting the metallic

#

I’ll probs redo the pbr anyway

#

Need to send the reflection as a hdr texture too

astral hawk
#

It looks awesome, though it's way over my head on the reflection front. But the AO looks nice I think

sharp shadow
#

It’s sdr unsigned byte atm

#

Yeah I’m doing a massive hack

#

Getting a reflection vector and using atan and asin to convert to uv coords 😅
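
For reference, the direction-to-UV mapping being described is usually some variant of the following. This sketch assumes a y-up lat-long (equirectangular) image with u wrapping around the y axis; the exact axis and orientation conventions are an assumption and may not match the coursework code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Uv { double u, v; };

// Maps a direction to lat-long texture coordinates in [0, 1].
Uv directionToLatLongUv(Vec3 d) {
    const double pi = 3.14159265358979323846;
    const double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    assert(len > 0.0 && "direction must be non-zero");
    // Longitude from atan2, latitude from asin of the normalized height.
    const double u = std::atan2(d.z, d.x) / (2.0 * pi) + 0.5;
    const double sinLat = std::min(1.0, std::max(-1.0, d.y / len));
    const double v = std::asin(sinLat) / pi + 0.5;
    return { u, v };
}
```

One `atan2` and one `asin` per sample is where the cost comes from once this runs inside a 128-iteration importance sampling loop.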

astral hawk
#

atan in a shader frogdelet

#

You're working with Unity's SRP right?

#

How is it? Acerola started that way too if I'm not mistaken

sharp shadow
#

nope

#

this is using opentk / gl

#

🙂

#

and yes 😄 😄 slow as fuck I know hah

#

might use that atan2 approximation from shadertoy

#

🙂

astral hawk
#

I looked at your blog and saw that you tinkered with Unity's SRP

#

Nice stuff!

sharp shadow
#

Yeah hehe I made my own pipeline

#

Which even performs much better than my opengl uni shit

#

Main fps killer is my shadowmap generation passes

sharp shadow
#

ooof

#

looks pretty much ok 😄

#

well, that's me done for now, gotta go 😄

astral hawk
# sharp shadow

That looks gorgeous. Is there image based lighting in there?

#

Also, do you have a thread here in this server?

sharp shadow
#

Not just yet tbh I don’t wanna uproot my code yet, also not sure if I should send an array of indices to the shader as vert arrays with a ssbo of all light viewprojs in view or the matrices themselves

#

I don’t yet

#

Also yes there’s reflections from the background hdri

#

Idk if they’re correct as I need to validate stuff before I move forward

#

It’s just my uni graphics module coursework so I don’t think I should make a thread heh

astral hawk
#

That's fair

#

Good to have you around froge_yeehaw

sharp shadow
#

🙂

#

I’m gonna try slap a latlong to cubemap thing out and do proper mipmap generation 😭

#

Because even mipmap roughness shit looks better and runs far better than importance sampling ha

#

I’ll take a comparison pic when I get to uni lol

#

Hope it doesn’t take longer than 10 mins to get to the bus stop I need to be off at cos I’ve got a presentation at 4:45 😭😭😭

#

What I’ve done to my pbr should be illegal 🤣🤣🤣

#

However when I’ve got the metals properly integrated (need to have reflections in a separate pass, the lighting needs to be diffuse and spec and additively composited, somehow without destroying the fragility of my ropey pbr, and then reflections need to correctly composite in to the specular 😭😭😭😭😭

#

I am sooooo behind schedule

#

Still not off the bus someone best let the bus through now

astral hawk
# astral hawk

I did some debugging and the "moment" this bug starts to happen is when a vertex' w coordinate becomes <= 0

sharp shadow
#

I know graphics shit well enough, I’m assuming that when it’s w <0 it’s behind the near plane then? Makes sense heh…

#

Wait is this still geometry shader pipe?

#

Maybe it’s legit being clipped?

#

Or the geometry shader can’t actually see that vert when it’s behind the near plane?

astral hawk
#

First image is showing a wall that is perfectly perpendicular to the near plane (i.e., a quad rotated 90 degrees around Y axis).
The top left vertex is at world point <-15,+15,-30>
This is when the camera is at Z = -31.0 (my engine is left handed: x right y up z forward). The wireframe works because all w values are >= 0

Second image is showing the same scene, but now the camera moved to Z = -29.99. The bug is present, because now the top left vertex' w is -0.01.

The second image is telling us that the vertex-to-opposing edge distances (in screen-space pixels) I calculated in the geometry shader are all zero or extremely close to zero.

I was thinking "Ok there is clipping, so what? The vertex-to-edge distances I calculated in the geometry shader are interpolated for the newly created clipped vertices. So why are they all zero?"

#

And then I did some plotting.

#

For the second image, the top left vertex has w = -0.01, which maps to NDC of X = 1500 and Y = 1500
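
The blow-up is easy to reproduce on paper. A sketch under simplifying assumptions: a projection scale of 1 on both axes (90 degree FOV, square aspect), so clip x/y equal view-space x/y and w equals the view-space depth. The exact signs depend on the projection conventions, so only the magnitude should be read off this.

```cpp
#include <cassert>

// Perspective divide for one coordinate: ndc = clip / w.
// Under the assumptions above, clip == view coordinate and w == view depth.
double ndcCoord(double viewCoord, double viewDepth) {
    assert(viewDepth != 0.0 && "w == 0 has no finite projection");
    return viewCoord / viewDepth;
}
```

With the vertex at x = -15, z = -30 and the camera at z = -31, the depth is 1 and the NDC x is -15. Moving the camera to z = -29.99 makes the depth -0.01 and the NDC x jumps to about +1500: the magnitude explodes and the sign flips, because the point is now behind the camera.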

#

That's absurdly large

#

No wonder why this happens

#

I mean I knew that perspective distortion is a thing

#

But to see it in action, to this degree, is wild to say the least

#

And this, will conclude the Part II of the wireframe rendering blog

#

In Part III I will say "just use fwidth" and call it a day

astral hawk
#

To put things into perspective (pun not intended), for image 2 (buggy version), these are the vertex-to-opposing-edge-midpoint distances.

#

And now it's time for me to go to bed

humble sinew
#

when I see markdown with that color scheme I instantly just think it's GPT...

astral hawk
#

That's because it is

#

And I verified the calculations on pen and paper

#

Up until the plots

#

I didn't actually verify the edge distances but from a quick glance they seemed correct

#

If I do decide to include these in the blog post, I'll verify them

humble sinew
astral hawk
#

Come on, if we can't even use it for mundane calculations, what's the point of it existing forgderp4

#

It's a glorified wolfram alpha

humble sinew
#

then what about using wolfram alpha

sharp shadow
#

There’s nothing wrong with using ChatGPT tbh

astral hawk
#

There are cases when I can’t verify (this was not one of those), for which I abandon chatgpt and do actual research, study actual resources.

sharp shadow
#

yeah, legit 🙂

sharp shadow
#

physically-somewhat-based

#

128 importance sample iters, with 128 per pixel atan2 calcs to get reflection vector -> uv coord

#

:V :V :V

astral hawk
#

I'm back I guess

#

Or rather I will be (I hope)

#

I've been going through a burn out for 2 months now

astral hawk
#

Onto part 3 and finally closing this chapter.

astral hawk
#

Damn it's been 6 months huh

#

I am back again

#

Finalizing cod aw style bloom as we speak but I have some questions regarding the upsampling "radius" and the karis avg.

#

During the upsampling passes, both the source texture (mip i) and the render target (mip i + 1) are known, so why does everyone use a uniform scaling "radius" to scale delta uvs with when performing the tent (or whatever) filter in the upsampling shader for bloom?
Why not simply set it to one texel of the input texture (the one being filtered, which is mip i in this scenario), i.e., 1 over mip i's size along each axis?

I get "artistic freedom" and have nothing against it, but as a default, isn't the "exact fit" I described the way to go?

#

@fluid parrot Would love to pick your brain on this one

rustic belfry
#

Welcome back !! 🎉

#

I just recently implemented this bloom

#

not sure what you mean by a uniform scaling radius in the tent filter

#

I just did exactly what the slides said to do

#

you mean the 1/16?

#

because that's what the slides said to do is why I did it

#

if your technique works better you should do it, I'm pretty sure they derived this empirically by testing different techniques

#

it says there "the filter do not map to pixels, has holes in it"

astral hawk
#

Hi Bjorn how are you?

astral hawk
# rustic belfry

Yeah I followed these slides (and Froyok's blog as well).

The third bullet point mentions the radius

#

Unless I am missing something, I'd say that "radius" is already known analytically

#

for upscaling mip 0 to mip 1, it is 1 over mip 0's resolution or explicitly, it is
delta_u = 1 / mip_0_width;
delta_v = 1 / mip_0_height;

Why not simply do this and call it a day instead of using a shader uniform for this radius parameter as it is referred to in this particular slide?
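
Spelled out, the "exact fit" version of the 3x3 tent filter would look something like the sketch below. It assumes the offsets come straight from the source texture's dimensions and uses the 1-2-1 weights divided by 16 from the slides; `Tap`, `tentTaps` and `totalWeight` are names invented for this example.

```cpp
#include <array>
#include <cassert>

struct Tap { double du, dv, weight; };

// 3x3 tent filter taps for the bloom upsample pass. Offsets are exactly
// one texel of the source texture (the mip being filtered), computed per
// axis so non-square targets are handled.
std::array<Tap, 9> tentTaps(int srcWidth, int srcHeight) {
    assert(srcWidth > 0 && srcHeight > 0);
    const double du = 1.0 / srcWidth;
    const double dv = 1.0 / srcHeight;
    const double k[3] = { 1.0, 2.0, 1.0 };
    std::array<Tap, 9> taps{};
    for (int j = 0; j < 3; ++j) {
        for (int i = 0; i < 3; ++i) {
            taps[j * 3 + i] = { (i - 1) * du, (j - 1) * dv, k[i] * k[j] / 16.0 };
        }
    }
    return taps;
}

// Sanity helper: the weights should sum to 1 so brightness is preserved.
double totalWeight(const std::array<Tap, 9>& taps) {
    double sum = 0.0;
    for (const Tap& t : taps) sum += t.weight;
    return sum;
}
```

Because `du` and `dv` are derived separately from width and height, the footprint stays square in texels even when the texture is not square in UV space.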

rustic belfry
#

no idea, the bloom looks good though

#

I guess you're not like going to get arrested by the call of duty police if you do it some other way

astral hawk
#

Lol of course I just wanted to know

#

Bonus: If you do increase it beyond a certain point, you get this cool fake-bokeh-like effect

#

Do you have a repo I can look at for your bloom impl.?

#

I am having more of a problem with the Karis avg. anyway

rustic belfry
#

no mine isn't public, but it's very similar to the logl one

#

I followed the slides only to do my implementation, and then someone later showed me that, and mine is pretty much the same

#

except mine is in C++ not some silly shader language :P

#

is what mine looks like

astral hawk
#

Looks beautiful

astral hawk
#

I was gonna reply but yours looked ok to me

#

logl one double accounted for dividing by 4 for example

#

I would appreciate it if you could show it to me or dm me if it is proprietary etc. I just want to look at someone else's code to sanity check mine

#

In the slides, there is a distinction between the full karis avg. vs. the partial karis avg.

#

They went with the partial

#

I'll explain what the hell I am blabbering about shortly

rustic belfry
#

oh I just thought I had spammed your channel

#

I can put it back

#

suddenly a wall of code

astral hawk
#

My channel is your channel please do spam it always

fluid parrot
#

Here is mine

#

I don't touch the tent filter to adjust the bloom radius, instead I play with the blending intensity/weight between each upsample pass

#

Also Karis average is only for downsampling during the first pass

#

If you do on each pass you are going to kill your HDR range

astral hawk
# fluid parrot Also Karis average is only for downsampling during the first pass

Yeah I did implement the whole thing according to cod aw slides and only applied karis avg. on the first downsample pass.
It is done on paper and visually satisfying but I have some questions on how to apply karis avg.

Although it'll have to wait because I have a framebuffer resizing regression atm.

I'll get back to you guys once I am more free

astral hawk
#

And it is set to 1 / resolution (of the previous mip in your repo's lingo) in bloom.lua

#

I am inclined to do that as well because I don't really see the appeal in providing that as a knob to the artist

#

Your Blend makes much more sense as a knob!

#

That's question 1 answered, thanks!

#

Trying out stuff for question 2 (karis avg.) which I realize I still didn't ask properly. Sorry about that but there is a chance that I'll be able to work it out (mostly from trial and error, with Froyok and Bjorn codes as well as the logl one)

#

If I can't, then I'll properly explain the problem and ask you guys

#

That SafeHDR utility function is interesting, have you ever been bitten by this @fluid parrot ? Or is it a preemptive measure?

astral hawk
#

Ok my karis average is the exact same as yours (except mine is needlessly verbose and yours is more compact) semantically.

#

You calculate 13 karis averages and also incorporate the spatial weights for the cod aw custom 13 tap downsample filter, which is also what I did.

#

Ours is the "full karis average" right?

#

And the partial they (cod aw) did is similar to Bjorn's?

#

I guess these are my only questions huh

#

Logl drops the ball in two areas mainly if I'm not mistaken:

  1. There is a divide by 4 for blocks and there is also a multiply by 0.25 inside the KarisAverage() as well which is wrong and darkens the image I think.

  2. Not sure if this is wrong per se but inside KarisAverage, they first convert the (presumably) linear HDR color values to sRGB and then do RGBToLuminance(). Even if this is true, shouldn't there be a re-conversion on the way out, back into linear HDR ?

Aside from logl, no one does this from what I can tell, so I didn't as well.
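
For anyone following along, this is the kind of Karis average being compared, written as a sketch for a single 2x2 block. It assumes linear HDR input and Rec. 709 luminance weights, and it normalizes by the sum of the weights, so no extra divide by 4 is needed on top. `Rgb`, `karisAverage` and `plainAverage` are illustrative names, not anyone's actual shader.

```cpp
#include <cassert>

struct Rgb { double r, g, b; };

// Rec. 709 luminance of a linear colour.
double luminance(Rgb c) {
    return 0.2126 * c.r + 0.7152 * c.g + 0.0722 * c.b;
}

// Karis average of one 2x2 block: each sample is weighted by
// 1 / (1 + luminance), then the result is normalized by the weight sum.
Rgb karisAverage(Rgb a, Rgb b, Rgb c, Rgb d) {
    const Rgb samples[4] = { a, b, c, d };
    Rgb sum{ 0.0, 0.0, 0.0 };
    double weightSum = 0.0;
    for (const Rgb& s : samples) {
        assert(luminance(s) >= 0.0 && "expects non-negative colours");
        const double w = 1.0 / (1.0 + luminance(s));
        sum.r += s.r * w;
        sum.g += s.g * w;
        sum.b += s.b * w;
        weightSum += w;
    }
    return { sum.r / weightSum, sum.g / weightSum, sum.b / weightSum };
}

// Unweighted mean, for comparison.
Rgb plainAverage(Rgb a, Rgb b, Rgb c, Rgb d) {
    return { (a.r + b.r + c.r + d.r) / 4.0,
             (a.g + b.g + c.g + d.g) / 4.0,
             (a.b + b.b + c.b + d.b) / 4.0 };
}
```

A block with one firefly of 1000 next to three samples of 1 averages to about 250 with the plain mean, but stays below 2 with the Karis weights, which is the flicker suppression the first downsample pass is after.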

#

I think I'll put Bjorn's version (partial averaging) as a shader feature as well and let the client/game decide which one to use

rustic belfry
#

idk if what I did is correct, it's just how I got to having a bloom. I am approaching these things breadth first, so I can continue to make progress on multiple areas of my project and dive deeper into specific topics as I need to. There's a conflict between deep learning and producing something interesting, and I recognize I can probably spend the rest of my life learning about just one tiny slice of graphics programming, so when this conflict comes up making progress always wins out and I learn as much as I can along the way without ever grinding anything to a halt

astral hawk
#

Well said

#

I on the other hand am autistic

#

That's probably an insult to actually autistic people so I'm redacting it

#

I am ... sub-optimal

rustic belfry
#

it's all sub-optimal, it's a trade off, what you're doing is more common on this server and is a better long game

astral hawk
#

What you did is (as far as I can tell) correct by the way

rustic belfry
#

especially if your interest is doing this professionally

astral hawk
#

Logl is not but yours looks like it is to me

rustic belfry
#

this is just a hobby for me

astral hawk
#

Yeah but the thing is there is (and must be) a balance to this game

rustic belfry
#

yes

astral hawk
#

How am I going to pass an interview without pbr, low-level apis etc.? Am I going to bore them to death with the particulars of a bloom technique from mid 2010s?

#

Anyone who lectures me (not saying you are lecturing but still) on this is more than welcome to because frankly I do need this told to me more often and in fact I made a pact with myself that I would close the Bloom chapter this weekend for this precise reason

#

So thank you sincerely

rustic belfry
#

those are questions you might ask @torn stream, as I understand it devsh hires graphics developers

#

there's also #1020406707488313444

astral hawk
#

Yeah I should do that

rustic belfry
#

sorry devsh idk if you like being pinged into threads or not 😅

#

I default pinging

astral hawk
#

I don't know I guess I feel like I should get through pbr and whatnot before I can ask people about career advice, given that I (apparently) am in the long haul for this one

astral hawk
#

But from what I can tell, most people here are really chill and helpful

rustic belfry
#

idk, people can set their notification settings

astral hawk
#

Yeah I trusted that when I pinged Froyok today

#

Hopefully you are right 🙂

#

I should get accustomed to frog emojis again

#

These default ones are cringe

#

By the way how's it going with you? Last I saw you you were working on a software rasterizer?

#

Today you shared some cuda code

rustic belfry
#

yeah I just write CUDA now, I don't have a thread, I just update my website sometimes

#

I am over shader languages

astral hawk
#

A level of zen I aspire to

rustic belfry
#

well, I can do whatever I want since it's just for personal projects and I don't care what anyone else thinks

astral hawk
#

How did the rasterizer go? Was it fun and/or educational?

rustic belfry
#

it was educational. I don't think I did very well on it and I got bored with it

#

I will likely do it again, via CUDA on the GPU though

astral hawk
#

It seems like I won't really understand understand fabian giesen's "A trip through the graphics pipeline" without writing a rasterizer

#

Cuda/compute seems like the other half of the medallion

rustic belfry
#

I wrote a pretty completish rasterizer, it just sucked and the perf was horrible

astral hawk
#

Isn't that kinda the thing though? Seeing as it is a software impl.

#

Unless you optimize to the bone, do simd etc.

#

Which still would probably suck I don't know

rustic belfry
#

well I sort of gave myself a dumb self inflicted roadblock by being on a from scratch kick at the time and I was refusing to use tracy which would have helped me solve my bottlenecks, so that was also an issue

astral hawk
#

I see

rustic belfry
#

are you working on your bloom on the same code base as your previous project?

#

what have you been doing for 6 months?

astral hawk
#

Yeah everything I do I incorporate into Kakadu (also slows me down I guess but I get to work on renderer arch. and a rendering pipeline so it is not a complete waste of time imo)

#

My actual work was kicking my ass for a while so I kinda had to slow down on my graphics studies

#

But I got lots of bugfixes done, did a major renderer refactor, introduced fullscreen effects as distinct stuff from regular rendering etc. to pave the way for Bloom

#

Did some research into msaa etc.

#

Not the best 6 months but I am picking up the pace

#

Monday -> Starting SSAO

#

Or maybe tomorrow if I can commit everything today

#

Oh I did a ray tracer project last Friday I think

#

That was fun

#

I may do fragment shader based ray tracing like this in Kakadu perhaps

rustic belfry
#

sorry work's been hard on you

#

good to see you're active again however

astral hawk
#

Thanks, yeah I will force myself to be active no matter what

fluid parrot
#

To compute luminance you should stay in a linear color space anyway

astral hawk
#

Thanks a lot

astral hawk
#

Fine is the partial karis average on 4x4 blocks, using 13 averages

#

13 karis averages*

#

Along with spatial weights

rustic belfry
#

how many mips is that?

#

I have 7 mip levels where 0 is the source

#

with fewer mips the bloom gets really tight

#

I probably over did it but I like the result

astral hawk
#

Oh yeah I set it to 4 mips (including the original reso) during implementation and forgot to increase it

#

Let's try 6-7

#

obs darkens the colors despite my effort to remedy it but it is not that bad

#

My texture viewer doesn't render the mips for some reason I know that but didn't have time to check it

#

Now that bloom is all done and pushed I can check it

rustic belfry
#

very nice

astral hawk
#

Thanks!

rustic belfry
#

yeah obs does that to me too I gave up

#

trying to figure it out

astral hawk
#

Either I remember it wrong or Nvidia captured faithfully

rustic belfry
#

I use nvidia

#

idk

astral hawk
#

I swapped graphics cards with my wife because my trusty old gtx 1080 couldn't drive my new monitor. Now I have a radeon 6700xt

rustic belfry
#

screenshots look a little worse too

astral hawk
astral hawk
rustic belfry
#

probably a skill issue on my part

#

I just don't care enough

astral hawk
#

Doubt it. These things are needlessly complex nowadays

astral hawk
# astral hawk

The idiot me would probably waste a couple more days on why the orbs are flickering (or more like a scanline effect, I don't know how to describe it).

Logl article has a comment saying that the upsampling radius (the one I removed from my shader just like Froyok's) should be aspect-ratio aware or otherwise it could cause this.

I did notice that visually and guarded against it.

But now that I increased the mip count from 4 to 6-7, it became visible again for some reason.

#

But my answer to this is

astral hawk
rustic belfry
#

I don't see any flickering?

#

oh I do

#

hrmmmm

#

no that's just you changing the mips

astral hawk
#

Although it does look distracting

rustic belfry
#

I don't see any flickering

#

what do you mean?

astral hawk
#

Closer camera will show it

#

Watch this one

rustic belfry
#

oh, that's a sampling artifact

#

how do you sample?

#

shouldn't be square like that

#

that's why I wrote the bilinear sample that was in my code

#

to get rid of a similar artifact I had

#

but if you're just using a hardware texture you can just set the sampling?

astral hawk
#

The sampling should be set to bilinear

#

Let me check

rustic belfry
#

those straight lines on the edges look sus

astral hawk
#

Indeed

rustic belfry
#

I had a very similar issue until I fixed how I sampled

astral hawk
#

Thanks for the tip. I will look into the sampling

#

RenderDoc shows that all mips including the original hdr color buffer use GL_LINEAR for min and mag

#

with the wrapping mode set to clamp to edge for both directions

rustic belfry
#

hrm, what resolution is your bloom image at?

#

I use a separate image for the mips, the same dimension of my very large draw image (maximum physical display size dimensions), and it doesn't include mip0, I only write mips 1 through 6 to it

#

my image is way too big

#

I don't think that's it though, maybe the mip dimensions have an off by one error or something

#

this is my sampling code

__forceinline__ __device__ float4
sample_clamped(int2 uvs, int2 mip_dims, int2 src_dims, int2 mip_offset, float4 *pixels) {
  i32 x = CLAMP_MIN(CLAMP_MAX(uvs.x + mip_offset.x, mip_offset.x + mip_dims.x - 1), mip_offset.x);
  i32 y = CLAMP_MIN(CLAMP_MAX(uvs.y + mip_offset.y, mip_offset.y + mip_dims.y - 1), mip_offset.y);
  return pixels[y * src_dims.x + x];
}

__forceinline__ __device__ float4
sample_bilinear(float2 uv, int2 mip_dims, int2 src_dims, int2 mip_offset, float4 *pixels) {
  float2 texel = uv - 0.5f;
  int2 i0 = make_int2(floorf(texel.x), floorf(texel.y));
  int2 i1 = i0 + make_int2(1, 1);
  float2 f = make_float2(texel.x - i0.x, texel.y - i0.y);

  // clang-format off
  float4 s00 = sample_clamped(i0, mip_dims,          src_dims, mip_offset, pixels);
  float4 s10 = sample_clamped(make_int2(i1.x, i0.y), mip_dims, src_dims, mip_offset, pixels);
  float4 s01 = sample_clamped(make_int2(i0.x, i1.y), mip_dims, src_dims, mip_offset, pixels);
  float4 s11 = sample_clamped(i1,                    mip_dims, src_dims, mip_offset, pixels);
  // clang-format on

  return lerp(lerp(s00, s10, f.x), lerp(s01, s11, f.x), f.y);
}

I am not using a hardware texture, just a bitmap

#
__global__ void ps_upsample(
    i32 width, i32 height, float4 *pixels, float4 *mips, i32 mip_level, f32 bloom_intensity
) {
  i32 x = threadIdx.x + blockIdx.x * blockDim.x;
  i32 y = threadIdx.y + blockIdx.y * blockDim.y;

  i32 mip_width = width >> mip_level;
  i32 mip_height = height >> mip_level;
  if (x >= mip_width || y >= mip_height)
    return;
__global__ void ps_downsample(i32 width, i32 height, float4 *pixels, float4 *mips, i32 mip_level) {
  i32 x = threadIdx.x + blockIdx.x * blockDim.x;
  i32 y = threadIdx.y + blockIdx.y * blockDim.y;

  int2 idx = make_int2(x, y);
  int2 src_dims = make_int2(width, height);

  i32 mip_width = width >> mip_level;
  i32 mip_height = height >> mip_level;
  if (x >= mip_width || y >= mip_height)
    return;

this is how I get my mip dimensions
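
On the possible off-by-one: a small host-side sketch of the same shift-based mip size, with a clamp to 1 added as an assumption (the kernels above do not show one). `Extent` and `mipExtent` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

struct Extent { int width, height; };

// Size of mip `level` for a given base size. Each level halves and rounds
// down, clamped so very small mips never reach zero.
Extent mipExtent(int baseWidth, int baseHeight, int level) {
    assert(baseWidth > 0 && baseHeight > 0 && level >= 0 && level < 31);
    return { std::max(1, baseWidth >> level), std::max(1, baseHeight >> level) };
}
```

Odd sizes are where mismatches tend to appear: 1080 goes 540, 270, 135, 67, so mip 4 is 67 rows, not 67.5, and any code that derives texel size from the base resolution instead of the actual mip size will drift slightly.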

#

my number of threads are greater than the image dimensions

ps_internal void downsample(
    ps_ctx_t &ctx, cudaStream_t stream, i32 width, i32 height, float4 *pixels, float4 *mips
) {
  {
    for (i32 mip_level = 1; mip_level < 7; mip_level++) {

      i32 block = 16;
      i32 mip_width = width >> mip_level;
      i32 mip_height = height >> mip_level;
      dim3 dimBlock(block, block);
      dim3 dimGrid((mip_width + block - 1) / block, (mip_height + block - 1) / block);

      ps_downsample<<<dimGrid, dimBlock, 0, stream>>>(width, height, pixels, mips, mip_level);
      getLastCudaError("downsample failed");
    }
  }
}

ps_internal void
upsample(ps_ctx_t &ctx, cudaStream_t stream, i32 width, i32 height, float4 *pixels, float4 *mips) {
  {
    i32 end_mip = ctx.gpuc_ctx->debug_bloom_downsample ? 1 : 0;
    for (i32 mip_level = 5; mip_level >= end_mip; mip_level--) {

      i32 block = 16;
      i32 mip_width = width >> mip_level;
      i32 mip_height = height >> mip_level;
      dim3 dimBlock(block, block);
      dim3 dimGrid((mip_width + block - 1) / block, (mip_height + block - 1) / block);

      ps_upsample<<<dimGrid, dimBlock, 0, stream>>>(
          width,
          height,
          pixels,
          mips,
          mip_level,
          ctx.gpuc_ctx->bloom_intensifier
      );
      getLastCudaError("upsample failed");
    }
  }
}
#

dim3 dimGrid((mip_width + block - 1) / block, (mip_height + block - 1) / block);

#

notice the + block - 1

#

that avoids clipping
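
The `+ block - 1` trick is just an integer ceiling division. A tiny host-side sketch, with `blocksNeeded` as an illustrative name:

```cpp
#include <cassert>

// Number of blocks of `blockSize` threads needed to cover `extent` pixels.
// Equivalent to ceil(extent / blockSize) in integer arithmetic.
int blocksNeeded(int extent, int blockSize) {
    assert(extent >= 0 && blockSize > 0);
    return (extent + blockSize - 1) / blockSize;
}
```

For a 67-pixel-wide mip and 16-wide blocks this gives 5 blocks (80 threads), whereas plain `67 / 16` gives 4 and leaves the last 3 columns unwritten. The surplus threads are what the `if (x >= mip_width || y >= mip_height) return;` guard in the kernels is for.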

#

you shouldn't use my sample_bilinear, since you have hardware textures, just showing it to you

astral hawk
rustic belfry
#

does it include mip 0?

astral hawk
#

Mips are all separate textures

#

Original is a separate texture2d

#

Mips are all separate texture2ds as well

#

I didn't "get to" texture arrays yet lol

astral hawk
rustic belfry
#

np, gl

astral hawk
#

This is so cool

#

Whatever became of this?

rustic belfry
#

became of what?

#

that's my current project

astral hawk
#

Oh ok I thought you switched to something else because Rosy had a thread

#

Oh it is still there ok

astral hawk
#

Renderer updates the "intrinsic" (i.e., renderer-internal) UBO's viewport_size field whenever framebuffer resizes.

I used that when calculating the "radius" (delta uv) for the upsampling shader, but during the bloom steps that uniform is not updated, so it is always the original viewport size rather than the size of the mip being sampled.

Passing the source/input texture's resolution as a uniform to the shader and using that to calculate the radius/deltauv fixed it.
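
A rough sketch of the CPU side of that fix, assuming each mip is half the size of the previous one. `Vec2`, `texel_size` and `mip_extent` are made-up names for illustration, not Kakadu code:

```cpp
#include <cassert>

struct Vec2 {
  float x;
  float y;
};

// UV-space size of one texel of a texture with the given resolution.
// This is the value to upload as the per-pass uniform, instead of relying
// on the viewport size.
inline Vec2 texel_size(int width, int height) {
  assert(width > 0 && height > 0);
  return Vec2{1.0f / static_cast<float>(width), 1.0f / static_cast<float>(height)};
}

// Extent of a mip in a chain where every level halves the previous one,
// clamped so that it never reaches zero.
inline int mip_extent(int full_extent, int mip_level) {
  const int extent = full_extent >> mip_level;
  return extent > 0 ? extent : 1;
}
```

With a 1920x1080 viewport, mip 1 is 960x540, so its delta uv is twice as large as the one derived from the viewport. Using the viewport value for every pass would make the filter footprint too small on the lower mips.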

#

There still is a little flicker but it looks normal and non-squarey

#

obs causes stutter during record sometimes

astral hawk
#

I made a revised study plan. I am open to suggestions

#
## Study / Implementation Plan

### Phase 1 — Image-Space Effects
- **RTR**
  - Chapter 12: Image Space Effects
- **Kakadu / LOGL**
  - SSAO
  - Depth of Field
  - Motion Blur
  - Lens Flare
  - (Bloom already done)

  Use Jorge Jimenez slides for these.

---

### Phase 2 — Physically Based Lighting & Local / Global Illumination
- **RTR**
  - Chapter 9: Physically Based Shading
  - Chapter 10: Local Illumination
    - 10.1 Area Light Sources
  - Chapter 11: Global Illumination  
    - 11.3 Ambient Occlusion
- **Kakadu**
  - Implement PBR (LOGL and/or other sources)
  - Implement Area Lights

---

### (In Between)
- **Kakadu: DSA refactor across codebase**
- **Kakadu: Integrate Texture Arrays**

---

### Phase 3 — Shadow Mapping Enhancement
- **RTR**
  - Chapter 7 (relevant sections): Shadows
- **Kakadu**
  - Implement Cascaded Shadow Maps (CSM)
  - Implement smart/auto directional-light camera placement

---

### Phase 4 — Re-evaluation
- **Kakadu**
  - Revisit and refine SSAO using insights from RTR Chapter 11.3 IF NEEDED.

---

### Phase 5 — Compute
- **RTR**
  - Chapter 23 (selective revisit)
- **Kakadu**
  - Introduce SSBOs
  - Introduce compute shaders
  - Migrate an existing pass to compute
    - SSAO or blur
    - light culling or similar

---

### Phase 6 — Deferred Rendering
- **RTR**
  - Chapter 20.1: Deferred Shading
- **Kakadu / LOGL**
  - Implement Deferred Rendering
  - Integrate compute-based light culling where appropriate

---

### Phase 7 — Visibility
- **RTR**
  - Chapter 19.4: View Frustum Culling
- **Kakadu / LOGL**
  - Implement frustum culling
#

I am straying from the ordering in logl because ssao for example is something I can tackle right now as I implemented the beginnings of a fullscreen effects system in Kakadu and I only have bloom as an effect for now.

No need to put deferred in the middle of bloom and ssao I think.

#

In RTR I am chronologically at chapter 9 physically based shading. I jumped to chapter 23 (I think) to read the hardware chapter.

I am thinking about skipping 9-10-11 for now and getting image-space effects (chapter 12) out first, then going into PBR full-blown and continuing from there.

rustic belfry
#

looks great!

astral hawk
#

Hmm ok there is a reason deferred rendering comes before ssao in logl: per-fragment normals are a thing by default in deferred rendering, and the (Crysis-style) ssao uses them.

#

I'll just read the deferred rendering chapter for now to get an idea on the implementation details and decide then.

astral hawk
#

Then I saw Devsh's comments here: #graphics-techniques message

#

Disclaimer: I shouldn't even begin to compare forward vs deferred at this stage, I know. That is a given and I fully acknowledge it. It was past working hours and I saw the vid on youtube and wanted to share.

astral hawk
#

Seeing @fluid parrot's immensely cool splash screen I thought why not

#

Animating stuff so I can get cool images with bloom

#

Maybe this one dunno

rustic belfry
#

wow

#

I love bloom

#

I can't wait until I have more better lighting and more stuff looks good, it adds so much

fluid parrot
#

Bloom is the key to nice images 😎

astral hawk
#

I wanted to make it animated by using the renderer itself (which initializes in 200 ms, give or take), but I can't: the splash screen animation would have to block initialization of the rest of the engine/app until it finishes, because the rest of the init code contains lots of gl calls as well, and opengl is not multithreaded, right?

This could be a cool idea in a vulkan/d3d12 renderer perhaps but not for opengl if I'm thinking right.

#

If the rest of the init. code was pure cpu code without any gl dependencies then sure. Otherwise, a blocking splash screen kinda defeats the purpose imo.

rustic belfry
#

well, you could do a little bit of extra stuff each splash screen frame instead of using a thread
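
One way that idea could look, as a sketch only: queue the init work as small steps and run a few of them between splash screen frames, all on the one thread that owns the GL context. `StagedInit` is a hypothetical helper, not something from either codebase:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: runs queued initialization steps a few at a time,
// so a splash screen can be redrawn between steps on the same thread.
class StagedInit {
public:
  void add(std::function<void()> step) {
    assert(step != nullptr);  // an empty std::function would throw when called
    steps_.push_back(std::move(step));
  }

  // Runs up to `max_steps` pending steps; returns true once everything has run.
  bool run_some(std::size_t max_steps) {
    for (std::size_t i = 0; i < max_steps && next_ < steps_.size(); ++i) {
      steps_[next_++]();
    }
    return done();
  }

  bool done() const { return next_ == steps_.size(); }

private:
  std::vector<std::function<void()>> steps_;
  std::size_t next_ = 0;
};
```

The splash loop would then call `run_some(1)`, draw one splash frame, swap buffers, and repeat until `done()` returns true. The animation only stays smooth if each step is short, so a long step (e.g. loading a big texture) would have to be split up.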

#

you can use command buffers in separate threads in vulkan, and opengl contexts are not as convenient

astral hawk
#

Yeah in gl it would be diminishing returns for unnecessary hassle it seems

#

Unity gets by with a static splash screen so why not

#

We can't all be Ombre

astral hawk
#

I'll try changing the preview image of this post/thread. Hope I don't screw it permanently

#

Yeah I nuked it lol

#

Ok it's back but it looks meh

#

Maybe this will look better due to aspect ratio

rustic belfry
#

I think it looks good in the community projects list

#

it stands out

astral hawk
#

Heyo

#

Update: I am working on a big refactor that was long overdue before I continue with augmenting the Renderer with newer stuff (deferred rendering is on the way).

#

Currently the engine architecture is in a weird place (by design, or at least by intentional postponing of fixing it)

#

Inside the solution, there are multiple projects:

  • Vendor (has deps like stb, fastgltf, ImGui etc.) -> produces static lib.
  • Engine (has EVERYTHING else that is not client application/game specific) -> also produces a static lib. and consumes vendor.lib.
    Most notably (for the refactor) contains ALL editor code as well, dispersed across the codebase in the form of #ifdef _EDITOR blocks. All ImGui code is editor code.
  • Client applications: Currently there are 3 projects: Sandbox, HDR-Demo and Bloom Demo. There should/could be many more, but creating a new project means essentially duplicating the dir/proj of an existing one and modifying it. These are all executables by the way and they weirdly contain the editor and the engine runtime as well as client app logic, all in one.
#

Creating projects is a pain point right now.

#

Also I for some reason took Cherno's Application base class I think (not sure, it's been a while) and have virtual functions like Update(), Initialize() etc. even though there can not be more than 1 client application for a given project, so virtual dispatch is unnecessary (although irrelevant perf wise since every frame there is 1 indirection into long-running functions like Update(), Render() etc.).

#

What I want is to do some ordered refactors:

  1. Separate editor code from engine code; Do this in the form of a new Editor namespace and translation units, containing all editor code.
    The rest of the codebase should mostly be unaware of the editor except for the Application calling the update and render functions of the editor per frame and keeping an Editor::Context member around.
#
  2. Separate the engine and the editor into 2 projects.
  3. Turn engine into a dll. I am not so sure whether I should do this or not. Aside from possibly hot reloading the Engine itself, I don't see an immediate gain from this.
  4. Turn editor into an executable. This is bound to happen soon. It is weird launching the client application executable but in reality you are launching the editor, but with custom logic per application.
  5. Turn client applications into dlls. They could be hot reloaded which is a must for iteration in my opinion.
  6. Separate client applications into two parts: A stub executable and a dll that will contain the actual logic. See number 7 below.
  7. Implement project creation from the Editor; When a new project is created, the editor will simply create a new c++ project that will house the client app logic in the form of a dll. When the client app is "built" from the editor, the editor will generate the stub executable which will load the engine runtime dll and give the control to it. Engine runtime will initialize itself (most notably the renderer, which is the bulk of the engine atm) and then will load and initialize the client app dll. Then it will enter the main loop and every frame call into the client app dll to execute custom update/render etc. logic on top of the engine runtime logic.
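
The runtime-calls-into-client-dll flow from the last list item could be sketched roughly like this. Everything here (`ClientApi`, `run_frames`, the `user_data` pointer) is a hypothetical shape for illustration, not an existing Kakadu interface. With explicit linking, each function pointer would be looked up by name after loading the dll (e.g. with `GetProcAddress` on Windows), which would also replace the virtual `Update()`/`Initialize()` calls on the Application base class:

```cpp
#include <cassert>

// Hypothetical interface a client dll would expose to the engine runtime.
// The dll loading itself is left out of this sketch.
struct ClientApi {
  void (*initialize)(void *user_data);
  void (*update)(void *user_data, float dt_seconds);
  void (*shutdown)(void *user_data);
};

// Sketch of the runtime's main loop: engine work first, then the client's
// per-frame logic. Returns the number of frames that were run.
inline int run_frames(const ClientApi &client, void *user_data, int frame_count,
                      float dt_seconds) {
  assert(client.initialize && client.update && client.shutdown);
  client.initialize(user_data);
  int frames_run = 0;
  for (int frame = 0; frame < frame_count; ++frame) {
    // Engine-side update/render would run here.
    client.update(user_data, dt_seconds);
    ++frames_run;
  }
  client.shutdown(user_data);
  return frames_run;
}
```

A plain struct of function pointers keeps the boundary C-compatible, which tends to matter for hot reloading: after the dll is rebuilt and reloaded, only the pointers have to be refreshed, while state reachable through `user_data` can stay alive on the engine side.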
#

I am not sure how heavily I will invest in this, except for number 1, because I am doing that one fully.

#

I plan on implementing all of these sometime but I need to balance maintaining Kakadu with learning graphics as well (although one could argue this type of engine arch. stuff is also a responsibility of a graphics programmer, at least the renderer).

#

Let's just see where things will go from 1).

#

I do want to create a "Many Lights" project/sample as well by the way and that is what triggered this refactor that was long overdue.

#

Oh and 8) Currently there is no scene data serialized to drive. Scene data is initialized via code during every run. So serialize/deserialize via a format like json maybe.
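
A tiny sketch of what the "serialize" half could look like, hand-rolled and without any JSON library. `PointLight`, `Scene` and `to_json` are invented for the example; the real scene representation is not shown in this thread, and reading the data back would more realistically go through a proper JSON parser:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical scene types, for illustration only.
struct PointLight {
  float position[3];
  float intensity;
};

struct Scene {
  std::string name;
  std::vector<PointLight> lights;
};

// Writes the scene as a compact JSON object.
// Assumption: names contain no characters that would need escaping.
inline std::string to_json(const Scene &scene) {
  assert(scene.name.find('"') == std::string::npos);
  std::ostringstream out;
  out << "{\"name\":\"" << scene.name << "\",\"lights\":[";
  for (std::size_t i = 0; i < scene.lights.size(); ++i) {
    const PointLight &light = scene.lights[i];
    if (i > 0) out << ',';
    out << "{\"position\":[" << light.position[0] << ',' << light.position[1] << ','
        << light.position[2] << "],\"intensity\":" << light.intensity << '}';
  }
  out << "]}";
  return out.str();
}
```

The point is only that the scene becomes data on disk instead of code that runs on every launch, so a new project can start from a scene file rather than a copied-and-edited source file.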

#

7 and 8 combined would make creating new projects a breeze compared to status quo.

#

#1067777224528375858 message

#

Adding this to the pile of stuff to fix after the refactor

astral hawk
#

Except for the Renderer.cpp, which contains a shit load of ImGui calls, the decoupling is (half) done.

#

Though I am fully done for the day

astral hawk
#

Update: Renderer is also rid of all editor code (except for some debug bounds checks but those are unimportant right now). I've been busy with my job again but all is good.

#

I brushed up on dynamic linking in the MSDN docs today. I will go for explicit linking so I can hot reload stuff when I get into that.