#Iris - A Journey through OpenGL and beyond to learn Graphics

wicked notch
#

Well it's 1ms cumulative, I'm doing this 5 times

#

Once for the main camera and once for each shadow cascade

#

Performance was actually a bit worse than not doing ROC on shadows, but I assume it scales better?

finite yacht
#

oh ok. Is it actually worth it for the shadows.... Yeah about to ask

wicked notch
#

With representative_fragment_test it looks like it's viable

#

Like it takes much less time than the shadow map passes themselves

#

Previously it was ~400us for each cascade

#

Depending on the cascade of course.

finite yacht
#

yeah but without the extension idk. Because the shadow map generation pass the fragment shader does basically nothing

#

so that you are not saving anything

wicked notch
#

Ye

#

I was looking at other methods for shadow caster culling, but they are all so ridiculously complicated that it turned me off

#

Like building a whole octree to accelerate culling and stuff.. crazy stuff

finite yacht
#

are you already doing compute frustum culling

wicked notch
#

Yes

#

Takes no time at all

#

I'm surprised by how little time it takes honestly lol

#

It's the small gray rect

#

Less than 10 microseconds KEKW

finite yacht
#

yep its amazing and so useful

#

16 bit depth can help speed up shadow map generation. and dont use a geometry shader if you are

wicked notch
#

Yeah no geometry shader

#

I was thinking about going full reverse Z because ROC has some precision issues on very small AABBs

#

Such as floor planes etc..

finite yacht
#

like false negatives, where the aabb doesnt produce any fragments and the mesh is not drawn?

#

but then the mesh wouldnt be drawn in the first place i guess

wicked notch
#

No, it's because of temporal incoherence due to the camera moving relatively quickly and ROC not being able to keep up with last frame's depth

#

I guess at least

#

I could fix this by making the AABBs slightly bigger though, now that I think about it

#

To add some buffering

frank sail
#

last frame's depth should be irrelevant

frank sail
#

the temporal incoherence is solved by the step that renders just-disoccluded objects

wicked notch
#

Ah yeah, I forgot about that

#

I found that adding some epsilon to the AABB works pretty good at "solving" this temporal incoherence too

#

And it doesn't reduce the effectiveness of culling by much

#
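```glsl
void main() {
    ...
    // Unit cube centered on the origin, one corner per vertex.
    vec3 position = make_cube(gl_VertexID) - 0.5;
    const vec3 aabb_max = object_info.aabb.max.xyz;
    const vec3 aabb_min = object_info.aabb.min.xyz;
    // Half extents of the box.
    const vec3 aabb_size = (aabb_max - aabb_min) * 0.5;
    const vec3 aabb_center = object_info.aabb.center.xyz;
    // Camera position in the object's local space. Points go through the plain
    // inverse; the inverse transpose is only meant for normals.
    const vec3 local_view_pos = vec3(inverse(transform) * vec4(camera.position, 1.0));
    // Inflate the box by an epsilon (20 units here) to hide the temporal incoherence.
    position *= (aabb_size + 20.0) * 2.0;
    position += aabb_center;
    if (all(lessThan(abs(local_view_pos - aabb_center), aabb_size))) {
        // Camera is inside the box: mark the object visible and move the vertex out of clip space.
        roc_visibility[object_id] = 1;
        gl_Position = vec4(-2.0, -2.0, -1.0, 1.0);
    } else {
        o_object_id = object_id;
        gl_Position = camera.pv * transform * vec4(position, 1.0);
    }
}
```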
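
The same two ideas from the shader (pad the box by an epsilon, and treat a camera inside the box as always visible) can be sketched on the CPU. This is only an illustration: `Vec3`, `Aabb`, `inflate` and `contains` are hypothetical names, not part of the Iris code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

struct Aabb {
    Vec3 center;
    Vec3 half_extent;  // half the box size along each axis
};

// Grow the box by `epsilon` on every side (hypothetical helper).
Aabb inflate(const Aabb& box, float epsilon) {
    assert(epsilon >= 0.0f);
    return {box.center,
            {box.half_extent.x + epsilon,
             box.half_extent.y + epsilon,
             box.half_extent.z + epsilon}};
}

// True when `p` lies strictly inside the box, the same test the shader
// performs with all(lessThan(abs(p - center), half_extent)).
bool contains(const Aabb& box, const Vec3& p) {
    return std::fabs(p.x - box.center.x) < box.half_extent.x &&
           std::fabs(p.y - box.center.y) < box.half_extent.y &&
           std::fabs(p.z - box.center.z) < box.half_extent.z;
}
```

A larger epsilon hides more popping but culls less, so the value is a per-scene tuning knob rather than a constant.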
#

LODs are also something I could do in the future, but I would like some hugemonogous scenes to test with first

#

I'm getting sidetracked here a lot though, hehe, TAA is the primary objective.

wicked notch
#

The infamous reprojection is here

#

Although now that I'm beginning to understand TAA I know I can't just apply this to the shadow map hmm...

#

I would definitely need a shadow mask.. maybe?

#

Oh well, temporal supersampling shadows can wait for now

frank sail
#

yeah

#

shivoa mentioned that shadow masks can be trivially TAA'd in the context of the new zelda game having awful shadow aliasing

#

it's probably worth it if you're suffering from bad aliasing from the sun shadow

wicked notch
#

Ahh I see

#

Speaking of bad aliasing, Elden Ring on ultra had pretty bad aliasing too..

#

I wonder why they don't just filter the thing..

#

For Zelda, is the nintendo switch unable to perform shadow filtering efficiently?

frank sail
#

spatial filtering doesn't remove temporal aliasing

wicked notch
#

Yeah, I mean spatio temporal filtering

frank sail
#

yeah idk why

#

buncha amateurs lel

wicked notch
#

Beaten by a beginner in OpenGL smh

frank sail
#

tbf, they have to carefully consider all features they add since they will eat into the rendering budget

wicked notch
#

Ye jokes aside I'm sure they err on the side of caution most of the time

frank sail
#

that being said, temporally filtering a shadow mask should be quite cheap in the grand scheme of things

#

they probably just don't have motion vectors or some other thing that would be dealbreaker when it comes to this

#

adding those would likely make rendering the scene more expensive

wicked notch
#

Hmm why is it a deal breaker? It's just a float32 texture?

#

Could be float16 too

frank sail
#

that's quite a bit of extra bandwidth

wicked notch
#

Hmm I see

frank sail
#

imagine if the g-buffer was already like 96 bits, an extra 32 or 64 could affect perf quite a lot
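
To put rough numbers on that, here is a small sketch that computes how many bytes a single full-screen write of the g-buffer costs. The function name and the 1920x1080 resolution used below are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Bytes written when every pixel of the g-buffer is stored once.
// `bits_per_pixel` is the sum over all attachments (hypothetical helper).
std::uint64_t gbuffer_bytes_per_frame(std::uint32_t width,
                                      std::uint32_t height,
                                      std::uint32_t bits_per_pixel) {
    assert(bits_per_pixel % 8 == 0);
    return std::uint64_t{width} * height * (bits_per_pixel / 8);
}
```

At 1920x1080, 96 bits per pixel is 24,883,200 bytes per write, and adding a 32-bit motion vector target raises that to 33,177,600 bytes, a third more traffic before anything reads it back.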

wicked notch
#

I am actually limited by VRAM throughput just by rendering shadow maps nervous

#

And that's on desktop, I imagine mobile and low power chips have even lower bandwidth?

wispy spear
#

i dont understand the hype for zelda 😦

wicked notch
#

Maybe it's good I don't know. I never played it

frank sail
#

it's allegedly a fun game

#

I know people really liked the previous one, so of course they would be excited for a sequel

wispy spear
#

hmm, the lets plays i watched so far were meh, but thats probably just me

wicked notch
#

I like FPS and Arena more anyways 😛

#

An open world like Elden Ring would be good too, I loved it

wispy spear
#

also didnt like that one, but that applies to all dark souls and clones

wicked notch
#

Anyways, after I get TAA done I really want to experiment with Atmospheric Scattering, I'm getting bored of the void sky

wispy spear
#

arena sounds like battle royal-esque games

#

or that rocket league thing

wicked notch
#

By arena I mean stuff like ULTRAKILL or DOOM

frank sail
#

mmm ultrakill is special

wicked notch
#

They are considered arena shooters right?

wispy spear
#

🇹🇫

#

doom is just your bog standard fps

#

nothing arena about it

frank sail
#

it's relegated to boomer shooter now

wicked notch
#

I thought "standard" was more like CoD or something

wispy spear
#

nah its all the same

wicked notch
#

Anyways, I like DOOM

wispy spear
#

you might like prodeus too 🙂

#

has this neat giraffics look too

wicked notch
#

Very cool indeed

frank sail
#

I definitely have a leaning towards FPS games

#

hehe hl2:dm is a boomery shooter

wicked notch
#

DRG is so good

frank sail
#

one of the most wholesome gaming communities

wicked notch
#

🪨and🪨

wispy spear
#

hehe

frank sail
#

celeste and hollow knight are also pretty good if you want a break from fpses

#

I played both last year and was pretty blown away

wispy spear
#

i know the author of celeste

frank sail
#

Personally?

wispy spear
#

no

frank sail
#

Ah

wispy spear
#

well its made by 2 peeps

#

they hang out on the FNA discord

#

(FNA = XNA reimplementation)

frank sail
#

cool

#

I heard they're working on a new game, but I forgor what it was called

wispy spear
#

i didnt like celeste either btw : (

frank sail
#

You mustn't be easily impressed like I am

wispy spear
#

perhaps i prefer more space-y stuff

frank sail
#

I couldn't tell

#

A VR space game would be cool though

wicked notch
#

Except it's Bethesda...

wispy spear
#

ja 😦

#

it will be another outer worlds

wicked notch
#

I also want a proper space game, Star Citizen is the closest we get

wispy spear
#

same

wicked notch
#

But it's permanently beta

wispy spear
#

Freelancer was superb

#

then i played Everspace 2 recently which is praised as the new freelancer or better even

#

but the end was VERY disappointing

frank sail
#

I thought about it a little more and the shadows in totk can probably still be TAA'd for static objects without motion vectors

#

marking dynamic objects to ignore would still require 1 bit in the gbuffer

wicked notch
#

This do be some good anti aliasing

finite yacht
#

why can I see the stone behind some curtains nervous

wicked notch
#

Twas a little bugged KEKW

#

This is not

#

(There's the comparison in #wip)

#

I wonder how I should handle stuff that just pops into existence, what do I use as the previous frame's model matrix?

#

Just a zeroed out mat4?

finite yacht
#

just using the current one sounds more correct

#

you can also have a bias mask that controls how strongly to blend between history and current color
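
A minimal sketch of that blend, assuming the bias mask is sampled into a per-pixel `history_weight` in [0, 1]. `Color` and `resolve` are hypothetical names, and a real TAA resolve would also clamp or clip the history color first.

```cpp
#include <cassert>

struct Color {
    float r, g, b;
};

// Linear blend between the current frame and the reprojected history.
// history_weight = 0 keeps only the current color, 1 keeps only history.
Color resolve(const Color& current, const Color& history, float history_weight) {
    assert(history_weight >= 0.0f && history_weight <= 1.0f);
    const float w = history_weight;
    return {current.r + (history.r - current.r) * w,
            current.g + (history.g - current.g) * w,
            current.b + (history.b - current.b) * w};
}
```

For an object that just popped into existence, a weight of 0 in the mask would make the resolve ignore whatever history happens to be under it.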

wicked notch
#

Hmm yes makes sense

wicked notch
#

Motion Vectors™

wicked notch
#

Full TAA is here, it looks incredibly good

#

I should filter it a bit more though, so objects in the distance don't shimmer, but I know that I can tweak and tune TAA for the next 10 years and it'll never be perfect 😄

wispy spear
#

😄

#

the curtains and arches are super smoof 🙂

wicked notch
#

I will patiently wait for Mr. Jaker to port FSR2 to OpenGL so I can have free antialiasing AND upscaling KEKW

wispy spear
#

: >

#

im sure hes ghosting you

wicked notch
#

He was hard at work with FSR2 this morning so that's good 😛

#

Where to go next hmm

#

I kinda want to continue with mesh shaders, but the situation's dire in OpenGL..

#

Next is Atmospheric Scattering which is really cool

#

And some UI wouldn't be too bad as well..

#

Oh, terrain rendering as well!

#

I will have to get started, sooner or later, on making open worlds after all

wispy spear
#

nah

#

next is volume clustered rendering

#

and then is void's GI thingy

#

before all else

wispy spear
#

JPvanoosten's

#

ah perhaps your paper is an evolution of it

wicked notch
#

This?

wispy spear
#

yes

wicked notch
#

Psychedelic Lighting Simulator I see

wispy spear
#

hehe

wicked notch
#

It's basically point light culling?

wispy spear
#

i mean i couldnt tell the difference between 500 and 1mio lights

#

not just point light i guess

wicked notch
#

Yeah, area lights too I guess

#

Hmm I want to do UI first though

frank sail
#

deccer has found someone worthy of the task

wicked notch
#

Progress report on Operation Black Mage (porting FSR2 to OpenGL)

wispy spear
#

there is also rmlui

frank sail
#

The shaders and cmake are probably done, now I just need to make the backend

wicked notch
#

So... about 1% done?

#

Great KEKW

frank sail
#

Btw, how is your TAA under motion

wicked notch
#

Bit of shimmering

#

For some reason my GPU really doesn't like encoding H264

wispy spear
#

looks neat still

frank sail
#

Even the best TAA implementations I've seen shimmer a little bit in motion

wicked notch
#

Visual representation of Jaker's progress:

#

Jk by the way, it's your personal time so I'll be patient

wispy spear
#

: )

wicked notch
#

~~Perhaps I'll switch to Vulkan faster than you port it KEKW~~

#

By the way RMLUI looks very interesting, finally my useless HTML and CSS skills will be put to use

#

On a whim I decided to actually look at the vulkan tutorial in my bookmarks

#

This is.. frightening lol

#

Words I've never seen before KEKW

frank sail
#

Tbf you only have to do like 70% of that once in your whole life

#

Since it's boilerplate

wicked notch
#

Yeah but why is there a physical and a logical device, what does that even mean

#

For SLI or something?

#

Ah alright it's explained, perhaps I should just read

#

VkPhysicalDevice is the actual hardware, ~~VkLogicalDevice~~ VkDevice lets you specify what extensions and features of the VkPhysicalDevice you want to use and stuff like that?

#

To draw to a VkImage acquired from the swap chain, we have to wrap it into a VkImageView and VkFramebuffer. An image view references a specific part of an image to be used, and a framebuffer references image views

#

Ah yes, indirection.

wispy spear
#

how come you fiddle with Vk now?

wicked notch
#

Nothing special, I just wanted to check out how much I can delay using Vulkan KEKW

wispy spear
#

heh

cunning atlas
#

honestly i set up my device stuff once and I look at it only when I need to enable something

#

if you use C++ vkbootstrap helps a lot with the setup too

wicked notch
#

I really like this Vulkan thingy where you put your stuff into structs and then use these as parameters

#

It reminds me of my framebuffer constructor with 10 parameters or something

#

I'll probably adopt this
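
For reference, a sketch of what that create-info pattern could look like for a framebuffer wrapper. Every name here (`FramebufferCreateInfo`, `Framebuffer`, the fields) is made up for illustration and no GL objects are created.

```cpp
#include <cassert>
#include <cstdint>

// All construction parameters live in one struct with sensible defaults,
// so call sites only name the fields they actually change.
struct FramebufferCreateInfo {
    std::uint32_t width = 0;
    std::uint32_t height = 0;
    std::uint32_t color_attachment_count = 1;
    std::uint32_t sample_count = 1;
    bool has_depth = true;
};

class Framebuffer {
public:
    explicit Framebuffer(const FramebufferCreateInfo& info) : info_(info) {
        assert(info.width > 0 && info.height > 0);
        assert(info.sample_count >= 1);
        // A real implementation would create the GL framebuffer here.
    }

    const FramebufferCreateInfo& info() const { return info_; }

private:
    FramebufferCreateInfo info_;
};
```

Adding a new option later only means adding a defaulted field, so existing call sites keep compiling.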

frank sail
#

don't forget that OpenGL features zero structs

wicked notch
#

Yeah

#

I really hate glCopyImageSubData

wispy spear
#

almost

frank sail
#

a mind boggling choice tbh

frank sail
#

You'd love fwog hehe

wispy spear
#

IF only fwog's "Cmd" bs would not be free floating floaters

frank sail
#

you can do whatever, I just like shilling

wicked notch
#

Yes, I think I've used enough raw OpenGL, so I'll gladly accept your shilling

wispy spear
#

wouldnt surprise me that you switch to vk and make a better Fwovk before jaker even considers switching to vk

frank sail
#

Fvog

wispy spear
#

and martty will get inspiration from it for vuk2

wicked notch
#
```cpp
// IRIS_LOG_* and the _u32 literal are the project's own helpers.
#define GLFW_INCLUDE_VULKAN
#include <GLFW/glfw3.h>

int main() {
    if (!glfwInit()) {
        IRIS_LOG_ERROR("failed to initialize glfw");
        return -1;
    }

    // No OpenGL context: the window is only a surface for Vulkan.
    glfwWindowHint(GLFW_CLIENT_API, GLFW_NO_API);
    glfwWindowHint(GLFW_RESIZABLE, GLFW_FALSE);
    auto* window = glfwCreateWindow(800, 600, "IrisVk", nullptr, nullptr);

    // instance
    {
        auto count = 0_u32;
        const auto** extensions = glfwGetRequiredInstanceExtensions(&count);

        auto application_info = VkApplicationInfo();
        application_info.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
        application_info.pNext = nullptr;
        application_info.pApplicationName = "IrisVk";
        application_info.applicationVersion = VK_MAKE_API_VERSION(0, 1, 0, 0);
        application_info.pEngineName = "IrisVk";
        application_info.engineVersion = VK_MAKE_API_VERSION(0, 1, 0, 0);
        application_info.apiVersion = VK_API_VERSION_1_3;

        auto instance_info = VkInstanceCreateInfo();
        instance_info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        instance_info.pNext = nullptr;
        instance_info.flags = {};
        instance_info.pApplicationInfo = &application_info;
        instance_info.enabledLayerCount = 0;
        instance_info.ppEnabledLayerNames = nullptr;
        instance_info.enabledExtensionCount = count;
        instance_info.ppEnabledExtensionNames = extensions;

        auto instance = VkInstance();
        const auto result = vkCreateInstance(&instance_info, nullptr, &instance);
        if (result != VK_SUCCESS) {
            IRIS_LOG_ERROR("failed to create instance");
            return -1;
        }
        IRIS_LOG_INFO("instance created: %p", (const void*)instance);
    }

    while (!glfwWindowShouldClose(window)) {
        glfwPollEvents();
    }

    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}
```
This is what I wrote in the past 30 minutes
#

And I'm already tired

#

It's 40 lines of code

wispy spear
#

btw

#

did you know that glfwTerminate calls glfwDestroyWindow

wicked notch
#

You need less than 30 for a colored triangle 😎

frank sail
#

glfwDestroyTheChild

wicked notch
#

Does it actually?

wispy spear
#

it do be doing

#

but i like calling destroywindow too

frank sail
#

big chads don't call either

wispy spear
#

true

frank sail
#

let os mommy clean up after you

wispy spear
#

and claim that their shit works whifout calling sdlInit when using sdl

#

what about cleaning up before you?

frank sail
#

wut

wispy spear
#

there was a dude somewhere in #vulkan or idk where, claiming that

wicked notch
#

Hmm I don't really like RML, it forces you to use OpenGL 3.3 if I didn't misunderstand anything?

#

It also comes with its own glad and stuff

#

Actually maybe not

wicked notch
#

Yeah okay, after using RmlUi for a few hours I decided it's not very comfortable to use.. I think it's more intended for Apps and stuff instead of real time 3D

wicked notch
#

dearimgui looks promising, it's even used by Rockstar Games lol

wispy spear
#

ah

#

it sounded like you want an actual UI not just some debug ui dearimgui nonsense

wicked notch
#

I will eventually, but I'll probably use something easier like Electron or something like that

#

I would like to see my render targets right now mostly

wispy spear
#

yeah then dear imgooey

frank sail
#

babagooey

wicked notch
#

dear ImGui is truly one of the pieces of software of all time

#

I'm very satisfied

frank sail
#

Is that the docking branch

wicked notch
#

Yes, I've discovered that after I initially did this all manually without docking of course nervous

frank sail
#

Try drawing ImGui without framebuffer srgb

#

It'll fix some of the colors, but too bad it's wrong either way nervous

wicked notch
#

Hmm need everything as linear sRGB and then perform the conversion manually in the TAA resolve pass?

frank sail
#

You should be drawing ImGui at the very end of everything

#

And without framebuffer sRGB, because in theory it should already be sRGB (thus it doesn't need the linear-srgb conversion)
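
To see why already-encoded colors come out too bright, here is the standard linear-to-sRGB transfer function as a small sketch. With `GL_FRAMEBUFFER_SRGB` enabled, this encoding is applied on write to an sRGB framebuffer, so values that are already sRGB get encoded a second time. The function name is just for illustration.

```cpp
#include <cassert>
#include <cmath>

// Standard sRGB encoding for a single channel, input in [0, 1].
float linear_to_srgb(float linear) {
    assert(linear >= 0.0f && linear <= 1.0f);
    if (linear <= 0.0031308f) {
        return 12.92f * linear;  // linear segment near black
    }
    return 1.055f * std::pow(linear, 1.0f / 2.4f) - 0.055f;
}
```

A mid gray of 0.5 encodes to roughly 0.735, and encoding that result again pushes it higher still, which matches the washed-out UI colors described above.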

wicked notch
#

But doesn't OpenGL convert back to linear space when sampling from textures?

frank sail
#

it does when it's an sRGB texture

#

idk ur pipeline though

wicked notch
#

I suppose ImGui samples from my output texture when I give it the texture id to ImGui::Image, then it will write that into the default framebuffer and since it's not sRGB it will be stored in linear space?

wispy spear
#

can you not draw imgui onto the final image after resolving everything else

#

or even once you blitted everything to fb0, draw imgui then

frank sail
#

I'm talking about imgui itself, not the random image you draw with ImGui::Image

wicked notch
#

Ah by the way, my pipeline is simply: Depth Prepass -> Frustum + Occlusion Culling -> Depth Reduce -> Shadow Setup -> Shadow Render -> Color Pass (SRGB texture) -> TAA Setup -> TAA Resolve -> Draw UI (To SRGB framebuffer)

frank sail
#

and I'm saying to disable framebuffer srgb right before that

#

It should make your imgui darker

wicked notch
#

Ah I see

wispy spear
#

glDisable(GL_FRAMEBUFFER_SRGB)

frank sail
#

I can tell it's on because the colors are too bright

wicked notch
#

Nope it still looks very bad

frank sail
#

But now your scene is super dark lol

wicked notch
#

I don't care too much about color correctness of the UI to be quite honest

#

I'm afraid to step into uncharted territory (#vulkan)

#

I looked again at whatever little Vulkan I wrote yesterday and I just realized I'm passing over 15 parameters to a single function

#

The whole "ApplicationInfo" and stuff..

wicked notch
#

I think that's enough Vulkan for today

#

230 lines and still no pretty colors on the screen 😦

#

Time to learn about atmospheric scattering!

frank sail
#

just a few hundred more lines and you'll have a clear color going

wicked notch
#

Do you recommend anything for atmospheric scattering? Besides scratchapixel's article?

frank sail
#

um there are some papers I guess

#

check the citations too

wicked notch
#

Does Fwog have any samples I can ~~copy~~ reference?

frank sail
#

I only have a local mie scattering fog thingy

#

which isn't suitable for atmospheres, since you need rayleigh scattering for dat

frank sail
#

there are probably also some siggraph pbr course thingies you can look at

wicked notch
#

There's also the thing about stars and clouds

#

And fog

#

And rain

#

Making an atmosphere is hard

frank sail
#

fog is something I can help with

wicked notch
#

Lovely, one piece at a time I'll solve this puzzle hopefully

frank sail
#

I'm not sure about rain rendering techniques

#

you can probably get something decent with a particle system

#

I guess there are quite a few components to a weather system though

#

clouds, precipitation, ground effects (like puddles, snow, 'wettening', etc.)

wicked notch
#

Hmm yes, the most difficult thing would be to dynamically update each object based on how it reacts to rain

#

Why can't the dumb silicate rock figure it out on its own smh

wicked notch
#

The things you wrote

wispy spear
#

with shriwwle

frank sail
#

ah

wicked notch
#

Objects become more shiny or their roughness is accentuated

frank sail
#

I think some tomb raider game had a presentation where they mentioned this stuff

wicked notch
#

I'm getting very ahead of myself though, atmosphere first.

#

It seems like the easiest of them all

frank sail
#

or maybe it was uncharted

#

idk all these exploration games look the same to me

wicked notch
#

Thanks for all the links, big preesh

frank sail
#

they use some wetness mask

#

btw ignore the mip fog in that presentation KEKW

wicked notch
#

Me: Heavily constrained by VRAM Throughput
Also me: What's 6 or 7 more full screen passes

#

How do games even do this

#

Without running into absurd amounts of render targets

frank sail
#

btw the bandwidth isn't that bad when you read/write to each pixel once

#

full screen passes are generally super cheap

wicked notch
#

There's no way we're reading once per pixel from the wetness mask is there KEKW

frank sail
#

one trick is combining a lot of full screen passes into a single one also

wicked notch
#

Hmm, like making a huge shader?

frank sail
#

t h i c c

wicked notch
#

How many buffer bindings was the limit per shader stage? bleakekw

wicked notch
#

Hmm I see

frank sail
#

wetness is an additional material input thingy that they use

#

so it's one extra texture read I guess

wicked notch
#

I am still using a regular forward renderer, should I go deferred?

frank sail
#

but it doesn't affect how chonky the g-buffer is

wicked notch
#

Even more render targets bleakekw

frank sail
#

like ssao, ssr, etc.

wicked notch
#

I'll have to think carefully about what I put in the gbuffer

#

I'm thinking just uvs and depth?

frank sail
#

everythingggggg

frank sail
#

"standard" deferred renderers will put surface material info in the g-buffer (material type, normals, roughness, metallic, ao) and depth

wicked notch
#

Yeah but that's a lot of bandwidth

frank sail
#

boohoo

#

the new hotness is visibility rendering, which is basically depth+triangle&instance ID, then you can fetch material parameters in the shading pass

wicked notch
#

Look, I don't want my occupancy to look like it's starving KEKW

frank sail
#

meh

#

visibility buffer is quite a bit more effort for probably 0 gain in most projects

#

UE still uses a massive g-buffer in its regular rasterized renderer and that performs fine

wicked notch
#

Does depth + uv not work?

frank sail
#

how do you know what texture to sample with just the UV

wicked notch
#

depth + uv + material id

#

RG32UI

frank sail
#

now how do you get mipmap+anisotropic filtering

wicked notch
#

Uh

#

Hm

#

Good question

frank sail
#

just pass 4 more floats for UV gradient on x and y smart

wicked notch
#

depth + uv + material id + ddx/ddy

#

Do I only need derivatives for aniso?

#

If I had meshlets I could easily calculate derivatives

#

Just get 3 vertices with gl_MeshPrimitiveID, compute barycentrics, ezpz

wicked notch
#

How do I get the actual triangle though?

#

I think I need more than the triangle id

frank sail
#

you need instance ID and triangle ID to uniquely identify a triangle

wicked notch
#

Yeah but which vertex buffer is it in?

#

What index buffer does it use

frank sail
#

implying you'll have more than one vertex/index buffer

wicked notch
#

It's easy if you have only one ofc

wicked notch
#

Jokes aside it's actually very easy to solve

frank sail
#

well, if you're going to optimize meaningless stuff, you better go the full way

#

with one mega vertex/index buffer, you can do MDI

wicked notch
#

Every object already stores their index to their VBOs and EBOs

#

So I just need one more id, object id

#

And I can uniquely identify any triangle

frank sail
#

homie is speedrunning graphics

wicked notch
#

I'll just keep my good old forward renderer until I need a reason to change it

frank sail
#

I guess you can write a thin g-buffer if you need that stuff for post-fx

#

like what I think 🇩oom does

wicked notch
#

Atmosphere is cooler than random api plumbing 😎

wicked notch
#

I just preallocate 256MiB and if it spills I make another 256MiB
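
A sketch of that spill scheme, tracking only offsets. `ChunkedArena` and `Allocation` are hypothetical names, and in a real renderer each chunk would be backed by its own GPU buffer of `chunk_size` bytes (256 MiB in the message above).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Allocation {
    std::size_t chunk;   // which chunk the range lives in
    std::size_t offset;  // byte offset inside that chunk
};

class ChunkedArena {
public:
    explicit ChunkedArena(std::size_t chunk_size) : chunk_size_(chunk_size) {
        assert(chunk_size > 0);
        used_.push_back(0);  // preallocate the first chunk
    }

    // Bump-allocates from the newest chunk and opens another when it spills.
    Allocation allocate(std::size_t size) {
        assert(size <= chunk_size_);
        if (used_.back() + size > chunk_size_) {
            used_.push_back(0);
        }
        const Allocation result{used_.size() - 1, used_.back()};
        used_.back() += size;
        return result;
    }

    std::size_t chunk_count() const { return used_.size(); }

private:
    std::size_t chunk_size_;
    std::vector<std::size_t> used_;  // bytes handed out per chunk
};
```

With more than one chunk, every mesh has to remember which chunk it lives in, which is exactly the "which vertex buffer is it in?" question raised earlier.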

frank sail
#

oops misinfo

frank sail
#

or mayhaps make some adjustments to the algorithm

#

I guess you could record the buffer id and do multiple shading passes (or one where you have all the buffers bound)

wicked notch
#

The scheduler in my brain says this process (deferred) is very low priority

#

But I will think about it

frank sail
#

I'm not saying you should do it. I'm saying that it will happen at some point

#

Beware the pipeline!

wicked notch
#

Yeah ofc

#

For now I sleepy

#

My single neuron is complaining

frank sail
#

@wicked notch what do you remember about the reason for physical devices and logical devices being separate things? I just (hopefully correctly) remembered that one logical device can correspond to multiple physical devices, e.g., for SLI

wicked notch
#

I think it's the other way around

#

One physical -> many logical

#

Yeah, a VkPhysicalDevice is the actual hardware while the VkDevice specifies what features and extensions of that specific physical device you want to use

frank sail
#

https://stackoverflow.com/questions/31833776/howd-multi-gpu-programming-work-with-vulkan

The idea with this is that the SLI aggregation is exposed as a single VkDevice, which is created from a number of VkPhysicalDevices. Each internal physical device is a "sub-device". You can query sub-devices and some properties about them. Memory allocations are specific to a particular sub-device. Resource objects (buffers and images) are not specific to a sub-device, but they can be associated with different memory allocations on the different sub-devices.

wicked notch
#

Ah then the tutorial is wrong nervous

frank sail
#

tutorial vs so answer, unstoppable force vs immovable object

wicked notch
#

We have a new contender these times, nogpt

frank sail
#

I think you can ask for certain things before getting the logical device

#

sadly my knowledge of vulkan device creation is quite limited

#

anyways, the tutorial provided reason enough to have such a separation

#

the SLI thingy is just an aside (and mostly irrelevant)

wicked notch
#

If you can query for SLI in OpenGL I'd just do:

```cpp
if (is_sli_enabled) {
    printf("bro wasted his money lmao");
    debug_break();
}
```
wicked notch
#

Rate my (non-antialiased) sun

wicked notch
#

I forgor everything about my physics classes, how do you calculate the sun's trajectory over time again? KEKW

wispy spear
#

sunpos = sunpos + 1

frank sail
#

x = cos(time)
y = sin(time)
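
Taking the circle idea literally, a minimal sketch with the offset filled in. It assumes a 24-hour day with sunrise at 06:00, and `sun_direction` is a made-up helper. It ignores latitude, seasons and everything else a real sun model needs.

```cpp
#include <cassert>
#include <cmath>

struct SunDirection {
    float x;  // horizontal component
    float y;  // height above the horizon, 1 = zenith
};

// Maps a time of day in hours to a point on the unit circle.
SunDirection sun_direction(float hours) {
    assert(hours >= 0.0f && hours <= 24.0f);
    constexpr float kTwoPi = 6.28318530717958647692f;
    // The offset puts angle 0 (sunrise) at 06:00, so noon is the zenith.
    const float angle = kTwoPi * (hours - 6.0f) / 24.0f;
    return {std::cos(angle), std::sin(angle)};
}
```

Feeding it 12.0 gives a direction pointing straight up, and 0.0 gives one pointing straight down.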

wispy spear
#

fisically accurate

wicked notch
#

I remember it more involved but that's accurate enough I guess?

frank sail
#

making it have an offset is up to the reader to figure out

wispy spear
#

good thing void is not here atm :>

frank sail
#

@ void !!! math alert !!!

frank sail
#

Looks like a perfect UK atmosphere already

wicked notch
#

Still no clear color has been achieved

#

At least I have a swapchain now, I have no idea how OpenGL abstracts the swapchain away from me...

#

Anyways it's cool seeing the inner workings of how things work effectively

frank sail
#

the windowing system defines it

#

you can write to it by binding framebuffer 0, but you can't read from it or query anything about its attachments

wispy spear
#

its a little sad

frank sail
#

glfw gives you a minimal amount of control over the swapchain when you create the window

wispy spear
#

one could simulate a swapchain, with an ordinary fbo, and you never care about fb0

#

and your swapchain::present is just a glBlitNamedFramebuffer(..to fb0...)

#

i am toying with that idea for some time now, and prohaps remove BeginRenderToSwapchain and replace it with an acchual schwapschain object

frank sail
#

hmm, how do you render to the actual swapchain then

#

is it just a secret blit somewhere

wispy spear
#

yeah

#

swapchain::present

#

that could/would do blit+swapbuffers

frank sail
#

not a bad idea tbh

#

I could copy that idea to shrimplify fwog (by removing RenderToSwapchain stuff)

wispy spear
#

: )

#

technically

#

you could still draw your UI after it if you wanted with a little gymnastics to put it between blit and swap, or you also just draw into the swapbuffer-fbo

#

pretty much like you would do in dx11

wicked notch
#

Hmm I can't find the necessary motivation to study Atmospheric Scattering

#

Exams are also coming up nervous

wispy spear
#

time to take a brek

#

mu and snu and mie and xi are not running away

frank sail
#

Rayleigh will still be here too

wicked notch
#

I'll definitely be there for the FSR2 inauguration

wicked notch
#

I guess I'll try to continue with the Vulkan triangle instead of randomly reading scratchapixel

#

We're at day 3 and counting by the way KEKW

wicked notch
#

The more I sink into Vulkan the less I understand how OpenGL works

#

How anything "just works" in GL is purely beyond me at this point

wispy spear
#

ObenGL is just Vulkan minus info structs and synchronization thingies (i forgot the word)

wicked notch
#

I wanted to notify I reached the "Synchronization and Command Buffers" phase

#

If you don't hear from me within 30 minutes, that means Khronos has claimed my soul

wicked notch
#

Alright

#

I came back

#

I have this.

#

...But I understood close to nothing about synchronization

wicked notch
#

I understood a bit of synchronization

#

Which I am pretty happy about, I at least have 1 clue of what the thing is doing

#

That being said I will definitely do atmosphere tomorrow, while I digest the casual 1k lines for a single triangle..

#

I also realized earlier I am not even using a vertex buffer, that's probably another thousand lines bleakekw

cunning atlas
#

Don’t worry, there’s a good portion of that 1k you won’t have to mess with all the time

wicked notch
#

I am conflicted

#

I feel like this guy

#

I love OpenGL but damn...

wispy spear
#

czech and german number plates

cedar seal
#

OpenGL: Driver does lots of mostly useful stuff for you, hides some nastyness.
Vulkan: You deal with everything yourself.

finite yacht
#

I have a question. Do you do the ROC with an indirect draw or not

wicked notch
#

Yes, it's indirect

#

Specifically I first do frustum culling and then I do ROC

finite yacht
#

alright thats exactly what I was going to ask

#

so I need to add another indirect buffer 😦

wicked notch
#

Yeah, and it's just a single indirect command too if you use Jaker's trick

#

Which is to do a TRIANGLE_STRIP with the number of objects that were visible as instanceCount

cedar seal
#

Roc?

wicked notch
#

Raster Occlusion Culling

finite yacht
wicked notch
#

Yes

finite yacht
#

i smell another layer of indirection

wicked notch
#

Unfortunately 😦

#

I thought long and hard but this was all I could come up with

wispy spear
#

i was also thinking about that last night

#

when you let your cs cull all sorts of things, then how to transfer the right gl_DrawID, to fetch the instance data for the object you want to render

#

unless you also pass that to the cs, and let it also store the instance data the way it does for indirects

#

but gave up thinking further a minute later 🙂

wicked notch
#

Along with the indirect buffer to write, I also pass another buffer yes

#

Its sole purpose is to store the relationship between the index of the object and the index of the thread in the compute shader that culled the object

#

That's where the additional layer of indirection comes from
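
A CPU stand-in for that extra buffer, as a minimal sketch. The names `CullOutput`, `draw_to_object` and `compact_visible` are made up for illustration, and the sequential loop replaces what would be an atomic counter in the real compute shader.

```cpp
#include <cstdint>
#include <vector>

// Slot i holds the index of the object that draw i refers to.
struct CullOutput {
    std::vector<std::uint32_t> draw_to_object;
};

// CPU stand-in for the culling compute shader: surviving objects are
// compacted into consecutive draw slots, and each slot remembers which
// object it came from.
inline CullOutput compact_visible(const std::vector<bool>& visible) {
    CullOutput out;
    for (std::uint32_t object = 0; object < visible.size(); ++object) {
        if (visible[object]) {
            // On the GPU this slot would come from an atomic counter.
            out.draw_to_object.push_back(object);
        }
    }
    return out;
}
```

The vertex shader would then read something like `instances[draw_to_object[gl_DrawID]]`, which is the extra hop being complained about.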

wispy spear
#

yeah thats what i thought

#

im not that stupid then hehe

wicked notch
#

I never thought that mate :p

wispy spear
#

i do/did 🙂

cedar seal
#

I can see how the compute shader culling works, but how does roc work?

wicked notch
#

You rasterize every object's bounding volume and check it against the depth buffer

#

You also abuse EarlyZ so that you can simply have a fragment shader that does visibility[object] = 1

#

Because all the samples that don't pass will not enter the fragment shader
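
The idea can be mimicked on the CPU, as a loose sketch. Treating each bounding box as a flat screen rectangle at its nearest depth is a simplification (the real pass rasterizes the box faces), and `ScreenBox` and `raster_occlusion_cull` are hypothetical names. A conventional depth buffer (smaller is closer) is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical screen-space footprint of one bounding box: a pixel
// rectangle [x0, x1) x [y0, y1) plus the box's nearest depth.
struct ScreenBox {
    int x0, y0, x1, y1;
    float nearest_depth;
};

// CPU stand-in for the ROC pass. Assumes smaller depth values are closer
// and the depth test is "less or equal".
inline std::vector<std::uint32_t> raster_occlusion_cull(
    const std::vector<float>& depth, int width,
    const std::vector<ScreenBox>& boxes) {
    std::vector<std::uint32_t> visibility(boxes.size(), 0u);
    for (std::size_t object = 0; object < boxes.size(); ++object) {
        const ScreenBox& box = boxes[object];
        for (int y = box.y0; y < box.y1; ++y) {
            for (int x = box.x0; x < box.x1; ++x) {
                const std::size_t pixel =
                    static_cast<std::size_t>(y) * width + x;
                // Only fragments that pass the depth test reach the
                // "fragment shader", whose whole job is this one write.
                if (box.nearest_depth <= depth[pixel]) {
                    visibility[object] = 1u;
                }
            }
        }
    }
    return visibility;
}
```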

cedar seal
#

Where does the depth buffer come from?

wicked notch
#

You use this frame's depth buffer to write visibility for the next frame

cedar seal
#

How does it deal with false negatives?

wicked notch
#

Generally object "pop-in" is not noticeable unless at low framerates but you can fix that by also drawing all the objects that changed visibility from 0 to 1

cedar seal
#

How do you know which objects changed visibility?

#

From other means of culling?

wicked notch
#

You keep two visibility buffers and copy current to last at the end of the frame

#

Then in the culling compute shader, you just check both last and current
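
Roughly, the check on both buffers boils down to the sketch below. It runs on the CPU for clarity, and the function name `newly_visible` is an assumption, not something from the actual renderer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the indices of objects whose visibility flipped from 0 (last
// frame) to non-zero (current frame). Those are the objects that need
// the extra draw to avoid pop-in.
inline std::vector<std::uint32_t> newly_visible(
    const std::vector<std::uint32_t>& last,
    const std::vector<std::uint32_t>& current) {
    std::vector<std::uint32_t> result;
    for (std::size_t i = 0; i < last.size() && i < current.size(); ++i) {
        if (last[i] == 0u && current[i] != 0u) {
            result.push_back(static_cast<std::uint32_t>(i));
        }
    }
    return result;
}
```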

#

So the pipeline looks like this:

```
frustum_cull();
draw_all_visible();
occlusion_cull();
draw_changed_visibility();
```
#

At least I think it was this way?

cedar seal
#

All of these in gpu?

frank sail
#

you don't need the second occlusion cull

#

I think

wicked notch
frank sail
#

Wait no it's the first occlusion cull that you don't need

wicked notch
#

Yeah, we have the visibility from the previous frame

cedar seal
#

I should give it a try

#

Do you use bounding spheres and bounding boxes, any other shapes?

wicked notch
#

I just use regular AABBs, I used bounding spheres for HiZ but I prefer ROC 😛

frank sail
#

republic of china

cedar seal
#

What step is roc? draw*?

frank sail
#

occlusion_cull

#

and draw_changed_visibility would also be something you implement due to roc

cedar seal
#

Okay now it makes a bit more sense

frank sail
#

chicken-and-egg problem hehe

wicked notch
#

I suck at explaining

wispy spear
#

egg and chicken problem

frank sail
#

or at least no worse than myself when explaining this 😄

wispy spear
#

even i understood it somehow

cedar seal
#
  1. frustum cull - clear
  2. draw all visible - this is just normal draw, fills depth buffer
  3. roc - draw bounding volumes, update visibility buffer for next frame
#

Like that?

frank sail
#

ye

wicked notch
#

Pretty much

cedar seal
#

Does roc execute fragment shader for every fragment for the bounding shape?

wicked notch
#

Yesn't

#

It executes it for each sample that passes EarlyZ

frank sail
wicked notch
#

Unless you do glEnable(GL_REPRESENTATIVE_FRAGMENT_TEST_NV); ye

cedar seal
#

That would be one fragment per triangle, I see. Cool stuff.

#

Is this worth on mobile as well?

frank sail
#

roc in general?

#

no clue here, I don't do mobile

#

it makes huge savings on desktop, so I'd imagine that some of it would translate to mobile hw

frank sail
#

@wicked notch try Nsight Graphics

wicked notch
#

wat

#

I am using that

frank sail
#

why it look funny

#

ah

#

you took a gpu trace

#

try the frame profiler

wicked notch
#

It's gone bleakekw

frank sail
wicked notch
#

They removed it in this version

frank sail
#

welp

wicked notch
#

Ah one thing I'm doing is this though

frank sail
wicked notch
#
```
foreach cascade {
    dispatch() // does frustum culling, writes indirect commands
    glMultiDrawIndirectCount(); // draws
}
```
#

Could this cause problems?

frank sail
#

nvidia doesn't like "subchannel switches" (changing between compute and graphics work) because it forces a WFI each time

#

but that will manifest as a tiny gap of no work on the GPU

#

and you have only like 4 cascades, so it's no biggie

wicked notch
#

Yeah, but there's a stupid amount of compute warps active

#

Where do these all come from

frank sail
#

That's a good question

frank sail
#

it seems supported on new hw from both vendors

wicked notch
#

Oi

#

That's the sampler reduction thing

#

Very nice

frank sail
#

it's all yours, my friend

wicked notch
frank sail
#

@wicked notch have you figured out how to view multi-frame gpu traces in nsight

#

for some reason it's only showing me the first frame

#

I presume this means it'll capture 5 consecutive frames anyways

wicked notch
#

Yeah, as far as I know it just calculates the frame deltas

frank sail
#

this grayed out button tells me that something weird is happening

wicked notch
#

Like, you'll see plus or minus some milliseconds

#

Yeah

frank sail
#

I get the + or - thing from single frame captures too

#

hmm

wicked notch
#

Hmm probably an estimate or weird shit happening

#

I also have the thing greyed out for some reason

#

Mayhaps it's due to the debug group markers?

frank sail
#

ah

#

when I change the metric set to throughput metrics, now I can aggregate the frames

#

much better

wicked notch
#

epic

#

I didn't even notice the difference between advanced and throughput

#

Documentation on nvidia's site is outdated KEKW

frank sail
#

yeah

#

you can now find them by just searching "L2"

#

btw you can analyze a range with shift+drag (I accidentally discovered it a few days ago)

wicked notch
#

Nice

wicked notch
frank sail
#

yeah I'm looking at my gl app with it right now

wicked notch
#

I am a bit conchfused by C#'s generics system

#

Given the following:

```
private static void GenericThing<T>(T x)
    where T : IThing
{
    x.DoThing();
}

private static void InterfaceThing(IThing x)
{
    x.DoThing();
}
```
#

What is the difference between the first and the second thingy?

#

If the generic type is constrained, then what's the point? thonk

wispy spear
#

one is compiletime the other is runtime

wicked notch
#

Hmm

#

Looking at the IL code, both seem to do a callvirt. Does that mean I'm doing something wrong?

wispy spear
#

no

wicked notch
#

What do you mean by compile time?

wispy spear
#

i think constructed questions like that dont lead to anywhere

#

you cant do typeof(x) in interfaceThing, but you can do typeof(T)

wicked notch
#

Hmm I see

wispy spear
#

and use it to run code

#

if typeof(x) == typeof(whatever) for instance

#

but you could in the generic thing

wicked notch
#

What I don't understand is why the GenericThing is doing a virtual call 🤔

wispy spear
#

because of the interface

#

DoThing is a virtual method

twin musk
#

callvirt is used for more or less everything, typically the jit will devirtualize it if it can

wicked notch
#

Ooh

#

So let's say I have this:

```
Thing thing = new Thing();
IThing thing2 = new Thing();
GenericThing(thing);
GenericThing(thing2);
```
#

I assume the first will be devirtualized while the second will not?

wispy spear
#

never had to dig down that deep

wicked notch
#

Hehe, I enjoy getting to know the language

wispy spear
#

id rely on the compiler to lower properly

wicked notch
#

I want to explore C#'s guts as much as possible

wispy spear
#

good exercise right there 🙂

#

you can also emit IL with c#, but its been years since i did that last time 😄 it was in the summer of 2005

wicked notch
#

I have this very nice thing on the right

wispy spear
#

yeah

#

rider has that View IL Code thing

twin musk
wicked notch
#

Hmm I see

#

So if the compiler can prove that T is going to be Thing then it will devirtualize the call

#

Epic

twin musk
#

i'm not 100% certain on that though, i could be completely wrong there!

wicked notch
#

That's fine, I'll be reading stuff on C#'s JIT compiler anyways

#

Guesswork and theorycrafting are appreciated as well

wispy spear
#

its interesting to see how people approach new languages : )

wicked notch
#

Explains a lot about generics, devirtualization and stuff

#

I've seen from the various threads that the rule of thumb to avoid hindering the JIT compiler's ability to devirtualize is to use:

  1. Sealed Classes
  2. Don't assign T to an interface, just use var
  3. Generics and typeof are your best friend
wispy spear
#

i am not sure i understand 2)

wicked notch
#

With 2 I mean this:

```
IThing thing = new Thing();
```
is potentially dangerous
#

While

```
Thing thing = new Thing();
```
is safe
wispy spear
#

ah, but it shouldnt

wicked notch
#

Yeah of course, this is a shrimple example

#

The compiler is big brained enough I'm sure

wispy spear
#

its a good thing that all the dotnet/and c#isms are also worked upon in public

wispy spear
#

weird how this article is inbetween all this other crap

wicked notch
#

C# has been pretty fun so far, however Avalonia's AXAML is a bit of a pain to use

#

Why couldn't it just be HTML smh

wispy spear
#

i agree avalonias xaml is scuffed

#

WPF's xaml is >> all

#

HTML is not an option 🙂 its not xhtml

cedar seal
#

😉 ImGui

frank sail
#

dear imgui is the end-all be-all of gui solutions

wicked notch
#

Alright

#

Trying to render the City Sample (even the small one) caused my GPU's driver to reset infinite times, I had to hard-reset my PC KEKW

#

Turns out that if two or more meshes are equal, different vertex buffers with the same contents are still created for them, which is really suboptimal

#

So I was basically loading the full 116 million triangles for no reason at all, spilling it into system RAM, and even into the page file

wispy spear
#

are you going to break the mesh into chunks?

#

physical ones

#

and stream/load on demand?

#

is it even "large" (dimension wise) enough? 🙂

wicked notch
#

I think for now I'll shrimply NOT allocate vertex buffers with the same contents lol
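
One way to skip the duplicate allocations, as a minimal sketch. `VertexBufferCache` is a hypothetical class; it keys on the full byte contents for simplicity, whereas a real loader would more likely key on a content hash and hand back actual GPU buffer handles.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical cache: hands out one buffer id per distinct byte content,
// so identical meshes end up sharing a single vertex buffer.
class VertexBufferCache {
public:
    // Returns the id for these bytes, allocating a new id only when this
    // exact content has not been seen before.
    std::uint32_t get_or_create(const std::vector<std::uint8_t>& bytes) {
        auto result = ids_.emplace(bytes, next_id_);
        if (result.second) {
            ++next_id_; // a real engine would create the GPU buffer here
        }
        return result.first->second;
    }

    std::size_t unique_buffer_count() const { return ids_.size(); }

private:
    std::map<std::vector<std::uint8_t>, std::uint32_t> ids_;
    std::uint32_t next_id_ = 0;
};
```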

#

Then I might do some software meshlet shenanigans perhaps

#

Streaming is still a dream I don't think I can achieve right now, but soon™️

wispy spear
#

: ) oki

frank sail
#

the stream dream meme

wicked notch
#

Also, I'll upload the city sample if any of you want to break your engine try it out yourself

wispy spear
#

be careful with your git repo

frank sail
#

isn't it like 300 mb

#

ye just make us convert it ourselves

wicked notch
#

License issues?

wispy spear
#

size and bandwiff

wicked notch
#

I dunno anything about licensing tbh

#

Ah, yeah 300MiB is not much but still quite heavy

wispy spear
#

there is a stupid limit for unpaid git plans

#

perhaps you can upload it to discord

#

and link from zer

#

not sure what the non nitro upload limit is, with nitro its 500mibs iirc

frank sail
#

only 100mb for plebs like us

wicked notch
#

I have a hetzner storage box but I have no idea how to setup permissions KEKW

#

For a quicc upload 🙂

#

Try it out, let me know how many levels of backrooms you'll fly through

wispy spear
#

😄

#

(dont run that)

wicked notch
#

Hehe I've used linux for a long time

wispy spear
#

890mb 😄

wicked notch
#

ngl I've actually fallen for that joke way back

wispy spear
#

hehe

wicked notch
#

On a VPS though so nothing catastrophic

wispy spear
#

i got rid of all my servers over the years

#
SharpGLTF.Validation.SchemaException: Accessor[208] _count: 0 must be greater or equal to 1.Model generated by <Unreal Engine 5.2.0> seems to be malformed; Please, check the file at https://github.khronos.org/glTF-Validator/
#

lets see if vscode can load it

wicked notch
#

Hmm, blender could load it so I assumed it had no errors

wispy spear
#

gltf seems indeed kaputt here and there

#

according to the red indicators of the linter/validator thing

wicked notch
#

I blame Ü

#

If you disable error checking does it work?

wispy spear
#

you need to rewrite UE into the german umlaut u 😄 and yell it, puts more emphasis on the silliness

frank sail
#

oo ee

wispy spear
#

"i blame Ü"

frank sail
#

why ü smiling

wispy spear
#

201 validation messages for /home/deccer/Private/Code/Projects/lessGravity/OpenSpace/src/OpenSpace.Assets/Data/Props/Small_City_LVL/Small_City_LVL.gltf

wispy spear
wicked notch
#

cgltf doesn't seem to complain though I did pass it through gltfpack

#

So perhaps it did some weird shenanigans

wispy spear
#

hmm its also just warnings but ye

wicked notch
#

Just do the big ignore™️

wispy spear
#
{
    "bufferView": 208,
    "count": 0,
    "type": "VEC3",
    "componentType": 5126
},
{
    "bufferView": 209,
    "count": 0,
    "type": "VEC3",
    "componentType": 5126
},
{
    "bufferView": 210,
    "count": 0,
    "type": "VEC4",
    "componentType": 5126
},
#

for those kind of things

wicked notch
#

I mean, yeah a 0 sized buffer is nonsensical KEKW

wispy spear
#

or is it?

wicked notch
wispy spear
#

heh

#

hmm i probably have to open another iShoe with that lib

#

since validation is pretty much off already

wicked notch
#

Indeed, one must push through any errors and warnings

#

Jaker where's your render

frank sail
#

I'm not touching my home pc until I'm off work

wispy spear
#

i was about to say that an excuse is about to come .. something about work

#

you live in ze wong timezone

frank sail
#

no u

wispy spear
#

nuh uh

frank sail
wispy spear
wicked notch
#

So, it's exam season but I have not been idling, I have done quite a bit of research on a lot of stuff: How Nanite works, Micropolygon Software Rasterizers, Visbuffers and Vulkan.

  • Regarding "software" meshlets (i.e.: without mesh shaders) it turns out you lose quite a bit on performance because you lose the vertex cache. I'm not sure if this'll be a problem in the future with hugely detailed meshes, we'll see.
  • Micropolygon Software Rasterizers are incredibly cool and they somehow beat a hardware rasterizer. It turns out they also work really well with a 64-bit Visibility Buffer.
  • Visibility buffers are another thing that's incredibly cool: if you store the cluster index and the triangle ID within that cluster, along with the depth, you have all the information you need to render a triangle.

I, however, am getting VERY ahead of myself, given that I don't even render a single triangle with my current Vulkan abstraction KEKW

#

I'm also quite tired of using raw OpenGL and don't have the life force to abstract it on my own, so I guess Fwog will have another user!

#

Unfortunately I don't have time to properly sit down and program as I used to, once exam season ends I should start again

wispy spear
#

no rush my spaghetti frog

#

gp discord wont be running away 🤌

wicked notch
#

Alright, I have one hour before my brain batteries run out, it's time to clone Fwog

#

Jaker I expect techsupport within 2 nanoseconds of my requests

wispy spear
#

😄

#

i will also try to help if the walls are too thicc for jaker to come out

wicked notch
#

Thanks 🎩

wispy spear
#

🇳🇵

wicked notch
#

Much appreciated, alright it's time

#

T+00:05:00.000: Hello World achieved

wicked notch
#

This tells me you always use a -1 to 1 depth range

#

Unforgivable

frank sail
#

oops I need to fix that naming

#

resharper didn't change it when I was refactoring

frank sail
#

unrelated, but I also need to fix a bunch of missing enumerators because they didn't appear in the refpages when I was copying them bleakekw

wispy spear
#

before or after working on them articles?

frank sail
#

babogey

wicked notch
frank sail
#

I forgot what the missing enumerators were though so I'll forget about it altogether

wicked notch
#

Regardless of Jaker's questionable time management skills

#

We have Mr. Triangle for the 28595th time

#

Time to do hybrid software-hardware rasterization with LOD'ing and basically reimplement nanite

#

Hold on

#

OpenGL does support 64 bit atomics right?

frank sail
#

uh

#

GL_NV_shader_atomic_int64 is implemented on both nv and amd gpus

wicked notch
#

I mean 64 bit image atomics, sorry

wispy spear
#

is this srgbified?

#

it looks quite dark

frank sail
wispy spear
#

ill revoke the first tringle being rendered properly

wicked notch
#

shader_image_64bit

#

or something

wispy spear
#

GL_EXT_extension

wicked notch
#

GL_EXT_shader_image_int64

wispy spear
#

close

frank sail
#

I don't see it on gpuinfo

#

maybe it's one of those glsl extensions that "just works" when you add it on supporting drivers

wispy spear
#

BBK might know something about it

#

he can sniff out undocumented exts πŸ™‚

wicked notch
wispy spear
#

no

frank sail
#

perfectly normal bleakekw

#

you might be able to trick the driver by compiling to spir-v first

wicked notch
#

How do I feed SPV to your thing

frank sail
#

that's the neat part

#

uh

#

would you like me to add a way to feed spirv to fwog shaders

wicked notch
#

Nah, don't bother, it's too painful

#

Alright change of plans

#

I will not do the 64 bit visbuffer in OpenGL

#

I'll shrimply do software meshlets

frank sail
#

you can still do 64 bit atomics without explicit support

#

if you're brave enough

wicked notch
#

CAS loop?

frank sail
#

yeah

#

or you use that other 64-bit atomic extension
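
For reference, the general shape of a CAS-loop atomic max, sketched on the CPU with `std::atomic`. This leans on a native 64-bit compare-exchange, which is exactly what a GPU without 64-bit atomics would not have, so it only illustrates the loop and not the "brave" 32-bit emulation. The function name `atomic_max_u64` is made up.

```cpp
#include <atomic>
#include <cstdint>

// Stores `value` into `target` if it is larger than what is already
// there. Returns the value that was observed before any update.
inline std::uint64_t atomic_max_u64(std::atomic<std::uint64_t>& target,
                                    std::uint64_t value) {
    std::uint64_t observed = target.load(std::memory_order_relaxed);
    while (observed < value &&
           !target.compare_exchange_weak(observed, value,
                                         std::memory_order_relaxed)) {
        // A failed exchange refreshed `observed` with the current value;
        // the loop condition then decides whether to retry or give up.
    }
    return observed;
}
```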

wicked notch
#

There's an underlying problem though

#

There's no GL_R64UI format for images in GL

frank sail
#

I mean for buffers

wicked notch
#

Hm

frank sail
#

maybe your epic renderer could be enough pressure to get vendors to add 64-bit image atomics

wicked notch
#

Yeah, let's not do 64 bit visbuffer in GL bleakekw

wispy spear
#

its been 1.5years or so since devsh showed off his brainworm of a visbuffer

frank sail
#

and if you look at #vulkan right now you can tell it took a toll

wispy spear
#

jebus they talk faster than i can read

finite yacht
#

but the new amd drivers are weird. Cant say if it works when you have backing storage and actually read/write it

wicked notch
#

Epic

#

Btw, how do mesh shaders handle vertex cache? We do basically both vertex and index pulling here, does it not matter because clusters are small enough to fit in L1 anyways?

frank sail
#

my educated guess is that they don't, at least on AMD hardware

#

since mesh shaders bypass the part of the hardware (the geometry engine) that implements vertex reuse

wicked notch
#

Interesting

#

It do make sense, I guess I shouldn't have to worry then

frank sail
#

lemme know if there is anything in fwog that sucks

wicked notch
#

Ayeaye chief

wicked notch
#

vertices[v_offset + vertex_indices[i_offset + primitive_indices[p_offset + thread_index]]]

#

Lovely three way indirection KEKW
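
Unrolled on the CPU, that three-way lookup looks roughly like this. `MeshletOffsets`, `Position` and `fetch_position` are illustrative names, and storing the per-triangle indices as 8-bit values is an assumption about the buffer layout.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Offsets of one meshlet into the three shared buffers.
struct MeshletOffsets {
    std::uint32_t vertex_offset;
    std::uint32_t index_offset;
    std::uint32_t triangle_offset;
};

using Position = std::array<float, 3>;

// Same chain as the one-liner above, split into its three dependent loads.
inline Position fetch_position(
    const MeshletOffsets& meshlet,
    const std::vector<Position>& vertices,
    const std::vector<std::uint32_t>& vertex_indices,
    const std::vector<std::uint8_t>& primitive_indices,
    std::uint32_t thread_index) {
    // 1) triangle corner -> meshlet-local index
    const std::uint32_t local =
        primitive_indices[meshlet.triangle_offset + thread_index];
    // 2) meshlet-local index -> vertex index
    const std::uint32_t vertex =
        vertex_indices[meshlet.index_offset + local];
    // 3) vertex index -> vertex data
    return vertices[meshlet.vertex_offset + vertex];
}
```

Each load depends on the previous one, which is part of why cache hit rates matter so much for this path.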

wicked notch
#

How in god's holy name is this struct not trivially copyable

```
struct raw_meshlet_t {
    uint32 vertex_offset = 0;
    uint32 index_offset = 0;
    uint32 index_count = 0;
    uint32 triangle_offset = 0;
    uint32 triangle_count = 0;
    // custom data
    uint32 group_id;
};
```
#

Is C++ shitting me?

#
```
TriviallyCopyableByteSpan(std::span<T> t) : std::span<const std::byte>(std::as_bytes(t))
```
??????
#

Ah TriviallyCopyableSpan accepts a single T too for some reason smh
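
For what it's worth, the struct itself should pass the trait check. Default member initializers only make the default constructor non-trivial; copying stays trivial. The sketch below uses `std::uint32_t` in place of the `uint32` alias (assumed to be a plain 32-bit unsigned typedef), which points to the error coming from overload selection rather than from the struct.

```cpp
#include <cstdint>
#include <type_traits>

struct raw_meshlet_t {
    std::uint32_t vertex_offset = 0;
    std::uint32_t index_offset = 0;
    std::uint32_t index_count = 0;
    std::uint32_t triangle_offset = 0;
    std::uint32_t triangle_count = 0;
    // custom data
    std::uint32_t group_id;
};

// Copying is still a plain memcpy...
static_assert(std::is_trivially_copyable<raw_meshlet_t>::value,
              "default member initializers do not affect copyability");
// ...but default construction is no longer trivial, because the
// initializers have to run.
static_assert(!std::is_trivially_default_constructible<raw_meshlet_t>::value,
              "initializers make the default constructor non-trivial");
```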

wicked notch
#

Alright, we got software mesh shaders in Fwog

finite yacht
#

very cool

#

so how is a triangle rendered

#

what does software mesh shader mean

wicked notch
#

It's what I call "rendering meshlets without mesh shaders"™️ lol

#

The pipeline is actually very simple, from the meshlet buffer I build indirect commands that have a vertexCount of primitive_count * 3 (three vertices per triangle) and one instance
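
A sketch of that command-building step on the CPU. `MeshletInfo` and `build_meshlet_commands` are made-up names; the command struct uses the standard four-field non-indexed indirect layout, and the real version presumably runs in a compute shader.

```cpp
#include <cstdint>
#include <vector>

// Only the field this step needs; the real meshlet has more.
struct MeshletInfo {
    std::uint32_t triangle_count;
};

// Layout read by non-indexed indirect draws.
struct DrawArraysIndirectCommand {
    std::uint32_t count;
    std::uint32_t instanceCount;
    std::uint32_t first;
    std::uint32_t baseInstance;
};

// One command per meshlet: three vertices per triangle, one instance.
// With a multi-draw call, gl_DrawID then identifies the meshlet.
inline std::vector<DrawArraysIndirectCommand> build_meshlet_commands(
    const std::vector<MeshletInfo>& meshlets) {
    std::vector<DrawArraysIndirectCommand> commands;
    commands.reserve(meshlets.size());
    for (const MeshletInfo& meshlet : meshlets) {
        commands.push_back({meshlet.triangle_count * 3u, 1u, 0u, 0u});
    }
    return commands;
}
```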

#

Then the vertex shader just fetches primitives, indices and vertices from gl_DrawID

#
```
void main() {
    const meshlet_t meshlet = meshlets[gl_DrawID];
    const mat4 transform = transforms[meshlet.group_id];
    o_meshlet_id = gl_DrawID;

    // Three dependent fetches: primitive -> meshlet-local index -> vertex.
    const uint primitive_index = primitives[meshlet.triangle_offset + gl_VertexID];
    const uint vertex_index = indices[meshlet.index_offset + primitive_index];
    const vertex_format_t vertex = vertices[meshlet.vertex_offset + vertex_index];
    gl_Position = camera.pv * transform * vec4(vertex.position.xyz, 1.0);
}
```
#

Only problem is uh, L1 hit rates being 20% KEKW

finite yacht
#

ok so this could easily be done in actual mesh shaders when available on the hardware

#

the fact that the vertex shader manually fetches the index makes it sounds slow

wicked notch
#

Yeah, I didn't change anything on building meshlets or the buffers from my previous mesh shaders attempt

wicked notch
#

This is one sad frame capture lol

finite yacht
#

where should I be looking at to see its bad? (i dont use nsight and stuff)

wicked notch
#

"Unit Throughput" being very smol

#

Also "SM Occupancy" is all grey

#

The blue thingy in SM Occupancy is how many vertex warps were in flight

#

Less vertex warps = less throughput = less ms

#

Actually more ms but you get the point KEKW

finite yacht
#

the meshlets are generated by some library?

wicked notch
#

Yes, I use meshoptimizer

finite yacht
#

ok

#

cant find an up to date .net port : (

wicked notch
#

Hmm that's a pain indeed, but you only need these two functions to build meshlets: meshopt_buildMeshletsBound and meshopt_buildMeshlets

#

So perhaps you could just DllImport your way through these two

finite yacht
#

yeah If I really wanted, I would do that. Like I am trying to do right now with fsr2 and failing horribly

#

have you looked into that?

wicked notch
#

Holy pog that looks great

#

I haven't, I'll try it right now

finite yacht
#

great

wicked notch
#

Much better

#

But L1 hit rates are still pure garbage

#

This is shrimply MDI + Index Buffer + Post-T&L cache

#

Compute will come after I study nervous

frank sail
#

fun learning time has ended, now it is boring learning time

minor root
#

i have cooked up a new horrible way of studying

#

factorio on one monitor, watching lectures i missed on the other

#

very effective

frank sail
#

I can't focus on more than one thing at a time

#

if I'm programming with a video playing on the other monitor, I don't retain anything from the video

#

and vice versa

minor root
#

it depends for me

#

the lecture is mostly audio, and the material is not extremely complicated

frank sail
#

at best I'll tab out of a game while I'm waiting to respawn so I can read a sentence or two of a blog post

wispy spear
#

i was able to do that when i was younger

#

its impossible now

twin bough
wicked notch
#

still studying

#

💀

finite yacht
#

i wonder if others can reproduce it. @frank sail next time you are on on your amd card and you have time can you try compiling and linking this please (as fragment shader):

#extension GL_EXT_shader_image_int64 : require
#extension GL_ARB_gpu_shader_int64 : require
layout(binding = 0) restrict writeonly uniform u64image3D ImgResult;
void main() {
    imageStore(ImgResult, ivec3(0), uvec4(0));
}
wispy spear
#

you might need the format in the binding thing too

#

layout(binding = 0, xxx) blabla blabla; where the xxx is

finite yacht
#

its writeonly so the format is not needed

wispy spear
#

ah

finite yacht
#

(but i also tried with)

frank sail
#

there's an extension that makes the format unneeded even when it's not writeonly

wispy spear
#

heh

#

no surprise, almost

frank sail
#

GL_EXT_shader_image_load_formatted

wispy spear
#

one really has to read the little txt files of the extensions first to know when to use which πŸ™‚ neh?

frank sail
wicked notch
#

I dunno what's the deal either but I couldn't compile KTX because it uses -Werror and it has warnings so...

#

I just yoinked warnings out of the equation KEKW

frank sail
#

that's wack

#

(on libktx's part)

wicked notch
#

Probably just an issue with my outdated MSVC tools

frank sail
#

I'm about to find out if that's the case

frank sail
#

yeah there were a bunch of warnings lol

frank sail
#

I don't see a call to texture_t::create in that file, so I guess so

#

just learned how basis universal works too and it seems quite nice

#

imagine being the poor schmuck who has to write the BU->ASTC transcoder bleakekw

wicked notch
#

soon™️

frank sail
#

btw

#

how do you obtain the ktx images

#

do you just use the toktx tool in ktx-software

wicked notch
#

gltfpack does it all

#

There's a flag that compresses everything in BasisU

frank sail
#

ah

#

sweet

#

doesn't look like blender can output basisu textures or draco meshes froge_sad

#

anyways, I should™️ have compressed texture support pushed to fwog tomorrow. It's already in, but I have to test before committing

wicked notch
#

Lovely, I'll be the first in the world to use Fwog's compressed textures™️

wicked notch
#

Hmm I suppose they just call it with the max number of indices

#

I do not understand this quite well...

wispy spear
#

same way you can use glDrawArrays(... big number);

#

and get the triangle ID with gl_VertexID / 3 (gl_VertexID % 3 is the corner within that triangle) to index into some other buffer to draw many triangles/sprites in 2d games
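
Spelled out, the arithmetic for splitting a running vertex ID in a non-indexed triangle list is just a divide and a modulo. `TriangleCorner` and `split_vertex_id` are hypothetical names for this sketch.

```cpp
#include <cstdint>

struct TriangleCorner {
    std::uint32_t triangle; // which triangle the vertex belongs to
    std::uint32_t corner;   // 0, 1 or 2 within that triangle
};

// For a non-indexed triangle list, consecutive vertex IDs come in
// groups of three, one group per triangle.
constexpr TriangleCorner split_vertex_id(std::uint32_t vertex_id) {
    return TriangleCorner{vertex_id / 3u, vertex_id % 3u};
}
```

So vertex 7 is corner 1 of triangle 2, and the shader can use the triangle number to look up per-triangle or per-sprite data.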

wicked notch
#

Does gl_VertexID reset when gl_InstanceID is incremented?

wicked notch
#

I crashed the driver so hard I got a BSOD

#

Is this even possible?

#

Ah I see

#

mfw windows shits itself when I add 3059167329581762395871623049847601294875601948657 to my vertex offset

minor root
#

lvstri trying to read vertices from another timeline

wicked notch
#

I have cooked up a solution

#

The only problem is that it's a garbage solution

#

Basically for each vertex ID, store the meshlet it belongs to

wicked notch
#

And is gl_VertexID just the content of the index buffer?

wispy spear
#

i dont think so

#

its the iterator in 0..vertexCount

wicked notch
#
```
the index of the vertex currently being processed. When using non-indexed rendering, it is the effective index of the current vertex (the number of vertices processed + the first value). For indexed rendering, it is the index used to fetch this vertex from the buffer.
```
What does "index used to fetch this vertex from the buffer" mean? frog_thinkk
#

vertices[base_vertex + gl_VertexID]?

finite yacht
#

yes but base_vertex is already taken into account

wicked notch
#

Wait, is it the index in the index buffer or the index in the vertex buffer?

#

Say my index buffer is [2, 0, 1], I call glDrawElements(3, 1, ...)

#

gl_VertexID represents the index of the current index being processed?

#

So like vertices[indices[gl_VertexID]]

finite yacht
#

for indexed drawing its the value of the index not the index of the index in the element buffer.
So if you do indexed drawing and want to manually fetch the vertex you'd do vertices[gl_VertexID]
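
A tiny CPU simulation of that rule, assuming OpenGL semantics where the base vertex is folded into the ID. The function name `vertex_ids_for_indexed_draw` is invented for this sketch.

```cpp
#include <cstdint>
#include <vector>

// Returns the gl_VertexID value each vertex shader invocation would see
// for an indexed draw over `index_buffer` with the given base vertex.
inline std::vector<std::int32_t> vertex_ids_for_indexed_draw(
    const std::vector<std::uint32_t>& index_buffer,
    std::int32_t base_vertex) {
    std::vector<std::int32_t> ids;
    ids.reserve(index_buffer.size());
    for (std::uint32_t index : index_buffer) {
        // The ID is the value stored in the index buffer (plus base
        // vertex), not the position of that value inside the buffer.
        ids.push_back(static_cast<std::int32_t>(index) + base_vertex);
    }
    return ids;
}
```

For the `[2, 0, 1]` index buffer from the question, the three invocations see 2, 0 and 1, so a manual fetch is just `vertices[gl_VertexID]`.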

wicked notch
#

Ah I see, there's a big fat dumb monkey in my brain

#

Thanks for engaging in my meaningless ramble, I understand now

wispy spear
#

are you on the path of VisBuffer again?

finite yacht
#

its not an uncommon question

wispy spear
wicked notch
#

Or rather, I can, but I'm not brave enough to hack spirv into OpenGL and pray for the best (and that the driver doesn't notice)

wispy spear
#

devsh did it

wicked notch
#

How did she/he/they manage?

wispy spear
#

its a brainworm

wicked notch
#

Yeah well, I figured KEKW

wispy spear
#

ask him, he has a dedicated channel on his discord explaining it in 6, 7 vidjeos

wicked notch
#

I'll see if I catch him in #vulkan sometime, I see they're quite active in there

wispy spear
#

yeah

#

or not do it in opengl, thats why you play with vulkan i suppose : >

wicked notch
#

Yeah but Vulkan is a beast

#

I can do meshlet rendering in less than 200 lines in OpenGL bleakekw

finite yacht
wicked notch
#

Also there's no GL_R64UI image format so... there's that bleakekw

wispy spear
#

why does it have to be 64bit?

wicked notch
#

Because I can put depth in it as well

wispy spear
#

ah packing

wicked notch
#

And if I can put depth in it I can do imageAtomicMax and do hybrid software rasterization in compute
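
A hedged sketch of why the packing makes a single atomic max enough. It assumes reverse-Z (larger depth means closer), non-negative depth values, depth in the high 32 bits, and a 25/7-bit cluster/triangle split in the low 32 bits; none of these choices are confirmed by the chat, and the function names are made up.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: 7 bits of triangle ID (up to 128 triangles per cluster),
// the remaining 25 low bits for the cluster index.
constexpr std::uint32_t kTriangleBits = 7;
constexpr std::uint32_t kTriangleMask = (1u << kTriangleBits) - 1u;

// Depth goes in the high half so that comparing two packed texels as
// integers compares depth first. For non-negative floats the raw bit
// pattern sorts in the same order as the value.
inline std::uint64_t pack_visbuffer_texel(float depth,
                                          std::uint32_t cluster,
                                          std::uint32_t triangle) {
    std::uint32_t depth_bits = 0;
    std::memcpy(&depth_bits, &depth, sizeof(depth_bits));
    const std::uint32_t payload =
        (cluster << kTriangleBits) | (triangle & kTriangleMask);
    return (static_cast<std::uint64_t>(depth_bits) << 32) | payload;
}

inline std::uint32_t unpack_cluster(std::uint64_t texel) {
    return static_cast<std::uint32_t>(texel & 0xFFFFFFFFu) >> kTriangleBits;
}

inline std::uint32_t unpack_triangle(std::uint64_t texel) {
    return static_cast<std::uint32_t>(texel & 0xFFFFFFFFu) & kTriangleMask;
}
```

With this layout, the texel that survives a max over all candidates is the closest fragment, and its low bits already say which triangle to shade.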

#

I am not aware if I could do the same without imageAtomicMax or with a GL_RG32UI target

wispy spear
#

you could add it into llvmpipe perhaps 😄

wicked notch
#

Drivers are magic as far as I know, I don't think I should be allowed anywhere near them bleakekw

wispy spear
#

im sure you could figure something out heh

wicked notch
#

I'd guess Vulkan is far less effort than hacking the driver

wispy spear
#

absolutely

#

does vk have that 64ism?