#GrubStomper29's OpenGL Sandbox
thats fine
there are extensions for nvidia from 15 years ago, which allow bindless buffers
and you can also track the bind state on your side too, and only bind when it needs binding
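for the "track the bind state on your side" part, a minimal sketch of what that could look like. `BindCache` is a made-up name, and the callback stands in for whatever real call you would make (e.g. a lambda wrapping `glBindTexture`), so this is an assumption about structure, not anyone's actual engine code:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper: remembers the last bound object id and skips redundant binds.
class BindCache {
public:
    explicit BindCache(std::function<void(std::uint32_t)> bindFn)
        : bindFn_(std::move(bindFn)) {
        assert(static_cast<bool>(bindFn_)); // a bind callback is required
    }

    // Returns true when a real bind was issued, false when it was skipped.
    bool bind(std::uint32_t id) {
        if (hasBound_ && current_ == id) return false;
        bindFn_(id);
        current_ = id;
        hasBound_ = true;
        return true;
    }

private:
    std::function<void(std::uint32_t)> bindFn_;
    std::uint32_t current_ = 0;
    bool hasBound_ = false;
};
```

one cache per bind point (texture unit, buffer target, program) keeps it simple.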
right
yeah
and then the next step would be nanite, where you split your shit into meshlets and do the same frustum/hiz culling
i went bindless now because id rather build off of it than have to change it later
in my older engines i had a fallback to texture2darrays, so that i can debug in renderdoc, but might not worth the effort, but simply because i dont like nsight much
i dont mean for textures, im talking about geometry
what do you mean
bindless buffers is pretty much out of question for opengl : >
unless you want to support nvidia hardware only
i dont want to write a meshlet generator then immediately refactor it to use megabuffers yeah
at runtime?
alright
or as a preprocessing step, or even at build time if you want
maybe later as an exercise to figure out that meshlettification stuff
it won't be as elegant
i also believe this is a rabbithole and will grant you a million dollhairs if you actually solve it for every mesh configuration there is
lol
it always sounded like thats an actual maffematical problem or topological or whatever term, when lustri was talking about METIS and whatnot heh
my original plan was to tipsify sort the index buffer, then put sequential tris in a meshlet
the results looked good enough for high poly yet simple models like that dragon thing
but im sure it would fall apart with complex or low poly geometry
i have not put any thought into meshlets myself yet
Bindless textures don't improve performance on their own afaik
the point is to do larger batched draws
yep theoretically they do improve performance, because you dont bind any texture anymore and have to group your meshes by material and whatnot
oh yeah right
I forgot that texture switching itself can be a significant cost
for some reason I've never had that issue
on the account of not having any textures going in your game ahem
I do though lol
I make textures that are just solid colors as a pessimization so that later when I have real textures it doesn't cause any perf or memory issues
I have like 1024x1024 textures that are 3 blocks of solid color lol
But yeah probably because I don't have a large enough number of different models yet
actually though I don't even batch by texture binding
so just not enough objects
you just render what needs rendering neh?
in whatever order it comes in
as in, no batching
I bin by pipeline
ah
and then just draw the objects naively
batching is implemented bespoke for systems that have a lot of objects
e.g. foliage
yeah
rendering the 2 quads per grass tile would be insane as individual drawcalls heh
if you think about it, environment stuff is almost always going to outweigh anything the player creates
yeah
asteroid belts, whole planets, cities full of housing and whatnot, or otherwise terrain rocks and whatnotisms
Yeah
plus all the fluff which goes with it, decals and foliage
this madrigal guy keeps teasing me
with his little terrain and car fisiks arrangement
(i believe he works in the studio which made alan wake)
explains a lot of stuff
An average scene in the game is like,
trees/rocks/misc ground clutter: 100,000,000
items/gear/vehicles/etc: 100
buildings: 50
characters: 50
oh yeah that definitely explains it
: )
i played a bit of dayz earlier again and kept gazing at the trees and grass and sky and the wind playing with all that shit
and pictured some vehicle driving around with wobbly suspension and antennae and whatnot
haha
I'm taking over some structural dynamics sims at work rn so I am learning stuff that will potentially result in lots of wobbly suspension and antennae if I ever make it to SRS DLC #137
in this case vibration analysis but yeah
not yet beyond some debug 3D plotting I've done as a basic vis
but maybe I will later if I can justify spending time on it
otherwise the output is just plotting transfer functions and stuff
power spectral density
your colleagues should hold on to their knickers then
when you drop the srs plugin for that thing : >
where you can walk around in that simulation and inspect values from a virtual camera, or even V R
haha
like they did when they build CERN
E.g. when I was working on geodesy stuff I transcribed some of the math code from SRS to do my work calculations
same for some aircraft attitude problems
thats cool when you can apply stuff from your hobby to your work, or other way around
then you did something right
It's surprising how clunky some of the APIs are when it comes to spatial math stuff, like in matlab or python (even with numpy) I find myself whipping up some glm-like functions that let me do math the way I'm used to doing in GP
weird, you would think those maffs libs come with those things, neh
since they are so widely used
They kinda do but they're just clunky
like in numpy if you want to take a bunch of vectors and turn them into homogeneous coordinates to multiply them all by a matrix, you have to do a few steps to append the column of 1s and multiply them by the matrix and stuff, so I just made a function that does it in one step
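the same one-step helper, sketched glm-style in C++ to match the rest of the thread. `transformPoints`, `Vec3` and `Mat4` are names made up for the example, and the matrix is assumed to be row-major:

```cpp
#include <array>
#include <cassert>
#include <vector>

using Vec3 = std::array<double, 3>;
using Mat4 = std::array<std::array<double, 4>, 4>; // row-major: m[row][col]

// Promotes each point to homogeneous (x, y, z, 1), multiplies by m,
// then divides by w to get back to 3D.
std::vector<Vec3> transformPoints(const Mat4& m, const std::vector<Vec3>& pts) {
    std::vector<Vec3> out;
    out.reserve(pts.size());
    for (const Vec3& p : pts) {
        const double h[4] = { p[0], p[1], p[2], 1.0 };
        double r[4] = { 0.0, 0.0, 0.0, 0.0 };
        for (int row = 0; row < 4; ++row)
            for (int col = 0; col < 4; ++col)
                r[row] += m[row][col] * h[col];
        assert(r[3] != 0.0); // point would map to infinity
        out.push_back(Vec3{ r[0] / r[3], r[1] / r[3], r[2] / r[3] });
    }
    return out;
}
```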
small QOL things
Similar to the debug graphics API I have in my engine
being able to plot a matrix directly as basis vectors in 3D
in 2005 or so, i rewrote a whole workflow engine from scratch in c# as a demo, to see if i could actually do it or not, since at the time i was working with something java based (had a visual editor in c# which i was able to reuse) but my crude c# bs was magnitudes faster, the core of the engine is basically just xslt transformations, from any input to any output, forks and joins and whatnot
haha cool
i was just a trainee at that time too hehe
oh, thats what i would have expected to be part of numpy tbh
it is in some limited ways just not as flexible as I'd like
usually it's fine, there are just a few awkward parts that I fixed with bits I have in SRS too basically
even the VectorGeometry.pas pascal file for all the math stuff had it built in back when i was fiddling with Delphi
from homogeneous to affine and other way around, overloads for everything you need
sucks
the helper functions it provides are hit or miss, but 90% of the time it's good
I use it for everything
if it works most of the time then its probably alright
and then you just add whatever is missing for your special corner case
yeah
you have been coding since before i was born lol
does mdi allow you to change bindings at all?
Personally I just use array textures
yeah, started fiddling with computers in 1990
2016
oh okay
That's when I started programming
But I was fiddling with computers since maybe 2010 and using them since 2004 or something
Just playing games and drawing on macpaint and stuff
Before that I guess too in limited capacity
I don't think macpaint existed anymore in 2004 so maybe that was earlier
yeah
I wonder if I should use compute shaders for my mdi
so I can properly choose which SSBOs to read from in vert/frag shaders
and to calculate transforms
try naive first
then use compute to cull and fill the indirectbuffer with the surviving geometry
wdym
no compute shaders
frustum culling on cpu, populating the indirect/parameters buffer for glMultiDrawElementsIndirectCount
Yes this is good enough for most purposes
Your application may not have enough stuff going on to get much benefit from compute draw generation idk
how would that work?
i know you can specify the offsets of each parameter buffer letting you put extra stuff in there ig
but how would you access that extra stuff?
it looks like on vulkan you can have a full command buffer for bindless rendering
im jealous now
Which stuff
you have all your meshes, have all their aabbs, you cull those aabbs against your view frustum, the ones which survive, you add to the indirect buffer, and put the count of all of them in the parameterbuffer
then you bind both buffers and glMultiDrawElementsIndirectCount()
next frame, you do the same, as you might have moved your camera around, frustum changes, you cull all meshes using their aabbs
etc
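rough sketch of that cpu loop, with no GL calls so it stays self-contained. `DrawElementsIndirectCommand` follows the field layout OpenGL expects for indexed indirect draws; `Plane`, `Aabb`, `MeshRecord` and `cullToIndirect` are hypothetical names, and using `baseInstance` to carry the mesh index is just one common choice:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DrawElementsIndirectCommand {
    std::uint32_t count;          // number of indices
    std::uint32_t instanceCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
    std::uint32_t baseInstance;   // here: index of the mesh, for per-draw data lookup
};

struct Plane { float nx, ny, nz, d; };        // a point is inside when n.p + d >= 0
struct Aabb  { float min[3]; float max[3]; };
struct MeshRecord {                            // hypothetical per-mesh data
    Aabb bounds;
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
};

// True when the whole box is on the negative side of the plane.
bool outside(const Aabb& b, const Plane& p) {
    // Test the corner that lies furthest along the plane normal.
    const float x = p.nx >= 0.0f ? b.max[0] : b.min[0];
    const float y = p.ny >= 0.0f ? b.max[1] : b.min[1];
    const float z = p.nz >= 0.0f ? b.max[2] : b.min[2];
    return p.nx * x + p.ny * y + p.nz * z + p.d < 0.0f;
}

// Rebuilt every frame: one command per surviving mesh.
std::vector<DrawElementsIndirectCommand> cullToIndirect(
    const std::vector<MeshRecord>& meshes, const std::array<Plane, 6>& frustum) {
    std::vector<DrawElementsIndirectCommand> cmds;
    for (std::size_t i = 0; i < meshes.size(); ++i) {
        bool visible = true;
        for (const Plane& p : frustum) {
            if (outside(meshes[i].bounds, p)) { visible = false; break; }
        }
        if (!visible) continue;
        cmds.push_back({ meshes[i].indexCount, 1u, meshes[i].firstIndex,
                         meshes[i].baseVertex, static_cast<std::uint32_t>(i) });
    }
    return cmds;
}
```

`cmds` is what you upload to the indirect buffer, and `cmds.size()` is the count that goes in the parameter buffer.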
Maybe worth doing some regular instancing before trying MDI
yea
and funny enough
if you slowly progress through glDrawArrays, glDrawElements, glDrawElementsInstanced, glDrawElementsInstancedBaseVertex, glDrawElementsInstancedBaseVertexBaseInstance etc you arrive naturally at glDrawIndirect and glMultiDrawIndirect and then the peak of MDIism glMultiDrawElementsIndirectCount hehe
and you can see how all the params of the DrawElementsIndirectCommand came to be
next evolution then is, doing the culling all on the gpu
with compute
then hiz occlusion culling
and then the latest thing, you meshlettify all your geometry and hiz occlusion cull all of them
makes sense
if you have a lot of geometry that is
which overlaps a lot
otherwise a shrimple glDrawElementsInstanced is probably the best
I think the structure I will take is
have an array with elements for each primitive/meshlet being rendered containing indices to transforms, materials, etc.
and then have arrays for said transforms and materials
all on GPU side
use buffers instead of arrays
its not a sin to bind stuff
i know but I literally cannot bind
you might have a depth prepass, a deferred render pass, lights and shadows and whatnot
those require binding of sorts
as far as i know mdi doesn't let you bind between draws
ye its in the nature of the thing
bind all the things you need for the pass to be able to draw its stuff
so to go around that I would have to do this
dont overthink it or overengineer it, you also might need to bind some stuff for debugging purposes
like a line renderer or so to visualize aabbs or frustums
mdi is still very possible
You have a couple thousand state changes per frame before you start getting bottlenecked by the driver
just in a different renderpass then
yeah also texture binds is what used to be the costly part, but since you go for bindless textures already thats gone pretty much
okay
how so?
like i just mentioned
MDI is just another type of draw call
you might have a depth prepass, then some deferred geometry pass, then a lights and shadows pass, then a pass to resolve the two
You don't have to put every single thing into one MDI call, that's not really desirable
all the post effects if you ever add some will do their own thing
i'll stick to singular draw indirect which I assume is about a nanosecond faster lol
and upgrade to mdi in the future if needed
drawindirect and mdi is virtually the same
correct
but starting with non indirect draw is a good start too
it does have implications on the "renderer" "architecture" a bit but thats just 1 vs many (2 per mesh)buffers really
I think I'm just bikeshedding the wrong stuff
Well it depends on your goals I guess, deccer and I are fairly practical about this so our advice is always going to be "you don't need to bikeshed this until your application demands it"
But if you don't want to make an application and you just want to make a meshlet bikeshedding application then that's up to you
But you can't really dive into it without some decent experience using the API the normal way imo
I plan to implement gi, so I do want to optimize this well but I really doubt I'm cpu bound anyways
I do wonder how people are drawing sponza at 0.1ms though
probably just culling
honestly before I do anything else I want some OIT
and maybe alpha to coverage
OIT supplants A2C as far as I know
The point of A2C is to avoid blending altogether
: )
I read a paper on something called "fourier opacity mapped OIT"
it sounds like a neat place to start
for some reason i have old game magazines in front of my third eye right now explaining MSAA with some telefone cable poles in games
Lol sounds ambitious
perhaps
I guess if you're just taking stuff straight off the paper maybe it's doable
if it's too much I can always fall back to whatever method logl has up
I wish there were more test scenes that were better than sponza and bistro
weighted blended
i have this neat antenna thing from mjp's shadow demo thingy
I've never rendered them but afaik they are basically just one giant mesh right? Nothing to really instance
uhhhh
I think bistro has some identical bush meshes
but repeated less than 10 times so most likely not worth any instancing
Yeah that's not that interesting though, what we need is a test scene that's like an actual game and not just a blender scene
thats why i always wanted to make some "bricks" in various default sizes 1x1x2, 1x1x3, and so forth per axis, and "rebuild" sponza out of those, would be suitable for some fisiks testing too hehe
yes
and ivy
and curtains
too bad the fucking knight model only comes in obscure file formats
fr
<rant> i have no fucking clue whats so hard to properly name the model, its parts and whatever else is in there, and then export them into all the fileformats they need individually </rant>
I'm going to be the first one here whose engine includes a glTF exporter and then others can render SRS scenes in their engines lol
absolute W
i screenshotted that
i enjoy being distracted grumpy
otherwise i would need to attend my 4k lines of mess.cpp
yeah
doing controlled sixteenth notes on a single kick is hard
but i refuse to buy a double pedal
Although I exclusively play 16ths on the kick lol
so naturally, 16ths on singles are possible
I thought you meant sustained
Yeah
yeah that's what I mean
I can play 8ths sustained on one pedal up to a certain bpm but not 16ths
Idk how you'd do that unless at very low bpm
Carter Beauford from the Dave Matthews Band in the studio recording the drums to Tripping Billies. From his dvd "Under the Table and Drumming". Great dvd, check it out http://www.amazon.com/Alfred-Carter-Beauford-Under-Drumming/dp/0757990894
I do not own any rights, go buy the dvd! ;)
to my understanding its much like 32nds with your hands: much less technique and mostly just brute force practicing it
Where does he play 16ths on the kick
still thats an equivalent of 16ths on one foot
unless hes doing some sort of double stroke
Timestamp? I'm at work I can't watch the whole thing
Those sound like 16ths to me
Ah I'm counting it at cut time I guess
Well that's 8ths doing a blast beat on a single foot so I can do 16ths on a double pedal at that BPM
yeah
That's pushing the range of what I could sustain for more than 30 seconds or so though
Ok I found maybe the only recording of myself playing that I can find online, it's two songs I tracked for my friend's meme grindcore project lol
I will DM you a link so as to not dox myself
lol
darn, none of my projects are cool enough to go in that comp
dont say that
or take it as an inspiration
like these tiktokers who ask random 1 man burgerflippers if they can make them a 60s intro video
put the deadline a year ahead and there might finally be some cool stuff in here lol
Is this compilation thing going to be publicly associated with this server, I don't want it to attract more windowlickers
i was a windowlicker once
#1252708855176106086
Advertising it as a place to come hang out with the cool kids or see cool projects without contributing
No you weren't
i like aphex twin's window licker
not sure, perhaps there could be various formats, jake is the brains behind this idea
time to create an alt account, demon
where nobody will totally recognize all the DLCs which went into srs when showing srs reel
maybe we just show it to our wives/gfs and no one else
that also works
I would include SRS if there was no way a viewer could follow it back to this server
i will also refrain from fan commentary under that video then
i would love to see a demo of what you have so far
all ive seen are some models and a terrain and fooliage system
There's not a whole lot to see yet tbh I just test it in small pieces and right now it's all taken apart to rewrite most of the whole game
The most cohesive things to see is just at the top of my thread basically
lol
i need to escape this habit of doing some tiny task then being done for the day
I need to stop anxiously avoiding my projects because of the crushing weight of their difficulty
I waste a lot of time
Also I feel bad working on that one because there's another I'm supposed to be working on
huh
I am beholden to another volunteer project that I can't disclose but it's a huge PITA to work on so I've been avoiding it
i see
i could temporarily ban you from
if that helps : >
i think my ubo alignment is all messed up
ah
I think my booleans were being optimized to 1 byte or something
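if the block is std140, a `bool` takes 4 bytes on the GLSL side, so a C++ `bool` (usually 1 byte) will shift everything after it. a sketch of one way to mirror it, assuming a made-up block `Material { vec4 baseColor; bool hasNormalMap; bool hasAlbedoMap; float roughness; }`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// CPU mirror of the assumed std140 block. Booleans are stored as 32-bit
// values so the offsets match what the shader reads.
struct MaterialUbo {
    float         baseColor[4];  // offset 0  (vec4)
    std::uint32_t hasNormalMap;  // offset 16 (bool is 4 bytes in std140)
    std::uint32_t hasAlbedoMap;  // offset 20
    float         roughness;     // offset 24
    float         pad_;          // offset 28, keeps the size a multiple of 16
};

static_assert(offsetof(MaterialUbo, hasNormalMap) == 16, "bool must start at 16");
static_assert(offsetof(MaterialUbo, roughness) == 24, "roughness must start at 24");
static_assert(sizeof(MaterialUbo) == 32, "unexpected padding");
```

the `static_assert`s make the compiler complain as soon as the layout drifts, which beats finding out from a black screen.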
anyways it took embarrassingly long to get this demo up
I have dozens of gl objects that i dont delete when the program closes
"ill handle it later" lol
yeah
I have finally gotten my model class written
theres a couple more things to do to finalize it but other than that were good
and I am proud to report that I did not write a Scratch project
And I might switch to vulkan later down the road for rt support
we'll have to see
Tbh
I might just
step through the hierarchy a second time for transparent primitives
I doubt it's much slower than making some sort of queue system
and if it's a perf problem I can probably just store two separate hierarchies in the future
the material should be able to tell you whether its transparent or not
and from that you can divide your primitives in Opaque and Transparent-isms
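a small sketch of that split, assuming the material carries something like glTF's `alphaMode` (OPAQUE / MASK / BLEND). `Primitive`, `Material` and `splitByTransparency` are placeholder names:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class AlphaMode { Opaque, Mask, Blend };

struct Material  { AlphaMode alphaMode; };
struct Primitive { std::uint32_t materialIndex; };

// Reorders prims so everything that does not need blending comes first.
// Returns the index of the first transparent primitive.
std::size_t splitByTransparency(std::vector<Primitive>& prims,
                                const std::vector<Material>& materials) {
    auto firstTransparent = std::stable_partition(
        prims.begin(), prims.end(), [&](const Primitive& p) {
            assert(p.materialIndex < materials.size());
            return materials[p.materialIndex].alphaMode != AlphaMode::Blend;
        });
    return static_cast<std::size_t>(firstTransparent - prims.begin());
}
```

draw `[0, split)` in the opaque pass and `[split, size)` in the transparent pass, no second hierarchy walk needed.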
i suppose
There is an issue
blender didn't actually correctly write the materials upon export
perhaps this? #1237853896471220314 message
idk thats a foreign discord
it was mentioned in #ray-tracing somewhere, perhaps ping jaker or BayBoyKiller
i believe those 2 were referring to the good one some time ago
ill try jaker
The gltf file is worse than I imagined
it's storing duplicate texture-sampler pairs
ig I gotta make a hash table now
the fix was simpler than imagined yet took some time to write
if (!set.contains(mTextures[i].bindlessHandle))
{
glMakeTextureHandleResidentARB(mTextures[i].bindlessHandle);
set.insert(mTextures[i].bindlessHandle);
}
use the variant which takes samplers into account already
i believe i mentioned that some time ago, that when you go bindless to also look at sampler objects
otherwise you have no way of using different samplers per texture
thats what I do
ah right the sampler and texture goes into the GetSampledTextureHandleThingARB not this function
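sketch of deduplicating on the (texture, sampler) pair instead of on the handle. `HandleCache` is a made-up name and the create callback is a stand-in: in a real renderer it would presumably call `glGetTextureSamplerHandleARB` and then `glMakeTextureHandleResidentARB` once per new pair:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical cache: one bindless handle per unique (texture, sampler) pair.
class HandleCache {
public:
    using CreateFn = std::function<std::uint64_t(std::uint32_t, std::uint32_t)>;

    explicit HandleCache(CreateFn create) : create_(std::move(create)) {
        assert(static_cast<bool>(create_));
    }

    std::uint64_t get(std::uint32_t texture, std::uint32_t sampler) {
        // Pack both 32-bit object names into one 64-bit key.
        const std::uint64_t key =
            (static_cast<std::uint64_t>(texture) << 32) | sampler;
        auto it = handles_.find(key);
        if (it != handles_.end()) return it->second;
        const std::uint64_t handle = create_(texture, sampler);
        handles_.emplace(key, handle);
        return handle;
    }

    std::size_t size() const { return handles_.size(); }

private:
    CreateFn create_;
    std::unordered_map<std::uint64_t, std::uint64_t> handles_;
};
```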
i 
@night shoal cool finding
those depth issues with the sponza decals don't occur with OIT for some reason
interesting
todo: default sampler
5 on the ap csa exam
finally correctly rendering sponza
That's bistro mate
oops
normal mapping
doesn't look very normal to me 
its just lighting
everything is white, and the light color is red
the code is all updated on the github if anyone wants to see
i dont like the highfov
do you know its high from looking at screenshots, the code, or by running the code?
looking at the screenshots
i changed it to 75
dont get me wrong, just because i dont like it doesnt mean you have to change it
i scrolled through a little and i think its perfectly fine
i personally dont use glGetXXXLocation to map strings to locations, i just use locations directly because they will never change really
there is a slight problem with creating textures, regarding color space/format, diffuse or albedo and emissive textures need to (or should) be in GL_SRGB8_ALPHA8, the rest in whatever... when you add compressed textures later then that matters for normals and single channel stuff too (the BC format, not srgb)
you can also save some bandwidth, by encoding normals into an uint using octahedral encoding and glm::packSnorm2x16 (plus its counterpart in glsl, unpackSnorm2x16, to unpack and decode the octahedral value into a vec3)
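a dependency-free sketch of that normal packing, with the snorm conversion written by hand so it does not need glm. it follows the usual octahedral mapping and the same rounding rule as `packSnorm2x16` (`round(clamp(v, -1, 1) * 32767)`); the function names are placeholders:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

static float signNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

static std::uint16_t toSnorm16(float v) {
    const float c = std::max(-1.0f, std::min(1.0f, v));
    return static_cast<std::uint16_t>(static_cast<std::int16_t>(std::lround(c * 32767.0f)));
}

static float fromSnorm16(std::uint16_t bits) {
    const float v = static_cast<float>(static_cast<std::int16_t>(bits)) / 32767.0f;
    return std::max(-1.0f, std::min(1.0f, v));
}

// Unit normal -> 2x16-bit octahedral coordinates in one uint (x in the low bits).
std::uint32_t encodeNormal(Vec3 n) {
    const float invL1 = 1.0f / (std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z));
    float u = n.x * invL1;
    float v = n.y * invL1;
    if (n.z < 0.0f) { // fold the lower hemisphere over the diagonals
        const float ou = u, ov = v;
        u = (1.0f - std::fabs(ov)) * signNotZero(ou);
        v = (1.0f - std::fabs(ou)) * signNotZero(ov);
    }
    return static_cast<std::uint32_t>(toSnorm16(u)) |
           (static_cast<std::uint32_t>(toSnorm16(v)) << 16);
}

Vec3 decodeNormal(std::uint32_t packed) {
    const float u = fromSnorm16(static_cast<std::uint16_t>(packed & 0xFFFFu));
    const float v = fromSnorm16(static_cast<std::uint16_t>(packed >> 16));
    Vec3 n{ u, v, 1.0f - std::fabs(u) - std::fabs(v) };
    if (n.z < 0.0f) { // undo the fold
        const float ox = n.x, oy = n.y;
        n.x = (1.0f - std::fabs(oy)) * signNotZero(ox);
        n.y = (1.0f - std::fabs(ox)) * signNotZero(oy);
    }
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return Vec3{ n.x / len, n.y / len, n.z / len };
}
```

that is 4 bytes per normal instead of 12, and the decode half ports to glsl almost line for line.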
looks like you create materialbuffer etc per model you load
that most likely will work when you render each model individually (bistro + deccer cubes + sponza + whatever)
hehe reading c++ which doesnt use trailing type thingy syntax is weird, now that i kind of like it and use it everywhere
I see
lol
i personally don't like it
if you noticed i also dont store vertex tangents at all
A lot of models don't include it anyways so I just calculate them on the fly with no major perf penalty
gltf will provide tangents
optionally
It's just easier to calculate them lol
sure fallback to have makes sense
anyways, what do you mean compress them later?
I don't like the trailing type thingy either
Declaring everything as auto only to separately declare its return type is just pointless imo
But to each their own
The thing about C++ is that there are so many ways to do things that you just need to learn to be flexible with different code styles with it
yeah
i do use fuzzy finders in whatever tool when present, but i understand what you are saying
deferred shading
im jealous
Why?
my renderer is fucked atm : >
Language
but mine too slightly
those lamps hanging from the rope should be colored
indeed
I also need to debug this
obviously it's working but I'm still scared that it's somehow not
It might be something about emissiveness
i think they have emissive properties yeah, but also some diffuse
Well im not rendering emissive properties
nor do I plan to
I went to a gltf viewer and turned emissiveness off, and it looks like the colors do disappear
so it's fine

time to choose a meshlet generator
any suggestions?
also i went and saw dmb last night
their drummer is insane
this is not real
maybe you meant this https://github.com/zeux/meshoptimizer
I've downloaded it
the documentation seems very lacking
ohhhhh nvm
: )
I am a little confused by what I'm left with after using meshopt to generate meshlets
there's meshletVertices which are meant to be indices to the original vertex buffer
so I suppose the new meshlet vertices are sorted
and then there's meshletTriangles
which ig is the actual index buffer, but it's std::vector<unsigned char>
maybe theyre just 8 bit uints?
yeah that would seem so
why not just use uint8_t though
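how the two arrays fit together, as far as I understand it: `meshletTriangles` holds small local indices into the meshlet's own slice of `meshletVertices`, and those in turn point at the original vertex buffer. a byte is enough because a meshlet only references a handful of vertices (64 in the settings used later in this thread). the `Meshlet` struct below mirrors the fields of `meshopt_Meshlet`; `expandMeshlet` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same four fields as meshopt_Meshlet, redeclared so the sketch is standalone.
struct Meshlet {
    unsigned int vertex_offset;   // start of this meshlet in meshletVertices
    unsigned int triangle_offset; // start of this meshlet in meshletTriangles
    unsigned int vertex_count;
    unsigned int triangle_count;
};

// Turns one meshlet back into ordinary indices into the original vertex buffer.
std::vector<unsigned int> expandMeshlet(
    const Meshlet& m,
    const std::vector<unsigned int>& meshletVertices,
    const std::vector<unsigned char>& meshletTriangles) {
    std::vector<unsigned int> indices;
    indices.reserve(static_cast<std::size_t>(m.triangle_count) * 3);
    for (unsigned int i = 0; i < m.triangle_count * 3; ++i) {
        const unsigned char local = meshletTriangles[m.triangle_offset + i];
        assert(local < m.vertex_count); // local index stays inside the meshlet
        indices.push_back(meshletVertices[m.vertex_offset + local]);
    }
    return indices;
}
```

on mainstream compilers `uint8_t` is just a typedef for `unsigned char`, so the vector type is the same thing under a different name.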
Taking a week off as I plan to go on vacation
I know its far off
but when I'm done with college, I want to make a really good 2D game and distribute it for free
using my own engine and music of course
im looking forward to it
The thing is I kinda want to release it anonymously
I think that could be cool
or at the very least be silent about it during development
you could release it as StrubGromper69
I'd probably use my real name
which is a great reason to not dox myself here
maybe I will personally DM you and demon about it lol
Yea the separation of identities is pretty tricky
It's hard to keep them perfectly isolated
I have a more professional handle that I don't mind being associated with myself but obviously I do all my main discording and stuff with this one so if I were to release under another name then everyone who's seen me here will know the connection between the identities
Interesting
well if a side scroller is released in 2030 with dave matthews styled music youll all know who made it
I'll keep my eyes peeled
Midwesterner moment hehe
Actually I'm from Maryland
but md water sucks so true
Oh I thought you were from Ohio for some reason
Seeing the curvature of the earth always reinvigorates my desire to stick with my round earth terrain 
just assume earth is a 10 dimensional object with a perfectly flat projection in 3D
waterposting gets this thread more activity than actually talking about the project 
well I love water so I'm not too bothered by that
Gotta do more bikeshedding
there's no code to bikeshed
i impress myself with how small my programs tend to be
In which we explore ray tracing, the reason modern CGI can look so convincing, and ReSTIR, a recent technique that allows images (and particularly animations) to be rendered hundreds of times faster.
RIS Paper: https://diglib.eg.org/bitstream/handle/10.2312/EGWR.EGSR05.139-146/139-146.pdf?sequence=1&isAllowed=y
RIS Thesis: https://scholarsarchi...
I found this video to be a good resource
i am home now
wb
I hate these bugs:
```
throw;
std::size_t maxMeshlets{ meshopt_buildMeshletsBound(tmpIndices.size(), 64, 124) };
```
It doesn't throw, but meshopt_build fails because
`assert(index_count % 3 == 0);`
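a tiny guard that would catch this before the library assert does. note that a bare `throw;` with no exception in flight calls `std::terminate` rather than throwing anything catchable, so the sketch throws a real exception type. `requireTriangleList` is a made-up helper, not part of meshoptimizer:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Rejects index buffers that cannot be a plain triangle list.
void requireTriangleList(const std::vector<unsigned int>& indices) {
    if (indices.size() % 3 != 0) {
        throw std::invalid_argument(
            "index count " + std::to_string(indices.size()) +
            " is not a multiple of 3 (strips, lines or points?)");
    }
}
```

calling it right before `meshopt_buildMeshletsBound` turns the failure into an error message that names the offending count.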
alright we're getting somewhere
Okay index bufs are per meshlet
meshlet gen works, but that frametime
7x slower
I need to use smaller indices, and maybe mdi for the meshlets
ahhh I see why its slower
I updated my little stats thing and before meshlets, there are 1538 drawcalls, and now with meshlets there are 50213
How slow are we talking about? Also what gpu do you have?
the trick is now to cull all the non visible meshlets
35ms for a frame on rtx 3050
true but theres still 50k drawcalls lol
mdi is much needed here
jesus on my 1050 its 7ms without culling
how many drawcalls lol
not really possible cpu almost doesnt know what is loaded
uh
you can do singular draw indirects
thats what im doing
bind the indirect buffer then call drawindirect
Its not simple for me to just switch it
unless you're using vulkan, im not sure what the equivalent will be
God willing, I'll start with MDI and see if that resolves it
i am using vulkan but switching things around would be too much effort
only thing missing is lods and sw raster
i see
wait for hiz its going to be big pain in the ass
I still havent nailed it. There are some edge cases that are hard to nail down
Time to bikeshed drawcall batching
Ideally I want to use one MDI per mesh so that all meshlet draw calls are indirect
however I still need to sneak culling into that process
afaik you can't have MDI skip an instance
I have heard of mesh and meshlet shaders but iirc theyre vendor specific so I want to avoid those
The vulkan switch may come sooner than we thought 
What are you skipping instances based on
Is this for culling
yeah
Normally you just build up the list of draws from scratch and you skip in that process
Or just on the host, whenever you build up the draw data
Oh wait this is for meshlets is it
Ok I guess it's still similar, but you have sort of a fixed "whole scene" buffer that's static and then fill the MDI buffer and transform buffer out of that I suppose
I'm not really sure
The difference for me is that normally your draw data is changing every frame
right
But if you're just selecting meshlets to be drawn out of a fixed array it's a bit different
You'd either have to copy the scene data into a compact instance buffer, sort the scene data, or add another layer of indirection so that your instances can randomly access the meshlets they're drawing
but the takeaway is im rebuilding indirect arrays each frame, right?
yeah
You can do this on the host too which is easier to start with but yeah you can do it in compute as well
Wdym isn't that what we're talking about doing
i thought we were using compute to make the draw arrays then make the few drawcalls from the cpu?
The CPU just kicks off the indirect work
so why do we need to write the arrays if the compute shaders can just directly make the drawcalls?
Wait is there no transformation data for the meshlets
there's material/transform data per mesh but not meshlet
Ah ok
Then I guess you don't need the arrays
But either way you're just filling the indirect draw buffer every time
meshlets are nothing else but ordinary mesh primitives
you treat them the same way as your cube or sphere or motorcycle or deccer cube
they all go in your whole scene geometry/andotherwise indirection buffers
then on the gpu you ask the compute shader to cull invisible ones
and the survivors are appended into the indirect(and parameter buffer if you use mdic) buffer for the next mdi call
First what I would do move the data needed for rendering to gpu. So materials, transforms, mesh groups and meshes are on gpu in their own buffers and are reachable from shaders.
My simple render loop looked like:
- compute dispatch which reads the scene data and writes an indirect compute dispatch with the amount of mesh groups
- indirect compute shader goes over all mesh groups and their meshes to write all meshlets into one giant buffer with a counter.
- compute shader that reads the counter and writes a single indirect draw call command with the instance count which equals the meshlet count in the scene
- indirect draw call which reads a previously written buffer with meshlet data and renders them.
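the steps above, sketched as plain cpu code to show the data flow. on the gpu each function would be a compute dispatch and the vector size would be an atomic counter. all struct and function names here are guesses for illustration, not the actual code behind that loop; `DrawArraysIndirectCommand` uses the field order OpenGL expects for non-indexed indirect draws:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Mesh      { std::uint32_t firstMeshlet, meshletCount; };
struct MeshGroup { std::uint32_t firstMesh, meshCount, transformIndex; };
struct MeshletInstance { std::uint32_t meshletIndex, transformIndex; };

struct DrawArraysIndirectCommand {
    std::uint32_t count;         // vertices per instance
    std::uint32_t instanceCount; // one instance per meshlet
    std::uint32_t first;
    std::uint32_t baseInstance;
};

// Step 2: walk groups -> meshes -> meshlets and append everything to one list.
std::vector<MeshletInstance> expandMeshlets(const std::vector<MeshGroup>& groups,
                                            const std::vector<Mesh>& meshes) {
    std::vector<MeshletInstance> out;
    for (const MeshGroup& g : groups) {
        for (std::uint32_t m = 0; m < g.meshCount; ++m) {
            const Mesh& mesh = meshes[g.firstMesh + m];
            for (std::uint32_t i = 0; i < mesh.meshletCount; ++i)
                out.push_back({ mesh.firstMeshlet + i, g.transformIndex });
        }
    }
    return out;
}

// Step 3: a single draw whose instance count is the meshlet count.
DrawArraysIndirectCommand makeDraw(std::size_t meshletCount,
                                   std::uint32_t maxTrianglesPerMeshlet) {
    return { maxTrianglesPerMeshlet * 3u,
             static_cast<std::uint32_t>(meshletCount), 0u, 0u };
}
```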
This is my vertex shader that renders meshlets
I abuse instancing to render meshlets
That's true if the meshlets are all the same size you can just use instancing and pull the vertices huh
I set vertex count to a max of vertices that meshlet can have and if meshlets has less then max I clip out the vertex
instancing for meshlets?
As I said I abuse instancing
Yeah if each meshlet is e.g. 64 vertices then you can instance a 64-vert mesh and use the instanceID to fetch the vertices from an SSBO
if you mean if they have different amount of vertices you can simply handle it same way I do
you can instance them even tho they have different sizes
its not optimal but vertex shaders are fast enough
How do you discard the unused invocations, just emit NaN?
yes
I linked the shader how I do it
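for anyone who has not opened the linked shader, a cpu-side sketch of the idea (not a copy of it): every instance is one meshlet, every instance runs the maximum vertex count, and invocations past the meshlet's real triangle count return NaN so the triangle gets dropped. `fetchMeshletVertex` and the structs are illustrative names:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

struct Meshlet {
    unsigned int vertex_offset, triangle_offset, vertex_count, triangle_count;
};
struct Vec3 { float x, y, z; };

// What the vertex shader would do with gl_InstanceID / gl_VertexID.
Vec3 fetchMeshletVertex(std::uint32_t instanceId, std::uint32_t vertexId,
                        const std::vector<Meshlet>& meshlets,
                        const std::vector<unsigned int>& meshletVertices,
                        const std::vector<unsigned char>& meshletTriangles,
                        const std::vector<Vec3>& positions) {
    const Meshlet& m = meshlets[instanceId];
    const std::uint32_t triangle = vertexId / 3;
    if (triangle >= m.triangle_count) {
        // Padding invocation: emit NaN so the rasterizer discards the triangle.
        const float nan = std::numeric_limits<float>::quiet_NaN();
        return Vec3{ nan, nan, nan };
    }
    const unsigned char local = meshletTriangles[m.triangle_offset + vertexId];
    return positions[meshletVertices[m.vertex_offset + local]];
}
```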
i also have yet to descend into the meshlet realm
If you can use mesh shaders its much better
otherwise you have to cope with compute and vertex shaders
yeah so i heard
fortunately i have some 40xx grade hardware
but a fallback would also be nice, for the learning purposisms
My entire player character is a meshlet
sounds like an insult if you say it like that
learning purposes for sure but realistic wise you can just assume you will have mesh shaders
Well I am in the 11.2% and am not sure I agree with this message
I want to be able to play Deccer's game
Although I'm going to put a 30xx in my other computer eventually so all will be forgiven if he decides to go the mesh shader route
This is only discrete gpus on steam
Actually weren't there whisperings of cross-platform mesh shaders coming in GL soon
Although idk if that would work on old cards still
I assume most players who will want to play my game will have one
Are mesh shaders a hardware thing
rip
so gtx 1660 up in nvidia terms
is there anything non-vendor specific?
Like what
only having vendor specific mesh shaders 
Lol
this is non vendor specific but vulkan
I don't want raw vulkan though; i think an abstraction would be better
I can recommend daxa
is that locally made here in this server?
hmm
very easy to use
a quick glance tells me it might be too high level
also to mention it has rendergraph
it handles synchronization for you and easier to code stuff
i see
yeah, turns out i cannot read, as usual
@gilded shell you can find the daxa thread in #1019722539116802068
kinda dead tho
lol
daxa has its own server but that one has occasional activity
good thing is you can still pester potti or gabe or any other frog who took the daxa pill
daxa pill is very nice 
If you want to make a specific game then raw Vulkan is fine
If you want to make a flexible engine or sandbox tool for graphics prototyping you should use something like vuk or daxa
Otherwise you're going to just spend huge numbers of hours writing infrastructure code
i suppose calling this a sandbox is wrong because i do have end goals
famous last words
but i do want this to be sandbox-like
thats up to you vuk is also nice but I like daxa for better ergonomics and ability to annoy lpotrick
End goals like what though
End goals as in a specific game or end goals as in a certain set of graphics features
features
which is why for now im going with daxa or vuk
maybe vuk because it has a better name
It's also made by a funnier person
vuk has a dedicated server too, its in #related-servers martty is on travelz atm might not get immediate feedback atm
but Hek is there to help as well
i see
with daxa you have to either infer from the code or timberdoodle or one of my projects
Also you can ask me
daxa does have a dedicated tutorial too
I dont think its up to date but its really close
there were breaking changes and lpotrick is lazy to write it
darn
I would rather consult one of my projects or timberdoodle
I checked the tutorial and it should work fine
https://tutorial.learndaxa.com/installing-dependencies/
why is none of you suggesting the good ways of doing meshlets without mesh shaders
@gilded shell #1262676828322271293 message
@oak moat if I'm "decoding" a vertex index in the vert shader, doesn't that mean i cant use conventional vertex buffer layouts where the shader fetches the vertex automatically?
you fetch from buffers
I am using bda/pointers
you can't do the classic vertex buffer stuff, yes
i see
"dispatch a compute shader to write an index buffer of size meshletCount * maxPrimitives * 3 that encodes primitiveID and meshletID, then in the vertex shader you can decode the vertex index and proceed as usual"
Not sure why it needs to write the index buffer
right but why? I already have index buffers; rewriting them with compute seems wasteful
I can understand the primitive and meshlet id part
its not index buffer in traditional sense
its buffer that contains indices for meshlet
I mean you can do this offline for twice the memory usage
why not do it offline and drop the original indices?
here you have highlighted code to understand
https://github.com/bevyengine/bevy/blob/a6f8f62748677a8a04b86231870e8e6de5505bc9/crates/bevy_pbr/src/meshlet/write_index_buffer.wgsl#L41
https://github.com/bevyengine/bevy/blob/a6f8f62748677a8a04b86231870e8e6de5505bc9/crates/bevy_pbr/src/meshlet/visibility_buffer_raster.wgsl#L41-L44
you will pay twice the memory usage
I do not understand why
because you write same info 3 times instead of just once
same packed index for every vertex in triangle or you could do it once per triangle
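A rough CPU-side sketch of the packed-ID idea discussed above, assuming the "once per triangle" variant: the compute pass writes one packed value per surviving triangle, and the vertex shader reads it back with `vertex_index // 3`. The 7-bit triangle field and all function names here are made up for illustration, not taken from any engine.

```python
# Hypothetical bit split: low 7 bits hold the triangle ID inside the meshlet
# (up to 128 triangles), the remaining 25 bits hold the meshlet ID.
TRIANGLE_BITS = 7
TRIANGLE_MASK = (1 << TRIANGLE_BITS) - 1


def pack_ids(meshlet_id, triangle_id):
    assert 0 <= triangle_id <= TRIANGLE_MASK
    assert 0 <= meshlet_id < (1 << (32 - TRIANGLE_BITS))
    return (meshlet_id << TRIANGLE_BITS) | triangle_id


def unpack_ids(packed):
    return packed >> TRIANGLE_BITS, packed & TRIANGLE_MASK


def write_draw_buffer(visible_meshlets, triangle_counts):
    """Stand-in for the compute pass: one packed entry per triangle."""
    out = []
    for m in visible_meshlets:
        for t in range(triangle_counts[m]):
            out.append(pack_ids(m, t))
    return out


def decode_vertex(draw_buffer, vertex_index, meshlet_indices):
    """Stand-in for the vertex shader: find which mesh vertex to fetch."""
    meshlet_id, triangle_id = unpack_ids(draw_buffer[vertex_index // 3])
    corner = vertex_index % 3
    return meshlet_indices[meshlet_id][triangle_id * 3 + corner]
```

The returned value is what you would then use to fetch position, normal and so on from your own storage buffers, which is why the classic vertex attribute fetch no longer applies.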
Sorry but I'm incredibly confused lol
I will simplify
suppose you do this offline
you have a buffer of size N
now suppose you want culling
just to clarify offline means on the CPU at loadtime?
you need somewhere to store the culling result
worst case for culling is that every meshlet and every triangle is visible
so the worst case size for the new buffer that contains the culling result is N again
therefore the total size is 2N
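The 2N argument above in numbers, as I read it: a precomputed buffer of size N plus a worst-case culling output of size N, versus only the output buffer when the culling pass writes the entries itself. The meshlet count and triangle limit are hypothetical.

```python
def index_buffer_bytes(meshlet_count, max_primitives, bytes_per_index=4):
    # meshletCount * maxPrimitives * 3 entries, as in the quoted description
    return meshlet_count * max_primitives * 3 * bytes_per_index


n = index_buffer_bytes(10_000, 124)   # hypothetical scene
offline_total = 2 * n                 # precomputed buffer + worst-case culling output
runtime_total = n                     # culling pass writes the only buffer
```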
you don't do it offline
and the culling shader is the same as your "make buffer" shader
okay
perhaps
in conclusion,
all the geometry is loaded into storage buffers and when its time to render, a compute shader runs through, does the culling, and writes out the survivors to new buffers?
or am I back at square one
yes
I see
Thank you for giving me widsom
wisdom
after that, I suppose the CPU issues an MDI command that draws all those survivors
no
No?
you write the command in compute shader and cpu does indirect with the buffer
actually yes I thought you were doing the other strat
yes it's a single command with indexCount set as survivingMeshlets * triangleCount * 3
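A small CPU simulation of that flow, assuming every meshlet is padded to the same triangle count so the `survivingMeshlets * triangleCount * 3` formula holds. The five fields follow the standard OpenGL `DrawElementsIndirectCommand` layout (count, instanceCount, firstIndex, baseVertex, baseInstance); the function name is made up, and on the GPU the survivor list would be built with atomics in the culling shader rather than a Python loop.

```python
import struct

# count, instanceCount, firstIndex, baseVertex (signed), baseInstance
INDIRECT_COMMAND = struct.Struct("<IIIiI")


def cull_and_build_command(meshlet_visible, triangles_per_meshlet):
    """Compact the visible meshlets and produce the bytes of one indirect command."""
    survivors = [m for m, visible in enumerate(meshlet_visible) if visible]
    index_count = len(survivors) * triangles_per_meshlet * 3
    command = INDIRECT_COMMAND.pack(index_count, 1, 0, 0, 0)
    return survivors, command
```

The CPU never reads `index_count`; it only binds the buffer holding these bytes and issues the indirect draw.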
of course
one more thing
is this solution for per-triangle culling as well? That would explain the index buffer rewrite
yes
I see
I wonder how per-triangle culling can be faster than just letting them through
it depends on the method I suppose
you reduce vertex invocations
I assume per-triangle culling would use a frustum intersection method, rather than the projection method used in vkguide 2
no
you do small triangle culling
then backface culling
and then you profile if occlusion/frustum culling is worth it
typically it's not
also from lpotrick its not worth if you dont use mesh shader
i can't really do backface culling before vertex transform though
you can
yes you can
```
v0 = projView * worldTransform * vertices[indices[i0]];
v1 = projView * worldTransform * vertices[indices[i1]];
v2 = projView * worldTransform * vertices[indices[i2]];
cull = determinant(mat3(v0.xyw, v1.xyw, v2.xyw)) > 0;
```
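The same determinant test run on the CPU, as a sanity check of the idea. Vertices are assumed to be clip-space (x, y, z, w) tuples that have already been multiplied by the matrices. Which sign means "back-facing" depends on your front-face winding and on whether your projection flips Y, so this sketch only returns the signed value and leaves the comparison to you.

```python
def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c2[1] * c1[2])
            - c1[0] * (c0[1] * c2[2] - c2[1] * c0[2])
            + c2[0] * (c0[1] * c1[2] - c1[1] * c0[2]))


def xyw(v):
    return (v[0], v[1], v[3])


def signed_orientation(v0, v1, v2):
    """Positive for one winding, negative for the other, zero if degenerate."""
    return det3(xyw(v0), xyw(v1), xyw(v2))


# Counter-clockwise triangle in y-up clip space with w = 1
ccw = [(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)]
```

With this input the value is positive, and swapping any two vertices flips the sign. Working on x, y, w avoids the perspective divide, so the test keeps working when some w values are negative.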
nah zeux is 
yeah 
Also I need to figure out why my hiz is not 100% correct
it just sometimes breaks
now all i need is a lod tree and sw raster lol
just a couple business days
I tried sw raster but it was hilariously broken like whole sponza was split weirdly in half and both parts were going in different direction
god no
I am already suffering with my jobs
angular + php will make you hate programming
right
i actually did write a clusterizer and dynamic lod engine on Scratch and it was hell
yeah
Also I do wish upon worst enemy to write javaish php code or deal with css/js spec
which is less than a pixel, no?
sure
more rigorously: projected area of triangle is less than threshold (with subpixel scaling taken into account)
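A minimal sketch of the area version of that test, assuming the inputs are NDC xy positions (after the perspective divide) and a one-pixel threshold. It ignores subpixel precision, and real implementations often check whether the triangle's bounding box misses every pixel center instead, so treat the threshold and names as placeholders.

```python
def projected_area_pixels(p0, p1, p2, viewport_w, viewport_h):
    """Area in pixels of a triangle given as NDC (x, y) pairs in [-1, 1]."""
    def to_pixels(p):
        return ((p[0] * 0.5 + 0.5) * viewport_w, (p[1] * 0.5 + 0.5) * viewport_h)

    a, b, c = to_pixels(p0), to_pixels(p1), to_pixels(p2)
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return 0.5 * abs(cross)


def is_small_triangle(p0, p1, p2, viewport_w, viewport_h, threshold=1.0):
    return projected_area_pixels(p0, p1, p2, viewport_w, viewport_h) < threshold
```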
right
but i dont own any models with that level of fidelity lol
maybe it makes more sense for far away triangles, not sure if bistro is expansive enough for that to apply
if you go far enough away every triangle is subpixel
but it is pretty marginal
it may even give you a little overhead
i see
ill probably go with the backface cull
ahhh i need to make my materials bindless too
The thread has been pulled
what
You're pulling the thread and now the whole thing is unravelling lol
If you're going to go down this route you may as well switch to Vulkan
fake you can implement nanite without vulkan
you need vulkan only if you want to torture your GPU with CAS loops and devicecoherent memory 
for that beautiful cross queue async visbuffer writing and bvh traversal 
i dont want tooooo
this meshlet frustum junk + two pass occlusion culling is all i have planned
the vulkan switch might be inevitable depending on how performance critical ray trace operations will be
what kind of tracing

unless by bvh you mean distance field then you're gonna get "interactive framerates" at best
and even then it'll be painful
when you said this i thought you were talking about how this discord thread has gotten more activity than ever
its at about 1800 messages and yesterday i think it was 1500 lol
meaning ~17% of this thread's messages occurred today just because i wanted meshlets to draw a bit faster
No I meant the sweater is unravelling
well i can either switch to vulkan now or later
I would suggest avoiding anything RT
why?
slow
how are those any faster?
it depends
generally if you want HWRT there is only one way of doing things
with everything specified by the API
instead of bringing your own representations which may or may not work well with your scene
anyway it depends, just do whatever you want
what about shader ray queries?
that sounds like it nullifies this if its my own pipelines though
unless you're talking about the scene and bvh setup?
legitimately it's really just traceRayEXT but in compute
you're still using your HW's BVH format and HW's tracing algorithm and all that
it's fast*
*assuming you provide a scene that is a good fit
you should read more
what I said is just the tip of the berg
generally one can only say "it depends"
I know, thats the plan
Actually
I think I'll start with screenspace
since I want to use screenspace for specular reflections anyways
closer
the code in question
oh my shader version wasn't high enough
alright
all the scene's materials are now in one big ssbo
geometry now mostly bindless
this is all so that I can do aforementioned culling and batching methods
the only thing left being bound are material indices and model transforms, which all can probably be built into another large ssbo with per-meshlet elements
after that I can use compute to build the index and mdi buffers
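One possible shape for that per-meshlet ssbo, sketched on the CPU side. The record layout (two 32-bit unsigned ints, material index then transform index) is purely a guess at what "per-meshlet elements" could look like; an array of such structs has an 8-byte stride under std430, so the bytes can be uploaded as-is.

```python
import struct

# Hypothetical per-meshlet record: material_index, transform_index
MESHLET_RECORD = struct.Struct("<II")


def build_meshlet_buffer(records):
    """Pack (material_index, transform_index) pairs into one flat byte buffer."""
    buf = bytearray()
    for material_index, transform_index in records:
        buf += MESHLET_RECORD.pack(material_index, transform_index)
    return bytes(buf)


def read_meshlet(buf, meshlet_id):
    """What the shader would do: index the buffer by meshlet ID."""
    return MESHLET_RECORD.unpack_from(buf, meshlet_id * MESHLET_RECORD.size)
```

The shader then uses the two indices to look up the material in the big material ssbo and the transform in a transform ssbo, so nothing has to be rebound per draw.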
what do you mean by geometry bindless?
only one vbo, ibo, and vao
how isn't that bindless?
bindless means you dont bind anything
hence the name
like bindless textures, you just grab a handle from the resource, and pass the handle to the shader
that exists for buffers too on nvidia, not just textures
most people refer to that as mdi or mdic
im not using multidraw indirect yet
just huge buffers
That isn't true. You need to still bind something. But you bind once per pipeline
Correct name would be bind less
ah so you do use mdi pretty much, just not the multipart, but indirect draw
in vulkan bindless is descriptor indexing and bda. you dont need to bind bda because its pointer but for descriptor indexing you need to bind a big descriptor set to index into in shaders
you still have to bind but only once per pipeline
I am aware that in opengl you dont need to bind textures since nvidia has some extension but theres nothing for buffers
that nvidia extension i was mentioning earlier is also BDA for bindless buffers
hmm
but for indirect draw you still need to bind the indirect (and parameter buffer)
Its little better but still have global state and you get surprises like nvidia doing mip mapping of 3d textures on cpu
Whenever I say bindless I mean the technique rather than the APIisms
You store animated and non-animated objects in single buffer? Or you dont have vertex skinning yet?
no skinning yet, why?
Because skinning adds a ton of attributes that you'd have a second vao for
You wouldn't want to have those attributes on every mesh if they're unused
Still struggling to understand how to do it right. Creating 2 types of vertices, one with skinning attributes, one without, and then loading them based on user choice seems wrong
why user choice
you render static objects with vao 1, and all your animated ones with vao 2, or you dont use vaos at all (besides the default one for gl core)
loadAsStatic, loadAsAnimated. Maybe user doesnt want to import anims from gltf
- creating 2 types of vertices seems wrong to me
then what?
you usually have more than 1 vertex type anyway
Didnt know,thx
Saving a bit of memory on stuff like animated gates/doors/etc. You may not want to import animations and just use them for background
your artist will provide the animations used in the game
not loading animations to save memory doesnt matter
@junior sparrow @west trail Do you have project threads I can follow?
not me, still figuring out stuff
mind updating your github repo btw? I want to stole check out your code
I keep switching projects 
I should create a thread about what I am working on currently ofc GP related
Uh
Probably not today
its at a pretty incomplete state lol
what code are you looking for?
just checking out
nothing wrong with creating new threads for another project : >
alright
then maybe today
but its gonna be super messy
Gates and doors wouldn't be skinned they would be animated programmatically
We're talking about skeletal animation
okay, no user choice then
hell yeah
Watch your language in my thread please
We do not speak obscenities in this chat
@west trail its up
Its night time in my timezone man, let me sleep