#Fwog and co.
oops
how will teardown ever recover
mayhaps tinkered together and not properly tested?
temporal motion blur
yeah it stops accumulating whenever the camera is moving
holding middle click (the button to change camera roll) will turn off temporal effects too
pretty fun way to do it tbh
most people do it as a post process
but here it's the ground truth
makes sense, since you want quality for photo mode
would have been possible to do it stochastically with ray tracing imo
here's another funny one
so that there's noise instead of whole frames
not photo mode tho
kinda
a box is drawn around each object, then a ray is traced into it
rasterization can't do this because it's glued together
well, a ray per rasterized fragment
but rt is embarrassingly parallel
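A rough sketch of the "box, then ray" idea above, in C++ instead of the game's GLSL. Rasterizing the box gives one fragment per covered pixel, and each fragment intersects its view ray with the box to find where to start marching the voxels. `Vec3`, `Ray`, `makeRay` and `intersectBox` are made-up names, and this is the textbook slab test, not Teardown's actual shader.

```cpp
#include <algorithm>
#include <optional>

struct Vec3 { float x, y, z; };

struct Ray {
    Vec3 origin;
    Vec3 invDir; // 1 / direction, per component
};

// Hypothetical helper. Assumes no direction component is exactly zero.
Ray makeRay(Vec3 origin, Vec3 dir) {
    return {origin, {1.0f / dir.x, 1.0f / dir.y, 1.0f / dir.z}};
}

// Slab test against the axis-aligned box [bmin, bmax].
// Returns the entry distance along the ray (0 if the origin is inside),
// or nullopt if the ray misses the box or the box is behind the origin.
std::optional<float> intersectBox(const Ray& r, Vec3 bmin, Vec3 bmax) {
    const float tx0 = (bmin.x - r.origin.x) * r.invDir.x;
    const float tx1 = (bmax.x - r.origin.x) * r.invDir.x;
    const float ty0 = (bmin.y - r.origin.y) * r.invDir.y;
    const float ty1 = (bmax.y - r.origin.y) * r.invDir.y;
    const float tz0 = (bmin.z - r.origin.z) * r.invDir.z;
    const float tz1 = (bmax.z - r.origin.z) * r.invDir.z;

    // Latest entry and earliest exit over the three slabs.
    const float tNear = std::max({std::min(tx0, tx1), std::min(ty0, ty1), std::min(tz0, tz1)});
    const float tFar  = std::min({std::max(tx0, tx1), std::max(ty0, ty1), std::max(tz0, tz1)});

    if (tFar < std::max(tNear, 0.0f)) return std::nullopt;
    return std::max(tNear, 0.0f);
}
```

Every fragment runs this independently of its neighbours, which is where the "embarrassingly parallel" part comes from.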
ok so teardown apparently does some thing where it launches a new process or re-initializes the graphics context (I'm not even sure) right after loading. tl;dr it kills most debuggers attempting to connect to it at launch time
trying to figure out how to overcome this so I can connect nsight to it, as it's currently reporting that the connection was closed upon launch
Launch process exited. Searching for attachable child processes...
Searching for attachable processes on localhost:49152-49215...
Searching for attachable processes on localhost:49152-49215...
Failed to connect. The target process may have exited.
everyone in #gaming be saying that he is probably fine with it since he's an indie 
anyways, I can already see all the shader sources with renderdoc. I just want to profile
zink plus RGP pls
I want useful-er profiling results
btw you leaked your Ip I am now going to ddos localhost
lol I can find all the shaders in the executable if I dump all strings
fun fact if you look at a binary generated by godot you can find all the shaders there, and most of them will likely be unused
makes one realize that there is little care about storage efficiency in modern age
someone tell the author to xor encrypt them
renderdoc will spill the beans anyways
what if teardown will look for processes and kill renderdoc-like ones
actually is there a way to tell that some dll was injected into the process?
that should be enough
maybe
it doesn't kill renderdoc if you use the global hook and then manually connect renderdoc ui to it
indiscriminately suiciding on dll injection is going to make the game unstable tho
also rip any video capture
yeah it survives injection for sure
I think it's just doing something wonky with the process which happens to neuter things that connect at launch time
I have to connect uprof after launching the game or it will fail
given that nsight tries and fails to find a child process after the main one terminates, idk what is even happening
is renderdoc able to show spirv decompiled?
yeah but the decompiled spir-v looks like ass
BUT it is enough that you can kinda tell what's happening and even modify+recompile it inside renderdoc
well once it's decompiled everything else including what you listed is a bonus
renderdoc is just getting the spir-v and using one of those spir-v libraries from khronos to decompile it
heh, at least it won't preserve identifiers and such
unless you explicitly compile with debug info
ok, I think I haven't tried the classic trick of connecting nsight to steam and capturing its subprocesses
let's go
idk why I didn't try that sooner. it's my literal job to know how to do that
ladies and gentlefrogs, we're in
lmao
seems like the AO pass is the most expensive, then the reflections, then sunlight and fog
Lol the guy that made the other Teardown teardown is unhinged af. Do not check his Twitter
*checking his twitter*
here's the reddit post where he argues over the semantics of the word "shadowmap" when he's clearly wrong
https://www.reddit.com/r/GraphicsProgramming/comments/10k8l7r/teardown_frame_teardown/

I see your nitpick and raise you an even harder nitpick. A map that "literally stores light/dark values in texels" is not typically called a "shadow map", but a "light map".
A common "shadow map" is a 2D texture that stores depth, and it is queried using a depth compare operation against a known reference level (the pixel depth). For this reason, you cannot bilinearly filter shadow maps, rather, you use a PCF gather which does a compare for every texel and then bilinearly interpolates the result of that comparison.
Another way to look at this is that a 2D shadow map is just a "froxelization" of the world (as in frustum-voxel), with each texel forming a pillar from infinity/far-plane up to the surface.
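To make the PCF point in the quote concrete, here is a small CPU-side sketch in C++: each of the four neighbouring texels is depth-compared first, and only the pass/fail results are bilinearly blended. `ShadowMap` and `samplePcf` are hypothetical names, and the texel-center convention (`uv * size - 0.5`) is an assumption rather than anything taken from the post being discussed.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct ShadowMap {
    int width, height;
    std::vector<float> depth; // row-major, nearest occluder depth as seen from the light

    // Clamp-to-edge texel fetch.
    float texel(int x, int y) const {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return depth[y * width + x];
    }
};

// Returns visibility in [0, 1]: 1 = fully lit, 0 = fully shadowed.
float samplePcf(const ShadowMap& sm, float u, float v, float refDepth) {
    const float fx = u * sm.width - 0.5f;
    const float fy = v * sm.height - 0.5f;
    const int x0 = static_cast<int>(std::floor(fx));
    const int y0 = static_cast<int>(std::floor(fy));
    const float wx = fx - x0;
    const float wy = fy - y0;

    // Compare per texel first...
    auto lit = [&](int x, int y) { return refDepth <= sm.texel(x, y) ? 1.0f : 0.0f; };

    // ...then bilinearly interpolate the comparison results.
    const float top    = lit(x0, y0)     * (1.0f - wx) + lit(x0 + 1, y0)     * wx;
    const float bottom = lit(x0, y0 + 1) * (1.0f - wx) + lit(x0 + 1, y0 + 1) * wx;
    return top * (1.0f - wy) + bottom * wy;
}
```

With depths `{0.2, 0.8}` and a reference depth of `0.5`, sampling halfway between the two texels gives half visibility. Filtering the depths first would give `0.5` and a hard lit/unlit answer, which is why plain bilinear filtering doesn't work on a depth shadow map.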
this seems correct to me?
well
let me phrase it as, i don't know enough to call it incorrect
yeah, his counter-nitpick seems true, but I still think calling the 3D voxel world a shadow map is completely wrong
oh, yeah in that sense i don't think it's actually a reason to call it a shadowmap assuming i understand
when i went to his twitter i saw a bunch of anti-AI circlejerk
which has been semi-funny, semi-sad to me as of recently
the more you scroll the more brain cells you lose
ye nitpick is true but he missed the point
it's like calling a keyboard a typewriter or something
and then defending that they both do similar thing
yeah i didn't realize what he was calling a shadowmap initially
linky
he's right wing
least unhinged twitter user
ignore that person, publish your thing
didnt even look at theirs and i wont
as i have more trust in one of my favorite
peeps
Their post is honestly quite good though
Anyways, I'll finish mine for sure. I just want to add profiling data to it since I figured out how to profile it finally ||and to differentiate our posts||
I can't work on it tonight since I'm tired af and cooking up a big bowl of mac and cheese before bed
oh god i scrolled more 
this was me last time but instead of diamonds it was brain rot juice
what's funny is that the shader source code actually refers to the functions that trace this volume as raycastShadowVolume*
however, the uniform name of its sampler is simply uVolTex
though tbf, other textures are bound to that sampler (such as the small per-object volumes)
wtf, the dude who made the other Teardown teardown is perpetuating the myth that it has global illumination
However, GPUs forbid reading and writing from the same render target, to avoid race conditions.
bullshit
https://registry.khronos.org/OpenGL/extensions/ARB/ARB_texture_barrier.txt
I actually think glTextureBarrier is something that Teardown could benefit from to remove this horrendous hacky soft-early-z abomination thing
as far as I'm aware, each draw renders to each pixel in its bounding volume exactly once and no more, making it a suitable candidate for a rendering feedback loop
it has global direct illumination
why is this
I feel like I overuse commas in my writing because I put commas where I would naturally pause in speech
what's funny is that in the thing I'm looking at (commas for dummies), the first use case is to put a comma before coordinating conjunctions ("but" being one of them)
could this be true?
I think tomorrow I'll post the draft of this thing for you guys to read. It'll be missing a few sections, but oh well
who else became overly conscious of their grammar after reading this
this is what i was taught
- no comma
- after pancakes
I think pronouns form a new clause though
so there are 2 clauses separated by a comma in 1
yeah i would've been docked in school for not writing a comma before "or"
there are also some teachers that would've docked for no oxford comma before the and but obviously that's not universal
Oxford disagrees
ok I found "why" teardown has RGB16_FLOAT motion vectors
honestly quite incredible
outputMotion = vec3(cur.xy / (cur.w*2.0) - old.xy / (old.w*2.0), 0.0);
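For anyone squinting at that line: dividing clip-space `xy` by `2*w` is half of NDC, and the `+0.5` offsets of the NDC-to-UV mapping cancel in the subtraction, so the result is motion in screen UV units. A small C++ sketch to check that reading (`Vec2`, `Vec4` and the function names are made up here, and the interpretation is mine, not confirmed by the game's source):

```cpp
struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Clip space -> screen UV in [0, 1] (perspective divide, then remap from [-1, 1]).
Vec2 clipToUv(Vec4 c) {
    return {c.x / c.w * 0.5f + 0.5f, c.y / c.w * 0.5f + 0.5f};
}

// Same arithmetic as the xy part of the shader line quoted above.
Vec2 motionAsInShader(Vec4 cur, Vec4 old) {
    return {cur.x / (cur.w * 2.0f) - old.x / (old.w * 2.0f),
            cur.y / (cur.w * 2.0f) - old.y / (old.w * 2.0f)};
}

// The "obvious" formulation: difference of the two UV positions.
Vec2 motionFromUv(Vec4 cur, Vec4 old) {
    const Vec2 a = clipToUv(cur);
    const Vec2 b = clipToUv(old);
    return {a.x - b.x, a.y - b.y};
}
```

Both functions agree, so only two channels carry motion and the third component of the `vec3` is a constant.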
it's for marking water areas supposedly
to prevent cosmic radiation bitflips
yeah at least put it in A channel of 8-bit normal texture
it seems overkill for everything in the material buffer to be f16
emissive, ok, but roughness and metallic surely suffice with 8 bits of precision
seems indeed a bit loose with packing, esp all those wasted bits in A
yeah, I put my notes on that in a "bikeshed" section of my post
I'm not sure if it's in bad taste to put that kind of stuff in a frame breakdown, but oh well
just think of this as the evidence gathering phase for a trial
GP vs. Teardown
I obliterated the quality of the videos in my breakdown to not make this thing 100+ megabytes
I just realized that, since you guys are my main audience for my posts, I should really amp up the shrimps, clams, and "honestly quite incredible"s in them
anyways, here it is (draft)
https://juandiegomontoya.github.io/teardown_breakdown.html
todo: fill out WIP sections, add table of contents
posting it to pixelduck's linkedin
speaking of whoms't, @valid oriole
(I am abusing this thread to post non-fwog stuff)
I posted a link to the teardown^2
i saw
I guess teardown^4 since that other guy posted it
i am currently examining said post
gib clickable images
right click -> open in new tab
also only some of them are high res because I didn't actually expect people to want that 
jeebus
it's already 20mb of pics 
ok a workaround: if the image you click is low res, resize the browser window when you open the image up
if I make them all HD, the post would be like 200mb
I guess it's only really a problem with github if individual files are huge
ok, I will consider making the pics thiccer
another thing on my todo is to fix the gamma for the images in the first half of the post
big smoke WIP
let me know if there are any images in particular that you wish to have a higher resolution
Teardown features exclusively area lights, which is made possible thanks to ray tracing.
be sure to mention big ol' rock as a competitor tech
i will also leave helpful comments later btw
I had to think for a bit to understand that reference
might be a bit too obscure for the general audience 
. Spending 16 bits to store a boolean value is an interesting choice.
too passive-aggressive
say braindead instead, lean on the aggressive
yeah I'll try to frame that differently
proposal: 1 dogjiff per perf warcrime
how about
Since the water flag is a boolean value, it could be allocated elsewhere to save some bandwidth. Maybe another value could be quantized to make room for this bit, but I'll leave determining that value as an exercise for the reader.
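One way the "quantize another value to make room for this bit" idea could look, sketched in C++. The layout (7 bits of roughness plus a 1-bit water flag in one byte) is purely hypothetical and is not what Teardown does; the function names are invented for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical layout: bit 7 = water flag, bits 0..6 = roughness quantized to 7 bits.
std::uint8_t packRoughnessWater(float roughness, bool water) {
    const float clamped = std::clamp(roughness, 0.0f, 1.0f);
    const auto q = static_cast<std::uint8_t>(std::lround(clamped * 127.0f));
    return static_cast<std::uint8_t>(q | (water ? 0x80u : 0x00u));
}

float unpackRoughness(std::uint8_t packed) {
    return static_cast<float>(packed & 0x7Fu) / 127.0f;
}

bool unpackWater(std::uint8_t packed) {
    return (packed & 0x80u) != 0;
}
```

Whether 7 bits of roughness is visually enough is exactly the kind of thing that would need testing in the actual renderer.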
I tried to stay objective with my analysis of perf warcrimes, so I didn't get into the 8 uniforms + glBindTexture per draw call they are doing, nor the redundant state setting throughout the frame
maybe I'll save that for part 2 where I analyze driver overhead
its a short section, no?
uses opengl instead of superior vulkan api, therefore driver overhead
driver overhead: yes
gob already told dennis to use vulkan on tw*tter
after he asked why you can't make gl calls from multiple threads
what a wasted opportunity to say "skill issue" and refuse to elaborate
I would have pretended that teardown teardown didn't exist
instead of shouting it out and posting a slowpoke image
I don't mind giving the other one exposure
implies looking down
I instantly like it by virtue of the fact you don't :^)
my intent wasn't to look down at the other one
tbh I think it's better than mine in a number of ways
in all seriousness, from an """academic""" sense it seems most honest to cite prior work
you could also say that the other one is rushed
esp. if you're aware of it
and if I'm going to post this on reddit (where the other one was already posted), then it would be weird if I didn't acknowledge the other one at all
I guess I could acknowledge it in a non-memey way, but that would be boring
its not formally academic, there's no rule you can't make it memey
although strictly speaking I think there's no rule against that in academia either, provided they accept your paper
identifying the audience has been difficult for me
its a technical blogpost in the year 2023, I think memes are fine
plus with how fast graphics moves, the technical aspect might age faster than the jokes
does not apply to independent discovery
and jaker was first anyways
yeah I think that changes if you're made aware of something during writing
what does "first" mean in this context?
apparently even gob was writing a breakdown of teardown (long ago) and got as far as the post-processing stage
ok
yall are plagiarists copying gob
so why did he stop?
did he discover some deeply hidden unsettling thing that's better left undisclosed to the public?
I think he lost motivation or got bored or something
yup
I had "November 2022" at the top of this thing until hours ago
I guess this game is popular to cover because it's relatable to indie graphics programmers in certain ways
and dennis is an actual person and not some faceless company with a huge team of rendering engineers
to me it makes an impression of shadertoy regular making an actual product to sell
claybook gave me that impression too
too bad cool graphics doesn't instantly equate to fun gameplay
yeah, but game design is also a thing that requires effort and study
makes me think of noita, like how in that one GDC talk he mentioned he basically screwed around with his falling sand sim for years
yeah that is some cool stuff
you know what they say, if all you have is a hammer everything starts looking like a nail
you can actually make a scuffed sand sim yourself in very few loc
I remember that being an advent of code puzzle at one point
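For the curious, a scuffed sand sim really is tiny. This C++ sketch is one possible version (the grid encoding and the `stepSand` name are made up here, and it has nothing to do with Noita's actual implementation): each grain tries to move straight down, then down-left, then down-right.

```cpp
#include <string>
#include <utility>
#include <vector>

// '.' = empty, 'o' = sand, '#' = wall. Row 0 is the top. Assumes a rectangular grid.
using Grid = std::vector<std::string>;

// Advances the simulation by one tick. Rows are scanned bottom-up so that
// a grain moves at most one cell per tick.
void stepSand(Grid& g) {
    const int h = static_cast<int>(g.size());
    for (int y = h - 2; y >= 0; --y) {
        const int w = static_cast<int>(g[y].size());
        for (int x = 0; x < w; ++x) {
            if (g[y][x] != 'o') continue;
            for (int dx : {0, -1, 1}) { // down, down-left, down-right
                const int nx = x + dx;
                if (nx < 0 || nx >= w) continue;
                if (g[y + 1][nx] == '.') {
                    std::swap(g[y][x], g[y + 1][nx]);
                    break;
                }
            }
        }
    }
}
```

Calling `stepSand` in a loop and printing the rows is enough to watch piles form.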
would overgrowth be another example or a counterexample
they spent ungodly amount of time on tech
core mechanics were fun but as a finished product it's meh
it has pretty decent reviews on steam at least
they spent so much time on it their rendering became extremely outdated
well it has fun mechanics, but there is much more to a game than gameplay
the more I look at examples I like the more I realize there's no substitute to an ungodly number of artist manhours
they had so many ideas which they never implemented
the campaign ended up being basically
what's the interesting part of the tech?
you start at A, see slideshow cutscene, kill guys, next level
oh huh, the source code is on github
their GDC talk on the animation tech is really good
apache license even
idk what else it has to offer technologically, just looking at pictures of it right now it looks like pretty generic phong
those system reqs tho
OS: Windows 7 or later (64-bit)
Processor: Passmark CPU 1000 or better
Memory: 4 GB RAM
Graphics: Passmark GPU 500 or better with OpenGL 3.2 support
Storage: 35 GB available space
I think it was mainly their character animations/physics
*see slideshow cutscene with physics-based-ish frame interpolation
they morph between poses
with spring constraints and physical forces or something like that
or it's what it appears to look like
I have a lot of appreciation for stuff like that, which tries to move tedious art-space problems into programmer-space
graphically not much
there is physics based combat
they probably spent most of their time on the editor and moddability
and I think game logic is all mostly hardcoded
so it's not even a general purpose engine
I haven't looked at the code but it might be hellish which would have caused tech debt issues
slowing down development
ok but this mod exists
https://steamcommunity.com/sharedfiles/filedetails/?id=1170115712
it started when C++ was bad as hell
the code for their previous game was open sourced
compared to current standard
I actually vaguely remember a post about overgrowth being open-sourced way back
except I think I didn't understand how to read code then
mfw people who write code like this will always be more successful than us 
my code is like a temple
I think british person made this mod
and by this i mean the sagrada familia
instantly ruined
overgrowth's code looks close to C
did you see this comment
Sir the community of DayZ absolutely needs you i just add you to ask about making this for DayZ , i will pay you
absolute time capsule of a comment
no way
here's another top-tier comment
could you make more fps boostings? lol
what does this mean
lugaru story is not working
it's a campaign that remasters previous game of the series
in overgrowth
honestly I liked it better somehow
than overgrowth
i like how you explained the very first picture
typical for americans i guess
like those city shots of very well recognizable cities but just their skyline, when you write "Paris, France" on top of the Paris Eiffel Tower shot
here: you see a shotgun
caption: shotgun
(but I suppose great minds think alike)
what is this
hi deccer
why is there a visible seam here in the corner? https://juandiegomontoya.github.io/assets/teardown/g_albedo.png
is the albedo g-buffer NdotL'd or something?
actually it might just be a dirt decal or something
also, is it possible that he uses this term
raycastShadowVolume
as a "raycast shadow using volume"?
not "raycast into the shadow-volume"?
overall neat read 
I actually don't know what you mean
I don't think so, because those are the only functions used for all tracing
Which means they're used for non shadow tracing
the picture with the shotgun has a caption "shotgun"
Where
yeah the game projects detail onto the voxels
the seam is a result of the projection not being continuous as the angle changes
Btw @golden schooner I hope you liked the inclusion at the end of my article
The schlapp
In related news, it seems like only martty so far has complained about low res images
Therefore it's probably fine
i dont mind them
maybe the overall size of them within the blog could be bigger
given how much empty space you have left and right
otherwise even i as an uninitiated was able to somewhat follow the content
i just zoom in the whole website to 200%
the only potential problem is that they are constant w.r.t. screen size
The content is always 800px unless on a small screen
can you make it height relative
like, width = 1x screen height
or 80% screen height
maybe something that's like min(80% screen height, 100% screen width)
to keep the fit on mobile
mobile at it again
Hey, my site works on mobile
I was reading it on mobile this morning trying to figure out the caption thing you were referring to
i apologise for my poor attempt of explaining what i meant
did you contact gustaffson yet?
I didn't realize I was supposed to
Yes, but I don't recall there being a definite conclusion or making any promises
Would I be asking him for source code access or something
not necessarily
Or just sending him my feedback
but exchanging thoughts and brainthinks
yay or ask him questions primarily, regarding the things which are puzzling you/us/cameup during bikeshedding it
I wish he was on this server, but I also know he would immediately be spammed by everyone here
would he
assuming he didn't do like (CREATOR OF TEARDOWN)
jasper, suslik, MJP all don't really get spammed
Tru
But there are probably like a dozen people here who have things they want to tell Dennis
shoulda joined sooner so it didn't all pile up 
Gob wants to shill Vulkan to him. Pixelduck and Jasper want source code access for profiling, etc.
Literal free labor if you think about it. Maybe he should do it
Of course, I have questions for him too 
you are the traffic
the trafficker rather
Is this a Californian phrase
Actually, my dad is from California and he never said this 
oh there's a dumb old meme that's like
"you aren't in traffic, you are traffic"
another dredd quote almost
referring to this tho https://youtu.be/miVoe7U6Lx4?t=26
hmm what should I call the init function for fwog?
what does the init do?
Fwog::Innitshallah()
Fwog::InitAndDrawEverything(makesureToDoItInaPerformantFashion: true)
so far: initializes global variables and populates a struct with device capabilities
I also keep thinking about using an object instead of hidden globals, but I think that will just make the API more annoying to use overall
oh?
{
//your actual init here
}
You couuuuuld do both
that looks cursed
auto ctx = Fwog::Context();
As in you have one api that passes context around
yeah
Then you make an api on top that just globals it
which also has the DrawXXX functions innit and all the other loose stuff flying around in Fwog
inspired by d3dXX
Tbh you're using opengl anyways, globals shouldnt be too bad innit
yeh
but you add a little type safety
fwog is kinda "stateless" so it's not the cancer kind of opengl globals
In haskell you wouldnt need to pass the context around, you'd just make a monad and:
do
YourRenderingFunc
YourOtherFunc
Etc

but to pretend to be stateless, it needs hidden state. quite the conundrum
the conundrummer
The do here passes the context around epic style
epic
having a context seems pretty standard for any graphics API ever
Bababooey as they say
so doesn't seem like a big deal
just cuz other APIs do it doesn't mean I see a clear reason to do it in mine
auto& ctx = Fwog::CreateContext(...);
auto& commutePipeline = ctx.CreateComputePipeline(...);
commutePipeline.BindImage(..., 42);
commutePipeline.DispatchIndirect();
at best, it makes it harder to accidentally call graphics functions
and it makes global state management a little easier since it's tied to an object
well you need a state blob, as all graphics APIs do, and having a context means that the user can create more than 1
it doesnt have to mean that
it doesn't make sense for there to be multiple contexts for fwog
I have no idea how I'd do that anyways since this is GL
Fwog::GetContext() could work too
it doesn't have to but its a benefit, and one of the reasons APIs choose this design
I thought you could have multiple contexts in GL as well
but you might want to create a context for aschleppyncly things, upload geometry/textures/compile shaders
multiple gl contexts is a meme feature
just saying
all that stuff requires integration with the OS/windowing system which I don't feel like doing since my perception is that multiple GL contexts offers almost zero benefit
::GetContext then
now what does GetContext do
basically a singleton
make a different system for handling interaction with the windowing system
FWOG_EXT_Swapchain
bruv momento
creates the 1 context and returns it, if its created already, returns it
schleppy
if you're 100% confident you will only ever have 1 global state per process, just go with Fwog::Init()
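The two options on the table (explicit `Init` vs. a lazily created `GetContext()` singleton) could look roughly like this in C++. Everything here lives in a made-up `FwogSketch` namespace with placeholder types; it is a sketch of the design being discussed, not Fwog's real API.

```cpp
#include <cassert>
#include <memory>

namespace FwogSketch { // hypothetical namespace, not Fwog's real API

struct DeviceLimits { int maxTextureSize = 0; }; // placeholder capability struct

struct Context {
    DeviceLimits limits;
    // cached pipeline state etc. would live here
};

namespace detail {
    inline std::unique_ptr<Context> context; // the single hidden global (C++17 inline variable)
}

// Option A: explicit init/teardown, the user decides when the state exists.
inline void Initialize() {
    assert(!detail::context && "Initialize called twice");
    detail::context = std::make_unique<Context>();
}

inline void Terminate() { detail::context.reset(); }

// Option B: singleton accessor, creates the one context on first use and returns it afterwards.
inline Context& GetContext() {
    if (!detail::context) detail::context = std::make_unique<Context>();
    return *detail::context;
}

} // namespace FwogSketch
```

Either way there is still exactly one blob of hidden state per process; the difference is only whether creation is explicit or happens on first access.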
currently, the user is expected to make the context
I wonder how I would design it to support multiple contexts
I haven't actually done multi-context gl
mayhaps via a dedicated context thing for aschleppyncisms
idk what the point of it is, but if it exists that means people probably use it
where you cant do ordinary contextisms, but only things you can or would actually use MT contexts for
I explicitly don't support a lot of stuff in fwog because they're bad™ features (also to protect my sanity from having to implement literally everything possible in gl)
https://www.youtube.com/shorts/v7PwhAm5mzg jaker trying to improve his shit, cashiers (DR and others) making fun of the poor guy
I can just label multiple contexts as a bad feature and go on with my life
multicontext considered harmful
dont let derhass hear it
or neure
: >
my brainworm isn't deep enough to care about it
probably fine
otherwise it might be possible to create "a context for MTisms" from the main ctx, and you can only do limited things on/with it
yeah, I think the important part is just playing nice with other GL contexts if they exist
remember devsh's multi-context brainworm?
yeah, since Fwog more or less hides all glIsms()
don't think I was there for that
very vaguely
he had to call glFlush/glFinish in weird places for things to work
I don't want that crap
ye
I was thinking more of a situation where you would say, have a GPU-accelerated physics engine that internally uses GL for whatever reason
but thats probably also due to mobile-isms
and you wouldn't wanna step on its toes
i dont think fwog is there yet
nah I mean as a user, you have fwog + some random physics engine and they both have separate GL contexts
my solution for handling situations where non-Fwog gl calls will be made is to expose a function to invalidate Fwog's internal cache
fair
you wouldn't want them to mess each other up
but thats not reading fwog's disclaimer
debug renderer mayhaps
well you shouldn't be making an app with multiple gl contexts, ever 
I was thinking for doing GPU compute or something, but I guess most actually use cuda or opencl or something
valid concern
I imagine the expectation with most reasonable GL libs is that something else will create the context
fwog does not create the gl context, so it should play nicely with all the crap that does dumb stuff
in the case where you have some dumb physics library that makes a GL context, you just need to give fwog the gl.h header and let the physics lib do its thing
taking otherisms into considerations is probably out of scope anyway, since we assume sane people using fwog know what they are doing and dont mix things
well it isn't too hard to make it compatible with other stuff due to its design
I just need a Fwog::InvalidateAssumptions of some sort after external GL calls are made
(assumptions = cached pipeline state, etc.)
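A minimal sketch of what "invalidate the assumptions" means for a state cache, in C++. `PipelineCache` and `BindCounter` are invented for illustration (the counter stands in for the GL driver so the example runs without a context); Fwog's real cache tracks far more than one program handle.

```cpp
#include <cstdint>
#include <optional>

struct BindCounter { int calls = 0; }; // stand-in for the GL driver

class PipelineCache {
public:
    // Skips the bind when the cache says the program is already bound.
    void BindProgram(std::uint32_t program, BindCounter& gl) {
        if (boundProgram_ == program) return; // redundant state change, skip it
        ++gl.calls;                           // real code would call glUseProgram(program) here
        boundProgram_ = program;
    }

    // Call after any external (non-cached) GL calls: forget everything we
    // think we know, so the next bind is always issued for real.
    void Invalidate() { boundProgram_.reset(); }

private:
    std::optional<std::uint32_t> boundProgram_; // empty = unknown
};
```

Without `Invalidate()`, an external library rebinding a program behind the cache's back would make the cache skip a bind that is actually needed.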
Fwog::AssBarrier()
you could have some function on the main context where you pass in the aschleppynced one, to sync
example?
implemented with std::abort
ctx->Synchronize(asyncCtx1); ?
hmmmm
methinks I'm going to forget about these asyncschleppyisms entirely for now
people who need that can use Fwog 2
hmm I wonder if objects can be shared between vulkan contexts
er, I mean devices
either one I guess
there's not really any point since vulkan supports non-stupid multithreading already
yeah I think so
I remember when I first joined this server, I posted a pic of my friend calling khronos morons for the design of OpenGL
then I learned that the design made sense for its time
but with experience, I now know that the design is indeed horrible 
how did you find the server in the first place?
I think I found it linked on reddit
ah
idk where the reddit link is now
I couldn't find it last time I checked, maybe they removed it
/r/GraphicsProgramming probably
that's where I checked
dragon posted links there
no link in the sidebar :/
we dont own that section unfortunately
My Fwog feature request:
- add VXGI
Thanks.
I actually would like to try that technique sometime
It wouldn't be part of Fwog though, since it only abstracts the graphics API
but you also have the RSM stuff
What would be interesting is having a library of reusable effects
thats examples right
nice. I am waiting
that will be after clustered volume rendering
I knew you were gonna say that when I saw you typing
hmm @long robin
i have a silly question
in gpu_driven, when you define the vao
you use elaborate formats to describe the attributes... like rg16_unorm for the normals
isnt it just shrimple datatypes like byte/float/int etc? and the amount of components
that's just a silly oct-encoded normal
ye so GL_FLOAT x 2
it's rg16_unorm
which corresponds to two floats in the shader, ye
floats in the range [0, 1]
ye i can see that its 2
yes
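For context on the "oct-encoded normal" bit: the unit normal is projected onto an octahedron, unfolded into a square, and stored as two 16-bit unorm values. Here is a C++ sketch of the standard octahedral mapping; the type and function names are made up, and it is not copied from the gpu_driven example.

```cpp
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };
struct OctNormal { std::uint16_t x, y; }; // what ends up in the rg16_unorm attribute

static float signNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// [-1, 1] -> 16-bit unorm
static std::uint16_t toUnorm16(float v) {
    return static_cast<std::uint16_t>(std::lround((v * 0.5f + 0.5f) * 65535.0f));
}

// 16-bit unorm -> [-1, 1]
static float fromUnorm16(std::uint16_t v) {
    return static_cast<float>(v) / 65535.0f * 2.0f - 1.0f;
}

// n must be unit length.
OctNormal octEncode(Vec3 n) {
    const float inv = 1.0f / (std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z));
    float px = n.x * inv;
    float py = n.y * inv;
    if (n.z < 0.0f) { // fold the lower hemisphere over the diagonals
        const float ox = (1.0f - std::fabs(py)) * signNotZero(px);
        const float oy = (1.0f - std::fabs(px)) * signNotZero(py);
        px = ox;
        py = oy;
    }
    return {toUnorm16(px), toUnorm16(py)};
}

Vec3 octDecode(OctNormal e) {
    float x = fromUnorm16(e.x);
    float y = fromUnorm16(e.y);
    const float z = 1.0f - std::fabs(x) - std::fabs(y);
    if (z < 0.0f) { // undo the fold
        const float ox = (1.0f - std::fabs(y)) * signNotZero(x);
        const float oy = (1.0f - std::fabs(x)) * signNotZero(y);
        x = ox;
        y = oy;
    }
    const float len = std::sqrt(x * x + y * y + z * z);
    return {x / len, y / len, z / len};
}
```

In a shader the unorm-to-float step is done by the vertex fetch, so the vertex shader only sees two floats in [0, 1] and does the remap and decode itself.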
but glVertexArrayAttribFormat takes GL_RG16_UNORM in as a type?
wut
glVertexAttribPointer?
oh nvm
glVertexAttribFormat(normalLocation, 2, GL_UNSIGNED_SHORT, GL_TRUE, offsetof(Vertex, normal));
ye you translate the format into those things, i feel dumb
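The translation step being described, sketched in C++: one "fancy" format enum expands into the `(size, type, normalized)` triple that `glVertexAttribFormat` wants. The `Format` enumerators and `toGlVertexFormat` are hypothetical names for illustration; the numeric constants are the standard values of `GL_UNSIGNED_BYTE`, `GL_UNSIGNED_SHORT` and `GL_FLOAT`, written out so the example compiles without GL headers.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical format enum, loosely modelled on Vulkan-style names.
enum class Format { R32G32B32_FLOAT, R32G32_FLOAT, R16G16_UNORM, R8G8B8A8_UNORM };

struct GlVertexFormat {
    int size;           // component count
    std::uint32_t type; // GLenum
    bool normalized;    // GL_TRUE / GL_FALSE
};

// Values from the standard GL headers.
constexpr std::uint32_t kGlUnsignedByte  = 0x1401; // GL_UNSIGNED_BYTE
constexpr std::uint32_t kGlUnsignedShort = 0x1403; // GL_UNSIGNED_SHORT
constexpr std::uint32_t kGlFloat         = 0x1406; // GL_FLOAT

GlVertexFormat toGlVertexFormat(Format f) {
    switch (f) {
        case Format::R32G32B32_FLOAT: return {3, kGlFloat, false};
        case Format::R32G32_FLOAT:    return {2, kGlFloat, false};
        case Format::R16G16_UNORM:    return {2, kGlUnsignedShort, true};
        case Format::R8G8B8A8_UNORM:  return {4, kGlUnsignedByte, true};
    }
    throw std::invalid_argument("unhandled vertex format");
}
```

So `R16G16_UNORM` becomes exactly the `2, GL_UNSIGNED_SHORT, GL_TRUE` seen in the call above.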
ye im there
line 48 then
yep thats exactly where my cursor is rn
do you understand it now
np, wasn't sure if you were still conchfused
ngl that function is boss, amazing how crufty gl is sometimes
hmm I think line 39 is a bug
shouldn't it be glEnableVertexArrayAttrib(vao, desc.binding)
in vertexcache still?
ye
yes it is
but since you described your attributes in the right order in your exshrimples it works by chance
I think the i just happens to work if you supply all of the bindings
ye
I'm glad I looked at this
you ain't seen nothin
let me PR this
no I'll do it
probably not
binding is somewhat the wrong word here, its attribute index
yes
better
I think line 40 is also fooked
...
GL.VertexArrayAttribFormat(
_id,
vertexBinding.Location,
...
and fix 48-50 too then
40 looks ok to me
did you merge exshrimples back to main already?
indeed 40 is fucked, for some reason i kept reading gl...(vao, desc.binding) and didnt see the i in between
yeah i should only be used for getting the createinfo thing
ye
it just happens to perfectly line up with desc.location in many situations
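A stripped-down C++ reproduction of the bug being discussed, with the GL call replaced by a list of "which attribute index got enabled" so it runs without a context. `VertexInputDesc` and both function names are invented for the example; the real code calls `glEnableVertexArrayAttrib(vao, index)` in the loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct VertexInputDesc {
    std::uint32_t location; // attribute index used by the shader
    std::uint32_t binding;  // vertex buffer binding slot
};

// Buggy version: uses the loop counter as the attribute index.
std::vector<std::uint32_t> enabledAttribsBuggy(const std::vector<VertexInputDesc>& descs) {
    std::vector<std::uint32_t> enabled;
    for (std::size_t i = 0; i < descs.size(); ++i) {
        enabled.push_back(static_cast<std::uint32_t>(i)); // stand-in for glEnableVertexArrayAttrib(vao, i)
    }
    return enabled;
}

// Fixed version: the loop counter only picks the description, the index comes from it.
std::vector<std::uint32_t> enabledAttribsFixed(const std::vector<VertexInputDesc>& descs) {
    std::vector<std::uint32_t> enabled;
    for (const VertexInputDesc& desc : descs) {
        enabled.push_back(desc.location); // stand-in for glEnableVertexArrayAttrib(vao, desc.location)
    }
    return enabled;
}
```

With locations `{0, 1}` both versions agree, which is why it worked by chance; with locations `{0, 2}` the buggy one enables attribute 1 instead of 2.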
you're right. I won't even have to fix it
you better fix it
amend original commit, force push
mine looks like
.AddAttribute(0, DataType.Float, 3, 0)
.AddAttribute(1, DataType.Float, 2, 12)
.AddAttribute(2, DataType.Float, 2, 20)
the perfect heist
indeed
feels good, man
you are welcome
actually designed instead of bolted on
the original gl one sucks
at least VertexAttribFormat and stuff make it easy to translate to vulkan
VertexAttribPointer means you have to do this whenever the format or buffer changes (courtesy of dennis)
dennis the menace
old opengl has a way to specify a single value for vertex attribs, for some reason
I wonder what that could be useful for
its not THAT bad
it's either that or using a vao with every vertex buffer combo
i like our setup vao and schwitsch vao when necessary more
vao sounds like that nft scam "dao"
data access object
its not as bad as drinking rotten acid from a raccoon corpse, but it is not exactly good
I was actually thinking of an analogy using a field of broken glass, but those work too
who walks barefoot?
ok, I'll let you get back to your pbrisms
shit
the imgui comment had me think, while dennis may not get spammed, ocornut absolutely would LMAO
don't think he could join any public discord though
didn't ocornut maintain a discord server at some point, but shut it down because of too much cancer
yes, but that was another level of bad
the problem there was just the CS:GO cheater issues multiplied by 10
I can't decide which header to put the init function in
none of the current ones really make sense for it, so I'll have to make a new one
maybe Context.h (inb4 an OS API already uses this)
Fwog.h sounds like something that includes all of the headers, like glm.hpp
I love wading through this page and seeing what's still relevant and what's useless FFP crap
https://registry.khronos.org/OpenGL-Refpages/gl4/html/glGet.xhtml
GL_MAX_PROGRAM_TEXEL_OFFSET
data returns one value, the maximum texel offset allowed in a texture lookup, which must be at least 7.
can't tell if old shit
yes it is
epic
GL_MAX_UNIFORM_BUFFER_BINDINGS
GL_MAX_COMBINED_UNIFORM_BLOCKS
GL_MAX_VERTEX_UNIFORM_BLOCKS
GL_MAX_FRAGMENT_UNIFORM_BLOCKS
GL_MAX_GEOMETRY_UNIFORM_BLOCKS
GL_MAX_TESS_CONTROL_UNIFORM_BLOCKS
GL_MAX_TESS_EVALUATION_UNIFORM_BLOCKS
GL_MAX_COMPUTE_UNIFORM_BLOCKS
I'm strategically ignoring a bunch of these to put in my own DeviceLimits struct
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT
data returns a single value, the minimum required alignment for uniform buffer sizes and offset. The initial value is 1. See glUniformBlockBinding.
what does the bold text mean? the value is 256 on my GPU
"initial" implies that this value can somehow change at runtime
spec says this
The value of UNIFORM_BUFFER_OFFSET_ALIGNMENT is the maximum allowed,
not the minimum
but also this
wtf does it mean for it to be the "maximum allowed alignment"
what a horrible API
No conforming impl can have more alignment than that
Ie you can hardcode 256 alignment and it will work everywhere
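In practice the value matters when sub-allocating several uniform blocks out of one buffer: every offset handed to `glBindBufferRange` for a uniform buffer has to be a multiple of it. A tiny C++ sketch of the rounding (`alignUp` is a made-up helper name; the alignment would normally come from `glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, ...)`, or be hardcoded to 256 as suggested above):

```cpp
#include <cassert>
#include <cstddef>

// Rounds value up to the next multiple of alignment.
// Works for any positive alignment, not only powers of two.
inline std::size_t alignUp(std::size_t value, std::size_t alignment) {
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}
```

For example, a second block placed after 300 bytes of data would start at `alignUp(300, 256)`, i.e. offset 512.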
is the "the initial value is 1" thing a fluke then
what a bad job it is to write from specs
opengl driver devs must all have ptsd
wait... is writing specs even worse?
writing the driver is probably worse because you're dealing with an imprecise spec
the refpages is where the sus "initial value is 1" thing is
it does not appear in the actual spec
what are you gonna do with the stick
Make a fire to burn the refpages
you know it's socially unacceptable to get your stick out in public
yet another refpages L
I wonder how some of these errors happen. Shouldn't the refpages be auto-generated?
that reminds me of something i wanted to do as well
did you see this
For a long time I've been wanting to make a blog post about Teardown rendering. I even started writing it twice but never reached the end because there is so much to cover. Now it turns out @unconed did all this work in glorious detail and accuracy!
https://t.co/0fKQbTprxT
Lol
he who snoozeth loseth
Btw what's the water mask in RT3 used for? I tried to look at all the shaders using it, but only found references to .rg, never to .b. What did I miss?
16 bits to store 1 bit that's not used
funny inline location
struct ContextState
{
// a bunch of stuff. seriously, don't worry about it
}inline * context = nullptr;
now i am worried
reinterpret the function pointer of glGetString and plonk a random byte to prevent anyone calling it
apparently both GL_TEXTURE_MAX_ANISOTROPY and GL_MAX_TEXTURE_MAX_ANISOTROPY are a thing
ah, they are used in different contexts
I love how all enumerators are in the same namespace in gl
GL_EXTENSIONS is also funky
in legacy gl it gives you all extensions separated by space
today you have to use glGetStringi(GL_EXTENSIONS, i)
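Rough sketch of the modern lookup, assuming the count comes from glGetIntegerv(GL_NUM_EXTENSIONS, ...) and each name from glGetStringi(GL_EXTENSIONS, i). `HasExtension` is a hypothetical helper (not Fwog API), and the GL calls are passed in as a callback so the snippet doesn't need a context:

```cpp
#include <cassert>
#include <functional>
#include <string_view>

// `count` stands in for the value of GL_NUM_EXTENSIONS and `getName(i)` for
// glGetStringi(GL_EXTENSIONS, i). Both are injected so this compiles and
// runs without a GL context.
inline bool HasExtension(int count,
                         const std::function<const char*(int)>& getName,
                         std::string_view wanted)
{
  assert(count >= 0);
  for (int i = 0; i < count; i++)
  {
    const char* name = getName(i);
    // Compare whole names; unlike searching the legacy space-separated
    // string, this cannot match a prefix of a longer extension name.
    if (name != nullptr && wanted == name)
    {
      return true;
    }
  }
  return false;
}
```

The result of something like this is what would fill a flag such as `bindlessTextures` once at context creation.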
rate my struct
struct DeviceFeatures
{
bool bindlessTextures{}; // GL_ARB_bindless_texture
};
struct DeviceFeatures
{
bool bindlessTextures : 1{}; // GL_ARB_bindless_texture
};
pls
don't wanna waste the perf
the hidden static state is now hidden global state. much better
and now the user can get a vkPhysicalDevice{Properties, Features} struct for context info instead of using those yucky glGets
when is Fwulkan coming out
I don't think I can do better than the existing solutions out there
Plus I'm not experienced enough with Vulkan to make a great abstraction
That said, using Vulkan more is on my list
The other problem with making an abstraction for Vulkan is that there isn't as much low-hanging fruit (mostly just auto filling structs and hiding useless stuff like the allocator parameter)
Most things that are helpful when using Vulkan are difficult to implement and infinitely bikesheddable, like automatic barriers or automatic descriptor set creation
yeah and generally the popular wisdom seems to be that you shouldn't even try to do that
you'll likely never beat driver implementers at making a fully automated general purpose solution like GL/D3D11
Using OpenGL is nice because you don't have to screw around with API plumbing much. It's just sad that it's super mega crusty, and that's what Fwog is intended to fix
most of the stuff I see in vulkan abstractions/augmentations is stuff like framegraphs to help make explicit resource/sync management less verbose and more flexible
I think that could totally have its place in a GL abstraction too though, you can gain a lot of similar benefits from pass reordering and resource aliasing
It makes me sad that all the cool new features are only in the painful APIs
though I'm not sure if aliasing is as easy in GL as vulkan
guessing it's most likely not
Resource aliasing is definitely harder in gl
You can't access raw memory
At best you can use texture views to reinterpret a texture
Or suballocate buffers from a larger one (which requires an extra layer of buffer)
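For the suballocation route, the "extra layer" could be as small as a bump allocator over one big buffer. This is only a sketch under that assumption; `BufferRange` and `LinearSuballocator` are hypothetical names and the actual GL buffer is left out entirely:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// A byte range inside one big GL buffer (hypothetical type, not Fwog API).
struct BufferRange
{
  std::size_t offset;
  std::size_t size;
};

// Hands out aligned ranges from a fixed-size parent buffer, front to back.
class LinearSuballocator
{
public:
  LinearSuballocator(std::size_t capacity, std::size_t alignment)
    : capacity_(capacity), alignment_(alignment)
  {
    assert(alignment_ > 0);
  }

  // Returns nullopt when the request does not fit in what is left.
  std::optional<BufferRange> Allocate(std::size_t size)
  {
    const std::size_t offset = (head_ + alignment_ - 1) / alignment_ * alignment_;
    if (offset > capacity_ || size > capacity_ - offset)
    {
      return std::nullopt;
    }
    head_ = offset + size;
    return BufferRange{offset, size};
  }

  // Forget all ranges so the next pass can reuse (alias) the same bytes.
  void Reset() { head_ = 0; }

private:
  std::size_t capacity_;
  std::size_t alignment_;
  std::size_t head_ = 0;
};
```

Calling `Reset()` between passes is the aliasing part: two passes end up with ranges at the same offsets, so it's on the caller to make sure the first pass is done with its data.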
Resource aliasing seems like a huge brainworm that doesn't give much benefit tbh
Maybe it'll save a hundred MBs of VRAM if you have some MS 4k textures that can be reused
yeah thats how I felt initially because I have a pretty simple frame flow, but once you start to notice how many secondary textures you have from various techniques it starts to get more attractive
at least to me because its ez pz with vulkan and VMA
byeah
pretty hard to tell without actually measuring it, but I think the real wins probably come from improved cache hit rate moreso than the memory savings
but that also helps
if you're big brain in gl, you'd just allocate and delete textures when needed and let the driver alias memory
I don't see how aliasing would help anything other than memory consumption
I would assume if you're using the same region of memory more often, it's more likely that some of it would remain in the cache for longer
but that's 110% speculative
its all about feeding that wurm
I finally found what the water mask in Teardown is used for
it's used to modify the t value in TAA
it seems kinda like the reactive mask for FSR 2, which is used to indicate un-reprojectable pixels (e.g., like a scrolling texture on an unmoving object)
hmm it's actually something else or is it
it's used to influence t in this computation, where minNewColor and maxNewColor are the mins and maxes of the local color in the current frame
oldColor = mix(oldColor, clamp(oldColor, minNewColor, maxNewColor), t + waterMask);
so it seems like it's kinda a mask that tells you how little you can trust an old sample, and since the ocean is dynamic AF you can't trust the old sample at all (no clamp = you trust the old sample completely)
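Single-channel sketch of that blend, assuming the clamp is meant to pull oldColor into the [minNewColor, maxNewColor] range of the current frame. Saturating the weight to [0, 1] is my own addition, and `Mix` and `ResolveHistory` are made-up names, not anything from Teardown's shaders:

```cpp
#include <algorithm>
#include <cassert>

// GLSL-style mix() for one channel.
inline float Mix(float a, float b, float t)
{
  return a + (b - a) * t;
}

// Blends the history sample toward its clamped version.
// weight 0 -> history is trusted as-is, weight 1 -> history is fully clamped.
inline float ResolveHistory(float oldColor,
                            float minNewColor,
                            float maxNewColor,
                            float t,
                            float waterMask)
{
  assert(minNewColor <= maxNewColor);
  const float clamped = std::clamp(oldColor, minNewColor, maxNewColor);
  // Assumption: keep the combined weight in [0, 1] so water cannot overshoot.
  const float weight = std::clamp(t + waterMask, 0.0f, 1.0f);
  return Mix(oldColor, clamped, weight);
}
```

With `waterMask` at 1 the history always ends up inside the current neighborhood range, which matches the "can't trust the old sample at all" reading for the ocean.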
makes kinda sense
idk what to name the function for this issue
https://github.com/JuanDiegoMontoya/Fwog/issues/35
Fwog::SomeOpenglFunctionsHaveBeenCalledSoInvalidateAnyAssumptionsYouMightHave()
Fwog::ClearCachedState or Fwog::InvalidateCachedState maybe
im not a fan of this as free floating function
but name wise id go with InvalidateState
it doesn't invalidate all state though 
"Cached" implies you are operating on a cache via other function calls
you could just invalidate all state tho, the ones you are in active control and reset the others to their defaults (ie could take a capture with nsight and look at all the state which you never touched)
fwog::Poop
I cache the last pipeline state (plus a few other bits) for state deduplication
but there's other state that shouldn't be touched, like the vao cache, the fbo cache, and the sampler cache
the state deduplication stuff is what I'm worried about since I want to allow a limited amount of fwog + raw gl (for masochists)
What would be the benefit of that over fwog/raw opengl
fwog can't do everything, so allowing some raw gl calls would be a sort of escape hatch
all the user has to do is tell fwog that they made some calls that changed pipeline state, then fwog can discard any assumptions it made about the current GL state
fwog could assume defaults for all state
if fwog can't make any assumptions then you end with a lot of duplicated state calls and it makes everyone unhappy
Hmm interesting, so do opengl drivers not have these kinds of caches already?
they probably do, but
- we can't trust the driver (opengl moment)
- some state calls can add validation overhead regardless
Like "oh they're asking for the same thing that's already set, lets do nothing"
I see
if you cover all of them then they can go into the cache and therefore wont be called
Whew
and 3. nsight gets cluttered AF when you duplicate state calls
wdym
and what does it mean for fwog to assume defaults
hang on
"In God Driver We Trust"
hehe
when you take a capture with nsight for instance
and you go through all the stages
you see state set to defaults
like here clip_depth_mode
or _clamplicated_read_color
etc
if you expose them in your create/info structs
then you can toss the structs into a cache
frog eaters could change their values, most likely wont, some might even relate to FFP or otherwise old crap
but they are tracked and therefore can be cached
my cache basically holds a pipeline state struct already
and when you ::InvalidateState() you shrimply reset those values back to their "defaults"
so you want the user to be able to choose what state gets invalidated?
my idea was to set a shrimple pointer to null
when you call InvalidateState the idea is that all those parameters (if different to former default) get defaultified, if they are not different then they dont -> no api calls, but all state is covered and you dont have to expose invalidation in various forms/stages
idk if that makes sense the way i wrote it
so you want the user to themselves define what state has changed and put it in this struct?
im not exactly sure i follow
I'm not exposing any cache to the user fyi
this is all internal stuff
yes
the only external thing is the InvalidatePipelineStateCache() thingy
BindXXXPipeline will call Invalidate...
BindPipeline calls only invalidate the state that is different since the last pipeline was bound
exactly
the point is that if external GL calls were made, you can't assume anything, so you invalidate everything (which is exposed via this new function we're talking about)
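Toy version of the dedup-plus-invalidate idea, assuming the tracked state is just two toggles and that "invalidate" means forgetting the last known state (the null pointer idea). `PipelineState` and `StateTracker` are hypothetical and the GL calls are only counted, not issued:

```cpp
#include <cassert>
#include <optional>

// Hypothetical, heavily trimmed pipeline state. A real one tracks far more.
struct PipelineState
{
  bool depthTest = false;
  bool blend = false;
};

class StateTracker
{
public:
  // Applies `desired`, skipping anything that matches the last known state.
  // Returns how many state calls would have been issued.
  int Bind(const PipelineState& desired)
  {
    int calls = 0;
    if (!last_ || last_->depthTest != desired.depthTest)
    {
      calls++; // glEnable/glDisable(GL_DEPTH_TEST) would go here
    }
    if (!last_ || last_->blend != desired.blend)
    {
      calls++; // glEnable/glDisable(GL_BLEND) would go here
    }
    last_ = desired;
    return calls;
  }

  // Call after raw GL calls: nothing is assumed anymore, so the next Bind
  // sets every tracked piece of state again.
  void InvalidatePipelineState() { last_.reset(); }

private:
  std::optional<PipelineState> last_; // empty means "current GL state unknown"
};
```

Binding the same pipeline twice costs zero calls, a changed toggle costs one, and after `InvalidatePipelineState()` everything tracked is set again regardless of what the raw GL code actually touched.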
if GL_PIXELSTORE_BLABLA was not changed then you wont be calling glPixelStorei(..., whateverdefaultvalueOrNewValue)
exactlyment, invalidate everything
with everything i also mean everything
that was already the plan 🐸
not just the 3 or 6 things fwog is covering/tracking right now
hmm
also the things you can find in a nvidia capture (looks like its showing quite a lot of things state wise)
the only things that need to be invalidated are explicitly what Fwog tracks
no idea if RGP will also show more things
this is to avoid having bugs where fwog erroneously deduplicates some state
if you allow arbitrary glcalls next to fwogisms, then fwog would need to track all state
well that's easy, RGP doesn't support GL except on mesa where some weird subset of it is supported
a state reset function? Fwog::revolucion()
c++ supports unicode, c with the tailthingy
Fwog::β()
Ç++
honestly
I don't think so
it only needs to invalidate what it tracks for correctness
the arbitrary GLisms must assume that fwog has changed everything as well
as long as you dont call it "...Cache..."
InvalidatePipelineCash
you expose no other "...Cache..."ism in Fwog but that one XD
InvalidateJohnnyCash()
I can't think of an accurate name that doesn't use cache tbh
InvalidatePipelineState
good enough
Reset is shorter
your mom is shorter
Invalidate is moar descriptive
true

ResetPipelineState
hmm
sounds a little misleading
but not bad either
Fwog::Reset(ResetWhat::All); XD
ye i was just playing
I did consider it tho
mayhaps for things where you dont want to reset everything
but maybe just texturebindings/resourcebindings/bufferbindings
because you are in between 2 passes of sorts still within the same pipeline

people shouldn't be binding the same resource in a pass multiple times anyways
i was finking about the ZeroThingyBindings() which could have been part of the invalidate/reset thing
ye but they might bind a different buffer
lets say you have in pass1 bind ubos to. 0, 1, 2... and in pass2 binds a different buffer to 0
ah no
makes no sense
ignore what i said
the ZeroResourceBindings was just for debugging, but I think it does fit to put in the InvalidatePipelineState function
that would mean different shaders, and different shaders are different pipelines again which are covered by the deduplication
ZRB clutters the capture quite a bit, but its a useful tool
yeah I have it so that it's disabled in release mode
mwa ossi
there's about a thousand ways this can subtly fail so that's going to be fun
can't wait
hehe i dont think its going to be a probwem tbh
only people like martty who try to fuck with you will make a fuss about it
I've had many spoopy subtle bugs with state tracking already
and technically, over time fwog could still keep track of the other glisms as part of its state
and in the long run it covers all the toggles and bits
by exposing that functionality
pipelinestate doesnt handle packing/unpacking flags right now for instance
but pixelops are more or less covered
I think I can just set the pixel store alignment to 1 and assume it will never change except when the state is invalidated
since fwog will never touch it after that
ye thats what i meant with assuming defaults too
you either set them to a known value you think makes sense or go with the actual default value
there is only very few things not covered by fwog, from the looks of it
a handful of things from here
packing/unpacking flags, and thats it
apparently unconditionally setting the pixel alignment to 1 is a bad idea
https://stackoverflow.com/questions/11042027/glpixelstoreigl-unpack-alignment-1-disadvantages
GL_CLAMP_READ_COLOR where viewport/depthclamp is involved
ye derhass mentioned it somewhere as well iirc
i assume thats a reply from nicol bolas or him or reinhard
stinky image formats
ah it is
I think I won't touch this and let the user deal with it
ye
maybe at some point I can add the rowPitch and whatever nonsense to the texture upload struct
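Why the alignment matters, as a small sketch: GL assumes each source row starts on a GL_UNPACK_ALIGNMENT boundary (default 4, allowed values 1, 2, 4, 8). This assumes 8-bit components and GL_UNPACK_ROW_LENGTH left at 0; `UnpackRowStride` is a made-up helper, not Fwog API:

```cpp
#include <cassert>
#include <cstddef>

// Bytes GL steps between rows when reading client pixel data.
// Assumes 8-bit components and GL_UNPACK_ROW_LENGTH == 0.
inline std::size_t UnpackRowStride(std::size_t width,
                                   std::size_t channels,
                                   std::size_t alignment)
{
  assert(alignment == 1 || alignment == 2 || alignment == 4 || alignment == 8);
  const std::size_t rowBytes = width * channels;
  // Round the tightly packed row size up to the alignment.
  return (rowBytes + alignment - 1) / alignment * alignment;
}
```

A 3-pixel-wide RGB8 image has 9-byte rows, but with the default alignment of 4 GL steps 12 bytes per row, so tightly packed data comes out skewed unless the alignment is set to 1 or the rows are padded.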
and the 5 or 6 texture_view_isms can be treated the same or are irrelevant
that means fwog covers pretty much everything already
just the deduplication might not be perfect yet - i think thats why we are talking about it right now
we have a few people who can reach into khronos committees, do we not?
OpenGL 5.0 FwogGL
yeah let's get @ Thomas to design OpenGL 5.0, the implicitest API 🤰
we could rename all the fwog calls to glBeginPipeline() etc
enable it with GL_FWOG_modern_opengl
lel
imagine being so crazy that you set GL_UNPACK_ROW_LENGTH to something other than 0
-1
I like how glPixelStoref exists but all it does is convert the arg to int
hehe
looking at the docs for glPixelStore make me feel like I aged 30 years in seconds
they really must have smoked moldy mushrooms back when
but
when i think about glBeginPipeline/glCreatePipeline/etc
a lot of the pseudo code exshrimples of things like MDI also do stuff under the hood
why not fwog's contribution to the opengl society
and turn it into an actual RFC like all the extensions are
: D
they just need to make this core
https://registry.khronos.org/OpenGL/extensions/NV/NV_command_list.txt
uh except it uses crusty-ass GLisms still
void CreateStatesNV(sizei n, uint *states);
void DeleteStatesNV(sizei n, const uint *states);
boolean IsStateNV(uint state);
void StateCaptureNV(uint state, enum mode);
🤢
it has pipeline state objects though
true
you make them by capturing the GL state with those funcs
even the stuff that makes GL good (like this ext) reeks of GL
sometimes glPush/PopAttrib would be nice too
glPushAttribs(All); ImGuiDoesItsThing(); glPopAttribs();
anyway, ttyl
another bikesheddable topic perhaps is the topic of buffer mapping
https://github.com/JuanDiegoMontoya/Fwog/issues/42
it's quite unimportant though. I'm just trying to cross stuff off the issues list
well there is always the q of, ye this is weird/shit but is there a driver that likes that path
also the question of old windows
which had a problem with persistent maps
uh do you mean the one that can trash mapped buffers
not necessarily persistent ones btw
I'm not even going to think about it tbh
iirc it's just windows xp which no one should be using
glUnmapBuffer returns GL_TRUE unless the data store contents have become corrupt during the time the data store was mapped. This can occur for system-specific reasons that affect the availability of graphics memory, such as screen mode changes. In such situations, GL_FALSE is returned and the data store contents are undefined. An application must detect this rare condition and reinitialize the data store.
yeah I'm never going to worry about that since it's completely absurd
How often does this happen? On Microsoft Windows 5.1 (XP) and below, video memory could get trashed anytime an application didn't have input focus. This is why alt-tabbing away from games takes a long time to recover from; the application/OpenGL has to reload all of this data back to video memory. Fortunately, on Windows 6.0 (Vista) and above, this is fixed; Windows itself manages video memory and will ensure that all video memory is retained. Thus, at least theoretically, this should never be a problem on Vista or above machines.
it's on khronos.org so it can't possibly be wrong
https://www.khronos.org/opengl/wiki/Buffer_Object#Buffer_Corruption
zamn, my zest for vulkan has been reinvigorated for some reason
I can't find a source for my memory fragment
I just assumed we were talking about the same thing
I may have spread fear and misinformation into your heart by careless cerebral query
misinfo (it caused me to google something and find maybe actual info)
is that really a problem?
provide facilities to map/unmap and unmap implicitly on buffer destruction
More of a clean API thing I guess
It seems to me like regular (non persistent) buffer mapping is rather useless when you have get/sub data as well as persistent mapping
w.r.t. the issue I linked, I don't know exactly what happens under the hood when you map/unmap a persistently-mappable buffer, so maybe I won't bother with the auto mapping thing
that's basically what I do, I only have persistent mapping and unmap on destruction
what I also find convenient is having my map() return a uint8_t* by default instead of void*
what made you choose that over void*?
i have std::byte*
No need to cast for doing address maf before memcpy
and if you do need to, it is the same
ez pointer arithmetic
basically that, not sure what makes std::byte better than uint8_t tho, but maybe that's just ultrabikeshedding
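What the byte pointer buys you, as a sketch: offsets can be added directly before the memcpy, which a void* would not allow without a cast. `WriteAt` is a hypothetical helper and `mapped` stands in for whatever a map() call returned:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `size` bytes from `src` into a mapped range, `offset` bytes in.
// `mapped`/`mappedSize` describe the mapping (hypothetical, no GL involved).
inline void WriteAt(std::byte* mapped,
                    std::size_t mappedSize,
                    std::size_t offset,
                    const void* src,
                    std::size_t size)
{
  assert(offset <= mappedSize && size <= mappedSize - offset);
  // Pointer arithmetic on std::byte* is in bytes; no cast needed.
  std::memcpy(mapped + offset, src, size);
}
```

As for std::byte vs uint8_t: std::byte only has bitwise operators, so it can't be added or multiplied like a number by accident, while uint8_t is an ordinary integer type. For address math on the pointer itself they behave the same.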
when measuring time on the gpu via gpu timers or tracy or whatever else you like using
would you just wrap the actual draw calls or the whole pass including binding resources/setting state?
I just wrap whole passes/scopes
putting a query on each draw call will generate too much noise
right
in all the ex🦐les
ye
i mean if all you use is glDrawIndexed, then it might get noisy
but if all you do is mdi
+- the fst draws
and right now I don't have automated timer query placement
I don't think it's a good idea to automate that anyways
ye, I was just thinking aloud
one could automate it the same way push/pop groups are placed, would cover the same scope
I have a utility for that already
https://github.com/JuanDiegoMontoya/Fwog/blob/main/include/Fwog/Timer.h#L63
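Not a copy of the linked Timer.h, just a sketch of the "wrap the whole pass" shape: an RAII scope that fires a begin hook on construction and an end hook on destruction. The hooks stand in for something like glBeginQuery(GL_TIME_ELAPSED, id) and glEndQuery(GL_TIME_ELAPSED); `ScopedPassTimer` is a made-up name:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Runs `begin` now and `end` when the scope exits, so one timer covers
// state setup, resource binding, and every draw in the pass.
class ScopedPassTimer
{
public:
  ScopedPassTimer(std::function<void()> begin, std::function<void()> end)
    : end_(std::move(end))
  {
    assert(begin && end_);
    begin();
  }

  ~ScopedPassTimer() { end_(); }

  // Non-copyable: each scope must end exactly once.
  ScopedPassTimer(const ScopedPassTimer&) = delete;
  ScopedPassTimer& operator=(const ScopedPassTimer&) = delete;

private:
  std::function<void()> end_;
};
```

Putting one of these at the top of each pass gives one measurement per pass, the same scope a push/pop debug group would cover, instead of one per draw call.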
@long robin just realized that the other teardown teardown guy is in this server 
not actually active though
i was also finking about fwog
to bring it closer to vulkanisms, would it make sense to also have shader compilation go through a pre/postbuild pipeline via spirv and have fwog consume spirv only
it would make sense, except spirv isn't supported well enough
wasnt that one of the major selling points of gl4.8?
or because spirv has progressed a lot since that?
last time I tried spirv on AMD, some of my shaders randomly didn't work
so I've given up on the concept
hmm I'm not sure what the benefit is
to have post/pre compile pipeline going at least
and when you transition to vk, you disable the spirv->glsl part again
I'd have to embed a spirv compiler if I didn't want it to be painful to use
i wonder if that could be pulled via cmake, or one assumes vksdk is installed
I've used shaderc with gl for runtime spirv compilation before
4.8?
4.7*
4.7?
