#Fwog and co.
1 messages · Page 2 of 1
btw I'm thinking of typed textures
it would be maybe useful to separate color textures and depth/stencil textures
how would it look like?
I'm still thinking about it
perhaps it will come to you in a dream
maybe typed texture views I guess
but yeah, could be worthwhile
Tear has been asking for typed buffers for a while now
similar boat
I was looking at my render target struct and realized that it's pointless to combine color render targets and depth/stencil render targets as the same struct
and it just makes things more complicated with union shit
ClearValue wouldn't have to be a variant/union if I just split RenderAttachment into RenderColorAttachment and RenderDepthStencilAttachment
https://github.com/JuanDiegoMontoya/Fwog/blob/main/include/Fwog/Rendering.h#L40
I also realized that, since you often know if it's a color or depth/stencil texture at compile time, these could be separate types
there would be a few places that could benefit from the additional type safety
the only case I see for super duper typed textures like Texture<RGBA8_UNORM> is for uploading data and idk if it's worth it
at the same time, buffers are written to often enough that I can see the benefit for those
oh wait I already added typed buffers 
it would also be really nice to not have to interact with std::variant<std::array<float, 4>, std::array<uint32_t, 4>, std::array<int32_t, 4>> data; when I'm clearing a texture
the fact that depth and stencil are sometimes separate and sometimes one thing is really annoying
ok I just had a 
I could completely remove this ugly variant from the user's view (or at least prevent them from caring about which type is active in some cases)
#pragma region to the rescue
I could just secretly check the texture's type and convert from the variant's type to the one needed to clear the texture. Or just assert if it doesn't match (which is what I currently do)
very secret, very cool
I feel like a secret conversion would lead to subtle bugs in some cases where you're clearing to a nonzero value
can you show?
there is an issue if those are selected runtime
eg. maybe you have quality settings
yea
but my theoretical plan wasn't to force the usage of those
just have them derive from Texture and then add extra safety on top
as a treat
hmm, that reinterpret_cast could be a static_cast
noep
good idea though, will add issue
I do assert if the buffer is still mapped at destruction time, so I guess that counts for something
3 pieces of reddit silver
idk how exactly I should design the RAII thing returned by mapping
maybe it should have move semantics if I want to change the lifetime of the mapping?
but std::lock_guard doesn't offer move semantics (and probably for good)
why'd you put a lock in there?
lol sry that was just for comparison
they are kinda similar in some ways
I was mainly thinking of edge cases from normal mapping, like mapping once at the start of the program
normal mapping 
because the lifetime needs to be elongated in the case of persistent mapping
eh or I could just offer both unsafe and scoped mapping as a cop-out
depending on dependent things, you can do what is idiomatic in vk
persistent maps for some things on creation
no other mapping, no explicit mapping and unmapping
its a cop movie
i am not sure intermittent mapping is worthwhile
i think subdata is better for that on some drivers
ye
so I guess the only purpose of mapping now would be for some persistent mapped thing, or for reading the data
but for reading I can just use glGetBufferSubData all the same
so really just persistent mapping is the compelling use case
then I guess unmapping in the buffer destructor is fine. I can't think of a reason you'd want to explicitly unmap a persistently-mapped pointer
maybe some drivers put persistently mapped buffers in some small region of memory?
I sure do love how easy it is to be certain with opengl
almost makes me wish for a nuclear winter
i used to be a cpp enjoyer like you
then i took an operator-> to the knee
at least not an std
knee stds 
cpp has lots of stds, who knows maybe they'll add std::knee in C++30
I just realized @golden schooner isn't in this thread for some reason
nvm I'm just blind 
was i supposed to comment on something?
I don't need any template stuff at all for my buffers / textures
oh wait, buffer has map which returns span<T>
returning a span is a good idea
this is depressing to read (this is in the opengl3 imgui backend in the repo)
https://github.com/ocornut/imgui/issues/4468
// Upload vertex/index buffers
// - On Intel windows drivers we got reports that regular glBufferData() led to accumulating leaks when using multi-viewports, so we started using orphaning + glBufferSubData(). (See https://github.com/ocornut/imgui/issues/4468)
// - On NVIDIA drivers we got reports that using orphaning + glBufferSubData() led to glitches when using multi-viewports.
// - OpenGL drivers are in a very sorry state in 2022, for now we are switching code path based on vendors.
at least the Intel driver bug was apparently fixed, according to the author of the issue
||8 days ago 💀||
OpenGL drivers are in a very sorry state in 2022
Just move onto Vulkan and forget about all the OpenGL
it was all just a bad dream
it's over now
maybe someday
when the promise that all old devices will get vulkan support be fulfilled (never)
actually, I don't care about old devices
fwog on zink
I don't understand what clang-format is aligning stuff to here
Fwog::SwapchainRenderInfo swapchainRenderingInfo{
.viewport =
Fwog::Viewport{
.drawRect{.offset = {0, 0}, .extent = {gWindowWidth, gWindowHeight}},
.minDepth = 0.0f,
.maxDepth = 1.0f,
},
...//some other stuff
my indent size is 2
maybe it's aligning initializers with the end of the namespace 🤔
clang format is still too limited
looks like it always aligns it to four spaces in this scenario 😦
longlong::FooButWithARatherLongName myincrediblyLongNamedFoo{
.bar = longlong::Bar{.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0,
.bbbbbbbbbbbbbb = 1,
.cccccccccccccc = 2,
.dddddddddddddd = 3,
.eeeeeeeeeeeeee = 4,
.fffffffffffffff = "hello",
.baz = "world",
.gg = 1}};
clang format does really bad with the named initialization sort of stuff
i tried for a while to get decent settings and couldnt
there also aren't enough penalty options
it likes to break in some really annoying spots
this is good and normal
ignore the red squigglies, my intellisense is having a syndrome rn
quite sad that clang format still cant do simple stuff, but ye, more important that it does something
i wonder why nobody reinvented some c++ formatting thing "made with rust" yet
lol
btw pro git tip
you can exclude the commit that reformats the entire repo from blame
ye, rewriting history
I actually don't use git blame like ever 🫠
it's pretty easy when you don't write bugs
ah, thought this might be something like that 🙂
the .git-blame-ignore-revs thing was pretty easy to add 
remember to let devsh know this any time he posts any of his crunk-ass-unformatted nabla code
what should I use instead of std::aligned_storage since it's going to be deprecated? std::optional?
https://en.cppreference.com/w/cpp/types/aligned_storage
can you render https://sketchfab.com/3d-models/eas-agamemnon-ebcfdc41c19a40b9b06020b7bb31515e out of the box?
ugh I don't want to spam * and -> everywhere
I can try
where is the download button
ah it's stealthy
hmmmmmmmmmmmmmmmmm
it's just bright due to the lighting btw
ja
it seems to render properly
no hay de que
Yeah it's pretty sad, the proposal is to encourage writing your own byte array using std::byte
with
template <typename T, size_t A = alignof(T)>
struct MyAlignedStorage {
alignas(A) std::byte storage[sizeof(T)];
};```
I wish there was a way to make it act "like" the object it's potentially storing without having to change the usage syntax everywhere
basically I just want a deferred init object
guess I'll just use an optional
you could have an implicit conversion operator
yeah but I couldn't do myThing.foo() still
can you change A to Alignment?
hmm
at least I could use it for function arguments
you could wrap the aligned storage to some kind of Convert<T>
so like, Convert<AlignedStorage<YourThing>>
maybe I could make it a union
@long robin https://godbolt.org/z/Wd19Pxr8z
I managed this so far
template<typename T, size_t A = alignof(T)>
struct AlignedStorage
{
operator T()
{
return *reinterpret_cast<T*>(storage);
}
template<typename... Args>
void emplace(Args... args)
{
std::construct_at(storage, std::forward<Args>(args)...);
}
~AlignedStorage()
{
std::destroy_at<T>(storage);
}
alignas(A) std::byte storage[sizeof(T)];
};
ye but I think overloading operator -> is the best course of action right now
ye
unless you want to T(storage) everywhere
also remember to launder
I'm not 100% sure it is necessary
brb lemme look at lifetime rules for the 100000000th time
[basic.life] here we go
I guess std::launder can't hurt
basically just sprinkling a little voodoo into the mix
std::launder is the MSG of C++ programs
man that is a big model
77mb 1k textures
I should try it
heh
cool model
fun fact, it took my ktx converter 2 minutes to convert all the textures
and it's AVX2 
4k textures though?
ye
thicc
had to scale down but works
you don't mipmap
ye
but opengl generates them on GPU maybe? Idk
so wouldn't make a difference (probably)
actually the only reason I don't mipmap is because the 33% increase in memory usage makes new sponza not run on my PC 

💀
ye, the driver will do it on the GPU when you call glGenerateMipmap
unless it's a bad driver
by the way it took less than 2 minutes to load even with CPU mipmap generation in debug build
I also use ktx lib for compression, it also compressed them to bc7
the only difference may be is that I have it all multithreaded
if you use ktx why are you generating mipmaps at runtime? 
that looks great
honestly not sure but why not
yours too void
heh fair enough
took me a second to see the specular reflections 🙂
I prefer precomputing mipmaps since ktx allows you to
it makes loading a texture blazing fast
there is nothing to do at runtime, you simply copy over the bytes to the GPU
the mipmaps are already there
ah fair enough
then I think your method is better than mine
but I can't be bothered generating mips on the GPU tbh
if you're compressing them you can't even do that
blitting to compressed formats is not supported anywhere
I was just checking that lol
big sadness
Treat compressed textures as read-only
comperssed textures are also on my todo list
check out my tuned agamemnon
sometimes I like playing with materials for no reason whatsoever
: )
lovely
it looks great in that style
and sorry jaker for flooding your thread with offtopic
its std::optional, albeit not aligned
I want to add dxt (s3tc) support, but I haven't found a nice library for loading them yet
stb only has a library for writing to them
hmm maybe that could be useful for converting
You could fix a memory budget and then set lod range per texture / load only selected mip levels to load virtually any content
yeah
Have you tried the 'putting , for the last member' trick?
I'll try that in a bit when I'm back on
I get this output (using the clang-format from Fwog). : cpp FooButWithARatherLongName myincrediblyLongNamedFoo{ .bar = longlong::Bar{ .aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 0, .bbbbbbbbbbbbbb = 1, .cccccccccccccc = 2, .dddddddddddddd = 3, .eeeeeeeeeeeeee = 4, .fffffffffffffff = "hello", .baz = "world", .gg = 1, }, };
ah yeah I already have a trailing comma in the real example I have 😦
oh well, it's not too bad
and I'd rather have this than no auto formatting
@viral haven the trailing comma trick comes in handy in quite a few places though. I'm glad to know it
Glad to know it's helpful for you 
@long robin
when would you use the _REV version of your formats in fwog?
like USHORT_5_6_5_REV,
or in opengl in general
some file formats just have the data specified weirdly like that
in one of my first graphics classes we encountered some BGR textures
thats not what i wanted to hear
but it means, they could be safely omitted in fwog
the _Rev ones at least
maybe I can go through and look at support for the optional ones
would reduce clutter here and there i suppose
you assert anyway
i saw that at UploadType just now btw
@long robin I suppose it is not viable nor useful to make multiple cascades of an RSM right?
it could be useful. I don't see why not
The indirect illumination should only be computed on what's inside the view frustum
Or am I missing something
Wait let me read the paper again
the work is already done on a per-pixel basis
the idea is that you sample the RSM many times for each pixel on the screen
the work is not done in light space (except for rendering the RSM itself, which is a relatively cheap scene pass)
Yeah ok
but yeah the paper is like 5 pages so I guess read that
@golden schooner I'm not sure how relevant it is, but this page mentions using BGRA for better perf on windows
https://www.khronos.org/opengl/wiki/Reverse_Color_in_Texture
iirc NV only supports reversed texture formats for the swapchain in Vulkan, so maybe that has something to do with it
hmm, I think I GL_DONT_CARE about this
i know d2d1 uses BGRA, thats why you have to create the swapchain with support for that, if you want to have 2d shizzle on top of your 3d nonsense
also TIL, i thought REV was some sort of Revision 🙂
ah
now it makes sense
UBYTE_3_3_2,
UBYTE_2_3_3,
should also be
UBYTE_3_3_2,
UBYTE_2_3_3_REV,
according to https://registry.khronos.org/OpenGL/specs/gl/glspec46.core.pdf page 198
ye the other _REVs are not correctly named either 😉
Hmm I thought I copied OpenGL's sketchy-ass naming convention
Oh well, I'll check it again tomorrow after I sleep
just assign it to me
ill take care of it when im back
you got more important things to implement 😛
Ok 
cant assign it myself for some reason
I did it
Does this library have examples for dynamic sparse GPU octrees?
I can't seem to figure out how to implement a good construction algorithm for those
fwogs sources are weirdly spread out 🙂
Can you explain
sadness, do you have any resources on it, by chance
I see
You could join the voxel dev server and ask
There are some papers people can point you towards
It's in #related-servers I think
im trying to implement your deferred rendering example in EngineKit and was kinda lost finding impls for the pipeline stuff
its like split across 4 units somehow
but its just a me thing anyway
What stuff exactly?
BindGraphicsPipeline
i now know wher it is, after rembering it hiding in detail/
Isn't that in Rendering.cpp
in Rendering.cpp tho
ye
in Rendering i would expect something like an implemented renderpass of sorts
not BindXXX
I see
like i said, its just me
I just threw all the Cmds in there
yep
there are also stil some deadbody logic op things in there hehe
thought we removed those, mmh i could be rong
I guess I left them because it was easier than removing them
also possible
It also sucks when you have to initialize a complex object, like a pipeline, in an initializer list
Another use case for the DelayedInit class
Doesn't solve every case
Plus, I can fix my particular problem by using optionals instead of pointers in certain places
what do you mean with every case?
with a builder you can also counter wrong input or states which make no sense, when you are able to edit the descriptor willy nilly, you have to validate it somewhere else, and that pollutes the other place with code which shouldnt be there - BindXXXPipeline for instance
I mean where you actually need to run some logic to initialize something. But I guess you can use a lambda
lambda could work too, although it will uglify the code where you build it, more indentation more {} and =>s
hmm can you actually assign normal methods as well and pass them instead of lambdaise the code?
something like
Foo(param => Console.WriteLine(param));
vs
Foo(DoSomeShit);
// or
Foo(param => DoSomeShit(param));
void DoSomeShit(string param)
{
Console.WriteLine(param);
}
with function pointers or std::function you can
ok
I guess I can just show you what I come up with when I'm done with the refactoring, then you can tell me if it's poop
Imagine having to use the ->* operator
struct X {
void x(...) {}
};
void f(X* x, void (X::*f)(...)) {
(x->*f)(...);
}```function pointers to member functions were a mistake
that's why we had std::bind
and now we have lambdas™️
erhe mostly avoids pimpl
clang-format is once again triggering me
it seems to be ignoring BreakConstructorInitializers
it only respects BreakConstructorInitializers: 'BeforeComma' which looks like ass
better than the abomination shown in the first pic I guess
that oxford comma actually didn't influence anything
oh mama I fixed it
ConstructorInitializerAllOnOneLineOrOnePerLine: 'true'
also my other problem with everything being indented by 4 even though I specified an indent width of 2 is fixed
turns out there are like five options for influencing indent width, and they all affect different things
my skin is feeling smooth
the potion of lignification is wearing off
the ligmafication potion is now kicking in
and I think the potion of marttyfication is making me experience psychosis
in a good way?
as a (soon™️ to be) medical doctor I can assure early onset psychosis does feel good for the patient
mfw swapping delayed init objects is really cursed
guess I do need a bool to indicate whether the object has been initialized after all
wots delayed init object
aligned storage + helper functions
why is it cursed?
I'm pretty sure you can't just swap the object representation of two objects
plus I do this totally-not-sketchy thing where I call the managed object's destructor no matter what
template<typename T>
struct DelayedInit
{
template<typename... Args>
void emplace(Args... args)
{
std::construct_at(&value(), std::forward<Args>(args)...);
}
operator T&()
{
return value();
}
T* operator->()
{
return &value();
}
T& value()
{
return *reinterpret_cast<T*>(storage);
}
void destroy()
{
std::destroy_at<T>(&value());
}
~DelayedInit()
{
destroy();
}
alignas(T) std::byte storage[sizeof(T)]{};
};
it has... several problems
turns out what I actually want is std::optional
sensible
I discovered the cursed problems when I tried to std::swap two DelayedInit objects
std::swap is implemented with three moves and a temporary
when the temporary dies, it's murdering one of the objects I care about with my DelayedInit
well yeah, you need to add move support
thats already an issue if you want to have this as a member
object my beloved
but yeah just using an optional would be easier
ah yeah
plus I don't have to worry about manually destroying the thing if I want to emplace into it again
is DelayedInit<T> something like a Lazy<T>? which gets its value when you read from it the first time
name is really confusing
i was seriously curious
the code for it is above
I don't know what Lazy<T> is, but your description sounds like not the same thing
ok
DelayedInit is supposed to allow you to call the constructor later
also I just noticed that the framebuffer cache in Fwog is broken when you delete textures
cuz the cached framebuffers using that texture don't get removed as well
good find
discovered after I noticed that window resizing wasn't working for some reason
ah and it sounds exactly like lazy 🙂
im currently also busy trytig to replicate deferred.cpp into EngineKit
the difference is that DelayedInit does no lifetime tracking on its own (it only holds aligned storage), so it's easy to create bugs with it
manually creating shapes is a pain in the butt 🙂
I'm refactoring deferred.cpp right now
in gltf_viewer I load a gltf scene and render it with RSM
ah so i should look at gltf viewer instead
yeah
alright
the features of these examples are kinda all over the place
oh I need to reorganize those too
that people use input attributes in all sorts of orders
thats no critique at all, just an observation 🙂
since i kindof used the same order in my engines for ages
Position = 0, Normal = 1, Color = 2, Uv = 3, Tangents = 4
i suppose my vertextype enum makes no real sense then, since it dictates a specific order, and all involved shaders must follow
probalby better to provide an inputlayout vertexbinding builder of sorts, with a simple interface
or optional let reflection do the majeek
ye its really just a few lines anyway, for declaring them somewhere
hmm
now im finking
when building the pipeline
it could technically also take care of building the vertexbinding automagically
and you dont even have to worry about, when you link the shaders, you can reflect the input attributes, and with that build the vertexbinding
how so?
only probwem is, when your shaders suck, and the attributes are not used
what about normalized attributes like colors?
how would you infer that it's normalized from the shader?
maybe you could introduce HLSL semantics 🤰
I hate those fyi
you would have to force people to use those
explicit vertexbindings you have to provide, it is
Every couple of years I try clang format and it disappoints me every time
my stuff is in a broken state due to me trying to automate synchronization with render graphs
and I've decided to take a break 
Fwog has a built-in render graph called OpenGL
"graph" 🤣
the "graph" is silent
there should be a series hosted on this server
"Fwog in Action"
where people add stuff or make stuff with it
without a version number, how would we know? could be a major problem
I should patch that
looks like gltf_viewer and co are fucked atm
or somebody added a bunch of stuff which my igpu doesnt support anymore
or both
hehe
i cannot run anything from fwog anymore
getting this stuff instead
OpenGL Debug message (1): 0:137(54): error: parameter `in offset' must be a constant expression
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (2): 0:137(16): error: type mismatch
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (3): 0:138(68): error: parameter `in offset' must be a constant expression
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (4): 0:138(31): error: type mismatch
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (5): 0:138(16): error: no matching function for call to `LinearizeDepth(error)'; candidates are:
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (6): 0:138(16): error: float LinearizeDepth(float)
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (7): 0:139(10): warning: `normal' used uninitialized
Source: Shader Compiler
Type: Other
Severity: high
OpenGL Debug message (8): 0:139(44): warning: `depth' used uninitialized
Source: Shader Compiler
Type: Other
Severity: high
OpenGL Debug message (9): 0:141(43): error: parameter `in offset' must be a constant expression
Source: Shader Compiler
Type: Error
Severity: high
OpenGL Debug message (10): 0:141(11): error: type mismatch
Source: Shader Compiler
Type: Error
Severity: high
Error: Failed to compile shader source.
0:137(54): error: parameter `in offset' must be a constant expression
0:137(16): error: type mismatch
0:138(68): error: parameter `in offset' must be a constant expression
0:138(31): error: type mismatch
0:138(16): error: no matching function for call to `LinearizeDepth(error)'; candidates are:
0:138(16): error: float LinearizeDepth(float)
0:139(10): warning: `normal' used uninitialized
0:139(44): warning: `depth' used uninitialized
0:141(43): error: parameter `in offset' must be a constant expression
terminate called after throwing an instance of 'Fwog::ShaderCompilationException'
what(): Failed to compile shader source.
0:137(54): error: parameter `in offset' must be a constant expression
0:137(16): error: type mismatch
0:138(68): error: parameter `in offset' must be a constant expression
0:138(31): error: type mismatch
0:138(16): error: no matching function for call to `LinearizeDepth(error)'; candidates are:
0:138(16): error: float LinearizeDepth(float)
0:139(10): warning: `normal' used uninitialized
0:139(44): warning: `depth' used uninitialized
0:141(43): error: parameter `in offset' must be a constant expression
0
zsh: IOT instruction (core dumped) /home/deccer/Private/Code/External/Fwog/build/example/deferred
Looks like an error in the RSM shader that anonten changed
Isn't texelFetch(...).x a float
i thought so too
Oh
misplaced braces?
It's complaining about the parameter to texelFetchOffset first
ah
The offset parameter
It's a constant expression but not really
I gotta refactor that shader anyways
#define PI 3.141592653589793238462643
#define TWO_PI (2.0 * 3.14159265)
if only there were a name for two pi to shorten some characters
perhaps like TWOI or something
HALF_TAU
TAU
pls
or the lesser known (double)PI
what i usually do is
float pi = 3.14;
double pi = 6.28;```
(half)TAU
i used to play in a guild called Nok'Tau Order Malion, does that count too?
the texelfetchoffset needs to be rewritten to
vec3 normal = texelFetch(s_gNormal, coord + offset, 0).xyz;
neh?
yep
vec3 Tap(in sampler2D tex, ivec2 coord, ivec2 offset, vec3 src_normal, float src_depth, inout float sum_weight)
{
vec3 result = vec3(0);
vec3 normal = texelFetch(s_gNormal, coord + offset, 0).xyz;
float depth = LinearizeDepth(texelFetch(s_gDepth, coord + offset, 0).x);
if (dot(normal, src_normal) >= 0.9 && abs(depth - src_depth) < 0.1)
{
result = texelFetch(tex, coord + offset, 0).xyz;
sum_weight += 1.0;
}
return result;
}
that compiles
discord's syntax highlighter thinks offset is a keyword
: )
maybe it is
just noticed gltf_viewer example runs in single digit fps :>
deferred as well
RSM is probably taking a toll on your iGPU
what if you try pressing 0 to enable the filtered version?
theres no difference
you could try changing the source to reduce the number of samples or filter passes
i lied
when i run it it somewhat chuggs along, when i then press 0 it runs half speed of what was before
oof
because it's full resolution 
: )
having no indicator in titlebar or imgui sucks a little
of what is active right now
btw what happens if you increase the radius on the original impl
RSMFilteredSamples = 2, and Filtered = true runs like ass
i can see changes
also while pressing f4 its more obvious
but performance is still not good at all
I didn't know that was a thing
alt+f4 shows the window behind this one
what was the error in my code by the way
the way it is right now is weird
what why
if you inline all the code it is a constant expression
but you also pass it via a parameter that isn't const
i switchedoffset it to texelFetch and value+offset
also I don't think that igpu is suited to run any gi other than baked one
so if it runs like shit I am not surprised
very true
the previous implementation was running "faster" tho
(im not complaining)
it was exactly half res
: )
and all coherent samples even if 400
caches op
so it's harder to compare unless this is also made half res
when I started I was like wtf is this code and just made it full resolution
because wanted to cut to the chase
honestly I think that convergence on this thing is atrocious
if it wasn't as bad we could put a narrower filter on it
but alas we are sampling a big surface with a disk in 2d so idk how to guide samples better
I guess we could make it even more nonlinear to sample closer
may I introduce you to voxels
no
VXGI is the real deal™️
real meal
how do I voxelize the scene
VRAM is the real meal for VXGI
4GB 3D texture 
huh
actually I just realized I can't do more than 1024x1024x1024 because my 3070 only has 8GB of VRAM
I clearly need a 3090
can't you do cascades
clipmaps
true
clippy mappy
need to learn more about this stuff
GPU gems only has a chapter about painting meshes with octrees
I need more knowledge
RTX 4000 (the quadro variant), has 48 jibs, no?
AMD GPUs have more VRAM on average
NVIDIA Quadro RTX 5000: Real Time Ray Tracing for ProfessionalsShatter the boundaries of what’s possible with the NVIDIA Quadro RTX 5000, powered by NVIDIA Turing GPU to bring real-time ray tracing and accelerated AI to next-generation workflows. Creative and technical professionals can superchar...
erm
bro waht this is ril
the A6000
"hi yes I need a quadro RTX 6000"
"why do you need it"
"intel sponza will not voxelize itself"
rtx 8000 that is not air-conditioner sized???
I think you render slice by slice
into a d3 tex
hmm with an ortho projection?
naturally
don't you need conservative raster
unless you want to add a certain hint of perspective to your voxelized scene but that's kinky imo
I'm wondering how you'd voxelize walls if you were rendering XZ slices (top-down or bottom-up)
you'd have to thicken the walls somehow
conservative raster too ye
there was an article from novidia somewhere
about voxelization
it was around time when nvidya was flexing vxgi
What Can You Do With Voxelization? A voxel representation of a scene has spatial data as opposed to the conventional rasterization view (as stored in a render target) which just has a slice of depth value.
there it is
shall read it
I shall too
but first I gotta eat
I think you don't have to render slice by slice, I think you can disable early depth test and do image store from the fragment shader, if that's possible idk
or maybe only disabling depth test will do, because without it I think the rasterizer shouldn't try to discard anything and use painter's algo
you could probably do it efficiently with a compute shader
they really need to get an intern to go through these posts and fix the code formatting
by implementing whole softraster?
ye 😄
writing software rasterizers in hardware rasterizers
good shit
nanite does something similar too
I realize that since you can do image store in fragment shaders, you probably don't need a compute shader after all
that's the thing, CS raster outperforms hw only on small tris
except maybe you can do more efficient sw conservative raster in a compute shader
is async compute a thing on opengl
no
rip
it's an overhyped feature anyways
well you need heavy compute workloads to justify using it indeed
otherwise sync overhead kills the benefits of async comput
not only that, but you need workloads that don't consume the same resources which can run in parallel
What does RSM stand for? reflection shadow map?
reflective shadow maps
thx
Yesh I have seen that I think
funny how all yt videos I've seen have not implemented it properly
I think authors omitted the maths needed for that
the monte carlo part
what, RSM?
y
I don't think the authors themselves used the maths needed for it
that would be a plot twist
what issues do you see in the videos?
ad hoc intensities
that one problem I cracked so that fwog may be one of the physically correct-ish ones
this was also before the age of PBR, so maybe that's related
btw I think you need to fix your impl as well because why not
actually I don't know what's broken
the pdf
aka 1/pdf the weight
compare the code of illumination computation it should be obvious what to change
I think I should commit latest changes to my fork btw
there is broken ass commented out occlusion test
and cleaner code, jsut a bit
wait a sec
I will commit something
jokay
huh wait maybe I already did
fix the texelFetchOffset too while at it 🙂
I can fix that
it is
i R confuse
https://github.com/JuanDiegoMontoya/Fwog/blob/aa71014fe745af6f235076d92b1e8ad9a9b53ed9/example/shaders/RSMIndirectDitheredFiltered.comp.glsl#L112
https://github.com/JuanDiegoMontoya/Fwog/blob/aa71014fe745af6f235076d92b1e8ad9a9b53ed9/example/shaders/RSMIndirectDitheredFiltered.comp.glsl#L145
https://github.com/JuanDiegoMontoya/Fwog/blob/aa71014fe745af6f235076d92b1e8ad9a9b53ed9/example/shaders/RSMIndirectDitheredFiltered.comp.glsl#L103
that is different
this is a bit optimized because I took the constant area term out of the weight and multiply the sum by it instead
and removed unused uniform sampling because it's useless
this is the commit
you made the noise bigger
ignore any noise
I was experimenting a lot
only important thing is the shader p much
as a matter of fact you should put imgui in there and add sliders and stuff to tweak the thing
yeah that has been painful
my refactor adds imgui to every example
and automated camera controls and stuff
by the way interpolation pass may also help
so that filtered version also works in half resolution
💀
const int numGroupsX = rsmUniforms.targetDim.x;
const int numGroupsY = rsmUniforms.targetDim.y;
how did this get here
I was wondering why occupancy was like 14% and the average instructions/thread was 0.5
lmfao now perf is several times better
const int localSize = 8;
const int numGroupsX = (rsmUniforms.targetDim.x + localSize - 1) / localSize;
const int numGroupsY = (rsmUniforms.targetDim.y + localSize - 1) / localSize;
somehow the incorrect calculation was being done only for the filtered version of RSM
now these SOLs are much closer to what I was expecting for the filtering passes
I guess the scheduler was being murdered before
amd employee caught using competition's software
I use nsight all the time at work to look for competitive deficits in games
no cap
- profile game with nsight on nv gpu
- profile game with RGP on amd gpu
- compare how long each pass takes
- ???
- profit
noice
it does run faster indeed
i was also able to resurect some amd card
R9
with the old numgroups even there it ran like shit
some frametime display would be nice 🙂
if you don't enable the filtered mode, it has the correct WG count
yeah it's in the imgui stuff I'm adding in the refactor
ye one mode ran like shit, the other didnt
controls + FPS display
now both are smoof
i manhandled the code myself
epic
some people call me that
it's true
glad i got fancontrol to work too, otherwise i would have gone deaf
sounds like an industrial hoover 😄
lol
absolutely have no clue whats kaputt with this thing, took 4 hrs or so to get a working combination
it's kinda crazy how high tech electronics are manufactured in a way that makes certain issues almost impossible to diagnose or fix
ye
it was much worse 15 years ago
im suspecting the cpu is kaputt somehow, something about the C states
like, if you drill a single hole randomly in a GPU board (avoiding all capacitors and stuff) it is probably unsalvageable even though you just broke some copper traces
some important copper traces ;P
if you actually hit the die then it's unsalvageable^2
heh
these things are manufactured with actual magic
its just photographs
photolithographs 😳
ye
would be interesting to study just to figure out how humanity even got to this point
like nvidias a100? with 56 BILLION transistors
there is some sort of documentary about the first transistor
and how people were assholes back then as well
do you remember the name of it?
also, slightly related, but I've only heard people talk about reversible computing once in my life (on the voxel server of all places)
sadly incomplete FAQ: https://www.cise.ufl.edu/research/revcomp/faq.html
try not to burn your eyes looking at that site
there was something on youtube iirc
its been a while, re the documentary
the other thing i dont think i have heard b efore
it's weird because it seems like the inevitable future, but no one talks about it
this seems to be the only "for normal people" video with some popularity talking about it. Basically all the others are lectures with ~10k views
https://youtu.be/jv2H9fp9dT8
ah i know this one
that guy is pretty entertaining
indeed
- the sco'ish accent
i used to work with a super cool dude from scottland several years ago
an old fart, but best architect ive ever met lol
informeeeeeeeeeeeeeeeeeaaaaaaashn
can you also ImGui::Image all the debug textures? 🙂
I prefer having the buttons to toggle the fullscreen view of the textures
because of the 1-1 pixel mapping
however, I can make a concession for u
but I gotta go to bed so I'll do it tomorrow
ok I haven't slept yet because I was debugging
imgui samples the nonexistent alpha channel of my texture, so it's completely blank
ah, I need to change the bg color
maybe
yeah no it's all black now
I need to change the sample state of my texture with raw GL calls which is yucky
kill me
glTextureParameteri(frame.gcolorTex.value().Handle(), GL_TEXTURE_SWIZZLE_A, GL_ONE);
ImGui::ImageButton(reinterpret_cast<ImTextureID>(static_cast<uintptr_t>(frame.gcolorTex.value().Handle())),
{100, 100},
{0, 1},
{1, 0});
ok real sleep time

my way of removing / 2 from the calculations
I guess I forgot to write the rest of the expressions before moving on
I guess RTX cards can tank about any bs there is
I'm gonna try an atrous filter and see if it performs or looks any different
also gonna try a tiny bayer matrix for noise
then I will see if I can get away with doing everything at quarter res
on what?
on filtered version already filtered by atrous?
I would be more concerned about the variance in unfiltered image
it's too bad for any filter
should add a checkbox for the filtered version to skip filter passes
did you fix your version?
last I checked there were no commits on that
I included the stuff in your patch if that's what you mean
I haven't pushed anything for a while
I do have what I believe to be your latest changes though
no I mean fix unfiltered version's disintegration with radius increase
Oh I haven't fixed that, but it shouldn't be too hard
yes it's all in the weigt
jakerino, am i seeing it right that you like 2 spaces indentation?
feels like it was 4 before as it should be 🙂
it has been 2 spaces this whole time
the formatting got a little whacky when anonten added stuff, but I have since run clang-format
feels weird being referred to as anonten in this server
well your "actual" name is quite cursed to write
hmm i could have sworn it was 4 before, but ok
i noticed another error which keeps the project from compiling on clang14, a missing ;
people just say void
true
which file?
ill fix it with the incoming PR
okay
in about 30s 🙂
thirties?
secondz
gotta wait 10 years
hehe
can you notify me when you merge the imgui and fixes you've made so far?
push not merge
I guess commit
I wanna play with settings live because I was recompiling the thing before
yeah that workflow is sad
pretty weird thing to want to play with but I wouldn't be programming graphics if my preference in toys was mainstream
cd bad, blu ray good
the 2 spaces trips me, just saying. rather unusual to look at
can't you make your editor display indentation at any width
thats like people using tabs over spaces and i have a different tab setting
also you're like the second person in this channel to say that my preference is weird 😂
im not complaining, just whining because it feels like you stand in front of thicc aquarium glass
4 spaces, best spaces 😛
I was first
the main advantage it has, imo, is preserving horizontal space
but do you have two documents open in split view
thank you 🙂
not saying I'm going to change it btw
was about to ask anton if he would join me coercing you into 4 spaces
I'm just saying that 2 spaces is the convention for this project
aanton 👁️
or, you give me contributor rights
: )
mr aquarium man
I used 2 spaces in every project I linked and you haven't complained until now 😩
to be honest I don't care because my code can't even decide between tabs and spaces
ye for some reason i didnt notice until yestertwodays ago when i was looking at the glsl
I adapted so much I don't care even if there is no spaces
or different spaces every line
heh
which is why I ruined jakers code
in my c# world this shit is formatted automagically for me whenever i save or close a scope
plus the default is 4 spaces indentation
anyway, as i said, i wasnt comaplainingngin
in clang it does for some reason
in the PR
that's weird
I guess it parses it just like any function call
even though it's a part of what preprocessor should turn into other code
there are other places that use that macro
https://github.com/JuanDiegoMontoya/Fwog/blob/main/include/Fwog/BasicTypes.h#L280
hang on a second, it could alo be clangd
ok it still compiles liblibimgui just fine, with gcc
liblib
: )
that annoys me too, don't worry 😄
yeah, you added it
wait no that was galunga
unless 
maybe the sceneloader needs it
ah, possible that it was galuga
it needs to be linked with all examples
hmm no
shit
gcc10 doesnt like the semicolon
gcc11/12 doesnt complain here
you shouldn't need the semicolon at all
intellisense might give a fake warning is all
ye, clangd
does it prevent you from compiling?
but these things should work the same
no, just tried it
clangd is the lsp here, that fucked with it
wtf
exactly
expanding the macro should just work
[build] [23/26 76% :: 4.917] Linking CXX executable example/deferred
[build] [23/26 80% :: 11.149] Building CXX object example/CMakeFiles/gltf_viewer.dir/common/SceneLoader.cpp.o
[build] [24/26 84% :: 11.531] Building CXX object example/CMakeFiles/gpu_driven.dir/common/SceneLoader.cpp.o
[build] [25/26 88% :: 11.598] Linking CXX executable example/gltf_viewer
[build] [25/26 92% :: 11.736] Building CXX object example/CMakeFiles/volumetric.dir/common/SceneLoader.cpp.o
[build] [26/26 96% :: 11.913] Linking CXX executable example/gpu_driven
[build] [26/26 100% :: 12.100] Linking CXX executable example/volumetric
[build] Build finished with exit code 0
it builds
super odd
now it doesnt complain about it anymore
phantom error
you have 24 hours
I would've just made a new commit that undoes that one 
nah
since i knew it was in the past 3 commits, i just did git rebase -i HEAD~3 (i for interactive)
then an editor pops up, and has 3 rows, one per commit, and i changed pick into drop in front of the offending commit
saved the file and git push --force done
alright PR builds successfully
4x4 bayer vs shitty 2x2 bayer vs 16x16 blue noise
@shell inlet how come the weight is just r in the new one instead of r * r? does it have to do with this: xi = mod(xi + noise.xy, vec2(1.0));?
jacobian determinant of the mapping is r
where did you get r^2
previously (and maybe currently) the samples were biased toward the center of the disk
r^2 is 1/pdf of that or whatever
idk but I'm sure it's true
here's a visualization
https://www.shadertoy.com/view/7ssfWN
bigger dots = more weight
what does visualization have to do with it
the likelihood of sampling a particular point
it's "importance" sampling
it's also in the original paper
I know that the distribution is not uniform, that's why the pdf is 1/r and not constant 1/pi
I don't know what is your point
I must've misunderstood
here's the jacobian determinant of the mapping
I think I got the wrong idea about the r^2 thing from the transformation you do to make it uniformly sampled
I will trust your math (never studied jacobian matrices)
you know what it might actually be 1/2pir
not 1/r
because the pdf integrates to 1 and 1/r does not
1/pi integrates to 1 however which I can definitely say is a correct uniform sampling pdf for a unit circle
now what I'm not sure about is how to properly integrate over a circle
I guess since we have sin(theta) in hemispherical integral when validating pdfs and say nonuniform pdf would be cos(theta)/pi for cosine weighted, we integrate it and it yields 1, now for circle it would be r instead of sin(theta)
then integrating 1 like this gives the area of the circle as expected
1/r actually makes it integrate to 2pi meaning we need to normalize the pdf
so I guess proper pdf for a unit circle with quadratic falloff would be 1/2pir
jaker are you following
I'm reading
it seems quite bright when I use 2pir (this is just the indirect illumination texture)
like this?
float QuadraticCircleMappingWeight(float r)
{
return TWO_PI*r;
}
yeah
what does the full image look like?
it's still pretty darn bright when I increase the radius
looking at the red thing reflecting more light than it gets is the evidence
it will remain same brightness regardless
because 2pi is a constant multiplier on the integral
it's adding to the already bright white surface
oh wait, you're looking at the top pic
adding or not this is weird
if you shine a light with intensity 1 it should not become brighter than that
unless you concentrate it (which isn't happening here)
I think the picture could be lying to us though
the top pic is not direct illumination
can you make sun brightness
1
darn it I should probably load fwog again and do tests

current mood