#StratusGFX - Realtime Rendering Engine
Yeah I have an array of pages instead of a big 2D texture at the moment
But I'll have to use a big atlas at some point because I'm only guaranteed 2048 array layers
I think I'll still keep the 1D addressing scheme in the future though
Oh that’s right I forgot about that
So 1d but it can still work? Will you have to pack multiple things into it or just single address still?
Well it'll be a 1D address but I'll just expand it to 2D by reading the image size
I'll have it behind a function so it looks 1D to the user hehe
I'll need to simplify my API as much as possible if even I'm to understand it 
I dunno if you're joking, but it should be the physical pages atlas size this time 
Divided by the page size of course
uhh yes
I think
I use a stupid simple addressing method in my partially working sample
I'll just ask chadgpt
which is take pixel, divide by page size, now you got page index
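A rough sketch of the addressing idea discussed above, assuming a square-paged physical atlas laid out row by row. The function names and the 1024/128 sizes are made up for illustration and are not taken from anyone's actual code:

```python
def pages_per_row(atlas_width_px: int, page_size_px: int) -> int:
    """Number of whole pages that fit across one row of the atlas."""
    return atlas_width_px // page_size_px


def page_index_to_xy(page_index: int, atlas_width_px: int, page_size_px: int):
    """Expand a 1D page address into (page_x, page_y) inside the atlas."""
    per_row = pages_per_row(atlas_width_px, page_size_px)
    return (page_index % per_row, page_index // per_row)


def pixel_to_page_index(px: int, py: int, atlas_width_px: int, page_size_px: int) -> int:
    """Take a pixel, divide by the page size, and flatten to a 1D page index."""
    per_row = pages_per_row(atlas_width_px, page_size_px)
    return (py // page_size_px) * per_row + (px // page_size_px)


# Hypothetical sizes: a 1024 px wide atlas with 128 px pages has 8 pages per row.
xy = page_index_to_xy(9, 1024, 128)
idx = pixel_to_page_index(130, 129, 1024, 128)
```

Hiding the expansion behind one function keeps the address 1D for the caller, so switching from array layers to an atlas only changes this helper.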
i read it all
: ) and i would not be able to implement it myself as is. im simply too dumb for it after work
but
it reads very nicely
and although i barely understand anything i enjoyed reading it for some reason
the pics help too
what i did understand, conceptually is the analogy to virtual/physical-isms
that makes sense
and im glad you are here 🙂
Hey thanks for reading it! I appreciate that.
Ok that’s awesome. Like there’s tons of nonsense with these vsms that we’re all drowning in but this is probably the most important low level detail. So if that makes sense maybe the article is good to go
i hope i will be able to implement that cool stuff too : >
Join us in the vsm trenches lol
i will watch from the sky and see how it pans out first XD
I'll re-read it when I have a bigger brain 
But what this article needs is a short explanation and comparison with other approaches, e.g. what they do here:
https://docs.unrealengine.com/5.1/en-US/virtual-shadow-maps-in-unreal-engine/
Helps readers to understand why it's even worth going through all this work to implement 😄
there are some comparison pics
oh wait nvm I thought you were talking about the UE5 docs 
Oh yeah good point
yeah this super complicated technique needs some motivation lol
“Why do you want to be miserable for this? Here’s why”
nah, just talking about fortnite, my dude
hey man please troll people in #off-topic-🐸 or #bikeshed-😇
but hey, ue5 docs can be interesting too, i guess
Hmmm I could add in some comparison pics with CSM at different resolutions
Maybe briefly talk about other approaches (tho I don’t have them implemented locally)
btw I noticed in your reddit post that you have a gtx 1060
the UE5 VSM impl was probably not designed for that class of hardware sadly
regardless, have you tested it in UE5 to see how the perf is?
Oh actually no I haven’t
I’ll need to figure out how to test it - I’ve only used UE4 so far. Do you know of any vsm test scenes they released?
hmm I'd test by just loading a random scene and doing stat gpu
I guess ideally you'd test with the bistro model, since that's what you're using already
well, ideally, you'd test with whatever model you want to use
but sure, the bistro model is a popular choice for testing graphics performance in ue5
Actually isn't too bad! It's getting like 11-12ms per frame with 25% resolution scale
though something seems a bit off since sometimes the shadows disappear
Enabled the page visualization and the regions where the shadows are missing should be covered, so idk what's happening
Added a small wip motivation/comparison section
there's a typo in the article: nunmber
btw UE5 docs recommend quantizing light rotations when you have a constantly rotating directional light so that the vsm can be fully updated over several frames before the next rotation increment
also the gifs have an interesting blue-and-yellow check pattern and I'm not sure if that's intentional or an artifact of compression 😄
By this does it just mean rotate, wait for update queue to clear, rotate again? That’s a really good idea
Dang yeah I’m not doing that currently but that’s a great idea
Lmao I keep changing my debug color scheme since I can’t make up my mind
What they recommend specifically (the engine doesn't do this itself) is to rotate the light in larger (quantized) amounts infrequently instead of rotating a tiny amount every frame
Oh so it’s more of a guideline rather than an engine feature. That makes sense though. Tiny rotations constantly seem to be the worst case scenario for this
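A minimal sketch of the quantized rotation guideline, assuming the game tracks a continuous light angle in degrees and snaps it down to a multiple of some step before handing it to the renderer. The function name and the 5 degree step are hypothetical:

```python
import math


def quantized_angle(continuous_angle_deg: float, step_deg: float) -> float:
    """Snap a continuously changing angle down to the nearest multiple of step_deg.

    The renderer only sees a new light direction when the continuous angle
    crosses a step boundary, so cached shadow pages stay valid in between.
    """
    return math.floor(continuous_angle_deg / step_deg) * step_deg


# Hypothetical example: the sun drifts a little every frame, but the light
# direction used for shadows only changes once per 5 degree step.
angle_for_shadows = quantized_angle(7.3, 5.0)
```

The tradeoff is a visible jump at each step, in exchange for several quiet frames in which the pages invalidated by the last jump can be re-rendered.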
Yeah there's a heap of useful tips on the ue5 doc for vsms
Plus lots of info that hints at the underlying implementation
couple of changes to the article today:
- Added a section talking about hardware/API sparse vs software (also made software sparse the default in my code)
- Mentioned splitting light rotation into several steps instead of doing continuous rotation
- Tried to make virtual mem -> physical mem section a little clearer
What software do you use to make the graphics btw? They look clean
you mean the gifs? I export them as mp4 from davinci resolve then I convert to gif using this site: https://convertio.co
great site if the mp4s are less than 100mb
also been thinking about how the software sparse would work in vulkan or DX... like from lvstri's experiments we know the sparse api is still super broken at least on windows
so maybe it would be something like allocating small 2D textures (1024x1024, 2048x2048) where the memory comes from a custom allocator as needed and freeing them whenever no pages were being used from them?
1024 would give you 64 pages, but it would require a readback whenever a new texture block was needed
yes you make an r32ui texture that has a reasonable size (e.g. 256MB) and this is your physical memory
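A rough CPU-side sketch of the "one big texture is your physical memory" idea, assuming a simple free list of page slots. The class name, method names and dict-based page table are hypothetical; the 1024 px atlas with 128 px pages just matches the "1024 would give you 64 pages" arithmetic above:

```python
class PagePool:
    """Free-list allocator over fixed-size page slots in one physical atlas."""

    def __init__(self, atlas_size_px: int, page_size_px: int):
        per_side = atlas_size_px // page_size_px
        # Reverse order so pop() hands out slot 0 first.
        self.free = list(range(per_side * per_side - 1, -1, -1))
        self.page_table = {}  # virtual page id -> physical slot

    def allocate(self, virtual_page):
        """Return the physical slot backing virtual_page, or None if the pool is full."""
        if virtual_page in self.page_table:
            return self.page_table[virtual_page]
        if not self.free:
            return None
        slot = self.free.pop()
        self.page_table[virtual_page] = slot
        return slot

    def release(self, virtual_page):
        """Return the slot of a no-longer-needed virtual page to the free list."""
        slot = self.page_table.pop(virtual_page, None)
        if slot is not None:
            self.free.append(slot)


pool = PagePool(atlas_size_px=1024, page_size_px=128)  # 8 x 8 = 64 slots
first = pool.allocate("clip0_page(3,4)")
```

Nothing here touches the GPU: a real version would also write the slot into the page table that the shaders read.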
I still think I'll just do readback for now
maybe Jaker will convince me to switch
But readback is blazing fast, it takes zero time to copy a few megs
I wonder if they’ll ever allow compute shaders to allocate memory
But yeah read back is pretty fast especially if you just queue it up and then check the results next frame
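A toy model of "queue it up and check the results next frame", purely on the CPU. The class and its one-frame latency are assumptions for illustration; in a real OpenGL renderer the copy would go into a buffer and something like a fence would tell you when it is safe to read:

```python
from collections import deque


class DeferredReadback:
    """Hands back requested data only after latency_frames frames have passed."""

    def __init__(self, latency_frames: int = 1):
        self.latency = latency_frames
        self.pending = deque()  # (frame_requested, snapshot of the data)

    def request(self, frame: int, data):
        """Queue a copy of the data as it looked on this frame."""
        self.pending.append((frame, list(data)))

    def poll(self, frame: int):
        """Return every snapshot that is old enough to read without stalling."""
        ready = []
        while self.pending and frame - self.pending[0][0] >= self.latency:
            ready.append(self.pending.popleft()[1])
        return ready


readback = DeferredReadback()
readback.request(frame=0, data=[3, 7])   # e.g. page ids requested by the GPU
same_frame = readback.poll(frame=0)      # nothing yet, no stall
next_frame = readback.poll(frame=1)      # last frame's requests arrive
```

The cost is that page requests are always at least one frame stale, which is the same tradeoff the readback approach in the chat accepts.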
I meant the little diagrams
Edit: will repost when I have more info and maybe a demo
few more article changes:
- comparison mentions other light types
- page offset xy was changed to page index xy with slightly updated descriptions under the diagram
- added a small allocator gif comparing hardware and software sparse
this was the mp4 before giraffeifying it
updated gif to say allocator = pool
Tried compiling and running this, here's what I'm getting on AMD 6750 XT and Debian 12 
(Running ./Ex00_StartupShutdown)
[Error] Thread::(Renderer) checkShaderError:50 -> [error] Unable to compile shader: viscull_lods.cs
0:226(61): error: parameter `in offset' must be a constant expression
0:226(19): error: type mismatch
0:226(19): error: operands to arithmetic operators must be numeric
0:226(19): error: operands to arithmetic operators must be numeric
0:226(9): error: no matching function for call to `normalize(error)'; candidates are:
0:226(9): error: float normalize(float)
0:226(9): error: vec2 normalize(vec2)
0:226(9): error: vec3 normalize(vec3)
0:226(9): error: vec4 normalize(vec4)
0:226(9): error: double normalize(double)
0:226(9): error: dvec2 normalize(dvec2)
0:226(9): error: dvec3 normalize(dvec3)
0:226(9): error: dvec4 normalize(dvec4)
0:226(2): error: could not implicitly convert return value to vec3, in function `sampleNormalWithOffset'
[Info] Thread::(Renderer) Compile_:226 -> Loading shader: ../Source/Shaders/viscull_csms.cs
==Begin Shader Source==
It complains about textureLodOffset mainly 
The *Offset sampling functions are really lame 😦
I encountered this too previously, idk why it’s that way - spec doesn’t say that it should be constexpr
Oh, I see 
I think some shader compilers differ in what they think a constant expression is
It seems like some will inline the code before checking (which causes it to succeed), while others strip the context from parameters (so it fails to compile)
It’s sad that glsl doesn’t have constexpr guarantees/constructs that C++ has
Wouldn’t be a problem if it was just for optimization (like inlining), but when it requires you to be constexpr, but doesn’t give you the tools - that’s frustrating.
Yeah maybe it’s something like that, dang. Once I’m at the computer I’ll do some digging for a fix
"The offset is supposed to be a compile-time constant. If you pass value obtained from a constant array by index, the code will normally work only if the loop can be unrolled, so that the individual texelFetchOffset calls end up with constants for the offsets.
On Nvidia it usually works, but AMD drivers often do not unroll the loops, perhaps because of more efficient loop implementation. In that case the offset expression is not constant anymore, and loops with texelFetchOffset fail to compile or link."
this is probably the same issue
GLSL is painfully underspec'd
"yeah things can be constexpr if the compiler wants to do it"
yeah that's actually amazing
so you can open issues if it does stupid stuff
and DX12 also open sourced theirs and can deal with SPIRV
ye tbh though just use DXC and write HLSL since it's 100x better than GLSL
yeah lots of people do it
GLSL is practically in maintenance mode while HLSL is getting more C++ features all the time
however, I'm not sure if HLSL can deal with BDA yet which would be one good reason to avoid it
oh no, wonder if there are plans to add that
I think they are currently bikeshedding pointer design and gob doesn't like what they're coming up with
or something
through gob and devsh, I get the impression that the designers of HLSL are missing something
I'm not caught up there but I've seen some of his posts about the compiler stuff he's doing
you need to do some funky SPIRV intrinsics every time you wanna use an extension that's not supported
the shading lang situation is pathetic atm but HLSL seems passable for "serious" projects
and by "funky" I mean really funky

oh ye I saw a bunch of those in devsh's HLSL headers. at least they exist tho
I'm talking writing opcode IDs
oh no lmao
that's so much better than simply not being able to express it though
yeah true
how painful is BDA by spamming spirv opcodes
I have no idea 
I'm sure you could wrap it in a class template that makes it a little less sucky
does that mean he writes HLSL and manually injects opcodes to enable bda?
wdym manually
you can do vk::RawBufferLoad right now I think
idk, maybe like during compile time
but it's garbage because [insert devsh reasoning]
(I haven't looked at it
(
no, you put attributes like [[spirv::4213728]] before a function that does the thing (I think lol)
ye 
IF i have to use hlsl for reasons
id like to write hlsl like a sane person
without special attributes or opcodes here and there
just like in good old d3d11 days : )
the future is now and it's spirv::4213728
no i agree wish it were better
maybe someday the shader langs will be sane again
or at least, let devsh coerce mirkosoft people into making a better hlsl where shit is built in
I removed textureLodOffset and pushed to master branch
let me know if anything changes! i don't have any amd hardware to test on
Yep, it works now. I'm getting 20-30 FPS in Sponza example, though 
Capturing it with nsight just crashed my X11 
Warehouse sample is 100+ FPS, though
My PC sounds like a jet when running the "bathroom" demo 😄
Oh wow the sponza one is definitely odd
Does it stay at 30 even if you let it run for a bit?
Though the 6750 is probably a really good card. I did all the testing on a potato gpu and it can get 45 or so on sponza
Jaker is there a way that an amd perf capture can be run and debugged on a pc with an nvidia card?
you don't need an AMD driver to open an RGP trace
you just need RGP
oh wait this is OpenGL, RGP doesn't work with that 
I’ll try running nsight on Windows later, it shouldn’t crash 😂
It's impossible, see proposal 0011
Your ignorance is showing
Inline SPIR-V is great
It's actually the only sane way to get this stuff in HLSL
The other alternative is getting the SPIR-V extensions after 3 years when they're no longer shiny
And also if and when the HLSL Devs feel like giving them to you
C.f. the BDA fiasco
With proposal 0011 you can define new types and built-in variables properly
Meaning you can expose any new SPIR-V Khronos ratifies as an extension
As a HLSL type, function etc
Without having to go deep into HLSL compiler internals and applying your own patches and then having them refuse support cause you basically have your own fork
The only glaring hole in proposal 0011 is that the type definition is missing a lot of shit
This begins the review period as described in #96.
Please support my comment
May I offer you a 👍 in these trying times
im also not sure what you mean with arrogance
i meant it in the best way possible, support you
Ignorance
if this doesn't go through, it will be our own personal 9/11
haha i have no idea why i wrote arrogance, i meant ignorance : C getting old
it's because if you wait and want them to make all the SPIR-V stuff built-in, then you're condemning yourself to the scenario I've outlined
there is no other way to be forward compatible other than SPIR-V intrinsics
and writing out opcodes and function signatures
thats ok for me
i know you try to help the cause and I admire that, although I do not understand 90% of it
das all 🙂
at least im slowly abandoning opengl already
join my cult
Guess that means it’s time to switch to hlsl
haven't worked on this too much lately, but I made some progress with the VSM scheduling system recently. The issue I would run into is that sometimes a lot of changes happen quickly such as with a major camera movement. The perf would fall a lot more than normal leading to frame time spikes
new system sets up a budget per clipmap with a CPU scheduler. Now if huge numbers of update requests come in it allows itself to fall multiple frames behind and uses coarse data as backup
also uses new hpb for culling which gave some solid improvements
hpb 
also this replaces the old mess I was using which was error prone
but now when the camera starts going nuts the frame rate stays pretty high
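A simplified sketch of a per-clipmap update budget, assuming each clipmap has its own FIFO of page update requests and a fixed number of updates allowed per frame. The class, the budget value and the tuple format are hypothetical, and the coarse-data fallback is not modeled here:

```python
from collections import deque


class ClipmapScheduler:
    """Caps page updates per clipmap per frame and carries the rest over."""

    def __init__(self, num_clipmaps: int, budget_per_frame: int):
        self.queues = [deque() for _ in range(num_clipmaps)]
        self.budget = budget_per_frame

    def request(self, clipmap: int, page):
        """Record that a page in the given clipmap needs re-rendering."""
        self.queues[clipmap].append(page)

    def schedule_frame(self):
        """Pop at most budget_per_frame requests from each clipmap's queue."""
        work = []
        for clipmap, queue in enumerate(self.queues):
            for _ in range(min(self.budget, len(queue))):
                work.append((clipmap, queue.popleft()))
        return work

    def backlog(self, clipmap: int) -> int:
        """How many requests this clipmap is still behind by."""
        return len(self.queues[clipmap])


scheduler = ClipmapScheduler(num_clipmaps=2, budget_per_frame=2)
for page in range(5):          # a big camera move floods clipmap 0
    scheduler.request(0, page)
scheduler.request(1, 99)
frame_work = scheduler.schedule_frame()
```

With a cap like this, a burst of requests costs roughly the same per frame as a trickle; the clipmap just stays behind for a few frames, which is where the coarser backup data would come in.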
I'm still here, just inactive mostly. I sometimes check out #showcase. How have you been?
Wow I didn't realize it's already been 2 years. Doesn't feel like it
when are you going to strap in again and work on the next thing
hmm last graphics stuff was #showcase message and also a Vulkan backend I never posted about
lately I've shifted to game dev 
i remember that, looks like potrick inspired probeisms 🙂
everyone is abandoning opengoodluck
Yeah I jumped in on the probe bandwagon for a little while. I really like the results they give even without RT
You left it too?
not yet
it does look very pretty
I'm probing it up as well except I'm using rt to populate them
but that's mainly because rt is the only way to render voxels in my engine atm
Aren't you working on some kind of voxel game?
it's kind of a pain until you get some nice abstractions up and running
or until you borrow someone else's lol
yeah I started last December or so #1128020727380054046 (I didn't make a new thread)
Oh it's in frogfood?? I must have confused myself, I was looking for the other thread
a bit confusing innit
must be tuesday again
I hate tuesdays
made me check what day it was 
