#TREAD
I mean it's kind of based on how much data to read back
I think you could essentially have a write read loop entirely separated from the bevy stuff
clone a RenderDevice over
The problem is really determinism
I still haven't figured that out
GPUs don't have strict atomics ordering, so you almost have to bake in an ordering for them, like packing the position into the value you atomicMax
well some do, more modern ones
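A minimal CPU-side sketch of the "bake the ordering into the value" idea, using Rust atomics as a stand-in for a shader `atomicMax`. The 8-bit priority / 24-bit source index layout and the names `pack_claim`, `unpack_claim`, and `try_claim` are assumptions for illustration, not anything from the actual project. Because max is commutative, the winner is the same no matter which thread gets there first.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Pack a priority and a source cell index into one u32.
/// Assumed layout for this sketch: high 8 bits = priority, low 24 bits = source index.
/// A value of 0 doubles as "no claim yet", so a real version would reserve it.
pub fn pack_claim(priority: u8, source_index: u32) -> u32 {
    debug_assert!(source_index < (1 << 24));
    ((priority as u32) << 24) | (source_index & 0x00FF_FFFF)
}

/// Split a packed claim back into (priority, source index).
pub fn unpack_claim(claim: u32) -> (u8, u32) {
    ((claim >> 24) as u8, claim & 0x00FF_FFFF)
}

/// Every candidate tries to claim the destination cell.
/// `fetch_max` keeps the largest packed value, so the result does not
/// depend on the order in which the claims arrive.
pub fn try_claim(dest: &AtomicU32, priority: u8, source_index: u32) {
    dest.fetch_max(pack_claim(priority, source_index), Ordering::Relaxed);
}
```

Ties on priority fall through to the source index, which is what makes the result reproducible: the position itself is the tiebreaker.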
This weekend I did fix a lot of my mesh allocating issues though
Down to like 5-10ms for large amounts of changes from like 60ms lol
learned a little bit of the pipeline for bevy too
I was also thinking that meshlet generation might be faster with surface nets but I'm not entirely sure on that
I need to implement vertex pulling still, prob get down from 8 bytes to 5-6 per vertex
What if I do almost entirely gpu driven stuff though hehehe
surface nets on the gpu
I'd have to figure out how to compress that much further though
still you have to deal with networking going through the CPU and also sometimes your GPU handle explodes and you have to recover from it so you better keep a copy of the state on the CPU
Ya, I'd probably have to keep more of the uncompressed version on the cpu anyways for stuff like physics and such
I think making it entirely gpu driven is probably more ideal, I could add a lot more entirely graphical effects directly off the sim
or have temp/pressure/lighting/audio/physics all processed there
just have to keep the readback small enough for what's important latency wise like audio or physics gen
lighting is whatever and I'm unsure how voxel light maps would work anyways
now do physics on the gpu
fair lmao
I know a bit about the process but I have no idea how you'd realistically do gpu physics
Seems like a lot of steps
I think the initial steps here are moving surface nets over to the gpu, then the sim prob
surface nets should give me a good overview of how to do compaction/outputs/etc
plus save me gpu bandwidth and cpu (it's by far the most taxing thing still)
Sending a paletted input to mesh is gonna be like 1/16th the size of a majority of my meshes rn lol and it scales better to the worst case
right now with lots of edits I can get a bunch of 100kb meshes to send over which might be even just 512 bytes if paletted
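Rough sketch of where a number like 512 bytes could come from, assuming a 16x16x16 chunk and a 2-entry palette (both are assumptions, the thread doesn't state chunk size or palette size). The function names are hypothetical. The palette table itself adds a few bytes on top of the packed indices.

```rust
/// Bits needed per voxel to index a palette with `palette_len` entries.
/// A palette of 1 or 2 entries still uses 1 bit in this sketch.
pub fn bits_per_voxel(palette_len: usize) -> u32 {
    let n = palette_len.max(2);
    usize::BITS - (n - 1).leading_zeros()
}

/// Size in bytes of the packed index stream (palette table not included).
pub fn packed_size_bytes(voxel_count: usize, palette_len: usize) -> usize {
    (voxel_count * bits_per_voxel(palette_len) as usize + 7) / 8
}

/// Pack palette indices into a bitstream, least significant bit first.
/// Indices are u8, so this sketch supports palettes of up to 256 entries.
pub fn pack_indices(indices: &[u8], palette_len: usize) -> Vec<u8> {
    debug_assert!(palette_len <= 256);
    let bits = bits_per_voxel(palette_len) as usize;
    let mut out = vec![0u8; packed_size_bytes(indices.len(), palette_len)];
    for (i, &idx) in indices.iter().enumerate() {
        for b in 0..bits {
            if (idx >> b) & 1 == 1 {
                let bit = i * bits + b;
                out[bit / 8] |= 1 << (bit % 8);
            }
        }
    }
    out
}
```

With those assumptions, 4096 voxels at 1 bit each is 4096 / 8 = 512 bytes, and the size only grows with log2 of the palette length rather than with surface complexity.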
alright time to try some gpu falling sands
gonna need to think about this a good bit tbh, determinism adds a brutal constraint to a lot of this
atomicMax/Min seem like the only real way to do determinism without margolus neighborhoods
27 pass block CA might be cool
or maybe 18 pass if I could figure out a tiling for that
ok fuck it I'm gonna do the simplest thing imaginable: 2x2x2 margolus neighborhoods, no atomics and see how it looks
for later at least, this is deterministic + simple so shrug
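For reference, a small CPU sketch of what a 2x2x2 Margolus step could look like. The grid is split into disjoint 2x2x2 blocks, and the block origin shifts by one cell on odd steps so material can cross block boundaries. Since blocks never overlap, each block can be processed independently with no atomics. The `Grid` type, the cell constants, and the "sand falls one cell inside its block" rule are all assumptions for illustration; blocks that would stick out of the grid are simply skipped here.

```rust
pub const AIR: u8 = 0;
pub const SAND: u8 = 1;

/// Cubic grid of cells, `size` cells per axis (hypothetical layout for this sketch).
pub struct Grid {
    pub size: usize,
    pub cells: Vec<u8>,
}

impl Grid {
    pub fn new(size: usize) -> Self {
        Grid { size, cells: vec![AIR; size * size * size] }
    }

    pub fn idx(&self, x: usize, y: usize, z: usize) -> usize {
        (x * self.size + z) * self.size + y
    }
}

/// One Margolus step. Even steps use blocks starting at 0, odd steps at 1.
/// Inside each block, sand in the upper layer drops into air directly below it.
pub fn margolus_step(grid: &mut Grid, step: usize) {
    let off = step % 2;
    let end = grid.size.saturating_sub(1);
    for bx in (off..end).step_by(2) {
        for by in (off..end).step_by(2) {
            for bz in (off..end).step_by(2) {
                // Each 2x2x2 block touches only its own 8 cells.
                for dx in 0..2 {
                    for dz in 0..2 {
                        let lower = grid.idx(bx + dx, by, bz + dz);
                        let upper = grid.idx(bx + dx, by + 1, bz + dz);
                        if grid.cells[upper] == SAND && grid.cells[lower] == AIR {
                            grid.cells.swap(upper, lower);
                        }
                    }
                }
            }
        }
    }
}
```

On the GPU, the body of the three outer loops would map to one invocation per block, and the alternating offset is the only state that changes between dispatches.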
either that or we need like different passes per cell type, which might be worthwhile tbh?
powder pass, liquid pass, gas pass, interaction pass
TODO:
- [ ] GPU surface nets?
- [ ] GPU simplistic CA?
- [ ] Morphological operators on voxels (varying filter size, 3x3x3, 5x5x5, etc.) for erosion/dilation/whatever
- [ ] Jackdaw integration for level editor? I'd like to be able to make buildings out of voxels in the same way it lets you build boxes... maybe I just sample the boxes for intersection or something based on what the jackdaw scene files are?
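For the morphological operators item, a brute-force sketch of binary dilation and erosion on a solid/empty mask with a variable cubic filter. A radius of 1 corresponds to 3x3x3, a radius of 2 to 5x5x5. The `Mask` type and function names are hypothetical, and the window is simply clipped at the grid edges, which is one choice among several.

```rust
/// Cubic boolean occupancy mask (hypothetical type for this sketch).
pub struct Mask {
    pub size: usize,
    pub solid: Vec<bool>,
}

impl Mask {
    pub fn new(size: usize) -> Self {
        Mask { size, solid: vec![false; size * size * size] }
    }

    pub fn idx(&self, x: usize, y: usize, z: usize) -> usize {
        (x * self.size + z) * self.size + y
    }
}

/// Shared window scan: `want_any` = true gives dilation, false gives erosion.
fn apply(mask: &Mask, radius: usize, want_any: bool) -> Mask {
    let n = mask.size;
    let mut out = Mask::new(n);
    for x in 0..n {
        for y in 0..n {
            for z in 0..n {
                let (mut any, mut all) = (false, true);
                // Window is clipped to the grid bounds.
                for nx in x.saturating_sub(radius)..=(x + radius).min(n - 1) {
                    for ny in y.saturating_sub(radius)..=(y + radius).min(n - 1) {
                        for nz in z.saturating_sub(radius)..=(z + radius).min(n - 1) {
                            let s = mask.solid[mask.idx(nx, ny, nz)];
                            any |= s;
                            all &= s;
                        }
                    }
                }
                let i = out.idx(x, y, z);
                out.solid[i] = if want_any { any } else { all };
            }
        }
    }
    out
}

/// A cell becomes solid if any cell in its window is solid.
pub fn dilate(mask: &Mask, radius: usize) -> Mask {
    apply(mask, radius, true)
}

/// A cell stays solid only if every cell in its window is solid.
pub fn erode(mask: &Mask, radius: usize) -> Mask {
    apply(mask, radius, false)
}
```

Chaining `erode(&dilate(&m, r), r)` gives a morphological closing, which fills small gaps without growing the overall shape much.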
what if I did morphological sampling only vertically for my surface nets?
if I changed my ordering to XZY or something so Y is the finest stride in my arrays, there shouldn't be any additional cache misses + I get less flickering from meshes of falling voxels + less strain on my meshing since it's more likely to not have a surface there in non-LOD-0 cases
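Sketch of what "Y is the finest stride" means for indexing, assuming a cubic chunk with a hypothetical `CHUNK` size of 32. With this layout, a whole vertical column is one contiguous slice, so sampling straight up and down walks adjacent memory.

```rust
/// Cells per axis in a chunk (assumed value for this sketch).
pub const CHUNK: usize = 32;

/// XZY ordering: X is the coarsest stride, then Z, and Y is contiguous.
pub fn index_xzy(x: usize, y: usize, z: usize) -> usize {
    (x * CHUNK + z) * CHUNK + y
}

/// Borrow the full vertical column at (x, z) as one contiguous slice.
pub fn column(cells: &[u8], x: usize, z: usize) -> &[u8] {
    let start = index_xzy(x, 0, z);
    &cells[start..start + CHUNK]
}
```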
xz sampling would be too expensive I think for little gain
y ordering might also enable optimizations for falling specifically
swapping all of this over to Y ordering seems like a pain though
I guess it's just inputs + outputs really but yeah
or I guess really just mesh outputs? if I don't give a shit about my current maps
y'know I could also just dilate the sampling
if the sampling above and below is X and the middle is air, just count the middle as X
smoother looking result less flickering less verts less processing
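The rule above ("above and below are X, middle is air, count the middle as X") on a single vertical column could be sketched like this. `AIR` and the function name are hypothetical, and the check reads only the original column so fills don't cascade.

```rust
pub const AIR: u8 = 0;

/// Return a copy of the column where a single air cell sandwiched between
/// two cells of the same non-air material is sampled as that material.
pub fn sample_column_dilated(column: &[u8]) -> Vec<u8> {
    let mut out = column.to_vec();
    for y in 1..column.len().saturating_sub(1) {
        let (below, mid, above) = (column[y - 1], column[y], column[y + 1]);
        if mid == AIR && below != AIR && below == above {
            out[y] = below;
        }
    }
    out
}
```

Only one-cell gaps get filled, so a two-cell air pocket is left alone, and mismatched materials above and below don't trigger the fill.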
ok plan for this weekend:
- scratch the gpu stuff, it's fine as is for now, shouldn't get distracted from making an actual game
- move to xzy or zxy ordering, the important part is that the y axis is the finest stride in the arrays
- get the "morphological" stuff working for sampling for just the y direction
- maybe try a dilation or morphological "closing" as well, after we sample just vertically
I should update the stuff on the first post of this, make it look cooler lol
I got xzy ordering on a branch, unsure of whether I'm gonna move forward with it after thinking about maybe just doing a morphological average-ish thing with representative sampling
Looks awesome! Any idea how you can optimize your frametime?