#daxa - opinionated vulkan abstraction (now with autosync!)
2482 messages · Page 3 of 3 (latest)
No yeah that's true, but I was more referring to things where you end up checking something dynamically when checking statically is not possible.
Though usually I think it's a matter of finding the right abstraction level at which incorrect behavior becomes inexpressible. I'm eager to see how daxa handles things. Are the rust bindings somewhere public already?
i guess so
yes but they are VERY wip, they will probably go through a lot of evolution
Mmm I see, I'm just curious
Found 'em
Nice. I took a look and it all seems fairly clean.
Also I notice that often you are expected to pass things around by Id, which I assume to be the bindless IDs. This might by itself address my concerns because it means it's much easier to keep track of ownership (in vulkano you are forced to pass Arcs around, which means you'd clone them and store them all over the place). I'm excited to see where this goes
re implementing the c++ on the c api now
90% done with first version c api being fully implemented
i am yet to test it at all tho. i will die
once the rust api hits i will shill the shit out of daxa
innards?
im procrastinating rt for a year now but daxa has mesh shading
pog
You might wanna increment user_count in a few
LVSTRI already thinking of rt shadows after horrors with VSMs
I love how the sunk cost is forcing us into implementing Nanite and coping with "always wanted to do that anyways"
Thank god I did implement the easy part of nanite indeed 
cloning daxa as we speak, what should I expect Saky
I need a full review of daxa
Greatness and prosperity
Daxa is great, I recommend looking at mine/Gabes repos to see the code style
The iteration times are amazing, you need OpenGL levels of loc to add new things
And you get extremely hard working maintainer (me) who can always help and two slackers who will sometimes fix your issues
link repo

o wait nevermind
you are a gremlin
(I didn't recommend potricks because his is probably outdated and doesn't even compile)
that is true
Ok awesome, the new lifetime modes now both work.
- Mode 1 (default mode in c++): error when a child outlives parent (now properly applies to everything consistently)
- Mode 2 (default mode in rust): children keep their parents alive until they get destroyed
- both langs can also choose the other mode by setting an instance flag
this gives a really good compromise between comfort, performance and control
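A minimal sketch of the two lifetime modes described above, using mock types. `LifetimeMode`, `Parent`, `release_parent`, `create_child` and `destroy_child` are hypothetical names for illustration and are not daxa's API; the real instance flag is not shown in this log.

```cpp
#include <cassert>

// Hypothetical mock of the two modes; daxa's real objects and flags differ.
enum class LifetimeMode {
    ErrorOnOutlive,   // mode 1: releasing a parent with live children is an error
    KeepParentAlive   // mode 2: children keep the parent alive until they are destroyed
};

struct Parent {
    explicit Parent(LifetimeMode m) : mode(m) {}
    LifetimeMode mode;
    int child_count = 0;
    bool release_requested = false;
    bool destroyed = false;
};

inline void create_child(Parent &p) { ++p.child_count; }

// Returns false when the active mode treats the release as an error.
inline bool release_parent(Parent &p) {
    if (p.child_count > 0) {
        if (p.mode == LifetimeMode::ErrorOnOutlive) {
            return false;             // a child would outlive its parent
        }
        p.release_requested = true;   // destruction is deferred to the last child
        return true;
    }
    p.destroyed = true;
    return true;
}

inline void destroy_child(Parent &p) {
    --p.child_count;
    if (p.child_count == 0 && p.release_requested) {
        p.destroyed = true;           // last child gone, deferred release runs now
    }
}
```

In mode 1 the caller learns about the ordering mistake immediately; in mode 2 the same call succeeds and the actual destruction happens when the last child goes away.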
i will add more "dum dum give me simple abstraction" modes in the future BUT make them optional. This will be great for debugging
do I really need vcpkg :(
yes! we love vcpkg
not really
it can auto download it and hide it from you
it will be there but you dont need to care
@pale owl
Hmm
can I add_subdirectory this somehow or do I have to build daxa separately from my project
as gabe is not here atm i cant really help
but vcpkg is really easy to remove if you want to switch later but use it now to get it going
nono I'm trying to do it the recommended way on Building.md
so I did basically every step lol
I don't really mind vcpkg
ok
Should I write a cmake config to find daxa?
uuuuuuuh
I see saky doing a very shrimple find_package(daxa CONFIG REQUIRED)
im searching for gabes template but cant find it
but I don't see the daxa.cmake anywhere in his repo
ah
at least from what I understood
nono
look at mine
the vcpkg configuration at the end is so that vcpkg uses the local version of daxa that I have cloned on my pc
instead of using the actual package, I prefer this because if there are any changes made to daxa I can just pull and get them immediately as opposed to waiting for a major daxa release
I'm conchfused
your vcpkg.json is identical to daxa's?
Unless there's a difference I'm not spotting
maybe I'm just dumb
ye as I said
me dumb
I don't understand how I didn't notice I was on daxa's repo
very nice
no worries it always confuses me
everything does tbh
So if you would like to use your local copy of daxa as opposed to the "official" package version you can do the overlay-port thing I have at the end of my vcpkg
The disadvantage is that sometimes (although very rarely nowadays) we break the API a bit due to some changes so you might need to update your projects that use the local version
"builtin-baseline": "78ba9711d30c64a6b40462c72f356c681e2255f3" wut is this
no problem, I break the API with every line of code I write with my abstraction

I'd tell you if I knew
These are some magic hashes that vcpkg uses but I actually have no clue what they do
we need to wait for @pale owl to reveal this
ight, now that I have the json what do I do
he told me like 20 times but I always forget
with the json you now should be able to find the package and link to it normally
and you should be able to use daxa
hmm CMake be yelling though
what is it telling you
can't find daxa-config.cmake
Did you set the path in the overlay-ports to daxa location on your pc?
Uh that's nice
hmmm
```json
{
  "name": "daxa_test",
  "version": "0.0.1",
  "dependencies": [
    "daxa"
  ],
  "builtin-baseline": "78ba9711d30c64a6b40462c72f356c681e2255f3",
  "vcpkg-configuration": {
    "overlay-ports": [
      "D:/Dev/CLion/DaxaTest/deps/Daxa/cmake/vcpkg-port"
    ]
  }
}
```
this be the json
Sure
D:\Programs\JetBrains\CLion\bin\cmake\win\x64\bin\cmake.exe -DCMAKE_BUILD_TYPE=Debug -DCMAKE_MAKE_PROGRAM=ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -G Ninja -S D:\Dev\CLion\DaxaTest -B D:\Dev\CLion\DaxaTest\cmake-build-debug
CMake Error at CMakeLists.txt:4 (find_package):
Could not find a package configuration file provided by "daxa" with any of
the following names:
daxaConfig.cmake
daxa-config.cmake
Add the installation prefix of "daxa" to CMAKE_PREFIX_PATH or set
"daxa_DIR" to a directory containing one of the above files. If "daxa"
provides a separate development package or SDK, be sure it has been
installed.
Did you try to build daxa?
building right now
build was good
I see so building actually also installs the thing
let me try rerunning cmake for my repo
frigg
still no daxa found
yes
Those instructions on the wiki are for building daxa yourself. You can just pull daxa from the official vcpkg registry. It's registered as a package there already
no we need LVSTRI to be a cool kid and have the Daxa dev setup with us
Okay, so you see the overlay port, that's how you can use it through vcpkg (like a sane person who wants to acquire all their packages through a package manager)
Using the overlay port, with the relative path to daxa will allow you to use a local clone of daxa
The baseline hash represents the hash of the vcpkg repository that you are using, which makes sure the package versions don't change despite having multiple people with different versions of vcpkg
Because if a package is updated with breaking changes and that becomes the new default version, we don't want to update to it until we specifically say so
makes sense
i agree, LVSTRI should probably get it straight from the sauce
For some reason, instead of just explicitly versioning each package, vcpkg decided to allow you to omit the version, and so it infers it from a file they call baseline.json... which is tied to the official registry and its git hash
I see
btw I was missing some docs, I'll read them and report back soon 🫡
specifically the cmake-vcpkg integration docs (through cmake presets)
Sure
There's no specific need for CMake presets, but they are lovely IMO
Wanted to ask. Will daxa expose manual barriers to rust? If so does it consider it safe or unsafe?
I ask because I've had conversations with vulkano devs on whether the UB you get by doing barriers wrong is relevant to rust safety, I'd like to hear more thoughts on this
yes
imo anything on the gpu is not rusts concern
but daxa also has autosync with an optional taskgraph
Yeah I agree, vulkan also gives robustness guarantees very often so you don't eg. risk corrupting random stuff for getting a barrier wrong
good
I was looking at daxa's API and I was thinking about the command list impl
I see it uses only the main queue and that it uses a command pool per command buffer?

it got me thonking but I like it
very nice and shrimple
GPU resource IDs are also epic
I have been thinking about multiqueue for a while, but mainly in the context of task graph
The command pool per command buffer makes everything simpler and does not really affect anything from what I understood
Just reposting, more info about daxa development there
shrimple indeed
everything is simple until it needs to be more complex
indeed
@lavish talon i have a new idea for the command list btw, it will be able to create multiple VkCommandBuffers. the complete function will spit out a BakedCommands object that contains the command buffer, after that you can continue recording new commands in a new command list. This way you can only use a pool in one spot (the command list) making it very easy to not misuse. dropping the list then recycles the pool.
So for example you can have an app with a command list pool that you get command lists from on the fly, then record, get the baked commands and return the list to the pool. making command pool reuse super easy within a frame
CommandList = VkCommandBuffer + VkCommandPool
CommandList::complete() -> BakedCommands
BakedCommands = VkCommandBuffer
VkCommandPool is recycled when the list is dropped
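The idea above can be sketched with mock types and no Vulkan calls. `MockPool`, `PoolRecycler`, `record` and the integer "commands" are hypothetical stand-ins; only `CommandList`, `complete()` and `BakedCommands` come from the description, and this is not the shape of the final daxa API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for VkCommandPool / VkCommandBuffer.
struct MockPool { int id; };
struct BakedCommands { int pool_id; std::vector<int> commands; };

// Hands out pools and takes them back for reuse.
class PoolRecycler {
public:
    MockPool acquire() {
        if (!free_.empty()) {
            MockPool p = free_.back();
            free_.pop_back();
            return p;
        }
        return MockPool{next_id_++};
    }
    void recycle(MockPool p) { free_.push_back(p); }
    std::size_t free_count() const { return free_.size(); }
private:
    std::vector<MockPool> free_;
    int next_id_ = 0;
};

// One list owns one pool and can bake several command buffers from it.
class CommandList {
public:
    explicit CommandList(PoolRecycler &r) : recycler_(r), pool_(r.acquire()) {}
    CommandList(const CommandList &) = delete;
    CommandList &operator=(const CommandList &) = delete;
    // Dropping the list is the single place where the pool is given back.
    // A real implementation would also have to wait until the GPU has
    // finished the baked command buffers before resetting the pool.
    ~CommandList() { recycler_.recycle(pool_); }

    void record(int cmd) { current_.push_back(cmd); }

    // Finishes the current command buffer; recording can continue afterwards.
    BakedCommands complete() {
        BakedCommands out{pool_.id, std::move(current_)};
        current_.clear();
        return out;
    }
private:
    PoolRecycler &recycler_;
    MockPool pool_;
    std::vector<int> current_;
};
```

Because the pool is only reachable through the list, there is no way to reset or reuse it from a second place by accident.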
i may rename a few things
you can still use this new thing just like the old one basically but its much more flexible and potentially more efficient
maybe CommandEncoder is a good name
also a bit easy to get wrong cause reusing the same command encoder forever will just lead to never reusing command pool memory which would be horrible
im not sure about this yet
interesting, I need to study daxa's internal a bit more before I can suggest anything though
many parts are shrimple but the task graph is scary 
the goal is always to have the simple default use be as simple as possible
Rule one of task graph impl, you do not talk about task graph impl
task graph is scary
(which is why I am NOT touching phobos' task graph. idek how it's handled internally lol)
I think phobos's one is simpler
Daxa TG is not that hard to understand tbh
The concepts at least
The implementation is hell
Lol I was about to take a look
all of daxa is probably 20k
yuuuh
there are 6254 edge cases to remember tho
lol
pretty sure they are hardcoded in my neurons now
yea the god function hahaha
has probably 40% of its loc as comments
Clang tidy warning function complexity 670 (limit 20)

We need task list rewrite I need it Patrick
We need better sync
yes
we need to make a tracking issue
so we can remember all the things to improve
LVSTRI is a part of us now we have to dress to impress now
we will see
i have the glimmer of hope that i can bully lvstri to add rt support to daxa
i never said that
Uh I expected more
My engine was bigger...before I deleted a good chunk of it to rewrite the renderer lol
daxa was rewritten two and a half times now
it always shrunk except with the latest one introducing the c api
Uhm makes sense
changing daxa to make c api work feels like bending steel with my hands
we are very close
many tests work
end of week i hope i get task graph running again
the c api + rewrite of internals to make safe idiomatic rust error handling possible (if i dont have bugs in it) made the error handling so much more robust
every error case has an error code like in vulkan
and i actually dont just abort or ignore anything anymore
everything is checked and reported back as error if it fails
This literally made daxa find 5 bugs in the test code already
the tests are apparently not the best
its soo much better now
there were tons of checks before
but now i actually make an effort to check EVERYTHING that can break daxa/ the vulkan driver badly
I will add a report print that lists all currently living resources
i will allocate all resources into tables so i can iterate over them in these cases
like a resource dump
Task graph is hallucinating names
yes
i was too lazy to implement raytracing btw
it is still nr one after the c api
that's fine
lets goo
tests all pass
tho uuuuh
imgui is having a stroke
Btw daxa now takes string views, so data + size instead of null terminated strings
i try to get all the rustisms into daxa
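A small sketch of what "data + size instead of null terminated" means in practice. `NameView`, `view_of` and `to_owned` are hypothetical names for illustration; the actual daxa string view type is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical C-style view: pointer + length, no terminator required.
struct NameView {
    const char *data;
    std::size_t size;
};

// Views a slice of an existing string without copying or terminating it.
inline NameView view_of(const std::string &s, std::size_t offset, std::size_t count) {
    return NameView{s.data() + offset, count};
}

// Copies the viewed bytes into an owned string when one is needed.
inline std::string to_owned(NameView v) {
    return std::string(v.data, v.size);
}
```

This is the same layout as a Rust `&str` (pointer plus length), which is presumably why it maps cleanly onto the bindings.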
I rewrote daxas id system to be safe now.
Problems with ids before:
- version of array slots was 8 bit, it would wrap, causing errors in id checks as ids were quickly non unique
- ids were not checked in functions using them causing hard to debug errors
- even checking could not prevent the ids becoming invalid after the check as parallel threads could still kill it after the check
Solutions:
- CommandLists get read lock on resource lifetimes. This lock is held from begin to end of the cmd buffer. It prevents true resource destruction.
- All functions using ids check their validity, that + the read locks make it possible for them to be completely safe.
- The version is now 44 bits making it 100% bulletproof. Even if the version is exceeded, it will simply retire that slot, it wont be reused anymore.
- The collect garbage function now gets a strong write lock on resource lifetimes. That will prevent command lists recording while actually destroying resources.
- Adding optional software command buffer that does NOT lock and records fake commands to be later turned into real commands. Same api. These are optional convenience command lists that can be recorded in parallel with collect_garbage.
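The version and slot-retirement part of this can be sketched as below. The 44-bit version comes from the list above; placing the index in the remaining 20 bits of a 64-bit id is an assumption, and `SlotTable` with its members is a hypothetical name. The read/write locking is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical versioned slot table. VersionBits is a template parameter
// only so that retirement can be exercised with a tiny version range.
template <unsigned VersionBits = 44>
class SlotTable {
public:
    static constexpr unsigned kIndexBits = 64 - VersionBits;
    static constexpr std::uint64_t kIndexMask = (std::uint64_t{1} << kIndexBits) - 1;
    static constexpr std::uint64_t kMaxVersion = (std::uint64_t{1} << VersionBits) - 1;

    // Returns an id packed as [version | index].
    std::uint64_t create() {
        std::uint64_t index;
        if (!free_.empty()) {
            index = free_.back();
            free_.pop_back();
        } else {
            index = slots_.size();
            slots_.push_back(Slot{});
        }
        slots_[index].alive = true;
        return (slots_[index].version << kIndexBits) | index;
    }

    // Every function taking an id would run this check first.
    bool is_valid(std::uint64_t id) const {
        const std::uint64_t index = id & kIndexMask;
        return index < slots_.size() && slots_[index].alive &&
               slots_[index].version == (id >> kIndexBits);
    }

    bool destroy(std::uint64_t id) {
        if (!is_valid(id)) return false;
        Slot &s = slots_[id & kIndexMask];
        s.alive = false;
        if (s.version == kMaxVersion) {
            return true;              // version exhausted: slot is retired, never reused
        }
        ++s.version;                  // stale ids to this slot no longer match
        free_.push_back(id & kIndexMask);
        return true;
    }

private:
    struct Slot {
        std::uint64_t version = 0;
        bool alive = false;
    };
    std::vector<Slot> slots_;
    std::vector<std::uint64_t> free_;
};
```

With 44 version bits a slot would have to be recycled roughly 1.7e13 times before retirement, so the retire path exists for correctness and should almost never trigger.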
so now i am comfortable with a rust api on this
Btw i realized that the user can use the c api to interface with the c++ api and make their own lifetime system that doesnt necessarily use refcounting for any object
which i find to be really cool i will probably use that too
i nearly caved in to make buffer image and sampler refcounted
but i stand strong i wont!
Is there necessarily a bad thing about making those refcounted?
i really like the explicit lifetime control
having an id only makes them pod in any other struct and such
Ah makes sense alright
btw with this you can also do manual lifetime management on daxa ref counted types if you want to, the c++ types are all abi compatible with c types
so you can create c++ type, cast to c type then treat the dec refcnt function as a free function for that object
not sure who would use that but its quite nice to have that option
also i did some perf testing on my system lately
atomic adds and subs can each add around 4-5nanoseconds to each op on my pc
that is actually significant
copying them around is MUCH more expensive
not that i need the perf lool
just a cool bikeshed to do ids instead
I guess yea
I don't think you'd be doing mass copying operations between frames though
Idk, depends on the program
i had some cases where i could measure them a little
ids are also sexy cause they work on the gpu too
can just pass them to a buffer and then use the texture in a shader with the exact same id
i think thats very sexy
Ooo yea true for bindless stuff that is indeed very nice
So your main CPU "cache" for storing resources that are pointed to by IDs could be sent to the GPU I assume?
all the resources created on the cpu are written into a big descriptor set which is living on the GPU and you can then use IDs to access
the ids index it
Ahh yep ok that works too yea
So one massive descriptor set with just array descriptors then?
Or whatever vk needs for bindless shenanigans I haven't looked into it
easy
Cool cool
Yea imma see how phobos handles descriptor sets cause I've had the idea of messing with bindless for a while now so this might be a good chance to do so
Afaik all it's really about is just how you can store multiple resources of the same type in a single descriptor, something array based basically
daxa also supports bda as well right? How well does that fit in with bindless in your (devs) opinion
not the same descriptor
the same binding
the binding contains a descriptor array
each descriptor points to one resource in some way
Ohh ok I see
So you have a descriptor set, which contains multiple descriptors (for each resource type)... which each act as an array for multiple resources?
So one descriptor could be for images whilst another could be for buffers or something like that
kinda
descriptor sets contain descriptor bindings
each binding is either a single descriptor
OR a descriptor array
so usually binding = descriptor
but if you make the binding an array it will also contain an array of descriptors
layout(set=0, binding=1) uniform texture2D tex; <= binding 1 has one descriptor
layout(set=0, binding=1) uniform texture2D texs[256]; <= binding 1 is a descriptor array
here you can also see the dstArrayElement for the binding IF its an array
so one descriptor is always 1to1 with one resource view
while bindings can be multiple descriptors/resource views
Daxa has one descriptor set for bindless stuff
binding 0 is an array for storage buffer descriptors
binding 1 is array for storage image descriptors
binding 2 is array for sampled image descriptors
binding 3 is for sampler descriptors
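As a mock model of that layout (no Vulkan calls), the set can be pictured as four arrays, one per binding, with a resource id acting as the array element. `Binding`, `MockDescriptor` and `BindlessSet` are hypothetical names; only the binding numbers come from the list above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Binding numbers follow the list above; everything else is illustrative.
enum class Binding : std::uint32_t {
    StorageBuffer = 0,
    StorageImage = 1,
    SampledImage = 2,
    Sampler = 3
};

// Stands in for one descriptor, which refers to exactly one resource view.
struct MockDescriptor { std::uint64_t resource = 0; };

class BindlessSet {
public:
    explicit BindlessSet(std::size_t capacity) {
        for (auto &binding : bindings_) binding.resize(capacity);
    }
    // Mirrors the (dstBinding, dstArrayElement) pair of a descriptor write.
    void write(Binding binding, std::uint32_t array_element, std::uint64_t resource) {
        bindings_[static_cast<std::size_t>(binding)].at(array_element).resource = resource;
    }
    std::uint64_t read(Binding binding, std::uint32_t array_element) const {
        return bindings_[static_cast<std::size_t>(binding)].at(array_element).resource;
    }
private:
    std::array<std::vector<MockDescriptor>, 4> bindings_;
};
```

The same element index can be in use in several bindings at once, since each binding is its own descriptor array.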
Ahh yep this is what I was trying to get at thanks
Pretty odd imo that you can have these both have the same binding but I'll go with it honestly
hm you can alias compatible bindings
Oh you can do that? Intriguing...
that was not an example of one set but just two random bindings
yes
you can alias all texture types
and all image types
also all storage buffers
that is because a storage image descriptor is the same data type for all storage image types just with different values
same for sampled or buffer descriptors
so you can have heterogeneous arrays of descriptors for each kind (storage, sampled, buffer, sampler)
glsl syntax sucks for this
So you can have storage images with different formats in the same descriptor?
yes
descriptors are just super fat pointers
for example
rdna3 buffer descriptor
its 128 bit so 16 bytes
just a pointer with metadata
they have something similar for textures
Wait is that the actual implementation detail for amd cards? I never knew it was publicly available
Damn thanks. Gives me a hint on why community driven linux support for amd drivers is pretty good
it goes way back
c api is merged into master
lets go 200 ⭐
that was quite the struggle
most days i could only type with one arm
@lavish talon i cooked up a big improvement on command lists that allows for command pool reuse for multiple command buffers easily and safely
epic 
i will implement it today but the conceptual details are done already
daxa becoming more perfect by the minute
the best part is that its as easy to use as command list but much more powerful safer and more efficient
@lavish talon https://github.com/Ipotrick/Daxa/commit/304787472a77f5c615c83abf27c0148131e26578
https://github.com/Ipotrick/Daxa/blob/304787472a77f5c615c83abf27c0148131e26578/tests/3_samples/5_boids/main.cpp#L202
renderpasses are typesafe now
starting a renderpass changes type of the encoder only allowing correct functions
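A minimal typestate sketch of that idea: beginning a render pass consumes the general encoder and returns a different type that only exposes render pass commands. All names and the string log are hypothetical; see the linked commit for the real interface.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this only illustrates the typestate pattern.
class RenderPassEncoder;

// Outside a render pass: transfer-style commands only, no draw().
class CommandEncoder {
public:
    void copy_buffer() { log_.push_back("copy"); }
    // Rvalue-qualified: the encoder is given up when the pass begins.
    RenderPassEncoder begin_renderpass() &&;
    const std::vector<std::string> &log() const { return log_; }
private:
    friend class RenderPassEncoder;
    std::vector<std::string> log_;
};

// Inside a render pass: draw() exists, copy_buffer() does not.
class RenderPassEncoder {
public:
    void draw() { log_.push_back("draw"); }
    // Ending the pass hands back the general encoder type.
    CommandEncoder end_renderpass() && {
        log_.push_back("end_renderpass");
        CommandEncoder out;
        out.log_ = std::move(log_);
        return out;
    }
private:
    friend class CommandEncoder;
    explicit RenderPassEncoder(std::vector<std::string> log) : log_(std::move(log)) {}
    std::vector<std::string> log_;
};

inline RenderPassEncoder CommandEncoder::begin_renderpass() && {
    log_.push_back("begin_renderpass");
    return RenderPassEncoder(std::move(log_));
}
```

Calling `draw()` on a `CommandEncoder` or `copy_buffer()` on a `RenderPassEncoder` is a compile error here, so the "wrong command in the wrong state" class of mistakes never reaches runtime.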
i dont have samples for recording multiple ExecutableCommands from a single encoder
updates all samples and tutorial to new command encoder
very very simple api
Looks great!
need to share sakys art
I will rewrite the wiki as an md book
so how exactly do you use timeline semaphores to ensure deletion safety without having FIF buffered deletion queues?
wouldn't you need to tag or bucket everything by the u64 counter value it was queued to be deleted on?
I'm not sure if I understand your frame sync model
In my interpretation (and how I implement it), when you want to destroy something you get the current CPU timeline value and then every frame you check whether the GPU timeline has reached the target value
yep
so then you do have to store a list of (ThingToDelete, TimelineValue) and iterate it every frame
once you delete on the cpu it gets put into the zombie queue
it can be a queue not a list
oh yeah I guess you could have a priorityqueue keyed on the timeline value
actually idk I'd give it some more thought, but generally you would have to iterate it and assume that some of the stuff has to be skipped over until the next frame
yes
so it'd be some kind of datastructure where you can pass over the stuff not yet to be deleted
it is always ordered
oh shit right
as you must insert in frame index order
its a queue
and the actual order inside of the frame does not matter
somehow got into the assumption that the data would end up unordered but yeah I guess that's never possible
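The zombie queue discussed above can be sketched as a plain FIFO. `Zombie`, `ZombieQueue`, `defer_destroy` and the integer resource handles are hypothetical; `collect_garbage` borrows the name mentioned earlier in the channel, but the signature is an assumption. The timeline values are passed in as plain integers instead of being read from a real timeline semaphore.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical sketch; resources are plain ints for illustration.
struct Zombie {
    std::uint64_t timeline_value;  // CPU timeline value at the time of destroy
    int resource;
};

class ZombieQueue {
public:
    // The CPU timeline only ever grows, so pushing to the back keeps the
    // queue ordered by timeline value without any sorting.
    void defer_destroy(int resource, std::uint64_t cpu_timeline) {
        zombies_.push_back(Zombie{cpu_timeline, resource});
    }

    // Pops everything the GPU has finished with and stops at the first entry
    // that may still be in use; later entries cannot be ready before it.
    std::vector<int> collect_garbage(std::uint64_t gpu_timeline) {
        std::vector<int> destroyed;
        while (!zombies_.empty() && zombies_.front().timeline_value <= gpu_timeline) {
            destroyed.push_back(zombies_.front().resource);
            zombies_.pop_front();
        }
        return destroyed;
    }

    std::size_t pending() const { return zombies_.size(); }

private:
    std::deque<Zombie> zombies_;
};
```

Since entries are inserted in timeline order, no priority queue is needed and nothing has to be skipped over: the scan ends at the first zombie that is not yet safe to destroy.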
with DLSS3 maybe 
do you (or I guess more precisely the user calling into daxa) still use FIF for controlling presentation?
the user should be clueless about FIF
or just putting a bound on actual frames in flight in general
uh swapchain has a semaphore for each image
so it blocks on acquire_next_image() when not yet available
so you usually don't have to care
wym?
like why would you have a structure around controlling the maximum number of frames in flight that's separate from blocking on your swapchain acquisition
when you don't use swapchain for example
its not
the swapchain WILL block you
if you exceed frames in flight
nvidia even blocks you in the present call
what I use FIF for is for knowing when stuff that's read and written by both the CPU and GPU isn't being touched by the GPU
the swapchain is inherently tied to fif semantically
but (following vkguide) I have a separate FIF count outside of swapchain acquisition, although yeah that would block me if I was finishing submissions faster than I present
swapchain is also not guaranteed to return images in order afaik
so still need per image semaphore
that cant really happen
they are ordered with semaphores
right
looking at my code again I have a present semaphore per FIF not per swapchain image... gonna have to think about this some more, the vkguide code really threw me for a loop
one more thing, without FIF, how do you n-buffer data that's commonly read/written by the CPU and GPU, like a buffer for transform matrices
with FIF I know that my staging capacity needs to be transformCapacity * FRAMES_IN_FLIGHT
is it implicitly now transformCapacity * swapchainImageCount?
the users still gives the FIF on swapchain creation
and it is his responsibility to handle that
As it is very often usage specific how you want to handle that yourself Daxa doesn't really do that for you
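For the n-buffering question above, a user-side sketch of the `transformCapacity * FRAMES_IN_FLIGHT` arithmetic might look like this. `FrameRing` and its members are hypothetical names, and this is application code, not something daxa provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper for a buffer split into one region per frame in flight.
struct FrameRing {
    std::size_t region_size;         // bytes one frame may write (transformCapacity)
    std::uint32_t frames_in_flight;  // the FIF count the user chose

    // Total allocation needed so that in-flight frames never share a region.
    std::size_t total_size() const { return region_size * frames_in_flight; }

    // Byte offset of the region the CPU may write for this frame.
    std::size_t offset_for(std::uint64_t frame_index) const {
        return static_cast<std::size_t>(frame_index % frames_in_flight) * region_size;
    }
};
```

A region is only safe to overwrite once the frame that last used it has finished on the GPU, which is exactly what blocking on the FIF limit guarantees.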
as in via the present mode?
I don't follow
nvm it's been so long since I looked at swapchain creation, forgot you could control the min image count explicitly
yeah we don't even expose that
Daxa just sets that for you
I also don't really see how it is relevant, but perhaps I'm missing something
that's kinda what I was wondering
so based on this FIF does remain as a concept known by and handled by the API? and FIF is an entirely separately managed concept from the swapchain image count?
because so far I've been under the assumption that's how you're supposed to build your frame sync
User gives daxa max fif he wants, daxa creates enough images to support that and handles their availability (blocks on acquire_next_swapchain_image())
Inside swapchain there are per image binary semaphores handling that
per-image or per-max-FIF? because in my swapchain I have binary semaphores * max FIF
Per image
You need per image
Ehhh maybe you can get away with only having per max fif
But this is more clear
if you wanted to only have FIF semaphores you would need to reassign these (as I believe vulkan tutorial does) as swapchain returns you the images
idk, I've always had it work with no validation issues or anything, I think if your FIF < swapchainImageCount you just end up blocking more frequently or something
but the main thing is that for daxa FIF is a user controlled concept, but it's transparent to the execution
so it's not like daxa is entirely designed around somehow making "what's a FIF" work, which was my main confusion
no you block the same you just do what I said above
finally i have time again
did some upgrades to daxa the last few days
- rt integrated into task graph
- rework of task heads
- no longer need the annoying count
- slang integration
- no more silly ub with reinterpret casting structs to arrays
- added optional attachments to tasks
- multiple bug fixes
- added read write concurrent sync for attachments
- made task head constexpr (much simpler push constants)
- reworked image view cache (can now also cache separate mip levels for things like hiz gen storage image)
- streamlined whole task interface for consistency and simplicity
- mesh shader bugfixes
- reworked slang integration
- reworked msaa, added optional dynamic state msaa
- added mesh shader and msaa test samples
- gabe added spirv caching for pipelines
- jaisero added rt pipelines with full integration
does Daxa have anything exposed for using transfer and compute queues?
I see the C api lets you get the vulkan handles for the device and physical device, would this be my best bet for now?
its already abstracted to allow this internally but we didnt need it till now. But gabe wants it so it will come soonish
Not a 'debug my code' request, but not a bug report exactly either: daxa segfaults calling vkEnumerateExtensionProperties in daxa_create_instance
skipping that vk call moves the segfault to the next one, and so on.
I am able to call that (and other) vk funcs outside of the daxa init stuff, and in general I do not have any problem running vulkan code on this machine, so I'm reasonably sure the fault is in this library
fptr to vk function looks normal, address sanitizer doesnt show anything before that point.
an api dump generated by the validation layers (via vk config) of your app using daxa might be a good starting point for me to investigate
solved, not daxa, linker being dumb and writing nonsense symbols
ok nice
weird issue
let the dead rest
