#daxa - opinionated vulkan abstraction (now with autosync!)
sure
ok nice
most likely thats where the driver was putting it anyway
true
for me the extension will probably not do anything really
but i am still gonna try and use it to see
it cuts out the middleman
you can also pipeline descriptor updates
put your descriptor update into the command buffer
no need to stall the host
pretty neat
unless that never changes, you need to update it
yea but i can just use a descriptor update call or in the future ill write to the mapped pointer
its only done on resource creation so i dont see why id wanna pipeline it
also with command buffers id need to think about what is going on with multi queue
would i just make a cmd buffer that stores all changes and then is submitted with the next submit in the beginning?
wont i have to technically sync on all queues
seems easier to write a mapped ptr to be honest
sure, do whats easiest
already ideas for vuk?
im always interested in stuff others do to get more perspective
mmm, maybe
we'll see, vuk supports bindy
maybe this can make bindy extra fast
perhaps thats useful
yea
but doubt its high on the list
@soft tapir @pale owl later today fucking around with VK_EXT_descriptor_buffer and beta drivers?
Oooh
@pale owl @soft tapir @cobalt thistle should i remove non buffer reference buffer access?
you know how you can do daxa_get_buffer(Type, id) ?
do you wanna have that?
(it gives bounds checks for debugging but thats it )
i can keep the syntax
wat
but remove the bounds checks and buffers for simplicity
ohhhh
only allow buffer references
I feel like I will only use buffer refs, so I don't really care
daxa_get_buffer is nice for people mixing HLSL and GLSL, but IDK if that's something we care to support anymore
i can keep it by just faking it. id just look up the buffer address with the id in a table then return that from the function
lmao
then its the same as the other function
ssoooo
i guess i can just remove it
i looked more into the generating sampler function
and its simpler than i thought at least a bit
ill add an incomplete experimental impl
So i ll give TextureRef(2D) another try
daxa_sample(image, sampler, uv)
daxa_shadow_sample(....
daxa_gather(...
yesss
I hope it won't break my project 
i will try to be more backwards compatible haha

shove constants
ima name it push_big_constant
I will also enable a nvidia extension for bindless images behind the scenes (no api changes)
so that i only emulate bindless on amd
then nvidia is truly bindless with no downsides except debugging
the what now
i dont remember the name of the extension
nvidia has an extension for bindless images in vulkan
that removes one indirection
on nv descriptors are stored in some special memory, and the "vulkan side descriptors" are just uints
so in daxa its uint -> uint -> descriptor -> actual image memory
with the extension its just uint -> descriptor -> actual image memory
found it
what was the name again
VK_NVX_image_view_handle / GL_NV_bindless_texture
ah rip
can't do separate samplers
weird
that is very strange
that is a bummer
welp nv you wont get faster image access after all then
wtf is that
i need to pass a device address from an image?
really strange
am i stupid
it looks like it should be supported
but maybe glsl cant do it
ah right, you can't do sampler handle
that still must be std bindless
but thats ok ig
yea thats fine imo
I am planning on doing some heavy macro magic
to declare samplers inside shaders
macro?
yea
like
DAXA_DECL_INLINE_SAMPLER(
...
)
not sure if it will work like that
but ill try
my main man potrick
i have indeed seen macros before
that was not the question
ah lol yea i am very unspecific lmao
what does the macro do
it would do nothing in glsl but define some binding to a sampler or just an index that can be used to sample
or just declare a global daxa_SamplerId in the shader
and on cpu it would declare a daxa::SamplerInfo, that is then created and bound by the pipeline compiler
hmhmhm
i think that needs extra magic
macros arent enough
maybe not worth honestly
just this once
spirv_decorate
yes
like inline spirv?
yes
ong i will commit crimes now
then you put a usersemantic decoration
and read it back later with spirv-cross or whatever
just to be sure on the power of this
could i implement sampling functions with this "more nicely" than just plain glsl?
or at all?
hmm?
with spirv intrinsics
no, this is not about that
not the decorations
so its marked in spirv and i can see it later
with reflection?
yes
i dont get it
ah ok
hmmm
so i manipulate the spirv after the fact myself
interesting
compatible with hlsl too
that is cool
ah wow thats cool stuff
(3) How should we handle separate samplers?
RESOLVED: OpenGL 3.3 created separate sampler objects, and provided separate binding points in a texture image unit. When referencing a texture image unit with a sampler object bound, the state of that object is used; otherwise, sampler state embedded in the texture is used. In either case, each texture image unit consists of a texture/sampler pair, and no capability is provided to mix the texture from one unit with samplers from another.
i mean if thats the case
ill just give all of the images a dummy sampler
i have gotten offhand info that nvidia might actually already do the right thing with normal descriptor sets
and that extension is mostly for opengl glsl extension compatibility
well meh
i guess that makes sense
ok i wont do it then
seems like the syntax would have been cringe anyways
without changes at least
the handles are also 64 bit
my things are 32 bit with 8 bits reserved for debug
...
Ok other fancy idea
use after free detection on the gpu
I know you keep a version of each resource on the CPU but how would you use it to detect after-free use on the GPU?
ill put them into a buffer
and compare on gpu
when i see a fail i atomic add on a uint to acquire an index in the failure list
then write it out
compare with what? does GPU track its own version?
with a small size of like 127 ids per type
all ids have the version
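(a minimal CPU-side sketch of the use-after-free check described above, not daxa's actual code. The 24-bit index / 8-bit version split, the names `make_id`, `check_id` and `FailureLog` are all assumptions for illustration; on the GPU the version table would live in a buffer and the counter would be bumped with atomicAdd)

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed id layout: low 24 bits = slot index, high 8 bits = version.
constexpr std::uint32_t INDEX_BITS = 24;
constexpr std::uint32_t INDEX_MASK = (1u << INDEX_BITS) - 1u;

constexpr std::uint32_t make_id(std::uint32_t index, std::uint32_t version)
{
    return ((version & 0xFFu) << INDEX_BITS) | (index & INDEX_MASK);
}
constexpr std::uint32_t id_index(std::uint32_t id) { return id & INDEX_MASK; }
constexpr std::uint32_t id_version(std::uint32_t id) { return id >> INDEX_BITS; }

// Small fixed-size failure list, "like 127 ids per type".
constexpr std::size_t MAX_FAILURES = 127;

struct FailureLog
{
    std::atomic<std::uint32_t> count{0};
    std::array<std::uint32_t, MAX_FAILURES> ids{};
};

// live_versions[i] is the current version of slot i (uploaded from the CPU).
// Returns true if the id is still valid. On a mismatch, an atomic add
// reserves an entry in the failure list and the stale id is written there.
inline bool check_id(std::vector<std::uint32_t> const & live_versions,
                     std::uint32_t id, FailureLog & log)
{
    std::uint32_t const index = id_index(id);
    if (index < live_versions.size() && live_versions[index] == id_version(id))
    {
        return true;
    }
    std::uint32_t const slot = log.count.fetch_add(1u);
    if (slot < MAX_FAILURES)
    {
        log.ids[slot] = id; // entries past the end are counted but dropped
    }
    return false;
}
```

the host can then read back `count` and the first entries of `ids` after the frame to report which stale handles were touched.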
my cope grows stronger every day
texture ref lives
it'll be in soon
typed boneless handles with sane sample functions
boneless yes
@covert zephyr I'll show you my baby brainworm glsl file to cope with bindless and glsl when it's done
Validation layers with bindless(y) thing in daxa are acting very sus
I will expand the docs even more
@soft tapir @cobalt thistle @pale owl syntax questions, what do you prefer:
- TextureRef(texture2D)
- TextureRef(2D, float)
- TextureRef2D
- Texture2D
- (all the above but with SampledImage instead of Texture prefix)
I am currently for option one with either Texture or SampledImage
and secondly, on image functions:
do you want overloads for existing glsl functions, or new names, or both?
an example:
vec4 texture(TextureRef(T), SamplerId, vec2 uv)
or vec4 daxa_sample(TextureRef(T), SamplerId, vec2 uv)
1st and new names
I'm indifferent
The first appears to be the most verbose, which is kinda cringe
But I don't really care
any lurkers pls give opinions too
I fixed spelling and capitalization in the docs last night. I didn't actually read too much, so grammar might be weird
thx
hmm maybe hlsl style is nice
Texture2D(f32)
hmmmmmmmmmmmmm
I want more consistent naming
there are mainly two image types in Vulkan and glsl
storage image and sampled image
should I just name them that everywhere?
I need help
I hate syntax
simple and consistent it should be
ideas on names?:
image_sample
image_read
image_write
image_size
image_gather
image_compare_sample (shadow sampling)
image_ prefix on these yay or nay?
I am for destroying the name texture
hmmmmmmmmmmmmmmm
name for what
i didnt read all the stuff above, are you writing some sort of shading language?
just some glsl abstractions
names for image handles
and names for reading and writing images
in glsl
in cpp side?
ah
images as in textures or also as in images
thinking about opengl, where you use images in csisms, and textures/samplers in non-csisms
hmm hlsl has something like RWTexture
hmm im probably also the worst person to give any advice here :S
glsl side
I am making my own image handles and functions on top of the glsl ones
so bindless looks better
true I'm thinking of making it closer to hlsl
ah using that thing martty showed you last night?
mmhm hhhm mmhmm
if you used plain hlsl how would you call all these things there, or how does hlsl call those, maybe those names are already suitable out of the box and you could just use them?
otherwise, what might help, if you create a little matrix with all the naming glsl/hlsl/metal/andfriends call them and either pick what you like or decide on an existing one which makes most sense already, otherwise maybe seeing all those next to each other you can come up with something different more suitable to your current problem
hlsl has member functions for structs
which would be cool
but glsl doesn't have those
I will
good point that makes more sense
worst case is you invent a new shading language from scratch
yea lol
fuck
nah I'll stay on glsl cope
I'll work on that maybe later today or tomorrow, give a table of options
generating glsl is not an option?
generalizing?
ok let me actually write down the problems I have with current daxa and glsl in increasing importance:
- it is strange that there are the names textureXY and imageXY. I would prefer more consistent naming.
- the texture accessing functions are strangely named in glsl. An example is the texture() function. I would prefer more descriptive names like sample_image.
- the glsl handles are opaque types and can not be returned from functions, stored in structs or put in local variables.
- to cope with the previous point Daxa uses IDs for images and samplers. The problem with that is that the IDs don't have type information (for example if it is 2d or integer/float), so when using them the user needs to always specify the accessing type like image2D.
- accessing images via current daxa IDs is ugly.
an example how one samples a daxa image currently:
vec4 value = texture(
    sampler2D(
        daxa_get_texture(image_id),
        daxa_get_sampler(sampler_id)
    ),
    uv);
I would prefer this:
vec4 value = sample_image(image, sampler, uv);
I can do all impl details I only need to know what ppl would like the most syntax wise
I'll propose more options for image handles and sampling /read/write functions
Can we encode some of the image information into the handle (similar to how version is encoded)?
yes
that works
as I said I can do all the impl details
it's all about names now
So image_sample is basically glsl texture and image_read/write would be imageLoad/imageStore?
Mby image_load(store)_texel would work also
Damn naming is hard
The best would be Texture2D and second TextureRef2D
so you don't like the () syntax?
ok
yea
If it would be the () I would make #define Texture2D on it
I see
I honestly liked TextureRef(glsl type)
because then the user can just look up the glsl type that he wants
instead of searching in the daxa wiki
that makes me wonder what you think about BufferRef(type)
would you also prefer something else?
yes
It's okay. I don't know what else it would be replaced with
it would be consistent with buffer ref tho
Honestly sounds like making your own shader lang
BufferRef is a thing in hlsl?
ah
i dont think those Ref's make sense in a shader language
naming wise
because, what would be semantically different between Texture(foo) and TextureRef(foo)?
same for buffer/bufferref too i suppose
it is derived from the glsl extension called buffer reference
but you have a point
it does not necessarily make sense to name the textures ref because of that
would only do it for consistency
fair
but tbf
it should be the other way around maybe
Texture(..) and Buffer(..) seem nicer actually
i concur
or ReadTexture ReadBuffer
and if you want to write to them you WriteTexture and WriteBuffer ;d
jk ๐
Texture() Buffer() make sense
or Load() for both
or LoadSampled/LoadUnsampled
aaah pascal case idk
I'll do snake case for functions I think
ppl will hurt me if I do Microsoft style I think lol
ah loadSampled hmm load_sampled
yeeees
people should shut the fuck up tbh
good I like
pascal case is clearly superior

any should be fine, as long as its consistent
unsamoled
Shut
Oled good technology dunno what you on about
Ahh but you can't write sampled image
Hmmm
currently looks like it will be:
RWBuffer(type)
Buffer(type)
RWImageXY
AtomicRWImageXY
ImageXY
and access functions
image_(sample/gather/read/write/atomics...)(image,...)
now
I am only conflicted about SampledImage
should we have that?
because Image could either be a read-only storage image or a sampled image
I personally don't see the need for a read-only storage image except to reduce pipe barriers and layout transitions
i also kinda like the RWisms from hlsl
hmmm
would making a few prototypes make sense then if you are not sure
and see how it feels, if its shit we find something else
hmmmmmm
nah
fuck read-only storage images
probably won't do anything
hlsl doesn't even have it
ROimage
as long as you dont call your shit ipotric
the i/l thing at the beginning
ill just call you ingo from now on
lpotrick::Image
uuuhh
@pale owl
I like it
pog
gabe just cba to deal with any of it ๐ since they are busy getting wayland support to work ahem
It works
Wayland support is in
hmm did I forget to mention?
ah no, im just dumb
we tried it the other day, and found out that my hardware is just not capable to run vk
all good, soon its no problem anymore
it will be Image2Df32 probably
storage image will be RWImage2Df32
I'm messing with you
RWTexture<float4> :>
read_and_writable_texture_for_many_floats 4 channels
read_and_writable_texture_for_several_floats 3 channels
read_and_writable_texture_for_few_floats 2 channels
read_and_writable_texture_for_one_float 1 channel
read_and_writable_texture_for_many_floatsises
Deccer has the best naming around I swear to God
i try
I will have the format abstracted (except for float/integer/unsigned) as those shouldn't be cast
I also added defines to selectively enable the generation of function overloads. So if someone doesnt want them, they can just not enable them and have faster compiles:
* DAXA_ENABLE_IMAGE_OVERLOADS_BASIC
* DAXA_ENABLE_IMAGE_OVERLOADS_ATOMIC
* DAXA_ENABLE_IMAGE_OVERLOADS_64BIT
* DAXA_ENABLE_IMAGE_OVERLOADS_MULTISAMPLE
* DAXA_ENABLE_SHADER_NO_NAMESPACE
* DAXA_ENABLE_SHADER_NO_NAMESPACE_PRIMITIVES
yea lol
the default for all of them is 0
you can set them to 1
and the default for namespaces is that all things have the daxa_ prefix
i dont mean their default value
should it be named DAXA_USING_NAMESPACE ?
but what the semantical default is
all things are prefixed
ie namespaces are on by default or not
yes
then name the thing DAXA_DISABLE_SHADER_NAMESPACE rather than DAXA_ENABLE_SHADER_NO_NAMESPACE
glsl has no real namespaces so i just prefix a daxa_ on all things and the no namespace thingy just adds defines
ok
yea
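(a small sketch of the naming suggestion above, written in C++ because the preprocessor logic is the same as in glsl. `daxa_mip_count` is a made-up helper that only exists to have something to alias; the point is that the prefixed name is the default and the opt-in define only adds an unprefixed alias)

```cpp
#include <cstdint>

// Default: every helper keeps its daxa_ prefix ("namespaces on").
#ifndef DAXA_DISABLE_SHADER_NAMESPACE
#define DAXA_DISABLE_SHADER_NAMESPACE 0
#endif

// Hypothetical helper, not part of daxa: number of mip levels for a width.
inline std::uint32_t daxa_mip_count(std::uint32_t width)
{
    std::uint32_t count = 0;
    while (width != 0)
    {
        ++count;
        width >>= 1;
    }
    return count;
}

#if DAXA_DISABLE_SHADER_NAMESPACE
// Opt-in: expose the unprefixed name as a plain alias of the prefixed one.
#define mip_count daxa_mip_count
#endif
```

a flag named after what it turns off reads as "set to 1 to drop the prefix", which avoids the double negative of ENABLE_..._NO_NAMESPACE.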
Clinically insane human being
fr lukas
After gym I will post 
@next tundra does that make sense (except for typos and broken grammar)?
also @soft tapir @cobalt thistle ^ new shader integration
naming poll:
BufferRef(BUFFER_TYPE) or Buffer(BUFFER_TYPE)
hmmmmmmmmmm
re the last question, im in favour of Buffer(type) rather than BufferRef (i still dont see the point of calling something ref in a shader, but i also dont know shaderisms at all tbh)
yea i am at buffer rn
re the link... i suppose DAXA_SAMPLED_IMAGE_BINDING is an example only?
just sanity checking
yes
is that a name somebody would use in their program somewhere?
no
kkk then nevermind my question
its integrated to main now
daxa will get a samples overhaul
how it handles samples or you mean examples?
examples
couldnt find anything better "samples" related
I love daxa
struct Mesh
{
    daxa_f32vec3 obb_min;
    daxa_f32vec3 obb_max;
    daxa_BufferId mesh_buffer;
    daxa_u32 meshlet_count;
    daxa_Buffer(MeshletArray) meshlets;
    daxa_Buffer(MicroIndexArray) micro_indices;
    daxa_Buffer(IndirectVertexArray) indirect_vertices;
    daxa_Buffer(Vec3Array) vertex_positions;
};
I love buffer device address. I just allocate a whole mesh into one buffer and then store bda's to the parts that store the relevant data.
And daxa is so nice to translate this to a c++ struct i can init on host.
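(a rough sketch of the "one buffer, several device addresses" idea, under assumptions: the 16 byte alignment, the three sections and the names `plan_mesh_layout` / `sub_address` are illustrative and not daxa api. In real code the base address would come from the buffer's device address query)

```cpp
#include <cstdint>

using DeviceAddress = std::uint64_t;

// Round value up to the next multiple of alignment (alignment > 0).
constexpr std::uint64_t align_up(std::uint64_t value, std::uint64_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Byte offsets of each section inside the single mesh buffer.
struct MeshLayout
{
    std::uint64_t meshlets_offset;
    std::uint64_t micro_indices_offset;
    std::uint64_t vertex_positions_offset;
    std::uint64_t total_size;
};

// Pack the sections back to back, aligning the start of each one.
constexpr MeshLayout plan_mesh_layout(std::uint64_t meshlets_bytes,
                                      std::uint64_t micro_indices_bytes,
                                      std::uint64_t vertex_positions_bytes,
                                      std::uint64_t alignment = 16)
{
    MeshLayout layout{};
    std::uint64_t cursor = 0;
    layout.meshlets_offset = cursor;
    cursor = align_up(cursor + meshlets_bytes, alignment);
    layout.micro_indices_offset = cursor;
    cursor = align_up(cursor + micro_indices_bytes, alignment);
    layout.vertex_positions_offset = cursor;
    layout.total_size = cursor + vertex_positions_bytes;
    return layout;
}

// The address stored in the Mesh struct is just base + section offset.
constexpr DeviceAddress sub_address(DeviceAddress buffer_base, std::uint64_t offset)
{
    return buffer_base + offset;
}
```

allocate one buffer of `total_size`, then fill the struct's pointer fields with `sub_address(base, ...)` for each section.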
updated buffer ptr syntax
@next tundra can i post a link to a daxa discord here?
Daxa is getting a tutorial showcasing the api on a basic level
its also a bigger sample in multiple steps
I started work on conditional based permutations and persistent resource synch
i already have permutations implemented via conditional scopes in recording
but now i will also support just in time generated synch between executions of the task graph
as i now have conditionals the previous frame access can change
even worse, these accesses can fragment and change over time, so i must track last executions last access states. So this synch must be generated just in time every execution
but that is only the case for any resource that is persistent
I will also add just in time static permutation compilation
as currently i just generate 2^(conditionals-1) permutations on startup
with jit compilation it would only compile permutations that are actually encountered
All this will allow daxa task list to cache (or even fully precalculate) synchronization information as much as possible between executions and allows it to be more efficient than pure just in time re optimizing synchronization
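(a minimal sketch of compiling permutations only when they are first encountered, instead of all of them on startup. `PermutationCache`, `CompiledPermutation` and the bitmask key are assumptions for illustration, the real compile step would build batches and barriers)

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Stand-in for the result of compiling one permutation of the task list.
struct CompiledPermutation
{
    std::uint32_t condition_mask; // which conditionals were true
};

class PermutationCache
{
  public:
    // Bit i of condition_mask holds the value of conditional i this execution.
    // The permutation is compiled the first time its mask shows up and
    // reused on every later execution with the same mask.
    CompiledPermutation const & get(std::uint32_t condition_mask)
    {
        auto it = cache_.find(condition_mask);
        if (it == cache_.end())
        {
            ++compile_count_;
            it = cache_.emplace(condition_mask, compile(condition_mask)).first;
        }
        return it->second;
    }

    std::size_t compile_count() const { return compile_count_; }

  private:
    // Placeholder for the expensive compile step.
    static CompiledPermutation compile(std::uint32_t condition_mask)
    {
        return CompiledPermutation{condition_mask};
    }

    std::unordered_map<std::uint32_t, CompiledPermutation> cache_;
    std::size_t compile_count_ = 0;
};
```

an app that only ever hits "first frame" and "in game" pays for two compiles, no matter how many conditionals are declared.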
@south anchor for the brainworm enjoyment
now this is nice stuff
i wonder how dynamic real usecases are, do you have any examples?
one of the most common examples would be to have a first frame permutation to run some initialization code
another would be to turn off code for culled features, for example no water visible in current view
but it can be taken even further, making it describe all rendering
so on startup of an app you are in a menu for example
then you have a permutation to init rendering
and then a permutation for rendering after entering the game
mmm, but what is the benefit of that
state would be tracked between executions, so it can insert correct barriers when usage changes between executions
and as the one task list holds most of what is going on it can do all sync correctly for persistent resources
Technically i should probably also track buffer access stages to synchronize on that between frames right? That seems kinda unnecessary even tho its needed right?
look at what vuk futures are then
do you have a good sample that shows how its used?
try example 8
is a gpu future tracking the state of a resource? to keep it around for multiple frames for example?
i think i am too dumb to understand xD
yeah
its automation of initial and final state
point being that you don't need to have a single compilation
yea good point
after all
this is just research for my own understanding really
i do think that task list is a nice api i want to use
but i wont know if the ideas i have will be good and nice to use in the end
i routinely remove things i dont like after a while
backwards compatibility is low prio in daxa
tracked image subresource manipulation can be really complex
ye, esp with multi queue
probably the most bug prone loops i have ever written
interestingly a few of the changes increase the time to build a task list unexpectedly
i will start to optimize the compilation
i actually found a few bugs while redoing layout initialization
jit compiler to synch between permutations done
just needs testing now
I also fixed initial access for resources with @soft tapir
daxa's wiki got a facelift
its actually passable now
we also added a small tutorial with sample code
daxa on shader objects when
daxa is getting rpsl for poor people
shaders declare what resources they use and task list will sync and bind automatically
inline shader code that can be also included in cpu code
generates a c++ container that task list eats on task construction
and a shader side struct that is used by the shader directly
i am considering adding explicit uniform buffer binding
this way i wouldnt need to put this into a push constant
even easier
Can we have rpsl?
We have rpsl at home:
a thing that is annoying here that will get fixed soon, is that the things are passed in the push constant
i will implement opengl style buffer bindings, so that task list can hide that as well
in the end this saves a ton of code in one of my projects
what can also still be improved is that this must be in inl files
optimally we can declare uses in shaders alone and without the access code
that would just be autogenerated from glsl type + stage
why do you do codegen instead of just reading info from the spirv?
i want to keep using my bindless handles
and they are just fancy bitfields
i can also pack in whatever info i want into this now
dont need to be restricted to spirv reflection information but anything
i don't follow
Also if you look at this, this is per task information. So what i want to do is have one shared file for all pipelines in that task as a whole. Declare all uses across the pipelines as thats fine within a task. Then i can simply fill one uniform buffer with the ids at runtime and bind it. The pipelines all then use the ids from there
i forgor
maybe an esoteric idea to have many pipelines in a task but i already have some tasks that do that
the code generation is also just more consistent with the rest of daxa
as i use that a lot already
as you prefer
big downside to my design is that its cpu compile time constant
so when i hot reload it wont know the update
if i put it in the spirv
i could rerecord the tasklist and have hot reload add task uses at runtime
that would be amazing
i think i will have two options for this
one is the cpu compile time inl file version
and the other is in shader option
daxa basically emulates descriptor buffer now
i have sinned
daxa has bindy now
but only for uniform buffers
not shown here
but used by task list to remove more code and make it even simpler
in daxa i can use them kinda like descriptor buffers and descriptor sets now
just in a cursed daxa way

I AM SANE
Ok
finalizing the sync api in daxa now:
We can now create persistent resources that live outside of tasklists.
One can simply mention these in any task list.
The jit sync will then correctly attach their state to the first and last use of uses inside a task list at runtime.
this makes it so that you can use persistent resources in any combination at any time in any tasklist.
the task lists could even be completely statically compiled and chosen from per situation without conditionals
You can rerecord every frame, reuse task lists every frame or any combination, this also works with conditionals
all due to the jit
it just works
Importantly it will ONLY jit sync to the FIRST mentioned use of the persistent resource in a task list. the rest is statically compiled sync and can be reused
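(a heavily simplified sketch of the jit sync for persistent resources described above. All names here are hypothetical and real tracking also covers image layouts, slices and pipeline stages. The idea: the resource remembers its last access, each compiled task list knows its first and last access of that resource, and only the barrier in front of the first use is decided at execution time)

```cpp
#include <cstdint>
#include <optional>

enum class Access : std::uint8_t
{
    None,
    TransferWrite,
    ShaderRead,
    ColorWrite,
};

constexpr bool is_write(Access access)
{
    return access == Access::TransferWrite || access == Access::ColorWrite;
}

// Lives outside any task list and carries state between executions.
struct PersistentImage
{
    Access last_access = Access::None;
};

struct Barrier
{
    Access src;
    Access dst;
};

// What a statically compiled task list knows about one persistent resource.
struct CompiledTaskList
{
    Access first_access;
    Access last_access;
};

// Simplified rule: read after the same read needs no barrier, everything else does.
constexpr bool needs_barrier(Access previous, Access next)
{
    return previous != next || is_write(previous);
}

// Runs at execution time: generates the one jit barrier (if any) in front of
// the first use, then records the list's last access on the resource.
inline std::optional<Barrier> execute(CompiledTaskList const & list, PersistentImage & image)
{
    std::optional<Barrier> jit_barrier;
    if (needs_barrier(image.last_access, list.first_access))
    {
        jit_barrier = Barrier{image.last_access, list.first_access};
    }
    image.last_access = list.last_access;
    return jit_barrier;
}
```

everything between the first and last use stays inside the precompiled list, so lists can run in any order and only this one check is redone each time.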
all in-tasklist resources are transient and will use aliasing allocations to reduce memory footprint
the aliasing imo only makes sense to optimise within a tasklist
I think it's impossible to alias outside task lists
because you have no guarantee that two tasks lists will not execute at the same time on the GPU
they can have arbitrary overlap
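(a small sketch of why aliasing is tractable inside one task list: every transient has a known first and last task, so two resources may share memory exactly when those ranges do not overlap. This is a simple greedy placement with hypothetical names, not daxa's allocator)

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Transient
{
    std::uint32_t first_task; // index of first task using it (inclusive)
    std::uint32_t last_task;  // index of last task using it (inclusive)
    std::uint64_t size;       // bytes
    std::uint64_t offset = 0; // filled in by assign_aliased_offsets
};

inline bool lifetimes_overlap(Transient const & a, Transient const & b)
{
    return a.first_task <= b.last_task && b.first_task <= a.last_task;
}

// Greedy placement: start each resource at offset 0 and push it past any
// already placed resource that is alive at the same time and overlaps in
// memory. Returns the total heap size needed.
inline std::uint64_t assign_aliased_offsets(std::vector<Transient> & resources)
{
    std::uint64_t heap_size = 0;
    for (std::size_t i = 0; i < resources.size(); ++i)
    {
        Transient & current = resources[i];
        current.offset = 0;
        bool moved = true;
        while (moved)
        {
            moved = false;
            for (std::size_t j = 0; j < i; ++j)
            {
                Transient const & placed = resources[j];
                bool const memory_overlap =
                    current.offset < placed.offset + placed.size &&
                    placed.offset < current.offset + current.size;
                if (lifetimes_overlap(current, placed) && memory_overlap)
                {
                    current.offset = placed.offset + placed.size;
                    moved = true;
                }
            }
        }
        heap_size = std::max(heap_size, current.offset + current.size);
    }
    return heap_size;
}
```

across task lists there is no such ordering, which matches the point above: without a guarantee that two lists never overlap on the gpu, their lifetimes cannot be compared this way.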
@south anchor https://github.com/Ipotrick/Daxa/blob/5a002846f75490ea43518f12dbb91bc6b353bfc1/include/daxa/types.hpp#L203
I had to change how i work with alignas again
but this template cope makes it work
and so far its non breaking backwards compatible with all projects that use daxa
jimbus
But these work very well from what i see
quite the wall
i have to rewrite the shader integration wiki
but if it works
I think we actually came up with something which makes very few if any compromises in the front end and maps super nice to the back end
yea
i like the current compromise a lot
i think its very easy to use and understand
and hard to misuse if we get validation right
and still super flexible and fast
100 daxa stars
Transient resource aliasing is in the works
Also Patrick is adding much better validation messages so that we catch as many user errors as possible
man writing validation code is a lot of work
Also rewriting the old crusty validation messages in daxa will take some time
writing your own validation :slippy:
have you considered just slapping bunch of asserts on things people actually often run into
That's what I do every time I run into user error
Me being the user
i think i found a really nice new interface for tasks
its definitely much better for the things i use task list for
struct MipInput
{
    daxa::TaskInputImage lower_mip = {{.access = daxa::TaskImageAccess::TRANSFER_READ}};
    daxa::TaskInputImage higher_mip = {{.access = daxa::TaskImageAccess::TRANSFER_WRITE}};
    daxa::TaskParam<f32> value = 3;
} input;
input.lower_mip.id = task_render_image;
input.lower_mip.slice.base_mip_level = i;
input.value = 3213;
input.higher_mip.id = task_render_image;
input.higher_mip.slice.base_mip_level = i + 1;
new_task_list.add_task(daxa::TaskInfo<MipInput>{
    .task_input = input,
    .task = [=] (daxa::TaskInterface<MipInput> const & ti)
    {
        auto cmd_list = ti.get_command_list();
        f32 value = ti->value;
        cmd_list.blit_image_to_image({
            .src_image = ti->lower_mip.image(),
            .dst_image = ti->higher_mip.image(),
            .src_slice = {
                .image_aspect = ti->lower_mip.slice.image_aspect,
                .mip_level = i,
                .base_array_layer = 0,
                .layer_count = 1,
            },
            .src_offsets = {{{0, 0, 0}, {mip_size[0], mip_size[1], mip_size[2]}}},
            .dst_slice = {
                .image_aspect = ti->higher_mip.slice.image_aspect,
                .mip_level = i + 1,
                .base_array_layer = 0,
                .layer_count = 1,
            },
            .dst_offsets = {{{0, 0, 0}, {next_mip_size[0], next_mip_size[1], next_mip_size[2]}}},
            .filter = daxa::Filter::LINEAR,
        });
    },
    .name = "mip_level_" + std::to_string(i),
});
mip_size = next_mip_size;
example code with mipmapping
@wicked loom if you were still interested
@haughty whale this could also be done in c with some minor tweaks i think
// New fancy syntax!
struct ClearSwapchainIn
{
    daxa::TaskInputImage swapchain = {{.access = daxa::TaskImageAccess::TRANSFER_WRITE}};
} input;
input.swapchain.id = task_swapchain_image;
new_task_list.add_task(daxa::TaskInfo<ClearSwapchainIn>{
    .task_input = input,
    .task = [](daxa::TaskInterface<ClearSwapchainIn> const & ti)
    {
        auto cmd_list = ti.get_command_list();
        cmd_list.clear_image({
            .dst_image_layout = daxa::ImageLayout::TRANSFER_DST_OPTIMAL,
            .clear_value = {std::array<f32, 4>{1, 0, 1, 1}},
            .dst_image = ti->swapchain.image(),
        });
    },
    .name = "clear swapchain",
});
// Inline syntax still works!
new_task_list.add_task(daxa::TaskInfo<>{
    .task_input = {
        daxa::TaskInputImage{{.id = task_render_image, .access = daxa::TaskImageAccess::TRANSFER_READ, .slice = daxa::ImageMipArraySlice{.level_count = 5}}},
        daxa::TaskInputImage{{.id = task_swapchain_image, .access = daxa::TaskImageAccess::TRANSFER_WRITE}},
    },
    .task = [this](daxa::TaskInterface<> const & ti)
    {
        // cmd_list was used below without being declared in this lambda
        auto cmd_list = ti.get_command_list();
        daxa::ImageId render_img = ti.input_as<daxa::TaskInputImage>(0).image();
        daxa::ImageId swapchain_img = ti.input_as<daxa::TaskInputImage>(1).image();
        // alternatively, the task ids can still be used like before. It is just a shorter syntax now:
        daxa::ImageId render_img_alt = ti.image(task_render_image);
        daxa::ImageId swapchain_img_alt = ti.image(task_swapchain_image);
        this->blit_image_to_swapchain(cmd_list, render_img, swapchain_img);
    },
    .name = "blit to swapchain",
});
i love this
it is better
you wanna see the future vuk syntax?
yes
// deferred shading RG
auto build_gbuffer_pass = vuk::make_pass(
    "05_deferred_MRT",
    [uboVP](vuk::CommandBuffer& command_buffer,
            vuk::IA<vuk::eColorWrite, decltype([]() {})> position,
            vuk::IA<vuk::eColorWrite, decltype([]() {})> normal,
            vuk::IA<vuk::eColorWrite, decltype([]() {})> color,
            vuk::IA<vuk::eDepthStencilRW, decltype([]() {})> depth_rt) {
        command_buffer.set_viewport(0, vuk::Rect2D::framebuffer())
            .set_scissor(0, vuk::Rect2D::framebuffer())
            .set_rasterization(vuk::PipelineRasterizationStateCreateInfo{}) // Set the default rasterization state
            // Set the depth/stencil state
            .set_depth_stencil(vuk::PipelineDepthStencilStateCreateInfo{
                .depthTestEnable = true,
                .depthWriteEnable = true,
                .depthCompareOp = vuk::CompareOp::eLessOrEqual,
            })
            .set_color_blend(position, {})
            .set_color_blend(normal, {})
            .set_color_blend(color, {});
        // .... more cbuf recording
    });
auto shading_pass = vuk::make_pass(...);
auto position_image = vuk::declare_ia("05_position");
position_image->format = vuk::Format::eR16G16B16A16Sfloat;
position_image->sample_count = vuk::Samples::e1;
position_image = vuk::clear(position_image, vuk::ClearColor{ 1.f, 0.f, 0.f, 0.f });
// ... more stuff
auto gbuffer = build_gbuffer_pass(std::move(position_image), std::move(normal_image), std::move(color_image), std::move(depth_img));
auto result = std::apply(shading_pass, std::tuple_cat(std::make_tuple(target), gbuffer));
return result;
i like this
i like it too
TMP
i just need to work on it more
its not final syntax yet, some things are still rough on the edges
i see
struct MipMapTask
{
    struct Uses
    {
        ImageTransferRead<> lower_mip{};
        ImageTransferWrite<> higher_mip{};
    } uses = {};
    std::string name = "mip map";
    u32 mip = {};
    std::array<i32, 3> mip_size = {};
    std::array<i32, 3> next_mip_size = {};
    void callback(daxa::TaskInterface const & ti)
    {
        auto cmd_list = ti.get_command_list();
        [[maybe_unused]] auto const lower_mip_view = uses.lower_mip.view();
        [[maybe_unused]] auto const higher_mip_view = uses.higher_mip.view();
        cmd_list.blit_image_to_image({
            .src_image = uses.lower_mip.image(),
            .dst_image = uses.higher_mip.image(),
            .src_slice = {
                .image_aspect = uses.lower_mip.handle.slice.image_aspect,
                .mip_level = mip,
                .base_array_layer = 0,
                .layer_count = 1,
            },
            .src_offsets = {{{0, 0, 0}, {mip_size[0], mip_size[1], mip_size[2]}}},
            .dst_slice = {
                .image_aspect = uses.higher_mip.handle.slice.image_aspect,
                .mip_level = mip + 1,
                .base_array_layer = 0,
                .layer_count = 1,
            },
            .dst_offsets = {{{0, 0, 0}, {next_mip_size[0], next_mip_size[1], next_mip_size[2]}}},
            .filter = daxa::Filter::LINEAR,
        });
    }
};
daxa::TaskImageHandle lower_mip = task_render_image.handle().subslice({.base_mip_level = i});
new_task_list.add_task(MipMapTask{
    .uses = {
        .lower_mip = lower_mip,
        .higher_mip = task_render_image.handle().subslice({.base_mip_level = i+1}),
    },
    .name = std::string("mip map ") + std::to_string(i),
    .mip = i,
    .mip_size = mip_size,
    .next_mip_size = next_mip_size,
});
I did another iteration
i now reflect this struct for tasks
only strangeness is that uses need to be in a locally declared field called uses that must only contain uses.
makes a lot of tiny things more clear
I also made handles take a slice. Now handles uniquely identify uses
making this syntax possible as well:
new_task_list.add_task({
    .uses = {
        BufferHostTransferWrite{task_mipmapping_gpu_input_buffer},
    },
    .task = [this](daxa::TaskInterface const & ti)
    {
        auto cmd_list = ti.get_command_list();
        update_gpu_input(cmd_list, ti.uses[task_mipmapping_gpu_input_buffer].buffer());
    },
    .name = "Input Transfer",
});
much, much better
im actually completely happy with this now
i love daxa
I just fixed some things and now transient memory allocations work nicely with the task graph. ez aliased scratch memory for days.
3 lines and i have my buffer ready
Epic
its really nice rn because i have everything indirect, meaning i need a ton of tiny buffers just containing indirect command info
now i can just create them super ez in bulk
auto upsweep0_command_buffer = info.task_list.create_transient_buffer({
    .size = sizeof(DispatchIndirectStruct),
    .name = "prefix sum upsweep0_command_buffer",
});
auto upsweep1_command_buffer = info.task_list.create_transient_buffer({
    .size = sizeof(DispatchIndirectStruct),
    .name = "prefix sum upsweep1_command_buffer",
});
auto downsweep_command_buffer = info.task_list.create_transient_buffer({
    .size = sizeof(DispatchIndirectStruct),
    .name = "prefix sum downsweep_command_buffer",
});
info.task_list.add_task(PrefixSumCommandWrite{
    {.uses={
        .u_value_count = info.value_count,
        .u_upsweep_command0 = upsweep0_command_buffer,
        .u_upsweep_command1 = upsweep1_command_buffer,
        .u_downsweep_command = downsweep_command_buffer,
    }},
    .context = info.context,
    .push = {info.value_count_uint_offset}
});
i have set myself a minimum time i wanna spend on my projects now
its really fun and i just always get lazy and stop for no good reason
when are you making a game
company says nono
i have a good excuse
i guess they wont care about anything small tho actually
have you thought about writing automated tests for daxa
You should also write some Rustdoc and MdBook style docs 
automated tests
No i have not actually. should be easy to set up but i havent come around to it.
mdbook style might be cool
rustdoc is good 
md sounds nice, is it actually md, cause the current docs are all md files already
I wonder how daxa handles updating descriptor sets for things like sampler arrays. Does it track it in any way to make it safe?
No idea
I'm talking about the case of VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT, or more generally I'm wondering how descriptor sets updates for bindless stuff are handled
there is only one descriptor set which holds everything
they update the specific index
And is it up to the user to not change anything which is dynamically accessed?
user doesnt have access to this
Uhm
buffers and images are hidden by handle which is just index
yes
daxa has a storage image, sampled image and sampler descriptor array
it enables all the descriptor indexing extensions
it kinda makes it just some fancy block of data you can update however. I plan to replace the pool with a descriptor buffer in order to update the descriptors even more efficiently by just writing the mapped host ptr of the descriptor buffer instead of doing descriptor writes. Descriptor set writes can be very expensive
not that it matters much for daxa honestly
its ultra fast already but speed addiction hits always
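A rough sketch of the "one big descriptor array, update a specific index by writing mapped memory" idea. Everything here is an assumption for illustration: `DescriptorArray` is a made-up class, the memory is an ordinary `std::vector` standing in for mapped descriptor buffer memory, and the descriptor bytes are placeholders (with the real extension they would come from the driver).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a mapped descriptor buffer holding fixed-size slots.
class DescriptorArray {
public:
    DescriptorArray(std::size_t descriptor_size, std::size_t capacity)
        : descriptor_size_(descriptor_size), mapped_(descriptor_size * capacity, 0) {}

    // Overwrite exactly one slot with a plain memcpy; all other slots stay untouched.
    void write(std::uint32_t index, void const * descriptor_bytes) {
        std::memcpy(mapped_.data() + index * descriptor_size_, descriptor_bytes, descriptor_size_);
    }

    // Pointer to the first byte of a slot, for inspection.
    std::uint8_t const * slot(std::uint32_t index) const {
        return mapped_.data() + index * descriptor_size_;
    }

private:
    std::size_t descriptor_size_;
    std::vector<std::uint8_t> mapped_;
};
```

The point of the sketch is only that an update is a bounded memory write at `index * descriptor_size`, with no API call in between.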
finally
take a look at the tutorial wiki page
it shows simple use
Already reading it
daxa task graph is quite fleshed out now. it is really nice to use. kinda like vuks rendergraphs but with my own tradeoffs
I dont understand this thing. You can have multiple buffers in one task buffer?
Awesome
you *can
its quite esoteric
when you have a task that mipmaps N images
it is nice
sometimes you have a collection of resources with the same sync needs. One tbuffer or timage can represent them
it could be useful for triple buffering
this also allows you to change the count at runtime
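A minimal sketch of one virtual resource standing for several real ones. The names (`BufferId`, `VirtualBuffer`, `RecordedTask`, `execute`) are invented for illustration and are not daxa's types; the only idea shown is that a recorded task refers to the virtual buffer, so the set of real buffers behind it can change between executions.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real buffer id.
struct BufferId { std::uint32_t index; };

// Hypothetical "task buffer": one virtual resource backed by N real buffers.
class VirtualBuffer {
public:
    void set_buffers(std::vector<BufferId> buffers) { actual_ = std::move(buffers); }
    std::vector<BufferId> const & buffers() const { return actual_; }

private:
    std::vector<BufferId> actual_;
};

// A recorded task only remembers which virtual buffer it uses.
struct RecordedTask { VirtualBuffer const * uses; };

// "Executing" resolves the virtual buffer to whatever real buffers it holds right now.
inline std::vector<std::uint32_t> execute(RecordedTask const & task) {
    std::vector<std::uint32_t> touched;
    for (BufferId id : task.uses->buffers()) { touched.push_back(id.index); }
    return touched;
}
```

Swapping the backing buffers with `set_buffers` and calling `execute` again touches the new set without re-recording the task.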
im out for a few hours
rammstein concert started
okay
Descriptor buffer doesn't have that much coverage though right? So you still need the fallback
Holy based vulkan dev
its some prelude they come in a few minutes
coverage is fine its nvidia gtx 745+ and amd rx580+
its just slow driver updates. it will likely be 99% covered
BTW I wonder how db compares to updating descriptor sets in radv since all it has to do to update descriptor sets in principle is memcpy but for smaller descriptors it just manually writes stuff without even calling memcpy. So it's kind of micro optimized. That'd be a question for pixel I guess
Ah nice
you can switch underlying resources between executions of task graphs
generally you can reuse completed task graphs very well.
thats some real 
api calls have a general overhead due to indirection. mapped mem writes to descriptor buffers will likely always stomp it
Is it known whether nvidia "likes it"? From what I know updating descriptor sets on nvidia is not as simple as writing memory
Makes sense I guess
nvidia likes it
Great
Do you know how it works?
Though maybe not a discussion for this channel
Thanks guys
my only problem is how the task graph handles switching the pipeline inside a task callback. Can you get the task and update the pipeline, or do you have to rebuild the whole task graph again?
you can just switch the pipeline in the callback at any time
as long as you mention all used stuff that needs sync in the task you are fine
hmmm. I dont see how would you do it code wise.
inside the callback
you would have a pointer captured to the pipelines and the data needed to decide what pipeline to use
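A small sketch of that capture pattern, with made-up types (`Pipeline`, `RenderSettings`, `Task` and `make_draw_task` are all assumptions, not daxa's API). The callback holds pointers to the pipelines and to the data that decides between them, so the choice is made each time the task runs and the graph itself is never rebuilt.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-ins; real pipeline and task types look different.
struct Pipeline { std::string name; };
struct RenderSettings { bool wireframe = false; };

// A task is reduced to its callback here; it returns the chosen pipeline's name
// only so the choice can be observed.
struct Task { std::function<std::string()> callback; };

inline Task make_draw_task(Pipeline const * solid, Pipeline const * wire, RenderSettings const * settings) {
    return Task{[=]() {
        // Decide at execution time, not at graph build time.
        Pipeline const * chosen = settings->wireframe ? wire : solid;
        return chosen->name; // a real callback would bind `chosen` and record draws
    }};
}
```

The captured pointers must outlive the task, which is the usual caveat with this pattern.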
oh alright. I see now
I very much would but rust owo

we are planning a prototype of a c api and rust bindings
i wont promise anything
but maybe(tm)
That's nice
there is no learning
it's all super simple
(except for the tl bugs)
you arent supposed to mention those
I found a way to workaround renderdoc ruining the images btw
we need to add that
If you manage to make it rusty enough (strong use of type / trait system) then that will be very nice
I'll add it tomorrow
each task is in its own batch
Renderdoc is okay with that - for some reason once some tasks are in the same batch it ruins all the texture views
So once you go full bindless how do you do the virtual to real resource translation in the task graph? It's just integers you are sending to shaders right? Does daxa send them for you?
I don't follow
you pass them as BDA to the shader (in a push constant or in any other way you like)
You have TaskBuffers which contain one or more real buffers
daxa has functions to extract the real buffer ids from the task buffers and for translating the real buffer ids to BDA
Ah, so it's up to you to send those to the shader then
Guess I'll give the code a read
if you use the macro task graph api it sends them for you in a uniform buffer
@winged tangle is this on the wiki yet?
task graph fills a uniform buffer with all task args
any used images get a view created for the specific use, its view id is put into the buffer
any buffer gets its device address put into the uniform buffer
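A sketch of what "fill a uniform buffer with all task args" could look like on the host side. The layout is an assumption made up for this example (4-byte view ids, 8-byte device addresses, each aligned to its own size), and `BufferArg`, `ImageArg` and `ArgBlob` are hypothetical names; the real layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical argument kinds: a buffer becomes a device address, an image becomes a view id.
struct BufferArg { std::uint64_t device_address; };
struct ImageArg { std::uint32_t view_id; };

// Packs arguments back to back, padding so each value is aligned to its own size.
class ArgBlob {
public:
    void push(BufferArg arg) { write(&arg.device_address, sizeof(arg.device_address), 8); }
    void push(ImageArg arg) { write(&arg.view_id, sizeof(arg.view_id), 4); }
    std::vector<std::uint8_t> const & bytes() const { return bytes_; }

private:
    void write(void const * src, std::size_t size, std::size_t alignment) {
        // Round the current end up to the next multiple of `alignment`.
        std::size_t const offset = (bytes_.size() + alignment - 1) / alignment * alignment;
        bytes_.resize(offset + size, 0);
        std::memcpy(bytes_.data() + offset, src, size);
    }

    std::vector<std::uint8_t> bytes_;
};
```

The shader side then only sees integers: an index for each image view and an address for each buffer.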
you can alternatively also access them yourself in the task callback.
The wiki shows how to do this already
now use it!
how task list puts your things in a uniform buffer is not explained yet really but i will add that tomorrow
My codebase is in rust
I'm doing open source espionage to hopefully contribute proper support for bindless in Phobos
but yes, essentially its all integers then
so many rust ppl damn
Fr?
A lot of ppl ask for rust yeah
rust is growing really fast in graphics
its still not much tbf
its just a lot of young ppl going for rust
so in 10 y it will be goodish i would think
Sooner thanks to my espionage
@sick pilot a thing that was a bit tricky was to get the resource table threadsafe*
you can read without locks, assuming you never use after free which is checked tho (this check is just a debug thing and can die to race conditions so you still cant just rely on it but it kinda mostly finds errors most of the time)
Yea considering there's only like 1-2 main libraries used for graphics it's not mature at all
It's growing though so that's a nice thing
Ah, something to take a closer look at I guess. Thanks!
it is a bit overengineered actually. It cuts up a big array into page sections. Then only keeps allocations of these sections
idk if i like that still as i prefer simplicity even if that would mean using more memory
hmmm
i think its ok as the objects are kinda big
descriptors are very small usually so i dont care that they are fixed
i need to reevaluate this a little
its very battle tested tho so i believe it doesnt really have bugs
im thinking out loud
Thinking out loud kinda helps tbh because if I say something dumb I instantly realize it
yea
Me too because Patrick reacts with an emoji
Hopefully it will solve some issues
So can you read it locklessly because you can just copy a page and then replace the whole thing when it is written?
I wrote lucklessly instead of locklessly
loool
when i add a new page all other pages are independent from that
as opposed to a growing vector
this gives me pointer stability
addresses wont change
if i grow by adding more pages
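A minimal sketch of why growing by pages gives pointer stability. `PagedArray` is a made-up container, not daxa's resource table, and it is single-threaded as written: the lock-free reading discussed above would need extra care (for example a fixed-size array of page pointers), which this sketch does not attempt.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical container: elements live in fixed-size heap pages, so growing
// adds a page and never moves existing elements.
template <typename T, std::size_t PageSize = 4>
class PagedArray {
public:
    T & emplace_back(T value) {
        if (size_ / PageSize == pages_.size()) {
            // All existing pages are full: allocate one more, older pages stay where they are.
            pages_.push_back(std::make_unique<std::array<T, PageSize>>());
        }
        T & slot = (*pages_[size_ / PageSize])[size_ % PageSize];
        slot = std::move(value);
        ++size_;
        return slot;
    }

    T & operator[](std::size_t i) { return (*pages_[i / PageSize])[i % PageSize]; }
    std::size_t size() const { return size_; }

private:
    std::vector<std::unique_ptr<std::array<T, PageSize>>> pages_;
    std::size_t size_ = 0;
};
```

With a plain growing vector the address of element 0 can change on reallocation; here it stays the same no matter how many pages are added.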
Right, I was thinking about the case in which you replace an entry but I suppose that falls within the case of use after free which as you said is not something you actually need to worry about
That's really nice thx
well
you can always check the version
this way you can detect use after free
and version can be atomic
so that wont need sync
I noticed you've got those 8 bits for the version
yea its super tiny
for rust i would maybe go for 32
but it happens that nv only allows up to 1 million images per app
so thats 20 bits
Oh lol
so i use 8 for version
Yeah makes sense
32 bit version would make it MUCH more robust
but also make it take more space in buffers
i copied this pattern from entt
very good lib can steal tons of data oriented patterns from it
Came across it as well, it's kinda like generational references
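A sketch of the index + version handle pattern with the numbers mentioned above (20 index bits, 8 version bits). The exact bit layout and all names (`Handle`, `SlotVersions`, ...) are assumptions for illustration. The per-slot version is an atomic, so checking a handle needs no lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: low 20 bits = slot index, next 8 bits = version.
struct Handle { std::uint32_t bits; };

constexpr std::uint32_t INDEX_BITS = 20;
constexpr std::uint32_t INDEX_MASK = (1u << INDEX_BITS) - 1u;
constexpr std::uint32_t VERSION_MASK = 0xFFu;

constexpr Handle make_handle(std::uint32_t index, std::uint8_t version) {
    return Handle{(std::uint32_t{version} << INDEX_BITS) | (index & INDEX_MASK)};
}
constexpr std::uint32_t index_of(Handle h) { return h.bits & INDEX_MASK; }
constexpr std::uint8_t version_of(Handle h) {
    return static_cast<std::uint8_t>((h.bits >> INDEX_BITS) & VERSION_MASK);
}

// Hypothetical table that only tracks the current version of each slot.
class SlotVersions {
public:
    explicit SlotVersions(std::size_t slot_count) : versions_(slot_count) {}

    // Hand out a handle stamped with the slot's current version.
    Handle create(std::uint32_t index) {
        return make_handle(index, versions_[index].load(std::memory_order_acquire));
    }
    // Destroying bumps the version, which invalidates every older handle to this slot.
    void destroy(Handle h) { versions_[index_of(h)].fetch_add(1, std::memory_order_acq_rel); }
    // A handle is stale (use after free) when its version no longer matches the slot.
    bool is_alive(Handle h) const {
        return versions_[index_of(h)].load(std::memory_order_acquire) == version_of(h);
    }

private:
    std::vector<std::atomic<std::uint8_t>> versions_;
};
```

With 8 bits the version wraps after 256 reuses of a slot, so a very old handle can be mistaken for a live one; a wider version makes that less likely at the cost of bigger handles.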
Yo how is daxa rt going?
I'm just curious as daxa is more matured. No way I'm gonna jump from rust I'm too 🦀'd
Danny asking the real questions
"it's planned"
There just is a lot of small fixes and stuff higher on the priority list
i was thinking of using md book for docs
everything daxa has rn is already in md files
im working on making the c api able to be theoretically safe rust when wrapped
so it aligns with all the ownership and sharing bingus
its growing
vulkan-portability I assume?
I thought that moltenvk was not vulkan-spec compliant hm
its very spotty
we will see how it works out
but we already ran daxa code on a mac
just some smaller tests
Ooh I see I see
Lol
we are in business
to be clear it wont just be poopy c bindings
i ll wrap it all in sexy rust
"I" as in all the Rust lovers in my discord (exso, brynn, etc.)
right
daxa team grew
brynn kindly took over the work for daxa rust, so the work will probably be done even sooner.
This is what i cooked @harsh yew
yeah.. I'm not reading that ๐
lol
My template-fu lacks
mine too
it broke my head to get this to work
inline void CompileTestVariant()
{
    Variant<u32, f32> v;
    Variant<f32,i16,u8> v1 = u8(1);
    v1.set(3.4f);
    if (auto ptr = v1.get_if<f32>())
    {
        *ptr = 1.0f;
    }
    else if (auto ptr = v1.get_if<i16>())
    {
        *ptr = 5;
    }
    v1 = NullVariant{};
    if (!v1.has_value())
    {
        v1 = i16(3);
    }
}
The use is nice tho
also built-in null thingy
if constexpr (variant.index() == 0)
{
    return variant[0];
}
...
gotta have that constexpr support in there
interesting
or in this case its literally
variant[variant.index()] which is a compile-time get of the value
due to changes i make looking at better error handling for the c and rust api i find new solutions to the c++ api as well
its a whole revamp basically and i get rid of some fat c++ stl headers which is always very very nice
themperror is an amazing name
true
its very interesting to explore design options to make the c api do most of the work and still allow for a convenient small safe rust wrapper
gabe rewrote it by debloating another impl
ah i love working on daxa
the c api thing gave us so many problems
that i had to rewrite it multiple times
but i also had to rewrite the implementation which is actually great
Daxa now never internally errors or throws, it returns error codes instead.
making rust wrapper possible
finding solutions to these problems was epic fun
for example the c api does internal ref counting similar to vma. The ref counter is exposed, so the c++ and rust wrapper can write super slim wrappers as the c api does all the lifetime stuff
this also means that the c ptrs can be trivially cast to the rust and c++ type
which makes the passthrough and wrappers MUCH simpler and cleaner
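A sketch of that wrapper shape under stated assumptions: the `ex_` functions are an invented C-style API (not daxa's real one), written in C++ here so the example compiles on its own. The C side owns the counter and the lifetime; the C++ wrapper is just the pointer plus inc/dec calls, so it has the same size and layout as the raw handle.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical C-style object and functions (names invented for illustration).
struct ex_device_impl { std::atomic<std::uint64_t> refs{1}; };
using ex_device = ex_device_impl *;

inline ex_device ex_create_device() { return new ex_device_impl; }
inline std::uint64_t ex_device_inc_refcnt(ex_device d) { return d->refs.fetch_add(1); }
// Returns the previous count; a return value of 1 means this call destroyed the object.
inline std::uint64_t ex_device_dec_refcnt(ex_device d) {
    std::uint64_t const prev = d->refs.fetch_sub(1);
    if (prev == 1) { delete d; }
    return prev;
}

// Slim wrapper: holds only the raw handle, all lifetime logic stays in the C-style layer.
class Device {
public:
    explicit Device(ex_device raw) : raw_(raw) {} // takes over one existing reference
    Device(Device const & other) : raw_(other.raw_) { ex_device_inc_refcnt(raw_); }
    Device & operator=(Device const &) = delete;  // kept minimal for the sketch
    ~Device() { ex_device_dec_refcnt(raw_); }
    ex_device raw() const { return raw_; }

private:
    ex_device raw_;
};

// Same size and standard layout, so a Device can be viewed as the raw handle and back.
static_assert(sizeof(Device) == sizeof(ex_device), "wrapper must be pointer sized");
static_assert(std::is_standard_layout<Device>::value, "wrapper must be standard layout");
```

Because the decrement returns the previous value, a caller that sees `1` knows the object was destroyed by that call.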
and the c++ api will have interop with raw vulkan and the c api now
which is also awesome as that was a big missing part
i love how simple this is in the end on the user side
this also means that (as daxa internally ref counts now), the user doesnt need to care about dependencies at all anymore. for example killing an image before an image view is fine, the image view keeps it alive now
it half assed ref counted before but only outside, which made no sense but i didnt realize that before
so the c api makes the c++ api actually safer and easier to use in the end which is surprising to me
Before i started i dreaded the idea of the c api for rust.
But now i love it
well.. you have now forced refcounting bc the c interface is too anemic to support anything safe
that is not to say you can't make the decision to refcount
but the argument you make is Stockholm syndrome at best
i dont follow
the interface did refcounting before. Extending it to internal counting for dependencies is only natural
i have no clue what you mean with Stockholm syndrome here.
i mean that c apis are shit
i agree
refcounting is one thing, but being forced to refcount bc c apis are shit
nothing to celebrate
well
idk about that. I think what i tried to say here is that i realized i can do the internal refcounting also in the c api, which makes for less redundant code in the c++ and rust wrapper
also makes the c api simpler to use
"have to do it" i mean i dont have to but it makes life better for me
thats orthogonal, i don't have an issue for designing it with refcounting
i personally don't like it, but w/e
what i don't like is having to limit oneself to the stone age due to c
oh yea that caused a ton of problems
i had to rewrite some containers like optional and variant so i have a stable c abi
but it was a very cool challenge in design
was very fun to get it to work
this is the worst part by far
the rest of the changes were nice imo. Thinking of the c api made me realize general improvements that i could apply everywhere
new perspectives always give new ideas
i actually really like c apis in many cases
i love how simple they are
i often say the best c++ libraries are c libraries
I see what you mean with stockholm syndrome now but i like it. Maybe im crazy
there is a little c-bro growing inside me
just keep it in check, otherwise you might go bald
Is there anything a c api disallows? I mean the c api is not something you use directly
Are big resources ref counted? Like if you allocate a buffer, do you have a way of deallocating it, say from another thread, and in general a way of knowing for sure it has been deallocated?
Destruction of persistent resources, such as persistent buffers or images is user responsibility but DAXA still notifies the user when he fails to do that
Other objects such as device etc are refcounted though
yes
Yes to which question lol
i like to believe that the c api is actually quite usable on its own too now
both
Mmm how
i misread
the second one is semi knowable
you can enable a flag
that makes daxa error if you for example delete an image that is used by an image view
but per default it will just keep it alive
the only exception right now are images buffers samplers in regard to device. the device must always outlive all its images buffers and samplers
we'll change that for rust
and later also c++ to be consistent
the ref count methods return the prev value on decrement so you can check on it being 1
then its destroyed for sure
coherent explanation:
- every resource has a ref counter
- per default every* resource is internally refcounted. For example an image view keeps its image alive, a semaphore keeps device alive etc.
- currently there is an inconsistency with images, buffers, image views and samplers. Those MUST be destroyed before their device!
- this will be changed in the future for better consistency.
- all other resources can outlive their parent, they will silently keep it alive via the refcounter
- in the c api, you get one pointer to a resource with ref count of one. In the c api from the user side its basically manual memory management.
- but internally ofc its still ref counted, so the user doesnt need to worry about the dependencies or resources, eg device -> semaphore, image -> image view which makes the c api much easier to use.
- this also makes the c++ and rust wrapper much simpler, they dont have to care about dependencies.
- there is an optional flag that will ERROR if you destroy the user handle with other internal references still being in place, for example killing a device before a semaphore will error then.
- in c++ you can also use the c api flawlessly which gives you access to manual management if you wish. The default containers are automatically ref counted tho. And this is relatively safe as the internal ref counting is still in effect!!
- essentially in c++ you can choose between externally refcounted or manually managed + choose if the internal ref count acts as an error check or lifetime extender.
- overall this makes daxa much easier to use as internal errors due to broken dependencies are much rarer.
- in rust everything is ref counted to ensure safety.
- overall ref counting in this style has no real perf impact and makes it much easier to use. The optional flag to error on unordered destruction also makes for peace of mind if you wish to be in full control and check for errors.
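The list above can be sketched roughly like this. It is a toy model under stated assumptions: `Object`, `make_child`, `destroy` and the `error_on_dependents` flag are invented names, the counter is not atomic, and "destroyed" is just a bool. It shows the two modes: by default a child silently keeps its parent alive, and with the flag set, destroying a parent that still has dependents returns an error code instead.

```cpp
#include <cassert>
#include <cstdint>

enum class Result { Success, ErrorStillReferenced };

// Hypothetical resource: e.g. an image (parent) and an image view (child).
struct Object {
    Object * parent = nullptr; // the object this one depends on, if any
    std::uint32_t refs = 1;    // starts with the single reference owned by the user
    bool alive = true;         // stands in for the underlying resource existing
};

// Creating a child takes an internal reference on the parent.
inline Object make_child(Object & parent) {
    ++parent.refs;
    Object child;
    child.parent = &parent;
    return child;
}

// Drop one reference; the last one destroys the object and releases its parent.
inline void release(Object & o) {
    if (--o.refs == 0) {
        o.alive = false;
        if (o.parent != nullptr) { release(*o.parent); }
    }
}

// User-facing destroy. With the flag set, dependents still holding references are an error.
inline Result destroy(Object & o, bool error_on_dependents) {
    if (error_on_dependents && o.refs > 1) { return Result::ErrorStillReferenced; }
    release(o);
    return Result::Success;
}
```

In the default mode the destruction order does not matter; in the strict mode a successful `destroy` tells the caller the object is really gone.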
@sick pilot sorry for the mumbling before, this explanation should make it clear
So to answer your second question,
there is an optional flag that will ERROR if you destroy the user handle with other internal references still being in place, for example killing a device before a semaphore will error then.
With this flag you can be sure of the destruction order.
Mmm I see thanks.
With vulkano you are always forced to use Arc for pretty much anything. I didn't like this because it meant I lost any concept of ownership. Things like threads for deallocations would become harder and it was easy to keep something alive by mistake because Arcs easily get scattered everywhere.
some parts are not ref counted in the rust api
for example command list is not
makes no sense to ref count it in rust
but in rust you dont have a choice if you want it safe
the dependencies are not manageable without massive constraints otherwise
actually maybe im wrong
how would you like it pac
I'm biased because I'm basing this on just one bad experience, my opinion is that safety is an open problem in graphics and trying to solve it shouldn't get in the way of flexibility.
i dont think that makes sense in the context of rust
well in rust one should always try to use safe wrappers for everything as much as possible
Yeah ik
otherwise the language is needlessly restrictive
but i think the mode where daxa errors if you dont delete in right order might be cool then
still safe
but you know whats going on
But like, an engine is not just a renderer, even if all of the renderer is unsafe there is much to be gained in the rest of the codebase. Anyway yeah that sounds good
The reason I say that is, if you force Arc on "big resources", then basically deallocating them too late is as dangerous as deallocating them too soon
yea i agree
