#Phobos - Vulkan abstraction library in Rust
IT FIXED ITSELF
IDK WHAT I DID
BUT I DID SOMETHING
🥳
WOOOOH
SPONZA NEVER HAVE I BEEN EVER HAPPY TO SEE YOU
i am crying rn i figured it out. i set the # of instances for my build_info's range but didn't do it for querying the build size
My laptop 1050 starts slowing down at around 3m triangles which is pretty amazing of itself
So if a desktop 2080 ti is getting bottlenecked by a single 1m tris then something must be very wrong
oh no it loads fast now
idk if i should multithread it and make it concurrent
eh premature optimization is bad
I can reach nanite level of performance with a single thread, you're good
UE5
Strictly in terms of triangles/s 
fellas
im proud to announce
after years, decades, centuries
monke
i think i did the rotation wrong however ;kek
so how would define
the .transform
I'm so confused
.transform seems to take in a TransformMatrix type
but I cannot find in its documentation how I would input my floats + vecs
let transform = vk::TransformMatrixKHR {
matrix: [...],
};
phobos::TransformMatrix(transform);
Is not possible as it is a private constructor

bro what is there not a way to create a transform matrix with your own transforms. there is literally only identity
desperate times call for desperate measures im gonna make a pr
uh penguin is texture loading not supported?
let image = images.get(gltf_image.index()).unwrap();
let image_data = &image.pixels;
let image_phobos = phobos::image::Image::new(
ctx.device.clone(),
&mut ctx.allocator,
image.width,
image.height,
vk::ImageUsageFlags::SAMPLED | vk::ImageUsageFlags::TRANSFER_DST,
get_image_type(image.format),
vk::SampleCountFlags::TYPE_1,
)
.unwrap();
the format is R8G8B8A8_SRGB
It keeps saying I need to specify a depth?
I forgor
You can't actually construct it with things yet
Where depth
Image loading should work
LMAOOO pls fix
I might be messing it up? I'm not entirely sure. It's late rn so i'll send you the code later
I simply want to create images for my gltf asset. I also create a buffer to store the image itself, but iirc you don't need to do anything prior to get images to work, yeah?
Yeah I'll do it later today, I'm a bit busy rn
Eh i can make a pr tbh if you can't
You don't have to create a buffer at all
The image manages its memory
Wait I think I'm screwing up then
Where would you specify that actual data of the image
You create a command buffer and upload the data
Probably through a buffer-image copy yeah
Ohh ok cuz I'm trying to get textures
Ah that might be why then 
I was just making an image without doing the copy
Yeah it's gonna be empty then lol
But that shouldn't be causing any errors
Idk what your full code is
Ill share it when i wake up tomorrow
Alright
Huh I realized. I should be moving my buffers onto my gpu
All of my buffers have been CpuToGpu 
Iirc i make a staging buffer which is cputogpu then add to the command buffers cmds to copy that gpu information over
Yep that's pretty much it
I guess I should also merge all my vertices together into one monolithic one while I'm at it. Wasn't this called bindless or something?
Ah wait nvm it means not making binds (referring to buffers by their bda) im stupid lmaooo
You can do that yeah
Wait there's no cpu only memory option?
HOST_VISIBLE | HOST_COHERENT | HOST_CACHED is CPU_ONLY
Ohh it's been moved into flags I gotcha
Penguin
here is the exact error:
ERROR phobos::core::debug > [VALIDATION]: VUID-VkImageCreateInfo-imageCreateMaxMipLevels-02251 (-1094930823): Validation Error: [ VUID-VkImageCreateInfo-imageCreateMaxMipLevels-02251 ] Object 0: handle = 0x1b9bf6ba320, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0xbebcae79 | vkCreateImage(): Format VK_FORMAT_R8G8B8_SRGB is not supported for this combination of parameters and VkGetPhysicalDeviceImageFormatProperties returned back VK_ERROR_FORMAT_NOT_SUPPORTED. The Vulkan spec states: Each of the following values (as described in Image Creation Limits) must not be undefined : imageCreateMaxMipLevels, imageCreateMaxArrayLayers, imageCreateMaxExtent, and imageCreateSampleCounts (https://vulkan.lunarg.com/doc/view/1.3.243.0/windows/1.3-extensions/vkspec.html#VUID-VkImageCreateInfo-imageCreateMaxMipLevels-02251)
error: process didn't exit successfully: `target\debug\DARE.exe`
How DARE.exe you
Danny's Awesome Rendering Engine
Lol
Try rgba
Oh im loading from a file
i dont think i can
wait nvm
i can im a dumbass
R32G32B32A32_SFLOAT works idfk why everything else doesn't
RGBA seems to only be supported
wtf
@hybrid pilot yeah that fixed it...? But how tf am I supposed to conform my textures which im loading from gltf to it
Are you trying to create an image on HOST_VISIBLE memory
My only flags rn are SAMPLED and TRANSFER_DST so I doubt so
this is phobos
let image_phobos = phobos::image::Image::new(
ctx.device.clone(),
&mut ctx.allocator,
image.width,
image.height,
vk::ImageUsageFlags::SAMPLED | vk::ImageUsageFlags::TRANSFER_DST,
vk::Format::R8G8B8_SRGB,
//get_image_type(image.format),
vk::SampleCountFlags::TYPE_1,
)
.unwrap();
i don't really have access to the vmaCreateInfo cuz:
- this is rust
- phobo'ing time
Huh
it's basically always device local
You can just look at the source you know
Anyways looks like some values are undefined for some reason
oh it is but that really shouldn't be causing any issues?
Yeah
🫡 ethereal bug
ill run it through the debugger and see what is causing it
thats odd, everything it is complaining about is valid
You might have to do some conversions
seems like a massive waste of gpu memory though
ahh ill dig more into it
well seems like my gpu only supports r8g8b8a8
sadge
Wait you were trying to create a r8g8b8 image?
bruh
Quite literally no GPU in the world supports that format for color 
LMAOOO i didn't realize it's not supported whatsoever i was so fixated on the vulkan specs saying stuff was null
Check on GPU info for support
yeah thats what i did i was coping maybe gpu didnt support it and my code wasnt broken as my copium came true. the real question is how to convert it all to rgba 
If you are using stb image you can ask it to load all 4 channels regardless
i wish
i'm using the in-built crate loader. doesn't seem like they offer a way to force it all rgba
Otherwise you can just do a shrimple conversion
I can only speak C++ though so my example will be C++
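// assume it is a 3-channel texture
// needs <cstddef>, <cstdint> and <vector>
auto texture = load_texture();
const unsigned char* data = texture.data;
auto rgba = std::vector<uint32_t>(texture.width * texture.height);
for (std::size_t i = 0; i < rgba.size(); ++i) {
    // widen to uint32_t before shifting so the << 24 cannot overflow a signed int
    rgba[i] =
        (uint32_t(data[i * 3 + 0]) << 24) |
        (uint32_t(data[i * 3 + 1]) << 16) |
        (uint32_t(data[i * 3 + 2]) << 8) |
        255u;
}
// note: on a little-endian host this stores the bytes as A, B, G, R in memory;
// reverse the shift order if the image expects R, G, B, A byte order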
its been so long c++ 
oh wtf iterating through the array by ptr
https://github.com/gltf-rs/gltf/issues/376 seems like im not alone in this issue
oh no i havent gotten close to bit stuff yet thats why it's indecipherable
It's easy, r8g8b8a8 means 8 bits per channel
yeah that makes sense, im just confused about <<
You want to convert r8g8b8 to r8g8b8a8, so what you do is shift everything 8 bits to the left, and add 255 or 0xff to the end
no
understand this
written by a human for a human (I assume)
wat
i think ill have to do that manually then sadge
You have no << in rust
https://doc.rust-lang.org/std/ops/trait.Shl.html you are lying
wtf
i couldn't find it
oh wait there is a better way so i can also support multiple component counts
I do it with iterator
like you have your u8 vec and you do .chunks(3).flat_map(|p| [p[0], p[1], p[2], 255])
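A minimal sketch of the iterator approach from the message above, assuming tightly packed 8-bit RGB input. The function name `rgb8_to_rgba8` is made up for illustration; `chunks_exact` is used instead of `chunks` so a trailing partial texel is dropped rather than causing an out-of-bounds panic.

```rust
/// Expands tightly packed RGB8 texels to RGBA8 with a fully opaque alpha.
/// `chunks_exact(3)` skips a trailing partial texel instead of panicking
/// on `p[2]`, which plain `chunks(3)` could do on malformed input.
fn rgb8_to_rgba8(rgb: &[u8]) -> Vec<u8> {
    rgb.chunks_exact(3)
        .flat_map(|p| [p[0], p[1], p[2], u8::MAX])
        .collect()
}
```

Calling `rgb8_to_rgba8(&pixels)` on the loaded image bytes gives a buffer that is 4/3 the size and can be uploaded as an R8G8B8A8 format.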
im trying to figure out a way to have it work across all the possible types

Do you see the i * 3
change 3 to component_count
and then switch on component count, with fallthrough bitwise shifts
multiplied by the component count
Same but T::MAX instead of 255
nono like 8, 16, 32 bits
Don't think about that
You won't have to deal with those textures

what do yall think the chances of me running into a r8 texture gonna be
sorry for asking but would this do an arithmetic overflow? At least that is what the rust compiler is complaining about however we are operating on two different-ish types i think? I'm using a &[u8] which is basically an array of bytes
cast to u32
C++ integer promotion rules are fucked and Rust decided to not deal with any of that and making you do everything
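Roughly what "cast to u32" looks like for the shift version, as a sketch. The function names are made up; the point is only that each `u8` is widened before shifting, because Rust does not promote integers implicitly and a `u8` shifted by 24 overflows its 8 bits.

```rust
/// Packs one RGB texel into a u32 as 0xRRGGBBAA with alpha forced to 0xFF.
/// Each u8 is widened with `u32::from` first; shifting the raw u8 by 24
/// is the overflow the compiler complains about.
fn pack_rgb_to_rgba(r: u8, g: u8, b: u8) -> u32 {
    (u32::from(r) << 24) | (u32::from(g) << 16) | (u32::from(b) << 8) | 0xFF
}

/// Applies the packing to a tightly packed RGB8 slice.
fn pack_image(rgb: &[u8]) -> Vec<u32> {
    rgb.chunks_exact(3)
        .map(|p| pack_rgb_to_rgba(p[0], p[1], p[2]))
        .collect()
}
```

Note that the packed value only has R, G, B, A byte order in memory when written out big-endian (`to_be_bytes`), so the byte-wise iterator version avoids that concern entirely.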
what do you think you've been programming up until now
programming itself is a bunch of hacks piled up on top of each other
we hacked sand and tricked it into thinking, what's one more hack
My thought is to worry about it when you come across it
heightmaps 
ah it doesn't work nvm
seems like ill have to do what pac85 does cuz rust is throwing a fit about how i shouldn't be going over my array
me neither 
i just get a happiness hit when i see
vec.iter().map().collect() it feels so nice pretty sure cpp has its own way
istg you guys have a map function
but me functional programming :(
Probably std::ranges has something
well for is functional but shhh
But ranges are trash so nobody uses them
LMAOO
ill go back to cpp when they finally get modules to work (they will never)
oh shoot i could technically make your way work, it's just that i need to allow arithmetic overflow
Just do what pac did if it works
yeah but i wanna use this learn more about bitwise operators
eh pac is probably better as it would be "safer" idfk know though i'm assuming as there is a possibility you read outside of the intended data
I am only capable of writing unsafe code
this is what c++ does to you
data.chunks(component_size * component_count)
.flat_map(|p| {
vec![
p.get(0..component_size).unwrap_or(&[0]).to_vec(),
p.get((component_size)..(2 * component_size))
.unwrap_or(&[0])
.to_vec(),
p.get((component_size * 2)..(3 * component_size))
.unwrap_or(&[0])
.to_vec(),
match component_size {
2 => (65025u16).to_be_bytes().to_vec(),
4 => (4228250625f32).to_be_bytes().to_vec(),
_ => vec![255u8],
},
]
.into_iter()
.flatten()
})
.collect::<Vec<u8>>()

@iron shadow i added methods to the transform matrix
the plan is to finish up 0.10 tomorrow and release it
why are you writing 65025u16 instead of u16::MAX
indeed
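A sketch of the same idea with the type constants instead of hand-typed numbers. Assumptions not confirmed in the chat: input always has 3 components, 1- and 2-byte components are unsigned-normalized integers, and 4-byte components are 32-bit floats where opaque alpha is 1.0. Function names are hypothetical.

```rust
/// Bytes of a fully opaque alpha component, in native byte order.
/// Assumed mapping: 1 byte = u8, 2 bytes = u16, 4 bytes = f32 (opaque is 1.0).
fn opaque_alpha(component_size: usize) -> Option<Vec<u8>> {
    match component_size {
        1 => Some(vec![u8::MAX]),
        2 => Some(u16::MAX.to_ne_bytes().to_vec()),
        4 => Some(1.0f32.to_ne_bytes().to_vec()),
        _ => None, // unsupported component size
    }
}

/// Appends an opaque alpha component to every 3-component texel.
fn add_alpha(data: &[u8], component_size: usize) -> Option<Vec<u8>> {
    let alpha = opaque_alpha(component_size)?;
    let mut out = Vec::with_capacity(data.len() / 3 * 4);
    for texel in data.chunks_exact(3 * component_size) {
        out.extend_from_slice(texel);
        out.extend_from_slice(&alpha);
    }
    Some(out)
}
```

Returning `None` for an unknown component size keeps the caller from silently uploading garbage alpha values.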
me neither but it converts any given component size to rgba 
guh
magick
alright then
well the transform matrix stuff is added
a little different from your (now closed) pr but simpler I think
btw the old commits are added because i squashed yours when merging
ohh i see
how would i get those commits ommited cuz if i did pr again, it would be annoying to have those prs still merged again
overwrite the branch with origin/master
simplest non git fuckery way is probably just to delete the fork and make a new one, better is to make each pr on a "patch" branch and sync master periodically to the main repo
aye
how would you chain a bunch of commands together? I got something like this
let mut image_commands = ctx
.execution_manager
.on_domain::<phobos::domain::Compute>()
.unwrap();
for image in images {
image_commands = image_commands
.copy_buffer_to_image(
&staging_buffer.view_full(),
&image_phobos.view(vk::ImageAspectFlags::NONE_KHR).unwrap(),
)
.unwrap();
}
But i have no clue if this would be considered best practice
and also seems like it does f up everything else as if i wanted to use ctx, it would block it due to it to the top line
could i really just make a bunch of commands then do a batch submission? iirc seeing that on the execution manager
what youre doing now is also what i do
ImageAspectFlags::NONE is suspicious
use COLOR
also avoid unwrapping as a general best practice, using the ? operator is much nicer
ah i should probably return a result
yeh
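For reference, a small sketch of the `?` pattern being suggested. Everything here is hypothetical (the error type, `check_pixels`, `upload_all`) and only stands in for real upload code; it shows a function returning `Result` and bailing out early on the first error instead of panicking.

```rust
/// Hypothetical error type standing in for whatever a real upload can fail with.
#[derive(Debug, PartialEq)]
enum UploadError {
    EmptyImage,
    SizeMismatch { expected: usize, actual: usize },
}

/// Hypothetical validation step: checks that the pixel data is RGBA8-sized.
fn check_pixels(width: usize, height: usize, pixels: &[u8]) -> Result<usize, UploadError> {
    let expected = width * height * 4;
    if expected == 0 {
        return Err(UploadError::EmptyImage);
    }
    if pixels.len() != expected {
        return Err(UploadError::SizeMismatch { expected, actual: pixels.len() });
    }
    Ok(expected)
}

/// `?` returns the error to the caller; `unwrap()` would panic here instead.
fn upload_all(images: &[(usize, usize, Vec<u8>)]) -> Result<usize, UploadError> {
    let mut total = 0;
    for (width, height, pixels) in images {
        total += check_pixels(*width, *height, pixels)?;
    }
    Ok(total)
}
```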
also seems like it does f up everything else as if i wanted to use ctx, it would block it due to it to the top line
If this is an actual issue you can clone the exec manager and use that, its refcounted internally
? succinctly tells the reader that you have no idea what is happening in the given code and that the prayerbook should be kept close
i see it more of a procasination tool : im too lazy to deal with this error, go deal with it later 
oh god my image is a black screen
it isn't copying over correctly
?? tells the reader this code is a big blunder and should not be taken seriously
yeah you have NONE as your aspect flags
nono i fixed that
oh alright
i think i fixed it? now it realized i should probably keep my buffers alive as they're dropping before the command buffer has a chance to submit it
that is not good yeah
yooo images finally work
so when i submit all those commands, is it doing in parallel or 1 by 1? or is that depends on how the gpu is feeling on that day
depends on the weather
i see i see ill tell god to make it snowy for the gpu
if the gpu likes snow sure
but you can count on a certain bandwidth of copy being parallel id expect
Commands start in the order you submit them, but if commands need the same hw unit they can't always run in parallel
Buffer copies are compute shaders on some GPUs (amd)
that's weirdd
oh well seems like nvidia is fine with me putting it on the compute as well
What is weird?
how it is compute instead of transfer unless im tripping
What I mean is that copies are done by a compute shader by the driver lol
But you can run different compute workloads in parallel
Most big gpus do have a separate async copy unit
So you can expect transfers to happen in parallel with other stuff
Amd used to have it
Not anymore?
I think not
Hmm
Well anyway, compute implies transfer support
Thats why the compute domain also exposes transfer commands in phobos
Well I can't tell for sure what the hw has but I can tell for sure that radv does copies with a compute shader for anything newer than southern islands~ no that was bullshit, I misunderstood stuff. The actual code to choose between cp_dma and cs is more involved
Some legacy hw lingers on
Could still be there dunno
Mmm
there is also dma hw in the CP
and then finally you can use shaders
cornucopia of options
Yeah that's what is used in the si path
In any case it was just an example
To say that just because something looks like dedicated hw in vulkan it might not be
Yeah even different queue families don't necessarily map to different hw
do you mean queues
uh dumb question, i'm a bit confused as to how the gltf sampler maps to vulkan?
OHH NVM im being brain dead
i thought the values represented like pixels but they represent enums
Most of them probably map to vk sampler settings
let phobos_sampler = phobos::Sampler::new(
ctx.device.clone(),
vk::SamplerCreateInfo {
mag_filter: translate_filter(
gltf_sampler
.mag_filter()
.unwrap_or(gltf::texture::MagFilter::Linear),
),
min_filter: translate_filter(
// NOTE: this reads mag_filter() again; min_filter() is probably what was intended
gltf_sampler
.mag_filter()
.unwrap_or(gltf::texture::MagFilter::Linear),
),
mipmap_mode: vk::SamplerMipmapMode::LINEAR,
address_mode_u: translate_wrap_mode(gltf_sampler.wrap_s()),
address_mode_v: translate_wrap_mode(gltf_sampler.wrap_t()),
address_mode_w: vk::SamplerAddressMode::REPEAT,
mip_lod_bias: 0.0,
anisotropy_enable: vk::FALSE,
max_anisotropy: 0.0,
compare_enable: vk::FALSE,
compare_op: vk::CompareOp::ALWAYS,
min_lod: 0.0,
max_lod: 0.0,
border_color: vk::BorderColor::INT_OPAQUE_BLACK,
unnormalized_coordinates: vk::FALSE,
..Default::default()
},
);
I genuinely have no clue if these are good defaults
Make max_lod VK_LOD_CLAMP_NONE I think is what it's called
Enabling anisotropic filtering is probably also a good idea
Ok strike that, I assumed that because the function to enqueue the copy command started with si it was only used for si 🤦‍♂️
The actual condition is more complicated
oh wtf what is the gltf default sampler
I was gonna send you my code to handle the gltf sampler but it looks like I just ignore it
Guess you can take inspiration from that :p
ill rummage around in the gltf spec to see if they defined it
Just define your own
You most likely want linear filtering and linear mip transitions with anisotropy enabled and repeat wrap mode
Yep that's what I do if not defined
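A sketch of the "values are enums" point with the suggested fallbacks. glTF stores sampler settings as OpenGL enum integers; the `Filter` and `AddressMode` enums below are local stand-ins (an assumption for this example) for `vk::Filter` and `vk::SamplerAddressMode`, so the snippet does not depend on ash or the gltf crate.

```rust
/// Local stand-ins for the Vulkan enums; a real renderer would use
/// `vk::Filter` and `vk::SamplerAddressMode` instead.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Filter {
    Nearest,
    Linear,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum AddressMode {
    Repeat,
    MirroredRepeat,
    ClampToEdge,
}

/// Maps a glTF magFilter value (an OpenGL enum, not a pixel count).
/// 9728 is NEAREST, 9729 is LINEAR; anything missing falls back to linear.
fn translate_mag_filter(gltf_value: Option<u32>) -> Filter {
    match gltf_value {
        Some(9728) => Filter::Nearest,
        _ => Filter::Linear,
    }
}

/// Maps a glTF wrapS / wrapT value. 33071 is CLAMP_TO_EDGE,
/// 33648 is MIRRORED_REPEAT, 10497 is REPEAT (also the glTF default).
fn translate_wrap(gltf_value: Option<u32>) -> AddressMode {
    match gltf_value {
        Some(33071) => AddressMode::ClampToEdge,
        Some(33648) => AddressMode::MirroredRepeat,
        _ => AddressMode::Repeat,
    }
}
```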
ah how do you convert a gltf transform to a vulkan one again
also how tf do you do textures
i want it so i can just refer to textures by their bda
descriptor indexing
there is no vulkan transform
okok
wil look into it
I'm a bit confused...
What is the purpose?
So is it a way to bind multiple resources and refer to them as an index in an array?
hmm how would i do this in phobos
phobos seems like it can support it, but like i'm not 100% sure if:
- im meant to touch those functions
- allocation callback (???)
ahh im so confused on descriptor indexing
Allocations callbacks are for using custom allocators for the host side
hmm pretty sure phobos doesn't want me to access that
i wanna see if penguin added an impl
What do you need allocation callbacks for 
im trying to do descriptor indexing and its leading me down a path of making a descriptor pool + sets
You can do descriptor indexing I just need to add some internal code to set the proper flags
oh
Ill try to add it today
ill wait then. hmmmm what other things i should do. oh a camera
take your time bro i feel so bad
Nah it's fine lol
i should probably also relearn vulkan 
ill do both vkguide using vulkan-hpp then vulkan-tutorial on rust copium that prevents me from going brain dead and ctrl+c + ctrl+v my way through
oh my god i spent over 60+ hours in the ide in the past week
. the addiction cannot be stopped
Wtf that's like 9h a day

my friends all normal talking about elden ring and im frothing at the renderdoc of it 
time to remove buffer usage flags
breakage 3000 incoming
it does break but easy to fix
How do you plan on supporting descriptor indexing? I'm so confused by the process
Pretty easy
An unbounded array in the shader is seen as a descriptor indexed array using reflection
Then there's just a bind_sampled_image_array function to bind all your images
And done
As long as you non-uniformly index it 
Ohh so that's what you plan on adding
Oh yeah can you do the same to buffers or not?
Or is this an image exclusive thing?
For buffers just use the device address
Ohh ok wait what?
Just send device address and use it in the shader
How do you change individual images?
That would involve updating descriptor sets which isnt being done atm
Might be worth looking into though
Mmm
Ohh what bind function would that be?
Oh wait is it ubo
Wonder how other rendegraphs do it
There is no bind
Maybe by making bindless a "first class citizen"
Bindless in my thing is the only citizen 
Lol
A new one
Wait only bindless is really bad?
The exact opposite, it's ridiculously good
I don't think so
That's what I thought, but it also means graphics programmers can get a bit diabolical with vram
It means you don't have to pretend the hw can't do stuff it can do
OHHHH
Yeah basically you just get to throw pointers at the gpu and dereference them
I wish I had access to the whole GPU memory subsystem
That way I wouldn't have to readback marked pages and make them resident/evict them
I could manage memory directly on the GPU 🥲
I wish I could write rust on the gpu
Is that exposed to userspace at all?
Well
There is a thing for rust shaders
But meh
Sparse binding?
yeah
But maybe in principle the GPU could write its own page table?
yes
As long as you map it
But you would have to hack the kernel driver to do that I guess
Like maybe map it to an hardcoded address then access it from a shader
So you don't have to touch the rest of the stack
Oh wtf rustgpu
That looks cool, is it any good?
We were discussing it
Outside of here
Seems like it lacks essential features for compute shaders
(Barriers)
But it is promising at least
wtf
Yeah I personally wouldn't use it
What's not clear?
me when
me when no GL_KHR_shader_subgroup_ballot
Lol
it's not clear how rustgpu doesn't have barrier()
third class language
smh
It doesn't have it yet
Should have it eventually
No idea why it doesn't have them yet
I don't understand why features like these aren't given absolute maximum priority
BDA is literally necessary to move on from the garbage "bind and use" model
Petition to deprecate bin-
One day
(he got shot by nvidia snipers)
LMAOOO is nvidia that adamant on bind and use
Does nvidia hw like binds?
I wanted to look into how descriptor sets work on nvk and never got around to it
i was just doing a meme, but nv descriptors are a bit more convoluted than amd ones
nvk 
All I can remember was jerry rigging that to work so I could submit my raytracing cs project
I don't think nvk does rt
It doesnt i think that was nvh
But iirc nvk is also used in their tutorial
Which was what i built off
Poggies
ill upgrade the egui integration later today
Fair
Hey Penguin, you got an idea why ash and the ash re-exports phobos exposes are just never detected by my IDE?
Do I need to do static linking?
? It is like the default in rust.
What do you mean they aren't detected?
Yeah, but it's really weird as there is like no autocomplete whatsoever
like it won't recognize classes exist
i tried importing regular ash too
and the issue persists which is weird
thats weird
it works for me
takes some time to index sometimes but it works eventually
time to do a hard ide reset
like it's not detecting types like vk::Bool32
is there a way to force clion to index it?
Use vim


ping me when support for descriptor indexing and giving shaders buffers via bda is added to phobos
All you need for bda is GL_EXT_buffer_reference
you can do bda already
And vkGetBufferDeviceAddress
a bda is just a uint
just send it through a push constant or something
or in a ubo if you have lots of them
but you can literally just write the return value of address() to some memory visible in the shader and use it like you would in glsl
oh so im just realistically waiting for the descriptor indexing for textures
wait hm how would I synchronize all those buffers?
like "No one else is allowed to write to this until X is finished reading" like a RwLock
Inside the shader with a barrier(), outside you have to add barriers yourself
OHH memory_barrier that makes sense
Disclaimer: barrier only synchronizes a workgroup
workgroup?
ohh compute? I think the ray tracing shaders still are covered
unless I'm wrong
Probably ub
you can use subgroup vote to synchronize in shaders other than compute (or mesh/task)
if (subgroupElect()) {
write();
}
subgroupBarrier();
Yes
You can't effectively synchronize all threads
You can only synchronize within a subgroup, so be careful
im so confused by these words, I havent done any compute
Don't worry about it until you need it
wait so when doing bda for my buffers, does this bind my buffers?
or is the shader seriously just reading it from gpu memory using that address

There is no binding
huh okay so all this descriptor binding stuff ive been doing was mainly to just make a pointer?
sorry for asking such basic questions
Let me give you a basic rundown on how BDA actually works
The classic binding model involves the following
```cpp
vkCreateDescriptorSetLayout(&set_layout);
// you allocate a descriptor set to *bind* to the GPU, a descriptor set is a fancy pointer to resources which right now is nullptr
vkAllocateDescriptorSets(&set, &set_layout);
// Now you *write* to the descriptor set with your resources
vkUpdateDescriptorSets(&set, 1, &write);
// Then you finally *bind* the set
vkCmdBindDescriptorSets(1, &set);
```
We call this bindful because we need to bind descriptor sets
BDA works like this:
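```glsl
// In the shader
// (buffer references need GL_EXT_buffer_reference; uint64_t needs an int64
// extension such as GL_EXT_shader_explicit_arithmetic_types_int64)
layout (buffer_reference) buffer X {
    LotsOfStuff data[];
};
layout (push_constant) uniform PushConstants {
    uint64_t address;
};
void main() {
    X ptr = X(address);
}
```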
And in your vulkan app you just do a shrimple
```cpp
auto address = buffer.address();
vkCmdPushConstants(1, &address);
```
There is no vkCmdBind anything
this seems suspect to me

ohhh that's why so many developers live and die by bind less
what if you wanted to push an arbitrary array of them? You simply just change it to?
layout (push_constant) {
uint64_t[] addresses;
};
?
no
if you need a lot of addresses you should first consider your life choices and then you make a UBO
but my rt shaders can't know which vertex buffer is gonna actually be used
Make only one vertex buffer
If not, have an object list with each object having a uint64 address to its vertex buffer
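A host-side sketch of such an object list, under assumptions: the record layout (two `u64` addresses, 16 bytes) and the names `ObjectRecord`, `encode_objects` and `decode_object` are invented for illustration, and the addresses would really come from something like `vkGetBufferDeviceAddress`. The bytes produced here are what you would copy into a GPU buffer; a shader would then index the list with an object index.

```rust
/// One record per object. Field names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ObjectRecord {
    vertex_address: u64,
    index_address: u64,
}

/// Size of one serialized record: two u64 values.
const RECORD_SIZE: usize = 16;

/// Serializes the object list into native-endian bytes for upload.
fn encode_objects(objects: &[ObjectRecord]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(objects.len() * RECORD_SIZE);
    for object in objects {
        bytes.extend_from_slice(&object.vertex_address.to_ne_bytes());
        bytes.extend_from_slice(&object.index_address.to_ne_bytes());
    }
    bytes
}

/// Reads record `index` back, mirroring the lookup a shader would do.
fn decode_object(bytes: &[u8], index: usize) -> Option<ObjectRecord> {
    let start = index.checked_mul(RECORD_SIZE)?;
    let record = bytes.get(start..start.checked_add(RECORD_SIZE)?)?;
    let mut vertex = [0u8; 8];
    let mut idx = [0u8; 8];
    vertex.copy_from_slice(&record[0..8]);
    idx.copy_from_slice(&record[8..16]);
    Some(ObjectRecord {
        vertex_address: u64::from_ne_bytes(vertex),
        index_address: u64::from_ne_bytes(idx),
    })
}
```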
What is sus sir
So those can't be tracked by the graph I imagine
Or can you explicitly say "I'm gonna need this buffer but don't bind it cuz I'll just send the address"
oh real
Though you'd probably use buffer address for persistent stuff where sync is not necessarily a concern
yeah but what if my scene changes and stuff
add, removing, idfk about culling as i have no clue how that works exactly
If you make incoherent read/writes that could even change from subgroup to subgroup, you need atomics/locks
But this scenario is never going to happen tbh
If you have a specific example I might give you more details/realistic approach
Rn it's mainly just to ray trace a scene that is gonna be static for a bit
but I wanna do physics simulations with gravity where many objects are gonna be transformed as a result of it
but idfk like if I removed an object from my scene, I should remove it from the device buffer as well
but I'm just unsure if that wouldn't be optimal as I think I would need to more or less restream the data from the cpu back to the gpu but now slightly changed
Hold on the neurons are firing what if I did an "inverted slice" of the removed memory, copy all data to the left and right of it into one massive new buffer I have allocated for the new size? My worry then is mainly sync issue as what if I have an AS or some other object that is referencing my buffer, but now that information is old?
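The "inverted slice" idea, sketched on the CPU with plain byte vectors. This is only an illustration of the copy and of why stale references are the real problem: anything that stored an offset into the old buffer has to be remapped (or rebuilt) afterwards. Function names are hypothetical.

```rust
use std::ops::Range;

/// Builds a new buffer containing everything except `removed`.
/// Returns None if the range does not lie inside the buffer.
fn compact_without(data: &[u8], removed: Range<usize>) -> Option<Vec<u8>> {
    if removed.start > removed.end || removed.end > data.len() {
        return None;
    }
    let mut out = Vec::with_capacity(data.len() - removed.len());
    out.extend_from_slice(&data[..removed.start]); // everything left of the hole
    out.extend_from_slice(&data[removed.end..]); // everything right of the hole
    Some(out)
}

/// Remaps an offset into the old buffer to the compacted one.
/// Offsets past the hole shift left; offsets inside it no longer exist.
fn remap_offset(offset: usize, removed: &Range<usize>) -> Option<usize> {
    if offset < removed.start {
        Some(offset)
    } else if offset >= removed.end {
        Some(offset - removed.len())
    } else {
        None
    }
}
```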
You can do that
Fantastic!
So if you use a buffer bound that way across passes you can get it synced for you
Really nice
Yep
You're just reinventing an allocator at this point
Look up some algorithms for that
Sparse bindings make allocation so much easier
There are complex ways of solving this
Yeah not a trivial problem
But really this is so much easier than you think
```cpp
dispatch(); //physics shader, writes to many buffers
barrier();
build_tlas();
barrier();
trace_rays();
```
That's it
This is all host side btw
If you write from the host to a buffer used by the gpu it's even easier
all you have to do is allocate some HOST_COHERENT memory
Yeah, you could implement caching, virtual memory allocators, sparse resources, write to and from a page table from the GPU, request and mark pages asynchronously and all the ridiculously complicated stuff you want
But just, don't? 
Probably no updates for a week or so
My laptop decided to delete its own kernel
And I didn't bring my installer usb
You are a penguin, so you are linux. Stick your finger in the USB port and boot yourself
right
ill try that
Maybe I can buy one here but shits expensive here and I already bought a new phone charger
I tried to install a new graphics driver
And it failed and decided to take the kernel with it
Uhm do you use nvidia?
I have a nvidia card in there but it doesn't work with my compositor so I generally use the igpu
The driver was also for the igpu
Mmm
Huh yeah i forgot that tlas building is fast, blas is a bit slower, but fast enough that animations could be done
ah wait I realized
how would I even identfiy the BLAS in my shader so I would know which buffer it refers to
The TLAS identifies the BLAS
gl_InstanceCustomIndexEXT ?
oh wait it is gl_InstanceCustomIndexEXT
I literally just needed a way to determine which BLAS i was hitting in my shader
Because if I'm passing my vertex buffers into my shader, I need to be able to
determine which vertex buffer corresponds to which blas
Yes you can use that value to identify mesh/object properties
The nvidia rt sample does this
idk im generally following what NVIDIA is doing for their shaders
yeah nvidia's rt sample
I still don't get why you need the BLAS
why can't you use gl_InstanceCustomIndexEXT to identify an object, which has the address to your vertex and index buffers

epic
step 2) camera
Are you using rt pipelines or ray queries btw
I will personally uninstall ray queries from the driver if anyone is using them
raytracing pipelines
They're great wdym
no
iirc whatever the example is
Ray query is epic
CPU readback is cringe
Right, quick note about that. SBT is currently fully generated because I don't know of any reason not to. If you come across any limitation with this lmk and I'll rework a little
Why do you need cpu readback
Just ray query in the fragment shader for shadows and stuff
Its great
Works nicely
You see I can't read

It did crash my driver for some inexplicable reason but that aside
Nvidia shader compiler nonsense
so like any articles on how to do flying camera. learn opengl is fine but i can't go upside down :(
Ah yes, I have crashed NVIDIA's internal compiler countless times
you have a forward vector and an up vector, cross them and you get your right vector, slam that shit into glm lookAt

you now have a fps camera
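A dependency-free sketch of that recipe, assuming the right-handed, y-up, -z-forward convention. `look_at_rh` here is a hand-written stand-in laid out like the column-major result of glm's right-handed lookAt (`m[column][row]`); it is not a call into glm. It degenerates when forward is parallel to up (the cross product is zero), which is one reason a fixed world-up camera cannot go fully upside down.

```rust
type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Normalizes a vector; yields NaNs for a zero-length input.
fn normalize(v: Vec3) -> Vec3 {
    let len = dot(v, v).sqrt();
    [v[0] / len, v[1] / len, v[2] / len]
}

/// Right vector of the camera: forward cross up.
fn right_vector(forward: Vec3, up: Vec3) -> Vec3 {
    normalize(cross(forward, up))
}

/// Column-major right-handed view matrix from eye position and forward direction.
fn look_at_rh(eye: Vec3, forward: Vec3, world_up: Vec3) -> [[f32; 4]; 4] {
    let f = normalize(forward);
    let s = right_vector(f, world_up);
    let u = cross(s, f); // re-derived up, orthogonal to both
    [
        [s[0], u[0], -f[0], 0.0],
        [s[1], u[1], -f[1], 0.0],
        [s[2], u[2], -f[2], 0.0],
        [-dot(s, eye), -dot(u, eye), dot(f, eye), 1.0],
    ]
}
```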
This is just speculation but it seems like it starts optimizing the shader after a few frames on a background thread, and just crashing inside that somewhere
Incredible
Truly
Its not unheard of apparently
You profile it for some frames and optimize based on that data
Perhaps I'm too small brained
I don't know what kind of heuristics you can get from on the fly profiling 
Perhaps instruction or branch reordering idk
yeah
I got some pointers from pixelduck on what could cause this but haven't found a fix
the greatest debate of time
Maybe if you do physics you can do 0,0,1
Anything else is forbidden by international law
z-up is quite common
0,1,0 we do
Blender does z-up, Unreal too
iirc there was a nice image explaining all that stuff
where tf was it
cant find it
which system does godot use? lhs? whatever ill pick opengl and do rhs
Do whatever is convention for the API you are using
which is right handed, y up, -z forward and x right
i still think the potential of (1,0,0) to mean up is still there 
y up or I make more breaking changes in Phobos
oh no
ohno.png
I even wrote some instructions in the release notes
Epic
Yoo pog
ladies and gentlemen i once again have a kernel

i have not been able to go cg in the past all day
i'm not even in my uni yet and a club got me to code a discord bot for them 
how am i busy with uni already and i havent even started my first year its actually joever how do i phobo now
My laptop died again
New chapter in the penguin laptop saga begins
laptop refuses to start at all for a whole night
oh shoot i need one too for university
How old is it?
4 years or so
I can fix this
oh?
I can get into boot options now
So I can boot my arch usb again and reinstall grub
I should just give up on getting factorio to run
so is this hardware or software
cuz if it is failing this much
i doubt it's software then
No idea
Good luck then 🫡 idk if it keeps happening im pretty sure its hardware at that point
or some really f'd up driver
OHH
Linux + NVIDIA 
Yeah good luck
Literally the only reason why I probably can't go over to linux rn:
- i game a lot
- nvidia
Wait so if I am doing BDA, how would I indicate to phobos that I intend to use these buffers so it can generate the necessary fences/barriers
Same as any buffer
you just add it to the pass
But you don't need to do any sync if they're readonly
Since there are no buffer layouts or whatever
Ah alright
so I would be fine by just not even binding it?
so anything that would modify that buffer, would sync'd
Yeah the resource sync and declaration is completely separate from binding
wait so what is the point of declaring your bindings then?
Its declaring resource usages
oh like from vulkan prespective
Not bindings specifically
But declaring what and how you will use a resource allows the system to sync it
It's up to you how you "bind"/access the resource, as long as it's within the declared usage
Yeah
ohh that makes sense
hey penguin how much effort would it take for me to implement descriptor indexing
oh i realized this was so much easier.I was gonna make a system where i had an object struct which described the index of the bda ubos, but then i realized
i could just place my bdas in place of those indices 
Might try to implement descriptor indexing today
Or tomorrow
It's not a lot of work but needs some annoying things
You need to set VkDescriptorBindingFlags ~~VARIABLE_DESCRIPTOR_COUNT~~ and PARTIALLY_BOUND when creating the descriptor layout and VkDescriptorSetVariableDescriptorCountAllocateInfo when allocating the set
you don't need variable count
my bad

you in fact don't even need partially bound either, but that does make things a bit nicer
oh no
I was pinged
false information
i was asking a question then i realized it was stupid and deleted it
I see
What will descriptor indexing look like in phobos?
Like what kind of api will it expose for it?
Yeah that ^
also I'm gonna evangelize my university to use phobos 
should i convince the guy who is making a vercel physics engine to join phobos
Hmm how would I enable an extension in phobos
I need specifically: GL_EXT_scalar_block_layout
huh yeah
i don't need
OHH
IT WAS FOR SHADER INT 64
#extension GL_EXT_shader_explicit_arithmetic_types_int64 : require
I'm so confused
let mut settings = phobos::AppBuilder::new()
    .version((1, 0, 0))
    .name(name)
    .validation(true)
    .present_mode(phobos::vk::PresentModeKHR::MAILBOX)
    .scratch_size(1 * 1024u64)
    .gpu(phobos::GPURequirements {
        dedicated: false,
        min_video_memory: 1 * 1024 * 1024 * 1024, // 1 GiB
        min_dedicated_video_memory: 1 * 1024 * 1024 * 1024,
        queues: vec![
            phobos::QueueRequest {
                dedicated: false,
                queue_type: phobos::QueueType::Graphics,
            },
            phobos::QueueRequest {
                dedicated: true,
                queue_type: phobos::QueueType::Transfer,
            },
            phobos::QueueRequest {
                dedicated: true,
                queue_type: phobos::QueueType::Compute,
            },
        ],
        features: vk::PhysicalDeviceFeatures::builder().shader_int64,
        ..Default::default()
    });
Doesn't work for obvious reasons
shader_int64 returns bool32
You need to set that to true
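The fix is about the builder pattern: the setter has to be called with a value and the builder finished, rather than naming the method. With an ash version that still ships builders this would probably read `vk::PhysicalDeviceFeatures::builder().shader_int64(true).build()`, but treat that as an assumption. The sketch below uses stand-in types (`Features`, `FeaturesBuilder`) that only mimic the shape:

```rust
/// Stand-in for a Vulkan feature struct; the real one has many more fields.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Features {
    shader_int64: bool,
}

/// Stand-in builder mirroring the "setter takes a value, returns Self" shape.
#[derive(Default)]
struct FeaturesBuilder {
    inner: Features,
}

impl FeaturesBuilder {
    /// The setter must be *called* with a value; naming it does nothing.
    fn shader_int64(mut self, enabled: bool) -> Self {
        self.inner.shader_int64 = enabled;
        self
    }

    /// Finishes the builder and hands back the plain struct.
    fn build(self) -> Features {
        self.inner
    }
}

/// Features with 64-bit shader integers switched on.
fn int64_features() -> Features {
    FeaturesBuilder::default().shader_int64(true).build()
}
```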
yeah for some reason
Sorry, wrong question mark lol
i solved it lmaoo
now i have a descriptor type error 
also can someone explain to me
why the fuck does it take like 90 good seconds
for the vulkan docs to load
gnomes
Hey uh penguin
so if I were to set the extensions variable in the gpu settings
would it like mess up everything i.e. rt extension loading, validation, etc.
since i would be overriding the default values?
I think you just want to .push() values into it
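Something like the following, assuming the settings struct keeps its extensions in a plain `Vec`. `GpuSettings` and its default contents are made up for illustration; whatever phobos actually fills in by default is not shown here:

```rust
/// Hypothetical settings struct; the default list is an assumption.
#[derive(Debug)]
struct GpuSettings {
    extensions: Vec<String>,
}

impl Default for GpuSettings {
    fn default() -> Self {
        Self {
            extensions: vec!["VK_KHR_swapchain".to_owned()],
        }
    }
}

/// Appends an extension while keeping whatever the defaults already requested.
fn with_extra_extension(mut settings: GpuSettings, name: &str) -> GpuSettings {
    if !settings.extensions.iter().any(|e| e == name) {
        settings.extensions.push(name.to_owned());
    }
    settings
}
```

Assigning a fresh `vec![...]` to the field instead would drop the defaults, which is the worry raised above.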
.bind_sampled_image_array() with a slice of images and a sampler will bind all those to 0..n in the shader
Whether a binding is using descriptor indexing or not is checked using reflection
Bindless is great
You do need to set a max value for the amount of descriptors in such an unbounded array but I'm thinking to just provide that on init
And default to like 4k
my laptop be back

no idea
rcp might be 1/x?
Ok so I might be using phobos in the near future...
Is it stable/usable for rasterization/compute yet?
We're mainly waiting for descriptor indexing for images
other than that, Penguin is using phobos in his terrain project iirc
so it should be?
It's ready for ray tracing 100%
Oh nice
I dont even have an RT card lol so I'm mostly going to be using it for game engine dev
Are you one of the maintainers or just using it?
If the code isn't too complex (like wgpu's imo) I might be able to help out if I find bugs / issues with it
As long as I can rasterize / compute stuff like I can in wgpu I'll definitely use it
I'm just using it rn
Definitely stable-ish yeah
Oh yeah penguin is very open and chill about you contributing to the project
it'd be cool if we made this project community developed rather than solely penguin developed
i'm relearning vulkan so i can help out more meaningfully
heh im always open to take prs
I think apart from the render graph internals, the code isn't too complex
I might rewrite that part at some point in time, it truly is a beast
i wanna learn how a render graph works 
Mine only abstracts sync
It doesn't do any binding or resource management/aliasing
ohh i see
i'm mainly waiting for vulkan to add work graphs from dx12
as it looks insane
.. to debug
There's going to be support for what now??
Oh...

Oh alright cool!
I already have a good front-end API rn that uses WGPU as a back end so if I could just swap out the backend for phobos it should be pretty easy
Ohh wait phobos allows you to do uploads using a transfer only queue? That's amazing
yep
well, its a "transfer domain", which maps to a transfer only queue if possible
If thats not possible it shares a queue
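The mapping described here could look roughly like this. `QueueFamily` is a hypothetical flattened view of the queue family flags, not a Vulkan or phobos type:

```rust
/// Hypothetical capability summary for one queue family.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct QueueFamily {
    index: u32,
    graphics: bool,
    compute: bool,
    transfer: bool,
}

/// Picks the family backing a "transfer domain": a transfer-only family if
/// one exists, otherwise any family that can do transfers (a shared queue).
fn pick_transfer_family(families: &[QueueFamily]) -> Option<u32> {
    families
        .iter()
        .find(|f| f.transfer && !f.graphics && !f.compute)
        .or_else(|| families.iter().find(|f| f.transfer))
        .map(|f| f.index)
}
```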
Awesome
There are a few missing things that I'm gonna need like vkCmdDrawIndexedIndirectCount but they shouldn't be hard to implement
I could just make a PR
Also I love how you actually made use of the trait system for "Domains"
Really refreshing to see a vulkan API actually make use of traits properly
Penguin is just encouraging you to use vkCmdDispatch for everything 
It's difficult enough without learning a new language thank you
We must convert them.
🦀
I wish I had an RT capable card
Mesh/Task/Accel Structures seem so fun
what card do you have
they are not

do not be fooled
Lol
1060+ has compute emulation for rt
I am currently struggling, meshoptimizer prefers good mesh topology to good spatial locality
Which is eating my brain
Laptop 1050 
Yep which makes it even more painful tbh
Anyways as long as I can do some compute shenanigans and rasterization I am very happy
definitely possible
I wish we had access to the vertex pipeline in compute
mesh shaders kinda solve this but they are meh
NVIDIA also puts absurd limits on workgroup sizes
"it must be 32 or else"
With VK_EXT_mesh_shader it can be whatever
but if you use anything other than 32, occupancy dies (on NVIDIA)
rip
VK_NV_mesh_shader explicitly limits max invocation size to 32
Oh also Penguin do you mind if I make some QoL improvements to context creation? (like the ability to select GPU stuff like that)
oh yeah go ahead, it should already have some options to prefer a certain gpu but an explicit picker might not be bad either
Aight thanks yea
should be mostly extending AppSettings/GpuRequirements
I have a scoring system already in my graphics API impl so I'd rather just pick manually
Gotcha
scoring instead of discarding would be nice to have
currently it just discards gpus that dont fit the requirements given
I'll work on that too then lel

i agree. do not let the pretty rays trick you. in serious words: the debugging experience is horrible. you will get bugs that are completely silent 
First time doing a feature request on dis so hopefully it goes well
yo i should implement this so i can actually learn ash and vulkan
scoring? I feel like the user themselves could keep track of that
Right yea that's why I don't recommend that method of doing it
And just letting the user pick their own PhysicalDevice
Also yea sure. Idk if it's gonna help with ash / vulkan too much though but should get you started
It's mostly just rust / iterator stuff
Now the question would be "how to retrieve all possible PhysicalDevices" before creating the instance itself
You could pass a callback that will be called to "select" a PhysicalDevice
So something like this
.gpu(|physical_devices_iter| {
// select one physical device from the iterator and return it
})
Then you could pass what queues / features you want for said physical device / device
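A minimal sketch of that callback shape, using a slice instead of an iterator. `PhysicalDeviceInfo` and `select_gpu` are invented names for illustration and do not exist in phobos:

```rust
/// Hypothetical summary of a physical device handed to the selector.
#[derive(Clone, Debug, PartialEq)]
struct PhysicalDeviceInfo {
    name: String,
    dedicated: bool,
    vram_bytes: u64,
}

/// Calls the user's selector with every enumerated device and returns the
/// one it picked, or None if it picked nothing or an out-of-range index.
fn select_gpu<F>(devices: &[PhysicalDeviceInfo], selector: F) -> Option<&PhysicalDeviceInfo>
where
    F: FnOnce(&[PhysicalDeviceInfo]) -> Option<usize>,
{
    selector(devices).and_then(|index| devices.get(index))
}
```

Returning an index keeps the selector free of lifetime juggling; the queue and feature requests would then be checked against the device that was picked.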
That's not bad yeah
I just realized that the reason you might be so offline is because of different time zones lol
It's 3 am in canada rn kek
its 9am now
aye
I alr cloned it and started doing some testing just to get used to the API
yeah thats perfect
I think for returning a default gpu in case the selector fails we could just pick the first dedicated gpu if there is one and log a warning that the selector failed
That's assuming we use the one that uses the F: Fn(&PhysicalDevice) -> bool method
If we force the user to always return a physical device like in here I don't think a fallback would be needed
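For comparison, the predicate variant with the fallback could be sketched like this. `Gpu` and `pick_with_fallback` are hypothetical, and the warning goes to stderr only to keep the example dependency-free:

```rust
/// Hypothetical device summary.
#[derive(Clone, Debug, PartialEq)]
struct Gpu {
    name: &'static str,
    dedicated: bool,
}

/// Returns the first GPU the predicate accepts. If it accepts none, warns
/// and falls back to the first dedicated GPU, if there is one.
fn pick_with_fallback<F>(gpus: &[Gpu], accept: F) -> Option<&Gpu>
where
    F: Fn(&Gpu) -> bool,
{
    if let Some(gpu) = gpus.iter().find(|&g| accept(g)) {
        return Some(gpu);
    }
    eprintln!("warning: GPU selector rejected every device, using fallback");
    gpus.iter().find(|g| g.dedicated)
}
```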
Hmm that's possible too
But it requires tracking state between each iteration to see which is better
Up to the user 
They could pick randomly if they wanted to as well
It's how wgpu does it too, just lets the user pick what adapter (phys device) they want
nb that it is a bit questionable if the app should do any heuristic for picking
Yeah I think just giving some requirements is the most you want to do
even that
if the user wants to use the integrated gpu bc they want to conserve power
how can the app know?
This looks good, but how would you indicate what gpu you want?
OH
LMAOOOO it's an iter, i thought you were gonna pass each gpu individually
Ok this makes sense now. I assume you iterate over that iter then return the desired gpu you want in the same function
So we need Ai and neuralink integration

Yepp
Doesn't even need to be an iter, could just be a slice
Yep
What I do in my rust app is check if the user is running on laptop battery (if they have one) and force the use of integrated graphics only then
If not it always uses the "best" dedicated card (one with the most vram and highest max buffer size)
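Roughly what that app-side heuristic might look like. `Adapter` and `choose_adapter` are illustrative names, and ordering by VRAM first and max buffer size second is an assumption about how the two criteria combine:

```rust
/// Hypothetical adapter summary, in the spirit of what the app would query.
#[derive(Clone, Debug)]
struct Adapter {
    name: &'static str,
    integrated: bool,
    vram_bytes: u64,
    max_buffer_size: u64,
}

/// On battery, prefer an integrated adapter if one exists. Otherwise take
/// the dedicated adapter with the most VRAM, then the largest max buffer.
fn choose_adapter(adapters: &[Adapter], on_battery: bool) -> Option<&Adapter> {
    if on_battery {
        if let Some(igpu) = adapters.iter().find(|a| a.integrated) {
            return Some(igpu);
        }
    }
    adapters
        .iter()
        .filter(|a| !a.integrated)
        .max_by_key(|a| (a.vram_bytes, a.max_buffer_size))
}
```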
didn't understand my point :/
Oh no I was just talking about how my upstream API does it. Phobos IMO shouldn't handle that
GPU selection is completely up to the user is what I meant*
thats not what i said
Oh sorry then I misunderstood. What'd you mean by this then?
i explained a bit below
i think the point is that the app shouldnt care what gpu is used, thats up to the user to configure in os settings
OS settings just reorder them right?
depends
So the first one is the preferred one?
im pretty sure on linux at least you can disable them entirely
At what level?
Do you mean shutting down the device?
You can't shut down the integrated gpu in some setups though
Because those drive the display
oh yeah, igpu might be a special case
I don't have such a device so I can't test what it does to vulkan
either way xorg bad
Both xorg and Wayland compositors have to manage multi gpu afaik
Descriptor indexing allows you to do bindless shenanigans right?
Is it supported in Phobos yet? Cause I saw an issue stating that you can't change descriptor sets after you call draw / dispatch, which might be problematic for my renderer (if not using bindless)
that issue is about something else, descriptor indexing is coming soon
i just have to add the api for it really
I added a comment to the issue to explain the problem better
its more of a design annoyance really
ill try to get descriptor indexing out by this weekend or so
im back from vacation friday night
it is really nice
though vk api wise bindful isnt really annoying anymore with phobos
not like with a lot of vk renderers that store descriptor sets per material 🤢
... that's what I do lel
very naive way but worked fine 4 me
scales horribly though kek
it really does lol
here you just call bind with an image or buffer and it Just Works ™️
Yep that's how my current high level graphics API worked, but it does some hideous stuff under the hood to make it work.
Good to know I don't have to deal with that anymore lel
but it does some hideous stuff under the hood to make it work.
ok its not that bad tbh
just a hashmap full of descriptor sets
Yea it just "feels" very ugly lol
speaking of i should replace those with ahash before i forget
Yess ahash all the things
especially pipeline lookup can get expensive
There's also LruCache that automatically drops old values
but thats mostly because it needs to hash a lot of things
Right yea
that seems similar to what I do except this "oldness" is determined by the last frame it was used on
Awesome
if its used sporadically then recreating it is no big deal
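A small sketch of a cache where "oldness" is the last frame an entry was used on. `FrameCache` is a hypothetical name, this is not the phobos implementation, and it uses the std `HashMap` where ahash could be swapped in:

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Cache whose entries expire when unused for more than `max_age` frames.
struct FrameCache<K, V> {
    entries: HashMap<K, (V, u64)>, // value plus the frame it was last used on
    max_age: u64,
    frame: u64,
}

impl<K: Hash + Eq, V> FrameCache<K, V> {
    fn new(max_age: u64) -> Self {
        Self { entries: HashMap::new(), max_age, frame: 0 }
    }

    /// Looks the key up, creating the value on a miss, and marks it as used
    /// on the current frame.
    fn get_or_create(&mut self, key: K, create: impl FnOnce() -> V) -> &V {
        let frame = self.frame;
        let entry = self.entries.entry(key).or_insert_with(|| (create(), frame));
        entry.1 = frame;
        &entry.0
    }

    /// Advances the frame counter and drops entries that are too old.
    fn next_frame(&mut self) {
        self.frame += 1;
        let (frame, max_age) = (self.frame, self.max_age);
        self.entries.retain(|_, (_, last_used)| frame - *last_used <= max_age);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

An entry that gets evicted and then requested again is simply recreated, which matches the "no big deal if it is used sporadically" reasoning.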
Okkk I was confused for a sec with how methods (bind storage texture, bind sampled texture) in incomplete command buffer automatically handled sync
They don't, I needed to look in the pass builder lol
I was scared for a sec thinking this lib doesn't do auto-sync and that I'll need to handle putting pipeline barriers myself


