#Rosy

brisk chasm
#

are you on EMT duty?

true moon
#

Not today

#

Yesterday I was working the SJSU football game

astral hinge
#

he's there to help people having L-system emergencies

true moon
#

I'm probably on some body cam footage from when the cops came over to see us dealing with this insanely wasted guy. all they cared about was his birthday, probably to decide if they could arrest him for drinking underage lmao

brisk chasm
#

I'm probably on some body cam footage..
with gaffer tape wrapped around your head again like the other day? 😄

true moon
#

Lmao no

cloud rivet
#

idk why that's a video

#

I clicked the wrong button that's why

#

I think in total my vulkan init took me like 5-6 hours over the last few days

cloud rivet
#

need to add resize/minimize support and then I'll create the memory for cuda and start on cuda

#

going to stick with a single main.cpp until I have rendered a Suzanne monkey via an acceleration structure with a controllable camera and imgui debug ui

#

although, that may require cu files

#

not sure

#

my C++ is looking a lot like my C only now I get to use references and C++ libraries

#

it is not C however

#

it just looks a lot like my C

#

no use of any std:: right now :/

#

minimize crashes, however

#

tomorrow I'll finally write some cuda I think, I'll just start by writing a color value to the memory and blit it to the swapchain image

#

via cuda

#

then I'll do the triangle example, then the triangle example with an acceleration structure, then I'll add imgui, then gltf loading, then Suzanne and then a camera

#

then I'll try and add my track and skimmer back in

#

I won't have a mesh shader anymore for the track so need to think about that

#

I won't have any shadows either, you don't get those for free

astral hinge
cloud rivet
#

Oh cool

silk pier
#

I would imagine nvcc does something like this (no idea tho):

astral hinge
#

would that be a problem?

silk pier
#
  • cpu code ==> clang cpu path
  • gpu code ==> clang GPU path
silk pier
astral hinge
#

idk

silk pier
#

same here

#

(it's worth bringing up so stuff doesn't magically break and cause major confusion)

cloud rivet
#

Depends on the build

#

The ninja build is clang

#

The msbuild is msvc

#

I only have that one to use with visual studio

#

cmake lets you tell nvcc what the host compiler is

#

It’s a compiler arg given to nvcc

#

So it can be whichever you want, msvc, clang or g++

cloud rivet
#

there's even a --allow-unsupported-compiler option

Disable nvcc check for supported host compiler versions.

Using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. This option has no effect on MacOS.

#

clang has a cl.exe and you can use clang instead of nvcc too

#

well a clang-cl I guess

#

idk was just looking it up again

cloud rivet
#

that's the thing to test just the CUDA VK interop

#

cudaProcess<<<grid, block, sbytes>>> <<<thinkeyes >>>

pseudo dock
#

I heard you like angle brackets

cloud rivet
#

I wonder what is going on there

#

I should look at the header file

#

oh it uses a ppm file

cloud rivet
#

man

#

calling a function on the device from application code directly is kind of mind blowing

#

hrm

#
// Find the GPU which is selected by Vulkan
    while (current_device < device_count) {
        cudaGetDeviceProperties(&deviceProp, current_device);
        int computeMode;
        checkCudaErrors(cudaDeviceGetAttribute(&computeMode, cudaDevAttrComputeMode, current_device));

        if ((computeMode != cudaComputeModeProhibited)) {
            // Compare the cuda device UUID with vulkan UUID
            int ret = memcmp((void *)&deviceProp.uuid, vkDeviceUUID, UUID_SIZE);
            if (ret == 0) {
                checkCudaErrors(cudaSetDevice(current_device));
                checkCudaErrors(cudaGetDeviceProperties(&deviceProp, current_device));
                printf("GPU Device %d: \"%s\" with compute capability %d.%d\n\n",
                       current_device,
                       deviceProp.name,
                       deviceProp.major,
                       deviceProp.minor);

                return current_device;
            }
        }
#

hrm this is actually the thing to follow

#
        VkExternalMemoryImageCreateInfo vkExternalMemImageCreateInfo = {};
        vkExternalMemImageCreateInfo.sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO;
        vkExternalMemImageCreateInfo.pNext = NULL;
#ifdef _WIN64
        vkExternalMemImageCreateInfo.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT;
#else
        vkExternalMemImageCreateInfo.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
#endif

        imageInfo.pNext = &vkExternalMemImageCreateInfo;

        if (vkCreateImage(device, &imageInfo, nullptr, &image) != VK_SUCCESS) {
            throw std::runtime_error("failed to create image!");
        }
#

it goes to the pNext for vkCreateImage create info

#

so it's just a draw image like I have been creating and it just gets a pNext

#

oh

#

and then a VkExportMemoryWin32HandleInfoKHR and VkExportMemoryAllocateInfoKHR

#

which goes into the VkMemoryAllocateInfo pNext

#

and then a VkSemaphoreGetWin32HandleInfoKHR from a VkSemaphore

#

which is attached to a cuda object and given to cudaImportExternalSemaphore

#

and then a VkMemoryGetWin32HandleInfoKHR for the image memory

#

which is given to cudaImportExternalMemory

#

I just want to write some rgba values into that memory

#

hrm

#

this uses a single image

#
        // Added sleep of 10 millisecs so that CPU does not submit too much work to
        // GPU
        std::this_thread::sleep_for(std::chrono::microseconds(10000));
astral hinge
#

wat

#

ok I think in context this is just a frame rate limiter, albeit a terrible one

#

could use vsync to achieve something similar (without ruining frame pacing) as long as the gpu isn't already the bottleneck

cloud rivet
#

yeah

#

maybe I use flecs for this game

#

it comes with a frame rate limiter

#

or just write my own

#

idk

#

I can just do it based on frame time
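
A frame-time-based limiter could look roughly like the sketch below. It is a minimal illustration, not code from the project or from flecs: `remaining_frame_time` and `limit_frame_rate` are hypothetical names, and it assumes a plain sleep is accurate enough for the target rate.

```cpp
#include <chrono>
#include <thread>

using frame_clock = std::chrono::steady_clock;

// Hypothetical helper: time left until the frame has lasted target_frame_time.
// Returns zero when the frame already took longer than the target.
inline std::chrono::nanoseconds remaining_frame_time(
    std::chrono::nanoseconds elapsed,
    std::chrono::nanoseconds target_frame_time) {
  return elapsed < target_frame_time ? target_frame_time - elapsed
                                     : std::chrono::nanoseconds::zero();
}

// Hypothetical helper: call at the end of a frame with the time point that
// was taken at the start of that frame.
inline void limit_frame_rate(frame_clock::time_point frame_start,
                             std::chrono::nanoseconds target_frame_time) {
  const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
      frame_clock::now() - frame_start);
  std::this_thread::sleep_for(remaining_frame_time(elapsed, target_frame_time));
}
```

Unlike the fixed 10 ms sleep in the sample, this only sleeps for whatever is left of the frame budget, so a slow frame is not delayed any further.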

astral hinge
#

I wonder what a frame rate limiter is doing in an ecs lib

#

I guess it is a useful utility to have in a game

cloud rivet
#

I don't know, but it's useful

#

yeah

#

so based on that vulkan cuda image example I will start on CUDA init next

#

I was planning today to keep going on the vulkan side to create these shared resources but based on the sample

#

it makes more sense to have cuda initialized by then

#

I think it would be nice to fail fast if there's anything here that becomes a blocker for me

#

so I think it makes sense not to invest time in things only to find out I can't init a cuda device, stream, whatever

#

I should just get cuda hello world going without any interop

#

and once I know cuda works, I can start doing the rest of it

#

easy

#

unless it isn't

#

the next thing would be to try the simpleStream example, since the vulkan sample uses a stream

#

no, that's too much

#

I'll just do the hello world printf

#

I'll start by just renaming my main.cpp to main.cu

#

and see what happens

#
C:\Users\swart\projects\code\pixel_storm>cmake --build build-dev
[0/1] Re-running CMake...-- Configuring done (1.6s)
-- Generating done (0.0s)
-- Build files have been written to: C:/Users/swart/projects/code/pixel_storm/build-dev

[1/3] Building CUDA object CMakeFiles\pixelStorm.dir\src\main.cu.obj
main.cu
tmpxft_00004b38_00000000-7_main.cudafe1.cpp
[2/3] Linking CXX executable pixelStorm.exe

C:\Users\swart\projects\code\pixel_storm>
#

well it built

#

and it runs

#

one thing I like about using C++ again is references

#

oh clangd hates me now

#

that fixed it

#

and removed some args from compile_commands.json it didn't understand

#

I wish .clangd could read env variables

#

looks like it just needs someone to do and own that

pseudo dock
cloud rivet
#

no

#

I'm staying focused

#

I need to wrap SDL_Log, and all the other SDL crutches I am using so I don't pepper my code with an SDL dependency

#

at some point

#

not right now

#

or maybe I just don't care

#

internal void init_cuda(ps_ctx_t &ctx) {
#ifndef __CUDA_RUNTIME_H__
  SDL_Log("No CUDA runtime detected");
  abort();
#endif
  i32 device_count;
  checkCudaErrors(cudaGetDeviceCount(&device_count));
  SDL_Log("num cuda devices: %d", device_count);
}
#

nice, my first ever CUDA code

cloud rivet
#

oh

#

this explains the <<<>>>

#

A kernel is defined using the __global__ declaration specifier and the number of CUDA threads that execute that kernel for a given kernel call is specified using a new <<<...>>> execution configuration syntax (see Execution Configuration). Each thread that executes the kernel is given a unique thread ID that is accessible within the kernel through built-in variables.

#

Any call to a __global__ function must specify the execution configuration for that call. The execution configuration defines the dimension of the grid and blocks that will be used to execute the function on the device, as well as the associated stream (see CUDA Runtime for a description of streams).

The execution configuration is specified by inserting an expression of the form <<< Dg, Db, Ns, S >>> between the function name and the parenthesized argument list, where:

Dg is of type dim3 (see dim3) and specifies the dimension and size of the grid, such that Dg.x * Dg.y * Dg.z equals the number of blocks being launched;

Db is of type dim3 (see dim3) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;

Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in __shared__; Ns is an optional argument which defaults to 0;

S is of type cudaStream_t and specifies the associated stream; S is an optional argument which defaults to 0.

#

it's just like a compute shader, except for the cuda stream thing

#

so the way I read that, <<< >>> is how you call the shader from the host

#

sort of like vkCmdDispatch

#

I guess it's a kernel, not a shader

#

hrmmmmmm

#

it's supposed to log something

#

printf doesn't do anything in a win32 windows application I think

#

ah the debugger was hiding it

#

yay 🎉

#
__global__ void testKernel(int val) {
  printf("[%d, %d]:\t\tValue is:%d\n", blockIdx.y * gridDim.x + blockIdx.x,
         threadIdx.z * blockDim.x * blockDim.y + threadIdx.y * blockDim.x +
             threadIdx.x,
         val);
}
#
  {
    SDL_Log("starting testKernel");
    dim3 dimGrid(2, 2);
    dim3 dimBlock(2, 2, 2);
    testKernel<<<dimGrid, dimBlock>>>(10);
    checkCudaErrors(cudaDeviceSynchronize());
    SDL_Log("finished testKernel");
  }
#

great
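
To make the two numbers in that printf easier to read, here is a host-only sketch of the same index arithmetic. It assumes nothing beyond the expressions in testKernel itself; `Dim3`, `Idx3` and the function names are hypothetical stand-ins for CUDA's built-ins so that it compiles as plain C++.

```cpp
// Hypothetical stand-in for CUDA's dim3: unspecified dimensions default to 1.
struct Dim3 {
  unsigned x, y, z;
  Dim3(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1)
      : x(x_), y(y_), z(z_) {}
};

// Hypothetical stand-in for the blockIdx / threadIdx built-ins.
struct Idx3 {
  unsigned x, y, z;
};

// First value testKernel prints: row-major block index within a 2D grid.
inline unsigned flat_block_index(Idx3 block_idx, Dim3 grid_dim) {
  return block_idx.y * grid_dim.x + block_idx.x;
}

// Second value testKernel prints: thread index within its 3D block.
inline unsigned flat_thread_index(Idx3 thread_idx, Dim3 block_dim) {
  return thread_idx.z * block_dim.x * block_dim.y +
         thread_idx.y * block_dim.x + thread_idx.x;
}

// Threads in a launch: blocks per grid times threads per block.
inline unsigned total_threads(Dim3 grid_dim, Dim3 block_dim) {
  return grid_dim.x * grid_dim.y * grid_dim.z *
         block_dim.x * block_dim.y * block_dim.z;
}
```

With the launch above, dimGrid(2, 2) and dimBlock(2, 2, 2), that is 4 blocks of 8 threads, so the kernel should print 32 lines, with block indices 0 to 3 and thread indices 0 to 7.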

astral hinge
cloud rivet
#

the cuda programming guide is actually doing a really great job explaining how compute on a gpu works

#

it goes into hardware details, guidelines, talks about occupancy, how threads become inactive during divergence

#

When a CUDA kernel accesses a data region in the global memory repeatedly, such data accesses can be considered to be persisting. On the other hand, if the data is only accessed once, such data accesses can be considered to be streaming.

Starting with CUDA 11.0, devices of compute capability 8.0 and above have the capability to influence persistence of data in the L2 cache, potentially providing higher bandwidth and lower latency accesses to global memory.

#

ok so a stream can carry an access policy window, which marks a memory range as persisting in a set-aside part of the L2 cache

#

I don't have distributed shared memory, since that needs compute capability 9.0

#

I'd have to get a new graphics card

#

ok so the vulkan example needs the stream for cudaSignalExternalSemaphoresAsync

cloud rivet
#

I'll work on the vulkan draw image tomorrow

#

hrm

cloud rivet
#

it looks like the optix shaders can get an arbitrary memory to write to

#

extern "C" {
__constant__ Params params;
}

extern "C"
__global__ void __raygen__draw_solid_color()
{
    uint3 launch_index = optixGetLaunchIndex();
    RayGenData* rtData = (RayGenData*)optixGetSbtDataPointer();
    params.image[launch_index.y * params.image_width + launch_index.x] =
        make_color( make_float3( rtData->r, rtData->g, rtData->b ) );
}
#

that params is passed in just as memory

#
            Params output_buffer;
            params.image       = output_buffer.map();
            params.image_width = width;

            CUdeviceptr d_param;
            CUDA_CHECK( cudaMalloc( reinterpret_cast<void**>( &d_param ), sizeof( Params ) ) );
            CUDA_CHECK( cudaMemcpy(
                        reinterpret_cast<void*>( d_param ),
                        &params, sizeof( params ),
                        cudaMemcpyHostToDevice
                        ) );

            OPTIX_CHECK( optixLaunch( pipeline, stream, d_param, sizeof( Params ), &sbt, width, height, /*depth=*/1 ) );
#
 OPTIXAPI OptixResult optixLaunch (
 OptixPipeline pipeline,
 CUstream stream,
 CUdeviceptr pipelineParams,
 size_t pipelineParamsSize,
 const OptixShaderBindingTable *sbt,
 unsigned int width,
 unsigned int height,
 unsigned int depth );
#
 typedef unsigned long long CUdeviceptr;
 // CUDA device pointer.
#

that's cool

#

so I think I can just write to the vulkan image memory in whatever format

#

heh

cloud rivet
#

feel pretty confident that if I can write colors to the image memory with just CUDA that I'll have no problems writing colors via OptiX

#

the thing I also want is an image per FIF and not just a single one

#

I'll start with a single one since that's what the sample has

cloud rivet
#

ok this evening's goal is to add draw images to my Vulkan renderer, and an unlikely stretch goal to write colors to them from CUDA.

#

per FIF draw images like I have always done tbh

cloud rivet
#

that turned out to be pretty easy since I could copy a lot of my code over from my previous project

#

well just the draw image itself

#

now I need to do that special memory stuff

#

let me check sync validation though

cloud rivet
#

:|

#

this sucks

#

A pointer to a SECURITY_DESCRIPTOR structure that controls access to the object. If the value of this member is NULL, the object is assigned the default security descriptor associated with the access token of the calling process. This is not the same as granting access to everyone by assigning a NULL discretionary access control list (DACL). By default, the default DACL in the access token of a process allows access only to the user represented by the access token.

#

I'm just going to copy paste this win32 security class from the sample to get this to work

#

it's pretty gross

#
    EXPLICIT_ACCESS explicitAccess;
    ZeroMemory(&explicitAccess, sizeof(EXPLICIT_ACCESS));
    explicitAccess.grfAccessPermissions = STANDARD_RIGHTS_ALL | SPECIFIC_RIGHTS_ALL;
    explicitAccess.grfAccessMode        = SET_ACCESS;
    explicitAccess.grfInheritance       = INHERIT_ONLY;
    explicitAccess.Trustee.TrusteeForm  = TRUSTEE_IS_SID;
    explicitAccess.Trustee.TrusteeType  = TRUSTEE_IS_WELL_KNOWN_GROUP;
    explicitAccess.Trustee.ptstrName    = (LPTSTR)*ppSID;
#

I think that just means no security at all

#

and would not be shippable

#

so

#

good this is a solo project

pseudo dock
#

Wait... Why not just pass null?

astral hinge
#

what is this security stuff

#

process isolation or something?

cloud rivet
#

I think it has to be defined for the win32 external memory handler

astral hinge
cloud rivet
#

If VkExportMemoryAllocateInfo is included in the pNext chain of VkMemoryAllocateInfo with a Windows handleType, but either VkExportMemoryWin32HandleInfoKHR is not included in the pNext chain, or it is included but pAttributes is set to NULL, default security descriptor values will be used, and child processes created by the application will not inherit the handle, as described in the MSDN documentation for “Synchronization Object Security and Access Rights”. Further, if the structure is not present, the access rights used depend on the handle type.

#

If you specify NULL, the object gets a default security descriptor

#

I don't know what the default security descriptor is

pseudo dock
#

But I think that's ok. Child processes won't inherit it, you're not giving it a name, so that's fine

cloud rivet
#

but the sample has the above code

#

hrm

#

oh I see so it is not a huge risk

#

the wikipedia article is better than the msdn docs

pseudo dock
#

Well, I shouldn't act confident, I haven't used vulkan or its memory allocators. But generally I think it's ok to use the default security descriptor, especially for things that don't have names that other processes can find and that child processes won't inherit.

cloud rivet
#

ok, I hope so. since I have no idea what I am doing though, I'm going to follow the sample and then once I get it working set it to null

#

and see what happens

#

because when I hit an error I probably won't be able to tell what caused it

#

hrm

#

I'm going to be happy to delete that class

#

I think I have to link win32 stuff now too

cloud rivet
#

I get this cool icon in neovim for my cuda file

#

I have no idea what that is

#

looks cool though

#

ok everything seems to render fine with just the vulkan clear color applied to the draw image with all the external memory ceremony

#

next is the external semaphore

#

my first time using a timeline semaphore

#

I think I need one of these per FIF

#

oh that semaphore needs the win32 security stuff too wow

#

pAttributes is a pointer to a Windows SECURITY_ATTRIBUTES structure specifying security attributes of the handle.

#

If pAttributes is not NULL, pAttributes must be a valid pointer to a valid SECURITY_ATTRIBUTES value

#

can be null

#

I will follow the sample for now

#

oh it is not a timeline semaphore nm

#

cool cool, I guess time to do the CUDA side now for the memory and semaphores 😅

#

oh so my queue submit for the frame command buffer has to wait for two semaphores: the frame present semaphore and a cuda -> vk semaphore signaled by cuda. and then I have another vk -> cuda semaphore

#

makes sense for these to be FIF

cloud rivet
#

I switched to a buffer

#
Shared buffer created: 132710400 bytes, CUDA ptr: 0000000404C00000
Shared buffer created: 132710400 bytes, CUDA ptr: 000000040CC00000
Shared buffer created: 132710400 bytes, CUDA ptr: 0000000414C00000
#

works

#
__global__ void fillColorKernel(float4 *output, u32 width, u32 height, float r,
                                float g, float b, float a) {
  u32 x = blockIdx.x * blockDim.x + threadIdx.x;
  u32 y = blockIdx.y * blockDim.y + threadIdx.y;

  if (x < width && y < height) {
    output[y * width + x] = make_float4(r, g, b, a);
  }
}
#
 cuda_fill_buffer(&fctx.shared_buffer, ctx.swapchain.extent.width,
                     ctx.swapchain.extent.height, 0.8f, 0.2f, 0.3f,
                     1.0f);
#

ok
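
The logged buffer size and a launch configuration for fillColorKernel can be sanity-checked with a little host-side arithmetic. This is a sketch under assumptions: the 3840x2160 extent is inferred from the 132710400-byte size (it is not stated anywhere in the thread), the 16x16 block size is an arbitrary choice, and `Float4`, `shared_buffer_bytes` and `blocks_for` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Host-side mirror of CUDA's float4 so the size math builds without CUDA headers.
struct Float4 {
  float x, y, z, w;
};

// Bytes needed for one float4 (16 bytes) per pixel.
inline std::size_t shared_buffer_bytes(std::uint32_t width,
                                       std::uint32_t height) {
  return static_cast<std::size_t>(width) * height * sizeof(Float4);
}

// Blocks needed to cover `extent` pixels with `block` threads, rounding up.
inline std::uint32_t blocks_for(std::uint32_t extent, std::uint32_t block) {
  return (extent + block - 1) / block;
}
```

In a .cu file this would feed something like `dim3 block(16, 16); dim3 grid(blocks_for(width, block.x), blocks_for(height, block.y));`. When the extent is not a multiple of the block size, the rounded-up grid overshoots the image, which is what the `x < width && y < height` guard in the kernel is for.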

#

let me try setting those win security things to null

#

I just copy the buffer to the draw image

#

@pseudo dock you're right I don't need them! 🎉

#

I'm allowed to wear the CUDA thing now as my role

#

I'll try to do an optix thing tomorrow

brisk chasm
#

i didn't catch the sudden cuda integration

#

i suppose that was just an itch you tried to cure by just trying it out, neh?

cloud rivet
#

This is a complete rewrite so I can avoid using a shader language

#

I had to sacrifice a graphics pipeline along the way, it’s fully RT now

brisk chasm
#

i see

cloud rivet
#

Hoping to be where I left off by end of month

#

Also no longer a from scratch NHI project

brisk chasm
#

i can see the little headcrab looking vehicle driving in and on and around my little gpu when i look through the side panel 🙂

bronze socket
#

are you still using shady btw?

cloud rivet
#

No

bronze socket
#

rip, what made you switch

#

are you just using GLSL now

cloud rivet
#

It doesn’t have stuff

#

No

#

Using C now for my shaders

#

Via OptiX

#

It’s a C++ compiler however

cloud rivet
#

I think I could possibly have a triangle by end of day today

bronze socket
#

ah right because you're using cuda now you don't need to bother with a shading lang

cloud rivet
#

I remember the first time I tried linking a string literal with GLSL code in it and trying to get an error out of GL to tell me why nothing worked

#

I have hated it since the start

#

I am free now

#

I could still use the graphics pipeline if I really wanted to since I am using both CUDA and Vulkan via shared GPU memory

#

It is still presenting via Vulkan

#

That’s not a thing CUDA can do

bronze socket
#

nice

#

does CUDA generally support GPU images?

cloud rivet
#

In my case it is just memory to CUDA, using a buffer, but yes it supports images with tiling

#

You can use textures, mipmaps etc

#

It just doesn’t have a swapchain WSI thing

#

afaik

bronze socket
#

that's cool

cloud rivet
#

I'm glad I haven't bought an AMD GPU like I had considered to test on

wraith urchin
#

What are we working on now @cloud rivet? I feel like every time I come into this thread it's a new project

#

Something with CUDA?

slim oak
#

Bjorn is just speedrunning all major technologies, because why not KEKW

brisk chasm
wraith urchin
#

Does nobody use OpenCL anymore?

brisk chasm
#

only die hards i would imagine, and big libs which are hardware accelerated use cuda or something else

#

i thought that was the motivation of amd to push rocm/zluda

cloud rivet
#

well I'm using CUDA specifically for OptiX

cloud rivet
#

I think/hope I will be back to where I was by end of month, plus also no longer NHI I should go much faster

#

since I'll use tinygltf, and SDL for images

wraith urchin
#

NHI? you mean NIH?

cloud rivet
#

yes

#

typo

#

not national health institute

wraith urchin
#

You did it twice 😛

cloud rivet
#

my brain works in mysterious ways

echo crystal
#

not homemade invention

#

NIH is too mainstream as an acronym so had to come up with a new and improved v2

bronze socket
#

what would the mandatory french canadian alternate acronym be

wraith urchin
#

Pas Inventé Ici

cloud rivet
#

I had a really good reason for my change this time

cloud rivet
#

I'm done changing projects now for sure

#

this is the one

brisk chasm
#

famous last words

cloud rivet
#

I got everything I want now

#

I promise

bronze socket
#

locking in my bet on dx12

wraith urchin
#

I'm still waiting for the next blockens update froge_sad

echo crystal
#

this is the 1

wraith urchin
#

It was pretty cool

cloud rivet
#

ok time to get an optix thing to work

#

I think recent releases are more aimed at server stuff

#

1 more day of work and I have a two week vacation

#

first vacation this year

#

I really need it

#

my secret is to not take vacation during the actual holiday weeks, because it's actually super quiet nearly every year

#

nobody is doing much or expects too much

#

because most people are on vacation and nobody is trying to release anything

#

hrm I hard code my cuda paths in CMakeLists, I should use my %CUDA_PATH%

#

oh

#

it's nice it did that check

#

they improved the installer too lol

cloud rivet
#

just trying all the tools

#

this DXGI Layer is interesting

#

that's sharing the memory with CUDA I think

#

I'm like using three graphics APIs, not two

#

I have a Vulkan context, a CUDA context and a DX12 context

#

ok ok anyway time for OptiX

#

I'll need to set up Tracy at some point

#

optixGetTriangleBarycentrics()
static __forceinline__ __device__ float2 optixGetTriangleBarycentrics()
Convenience function that returns the first two attributes as floats.

When using OptixBuildInputTriangleArray objects, during intersection with a triangle, the barycentric coordinates of the hit are stored into the first two attribute registers.

Available in AH, CH

#

this is cool

#

also

#

optixGetTriangleVertexData() [1/2]
static __forceinline__ __device__ void optixGetTriangleVertexData( float3 data[3] )
Returns the object space triangle vertex positions of the currently intersected triangle at the current ray time.

Similar to the random access variant optixGetTriangleVertexDataFromHandle, but does not require setting flag OPTIX_BUILD_FLAG_ALLOW_RANDOM_VERTEX_ACCESS when building the corresponding GAS.

It is only valid to call this function if the return value of optixGetPrimitiveType( optixGetHitKind() ) equals OPTIX_PRIMITIVE_TYPE_TRIANGLE.

Available in AH, CH

#

you can get instance id, instance index id, primitive index id, world transform

broken fog
#

man i have no idea what you're doing at all but seeing you this active makes me want to work on my stuff

cloud rivet
#

I don't know what I am doing either

#

apparently OptiX also requires precompiled shaders

#

which is different from CUDA kernels

#

but at least they're still C++

#

they get compiled to PTX files, which I guess is an IR like SPIR-V

#

oh there's a newer version than PTX

#

OptiX-IR

cloud rivet
#

OptiX has a couple of objects similar to a blas and tlas, but they are a little different:

DXR and VulkanRT use the terms top-level acceleration structure and bottom-level
acceleration structure. A bottom-level acceleration structure is the same as a geometry
acceleration structure; a top-level acceleration structure is similar to an instance
acceleration structure.

Traversing against a single geometry acceleration structure,
motion transform nodes, or nested instance acceleration structures (multi-level
instancing) are not supported in DXR or VulkanRT. In NVIDIA OptiX, the terms were
changed due to the additional possible configurations of scene graphs beyond the strict
two-level, top-bottom configurations supported by DXR and VulkanRT.

#

also

#

OptiX has an AI denoiser

#

The denoiser is based on statistical data sets that guide the denoising process. These data,
represented by a binary blob called a training model, are produced from a large number of
rendered images in different stages of convergence. The images are used as input to an
underlying deep learning system. (See the NVIDIA Developer article “Deep Learning” for
more information about deep-learning systems.)

#

Because deep-learning training needs significant computational resources — even obtaining a
sufficient number of partially converged images can be difficult — a general-purpose model
is included with the OptiX software. This model is suitable for many renderers. However, the
model may not yield optimal results when applied to images produced by renderers with
very different noise characteristics compared to those used in the original training data.

#

hrm

#

it also has image filters

#

idk

#

The ray payload is used to pass data between optixTrace and the programs invoked during
ray traversal. Payload values are passed to and returned from optixTrace, and follow a
copy-in/copy-out semantic. The payload is passed to all the intersection, any-hit, closest-hit
and miss programs that are invoked during the execution of optixTrace. The payload can be
read and written by each program using the thirty-two pairs of optixGetPayload and
optixSetPayload functions (for example, optixGetPayload_0 and optixSetPayload_0).

#

like in and out variables for shaders
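
Since each payload slot is a 32-bit unsigned value, anything bigger has to be split across slots. One pattern I believe NVIDIA's SDK samples use is to pass a 64-bit pointer to a larger per-ray struct in two slots. Below is a host-only sketch of just the bit manipulation; `pack_pointer` and `unpack_pointer` are illustrative names, and in a real OptiX program the two halves would go through optixTrace and be read back with optixGetPayload_0 / optixGetPayload_1.

```cpp
#include <cstdint>

// Split a 64-bit pointer into two 32-bit halves, one per payload slot.
inline void pack_pointer(void* ptr, std::uint32_t& hi, std::uint32_t& lo) {
  const std::uint64_t bits = reinterpret_cast<std::uintptr_t>(ptr);
  hi = static_cast<std::uint32_t>(bits >> 32);
  lo = static_cast<std::uint32_t>(bits & 0xffffffffu);
}

// Rebuild the pointer from the two halves.
inline void* unpack_pointer(std::uint32_t hi, std::uint32_t lo) {
  const std::uint64_t bits = (static_cast<std::uint64_t>(hi) << 32) | lo;
  return reinterpret_cast<void*>(static_cast<std::uintptr_t>(bits));
}
```

This keeps the copy-in/copy-out cost at two registers no matter how large the per-ray data is, at the price of going through memory to read it.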

#

you know what I like about OptiX, I don't have to go through the ceremony of creating a vulkan buffer

#

I just cudaMalloc

tight torrent
#

Why not just use vulkan alone or something

cloud rivet
#

because I want to use a real programming language for shader code

#

and OptiX gives me that

#

the shader set up: a shader module is basically a compiled .cu file full of ray_gen, miss, any hit, closest hit etc functions. these get separated by type into program groups, which are associated with shader binding records, which are then attached to a shader binding table (SBT), and then the SBT is supplied to a launch function

cloud rivet
#

ok ok I think I get the basics

#

I'm going to take this triangle example and make it work in my renderer

#

optix is pretty high level

#

this isn't some low level vulkan-like api, this is like the opengl of ray tracing imo

#

it kind of seems easier to use, with more features, it has all these helper functions

#

you get image filters, and motion blur, a denoiser,

solid grove
cloud rivet
#

yes I have made an SBT in VK

solid grove
#

should call it the CBT for how confusing it is tbh

cloud rivet
#

the blas/tlas set up in vulkan is so gross

#

it's just some malloc memory and a create struct for the simplest version of an acceleration structure

#

in OptiX

#

but they can get more complex

#

you can just use the geometry acceleration structure without an instance acceleration structure

#

like you can just use a blas, and no tlas if you want

#

I haven't rendered anything yet, but right now it feels about 90% less gross than vulkan's version of this stuff

#

just now got your joke

solid grove
#

ping: 1000

cloud rivet
#

on a good day

#

I think maybe cross device apis are bad

#

all the good apis are for specific hardware

#

prove me wrong

#

the nintendo graphics api and the playstation graphics api are both lauded by people familiar with them

#

metal is good

#

optix is good

solid grove
#

dx12 is meh

cloud rivet
#

dx12 is cross device

#

it's not for specific hardware

#

I mean IHVs should make their own apis

solid grove
#

Metal is also technically cross-hardware, it works on older macs with nvidia and amd gpus

cloud rivet
#

hrm

#

that explains why macs come up in cuda, I was wondering

solid grove
#

the main thing metal does well is that the difficult behavior is opt-in rather than pushed onto the developer

#

though with metal 4 they removed auto barriers so 🤷

cloud rivet
#

I don't mind barriers

solid grove
#

i think they have a good-ish reason for it though, metal 4 is all bindless all the time so the driver can't really add barriers for you

#

they got rid of setVertexBytes/setFragmentBytes with no replacement though, which I was using as a push constant equivalent froge_bleak

cloud rivet
#

uh

#

how do you associate data with a draw?

solid grove
#

in metal 4 it's only buffer objects

cloud rivet
#

but

#

you need like per draw data

solid grove
#

they used to have a convenience API in setVertexBytes for when you needed a little bit of data

#

where the driver would do the buffer stuff for you. underneath it was still a buffer object

cloud rivet
#

ok, I'm still confused how you can get per draw state

#

if push constants were taken away from vulkan it would be useless?

solid grove
cloud rivet
#

unless you have like a draw index

#

ah ok

#

that works

#

that's all I sort of do with push constants

#

give it a way to identify the actual data in a buffer
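
That idea can be shown in plain C++: the draw carries only a small id, and everything else is looked up in a buffer. This is a CPU-side sketch, not Vulkan or Metal code; `DrawData`, `DrawId` and `fetch_draw_data` are hypothetical names, and the vector stands in for a GPU storage buffer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-draw record; on the GPU this would sit in a storage buffer.
struct DrawData {
  float model[16];
  std::uint32_t material_index;
};

// Hypothetical stand-in for the small value that travels with each draw
// (a push constant, or a draw index supplied by the API).
struct DrawId {
  std::uint32_t draw_index;
};

// What the shader side conceptually does: use the id to find the real data.
// at() throws on an out-of-range index, which a real shader would not check.
inline const DrawData& fetch_draw_data(const std::vector<DrawData>& buffer,
                                       DrawId id) {
  return buffer.at(id.draw_index);
}
```

The per-draw payload stays at 4 bytes however large DrawData grows, which is why a draw index can replace push constants for this kind of use.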

solid grove
#

but the convenient thing about push constants is they are encoded as part of the command buffer so you didn't have to make a pool allocator just to set some simple global draw parameters

cloud rivet
#

so I can't just have a single file program like I was hoping

#

with cuda the kernels can live in the application code

#

but with OptiX they have to be compiled separately

#

so a second cu file

solid grove
cloud rivet
#

:P

astral hinge
cloud rivet
#

it says it's for optimization and debugging, and future features

astral hinge
#

does that only happen at program compile time

#

I'm a little confused about what exactly prevents you from mixing cpu and gpu code in a single file

cloud rivet
#

it happens when I run cmake

#

well it's not a CUDA kernel

#

it's a special thing for OptiX

#

OptiX is sort of its own graphics API that is run from a CUDA context

#
--optix-ir (-optix-ir)

Compile CUDA source to OptiX IR (.optixir) output. The OptiX IR is only intended for consumption by OptiX through appropriate APIs. This feature is not supported with link-time-optimization (-dlto), the lto_NN -arch target, or with -gencode.

Default Output File Name
The source file name extension is replaced by .optixir to create the default output file name. For example, the default output file name for x.cu is x.optixir
#

I'm kind of thinking about OptiX as an OpenGL for ray tracing in my brain

#

OpenGL as in an easy to use, high level graphics API, not as in open or cross platform

#

it's not a CUDA thing, it's just run via a CUDA context

#

you use CUDA to upload everything to the device

#

you can't use a bunch of CUDA features in the OptiX shader programs

#

like barriers or any synchronization

astral hinge
#

I see

cloud rivet
#

For efficiency and coherence, the NVIDIA OptiX runtime—unlike CUDA kernels—allows the execution of one task, such as a single ray, to be moved at any point in time to a different lane, warp or streaming multiprocessor (SM). (See section “Kernel Focus”1 in the CUDA Toolkit Documentation.2) Consequently, applications cannot use shared memory, synchronization, barriers, or other SM-thread-specific programming constructs in their programs supplied to OptiX.

#

add_custom_command in CMake is pretty cool

#

cmake has all these utility commands

#

like moving files if changed

#

and using that with add_custom_command is so handy

#

imagine using a bat file

#

I just have to learn everything the hard way

cloud rivet
#

alright I have shader source compilation working

#

what's next

#

optix init code

cloud rivet
dry apex
#

i miss doing graphics

#

seeing you being so committed motivates me lol

cloud rivet
#

Just do some

#

It’s pretty cool to do

cloud rivet
#

yesterday I ran into an issue trying to use Optix-IR and switched to the PTX format, which is what blender uses

#

but I wanted to investigate it a bit more

#

something about the built in vector type was failing to compile when I was using the Optix-IR option when compiling the shader programs

#

well

#

it doesn't fail on compile

#

it fails on creating the shader module at run time

#

maybe I can create a small reproducible test case for the forum

brisk chasm
#

thats always a good idea

cloud rivet
#

I want to go over all the nvcc compiler options

#

maybe it is alignment related

#

native vector types and alignment is a reoccurring topic for me

#

two more hours and I'm on vacation frogegreenexcited

cloud rivet
#

I have used AI for good

#
#

I had claude help me create an easily reproducible example of the bug

#

they can download, unzip and see for themselves

#

even though I used AI it took me several hours to get it right

#

unlikely anyone will work on it this month, anyway PTX works

#

but they're pushing everyone to use optix-ir so why would it fail like this

#

how ironic is it that my first issue using OptiX is a shader bug agonyfrog

#

anyway, not blocked

cloud rivet
#

i am pretty sure that it is a bug

#

oh

#

someone else just reported a bug too

#
#

I think they shipped some regressions with the latest

cloud rivet
#

I don't know if this exists in Vulkan, but OptiX has a motion transform, which seems like it would help with updates resulting from animations?

#

I'll have to try and use it to understand it better

#

oh the OptiX guide goes on to say these things don't exist in vulkan or dx

solid grove
#

i think motion transforms are intended for stuff like motion blur

brisk chasm
#

for denoisers to plugin?

cloud rivet
#

I see

#

I've been doing a force sync after the kernel call, and have not been using the semaphores, I'm going to fix that now

#

was just trying to get a triangle going

#

going to use timeline semaphores, haven't used them before

bronze socket
#

they're nice

cloud rivet
#

hrm actually think binary semaphores are easier for this

#

and it requires using a cuda stream now

#

and I'll have to not delete these when resizing so I need to fix my frame reset code

#

I don't think I need a stream per FIF thinkeyes

#

thonk hrm

#

The cudaStreamSynchronize() function will block until all the work in the stream has completed.

#

I'm trying to signal an external semaphore

#

so

#

syncing the sync sounds bad

brisk chasm
#

is this not something like a commandlist situation

#

you write to a stream with any thread, when you are done writing and want to submit you sync then

cloud rivet
#

not sure

#

there's no command list

#

The simplest way to synchronize the GPU and a host thread is with the use of cudaDeviceSynchronize, which blocks the host thread until all previously issued work on the GPU has completed. In the examples of this chapter this is sufficient because only single operations are being executed on the GPU. In larger applications, there may be multiple streams executing work on the GPU and cudaDeviceSynchronize will wait for work in all streams to complete.

#

I think maybe I just have a stream per fif

#

all my sync is per fif

#

so I think this makes sense

#

unfortunately all the cuda and optix sample code is one shot

#

at least all the ones I looked at

#

I think it is safer to over sync and I'll be able to profile and fix slowness

#

vs random shit is broken because I didn't sync something and I have no idea what is going wrong

brisk chasm
#

i mean the stream is the "commandlist"

cloud rivet
#

yes that's what it is

#

the analog to that is the vulkan command buffer, and I have one per FIF

brisk chasm
#

why not just use vulkan for all this? ididnotread

cloud rivet
#

because I get to write C++ for my shader code

#

a real programming language

brisk chasm
#

ah, and vchichi is not ready yet i assume

cloud rivet
#

vcc is a research project

#

it's not ready for anything other than research

brisk chasm
#

someone has to dogfood it heh, yeah

#

oki i get it now

cloud rivet
#

vcc is amazing, but to me it seemed that Gob has no plans to make it into a production ready thing someone should use for a game

brisk chasm
#

slang was a disappointment, now you try optix

cloud rivet
#

yeah

#

cuda and optix tbh, I will use both

brisk chasm
#

ok that was the missing piece

cloud rivet
#

yeah I have been spamming this channel, so hard to keep up with my crazy ranting

brisk chasm
#

all good

#

we do have #gpu-compute as well, not sure if you put your sniffer in there too, just in case other frogs fiddled with optix already, plus the cuda discord linked in related-serveurs

cloud rivet
#

oh there's a cuda specific discord?

brisk chasm
#

yep

cloud rivet
#

that server seems pretty quiet

#

I'll lurk

#

last mention of optix was over a year ago

#

thanks

brisk chasm
#

doesnt hurt to ping peeps, greg might know a thing or two

#

or nightchild, both are here too

cloud rivet
#

I lurk in #gpu-compute

#

I don't really participate in the topic specific "Discussion" channels

#

but yeah I see what's being said

#

I don't generally ask for help, just try to figure things out on my own. I hate asking for help. It's a flaw

#

I feel like I'm surrendering

brisk chasm
#

i will never understand that 🙂

#

but i get it

cloud rivet
#

having to ask for help about slang is actually where I ended up having had enough

#

and that's when I dumped it

#

yes I will do a cuda stream per fif

#

and I will call cudaStreamSynchronize

#

but it should be empty of tasks

#

no

#

I should not call that

#

it interferes with the GPU to GPU sync

cloud rivet
#

alright let's see if this works

#

worked until I resized

#

ok I need to fix that

brisk chasm
#

hehe

cloud rivet
#

works

#

if I resize the window too much I can run out of device memory however

#

I think I need to create a really large shared buffer and only render to a portion of it so I don't have to recreate the buffers on resize
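
A minimal sketch of the "allocate once at the largest size, render into a portion" idea, in plain C++ with no CUDA or Vulkan calls. `MaxSizedTarget` and its members are hypothetical names, and the layout assumption (the active image is tightly packed at the start of the allocation, indexed as `y * width + x`) is mine, not something confirmed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a buffer allocated once at the largest
// resolution the window is expected to reach.
struct MaxSizedTarget {
    std::uint32_t max_width;
    std::uint32_t max_height;
    std::uint32_t bytes_per_pixel;

    // Size of the one-time allocation.
    std::size_t allocation_size() const {
        return std::size_t(max_width) * max_height * bytes_per_pixel;
    }

    // True if a window of this size can be rendered without reallocating.
    bool fits(std::uint32_t width, std::uint32_t height) const {
        return width <= max_width && height <= max_height;
    }

    // Bytes actually written for the current window size.
    std::size_t used_bytes(std::uint32_t width, std::uint32_t height) const {
        assert(fits(width, height));
        return std::size_t(width) * height * bytes_per_pixel;
    }

    // Byte offset of pixel (x, y) when rows are packed at the current width.
    std::size_t pixel_offset(std::uint32_t width, std::uint32_t x,
                             std::uint32_t y) const {
        assert(x < width && width <= max_width && y < max_height);
        return (std::size_t(y) * width + x) * bytes_per_pixel;
    }
};
```

On resize only `width` and `height` change; the allocation and any handles exported from it stay alive, as long as `fits` holds.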

#

the way this code blows up on resize is hilarious

#

abort dialog bombs

#

I will fix that later

#
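// Renders one frame with OptiX into the shared buffer.
// The semaphores, stream and buffer are all per-FIF objects.
internal void optix_render_to_buffer(
    ps_ctx_t &ctx, i32 frame_count, cudaExternalSemaphore_t *cuda_v2c_semaphore,
    cudaExternalSemaphore_t *cuda_c2v_semaphore, cudaStream_t cuda_stream,
    ps_shared_buffer_t *buf, u32 width, u32 height) {
  // Wait for Vulkan's signal (v2c); the first frame has nothing to wait for.
  if (frame_count > 0) {
    cuda_wait_semaphore(ctx, cuda_v2c_semaphore, cuda_stream);
  }
  // Launch parameters shared with the shader code through the Params struct.
  Params params = {};
  params.output = (float4 *)buf->cuda_ptr;
  params.width = width;
  params.height = height;
  params.handle = ctx.optix.gas_handle;

  // Camera sits at z = 2 and looks down -z.
  params.cam_eye = make_float3(0.0f, 0.0f, 2.0f);

  // Half extents of the image plane for a 45 degree vertical field of view.
  float aspect = (float)width / (float)height;
  float fov_y = 45.0f * 3.14159f / 180.0f;
  float half_height = tanf(fov_y * 0.5f);
  float half_width = aspect * half_height;

  // cam_u and cam_v span the image plane (cam_v is negated), cam_w points forward.
  params.cam_u = make_float3(half_width, 0.0f, 0.0f);
  params.cam_v = make_float3(0.0f, -half_height, 0.0f);
  params.cam_w = make_float3(0.0f, 0.0f, -1.0f);

  // Upload the parameters, then trace; both are queued on the same stream.
  checkCudaErrors(cudaMemcpyAsync(reinterpret_cast<void *>(ctx.optix.d_params),
                                  &params, sizeof(Params),
                                  cudaMemcpyHostToDevice, cuda_stream));

  OPTIX_CHECK(optixLaunch(ctx.optix.pipeline, cuda_stream,
                          reinterpret_cast<CUdeviceptr>(ctx.optix.d_params),
                          sizeof(Params), &ctx.optix.sbt, width, height, 1));

  // Tell Vulkan (c2v) that this frame's output is ready.
  cuda_signal_semaphore(ctx, cuda_c2v_semaphore, cuda_stream);
}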
#

optixLaunch starts the ray trace
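
The shader side is not shown here, so the following is only a sketch of how `cam_u`, `cam_v` and `cam_w` are commonly turned into a ray direction in a pinhole setup: map the pixel to [-1, 1] and normalize `dx * U + dy * V + W`. The host math mirrors the function above; `Vec3`, `Camera`, `make_camera` and `ray_direction` are hypothetical names, and the pixel-center mapping is an assumption.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 vec_add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 vec_scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
Vec3 vec_normalize(Vec3 a) {
    const float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    assert(len > 0.0f);
    return vec_scale(a, 1.0f / len);
}

struct Camera { Vec3 u, v, w; };

// Same construction as the host code: half extents of the image plane
// from the vertical field of view and the aspect ratio.
Camera make_camera(float width, float height, float fov_y_degrees) {
    assert(width > 0.0f && height > 0.0f);
    const float aspect = width / height;
    const float fov_y = fov_y_degrees * 3.14159265f / 180.0f;
    const float half_height = std::tan(fov_y * 0.5f);
    const float half_width = aspect * half_height;
    return Camera{{half_width, 0.0f, 0.0f},
                  {0.0f, -half_height, 0.0f},
                  {0.0f, 0.0f, -1.0f}};
}

// Assumed mapping: pixel centers to [-1, 1], then dx*U + dy*V + W.
Vec3 ray_direction(const Camera &cam, float px, float py,
                   float width, float height) {
    const float dx = 2.0f * (px + 0.5f) / width - 1.0f;
    const float dy = 2.0f * (py + 0.5f) / height - 1.0f;
    return vec_normalize(
        vec_add(vec_add(vec_scale(cam.u, dx), vec_scale(cam.v, dy)), cam.w));
}
```

Under this mapping the negated `cam_v` makes row 0 point upward, so the image is not flipped vertically when rows are written top to bottom.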

#

I should just pass the frame context into this function

#

those are all per fif objects, the stream, semaphores and buffers

#

yeah the cuda_stream is just like a command buffer in vulkan

#

I like it

#

so far really enjoying all this

#

that Params struct is used in the shader code, I can just share my header file with the shader code

#

I can just use normal C++20 in the shader code

#

it's great

#

after I fix resize I'll add imgui, which requires adding a dynamic render pass to vulkan

#

and then I should be done touching vulkan

#

it'll just be CUDA and OptiX and application/game code from then on

#

well I should break my single main.cpp file up at some point

#

but that's just moving code around

#

ah tokei doesn't count cu files RIP

#

I think the CUDA cmd stream must be handled in order since I don't have to add any barriers between the cudaMemcpyAsync and optixLaunch thinkeyes I should confirm this

#

like a gpu to gpu barrier?

#

oh

#

no barriers needed

#

A stream operates like a work-queue into which programs can add operations, such as memory copies or kernel launches, to be executed in order.

#

guaranteed to be in order, be praised
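
A toy model of that in-order guarantee, in plain C++. This is not the CUDA API: `ToyStream`, `enqueue` and `drain` are hypothetical names, and the real stream runs work asynchronously on the device. It only illustrates why a copy queued before a launch on the same stream is visible to that launch without an explicit barrier.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Toy stand-in for a stream: operations run strictly in submission order.
class ToyStream {
public:
    // Queues an operation; nothing runs yet.
    void enqueue(std::function<void()> op) {
        assert(op != nullptr);
        ops_.push(std::move(op));
    }

    // Executes every queued operation, oldest first.
    void drain() {
        while (!ops_.empty()) {
            std::function<void()> op = std::move(ops_.front());
            ops_.pop();
            op();
        }
    }

private:
    std::queue<std::function<void()>> ops_;
};
```

Queue a "copy" that writes a value and then a "launch" that reads it: after `drain()` the launch always sees the copied value, because the queue never reorders.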

#

imagine still using vulkan kekwfroggified

#

I never got out of bed this morning, I've just been sitting here with my laptop writing this stuff

#

I haven't eaten or anything, it's already dark

#

lol

#

engrossed

#

I still need to get my 10k steps in fml

astral hinge
#

that's gotta be 1-2 hours of walking frogstare

cloud rivet
#

yeah

#

I'll listen to a Critical Role podcast though while I walk froge_love

brisk chasm
#

i wanted to go to bed 4hrs ago, its almost 2:30am and now i feel like starting a new round of anno 117 🙂 and just skip the night, to fix my schlepp cycle

cloud rivet
#

I started vacation, I can be up at any time, no responsibilities, why can't I always live like this.

brisk chasm
#

ill be off work soon too, but i have people to tell what they have to do until then ;C

astral hinge
cloud rivet
#

I gotta pay college tuition starting next year lol

echo crystal
#

frogally*

brisk chasm
#

i just read cottage cheese

#

and i am getting hungry again, damn you people 🙂

cloud rivet
#

I haven't eaten and am not hungry, so weird

#

I gotta get up now

brisk chasm
#

hmm i could try that new marmelade on some peanut butter

echo crystal
#

am getting hungry

brisk chasm
#

on some briefly toasted slice of german bread, with thicc butter underneath the peanut butter

astral hinge
#

90% nuss

cloud rivet
#

I am going to eat something healthy and all natural like a food truck burrito

#

They got one by a bar that stays open late

astral hinge
#

unsanctioned food trucks that only appear in parking lots at night and dump their trash on the street hit different

#

I mean the food from them does, not the trucks themselves frogstare

brisk chasm
#

hehe

#

i understood what you meant

astral hinge
#

hostile interpretation is funny doe

brisk chasm
#

it do be

echo crystal
#

be doo be

cloud rivet
astral hinge
#

nice! how are you feeling?

cloud rivet
#

tired

#

feeling super lazy today

#

going to uh work on resize device oom bug I guess

astral hinge
#

I wouldn't feel lazy if I worked on my game and walked for over two hours

cloud rivet
#

I guess what I mean is I feel more of a headwind than normal

astral hinge
#

I've been neglecting my cardio since the sun is setting early and my sleep schedule has been wack agonyfrog

cloud rivet
#

exercising has gotten much harder since it is so cold now

#

I exercise very late

#

I'm gonna take a break from exercise for about a week

#

I'll still get my steps in though

astral hinge
#

I go to the gym still

#

the gym is a great place

cloud rivet
#

my shed of a garage is my gym, it's just a wooden shack with no door

#

it's freezing in there

#

at like 11pm

#

I wear thermal wear and a hoodie while I train and my lungs burn

#

my current roadmap for the rest of December to get back to where I was is:

  1. fix resize mem bug
  2. add imgui/dynamic render pass
  3. add animated cubes to learn how to instance draw
  4. add Suzanne monke head & gltf loading & camera movement
  5. add deccer cubes to get scene and texture loading
  6. add Sponza to figure out how materials/normal/surface details maps work in RT
  7. figure out shadows, there's no shadow thing in OptiX
  8. add track and vehicle back in
  9. back to where I was!
#

I got 3 and a half weeks

#

mostly vacation and holidays

#

I think maybe doable

#

I have done all this before with a graphics pipeline

#

maybe it's crazy hard though with RT

#

not NIH'ing anything so will use tinygltf and SDL for image loading

#

I'll actually be way ahead because of that from where I was before since I'm using libraries again

sinful agate
astral hinge
#

lung day

pseudo dock
cloud rivet
#

still going to switch it to not recreate buffers and draw images per resize though

#

there's no visible aliasing along the triangle edges, it's nice how smooth it is. it's such a pretty triangle

#

the colors in the video look a bit more washed out than they do in the actual render

#

the green is especially a lot more vibrant for me than it is in the video

#

a relative comparison

#

a screen shot of the video makes it even worse

#

but you can tell there's a difference, you obviously can't see what it looks like for me though

#

gross

#

I want to share the actual colors froge_sad

#

oh I could dump the draw image to a file

#

but that wouldn't work with a video

#

I should use CUDA for that

#

I'll do that later, going to stay focused

cloud rivet
#

hrm

#

I know what it is

#

I think I know

#

no that wasn't it, idk

#

I wonder if this is just on resize or a problem with any kind of change per frame

cloud rivet
#

on to imgui

cloud rivet
#

hrm

#

the fuck

#

how do you debug OptiX thinkeyes

#

it doesn't even show up

#

hrm thinkeyes

#

can I see the acceleration structure though, like I can with Nsight graphics in vulkan?

#

yes

#

hrmmm

#

there it is

#

nsight graphics is useless to me now

#

it's just nsight compute

#

I already uninstalled renderdoc

#

nsight systems and nsight compute

astral hinge
cloud rivet
#

no

astral hinge
#

that answer is unclear to me

cloud rivet
#

why?

astral hinge
#

so you're saying it does support cuda-vulkan interop?

cloud rivet
#

it does not

astral hinge
#

rip

cloud rivet
#

yeah I mean

#

not much to gain from looking at nsight graphics

#

it's just presenting the image from optix

astral hinge
#

yeah vulkan isn't doing the real work anyway

cloud rivet
#

I was thinking since I had vulkan I could still use the graphics pipeline to rasterize if I wanted to

#

but without any kind of debugger support

#

apparently

astral hinge
#

at least you have the validation layers

cloud rivet
#

anyway, if I want to rasterise now, I'll do it in cuda

#

hrm, the examples, the docs, the debugging applications, it's all NVIDIA, it's just one company's entire ecosystem I've attached myself to

#

it's fine

#

everything is fine 😅

#

it's going well so far. I think a couple of major drawbacks are apparent now though

#

updates break things, seems to be a pattern

#

a completely opaque (binaries, not much is open source) and proprietary ecosystem with a not very large set of visible users

#

I imagine most users of this tech are private companies not spamming a discord/reddit with what they're working on

#

it's fun though

#

gonna keep going

#

it's a 4 or 5 Trillion dollar market cap company

#

they're probably not going to kill OptiX tomorrow

astral hinge
#

or slacks, which are the same

cloud rivet
#

maybe it's like a cult and I should start wearing a leather jacket

#

you know, to fit in with the slacks, a leather jacket is required

#

to get in

astral hinge
#

the leather jacket is the key

cloud rivet
#

I used to have one, a long long time ago

#

I don't remember what happened to it

#

I bought it in korea in the 90's

#

I don't think I need a graphics pipeline at all

#

I think I just start a dynamic render pass, call imgui and then end the render pass and done

#

ezpz?

#

just a color attachment

#

init imgui with vk stuff

#

start frame, end frame, shut down

#

capture events

spiral ice
#

You are interoping Vk and Cuda?

cloud rivet
#

yeah

#

anything you want to render in real time from CUDA you have to get on a screen with another graphics API since it can't present

spiral ice
#

Yeah. I am doing a similar project (CUDA and Optix), though I use OpenGL, which seems to be an easier path

cloud rivet
#

is that GPU to GPU?

spiral ice
#

Yeah

cloud rivet
#

nice

#

yeah I'm using OptiX also

#

there's two of us!

#

how far along are you?

#

I used vulkan because that's all I know anymore

spiral ice
cloud rivet
#

nice, what IR are you using? PTX or Optix-IR?

#

I can't get Optix-IR to work, I posted about it on the NVIDIA forums

#

for compiling your optix shaders

spiral ice
#

I am using PTX (compiled via nvrtc)

cloud rivet
#

what's nvrtc

spiral ice
#

Runtime compiler. I want to support hot-reloading eventually

cloud rivet
#

oh I see

#

yeah my plan for runtime hot reloading was just to compile and reload the binary

#

so nvcc -> new binary -> app detects it updated and reloads

#

I just read the bytes with file read right now
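
The "app detects it updated and reloads" step could look roughly like this with `std::filesystem`. It is a sketch under assumptions: `ReloadWatcher` is a hypothetical helper, polling the timestamp once per frame is one option among several, and handing the bytes to the OptiX module creation call is left out.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: remembers the last write time of a compiled shader
// file and reports when it differs from the one seen on the previous poll.
class ReloadWatcher {
public:
    explicit ReloadWatcher(fs::path path) : path_(std::move(path)) {
        assert(!path_.empty());
    }

    // Returns true on the first successful poll and after every change.
    bool changed() {
        std::error_code ec;
        const fs::file_time_type stamp = fs::last_write_time(path_, ec);
        if (ec) return false;  // file missing or being replaced; retry later
        if (has_stamp_ && stamp == last_) return false;
        has_stamp_ = true;
        last_ = stamp;
        return true;
    }

    // Reads the whole file as raw bytes, like a plain file read.
    std::vector<char> read_bytes() const {
        std::ifstream in(path_, std::ios::binary);
        return std::vector<char>(std::istreambuf_iterator<char>(in),
                                 std::istreambuf_iterator<char>());
    }

private:
    fs::path path_;
    fs::file_time_type last_{};
    bool has_stamp_ = false;
};
```

A frame loop would call `changed()` and, when it returns true, pass `read_bytes()` to whatever rebuilds the module and pipeline.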

#

so nvrtc also supports optix-ir

#

I should have figured rt stands for run time

#

they say ptx is deprecated, but blender uses it still

#

optix-ir has a bunch of bugs tbh

#

based on the forum

spiral ice
#

I don't know much about PTX vs OptixIR. I guess I will just use what is currently working until there is a need for change

cloud rivet
#

I don't either, I just saw in the documentation to not use PTX but then optix ir doesn't even work

#

so I also just use PTX

#

works for me

#

apparently optix-ir is supposed to have better performance

#

i need my single triangle to be super performant

#

do you have a project thread or share any updates?

#

what made you pick CUDA/OptiX?

spiral ice
spiral ice
# cloud rivet what made you pick CUDA/OptiX?

I am making a ray tracer with multiple backends (CPU, CUDA pure software, and hardware acceleration with Optix). Using CUDA/Optix makes it easy to share code between different backends.

Maybe in a distant future, I can also implement a Vulkan backend or something

cloud rivet
#

oh, that's cool

#

my opinion, having used the vulkan RT just a tiny bit

#

optix is much better

#

so far anyway

#

vulkan has fewer features and a burdensome API

#

did you get your leather jacket yet?

spiral ice
#

What's that?

cloud rivet
#

a jacket made of leather

#

btw I think @shut hornet doesn't actually use CUDA they are likely an imposter with that role susge but I could be wrong. I have seen zero pictures of them with a jacket on as evidence

cloud rivet
#

ok dynamic render pass added, but it doesn't do anything

#

oh I need a barrier for that

#

cool no vk sync validation issues

#

alright

#

imgui init is next

#

I guess I need to add the library as a git submodule

cloud rivet
#

cmake is great

#

every time I have to do anything with cmake, it's easy, idk why I have slept on it for so long

shut hornet
#

but honestly really dislike Nvidia in general and will need to port this to something open, just don't have much time

shut hornet
#

thanks!!

cloud rivet
#

I like your logos too

#

you have good branding on all your projects

shut hornet
#

and, AFAIK, the training was done with open datasets, so there shouldn't be anything "stolen" in there

cloud rivet
#

right

shut hornet
#

i have some freelancers, but i'll mostly just do a rough sketch and then send it to them to make it look professional

cloud rivet
#

makes sense, and comes out looking great imo

shut hornet
#

thanks a bunch!!

cloud rivet
#

but do you wear a leather jacket

#

jk jk

#

it's not required for the cuda role afaik

#

so after I added imgui to my project I realized I was over gamma correcting my render
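
The cause is not spelled out here, so this is only an illustration of what over-correcting looks like numerically: one common way it happens is applying the sRGB transfer function in the shader while the presentation path encodes again. The function below is the standard linear-to-sRGB curve; the name `linear_to_srgb` is just a label for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Standard sRGB transfer function (linear to encoded) for a value in [0, 1].
float linear_to_srgb(float c) {
    assert(c >= 0.0f && c <= 1.0f);
    return (c <= 0.0031308f) ? 12.92f * c
                             : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}
```

Encoding a linear value of 0.2 once gives roughly 0.48; encoding that result a second time pushes it to roughly 0.73, which is why a doubly corrected image looks washed out and too bright.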

shut hornet
shut hornet
cloud rivet
#

it's super whack

#

but yeah

#

It was me this time

shut hornet
#

ok

brisk chasm
#

i thought rendepth was your thing 100%

brisk chasm
cloud rivet
#

10x engineers 10x their gamma correction smart

#

damn it's nice to be using imgui instead of my half baked ui lol

#

it felt so cool to be able to tell people I made something totally from scratch, but you know, that's just ego

#

and in the end I had a worse thing

shut hornet
cloud rivet
#

I did like my UI code though

#

imgui has so many nice features

brisk chasm
#

you could pick up your ui lib and make it cooler later

cloud rivet
#

true

shut hornet
#

like even wrote a crappy XML parser from hand, and the COLLADA importer, no libraries at all

cloud rivet
#

it's so rough

shut hornet
#

cool learning experiment, so i know to never do that again

cloud rivet
#

I agree

shut hornet
#

though, I did do a custom UI on Rendepth and honestly that alone took like a month, when the rest of the app took like 2 months (for the MVP)

cloud rivet
#

for the actual game UI I want to do my own UI still

#

but the debug ui tooling, it's fine to use imgui for it

#

well I want it to be a custom UI unique to the game

#

it might still be a library if I can customize it enough

#

alright I have imgui, I have resize, I have correct sync, it's all RT and game stuff going forward, vulkan is done

shut hornet
#

imgui is a great debugger but not good for in-game ui

elfin cape
cloud rivet
#

I'm gonna star that

brisk chasm
#

all this talk about ui reminded me of something i made decades ago

#

a custom little ui to browse brushes and maps for some little terrain editor

#

it felt super easy to make back when, now it takes a bit more head scratching for me if i was to reproduce that thing hehe

shut hornet
#

sometimes things are easier when you are stupid and don't realize how hard they are smart

brisk chasm
#

yeah

#

or i was much smarter back then, also possible hehe

shut hornet
#

or it's like gambling, where you always win big the first hand of hold 'em you play, and then progressively just get worse and worse

#

think I won like $200 the first time i played poker, all downhill from there 😭

cloud rivet
#

4 am wow

#

hrm

brisk chasm
#

in 2mins its leet time here

#

2mins ago it was leeeet time 😄 (13:33:37)

#

or leeeeeeet time, 13:33:33.337, haha

bronze socket
#

especially for stuff that needs heavier than imgui-tier debug UI but lighter than Qt gigabloat

#

what I think is really cool is that since it has some subset of CSS so it's pretty moddable, including by your users

#

it's def cross platform, you provide your own backends just like imgui and there's default (GL, Vulkan) x (GLFW, SDL) much like imgui, however the vulkan one last I tried is broken

#

but if you're integrating into your own vulkan renderer you likely wanna write your own backend to integrate with your resource management/frame graph/whatever

shut hornet
#

yeah, I understand that much, just looking at the github there was a lot of red x's on the support matrix so didn't seem 100% everywhere

#

though yeah, I guess you can do the integration yourself

brisk chasm
#

DRUI when

bronze socket
#

I have my copy of roblox UI in my engine but much like real roblox UI it's mildly ass to work with

#

RmlUi is awesome because you can build applications with it and people can mod them like foobar all without it being a giant electron app or Qt/WxWidgets/etc. app

#

though for something like a matrix chat or something approaching minimal browser functionality there's a lot I'd have to mod in myself like adding my text support lib for full text rendering/editing capabilities, as well as getting some kind of wuffs-based image loading for safe rendering of images you receive from the internet

shut hornet
#

and still be more optimized than embedding a web browser

solid grove
shut hornet
#

and companies realize that (in terms of development) but still ship "native" apps I guess cause it's easier to make money (or embed illegal tracking)

bronze socket
#

yeah the worst thing about browsers for a company is that they are very well sandboxed

#

I'm occasionally a little sad that html5 didn't take up the slack for flash's market even though it could have and should have

brisk chasm
#

im still a little sour that silverlight was not turned into a native thing, not requiring any plugin, but running natively in any major brauser on any platform

#

that would have been "rich" lol (since all that shit was called rich internet application back when)

solid grove
#

kids could start making movies and games in it with no effort, and have them online in minutes

brisk chasm
#

but flash was riddled with bugs and security vulnerabilitilbilities, thats what fonts and drawing primitives do to a mf render-engine

solid grove
#

the flash plugin also gave webpages unrestricted access to the user's filesystem froge_bleak

bronze socket
#

the closest thing that comes to mind in functionality is the 2D animation tool in blender but that's not exactly kid friendly

solid grove
#

flash is still used today in TV animation

bronze socket
#

I've only ever played with it once

#

wow, didn't realize they hadn't replaced it there either

#

I know flash had a pretty big impact on animation production cycles when it came out

solid grove
#

i think ToonBoom Harmony is steadily eating its lunch, but flash is still king

#

Adobe renamed flash to "animate" and focused almost entirely on TV animation but it's the same app and actionscript is still there

#

to my knowledge anime is still mostly drawn on physical paper and scanned in

bronze socket
#

yeah I've heard that as well

shut hornet
#

or install 50 browser plug-ins and viruses, etc. even without Flash

brisk chasm
#

how was that a red herring smh, it was about flash not everything else

shut hornet
#

like the industry used that as an excuse to kill Flash, when it was really about money/control

#

Apple changed the rules on iOS several times to specifically block Flash, even after Adobe enabled native compile (which did not use the web runtime)

shut hornet
cloud rivet
#

does it let you write actionscript?

#

it does!

#

:O

#

it's just flash

shut hornet
cloud rivet
#

the steve jobs letter of doom hit them

cloud rivet
#

soooo...the optix sdk ships with a gltf scene loader

#

do I NIH this anyway

brisk chasm
#

yes

cloud rivet
#

it handles geometry compression

brisk chasm
#

nobody is keeping you from peeking into the code there and steal relevant bits

cloud rivet
#

as in there's an algorithm in there where it determines the size required to build the acceleration structures via compression

brisk chasm
#

do you mind sharing the link to that repo though?

solid grove
#

if it’s there and you’re already tied to optix I see no reason not to

cloud rivet
#

and the thing you download is an installer

#

it's not source code

#

but you get source code

#

it's just like the FBX SDK

#

the code does have a license though

#

SPDX-License-Identifier: BSD-3-Clause

#

maybe someone put it on github

#

maybe nvidia did

#

looking

#

it's actually not that much code

brisk chasm
#

ah i have a novideo account, no worries

#

you need one for nsight, or needed one at least

cloud rivet
#

I can't see this code on github

cloud rivet
#

I feel like they made this just for their samples

#

it's not documented

#

anyway

#

pretty cool

#

I think I can build this myself tbh, it's complex but there's a huge comment explaining what it does

#

and it's not that much code

#
    // Problem:
    // The memory requirements of a compacted GAS are unknown prior to building the GAS.
    // Hence, compaction of a GAS requires to build the GAS first and allocating memory for the compacted GAS afterwards.
    // This causes a device-host synchronization point, potentially harming performance.
    // This is most likely the case for small GASes where the actual building and compaction of the GAS is very fast.
    // A naive algorithm processes one GAS at a time with the following steps
#

is the beginning of the long comment

brisk chasm
#

sometimes there are some gems in those code schnippets

cloud rivet
#

yeah there's a lot of code in both the cuda and optix samples

#

the cuda stuff is on github

#

it doesn't have the optix code though

#

it's pretty cool, there's basically an example for anything I might want to do

#

to build the OptiX SDK I had to use these args: cmake build .. -DCUDA_MIN_SM_TARGET=sm_86 -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -G "Visual Studio 17 2022". three reasons: the latest cmake I have deprecated the cmake version the project uses, the optix sdk doesn't build with VS 2026, and the minimum cuda version is not supported, at least by my system, so I specified sm_86 which RTX cards support

#

just in case you wanted to build it, it took me a bit to figure all that out

#

today I have just been reading through the sample to figure out how to do instance acceleration structures because I wanted a bunch of spinning cubes

#
    optix_instance.traversableHandle = m_meshes[instance->mesh_idx]->gas_handle;
#

there it is

#
    instances[0].traversableHandle = state.static_gas_handle;
    instances[1].traversableHandle = state.static_gas_handle;
    instances[2].traversableHandle = state.deforming_gas_handle;
    instances[3].traversableHandle = state.exploding_gas_handle;

    size_t      instances_size_in_bytes = sizeof( OptixInstance ) * instances.size();
    CUDA_CHECK( cudaMalloc( ( void** )&state.d_instances, instances_size_in_bytes ) );
    CUDA_CHECK( cudaMemcpy( ( void* )state.d_instances, instances.data(), instances_size_in_bytes, cudaMemcpyHostToDevice ) );

    state.ias_instance_input.type = OPTIX_BUILD_INPUT_TYPE_INSTANCES;
    state.ias_instance_input.instanceArray.instances = state.d_instances;
brisk chasm
#

gas handles 😄

#

funny word

cloud rivet
#

I know right

#

ok

#

ok so the instances also get a transform

#

7.37.2.5
transform
float OptixInstance::transform[12]
affine object-to-world transformation as 3x4 matrix in row-major layout
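
A small sketch of filling a 12-float, row-major 3x4 object-to-world matrix like the one described above, here for a rotation about the Y axis plus a translation (the spinning-cube case). It is plain C++ with no OptiX headers; `make_transform_rotate_y` and `transform_point` are hypothetical helpers, and copying the result into an instance's `transform` array is left to the caller.

```cpp
#include <cassert>
#include <cmath>

// Fills a 3x4 row-major affine matrix: rotate about Y, then translate.
// Layout: out[0..3] is row 0, out[4..7] is row 1, out[8..11] is row 2,
// with the translation in the last column of each row.
void make_transform_rotate_y(float angle_radians, float tx, float ty, float tz,
                             float out[12]) {
    assert(out != nullptr);
    const float c = std::cos(angle_radians);
    const float s = std::sin(angle_radians);
    out[0] = c;    out[1] = 0.0f; out[2]  = s;    out[3]  = tx;
    out[4] = 0.0f; out[5] = 1.0f; out[6]  = 0.0f; out[7]  = ty;
    out[8] = -s;   out[9] = 0.0f; out[10] = c;    out[11] = tz;
}

// Applies the 3x4 matrix to a point (implicit w = 1), for checking results.
void transform_point(const float m[12], const float p[3], float out[3]) {
    assert(m != nullptr && p != nullptr && out != nullptr);
    for (int row = 0; row < 3; ++row) {
        const float *r = m + row * 4;
        out[row] = r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3];
    }
}
```

Updating the angle each frame and rewriting these 12 floats per instance is the host-side half of the animation; the instance acceleration structure still has to be rebuilt or updated afterwards.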

#

there it is

#

cool I think maybe I can do some cubes

#

with these sdks instead of having a million browser tabs open I have a million notepad++ tabs open

brisk chasm
#

classic

cloud rivet
#

also using samutrapdf favorites

#

to jump around pdfs

brisk chasm
#

yeah thats a good viewer

#

sumatra 🙂 but your typo sounds more fun

cloud rivet
#

oh right thanks

sinful agate
#

because they all look the same

shut hornet
brisk chasm
#

iirc a bunch of it is on their github

cloud rivet
#

I agreed to stuff without looking at it

#

The code I shared has a bsd license

shut hornet
#

if it's the same way it was, yes, that agreement for the Nvidia dev account is somewhat strict

#

though some of the stuff is open source, it depends which sdk

cloud rivet
#

well, they don't have to worry about me sharing anything I just plan to keep it all on my computer

#

I got a lot of stuff to read through and learn about. it's great

shut hornet
#

for sure. i don't really care, just heads up to know what you are signing

cloud rivet
#

yes, thank you

spiral ice
# brisk chasm gas handles 😄

I actually prefer the gas/ias terminology compared to blas/tlas. Also Optix supports multi-level instancing, so calling an ias a tlas would be weird

cloud rivet
#

yeah the Optix acceleration structure stuff seems way better than what you get with vulkan

astral hinge
#

what do gas and ias stand for

cloud rivet
#

geometry acceleration structure

#

instance acceleration structure

#

it's just a memcpy and a build with an update operation to update an ias

brisk chasm
#

i have no idea about any of this

cloud rivet
#

that's all it is

brisk chasm
#

geometry accelellleleeration structure?

cloud rivet
#

yeah

#

each of your deccer cubes would be just one gas

brisk chasm
#

it do be sounding more senseful

cloud rivet
#

and then you create an ias

brisk chasm
#

intelligent accelleration structure 🙂

cloud rivet
#

and the number of deccer cubes as instances each with their own transform

brisk chasm
#

ah