#why is this method slower than this one, despite having fewer instructions


tiny blade
#

Instruction count isn't always a great proxy for actual perf

#

Different instructions can have different latencies
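
To make that concrete, here's a rough sketch of a toy cost model. Everything in it is an assumption for illustration: the latency numbers are made up and don't describe any real GPU, the op names are generic rather than actual RDNA mnemonics, and `chain_latency` is a hypothetical helper. It also assumes a fully dependent chain where each instruction waits for the previous one, which ignores everything else a real GPU does to hide latency.

```python
# Hypothetical per-instruction latencies in cycles. These numbers are made up
# for illustration and do not describe any real GPU.
LATENCY = {"add": 1, "mul": 1, "fma": 1, "rcp": 4, "sqrt": 8}


def chain_latency(instructions):
    """Cycles for a fully dependent chain: each op waits for the previous one."""
    return sum(LATENCY[op] for op in instructions)


# Fewer instructions, but two of them are slow.
short_version = ["sqrt", "rcp", "mul"]
# More instructions, all of them cheap.
long_version = ["mul", "mul", "add", "fma", "fma", "add"]

for name, seq in (("short", short_version), ("long", long_version)):
    print(f"{name}: {len(seq)} instructions, {chain_latency(seq)} cycles")
```

With these made-up numbers the 3-instruction version costs 13 cycles and the 6-instruction version costs 6, so the shorter one loses. Real hardware is a lot messier than this, which is why looking at the actual assembly matters.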

#

I'd suggest putting your shader into Godbolt (Compiler Explorer), RGA (Radeon GPU Analyzer), or Shader Playground and checking the generated RDNA assembly

#

you have to capture your program running and have debug info available for your shader

#

also your app needs to be Vulkan or D3D12

#

time to learn

#

AMD's ISAs are public so you can look up all the instructions