#Efficient memory layout for tensor dot products

soft valve

Have you profiled and ensured that this is actually a problem?

If not, just implement it, and if it's too slow, come back.

vestal yoke
# soft valve if not just implement it and if its too slow come back

there's roughly a 10x speedup from just the first fix: moving the loop over the contracted dimension "up" the for-loop ladder (i've tried this for a 2D tensor contracted along a single dimension). that's about the simplest fix possible, so i imagine there are plenty of more interesting potential optimisations

i don't have any performance targets at all, i'm just curious about memory layout performance optimisations
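
For reference, a minimal sketch of what that first fix (moving the contracted-dimension loop up) might look like for the 2D case. It assumes row-major flat storage, and the function names and shape parameters n, m, p are made up for illustration, not taken from the thread. In pure Python the interpreter overhead will mostly hide the cache effect, so this only shows the access pattern; the speedup mentioned above would be expected in a compiled language.

```python
def matmul_ijk(a, b, n, m, p):
    """C = A @ B with the contracted index k as the innermost loop.

    a is n x m, b is m x p, both stored row-major in flat lists.
    """
    c = [0.0] * (n * p)
    for i in range(n):
        for j in range(p):
            acc = 0.0
            for k in range(m):
                # b is read with stride p here: every step of k
                # jumps a whole row ahead in memory.
                acc += a[i * m + k] * b[k * p + j]
            c[i * p + j] = acc
    return c


def matmul_ikj(a, b, n, m, p):
    """Same product with the k loop moved up one level.

    The innermost loop now runs over j, so both b and c are
    walked with stride 1 (contiguously) in row-major storage.
    """
    c = [0.0] * (n * p)
    for i in range(n):
        for k in range(m):
            a_ik = a[i * m + k]  # constant across the inner loop
            for j in range(p):
                c[i * p + j] += a_ik * b[k * p + j]
    return c
```

Both versions add the products in the same order of k, so they should give identical results; only the order in which memory is touched changes.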