When your question is answered use !solved to mark the question as resolved.
Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For tips on how to ask a good question run !howto ask.
Have you profiled and ensured that this is actually a problem?
if not, just implement it, and if it's too slow, come back
here's roughly a 10x speedup from just the first fix: moving the loop over the contracted dimension "up" the loop nest (so far i've only tried it for a 2d tensor contracted along a single dimension). that's about the simplest fix possible, so i imagine there are plenty of more interesting optimisations left
i don't have any performance targets at all, i'm just curious about memory layout performance optimisations
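for anyone reading along, here is a minimal sketch of that kind of loop reordering. it assumes the 2d case is a row-major matrix product `C[i][j] = sum over k of A[i][k] * B[k][j]` stored in flat vectors; the names `contract_ijk` and `contract_ikj` are made up for illustration and are not the code from the thread. the speedup you actually get depends on sizes, compiler flags and hardware, so profile it yourself.

```cpp
#include <cstddef>
#include <vector>

// Assumption: row-major storage, element (r, c) of an R x C matrix
// lives at index r * C + c. A is m x p, B is p x n, C is m x n.

// Baseline: the contracted index k is the innermost loop.
// b[k * n + j] is read with stride n, which jumps across rows of B.
std::vector<double> contract_ijk(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 std::size_t m, std::size_t p, std::size_t n) {
    std::vector<double> c(m * n, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < p; ++k) {
                sum += a[i * p + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Reordered: the k loop is moved above the j loop.
// The inner loop now walks b and c contiguously (stride 1).
std::vector<double> contract_ikj(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 std::size_t m, std::size_t p, std::size_t n) {
    std::vector<double> c(m * n, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t k = 0; k < p; ++k) {
            const double aik = a[i * p + k];  // constant across the inner loop
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}
```

both functions add the products for each output element in the same k order, so they return the same values; only the memory access pattern differs. blocking/tiling would be the usual next step once the matrices stop fitting in cache.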