#Compiler superoptimization for NNUE inference.


trim cipher
#

Writing fast NNUE evaluation code has been largely manual, and there are probably still a lot of missed optimizations. This proposal aims to make NNUE inference faster, while making it easier to develop new architectures, by leveraging compiler superoptimization.

rustic jewel
#

Nice abstract but where is the proposal?
pauseChamp

trim cipher
#

Well, the proposal is just to find some superoptimization tools that could perhaps improve NNUE inference speed.

pliant trench
#

still not a proposal, much less code, that can be acted on.

knotty forum
pliant trench
#

interesting

smoky heart
#

in my humble opinion the code for each computational step is really hard to improve on, but I think there is still big potential with combining steps or more clever update strategies
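
A minimal sketch of what "combining steps" could look like, assuming a toy dense layer followed by a clipped ReLU. The function names, the tiny weights, and the 0..127 clamp range are illustrative assumptions, not Stockfish's actual code. The two-pass reference builds an intermediate vector, while the fused version clamps each output as soon as it is computed:

```python
# Hypothetical toy layer: names, shapes and values are illustrative only.

def affine(weights, bias, x):
    """Reference step 1: y[i] = bias[i] + sum_j weights[i][j] * x[j]."""
    return [b + sum(w * v for w, v in zip(row, x))
            for row, b in zip(weights, bias)]


def clipped_relu(y, lo=0, hi=127):
    """Reference step 2: clamp every value into [lo, hi]."""
    return [min(max(v, lo), hi) for v in y]


def affine_clipped_relu_fused(weights, bias, x, lo=0, hi=127):
    """Fused variant: clamp each output right after accumulating it,
    so the intermediate vector is never materialised."""
    out = []
    for row, b in zip(weights, bias):
        acc = b
        for w, v in zip(row, x):
            acc += w * v
        out.append(min(max(acc, lo), hi))
    return out


WEIGHTS = [[1, -2, 3], [4, 5, -6], [100, 100, 100]]
BIAS = [10, -20, 0]
X = [1, 2, 3]

reference = clipped_relu(affine(WEIGHTS, BIAS, X))
fused = affine_clipped_relu_fused(WEIGHTS, BIAS, X)
print(reference, fused)
```

Both paths should agree on every input. The unfused version doubles as the kind of reference implementation a tool could be checked against.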

pliant trench
#

so the combining steps idea fits with the 'dace' tool I pasted above. Not sure if it is practical or easy, probably weeks of work to get something going, but I might be wrong. I think the first thing would be to figure out how to get callable code from a reference implementation.

smoky heart
#

ooh fancy

#

I think the tricky-ish part is integrating it with all the infrastructure which actually computes what work to do

#

shall investigate tho

pliant trench
#

(maybe it is integrated in the main repo, maybe not).

frosty dagger
#

Actually, Stockfish uses PGO, which is another kind of superoptimizing compiler. Also, I don't really like STOKE or Souper because you need Linux and Docker, and Stockfish prioritizes flexibility.

knotty forum
#

PGO and superoptimization are two different things. PGO uses profile information to guide the compiler's optimizations. Superoptimization means searching for a result that is actually optimal with respect to some metric (e.g., number of operations).
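
To make the "optimal with respect to some metric" part concrete, here is a toy sketch of a brute-force superoptimizer. The three-instruction set (`dbl`, `inc`, `neg`) and the target function are made up for illustration. It also checks equivalence only on a handful of test inputs, which is a simplification and not a proof. The search tries programs in order of increasing length, so the first match uses the fewest operations:

```python
from itertools import product

# Hypothetical single-register instruction set, for illustration only.
OPS = {
    "dbl": lambda x: x + x,
    "inc": lambda x: x + 1,
    "neg": lambda x: -x,
}


def run(program, x):
    """Apply each instruction of the program to x, in order."""
    for name in program:
        x = OPS[name](x)
    return x


def superoptimize(target, test_inputs, max_len=4):
    """Return the shortest program matching target on all test inputs.

    Programs are enumerated by increasing length, so the first hit is
    minimal in operation count (within this instruction set).
    """
    test_inputs = list(test_inputs)
    for length in range(max_len + 1):
        for program in product(OPS, repeat=length):
            if all(run(program, x) == target(x) for x in test_inputs):
                return program
    return None


def target(x):
    return 4 * x + 2


best = superoptimize(target, range(-4, 5))
print(best)
```

PGO, by contrast, does not search the space of programs like this. It feeds profile data to the compiler's existing optimization passes.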