#Alternative NNUE design


vestal basin
#

Do you know that the pairwise terms can be calculated efficiently via this identity? (a+b+c+d+...)^2 = a^2+b^2+c^2+d^2+...+2ab+2ac+2ad+2bc+...

mellow wadi
vestal basin
#

The pairwise term? Sure, but that boils down to simply summing up and taking a square.
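
For reference, the identity being discussed can be sketched in a few lines of Python. This is only an illustration, not either poster's actual code; the function names `pairwise_sum_naive` and `pairwise_sum_fast` are made up for the example. It shows that the sum of all pairwise products can be recovered from the plain sum and the sum of squares, so the O(n^2) loop over pairs is not needed.

```python
from itertools import combinations

def pairwise_sum_naive(xs):
    """Sum of x_i * x_j over all pairs i < j, computed directly in O(n^2)."""
    return sum(a * b for a, b in combinations(xs, 2))

def pairwise_sum_fast(xs):
    """Same quantity in O(n), from (sum x)^2 = sum x^2 + 2 * sum_{i<j} x_i x_j."""
    total = sum(xs)
    squares = sum(x * x for x in xs)
    return (total * total - squares) / 2

xs = [1, 2, 3, 4]
print(pairwise_sum_naive(xs))  # 35
print(pairwise_sum_fast(xs))   # 35.0
```

Note that the fast version depends only on two aggregate numbers, which is why the trick carries no per-pair weights: every product a*b contributes with the same coefficient.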

vestal basin
#

Well, it isn't very expressive.

#

Like, it can't approximate a lot of things.

mellow wadi
#

i'm not trying to approximate anything

vestal basin
#

Neural networks are seen as universal approximators.

mellow wadi
vestal basin
#

It needs to approximate something.

mellow wadi
vestal basin
#

What are you trying to do? Remember the compute efficiency.

#

Adding complete nonlinearity would be O(n^2)

#

Too expensive.

mellow wadi
#

like with transformers

#

learnt ofc

vestal basin
#

Well, I already have my efficiently updatable transformer design that uses a decoder.

#

But it's probably more suitable for LLMs than for chess.

mellow wadi
#

that isn't the same thing

#

that only looks at tokens backwards

#

not forwards
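
To make the backwards-versus-forwards distinction concrete, here is a small generic sketch of attention masks in Python. It assumes nothing about the specific design mentioned in the thread; `causal_mask` and `full_mask` are hypothetical helper names. A decoder-style (causal) mask lets position i see only positions j <= i, while a bidirectional mask lets every position see every other one.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def full_mask(n):
    """Bidirectional mask: every position may attend to every position."""
    return [[True] * n for _ in range(n)]

# Print the causal mask: 'x' = allowed, '.' = blocked.
for row in causal_mask(4):
    print("".join("x" if allowed else "." for allowed in row))
# x...
# xx..
# xxx.
# xxxx
```

In the causal case the first position never sees anything after it, which is the limitation being pointed out above.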