Bias based on side to move | Stockfish | Page 1

oblique rune Nov 14, 2023, 5:32 PM

When letting SF play against itself, there is a significant contempt-like effect in that the eval is biased toward the side to move. E.g., look at the shootout below (with shared hash, version from 7 November):
[FEN "rbbnkqnr/pppppppp/8/8/8/8/PPPPPPPP/RBBQKNNR w KQkq - 0 1"]

1. e4 {0.97/38 47} e5 {0.66/38 61} 2. c3 {0.96/42 120} c6 {0.85/41 77} 3. Ne3 {1.00/42 22} d5 {0.88/45 293 (Nf6)} 4. exd5 {0.99/37 16} cxd5 {0.90/40 63} 5. Nxd5 {1.03/39 24} Qc5 {0.87/41 57} 6. Qf3 {1.10/45 204} Qd6 {0.96/41 97 (Ne7)} 7. a4 {1.14/41 20} f5 {0.93/39 61} 8. d4 {1.04/40 41} Be6 {1.01/40 65}

If the features causing this are really necessary strength-wise, it would be nice to at least have a formula that could estimate the "true" evaluation based on the side to move.
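To put a rough number on the effect, the evals from the shootout above can be split by which side reported them. This is only a sketch: it assumes every score is given from White's point of view and that the entries simply alternate White, Black, White, and so on. The name `side_to_move_gap` is made up for illustration, and the result is a crude average over 16 plies, not a formula for the "true" eval.

```python
from statistics import mean

# Evals copied from the shootout above, in ply order (White's move first).
# Assumption: every score is from White's point of view.
EVALS = [0.97, 0.66, 0.96, 0.85, 1.00, 0.88, 0.99, 0.90,
         1.03, 0.87, 1.10, 0.96, 1.14, 0.93, 1.04, 1.01]


def side_to_move_gap(evals):
    """Mean eval reported on White's moves minus mean eval on Black's moves."""
    white = evals[0::2]  # plies where White was to move
    black = evals[1::2]  # plies where Black was to move
    return mean(white) - mean(black)


if __name__ == "__main__":
    print(f"gap: {side_to_move_gap(EVALS):+.3f} pawns")
```

Half of that gap would be one naive guess at how far a single eval sits from the midpoint of the zigzag, though each eval also belongs to a different position, so the number mixes the side-to-move effect with ordinary eval drift.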

velvet notch Nov 14, 2023, 5:35 PM

That is the true value. The side to move has an advantage just based on the fact that they have the next move. Consider the following position:

[position diagram not preserved]

Who's winning? Obviously it depends on who moves next, so without that information there's no such thing as the "true" value. In theory you could analyze the position with white to move and with black to move and then average the two values, but that wouldn't really be the true value of the position. In the above case, it would average to 0.00, which obviously isn't the case.

oblique rune Nov 14, 2023, 5:46 PM

I might have stated the issue unclearly, or you might intend to point to a legitimate justification of the current behaviour. My problem with the zigzag eval graph implied above is that it makes a single eval unreliable to an analyst. In my example above, one might average two consecutive evals, but when running a deep infinite analysis, this is not possible. So what I really want is a way to correct for this effect.

velvet notch Nov 14, 2023, 6:07 PM

I'm honestly not sure what you're asking. In chess, for the majority of positions, the side to move does have an advantage due to tempo (it's better to force your opponent to respond to your move than to respond to your opponent's move). That being said, it probably shouldn't be a very large effect, and practically the difference between a 0.99 and a 0.9 position likely isn't huge. I wonder if the effect is greater because the position you provided is a DFRC position rather than one arising from startpos, but that's just speculation. Anyway, if you want to smooth out evals, a running average would probably accomplish what you want.
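For what it's worth, a running average over a list of evals could look something like this. It's just a minimal sketch: `running_average` is a made-up helper name, and it assumes the evals are already collected in ply order and all given from the same side's point of view.

```python
def running_average(evals, window=2):
    """Trailing mean over the last `window` evals (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(evals)):
        # Take up to `window` evals ending at ply i.
        chunk = evals[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    # First six evals from the shootout earlier in the thread.
    for value in running_average([0.97, 0.66, 0.96, 0.85, 1.00, 0.88]):
        print(f"{value:.3f}")
```

An even window means each average covers the same number of white-to-move and black-to-move evals, which is what cancels the zigzag.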

main timber Nov 14, 2023, 8:32 PM

There are multiple possible reasons.
Optimism is asymmetrical again: https://github.com/official-stockfish/Stockfish/commit/908811c24ab53d2cb1bebc1138427e21fefa8054
If you ran it multithreaded, the last output will be in the side to move's favor because of thread voting.
Maybe some other reasons too.
