Calculate Likelihood of Position Being Repeated | lichess.org | Page 1

lunar narwhal Oct 30, 2024, 5:29 AM

I am doing many offline Stockfish evaluations (hundreds of thousands / millions of games).

Each game will have a full Stockfish evaluation (a list of objects) saved to a Game table in a SQL database.

Each time I have Stockfish evaluate a position, I want to determine the likelihood that I will need to evaluate that same position again so that I can cache that evaluation.

For instance, many of these games will have the exact same position to evaluate after 2 moves, but it is very unlikely that any game 20 moves in will have a position that is reached in any other game.

A simple approach would be to just cache the first n moves for every game. This is a good partial solution!
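A minimal sketch of that "first n moves" cache, assuming positions are passed as FEN strings together with their ply number. The `EarlyPositionCache` class and the `evaluate` callable are hypothetical names, not part of Stockfish or any library. It also assumes that ignoring the FEN's two move counters is acceptable for keying the cache:

```python
from typing import Callable, Dict


class EarlyPositionCache:
    """Cache evaluations only for positions reached within the first max_ply plies."""

    def __init__(self, evaluate: Callable[[str], float], max_ply: int = 10):
        self._evaluate = evaluate  # hypothetical: wraps the real engine call
        self._max_ply = max_ply
        self._store: Dict[str, float] = {}
        self.hits = 0

    def get(self, fen: str, ply: int) -> float:
        # Late positions are assumed to be unique, so skip the cache entirely.
        if ply > self._max_ply:
            return self._evaluate(fen)
        # Key on piece placement, side to move, castling rights and en passant
        # square; the halfmove clock and fullmove number are dropped (assumption).
        key = " ".join(fen.split()[:4])
        if key in self._store:
            self.hits += 1
            return self._store[key]
        value = self._evaluate(fen)
        self._store[key] = value
        return value
```

With n moves per side, `max_ply` would be 2n.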

Consider the endgame; we will likely see common positions across games when there are only 1-2 pieces left on the board.

The “entropy” of a chess game seems to increase and then decrease as the game goes on.

I’m curious if anyone has spent some time thinking through this question and if they had any interesting findings?

TLDR: How to (roughly) calculate the likelihood that any FEN is repeated across any pool of chess games?

rare vigil Oct 30, 2024, 7:48 AM

I don't have an answer, but I suspect that in the endgame, a tablebase will be more useful than cached evaluations.

honest walrus Oct 30, 2024, 9:02 AM

A very rough approximation is to take the number of turns (t) from the FEN and treat the probability of reaching that position as roughly 1/20**(2t), with 20 being the average number of legal moves iirc, and assuming all legal moves are equally likely (not true).
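That estimate can be sketched in a few lines of Python. The function names are made up for illustration, and the branching factor is left as a parameter defaulting to the 20 used above. The sketch counts plies actually played (derived from the FEN's side-to-move and fullmove fields) instead of exactly 2t, and it works in log10 because 20**(2t) overflows a float for long games:

```python
import math


def plies_played(fen: str) -> int:
    """Half-moves played so far, from the FEN's side-to-move and fullmove fields."""
    fields = fen.split()
    side_to_move = fields[1]
    fullmove = int(fields[5])  # starts at 1, incremented after Black's move
    return 2 * (fullmove - 1) + (1 if side_to_move == "b" else 0)


def log10_reach_probability(fen: str, branching: float = 20.0) -> float:
    """log10 of 1 / branching**plies, assuming all legal moves are equally likely."""
    return -plies_played(fen) * math.log10(branching)
```

For the position after 1. e4 e5 this gives about -2.6, i.e. 1 in 400. It only ranks positions by depth, so transpositions and popular openings are not accounted for.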

balmy cove Oct 30, 2024, 4:15 PM

You can make a first pass over the games, just generating the FENs. Then sort and count by number of coincidences. These two steps need to be thought through carefully to balance the time and space required.

Finally, evaluate only the positions in the top nth percentile by coincidence count.
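A rough sketch of that two-pass idea, assuming the FENs from the first pass are available as an iterable of strings. The helper names are hypothetical, and the in-memory `Counter` is only an assumption that holds for smaller pools; with millions of games the counting would probably have to move to disk or to the SQL database:

```python
from collections import Counter
from typing import Iterable, Set


def position_key(fen: str) -> str:
    """Drop the halfmove clock and fullmove number so equal positions compare equal."""
    return " ".join(fen.split()[:4])


def count_positions(fens: Iterable[str]) -> Counter:
    """Count how often each position occurs across all games."""
    return Counter(position_key(fen) for fen in fens)


def frequent_positions(counts: Counter, percentile: float = 90.0) -> Set[str]:
    """Positions whose count is at or above the given percentile of distinct positions."""
    if not counts:
        return set()
    ordered = sorted(counts.values())
    index = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
    # Never keep positions seen only once: there is nothing to reuse.
    cutoff = max(2, ordered[index])
    return {key for key, count in counts.items() if count >= cutoff}
```

The percentile here is taken over distinct positions, so ties at the cutoff count are all kept.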
