#Study of mistakes in opening


tender charm
#

I'm facing a choice.
What I want: to study and analyse the most common (and critical) opening mistakes for players of my level (or of any rating range), make some stats, etc. The sky is the limit once I get what I want: ideally a huge CSV file of the most common lines, their evaluation, and their probability of occurrence.

Two ways:

  1. Someone has already built something using the database: Finding the Most Common Chess Blunders: https://www.youtube.com/watch?v=7eevSgJqV7o . It's Python code that goes through all the available games and creates a "blunder dictionary".
    Pros: simple, and doesn't use lichess bandwidth.
    Cons: it uses only the analysed games, which introduces a huge bias.
    The blunder dictionary might get too big if I want to look at all the common mistakes too.
    I'm not sure of the format of this dictionary, but I don't think I'd be able to do relevant stats on it, because it uses the FEN position and not the order of moves.

  2. I use the lichess API to check the most common opening moves for players of the rating I want.
    If I go up to move 7 and keep only the top 3 most common moves each time, that makes 3^14 = 4 782 969 positions. I would need 2 API calls per position, which means 9 565 938 calls. If I put some waiting time in my algorithm, I can try to limit the number of calls per second to, say, 500. That would mean about 5 hours of intense API usage. Is that too much?

Pros: gives me everything I want, and uses the stats lichess has already computed, so I don't need to do them myself.
Cons: heavy usage of the lichess API, and I don't even know if I can create and process a CSV file with 4 million rows.
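
The arithmetic in option 2 can be checked with a few lines of Python. This is only a sketch of the estimate above; the variable names are illustrative. It also counts every node of the move tree (not just the positions at the final ply), which is somewhat larger than 3^14, and shows how long the same number of calls would take at one request per second.

```python
BRANCHING = 3           # top moves kept at each ply
PLIES = 14              # 7 moves for each side
CALLS_PER_POSITION = 2  # figure taken from the post above
RATE = 500              # hoped-for requests per second

# Positions reached after exactly 14 plies (the 3^14 figure).
leaf_positions = BRANCHING ** PLIES

# Every position in the tree, from the start position down to ply 14.
all_positions = sum(BRANCHING ** k for k in range(PLIES + 1))

calls = leaf_positions * CALLS_PER_POSITION
hours_at_rate = calls / RATE / 3600
days_at_one_per_second = calls / 86400

print(f"{leaf_positions:,} leaf positions, {all_positions:,} positions in total")
print(f"{calls:,} calls: {hours_at_rate:.1f} h at {RATE}/s, "
      f"{days_at_one_per_second:.0f} days at 1/s")
```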

What do you guys think?

dry epoch
#

a csv with 4 million rows is nothing. but I highly doubt you'll be able to make 500 (!?) requests per second to the API without getting rate limited.

#

the appropriate approach for something like this is to download the games database and process it yourself. it contains all the same data as the openings api.

#

you can use a single month as a representative sample to reduce the required storage space and bandwidth. in theory it's even possible to only download and extract part of a month but it's a bit tricky. or you could rent a moderate server for a day or so for a few bucks.
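
As a rough illustration of processing the games yourself, here is a minimal sketch that counts how often each opening line occurs in a stream of PGN text. It assumes each game carries `WhiteElo` and `BlackElo` headers and that comments are wrapped in `{ }`. The function names, the rating filter, and the regex-based parsing are assumptions made for this example, not an existing tool; a real PGN library would be more robust.

```python
import re
from collections import Counter

HEADER = re.compile(r'\[(\w+) "([^"]*)"\]')
COMMENT = re.compile(r"\{[^}]*\}")   # e.g. { [%eval 0.17] }
MOVE_NUMBER = re.compile(r"\d+\.+")  # e.g. "1." or "1..."
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def iter_games(lines):
    """Yield (headers, movetext) pairs from an iterable of PGN lines."""
    headers, moves = {}, []
    for line in lines:
        line = line.strip()
        match = HEADER.fullmatch(line)
        if match:
            if moves:  # a header line after movetext starts a new game
                yield headers, " ".join(moves)
                headers, moves = {}, []
            headers[match.group(1)] = match.group(2)
        elif line:
            moves.append(line)
    if moves:
        yield headers, " ".join(moves)


def san_moves(movetext):
    """Strip comments, move numbers, results and ?!-style annotations."""
    text = MOVE_NUMBER.sub(" ", COMMENT.sub(" ", movetext))
    return [tok.rstrip("?!") for tok in text.split() if tok not in RESULTS]


def count_lines(lines, max_plies=14, min_elo=0, max_elo=4000):
    """Count every opening prefix (up to max_plies) among games in the rating range."""
    counts = Counter()
    for headers, movetext in iter_games(lines):
        try:
            elos = (int(headers["WhiteElo"]), int(headers["BlackElo"]))
        except (KeyError, ValueError):
            continue  # skip games without usable ratings
        if not all(min_elo <= elo <= max_elo for elo in elos):
            continue
        moves = san_moves(movetext)[:max_plies]
        for n in range(1, len(moves) + 1):
            counts[tuple(moves[:n])] += 1
    return counts
```

Because `count_lines` takes any iterable of lines, it can be fed a decompressed dump line by line without loading it into memory. The probability of a move given the line before it is then `counts[line] / counts[line[:-1]]`, and the resulting rows can be written out with the standard `csv` module.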

tender charm
#

Thanks for the answer. I’ll do that then, and check how/if I can improve the existing code

#

Just for info, what would be an appropriate number of requests per second to stay unnoticed?

dry epoch
#

I'm not sure, but generally, the base guideline is to make only one request at a time and wait at least a minute if you get a 429. ideally back off longer after repeated 429s. with one request at a time, you probably won't be able to make more than a few requests per second in the first place. though for most API endpoints, even one request per second for an extended period of time is still too much. the opening explorer is fairly optimized and on a separate server and i don't know what rules it has exactly but it's probably still not too far off.
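
The guideline above (one request at a time, wait at least a minute on a 429, back off longer after repeated 429s) could be sketched roughly like this. `polite_get` and its `fetch` argument are hypothetical names for this example: `fetch` stands for whatever function performs the HTTP call and returns `(status_code, body)`. The doubling of the wait is one possible way to "back off longer", not a documented rule.

```python
import time


def polite_get(fetch, url, base_wait=60.0, max_retries=5, sleep=time.sleep):
    """Call fetch(url) sequentially, backing off whenever it returns HTTP 429.

    fetch is a hypothetical callable returning (status_code, body).
    The first 429 waits base_wait seconds; each further 429 doubles the wait.
    """
    wait = base_wait
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # out of retries, give up instead of sleeping again
        sleep(wait)
        wait *= 2
    raise RuntimeError(f"still rate limited after {max_retries} retries: {url}")
```

Calling this in a plain loop, one URL after another, keeps the client to a single request in flight. The `sleep` parameter is only there so the waiting can be replaced when trying the function out.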

#

as other people have mentioned, the API is in no way intended for enumeration of all values

gusty violet
#

I am looking for a way to dig through the masters db and produce 30-move-deep theory with the same quality as the extensive analysis provided by GMs. Is there an algorithm for that?

tender charm
#

I’m not sure I understand what you’re trying to do actually. What do you mean by 30-move-deep theory?

gusty violet
#

it's like the most common lines, each 30 moves long.

tender charm
#

Then I don't know if there is an existing algo for that
