What we have:
- a third-party API that is really unreliable
- events, for each of which we have to get a response from this API
- the API is async, so it works like this:
  - First cron job makes requests to it. The API doesn't return a real result, but a message id, which is saved to the DB
  - Second cron job then goes over these saved message ids and tries to get a real response from the API
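To make the flow concrete, here is a rough sketch of the two jobs in Python. The `Event` class, the `api.submit` / `api.fetch` methods and the `ApiDown` exception are all made-up names for illustration, not the real API or our real code:

```python
from dataclasses import dataclass
from typing import Optional


class ApiDown(Exception):
    """Hypothetical error raised by the client when the API is unavailable."""


@dataclass
class Event:
    id: int
    message_id: Optional[str] = None  # filled in by the first job
    result: Optional[str] = None      # filled in by the second job


def submit_job(events, api):
    """First cron job: submit events that don't have a message id yet."""
    for ev in events:
        if ev.message_id is None:
            try:
                ev.message_id = api.submit(ev.id)
            except ApiDown:
                pass  # nothing saved, so the next run tries this event again


def fetch_job(events, api):
    """Second cron job: fetch the real result for saved message ids."""
    for ev in events:
        if ev.message_id is not None and ev.result is None:
            try:
                ev.result = api.fetch(ev.message_id)
            except ApiDown:
                pass  # message id stays saved, so the next run tries again
```

In this sketch a failure just leaves the event where it is, which is exactly the "usual retry" behaviour that doesn't hold up when the API is down for a long time.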
Problem:
- this API might go down for something like a day, so the usual retries don't really work. We are looking at a 3-5 day retry window lol
My solution:
- create a table for storing info about retries
- if a request fails (from either job), check the table: if the time of the first retry is null, or less than the retry period has passed since then, put it back into its table to be picked up again later by the first or the second job. Otherwise, fail
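A minimal sketch of that check in Python. `should_retry`, `first_retry_at` and the 5-day `RETRY_WINDOW` are hypothetical names and values, and I'm assuming the comparison is on the time elapsed since the first retry:

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: upper end of the 3-5 day window mentioned above.
RETRY_WINDOW = timedelta(days=5)


def should_retry(first_retry_at: Optional[datetime], now: datetime,
                 window: timedelta = RETRY_WINDOW) -> bool:
    """Return True to re-queue the failed request, False to fail it for good."""
    if first_retry_at is None:
        # First failure: the caller would record `now` as the first retry time.
        return True
    # Keep retrying only while we are still inside the window.
    return now - first_retry_at < window
```

Either job would call this on failure, passing the stored first retry time for that event (or `None` if there is no row yet), and then either put the event back or mark it as failed.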
The solution seems really simple, so I wonder what I might have missed. I'd be grateful for your ideas and advice.
Of course I'm gonna talk to my tech lead about this, but I want to eliminate "stupid mistakes" before that