#Partial sync always fails


vagrant igloo
#

Hi there,

I am seeking help understanding an issue where partial resynchronization always fails after a replica disconnects from the master or after a failover. I am running a 3-node cluster managed by Sentinel. I have tried increasing repl-backlog-size to 1gb, but that doesn't help. Before a failover, all replicas are caught up, and no writes are happening during the failover. But I always see the partial sync rejected with a message like:

Mar 7 20:06:56.432 redis-perm-repl-backlog-test-9ab11db5 redis-server[26068]: 26068:M 07 Mar 2024 20:06:56.432 * Partial resynchronization not accepted: Requested offset for second ID was 1524031215, but I can reply up to 1524030897

This happens even if I only briefly disconnect a replica by stopping and restarting redis-server.
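
To put a number on that rejection, here is a small Python sketch that pulls the two offsets out of the log line quoted above. The helper name `offset_gap` is made up for illustration; the only inputs are the numbers already in the log.

```python
import re

# Matches the "second ID" rejection text from the master's log line above.
PSYNC_REJECT = re.compile(
    r"Requested offset for second ID was (\d+), but I can reply up to (\d+)"
)


def offset_gap(log_line: str) -> int:
    """Return how far the requested offset is past what the master can serve."""
    match = PSYNC_REJECT.search(log_line)
    if match is None:
        raise ValueError("not a second-ID offset rejection line")
    requested, limit = (int(group) for group in match.groups())
    return requested - limit


line = (
    "Partial resynchronization not accepted: Requested offset for second ID "
    "was 1524031215, but I can reply up to 1524030897"
)
print(offset_gap(line))  # 318
```

So the replica asked for an offset 318 bytes past the point up to which the new master says it can answer for the old replication ID.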

I'm not sure what is going on or where to look to get a better clue. Nothing in the logs jumps out at me. I can't find anything helpful online and I didn't see similar posts in this discord. Any insight is appreciated. If this isn't the right place to ask this question I would appreciate a pointer to the right place. Thanks in advance.

My redis version is 6.2.14. I can provide more detail as needed.

vagrant igloo
#

Ah, correction. If I shut the replica down and start it again, it requests a full sync because it no longer has cached master data. If I briefly firewall off the redis port on the master, then a partial sync happens on reconnect.

That still doesn't explain why partial sync on failover isn't working. My understanding is that it should work because the promoted master still has the old replication ID (the second ID referred to in the log message above).
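
For what it's worth, this is how I understand the replication ID check, written as a simplified Python model rather than the actual server code. The class and function names are hypothetical, the ID strings are placeholders, and the real server additionally requires the requested offset to still be inside the replication backlog, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    master_replid: str       # current replication ID
    master_replid2: str      # previous ID, kept after a promotion
    second_repl_offset: int  # highest offset that is valid under the previous ID


def id_check_passes(master: MasterState, asked_id: str, asked_offset: int) -> bool:
    """Simplified model: does the requested (ID, offset) pair pass the ID check?"""
    if asked_id == master.master_replid:
        return True
    # The previous ID is only honoured up to the offset recorded at promotion.
    return (
        asked_id == master.master_replid2
        and asked_offset <= master.second_repl_offset
    )


# Placeholder IDs; the offsets are the ones from the log message above.
promoted = MasterState("new-id", "old-id", 1524030897)
print(id_check_passes(promoted, "old-id", 1524031215))  # False
print(id_check_passes(promoted, "old-id", 1524030897))  # True
```

Under this model, having the old replication ID is not enough on its own: the replica's offset also has to be at or below the offset the promoted master recorded for that ID.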

#

I would still think since my redis is set up for persistence it should be able to partial sync after the server restarts. 🤔

vagrant igloo
#

Hmm, even if I shut down a replica "gently" via the SHUTDOWN command, as mentioned in https://redis.io/docs/management/replication/#partial-sync-after-restarts-and-failovers, it still fails to do a partial sync when I start it back up via systemd. It looks like it somehow doesn't even have the right replication ID. Here is a log from the master when the replica comes back up.

Mar  8 17:40:34.365 redis-perm-repl-backlog-test-6aa8c444 redis-server[26119]: 26119:M 08 Mar 2024 17:40:34.365 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'bf19d81349273662012c8f0ee984853ac45a791c', my replication IDs are '647018439ec67020d9d37c3233c514215ee582c6' and 'da6ae617768ff0e34e0e13ea37ddec60b550710d')
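
To compare IDs across nodes, a minimal Python sketch that picks the replication fields out of the text returned by `INFO replication` (for example, saved from redis-cli on each node). The function name is hypothetical, and the sample text is not real output: it only contains the role and the two IDs the master printed in the log line above.

```python
FIELDS = (
    "role",
    "master_replid",
    "master_replid2",
    "master_repl_offset",
    "second_repl_offset",
    "repl_backlog_first_byte_offset",
    "repl_backlog_histlen",
)


def replication_fields(info_text: str) -> dict:
    """Extract the replication-related key:value pairs from INFO output."""
    parsed = {}
    for raw in info_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        if key in FIELDS:
            parsed[key] = value
    return parsed


# Sample assembled from the master's log line above, not captured output.
sample = (
    "# Replication\r\n"
    "role:master\r\n"
    "master_replid:647018439ec67020d9d37c3233c514215ee582c6\r\n"
    "master_replid2:da6ae617768ff0e34e0e13ea37ddec60b550710d\r\n"
)
asked = "bf19d81349273662012c8f0ee984853ac45a791c"

fields = replication_fields(sample)
print(asked in (fields["master_replid"], fields["master_replid2"]))  # False
```

Running the same extraction against the replica before shutdown and after restart would show whether the ID it asks for ever matched either of the master's IDs.
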
vagrant igloo