#Applied blocks stuck at syncing
Probably related to https://discord.com/channels/484437221055922177/1300289111210463252
even with the latest v0.10.4 I can't resync to the network
hmm no it's probably not related to "Failed Block Proof"
@calm lion might know
@valid warren Can you try to attach gdb to the openmina process?
If you're able to do that, can you run, in the gdb shell, the command: thread apply all bt
failed to receive message from channel: rpc response get_best_tip:2: peer failed to respond: uncaught exception
I think that's it
@valid warren Basically, in order to resync when there's no link between the new target best tip and the current one, we have to call the best tip rpc. The response of that rpc contains the root block, the best tip, and the block body hashes for the blocks in between. This way we can find the link and start the syncing process.
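A rough sketch of the link search described above, not the actual openmina code. `BestTipResponse`, its fields, and `find_link` are hypothetical names. It also assumes each block can be represented by one opaque hash string, whereas the real response carries body hashes from which the block hashes are derived.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified shape of a best tip rpc response.
/// Assumption: every block is identified by a single opaque hash string.
struct BestTipResponse {
    /// Hash of the peer's root block.
    root: String,
    /// Hashes of the blocks between root and best tip, ordered root -> tip.
    between: Vec<String>,
    /// Hash of the peer's best tip.
    best_tip: String,
}

/// Walk the peer's chain from its best tip down to its root and return the
/// first (highest) block we already know. That block is the link from which
/// syncing could start. Returns None if the chains share no block.
fn find_link(known: &HashSet<String>, resp: &BestTipResponse) -> Option<String> {
    std::iter::once(&resp.best_tip)
        .chain(resp.between.iter().rev())
        .chain(std::iter::once(&resp.root))
        .find(|hash| known.contains(*hash))
        .cloned()
}
```

If our node knows blocks a, b, c and the peer answers with root b, in-between blocks c, d and best tip e, the link is c, and only d and e still need to be fetched.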
If you search the logs for "uncaught exception" you'll see it a lot
That rpc shouldn't be failing. I'll check ocaml code and see if there is a reason it can be failing
I rebooted it twice now, wouldn't the bootstrap process find an rpc that has the best tip if one of them fails?
it can be any other peer RPC right?
Thanks, the output confirms we are not dead-locked anywhere. Zura's got the answer
yes and that should be the case. we should be querying another peer, not the same one over and over again, but from what I see, it looks like the peer isn't getting disconnected like it should... I'll check in with others about this
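To illustrate the intended behaviour (drop a peer whose best tip rpc fails and ask the next one), here is a minimal sketch. It is not the openmina implementation and makes no claim about how the node actually handles this. `query_best_tip` and the `rpc` closure are hypothetical, and peers and tips are plain strings for simplicity.

```rust
/// Hypothetical helper: ask peers for their best tip one at a time.
/// A peer whose rpc fails is removed (standing in for a disconnect),
/// so the same failing peer is never queried twice.
fn query_best_tip<F>(peers: &mut Vec<String>, mut rpc: F) -> Option<String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    while let Some(peer) = peers.first().cloned() {
        match rpc(&peer) {
            // First successful response wins.
            Ok(tip) => return Some(tip),
            // Failure: drop this peer and move on to the next one.
            Err(_) => {
                peers.remove(0);
            }
        }
    }
    // Every peer failed.
    None
}
```

With peers p1, p2, p3 where only p1 fails, the call returns p2's tip and leaves p2 and p3 in the list.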
yeah I can see that, peer ID at which it fails changes tho
those are logs from more tries
no worries! glad you're on top of that
It started to fail in my env after the failed block proof happened. Just a theory: maybe the problematic block is now part of the 290-block history and the issue disappears when it goes out of scope.
No, if block proof production fails, that block simply can't make it into the transition frontier
It's pretty much impossible at the type system level
I think some Ocaml node created the problematic block, if this was unclear.
But certainly I may be completely wrong
oh. If that would be the case, it'd be really bad. It's highly unlikely though. I'll still look into ocaml part as well, coz that rpc should never fail AFAIK.
Ok this should fix it on the rust node side: https://github.com/openmina/openmina/pull/835
Will look into ocaml side as well.
@grave pine Can you think of any reason ocaml node would fail to create response for best tip rpc?
Only thing I can see that could throw exn is https://github.com/minaprotocol/mina/blob/3dc560df2209c25f8682b5e295ff3c5c216952cd/src/lib/best_tip_prover/best_tip_prover.ml#L62
And that can only throw if root block or best tip isn't found in frontier. Idk how that could happen though, most likely I'm missing something.
sorry, missed this. Yes, I don't see another possibility, I guess the node was just starting and didn't even know what the best tip was?
Could be. Either that or some p2p issue