#the elusive bug


minor nexus
#

the block in question had three IBC transactions:

  1. accepted
  2. failed (out of gas)
  3. failed (invalid event)

why was the event invalid in Transaction 3? it shouldn't have been. the event from Transaction 2 wasn't dropped, so the invalid event was treated as belonging to Transaction 3

this was a problem for the chain because Transaction 3 was only invalid in the newer client version; in v0.31.6, Transaction 2 did not emit the event. wat?

the newer version's gas metering is a bit cheaper than v0.31.6's, so Transaction 2 was able to execute one more line in the newer version than in v0.31.6 before running out of gas, and that line emitted the invalid IBC event. so Transaction 3 is valid in v0.31.6 and invalid in the newer version
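
To make the gas-metering divergence concrete, here is a toy sketch. It is not Namada's actual gas model: the step names, the gas limit, and the flat per-step costs are all made-up assumptions, chosen only so that the cheaper meter fits exactly one more step before running out of gas.

```python
from dataclasses import dataclass


class OutOfGas(Exception):
    """Raised when the next step would exceed the gas limit."""


@dataclass
class GasMeter:
    limit: int
    cost_per_step: int  # hypothetical flat cost; real metering is per operation
    used: int = 0

    def charge(self) -> None:
        if self.used + self.cost_per_step > self.limit:
            raise OutOfGas()
        self.used += self.cost_per_step


def run_until_out_of_gas(meter, steps):
    """Return the steps that completed before the meter ran out."""
    completed = []
    try:
        for step in steps:
            meter.charge()
            completed.append(step)
    except OutOfGas:
        pass
    return completed


# Hypothetical steps of Transaction 2; only their order matters here.
STEPS = ["verify_packet", "update_state", "emit_ibc_event", "finalize"]

# Same gas limit, slightly different cost per step (assumed numbers).
old_client = run_until_out_of_gas(GasMeter(limit=12, cost_per_step=5), STEPS)
new_client = run_until_out_of_gas(GasMeter(limit=12, cost_per_step=4), STEPS)

print(old_client)  # ['verify_packet', 'update_state']
print(new_client)  # ['verify_packet', 'update_state', 'emit_ibc_event']
```

In both runs the transaction still fails with out of gas, but only the cheaper meter gets far enough to emit the event.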

when a tx fails, we drop the write log (i.e. don't write to storage) but we do not drop the IBC event (and we should). this is the bug: we must drop events when transactions fail
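
A minimal sketch of the bug and the fix, assuming a shared event buffer that is only cleared when a transaction finishes. `Tx`, `execute_block`, the `drop_events_on_failure` flag, and the event strings are all hypothetical names for illustration, not Namada's code.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    name: str
    writes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)
    runs_out_of_gas: bool = False


def execute_block(txs, drop_events_on_failure):
    """Apply txs in order; return (per-tx results, committed storage)."""
    storage, results = {}, {}
    event_buffer = []  # events emitted but not yet attributed to a finished tx
    for tx in txs:
        write_log = dict(tx.writes)
        event_buffer.extend(tx.events)
        if tx.runs_out_of_gas:
            results[tx.name] = "failed (out of gas)"
            write_log.clear()  # the write log is always dropped on failure
            if drop_events_on_failure:
                event_buffer.clear()  # the fix: drop the events as well
            continue
        if any(e.startswith("invalid") for e in event_buffer):
            results[tx.name] = "failed (invalid event)"
            event_buffer.clear()
            continue
        storage.update(write_log)
        event_buffer.clear()
        results[tx.name] = "accepted"
    return results, storage


block = [
    Tx("tx1", {"a": 1}, ["ibc:ok-1"]),
    Tx("tx2", {"b": 2}, ["invalid:partial"], runs_out_of_gas=True),
    Tx("tx3", {"c": 3}, ["ibc:ok-3"]),
]

buggy_results, buggy_storage = execute_block(block, drop_events_on_failure=False)
fixed_results, fixed_storage = execute_block(block, drop_events_on_failure=True)

print(buggy_results["tx3"])  # failed (invalid event)
print(fixed_results["tx3"])  # accepted
```

With the flag off, the leftover event from tx2 is checked as if it belonged to tx3, which reproduces the accepted / out of gas / invalid event pattern from the block in question.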

ocean sorrel
#

Unjailed's fixed, right?

charred bane
#

@minor nexus Thank you for the breakdown!

midnight glen
charred bane
#

The context and detail is very helpful 🙏

noble whale
#

nice

remote rivet
midnight glen
jagged agate
#

Thanks for updating us 🙏

remote rivet
#

yeah, thx for the update, at such a late hour!

finite rose
finite rose
minor nexus
#

the elusive bug

minor nexus
minor nexus
#

v0.32.0 soon after for the hard fork

ocean sorrel
halcyon crescent
#

Is the rollback command of any use? Seems to take a lot of time and CPU to rollback 1 block though.

surreal flint
#

please fix the faucet

snow crow
snow crow
bright zealot
#

I think the team should not make everyone wait that long for debugging next time. It's been almost a week already. A revert should be considered quickly, right from the start. SE will drag on for a long time if we debug issues like this.

If the team wants a long testnet for debugging issues, it would be better to create a separate testnet event for that rather than use SE. That's my thought.

pliant coral
#

Well done team!

minor nexus
minor nexus
mellow falcon
sweet remnant
#

nice one ........ catchy bug indeed

sweet remnant
#

let's be fair: I broke the faucet, reported it, and got points. @snow crow deserves his points too if he can prove the tx

finite rose
# sweet remnant yes there is, security "stop the chain"

there's a security task for submitting proof of how to, not for actually halting the chain (and tbh, given that what was done here was specifically upgrading to a client we were told not to touch, I don't think it should be rewarded, nothing personal against favour)

minor nexus
#

talked with eng team, we think chain restart coordination could begin in ~6 hours

we've got ~4 hours more of syncing to do, then maybe ~1h to cut releases

halcyon crescent
#

Not that I am impatient, just wondering about the technical reason.

finite rose
kind saffron
minor nexus
#

what's the block height of your snapshot?

finite rose
#

lmk if you need a cut at a certain block height..

sour marten
finite rose
#

w index would be super useful

sour marten
# finite rose could be so kind to share it here for us plebs? 🙂

Snapshot at height 90000
https://namada-se-rpc.citadel.one/snap/namada-se-90000.tar.lz4
Contains the db directory, to be placed in the chain id root dir, and the data directory, to be placed under the cometbft dir.
priv_validator_state.json is removed from the snapshot to prevent validators from double signing by accident, as the node won't start without this file.

Also made snapshot with tx index for RPC nodes:
https://namada-se-rpc.citadel.one/snap/namada-se-90000-tx_index.tar.lz4
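
A rough sketch of a pre-start check for the layout described above. The exact paths (`db` and `cometbft/data` under the chain id dir, with `priv_validator_state.json` inside the data directory) are assumptions about the node layout, and `check_restored_layout` is a hypothetical helper, so adjust it to your own setup.

```python
import tempfile
from pathlib import Path
from typing import List


def check_restored_layout(chain_dir: Path) -> List[str]:
    """Return a list of problems found in a restored chain directory."""
    problems = []
    if not (chain_dir / "db").is_dir():
        problems.append("missing db directory in chain id root dir")
    data_dir = chain_dir / "cometbft" / "data"  # assumed location
    if not data_dir.is_dir():
        problems.append("missing data directory under cometbft dir")
    # The snapshot ships without this file on purpose; validators must
    # keep their own copy instead of taking one from someone else.
    if not (data_dir / "priv_validator_state.json").is_file():
        problems.append("missing priv_validator_state.json")
    return problems


# Demo: a freshly extracted snapshot, before the validator restores its own
# priv_validator_state.json.
with tempfile.TemporaryDirectory() as tmp:
    chain_dir = Path(tmp)
    (chain_dir / "db").mkdir()
    (chain_dir / "cometbft" / "data").mkdir(parents=True)
    print(check_restored_layout(chain_dir))
    # ['missing priv_validator_state.json']
```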

finite rose
#

if they don't decide to start chain below that height ofc

finite rose
sour marten
minor nexus
minor nexus
finite rose
sour marten
sour marten
#

i mean rollback more than the current block
cometbft will connect to peers with the highest block and nodes will return to the same state we're in currently, assuming they won't apphash while trying to reach the current block

finite rose
#

fair

#

I suppose my question above would apply once we get to the hardfork then, but no need to cross that bridge until we get to it

mortal sinew
#

missed the conversation 😄 yep, mine is made with 0.31.6, but it's without tx index, for regular users; I'll have one with it as well soon
so, at least we have two sources that ppl can use

midnight glen
midnight glen
finite rose
#

maybe team will make snapshots made with new client available to us

#

if so, I hope they make one with indexer as well 👀👀

halcyon crescent
#

"Social consensus"?

#

How would the community verify the snapshots have not been tampered with?
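
One partial answer, sketched under assumptions: if whoever publishes a snapshot also publishes its SHA-256 digest through a channel people trust, anyone can check the downloaded archive against it. No digest was published in this thread, so the expected value below is whatever the publisher provides, and the helper names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large snapshot archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's digest with a published hex digest (assumed to exist)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A match only shows that the file is the one the publisher hashed. It says nothing about whether the state inside is correct, which still has to be checked against the chain itself.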

finite rose
#

maybe there's a theoretical s-class waiting for you there ๐Ÿ™‚

halcyon crescent
#

Aha!

mortal sinew
snow crow
midnight glen
snow crow
finite rose
#

otherwise it's pure mayhem whenever an unstable version is put out

#

but idk I'm not the one making the rules 😅 gg on your part if you get rewarded

finite rose
remote rivet
#

I'd tend to lean on your side too pretoro. I don't know if "unintentionally breaking the chain" is a category that should be rewarded 🤔
Finding bugs, logging them through github, hell, even putting up a PR if you have the skill is something I'd gladly reward if it were me. but an unintentional outage seems a bit far-fetched

finite rose
remote rivet
#

correct

mortal sinew
finite rose
#

all of this being said though, I'd say submit it and let team decide

#

not wanting to gatekeep people's points

minor nexus
# sour marten 0.31.6

snapshots will only work for our purposes if made with the unreleased v0.31.9

eng team is working on that now. Spork is going to use eng team's snapshot to test and then write instructions (to import from snapshot) so that we'll be ready to coordinate the restart

and yeah then maybe we can get blocks made without having to resync from genesis 🤞

#

fyi @finite rose

sour marten
#

if 0.31.9 is not consensus-breaking, there is no need to resync whole state from genesis using it

#

the idea of this 90k snap was to restore state from it, sync up to the pre-halt block, and upgrade to the binary version that fixes the non-determinism (or upgrade right away if it won't apphash between 90k and the consensus block)

#

i'm just worried that we might not restart today if you're just starting to resync for a new snap. It took me 8h to sync with tx indexing enabled

minor nexus
#

but yeah we gotta sync to test before cutting the v0.31.9 release

finite rose
#

otherwise it'll be a horrible mess for those of us running indexer

snow crow
mortal sinew
midnight glen
halcyon crescent
# mortal sinew exactly what I thought preparing the snapshot

Gavin explicitly said "snapshots will only work for our purposes if made with the unreleased v0.31.9". That makes me wonder why a 0.31.6 snapshot up to 90043 is unusable. Is it because the engineering team wants to check whether the changes in 0.31.9 yield the same state, or has the database changed?

#

In the latter case, can we still verify integrity?

sour marten
halcyon crescent
#

So hashes would still be intact, just a compatibility change

sour marten
#

Otherwise i think it's perfectly fine to use snap from 0.31.6 and i'll test it on rpc node to see if prevote hashes will match with team's snap

minor nexus
finite rose
finite rose
#

so I doubt you could have made that tx if you were still on 0.31.6

#

unless I totally got it all wrong

kind saffron
finite rose
sweet remnant
mortal sinew
#

I'll try anyway 😄

midnight glen
mortal sinew
#

need a binary to try, so waiting for the official announcement and then will see if it works with that snap

minor nexus
#

okay i hear you, i will chat with eng team

minor nexus
remote rivet
#

So should we wait for 0.32.0 to start our nodes? Or what? Do we relaunch today 😄? coz tomorrow it's Friday...

midnight glen
finite rose
mortal sinew
#

time to talk in new thread 😄

finite rose
mortal sinew
left shuttle
mortal sinew
finite rose
minor nexus
#

dang it, new thread haha

#

okay gonna close this one after i've answered Qs

minor nexus
minor nexus
covert brook
#

What about the spreadsheet

minor nexus
covert brook
#

About the task submissions

minor nexus
#

if you could roll back, then you wouldn't have the problem block. and when we have v0.32.0, we'll be able to roll back

finite rose
finite rose
halcyon crescent
halcyon crescent
minor nexus
minor nexus
remote rivet
minor nexus
#

i'm gonna lock this thread in a minute, wanna make sure everyone in the other thread is in the loop

finite rose
mortal sinew
minor nexus
ocean sorrel
minor nexus