#troubleshooting


pure escarp
#

is this a validator?

#

I think doing threads for these is better

shrewd gyro
#

hi, yes

pure escarp
#

did you save your priv validator state json?

pure escarp
#

or was that reset too when you did the ledger reset?

shrewd gyro
#

yes, it's saved after the reset. I also backed up the entire namada data dir before running the namadan ledger reset

#

i.e. I took a backup then ran the reset command on the original directory

pure escarp
#

so what I would do is

#

I would make sure you have the validator state with a high block count (just inspect the file)

#

then restore from snapshot (obvs make sure it's down when you do)

#

restore validator state and boot
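
The "just inspect the file" step above can be sketched in Python. This is only a minimal sketch, assuming the file is CometBFT's priv_validator_state.json with `height`, `round` and `step` fields; the function names and the sample heights are made up for illustration.

```python
import json


def parse_signing_state(text):
    """Return (height, round, step) from the JSON text of a signing-state file.

    Assumption: CometBFT stores "height" as a JSON string and
    "round"/"step" as numbers; missing round/step are treated as 0.
    """
    state = json.loads(text)
    return (
        int(state["height"]),
        int(state.get("round", 0)),
        int(state.get("step", 0)),
    )


def is_safe_to_restore(backup_text, current_text):
    """True if the backup is at or ahead of the file it would replace.

    A reset file typically sits at height 0, so a backup taken before
    the reset should compare as greater.
    """
    return parse_signing_state(backup_text) >= parse_signing_state(current_text)
```

On a real node the two arguments would be the contents of the backed-up file and of the file currently in the data directory, read with something like `Path(...).read_text()`.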

shrewd gyro
#

I hadn't upgraded to 1.1.5 when the problem started. It OOM'd at around 16:00 UTC and went into a restart loop (failing to connect to the abci client) last night. I've been busy with family and so didn't realise till now

#

ok I'll take a look at the file now...

pure escarp
#

if you get a snapshot from let's say itrocket or mandragora, you basically replace the contents of only the db and cometbft/data folders - that's all a snapshot is..
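
A rough Python sketch of that folder swap, for illustration only. It assumes the snapshot has already been extracted to a directory containing `db` and `cometbft/data`, that the node is stopped, and that the signing state lives at `cometbft/data/priv_validator_state.json` (an assumption about the layout, so check your own node). The function name is hypothetical.

```python
import shutil
from pathlib import Path

# Only these two folders are replaced, as described in the message above.
REPLACED = ("db", "cometbft/data")
# Assumed location of the signing state inside the chain directory.
STATE_FILE = "cometbft/data/priv_validator_state.json"


def apply_snapshot(chain_dir, snapshot_dir):
    """Replace db and cometbft/data from an extracted snapshot,
    keeping the node's existing signing-state file."""
    chain_dir, snapshot_dir = Path(chain_dir), Path(snapshot_dir)
    state_path = chain_dir / STATE_FILE

    # Remember the current signing state so the snapshot's copy
    # (usually at a lower height) does not overwrite it.
    saved_state = state_path.read_bytes() if state_path.exists() else None

    for rel in REPLACED:
        target = chain_dir / rel
        if target.exists():
            shutil.rmtree(target)
        shutil.copytree(snapshot_dir / rel, target)

    if saved_state is not None:
        state_path.write_bytes(saved_state)
```

Everything outside those two folders (for example the config directory) is left untouched, which is why swapping them is enough to restore from a snapshot.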

shrewd gyro
#

ok so it looks like the reset did in fact delete the validator config json, but I have a copy. I'll stop the node, then copy it over and restart

#

would be good to know it's possible to resync from height 0 without using a snapshot though, should I report the issue?

pure escarp
#

and again I would def restore from a trusted snapshot. there is nothing wrong with doing that

shrewd gyro
#

oh really? the latest versions are not able to sync the chain from genesis, that's a shame, hopefully that can be addressed in the future

shrewd gyro
#

can you provide a link to a snapshot you would trust please? I am somewhat unfamiliar with where to find a trustworthy one 🙂

shrewd gyro
#

ok I'm following Daniel's instructions to use Mandragora snapshot, I had to change fast_sync/[fastsync] to blocksync in the configs (seems familiar, I'm sure I've done that sometime already!)
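
To spot the old names mentioned above before editing a config by hand, something like the following could be used. It is a sketch that only reports lines using `fast_sync` or `[fastsync]`; it does not rewrite anything, and the function name is hypothetical.

```python
import re

# Matches a top-level "fast_sync = ..." key or a "[fastsync]" section header.
LEGACY = re.compile(r"^\s*(fast_sync\s*=|\[fastsync\])")


def find_legacy_sync_keys(config_text):
    """Return (line_number, line) pairs that still use the old names."""
    return [
        (number, line.strip())
        for number, line in enumerate(config_text.splitlines(), 1)
        if LEGACY.match(line)
    ]
```

An empty result means neither old name appears in the text that was passed in.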

#

thanks for taking the time to help it's much appreciated 🙏

pure escarp
#

I don't think the fastsync is that important - the snapshot will make sure you are almost up to date. take care in replacing the priv-validator-state before booting your node..

pure escarp
#

remember to unjail once you are back up and running btw - assuming you got jailed already

shrewd gyro
#

sadly, the problem remains:
E[2025-04-13|23:49:38.340] abci.socketClient failed to connect to tcp://127.0.0.1:26658. Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
E[2025-04-13|23:49:41.343] abci.socketClient failed to connect to tcp://127.0.0.1:26658. Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
E[2025-04-13|23:49:44.346] abci.socketClient failed to connect to tcp://127.0.0.1:26658. Retrying after 3s... module=abci-client connection=query err="dial tcp 127.0.0.1:26658: connect: connection refused"
2025-04-13T22:49:47.316123Z ERROR namada_node::broadcaster: Broadcaster failed to connect to CometBFT node
2025-04-13T22:49:47.316193Z ERROR namada_node::broadcaster: Broadcaster unexpectedly shut down.
2025-04-13T22:49:47.316199Z INFO namada_node::broadcaster: Shutting down broadcaster...
2025-04-13T22:49:47.316205Z INFO namada_node: Broadcaster is no longer running.

#

seems that cometbft is not opening a listening socket

pure escarp
#

what happens if you disable the service and try to boot the ledger in terminal?

shrewd gyro
#

service is not stopping, I'll need to kill -9 I think 😬

pure escarp
#

you need to make sure you disable the service so it's not running twice.

#

you are running cometbft 0.37.15?

#

@shrewd gyro if you want to dm me your config file, I could compare with some of my nodes and see if any differences

#

you're not running a dockerized setup by any chance?

shrewd gyro
#

I'm pretty sure it's 0.37.15 unless updating to namada 1.1.5 changed the cometbft version

pure escarp
#

it didn't, but would verify just to be sure (cometbft version)

shrewd gyro
#

yes 0.37.15

#

ok so namadad.service:
cat /etc/systemd/system/namadad.service
[Unit]
Description=namada
After=network-online.target
[Service]
User=bod
WorkingDirectory=/home/bod/.local/share/namada
Environment=CMT_LOG_LEVEL=p2p:none,pex:error
Environment=NAMADA_CMT_STDOUT=true
ExecStart=/usr/local/bin/namadan ledger run
Restart=always
RestartSec=3
StandardOutput=journal
StandardError=journal
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target

#

I'll DM you the rest of configs now

#

btw running from terminal just hangs as follows:
bod@scraptop:~/.local/share/namada$ /usr/local/bin/namadan ledger run
2025-04-13T23:15:05.085116Z INFO namada_node: Available logical cores: 8
2025-04-13T23:15:05.085149Z INFO namada_node: Using 4 threads for Rayon.
2025-04-13T23:15:05.085155Z INFO namada_node: Using 4 threads for Tokio.
2025-04-13T23:15:05.122645Z INFO namada_node: VP WASM compilation cache size not configured, using 1/6 of available memory.
2025-04-13T23:15:05.122825Z INFO namada_node: Available memory: 22.327224731445313 GiB
2025-04-13T23:15:05.122838Z INFO namada_node: VP WASM compilation cache size: 3.7212041215971112 GiB
2025-04-13T23:15:05.122850Z INFO namada_node: Tx WASM compilation cache size not configured, using 1/6 of available memory.
2025-04-13T23:15:05.122854Z INFO namada_node: Tx WASM compilation cache size: 3.7212041215971112 GiB
2025-04-13T23:15:05.122858Z INFO namada_node: Block cache size not configured, using 1/3 of available memory.
2025-04-13T23:15:05.122861Z INFO namada_node: RocksDB block cache size: 7.4424082431942225 GiB
2025-04-13T23:15:05.122988Z INFO namada_node: Loading MASP verifying keys.
2025-04-13T23:15:05.123050Z INFO namada_node::ethereum_oracle: Ethereum event oracle is starting url="http://127.0.0.1:8545"
2025-04-13T23:15:05.124911Z INFO namada_node::ethereum_oracle: Oracle is awaiting initial configuration
2025-04-13T23:15:05.141030Z INFO namada_node::tendermint_node: CometBFT node started
2025-04-13T23:15:05.696349Z INFO namada_node: Done loading MASP verifying keys.
2025-04-13T23:15:05.696931Z INFO namada_node::storage::rocksdb: Using 2 compactions threads for RocksDB.
2025-04-13T23:15:05.697760Z INFO namada_node::broadcaster: Starting broadcaster.

pure escarp
#

you have nginx running? doing some port forwarding? could you try disabling nginx for a sec and just trying again to see if that's the culprit?

pure escarp
#

I bet if you waited a bit longer in terminal it would start running

shrewd gyro
#

netstat -tulnp does not show cometbft opening a listener socket
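
The same check can be made from Python by trying to connect to each port. This is a small sketch under stated assumptions: port 26658 comes from the log above, while 26656 and 26657 are CometBFT's usual default p2p and RPC ports and may differ in a given config. The names `PORTS` and `is_listening` are made up for the example.

```python
import socket

# 26658 appears in the log above; the other two are assumed defaults.
PORTS = {"p2p": 26656, "rpc": 26657, "abci": 26658}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


def port_report(host="127.0.0.1"):
    """Map each named port to whether something accepts connections on it."""
    return {name: is_listening(host, port) for name, port in PORTS.items()}
```

Unlike netstat, this does not say which process owns a socket, only whether anything answers on it.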

#

oh, now it exited:
2025-04-13T23:15:05.697760Z INFO namada_node::broadcaster: Starting broadcaster.
2025-04-13T23:18:05.697962Z ERROR namada_node::broadcaster: Broadcaster failed to connect to CometBFT node
2025-04-13T23:18:05.698025Z ERROR namada_node::broadcaster: Broadcaster unexpectedly shut down.
2025-04-13T23:18:05.698032Z INFO namada_node::broadcaster: Shutting down broadcaster...
2025-04-13T23:18:05.698042Z INFO namada_node: Broadcaster is no longer running.
2025-04-13T23:19:01.855644Z INFO namada_node::abortable: Broadcaster has exited, shutting down...
2025-04-13T23:19:01.855688Z INFO namada_node::tendermint_node: Shutting down Tendermint node...
2025-04-13T23:19:01.855744Z INFO namada_node: Namada ledger node started.
2025-04-13T23:19:01.855798Z INFO namada_node: This node is a validator
2025-04-13T23:19:01.855777Z INFO tower_abci::v037::server: ABCI server starting on tcp socket addr=127.0.0.1:26658
2025-04-13T23:19:01.864985Z INFO namada_node: Tendermint node is no longer running.
2025-04-13T23:19:01.865037Z INFO namada_node: Namada ledger node has shut down.
2025-04-13T23:19:01.865124Z INFO namada_node: Shutting down ABCI server...
2025-04-13T23:19:01.936083Z INFO namada_node::ethereum_oracle: Ethereum event oracle is no longer running url="http://127.0.0.1:8545"
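
One way to see where a start-up like this stalls is to look for the largest gap between consecutive log timestamps. The sketch below assumes timestamps in the `2025-04-13T23:15:05.697760Z` form seen in the output above; the function name is hypothetical.

```python
import re
from datetime import datetime

# Timestamps such as 2025-04-13T23:15:05.697760Z
STAMP = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6})Z")


def largest_gap(log_text):
    """Return (seconds, start, end) for the longest pause between
    consecutive timestamps, or None if fewer than two are found."""
    stamps = [
        datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f")
        for raw in STAMP.findall(log_text)
    ]
    gaps = [
        ((later - earlier).total_seconds(), earlier, later)
        for earlier, later in zip(stamps, stamps[1:])
    ]
    return max(gaps) if gaps else None
```

For the output pasted above, the longest pause is the roughly three minutes between "Starting broadcaster." at 23:15:05 and the broadcaster's connection failure at 23:18:05.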

#

something weird has happened, I might try setting up a new node from scratch

#

problem is I've already lost a delegator 😦

pure escarp
#

maybe set up a scratch node, compare a vanilla config with your config file.
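
Comparing the two config files can be done with any diff tool. As a sketch, Python's standard `difflib` gives a compact list of the lines that differ; the function name and the `vanilla`/`mine` labels are just for illustration.

```python
import difflib


def config_diff(vanilla_text, mine_text):
    """Return unified-diff lines between a fresh config and your own.

    n=0 drops context lines so only the differing settings are listed.
    """
    return list(
        difflib.unified_diff(
            vanilla_text.splitlines(),
            mine_text.splitlines(),
            fromfile="vanilla",
            tofile="mine",
            lineterm="",
            n=0,
        )
    )
```

Lines starting with `-` exist only in the vanilla config and lines starting with `+` only in yours, so each pair points at a setting worth double-checking.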

#

you're already jailed so I would say take your time to get it running again

#

more people have this issue so would be good to pinpoint

shrewd gyro
#

possibly an Ubuntu update
