#Contract trapped but no errors in diagnostic events

68 messages · Page 1 of 1 (latest)

vestal stone
#

I have a tx that is trapping, but when I view the diagnostic events, there's nothing indicating there's an error. When would a contract trap but there be no diagnostic eents about the trap?

#

Tx: AAAAAgAAAABcdUBy7Zxr00zhrprXCWjoCF9oFrivLGOwxmUudllQ/gANwx8AAE9QAAAAAwAAAAEAAAAAAAAAAAAAAABlKOR4AAAAAAAAAAEAAAAAAAAAGAAAAAAAAAAB2LKxTGD+y2X7KqgcvqedCDXK/A4NhR+GESV8HltB4cIAAAAIdHJhbnNmZXIAAAADAAAAEgAAAAFDMHHd6me1AwWYZHPRUaJXhtnUUSODljJGkPYQWDhQLQAAABIAAAAAAAAAAFx1QHLtnGvTTOGumtcJaOgIX2gWuK8sY7DGZS52WVD+AAAACgAAAAAAAAAAAAAAAAAAA+gAAAABAAAAAQAAAAFDMHHd6me1AwWYZHPRUaJXhtnUUSODljJGkPYQWDhQLQAAAAAAAAAAAABPvAAAABAAAAABAAAAAQAAABEAAAABAAAAAwAAAA8AAAASYXV0aGVudGljYXRvcl9kYXRhAAAAAAANAAAAJUmWDeWIDoxodDQXD2R2YFuP5K65ooYyx5lc87qDHZdjAQAAAAQAAAAAAAAPAAAAEGNsaWVudF9kYXRhX2pzb24AAAANAAAAhnsidHlwZSI6IndlYmF1dGhuLmdldCIsImNoYWxsZW5nZSI6Ims5TDhWbWs5aDRNR2FLakpXaThTVHVFeW5LS1FVcVAxb21uSWZqSTBlam8iLCJvcmlnaW4iOiJodHRwOi8vbG9jYWxob3N0OjQ1MDciLCJjcm9zc09yaWdpbiI6ZmFsc2V9AAAAAAAPAAAACXNpZ25hdHVyZQAAAAAAAA0AAABAiwbmsgY0sd+ZXMoggqGDxDe2ajbuFxPQbzgrdBpAIq2nugSCPHOlqGTg9bzOYQZd4zcT+CUDF/ICXuDSEJi9BwAAAAAAAAAB2LKxTGD+y2X7KqgcvqedCDXK/A4NhR+GESV8HltB4cIAAAAIdHJhbnNmZXIAAAADAAAAEgAAAAFDMHHd6me1AwWYZHPRUaJXhtnUUSODljJGkPYQWDhQLQAAABIAAAAAAAAAAFx1QHLtnGvTTOGumtcJaOgIX2gWuK8sY7DGZS52WVD+AAAACgAAAAAAAAAAAAAAAAAAA+gAAAAAAAAAAQAAAAAAAAADAAAABgAAAAHYsrFMYP7LZfsqqBy+p50INcr8Dg2FH4YRJXweW0HhwgAAABQAAAABAAAABgAAAAFDMHHd6me1AwWYZHPRUaJXhtnUUSODljJGkPYQWDhQLQAAABQAAAABAAAAB6QMvP+kKK3U8q2jHZBMXAOkkh6EtlKyNREG9CTkqveGAAAAAgAAAAYAAAAB2LKxTGD+y2X7KqgcvqedCDXK/A4NhR+GESV8HltB4cIAAAAQAAAAAQAAAAIAAAAPAAAAB0JhbGFuY2UAAAAAEgAAAAFDMHHd6me1AwWYZHPRUaJXhtnUUSODljJGkPYQWDhQLQAAAAEAAAAAAAAAAFx1QHLtnGvTTOGumtcJaOgIX2gWuK8sY7DGZS52WVD+ADUbgQAADUQAAAKcAAAAAAAAUDgAAAABdllQ/gAAAEDj5bzs4eweV7GGjEJHwMLErwSF+j3fXkR1sDvZ2VjLN+XP8Czv9ZI+h1uV8gNl/gxT3DzoX5UNtZjg6xTiSZID

#

Tx result showing the contract trapped:
AAAAAAAB7+f/////AAAAAQAAAAAAAAAY/////gAAAAA=

{
  "fee_charged": 126951,
  "result": {
    "tx_failed": [
      {
        "op_inner": {
          "invoke_host_function": "trapped"
        }
      }
    ]
  },
  "ext": "v0"
}
vestal stone
#

@trim mist @civic jungle what's the recommended way to debug this if the diagnostic events don't expose an error?

#

The tx uses a custom account, but the diagnostic events show that auth with the custom account is succeeding because the __check_auth function is returning void.

trim mist
#

that's weird, we should have at least some error... do you have a clue as to what could have failed?

#

if error reporting is broken for this case for whatever reason, it seems like the way to approach this would be to modify the contract until it passes...

vestal stone
#

It's the native token.

#

So I can't really modify it.

#

There are only 2 contracts and 2 entities in play.

#

The auth contract and the token contract.

#

The diagnostic events show that the auth contract was called and returned successfully.

trim mist
#

right, but since we can be reasonably sure that token contract is fine, then it must be the auth contract

vestal stone
#

Well, look at the events, the auth contract executed and returned fine.

trim mist
#

well, I'm suspecting this part to be broken for some reason

#

I'm looking at the code - maybe we accidentally suppress/mask the error..

#

if that's not the case, then we only have a few things that might fail, such as nonce check. not sure why would it not propagate an error though...

#

well, at least I've found an issue with the core_metrics events - we don't currently log the budget metering data in case of failure, which is annoying

#

(my other hypothesis was that maybe you run out of budget)

#

I'm not 100% sure if we always produce a diagnostic events for budget exceeded errors...

#

actually, we are propagating these always, so that's not the case either

#

but in case if there is non-decorated host error (i.e. the one that doesn't emit a diagnostic event), then we don't log it into meta as well, which is bad. I'll address that as well. but for now I'm not actually sure how to debug this... maybe we could take a look at the transaction

vestal stone
#

If you give me a core I can rerun the test with it.

trim mist
#

yeah, that's an option as well. I can't give you core though, just a patch

vestal stone
#

I just need a commit sha

trim mist
#

ok

vestal stone
#

If you want me to rerun with any modifications

trim mist
#

actually, if you have a local core access, you can just run it with -ll debug

#

this will log the invocation failed: <HostError> line (and I'm working on propagating this to diagnostic events

vestal stone
#

COMMANDS=["ll?level=debug"]

#

That is already in my stellar-core cfg.

#

I'll rerun the tx, and then grab the stellar-core logs.

trim mist
#

yeah, then it should just be there

vestal stone
#

Hmm, the stellar-core logs seem to be very empty...

#
Warning: soroban-env-host-curr is running a pre-release version 20.0.0-rc1
2023-10-12T23:09:37.987 [default INFO] Config from /opt/stellar/core/etc/stellar-core.cfg
2023-10-12T23:09:37.987 [default WARNING] Overriding Soroban limits with TESTING_SOROBAN_HIGH_LIMIT_OVERRIDE
2023-10-12T23:09:37.987 [default INFO] 'ARTIFICIALLY_ACCELERATE_TIME_FOR_TESTING' enabled in configuration file - node will not function properly with most networks
2023-10-12T23:09:37.987 [default INFO] Using QUORUM_SET: {
   "t" : 1,
   "v" : [ "self" ]
}
trim mist
#

yeah, that's definitely not ll debug

civic jungle
#

Is that the log file or console output? Can you try running with --console?

vestal stone
#

Ok looking at actual log file now.

#

What am I looking for? I don't see any errors.

#

Lots of debug, lots of output, but no "error"

#

I found another case where a similar thing happens.

#

"function trapped" trying to call the native token contract when it hasn't been created.

trim mist
#

look for "invocation failed:"

vestal stone
#

Not present in the logs.

trim mist
#

these are gaps in our error decoration - I did a pass over storage errors, but obviously not all of them are properly decorated. so sometimes we still have a 'raw' error (just code+type) and then it's not propagated in any way besides the core log statement

#

this is really strange

vestal stone
#

In the diagnostic events it shows my __check_auth function returning with void, so again looks like the custom auth is successful.

trim mist
#

yeah, I've double-checked the code and it indeed doesn't seem like we could emit this for an unsuccessful call.

#

what is really concerning is that you don't have a log message in core logs

vestal stone
#

Could this be a bug with how custom auth is integrated?

#

I'm able to call the native contract to transfer the other way.

#

i.e. I'm using it to transfer from G to C.

#

Then it is failing whewn transfering from C to G.

trim mist
#

I guess it could be anything, but this is covered in tests

#

so it would be either some tricky scenario, or bad transaction (or bug in data plumbing)

#

I've found one non-decorated storage error - if nonce is missing from the footprint we don't emit the event

vestal stone
#

Then I think that is the issue.

trim mist
#

well, I hope it is

#

it was very useful anyway - there is a bunch of simple improvements we can do here

vestal stone
#

Yup, trying a fix now

trim mist
vestal stone
#

Yay it works.

#

And with that I have a transfer approved by webauthn.

trim mist
#

nice!

vestal stone
#

Need to add some logs to this to show that the balance is indeed changing on chain, then going to share this. I also need to check if I'm under testnet limits, I get the feeling I might have tipped over 😦

trim mist
#

are you doing some crypto we don't support yet?