#CPU Instruction and Mem Bytes Limitations

deft dagger
#

I've started the process of optimizing the Blend protocol for resource usage, and wanted to start a thread for more details about the limitations around CPU Instructions and Mem Bytes.

Is there a current cap for block limits? For example, Ethereum has a max gas limit of 30 million gas. The env defines numbers, but they were defined months ago so I don't know how current they are: https://github.com/stellar/rs-soroban-env/blame/main/soroban-env-host/src/budget.rs#L761-L774

fiery pasture
#

these numbers are good

#

futurenet limits are here. we'll try to keep this up to date with the new releases and network upgrades

#

do note that these numbers are really conservative - please let us know if there are useful contracts that you couldn't fit into any of the limits

deft dagger
#

I'm already hitting these limits, especially when using custom token contracts for fairly simple use cases (IE collateralize three assets and attempt to borrow 1), where ~80% of the total limit is VmInstantiation.

With the changes being suggested in https://github.com/stellar/rs-soroban-env/pull/825, we will likely only be able to support the borrow action where only 1 asset is collateralized, even if we eliminated all other costs.

The current (very unoptimized) version of the lending pool is 48kB (this is roughly comparable to all of Aave's various libraries combined) and will cost 736,049 + 48,000 * 684 ≈ 33.6m CPU Instructions just to initialize, meaning almost no other external contract can be loaded (our custom token is 736,049 + 8,500 * 684 ≈ 6.55m).
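The arithmetic in the message above can be reproduced with a small sketch. The constant term (736,049) and the per-byte factor (684) are the figures quoted in the message; the function and constant names are made up for illustration and are not part of rs-soroban-env.

```rust
/// Constant term of the linear VM instantiation cost, as quoted in the thread.
const VM_INSTANTIATION_CONST: u64 = 736_049;
/// Per-byte term of the linear VM instantiation cost, as quoted in the thread.
const VM_INSTANTIATION_PER_BYTE: u64 = 684;

/// Estimated CPU instructions to instantiate a Wasm module of `wasm_bytes`
/// bytes under the linear model `const + bytes * per_byte`.
/// Hypothetical helper for illustration only; not a function in rs-soroban-env.
fn vm_instantiation_cost(wasm_bytes: u64) -> u64 {
    VM_INSTANTIATION_CONST + wasm_bytes * VM_INSTANTIATION_PER_BYTE
}
```

Calling `vm_instantiation_cost(48_000)` gives 33,568,049 and `vm_instantiation_cost(8_500)` gives 6,550,049, which is close to the rounded figures in the message.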

#

(side note - simulation determines the following costs for a borrow with 3 collateralized assets on a sandbox network {"cpuInsns":"21160080","memBytes":"17689111"} and is still failing, so it might be another resource limit being hit)

fiery pasture
#

hmm, not sure if simulation should fail due to hitting the resource limit

deft dagger
#

the submission is failing, not the simulation*

fiery pasture
#

ah, I see

#

if the error is a tx error, then you're hitting the per-tx limit (the one specified on the page). if the error is an operation error, then you hit the limit you declared at apply time (preflight metering is currently imperfect and might diverge from apply time metering; although at your scale I think it shouldn't be the reason)

#

the current limit was set with 1m insns/vm in mind (so having like ~10-20 vm calls seemed possible). we'll definitely need to increase it if the cost is changed to linear.

#

FWIW the current results are pretty surprising; I'm not sure why the linear factor would be that big

#

it is like 30 times larger than the cost of executing an instruction

deft dagger
#

Here is some more context for simple functions with the current constant model (where VM Instantiation is still playing a massive role):

#

(note these were taken from rust integration tests)

fiery pasture
#

sure, vm instantiation is expected to be expensive. currently you seem to have potential to increase your invocation 2x which seems fair. if we change the calibration, then I guess we'll change the limit proportionally

deft dagger
#

(FWIW the above graph is an optimistic test on the cheaper actions, where no valuation of the users position occurs)

Hm, so I've been considering tracking collateral and liabilities within the lending pool contract (IE no separate tokens like https://docs.aave.com/developers/tokens/atoken) to avoid instantiating extra token contracts when attempting to value a user's position, at the expense of any token interaction (like calling balance) being forced to initialize the full pool. However, I don't think I can make that decision until the PR I linked above is merged / finalized.

To help better understand the implications, the limits reported are per transaction, right? Are there any block level limits?

fiery pasture
#

there is a per-transaction limit for maximum resource consumption. currently it's somewhat arbitrary, but going forward it will be based on the per-ledger limit (with the goal of allowing ~10 max transactions/ledger)

#

I think the limits should come from the use cases; if they're too low for most useful contracts, then we should definitely raise them (even at cost of allowing less max txs/ledger). which is why this data is really useful

#

also, limits are not set in stone, they can be updated via network upgrades

deft dagger
#

Ah, that's helpful. I'll see what I can figure out for the current system as designed and hopefully release a more complete benchmark.

daring minnow
#

@deft dagger blend doesn't sound like a project that should be hitting resource limits. Is there a wasm blob/contract you can share so that we can investigate? If need be we should adjust limits. @reef sparrow @tepid depot @inner walrus

deft dagger
#

Happy to add whomever to the repository. There were a few design decisions made incorrectly (like I assumed we would only be charged the VmInstantiation fee once per contract), and some dead code left from refactors.

#

the primary cost driver is just loading new VMs to check balances on bToken (tokens that track collateral) and dToken (tokens that track liabilities) when attempting to value a user's total position against the pool, as it scales linearly with the number of positions they currently have open. Each new position triggers a call to the oracle, the underlying asset when accruing interest, and its respective b/d token for a balance, so 3 new VMs.
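The linear scaling described above can be sketched as follows. This is a rough model: it assumes every call instantiates a fresh VM (no sharing or caching between positions), it leaves out the pool contract itself, and the names are hypothetical. The per-VM cost is a plain input, not a network value.

```rust
/// VMs instantiated per open position, per the message above:
/// the oracle, the underlying asset, and the position's b/d token.
const VMS_PER_POSITION: u64 = 3;

/// Number of VM instantiations needed to value a user with `positions`
/// open positions, assuming nothing is shared or cached between them.
fn vm_instantiations_for_valuation(positions: u64) -> u64 {
    positions * VMS_PER_POSITION
}

/// Total instantiation cost if every VM costs `cost_per_vm` CPU instructions.
/// `cost_per_vm` is supplied by the caller; no specific network value is implied.
fn valuation_instantiation_cost(positions: u64, cost_per_vm: u64) -> u64 {
    vm_instantiations_for_valuation(positions) * cost_per_vm
}
```

For example, a user with 3 collateral positions and 1 borrow has 4 open positions, so 12 instantiations. At the ~1m insns/vm mentioned earlier in the thread, that comes to 12m instructions for instantiation alone.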

reef sparrow
#

(consider how many CPU instructions your compiler does to emit each CPU instruction that will run, or even handle faulting-in each new page worth of instructions)

fiery pasture
#

from my brief knowledge of wasm it doesn't seem that high level so I'm not sure how much parsing work is actually there (given that there is also no JIT). but in any case my concern is rather not about the absolute numbers (they are what they are), but about variance we observe. the way I can interpret our benchmark results so far is that there are variables we don't take into account when computing the model, which means that we would either overcharge or undercharge a big fraction of contracts

inner walrus
#

What is the plan to address this @tepid depot ? According to our metering model, we should be breaking up wasm instantiation into existing subcomponents, right? Is it that we're missing callbacks in wasmi to do this well?

tepid depot
#

@inner walrus The only way to improve metering of VM instantiation is to fork into Wasmi and inject metering into different sections (we need a new cost component per section), which is moving away from our effort of migrating to Wasmi upstream. I've created this issue for it https://github.com/stellar/rs-soroban-env/issues/838.

The other option is to do component caching tracked by https://github.com/stellar/rs-soroban-env/issues/827, which can reduce the cost itself significantly. This won't be quick or easy to do either (partly for the reason I mentioned in the thread).

In the immediate term the only way is to keep the current metering model (with newly calibrated parameters in https://github.com/stellar/rs-soroban-env/pull/825), and bump the resource limits at network level accordingly.

deft dagger
#

Wanted to add a suggestion that would significantly help larger protocols like Blend keep the VM Instantiation cost in check.

Since the protocol uses known contracts, we can easily write logic that safely reads from another contract's ContractData storage (like reading the balance of a BlendToken to determine the user's collateral balance without having to instantiate the VM contract). Last time I suggested this, I think the pushback was that the potential mistakes developers can make made it not worth it. However, I thought it was worth bringing back to light with the new information we have on VM instantiation costs.

daring minnow
#

@deft dagger thanks for resurfacing. I agree that it might be worth revisiting given everything we know now. IIRC (need to look up the thread) we voted against direct reads from other contracts because of (a) the footguns associated with directly using the memory of another contract and (b) not allowing for "pay to access" patterns on-chain

fiery pasture
#

I don't think we've ever considered just providing unlimited RO access to other contracts' storage. instead, the proposals were around specifying readable keys via metadata. this works around both footguns (that's basically still part of the public contract interface), but requires some design and implementation effort
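The "readable keys via metadata" idea can be pictured with a toy model. Everything below is hypothetical: the thread does not describe a concrete API, and a real design would need to handle per-user keys (such as balances) rather than a fixed set of exact key names.

```rust
use std::collections::{HashMap, HashSet};

/// Toy model of a contract's storage plus a declared list of externally
/// readable keys. All names are hypothetical; this is not a Soroban API.
struct ContractModel {
    storage: HashMap<String, i128>,
    readable_keys: HashSet<String>,
}

/// Read `key` from another contract's storage without invoking that contract.
/// Returns `None` when the key is not declared readable or is absent.
fn read_external(contract: &ContractModel, key: &str) -> Option<i128> {
    if !contract.readable_keys.contains(key) {
        // Not part of the declared public interface, so the read is refused.
        return None;
    }
    contract.storage.get(key).copied()
}
```

Because the owning contract opts in key by key, the readable set stays part of its public interface, which is the property the message above relies on.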

fiery pasture
inner walrus
#

Direct ledger access would break certain commit and reveal schemes like the one that people need to follow for prng (on that note, I see that we don't have an example of this, so I opened one). It does not mean it cannot be done, just that it would have to be "opt-in". That being said, having broken metering for VM instantiation (if this is what this is really about) sounds like a high priority to me: direct ledger access only solves a small subset of problems where multiple VMs are involved in a transaction.

GitHub

Example Soroban Contracts. Contribute to stellar/soroban-examples development by creating an account on GitHub.

deft dagger
#

What I picked up from https://github.com/stellar/rs-soroban-env/pull/825 was that VM Instantiation was just that much more expensive than anything else, and that the current metering was close enough.

Per the recommendation of direct ledger access:

> (b) not allowing for "pay to access" patterns on-chain

was there a thread about this? It seems like direct access would allow you to avoid this as any contract can be written to provide access to a certain ContractDataEntry, instead of only being allowed access through the "owning" contract

> specifying readable keys via metadata

I have a hard time grasping use cases for universal ContractDataEntry access outside of protocol optimization such that it needs to be on a public interface, but this would still be useful for optimization

> break certain commit and reveal schemes like the one that people need to follow for prng

I'm not familiar enough with this to comment, but I'm not opposed to allowing contracts to enable or disable external reads

fiery pasture
#

> I have a hard time grasping use cases for universal ContractDataEntry access outside of protocol optimization such that it needs to be on a public interface, but this would still be useful for optimization

it can only be a part of the public interface in order to not break overall security. this doesn't need to be a 'pure optimization', it could be just a general way of defining 'view functions' (most of them probably just read from storage)

#

this has also been discussed in the light of e.g. getting token data off-chain (so that external clients could read the relevant data via common interface without relying on internal implementation details)

tepid depot
fiery pasture
#

we can store that in the contract code entry

#

the same goes for any additional info, like number of functions or whatever else is a useful input parameter

#

could you please share your experiment results somewhere?

tepid depot
kind canyon
#

As mentioned by @inner walrus in issue #838,

Caching is probably doable in the context of a single transaction, but crossing transaction boundary is probably fairly challenging:
as cpuInstructionCount is a non refundable resource, we'd have to come up with some scheme that allows to reason at the tx set level (to also divide up fairly the cost of instantiation between transactions).

If we were to do it at the transaction level only, I am not sure the benefits would be there.

Could you please clarify, are there any plans to cache loaded contracts at least on the transaction level? It makes sense from the perspective of contract developers. Without caching, developers will likely have to rely on caching responses on the contract level to avoid high fees.

Also, this behavior falls in line with the dynamic modules concept widely adopted in most programming languages: dynamic modules are loaded once on demand and can then be reused without loading penalties throughout the whole lifecycle of the application.
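A minimal sketch of what transaction-level caching would mean for metering, assuming a contract pays the full instantiation cost on first load and some smaller cost on later calls within the same transaction. The struct, its methods, and the cost values passed in are all hypothetical; this is not host code.

```rust
use std::collections::HashSet;

/// Toy per-transaction budget that charges full VM instantiation only the
/// first time a contract is loaded. Hypothetical model, not host code.
struct TxBudget {
    loaded: HashSet<String>,
    cpu_insns: u64,
}

impl TxBudget {
    fn new() -> Self {
        TxBudget { loaded: HashSet::new(), cpu_insns: 0 }
    }

    /// Charge for invoking `contract_id`. `full_cost` applies on first load,
    /// `cached_cost` on every later call within the same transaction.
    fn charge_call(&mut self, contract_id: &str, full_cost: u64, cached_cost: u64) {
        // `insert` returns true only if the id was not already present.
        if self.loaded.insert(contract_id.to_string()) {
            self.cpu_insns += full_cost;
        } else {
            self.cpu_insns += cached_cost;
        }
    }
}
```

With a placeholder full cost of 1,000,000 and a placeholder cached cost of 10,000, three calls to the same oracle are charged 1,020,000 instead of 3,000,000.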

If caching on the tx level is not an option, we'll need to extend the SEP-40 draft (price feed oracles) with additional functions to allow batched price retrieval for several assets. Otherwise, the current approach makes some usage scenarios quite expensive, e.g. when a client contract has to make several subsequent calls to an oracle contract to fetch data for several assets. That's the problem we and @deft dagger are facing right now.
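The difference between per-asset calls and a batched call can be illustrated with a toy oracle that counts how often it is invoked. The method names are illustrative and are not taken from the SEP-40 draft; the point is only that, without caching, each invocation from a client contract would pay for VM instantiation again.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Toy oracle that counts how many times it is invoked.
/// Method names are illustrative and not taken from the SEP-40 draft.
struct ToyOracle {
    prices: HashMap<String, i128>,
    invocations: Cell<u64>,
}

impl ToyOracle {
    fn new(prices: &[(&str, i128)]) -> Self {
        ToyOracle {
            prices: prices.iter().map(|(a, p)| (a.to_string(), *p)).collect(),
            invocations: Cell::new(0),
        }
    }

    /// Single-asset lookup: one invocation per asset.
    fn price(&self, asset: &str) -> Option<i128> {
        self.invocations.set(self.invocations.get() + 1);
        self.prices.get(asset).copied()
    }

    /// Hypothetical batched lookup: one invocation for any number of assets.
    fn prices_batch(&self, assets: &[&str]) -> Vec<Option<i128>> {
        self.invocations.set(self.invocations.get() + 1);
        assets.iter().map(|a| self.prices.get(*a).copied()).collect()
    }
}
```

Fetching three assets one at a time registers three invocations, while `prices_batch` registers one for the same result.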

daring minnow
#

@kind canyon yes, this is planned. we're adding a new cost type for cached instantiations in the upcoming xdr but it will only be implemented later on. I would avoid introducing batching optimizations, especially in standards like sep40

kind canyon
daring minnow
#

@kind canyon with a high probability this will land before the mainnet release. I can't say with 100% because this is an optimization and we're prioritizing feature work ATM

rocky stone