#Dilemma of Stellar contracts state expiration + partial history retention


thorny swan
prime cradle
#

first, thanks for sharing this!
I've looked through this a bit. the points about DB size reduction are undoubtedly fair, and getting rid of postgres is something that has been planned for a while. the state expiration suggestion itself is a bit orthogonal to that though. it seems to omit the contract-driven rent bumps. the perfect scenario would be for the useful data to be bumped automatically either via frequent enough usage (like for token contracts) or via wallets (for user balances). of course, our expectations might be severely off, but it's not immediately obvious to me that entries that nobody uses should be forever persisted in the ledger for a flat payment

solar creek
#

Hey Orbit, thanks for sharing this. You covered a lot of ground in this post! Let me respond to the concerns about partial history Horizon.

We (SDF) are planning to limit history retention to one year on our public Horizon instance. We recommend others do the same, because as you point out, storing full history in this way is unsustainable. It's also unnecessary! When we analysed the traffic we receive, we found that 3/4 of requests to us are for current data. Only 7% of requests are for data older than a year. So we expect that after the change to our own service, 9/10 users of our API will notice no difference at all.

You correctly point out that full history is still available from the archives, but tooling is limited. We (SDF) plan to make available txmeta summaries, that can be used to quickly rehydrate a custom Horizon instance covering arbitrary timespans. This means for example that if your company were audited, and you needed to recreate a limited Horizon from 2017-2018 to answer some questions, this would be straightforward to do.
Additionally, Horizon provides a filtering mechanism that allows ingestion of data relevant to a limited set of accounts or assets. Our tests show that an operator tracking only a few assets they care about would save 99% of the storage of a full history instance. For most applications, the full range of blockchain objects is not needed.
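
To make the filtering idea concrete, here is a minimal sketch of allow-list filtering at ingestion time. It is not Horizon's actual implementation or configuration format; the `Tx` type and the `keep` and `ingest` functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Hypothetical, simplified view of a transaction seen during ingestion."""
    tx_id: str
    accounts: frozenset  # accounts the transaction touches
    assets: frozenset    # assets the transaction touches


def keep(tx, tracked_accounts, tracked_assets):
    """Keep a transaction if it touches any tracked account or tracked asset."""
    return bool(tx.accounts & tracked_accounts or tx.assets & tracked_assets)


def ingest(txs, tracked_accounts, tracked_assets):
    """Return only the transactions an operator chose to track."""
    return [tx for tx in txs if keep(tx, tracked_accounts, tracked_assets)]
```

Everything that matches neither list is simply never stored, which is where the storage savings for an operator tracking a handful of assets would come from.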

#

For wallets that want to show account history there are a couple of viable approaches today. Wallets that create and manage accounts can update their Horizon filtering to track those account histories moving forward. Wallets that allow onboarding of existing accounts can combine filtering with one-time lookups to a third-party historical service to backfill initial historical account data. Better support for wallet use cases is part of the SDF platform roadmap.

Finally, we have also provided Hubble (https://stellar.org/blog/developers/beyond-the-blockchain-unlocking-the-power-of-analytics-with-hubble), a modern data warehouse approach for analytics and deep data dives. This allows efficient on-demand aggregate querying, historical spelunking and can answer a much wider range of historical questions than Horizon ever could.

Zooming out: I'd like to stress that this partial history change will only apply to horizon.stellar.org. As a Horizon operator, you are free to store as much or little data as you choose within Horizon. However, we believe there are better ways for most users to cost-effectively access the data they need.

thorny swan
thorny swan
# prime cradle first, thanks for sharing this! I've looked through this a bit. the points about...

The idea of a contract automatically bumping its storage is fairly adequate for simple utility contracts, but it doesn't look applicable for many real-world use cases.

What about account abstraction contracts, for example? I have money on my account, but if I don't log in for some time, it will disappear with all funds, and I'll have to use some arcane techniques to revive it. Do you understand how much backlash we'll receive from users? How many scams will thrive on this?

Take another example, a lending/borrowing contract. Who exactly is responsible for bumping state records?
Protocol developers? No, it will be too expensive for them because a successful protocol will have tens of thousands of state records. Again, by simply not taking any action and letting the state expire, they may receive indirect profits because of delayed liquidations.
Lenders? No, they don't have direct access to the process, they simply deposit funds to the lending pool.
Borrowers? No, because depending on market conditions they might have an incentive to let the loan-related data expire. And even if they simply forget to bump the record, what will happen with the loan in case of approaching force liquidation?

Lastly, the notion that expiry should be managed by wallets does not guarantee that everything will be bumped in a timely manner, especially in case of dormant accounts.

Who will be liable for direct and indirect losses caused by state expiration in the future? I personally think that opening this Pandora's box will inevitably result in class-action lawsuits against protocol developers, wallets, and SDF itself.

thorny swan
# solar creek For wallets that want to show account history there are a couple of viable appro...

Of course, there are many ways to retrieve the historical data. Archives, Hubble, custom ingestion, third-party services. For cases like historical audit that's ok. But wallet-related suggestions are a bit off.

Let's take any non-custodial wallet. A user comes back after a year of inactivity and sees a blank transaction history (or simply scrolls past the 1y limit). Following your logic, the wallet should display a notification, something like "To view transaction history, please pay us 5 USDC and wait 10 minutes" because all viable ways to get evicted history records are both slow and expensive. Scrolled down past these recently retrieved 100 records? Need older transaction history? Please pay again, and wait another 10 minutes.

As for the impact being limited to horizon.stellar.org, could you please tell me who else at this point is running a publicly available Horizon instance with full history? What are the alternatives people will be able to rely on in the next year or two, after you trim history on your Horizon instances?

carmine snow
#

@thorny swan I agree about trimming history: wallet usage as you describe it seems off (maybe the costs won't be that high, but I see your point and agree).

Regarding state expiration, yes it does likely mean that inactive users will have to rely on centralized services (probably wallets will deal with this) to be able to know which entries to retrieve, which transactions will go through and why before execution etc, but this data is still easily verifiable and won't directly lead to funds loss if incorrect (if there's the possibility, the implementation should verify the integrity of that data). About:

And even if they simply forget to bump the record, what will happen with the loan in case of approaching force liquidation?

What do you mean by this? In contracts that are not managed by a centralized party that takes care of state expiration for its users, bumps will happen as the contract execution encompasses the entries (depending on the logic, only writes might lead to bumps, but you get my point). It's not possible for the borrower to forget to bump. What would happen on force liquidation? Ideally the related entries won't expire (and it's generally the contract's job to ensure that; I like the idea of a "permanently on the ledger" entry, but it can be achieved by restoring the entry, which, similarly to the permanent entry idea you proposed, is costlier), but even if they did, liquidators would just restore the entries and liquidate the position. Also, I cannot think of scenarios where an expired entry leads to permanent funds loss (besides bad implementation, but that's true of every smart contract platform).
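
The bump-on-access and restore flow described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the Soroban API: `ToyLedger`, `bump_by`, and `restore_fee` are hypothetical names and numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    value: int
    live_until: int  # last ledger sequence at which the entry is still live


@dataclass
class ToyLedger:
    """Toy model of expiring state; all names and numbers are illustrative."""
    seq: int = 0
    bump_by: int = 100     # hypothetical lifetime extension applied on access
    restore_fee: int = 10  # hypothetical extra cost of restoring an entry
    entries: dict = field(default_factory=dict)
    fees_paid: int = 0

    def put(self, key, value):
        self.entries[key] = Entry(value, self.seq + self.bump_by)

    def is_live(self, key):
        return self.entries[key].live_until >= self.seq

    def access(self, key):
        """Contract execution touches an entry: fail if expired, else bump it."""
        entry = self.entries[key]
        if not self.is_live(key):
            raise LookupError(f"{key} is expired; restore it first")
        entry.live_until = max(entry.live_until, self.seq + self.bump_by)
        return entry.value

    def restore(self, key):
        """Anyone (for example a liquidator) pays to bring an expired entry back."""
        entry = self.entries[key]
        if not self.is_live(key):
            self.fees_paid += self.restore_fee
            entry.live_until = self.seq + self.bump_by
```

In this model the value survives expiration untouched; an expired entry only blocks access until someone pays the restore fee, which matches the point that a liquidator could restore the entries and then liquidate.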

#

Scams will surely thrive on this, but that can be said really for most new features, it's on the user/implementors to rely on official/audited/well-known tooling.

#

Overall, I find state expiration an excellent concept to be applied to programs that can arbitrarily write arbitrary data like smart contracts. Thorough tooling and clear documentation will surely be needed, but I believe the pros of state expiration overcome the cons and the difficulty of working along with it (which won't be streamlined to the generic user in my vision of the management of expiring entries).

severe granite
#

Thanks for your feedback @thorny swan, always appreciate some technical deep dives on our proposals!

severe granite
# thorny swan I understand your position on flat payments. Although it's not really that "flat...

I don't think raw SSD price over time is a great metric to use for a few reasons. First, blockchain validator DBs are unique in that they can't just store data in standard database or S3 configurations, but have to use some cryptographic schema like a Merkle tree or, in our case, BucketList. In "traditional" DBs, if you store an entry but never actually use it, you never have to touch the bytes and they can just sit forever on some hard disk. If you have to produce a hash of all state, this is no longer the case. For BucketList, you have to constantly read and rewrite all ledger state. This is intrinsic to all hash-based storage solutions, not just BucketList. In Merkle trees, you don't have to rewrite state, but you have to constantly read all entries (with random disk IO) and rehash intermediate nodes, which is an even worse trade-off. This makes any storage in a blockchain validator DB significantly more expensive for both computation and IO than traditional DBs.
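
As a rough illustration of the "rehash intermediate nodes" cost, here is a minimal binary Merkle tree sketch. It models only the Merkle case, not BucketList, and it is not taken from any real validator implementation; the `MerkleTree` class and its layout are assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class MerkleTree:
    """Binary Merkle tree over a power-of-two number of leaves."""

    def __init__(self, leaves):
        n = len(leaves)
        assert n > 0 and n & (n - 1) == 0, "leaf count must be a power of two"
        self.n = n
        # nodes[1] is the root; leaf hashes live at indices n .. 2n-1
        self.nodes = [b""] * (2 * n)
        for i, leaf in enumerate(leaves):
            self.nodes[n + i] = h(leaf)
        for i in range(n - 1, 0, -1):
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])

    @property
    def root(self) -> bytes:
        return self.nodes[1]

    def update(self, index, leaf):
        """Replace one leaf and return how many hashes were recomputed."""
        i = self.n + index
        self.nodes[i] = h(leaf)
        hashes = 1
        i //= 2
        while i >= 1:
            # Each ancestor needs both children read back before rehashing.
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])
            hashes += 1
            i //= 2
        return hashes
```

With 1,024 leaves, a single update recomputes 11 hashes (the leaf plus its 10 ancestors), and each ancestor step needs a sibling hash that may live anywhere on disk. A plain key-value store would do one write for the same change.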

#

This isn't theoretical: other blockchains with more scale than Stellar have already run into issues with hashing and maintaining state. Ethereum is going stateless, where validators store no state and block producers maintain very computationally expensive Verkle tree databases, which are a more computationally complex version of a Merkle tree. This has significant performance and decentralization impacts: https://dankradfeist.de/ethereum/2021/02/14/why-stateless.html

#

Solana tackles the issue by requiring very significant RAM just for DB caches in order to keep up with their Merkle tree rehashing requirements, making validators expensive and centralized.

#

As others have mentioned, I agree that our specific SQL DB implementation is inefficient, but we've finished our LevelDB-like database and will be transitioning validators to it soon (Horizon watcher nodes have already been running it for 6 months). That being said, because of the significant cryptographic requirements of blockchains, these DBs are expensive. Making comparisons to non-validator DBs is not particularly relevant. Any flat-rate persistent storage is fundamentally misaligned with the true costs of validator storage.

#

I think if we get into the mindset of "this operation is expensive, but we're not going to charge for it for ease of use reasons", we get into significant scalability and resource constraint issues, given that this is an open network that anyone can submit TXs on.

prime cradle
#

How many scams will thrive on this?
I apologize for cherry-picking from the long response, but I would really be interested to see which specific scam scenarios you envision state expiration enabling. this has never come up during any of the state expiration discussions and I honestly don't have any good ideas regarding this (probably a separate thread would work better for discussing that).

terse mantle
#

Thanks @thorny swan for thinking deeply about the proposals and providing feedback. Want to provide a bit of color on Horizon - specifically, clarifying separation of concerns between:

  1. What Horizon’s OS software provides VS what SDF’s Horizon instance supports:
    The Horizon software will continue to support the ability to acquire and run instances with variable (including up to full) history. As @solar creek mentioned, we have plans to make this easier via precomputed TX Meta in the future, given the parallel re-ingestion process currently requires a large number of workers and takes quite some time to ingest multi-year history. SDF, however, plans to truncate history in the future (retaining 1 year, for example). This is so that we can continue to provide stable query performance/quality of service for most users (>90% of whom ask for <30 days of history at most). This will not be done until we offer options to ensure there continues to be a viable path towards acquiring long/full history for those willing to bear the cost.

  2. OLTP (App transactions) VS OLAP (Analytics) use cases (to reference an old school analog):
    Horizon is intended for OLTP-type interactions. It’s not meant to be a data warehouse for all time. We got away with overloading the use cases initially, but in the long run we see that the limitations of the PG DB will make that infeasible (not to mention that it will eventually require exotic options like sharding or vendor-specific PG hosting solutions). Fortunately, we do have an OLAP platform (Hubble) available to the community, which will always have full history and is suitable for use cases like auditing or analytics.

smoky condor
#

Couple things (as we’re wrapping up Meridian): talking to multiple people that were not keeping up with the protocol discussions, it seems that they don’t really have the proper “first impression” on what “state expiration” really means. This thread is also mixing use cases and true implications of what this would do to the ecosystem. Naming things properly is indeed one of the hardest problems in computer science, so I think we’re going to have to revisit definitions so that we get everyone on the same page.