#Snapshots fill up entire volume


pine atlas
#

Hi!

We use Trident in Kubernetes to provision NetApp volumes in our ONTAP system. We have daily snapshots enabled in the SVM to reduce the risk of data loss. Unfortunately, this means that some volumes get filled with snapshots and have no space left for user data. We've set snapshotReserve=50 to allocate more space to snapshots, but it does not limit the snapshots to the reserved space, so we still run out of space in the volume!

The app in Kubernetes that is particularly affected by this is a PostgreSQL database. We're not exactly sure why, but maybe it has a high churn rate for some reason. Perhaps there is a bug in the app that causes the high churn rate on disk, but we are not sure. In any case, when a customer requests a 10GiB volume we want them to be able to access the full space, not just some arbitrary number lower than that. And we never want to risk snapshots eating up all usable space for them, since they do not control snapshots.

Am I missing something here? Is there a way to limit snapshots so they cannot encroach on the user's disk space? What's the best practice? Are snapshots a "bad" feature for my use case, and should I use an external backup system instead?
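To make the problem concrete, here is a minimal sketch of the arithmetic, assuming the reserve behaves as described above (snapshot data that no longer fits in the reserve spills into the space meant for user data). The function name and the simplified model are illustrative assumptions, not ONTAP's actual accounting.

```python
GIB = 1024 ** 3


def usable_space_bytes(volume_size, snapshot_reserve_percent, snapshot_used):
    """Capacity left for user data when the snapshot reserve is only a soft limit.

    Simplified model: the reserve is carved out of the volume, and any
    snapshot data beyond the reserve is taken from the user-data area.
    """
    reserve = volume_size * snapshot_reserve_percent // 100
    user_area = volume_size - reserve
    # Snapshot data that does not fit in the reserve spills into the user area.
    spill = max(0, snapshot_used - reserve)
    return max(0, user_area - spill)


# A 10 GiB volume with a 50% reserve, numbers borrowed from the post above.
for snap_gib in (2, 5, 8, 10):
    usable = usable_space_bytes(10 * GIB, 50, snap_gib * GIB)
    print(f"snapshots={snap_gib} GiB -> usable={usable / GIB:.1f} GiB")
```

Under this model the user-visible capacity stays constant only while snapshots fit inside the reserve; once they exceed it, every extra byte of snapshot data is a byte the user loses.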

empty tiger
#

First of all, the snapshotReserve is a "soft limit", so it doesn't help much here (I would suggest setting it to 0). There are several volume options you can use, like "autoresize", which allows the volume to grow up to a maximum size. There is also a snapshot "autodelete", which can trigger deletion of snapshots if you are running low on space... Personally I don't like autodelete because you might end up deleting something you need... Some of this can be set via the web GUI, but some finer settings can only be set on the command line... things like "autosize-grow-threshold-percent", which I think is set to 98% by default, and which I normally set to 80% because my monitoring flips out if volumes are above 80% 😉
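As a rough illustration of the command-line settings mentioned above, the sketch below only assembles ONTAP CLI command strings; it does not talk to a cluster. The commands `volume autosize` and `volume snapshot autodelete modify` and their flags are written as I recall them from the ONTAP CLI, so please verify them against the command reference for your ONTAP version. The SVM name `svm1`, the volume name `trident_pvc_example`, and the helper function names are hypothetical.

```python
def autosize_command(vserver, volume, maximum_size, grow_threshold_percent=80):
    """Build an ONTAP CLI command that lets a volume grow up to maximum_size."""
    return (
        f"volume autosize -vserver {vserver} -volume {volume} "
        f"-mode grow -maximum-size {maximum_size} "
        f"-grow-threshold-percent {grow_threshold_percent}"
    )


def autodelete_command(vserver, volume, enabled):
    """Build an ONTAP CLI command that turns snapshot autodelete on or off."""
    command = (
        f"volume snapshot autodelete modify -vserver {vserver} "
        f"-volume {volume} -enabled {'true' if enabled else 'false'}"
    )
    if enabled:
        # Trigger deletion when the volume itself is nearly full.
        command += " -trigger volume"
    return command


# Hypothetical names; substitute your own SVM and volume.
print(autosize_command("svm1", "trident_pvc_example", "20GB"))
print(autodelete_command("svm1", "trident_pvc_example", enabled=False))
```

Keeping autodelete off and relying on autosize matches the preference stated above, at the cost of volumes growing beyond the size the customer asked for.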

somber jasper
#

having automatic snapshots on postgres (or any database, for that matter) isn't going to give you a consistent backup anyway

pine atlas
#

Thank you for the input! Looking through the documentation, I find no option for setting autoresize or autodelete in Trident Backend Configuration. Am I missing some option that lets me specify it, or how does one set those?

pine atlas
# somber jasper having automatic snapshots on postgres (or any database, for that matter) isn't ...

Yes, that is true. Each application owner must make sure to export their data periodically (e.g. a database dump to disk), if their application cannot be reliably restored from a snapshot alone. Assuming they get that part right, our snapshots will help prevent data loss.
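For the periodic export mentioned above, a minimal sketch could look like the following. It only builds a `pg_dump` command line with a timestamped file name; the database name `appdb`, the directory `/var/backups`, and the helper name are hypothetical, and connection settings (host, user, password) are assumed to come from the environment.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def pg_dump_command(dbname, dump_dir, now=None):
    """Return the argument list for a custom-format pg_dump into dump_dir."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = PurePosixPath(dump_dir) / f"{dbname}-{stamp}.dump"
    # The custom format is compressed and can be restored with pg_restore.
    return ["pg_dump", "--format=custom", f"--file={target}", dbname]


print(pg_dump_command("appdb", "/var/backups"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from a scheduled job, so that each snapshot contains at least one consistent dump.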

But more broadly, I believe you are touching on the core issue here. Our long-term solution needs to be a "real" backup solution, not relying on snapshots in the NetApp volumes.