#Getting useful metrics for volume usage


red nova
#

Hi!

I'm currently working on configuring the LGTM stack for observability in our Kubernetes cluster. One important thing we need to monitor is the remaining capacity of all volumes in the cluster, so we can act before someone runs out of space. We use Trident with ONTAP NAS as the backend using Flex Volumes.

When inspecting the metrics, I immediately saw that something was off. While kubelet_volume_stats_capacity_bytes truthfully reports that my volume is 50 TiB, kubelet_volume_stats_used_bytes is way off, reporting ~28 TiB while the true size of my data is 86 GiB. I recognize this as the discrepancy between running df and du to determine disk utilization (where df reports much more than the true number from du).

This is well-known (see https://docs.netapp.com/us-en/ontap/volumes/df-command-file-size-concept.html). But I'm not a storage expert, so I can't quite wrap my head around the details of why this is, so thought I'd ask here.

It's actually one of the most common questions/complaints from my users, having caused a lot of confusion. Ideally, I'm hoping someone has an easy trick for me to change a parameter somewhere and finally get rid of this ugly bug for good (both via df and the metrics reported to Grafana). But if that's not possible for whatever reason, surely there's some way to get the true metric from Trident so I can present a useful graph to my users?

Hoping for some input on this. 🙂

left scroll
#

what you linked to is related to quotas, which I guess you're not using (as quotas are not exposed to CSI/Trident).
The real reason is probably deduplication and compression. ONTAP transparently dedupes and compresses your data, so you might end up with a volume containing 100TB of "logical" data (du) while only consuming 10TB of "physical" blocks (df).

Normally this is seen as a benefit (you can store more data than what the raw capacity of the disk provides), but especially in a k8s/Trident environment it can be counter-intuitive.

You can set the options is-space-reporting-logical and is-space-enforcement-logical on the SVM volume(s) to have it actually count "logical" data and use that. That way, if you give out 50TB to some k8s workload, you can be sure that no more than 50TB will ever be stored there, no matter the deduplication/compression ratios.
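
To make the logical vs. physical distinction concrete, here is a toy Python model. It is only a sketch: the `Volume` class, its fields and the 10:1 efficiency ratio are made up for illustration and are not ONTAP's actual accounting. It just shows how the same volume can report very different "used" numbers, and how logical enforcement changes when the volume counts as full.

```python
from dataclasses import dataclass

TB = 10 ** 12


@dataclass
class Volume:
    """Toy model of a volume with storage efficiency (hypothetical, not ONTAP code)."""
    size_bytes: int          # provisioned size, e.g. the PVC size
    logical_used: int        # bytes as the client wrote them
    efficiency_ratio: float  # assumed logical:physical ratio, e.g. 10.0 for 10:1

    @property
    def physical_used(self) -> int:
        # Blocks actually consumed after dedupe/compression.
        return int(self.logical_used / self.efficiency_ratio)

    def reported_used(self, logical_reporting: bool) -> int:
        # With logical reporting the client sees logical bytes,
        # otherwise it sees the physical block usage.
        return self.logical_used if logical_reporting else self.physical_used

    def can_write(self, nbytes: int, logical_enforcement: bool) -> bool:
        # With logical enforcement the volume fills up on logical bytes,
        # so efficiency savings no longer stretch its capacity.
        if logical_enforcement:
            return self.logical_used + nbytes <= self.size_bytes
        added_physical = int(nbytes / self.efficiency_ratio)
        return self.physical_used + added_physical <= self.size_bytes


vol = Volume(size_bytes=50 * TB, logical_used=45 * TB, efficiency_ratio=10.0)
physical_view = vol.reported_used(logical_reporting=False)
logical_view = vol.reported_used(logical_reporting=True)
fits_physical = vol.can_write(10 * TB, logical_enforcement=False)
fits_logical = vol.can_write(10 * TB, logical_enforcement=True)
```

In this model the 50TB volume holding 45TB of logical data reports 4.5TB used in the physical view and 45TB in the logical view, and a further 10TB write is only rejected when enforcement is logical.
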

red nova
#

Wow, that's quick. Thanks for the input!

Your suggestion sounds really promising, I will for sure check those parameters!

I'm not sure though if your description aligns with what I see:

so you might end up with a volume containing 100TB of "logical" data (du) while only consuming 10TB of "physical" blocks (df).

For me it's the other way around:

$ du -sh /opt/data
86.0G   /opt/data

$ df -h /opt/data
Filesystem                Size      Used Available Use% Mounted on
192.168.0.10:/trident_pvc_c00feec0_adad_4faf_bde9_c00ffeec0ffe/data
                         50.0T     28.2T     21.8T  56% /opt/data
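
For readers wondering why the two commands can disagree at all: they measure different things. The sketch below is a rough Python approximation of each (the function names are made up, and real du additionally counts hard links only once). The df view asks the filesystem, here the NFS server, for its totals; the du view walks the tree and adds up the blocks of each file.

```python
import os


def df_numbers(path: str) -> dict:
    """Size/used/available in bytes as reported by the filesystem (the df view)."""
    st = os.statvfs(path)  # Unix only
    return {
        "size": st.f_blocks * st.f_frsize,
        "used": (st.f_blocks - st.f_bfree) * st.f_frsize,
        "available": st.f_bavail * st.f_frsize,
    }


def du_bytes(path: str) -> int:
    """Sum the allocated blocks of everything under path (the du view)."""
    # st_blocks is counted in 512-byte units.
    total = os.lstat(path).st_blocks * 512
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            try:
                total += os.lstat(os.path.join(root, name)).st_blocks * 512
            except OSError:
                pass  # entry vanished or is unreadable; skip it
    return total
```

Calling `df_numbers("/opt/data")` and `du_bytes("/opt/data")` side by side gives the same kind of comparison as the two commands above: anything the server counts as used that is not a visible file shows up only in the first number.
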

steady crystal
#

I think he meant "-is-space-reporting-logical"

#

This can also be set for the entire SVM. The "-is-space-enforcement-logical" is an SVM only setting which limits usage to the logical and not physical availability/size in volumes (no dedupe advantages, sort of).

left scroll
#

but yeah this looks like something different... 86 GiB vs. 28 TiB... I think I misread that in your post as both being GiB... maybe a ton of snapshots using up space?

#

it could also be that your aggregate only has 21 TiB free, then you can create a thin-provisioned volume with 50 TiB, but since you don't have the underlying space available, it will always show up as half-full because you can't put in more data than what the aggregate provides. The joys of Thin-Provisioning. Can you show a df -A -h or storage aggregate show from the ONTAP CLI?
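
The arithmetic behind this hypothesis can be sketched in a few lines of Python. This is a simplified model of the idea as stated here, not confirmed ONTAP behaviour: `df_view` is a made-up helper, and the aggregate free space of 21.8 TiB is an assumption picked to match the Avail column in the df output from the thread.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def df_view(volume_size: int, data_in_volume: int, aggregate_free: int) -> dict:
    """Simplified model: a thin volume cannot offer more free space than its
    aggregate has left, and 'used' is derived as size minus available."""
    available = min(volume_size - data_in_volume, aggregate_free)
    return {
        "size": volume_size,
        "used": volume_size - available,
        "available": available,
    }


view = df_view(
    volume_size=50 * TIB,
    data_in_volume=86 * GIB,
    aggregate_free=int(21.8 * TIB),  # assumption, not a measured value
)
print({k: round(v / TIB, 1) for k, v in view.items()})
```

Under these assumptions the model lands on 28.2 TiB used and 21.8 TiB available for a volume holding only 86 GiB, which is the pattern in the df output. Whether that is what actually happens here still needs the aggregate numbers from the storage system.
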

red nova
#

I don't have access to the server where it's possible to run volume, only storage admins do. But I have access to the PowerShell module with vsadmin in my SVM. So I'm a bit limited in what I can verify on my own. I'll try to get in touch with storage admins next week.

#

It could very well be a bunch of snapshots taking up space! I did have snapshots enabled for a long time, but I've disabled it since it just ate up space for everyone making volumes unusable. 😅 And it is indeed thin. I'll check that command, just a minute!

#

I was not able to run df -A -h because that container only has busybox which does not implement df -A. I'll see if I can run a new Pod instead…

#

Ehm, apparently df in GNU coreutils in Ubuntu does not have -A either. 🤔 Typo?

#

If it was -a you meant, here's that (from my new pod; the path has changed but the volume is the same):

$ df -ah /mnt
Filesystem                                                       Size  Used Avail Use% Mounted on
192.168.0.10:/trident_pvc_52e296e2_adad_4faf_bde9_37e7dc8f3b9b   50T   29T   22T  57% /mnt

left scroll
#

As I said, the df -A must be run on the storage system. Your containers don't know what aggregates (the -A) are 🙂

#

so without a storage admin we cannot debug this further. Snapshots are possible, but to get 20+TiB from 80GiB of data would need a lot of snapshots and a lot of changes, so it's rather unlikely
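
A quick back-of-the-envelope check of the snapshot theory, as a Python sketch. The figures are the du and df values from the thread; the assumption that the whole data set is rewritten between every pair of snapshots is a deliberately generous worst case, not something known about this volume.

```python
import math

TIB = 1024 ** 4
GIB = 1024 ** 3

live_data = 86 * GIB        # du figure from the thread
reported_used = 28.2 * TIB  # df figure from the thread

# Space that would have to be pinned by snapshots rather than live files.
snapshot_space = reported_used - live_data

# Assumption: the entire data set is rewritten between two snapshots,
# so each snapshot pins at most one full copy of the live data.
min_snapshots = math.ceil(snapshot_space / live_data)
print(min_snapshots)
```

Even in that worst case it takes a few hundred retained snapshots, each holding a completely rewritten copy of the data, to account for the gap.
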

left scroll
#

but my bet still goes to "full aggregate"

#

@steady crystal I think the link I just posted will also convince you that I didn't do a typo and that indeed both options exist 😉

red nova
#

Got it. Thanks for all the details – I'll try to catch one of the admins today so we can dig further into this.