#NetApp does not recommend using the All volume tiering policy with primary data?

subtle edge
#

I have questions about the following two statements on the All tiering policy, which are excerpted from https://www.netapp.com/pdf.html?item=/media/17239-tr-4598.pdf

"Note: NetApp does not recommend using the All volume tiering policy with primary data (read/write volumes)."

Why is that?

2) Data in volumes using the All tiering policy (excluding data ineligible for tiering) is immediately marked as cold and tiered to the cloud as soon as possible

Blocks will be IMMEDIATELY marked as cold and will be tiered ASAP, which does not mean immediately. What determines exactly when blocks get tiered? Until then, do the blocks stay in primary storage, even temporarily?

3) Is there any way (AIQUM, for example) to find out how much load tiering would put on the node?
ivory kestrel
#
  1. Because tiering is designed only for cold data. If you do enable it and are OK with 50+ ms of latency, then you can. But don't complain when it is slow.
#
  2. It is tiered as it is written, via a temporary log buffer on disk, but it really is tiered almost instantly (in a Consistency Point, I believe).
#
  3. CPU load? The background scans consume a lot of CPU but yield to other workloads very well, so CPU utilization won't be an accurate measure. Tiering should be transparent to the load you have now, but the CPU may look higher. Tiering scanners (wafl scan status) generally consume a lot.
solemn creek
#

for 2) it will be tiered on the next run of the tiering scanner, which might even take a few minutes (you can check if the scanner is running with volume object-store tiering show, and you can even trigger the scan to run immediately)

#

and for 1), it is a recommendation only. If you have a fast SSD-based on-premises S3 (e.g. one of the orange boxes 😉 ) you might not even notice any huge performance impact for CIFS volumes, for example. But in general you want to keep hot data on the system if possible

subtle edge
#

It is interesting to know the command. As shown below, we have SSD and also StorageGRID, and the scan took 3963s = 66 minutes on this volume. I also saw it take much longer on the other volume. Could it have an impact on performance? Is there any way to find out whether it really does?
::*> volume object-store tiering show -volume volume_name -instance

                              Vserver: nfs-vserver
                               Volume: volume_name
                            Node Name: node1
                          Volume DSID: 35742
                       Aggregate Name: ssd_aggr1
                                State: waiting
                  Previous Run Status: completed
             Aborted Exception Status: -
           Time Scanner Last Finished: Thu Mar 28 11:51:45 2024
             Scanner Percent Complete: -
                  Scanner Current VBN: -
                     Scanner Max VBNs: -
  Time Waiting Scan will be scheduled: Fri Mar 29 11:51:44 2024
                       Tiering Policy: all
 Estimated Space Needed for Promotion: -
                    Time Scan Started: -
               Cloud Retrieval Policy: default
             Elapsed Time Scanner Ran: 3963s
     Time of Last Space Related Error: -
          Scheduling class of scanner: -
      Object Format Revert Scan State: -

space error code encountered in last scan: -
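
The elapsed-time arithmetic above can be checked with a short script. This is only a minimal sketch, assuming the `-instance` output has been saved as plain text in the `Field: value` layout shown; the helper names `parse_instance_output` and `scan_window` are made up for illustration and are not part of any NetApp tooling.

```python
from datetime import datetime, timedelta

# Three fields copied from the output above; a full paste works the same way.
SAMPLE = """\
Previous Run Status: completed
Time Scanner Last Finished: Thu Mar 28 11:51:45 2024
Elapsed Time Scanner Ran: 3963s
"""

def parse_instance_output(text):
    """Split 'Field: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, because timestamps contain colons.
        name, sep, value = line.partition(":")
        if sep:
            # Strip padding and any stray '*' emphasis markers.
            fields[name.strip(" *")] = value.strip(" *")
    return fields

def scan_window(fields):
    """Return (start, end, elapsed) for the last completed scan."""
    # %a and %b expect English day and month names (the default C locale).
    end = datetime.strptime(fields["Time Scanner Last Finished"],
                            "%a %b %d %H:%M:%S %Y")
    elapsed = timedelta(seconds=int(fields["Elapsed Time Scanner Ran"].rstrip("s")))
    return end - elapsed, end, elapsed

fields = parse_instance_output(SAMPLE)
start, end, elapsed = scan_window(fields)
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"scan ran {minutes} min {seconds} s, from {start} to {end}")
```

The start time is derived by subtracting the elapsed time from the finish time, since `Time Scan Started` is shown as `-` once the scanner is back in the waiting state.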

uneven urchin
#

FabricPool on AFF with "tiering all" has replaced some of our FAS systems. We do lose some dedupe with this setting though.
But it can be really fast. We are however running a local SG appliance cluster.
Unfortunately I don't have any latency numbers to share right now

subtle edge
#

Why did Elapsed Time Scanner Ran show 66 minutes, as above, if there is no performance impact or it is really fast? How can you show that it really had no performance impact?
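
One possible way to approach this question is to compare volume latency sampled while the scanner was running against latency sampled outside that window. The sketch below assumes you already export (timestamp, latency) pairs from whatever monitoring you use; that sample format and the function name `compare_latency` are assumptions for illustration, not an ONTAP or AIQUM interface. The scan window is taken from the output above: the finish time minus 3963 seconds.

```python
from datetime import datetime, timedelta
from statistics import median

def compare_latency(samples, scan_start, scan_end):
    """Return (median during scan, median outside scan).

    samples: iterable of (datetime, latency) pairs from your own monitoring
    export (hypothetical format). A median is None when no samples fall on
    that side of the window.
    """
    samples = list(samples)
    inside = [lat for ts, lat in samples if scan_start <= ts <= scan_end]
    outside = [lat for ts, lat in samples if not (scan_start <= ts <= scan_end)]
    return (median(inside) if inside else None,
            median(outside) if outside else None)

# Scan window from the output above: finished 11:51:45 after running 3963 s.
SCAN_END = datetime(2024, 3, 28, 11, 51, 45)
SCAN_START = SCAN_END - timedelta(seconds=3963)
```

Similar medians inside and outside the window would suggest the scan was not felt by clients, although one scan is a small sample and the workload itself may differ between the two periods.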

subtle edge
#

We had 2 volumes with tiering policy = all to begin with. We noticed that when we copied about 30TB of data to each volume, more than 20TB of it could not be tiered to StorageGRID right away and stayed in the performance tier. It took 10 more days to finally get tiered.

So it is clear that data with tiering policy = all (maybe because of the large amount of data) won't be tiered immediately and can use the performance tier as a staging area. It is not all stored in a "temporary log buffer on disk", and it is not all "tiered almost instantly (in Consistency Point I believe)" as said above.
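
To put that delay in perspective, the figures above imply a rough average tiering rate. This is a back-of-the-envelope sketch, assuming decimal terabytes and that about 20 TB drained evenly over 10 days; both are assumptions, since the thread does not say how the rate varied over time.

```python
SECONDS_PER_DAY = 86_400

def average_rate_mb_per_s(terabytes, days):
    """Average throughput in MB/s, using decimal units (1 TB = 1,000,000 MB)."""
    return terabytes * 1_000_000 / (days * SECONDS_PER_DAY)

# About 20 TB left on the performance tier, tiered over roughly 10 days.
rate = average_rate_mb_per_s(20, 10)
print(f"about {rate:.0f} MB/s on average")
```

That works out to roughly 23 MB/s per volume, which is only an average over the whole period and says nothing about bursts or idle stretches in between.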

ivory kestrel
#

Thanks for clarifying David. I was under the impression it was instant in the TLOG, but I guess if there is a large amount it is staggered.

subtle edge
#

Yeah, it was not clearly explained in any documents, not as far as I could find.

subtle edge
#

This is a very helpful function. @solemn creek Thanks for sharing!

uneven urchin
#

What ONTAP version are you running and where are you tiering to?
In my experience, it starts tiering data more or less instantly when using the all policy