Is there a hidden "hack" that can be used to create a volume larger than 100TB on a system running ONTAP 9.11.1? (and yet it cannot be upgraded)
We are trying to snapmirror a volume which is a bit larger than 100TB, but ONTAP complains about this.. I was hoping there was an unknown option that could be used? Or maybe snapmirror to a FlexGroup?
#Creating a volume larger than 100TB on 9.11.1?
`setenv bootarg.init.large_lun_enabled true` should work
Do you see this in vol modify?
-is-large-size-enabled
But I think it's only possible beginning with 9.12.1
the bootarg works from 9.8 on. We tested this in our lab. And even though it only mentions LUNs, it also increases the maximum possible size of the volumes to 300T
it's not officially supported of course 😉
Ah nice
and you have to disable it again before upgrading to some recent ONTAP version, I think it's the 9.14 or 9.15 pre-check that tests this flag and has you remove it before the upgrade (because by then the large volumes are enabled by default)
I was thinking of converting the source into flexgroup...
Umm... i'm pretty sure this is not supported for NAS pre 9.12.1
I think you could get it supported via PVR back then. It definitely does work
i said supported.
but it complains about the volume being too full.. we have about 8TB available on the aggregate where it is located... 107TB large volume... is there a guideline on how much space it takes to convert?
I will give this a go, and let you know if it works..
I don't remember if that particular option needs a reboot or not though. Some bootargs take effect immediately but I think for this one you need a reboot/takeover-giveback
I am already in the LOADER prompt 😉
doh.. forgot to do a "saveenv" before boot_ontap... oh well I think it should do the trick anyway
it should still save the variables on a successful boot though
I still think we should convert this to flexgroup as it will most likely expand to 2-300TB.. this 9.11 system should only hold a backup for 4 months... and then we will snapmirror it on to a "newer" cluster... 🙂
You are a legend!...
`FS135-DKAAR1::> volume size -volume test -new-size 120t
vol size: Volume "FS05-BACKUP:test" size set to 120t.`
but on the other hand... are there any guidelines on space needed for a conversion? I thought it just added another set of volumes to the flexgroup and then spread the data around? maybe I should look into doing this manually, i.e. creating the underlying volumes myself?
BTW. this channel is much better and faster than NetApp Support 😉 let them take care of the wafl-iron stuff 😉
you cannot spread the data around, if you do the ONTAP conversion, you will end up with a flexgroup that has 1 constituent that is almost full, and 3 (or whatever number, N-1) constituents that are empty
well to be on the safe side, you could still try and get that PVR approved for the option, just in case anything breaks (it shouldn't -- 9.11 is almost 9.12 where it was officially supported)
Yes but I am pretty sure you can "balance" the volumes
yeah but there are lots of caveats for that (e.g. no snapshots)
So.. of course we are low on space... so we have an aggregate with this 100TB volume.. then 4 other aggregates with 30TB+50TB+10TB etc.. does the conversion require x-times 100TB volumes?
hm, good question, that's something I can't answer off the top of my head
..and why does it complain about space `fs07-dkaar1::*> volume conversion start -vserver FS05-DKAAR1 -volume BG3DLOG_Archive -check-only true
Conversion of volume "BG3DLOG_Archive" in Vserver "FS05-DKAAR1" to a FlexGroup can proceed with the following warnings:
- After the volume is converted to a FlexGroup, it will not be possible to change it back to a flexible volume.
- Converting flexible volume "BG3DLOG_Archive" in Vserver "FS05-DKAAR1" to a FlexGroup will cause the state of all Snapshot copies from the volume to be set to "pre-conversion". Pre-conversion Snapshot copies cannot be restored.
- The volume is nearly out of available space. Converting this volume to a FlexGroup might lead to "No space left on device" errors.
- Converting the volume to a FlexGroup will not add additional resources for capacity. After converting, use the "volume expand" command to add resources.`
It's the 2nd to last one I do not like 😉
because of the inherent issue of FlexGroups: If your constituents each have 1 gig of free space, and you have 10 constituents, a "df" will show 10*1 = 10 gig of free space in your flexgroup, but you can not write a single file larger than 1gig to it
so if your first constituent only has 100g free, and your second constituent has 3tb free, the chances are that the system will put a new file on the "full" constituent which then limits it to 100g
the flexible-sizing thing will alleviate that a little bit, but it's not optimal
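To put numbers on the free-space point above, here is a minimal sketch, assuming each file lives entirely inside one constituent (as described in the thread for releases before the 9.16 "granular-data" feature). The function name and the GiB figures are made up for illustration; this is plain arithmetic, not an ONTAP API.

```python
def flexgroup_free_space(constituent_free_gib):
    """Return (df-style total, best-case single file, worst-case single file) in GiB.

    Assumption: a file is stored in exactly one constituent, so its size is
    bounded by the free space of whichever constituent it lands on.
    """
    total_free = sum(constituent_free_gib)   # what "df" would add up
    best_case = max(constituent_free_gib)    # file lands on the emptiest constituent
    worst_case = min(constituent_free_gib)   # file lands on the fullest constituent
    return total_free, best_case, worst_case


# Ten constituents with 1 GiB free each: df shows 10, but no file > 1 GiB fits.
print(flexgroup_free_space([1] * 10))      # (10, 1, 1)

# One constituent with 100 GiB free, one with 3 TiB free.
print(flexgroup_free_space([100, 3072]))   # (3172, 3072, 100)
```

The gap between the first and last number is the reason an unbalanced FlexGroup can report plenty of free space and still return "No space left on device".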
ahh ok.. makes sense... this is of course "fixed" in the later versions of ONTAP where the files are "split"
yeah, 9.16 has the advanced capacity balancing ("granular-data") thing that should fix that
I am pretty sure the files on this volume are all smaller files... 1G or smaller... this is why they were not able to back it up using CommVault 😉 they gave up after a week 😉
(This CommVault setup uses CIFS for backup.. which is just terrible)
ah yeah I can imagine...
df -i shows : 144.997.396 inodes on the volume...
anyway I will get the snapmirror up and running
- Converting the volume to a FlexGroup will not add additional resources for capacity. After converting, use the "volume expand" command to add resources.
This however sounds like your plan would work. So after the conversion you have 1 constituent. You can then use volume expand multiple times with different constituent sizes and aggregates to get the layout you want. It wouldn't be optimal though but if you can't avoid it...
Yeah flexgroup is the way to go... we have 4 months to think it over until the new system is ready 😉
Hmm still get this error... I will try via the cmd-line
`FS135-DKAAR1::*> snapmirror show
                                                                       Progress
Source            Destination  Mirror  Relationship   Total             Last
Path        Type  Path         State   Status         Progress  Healthy Updated
FS05-DKAAR1:BG3DLOG_Archive
            XDP   FS05-BACKUP:BG3DLOG_Archive_dest
                               Uninitialized
                                       Transferring   0B        true    02/14 17:49:56`
I don't like that it doesn't show any bytes under progress.. I guess it's because of all the files that need to be transferred first
(inodes that is...)
The "ideal" way to do this would be to create a NEW flexgroup volume and xcp the data as a re-ingest so that it balances it across constituents. Short term pain, long term success and ability to expand somewhat endlessly.
Went through this very exercise recently, and while it took a few days to copy the data again, I ended up with something that is MUCH more balanced, and expandable to 65PB
You are right Nick, we will have a plan ready for the customer... this is "archive" data but it is used for analysis so I am not sure how much down time they can handle... But we just cannot continue to drop data into one large volume... as soon as it gets above 100TB, for some reason I don't like it much 😉
There is no "auto-leveling" functionality (yet) to rebalance across the constituents upon a conversion to a flexgroup
Well, can you get them to do an NDU to 9.16? 285TiB volumes might help
No you will have to do it afterwards... but it does work fine... might take forever though...
Sorry... only 45PiB 🙂
This is a "poor mans" setup... source spinning disks FAS2750 (NL-SAS) to a temp-system FAS2650 also with NL-SAS... 😉 we don't have the dollars to just buy TLC Flash for everything 😉
oh so your max ONTAP version is also limited to 9.11
ooph
its less about the hardware and more about that, honestly.
A lot of these newer FG features were added in 9.15/16
yeah and it gets worse. . looks like even with the larger volume/lun hack the snapmirror doesn't want to play ball... Last Transfer Error: Failed to start transfer for Snapshot copy "snapmirror.9cef06a7-eaeb-11ef-b0f0-00a098d50cca_2154198208.2025-02-14_174756". (Failed to start transfer. (Destination not supported (Operation not supported)))
yea because it appears the 2750 as source has a higher ontap version (9.12) than the destination.
so we might end up converting to FG, then balance a bit and then resize the volumes below to under 100TB, and then we can do a snapmirror... seems like a lot of work 😉
How much total data are we talking? Can you middle-man it?
just a bit over 100TB... 108TB used.. 🙂
I will give it a try with a DP instead of XDP... maybe that does the trick
Did I read earlier in the thread that this is just a backup target? Not an active dataset?
they need to have a backup of their archive dataset (which they have tried different other ways), so yes we just need to mirror this to an older system until a new system is ready in a few months time
Looks like even if you choose DP when creating your volume, and when setting up your snapmirror, it still just creates an XDP relationship type.. 🙂 just great
Yeah.. no dice on the snapmirror 😦 (Failed to start transfer. (Destination not supported (Operation not supported)))
Question... when you run the "volume expand volume blah -aggr-list aggr2" I can see that it creates a volume with the same size as the existing volume in the FG... is this thin-provisioned? Because we do not have another 100TB free space on any of our aggregates 🙂 I can see that I am able to resize the individual volumes afterwards, but I cannot choose the size of the volume I expand with...
oh man that's annoying... seems they only check the version number, and not the actual capabilities (i.e. with the bootarg set to "on")... maybe there's also a switch to override this, but if there is, I don't know about it
Yeah.. maybe... I can mention that we have a customer running 9.8 that is mirroring into our 9.16.1 cluster without any issues... maybe because the relationship was created on an earlier ONTAP 9.14 I guess? Or maybe because it's from an old version to a newer one..
I think it is because the volume is actually larger than 100TiB... SnapMirror should work in all other cases from 9.12 to 9.11. Pretty sure it checks some capability flags before starting the transfer
yeah so we are stuck with converting to FG, add another volume from another aggregate, and hope that works with expanding with a smaller volume... then balance for a bit which will be mostly moving data from one volume to the other... then when we are under 100TB, we can resize the volume so we have two volumes under 100TB and then hope that it works then 😉
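A rough back-of-the-envelope for that plan, as a sketch only: it uses the 108TB-used and 100TB-limit figures from the thread, assumes all data starts on the original constituent, and ignores blocks held by snapshots (which would not free up when files are moved). The function name is hypothetical.

```python
def rebalance_plan(used_tb, limit_tb, n_constituents):
    """Return (minimum TB to move off the original constituent, even share per constituent).

    Assumptions: all data sits on the original constituent after conversion,
    the added constituents start empty, and snapshot-held space is ignored.
    """
    if n_constituents < 2:
        raise ValueError("need at least one added constituent to move data to")
    # At least this much has to leave before the original drops to the limit.
    minimum_move = max(0, used_tb - limit_tb)
    # Usage per constituent if the data were spread perfectly evenly.
    even_share = used_tb / n_constituents
    return minimum_move, even_share


print(rebalance_plan(108, 100, 2))  # (8, 54.0)
```

So with two constituents a bit more than 8TB has to move before the original can be shrunk below 100TB, and a perfectly even spread would land around 54TB each.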
SnapMirror 101 is the dest system must be on equiv or higher version of ONTAP
And unfortunately since the dest system is a 2650, the max ONTAP supported version is 9.11
this is no longer strictly true for XDP (see here: "interoperability is bidirectional")
it was true for DP relationships though
It's not the ONTAP version difference that is causing this issue, because SnapMirror is actually supported from 9.11 to 9.16.. but of course there is a little issue with the volume size support which is causing issues here... Just to make things worse I found out that there are snaplocked snapshots on the source as well (locked for 31 days), so even if we convert to FG and balance the volumes, we will not be able to free up space to make the volume smaller than 100TB for another 30-ish days... so we are now looking for a FAS27xx which will solve our problems 🙂
Don't know about your CIFS clients but - bold move - can't you simply create another volume? And aim some clients / new workloads to a new share?
It's a single share kinda thing 😉 for an application that does data analytics...
That's what you get for mixing cutting edge ONTAP with old dull ONTAP 😉
I think the easy way is just to track down a FAS27xx
if you've set the hidden bootarg, be mindful that anything you snapmirror to needs to have it set too
even if you're not sending the large volume
One thing that would definitely work if you know the filesystem and usage well enough is to "insert" a volume in the directory tree and either use that for new growth or move some larger chunk of data to new volume... so mount a volume in /foo/data/2025 (/foo/data/2025_new), copy the data, change directory names (/foo/data/2025 to *_old and *_new to /foo/data/2025) and remove the old /foo/data/2025_old ...
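The copy-and-swap sequence above can be sketched on a plain local directory tree. This is only an illustration of the ordering: the `*_new` directory stands in for the mount point of the new volume, the helper name is made up, and on a real ONTAP system the junction would presumably be changed on the cluster side rather than renamed like an ordinary directory.

```python
import shutil
import tempfile
from pathlib import Path


def swap_in_new_directory(parent: Path, name: str) -> None:
    """Copy parent/name into parent/name_new, then swap the names and drop the old copy."""
    old = parent / name
    new = parent / f"{name}_new"      # stand-in for the new volume's mount point
    retired = parent / f"{name}_old"

    # 1. Copy the data into the new location.
    shutil.copytree(old, new, dirs_exist_ok=True)
    # 2. Swap the names: 2025 -> 2025_old, 2025_new -> 2025.
    old.rename(retired)
    new.rename(old)
    # 3. Remove the old copy once the new one is in place.
    shutil.rmtree(retired)


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "foo" / "data"
    (data / "2025").mkdir(parents=True)
    (data / "2025" / "sample.log").write_text("archive record\n")
    (data / "2025_new").mkdir()

    swap_in_new_directory(data, "2025")
    print(sorted(p.name for p in data.iterdir()))  # ['2025']
```

Clients keep using the same path throughout; only the short window between the two renames needs to be coordinated with the application.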