Hi there... I looked at this command "system node migrate-root"....
Does anyone know if you have to partition your disks first? And while we are at it... is there a way to partition spare disks? We would like to migrate the root onto 24TB spinning rust... can you still do "Root-Data" and not "Root-Data-Data"? I have a very old description of how to partition disks to any size you like, but it would be nice to do it in a more "supported" way.. 🙂 (this is on a 9.16.1P5 FAS50 system)
#Migrating root...
...about the partitioning I guess you can do "disk create-partition -source-disk <old 10TB disk> -target-disk <new 24TB disk>"... but will it then just create a root partition the same size as the source and leave the rest for data? Well, I tested a bit and I am pretty sure the source and target have to be the same disk model...
Please give me some time. I have information relevant to this exact issue. I will respond when I get back to my hotel tonight
You are a champ 🙂 I figured out that spinning disks always have Root-Data.. where P1 is data and P2 is root... I just need to guesstimate the size of the root partition... I have an older FAS2600 where the root is 55176MB, but my guess is that the root size depends on the controller model that does the initial partitioning? We have 50 disks in total, so even if we did just 55176MB root partitions we could just add more disks to the root aggregate... AFAIK there is no "wrong" size here?
root-data-data was never supported for spinning rust, all you can do there is root-data
Use this KB article:
https://kb.netapp.com/on-prem%2Fontap%2FOHW%2FOHW-KBs%2FHow_to_migrate_to_Advanced_Disk_Partitioning
Step #2 tells you what size the root needs to be.
Step #4 explains how to calculate the root partition size
Setting up a FAS2600 from scratch using the "normal" methods will only partition disks in the base unit.
I have never tried to manually partition external drives on a FAS2600 and to move root there.
OK, step #2 is the size of "vol0", which actually varies between controller models... but it also more or less tells you to partition your disk to whatever suits your need... which also makes sense to me... I just thought that this was a fairly "trivial" process that NetApp could be bothered to just add to their command... like the migrate-root would be more useful if it had an option to partition the disks as part of the process...
the migrate root part is pretty straightforward, run the command and cross your fingers.
prepping for it is the hard part. esp if you have to convert to ADP.
from my experience, migrate-root fails in about 50% of cases at some point, leading to you having to do the manual process instead. That's why I still prefer the manual method (create new aggregates, set them to CFO, boot, unset recovery bootargs, reboot, done)
To be honest I did not fully read the first post. Just saw stuff about partitioning and thought I’d interject
Bottom line is this:
I never had any luck with ONTAP using manually partitioned spares for root. It generally would just fail.
Now if the system already has partitioned disks, there is a way to migrate to either partitioned disks or whole disks with the migrate-root command.
With the partitioned drives, I could go from say 4TB to 10TB (or 22TB for that matter). It would require the same number of physical drives (if I have 12 root-data at 4TB, I would need at least 12, plus a spare) to migrate using ADP
@gleaming forge do you have the answers you are looking for? Based on all the data above, it seems you should be clear
On the contrary, I have used manually partitioned disks for ONTAP root multiple times without any issues.
You don't need the same number of physical drives, just calculate how large the root partition should be with the number of new disks you have, using a supported Data, Parity, Spare count as per HWU for your controller model.
I'd give it more than 50% success rate but I have had it fail on occasion and have had to revert to the manual process, typically have had to run debug vreport show/fix.
Despite a non 100% success rate, I still think it's worth it to give migrate-root a go - if it works it works and that will save you time.
If it fails midway, I am fortunate enough to have used the old manual process so it's not that intimidating.
Yes, Erik linked to a great paper on the partitioning... I actually have an "internal" paper from 2017 describing the commands, and I was just hoping that NetApp had developed other commands that made this easier... but apparently not... as we have 50 disks we will split them 25/25 between the nodes and set up one RAID-TEC with 24 disks, and I think I will just add most of the root partitions into the aggr0, because I also learned that if you have too few (spinning) disks in the root aggregate it may have a performance impact, as it does logging etc. to vol0... (maybe it's just me..) anyway I will give it a go later today and let you guys know how it went 😉
OK, one quick question... the existing root aggr size is 186GB and the one shown in the example is 430GB... and as we are moving from a FAS8200 to the FAS50 would it make sense to increase this a bit? Better to be safe than sorry?
I had a look at HWU, where it states that with 24 disks in a system, the config would be a root partition of about 23GB with a RAID-TEC 8d3p setup. This comes close to what we actually have now, so I think I will go with this...
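A rough sanity check of that number, as a sketch only: it assumes the root aggregate's usable size is simply the per-disk root partition size times the number of data disks, and ignores WAFL reserve, right-sizing and any other overhead that the HWU figures would account for. The function name is made up for illustration; the 186GB and 8d3p values are the ones quoted in this thread.

```python
def root_partition_size_gb(root_aggr_size_gb: float, data_disks: int) -> float:
    """Per-disk root partition size so that data_disks partitions add up to the aggregate size.

    Simplification (assumption): usable size = partition size * data disks,
    with no WAFL reserve or right-sizing overhead.
    """
    if data_disks <= 0:
        raise ValueError("need at least one data disk")
    return root_aggr_size_gb / data_disks


# Values from the thread: 186GB existing root aggregate, RAID-TEC 8d3p (8 data disks)
size = root_partition_size_gb(186, 8)
print(f"{size:.2f} GB per root partition")  # 23.25 GB per root partition
```

That lands close to the "about 23GB" figure from HWU, which suggests the existing 186GB root size is a reasonable target to keep.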
Just like in the good old 7mode days I guess 😉
ndmpcopy /vol/vol0 /vol/new_vol0 😉
...I think they put a lot of "pause 1 min" into their migration script? 😉 seems to take awhile... I can follow the process where it boots up with two root aggregates... then copies one to the other... then offlines the old aggregate.. then destroys it.. then reboots... (there are actually 3 reboots in this process) So boots to create the new aggregate.. then reboots using the new aggregate as root.. then copies over the data... then reboots once again... so the most waiting time would be all the rebooting 😉
yeah it might have improved recently, I haven't tried it for a while 🙂
you don't even need that. You can start with an empty root volume and the system will recover its DB from the existing nodes in the cluster (the rest of the files will be extracted by ONTAP automatically)
not so sure about that... it boots up with both aggregates online... then starts a process where it copies from one to the other... so my guess would be some kind of ndmpcopy like in the old days 😉 anyway it worked... now on to the other node...
hm, I think it only unpacks the rootfs.tgz (or whatever it's called) and then re-syncs the cluster DB. In any case, it's all automatic, all you'd have to do is to unset the recovery bootargs.
But good that you got it to work, I know it can be a bit stressful doing that on a production system 🙂
it was more stressful to do the "headswap": when I attached the old disks while in maint-mode, "aggr status" showed a failed data aggregate... after a few minutes it changed to online... guess it's because of the high number of disks and flexcache etc.. 🙂
other node also worked as expected 🙂 3 clusters to go 😉
Sweet! That kb above is fantastic. The old one that I used to use was removed during an update. It was not nearly as easy to follow as this one here
Maybe we need to start our own NetApp wayback-machine site? 😉
I was connected to the service-processor while this was migrating.. along the way I got this scary message 😉
`SHA256 checksum failure: varfs.tgz
SHA256 checksum failure: oldvarfs.tgz
* ALERT: SHA256 checksum failure detected in boot device *
* Contact technical support for assistance. *
ERROR: netapp_varfs: SHA256 checksum failure detected in boot device. Contact technical support for assistance.`
...but eventually completed OK 😉
Yeah
I tell everyone if they monitor the console/sp you may see lots of scary stuff. Just let it go
hmm, having an issue where the cluster seems to revert my new partitions automatically...... very weird... I can create them with "disk partition....." and then assign them as root and data.. after some time they are unassigned and unpartitioned... just like that... autoassign is disabled... any clue?
We are on 9.16.1P3... trying to update to P10...
Hmmm while I am waiting I found this: storage raidlm policy modify -node s01-01 -policy-name auto_unpartition_on_spares_low -policy-type Shared-Disk -is-enabled false
think this is the issue...
...would have been nice if they had just mentioned this in the log...
hmmm that didn't do the trick... actually scripted most of the commands... so I ended up running the migrate-root command... but it then halted when it rebooted and was unable to find the assigned disks, I was able to just do a boot_ontap and we are back to where we started...
...most likely a NetApp support case...
Here are the logs showing this strange issue...
(reads bottom up)... the two partitions are assigned..., then 30 secs later the disk is unpartitioned again by the system... weird...
As long as I don't assign the partitions, nothing happens... it's only after they are assigned to this node... (na12)... I already managed to do this on the other node with no problems...
Only thing that I can see is that na12 already has other partitions on it, with a different size... which I think is the reason I get this warning when I create the partitions...
na12*> disk partition -n 2 -i 2 -b 6127616 3d.10.0
WARNING: The specified root partition size 6127616 does not match with the current root partition size. This can create issues with future partition management.
Do you want to abort this operation (y/n)? n
disk partition: 3d.10.0 partitioned successfully
It is an ONTAP version thing. I noticed that too for a short while. I think if you can upgrade, that bad behavior stops. However, on FAS units, it is fine to leave the whole drives anyway. That way, if a whole drive fails, it is replaced with a whole drive, but if a partitioned drive fails, ONTAP will auto-partition as needed and use it
Upgrading to 9.16.1P10 didn't fix it. We don't want full drives for this, because a lot of space is lost on a 24TB drive 😉
Please re-explain why you want partitioned drives and do not want whole drives. There is a lot of tangential material above. Just want to focus in
And if you wish to migrate root from older smaller partitioned drives to newer larger drives and keep partitions, I have an answer for that
Yeah I'm sorry, we actually need to do this procedure on two different systems.. so we did it without any issues from a FAS8200 (with DS212C shelves) to a FAS50 (with DS460C shelves).. But this specific one is/was a FAS2620 single-node cluster, also with DS212C shelves... The main reason we need to partition the disks is space... we have 50 x 24TB disks, so 25 disks for each controller.. we already completed this on one of the controllers (the new one we added to the single-node cluster) but we are having issues with the original node... I can only guess that it's because it is already using partitioned disks for its root (but on the old disks) and the partitions are not the same size on the old disks and the new 24TB disks, mainly because I followed the guide to manually partition the disks... we opened a case with NetApp and hopefully we can find a solution on Monday.. maybe an all-manual root migration is the way to go? I guess once you create the aggregate it will not just unpartition it 😉 so a "scripted" run with the commands one after another should do the trick... but I would like to run this past NetApp support first... and maybe learn if there is an option to disable this unpartition issue.. 😉
Ok, this should work, for nodes with partitioned drives
Note number of partitioned drives in each current root aggr
Verify max raid size of root aggr doesn’t equal current number of disks. If needed increase by ONE
aggr add-disk -aggr root_01 -disk <new-BIG>
aggr add-disk -aggr root_02 -disk <new-BIG>
(On first node only)
disk create-partition -source <disk-above> -target <new BIG disk same node>
Repeat for as many disks as are in the current root aggregate (identified above)
system migrate-root -node node -disk <list-of-BIG disks>
Repeat for second node.
What’s going on?
The first step is making the root aggr larger by one drive. ONTAP should not NEED the drive but since the root is partitioned it will auto partition the new root to match and then auto create a full size data partition.
Then you should be able to use the create partition command to have the drives stick.
Then you migrate to a new root with the same number of root disks as the original
When you are done the original root should be deleted and the one you added will be the partitioned spare
Make sense?
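The steps above can be sketched as a small script that only builds the command strings, so they can be reviewed before being pasted into the cluster shell. This is a sketch and not a verified procedure: the command spellings are copied from the post as written, the function name and the node, aggregate and disk names are made up, and the comma-separated disk list for migrate-root is an assumption.

```python
from typing import List


def migrate_root_plan(node: str, root_aggr: str, seed_disk: str,
                      new_disks: List[str]) -> List[str]:
    """Build the per-node command sequence from the steps above (nothing is executed).

    seed_disk is the new big disk added to the existing root aggregate so that
    ONTAP partitions it; new_disks are the spares that get its layout copied.
    """
    if not new_disks:
        raise ValueError("need at least one target disk")
    # Step 1: grow the current root aggregate by one new disk
    cmds = [f"aggr add-disk -aggr {root_aggr} -disk {seed_disk}"]
    # Step 2: copy the partition layout of that disk onto each new spare
    for target in new_disks:
        cmds.append(f"disk create-partition -source {seed_disk} -target {target}")
    # Step 3: migrate root onto the newly partitioned disks
    # (comma as list separator is an assumption)
    cmds.append(f"system migrate-root -node {node} -disk {','.join(new_disks)}")
    return cmds


# Hypothetical names, for illustration only
for cmd in migrate_root_plan("node-01", "root_01", "1.10.0", ["1.10.1", "1.10.2"]):
    print(cmd)
```

Printing the plan instead of running it keeps a human in the loop, which seems sensible given how often migrate-root is reported to need manual recovery in this thread.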
neat trick with the extra disk for copying the partition 👍
I usually just take the "phys blks" (last column) from sysconfig -r and divide by 8 (or take the rawsize= value from raid_config info listdisk in node shell if I'm lazy) as the partition size for manual partitioning
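The divide-by-8 arithmetic can be sketched like this, assuming "phys blks" counts 512-byte sectors and the partition size passed with -b counts 4 KiB blocks. Neither unit is stated in the thread, but the -b value 6127616 used earlier works out to 23936 MiB at 4 KiB per block, which is close to the 23928 root size quoted for the other node. The function names and the sample phys blks value are made up for illustration.

```python
SECTOR_BYTES = 512   # assumption: "phys blks" are 512-byte sectors
BLOCK_BYTES = 4096   # assumption: -b takes 4 KiB blocks


def phys_blks_to_partition_blocks(phys_blks: int) -> int:
    """Convert a sector count into 4 KiB blocks (the divide-by-8 step)."""
    return phys_blks // (BLOCK_BYTES // SECTOR_BYTES)


def partition_blocks_to_mib(blocks: int) -> float:
    """Size in MiB of a partition given in 4 KiB blocks."""
    return blocks * BLOCK_BYTES / (1024 * 1024)


# 49020928 is a hypothetical phys blks value chosen to match the thread's -b value
print(phys_blks_to_partition_blocks(49020928))  # 6127616
print(partition_blocks_to_mib(6127616))         # 23936.0
```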
I’m not taking credit! Credit to Scott Bell.
The trick works in aff also
I’ve also got the trick to migrate older SSDs to newer ones, even larger.
The issue there is it is absolutely a one for one. Can’t use ssd > 4T (7.68 are not supported in flash pool). It takes a while. You’re essentially doing a replacement of each partition
that reminds me of the trick we had to use in 7-mode to create directories in /etc (like /etc/ssh) since there was no md/mkdir command in 7-mode 😄
OK this aggregate is the one I did on the "working" node... I followed the description way at the start of this thread from Erik.. the root was migrated without any issues...
I would like to use the same size partitions on the other node... right now it looks like this...
I have about 12 empty partitions of the size 55176 (on the old disks) and this na12 is also still on the old disks... I also have 25 spare 24TB disks...
If I do the partitions they will "stick" as long as they are still unassigned, but a short time after I assign the partitions, they become unassigned and unpartitioned by the system... I just would like to disable this "feature"...
So I would rather not create new partitions at the size of the "old" partitions...
I could of course give it a try and partition a disk with the same P2 as the existing ones, just to see if it "sticks"...
The details I posted should work. I got them from an engineer at NetApp.
But I would end up with the larger size root partitions (55176) and not the small size (23928) ?
(I am waiting for NetApp to respond to the support case I created)
One thing I was thinking was to create partitioned disks on the node with the smaller partitions... then only assign the root to the "problem node" but leave the data partition on the opposite node... that way I hope it will not start unassigning and unpartitioning the disk... (I could even create a small temp aggregate with the data partitions) then when the root has been moved, I can reassign the disks back...
But we are getting closer to doing things which might break ... so I will run this past NetApp support first 😉
Update: I think I solved it by assigning the data partitions to the opposite node... at least for now the root partitions are staying... now I just need a service window to run the migrate operation...
Great success 🎉 my OCD conquered it all and we have same size partitions on both nodes now... set up new aggregates with encryption and started moving volumes....
Thanks again for all your suggestions!
woohoo! congrats
i got a project coming up. going to need to convert from 10T ADP disks to 22T ADP. Going to document and make public