#how to add drives from an old ontap fas8040 to a fas8200
The cabling was OK and Config Advisor did not complain, but when looking at the disks I can see they are still owned by the old FAS nodes, to be honest just by node 2 of the FAS8040.
I put the FAS8200 into maintenance mode by halting the node and running boot_ontap maint.
There I was able to run disk assign -s unowned -f on 57 disks,
but not on the rest, about 63 more, which remained with the old node 2 of the FAS8040.
I do not understand why. I booted the node normally, waited a bit, and tried the same maintenance-mode trick on the second node of the FAS8200, but this time I could not reassign any more disks with the same commands.
How can I wipe the old FAS8040 ownership from the disks?
The remaining 63 disks are still under the old FAS system IDs.
From ONTAP in advanced mode:
CL2-FAS8200-HA-01::*> disk assign -disklist 3.24.23 -sysid 0537111119 -force
Error: command failed: Failed to assign disks. Reason: Assign request failed for disk 3.24.23 because the disk reservation is not held by
this node.
CL2-FAS8200-HA-01::*>
The disk show output from normal ONTAP:
3.24.1 - 24 1 FSAS unassigned - -
3.24.2 - 24 2 FSAS unassigned - -
3.24.3 - 24 3 FSAS unassigned - -
3.24.4 - 24 4 FSAS unassigned - -
3.24.5 - 24 5 FSAS unassigned - -
3.24.6 - 24 6 FSAS unassigned - -
3.24.7 - 24 7 FSAS unassigned - -
3.24.8 - 24 8 FSAS unassigned - -
3.24.9 - 24 9 FSAS unassigned - -
3.24.10 - 24 10 FSAS unassigned - -
3.24.11 - 24 11 FSAS unassigned - -
3.24.12 - 24 12 FSAS unknown - SC2-FAS8040-01-02
3.24.13 - 24 13 FSAS unknown - SC2-FAS8040-01-02
3.24.14 - 24 14 FSAS unknown - SC2-FAS8040-01-02
3.24.15 - 24 15 FSAS unknown - SC2-FAS8040-01-02
3.24.16 - 24 16 FSAS unknown - SC2-FAS8040-01-02
3.24.17 - 24 17 FSAS unknown - SC2-FAS8040-01-02
3.24.18 - 24 18 FSAS unknown - SC2-FAS8040-01-02
3.24.19 - 24 19 FSAS unknown - SC2-FAS8040-01-02
3.24.20 - 24 20 FSAS unknown - SC2-FAS8040-01-02
3.24.21 - 24 21 FSAS unknown - SC2-FAS8040-01-02
3.24.22 - 24 22 FSAS unknown - SC2-FAS8040-01-02
3.24.23 - 24 23 FSAS unknown - SC2-FAS8040-01-02
240 entries were displayed.
Some disks I could reallocate in maintenance mode and some I could not; the owner is the same node in every case, SC2-FAS8040-01-02.
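With 240 rows it is easy to lose track of which disks are still stuck on the old owner. Below is a minimal Python sketch, not an official tool, that groups pasted disk show rows by container type and owner. It assumes each row has exactly the eight whitespace-separated columns seen in the output above (disk, usable size, shelf, bay, type, container type, container name, owner); the function name group_disks_by_owner is made up for illustration.

```python
from collections import defaultdict


def group_disks_by_owner(disk_show_text):
    """Group disk names by (container_type, owner) from pasted `disk show` rows.

    Assumption: every data row has 8 whitespace-separated columns, as in the
    output pasted in this thread. Anything else (headers, summary lines) is skipped.
    """
    groups = defaultdict(list)
    for line in disk_show_text.splitlines():
        fields = line.split()
        # Data rows start with a stack.shelf.bay style disk name such as 3.24.12.
        if len(fields) != 8 or fields[0].count(".") != 2:
            continue
        disk, _size, _shelf, _bay, _dtype, container, _name, owner = fields
        groups[(container, owner)].append(disk)
    return dict(groups)


sample = """\
3.24.10 - 24 10 FSAS unassigned - -
3.24.11 - 24 11 FSAS unassigned - -
3.24.12 - 24 12 FSAS unknown - SC2-FAS8040-01-02
3.24.13 - 24 13 FSAS unknown - SC2-FAS8040-01-02
240 entries were displayed.
"""

groups = group_disks_by_owner(sample)
for (container, owner), disks in sorted(groups.items()):
    print(container, owner, len(disks), disks)
```

Any group whose owner is not one of the FAS8200 nodes is a candidate for ownership removal; the sketch only reports, it does not generate or run any ONTAP commands.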
Sounds like there are still entries from its aggr on the disks, which results in the disk reservation error you got. Try checking "aggr show", and if you see something that shouldn't be there, run remove-stale-record, then remove-ownership, and then assign the disks. It should work once the stale records are removed.
https://docs.netapp.com/us-en/ontap-cli-97/storage-aggregate-remove-stale-record.html#description
There are no aggregates that are not bound to the FAS8200, and I did wipe the old FAS using option 4; there were no aggregates on it.
CL2-FAS8200-HA-01::*> aggr show
Aggregate Size Available Used% State #Vols Nodes RAID Status
SA2_01_FSAS_01 61.20TB 39.88TB 35% online 13 SC2-FAS8200-01- mixed_raid_
01 type,
hybrid,
normal
SA2_01_SSD_01 15.71TB 12.60TB 20% online 7 SC2-FAS8200-01- raid_dp,
01 normal
SA2_01_SSD_02 4.58TB 3.01TB 34% online 1 SC2-FAS8200-01- raid_dp,
02 normal
SA2_02_FSAS_01 62.10TB 21.16TB 66% online 13 SC2-FAS8200-01- mixed_raid_
02 type,
hybrid,
normal
SA2_02_SSD_01 15.71TB 7.03TB 55% online 8 SC2-FAS8200-01- raid_dp,
02 normal
SA2_02_SSD_02 4.58TB 3.25TB 29% online 2 SC2-FAS8200-01- raid_dp,
02 normal
aggr0_CL2_FAS8200_HA_01_01 368.4GB 17.69GB 95% online 1
SC2-FAS8200-01- raid_dp,
01 normal
aggr0_CL2_FAS8200_HA_01_02 368.4GB 17.83GB 95% online 1
SC2-FAS8200-01- raid_dp,
02 normal
8 entries were displayed.
CL2-FAS8200-HA-01::*>
The issue is you did option 4. You should have done option 9a to remove all ownership before moving over.
There is a diag command in recent versions of ONTAP that allows you to either remove owner of drives with a particular owner or re-assign drives from one owner to another
You should go into node shell and run “disk show -a” and follow up with “disk show -n”.
While you muck around, you may want to be preemptive and do this:
disk option modify -autoassign off -node *
This prevents ONTAP from auto-assigning drives for now. Once the disk ownership is removed from the old drives, turn the option back on and it should auto-assign if things are connected properly.
Try this:
set diag
debug vreport show
If it's not clean, send the output.
CL2-FAS8200-HA-01::*> disk option modify -autoassign off -node *
2 entries were modified.
CL2-FAS8200-HA-01::*> set diag
Warning: These diagnostic commands are for use by NetApp personnel only.
Do you want to continue? {y|n}: y
CL2-FAS8200-HA-01::*> Debug vreport show
volume Differences:
Name Reason Attributes
SM2-s072-01-01:SV2_s072_axxx_01 Present both in VLDB and WAFL with differences
Node Name: SC2-FAS8200-01-02
Volume DSID:1046 MSID:2157345611
UUID: 3df49f2d-b909-11ee-887a-00a098bdb032
Aggregate Name: SA2_02_SSD_01
Aggregate UUID: 94026acf-5d07-4c97-9a87-65fd5d845088
Vserver UUID: 5e033fad-68f0-11ee-bd85-00a098bd43fc
AccessType: READ_WRITE
StorageType: REGULAR
Constituent Role: none
Buftree UUID: 00000000-0000-0000-0000-000000000000
Contained in Composite Aggregate: false
Differing Attribute: Analytics state
WAFL Value: on
VLDB Value: off
CL2-FAS8200-HA-01::*>
I tried a few variations but the result is the same:
SC2-FAS8200-01-02> disk
usage: disk <options>
Options are:
assign {<disk_name> | all | [-T <storage type> | -shelf <shelf name>] [-n <count>] | auto} [-p <pool>] {[-o <ownername>] [-s <sysid>] | [-copy-ownership-from <disk-name>]} [-c block|advanced_zoned] [-f] - assign a disk to a filer or all unowned disks by specifying "all" or <count> number of unowned disks
fail [-i] [-f] <disk_name> - fail a file system disk
maint { start | abort | status | list} - run maintenance tests on one or more disks
remove [-w] <disk_name> - remove a spare disk
replace {start [-f] [-m] <disk_name> <spare_disk_name>} | {stop <disk_name>} - replace a file system disk with a spare disk or stop replacing
sanitize { start | abort | status | release } - sanitize one or more disks
scrub { start | stop } - start or stop disk scrubbing
show [-o <ownername> | -s <sysid> | -n | -v | -a | -m | -w | -S <disk_serialno> | -c <cluster_disk_name> ] - lists disks and owners
simpull <disk_name1> [<disk_name2> [<disk_name3> ... ]] - simulate one or more disk pulls
simpush [<sim_disk_path_name1> [<sim_disk_path_name2> [<sim_disk_path_name3> ...]] | -l] - simulate one or more disk pushes or list available disks to push
zero spares - Zero all spare disks
SC2-FAS8200-01-02>
SC2-FAS8200-01-02> disk assign 0c.20.22 -o 536947288 -s 537111119 -f
Assign request failed for disk 3.20.22 because the disk reservation is not held by this node. Disk assign request failed.
SC2-FAS8200-01-02> disk assign 0c.20.22 -o unowned -s 536947288 -f
Assign request failed for disk 3.20.22 because the disk reservation is not held by this node. Disk assign request failed.
SC2-FAS8200-01-02> disk assign 0c.20.22 -copy-ownership-from 0b.02.14
Assign request failed for disk 3.20.22 because the disk reservation is not held by this node. Disk assign request failed.
SC2-FAS8200-01-02> disk assign 0c.20.22 -copy-ownership-from 0b.02.14 -f
Assign request failed for disk 3.20.22 because the disk reservation is not held by this node. Disk assign request failed.
I will try tomorrow from node maintenance mode, since I did a few disks there and it worked. So far I am stuck in a vicious loop with the rest.
There is a “disk removeowner” command. I forget whether it is at diag level in the cluster shell or in the node shell. That should work.
I fixed the issue by going back to the DC, cabling the old system back to the shelves, running boot_ontap maint, and removing the ownership from maintenance mode; then cabling the shelves back to the FAS8200, assigning the disks, removing stale records, and creating the aggregates.
I still have no good explanation for why I could reassign half of the drives owned by node 1 of the old FAS from FAS8200 maintenance mode and not the rest (who knows). Option 9 did not work on the old FAS.
On the old system you need not worry about that. The goal is simply to remove ownership. Maintenance mode should do that fine. From the boot menu, you run option 9a on each node (waiting for each run to completely finish first).
After running 9a on each node, I typically run it a third time. There is a line that is usually difficult to find that says how many disks it is working on. If that matches however many disks you have, then you are good: all disks should have the owner removed. If not, I go into maintenance mode, find the culprits, and remove the ownership there.
FYI, ran into this a few weeks ago, and looking at aggrs in cluster shell won't show orphaned/partial aggrs, but run aggr status on the local node shell and you can see the partial aggrs/partitions that didn't get cleaned up.
In my case, when I set the disks to unowned in maintenance mode (e.g. disk assign 0c.22.3 -s unowned -f) there were no aggregates, but when I assigned the disks from the new cluster, defunct aggregates bound to them appeared, so I had to clean them up with remove-stale-record -nodename -aggregate before I could create new aggregates.
When you run option 4, it assigns disk ownership and creates a root aggregate if needed. That was the crux of the problem. Running just option 9a removes partitions and ownership, clearing the way to reuse the drives.
Yes, but the problem is that option 9a did not work on the old system.
User error. It has been on the boot menu since at least 9.4. When the system was originally decommissioned, it should have run just fine. If you pulled drives and got some of them assigned to any other node, then 9a may not fully work. I would like to know more details other than "it did not work".
I would assume the old system is simply running an ancient ONTAP version that doesn't have 9a, but that's just a guess
Fas8040. Possible.
I have had similar behavior a long long time ago when I moved a shelf to a different HA-pair without powering it off. Old SCSI reservations from the old HA-pair were still in place.
As all moved disks had the same issue I could powercycle the shelf and my issue was solved.
storage release disks in maintenance mode also clears SCSI reservations (in case you don't want to walk over to the datacenter again 🙂 )... but it might not do that on foreign disks, I'd have to check that 🤔
The older versions of 9a would ONLY grab disks owned by the node it was run on, plus any unowned disks. That's why you run it on each node. The first pass grabs all of node-01's disks. The second pass grabs all of node-02's disks plus all unowned disks (the ones that used to be owned by node-01).
And this is why still today, I run a third pass and look for that line that tells me how many disks it found. If it is off, I can then go into MAINT mode and force ownership if needed.
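To make the pass-counting argument concrete, here is a toy Python model of the behavior as described in the posts above. It is only a sketch of that description, not how ONTAP actually implements 9a; the function name run_9a_pass, the disk names, and the node names are all hypothetical.

```python
def run_9a_pass(disks, node):
    """Toy model of one older-style 9a pass, as described in this thread.

    `disks` maps disk name -> owner (None means unowned). The pass works on
    disks owned by `node` plus unowned disks, clears their ownership, and
    returns how many disks it worked on.
    """
    touched = [name for name, owner in disks.items() if owner in (node, None)]
    for name in touched:
        disks[name] = None
    return len(touched)


# Clean case: every disk belongs to one of the two nodes of the HA pair.
disks = {"d1": "node-01", "d2": "node-01", "d3": "node-02", "d4": "node-02"}
counts = [run_9a_pass(disks, node) for node in ("node-01", "node-02", "node-01")]
print(counts, "of", len(disks))

# Problem case: one disk is owned by a system that is not part of this HA pair.
foreign = {"d1": "node-01", "d2": "node-02", "d3": "some-other-node"}
foreign_counts = [run_9a_pass(foreign, node) for node in ("node-01", "node-02", "node-01")]
culprits = [name for name, owner in foreign.items() if owner is not None]
print(foreign_counts, "of", len(foreign), "culprits:", culprits)
```

In this model the third pass reports the full disk count only when nothing is owned by a foreign system, which is the check described above; a shortfall points at disks that need to be handled by hand in maintenance mode.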
Why not reconnect the disk shelves to the old controller?
Create a console session to both nodes.
boot_ontap maint
Then run aggr status.
You should see root aggregates from the previous cluster; take the former root aggregates offline with aggr offline and then remove them with aggr destroy.
After that you can run disk unpartition all, followed by disk remove_ownership all.
Do that on both nodes and you should have your disk shelves ready for reuse.