#Disk Failure - Netapp FAS 2040 + DS4243
Have you assigned the replacement disk to the system? “disk assign 0d.01.6”
“disk show -a” if that fails
Is it a supported Netapp disk drive?
Does this system have an active support contract from NetApp, and did you order the disk from NetApp through the regular support process?
are there any spare drives available in the system?
first the orange light on the drive needs to be corrected. make sure there is at least 1 spare drive in the system. sysconfig -r should give you that information
then reseat the drive to see if the orange light is resolved
if it still stays orange, i suggest speaking to NetApp Support
Pretty sure there are approximately zero FAS2040s with support contracts right now. We stopped selling them 11 years ago
Hello again, thanks for all your help so far. First I checked our spares with sysconfig -r. We have 2 spares, both in C shelf it seems. One is not zeroed (no clue what that means). So I tried “disk assign 0d.01.6” (without the quotes), which failed for some reason. disk show -a shows that 0d.01.6 is not listed.
Next step I will do is unplug the disk and plug it back in
2nd issue:
maybe 0d.01.9 (prefailed) will die soon as well?
I read that I need to unfail it before I can change the owner or assign it. I can't find the exact order of steps to unfail / reset it.
What is the exact part number (X???? with some digits) of the replacement disk? I have seen the "status: 3" thing (or similar errors) when the disk you're trying to assign is an SED drive and your system uses non-SED disks
@near drum probably best to keep going in here instead of DM. What timezone are you in?
@dry oracle GMT +2 Yeah I will do that 🙂
Is there a way to bring the right FW onto that disk, unfail the slot and assign it, or similar? If yes, could you give me the right order to do it in the shell please? I think the SATA disk is just a default one without any NetApp pre-config / FW. We noticed it's a very, very old NetApp. We have several and most of them are still supported. However, this one is kind of too old, but it would be so nice if there were a way to use this disk. I am also curious to learn something new here.
Hitachi 2Tb HUS724020ALA640 64Mb Cache 7200Rpm Sata III 3,5"
There is no way to take a non-netapp branded disk and flash with netapp firmware and reformat. You must buy identical replacements
So the disks you've added which weren't working - did you purchase them in NetApp caddies, or did you move them into existing caddies yourself?
@near drum I am able to assist for a few minutes
@dry oracle I am here
Okay alright. Thanks again for your info. Then I need to check for a different vendor.
there's your problem, these are generic drives. Generic drives do not work with ONTAP* (there are exceptions, but in general that's the issue)
Any recommendations? And what will be the right steps once I have a new one?
finding records from when your system was sending autosupport, it had X306 2TB drives in the shelf and X298 1TB drives in the internal shelf
You should look for those part numbers for replacements
X306 sounds familiar to me. I need to check for the shell command to see all others
the reason I specify those exact parts is that some newer OR (not necessarily AND) larger drives require later versions of ONTAP for various reasons. By replacing like with like, you will likely not have problems
sysconfig -a will list models
disk show -an, then disk assign for the slot
ok, no need to unfail it, right? or was that for a different FAS?
lol, here's me 5 years ago providing exact instructions for a FAS2040 - https://community.netapp.com/t5/ONTAP-Hardware/Disk-Replacement/td-p/140362
you might need to unfail it. Reason being when SATA drives are "sanitized", they will be left in a "failed" state.
then they get sold that way
polite sellers unfail, just like how when you used to rent a video cassette you were supposed to rewind at the end 😉
Alright. I think unfail is not even a command. When I tried it last time, the controller told me that no such command exists
it is in advanced priv
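The steps discussed so far (list the disks, assign the new one, and unfail it only if it turns up in a failed state) can be sketched as an ordered command list. This is a minimal sketch, not a verified procedure: the helper name `assign_and_unfail_plan` is hypothetical, the ordering follows the advice in this thread, and `priv set advanced` / `disk unfail` are the 7-Mode commands as I understand them, so check them against the documentation for your ONTAP release before typing anything on the console.

```python
def assign_and_unfail_plan(disk: str, needs_unfail: bool = False) -> list[str]:
    """Return the console commands, in order, for bringing in one replacement disk.

    `disk` is the disk name as shown by the filer, e.g. "0d.01.6".
    Nothing is executed here; the list is only meant to be read or printed.
    """
    plan = [
        "disk show -an",         # as quoted in the thread: check the new disk is visible
        f"disk assign {disk}",   # give the disk to this controller
    ]
    if needs_unfail:
        plan += [
            "priv set advanced",     # assumption: unfail lives at advanced privilege
            f"disk unfail {disk}",   # clear the failed state left by sanitizing
            "priv set admin",        # drop back to normal privilege afterwards
        ]
    plan.append("sysconfig -r")      # confirm the disk now appears as a spare
    return plan


if __name__ == "__main__":
    for cmd in assign_and_unfail_plan("0d.01.6", needs_unfail=True):
        print(cmd)
```

Keeping the unfail step optional mirrors the "you might need to unfail it" advice: a properly prepared replacement should only need the assign.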
buy three replacements and then once you have the two spares it will fail it, or you can do a "disk copy" to move from the prefail drive to the new replacements
I can't quite remember the logic on 8.1. Others may correct me
disk copy is always preferable to disk fail since it doesn't need to recompute RAID
Your system was shipped 21 November 2012. Last orders for FAS2040 were 02 November 2012, so you may have one of the last ones sold
Okay, first we will fix the "broken" one. Then I am allowed to exchange the prefail one, right?
yes, but recommend to copy, not just yank
so I use the copy command to copy the prefail stuff to one of the hot spares?
yes
then I exchange the prefail disk, and so we get a new hot spare
okay perfect, I understand the logic. sorry for all the questions, I am kind of new to NetApp stuff
no problem, it's important to understand consequences of actions for storage systems 🙂
could you have a look at our spare situation very quick pls
what particularly?
Ok did that.
any disk not part of an aggregate will be used for a spare
disk zeroing takes some time, depending on system load, disk size etc. I'd expect it'll be done in 2 hours maximum though
Could you give me a quick example of copying 0d.01.9 to 0c.00.1?
does it even make sense?
from d to c. Looks like I only have 2 hot spares, in C shelf
do you want at least 1 spare for each shelf?
we bought 2 x306A, so we exchanged 0d.01.6 successfully. It did reconstruction (which took half a day)
How should I proceed with the 0d.01.9 (prefail) disk?
I have no spares for d shelf, right?
@dry oracle It would be awesome if you could have a look once again. You helped me a lot recently
ok, so your system is using 2TB drive 0d.01.11 as a 1TB drive in aggr0. You need to replace 0d.01.11 with new X306 1TB 0c.00.1, then when 01.11 is spare, zero it, and when it's a spare execute a copy from prefail drive 0d.01.9
you want the command "disk replace start 0d.01.11 0c.00.1"
ping @near drum
Thank you. I will do that right now. Let's see how long the replace will take.
It would be nice if you could send me the zero and copy commands as well.
sure, when that's done "disk zero spares"
and then "disk replace start 0d.01.9 0d.01.11"
I'm guessing it'll be in the order of 3 hours for the first replacement and 6 for the second
but it depends on system load etc. I think there's a "disk replace status"
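Put together, the three commands quoted above form a two-stage swap: free up the 2TB disk that is being used as a 1TB disk, zero it, then copy the prefail disk onto it. A minimal sketch of that ordering follows. The helper name `prefail_swap_plan` is hypothetical, and using `sysconfig -r` to watch progress between stages is my assumption rather than something confirmed in this thread.

```python
def prefail_swap_plan(misused: str, small_spare: str, prefail: str) -> list[str]:
    """Return the console commands, in order, for the two-stage swap.

    misused     -- disk doing duty in the wrong aggregate (here 0d.01.11)
    small_spare -- spare that takes over from it (here 0c.00.1)
    prefail     -- disk flagged as prefailed (here 0d.01.9)
    Nothing is executed; the list is only meant to be read or printed.
    """
    return [
        f"disk replace start {misused} {small_spare}",  # stage 1: copy data off the misused disk
        "sysconfig -r",                                 # assumption: check until the copy is done
        "disk zero spares",                             # zero the freed disk so it is a ready spare
        f"disk replace start {prefail} {misused}",      # stage 2: copy the prefail disk onto it
        "sysconfig -r",                                 # assumption: check again until finished
    ]


if __name__ == "__main__":
    for cmd in prefail_swap_plan("0d.01.11", "0c.00.1", "0d.01.9"):
        print(cmd)
```

Each stage has to finish before the next one starts, which is why the check sits between them.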
righto. Good luck
I just want to mention: after the successful replacement of 0d.01.11 by 0c.00.1, 0d.01.11 was a hot spare for a split second and was then chosen automatically as the replacement for 0d.01.9.
I wasn't able to zero it beforehand, and the copy started on its own. Hopefully it's doing a good job anyway.
Assuming it finishes successfully:
Could you please tell me how I get a new hot spare for d shelf? We still have a disk in our IT warehouse. Could I just put it into an empty bay of d shelf and have it become a hot spare automatically? Or which commands do I need then?
Thank you again @dry oracle
Pull out the failed disk, put the new one in and do what you did for those three new ones - just disk assign them