#Replacing boot media issues...

wild mist
#

I am trying to replace the boot media on an FAS2700 with the boot media from an FAS2600 (same size). This is a "bare install"... I am able to install ONTAP via option 7 in the boot menu, and I get ONTAP up and running... but on the first reboot I get this error:
The boot device has been replaced or is seriously corrupted. Use option(7) to install the image and option(6) to restore the system configuration, then reboot the node to restart NVRAM subsystem. Normal Boot is prohibited.
(I then get the boot menu)
I have of course tried option 6, which ends up at the same menu after it reboots. I have also used option 7 a few times, so both images have ONTAP 9.16.1 now... it installs OK, yet after the first reboot it cannot boot. My guess is that there are some options set on the boot media? TPM? I get some TPM messages while booting... I have also tried "LOADER-A> set-defaults;saveenv", yet I get stuck here anyway... am I missing something? Oh, and this is a "new install" as a single node cluster.

#

I will try a "tpm reset" to see if that makes any difference...

hybrid marten
#

Did the Option 6 work or did it print any errors? Also I don't think single node systems are even supported anymore with the 27xx series (although that should not matter for your problem)

wild mist
#

Single node is supported... I have tried it before... so that is not it... option 6 boots, and reboots a few times, but ends up at the same boot menu with the same error

#

`Use your web browser to complete cluster setup by accessing
https://10.10.10.111

Otherwise, press Enter to complete cluster setup using the command line
interface:

Do you want to create a new cluster or join an existing cluster? {create, join}:
create

Do you intend for this node to be used as a single node cluster? {yes, no} [no]:
yes

Step 1 of 5: Create a Cluster
You can type "back", "exit", or "help" at any question.

Enter the cluster name:`

#

I guess the "tpm reset" is only available when the cluster is created...

hybrid marten
#

Yeah but when it reboots it should say something like "restoring config... Success" or something like that

wild mist
#

There is this: mount_msdosfs: /dev/ada0s2: Invalid argument

hybrid marten
#

After the first reboot and before the second one IIRC

wild mist
#

`test::security tpm*> reset -node test-01

Warning: This command will permanently clear the TPM and reset it with new
keys. Any data secured using the TPM will be permanently lost.
Do you want to continue? {y|n}: y`

#

booting now...

#

I get this at the start: Can't find backup boot device u0a.1

#

I also get this as it boots:
: [TPM-LOG] netapp_tpm_init: tpm_undark:true
: [TPM-LOG] netapp_tpm_init: UseTPM: true
: [TPM-LOG] netapp_tpm_init: /dev/tpm exists
: [TPM-LOG] netapp_tpm_init: Begin Mon Feb 17 11:29:29 UTC 2025
: [TPM-LOG] netapp_tpm_init: Using /dev/tpm
: netapp_tpm_init: starting at Mon Feb 17 11:29:29 UTC 2025
SSAL: tss_tpm_nvread:976 ...

#

pretty annoying 😉

#

is there a way to totally nuke the boot media? without taking it out of the controller 🙂

hybrid marten
#

option 7 from netboot totally wipes the boot media

wild mist
#

I think it's the "Can't find backup boot device u0a.1" that is causing issues... I will try to set up the cluster and then do an ONTAP update

hybrid marten
#

but without the full log of what option 6 does it's hard to diagnose

hybrid marten
#

in any case, option 6 should resolve that, if it doesn't then it should print an error but without logs it's hard to troubleshoot

wild mist
#

I tried Netboot, but it kept complaining about the file

hybrid marten
#

this is what option 7 from netboot should look like. Note the lines "Partitioning of boot device complete" and "new filesystem created" -> it wipes everything, even the partition table

#

it also installs ONTAP in both boot slots (primary and backup), whereas option 7 from an installed system only wipes/overwrites the alternate slot and switches it to active
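A captured console log can be checked for these markers automatically. The following is a minimal sketch, not a NetApp tool: the function name `summarize_install_log` is hypothetical, and the marker strings are copied from the netboot install output pasted in this thread, so other ONTAP releases may word them differently.

```python
import re

# Marker strings copied from the netboot option 7 output in this thread.
# Assumption: other ONTAP releases may print different wording.
WIPE_MARKERS = (
    "Partitioning of boot device complete",
    "New boot device filesystem created",
)
IMAGE_SLOTS = ("image1", "image2")


def summarize_install_log(log_text):
    """Report which wipe markers and image slots appear in a console capture."""
    markers = {marker: marker in log_text for marker in WIPE_MARKERS}
    slots = {
        slot: re.search(r"Extracting to \S*/" + slot + r"\b", log_text) is not None
        for slot in IMAGE_SLOTS
    }
    # "wiped" is True only if every repartition/reformat marker was seen.
    return {"wiped": all(markers.values()), "markers": markers, "slots": slots}
```

If `wiped` comes back `False` and only one slot shows as extracted, the capture most likely came from an option 7 run on an already installed system rather than from a netboot.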

wild mist
#

LOADER-A> netboot http://10.10.10.109/9161_q_image.tgz
Downloading 10.10.10.109/9161_q_image.tgz ..
Failed to determine Content-Length.
Download failed: Not found

hybrid marten
#

"not found" -> are you sure the URL is correct? does it work in a browser?

#

is the network config and gateway correct?
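Both questions can be checked from a workstation on the same network before retrying at the LOADER. This is a minimal sketch with a hypothetical helper name, `probe_image_url`. It only shows how the web server answers a standard HTTP client; the LOADER has its own HTTP client, so a clean result here does not guarantee that `netboot` will succeed.

```python
import urllib.error
import urllib.request


def probe_image_url(url, timeout=10):
    """Send a HEAD request and return (status, content_length).

    content_length is None when the server omits the header.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("Content-Length")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report them the same way.
        return err.code, err.headers.get("Content-Length")
```

For example, `probe_image_url("http://10.10.10.109/9161_q_image.tgz")` should return status 200 and a Content-Length value if the server side is healthy. A 404, or a missing Content-Length, would point at the web server rather than the LOADER.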

wild mist
#

yes it is, and I was able to use the same path with option 7

#

I have the cluster up now, and am trying to update ONTAP... and crossing my fingers 😉

#

but strange that netboot wasn't able to work

hybrid marten
#

yeah, sometimes the loader is a bit finicky with what HTTP servers it accepts ... I usually use the python-integrated one (python -m http.server 8080 to serve the local directory)
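For reference, the same built-in server can be started from a short script, which makes it easier to pin the directory and port. This is a rough sketch using only the Python standard library; the function names `make_image_server` and `serve_in_background` are made up for illustration.

```python
import functools
import http.server
import threading


def make_image_server(directory, port=8080, bind="0.0.0.0"):
    """Build an HTTP server that serves the files in `directory`.

    SimpleHTTPRequestHandler sends a Content-Length header for regular files.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((bind, port), handler)


def serve_in_background(server):
    """Run the server in a daemon thread so the caller keeps control."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

Calling `make_image_server(".").serve_forever()` from the directory that holds the ONTAP image behaves roughly like `python -m http.server 8080`.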

tepid zinc
#

I think I know....

#

extract it. Then go into it and extract the ServiceImage11x.zip

#

locate "boot/x86-64/firmware/SB_XXII/firmware.img" and put it on the web server

#

After you configure the LOADER for "netboot" instead of netboot

#

when it is done.
bye -g

#

I suspect the LOADER has NO IDEA how to handle the ONTAP image and fails. I saw that on an older FAS9000 with an old BIOS. After the BIOS was updated, ONTAP loaded fine

hybrid marten
#

You mean back then when the LOADER didn't know how to boot a tgz file and you had to uncompress everything? That's long gone, FAS2700 can handle tgz files, it's just that the loader sometimes doesn't talk properly with some web servers. Sometimes all it needs is a second try (i.e. typing the same netboot command again) and then it works. Sometimes a ping before the netboot helps. Sometimes you need to change the webserver (e.g. hfs doesn't work but python works)... it's just not a very well-tested network stack I guess

tepid zinc
#

No. just for updating the FLASH!

#

I KNOW the Loader can handle the TGZ. Every platform after the 80x0 series can

#

The LOADER needs to know how to process the TGZ. There are subtleties between versions. In my example, the LOADER could boot ONTAP 9.8 but failed when trying to boot ONTAP 9.11. Updated the BIOS/LOADER and that was it

hybrid marten
#

Wouldn't that have given a different error than "not found"?

wild mist
#

I will try the suggestions in a few hours... I think I will also try to configure the chassis as "non-ha" ?

tepid zinc
#

the first message was on content-length...then not-found. There was an issue reading the new image and the not found is because even though it started to "read" the file, it never finished, hence not found

tepid zinc
#

Update the bios/loader and newer versions should work

wild mist
#

when the system boots it states the BIOS is 11.23, which is the latest version, so I guess it does not make sense to try the firmware update?

hybrid marten
#

yeah, the netboot thing is not directly related to your issue, it is just because you wanted to fully wipe the boot media which can be done using netboot. You can totally install and run ONTAP without ever netbooting

#

your issue is that boot menu option 6 somehow didn't work properly, which might or might not have anything to do with the boot media (we still don't know what caused it because we haven't seen the full boot/system messages)

wild mist
#

9.14.1 seems to do netboot... let's see what happens

#

Getting ready to install image
/dev/md4: 0.5MB (1024 sectors) block size 32768, fragment size 4096
using 1 cylinder groups of 0.50MB, 16 blks, 128 inodes.
super-block backups (for fsck_ffs -b #) at:
192
******* Working on device /dev/ada0 *******
Partitioning of boot device complete
/dev/ada0s1: 33521216 sectors in 2095076 FAT32 clusters (8192 bytes/cluster)
BytesPerSec=512 SecPerClust=16 ResSectors=32 FATs=2 Media=0xf8 SecPerTrack=63 Heads=16 HiddenSecs=0 HugeSectors=33553989 FATsecs=16368 RootCluster=2 FSInfo=1 Backup=2
New boot device filesystem created
New filesystem mounted
Directory /cfcard/x86_64/freebsd/image1 created
Directory /cfcard/x86_64/freebsd/image2 created
Syncing device...
Extracting to /cfcard/x86_64/freebsd/image2...
Installed MD5 checksums pass
Syncing device...
Extracting to /cfcard/x86_64/freebsd/image1...

#

looks promising

#

...no more strange errors at boot...

#

So we are going for a "9a" and "9b"... 🙂

#

....so maybe there is a "bug" in 9.16.1 when netbooting?

#

Feb 17 15:54:00 [localhost:monitor.globalStatus.ok:notice]: The system's global status is normal. is always a good message? 😉

hybrid marten
#

but you didn't try to restore the config from the disks (bootmenu 6) this time?

wild mist
#

nope, netboot -> 7 -> 9a -> 9b ...waiting for cluster setup now

hybrid marten
#

because this should definitely work with 9.16, even with a non-HA system. We have done this hundreds of times, and I'm pretty sure I would have heard of any issues like that 🙂

wild mist
#

I am pretty sure the netbooting did the trick because it reformatted the boot media... which none of the other options I tried did...

hybrid marten
#

yeah maybe. without the original error it's hard to tell. But glad that it works for you now (and that you could do a 9a and don't have to rely on getting your data back 😉 )

wild mist
#

ok first reboot after cluster setup 😉

#

great success 😉

#

thanks for the hints... and yes the FAS2700 works fine as a single node cluster 😉

tepid zinc
#

Check the loader version!
On some platforms the loader is on the flash device.
Grab the latest loader/bios image and update.

#

Of course updating ONTAP to 9.16 will update the loader also

#

You may be using an older loader than you think

wild mist
#

Going back in the logs, it states Boot Loader Version 8.4.0, which is the latest (I guess), since I have now upgraded to 9.16.1 and it has the same loader... so it was most likely the boot media from the FAS2600 that wasn't compatible somehow. Only after the netboot worked (with 9.14.1) did it re-partition the boot media, and since then I have not had the same strange errors while booting... I cannot tell you why netbooting 9.16.1 didn't work. It wasn't the image, because I used the same image to update 😉 and it wasn't the web server either, as I used the same apache2 from a default ubuntu 24.10 install...

#

but it worked in the end... now I can do the +100TB snapmirror which FAS2600 and 9.11 wasn't able to do 😉