#A general understanding of the FAS2620 in a cluster environment

hybrid token
#

I have been handed (with no documentation whatsoever) the task of shutting down the comms room for 4 hours.
And what we have (I believe) is:

  • 2 x FAS2620 Controllers (Cluster)
  • 1 x DS224-12
  • 1 x DS212-12

This was taken from the various show commands of the Cluster Management SSH console.
But... I think I am seeing the controller having discs also - and I am so confused.

In essence, the system reports the above - so I should see 4 physical devices, right?
However, I am sure the Controller has a disc array on it also - so you know... I'm dizzy and stuff.

I need to understand the various components before I start issuing halts everywhere.

I sure do hope someone will take on this newbie 😃
And of course, I thank you! 🙏

upbeat tiger
#

Depending on a few things, here are the basics to get an idea of what is connected.

The system can have 12 internal disks.

storage shelf show
Which will show the basic shelf info

for a full listing of everything in the system/etc
sysconfig -A

warning, it will give you a long list of everything in the system.

if you can get into the GUI you might get an easier picture of everything in the system

hybrid token
#

Thanks @upbeat tiger

So when I list shelves, that won't include the internal "shelf", if you will?
Would that be the correct assumption?

upbeat tiger
#

it will not, only attached ones. if you do a storage disk show

#

that will show all the disks and the shelf info

#

internal disks are generally assigned shelf ID "0"

hybrid token
#

OK - it's starting to make sense then.
But this "shelf" with ID 0 is not listed as a shelf?

upbeat tiger
#

It will be, as shelf ID 0.
Then each attached external shelf will have whatever ID is set on the shelf itself. Most people use 10, 20, 30 for the shelf units.

#

but the shelf itself should have an LED on it that shows what the ID is

hybrid token
#

Cool! - this is really helpful 👍

I plan on halting both controllers - the non-cluster-management one first, then the cluster management one itself (which I will be accessing over the console).
I guess this also takes care of gracefully bringing down the internal shelves.

upbeat tiger
#

The disks should be numbered something like

1.1.1
the second 1 is the shelf number, the third is the bay number
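
As a rough illustration of that naming scheme, here is a minimal Python sketch that splits a three-part disk name into its fields. The names `DiskName` and `parse_disk_name` are made up for this example, and labelling the first field as a stack ID is an assumption, since only the second and third fields are identified above. It only handles the plain three-part form.

```python
from typing import NamedTuple


class DiskName(NamedTuple):
    stack: int  # assumption: the first field is not named in the thread
    shelf: int  # second field: shelf ID (internal disks are generally 0)
    bay: int    # third field: bay number within that shelf


def parse_disk_name(name: str) -> DiskName:
    """Split a disk name such as '1.0.5' into its three numeric fields."""
    parts = name.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated fields, got {name!r}")
    stack, shelf, bay = (int(p) for p in parts)
    return DiskName(stack, shelf, bay)


# Example: a disk in bay 3 of the shelf whose ID is 10
disk = parse_disk_name("1.10.3")
print(disk.shelf, disk.bay)
```

Grouping the parsed names by `shelf` would give a quick count of how many disks sit in each shelf ID.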

#

yes, when you do a storage failover takeover -ofnode NODEX -halt
will move everything to the online controller and halt that node

then you can halt the one remaining one

#

you can also do the full 'graceful' shutdown
system node halt -node * -skip-lif-migration-before-shutdown true -ignore-quorum-warnings true -inhibit-takeover true

which will shut both nodes down and not go through the reassigning everything

hybrid token
#

The command(s) I will be using:

system node halt -node <none-clus-mgmt-ctrl>
WAIT
system node halt -node <clus-mgmt-ctrl>
WAIT

Pull power on all systems
This will all be done from the cluster management controller

Is that a (stable) way? Don't worry, I won't hold you accountable 😄

(as you may have guessed, this is all very new to me)

upbeat tiger
#

that should work as well, but doing a storage failover is preferred

hybrid token
#

Thank you 🙏

This is what i have taken away from this interaction:

  • The internal disc array will not be listed as a shelf when showing them
  • My approach will be fine, but failover is preferred (your command)
    • your command will do both shutdowns

After the shutdowns, it is (should be) safe to pull life support?

#

(and thank you for the extra commands)

hybrid token
#

@upbeat tiger

I have been back to site, and after pulling the half-height unit out I found out the FAS2620 is a dual model:
2 x controllers in the same chassis.

Would this explain why I see only 2 x 2U units, but 4 devices reported?

  • The 2 controllers (same chassis) : FAS2620
  • The DS212-12 being reported in that chassis
  • DS224-12 our additional shelf.

The list of discs only shows 2 IDs: 0 and 10.
I am still just confused that it's reporting 2 shelves when I can only see 1 shelf + the dual controller.

#

I can pull a list if you like - to get me off your back 😉

upbeat tiger
#

so there are 2 FAS2620 units? yes, each unit has 2 controllers.
question is are they part of the same cluster or 2 different ones.
If you do a cluster show, does it show 2 or 4 nodes?

hybrid token
#

shows 2

#

Let me logon......

#

Shelfs

#

Notice Shelf ID 0 - that would suggest the dual unit - is also being reported?

upbeat tiger
#

Doesn't look like it.
The DS212-12 and DS224-12 are the shelf units shown. How many disks are reported by the system?

hybrid token
#

24

upbeat tiger
#

yea, and those are what's in the shelf, right? One shelf full of 900gb and the other with 4 900gb and 8 4tb

silent lake
#

IOM12E is the controller's shelf

#

so yes you have 1 unit (2U) with the controller and 12 disks, and another 2u unit with additional disks

hybrid token
#

That's the ticket 😮

#

Now I know I'm not going mad

#

Thank you both!

This has been really helpful - you wouldn't believe!

silent lake
#

all module types with "E" at the end are for "embedded" systems, i.e. controllers in shelves
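
That rule of thumb is easy to apply to the module types in a shelf listing. A minimal Python sketch, assuming module type strings like the IOM12 and IOM12E seen in this thread (the function name `is_embedded_module` is made up for illustration):

```python
def is_embedded_module(module_type: str) -> bool:
    """Return True when the module type ends in 'E', i.e. an embedded module."""
    return module_type.strip().upper().endswith("E")


# Classify the two module types mentioned in this thread
for module in ("IOM12", "IOM12E"):
    if is_embedded_module(module):
        kind = "controller chassis with internal disks"
    else:
        kind = "external shelf"
    print(f"{module}: {kind}")
```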

hybrid token
#

I have NO documentation on our setup - and this has been REALLY valuable 🙏

#

thank you again both

hybrid token
#

OK...

I have my list (I think).

  • Shutdown ESXi hosts
  • Issue this on the Cluster Management Console (SSH) :
system node halt -node * -skip-lif-migration-before-shutdown true -ignore-quorum-warnings true -inhibit-takeover true
  • Wait
  • Pull Power on everything

Then (After we are ready to come back to life)...

  • Power up the DS224-12 : IOM12 (external shelf)
  • Wait
  • Power up the FAS2620 Dual controller (with DS212-12 : IOM12E)
    This has 2 Power supplies given the HA Pair.
  • Boot up ESXi hosts
  • Pray

I mean, do I have this right?

silent lake
#

yep. that's it

#

You don't need to wait very long after powering up the shelf, maybe a few seconds. These days the controllers take much longer to start up than the shelf, so it's usually fine to power them up at the same time. The controllers can take 5 minutes to boot.

#

I would do a set -confirmations off before issuing the node halt command so that the system doesn't prompt you "are you sure?". If you don't answer quickly enough, the first node will already have shut down, and if you happen to be connected to that node, you can no longer enter the y for the second node and it won't shut down.

#

I would also connect to the two BMCs and issue a system console on each of them. that way you can see when they are down (in the LOADER> prompt) to know when it's safe to power off

#

that way you can also issue another node halt, should the connection be closed before you had time to confirm the "are you sure" prompts if you forget the set -confirmations off

#

the BMC is basically a serial connection into the cluster to "another" console

hybrid token
#

This is awesome stuff - Thank You! so much

silent lake
#

if you don't remember the SP/BMC IP addresses, use service-processor show. You can log in using the cluster's admin account and connect to the serial console via system console

hybrid token
#

I have the SP IPs to hand, so should be good to go (although, I’ll remember to disable confirmations) - seems less stress 🤪

maiden dust
#

So I usually do this to make sure I halt the nodes in the correct order...
"node halt -node !local -inhibit-takeover true -skip-lif-migration true"
That will halt all other nodes than the one I am currently connected.
Then some seconds later, re-run but without the exclamation mark:
"node halt -node local -inhibit-takeover true -skip-lif-migration true"

Having checked autosupport (callhome) data for your systems (seems to be enabled but I believe there is no active support contract) I can see that you do have the service-processor configured with an IP so that is where I would have SSH:ed to, run "system console" and then the commands from above... I'd be SSH:ed and "system console":ed into the other controller as well just to see what was going on.

hybrid token
#

Hey @maiden dust

Thanks for the input!

I was going to connect to the IOIO port of each, and run the halt on the controller that also has the management IP interface (just for the sake of choosing one) - and to monitor both

I will be behind the rack, for some of this - so want to use the opportunity to connect via IOIO to do my shutdown, if this is a sane idea?

silent lake
#

note that the IOIO port is a serial (RS232) port, do not connect it to a network switch

hybrid token
#

Yup - give me some credit 😉
Already have the cables in my trusty engineer kit

hybrid token
#

Quick question on the IOIO - will I already be at the system console, or do I still need to issue the request?
I guess this port is attached to the OS side already?

silent lake
#

you will be at the SP/BMC and need to type system console too

hybrid token
#

Cool - makes sense - thank you

maiden dust
#

Well, if you're on the IOIO then you are in full control of the process anyway, so that is a good approach. If you do one controller at a time you can just use the "local" trick on both, with no exclamation marks. Please note that with my commands ONTAP will prompt you if you want to continue. The confirmation question comes immediately after issuing the command, so it's not like you have to watch it and respond at the right time.