#Misunderstanding broadcast domains / reachability in ontap?


neat cosmos
#

think i have a fundamental misunderstanding of how networking ...uh.. works... with ontap. By way of example... I have a small flash unit (afs190?) .. just a single dual controller chassis. I have 2 ports per controller set up in a lag, connected back to two switches (mlag), as "trunk ports". I have defined a vlan on each controller's trunk for a dedicated storage network (a0a-2053 on both). I then put both of the vlan "ports" into a single broadcast domain, because, in my mind, these are both on the same l2 segment, so they should be able to reach each other. When i check reachability, both show "no-reachability". I'm not sure what i'm missing / where my mental disconnect is here. When i try to create lifs on each of the vlan ports, i am able to create one on one controller fine, but when i go to create the other, i'm told there's no eligible port.

can anyone tell me what i'm getting wrong on a conceptual / configuration level?

tawny oriole
#

did you check that your switch actually has those VLANs configured? Usually a quick check is to set up an IP address in that particular VLAN on the switch ("interface vlan 2053" on Cisco) and try to ping the LIF from there.

#

basically, what ONTAP does is it does layer 2 (ARP) pings between the ports that are supposed to be in the same L2 segment, and when it cannot reach the other ports' MAC addresses, it flags them as "no reachability"

#

so my guess is that you forgot to add the VLAN on the second node's LAG port
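
To make the idea above concrete, here is a toy model of that check in Python. It is only a sketch: ONTAP's real algorithm is not shown in this thread, and the function name `check_reachability`, the `heard` map, and the rule "ok as soon as one expected peer answers" are all assumptions made for illustration. The status strings and the "unreachable" / "unexpected" lists mirror the fields that `network port reachability show` prints.

```python
from typing import Dict, List, Set


def check_reachability(port: str,
                       domain_ports: Set[str],
                       heard: Dict[str, Set[str]]) -> Dict[str, object]:
    """Toy model of an L2 reachability check for one port (hypothetical helper).

    domain_ports: every port configured in the same broadcast domain.
    heard: for each port, the ports whose probe frames it actually received.
    """
    expected = domain_ports - {port}
    seen = heard.get(port, set())
    unreachable: List[str] = sorted(expected - seen)
    unexpected: List[str] = sorted(seen - expected)
    # Simplifying assumption: "ok" as soon as one expected peer was heard.
    status = "ok" if (bool(seen & expected) or not expected) else "no-reachability"
    return {"status": status, "unreachable": unreachable, "unexpected": unexpected}


domain = {"nacluster01-01:a0a-2053", "nacluster01-02:a0a-2053"}

# Neither VLAN port hears the other, e.g. because the VLAN is missing on one LAG.
broken = check_reachability("nacluster01-01:a0a-2053", domain, heard={})

# Both VLAN ports see each other's probes.
fixed = check_reachability(
    "nacluster01-01:a0a-2053",
    domain,
    heard={"nacluster01-01:a0a-2053": {"nacluster01-02:a0a-2053"}},
)

print(broken["status"], fixed["status"])
```

The point of the model is that "no-reachability" says nothing about IP: it only means the probes between ports in the same broadcast domain did not arrive, which is why a VLAN that is missing on one LAG produces it.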

neat cosmos
#

I've verified on the switches that the Mlags are set up correctly and that each MLAG is a tagged member for the vlan. I'm probably garbling the language there, trying to translate between mikrotik term vs general networking term, but the switch part is the piece i'm actually confident about.

tawny oriole
#

so you can ping (from the outside) the one lif that you were able to create? did you test that? what happens if you try and migrate that LIF to the other node's a0a-2053 port while pinging it? (net int migrate)

#

and just to make sure, the "no reachability" shows up on the a0a-2053 port, and not on a0a itself, right? because the latter could be due to a bug that always shows "no reachability"..

neat cosmos
#

reachability snip:

#

i am able to ping the 1 lif that i was able to create, yes. I haven't tried to migrate it, I will do that when I can and report back.

tawny oriole
#

maybe your switch blocks ARP pings, I have seen some equipment do that (some sort of security setting or something). Then it will always show "no reachability"

neat cosmos
#

That's a thought, but why would it restrict me from being able to add a lif to the second node's vlan ?

tawny oriole
#

hm, yeah, that's strange. you're doing all that via CLI, not System Manager, I assume? 🙂

neat cosmos
#

i actually attempted to add the lif via system manager. I've been configuring the unit with a mixture.

tawny oriole
#

what ONTAP version are you running?

neat cosmos
#

9.14

tawny oriole
#

there's a couple of KBs that talk about switches not sending the pings to all ports of the ifgrp, causing it to be marked as degraded, but that should not apply to 9.14 anymore 🤔
you can work around it by modifying the port with -ignore-health-status true that should at least let you create a LIF there. Sadly I have no real experience with Mikrotik switches and ONTAP

violet isle
#

Is the ifgrp healthy? Check "ifgrp show": is it the right type?
ie mode active at the switch is multimode_lacp on the Netapp
Mode on would be just multimode on the Netapp
Also common issue I see is the vlan may be configured on the mlag or ports but it has not been defined on the switch

Show vlan -> on the switch?
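
The mode pairing described above can be written down as a small lookup, which is handy when comparing a switch config against `ifgrp show`. This is a minimal sketch that covers only the two pairings mentioned in this thread; the names `SWITCH_TO_ONTAP_MODE`, `expected_ifgrp_mode` and `modes_match` are made up for illustration.

```python
# Switch channel-group mode -> ONTAP ifgrp mode, as described in the thread.
SWITCH_TO_ONTAP_MODE = {
    "active": "multimode_lacp",  # switch negotiates LACP
    "on": "multimode",           # static aggregation, no LACP
}


def expected_ifgrp_mode(switch_mode: str) -> str:
    """Return the ONTAP ifgrp mode that pairs with a switch channel-group mode."""
    mode = switch_mode.strip().lower()
    if mode not in SWITCH_TO_ONTAP_MODE:
        raise ValueError(f"no pairing listed for switch mode {switch_mode!r}")
    return SWITCH_TO_ONTAP_MODE[mode]


def modes_match(switch_mode: str, ontap_mode: str) -> bool:
    """True when the switch side and the ONTAP side agree on LACP vs static."""
    return expected_ifgrp_mode(switch_mode) == ontap_mode.strip().lower()


print(modes_match("active", "multimode_lacp"))
print(modes_match("on", "multimode_lacp"))
```

The second call reports a mismatch: a static ("on") channel group on the switch facing a `multimode_lacp` ifgrp on the controller.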

neat cosmos
#

back at this.
@violet isle "ifgrp show" output:

nacluster01::> ifgrp show
         Port       Distribution                   Active
Node     IfGrp      Function     MAC Address       Ports   Ports
-------- ---------- ------------ ----------------- ------- -------------------
nacluster01-01
         a0a        port         d2:39:ea:9e:fe:8d partial e0c, e0d
nacluster01-02
         a0a        port         d2:39:ea:9e:f8:c9 partial e0c, e0d

Looking at first cluster node and its ifgrp with port show :

nacluster01::> network port show -node nacluster01-01 -port a0a

                                        Node: nacluster01-01
                                        Port: a0a
                                        Link: up
                                         MTU: 9000
             Auto-Negotiation Administrative: -
                Auto-Negotiation Operational: -
                  Duplex Mode Administrative: -
                     Duplex Mode Operational: -
                        Speed Administrative: -
                           Speed Operational: -
                 Flow Control Administrative: -
                    Flow Control Operational: -
                                 MAC Address: d2:39:ea:9e:fe:8d
                                   Port Type: if-group
                 Interface Group Parent Node: -
                 Interface Group Parent Port: -
                       Distribution Function: port
                               Create Policy: multimode_lacp
                            Parent VLAN Node: -
                            Parent VLAN Port: -
                                    VLAN Tag: -
                            Remote Device ID: -
                                IPspace Name: Default
                            Broadcast Domain: Default-Trunk
                          MTU Administrative: 9000
                          Port Health Status: healthy
                   Ignore Port Health Status: false
                Port Health Degraded Reasons: -
                Virtual Machine Network Name:
                    Supported RDMA Protocols: -

Looking at the vlan (2053) :

nacluster01::> network port show -node nacluster01-01 -port a0a-2053

                                        Node: nacluster01-01
                                        Port: a0a-2053
                                        Link: up
                                         MTU: 9000
             Auto-Negotiation Administrative: -
                Auto-Negotiation Operational: -
                  Duplex Mode Administrative: -
                     Duplex Mode Operational: -
                        Speed Administrative: -
                           Speed Operational: -
                 Flow Control Administrative: -
                    Flow Control Operational: -
                                 MAC Address: d2:39:ea:9e:fe:8d
                                   Port Type: vlan
                 Interface Group Parent Node: -
                 Interface Group Parent Port: -
                       Distribution Function: -
                               Create Policy: -
                            Parent VLAN Node: nacluster01-01
                            Parent VLAN Port: a0a
                                    VLAN Tag: 2053
                            Remote Device ID: -
                                IPspace Name: Default
                            Broadcast Domain: Default-StorageNetwork
                          MTU Administrative: 9000
                          Port Health Status: degraded
                   Ignore Port Health Status: false
                Port Health Degraded Reasons: l2_reachability
                Virtual Machine Network Name:
                    Supported RDMA Protocols: -
tawny oriole
#

the "partial" ifgrps might be a problem. "partial" means that only some of the ports came online. I assume you are using multimode_lacp as ifgrp type, which means the switches did not configure all ports in the same LACP aggregate. How are your ports physically cabled? you might have a cable mixup somewhere

#

often this happens when you try and put e.g. the e0a ports of both nodes into an LAG, and the e0b ports of both nodes into a different one, which won't work. LAGs in ONTAP cannot span nodes
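
If you want to spot "partial" ifgrps without eyeballing the table, the `ifgrp show` output pasted earlier is easy to parse. The sketch below assumes the exact column layout shown in this thread (node name on its own unindented line, one indented row per ifgrp); `parse_ifgrp_show` and `not_fully_active` are hypothetical helper names, not part of any NetApp tooling.

```python
from typing import Dict, List

SAMPLE = """\
         Port       Distribution                   Active
Node     IfGrp      Function     MAC Address       Ports   Ports
-------- ---------- ------------ ----------------- ------- -------------------
nacluster01-01
         a0a        port         d2:39:ea:9e:fe:8d partial e0c, e0d
nacluster01-02
         a0a        port         d2:39:ea:9e:f8:c9 partial e0c, e0d
"""


def parse_ifgrp_show(text: str) -> List[Dict[str, object]]:
    """Parse the tabular `ifgrp show` output into one dict per ifgrp."""
    rows: List[Dict[str, object]] = []
    node = None
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Everything up to and including the dashed rule is header.
            in_body = line.lstrip().startswith("---")
            continue
        if not line.strip() or line.strip().endswith("displayed."):
            continue  # blank line or "N entries were displayed."
        if not line[0].isspace():
            node = line.strip()  # unindented line carries the node name
            continue
        fields = line.split(None, 4)
        if len(fields) < 4:
            continue
        ifgrp, distribution, mac, active = fields[:4]
        ports = [p.strip() for p in fields[4].split(",")] if len(fields) > 4 else []
        rows.append({"node": node, "ifgrp": ifgrp, "distribution": distribution,
                     "mac": mac, "active": active, "ports": ports})
    return rows


def not_fully_active(rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return ifgrps whose Active Ports column is anything other than 'full'."""
    return [r for r in rows if r["active"] != "full"]


for row in not_fully_active(parse_ifgrp_show(SAMPLE)):
    print(f'{row["node"]}:{row["ifgrp"]} is {row["active"]} ({", ".join(row["ports"])})')
```

Run against the output above, this lists both nodes' `a0a` as partial, which is the hint that not every member port joined the same LACP aggregate.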

violet isle
#

What @tawny oriole said. What about a “net port show”?
Are both e0c/e0d connected?
What is the channel-group mode in the switch?
If you have cdp/lldp enabled, check that. To turn it on at the Netapp:

system node run -node * options cdpd.enable on

system node run -node * options lldp.enable on

Wait 3 minutes or so, then check
network device-discovery show -port e0c|e0d

#

If you can get it, send the switch config related to the ports in the switch (the ports and the channel group)

neat cosmos
#

i was able to get someone at the DC to check the cabling on the lags, and there was an error where two cables were in wrong place. Swapped them to correct positions and I think that may have solved things. I am still looking through everything, but this is promising:

nacluster01::> ifgrp show
         Port       Distribution                   Active
Node     IfGrp      Function     MAC Address       Ports   Ports
-------- ---------- ------------ ----------------- ------- -------------------
nacluster01-01
         a0a        port         d2:39:ea:9e:fe:8d full    e0c, e0d
nacluster01-02
         a0a        port         d2:39:ea:9e:f8:c9 full    e0c, e0d
2 entries were displayed.
tawny oriole
#

full connectivity indeed looks better

violet isle
#

Agreed!

neat cosmos
#

```
                       Node: nacluster01-01
                       Port: a0a-2053
  Expected Broadcast Domain: Default:Default-StorageNetwork
Reachable Broadcast Domains: Default:Default-StorageNetwork
        Reachability Status: ok
          Unreachable Ports: -
           Unexpected Ports: -

nacluster01::> network port reachability show -node nacluster01-02 -port a0a-2053

                       Node: nacluster01-02
                       Port: a0a-2053
  Expected Broadcast Domain: Default:Default-StorageNetwork
Reachable Broadcast Domains: Default:Default-StorageNetwork
        Reachability Status: ok
          Unreachable Ports: -
           Unexpected Ports: -
```
violet isle
#

You may want to do a few things
Net port reachability repair -node node-01 -port a0a

Net port reachability repair -node node-01 -port a0a-2053
Repeat for node 2
Wait a couple minutes and then reverify “broadcast-domain show”
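
Since the same steps get repeated per node, a tiny generator for the command list can save some typing. This is just a sketch that builds strings to paste into the cluster shell; the function name `repair_commands` and its defaults are made up for illustration, while the commands themselves are the ones used elsewhere in this thread.

```python
from typing import List


def repair_commands(nodes: List[str], parent: str = "a0a", vlan_tag: int = 2053) -> List[str]:
    """Build the repair-then-verify command sequence for each node."""
    vlan_port = f"{parent}-{vlan_tag}"
    cmds: List[str] = []
    for node in nodes:
        # Repair the ifgrp first, then the VLAN port that rides on it.
        for port in (parent, vlan_port):
            cmds.append(f"network port reachability repair -node {node} -port {port}")
    for node in nodes:
        cmds.append(f"network port reachability show -node {node} -port {vlan_port}")
    cmds.append("network port broadcast-domain show")
    return cmds


for cmd in repair_commands(["nacluster01-01", "nacluster01-02"]):
    print(cmd)
```

Give it a couple of minutes between the repair commands and the verification commands, as noted above.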

#

Hah. Perfect. Broadcast-domain show should look good then also

neat cosmos
#
```
  (network port broadcast-domain show)

               IPspace Name: Default
   Layer 2 Broadcast Domain: Default-StorageNetwork
             Configured MTU: 9000
                      Ports: nacluster01-02:a0a-2053
                             nacluster01-01:a0a-2053
         Port Update Status: complete
                             complete
  Status Detail Description: complete
                             complete
Combined Port Update Status: complete
            Failover Groups: Default-StorageNetwork
               Subnet Names: subnet-53-storage
    Is VIP Broadcast Domain: false
```
#

i think... we are good? should be able to add that other lif now i would think

violet isle
#

Agreed

neat cosmos
#

yep, added just fine