# Is there a way to tell which LIF ONTAP is using to get to a specific destination?

hot nova
#

For a data vserver, we have multiple data LIFs and a single management LIF. We are randomly getting timeout messages for one of our DNS servers, so I'm trying to determine which LIF the vserver is using to get to the DNS server. Is there any way to tell for sure which LIF it's using? When I do a traceroute, I have to specify which LIF or node to use, so that doesn't tell me which one ONTAP uses when left to its own devices.

There are two default routes set up for the vserver... the route for the data LIFs has a metric of 20 and the route for the management LIF has a metric of 60 (in the hope that it will favor the data LIFs).

plush stratus
#

"the route for the data LIFs has a metric of 20 and the route for the management LIF has a metric of 60"
this is not how routes work. There are no "routes for (a) LIF". Routes are not tied to LIFs (you can see that by doing a "route show" and noticing that the output never lists a LIF).
I don't know where this idea comes from, but we have seen it with a lot of our customers in the past...

hot nova
#

Is there something somewhere that explains how routes do work? It's always been a bit of a mystery to me how they decide which route to use if there is more than one route to choose from, and I've never found a good explanation anywhere.

plush stratus
#

the one simple rule is that ONTAP (like every OS, router, etc.) uses the route with the "best" match to the target, i.e. the route with the most matching bits (longest prefix). If multiple routes match with the same number of bits, it uses the one with the lowest metric. If the metric is also the same, the choice is essentially random (unless multipath routing is active, in which case it tries to use all matching routes)
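To make the rule above concrete, here is a minimal Python sketch of longest-prefix-then-metric selection. It is not ONTAP's implementation, only an illustration of the rule as described in this thread. The function name `select_route` and the addresses in `EXAMPLE_ROUTES` are made up for the example (documentation-range IPs), and multipath routing is deliberately ignored.

```python
import ipaddress
import random

# Hypothetical routing table: (destination CIDR, gateway, metric).
EXAMPLE_ROUTES = [
    ("0.0.0.0/0", "192.0.2.1", 20),
    ("0.0.0.0/0", "198.51.100.1", 60),
    ("10.0.0.0/8", "198.51.100.1", 60),
]

def select_route(routes, destination):
    """Pick a route by longest prefix match, then by lowest metric."""
    dest = ipaddress.ip_address(destination)
    # Keep only routes whose destination network contains the target.
    matching = [
        (ipaddress.ip_network(cidr), gateway, metric)
        for cidr, gateway, metric in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matching:
        return None
    # Most matching bits wins; among equals, the lowest metric wins.
    best_key = min((-net.prefixlen, metric) for net, _, metric in matching)
    candidates = [r for r in matching if (-r[0].prefixlen, r[2]) == best_key]
    # Any remaining tie is "essentially random" per the description above.
    net, gateway, metric = random.choice(candidates)
    return str(net), gateway, metric

print(select_route(EXAMPLE_ROUTES, "10.1.2.3"))
print(select_route(EXAMPLE_ROUTES, "203.0.113.9"))
```

Note that the `10.0.0.0/8` route is chosen for `10.1.2.3` even though its metric is higher than the best default route: the metric only breaks ties between routes with the same prefix length.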

random marlin
#

For what it’s worth, a “default” gateway is one with a destination of 0.0.0.0/0. These are the most common. Typically there is a default gateway for each unique network the SVM has. Unless you have non-routable VLANs (like NFS for VMware or iSCSI, which shouldn’t be routed but sometimes is), the gateway provides the SVM a way past the local network (like 192.168.200.0/24).

#

There are other routes that NetApp can do, like a host route. I’m not too dialed into those, so I would not be good at trying to explain them.

plush stratus
#

if ONTAP sends a packet to an IP that is on a locally-connected subnet (i.e. the same subnet as a LIF), it uses that LIF. if the destination is not on any locally connected subnet, there is no way ONTAP can tell which LIF would have the "better" gateway except by using the (default) route and possibly the metric

#

remember there is no "ip.fastpath" anymore (and that is a good thing)

coarse cairn
#

and when you use multiple gateways, the remote side may also use a different path in return (asymmetric routing)... which will fail on almost any firewall... because the stateful rule for the outgoing connection doesn't exist on the return path...

plush stratus
random marlin
#

Which brings me to my complaint: when creating a “default gateway” in the GUI, why is there ZERO option to assign a metric!

plush stratus
random marlin
#

I am always correcting this for customers.

CLI to delete and create on the same command line:
route delete… ; route create…

plush stratus
#

I have yet to see a customer that actually has or needs multiple default routes with different metrics. literally all cases I've seen were misconfigurations

random marlin
#

Naw man. I see it too many times. Lots of things. Easiest one is intercluster. Most will put the IC LIFs in the admin SVM, and they need a different gateway than management.

plush stratus
#

that's okay, then you create a different route for the particular intercluster network (10.0.0.0/8 or whatever), not another default gateway

#

multiple non-equivalent default gateways WILL break in spectacular ways

coarse cairn
#

routing can be usefully used to null-route misbehaving clients too, hehe

hot nova
# plush stratus the one simple rule is that ONTAP (like every OS, router etc.) uses the route wi...

Ok, so say I have a friend with an SVM set up like this (IPs have been changed to protect the innocent). There are two different default gateways because the management LIF is on a different network than the data LIFs so that, theoretically, you could still reach the management LIF if there was an issue on the data network.

```
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
svm01
            svm01_data01 up/up   1.3.10.60/24        cluster01-01   a0a    true
            svm01_data02 up/up   1.3.10.68/24        cluster01-02   a0a    true
            svm01_mgmt01 up/up   1.7.4.196/25        cluster01-02   e0M    true
2 entries were displayed.


cluster01::> net route show -vserver svm01
  (network route show)
Vserver             Destination     Gateway         Metric
------------------- --------------- --------------- ------
svm01
                    0.0.0.0/0       1.3.10.1     20
                    0.0.0.0/0       1.7.4.129    60
2 entries were displayed.
```

So, if the DNS server for this SVM is say 192.50.10.100, which interface would it use?
Since there are no matching bits, in my understanding (and from my interpretation of what you are saying), it will choose the route based on the metrics of the competing default routes.  In this case, since the "1.3.10.1" gateway has a lower metric, it will use the data LIFs that are on that gateway to get to 192.50.10.100 (so, svm01_data01 or svm01_data02).

Am I understanding this correctly, or am I missing something?

@coarse cairn, you do make a good point about the asymmetric routing... that could be contributing here.
plush stratus
#

note that in recent ONTAP versions, the service policy plays into this, and if a LIF doesn't have the correct service policy for, say, DNS, it will not be used for that particular traffic. At least that's the theory, but I have seen ONTAP do some really weird things when it comes to routing 😄

hot nova
#

Well, it would be used if the SVM were trying to get to an IP that's on a matching subnet, correct? For example, if the DNS server has an IP of 1.7.12.50, it would favor the route with the matching bits, right?

And someone could still use the svm01_mgmt01 LIF to log into the SVM if they wanted to, correct? The real intent of having the mgmt LIF is for someone to be able to log into the SVM if the data LIFs become unavailable for some reason.

plush stratus
#

1.7.12.50 is not directly connected (it is not behind 1.7.4.196/25) so that would also take the 1.3.10.1 default route. ONTAP matches the bits from the route's netmask, not from any LIFs

hot nova
#

Ok, so matching the first two octets isn't enough? So, it would use the route if the DNS server's IP was 1.7.4.131 (or any IP in that subnet), but not otherwise?

plush stratus
#

it doesn't match bits between IP addresses. It only checks if it's a locally-connected subnet, and if not, it goes through all routes in the routing table to find a matching entry. only there does it count matching bits

#

so yes, it would only select the 1.7.4.196/25 LIF for anything that's in the 1.7.4.128/25 subnet
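The two-step decision described above (locally connected subnet first, routing table second) can be sketched in Python using the LIFs and routes from the svm01 output posted earlier. This is an illustration only, not ONTAP code: the function name `next_hop` and its return shape are made up, and service policies are ignored entirely.

```python
import ipaddress

# LIFs and routes copied from the svm01 output in this thread.
LIFS = {
    "svm01_data01": "1.3.10.60/24",
    "svm01_data02": "1.3.10.68/24",
    "svm01_mgmt01": "1.7.4.196/25",
}
ROUTES = [("0.0.0.0/0", "1.3.10.1", 20), ("0.0.0.0/0", "1.7.4.129", 60)]

def lifs_on_subnet_of(address):
    """Names of LIFs whose own subnet contains the given address."""
    ip = ipaddress.ip_address(address)
    return sorted(
        name for name, addr in LIFS.items()
        if ip in ipaddress.ip_interface(addr).network
    )

def next_hop(destination):
    """Return (how, candidate LIFs) for a destination IP."""
    # Step 1: is the destination on a locally connected subnet?
    local = lifs_on_subnet_of(destination)
    if local:
        return "direct", local
    # Step 2: consult the routing table (longest prefix, then lowest metric).
    dest = ipaddress.ip_address(destination)
    matching = [
        (ipaddress.ip_network(cidr), gw, metric)
        for cidr, gw, metric in ROUTES
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matching:
        return "unreachable", []
    _, gateway, _ = min(matching, key=lambda r: (-r[0].prefixlen, r[2]))
    # Only LIFs on the gateway's subnet can reach that gateway.
    return "via " + gateway, lifs_on_subnet_of(gateway)

for target in ("192.50.10.100", "1.7.12.50", "1.7.4.131"):
    print(target, next_hop(target))
```

Under these assumptions, both `192.50.10.100` and `1.7.12.50` go out via the `1.3.10.1` gateway (so through one of the data LIFs), and only `1.7.4.131`, which sits inside `1.7.4.128/25`, is reached directly through `svm01_mgmt01`.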

hot nova
#

Ok, thanks for the detailed explanation. I think my understanding was mostly correct. I guess I just thought there were other occasions where it would use the 1.7.4.129 route, but seems I was incorrect there, and that's where my confusion was.

So, now I need to look at service policies! 🙂

plush stratus
#

the only ONTAP specific thing that could influence the LIF selection is the service policy, but I admit I have not dug very deeply into how that interacts with routes yet

#

like, DNS queries should not go out through a LIF that doesn't have "management-dns-client" in its service-policy. But what if the routing table says it should? etc. I can imagine that a lot of this behavior is not well-defined and could change between versions easily

granite mountain
#

I think service policies are basically like the old firewall settings. If a LIF does not have a certain service/application, that LIF will not be used. If it has the service, the "Allowed Addresses" subnet mask needs to match. Only if that's also true are routing decisions made.

https://kb.netapp.com/onprem/ontap/da/NAS/How_do_you_determine_the_most_specific_route_ONTAP_9.2_and_later_versions_will_use
https://kb.netapp.com/onprem/ontap/da/NAS/How_does_ONTAP_9.2_select_a_route