# Adding new nodes to SnapMirror destination cluster leaves cluster peering as partial?


slender wasp
#

We have just expanded a cluster with some additional nodes to add capacity. We have several other clusters snapmirroring to this cluster, and it works great.
We have of course added new intercluster LIFs and modified the cluster peering information at both ends, and for the most part it works and the "Availability" states "Available". Yet we have a few that state "Pending" at the destination and "Partial" at the source (output from cluster peer show). We are able to cluster peer ping with 100% success, but cluster peer health show reports the new nodes' "Availability" as false.
We tried to move one of the destination volumes to one of the new nodes, and it is now unable to do SnapMirror updates, stating "Failed to create Snapshot copy snapmirror" and "CSM: An operation failed due to an ONC RPC failure". Any idea how to "fix" this? I think I read somewhere that a reboot/failover would solve this, but that seems a bit much. Isn't it possible to re-init the relationship somehow?

trim tartan
#

for those entries where cluster peer health show shows Availability as false, does it show the ping status as interface_reachable or unreachable?

slender wasp
#

yes it does

#

And it's only the cluster peer health show when run from the source that gives this information...

#

If we run it from the destination, the "problem" clusters are not shown on the new nodes...

#

so if I specify the -originating-node and -destination-cluster from the destination I get "There are no entries matching your query."... so a bit strange...

#

tried to update the passphrase on both sides, which fixed some of it... at the destination the cluster peer now shows as Available, but at the source it shows as Partial and I have one line with "Data: unreachable"

#

hmm, and of course now it's OK at both ends... seems like just updating the passphrase on one of the cluster peers triggered some kind of update, because now all the other peers are also OK...

#

would be nice to have an "update relationship" command 🙂

#

I can see that on our new destination (9.15.1P3) there is a "cluster peer update-checks" command; sadly it didn't help with this... but maybe it would if it were available on the source nodes (which are on a lower ONTAP version)...

grim delta
#

As you added nodes, you added all intercluster lifs of all nodes to the Cluster Peer relationship?

burnt lagoon
#

ONTAP adds them automagically. When you add nodes, add the intercluster LIFs, wait about 2-4 minutes, and "cluster peer show" should report the peer as Available. It requires all nodes on both sides to have an IC LIF to communicate with each other

#

last bunch of installs, just added the IC LIFs and the cluster peer healed itself
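
A minimal sketch of the rule mentioned above (every node on both sides needs an intercluster LIF), written as a standalone Python check rather than ONTAP commands. The `Lif` record, the node names, and the LIF names are all hypothetical and only illustrate the idea; in practice the inventory would come from whatever LIF listing you have for each cluster.

```python
from typing import Iterable, NamedTuple


class Lif(NamedTuple):
    """Hypothetical LIF record; not an ONTAP API object."""
    name: str
    home_node: str
    is_intercluster: bool


def nodes_missing_ic_lif(nodes: Iterable[str], lifs: Iterable[Lif]) -> list[str]:
    """Return the nodes that have no intercluster LIF homed on them."""
    covered = {lif.home_node for lif in lifs if lif.is_intercluster}
    return sorted(node for node in nodes if node not in covered)


# Made-up example: dst-04 was added to the cluster but only has a data LIF.
nodes = ["dst-01", "dst-02", "dst-03", "dst-04"]
lifs = [
    Lif("ic_dst01", "dst-01", True),
    Lif("ic_dst02", "dst-02", True),
    Lif("ic_dst03", "dst-03", True),
    Lif("nfs_dst04", "dst-04", False),
]

print(nodes_missing_ic_lif(nodes, lifs))  # ['dst-04']
```

Running the same check against both the source and the destination inventory would show whether a newly added node is the one without an IC LIF.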

slender wasp
# burnt lagoon ONTAP adds them automagically. When you add nodes, add the Intercluster LIFs. wa...

well, that was not what happened here... we had 5 clusters that reported Partial for two days before I (as described above) tried to update a passphrase on one of the relationships, which seemed to fix that one peer as well as the others shortly after... so go figure 🙂 BTW: our SnapMirrors worked just fine with the status as Partial until we moved a destination volume onto one of the new nodes in the cluster... which led me to this post 🙂

burnt lagoon
#

@slender wasp I don't know your setup (with 5 clusters), but you may be running into another issue. I think there is a document around (like a best practices guide or a KB article) that suggests creating a separate IPspace for each cluster relationship. I had a customer with three clusters and tried to put all IC LIFs in the Default IPspace. At some point along the way it failed, to the point that replication stopped. I went in and created two IPspaces on the origin cluster: one IPspace for Cluster A -> Cluster B and the other for Cluster A -> Cluster C. Once we did that, all problems went away
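
A small sketch of the "one IPspace per cluster relationship" layout described above, again as plain Python rather than ONTAP commands. The peer names and IPspace names (`ips_A_B`, `ips_A_C`) are hypothetical; the function only flags IPspaces that carry more than one peer relationship, which is the situation the customer setup started with.

```python
def ipspaces_shared_by_peers(peer_ipspace: dict[str, str]) -> dict[str, list[str]]:
    """Map each IPspace used by more than one peer to the peers sharing it."""
    by_ipspace: dict[str, list[str]] = {}
    for peer, ipspace in peer_ipspace.items():
        by_ipspace.setdefault(ipspace, []).append(peer)
    return {ips: sorted(peers) for ips, peers in by_ipspace.items() if len(peers) > 1}


# Hypothetical "before" layout: both relationships use the Default IPspace.
before = {"clusterB": "Default", "clusterC": "Default"}
# Hypothetical "after" layout: one IPspace per relationship.
after = {"clusterB": "ips_A_B", "clusterC": "ips_A_C"}

print(ipspaces_shared_by_peers(before))  # {'Default': ['clusterB', 'clusterC']}
print(ipspaces_shared_by_peers(after))   # {}
```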