Hi,
We have set up SnapMirror active sync (SMAS) with Proxmox VE 9, and everything works as expected except for unplanned failover.
Before performing an unplanned failover, all paths to the four controllers are up and connected. However, when we recover the powered-off array, the paths to that array are removed from the Proxmox hosts while the SnapMirror synchronization starts.
We observe the following logs on the array:
"Sequence number: 34219
Description: This message occurs when a NVMe-oF target subsystem is hidden due to a SnapMirror(R) Active Synchronous relationship going out of sync. The NVMe-oF target subsystem is no longer accessible from the host.
Event: nvmf.subsystem.hidden: NVMe-oF target subsystem NQN nqn.1992-08.com.netapp:sn.ff8f298d2e7511f19254d039eae061a7:subsystem.NVME_PVE_SUBSYSTEM is hidden and reason code is 2.
Action: Restore NVMe-oF target subsystem access to the host by using the "snapmirror resync" command to bring the out-of-sync relationship back in sync. If "snapmirror resync" does not fix the issue, contact NetApp technical support"
My guess is that ONTAP hides the subsystem to avoid data corruption, preventing hosts from writing to the namespace while synchronization is still in progress. As a result, the Proxmox hosts remove the controllers and do not attempt to reconnect automatically.
We have to manually run the command nvme connect-all on the hosts to restore the paths.
Is this expected behavior?
NetApp states that, in case of an unplanned failover: "To recover lost I/O paths or update I/O path states on your hosts, you need to perform a storage/adapter rescan on the hosts after the primary storage cluster resumes operation."
https://docs.netapp.com/us-en/asa-r2/data-protection/snapmirror-active-sync-unplanned-failover.html
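In case it helps the discussion, here is a rough sketch of how the manual step could be automated on each host as a stopgap. It is not something from NetApp or Proxmox. It assumes that nvme connect-all can run without arguments (i.e. /etc/nvme/discovery.conf is populated), that four live paths are expected, and that the JSON from nvme list-subsys uses the "NQN", "Paths" and "State" keys, which may differ between nvme-cli versions. The function names and the EXPECTED_LIVE_PATHS constant are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: re-run `nvme connect-all` when fewer paths than expected are live."""
import json
import subprocess

# Assumption: one path per controller, four controllers in total.
EXPECTED_LIVE_PATHS = 4


def count_live_paths(node, nqn_fragment):
    """Count paths in state 'live' under subsystems whose NQN contains
    nqn_fragment. The JSON is walked generically because its layout
    differs between nvme-cli versions (top-level list vs. dict)."""
    if isinstance(node, list):
        return sum(count_live_paths(item, nqn_fragment) for item in node)
    if not isinstance(node, dict):
        return 0
    if "Paths" in node and nqn_fragment in str(node.get("NQN", "")):
        return sum(
            1
            for path in node["Paths"]
            if isinstance(path, dict) and path.get("State") == "live"
        )
    return sum(count_live_paths(value, nqn_fragment) for value in node.values())


def reconnect_if_degraded(nqn_fragment, expected=EXPECTED_LIVE_PATHS):
    """Query current paths and reconnect if some are missing.
    Returns the number of live paths seen before any reconnect."""
    out = subprocess.run(
        ["nvme", "list-subsys", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Treat empty output (no subsystems connected) as zero paths.
    data = json.loads(out) if out.strip() else []
    live = count_live_paths(data, nqn_fragment)
    if live < expected:
        # Same command we currently run by hand on the hosts.
        subprocess.run(["nvme", "connect-all"], check=True)
    return live


if __name__ == "__main__":
    print(reconnect_if_degraded("subsystem.NVME_PVE_SUBSYSTEM"))
```

Run as root from a timer or cron job, this would only reconnect when paths are actually missing. It would not tell us whether the path removal itself is expected, which is still the main question.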
Thanks!