We started a SnapMirror of 105TB of file data (about 145.000.000 inodes). The transfer seems to have completed, but the status is still "Transferring" and has been for 24 hours now...
With "sysstat" we can see that the system is reading and writing some data, but there is no network traffic... While the transfer was actually running we saw reads/writes and traffic in/out of the NICs at speeds close to the NICs' max (800-900 MB/s). But is SnapMirror consolidating? (sis status shows that it is not running)
The destination nodes are only set up for SnapMirror of this one volume... both source and destination are FAS2700 nodes with NL-SAS disks... It seems a bit strange that it does not complete... and even more strange that we cannot see the progress... (snapmirror show -instance shows nothing useful).. also there are no apparent errors in the event logs... any hints? 😉
#How to see SnapMirror progress...
do a snapmirror show -instance on one of the relationships.
or filter it down to specific parameters - https://docs.netapp.com/us-en/ontap-cli/snapmirror-show.html#parameters
Yep. But it shows nothing useful.. just "Transferring"...
snapmirror show -fields total-progress,percent-complete-cur-status
that just shows - ?
if it's a flexgroup it's slightly different.
`FS135-DKAAR1::> snapmirror show -fields total-progress,percent-complete-cur-status
source-path                  destination-path                  total-progress  percent-complete-cur-status
FS05-DKAAR1:BG3DLOG_Archive  FS05-BACKUP:BG3DLOG_Archive_dest  0B              -`
it is actually a flexgroup...
As mentioned, "sysstat" shows that no data is moved over the net and has not been for over 24 hours
`FS135-DKAAR1::*> snapmirror show -expand -relationship-group-type flexgroup
                                                                       Progress
Source            Destination  Mirror         Relationship  Total              Last
Path        Type  Path         State          Status        Progress  Healthy  Updated
FS05-DKAAR1:BG3DLOG_Archive
            XDP   FS05-BACKUP:BG3DLOG_Archive_dest
                               Uninitialized
                                              Transferring  0B        false    02/17 19:18:48
FS05-DKAAR1:BG3DLOG_Archive__0001
            XDP   FS05-BACKUP:BG3DLOG_Archive_dest__0001
                               Uninitialized
                                              Transferring  104.9TB   true     02/20 19:51:45
FS05-DKAAR1:BG3DLOG_Archive__0002
            XDP   FS05-BACKUP:BG3DLOG_Archive_dest__0002
                               Uninitialized
                                              Idle          -         true     -
3 entries were displayed.`
The source and destination volumes are the same size... but no snapshots on the destination yet..
snapmirror show -fields total-progress, percent-complete-cur-status -expand
shows the same as above... 104.9TB... that's it
and a blank for progress?
FS135-DKAAR1::*> node run -node FS135-DKAAR1-01 -command "sysstat -u 1"
 CPU  Total    Net kB/s       Disk kB/s   Tape kB/s  Cache  Cache    CP  CP_Ty                     Disk
      ops/s    in   out    read   write  read write    age    hit  time  [T--H--F--N--B--O--#--:]  util
 25%      0     2     1   18220   31228     0     0    10s    94%   15%  0--0--0--0--0--0--0--0     28%
 29%      0     2     1   20252       0     0     0    10s    96%    0%  0--0--0--0--0--0--0--0     24%
 25%      0     2     1   21544       0     0     0    10s    97%    0%  0--0--0--0--0--0--0--0     25%
 27%      0     1     1   19004       0     0     0     7s    97%    0%  0--0--0--0--0--0--0--0     19%
 27%      0     2     1   22040       0     0     0     7s    96%    0%  0--0--0--0--0--0--0--0     17%
 26%      0     2     1   21260       0     0     0     7s    97%    0%  0--0--0--0--0--0--0--0     19%
 24%      0     2     1   19628       0     0     0     7s    97%    0%  0--0--0--0--0--0--0--0     18%
 28%      0     2     2   29096   29576     0     0     7s    97%   29%  0--1--0--0--0--0--0--0     23%
 31%      0     1     1   19668  130816     0     0     7s    97%  100%  0--0--0--0--0--0--0--1     36%
 25%      0     2     1   17500  115504     0     0     7s    98%   97%  0--0--0--0--0--0--0--0     34%
 24%      0     2     1   18484    1896     0     0    11s    98%   10%  0--0--0--0--0--0--0--0     19%
 27%      0     2     1   21924       0     0     0    11s    98%    0%  0--0--0--0--0--0--0--0     19%
 32%      0     2     1   24124       0     0     0    11s    97%    0%  0--0--0--0--0--0--0--0     22%
 26%      0     1     1   21324       0     0     0    11s    97%    0%  0--0--0--0--0--0--0--0     18%
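To put numbers on "reading locally but nothing on the wire", here is a minimal Python sketch that averages the network and disk columns of pasted `sysstat -u` rows. It assumes the 13-column layout seen in the output above (CPU, total ops, net in/out, disk read/write, tape read/write, cache age, cache hit, CP time, CP type, disk util); the function names are made up for illustration and are not part of any NetApp tooling.

```python
from statistics import mean

# Three data rows copied from the sysstat output in this thread.
SAMPLE = """\
25% 0 2 1 18220 31228 0 0 10s 94% 15% 0--0--0--0--0--0--0--0 28%
29% 0 2 1 20252 0 0 0 10s 96% 0% 0--0--0--0--0--0--0--0 24%
31% 0 1 1 19668 130816 0 0 7s 97% 100% 0--0--0--0--0--0--0--1 36%
"""


def parse_sysstat_row(line):
    """Pick the kB/s columns out of one 'sysstat -u' data row.

    Assumes the 13-column layout shown above; other sysstat modes differ.
    """
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    return {
        "net_in": int(fields[2]),
        "net_out": int(fields[3]),
        "disk_read": int(fields[4]),
        "disk_write": int(fields[5]),
    }


def summarize(text):
    """Average each kB/s column over all non-empty rows."""
    rows = [parse_sysstat_row(l) for l in text.splitlines() if l.strip()]
    return {key: mean(row[key] for row in rows) for key in rows[0]}


if __name__ == "__main__":
    for name, value in summarize(SAMPLE).items():
        print(f"{name}: {value:.0f} kB/s")
```

Network averages of a few kB/s next to disk reads around 20 MB/s fit the picture of local processing rather than an active transfer, though the numbers alone don't say which process is doing the reading.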
`FS135-DKAAR1::*> snapmirror show -fields total-progress, percent-complete-cur-status -expand
source-path                        destination-path                        total-progress  percent-complete-cur-status
FS05-DKAAR1:BG3DLOG_Archive        FS05-BACKUP:BG3DLOG_Archive_dest        0B              -
FS05-DKAAR1:BG3DLOG_Archive__0001  FS05-BACKUP:BG3DLOG_Archive_dest__0001  104.9TB         -
FS05-DKAAR1:BG3DLOG_Archive__0002  FS05-BACKUP:BG3DLOG_Archive_dest__0002  -               -
3 entries were displayed.`
It looks to me like the system is reading, somewhat slowly... as mentioned, this system does not have any other workloads...
...or can "load" be disk scrubs?
could, but also a background process.
`FS135-DKAAR1::*> aggr scrub -aggregate FS135_DKAAR1_01_NL_SAS_1 -action status
Raid Group:/FS135_DKAAR1_01_NL_SAS_1/plex0/rg0, Is Suspended:true, Percentage Completed:1%
Raid Group:/FS135_DKAAR1_01_NL_SAS_1/plex0/rg1, Is Suspended:true, Percentage Completed:1%`
No scrub...
But is it normal that it takes this long to complete the SnapMirror (after the data has been transferred)? Should we just give it a few days?
Depends on size and pipe for sure.
i have seen PB size flexgroups take a week to seed.
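As a rough sanity check on "size and pipe", the sketch below estimates the wire time for the numbers mentioned in this thread (105TB at 800-900 MB/s). It assumes decimal units and a perfectly sustained rate, and it ignores per-inode overhead and any post-transfer processing, so treat it as a lower bound rather than a prediction.

```python
def transfer_hours(size_tb, rate_mb_per_s):
    """Hours needed to move size_tb terabytes at rate_mb_per_s megabytes/second.

    Assumes decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes) and a constant rate.
    """
    size_bytes = size_tb * 1e12
    rate_bytes_per_s = rate_mb_per_s * 1e6
    return size_bytes / rate_bytes_per_s / 3600


if __name__ == "__main__":
    for rate in (800, 900):
        print(f"105 TB at {rate} MB/s: {transfer_hours(105, rate):.1f} h")
```

That works out to roughly 32 to 36 hours for the data movement alone, so whatever happens after the network traffic stops comes on top of that.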
ok.. I only know it's a volume of 105TB and it has about 145.000.000 inodes
or it's stuck somehow.
but hopefully this is just the initial phase... later hourly updates will be as normal? 😉
anything in the ems logs about the snapmirror
the only errors I have in the logs are from the hourly update schedule which complains because it cannot update until the relationship has completed the transfer...
I think we will give it a few days...
yeah.
or open a case and dig into it.
a total progress of 104.9TB on a 105TB volume makes it look like it's stuck somehow?
You can make the progress units more granular. That might help determine if it's moving: 'set -units MB' or whatever units you want.
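To see why finer units help: a value displayed as "104.9TB" only changes once another 0.1TB has moved. The sketch below converts two progress readings to bytes and reports the difference. It assumes the sizes are shown in binary multiples (1TB = 1024^4 bytes), which should be verified for your ONTAP version, and `to_bytes` / `progress_delta` are illustrative helper names, not ONTAP commands.

```python
# Assumption: sizes are displayed in binary multiples (1 TB = 1024**4 bytes).
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3,
         "TB": 1024**4, "PB": 1024**5}


def to_bytes(text):
    """Convert a size string such as '104.9TB' or '0B' to bytes.

    Returns None for '-' (no value reported).
    """
    text = text.strip().upper()
    if text == "-":
        return None
    # Check two-letter suffixes before the bare 'B'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * UNITS[suffix])
    raise ValueError(f"unrecognized size: {text!r}")


def progress_delta(before, after):
    """Bytes moved between two readings, or None if either is missing."""
    b, a = to_bytes(before), to_bytes(after)
    if b is None or a is None:
        return None
    return a - b


if __name__ == "__main__":
    # With one decimal in TB, anything below this step is invisible:
    print("smallest visible step at 0.1TB:", to_bytes("0.1TB") // 1024**3, "GB")
    # The same two readings in TB look identical; in MB they would not.
    print("delta:", progress_delta("104.9TB", "104.9TB"), "bytes")
```

If two readings taken some minutes apart in MB still give a delta of zero, the transfer phase is probably over and the relationship is doing post-transfer work.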
But if it's at the end of the transfer and total-progress isn't moving, you may be in Finalizing. There are scenarios when FGs don't report "Finalizing" but that's what they're doing. https://kb.netapp.com/on-prem/ontap/DP/SnapMirror/SnapMirror-KBs/SnapMirror_stuck_in_Finalizing_state_in_ONTAP
Volumes with high dedup savings will typically have longer Finalizing phases (particularly for the init)
The SnapMirror progress shows 104.9TB while the volume has now grown from a usage of 105TB to 106.1TB... so apparently it grows slightly.. but still no snapshot on the destination volumes... and it is not stuck in Finalizing because the status is still Transferring... 🙂 I guess it just needs some time to do whatever... 🙂
yep. if it has a ton of inodes, is pretty large and such, it will take a while to complete
we had one of our ~50tb volumes take over 10 days for initial sync.
That volume had over 2.3bil files though
Because this is an FG, there is a chance it's in Finalizing but still reporting Transferring. It's a reporting issue with FG SnapMirrors today.
Great success, it's all done now... 🙂 it roughly took 30 hours from when it stopped sending data...
It may be doing something with efficiencies also.
Try a “vol efficiency show”
I did a "sis show" which didn't show anything running... not sure if it's actually the same thing...
OK, then the hourly update does the same thing... a lot of transfers, then "not a lot"... but the status is still Transferring as it seems to just read and write local data... (and no, "volume efficiency" is not running...) hopefully this doesn't take several hours to "process" 😉
...OK maybe right now isn't the best time, because it looks like it is catching up with the snapshots done on the source while the initial transfer was run... so it looks like the processing time isn't that long for the hourlys...
....and it's done. we are happy campers 😉
Yeah, the way SnapMirror works is that it sends a baseline first. When that finishes, it immediately moves on to transferring each snapshot individually, with a short finishing phase after each one that quickly flips back to Transferring, until all the snapshots are transferred.