#Issues with FlexGroup copying...

gleaming mica
#

We have a very simple FG consisting of two sub-volumes, one on each aggregate, and the two aggregates are on different controllers (in the same cluster, of course).
This has been working great for some time...
But recently we have been seeing timeouts in the cluster event log, like "Nblade.CifsOperationTimedOut", for these operations: "SMB2_COM_QUERY_INFO", "SMB2_COM_SET_INFO", "SMB2_COM_WRITE" and "SMB2_COM_CREATE"...
From the client side we can trigger it by copying a file of, say, 35GB. It will copy along just fine, then after 80% it will slow down, then stop at or near the end and stay there for a minute or two, then throw an error...
We have tried the same copy to another FG (on the same cluster) and we get the same issue.
We then tried to copy to a normal FlexVol.. which works just fine without any errors...
We are on 9.16.1P10

We have of course looked at the network and it all seems just fine. Admittedly, the cluster switches are not in a supported setup, yet it has been working just fine until now. The switches are Cisco N3K 3172, I think.
But again, we see no issues on the switch ports, and we are copying at about 400-500MB/sec, so nothing is getting "overloaded" 😉

Any suggestions are very welcome.

delicate hollow
#

do your constituents have enough free space for the file?

gleaming mica
# delicate hollow do your constituents have enough free space for the file?

Yes, they do, and the FG was also set to autogrow. I tried to disable autogrow because I saw in the debug logs that it kept growing and sometimes shrinking, but even with autoresize disabled on the FG it keeps doing this on the two sub-volumes. Seems very strange. I tried to copy using different "copy" tools, but they all seem to have the same issue. I am now planning to set up NFS and give that a try...