#Recently my customer had Servers complaining about NFS mounts hanging/delayed responses
What is the version of NFS, v4 or v3?
Can we see /etc/fstab to see what mount options are used?
Hi David and Nick, I may need to point them to this Discord. They are reviewing the KB now.
It's v3 and v4.
Yea send them in! We'd love to have them, not to cut you out, but to work more directly with them and get to root cause faster!
I'm wondering if it's the block size specification (or lack thereof) in the mount options being used
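For anyone who wants to check the negotiated options rather than just what is in /etc/fstab, here is a minimal sketch that pulls the NFS version and the rsize/wsize values out of /proc/mounts-style text. The function name `parse_nfs_mounts` is made up for illustration, and it assumes the usual Linux layout of source, mount point, filesystem type, and comma-separated options.

```python
def parse_nfs_mounts(mounts_text):
    """Return one dict per NFS mount found in /proc/mounts-style text."""
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: source, mount point, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value  # flag-style options such as "rw" get an empty value
        results.append({
            "source": fields[0],
            "mountpoint": fields[1],
            "fstype": fields[2],
            "vers": opts.get("vers"),    # None if the option is not listed
            "rsize": opts.get("rsize"),
            "wsize": opts.get("wsize"),
        })
    return results
```

On a client, passing it `open("/proc/mounts").read()` should list what each mount actually ended up with, which can differ from what fstab requested.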
Do you have clients connecting to same volume with both NFSv3 and v4?
Are you by chance running the latest Red Hat 8 update?
I had a customer recently update Red Hat 8, and afterwards a lot of mounts hang. A trace was sent to Red Hat and it was determined to possibly be a kernel bug related to ssh. The suggestion was to downgrade autofs for now.
If it's an ssh bug, why would downgrading autofs resolve it? Do you have a Red Hat bug number? Which version of RHEL 8? Which specific kernel version? Thanks!
Apparently the interaction between the two is why. There are kernel hooks in everything. Just passing along what Red Hat advised the customer to do.
If I recall the conversation, 8.6 worked fine. When they updated to the latest, NFS mounts started to hang. After tracing the client, Red Hat determined it was an interaction between autofs and ssh in the kernel and suggested downgrading autofs.
We ran into an issue recently where DNS was failing to resolve one of the server names in the first rule of a volume's export policy. It prevented all the servers further down the export list from being able to mount, even though they were resolving in DNS fine. Removing that server entirely or using its IP address fixed the problem.
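A quick way to spot this kind of problem is to run every hostname from the export policy through a resolver and list the ones that fail. This is only a sketch: `find_unresolvable` is a hypothetical helper, and it uses whatever DNS the machine running it is configured with, which may not match what the storage system sees.

```python
import socket

def find_unresolvable(hostnames, resolve=socket.getaddrinfo):
    """Return the hostnames that the resolver cannot look up."""
    failed = []
    for name in hostnames:
        try:
            resolve(name, None)  # port is irrelevant, only the lookup matters
        except socket.gaierror:
            failed.append(name)
    return failed
```

The `resolve` parameter is there so the check can be pointed at a different lookup function if needed. Any name it returns is a candidate for removal from the rule or replacement with an IP address.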
I've also had success with mount problems by changing the mount options to force a specific version of NFS until I find one that works: v3, v4, or v4.1. Sometimes one will work but not the others. I also use the NetApp's IP address instead of its hostname to rule out DNS wherever possible.
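To make the try-each-version approach repeatable, something like the following could generate the test mount commands. It is a sketch under assumptions: the IP address, export path, and mount point are placeholders, `build_mount_commands` is a made-up name, and the commands are only built here, not executed.

```python
def build_mount_commands(server_ip, export, mountpoint,
                         versions=("3", "4.0", "4.1")):
    """Build one 'mount' command per NFS version, using the server's IP."""
    commands = []
    for vers in versions:
        commands.append([
            "mount", "-t", "nfs",
            "-o", f"vers={vers}",          # pin the protocol version
            f"{server_ip}:{export}",       # IP address, so DNS is not involved
            mountpoint,
        ])
    return commands

# Placeholder values for illustration only
for cmd in build_mount_commands("192.0.2.10", "/vol1", "/mnt/test"):
    print(" ".join(cmd))
```

Each command would be run as root one at a time, with an unmount in between, noting which versions hang and which do not.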
that helps - thanks. I believe we're looking at upgrading to 8.6 or 8.7 soon. I hope you're referring to RHEL 9.x that has the bug.
Unfortunately no. They say 8.6 was fine. I think they were staying on RHEL 8 and did updates. After rebooting they had issues. One 8.6 server remained and seemed fine. A case was opened with Red Hat and with NetApp. Tracing was done all over the place. Nothing, absolutely nothing, showed up in the NFS traces. But the kernel tracing on the Red Hat side showed an issue. That is when they decided to suggest downgrading autofs.
Thanks again!
Is the case still open? It would be good to make sure that is documented in our KB.
For what it's worth:
The problem looks like it was in the current version of autofs 1:5.1.4-83. We downgraded our non-prod servers to 1:5.1.4-82 and we have been unable to reproduce the problem.
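If you want to flag hosts that are on the release mentioned above, a rough check might look like this. The "suspect" version comes only from this thread, not from a Red Hat advisory, and the helper names are invented for illustration. The `rpm` query uses the standard `EPOCH`, `VERSION`, and `RELEASE` tags.

```python
import subprocess

# Version reported as problematic in this thread (not an official advisory)
SUSPECT = ("1", "5.1.4", "83")

def parse_evr(evr):
    """Split 'epoch:version-release' into its three parts."""
    epoch, sep, rest = evr.partition(":")
    if not sep:                      # no epoch given, treat as 0
        epoch, rest = "0", evr
    version, _, release = rest.partition("-")
    return epoch, version, release

def is_suspect(evr):
    """True if the package matches the release reported in the thread."""
    epoch, version, release = parse_evr(evr)
    # Compare only the leading release number, ignoring suffixes like '.el8'
    return (epoch, version, release.split(".")[0]) == SUSPECT

def installed_autofs_evr():
    """Ask rpm for the installed autofs epoch:version-release string."""
    out = subprocess.run(
        ["rpm", "-q", "--qf", "%{EPOCH}:%{VERSION}-%{RELEASE}", "autofs"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()
```

On an RPM-based host, `is_suspect(installed_autofs_evr())` would tell you whether that box is on the release we downgraded from.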
OK, please let the NetApp case owner know so he can put it into our KB system.
That knowledge will benefit other customers.
If you have a Red Hat bugz# or case# and can share it, please do so!
@lone sail Do you know if a KB article ever got written on this? We're getting ready for RHEL 8.7 and don't want to trip over it.
I don't have the case number. I asked here and pinged @odd fulcrum earlier today but haven't heard back yet.