#Rocky 8.8 hang while using NFSv4

feral whale
#

Hi Team!!!

A customer is using volumes served over NFSv4 from a FAS8300 on Rocky 8.8. The packet traces show that each Reply comes back very quickly, but there are long gaps before each Call is sent: 8 seconds, 10 seconds, 5 seconds, 2 seconds, and so on.

TAC is asking us to check the OS side, but I'm not an expert in operating systems, so that's difficult for me.

What should I look for on the OS side in such cases?

```
25  2024-01-15 16:00:55.721393  0.000354  10.1.48.242  2049  10.1.48.230  671   4578  NFS  V4 Reply (Call In 24) READDIR

30  2024-01-15 16:01:00.149444  2.776703  10.1.48.230  671   10.1.48.242  2049  222   NFS  V4 Call (Reply In 31) ACCESS FH: 0xe48909ca, [Check: RD LU MD XT DL]
31  2024-01-15 16:01:00.149687  0.000243  10.1.48.242  2049  10.1.48.230  671   242   NFS  V4 Reply (Call In 30) ACCESS, [Allowed: RD LU MD XT DL]
```
#
```
9   2024-01-15 16:00:28.649656  8.409304   10.1.48.230  671   10.1.48.242  2049  214  NFS  V4 Call (Reply In 10) GETATTR FH: 0x6cbe265a
10  2024-01-15 16:00:28.659207  0.009551   10.1.48.242  2049  10.1.48.230  671   354  NFS  V4 Reply (Call In 9) GETATTR

12  2024-01-15 16:00:38.716107  10.056890  10.1.48.230  671   10.1.48.242  2049  214  NFS  V4 Call (Reply In 13) GETATTR FH: 0x6cbe265a
13  2024-01-15 16:00:38.716262  0.000155   10.1.48.242  2049  10.1.48.230  671   234  NFS  V4 Reply (Call In 12) GETATTR

15  2024-01-15 16:00:44.367429  5.651160   10.1.48.230  671   10.1.48.242  2049  222  NFS  V4 Call (Reply In 16) ACCESS FH: 0x6cbe265a, [Check: RD LU MD XT DL]
16  2024-01-15 16:00:44.367626  0.000197   10.1.48.242  2049  10.1.48.230  671   242  NFS  V4 Reply (Call In 15) ACCESS, [Allowed: RD LU MD XT DL]
```
clever tundra
#

That's a ways back to pull a perf archive. If you can repro using more recent data we could review in a perf archive.

#

I'd open a support case.

sage coyote
#

It could be doing name lookups if you have NTFS qtrees somewhere, or if you have users that need to be looked up, etc. Also check numeric IDs vs. NFSv4 usernames, or even MTU issues. Hard to say without more data.
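
For anyone who wants a quick look at two of these from the client side, here is a minimal Python sketch. It assumes a Linux client where the `nfs` kernel module is loaded, and it only reads two sysfs files: the `nfs4_disable_idmapping` module parameter and the MTU of one interface. The interface name `eth0` and the helper names are hypothetical, so substitute whatever the host actually uses. It does not diagnose anything by itself; it just prints the current values.

```python
from pathlib import Path


def read_text_or_none(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def client_hints(iface, sys_root="/sys"):
    """Collect the NFSv4 idmapping switch and the MTU for one interface.

    sys_root is a parameter only so the function can be pointed at a test tree.
    """
    root = Path(sys_root)
    return {
        # "Y" means the client sends numeric IDs instead of user@domain strings.
        "nfs4_disable_idmapping": read_text_or_none(
            root / "module/nfs/parameters/nfs4_disable_idmapping"
        ),
        # MTU as configured on the interface (a string such as "1500" or "9000").
        "mtu": read_text_or_none(root / "class/net" / iface / "mtu"),
    }


if __name__ == "__main__":
    # "eth0" is a placeholder interface name (assumption).
    for key, value in client_hints("eth0").items():
        print(f"{key}: {value}")
```

A value of `None` just means the file was not there, for example because the `nfs` module is not loaded or the interface name is wrong.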

clever tundra
#

Hmm. Were perf archives pulled? If you have a case number I can easily look.

#

(Perf TSE here)

clever tundra
#

Hmm. Are delegations enabled?

feral whale
#

I'm sorry, what are delegations?

#

English is not my native language, I'm so confused lol

clever tundra
#

Poor Performance with NFSv4.x Delegations Enabled

feral whale
#

I will check docs just give me a minute

feral whale
#

(In reply to clever tundra: "Poor Performance with NFSv4.x Delegations Enabled")

However, this symptom is not limited to just one instance of Rocky Linux; it occurs across dozens of hosts using Rocky Linux.

Upon initial connection attempts, there is a delay in response time, but once connected successfully, subsequent response times are faster.

clever tundra
#

Hmm.

#

We'd probably need more data, but from the analysis so far this definitely sounds like a client issue.

#

What happens if you mount NFSv3?

feral whale
#

I haven't tried an NFSv3 mount yet, but I think I could give it a try. There's no OS administrator for the client and I'm handling everything up to the OS myself, so this community seems to be the only place where I can ask for advice.

#

Oh, I actually did test with NFSv3 in NetApp Log # 2009864935, and even then, the response time was slow as well.

clever tundra
#

That's weird.

#

Did you get tcpdumps from the client end?

#

The only other thing I could think of is a MITM (man in the middle) device like a firewall/proxy. Try bypassing anything like that.

feral whale
#

Yes, I had collected tcpdumps from both the storage side and the client side and uploaded them to the case.

NetApp Log # 2009903851

clever tundra
#

Weird. Not seeing them uploaded.

feral whale
#

Give me a sec lol

clever tundra
#

Mainly I'd want to know whether the client-side trace shows the same gaps.
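
To compare the two captures line for line, a small Python sketch like the one below could pull out the frames that follow a long gap. It assumes the traces are exported as whitespace-separated summary lines in the same column order as the ones pasted earlier in this thread (frame, date, time, delta, source IP, source port, destination IP, destination port, length, protocol, info), and it assumes the delta column is the time since the previous frame. The function name and the 1-second threshold are illustrative choices, not anything from the case.

```python
import re

# One summary line: frame, date, time, delta, src IP, src port,
# dst IP, dst port, length, protocol, info (column order assumed from the thread).
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+\.\d+)\s+"
    r"(\S+)\s+(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.*)$"
)


def slow_frames(text, threshold=1.0):
    """Return (frame, delta_seconds, info) for frames whose delta exceeds threshold."""
    hits = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank lines and anything that is not a summary line
        delta = float(m.group(4))
        if delta > threshold:
            hits.append((int(m.group(1)), delta, m.group(11).strip()))
    return hits


# Three lines copied from the trace posted above.
SAMPLE = """\
9   2024-01-15 16:00:28.649656  8.409304   10.1.48.230  671   10.1.48.242  2049  214  NFS  V4 Call (Reply In 10) GETATTR FH: 0x6cbe265a
10  2024-01-15 16:00:28.659207  0.009551   10.1.48.242  2049  10.1.48.230  671   354  NFS  V4 Reply (Call In 9) GETATTR
12  2024-01-15 16:00:38.716107  10.056890  10.1.48.230  671   10.1.48.242  2049  214  NFS  V4 Call (Reply In 13) GETATTR FH: 0x6cbe265a
"""

if __name__ == "__main__":
    for frame, delta, info in slow_frames(SAMPLE):
        print(f"frame {frame}: {delta:.3f}s gap before {info}")
```

Running the same function over the client-side export and the filer-side export would show whether the long gaps sit before the Calls in both captures.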

feral whale
#

Hm, but what's puzzling is that the storage and the client talk to each other directly on the same subnet, without passing through a firewall. I'm not sure what the problem could be.

As mentioned in the original text, it shows different response times each time. I'll capture it again and upload it.

#

The text highlighted in yellow indicates the client response time. As you can see, the response time varies each time, such as 8 seconds, 5 seconds, 10 seconds, 2 seconds, and so on.

clever tundra
#

Hmm.

#

Is that the client trace or filer trace?

feral whale
#

I'm not sure.. I definitely uploaded the tcpdumps to the case, but I can't see the files I saved on my laptop, nor can I see them in the case records... lol. If I get the chance, I might have to request the client to collect them again tomorrow.

#

Where are my files!!!! lol

rough dust
#

Just curious: have you set

sunrpc.tcp_max_slot_table_entries
and
sunrpc.tcp_slot_table_entries

to 128? That may help.
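
A minimal Python sketch for checking the current values on a client, assuming a Linux host where the `sunrpc` module is loaded so that the two sysctls appear as files under `/proc/sys/sunrpc`. The function names are illustrative, and 128 is simply the target suggested above. The sketch only reads the values and does not change them.

```python
from pathlib import Path

PARAMS = ("tcp_slot_table_entries", "tcp_max_slot_table_entries")


def read_slot_tables(base="/proc/sys/sunrpc"):
    """Return {param: int or None}; None means the file is missing or unreadable."""
    values = {}
    for name in PARAMS:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # e.g. sunrpc module not loaded
    return values


def below_target(values, target=128):
    """List the parameters whose current value is known and lower than target."""
    return [name for name, v in values.items() if v is not None and v < target]


if __name__ == "__main__":
    current = read_slot_tables()
    for name, value in current.items():
        print(f"sunrpc.{name} = {value}")
    print("below 128:", below_target(current))
```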

tacit vault
#

I did some testing today with a minimal install of Rocky Linux 8.9 and did not see any issues. I didn't realize the version difference until now, so I will try 8.8 tomorrow.

clever tundra
#

I just found out about a kernel parameter in RHEL 8, and I wonder whether it carries over to other Linux distros.

tacit vault
#

Sorry, I have been sick the last couple of days and have not been able to test, but I will follow up on Monday.
