#Slow performance between vSphere and NetApp


tepid marsh
#

Hello everyone,

I'm trying to narrow down some complaints from my colleagues regarding slow performance between our virtualized environment and our NetApp storage.

We're on vSphere 7.0.3 and have 2 AFF-A400 in 2 datacenters active-active. There are around 15 ESXi per DC, connected via Brocade FC switches. I went through the NetApp best practices and applied the parameters recommended.

Our devs and SysOps are complaining about low performance for their Postgres VMs (running RHEL 7, 8, 9). For example, one Postgres VM shows 25% iowait. One of them wrote a fio test which results in 600-700 IOPS.

Now I played with fio today and the results are quite different depending on the parameters (obviously).

The one the SysOps wrote (resulting in ~700 IOPS):
fio --name TEST --eta-newline=5s --filename=temp.file --rw=randrw --size=2g --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting

The one I wrote (resulting in around 22000 IOPS):
fio --name=random-write --ioengine=libaio --rw=randwrite --bs=4k --size=2g --numjobs=1 --iodepth=64 --runtime=60 --time_based --end_fsync=1 --direct=1 --fsync=1
(I know they are very different - just showing the problem of lacking a proper benchmark)

My first question is if someone can recommend some fio settings which are somehow realistic for some scenarios.

Second question is where I can check for bottlenecks or issues. ONTAP shows no issues or high load. ESXis are also happy. The switches also don't report any trouble.

Last question is what are the "common" IOPS our NetApp should be able to reach regularly - just so I know what's the theoretical maximum achievable.

If you need more details just let me know. Any hint is highly appreciated!

novel urchin
#

I don't have experience with fio, so I'll pass on that question. Is this the only VM experiencing slowness? I would run the same benchmark on another Linux VM to see if the numbers match. If you see no latency on the NetApp volume/LUN side, I would start with the switch ports and look for errors. Do the ports on the AFF or host side report any kind of errors, or are they absolutely clean? Do you have monitoring enabled on the Brocade switches? Any alerts? Have you tried vMotioning the VM to another host?

tepid marsh
#

it happens on several VMs on different hosts

the one I'm running the fio tests on is not used at all

I'll check the hardware tomorrow but the monitoring should be clean

opal gyro
#

Are the VMs on nfs mounted datastores or fc/vmfs datastores? Sounds like FC

Are the VMs using pass through fc, RDM?

Have you applied the fc settings to the VMs if needed?

Have you applied the MPIO/FC/NFS ESXi host settings?

Have you applied the settings manually or did you use ONTAP tools for VMware vsphere?

tepid marsh
#

FC, yes
no RDM

I applied the settings from the best practices manually to the ESXi hosts.

Are there additional settings for the VMs?

opal gyro
#

There is an ISO for the VMs included with OTV. I think that's mainly for failover, but if you are doing FC to the VMs there may be some tunings to apply. Are the VMs themselves using FC, or are they just hosted on the FC datastores?

Databases do much better when the VM can talk directly to the storage. With Ethernet, we would put the VM on an NFS datastore and then let the VM use iSCSI to talk directly to the NetApp for its LUN.

tepid marsh
#

right now we present the VMFS LUNs via FC to the ESXi hosts and the VMs are just put on the LUNs

zenith comet
#

--iodepth=1 is what kills performance in your benchmark. If your workload is single-threaded and uses an iodepth of 1 there is no way to make it faster
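
To put a number on the iodepth point, here is a minimal sketch based on Little's law (IOPS is roughly the number of outstanding I/Os divided by per-I/O latency). The function names are made up for illustration, and it assumes latency stays constant as the queue depth grows, which real storage does not guarantee.

```python
def implied_latency_ms(iodepth: int, iops: float) -> float:
    """Per-I/O latency implied by a measured IOPS figure at a given queue depth."""
    return iodepth * 1000.0 / iops


def max_iops(iodepth: int, latency_ms: float) -> float:
    """Upper bound on IOPS with `iodepth` I/Os in flight, each taking `latency_ms`."""
    return iodepth * 1000.0 / latency_ms


if __name__ == "__main__":
    # ~700 IOPS at iodepth=1 means each I/O (including its fsync) takes about 1.43 ms.
    print(round(implied_latency_ms(1, 700), 2))  # 1.43
    # Idealized ceiling if the same latency held with 64 I/Os in flight.
    print(round(max_iops(64, implied_latency_ms(1, 700))))
```

So a single-threaded test at iodepth=1 mostly measures round-trip latency per I/O, not what the array can deliver in aggregate.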

#

also, is the iowait value you see from within the VM or from the hypervisor? Metrics inside a VM often don't behave the way they do on physical hardware. Check for CPU overcommitment, for example; that tends to slow everything down once it goes above a certain threshold (you want to keep it as low as possible, around 2-4 or so is good)
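
A rough way to check the overcommitment figure, assuming the "2-4" above refers to the ratio of allocated vCPUs to physical cores on a host (that reading is an assumption, and the host numbers below are hypothetical):

```python
def vcpu_overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """Ratio of vCPUs allocated to powered-on VMs versus physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return total_vcpus / physical_cores


if __name__ == "__main__":
    # Hypothetical host: 32 physical cores, 96 vCPUs allocated across its VMs.
    print(vcpu_overcommit_ratio(96, 32))  # 3.0
```

The ratio alone does not prove contention; it only indicates which hosts deserve a closer look in the vSphere performance graphs.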

tepid marsh
#

yes, the iowait is inside the VM

according to the DB admin Postgres is usually writing sequentially and therefore they tested with iodepth=1...

zenith comet
#

but then IOPS is irrelevant because you're writing large chunks in sequence, and throughput becomes more important

#

700 IOPS with a 1 MB block size is 700 MB/s, for example
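
The arithmetic behind that, as a small sketch (the helper name is made up; it uses binary units, so 1 MiB = 1024 KiB):

```python
def throughput_mib_s(iops: float, block_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS rate and block size in KiB."""
    return iops * block_size_kib / 1024.0


if __name__ == "__main__":
    print(throughput_mib_s(700, 1024))        # 700.0 (700 IOPS at 1 MiB blocks)
    print(round(throughput_mib_s(700, 4), 2))  # 2.73 (the same 700 IOPS at 4 KiB blocks)
```

The same IOPS figure can therefore mean a few MiB/s or several hundred, depending entirely on the block size.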

#

again, without knowing the actual io pattern of the application, any benchmarks are meaningless.
If you want somewhat comparable synthetic benchmarks then do two: one with random IO (read or write or a mix of both) with small block sizes (4k, 8k) and high concurrency, to stress the storage with respect to random IOPS from multiple clients. This will give you the highest IOPS rating with low throughput (even 20000 IOPS at 4k is only 80 MB/s)
And one with large blocks (1m) and sequential access (again, read or write or both) to stress throughput. That will give you high MB/s values but obviously much lower IOPS
But any real-world scenario will always be a mix of these two, and if you suspect performance issues you first need to determine which domain (IOPS-bound or MB/s-bound) they are in.
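
As a sketch of the two profiles described above, this small Python helper assembles the fio command lines. The profile names, the `fio_command` helper, and the specific iodepth/numjobs values are illustrative assumptions, not recommended settings; the fio flags themselves are the same ones used earlier in the thread.

```python
import shlex
from typing import List


def fio_command(profile: str, filename: str = "temp.file", runtime_s: int = 60) -> List[str]:
    """Build a fio argument list for one of two synthetic profiles."""
    common = [
        "fio",
        f"--name={profile}",
        f"--filename={filename}",
        "--size=2g",
        "--ioengine=libaio",
        "--direct=1",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
    ]
    if profile == "random-iops":
        # Small blocks with many I/Os in flight: stresses random IOPS.
        # iodepth=32 and numjobs=4 are assumed values for illustration.
        return common + ["--rw=randrw", "--bs=4k", "--iodepth=32", "--numjobs=4"]
    if profile == "seq-throughput":
        # Large blocks with sequential writes: stresses MB/s.
        # iodepth=8 is likewise an assumed value.
        return common + ["--rw=write", "--bs=1m", "--iodepth=8", "--numjobs=1"]
    raise ValueError(f"unknown profile: {profile}")


if __name__ == "__main__":
    for name in ("random-iops", "seq-throughput"):
        print(shlex.join(fio_command(name)))
```

Running both against the same datastore gives one IOPS-oriented and one throughput-oriented number to compare, which is more telling than a single test with arbitrary parameters.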

#

there's a 3rd domain that is also linked to the first two, which is latency-bound. If you want to benchmark that (synthetically again), use small blocks (1k) and write-only tests (or, if your working set is small enough that it fits into the controller RAM, you can also use small blocks and read tests)

#

Yes, performance troubleshooting is hard, and it is very easy to be led down the wrong path and waste time because of synthetic benchmarks that do not represent the actual issues

#

and especially with databases we have seen customers increase their query performance 2-fold or even more by just optimizing the queries so that they can be parallelized better by the query optimizer

tepid marsh
#

thanks a million! that's very useful information

nova condor
#

There's also the "blackbox" that is Fibre Channel here. Not a lot of people understand queue depths and buffer credits in SAN networks. Add drivers to the voodoo and it's a relatively large PITA.

zenith comet
#

true, but unless you have 20+ km distances to cover, buffer credits shouldn't factor in (at least in today's SAN switches, they have large enough buffers by default)

nova condor
#

buffer usage also depends on a number of other things, such as what the targets can accept and misconfigured queue depths on the sources

#

modern switches can protect against slow-drain situations, but again, not a lot of people spend time monitoring such things

tepid marsh
#

trying to get access to our SAN switches today to figure this out

#

But seeing that they didn't even care for the best practices for the previous SAN I doubt anyone did any optimization 😄

nova condor
#

There are recommended switch OS versions and fw versions for SAN HBAs and such in the Hardware Universe... or the other thing... compatibility matrices in general... recommended queue depths in ESXi

#

buffer credits are most relevant on ISL links and more than a few percent of "running out" (depending on the switch vendor) on ISLs is suboptimal

tepid marsh
#

my current uneducated guess is that Postgres in a VM isn't a great idea and we have a few with really high loads

nova condor
#

Postgres actually does OK over NFS as well, but it depends on the load and on experience with NFS. It should work fine in a VM too if one follows best practices

zenith comet
#

I wouldn't consider FC SAN issues the most likely culprit. Don't spend too much time analyzing the FC SAN

#

I mean, yeah, check for port errors or repeated BB credit exhaustion; that's always a good idea. But if nothing obvious pops out, look elsewhere

tepid marsh
#

thanks for your input during the day!

problem is we need to prove it's not the storage and our team's standing in the company isn't the best

but as you mentioned earlier IOPS aren't the only metric to look at

the SysOps just took one fio test, ran it and blamed us for bad storage ("My HDD at home has better results!") - they couldn't even explain why they chose settings like iodepth=1

#

I could imagine there are some optimizations possible within the RHEL VM or postgres, some parameters like buffer, cache, blocksize, whatever - and then it runs better than ever

nova condor
#

yeah, i love such intracompany battles, hehe... but if you can monitor the io on the netapp side while tests are being done, then it's much easier to show what the problem is... system manager gui has good realtime (almost) graphs one could screenshot

zenith comet
#

I usually respond with a different FIO test on the same volume/LUN/namespace with other parameters that shows that the storage is capable of XX mb/s or XX iops and tell them "here, fix your software, these are the numbers you can theoretically reach" 😄

tepid marsh
#

maybe we'll do it 😄

#

I checked some Postgres Best Practices for virtualized environments and they're not following all of them

so maybe I'll play the ball back and point to that 😂

novel urchin
#

@tepid marsh, were you able to graph the LUN or volume performance in Active IQ, specifically latency? That will be a good indicator of whether the storage array is performing well and something else is out of whack on the app side. You can send a screenshot with the latency, IOPS and throughput of the backend LUN that is hosting the datastore where the VM resides.

tepid marsh
#

I'll send one tomorrow, but from what I remember the LUN was quite bored unless I fired a fio test with high iodepth; then it went up to around 27k IOPS

violet roost
#

Why are you trying to use fio to diagnose an application performance problem? (Source: Perf TSE4)

opal gyro
#

My guess is to rule out the storage as the problem and move the problem to the application 😁

tepid marsh
#

Test 1 (high Throughput around 230MB/s): fio --name=random-write --ioengine=libaio --rw=randrw --bs=1m --size=2g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1 --direct=1 --fsync=1
Test 2 (high IOPS around 8500): fio --name=random-write --ioengine=libaio --rw=randwrite --bs=4k --size=2g --numjobs=1 --iodepth=64 --runtime=60 --time_based --end_fsync=1 --direct=1 --fsync=1

tepid marsh
# violet roost Why are you trying to use fio to diagnose an application performance problem? (S...

our SysOps came up with some random fio test which resulted in horrible numbers and now blame our storage for bad performance
they did the test because users complain about slow Postgres databases
their test: fio --name TEST --filename=temp.file --rw=randrw --size=2g --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting
resulted in 750 IOPS and around 3000 KB/s

nova condor
#

Perhaps the easiest way to quell such discussions is to let them run tests while they watch the stats on the NetApp system. "Seeing is believing" works more often than not, but not always. It will almost always work with decision-makers, even when the technical guys have dug their heels in because they've "always used this vendor" or "always done things this way"

violet roost
#

Well it could easily be a bottleneck on the client end as well.

#

I've troubleshooted (is that a word?) many a perf problem for over a decade now.

#

Trust me when I say benchmarks do NOT work to troubleshoot performance.

#

Let's go back to the original problem:
[12:24 PM]
OP
SeriousMike: Hello everyone,

I'm trying to narrow down some complaints from my colleagues regarding slow performance between our virtualized environment and our NetApp storage.

We're on vSphere 7.0.3 and have 2 AFF-A400 in 2 datacenters active-active. There are around 15 ESXi per DC, connected via Brocade FC switches. I went through the NetApp best practices and applied the parameters recommended.

Our devs and SysOps are complaining about low performance for their Postgres VMs (running RHEL 7, 8, 9). For example, one Postgres VM shows 25% iowait.

#

Are these RDM LUNs or VMDKs in a FC datastore?

tepid marsh
#

the second: VMDKs via Fibre Channel

violet roost
#

Ok good.

#

Do you see latency in VMware logs?

#

Not logs...err vSphere perf graphs

tepid marsh
#

I'm already off work, but when I last checked there were no peaks

violet roost
#

Ok. So you've got latency in the VM itself.

#

If you can't find anything VM side, then what you need to do is probably add more VMDKs in more Volumes and LUNs. Usually 8 is enough. Sometimes what happens is i/o gets stuck on the Linux SCSI queue threads.

#

More threads = more redundancy.

tepid marsh
#

Is iowait inside the Linux VM relevant?

violet roost
#

Yeah.

#

It means it's waiting on something.

#

We had a problem with SCSI queuing in ONTAP when you had a single LUN, but that was fixed in 9.9.1. I've seen it with Linux Oracle systems. I'd imagine Postgres is similar.

tepid marsh
#

Also what I read in some Postgres best practice is to separate the logs from the database

They don't do that here

violet roost
#

Fail.

#

BTW I'm in US so it seems you're in EMEA. If you are needing help sooner maybe open a case?

tepid marsh
#

I'm still waiting for my actual storage admin to do his job and provide some checks 😄

I'm usually the vSphere admin

But looks like I need to request some access and do it myself

#

Thanks so far, Paul!

violet roost
#

You're welcome.

tepid marsh
#

is anyone using QoS policies on their volumes?

We "found out" we have a policy called "extreme-fixed" applied to all of our volumes and some of them hit the IOPS limit every now and then

our consultant said they only use them with SAP HANA.

our storage guy said he just applied it everywhere because he thought it was best practice and would prevent the NetApp from dying 😄

obviously those two never talked about it

I just disabled it everywhere to see if it makes any difference

violet roost
#

Oh goodness. That would do it.

#

Hopefully running those commands helped in that KB.

young sky
#

Yeah, these Service Levels / QoS Policies are unfortunately default when you create your volumes via System Manager. I always change it so I can manually choose my aggr because on most AFF I don't really need QoS.

last depot
#

OP any update? A couple of my data centers are configured similarly. I use the NetApp ONTAP Tools appliance and pair that with every vCenter instance.

#

There is an option to configure optimal settings per host. I forget the verbiage.

#

Also I have found my DB VMs definitely perform better when vCPU is NUMA aligned with the hardware CPU sockets.