#Not enough disk space on FlexGroup, but volume and aggregates are not full


grizzled solstice
#

At another customer I ran into a disk-usage problem and VMs stopped. The volumes are overcommitted by default but should be far from full. The system is on 9.3P21. Some VMs have stopped with a disk-full error from vSphere. Upon trying to enable compression:

The FlexGroup is split across aggr1 on both FAS01 and FAS02. I can't figure out why VM_Store claims there is no more disk space when there is still so much space left.

grizzled solstice
#

```
FAS2240::volume efficiency*> show -instance -volume VM_Store

                                 Vserver Name: hypervisor
                                  Volume Name: VM_Store
                                  Volume Path: /vol/VM_Store
                                        State: Enabled
                                       Status: Active
                                     Progress: -
                                         Type: Regular
                                     Schedule: -
                       Efficiency Policy Name: default
                             Compression Type: -
                       Blocks Skipped Sharing: 0
                         Last Operation State: Success
                 Last Success Operation Begin: Wed Sep 06 00:15:34 2023
                   Last Success Operation End: Wed Sep 06 00:27:50 2023
                         Last Operation Begin: Wed Sep 06 00:15:34 2023
                           Last Operation End: Wed Sep 06 00:27:50 2023
                          Last Operation Size: 282.3GB
                         Last Operation Error: -
                              Changelog Usage: 6%
                            Logical Data Size: 9.50TB
                           Logical Data Limit: 10PB
                         Logical Data Percent: 0%
                                   Queued Job: -
                 Stale Fingerprint Percentage: 61
                                  Compression: true
                           Inline Compression: false
                           Constituent Volume: false
                                Inline Dedupe: true
                              Data Compaction: true
            Cross Volume Inline Deduplication: false
        Cross Volume Background Deduplication: false
```

orchid cloak
#

Are there any qtrees that could have been mounted as the datastore?

grizzled solstice
#

No Qtrees

river crag
#

What does this tell you?
volume show -volume VM_Store* -is-constituent true

#

Your constituent "VM_Store__0007" is most likely full.

grizzled solstice
#

It's an old FAS. No more updates are possible.

#

FAS2240::> volume show -volume VM_Store -is-constituent true
There are no entries matching your query.

river crag
#

wait... how did you manage to install 9.3 on there? Isn't 9.1 the maximum?

grizzled solstice
#

the only one almost full is
VM_Store__0007 aggr1_FSAS_01 online RW 1.88TB 442.0GB 76%
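
A quick way to spot a tight constituent is to parse rows like the one above and flag anything over a threshold. This is only a sketch: the column order (volume, aggregate, state, type, size, available, used%) is assumed from that single pasted row, and `parse_row`, `tight_constituents` and the 70% threshold are illustrative names and values, not ONTAP features.

```python
import re

# Multipliers to convert ONTAP-style size strings ("1.88TB", "442.0GB") to GB.
UNITS = {"KB": 1 / 1024**2, "MB": 1 / 1024, "GB": 1.0, "TB": 1024.0, "PB": 1024.0**2}


def size_to_gb(text):
    """Convert a size string such as '1.88TB' to gigabytes."""
    match = re.fullmatch(r"([\d.]+)(KB|MB|GB|TB|PB)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def parse_row(line):
    """Parse one constituent row; the column order is an assumption."""
    name, aggregate, state, vol_type, size, avail, used = line.split()
    return {
        "name": name,
        "aggregate": aggregate,
        "size_gb": size_to_gb(size),
        "avail_gb": size_to_gb(avail),
        "used_pct": int(used.rstrip("%")),
    }


def tight_constituents(rows, threshold_pct=70):
    """Return the names of constituents at or above the used% threshold."""
    return [r["name"] for r in rows if r["used_pct"] >= threshold_pct]


row = parse_row("VM_Store__0007 aggr1_FSAS_01 online RW 1.88TB 442.0GB 76%")
print(row["name"], row["avail_gb"], row["used_pct"])
print(tight_constituents([row]))
```

Feeding it the full list of constituent rows shows at a glance which members are running out of room, even when the FlexGroup total still looks fine.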

#

I was told by the engineers that the FlexGroup would automatically balance between the nodes.

river crag
#

Not in ONTAP 9.3 🙂

#

Well, it tries to balance all writes over all constituents, but if it fails (because of a file which is too large, etc.) there is no way to rebalance it with 9.3. You need at least 9.10.1, or better 9.12.1.

grizzled solstice
#

Alright. Will try.

#

Is there a better way to have a big volume shared between two nodes than a FlexGroup?

oak thistle
#

I would just use FlexVols. FlexGroups didn't really get good for VM datastores until ONTAP 9.8.

flat garnet
#

OG1 is correct, ONTAP 9.1 was the last supported version on the FAS22xx, so you're basically running in a very unsupported config with 9.3 😄

#

so yeah, don't bother opening a ticket with NetApp support 🙂

oak thistle
#

No but why do you need it?

#

You essentially are still locked to it anyway because the constituent volumes underneath the FG are gonna limit total space.

#

You can't have one file span across multiple nodes.

#

(one file would be one VM in this case)
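
To illustrate the point: since a file lives entirely in one constituent, only that constituent's free space matters when a VM disk grows, no matter how much the FlexGroup has free in total. A minimal sketch, with made-up free-space numbers (not taken from this system) and a hypothetical helper `can_grow`:

```python
def can_grow(file_constituent, growth_gb, avail_gb_by_constituent):
    """Only the free space of the constituent holding the file counts."""
    return growth_gb <= avail_gb_by_constituent[file_constituent]


# Hypothetical free space per constituent in GB, purely for illustration.
avail = {
    "VM_Store__0001": 900.0,
    "VM_Store__0002": 850.0,
    "VM_Store__0007": 20.0,
}

total_free = sum(avail.values())
print(total_free)                                # FlexGroup-wide free space
print(can_grow("VM_Store__0007", 50.0, avail))   # VM on the tight constituent
print(can_grow("VM_Store__0001", 50.0, avail))   # VM on a roomy constituent
```

In this made-up case the FlexGroup reports plenty of free space overall, yet a VM whose disk sits on the tight constituent cannot grow by 50 GB, which matches the "disk full" symptom described in this thread.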

grizzled solstice
#

That's alright. AFAIK the whole thing with FlexGroups started because one FlexVol on NFSv3 can only be mounted via one IP on ESXi, while NFSv4 could use multiple IPs for load balancing. In the end it was all about performance: splitting the workload of one big volume across two nodes.

oak thistle
#

Ah you won't likely see a benefit on CPU level. The Xeon in 2240s was made in 2010 by Intel.

#

Disks...maybe if you have limited disk shelves.

grizzled solstice
#

Yep yep, I know. It's old stuff but should be able to handle at least some workload. If I remember correctly, the limitation was that one volume was mainly located on the disks/aggr attached to one controller and was mainly handled by that controller, instead of splitting the load between the two nodes.

oak thistle
#

Yeah.

#

If your workload isn't limited by CPU, you could have all aggregates on node 1 and have an active/passive setup, but given the limited CPU horsepower you probably need both.

grizzled solstice
#

I'm afraid to try that atm

oak thistle
#

You can do an aggr relocate. It just moves the aggr to the opposite node.

flat garnet
#

I would have just set up four FlexVol datastores, two per node, and manually balanced the VMs across them using SVMotion. I can imagine that poor little FAS struggling a lot to keep up with all the features it wasn't optimized for (dedup, FlexGroup, etc.).
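
For the manual balancing part, a simple greedy plan (largest VM first, always onto the least-filled datastore) is usually good enough as a starting point. This is a rough sketch only: the VM names, sizes and datastore names are invented, and it balances capacity, not I/O load.

```python
import heapq


def balance_vms(vm_sizes_gb, datastores):
    """Greedy placement: largest VM first, onto the least-filled datastore."""
    heap = [(0.0, name) for name in datastores]
    heapq.heapify(heap)
    placement = {name: [] for name in datastores}
    by_size = sorted(vm_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for vm, size in by_size:
        used, name = heapq.heappop(heap)      # datastore with least data so far
        placement[name].append(vm)
        heapq.heappush(heap, (used + size, name))
    return placement


# Hypothetical VMs (GB) and four datastores, two per node.
vms = {"vm-a": 400, "vm-b": 300, "vm-c": 250, "vm-d": 200, "vm-e": 150, "vm-f": 100}
datastores = ["n1_ds1", "n1_ds2", "n2_ds1", "n2_ds2"]

plan = balance_vms(vms, datastores)
for name in datastores:
    print(name, plan[name], sum(vms[v] for v in plan[name]))
```

The resulting plan would then be carried out by hand with Storage vMotion; busy VMs may still need to be spread out manually, since size says nothing about how much I/O a VM generates.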

grizzled solstice
#

why 2 per node? I was just about to say one per node

oak thistle
#

If you can get more than 100 MB/s out of it that's good.

grizzled solstice
#

They've got a bigger FAS8040 sitting around but no time for migration yet.

#

So I'm trying to do the best that's possible on the specs here.

flat garnet
#

Well, I can get close to 100 MB/s on my old FAS270 on a good day, but yeah, I can't imagine that FAS2240 showing good performance. Running 7-Mode on it would probably be a bit better.

grizzled solstice
#

do you mind elaborating why you would do four FlexVols (two per node) instead of one FlexVol per node?

flat garnet
#

Yeah, 1 FlexVol is probably enough to keep the FAS2240 busy. Depending on the system, you can parallelize a bit of the workload if you have multiple volumes per node, but I guess on the FAS22xx it doesn't really help much in any case...

oak thistle
#

So you can use all the vol affinities, which wouldn't be much I'd think.

flat garnet
#

exactly

#

I don't remember how many waffinities the 2240 series had, but it was not a lot, I'm sure 🙂

grizzled solstice
#

Alright, alright then. Let me see what engineering says and I will discuss it with them. Thanks a lot for the great input and insight.

#

for the sake of the discussion. those are the 2 sysstats

#

seems to me like the CPU is busy but not too much

flat garnet
#

obviously, the disks are your bottleneck

#

SATA disks really benefit from having many many many many spindles

#

CPU is almost never the issue, especially since that column in sysstat is neither an "average" nor a "busiest CPU", but something in between, so unless that approaches 100% for extended periods of time, it's not an issue

grizzled solstice
#

That might be it. I just have 18 disks (+2 spares) in the aggr. and 232GB cache per node. 24 disks are the max, but that won't get me anywhere here.
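
Back-of-the-envelope, the spindle argument looks like this. Everything here is an assumption for illustration: the per-disk figure is a common rule of thumb for 7.2k SATA drives rather than a measured value, two parity disks assumes a single RAID-DP group, and `estimate_random_iops` is just a hypothetical helper.

```python
def estimate_random_iops(total_disks, parity_disks, iops_per_disk):
    """Rough random-I/O ceiling: data disks times a per-disk rule of thumb."""
    data_disks = total_disks - parity_disks
    if data_disks <= 0:
        raise ValueError("no data disks left after parity")
    return data_disks * iops_per_disk


# Assumed values: 75 IOPS per SATA disk, 2 parity disks in the aggregate.
current = estimate_random_iops(total_disks=18, parity_disks=2, iops_per_disk=75)
maxed = estimate_random_iops(total_disks=24, parity_disks=2, iops_per_disk=75)
print(current, maxed)
```

Under those assumptions, going from 18 to 24 disks adds only a few hundred random IOPS, which fits the feeling that filling the shelf won't change much.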

oak thistle
#

Grab marshmallows. Them disks are toasty.

simple quarry
#

Personally, I think it’s safer/easier to use Storage DRS clusters instead of FlexGroups. You end up with larger volumes and way more control over where things end up.

grizzled solstice
#

They have one issue after another with FlexGroups. Now the cloning doesn't work.