Hi to all
We have 4 SG5760 + 4 SG5712, but one of the nodes shows an unusual increase in used space. We opened a ticket with NetApp and they sent us the command "rebalance-data start", but after running for a few days it did nothing: usage is still high on only that one node. Has anyone had the same or a similar problem, or any idea how to solve it? Any help would be appreciated.
#Storage GRID only one node more than 90% used
check your ILM policies... and there was at least one bug for certain ILM situations in 11.8 ...
I have only 2 policies and both have "ingest behavior: Balanced", firmware version is 11.7.0.8
we have performed a few rebalance exercises and they do take a long time
the command can take weeks; can you check if it is still running?
rebalance-data status --site sitename
That command finished; it took about a week, but nothing changed.
Yeah, a faster and better rebalance would be great
Rebalance only works on EC data… In the node's storage details you can see how much storage is consumed by replicated data and how much by EC data…
If you go into the Node and click on the Storage tab and hover over the graph, is the data mostly EC data or Replica data? Rebalance is for EC data only and won't touch replicas.
To rebalance replica data you'd mostly need to decommission that node and then re-add it to the grid. But then that node is completely empty, so it's not a perfect solution.
One could play with the storage grades and pools in ILM, which will move data internally, but since there aren't a lot of nodes to start with, the result will probably be just as lopsided.
changing the composition of existing storage pools by messing with the storage grades won't move existing EC data. The fragments won't be evacuated from the nodes no longer in the storage pool. I know this because I'm currently using Storage Grades and Storage Pools to evacuate data off nodes before we decommission them. I worked with Engineering and lab'd this several times to prove it. You have to change the EC Profile (so basically, a new ILM Rule and Policy applied) to get the data to move. Just changing the existing pools isn't enough to move all existing data, unfortunately.
i think the OP here is dealing with non-EC data, fwiw.
and yes, I've done this as well as we migrated out of model-based storage grades. EC profile changes weren't a complete success either, fwiw.
It's only EC data... no replication data.
I'm a colleague of @pastel thicket and to my knowledge, we're only using the grid as a backup destination and for FabricPool, and both are using EC rules (2+1).
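For anyone following along, here is a rough sketch of what a 2+1 EC rule means for space, in case it helps when reading the per-node numbers. It is generic erasure-coding arithmetic, not StorageGRID's actual accounting: the function name `ec_footprint` is made up for illustration, and real grids add metadata and padding on top, so treat the figures as approximate.

```python
def ec_footprint(object_bytes: int, k: int, m: int) -> dict:
    """Approximate footprint of one object under a k+m erasure-coding scheme.

    Hypothetical helper for illustration only; ignores metadata and padding.
    """
    # The object is split into k data fragments; m parity fragments
    # of the same size are added, each stored on a different node.
    fragment_bytes = -(-object_bytes // k)  # ceiling division
    total_bytes = fragment_bytes * (k + m)
    return {
        "fragments": k + m,
        "fragment_bytes": fragment_bytes,
        "total_bytes": total_bytes,
        "overhead_ratio": total_bytes / object_bytes,
    }


# A 100 MiB object under 2+1: three fragments on three different nodes.
info = ec_footprint(100 * 1024**2, k=2, m=1)
print(info)
```

So with 2+1, each object only touches three of the eight nodes, and which three are chosen is what decides whether usage ends up even or lopsided.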