#Same Qtree is shown as multiple resources

fluid hull
#

Harvest: 24.04.02-nightly

tall carbon
#

hi @fluid hull we've seen this happen when the instance label for a metric changes. The instance label includes the poller's Prometheus port, which Prometheus uses to scrape. You can verify that the instance label changed by going to Prometheus and typing
count by (instance,volume,qtree) (quota_files_used{volume="appdata_mms",qtree="appdata963"})
Do you see more than one row there? In the last six months, did the port for this poller change? That would cause this problem.
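
A possible variation on that check, as a sketch only: an instant query returns just the series that are currently being scraped, so an instance label that changed months ago may not appear as a second row. Wrapping the selector in count_over_time includes every series that reported samples within the window. The 30d window is an assumption chosen for illustration and has to fit inside your Prometheus retention.

```promql
# Sketch: how many distinct instance values reported this qtree in the last 30 days.
# The 30d window is an illustrative assumption, not a Harvest recommendation.
count by (volume, qtree) (
  count by (instance, volume, qtree) (
    count_over_time(quota_files_used{volume="appdata_mms", qtree="appdata963"}[30d])
  )
)
```

A result greater than 1 would suggest that the same qtree was exported under more than one instance label during that window.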

fluid hull
#

In the Query? Here?
Or can we access Prom in our Nabox?

tall carbon
#

you can access prom from your nabox. Example: if nabox is https://10.216.33.135/login use https://10.216.33.135/prometheus/

fluid hull
tall carbon
#

when you typed the query, can you double check the qtree and volume names by picking them from the completion dropdown that Prometheus shows when typing like this?

fluid hull
#

Yes, both work

tall carbon
#

thanks for confirming

#

What if you try
count by (instance,svm,volume,qtree) (quota_files_used{volume="appdata_mms"})

fluid hull
#

Ran the queries on our production env:

fluid hull
#

@tall carbon can you help us fix this, or can't this be fixed?

lunar yacht
#

@fluid hull One possible reason for this is that all types of quotas are enabled in the template.
Can you please confirm which quotas have been enabled so we can rule this possibility in or out?
you can check here for rest template: https://github.com/NetApp/harvest/blob/main/conf/rest/9.12.0/qtree.yaml#L30
you can check here for zapi template: https://github.com/NetApp/harvest/blob/main/conf/zapi/cdot/9.8.0/qtree.yaml#L38

The panels in the dashboard show data at the qtree level, but quota_disk_used is a quota metric, so it can have multiple records for the same qtree, volume, and svm. I've attached a screenshot where the same qtree has multiple records because group and user quotas belong to it.
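
One way to look for this case, sketched under the assumption that the volume name from earlier in the thread applies, is to count how many quota_disk_used series exist per qtree at the current instant:

```promql
# Sketch: number of quota_disk_used series per qtree right now.
# A value above 1 means several series share the same svm, volume, and qtree.
count by (svm, volume, qtree) (quota_disk_used{volume="appdata_mms"})
```

Because this is an instant query, a count above 1 more likely points to multiple quota records for the qtree than to an instance label that changed in the past, although two pollers exporting the same cluster at once would look the same.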

Also, in your screenshot of the count by query, I could not see the counts on the right side. Can you share it again with those counts visible?

fluid hull
lunar yacht
#

As per your screenshot, it doesn't look like you have many quotas belonging to the same qtree, so that rules out the possibility I mentioned yesterday.
That means Chris's initial suspicion is likely the case here: the metrics have different ports, which causes the same qtree object to show up as multiple instances.

fluid hull
#

How can I check if what Chris suspected is true?
And can I actively avoid that on my side?

fluid hull
spiral fable
#

@fluid hull Do you still have this issue? If yes, are the SVMs the same for all these duplicate qtrees displayed in the panel? If so, then to check whether the poller port has changed over time, let's focus on the panel Top $TopResources Qtrees by Disk Used. If you update its panel query to the one below, does it fix the issue with the duplicate qtrees for this panel?

(
  label_replace(
    quota_disk_used{
      datacenter=~"$Datacenter",
      cluster=~"$Cluster",
      svm=~"$SVM",
      volume=~"$Volume",
      qtree=~"$Qtree"
    },
    "instance", "", "instance", ".*"
  )
)
and
topk(
  $TopResources,
  avg_over_time(
    label_replace(
      quota_disk_used{
        datacenter=~"$Datacenter",
        cluster=~"$Cluster",
        svm=~"$SVM",
        volume=~"$Volume",
        qtree=~"$Qtree"
      },
      "instance", "", "instance", ".*"
    )[3h:] @ end()
  )
)
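
The query above uses label_replace with an empty replacement, which removes the instance label so that series differing only by poller port line up as one. As an alternative sketch, not the query used in the Harvest dashboard, an aggregation can drop the same label. The qtree name is taken from earlier in the thread.

```promql
# Sketch: drop the instance label by aggregating over it.
# max is an arbitrary choice here; it only matters if two instances report at the same time.
max without (instance) (quota_disk_used{qtree="appdata963"})
```

One difference in behavior: if two instances export the same qtree at the same moment, the aggregation merges them into one series, whereas label_replace can fail because it would produce duplicate label sets.
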
fluid hull
#

Will try the new query and give you feedback soon

fluid hull
#

Our production system with older Harvest

#

Our clone of production with the current Harvest. On the left is the new query from @spiral fable, on the right the current query.

spiral fable
#

@fluid hull Has the new query I provided fixed the issue with duplicate records in the panel?

fluid hull
#

The new query shows only data for a few hours

#

And on the right side, the original query doesn't apply the filter from the variable

#

And it is the same SVM

spiral fable
#

@fluid hull Are you saying that with the new query you get only one record as a result, as shown in the screenshot? For me, data is available for a longer range.

#

I have updated the queries in all panels of the Qtree dashboard with these changes. Could you import this dashboard and share feedback?

fluid hull
#

will try and report back.

fluid hull
#

As soon as I change the query, the results in those panels shrink to a few hours. And I don't know why 😄

#

The Panels above work fine

spiral fable
#

This is strange. Let's try the query below in a Grafana panel for test purposes and check the results.

(
  label_replace(
    quota_disk_used{
      qtree=~"appdata963"
    },
    "instance", "", "instance", ".*"
  )
)
and
topk(
  5,
  avg_over_time(
    label_replace(
      quota_disk_used{
        qtree=~"appdata963"
      },
      "instance", "", "instance", ".*"
    )[3h:] @ end()
  )
)
fluid hull
#

I have found one qtree that shows all the data.
I will try to find out what the difference between the two qtrees is.
The first difference I have noticed is that the one that works is on an HA pair, and the one with this strange behaviour is on a MetroCluster.

#

I can confirm that all our MetroClusters show the same issue and all our HA pairs don't

spiral fable
#

Thanks @fluid hull. It means we don't have relevant qtree data for MetroCluster in Prometheus. Were these clusters added recently? How far back is data available in Prometheus for the MetroClusters? Do other metrics (such as volume, etc.) also have the same issue as qtree for MetroClusters? If you could share the logs with us, as well as the MetroCluster poller name, we can check. We don't have any special handling in Harvest for qtrees related to MetroCluster.

For log collection, please refer to: NetApp Harvest Log Collection.

fluid hull
#

No other metrics with the same problem are present, or maybe they just haven't been discovered yet.
I will try to collect logs and send them to you @spiral fable if you tell me which logs. Only the Harvest container? Or more?

spiral fable
#

Thanks @fluid hull. We have identified the issue. I'll share an updated Qtree dashboard with you shortly over email.

spiral fable
#

@fluid hull I have shared the updated dashboard via email. Please try it and see if it fixes this issue.

fluid hull
#

I will try the new dashboard and report back 🙂

fluid hull
#

Seems to work perfectly now for me

spiral fable
#

Great. Thanks!

fluid hull