#Dashboard Performance


near axle
#

I wanted to discuss dashboard performance. I have been unhappy with the dashboard loading times for some time now. I have 22 clusters monitored, and the SVM dashboard takes up to 20 seconds before I get what I want. I strongly suspect the large number of variables that each run their own query to be the culprit. Can you please explain what they are needed for? Why not leave the query logic in the panels?

visual valve
#

The SVM dashboard in Grafana uses numerous hidden variables to compute the top n results.

Grafana's design only sends queries for panels that are open, while queries for panels in collapsed rows are not sent. However, all variables, including hidden ones, are initialized and their queries are sent when a dashboard loads or a filter changes, irrespective of the visibility of the panels using them.

To improve loading times, consider keeping only essential rows open in the SVM dashboard and collapsing the rest.
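To get a feel for how many variable queries a dashboard fires on load, here is a minimal sketch that counts them in an exported dashboard JSON. It assumes the usual Grafana export layout, where variables live under `templating.list` and `hide: 2` marks a fully hidden variable. The `sample` dashboard and its variable names are made up for illustration and are not taken from the real SVM dashboard.

```python
def count_variable_queries(dashboard: dict) -> dict:
    """Count template variables; only those of type "query" hit the datasource."""
    counts = {"visible_query": 0, "hidden_query": 0, "other": 0}
    for var in dashboard.get("templating", {}).get("list", []):
        if var.get("type") != "query":
            counts["other"] += 1          # e.g. custom, constant, interval
        elif var.get("hide", 0) == 2:     # 2 = variable hidden from the UI
            counts["hidden_query"] += 1
        else:
            counts["visible_query"] += 1
    return counts


# Hypothetical, heavily trimmed dashboard used only to show the idea.
sample = {
    "templating": {
        "list": [
            {"name": "Cluster", "type": "query", "hide": 0},
            {"name": "SVM", "type": "query", "hide": 0},
            {"name": "TopResources", "type": "custom", "hide": 0},
            {"name": "TopReadOpsV3", "type": "query", "hide": 2},
            {"name": "TopReadOpsV4", "type": "query", "hide": 2},
        ]
    }
}

print(count_variable_queries(sample))
```

To run it against a real export, load the file with `json.load` and pass the resulting dict to `count_variable_queries`.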

near axle
#

But why are the Top n results computed separately?

visual valve
#

We may be able to improve some of the filter load times by moving the Datacenter, Cluster, and SVM variables to use svm_labels instead of volume_labels, but that may not be enough.

near axle
#

OK, I see the point. Could you think about a simplified way to calculate the top n? What about calculating it just once, based on volume IOPS and throughput, and then using that result for all protocols?

visual valve
#

That would be very different from what we do today. Currently, each panel shows its own top n depending on the metric it represents. Taking NFS v3 and v4 as an example, we want the top n for each protocol's own dataset rather than a top n derived from some other metric, which could lead to wrong results.

query_result(topk($TopResources, avg_over_time(svm_nfs_read_ops{datacenter=~"$Datacenter",cluster=~"$Cluster",svm=~"$SVM",nfsv="v3"}[${__range}])))

query_result(topk($TopResources, avg_over_time(svm_nfs_read_total{datacenter=~"$Datacenter",cluster=~"$Cluster",svm=~"$SVM",nfsv="v4"}[${__range}])))
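A toy illustration of why one shared top n can mislead a per-protocol panel. The SVM names and numbers below are invented, and summing the two protocols merely stands in for whatever single metric a shared ranking would use.

```python
def top_n(values: dict, n: int) -> list:
    """Return the n keys with the largest values, largest first."""
    return sorted(values, key=values.get, reverse=True)[:n]


# Hypothetical average read ops per SVM, split by NFS version.
nfs_v3 = {"svm_a": 900, "svm_b": 50, "svm_c": 10}
nfs_v4 = {"svm_a": 5, "svm_b": 40, "svm_c": 700}

# A single ranking computed once across both protocols.
combined = {svm: nfs_v3[svm] + nfs_v4[svm] for svm in nfs_v3}

shared_top = top_n(combined, 1)   # what every panel would show
v4_top = top_n(nfs_v4, 1)         # what the v4 panel shows today

print("shared:", shared_top, "v4 only:", v4_top)
```

With these numbers the shared ranking picks svm_a, so a v4 panel driven by it would hide svm_c, which is the busiest SVM for v4.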

near axle
#

I agree the approach would be different and may lead to less accurate results, but the usability would improve a lot. From my point of view, the top n is there to show the top talkers on the cluster. The protocol doesn't matter so much in the first place. Maybe you could consider it...

visual valve
#

Sure. Can you open a feature request on GitHub regarding SVM dashboard performance? This will invite feedback from other users who might be interested in a similar protocol-independent top n. We'll also explore other ways to improve the performance of the SVM dashboard via that request.

visual valve
#

@near axle Do you have many rows open in the SVM dashboard? That will also add to the dashboard load time.

near axle
#

No, just the standard ones

abstract cradle
#

@near axle to clarify, do you still want a topN? I ask because dropping it would speed up the dashboard (probably significantly).

As Rahul mentioned, originally we tried using topN in each panel individually, but it was near useless. Our current approach of hidden vars that are used in the panel queries is a workaround. Unfortunately, it's the workaround suggested by the Prometheus and Grafana teams.

near axle
#

I see the advantages of topN for easily finding the top talkers. Did you have a look at the @ modifier as an alternative solution? https://prometheus.io/blog/2021/02/18/introducing-the-@-modifier/
BTW VictoriaMetrics has the perfect solution for this: topk_<max/avg/last/median>. Too bad Prometheus won't pick this up.

abstract cradle
#

yes! we looked at VM last time we dug into this and that does seem like a better option. last time we looked at @ it was disabled by default so we couldn't use it, but we'll check if that has changed

#

thanks! looks like @ is working locally on some Prom instances and not on others that need to be upgraded. Looks promising!

visual valve
#

Thanks @near axle ! As of version 2.33, modifiers such as @ are part of the stable features of Prometheus. I've updated the SVM dashboard to use them, and the preliminary results are promising. While we're reviewing the PR (https://github.com/NetApp/harvest/pull/2553), you can test the updated dashboard at https://raw.githubusercontent.com/NetApp/harvest/4d01b420f9ca227bb7791c030e899cd03e7d8f86/grafana/dashboards/cmode/svm.json. Please let us know your feedback.
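For readers unfamiliar with the @ modifier, here is a rough sketch of the general pattern from the Prometheus blog post linked earlier: the ranking is evaluated once at the end of the selected range with `@ end()`, and `and` keeps only those series in the panel. This is an assumption about the general technique, not a copy of what the PR does; the helper name `pinned_topk_query` is hypothetical, and the metric and label matchers are borrowed from the queries earlier in this thread.

```python
def pinned_topk_query(metric: str, matchers: str, n: str = "$TopResources") -> str:
    """Build a PromQL panel query whose top n set is fixed at the end of the range."""
    selector = f"{metric}{{{matchers}}}"
    # avg_over_time over the whole dashboard range, evaluated only at end()
    ranking = f"topk({n}, avg_over_time({selector}[${{__range}}] @ end()))"
    # "and" keeps just the series that appear in the pinned ranking
    return f"{selector} and {ranking}"


matchers = 'datacenter=~"$Datacenter",cluster=~"$Cluster",svm=~"$SVM",nfsv="v3"'
query = pinned_topk_query("svm_nfs_read_ops", matchers)
print(query)
```

Because the ranking lives inside the panel query, a pattern like this would not need a hidden variable per panel, which is where the load-time saving would come from.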

near axle
#

Test looks great. Thanks for picking this up so fast! 👍

visual valve
#

Thanks for the feedback!