#workload dashboard


idle ember
#

Hi, all panels in my workload dashboard are showing no data. The required templates are enabled in zapiperf and qos_* metrics are being collected. Is there anything else I need to do to make the workload dashboard work? We use Harvest 23.11, ONTAP 9.10.1P14 and Grafana 10.2.1. Thanks!

hexed bloom
idle ember
#

Here are the lines that were commented out

```
  Workload:                 workload.yaml
  WorkloadDetail:           workload_detail.yaml
  WorkloadVolume:           workload_volume.yaml
  WorkloadDetailVolume:     workload_detail_volume.yaml
```
hexed bloom
#

thanks! Your logs show the metrics are being collected without errors. Any chance you didn't import the latest dashboards? Some of the metric names for workloads changed in 23.08, so if you are using 23.11 pollers with earlier dashboards, Grafana will make the wrong queries to Prometheus

idle ember
#

i did import the 23.11 dashboard with overwrite

#

let me delete the workload dashboard and re-import

#

delete/re-import did not fix the problem

hexed bloom
#

Do the variable dropdowns at the top of the dashboard have values?

idle ember
hexed bloom
#

those are sad screenshots 🙂
let's check Prometheus - can you try a query like this? Change cluster to one you have enabled workloads for:
qos_read_ops{cluster="umeng-aff300-01-02"}
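The same check can be scripted against the Prometheus HTTP API instead of the web UI. This is a minimal sketch, assuming Prometheus is reachable over plain HTTP and exposes the standard `/api/v1/query` endpoint; the function names `build_query_url`, `count_series`, and `check_metric` are hypothetical helpers, not part of Harvest.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, metric, cluster):
    """Build an instant-query URL for Prometheus' /api/v1/query endpoint."""
    promql = f'{metric}{{cluster="{cluster}"}}'
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def count_series(payload):
    """Return how many series an instant-query response contains."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return len(payload["data"]["result"])


def check_metric(base_url, metric, cluster):
    """Run the query against a live Prometheus and return the series count."""
    url = build_query_url(base_url, metric, cluster)
    with urlopen(url, timeout=10) as resp:
        return count_series(json.load(resp))
```

Calling `check_metric("http://localhost:9090", "qos_read_ops", "umeng-aff300-01-02")` (the address is an assumption) should return a non-zero count if the pollers are exporting workload metrics for that cluster.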

idle ember
hexed bloom
#

good good. Now back to Grafana: hover over the first panel in the Workload dashboard and press e to edit that panel, then Query inspector, then press Refresh. Do the query and the result look right?

idle ember
#

0 objects returned

hexed bloom
#

what if you change the Workload variable to All?

idle ember
#

it is ALL

#

will a zoom session help? 😄

hexed bloom
#

let's try changing WorkloadClass to All also

idle ember
#

got 500 error after changing workloadclass to all

#

sorry my bad. ignore above

#

workload=~"()"

hexed bloom
#

and workload comes from this var and uses workloadClass

idle ember
#

query only returned ALL

hexed bloom
#

I strongly suspect this is a Grafana breaking change. I'm running 8.1.8 and you are running 10.2.1. The Harvest team has not validated that our dashboards work with 10.X. I've got to jump into a meeting shortly, but will take a look afterwards

idle ember
#

👍

hexed bloom
idle ember
#

ok.

hexed bloom
#

Good news, that dashboard works for me in 10.2.1. The empty workload query that you pasted above is the result of the TopQOSreadOps variable. What do you see for that variable in Preview of values?

idle ember
#

It only has ALL

wooden galleon
#

@idle ember Can you share output of qos_workload_labels metric from prometheus for this cluster?

#

If this is empty then please share logs consisting of lines Zapi:QosWorkload
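When sifting poller logs for a collector such as Zapi:QosWorkload, a small filter can summarise what each collector reported. This is a rough sketch, assuming the "Collected" log line layout that appears later in this thread (`collector=... instances=... metrics=...`); the helper name `collected_counts` is hypothetical.

```python
import re

# Matches the key=value fields of a Harvest "Collected" log line.
COLLECTED = re.compile(
    r"collector=(?P<collector>\S+) instances=(?P<instances>\d+) metrics=(?P<metrics>\d+)"
)


def collected_counts(log_text):
    """Map collector name to (instances, metrics) from 'Collected' log lines."""
    counts = {}
    for match in COLLECTED.finditer(log_text):
        counts[match.group("collector")] = (
            int(match.group("instances")),
            int(match.group("metrics")),
        )
    return counts
```

A collector that is missing from the returned mapping never logged a "Collected" line, which usually means it is not enabled for that poller.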

idle ember
#

Zapi:QosWorkload was missing in default.yml. After adding it, most panels have data now. Only QOS FIXED UTILIZED % still no data. qos_ops and qos_total_ops are being collected.

hexed bloom
#

does your zapi/default.yaml have QosPolicyFixed: qos_policy_fixed.yaml?

idle ember
#

It did not. but after adding it, still no data.
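Since several fixes in this thread came down to objects that were missing or commented out in a collector's default.yaml, a quick scan of the file can list what is not enabled. This is a simplified sketch that reads the file line by line rather than with a YAML parser; `enabled_objects` and `missing_objects` are hypothetical helpers, and the list of required names is only what was mentioned in this thread.

```python
import re

# An enabled object looks like "Name:   template.yaml" and is not commented out.
OBJECT_LINE = re.compile(r"([A-Za-z0-9_]+):\s+\S+\.yaml\b")


def enabled_objects(yaml_text):
    """Collect object names from uncommented 'Name: template.yaml' lines."""
    names = set()
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out objects are not collected
        match = OBJECT_LINE.match(stripped)
        if match:
            names.add(match.group(1))
    return names


def missing_objects(yaml_text, required):
    """Return the required object names that are absent or commented out."""
    return sorted(set(required) - enabled_objects(yaml_text))
```

For the ZAPI collector the names discussed here are QosWorkload and QosPolicyFixed; for ZapiPerf they are Workload, WorkloadDetail, WorkloadVolume, and WorkloadDetailVolume.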

hexed bloom
#

those look good - is it possible the data hadn't updated when you checked the dashboard? If you reload that dashboard, is that panel still blank? I confirmed that it works fine on Grafana 10.2.1.

idle ember
#

reloaded and it is still blank😕

hexed bloom
#

phooey - can you check your TopFixedQOSIOPsPercent variable - do you have anything in preview there?

idle ember
#

Only has ALL

hexed bloom
#

is it possible that no fixed QoS policy groups are defined on the cluster in question?

idle ember
#

i don't think so. How do I check? BTW, the fixed qos panels are blank for all our clusters

hexed bloom
#

try qos policy-group show from the ONTAP CLI

idle ember
#
```
Name              Vserver                 Class        Wklds Throughput            Is Shared
----------------- ----------------------- ------------ ----- --------------------- ---------
extreme-fixed     flc1-noprod-ash-storage user-defined 0     0-50000IOPS,1.53GB/s  false
performance-fixed flc1-noprod-ash-storage user-defined 0     0-30000IOPS,937.5MB/s false
value-fixed       flc1-noprod-ash-storage user-defined 0     0-15000IOPS,468.8MB/s false
3 entries were displayed.
```
hexed bloom
#

great that explains why you aren't seeing any fixed in the panel

idle ember
#

oh but why we get fixed metrics?

hexed bloom
#

you're right, I should have said that those are fixed policy-groups but they have not been applied to any workloads. The Wklds column is zero, and that also matches your screenshot of the Prometheus metrics that have object_count="0". The panel in question shows the top qos_ops that have a fixed policy group applied to them, and in your case there are no workloads with a fixed policy group applied
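The Wklds check can also be done programmatically on pasted CLI output. This is a minimal sketch, assuming each data row of `qos policy-group show` splits into exactly six whitespace-separated fields (Name, Vserver, Class, Wklds, Throughput, Is Shared) as in the output above; `workload_counts` and `unused_policies` are hypothetical helper names.

```python
def workload_counts(cli_output):
    """Map policy-group name to its Wklds count from `qos policy-group show` text."""
    counts = {}
    for line in cli_output.splitlines():
        fields = line.split()
        # Header, separator, and footer lines fail one of these two checks.
        if len(fields) == 6 and fields[3].isdigit():
            counts[fields[0]] = int(fields[3])
    return counts


def unused_policies(cli_output):
    """Return policy groups that have no workloads assigned."""
    return sorted(name for name, n in workload_counts(cli_output).items() if n == 0)
```

If `unused_policies` returns every policy group, the fixed QoS panels have nothing to show, which matches the object_count="0" values seen in Prometheus.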

idle ember
#

So you are saying that we are not using or enforcing QOS fixed or adaptive at all?

hexed bloom
#

My read of your CLI output is that, since the number of workloads (Wklds column) is zero for all policies, no workloads have a policy applied. @pine flicker is that a correct read?

pine flicker
#

Sorry on a P1.

#

0 workloads on those QoS policies. You might do qos adaptive show

idle ember
#
```
Name        Vserver                 Wklds Expected IOPS Peak IOPS    Minimum IOPS Block Size
----------- ----------------------- ----- ------------- ------------ ------------ ----------
extreme     flc1-noprod-ash-storage 0     6144IOPS/TB   12288IOPS/TB 1000IOPS     ANY
performance flc1-noprod-ash-storage 0     2048IOPS/TB   4096IOPS/TB  500IOPS      ANY
value       flc1-noprod-ash-storage 0     128IOPS/TB    512IOPS/TB   75IOPS       ANY
3 entries were displayed.
```
#

0 workload too

idle ember
#

I enabled qos fixed and adaptive on another cluster but the panels still show no data.

```
Name              Vserver              Class        Wklds Throughput            Is Shared
----------------- -------------------- ------------ ----- --------------------- ---------
extreme-fixed     flc1-poc-ash-storage user-defined 36    0-50000IOPS,1.53GB/s  false
performance-fixed flc1-poc-ash-storage user-defined 1     0-30000IOPS,937.5MB/s false
value-fixed       flc1-poc-ash-storage user-defined 1     0-15000IOPS,468.8MB/s false
3 entries were displayed.

flc1-poc-ash-storage::qos> qos adaptive show
Name        Vserver              Wklds Expected IOPS Peak IOPS    Minimum IOPS Block Size
----------- -------------------- ----- ------------- ------------ ------------ ----------
extreme     flc1-poc-ash-storage 1     6144IOPS/TB   12288IOPS/TB 1000IOPS     ANY
performance flc1-poc-ash-storage 0     2048IOPS/TB   4096IOPS/TB  500IOPS      ANY
value       flc1-poc-ash-storage 0     128IOPS/TB    512IOPS/TB   75IOPS       ANY
3 entries were displayed.
```

Logs
```
2023-12-04T16:15:43Z INF collector/collector.go:510 > Collected Poller=flc1-poc-ash-storage apiMs=94 calcMs=0 collector=Zapi:QosPolicyFixed instances=12 metrics=96 parseMs=1 pluginMs=0
2023-12-04T16:15:43Z INF collector/collector.go:510 > Collected Poller=flc1-poc-ash-storage apiMs=118 calcMs=0 collector=Zapi:QosPolicyAdaptive instances=3 metrics=27 parseMs=1 pluginMs=0
2023-12-04T16:15:43Z INF collector/collector.go:510 > Collected Poller=flc1-poc-ash-storage apiMs=191 calcMs=0 collector=Zapi:QosWorkload instances=110 metrics=330 parseMs=6 pluginMs=0
```
wooden galleon
#

@idle ember There is an issue where the QoS Fixed panels do not display workloads that have an admin SVM QoS policy applied. I have opened an issue for this, https://github.com/NetApp/harvest/issues/2530, and a fix via PR https://github.com/NetApp/harvest/pull/2532
Can you try importing the dashboard from https://github.com/NetApp/harvest/blob/a55cb91327f19634e094d981669bb577e51d8b6c/grafana/dashboards/cmode/workload.json and see if this fixes the issue?
Thanks for reporting!

idle ember
#

@wooden galleon yes the updated workload.json fixed the issue. Thanks!

hexed bloom
#

great! thanks for the confirmation