#Looking for specific metrics ...

solid lotus
#

Using nabox (4.0.7) there is a lot of information in the various dashboards, which is just great! But, sometimes I get a bit lost 🙂 E.g. I'm pretty sure I once saw a panel with QoS metrics/info. ... But I can't find it now.
Similarly I was hoping that I might find a Dashboard/Panel/Graph showing ONTAP CP (Consistency Point) information, but no luck so far. Is there such a thing in nabox?

cloud gyro
#

@solid lotus You can find CP in the Disk dashboard, as below

#

For QoS metrics, the Workload dashboard should have them.

solid lotus
#

Aren't there also different types of CPs in WAFL / ONTAP?

solid lotus
# cloud gyro

Found the QoS too, thank you. Looks as if it is turned off by default. So we'll have to think about if we should enable it or not ...

cloud gyro
#

That's 2 metrics in that panel as below

#

Only 2 types of wafl CP are shown there.

solid lotus
#

Hmm, actually what I see here is both axes labelled "write latency", and no sign of any counts or a y-axis on the right-hand side

#

Underneath the graph is just a list of write latencies per aggr, but no other info.

cloud gyro
#

If you click on Back-to-back CP count that should show CP

cloud gyro
#

Could you share a screenshot of how this panel looks?

solid lotus
#

I can probably send a screenshot. I have to tx it out of ... where it is. Inspect JSON shows me references to CP counts, so it seems that there should be more info. ... Maybe the graph is just being rendered incorrectly somehow

#

4.0.7

cloud gyro
#

I meant to ask for the Harvest version, as shown here in NABox

solid lotus
#

How would I do that? (sorry - I'm (not yet) a Grafana professional :-/)

solid lotus
#

Aha. Harvest version is 24.11.1

#

It was imported via the nabox upgrade page to replace the version included with 4.0.7

#

Reimporting from GitHub won't work in any case; the site has no direct access to the internet

cloud gyro
#

You can copy the JSON as well

solid lotus
#

Hmm. Found the import function, but it doesn't seem to want to overwrite the existing definition.

#

Maybe it would be easier to just upgrade the whole thing to 4.0.8?

cloud gyro
#

Okay

#

NABox 4.0.8 will not help

solid lotus
#

I see this (sorry about the hover-popup)

cloud gyro
#

OK could you share your Grafana version?

#

I see the issue. I'll share fix.

#

As a workaround for now, could you modify the panel query to the one below, if NABox allows that?

sum(wafl_cp_count{datacenter=~"$Datacenter",cluster=~"$Cluster",node=~"$Node",metric=~"back_to_back_CP|deferred_back_to_back_CP|back_to_back_cp|deferred_back_to_back_cp"})
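A rough illustration of what the `metric=~"..."` matcher in that query selects, written as a minimal Python sketch. It assumes Prometheus regex matchers are anchored to the whole label value, which `re.fullmatch` approximates here. The function name `matches` and the sample values that do not appear in the query are made up for illustration.

```python
import re

# Pattern copied from the metric=~"..." matcher in the workaround query.
PATTERN = (
    "back_to_back_CP|deferred_back_to_back_CP|"
    "back_to_back_cp|deferred_back_to_back_cp"
)


def matches(label_value: str) -> bool:
    # Assumption: the matcher must cover the entire label value,
    # so fullmatch is used instead of search.
    return re.fullmatch(PATTERN, label_value) is not None


# Hypothetical label values; the last two are invented to show non-matches.
samples = [
    "back_to_back_CP",
    "deferred_back_to_back_cp",
    "back_to_back_CP_extra",
    "some_other_cp_type",
]

for value in samples:
    print(value, matches(value))
```

Listing both casings in the alternation means the panel picks up the series whichever spelling the collected data uses.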
solid lotus
#

I'll try the mod. Some things don't seem to be modifiable / deletable (if that's a word); I get an error about their being "provisioned" (or similar)

tidal umbra
#

maybe a Save As will work instead of a Save

cloud gyro
#

I have started a nightly build for you to upgrade.

#

@short field for dashboard modification in NABox

tidal umbra
#

@short field has a change, not yet published, that will let you edit the out-of-the-box dashboards too

short field
#

I should probably release that before the Top Client dashboard checkbox

solid lotus
#

Well Hey ... that worked (mostly). I changed the query, hit Apply and the CP panel is updated. Now I have a RHS y-axis labelled "Back to Back CP Count" 👍🏻

cloud gyro
#

Nice!

solid lotus
#

Two things occur to me ... the first row of the table below the graph is showing the query, e.g. "sum(wafl_cp_ ...", and the next rows show "Write Latency" values .... looks strange

solid lotus
#

Second thought ... Is it true? I thought "back to back CP" events should be rare and v. not good. This seems to be showing me them happening all the time?

cloud gyro
#

@red cradle can help on the domain side of this. See if the KB articles help.

solid lotus
#

Good idea, I should know what I'm talking about, before talking. I'll read the KBs

#

In edit I see "Legend: Auto" ... maybe that's why I'm seeing the "raw" query displayed in the table

cloud gyro
#

It's like this for me. Is it any different for you?

solid lotus
#

Dude, it's Friday! Never change a running system (on a Friday afternoon)🔥

#

Seriously though ... I could consider nightly ... But in general how stable is it? We have (we will have) users who will want to see reliable data ...

tidal umbra
#

it's stable, runs through all CI, regression, and unit tests, and it's what we run leading up to our next release

solid lotus
#

Also .. I still have to get back to getting my custom Grindstaff ISL interface counters working 😊

cloud gyro
#

Okay. You can set Legend to Back-to-back CP Count for query A

solid lotus
#

Yeah ... but ... really shouldn't it say something like "back2back" + a variable i.e. the node name I think would be correct? (I think of CP ops as a Node / ONTAP instance level concept)

#

I changed it to be "BOBs {{$Node}}" ... but that results in the Name in the table becoming the literal string "BOBs {right-axis}"

#

Too many braces {} ?

tidal umbra
#

for legends it is typically {{name}} example

#

but I think the Back-to-back CP Count legend is acting different, likely because it has right placement

#

for example

#

ah, i see the problem. let me try something

solid lotus
#

I'm glad you guys understand this 🤣

#

FYI: In your last screenshot: "Foo (right y-axis)" under "Name" is also what I have achieved 🤪

tidal umbra
#

Give this a try
query: sum by (datacenter, cluster, node) (wafl_cp_count{datacenter=~"$Datacenter",cluster=~"$Cluster",node=~"$Node",metric=~"back_to_back_CP|deferred_back_to_back_CP|back_to_back_cp|deferred_back_to_back_cp"})
legend: Back-to-back CP Count {{node}}
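
Why the `sum by (...)` form pairs with a `{{node}}` legend can be sketched in Python. This is a toy model, not Harvest or Grafana code: the sample series and the helpers `sum_by` and `legend` are made up, and it assumes legend templates are filled from the labels left on each result series. A plain `sum(...)` drops every label, so there is no `node` left to substitute.

```python
from collections import defaultdict

# Made-up samples standing in for wafl_cp_count series: (labels, value).
series = [
    ({"cluster": "c1", "node": "n1", "metric": "back_to_back_CP"}, 3.0),
    ({"cluster": "c1", "node": "n1", "metric": "deferred_back_to_back_CP"}, 1.0),
    ({"cluster": "c1", "node": "n2", "metric": "back_to_back_CP"}, 5.0),
]


def sum_by(samples, keep=()):
    """Mimic PromQL sum / sum by (...): group on the kept labels only."""
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple((name, labels[name]) for name in keep)
        totals[key] += value
    return dict(totals)


def legend(template, key):
    """Fill {{label}} placeholders from the labels left on a result."""
    text = template
    for name, value in key:
        text = text.replace("{{" + name + "}}", value)
    return text


plain = sum_by(series)                     # like sum(...): one series, no labels
per_node = sum_by(series, keep=("node",))  # like sum by (node) (...)

for key, total in per_node.items():
    print(legend("Back-to-back CP Count {{node}}", key), total)
```

In this model `plain` holds a single unlabeled total, while `per_node` holds one total per node, each still carrying the `node` label the legend needs.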

solid lotus
#

OK. I'll tx. that query over and try it. It'll take me a couple of minutes

#

CP info and RHS axis have disappeared from the graph. Table underneath has only "write latency" entries

#

Let's forget it for now.

tidal umbra
#

Here's what I see with the changes pasted above

solid lotus
#

That's nice 🙂 Let me try it again - probably screwed something up here. Meanwhile, can you read email sent to ng-harvest-files?

tidal umbra
#

yes

solid lotus
#

I sent you a screenshot of a different design

tidal umbra
#

thanks. At the moment, Harvest does not have the CP durations, which is what the PAS CP dashboard is showing. Let me check and see if ZAPI includes that information (I know REST does not)

solid lotus
#

Right, that data is not being generated via Harvest

#

But it does look cool:-)

tidal umbra
#

we collect cp_phase_times which is percentage time spent in different phases of CP. Looks like there is a counter for total_cp_msecs which is Milliseconds spent in CP but that won't be broken down by phases of CP.

#

so far, I don't see anything in the wafl perf object that includes the durations. PAS must be collecting that info some other way

solid lotus
#

That's too deep for me :-). I could poke around inside a perf autosupport archive and see if I can identify anything, but it would be an empirical approach

tidal umbra
#

that's OK, I'll ask around and see what I can find out

solid lotus
#

BTW: Rahul's query line works. Just saying 😎

tidal umbra
#

yes, we reviewed before he sent. Try this
sum by (datacenter, cluster, node) (wafl_cp_count{datacenter=~"$Datacenter",cluster=~"$Cluster",node=~"$Node",metric=~"back_to_back_CP|deferred_back_to_back_CP|back_to_back_cp|deferred_back_to_back_cp"})

solid lotus
#

On it. ( I was just trolling you with that last one ;-))

solid lotus
#

So I cloned the complete Dashboard and applied your changes (query + legend). There it works. Go figure 🤷🏻‍♂️

#

Sent you a screenshot via email

tidal umbra
#

thanks! I'll check on the scatter plot, doubt that Harvest has the data to show that. Either way, sounds like you would prefer we change this dashboard to use the latest query. Is that correct?

solid lotus
#

For sure. Before, here, I wasn't seeing any CP info in that graph. So this is definitely better - IMHO. Still have that slight doubt that these events are really all b2b CPs ... But I need to do my KB homework