#Hi, does Harvest export its own metrics?

jolly pier
#

hi @rose depot Harvest exports some of its own metrics, and most are shown in the Metadata dashboard. We don't currently report error stats, but we should. If you get a chance, please create a GitHub issue for that, otherwise I'll create one later today

rose depot
#

Thanks Chris! Created issue #1457

jolly pier
#

thanks!

rose depot
#

is there a document about the metadata metrics? most of our metadata_target_status values are 0, but one has the value 1.

#

seems we can use metadata_component_status

jolly pier
#

as the Harvest Prometheus exported data makes clear 🙄
# HELP metadata_component_status Metric for metadata_component
and
# HELP metadata_target_status Metric for metadata_target

#

🙂 i think we can make that clearer

rose depot
#

metadata_component_status is very clear. metadata_target_status shows a value of 1, but we don't see an error in the log

jolly pier
#

let me dig it up and see

rose depot
#

👍

#

metadata_component_status{poller="flc3-prod-ams-storage",reason!="no instances"} !=0 can give us something

jolly pier
rose depot
#

cool

#

how about metadata_component_status? it seems 0 is normal. what do 1 and 2 mean? we see 2 in our instance

jolly pier
#

on it

#

BTW these are great questions - I'm going to update our fancy new documentation https://netapp.github.io/harvest/ with this info so we can point to it next time. Any opinion on where in the docs you would expect to find this info? Maybe "Configure Harvest (advanced)" or "Reference" or somewhere else?

rose depot
#

Troubleshoot or a new section "Monitor Harvest"?

jolly pier
#

i like it, monitor harvest makes the most sense

#

metadata_component_status is published by the poller and is metadata about each collector and exporter associated with the poller.
Let's say you're using the Zapi collector, with the out-of-the-box default.yaml, that means you will be monitoring ~22 different objects (as defined in default.yaml). And let's say you are exporting to Prometheus. That means we would expect Harvest to export 22 + 1 metadata_component_status metrics.

#

The status will be 0 if the collector runs without error, and the same goes for Prometheus

#

actually, for Prometheus, metadata_component_status = 0 means that the exporter was initialized, that is, successfully created. It does not track anything after that. The collector's metadata_component_status metric tracks errors as I mentioned. Here is an example:
metadata_component_status{name="Zapi",poller="sar",promport="12990",reason="no instances",target="SecurityAuditDestination",type="collector",version="2.0.2"} 1
This has a value of 1 because the collector failed to collect any instances for the object SecurityAuditDestination: there are none on this cluster. When you have a nonzero value, you should also have a reason
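
If you want to pull the status and reason out of exported lines like the one above in a script, something along these lines might work. This is only a rough sketch and not part of Harvest: `sample`, `parseSample`, and `labelRe` are made-up names, and the parsing is deliberately simplified (it assumes a label set is present and that label values contain no escaped quotes), so a real tool would want a proper exposition-format parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// sample holds one parsed line of Prometheus text exposition output.
// (Hypothetical type, for illustration only.)
type sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// labelRe matches one key="value" pair; it does not handle escaped quotes.
var labelRe = regexp.MustCompile(`(\w+)="([^"]*)"`)

// parseSample splits a `name{labels} value` line into its parts.
func parseSample(line string) (sample, error) {
	open := strings.Index(line, "{")
	end := strings.LastIndex(line, "}")
	if open < 0 || end < open {
		return sample{}, fmt.Errorf("no label set in %q", line)
	}
	value, err := strconv.ParseFloat(strings.TrimSpace(line[end+1:]), 64)
	if err != nil {
		return sample{}, err
	}
	labels := map[string]string{}
	for _, m := range labelRe.FindAllStringSubmatch(line[open+1:end], -1) {
		labels[m[1]] = m[2]
	}
	return sample{Name: line[:open], Labels: labels, Value: value}, nil
}

func main() {
	// The example line quoted in the thread.
	line := `metadata_component_status{name="Zapi",poller="sar",promport="12990",reason="no instances",target="SecurityAuditDestination",type="collector",version="2.0.2"} 1`
	s, err := parseSample(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s target=%s status=%v reason=%q\n",
		s.Name, s.Labels["target"], s.Value, s.Labels["reason"])
}
```

The `type` label is what distinguishes collector entries from the exporter entry, so a script could check `s.Labels["type"]` before interpreting the value.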

rose depot
#

yeah. here is what we see

jolly pier
#

Here is how the collector sets the values for metadata_component_status
c.SetStatus(0, "running")
c.SetStatus(1, errs.ErrConnection.Error())
c.SetStatus(1, errs.ErrNoInstance.Error())
c.SetStatus(1, errs.ErrNoMetric.Error())
c.SetStatus(2, errMsg)
c.SetStatus(0, "running")
c.SetStatus(0, "running")

#

context deadline exceeded means there was a timeout collecting that resource

#

Fcpport, maybe your cluster has no Fibre Channel ports?

rose depot
#

possible. so we should filter with reason!="no instances",reason!="API request rejected"

jolly pier
#

i think that should work (with the caveat that I'm correct about "API request rejected"). Harvest is trying to figure out if errors it gets from ONTAP should be errors you care about or not. That's why when there are "no instances" the status is set to 1 and logged at INFO instead of WARN or ERR, because maybe it's perfectly fine that you have no instances of some object. We opted to tell you but not shout it 🙂 Seems to me that "API request rejected" is a similar situation. If memory serves, that's ONTAP saying, you sent a ZAPI to me that does not exist. You could make the case, that's fine sometimes and not others.

Can you drop reason altogether and only query for != 0?

#

ah, but you want to remove those missing resources that you don't care about. I think you mean this?
metadata_component_status{reason!="no instances",reason!="API request rejected"} != 0
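
If you ever need the same selection outside of Prometheus, say in a script that post-processes scraped samples, the query's logic could be sketched like this. Everything here is illustrative: `needsAttention`, `ignoredReasons`, and `componentStatus` are made-up names, and the sample data in `main` is invented to show the filter, not copied from a real poller.

```go
package main

import "fmt"

// ignoredReasons lists the reasons treated as benign, mirroring the two
// reason!= matchers in the query above.
var ignoredReasons = map[string]bool{
	"no instances":         true,
	"API request rejected": true,
}

// needsAttention reports whether a metadata_component_status sample would
// be returned by the query: a nonzero value whose reason is not ignored.
func needsAttention(status float64, reason string) bool {
	return status != 0 && !ignoredReasons[reason]
}

// componentStatus is a hypothetical, flattened view of one sample.
type componentStatus struct {
	target string
	status float64
	reason string
}

func main() {
	// Invented sample data, for illustration only.
	samples := []componentStatus{
		{"Volume", 0, "running"},
		{"SecurityAuditDestination", 1, "no instances"},
		{"Fcpport", 1, "API request rejected"},
		{"Qtree", 2, "context deadline exceeded"},
	}
	for _, s := range samples {
		if needsAttention(s.status, s.reason) {
			fmt.Printf("check %s: status=%v reason=%q\n", s.target, s.status, s.reason)
		}
	}
}
```

With the invented data above, only the Qtree entry passes the filter, because its value is nonzero and its reason is not in the ignore list.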

rose depot
#

yes

jolly pier
#

yep, that looks good

rose depot
#

Thanks a lot!