#Harvest not able to get the headroom aggregate metrics

lilac crest
#

NetApp Release 9.14.1P9
harvest version 24.08.0-1 (commit 0cd72654) (build date 2024-08-12T08:58:20-0400) linux/amd64
H/W AFFA400

I recently built a new NetApp system in our infra and noticed that it is not displaying metrics such as headroom_aggr_current_utilization. While checking the poller logs, I could see that the NetApp is rejecting the API calls.
Logs here:
{"level":"info","Poller":"storage01","collector":"ZapiPerf:HeadroomAggr","error":"API request rejected => For resource_headroom_aggr object, no instances were found to match the given query. errNum="61110" statusCode="0"","task":"data","caller":"collector/collector.go:415","time":"2024-11-20T07:53:10Z","message":"Entering standby mode"}

The only difference between this system and the other storage in our infra is that it is the only one running ONTAP 9.14.1P9.
All the others are on 9.13.x.

To check this, I upgraded another instance to 9.14.1P8, and that one has no issues with the metrics.
Is anyone else having this issue?
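
When a poller log is long, it can help to pull out just the collectors that went into standby and the ONTAP error number they reported. The following is a minimal sketch, assuming log lines shaped like the one quoted above; the function name `standby_errors` and the regular expressions are illustrative and not part of Harvest.

```python
import re

# Match the ONTAP error number as it appears in the quoted log line,
# with or without backslash-escaped quotes around the digits.
ERRNUM_RE = re.compile(r'errNum=\\?"(\d+)\\?"')
COLLECTOR_RE = re.compile(r'"collector":"([^"]+)"')


def standby_errors(lines):
    """Yield (collector, errNum) for lines where a collector entered standby."""
    for line in lines:
        if "Entering standby mode" not in line:
            continue
        collector = COLLECTOR_RE.search(line)
        errnum = ERRNUM_RE.search(line)
        if collector and errnum:
            yield collector.group(1), errnum.group(1)


# Sample modelled on the log line in this thread (escaped quotes are an assumption).
sample = (
    '{"level":"info","Poller":"storage01","collector":"ZapiPerf:HeadroomAggr",'
    '"error":"API request rejected => For resource_headroom_aggr object, no instances '
    'were found to match the given query. errNum=\\"61110\\" statusCode=\\"0\\"",'
    '"task":"data","message":"Entering standby mode"}'
)
print(list(standby_errors([sample])))
```

Plain regular expressions are used instead of a JSON parser because the line as pasted here has unescaped inner quotes and would not parse as JSON.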

late crane
lilac crest
#

Sure

#

@late crane Done

late crane
#

thanks

#

I think that error is the ZAPI Perf infrastructure of ONTAP saying that you don't have any instances of that object. I see the same response from ONTAP for CopyManager, HeadroomAggr, NFSv3, NFSv4, NFSv41, NFSv42, and Qtree. That message makes sense for most of those objects. Not sure about HeadroomAggr; I would have thought you would have instances of that object.

#

Is REST enabled on the cluster in question? If so, we can curl the endpoint to check that way too

lilac crest
#

Yeah REST is enabled

#

{
  "name": "headroom_aggregate",
  "description": "Display message service time variance and message inter-arrival time variance for aggregates in a node.",
  "counter_schemas": [
    {
      "name": "current_utilization",
      "description": "This is the storage aggregate average utilization of all the data disks in the aggregate.",
      "type": "percent",
      "unit": "percent",
      "denominator": {
        "name": "current_utilization_denominator"
      }
    }
  ],
  "_links": {
    "self": {
      "href": "/api/cluster/counter/tables/headroom_aggregate"
    }
  }
}

late crane
#

Perfect, can you also try this endpoint?
https://nappstore-14/api/cluster/counter/tables/headroom_aggregate/rows?return_records=true

lilac crest
#
```
{
  "records": [
    {
      "id": "nappstore-14a:DISK_SSD_nappstore_14a_data_32de3074-7e40-4ffb-86e1-20c3b16eace9:32de3074-7e40-4ffb-86e1-20c3b16eace9",
      "_links": {
        "self": {
          "href": "/api/cluster/counter/tables/headroom_aggregate/rows/nappstore-14a%3ADISK_SSD_nappstore_14a_data_32de3074-7e40-4ffb-86e1-20c3b16eace9%3A32de3074-7e40-4ffb-86e1-20c3b16eace9"
        }
      }
    },
    {
      "id": "nappstore-14b:DISK_SSD_nappstore_14b_data_b8d85f81-8650-414d-b8f5-f4c86d736dd8:b8d85f81-8650-414d-b8f5-f4c86d736dd8",
      "_links": {
        "self": {
          "href": "/api/cluster/counter/tables/headroom_aggregate/rows/nappstore-14b%3ADISK_SSD_nappstore_14b_data_b8d85f81-8650-414d-b8f5-f4c86d736dd8%3Ab8d85f81-8650-414d-b8f5-f4c86d736dd8"
        }
      }
    }
  ],
  "num_records": 2,
  "_links": {
    "self": {
      "href": "/api/cluster/counter/tables/headroom_aggregate/rows?return_records=true"
    }
  }
}
```
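
For anyone who wants to repeat this check from a script rather than a browser or curl, here is a rough sketch. It assumes basic auth and skips certificate verification (like `curl --insecure`); `rows_url`, `fetch_rows`, and `summarize_rows` are hypothetical helper names, and only the URL shape and the `records` / `num_records` fields come from this thread.

```python
import base64
import json
import ssl
import urllib.request


def rows_url(cluster, table):
    """Build the counter-table rows URL in the same shape as the one in the thread."""
    return f"https://{cluster}/api/cluster/counter/tables/{table}/rows?return_records=true"


def fetch_rows(cluster, table, user, password):
    """GET the rows endpoint with basic auth (hypothetical helper, not Harvest code)."""
    # Assumption: certificate checks are skipped, like `curl --insecure`.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        rows_url(cluster, table),
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, context=ctx, timeout=30) as resp:
        return resp.read().decode()


def summarize_rows(body):
    """Return (num_records, [row ids]) from a rows response body."""
    doc = json.loads(body)
    ids = [rec["id"] for rec in doc.get("records", [])]
    return doc.get("num_records", len(ids)), ids


# Trimmed sample in the shape of the response above (ids shortened for illustration).
sample = '{"records": [{"id": "node-a:aggr_a:uuid-a"}, {"id": "node-b:aggr_b:uuid-b"}], "num_records": 2}'
count, ids = summarize_rows(sample)
print(count, ids)
```

A non-zero count here, as in the two records above, suggests the cluster does have headroom aggregate instances, whatever the ZAPI side reports.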
late crane
#

well that's interesting - RestPerf returns two records while ZapiPerf returns errNum=61110

#

as a workaround, you can switch your poller to use RestPerf instead of ZapiPerf (or move RestPerf above ZapiPerf in the list of collectors). I'll see if I can find anything out about why ONTAP returns a 61110

lilac crest
#

Okay, let me try switching to REST in some time. I also look forward to learning why this is happening 🙂

late crane
#

me too!

#

@mossy kayak have you seen this before?

#

What if you try this against the same cluster? (please replace ip)

curl -snk --data-ascii '<?xml version="1.0" encoding="UTF-8"?> <netapp xmlns="http://www.netapp.com/filer/admin" version="1.30"> <perf-object-get-instances><instances><instance>*</instance></instances><objectname>resource_headroom_aggr</objectname></perf-object-get-instances> </netapp>' -H "Content-Type: text/xml" -X POST 'https://10.193.48.154/servlets/netapp.servlets.admin.XMLrequest_filer'
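
The XML payload in that curl command is easy to mistype when edited by hand. As a hedged sketch, the same body can be generated with Python's standard library; `perf_get_instances_request` is a made-up helper name, and the element names and `version` value are simply copied from the command above, not taken from ZAPI documentation.

```python
import xml.etree.ElementTree as ET

ZAPI_NS = "http://www.netapp.com/filer/admin"


def perf_get_instances_request(object_name, instance="*", version="1.30"):
    """Return the XML body for perf-object-get-instances, mirroring the curl payload."""
    root = ET.Element("netapp", {"xmlns": ZAPI_NS, "version": version})
    call = ET.SubElement(root, "perf-object-get-instances")
    instances = ET.SubElement(call, "instances")
    ET.SubElement(instances, "instance").text = instance
    ET.SubElement(call, "objectname").text = object_name
    # ElementTree does not emit the XML declaration for str output, so add it here.
    return '<?xml version="1.0" encoding="UTF-8"?>' + ET.tostring(root, encoding="unicode")


body = perf_get_instances_request("resource_headroom_aggr")
print(body)
```

The resulting string could be saved to a file and passed to curl with `--data-ascii @file`, which avoids shell quoting problems.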
lilac crest
#

@late crane Sorry for the late response. I got tied up with other things until late at night.

Here is the output

serene night
#

Thanks @lilac crest
Could you please also share the response of the command below? Replace USER, PASS, and CLUSTER_IP as needed.

curl --connect-timeout 30 --user USER:PASS --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?>
<netapp xmlns="http://www.netapp.com/filer/admin" version="1.130">
    <perf-object-instance-list-info-iter>
        <objectname>resource_headroom_aggr</objectname>
    </perf-object-instance-list-info-iter>
</netapp>' -H "Content-Type: text/xml" 'https://CLUSTER_IP/servlets/netapp.servlets.admin.XMLrequest_filer'
lilac crest
#

Here is the output

```
<!DOCTYPE netapp SYSTEM 'file:/etc/netapp_gx.dtd'>
<netapp version='1.241' xmlns='http://www.netapp.com/filer/admin'>
<results status="passed"><attributes-list><instance-info><name>DISK_SSD_nappstore_14a_data_32de3074-7e40-4ffb-86e1-20c3b16eace9</name><uuid>DISK_SSD_32de3074-7e40-4ffb-86e1-20c3b16eace9</uuid></instance-info><instance-info><name>DISK_SSD_nappstore_14b_data_b8d85f81-8650-414d-b8f5-f4c86d736dd8</name><uuid>DISK_SSD_b8d85f81-8650-414d-b8f5-f4c86d736dd8</uuid></instance-info></attributes-list><num-records>2</num-records></results></netapp>
```
#

@serene night
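
To confirm that ZAPI and REST agree on the instances, the two outputs in this thread can be compared programmatically. This is only a sketch: it assumes the REST row id has the form `node:instance_name:uuid`, which is inferred from the ids shown earlier and not from documentation, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

NS = {"na": "http://www.netapp.com/filer/admin"}

# ZAPI response as pasted in the thread, joined onto one logical document.
ZAPI_XML = (
    "<!DOCTYPE netapp SYSTEM 'file:/etc/netapp_gx.dtd'>"
    "<netapp version='1.241' xmlns='http://www.netapp.com/filer/admin'>"
    '<results status="passed"><attributes-list>'
    "<instance-info><name>DISK_SSD_nappstore_14a_data_32de3074-7e40-4ffb-86e1-20c3b16eace9</name>"
    "<uuid>DISK_SSD_32de3074-7e40-4ffb-86e1-20c3b16eace9</uuid></instance-info>"
    "<instance-info><name>DISK_SSD_nappstore_14b_data_b8d85f81-8650-414d-b8f5-f4c86d736dd8</name>"
    "<uuid>DISK_SSD_b8d85f81-8650-414d-b8f5-f4c86d736dd8</uuid></instance-info>"
    "</attributes-list><num-records>2</num-records></results></netapp>"
)

# REST row ids as returned by the counter table rows endpoint earlier in the thread.
REST_IDS = [
    "nappstore-14a:DISK_SSD_nappstore_14a_data_32de3074-7e40-4ffb-86e1-20c3b16eace9:32de3074-7e40-4ffb-86e1-20c3b16eace9",
    "nappstore-14b:DISK_SSD_nappstore_14b_data_b8d85f81-8650-414d-b8f5-f4c86d736dd8:b8d85f81-8650-414d-b8f5-f4c86d736dd8",
]


def zapi_instance_names(xml_text):
    """Collect the <name> of every <instance-info> in a ZAPI instance list response."""
    root = ET.fromstring(xml_text)
    return {el.text for el in root.findall(".//na:instance-info/na:name", NS)}


def rest_instance_names(ids):
    """Extract the middle field of each REST row id (assumed node:name:uuid)."""
    return {row_id.split(":")[1] for row_id in ids}


only_zapi = zapi_instance_names(ZAPI_XML) - rest_instance_names(REST_IDS)
only_rest = rest_instance_names(REST_IDS) - zapi_instance_names(ZAPI_XML)
print(sorted(only_zapi), sorted(only_rest))
```

If both difference sets are empty, the two APIs list the same instances, which points away from ONTAP and toward the poller's state.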

serene night
#

Thanks. This shows that Zapi commands are working fine. I'll give you a Harvest command shortly to check further.

lilac crest
#

Okay

#

Thanks Rahul

serene night
#

Could you run the command below from the Harvest install directory? Let it run for a few minutes and share the output. Replace POLLERNAME with the name of the poller that has this issue, as defined in your Harvest config.

./bin/poller --poller POLLERNAME --collectors ZapiPerf --objects HeadroomAggr
lilac crest
serene night
#

Thanks. We can see from the logs that data is being collected just fine. Could you restart your Harvest poller and check whether it is working now? You don't need to switch to REST for this step.

lilac crest
#

Ok let me see

serene night
#

And you can stop the CLI command I shared earlier.

lilac crest
#

Thanks Rahul

#

Restarted the instance and waiting for some time

#

I am no longer seeing the error I shared earlier in the logs

serene night
#

Sure. We should be able to see data in 5 mins if it works fine.

lilac crest
#

Will give it some time and see

lilac crest
#

@serene night It works
Thanks a lot for the help

mossy kayak
#

No idea Chris.