#collector permission errors on NAbox3


rustic heath
#

I took my first look at the logs today while helping a customer troubleshoot data collection. On my own lab system, I see multiple errors from different clusters, similar to this:

ERR collectors/commonutils.go:54 > Failed to fetch data error="error making request StatusCode: 403, Error: Permission denied, Message: not authorized for that command, API: /api/private/support/alerts?return_records=true&suppress=false&time=%3E%3D1706047895" Poller=select02 href=api/private/support/alerts?return_records=true&suppress=false&time=>=1706047895 object=Health plugin=Rest:Health

The errors are for multiple endpoints and occur on both physical and ONTAP Select clusters. All of them are permission denied errors. I'm wondering if this might be the clusters being unable to respond?
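To see which endpoints and pollers are affected, the log lines can be tallied by poller and API path. The following is a minimal sketch, not part of Harvest or NAbox: the function name `summarize_denials` and the regular expression are assumptions based only on the line format quoted in this thread, and the sample lines are the thread's own log lines trimmed after the `Poller` field.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed line layout (taken from the log lines quoted in this thread):
#   ... StatusCode: <code>, ... API: <url>" Poller=<name> ...
LINE_RE = re.compile(r'StatusCode: (\d+),.*?API: (\S+?)".*?Poller=(\S+)')


def summarize_denials(lines):
    """Count 403 responses per (poller, API path), ignoring query strings."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(1) == "403":
            poller = match.group(3)
            path = urlsplit(match.group(2)).path
            counts[(poller, path)] += 1
    return counts


# Sample lines copied from this thread, trimmed after the Poller field.
sample = [
    'ERR collectors/commonutils.go:54 > Failed to fetch data error="error making request '
    'StatusCode: 403, Error: Permission denied, Message: not authorized for that command, '
    'API: /api/private/support/alerts?return_records=true&suppress=false&time=%3E%3D1706047895" '
    'Poller=select02',
    '2024-01-24T15:32:36Z ERR collectors/commonutils.go:54 > Failed to fetch data error="error making request '
    'StatusCode: 403, Error: Permission denied, Message: not authorized for that command, '
    'API: /api/network/fc/ports?enabled=true&fields=name%2Cnode&return_records=true&state=offlined_by_system" '
    'Poller=rtp-sa-select02',
]

for (poller, path), count in sorted(summarize_denials(sample).items()):
    print(f"{poller}  {path}  {count}")
```

A per-path summary like this makes it easier to compare the denied endpoints against the privileges actually granted to the Harvest role.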

mellow shoal
#

Maybe you need to update the roles on the ONTAP systems? I know they have changed a bit over time.

rustic heath
#

I set them up using the latest ones from the NAbox directions page. Have they changed beyond that?

#

I crosschecked those as well. Happy to add more if they are needed.

azure lantern
coral carbon
rustic heath
#

I'm still working on these.

#

I believe this one is triggered because Harvest is attempting to get FC port data on ONTAP Select, which has no FC ports.

2024-01-24T15:32:36Z ERR collectors/commonutils.go:54 > Failed to fetch data error="error making request StatusCode: 403, Error: Permission denied, Message: not authorized for that command, API: /api/network/fc/ports?enabled=true&fields=name%2Cnode&return_records=true&state=offlined_by_system" Poller=rtp-sa-select02 href=api/network/fc/ports?return_records=true&fields=name,node&enabled=true&state=offlined_by_system object=Health plugin=Rest:Health
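The `API:` value in the log is URL-encoded, so it can help to decode it and see exactly which filters the request carried. A small sketch using only the Python standard library; the helper name `describe_request` is hypothetical, and the URL is copied from the log line above.

```python
from urllib.parse import parse_qs, urlsplit


def describe_request(api):
    """Split an API string from the log into its path and decoded query filters."""
    parts = urlsplit(api)
    # parse_qs returns a list per key; each key appears once here, so keep the first value.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, filters


# URL copied from the log line quoted in this thread.
api = (
    "/api/network/fc/ports?enabled=true&fields=name%2Cnode"
    "&return_records=true&state=offlined_by_system"
)

path, filters = describe_request(api)
print(path)
for key, value in sorted(filters.items()):
    print(f"  {key} = {value}")
```

Decoded this way, the request is a query against `/api/network/fc/ports` for enabled ports whose state is `offlined_by_system`, returning the `name` and `node` fields.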

#

Perhaps a template modification for Select is in order?

coral carbon
#

The 403 permission denied implies RBAC. Can you try running security login role create -role harvest -access readonly -cmddirname "network fcp adapter show" and see if that resolves the 403?

rustic heath
#

@azure lantern I did check and /api doesn't show.

BUT, when I try to add /api with readonly access, I get this error message from System Manager UI:
Failed Adding the "/api" privilege.
The role already exists in the legacy role table.

azure lantern
rustic heath
#

@coral carbon Current status: I have made or tried all the recommendations from this thread. The majority of the errors I'm seeing in my logs now are from REST.

Example: I have multiple clusters failing on this API call with a permission denied error:
/api/cluster/licensing/licenses?fields=name%2Cscope%2Cstate&return_records=true&state=noncompliant
These are physical clusters running either ONTAP 9.8P21 or 9.14.1RC1, and ONTAP Select clusters running 9.14.1RC1.

An exact error is:
2024-01-24T17:57:08Z ERR collectors/commonutils.go:54 > Failed to fetch data error="error making request StatusCode: 403, Error: Permission denied, Message: not authorized for that command, API: /api/cluster/licensing/licenses?fields=name%2Cscope%2Cstate&return_records=true&state=noncompliant" Poller=rtp-sa-select01 href=api/cluster/licensing/licenses?return_records=true&fields=name,scope,state&state=noncompliant object=Health plugin=Re

coral carbon
rustic heath
#

Hi @coral carbon, I applied your updated permissions, and I think we're almost entirely there. I still see the /api/support/auto-update failure you mentioned, so I'm discounting that.

I do have one new issue that popped up after I applied the changes. The good news is that it isn't a permissions issue!
2024-01-25T18:25:14Z INF collectors/power.go:222 > sensor excluded Poller=rtp-sa-cl01 object=Sensor plugin=Zapi:Sensor sensor=" node:rtp-sa-cl01-05 sensor:[{rtp-sa-cl01-05 CPU0 Temp Margin -51 }] node:rtp-sa-cl01-07 sensor:[{rtp-sa-cl01-07 CPU0 Temp Margin -55 }] node:rtp-sa-cl01-08 sensor:[{rtp-sa-cl01-08 CPU0 Temp Margin -58 }] node:rtp-sa-cl01-06 sensor:[{rtp-sa-cl01-06 CPU0 Temp Margin -55 }]"

This looks like ONTAP may be getting bad sensor data? Looking at the event log on the cluster, I see no errors reported.
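Note that this line is logged at INF level rather than ERR. To line up the excluded readings per node, the `sensor=` value can be split into records. This is a rough sketch: the field layout (`node:<node> sensor:[{<node> <sensor name> <value> }]`) is an assumption inferred from this one log line, and `parse_excluded_sensors` is a hypothetical helper, not a Harvest function.

```python
import re

# Assumed record layout, inferred from the single log line in this thread:
#   node:<node> sensor:[{<node> <sensor name> <value> }]
RECORD_RE = re.compile(r"node:(\S+) sensor:\[\{\S+ (.+?) (-?\d+(?:\.\d+)?) \}\]")


def parse_excluded_sensors(sensor_field):
    """Return (node, sensor name, value) tuples from a 'sensor excluded' field."""
    return [
        (node, name, float(value))
        for node, name, value in RECORD_RE.findall(sensor_field)
    ]


# Value of the sensor= field, copied from the log line quoted in this thread.
sensor_field = (
    " node:rtp-sa-cl01-05 sensor:[{rtp-sa-cl01-05 CPU0 Temp Margin -51 }]"
    " node:rtp-sa-cl01-07 sensor:[{rtp-sa-cl01-07 CPU0 Temp Margin -55 }]"
    " node:rtp-sa-cl01-08 sensor:[{rtp-sa-cl01-08 CPU0 Temp Margin -58 }]"
    " node:rtp-sa-cl01-06 sensor:[{rtp-sa-cl01-06 CPU0 Temp Margin -55 }]"
)

for node, name, value in sorted(parse_excluded_sensors(sensor_field)):
    print(f"{node}  {name}  {value}")
```

In this sample, every excluded reading is the `CPU0 Temp Margin` sensor with a negative value, one per node.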

#

I'm going to call this closed for now.