I added a new NetApp cluster, created the user and password on it, and put the same credentials in harvest.yml. When I start the poller I get the error below. I created a new user with the http, cert, and ontapi applications, created a new role called all-api, and assigned that role to the user. I'd appreciate any responses.
"source=poller.go:1513 msg="gather cluster info" Poller=*** remote="{Name: Model: UUID: Version: Release: Serial: IsSanOptimized:false IsDisaggregated:false ZAPIsExist:true HasREST:false IsClustered:false}" remoteErr="auth failed => 401 Unauthorized errNum="" statusCode="401"\nStatusCode: 401, Error: auth failed, Message: 401 Unauthorized, API: /api/cluster?fields=%2A&return_records=true"
#Polling Server error for Netapp poller as 401 Unauthorised.
hi @white dirge did you also verify that the harvest role has web access? https://netapp.github.io/harvest/nightly/prepare-cdot-clusters/#verify-that-the-harvest-role-has-web-access
What if you try something like this substituting $user, $pass, and $ip with the appropriate values.
curl -sk -u "$user:$pass" "https://$ip/api/cluster?fields=version"
@untold orbit - Web access was missing for rest and docs-api. I added that now, but I still get the same error after restarting the poller.
When I check with curl, this is what I get:
curl -sk -u username:password 'http://ip/api/cluster?fields=version'
"<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>401 Unauthorized</title>
</head><body>
<h1>Unauthorized</h1>
<p>This server could not verify that you
are authorized to access the document
requested. Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.</p>
</body></html>"
seems like something is still off with the ontap permissions. Can you double check that you created the rest role as mentioned here https://netapp.github.io/harvest/nightly/prepare-cdot-clusters/#create-rest-role
I cross-verified and everything seems to be fine, but I still get the same 401 error. @untold orbit
         Role           Access
Vserver  Name     API   Level
test     all-api  /api  readonly

Vserver  Type   Service Name  Role
test     admin  ontapi        all-api

Vserver  Type   Service Name  Role
test     admin  rest          all-api

Vserver  Type   Service Name  Role
test     admin  docs-api      all-api

User/Group               Authentication             Acct    Authentication
Name        Application  Method          Role Name  Locked  Method
username    http         cert            all-api    -       none
username    http         password        all-api    no      none
username    ontapi       password        all-api    no      none
is test the name of the admin vserver?
Not exactly. I changed the name to test for privacy; the real one has a meaningful name.
if curl doesn't work, Harvest won't. So something is still off about how you have set up the new user/role/RBAC. If test is not the actual admin vserver, it won't work.
What if you run this, replacing the vserver with the admin vserver and role with your rest-role?
security login rest-role show -vserver umeng-aff300-01-02 -role harvest-rest-role
Maybe that's what you pasted as the first table?
Sorry for the misunderstanding. The vserver I renamed to "test" here is the admin vserver.
And what about if you run this, replacing the vserver with your admin vserver and role with your rest-role?
security login rest-role show -vserver umeng-aff300-01-02 -role harvest-rest-role
security login rest-role show -vserver adminvserver -role all-api
              Role           Access
Vserver       Name     API   Level
Adminvserver  all-api  /api  readonly
What about security login role show -vserver adminvserver -role all-api?
              Role     Command/          Access
Vserver       Name     Directory  Query  Level
adminvserver  all-api  DEFAULT           readonly
Should that curl command return any value?
yes, for example, on my cluster
curl -sk -u "$user:$pass" "https://$ip/api/cluster?fields=version"
{
  "version": {
    "full": "NetApp Release Mightysquirrel__9.15.1: Fri May 24 05:03:10 UTC 2024",
    "generation": 9,
    "major": 15,
    "minor": 1
  },
  "_links": {
    "self": {
      "href": "/api/cluster"
    }
  }
}
Oh yes, I see that on my other NetApps.
You're sure the user/pass are correct? Does this show that the service is enabled?
vserver services web show -vserver umeng-aff300-01-02 -name rest
I am sure it is right, i can delete and recreate if it needs to be.
Vserver: ******
Service Name: rest
Type of Vserver: admin
Version of Web Service: 1.0.0
Description of Web Service: Remote Administrative REST API Support
Long Description of Web Service: This service supports a RESTful Interface that can be used to remotely manage all elements of the cluster infrastructure.
Service Requirements: -
Default Authorized Roles: admin, readonly, vsadmin, vsadmin-protocol,
vsadmin-readonly, vsadmin-volume
Enabled: true
SSL Only: false
sounds like you set this up for other ONTAP clusters and that worked? Anything different about this one? Since this issue doesn't have anything to do with Harvest, you might be able to get better help on this auth issue in the #1062049169520476220 channel. I'm happy to keep helping, but someone in that channel may have better ONTAP troubleshooting steps
The only other difference I can point to is that the NetApp we are troubleshooting is running ONTAP 9.6. Do you think that is a problem?
it might be. 9.6 does support REST according to this https://docs.netapp.com/us-en/ontap-restapi-96/index.html I don't have any 9.6 clusters to try. We would recommend ZAPIs for a 9.6 cluster anyway though since performance metrics are not available via REST until 9.12.1
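The version gate described above can be sketched in a few lines of shell. This is only an illustration: rest_perf_available is a hypothetical helper (not part of Harvest), the 9.12.1 threshold is the one quoted in the message above, and the three arguments are assumed to be the generation, major, and minor fields that /api/cluster?fields=version returns.

```shell
# Hypothetical helper, not part of Harvest. The 9.12.1 threshold is the one
# quoted in the thread for REST performance metrics.
rest_perf_available() {
  local gen=$1 major=$2 minor=${3:-0}
  # Compare generation first, then major, then minor against 9.12.1.
  if [ "$gen" -ne 9 ]; then [ "$gen" -gt 9 ]; return; fi
  if [ "$major" -ne 12 ]; then [ "$major" -gt 12 ]; return; fi
  [ "$minor" -ge 1 ]
}
```

For the 9.6 cluster in this thread, rest_perf_available 9 6 0 returns non-zero, which matches the advice to rely on Zapi and ZapiPerf there.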
I have set Harvest to prefer ZAPI in my harvest.yml as well: prefer_zapi: true
And what do you have listed for collectors in your harvest.yml?
Defaults:
  collectors:
    - Zapi
    - ZapiPerf
    - Rest
    - RestPerf
    - Ems
  use_insecure_tls: true
does that work? Can you upload your log file to https://upload.nabox.org/cere-kewe-rusu
From /var/log/harvest? Just the log of the particular poller that is failing?
yes
Other netapp's works fine
just the one that is failing is fine
I need to run to a meeting and will look at your log files afterwards
log has been uploaded, thank you.
@white dirge I followed the steps outlined at https://netapp.github.io/harvest/latest/prepare-cdot-clusters/ on a 9.6 vsim, using the System Manager Classic interface steps, and it works fine. Could you please try creating the relevant user from the System Manager UI to check?
Once you have recreated the user following the steps mentioned above, could you please check if the command below works? Replace USER, PASS, and CLUSTER_IP as applicable.
curl --connect-timeout 30 --user USER:PASS --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?>
<netapp xmlns="http://www.netapp.com/filer/admin" version="1.130">
<system-get-version/>
</netapp>' -H "Content-Type: text/xml" 'https://CLUSTER_IP/servlets/netapp.servlets.admin.XMLrequest_filer'
@charred granite - In the GUI I can see the account that was created through the CLI. Do you want me to delete it, recreate the user and role through the GUI, and give it a try?
That is right
@charred granite - If i select DEFAULT as command i do not see commands as per the above link. Is that a bug?
@untold orbit - Did you get a chance to see the logs? Any pointers there to see what might be the issue?
Yes, Rahul and I looked at the logs. There isn't anything there beyond the auth error. Harvest can't do much if it is unable to talk to the cluster. Or put another way, until curl works, there is no point in trying Harvest. I know you're still trying to make sense of the SM UI and you have the question about DEFAULT above - in the meantime, can you try the curl that Rahul shared?
yesterday, you mentioned that other clusters work. Did you follow the same steps for those clusters?
Yes, I followed basically the same steps. As Rahul suggested, today I deleted the user and role and recreated them in the SM UI. The curl timed out, as shown below. IPADDRESS below stands in for the cluster IP.
curl --connect-timeout 30 --user USER:PASS --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?> <netapp xmlns="http://www.netapp.com/filer/admin" version="1.130"> <system-get-version/> </netapp>' -H "Content-Type: text/xml" 'https://IPADDRESS/servlets/netapp.servlets.admin.XMLrequest_filer'
curl: (28) Connection timed out after 30000 milliseconds
are you able to ping that ip?
from the same machine that you are running curl from?
yes
thanks - it is odd that curl is timing out after 30s
I see that the ping round trip is around 66ms, which should not be a concern, I think
agreed
To rule out authentication, I logged in over http through the GUI with the credentials I created via the SM UI, and that worked
And those are the same credentials you are trying with the curls? Authentication isn't ruled out until one of the curls works
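The distinction being drawn in this part of the thread, between a curl that times out and a curl that comes back with 401, can be sketched as a small wrapper. classify_probe and probe_cluster are hypothetical helper names; the curl options used (-s, -k, -o, -w '%{http_code}', --connect-timeout, -u) are standard, and exit code 28 is curl's timeout code as seen in the output above.

```shell
# Hypothetical helpers; the names are illustrative, not part of Harvest or ONTAP.

# Map curl's exit code and the HTTP status it saw to a short diagnosis.
classify_probe() {
  local exit_code=$1 http_status=$2
  if [ "$exit_code" -eq 28 ]; then
    echo "timeout: check the address and network path first"
  elif [ "$exit_code" -ne 0 ]; then
    echo "connection problem (curl exit $exit_code)"
  elif [ "$http_status" -eq 401 ]; then
    echo "reached the cluster, but the credentials or role were rejected"
  elif [ "$http_status" -ge 200 ] && [ "$http_status" -lt 300 ]; then
    echo "ok"
  else
    echo "unexpected HTTP status $http_status"
  fi
}

# Run the REST probe from this thread and print only the diagnosis.
probe_cluster() {
  local addr=$1 user=$2 pass=$3 http_status
  http_status=$(curl -sk --connect-timeout 30 -o /dev/null -w '%{http_code}' \
    -u "$user:$pass" "https://$addr/api/cluster?fields=version")
  classify_probe "$?" "$http_status"
}
```

Calling probe_cluster with the address from harvest.yml and the Harvest user's credentials gives a one-line answer; a timeout points at the address or network, while a 401 points at the user, password, or role.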
Yes, same creds. I'm trying to understand whether there is any network role involved here?
not sure what you mean by network role. Can you clarify? Back to your earlier comment about "if i select DEFAULT as command i do not see commands as per the above link." I don't think that's a bug. I think that means you are giving read-only access to all API objects instead of limiting to a subset of API objects
I meant that since I was able to log in through the GUI with the same creds, I wondered whether some network role is involved, given that authentication is failing.
I thought so as well, but since this is 9.6 I was not sure whether it lists all API objects. Understood on the DEFAULT case. TY.
gotcha, yeah there is no network role. Whether you can login or not is determined by the -application argument here
security login create -user-or-group-name harvest2 -application ontapi -role harvest2-role -authentication-method password
and these commands
# ZAPI based access
vserver services web access show -role harvest2-role -name ontapi
# REST based access
vserver services web access show -role harvest2-rest-role -name rest
Got it. No luck yet for me though
If you log into the CLI and run version what is displayed?
I ask because I ran across this issue: https://mysupport.netapp.com/site/bugs-online/product/ONTAP/BURT/1342292 ("management tools cannot access management LIF IP via HTTP(S) ONTAP after upgrading to ONTAP 9.6P9")
NetApp Release 9.6P2
@white dirge We expected some sort of authentication error in this curl command, as seen in Harvest. What if you try using admin user with this CURL command to rule out any network issues?
Hmm, admin account is timing out as well, with same message. @charred granite . You think it is network then?
Are you using cluster management IP?
curl --connect-timeout 30 --user admin:****** --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?> <netapp xmlns="http://www.netapp.com/filer/admin" version="1.130"> <system-get-version/> </netapp>' -H "Content-Type: text/xml" 'https://CLUSTERIP/servlets/netapp.servlets.admin.XMLrequest_filer'
curl: (28) Connection timed out after 30001 milliseconds
Yes cluster ip
I suspect it may be the wrong IP. Let's try a REST call with the admin user
curl -sk -u USER:PASS "https://CLUSTER_IP/api/cluster?fields=*"
Does this work?
See if this KB helps in getting the correct IP
You want to use the Network Address field shown against the cluster_mgmt value for the admin vserver
One thing I observed is that the node mgmt IPs are in one VLAN and the cluster mgmt IP is in a different VLAN. I know that is a weird config (not sure of the reason), but someone set it up like this. Is that a problem?
I am not sure about that but mostly we use cluster mgmt ip for monitoring.
When I use the mgmt IP, that curl command returns a response, but only when user:password is specified in quotes, as "user:password"
This one
That works as well.
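On the quoting point above: without quotes, the shell can split or expand special characters in a password before curl ever sees it. A minimal sketch of what basic auth puts on the wire, assuming the base64 utility is installed; basic_auth_header is a hypothetical helper, and the credentials in the usage note are the textbook Aladdin example, not anything from this thread.

```shell
# Hypothetical helper: print the Authorization header that HTTP basic auth
# would send for a given user and password.
basic_auth_header() {
  local user=$1 pass=$2
  # printf avoids the trailing newline that echo would add to the encoded value;
  # tr removes the line wrapping some base64 builds apply to long input.
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n')"
}
```

Running basic_auth_header Aladdin 'open sesame' shows the password with its space intact inside the encoded value. Drop the quotes and the shell hands "open" and "sesame" over as two separate arguments, so a different credential gets encoded.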
Let's try with user which we created for Harvest using SM GUI.
With the same user i tried ZAPI and it worked
Yes i did, but the poller log has still errors.
time=2025-04-24T09:44:52.943-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=ZapiPerf object=NFSv41
time=2025-04-24T09:45:13.124-04:00 level=WARN source=poller.go:828 msg="init collector-object" Poller=**** error="StatusCode: 401, Error: auth failed, Message: 401 Unauthorized, API: /api/cluster?fields=%2A&return_records=true" collector=Rest object=FCP
time=2025-04-24T09:45:33.365-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=**** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=Zapi object=SecurityCert
Is it failing for all objects or few?
It's fine for Rest call to fail as this user only has Zapi permissions.
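One way to answer "all objects or a few" without reading every line is to count the 401 warnings per collector. This is a rough sketch: summarize_auth_failures is a hypothetical helper, and it assumes log lines shaped like the ones pasted above, with a collector=NAME field on each warning.

```shell
# Hypothetical helper: reads Harvest poller log lines on stdin and prints
# "collector: count" for every line that reports a 401, busiest first.
summarize_auth_failures() {
  grep '401 Unauthorized' \
    | grep -o 'collector=[A-Za-z0-9_]*' \
    | sort | uniq -c | sort -rn \
    | awk '{ sub(/^collector=/, "", $2); print $2 ": " $1 }'
}
```

Feed it the failing poller's log file from /var/log/harvest on stdin. If Zapi and ZapiPerf show up alongside Rest, the failure is not limited to the REST permissions.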
I restarted the poller. When you say restart Harvest, do you mean "bin/harvest admin stop/start"?
After restart still the same, it is failing @charred granite
time=2025-04-24T09:52:00.307-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=Zapi object=SnapMirror
time=2025-04-24T09:52:20.475-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=Zapi object=SecurityCert
time=2025-04-24T09:52:40.648-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=ZapiPerf object=CIFSNode
time=2025-04-24T09:53:00.911-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=ZapiPerf object=VolumeNode
time=2025-04-24T09:53:21.081-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=Zapi object=ClusterPeer
time=2025-04-24T09:53:41.504-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=Zapi object=Security
time=2025-04-24T09:54:01.636-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=ZapiPerf object=Iwarp
I changed to the mgmt IP yesterday (in the harvest.yml), looked at these logs, and assumed it was not working because the errors continued. I didn't try curl with the mgmt IP yesterday; I wish I had 😦
Okay, Let's focus on one of the error logs from above and try Zapi call for that
time=2025-04-24T09:52:40.648-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller= error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=ZapiPerf object=CIFSNode
curl --connect-timeout 30 --user USER:PASS --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?>
<netapp xmlns="http://www.netapp.com/filer/admin" version="1.130">
<perf-object-counter-list-info>
<objectname>cifs:node</objectname>
</perf-object-counter-list-info>
</netapp>' -H "Content-Type: text/xml" 'https://CLUSTER_IP/servlets/netapp.servlets.admin.XMLrequest_filer'
Could you check if this call works for Harvest user
Yes that works with harvest user with nodemgmt ip.
<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE netapp SYSTEM 'file:/etc/netapp_gx.dtd'>
<netapp version='1.160' xmlns='http://www.netapp.com/filer/admin'>
<results status="passed"><counters><counter-info><desc>Total number of requests using Access Based Enumeration.</desc><is-deprecated>false</is-deprecated><name>access_based_enumeration</name><privilege-level>advanced</privilege-level><properties>raw</properties><unit>none</unit></counter-info><counter-info><desc>Number of active searches over SMB and SMB2</desc><is-deprecated>false</is-deprecated><name>active_searches</name><privilege-level>advanced</privilege-level><properties>raw</properties><unit>none</unit></counter-info><counter-info><desc>Authentication refused after too many requests were made in rapid succession</desc><is-deprecated>false</is-deprecated><name>auth_reject_too_many</name><privilege-level>advanced</privilege-level><properties>delta,no-zero-values</properties><unit>none</unit></counter-info><counter-info><base-counter>path_based_ops</base-counter><desc>Average number of directories crossed by SMB and SMB2 path-based commands</desc> .....
Okay. That's good
So we seem to have some issue related to cluster management ip
But why do the poller logs still show 401 Unauthorized, when the same curl works directly against the node mgmt IP?
In Harvest configuration, Have you added node mgmt ip or cluster mgmt ip?
I changed that to the node mgmt IP about 40 minutes ago
Okay. Could you restart Harvest and share start up logs @ https://upload.nabox.org/cere-kewe-rusu
In your earlier logs it looked like you were using a secret file - to rule that out, can you specify the username/password for the harvest user in your harvest.yml and retry?
yes, it is the same harvest user I am using, and the same harvest user is in the secret file as well
oh you mean, in the harvest.yml specify the username and password?
yes, that's right
same issue after i updated the username and password and restarted that poller.
time=2025-04-24T10:21:41.304-04:00 level=INFO source=poller.go:231 msg=Init Poller=***** logLevel=INFO configPath=./harvest.yml cwd=/opt/harvest version="harvest version 24.11.1-1 (commit bb4113ef) (build date 2024-11-25T09:19:45-0500) linux/amd64" options="&{Poller:***** Daemon:true Debug:false PromPort:13025 Config:./harvest.yml HomePath: LogPath:/var/log/harvest/ LogFormat:plain LogLevel:2 LogToFile:false Version:24.11.1 Hostname:pduulpprmpol01.corp.siriusxm.com Collectors:[] Objects:[] Profiling:0 Asup:false IsTest:false ConfPath:conf ConfPaths:[conf]}"
time=2025-04-24T10:21:41.305-04:00 level=INFO source=poller.go:267 msg="started as daemon" Poller=***** pid=679499
time=2025-04-24T10:22:09.536-04:00 level=WARN source=poller.go:1513 msg="gather cluster info" Poller=***** remote="{Name: Model: UUID: Version: Release: Serial: IsSanOptimized:false IsDisaggregated:false ZAPIsExist:true HasREST:false IsClustered:false}" remoteErr="auth failed => 401 Unauthorized errNum="" statusCode="401"\nStatusCode: 401, Error: auth failed, Message: 401 Unauthorized, API: /api/cluster?fields=%2A&return_records=true"
time=2025-04-24T10:22:29.666-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=Zapi object=SnapshotPolicy
time=2025-04-24T10:22:49.980-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth failed => 401 Unauthorized errNum="" statusCode="401"" collector=ZapiPerf object=NVMfLif
time=2025-04-24T10:23:10.175-04:00 level=WARN source=poller.go:819 msg="abort collector" Poller=***** error="connection error => auth
Thanks for trying. Can you upload the most recent log files for this poller to https://upload.nabox.org/cere-kewe-rusu
Do you still want me to try restart harvest as per @charred granite ?
you restarted after updating your harvest.yml 5m ago which is sufficient
Uploaded the log. Thank you for your patience on working through this.
in your harvest.yml, for the poller in question, do you have a fully qualified domain name or ip address listed for the addr?
ip address
And the curls are being run from the same machine that is running Harvest?
can you run bin/harvest doctor --print and copy/paste the result into this chat?
You want for all or only for the failed poller?
Netapp array name:
  datacenter: Name
  type: NetApp
  model: FAS9000
  addr: -REDACTED-
  auth_style: basic_auth
  username: -REDACTED-
  password: '-REDACTED-'
  use_insecure_tls: true
  exporters:
    - prometheus1
  prefer_zapi: true
@untold orbit - Is the info above enough, or do you need all of the data? If so, I'll need to upload it as a text file, since this chat restricts the number of lines
I only need the info for the failing poller and the info you pasted looks fine - I guess the collectors are defined in a Defaults section? I don't understand how there can be an auth failure with Harvest when curl is working. That's never happened before. If possible, could you download 25.02 and try it with the failed poller? You can download the tarball and install in a tmp directory if you don't feel like updating the working pollers (although we always recommend upgrading to the latest version). https://github.com/NetApp/harvest/releases/tag/v25.02.0
if 25.02 does not help, we'll build a debug build with additional instrumentation to get to the bottom of this perplexing problem
Will look to that and update
After upgrading, if you still have auth issues, let's turn on the Harvest request/response recorder so we can check that the HTTP headers are as expected. To do that, in your harvest.yml file, for the poller in question, add the following lines in the poller block, then restart the poller.
Pollers:
  poller1:
    addr: 10.0.1.1
    recorder:
      path: /tmp/record
      mode: record
After it runs for a minute or so, you can stop it. Zip up the /tmp/record directory and upload to https://upload.nabox.org/cere-kewe-rusu
After uploading, you should remove or comment out the recorder section from your harvest.yml
Hey @untold orbit - Issue resolved. The password was behind all of this. Apparently copy/paste was the problem on the older NetApp until it was corrected. That weird cluster config also added to the confusion.
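The thread does not say exactly what the copy/paste did to the password, so the following is only a sketch of one way to check a pasted credential for the usual suspects: stray whitespace, a trailing carriage return, or non-ASCII characters such as smart quotes. check_secret is a hypothetical helper, not something Harvest provides.

```shell
# Hypothetical helper: flag characters that commonly sneak into a credential
# via copy/paste. Prints a verdict and returns non-zero when something looks off.
check_secret() {
  local secret=$1 leftover
  case $secret in
    *[[:space:]]*) echo "contains whitespace"; return 1 ;;
  esac
  # Delete every printable ASCII character; anything left over is suspicious.
  leftover=$(printf '%s' "$secret" | LC_ALL=C tr -d '[:print:]')
  if [ -n "$leftover" ]; then
    echo "contains non-printable or non-ASCII bytes"
    return 1
  fi
  echo "ok"
}
```

A password that legitimately contains a space will also be flagged, so treat the output as a prompt to retype the credential rather than as proof that it is wrong.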
I am now able to poll and see the graphs in Grafana
Again, my sincere apologies for the password mishap, and thank you both for your effort in helping me with this issue
That's great to hear. Glad you got it sorted and thanks for letting us know. No worries, mishaps happen
Happy weekend guys! But i learnt some interesting stuff, thank you!