a7e72859a316 ghcr.io/netapp/harvest:latest "bin/poller --poller…" 5 days ago Restarting (1) 20 seconds ago poller-Netapp01
b26883f63bc5 ghcr.io/netapp/harvest:latest "bin/poller --poller…" 5 days ago Up 5 days poller-unix
#NetApp poller keeps on restarting.
What does docker logs poller-Netapp01 show?
2024-02-14T03:38:09Z ERR poller/poller.go:1161 > Failed to negotiateAPI error="connection error => Post "https://Netapp01:443/servlets/netapp.servlets.admin.XMLrequest_filer\": dial tcp 172.30.0.4:443: connect: connection refused" Poller=Netapp01 collector=Zapi
2024-02-14T03:38:09Z ERR poller/poller.go:1161 > Failed to negotiateAPI error="connection error => Post "https://Netapp01:443/servlets/netapp.servlets.admin.XMLrequest_filer\": dial tcp 172.30.0.4:443: connect: connection refused" Poller=Netapp01 collector=ZapiPerf
2024-02-14T03:38:09Z WRN poller/poller.go:669 > abort collector error="connection error => connection error => Post "https://Netapp01:443/servlets/netapp.servlets.admin.XMLrequest_filer\": dial tcp 172.30.0.4:443: connect: connection refused" Poller=Netapp01 collector=ZapiPerf object=ObjectStoreClient
2024-02-14T03:38:09Z WRN poller/poller.go:306 > no collectors initialized, stopping Poller=Netapp01
2024-02-14T03:38:09Z INF poller/poller.go:531 > cleaning up and stopping [pid=1] Poller=Netapp01
Looks like a connection issue with the cluster. You may want to check the poller config.
You can run the command below to check whether the configuration is correct:
curl --connect-timeout 30 --user USER:PASS --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?>
<netapp xmlns="http://www.netapp.com/filer/admin" version="1.130">
<system-get-version/>
</netapp>' -H "Content-Type: text/xml" 'https://URL/servlets/netapp.servlets.admin.XMLrequest_filer'
Replace USER, PASS, and URL as applicable.
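For anyone who prefers a script over curl, here is a rough Python sketch of the same probe. It only mirrors the curl command above: the function names `build_version_request` and `send_insecure` are made up for illustration, and `send_insecure` skips certificate checks the way `curl --insecure` does, so treat it as a connectivity test only.

```python
import base64
import ssl
import urllib.request

# Same ZAPI payload as the curl command above.
ZAPI_BODY = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<netapp xmlns="http://www.netapp.com/filer/admin" version="1.130">'
    "<system-get-version/>"
    "</netapp>"
)


def build_version_request(host: str, user: str, password: str) -> urllib.request.Request:
    """Build the POST that the curl command sends (nothing is sent yet)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}/servlets/netapp.servlets.admin.XMLrequest_filer",
        data=ZAPI_BODY.encode("utf-8"),
        headers={"Content-Type": "text/xml", "Authorization": f"Basic {token}"},
        method="POST",
    )


def send_insecure(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request without certificate verification, like curl --insecure."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `send_insecure(build_version_request("URL", "USER", "PASS"))` with your own values should return the same XML that curl prints, assuming the cluster is reachable from where the script runs.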
curl --connect-timeout 30 --user test:Test123 --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?><netapp xmlns="http://www.netapp.com/filer/admin" version="1.130"><system-get-version/></netapp>' -H "Content-Type:text/xml"'https://netapp01:443/servlets/netapp.servlets.admin.XMLrequest_filer'
curl: no URL specified!
curl: try 'curl --help' or 'curl --manual' for more information
I think you are missing a space between the URL and the preceding argument. Try this:
curl --connect-timeout 30 --user test:Test123 --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?><netapp xmlns="http://www.netapp.com/filer/admin" version="1.130"><system-get-version/></netapp>' -H "Content-Type:text/xml" 'https://netapp01/servlets/netapp.servlets.admin.XMLrequest_filer'
[root@it-devops04 harvest-22.08.0-1_linux_amd64]# curl --connect-timeout 30 --user test:Test123 --insecure --data-ascii '<?xml version="1.0" encoding="UTF-8"?><netapp xmlns="http://www.netapp.com/filer/admin" version="1.130"><system-get-version/></netapp>' -H "Content-Type:text/xml" 'https://netapp01:443/servlets/netapp.servlets.admin.XMLrequest_filer'
<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE netapp SYSTEM 'file:/etc/netapp_gx.dtd'>
<netapp version='1.211' xmlns='http://www.netapp.com/filer/admin'>
<results status="passed"><build-timestamp>1670716457</build-timestamp><is-clustered>true</is-clustered><version>NetApp Release 9.11.1P5: Sat Dec 10 23:54:17 UTC 2022</version><version-tuple><system-version-tuple><generation>9</generation><major>11</major><minor>1</minor></system-version-tuple></version-tuple></results></netapp>
That looks good. Was this run from the same machine where you ran the docker logs command earlier? Rahul asked you to try it because the poller is failing to connect to that cluster (dial tcp 172.30.0.4:443: connect: connection refused). Does ping netapp01 show the same IP (172.30.0.4)?
Yes, this was run from the same machine as the docker logs command. The netapp01 IP is different: it is 10.15.187.22
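Since the poller dialed 172.30.0.4 while the host reports a different address, it can help to compare what the name resolves to in each environment. A minimal sketch, assuming Python is available wherever you run it (the helper names `resolve_ipv4` and `matches_expected` are hypothetical):

```python
import socket


def resolve_ipv4(host: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses that `host` resolves to here."""
    infos = socket.getaddrinfo(host, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def matches_expected(host: str, expected_ip: str) -> bool:
    """True if `expected_ip` is among the addresses `host` resolves to."""
    return expected_ip in resolve_ipv4(host)
```

Running `resolve_ipv4("netapp01")` on the host and again inside a container on the same Docker network shows whether the two see different addresses, which would point at name resolution rather than the cluster itself.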
To rule out DNS issues, what if you change your harvest.yml's addr to use the IP address for the cluster instead of the domain name? If you're using NAbox, try changing the Host field from the domain name to the IP.
ok..let me try and get back to you
Indeed, double-check your NAbox DNS settings. I've seen instances where the search domain was discarded.
DNS settings look fine
2024-02-14T18:27:18Z ERR poller/poller.go:1161 > Failed to negotiateAPI error="connection error => Post "https://10.15.187.226:443/servlets/netapp.servlets.admin.XMLrequest_filer\": tls: failed to verify certificate: x509: cannot validate certificate for 10.15.187.226 because it doesn't contain any IP SANs" Poller=netapp01 collector=Zapi
2024-02-14T18:27:18Z ERR poller/poller.go:1161 > Failed to negotiateAPI error="connection error => Post "https://10.15.187.226:443/servlets/netapp.servlets.admin.XMLrequest_filer\": tls: failed to verify certificate: x509: cannot validate certificate for 10.15.187.226 because it doesn't contain any IP SANs" Poller=netapp01 collector=ZapiPerf
2024-02-14T18:27:18Z WRN poller/poller.go:669 > abort collector error="connection error => connection error => Post "https://10.15.187.226:443/servlets/netapp.servlets.admin.XMLrequest_filer\": tls: failed to verify certificate: x509: cannot validate certificate for 10.15.187.226 because it doesn't contain any IP SANs" Poller=netapp01 collector=ZapiPerf object=VolumeNode
2024-02-14T18:27:18Z WRN poller/poller.go:306 > no collectors initialized, stopping Poller=netapp01
2024-02-14T18:27:18Z INF poller/poller.go:531 > cleaning up and stopping [pid=1] Poller=netapp01
Is this a certificate issue?
yes
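The x509 error above says the certificate has no IP SANs, so connecting by IP address cannot pass verification even though connecting by name might. A small sketch of that check, using a dict shaped like the one Python's `ssl.SSLSocket.getpeercert()` returns for a verified certificate. The certificate contents below are invented for illustration; the thread does not show the real certificate.

```python
def san_entries(cert: dict) -> dict[str, list[str]]:
    """Group subjectAltName entries by type, e.g. {"DNS": [...], "IP Address": [...]}."""
    grouped: dict[str, list[str]] = {}
    for kind, value in cert.get("subjectAltName", ()):
        grouped.setdefault(kind, []).append(value)
    return grouped


def covers_ip(cert: dict, ip: str) -> bool:
    """True only if the certificate lists `ip` as an IP SAN."""
    return ip in san_entries(cert).get("IP Address", [])


# Hypothetical certificate that names the host but carries no IP SAN.
example_cert = {"subjectAltName": (("DNS", "netapp01"),)}

print(covers_ip(example_cert, "10.15.187.226"))  # False
```

A certificate like this one validates for the name but not for the address, which is the situation the log message describes.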
If the cert is self-signed or invalid, you need use_insecure_tls: true included for that poller. NAbox shows that in the UI as a checkbox.
Did selecting that checkbox make the poller work?
I made the change to the harvest.yml file with Docker and it worked.
However, I don't see a port on the NetApp poller in the docker ps output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a7e72859a316 ghcr.io/netapp/harvest:latest "bin/poller --poller…" 6 days ago Up About an hour poller-netapp01
b26883f63bc5 ghcr.io/netapp/harvest:latest "bin/poller --poller…" 6 days ago Up About an hour poller-unix
9dc119dd030d grafana/grafana:8.3.4 "/run.sh" 6 days ago Up 6 days 0.0.0.0:3000->3000/tcp grafana
e0a28ef9ee6d prom/prometheus:v2.33.1 "/bin/prometheus --c…" 6 days ago Up 6 days 0.0.0.0:9090->9090/tcp prometheus
I don't see any metrics in the Grafana dashboard, and I don't see my NetApp cluster, datacenter details, etc., so it looks like metrics are not reaching Grafana.
Are you using the compose workflow or NAbox? In an earlier message, it sounded like you were using NAbox.
Trying both
Depending on how you created the harvest-compose file, it may not expose the port. In the compose workflow that is fine, since the Prometheus container can see the poller ports because they're in the same network. Can you paste your harvest.yml file using doctor by running the following? cd to the directory that contains your harvest.yml file, then run:
docker run --rm --env UID=$(id -u) --env GID=$(id -g) --entrypoint "bin/harvest" --volume "$(pwd):/opt/temp" --volume "$(pwd)/harvest.yml:/opt/harvest/harvest.yml" ghcr.io/netapp/harvest doctor --print
Do you see metrics in Prometheus? Open http://$prom-ip:9090, then Status > Targets.
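The same information as the Status > Targets page is available from Prometheus's HTTP API at /api/v1/targets. A rough sketch that lists each active target with its health, assuming Prometheus is reachable at localhost:9090 as in the docker ps output (the function names are made up for illustration):

```python
import json
import urllib.request


def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Fetch the targets payload from the Prometheus HTTP API."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=10) as response:
        return json.load(response)


def summarize_targets(payload: dict) -> list[tuple[str, str, str]]:
    """Return (scrapeUrl, health, lastError) for every active target."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl", ""), t.get("health", "unknown"), t.get("lastError", ""))
        for t in active
    ]
```

Calling `summarize_targets(fetch_targets())` should show a poller target with health "down" and a non-empty last error if Prometheus cannot scrape it. If the poller does not appear at all, Prometheus was never told about it.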
Admin:
Tools:
Exporters:
  prometheus:
    exporter: Prometheus
    local_http_addr: 0.0.0.0
    port_range: 2000-2030
  prometheus1:
    exporter: Prometheus
    port_range: 13000-14000
Defaults:
  collectors:
    - Zapi
    - ZapiPerf
    - Ems
    - Rest
    - RestPerf
  use_insecure_tls: true
Pollers:
  unix:
    datacenter: local
    addr: -REDACTED-
    collectors:
      - Unix
    exporters:
      - prometheus
  us01cmqa:
    datacenter: us01
    addr: -REDACTED-
    auth_style: basic_auth
    username: -REDACTED-
    password: -REDACTED-
    exporters:
      - prometheus1
How can I find prom-ip?
if you are on the same machine that you ran docker ps on, you can use localhost:9090
Yes, that port looks suspect :0 Can you share the harvest-compose.yml file that you generated via https://netapp.github.io/harvest/nightly/install/containers/#generate-a-docker-compose-for-your-pollers
[root@it-devops04 harvest-22.08.0-1_linux_amd64]# cat harvest-compose.yml
version: "3.6"
services:
  unix:
    image: cr.netapp.io/harvest:latest
    image: ghcr.io/netapp/harvest:latest
    container_name: poller-unix
    restart: unless-stopped
    command: '--poller unix --promPort 12990 --config /opt/harvest.yml'
    volumes:
      - /root/harvest-22.08.0-1_linux_amd64/conf:/opt/harvest/conf
      - /root/harvest-22.08.0-1_linux_amd64/cert:/opt/harvest/cert
      - /root/harvest-22.08.0-1_linux_amd64/harvest.yml:/opt/harvest.yml
    networks:
      - backend
  us01cmqa:
    image: cr.netapp.io/harvest:latest
    image: ghcr.io/netapp/harvest:latest
    container_name: poller-us01cmqa
    restart: unless-stopped
    command: '--poller us01cmqa --config /opt/harvest.yml'
    volumes:
      - /root/harvest-22.08.0-1_linux_amd64/conf:/opt/harvest/conf
      - /root/harvest-22.08.0-1_linux_amd64/cert:/opt/harvest/cert
      - /root/harvest-22.08.0-1_linux_amd64/harvest.yml:/opt/harvest.yml
    networks:
      - backend
[root@it-devops04 harvest-22.08.0-1_linux_amd64]#
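One visible difference in the file above is that the unix service passes --promPort 12990 while the us01cmqa service passes no --promPort at all. Whether that was the actual cause is an assumption; the thread only shows the file itself. A small sketch that flags services whose command lacks the flag, with the two command strings copied from the file (the helper name `prom_port` is hypothetical):

```python
import shlex


def prom_port(command: str):
    """Return the port given after --promPort in a poller command, or None if absent."""
    args = shlex.split(command)
    for index, arg in enumerate(args):
        if arg == "--promPort" and index + 1 < len(args):
            return int(args[index + 1])
    return None


# Command strings copied from the harvest-compose.yml shown above.
commands = {
    "unix": "--poller unix --promPort 12990 --config /opt/harvest.yml",
    "us01cmqa": "--poller us01cmqa --config /opt/harvest.yml",
}

missing = [name for name, command in commands.items() if prom_port(command) is None]
print(missing)  # ['us01cmqa']
```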
I've got a meeting to head to. We'll pick back up tomorrow. Try stopping the containers, regenerating the harvest-compose.yml, and bringing them back up. See https://netapp.github.io/harvest/nightly/install/containers/#docker-compose
I can take over on the NAbox option though
@dapper orbit The process you mentioned fixed the issue. I am able to see metrics in Grafana now.