#REST poller timeout


odd flame
#

time=2026-03-02T15:23:27.226-08:00 level=ERROR source=collector.go:449 msg="" Poller=mynode collector=Rest:ClusterSoftware error="failed to fetch data: error making request connection error: Get \"https://mynode.com/api/cluster/software?fields=status_details%2Cupdate_details%2Cvalidation_results&ignore_unknown_fields=true&max_records=500&return_records=true\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" task=data

This call runs for 45s when timed via curl (without the added filtering, so a straight-up /api/cluster/software call), and setting client_timeout: 3m in the Rest:ClusterSoftware metric file doesn't prevent the timeout.
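
For anyone reading along, the failure mode in that log line can be reproduced in miniature. The sketch below is only an illustration in Python, not Harvest code: it starts a local test server that waits before sending its response headers, then calls it once with a client timeout shorter than the wait and once with a longer one. The names SlowHandler, fetch and run_demo are made up for this example, and the 0.5s delay is a scaled-down stand-in for the ~45s the real endpoint takes.

```python
import http.server
import socket
import threading
import time
import urllib.error
import urllib.request

SERVER_DELAY_S = 0.5  # assumption: scaled-down stand-in for the ~45s real call


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical slow endpoint: sleeps before sending any headers."""

    def do_GET(self):
        time.sleep(SERVER_DELAY_S)  # the client is "awaiting headers" here
        body = b'{"records": []}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(url, timeout_s):
    """Return 'ok <status>' on success, or 'failed: ...' if the call errors out."""
    # An empty ProxyHandler makes sure the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_s) as resp:
            resp.read()
            return "ok %d" % resp.status
    except (socket.timeout, urllib.error.URLError) as exc:
        return "failed: %r" % (exc,)


def run_demo():
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/api/cluster/software" % server.server_address[1]
    try:
        short = fetch(url, timeout_s=0.1)  # shorter than the server delay
        long = fetch(url, timeout_s=5.0)   # comfortably longer than the delay
    finally:
        server.shutdown()
        server.server_close()
    return short, long


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that a client-side timeout only helps when the value actually reaches the HTTP client. If the configured 3m were applied, a 45s response would fit well inside it.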

latent ore
#

@odd flame There is a bug in the code: it is not honouring this timeout. This requires a fix.

odd flame
#

msg=Collected Poller=mynode collector=Rest:ClusterSoftware

#

seems to have worked

latent ore
#

Great! Thanks.

hollow kernel
#

@odd flame I just wanted to share this curl command to help see where the slowdowns might be coming from:
curl -s --cert <filename>.crt --key <filename>.key -w "DNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTLS: %{time_appconnect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" -X GET "https://<hostname>/<your api call>"

odd flame
#

DNS: 0.000501s
Connect: 0.031613s
TLS: 0.102540s
TTFB: 35.012498s
Total: 35.043718s

#

For the call that timed out above, these are the results.
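
A note on reading those numbers: curl's -w timers are cumulative, each measured from the start of the request, so the time spent in each phase is the difference between neighbouring values. The Python sketch below parses the five lines posted above and does that subtraction. The function names and phase labels are made up for this example, and "server_wait" is an approximation: it is the gap between the end of the TLS handshake and the first response byte, which also includes the short time needed to send the request.

```python
# Timings copied from the curl output above (cumulative seconds since start).
CURL_OUTPUT = """\
DNS: 0.000501s
Connect: 0.031613s
TLS: 0.102540s
TTFB: 35.012498s
Total: 35.043718s
"""


def parse_timings(text):
    """Turn 'Label: 1.234s' lines into a {label: seconds} dict."""
    timings = {}
    for line in text.strip().splitlines():
        label, value = line.split(":", 1)
        timings[label.strip()] = float(value.strip().rstrip("s"))
    return timings


def phase_durations(t):
    """Convert cumulative timers into per-phase durations (hypothetical labels)."""
    return {
        "dns_lookup": round(t["DNS"], 6),
        "tcp_connect": round(t["Connect"] - t["DNS"], 6),
        "tls_handshake": round(t["TLS"] - t["Connect"], 6),
        "server_wait": round(t["TTFB"] - t["TLS"], 6),  # approximate, see note
        "download": round(t["Total"] - t["TTFB"], 6),
    }


if __name__ == "__main__":
    for phase, seconds in phase_durations(parse_timings(CURL_OUTPUT)).items():
        print(f"{phase:14s} {seconds:.6f}s")
```

On these numbers, roughly 34.9s of the 35.04s total is spent waiting for the cluster to produce the first byte. Network setup and the download itself are negligible.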

hollow kernel
#

I was recently having timeouts on a specific cluster. I discovered those curl options and found I was hitting 140s on the TTFB (the time it takes the cluster to gather and return the results). I looked closer at the cluster audit logs, saw a very large number of API calls from another source, and I am now addressing that. It might be worth checking whether your cluster is busier than usual; otherwise, increasing the timeout as you have may be your fix. Just sharing my recent experience, and I hope it helps.

odd flame
#

This is a new API call for REST that didn't exist in our prior version, so I have nothing to compare it to.