#psisoyev
Last week we had some issues in our internal infra, and one of the nodes wasn't accessible from outside.
Unfortunately, one of our services, which is serving a webhook endpoint, was also located on the node. Therefore, some of the requests were failing. Unfortunately, it took us a couple of days to realize there was a problem because it wasn't visible to us that one of the services wasn't reachable.
From the webhook side, some events were failing with a "Timed out connecting to remote host" response.
We haven't received any notifications from Stripe about the failures. I'm wondering how I can make sure it won't happen again, so I would like to add some kind of monitoring.
Also, I'd like to figure out why we haven't received any notifications from Stripe that the webhook was failing to deliver some events. Could it be because the response had no HTTP error code, only a generic message, or is it a misconfiguration on my side?
You could periodically list all events that failed, with this endpoint: https://stripe.com/docs/api/events/list and the delivery_success parameter.
If your server responds with a status code other than 2xx, we assume the delivery failed. You can learn more about this here: https://stripe.com/docs/webhooks/best-practices#disable-logic
But that's the thing. As you can see in the screenshot there is no HTTP status code at all. It didn't reach the service
You could periodically list all events that failed, with this endpoint
That's an interesting point, I will have a look
But that's the thing. As you can see in the screenshot there is no HTTP status code at all. It didn't reach the service
It should still count as a failure, since we didn't get a 2xx response.
Alright, so I will try to poll the event list.
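A minimal sketch of what such a poller might look like, assuming the List events endpoint is called with delivery_success=false and paginated via starting_after. The function names (fetch_page, undelivered_events, check_and_alert) and the alert callback are hypothetical and not part of any Stripe library. Note that delivery_success=false may also return events that are still being retried, so some alerts could be transient.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

EVENTS_URL = "https://api.stripe.com/v1/events"


def fetch_page(api_key: str, params: dict) -> dict:
    """GET one page of events from Stripe (performs a real network call)."""
    url = EVENTS_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def undelivered_events(
    api_key: str, since: int, fetch: Callable[[str, dict], dict] = fetch_page
) -> Iterator[dict]:
    """Yield events created at or after `since` (Unix seconds) that were not delivered."""
    params = {"delivery_success": "false", "created[gte]": since, "limit": 100}
    while True:
        page = fetch(api_key, params)
        events = page.get("data", [])
        yield from events
        if not page.get("has_more") or not events:
            return
        # Cursor-based pagination: continue after the last event of this page.
        params = {**params, "starting_after": events[-1]["id"]}


def check_and_alert(
    api_key: str,
    since: int,
    alert: Callable[[str], None] = print,
    fetch: Callable[[str, dict], dict] = fetch_page,
) -> int:
    """Call `alert` once if any undelivered events exist; return their count."""
    failed = list(undelivered_events(api_key, since, fetch))
    if failed:
        ids = ", ".join(event["id"] for event in failed[:10])
        alert(f"{len(failed)} Stripe event(s) not delivered since {since}: {ids}")
    return len(failed)
```

This could be run from a scheduled job every few minutes, passing the timestamp of the previous check (minus a small margin) as since and wiring alert to whatever paging or chat tool is already in use.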
Thank you!
Happy to help!
Taking over for my colleague. Let me know if there are any follow-up Qs I can answer!