newtreyes-webhook-delivery | Stripe Developers | Page 1

spare mural Sep 20, 2021, 1:58 PM

#

Can you provide an example of what you mean? If you're not subscribed to an event type when it is emitted, there will not be any future delivery of that event.

hybrid dock Sep 20, 2021, 2:17 PM

#

Seems like when the health of a webhook endpoint is bad, Stripe will queue events for sending but will not send them right away.

#

It is not that the whole endpoint is turned off but only that some events are not sent immediately (only queued to be sent later on).

#

This was not an expected behavior. We were expecting that events were either sent immediately (even when failing) or that the whole endpoint was turned off when the endpoint health was bad.

#

But we were seeing some events arriving while other events were just queued.

#

That will cause out of order events with delays on some events that could span from minutes to hours.

#

So we wanted to read about how this works and if there is any way we can have control over this "throttling" mechanism.

spare mural Sep 20, 2021, 2:34 PM

#

Right, so we may do this in test mode when your endpoint seems particularly unhealthy (lots of errors) ahead of fully disabling it. There is no control over this behaviour as it's not an expected situation; you should ensure your endpoint is able to handle the traffic you've subscribed to or otherwise disable it.
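One common way to keep an endpoint able to handle the traffic it has subscribed to is to acknowledge each delivery quickly and do the real work in the background. The following is only a minimal sketch of that idea using the Python standard library; `receive_webhook`, `worker`, `work_queue`, and `processed` are hypothetical names, signature verification is left out, and a real deployment would likely use a durable queue rather than an in-process one.

```python
import json
import queue
import threading

# Hypothetical in-process queue; a real service would likely use a durable one.
work_queue = queue.Queue()
processed = []  # stands in for the real business logic's side effects


def receive_webhook(raw_body: bytes) -> int:
    """Return an HTTP status code quickly and defer the slow work."""
    try:
        event = json.loads(raw_body)
    except ValueError:
        # Malformed payload: reject it instead of queueing it.
        return 400
    work_queue.put(event)
    return 200  # acknowledge before any slow processing happens


def worker() -> None:
    """Drain the queue in the background; None is the shutdown signal."""
    while True:
        event = work_queue.get()
        try:
            if event is None:
                break
            processed.append(event["id"])  # placeholder for real handling
        finally:
            work_queue.task_done()
```

To try it, start `worker` in a `threading.Thread`, call `receive_webhook` with a JSON body, and put `None` on the queue to stop the worker. The point of the design is that the response time no longer depends on how long the processing takes.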

hybrid dock Sep 20, 2021, 3:09 PM

#

Interesting

#

So in production, this does not happen?

#

A production endpoint is either enabled or disabled, and it can be disabled by the user or by Stripe in cases like these.

#

but if disabled, it will be completely disabled.

#

Am I reading that correctly?

slow flower Sep 20, 2021, 3:14 PM

#

but if disabled, it will be completely disabled.
what do you mean by "completely disabled", not sure I understand

#

but re:

Are there any docs about how Stripe queues events when a specific webhook endpoint starts returning errors?

I don't think there's anything documented here

#

So we wanted to read about how this works and if there is any way we can have control over this "throttling" mechanism.

no control over this either, basically the best thing to do is build resilient webhook handlers that expect events out of order but also make sure that your webhook handler is not returning errors or timing out
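As an illustration of a handler that tolerates out-of-order and repeated deliveries, here is a minimal sketch. It assumes each event is a dict with the `id`, `created`, and `data.object` fields found on Stripe events; `EventStore` and `handle_event` are hypothetical names, and the in-memory store stands in for a database. It skips events it has already seen and ignores events older than the newest one already applied to the same object.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """Hypothetical in-memory store; a real handler would use a database."""
    seen_event_ids: set = field(default_factory=set)
    latest_created: dict = field(default_factory=dict)  # object id -> newest `created`
    state: dict = field(default_factory=dict)           # object id -> latest snapshot


def handle_event(store: EventStore, event: dict) -> str:
    """Apply an event unless it is a duplicate or older than what we hold."""
    event_id = event["id"]
    if event_id in store.seen_event_ids:
        return "duplicate"  # retried delivery: already handled
    store.seen_event_ids.add(event_id)

    obj = event["data"]["object"]
    obj_id = obj["id"]
    created = event["created"]
    if created < store.latest_created.get(obj_id, 0):
        return "stale"  # a newer event for this object was already applied

    store.latest_created[obj_id] = created
    store.state[obj_id] = obj
    return "applied"
```

Because `created` has one-second resolution, two events for the same object can share a timestamp. If that matters, an alternative is to treat the event only as a signal and re-fetch the current object from the API before acting on it.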

hybrid dock Sep 20, 2021, 3:23 PM

#

what do you mean by "completely disabled", not sure I understand
Deactivated because of its health. I mean, we were expecting an endpoint to be deactivated. We were not expecting some events arriving and other events not arriving until minutes / hours later.

slow flower Sep 20, 2021, 3:27 PM

#

We were not expecting some events arriving and other events not arriving until minutes / hours later.
IIRC for unhealthy webhook endpoints, the rate at which events are sent can be slowed down. Don't think there's any documented duration either (mins vs hrs). Do you have an example of a webhook event that was sent hours after creation though? Can have a look

hybrid dock Sep 20, 2021, 3:50 PM

#

I wish I had taken a screenshot

#

If I see that happening again, I will

wraith zealot Sep 20, 2021, 5:56 PM

#

@hybrid dock Is this the original thread relating to the screenshot you just posted in the main channel?

hybrid dock Sep 20, 2021, 5:57 PM

#

Yes

#

Event id: evt_3JbqeALNS7fsUd0U1oZN8EMq

#

#

As you can see, the event was not even attempted

#

but other events were sent.

wraith zealot Sep 20, 2021, 5:59 PM

#

Gotcha - give me a few minutes to read back!

wraith zealot Sep 20, 2021, 6:21 PM

#

Sorry for the wait - I'm still looking!

hybrid dock Sep 20, 2021, 6:28 PM

#

Sounds good

#

No prob

dawn tulip Sep 20, 2021, 6:56 PM

#

@hybrid dock Hey, I'm a bit confused by your explanations earlier in the thread. To my knowledge we do not do the behaviour you are describing where we'd hold events for minutes or hours on end. Events arriving out of order is totally normal and something you have to be resilient against but that's it.
The event id you gave, I'm not sure what you need me to look for on that one

#

That event was created on 2021-09-20 17:51:26 UTC and then sent immediately to your endpoint, where your server failed to respond so we retried it an hour later

#

ugh and now that I say all of this, obviously I'm clearly wrong and someone changed that logic, my bad

hybrid dock Sep 20, 2021, 6:58 PM

#

🙂

dawn tulip Sep 20, 2021, 6:58 PM

#

let me dig into this, this doesn't match my understanding of how webhook delivery works

hybrid dock Sep 20, 2021, 6:58 PM

#

Yeah

#

Same on my side

dawn tulip Sep 20, 2021, 7:04 PM

#

Okayyyy, sorry for the confusion. We talked as a team and it is expected and it's been like this for a year. I have no memory of knowing this but I see I even read the announcement email internally 😅
Based on the code and internal documentation, this behaviour is only for Test mode today. The idea is to stop overwhelming our servers with an endpoint that is mis-behaving in Test mode and failing sometimes millions of events per hour
Looking at the code, it could happen in Live mode one day if your endpoint is clearly down/mis-behaving, but it's only enabled for Test mode today because that's where the majority of the bad load was coming from.
