#Is it expected that Payload goes offline a lot?
The service is hosted by DigitalOcean, and we have not been notified of any outages - so this typically means it's your application code.
I can provide some insight if you provide your project ID from Settings -> Billing
hi @pale bolt, jumping in here. our project id is:
65c420e015aac56684a56746
as @modest stag said, we've been experiencing intermittent outages, typically lasting 5 to 10 minutes. over the last 24 hours, however, the site has been down almost constantly: when it does come back, we get about 30 seconds of uptime before it goes offline again
Any indication of issues in the logs?
I'd be curious if triggering a redeploy from the dashboard would cause any change in behavior.
I host on DO, never had an issue (not saying that it's not possible)
thanks for the responses.
i redeployed and we were back up for all of about 3 mins, then back down.
@pale bolt the only error i found in the logs was this:
(payload): MongoNetworkError: connection xxxx to xx.xxx.xxx.xxx:xxxx closed
upon searching for the issue, i found you opened this issue a while back:
https://github.com/payloadcms/payload/issues/3180
curious if the cause was ever identified?
@trim spear I'm looking at possible causes, I see one post from the mongo community that may be relevant
If you experience network timeouts or socket errors in communication between clients and servers, or between members of a sharded cluster or replica set, check the TCP keepalive value for the affected systems.
try running sysctl net.ipv4.tcp_keepalive_time to check
Though, it seems unlikely to me that the requests Payload makes to Atlas would exceed the defaults
we're getting back net.inet.tcp.keepidle: 7200000
per that highlighted line, it would seem the value of 7200000 is being ignored?
It would seem that way, though the time in both cases is longer than the duration of your testing period
so i'm not sure it's this
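A quick sketch of the unit arithmetic behind the two sysctl keys mentioned above. It assumes that macOS reports `net.inet.tcp.keepidle` in milliseconds and Linux reports `net.ipv4.tcp_keepalive_time` in seconds; the `keepaliveSeconds` helper is hypothetical and only illustrates the conversion. Note also that the `net.inet` key comes from a macOS machine, so it may describe a local laptop and not the server that holds the connection to the database.

```typescript
// Hypothetical helper: normalize a keepalive sysctl reading to seconds.
// Assumption: macOS (net.inet.tcp.keepidle) reports milliseconds,
// Linux (net.ipv4.tcp_keepalive_time) reports seconds.
type KeepaliveReading = { key: string; value: number };

function keepaliveSeconds(reading: KeepaliveReading): number {
  if (reading.key === "net.inet.tcp.keepidle") return reading.value / 1000;
  if (reading.key === "net.ipv4.tcp_keepalive_time") return reading.value;
  throw new Error(`unrecognized sysctl key: ${reading.key}`);
}

// The value reported in this thread.
const local = keepaliveSeconds({ key: "net.inet.tcp.keepidle", value: 7200000 });
console.log(`${local} s = ${local / 3600} h`); // 7200 s = 2 h
```

Under those assumptions the reported value is a two-hour keepalive, which is far longer than the few minutes of uptime seen after each redeploy.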
I'm curious if this is a pooling issue perhaps
thank you for the guidance. we're going to keep troubleshooting.
one thing i did notice when using the API:
https://x.payloadcms.app/api/products/6629c476ace2ebaaa45428dc?locale=undefined&draft=true&depth=1
returns the error:
Too many requests, please try again later.
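That message reads like a rate-limit response, which would normally come with an HTTP 429 status (an assumption, since the status code is not quoted in this thread). If that is what is happening, a client could back off and retry instead of hammering the API. A minimal sketch, where `fetchWithBackoff`, `Fetcher`, and the delay values are all hypothetical names and numbers chosen for illustration:

```typescript
// Hypothetical helper: retry a request when the API answers 429.
// The fetcher is injected so the sketch does not depend on a specific HTTP client.
type MinimalResponse = { status: number };
type Fetcher = (url: string) => Promise<MinimalResponse>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fetchWithBackoff(
  doFetch: Fetcher,
  url: string,
  maxAttempts = 4,
  baseDelayMs = 500,
): Promise<MinimalResponse> {
  for (let attempt = 1; ; attempt++) {
    const res = await doFetch(url);
    // Return on any non-429 response, or when the attempts are used up.
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
    await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
}
```

With the built-in `fetch`, a call could look like `fetchWithBackoff((u) => fetch(u), url)`. Retrying only hides the symptom; it does not explain why the limit is being hit in the first place.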
A root cause was not found for that issue, and we were unable to reproduce it.
I don't see any mongo connection errors in your logs, though, which is odd. Is "back down" the same issue as before?
yeah, same issue as before.
we have a lot of relationship fields throughout our collections. i'm wondering if they're just not optimized well & getting rate limited
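Two things stand out in the request URL quoted earlier: `locale=undefined` looks like an unset variable that was stringified into the query, and `depth` controls how many levels of relationships get populated per request. A rough sketch of building the query so undefined values are dropped and `depth` is set to 0 where populated relationships are not needed. `buildQuery` is a hypothetical helper, not part of Payload:

```typescript
// Hypothetical helper: build a query string, skipping undefined values
// so they are never sent as the literal text "undefined".
type QueryValue = string | number | boolean | undefined;

function buildQuery(params: Record<string, QueryValue>): string {
  const search = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) search.set(key, String(value));
  }
  const qs = search.toString();
  return qs ? `?${qs}` : "";
}

// locale is unset here, so it is left out of the query entirely.
const locale: string | undefined = undefined;
console.log(buildQuery({ locale, draft: true, depth: 0 })); // ?draft=true&depth=0
```

Whether lower depth values would actually reduce the number of requests being counted against the limit is untested here; it would mainly reduce the work done per request.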
yesterday we had over 8 hours of outage, there are no indications as to why? are we the only ones that have reached out about this?
Just the ones in this thread.
Can you give more info on what you're seeing? 504s? Slow requests? Are your front-end and back-end both hosted on Payload Cloud?