#Randomly can't connect to Database


warped glacier
#
  • My production server, which uses Prisma, sometimes throws PrismaClientKnownRequestError with the message "Can't reach database server at", but it only happens less than 10% of the time. Can someone help me resolve this?

More context:

  • The server with Prisma runs in a Docker container managed by AWS EKS
  • The database server is an AWS RDS instance, all within the same cluster.
warped glacier
hot bough
warped glacier
#

In my case, I already had it as connection_timeout = 0
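For reference, Prisma reads these settings from query parameters on the PostgreSQL connection URL, and the documented name is connect_timeout (in seconds, 0 meaning no timeout), alongside pool_timeout and connection_limit. Below is a minimal sketch of building such a URL. The helper name, host, credentials, and values are placeholders assumed for illustration, not details from this thread.

```typescript
// Hypothetical helper: appends Prisma connection settings to a base PostgreSQL URL.
function withPrismaParams(baseUrl: string, params: Record<string, number>): string {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    // Each setting becomes a query parameter, e.g. ?connect_timeout=10
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Placeholder host and credentials (assumptions, not from the thread).
const databaseUrl = withPrismaParams(
  "postgresql://app_user:secret@example-db.internal:5432/appdb",
  { connect_timeout: 10, pool_timeout: 10, connection_limit: 5 },
);
```

The resulting string would typically be supplied through the DATABASE_URL environment variable that the Prisma schema's datasource points at.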

warped glacier
#

@analog maple can you take a look at this? It's becoming more frequent now

analog maple
#

Let me collect a bit more info and then bring it to the Eng team. You’ve got an EKS cluster and an RDS instance. What database engine is the RDS instance? How many (on average) containers are running? Could it be due to a container starting and/or stopping?

warped glacier
#

The RDS instance is running PostgreSQL 14.8. On average, I would say we have around 5-10 pods/containers running. During bursts, I think it may go up to 20

#

Container starting/stopping makes sense, but on our end I see a lot of logs with this error. That doesn't add up, since each container only spins up once during a burst, so there should not be that many 🤔

#

I will do some more digging and keep you updated

analog maple
#

Yeah my thinking is that you've got this managed pool of containers that stop/start (somewhat) frequently. As a new container spins up, maybe it tries to make some calls to the DB but the security groups aren't quite all set up yet? So those fail. I'm not sure
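If the startup-timing theory holds, one way to test or mitigate it is to have each new container wait until the database is reachable before it serves traffic. The sketch below is a generic retry gate with exponential backoff. The function name, attempt count, and delays are assumptions for illustration, and `check` stands in for whatever connectivity call the app uses (with Prisma, possibly `() => prisma.$connect()`).

```typescript
// Hypothetical startup gate: retries `check` until it succeeds or attempts run out.
// Resolves with the number of attempts used; rejects with the last error otherwise.
async function waitForDatabase(
  check: () => Promise<unknown>,
  maxAttempts = 5,
  baseDelayMs = 200,
): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await check();
      return attempt;
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Exponential backoff: baseDelayMs, then 2x, 4x, ...
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

If the errors disappear once pods gate their startup this way, that would support the theory. If they persist on long-running pods, the cause is probably elsewhere.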

warped glacier
#

Yeah, that makes sense, but if that's actually the case then we should only see a few dozen of those error logs at most, maybe 20-30. During bursts, though, we are seeing a couple thousand. And on normal traffic with 5-10 pods, we have around 10-25 instances of those errors
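The counting argument here can be checked against the logs directly: if pod startup were the cause, nearly all errors should fall shortly after each pod's start time. A rough sketch of that tally follows, assuming the error logs can be exported with a pod start time attached. The record shape and field names are hypothetical, not something Prisma or EKS emits.

```typescript
// Hypothetical log record shape; field names are illustrative only.
interface ConnectionErrorLog {
  podName: string;
  podStartedAt: number; // epoch milliseconds
  errorAt: number; // epoch milliseconds
}

// Returns the fraction of errors logged within `windowMs` of their pod's start.
function startupErrorFraction(logs: ConnectionErrorLog[], windowMs: number): number {
  if (logs.length === 0) return 0;
  const nearStartup = logs.filter((log) => log.errorAt - log.podStartedAt <= windowMs);
  return nearStartup.length / logs.length;
}
```

A fraction close to 1 would fit the startup explanation, while a low fraction (errors spread across long-running pods) would point away from it, which matches the doubt raised above.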

#

Also, if that's the case, shouldn't a library like node-pg also experience the same issue?