Hello Railway team,
Following the post-incident guidance at station.railway.com/community/road-to-recovery-post-gcp-outage-builds-d362e48c, I am reporting that my Postgres service is stuck in the catatonit crash loop described in your recovery guide.
Account email: [email protected]
Project: AI Quantified Cyber Risk Platform (Demo SaaS)
Environment: development
Service: Postgres (shinkansen.proxy.rlwy.net:43452)
Project ID: 5704a859-b207-4b5b-b233-f2931525ee2c
URL: https://railway.com/project/5704a859-b207-4b5b-b233-f2931525ee2c/service/ed045148-8655-46af-9f39-d5f915b37838/database?environmentId=817d5fcc-144d-4641-b44a-e6f78c80e465
Symptoms:
- Container crash-loops with: "ERROR (catatonit:2): failed to exec pid1: No such file or directory"
- The volume mounts ("Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/4d83c5b0-80f8-4e16-933b-aa2b5ddd6947/vol_py8otdqidon6miid"), then the container immediately fails
- Manual redeploy from the dashboard attempted multiple times, with no recovery
- DEMO Postgres (junction) and PROD Postgres (mainline) on the same project are unaffected
Per your recovery guide, this matches the "volume may need to be moved to a healthy node" scenario.
Please migrate the volume to a healthy node when your team has capacity.
Plan: [TODO — Hobby]
Thanks,
Dejan