#service redeploy causes problems to private network

median sparrow
#

Hey there, I am using an nginx reverse proxy to forward external requests to internal (non-public) services over the Railway private network. Everything works fine until a service that is proxied to is redeployed. After the redeploy, the proxy responds with a 502 Bad Gateway. I already tried some suggestions that were mentioned in this channel but can't seem to get it working. Any help would be appreciated 🙂

This is my simplified nginx.conf:

server {
    listen 8080 default backlog=16384;
    listen [::]:8080 default backlog=16384;
    resolver [fd12::10] valid=10s;

    underscores_in_headers on;
    server_tokens off;
    absolute_redirect off;

    location /my-service/ {
        proxy_pass http://my-service.railway.internal:8080/;
    }

    location = / {
        default_type text/plain;
        add_header content-length "0";
        return 200 '';
    }
}

Logs before redeploy:

192.168.16.8 - - [18/Oct/2023:14:20:20 +0000] "GET /my-service/ HTTP/1.1" 200 46 "-" "PostmanRuntime/7.33.0" "<my_ip>"

Logs after redeploy:

2023/10/18 15:21:26 [error] 35#35: *1 connect() failed (101: Network is unreachable) while connecting to upstream, client: 192.168.16.8, server: , request: "GET /my-service/ HTTP/1.1", upstream: "http://[fd12:a877:fdc6:0:4000:1:79cf:2751]:8080/", host: "<host_name>"

192.168.16.8 - - [18/Oct/2023:15:21:26 +0000] "GET /my-service/ HTTP/1.1" 502 150 "-" "PostmanRuntime/7.33.0" "<my_ip>"
thick oreBOT
#

Project ID: 1eff5dc1-3495-488e-91ae-4becbfd85e1d

steady axle
#

do you have a healthcheck setup on this "my-service" service?

median sparrow
#

currently not

steady axle
#

without a health check railway won't know when the new deployment is ready to accept connections and be swapped in, so it would likely swap in the service too early, and that's where you would get the 502 errors

median sparrow
#

that makes sense thank you for the super quick reply 🙂 i will try it out right away

steady axle
#

does this "my-service" service have a volume?

median sparrow
#

no but i have a few services that do

steady axle
#

okay, because even with a healthcheck, services with volumes will always have some downtime during a redeploy, to prevent two services reading/writing from the same volume (this is to prevent data corruption)

#

but not applicable in this case since you said the service you are proxying to does not have a volume, just thought it would be good to mention

median sparrow
#

seems reasonable, thanks for the info 🙂

steady axle
#

no problem, let me know if setting up a healthcheck on the my-service helps!

median sparrow
#

unfortunately this didn't fix my problem, i added a health check and i saw the health check succeed in the service deploy logs for "my-service".

steady axle
#

for how long after are you seeing this 502?

median sparrow
#

the proxy kept getting http 200 responses from the old instance for about 20-30 seconds, after that the 502 was returned (i guess when the new instance took over). Each 502 response takes about 1000 ms.

steady axle
#

try removing the valid=10s from the resolver directive, we want nginx to resolve a new ipv6 ip on every incoming request since the internal services are likely using dynamic ips
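
For context, nginx resolves a literal hostname in proxy_pass once, when the configuration is loaded, and the resolver directive is only consulted when the proxy_pass target contains a variable. A minimal sketch of that variant, reusing the names from the config above (the $upstream variable name and the rewrite rule are illustrative assumptions, and this was not tested in this thread):

```nginx
server {
    listen 8080;
    listen [::]:8080;

    # Railway's private-network DNS server, as in the original config
    resolver [fd12::10] valid=10s;

    location /my-service/ {
        # Putting the hostname in a variable makes nginx look it up at
        # request time through the resolver instead of once at startup.
        set $upstream my-service.railway.internal;

        # With a variable in proxy_pass the location prefix is no longer
        # replaced automatically, so strip it explicitly.
        rewrite ^/my-service/(.*)$ /$1 break;
        proxy_pass http://$upstream:8080;
    }
}
```

With this layout a redeployed service should be picked up once the cached address expires, which the valid=10s parameter caps at 10 seconds.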

#

I have a good feeling this wouldn't even be an issue if you used caddy as your proxy

median sparrow
#

i removed the valid=10s from the resolver, the problem is still there :/

steady axle
#

okay im going to try with my caddy proxy setup

median sparrow
#

I might try caddy if it supports auth subrequests (that is one of the reasons i use nginx currently)
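
For reference, Caddy ships a forward_auth directive that covers roughly the same ground as nginx's auth_request. A rough sketch, where the auth service hostname, the /verify path, and the copied header name are all hypothetical placeholders rather than anything from this thread:

```caddy
:{$PORT} {
    handle_path /my-service/* {
        # Ask the (hypothetical) auth service first; a 2xx response lets
        # the request through, anything else is returned to the client.
        forward_auth auth-service.railway.internal:8080 {
            uri /verify
            # Copy a header from the auth response onto the proxied request
            copy_headers X-User-Id
        }

        reverse_proxy my-service.railway.internal:8080
    }
}
```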

steady axle
#

just tested, refreshed at half second intervals through a caddy proxy while the upstream proxy endpoint was deploying and it was a perfectly seamless switchover
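
A minimal sketch of what such a Caddy setup could look like, assuming the same my-service.railway.internal:8080 upstream as in the nginx config above. It is an illustration only, not necessarily the exact config used in the test:

```caddy
{
    admin off
    auto_https off
}

:{$PORT} {
    # handle_path strips the /my-service prefix before proxying,
    # much like the trailing slash in the nginx proxy_pass did.
    handle_path /my-service/* {
        reverse_proxy my-service.railway.internal:8080
    }
}
```

The likely reason this switches over cleanly is that the upstream hostname is looked up when Caddy dials a connection, so a new private-network address is picked up without reloading the proxy.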

median sparrow
#

yes this seems to be it, i will try it out but it might take me a while (maybe until tomorrow)

steady axle
#

this is the same thing i just used in my testing

median sparrow
#

thank you 😊

steady axle
#

no problem!

plush sand
#

@lyric quest

steady axle
#

acron, do you know them?

plush sand
#

Err yeah was just tagging him because we're struggling with nginx reverse proxy as well :p

steady axle
#

use caddy 🙂

plush sand
#

just checking out your project now 😉

median sparrow
#

caddy works indeed fine 👏 damn nginx 😂

steady axle
#

nginx 👎

median sparrow
#

i used the same config you provided, i still need to solve the auth subrequest but this wasn't used in my nginx example anyways. should i mark this as done?

steady axle
#

acron, yanis, if you need any help please open your own help thread 🙂

median sparrow
#

thanks for the help Brody 🙂

steady axle
#

no problem!

median sparrow
#

Sorry to bother you again, and sorry that this is not directly related to Railway. Migrating my existing nginx proxy to Caddy created a small issue for me: I need to proxy to a non-Railway service (Google Cloud Run in my case). This is only temporary, and we plan to move that service to Railway shortly.
The nginx proxy works on the Railway platform; the Caddy one unfortunately does not, even on my local machine. Proxying to the Google Cloud Run service produces the same error page that a direct proxy to google.com produces (that's why I included it in the config example).
Have you faced the same or a similar issue before?

{
    admin off
    persist_config off
    auto_https off

    log {
        format console
    }
    servers {
        trusted_proxies static private_ranges
    }
}

:{$PORT} {
    log {
        format console
    }

    handle_path /test/* {
        reverse_proxy https://google.com
    }
}
steady axle
#

what errors are you facing?

#

@median sparrow 🙂

#

this may help, since you are proxying http to https

reverse_proxy https://example.com {
    header_up Host {upstream_hostport}
}
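
Dropped into the site block from the config above, that snippet would look roughly like this (a sketch assuming the same /test/* route and google.com stand-in upstream):

```caddy
:{$PORT} {
    log {
        format console
    }

    handle_path /test/* {
        reverse_proxy https://google.com {
            # Send the upstream's own host and port as the Host header
            # instead of the host the client originally requested.
            header_up Host {upstream_hostport}
        }
    }
}
```

An HTTPS upstream usually routes and validates requests by Host, so forwarding the proxy's own hostname is a common cause of empty or error responses.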
median sparrow
# steady axle what errors are you facing?

This is an example log from a request proxied to another external Railway service I use. Even though the status indicates 200, no content is returned:

2023/10/19 15:30:31.573    INFO    http.log.access.log0    handled request    {"request": {"remote_ip": "192.168.16.6", "remote_port": "33152", "client_ip": "my-ip", "proto": "HTTP/1.1", "method": "GET", "host": "<exposed_service>.up.railway.app", "uri": "/test/", "headers": {"X-Forwarded-For": ["my-ip"], "X-Forwarded-Proto": ["https"], "X-Envoy-External-Address": ["my-ip"], "User-Agent": ["PostmanRuntime/7.33.0"], "Postman-Token": ["0352246b-7b5f-4744-8fe8-999679d67d67"], "Accept": ["*/*"], "Accept-Encoding": ["gzip, deflate, br"], "X-Request-Id": ["e36b35ce-d6cf-41f5-90ac-0cf6b3f24e39"]}}, "bytes_read": 0, "user_id": "", "duration": 0.003001039, "size": 0, "status": 200, "resp_headers": {"Server": ["Caddy", "railway"], "Date": ["Thu, 19 Oct 2023 15:30:31 GMT"], "Content-Length": ["0"]}}
steady axle
#

hmmm I've seen that before

#

unfortunately I forgot the fix

#

so let me know if that caddyfile snippet i gave you does anything

median sparrow
steady axle
#

awesome

#

(reading the caddy docs for the proxy directive goes a long way)

median sparrow
#

yep indeed a case of RTFM for me 😅