# Issues setting up Datadog with Docker containers


pulsar hull
#

Hi - I'm trying to use private networking to set up datadog. I have two docker files that start normally: server and datadog. Both seem to run fine.

I enabled private networking for the project already.

  1. What are the environment variables I can use to refer to the hostnames of the services? I had to hard code them in.
  2. It doesn't seem like the services are able to talk to one another. I don't see any errors, but I only get silence. Is there something off about the way I defined the hostnames below?
  3. When I run railway run echo $DD_SERVICE or any similar command to check the environment variables of the service, I don't get anything. Is this ok?

Here are the dockerfiles:

```dockerfile
# Start from the official Node.js 18 image
FROM node:18

# Define environment variables
ENV DD_ENV=prod \
    DD_LOGS_INJECTION=true \
    DD_SERVICE=cami-server \
    DD_AGENT_HOST='datadog.railway.internal'

# Create app directory
WORKDIR /usr/src/app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install app dependencies
RUN yarn install --frozen-lockfile

# Copy app source code
COPY . .

# Build the app
RUN yarn run build

# Start the app
CMD [ "yarn", "run", "start" ]
```

Datadog:

```dockerfile
# Start from the official Datadog agent image
FROM datadog/agent:latest

# Copy your Datadog configuration to the correct location
COPY datadog.yaml /etc/datadog-agent/datadog.yaml

# Set the hostname and port
ENV DD_HOSTNAME='datadog.railway.internal' \
    DD_LOGS_ENABLED=true \
    DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
    DD_BIND_HOST=::

# Start the Datadog agent
CMD ["/init"]
```
gusty sail BOT
#

Project ID: 900023fc-eb3b-4c5e-b8bf-83d6fcb5a72f

pulsar hull
#

900023fc-eb3b-4c5e-b8bf-83d6fcb5a72f

#

Note: I tried replicating the environment we have for our other project, and it did not work for Node, so I have to try a new way

thorn hare
#
  1. these are all the railway variables https://utilities.up.railway.app/env-vars?prefix=RAILWAY_
  2. for services to talk to each other over private networking, they both need to be listening on [::] since the private network is IPv6-only, but I'm not sure how to get the datadog agent to listen on [::]
  3. railway run runs the command locally with the service variables available, so unless you have DD_SERVICE in your service variables, it will echo nothing
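
For the Node side of point 2, a minimal sketch of binding an HTTP server to `::` so that it accepts connections on an IPv6-only network. The `PORT` fallback of 3000, the handler, and the names `handler` and `start` are assumptions for illustration, not taken from the project in this thread.

```javascript
const http = require("http");

// "::" is the IPv6 wildcard address; binding to it accepts connections on all IPv6 addresses.
const LISTEN_HOST = "::";

// Assumption: the platform provides PORT; 3000 is only an illustrative fallback.
const LISTEN_PORT = Number(process.env.PORT) || 3000;

// Trivial request handler so there is something to answer a health check.
function handler(req, res) {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("ok\n");
}

// Start the server and resolve once it is accepting connections.
function start(port = LISTEN_PORT, host = LISTEN_HOST) {
  return new Promise((resolve, reject) => {
    const server = http.createServer(handler);
    server.once("error", reject);
    server.listen(port, host, () => resolve(server));
  });
}
```

Calling `start().then((server) => console.log(server.address()))` from the app entry point would print the bound address, which should show `::` if the bind worked.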
pulsar hull
#

Thanks! Does that mean that using ENV in the Dockerfile doesn't set any env vars?

#

Separately, I'm a bit confused by railway run. What do you mean it runs the command locally? I thought it ran it in the cloud service itself

#

cc @thorn hare

#

(in case this fell by the side)

thorn hare
pulsar hull
#

no problem thank you!

#

i see ok!

#

i will try more

#

(sorry for pinging then)

thorn hare
#

no worries

pulsar hull
#

Is there a way to run a command directly in the container?

#

when it's already running

thorn hare
#

there is not

#

what command do you want to run?

pulsar hull
#

the datadog agent status command

#

or like

#

ping datadog.railway.internal

#

to make sure things are communicating as expected

thorn hare
#

yeah, there's no way to run those commands in the container unless you include a shell script in your project that runs the desired commands
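
The same idea can also live in the app itself rather than in a shell script. A minimal sketch, assuming the check runs inside the Node service at runtime: it resolves a hostname to an IPv6 address and returns the outcome instead of throwing, so the result can simply be logged. The helper name `checkInternalHost` is hypothetical; the hostname in the usage note is the one from this thread.

```javascript
const dns = require("dns");

// Hypothetical helper: resolve a hostname to an IPv6 address and report the outcome.
// The lookup function is injectable so the helper can be exercised without a network.
async function checkInternalHost(hostname, lookup = dns.promises.lookup) {
  try {
    // family: 6 restricts the lookup to IPv6 addresses.
    const { address } = await lookup(hostname, { family: 6 });
    return { hostname, ok: true, address };
  } catch (err) {
    // err.code is typically something like "ENOTFOUND" when the name does not resolve.
    return { hostname, ok: false, error: err.code || err.message };
  }
}
```

For example, `checkInternalHost("datadog.railway.internal").then(console.log)` logs either the resolved address or the error code, which is roughly what `dig` would tell you.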

pulsar hull
#

Also, do you know what linux distro this is running? It looks like I need to install ping / dig, and i wanna make sure i use the right package manager

#

in the docker file

#

i see! that's what i'm gonna try to do (but right in the docker file)

thorn hare
#

node:18 uses a Debian base, so apt-get is the package manager there, but I'm not sure about the datadog image, Docker Hub will tell you though

pulsar hull
#

ah i see thanks!

thorn hare
#

no problem

pulsar hull
#

looks like the ping doesn't succeed (using ping6 for ipv6 too). I'm pinging datadog.railway.internal from server.railway.internal

I also used :: on both datadog and my server to bind to all IPv6 addresses (datadog supports it too)

#

Maybe I'm binding wrong?

#

Also, none of the images are Alpine based

thorn hare
#

these services are in the same project and environment right?

pulsar hull
#

yeah!

#

running dockerfiles:

```dockerfile
FROM node:18

# Define environment variables
ENV DD_ENV="prod" \
    DD_LOGS_INJECTION=true \
    DD_SERVICE="cami-server" \
    DD_AGENT_HOST="datadog.railway.internal"

RUN echo $DD_ENV
RUN echo $DD_LOGS_INJECTION
RUN echo $DD_SERVICE
RUN echo $DD_AGENT_HOST

RUN apt-get update && apt-get install -y inetutils-ping

# ping takes a bare hostname (no port or path)
RUN ping6 -c 4 datadog.railway.internal

# Create app directory
WORKDIR /usr/src/app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install app dependencies
RUN yarn install --frozen-lockfile

# Copy app source code
COPY . .

# Build the app
RUN yarn run build

# Start the app
CMD [ "yarn", "run", "start" ]
```

```dockerfile
# Start from the official Datadog agent image
FROM datadog/agent:latest

# Copy your Datadog configuration to the correct location
COPY datadog.yaml /etc/datadog-agent/datadog.yaml

# Set the hostname and port
ENV DD_HOSTNAME="datadog.railway.internal" \
    DD_LOGS_ENABLED=true \
    DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
    DD_BIND_HOST=::

# Print all variables
RUN echo $DD_HOSTNAME
RUN echo $DD_LOGS_ENABLED
RUN echo $DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
RUN echo $DD_BIND_HOST

# Start the Datadog agent
CMD ["/init"]
```
#

all env vars look great btw - no errors

#

just no communication in between

#

do i need to bind to a specific port? i just did all

thorn hare
#

what's the default port the datadog agent listens on?

pulsar hull
#

Datadog uses multiple ports. 5000 is one of them, for the main server. traces are sent on 8126

thorn hare
pulsar hull
#

Actually, I think the error from curl is easier to understand:


```
   0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: datadog.railway.internal
#11 ERROR: process "/bin/sh -c curl datadog.railway.internal:5000/health" did not complete successfully: exit code: 6
-----
> [ 7/12] RUN curl datadog.railway.internal:5000/health:
#11 0.424   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
#11 0.424                                  Dload  Upload   Total   Spent    Left  Speed
#11 0.424 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: datadog.railway.internal
-----
```

It looks like the host cannot be resolved.
#

normally, there should be at least a 404 not found result (that's what i get normally) -> it still means there's a server there

thorn hare
#

the internal resolver does not become active until about 3 seconds after app start; during those 3 seconds, internal domains will not resolve

pulsar hull
#

i see - let me try to run it after then

#

after the node js server is live

thorn hare
#

where did you run that curl from

pulsar hull
#

the server service in the same environment

#

in the dockerfile

thorn hare
#

sleep 3 && curl datadog.railway.internal:5000/health

pulsar hull
#

i just did that 😦

thorn hare
#

internal network is not available during build

pulsar hull
#

gotcha - then let me try to run something at runtime

#

i'll do it in node

thorn hare
pulsar hull
#

wdym?

thorn hare
#

thats what you want to do

pulsar hull
#

oh yeah - but i'm not sure what you mean by open source. sorry haha

#

do you mean your utility?

thorn hare
#

yeah

pulsar hull
#

oh i see

thorn hare
#

let me know how the request from within node goes, make sure to sleep for 3 seconds
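
A minimal sketch of such a runtime check, assuming plain HTTP and the port and path used earlier in this thread: wait about 3 seconds for the internal resolver, then make one request and report what happened. The helper names `getStatus` and `checkAfterDelay` and the 5 second timeout are hypothetical choices, not part of any Railway or Datadog API.

```javascript
const http = require("http");

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// GET a URL and resolve with the HTTP status code.
// Any response at all (even a 404) shows that something is listening at that address.
function getStatus(url, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const req = http.get(url, (res) => {
      res.resume(); // discard the body, only the status matters here
      resolve(res.statusCode);
    });
    req.setTimeout(timeoutMs, () => req.destroy(new Error("request timed out")));
    req.on("error", reject);
  });
}

// Wait for the internal resolver to come up, then try the request once.
// The request function is injectable so the helper can be exercised without a network.
async function checkAfterDelay(url, delayMs = 3000, get = getStatus) {
  await sleep(delayMs);
  try {
    return { url, reachable: true, status: await get(url) };
  } catch (err) {
    return { url, reachable: false, error: err.code || err.message };
  }
}
```

Something like `checkAfterDelay("http://datadog.railway.internal:5000/health").then(console.log)` near the top of the app entry point would log the result once per deploy.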

pulsar hull
#

the request failed, but now datadog works. i have no idea why

#

i literally did not do anything special

#

but ... i guess victory?

#

i defined it as two separate docker containers fwiw

#

maybe other people will need this in the future

#

i did not change anything other than ping / curl / dig commands since i last checked :/

thorn hare
#

the sleep?

pulsar hull
#

nope - removed that and dd still works

#

because that's at runtime

#

is there a delay to the internal network coming online or some weird cache?

#

i went commit by commit and i couldn't find anything that changed :/

#

i'm bewildered

#

i want it to be something at least

#

the only thing that comes to mind is using a wrong env var, but i fixed that problem 8 hours ago (DD_AGENT_HOST instead of DD_HOSTNAME in the node service). and i saw no traces until 15 min ago
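
For anyone who hits the same mix-up, a small sketch of a startup check in the Node service. It assumes the usual Datadog convention that tracing libraries read the agent address from `DD_AGENT_HOST` (and the trace port from `DD_TRACE_AGENT_PORT`, defaulting to 8126), while `DD_HOSTNAME` is a setting of the agent itself. The helper name `resolveAgentAddress` and the warning text are hypothetical.

```javascript
// Hypothetical helper: work out where the tracer in this service would send traces,
// and flag the DD_HOSTNAME / DD_AGENT_HOST mix-up described above.
function resolveAgentAddress(env = process.env) {
  const warnings = [];
  if (!env.DD_AGENT_HOST && env.DD_HOSTNAME) {
    warnings.push(
      "DD_HOSTNAME is set but DD_AGENT_HOST is not; the tracer needs DD_AGENT_HOST to find the agent"
    );
  }
  return {
    // Assumed defaults: localhost and port 8126 when nothing is configured.
    hostname: env.DD_AGENT_HOST || "localhost",
    port: Number(env.DD_TRACE_AGENT_PORT) || 8126,
    warnings,
  };
}
```

Logging `resolveAgentAddress()` once at startup makes it obvious whether the service is pointed at `datadog.railway.internal` or silently falling back to localhost.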

#

alas, thank you!