#I tried simply running a `dagger core
This is the behaviour that I am seeing in the registry.dagger.io/engine:v0.12.5 image:
What engine image are you using? What steps should I follow to reproduce the issue?
I am using my own but the base is the image in GHCR. We pull it proxied with our artifactory with mycompany.com/dagger/engine:v0.12.5
So my context needs my company proxies. I wonder if they are not honored in this context
because what I saw was a timeout.
09:06:22 [INFO] Waiting for Dagger Engine to be ready. This could take upto a minute
[Pipeline] sh
09:06:23 + dagger core version
09:06:23 1 : connect
09:06:23 2 : connecting to engine
09:16:23 2 : connecting to engine ERROR [10m0.0s]
09:16:23 2 : ! new client: context deadline exceeded
09:16:23 1 : connect ERROR [10m0.0s]
09:16:23 1 : ! start engine: new client: context deadline exceeded
09:16:23
09:16:23 Error: start engine: new client: context deadline exceeded
in your image, what is the _EXPERIMENTAL_DAGGER_RUNNER_HOST env variable set to?
I removed it entirely to test this. That's when I got the timeout. It shouldn't be needed in this context
in the default image, this is set to _EXPERIMENTAL_DAGGER_RUNNER_HOST=unix:///var/run/buildkit/buildkitd.sock
My expectation was I won't need to set ...RUNNER_HOST as I am executing inside the engine image
yes, agreed. when running inside the image, e.g. docker exec -it dagger-engine-7d4613636ea6099e sh, what does the following command return? env | grep RUNNER
OK, so that's configured as expected.
Inside this container, what does the following command return?
apk add procps
ps -Fp 1
OK, so that's the problem. There should be a running engine in the container, which means that pid 1 should be:
/ # ps -Fp 1
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
root 1 0 23 586173 810396 3 14:43 ? 00:09:49 /usr/local/bin/dagger-engine --config /etc/dagger/engine.toml --debug
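If you want to script that check instead of eyeballing the ps output, here is a minimal sketch in POSIX sh. It assumes /proc is mounted in the container; pid_cmdline and is_engine_cmdline are made-up helper names for illustration, not part of Dagger.

```shell
#!/bin/sh
# Sketch: decide whether a process's command line looks like the Dagger engine.
# pid_cmdline and is_engine_cmdline are hypothetical helpers, not Dagger tooling.

# Print the command line of a PID (default: 1), turning NUL separators into spaces.
pid_cmdline() {
    tr '\0' ' ' < "/proc/${1:-1}/cmdline"
}

# Succeed if the given command line mentions the dagger-engine binary.
is_engine_cmdline() {
    case "$1" in
        *dagger-engine*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the engine container:
#   if is_engine_cmdline "$(pid_cmdline 1)"; then
#       echo "engine is running as PID 1"
#   else
#       echo "PID 1 is something else: $(pid_cmdline 1)"
#   fi
```

With a long-running cat as the container command, PID 1 would show up as cat rather than dagger-engine, which matches what the timeout suggests.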
hmm, wonder why it's not running on mine
how was this engine container started?
Right now WithDefaultTerminalCmd([]string{"sh"}).Terminal(). That may be a problem
In Jenkins.. I have to do a long running command so Jenkins knows to keep the container alive. I am doing a cat in Jenkins
When I tried the default entrypoint it threw an error. Let me try it again
sorry, got into a meeting. I'll reply in a few mins
I tried removing it but it didn't make a difference. Isn't the entrypoint skipped by default now? How do I tell it to use the default?
Instead of trying Terminal. I directly tried to run the image. Here's what I got
#!/bin/sh
set -e
cat $0
# cgroup v2: enable nesting
# see https://github.com/moby/moby/blob/38805f20f9bcc5e87869d6c79d432b166e1c88b4/hack/dind#L28
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
    # move the processes from the root group to the /init group,
    # otherwise writing subtree_control fails with EBUSY.
    # An error during moving non-existent process (i.e., "cat") is ignored.
    mkdir -p /sys/fs/cgroup/init
    xargs -rn1 < /sys/fs/cgroup/cgroup.procs > /sys/fs/cgroup/init/cgroup.procs || :
    # enable controllers
    sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
        > /sys/fs/cgroup/cgroup.subtree_control
fi
# expect more open files due to per-client SQLite databases
# many systems default to 1024 which is far too low
ulimit -n 1048576
exec /usr/local/bin/dagger-engine --config /etc/dagger/engine.toml "$@"
mkdir: can't create directory '/sys/fs/cgroup/init': Read-only file system
This is what I'm seeing within Jenkins when I remove the cat
exec /usr/local/bin/dagger-engine --config /etc/dagger/engine.toml "$@"
WARN[0000] failed to rehash ca-certificates: ERROR: Access denied '/usr/local/share/ca-certificates' error="exit status 2"
buildkitd: install resolv.conf: remount /etc/resolv.conf to upstream alias: operation not permitted
At least in Jenkins, this is not running as a privileged container at this point (and I have no way to change that anyway), which is what causes this issue. I'm afraid the way I am currently running it is the only way to get it going. It's not buying me much, but it makes my code cleaner. I won't be able to use the CLI inside the image as intended; I will need xxx_RUNNER_HOST.
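A rough preflight for the mkdir failure above, as a sketch only. It reuses the same cgroup v2 test as the entrypoint script and then checks writability; cgroup_nesting_blocked is a hypothetical helper name, and the writability test is a heuristic I am assuming is good enough to catch the read-only mount. It says nothing about the other privilege errors (like the resolv.conf remount).

```shell
#!/bin/sh
# Sketch: guess whether the entrypoint's "enable nesting" step would fail.
# cgroup_nesting_blocked is a hypothetical helper, not part of the engine image.

# Succeed (exit 0) when cgroup v2 is detected but the hierarchy is not writable.
cgroup_nesting_blocked() {
    root="${1:-/sys/fs/cgroup}"
    # Same cgroup v2 detection the entrypoint uses; without this file
    # the entrypoint skips the nesting step entirely.
    [ -f "$root/cgroup.controllers" ] || return 1
    # Heuristic: a read-only mount reports as not writable.
    [ ! -w "$root" ]
}

# Usage before relying on the default entrypoint:
#   if cgroup_nesting_blocked; then
#       echo "cgroup hierarchy is read-only; the default entrypoint will likely fail here"
#   fi
```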
this worked for me. The jenkins controller is running in K8S (minikube in my case). The agent is a k8s pod.
here is a pipeline
pipeline {
  agent {
    kubernetes {
      yaml '''
        apiVersion: v1
        kind: Pod
        metadata:
          name: ex-ci-pod
        spec:
          containers:
          - name: ci-runner
            image: registry.dagger.io/engine:v0.12.5
            args: [ "--debug" ]
            securityContext:
              privileged: true
        '''
      defaultContainer 'ci-runner'
    }
  }
  stages {
    stage('Main') {
      steps {
        sh 'hostname'
        sh "which dagger"
        sh "dagger -m github.com/shykes/daggerverse/hello@v0.3.0 call hello"
      }
    }
  }
}
This won't work for me because my company prohibits running securityContext.privileged: true 😦 The only container allowed to run --privileged is the dind container we use to run all other docker images with. That's how I am able to run dagger today.
ah -- bummer.
If you are not using the default entrypoint in the engine image, how are you starting the engine?
I am using the engine image just for the CLI right now. So within the engine image when I do a dagger call it starts another instance of the engine. BUT DOCKER_HOST within the first engine is set to my sidecar which is privileged. So the second engine that's spawned runs in that sidecar
OK. Is there an already running Engine that you would like the CLI in the engine image to connect to?
No running engine as we spawn a new engine for every pipeline.
Eventually we would like to have one or more remote engines to connect to that would maintain cache etc. We aren't there yet. Also looking forward to the dagger team externalizing the distributed caching component from the cloud so we can self-host.
is this part of the dagger's roadmap -- to externalize the distributed caching?
also, when running dagger as a DaemonSet in K8S, it uses hostPath to expose the buildkit socket.
when the CI pod runs, it needs to map that buildkit hostPath into the container.
is there a way to connect to the dagger engine via TCP ?
nvm -- I believe this is being answered here: #1274047894239973437 message
> also, when running dagger as a DaemonSet in K8S, it uses hostPath to expose the buildkit socket.
We'll be changing that, since the buildkit socket itself doesn't need to be a hostPath.
> is there a way to connect to the dagger engine via TCP ?
You can, you can see the list of ways for connecting to the engine here: https://docs.dagger.io/manuals/administrator/custom-runner#connection-interface
right, but how do I tell the dagger engine to listen on the address:port ?
You can specify the --addr flag multiple times, for example, to start both the socket and TCP: --addr unix:///var/run/buildkit/buildkitd.sock --addr tcp://0.0.0.0:8080
sorry, but this still isn't very clear for me.
the way i understand things so far:
helm deploy dagger creates a DaemonSet and mounts the host path. In the helm chart, I add --addr tcp://0.0.0.0:8080 when starting the dagger engine?
https://github.com/dagger/dagger/blob/main/helm/dagger/values.yaml#L20
and in the dagger-cli, i add _EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://<node-address>:8080
am I on the right path?
Yes!
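To put the two halves side by side, a small sketch of that setup. The --addr flags, port 8080, and the env variable are the ones from this thread; runner_host and NODE_ADDRESS are hypothetical names for illustration, and how the extra args get into the helm chart is left out because it depends on the chart version.

```shell
#!/bin/sh
# Engine side: the image's entrypoint passes its arguments through to
# dagger-engine, so the extra listeners can be given as container args:
#   --addr unix:///var/run/buildkit/buildkitd.sock --addr tcp://0.0.0.0:8080

# CLI side: build the value for _EXPERIMENTAL_DAGGER_RUNNER_HOST.
# runner_host is a hypothetical helper; the port defaults to 8080 as above.
runner_host() {
    printf 'tcp://%s:%s\n' "$1" "${2:-8080}"
}

# Usage on the CI pod (NODE_ADDRESS is an assumed variable holding the node's address):
#   export _EXPERIMENTAL_DAGGER_RUNNER_HOST="$(runner_host "$NODE_ADDRESS")"
#   dagger core version
```

If dagger core version still hangs on "connecting to engine", checking that the port is reachable from the pod is a reasonable first step.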
I gave a different perspective about it here: https://www.youtube.com/watch?v=tXoIpioUCsA&t=803s (no Kubernetes required). Works the same in Kubernetes too.