Run Dagger on Kubernetes | Dagger

prisma leaf Mar 18, 2024, 5:25 PM

#

I assume that the runner I'm running would need to have kubectl installed as well as the appropriate kubeconfig to talk to the k8s API? Definitely feels like the docs are missing a step or two to get this wired up properly.

empty turret Mar 18, 2024, 5:42 PM

#

Hey @prisma leaf!! Are you setting up your runners on kubernetes and the dagger engine as a DaemonSet? Meaning, will runners connect to a container pod on the same host?

prisma leaf Mar 18, 2024, 5:43 PM

#

Yes, I believe I have all that set up properly. The piece I'm not sure about is the Persistent Volume with the local cache, but... I figured if I could get some dagger jobs running with the engine on the node, I would learn pretty quickly whether that PV was set up improperly.

#

But I guess the question is... If the engine is running on the same node, shouldn't I just be able to point the gitlab runner to a specific port on the node?

empty turret Mar 18, 2024, 5:47 PM

#

That is correct. Not a port directly, but rather a UDS (Unix domain socket). You can mount the socket and then set the environment variable that the Dagger CLI uses to connect to the engine. Here is a quick example we have for GitHub runners:
The pod that will run the dagger CLI has the volume mount:

- name: varrundagger
  mountPath: /var/run/buildkit

And the env variable:

env:
- name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
  value: unix:///var/run/buildkit/buildkitd.sock

This works because the Dagger Helm chart deploys the DaemonSet and mounts the /var/run/buildkit directory from the host.

prisma leaf Mar 18, 2024, 5:49 PM

#

#

This is what I get on the describe for my dagger pod

#

So... I imagine I just need to alter what you gave me a bit for my varrundagger?

empty turret Mar 18, 2024, 5:50 PM

#

The daemonset pod looks correct. What I was referring to is the runner pod that needs the changes 👍

prisma leaf Mar 18, 2024, 5:50 PM

#

Hmmm, so I need to mount that path on the Gitlab runners is what you're saying?

empty turret Mar 18, 2024, 5:53 PM

#

Exactly! On the GitLab runner you are going to run the Dagger CLI. The CLI needs to connect to the engine via a Unix domain socket (there are other ways to connect to the engine, but for your use case this one seems the most appropriate). That is why you mount the socket the engine listens on directly on the runner pod.

We'll review the docs and make some updates to make it more clear!

prisma leaf Mar 18, 2024, 5:54 PM

#

My gitlab job...

  extends: [.dagger]
  variables:
    _EXPERIMENTAL_DAGGER_RUNNER_HOST: "unix:////var/run/dagger/dagger.sock"
  rules:
    - if: $TRIGGER_ACTION == 'dagger-help'
  script:
    - dagger --help

I'll attempt to figure out how to get that mount path established on the gitlab runner. Wish me luck. Hah.

empty turret Mar 18, 2024, 5:56 PM

#

I think I messed up my explanation a bit. When I say the runner pod, I'm referring to the actual GitLab runner, not the workflow itself. Are you deploying your own runners?

prisma leaf Mar 18, 2024, 5:57 PM

#

Yeah, I am deploying my own runners. So I need to figure out how to modify the Helm chart for the GitLab runners to mount that path.

empty turret Mar 18, 2024, 6:04 PM

#

No problem, if you are using the gitlab provided helm chart, you can configure volume and volumeMounts right here: https://gitlab.com/gitlab-org/charts/gitlab-runner/blob/main/values.yaml#L633-L644

prisma leaf Mar 18, 2024, 6:05 PM

#

Ahh great! I was doing something else, and this looks like the right answer. Once I have this set up, how would I verify that the Dagger CLI is talking to the engine properly?

arctic cave Mar 18, 2024, 6:07 PM

#

The k8s guide has a dagger command you can use.

prisma leaf Mar 18, 2024, 6:10 PM

#

Hmm, the GitLab runner isn't happy with that mount. I have a feeling something isn't set up properly with my persistent volume.

empty turret Mar 18, 2024, 6:11 PM

#

You are missing the volumes section:

volumes:
- name: varrundagger
  hostPath:
    path: /var/run/dagger

empty turret Mar 18, 2024, 6:13 PM

#

arctic cave The k8s guide has a dagger command you can use.

You could use that one, you could also try calling a module. For example:

dagger call -m github.com/shykes/daggerverse/hello hello
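
For reference, a minimal sketch of a GitLab CI job that runs this check. The job name is hypothetical, and the sketch assumes the .dagger template from the job shown earlier makes the Dagger CLI available and that the socket path is already configured for the job.

```yaml
# Hypothetical smoke-test job (the name is a placeholder).
# Assumes the .dagger template from the earlier job provides the Dagger CLI.
dagger-smoke-test:
  extends: [.dagger]
  script:
    # Calling a module only succeeds once the CLI has reached an engine
    - dagger call -m github.com/shykes/daggerverse/hello hello
```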

prisma leaf Mar 18, 2024, 6:16 PM

#

The volume addition seemed to make the pod happy. I missed the command from the guide. My bad. I'll try that now.

#

So close, but so far it seems. Seems like it's hanging on starting the engine.

#

This is how I am running it.

empty turret Mar 18, 2024, 6:23 PM

#

The env variable is incorrect; I did not catch it before. It's supposed to be:

env:
- name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
  value: unix:///var/run/buildkit/buildkitd.sock

Remember that if you configure on the runner pod it is not necessary to configure it on the job!

#

Wait, I'm messing something up myself

#

Give me a sec haha. Reviewing the message history again, I got a bit lost with another case I was looking into.

#

Just checked. I think we are looking good. Quick review:

Dagger helm chart does a host mount of /var/run/buildkit
Runner pod lists the volume and mounts it on /var/run/buildkit
Runner pod exposes env variable of _EXPERIMENTAL_DAGGER_RUNNER_HOST to point to the socket found on the host at /var/run/buildkit/buildkitd.sock

#

Try fixing the env variable to have the correct socket at /var/run/buildkit. Make sure that the runner pod also mounts that volume at /var/run/buildkit, not /var/run/dagger.

prisma leaf Mar 18, 2024, 6:54 PM

#

I'll try getting that setup going. It's a little confusing given the pod template for the engine. ha

empty turret Mar 18, 2024, 6:57 PM

#

My bad. I confused the host path with the path where we mount it in the engine itself. The engine grabs the host path /var/run/dagger and mounts it into /var/run/buildkit in the engine container. Let's review your runner setup once more: how did you define the volumes and volumeMounts there?

prisma leaf Mar 18, 2024, 7:01 PM

#

This is how I have my mounts set up for my Gitlab Runner.

#

The actual CI job.

#

I am still getting hangs given this setup.

#

I also wonder if I am defining the volumes wrong given this page. https://archive.docs.dagger.io/0.9/488564/openshift-gitlab#step-2-configure-gitlab-runner In here they are defined in a different spot in the GitLab runner setup. But some of these docs seem old and I'm not sure they are still relevant.

Run Dagger on OpenShift with GitLab Runners | Dagger

#

I believe setting up the volume mounts as described in the doc is "correct", but... I'm still having issues getting everything connected.
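
For readers following along, here is a rough sketch of what defining the mount in that "different spot" could look like. It assumes the spot in question is the runner's config.toml (set through runners.config in the GitLab Runner Helm chart values), which is where the Kubernetes executor reads volumes and environment for job pods. The paths follow the ones discussed in this thread; everything else is trimmed, so treat it as an illustration rather than a complete values file.

```yaml
# values.yaml for the GitLab Runner Helm chart (sketch, trimmed to the relevant keys)
runners:
  config: |
    [[runners]]
      # Point the Dagger CLI in each job at the engine socket
      environment = ["_EXPERIMENTAL_DAGGER_RUNNER_HOST=unix:///var/run/buildkit/buildkitd.sock"]
      [runners.kubernetes]
        [[runners.kubernetes.volumes.host_path]]
          name = "varrundagger"
          mount_path = "/var/run/buildkit"   # path inside the job pod
          host_path = "/var/run/dagger"      # path on the node
```

With the variable set at the runner level like this, individual jobs would not need to set it themselves.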

empty turret Mar 18, 2024, 7:10 PM

#

Okay. I made a mistake when I wrote the volumes definition above. In your GitLab runner, the volumeMounts is correct. The volumes is incorrect. We need to mount the host path /var/run/dagger into the container path /var/run/buildkit. We could use any paths or names we want here; they don't have to be those. The important thing is that the components look in the right places.
The volumes should be:

volumes:
- name: varrundagger
  hostPath:
    path: /var/run/dagger

Leave the volume mount as is. The environment variable should point to unix:///var/run/buildkit/buildkitd.sock

#

The connections that happen here are:

The Dagger engine mounts its local /var/run/buildkit to the host's /var/run/dagger
The GitLab runner mounts its local /var/run/buildkit to the host's /var/run/dagger
The GitLab runner exposes _EXPERIMENTAL_DAGGER_RUNNER_HOST so the CLI connects to the unix socket mounted at /var/run/buildkit, named buildkitd.sock, which connects the GitLab runner to the Dagger engine container

#

Basically all we have to do is connect the socket created by the dagger engine to the gitlab runner so that the CLI can talk directly to it

prisma leaf Mar 18, 2024, 7:23 PM

#

So I think I've woven all that in for my runners. Whenever my GitLab runner pods spin up to grab these jobs, they end up exiting with code 255. I'm not quite sure yet, but I suspect it's related to mounting issues.

#

The 255 might be totally unrelated, but I'm still seeing the "starting engine" hang.

empty turret Mar 18, 2024, 7:27 PM

#

Is it possible for you to deploy a gitlab runner pod and exec into it? So that we can debug if the socket is correctly mounted

prisma leaf Mar 18, 2024, 7:30 PM

#

GitLab runners are set up as a "Deployment" and each job gets its own pod for the run. I can exec and get a shell on the container running the deployment, but I can't /bin/bash exec into the actual pods where the jobs run.

empty turret Mar 18, 2024, 7:36 PM

#

Mmm, I see. I was not aware of that architecture; I was assuming it worked like GitHub runners. What if you get the Pod specification of the job that was launched? With that you should be able to spawn a pod and exec into it.
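
A minimal sketch of such a standalone debug pod, assuming the same volume layout discussed in this thread. The pod name, image, and node name are placeholders, not values taken from the actual job pod spec.

```yaml
# Hypothetical debug pod; name, image, and nodeName are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: dagger-socket-debug
spec:
  # Assumption: set this to the node where the stalled job pod was scheduled
  nodeName: replace-with-node-name
  restartPolicy: Never
  containers:
    - name: debug
      image: alpine:3.19
      # Keep the container alive long enough to exec into it
      command: ["sleep", "3600"]
      volumeMounts:
        - name: varrundagger
          mountPath: /var/run/buildkit
  volumes:
    - name: varrundagger
      hostPath:
        path: /var/run/dagger
```

Once it is running, exec into it with kubectl and list /var/run/buildkit. If an engine pod runs on that node, buildkitd.sock should show up there; an empty directory suggests no engine is sharing that node's /var/run/dagger.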

prisma leaf Mar 18, 2024, 7:40 PM

#

Defaulted container "build" out of: build, helper, init-permissions (init)
cache  empty  lib    local  lock   log    mail   opt    run    spool  tmp

I can't get an actual shell for the running pod where the "starting engine" is stalled, but I can run commands on it.

empty turret Mar 18, 2024, 7:41 PM

#

Okay. Let's start by checking the pod specification that was created. Can you confirm that the pod spec has the correct mounts?

prisma leaf Mar 18, 2024, 7:43 PM

#

Seems like it? Based on how I mounted the volumes on the runner

#

Looks like it's mounted properly?

#

Oh wait... this doc has the runner running in the dagger namespace. https://archive.docs.dagger.io/0.9/488564/openshift-gitlab#step-2-configure-gitlab-runner

#

I don't have it running in that namespace. I could try changing that

empty turret Mar 18, 2024, 7:47 PM

#

Namespace shouldn't be a problem. As long as the DaemonSet pod is running on the same node as the runner job pod, it should work.

empty turret Mar 18, 2024, 7:47 PM

#

prisma leaf Looks like it's mounted properly?

Can you show the contents of the buildkit directory?

prisma leaf Mar 18, 2024, 7:51 PM

#

Workin on it

#

Looks like the dir is empty

#

The unix socket is definitely there on the dagger engine

#

Yeah, I don't know why the socket isn't getting mounted there. Everywhere else I try to mount it, it seems to work just fine. But inside the runner it has issues.

#

Ugh... I think I may have figured it out. I'll report back here in a few....

#

Yup.... I had to set up my runners to use affinity node selectors to run on the proper node. I thought I checked this earlier. But this config seems to have done the trick.
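
The exact config isn't shown here, so the following is only a guess at its general shape: pinning job pods to the nodes that run the engine through the Kubernetes executor's node_selector setting. The label key and value are hypothetical and assume the engine nodes carry that label.

```yaml
# values.yaml for the GitLab Runner Helm chart (sketch, trimmed to the relevant keys)
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        # Hypothetical label; assumes nodes running the Dagger engine are labeled this way
        [runners.kubernetes.node_selector]
          "dagger-engine" = "true"
```

The point is the same one made earlier in the thread: the hostPath mount only exposes the socket when the job pod lands on a node where an engine pod is running.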

#

@empty turret Thanks a ton for all the help with this today. I'm sure I derailed your day a bit, but I appreciate the help.
