future Gitlab CI architecture | Dagger

terse coyote Aug 16, 2024, 8:18 PM

#

🧵

terse coyote Aug 18, 2024, 2:45 PM

#

Hi @eager shale , sorry for the delay. I did a little homework too. I have a few questions, for example: how come /dagger is persisted between the before_script and script?

eager shale Aug 18, 2024, 2:48 PM

#

If we have multiple jobs running dagger we can have a default before action. In this case with one job it doesn’t matter, but that is the pattern. Technically this should be part of a default top-level tag.
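
A minimal sketch of the pattern described here, assuming a hypothetical image name, repository URL, and script path (none of these appear in the thread). GitLab concatenates `before_script` with `script` and runs them in one shell in the same job container, so anything cloned into /dagger by the default `before_script` is still there when the job's own `script` runs:

```yaml
# .gitlab-ci.yml (illustrative only)
default:
  image: registry.example.com/dagger-python:latest  # hypothetical image name
  tags:
    - dagger                                        # hypothetical runner tag
  before_script:
    # Shared by every job; runs in the same container as the job's `script`.
    - git clone --depth 1 "https://gitlab.example.com/platform/dagger-apps.git" /dagger  # hypothetical repo

variables:
  # Value quoted later in this thread; points the CLI at an existing engine socket.
  _EXPERIMENTAL_DAGGER_RUNNER_HOST: "unix:///var/run/dagger/buildkitd.sock"

build-openapi:
  script:
    - python /dagger/openapi/main.py                # hypothetical entry point
```

With a single job the `default:` block buys nothing, but each additional job only needs its own `script`.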

#

That repo that is cloned is a collection of dagger “apps” for building various things: OpenAPI, Java libs, services

#

Long term it will probably be modules once we get that working

terse coyote Aug 18, 2024, 2:50 PM

#

I'm just not super familiar with Gitlab CI internals. It looks like there's some magical volume that is mounted in the before_script container, then again in the job's script, and /dagger is transferred to the job pod that way. Is that right?

eager shale Aug 18, 2024, 2:51 PM

#

Hmm. No I don’t think so. I think it just runs those steps in the pod first

terse coyote Aug 18, 2024, 2:51 PM

#

Another question (sorry): in order to get _EXPERIMENTAL_DAGGER_RUNNER_HOST: "unix:///var/run/dagger/buildkitd.sock" to work, did you bake the dagger engine into your dagger-python image?

eager shale Aug 18, 2024, 2:52 PM

#

No. Daemonset

terse coyote Aug 18, 2024, 2:52 PM

#

eager shale Hmm. No I don’t think so. I think it just runs those steps in the pod first

I see, so it's just a configuration DRY thing - to avoid repeating that git clone in each job script

terse coyote Aug 18, 2024, 2:52 PM

#

eager shale No. Daemonset

Ah I see. But how do you get the unix socket mounted into the job pod?

#

In GitHub Actions, for example, it requires deploying a patched version of the official GHA runner helm chart

eager shale Aug 18, 2024, 2:53 PM

#

We use affinity rules and taints to ensure runners run on the same node as the engine

#

https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/

Assigning Pods to Nodes

eager shale Aug 18, 2024, 2:54 PM

#

terse coyote I see, so it's just a configuration DRY thing - to avoid repeating that git clon...

Yes

#

One sec on the socket

#

https://docs.gitlab.com/runner/executors/kubernetes/#configure-volume-types

Kubernetes executor | GitLab

#

We configure the runner to mount the host path socket that the ds exposes

#

Let me double check that

#

So yes host path mounting

#

Basically: get runner on same host as engine. Both mount /var/lib/dagger from the host and share the socket
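
A rough sketch of what that setup could look like, assuming the runner is deployed with the official gitlab-runner Helm chart and that the engine nodes carry a hypothetical `dagger-engine=true` label and taint. The mount path follows the socket URL quoted earlier in the thread (`/var/run/dagger`); the actual paths, labels, and taints in this deployment are not shown in the conversation:

```yaml
# values.yaml for the gitlab-runner Helm chart (illustrative only)
runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        image = "alpine"
        # Schedule job pods only on nodes that run the engine (hypothetical label).
        [runners.kubernetes.node_selector]
          "dagger-engine" = "true"
        # Let job pods tolerate the taint on those nodes (hypothetical taint).
        [runners.kubernetes.node_tolerations]
          "dagger-engine=true" = "NoSchedule"
        # Mount the host directory holding the engine socket into every job pod.
        [[runners.kubernetes.volumes.host_path]]
          name = "dagger-socket"
          mount_path = "/var/run/dagger"
          host_path = "/var/run/dagger"
          read_only = false
```

A node selector is used here as the simplest stand-in for the affinity rules mentioned above; the Kubernetes executor also accepts a full affinity block if more nuanced placement is needed.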

terse coyote Aug 18, 2024, 3:08 PM

#

I see, so that kubernetes executor config is in a separate config file

eager shale Aug 18, 2024, 3:09 PM

#

Yes

terse coyote Aug 18, 2024, 3:10 PM

#

One more question sorry... I promise I'll give answers of my own after 🙂

How do you scale & allocate resources to the runner itself on kubernetes? As opposed to the job pod. Is that something you have to carefully manage via monitoring, autoscaling, etc.?

I ask because there are two layers of kubernetes deployment: first the runners (presumably from a helm chart); then the jobs (from the runner). So I was wondering about the operational complexity of that in practice.

eager shale Aug 18, 2024, 3:10 PM

#

It applies to all gitlab runners we tag for dagger. It is a ConfigMap that is part of the gitlab runner helm deployment

#

Hmm. The runner itself doesn’t need to scale. It is just a router/translator between the job request and the k8s API. The jobs drive the scale

#

We mostly just run a couple replicas

#

Right now we probably only run a hundred jobs max simultaneously, so if we were to grow we might put an HPA on the runner

#

https://gitlab.com/gitlab-org/charts/gitlab-runner/-/blob/main/templates/hpa.yaml?ref_type=heads

GitLab

templates/hpa.yaml · main · GitLab.org / charts / GitLab Runner · G...

Official Helm Chart for the GitLab Runner (https://gitlab.com/gitlab-org/gitlab-runner)

#

Not at my work computer but that might already be enabled
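
As a hedged illustration of the "couple replicas" setup, assuming the same official gitlab-runner Helm chart: the runner manager pods are sized with a fixed replica count, and the chart's optional `hpa` block is left disabled. The HPA bounds below are hypothetical, and the shape of its `metrics` entries depends on the chart version, so they are left out rather than guessed:

```yaml
# values.yaml for the gitlab-runner Helm chart (illustrative only)
# Number of runner manager pods; job pods are created separately, per job.
replicas: 2

# Optional autoscaling of the runner managers, disabled in this sketch.
# hpa:
#   minReplicas: 2    # hypothetical bounds
#   maxReplicas: 5
#   metrics: []       # check the chart's own values.yaml for the expected format
```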

terse coyote Aug 18, 2024, 3:42 PM

#

I see. That makes sense.

#

So... with all this information in mind (thank you). Here is how I think the "shim" pattern might improve your production architecture in the future:

Instead of running your dagger-python image, you simply run the official dagger image.
The script and before_script would be slightly modified to fit the standard parameters accepted by the dagger image. The exact format is TBD, but it would serve the same function as your current script: "please run this pipeline with the dagger CLI"
No DaemonSet - each job gets its own engine. This requires excellent cache persistence, which is something we're working on. Think of the DaemonSet as a crutch until we have a fully stateless engine out of the box.
No custom DAGGER_RUNNER_HOST. The dagger CLI is pre-configured to have access to the engine. Probably because it is run by the engine (the nesting I was talking about in the very beginning of this conversation).

eager shale Aug 18, 2024, 3:53 PM

#

That would be awesome. Would this only work with modules? Right now that image is the dagger cli + python + dagger python sdk

#

We are close to getting modules to work thanks to the excellent efforts around the corporate env stuff but not quite there yet…

#

Getting rid of the ds would be great.

terse coyote Aug 18, 2024, 4:03 PM

#

eager shale That would be awesome. Would this only work with modules? Right now that image i...

I certainly have only thought about it in the context of modules... But perhaps there is a possible compat bridge; it would require some more thinking. For example, perhaps the Python SDK runtime could also be made usable as a sandbox for a dagger run of your tool. That way you get the right python version etc. Fair warning: that is not a high priority at the moment, but we do want to make sure external clients are properly supported in general.

eager shale Aug 18, 2024, 4:04 PM

#

Great. Thanks for the rundown. We will make it work…

terse coyote Aug 18, 2024, 4:04 PM

#

ACTUALLY even simpler, your existing dagger-python image could be run as-is by dagger - so literally the dagger image would be a wrapper around your existing image. I think that would be the bridge.
