#future Gitlab CI architecture
Hi @eager shale , sorry for the delay. I did a little homework too. I have a few questions, for example: how come /dagger is persisted between the before_script and script?
If we have multiple jobs running dagger we can have a default before action. In this case with one job it doesn’t matter but that is the pattern. Technically this should be part of a default top-level tag.
That repo that is cloned is a collection of dagger “apps” for building various things: OpenAPI, Java libs, services
Long term it will probably be modules once we get that working
I'm just not super familiar with GitLab CI internals. It looks like there's some magical volume that is mounted in the before_script container, then again in the job's script, and /dagger is transferred to the job pod that way. Is that right?
Hmm. No I don’t think so. I think it just runs those steps in the pod first
Another question (sorry): in order to get _EXPERIMENTAL_DAGGER_RUNNER_HOST: "unix:///var/run/dagger/buildkitd.sock" to work, did you bake the dagger engine into your dagger-python image?
No. Daemonset
I see, so it's just a configuration DRY thing - to avoid repeating that git clone in each job script
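For reference on the before_script question: GitLab's CI documentation describes before_script commands as being concatenated with script and executed together in a single shell, which is why a directory cloned in before_script is still there for script. A minimal local simulation of that behaviour, assuming a POSIX `sh` is available; the `run_job` helper and the commands are illustrative stand-ins, not GitLab internals:

```python
import subprocess
import tempfile
from pathlib import Path


def run_job(before_script, script, workdir):
    """Simulate one job: both command lists are joined and run in a single shell."""
    combined = "\n".join(["set -e", *before_script, *script])
    result = subprocess.run(
        ["sh", "-c", combined],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the "git clone ... /dagger" step in before_script.
    before = ["mkdir -p dagger", "echo 'pipeline code' > dagger/main.py"]
    # The script step sees the directory: same shell, same filesystem.
    main = ["cat dagger/main.py"]
    output = run_job(before, main, Path(tmp))
    print(output.strip())
```

No extra volume is involved in this model: the two sections share state simply because they are one script.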
Ah I see. But how do you get the unix socket mounted into the job pod?
In GitHub Actions for example, it requires deploying a patched version of the official GHA runner helm chart
We use affinity rules to ensure runners run on the same node as the engine with taints
You can constrain a Pod so that it is restricted to run on particular node(s), or to prefer to run on particular nodes. There are several ways to do this and the recommended approaches all use label selectors to facilitate the selection. Often, you do not need to set any such constraints; the scheduler will automatically do a reasonable placemen...
Yes
One sec on the socket
We configure the runner to mount the host path socket that the ds exposes
Let me double check that
So yes host path mounting
Basically: get runner on same host as engine. Both mount /var/lib/dagger from the host and share the socket
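Since the whole setup depends on the host path actually being mounted into the job pod, a job could check for the socket before invoking dagger. A small sketch of such a preflight check, assuming the variable holds a `unix://` URL like the one quoted earlier; the helper names are hypothetical and this is not part of the dagger CLI:

```python
import os
import stat
from urllib.parse import urlparse

RUNNER_HOST_VAR = "_EXPERIMENTAL_DAGGER_RUNNER_HOST"


def socket_path_from_env(environ=os.environ):
    """Return the filesystem path from a unix:// runner host value, or None."""
    parsed = urlparse(environ.get(RUNNER_HOST_VAR, ""))
    if parsed.scheme != "unix" or not parsed.path:
        return None
    return parsed.path


def engine_socket_available(environ=os.environ):
    """True if the variable points at an existing unix socket visible in this pod."""
    path = socket_path_from_env(environ)
    if path is None:
        return False
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file or unreadable directory: treat as "not mounted".
        return False
```

Calling `engine_socket_available()` at the start of a job and failing fast when it returns False would turn a missing mount or a pod scheduled on the wrong node into an early, explicit error.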
I see, so that kubernetes executor config is in a separate config file
Yes
One more question sorry... I promise I'll give answers of my own after 🙂
How do you scale & allocate resources to the runner itself on kubernetes? As opposed to the job pod. Is that something you have to carefully manage via monitoring, autoscaling etc.?
I ask because there's 2 layers of kubernetes deployment: first the runners (presumably from a helm chart); then the jobs (from the runner). So I was wondering about the operational complexity of that in practice.
It applies to all gitlab runners we tag for dagger. It is a configmap as part of the gitlab runner helm deployment
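To make the runner-side setup concrete, here is a rough sketch that renders the kind of config.toml fragment the GitLab Runner Kubernetes executor accepts for this pattern. The section names (`node_selector`, `node_tolerations`, `volumes.host_path`) come from the executor's documented configuration; every value below (label, taint, volume name, directory) is an assumption for illustration, and `node_selector` is used as a simpler stand-in for the affinity rules mentioned above:

```python
def render_runner_config(socket_dir, node_label, taint_key, taint_effect="NoSchedule"):
    """Render a config.toml fragment; all argument values are hypothetical."""
    label_key, label_value = node_label
    lines = [
        "[[runners]]",
        '  executor = "kubernetes"',
        "  [runners.kubernetes]",
        "    # Schedule job pods only on nodes that run the engine DaemonSet.",
        "    [runners.kubernetes.node_selector]",
        f'      "{label_key}" = "{label_value}"',
        "    # Tolerate the taint that keeps other workloads off those nodes.",
        "    [runners.kubernetes.node_tolerations]",
        f'      "{taint_key}" = "{taint_effect}"',
        "    # Mount the host directory holding the engine socket into the job pod.",
        "    [[runners.kubernetes.volumes.host_path]]",
        '      name = "dagger-socket"',
        f'      mount_path = "{socket_dir}"',
        f'      host_path = "{socket_dir}"',
    ]
    return "\n".join(lines) + "\n"


# Example values only: the directory is inferred from the socket URL quoted
# earlier in the thread, and the label and taint names are made up.
config = render_runner_config(
    socket_dir="/var/run/dagger",
    node_label=("dagger-engine", "true"),
    taint_key="dagger-engine=true",
)
print(config)
```

With the Helm chart, a fragment like this would typically live in the chart's runner configuration value rather than being generated by a script; the Python wrapper here only keeps the example self-contained.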
Hmm. The runner itself doesn’t need to scale. It is just a router/translator between the job request and the k8s API. The jobs drive the scale
We mostly just run a couple replicas
Right now we probably only run a hundred jobs max simultaneously, so if we were to grow we might put an HPA on the runner
Official Helm Chart for the GitLab Runner (https://gitlab.com/gitlab-org/gitlab-runner)
Not at my work computer but that might already be enabled
I see. That makes sense.
So... with all this information in mind (thank you). Here is how I think the "shim" pattern might improve your production architecture in the future:
- Instead of running your dagger-python image, you simply run the official dagger image.
- The script and before_script would be slightly modified to fit the standard parameters accepted by the dagger image. The exact format is TBD, but it would serve the same function as your current script: "please run this pipeline with the dagger CLI"
- No daemon set - each job gets its own engine. This requires excellent cache persistence, which is something we're working on. Think of the DaemonSet as a crutch until we have fully stateless engine out of the box.
- No custom DAGGER_RUNNER_HOST. The dagger CLI is pre-configured to have access to the engine. Probably because it is run by the engine (the nesting I was talking about in the very beginning of this conversation).
That would be awesome. Would this only work with modules? Right now that image is the dagger cli + python + dagger python sdk
We are close to getting modules to work thanks to the excellent efforts around the corporate env stuff but not quite there yet…
Getting rid of the ds would be great.
I certainly have only thought about it in the context of modules... But perhaps there is a possible compat bridge, it would require some more thinking. For example perhaps the Python SDK runtime could also be made usable as a sandbox for dagger run of your tool. That way you get the right python version etc. Fair warning that is not a high priority at the moment, but we do want to make sure external clients are properly supported in general
Great. Thanks for the run down. We will make it work…
ACTUALLY even simpler, your existing dagger-python image could be run as-is by dagger - so literally the dagger image would be a wrapper around your existing image. I think that would be the bridge.