Containerd backend for dummies | Dagger | Page 1

gleaming grove Aug 25, 2023, 10:50 PM

#

What does that mean for the default docker driver for example?

#

Would it run a privileged container that can mount the underlying containerd socket from the docker engine, and pass that through?

#

Or would it actually run a privileged containerd?

#

Also: my understanding is that buildkit (whether embedded by our engine or not) needs to run in the same mount namespace as containerd, so that it can mess with the containerd state, push and pull layers etc. How does that work with an external containerd?

#

cc @safe crane whom I was bothering yesterday about this 🙂

topaz owl Aug 26, 2023, 12:12 AM

#

gleaming grove What does that mean for the default docker driver for example?

Yeah so for our default docker driver, we'll still just start the runner image in docker the same way we start the engine today. So "actually run a privileged containerd". Once we've started the runner, we'll have the containerd sock we can use. In general, it will be the exact same idea wherever you are running (i.e. a k8s daemonset pod would start the runner, which is how we'll get the containerd sock, etc.)

That's just a starting place though; it'll work fine because we can avoid another layer of container nesting and retain the current "container-in-container". But we absolutely can also support directly talking to a pre-existing containerd sock too. That's what I was trying to get at in point 2. here: https://github.com/dagger/dagger/issues/5484#issuecomment-1692294123

> Also: my understanding is that buildkit (whether embedded by our engine or not) needs to run in the same mount namespace as containerd, so that it can mess with the containerd state, push and pull layers etc. How does that work with an external containerd?

Yes, this is where a bit of gnarliness will come in :-). With containerd, you can specify the entire OCI runtime spec when you start a container. The configs we'll be interested in are the mount propagation flags, which make mounts from the host mount namespace also show up inside the container.

With that, it's basically a matter of poking the right holes in the right places in the engine container, and it should result in it having everything it needs to use containerd even though it's technically inside a container.
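As a rough illustration of the mount propagation idea above: an OCI runtime spec lists bind mounts together with their options, and propagation flags such as `rshared` or `rslave` go in that options list. The sketch below builds such a fragment as plain data. The host paths and the helper name `propagated_bind_mount` are assumptions for illustration, not what the engine actually mounts.

```python
import json

PROPAGATION_FLAGS = {"shared", "rshared", "slave", "rslave", "private", "rprivate"}


def propagated_bind_mount(host_path, container_path=None, propagation="rshared"):
    """Build one OCI runtime-spec mount entry that bind-mounts a host path
    into the container with the given propagation flag."""
    if propagation not in PROPAGATION_FLAGS:
        raise ValueError(f"unknown propagation flag: {propagation}")
    return {
        "destination": container_path or host_path,
        "type": "bind",
        "source": host_path,
        "options": ["rbind", propagation],
    }


# Hypothetical paths: where a containerd instance might keep its state and socket.
spec_fragment = {
    "mounts": [
        propagated_bind_mount("/var/lib/containerd"),
        propagated_bind_mount("/run/containerd"),
    ]
}

print(json.dumps(spec_fragment, indent=2))
```

Which paths need "holes poked" and which propagation mode is right for each is the gnarly part; the sketch only shows where those flags live in the spec.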

gleaming grove Aug 26, 2023, 12:14 AM

#

ok I think I follow thanks

topaz owl Aug 26, 2023, 12:14 AM

#

Honestly, all we're really using containerd for here is an API for running the engine as a subprocess. The fact that it's containerized isn't really essential, just convenient since it lets us still package the engine up as an image. I was wondering on the side what happens if you run a container where the oci runtime spec specifies to use the *host mount namespace*, which is technically possible afaict. If that actually works and isn't just an unhandled corner case, it might be even more convenient
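On the host-mount-namespace idea: in the OCI runtime spec, a container inherits any namespace type that is not listed under `linux.namespaces`, so leaving out the `mount` entry is how a spec would ask to share the mount namespace of the process that starts it. A minimal sketch on a plain dict follows. The function name is made up for illustration, and whether containerd and its shims handle this cleanly is exactly the open question above.

```python
import copy


def use_host_mount_namespace(spec):
    """Return a copy of an OCI runtime spec with the 'mount' namespace entry
    removed, so the container inherits the runtime's mount namespace."""
    new_spec = copy.deepcopy(spec)
    linux = new_spec.setdefault("linux", {})
    linux["namespaces"] = [
        ns for ns in linux.get("namespaces", []) if ns.get("type") != "mount"
    ]
    return new_spec


# Illustrative spec fragment with three namespaces requested.
example_spec = {
    "linux": {
        "namespaces": [
            {"type": "pid"},
            {"type": "network"},
            {"type": "mount"},
        ]
    }
}

host_mnt_spec = use_host_mount_namespace(example_spec)
print([ns["type"] for ns in host_mnt_spec["linux"]["namespaces"]])
```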

gleaming grove Aug 26, 2023, 12:16 AM

#

then if I understand correctly, this is where the longer-term refactoring comes in, to gradually spin out the lower-level parts of buildkit behind a clean API, until eventually the poking holes is no longer necessary

#

the result might look like “containerd with a plugin to remotely instrument the state”

#

is that roughly correct?

topaz owl Aug 26, 2023, 12:24 AM

#

gleaming grove then if I understand correctly, this is where the longer-term refactoring comes ...

Yes, containerd happens to simultaneously 1) give us a robust remote api for starting processes from a container image and 2) give us a backend that is both more flexible than the oci worker and also better for performance since it enables sharing the content store between different engines

So it just makes perfect sense for our initial steps. But in the longer term you're exactly right, I think we'll end up going down the route of customizing the runner more and more towards our use case so that gnarliness can be trimmed away. Honestly the exact shape of that long-term goal is still fuzzy, we could go many different routes, but in full agreement that it's the general right direction.

#

Tangential, but one nice side-effect we pick up by switching to the containerd worker as a backend is that there's a small but growing ecosystem of containerd shims + other plugins that enable e.g. running in firecracker, running wasm, etc.

#

That's why I like it more than the oci worker in general, but wasn't ever sufficient motivation to actually make the move until now

gleaming grove Aug 27, 2023, 1:25 AM

#

FYI @safe crane @bitter adder 👆

safe crane Aug 28, 2023, 6:40 PM

#

So if I understand correctly, the idea is to run an extra privileged containerd as a k8s daemonset, and when dagger points to it, it can run a dagger container on that extra containerd instance, poking holes so buildkit can access that extra containerd's state.

For Docker Desktop, do we also want to launch an extra containerd, or just reuse the underlying one and rely on containerd's namespace mechanism to keep the states separate?

#

> Yeah so for our default docker driver, we'll still just start the runner image in docker the same way we start the engine today. So "actually run a privileged containerd"

seems to lean towards running an extra containerd

gleaming grove Aug 28, 2023, 6:44 PM

#

extra containerd seems safer, also I think we need to bundle it with special OCI hooks for networking, stdout capture and soon additional tools for gpu support, no? Don’t all these things need to be bundled in the containerd image now?

safe crane Aug 28, 2023, 6:46 PM

#

yes it's possible that stuff like gpu access will need to be in both images, or at least in the extra containerd one, where things are being actually run

topaz owl Aug 28, 2023, 6:53 PM

#

gleaming grove extra containerd seems safer, also I think we need to bundle it with special OCI...

I think we'll actually be able to have the current shim logic in the engine image still thanks to the fact that buildkit will support specifying which containerd runtime to use in the very near future (this PR is already approved: https://github.com/moby/buildkit/pull/4141).

But yes as a worst case fallback that would need to be in the containerd image. Would like to avoid that if possible, but not a huge deal if we don't start there.

GPU access is still a big TBD overall. I outlined the ideal path here: https://github.com/dagger/dagger/issues/5484#issuecomment-1692494794

topaz owl Aug 28, 2023, 6:55 PM

#

safe crane > Yeah so for our default docker driver, we'll still just start the runner image...

Yep that's the starting place; in practice not really any downsides besides the slight overhead of an extra containerd, but like I said we can layer on extra support for directly using containerd when possible. Can save that for the future though, not needed immediately

gleaming grove Aug 28, 2023, 6:55 PM

#

If there’s any possible headache from supporting arbitrary external containerd, better to assume we don’t support it, and we can always change our minds later

topaz owl Aug 28, 2023, 6:56 PM

#

gleaming grove If there’s any possible headache from supporting arbitrary external containerd, ...

Agreed, worth a shot but not a blocker if it doesn't work out immediately

gleaming grove Aug 28, 2023, 7:01 PM

#

While we're discussing this. I'm not 100% sure "runner" is the right word to designate this in the docs & diagrams

#

It's accurate enough (it runs stuff for us) but perhaps too close to "CI runner" and "runner machine"?

#

I was going to suggest just calling it "containerd" but I think the "containerd on top of podman" situation, although it's the correct solution, is best not emphasized too much. "Why do I need to run a container runtime on top of my existing container tool?" etc etc

#

It's a 2-part question:

Question 1: what do we call the freshly spun out containerd part
Question 2: is it part of the engine, or not?

topaz owl Aug 28, 2023, 7:07 PM

#

The way we're using it is sort of like an init process but also like an execution backend, so yeah there's not an incredibly obvious pre-existing term I can think of. It's sort of a novel use case (to my awareness, which is limited)

#

> Question 2: is it part of the engine, or not?

I can say yes to this, with reasonable confidence. It's needed for the engine to do engine things, so it's part of the engine architecture.

Question 1 is harder, need to think some more 🙂

gleaming grove Aug 28, 2023, 7:09 PM

#

OK 2-part engine it is

#

Then we need a name for the other part also 🙂

#

Listing a few options:

1. Most recently we have used Runner / Router. It has not caught on much (i.e. we haven't documented it etc). So we can keep it or change it.
2. We could also do Backend / Frontend. I have used that a few times informally to explain the architecture.
3. Runtime / the rest of the Engine. Breaking from my own suggestion of naming both parts. In this option we would say "The runtime is the most low-level component of the engine, it is used to bootstrap the rest, and provide an interface with the underlying system"
4. Containerd / the rest of the Engine. Same idea as "runtime", but with the word "containerd". "The Dagger Engine is built on containerd, the de facto standard container runtime. Containerd is used to bootstrap the rest of the engine, and provide a standardized interface with the underlying system."

#

FYI @craggy quest @lean dove we fell into a terminology trap 🙂

#

This is related to the architecture change that will indirectly affect the kubernetes & Argo guides

lean dove Aug 28, 2023, 7:21 PM

#

Noted, thanks. Following discussion.

safe crane Aug 28, 2023, 9:34 PM

#

To run the extra containerd image, you need some container API in the first place. For Docker Desktop and anywhere docker is installed, that's usually just docker, and wherever it's k8s, it's the k8s api. I feel like that covers 99% of the use cases, and that requiring access to a containerd API might make things less simple for the user. With that approach we'd still be spawning an extra containerd, just not via the existing containerd but via either docker (as is currently the case in dagger) or the k8s api.

topaz owl Aug 28, 2023, 10:32 PM

#

safe crane To run the extra containerd image, you need some container API in the first plac...

Just to clarify, the new architecture won't change anything for end users relative to the way it works today, which is:

By default, we will just find any available docker and run the engine image in that (via docker run)
But, if the _EXPERIMENTAL_DAGGER_RUNNER_HOST env var is set, that will point to a pre-existing engine to connect to. E.g. values could be kube-pod://<some pod already running the engine>, tcp://<some ip the engine is serving on>, etc.

With the new approach, everything will be the exact same, the only difference is that the "engine image" will be starting containerd rather than today where it starts the engine itself.

End users won't have any awareness of this though. If they are going through the default path of using docker, it's all behind the scenes anyways. And even if they are doing a custom deployment of the engine, they still just need to do a custom deployment of our image; all the modifications are internal to that image.

#

I can try to put together a diagram quick too if helpful, it's a bit hard to explain in words but makes sense once you have the boxes and arrows in your head 🙂

gleaming grove Aug 28, 2023, 10:36 PM

#

in my mind the big difference is in the interface for creating an engine driver

#

also having everything in one “engine image” confuses me. Shouldn’t it be 2 different images for 2 different components?

#

especially given that decoupling versioning between them is a big reason for the split

topaz owl Aug 28, 2023, 10:39 PM

#

gleaming grove also having everything in one “engine image” confuses me. Shouldn’t it be 2 diff...

Yes there is, I was explaining from the perspective of an end user though. They don't need to know about the other image embedded in the CLI. If they are doing a custom deployment (i.e. in k8s or whatever), then they are aware of that image, but that's no different than today where they are aware of an image (registry.dagger.io/engine) that they need to run.

#

Underneath the hood yes there will be two images, not changing anything there

topaz owl Aug 28, 2023, 10:41 PM

#

gleaming grove in my mind the big difference is in the interface for creating an engine driver

It will most likely be almost entirely the same. The way the "drivers" (aka buildkit connhelpers) work today is to create a way of proxying a grpc connection from the client to a unix socket inside the engine image where the buildkit API is being served. In the new architecture, it will be the exact same except it will point to the containerd unix socket instead.

I actually think the existing connhelpers could all be used off the shelf without modification, just need to point to a unix socket at a different path.

#

Once the driver provides that connection, it can just hand off to common code though, no need for them all to repeat the implementation that starts the engine in containerd and such.

gleaming grove Aug 28, 2023, 10:43 PM

#

but for example, does the CLI provide the containerd image to run? I’m guessing yes given earlier discussion about not supporting arbitrary containerd

#

assuming yes, does the CLI pass a ref, or an inline tarball?

topaz owl Aug 28, 2023, 10:47 PM

#

gleaming grove but for example, does the CLI provide the containerd image to run? I’m guessing ...

That part will work the exact same way it does today. The logic today is:

If _EXPERIMENTAL_DAGGER_RUNNER_HOST is set, prefer that
Otherwise, if the CLI was built from a tagged version (i.e. v0.8.4), run the registry.dagger.io/engine:v0.8.4 image in docker
As a fallback (i.e. dev build) it uses registry.dagger.io/engine:main (which is just the most recent build off main)

The only difference is that registry.dagger.io/engine will be the image w/ containerd in it. And if we decide to rename the image, it'll be a different name. But the logic will be the same.
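The selection logic described above can be sketched roughly as follows. The function name `resolve_engine` and the regular expression used to recognize a "tagged version" are assumptions for illustration; only the env var name, the image name and the `main` fallback come from the description.

```python
import re

ENGINE_IMAGE = "registry.dagger.io/engine"
RUNNER_HOST_ENV = "_EXPERIMENTAL_DAGGER_RUNNER_HOST"


def resolve_engine(env, cli_version):
    """Decide how the CLI reaches an engine.

    Returns ("connect", address) when the env var points at a pre-existing
    engine, otherwise ("run", image_ref) for the image to start in docker.
    """
    runner_host = env.get(RUNNER_HOST_ENV)
    if runner_host:
        return ("connect", runner_host)
    # Assumption: a tagged CLI build reports a version like vMAJOR.MINOR.PATCH.
    if cli_version and re.fullmatch(r"v\d+\.\d+\.\d+", cli_version):
        return ("run", f"{ENGINE_IMAGE}:{cli_version}")
    # Dev builds fall back to the most recent build off main.
    return ("run", f"{ENGINE_IMAGE}:main")


print(resolve_engine({}, "v0.8.4"))
print(resolve_engine({}, "devel"))
print(resolve_engine({RUNNER_HOST_ENV: "tcp://1.2.3.4:8080"}, "v0.8.4"))
```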

#

Obviously we're planning on improving that whole interface, not using the _EXPERIMENTAL env var, etc. But that's all orthogonal

topaz owl Aug 28, 2023, 11:01 PM

#

gleaming grove Listing a few options: 1. Most recently we have used **Runner** / **Router**. I...

> Most recently we have used Runner / Router. It has not caught on much (ie. we haven't documented it etc). So we can keep it or change it.

The thing previously internally called "Router" actually no longer exists after the other recent re-arch to move the gql server (got split up into parts and incorporated into other components), so that term is available for use once again 🙂

> We could also do Backend / Frontend. I have used that a few times informally to explain the architecture.

Concern here is that it would be most natural to refer to the CLI as the frontend (I personally call it the engine client)

gleaming grove Aug 28, 2023, 11:12 PM

#

Let me try to repackage that in a hypothetical UX built on engine drivers, and see if it makes sense to you:

The Dagger CLI needs a way to run the engine
To do so, it relies on an "engine driver"
Here is how it works:
1. The user selects an engine driver (by config file, CLI flag, env variable or simply by default)
2. The CLI executes the engine driver
3. The engine driver arranges for a containerd session to be open, and passes it to the CLI
4. The CLI uses the containerd session to load the rest of the engine
5. The engine will use the same containerd session to interface with the system (to run containers etc)
The CLI ships with the following builtin engine drivers:
- Docker (uses the docker CLI)
- Podman (uses the podman CLI)
- Nerdctl (uses the nerdctl CLI)
- Containerd-proxy (connects to an existing containerd)* --> see question below

Is that right?
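To make the flow above a bit more concrete, here is a toy sketch of driver selection and session hand-off. Everything in it is hypothetical: the `ContainerdSession` type, the placeholder addresses, and in particular the precedence order (flag, then env variable, then config file, then default), which the list above does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ContainerdSession:
    """Stand-in for 'an open containerd session': here just an address string."""
    address: str


def _cli_driver(tool: str) -> Callable[[], ContainerdSession]:
    def start() -> ContainerdSession:
        # A real driver would shell out to `tool` to start or find the engine.
        # The address below is a placeholder, not a real connection.
        return ContainerdSession(address=f"{tool}-container://dagger-engine")
    return start


BUILTIN_DRIVERS: Dict[str, Callable[[], ContainerdSession]] = {
    "docker": _cli_driver("docker"),
    "podman": _cli_driver("podman"),
    "nerdctl": _cli_driver("nerdctl"),
}


def select_driver(flag: Optional[str] = None, env: Optional[str] = None,
                  config: Optional[str] = None, default: str = "docker") -> str:
    """Step 1: pick a driver name. Assumed precedence: flag > env > config > default."""
    for candidate in (flag, env, config, default):
        if candidate:
            if candidate not in BUILTIN_DRIVERS:
                raise KeyError(f"unknown engine driver: {candidate}")
            return candidate
    raise KeyError("no engine driver selected")


def open_session(flag=None, env=None, config=None) -> ContainerdSession:
    """Steps 2-3: execute the selected driver and get a containerd session back."""
    return BUILTIN_DRIVERS[select_driver(flag, env, config)]()


print(open_session().address)
print(open_session(env="podman").address)
```

Steps 4 and 5 (loading the rest of the engine over that session) are left out, since they depend on the containerd client API rather than on the driver.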

#

Follow-up question: would "containerd-proxy" be guaranteed to work? Or would it require containerd to be a very specific version? And assuming the latter - maybe we don't support it at all?

topaz owl Aug 28, 2023, 11:19 PM

#

gleaming grove Let me try to repackage that in a hypothetical UX built on engine drivers, and s...

Yep that sounds exactly correct.

> Follow-up question: would "containerd-proxy" be guaranteed to work? Or would it require containerd to be a very specific version? And assuming the latter - maybe we don't support it at all?

Our starting point should be to assume it won't work. Like I said, that feature wouldn't be something we start out supporting.

If, after we get this working, we have managed to do so with a 100% vanilla containerd such that all the "side components" (shim, config files, gpu-specific stuff, etc.) are in the engine image or otherwise not reliant on containerd being customized, then we can consider saying that we support arbitrary containerd installations (probably just a min version or something).

I think that could be nice, but it's very far from a requirement and is not worth making sacrifices for. It's more if the cards play out that way, we can consider it as an option.

gleaming grove Aug 28, 2023, 11:21 PM

#

topaz owl Yep that sounds exactly correct. > Follow-up question: would "containerd-proxy...

OK, that's what I initially understood but then your mention of _EXPERIMENTAL_DAGGER_RUNNER_HOST threw me off. Wouldn't that be the equivalent of pointing to an external containerd?

topaz owl Aug 28, 2023, 11:30 PM

#

gleaming grove OK, that's what I initially understood but then your mention of `_EXPERIMENTAL_D...

No, but I 100% understand the confusion. I think it's worth being a little pedantic here to explain how this actually works today, so bear with me a sec, I think it'll help clarify ultimately.

Today, here's what happens when you set various values of _EXPERIMENTAL_DAGGER_RUNNER_HOST:

_EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://dagger-engine
- This means "I have pre-run a container called dagger-engine in docker, there is a unix socket in there at /var/run/buildkit/buildkit.sock, connect to that using the docker cli"
- There's then various magic that happens to connect to that unix socket via docker exec, I can explain if helpful, but it's a bunch of magic we pick up from buildkit's connhelper, nothing custom.
_EXPERIMENTAL_DAGGER_RUNNER_HOST=podman-container://dagger-engine
- This means "I have pre-run a container called dagger-engine in podman, there is a unix socket in there at /var/run/buildkit/buildkit.sock, connect to that using the podman cli"
_EXPERIMENTAL_DAGGER_RUNNER_HOST=kube-pod://dagger-engine
- This means "I have pre-run a pod in k8s called dagger-engine, there is a unix socket in there at /var/run/buildkit/buildkit.sock, connect to that using the kubectl cli"
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://1.2.3.4:8080
- This means "I have pre-run the dagger engine and have it serving the buildkit grpc api at 1.2.3.4:8080, connect to that api directly over tcp"
- This one is special in that it's not even a buildkit connhelper, it's just passed directly to the grpc client as an address to connect to

#

After the re-architecture to use containerd, the behavior would be fundamentally the same, it's just that rather than connecting to /var/run/buildkit/buildkit.sock or to a tcp address at which the buildkit api is being served, we now would be connecting to /var/run/containerd/containerd.sock or a tcp address at which the containerd api is being served.
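The mapping above, and the one thing that changes in the new architecture (the socket path), can be sketched as a small parser. The function name and the shape of the returned dict are assumptions for illustration; the schemes, tools and socket paths are the ones listed above.

```python
from urllib.parse import urlsplit

BUILDKIT_SOCK = "/var/run/buildkit/buildkit.sock"        # today
CONTAINERD_SOCK = "/var/run/containerd/containerd.sock"  # after the re-architecture

# CLI tool used to reach into each kind of pre-run engine.
TOOL_BY_SCHEME = {
    "docker-container": "docker",
    "podman-container": "podman",
    "kube-pod": "kubectl",
}


def describe_runner_host(value, socket_path=BUILDKIT_SOCK):
    """Explain how a runner host value would be connected to."""
    parts = urlsplit(value)
    if parts.scheme in TOOL_BY_SCHEME:
        # Connhelper-style: exec into the named container/pod and reach the unix socket.
        return {"via": TOOL_BY_SCHEME[parts.scheme],
                "target": parts.netloc,
                "socket": socket_path}
    if parts.scheme == "tcp":
        # No connhelper: the address goes straight to the grpc client.
        return {"via": "grpc",
                "target": f"{parts.hostname}:{parts.port}",
                "socket": None}
    raise ValueError(f"unsupported runner host: {value}")


print(describe_runner_host("docker-container://dagger-engine"))
print(describe_runner_host("kube-pod://dagger-engine", CONTAINERD_SOCK))
print(describe_runner_host("tcp://1.2.3.4:8080"))
```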

#

Then, once we move off _EXPERIMENTAL_DAGGER_RUNNER_HOST, all those different connection types would just be swapped out for the engine drivers concept. But underneath the hood, those drivers will be implemented pretty much exactly the same way. The main difference is likely that we would actually support starting new containers (similar to today's default behavior of starting the engine image in docker), whereas all the existing values of _EXPERIMENTAL_DAGGER_RUNNER_HOST presume the engine has been "pre-run"

gleaming grove Aug 28, 2023, 11:40 PM

#

Clarifying question here - in the future you also want to move the graphql server into the engine container, so the buildkit socket would become hidden right?

topaz owl Aug 28, 2023, 11:45 PM

#

gleaming grove Clarifying question here - in the future you also want to move the graphql serve...

That's already been done. The graphql server runs inside the engine container and is connected to via the buildkit API (specifically, via the session API, which underneath the hood is actually just a way of opening a net.Conn between server+client that's tunneled through grpc).

I 100% want to hide the buildkit API entirely and considered going all the way as part of moving graphql to the engine container, but it was going to be a bunch of extra work on top of the already huge PR, so I had to cut that out. But that's exactly the route we want to go, just serve the graphql API only.

gleaming grove Aug 28, 2023, 11:46 PM

#

OK awesome

#

so today the graphql server running locally is just a thin proxy to the actual graphql server running in the container engine, using the buildkit grpc API as a transport

#

right?

topaz owl Aug 28, 2023, 11:49 PM

#

gleaming grove so today the graphql server running locally is just a thin proxy to the actual g...

Yes exactly, the part that runs locally is literally just this code: https://github.com/sipsma/dagger/blob/de9f0e991fd9c29c874871e3fcf3e7119565d811/engine/client/client.go#L493-L535

Just proxies local requests to the server and back

gleaming grove Aug 28, 2023, 11:50 PM

#

OK thanks for walking me through it all

topaz owl Aug 28, 2023, 11:50 PM

#

I attempted throwing together a few ugly before+after diagrams of that earlier re-architecture to move the gql server to the engine container, if you scroll down in the description of this PR you'll see them, just in case that's helpful too: https://github.com/dagger/dagger/pull/5415

gleaming grove Aug 28, 2023, 11:50 PM

#

https://tenor.com/view/nosebleed-thumbsup-gif-8882037

topaz owl Aug 28, 2023, 11:51 PM

#

gleaming grove OK thanks for walking me through it all

No problem! Part of our effort to standardize terminology should probably include diagramming this all out, so probably a good opportunity to clean up those ugly diagrams into something more consumable 🙂

#

That don't result in nosebleeds

gleaming grove Aug 28, 2023, 11:52 PM

#

One more question to try and tie it all together...

If the ~~engine~~ CLI always has a containerd socket (via drivers), then maybe it never needs anything else? i.e. maybe then we can deprecate _EXPERIMENTAL_DAGGER_RUNNER_HOST to remove a dimension of complexity?

#

Assuming containerd API has a way to name / label containers, so the CLI could lookup containerd to see if there's already an engine running at the same version etc
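containerd does let clients attach string labels to containers, so the lookup could be as simple as the sketch below. The label key and the shape of the container listing are hypothetical; a real implementation would get the listing from the containerd client API instead of a list of dicts.

```python
VERSION_LABEL = "io.dagger.engine.version"  # hypothetical label key


def find_running_engine(containers, cli_version):
    """Return the ID of the first container labelled with the CLI's engine
    version, or None if no such engine exists yet."""
    for container in containers:
        if container.get("labels", {}).get(VERSION_LABEL) == cli_version:
            return container["id"]
    return None


# Illustrative listing, standing in for what a containerd client would return.
listing = [
    {"id": "engine-a", "labels": {VERSION_LABEL: "v0.8.3"}},
    {"id": "engine-b", "labels": {VERSION_LABEL: "v0.8.4"}},
    {"id": "unrelated", "labels": {}},
]

print(find_running_engine(listing, "v0.8.4"))
print(find_running_engine(listing, "v0.9.0"))
```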

#

That env var doesn't really work well with the engine driver concept anyway, since it kind of lets you bypass them at any time, making everything more complicated to understand

topaz owl Aug 28, 2023, 11:59 PM

#

Oh yeah, so I 100% do not want us to have separate env vars to specify how to connect to the "dagger engine" and how to connect to "containerd".

To be clear, my thinking is that just to start, we'd still have one env var but it will now be pointing to how to connect to the containerd api. I think we need to retain that for now just so power users who are already doing custom deployments can continue to do so. It can be named something besides _EXPERIMENTAL_DAGGER_RUNNER_HOST, e.g. _EXPERIMENTAL_DAGGER_CONTAINERD_HOST or whatever. But I think we'll need it to start.

But yeah, once we have the custom engine drivers, then we no longer need to support that at all and the env var can be gone entirely. And if we do the engine driver stuff at the same time as this whole re-architecture, then we can just remove any _EXPERIMENTAL_* env vars right away.

gleaming grove Aug 29, 2023, 12:02 AM

#

A good test case would be how we want kubernetes to work

#

That's one case where it would actually be an external containerd, by design right?

#

I mean externally provisioned - ie not by the driver

#

but UX-wise, it could be useful even in that case, to make it a specific kubernetes driver.

For example it could use a kub-specific method for connecting to the containerd (I know at the moment we rely on a unix socket being mounted into the client pod, but maybe in the future that could change, we were exploring kub-specific methods of injection and discovery)
Another example, perhaps if the containerd cannot be found, because the operator didn't install the daemonset, the kubernetes engine driver could actually print an error with specific instructions on how to install that daemonset

gleaming grove Aug 29, 2023, 12:06 AM

#

topaz owl Oh yeah, so I 100% do not want us to have separate env vars to specify how to co...

roger that

topaz owl Aug 29, 2023, 12:07 AM

#

gleaming grove I mean externally provisioned - ie not by the driver

Yes exactly. Actually, I think I was thrown off by "external containerd" before too. I thought you meant "vanilla containerd, not the dagger containerd we distribute in our image". But I realize now you meant "dagger containerd image setup ahead of time to be running somewhere"

In the daemonset approach today, the CLI needs to be set with _EXPERIMENTAL_DAGGER_RUNNER_HOST=unix:///path/to/buildkit.sock (with the unix sock being available due to volume sharing between the pod and the daemonset). Obviously that requires the daemonset be setup ahead of time.

In the new approach, by default, it'd be the exact same idea except we point to the containerd sock instead. Otherwise identical.

> but UX-wise, it could be useful even in that case, to make it a specific kubernetes driver.

Totally agree with this too. It would be great to support this, it's just obviously quite a bit more involved when we are making actual k8s api calls on behalf of users and such. But it would be great to have.

gleaming grove Aug 29, 2023, 12:07 AM

#

I prefer the idea of doing architecture change + drivers all in one, and just remove the env variable altogether, to help simplify things
If for some reason that's not possible, and we do have an intermediary env var, it definitely should be a new one, so as not to change the meaning of the existing _..._RUNNER_HOST, which could cause its own class of problems. As an operator I wouldn't want an env variable that used to point to a buildkit socket to suddenly point to a containerd socket, basically

#

> But I realize now you meant "dagger containerd image setup ahead of time to be running somewhere"

Well yes, but inevitably someone will use it to point at a vanilla containerd. So we have to design for what happens in that case.

topaz owl Aug 29, 2023, 12:10 AM

#

gleaming grove > But I realize now you meant "dagger containerd image setup ahead of time to be...

I agree, but actually the exact same situation exists today where someone could point the dagger CLI to connect to vanilla upstream buildkit. The connection will be successful up to some point where it would fail due to divergence w/ upstream (honestly not sure exactly where it would fail today, but somewhere, we don't have great handling for it even though we should).

In the new architecture, it's the same thing, just w/ "vanilla containerd" vs "containerd running in our dagger image".

Either way, failing gracefully in that situation is needed.

#

That feels like an argument for not loudly advertising that we're using containerd. Like we don't need to hide it as a secret or something obviously, we can mention it in architecture docs for power-users who want to know all the details underneath the hood. But unless we become sure we want to support vanilla containerd, it probably makes sense to not include the name "containerd" in any immediately user-facing parts of the UX/OX/docs.

gleaming grove Aug 29, 2023, 12:13 AM

#

Right. I think we can handle this mostly with UX, so the intent is clear and the only people pointing it at a vanilla containerd are people who really know they're voiding the warranty

#

By "handle with UX" I'm thinking specifically:

1. Go straight to engine drivers (could be a very simple implementation of them), no env variables
2. Deprecate _EXPERIMENTAL_DAGGER_RUNNER_HOST
3. No generic 'external containerd' driver. Instead, specific ones like "kubernetes" which are basically a very thin wrapper over a generic containerd proxy, with the opportunity for a more specialized / guided UX
4. Escape hatch: write your own engine driver. Ideally it can be just a shell script so it's easy to hack one

topaz owl Aug 29, 2023, 12:17 AM

#

gleaming grove By "handle with UX" I'm thinking specifically: 1. Go straight to engine drivers...

Yep that SGTM. Also worth noting that the move from _EXPERIMENTAL* to engine drivers can be done in parallel w/ the rest of the re-architecture. There's no implicit dependencies between them until pretty far down in the "DAG of implementation tasks", at which point that should be easy to hook up.

gleaming grove Aug 29, 2023, 12:18 AM

#

well when you actually write a driver, it will either proxy a containerd socket, or do other things

#

so the architecture change will also change the driver interface I think

topaz owl Aug 29, 2023, 12:20 AM

#

gleaming grove well when you actually write a driver, it will either proxy a containerd socket,...

Right but it works out nicely because both buildkit and containerd APIs are grpc apis served over unix socks. And the plumbing we need here is not api-specific, it's just about proxying grpc traffic. So honestly the only difference between an engine driver for our existing architecture and an engine driver for the new one would be the path of the unix socket.
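To illustrate why the plumbing is API-agnostic: a proxy that only copies bytes between two stream sockets never needs to know whether buildkit or containerd is on the other end. The sketch below is a generic byte relay in that spirit, not the connhelper implementation; `pump` and `relay` are made-up names.

```python
import socket
import threading


def pump(src, dst, bufsize=65536):
    """Copy bytes from src to dst until src reaches EOF, then half-close dst."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # peer already gone


def relay(client, upstream):
    """Relay traffic in both directions between two connected stream sockets."""
    threads = [
        threading.Thread(target=pump, args=(client, upstream), daemon=True),
        threading.Thread(target=pump, args=(upstream, client), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads


# Usage sketch: `client` would be an accepted connection from the CLI side and
# `upstream` a socket connected to the unix socket path inside the engine image.
# Only that path differs between the buildkit and containerd cases.
```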

gleaming grove Aug 29, 2023, 4:24 AM

#

I let this sit for a few hours, and... surprisingly... I think I still understand all of it!

ancient ruin Sep 19, 2023, 5:35 PM

#

topaz owl No, but I 100% understand the confusion. I think it's worth being a little pedan...

Hi Erik, when you mention under 3 - "have it serving the buildkit grpc api at port 8080" this is not enabled by default by any means and would require changing /etc/dagger/engine.toml?

solar dawn Sep 19, 2023, 6:01 PM

#

Erik is on vacation for another ~2 weeks, but I think I can answer that one since I just had to deal with it myself. You can configure the engine to listen on TCP (either via TOML or via --addr=tcp://0.0.0.0:8080), and this will mostly work, but Dagger-in-Dagger will not work because that works by mounting the Buildkit socket into the container. What I ended up doing is spawning socat UNIX-LISTEN:/run/buildkit/buildkitd.sock,reuseaddr,fork TCP:127.0.0.1:8888 alongside the engine. (You could probably go in the other direction too, i.e. have socat listen on TCP and forward to Unix.)

ancient ruin Sep 21, 2023, 9:47 PM

#

Thanks. Some progress:
engine.toml file:

debug = true
insecure-entitlements = ["security.insecure"]


[log]
  # log formatter: json or text
  format = "text"

[grpc]
address = ["tcp://0.0.0.0:1234"]

And then

docker run --rm -it \
    -v $(pwd)/engine.toml:/etc/dagger/engine.toml:ro \
    --privileged=true \
    -p 1234:1234 \
    registry.dagger.io/engine:v0.8.7

Works nicely, from another box I can _EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://this-box:1234 and it works as expected.

#

Now, if I could only tell dagger-engine to additionally use buildkit rootless (https://github.com/moby/buildkit/blob/master/docs/rootless.md#containerized-deployment) that'd be awesome! However I don't control what happens inside registry.dagger.io/engine. Is this on roadmap?

solar dawn Sep 21, 2023, 9:54 PM

#

ancient ruin Now, if I could only tell dagger-engine to additionally use buildkit rootless (h...

@rich kelp has been looking into this and has a nice write-up: https://github.com/dagger/dagger/pull/5809. The long and short of it is rootless support is extraordinarily difficult to implement and maintain while supporting Dagger's full feature set. It doesn't seem likely to happen. cc @keen trout

keen trout Sep 26, 2023, 4:01 PM

#

So I've started work in containerd land on trying to switch to the containerd worker - which works fine 🎉
But I've got a couple of open questions, and I'm curious if anyone has any opinions here:

How do we plan to handle containerd upgrades?

When containerd is updated upstream (usually in line with docker), buildkit bumps the go.mod version (the client library) alongside the containerd components (binaries in the image, or server-side vendored into docker). That is usually a whole pain: the whole ecosystem needs to jump together so that all the clients match all the servers. It would be nice to avoid needing to join this pain.

But there's a future scenario where buildkit updates its client bindings of containerd to rely on new functionality (also bumping the containerd server binaries at the same time). If the "containerd runner substrate" we have is on an older version, we have an incompatibility here. I'm not too worried about straight breakage, but we need to work out what we do when dagger engine version X has containerd client version A, dagger engine version Y has containerd client version B, and we want to run both in a runner (which ideally is long-running, so shouldn't need to be upgraded in step with the engine/cli) that might even be running containerd server C.

TL;DR - it seems like there could be a lot of different containerd versions, all coexisting at the same time. Maybe the simple solution could be to just fix to containerd minor versions 0.0.X, given that buildkit doesn't tend to bump the version that often? But maybe someone has thought about this already and has a better solution.

EDIT: maybe this is less of an issue than I imagined it was. Buildkit actually seems to consistently test against the containerd versions that are supported (1.6 and 1.7 atm), so we may actually be okay on this front. We would need version restrictions on which containerd versions on the runner each engine supports, but that should be fine!
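To make the "version restrictions" idea concrete, here's a rough sketch of the kind of check an engine could run against the runner's containerd version. The function name and the SUPPORTED_MINORS list are assumptions for illustration (the list just mirrors the 1.6/1.7 note above); nothing like this exists in the engine today as far as I know:

```shell
#!/bin/sh
# Assumption: the containerd minor series this engine build supports.
SUPPORTED_MINORS="1.6 1.7"

# containerd_supported VERSION
# Returns 0 if VERSION's major.minor is in SUPPORTED_MINORS, 1 otherwise.
containerd_supported() {
  version="${1#v}"                                        # tolerate a leading "v"
  minor=$(printf '%s' "$version" | cut -d. -f1,2)         # "1.7.6" -> "1.7"
  for m in $SUPPORTED_MINORS; do
    [ "$m" = "$minor" ] && return 0
  done
  return 1
}

if containerd_supported "v1.7.6"; then
  echo "runner containerd is supported"
else
  echo "runner containerd is NOT supported" >&2
fi
```

Matching on major.minor only means patch bumps on the runner never block an engine, which fits the goal of a long-running runner that isn't upgraded in step with the engine/cli.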

#

How do we handle different runc versions?

Assuming that we solve the above, we also have the same problem, but this time with different runc versions. There are two main options:

We have 1 runc, which we ship in the runner image.
We have 1 runc in the runner image (used to run the dagger engine), and then a runc in each dagger engine. Then we magically pass this when we create containerd containers in the buildkit worker, so that we have 1 running version of containerd, but different versions of runc.

There are also more fun things related to this: we need the containerd-shim-runc-v2 binary, which we'll probably need to ship wherever we ship runc (or we could always put it in the runner and version it alongside containerd; more options, yay).

Finally there's the dagger-shim which wraps runc itself. This definitely needs to be distributed alongside the engine, so we possibly have this weird magical dance (same as above) where containerd has to use the engine container filesystem to determine which dagger-shim to use... this is a little funky, but possible.

#

A potential alternative to maybe consider that could eliminate a lot of toil... we could potentially keep the spirit of "stateless engine", while not using containerd. (this is literally off the top of my head, so maybe this might not work)

We could just nest runc versions: we start the dagger engine container inside our runner using runc, and then the engine also starts its containers using runc. To prevent us from adding another layer of containerization (which would be sad), we could just remove layers of runc protection. We don't actually need certain things at that level to be isolated, e.g. we don't really require network isolation or pid namespace isolation. We could even potentially modify the engine so that multiple engines could all share the same filesystem (e.g. engine version x goes into /etc/dagger-x).

One of the issues here is we don't share the same containerd content store - but maybe this is something we could look at doing with upstream patches - I don't think there's anything fundamentally impossible about sharing containerd stores among multiple buildkits (even if it might not be possible today).

EDIT: after a night of sleep, I'm not convinced about this alternative anymore. I think the containerd versioning issue is pretty solvable now, and while the need for runc magic is a bit annoying, doing runc nesting really starts to feel very custom and not fun to me.

#

</wall-of-text> 😢
i'm logging off for the day, but if anyone has any ideas for some of these problems, or any insight into my idea not to use containerd here, please share 😄 i'm still not fully aware of all the context around this, and any previous discussions that may have been had.

gleaming grove Sep 26, 2023, 4:36 PM

#

Sorry for missing all this, somehow I had left this thread

#

catching up
