#How does DAG save computation
All good, again I appreciate your valuable time.
It boils down to caching. A lot of the benefits of Dagger derive from automatic caching of everything, all the time. Every time you execute a tool as part of your pipeline, you're effectively building or downloading the container image for that tool on the fly.
So in a typical Dagger pipeline, even if your source code changes and you have to re-run the app build and test, a lot of the work needed to get to that point will already be cached.
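To make that concrete, here is a minimal toy sketch of the idea (not Dagger's actual implementation): each step gets a cache key derived from its name and the keys of its inputs, so a source change only invalidates the steps downstream of it. `ToyEngine`, `pipeline`, and the step names are all made up for illustration.

```python
import hashlib


class ToyEngine:
    """Toy content-addressed cache. Hypothetical, not Dagger's real engine."""

    def __init__(self):
        self.cache = {}     # cache key -> result
        self.executed = []  # names of the steps that actually ran

    def source(self, content):
        # Leaf node: the key is simply the hash of the content.
        return hashlib.sha256(content.encode()).hexdigest(), content

    def run(self, name, fn, *inputs):
        # Each input is a (key, value) pair returned by source() or run().
        h = hashlib.sha256(name.encode())
        for key, _ in inputs:
            h.update(b"\0" + key.encode())
        key = h.hexdigest()
        if key not in self.cache:
            # Cache miss: actually execute the step and store its result.
            self.executed.append(name)
            self.cache[key] = fn(*(value for _, value in inputs))
        return key, self.cache[key]


def pipeline(engine, source_code):
    src = engine.source(source_code)
    toolchain = engine.run("pull-toolchain", lambda: "toolchain-image")
    deps = engine.run("install-deps", lambda t: t + "+deps", toolchain)
    build = engine.run("build", lambda d, s: f"binary({s})", deps, src)
    return engine.run("test", lambda b: f"tested {b}", build)


engine = ToyEngine()
pipeline(engine, "v1")
first_run = list(engine.executed)

engine.executed.clear()
pipeline(engine, "v2")  # same pipeline, changed source code
second_run = list(engine.executed)

print(first_run)   # all four steps ran
print(second_run)  # only the steps that depend on the source ran again
```

In this toy model the second run only re-executes `build` and `test`; pulling the toolchain and installing dependencies are served from the cache because their inputs did not change.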
Ideally, you would run any number of ephemeral Dagger engines, and they would magically share cache data. We are driving towards that, but in practice the architecture requires a long-running service to coordinate access to the storage. In other words, you cannot have N engines sharing a state directory.
So for architectural reasons, today you need 1 long-running service per hot cache (local state directory)
On top of that, there is container nesting. Dagger needs to run Linux containers. If Dagger is running inside a container (for example in a Kubernetes pod), it still needs a way to run containers. Docker-in-Docker is in theory an option, and it does work, but in production you often want to avoid it. So having a companion daemonset also addresses that problem.
As a result, the most pragmatic way (that we know of) to run Dagger on a Kubernetes cluster today, is to split it in two:
- Run as much as possible inside the pod, or in an ephemeral container instrumented by the pod
- Run as little as possible in a companion daemonset, to broker access to the host filesystem and container runtime
To clarify @toxic kayak, I understand the concern you explained initially: how to make sure this isn't a step backwards in efficient resource allocation. We obviously need to articulate our answer more clearly, and make it available in a document somewhere.
It's also worth mentioning that in the DAG / Dagger model, having fewer build nodes is "theoretically" better. If two build pipelines land in the same Dagger engine, they can efficiently de-dup operations between each other and make the best use of caching. That doesn't generally happen with traditional CI platforms, where CI jobs are completely isolated from and unrelated to one another, even though they probably overlap in several of their build steps.
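A rough way to picture the de-dup point, as a toy model only (the `ToyEngine` class, `service_pipeline`, and the step names are hypothetical, and this is not how the real engine is implemented): count how many steps actually execute when two overlapping pipelines share one engine versus when each gets its own isolated engine.

```python
import hashlib


class ToyEngine:
    """Toy engine that only counts cache misses. Hypothetical."""

    def __init__(self):
        self.cache = set()
        self.runs = 0  # number of steps that actually executed

    def run(self, op, *input_keys):
        # The key depends on the operation and on the keys of its inputs.
        key = hashlib.sha256("\0".join((op,) + input_keys).encode()).hexdigest()
        if key not in self.cache:
            self.runs += 1
            self.cache.add(key)
        return key


def service_pipeline(engine, service):
    # The first two steps are identical for every service.
    base = engine.run("pull-base-image")
    deps = engine.run("install-shared-deps", base)
    build = engine.run(f"build-{service}", deps)
    return engine.run(f"test-{service}", build)


# Both pipelines land on the same engine: shared steps run once.
shared = ToyEngine()
service_pipeline(shared, "api")
service_pipeline(shared, "worker")
shared_runs = shared.runs

# Each pipeline gets its own engine, like fully isolated CI jobs.
isolated_runs = 0
for service in ("api", "worker"):
    engine = ToyEngine()
    service_pipeline(engine, service)
    isolated_runs += engine.runs

print(shared_runs, isolated_runs)
```

With these made-up pipelines, the shared engine executes 6 steps in total while the isolated engines execute 8, because the base image pull and the shared dependency install are repeated for each job.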
It just occurred to me that this is a very relevant topic for #kubernetes 😁