#Dagger SLOOOWWWW in CI; how should I REALLY use it?

umbral vale
#

Hi.
I've been trying to adopt dagger for months, it's the core tool our current CIs are built on, but I'm still on the fence. Among other concerns, speed is currently in my crosshairs.

It's all well and nice if I run CI builds on my localhost: docker is pre-started, the dagger engine is pre-started and all the dagger runs start quickly. However, CI is a different story. In CI I'm running builds in a DIND container, meaning the dagger engine has to be downloaded and started, and the CLI has to be downloaded and installed. For every CI pipeline. For every CI job... That's a lot of downloads and installs. And dagger engine startups/warmups. Also, this way I can leverage neither dagger's nor docker's cache. As a result, dagger run and dagger call take MINUTES to initialize -- that's +100% of the total build time in some of my pipelines.

There is (was?) an experimental ENV var allowing me to specify a dagger/docker host to offload builds to a specific machine. This way I could have both docker and the dagger engine pre-started and pre-heated, like on my laptop. However... Some of my codebases are gigabytes in size, so transferring all those files to another machine over the network doesn't sound like a tempting approach. Moreover, this way I have no means of controlling resource allocation: if I specify a particular server's IP in that env var value, that single server will be running ALL the simultaneous jobs. Which might demand a lot of firepower and memory from that poor server. Also, it acts as a SPoF. Sure, I could provision a beefy machine for dagger, but that would mean I'm paying for a machine that sits idle 90% of the time.
If I instead set that env to some round-robin LB's IP, it partially solves the overloading problem, because this way I can spread the load. However, there's no guarantee that the same server won't receive X parallel CI runs. Also, this approach doesn't leverage docker/dagger caching, as each instance has its own cache separately.
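
To make the round-robin concern concrete, here is a minimal sketch in plain Python (no Dagger or load balancer involved). The `Job` type, the workload and the server count are all hypothetical; it only shows that a single long build can overlap with later jobs that round-robin sends to the same server.

```python
from dataclasses import dataclass


@dataclass
class Job:
    start: int      # minute the CI job arrives (hypothetical)
    duration: int   # minutes the job keeps an engine busy (hypothetical)


def peak_concurrency(jobs, n_servers):
    """Assign jobs round-robin and return each server's peak number of overlapping jobs."""
    assigned = [[] for _ in range(n_servers)]
    for i, job in enumerate(jobs):
        assigned[i % n_servers].append(job)

    peaks = []
    for server_jobs in assigned:
        events = []
        for job in server_jobs:
            events.append((job.start, 1))                   # job starts
            events.append((job.start + job.duration, -1))   # job ends
        # At equal timestamps, process ends (-1) before starts (+1).
        events.sort(key=lambda e: (e[0], e[1]))
        running = peak = 0
        for _, delta in events:
            running += delta
            peak = max(peak, running)
        peaks.append(peak)
    return peaks


# One long build followed by short ones, one arrival per minute, three servers.
jobs = [Job(0, 30), Job(1, 2), Job(2, 2), Job(3, 2), Job(4, 2), Job(5, 2), Job(6, 2)]
peaks = peak_concurrency(jobs, 3)
print(peaks)  # prints [2, 1, 1]
```

The first server ends up with two jobs at once while the others never exceed one, because round-robin looks at arrival order and not at how busy each server currently is.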

#

So... How am I REALLY supposed to use dagger to get decent CI performance? And, preferably, reproducible builds (as much as possible)?

#

I'm tempted to give Nix a chance to see how it goes; something tells me it would provide both performance and reproducibility (and I could probably even share-mount /nix/store across multiple CI runner pods), at the price of a cryptic language.

royal furnace
#

Dagger in an ASG + Dagger Cloud to handle caching, basically

#

Have your minimum in the ASG be 1 instance, maybe schedule some additional instances just before the work day starts and scale those down near the end, but otherwise let instance metrics handle scaling
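
A rough sketch of that schedule, assuming AWS Auto Scaling scheduled actions. The group name, sizes and cron times are hypothetical placeholders. The function only builds parameter dicts; the keys are chosen to match the keyword arguments of boto3's `put_scheduled_update_group_action`, so each dict could be passed to that call, but nothing here talks to AWS.

```python
def workday_schedule(group_name, baseline=1, peak=4, max_size=8,
                     start_cron="30 7 * * 1-5", end_cron="0 19 * * 1-5"):
    """Build two scheduled-action definitions: pre-warm before work, shrink after."""
    if not (baseline <= peak <= max_size):
        raise ValueError("expected baseline <= peak <= max_size")
    # The minimum stays at the baseline all day so instance metrics can still
    # scale the group down if the extra capacity turns out to be unused.
    common = {
        "AutoScalingGroupName": group_name,
        "MinSize": baseline,
        "MaxSize": max_size,
    }
    return [
        {**common, "ScheduledActionName": "workday-start",
         "Recurrence": start_cron, "DesiredCapacity": peak},
        {**common, "ScheduledActionName": "workday-end",
         "Recurrence": end_cron, "DesiredCapacity": baseline},
    ]


# "dagger-engines" is a made-up group name for illustration.
actions = workday_schedule("dagger-engines")
for action in actions:
    print(action["ScheduledActionName"], action["Recurrence"], action["DesiredCapacity"])
```

Metric-based scaling policies would still be configured separately; the scheduled actions only move the desired capacity at the edges of the work day.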

granite vector
#

@umbral vale isn't that the same issue you'd have with a Dockerfile, for example? Dagger's caching challenges are the same as those of building traditional containerized applications in any CI environment. Caching container and build artifacts is challenging and ultimately comes down to either (a) having a stateful set of machines with a warm cache or (b) optimizing your pipelines to be performant in a more ephemeral environment

#

The Dagger engine + CLI download part should usually be a negligible part of the overall pipeline. I'm thinking that in this case you're probably being bitten by the initialization time if your repository is GBs in size.

@umbral vale any chance you're on GitHub Actions? If that's the case and you're interested, we can enroll you in a beta feature we've been working on recently to tackle these types of performance challenges. cc @solar wyvern

umbral vale
#

@granite vector unfortunately we're married to Bitbucket in this project