#Initialize step in CI is slow


drowsy agate
#

Hello, I've been loving dagger so far! My issue is that while I've solved my slow prepare speeds locally with views, my CI builds still take a while to initialize:
8 : initialize
9 : resolving module ref
9 : resolving module ref DONE [0.2s]
10 : installing module
10 : installing module DONE [54.3s]
11 : analyzing module
11 : analyzing module DONE [0.3s]
8 : initialize DONE [54.8s]
I've already made it past connecting to the engine, so it seems like this is just initializing my dagger (golang) module. Is that correct? If that's the case, it seems like a long time, since I only have ~1.5k lines of code, one dependency (I'm going to remove this and test that way), and a go.mod file just under 50 lines. Locally this step takes 2-3 sec. Is there any way to speed this step up in ephemeral CI runners on GitHub?
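To see at a glance which sub-step dominates a log like the one above, here is a minimal Go sketch that parses the `DONE [..s]` lines and reports the slowest step as a share of its parent. The line format is inferred only from the excerpt in this message, and `Step`, `ParseSteps` and `SlowestChild` are hypothetical helper names, not part of Dagger.

```go
package main

import (
	"regexp"
	"strconv"
	"strings"
)

// doneLine matches progress lines such as "10 : installing module DONE [54.3s]".
// The pattern is an assumption based on the log excerpt, not a documented format.
var doneLine = regexp.MustCompile(`^\s*(\d+)\s*:\s*(.+?)\s+DONE\s+\[([0-9.]+)s\]\s*$`)

// Step is one completed step from the progress log.
type Step struct {
	ID      int
	Name    string
	Seconds float64
}

// ParseSteps extracts every completed ("DONE") step from the log text.
// Lines that do not match the expected shape are skipped.
func ParseSteps(text string) []Step {
	var steps []Step
	for _, line := range strings.Split(text, "\n") {
		m := doneLine.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		id, errID := strconv.Atoi(m[1])
		secs, errSecs := strconv.ParseFloat(m[3], 64)
		if errID != nil || errSecs != nil {
			continue
		}
		steps = append(steps, Step{ID: id, Name: m[2], Seconds: secs})
	}
	return steps
}

// SlowestChild returns the slowest step other than the named parent step,
// plus that step's share (0..1) of the parent's duration.
// ok is false if the parent or every child step is missing.
func SlowestChild(steps []Step, parent string) (slowest Step, share float64, ok bool) {
	var parentSecs float64
	foundParent, foundChild := false, false
	for _, s := range steps {
		if s.Name == parent {
			parentSecs, foundParent = s.Seconds, true
			continue
		}
		if !foundChild || s.Seconds > slowest.Seconds {
			slowest, foundChild = s, true
		}
	}
	if !foundParent || !foundChild || parentSecs <= 0 {
		return Step{}, 0, false
	}
	return slowest, slowest.Seconds / parentSecs, true
}
```

Calling `SlowestChild(ParseSteps(text), "initialize")` on the excerpt above picks out "installing module", which accounts for roughly 99% of the initialize time.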

#

Removing the one dependency I had took it down to 39s, but that's 33% of my build time right now, so any tips welcome!

dusky chasm
#

Hi @drowsy agate,

  • Are you able to share a Trace url? You'll need to hook up your engine to a free Dagger Cloud account. Would help us drill down together

  • What's the performance like when running the same thing locally? It's possible that lack of persistent cache is the culprit

tropic kite
#

I'm also experiencing long initialisation times using the Go SDK. Looks like it spends almost all of its time in codegen? Total length of initialise for this run was 1m12s. I'm running on GitLab CI with the Docker executor.

zinc flame
#

Additionally, does your pipeline have a lot of Dagger module dependencies?

tropic kite
#

The pipeline has no dependencies and just the auto-generated dagger.json

drowsy agate
#

Hi there, thanks for your help!

  • I'll look into traces, but not sure it'll happen quickly. I'd like to but still trying to sell dagger fully to the whole team.
  • Locally I previously had issues with the prepare step taking 30s-1m each time. Now it only takes a long time after I've cleared my cache, as the view seems to speed things up after an initial run. I've never seen this kind of prepare time in GH ephemeral runs. On the other hand, initialize locally never seems to take more than 12s after a cache clear, while it's consistently ~30s in CI.
  • I don't have any includes in my dagger.json, only excludes (normal stuff: dist dirs, node modules, etc.) and a view which has functionally the same thing, with ! before each pattern entry to similarly exclude it.

I think what I'm mostly asking (and am confused by) is the cause of disparities in times between local and CI for the init step. Prepare never takes long in my CI runners, but it was greatly improved locally with views. My guess here (which I'm somewhat confident about) is that my local node_modules, dist dirs and such were slowing down loading the dir initially, and views allow a quick way to skip those when grabbing my root directory. On the other hand the initialize step never takes long locally (10-12s on a fresh cache run), but is pretty consistently around 30s in CI. This part I don't really understand: what are some differences that could cause this? Does the fact that I have go modules installed locally matter? (guess is no) What else could cause a significant difference in init time between my local system with the cache wiped and a CI runner? Could this just be a case of better network/compute on my local machine?

The init time in CI is not a blocking issue for me, but just having the understanding of why it might perform differently in these different scenarios is really what I'm hoping to gain!

dusky chasm
#

@willow sage my guess as to why initialization is slower in CI is the lack of cache persistence between CI runs. This is very common in a vanilla Dagger install on a vanilla CI runner. Your ephemeral runner runs a new dagger engine each time, with an empty local cache each time.

#

By contrast, your dev machine has perfect caching persistence.

#

This is where Traces would really help. To be clear, you can create a free account, and perhaps you can start by simply tracing your local runs on your dev machine, to get an idea of the performance profile of a local run. Half of the equation would be better than nothing.

drowsy agate
#

So as far as ways to speed it up: 1. run on a CI runner with a cache (definitely something I'm looking into in the future, when I have time 😛) 2. keep go modules reasonably tidy and only include dagger dependencies when absolutely needed?

dusky chasm
#

Confirming that Dagger modules are fully sandboxed, so anything installed on your machine, other than dagger itself, has no effect on how your Dagger module will run.

drowsy agate
#

Is my understanding that init times are generally related to compiling my module correct? If that is the case, it seems like I'll just have to deal with the times in CI until I'm ready to get a runner with a persistent cache, while keeping my module as slim as possible to help out a bit.

#

As far as the cloud setup, I'll try to get that going at least for my local runs tomorrow. Not trying to be difficult here, I just have to get a sign-off for this and can't do that today unfortunately!

dusky chasm
#

Another possible outcome is that we ship features that provide relief even for fully ephemeral runners.

  • We're already working on object storage support. Still an infrastructure requirement but at least wouldn't require touching the compute, which often makes things easier

  • Also perhaps we should consider supporting a "vendored cache" or something of that nature, where parts of the runtime build process are pre-baked and vendored in your repository. Not something currently being developed, but given the prevalence of "slow init" problems, maybe we should consider it. cc @reef grove @velvet valve @zinc flame @blissful dagger

#

Certainly speed is becoming a focus for us so you should expect improvements

drowsy agate
#

And just to sandwich the whole conversation I have really been enjoying dagger as a whole! The amount of productivity I gain by being able to run my pipeline functionality locally has been a huge boon. Thank you for writing a cool tool and for the hands-on help!

zinc flame
#

@drowsy agate is there a chance you could share with us a Dagger Cloud trace so we can better understand where the time is going during the module initialization in your specific case? 🙏

tropic kite
#

Hello, chiming in again to ask about the caching. Are there any examples (if it is even possible) of how we could utilise native caching mechanisms for certain CI tools? For example, GitLab CI offers a run caching mechanism similar to artifacts (using the cache keyword). Is there a directory we can supply and a sensible key to provide for this? I presume that this isn't really possible if all of the caching is done using OCI artifacts?
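On the "sensible key" part: one common convention in CI systems generally is to derive the cache key from the dependency lockfiles, so the cache is reused until dependencies change. That is a general practice and an assumption here, not something Dagger is known to provide or require. A minimal Go sketch, with `CacheKey` and the file names as hypothetical examples:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// CacheKey derives a short, deterministic key from a prefix and the contents
// of the given dependency files (for example go.mod and go.sum).
// File names are sorted first so map iteration order cannot change the key.
func CacheKey(prefix string, files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	for _, name := range names {
		// A zero byte separates names and contents so that adjacent
		// fields cannot run together and collide.
		h.Write([]byte(name))
		h.Write([]byte{0})
		h.Write(files[name])
		h.Write([]byte{0})
	}
	// 16 hex characters keep the key readable; the length is an arbitrary choice.
	return prefix + "-" + hex.EncodeToString(h.Sum(nil))[:16]
}
```

The same file contents always give the same key, and any change to go.sum gives a new one, so a stale cache is not restored after a dependency bump.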

dusky chasm
#

There is a feature currently in development that offers exactly what you're talking about @tropic kite @willow sage ("here is a directory, persist it at your leisure using your CI's native caching mechanism"). But for now, it only covers cache volumes. Those are the directories you can ask Dagger to persist on a case-by-case basis, to allow tool-specific caches to accumulate. So typically you would find node modules, go module cache, linux distro package cache, stuff like that.

So that will definitely help accelerate.

Then there's the other half of the problem: layer cache. That is about caching Dagger's own operations, the same way docker build caches its layers. Under the hood it's the same tech. To persist that in an ephemeral CI runner, there is another feature we are developing, that is a bit further out (but still high priority), that will allow you to use an object storage service (S3 or equivalent) to persist all your layers seamlessly. In the meantime there are stopgaps. Which stopgap is applicable to you depends on your constraints.
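To illustrate why an ephemeral runner pays the full cost every time, here is a toy Go model of a content-addressed operation cache. It is a sketch of the general idea only, not Dagger's implementation: `Engine`, `Run` and `InitModule` are hypothetical names, the step names are borrowed from the log earlier in the thread, and the input file names are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// Engine is a toy stand-in for a build engine with a content-addressed cache.
type Engine struct {
	cache    map[string]string // digest of (operation + inputs) -> cached result
	Executed int               // operations actually run, i.e. cache misses
}

// NewEngine returns an engine with an empty cache, like a fresh CI runner.
func NewEngine() *Engine {
	return &Engine{cache: map[string]string{}}
}

// digest identifies an operation by its name and its inputs.
func digest(op string, inputs []string) string {
	sum := sha256.Sum256([]byte(op + "\x00" + strings.Join(inputs, "\x00")))
	return hex.EncodeToString(sum[:])
}

// Run returns the cached result if the same operation already ran with the
// same inputs on this engine; otherwise it "executes" the operation.
func (e *Engine) Run(op string, inputs ...string) string {
	key := digest(op, inputs)
	if out, hit := e.cache[key]; hit {
		return out
	}
	e.Executed++
	out := "result:" + key[:12]
	e.cache[key] = out
	return out
}

// InitModule replays three initialize sub-steps with made-up inputs.
func InitModule(e *Engine) {
	e.Run("resolving module ref", "dagger.json")
	e.Run("installing module", "go.mod", "go.sum", "main.go")
	e.Run("analyzing module", "main.go")
}
```

Calling `InitModule` twice on the same `Engine` executes three operations in total, because the second call is all cache hits. Calling it once on each of two fresh engines executes six, which is the situation of an ephemeral runner that starts with an empty cache on every run.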

drowsy agate
#

Those features sound good to me. I didn't get to dagger cloud today but intend to. Hopefully tracing will reveal exactly what kind of improvements I should be looking at!

blissful dagger
#

Had this in my queue to get back on but just found Solomon already summarized it perfectly 😄 I'll just add that:

  • There's a high-level tracking issue here: https://github.com/dagger/dagger/issues/8004
  • Part of the effort is not only to add better support for controlling cache storage but also to make the overall caching system more understandable and debuggable, which will indeed likely tie into our tracing support (details TBD)

drowsy agate
#

Ok I (finally) got my cloud traces going. I'm seeing the image above as part of the codegen step as the bulk of the time in initialize, so it looks like this is just because I don't have a cache for the directory in my CI runner. I don't think there's much else to do other than wait for the features Erik mentioned. Thanks for your help everyone!

dusky chasm
#

@willow sage can you tell me more about your use case? What does that hot yaml mess actually do? What tools do you use and how do they interact with the host system?

drowsy agate
#

Nothing unexpected for normal CI/CD type stuff: get/bump the version, tag on main, setup VS toolchain stuff, restore, build/publish, and push that artifact somewhere. The trouble is I need to be on Windows (VS toolchain for UWP required) and I just don't have a good way to avoid having all my powershell commands just sitting in a yaml step list. Versioning can only be reproduced locally by copying/running those commands, and there's no way to get a clean build env locally easily.

The biggest pain point with this setup is just that it's almost never obvious to a developer, or even sometimes to me, what's different about the CI machine vs local, and there's no good way to "run the pipeline" locally in a way that's even remotely sanitary. Complete opposite of dagger, where I haven't seen any significant differences between local runs and CI runs. So really what I want the most for Windows platform builds is that reproducibility, so I'm not pushing and praying, but even more importantly so the win devs we have can run the pipe locally with confidence.

I dunno if that means win containers or what...I haven't worked with win containers in a while and it wasn't trivial when I did, but I'm not sure how else to get a relatively sanitary local environment.

dusky chasm
#

If I were to lift the whole thing into a linux container with powershell installed... what would break?

  1. Calling the VS toolchain
  2. anything else?
#

Are you able to copy-paste a snippet of the relevant commands? I'm interested in any example of the stuff that would break in a linux container, since the crux of the issue is punching holes through the sandbox.

drowsy agate
#

If we didn't have those hard dependencies on the win platform I probably would just build for win in linux.

dusky chasm
#

Any chance you could share a snippet?

drowsy agate
#

Of what, the .proj file stuff?

dusky chasm
#

not sure 🙂 whatever could help me better understand the project, so I can think about how we might get it to work in dagger

velvet valve
#

Interested in the restore step too. Other windows folks mentioned that.

velvet valve
#

Oh, I see

drowsy agate
#

Yeah, restore is just like npm i and I can't imagine it's gonna be what fails on linux unless it does some kind of platform checking based on your project, but WiX is the one dependency I identified immediately as being the thing that would force me onto Windows.