Hi all,
I noticed that most of the time consumed by my dagger call is actually the pulling of the dagger engine image.
Is there a way of caching this on the local filesystem?
Atm I'm running it in GHA on self-hosted CodeBuild runners with an S3 cache backend. I'm already caching the entire /var/lib/docker directory, but that does not seem to be enough.
#Dagger engine caching
that's strange.. if you're caching /var/lib/docker, the image should be there. Have you verified that docker image ls shows the image when the workflow starts? Maybe your snapshot/restore process is not working as expected?
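A quick way to run that check as the first step of the workflow could look like the sketch below. `has_engine_image` is a hypothetical helper name; it only assumes the engine image is tagged under registry.dagger.io/engine, as mentioned later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads "repository:tag" lines on stdin and succeeds
# if any of them is a dagger engine image.
has_engine_image() {
  grep -q '^registry\.dagger\.io/engine:'
}

# Example usage at the start of the workflow (needs a reachable docker daemon):
#   docker image ls --format '{{.Repository}}:{{.Tag}}' | has_engine_image \
#     && echo "engine image restored from cache" \
#     || echo "engine image missing, it will be pulled"
```

If the image is reported missing right after the cache restore step, the snapshot/restore of /var/lib/docker is the part to investigate.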
The image is not there. It seems like an issue with GHA on CodeBuild. I'm starting to think it's not possible to cache /var/lib/docker on CodeBuild backed by S3.
/var/lib/docker is generally not straightforward to snapshot / restore. If you check the actual docker documentation, the way they recommend doing GHA caching (https://docs.docker.com/build/ci/github-actions/cache/) is by using the different exporters and not by directly copying /var/lib/docker
I saw that. But if I'm not mistaken, Dagger has no option to specify a target to cache your layers to?
we support the same export options as docker (buildkit cache export) but they are marked as experimental because buildkit cache export is not without issues. We are working on something better. In the meantime, you can try using bk export, or persisting dagger cache at a lower layer (re-attachable block storage, longer-lived VMs, etc)
Thanks for clarifying. I'll try to figure out where and how to set the buildkit cache export options.
Big changes to the underlying infra is not possible because of our usecase.
We are supporting multiple different CI/CD solutions atm and I'm leveraging dagger to distribute functions that should run on every possible CI/CD solution.
ps not sure what you mean with bk export tbh...
bk export:
- Doesn't support merging cache results. The last run wins. If you're not careful you can easily make performance worse than with no caching. So you must very carefully tune your export configuration (which bucket & manifest to write to, for which workflow). This causes infra & app concerns to be tightly coupled, which is brittle and slows you down.
- Doesn't persist cache volumes. Depending on where your pipelines spend the most time, you may not get the boost you're hoping for.
- Generally has bugs and edge cases, since it's not used as widely as the rest of docker build
here are some dagger tests that actually validate that the feature works: https://github.com/marcosnils/dagger/blob/main/core/integration/remotecache_test.go
I tried to implement the S3 cache by passing the cache env as an argument to the module, and creating a container with that env var to run my function:
func New(
	//+optional
	bucket string,
) *Metrics {
	container := dag.Container().From("gcr.io/distroless/static-debian12")
	var cacheEnv string
	if bucket != "" {
		cacheEnv = "type=s3,mode=max,region=eu-west-1,use_path_style=true,bucket=" + bucket
		container = container.WithEnvVariable("_EXPERIMENTAL_DAGGER_CACHE_CONFIG", cacheEnv)
	}
	return &Metrics{
		Container: container,
		CacheEnv:  cacheEnv,
	}
}
So I run dagger call --bucket cachebucket metrics ...., hoping it would leverage the S3 caching mechanism of buildkit.
Although I can see the env var is set on the container I use by default, nothing lands in the bucket and there is no trace of an upload in the run logs.
Container.withEnvVariable(name: "_EXPERIMENTAL_DAGGER_CACHE_CONFIG", value: "type=s3,mode=max,region=eu-west-1,use_path_style=true,bucket=cachebucket"): Container!
Any hints where to look?
you need to set that env var on the outside, when calling the dagger CLI itself.
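As a rough sketch of what "on the outside" means: the variable goes into the environment of the process that launches the dagger CLI, not onto a container inside the module. The config string below is the one from this thread; `cache_config` and `run_dagger` are hypothetical helper names, and AWS credentials are assumed to be available through the usual environment variables.

```shell
#!/bin/sh
# Hypothetical helper: builds the cache config string used in this thread.
# $1 = bucket name, $2 = region (defaults to eu-west-1 as in the thread).
cache_config() {
  bucket="$1"
  region="${2:-eu-west-1}"
  printf 'type=s3,mode=max,region=%s,use_path_style=true,bucket=%s' "$region" "$bucket"
}

# Hypothetical wrapper: sets the variable for the dagger CLI process itself,
# then passes the remaining arguments straight through to dagger.
run_dagger() {
  bucket="$1"
  shift
  _EXPERIMENTAL_DAGGER_CACHE_CONFIG="$(cache_config "$bucket")" dagger "$@"
}

# Example (not executed here):
#   run_dagger cachebucket call metrics
```

With this shape the module no longer needs the --bucket argument for caching purposes, since the engine reads the setting from the CLI's environment.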
Thank you for helping out. But I feel really stupid now 😄
I configured my GH action to have the env var as you pointed out, but nothing is happening or logged related to caching... am I missing something really stupid here?
This is how i did the gha:
- name: Dagger
  uses: dagger/dagger-for-github@8.0.0
  env:
    AWS_ACCESS_KEY_ID: ${{ env.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ env.AWS_SECRET_ACCESS_KEY }}
    AWS_SESSION_TOKEN: ${{ env.AWS_SESSION_TOKEN }}
    AWS_ACCOUNT_ID: ${{ env.AWS_ACCOUNT_ID }}
    _EXPERIMENTAL_DAGGER_CACHE_CONFIG: "type=s3,mode=max,region=eu-west-1,use_path_style=true,bucket=gh-cache-prod,access_key_id=$AWS_ACCESS_KEY_ID,secret_access_key=$AWS_SECRET_ACCESS_KEY,session_token=$AWS_SESSION_TOKEN"
  with:
And after this I call my function as I always did, but it's not caching the dagger engine container...
buildkit cache export will export the output of operations executed by dagger/buildkit, but not the downloading of the dagger engine itself
@waxen fern the dagger image (and any other OCI image) can't be efficiently stored in the free GHA tier cache. Even if you store a tar of the image in the cache and restore it on every run, it will take about the same time as, or even longer than, pulling it directly from registry.dagger.io. Downloading the tar from the cache and importing it into the engine is not significantly faster than a registry pull.
Assuming you use the default docker provisioner (CLI calls docker run registry.dagger.io/engine), some CI platforms have a proprietary "docker engine cache" feature, which reuses the same docker state in between runners. If that feature exists on your CI, and you enable it, it could help speed up dagger engine initialization, since it wouldn't always have to be re-downloaded. I think AWS Codebuild has that for example
Skimming through CodeBuild's docs, seems like the "local Docker layer cache" is what Solomon might be referring to: https://docs.aws.amazon.com/codebuild/latest/userguide/caching-local.html. From an initial impression and also kind of confirmed by this re:Post (https://repost.aws/questions/QUr8_kFPLjRa2n69BEdR9ZuQ/how-can-i-make-docker-layer-cache-permanent-when-i-build-docker-images-in-codebuild#ANY5vOlKw3T6alIf3SEUfPHg) it seems to be a "quick hack" where they try to schedule all the same builds in a given interval to the same host so you can re-use the cache. It kind of works but as stated in the AWS docs, if your builds are not super frequent, then it's the same as having no cache at all
exactly 👍
Okay wow thanks, this is great info.
I was hoping to cache to S3 and minimise changes to the CodeBuild setup as much as possible, because I'll need to distribute the setup across the teams.
But yeah, I'm aware of the docker layer caching and, even more interesting, the new docker build server in AWS.
Again thank you very much for helping me out.
@waxen fern one thing that just occurred to me which might speed up the engine pull phase after skimming through the CodeBuild docs is if you create a custom runtime image with the dagger-engine-image.tar file. Then, in the entrypoint of that runtime you could docker load that tar file so the engine image is present when the codebuild pipeline starts
You'd have to benchmark it, but I'd assume that will be a bit faster than pulling and unpacking the image when the pipeline starts
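A minimal sketch of that idea, under a few assumptions: the function names and the tarball path are hypothetical, the engine image reference (including the tag your dagger CLI expects) has to be supplied by you since it is not given in this thread, and a docker daemon is assumed to be reachable by the time the entrypoint runs.

```shell
#!/bin/sh
# Build time: run once while baking the custom runtime image.
# $1 = full engine image reference (tag not given in this thread),
# $2 = where to write the tarball inside the runtime image.
save_engine_image() {
  image="$1"
  tar_path="$2"
  docker pull "$image" && docker save -o "$tar_path" "$image"
}

# Entrypoint time: load the tarball if it exists, so the engine image is
# already present when the pipeline starts. Returns 1 if there is no tarball,
# in which case dagger falls back to pulling from the registry as usual.
preload_engine_image() {
  tar_path="$1"
  if [ ! -f "$tar_path" ]; then
    echo "no engine tarball at $tar_path, falling back to a registry pull" >&2
    return 1
  fi
  docker load -i "$tar_path"
}

# Example entrypoint usage (path is hypothetical):
#   preload_engine_image /opt/dagger-engine-image.tar || true
#   exec "$@"
```

Timing `docker load` against a plain registry pull on the same runner type is the benchmark that decides whether this is worth maintaining.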