#Equivalent of --cache-to and --cache-from in docker buildx


rain osprey
#

When I'm building a container with a Dockerfile, is there a way to export the build cache to my registry? I'm looking for the equivalent of the --cache-to and --cache-from flags in docker buildx build. I've also tried setting _EXPERIMENTAL_DAGGER_CACHE_CONFIG="type=registry,mode=max,ref=123456789.dkr.ecr.us-east-1.amazonaws.com/dagger-test:dagger-cache" to export the whole Dagger cache, but that didn't seem to do anything.

obtuse grotto
#

I had a very similar question today when looking for options to configure caching for Dagger in GitHub Actions - I pretty much gave up.

However, Claude seems to think you can do this by supplying a modified "frontend gateway" in a buildkitd.toml file: that file is documented at https://docs.docker.com/build/buildkit/toml-configuration/

Claude suggests the attached config (which I have not tried but looks promising). Further context:

If you want to set a default --cache-from argument without modifying the docker build command itself, you can consider using BuildKit's frontend gateway configuration.
The frontend gateway allows you to define custom frontends and configure their behavior. You can create a custom frontend that wraps the default Dockerfile frontend and injects the desired --cache-from argument.

#

This seems like a super-involved solution and I hope there is something easier that we can do

#

When it becomes important enough for me to save the couple of minutes in GitHub Actions, I will probably return to this problem. For now the runner time savings are not enough to warrant investing my time.

#

If you figure this out I'd love to know

#

Probably you will also need this in buildkitd.toml

[frontend."default"]
source = "custom-dockerfile-frontend"

knotty dune
raven tide
# rain osprey When I'm building a container with a dockerfile, is there a way to export the bu...

_EXPERIMENTAL_DAGGER_CACHE_CONFIG should work given that we have tests for that: https://github.com/dagger/dagger/blob/743725c8b96ff0f4888bd7b1512efbe8ffae2779/core/integration/remotecache_test.go#L40-L52C34. Just tested out using a local registry and seems to be working ok

#

@rain osprey even though we don't recommend using it and we don't actively maintain it, you should see a message in the engine logs while trying to export the cache if that env var is set

#

something like: time="2024-05-29T03:06:30Z" level=debug msg="done running cache export for client woi828fgfhbe2e787azz1vnna" client_hostname=xps client_id=woi828fgfhbe2e787azz1vnna server_id=sf69og5jrljcnvjh7m7bffo1j

rain osprey
#

Just getting back to this. Not sure what I had misconfigured before, but _EXPERIMENTAL_DAGGER_CACHE_CONFIG seems to work! I do have an issue with ECR where it's failing with "context cancelled", so not sure about that.

split marsh
#

@rain osprey were you able to speed up the build process? Can you share an example of the command you ran?

rain osprey
#

Hey @split marsh I haven't tested this in our actual build process yet, I'm still writing our initial workflows in dagger and I'm probably going to spend some time working on improving our CI infra so we have more reliable local caching before I try reaching for _EXPERIMENTAL_DAGGER_CACHE_CONFIG since that's not recommended.

#

The command I ran was export _EXPERIMENTAL_DAGGER_CACHE_CONFIG="type=registry,mode=max,ref=localhost:5000/ocli-test:dagger-cache" and then I ran my normal dagger call. I saw those log lines that @raven tide mentioned, and I saw logs from my local registry showing layers being pushed to and pulled from it. I didn't dig into the ECR "context cancelled" error since that seemed like a can of worms, and I think it's unlikely I'll actually leverage this anytime soon.
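
For reference, the steps described above can be collected into a small wrapper. This is only a sketch: the helper name `cache_config` is made up for illustration, while the `type`, `mode`, and `ref` keys and the local registry ref are the ones quoted in this thread. The variable is experimental and not recommended by the maintainers, so treat this as a debugging aid.

```shell
#!/bin/sh
# Hypothetical helper: builds the value for _EXPERIMENTAL_DAGGER_CACHE_CONFIG
# from a registry ref, using the keys quoted in this thread.
cache_config() (
  ref="$1"            # e.g. localhost:5000/ocli-test:dagger-cache
  mode="${2:-max}"    # "max" is the mode used in this thread
  printf 'type=registry,mode=%s,ref=%s\n' "$mode" "$ref"
)

# Export the variable before the usual `dagger call ...` invocation.
_EXPERIMENTAL_DAGGER_CACHE_CONFIG="$(cache_config localhost:5000/ocli-test:dagger-cache)"
export _EXPERIMENTAL_DAGGER_CACHE_CONFIG
echo "$_EXPERIMENTAL_DAGGER_CACHE_CONFIG"
```

After this, the normal `dagger call` runs in the same shell so that it inherits the exported variable.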

split marsh
#

thanks

split marsh
#

How do I know if Dagger is picking up this setting?
export _EXPERIMENTAL_DAGGER_CACHE_CONFIG="type=registry,ref=us-west2-docker.pkg.dev/project-id/cache,mode=max"

versed marsh
split marsh
#

I don't see anything in the Artifact Registry. I was expecting a log line or something showing that it's at least reading the variable. Any advice, please? We need to speed up the build, which takes around 30 minutes on every push.

raven tide
#

@split marsh try grepping your engine logs for the hostname that you used in your CACHE_CONFIG variable and check whether there's anything there
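
A minimal sketch of that check. Both helper names are hypothetical, and the container name filter in the usage comment is an assumption about how the engine was started; adjust it to your setup.

```shell
#!/bin/sh
# Hypothetical helper: pull the registry hostname out of a cache config string
# such as "type=registry,ref=HOST/path:tag,mode=max".
registry_host() (
  printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^ref=\([^/]*\).*/\1/p'
)

# Hypothetical helper: keep only the log lines on stdin that mention that host.
grep_cache_logs() (
  host="$(registry_host "$1")"
  grep -F -- "$host"
)

# Usage, assuming the engine runs as a Docker container whose name contains
# "dagger-engine" (an assumption, not confirmed in this thread):
#   docker logs "$(docker ps -q --filter name=dagger-engine)" 2>&1 \
#     | grep_cache_logs "$_EXPERIMENTAL_DAGGER_CACHE_CONFIG"
```

If nothing matches, `grep_cache_logs` exits non-zero, which suggests the engine never tried to contact that registry.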

#

I just did a test locally and it works with no issues

split marsh
#

ok seen some logs

2024-08-07 06:36:53 time="2024-08-07T11:36:53Z" level=debug msg="checked for cached auth handler namespace" cached=true key="us-west2-docker.pkg.dev/project-id/cache::pull" name=us-west2-docker.pkg.dev/project-id/cache scope=pull
2024-08-07 06:36:54 time="2024-08-07T11:36:54Z" level=debug msg="error while importing cache manifest from cmId=us-west2-docker.pkg.dev/project-id/cache: failed to configure registry cache importer: unexpected status from HEAD request to https://us-west2-docker.pkg.dev/v2/project-id/cache/manifests/latest: 400 Bad Request"

Not sure, to be honest, what's missing. Any help welcome.
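
The URL in that error can be reconstructed from the ref: registries that follow the OCI distribution API serve manifests at `/v2/<name>/manifests/<tag>`, and a ref without a tag falls back to `latest`. The sketch below (helper name hypothetical) shows which URL a given ref maps to, which may help when reproducing the HEAD request outside Dagger. It does not handle digest refs (`@sha256:...`) and always assumes HTTPS.

```shell
#!/bin/sh
# Hypothetical helper: map an image ref to the manifest URL a registry client
# would request. Digest refs are not handled; HTTPS is assumed.
manifest_url() (
  ref="$1"
  host="${ref%%/*}"        # everything before the first slash
  rest="${ref#*/}"         # repository path, possibly with :tag
  case "$rest" in
    *:*) name="${rest%:*}"; tag="${rest##*:}" ;;
    *)   name="$rest";      tag="latest" ;;   # no tag given: default to latest
  esac
  printf 'https://%s/v2/%s/manifests/%s\n' "$host" "$name" "$tag"
)

manifest_url us-west2-docker.pkg.dev/project-id/cache
```

To replay the request by hand, something like `curl -I -H "Authorization: Bearer $(gcloud auth print-access-token)" "$(manifest_url REF)"` could be tried. That assumes the registry accepts such a bearer token, which this thread does not confirm.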

raven tide
split marsh
#

a gcloud artifact registry

split marsh
#

Any advice?

raven tide
#

@split marsh can you check if the same issue happens with BuildKit's buildctl build (https://github.com/moby/buildkit/tree/master?tab=readme-ov-file#export-cache)? Dagger uses BuildKit internally, and I'm quite confident that what's happening in your case is coming from BuildKit.

If you still have this issue in buildkit, I'd advise to open an issue in their repo as there's nothing we can do from our end

#

I've tried using the CACHE_CONFIG option with a local registry and it's working without issues

split marsh
#

got it, let me try it

split marsh
#

I'm not sure I can run buildctl on Mac; I can't find any docs about it. I do have Docker Desktop, which uses BuildKit, but I'm not sure how to use BuildKit directly.

raven tide
#

@split marsh do you have docker buildx on mac?

split marsh
#

yes i do

raven tide
#

that's buildkit

#

docker buildx build --cache-from --cache-to

#

that's what you need

split marsh
#

And cache-from is in the format "type=registry,ref=us-west2-docker.pkg.dev/project-id/cache,mode=max"? Sorry, I've never used it before.
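
On the format question: both flags take comma-separated key=value pairs. In BuildKit's registry cache backend, `mode` is an export option, so it belongs on `--cache-to`, while `--cache-from` only needs `type` and `ref`. The dry-run sketch below prints the command rather than running it; the helper name is hypothetical, and the ref and image tag are the ones used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print a buildx command that imports from and exports to
# the same registry cache ref. It only prints; nothing is executed.
buildx_cache_cmd() (
  ref="$1"      # cache ref, e.g. HOST/project/repo/image:tag
  image="$2"    # tag for the built image
  printf 'docker buildx build -t %s --cache-from type=registry,ref=%s --cache-to type=registry,ref=%s,mode=max .\n' \
    "$image" "$ref" "$ref"
)

buildx_cache_cmd us-west2-docker.pkg.dev/project-id/cache/cache:latest xyz
```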

split marsh
#

when i run

docker buildx build --platform linux/amd64 -t xyz --cache-to=type=registry,ref=us-west2-docker.pkg.dev/project-id/cache/cache:latest,mode=max -f Dockerfile .

it builds and I can see the image and tag in the registry itself. But when I do


export _EXPERIMENTAL_DAGGER_CACHE_CONFIG="type=registry,ref=us-west2-docker.pkg.dev/project-id/cache/cache:latest,mode=max"

dagger call \
....

nothing updates in the registry, at least not for the cache

#

Also, I can't see logs from the Dagger container (0.12.4):

Error grabbing logs: invalid character '\x00' looking for beginning of value
versed marsh
#

We have integ tests for this that pass but I’ll see if I can repro anything there. I think the current tests don’t do “dagger call” since they were added pre-modules so will give that a try

split marsh
#

Any update on this @versed marsh ?

royal meadow
split marsh
#

thanks @royal meadow, I've been under pressure on this one because of the long build times 😦

wise blaze
#

@split marsh are you using self-hosted runners?

split marsh
#

No. That said, I'm using GHA on some projects, GCP Cloud Build on others, and AWS CodeBuild 😉 so I'm running Dagger on multiple clouds

wise blaze
split marsh
#

The one I'm getting the most pressure on actually runs on GCP Cloud Build, on regular managed Google runners

wise blaze
#

Oh I see, are the builds triggered from CI? or separately?

#

and if so what's the CI configuration?

split marsh
#

The pipeline itself is pretty simple. As you can see, I do everything in Dagger

```yaml
options:
  machineType: "E2_HIGHCPU_32"
  dynamicSubstitutions: true
  logging: CLOUD_LOGGING_ONLY

serviceAccount: <redacted>

steps:
  - name: "gcr.io/cloud-builders/docker"
    entrypoint: "bash"
    args:
      - "-c"
      - |
        apt update
        apt dist-upgrade -y
        apt install -y apt-transport-https ca-certificates gnupg curl
        curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg
        echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
        apt update
        apt install -y google-cloud-cli
        gcloud init
        
        curl -L https://dl.dagger.io/dagger/install.sh | sh -s -- -b /usr/local/bin
        mv ./bin/dagger /usr/local/bin
        dagger version
        
        cd devops/btd
        
        export DAGGER_CLOUD_TOKEN=$(gcloud secrets versions access latest --secret=DAGGER_CLOUD_TOKEN)
        
        bash run.sh "${BRANCH_NAME}" "${BUILD_ID}"

```
#

the run.sh script is

#!/usr/bin/env bash

BRANCH_NAME=$1
BUILD_ID=$2

dagger call \
  --working-dir=../../ \
  --git-branch="${BRANCH_NAME}" \
  --build-id="${BUILD_ID}" \
  --github-token=cmd:"gcloud secrets versions access latest --secret=DAGGER_GH_TOKEN" \
  --gcp-access-token=cmd:"gcloud auth print-access-token" \
  --libs-pull-gcp-sa-content=cmd:"gcloud secrets versions access latest --secret=LIBS-READ-WRITE-KEY" \
  --database-connection-name=cmd:"gcloud secrets versions access latest --secret=PLATFORM_BACKEND_CLOUD_SQL_CONNECTION_NAME" \
  --database-connection-string=cmd:"gcloud secrets versions access latest --secret=PLATFORM_BACKEND_DATABASE_URL" \
  run
raven tide
#

@split marsh are you sure about this?

#

I've just tried it with my personal GCR repository and it doesn't work

#

I even see this in my buildkit logs:

time="2024-08-14T21:20:24Z" level=warning msg="reference for unknown type: application/vnd.buildkit.cacheconfig.v0"
time="2024-08-14T21:20:27Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = error writing manifest blob: failed to open writer: unexpected status from HEAD request to https://us-east1-docker.pkg.dev/v2/$PROJECT/test/manifests/latest: 400 Bad Request"
#

you can follow that thread for updates

#

this is not related to Dagger

split marsh
#

Oh, got it. Thanks. Is there any other cache system I can use in this scenario?

wise blaze
#

@split marsh out of curiosity how is that cloud build triggered? Is it hooked up to git events, and if so how?

#

It looks like the event triggers are not configured in the build file itself

split marsh
#

Yes, it mirrors the repo, which is on GH, and every push to the repo triggers a new build. Yeah, the project owner does not want to use GHA 😦

wise blaze
#

That's fine, the whole point of Dagger is to make cross-CI portability easier 🙂

split marsh
#

and its great on it. few missing points dough

wise blaze
#

yes the dough is a missing point 😛

split marsh
#

That said, the question remains: what is the best way to cache the pipeline, outside of Dagger Cloud of course?

wise blaze
#

@split marsh soon there will be 2 solid answers:

  1. "configure a storage driver to use your favorite object storage service"
  2. "choose a hosting partner with great persistent caching builtin"

It's a high priority to implement both of these

#

@split marsh one solution you could implement right away, is to use self-hosted runners (if google cloud build supports this) with persistent or semi-persistent local storage.
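
One way this could look on a self-hosted runner is a long-lived engine container whose state directory sits on a named Docker volume. This is a sketch under several assumptions that the thread does not confirm: the engine image name, the `/var/lib/dagger` state directory, and the `_EXPERIMENTAL_DAGGER_RUNNER_HOST` variable should be checked against the docs for your Dagger version. The helper, container, and volume names are made up. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands for a long-lived
# engine whose state lives in a named Docker volume.
# Assumptions: engine image name/tag, the /var/lib/dagger state directory, and
# the _EXPERIMENTAL_DAGGER_RUNNER_HOST variable; verify them for your version.
persistent_engine_cmds() (
  name="${1:-dagger-engine-persistent}"   # container name (made up)
  volume="${2:-dagger-engine-cache}"      # volume name (made up)
  version="${3:-v0.12.4}"                 # engine version mentioned in this thread
  printf 'docker run -d --name %s --privileged -v %s:/var/lib/dagger registry.dagger.io/engine:%s\n' \
    "$name" "$volume" "$version"
  printf 'export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://%s\n' "$name"
)

persistent_engine_cmds
```

Because the volume outlives individual builds, later runs on the same machine can reuse the engine's local cache without any registry export.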

raven tide
split marsh
#

Sorry, typo. I meant it's on the BuildKit side
