#How does caching work


noble cypress
#

Hi there - it depends if you're on a persistent or ephemeral machine

#

On a persistent machine, caching works out of the box, using your local filesystem. No configuration needed.

#

On an ephemeral machine (e.g. your GHA runner), it depends on whether you want to use the "naive" buildkit export or the distributed cache service we're developing. Both are configured by passing env variables to the dagger CLI.

#

(but they're different env variables)

true bloom
#

"the dagger CLI" assumes I'm using the Dagger CLI instead of using the SDKs, no?

#

We've built a Go CLI that a user can point at their project repo and a miracle occurs and they get built output

noble cypress
#

Oh sorry - yes, this also works in your use case - the dagger CLI is still there, but instrumented by the Go SDK

#

(to make SDK development easier, we implemented the logic to open an engine session in the CLI (dagger session), and all SDKs instrument that under the hood)

#

So you can set the env variables in your own tool prior to initializing the Dagger Go client library, and they will be passed through to the underlying dagger CLI

true bloom
#

i.e. before dagger.Connect?

noble cypress
#

For debugging purposes, you can also wrap your own CLI with dagger run and set the env variables there. That gives the end user full control over how the session is established (including caching settings)

#

As long as you're using an official SDK, there should be no difference in the behavior between dagger run <your CLI here> and <your CLI here>.

#

meaning that you can set the env variables one way or the other, and your session will be configured the way you want it
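The wrapping route can be sketched in Go as well, assuming the dagger CLI is on PATH. `wrapWithDaggerRun` and `./mytool` are hypothetical names, and `type=example` is again a placeholder value, not a confirmed format.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// wrapWithDaggerRun (hypothetical helper) builds `dagger run <tool> <args...>`
// with one extra KEY=VALUE entry layered on top of the current environment.
func wrapWithDaggerRun(envKV string, tool string, args ...string) *exec.Cmd {
	cmd := exec.Command("dagger", append([]string{"run", tool}, args...)...)
	// When a key appears twice, os/exec uses the last entry, so envKV wins.
	cmd.Env = append(os.Environ(), envKV)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	cmd := wrapWithDaggerRun(
		"_EXPERIMENTAL_DAGGER_CACHE_CONFIG=type=example", // placeholder value
		"./mytool", "build",
	)
	// Print the command line instead of running it; cmd.Run() would start it.
	fmt.Println(cmd.Args)
}
```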

#

it's also generally useful to know that your own end users can wrap your CLI with dagger run if they wish to configure the dagger session themselves

true bloom
#

Do I need to pass anything to the engine itself?

#

Our CLI automatically starts an engine container if _DAGGER_EXPERIMENTAL_ENGINE_HOST isn't set.

noble cypress
#

For caching, I believe you don't need to. But @grand peak @violet musk @split depot @zenith spear can confirm

#

Also, you should know that manually controlling engine deployment, as you do, creates additional versioning headaches down the line (because CLI/engine/SDK versions must all be aligned)

#

We plan to address this by adding hooks to the Dagger CLI - so you can customize engine provisioning without having to wrap

#

(this is longer-term information - no action needed at the moment)

true bloom
#

Yeah, we're not planning to use the Dagger CLI because we're supplying a bunch of additional functionality

noble cypress
#

To clarify, I mean the Dagger CLI either used directly (not your use case) or wrapped/instrumented via an SDK (your use case)

#

In your case you would wrap anyway. It would just change how you wrap, with the benefit of not having to keep up with engine/CLI/SDK version matrices

true bloom
#

Where on the filesystem does the Dagger SDK cache stuff?

#

I was under the impression the caching was largely "server-side" i.e. in the engine

noble cypress
#

yes it’s all in the engine container

true bloom
#

Ah, so in /var/lib/dagger

#

What's the format for _EXPERIMENTAL_DAGGER_CACHE_CONFIG

low raft
#

To any Dagger team folks, how do you troubleshoot/diagnose caching issues when you develop Dagger? I'm seeing a lot of cache misses that seem like they should be hits. Dagger re-builds a Dockerfile even though the Git commit it comes from didn't change. I've never gotten --mount=type=cache in my Dockerfile to work. I'm almost convinced that I shouldn't be using Dockerfiles at all (e.g. should be using container().from().withRun()... in the Dagger SDK).

sly scaffold
# true bloom What's the format for `_EXPERIMENTAL_DAGGER_CACHE_CONFIG`

hey @true bloom, do you see any particular error messages in the engine logs? Other users were able to successfully configure this flag. ref: #1134307062009053294 message.

Also, as a quick note: if you follow that thread, you'll find that the default buildkit cache exporters (which you're configuring with this env variable) are known to be very limited and will very soon become underperformant and somewhat cumbersome to optimize. That's why Solomon mentioned that we're inviting some community members to get a preview of our caching solution, which is part of our commercial product. Let us know if you'd like to see what that's about.

true bloom
#

@sly scaffold I'm trying to determine how to provide the cache config. I'm interested in the GHA cache backend and thus want to supply multiple cache sources and a different cache destination based on the ref being built. Based on the code base, it looks like that isn't supported, but I could have just been misreading things, hence the ask.

#

Ideally I could use the DAGGER_CACHE_FROM and DAGGER_CACHE_TO from the Cue SDK.
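To make the ask concrete, here is a sketch of the selection logic only: which scopes to import from and which one to export to for a given ref. Whether `_EXPERIMENTAL_DAGGER_CACHE_CONFIG` can express more than one source is exactly the open question, so nothing here is a Dagger API; `scopeFor`, `cachePlan`, and the `dagger-` prefix are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// scopeFor turns a Git ref such as "refs/heads/feature/x" into a flat name
// usable as a cache key component (hypothetical naming scheme).
func scopeFor(ref string) string {
	name := strings.TrimPrefix(ref, "refs/heads/")
	name = strings.TrimPrefix(name, "refs/")
	return "dagger-" + strings.ReplaceAll(name, "/", "-")
}

// cachePlan returns the scopes to import from (most specific first) and the
// single scope to export to, for the ref being built.
func cachePlan(ref, defaultBranch string) (from []string, to string) {
	to = scopeFor(ref)
	from = []string{to}
	// Fall back to the default branch's cache when building another ref.
	if fallback := scopeFor("refs/heads/" + defaultBranch); fallback != to {
		from = append(from, fallback)
	}
	return from, to
}

func main() {
	from, to := cachePlan("refs/heads/feature/cache", "main")
	fmt.Println("import from:", from)
	fmt.Println("export to:", to)
}
```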

true bloom
#

So does no one use this with GHA then?

noble cypress
# true bloom So does no one use this with GHA then?

Yes, people do. Dagger is used with GHA in one of four ways at the moment:

  1. Self-hosted runners with persistent or semi-persistent local storage

  2. Ephemeral runners, no caching (slower but some users still get value from Dagger)

  3. Using our distributed caching service in early access

  4. Hacks with buildkit cache export