How does caching work | Dagger | Page 1

noble cypress Aug 16, 2023, 11:18 PM

#

Hi there - it depends if you're on a persistent or ephemeral machine

#

On a persistent machine, caching works out of the box, using your local filesystem. No configuration needed.

#

On an ephemeral machine (e.g. your GHA runner), it depends on whether you want to use the "naive" buildkit export or the distributed cache service we're developing. Both are configured by passing env variables to the dagger CLI.

#

(but they're different env variables)

true bloom Aug 16, 2023, 11:23 PM

#

"the dagger CLI" assumes I'm using the Dagger CLI instead of using the SDKs, no?

#

We've built a Go CLI that a user can point at their project repo and a miracle occurs and they get built output

noble cypress Aug 16, 2023, 11:31 PM

#

Oh sorry - yes this works also in your use cases - the dagger CLI is still there, but instrumented by the Go SDK

#

(to make SDK development easier, we implemented the logic to open an engine session in the CLI (dagger session), and all SDKs instrument that under the hood)

#

So you can set the env variables in your own tool prior to initializing the Dagger Go client library, and they will be passed through to the underlying dagger CLI

true bloom Aug 16, 2023, 11:32 PM

#

i.e. before dagger.Connect?
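
A minimal Go sketch of that ordering, assuming the variable in question is `_EXPERIMENTAL_DAGGER_CACHE_CONFIG` (named further down in this thread). The helper names and the registry ref are hypothetical, and the value format is an assumption modeled on buildkit-style `key=value` attributes. The `dagger.Connect` call itself is left as a comment so the sketch has no external dependencies.

```go
package main

import (
	"fmt"
	"os"
)

// cacheConfigEnv is the variable name mentioned in this thread.
const cacheConfigEnv = "_EXPERIMENTAL_DAGGER_CACHE_CONFIG"

// setCacheConfig exports the cache settings into this process's environment
// so that a session opened afterwards can inherit them.
func setCacheConfig(value string) error {
	return os.Setenv(cacheConfigEnv, value)
}

// currentCacheConfig reads the setting back from the environment.
func currentCacheConfig() string {
	return os.Getenv(cacheConfigEnv)
}

func main() {
	// Hypothetical value: the registry ref is a placeholder, not a real endpoint.
	cfg := "type=registry,ref=registry.example.com/ci-cache,mode=max"
	if err := setCacheConfig(cfg); err != nil {
		fmt.Fprintln(os.Stderr, "setting cache config:", err)
		os.Exit(1)
	}

	// Only now would the tool call dagger.Connect(ctx) from the Go SDK
	// (omitted here to keep the sketch dependency-free).
	fmt.Println(cacheConfigEnv + "=" + currentCacheConfig())
}
```

The point is only the order of operations: the environment is populated first, and the client is initialized afterwards.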

noble cypress Aug 16, 2023, 11:33 PM

#

For debugging purposes, you can also wrap your own CLI with dagger run and set the env variables there. That gives the end user full control over how the session is established (including caching settings)

#

As long as you're using an official SDK, there should be no difference in the behavior between dagger run <your CLI here> and <your CLI here>.

#

meaning that you can set the env variables one way or the other, and your session will be configured the way you want it

#

it's also generally useful to know that your own end users can wrap your CLI with dagger run if they wish to configure the dagger session themselves
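
As a rough illustration of the wrapping described above, here is a Go sketch that builds a `dagger run <your CLI>` invocation with an extra environment setting. The function name `wrapWithDaggerRun`, the tool path `./mytool`, and the config value are all hypothetical. The command is only constructed, not started, so nothing here depends on dagger being installed.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// wrapWithDaggerRun builds (but does not start) a "dagger run <tool> <args...>"
// command whose environment carries one extra KEY=VALUE setting.
func wrapWithDaggerRun(envSetting, tool string, args ...string) *exec.Cmd {
	cmd := exec.Command("dagger", append([]string{"run", tool}, args...)...)
	// Inherit the current environment and append the session setting.
	cmd.Env = append(os.Environ(), envSetting)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	// Hypothetical tool and cache config value, for illustration only.
	cmd := wrapWithDaggerRun(
		"_EXPERIMENTAL_DAGGER_CACHE_CONFIG=type=registry,ref=registry.example.com/ci-cache",
		"./mytool", "build",
	)
	fmt.Println(cmd.Args)
	// Calling cmd.Run() would execute the wrapped tool inside that session.
}
```

In a shell, the same idea is usually just the variable assignment placed in front of `dagger run`.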

true bloom Aug 16, 2023, 11:35 PM

#

Do I need to pass anything to the engine itself?

#

Our CLI automatically starts an engine container if _DAGGER_EXPERIMENTAL_ENGINE_HOST isn't set.

noble cypress Aug 16, 2023, 11:36 PM

#

For caching, I believe you don't need to. But @grand peak @violet musk @split depot @zenith spear can confirm

#

Also, you should know that manually controlling engine deployment, as you do, creates additional versioning headaches down the line (because CLI/engine/SDK versions must all be aligned)

#

We plan to address this by adding hooks to the Dagger CLI - so you can customize engine provisioning without having to wrap

#

(this is longer-term information - no action needed at the moment)

#

--> https://github.com/dagger/dagger/issues/5583

GitHub

Swappable Engine Drivers in the CLI · Issue #5583 · dagger/dagger

Problem The Dagger CLI has a builtin “engine driver”: a software interface which allows it to install, run and manage a Dagger Engine. This engine driver cannot be customized or swapped out, so use...

true bloom Aug 16, 2023, 11:38 PM

#

Yeah, we're not planning to use the Dagger CLI because we're supplying a bunch of additional functionality

noble cypress Aug 16, 2023, 11:39 PM

#

To clarify, I mean the Dagger CLI either used directly (not your use case) or wrapped/instrumented via an SDK (your use case)

#

In your case you would wrap anyway. It would just change how you wrap, with the benefit of not having to keep up with engine/CLI/SDK version matrices

true bloom Aug 16, 2023, 11:42 PM

#

Where on the filesystem does the Dagger SDK cache stuff?

#

I was under the impression the caching was largely "server-side" i.e. in the engine

noble cypress Aug 17, 2023, 12:33 AM

#

yes it’s all in the engine container

true bloom Aug 17, 2023, 12:50 AM

#

Ah, so in /var/lib/dagger

#

What's the format for _EXPERIMENTAL_DAGGER_CACHE_CONFIG

#

I see this, but that doesn't let me do e.g. type=gha,scope=foo for the from and type=gha,scope=bar for the to. https://github.com/dagger/dagger/blob/75cb4a9fc7a4596fad23ac0656044b157b853800/core/integration/remotecache_test.go#L52

low raft Aug 17, 2023, 4:53 PM

#

To any Dagger team folks, how do you troubleshoot/diagnose caching issues when you develop Dagger? I'm seeing a lot of cache misses that seem like they should be hits. Dagger re-builds a Dockerfile even though the Git commit it comes from didn't change. I've never gotten --mount=type=cache in my Dockerfile to work. I'm almost convinced that I shouldn't be using Dockerfiles at all (e.g. should be using container().from().withRun()... in the Dagger SDK).

noble cypress Aug 17, 2023, 5:33 PM

#

low raft To any Dagger team folks, how do you troubleshoot/diagnose caching issues when y...

We use the Dagger Cloud dashboard to investigate 🙂 Currently in early access. It displays basic caching information, and we're working to add more advanced caching visibility features (as requested by e.g. @coarse drum https://github.com/dagger/dagger/issues/5601)

sly scaffold Aug 18, 2023, 4:42 AM

#

true bloom What's the format for `_EXPERIMENTAL_DAGGER_CACHE_CONFIG`

hey @true bloom , do you see any particular error messages in the engine logs? Other users were able to successfully configure this flag. ref: #1134307062009053294 message.

Also, as a quick note: if you follow that thread, you'll find that the default buildkit cache exporters (which you're configuring with this env variable) are known to be very limited; they quickly become underperformant and are somewhat cumbersome to optimize. That's why Solomon mentioned that we're inviting some community members to get a preview of our caching solution, which is part of our commercial product. Let us know if you'd like to see what that's about.

true bloom Aug 18, 2023, 11:28 AM

#

@sly scaffold I'm trying to determine how to provide the cache config. I'm interested in the GHA cache backend and thus want to supply multiple cache sources and a different cache destination based on the ref being built. Based on the code base, it looks like that isn't supported, but I could have just been misreading things, hence the ask.

#

Ideally I could use the DAGGER_CACHE_FROM and DAGGER_CACHE_TO from the Cue SDK.

sly scaffold Aug 18, 2023, 1:58 PM

#

true bloom Ideally I could use the `DAGGER_CACHE_FROM` and `DAGGER_CACHE_TO` from the Cue S...

oh yes, I misread the previous statement. The current cache config doesn't let you specify different import and export configurations
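
Given that limitation, one possible workaround is to derive a single config per ref, so each ref imports from and exports to the same scope. The Go sketch below assumes that a `type=gha,scope=...` value (the shape proposed earlier in this thread) is accepted by `_EXPERIMENTAL_DAGGER_CACHE_CONFIG`, which is not confirmed here. The function name `ghaCacheConfig` and the ref-to-scope mapping are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ghaCacheConfig returns one cache config string whose scope is derived from
// the Git ref, so a given ref always reads and writes the same scope.
func ghaCacheConfig(ref string) string {
	// Strip the common ref prefixes and replace "/" so the scope is a flat name.
	name := strings.TrimPrefix(ref, "refs/heads/")
	name = strings.TrimPrefix(name, "refs/tags/")
	scope := strings.ReplaceAll(name, "/", "-")
	return "type=gha,scope=" + scope
}

func main() {
	fmt.Println(ghaCacheConfig("refs/heads/feature/login"))
}
```

The trade-off is that a new branch starts with a cold cache, since it cannot import from another ref's scope while exporting to its own.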

true bloom Aug 18, 2023, 3:00 PM

#

So does no one use this with GHA then?

noble cypress Aug 18, 2023, 3:14 PM

#

true bloom So does no one use this with GHA then?

Yes, Dagger is used with GHA in one of 4 ways at the moment:

Self-hosted runners with persistent or semi-persistent local storage
Ephemeral runners, no caching (slower but some users still get value from Dagger)
Using our distributed caching service in early access
Hacks with buildkit cache export
