#Duplicating another cache volume on a cache miss


proper skiff
#

Say I want to maintain separate cache volumes for different pipeline steps and branches, so I use <BRANCH> and <PIPELINE-STEP> as part of the cache key (e.g. java-ci_build-step_main). I would like feature branches to start from the default branch's (main) cache as their base, to speed up the initial build on a PR, but without writing back to it, since I want cache isolation. GHA cache has a somewhat similar property: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

Basically, I'm looking for an additional argument, base_cache, for the Container.with_mounted_cache method: if the requested cache doesn't exist, duplicate the cache found in base_cache and use that copy as the base (without altering the original). If base_cache doesn't exist either, just create the cache entry as happens now.

Is there an easier way to achieve what I want? Staying within Dagger would be the most convenient.
If it's not possible, would this make sense as a feature request? I can open an issue.
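
A rough sketch of the fallback semantics described above, modelled on plain directories rather than real Dagger cache volumes. The names cache_key and resolve_cache are hypothetical helpers and are not part of any Dagger API; the copy is a full copy here, whereas a real implementation would presumably do something cheaper.

```python
import shutil
from pathlib import Path
from typing import Optional


def cache_key(pipeline: str, step: str, branch: str) -> str:
    """Build a key such as 'java-ci_build-step_main' (hypothetical naming scheme)."""
    return f"{pipeline}_{step}_{branch}"


def resolve_cache(root: Path, key: str, base_key: Optional[str] = None) -> Path:
    """Return the directory for `key`, seeding it from `base_key` on a miss.

    This only simulates the requested behaviour with directories on disk.
    """
    target = root / key
    if target.exists():
        # Cache hit: use the existing entry as-is.
        return target
    base = root / base_key if base_key else None
    if base is not None and base.is_dir():
        # Cache miss, but the base exists: duplicate it so that later writes
        # to `target` never touch the base.
        shutil.copytree(base, target)
    else:
        # Neither the cache nor the base exists: start with an empty entry.
        target.mkdir(parents=True)
    return target
```

With this model, a feature branch would call resolve_cache(root, cache_key("java-ci", "build-step", "feature-x"), base_key=cache_key("java-ci", "build-step", "main")) and get its own copy of the main cache on first use.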

pliant estuary
proper skiff
#

Hi, I'm not using it; I'd expect this to work locally for a single Dagger Engine.

proper skiff
#

Would it make sense to have something like this? Just having the ability to duplicate cache volumes would be great, since I could implement the rest myself, but I think people would find the other parts helpful in their workflows too.

random ember
#

The closest thing we have to this is the source argument to with_mounted_cache, which lets you specify a base directory for the cache volume (this is done efficiently with overlay). But it sounds like you want another cache volume to be the base? In that case, yeah, that's not possible yet.
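
For reference, a minimal sketch of what using the source argument might look like with the Python SDK. The helper name mount_seeded_cache and its parameters are made up for illustration; only dag.cache_volume and Container.with_mounted_cache (with its source keyword) are actual SDK calls. The dag client and the container are passed in so the snippet does not depend on how the surrounding module is set up.

```python
from typing import Any


def mount_seeded_cache(dag: Any, ctr: Any, key: str, path: str, seed: Any) -> Any:
    """Mount cache volume `key` at `path`, with directory `seed` as its base.

    `dag` is expected to be the Dagger client, `ctr` a Container and `seed`
    a Directory; they are typed as Any to keep the sketch self-contained.
    """
    volume = dag.cache_volume(key)
    # `source` takes a Directory, not another cache volume, which is why it
    # does not cover the cache-to-cache duplication asked about in this thread.
    return ctr.with_mounted_cache(path, volume, source=seed)
```

In a real pipeline, seed would have to be a directory you can produce yourself (for example, one exported from an earlier build), which is the extra work mentioned in the reply that follows.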

proper skiff
#

Yeah, I think having another cache volume as the base would be the cleanest. Using a source directory could work, but it would require extra work in some cases.

For example, I was hoping I could share a single Dagger Engine across multiple nodes and use its cache (from what I gather, that's the most common way to solve distributed caching when you want to run it locally, i.e. without sending any data out). If I went the source directory route, I'd likely have to introduce another cache server like Memcached to manage that source directory on the nodes.
However, if I'm understanding all of this correctly, I'd still need the option to use another cache volume as the base to accomplish what I want, even when using Dagger's distributed cache.

Are there any plans to add something like cache volume duplication (or is it even possible)? I could write these up in more detail as feature suggestions in GitHub issues if that makes sense, but I'm not sure if this is a direction the Dagger team would go in.

pliant estuary
#

For example, I was hoping I could share a single Dagger Engine across multiple nodes and use its cache

By "across multiple nodes", do you mean multiple client nodes?

#

Just asking in order to clarify that it's not possible to run a single Dagger Engine across multiple nodes.