#Understanding Dagger caching in depth: function caching


steep hearth
#

I want to understand precisely what Dagger considers when determining a cache hit/miss.

I've got a function that should be cached when there are no changes in a directory, and not cached otherwise. That directory is attached to a container, and that container (plus another container) is passed as an argument to that function. Those containers are created in a previous method with other inputs, which is why I think this isn't caching as I'd expect.

E.g.: with-config --dir . --workload-ver tag1 function-to-cache -> when --workload-ver changes (its value is attached to a container with WithEnvVariable), the function to cache doesn't hit the cache, it runs. Is there a way to write that function so that it doesn't matter which tag is pulled: it runs only if there are changes in the --dir . directory, and is served from cache otherwise?

Trace: https://dagger.cloud/mjb/traces/e892c1d167c5541ef90273b87b54c9c6. Of the two long-running functions that ran without caching, the first should only care about the directory contents, the second should run when the --workload-ver changes as usual.

#

(I'm working on the assumption that because the underlying container that is passed as an argument has changed, the function runs and isn't a cache hit. If that's the case, I don't think what I'm describing is possible?)

#

(If I run the function again with the same parameter values it produces a cache hit as expected)
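
To make the assumption above concrete, here is a toy model in plain Go of an input-keyed function cache. It is only a sketch: `Container`, `Memo`, and the hashing scheme are hypothetical stand-ins, not Dagger types or Dagger's real cache-key algorithm. It assumes that everything defining a container argument, env var values included, feeds the key.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// Container is a hypothetical stand-in for a container definition:
// a directory digest plus env vars. It is not a Dagger type.
type Container struct {
	DirDigest string
	Env       map[string]string
}

// key hashes everything that defines the container, env var values included.
func (c Container) key() string {
	names := make([]string, 0, len(c.Env))
	for n := range c.Env {
		names = append(names, n)
	}
	sort.Strings(names) // map iteration order is random, so sort for a stable key
	h := sha256.New()
	fmt.Fprintf(h, "dir=%s\n", c.DirDigest)
	for _, n := range names {
		fmt.Fprintf(h, "env:%s=%s\n", n, c.Env[n])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// Memo stores one result per input key and counts real executions.
type Memo struct {
	results map[string]string
	Runs    int
}

func NewMemo() *Memo { return &Memo{results: map[string]string{}} }

// Call runs fn only when no result is stored for the container's key.
func (m *Memo) Call(c Container, fn func(Container) string) string {
	k := c.key()
	if r, ok := m.results[k]; ok {
		return r // cache hit: fn is not executed
	}
	m.Runs++
	r := fn(c)
	m.results[k] = r
	return r
}

func main() {
	memo := NewMemo()
	migrate := func(c Container) string { return "migrated " + c.DirDigest }

	tag1 := Container{DirDigest: "abc", Env: map[string]string{"TF_VAR_image_tag": "tag1"}}
	tag2 := Container{DirDigest: "abc", Env: map[string]string{"TF_VAR_image_tag": "tag2"}}

	memo.Call(tag1, migrate) // first call: runs
	memo.Call(tag1, migrate) // identical input: cache hit
	memo.Call(tag2, migrate) // same directory, different env value: runs again
	fmt.Println("executions:", memo.Runs)
}
```

Under this model the second call with identical inputs is a hit, while changing only the env var value forces a rerun even though the directory is unchanged, which matches the behaviour described in this thread.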

haughty hornet
steep hearth
#

I will try to tomorrow. Quite a large module, will pull out the bits necessary in the morning

haughty hornet
#

many things could be happening here

#

also, sharing two consecutive cloud traces would also be useful

#

so we can see what's the difference between those and where the cache seems to bust

steep hearth
haughty hornet
steep hearth
#

Yes - that was a second run with no changes. What I'm asking is, how can I control the caching such that a change to an environment variable on a container passed to a function doesn't invalidate the cache, if that's even possible. I'm hoping to get to a point where the migrateDb function only runs if there are new or changed migration files in a specific directory, ignoring all other changes.

haughty hornet
#

you can add it to the Container as a secret with WithSecretVariable

steep hearth
#

This is basically a "how do I best architect my dagger module" question under the hood. This feels doable, but maybe I shouldn't be re-using the same container across multiple functions and should be creating fresh containers for each, mounting the caches, and hoping the creation of new containers is fast due to caching

steep hearth
haughty hornet
steep hearth
#

They're needed. Basically migrateDb and deployMs pull an image from a tag the user provides as a function parameter.

The first function runs the database migrations based on the xml files in a specific dir in that image.

The second function doesn't even look at that dir, it just runs the image entrypoint (the app).

Both need the --workload-ver input, but I'd like the first to ignore it for cache reasons.... So I'd have to split that off the current 'base' container they share, add it as a secret in the migrateDb function, then add it as an env var in the deployMs function?

haughty hornet
steep hearth
#

Yep, just done that

haughty hornet
#

that way, migrateDb doesn't get affected if workloadVersion changes

steep hearth
#
// Add image tag as a secret so it doesn't invalidate the cache
tfCtr = tfCtr.WithSecretVariable("TF_VAR_image_tag", dag.SetSecret("WorkloadVer", workloadVer))

tfCtr, err := m.runTerraform(ctx, tfCtr, command, MigrationTerraformName, MigrationTerraformPath)

and

// Add image tag as a variable so it does invalidate the cache
tfCtr = tfCtr.WithEnvVariable("TF_VAR_image_tag", workloadVer)

_, err := m.runTerraform(ctx, tfCtr, command, DeployTerraformName, DeployTerraformPath)
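
Why the two variants above behave differently can be sketched with a toy cache-key function in plain Go. This is an illustration under a stated assumption, taken from what is reported in this thread: a secret contributes only its name to the key, never its plaintext, while an env var contributes name and value. `Binding` and `cacheKey` are hypothetical names, not part of the Dagger SDK, and the hashing is not Dagger's real algorithm.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Binding is a hypothetical variable attached to a container definition.
type Binding struct {
	Name   string
	Value  string
	Secret bool
}

// cacheKey hashes the directory digest and the bindings in order.
// Assumption: a secret contributes only its name, never its plaintext.
func cacheKey(dirDigest string, bindings []Binding) string {
	h := sha256.New()
	fmt.Fprintf(h, "dir=%s\n", dirDigest)
	for _, b := range bindings {
		if b.Secret {
			fmt.Fprintf(h, "secret:%s\n", b.Name)
			continue
		}
		fmt.Fprintf(h, "env:%s=%s\n", b.Name, b.Value)
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// asSecret and asEnv build the single binding used by each variant.
func asSecret(tag string) []Binding {
	return []Binding{{Name: "TF_VAR_image_tag", Value: tag, Secret: true}}
}

func asEnv(tag string) []Binding {
	return []Binding{{Name: "TF_VAR_image_tag", Value: tag}}
}

func main() {
	// migrateDb variant: tag passed as a secret, so the key ignores its value.
	fmt.Println("secret variant, same key across tags:",
		cacheKey("abc", asSecret("tag1")) == cacheKey("abc", asSecret("tag2")))

	// deployMs variant: tag passed as a plain env var, so the key follows its value.
	fmt.Println("env variant, same key across tags:",
		cacheKey("abc", asEnv("tag1")) == cacheKey("abc", asEnv("tag2")))

	// Either variant still reacts to a change in the directory contents.
	fmt.Println("secret variant, same key across dir change:",
		cacheKey("abc", asSecret("tag1")) == cacheKey("def", asSecret("tag1")))
}
```

In this model the migration step keeps its key when only the tag changes but still reruns when the directory digest changes, which is the behaviour being aimed for with migrateDb.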
haughty hornet
#

👍

#

there u go

steep hearth
#

It is a bit hacky but if it works I'll leave it in

haughty hornet
steep hearth
#

Logging off, quarter to 6 here, will test fully in the morning

#

Thanks for the suggestion, hopefully works as intended!

haughty hornet
steep hearth
#

@haughty hornet Does WithMountedCache invalidate the cache of a function? The cache contents are the same as the last run until it's potentially written to, so I'd think not?

steep hearth
#

I think I'm getting pretty close to a point where a Terraform container with a load of stuff added (dirs, envvars, mounted caches, couple of withexecs) is not invalidating the cache because it's all constant or changes infrequently, should be able to confirm this afternoon

steep hearth
#

Think I've finished with this. Shared Terraform container across four functions; the container has a versioned directory (passed in from a parameter), a number of string parameters added as env vars, four mounted caches, and some secrets, and I'm able to correctly cache some of those functions with the right inputs, so leaving it there for now. I'll need to move the git credentials from a plain file to a secret file for the caching to persist in GitLab (the job token changes every job), but otherwise should be good.