#Sometimes, functions with no modifications in any of their layers do not use the cache. Why?
Should we care to exclude .dagger and .dagger.json when copying the root directory?
That didn't help. The cache is still randomly not used 🤷🏻‍♂️
Here's the issue: https://docs.dagger.io/getting-started/types/directory/#directory-evaluation
Sounds like self-inflicted user pain?
are you able to set up tracing in dagger cloud? it will help visualize
you can press w from the tui while dagger is running
(for "web")
I'm using it. But it would be helpful to get a warning or something because one doesn't expect that a core behavior of the function changes depending on how the same input value is passed to the function (either via "default" or explicitly).
it provides doesn't
chances are you have an actual input changed, perhaps an unrelated file that is not filtered out
So maybe I'm misreading this?
When relying on default paths in Dagger Shell, it's important to know that the source file or directory is re-evaluated on each command execution within the shell session. This differs from passing the source explicitly as an argument, where it's evaluated once and cached. This re-evaluation can lead to unintended behavior where changes to the source directory during the session (such as through exports or logs) invalidate the cache, causing the entire pipeline to re-execute.
I'm not sure what that sentence is saying..
Either way, the most likely root cause remains unrelated files (like the exports or logs mentioned in that paragraph) that get included in your input directories. You can use pre-call filtering (e.g. the +ignore pragma in Go) to filter out the noise
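To make the point about noise concrete, here is a minimal toy model in TypeScript. It is not Dagger's implementation: `fingerprint` is a hypothetical stand-in for a content hash, and the ignore list is simplified to directory prefixes rather than real glob patterns. It only shows why an unrelated log file changes the identity of an input directory unless it is filtered out first.

```typescript
// Toy model only: NOT Dagger's implementation. It illustrates why unfiltered
// files change the identity of an input directory.

type Files = Record<string, string>; // path -> content

// Hypothetical stand-in for a content hash: a canonical string built from the
// paths (and contents) that are not under an ignored prefix.
function fingerprint(files: Files, ignoredPrefixes: string[] = []): string {
  return Object.keys(files)
    .filter((path) => !ignoredPrefixes.some((prefix) => path.startsWith(prefix)))
    .sort()
    .map((path) => `${path}=${files[path]}`)
    .join("\n");
}

const before: Files = {
  "package.json": '{"name":"demo"}',
  "src/index.ts": "export const x = 1;",
  "logs/run.log": "first run",
};

// Same sources, but a previous run appended to a log file.
const after: Files = { ...before, "logs/run.log": "first run, second run" };

// Without filtering, the log change alters the directory's identity.
const unfilteredChanged = fingerprint(before) !== fingerprint(after);

// With "logs/" ignored up front, both snapshots look identical.
const filteredChanged =
  fingerprint(before, ["logs/"]) !== fingerprint(after, ["logs/"]);
```

In this sketch `unfilteredChanged` is true and `filteredChanged` is false, which is the effect pre-call filtering is meant to achieve.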
Maybe @twin jetty can help? https://github.com/dagger/dagger/commit/8d656607e3756da2be205c19915c42504603fe90
I'll create a small repro
Maybe this helps? https://github.com/dagger/dagger/issues/10667
Ah I see
I think that's unrelated to your issue @tawny shuttle . I would focus on finding the input directory that gets invalidated (using dagger cloud to find non-cached operations upstream of the one being invalidated), then using pre-call filtering to remove the noise
Ok, thanks. I'll try right after I get out this other rabbit hole 🙂 https://discord.com/channels/707636530424053791/1425783514820776027
Ok, maybe there's a design flaw in my function pipeline. I'm setting up a CI for a monorepo and I've defined granular CI functions per package, in order to make use of the cache. There's a base function that creates a base container (e.g. node:22.17) and installs dependencies etc. Most of the CI functions have actions that require monorepo dependencies. Here's what it looks like:
@object()
export class MyDaggerStuff {
  private source: Directory;

  constructor(
    @argument({ defaultPath: ".", ignore: [".git/**", "**/node_modules/**", "**/coverage/**", "**/dist/**"] })
    source: Directory
  ) {
    this.source = source;
  }

  // ...

  /**
   * Base container with source code and dependencies installed
   */
  @func()
  async base(): Promise<Container> {
    return dag
      .container()
      .from("node:22.17")
      .withWorkdir("/app")
      .withDirectory("/app", this.source, {
        include: [
          // minimal set of files to install dependencies across monorepo packages
        ],
      })
      .withExec(["corepack", "enable"])
      .withExec(["pnpm", "install", "--frozen-lockfile"])
      .sync();
  }

  /**
   * Function that doesn't require any monorepo dependency, only the package itself
   */
  @func()
  async formatMyPackage(): Promise<string> {
    const base = await this.base();
    const container = base
      .withDirectory("/app", this.source, { include: ["my-package/**"] })
      .withExec(["pnpm", "--filter", "@packages/my-package", "format"]);
    await container.sync();
    return "@packages/my-package: passed";
  }

  /**
   * Function that requires a monorepo dependency
   */
  @func()
  async lintMyPackage(): Promise<string> {
    const container = await this.withMyPackageDeps();
    await container.withExec(["pnpm", "--filter", "@packages/my-package", "lint"]).sync();
    return "@packages/my-package: passed";
  }

  private async withMyPackageDeps(): Promise<Container> {
    const base = await this.base();
    return base
      .withDirectory("/app", this.source, { include: ["my-package/**"] })
      .withDirectory("/app", this.source, { include: ["my-package-dependency/**"] });
  }
}
So I thought that the cache layers would depend on the withDirectory calls, but it seems like a different hash is produced for the source whenever any file inside the . directory or its subdirectories changes, unless it is listed in the ignore.
I'm wondering if working with withFiles / withoutFiles is the proper way to accomplish what I want here.
If source invalidates the actions that depend on it (all the actions that need files?), does this mean the only way I can get this granular cache behavior I'm trying to achieve on each of the function executions is by having a single dagger call per function with a specific source parameter defined in the source call? 🤔
cc @wicked scaffold 👆
Simplified example: #general message
Because the constructor takes source as an argument, any change to it will invalidate the cache of everything that uses this.source
One pattern that I implemented, and that I noticed improved the cache hit rate, was to use defaultPath on every entrypoint function.
Here's an example for a ci in go:
func (d *DagbenchCi) Build(
	ctx context.Context,
	//+ignore=["**", "!**/*.go", "!go.mod", "!go.sum", ".dagger/"]
	//+defaultPath="/"
	source *dagger.Directory,
	//+optional
	platform dagger.Platform,
) (_ *dagger.File, err error) {
	if platform == "" {
		platform, err = dag.DefaultPlatform(ctx)
		if err != nil {
			return nil, err
		}
	}

	return dag.
		Go(source).
		Build(dagger.GoBuildOpts{
			Platform: platform,
		}).File("bin/dagbench.io"), nil
}

func (d *DagbenchCi) Lint(
	ctx context.Context,
	//+ignore=["**", "!**/*.go", "!go.mod", "!go.sum", "!.golangci.yml", ".dagger/"]
	//+defaultPath="/"
	source *dagger.Directory,
) (string, error) {
	return dag.
		Container().
		From("golangci/golangci-lint:v2.5-alpine").
		WithDirectory("/app", source).
		WithWorkdir("/app").
		WithExec([]string{"golangci-lint", "run"}).
		Stdout(ctx)
}
It involves a bit of code duplication, but it allows a better cache hit rate because I can specify a different pre-call ignore filter for every function
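The effect of giving each function its own pre-call filter can be sketched with a toy model in TypeScript. Everything here is hypothetical (`MemoizedStep`, `fingerprint`, and the file names are made up for illustration) and it does not reflect how the Dagger engine is implemented. Each step simply memoizes on the files that pass its own filter, mirroring the Build and Lint filters above.

```typescript
// Toy model only, not Dagger's engine: each step memoizes its result on a
// fingerprint of the files that pass its own filter.

type Files = Record<string, string>; // path -> content
type Filter = (path: string) => boolean;

// Hypothetical stand-in for a content hash of the filtered directory.
function fingerprint(files: Files, keep: Filter): string {
  return Object.keys(files)
    .filter(keep)
    .sort()
    .map((path) => `${path}=${files[path]}`)
    .join("\n");
}

class MemoizedStep {
  runs = 0; // how many times the step really executed (cache misses)
  private name: string;
  private keep: Filter;
  private cache = new Map<string, string>();

  constructor(name: string, keep: Filter) {
    this.name = name;
    this.keep = keep;
  }

  call(files: Files): string {
    const key = fingerprint(files, this.keep);
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached; // cache hit: nothing re-runs
    this.runs++;
    const result = `${this.name}: passed`;
    this.cache.set(key, result);
    return result;
  }
}

const isGoSource: Filter = (p) => p.endsWith(".go") || p === "go.mod" || p === "go.sum";
const build = new MemoizedStep("build", isGoSource);
const lint = new MemoizedStep("lint", (p) => isGoSource(p) || p === ".golangci.yml");

const v1: Files = {
  "main.go": "package main",
  "go.mod": "module demo",
  ".golangci.yml": "version: 2",
  "README.md": "first draft",
};
const v2: Files = { ...v1, "README.md": "second draft" }; // docs-only change
const v3: Files = { ...v2, ".golangci.yml": "version: 2 # tweaked" }; // linter config change

for (const files of [v1, v2, v3]) {
  build.call(files);
  lint.call(files);
}
```

After the loop, `build.runs` is 1 and `lint.runs` is 2: the README edit re-runs nothing, and the linter config edit re-runs only the lint step.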
I think it really depends what you want to do with your module.
But post-call filtering will produce fewer cache hits than pre-call filtering, because we do not recompute the content hash when calling withDirectory. So even if you filter things out of that directory, you may not get a hit on the execution cache
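A rough way to picture the difference, as a toy model in TypeScript: the key derivation below is an assumption made purely for illustration (the names `preCallKey` and `postCallKey` are hypothetical), not the engine's actual code. It only encodes the idea from this thread that a post-call filtered directory is identified by its unfiltered parent plus the filter arguments, while a pre-call filtered directory is identified by what survives the filter.

```typescript
// Toy model of the behaviour described in this thread. The key derivation is
// an illustrative assumption, not Dagger's actual engine code.

type Files = Record<string, string>; // path -> content

// Hypothetical stand-in for a content hash of the files that pass `keep`.
const contentId = (files: Files, keep: (path: string) => boolean): string =>
  Object.keys(files)
    .filter(keep)
    .sort()
    .map((path) => `${path}=${files[path]}`)
    .join("\n");

const everything = (_path: string): boolean => true;
const inMyPackage = (path: string): boolean => path.startsWith("my-package/");

// Pre-call filtering: the directory is filtered before it is loaded, so its
// identity reflects only the files that survive the filter.
const preCallKey = (files: Files): string => contentId(files, inMyPackage);

// Post-call filtering: the whole directory is loaded first, and the filtered
// result is identified by "filter arguments + identity of the parent".
const postCallKey = (files: Files): string =>
  `withDirectory(include=my-package/**) <- ${contentId(files, everything)}`;

const v1: Files = {
  "my-package/index.ts": "export const a = 1;",
  "other-package/index.ts": "export const b = 1;",
};
// Only a file OUTSIDE my-package changes.
const v2: Files = { ...v1, "other-package/index.ts": "export const b = 2;" };

const preCallHit = preCallKey(v1) === preCallKey(v2);
const postCallHit = postCallKey(v1) === postCallKey(v2);
```

Under these assumptions `preCallHit` is true and `postCallHit` is false, even though both approaches end up with the same files in the container.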
However this should change once we complete https://github.com/dagger/dagger/issues/10367
So, as of now, the best we can do is atomic dagger calls, so we can use the defaultPath / ignore instructions to narrow down the caching, right?
I think so yes, until we complete 10367
But the DX should stay quite simple thanks to defaultPath, you don't need to have 10 different flags
You can also try to group functions based on their common source
It's just the orchestration layer that becomes a bit complex.
@wicked scaffold can you give the example in TS?
A better pattern is to have several arguments to the constructor
Each with a specialized job
That way you get your cake (better cache granularity) and eat it too (less repetitive)
Example in our CI: https://github.com/dagger/dagger/blob/main/version/main.go#L20-L40
export class Example {
  @func()
  build(
    @argument({ defaultPath: "/", ignore: ["**", "..."]})
    source: Directory,
    platform?: Platform
  ): File {
    ...
  }

  @func()
  lint(
    @argument({ defaultPath: "/", ignore: ["**", "..."]})
    source: Directory
  ): Promise<string> {
    ...
  }
}
Yeah, that also works fine! That way you can keep your orchestration but use a different source depending on which function you're calling
That would look like:
export class Example {
  repo: Directory
  project: Directory
  sourceCode: Directory
  dependencies: Directory

  constructor(
    @argument({ defaultPath: "/", ignore: [".git"]})
    repo: Directory,
    @argument({ defaultPath: "/", ignore: ["**", "!**/*.ts", "!tsconfig.json", "!package.json", "..."]})
    project: Directory,
    ...
  ) { ... }

  @func()
  lint(...) {
    // use this.dependencies, this.project etc.
  }
}
Note for later: we should really update the boilerplate in dagger init to illustrate things like that better