#unclear caching behavior

1 messages · Page 1 of 1 (latest)

low pivot
#

hey folks. i am noticing that the following step unpredictably needs to be rebuilt (quite often), even if nothing about my source or dagger code has changed:

      ● Container.directory(path: "/src"): Directory! 9.2s
        ✔ cache request: copy / /dagger 0.0s
        ✔ cache request: [internal] creating dagger metadata 0.0s
        ✔ cache request: oci-layout://dagger/import@sha256:6313b8ab78fd9cc0e46e4311e52f20f41bdf52685a610bd8ee4a30cb14ad2ff8 0.0s
        ✔ load cache: copy / /dagger 0.0s
        ✔ merge 0.0s
        ✔ rm /dagger/dagger.gen.go 0.0s
        ✔ load cache: creating dagger metadata 0.0s
        ✔ load cache: mkfile /schema.json 0.0s
        ● exec /usr/local/bin/codegen --output /src --module-context-path /src/dagger --module-name dtest --introspection-json-path /schema.json 9.1

my source is a dead simple hello world with a test file, and my dagger fns look like this

func (m *Dtest) Build(dir *Directory) *Container {
    return dag.Container().From("golang:latest").
        WithMountedDirectory("/mnt", dir).
        WithWorkdir("/mnt")
}

func (m *Dtest) Test(ctx context.Context, dir *Directory) (string, error) {
    return m.Build(dir).WithExec([]string{"go","test"}).Stdout(ctx)
}

i'm running this with dagger call test --dir=. -vvv.
how can i predict when the cache will be invalidated? this causes runtimes to vary by an order of magnitude. thanks

#

currently it seems to be happening every run. so the test file that has no deps and is asserting arithmetic takes 30+ seconds to run.

#

will the engine logs show me the reason the cache is stale?

#

the modcache of this step also is seemingly invalidated each time:

    ● Module.initialize: Module! 4.8s
    ● exec go build -o /runtime . 4.8s
    ┃ go: downloading golang.org/x/sys v0.19.0                                                                                                      
    ┃ go: downloading go.opentelemetry.io/proto/otlp v1.2.0                                                                                         
    ┃ go: downloading google.golang.org/genproto/googleapis/rpc v0.0.0-20240401170217-c3f982113cda                                                  
    ┃ go: downloading google.golang.org/protobuf v1.33.0                                                                                            
    ┃ go: downloading github.com/go-logr/stdr v1.2.2                                                                                                
    ┃ go: downloading github.com/cenkalti/backoff/v4 v4.3.0                                                                                         
    ┃ go: downloading github.com/grpc-ecosystem/grpc-gateway/v2 v2.19.1                                                                             
    ┃ go: downloading github.com/google/uuid v1.6.0                                                                                                 
    ┃ go: downloading github.com/sosodev/duration v1.2.0                                                                                            
    ┃ go: downloading golang.org/x/net v0.23.0                                                                                                      
    ┃ go: downloading google.golang.org/genproto/googleapis/api v0.0.0-20240227224415-6ceb2ff114de                                                  
    ┃ go: downloading golang.org/x/text v0.14.0                                                                                                     
#

hm... this looks a little weird... (docker logs)

time="2024-06-05T00:24:28Z" level=error msg="function call errored" client_hostname=gt-workstation client_id=j4zxm339fuwoahqn120mwzb7u error="call constructor: process \"go build -o /runtime .\" did not complete successfully: exit code: 137" function= module=dtest server_id=paavz281xgqfbk2mty8bpf6kj
time="2024-06-05T00:24:28Z" level=warning msg="error resolving module.withSource(source: {moduleSource(refString: \".\").withContextDirectory(dir: {blob(digest: \"sha256:3ce85961a125596bd099496fcfaefac4b694d17e1fbaffc4f3f1b93036b8443c\", mediaType: \"application/vnd.oci.image.layer.v1.tar+zstd\", size: 46861, uncompressed: \"sha256:0a1f02ee4d84826bfeac4fab2886788adad0b74f1035c253263e59484140f7ae\"): Directory!}): ModuleSource!}).initialize: Module!" error="map[error:failed to initialize module: failed to call module \"dtest\" to get functions: call constructor: process \"go build -o /runtime .\" did not complete successfully: exit code: 137 kind:*fmt.wrapError stack:<nil>]"
#

maybe a red herring

median kayak
#

@low pivot just to verify, you're calling consecutive dagger call runs without changing anything and you're getting random cache invalidations?

What system are you running this on. Mac, or Linux?

low pivot
#

debian11

#

that's correct

#

at this point, it looks like codegen & module init are rerunning every time - no caching at all

median kayak
low pivot
#

i am

#

so bizarre. now it's caching... (sometimes)

median kayak
median kayak
# low pivot so bizarre. now it's caching... (sometimes)

hmmm I can't repro with a simple example. Is there a chance you can connect your engine to Dagger Cloud (it's a free service for a single user) via dagger login and share two executions one cached and the other uncached? 🙏

median kayak
low pivot
#
 () $ docker exec $(docker ps --filter name="dagger-engine-*" -q) df -h /
Filesystem                Size      Used Available Use% Mounted on
overlay                  98.2G     81.4G     12.7G  86% /

appreciate the support!

median kayak
# low pivot ``` () $ docker exec $(docker ps --filter name="dagger-engine-*" -q) df -h / Fi...

ok, I think this is very likely to be the issue. Dagger has quite an aggressive GC policy by default (https://github.com/dagger/dagger/issues/7392) which starts evicting cache after 10% of the disk has been used: https://github.com/moby/buildkit/blob/master/cmd/buildkitd/config/gcpolicy_unix.go#L10

given that the partition where your docker and dagger engine are running seems to be quite full, looks like like Dagger is evicting the cache regularly

#

I'll send you some insturctions on how you can tune this GC policy in a bit. cc @quiet violet

#

you need to start your own engine with a custom engine.toml file.

insecure-entitlements = ["security.insecure"]
debug = true

[worker.oci]
  enabled = true
  gc = false

and then start your engine with it. For v0.11.6 that'd be:

docker run --rm --privileged -v ./engine.toml:/etc/dagger/engine.toml -v /var/lib/dagger --name dagger-engine-32e0269fab8e9d98 registry.dagger.io/engine:v0.11.6

keep in mind this disables GC altogether. If you need more fine-tuning, here's another option you can set for that: https://github.com/moby/buildkit/blob/master/docs/buildkitd.toml.md?plain=1#L63-L69

GitHub

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit - moby/buildkit

low pivot
#

hm, very interesting. after reducing the usage to 62%, i'm seeing the same issue on occasion. that is indeed quite aggressive

#

oh, even 10% (!)

#

i can give this a try

#

but i suspect turning off gc would fill up the disk extremely quickly, no?

#

if no cache is ever evicted

#

let me experiment with the custom option quick- i assume i can set this same option in engine.toml?

#

regardless- this does seem to solve the issue! i greatly appreciate your help @median kayak , thank you!

median kayak
#

So you can set it to a higher percentage