#Misunderstanding of Cache?

hallow arch
#

Hello! I'm trying to test the waters in migrating an existing (mostly sh and github action based) pipeline over to dagger.

The pipeline is supposed to build a package from some source files, deploy that package to k8s, run some tests there, then publish the package. I thought that if I mounted the package source files in a dagger cache, the build step would recognize when those files haven't changed and be skipped. However, that's not the result I'm seeing: when nothing changes, the package is re-built every time I run the pipeline.

Here is a link to my main.go ci file if any of you smart people have ideas on what I am doing wrong: https://github.com/meganwolf0/dagger-zarf/blob/main/ci/main.go

upbeat cargo
#

hey Megan! are you seeing this behavior when running Dagger locally? or in CI?

spark pebble
#

I would try mounting pkgDir as a regular directory rather than a cache volume

#

withMountedDirectory rather than withMountedCache, line 47

hallow arch
#

@upbeat cargo I have just been running locally

#

@spark pebble thank you for clarifying the different version of caching, I tried the mounted directory instead, but still seeing the command re-running in full

spark pebble
#

Are you on the Dagger Cloud early access by any chance? Would be useful to see your run

hallow arch
#

I am not, I can try and run some stuff over there though - if you have some instructions on how to get started with it

spark pebble
#

your pipelines still run on your machines, but dagger cloud can be hooked up to receive telemetry, for orchestration and monitoring (and in this case: collaborative debugging)

hallow arch
#

ah ok I see, happy to hook it up! Not sure if it's open, sounds maybe invite-only at the moment?

spark pebble
#

in the meantime could you perhaps share the terminal output of running it 1st and 2nd time?

#

also have you tried wrapping your tool with dagger run to get more readable terminal output? might help troubleshoot

hallow arch
#

run 1:

dagger-zarf on  main [?] via 🐹 v1.20.5 on ☁️  (us-east-1)
❯ dagger run go run ./ci/main.go                                  
█ [11.7s] go run ./ci/main.go
┃ Building with Dagger                                                                                                         
┣─╮ 
│ ▽ host.directory /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ █ [0.02s] upload /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ ┣ [0.01s] transferring /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux:
│ █ [0.01s] copy /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ ┣─╮ copy /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ ┻ │ 
┣─╮ │ 
│ ▽ │ from ttl.sh/zarf-20230804:8h
│ █ │ [0.53s] resolve image config for ttl.sh/zarf-20230804:8h
│ █ │ [0.01s] pull ttl.sh/zarf-20230804:8h
│ ┣ │ [0.01s] resolve ttl.sh/zarf-20230804:8h@sha256:53f4aebb9068350b8723a1fb0f256b9620b3e81dc3ce01e6b35cbb101b1ce245
│ ┣─┼─╮ pull ttl.sh/zarf-20230804:8h
│ ┻ │ │ 
┣─╮ │ │ 
│ ▼ │ │ flux
│ █◀╯ │ [0.01s] copy / /flux
│ █◀──╯ [9.04s] exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┣─╮ exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┻ │ 
█◀──╯ [0.06s] copy /app/build /
█ [0.48s] exporting to client directory
┣ [0.48s] copying files
┻ 
• Engine: 0af1cbe1be1b
#

run 2:

dagger-zarf on  main [!?] via 🐹 v1.20.5 on ☁️  (us-east-1) took 13s
❯ dagger run go run ./ci/main.go
█ [8.93s] go run ./ci/main.go
┃ Building with Dagger                                                                                                         
┣─╮ 
│ ▽ host.directory /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ █ [0.02s] upload /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ ┣ [0.00s] transferring /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux:
│ █ CACHED copy /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ ┣─╮ copy /Users/meganwolf/Documents/testing/dagger/dagger-zarf/packages/flux
│ ┻ │ 
┣─╮ │ 
│ ▽ │ from ttl.sh/zarf-20230804:8h
│ █ │ [0.28s] resolve image config for ttl.sh/zarf-20230804:8h
│ █ │ [0.01s] pull ttl.sh/zarf-20230804:8h
│ ┣ │ [0.01s] resolve ttl.sh/zarf-20230804:8h@sha256:53f4aebb9068350b8723a1fb0f256b9620b3e81dc3ce01e6b35cbb101b1ce245
│ ┣─┼─╮ pull ttl.sh/zarf-20230804:8h
│ ┻ │ │ 
┣─╮ │ │ 
│ ▼ │ │ flux
│ █◀╯ │ CACHED copy / /flux
│ █◀──╯ [7.05s] exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┣─╮ exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┻ │ 
█◀──╯ [0.07s] copy /app/build /
█ [0.46s] exporting to client directory
┣ [0.46s] copying files
┻ 
• Engine: 0af1cbe1be1b
slim bramble
#

Here are my runs from my Mac M1:

dagger-zarf ➤ dagger run go run ./ci/main.go                                                                                                                                                                 git:main
█ [15.4s] go run ./ci/main.go
┃ Building with Dagger
┣─╮
│ ▽ host.directory /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ █ [0.08s] upload /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ ┣ [0.05s] transferring /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux:
│ █ [0.01s] copy /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ ┣─╮ copy /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ ┻ │
┣─╮ │
│ ▽ │ from ttl.sh/zarf-20230804:8h
│ █ │ [2.35s] resolve image config for ttl.sh/zarf-20230804:8h
│ █ │ [0.51s] pull ttl.sh/zarf-20230804:8h
│ ┣ │ [0.02s] resolve ttl.sh/zarf-20230804:8h@sha256:53f4aebb9068350b8723a1fb0f256b9620b3e81dc3ce01e6b35cbb101b1ce245
│ ┣ │ [1.04s] ██████████████████████████ sha256:f7bf105f3b603304fc7f5ff9b9524eb97f3b55fb818fe326daf35ca780f70b93
│ ┣ │ [3.48s] ██████████████████████████ sha256:00de514abe5214e8cd52c5284fb382a66d2402612a29ef10ebf5ce7c10528209
│ ┣ │ [0.07s] extracting sha256:f7bf105f3b603304fc7f5ff9b9524eb97f3b55fb818fe326daf35ca780f70b93
│ ┣ │ [0.51s] extracting sha256:00de514abe5214e8cd52c5284fb382a66d2402612a29ef10ebf5ce7c10528209
│ ┣─┼─╮ pull ttl.sh/zarf-20230804:8h
│ ┻ │ │
┣─╮ │ │
│ ▼ │ │ flux
│ █◀╯ │ [0.03s] copy / /flux
│ █◀──╯ [6.96s] exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┣─╮ exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┻ │
█◀──╯ [0.06s] copy /app/build /
█ [0.43s] exporting to client directory
┣ [0.43s] copying files
┻
• Cloud URL: https://dagger.cloud/runs/e33e0be0-850a-4020-b0b7-ab77af96594d
• Engine: df2ad0001b9b
• Duration: 15.433s
#
dagger-zarf ➤ dagger run go run ./ci/main.go                                                                                                                                                                 git:main
█ [1.96s] go run ./ci/main.go
┃ Building with Dagger
┣─╮
│ ▽ host.directory /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ █ [0.06s] upload /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ ┣ [0.05s] transferring /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux:
│ █ CACHED copy /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ ┣─╮ copy /Users/jeremyadams/src/meganwolf/dagger-zarf/packages/flux
│ ┻ │
┣─╮ │
│ ▽ │ from ttl.sh/zarf-20230804:8h
│ █ │ [0.35s] resolve image config for ttl.sh/zarf-20230804:8h
│ █ │ [0.01s] pull ttl.sh/zarf-20230804:8h
│ ┣ │ [0.01s] resolve ttl.sh/zarf-20230804:8h@sha256:53f4aebb9068350b8723a1fb0f256b9620b3e81dc3ce01e6b35cbb101b1ce245
│ ┣─┼─╮ pull ttl.sh/zarf-20230804:8h
│ ┻ │ │
┣─╮ │ │
│ ▼ │ │ flux
│ █◀╯ │ CACHED copy / /flux
│ █◀──╯ CACHED exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┣─╮ exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│ ┻ │
█◀──╯ CACHED copy /app/build /
█ [0.44s] exporting to client directory
┣ [0.43s] copying files
┻
• Cloud URL: https://dagger.cloud/runs/fdfa6e1d-b712-422b-bdd4-2f9ff3642b72
• Engine: df2ad0001b9b
• Duration: 1.955s
#

@hallow arch could you run dagger version 🙏

#

wondering what CLI version you have

hallow arch
#

dagger v0.6.3 darwin/arm64

slim bramble
#

Ah, okay. I'd def upgrade to at least v0.6.4, but would go right to v0.8.0

#

brew install dagger/tap/dagger

upbeat cargo
#

I thought about the v0.6.3 issue but then I checked the go.mod and I saw v0.7.4 SDK 🙈

hallow arch
#

updating now! And sorry, I'm just getting on dagger cloud, are there docs on how to connect? I assume I use the cloud client when initializing the dagger connection?

upbeat cargo
#

v0.6.3 has a regression which incorrectly invalidates caches for some ops

#

@hallow arch if you run go run ./ci/main.go in your repo "as is", it should work

#

newer versions perform a compatibility check between the SDK and the CLI to avoid running into these situations 🙏
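(Editor's aside: a minimal sketch of what a CLI/SDK version check could look like. The rule used here, "the CLI must be at least as new as the SDK", and the names `parseVersion` and `checkCompat` are assumptions for illustration only; this is not Dagger's actual check.)

```go
package main

import (
	"fmt"
	"strings"
)

// version is a parsed "vMAJOR.MINOR.PATCH" string.
type version struct{ major, minor, patch int }

// parseVersion accepts "v0.6.3" or "0.6.3" and rejects anything non-numeric.
func parseVersion(s string) (version, error) {
	var v version
	_, err := fmt.Sscanf(strings.TrimPrefix(s, "v"), "%d.%d.%d", &v.major, &v.minor, &v.patch)
	if err != nil {
		return version{}, fmt.Errorf("parse %q: %w", s, err)
	}
	return v, nil
}

// less reports whether v is older than o.
func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	if v.minor != o.minor {
		return v.minor < o.minor
	}
	return v.patch < o.patch
}

// checkCompat returns an error when the CLI is older than the SDK.
// Hypothetical rule, chosen only to illustrate the idea of a check.
func checkCompat(cli, sdk string) error {
	c, err := parseVersion(cli)
	if err != nil {
		return err
	}
	s, err := parseVersion(sdk)
	if err != nil {
		return err
	}
	if c.less(s) {
		return fmt.Errorf("CLI %s is older than SDK %s; upgrade the CLI", cli, sdk)
	}
	return nil
}

func main() {
	// The combination from this thread: CLI v0.6.3 with SDK v0.7.4.
	fmt.Println(checkCompat("v0.6.3", "v0.7.4"))
	// Matching versions pass the check.
	fmt.Println(checkCompat("v0.8.0", "v0.8.0"))
}
```

Under that assumed rule, the v0.6.3 CLI with the v0.7.4 SDK would have been flagged before the pipeline ran, rather than surfacing later as unexplained cache misses.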

slim bramble
#

We can test that on our side.

But to make it easy on you, I'd just

go get dagger.io/dagger@v0.8.0
go mod tidy

In your project
and upgrade CLI like above.
Then clear out your dagger engine container and volumes and start fresh 🙂

hallow arch
#

alrighty everything is updated, and yeah it does actually look like that resolved things, here's run 1:

dagger-zarf on  main [!?] via 🐹 v1.20.5 on ☁️  (us-east-1) 
❯ dagger run go run ./ci/main.go
┣─╮ 
│ ▽ host.directory packages/flux
│ █ [0.08s] upload packages/flux
│ ┣ [0.06s] transferring eyJvd25lcl9jbGllbnRfaWQiOiJyemNnYWRrcGhuY2sxZHE2a252M2QwYXkxIiwicGF0aCI6InBhY2thZ2VzL2ZsdXgiLCJpbmNsdWRlX3BhdHRlcm5zIjpudWxsLCJleGNsdWRlX3BhdHRlcm5zIjpudWxsLCJmb2xsb3dfcGF0aHMiOm51bGwsInJlYWRfc2luZ2xlX2ZpbGVfb25seSI6ZmFsc2UsIm1heF9maWxlX3NpemUiOjB9:
│ █ [0.01s] copy packages/flux
│ ┣─╮ copy packages/flux
│ ┻ │ 
┣─╮ │ 
│ ▽ │ from ttl.sh/zarf-20230804:8h
│ █ │ [0.61s] resolve image config for ttl.sh/zarf-20230804:8h
┣─┼─┼─╮ 
│ │ │ ▼ flux
│ █ │ │ [0.01s] pull ttl.sh/zarf-20230804:8h
│ ┣ │ │ [0.01s] resolve ttl.sh/zarf-20230804:8h@sha256:53f4aebb9068350b8723a1fb0f256b9620b3e81dc3ce01e6b35cbb101b1ce245
│ ┣─┼─┼─╮ pull ttl.sh/zarf-20230804:8h
│ ┻ │ │ │ 
│   ╰▶█ │ [0.01s] copy / /flux
│     █◀╯ [23.7s] exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│     █ [0.06s] copy /app/build /
┻     ┻ 
WARNING: Using development engine; skipping version compatibility check.                                                       
• Engine: 283d8bc75810 (version devel ())
• Duration: 26.151s
#

run 2:

dagger-zarf on  main [!?] via 🐹 v1.20.5 on ☁️  (us-east-1) took 27s 
❯ dagger run go run ./ci/main.go
┣─╮ 
│ ▽ host.directory packages/flux
│ █ [0.07s] upload packages/flux
│ ┣ [0.06s] transferring eyJvd25lcl9jbGllbnRfaWQiOiJrdDdvNHZpZjM4cmM3dHM3cXFnOTU2YzkwIiwicGF0aCI6InBhY2thZ2VzL2ZsdXgiLCJpbmNsdWRlX3BhdHRlcm5zIjpudWxsLCJleGNsdWRlX3BhdHRlcm5zIjpudWxsLCJmb2xsb3dfcGF0aHMiOm51bGwsInJlYWRfc2luZ2xlX2ZpbGVfb25seSI6ZmFsc2UsIm1heF9maWxlX3NpemUiOjB9:
│ █ CACHED copy packages/flux
│ ┣─╮ copy packages/flux
│ ┻ │ 
┣─╮ │ 
│ ▽ │ from ttl.sh/zarf-20230804:8h
│ █ │ [0.35s] resolve image config for ttl.sh/zarf-20230804:8h
┣─┼─┼─╮ 
│ │ │ ▼ flux
│ █ │ │ [0.01s] pull ttl.sh/zarf-20230804:8h
│ ┣ │ │ [0.01s] resolve ttl.sh/zarf-20230804:8h@sha256:53f4aebb9068350b8723a1fb0f256b9620b3e81dc3ce01e6b35cbb101b1ce245
│ ┣─┼─┼─╮ pull ttl.sh/zarf-20230804:8h
│ ┻ │ │ │ 
│   ╰▶█ │ CACHED copy / /flux
│     █◀╯ CACHED exec /bin/sh -c zarf package create ./pkgs/flux --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null
│     █ CACHED copy /app/build /
┻     ┻ 
WARNING: Using development engine; skipping version compatibility check.                                                       
• Engine: 283d8bc75810 (version devel ())
• Duration: 1.946s
#

also thank you for adding the "duration" at the end in the newer version 🙂

spark pebble
#

Now I'm curious if the WithMountedCache was part of the problem? 🤔

#

Those are designed to bypass the usual caching mechanism - so that tools running inside the container can implement their own caching (which requires cache state to be preserved in between runs)

#

They are like controlled side effects
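(Editor's aside: a rough model of the "usual caching mechanism" mentioned above, assuming an exec step is keyed by its declared inputs: image digest, command, and mounted files. `opKey` and `engine` are hypothetical names, and the key format is made up; this is not the engine's real implementation. In this model a cache volume's contents would sit outside `files`, which is why it behaves like a side effect.)

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// opKey derives a cache key from everything the step declares as input.
// Illustrative only: the real engine's key derivation is not shown here.
func opKey(imageDigest, command string, files map[string]string) string {
	h := sha256.New()
	fmt.Fprintf(h, "image=%q\ncmd=%q\n", imageDigest, command)
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for a stable key
	for _, name := range names {
		fmt.Fprintf(h, "file=%q:%q\n", name, files[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// engine remembers the results of steps it has already run, keyed by opKey.
type engine struct {
	results map[string]string
	execs   int // how many times a step actually executed
}

func newEngine() *engine { return &engine{results: map[string]string{}} }

// run returns the stored result and true on a hit; otherwise it executes step.
func (e *engine) run(key string, step func() string) (string, bool) {
	if out, ok := e.results[key]; ok {
		return out, true
	}
	e.execs++
	out := step()
	e.results[key] = out
	return out, false
}

func main() {
	e := newEngine()
	files := map[string]string{"zarf.yaml": "kind: ZarfPackageConfig"}
	cmd := "zarf package create ./pkgs/flux"
	build := func() string { return "flux package" }

	_, cached := e.run(opKey("sha256:53f4", cmd, files), build)
	fmt.Println("run 1 cached:", cached)

	_, cached = e.run(opKey("sha256:53f4", cmd, files), build)
	fmt.Println("run 2 cached:", cached)

	// Changing an input file changes the key, so the step runs again.
	files["zarf.yaml"] = "kind: ZarfPackageConfig # edited"
	_, cached = e.run(opKey("sha256:53f4", cmd, files), build)
	fmt.Println("after edit cached:", cached)
}
```

With unchanged inputs the second call is a hit, which corresponds to the CACHED exec line in the later runs of this thread; editing a file forces a rebuild.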

hallow arch
#

🤷‍♀️ I can try changing it back

#

I'm still not seeing the runs show up in dagger.cloud though

upbeat cargo
#

basically none of the inputs of the DAG will change

slim bramble
#

Yep! I've sent info in DM 🙂

#

@hallow arch

#

Also happy to jump on a quick screen share

hallow arch
#

I got it, thank you!

spark pebble
#

Anything that relies on the cache volume not being empty on first run is at risk of error IMO

#

since it's not well understood how the "pre-filling" behavior works. It's safe to assume those use cases will not work as intended

slim bramble
#

I'm using the same code as Megan

#

lemme look

upbeat cargo
#

just tested with WithMountedCache and I get cached ops as well

#

which is what I'd expect though.. regardless of the cache mount being populated before the run

slim bramble
#
func buildPackage(client *dagger.Client, ctx context.Context, c *dagger.Container, pkg string, pkgCache *dagger.CacheVolume) {
    // Build Zarf package
    pkgDir := client.Directory().WithDirectory(pkg, client.Host().Directory(fmt.Sprintf("packages/%s", pkg)))
    zarfCmd := fmt.Sprintf("zarf package create ./pkgs/%s --confirm --no-progress --tmpdir /tmp/zarf -o ./build 2>/dev/null", pkg)

    build := c.Pipeline(pkg).
        WithMountedCache("/app/pkgs", pkgCache, dagger.ContainerWithMountedCacheOpts{Source: pkgDir}).
        WithWorkdir("/app").
        WithExec([]string{
            "/bin/sh", "-c",
            zarfCmd,
        }).
        Directory("build")

    _, err := build.Export(ctx, "./build")
    if err != nil {
        panic(err)
    }

}
#

Megan got caching with newer version of Dagger too

upbeat cargo
#

but I see your point solomon and it makes sense that it's difficult to visualize what'd happen

spark pebble
#

Difficult to visualize and I think difficult to guarantee that it will behave correctly

hallow arch
#

Yeah I changed it to WithMountedDirectory locally

upbeat cargo
#

still... seems to work both ways

slim bramble
#

ah, I see.

spark pebble
#

To summarize, it looks like my suggestion to use WithMountedDirectory does not affect this particular caching issue - but does prevent other, harder to detect issues that were lurking under the surface. So I still recommend making that change 🙂

hallow arch
#

side Q for you all - does the dagger engine maintain parity if I have multiple runners? we have some customers that host their own runners so I'm not sure if this is what dagger cloud would solve but might not be usable for some of the cases I'm thinking about

hallow arch
#

ah sorry like if I'm running a dagger pipeline and it launches on one of my available runners, next time if it runs on a different runner I assume the caching or whatever would be invalidated because now it's on a different docker host? Like can I run the same pipelines across different runners and have them all sync up on caches

spark pebble
#

Ah, yes that is one of the big features of Dagger Cloud - it orchestrates caching across all your engines

#

Without that caching orchestration, each engine is its own island

#

And if the engine runs on an ephemeral runner (typical in a CI setup), then the island is swallowed by the sea after each run, and the next island starts with an empty cache
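(Editor's aside: a toy model of the "island" picture, assuming each runner has its own engine-local cache and, optionally, a cache shared between runners. The `runner` and `cache` types are hypothetical and say nothing about how Dagger Cloud implements its caching service.)

```go
package main

import "fmt"

// cache maps a step's key to its stored result.
type cache map[string]string

// runner is one CI machine: an engine-local cache plus an optional shared one.
type runner struct {
	local  cache
	shared cache // nil when no shared cache is configured
}

// build reports where the result came from: "local", "shared", or "rebuilt".
func (r *runner) build(key string) string {
	if _, ok := r.local[key]; ok {
		return "local"
	}
	if r.shared != nil {
		if out, ok := r.shared[key]; ok {
			r.local[key] = out // keep a local copy for the next call
			return "shared"
		}
	}
	out := "artifact for " + key
	r.local[key] = out
	if r.shared != nil {
		r.shared[key] = out // publish so other runners can reuse it
	}
	return "rebuilt"
}

func main() {
	// Two ephemeral runners without a shared cache: each starts empty.
	a := &runner{local: cache{}}
	b := &runner{local: cache{}}
	fmt.Println(a.build("flux"), b.build("flux")) // rebuilt rebuilt

	// Two runners backed by the same shared cache.
	shared := cache{}
	c := &runner{local: cache{}, shared: shared}
	d := &runner{local: cache{}, shared: shared}
	fmt.Println(c.build("flux"), d.build("flux")) // rebuilt shared
}
```

Without the shared store, the second runner repeats the work even though nothing changed; with it, only the first runner to see a given key has to rebuild.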

slim bramble
#

Yep. With the org you've just set up, we can get caching working for you @hallow arch

#

Easiest if we do a quick session to get you onboarded for a trial

#

But in essence, we'll use your caching token (next to the ingestion/logs token) in your org and we'll get you enabled on the caching service.

hallow arch
#

ok gotcha, that's what I figured, didn't know if there was a self hosted runner/on-prem kind of synchronization (sounds like maybe not yet). but yeah would love to get set up with the cloud caching!