#[Resolved] Cache Misses with ExperimentalPrivilegedNesting


tawdry willow
#

Hey Team

I'm having some issues with cache misses in my monorepo.

I have a folder structure of:

- lib
|- lib1
|- lib2
- service
|- service1
|- service2

And a dagger module that will dynamically read the go.mod and pull in only the required libs for a particular service.

My intention is that when service2 changes, only its tests are run, and if lib1 is updated, the tests for both service1 and service2 are run, since lib1 is a dependency.

The issue I'm having is that when only service2 is changed, dagger runs the tests for all services. When I run this with a "simple" pipeline, as per this example repo: https://github.com/DMajrekar/dagger-cache-monorepo, everything works as expected.

In my more complex prod monorepo it's not working, so I guess I'm doing something different.
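As a rough illustration of the "read the go.mod and pull in only the required libs" step, here is a minimal sketch. It is not the code from the repo: the function name localLibs, the module prefix example.com/mono, and the assumption that in-repo libraries appear as require entries under <prefix>/lib/ are all hypothetical. It uses only the standard library and a simple line scanner rather than a full go.mod parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// localLibs returns the repo-relative directories (e.g. "lib/lib1") of the
// in-repo libraries that a go.mod file requires. prefix is the hypothetical
// module path of the monorepo root, e.g. "example.com/mono".
func localLibs(goMod, prefix string) []string {
	var libs []string
	inBlock := false // true while inside a "require ( ... )" block
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Drop trailing comments such as "// indirect".
		if i := strings.Index(line, "//"); i >= 0 {
			line = strings.TrimSpace(line[:i])
		}
		switch {
		case line == "require (":
			inBlock = true
			continue
		case line == ")":
			inBlock = false
			continue
		case strings.HasPrefix(line, "require "):
			// Single-line form: "require path version".
			line = strings.TrimPrefix(line, "require ")
		case !inBlock:
			continue
		}
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		// Keep only modules that live under <prefix>/lib/.
		if strings.HasPrefix(fields[0], prefix+"/lib/") {
			libs = append(libs, strings.TrimPrefix(fields[0], prefix+"/"))
		}
	}
	return libs
}

func main() {
	goMod := `module example.com/mono/service/service2

go 1.22

require (
	example.com/mono/lib/lib1 v0.0.0
	github.com/stretchr/testify v1.9.0 // indirect
)
`
	fmt.Println(localLibs(goMod, "example.com/mono"))
}
```

Each returned path could then be mounted into the test container on its own, so that a change in an unrelated lib does not alter the inputs of this service's test run.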

Looking at the -vvvv output from my dagger module, I'm getting the following output:

https://github.com/DMajrekar/dagger-cache-monorepo/blob/main/log

There are some lines such as

  ✔ Container@xxh3:31fddde736af50f9.withDirectory(
      directory: ✔ Directory.directory(path: "lib/db"): Directory! = xxh3:2262feec381368ba 0.0s
      path: "/go/src/neo/lib/db"
    ): Container! = xxh3:2ab223e595c3bab8 0.0s
  ✔ Container@xxh3:8815f64604dfadb3.withDirectory(
      directory: ✔ Directory.directory(path: "lib/instrumentation/"): Directory! = xxh3:aab975df517d8a29 0.0s
      path: "/go/src/neo/lib/instrumentation/"
    ): Container! = xxh3:e2f15381f94e4f4d 0.0s
    ✔ load cache: copy /lib/instrumentation /go/src/neo/lib/instrumentation 0.0s

which makes me think that lib/instrumentation has been cached correctly, but lib/db has not, and that this is triggering a cache miss.

Am I on the right track with this thought, or do I need to keep building up my demo repo into a more accurate replication of my prod pipeline to see where I've made a mistake?

#

In this broken version, I'm using a util folder in my dagger module and passing the client into it. I'm not sure if that changes the behaviour at all.

#

so out, err := util.RunGoTest(ctx, dag, dir, rootDir)

tawdry willow
#

Ruled out passing the client into a lib as the cause.

Tested on the basic-dag-util branch

tawdry willow
#

Found the issue: enabling ExperimentalPrivilegedNesting appears to trigger the cache misses.

#

To replicate this:

  • Check out https://github.com/DMajrekar/dagger-cache-monorepo
  • Run the pipeline: dagger -m ci call run-test
  • Note that the run takes 20 seconds
  • Run the pipeline again: dagger -m ci call run-test
  • Note that the run is instant
  • Increase the value of project1/VERSION to simulate a code change
  • Run the pipeline again: dagger -m ci call run-test
  • Note that the run takes 10 seconds
  • Increase the value of project1/VERSION again to simulate another code change
  • Run the pipeline with ExperimentalPrivilegedNesting enabled: dagger -m ci call run-test --enable-experimental-privileged-nesting
  • Note that the run takes 20 seconds
  • Run the pipeline again with ExperimentalPrivilegedNesting enabled: dagger -m ci call run-test --enable-experimental-privileged-nesting
  • Note that the run is instant
  • Increase the value of project1/VERSION to simulate a code change
  • Run the pipeline once more with ExperimentalPrivilegedNesting enabled: dagger -m ci call run-test --enable-experimental-privileged-nesting
  • Note that the run now takes 20 seconds, not 10 seconds
#

My main project's dagger pipelines need this, as I have some older SDK-based repos that start up their own dagger services for databases etc.:

Setting up Database and Redis within Dagger
panic: EOF: 1: connect
        WARNING: failed to list containers error="exec: \"docker\": executable file not found in $PATH"
        1: pulling registry.dagger.io/engine:v0.10.3 
        1: pulling registry.dagger.io/engine:v0.10.3 [0.00s]
        1: connect ERROR: new client: failed to pull image: exec: "docker": executable file not found in $PATH
        Error: new client: failed to pull image: exec: "docker": executable file not found in $PATH

This is the error I get in my main repo when disabling nesting. The main CI code allows each project to connect to the dagger engine, start up a database, and then tunnel it to "localhost" so the unit tests can run against it.

I will at some point get around to refactoring these to use 0.13.x, and may test that now to see whether there have been changes that allow me to run without nesting.

rain briar
#

@rain prism I'm going to take a look at this. Would be good to get your take on it too.

rain prism
#

Yep for sure, also cc @wheat turtle who's been digging into caching things

tawdry willow
#

Thanks all, the repo should be easy to follow. I can strip it back to even more basic functions now that I know where the issue in the code is.

Let me know if you'd like me to do that, but I've just pushed more into the README to explain what I'm up to.

wheat turtle
#

Think I know why ExperimentalPrivilegedNesting invalidates too much cache; coincidentally, I ran into a very similar situation today.

It's subtle and hard to explain succinctly, but the tl;dr is that the flag changes the cache key of the withExec in such a way that there's less content-based caching.

I actually think there's a very quick fix possible that would cover your case. It's related to what I was working on anyways so I'll try it quick.
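To make "less content-based caching" concrete, here is a toy model in Go. It is not Dagger engine code and says nothing about how the engine actually builds its keys; the names repo, contentKey and recipeKey, and the project2 path, are hypothetical. It only contrasts keying an exec on the bytes of the directory it mounts with keying it on the whole upstream source plus the selected path.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// repo is a toy source tree: file path -> file content.
type repo map[string]string

// snapshot serialises the whole tree in a stable (sorted) order.
func (r repo) snapshot() string {
	paths := make([]string, 0, len(r))
	for p := range r {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	var b strings.Builder
	for _, p := range paths {
		b.WriteString(p + "=" + r[p] + "\n")
	}
	return b.String()
}

// subtree returns a new repo holding only the files under dir.
func (r repo) subtree(dir string) repo {
	out := repo{}
	for p, c := range r {
		if strings.HasPrefix(p, dir+"/") {
			out[p] = c
		}
	}
	return out
}

// digest hashes its parts with a separator so ("ab","c") != ("a","bc").
func digest(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0})
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// contentKey keys an exec on the content of the directory it actually mounts.
func contentKey(cmd string, src repo, dir string) string {
	return digest(cmd, src.subtree(dir).snapshot())
}

// recipeKey keys an exec on how its input was produced: the entire upstream
// source plus the path selected from it.
func recipeKey(cmd string, src repo, dir string) string {
	return digest(cmd, src.snapshot(), dir)
}

func main() {
	before := repo{"project1/VERSION": "1", "project2/VERSION": "1"}
	after := repo{"project1/VERSION": "2", "project2/VERSION": "1"}
	cmd := "go test ./..."

	fmt.Println("content-based key reused:",
		contentKey(cmd, before, "project2") == contentKey(cmd, after, "project2"))
	fmt.Println("recipe-based key reused:",
		recipeKey(cmd, before, "project2") == recipeKey(cmd, after, "project2"))
}
```

In this model, bumping project1/VERSION leaves the content-based key for project2 unchanged (the first line reports true) while the recipe-based key changes (the second reports false), which mirrors the symptom in the repro where an unrelated change reruns every test.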

tawdry willow
#

amazing! Thank you 🙂

wheat turtle
#

Okay yeah my theory was right, ran your repro with my dev engine and got the expected 10s at the end with nesting enabled. I'll send out a PR in a few; it's an engine change but we are doing a release for v0.13.6 tomorrow so hopefully I can sneak it in quick.

Thank you btw for the simplified repro, it's an enormous help! Would have taken much longer to debug w/out it 🙂

tawdry willow
#

You're welcome. I was 100% expecting to ask this question, then work through the simplified repo and find out I'd made this a very public rubber duck exercise, where I find my own issue: a spelling mistake or something stupid!

Thanks for finding this issue quickly. If it doesn't make it into 0.13.6 I won't mind. We've been running without this for a while, and it was only because I ran an internal demo where the cache did not work as I had assumed that I noticed it and spent the time this evening working on the simplified example. Working this through in my main repo would ... not be fun!
