#Caching seems to be disabled when using directory mappings


grizzled remnant
#

I am currently evaluating the Rust SDK for use in a Rust project: https://github.com/consensus-shipyard/ipc/pull/1297 (the relevant code is under dagger/dagger.rs, run with cargo run from the repo root).

I am using a set of directory mappings to pass artifacts back to the host AND use these in the following steps. Unfortunately, I don't see any of the calls I am making being cached, despite only ever changing the dagger/dagger.rs file.

There are two questions:

  • how is caching supposed to work? I found the existing documentation a bit lacking, particularly around diagnosing why a cache was invalidated
  • does the order of with_exec and with_mounted_directory matter? I suppose so for the following execution context, but hopefully (?) less so for caching.

It'd be great if you could point me in the right direction

GitHub

An outline to build a sol, rs, car artifacts and pump them into a minimal~ish docker image.
dagger/dagger.rs is the only relevant piece here.


grizzled remnant
#

Gentle ping, I have not seen any exact explanation of the caching behaviour, only vague bug reports

#

This is currently a showstopper to move beyond PoC

grizzled remnant
#

Gentle ping

rose jewel
#

take into account that since github actions use ephemeral runners, the caching won't have any effect there because you're running the pipeline in a new VM each time

#

dagger currently caches artifacts when running the pipelines multiple times on the same host

rose jewel
#

just checked that caching doesn't seem to be working locally for you! checking this now!

rose jewel
#

ok, found one of the reasons. Seems like this always invalidates the cache, regardless of whether any file(s) in the repo change:

func main() {

    ctx := context.Background()
    d := dag.Host().Directory(".")

    dc := d.Directory("contracts")

    dag.Container().From("ubuntu").
        WithMountedDirectory("/workdir", d).
        WithMountedDirectory("/workdir/contracts", dc).
        WithExec([]string{"apt", "update", "-y"}).Sync(ctx)
}

src:

.
├── contracts
│   └── fo
├── go.mod
├── go.sum
└── main.go

2 directories, 4 files

FWIW this Dockerfile doesn't have the same effect:

# syntax=docker/dockerfile:1
FROM ubuntu


RUN --mount=type=bind,target=/workdir \
    --mount=type=bind,source=contracts,target=/workdir/contracts \
    apt update -y

cc @worthy basin @loud salmon

#

I'm filing an issue now

worthy basin
#

Got back on the issue, but it seems like the problem can be stepped around for now by avoiding calling .Directory() on an already loaded host Directory, e.g. modify the above to

func main() {
    ctx := context.Background()
    d := dag.Host().Directory(".")

    // load the subdirectory straight from the host instead of calling d.Directory("contracts")
    dc := dag.Host().Directory("./contracts")

    dag.Container().From("ubuntu").
        WithMountedDirectory("/workdir", d).
        WithMountedDirectory("/workdir/contracts", dc).
        WithExec([]string{"apt", "update", "-y"}).Sync(ctx)
}
rose jewel
#

just tried the fix in your pipeline @grizzled remnant and it seems to improve things considerably. Let us know how it goes and if you run into other issues 🙏

grizzled remnant
#

I'll try in a few minutes/hours, ty!

grizzled remnant
#

directory(..).file("foo.bar") is ok though?

#

I get some cache hits, but still not at the level I anticipated for two consecutive runs

rose jewel
grizzled remnant
#

Yes, correct

#

I still get an overwhelming amount of rebuilds

rose jewel
# grizzled remnant Yes, correct

@grizzled remnant I think a bunch of the invalidations you're experiencing are because you don't seem to be excluding the dagger directory in the dir function. Whenever you change your pipeline code, your dir effectively changes, so all the steps that depend on those directories get invalidated and have to be re-executed.
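To make the exclude point concrete, here is a rough sketch in plain Go. It is only a simplified model of content-addressed directory fingerprinting, not Dagger's actual implementation. The dirDigest helper and the file names are made up for illustration. It shows why a directory that still contains dagger/dagger.rs gets a new fingerprint on every pipeline edit, while the same directory with dagger/ excluded does not:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// dirDigest is a hypothetical helper: it fingerprints a directory by hashing
// the path and content of every file that is not under an excluded prefix.
// This is an assumption-level model, not how Dagger computes digests.
func dirDigest(files map[string]string, exclude []string) string {
	paths := make([]string, 0, len(files))
	for p := range files {
		skip := false
		for _, ex := range exclude {
			if strings.HasPrefix(p, ex) {
				skip = true
				break
			}
		}
		if !skip {
			paths = append(paths, p)
		}
	}
	sort.Strings(paths) // map iteration order is random, so sort for a stable hash

	h := sha256.New()
	for _, p := range paths {
		// length-prefix each field so different path/content splits cannot collide
		fmt.Fprintf(h, "%d:%s%d:%s", len(p), p, len(files[p]), files[p])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Two snapshots of the repo: only the pipeline file differs.
	before := map[string]string{
		"contracts/foo.sol": "contract Foo {}",
		"dagger/dagger.rs":  "// pipeline v1",
	}
	after := map[string]string{
		"contracts/foo.sol": "contract Foo {}",
		"dagger/dagger.rs":  "// pipeline v2",
	}

	fmt.Println("no exclude, same digest:      ",
		dirDigest(before, nil) == dirDigest(after, nil)) // false
	fmt.Println("dagger/ excluded, same digest:",
		dirDigest(before, []string{"dagger/"}) == dirDigest(after, []string{"dagger/"})) // true
}
```

In the real pipeline the equivalent would be passing the dagger folder in the exclude list of the host directory call, so that editing dagger/dagger.rs no longer changes the directory that the build steps depend on.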

As with Dockerfiles, it's generally advisable to call with_directory and with_mounted_directory right before the steps that need them. Looking at your module, for example, I see that both hrrd and hccd get added at the very beginning of the pipeline here: https://github.com/consensus-shipyard/ipc/blob/d052bb9b99c249294114f1e8747d5e49920a937d/dagger/dagger.rs?plain=1#L96-L99. This means that if any files change within those directories, the subsequent apt-get, npm install, and curl | bash steps, which don't necessarily depend on those directories being present, will be re-executed. As mentioned before, in addition to adding the dagger folder to the exclude list, you should also make a minor refactor so that the with_mounted_directory calls are added only where the subsequent steps need them.

#

LMK if that makes sense 🙏

rose jewel
#

additionally, another downside of having bloated functions that prepare containers, like the one you have with with_caches, is that if for any reason you have to add or remove a cache from that function, that will also cause a full invalidation: it modifies the base container definition, so all the steps get re-executed against the new definition. Same advice as above: it's generally better to add the caches only to the steps that need them to prevent these situations

rose jewel
#

also noticed that the rustup cache mounts were missing one directory. Added the missing cache mount: .with_mounted_cache("/usr/local/rustup", cache_volume_rustup_downloads.clone())