#Slow going copying source of a monorepo?

burnt bone
#

I'm just starting to play with Dagger on one of our monorepos, and I was taking the initial stance of "just copy the source of the repo into the container we're building" using client.filesystem.".".read and just reading the whole thing in.

This turns out to be not such a good idea and it took... forever 🙂 Particularly if the working copy of the repo contains "junk" like Python virtual envs, build artifacts, etc. So... fine, the "right" thing is probably to copy in only the parts of the repo that matter for a particular task, but that can also get tedious for different reasons ("whoops, forgot to copy this directory... ok, whoops, forgot to copy this other directory...").

What's the best practice for a monorepo then? Try to "mount" the whole thing in read-only somehow? Copy only the parts needed for particular tasks? Just curious; trying to get this to play nicely with the established build tools (pants, bazel, etc) is also... just tricky.

stark crest
fallow belfry
#

@wide monolith @devout bluff we'll need to implement this properly in 0.3.x; right now we special-case node_modules.

we could respect .dockerignore or come up with our own .daggerignore. it's not too difficult: https://github.com/vito/bass/blob/df94cc9db27055ee14b8f638f86d6641cd86ba58/pkg/runtimes/buildkit.go#L980-L997 - but it probably gets a little trickier if you want to respect it for each sub-directory, not sure what the Docker semantics are regarding that

alternatively we could add exclude: [String!] params to Host.workdir and Host.directory.

cc @fierce halo in case this is something the Cue SDK(?) needs to support

fallow belfry
wide monolith
#

Yeah +1 for include and exclude in the API

fierce halo
#

Also, I'd like the same ability to include + exclude in directory { withDirectory(...) Directory! }

fallow belfry
#

that's doable too! I'll add a note

fierce halo
fallow belfry
#

oh ty

fierce halo
#

🙏🏻

devout bluff
burnt bone
#

So... I am actually doing include/exclude for the monorepo, which works (more or less) but now I have a more generic question on client: filesystem: ...

I'd like to have different "sets" of files from the client filesystem to copy to different layers of a docker image, say. But it feels like I only have one chance to set client: filesystem: "." and thus read from the project's directory.

How could I do something like, dunno

layer1files: client: filesystem: ".": read: include: [some-files]
layer2files: client: filesystem: ".": read: include: [other-files...]

dockerLayer1: docker#Something & { _src: layer1files.client.filesystem.".".read.contents }
dockerLayer2: docker#Something & { input: dockerLayer1 & {_src: layer2files...} }

if that makes any sense, like two different snapshots/sections of the client filesystem. In this case it's layers of the builder container where I want to set up the tools based on configuration but want to add the sources later since they'll change more frequently.

#

(or is this more what core.#Source does, since client.filesystem really just seems to be part of #Plan?)

wide monolith
#

@burnt bone ~I don't know if there is a cleaner solution.. But as a hack you can use different paths to designate the same directory.. It's ugly but I believe it works.~ wrong

fierce halo
#
#Plan: {
    // Access client machine
    client: {
        // Access client filesystem
        // Path may be absolute, or relative to client working directory
        filesystem: [path=string]: {
            // Read data from that path
            read?: _#clientFilesystemRead & {
                "path": string | *path
            }

            // If set, Write to that path
            write?: _#clientFilesystemWrite & {
                "path": string | *path

                // if we read and write to the same path, under the same key,
                // assume we want to make an update
                if (read.path & write.path) != _|_ {
                    _after: read
                }
            }
        }
    }
}
#

You can see how read: "path": defaults to path... but also allows overrides.
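
An untested sketch of how that override could answer the layering question above, assuming the Dagger 0.2-era CUE packages `dagger.io/dagger` and `universe.dagger.io/docker`. The keys `toolchain` and `sources`, the file names, the base image, and the `/workspace` destination are all hypothetical; the only point is that two differently named keys can set `read: path: "."` and carry different `include` lists.

```cue
package main

import (
	"dagger.io/dagger"
	"universe.dagger.io/docker"
)

dagger.#Plan & {
	client: filesystem: {
		// "toolchain" and "sources" are arbitrary labels (hypothetical names).
		// Each read overrides "path", so both point at the same directory.
		toolchain: read: {
			path:     "."
			contents: dagger.#FS
			include: ["pants.toml", "requirements.txt"] // hypothetical files
		}
		sources: read: {
			path:     "."
			contents: dagger.#FS
			include: ["src"] // hypothetical directory
		}
	}

	actions: build: docker.#Build & {
		steps: [
			docker.#Pull & {source: "python:3.10"}, // hypothetical base image
			// Rarely changing tool configuration goes into an early layer.
			docker.#Copy & {
				contents: client.filesystem.toolchain.read.contents
				dest:     "/workspace"
			},
			// Frequently changing sources are copied last.
			docker.#Copy & {
				contents: client.filesystem.sources.read.contents
				dest:     "/workspace"
			},
		]
	}
}
```

Keeping the two reads under separate keys means a change under `src` should only affect the second copy step, which is the caching behaviour the question was after.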

wide monolith
#

Actually that's way better, and my example was completely wrong. Sorry & thanks @fierce halo

fierce halo
#

Definitely!

burnt bone
#

hmm, interesting; I like the latter a little better. I'm trying to separate the build into "build a docker container to build everything" (which is less #Plan-specific although it relies on the repository, so may be more in the spirit of #Source) and "now do tasks with a subset of the repo" (which may be a bit more #Plan-specific) so just trying to structure things. I was actually looking through the fs source but didn't catch the path override or how it would allow what you showed. Fascinating; cue is taking a while to internalize.

wide monolith
#

Oh I see my example was not wrong, you just showed an advanced option which I had forgotten about

#

(slowly loading CUE context back in memory 🙂)

fierce halo
#

Yeah, it's an interesting language. I definitely had to change how I thought about things.

wide monolith
#

@burnt bone in my experience, most of the time you actually want core.#Source

#

the test question is: if you run dagger do from outside the plan's source directory, what directory do you want to interact with: your shell's current workdir, or the plan's source directory?
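
A minimal, untested sketch of that difference, assuming the Dagger 0.2-era `dagger.io/dagger` and `dagger.io/dagger/core` packages. The action name `planSource` and the exclude patterns are hypothetical, picked to match the "junk" mentioned at the top of the thread.

```cue
package main

import (
	"dagger.io/dagger"
	"dagger.io/dagger/core"
)

dagger.#Plan & {
	// Resolved relative to the client working directory,
	// i.e. wherever the shell is when `dagger do` runs.
	client: filesystem: ".": read: {
		contents: dagger.#FS
		exclude: [".venv", "dist"] // hypothetical junk directories
	}

	actions: {
		// Resolved relative to the plan's own source directory,
		// regardless of where `dagger do` is invoked from.
		planSource: core.#Source & {
			path: "."
			exclude: [".venv", "dist"] // hypothetical junk directories
		}
	}
}
```

Downstream actions would consume `client.filesystem.".".read.contents` in the first case and `actions.planSource.output` in the second; both are `dagger.#FS` values.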

burnt bone
#

Since most often I'm referencing things within the repository (and my cue files are at the root), that's likely true.

wide monolith
#

there is a trick to use core.#Source even when you need to reference a parent directory to the plan (but still relative to the plan)

burnt bone
#

that's a good description of the difference actually; the page was a little less clear on that.

wide monolith
#

So if you find yourself needing client: filesystem: ".." but you don't actually want to follow the shell's workdir.. You can still use core.#Source with a trick 🙂

fringe flicker
#

Same question but for cloak!

I've tried dockerignore to no avail

wide monolith
#

Not yet supported in the API, but will be a transposition of what you can do in CUE. No support for dockerignore planned (better to put all the control in your code.)