# Is it possible to do host mounts like docker -v?


violet girder
#

Dagger sends everything over the wire, which is why, to my understanding, the equivalent of a host mount into a running container is not possible in Dagger.

I think I recall someone saying we might be able to mount the host dir when starting the engine, so that containers running in Dagger could have a real-time, two-way mount with that directory?

hoary dust
#

It's possible, just not yet implemented 😁 We have a POC branch somewhere, it's in the queue for this year.

#

mount the host dir to have a real-time two-way mount

This would break Dagger out of the box by fragmenting its user experience. When a feature works on Dagger, it must work for every module running on every Dagger engine installation. What you're proposing would break that core guarantee by requiring a different "host mount setup", using different external tools, on different installations. In many cases it would still "send everything over the wire", just through a fragmented ecosystem of tools outside our control (e.g. Docker Desktop's network mount, or whatever the Docker clones implement), so there would be no guaranteed performance benefit.

That is the way Docker went (on my watch) and it was strictly worse for users. I will not make that mistake again.

violet girder
#

I've grown to appreciate history lessons more the older I get. Not sure if it's the gray hairs that make me feel like I can relate to history now?

#

that "load my source code" phase of local dev is one of those noticeable slow parts. If you don't know it all has to go over the wire, it's a "wtf is going on" moment

hoary dust
#

we're working on improvements

safe mauve
# violet girder that "load my source code" phase of local dev is one of those noticeable slow pa...

Definitely agree. Especially when, for some random reason, you have extremely large files or directories (looking at you, .git) and then all of a sudden the context upload takes a few seconds.

A first step toward improving this was to add better visibility (https://github.com/dagger/dagger/pull/11315) so users can better understand why and where upload time is being spent. As Solomon mentioned, subsequent improvements are coming to what gets transferred into the engine and how.

GitHub

adds OTel spans for filesync uploads. This provides better visibility
into what's effectively being uploaded to the context. It only reports
root paths given that in most cases this will o...
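To get a rough feel for which top-level paths dominate an upload before any tracing is available, sizes can be summed per root entry of the working directory. This is a minimal sketch using only the Go standard library. It is not how Dagger's filesync measures things, it ignores any include/exclude rules, and the helper name `sumByRoot` is made up for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// sumByRoot groups file sizes by their first path component, so that
// ".git/objects/pack/x" counts toward ".git".
func sumByRoot(sizes map[string]int64) map[string]int64 {
	totals := make(map[string]int64)
	for path, size := range sizes {
		root, _, _ := strings.Cut(filepath.ToSlash(path), "/")
		totals[root] += size
	}
	return totals
}

func main() {
	sizes := make(map[string]int64)
	// Walk the current directory and record the size of every regular file.
	err := filepath.WalkDir(".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // skip unreadable entries instead of aborting
		}
		if d.Type().IsRegular() {
			if info, err := d.Info(); err == nil {
				sizes[path] = info.Size()
			}
		}
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}

	totals := sumByRoot(sizes)
	roots := make([]string, 0, len(totals))
	for root := range totals {
		roots = append(roots, root)
	}
	// Print the largest root entries first.
	sort.Slice(roots, func(i, j int) bool { return totals[roots[i]] > totals[roots[j]] })
	for _, root := range roots {
		fmt.Printf("%12d  %s\n", totals[root], root)
	}
}
```

Run from the repository root, this prints one line per top-level entry with its total size in bytes, which is usually enough to spot an oversized .git or data directory.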

violet girder
# safe mauve definitely agree. Specially when for some random reason you have extremely large...

We have this, re:

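```go
import (
	"fmt"
	"os/exec"
	"strings"
)

// FindLargeFiles returns an exclude list: a few fixed data directories plus
// every readable file under the current directory larger than size
// (find(1) syntax, e.g. "10M").
func FindLargeFiles(size string) ([]string, error) {
	excludes := []string{
		"src/stacks/foo/data/",
		"src/bar/data/",
	}

	// Get a list of files over a certain size (hacky, depends on the working
	// directory), like Python models or other test files. Unreadable paths are
	// pruned; -readable is a GNU find extension. The explicit -print keeps
	// pruned paths out of the output, and calling find directly avoids shell
	// quoting problems with size.
	out, err := exec.Command("find", ".", "!", "-readable", "-prune",
		"-o", "-type", "f", "-size", "+"+size, "-print").Output()
	if err != nil {
		return nil, fmt.Errorf("find command failed: %w", err)
	}
	for _, line := range strings.Split(strings.TrimSpace(string(out)), "\n") {
		line = strings.TrimSpace(line)
		// Skip blank output, and skip .git since we ignore the entire
		// directory anyway.
		if line == "" || strings.HasPrefix(line, "./.git/") {
			continue
		}
		excludes = append(excludes, strings.TrimPrefix(line, "./"))
	}

	return excludes, nil
}
```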
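A variant of the same idea that does not shell out to find might look like the sketch below. It walks an `fs.FS` with the standard library, so it also works where GNU find is unavailable. The names `largeFiles` and `demoTree` are hypothetical, the threshold is in plain bytes rather than find's size units, and the in-memory tree exists only so the example runs anywhere.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// largeFiles returns slash-separated paths of regular files in fsys that are
// larger than minBytes, skipping the top-level .git directory entirely.
func largeFiles(fsys fs.FS, minBytes int64) ([]string, error) {
	var found []string
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // unreadable entry: skip it and keep walking
		}
		if d.IsDir() && path == ".git" {
			return fs.SkipDir
		}
		if !d.Type().IsRegular() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return nil
		}
		if info.Size() > minBytes {
			found = append(found, path)
		}
		return nil
	})
	return found, err
}

// demoTree is a hypothetical in-memory tree used only for this demonstration.
func demoTree() fstest.MapFS {
	return fstest.MapFS{
		"main.go":               {Data: []byte("package main\n")},
		"models/weights.bin":    {Data: make([]byte, 4096)},
		".git/objects/pack/big": {Data: make([]byte, 8192)},
	}
}

func main() {
	files, err := largeFiles(demoTree(), 1024)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(strings.Join(files, "\n"))
}
```

Against a real checkout, the call would be something like `largeFiles(os.DirFS("."), 10<<20)`, with the result appended to the fixed exclude list.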
#

I'm actually moving towards being more surgical in what I upload from a monorepo source. It helps untangle the spaghetti monster of unintended/undocumented/unknown dependencies... "did you know these two separate production services import from each other without making an import cycle in Go?"

#

It still feels like it's scanning things, though; cloning the repo is so much faster. Let me do some tests and clarify my experience in different scenarios.

violet girder
# violet girder We have this, re: ```go func FindLargeFiles(size string) ([]string, error) { ...

fwiw, I also need something like this within/between container layers, because some stupid program will do something and /tmp or /proc dirs will show up in the diff even though they contain no files (?). More recently, since I started including .git in the env I give my agent, .git/index changes as the agent makes changes and shows up in the diff. I have had to resort to rm .git/index && git reset more than once while getting this set up :]
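One way to keep that kind of noise out of a diff is to filter the list of changed paths before showing it. The sketch below assumes the diff is already available as a slice of paths. The ignore list and the names `isNoise` and `filterDiff` are illustrative and not part of any Dagger API.

```go
package main

import (
	"fmt"
	"strings"
)

// noise lists entries to drop from a diff (hypothetical defaults taken from
// the cases described above).
var noise = []string{"tmp", "proc", ".git/index"}

// isNoise reports whether path equals an ignored entry or lies beneath it.
// A leading slash is dropped so "/tmp" and "tmp" are treated the same.
func isNoise(path string, ignored []string) bool {
	p := strings.TrimPrefix(path, "/")
	for _, ig := range ignored {
		if p == ig || strings.HasPrefix(p, ig+"/") {
			return true
		}
	}
	return false
}

// filterDiff returns the changed paths that are not noise, in their
// original order.
func filterDiff(changed, ignored []string) []string {
	var kept []string
	for _, p := range changed {
		if !isNoise(p, ignored) {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	changed := []string{"/tmp", "/proc", ".git/index", "src/main.go", "tmpl/page.html"}
	fmt.Println(filterDiff(changed, noise)) // prints [src/main.go tmpl/page.html]
}
```

Matching on whole path components rather than raw string prefixes keeps a directory like tmpl/ from being dropped just because it starts with "tmp".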

#

The other thing that comes to mind is filesystem stat / size metadata. Not having it is a gap: without it I can't show how old a change is in my VS Code / Dagger virtual filesystem.
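For comparison, this is roughly the metadata a local filesystem gives for free, and what an editor would need in order to label a change with its age. It is a sketch against the host filesystem using `os.Stat`, and nothing here is a Dagger API. The `ageLabel` helper and the default path are made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// ageLabel renders a duration as a coarse "how long ago" string.
func ageLabel(age time.Duration) string {
	switch {
	case age < time.Minute:
		return "just now"
	case age < time.Hour:
		return fmt.Sprintf("%dm ago", int(age.Minutes()))
	case age < 24*time.Hour:
		return fmt.Sprintf("%dh ago", int(age.Hours()))
	default:
		return fmt.Sprintf("%dd ago", int(age.Hours()/24))
	}
}

func main() {
	path := "go.mod" // hypothetical default; pass another path as the first argument
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	info, err := os.Stat(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Size and modification time are the two fields mentioned above.
	fmt.Printf("%s  %d bytes  modified %s\n",
		path, info.Size(), ageLabel(time.Since(info.ModTime())))
}
```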