#Directory upload
Are you using dag.Host().Directory() in your client? And if so, are you using the include and exclude optional arguments?
yes, it's a large git repo, we are including .git because we want to run git commands and LFS
the binary we are trying to build is almost 400MB (mostly embedded files)
for some additional context, the .git directory is almost 400MB before any LFS pulls happen
Mmm, so everything that gets uploaded is actually needed - it's just uploading slowly. Right?
One option is to clone the repo from inside Dagger. That way you'd benefit from caching at the git level, and bypass local filesystem upload altogether. I think that has worked well for others in the past.
cc @glad mesa @thorn minnow who have messed with this I believe
We need to be able to build & test in the container, before making a commit
Is it only the first run that has a very slow upload? Are the subsequent runs faster?
.git of 400MB is quite a bit. I guess it has a big history; do you need all of it? You could fetch the git repo with a depth of 1 to reduce the amount of history loaded. Are you using the Go SDK? I recently did an integration for our internal monorepo, and using go-git I do a PlainClone with a depth of 1, since I don't really care about the entire history
IIUC doing a fetch doesn't help since Tony wants to sync local changes.
This is adding a lot of work on our end that a docker buildx would not seem to need. Part of the goal of switching to Dagger Go SDK is that we could remove a lot of custom build tools.
@cursive kernel this should be exactly the same performance profile as docker build, since it's the same tech under the hood
If it's not, then it's fixable
this.
yes, but with docker buildx we would keep much of our internal tools and keep git-lfs work outside of the buildkit engine
we are currently including the full repo, because we could benefit from the caching across 30+ services
but as we try to roll things out
- we run dagger per service
- we only do the go build step of the service
You mentioned you can't clone the repo from inside the container, because the change you want to test is not yet committed - meaning this is a pipeline you want to run pre-commit on the dev machine, correct?
This is where they see docker buildx as a better alternative, because it wouldn't have the same overhead
Well, it's not really buildx that's faster, it's just keeping lfs outside the container that is faster. So you can always keep that part, and swap out buildx for an equivalent Dagger call. Then work incrementally from there.
An "unchanged" rerun has a 12s upload time, probably because the 400MB binary is there
we currently have LFS outside of dagger, I think it is actually faster inside, because of the upload phase and caching
Right. What if you only uploaded the repo without the LFS assets, then fetched LFS objects from inside the container?
I've been experimenting with userland caching of git objects with my supergit module (since core buildkit doesn't support lfs)
we still have .git directories on the dev vms that are 10GB+
with or without lfs?
doesn't matter, they already exist on the dev vms in their current state and we cannot ask them to all wipe their setups
they have other, non-dagger builds that are going to fill it in
This might come down to "why are buildkit filesystem operations so slow"?
You're uploading 10GB.. I'm sure buildkit upload can be made faster, but I don't think that's the bottleneck in this case.
Maybe this would work:
- Upload the git worktree without .git. So you get the uncommitted changes.
- Get the git objects from inside the container (bypassing direct upload). Make sure to use a cache volume to keep everything fast
- Combine uploaded worktree with in-container .git: everything should work
Upload the git worktree without .git. So you get the uncommitted changes.
Another optimization here could be to omit all Git LFS files and check them out in the pipeline as well, in a cache volume. That way you avoid repeatedly uploading huge file(s) to the build context
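To omit LFS files from the upload, the pipeline first needs to know which paths are LFS-tracked. Below is a minimal sketch, assuming the patterns are available in a `.gitattributes` file (where git-lfs normally records them as `filter=lfs`). `lfsPatterns` is a hypothetical helper, not part of Dagger or git-lfs. It does not handle quoted patterns containing spaces, and `.gitattributes` pattern syntax may not match the syntax Dagger expects for excludes, so the result may need translating before it is used as an exclude list.

```go
package main

import (
	"fmt"
	"strings"
)

// lfsPatterns returns the path patterns from .gitattributes content whose
// attribute list contains filter=lfs. Comment lines and lines without
// attributes are skipped. Quoted patterns are NOT handled (assumption:
// patterns contain no whitespace).
func lfsPatterns(gitattributes string) []string {
	var patterns []string
	for _, line := range strings.Split(gitattributes, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 || strings.HasPrefix(fields[0], "#") {
			continue
		}
		for _, attr := range fields[1:] {
			if attr == "filter=lfs" {
				patterns = append(patterns, fields[0])
				break
			}
		}
	}
	return patterns
}

func main() {
	// Sample content for illustration only.
	sample := "# models are stored in LFS\n" +
		"*.bin filter=lfs diff=lfs merge=lfs -text\n" +
		"assets/** filter=lfs diff=lfs merge=lfs -text\n" +
		"*.go text\n"
	fmt.Println(lfsPatterns(sample))
}
```

The returned patterns could then be appended to the exclude list passed to the host directory call, with the LFS objects fetched inside the container instead.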
That involves more effort than normal, because it would require porting some internal build tools in python/cue. We're still at the POC phase
I grant that this POC is exacerbating the issues
but it's causing 1m builds to take more than 30m
Can you elaborate on this part?
now, I have a second code path, that does much of what you describe here (also doing the 30+ services in a single docker run), and takes about the same time, for a fresh clone
makes sense, but realistically they're doing very different things. Uploading 10GB to a build context in the Dagger setup vs not doing that in docker buildx will always yield a very significant difference
re: elaborate - we use python + cue to form our software catalog and CI entrypoints. The LFS patterns are in CUE and LFS is run by python
A docker buildx POC might very well see some of the same issues, but it is a smaller POC than rolling out more Dagger integration before we know
at the same time, it would be designed differently, because we cannot realize the full product build (30+ services) in a single command
On Jenkins, we see a 50-100% build time increase for this POC, which is currently acceptable. We go from 30-45m to 45-60m
The main problem is the local developer experience, where just 1 of the services takes 30m
this is because the first time you're uploading a massive context to the build, correct? You mentioned a +20m context upload time
I'm getting confirmation on 2nd+ uploads, but they still seem to be significant
My timings (with a mostly clean repo)
fresh build -> 9m
no change -> 1.5m
change help string -> 5m
(for full run, not upload)
I have about 1.5GB (.git + lfs + src)
mind sharing where time is going in the 1.5m and 5m builds respectively?
sure, happy to keep chatting there
jumped into a quick call with Tony to go through this. Managed to reduce his 1.5m no-change build to 12s. The reason was some Exports he was doing on each build, which were generating context re-uploads and subsequent cache invalidations.
Having said that, it'd be great to have a way to benchmark how much time Dagger is taking to checksum the build context and decide if/what should be uploaded. cc @silent oasis @torpid vigil in case you know if there's any way we could easily know this..
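As a rough local proxy for that measurement, one could time a full walk-and-hash of the directory outside of Dagger. This is only a sketch: it is not the engine's actual change-detection algorithm, and `scanTree` and `scanResult` are hypothetical names. What it gives is a ballpark for how long the host needs just to read the whole tree, which can be compared against the observed upload phase.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// scanResult summarizes one pass over a directory tree.
type scanResult struct {
	Files   int
	Bytes   int64
	Digest  string
	Elapsed time.Duration
}

// scanTree walks root in lexical order, hashing the relative path and the
// contents of every regular file, and records how long the pass took.
func scanTree(root string) (scanResult, error) {
	start := time.Now()
	h := sha256.New()
	var res scanResult
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.Type().IsRegular() {
			return nil // directories are descended into; symlinks etc. are skipped
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		io.WriteString(h, filepath.ToSlash(rel)+"\x00")
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		n, err := io.Copy(h, f)
		f.Close()
		if err != nil {
			return err
		}
		res.Files++
		res.Bytes += n
		return nil
	})
	res.Digest = hex.EncodeToString(h.Sum(nil))
	res.Elapsed = time.Since(start)
	return res, err
}

// demo scans a small throwaway tree twice so the two digests can be compared.
func demo() (first, second scanResult, err error) {
	root, err := os.MkdirTemp("", "scan-demo")
	if err != nil {
		return
	}
	defer os.RemoveAll(root)
	if err = os.WriteFile(filepath.Join(root, "a.txt"), []byte("hello"), 0o644); err != nil {
		return
	}
	if err = os.WriteFile(filepath.Join(root, "b.txt"), []byte("world!"), 0o644); err != nil {
		return
	}
	if first, err = scanTree(root); err != nil {
		return
	}
	second, err = scanTree(root)
	return
}

func main() {
	if len(os.Args) > 1 {
		res, err := scanTree(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("%d files, %d bytes, sha256 %s, took %s\n", res.Files, res.Bytes, res.Digest, res.Elapsed)
		return
	}
	first, second, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("demo:", first.Files, "files,", first.Bytes, "bytes, stable digest:", first.Digest == second.Digest)
}
```

Running it with the repo root as the argument, once cold and once warm, shows how much of a "no change" run could be attributed to reading the tree at all.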
cc @leaden talon
This is definitely related to the discussion we had the other day here: https://github.com/dagger/dagger/issues/6155#issuecomment-1945019556
(beware, many technical details)
@silent oasis in this case I don't think so: #1211750401490292777 message
If I do subset := code.Directory(".", { include: ["src/libs/go", "src/services/foo"]})
Is that a copy or is something smarter going on?
This is unrelated. No modules being involved here
ah ok sorry i should check back in when i'm a little more awake
out of curiosity - how much of the commit history etc, is actually relevant here @cursive kernel?
These are local builds on the developer cloud vms, so we are dealing with whatever they have there, plus we need to operate on local changes that are not in git
separately, they sometimes have 10s of GB that we cannot predict, one had a ./scratch directory with 30GB
I'm reproducing myself right now (added 6x ~5GB tar.gz files in a ./tmp dir), prelim timing indicates it takes ~20m even with the no-change that resulted in 12s above
I'm working on some "hacks" to deal with my filesize / upload issue
Having a dagger option / filter like this could be useful
either way, I imagine whatever we come up with could go into a knowledge base
well.. that's the good thing about dagger. Since it's programmable having a function that does this for you in the meantime until it gets added in the engine is quite trivial

yeah, the hard part is the size to filter on...
I'm not sure if we are going to have a universal value, might depend on the directory...
are Exclude paths relative to the dir or fullpath?
git := dc.Dagger.Host().Directory(".git", dagger.HostDirectoryOpts{
	Exclude: []string{"lfs/"},
})
should this be lfs/ or .git/lfs?
another thing that has been mentioned before is supporting .gitignore, that would probably cover almost all of my cases besides LFS stuff
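Until .gitignore is supported directly, one possible stopgap is to ask git itself which untracked paths are ignored and feed those into the exclude list. A minimal sketch, assuming `git` is on the PATH and the working directory is inside the repo; `ignoredPaths` and `parseNulList` are hypothetical helper names. The `--directory` flag collapses a fully ignored directory into a single entry with a trailing slash, and `-z` makes the output NUL-separated so unusual file names survive.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// parseNulList splits NUL-separated output, as printed by `git ls-files -z`,
// into a slice of paths, dropping empty entries.
func parseNulList(out string) []string {
	var paths []string
	for _, p := range strings.Split(out, "\x00") {
		if p != "" {
			paths = append(paths, p)
		}
	}
	return paths
}

// ignoredPaths lists untracked paths under repo that match the standard
// ignore rules (.gitignore, .git/info/exclude, global excludes).
func ignoredPaths(repo string) ([]string, error) {
	cmd := exec.Command("git", "ls-files",
		"--others", "--ignored", "--exclude-standard", "--directory", "-z")
	cmd.Dir = repo
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("git ls-files: %w", err)
	}
	return parseNulList(string(out)), nil
}

func main() {
	paths, err := ignoredPaths(".")
	if err != nil {
		// Not a git repo, or git is missing: nothing to list.
		fmt.Fprintln(os.Stderr, "skipping:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

This only covers ignored files. Tracked LFS files and untracked-but-not-ignored scratch data (like the 30GB ./scratch case) would still need their own excludes.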
Here's the gist of what looks to be working so far
func loadSource(dc *client.Client) (*dagger.Directory, error) {
	// get code
	// we do this in parts and then combine, to improve caching

	// first, we need to build up an exclude list that has our exported files
	excludes := []string{
		".git",
		"src/stacks/foo/data/", // docker compose & root owned directory
	}
	for _, c := range components.QtpComponents {
		// exclude the bin
		p := filepath.Join(c.Dir, c.Bin)
		excludes = append(excludes, p)
		// other possible excludes would be LFS, ...?
	}

	// next, add large files to the exclude list
	// get a list of files over a certain size (hacky, depends on directory?) (like python models or other test files)
	// would prefer (configurable) .gitignore support here
	out, err := exec.Command("sh", "-c", "find . -size +20M").CombinedOutput()
	if err != nil {
		return nil, err
	}
	s := strings.TrimSpace(string(out))
	lines := strings.Split(s, "\n")
	for _, line := range lines {
		line = strings.TrimSpace(line)
		// we can ignore git, since we ignore the entire directory anyway
		if strings.HasPrefix(line, "./.git") {
			continue
		}
		line = strings.TrimPrefix(line, "./")
		excludes = append(excludes, line)
	}

	// then upload the source without git & exports
	src := dc.Dagger.Host().Directory(".", dagger.HostDirectoryOpts{
		Exclude: excludes,
	})

	// get the git directory separately, because it is large and changes less often
	git := dc.Dagger.Host().Directory(".git", dagger.HostDirectoryOpts{
		Exclude: []string{"lfs/"},
	})

	final := src.WithDirectory(".git", git)
	return final, nil
}
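The `find . -size +20M` shell-out in loadSource could also be done in pure Go, which avoids depending on `sh` and `find` and skips `.git` without walking it. A minimal sketch using only the standard library; `largeFiles` is a hypothetical helper and the threshold is passed in as a parameter, since a single universal value may not exist. Like the `find` call, it returns an error if it hits an unreadable directory.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// largeFiles returns slash-separated paths, relative to root, of regular
// files strictly larger than minBytes. Directories listed in skipDirs
// (relative to root) are not descended into.
func largeFiles(root string, minBytes int64, skipDirs ...string) ([]string, error) {
	skip := make(map[string]bool, len(skipDirs))
	for _, d := range skipDirs {
		skip[filepath.Clean(d)] = true
	}
	var out []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		if d.IsDir() {
			if skip[rel] {
				return fs.SkipDir
			}
			return nil
		}
		if !d.Type().IsRegular() {
			return nil // ignore symlinks, sockets, devices
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		if info.Size() > minBytes {
			out = append(out, filepath.ToSlash(rel))
		}
		return nil
	})
	return out, err
}

// demo builds a throwaway tree and reports what largeFiles finds in it.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "largefiles")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, dir := range []string{".git", "models"} {
		if err := os.MkdirAll(filepath.Join(root, dir), 0o755); err != nil {
			return nil, err
		}
	}
	files := map[string]int{".git/pack": 2000, "models/big.bin": 2000, "main.go": 10}
	for name, size := range files {
		if err := os.WriteFile(filepath.Join(root, name), make([]byte, size), 0o644); err != nil {
			return nil, err
		}
	}
	// Threshold of 1000 bytes for the demo; skip .git entirely.
	return largeFiles(root, 1000, ".git")
}

func main() {
	files, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(files)
}
```

In loadSource, a call such as `largeFiles(".", 20<<20, ".git")` would roughly correspond to the `find` invocation, and its result could be appended to `excludes` directly.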