#Slowness in git repos
Yes, that's because of RequiredPaths: https://github.com/dagger/dagger/blob/3c1d3c23253b39ec078f745d405b9a6831a2c510/core/schema/sdk.go#L485-L498
Go modules load all those files from the repo.
I talked about that with @opal reef yesterday, I intend to make an issue about it.
I mean, that's one reason it might be slow in git repos, it may be something else too.
You shouldn't see an impact because of RequiredPaths if you don't have many of those files.
Even if that stuff is in the PARENT directory?
Basically, starting with a "fast" module, just moving it (mv) into a subdirectory of dagger/dagger makes it slow
@pastel hawk Also, I had this question a bit ago regarding TS: https://github.com/dagger/dagger/pull/8236#discussion_r1735077151
Wondering if it's related -- basically instead of expecting a "ready to go" codebase, we're injecting stuff on the fly?
Yes, notice this comment, it applies to all those patterns since they're relative to the context directory (i.e., root of the git repo): https://github.com/dagger/dagger/blob/9146651c8d4a29ff0273d7b1d5c1340e3b2e860c/core/schema/sdk.go#L491-L493
If you look at the patterns used when uploading the files from module loading, you'll notice they come from there.
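To make the effect concrete, here is a rough Go sketch of why patterns anchored at the context directory pull in files far away from the module. It is not Dagger's matcher: the `matchesAny` helper and its "compare every path segment" rule are simplifications invented for illustration, and apart from `**/vendor` (mentioned in this thread) the pattern list is an assumption.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// requiredPaths is an illustrative pattern set. Only "**/vendor" is quoted
// in the thread; the other entries are assumptions for this sketch.
var requiredPaths = []string{"**/go.mod", "**/go.sum", "**/vendor", "**/*.go"}

// matchesAny reports whether relPath (slash-separated, relative to the
// context directory) matches any "**/name" pattern. Simplified rule: the
// part after "**/" is compared against every path segment, so a hit on a
// directory segment such as "vendor" includes everything below it.
func matchesAny(relPath string, patterns []string) bool {
	segments := strings.Split(relPath, "/")
	for _, p := range patterns {
		name := strings.TrimPrefix(p, "**/")
		for _, seg := range segments {
			if ok, err := path.Match(name, seg); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	// Paths relative to the git root; the module itself lives in ci/.
	files := []string{
		"ci/main.go",
		"ci/dagger.json",
		"cmd/server/main.go",
		"docs/vendor/autoload.php",
		"README.md",
	}
	for _, f := range files {
		fmt.Printf("%-28s loaded=%v\n", f, matchesAny(f, requiredPaths))
	}
}
```

Under these assumptions, Go files outside `ci/` and anything under a `vendor` directory are selected even though the module never references them, which would fit the PHP files showing up later in this thread.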
Is this a recent change? (using the root of the git repo)
Hmmm. Why do we have a separate go.mod/sum for dagger if we end up slurping up the whole repo?
I missed the memo -- to my knowledge, the scope was limited to the dagger directory, not above
No, the scope has always been on the context directory:
- context dir: git root or "root directory" if not in a git repo
- root dir: directory where dagger.json is
- source dir: directory where dagger.json points to in "source"
The scope is the context directory because Erik wanted to support the use case of importing Go code from outside the module, in a repo. However, the way it's currently done isn't useful for non-Go SDKs, and it loads more than necessary for Go modules, because this setting is hardcoded in each SDK.
It would be more useful if it were something that users could configure in their dagger.json files. This way each module could default to not load anything from outside, or load something very specific that it requires.
Sort of like contextual directories, but for loading the module instead of a function argument.
The current include/exclude patterns there are relative to the source directory. There's no way to configure patterns relative to the context directory, at least not for loading the module. It's not just about importing in code: you may need something even for building the module itself, such as when installing dependencies.
This was introduced here, I believe: https://github.com/dagger/dagger/pull/6575
See this comment: https://github.com/dagger/dagger/pull/6575#discussion_r1481797036
Weird experience though that it’s loaded even when not explicitly used
Basically, creating a brand new module under dagger/dagger and running dagger functions takes 25 seconds every time
Yes, it's loading all the go code in the repo with that module.
@paper forge hit the issue with a module on his own repo. 20-ish seconds
Also not sure what we’re doing with the context aside from uploading?
We tried a Dockerfile with a COPY . . on dagger/dagger and it only took a few seconds
Not sure why when loading the context it jumps to 25s
(Basically transferring the same amount of data to buildkit)
Could it be that the custom logic to find patterns for include/exclude is slowing that process down?
Maybe yeah ... I think @paper forge's repo was not THAT big, compared to dagger/dagger
Also ... it was slow on 2 machines, fast on 2 other machines. Weird
Similar characteristics on those machines? Could there be a random hang somewhere, as if in the slow ones they're waiting for something?
I am going to create the issue for slowness now.
Could it be that the slow machines have more generated code? Like internal/dagger, etc. Even if git-ignored, those files should still be uploaded during module loading.
Just ran a simple dagger init foo on my machine and got these files: https://app.warp.dev/block/8wnHMlOFRKJWTysCpr696N
No. Everything is identical on the two machines code-wise.
What is the issue? This is a follow-up to: #8395 Given the following host: gerhard@h22, OS: Pop!_OS 22.04 LTS x86...
That is a complete repro, including side-by-side comparisons & Dagger Cloud Traces for all runs.
I'm even getting php files because of the **/vendor required path.
Could the number of files, more than the file sizes, contribute to the slowness?
Don't see that mattering in a clean repo. I'm going to try in a clean clone and see what gets loaded.
I think that to compare with a Dockerfile equally you'd have to send the context of the whole repo, with .dockerignore applying the same patterns we are.
[^^ @paper forge]