#Investigating abnormally slow filesync
I initially blamed my Starlink internet, but I don't think that fully explains it, because the connection is really pretty good
I suspect there may be a regression, possibly caused by my blueprint PR? It does make changes to module context directory access...
cc @lapis ginkgo @young current @raw pawn @upper bane since we were discussing that in the other thread
Diagnostics:
git log -1
commit cfdd1f733f455d828b15a640e7c10926b40c9a38 (HEAD -> address-api, origin/address-api)
Author: Solomon Hykes <solomon@dagger.io>
Date: Mon Jul 21 12:57:59 2025 +0200
address(): a unified address to load containers, directories, secrets, etc.
Signed-off-by: Solomon Hykes <solomon@dagger.io>
git status
On branch address-api
Your branch is up to date with 'origin/address-api'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: core/container.go
modified: core/schema/address.go
modified: core/schema/container.go
no changes added to commit (use "git add" and/or "git commit -a")
dagger -M -c '.core | module-source . | context-directory | glob "**"'
▶ connect 3.7s
● loading type definitions 1.1s
● moduleSource(refString: "."): ModuleSource! 4.0s
● .contextDirectory: Directory! 0.0s
● Directory.glob(pattern: "**"): [String!]! 0.0s
.dagger/
.dagger/.gitattributes
.dagger/.gitignore
.dagger/README.md
.dagger/bench.go
.dagger/checks.go
.dagger/cli.go
.dagger/dagger.gen.go
.dagger/engine.go
.dagger/go.mod
.dagger/go.sum
.dagger/internal/
.dagger/internal/dagger/
.dagger/internal/dagger/dagger.gen.go
.dagger/internal/querybuilder/
.dagger/internal/querybuilder/marshal.go
.dagger/internal/querybuilder/querybuilder.go
.dagger/internal/telemetry/
.dagger/internal/telemetry/attrs.go
.dagger/internal/telemetry/env.go
.dagger/internal/telemetry/exporters.go
.dagger/internal/telemetry/init.go
.dagger/internal/telemetry/live.go
.dagger/internal/telemetry/logging.go
.dagger/internal/telemetry/metrics.go
.dagger/internal/telemetry/proxy.go
.dagger/internal/telemetry/span.go
.dagger/internal/telemetry/transform.go
.dagger/main.go
.dagger/scripts.go
.dagger/sdk.go
.dagger/sdk_all.go
.dagger/sdk_dotnet.go
.dagger/sdk_elixir.go
.dagger/sdk_go.go
.dagger/sdk_java.go
.dagger/sdk_php.go
.dagger/sdk_python.go
.dagger/sdk_rust.go
.dagger/sdk_typescript.go
.dagger/test.go
dagger.json
engine/
engine/distconsts/
engine/distconsts/consts.go
engine/distconsts/go.mod
engine/distconsts/go.sum
sdk/
sdk/typescript/
sdk/typescript/runtime/
sdk/typescript/runtime/.gitattributes
sdk/typescript/runtime/.gitignore
sdk/typescript/runtime/bin/
sdk/typescript/runtime/bin/__dagger.entrypoint.ts
sdk/typescript/runtime/bin/__deno_config_updator.ts
sdk/typescript/runtime/bin/__tsclientconfig.updator.ts
sdk/typescript/runtime/bin/__tsconfig.updator.ts
sdk/typescript/runtime/bundled_static_export/
sdk/typescript/runtime/bundled_static_export/client/
sdk/typescript/runtime/bundled_static_export/client/index.ts
sdk/typescript/runtime/bundled_static_export/client/telemetry.ts
sdk/typescript/runtime/bundled_static_export/module/
sdk/typescript/runtime/bundled_static_export/module/index.ts
sdk/typescript/runtime/bundled_static_export/module/telemetry.ts
sdk/typescript/runtime/clientgen.go
sdk/typescript/runtime/config.go
sdk/typescript/runtime/dagger.json
sdk/typescript/runtime/go.mod
sdk/typescript/runtime/go.sum
sdk/typescript/runtime/main.go
sdk/typescript/runtime/module.go
sdk/typescript/runtime/sdk_module_fs.go
sdk/typescript/runtime/template/
sdk/typescript/runtime/template/package.json
sdk/typescript/runtime/template/src/
sdk/typescript/runtime/template/src/index.ts
sdk/typescript/runtime/template/tsconfig.json
sdk/typescript/runtime/tsdistconsts/
sdk/typescript/runtime/tsdistconsts/consts.go
Full trace at https://dagger.cloud/dagger/traces/e92abeb8dd57a88a4e56cbd1124a13f0
@lapis ginkgo I don't know how to interpret that output 👆 is it good or bad? 🙂
Internet definitely doesn't qualify as "slow". ⏬ Upload is not great, but not horrible either.
Hmmm.... your context directory looks ok to me. Each dependency has its own context directory now though, adding a few seconds each: https://dagger.cloud/dagger/traces/e92abeb8dd57a88a4e56cbd1124a13f0?span=0ff564fc2102af87
It used to be that dependencies would add to the context directory of the parent module, so there was a single bigger filesync. Now there are multiple filesyncs that contribute to the same cake (I think, simplifying).
Let me run a pipeline in this exact context, so we have the trace for reference
@cloud hull, can you check which untracked and ignored files you have in the repo?
git status --ignored
On branch address-api
Your branch is up to date with 'origin/address-api'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: core/container.go
modified: core/schema/address.go
modified: core/schema/container.go
modified: sdk/elixir/lib/dagger/gen/container.ex
modified: sdk/go/dagger.gen.go
modified: sdk/php/generated/Container.php
Ignored files:
(use "git add -f <file>..." to include in what will be committed)
.dagger/dagger.gen.go
.dagger/internal/
.env
bin/
core/integration/testdata/test-blueprint/myblueprint-with-dep/dagger.gen.go
core/integration/testdata/test-blueprint/myblueprint/dagger.gen.go
dagql/idtui/testdata/TestTelemetry/TestGolden/disk-metrics
sdk/dotnet/sdk/Dagger.SDK/introspection.json
sdk/php/vendor/
no changes added to commit (use "git add" and/or "git commit -a")
du -hs bin
125M bin
.env is a file
du -hs sdk/php/vendor
45M sdk/php/vendor
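To avoid running du by hand on each entry, the two steps above (list the ignored paths, then size them) can be combined. This is a minimal sketch, not something that was run in this thread: `ignored_sizes` is a made-up helper name, and it assumes paths without newlines or unusual characters, since git quotes those in its output.

```shell
# Print every git-ignored path in the current repo with its size in KiB,
# largest first. Ignored directories are reported as a single entry.
ignored_sizes() {
  git ls-files --others --ignored --exclude-standard --directory |
    while IFS= read -r path; do
      # du -sk prints "<size in KiB><TAB><path>" for each path
      du -sk "$path"
    done |
    sort -rn
}
```

Run `ignored_sizes | head` from the repo root to see the biggest ignored entries first.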
Reference run:
dagger -c 'sdk | all | generate | export .'
https://dagger.cloud/dagger/traces/abaf4ec7397541ae2dcf35f66c8c75d2
I'll run it a second time to see if there's any caching gains
See this one? https://dagger.cloud/dagger/traces/abaf4ec7397541ae2dcf35f66c8c75d2?span=998fdcbd4513a943
It's not excluding sdk/php/vendor
Can you list the files that it uploads? 1 sec, I'll give you the command
dagger --cloud -M -c "host | directory . --exclude 'bin,.git,**/node_modules,**/.venv,**/__pycache__,docs/node_modules,sdk/typescript/node_modules,sdk/typescript/dist,sdk/rust/examples/backend/target,sdk/rust/target' | glob '**'"
Those patterns come from here, btw: https://github.com/dagger/dagger/blob/99a49b7addca7851613ce38e3df977dbe6bbbaa0/.dagger/main.go#L35-L44
back, sorry
Here's the second run for reference:
https://dagger.cloud/dagger/traces/e4b8b3b201cf1403e32d4a403d0d1130
The output is very long....
https://dagger.cloud/dagger/traces/1ef87d7035f72f0ee3032d53df46c1cc
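Since the listing is too long to read directly, one option is to count entries per directory prefix and look at the largest groups. A rough sketch, assuming the glob output has been saved one path per line to a file (`files.txt` is a hypothetical name, and `summarize_paths` is a made-up helper):

```shell
# Read paths on stdin, group them by their first two path components,
# and print "<count> <prefix>" with the most frequent prefix first.
summarize_paths() {
  awk -F/ '
    { key = (NF > 2) ? $1 "/" $2 : $1; count[key]++ }
    END { for (k in count) print count[k], k }
  ' | sort -rn
}
```

For example, `summarize_paths < files.txt | head` shows which subtrees account for most of the uploaded entries. It counts files, not bytes, so it is only a hint about where the upload size comes from.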
You can export that dir locally and analyse it with ncdu or something. For sure I'd add an ignore for sdk/php/vendor, not sure how much that'll help you though. We're uploading the entire repo as the source Directory for the .dagger module.
We're uploading the entire repo as the source Directory for the .dagger module.
Didn't understand this part. If I add the ignore, wouldn't it prevent the unnecessary upload and solve the issue?
All the **/.changes could be ignored too.
It should help since you say it's around 45MB, but compared to the rest of the repo that's purposefully a part of that single source Directory I'm not sure how much of an impact it'll have.
But that upload should be cached. What's been killing me is that even uploads that should be super incremental take multiple minutes
At first glance sdk/php/vendor seems to be just PHP files, so I'm curious why it takes 45MB. It could have a few generated ones that are big.
what about ./bin, could that be a contributor also?
Oh, I'm not seeing the whole list -> 1000 of 12896 lines rendered 😄
Isn't that already ignored?
Based on the snippet you showed me, yes. Just trying to understand all the moving parts
OK, manually removing sdk/php/vendor definitely improved things
I'm back to a reasonable incremental upload time now
yay!
no more push and pray
I'll add the ignore entry in my PR
or maybe separately
thank you so much
this will recursively simplify my iterating on #1397223053452120144