#Dagger Cloud


placid seal
#

On a second run, it passed, so idk, probably not high priority

remote tulip
#

hmmm yes I don't think I've seen that one previously, so most likely theseus related

#

adding to queue

placid seal
#

dagger trace f9f6b81bbf55dbc30217e64cf5717b9f

(just because i'm about to rerun it and you won't see the original in the checks url)

remote tulip
#

Yep I'm seeing it a lot in CI, investigating

remote tulip
#

Codex is onto something compelling:

• Yes. The strongest concrete path is now:

  GeneratorGroup.Changes creates a temporary git repo on the server-side mount and runs git merge --no-commit. Git starts detached auto-maintenance, git merge returns before that maintenance finishes, and Dagger immediately calls os.RemoveAll on the temporary .git.

  The exact Dagger path is:

  1. Generated calls generated.Changes() after isEmpty succeeds: .dagger/main.go:20
  2. GeneratorGroup.Changes merges all generator changesets: core/generators.go:142
  3. WithChangesets builds a merged “before” directory, explicitly strips any source .git, then performs an octopus merge: core/changeset.go:852
  4. The temporary repo is created on a mutable snapshot mounted under /tmp/buildkit-mount...: core/changeset.go:1003, engine/snapshots/snapshotter/localmounter_linux.go:60
  5. gitOctopusMergeWithPatches runs git merge --no-edit --no-commit, then immediately deletes .git: core/changeset.go:1141

  The key issue is runGit: core/changeset.go:1152

  It sets a stripped env, but does not disable Git auto-maintenance. The current Git defaults are maintenance.auto=true, maintenance.autoDetach=true, and gc.auto=6700.

  I ran a focused local trace with the same stripped env shape. With ~7000 loose objects, git merge --no-edit --no-commit theirs did this:

  git merge --no-edit --no-commit theirs
    -> git maintenance run --auto --no-quiet --detach
      -> git gc --auto --no-quiet --no-detach --skip-foreground-tasks
        -> git repack
          -> git pack-objects ... .git/objects/pack/.tmp-...-pack

  The important timing: the top-level git merge returned before the detached maintenance finished. That is exactly the point where Dagger calls:

  os.RemoveAll(filepath.Join(workDir, ".git"))

  So this is not hand-wavy anymore. There is a real server-side path where a Git child process is still mutating .git/objects/pack, refs, logs, or lockfiles after runGit(... "merge" ...) returns.

  This also makes the client-sync theory unlikely. The failing .git is not the client checkout’s .git; mergeBeforeDirectories strips source .git, and then git init creates a new temporary .git inside /tmp/buildkit-mount....

  The object count threshold is plausible for this check. From the visible progress-log filters, the merged before-dir union is roughly on the order of 8.5k git objects, above gc.auto=6700. That estimate lines up with the trace mechanics.

  The fix I’d try first is to disable auto-maintenance for this internal temp repo, probably in runGit:

  maintenance.auto=false
  gc.auto=0
  gc.autoDetach=false
  maintenance.autoDetach=false

  maintenance.auto=false handles modern Git’s git maintenance run --auto; gc.auto=0 covers older/direct git gc --auto behavior. I’d also wrap the cleanup error so future traces say remove temporary merge git repo: unlinkat ....

  I’d rank this as a well-justified hypothesis now, not a proven root cause. But it is a concrete code path that matches the exact error, exact timing, exact temp path, and CI-only flakiness.