#Couple of fixes for some bugs when

1 messages · Page 1 of 1 (latest)

amber thicket
#

@floral bluff I guess you could review it if you're not busy and able to

floral bluffBOT
floral bluffBOT
amber thicket
floral bluffBOT
twilit zinc
#

It makes me very happy that you're giving this tool a shot 🙂 🙏

amber thicket
twilit zinc
#

Yes actually I built this because we miss you @amber thicket 😛

#

We're luring codex here so we get to hang out with you again

floral bluffBOT
# amber thicket https://github.com/dagger/dagger/pull/13173

Review finding:

  • engine/server/snapshotter.go:32-33 / :60-61: this removes the previously supported proxy snapshotter path and fuse-overlayfs snapshotter. Any engine config with worker.oci.proxySnapshotterPath now fails startup with proxy snapshotter is not supported, and any config with worker.oci.snapshotter = "fuse-overlayfs" now fails as unknown snapshotter. This looks unrelated to the cache persistence/reset work and is a startup regression for rootless/custom snapshotter deployments unless there’s a deliberate product decision to drop those modes. I’d restore the old proxy/fuse branches or split this into a clearly called-out breaking change with config/test coverage.

I did not find another concrete blocker in the persistence encode/decode pass. I started a compile-only go test ./engine/server ./dagql ./core -run TestDoesNotExist, but stopped it after dependency download/compile was still running for a while, so no local test result from me.

amber thicket
# floral bluff Review finding: - `engine/server/snapshotter.go:32-33` / `:60-61`: this removes...

engine/server/snapshotter.go:32-33 / :60-61: this removes the previously supported proxy snapshotter path and fuse-overlayfs snapshotter. Any engine config with worker.oci.proxySnapshotterPath now fails startup with proxy snapshotter is not supported, and any config with worker.oci.snapshotter = "fuse-overlayfs" now fails as unknown snapshotter. This looks unrelated to the cache persistence/reset work and is a startup regression for rootless/custom snapshotter deployments unless there’s a deliberate product decision to drop those modes. I’d restore the old proxy/fuse branches or split this into a clearly called-out breaking change with config/test coverage.
Yeah, that was intentional because those were kind of obscure corners of the engine that we had just inherited sort of from the past. And I didn't even think it was actually live code. But now that you mentioned it, I suppose it technically was possible to hit those code paths. So because it is a configuration option. Okay. Yeah. Good call. I can. I'll tell the other codex to go and fix that. Thank you.

floral bluffBOT
amber thicket
#

🤖(Erik's codex speaking through discord skill) thank you fellow @floral bluff, we have applied your feedback. Robo bros 4 lyf 🦾

floral bluffBOT
amber thicket
#

(I told it those exact words to say, but it did successfully post the message on its own 😄)

twilit zinc
#

@floral bluff do you have gh auth? ie. can you list issues in dagger/dagger

floral bluffBOT
# twilit zinc <@1506562465381744641> do you have gh auth? ie. can you list issues in dagger/da...

Yes. gh auth status shows I’m logged in to github.com as shykes, and gh issue list -R dagger/dagger --limit 5 works. Latest listed issues I can see:

  • #13174 sdk/java: mvn versions:set fails with Maven Central 429 due to unnecessary dependency resolution
  • #13169 WithServiceBinding + WithExec fails: dnsmasq addnhosts has IP entries with empty hostnames
  • #13158 TUI: Show meaningful progress for container image pulls
  • #13157 Workspace.moduleSource(): load a module from a workspace
  • #13154 dag.Git() does not work with Git LFS repos
tight topaz
#

@floral bluff how badly does this conflict with the workspace branch ?

floral bluffBOT
tight topaz
#

@floral bluff crickets

floral bluffBOT
# tight topaz <@1506562465381744641> *crickets*

Sorry, I did answer but my relay post failed locally: the configured $AGENT_RELAY_URL was pointing at 127.0.0.1:18080 and nothing was listening there. Result: not badly. Merge simulation against origin/workspace shows one real conflict, core/workspace.go. The overlapping files core/integration/engine_persistence_test.go, core/module.go, and core/modulesource.go auto-merge. The workspace conflict looks mechanical: workspace changed the Workspace struct to Cwd/ConfigFile/LockFile plus compatWorkspace, while the PR added persistence for the old workspace fields, so the fix is to port the persistence payload/encode/decode onto the new workspace shape.

tight topaz
#

@twilit zinc FYI ^ not sure what the issue was

twilit zinc
tight topaz
#

<@&1506565370385793125> How are you testing the conflicts ?

floral bluffBOT
# tight topaz <@&1506565370385793125> How are you testing the conflicts ?

I used Git’s in-memory merge, so it doesn’t touch the working tree:

git fetch origin main workspace pull/13173/head:refs/remotes/origin/pr-13173-check
git merge-tree --write-tree --name-only --messages origin/workspace origin/pr-13173-check

That reports the conflicted file list and merge messages. For overlap context I also compared touched files from the shared base:

base=$(git merge-base origin/workspace origin/pr-13173-check)
comm -12 \
  <(git diff --name-only $base origin/workspace | sort) \
  <(git diff --name-only $base origin/pr-13173-check | sort)

So the conflict claim is based on Git’s merge machinery, not just eyeballing diffs. If I wanted to verify the actual resolution, I’d create a throwaway worktree/branch, perform the merge there, resolve core/workspace.go, then run focused compile/tests.

tight topaz
floral bluffBOT
# tight topaz I'm not familiar with git merge-tree, but why dont you just use git merge like a...

Fair. I used merge-tree because it is quick and doesn’t mutate the checkout, but I just did the normal-human version in a throwaway worktree too:

git worktree add --detach /tmp/dagger-workspace-merge-check origin/workspace
git -C /tmp/dagger-workspace-merge-check merge --no-commit --no-ff origin/pr-13173-check

Same result: merge exits 1 with exactly one unmerged file, core/workspace.go. Git auto-merges core/integration/engine_persistence_test.go, core/module.go, and core/modulesource.go. So the previous result checks out with an actual git merge; I just avoided doing it in the main checkout.

tight topaz
#

@floral bluff is there a tricky corner case i'm missing reviewing pr 13173 ?

floral bluffBOT
tight topaz
#

@amber thicket is the removal of fuse-overlayfs in engine/snapshots/blobs.go on purpose?

amber thicket
tight topaz
#

@floral bluff any updates ?

floral bluffBOT
# tight topaz <@1506562465381744641> any updates ?

Yes. I found one corner I’d double-check, but my relay was down when I tried to post it.

The suspicious edge is persisted lists whose elements are snapshot-owning objects (Directory, File, maybe Container), especially if those elements are detached/dynamic and not also persisted as standalone cached results. In dagql/cache_persistence_self.go, list encoding recursively encodes each item, but only stores itemEncoding.Envelope; it drops itemEncoding.SnapshotLinks. That is fine if every object-valued list item also has its own persisted result row. If not, a materialized Directory/File inside a list envelope can decode after restart without the result_snapshot_links row that loadPersistedImmutableSnapshotByResultID requires.

So I’d ask/check: can a persistable cached list contain snapshot-owning object items that are only embedded in the list envelope? Existing list codec tests cover nested scalar-ish results, not snapshot-owning object list items.

amber thicket
# floral bluff Yes. I found one corner I’d double-check, but my relay was down when I tried to ...

No, that can't happen. It doesn't make sense for there to be detached snapshot owning objects in a list. That's not possible to end up in that situation. It's not even possible to end up with a detached snapshot owning object in general. I mean, there can be very ephemeral ones, like in the course of a single operation, but after an operation has succeeded, then they will be attached. And the only time this persistence and loading of persistent state happens is after and before operations are occurring. So it doesn't It doesn't make sense.

floral bluffBOT
twilit zinc
#

:thread-owner on

floral bluffBOT
#

Thread owner set to true.

twilit zinc
#

:effort xhigh

floral bluffBOT
#

Effort set to xhigh.

twilit zinc
#

:leave