#🧵MERGING EGRAPH 🧵
1 messages · Page 1 of 1 (latest)
I'm starting a thread to coordinate the final push... Just to make sure we don't drop the ball across timezones
cc <@&946480760016207902>
What I know:
- @fiery lake is working on the last red test (python sdk provisioning test). A fix is incoming
- 3 conflicts remaining but trivial
- Lint errors remain. @unique steppe is on it
- One fix that was merged in main yesterday (builtin dang sdk loading go/ts dependencies) has been reverted on main, because it conflicts too much. @fiery lake has a new fix on top of egraph - but for now keeping it in a separate branch
The python test issue has been solved.
Just pasting here for the record a part of the analysis of this bug, as it might need a follow-up:
The Python SDK has always piped the engine's stderr without reading it — a latent deadlock waiting for the buffer to fill. Commit 9baf16e9e started logging every dagql cache prune at Info level, which on this branch's slow suite is enough output to fill the 64 KB pipe and wedge engine shutdown. The pre-existing bug just stayed dormant on main because the engine never wrote enough.
@unique steppe do you need help on the linting issues?
I'm 95% sure it's incomplete in this current form. This fixes the deadlock, but I don't think the lifecycle is actually correct (yet)
Once we hand proc.stderr off to the drain thread and then set proc.stderr = None, Popen.__exit__ no longer owns or closes that wrapper. The full integration repro now passes, but it also starts emitti ResourceWarning for an unclosed _io.TextIOWrapper, which looks like this exact pipe.
So I think the shape here is right, but it's incomplete: if the drain thread takes ownership of the pipe, it also needs to close it explicitly instead of relying on Popen after we've detached it.
Mmh potentially, it's not as straightforward as I woudl think at first (some casts are mandatory depending on the build tag, etc etc). Finishing what i have, then we can sync on top of that
Silly question but: is this the source of truth egraph: github.com/sipsma/dagger#egraph ?
yes
__Status on the linting issues: __
Todos
☒ Batch 1a: gofmt two files
☒ Batch 1b: fix duplicate imports (ST1019)
☒ Batch 1c: remove unnecessary casts (unconvert)
☒ Batch 1d: simple staticcheck/gocritic fixes
☐ Batch 1i: remove unused nolint directives
☐ Verify batch 1 with go build + go vet
☐ Batch 1e: ineffassign (per-site verification)
☐ Batch 1f: nilerr (per-site verification)
☐ Batch 1g: govet nilness fixes
☐ Batch 1h: govet copylocks
☐ Batch 2: remove confirmed dead code
☐ Batch 3a: remove WithSkipWorkspaceModules calls (18 sites)
☐ Batch 3b: migrate trace.NewNoopTracerProvider
☐ Batch 3c: migrate strings.Title to cases.Title
☐ Batch 3d: archive/chrootarchive deprecations (research first)
☐ Batch 3e: cfg.ArgsEscaped (flag to user)
☐ Final verification: go build + go vet
I'll push those (but first the fix for the python leak), manually and carefully reviewing each
I'll handle Batch 2 and 3
Doing some QA on the branch 
Slightly disappointed that dagger -m github.com/dagger/dagger functions on a warm cache still takes 10 seconds 😭
(Same result from a local checkout)
🙋 May I propose expanding the ci-bootstrap check before merging egraph -> https://github.com/dagger/dagger/pull/12989
To raise the bar for "green"
So... we merge tomorrow? 🙂
YESS 👼
Thanks for the fix of the fix 🙂
running dagger check go:lint and continuing to fix them
I now have a clean go:lint. Just running some more tests on my machine first, then push on the branch (and see if it didn't broke something else...)
But it should be close to a green CI 🤞
It's getting better.
On my machine, go:lint is clean.
I also fixed a bug in check -l/generate -l that was doing a weird display, but that's good now.
But on the CI I still have go:lint red. Always at the same place, when linting core/integration/testdata/modules/typescript/ifaces/. I don't know why
https://dagger.cloud/dagger/checks/github.com/dagger/dagger@f902bda10754fba1e8d0f539e5b5c30376b02e54?check=go:lint&listen=e4b42b5ff62e376e&listen=ba874410e1aa662c&listen=e20adef307f8c15b
If anyone has an idea why it fails on CI, and why it constantly passes on my machine, I'm all ears.
The core/integration/.../ifaces issue is also happening on other PRs (not egraph), for instance https://dagger.cloud/dagger/checks/github.com/dagger/dagger@c8a2199d335555b4030aab2bc521ea44fc441df3?check=go:lint&listen=270fdb4a7a251ce5&listen=18edd7a3ea1004b9

And it's green 🎉
I've also reviewed the changes made to fix go lint, there was way more than I expected, with a lot of duplicated code, unused parameters, dead code, etc.
It's a little scary when reading the changes, especially without to have a full understanding of the changes, but they seemed ok to me.
My manual tests went nicely.
Thanks! Will review everything this morning that I couldn't yesterday and we should be good to merge today