#maintainers
(opening a separate PR)
Caught an over-complicated claude-engineered lib. Removing
(my fault I think)
~~Good to know that we segfault when we can't access the registry~~
panic: failed to resolve image "docker.io/library/golang:1.25.3-alpine" (platform: "linux/amd64"): failed to resolve source metadata for docker.io/library/golang:1.25.3-alpine: unexpected status from HEAD request to https://registry-1.docker.io/v2/library/golang/manifests/1.25.3-alpine: 429 Too Many Requests
goroutine 1 [running]:
github.com/dagger/dagger/cmd/engine/.dagger/build.(*Builder).qemuBins(0xc0002bc160?, {0xdddcf0, 0xc0002be000})
/src/cmd/engine/.dagger/build/builder.go:346 +0x5ba
github.com/dagger/dagger/cmd/engine/.dagger/build.(*Builder).Engine(0xc0002bc160, {0xdddcb8, 0xc000181290})
/src/cmd/engine/.dagger/build/builder.go:238 +0x163e
main.(*DaggerEngine).Container(0xc0000c5980, {0xdddcb8, 0xc000181290}, {0x0, 0x0}, {0xcc8493, 0x6}, 0x0, {0x0, 0x0}, ...)
/src/cmd/engine/.dagger/main.go:111 +0x46f
main.invoke({0xdddcb8, 0xc000181290}, {0xc000480600, 0x587, 0x600}, {0xc0003c22c0?, 0xc000167b98?}, {0xc0001556b0?, 0xdda5b8?}, 0xc000167df0)
/src/cmd/engine/.dagger/dagger.gen.go:364 +0xf3b
main.dispatch({0xdddc48, 0x136f020})
/src/cmd/engine/.dagger/dagger.gen.go:258 +0x8f6
main.main()
/src/cmd/engine/.dagger/dagger.gen.go:178 +0x16f
Update: we seem to panic on purpose
@civic yacht was there a reason to panic rather than propagate the error? (I have a small PR ready that propagates the error instead)
having not looked at all it might be intentional. if the only possible resolution is for the whole command to fail, and it simplifies a bunch of call sites, it's a reasonable shortcut to take
(reasonable because this is just CI glue, not production, of course)
Not to derail anything here, but sometimes I check what you guys are talking about here, and I'm totally disconnected from the topic, but I just love reading how engaged you all are on Dagger. I could be scrolling in bed reading something here, and it's awesome that you guys collab here so much.
Thank you that means a lot! It's one of my favorite parts of the job. The fact that you can just hang out here and follow, is a big part of the fun
no I have no idea why it panics, returning the error is much better
we should just livestream ourselves doing oss maintenance
@still garnet is it possible to trace an sse Event from http://localhost:$DAGGER_SESSION_PORT/v1/logs during Dagger-in-Dagger back to the ID of the Service that it came from? By "ID" I mean something that I can pass to dag.LoadContainerFromID
an sse Event from container | from nginx | asService | start for reference:
id: 15
event: logs
data: {"resourceLogs":[{"resource":{"attributes":[{"key":"dagger.io/engine.name","value":{"stringValue":"3e409e5fc224"}},{"key":"host.name","value":{"stringValue":"3e409e5fc224"}},{"key":"service.name","value":{"stringValue":"dagger-engine"}},{"key":"service.version","value":{"stringValue":"v0.19.6"}}]},"scopeLogs":[{"scope":{"name":"dagger.io/core"},"logRecords":[{"timeUnixNano":"1763670459632001339","body":{"stringValue":""},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}},{"key":"dagger.io/logs.verbose","value":{"boolValue":true}},{"key":"stdio.eof","value":{"boolValue":true}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"ararAbM5Dts="},{"timeUnixNano":"1763670459632405785","body":{"stringValue":""},"attributes":[{"key":"stdio.stream","value":{"intValue":"2"}},{"key":"dagger.io/logs.verbose","value":{"boolValue":true}},{"key":"stdio.eof","value":{"boolValue":true}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"ararAbM5Dts="}]},{"scope":{"name":"dagger.io/engine.buildkit"},"logRecords":[{"timeUnixNano":"1763670459692083721","body":{"stringValue":"/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration\n/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="},{"timeUnixNano":"1763670459693363561","body":{"stringValue":"/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="},{"timeUnixNano":"1763670459706645637","body":{"stringValue":"10-listen-on-ipv6-by-default.sh: info: Getting the checksum of 
/etc/nginx/conf.d/default.conf\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="},{"timeUnixNano":"1763670459712897243","body":{"stringValue":"10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="},{"timeUnixNano":"1763670459713321916","body":{"stringValue":"/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh\n/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="},{"timeUnixNano":"1763670459715893880","body":{"stringValue":"/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="},{"timeUnixNano":"1763670459717365448","body":{"stringValue":"/docker-entrypoint.sh: Configuration complete; ready for start up\n"},"attributes":[{"key":"stdio.stream","value":{"intValue":"1"}}],"traceId":"eYweWCGLJEm0uzznrrBvzg==","spanId":"VrtwDLP4Qik="}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.37.0"}]}
those logs are tied to span IDs, which you can extract with some shell magic (fish ahead, sorry:)
❯ for span64 in (wl-paste | jq -r '.resourceLogs[].scopeLogs[].logRecords[].spanId' | sort | uniq); echo $span64 | base64 -d | hexdump; end
0000000 b66a 01ab 39b3 db0e
0000008
0000000 bb56 0c70 f8b3 2942
0000008
so if you browse the trace in Cloud you could try navigating to ?span=b66a01ab39b3db0e or ?span=bb560c70f8b32942
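For anyone doing the same conversion outside a shell, here is a minimal Go sketch. One caveat worth checking first: plain `hexdump` prints 16-bit words in host byte order, so on a little-endian machine each byte pair in the output above is swapped relative to the raw byte sequence. Decoding the base64 directly avoids that. The function name `spanIDHex` is made up for this example.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// spanIDHex converts a base64-encoded spanId (as it appears in the
// /v1/logs JSON payload) into lowercase hex, byte by byte.
func spanIDHex(b64 string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return "", fmt.Errorf("decode span ID %q: %w", b64, err)
	}
	return hex.EncodeToString(raw), nil
}

func main() {
	// The two span IDs from the log event quoted earlier.
	for _, id := range []string{"ararAbM5Dts=", "VrtwDLP4Qik="} {
		h, err := spanIDHex(id)
		if err != nil {
			panic(err)
		}
		fmt.Println(id, "->", h)
	}
}
```

For the two IDs in the event this yields 6ab6ab01b3390edb and 56bb700cb3f84229, so if the word-swapped values don't resolve in the ?span= parameter, these are the ones to try.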
any way to do that programmatically during Dagger-in-Dagger? or is that data only exported to Cloud?
call data is all encoded on span attributes, but each span only has its local call data, if that makes sense. like for foo(1).bar(2).baz(3), baz will have baz(3) and point to the digest of baz(2) which is expected to have its own span, which points to foo(1) and so on. technically all the data you need is in the telemetry database, but there isn't a tool specifically to "re-construct" a call given a span ID
depending on how much info you need, it could be enough to look up the span in the telemetry DB, grab the dagger.io/dag.call attribute, and decode it using https://www.protobufpal.com/ + https://github.com/dagger/dagger/blob/main/dagql/call/callpbv1/call.proto
the telemetry DB is internal to the engine though, yeah? so I wouldn't be able to access it from within a module
right, yeah
but, if your dagger-in-dagger is able to read from /v1/logs, you could also read from /v1/traces
indeed, I can see the span events from /v1/traces. I'm looking at both and trying to find the relation now
ok, I see the spanId from the log event embedded in a span event. I see that the name is Container.asService and the value of the associated dagger.io/dag.call is ChV4eGgzOjQ3OGMyMTg4Y2FlMTFhNTcSCwoHU2VydmljZRgBGglhc1NlcnZpY2VKFXh4aDM6YjU4OGM4MTgwOTU0MmFlNlIHdjAuMTkuNg==
you're saying that I should be able to loadServiceFromID using that last value?
unfortunately no - what you have there is just a single component of a full DAG which is what IDs are, you would need to also find the rest of the spans that it depends on. they're split up like this to avoid redundant data
decoding that value yields:
{
  "args": [],
  "receiverDigest": "xxh3:478c2188cae11a57",
  "type": {
    "namedType": "Service",
    "elem": null,
    "nonNull": true
  },
  "field": "asService",
  "nth": 0,
  "module": null,
  "digest": "xxh3:b588c81809542ae6",
  "view": "v0.19.6",
  "isCustomDigest": false
}
so then you'd need to find which span has dagger.io/dag.digest == "xxh3:478c2188cae11a57"
same for any IDs in args, etc. etc.
so yeah - it's possible but you'd likely want to hand-roll a tool for it
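A rough sketch of what such a hand-rolled tool might do for the receiver chain only, assuming the spans have already been collected and indexed by their dagger.io/dag.digest attribute. The `Call` struct and `chain` function are hypothetical, and only the asService entry in the sample data comes from the decoded payload above; the other two entries are invented to complete the chain. IDs nested in args would need the same lookup and are ignored here.

```go
package main

import "fmt"

// Call holds the subset of the decoded dagger.io/dag.call fields
// needed to follow receiver links. Hypothetical type for illustration.
type Call struct {
	Digest         string
	ReceiverDigest string
	Field          string
}

// sampleCalls is indexed by digest. Only the asService entry is taken
// from the thread; the other two are placeholders.
var sampleCalls = map[string]Call{
	"xxh3:b588c81809542ae6": {Digest: "xxh3:b588c81809542ae6", ReceiverDigest: "xxh3:478c2188cae11a57", Field: "asService"},
	"xxh3:478c2188cae11a57": {Digest: "xxh3:478c2188cae11a57", ReceiverDigest: "xxh3:0000000000000001", Field: "from"},
	"xxh3:0000000000000001": {Digest: "xxh3:0000000000000001", Field: "container"},
}

// chain follows receiverDigest links from digest back to the root and
// returns the field names in call order (root first).
func chain(byDigest map[string]Call, digest string) ([]string, error) {
	var fields []string
	seen := map[string]bool{}
	for digest != "" {
		if seen[digest] {
			return nil, fmt.Errorf("cycle detected at digest %s", digest)
		}
		seen[digest] = true
		c, ok := byDigest[digest]
		if !ok {
			return nil, fmt.Errorf("no span found for digest %s", digest)
		}
		fields = append([]string{c.Field}, fields...)
		digest = c.ReceiverDigest
	}
	return fields, nil
}

func main() {
	fields, err := chain(sampleCalls, "xxh3:b588c81809542ae6")
	if err != nil {
		panic(err)
	}
	fmt.Println(fields)
}
```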
ok ok ok that makes a ton of sense.
this might help: https://github.com/dagger/dagger/blob/main/dagql/dagui/extract.go
yeah I'm trying to build a tool that does this, just trying to understand the lay of the land first
not something you can readily use, but maybe repurpose / use as reference
one more: can you point me to the struct that most of Dagger uses to access the telemetry DB?
it seems to me that I'm basically reconstructing a subset of Dagger's telemetry DB in my module, and if I were to try to move this code into dagger core I'd just use the telemetry DB directly, so I'd like to try to make them as interchangeable as i can
I would recommend using https://github.com/dagger/dagger/blob/main/dagql/dagui/db.go - it has everything you'd need for what you're trying to do, it's the high level representation that both our TUI and web UI uses. You just export OTel data to it and it maintains state
thanks a bunch! excited to have some insight and direction
another pointer: if you're running a dagger command to generate this telemetry, you can just set the standard OTEL_* env vars around it, so you don't need to hit those internal engine /v1/logs endpoints. We have a test suite that wires those up to point to a local *dagui.DB: https://github.com/dagger/dagger/blob/87289ede5a3220de4dc1d338785c879769a50afa/dagql/idtui/golden_test.go#L322 + https://github.com/dagger/dagger/blob/87289ede5a3220de4dc1d338785c879769a50afa/dagql/idtui/golden_test.go#L394-L396
@tidal spire good and bad news
it works! but also that's a lot of output
must be every time engine-dev is a dependency?
Oh right.. makes sense
I found that the OTEL env vars aren't supported by the Dagger-in-Dagger client, only the CLI, which was a bit of a headache when trying to develop this
I can pretty much just do a conditional on DAGGER_SESSION_PORT and use sse if it's set and OTEL env vars otherwise. In fact, that's probably what I will do so that it works in both contexts
ah, yeah that happens because WithEnvironmentVariable is only passed to the auto-started dagger-session, when you dagger run it has already been started, so the OTEL_* vars need to have been set on the outside
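The "conditional on DAGGER_SESSION_PORT" idea could look roughly like the sketch below. The `source` type and `pickSource` function are hypothetical. OTEL_EXPORTER_OTLP_ENDPOINT is the standard OpenTelemetry variable, used here as a stand-in for "the OTEL_* env vars" since the thread doesn't say which ones the tool reads.

```go
package main

import (
	"fmt"
	"os"
)

// source describes where the tool should read telemetry from.
type source struct {
	Kind     string // "sse" or "otlp"
	Endpoint string
}

// pickSource prefers the in-session SSE endpoint when running
// Dagger-in-Dagger and falls back to the OTel exporter endpoint otherwise.
// getenv is injected so the choice can be exercised without touching
// the real environment.
func pickSource(getenv func(string) string) source {
	if port := getenv("DAGGER_SESSION_PORT"); port != "" {
		return source{Kind: "sse", Endpoint: "http://localhost:" + port + "/v1/logs"}
	}
	return source{Kind: "otlp", Endpoint: getenv("OTEL_EXPORTER_OTLP_ENDPOINT")}
}

func main() {
	s := pickSource(os.Getenv)
	fmt.Printf("reading telemetry via %s from %q\n", s.Kind, s.Endpoint)
}
```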
@civic yacht I'm looking at this trace and can't tell what triggered that particular filesync... Is there a trick?
@tidal spire I just pushed a working .env.gha. What's the cleanest way to do that remove move?
When the engine ignores your ignore filter
like a bash step in the workflow I guess?
ah
debugging now
looking
I did it in the workflow file, not the action. We'll have to repeat it in slightly more places (grep -r .docker/config.json .github) but simpler that way IMO
Ah actually I don't think that's what broke it
Turns out my carefully tuned ignore filter for engine source, was silently ignored the whole time
Now that it's not ignored... I bet there are tons of missing files. That explains a lot
ahh ok
yeah it would be convenient to specify a .env file in the dagger/checks action, you're right. I will make an issue and add it later!
@tidal spire another loose end is that it's not just checks.yml that has dagger call --docker-cfg. And at least one of them does it without a checkout, so we need a slightly different .env that also works as an "outer .env". I'm testing that now
(but running into 3 different variations of flakes and other issues, getting a bit overwhelmed)
lmk if this sounds sane https://github.com/dagger/checks/issues/2
@civic yacht when the dust settles I'm going to pick up that async convo we started about making filesync filters more dynamic.. I've been spending a lot of time adjusting filters by hand to match the behavior of the underlying tools. It's probably not an overnight solution, but it would be very useful if we could somehow pull it off
sgtm, I think there's a pretty reasonable option there
yeah first order of business for me is to ask you to explain your "hybrid" idea
So far this is my ignore filter for core engine toolchain (build, test, release).
// +ignore=[
//   "*",
//   "!.git",
//   "!**/go.*",
//   "!version",
//   "!core",
//   "!engine",
//   "!util",
//   "!network",
//   "!dagql",
//   "!analytics",
//   "!auth",
//   "!cmd",
//   "!internal",
//   "!sdk",
//   "sdk/**/examples",
//   "!cmd"
// ]
I reluctantly added .git because otherwise go build complains about not being able to retrieve VCS info - not sure if that's expected
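To sanity-check a filter like this by hand, here is a simplified matcher. It assumes dockerignore-style semantics: patterns are evaluated in order, a pattern applies if it matches the path or any of its parent directories, and the last matching pattern wins. Whether the engine's matcher behaves identically in every edge case is not something this thread confirms. `toRegexp`, `ignored`, and `engineIgnore` (a subset of the list above) are names made up for the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// engineIgnore is a subset of the filter quoted above.
var engineIgnore = []string{"*", "!.git", "!**/go.*", "!core", "!sdk", "sdk/**/examples"}

// toRegexp translates a glob into an anchored regexp:
// "**/" matches zero or more directories, "*" stays within one segment.
func toRegexp(pattern string) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("^")
	for i := 0; i < len(pattern); {
		switch {
		case strings.HasPrefix(pattern[i:], "**/"):
			b.WriteString("(?:.*/)?")
			i += 3
		case strings.HasPrefix(pattern[i:], "**"):
			b.WriteString(".*")
			i += 2
		case pattern[i] == '*':
			b.WriteString("[^/]*")
			i++
		default:
			b.WriteString(regexp.QuoteMeta(string(pattern[i])))
			i++
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// ignored reports whether path p is excluded. Each pattern is tested
// against p and all of its parent directories; later patterns override
// earlier ones, and a leading "!" re-includes.
func ignored(patterns []string, p string) bool {
	parts := strings.Split(p, "/")
	result := false
	for _, pat := range patterns {
		negate := strings.HasPrefix(pat, "!")
		re := toRegexp(strings.TrimPrefix(pat, "!"))
		for i := 1; i <= len(parts); i++ {
			if re.MatchString(strings.Join(parts[:i], "/")) {
				result = !negate
				break
			}
		}
	}
	return result
}

func main() {
	for _, p := range []string{"docs/index.md", "docs/go.mod", "core/directory.go", "sdk/go/examples/main.go"} {
		fmt.Printf("%-28s ignored=%v\n", p, ignored(engineIgnore, p))
	}
}
```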
well at least I'm not uploading docs
@civic yacht I'm getting "secret xxx not found in secret store" in my monolith PR, freshly rebased on main
https://dagger.cloud/dagger/traces/99c179dff1fb5d27cc4d7d07335e9f97?span=1655880dc23c22b6
that's with the stable outer engine right? there is a fix for that error on main
Oooh I forgot to update the pin on the outer engine..
ok lemme know, want to make sure my fix didn't somehow break something else
re-running with latest main
@tidal spire not urgent. I think I'm doing something wrong in the GHA config, the jobs fail with //.docker/config.json: file not found, like $HOME resolves to / or isn't set in the CI runner? weird
that would probably also explain the secret not found errors you're getting since I'm guessing that is treated as a secret
You realize how much better dagger is, when you have to go back to push-and-pray for debugging actual CI
question - why publish 'packages' along with the generated sdk code? because some languages act/behave better with published packages or 'good practice reasons'? Do all sdks publish packages?
I was hacking around on my own stab at a C# SDK and I have modules working and the project runs etc.: dagger init --sdk=github.com/pjmagee/dagger/sdk/csharp/runtime@copilot/add-csharp-sdk. Copilot helped me a lot with navigating sdk/go/runtime and how other SDKs hook into the module runtime system, which was nice.
Then I was thinking, hmm, why even publish a library
Wait, I'm getting confused: the Go SDK doesn't, and I thought the Python one did, but looking at it, it does a special mapping to a generated sdk folder
In the context of Dagger, the term "SDK" has been used as an umbrella term to mean a loose collection of related but distinct things:
- The Dagger module that provides a runtime for other modules (i.e. the Dagger Python SDK provides a Dagger module that can introspect your Python code and load it into the Dagger engine).
- The official client library that you can import into any program to call the core Dagger API. Depending on the native platform, that library might be published on a registry (npm for TypeScript, PyPI for Python), or straight to GitHub (e.g. the Go client library is at dagger.io/dagger, which is a special managed repo).
Ahh, that's the missing piece.
We want to clean up and clarify that terminology, but it's a "measure twice, then measure twice again just to be safe, then cut" situation. Don't want to change the UX too many times and give users whiplash
this is the only outstanding issue for kill-monolith: https://github.com/dagger/dagger/pull/11373#issuecomment-3560851093
The time has finally come to break up our CI monolith into cleanly separated toolchains. This will make our CI simpler, smaller, and hopefully faster! It will also finally allow us to show our own ...
Open issue in dagger/dagger fix toolchain config (default values can only be strings)
I was having a look at this item in "later" part of https://github.com/dagger/dagger/pull/11373.
But my understanding from the actual code and my various tests is that it already works, no? We can pass JSON values, so for instance booleans, ints or arrays of strings in addition to strings.
Or is it to be able to load host resources? Like secrets, files, etc?
yes @tidal spire fixed it this week
(at least for simple json-encodable values)
I merged the env fix. In the integ test for it I did still have to use an absolute path rather than relative to the .env file. Not sure if I just messed up the syntax of relative path or if that's a separate issue, but with an abs path it all works
still SQLITE_BUSY
quick fix for one of the flakes we're hitting in CI: https://github.com/dagger/dagger/pull/11448
Thank you so much!!! Will rebase on it now.
Re: relative paths. Do you mean the path of eg. the secret file? If so, I don't think it's expected to be relative to the .env file (I didn't even think of that..). It should be relative to the client's workdir I guess? Is that also broken?
oh I don't know, I just instinctually expected that if a file:// relative path appeared in .env, it would be relative to .env. And when that didn't work I switched to an abs path and it worked and I didn't try anything else
OK that's good UX feedback. In reality we just parse the value and pass it to dag.Address().<type>(), which is lifted directly from the former CLI implementation.
It's tricky because what seems the most natural to the end user (what you expected) would violate the loose coupling between 1) .env loading and 2) dag.Address().Secret() in a way I'm not sure how to address. Maybe dag.Address() could take an optional workdir argument?
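To make the trade-off concrete, here is a sketch of resolution against an explicit base directory. It is purely illustrative: `resolveFileRef` is a hypothetical helper, not part of dag.Address(), and the only thing taken from the thread is that values use a file:// prefix. The caller decides whether the base is the .env file's directory or the client's workdir, which is exactly the coupling question being discussed.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// resolveFileRef rewrites a relative file:// reference so that it is
// anchored at baseDir. Absolute references and non-file values
// (e.g. op://...) are returned unchanged.
func resolveFileRef(value, baseDir string) string {
	p, ok := strings.CutPrefix(value, "file://")
	if !ok || path.IsAbs(p) {
		return value
	}
	return "file://" + path.Join(baseDir, p)
}

func main() {
	// Same value, two possible anchors.
	fmt.Println(resolveFileRef("file://secrets/token", "/repo/ci")) // relative to the .env location
	fmt.Println(resolveFileRef("file://secrets/token", "/repo"))    // relative to the client workdir
}
```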
What's needed to know which line of source code a trace was created from? I'm looking at a trace and it doesn't match what I see in the code; it would be so cool to just click on it and have it open the relevant GitHub line.
I hit a weird qemu-related error in CI: text file busy -> https://dagger.cloud/dagger/traces/ea300cd205093db422b1dff5e0a72127?span=12b6b921dc7a0249
@civic yacht I think this error is still related to the .env loading & secret? https://dagger.cloud/dagger/traces/02319ae6c5eb2801db117e2d81b08760?span=2aed16d4446785e3
Getting a mysterious test fail on shell autocomplete tests... Manual testing of the same test case looks fine... No relevant diff from main... 🤷‍♂️
https://dagger.cloud/dagger/traces/9d1cdda3213c93615c8ae85011929568?span=15d1d6533c6d4e73
This checks that the dagger shell auto-complete presents the correct option. Manual testing involves loading the same module in dagger shell, typing the same prompt, then hitting <tab> and checking available options
can someone sync https://github.com/dagger/winget-pkgs up to latest
I had a weird bug when I was publishing the winget Dagger release before we merged it in: because my fork was so far behind, something caused an issue with the PRs back into winget when it was producing the artifact. It was something to do with the repo not being in sync with main (a large number of changes in between, since it's a monorepo)
@civic yacht I don't know if you fixed anything relevant in main, or if my failure was just a flake... but I rebased kill-monolith on main and now CI is green
Now all I need is a ✅ ...
@tidal spire
And it's merged
Looks like it broke release-from-main. Makes sense since that one doesn't run on PR push
We could have tested it locally but very hard to do without access to all the secrets & context on how to set them up
We also got 2 failed integration test runs in main. Flakes?
I'm looking into the publish error
this was supposed to go in 11452, but got mysteriously dropped... https://github.com/dagger/dagger/pull/11478
@tepid nova what do you think about { check: { ignore: [] }} in dagger.json? Basically if your module/toolchain/blueprint includes a check that is irrelevant/will never pass for your particular project and you want it automatically filtered out. Like go/test if we installed modules/go as a toolchain
I was thinking we could make it part of "toolchain customization"? Seems like a logical place to have it. wdyt
(which reminds me we should make that change)
yeah i'm back and forth. Initially I was thinking top level check: {} because it can apply to any module, but also you're unlikely to create a check in your parent module and then ignore it, so...
the only missing piece is a blueprint check, but i guess we can address that if it actually comes up
Yeah I think it's in the spirit of blueprints that you can't customize them - take it or leave it
sweet i will pick up the toolchain customization issue and roll that in
(I mean you can customize it indirectly by adding toolchains & customizing those toolchains that were defined in the blueprint)
got a weird failed check on main... No idea where it comes from. https://dagger.cloud/matipan/traces/2a7087207a149a7b7168a3861787b5ed?showHidden=923d89d83f80d13c&showHidden=8096576b781e4f80
Oh wait it's on @astral zealot's org 🤔
@tepid nova I think this is the schema we landed on for toolchain customization, lmk if i forgot something! https://github.com/dagger/dagger/pull/11480
Follow up to #11373
New schema:
{
  "name": "app",
  "engineVersion": "v0.19.4",
  "toolchains": [
    {
...
getting a stack trace trying to dagger call generate
https://dagger.cloud/dagger/traces/73098aa0753e3f908bc2de722285fe8a?listen=d3ee883ebecdbbf2
That trace looks busted no? I can't figure out the origin of the error. Ah I guess it's happening in the top-level function itself.
yeah I think it's coming from the changeset merge but it's not clear in the trace
I think it's because my branch changes the dagger.json schema and daggerDev isn't able to load the main dagger.json 🤔
and also hack/build is broken on main 😢
Oh is it because I moved devBinaries to a toolchain and hack/dev could no longer find it?
Can't figure out this one
based on the stack trace it seems like it must be that changeset on this line is nil in some case
i hit that locally too, i'll split that out to its own pr now
Ah I see. Thanks & sorry
@tidal spire re: CI secrets. I'm thinking we start by adding .env.release to the public repo. I don't mind leaking 1p references, assuming it's not a security issue (I don't see why it would be), then the "leaking config" part is fine IMO
Then we can see what secrets are left and deal with them
yep agreed, I'll do https://github.com/dagger/checks/issues/2
on that note - what's this about .env.<foo>? how's that work? was just looking for something like that earlier (i think)
Oh it's a super cool feature. The way it works is, you manually run mv .env.foo .env in your CI script
my dilemma: I have an .env that I want to commit half of:
❯ cat .env
## CI STUFF (fine to commit to private repo)
PublishApp_StaticAssetsBucketAccessKey='op://cloud/blah/AWS_ACCESS_KEY'
PublishApp_StaticAssetsBucketSecretKey='op://cloud/blah/AWS_SECRET_ACCESS_KEY'
## PERSONAL STUFF
ANTHROPIC_API_KEY="op://Dagger/Anthropic/password"
Dev_GithubToken="op://Private/GitHub/token"
Dev_KagiToken="op://Private/Kagi/token"
The plan was to add dagger --env-file=NAME but I had a bug with nested clients and I never got it to work
what about something like dagger --with-env-file .env.cistuff that merges all --with-env-file and a .env if present?
or just --env-file
Actually I think we have the same issue...
- .env.gha for GitHub Actions-specific stuff -> specifically ~/.docker/config.json
- .env.release for Release-specific stuff.
Doesn't make sense to tightly couple them. But will often need both.
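A sketch of what "merge several env files" could mean in practice. Neither --env-file nor --with-env-file exists according to this thread, so everything here is hypothetical: `parseEnv` is a deliberately naive KEY=VALUE parser (no escapes, no multi-line values), and "later layers override earlier ones" is an assumed precedence rule, not an agreed design. The sample keys are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEnv reads KEY=VALUE lines, skipping blanks and # comments.
// Surrounding single or double quotes are stripped naively.
func parseEnv(content string) map[string]string {
	vars := map[string]string{}
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		vars[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(value), `"'`)
	}
	return vars
}

// mergeEnv combines layers in order; a key in a later layer replaces
// the same key from an earlier one (assumed precedence).
func mergeEnv(layers ...map[string]string) map[string]string {
	merged := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	gha := parseEnv("## GHA-specific\nengineDev_dockerClientConfig='file://~/.docker/config.json'\n")
	release := parseEnv("Release_Token=\"op://vault/item/field\"\n")
	for k, v := range mergeEnv(gha, release) {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```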
I'm also wondering if it might make sense to allow reading user defaults from system env directly also? Sometimes I find myself wanting to do eg. engineDev_dockerClientConfig=~/.docker/config.json dagger call foo bar, but I can't - have to manually create the .env even if it's a throwaway. Seems artificially limited.
RFC on reducing number of engine images we publish from 4 to 2: https://github.com/dagger/dagger/pull/11482
Immediate motivation being to not have to deal with 3 different distro libcs when building the engine w/ CGo (context), but also seems like some worthwhile cleanup either way
Relevant: https://github.com/dagger/dagger/issues/5668
@tidal spire when you're back (tomorrow is fine) can you send me the actual 1p references? that way I can start testing release calls locally
@tepid nova https://github.com/dagger/dagger/pull/11483
I'm working on a POC of sub-checks. Can someone run dagger call go tests and tell me if that list of tests seems like the right granularity for our sub-checks? If each line in that list were its own Dagger check, would that be useful? Are there tests missing, or is it the wrong level of detail?
dagger -m github.com/dagger/dagger call go tests
How can I go about renaming an enum in the API? Specifically I have the ExistsType (under https://github.com/dagger/dagger/blob/main/core/directory.go#L1638-L1643 ) which I want to rename to FileType.
The issue is old client code might reference dagger.ExistsTypeRegularType which will have to be changed to dagger.FileTypeRegularType -- it could catch some users off guard if they upgrade dagger then all of a sudden their code breaks.
I thought about leaving it as is; however, I'm then running into issues when calling:
ExistsTypes.Register("REGULAR_TYPE", ...)
FileTypes.Register("REGULAR_TYPE", ...)
where the same "REGULAR_TYPE" causes duplicate, conflicting, variables to be defined.
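For the Go client side of the question, one language-level option is a type alias plus aliased constants, so old identifiers keep compiling while pointing at the new type. This is only a sketch of that Go technique, using the identifier names from the message above. It says nothing about how the enum is registered in the engine schema or how the generated clients would emit it, which is where the actual conflict is.

```go
package main

import "fmt"

// FileType is the new name for the enum.
type FileType string

// FileTypeRegularType is the new constant name.
const FileTypeRegularType FileType = "REGULAR_TYPE"

// ExistsType is kept as an alias (note the "="), so it is the same type
// as FileType rather than a distinct one.
//
// Deprecated: use FileType.
type ExistsType = FileType

// ExistsTypeRegularType keeps old call sites compiling.
//
// Deprecated: use FileTypeRegularType.
const ExistsTypeRegularType = FileTypeRegularType

// describe accepts the new type; values spelled with the old names are
// accepted too because of the alias.
func describe(t FileType) string {
	return "type=" + string(t)
}

func main() {
	var old ExistsType = ExistsTypeRegularType
	fmt.Println(describe(old))
	fmt.Println(old == FileTypeRegularType)
}
```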
release dotenv
@tidal spire you need a review?
checks are failing but I think it must be failing on main because it's a golangci-lint filter issue
It took forever but I ran check go:lint on main, and it succeeded 🤷‍♂️
Oh it's the nested one under ci:bootstrap
Let me try that
running it locally
A small detail that I love: ci:bootstrap runs a full nested dagger engine, then runs a bunch of checks inside that. But those nested checks still show up cleanly as sub-checks in the trace
yeah super nice!
so the go linting is failing because it's not going through the codegen check like the go:lint one does, I guess
and I guess it is missing the ignore list from modules/go in the top level dagger.json
but I don't understand how it would pass on main then
ci:bootstrap is passing for me on main
(just to confirm)
@tidal spire how can I help?
all fixed up: https://github.com/dagger/dagger/pull/11480
I'm sure a bunch of us have seen this already but putting a note here so it's discoverable in Discord. The TestEngine/TestLocalCacheGC test from test-split/test-cli-engine is currently pretty flaky. I see failures like Expected used bytes to decrease below 1073741824 from gc, got 2160634505 pretty often. Rerunning the job from GitHub has been the workaround. Here's a trace for reference https://dagger.cloud/dagger/traces/a7e049c1a0adc2c5b61bdef91f2f490b#c8511b570a93c34a:L218
Hey, noob question but how can I run a specific test with the new test toolchain?
./hack/with-dev go test -v ./core/integration -run '^TestCache/TestFunctionErrorNotCachedInSession$' -count=1 doesn't show me the logs && dagger call test specific --pkg="./core/integration" --run="^TestCache/TestFunctionErrorNotCachedInSession" doesn't exist anymore
dagger call engine-dev test --run=...
(I'm working on adding support for test selection in checks)
Looking for feedback/ideas.
I was migrating one of my projects to dang + checks + changeset.
Changesets are great to export files to the host, with a nice CLI integration.
I was thinking about doing the same, but for images: having a Dagger function export a generated image to the host (to then be used via other tools).
Today we can do it by returning a Container and using export-image. Like with a Directory and export
So I was wondering how we could do the same for images. I have two ideas in mind:
- to create a new type (name TBD) that wraps a Container, adds a reference and, when the CLI receives it, exports it to the host like we do for changesets (something like ctr.ExportableImage("namespace/image:tag") -> returning a new ExportableImage type or any better name)
- allow adding a reference to a container, and when the CLI receives a container with a reference, allow exporting it (something like ctr.WithRef("namespace/image:tag") -> still returning a Container but with a ref so we can export it)
Both ideas are very close, just the question of creating a dedicated type or not.
Is that something we'd like to do?
Is there any better solution? Or do you think one of the two is already good enough?
Let me know
I wonder if that could make sense as a // +generator with a Container return type?
(And some way to set the ref, yeah)
yeah that's the idea behind it. I should soon have a generator POC working with changesets; the image was second on my list
I think it would be problematic for a function to hardcode a location on an external registry ("namespace/image:tag"). That would break the sandbox.
But, dagger could expose its own registry with all its artifacts. That would make more sense to me. So instead of dagger auto-pushing to docker - docker can pull from dagger when needed.
// +artifact
func (app *MyApp) Server() *dagger.Container {
	// ...
}
dagger registry serve -l :4242
docker pull localhost:4242/myapp/server:latest
related--I've been using dagger to do exactly that: https://github.com/frantjc/sindri?tab=readme-ov-file#sindri---
Oh I love that!
That's been on my hack project list forever, cc @charred lotus. How's the experience of building it so far? Do you stream the layers straight from the cache, or private-push to an intermediary registry?
There's 2 "backends", and both use some form of intermediary storage for the layers and manifest outside of Dagger.
The backend can either be:
- Another OCI registry. In this case I use Container.Publish() and then redirect or proxy requests to that registry.
- A gocloud.dev/blob.Bucket. In this case, I use Container.Export(), unpack the tarball to the Bucket using github.com/google/go-containerregistry, and then either proxy or redirect requests to it using signed URLs.
Streaming the layers directly from Dagger is something I want to do, but I couldn't find a way to get ahold of the layers or manifest from Dagger directly.
@civic yacht is this a reasonable way to produce a TypeDef from an ObjectTypeDef?
func typedefFromObject(o *ObjectTypeDef) *TypeDef {
	// Wrap the object definition in a TypeDef of kind Object.
	return &TypeDef{
		Kind:     TypeDefKindObject,
		AsObject: dagql.NonNull(o),
	}
}
yep, shipit 
I'm not sure I see the difference between that and exporting a file to the host FS.
In case that wasn't clear, this wasn't to push to a registry, only to export the image into the host image store, as we do with export-image
Just automated, the same way we have changeset to apply host fs changes instead of calling export
Yes I like the idea of auto-exporting to an image store. But the function should not hardcode the address of the image to export to ("namespace/image:tag"), just like it can't hardcode the filesystem path to export to, either.
The difference with auto-applying changesets is that the CLI can infer the path to export to from context. What's missing is a way to also infer the address of the container from context. One option is what I suggested: the image name is inferred from the function path; the tag from the module version; and the registry is run by dagger, so that there are no side effects.
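As a sketch of that inference rule, assuming the shape from the earlier docker pull example (registry host, then module name, then function name, with "latest" when no version is known): `imageRef` and its parameters are hypothetical, and lowercasing is applied because image repository names must be lowercase.

```go
package main

import (
	"fmt"
	"strings"
)

// imageRef derives an image address from context instead of having the
// function hardcode one. An empty version falls back to "latest".
func imageRef(registry, module, function, version string) string {
	if version == "" {
		version = "latest"
	}
	return fmt.Sprintf("%s/%s/%s:%s",
		registry, strings.ToLower(module), strings.ToLower(function), version)
}

func main() {
	// Mirrors the MyApp.Server() example and the localhost:4242 registry above.
	fmt.Println(imageRef("localhost:4242", "MyApp", "Server", ""))
}
```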
@tepid nova I have a use case for flexible _ separators for .env files:
Let's say I start with this:
type Dev {
  pub githubToken: Secret!
  pub github(args: [String!]): String! {
    # ... run `gh` with githubToken
  }
}
And want to change to this, so you only need to provide githubToken if you actually use the github tool:
type Dev {
  pub github(args: [String!], token: Secret!): String! {
    # ... run `gh` with token
  }
}
My understanding is right now you use githubToken or MyMod_githubToken as the var name, which means this refactor will require .env files to change to github_token (honestly not sure about casing). If we also allowed all forms like github_token, GITHUB_TOKEN, the env file wouldn't have to change. (Think I suggested this before, the casing and separator rules were the first thing I tripped up on)
normally these are all allowed already
(although there might be bugs)
I tried to cover variations in the tests
huh ok, maybe i'm thinking of an earlier iteration. i didn't actually check if it worked
- but I had already set Dev_GithubToken and assumed that wouldn't work. I think what I really want is just DEV_GITHUB_TOKEN - and for that to work whether it's a constructor or a function arg
yeah that should work too
I made it more flexible after you & others got stuck on lack of flexibility
oh nice, cool then 
but it can still be frustrating to iterate on the right key
let me know if one of those doesn't work!
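For anyone following along, here is a rough Go sketch of the kind of lenient key matching discussed above. It is not the actual Dagger implementation: normalizeKey and lookupArg are hypothetical helpers, and the rule (lowercase, drop "_" and "-", accept an optional module prefix) is an assumption made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKey lowercases a key and drops "_" and "-" separators, so that
// githubToken, github_token and GITHUB_TOKEN all compare equal.
func normalizeKey(key string) string {
	key = strings.ToLower(key)
	return strings.NewReplacer("_", "", "-", "").Replace(key)
}

// lookupArg returns the value of a .env entry whose key matches either the
// bare argument name or the module-prefixed form.
// Hypothetical helper, not Dagger's actual lookup code.
func lookupArg(env map[string]string, module, arg string) (string, bool) {
	bare := normalizeKey(arg)
	prefixed := normalizeKey(module + "_" + arg)
	for k, v := range env {
		if n := normalizeKey(k); n == bare || n == prefixed {
			return v, true
		}
	}
	return "", false
}

func main() {
	env := map[string]string{"DEV_GITHUB_TOKEN": "s3cr3t"}
	v, ok := lookupArg(env, "Dev", "githubToken")
	fmt.Println(v, ok)
}
```

With a rule like this, moving githubToken from a constructor field to a function argument would not force a change in the .env file, as long as the normalized names still line up.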
Quick question: Is there a reason the other SDKs don't use the dagger.json include field for anything?
Just wondering because I've used it in the PHP SDK for a couple of things, I considered adding more to it just to minimize it. But I'm wondering if everyone else has moved onto a more efficient way?
Also second question while I'm at it:
When the PHP API Reference was made, the config was stored in docs/. Is there a reason for this to be stored here, rather than next to the code it's including/excluding? (i.e. in the sdk/php folder)
it's been mostly replaced by contextual dir/file args and setting ignore pragmas there instead. I think there are some less common cases where it's still valid though, like we use it for pulling in extra files: https://github.com/search?q=repo%3Adagger%2Fdagger ..%2Futil%2Fparallel&type=code
I figured as much, I ignored the same patterns in the contextual dir args anyway. I just didn't want to remove it without reason
Some time ago we discussed golangci-lint
Not finished, but I have a go-lint toolchain that has those checks:
$ dagger checks -l
✔ connect 0.2s
✔ load . 0.9s
✔ fetch check information 0.0s
Name Description
dogsled
gocritic
vet Run linters through go vet - bodyclose
errorlint
staticcheck Run staticcheck linter
dupl
Of course once the toolchain is added to a project, a single dagger checks go-lint will call all of them, in parallel, with possible scale-out.
It's not finished yet, I need to add more linters and more configuration (to get the same behaviour as our current linters).
But before investing too much in it, I was wondering what you all think: is it worth it, or are we fine with golangci-lint?
neat - how does it run them? did you have to figure out how each of them run, or is it pretty straightforward build binary => run with ./... or something?
makes sense to me, i know golangci-lint can be pretty overwhelming CPU wise at least when I used to run it locally on a laptop, seems worth scaling out
each linter is different: some take packages (so ./... works quite well), some take files, etc. And ignoring paths (like internal/buildkit) has to be handled.
But all that is done only once, I'm running go list and filtering.
Then I have to configure them one by one. That's the advantage of golangci-lint, a single yaml to configure all of them.
But to configure each linter is not a lot of work.
For instance for errorlint:
let goPackages = base.
  withExec(["go", "list", "./..."]).
  stdout().
  split("\n")
let goPackagesWithoutBuildkit = goPackages.
  reject { file => file.contains("/internal/buildkit") }.
  join(" ")
let runOnPkgsWithoutBuildkit(cmd: String!): Container! {
  base.withExec(["sh", "-c", cmd + " " + goPackagesWithoutBuildkit])
}
pub errorlint: Container! @check {
  runOnPkgsWithoutBuildkit("go-errorlint -errorf -errorf-multi -comparison=false -asserts=false")
}
# @generator
pub errorlintFix: Changeset! {
  runOnPkgsWithoutBuildkit("go-errorlint -fix -errorf -errorf-multi -comparison=false -asserts=false").
    directory(".").
    changes(base.directory("."))
}
There's probably ways to improve it, it's really wip.
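The package filtering step in that snippet can also be sketched in plain Go, which may help when reasoning about it outside of the toolchain. This is only an illustration: filterPackages and linterArgs are hypothetical names, and the package paths in main are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// filterPackages takes the raw output of `go list ./...` and returns the
// packages whose import path does not contain the excluded substring.
func filterPackages(goListOutput, exclude string) []string {
	var kept []string
	for _, pkg := range strings.Split(goListOutput, "\n") {
		pkg = strings.TrimSpace(pkg)
		if pkg == "" || strings.Contains(pkg, exclude) {
			continue
		}
		kept = append(kept, pkg)
	}
	return kept
}

// linterArgs appends the package list to a linter's base argv without
// mutating the base slice.
func linterArgs(base, pkgs []string) []string {
	return append(append([]string{}, base...), pkgs...)
}

func main() {
	// Made-up package paths, standing in for real `go list` output.
	out := "example.com/app/core\nexample.com/app/internal/buildkit/solver\nexample.com/app/cmd\n"
	pkgs := filterPackages(out, "/internal/buildkit")
	fmt.Println(linterArgs([]string{"go-errorlint", "-errorf"}, pkgs))
}
```

Passing the packages as separate argv entries avoids going through sh -c, at the cost of a longer argument list.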
my first impression is they are running quite fast one by one (or all together, I meant without golangci-lint)
@civic yacht FYI based on your explanation earlier. I might (try to) simplify CheckGroup.Run() by simply iterating over its checks and calling Check.Run(). I think that will remove duplicate code. Also de-duplicate the Module field which is currently set in both Check and CheckGroup, and I believe is always set to the same value everywhere anyway
They are deduplicated already since they both call check.run https://github.com/sipsma/dagger/blob/a1cde9744bdd73824ac3b6d0140d0248358012fd/core/checks.go#L280-L280
I'm not sure more deduplication is possible without performance tradeoffs; e.g. we don't want to re-init the server checks are called against (what dagForCheck returns)
Ah I see. Didn't think of the performance impact of dagForCheck. I was just observing that CheckGroup.Run and Check.Run visually seem 99% identical
@charred lotus can we merge https://github.com/dagger/dagger/pull/11483 ?
Yes please! Sorry I think I got pulled into family stuff before I could merge it..
(just clicked "update branch")
that feeling when you get an out of range error, so you add in some extra verbose panics only to not trigger the initial error again
https://github.com/dagger/dagger/pull/11403/commits/8a990e0dac3e75024b71613f94908b02e3b6d72e
still not reproducing it... but silver lining: it's only during the error handling phase.
The only good thing about those horrible debugging situations is that it feels great when you finally solve it
In dagql, can I one-shot selecting a single field of N objects in an array?
For example given this schema:
type Mod {
  tests: [Test!]
}
type Test {
  name: String!
}
Can I do this?
var names []string
dag.Select(ctx, dag.Root(), &names, []dagql.Selector{
	{Field: "tests"},
	{Field: "name"},
})
you can have a method return a dagql.Array[core.YourTypeHere], if that helps? e.g. https://github.com/alexcb/dagger/blob/842e498fe60e157f1e69c356f88741d9559696a6/core/schema/container.go#L2585
Good to know, but in this case I'm querying arbitrary modules, so I don't control their code
hmmm, not sure then. best to ask the dagql experts.
no, that's not supported. Your best bet is to fetch up until the array and then do another dagql.Select for each dagql.AnyObjectResult in parallel
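A toy sketch of that two-step pattern, in case it helps: select up to the array once, then do one select per element in parallel. It deliberately does not use the real dagql API; object, selectField and selectNames are hypothetical stand-ins for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// object is a hypothetical stand-in for a resolved object.
type object map[string]any

// selectField stands in for a select of one field on one object.
func selectField(obj object, field string) (any, error) {
	v, ok := obj[field]
	if !ok {
		return nil, fmt.Errorf("no field %q", field)
	}
	return v, nil
}

// selectNames selects up to the "tests" array, then selects "name" on each
// element concurrently, preserving element order in the result.
func selectNames(root object) ([]string, error) {
	raw, err := selectField(root, "tests")
	if err != nil {
		return nil, err
	}
	tests, ok := raw.([]object)
	if !ok {
		return nil, fmt.Errorf("tests is not an array")
	}
	names := make([]string, len(tests))
	errs := make([]error, len(tests))
	var wg sync.WaitGroup
	for i, t := range tests {
		wg.Add(1)
		go func(i int, t object) {
			defer wg.Done()
			v, err := selectField(t, "name")
			if err != nil {
				errs[i] = err
				return
			}
			s, ok := v.(string)
			if !ok {
				errs[i] = fmt.Errorf("name of element %d is not a string", i)
				return
			}
			names[i] = s
		}(i, t)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return names, nil
}

func main() {
	root := object{"tests": []object{{"name": "unit"}, {"name": "integration"}}}
	fmt.Println(selectNames(root))
}
```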
(or enhancing (*dagql.Server).Select to support that :P)
Ok thanks - it's actually good news for me, I found the root cause of my bug...
I just assumed it worked, since dagger call supports it seamlessly... I guess it's handled manually at the SDK level?
yeah, that's on the other side of the fence - dagql.Select is just an internal helper so it only supports what we've needed so far, dagger call is going through the actual GraphQL API layer which supports everything
Would there be a performance benefit to supporting it in dagql.Server.Select? Or would it be strictly convenience?
just convenience really
@still garnet from a telemetry/UI perspective, do we still need to split selecting the check in 2 distinct selects - one up to the parent, tucked away in a boundary/passthrough span, and one with the last field?
Or do we have a better way to achieve the desired result (hiding noise from users)
From 77 tests failing this morning.. to 3, slowly but surely ahahah
@dagql_experts : is there a performance penalty to calling dagql.Server.Select() on each individual field of a path, vs. one-shotting it in a single select?
there technically would be a little extra overhead to splitting them up, but I highly doubt it's meaningful unless you're doing this in a very hot codepath
While we're at it, is there a difference between the selects and constructing and loading an ID directly ? srv.Load() (in terms of perf)
Same as above, the difference between those is very small
Follow-up question: what's the most reliable way to determine, at runtime, whether the result of my select is an object or an array?
1. Select on dagql.AnyResult
2. Try result.(dagql.AnyObjectResult) -> if ok, it's an object
3. Try ??? -> if ok, it's an array
?
yes but using dagql.UnwrapAs[T] instead of straight up casting, since we have some types that wrap others (e.g. NonNull)
you can do dagql.UnwrapAs[dagql.Enumerable] for array
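To illustrate why unwrapping beats a straight type assertion when values can be wrapped, here is a self-contained toy in Go. It does not reproduce dagql's actual code: wrapper, nonNull, enumerable and unwrapAs are all hypothetical names, loosely modelled on the idea described above.

```go
package main

import "fmt"

// wrapper is implemented by values that wrap another value (hypothetical).
type wrapper interface{ Unwrap() any }

// nonNull is a toy wrapper, standing in for a NonNull-style type.
type nonNull struct{ inner any }

func (n nonNull) Unwrap() any { return n.inner }

// enumerable is a toy stand-in for an array-like result.
type enumerable interface{ Len() int }

type stringArray []string

func (a stringArray) Len() int { return len(a) }

// unwrapAs peels wrappers off v until the value satisfies T, or until there
// is nothing left to unwrap.
func unwrapAs[T any](v any) (T, bool) {
	for {
		if t, ok := v.(T); ok {
			return t, true
		}
		w, ok := v.(wrapper)
		if !ok {
			var zero T
			return zero, false
		}
		v = w.Unwrap()
	}
}

func main() {
	wrapped := nonNull{inner: stringArray{"a", "b"}}

	// A direct assertion misses the array hidden inside the wrapper.
	_, direct := any(wrapped).(enumerable)
	// Unwrapping first finds it.
	_, unwrapped := unwrapAs[enumerable](wrapped)
	fmt.Println(direct, unwrapped)
}
```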
nice thanks
TLDR, with arrays in the mix (for dynamic checks) it's no longer viable to one-shot-select anything. So I'm always going to select each node of the tree individually
Do we have known CI (GHA) issues? I keep getting timeouts on this PR: https://github.com/dagger/dagger/pull/11483
github check status timeouts
<@&946480760016207902> I created a new label for PRs stuck on CI: https://github.com/dagger/dagger/pulls?q=is%3Aopen+is%3Apr+label%3Astuck-on-ci
Use it when you need help figuring out why CI isn't working.
(PLEASE DON'T USE EVERY TIME CI IS RED!)
@wild zephyr to help with your investigation
@tidal spire rebasing dynamic checks on your "ignore toolchain checks".. quick question.
- When looking at mod.ToolchainIgnoreChecks, will the key always match the name of the "mountpoint" for that toolchain? In other words if I walk the main object's fields & functions, and find eg. a function called foo, can I safely lookup mod.ToolchainIgnoreChecks["foo"]?
Asking because in my refactor, I completely dropped the part where we walk each toolchain module when looking for checks. Everything is already mounted in the main module's main object as far as I can tell, so it appears we were doing unnecessary work. So it's easier for me if I don't have to add back a loop over the toolchain modules
If I'm understanding the question correctly, yes. Basically, if I have a toolchain go with a check lint, I would dagger check go:lint, or ignoreChecks: ["lint"]
I'm more worried about the name of the toolchain itself (in your example: go)
In your code: ignorePatterns := mod.ToolchainIgnoreChecks[tcMod.OriginalName]
ah, no it's not the name of the toolchain, it's the name of the function in the toolchain
But there's a higher-level map keyed by toolchain
there is currently no way to ignore checks for an entire toolchain other than listing each of its checks in the ignoreChecks
let me find that in the code, one sec
Yeah that's fine, I'm just talking about that lookup by tcMod.OriginalName. Trying to understand if tcMod.OriginalName is always the same as the "mountpoint" of the toolchain
sorry if my question is confusing
ah I understand, sorry I thought you were asking something completely different lol. Yes I believe that's the case
Nice, thanks!
if that's not the case and you break something then that's a bug at least
I'm not at all familiar with the "toolchain mounting" code
@tidal spire well there's the distinction between:
1. The toolchains[].name field in the main module's dagger.json (eg. go)
2. The name in the toolchain's own dagger.json (eg. my-super-go-toolchain or whatever)
I was worried about situations where those two are different. Wasn't sure if OriginalName points to 2?
Double checking, but I think we tested that
ok summary is yes it should be 1
In ResolveDepToSource (core/modulesource.go:1075) we pass in the depName from the config/dagger.json, and then in the selector we pass that through via withName
nice thanks!
@tidal spire what's the format of the pattern? Is it the same as the check filter argument? eg. foo:**:bar etc?
Does it match against the whole path, or only the individual check name?
(all questions I should have asked in the code review, sorry)
eh good question, I didn't test subchecks
so regexp regular string lookup on top-level check?
yeah check.Match([]string{ignorePattern})
the standard regex or explicit a:b should both work
processing... so it is the standard check filter argument. Ok!
yes sorry more words would have helped
It's perfect actually, I already have the generalized plumbing for walking the tree with include filters... just adding the exclude filters, and we're good
awesome!
Also the plumbing is ready to be used by generate functions (cc @leaden glade), and any other entrypoint functions we want to add in the future
I've left a comment here that's worth discussion if we want to move forward with committing generated files https://github.com/dagger/dagger/pull/11515#issuecomment-3607997691
@civic yacht is this correct about check scale-out?
- CheckGroup.Run() honors --scale-out by iterating over individual checks, and running each through the scheduler.
- Check.Run() does not honor --scale-out: it always runs the check on the current engine. This avoids infinite recursion, in the case where Check.Run() is triggered by a remote engine as part of an upstream scale-out
Yep, all correct
We can change Check.Run() to honor scale out, it just was simpler to have it not for now
OK, still trying to understand what is load-bearing as I refactor
My puzzle for context: I gave a quick demo yesterday, and we discussed UX. We agreed it would be nice if dagger check -l only listed top-level "nodes" in the check tree instead of recursively listing all checks. More like ls rather than find. But that opens a Pandora's box around the best API schema, for example should Check and CheckGroup still be distinct? Or should every node in the tree be a Check, that can have sub-checks? etc
I want to make sure I don't paint myself in a corner by exploring design paths that break scale-out in a fundamental way
None of the API aspects are truly load-bearing to scale-out. e.g. we could pretty easily change the implementation to not even use the Check api (and instead some separate internal one). It just felt like a logical starting place. So if we end up in a situation where changes to the Check api mean we can't use it for scale-out anymore, that's totally fine
Sounds good. I'm confident we can keep scale-out working on checks, even after my refactor, so we don't close any doors or add any unnecessary short-term work
It occurred to me that with my special treatment of named objects in a list (requiring a name() function by convention), I'm approximating maps? In case we want to move this down a layer and add map support in the future
@still garnet in your opinion should we keep Check.completed and Check.passed in the core API? I don't think we even use it in the CLI (we infer all results from the otel stream). Not sure if we will use it in Cloud. Depending on how you look at it, it's either redundant with otel and should be removed; or an important aspect of reducing client dependency on otel, and should be kept and expanded to also include eg. logs: File! via server-side otel collection
I just used it 10 mins ago as part of the logic for syncing state to GitHub check runs
I guess that answers my question
I also like that we don't make otel too load bearing for writing a useful client
can be a slippery slope
(see related convo in #1445555192727867414 )
@still garnet (sorry for the ping-flood). FYI I'm grappling with the implications of making check -l "like ls, not like find".
Basically I don't know when to do a full roll-up of actual leaf-checks, and when to "hide" them and return a list of "virtual checks" that might be parent nodes
basically it breaks assumptions in the current API
On the telemetry side: is it safe to nest spans with the check.name attribute? Will it break your rendering?
I'll experiment and see how it goes
@civic yacht re: honoring scale-out in Check.run(), what would be the simplest way to do that?
1. Add an optional argument run(scaleOut: bool) - parent engine sets it to false when calling sub-engine
2. Check clientMD.EnableCloudScaleOut - parent engine ensures it's set to false for sub-engine (somehow)
3. New contextual information, like clientMD.AlreadyScalingOut or something like that. Same as 2. except the decision to scale-out or not is still left to the sub-engine (it's not actually forbidden from doing another level of scale-out if that turns out to be useful)
4. other?
Check clientMD.EnableCloudScaleOut - parent engine ensures it's set to false for sub-engine (somehow)
That should work, it's already explicitly set to false when one engine scales out to another https://github.com/sipsma/dagger/blob/6ffdc9a8c49e9bacdf249a0dd01f76b9825cba19/engine/server/session.go#L1573-L1576
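A minimal Go sketch of that idea, under the assumption that the flag travels with the client metadata: the parent clears it before handing work to a sub-engine, so the sub-engine runs the check locally and the recursion stops after one hop. Only the EnableCloudScaleOut field name comes from the discussion; clientMetadata, runCheck and the trace are hypothetical.

```go
package main

import "fmt"

// clientMetadata is a toy stand-in for the per-client metadata. Only the
// EnableCloudScaleOut field name is taken from the conversation.
type clientMetadata struct {
	EnableCloudScaleOut bool
}

// runCheck either dispatches the check to a (simulated) sub-engine or runs it
// locally. Before dispatching, it clears the flag on the metadata copy passed
// down, so the sub-engine cannot scale out again.
func runCheck(md clientMetadata, check string, hop int, trace *[]string) {
	if md.EnableCloudScaleOut {
		sub := md
		sub.EnableCloudScaleOut = false
		*trace = append(*trace, fmt.Sprintf("hop %d: dispatch %s to sub-engine", hop, check))
		runCheck(sub, check, hop+1, trace)
		return
	}
	*trace = append(*trace, fmt.Sprintf("hop %d: run %s locally", hop, check))
}

func main() {
	var trace []string
	runCheck(clientMetadata{EnableCloudScaleOut: true}, "go:lint", 0, &trace)
	for _, line := range trace {
		fmt.Println(line)
	}
}
```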
I've been assuming that's how it'll be modeled so go for it - but it might help to have some hint that it's a sub-check (like another attribute), if that's easy enough. Just so we don't have to walk all the way up the span hierarchy to discover whether it's a top-level check
but if there's some reason that's not viable that's OK too, I already found a reasonable way to work without it
OK. At the moment still struggling with impedance mismatch between the flattened model of the Check API, and the tree structure of the refactored underlying plumbing
I might have to bite the bullet, and just merge them - basically exposing the tree-like reality in the schema..
Can we not "just" make it a tree in all representations? like Module.checks: [Check!], and also Check.checks: [Check!]
Yes, that is all pretty trivial now, I have that plumbing ready to go. The part that breaks is when you run + collect results
ie. Check and Checkgroup are chainable (checks().list() or checks().run().list()) in a way that the underlying plumbing is not
I added a RunCheck() to the underlying tree node, it works on any node of the tree - if it's not an actual leaf check, it will roll those up. The problem is, how do I return the resulting tree of passed and completed
I know you guys have a dagger/dagger-for-github action that this is in contradiction to, but I love the dx of this: https://github.com/frantjc/sindri/blob/08bbb1559459778b4c20857dba1c4ee9004d37e4/.github/workflows/ci.yml#L17-L27
tl;dr:
jobs:
  tldr:
    runs-on: ubuntu-latest
    steps:
      - uses: frantjc/actions/setup-tool@v1
        with:
          repository: dagger/dagger
          version: v0.19.7
      - shell: dagger {0}
        run: container | from busybox | withExec echo dagger shell is so nice | stdout
Agreed
@tidal spire @leaden glade @fair ermine re: the idea of merging engine-dev into the top-level toolchain. Unfortunately I can't do it, because of circular dependencies.
@tepid nova Have you thought about having lint return annotations somehow?
No idea how it would work, just planting a seed. Maybe it could build on the existing SourceMap type somehow - https://docs.dagger.io/api/reference/#query-sourceMap
Would need: 1. source location, 2. annotation level (notice, warning, error), 3. title, 4. message (Markdown I think)
Related: we should add fix() functions that hook up those linters that support fixing, and have them return changesets
is that something that we can bubble up in the github action?
only if we use the (slightly crummy) Check Runs API
my rough plan is to just keep using commit statuses for the 99% case, but if something returns annotations, create a dummy check run just so we can show them
sweet
check runs are crummy otherwise because they're designed to keep you in GitHub, so it takes 2 clicks (intermediate page) just to get to the trace UI
but, annotations are sweet
@still garnet what's the alternative to a "check run"?
just push checks from the app, without github asking for it?
check runs vs. commit statuses - they're both just different APIs that the GitHub app uses to add checks to the commit. But when you click a check run you get taken to this intermediate page, instead of the run's 'details' URL, which is annoying
interestingly it supports re-running from that page, but since you have to do a click to get there anyway, you might as well just click into the Cloud page and rerun from there
Update: the TUI does not like nested check spans
Or at least, it does not like my possibly buggy implementation of it
@still garnet current state. Probably parallel is messing with it? (each individual check run is wrapped in a parallel.WithJob())
Repro if you're curious:
dagger -m github.com/shykes/dagger@subchecks call \
engine-dev \
playground \
with-directory --source=https://github.com/shykes/dagger#subchecks --path=. \
terminal
Then:
dagger check helm
Or if you want a glimpse of dynamic checks while you're at it
dagger check go -l
commit statuses
hmm I've actually seen this happen before for regular dagger check runs too, there's definitely a bug somewhere, if this repros it reliably hopefully it's the same bug
Granted my run implementation is in an intermediary state, because of the impedance mismatch I mentioned earlier. But still the telemetry part should work, I copied the pre-existing telemetry code, then wrapped it in parallel
I did drop a bunch of attributes (was hard to keep everything as it was pretty intricate and tightly coupled to the structure I'm refactoring)
@still garnet I think I figured it out. I temporarily removed the scale-out part, but left one snippet that resulted in crucial attributes not being sent. Removing that snippet seems to fix the output.
pushed
Also: I am not running nested check spans... Not yet sure if I will end up doing that. Depends on how the impedance mismatch gets resolved. For now it's just a flat list of leaf checks, nested in regular (parallel) spans
Mmmm @still garnet is it normal behavior that the check spans cannot be expanded at verbosity=1? They only become expandable at v=3
Also no logs are streaming (but maybe that's normal for those checks?)
EDIT: not normal, the same checks run by 0.19.7 have some logs streamed at the top-level
do you have a link to traces for these?
@still garnet testing a possible fix. If it doesn't work, I'll share the trace
(I had dropped dagql.WithNonInternalTelemetry(ctx) when selecting the check value, not sure how important that is. I'm trying to add it back and see if the problem goes away)
and yes it fixed it
Here's the victory trace: https://dagger.cloud/dagger/traces/3cb119ef3d0ec306de9aae18f5fb93f3?span=dadfe9c00da3105d
yeah that's pretty load-bearing - normally a dagql.Select is considered "internal" and marks all inner spans as such - that config overrides it
Well, I temporarily disabled the "tuck-away span" trick, so there's more logs than we want. But I'll fix that too
EDIT: Instead of creating a special tuck-away span like we do today, could I just toggle WithNonInternalTelemetry(ctx) on or off depending on the select? make the last "leg" non-internal, but keep all the parent selects internal? Technically I don't need an extra span for that
Example: dagger check go:lint
- 1st leg: dagql.Server.Select(ctx, ..., "go") -> internal
- last leg: dagql.Server.Select(dagql.WithNonInternalTelemetry(ctx), ..., "lint") -> non-internal
a new runc vuln has appeared https://dagger.cloud/dagger/traces/60c9d761d71505bfa2f777a14e0a820c?span=467e436d5a884b9e
and erik already has a PR
FYI I'm getting 404s from alpinelinux.org while building: https://dagger.cloud/dagger/traces/87774d8fdc2bafff1b978b1a0062db16
I reproduced it 3 times in a row, then it went away
Got span log roll-up working in the web UI
- was pretty interesting to implement, what's neat is you could technically do it with any span (like a UI toggle), the attribute that we set on check spans just opts-in. We can show whatever we want as the prefix, even an arbitrary UI component (for now I just reproduced what the TUI does).
A super easy (knock on wood) next step could be to do it with the root span for a "logs only" view, I think you asked for that @tepid nova?
Nice! Yes a pure log view has been a dream of mine (assuming the bracketed prefix helps make it readable). I think it's a high-potential medium!
On my end, I have an optional checks --all flag, which toggles listing dynamic checks. You can still run dynamic checks (in aggregate) without --all. It keeps listing very fast, and you only pay for the overhead of test enumeration at the time of running them
UX sounds ideal - did you consider --recursive/-r?
Totally a placeholder. --all is wrong I think, it makes it seem like dagger check wouldn't run all the checks (in fact it would)
@civic yacht @still garnet do you happen to have an opinion on the best way to enumerate go tests in our repo? It's surprisingly non-obvious how to do this in a scriptable way
I've been grilling chatgpt about it, and it doesn't seem to really know
I know you can do go test -list ., but that's only the top level tests
I'd suggest regexes but that would have false positives for multiline code strings, which we have in our integ suite
Right. It looks like the most reliable way is to 1) search for **/*_test.go ; then for each matching directory, call 'go test -list .' Nothing else seems to work properly in every case
maybe there's a static analysis middle ground like the Go SDK does?
you basically want to find toplevel TestXXX funcs and toplevel types with TestXXX funcs (I think that's how multiple test suite packages work, not just testctx)
Possible but not great. For starters it ties me to a go version, so I need to arrange to execute my analysis tool with the same go version that will run the tests. Possible but more work. Also edge cases with dynamic tests
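For reference, a small sketch of that static analysis middle ground using only the standard library (go/parser and go/ast). It lists top-level TestXXX functions and TestXXX methods on top-level types, and it ignores test-looking text inside multiline strings. The caveats above still apply: it depends on the parser of the Go version it is built with, and it cannot see dynamically generated subtests. listTests, recvName and sample are names made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// sample is a made-up test file, including a multiline string that would fool
// a naive regexp search.
const sample = "package demo\n\n" +
	"import \"testing\"\n\n" +
	"func TestTop(t *testing.T) {}\n\n" +
	"type Suite struct{}\n\n" +
	"func (Suite) TestMethod(t *testing.T) {}\n\n" +
	"var embedded = `\nfunc TestFake(t *testing.T) {}\n`\n"

// listTests parses one Go source file and returns the names of top-level
// functions and methods whose name starts with "Test". Methods are reported
// as Receiver.Method. This is a prefix match only, so it is looser than the
// rules go test itself applies.
func listTests(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || !strings.HasPrefix(fn.Name.Name, "Test") {
			continue
		}
		if fn.Recv == nil || len(fn.Recv.List) == 0 {
			names = append(names, fn.Name.Name)
			continue
		}
		names = append(names, recvName(fn.Recv.List[0].Type)+"."+fn.Name.Name)
	}
	return names, nil
}

// recvName extracts the receiver type name, looking through a pointer.
func recvName(expr ast.Expr) string {
	if star, ok := expr.(*ast.StarExpr); ok {
		expr = star.X
	}
	if id, ok := expr.(*ast.Ident); ok {
		return id.Name
	}
	return "?"
}

func main() {
	names, err := listTests("demo_test.go", sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```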
hmmmmmmm maybe tree-sitter?
or steal from gopls / call its MCP server?
it was like pulling teeth, but we got there: https://dagger.cloud/dagger/traces/77ce4b4799d3b7999ca9e710bb6dcd63?span=b0bbea670a13d1bd&logs
I could do a very basic regexp search as a fast-and-loose default. Then optionally do a slower / more reliable go test -list based logic
looks neat! what was like pulling teeth?
the conversation leading up to it - it cheated in every kind of way before I told it to literally just call GoSearch with "Test"
I was planning on that name initially, but in the current implementation it's actually always recursive... It's just that without -a I stop at dynamic values.
ah, true. hmm
i guess it depends on how the user thinks about it. like if the fact that we recurse internally to find the "top-level" checks (which to some degree is oxymoronic) is something they're conscious of
can't tell if i'm speaking nonsense
<@&946480760016207902> are there any technical limitations to annotating a type with a pragma/decorator, instead of a function or argument?
(context: hoping to allow setting +check on an object)
Not that I'm aware of. Definitely fine in Go, pretty certain it'd be okay in python. TS and others I don't know for sure
What about the plumbing on the engine side?
Guess it doesn't exist yet, but should be trivial to add
You'd just want to add args to Module.withObject (or add methods to ObjectTypeDef, etc.)
Feedback: the only hiccup of the lazy execution model is that we only see when a secret is wrong upon the actual execution of the pipeline (I was referencing it wrong 3 times and thought it was ok prior to that)
Oh yeah we really need to add a builtin check for that...
So dagger check should always check the module itself, even if there is no user-defined check...
@tidal spire I just realized just how badly we need this "overlay" feature... Wrapper modules are a pain to write in general, but if you want to return several layers of complex types (which I need to do for test splitting..) then it is hell because you need to fully reimplement each wrapped type individually
Basically, if I want to wrap go.Tests() with engineDev.tests(), I need to write more copy-pasta than I will be removing
Yeah makes sense, the overlay is applied pretty far down in that chain
since all types for the overlay are merged into the corresponding type from the base, then I guess they are sharing a single set of types
@still garnet I couldn't find the original thread that you were helping me out with mapping OTEL logs back to the container or service that they came from π§΅
Still in draft, but could use some early testing & general review: https://github.com/dagger/dagger/issues/11529
@leaden glade this is meant to be general plumbing that can support generate in addition to check
I started implementing test splitting in our own engine tests, it runs but I couldn't get it to complete successfully yet. I won't have a lot of coding time this week because of my travels, but would love some extra eyes & brains on this to help me keep it alive
my dream is to run engine tests e2e with splitting + scaleout, and see if we can get a complete run under 5mn
also I'm wondering @still garnet: how hard would it be for dang to support a special type in modules, say _Runtime, with functions like types(): [TypeDef!], dispatch(), generate()...
Nice, I'll have a look. I'm cleaning the generate pr without files indication, so something simple enough we can imagine to merge quickly. I'll see if it's better to base it on the plumbing changes (because I have for now a lot of duplication between checks and generate)
I think you should merge yours first. Then we can remove duplication on top of 11529 together, as a followup
Design question regarding dagger generate
In our current usage of generators, we can pass a check flag to ensure everything is up to date (that's what we are doing in our .dagger module https://github.com/dagger/dagger/blob/6f611d705f5370faa3a0613f1b3f288dc3cf25ba/.dagger/main.go#L16-L21)
I'm adding something similar to the generate command: dagger generate --check will run the generators and ensure they are empty.
That is working as expected.
But I'm wondering how this could be (if that's a good idea) seen as a real check.
Right now a check requires a check directive, but it might be extendable to generator directives. So each generator could become a check.
Or...
We make it explicit, so a function could return a Changeset and have both generator and check directives. And we change a bit the code that handles check return values: if it's a Changeset (and the generator directive is present?) we check that the changeset return value is empty, not just that it syncs without error.
WDYT?
@leaden glade I think we should leave checks and generate functions completely separate, the generate(check: bool) was a stopgap that we should not keep, because it will be the same implementation for every toolchain, it's super repetitive for toolchain devs and unreliable for users, better to not support it at all
My proposal:
- implement a builtin check (visible in `dagger check` as a special case), that calls all generate functions and asserts their changeset is empty
$ dagger check -l
generated
go:lint
go:test
...
dagger check generated
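As a rough illustration of what such a builtin "generated" check might do, here is a Go sketch: run every generate function, and fail if any of them returns a non-empty changeset. The types are toys, not the real Changeset API; changeset, generator and checkGenerated are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// changeset is a toy stand-in for the changes produced by a generator.
type changeset struct {
	Added, Modified, Removed []string
}

// isEmpty reports whether the changeset contains no changes at all.
func (c changeset) isEmpty() bool {
	return len(c.Added)+len(c.Modified)+len(c.Removed) == 0
}

// generator pairs a generate function with its path, eg. "go:generate".
type generator struct {
	Name string
	Run  func() (changeset, error)
}

// checkGenerated runs every generator and fails if any of them would change
// files, which means the committed generated files are out of date.
func checkGenerated(gens []generator) error {
	var stale []string
	for _, g := range gens {
		cs, err := g.Run()
		if err != nil {
			return fmt.Errorf("%s: %w", g.Name, err)
		}
		if !cs.isEmpty() {
			stale = append(stale, g.Name)
		}
	}
	if len(stale) > 0 {
		return fmt.Errorf("generated files out of date: %s", strings.Join(stale, ", "))
	}
	return nil
}

func main() {
	gens := []generator{
		{Name: "docs:generate", Run: func() (changeset, error) { return changeset{}, nil }},
		{Name: "goSdk:generate", Run: func() (changeset, error) {
			return changeset{Modified: []string{"sdk/go/dagger.gen.go"}}, nil
		}},
	}
	fmt.Println(checkGenerated(gens))
}
```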
ok, sounds good to me, I'll leave them separated.
Would it still make sense to have a dagger generate --check that simply ensures the result is empty? In a way it's not seen as a check, but only as a generate function.
mmm good question.
what do you think about having only one way at the beginning, and see if people are confused, or ask for a different way?
mmm another approach could be to expose each generate function as a check also... basically like what you suggested, but would be automatic, no explicit pragma required. then instead of check generated it would be check path:to:my:generate:func
the problem with that might be that the same function name that is perfectly clear in a dagger generate X command may be confusing in a dagger check X maybe
I think I was trying to mimic the generate --check from .dagger, but maybe that's not the right thing. I'll push a first simple version without --check. That way we have something, and we can iterate on a specific more or less automated way to check generation. Or nothing if it's not worth it because it feels better to have specific check functions. (to keep it simple)
dang _Runtime
running into what looks like a cache issue? Not completely sure. My local state seems busted now https://dagger.cloud/dagger/traces/3712d51d3dc3e5b8e5774f8cf5f5513f?listen=26bb4ce2e9e2df49
could not get used client id
@tidal spire Think I found a toolchains bug - I only see toolchain-provided checks if I use ModuleSource.asModule, not Directory.asModule:
dang> toJSON(moduleSource("github.com/vito/dagger").asModule.checks.list.{name})
=> [{"name":"checkGenerated"},{"name":"ci:bootstrap"},{"name":"cli:releaseDryRun"},{"name":"docs:lintMarkdown"},{"name":"engineDev:releaseDryRun"},{"name":"go:checkTidy"},{"name":"go:lint"},{"name":"helm:lint"},{"name":"helm:test"},{"name":"helm:releaseDryRun"},{"name":"installers:lintBashScript"},{"name":"installers:lintPowershellScript"},{"name":"installers:testBashScript"},{"name":"javaSdk:test"},{"name":"javaSdk:releaseDryRun"},{"name":"javaSdk:lint"},{"name":"phpSdk:phpCodeSniffer"},{"name":"phpSdk:phpStan"},{"name":"phpSdk:test"},{"name":"pythonSdk:lintDocsSnippets"},{"name":"pythonSdk:lint"},{"name":"pythonSdk:format"},{"name":"pythonSdk:test"},{"name":"pythonSdk:releaseDryRun"},{"name":"rustSdk:cargoFmt"},{"name":"rustSdk:cargoCheck"},{"name":"rustSdk:test"},{"name":"rustSdk:releaseDryRun"},{"name":"security:scanSource"},{"name":"security:scanEngineContainer"},{"name":"typescriptSdk:releaseDryRun"},{"name":"typescriptSdk:testNodejsLts"},{"name":"typescriptSdk:lintTypescript"},{"name":"typescriptSdk:testNodejsPrevLts"},{"name":"typescriptSdk:testBunjs"},{"name":"typescriptSdk:lintDocsSnippets"},{"name":"testSplit:testClientGenerator"},{"name":"testSplit:testInterface"},{"name":"testSplit:testLlm"},{"name":"testSplit:testModules"},{"name":"testSplit:testCallAndShell"},{"name":"testSplit:testCliEngine"},{"name":"testSplit:testContainer"},{"name":"testSplit:testModuleRuntimes"},{"name":"testSplit:testBase"},{"name":"testSplit:testCgroups"}]
dang> toJSON(git("https://github.com/vito/dagger").head.tree.asModule.checks.list.{name})
=> [{"name":"checkGenerated"}]
Not blocked - can just use moduleSource instead, probably better off doing that anyways for other reasons
mmmh i'm having a weeeird changeset diff that only exists when running the command inside dagger, a bit painful to track atm
probably an exclude somewhere ahah, but where
I'm getting a lot of module not found: ~/guillaume/dagger errors on the dagger check check-generated, is it me or is it flaky?
that's weird, why would those paths be different?
Quick update on dagger generate side.
Still requires a bit of polish, but it's working well.
$ dagger generate
changelog:generate 0.4s
docs:generate 27.1s 4×
engineDev:generate 11.8s
goSdk:generate 6.1s 2×
phpSdk:generate 25.4s
pythonSdk:generate 26.1s 2×
rustSdk:generate 26.1s 3×
typescriptSdk:generate 7.1s 2×
This replaces the current dagger call generate we have.
And I can also
$ dagger check generated
generated 17.8s 5× ERROR
DaggerDev.generated
Container.withError(err: "generated files are not up-to-date"): Container! 0.0s ERROR
! generated files are not up-to-date
And this is the full code for our main dagger module:
type DaggerDev {
  """
  Verify that generated code is up to date
  """
  pub generated: Void @check {
    let generators = currentModule.generators.run
    let empty = generators.isEmpty
    if (!empty) {
      print(generators.report)
      container.withError("generated files are not up-to-date").sync
    }
    null
  }
}
The module doesn't need dependencies anymore as it doesn't have to know what generators exist, it's all dynamic.
BTW, is it possible to easily print an error message as part of a check? I'd like to have the generators.report on stderr but I'm not sure what's best to do that in dang (yes I also migrated the main dagger module to dang as this is the only function left)
Question regarding the dagger generate output: should I find a way to display the number of added/modified/removed files on the line?
Something like:
goSdk:generate 6.1s 2× 1+|3±|0-
(I don't know what we can do on those lines, but I guess it should be possible, no?)
a check function can print to stdout or stderr, it will be collected as logs for the check
The invalid overlay saga is finally over, fix here: https://github.com/dagger/dagger/pull/11545
- I'm fairly certain this issue is pretty widespread in the container ecosystem. A while ago everyone (docker, containerd, etc.) just forced index=off for overlay mounts (which makes it not an error and instead just "undefined behavior") and forgot about it, e.g. here in containerd
More interestingly in the long term, have a draft PR for the ebpf support in the engine here: https://github.com/dagger/dagger/pull/11548
Amazingly it actually works in our CI infra no problem. Will make debugging way way easier and has a lot of potential for performance analysis too (would have made the bolt disk syncing thing way way more obvious).
I'm also wondering if we can use this to give deep insights into user programs by taking the stuff read from ebpf and turning it into OTEL traces (cc @still garnet). Not even just system stuff, ebpf uprobes let you hook into any function in any userspace program (with some caveats I'm sure, haven't gone down that rabbit hole yet).
(this is 98% claude code generated so far, just seeing whether it actually works in CI infra on a first try, needs some human cleanup)
I just noticed when I init a typescript module the description is A generated module for QuickStart functions (QuickStart instead of my module name). Looks like it's hardcoded. Has it literally always been this way?
probably? seems worth a quick fix
@civic yacht for dagop'ing container image exports, do you think here's a good place to tackle it: https://github.com/alexcb/dagger/blob/fc3a382745c9c3b3ace88e417da2e64d2246504d/engine/buildkit/containerimage.go#L190-L193
instead of doing a c.Solve(.... here, we could just pass in a bkcache.ImmutableRef Result for each platform variant, along with the specs.ImageConfig, Annotations, etc?
and along the way change the API to PublishContainerImage, ExportContainerImage, ContainerImageToTarball ... in particular the map that gets passed:
opts map[string]string, // TODO: make this an actual type, this leaks too much untyped buildkit api
Would it be possible for services spawned via experimentalPrivilegedNesting/dagger-in-dagger to bind to 127.0.0.1 of nested containers?
e.g.
# Chose this particular image because it has dagger and an http client (wget)
dagger -c 'container | from registry.dagger.io/engine:v0.19.7 | terminal --experimental-privileged-nesting'
# Run a service via experimentalPrivilegedNesting
dagger -c 'container | from nginx | withExposedPort 80 | asService | start'
# Interact with that service on localhost (this is the bit that doesn't work as desired currently: the service is only reachable via its endpoint)
wget http://localhost:80
Hey @civic yacht
I'm digging on a perf regression around expensive chown of files and dirs raised by a user
Several questions:
- Was there a specific reason Directory.Chown uses WalkDir + os.Lchown instead of fscopy.Copy with WithChown ?
- Any edge cases where copy-with-ownership wouldn't work (symlinks, special files, preserving other metadata)?
- Is switching to fscopy.Copy pattern acceptable for the dagops direction?
I have a local fix that relies on fscopy.Copy with WithChown and wanted your first take on it
(seems to work)
Hey @Erik Sipsma
services binding to 127.0.0.1
Might be getting another meaningful "everything is faster"-type perf improvement for the engine
By the way, are we still getting the socket not found errors? They seem to have disappeared, no? Do you remember the associated PR that fixed it?
Not 100% sure but I suspect this fix https://github.com/dagger/dagger/pull/11456 also covered that error
The problem was when you do operations from different clients on the exact same git repo + commit in parallel, but the clients use different auth. Things would overlap and dedupe correctly, but then it could happen that one client saw some reference to auth in the DAG (token, ssh socket, etc.) that it didn't have access to and would error out. The fix is to not error out just because you see the reference to those secrets. The clients still don't have access to the secrets, they just won't error out unless they end up in a situation where they would actually need to use the auth they don't have access to.
as promised, here's a working POC for overlays / toolchain middleware https://github.com/dagger/dagger/pull/11561 cc @tepid nova @wild zephyr
Maintainers: would be nice to get this reviewed & merged. Thanks to @obsidian rover for carrying it.
This should make our own Dagger module faster to load.
@civic yacht is valid? π sorry meant +ttl("1year")+cache("1year")
no not right now, I put a max at 7 days for now. The only reason was that we prune the ttl entry from the sqlite db only based on the TTL, so enabling super long TTLs means taking space for that long. We need smarter pruning based on usage/references/etc. to make things like that possible
It's sorta arbitrary though, if there's a strong use case for >1 week TTL it's easy to bump
Honestly I don't know what TTL to choose
If you specify nothing it defaults to that TTL of 7 days, so that's as good as it gets in the current implementation in terms of "idk" cases
by specify nothing I mean don't include any cache annotation at all
Oh right, cached by default!!
Perfect
Should be somewhere around 0.1s now, consistently
It looks like it's ~5s, I think the span you are looking at there is something else
(reviewing, LGTM so far and nice improvement, I was trying something quick to see if we could skip uploading .git/objects/pack because that's extremely painful on PARC given it's almost 200MB on my current checkout)
let's make it as fast as possible yeah
https://github.com/dagger/dagger/pull/11232#pullrequestreview-3580530605 (ship it though!)
Thanks, will incorporate that. I've also realized that for the Dirty() I'm not relying on the internal implementation, which should be faster
definitely don't feel obligated to include in that PR, might be better for a follow-up unless it works perfectly first try with no downsides
yeah i'll make a follow-up PR, same for my thing. By using the IsEmpty(), it now takes 23 seconds ... something is off (without a container).
Generating patches is too expensive atm --> IsEmpty() is not optimized enough !!!
It's tangential, but I think there are quite a few optimizations to make on the changeset API
Do you remember how much faster it was? I managed to make it rely on the changeSet API at ~3.7s with empty cache (and without your optimization). Testing your approach at the moment, but I'm not sure about not mounting the full git object, as we'll lose access to some of the dagger APIs
In my mind, people should be able to have a dumb implementation and it "just works". Changeset being slow is our problem; I'm worried about having a "smart" implementation to circumvent our API functions being slow.
It's gonna be faster for PARC because there the bottleneck is syncing from laptop to remote engine. You won't see that in CI
I started to split my dagger generate PR in multiple ones so they can be merged one by one instead of waiting for the complete version (and will fix the issue where I want to use in the dagger module what I'm providing)
The first one: https://github.com/dagger/dagger/pull/11577
This PR allows "merging" changesets and handling conflicts, if any (like files modified in both changesets).
This can be used to replace https://github.com/dagger/dagger/blob/42b688321ed26d05cc332f883ed7bc2838a3ed5e/.dagger/util.go#L15 and ultimately to merge changesets from multiple generators at once in a safe way.
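To make the idea of merging with conflict handling concrete, here is a heavily simplified sketch. It models a changeset as a map from path to new content, which is an assumption for illustration only and not how the engine represents changesets; `mergeChangesets` is a hypothetical name. Paths changed differently on both sides are reported as conflicts and left out of the merged result.

```go
package main

import (
	"fmt"
	"sort"
)

// changeset is a hypothetical, simplified model: path -> new content.
type changeset map[string]string

// mergeChangesets unions a and b. A path present in both with different
// content is a conflict: it is reported and skipped in the merged result.
func mergeChangesets(a, b changeset) (changeset, []string) {
	merged := changeset{}
	var conflicts []string
	for path, content := range a {
		if other, ok := b[path]; ok && other != content {
			conflicts = append(conflicts, path)
			continue
		}
		merged[path] = content
	}
	for path, content := range b {
		if other, ok := a[path]; ok && other != content {
			continue // already recorded as a conflict above
		}
		merged[path] = content
	}
	sort.Strings(conflicts)
	return merged, conflicts
}

func main() {
	a := changeset{"sdk/go/gen.go": "v1", "docs/api.md": "from-a"}
	b := changeset{"sdk/php/gen.php": "v1", "docs/api.md": "from-b"}
	merged, conflicts := mergeChangesets(a, b)
	fmt.Println("merged:", merged)
	fmt.Println("conflicts:", conflicts)
}
```

Other strategies (prefer one side, fail on conflict) would only change what happens in the conflict branch.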
@still garnet stack overflow fixed, thanks
There's a weird performance bottleneck somewhere, when the engine enumerates those objects... Check out the shape of the trace: https://dagger.cloud/shykes/traces/dfe8fb0733461140b2c0b9f050de12ee?listen=d7e53f8cef472ca0&listen=3dafbc58a3ae5e9d&span=5a4a89068c478cc9
I could literally follow the enumeration in the TUI: 1 call to TestDirectory.Tests every second, each call takes 0.5s. So slow!! But there's no actual compute going on. Super weird
Hey Erik, git status needs to read the HEAD commit object, which in fresh CI clones only exists inside the pack files we're excluding (getting a lot of errors).
This optimization is doable, but the more I get my hands dirty, the more it seems to require creating a very focused "version" module that gets away from the dagger world (and most go git libs)
Need a quick approval for this please: https://github.com/dagger/dagger/pull/11523
@fair ermine @leaden glade @charred lotus
done
thank you!
ah sorry maybe i merged too quickly, hope i didnt break anything
I merged it
Still baffled by this
Basically it's prohibitively expensive to return a list of objects referencing other objects
Circular dependency in engine & sdk build
Hey Alex, was there a reason for asPatch to do a per-file diff? Is it for the UI / agent UX? Why not rely on a single git diff --no-index --src-prefix= --dst-prefix= and show just that patch?
there's a comment about git 2.51, so maybe why
yeah, this is why - Alpine at the time didn't have git 2.51, not sure if it does now
looks like alpine v3.23 has git 2.52
Thanks. Do you object to having a more git-centric approach for the changeset, and lazifying it? (have a working POC)
IsEmpty takes a bit too long atm and a git-centric approach would be very fast
no objection but a lot of curiosity!
aaaaaand ... it's not faster
it seems that the evaluation of the ref / mounting the dirs is the real bottleneck. I got tricked by the UI: with a lot of print debugging, we see that the real bottleneck is the waiting time on git.head.tree, and as it's waiting it seems to add its own time on top
I'm still fighting with Changeset merge and there's something I'm not sure I understand. If anyone has ideas or knowledge, please share.
My issue is that the fields modifiedPaths, addedPaths and removedPaths are not aligned with the result of as-patch.
Among the things I'm implementing is the ability to skip a file in conflict (for instance, the same path modified in both changesets I'm merging).
The resulting changeset seems ok (the before and after states, and the as-patch content, which doesn't contain the file in conflict, as expected), but if I look at modifiedPaths the files are still there.
So basically modifiedPaths contains files that will not be part of the patch generated from the changeset.
Why is that? Is it an issue, or should I just not care as long as my patch seems ok and the result looks good if I apply it?
Yeah this was one of the reasons I wanted to unify around a single git -generated patch, so there's a single source of truth, and have modifiedPaths etc. work by looking at the generated patch
maybe there's some overlap between you and @obsidian rover's efforts?
Maybe? I'm not sure exactly, and I didn't know there was some work on changeset when I started to work on the merge. But yes maybe there's some work to combine
Yeah sorry, I've been exploring this on my own to try to make it faster, as part of version/main.go (which still takes 6s), by 1) lazifying it, 2) unlocked by the previous step, unifying all the logic under git commands instead of the Buildkit diff / glob, 3) having IsEmpty rely on fast git commands
It's not that much faster atm ... still digging. But was bitten by this too
I think you're hitting the buildkit vs git differences (no metadata diff, deleted files under dir): On my exploration branch, ComputePaths uses git diff --name-status as the single source of truth
dagger -M -c '.exit 20' should effectively exit dagger with a 20 exit code, correct?
Mmh, i would expect so, agreed
doesn't seem to work for me in main. I get a 1 status code. There's this test (https://github.com/dagger/dagger/blob/4f7bfabebb685d85e1bc18af552685fef8c9f35d/core/integration/module_shell_test.go?plain=1#L1033-L1044) so it "should" work
ok, the --progress plain actually works. Seems to be some gotcha with the idtui exit code handling. cc @still garnet just checking if this should also exit with the internal exit code in the TUI as well
yeah don't see why not
bet: bubbletea error wrapping is messing things up ahah
Has anyone else encountered this? I'm stuck on it for implementing test splitting...
@astral zealot @wild zephyr this looks very handy: https://github.com/maxence-charriere/go-app/pull/1046
ok I have a bit of a proposal. A while back we (mostly) stopped running CI on dev engines and started running on the released engine. This greatly reduced the complexity of our CI, and is still the right change imo. The problem is we lose the ability to ship changes that depend on other unreleased changes. For example, this PR https://github.com/dagger/dagger/pull/11515 requires running the CLI from main to get the changes to codegen in https://github.com/dagger/dagger/pull/11579.
So my proposal is that when we run checks on PRs we use main instead of the latest release. This does require us to continue publishing main commits, though. I think this should get us the best of both worlds
I'm getting a worrisome error on a very basic engine-dev playground call, that is flaky from my local machine. Never had that before.
Haven't experienced it, but I have a theory - functions return values by JSON marshaling them, and JSON marshaling an object sends a request to get its ID. Maybe it's doing all of those serially, for each contained object?
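The theory above can be reproduced in isolation. In the sketch below, `object` and `idLatency` are hypothetical stand-ins: `MarshalJSON` sleeps to simulate "a request to get the ID". Because `encoding/json` marshals slice elements one after another, the total time grows linearly with the number of objects. This only demonstrates the mechanism, it does not confirm that this is what the engine does.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// idLatency simulates the round trip needed to fetch one object's ID.
const idLatency = 20 * time.Millisecond

// object is a hypothetical stand-in for an API object whose JSON form is its ID.
type object struct{ name string }

// MarshalJSON simulates a blocking ID request for every marshaled object.
func (o object) MarshalJSON() ([]byte, error) {
	time.Sleep(idLatency)
	return json.Marshal("id:" + o.name)
}

// marshalTimed marshals the list and reports how long it took.
func marshalTimed(objs []object) (string, time.Duration, error) {
	start := time.Now()
	b, err := json.Marshal(objs)
	return string(b), time.Since(start), err
}

func main() {
	objs := []object{{"a"}, {"b"}, {"c"}, {"d"}, {"e"}}
	out, elapsed, err := marshalTimed(objs)
	fmt.Println(out, err)
	// Elements are marshaled serially, so this is at least len(objs) * idLatency.
	fmt.Println("elapsed:", elapsed)
}
```

If that is the cause, resolving the IDs concurrently before marshaling would be one possible mitigation.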
it looks like the codegen binary got committed on main...
PR to remove it: https://github.com/dagger/dagger/pull/11594
@obsidian rover this is the test where we just have a custom http server that the engine-as-a-service depends on, if useful https://github.com/sipsma/dagger/blob/f3afdbde3d346ca4de8e3ff92b1d152c20c2c435/core/integration/proxy_test.go#L503-L503
Toolchain Overlays https://github.com/dagger/dagger/pull/11561
@fair ermine do we have a way to run go tests against a toolchain's implementation? If yes, where could I see an example?
I'm not sure I understand your question
What are you looking for? A module with unit tests that can be run with dagger run go test?
I'd like to write an integration / unit test for a dagger module's function, ideally one that I could run with go test
i could use your client generator, but i'm not up to date on whether we have a native integration for go test on modules
The dagger repo has some toolchains, I want to make an integration test of one of the toolchains' function
Ohhhh yeah okay, easy: https://github.com/TomChv/dagger-native-test-example/tree/main/go
It's not up to date but I don't think any breaking change has been released since
The repo shows an example, with the guide in the readme
I would recommend to run it using dagger run to automatically handle the module loading + web UI
It was a long run, but I finally fixed my issues on https://github.com/dagger/dagger/pull/11577 (I hope CI will be green)
I hit recursive call issues many times (and sometimes it just hung)
recursive call detected
I still don't understand exactly what the issue was, but I fixed it by moving my code from a core.Changeset method to the body of the function at the schema level (I wanted my changesetWithChangeset function at schema level to call parent.WithChangeset but it never worked; the exact same code moved back into changesetWithChangeset is just fine)
The integration tests for changeset merging are now passing, with conflict resolution, so that's good
I'm soooo close to having the lazify git clean ... go go go ! Only a few hours left to finish it in 2025 ahahah
if you fly to the international dateline you'll get a few more
Hey team. Just wondering when the next release of dagger is coming out. My team are eager to get their hands on it so we can start utilising checks, better performance and all the other goodies you packed in there!
We can release this week. But, checks are already in the last release fyi
Ah yes, checks is in the last release but due to a bug around host files we were unable to upgrade. Now that it has been fixed we want to upgrade
Doing one this week, coordinating with everyone to confirm ahah
@tidal spire you mentioned a possible alternative to committing our repo's generated files?
Ah cool. Yeah if you could this week that would be really good for us
Committing Host Files
README rewrite to match the homepage: https://github.com/dagger/dagger/pull/11627
My humble 2 cents....
Marketing Dagger as just a 'CI tester' feels like selling a supercomputer but advertising it solely as a high-end calculator. Sure, it does the math, but you're hiding the fact that it can simulate the universe.
I feel the same way! Why is it called a testing framework?! We have a million of those already.
For example, we use it for deployments, packaging, archiving, compiling, orchestrating, tagging etc which aren't directly related to testing. Testing is only one part of CI.
I like this description a lot better
we can refine, but we got pretty loud & clear feedback that the current version hedges too much between CI and Agent runtime. So we'll start with this and keep iterating
<@&946480760016207902> I need a lgtm
Reviewed, ready to approve once my comment is resolved
sharing my feedback openly here in case someone else feels similar. Otherwise, feel free to dismiss:
Overall changes LGTM! the only thing caught my attention is the emphasis on coding agents from the README PR. Particularly this:
AI coding agents are fast but unreliable - they need an external system for trusted feedback on every code change. Dagger provides that feedback: repeatable test execution that agents can call as they develop, and CI can verify during review. It runs locally, in CI, or directly in the cloud.
The first impression after reading that sentence was: "are we going back to highlighting the AI use case as our main strength"? Just being honest here with my reaction about the paragraph above not surfacing all the "test platform" goodies like checks, scale-out, splitting, observability, etc which elevate your application stack as a whole. Since we're not mentioning that, I am thinking if some users might think that Dagger is only valuable when using agents mostly/only.
yeah agree with @wild zephyr - it feels like this revision is still heavily agent/LLM-centric to anyone viewing through that negative lens, since it opens with it in multiple places
From a user perspective, after reading this revision, I did not feel that it captured what I use dagger for / like about dagger
Not sure how to reconcile that with the loud & clear feedback that Solomon mentioned, just throwing out a data point for you all to take or leave
I agree 100%. I meant to say this too. None of my use cases even touch AI or Agents and probably won't for the foreseeable future.
Side thread to not take away from the current discussion: any idea why the checks on my PR are hanging? https://github.com/dagger/dagger/pull/11621
There's a whole section on observability.
I will try to break down the negative feedback here, to get something actionable out of it
So far I have:
- "Still too much AI"
- "It's not just for tests!"
- "Not enough details on test features"
- "I like the message, just not the specific wording"
- Something else
Can I get votes on which of those apply to your feedback?
@wild zephyr @still garnet @coral vector @stuck bloom @lunar brook
Also if there's anything at all that you do like in there, I'm interested
Would you prefer if we said Dagger is a CI platform? I have issues with that, but it's definitely a viable candidate.
Personally I'm a bit of 1, 2, 3.
- There's currently no specific integration with agents that make dagger just for agents.
- Dagger can do a lot of things, and we're working on making it really incredible for tests. So the test messaging IMO is a good way to get people in the door and they'll have a good time.
- Someone using agents heavily should connect that a fast/sandboxed/repeatable testing tool will be useful for their agents
- Some of the really flashy things we're hoping to do for tests, specifically in cloud, are really more for humans than agents
The problem with "CI" is that it gets associated with 1) low-level infra, just a bunch of machines, 2) not strategic to the engineering org, just let infra team or consultants figure it out 3) we already have one, no urgent reason to change.
Then upvote all 3
Calling it "CI" also opens us to some confusion & criticism:
- We don't have Mac & Windows runners, unlike every other CI platform out there
- Dagger has a "parasite mode" where it runs on top of your existing CI. We will soon start recommending standalone mode instead, but parasite mode will continue to be a popular option - great for dipping your toe in the water so to speak. Will it be confusing to say "our CI works standalone, or on top of your existing CI!"
- The other issues I mentioned above - "just infra", "not looking for a new one" etc
I voted for 1, 2, 3 and maybe a bit of 5.
I agree that "CI Platform" isn't the right term either. I really like the "workflow engine" because to me, that's what dagger is. It's well suited for CI but that's only one use case.
Same with E2E testing. For devs (app and platform), E2E testing is just one part of CI. I feel like it massively undersells what Dagger can do.
For example, lately, I am exploring using dagger as a tool delivery mechanism to developers to run things locally. To improve their local developer experience by delivering complex tools to their terminal (Sonar, container scans, secret scans, packaging tools etc.). Toolchains makes this an awesome DX. dagger install github.com/mytool and dagger call mytool dosomething. No need to ask devs to install specific tools locally and set them up. I think there's a ton of value there.
Obviously I love that Dagger is so generalizable, and that elite Daggernauts are exploiting this in clever ways. But, the reality is that our commercial product is not selling well enough, and our community is not growing fast enough, to sustain us in the long term. There are 2 things we need to do to fix that: build a better product, and tell a better story. Here I will focus on the story.
A good story requires focus. Nobody starts with the "everything dag engine". It's too vague. Without focusing on a clear problem for a clear set of people, nobody can tell if it's for them - so nobody tries it. Hence low growth. Then when the time comes to monetize: the business doesn't know what business problem it solves. So it's hard to make the case for buying the commercial version.
TLDR the lack of a focused story hurts us twice: lower community growth, and lower monetization opportunity.
This is why I'm trying angles like "test platform". Sure it can do more than that! But we know e2e tests are a big problem for a lot of people right now, and we can solve those problems in a major way. But it does require focus both on the product implementation (for example: test instrumentation; test splitting; test analytics; flake management; those require serious engineering effort), and IMO on the story (why bother to develop all those features if they'll get buried in an "everything dag engine" message).
So that's my dilemma. I worry that "workflow engine" suffers from the vagueness issue. To the extent that it's understood as a precise category, it's a category that we're not competing in: Temporal, n8n, inngest, etc.
I will tone down the agentic parts in the readme.
BUT, here's a question: if the tagline is not "The agent-ready <foo> platform" (ignore <foo> for now). Then what is the tagline? What makes Dagger memorable? Why should I care? And, why should management at your respective employers care enough to consider paying for it?
My 2 cents, I think solomon makes a good point. I've spent a lot of time working in the workflow engine space (airflow, prefect, etc), and dagger is NOT that. I could use those tools to perform the kinds of things dagger does, but to me that is overkill as well as more language limiting.
That said, dagger feels more like workflow engine than a CI engine/tool.
When I was looking for alternatives for our use case, dagger was really the only tool other than shell/tool specific language scripts. IMO, that kinda means dagger is alone in the space it is filling (for my use case anyway) which means it can define the space, but it also means it has the burden of defining the space. There's not much it can latch onto as a similar tool set/terminology.
I understand the conundrum. I can't come up with a tagline but I can throw some words out there
- portable
- sandbox
- orchestrator
- pipeline
I'd like to echo @stuck bloom that portability (engine runs anywhere), reproducibility (everything is in a container), and caching by default are the main draws for me, and so feel best in the tagline.
That said, similar to what @tepid nova and @raven siren said, we're not really the target audience for this change, right? We have already "seen the light" so to speak. The content that we're providing feedback on is meant to get new folks in the door.
To that end, I also don't know the right angle, but I do think that toning down the agentic parts leave a better taste in my mouth, though that does feel like a nit that goes against current trends.
Whatever helps Dagger succeed!
There's another really useful feature, which is failing on error. Shell scripts and other impls happily continue with error messages which leave you in doubt of what you achieved if the pipeline succeeds. Reproducibility seems wrong for that. Consistency maybe? Consistent Reproducibility?
Agreed with helping Dagger succeed. It's so tempting to tie wording into AI because that's where all the interest lies nowadays, but that feels really limiting to me. I'm not in the AI space as anything but a light consumer so not gonna speak as any kind of marketing expert.
Agreed with helping Dagger succeed
Same! It's what we all want
Thank you all for participating, I really appreciate it!
i'll start by toning down the ai language
I still think "end-to-end tests" is a very strong contender for category, even though it does not capture everything Dagger does. IMO it's the most common killer use case: the reason a new user is most likely to use it at all, and even consider additional uses like deploying, codegen, linting etc. It feels like the use case where Dagger is the most clearly differentiated.
I could be wrong though.
On the "open-ended technical definition" side, one term I heard used recently was incremental execution engine, I thought that was interesting. Because it neatly captures the caching/memoization aspect, and implicitly associates with build, test and other CI/CD tasks, without outright tying itself to it.
But, it has the same problem as other open-ended terms: it doesn't clearly answer what it's for. Also it describes the engine but not the overall platform.
To me, the Dagger module system is what makes Dagger the killer app for our CI goals. Dagger Modules are to me, the EXACT line in the sand that is needed between DevOps/ Platform Engineering teams and App Developers in terms of the logic that is needed to accomplish a task and the interface needed to make sure tasks are followed, whether it be a set of tests or anything that can run as a containerized execution.
I'd call Dagger a - Programmable Sandbox for Containerized Tasks - devOps and Platform Engineers define the interface i.e. which tasks are mandatory. Application developers can freely define the logic to actually run the required tasks (and more, if they wish).
This inversion of control is what silences the arguments between devOps and application developers on who is responsible for what. That is the pain Dagger needs to tap into to sell it.
In other words, other solutions fail because they don't have a functional interface as a "service contract". Jenkins or GitHub Actions give everyone a "Script Area", which is just an open field. Dagger Modules give the two teams a Function Signature: a typed, compiled interface that acts as a contract for application developers to follow, but doesn't tie their hands at all. In fact, it allows them to express themselves in the language they love the most.
And I also will postulate, that the biggest issue is the environment itself. Many people and I believe even the Dagger team wish to see the environment as anything that can be built out as one or more containers - inside Dagger. But, that same thinking will run Dagger into the same issue that Docker ran into itself.. A real distributed system cannot simply be "containerized" i.e. wound up and down in a fire and forget fashion. There are just way too many moving parts to orchestrate that sanely, even in a programmable way. So, Dagger MUST be part of the larger system, in an already-running state. (Yes, k8s). Once that is done, all that is left to run any task is to give the containerized task the configuration it needs to "wire into" the running system and it can run. So, Dagger apps becomes the Sandbox in the Sandbox. This is why all the other CI systems don't work well and Dagger can.
So, Dagger MUST be part of the larger system, in an already-running state. (Yes, k8s). Once that is done, all that is left to run any task is to give the containerized task the configuration it needs to "wire into" the running system and it can run. So, Dagger apps becomes the Sandbox in the Sandbox. This is why all the other CI systems don't work well and Dagger can.
Your reference of k8s threw me off a little.. Do you mean a) running Dagger on Kubernetes, or b) running Dagger to interact with Kubernetes (ie. to deploy applications to it etc)
It's a). Dagger running as part of k8s so the containerized task is in the right environment from the start.
I looove this new view, kudos to the team (also, *could it be that lazy git is ... finally clean? 🤩, after new year though ahahahah. I don't think there's a country with 6 days of timezone difference Erik ahahah*)
Not everyone runs Dagger in Kubernetes though... For example we don't. In fact the more we expand our use of Dagger, the less we use k8s
b) makes sense to me though
You can't run your whole stack in a homogeneous platform - containerized or otherwise. But you can and should standardize your CI workflows and associated environments. We propose Dagger as a standard
I personally believe k8s is obsolete for CI workloads, but that's a personal opinion that I don't expect every Daggernaut to share, we will continue to happily support k8s as a provisioning target
(didn't mean to sidetrack from your broader point on positioning though! sorry)
Sure, as the tool builder, you don't need k8s to build the tool. But, if your applications are distributed, then it has to be inside the apparatus that runs the distribution.
Please ignore that part too then. Think more about the devOps/ app dev pains Dagger can solve. If you can effectively market that, I think you'll get good traction.
I realize too my thinking is also more along the lines of e2e testing. So, nevermind.
Another poll: what actual real-world workloads do you run on Dagger today?
- Build & publish artifacts
- Tests
- Security scans
- Linting and other static checks
- Release & deployment
- Codegen
- Other
I realize too my thinking is also more along the lines of e2e testing. So, nevermind.
Do you mean that e2e testing is your anchor use case for Dagger?
Not an anchor, but yes, quite important.
How about a hybrid of both:
Dagger is a programmable sandbox for integration testing
What is Dagger?
Dagger is an open-source, programmable sandbox for integration testing.
Test any codebase, repeatably and at scale. Runs locally, in your CI server, or directly in the cloud.
Features
System API: A language-agnostic API for orchestrating containers, building artifacts on the fly, loading secrets, fetching data, and more. Every operation is typed and composable. Native SDKs for Go, Python, TypeScript, PHP, Java, Elixir, and Rust.
Typed artifacts: Define custom object types with encapsulated state and functions. Types are content-addressed and can be passed across SDK language boundaries.
Incremental execution: Every operation is keyed by its inputs. Change one file and only the affected operations re-run. Caching is content-addressed and works automatically across local runs and CI.
Runs anywhere: The only requirement is a Linux container runtime. Runs natively on Linux, or via Docker Desktop and similar products on macOS and Windows. Local and CI behavior are identical.
Built-in tracing: Every operation emits OpenTelemetry spans. The CLI includes a live TUI; traces can also be exported to Jaeger, Honeycomb, or any OTel-compatible backend.
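To make the "keyed by its inputs" idea from the incremental execution item concrete, here is a minimal sketch of content-addressed memoization. It is an illustration only, not how the Dagger engine is implemented: the Cache type, Key, and Do are hypothetical names, and results are kept in a plain in-memory map.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Cache memoizes operation results by a digest of the operation name and
// its inputs. Hypothetical type, for illustration only.
type Cache struct {
	results map[string]string
	Runs    int // number of operations actually executed (cache misses)
}

func NewCache() *Cache { return &Cache{results: map[string]string{}} }

// Key derives a content address from the operation name and its inputs.
func Key(op string, inputs ...string) string {
	h := sha256.New()
	// Length-prefix every field so ("ab","c") and ("a","bc") hash differently.
	for _, s := range append([]string{op}, inputs...) {
		fmt.Fprintf(h, "%d:%s;", len(s), s)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Do runs fn only if no result is stored for this (op, inputs) key.
func (c *Cache) Do(op string, fn func() string, inputs ...string) string {
	k := Key(op, inputs...)
	if out, ok := c.results[k]; ok {
		return out
	}
	c.Runs++
	out := fn()
	c.results[k] = out
	return out
}

func main() {
	c := NewCache()
	build := func(src string) string {
		return c.Do("build", func() string { return "bin(" + src + ")" }, src)
	}
	build("main.go v1")
	build("main.go v1") // same input: served from the cache
	build("main.go v2") // changed input: the operation re-runs
	fmt.Println("executions:", c.Runs)
}
```

Three calls lead to two executions here, because the second call has the same key as the first. A real engine would also have to key on the results of upstream operations, which this sketch leaves out.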
I'm sorry, but this still feels like you are trying to sell me an iPhone, because it has a calculator in it. 🤷🏻‍♂️
Of course, I know what Dagger is too and I understand its potential.
And I also realize the problem you have is that everyone doesn't know what Dagger is. So they first need to learn what Dagger is, to then go many steps further to understand the paradigm shift that it represents. That is a really hard cognitive jump through several hoops to make to get buy-in. I get it.
I just wholeheartedly believe that selling Dagger short to convince people to even look at Dagger with a "promise little and deliver more" aspect isn't the right tactic. It definitely isn't telling anyone -> this is the resolution to the hard and painful problems you have - which is the real value of any good product.
I did some research with Gemini about what kinds of friction there are in the CI/CD space and it came back with a number of good references. It even spoke about people getting burned out because of the friction and conflict. I believe that's the pain Dagger can tap into. The best marketing speaks to emotions first and in a tight second is demonstration, so prospective customers can visualize how the solution can be useful to reduce or get rid of their problems.
Dagger CI - The ultimate pain killer for your devOps and application developer CI process conflicts.
Then explain how the pain is resolved.
Dagger CI - The ultimate pain killer for your devOps and application developer CI process conflicts.
I like the "ultimate painkiller" part! Has a cool vibe to it.
for your devOps and application developer CI process conflicts
This I am struggling to unravel. I am not sure I understand what it's actually talking about. DevOps is a loaded term with many definitions. I wouldn't use it.
You are right. Leave that part out.
Dagger CI - The ultimate pain killer for your CI process conflicts!
If you are aiming for simplicity, I don't think this achieves it. I'd say, someone has to be versed in Dagger already to grasp some of the nuances in this description.
Better, but one problem with associating with CI is Dagger is not yet universal. What about windows and mac builds?
Dagger - The ultimate shift left
^^^ probably has political connotations so maybe not. But "shift left" is something that I see dagger can help with. There are a ton of things devs are scared to do today because it will "increase their workload" or "be complex and time consuming". E2E testing is part of it too. Dagger modules (and toolchains) can and do simplify running these processes locally. Even if we don't build entire pipelines with dagger, I think this delivery of easily executable tools to the developer's fingertips is very valuable and can itself be a big selling point.
Dagger makes it easy for platform engineers to build these tools, and then makes it super simple to distribute and use for developers. I don't know of any other tool that does it as well as dagger.
Or how about:
Dagger: The Functional Interface that turns CI Conflict into Collaboration.
or
Dagger: Stop fighting the pipeline. Start fueling the product.
I like the last one, because it doesn't mention CI either.
By using "pipeline" as the central theme, you can pivot the conversation to any area where automation currently feels like a "chore" rather than a "utility."
Some (AI supported) reasons to use Dagger, which shows its swiss-army-knife like adaptability:
For Web Devs: It's about killing the "Push and Pray" loop of YAML or script-based CI.
For AI/ML Engineers: It's about the MLOps pipeline: the "Production-Parity Engine" that allows them to run data-heavy evaluations locally before hitting the cloud. Research shows that 85% of ML models never reach production due to the "disconnect" between data scientists and DevOps; Dagger provides the common language to bridge that gap.
For Platform Engineers: It's about moving from "Fragile Scripts" to "Self-Healing Agents." Dagger is currently being used to build Agentic CI, where AI models can use Dagger Functions to diagnose and fix pipeline failures autonomously.
I'm not going to call these tag-lines, these are reasons I use it, but I tried to keep them short and snappy:
1. Language-agnostic, type-safe CI pipelines.
This is huge because :
2. Your project and pipeline speak the same language
This is awesome because if I have a development environment set up for my project's language, it's already set up for my pipeline's language. Now, for free, I get:
3. Autocompletion and discoverability
This last point is especially helpful, because as much as we should all be capable of scouring documentation... lets be real, a quick flick through a list of completion candidates with doc blocks and type hints? Slightly faster.
Personally, I do not like the LLM side of things but I understand it's useful marketing for certain crowds. I don't have anything to contribute on this point other than to bear in mind that it's a double-edged sword that will attract certain crowds and repel others.
PS: Also Scott's thread reminded me of this point #1458389112665935936 message
shift left -> expand left. The former implies moving tasks, the latter expanding those involved. Shift left is an industry term, but it's bad. There's some traction around expanding left as the replacement term, and for dagger that's really a better description.
Hello, is there a way to express, when mounting a dir:
// A directory containing all the inputs of the artifact to be versioned.
// An input is any file that changes the artifact if it changes.
// This directory is used to compute a digest. If any input changes, the digest changes.
// - To avoid false positives, only include actual inputs
// - To avoid false negatives, include *all* inputs
// +optional
// +defaultPath="/"
// +ignore=["**_test.go", "**/.git*", "**/.venv", "**/.dagger", ".*", "bin", "**/node_modules", "**/testdata/**", "**/.changes", ".changes", "docs", "helm", "release", "version", "modules", "*.md", "LICENSE", "NOTICE", "hack", "!**/.gitignore"]
inputs *dagger.Directory,
that we want to apply the recursive gitignores: here i include them "!**/.gitignore" but I want to apply them as a pre-filter
I was thinking about a // +gitignore maybe ?, or maybe there's already a way ?
I don't understand the question
yes, but i was asking for a way to do it
I think I understand, just got confused by:
here i include them "!**/.gitignore"
Because obviously that would not apply them
Oh sorry, it's easy to miss, but in that line I first remove all the git dirs / files ("**/.git*") and then re-integrate just the .gitignore. The aim is to pass as little info as possible to the container.
The use case is to know if a local directory is in dirty state. Your issue above is the generic UX, I was potentially thinking about a pragma or a variation to the current pragma to potentially handle those edge cases --> apply the filter, but also apply the gitignores if you find any gitignore
Is there a way to make a (sub) dagql call (from within an existing call that's being evaluated), in a way that the sub call won't be displayed to the user? e.g.
func (dir *Directory) FileLLB(ctx context.Context, parent dagql.ObjectResult[*Directory], file string) (*File, error) {
	err := validateFileName(file)
	if err != nil {
		return nil, err
	}
	srv, err := CurrentDagqlServer(ctx)
	if err != nil {
		return nil, fmt.Errorf("failed to get dagql server: %w", err)
	}
	if 1 > 0 {
		var fileStat *Stat
		srv.Select(ctx, parent, &fileStat,
			dagql.Selector{
				Field: "stat",
				Args: []dagql.NamedInput{
					{Name: "path", Value: dagql.String("i-dont-care-about-this")}, // <------ I'm deliberately stat'ing a file that doesn't exist
				},
			},
		)
	}
	...
i've started an attempt on a nushell SDK: https://github.com/dagger/dagger/pull/11638 it appears all checks are green, but I could use some guidance on making this a "fully featured" SDK. I will not have much time the coming month to work on this, but if anybody wants to help me with getting this SDK out there it would be much appreciated. I'm not a software engineer by trade, so there might be things in here that are very wrong
Fix for the security:source check that is failing on PRs: https://github.com/dagger/dagger/pull/11649
fake news, closed it
Dagger version
1.9.0
guess we'll watch out for that in the future
@tepid nova I have a first iteration working to make available checks (and later generators, ship, etc) in currentEnv: https://github.com/dagger/dagger/pull/11650
It's only working from the main module for now; I'll look into allowing it from any toolchain (so in that case the toolchain's currentEnv would be the same as the main module's currentEnv).
But before going further I'd like to validate that I'm on the right path with it
I'd like to merge https://github.com/dagger/dagger/pull/11577 that merges changesets
This appears to work as expected and is covered by some tests. But there are some dagop-related changes I'm not sure about.
This PR will unblock the generator one as it's a requirement to be able to merge multiple changesets.
@tepid nova should we change check names to be the hyphenated form? 
Looks weird that output doesn't match input I agree
the whole multi-casing situation is tough
i can have the TUI cycle through all the variations 
PSA: GitHub now eats the toplevel PR review comment if you dismiss the overlay popup thingy. Careful out there.
worst website regression in my lifetime
yeah i almost ended my day at 10:53
A week or so ago, a refresh of the page lost like 20 review comments I did. That was definitely end the day worthy
Quick cleanup PR: https://github.com/dagger/dagger/pull/11657
@still garnet was there a reason for changeSet.Layer() to specifically return the full filesystem diff and not just scoped directory diff as shown in https://github.com/dagger/dagger/issues/11656 ? I dug a bit [here](#1459242431596990709 message), but I might miss the big picture.
From what I see, I guess it's because it was too much work short-term ?
seems like a bug to me
not sure where from, Directory.diff is supposed to preserve Dir, from a quick look at the code it is
thanks, will check it out next week on the side
So far I have:
Is this error concealing expected? I have the following code:
package main

import (
	"dagger/error-test/internal/dagger"
)

type ErrorTest struct{}

// Builds a container with the Dagger CLI
func (m *ErrorTest) Build() *dagger.Container {
	return dag.Container().
		From("alpine").
		WithExec([]string{"apk", "add", "--no-cache", "curl"}).
		// -L follows the redirect that GitHub release downloads return
		WithExec([]string{"curl", "-L", "-o", "dagger.tar.gz", "https://github.com/dagger/dagger/releases/download/v0.19.9/dagger_v0.19.9_linux_amd64.tar.gz"}).
		// the archive name below is intentionally wrong, to trigger the error
		WithExec([]string{"tar", "xzf", "dgr.tar.gz"}).
		WithExec([]string{"dagger", "version"}) // checks the dagger CLI is correctly installed
}
the tar command has an incorrect file name intentionally.
running dagger call build yields the following output:
1|marcos:tmp/error-test (β |N/A)$ dagger call build
β connect 0.2s
β load module: . 0.2s
β parsing command line arguments 0.0s
β errorTest: ErrorTest! 0.0s
β .build: Container! 0.2s ERROR
! exit code: 1
Full trace at https://dagger.cloud/marcos-test/traces/ad084ac846c98b59104e2240ee4c17f4
1|marcos:tmp/error-test (β |N/A)$
if I add some verbosity level (4), I can actually see the no such file or directory error in the output.
I had the impression that we used to show that error message in the default verbosity level but it seems that we might be hiding a bit too much now?
cc @still garnet
mind opening an issue? could have something to do with the new logic for showing root causes, maybe it doesn't handle lazy return values
Yup. Just making sure that should be valid
@still garnet @obsidian rover Question regarding changesets merge: I have a simplified version that uses git merge (by creating an empty git repo), from which I removed some of the junk created by Claude.
It seems to work, but the granularity is not handled well and I'm not sure how to deal with it.
If I want to use a PREFER_OURS strategy (for instance) and we have conflicts, like a file modified by both changesets, I can't merge them using git: git works at chunk level, not at file level. So if the file is modified twice but in a way that doesn't conflict for git, the changes are applied with no issue, whereas file-level granularity would mean that the full file from ours is used and the changes from theirs are discarded.
(Hope that's clear enough)
So I wonder if this file-level granularity is a good idea in the end. It makes sense if we only want to fail, but if handling file-level granularity means not using git, is that a good choice? Especially if we also implement chunk-level granularity.
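For reference, file-level granularity without git can be sketched in a few lines. This is only an illustration under loud assumptions: a changeset is modeled as a map from path to new content (no deletes, no renames), and the Strategy type, FailOnConflict, PreferTheirs and MergeFiles are hypothetical names. Only PREFER_OURS comes from the message above.

```go
package main

import (
	"fmt"
	"sort"
)

// Strategy names are hypothetical; only PREFER_OURS appears in the discussion.
type Strategy int

const (
	FailOnConflict Strategy = iota
	PreferOurs
	PreferTheirs
)

// Changeset maps a file path to its new content (a simplification).
type Changeset map[string]string

// MergeFiles merges two changesets at file granularity: a path changed on
// both sides with different content is a conflict, even if git could have
// merged the individual hunks.
func MergeFiles(ours, theirs Changeset, s Strategy) (Changeset, []string, error) {
	merged := Changeset{}
	var conflicts []string
	for p, c := range ours {
		merged[p] = c
	}
	for p, c := range theirs {
		oc, both := ours[p]
		if !both || oc == c {
			merged[p] = c
			continue
		}
		conflicts = append(conflicts, p)
		if s == PreferTheirs {
			merged[p] = c
		}
		// PreferOurs keeps the value already copied from ours.
	}
	sort.Strings(conflicts)
	if s == FailOnConflict && len(conflicts) > 0 {
		return nil, conflicts, fmt.Errorf("conflicting files: %v", conflicts)
	}
	return merged, conflicts, nil
}

func main() {
	ours := Changeset{"a.txt": "ours", "b.txt": "only ours"}
	theirs := Changeset{"a.txt": "theirs", "c.txt": "only theirs"}
	merged, conflicts, err := MergeFiles(ours, theirs, PreferOurs)
	fmt.Println(merged["a.txt"], conflicts, err)
}
```

With PreferOurs, a.txt keeps the full "ours" content and the "theirs" edit is dropped entirely, which is exactly the behavior a chunk-level git merge would not give.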
Thoughts? Ideas?
How do I run specific module tests?
I can see dagger call php-sdk test will run PHPUnit. But how do I run the tests located in module_php_test.go?
I used to do it with something like this:
dagger call test specific --race --pkg="./core/integration" --run="TestPHP"
It looks like dagger call test-split test-module-runtimes runs ALL the module runtimes. But I only want to run TestPHP
I don't think we implemented fine-grained test filtering in all our dagger toolchains. But we are working on it
The "test-split" toolchain is a handrolled stopgap. It's very coarsely split in hardcoded buckets for purposes of CI scaleout
https://github.com/dagger/dagger/pull/11638 my nushell SDK is ready for review
@tepid nova is there something keeping .dagger/main.go from being migrated to Dang? (in dagger/dagger)
Yeah parallel
Or, once we merge dagger generate from @leaden glade, we can split it into static generate functions without making the UX worse. Then we could switch to dang without the need for parallel support
As discussed: https://github.com/dagger/dagger/issues/11695
Easy review https://github.com/dagger/dagger/pull/11701
little dang update: https://github.com/vito/dang/pull/11
tl;dr @directives can be placed ahead of the field now
# Multiple prefix directives (on separate lines)
@check
@cache(ttl: 60)
pub multiDirective: String! {
"multi"
}

That's fair enough, thank you for the info. I resorted to calling dagger directly inside the testdata modules and then letting CI handle the rest.
I could use some insight on getting fields working: #11689
It registers fields, but with a couple of caveats:
- If the field is called field, that works fine. But if it's called stringField, it fails. Presumably due to kebab-case conversion.
- If the field exists on the main object, and the field is accessed without calling any normal function first, then the main object is never constructed in PHP and thus, the values are never initialized either.
When either of those caveats occur, the field returns the value you'd expect if you cast null to the field's type.
What am I missing?
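As a reference point for the kebab-case suspicion above, here is a minimal sketch of a camelCase to kebab-case conversion. ToKebab is a hypothetical helper and not the conversion Dagger itself uses; it also does not try to split acronyms. The point is only that a field registered as stringField and looked up as string-field will not match unless one side converts.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// ToKebab converts a camelCase identifier to kebab-case by inserting a
// hyphen before each upper-case rune that follows a lower-case rune or a
// digit. Hypothetical helper, for illustration only.
func ToKebab(name string) string {
	var b strings.Builder
	var prev rune
	for i, r := range name {
		if unicode.IsUpper(r) {
			if i > 0 && (unicode.IsLower(prev) || unicode.IsDigit(prev)) {
				b.WriteByte('-')
			}
			b.WriteRune(unicode.ToLower(r))
		} else {
			b.WriteRune(r)
		}
		prev = r
	}
	return b.String()
}

func main() {
	for _, n := range []string{"field", "stringField", "testSplit"} {
		fmt.Printf("%s -> %s\n", n, ToKebab(n))
	}
}
```

A name with no upper-case letters, like field, is unchanged by the conversion, which would be consistent with field working while stringField fails.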
ping for a ship it on ebpf support https://github.com/dagger/dagger/pull/11548, been rebasing on it as needed for debugging but would be nice to just have it in main
@still garnet I had replied yesterday to your 'ignore checks not in toolchains' comment, but screwed up and it only went to linear... Just re-sent now
Related to toolchain only way of doing dagger: I'm thinking about ease of use for a developer. With this PR it would be possible to dagger toolchain install without a top level module first. If it doesn't exist, a bare module (with no SDK) will be created, to support the toolchain installation.
On the other side, let's say I want to create a custom function, but I want to follow this pattern. Would it be interesting that dagger init --sdk:
- creates a base dagger module
- init a module at a different path (like toolchains/something) with the sdk
- install the toolchain
Equivalent to:
$ dagger init
$ dagger init --sdk=go toolchains/...
$ dagger toolchain install toolchains/...
but all at once. That way to get started, either by installing a toolchain or by creating a module to support custom functions, is just one single command.
I think the UX is not perfect, as there are cases where we want to init a module with an sdk without having a root module, especially while working on a toolchain to be shared somewhere else. So maybe it requires some flag in one case or the other (or a different command? but my first feeling would be to avoid it)
I finally have my PR to access checks from currentEnv ready.
And currentEnv, when called from a toolchain's module, returns the currentEnv of the root module, so we can build a toolchain that accesses all the checks, wherever they are defined.
https://github.com/dagger/dagger/pull/11650
cc @tepid nova
Quick native CI question: if i have a PR open and force-push on top, do I start with a fresh cache when it re-runs the tests ?
currently yes I believe
Just ran into an issue with local cache on 0.19.10. Not sure exactly what happened but Dagger wasn't able to find a local dependency which definitely exists. I pruned the cache and now its fine. here's a trace https://dagger.cloud/kpenfound/traces/cfa200f90f21ca824d97880963e54f40?listen=0a7e7243a1ad2e65&listen=9bc023927c67c7b5
Pruned by using dagger core engine local-cache prune or just forced push on top ?
yeah with dagger core engine local-cache prune, not sure what you mean by force push. My cache issue was unrelated to your CI question
couple easy reviews for anyone looking for thursday night fun:
Hello maintainers. There are a lot of design discussion threads this week... I am listing them here so you are aware, and can participate. These feel like a "knot", meaning that they are collectively blocking a lot of implementation... I hope we can resolve this knot together soon!
1. Toolchain-centric UX. With the arrival of toolchains, checks, user defaults... The "ideal" way of using Dagger is changing. But how exactly? There are still unknowns, and these unknowns are slowing down implementation. https://github.com/dagger/dagger/issues/11695#issuecomment-3757509323 . Note @still garnet I am writing down my thoughts there.
2. Gaps in CI workflow. In theory, every dagger module now has a built-in CI workflow: dagger check. In practice, there is more to CI/CD than checks. The bare minimum is to publish what you checked. But how to coordinate all the possible permutations of "check" and "ship", without falling into the pseudo-code nightmare Dagger was meant to prevent? https://github.com/dagger/dagger/discussions/11653
3. Integration tests & toolchains. In a toolchain-centric UX, what is the very best way to configure integration tests with Dagger? We have identified 3 distinct patterns: a) toolchain customizations (to inject dependencies in test environment), b) "testcontainers model" -> orchestrate dependencies from test code. c) "up services": project has top-level services (+up), automatically discoverable by hostname in every container, test code can just use the hostnames and connect. No glue code. No written thread yet. Lots of live discussion with @tidal spire @fair ermine @charred lotus @civic yacht . Copying @tawny iris who is exploring testcontainers+dagger - we should talk. Discussion thread: https://github.com/dagger/dagger/discussions/11710
4. Test tracing. Can we give developers state-of-the-art test tracing, just by installing a dagger toolchain? The answer is yes. We made excellent progress on that front this week.
5. Test splitting. How to do it out of the box? I am working on a proposal, after many live discussions, and throwing away many designs. The goal is that you can get magical test splitting (scheduling of your tests across multiple machines in a cluster) out of the box, only by installing a dagger toolchain. This is equivalent to test splitting products by CircleCI and Buildkite - but with zero configuration. Would be cool
6. Better context API. We need a better way for dagger functions to interact with their context (everything outside their sandbox). Currently we have a "shadow API" that has evolved organically. This is becoming a blocker in several ways. Example: we need the ability to dynamically filter files uploaded from the client. Toolchains need a more explicit way to receive the context from their project (this is related to 1 - toolchain-centric UX).
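To illustrate what "scheduling of your tests across multiple machines" in the test splitting item can mean, here is a minimal sketch of one common heuristic: sort tests by duration and always give the next test to the least-loaded machine. It is not the proposal mentioned above. The Test type, Split, and the per-test durations are assumptions made for the example; a real splitter would need to get timings from somewhere, for instance earlier runs.

```go
package main

import (
	"fmt"
	"sort"
)

// Test is a hypothetical record: a test name plus an assumed historical
// duration in seconds.
type Test struct {
	Name    string
	Seconds float64
}

// Split assigns tests to n buckets with the greedy longest-processing-time
// heuristic: longest tests first, each one to the currently lightest bucket.
func Split(tests []Test, n int) [][]Test {
	if n < 1 {
		n = 1
	}
	sorted := append([]Test(nil), tests...) // copy, so the caller's slice is untouched
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Seconds > sorted[j].Seconds })
	buckets := make([][]Test, n)
	load := make([]float64, n)
	for _, t := range sorted {
		least := 0
		for i := 1; i < n; i++ {
			if load[i] < load[least] {
				least = i
			}
		}
		buckets[least] = append(buckets[least], t)
		load[least] += t.Seconds
	}
	return buckets
}

func main() {
	tests := []Test{{"TestPHP", 90}, {"TestGo", 60}, {"TestPython", 50}, {"TestTypeScript", 40}}
	for i, b := range Split(tests, 2) {
		fmt.Println("machine", i, b)
	}
}
```

The heuristic is not optimal in general, but it is simple and keeps the slowest bucket close to the average, which is what determines wall-clock time in a scale-out.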
UPDATE: here's a discussion thread for point 3: integration tests & toolchains
quick perf improvement, faster image pulls/pushes thanks to faster (de)compression: https://github.com/dagger/dagger/pull/11709
As discussed: for our own e2e tests we're going to switch to the "testcontainers model" where service dependencies are orchestrated from the test code, rather than from external orchestration functions.
--> https://github.com/dagger/dagger/issues/11713
As a nice side effect @still garnet, this will allow us to rely exclusively on reusable toolchains for test execution
I just merged a PR that fixes ci:bootstrap, that should help to have green PRs
This is pretty overwhelming but exciting! I'll try to follow and get involved in the discussions, thank you for the transparency. That's one thing I love about Dagger. My main ask (and hope) is that these all still work locally in the TUI as much as possible.
Of course, local-first, always.
Very excited about this! Hope that the work that I've done can help get the Docker API implemented. native support for docker compose, kind, and Testcontainers would be a huge boost to dagger's utility in the e2e testing world. At least, it would then hit all of our use cases at my job
Test tracing. Can we give developers state-of-the-art test tracing, just by installing a dagger toolchain? The answer is yes. We made excellent progress on that front this week.
I would enjoy reading more about this one 🪨 (that was meant to be a rocket, but I don't have the heart to get rid of it).
Is it expected that env variables persist between multiple from calls? e.g. dagger -c 'container | from golang | from alpine | with-exec -- sh -c "env | grep ^GO" | stdout' will still show the go envs:
β Container.from(address: "golang"): Container! 0.2s
$ .from(address: "alpine"): Container! 0.2s CACHED
β withExec sh -c 'env | grep ^GO' 0.1s
β .stdout: String! 0.1s
GOTOOLCHAIN=local
GOPATH=/go
GOLANG_VERSION=1.25.6
IMO that's a bug, it should reset the env vars to be alpine's if you re-run from. Probably just never hit that case yet in practice
@still garnet only getting to the async part of my day now.. will read your follow-up comments, thanks for those
@leaden glade I'm finally catching up on my PRs after 10 days of only sync... It looks like "checks in current env" conflict with my "dynamic checks" plumbing in an irreversible way. I'm trying to reconcile them but I might have to just throw my whole PR away
It looks like the root cause is the constraint of having to graft the feature on Env, which has its own way of dealing with modules. So you had to bridge 2 different worlds in the best way you could
Yet another arrow pointed at "we need a better context API and we need it yesterday..."
Better context API 🧵
I seem to be getting a lot of failed to get other directory ref: no active sessions errors today, looks like flakes (confirmed that they're flakes)
That's really sad. Maybe in that case I should completely revert it then. I already reverted part of it that was breaking some other features. So it might be a sign this is not the right approach. I'll open a PR to revert it (there's still time, it's not part of a release, and I don't think it's used at the moment) and we can discuss it.
no no I found a solution, check out #1463332258453651569
That one has to do with the buildkit solver. Once everything is dag-op-ified, removing the solver is the next step (and arguably the most important part of theseus). Fortunately, we are almost done with removing all use of LLB thanks to @rocky plume so I started work on removing the solver now in prep for it
I am also working on an intermediate step that will fix the flakes in TestConstructor in the immediate term, but those "no active sessions" flakes are probably best fixed via just guillotine'ing the solver
If i notice anything shorter term I will fix that too though, I know they are obnoxious
Just don't want to invest effort in fixing something that's going away very soon
so I started work on removing the solver now in prep for it
This sounds pretty exciting!
I'm just wrapping up my small container.From post-PR-review changes this morning (thanks for the feedback)
Need a quick ship-it on the cache-expert skill: https://github.com/dagger/dagger/pull/11720
still working on fleshing it fully out, but I already have other PRs out that will require updates to it, so much easier to just iterate in each PR
This other one might be a little more controversial: https://github.com/dagger/dagger/pull/11730
That's a skill for daggerizing a repo (i.e. adding dagger automation to an existing repo). Long story short, I was playing with a vibe coded project over the weekend, prompted Codex to daggerize the repo (rust backend + react frontend) and then told it to turn the general knowledge it learned into a reusable skill.
I don't think the opinions in it are all the latest and greatest but if others agree we can merge and collectively iterate on it. Otherwise I'll just keep it to myself for now and keep iterating. cc @tidal spire @tepid nova since it also ends up encoding toolchain related patterns, seems up your alley
This one already was super useful today. Solved a couple cache conundrums myself for this pr, but was getting more and more brain dead after each one with one more test failure to go, so I loaded the cache-expert skill into codex and gave it some rough ideas and it totally nailed the fix (and found another bug along the way) 🦾
It feels like by writing the skill, I uploaded part of my brain to the computer. Then I handed off to it when my biological brain started getting tired
ok that's amazing
can we do the same thing for codegen plz
Makes me want to make an agent that interviews us here on discord then contributes a new skill
Me tonight when reading your skill Erik ahahah
https://tenor.com/view/boy-math-school-asia-study-gif-18879119
There seems to be a regression in main compared to v0.19.10. dagger call generate fails for me. I can have a look later unless someone knows exactly what the issue is with Error: introspection query: returned error 502: {"data":null,"errors":[{"message":"http do: Post \"http://dagger/query\": stream error: stream INTERNAL_ERROR; received from peer"}]}
looks like an engine panic to me - anything in the engine logs?
Is there a usable, non-node, non-rosetta, graphql playground ? I've heard rumors that we embedded one in the engine.
I'm confused about moduleSource.directory, it seems the documentation doesn't describe what the code does.
Fix for security scan failures: https://github.com/dagger/dagger/pull/11736
Also, I noticed weird 1m30s gaps between the start of the withExec for go test ... and the actual check spans in our integration test telemetry, turns out it's the overhead of go test building the test binaries when -race is enabled.
This fixes it so all of the workflows save that time (reduced to ~10s now), while leaving -race on for one of them: https://github.com/dagger/dagger/pull/11735
- I also snuck in a separate commit to remove the LLM workflows since they are causing red on main due to the commands not existing anymore cc @still garnet
Toolchains v2 π§΅
It is a dumb question. How do I rerun failed jobs in https://github.com/dagger/dagger/pull/11739?
there's multiple places in dagger cloud where you can do that. One of them is directly on the checks page or in the page about a PR
I'll re-launch the two testSplit in the meantime, the security:scanSource is something else.
Does @fresh harbor need extra permissions to do it?
fix for TestConstructor flakes amongst other improvements: https://github.com/dagger/dagger/pull/11729
I'm working on updating the cache-expert skill with the changes made there, so might push but should be docs only, safe to review
Ship functions π§΅
Weird cache related failure (cc @civic yacht ) in CI. Not sure what to make of it
failed to get content hash: failed to get directory: failed to get snapshot: failed to snapshot: failed to sync: conflict at "subdir2/.gitignore": change kind changed from "add" to "delete" during sync
https://dagger.cloud/dagger/traces/c0066be1da7263d6119f72b097aaea8b?span=7bdff74edac11983
That one is somewhat rare but has popped up for a while. The test case itself is for quite the corner case: filesync is told to honor .gitignore but then also to exclude .gitignore from the load: https://github.com/sipsma/dagger/blob/93c2afac88c12b8138706cf48e8c6b8855dfa893/core/integration/host_test.go#L367-L367
There must be a weird race condition that causes it to flake occasionally, but it's been rare enough and for such an obscure case that I haven't gotten around to it yet
Need to add an agent skill ref doc on filesync, seems like something good for one of them to do in the background
Welp... Naively running dagger client install go in ./core/integration certainly isn't going to cut it...
https://dagger.cloud/dagger/traces/cfa2681186480387311ce359a721f3cd
go: github.com/dagger/dagger/dagger imports
github.com/vektah/gqlparser/v2/gqlerror imports
github.com/vektah/gqlparser/v2/ast tested by
github.com/vektah/gqlparser/v2/ast.test imports
github.com/andreyvit/diff: github.com/dagger/dagger/engine/distconsts@v0.19.10 (replaced by ./engine/distconsts): reading engine/distconst
.mod: open /src/engine/distconsts/go.mod: no such file or directory
I'd bet engine/distconsts is not getting included in the filesync upload, there's a chance including that in the dagger.json you're using will fix it, provided the client generator stuff pays attention to that field
Ah! Thanks will try that
But wouldn't that also cause other important dagger functions to fail?
can't think of why, worst case it seems like it would be an unused file
Ah right, almost all the dagger functions we run are loaded from a module in toolchains/.... So the dagger.json from the root is almost never used for this
This made me realize we have leftover entries in //dagger.json:include that I can safely remove
That was it, thanks @civic yacht π
Mmm now I need to decide where I actually want that generated client to live... As you predicted @tidal spire
Maybe github.com/dagger/dagger/internal/test/dagger ?
This is where, from a Go nativeness point of view, it would be nice for these generated bindings to be only for the toolchain functions, and somehow pluggable into the real dagger.io/dagger
I don't know if it's because our module has so many toolchains and dependencies, but generating this client is slllllooooowwww...
each time or only on empty cache?
Just over 2mn fully loaded (on warm cache) https://dagger.cloud/dagger/traces/dfd727085a940acf29431011033c4b80#25d2f1c62278de15
something must be going wrong then, it shouldn't ultimately be much different than dagger develop in terms of speed
Re-running the exact same command to be sure: https://dagger.cloud/dagger/traces/8023a1a9102f30507b80d30cf3a8273b#e4f3f3566c8ea728
Could be toolchain-related.
Toolchains didn't exist when generated clients were developed. Maybe that triggers an unforeseen edge case
this is what @fair ermine was talking about earlier!
after a few reruns on this one I realized I have an actual failure
- testSplit:testBase is failing the TestIntrospection suite, however for some reason that suite doesn't get a span like the others https://dagger.cloud/dagger/checks/github.com/dagger/dagger@5011484ebbfd0e670753d9a9e07815c49939dbeb?check=testSplit:testBase
- It's complaining that testdata/introspection.json is out of date and that I can update it: "You can run 'go test . -update' to automatically update testdata/introspection.json to the new expected value." I guess we're missing it in the generators. I'll find a spot in generate for this to fit in
PR for a "dagger codegen" skill π https://github.com/dagger/dagger/pull/11747
New mystery:
This command:
dagger client install go ./client/testutil/dagger
Modifies these files:
- Created: internal/testutil/dagger/
- Modified: go.mod
- Modified: go.sum
- Modified: sdk/go/dagger.gen.go
Ooooh it's because of the difference between 0.9.10 and main
But how did it find that particular path sdk/go/dagger.gen.go? I guess it followed the replace directive for dagger.io/dagger?
OK confirmed, that's what it does. It intentionally looks for a replaced version of dagger.io/dagger, and overwrites it with the version bundled in the current engine I guess
I don't understand why... It seems to me that actually, if dagger.io/dagger is replaced, the dev specifically wanted to take control of which version to use. So the generator could 1) leave it alone silently, 2) leave it alone with a warning, 3) fail with an error. But I can't think of any case where it's a good idea to silently overwrite it
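A rough sketch of the three options listed above, for anyone picking this up. Everything here is hypothetical and not taken from the actual generator: the ReplacePolicy type, hasReplace and decide are illustration-only names, and the go.mod scanner is a deliberately simplified stdlib-only stand-in (a real implementation would more likely parse go.mod with golang.org/x/mod/modfile).

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// ReplacePolicy is a hypothetical knob for how a client generator reacts
// when the target module is already replaced in go.mod.
type ReplacePolicy int

const (
	LeaveSilently    ReplacePolicy = iota // option 1: leave it alone silently
	LeaveWithWarning                      // option 2: leave it alone with a warning
	FailWithError                         // option 3: fail with an error
)

// hasReplace reports whether goMod contains a replace directive for modPath.
// Simplified scanner: it handles single-line directives and "replace (" blocks,
// strips "//" comments, and does not validate the rest of the file.
func hasReplace(goMod, modPath string) bool {
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if i := strings.Index(line, "//"); i >= 0 {
			line = strings.TrimSpace(line[:i])
		}
		switch {
		case line == "replace (":
			inBlock = true
			continue
		case inBlock && line == ")":
			inBlock = false
			continue
		}
		if !inBlock {
			rest, ok := strings.CutPrefix(line, "replace ")
			if !ok {
				continue // not a replace directive
			}
			line = rest
		}
		fields := strings.Fields(line)
		if len(fields) > 0 && fields[0] == modPath {
			return true
		}
	}
	return false
}

// decide reports whether the generator may rewrite the module, plus an
// optional warning or error, depending on the chosen policy.
func decide(goMod, modPath string, p ReplacePolicy) (rewrite bool, warning string, err error) {
	if !hasReplace(goMod, modPath) {
		return true, "", nil
	}
	switch p {
	case LeaveWithWarning:
		return false, modPath + " is replaced in go.mod; leaving it untouched", nil
	case FailWithError:
		return false, "", errors.New(modPath + " is replaced in go.mod; refusing to overwrite it")
	default:
		return false, "", nil
	}
}

func main() {
	goMod := "module example.com/app\n\nrequire dagger.io/dagger v0.19.10\n\nreplace dagger.io/dagger => ./sdk/go\n"
	rewrite, warning, err := decide(goMod, "dagger.io/dagger", LeaveWithWarning)
	fmt.Println(rewrite, warning, err)
}
```

None of the three policies ever touches the replaced module; they only differ in how loudly they say so.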
@fair ermine when you wake up
Container defaultAddress ready for review https://github.com/dagger/dagger/pull/11714
The go.mod and go.sum also got destroyed
First call to go test with my generated client π
# github.com/dagger/dagger/internal/testutil/dagger/dag
/app/internal/testutil/dagger/dag/dag.gen.go:152:26: undefined: subnetNumber
/app/internal/testutil/dagger/dag/dag.gen.go:152:40: too many arguments in call to client.EngineDev
have (unknown type, []"github.com/dagger/dagger/internal/testutil/dagger".EngineDevOpts...)
want (..."github.com/dagger/dagger/internal/testutil/dagger".EngineDevOpts)
/app/internal/testutil/dagger/dag/dag.gen.go:721:23: undefined: sourcePath
/app/internal/testutil/dagger/dag/dag.gen.go:721:35: undefined: doctumConfigPath
/app/internal/testutil/dagger/dag/dag.gen.go:721:35: too many arguments in call to client.PhpSDK
have (unknown type, unknown type, []"github.com/dagger/dagger/internal/testutil/dagger".PhpSDKOpts...)
want (..."github.com/dagger/dagger/internal/testutil/dagger".PhpSDKOpts)
/app/internal/testutil/dagger/dag/dag.gen.go:726:26: undefined: sourcePath
/app/internal/testutil/dagger/dag/dag.gen.go:726:38: too many arguments in call to client.PythonSDK
have (unknown type, []"github.com/dagger/dagger/internal/testutil/dagger".PythonSDKOpts...)
want (..."github.com/dagger/dagger/internal/testutil/dagger".PythonSDKOpts)
/app/internal/testutil/dagger/dag/dag.gen.go:736:24: undefined: sourcePath
/app/internal/testutil/dagger/dag/dag.gen.go:736:36: too many arguments in call to client.RustSDK
have (unknown type, []"github.com/dagger/dagger/internal/testutil/dagger".RustSDKOpts...)
want (..."github.com/dagger/dagger/internal/testutil/dagger".RustSDKOpts)
Giving up for today. Here is my progress so far. https://github.com/dagger/dagger/pull/11746
Kind of related to container defaultAddress and the discussion to default.
Would it be possible to connect that with .env files? Defining the default address of a container in a .env file could be great, but I'm not sure how .env files are tied (or not) to default values.
Yes!
It's for integration tests, because we want to use the dev version of the SDK, so my way to specify that was through replace, instead of a dev flag.
But you're right, there might be cases where the user wants to replace the package for their own needs. In that case I'm not sure how the user can specify that when generating the client.
another cool one for you @civic yacht
failed to solve builtin container: failed to load cache key: unable to get info about digest: NotFound
https://dagger.cloud/kpenfound/traces/04172bfea183bda83ae02c385b2ce3d1?span=cc08aff4f9f19104
let me know if there's any info i can grab before nuking the cache
probably it should be the integration test harness that should do this then, and not the generator itself
did you find a solution?
I don't have a perfect solution, but since I can know from the Go SDK if the engine is a dev one or versioned, I could rely on that instead of a replace directive.
If dev -> add the replace + the lib
If not dev -> ignore
Same as what I did for https://github.com/dagger/dagger/pull/11708
PSA: Cloud checks UI updates rolling out, there were some breaking Cloud API changes, if you see errors just do a hard-refresh
@civic yacht @still garnet @leaden glade wdyt?
@fair ermine is the issue with go.mod & go.sum the same problem?
The issue with go.mod? The go.mod is simply updated to use the corresponding released version of the dagger client (for released engine)
But if you're generating a client with a dev engine and you already have a replace in your go.mod, maybe then yeah that might create issues
Only skimmed the previous thread but the most common case of needing to run integration tests against the dev version of the Go SDK is when you add a new core API and need to test it. So need to make sure that's handled (both for local dev + ci) and easy to do
OK. I always get lost in the entanglement of versions: engine, SDK, client lib.
@fair ermine are client generators tied to the regular SDK system? In other words: can I develop my own custom SDK, and add support for client generator to my SDK, and dagger client will know how to use that?
Yes! if your SDK implements the ClientGenerator interface, it can be used with dagger client
dagger client install github.com/my/sdk ?
Which SDKs support that interface today? Just Go/Typescript?
@fair ermine ironically your replace-overwrite behavior exists to enable testing against a dev version. And it's preventing me from building & testing against a dev version also (overwriting my dev lib with a stable release of the lib)
I believe so
(and by the way if you try with an unsupported SDK, I believe there is a bug that will cause the SDK to hang forever)
Here's an issue https://github.com/dagger/dagger/issues/11751
Separate issue... @fair ermine do you understand this error? https://github.com/dagger/dagger/pull/11746#issuecomment-3802641952
Yeah I'm sorry I didn't expect that use case...
It's the beauty of dogfooding π
Looks like a type issue:
/app/internal/testutil/dagger/dag/dag.gen.go:152:40: too many arguments in call to client.EngineDev
have (unknown type, []"github.com/dagger/dagger/internal/testutil/dagger".EngineDevOpts...)
want (..."github.com/dagger/dagger/internal/testutil/dagger".EngineDevOpts)
The repro is on your PR?
Hi! Curious if there's been any more internal discussion about supporting tools that rely on DOCKER_HOST or a docker-compatible CLI a la https://github.com/dagger/dagger/discussions/11710
Not so far, you're still at the bleeding edge π We're focusing on supporting the testcontainers pattern natively, which has rough edges of its own
Docker engine compat is still highly desirable, but our plates are full internally for now
For a serious dagger deployment, we'd recommend migrating from testcontainers to native dagger anyway (once we support the pattern in a solid way)
Wondering if anyone on the team could help me with an internal change to include something on each OTEL log entry that can be used to tell which container or service that entry came from? That would help me make progress on the docker engine compat, which I'd like to upstream someday
You should already have that context, from walking the tree
You can walk up the parents, and see if your span is inside a Container.withExec or Container.asService span -> that's your answer
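A minimal sketch of that parent walk, assuming the collected spans have already been indexed by ID. The Span struct and owningSpan helper are hypothetical stand-ins for illustration, not types from the engine or the OTEL SDK; only the two span names come from the message above.

```go
package main

import "fmt"

// Span is a hypothetical, minimal view of a collected span.
type Span struct {
	ID       string
	ParentID string // empty for a root span
	Name     string
}

// owningSpan walks up from spanID and returns the nearest span (the starting
// span included) named Container.withExec or Container.asService.
// It returns false if the chain ends, is broken, or loops.
func owningSpan(spans map[string]Span, spanID string) (Span, bool) {
	seen := map[string]bool{}
	for id := spanID; id != "" && !seen[id]; {
		seen[id] = true
		s, ok := spans[id]
		if !ok {
			return Span{}, false // parent not collected (yet)
		}
		if s.Name == "Container.withExec" || s.Name == "Container.asService" {
			return s, true
		}
		id = s.ParentID
	}
	return Span{}, false
}

func main() {
	spans := map[string]Span{
		"root": {ID: "root", Name: "connect"},
		"svc":  {ID: "svc", ParentID: "root", Name: "Container.asService"},
		"exec": {ID: "exec", ParentID: "svc", Name: "exec docker"},
	}
	// A log record carries the ID of the span it was emitted in.
	if owner, ok := owningSpan(spans, "exec"); ok {
		fmt.Println("log belongs to", owner.Name, owner.ID)
	}
}
```

The hard part in practice is likely ordering: a log record can arrive before its parent spans do, which is why the helper reports "not found" instead of guessing.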
aye, that's just been maddeningly difficult to do in practice
probably would be equally difficult internally
@still garnet is the expert in this area π Maybe he can point you to code to look at? Or we can look at the code you have
Also -> good candidate for a new claude skill!
I'm embarrassed to admit that I've struggled even with the excellent direction vito's already given. I don't want to warp your guys' focus any, so I'll just keep my head to the grindstone on this. Please keep me posted if your focus shifts in this direction any, I'd love to help out!
are you using Cloud by any chance? it has a feature for that now, if i understand the request correctly (aggregating logs across services/containers and annotating them with where they came from)
No, I'm not using cloud.
I'm trying to write a Dagger module to create a Service which implements the Docker API by translating the API calls made by the docker CLI to the Dagger engine. Any Dagger user could install this module and e.g.
dag.Container().
	From("docker:cli").
	WithServiceBinding(
		"docker",
		dag.DockerAPI().
			AsService(dagger.ContainerAsServiceOpts{
				ExperimentalPrivilegedNesting: true,
			}),
	).
	WithEnvVariable("DOCKER_HOST", "tcp://docker:1331").
	WithExec([]string{"docker", "run", "busybox", "echo", "hello"})
My ask is targeted at being able to implement the API calls behind docker logs and docker attach (I'm only really trying to implement stdout and stderr--I understand that stdin is a different beast entirely), which need a stream of logs from the Service. I've found how to get a hold of the logs from all Dagger Containers and Services via OTEL, but I've been struggling to sort through those logs to stream only the ones from a specific Service.
EDIT:
So far, I have this bit (https://github.com/frantjc/daggerverse/blob/main/dogger/internal/dogger/internal/backends/container.go#L372-L380) which creates a Dagger client and tries to gather logs and traces for a Service that the client starts later.
Then I have this bit (https://github.com/frantjc/daggerverse/blob/f8d849686eba6b9b811222dcf7d02611fe9e9e38/dogger/internal/dogger/internal/backends/container.go#L191-L218) which tries to turn those logs and traces into a stdout/stderr stream for a specific Service
It is, of course, nonfunctional at this time
collecting logs programmatically
No, the go.mod and go.sum are completely purged of all dependencies, for the actual code of the go module. And replaced only with the dependencies for the generated client
Hmm okay yeah there's an issue then!
If anyone has a few cycles, I have two small PRs I'd like to merge. Both are extracted from my generators branch. I'm trying to shrink its size by moving some changes to dedicated PRs, so the review of the generators one will be easier.
This is preliminary work for generators I extracted in a dedicated PR to help keep the generators one as focused as possible.
Allow to handle changesets in the CLI with the object directly instead ...
This is preliminary work for generators I extracted in a dedicated PR to help keep the generators one as focused as possible.
This PR extracts the code to preview patches (used in the CLI to displa...
Very quick PR: https://github.com/dagger/dagger/pull/11756
- Fixes some flakes due to a missing line update in a previous PR
- When I sent the PR out we started getting security scan failures due to a ts sdk dep, so I just bumped it as a separate commit there to save PRs
Does this error ring a bell?
$ go test -v
# github.com/docker/docker/pkg/archive
/go/pkg/mod/github.com/docker/docker@v28.5.2+incompatible/pkg/archive/archive_deprecated.go:103:47: undefined: archive.Compression
/go/pkg/mod/github.com/docker/docker@v28.5.2+incompatible/pkg/archive/archive_deprecated.go:159:43: undefined: archive.Compression
FAIL github.com/dagger/dagger/core/integration [build failed]
<@&946480760016207902> π
For the record it has to do with github.com/docker/docker/pkg/archive being moved to github.com/moby/go-archive, but the root cause is likely a premature go.mod update in that specific dagger branch.
More details in #1465764711101497467
Quick one: removing dead code in integration test harness (spin off from bigger test harness improvements - adopting testcontainer pattern etc) https://github.com/dagger/dagger/pull/11766
ModTree PR is merged π
And the generators PR based on it is almost ready, I should be able to open it tomorrow, with a simpler version based on ModTree (for now without scale out, but let's see if I can add it too)
dagger generate here we come! Thank you @leaden glade !
Any idea why dagger call engine-dev introspection-json as-json contents is failing with:
load http(url: "https://dl-cdn.alpinelinux.org/alpine/edge/main/x86_64/libcrypto3-3.5.4-r0.apk", refID: "qbyaq3p936rlfmt7shw4nbi3z"): invalid response status 404 Not Found
?
might be an alpinelinux issue? that URL does give me a 404. But if i replace x86_64->aarch64 it works fine
rm buildkit effects crap
Unfortunately in the refactor check.Source() got deleted, I'm adding it back in this PR: https://github.com/dagger/dagger/pull/11771
Sometimes when alpine is updating some packages, it's not an atomic operation. It already happened to me: the package is in the package list but not yet available. Some time later everything is finally aligned and it works.
And iirc we also sometimes have issues with wolfi, with package versions to update.
That said, I don't see any reason not to move to wolfi if that makes things easier
Good skim for anyone curious about the constraints that go in to choosing TUI colors: https://blog.xoria.org/terminal-colors/
Except I've decided Solarized is wrong for coopting the entire bright palette in 2011, and continue to use bright black anywhere we want something dimmed
Gotta say, as someone who prefers reading black text on a white background... a lot of terminal programs enjoy using nearly invisible bright yellows and greens.
EDIT: Also, even before considering "light themes": predominantly using red, yellow and green is a terrible idea as protanopia is the most common colour blindness
how's ours?
it's frustrating, because as a CLI designer you'd expect when given a palette of ONLY 16 COLORS that they would be, you know, readable. but every OS and terminal default seems to disagree with that principle
Honestly, it probably depends on the theme a lot.
yep, that's the issue
CLI designers choose nice themes, design for their terminal, and then find out "blue" is off limits
Is blue off limits due to poorly designed dark themes?
(sort of like bright yellows on poorly designed light themes)
Dagger CLI colors
yeah, the default Mac terminal theme for example makes blue unreadable on black, or at least very hard on the eyes (the post I linked has examples of all this). even bright blue π«
Ah, that's what I get for skimming.
@fair ermine noob question regarding the ts SDK
I defined a generator decorator in typescript. Very similar to check.
If I look at the file sdk/core.d.ts it's there:
declare const generator: () => ((target: object, propertyKey: string | symbol, descriptor?: PropertyDescriptor) => void);
And exported.
But when I'm running a command, it's not found
/src/modules/test-ts-gen/src/index.ts:24
generator,
^
SyntaxError: The requested module '@dagger.io/dagger' does not provide an export named 'generator'
Is there any black magic I'm missing?
Do you have a PR open? So I can look at your changes
I just pushed the branch, and here is the corresponding commit: https://github.com/dagger/dagger/pull/11779/changes/7036c4bd05270343c031141306330545d1d661fa
Okay, that's what I suspected, you also need to update: https://github.com/dagger/dagger/blob/3e92868ab5993b07bd996e5c0abe2d143da3f25e/sdk/typescript/runtime/tsutils/module/index.ts#L6
thanks, trying that π
working π
fix for security scan and generated files checks on main: https://github.com/dagger/dagger/pull/11780
need a review on this fix too, fixes a corner case that's causing some occasional filesync test flakes: https://github.com/dagger/dagger/pull/11748
I just rebased, and regenerated code for https://github.com/dagger/dagger/pull/11666 am I good to merge it once tests are passing?
Fixes #11569
Summary
Allow ReturnType.ANY and ReturnType.FAILURE to accept exit codes 192-255, while preserving signal-related exclusions (128-191).
Testing
dagger call engine-dev test --pkg &q...
thanks! yeah shipit
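For anyone skimming, the range rule from the #11666 summary can be sketched like this. It is an illustration only, not the code from the PR: the function names are made up, and treating 0-127 as already accepted is an assumption the summary does not state.

```go
package main

import "fmt"

// isSignalRange reports whether code falls in the signal-related
// range that stays excluded (128-191).
func isSignalRange(code int) bool {
	return code >= 128 && code <= 191
}

// acceptableExitCode is a hypothetical helper: it accepts any code in 0-255
// except the signal-related range. Accepting 0-127 is an assumption here.
func acceptableExitCode(code int) bool {
	return code >= 0 && code <= 255 && !isSignalRange(code)
}

func main() {
	for _, code := range []int{1, 127, 128, 191, 192, 255} {
		fmt.Println(code, acceptableExitCode(code))
	}
}
```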
I'm wanting to remove NewFileWithContents; however, we still have a single reference to it under https://github.com/dagger/dagger/blob/2d7ddbc5ae8a7a38caf25937ba379f5afdf00671/core/checks.go#L101
@civic yacht should I convert it to a call to srv.Select(.....) rather than creating a new dagop via the wrappers?
Automation engine to build, test and ship any codebase. Runs locally, in CI, or directly in the cloud - dagger/dagger
Yeah the Select is probably an easier lift, agree that's the right place to try first, should work afaict
caught a weird one
engine logs
time="2026-02-02T21:11:31Z" level=error msg="error exporting metrics" err="map[error:export to xhepiw26d5irhv1pjtbhescz4: failed to export resource metrics: insert metrics: database or disk is full (13) kind:*fmt.wrapError stack:<nil>]"
time="2026-02-02T21:11:31Z" level=warning msg="failed to insert log record" error="map[error:database or disk is full (13) kind:*sqlite.Error stack:<nil>]"
time="2026-02-02T21:11:31Z" level=error msg="failed to emit telemetry" error="map[error:database or disk is full (13) kind:*sqlite.Error stack:<nil>]"
time="2026-02-02T21:11:32Z" level=error msg="error exporting spans" err="map[error:export to xhepiw26d5irhv1pjtbhescz4: insert span: database or disk is full (13) kind:*fmt.wrapError stack:<nil>]"
trying to run dagger core engine local-cache prune
connect buildkit session: unexpected status 200: get or init client: open client DB: ping file:///var/lib/dagger/worker/clientdbs/nu6o57hbg580gs2y6kiz9z822.db?_pragma=foreign_keys%3DON&_pragma=journal_mode%3DWAL&_pragma=synchronous%3DOFF&_pragma=busy_timeout%3D10000&_txlock=immediate: database or disk is full (13)
Waited a few minutes and ran prune again and it was fine, so maybe I had to wait for gc?
@civic yacht looks like we are still calling chownLLB under https://github.com/alexcb/dagger/blob/b8282555c073c4cbbd08a1376acf8f5cdc72c366/core/container.go#L1300
I recall there was a performance regression, but it looks like you fixed that.
Is it safe for me to work on changing WithMountedCache to use the dagop version of chown?
Yep that sgtm!
Design knot proposal part 1: Module vs Workspace
https://gist.github.com/shykes/e4778dc5ec17c9a8bbd3120f5c21ce73
cc @tidal spire @still garnet @charred lotus
Dagger Design: Part 1 - Module vs. Works...
Part 2: Workspace API
https://gist.github.com/shykes/86c05de3921675944087cb0849e1a3be#cannot-be-stored
some easy PRs for anyone in the mood:
- https://github.com/dagger/dagger/pull/11786 - remove network dep on downloading apache LICENSE during dagger init to avoid CI network flakes
- https://github.com/dagger/dagger/pull/11784 - tweak function caching docs to clarify layer vs. function cache
Design knot part 3: Artifacts.
https://gist.github.com/shykes/aa852c54cf25c4da622f64189924de99
New skill just dropped π π§ https://github.com/dagger/dagger/pull/11794
(Distilled from making the "Design knot" proposals)
@tidal spire @still garnet @charred lotus @obsidian rover notes from tonight's discussion on design knot & proposal
cc <@&946480760016207902>
note: "Solomon Hykes" is actually all 4 of us sharing a computer, talking to Kyle π
"RCI" = "our CI"
https://docs.google.com/document/d/1N_DInAeEeRZRa5eD9Xd0VR_hFAimFKUJxkIKQcUIhUc
Meeting Feb 3, 2026 at 16:32 PST Meeting records Summary Solomon Hykes and Kyle Penfound agreed to prioritize prototyping the workspace feature and API, defining the workspace in a monorepo as the entire monorepo with dynamic filtering managed by a TOML configuration file to handle module ...
@fair ermine is there a way to refresh all test fixtures for the typescript SDK? I'm having tests not working because of additions to the output json on the introspector (adding generator but also check that was missing) but I'd like to not update all tests by hand...
See for instance https://dagger.cloud/dagger/checks/github.com/dagger/dagger@11579cab234b7e731b836951a42c5b7a13f26738?check=typescriptSdk:testBunjs
There was a script but it's no longer maintained
However you shouldn't need to update all the tests, the field should simply not exist in the json output if it's not set
It's this part
public toJSON() {
  return {
    name: this.name,
    description: this.description,
    deprecated: this.deprecated,
    alias: this.alias,
    arguments: this.arguments,
    returnType: this.returnType,
    // TODO: refresh test suite
    // isCheck: this.isCheck,
    // isGenerator: this.isGenerator,
  }
}
(in introspector/dagger_module/function.ts)
I'm happy to not add them at all if it's not useful π
It's not for the decorators like check or generator I think
fine, I'm removing it (and tests are passing) all good then
(they have been added by Claude π«£ )
dagger generate PR is ready to be reviewed: https://github.com/dagger/dagger/pull/11779 π
tiny low-hanging fruit web UI perf PR if anyone has a sec: https://github.com/dagger/dagger.io/pull/4773
got my dev engine in an unrecoverable state somehow π€ rebuilding/restarting with ./hack/build doesn't do it. The new engine is just unresponsive. Even docker rm -fv dagger-engine.dev before rebuilding doesn't do it. No idea what else to do. Only thing incriminating in the logs is this
time="2026-02-04T20:39:54Z" level=warning msg="failed to release network namespace \"w27btp00wd5xc4a8e1jcvf2df\" left over from previous run: plugin type=\"loopback\" failed (delete): unknown FS magic on \"/var/lib/dagger/net/cni/w27btp00wd5xc4a8e1jcvf2df\": ef53"
time="2026-02-04T20:39:54Z" level=warning msg="failed to release network namespace \"yskfmfatzg9o4vucj7jy1sxtn\" left over from previous run: plugin type=\"loopback\" failed (delete): unknown FS magic on \"/var/lib/dagger/net/cni/yskfmfatzg9o4vucj7jy1sxtn\": ef53"
time="2026-02-04T20:39:54Z" level=warning msg="failed to release network namespace \"yzpx076otml1j75cttvoadwdd\" left over from previous run: plugin type=\"loopback\" failed (delete): unknown FS magic on \"/var/lib/dagger/net/cni/yzpx076otml1j75cttvoadwdd\": ef53"
docker desktop on mac btw
@civic yacht seeing some really strange overly-sticky caching behavior, it repros with a fresh engine:
1. edit cmd/codegen/generator/typescript/generator.go to add panic("wtf") inside func generate around line 43
2. dagger call generate => panics, as expected
3. remove the panic
4. dagger call generate => still panics?!
do you have traces for both 2+4?
i first noticed this because changes to the .gtpl files weren't taking effect
(how i got down this rabbit hole: changing TS to avoid /* */ style comments because it chokes on **/*.go examples in doc strings
cc @fair ermine)
Do you have an example? That's not supposed to happen
PHP SDK: Support Getters: I finally got this PR ready, for anyone that fancies reviewing a bit of PHP π
it was for a newly added API - pretty sure I hit this before and worked around it by changing the glob example, but **/*.go is a valuable example so figured I'd go in and try to fix it haha
Would like to get this one in quick for the release: https://github.com/dagger/dagger/pull/11799, cuts module load times in large monorepos at least in half ([thread](#1468216262558617775 message))
This one too, which fixes support for --mount=type=ssh in dockerbuild: https://github.com/dagger/dagger/pull/11793
I just want to say that from an outside perspective, @civic yacht is an absolute animal
Any takers for a quick skill review? https://github.com/dagger/dagger/pull/11794
think I found a regression with nested checks: https://github.com/dagger/dagger/issues/11811
--> nevermind. Today I learned of a little function called dagger call uses GraphQL introspection to figure out available functions. Should we consider moving that to dagger-native introspection? dag.CurrentTypeDefs()
any thoughts on this tiny addition to _EXPERIMENTAL_DAGGER_RUNNER_HOST=image://...? https://github.com/dagger/dagger/pull/11773
this would allow me to manage use of Dagger within my team's project with 1 environment variable rather than having to manage our own container with proxy env vars set and use _EXPERIMENTAL_DAGGER_RUNNER_HOST=container://...
would love it to be included in v0.19.12, whenever that may be!
my queue is backed up at the moment so can't look now, but I put it on the v0.19.12 milestone so we for sure get to it before the next release!
quick fix for security scan failure (go has had a lot of CVEs and minor releases lately...) https://github.com/dagger/dagger/pull/11829
I'm having difficulties building an earlier commit (to try to pinpoint when something broke, in this particular case ffc705673), I'm getting dang compilation errors:
β withExec go build -o /entrypoint ./entrypoint 0.2s ERROR
entrypoint/main.go:23:2: package dagger/dang/internal/querybuilder is not in std (/usr/local/go/src/dagger/dang/internal/querybuilder)
entrypoint/main.go:24:2: package dagger/dang/internal/telemetry is not in std (/usr/local/go/src/dagger/dang/internal/telemetry)
so I went into the dagger.json where dang is referenced and changed its pin to something that doesn't exist:
$ git diff
diff --git a/sdk/dang/dagger.json b/sdk/dang/dagger.json
index 8a6997246..66c9fb49b 100644
--- a/sdk/dang/dagger.json
+++ b/sdk/dang/dagger.json
@@ -4,6 +4,6 @@
"blueprint": {
"name": "dang-sdk",
"source": "github.com/vito/dang/dagger-sdk@main",
- "pin": "266e75bc46ac1ad596517645f4b7e1355a69b731"
+ "pin": "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee9b731"
}
}
however when I run ./hack/build again, I get the same error, which leads me to think the pin field is being ignored and we're always trying to build using main instead?
@obsidian rover @tidal spire that test you started looking into the other day seems to now just be failing almost consistently? TestToolchain/TestToolchainMultipleVersions/install_multiple_commits_with_different_names. I'm hitting an error on every PR and also locally on main. Which is weird because in CI main it passed most recently.
seems to be passing locally for me, checking (probably cache)
PS: just got it too (on my pr)
Also, pretty straightforward PR here: https://github.com/dagger/dagger/pull/11830, just adds support for individual GC settings when running a manual prune, e.g. you can do dagger core engine local-cache prune --target-space 20% (whereas previously you could only either prune everything or prune according to the exact settings in the engine's config file)
this one is good now too: https://github.com/dagger/dagger/pull/11828 (deflaked it after the first attempt), fixes a git SSH socket bug while also making some changes we'll need for the solver removal
@charred lotus Are you still working on the module source addition to ModTree (checks, generate)? Want me to have a look as I finished and merged the generate functions? (just don't want to collide with your work if you're on it)
Go ahead
@still garnet I opened a PR on dang to move from @generator to @generate as we did on the other SDKs (forgot dang...) https://github.com/vito/dang/pull/26
But I can't use it, and I suspect this is something else not working, maybe on main branch:
Error: failed to load dependencies as modules: failed to load module dependencies: failed to call module "typescript-sdk" to get functions: Error: "nodeVersionPrevLts" not found
--> /mod/toolchains/typescript-sdk-dev/ts-sdk.dang:170:46
|
168 | # Build a nodejs dev environment, with the SDK source code mounted and its dependencies installed
169 | # FIXME: use baseNodejs() to avoid duplication
170 | pub nodejsDevContainer(nodeVersion: String! =nodeVersionPrevLts): Container! {
^^^^^^^^^^^^^^^^^^
171 | container.
172 | # ⚠️ Keep this in sync with the engine version defined in package.json
|
Does that ring a bell? Or is there something I should change in the code to make it work?
what command did you run to produce that error?
also, i think we're missing the directive definition for generate - https://github.com/dagger/dagger/blob/07e60b3c73471079c1b72bb1f25ed8cb54c0594b/dagql/server.go#L318-L325
any command, for instance dagger check -l
ah yep, can repro after bumping dang. looking into it!
What's the consequence of this list? Because dagger generate works without it (but I'll add it)
it works atm because the dang SDK injects the directive on its own: https://github.com/vito/dang/pull/26/changes#diff-c2b778a6637a0df5bfe2904801594046eac433bdfbd2bd01bb5943143d9806c5R141-R146 - if it came with the schema it wouldn't need to
i suppose we don't strictly need them in our schema because the SDK can control both sides, but it felt a little better that way with Dang being 'graphql native'
ah needs a regen - will approve after
fixed, rolled the bump into https://github.com/dagger/dagger/pull/11831 (includes your @generate PR, ty for that too)
FYI I got a decent LLM-usable "test with playground" skill, in case anyone's interested
- Basic understanding of how the playground works
- Understands not to mess with 'go build', docker etc
- includes a shell script that lets the LLM worry about the inner command, and takes care of the rest, e.g.
with-playground.sh dagger core version
- progress=plain so engine crashes are visible
- configurable timeout, so it can handle super-long builds, hangs and other realities of life
I'm hoping to get it reliable enough that the LLM can use it as a first layer of feedback when vibe-coding engine changes (ie. in the context of #1468070450524459029 )
Note, this probably dumps way too many logs to the LLM (a full plaintext trace to be exact). So it will consume context rapidly. But it's the only reliable way to get engine crash logs
have you tried --progress=dots? i think that should show the crash logs too
I also always instruct the agent to pipe the output to a temp file and search it afterwards because they tend to just grep for things by default which leads to a lot of reruns
Ah, I considered it but don't understand how dots works, so didn't want to introduce too many unknowns while developing the whole skill & script.
Is the idea that it streams nothing by default, but on error it will show more (like the stderr of crashed commands)?
it streams logs as they are printed from anywhere that would be normally revealed in the TUI, and dots for spans as they complete (test output style, like rspec)
so it's like a balanced 'i mostly care about logs, but still want a vague idea of progress so i can tell if it's hanging' - tuned for running in places like Jenkins or GitHub actions
But in TUI, normally dev engine logs will not be revealed when calling engine-dev playground
(I would have to manually search and expand)
by 'revealed' i meant more like, if the span is visible via expanding and moving around, its logs are visible.
so internal and encapsulated logs are still filtered out (module init etc)
ok so dots does seem like an easy win
quick fix, still occasionally hit these weird panics: https://github.com/dagger/dagger/pull/11837
nice, does this replace https://github.com/dagger/dagger/pull/11392?
It's just turning the panic into an error, not attempting to address the underlying cause, which I'm not sure of
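For anyone following along, the general Go idiom for turning a panic into an error is a deferred recover() that assigns to a named return value. This is a minimal sketch of that idiom only, not the code from the PR; the helper name safely is made up for illustration.

```go
package main

import "fmt"

// safely runs fn and converts any panic into an ordinary error.
// The name is hypothetical and not taken from the PR.
func safely(fn func()) (err error) {
	defer func() {
		// recover returns nil when fn did not panic.
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	err := safely(func() { panic("page 5778 already freed") })
	fmt.Println(err)
}
```

As noted in the message above, this only changes how the failure surfaces; whatever caused the panic is still there.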
tiny Cloud QoL fix for anyone around: https://github.com/dagger/dagger.io/pull/4787
Flagging a design issue... While prototyping #1468070450524459029 , I'm discovering that the logic to decide what to serve on a given client session is very fragmented, and a lot of it is client-side. We have:
- dagger call: serve core API + the main module, then scope every query to the main module
- dagger functions: same
- dagger core: serve core API, query it directly
- dagger shell: serve core API + the main module + its dependencies. Custom scoping
- dagger query looks slightly different, but I didn't get to that yet
My issue is that I'm removing the concept of main module. What to serve is determined by the workspace config, and (ideally) all centralized engine-side, it just seems simpler...
What I want:
- Client connects, passes a few client headers (load workspace modules? load extra modules? load core API?)
- Engine figures out the schema from there: find-up workspace config; load modules and/or core API based on client headers
- Client exposes what's served back to the user, without custom filtering/scoping
The problem is that I can't just "hide the core API" engine-side, because the CLI depends on parts of the core API, it's just not clearly documented or marked which parts... So that's what I'm stuck on.
Whadduya know, I just hit this again but with the real error:
panic: page 5778 already freed
goroutine 1577288 [running]:
go.etcd.io/bbolt/internal/freelist.(*shared).Free(0xc000436000, 0x391f, 0x7f5344a92000)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/internal/freelist/shared.go:80 +0x319
go.etcd.io/bbolt.(*node).spill(0xc0701ab5e0)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/node.go:319 +0x1ac
go.etcd.io/bbolt.(*Bucket).spill(0xc053a68e80)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/bucket.go:786 +0x365
go.etcd.io/bbolt.(*Bucket).spill(0xc053a68e40)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/bucket.go:753 +0x105
go.etcd.io/bbolt.(*Bucket).spill(0xc011d68d38)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/bucket.go:753 +0x105
go.etcd.io/bbolt.(*Tx).Commit(0xc011d68d20)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/tx.go:204 +0x2ca
go.etcd.io/bbolt.(*DB).Update(0x32cb8c0?, 0xc0701ab490)
/go/pkg/mod/go.etcd.io/bbolt@v1.4.3/db.go:915 +0xca
github.com/containerd/containerd/v2/core/metadata.(*DB).Update(0xc000792240, 0x32cb8c0?)
/go/pkg/mod/github.com/containerd/containerd/v2@v2.1.5/core/metadata/db.go:263 +0x96
github.com/containerd/containerd/v2/core/metadata.update({0x421e618?, 0xc04dccd2f0?}, {0x41fc930, 0xc000792240}, 0xc0701ab490)
/go/pkg/mod/github.com/containerd/containerd/v2@v2.1.5/core/metadata/bolt.go:58 +0xc6
github.com/containerd/containerd/v2/core/metadata.(*snapshotter).createSnapshot(0xc000176680, {0x421e618, 0xc04dccd2f0}, {0xc0329de441, 0x19}, {0x0, 0x0}, 0x0, {0x0, 0x0, ...})
...
github.com/dagger/dagger/core.(*Container).WithExec.func1(0xc0700773b0, {0x0, 0x0})
/app/core/container_exec.go:273 +0x209
github.com/dagger/dagger/internal/buildkit/frontend/gateway/container.PrepareMounts({0x421e650, 0xc056ae15e0}, 0xc051da6880, {0xc01ff59050?, _}, {_, _}, {_, _}, {0xc070079100, ...}, ...)
/app/internal/buildkit/frontend/gateway/container/container.go:178 +0x402
github.com/dagger/dagger/core.(*Container).WithExec(0x34e85a0?, {0x421e650, 0xc056ae15e0}, {{0x6b701c0, 0x0, 0x0}, 0x1, {0x0, 0x0}, {0x0, ...}, ...}, ...)
/app/core/container_exec.go:271 +0xac5
...
Boltdb corruption
From containerd no less!
It's funny, I think it's all along been a panic from that, but then we had a defer in which we were triggering a new panic, covering up the original panic 😵‍💫
I'm still hitting this on a new dev build, not sure of the context. Any workaround? Error: failed to resolve dep to source: failed to resolve dep to source: failed to load sdk for local module source: invalid SDK: "github.com/vito/dang/dagger-sdk@7e250fd9ab63b01ae80e5db37c2d2baf3303472a" [traceparent:d8abb37b93813881e6e6633dd1e19975-879c847763ae2a03]
seems like my dev build from main is hitting this in other projects too, but v0.19.11 is fine
huh - that's weird. is there anything else in the logs? i just hit that but it turned out i needed to restart my engine, because its network stack got wedged:
```
git(url: "https://github.com/vito/dang", keepGitDir: true): GitRepository! 20.0s ERROR
! Get "https://github.com/vito/dang/info/refs?service=git-upload-pack": dial tcp: lookup github.com on 10.87.0.1:53: read udp 10.87.0.1:38631->10.87.0.1:53: i/o timeout
```
tends to happen when i swap between networks
next step in moving to fully dagql native caching is good to review: https://github.com/dagger/dagger/pull/11838
I promise it's not nearly as bad as the diff makes it look. There's >1000 lines of new unit+integ tests and updated internal cache-expert docs. And the rest is mostly just boring refactor-y stuff. Just want it out of the way so I don't accumulate a 10k+ line PR over the next few days.
Also, codex did technically write every line, but I spent a long time with it cleaning up the slop, so it's not thousands of lines of garbage. It's the lines I would have typed anyways except done in 1 day rather than like 5 days, and no hair pulled out in the process 
That's the new world we live in! Excited for this.
it truly is a brave new world, quite suddenly too... but I love it
I accomplish more and my brain doesn't feel like it got run over by a truck at the end of the day
I have something that seems to work
Let me know if that solves your issue. It adds an OriginalModule that is the module in which the node entry (field, function, so by extension check or generate function) has been defined. If for instance the check is from a toolchain, Module will remain the root module while OriginalModule will be the toolchain's module. And from there you can access the module source or any other properties you need.
I didn't know if it's something that needed to be exposed, so I did it, but if it's not something we want I can easily remove it (it's on dedicated commits)
https://github.com/dagger/dagger/pull/11846
Quick update @still garnet.
Actually --progress=dots is a mixed bag..
- Less volume overall, so more context-efficient
- Actual stdout/stderr of the inner command is not clearly visible at the end of the output. Instead it's buried above an "error trace dump" - that seems to confuse the model
good to know - do you have a sample output i can eyeball?
(i believe based on your description, just curious how bad it is / what tweaks to make)
Yeah making one for you now
btw, dagger trace <trace-id> is a thing now on main (hidden command, streams from cloud API)
wat
Would be cool to have --progress=logs similar to the "logs view" in dagger cloud, people seem to like it.
I think in this case it's what we need?
yeah makes sense. like an even more trimmed down dots - logs was the main goal, with dots to solve the 'how do i know if it's stuck'. but printing the report at the end is definitely more opinionated
i wonder if the report should be opt-in, like --progress=dots,report (but that's a can of worms - multiple frontends)
it's easy enough to just add logs which is a slightly reconfigured frontendDots - main question is do you want the dots? or JUST prefixed logs, hanging be damned?
I think it's especially sensitive in the context of testing a dev dagger by building & running it inside a system dagger. Seeing a dagger trace on every error can confuse you - am I looking at system dagger or dev dagger? A regular system tool wouldn't typically dump a whole trace on a normal error, so you can see how an LLM could easily go down the wrong rabbit hole, assume it's a trace of their dev dagger, etc
In the case of LLMs, the dots are not needed. They don't have a concept of waiting or time, literally they have to write a shell script that times out for them. Still a painful rough edge in my own skill actually.
But yeah the benefits of dots are totally wasted on the llm right now (unless I'm doing it completely wrong)
they do kind of indicate how much work happened, which could give the LLM a better sense of scale of what happened 
I actually wondered how they could handle interactive ssh sessions for example. I think the answer for claude code is: they can't
Yeah maybe. But you could also have a relative timestamp in the log entries, or a configurable "filler" that adds a line for each bucket of 5 seconds or something -> "here we waited for 5 seconds / and there were 412 spans"
i'm imagining reading these logs without the final report and it seems like it would be hard to tell what failed. maybe the report just needs a big header clarifying what it is, instead of just spitting it out at the end?
like at the moment it would just see this at the end:
▼ withExec sh /tmp/inner.sh
Something went wrong again... this is stderr
Something went wrong... this is stdout
no status, no nothing
llm output
In my particular use case (testing dagger in the playground), I really just need the outputs of my commands & the dev engine logs. Everything else - including the fact that we're using dagger to test dagger - is an implementation detail
(don't know how well that generalizes to other use cases)
@tidal spire merged a fix for the boltdb crash thing, which I'm hoping is what was causing those unrecoverable boot-looping engines you hit every once in a while, so after you rebase on main let me know if you see that in a dev engine again (hopefully not)
Would it make sense to release the low-level substrate of the engine image as its own image, with its own versioning? base image, runc, containerd, cni plugins etc.
The interface is pretty stable, and the update cycle is decoupled. And, I bet it would make our builds faster... In theory re-building that substrate is very cacheable. But, our caching is not perfect yet... We could use the assist IMO.
Anyway, just a thought, as I stare at my 99th rebuild of the day
all of those are cached and instant every time for me
how long does a rebuild take you?
Locally yes, but on dagger cloud?
Maybe the imminent persistence upgrade will help (possibly by a lot)
Yeah that should be the fix imo
and then post-theseus: a global read-only cache for blessed builds, like cachix for nix
Yep all the code I've been writing today has that end goal in mind, it's so freakin close
That might end up the next biggest boost in perceived speed for actual workloads. Since there's a super high concentration of well-known modules in most pipelines... Everything would seem to magically go faster when it comes online.
But one thing at a time
I just opened a PR that replaces the custom generate function with dagger generate and adapts the check generated function accordingly: https://github.com/dagger/dagger/pull/11853
The only function left in .dagger is the check function, which will be migrated to a toolchain when possible (but maybe only when we have workspace?)
Two changes on dev side:
- dagger check generated replaces dagger check check-generated because I think that makes more sense, but it's open to discussion (and on a dedicated commit, so easy to remove if needed)
- dagger generate replaces dagger call generate
Thanks @civic yacht and @obsidian rover for the feedback, we can now use git worktrees to develop dagger https://github.com/dagger/dagger/pull/11857
Thank you so much for that 
this is really convenient and way better than my collection of clones 
Now we all get to work in a tree
If anyone has some cycles, I have 3 PRs to review:
- https://github.com/dagger/dagger/pull/11859 support for @Generate in sdk/java
- https://github.com/dagger/dagger/pull/11846 expose OriginalModule (the module in which the check/generate is defined, contrary to Module that is the one from which it's called, IIRC needed for smart checks)
- https://github.com/dagger/dagger/pull/11834 (CI is running but should be green) to expose generators/checks to env/current-env: I wonder if that will be replaced by the new workspace, and in that case maybe it's not needed to add right now?
can I get a review on https://github.com/dagger/dagger-for-github/pull/200
I forget why we ended up creating the separate dagger-checks action... But assuming it was because of an issue using the standard action, are you sure it's resolved?
(sorry I don't remember our discussion back then)
it was for simplicity. If all you need is checks, then the standalone dagger/checks action is cleaner
you can do
- name: Check
  uses: dagger/checks@v1.0.0
  with:
    filter: '**/lint'
or just
- name: Check
  uses: dagger/checks@v1.0.0
vs
- name: Hello
  uses: dagger/dagger-for-github@v8.3.0
  with:
    check: "**/lint"
Easy review: https://github.com/dagger/dagger/pull/11870
When DAGGER_MODULE is set, the module is silently loaded with no visible
indication in the TUI.
This PR fixes that by always displaying the name of the module being loaded.
another easy review https://github.com/dagger/dagger/pull/11872
A small PR (if you don't look at the uv.lock file) that bumps the Python version for the SDK, including dependencies: https://github.com/dagger/dagger/pull/11868
This sets the default version to 3.14 and adds it to the test suite matrix
@civic yacht @rocky plume sorry if this is the 1000th report of an already fixed thing w/ theseus, just doing due diligence: https://dagger.cloud/dagger/checks/github.com/dagger/dagger@c758e9f672959635cc3ba09aa5428ea3d1d7c5bd?check=testSplit%3AtestModules&listen=61977c94e45e6b36&listen=f19df9dd2b0266ec&listen=65e56b05769b73b7&listen=e4a74b9f63dc4541&listen=c8e07d9b3fb73a3f&listen=48f3cb2f0bf97951&listen=069b68213211189e&listen=8f59fc8a2fdc609a#e4a74b9f63dc4541:L44
Error: failed to generate code: failed to mount directory: failed to get other directory ref: no active sessions [traceparent:e547407de69f45363c9fdfa6911c8ecc-a74054889e20e134]
Yep that's a buildkit solver related bug, should be gone in the next week
To be clear, there's no fix on main (saw your other message in separate thread). I've been documenting the remaining work to remove the buildkit solver in my PRs, here's the current WIP one (the section "Here's what's left after this PR in terms of fully centralizing caching on dagql" lists what's left)
So by gone in the next week I mean "merged to main in the next week"
how is toolchains/test-split/engine-tests.gen.dang actually generated @still garnet ?
i have no clue
It's not. I wrote it by hand, the file is called .gen.dang to indicate it could and in the future should be generated
I don't think it is. I've always been confused by the name
ah, I was fooled by the whole diff having @still garnet 's name, but that was probably because of the new dang formatter
sorry for the confusion. I was planning to quickly add the actual generation, just never got around to it
no worries, i'll add a comment!
https://github.com/dagger/dagger/pull/11689
Still seeking a kind soul to review this one to add fields to the PHP SDK.
It's a pretty small PR, only touched 6 files in src, the rest is additional testing, changie and docs.
easy review, quick PR to fix the security scan issue we currently have on main: https://github.com/dagger/dagger/pull/11878
This is just a bump of tar in typescript SDK
Approved
Thanks, merged. We should have a green CI again on PRs
I also opened a new PR that completes the dependency bump and fixes the issue introduced by the bump: https://github.com/dagger/dagger/pull/11880
Also fixes a bug on env path introduced by the bump.
https://github.com/dagger/dagger/pull/11875 took a bit longer than expected -- I ran into some pretty gnarly things to debug, but finally got all the checks passing.
will take a look tonight!
Hey, here's my proposal for go codegen v2 if anyone wants to give some input: https://gist.github.com/TomChv/86d8b19468642153eb8f003e1023b876
@tepid nova Thanks for the idea, it really helped me!
Quick question: what about interfaces and as* functions? Is it solved the same way as env/bindings? Does it need more wrapping?
A quick one, to fix/improve lint checks for python SDK: https://github.com/dagger/dagger/pull/11886
return container instead of an error. As returning a container the engine will sync to return the error, that's exactly what was done. By making the container available, it's easier...
is --interactive an alternative to this? i think on its own returning a Container won't give you much because you can't chain from it if it failed anyway. just curious, i like the change either way, nicer to avoid the manual sync
That's a bit different than --interactive. Especially I wanted to enter the container even if it was not failing. For instance to explore the files.
ah ok makes sense
Caught a weird one. Failed check thinks it passed https://dagger.cloud/dagger/traces/78cd78f5d08757014cc6c6544e20b280
actually it's probably because of how i'm handling the exec in dang https://github.com/shykes/dagger/blob/ci-engine-tests/toolchains/test-split/engine-tests.gen.dang#L7-L23
I think on main this relies on graphql to handle any error. I thought this would still error but maybe not correctly
the old code relies on Go potentially returning an error, and I don't know what the dang equivalent would be @still garnet
i think it's a bug in the --scale-out path - when we invoke on another engine, we just query passed and don't bubble up an error: https://github.com/dagger/dagger/blob/42bb3b568c0a38671561a1f40e7a40cba49cde60/core/modtree.go#L200-L206
unlike the local branch above, which does, since it calls the underlying function
you can check by doing a local run, i expect that to fail
to answer the Dang q: i did add raise/try/catch recently, just don't think it's the root cause here: https://github.com/vito/dang/pull/28
ah yeah that was it, thanks! wasn't aware of that bug
my fault, forgot to report/fix it when i stumbled on it - just sitting in a pile of TODOs
I have a test-only PR that's getting stale: https://github.com/dagger/dagger/pull/11800
https://github.com/dagger/dagger/actions/runs/22108534795/job/63905196975 SDK python provisioning tests on main started failing after https://github.com/dagger/dagger/pull/11868 cc @leaden glade
It's a guess, but I hope this can fix it: https://github.com/dagger/dagger/pull/11898
This is a guess, but the error https://github.com/dagger/dagger/actions/runs/22108534795/job/63905196975 is related to a change with 3.14 and it started to fail after bumping python to 3.14 in #1...
@still garnet any thoughts on https://github.com/vito/dang/pull/35 ? last week was trying to debug failing workspace cache tests on my PR, but my PR also had separate problems with effect ID propagation that impacted telemetry, so then I was getting extremely confused on whether the workspace caching test was broken because of telemetry (since it asserts on whether a println happened) vs. the caching
- spoiler, it was a little of both...
if we had a "generate random string" func in dang stdlib, can just use that in the future for these sorts of tests and avoid entanglement
saw that, wanted to take a bit to bikeshed and think about where it should go
- like if there should be a Random.text, Random.int, etc. or something instead
yeah on fresh eyes I also wondered if uuidv4 would be a bit more "universal"... less obvious that dang has origins with go developers π
happy to bikeshed that on the pr, just making sure it wasn't lost
(and no rush, just a nice to have thing sometime)
ah yea uuid generation is probably warranted eventually too
Is someone taking a look at CI? About to start understanding all the failures on the latest checks (oh, the failingChecks seem to only appear on GitHub's UI and not on Dagger Cloud, is that expected?)
Thanks Erik, missed it. Dagger Cloud as the source of truth
let the bikeshedding commence
shed looks beautiful, shipit 🛳️
fix: python provision tests after 3.14 p...
A small fix that allows to find check and generate functions exposed by an object from a toolchain: https://github.com/dagger/dagger/pull/11902
One of the objectives, besides the bug fix, is to allow exploding the test matrix (so users can pick the one to run, but dagger check will run all of them in parallel, with less code to write since it relies more on the engine)
$ dagger check -l
...
python-sdk:python-310:slow Run slow python tests
python-sdk:python-310:unit Run unit tests.
python-sdk:python-311:slow Run slow python tests
python-sdk:python-311:unit Run unit tests.
python-sdk:python-312:slow Run slow python tests
python-sdk:python-312:unit Run unit tests.
python-sdk:python-313:slow Run slow python tests
python-sdk:python-313:unit Run unit tests.
python-sdk:python-314:slow Run slow python tests
python-sdk:python-314:unit Run unit tests.
...
(the PR with the python test changes, and simplification of the code, will come after this is included in a release, so it's available in the dagger binary without having to build an engine)
I just noticed that dagger init --sdk=ADDRESS_OF_SDK_MODULE does not pin the sdk by default? Has it always been that way?
yes afaik. dagger update also doesn't touch it
I think this is new? filesync is shown under "parsing command-line arguments" (0.9.11)
I haven't noticed in the past but looking at the code it makes sense that's what happens because we call .sync on the address https://github.com/sipsma/dagger/blob/0a6d5585387cf7dd8602fd3330fae09c268cc90d/cmd/dagger/flags.go#L308-L308
doesn't look to have changed recently, at least obviously
I'll try to revive my "more clear spans" PR. It was just UI polish but turned out to be surprisingly hard to merge
Fun fact: while porting our own modules to the new workspace API, I discovered that the "markdownlint" check in the docs-dev module, has been successfully... printing the usage message for markdownlint.
https://dagger.cloud/dagger/traces/78287c07f60a3c666b1a6414c59202d6?span=2fd3491ba9b3ed5a
Would be funny to measure how much we've spent on compute to successfully recompute this important information on each PR
ruh roh...
$ markdownlint . 2>&1 | wc -l
8602
With config and ignore files properly applied:
$ markdownlint . 2>&1 | wc -l
3820
just lost a couple hours to having the srv.Root() and dst swapped in srv.Select(ctx, srv.Root(), &dst) - thought i was going crazy based on the backtrace (Class[T].fieldsL was nil)