#maintainers
Yeah interacting w/ the service via a withExec that has a service dependency on the service definitely works. I was thinking about the case of "function code directly talks to the service sockets", which I didn't see covered anywhere right now
that's the one I'm interested in...
Will try up and see if it works
this is what I ran into with that https://github.com/dagger/dagger/blob/03d204ebcdc908c07edd6aa92aab046fc0f5c6d5/core/schema/host.go#L389
oh nice, so it does work but only for containers, nice
(selfish reaction for my use case)
0.5 * π
I updated the PR with a more thorough description of the problem and the fix. It's technically a breaking change (in terms of caching behavior, not compilation or correctness) and bubbles up to UX/DX a bit, so RFC: https://github.com/dagger/dagger/pull/10311
seems like here: https://github.com/dagger/dagger/blob/main/core/mcp.go#L515-L522
obviously easy to lift that limit but there are other limits in play here: context windows, tokens per minute, and RAM... @still garnet added the limit here, it should be configurable and maybe higher by default, but i suspect "real world" performance might degrade pretty violently without a truncation limit, otherwise it'd not have been added
short term, idk what's best here... 10x it and run evals? if 10xing it overloads openAI and anthropic tpm limits, what then?
maybe no truncation but optional arg for line ranges?
This is what I've implemented here https://github.com/kpenfound/dag/blob/main/workspace/src/workspace/main.py#L37 based on https://docs.anthropic.com/en/docs/build-with-claude/tool-use/text-editor-tool#view, but if the underlying file.contents is truncated then this will not work in my module
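For illustration, the "optional arg for line ranges" idea could look roughly like this in Go. This is only a sketch, not the code in the linked module or in Dagger: viewLines and its clamping rules are assumptions, loosely following the 1-indexed range with -1 meaning "to end of file" from the linked text editor tool docs. It also shows why a truncated file.contents defeats it: the slice can only cover lines that survived truncation.

```go
package main

import (
	"fmt"
	"strings"
)

// viewLines returns lines start..end (1-indexed, inclusive) of contents.
// Hypothetical helper: a start below 1 is clamped to 1, and an end of -1
// or past the last line means "through the end of the file".
func viewLines(contents string, start, end int) string {
	lines := strings.Split(contents, "\n")
	if start < 1 {
		start = 1
	}
	if end == -1 || end > len(lines) {
		end = len(lines)
	}
	if start > end {
		return "" // empty or out-of-range selection
	}
	return strings.Join(lines[start-1:end], "\n")
}

func main() {
	// Prints the second and third lines of a four-line input.
	fmt.Println(viewLines("one\ntwo\nthree\nfour", 2, 3))
}
```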
the weird part is this isn't really a read_file tool, it's the tool return logic for anything returning a string
https://github.com/composefs/composefs (we should try this)
My thoughts on host directory caching https://github.com/dagger/dagger/issues/9934#issuecomment-2847963204
If any core dev is around: I'm trying to understand how the workdir (dag.Host(".")) is set when my nested dagger session is opened.
Can I control which workdir is used, and if so how?
It seems that calling os.Chdir() before calling dagger.Connect() doesn't work
My issue is that I'm opening a session from within a dagger runtime container, and I don't control the workdir: the engine hardcodes it to /scratch. By the time I can call os.Chdir() from within the container, it appears it's too late to change the workdir in the dagger session
and the workdir of the inner dagger session seems to affect how my generated client works, specifically where it looks for local dependencies...
Does buildkit have a primitive for running several processes inside a container? I would love to expose that in the core API
without having to inject shell scripts
It's the workdir that the exec starts with. The session attachables are initialized when the container starts, so a later chdir doesn't impact anything
Not really. There's support for execing additional processes but it only applies to "gateway containers" (which aren't exec ops) and had weird bugs with waiting last I tried
what is the casing for Module.Name()? camel case? or whatever is in the dagger.json?
Yeah should just be the same string in dagger.json, no case conversions
nice thanks! is there an easy builtin to get the language-appropriate casing of that name or should I roll my own?
guessing maybe it's implemented in each sdk
yeah double-checked but nothing built-in at the moment, I guess the SDKs are doing that
makes sense, thanks!
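Since nothing is built in, a "roll my own" version might look like the sketch below. It is not taken from any Dagger SDK: splitWords, toPascal, toCamel and toSnake are hypothetical helpers, and the rule of splitting only on '-', '_' and whitespace is an assumption (existing camelCase boundaries are left alone).

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitWords breaks a module name on '-', '_' and whitespace.
// Assumption: the name in dagger.json uses one of these separators.
func splitWords(name string) []string {
	return strings.FieldsFunc(name, func(r rune) bool {
		return r == '-' || r == '_' || unicode.IsSpace(r)
	})
}

// toPascal upper-cases the first rune of each word and joins them.
func toPascal(name string) string {
	var b strings.Builder
	for _, w := range splitWords(name) {
		r := []rune(w) // never empty: FieldsFunc drops empty fields
		r[0] = unicode.ToUpper(r[0])
		b.WriteString(string(r))
	}
	return b.String()
}

// toCamel is toPascal with a lower-cased first rune.
func toCamel(name string) string {
	r := []rune(toPascal(name))
	if len(r) == 0 {
		return ""
	}
	r[0] = unicode.ToLower(r[0])
	return string(r)
}

// toSnake lower-cases the words and joins them with underscores.
func toSnake(name string) string {
	return strings.ToLower(strings.Join(splitWords(name), "_"))
}

func main() {
	name := "my-module" // the string as written in dagger.json
	fmt.Println(toPascal(name), toCamel(name), toSnake(name))
}
```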
@civic yacht I'm facing a module loading conundrum...
- I have a custom dagger runtime, bundled in an SDK module
- My runtime container executes a custom go binary
- That go binary loads dagger modules using the experimental dagger client feature. So that go program (cmd/my-runtime) is its own dagger module, with the dependencies to be loaded by the generated client
- Problem: the generated client fails to load its dependencies. I get a graphql error trying to query it
- Question: how do I configure which module to load when opening a nested dagger session from within a dagger runtime container?
(importantly, loading works when the dependency is a remote git source, but fails on local source)
I think we may have a tracing blind spot: when loading a module, if the SDK's ModuleRuntime() function calls more Dagger functions, that doesn't seem to show up in the trace
Nevermind, I called modSource.AsModule().Source().SourceRootSubPath() instead of just modSource.SourceRootSubPath() (dumb IDE auto-complete) and apparently that bricks the runtime
@civic yacht I didn't see sdk/elixir/v0.18.6 getting published on the GitHub releases page (with search query https://github.com/dagger/dagger/releases?q=elixir&expanded=true).
Just worried about what might have kept it from getting published. (maybe it's my fault)
So I've been experimenting with custom runtimes, and I'm on a personal quest to democratize their creation and their use.
In that vein I'm testing the waters on new ways to use runtimes. Ie. how difficult would it be to:
- Allow loading a module without a dagger.json, eg. dagger -m github.com/anything/atall
- Allow specifying a runtime at load time, eg. dagger -m github.com/anything/atall --runtime=github.com/mycustom/runtime
- Build on 1 and 2 to allow specifying dependencies in your dagger.json that a) don't have a dagger.json in their source, and b) are loaded by a runtime you specify
This would IMO unlock the power of custom runtimes (assuming we can also make them super easy to develop, that's another story..), and basically means you can daggerize anything including repos you don't control
Thoughts?
I noticed with 0.18.6 I'm prompted by 1password when I open shell. Is that to pre-authenticate prompt mode? Feels kind of disruptive
hallo, i've got a couple of spec-ed out issues around improvements to module versioning/updating! if anyone has any spare cycles would appreciate a hand on these, i'm really caught up in theseus work:
- improvements to dagger update https://github.com/dagger/dagger/issues/10362 to have it do real semver bumping!
- adding semver module versions to all modules https://github.com/dagger/dagger/issues/10099 - should help contextualize info for non-tagged versions in daggerverse, and give some hefty performance improvements for local dev.
interesting codegen failure seemingly related to interfaces #1369924460903141406 message
@civic yacht @rancid turret I know I can dynamically load and introspect a module using the dagger API - which is amazing btw.
But is there a way to use the module after that? I tried calling ... | as-module | serve in the shell but nothing seemed to happen. Maybe I was over-ambitious?
Ah, well calling that will install the module schema into the current session, but yeah you do at that point need to query it, which is only going to be available via raw gql queries since there's no codegen for it...
Don't think the shell supports raw gql queries
If you did all this in a nested dagger exec, then you'd be able to serve the module and run raw gql queries separately w/ dagger query, though obviously a decent set of extra hoops to jump through
But the shell doesn't rely on codegen for any of it, it discovers everything dynamically via introspection anyway. So I was wondering if it would just magically work
right, good point
Anyway, it was just for experiments, my actual implementation would probably involve patching asModule to do what I want
if it doesn't magically work yet, maybe .refresh would do it? or if not I bet that could be made to work (reaching the end of my knowledge of details of shell implementation)
I tried .refresh but it didn't seem to work either
(I think it reloads the module it knows about in the CLI config)
noobquestion: i have a dagql.Object and i want to turn it into a dagql.Selector, how can i do that ?
actually nvm i think i want to do dag.Select(ctx, obj). I was doing dag.Select(ctx, Root(), turnObjIntoSelector(obj))
yep, that sounds right
For reference: a trace of an actual coding agent session. I'm using the dagger CLI agent to scope out an experimental engine feature I want to prototype
https://v3.dagger.cloud/dagger/traces/54b78f840a122702127dd42fc6bbaa6e#07df568ef0f58080
After serving a module like you did the shell then also inspects the API via a GraphQL query (see source) and saves the result in memory. Then what you interact with comes from this introspection. It's basically a dag.CurrentTypeDefs() call but optimized so it takes only 1 query to fetch all the data from that.
.refresh does this but it's only on the currently loaded module (from .cd or initial context). Serve + inspect also happens when you <module ref> | ... or .help <module ref>. Doesn't that fit with your use case for testing?
Is it possible to have cloud enabled and also have the process dagger is running send its OTEL to a 3rd party service?
@civic yacht so just hit this weird error during a test run (changes to the http stuff, doesn't seem super related?)
https://v3.dagger.cloud/dagger/traces/01fcc1fa5f6c0d02412b509a54a1eef3?span=181587e3c437720f
but i'm seeing a "cached" failed exec - just as @wild zephyr reported in https://github.com/dagger/dagger/issues/10320
and then, because the exec failure gets cached, we're not actually printing stderr
i think it's likely a flake, but just thought it was kinda odd that it didn't seem to ever show up until we noticed it
yeah, the OTEL_* vars are additive
i do that frequently if i need to use e.g. honeycomb to inspect the lower level telemetry
I think that flake started after we switched the way the tests are split up so it's probably just an unlucky new ordering of test execution
so I just added an optional arg to host.directory and noticed something a little suspicious - the rust sdk examples break. i fixed them by adding argName: None but the whole experience has me a little spooked, is it a known thing that additive-optional args are breaking changes in the rust SDK?
If I wanted to implement a core function that can take an argument of any object type. What's the least hacky way to achieve that?
My default answer is "not possible". So I have low expectations. But just in case
i'm not sure if we have it already, but we could define a generic ID type, and the function could decode it to determine the type at runtime
That would do it! Cool
FYI this is in the context of "tool scoping" in Env. The shell has a layer of scope which is "current object" (almost like a current selection, oh god why am I reopening this topic) which influences the command lookup. If we were to bring back foo | bar | .load or .cd $(foo | bar) that's what would allow the object's functions to be top-level shell commands.
I was wondering how we would add this to Env if we wanted to.
It breaks my brain when I think about it too long, because it means Env becomes kind of like a query builder API, on top of the Dagger API which itself is queried with a query builder... still not convinced it's actually useful
@still garnet would that mean we could have Env.withObjectBinding and remove all the other typed object binding setters? Since we have the .asFoo getters on the other side
Technically we lose type checking at set-time, but it still happens at get-time, so maybe it's fine? As long as the engine can infer the type from the ID, it shouldn't affect tool mapping?
I think Env is basically a client context in the engine. It can allow us to make the CLI incredibly thin and dumb (which is a good thing)
I'm editing my post to incorporate model-context and other bits and pieces
yeah - but i think we would need to handle it (generic objects) at the SDK layer somehow, e.g. for ID args that should probably translate to an interface implemented by all objects in Go (prob the XXX_GraphQLID thingy we have already)
Maybe that's how we resolve the client/engine matrix from hell: eventually the engine eats everything, except session attachables
and all that's left is a capability handshake
only skimmed so far (weekend time) but i'm interested in the part where shell interpretation moves into the engine
I'm refining that part as we speak
maybe then we can revisit having the LLM generate shell
I'm also trying out a merger of LLM tools and shell tools. They sure have a lot in common...
I just want to ssh into my dagger engine
after virtualizing all host access in env, the final boss: virtualizing session attachables.
@still garnet feel free to ignore me. Random idea, I'm wondering how monolithic we would want an engine-side interactive shell.
- Is it literally the whole TUI bundled together - prompt mode, agent mode, the keymaps to switch between them? Like ssh-ing into dagger and getting a raw terminal stream?
- Or, is it one different terminal session for shell mode, agent mode, and in the future any number of parallel sessions, left for the client to present to the user? So, more like a tmux experience... feels like an elegant generalization and opportunity to make the Dagger CLI more powerful in the future no?
oooh shell & chat history...
https://github.com/dagger/dagger/discussions/10150
Is it me, or is the "connect" phase slower in 0.18.6, when starting dagger shell ?
What's the best way for building dagger? I see there's a CONTRIBUTING doc that mentions using the dagger dev module (which is json data), but I feel like I'm missing something since I'm pretty new to using dagger. I wonder if the doc should be expanded to mention how to use it?
I also see there's a ./hack/build dagger script, which worked out of the box for me, should I be using this as part of a typical dev workflow? I don't see it mentioned in the CONTRIBUTING guide.
huh, yeah, looks like somehow that's fallen out of date
CONTRIBUTING should probably be updated to match
Yeah it definitely shouldn't just link to dagger.json, feel like it was meant to link to https://github.com/dagger/dagger/blob/main/.dagger/README.md, which has some actual useful info
ah ha, yeah that looks like the guide I need!
@rocky plume keeping in mind that I'm an occasional low-intensity core contributor, here are a few dagger commands I find super useful. For the more hardcore low-latency local devloop, you're already talking to the right people:
To get a complete build (CLI + engine) of any PR, and play with it interactively:
dagger -m github.com/dagger/dagger@pull/$PR/head -c 'dev | terminal'
To do the same thing locally
dagger -c 'dev | terminal'
To run the dev engine only, then connect to it from an external CLI
dagger -c 'engine | service my-feature-name | up'
Then in another terminal:
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ./any/build/of/dagger
that's really helpful, thanks! the -m flag seems really cool.
i personally use ./hack/dev and ./hack/with-dev like a billion times a day lol, they essentially bootstrap off ./hack/build, then load and run a dev engine for you and point your CLI to it, like you can do ./hack/dev dagger ${thing i'm currently developing} or to skip the rebuild and use your most-recently-built dev engine ./hack/with-dev dagger ${thing i'm currently developing}. like solomon said the "hack" scripts are the quicker, more host-machine-centric advanced methods. they have some pitfalls, like build errors can be easier to miss than with the more dagger-native "disposable" dev builds that solomon and some others use
I feel like it's important to know that, technically, you can do everything just with the dagger CLI (our builds, tests etc are fully self-hosted in Dagger). The only reason we still use custom ./hack scripts is speed... specifically the overhead of re-running everything in dagger every time. Hopefully at some point we will get the overhead down to a point where ./hack/ is no longer necessary at all
Ah one more @rocky plume : at any time you can run dagger and you'll get an interactive dagger shell with the current module's functions available as commands. .help to list them
ah yes, I've used the .help several times already
I thought it was neat to see the hack/build script is actually a dagger script, whereas the other ones seem to just do a wee bit of bash before calling build
I must admit while reading the dagger quickstart docs it wasn't immediately clear I could do that, but it was once I saw the hack/build script it clicked. I then went back to the dagger shell docs and noticed it showed "inline execution" by default, but I could then click over to the other options, such as "script"
@polar fulcrum @still garnet
Hey, do I need to have llm credentials in order to run dagger mcp?
You shouldn't, like you can even set them to an arbitrary string and it'll still work.
@final star as in, I should set them to a random string as a workaround?
Yes
yes i have a wip branch somewhere to get rid of that but there's a lot of entangled stuff to detangle, sorry it's been lurking for so long
@still garnet @final star btw, playing around the github MCP server ... the set of tools is quite gigantic as it covers the whole API, so they have an (optional, off by default) selectTools-like: https://github.com/github/github-mcp-server?tab=readme-ov-file#dynamic-tool-discovery
ha neat. yeah, sounds basically the same challenges as us
soon they'll add a nested agent
to sneak copilot in
(that's what I would do anyway)
@still garnet do you think I can use #10366 or should I wait a bit?
go for it - it's just pending approval
random engine devex thing that i've been ignoring: i feel like every time i rebase a pr with an api change i get conflicts in one file more than any other: docs/static/reference/php/renderer.index. could we avoid this somehow? it'd save everybody quite a lot of time
@still garnet sweet!
im hoping to take it for a spin today too, i have a shortlist of demo components i wanna string together and this is the last one: gonna try to wire mcp up to zed
wiring up a dagger module to a framework with mcp, taking it for a spin now
mcparty
i gave it a go yesterday and it worked - it happily chewed through my tool call limit and tried to upsell me on Max Mode 
@still garnet btw this is pre-your-PR: https://v3.dagger.cloud/dagger/traces/2899a97a98e1fa6ed7562874161b0f05
looks about right
just stuck in a loop
maybe adding that to the prompt would help
speaking of limits, i know you were considering plumbing withexec combinedoutput out through tools... im scared how quickly this is gonna burn context windows
see https://github.com/dagger/dagger/pull/10361 - it caps at the last 10 lines of output
ah sick love it
i'll dust that PR off once static tools is merged
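A rough idea of what "cap at the last 10 lines of output" means in code. This is a sketch only and is not taken from the PR: tailLines is a hypothetical helper, and dropping trailing newlines before counting is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// tailLines keeps at most the last n lines of output.
// Trailing newlines are dropped first so they do not count as a line.
func tailLines(output string, n int) string {
	if n <= 0 {
		return ""
	}
	trimmed := strings.TrimRight(output, "\n")
	lines := strings.Split(trimmed, "\n")
	if len(lines) <= n {
		return trimmed // already short enough
	}
	return strings.Join(lines[len(lines)-n:], "\n")
}

func main() {
	// Build 12 numbered lines, then keep only the last 10.
	var b strings.Builder
	for i := 1; i <= 12; i++ {
		fmt.Fprintf(&b, "line %d\n", i)
	}
	fmt.Println(tailLines(b.String(), 10))
}
```

The trade-off is the one raised above: the tail is usually where the error is, but anything earlier never reaches the model's context window.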
First (very unrelated) roadblock: I was using a fresh MCP module that needed a build. Framework has a connection timeout and aborted before the server was ready
Started the MCP server manually on the side to warm up cache and all is good
@charred lotus is fixing that atm -- he's been exploring for claude code yesterday specifically. Most clients do not seem to follow the ping instructions but there are solutions
sweet
@still garnet oh btw -- what happens if the module hits an exception? does that travel back through MCP or does it get lost in translation?
@charred lotus presumably you wanna let the server come up before module load is complete? i'd still love to use mcp in dagger/dagger and module load times are definitely gonna be a blocker there XD
errors are propagated back to the model, and any rich graphql extension attributes are turned into <foo>...</foo> (e.g. ExecError stdout/stderr/exitCode)
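To make the `<foo>...</foo>` shape concrete, here is a sketch of how extension attributes could be flattened into tagged text for the model. It is an illustration, not the engine's code: renderError, the sorted key order, and the plain %v formatting are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderError appends each extension attribute as <key>value</key>
// after the error message. Keys are sorted so the output is stable.
func renderError(msg string, ext map[string]any) string {
	keys := make([]string, 0, len(ext))
	for k := range ext {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(msg)
	for _, k := range keys {
		fmt.Fprintf(&b, "\n<%s>%v</%s>", k, ext[k], k)
	}
	return b.String()
}

func main() {
	// Attribute names mirror the ExecError example from the message above.
	fmt.Println(renderError("exec failed", map[string]any{
		"exitCode": 1,
		"stderr":   "boom",
		"stdout":   "",
	}))
}
```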
i looked into this yesterday and it's very hairy to fix because we start the MCP server engine-side, which currently has a dependency on the env being set up, which has a dependency on calling on the module init, which of course depends on loading the module
if you happen to be testing with Claude Code, there's a bug there
weird that it's getting null?
uhh it's doing pretty weird things there altogether
(quick and dirty module, very likely I did something wrong)
is it not respecting the MCP server instructions maybe?
lol yeah that's why i asked, it's way harder than it looks on the tin
those need to get injected into the system prompt, ideally (Goose does that)
@still garnet hmmm ... I'm using a few MCP servers already, adding dagger to the mix
ideally the mcp server startup is fully decoupled from module load, like if the module is broken it'd be nice to only error once a relevant tool is called... but like even in saying this i'm not fully sure what a "relevant tool" would be given the good ones come from the loaded module XD
@still garnet the module is a tiny, one function thing
I do see the function with dagger functions
the problem is it's passing a bogus type parameter to list_objects (should be Sandbox). not sure why it's doing that - what's your prompt?
@still garnet I don't have any tool specific prompt in there, maybe the framework has one but I didn't have to specify anything for the other tools
@still garnet oh wait, it's different now
what openai model is this? seems kinda dumb
also repeating this question - it's pretty important for MCP clients to respect the server instructions
but even so, i've seen better behavior from clients that don't in my testing. but that was with Anthropic. hmm
mcp_server = MCPServerStdio(
client_session_timeout_seconds=300,
this did the trick for me
For the record, claude code has a timeout only for the init, if there is a timeout for execing a tool, it's above 1h. What i was hoping is for claude to do healthcheck pings.
yeah, this works, but it can still be weird in goose where you're starting a new session each time... we're basically reliant on caching to get unsurprising behavior (not so different from LSPs, so maybe to be expected)
So just FYI, there's a Ping request in MCP, that apparently is not often used but if you're doing custom stuff with MCP feel free to use it
(dagger mcp responds to pings)
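For anyone doing custom MCP work: ping is an ordinary JSON-RPC 2.0 request with method "ping", and a healthy server answers with an empty result. A minimal sketch of building the request and checking the reply follows; newPing and isPong are hypothetical names, integer ids are an assumption (ids may also be strings), and the transport (stdio framing etc.) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pingRequest is the JSON-RPC 2.0 envelope for an MCP ping.
type pingRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
}

// newPing serializes a ping request with the given id.
func newPing(id int) ([]byte, error) {
	return json.Marshal(pingRequest{JSONRPC: "2.0", ID: id, Method: "ping"})
}

// isPong reports whether raw is a successful response to ping id:
// same id, a result present, and no error member.
func isPong(raw []byte, id int) bool {
	var resp struct {
		JSONRPC string          `json:"jsonrpc"`
		ID      int             `json:"id"`
		Result  json.RawMessage `json:"result"`
		Error   json.RawMessage `json:"error"`
	}
	if err := json.Unmarshal(raw, &resp); err != nil {
		return false
	}
	return resp.JSONRPC == "2.0" && resp.ID == id &&
		len(resp.Result) > 0 && len(resp.Error) == 0
}

func main() {
	req, err := newPing(1)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))
	fmt.Println(isPong([]byte(`{"jsonrpc":"2.0","id":1,"result":{}}`), 1))
}
```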
@final star any reason endpointMtx is a *sync.Mutex and not just sync.Mutex?
interesting - didn't expect this result. without any system prompt at all, Claude 3.5 Sonnet passes all evals, while GPT 4.1 and Gemini 2.0 Flash struggle. it even figured out how to chain methods (pictured)
(this is with https://discord.com/channels/707636530424053791/1370465673872146482)
no conscious reason, no, I prolly left the * in by reflex...
I'm refactoring to do lazy loading of modules
i do recall go mutexes can't be copied, so given that we have a lot of clone logic with the chaining, the pointer may be safer?
Correct me if i'm wrong, but i'd intuitively say that when you clone you don't want to inherit the mutex, you want a new one, otherwise the old and new objects would interfere with each other
ah yeah, so less safe in that sense, more safe in the sense that you're protected from accidental copies (unsafe both ways lol)
I understand now why there are pointers everywhere. It's indeed the chaining. Because when in Clone() you do cp := foo if foo contains non-pointer locks, go vet will complain for the reasons you mentioned above. Alternative is to create a new foo value instead of copying by value. Anyway, tangential to what i'm doing but was curious. Mystery solved
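The "create a new value instead of copying by value" alternative could look like this. It is a generic sketch, not the engine's code: Endpoints, Set, Get and Clone are made-up names. The point is that Clone never does `cp := *e`, so go vet's copylocks check stays quiet and the clone gets its own zero-value mutex rather than sharing or copying the original's.

```go
package main

import (
	"fmt"
	"sync"
)

// Endpoints holds a non-pointer mutex, so values must not be copied.
type Endpoints struct {
	mu    sync.Mutex
	addrs map[string]string
}

// Set stores addr under name, allocating the map on first use.
func (e *Endpoints) Set(name, addr string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.addrs == nil {
		e.addrs = map[string]string{}
	}
	e.addrs[name] = addr
}

// Get returns the address stored under name, or "" if absent.
func (e *Endpoints) Get(name string) string {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.addrs[name]
}

// Clone builds a fresh Endpoints and copies the data field by field.
// The new value starts with its own unlocked mutex.
func (e *Endpoints) Clone() *Endpoints {
	e.mu.Lock()
	defer e.mu.Unlock()
	cp := &Endpoints{addrs: make(map[string]string, len(e.addrs))}
	for k, v := range e.addrs {
		cp.addrs[k] = v
	}
	return cp
}

func main() {
	a := &Endpoints{}
	a.Set("svc", "tcp://a")
	b := a.Clone()
	b.Set("svc", "tcp://b") // does not affect a
	fmt.Println(a.Get("svc"), b.Get("svc"))
}
```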
yeah, i figured my reflex was prolly reinforced by some tool lol
Quick question regarding the export consistency API-wise. I'm playing with some exports and i've been surprised about the --wipe field on Directory.Export, which doesn't seem to be exposed on the file export -- does that mean that we expect the file export to be wipe: true by default? (if that's not expected i'll file an issue)
I would think so? Wouldn't a File.export --wipe false be like an append or something?
yeah you're right -- i'm now just surprised about the --allow-parent-dir which only exists on files though ; we have safeguards for files (to just scope to the module context) and not for dirs (atm)
i remember someone recently mentioned something about "env" support, and how we were thinking a different format from the standard .env file - is there an issue here, or is https://github.com/dagger/dagger/issues/9584 the best place for that?
cc @tepid nova, I think it might have been you?
kicking the tires here again, since @humble hearth is starting work on it https://github.com/dagger/dagger/pull/10397
I tried several keywords in search without much success. Has anyone already launched the engine with a debugger attached? That could help a lot to fix some nil pointers, for instance.
Well, maybe that's my answer #1359449990036066355 message
I'm not sure this is the right implementation, I'm wondering about the security risks of modules grabbing random env vars from the host (talked with Jeremy yesterday). Maybe a native .env support would be better. I'm going to reply to #9584 with some thoughts.
This PR is really focused on this: https://github.com/dagger/dagger/issues/8428
@spark cedar there are at least 2 possible designs:
- Initially I proposed this: https://github.com/dagger/dagger/issues/9584
- Now I'm proposing we tie it to the overall Env design instead of doing a one-off: https://github.com/dagger/dagger/discussions/10370
@tepid nova where is a reference/info/code to the custom dagger runtime you were talking about in the call?
This should be a good reference I suppose https://github.com/quartz-technology/daggerverse/tree/main/docker_sdk -- but still rough ; but on the engine side it's https://github.com/dagger/dagger/issues/9582 + https://github.com/dagger/dagger/issues/7707
How do people feel about including shell commands in the integration tests? e.g.
func (DirectorySuite) TestSymlink(ctx context.Context, t *testctx.T) {
	c := connect(ctx, t)
	t.Run("symlink in same directory", func(ctx context.Context, t *testctx.T) {
		dir := c.Directory().
			WithNewFile("some-file", "some-content").
			WithNewFile("other-file", "other-content").
			WithSymlink("some-file", "symlink-to-some-file")
		ctr := c.Container().From(alpineImage).WithDirectory("/test-dir", dir)
		// test the symlink is an actual symlink
		_, err := ctr.WithExec([]string{"test", "-L", "/test-dir/symlink-to-some-file"}).Stdout(ctx)
		require.NoError(t, err)
		// test accessing the symlink correctly accesses some-file
		_, err = ctr.WithExec([]string{"sh", "-c", `test "$(cat /test-dir/symlink-to-some-file)" = "some-content"`}).Stdout(ctx)
		require.NoError(t, err)
		// test overwriting some-file with other-file doesn't break the symlink
		_, err = ctr.WithExec([]string{"sh", "-c", `cd /test-dir && mv other-file some-file && test "$(cat symlink-to-some-file)" = "other-content"`}).Stdout(ctx)
		require.NoError(t, err)
	})
}
maybe it's fine? I see it being done in several places:
$ grep -RI -- '"sh", "-c"' core/integration/ | wc -l
163
personally i'd rather not use test and sh -c in congress. the test -L makes sense, though. do these cases not test easily with ctr.File.Contents, though? like i think that syntax is preferable, but there are also cases where it makes sense to verify that the in-container state (via WithExec) matches an assertion
I think it's useful to have one execution to cover that case, but I agree the others might be better suited using dagger primitives. I'll try to rework it.
Personally my approach is:
- Default to native Dagger exec
- When native Dagger API can't do what I need, or is just too cumbersome, switch to the smallest possible shell script, with a FIXME for later
- If I notice a pattern of always needing a shell script for the same reason, I start looking for possible improvements to Dagger
that makes sense, I rewrote it to simply do one Exec (so we can use test -L to verify it's actually a symlink by the time the container starts running), and then am leaning on File.Contents for most of the other testing, e.g.
t.Run("symlink in same directory", func(ctx context.Context, t *testctx.T) {
	dir := c.Directory().
		WithNewFile("some-file", "some-content").
		WithSymlink("some-file", "symlink-to-some-file")
	ctr := c.Container().From(alpineImage).WithDirectory("/test-dir", dir)
	// test the symlink is an actual symlink
	_, err := ctr.WithExec([]string{"test", "-L", "/test-dir/symlink-to-some-file"}).Stdout(ctx)
	require.NoError(t, err)
	symlinkContents, err := dir.File("symlink-to-some-file").Contents(ctx)
	require.NoError(t, err)
	require.Equal(t, "some-content", symlinkContents)
	symlinkContents, err = dir.WithNewFile("some-file", "overwritten-contents").File("symlink-to-some-file").Contents(ctx)
	require.NoError(t, err)
	require.Equal(t, "overwritten-contents", symlinkContents)
})
it'll also give us better error messages from testify rather than simply getting an error code of 1.
I wonder if there's a need for a File.Stat() call in the dagger api?
For a while you had to use a shell script every time you wanted to catch a non-zero exit code... Now we have Container.exec(expect:ANY|ERROR) so I got to remove a lot of shell script wrappers that day
Somehow we never got around to finishing it, but yeah overwhelming consensus that we have a gap there
oh that's a nice wrapper, I recall doing similar test ...; echo $? > exit_code hacks to get around similar cases where I wanted to test an error actually occurred. it's a bit nasty
context canceled makes it sound like something else entirely, but looking
Thank you @final star for getting a release out on short notice!
once theseus is shipped, it would be pretty cool to have a git cache backend... mapping blobs to git objects could be sweet. better than GHA cache service IMO
Is Theseus a managed BuildKit / cache service?
not really - it's an effort to piece-by-piece rework and replace large parts of our usage of buildkit, including caching, the scheduler, the underlying implementations, etc.
we're often bumping against the limitations of buildkit, so the idea is to try and own more of that logic. this then enables us to do really cool things with remote caching/persistence, because we're not limited by what's supported in buildkit
Very cool. Excited to see what y'all do
Since we discussed the topic live @civic yacht @spark cedar I went ahead and created the issue https://github.com/dagger/dagger/issues/10416
@spark cedar opened this a few hours ago: https://github.com/dagger/dagger/issues/10413
sorry I should have checked. will close mine
I very recently started getting Go errors in my IDE. It seems that it's because the engine no longer builds on Darwin?
As a result some IDE operations don't work for me anymore (for example renaming symbols): the IDE asks me to fix errors first, but I can't
Where's the code that emits this? Is it buildkit?
And what operation triggers it?
I noticed that a solid chunk of module loading logic lives in core/schema/modulesource.go which surprised me, since most files in core/schema are thin graphql frontends, with the backends living in core/. Is there a specific reason for this difference, or is it just a leftover quirk from past refactors?
Is it possible to plug into the otel traces of dagger?
I would love to instrument my test code to have a fine grain info on which test takes more time directly in dagger cloud.
we do this in dagger/dagger, in go we had to write a lil library to add context.Context support to the default testing lib. it is definitely possible and probably easier in a lot of other languages.
is it reasonable to create a new set of DagOpContainer calls, which would mimic the existing wrappers under https://github.com/alexcb/dagger/blob/with-symlink/core/schema/wrapper.go#L107-L154 (and also a corresponding NewContainerDagOp under dagop.go) ?
ah lol
funny
I have that in my branch dagop-exec
it's on my fork? sorry, afk now, it's all suuuuper super crappy right now
I can try and tidy it up tomorrow so you can actually use it? the reason it's so much harder than the others is because there are actually multiple inputs and outputs
each mount goes in, each mount comes out
but there's still one op
I'll take a closer look at your branches, thanks for the reminder about it.
in the meanwhile, I can even try cherry-picking from it just to see if it works.
it's a bad history, but you probably don't want the bits that hook into withExec
(since it's almost entirely broken lol)
so for some reason i thought that it was already possible to pass module constructor args through to the autoloaded shell module, but went to test it and it seems that that's not a thing. is there some reason i might be missing as to why we don't want that? in dagger/dagger/.dagger, i'm basically thinking about the dagger shell --source=../dagger-worktree-copy --dockerCfg=... cases. ideally this would also apply to dagger mcp too and re-use some dagger call code. wdyt, should i open an issue? @rancid turret @tepid nova
ContainerDagOp
Quick question: https://github.com/dagger/dagger/blob/68f969a06b4cd5afb9c9f51a7fac0617fd81ba1a/core/schema/git.go#L273-L282
In what case is if clientMetadata.ClientID != parentClientMetadata.ClientID { true and leads to a break ? Like hypothetically -- i'm a bit confused by the NonModuleParentClientMetadata
Given this example:
type Jo struct{}
// Here, in this hypothetical example, let's say that privateGitRepo is private HTTP ref
// In this case, the user runs this function as a top-level dagger call: `dagger call example --private-git-repo https://XXXXX.git`
func (m *Jo) Example(ctx context.Context, privateGitRepo *dagger.Directory) *dagger.Directory {
return privateGitRepo
}
func (m *Jo) ExampleRef(ctx context.Context, ref string) *dagger.Directory {
return dag.Git(ref).Ref("main").Tree().Directory("zip")
}
This works:
dagger call example --private-git-repo https://github.com/grouville/daggerverse-private/
And this fails:
dagger call example-ref --ref https://github.com/grouville/daggerverse-private/
Now, this is expected in terms of security by design (users need to pass the socket and/or the secret as args). Is it that check that enables that? if clientMetadata.ClientID != parentClientMetadata.ClientID {
Got my answer, indeed 💯
I'm running into a case where dagger call sdk go generate export --path . is failing with Error: introspection query: returned error 502: {"data":null,"errors":[{"message":"http do: Post \"http://dagger/query\": unexpected EOF"}]}, my best guess is the internal dagger engine that's started via https://github.com/dagger/dagger/blob/68f969a06b4cd5afb9c9f51a7fac0617fd81ba1a/.dagger/sdk.go#L71 is crashing, but I'm stumped on how I can view the engine logs.
You should be able to docker logs <engine-container-id>
For example on my stable engine I might do:
docker logs dagger-engine-v0.18.8 2>&1 | grep -v debug | less
Then look for a stack trace
from my host? or via a different container? I had a watch -n 0.1 docker ps -a on my host and didn't see it come up.
Oh, well it depends how your engine is provisioned. Do you have EXPERIMENTAL_DAGGER_RUNNER_HOST set to something custom? The dagger CLI will default to its builtin docker provisioner unless that variable is set
that's so very odd if it is the top level instance, because I see:
root@tyee:/var/lib/docker/volumes# ps auxw | grep dagger-engine | grep tcp
root 2150004 0.0 0.0 1320024 59948 ? Ssl 17:07 0:00 /usr/local/bin/dagger-engine --config /etc/dagger/engine.toml --addr tcp://0.0.0.0:1234 --network-name dagger-dev --network-cidr 10.88.0.0/16
which appears to be running via buildkit (well the dagger engine, but still under a different buildkit cgroup)
root@tyee:/var/lib/docker/volumes# cat /proc/2150004/cgroup
0::/system.slice/docker-da3da4bd56dd9e80ddd8518f59edd36f21e5d14d6ce9e999b1c86cee801657dc.scope/buildkit/f715uk0jsa0hacel0kzepa176/init
if I look at my host docker instance, I see it's been up for 6 days:
root@tyee:/var/lib/docker/volumes# docker ps -a | grep da3da4
da3da4bd56dd registry.dagger.io/engine:v0.18.8 "dagger-entrypoint.s…" 6 days ago Up 28 hours dagger-engine-v0.18.8
which seems a bit confusing. I need to figure out what's actually running under f715uk0jsa0hacel0kzepa176/init
are you running a dev engine somewhere? Or is this 0.18.8 engine what you expected to find?
both I suspect? dagger call sdk go generate export --path . uses v0.18.8, but doesn't that start up a dev instance under the hood?
I don't have any _EXPERIMENTAL_DAGGER_RUNNER_HOST set on my host
but the build tooling sets it under https://github.com/dagger/dagger/blob/main/.dagger/sdk.go#L72 which makes it seem like it points to a different encapsulated dev instance.
because after running the dagger call ... I end up with two separate instances running
# ps auxw | grep dagger-engine.*--config.*toml | grep -v grep
root 4987 12.2 38.1 38715032 24797772 ? Ssl May19 205:41 /usr/local/bin/dagger-engine --config /etc/dagger/engine.toml --debug
root 2150004 0.0 0.0 1320024 59772 ? Ssl 17:07 0:00 /usr/local/bin/dagger-engine --config /etc/dagger/engine.toml --addr tcp://0.0.0.0:1234 --network-name dagger-dev --network-cidr 10.88.0.0/16
and the dev instance is running under a child cgroup of the official build:
root@tyee:/var/lib/docker/volumes# cat /proc/4987/cgroup
0::/system.slice/docker-da3da4bd56dd9e80ddd8518f59edd36f21e5d14d6ce9e999b1c86cee801657dc.scope/init
root@tyee:/var/lib/docker/volumes# cat /proc/2150004/cgroup
0::/system.slice/docker-da3da4bd56dd9e80ddd8518f59edd36f21e5d14d6ce9e999b1c86cee801657dc.scope/buildkit/f715uk0jsa0hacel0kzepa176/init
and going deeper, here's the nested instance running via runc:
# cat ./volumes/acbb87c1a60414d61877a84560fad964cd2aa22bf70a59fd46a53986029621d2/_data/worker/executor/f715uk0jsa0hacel0kzepa176/config.json | jq .process.args
[
"/.init",
"dagger-entrypoint.sh",
"--addr",
"tcp://0.0.0.0:1234",
"--network-name",
"dagger-dev",
"--network-cidr",
"10.88.0.0/16"
]
and it's this instance I'm hunting for logs from.
could https://github.com/dagger/dagger/issues/6553 be related to my quest for these logs?
Ah I see!
OK so I use dagger cloud for this
If you navigate the trace, you'll find the nested service (pulsing yellow while it's running) then all the logs are there
Or, you can run with dagger call -E (for --no-exit) then the TUI remains for you to interactively explore. You can enter a span with enter, exit it with escape, etc. The logs are in there. Can be laborious in a very large trace though
Another way would be dagger --progress=plain then you'll get the firehose of logs and meta-logs in text format
ah ha, I should really play around with both cloud traces and the -E, thanks for these ideas.
Lev upgraded my cloud account, so I should be good to go on that front.
@obsidian rover @charred lotus https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified
If a ModuleSource was loaded as a dependency, does it contain a reference to its parent?
(I think the answer is no, but just checking)
@midnight rapids Your repros were very useful, thank you a thousand times
I isolated two bugs with SSH auth loading that I'm currently fixing:
- We don't seem to support the case where 1) the user needs a key for a private module and 2) needs another one for one of its args
- We seem to only take the first one into account
I guess it's time to load the gitconfig / sshconfig from the user -- which seems to be the underlying cause. From memory we DO load the gitconfig -- but not the sshconfig
This seems related to the same underlying cause -- digging
it shouldn't no - we actually take some effort to make sure it doesn't, so that if two different modules load the same dependency, then we get things de-duped nicely
hey @spark cedar can I get a quick confirmation I'm on the right track with my core/schema/directory additions while working on a new directory.exists method?
func (s *directorySchema) exists(ctx context.Context, parent dagql.Instance[*core.Directory], args existsArgs) (dagql.Boolean, error) {
if !core.DagOpInContext[core.RawDagOp](ctx) {
return DagOp(ctx, s.srv, parent, args, s.exists)
}
exists, err := parent.Self.Exists(ctx, s.srv, args.Path, "")
return dagql.NewBoolean(exists), err
}
and it can be hooked up with:
dagql.NodeFunc("exists", s.exists).
Doc(`todo.`).
Args(
dagql.Arg("path").Doc(`todo`),
),
am I right in thinking I need to use NodeFunc rather than simply Func here?
Anything interesting in there?
We were looking at the examples of codex with @final star, and they are carefully crafted so that using dagger wouldn't help much (mostly doing code rewrites)
@wet mason wdym by carefully crafted ?
they don't need much sandboxing
@wet mason @charred lotus FYI I'm actively using dagger CLI as a coding agent right now, because I actually needed it, for codebase analysis
Here's the trace if you're interested https://v3.dagger.cloud/dagger/traces/9456b4fe0dee0086f084a13f30bee449
you mean codex's simple dev env image is sufficient ?
@charred lotus got some time to chat? I'm getting stuck /cc @final star if you're around
sure
@skcd42 @_xjdr is using multiple git worktrees lol https://t.co/CZLo5n9xR3
npx @modelcontextprotocol/inspector
And it freaking worked! Dagger + o3 found the solution for implementing platform modules cleanly. Just need a better environment so that it can just implement it for me. cc @wet mason @charred lotus @obsidian rover
note I gave up on doing it in my local checkout. Just had it checkout the git repo from main. Git context is strength of ours and I think you guys should leverage that in your prototype
let the IDEs struggle with messy local checkouts.
I say in the future, a local git checkout on the user's computer will be the exception not the norm
I'm curious to know what were the issues in the local checkout. Messy git state ?
Just didn't know how to plug it in
or get it back out
prompt was simpler with just "help me understand the source code of dagger, start from main branch"
IMO git is a strong entrypoint for us
actually there's an opportunity to simulate regular stateful file tools, except it's in an ephemeral checkout of a git branch, and gets auto-saved to your agent's "working branch" when they're done (no explicit tool calls needed). Then expose simplified container tools that assume that local context
cc @wet mason
the env starts with a context dir
(which by the way we already have an implementation of with .cd in the dagger shell)
Depends on what we want to target for the integration. Goose, codex, cursor … all work from local code
If we target "coding agents" (as in, hand rolled agents) then it's easier to go with git yeah
Yeah we were thinking the same. If there's a way to "diff" / "apply" what the agent did, then a starting point is needed, the context dir could be it
Yeah they focus on working from local code, but they don't have to stop there. After all they give you a shell, from there you can ssh, curl, run git operations etc
there are lots of useful tasks a dev wants to do within their IDE, that involve interacting with remote resources
I just wanted to point out anecdotally, that for my use case today, I really didn't care about using the local checkout. If dagger had given me a super easy way to auto-push the agent's changes to a temp branch, I would have used that for sure
I could get the agent to do it, but then I have to setup git-related tools, give it auth etc...
easier if it happens implicitly
maybe a crazy idea, but imagine if I could just aim the tool at any git branch of any repo, and boom that's the environment. Then expose very basic tools for snapshotting your env, rolling it back.. under the hood it would be backed by git magic (annotations, multiple branches whatever)
There's a test failure in main, is anyone on top of it? /cc @fair ermine https://v3.dagger.cloud/dagger/traces/8551c04fb6a75547d7af32801ccf1361
sdk/typescript/runtime/bin/__tsclientconfig.updator.ts: no such file or directory
This file doesn't even exist in the repo lol
oh not this again
I've seen an increase in the number of flakes that look something like (both in CI, locally, and in the middle of test runs): Error: failed to get configured module: input: moduleSource failed ...
It seems it's using a previous version?
I've been seeing it pretty reliably for the last commits in main and open prs.
yeah same, i'm gonna try and commit some time soon to dive in. atm just trying to wrap up the current state of my execop work, so i can dive back in later
In core, given a *ModuleSource (or any other core type), can I get a dagql.Instance[*ModuleSource]?
I'm in a core schema handler func, and need a dagql.Instance of the parent. But I only get the core type of the parent.
do you have a specific scenario? it depends a bit on what you're doing
you can do NewInstanceForID
but that will only generally work if you load that ID again you'd get the same result
But how do I get the ID?
I'm starting from:
dagql.Func("withDependencies", s.moduleSourceWithDependencies).
Doc(`Append the provided dependencies to the module source's dependency list.`).
Args(
dagql.Arg("dependencies").Doc(`The dependencies to append.`),
dagql.Arg("platform").Doc(`Install as platform dependencies (executed in the parent's context directory)`),
),
And am trying to patch s.moduleSourceWithDependencies, which starts like this:
func (s *moduleSourceSchema) moduleSourceWithDependencies(
ctx context.Context,
parentSrc *core.ModuleSource,
args struct {
Dependencies []core.ModuleSourceID
Platform dagql.Optional[dagql.Boolean]
},
) (*core.ModuleSource, error) {
// what to do
ah that's perfect! Thank you
🫡
ha ha this could not have been a more laser-focused solution to my problem, there is literally zero other change, just getting the instance I needed
@spark cedar but how do I deal with this?
parentSrc = parentSrc.Clone()
Going to try
~~```
parentSrc.Self = parentSrc.Self.Clone()
parentSrc = parent.Self.Clone()
// ...
return dagql.NewInstanceForCurrentID(ctx, s.dag, parent, parentSrc)
And hope nothing blows up
shell interpreter builtins
FYI I opened an issue for this, going to take a stab at it, in order to make progress on improving Env
Not sure where to even start
Maybe I should ask an agent
In core or core/schema, how do I convert a string-encoded ID back to a call.ID?
OK my daggerbot actually gave me the answer
Hello guys, I was about to jump in the implementation but I thought it would be smarter to first discuss it.
I created an issue about the optional SDK function call, so SDK would not be forced to implement the whole SDK interface to work but only what they need (like only ModuleRuntime but not Codegen for module support)
https://github.com/dagger/dagger/issues/10480 happy to read your thoughts before making the implementation
/cc @tepid nova @spark cedar @leaden glade @rancid turret
I think with the Typedefs added in https://github.com/dagger/dagger/pull/10475, some of the "methods called on the sdk" column changes a little bit?
i generally like the idea here a lot, should simplify making sdks a lot
one thing i would like to see is a separated out Init method for dagger init, so that column would be Init + Codegen + GenerateClient. i started work on it in https://github.com/dagger/dagger/pull/8201, i can't remember if i discussed with you if you wanted to pick that up.
I didn't mention it because it's not merged yet and I assumed my proposal would be merged before that one, but yeah it would be part of it
dagger init should only implement init, no? Codegen and GenerateClient are only for dagger develop and client install? or if you set an sdk in your init flag, no?
I wanted to quickly improve the Java SDK generated doc for a module.
Right now, like in Go, we are documenting the package. But we are also adding a documentation to the main object.
It looks like if both are defined then the one from the main object is displayed (at least in dagger shell).
As the package documentation is, contrary to Go, in a separate file, would it make sense to only document the main object? So that everything is in a single file, and more visible to the user. Or is there any reason I should keep the package level documentation only?
Module docs
engine benchmarks started getting preemptively cancelled around 8m of runtime ~2 weeks ago. investigating now. commit where it started is maybe suspicious, it's the one where @civic yacht fixed the exec deduping cancel edge case
@spark cedar perhaps too late in the day for you, but if you want to chat about "inline dependencies" + the related "persisted env configuration", both are high priority for me given what they unblock. Would love to hear your thoughts on the design directly
I replied in one of the issues earlier today, but yeah, I'm now ooo
will continue there async.
Context for those reading: https://github.com/dagger/dagger/issues/10464
skip deduping running nested execs (#102...
Hi, can someone help check the evals / testdev job in PR https://github.com/dagger/dagger/pull/10373 ? It always fails when the job tries to get the API_KEYs. I kept rebasing but it didn't help. 😢
@civic yacht @spark cedar I think I just hit an engine session-wide caching bug (or maybe just something the telemetry doesn't handle?) - when running the same thing in parallel, a failed withExec shows up CACHED in the other session:
evals failing for dependabot bumps
fyi, we've got a core dev architecture discussion coming up on the hour (10 minutes from this message)
anyone who wants to is welcome to join (we'll discuss lots of juicy technical details)
@still garnet Hey ... to start a service with port forwarding manually, what's the process?
I'm doing svc = AsService().Start()
Then for each port, tunnel = Host().Tunnel(svc).Start()
Then to get the local address, endpoint = tunnel.Endpoint()
Wasn't sure whether it's host.tunnel or host.service
Host.service is container->host. Host.tunnel is host->container
@obsidian rover Happy to help! Any luck while digging?
Thank you again for working on this
FYI I'm nibbling away at small building blocks required to simplify Env:
Hey Isra! Yep, actively exploring this. It's a known issue affecting Docker and most container runtimes. I'm currently building a library to properly handle SSH configs in containers, including some tricky edge cases around private keys. Still very much WIP, mostly limited by available time. I'll keep updates coming!
Thank you! If you'd like help with that, I'm up to try to take a look too
Started experimenting with Claude to see if it could add new features based on existing examples. Started with GCP secret provider. It needed a lot of guidance but it's getting close.
fix for a weird test failure on main: https://github.com/dagger/dagger/pull/10522
(something something gnome-keyring is a bizarre piece of software)
does dagql schema support enum in some way? I see there's a withEnum for modules, but i'm not sure if it can be used in the way I want.
I'm working on a new call directory.exists(.....), and it takes an optional expected type, so you can also validate it exists and is a directory, or file. Currently I just have a string I'm passing: https://github.com/alexcb/dagger/blob/2825be594748d0fcd299312f502029282d56f182/core/schema/directory.go#L103 but this doesn't feel robust.
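One way to make the "expected type" argument more robust than a bare string, whatever dagql ends up offering, is to validate it once into a small typed enum. A minimal Go sketch follows; `ExistsType`, its constants, and `ParseExistsType` are hypothetical names made up for illustration, not part of dagql or of the linked branch:

```go
package main

import "fmt"

// ExistsType is a hypothetical enum for the optional "expected type" argument.
// It is not a dagql type; it only illustrates validating at the boundary.
type ExistsType string

const (
	ExistsTypeAny       ExistsType = ""          // no expectation: any entry counts
	ExistsTypeFile      ExistsType = "FILE"      // entry must be a regular file
	ExistsTypeDirectory ExistsType = "DIRECTORY" // entry must be a directory
)

// ParseExistsType converts a raw string into an ExistsType, rejecting
// anything that is not one of the known values.
func ParseExistsType(raw string) (ExistsType, error) {
	switch t := ExistsType(raw); t {
	case ExistsTypeAny, ExistsTypeFile, ExistsTypeDirectory:
		return t, nil
	default:
		return "", fmt.Errorf("invalid expected type %q", raw)
	}
}

func main() {
	for _, raw := range []string{"", "FILE", "DIRECTORY", "flie"} {
		t, err := ParseExistsType(raw)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: %q\n", string(t))
	}
}
```

With this shape a typo such as "flie" fails up front instead of silently matching nothing later in the handler.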
Error extensions
I have a first version of the self invocation working with Go modules.
This is similar to what I did in Java, so it doesn't contain shortcuts to access to types (for the environments)
This is what it looks like on my "local-agent" if I remove the local module for the workspace and replace it by a local type: https://github.com/eunomie/local-agent/pull/1
I updated the PR there if anyone wants to try: https://github.com/dagger/dagger/pull/10475
cc @hasty basin
Note: This is the first part of #10336 and #8030
TypeDefs split
To be able to generate types of the module itself, this means it requires to generate and expose (register to the engine) the types an...
Telemetry logs
@civic yacht @still garnet Hey, I'm guessing the answer is no, but would there be a way to catch the "output" (like dagger.ExecError) of a Directory.DockerBuild?
@civic yacht @still garnet again ...
Dealing with a hang when using the SDK directly, can't seem to find my way around
Shot in the dark but: I should be able to do AsService().Start as many times as i want in the same session, right?
Found a bug that would perfectly explain the "no such file or directory" flakes during filesync that have been plaguing our CI: https://github.com/dagger/dagger/pull/10541
Will need to go out in a release to fully squash the flake (since it sometimes happens in the stable "outer" engine too), but hopefully its reign of terror is coming to a close
I've been bitten a few times by ./hack/dev returning 0 even though the engine compilation actually fails. Has this bitten anyone else? Happy to fix
sadly, this is not the world's most trivial fix. @rancid turret is working on this right now: https://github.com/dagger/dagger/pull/10438
glad I posted here
yeah I've been bitten by it a few too many times too, so I wrote my own hacky hack wrapper:
#!/usr/bin/bash
set -e
set -o pipefail
cd "$HOME/dagger/dagger" # you might have to change this part
rm -f bin/{dagger,engine.tar}
docker rm -f dagger-engine.dev
./hack/build
test -f bin/dagger || (echo "ERROR: dagger failed to build" && exit 1)
test -f bin/engine.tar || (echo "ERROR: engine failed to build" && exit 1)
echo "== build worked =="
I did something similar here: https://github.com/jasonmccallister/dotfiles/blob/main/zsh/functions#L35
so, i've been poking a lot around typedef handling in the engine 😱
when modules register they produce a lot of "incomplete" typedefs - e.g. WithObject used as a reference. the problem is - we never actually fully resolve these, which ends up producing some weird cases, e.g. if you want to look at the possible enum values of an arg, you can't.
it feels like the ModTypeFor function could/should handle this? but not entirely sure how desirable this is for core? kinda a bit confused, it feels almost linked to core interfaces here as well?
seeing a weird engine panic against ./hack/dev, happened twice in a row (second time was after pulling main) (putting logs in thread since it's a bit of a puke)
With Dagger 0.18.10, I'm using finch instead of docker (via symlink trick similar to nerdctl) and it works with no errors shown on CLI
~ ➤ /tmp/bin/dagger -m github.com/shykes/hello call hello
✔ connect 0.7s
✔ load module 1.1s
✔ parsing command line arguments 0.0s
✔ hello: Hello! 0.0s
✔ .hello: String! 0.2s
hello, world!
Full trace at https://dagger.cloud/dagger/traces/5d4321a5078de204ca5ecb5ebdab38c5
but getting this error (see image below) in Dagger Cloud 🤔 @still garnet really an error?
https://dagger.cloud/dagger/traces/42db4f3bad4ab10235d1165ab1055316
Same error with 0.18.9
https://dagger.cloud/dagger/traces/5d4321a5078de204ca5ecb5ebdab38c5
Plus caught another error before this one
https://dagger.cloud/dagger/traces/163e5b86ef9a5986cd4997e05180b713
exec docker inspect --type=image registry.dagger.io/engine:v0.18.9
failed to run command: exit status 1
time="2025-06-10T09:54:16-07:00" level=fatal msg="1 errors: [no such object registry.dagger.io/engine:v0.18.9]"
uhhhh @still garnet 🤔
any idea how i might have managed to produce this trace?
https://dagger.cloud/jedevc/traces/f206e83a51b18eba37ccbdc3bb4181a1#770d975a661a42ad
i just did dagger develop 🤔
cloud panic
another one for you @still garnet (sorry) https://dagger.cloud/dagger/traces/9f28b4b977d3ee22b55842681c312e13#578a2f8af50a3c52:L5
not quite sure, but it seems like the failing check span isn't available in the rest of the view?
i can't seem to find any of the surrounding context
@final star i chucked down a few ideas about git apis in https://github.com/dagger/dagger/issues/7202#issuecomment-2966176223
I have a module my-module that has a dependency to a module sub-module. This module has a dependency to a module my-module. And this module has a dependency to a module sub-module.
- my-module (1): dag.SubModule().xxx
  ⎿ sub-module (2): dag.MyModule().xxx
    ⎿ my-module (3): dag.SubModule().xxx
      ⎿ sub-module (4)
The main thing here is that this works fine. So I have multiple modules with the same name, and it's fine.
But now, I tried to rename them.
So in my-module (3) I changed the dependencies for:
"dependencies": [
{
"name": "self",
"source": "sub-module"
}
And changed the call dag.SubModule() to dag.Self().
This is working fine.
I moved to sub-module (2) and did the same thing. But this time, when I dagger develop, I have the following error:
failed to get schema introspection json during module sdk codegen: failed to get schema for module "self": type "Self"
is already defined by module "self"
While part of me understands the error, the other part is wondering why the initial state, with 2 modules named my-module and 2 modules named sub-module, was working, but 2 modules aliased to self doesn't work anymore.
What am I missing here?
cloud not showing some spans
@leaden glade side tangent: have you thought about having dag.Self() (or similar, maybe SDK specific) actually be a reference to the current object instance - i.e. not just the module entrypoint? that should be totally possible, since we already have its ID by the time we're calling the function on it. 
module name conflicts
@leaden glade I saw in your update that you're running into difficulties specifically for renaming self-calls to dag.Self. I just want to make sure you're not wasting your time there, since there is a DX debate over whether that's what we actually want. Maybe better to settle the debate before you spend a ton of time on this particular change?
cc @tidal spire
I'll put that on the side for now and focus on the main use case (without dag.Self). And we can re-discuss dag.Self later, maybe even merge a version without it first.
CI is failing because of 500s from package.cgr.dev right now.
Also just tried to login to dockerhub for unrelated reasons and am getting 500s from them too. World may be collapsing, just FYI.
GCP down ahahah
i think we're back?
Sometimes, when things don't work as expected, it's time to go back to the drawing board and start (almost) fresh.
So I opened a new PR regarding self invocation: https://github.com/dagger/dagger/pull/10584
This PR only focuses on adding moduleTypeDefs to module SDKs (with Go implementation only) and not on self calls, in order to keep the PR smaller and more readable, and move step by step.
The PR is in draft because I haven't focused on tests, but happy to get some eyes on it
@still garnet hey, can I cast the error as dagger.ExecError in a AsService().Start()?
errors.As seems to not catch it
hmm yeah it might not, since it runs through the gateway; different path, it's more like spawning a process and waiting for its exit status, as opposed to running a LLB exec
longstanding issue: https://github.com/dagger/dagger/issues/6553
Yeah, this is a subset of that ... service.Start does fail, and does return an error, and does say "exit code 1", but it's not castable to ExitError and we can't get stdout/stderr back
@still garnet the problem i'm having is when e.g. the agent writes an app.py in flask, starts it (as a service, otherwise withExec would hang forever), if there's a syntax error or whatever it doesn't "see" it, so it's flying blind
yeah exactly - it's not the usual buildkit error wrapping/extracting path, it's just a plain old fmt.Errorf: https://github.com/dagger/dagger/blob/ee8f54d42a38b487d1010b73a86e5c9a6418380f/core/service.go#L572
i can take a look and see if that's a quick fix
@still garnet sweet! yeah i don't see any workarounds in "userspace" ... I could try to start the service using a withExec and a context timeout, and hopefully if it times out it's because it works, but it's beyond hackish
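For anyone following along, the general Go behaviour behind "errors.As doesn't catch it" can be shown with the standard library alone. This is only a sketch of the mechanism, not the actual code in core/service.go: `ExecError` below is a hypothetical stand-in for a typed error like dagger.ExecError, and `startFlattened` / `startWrapped` are made-up helpers. An error built with a plain fmt.Errorf (no %w) keeps the message but drops the typed value from the chain:

```go
package main

import (
	"errors"
	"fmt"
)

// ExecError is a hypothetical stand-in for a typed exec error such as
// dagger.ExecError; it exists only to keep this sketch self-contained.
type ExecError struct {
	ExitCode int
	Stderr   string
}

func (e *ExecError) Error() string {
	return fmt.Sprintf("exit code %d", e.ExitCode)
}

// startFlattened mimics a path that formats the cause with %v:
// the text survives, the typed error does not.
func startFlattened() error {
	cause := &ExecError{ExitCode: 1, Stderr: "SyntaxError: invalid syntax"}
	return fmt.Errorf("start service: %v", cause)
}

// startWrapped keeps the typed error in the chain by wrapping it with %w.
func startWrapped() error {
	cause := &ExecError{ExitCode: 1, Stderr: "SyntaxError: invalid syntax"}
	return fmt.Errorf("start service: %w", cause)
}

// asExecError reports whether err carries an *ExecError in its chain.
func asExecError(err error) (*ExecError, bool) {
	var execErr *ExecError
	ok := errors.As(err, &execErr)
	return execErr, ok
}

func main() {
	_, ok := asExecError(startFlattened())
	fmt.Println("flattened castable:", ok)

	execErr, ok := asExecError(startWrapped())
	fmt.Println("wrapped castable:", ok)
	if ok {
		fmt.Println("stderr:", execErr.Stderr)
	}
}
```

In both cases the message still contains "exit code 1", which matches the symptom described above: the text is there, but stdout/stderr are only reachable when the typed error stays in the chain.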
@still garnet oh btw -- a quick one (hopefully):
dagger.PortForward{
Frontend: rand.Intn(1000) + 5000,
^^^ is there a way to get random allocation for a Host().Tunnel? I believe I tried not setting the Frontend and not getting anywhere
there's definitely a way, it should be in the godoc
iirc you just set it to 0, but there might be a flag like 'random' or something
@wet mason wait, i just wrote a test for getting a dagger.ExecError back and it passed right away
// If a port's frontend is unspecified or 0, a random port will be chosen by the host.
and then you call .Endpoint to see what it actually chose
Oh wait ... I need to pass the frontend port to Endpoint, no?
no it's optional
// If no port is specified, the first exposed port is used. If none exist an error is returned.
@still garnet
pseudo-code
for _, port := ... {
hostForwards = append(hostForwards, dagger.PortForward{
Backend: port,
Frontend: rand.Intn(1000) + 5000,
Protocol: dagger.NetworkProtocolTcp,
})
}
// Expose ports on the host
tunnel, err := dag.Host().Tunnel(svc, dagger.HostTunnelOpts{Ports: hostForwards}).Start(ctx)
if err != nil {
return nil, err
}
// Retrieve endpoints
for _, forward := range hostForwards {
externalEndpoint, err := tunnel.Endpoint(ctx, dagger.ServiceEndpointOpts{
Port: forward.Frontend,
})
ah, there are multiple
i guess you could create one tunnel per port, instead of one tunnel for all ports?
kinda neat how the Zed agent runs the full blown TUI
that did the trick
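The "set the frontend to 0, then ask which port was picked" pattern is the same convention plain TCP sockets follow. As a rough stdlib-only illustration (this does not use the Dagger SDK, and `listenOnRandomPort` is a made-up helper), binding to port 0 lets the OS choose a free port, and the listener's address tells you which one it chose:

```go
package main

import (
	"fmt"
	"net"
)

// listenOnRandomPort binds to port 0 on loopback so the OS assigns any free
// TCP port, then reports the port that was actually chosen.
func listenOnRandomPort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	port := ln.Addr().(*net.TCPAddr).Port
	return ln, port, nil
}

func main() {
	ln, port, err := listenOnRandomPort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Printf("listening on 127.0.0.1:%d\n", port)
}
```

Compared with rand.Intn(1000) + 5000, this avoids collisions with ports that are already in use, which is presumably why the tunnel API offers the 0 / unspecified option.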
I have this kind of error, but I'm not sure what it means
Error: input: daggerDev call constructor: NotFound: rpc error: code = NotFound desc = socket xxh3:1e5cef82d3912aea not found
For instance when I'm running dagger call test specific --pkg="./core/integration" --run="^TestModule"
Any idea?
scan is failing on main, working on a fix in https://github.com/dagger/dagger/pull/10593
Fixes the failing scan check on main, which is detecting that we're building our CNI binaries with an old go version.
To do this, we need to update alpine, since the older version we...
yayyyy, another one of these 😢 does it happen every time? using which dagger version?
I have that from my branch: https://github.com/dagger/dagger/pull/10584
hm odd
it feels like either one of two things is happening.
either the typedefs are wrong, and a socket field is somehow getting smuggled, which means the transfer from one module context to another isn't working
or, somehow, a socket is being accessed that shouldn't be
it's not immediately obvious to me what's going on, do you have a trace link?
@still garnet sorry, another incredibly vague weird report - it feels like the traces ui has gotten more sluggish recently? e.g. https://dagger.cloud/dagger/traces/1207c9946fb0aaf680fc23c3d72e580c?listen=4561d51f260a7e86&showHidden=56064ea0aee6c8bd&showHidden=ec7143524540962d#ddff9cdb1efa342f:L4957
i find that the whole thing really freezes up, and it's almost impossible to click around in firefox (chromium seems to manage better)
got it all green now (needed to update alpine as well)
For instance this trace if you have access: https://dagger.cloud/dagger/traces/27bc3f90d8b0a4e1fa32afe2ed5e6305
I'm running dagger call test specific --pkg="./core/integration" --run="^TestModule/TestInvalidSDK" for instance (or any other test)
https://dagger.cloud/dagger/traces/7861eeb07e9c693fd488f17f98fc0b7e#db8e0fdc59506914:L62
glibc-2.41-r7.apk disqualified because "2.41-r7" does not satisfy "glibc=2.41-r8"
glibc-2.41-r8.apk disqualified because "2.41-r8" does not satisfy "glibc=2.41-r9"
glibc-2.41-r9.apk disqualified because "2.41-r9" does not satisfy "glibc=2.41-r8"
why, when it finds the match, does it change the version it's looking for? lol
I also have this trace, on CI: https://dagger.cloud/dagger/traces/4107b92cebb05babf032d855ccb602a3
In case that can help
Could use an approval on https://github.com/dagger/dagger/pull/10562 - it speeds up the evals quite a bit, at the moment they sometimes hit the 20m timeout, post-refactor they take 7m (on the GHA side where the timeout is enforced - actual runtime is ~4m without the dev engine setup)
The goal: establish a repeatable pattern for evals, decoupling them from Dagger's own evals.
Evals are now implemented as separate types that implement this interface:
type Eval interface {...
@still garnet not sure if it's your pr, but after merging https://github.com/dagger/dagger/pull/10601, the TestLLM suite seems to be failing
2035: [2m4s] | --- FAIL: TestLLM/TestAllowLLM/prompt_calls/allowed_unrelated,_calling_direct (71.58s)
2035: [2m4s] | testctx.go:173:
2035: [2m4s] | Error Trace: /app/core/integration/llm_test.go:352
2035: [2m4s] | /go/pkg/mod/github.com/dagger/testctx@v0.0.4/testctx.go:295
2035: [2m4s] | /go/pkg/mod/github.com/dagger/testctx/oteltest@v0.0.3/log.go:37
2035: [2m4s] | /go/pkg/mod/github.com/dagger/testctx/oteltest@v0.0.3/trace.go:83
2035: [2m4s] | /go/pkg/mod/github.com/dagger/testctx@v0.0.4/middleware.go:25
2035: [2m4s] | /go/pkg/mod/github.com/dagger/testctx@v0.0.4/testctx.go:149
2035: [2m4s] | Error: Received unexpected error:
2035: [2m4s] | read |0: i/o timeout
2035: [2m4s] | Test: TestLLM/TestAllowLLM/prompt_calls/allowed_unrelated,_calling_direct
ok well
huh looks like that showed up in my PR too but i can't imagine how these would be related (...yet?): https://dagger.cloud/dagger/traces/d203c44c302a63b2abbcc72b98c55355?span=f03e11713c1be80c
How common is it for the SDK gen jobs to timeout in a PR? I've retried them a few times yesterday, but same issue. I have a number of them that ran for 20 minutes then timed out: https://github.com/dagger/dagger/actions/runs/15720828016?pr=10569 (I just restarted the php-dev job right now to see if it works today).
they shouldn't be timing out
it may be worth finding the span for the engine logs, and making sure the engine isn't crashing
(especially when making a lot of weird internal changes)
there's some retry logic I take it? that's good to know. I'll take a closer look at the logs.
update: i have tracked this down to the line that changes a ctx to context.Background() but still haven't the faintest clue why that would bubble up in this way
I'm having a hard time understanding (and fixing) the test failures on my moduleTypeDefs branch.
I managed to fix some of them, but the remaining ones look like rpc error: code = NotFound desc = socket xxh3:736dee0b5d28a207 not found
Any help appreciated
(this branch does not contain anything self invocation, this is just about extracting type defs, so the impact should be minimal. In theory, but I guess in practice something's wrong somewhere)
https://github.com/dagger/dagger/pull/10584
afaik the socket ... not found is a known flake, so if it seems super unrelated to your changes that's most likely the case
Have you seen that one recently? I haven't personally seen that one in quite a while, but obviously not checking every CI run
I'll pull the PR down and take a look. Best guess is it's related to SDKs attempting to use SSH auth socks, which could connect it to what you're working on
yeah I still hit it every now and then - I can link them if you want in the future
Yes plz. Dropping it in an issue would be best
Figured out what's going on, I'll leave a comment on the PR w/ suggestions on how to fix but it has to do with the fact that the new TypeDefs container and Runtime container are different (due to the new --typedefs-only arg) and thus end up as different cache entries. The Runtime container now doesn't get evaluated until the actual function invocation occurs, at which time it's running codegen (again) and looking for the SSH socket that's not available at that time.
https://github.com/dagger/dagger/pull/10584#issuecomment-2992580570 left suggestions there
Thanks a lot, looking at it (and answering on GitHub)
TL;DR: the goal is specifically to not use Runtime for TypeDefs and to run codegen twice (could be two different codegens, depending on the SDK)
I've been seeing this error pop up for the last 15 minutes when I try to generate something in dagger/dagger, is anyone else hitting that?
package conflicts: [musl]
package conflicts: [musl]
I tried to:
- Rerun multiple times -> same issue
- Remove container + volumes -> same issue
- Compile dev engine with ./hack/dev -> same issue
- Try from main with ./hack/dev -> same issue
I think last time I saw that error there was an issue in wolfi
Yeah I think so too, do you remember how we fixed it? Or maybe wolfi updated it afterwards so it's just a matter of time
It looks like it tries to install busybox twice: the first time works, the second doesn't
Looks like it's fixed on main though, I can build the dev engine ; will try on my PR
@still garnet what's the best way to have a service healthcheck timeout, just a context?
yeah that seems reasonable
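A minimal sketch of the "just a context" approach, assuming a polling healthcheck. waitHealthy, runDemo and the probe functions are hypothetical names for illustration, not engine code; a real probe would dial the service instead of counting attempts.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitHealthy polls check until it succeeds or ctx is done.
// Hypothetical helper: the timeout lives entirely in the context passed in.
func waitHealthy(ctx context.Context, interval time.Duration, check func(context.Context) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		err := check(ctx)
		if err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			// Wrap ctx.Err() so callers can use errors.Is on it.
			return fmt.Errorf("healthcheck gave up (last error: %v): %w", err, ctx.Err())
		case <-ticker.C:
		}
	}
}

// runDemo probes one fake service that becomes healthy on the third attempt
// and one that never does, each under its own deadline.
func runDemo() (error, error) {
	attempts := 0
	eventually := func(context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("not ready")
		}
		return nil
	}
	never := func(context.Context) error { return errors.New("not ready") }

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	healthyErr := waitHealthy(ctx, 5*time.Millisecond, eventually)

	ctx2, cancel2 := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel2()
	stuckErr := waitHealthy(ctx2, 5*time.Millisecond, never)
	return healthyErr, stuckErr
}

func main() {
	healthyErr, stuckErr := runDemo()
	fmt.Println("eventually healthy:", healthyErr)
	fmt.Println("never healthy:", stuckErr)
	fmt.Println("timed out:", errors.Is(stuckErr, context.DeadlineExceeded))
}
```

The caller picks the deadline with context.WithTimeout, so the healthcheck itself needs no separate timeout argument.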
random question: I see references to a "shim", and found some git logs where the shim was removed, but I'm still seeing some potential leftovers? e.g.
alex@tyee:~/dagger/dagger$ grep -RI ShimEnableTTYEnvVar
core/service.go: ShimEnableTTYEnvVar = "_DAGGER_ENABLE_TTY"
alex@tyee:~/dagger/dagger$ grep -RI ENABLE_TTY
core/service.go: ShimEnableTTYEnvVar = "_DAGGER_ENABLE_TTY"
Is the above used in any way, maybe in some other repo?
I was ultimately trying to figure out how /.dagger_meta_mount is actually used, but my grepping isn't offering much hints.
update: looks like it's used here? https://github.com/jedevc/dagger/blob/fae6ff1aff62c32779a73ea781e6ce186a8c4330/engine/buildkit/errors.go#L313
Has anyone ever hit that issue with building the dev engine:
process "/bin/busybox --install -s" did not complete successfully: failed to install cacerts: failed to run [update-ca-certificates] for install: installer command failed: exit code: 1, output: run-parts: No such file or directory
I'm not able to build a dev engine nor run integrations tests because of that
It works fine with --cloud but I cannot use --cloud to compile a dev engine (or maybe I can?? If so, how ?)
This actually works but you have to modify the hack script to add the --cloud flag, and it'll be pretty network dependent for upload/download
Oh, I'll try that!
Is there a way to spin up the GraphQL server so I can setup introspection? Trying to run some stuff against the dagger query command
env DAGGER_SESSION_TOKEN=test dagger listen --allow-cors and then send requests with Authorization=Basic dGVzdDo=
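For anyone wondering where dGVzdDo= comes from: it is plain HTTP Basic auth with the session token as the username and an empty password. A small sketch (basicAuthHeader is a made-up helper name, not part of the CLI or SDK):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuthHeader builds an HTTP Basic Authorization value from user:pass.
func basicAuthHeader(user, pass string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(user+":"+pass))
}

func main() {
	// Token "test" as the username, empty password, i.e. the string "test:".
	fmt.Println("Authorization:", basicAuthHeader("test", ""))
}
```

With a different DAGGER_SESSION_TOKEN the encoded part changes accordingly.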
Thank you
I just found the listen command, I was asking copilot on github where the auth was
Not sure how many people use VS Code for Dagger dev, but I loaded up dagger/dagger in the VS Code extension I'm working on.... that is a whole lot of functions, might need to nest those under the module name
much better! Was wondering what each of the 20 lint functions was for
Is it a new dagger vs code extension? That would be so cool to call the function from the IDE directly haha, would be quite tricky for functions that require a lot of parameters but definitely cool!
It sure is, it looks at the function args and prompts for user input, still need to double check some types that we support but it is working
I was parsing the cli output but switched to calling dagger query
https://x.com/mccallisterdev/status/1939698146839335338?s=46
So cool! Congratz
I wanted to build container-use binary using dagger on my mac
dagger -c 'build | export ./cu'
The build itself went well, but the result isn't what I expected:
$ file ./cu
./cu: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, BuildID[sha1]=bda00bf1d955276690c4b73c9fbaf6a1e0b373a7, with debug_info, not stripped
$ ./cu --help
zsh: exec format error: ./cu
My understanding is that this is an issue in the go module and the way the default platform is handled: https://github.com/dagger/dagger/blob/a569ec610a25df6c94cfe4bb55be02a8ade11f32/modules/go/main.go#L180-L182
If no platform is provided (empty string) then we don't set up GOOS and GOARCH, while, from my current understanding, we should instead use dag.DefaultPlatform(), which returns the host platform (and not the engine platform, which for instance runs in a linux container on a mac host)
Should I update modules/go to handle the default platform that way?
dag.DefaultPlatform() can return an error.
In the modules/go code it's used inside a With(f func(r *Container) *Container) *Container block. But this doesn't allow returning an error. That's not the first time I'm in this situation.
Should we add something like a With(f func(r *Container) (*Container, error)) *Container (maybe with a different name than With) so that we can raise issues in some way? Or maybe there's a better pattern? Usually I just don't use With in this case, but it means I'm losing the chainable aspect
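To make the question concrete, here is a rough sketch of what an error-aware With could look like. Everything in it is hypothetical: Container is a toy builder (not the SDK type), and WithErr, Env, withPlatform and the lookup functions are names invented for the example. The idea is to record the first error on the chain and surface it at the end, so chaining is preserved.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Container is a toy chainable builder standing in for the SDK type.
type Container struct {
	env []string
	err error // first error recorded anywhere in the chain
}

// WithEnv appends KEY=VALUE; it is a no-op once the chain has failed.
func (c *Container) WithEnv(key, value string) *Container {
	if c.err != nil {
		return c
	}
	env := append(append([]string{}, c.env...), key+"="+value)
	return &Container{env: env}
}

// WithErr is the hypothetical error-aware With: f may fail, and the failure
// is carried along instead of breaking the chain.
func (c *Container) WithErr(f func(*Container) (*Container, error)) *Container {
	if c.err != nil {
		return c
	}
	next, err := f(c)
	if err != nil {
		return &Container{env: c.env, err: err}
	}
	return next
}

// Env ends the chain and reports the recorded error, if any.
func (c *Container) Env() ([]string, error) {
	if c.err != nil {
		return nil, c.err
	}
	return c.env, nil
}

// withPlatform sets GOOS/GOARCH from a fallible platform lookup.
func withPlatform(lookup func() (string, error)) func(*Container) (*Container, error) {
	return func(c *Container) (*Container, error) {
		platform, err := lookup()
		if err != nil {
			return nil, fmt.Errorf("resolve platform: %w", err)
		}
		goos, goarch, ok := strings.Cut(platform, "/")
		if !ok {
			return nil, fmt.Errorf("malformed platform %q", platform)
		}
		return c.WithEnv("GOOS", goos).WithEnv("GOARCH", goarch), nil
	}
}

func okLookup() (string, error)      { return "linux/arm64", nil }
func failingLookup() (string, error) { return "", errors.New("engine unreachable") }

func build(lookup func() (string, error)) ([]string, error) {
	return new(Container).
		WithEnv("CGO_ENABLED", "0").
		WithErr(withPlatform(lookup)).
		WithEnv("GOFLAGS", "-trimpath").
		Env()
}

func main() {
	fmt.Println(build(okLookup))
	fmt.Println(build(failingLookup))
}
```

The trade-off is that the error only shows up when the chain is finally evaluated, which is roughly how lazy SDK calls already behave.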
"platform modules" design... π§΅
rfc: https://github.com/dagger/dagger/pull/10662 (new Container.load api)
to workaround you can do --platform=current
Should I update modules/go to handle the default platform that way?
Issue here: https://github.com/dagger/dagger/issues/8354
It needs to be part of the args, and not a dag.DefaultPlatform, otherwise, different platforms will cache to the same value.
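A toy model of why the platform has to be an argument. This is not how Dagger computes cache keys; cacheKey is a made-up function that hashes the function name plus its arguments, which is enough to show the collision.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// cacheKey is a toy stand-in for a function-call cache key:
// a hash over the function name and its (sorted) arguments.
func cacheKey(fn string, args map[string]string) string {
	keys := make([]string, 0, len(args))
	for k := range args {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(fn)
	for _, k := range keys {
		b.WriteString("\x00" + k + "=" + args[k])
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:8])
}

func main() {
	// Platform resolved inside the function: the call looks identical from
	// every host, so a mac client and a linux client share one entry.
	fmt.Println("implicit, mac host:  ", cacheKey("build", map[string]string{}))
	fmt.Println("implicit, linux host:", cacheKey("build", map[string]string{}))

	// Platform passed as an argument: each platform gets its own entry.
	fmt.Println("explicit darwin:", cacheKey("build", map[string]string{"platform": "darwin/arm64"}))
	fmt.Println("explicit linux: ", cacheKey("build", map[string]string{"platform": "linux/arm64"}))
}
```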
Should we add something like a With(f func(r *Container) (*Container, error)) *Container (maybe with a different name than With) so that we can raise issues in some way?
Issue here: https://github.com/dagger/dagger/issues/9343 and https://github.com/dagger/dagger/issues/8402#issue-2518178288?
We did discuss before, discussion seems to have stalled.
it looks like I need to improve my search-git-issues-fu
thanks for the pointers
am i missing a trick? it looks like if the engine dies, if i've got a shell session open, i can't get it to close without killing it, any combination of commands doesn't seem to get it to exit.
(discovered cause i found a weird panic bug in the engine)
question for @spark cedar @civic yacht @rocky plume : does Theseus get us closer to building the whole CLI + engine into a single binary like normal people (as opposed to gluing together various binaries in a OCI wrapper?)
not really. it doesn't really change or affect any part of our packaging process
I thought maybe getting rid of buildkit changed the equation for packaging
but I guess we already used it as a library... And there's still runc and its plugins
And containerd I guess?
yeah, i think we're quite far away from having a single binary (that's not something that contains an oci image itself)
theseus doesn't make it any harder, potentially we're able to be less reliant on a long running daemon - since we'd own the call points. but aside from that, i'd say it's pretty much entirely different
can I get a review on https://github.com/dagger/dagger/pull/10449 ? It's been open for quite some time now.
Evals on main failed, looks like it's catching something involving interfaces that wasn't caught by the actual interface integ tests themselves (I guess). Almost certainly from https://github.com/dagger/dagger/pull/10620 so just an FYI that it's ack'd and I'm looking into it
Quick core internal question... Does core.TypeDef allow setting a default value for an argument of type Secret, or other object types?
@spark cedar @leaden glade @fair ermine
I see this in core:
func (fn *Function) WithArg(name string, typeDef *TypeDef, desc string, defaultValue JSON, defaultPath string, ignore []string, sourceMap *SourceMap) *Function
But in the case of an object type, I'm not sure what to pass as defaultValue of type JSON. Maybe the object ID? Would that work?
if you set it as the object id, then i think that miiiight work?
Would it leak weird giant ID strings into the graphql schema and SDK introspection?
yup it would
hmmm
i don't think we've thought about how we'd do default objects
Let me move to #1391746885093425295 since that's what I'm working on
I have a real-world example of hard-to-read error: https://dagger.cloud/dagger/traces/c0dd2325173043f807f255bbd32bf97d?listen=5555b821a9625a2b&listen=9304cae9d5cef168&showHidden=55eb5cf982a65eb0#ea32cf65903f787e
Command:
dagger -m github.com/sagikazarmark/daggerverse/gh@v0.11.0 | with-repo https://github.com/dagger/dagger.io | clone
Console output:
▼ gh | with-repo https://github.com/dagger/dagger.io | clone 0.5s ERROR r jump ↴
╰── clone 0.5s ERROR r jump ↴
! input: gh.withRepo.clone failed to stat file /tmp/repo: process "gh repo clone https://github.com/dagger/dagger.io /tmp/repo" did not complete successfully: exit
code: 4
! input: gh.withRepo.clone failed to stat file /tmp/repo: process "gh repo clone https://github.com/dagger/dagger.io /tmp/repo" did not complete successfully: exit
code: 4
Web trace default output: same in web form
Web trace after drilling down: see link above. The actual error was hidden in the logs of an exec:
To get started with GitHub CLI, please run: gh auth login
Alternatively, populate the GH_TOKEN environment variable with a GitHub API authentication token.
hard-to-read errors in shell/cloud
I'm looking for the best place to hook into the loading of the session's top-level module... I successfully hooked ModuleObject.Install but it hooks every module. How do I hook into only the top-level module?
what's the authoritative way to find the main object of a given module? Compare the lowercase object type name to the module original name?
Or is it ObjectTypeDefs[0]?
@civic yacht @spark cedar @rancid turret Hey, not sure who's familiar with engine provisioning nowadays ...
Running into a deep paper cut when the SDK and CLI versions do not match, does this ring a bell?
https://github.com/dagger/container-use/pull/206
/cc @obsidian rover
currently trying things on the side -- making repros, thanks Andrea
Update: from the Connect() code, I don't understand where this could come from -- digging a bit more. I might be wrong, but i'm tending towards a cache poisoning -- trying to repro on the side
Are you overriding which CLI to use with an env var? By default it should just be using an automagically downloaded CLI
Not that the version being off by a patch version between the two should result in an error necessarily, just checking
Nope, import dagger + dagger.Connect(), very vanilla
okay does the problem happen when a new release is made? like everything was fine on v0.18.11 until v0.18.12 was released? or it just randomly appears?
either way, yeah sounds like something very wrong is happening
Do you have steps that repro consistently? I just tried on my system where I have v0.18.12 CLI installed and dagger-engine-v0.18.12 running, then ran a hello world with the Go SDK directly on v0.18.11 and it all worked as expected
engine got replaced with v0.18.11 and everything
It happened once but we shrugged it off and just bumped the dependency (which fixed it). And now it's happening again.
Will look into the timeline, I think it was fine until it was released but will double check when it happened the first time around
fwiw this is what happens for me too. in fact would be kind of nice to have a way to keep the engine instead of deleting the other one, sometimes i swap back and forth between projects and they keep nuking each other
DAGGER_LEAVE_OLD_ENGINE https://github.com/dagger/dagger/pull/9130, though obviously you'll be on the hook for cleaning them up manually so they don't eat your whole disk
ah nice. i wonder if there could be a default heuristic, like 'if == version found, keep, if only version is <, destroy and replace' - i.e. don't destroy a newer one
env var is good enough for me, just thinking out loud
I wouldn't mind that, yeah. The whole point of rm'ing existing ones was that we assume users are just upgrading to newer versions to replace the old one and we didn't want to put them on the spot to manually manage them. But from a dev's perspective where we're more often going back and forth, I think what you described would help w/out breaking that default-path behavior for users
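A sketch of the heuristic floated above ("if == version found, keep; if only version is <, destroy and replace; don't destroy a newer one"), under the assumption that versions are plain vMAJOR.MINOR.PATCH strings. parse, compare and plan are hypothetical helpers, not the actual provisioning code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse turns "v0.18.12" into [0 18 12]. Pre-release suffixes are not handled.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unsupported version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unsupported version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// compare returns -1, 0 or 1 when a is older than, equal to, or newer than b.
func compare(a, b [3]int) int {
	for i := range a {
		if a[i] != b[i] {
			if a[i] < b[i] {
				return -1
			}
			return 1
		}
	}
	return 0
}

// plan decides which running engines to remove and whether a new engine at
// the wanted version has to be started.
func plan(wanted string, running []string) ([]string, bool, error) {
	w, err := parse(wanted)
	if err != nil {
		return nil, false, err
	}
	var remove []string
	start := true
	for _, r := range running {
		rv, err := parse(r)
		if err != nil {
			return nil, false, err
		}
		switch compare(rv, w) {
		case 0:
			start = false // same version already running: reuse it
		case -1:
			remove = append(remove, r) // older engine: destroy and replace
		}
		// Newer engines are left alone.
	}
	return remove, start, nil
}

func main() {
	fmt.Println(plan("v0.18.11", []string{"v0.18.12"}))
	fmt.Println(plan("v0.18.12", []string{"v0.18.11"}))
	fmt.Println(plan("v0.18.12", []string{"v0.18.12"}))
}
```

This keeps the default upgrade path (old engines get replaced) while letting someone who swaps between projects keep the newer engine around.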
yes it seems related -- i wonder if it's not our amazon endpoint to retrieve the cli version ? That's the only thing I can think of atm ?
Not at the moment, digging
does anyone have any insight into what could possibly be making these tests fail? i definitely have made changes in this pr, but i can't quite understand where the failures are coming from: https://dagger.cloud/dagger/traces/83c323388b58713b9e8fe57a18ad6ae2?span=3a1e3fa35ed75a2f
there's also nothing anywhere in the logs that indicate a test failure
it's like there are a lot of test failed spans, but with no information as to what is causing the failure
@spark cedar do you know how i can pass -i when using Go SDK ?
to get an interactive session on failure? i don't think you can, unless the entire thing is wrapped in a dagger run (you'd add it to dagger run -i)
Thanks!
@still garnet sorry for the ping! but any chance there's something going on here with testctx? it feels like there's an error happening somewhere, but falling through the cracks
np! when in doubt you can check the go test output
indeed, i'm kinda struggling through in cloud, the frontend is trying to render 60k lines of ca cert output and struggling
i see 'race detected' in the logs
yeah those are always heinously difficult to track down
wish i could hook into it somehow
my browser handled it OK - just took a while to get through the batches but wasn't laggy.
which one you using?
both chromium and firefox struggle for me
copy pasta'd the race backtrace
you can also try filtering for WARNING: DATA RACE and expanding down from the match, since that filters before rendering
(or "FAIL")
huh wild, that's interesting. i can barely select the filter box, it's lagging so hard
firefox here but on a pretty beefy laptop
it do be struggling
is that intentional btw? even just doing nothing, and everything seems loaded, it still seems to be sitting pegged at 100% on one core, a couple minutes later
oh yes totally intentional
have to keep those laps warm
anything in the console logs?
sorry, poor phrasing indeed
lol np
hmm mind capturing a profile?
CI typescript-dev job is failing on main, it seems I forgot to regenerate the client after a PR was merged (and it wasn't caught by the CI in the PR for an unknown reason)
Don't panic, it will be fixed in 5 minutes
Quick update:
- Added a config file to the POC for default secrets https://github.com/dagger/dagger/pull/10697
- Resurrecting "platform module", it's embarrassing that we still haven't shipped a solution after 2 months of very clear feedback after POC
I welcome help on both of these
@spark cedar I'm rebasing on main, and it looks like dagql.Instance is no longer a thing? Is it easy to replace it with dagql.Result? Any trapdoors?
looks like I just replace .Self with .Self() and I'm good to go basically?
Instance => ObjectResult, .Self => .Self() yep
Dev toolchain problem: I'm adding calls to the engine API, and using them in the CLI. The CLI fails to build because the new calls aren't defined in the client lib. I can't generate with client lib because dagger -m ./sdk/go -c generate somehow has a dependency on the CLI and can't load. Circular dependency?
I'm currently stashing my CLI changes so that I can re-generate the client then unstash the CLI changes... I expect/hope it will work but it's a bit embarrassing
(update: it worked)
I've been working on the blueprint module PR, I'm stuck on one obscure issue in the module loading logic... Might need a little help
@spark cedar I'm a bit stumped by some of the CI errors here: https://github.com/dagger/dagger/pull/10712
Like there's an error on 'test-publish' because some nix install script doesn't want to install nix as root?
completely irrelevant to my change
yup
What caused the issue? Did Nix change how it was configured? Or we recently decided to add Nix in that workflow?
upstream changed their install script to require coreutils' cp
probably, there's a lot of stuff we should pin - i don't have time to look through right now, i'm also not sure where they host their archive of old install scripts
@spark cedar I'm getting a few other failures that looks like flakes... are there known flakes at the moment?
we have issues for the known flakes: https://github.com/dagger/dagger/issues?q=sort%3Aupdated-desc is%3Aissue is%3Aopen label%3Akind%2Fflake
looks like the CI runner crashed under the checks feet or something
yeah that's a real failure I'm wrangling it
But everything else seems unrelated
yup, seems like alpine infra is having a moment
This is my local loop for re-running the tests with "real" failures: dagger -c 'test | specific --run="TestBlueprint.*"' but it takes forever to re-run even on a hot cache, is there a way to do it with less overhead?
Should dagger -c 'dev --image alpine | terminal' give me an alpine-based dev engine?
cat /etc/os-release
ID=wolfi
NAME="Wolfi"
PRETTY_NAME="Wolfi"
VERSION_ID="20230201"
HOME_URL="https://wolfi.dev"
BUG_REPORT_URL="https://github.com/wolfi-dev/os/issues"
Oh, maybe ALPINE...but that should be default according to dagger -c '.help dev'
Creates a dev container that has a running CLI connected to a dagger engine
USAGE
dev [options]
OPTIONAL ARGUMENTS
--gpu-support bool Enable experimental GPU support
--image DaggerDevDistro Set target distro (default: ALPINE) (possible values: <<<<<<<<<<<<<<<<<<<
ALPINE, WOLFI, UBUNTU)
--shared-cache bool Share cache globally
--target Directory Mount a directory into the container's workdir, for
convenience
No joy with --image ALPINE
Should ./hack/dev & ./hack/with-dev go test -count=1 -v -run TestBlueprint ./core/integration?
What's the protocol for using the contents of core/integration/testdata from the test suite? Can I rely on a predictable workdir, or some other method?
Yeah in your integ test case you can assume your workdir is ./core/integration, so anything you want to load from testdata is accessible at ./testdata/
I might need help getting blueprints over the finish line.. I'm hitting edge cases in module loading that I don't know how to solve
and the devloop is so slow...
Here's my issue, in case anyone has an idea...
- When I run dagger install ../foo it adds the relative path in dagger.json
- When I run dagger init --blueprint=../foo it adds the absolute path in dagger.json. Even though I tried to reuse as much code as possible from dagger install. I'm doing something differently and I can't trace what exactly
update: claude may have figured it out?
got stuck a little further down...
@spark cedar I got ensnared in engine version compat checks... Doing something that doesn't relate to it at all. Maybe you can help?
in the CLI implementation of dagger install:
modSrc = modSrc.WithDependencies([]*dagger.ModuleSource{depSrc})
// If I don't call this...
if engineVersion := getCompatVersion(); engineVersion != "" {
modSrc = modSrc.WithEngineVersion(engineVersion)
}
_, err = modSrc.
GeneratedContextDirectory(). // ...then this fails. Why?
Export(ctx, contextDirPath)
what does it fail with?
module requires dagger , but support for that version has been removed
Context is that I'm trying to wrap moduleSourceWithDependencies from moduleSourceWithBlueprint, to avoid duplicating all the special logic in there.
But in order to do that, I must also call loadModuleSourceConfig from moduleSourceWithBlueprint. And it fails with the error above.
weird, um i don't really know
i don't think you can call that function from there, i think it needs to be called from a fully configured module?
and it's not configured at that point?
(sorry, multitasking hard here)
It should be, it's a function attached to the ModuleSource type (eg. you can call ModuleSource.withBlueprint in the Dagger API)
No problem, it was just in case you had an obvious answer off the top of your head. I'll figure it out
somehow the engineVersion is being set to the empty string - i'm not quite sure how that's happening
if you have a branch pushed, i can try and peek
I think I finally figured it out
If anyone can spare a test / review... We're basically there. https://github.com/dagger/dagger/pull/10712
Is it possible to do a WithExec ignoring the exit code? Never mind, I found the ANY value of ContainerWithExecOpts.Expect
been working on support for more driver types (podman, containerd, apple containers, finch, etc)
https://github.com/dagger/dagger/pull/10714
it's pretty much there? but since it's changed a bit from my original proposal, would love some more feedback
^ fyi @tepid nova @tidal spire @hasty basin
oh fun, i briefly hacked at podman support like a year ago just tryna understand how the engine works. do any of these offer any interesting tradeoffs compared to docker/colima/etc?
not really, podman compatibility is pretty close
it's mainly about just being able to support whatever users are using and meeting them where they are
also, I'm biased, I'm maybe looking at daily driving nerdctl, but I need good dagger support to give it a go
btw it would probably be easy for the default driver to auto-detect either docker, podman, or nerdctl. That way all those users would get zero-config local provisioning out of the box
great minds think alike
Additionally, we also have support for the generic image + container drivers (which becomes the new default) - these behave the same, but auto select from the above based on what the user has installed on their system.
the preferred order is this:
I would move docker to last in that list, because it's common for it to be installed (or aliased) alongside one of the others. In those cases, the user probably prefers to not use docker
in other words: if container, podman, finch or nerdctl are available, they should be used, even if docker is also available
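A sketch of the auto-detection with docker tried last, as suggested above. The candidate list and its order are assumptions for illustration (the PR's actual order isn't quoted here), and detect and fakePath are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// candidates is a hypothetical preference order with docker last, since it is
// often installed (or aliased) alongside one of the others.
var candidates = []string{"container", "podman", "finch", "nerdctl", "docker"}

// detect returns the first candidate that lookPath can resolve.
// lookPath is injected so the logic can be exercised without touching PATH.
func detect(order []string, lookPath func(string) (string, error)) (string, error) {
	for _, name := range order {
		if _, err := lookPath(name); err == nil {
			return name, nil
		}
	}
	return "", errors.New("no supported container runtime found in PATH")
}

// fakePath builds a lookPath that only knows the given binaries.
func fakePath(installed ...string) func(string) (string, error) {
	return func(name string) (string, error) {
		for _, bin := range installed {
			if bin == name {
				return "/usr/local/bin/" + name, nil
			}
		}
		return "", errors.New(name + ": not found")
	}
}

func main() {
	// docker and podman both installed: podman wins under this order.
	fmt.Println(detect(candidates, fakePath("docker", "podman")))
	// Whatever is actually installed on this machine.
	fmt.Println(detect(candidates, exec.LookPath))
}
```

Swapping the order back to docker-first is a one-line change to candidates, which is what makes it easy to test-drive the new logic before changing the default.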
maybe - that's a breaking change though, if users have anything installed, dagger will start running somewhere else instead
i'm happy to change the order up later, but i'd also rather testdrive the new logic before we switch everyone over to it (there might be tiny incompatibilities i haven't found in testing)
yeah that makes sense
fyi, that pr is probably chill to merge whenever? not gonna try and rush it for the release though, there's no urgent need
(i'm also on holiday tomorrow, and don't wanna be scrambling to fix something)
@spark cedar re: JSONValue. I'm interested in your idea for replacing setString, setInt, setBool with a generic set()
<@&946480760016207902> what's the suitable replacement for dagql.Object?
Looks like dagql.AnyResult, not sure if it's as simple as search-replace, or if there's more to it?
dagql.ObjectResult should be what you need
That would have been the more logical name, but that seems to be a templatized type, so I can't use it
AnyResult should probably be it then
I think ObjectResult replaces Instance
what are you working on for this?
I guess a more accurate name would be AnyObjectResult since it's only for types that have an ID
all types have ids now
or can - even if they're scalar values (needed so we can cache them)
Refreshing my old "generic ID" PR, part of the Env cleanup, which I'm trying to chip away at
(very slowly)
Oh... Even scalars? How does that work
Not using graphql scalars anymore?
Cool cool cool
new generic id types should probably be able to handle that
Yeah my patches are pretty superficial, as long as there's a way to designate "anything with an ID" in the schema in the code, should be good
(in the schema it's even simpler: I just define a new scalar "ID" and that's it)
Yeah the summary is:
- Instance[T] -> Result[T] (for any T including scalars) and ObjectResult[T] (when T is an object type)
- Object -> AnyResult and AnyObjectResult
The need to have both the parameterized structs and the unparam'd interfaces is annoying but the way generics work in Go mean that you end up in situations where you need the interfaces, so
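A toy illustration of why both shapes end up existing. The names echo the ones above but these are stand-in definitions, not the dagql types: a generic struct gives typed access when T is known, and a plain interface is needed as soon as values with different T have to live together.

```go
package main

import "fmt"

// AnyResult is the unparameterized view shared by every Result[T].
type AnyResult interface {
	TypeName() string
	Unwrap() any
}

// Result is the parameterized struct: typed access via Self().
type Result[T any] struct {
	typeName string
	self     T
}

func (r Result[T]) Self() T          { return r.self }
func (r Result[T]) TypeName() string { return r.typeName }
func (r Result[T]) Unwrap() any      { return r.self }

// describe works on mixed results, which a []Result[T] could never hold.
func describe(results []AnyResult) []string {
	out := make([]string, 0, len(results))
	for _, r := range results {
		out = append(out, fmt.Sprintf("%s=%v", r.TypeName(), r.Unwrap()))
	}
	return out
}

func main() {
	s := Result[string]{typeName: "String", self: "hello"}
	n := Result[int]{typeName: "Int", self: 42}

	// Result[string] and Result[int] are unrelated types, so the only common
	// element type for a slice is the interface.
	fmt.Println(describe([]AnyResult{s, n}))

	// Where T is known statically, the typed accessor needs no assertions.
	fmt.Println(len(s.Self()), n.Self()+1)
}
```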
Obscure engine/dagql question if anyone is around: I use dagql.Server.Select() to query the engine API from within the engine API. It works fine, except when passing an optional argument.
Example:
err = s.srv.Select(ctx, s.srv.Root(), &inst, dagql.Selector{
    Field: "secret",
    Args: []dagql.NamedInput{
        {
            Name:  "uri",
            Value: dagql.NewString(uri), // required argument: works
        },
        {
            Name:  "cacheKey",
            Value: dagql.NewString(cacheKey), // optional argument: fails
        },
    },
})
I get this error:
select: failed to compute cache key for Query.secret: assign input "cacheKey"
(dagql.Optional[github.com/dagger/dagger/dagql.String]) as (dagql.String): assign:
Setter.SetField dagql.String to dagql.Optional[github.com/dagger/dagger/dagql.String]: cannot
set field of type dagql.Optional[github.com/dagger/dagger/dagql.String] with dagql.String
<@&946480760016207902>
Also @rancid turret I'm working on extracting some CLI core to the engine (the "resource loader" that loads a directory, container, secret, file or socket from a single command-line flag). But there are a few strings that I don't know how to detach... Could you help me navigate that?
(I see you online - welcome back?)
Yeah, I'm around. What do you need?
going for a quick lunch, but in short: I'm moving the custom flag values from CLI to engine API. See https://github.com/dagger/dagger/pull/10753
I started with directoryValue, I got a basic implementation to work (host | resource ADDRESS | as-directory). But I had to drop modArg.Ignore because I have no idea how that works...
So my question is: what is modArg.Ignore and how do I transpose it to the engine-side implementation?
Try dagql.Opt(dagql.String(...))
Nice let me try that ty!
I looked for examples in the code but hard to find
yeah I guess we don't hit that a lot. I looked for another place with a similar pattern and it does 'required/non-null but with default ""' instead of a nullable value so that must be why
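A toy model of the mismatch behind that error, for anyone hitting the same thing. Optional, Opt and assign here are stand-ins written for this example, not dagql's implementation: the point is only that a field declared as an optional wrapper rejects a bare value, and wrapping the value first makes the types line up.

```go
package main

import "fmt"

// Optional is a toy nullable wrapper (not the dagql type).
type Optional[T any] struct {
	Value T
	Valid bool
}

// Opt wraps a value as a present Optional.
func Opt[T any](v T) Optional[T] { return Optional[T]{Value: v, Valid: true} }

// assign mimics a strict setter: the input must already have the exact type
// the field was declared with.
func assign[T any](input any) (T, error) {
	v, ok := input.(T)
	if !ok {
		var zero T
		return zero, fmt.Errorf("cannot set field of type %T with %T", zero, input)
	}
	return v, nil
}

func main() {
	// Bare string into an optional field: rejected.
	_, err := assign[Optional[string]]("my-cache-key")
	fmt.Println("bare value:", err)

	// Wrapped first: accepted.
	v, err := assign[Optional[string]](Opt("my-cache-key"))
	fmt.Println("wrapped value:", v.Value, v.Valid, err)
}
```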
@still garnet while I have you...
I have a []core.PortForward and trying to pass them as argument to Host().Service()
var ports []core.PortForward
err = s.srv.Select(ctx, s.srv.Root(), &inst,
    dagql.Selector{
        Field: "host",
    },
    dagql.Selector{
        Field: "service",
        Args: []dagql.NamedInput{
            {
                Name:  "host",
                Value: dagql.NewString(host),
            },
            {
                Name:  "ports",
                Value: dagql.ArrayInput[dagql.InputObject[core.PortForward]]( /* ???? */ ),
            },
        },
    },
)
I tried this:
Value: dagql.ArrayInput[dagql.InputObject[core.PortForward]](ports),
But it says invalid conversion
ah wait maybe I can just change the type of ports and just pass it
yeah that should work
But once I have a var ports dagql.ArrayInput[dagql.InputObject[core.PortForward]]
I don't know how to append to it
ports = append(ports, dagql.InputObject[core.PortForward]{
Backend: nPort,
Frontend: &nPort,
Protocol: core.NetworkProtocolTCP,
})
--> unknown field Backend in struct literal of type dagql.InputObject[core.PortForward] (compiler MissingLitField)
OK I think I figured it out
But it hurts
Heads up I'm going to send a request for API bikeshedding soon
Has anyone run into the issue where dagger -c 'generate | export .' suddenly goes haywire and removes a gazillion files from the repo?
Note: I interrupted the export after 4 min because it shouldn't take that long, even with slow internet...
And export without --wipe is supposed to be additive, so it shouldn't be possible for files to be removed, even if I interrupt the filesync
Note: it resolves itself in the end. But yeah there seems to be a race condition where an interrupted sync will wipe some files
request for comment: https://github.com/dagger/dagger/pull/10753
Thank you to everyone who commented. I'm digesting the API dogfooding and trying a new API variation (same implementation)
Flagging: I think we have a filesync performance problem. I'm on starlink (good bandwidth, meh ping, maybe some packet loss?) it's barely usable on the dagger codebase with small edits
Also that performance problem is, I think, surfacing some non-atomic file wiping issues
not super surprised, i'd really like to rework the file transfer backend especially for remote dagger connections
(forgot to mention that this is on a remote engine eg. parc)
if we start doing more parc stuff, i'd definitely want to fix that (e.g. not just internal use)
At the moment I switched to git-pushing my changes, then loading straight from remote git
but it's not super trivial to just fix
The issue is that for complicated codegen dependencies, it makes my life unnecessarily hard
@spark cedar how do you think it compares to regular buildkit? Same? Better? Worse?
i want to try using an rsync backend for remote cases? i think that would be a lot more stable + performant for remote cases (and for local cases, i think we could use direct mounts)
probably a little bit worse? our "protocol" is a bit chatty
cc @night moat maybe this is where we can talk?
not significantly so though. the buildkit file transfer protocol works terribly over a network though, it's a known thing
Maybe there's a possible combo between 1) live dev services, and 2) improved filesync?
The first is about exposing filesync in the API; the second is about making filesync perform better.
mmmm yeah potentially. I've got a proposal for the live sync on my back burner, was hoping to have something this week, but sigh, I got caught up
i think there's some overlap, but they are different, there's no one perfect file sync protocol out there that solves every problem we have
right. I'll settle for "average performance"
So 3 options off the top of my head:
- rsync
- mutagen?
- @night moat's thing
New proposal: https://github.com/dagger/dagger/pull/10770
cc @spark cedar @still garnet @civic yacht
I like this one better
@night moat TLDR:
- Dagger wraps buildkit (aka the backend of docker build)
- Dagger inherits buildkit's filesync feature, which uploads local files & ingests into the DAG
- buildkit (and dagger) support remote hosts seamlessly, but the protocol is slooooow
- We are removing the buildkit dependency. As a result we control the filesync implementation.
- For now we have simply copy-pasted the buildkit protocol, but it's within our reach to swap it out for something better
- Meanwhile we started using remote engines much more. Which makes the need for a better filesync protocol more and more acute.
- We were just discussing next steps to deal with all that
- An important step is to choose a protocol to implement. In addition to the usual suspects (rsync, mutagen) maybe there's something awesome you have built which would be better?
it's just for file synchronization ? as in you'd like a better rsync ?
yes. But tightly integrated into our runtime. This is basically docker's proprietary file transfer protocol for uploading files into the docker build context. We want to replace it with something better.
And, it's for ingestion into our content-addressed data layer. So just naively wrapping a vanilla rsync on both ends isn't enough
mh, need to think about it but surely there's something that could be done with ptar
who's working on that on your side ?
me and @civic yacht mostly
okie, I have a 1h call but then I'm available to see if I can help if you want
@spark cedar gentle ping on this: https://github.com/dagger/dagger/pull/10680#issuecomment-3097125225
Interested in what you had in mind
i was wondering about having a generic scalar value in graphql. like any "any" type. essentially something that to decode a string, would just store the json string - then you'd be able to lazily interpret it later
essentially the equivalent of json.RawMessage in go
it might not be possible, but it feels like if it is, that might be preferable?
Maybe but I worry it might feel too verbose in practice
fair, that was my concern with the original proposal
i think i was envisioning something similar to how the gjson lib works (we use it in our integration tests)
they have quite a terse api imo
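For reference, the json.RawMessage pattern mentioned above can be sketched like this. The Envelope type and its field names are made up for illustration and say nothing about how a generic scalar would actually be modeled in the GraphQL schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope keeps Payload as raw JSON bytes; nothing inside it is parsed yet.
type Envelope struct {
	Kind    string          `json:"kind"`
	Payload json.RawMessage `json:"payload"`
}

// decodeLater parses only the envelope and leaves the payload untouched.
func decodeLater(data []byte) (Envelope, error) {
	var e Envelope
	err := json.Unmarshal(data, &e)
	return e, err
}

func main() {
	e, err := decodeLater([]byte(`{"kind":"port","payload":{"backend":8080}}`))
	if err != nil {
		panic(err)
	}
	// The payload is interpreted lazily, once the caller knows what shape to expect.
	var p struct {
		Backend int `json:"backend"`
	}
	if err := json.Unmarshal(e.Payload, &p); err != nil {
		panic(err)
	}
	fmt.Println(e.Kind, p.Backend)
}
```

The verbosity concern raised above shows up in the second Unmarshal call: every consumer has to decide on a concrete type before reading anything.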
This is dagger -c 'generate | export .' taking forever... But I can't interrupt it, because it temporarily wiped a bunch of files that aren't supposed to change.
Could anyone help me get the last CI checks to green? https://github.com/dagger/dagger/pull/10770
It doesn't look like the remaining errors are relevant to my code?
Is this normal / expected?
mkdir -p temp/dir temp/otherdir
cd temp/dir
dagger -c 'host | directory ../otherdir'
! path "../otherdir" escapes workdir; use an absolute path instead
Looks like an easy fix. I'm going to remove that check, as it doesn't improve security (the client can already access the whole filesystem via absolute path) and is inconvenient
I am surprised that even when temp is a git repo (tried it), the behavior is the same. Otherwise, it's been a while, i don't recall exactly
Yeah it's hardcoded in the engine
I'm removing it, will see if anyone objects in the code review
It's ancient, IIRC stemming from a concern around an earlier version of modules that doesn't exist at all anymore. So yes please feel free to bulldoze
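A rough sketch of the kind of check being discussed, for readers who want to see why ../otherdir trips it. This is not the engine's code; escapesWorkdir is a hypothetical helper that only mirrors the behavior described above (relative paths that leave the workdir are rejected, absolute paths are not):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// escapesWorkdir reports whether a relative path resolves outside workdir.
func escapesWorkdir(workdir, p string) bool {
	if filepath.IsAbs(p) {
		// Absolute paths are not covered by this check, as noted in the discussion.
		return false
	}
	rel, err := filepath.Rel(workdir, filepath.Join(workdir, p))
	if err != nil {
		return true
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	fmt.Println(escapesWorkdir("/tmp/temp/dir", "../otherdir"))
	fmt.Println(escapesWorkdir("/tmp/temp/dir", "sub/file.txt"))
}
```

Since the same directory stays reachable through its absolute path, a check like this changes ergonomics rather than access, which is the argument made for removing it.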
I seem to have broken integration tests, but in a way that seems impossible?
context: https://github.com/dagger/dagger/pull/10770
Error: Not equal:
expected: "shhh"
actual : "cmd shhh"
Relevant code: https://github.com/dagger/dagger/blob/fd03e5fb5e950361e72d80307fa0eac129523fde/core/integration/module_call_test.go#L553-L602
**My failing test is getting the result from a totally different test???**
I guess that means my implementation creates massive side effects between calls?
but how
I don't really understand how that's even possible. Also naive testing in a dagger shell (same session) shows no issue..
Clearly I did something wrong in the implementation, but no idea what
Oooh maybe it's this:
dagql.Fields[*core.Address]{
dagql.NodeFuncWithCacheKey("secret", s.secret, dagql.CachePerCall) // <-- this
@civic yacht so I wrap { secret } with { address { secret } }
Since the former used dagql.CachePerCall, the wrapper does too. Then the wrapper calls the wrappee with dagql.Server.Select(). Could that somehow cause my issue? Some sort of caching version of crossing the beams?
Relevant function: https://github.com/dagger/dagger/blob/da4f730f32bd034a1f8525b645c6dd4de4080115/core/schema/address.go#L357-L413
dagger's registry auth on windows is a bit more complex than i thought ... Can't seem to download public ubuntu atm and truly don't understand why the docker.json is not satisfied. Any recall of such issue @hasty basin ? Do users just put base64'd username:psswd ?
PS C:\Users\Guillaume\coding\container-use> cat C:\Users\Guillaume\.docker\config.json
{
"auths": {}
}
PS C:\Users\Guillaume\coding\container-use> dagger shell
βΌ container | from ubuntu | terminal 0.3s ERROR
β°ββΌ .from(address: "ubuntu"): Container! 0.3s ERROR
! failed to resolve image "docker.io/library/ubuntu:latest" (platform: "linux/amd64"): failed to resolve source metadata for docker.io/library/ubuntu:latest: failed to
get credentials: error getting credentials - err: exit status 1, out: `A specified logon session does not exist. It may already have been terminated.`
β°ββΌ resolving docker.io/library/ubuntu:latest 0.3s ERROR
! failed to get credentials: error getting credentials - err: exit status 1, out: `A specified logon session does not exist. It may already have been terminated.`
β°ββΌ remotes.docker.resolver.HTTPRequest 0.3s ERROR
β°ββ HTTP HEAD 0.3s ERROR
! failed to resolve image "docker.io/library/ubuntu:latest" (platform: "linux/amd64"): failed to resolve source metadata for docker.io/library/ubuntu:latest: failed to get
credentials: error getting credentials - err: exit status 1, out: `A specified logon session does not exist. It may already have been terminated.`
container-use β
docker pull ubuntu works; Ok a user had it: https://github.com/dagger/dagger/issues/6043. Installed via chocolatey too. Investigating
Update: changing the docker.json to the below fixed it. Investigating:
{
"auths": {
"https://index.docker.io/v1/": {}
}
}
It's mostly the windows creds helper that seems to break -- not sure if it's just my setup
my PR is blocked by a failing "security scan", a vulnerable npm dependency somewhere in the docs... But that's unrelated to my issue, IMO it doesn't make sense to block every unrelated PR on this
I also have a weird failing test in the telemetry suite: https://dagger.cloud/dagger/traces/9ea712debf08a99f5b148f5d3b094444?span=1ad8bacba9bb4456
--- expected
+++ actual
@@ -38,5 +38,5 @@
β viztest: Viztest! X.Xs
βΌ .list(
-β β dir: Host.directory(path: "/app/dagql/idtui"): Directory!
+β β dir: Address.directory: Directory!
β ): String! X.Xs
β .git/
You can run 'go test . -update' to automatically update testdata/TestTelemetry/TestGolden/list to the new expected value.'
How do I do that?
worth checking the flake label: https://github.com/dagger/dagger/issues?q=sort%3Aupdated-desc+is%3Aopen+label%3Akind%2Fflake
that's why the telemetry tests are failing fyi
i've seen the elixir one on main? not quite sure about, still needs someone to dig into it
The viz issue seems legit, the trace did change, and apparently you have to regenerate the golden data to reflect that
Nice thanks. Will rebase
the issue is that the trace is not consistent - that's the source of the flake
sometimes it's the same
OK good to know. So I'll mentally mark that one as a flake
But that diff does make sense. The CLI calls a different API now (address().directory() instead of host().directory())
If there's a test verifying that a given dagger call outputs a specific visualization to the terminal, it's normal that the test fails
I just don't know how that test works, or how to update the expected visualization
oh sure
yeah, just run dagger call test telemetry --update -o .
i missed that, there's also a flake
Meta-feedback: would be cool for cloud to surface specifically failed tests (since we have custom spans for them), and show them in a nice clean list view - separate from trace view
not just cool - amazing
almost like auto-issues that you can list and go through, like a checklist
Right now it requires spelunking deep across several traces
Trying: dagger -m github.com/shykes/dagger@address-api -c 'test | telemetry --update | export .'
(oh man, imagine if you could export straight to the current git branch)
@spark cedar this seems to actually run the test, takes a while - normal?
yeah the tests should run, and output the results
ok
@spark cedar once this is finally green, I'm moving on to jsonvalue, trying an improved API with your feedback in mind
But unfortunately the third PR in the combo (generic env bindings) is stuck on this: https://github.com/dagger/dagger/pull/10744#issuecomment-3097514851
In case you have ideas
(will ping you in the comment for later reference)
@fair ermine I'm getting a deadlock from new Typescript bundling code, while developing new engine API
dagger -m github.com/shykes/dagger@jsonvalue -c 'sdk | go | generate'
Error:
βΌ .withExec(args: ["tsc", "--emitDeclarationOnly"]): Container! 2.7s ERROR
src/api/client.gen.ts(6755,23): error TS2552: Cannot find name 'JsonvalueGetOpts'. Did you mean 'JSONValueGetOpts'?
src/api/client.gen.ts(6770,27): error TS2552: Cannot find name 'JsonvalueGetBoolOpts'. Did you mean 'JSONValueGetBoolOp
src/api/client.gen.ts(6785,26): error TS2552: Cannot find name 'JsonvalueGetIntOpts'. Did you mean 'JSONValueGetIntOpts
src/api/client.gen.ts(6800,27): error TS2552: Cannot find name 'JsonvalueGetJsonOpts'. Did you mean 'JSONValueGetJsonOp
src/api/client.gen.ts(6815,29): error TS2552: Cannot find name 'JsonvalueGetStringOpts'. Did you mean 'JSONValueGetStri
'?
src/api/client.gen.ts(6830,24): error TS2552: Cannot find name 'JsonvalueSetBooleanOpts'. Did you mean 'JSONValueSetBoo
ts'?
src/api/client.gen.ts(6838,24): error TS2552: Cannot find name 'JsonvalueSetIntegerOpts'. Did you mean 'JSONValueSetInt
ts'?
src/api/client.gen.ts(6846,21): error TS2552: Cannot find name 'JsonvalueSetJsonOpts'. Did you mean 'JSONValueSetJsonOp
src/api/client.gen.ts(6854,23): error TS2552: Cannot find name 'JsonvalueSetStringOpts'. Did you mean 'JSONValueSetStri
'?
src/api/client.gen.ts(6863,19): error TS2552: Cannot find name 'JsonvalueUnsetOpts'. Did you mean 'JSONValueUnsetOpts'?
! process "tsc --emitDeclarationOnly" did not complete successfully: exit code: 2
- Re-generating the SDKs requires building the engine...
- ...which requires bundling the SDKs...
- ...which requires up-to-date SDK...
- ...which requires re-generating the SDKs
How do I break the deadlock?
312255260f .dagger/build/sdk.go (Vasek - Tom C 2025-04-22 22:10:46 +0200 158) WithExec([]string{"tsc", "--emitDeclarationOnly"}).
looks like a name capitalization issue?
If anyone is around who could help me out, let me know. I'm spelunking deep in the guts of the engine & sdk build, just so I can develop a trivial API change
That's weird, we already have a case converter
Do you have a PR link?
Also you can repro with the command I shared
dagger -m github.com/shykes/dagger@jsonvalue -c 'sdk | go | generate'
(not specific to go sdk)
(sorry wrong PR link - fixed)
Moving to a thread to avoid flooding #1397223053452120144
@still garnet if I wanted to develop a custom visualizer (eg. an alternative to plaintext or tui) where would I start? Any special knowledge to impart on me that would avoid lots of blind searching?
Module .refresh
developing custom frontends
I started the day with 2 almost-done PRs and failing CI... Finished the day with 2 almost-done PRs and failing CI
Quick question: i'm fighting the windows github runners to use docker on the WSL (and it's starting to win ahahah)
What I want: for docker on windows to handle linux container:
docker version --format "{{.Server.Os}}"
linux
What I get: windows, making it impossible to test the windows dagger inside those runners.
Is there a workaround ? Can't seem to make this wsl work (kernel issues, hyperV disabled)
But, wsl seems to be enabled, so i don't understand why it isn't theoretically possible ?
indeed
I've burnt like 2 days on this before
it feels possible right?
no specific advice, just good luck, windows on GitHub actions is not my favorite
I kinda have it -- dagger still breaks , getting cloooose -- but hacky
one question: if the windows cli connects to a remote engine via the experimental env var, engine located inside the WSL2, would that be acceptable ?
Is the goal to test dagger clients (i.e. the CLI and such) running on windows? If so, then yes the engine being in WSL2 would be fine IMO.
I guess it would be nice to have tests against the "official" docker-on-windows (since that's probably closer to what users do), but we would gain quite a bit just from having tests for dagger clients on windows, so if WSL2 is what we can get working I'd say go with that
Mmh the goal is to have container-use's test suite run on windows following a native Windows support. We have some host code + it's a nice way to test sdk's windows support
Regarding the "official" docker-on-windows, I'm actually working rn against a windows instance that was originally designed for that. We could totally add a user and have it part of the CI (don't know how exactly), but it's kinda here ahah
One idea if you didn't consider it already: you could setup just the dagger client on Windows and then use PARC to connect to a linux engine for the actual tests? Then you don't need to play around with any of the various nested virt problems
Mmmh I did not think of that ahah. Does it work with the sdks ? I'll check it out, thank you
Yeah afaik it should since it's just a _EXPERIMENTAL_DAGGER_RUNNER_HOST setting at the end of the day: https://github.com/sipsma/dagger/blob/8063fd422d5dc453a9acfdecdeb521b633f0bab6/engine/client/drivers/cloud.go#L21-L21, which works with the SDKs
it's a brilliant idea, thanks
Last question: Would your proposed solution (Windows + PARC) be sufficient to test your scenario on dagger/dagger? I'm very close to getting the WSL approach working, but before I discard it, I wanted to confirm whether this would genuinely benefit dagger/dagger, or if sticking exclusively to the PARC approach would be simpler or preferable (for both of us).
"dagger/dagger" would require a little more work because you need to build the engine off the commit being tested, then connect to that via PARC. However I think we have some background plans to do that anyways (cc @astral zealot).
After that, yeah I think this would be a great way to run windows tests in dagger's CI. Nested virt is an absolute shitshow so avoiding it would be great. Honestly we would probably want to convert our existing macos tests over to that approach too because the majority of time in those runs is currently just installing docker
So I would go with the PARC approach if it works for container-use
@civic yacht (can wait til tmrw) - do you remember why we had dagger.CurrentModule() not expose the 'real' *Module that it has inside, and instead only have a super limited interface?
also, what happens if I pass dag.CurrentModule() into another module's function? will it be theirs or the caller's?
ahh i guess it's because it's what provides Workdir() and that can't exist on *Module since it has to be "current" (i.e. coming from a function call that has one)
Yeah that, I think another reason, at least at one time, was that CurrentModule is used by sdks while initializing the Module, so there were some fun recursive calls to avoid.
ah yeah that sounds familiar
so, there's no reason we couldn't expose the underlying Module i guess?
i'm exploring having a LLM.withModule(Module) so it'd be nice to be able to pass your own module in
i can actually get away with using CurrentModule for now but it's kinda funny; it works because I have one module calling a generic agent module; so it passes itself to the other module, which then adds it to the LLM, but also adds its own CurrentModule to the LLM
Running Dagger GraphQL API errors
Yeah can't think why not. Naming will be fun... but always solvable eventually
If anyone has ideas on this error... Last one holding back the new address() API. https://github.com/dagger/dagger/pull/10770#issuecomment-3106859201
--> test span: https://dagger.cloud/dagger/traces/9b496c12795ef4a512ba1d4f73d4e823?span=cade61194ee7bac3
Investigating abnormally slow filesync... possible regression? π§΅
Self calls support in SDKs
address() ci failure
Related to self calls, I have one last issue I don't understand. Who would be the best (and available) for a quick chat so I can share what I have and maybe we can find together the issue. @rancid turret or @fair ermine maybe?
Heading to lunch now, but if it's quick I can hop on a call.
In theory - how hard would it be to implement an ID-to-shell converter?
build any object, boom export a shell snippet that produces it
doesn't seem too hard - have a look at ./cmd/dump-id as a starting point, think you'd just need to translate embedded IDs to $(<recurse>)?
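To make the $(<recurse>) idea concrete, here is a small sketch that walks a call chain and emits a shell-like pipeline, recursing into arguments that are themselves objects. Call and Arg are hypothetical types invented for this example, not the real ID structures used by ./cmd/dump-id, and the rendered flag syntax is a simplification (no quoting, every argument written as a named flag):

```go
package main

import (
	"fmt"
	"strings"
)

// Call is a hypothetical, simplified view of one step in an ID's call chain.
type Call struct {
	Receiver *Call // nil for the first call in the chain
	Field    string
	Args     []Arg
}

// Arg is either a literal or another call chain (an embedded ID).
type Arg struct {
	Name    string
	Literal string
	Nested  *Call
}

// toShell renders the chain root-first, joined with pipes.
func toShell(c *Call) string {
	var steps []string
	for cur := c; cur != nil; cur = cur.Receiver {
		steps = append(steps, renderStep(cur))
	}
	// Steps were collected leaf-first, so reverse them.
	for i, j := 0, len(steps)-1; i < j; i, j = i+1, j-1 {
		steps[i], steps[j] = steps[j], steps[i]
	}
	return strings.Join(steps, " | ")
}

// renderStep renders one call; embedded IDs become $(...) sub-pipelines.
func renderStep(c *Call) string {
	s := c.Field
	for _, a := range c.Args {
		if a.Nested != nil {
			s += " --" + a.Name + " $(" + toShell(a.Nested) + ")"
		} else {
			s += " --" + a.Name + " " + a.Literal
		}
	}
	return s
}

func main() {
	host := &Call{Field: "host"}
	dir := &Call{Receiver: host, Field: "directory", Args: []Arg{{Name: "path", Literal: "."}}}
	ctr := &Call{Field: "container"}
	from := &Call{Receiver: ctr, Field: "from", Args: []Arg{{Name: "address", Literal: "alpine"}}}
	withDir := &Call{Receiver: from, Field: "with-directory", Args: []Arg{
		{Name: "path", Literal: "/src"},
		{Name: "directory", Nested: dir},
	}}
	fmt.Println(toShell(withDir))
}
```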
@civic yacht @rancid turret @still garnet
Since Justin is off, I'm pinging you to review that PR: https://github.com/dagger/dagger/pull/10736
It applies .gitignore to contextual arguments and also adds an ApplyGitIgnore option on Host.Directory so it can be used for other purposes (cf: https://github.com/dagger/dagger/pull/10736#discussion_r2213034069).
Happy to get a review and merge ASAP, since it may improve DX and performance for all our users and solve: https://github.com/dagger/dagger/issues/6627
I was just now looking for this PR, once I copied the URL to open the browser you posted this
I'm too fast
I got an odd flake just now: panic: recovered from errgroup.Group: value method chainguard.dev/apko/pkg/apk/auth.multiAuthenticator.AddAuth called using nil *multiAuthenticator pointer on the client side. I hit retry and the failure went away
Has anyone else seen this panic before? Here's a full gist of the failure: https://gist.github.com/alexcb/18e6be1a42c9e0736668d3eff2d69f5d
I think i'm hitting this again on 0.18.12 and 0.18.14 ? https://github.com/dagger/dagger/issues/10300
edit: nevermind I broke something else!
FTR I found the issue: I (I mean the AI agent) introduced a bug in the JSON support for modules, constructors are not handled. That's why we didn't find it in the code we were looking at. Let's see if AI can fix it
Trivia / call for help... Given a core.TypeDef, how do I check whether it's an object or not? I keep forgetting...
EDIT: Typedef.Kind
@tepid nova request for bikeshedding: https://github.com/dagger/dagger/pull/10747
based on add Directory.withPatch #10742
uses ripgrep --json under the hood (added to engine image)
returns a structured [SearchResult!]! output
also added File.contents(offset: Int, limit: Int) for...
What's wrong with this ID?
dagger -M -c '.core | load-secret-from-id ChV4eGgzOjY1YzNkNTFiYWE2ZjUxNTASWQoVeHhoMzo2NWMzZDUxYmFhNmY1MTUwEkAKFXh4aDM6OWU0MWRkYTc3YzUyZGQ5MhIMCghTZWNyZXRJRBgBGgJpZEoVeHhoMzo2NWMzZDUxYmFhNmY1MTUwElsKFXh4aDM6OWU0MWRkYTc3YzUyZGQ5MhJCChV4eGgzOmM2YzkwZGFlZWE5YzczODASCgoGU2VjcmV0GAEaBnNlY3JldEoVeHhoMzo5ZTQxZGRhNzdjNTJkZDkyEn8KFXh4aDM6YzZjOTBkYWVlYTljNzM4MBJmEgsKB0FkZHJlc3MYARoHYWRkcmVzcyI3CgV2YWx1ZRIuOixvcDovL0Rldi9uMnRyc2w1bndiNTJwaHljY3d3ajZjYWY0cS9wYXNzd29yZEoVeHhoMzpjNmM5MGRhZWVhOWM3Mzgw'
error: parse selections: parse field "loadSecretFromID": init arg "id" value as dagql.ID[*github.com/dagger/dagger/core.Secret] (SecretID!) using
dagql.ID[*github.com/dagger/dagger/core.Secret]: expected "Secret" ID, got SecretID! ID
#10697: default secrets
kinda sounds like you somehow got an ID of an ID lol
oh noz
curious if anyone else has been confused by this, cause it's bit me twice now
https://github.com/dagger/dagger/issues/10788
I need to setup a test branch on a test repo, for a regression test (https://github.com/dagger/dagger/issues/10785)
Where should I do that?
Could I get a quick review & LGTM --> https://github.com/dagger/dagger/pull/10789
It's a tiny patch which fixes #1397614015756374219
@fair ermine if you're around
@tepid nova, quick bikeshed if you can. @fair ermine is adding an option to Host.directory for adding .gitignore exclude patterns. How do you feel about IgnoreVCS? For context: https://github.com/dagger/dagger/pull/10736#discussion_r2228208813.
Could also make it the default and disable with NoIgnoreVCS (not sure if anyone's depending on having locally ignored files but it's the best for most cases).
what's rg that you mention?
ah ripgrep
Yep
OK. I made a suggestion
Calling for bikeshed! --> https://github.com/dagger/dagger/pull/10680
I'm using the query to get a module from a directory, is there a way to get the service ID from the return type if its a service or does the ID only get assigned when its called/running?
query ($id: DirectoryID!) {
loadDirectoryFromID(id: $id) {
name
asModule {
id
name
objects {
asObject {
name
functions {
id
name
description
returnType {
kind
optional
asObject {
name
}
}
args {
name
description
typeDef {
asObject {
name
}
kind
optional
}
}
}
}
}
}
}
}
Also, if you miss a good desktop app/explorer for a GraphQL client, I stumbled across this today and super happy so far! https://altairgraphql.dev/
Hey folks, I am trying to test a simple integration. I am able to run dagger call dev terminal and run it. However, when I do a dagger call cli binary -o ./bin/dagger and try to run it, I get a zsh: exec format error: ./bin/dagger. I tried on main too. Same result.
Hey Nipuna! Do you have any DOCKER_ * or PLATFORM envs set by any chance? I think what's happening here is that the binary that's getting built is for a different architecture
You should be able to check that by calling file ./bin/dagger
That was my suspicion too, but I checked. I have a M4 mac ./bin/dagger: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, BuildID[sha1]=4e7295cf30dec172a17989279b2099972e6ce84a, stripped
@stuck bloom from my memory, what you want is dagger call binary --platform=current -o ./bin/dagger
When an argument of type Platform is optional and not set explicitly, it defaults to the runner's native platform (not the client's).
See my proposal to sand off that rough edge: https://github.com/dagger/dagger/issues/8354
created a new area/cache label in github to track caching issues + prs, i keep getting caching issues lost (fyi @civic yacht @rocky plume, should be useful for theseus)
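As a side note on the exec format error above: the file output shows a Linux ELF binary, which macOS cannot execute even when the CPU architecture matches. A small sketch of telling the two container formats apart from the first bytes of a file (binaryFormat is a made-up helper; it ignores universal/fat Mach-O binaries and 32-bit variants):

```go
package main

import (
	"bytes"
	"fmt"
)

// binaryFormat guesses the executable format from the leading magic bytes.
func binaryFormat(header []byte) string {
	switch {
	case bytes.HasPrefix(header, []byte{0x7f, 'E', 'L', 'F'}):
		return "ELF" // Linux and most other Unix-like systems
	case bytes.HasPrefix(header, []byte{0xcf, 0xfa, 0xed, 0xfe}):
		return "Mach-O 64-bit" // macOS, little-endian magic 0xfeedfacf
	default:
		return "unknown"
	}
}

func main() {
	// Header bytes as they would appear at the start of an ELF file.
	fmt.Println(binaryFormat([]byte{0x7f, 'E', 'L', 'F', 2, 1, 1, 0}))
}
```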
Enum issue inside nested connection
default --platform=current
spent a couple minutes on improving the performance of the version module for building the dagger engine: https://github.com/dagger/dagger/pull/10801
What's the equivalent of COPY --link --from=stage / / in dagger ? How can I merge an entire container image without invalidating cache in subsequent layers ?
there's no explicit syntax for it, but we use the same MergeOp behind the scenes wherever possible for WithDirectory
the logic for when we can do it is here: https://github.com/dagger/dagger/blob/f1bcb9b6dede8fd650185e0566090df4eb78ff6b/core/directory.go#L838-L850
I think there's an issue on dagger main. I was doing some benchmarks on the new gitignore PR and noticed that some executions had finished but were still shown as running on cloud.
I checked out on main, built a dev engine and saw that it also happens on main: https://dagger.cloud/Quartz/traces/c7720ddb15f112bb417490f876772e41?listen=6b48aff4540e8284
It's not happening every time but I made a screen record to show the problem
Here I execute dagger functions multiple times (from the same terminal). My dagger functions finish properly in the terminal and on cloud, but if I go back to a previous attempt, it's still shown as running dagger functions
Yeah, I was trying to repro the bug, but only the 1st attempt triggered it, the 2nd & 3rd ended properly so I showed the one failing after
I cannot repro that consistently
I'll try from 10736, I think it happened more often with it
Yeah no I cannot reproduce that consistently, it's so weird
I have this trace here that also continues to run after completion, but I'm not able to repro it https://dagger.cloud/Quartz/traces/f187305c5d20da88f7d384541e1c85f5
They end with: timeout due to inactivity (https://dagger.cloud/Quartz/traces/ff0626bc57c24af227477514aedfe13e), I think that's something we also see on our CI sometimes
Ok this record is the same, from dagger v0.18.14, might be an issue on dagger cloud then?
Ok I caught it there: with dagger v0.18.14: trace: https://dagger.cloud/Quartz/traces/57b929a4d8811682c82534b10789c063#f4f0d42dba4159ba
I accidentally ran dagger functions a 2nd time, I wanted to run dagger version (my bad)
Looks like it's an issue on cloud, not on the engine, it seems I can repro it when I change my engine version from my dev one to main or from main to dev (which is why it happened a lot during my benchmarks since I was consistently switching engine version)
I opened an issue https://github.com/dagger/dagger/issues/10809 /cc @wild zephyr
@fair ermine the traces show "completed" in cloud now, don't they?
I think it's related to the clickhouse aggregations. It might take a few seconds after the trace is done to compute the final state of the whole trace which is what's reflected in cloud
Since the data is partitioned by OrgID, if your Org doesn't have that much traffic, it's likely that you see this more frequently
No it says timeout
thx. Mind sharing the trace URL please?
rfc on https://github.com/dagger/dagger/issues/10376
we need a way to create secrets from the results of a WithExec - currently, users are just doing this unsafely, and potentially leaking secrets.
i don't think an impl will be too hard, just want to start the convo going, since it feels like a fiddly api (and we need to make it as easy as possible to avoid the chance for misuse)
rfc on https://github.com/dagger/dagger/issues/10826
this would be a pretty big change to how we manage some dependencies - but imo, we've had this ask enough times that we really need a solution here.
What would be the best solution to add an experimental flag, to gate a feature (self calls) while we improve performance? Is it best to use a simple env var or to control it through the engine.json?
having a little chat in #911305510882513037 rn
@still garnet is this error as opaque to you as it is to me?
engine logs:
container-use error upstream:
Error: failed to write file: failed applying file edit,
skipping git propagation: failed to apply patch.
got a trace link or anything?
there should be more info written to the WithPatch span logs
is there a trick for getting cu-generated dagger traces? i have not tried to do that before
not really - i just look at recent local traces in cloud
cool ill try to get one
oh hey it's in the logs
traceID=3a3... => https://dagger.cloud/dagger/traces/3a3b57ce1aa4458209f44405e3ed32a1
https://dagger.cloud/dagger/traces/3a3b57ce1aa4458209f44405e3ed32a1?span=3cf3c93f8ae6c783
error: patch failed: CLAUDE.md:17
error: CLAUDE.md: patch does not apply
*shrug* - guess there's not much git can say here
weird though cause we generate the patch based on a successful find/replace
lol yeah and i've been stochastic-monkey-gun debugging it with claude for a minute now and it can not repro
there is one case where this happens if the agent supplies search_text == replace_text
but there's some other case where search_text != replace_text that doesn't repro under units
ah yea makes sense - not sure what "patch" would be generated for that, and if things handle an empty patch nicely (probably not). should probably be a special case somewhere
the no-op patch ends up being ""
fair enough
i also have a native File.withReplaced API sitting on my coding agent branch, which would simplify things here, but needs bikeshedding
(among 1000 other things...)
it kinda feels like a string escaping problem to me, but i'd also think that that'd get caught by stochastic monkey debugging
like if it finds the search text, but the search text is truncated due to mishandled escapes, and the replace text is also truncated to the exact same spot, then bam you get a diff that's "" even though your search had hits?
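The "special case somewhere" suggested above could look roughly like this: reject an edit whose search and replace text are identical before any patch is generated, so the empty-patch case never reaches git. applyEdit and both error values are hypothetical names, not container-use's or dagger's actual API:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var (
	errNoOp     = errors.New("search_text and replace_text are identical")
	errNotFound = errors.New("search_text not found")
)

// applyEdit replaces the first occurrence of search with replace.
// It reports no-op edits explicitly instead of producing an empty diff.
func applyEdit(content, search, replace string) (string, error) {
	if search == replace {
		return content, errNoOp
	}
	if search == "" || !strings.Contains(content, search) {
		return content, errNotFound
	}
	return strings.Replace(content, search, replace, 1), nil
}

func main() {
	_, err := applyEdit("roots: true", "roots", "roots")
	fmt.Println(err)
	out, _ := applyEdit("roots: true", "true", "false")
	fmt.Println(out)
}
```

This only covers the search_text == replace_text case; it would not explain the other failure, where the two strings differ and the patch still does not apply.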
on main of container use rn claude "there's some inconsistent prompt engineering in our env configuration for developing this repo with an agent, specifically current max version we support is (i guess the current one might be the newest, idk) and remove the roots thing cuz we don't actually support roots. the prompt engineering should also instruct the agent that container-use can't be tested in its own /workdir, a new git repository must be first initialized somewhere else on disk, like /tmp/test-repo. make sure you manually test an instructions that you write into those files." repros the error super consistently with tool calls that look like this:
βΊ container-use - environment_file_edit (MCP)(environment_source: "/Users/braa/src/container-use",
environment_id: "regular-shiner",
target_file: "CLAUDE.md",
search_text: " - Initialize: '{\"jsonrpc\":\"2.0\",\"method\":\"initialize\",\"params\":{\"protocolVersion\":\"2024-11-05\",\"capabilities\":{\"roots\":{\"listChanged\":true},\"sampling\":{}}},\"id\":1}'",
replace_text: " - Initialize: '{\"jsonrpc\":\"2.0\",\"method\":\"initialize\",\"params\":{\"protocolVersion\":\"2024-11-05\",\"capabilities\":{\"sampling\":{}}},\"id\":1}'",
explanation: "Removing roots capability from initialize request example")
finally finished up defaultGit: https://github.com/dagger/dagger/pull/10847
part of my large saga of improving git interactions in dagger
This should fix the errors I started to have today when building the engine locally: https://github.com/dagger/dagger/pull/10856
@civic yacht (and anyone else) do you have any thoughts on https://github.com/dagger/dagger/pull/10803#discussion_r2259859069 ? It's related to ExposedPorts(ctx) returning ports from an imported container (either via From or Import); however no port healthcheck is ever created for the returned ports.
do mounted cache volumes have any "weird" interactions with conventionally LLB-built containers? i'm trying to work out if it's feasible to run a (possibly read-only) LSP pointed at a cachevolume where another pipeline is modifying files using WithExecs and WithFiles. Like there's no problem doing like container.WithMountedCache().WithFile(/cache/x).Container().WithExec().Container() etc etc, right? it's not obvious to me in this case if the fs layering "just works" as normal
halloooo, i've got a few pieces of work blocked on open PRs:
- https://github.com/dagger/dagger/pull/10847
- https://github.com/dagger/dagger/pull/10810
- https://github.com/dagger/dagger/pull/10723
kind of a collection of various things, but would appreciate a look from anyone who's got some spare capacity π
also, rfc on live development: https://github.com/dagger/dagger/issues/10895
i've got some very early feedback from some of the team already, so i'm convinced it's not entirely the wrong direction, but would like a lot more feedback before kicking the implementation off
it's a big enough chunk of work that once we've got consensus, i'll probably make a project proposal (for our internal project tracking)
@spark cedar are these weird python errors in my CI build related to the gql dependency issue?
yup, just need an approval, but lots of folks are off today
Fixes #10885.
We need to release this ASAP.
Sorry I missed your reply. Just approved.
@spark cedar I'm also seeing lots of failed "scan" checks in CI, is that a known issue in main? Anything I can help with?
Fun fact: you can't call our codegen pipelines in the middle of a rebase, because our version module requires calling git checkout, which fails in the middle of a rebase
so me and [other container-use users](#container-use message) have been hitting some weird hanging behavior, not sure the exact cause, but when it happens i see the engine failing grpc healthchecks, api calls that take forever, etc, etc... i found 1 interesting engine log in addition to the healthcheck failures:
time="2025-08-18T20:47:19Z" level=error msg="ref evaluate error: evaluating released result\ngithub.com/moby/buildkit/solver/llbsolver.(*resultProxy).Result.func2\n\t/go/pkg/mod/github.com/dagger/buildkit@v0.0.0-20250708131355-3c56a47e3f5c/solver/llbsolver/bridge.go:327\ngithub.com/moby/buildkit/util/flightcontrol.(*call[...]).run\n\t/go/pkg/mod/github.com/dagger/buildkit@v0.0.0-20250708131355-3c56a47e3f5c/util/flightcontrol/flightcontrol.go:122\nsync.(*Once).doSlow\n\t/usr/lib/go/src/sync/once.go:78\nsync.(*Once).Do\n\t/usr/lib/go/src/sync/once.go:69\nruntime.goexit\n\t/usr/lib/go/src/runtime/asm_arm64.s:1223" caller stack="goroutine 3492744 [running]:\nruntime/debug.Stack()\n\t/usr/lib/go/src/runtime/debug/stack.go:26 +0x64\ngithub.com/dagger/dagger/engine/buildkit.(*ref).Result(0x401dbeb038, {0x33b1918?, 0x40bcaf46f0?})\n\t/app/engine/buildkit/ref.go:237 +0x94\ngithub.com/dagger/dagger/engine/buildkit.(*ref).getMountable(0x401dbeb038, {0x33b1918, 0x40bcaf46f0})\n\t/app/engine/buildkit/ref.go:211 +0x34\ngithub.com/dagger/dagger/engine/buildkit.(*ref).StatFile(0x401dbeb038, {0x33b1950?, 0x40a62152c0?}, {{0x40aebea210?, 0x0?}})\n\t/app/engine/buildkit/ref.go:166 +0x54\ngithub.com/dagger/dagger/core.(*File).Stat(0x403eaa19a0, {0x33b1950, 0x40a62152c0})\n\t/app/core/file.go:283 +0x1e0\ngithub.com/dagger/dagger/core.(*Container).File(0x33a4b50?, {0x33b1950, 0x40a62152c0}, {0x4027b0d6e4, 0x9})\n\t/app/core/container.go:1021 +0x60\ngithub.com/dagger/dagger/core/schema.(*containerSchema).file(0x0?, {0x33b1950, 0x40a62152c0}, 0x40685738c8, {{0x4027b0d6e4?, 0xdf4d00?}, 0x0?})\n\t/app/core/schema/container.go:1651 +0x78\ngithub.com/dagger/dagger/dagql.FuncWithCacheKey[...].func1({{0x40a3626380?, 0x40685738c8?, 0x0}, {0x0, 0x1, 0x40b9711830, 0x40b9686f78, 0x40b96a1040}}, {{0x4027b0d6e4, 0x0}, ...})\n\t/app/dagql/objects.go:802 +0x6c\ngithub.com/dagger/dagger/dagql.NodeFuncWithCacheKey[...].func1({{0x40a3626380, 0x40685738c8, 0x0}, {0x0, 0x1, 0x40b9711830, 0x40b9686f78, 
0x40b96a1040}}, 0x401bcd7980, {0x0, ...})\n\t/app/dagql/objects.go:854 +0x118\ngithub.com/dagger/dagger/dagql.Class[...].Call(0x341bc40, {0x33b1950, 0x40a62152c0}, 0x402cc3df01?, {{0x40a3626380, 0x40685738c8, 0x0}, {0x0, 0x1, 0x40b9711830, ...}}, ...)\n\t/app/dagql/objects.go:302 +0xf0\ngithub.com/dagger/dagger/dagql.ObjectResult[...].call.func2()\n\t/app/dagql/objects.go:670 +0xa0\ngithub.com/dagger/dagger/engine/cache.(*cache[...]).GetOrInitializeWithCallbacks.func1()\n\t/app/engine/cache/cache.go:210 +0x64\ncreated by github.com/dagger/dagger/engine/cache.(*cache[...]).GetOrInitializeWithCallbacks in goroutine 3492743\n\t/app/engine/cache/cache.go:208 +0x604\n" client_hostname=Connors-MacBook-Pro.local client_id=fxf5i3l6vv4r3jbhyr3og62jo session_id=hzr3n8a75dr8nm4vsxr1ozywv spanID=32b338ec5845db45 traceID=85317dc1a7c300e5bf79f2ed06cdb40b
that occurred after I shut down all my dagger clients. prior to doing that, all dagger api calls were just stuck, including non-container-use calls. one time last week i also tried running dagger call engine local-cache prune during one of these hang states and it took over 25m before i sigkilled it
cc @still garnet cuz you're seeing similar things on your branches (im on the released version) and @civic yacht cuz well caching and refs and bk seem to be in play here.
@leaden glade @spark cedar I have a question related to https://github.com/dagger/dagger/pull/10847#issuecomment-3179906799
I'm trying to remember why we called it +defaultPath and not +default, and see if the proposed address() PR gives us a way to avoid fragmentation
New discussion topic for your consideration @leaden glade @spark cedar (and anyone else who is around): generic ID... 🧵
i found myself wanting this lol π
@justin is this test failure related to the regression we're fixing in 0.18.16? https://dagger.cloud/dagger/traces/68641237f1965487839d31551137f348?span=2ac792e59226ffa6
I'll just have GHA re-run (hate how hidden that button is..)
(side note, we need better support for services that listen on unix sockets)
we can't health check them, i think that's likely the source of this problem
ah
withExposedUnixSocket? π
Feels weird to attach that stuff to Container, but it would be consistent with Container.withExposedPort
have an issue https://github.com/dagger/dagger/issues/10911
π’ fyi, tests on main are failing because of a bitbucket outage
(just the tests where we're testing integration with that)
potentially good news regarding all the weird engine connection blocking behavior i've been seeing lately on container-use: i've been stress testing cu + dagger 0.18.16 for >30m and so far i've been completely unable to repro.
there was also a weird bug coming out of environment_file_edit that i've only repro'd once so far, so ~1.8 birds with one stone if we're lucky π€ 
no idea which engine commits might've fixed all this, but y'all have been cookin' π
lol i jinxed it
now i've got the engine blocking periodically for longer and longer... like >4m calls to cache-hitful dagger functions
virtually no interesting engine logs, though
2 separate engine blocks in the course of a single dagger functions call that normally takes 8s
and after getting it consistently reproing, exiting all the container-use servers, dagger functions failed like this:
failed to get configured module: Post "http://dagger/query": command [docker exec -i dagger-engine-v0.18.16 buildctl dial-stdio] has exited with exit status 137, make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=
gonna repro again on a dev-built one with debug servers enabled
π Self Calls/Invocation PR is finally π’
The PR includes an experimental flag to enable/disable it (dagger init --with-self-calls, dagger develop --with-self-calls) and a few integration tests (in Go) to ensure the flag and self calls are working.
The Go, Python and TypeScript SDKs are compatible with self calls; others will come a bit later, probably starting with Java, as a big part has already been done.
This is for sure not a small PR; it contains changes in a lot of different places. And even though I tried to keep the commits as clean as possible, a lot of them were rewritten over time, so they are not as nice as I'd wanted.
It even includes a fix to the .NET SDK, as it looks like I was the first to add a list of enum values, something that wasn't implemented in .NET (I haven't written C# for almost 20 years).
I'd love to have some eyes on it, and I'm happy to help with the review if needed, explain how it works, etc.
https://github.com/dagger/dagger/pull/10584
I don't know if that's a good idea or not π
but I played with a CombinedOutput flag on WithExec. I'd sometimes like to get both streams at once, to print the full output for instance, and retrieving stdout and then stderr isn't great since I can only put one after the other.
https://github.com/dagger/dagger/pull/10924
My change is very simple, and it surprises me. So maybe I'm missing something...
yes please!!
is the idea that stdout() and stderr() just return the same thing when that argument is set?
In the current version I've made, stderr is redirected to stdout (and not the other way) so by querying stdout we get both. But if we retrieve stderr it's only the standard error.
π€
If I stick with this behaviour, maybe I should rename the option to something like RedirectStderrToStdout so it's clear it's only one way.
you could also just create a third output meta file, called combined in the executor_spec.go
we already do some funky pipe manipulation
or
I could just read the pr where vito already suggested that
π€¦ββοΈ
alrighty, i've acquired 2 goroutine dumps during the engine-wide hangs, perusing now... in both i've got a fairly deep and stuck solver isDep stack, some waits in netpoll reading, multiple flightcontrol waits inside the bk llb solver... any tips on what i should be looking for here? cc @civic yacht @still garnet
if you grab them with ?debug=2 you can copy them into https://vito.github.io/swirly/ to visualize
it reads like all of the bk network resources (grpc server, engine read-side, progress) are blocked on something but it's unclear what.... there's no obvious cycle going on in the isDep stack either, but the shape of the calls also don't match what im reading in the bk code (should be 2 edge structs as receiver and arg, but the stack shows {pt, pt}, {pt, pt})
tbh the phat stack of isDep is probably the culprit... realizing i took dumps over a 10 minute window and the latter 3 have the same goroutine working on the same phat isDep stack, always runnable
and it's not like this blocks permanently, just for an uncomfortably long time.
the rest of the lock acquisition and chan recv blocking could all feasibly be waiting for the result of the isDep stack, which is downstream of the central dispatch loop
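A quick way to confirm the "same goroutine, same stack, every dump" observation without eyeballing it, sketched under the assumption that the dumps are in the text format where each goroutine is a block starting with `goroutine <id> [state]:` and blocks are separated by blank lines. The helper names are hypothetical and the match is a plain substring test on the stack text:

```go
package main

import (
	"sort"
	"strconv"
	"strings"
)

// goroutinesWithFrame returns the IDs of goroutines in one dump whose stack
// text contains frame.
func goroutinesWithFrame(dump, frame string) map[int]bool {
	ids := map[int]bool{}
	for _, block := range strings.Split(dump, "\n\n") {
		header, stack, ok := strings.Cut(strings.TrimSpace(block), "\n")
		if !ok || !strings.HasPrefix(header, "goroutine ") {
			continue
		}
		fields := strings.Fields(header) // "goroutine", "<id>", "[state]:"
		if len(fields) < 2 {
			continue
		}
		id, err := strconv.Atoi(fields[1])
		if err != nil {
			continue
		}
		if strings.Contains(stack, frame) {
			ids[id] = true
		}
	}
	return ids
}

// persistentGoroutines lists, in ascending order, the goroutine IDs that
// show frame in every one of the given dumps.
func persistentGoroutines(dumps []string, frame string) []int {
	counts := map[int]int{}
	for _, d := range dumps {
		for id := range goroutinesWithFrame(d, frame) {
			counts[id]++
		}
	}
	var out []int
	for id, n := range counts {
		if len(dumps) > 0 && n == len(dumps) {
			out = append(out, id)
		}
	}
	sort.Ints(out)
	return out
}
```

Calling `persistentGoroutines(dumps, "isDep")` over dumps taken a few minutes apart would list the goroutines that never left that stack.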
ah... i think this is edge merging? @civic yacht
glad once i read the code and then searched for it in the discord i immediately see that you pushed trying to fix it 15m ago XD
Hey, the new directory exists function is a great addition, but why didn't we add it to the File type?
is there any specific reason?
if you've constructed a File, it exists
if you want to check it exists, just call Sync
it exists if that call succeeds
yea, i know and use sync to test this, it just feels weird DevEx-wise that Container and Directory have exists but File doesn't.
I just randomly found an unused dependency in one of our modules... Is there a way we could automate detecting this? It could help us improve loading times
Mind sharing the module?
dagger/dagger/docs
@tepid nova IIUC you're saying that dagger/dagger/docs is not currently used by dagger-dev?
according to Cloud, it is here https://dagger.cloud/dagger/mod/github.com/dagger/dagger/docs?tab=dependents
or is it some other module that wasn't using docs?
no it's docs that isn't using php-sdk-dev
(or at least I couldn't find anywhere it uses it)
https://dagger.cloud/dagger/mod/github.com/dagger/dagger/sdk/php/dev?tab=dependents according to Cloud it's using PhpSdkDev.base call
or it was using it 3 days ago..
Yep. Seems like it was recently removed here https://github.com/dagger/dagger/commit/4fa61fddb8547c72b4514117f4403defd73604c4#diff-650caae762f8a97ee4f5e0859db50e7f0150dedbf39dd93e4264db929eb6aa7f
yeah I don't think telemetry is enough to reliably know if a dependency is used or not.
In the current state it's not, but it could be improved to do so, e.g. if we knew which entrypoint function was used in the trace, we could tell when dependency functions drop out of the call graph. The main limitation today is that we don't have that information when we're instrumenting cross-module function calls
but just because a dependency has never been called doesn't mean there isn't a code path that can call it. You need static code analysis to know for sure.
We could use telemetry to give hints, though. "we've noticed a dependency hasn't been called in a while, are you sure it's used in the code?"
@leaden glade let's dogfood self-calls in our own dagger module π I'm going crazy looking at traces with 1000 low-level calls in a row...
I'll have a look π
Thanks! I think it will be the perfect showcase π