#Environment API
Would this be more or less a replacement for the workspace concept?
Well "workspace" was a usage pattern, prominently featured in our early examples, but was never a core concept in the API
So that usage pattern would continue, just built on this new version of the core API
type LLM {
  model: String!
  withPrompt(prompt: String!): LLM!
  history: [LLMMessage!]!
  lastReply: String!
  withEnvironment(environment: Environment!): LLM!
  environment: Environment
}
type Environment {
  with[Type]Binding(key: String!, value: [Type], overwrite: Bool, overwriteType: Bool, includeFunctions: [String!], excludeFunctions: [String!]): Environment!
  bindings: [Binding!]
  binding(key: String!): Binding
  encode: File!
}
type Binding {
  key: String!
  as[Type]: [Type]!
}
@vital copper @hallow steppe I am calling it Environment, call it a gut feeling... Maybe this Environment type is not just for LLMs... Maybe it's the missing piece for streamlining shell/llm integration. And perhaps also... for this? https://github.com/dagger/dagger/issues/9584
Nice! So to oversimplify it, an Environment is a collection of objects
Yes, it's a collection of bindings, which are basically named objects
(if you want to split hairs, bindings can be to any value in the Dagger API - which could be an object, scalar or array)
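A minimal sketch of the "collection of named values" idea, assuming a plain in-memory model. The Go types and method names below are illustrative stand-ins for the schema above, not the Dagger SDK; `WithBinding` collapses the per-type `with[Type]Binding` fields into one untyped helper.

```go
package main

import "fmt"

// Binding is a named value. Value may hold an object, a scalar, or an array.
type Binding struct {
	Key   string
	Value any
}

// Environment is an immutable, ordered collection of bindings.
type Environment struct {
	bindings []Binding
}

// WithBinding returns a copy of the environment with key bound to value.
// It refuses to replace an existing key unless overwrite is true.
func (e Environment) WithBinding(key string, value any, overwrite bool) (Environment, error) {
	next := Environment{bindings: make([]Binding, 0, len(e.bindings)+1)}
	for _, b := range e.bindings {
		if b.Key == key {
			if !overwrite {
				return e, fmt.Errorf("binding %q already exists", key)
			}
			continue // drop the old value; the new one is appended below
		}
		next.bindings = append(next.bindings, b)
	}
	next.bindings = append(next.bindings, Binding{Key: key, Value: value})
	return next, nil
}

// Binding looks up a binding by key.
func (e Environment) Binding(key string) (Binding, bool) {
	for _, b := range e.bindings {
		if b.Key == key {
			return b, true
		}
	}
	return Binding{}, false
}

// AsString mirrors the as[Type] getters for the string case.
func (b Binding) AsString() (string, bool) {
	s, ok := b.Value.(string)
	return s, ok
}

func main() {
	env, _ := Environment{}.WithBinding("assignment", "write a curl clone", false)
	env, _ = env.WithBinding("ports", []int{80, 443}, false) // arrays are values too
	b, _ := env.Binding("assignment")
	s, _ := b.AsString()
	fmt.Println(s, len(env.bindings))
}
```

Each call returns a new Environment instead of mutating the receiver, which matches the chainable `with...` style of the schema.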
ah right
would stray us further from .envrc interop, though... like i imagine if we had a file that was a keyed collection of dagger objects, to get the objects you'd dagger-subshell, i imagine, and then .envrcs are out the window
but I've used "object" to sometimes mean "value"
that part is definitely left ambiguous at the moment
if we had a file that was a keyed collection of dagger objects, to get the objects you'd dagger-subshell
I did not understand this part, sorry
So my definition of Environment gets expanded based on my types (if we get self calling) and dependencies. So I have to be careful with that because I shouldn't use Environment as an argument or return type since I can't pass those types around
Yes, Environment becomes the magical type, and as a result it's complicated to pass it around between modules. Most likely we will block that until we have a way to make it work.
BUT: that's exactly the situation for LLM today. So we don't create a new problem, we just scope it to a smaller type
OK now get this
what if the shell added a builtin .save and .load for saving and loading the current environment to a standard file?
then we define a filename convention for dagger to load these files automatically at standard locations. CLI flags to load from custom location... Basically what we discussed for .env, but with Dagger env files instead
mmm typed env
Now, these files unlike .env and other primitive environment files, are 1) typed, and 2) safe to share because they contain secret references, not values
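To make "typed, and holding references rather than values" concrete, here is a rough sketch of an encoder and decoder for such a file. The `KEY:type=data` line format, the two type names, and the function names are all invented for illustration; no file format has been decided in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is one line of the hypothetical typed env file.
type Entry struct {
	Key  string
	Type string // "string" or "secret" in this sketch
	Data string // a literal for strings, a reference URI for secrets
}

// Encode writes one "KEY:type=data" line per entry (assumed format).
func Encode(entries []Entry) string {
	var sb strings.Builder
	for _, e := range entries {
		fmt.Fprintf(&sb, "%s:%s=%s\n", e.Key, e.Type, e.Data)
	}
	return sb.String()
}

// Decode parses the format written by Encode and rejects unknown types.
func Decode(text string) ([]Entry, error) {
	var entries []Entry
	for _, line := range strings.Split(strings.TrimSpace(text), "\n") {
		if line == "" {
			continue
		}
		head, data, ok := strings.Cut(line, "=")
		if !ok {
			return nil, fmt.Errorf("missing '=' in %q", line)
		}
		key, typ, ok := strings.Cut(head, ":")
		if !ok {
			return nil, fmt.Errorf("missing type in %q", line)
		}
		if typ != "string" && typ != "secret" {
			return nil, fmt.Errorf("unknown type %q", typ)
		}
		entries = append(entries, Entry{Key: key, Type: typ, Data: data})
	}
	return entries, nil
}

func main() {
	text := Encode([]Entry{
		{Key: "GH_TOKEN", Type: "secret", Data: "op://foo/bar"}, // a reference, never the token itself
		{Key: "GREETING", Type: "string", Data: "hello"},
	})
	fmt.Print(text)
	entries, err := Decode(text)
	fmt.Println(len(entries), err)
}
```

The file only ever contains the secret's reference, so sharing it leaks where the secret lives but not its value.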
you could even sync them via your Dagger Cloud org - theoretically
And... you can plug them straight into a llm
I would probably still have my LLM configuration in regular env vars from my .bashrc (or similar) since it's global
Then, we can add default values for all argument types in modules, starting with secrets
func New(
	// +default="$GH_TOKEN"
	token *dagger.Secret,
)
--> Loaded with dag.CurrentEnv().Binding("GH_TOKEN").asSecret()
humans and robots sharing the same environment, co-existing peacefully (what could go wrong)
cc @blazing tangle @fading iron for MCP implications
(added Environment.encode(), previously written as Environment.export())
i might be over-assuming things about how you can construct an env, but if we wanna have a .env file, and you wanna provide an object to the shell by default, that file is gonna have a K/V format:
SOME_SECRET=file:///my-api-key
NOT_SECRET="just-a-value"
CTR_DAGGER_OBJ=$(container | from alpine) # dagger shell
export FLYIO_KEY="$(flyctl auth token)" # typical .envrc line
Yeah so far I've been assuming we would want to piggyback on the actual .env format, for familiarity and ubiquity. But, I tried in practice and there's a fatal flaw: secret references
While it's super convenient to save dagger secret references (eg. op://foo/bar) directly in the .env, and have dagger automatically resolve them... It's not sustainable because it breaks other tools
so we can either 1) truly co-exist in the .env ecosystem, or 2) take advantage of dagger secret references, but we can't do both
I choose 2
So this would be our own file format, our own filename, and it would not be compatible with .env
idk, feels like maybe a false choice, could we not have both .denv and a .env?
.denv does sound way more powerful, but .env sounds way more convenient (i already have those files and don't wanna keep them in sync with another format)
Quite possible we could still support .env. But it would no longer be the bearing wall of our env configuration system. So for the purposes of designing that bearing wall, I am discarding it as a viable option
yeah agreed that makes sense to me, at the very least .env is very loosely coupled to the more exciting option lol
trying to wrap my head around having a .denv with CTR_DAGGER_OBJ=$(container | from alpine) where I wouldn't have it in code.
If I have a frequent flow I'm doing in shell, I'll write a function for that. I don't want to automatically have a bunch of boilerplate bindings in the environment unless I really want LLM to see them
I guess it makes defaulting secrets really easy. So that is a win
there are a bunch of use cases if we're cool with findupping out of the repo root, like user-specific config vars, but idk if that's in scope here.
if this is a file that's typically checked in, then you're prolly right that it's like secret defaults and maybe other convenience objects? tbh for the secret defaults is there a reason we can't put those in SDK code?
@hallow steppe yeah the goal is not to replace your script logic with a serialized object (although it's intriguing that those two overlap). It's more for those snippets of script that are really env-specific configuration, like GH_TOKEN=$(secret op://sdfkjhsdkjfhsdjf/sdfdsf) or DOCKER_SOCKET=$(host | unix-socket /var/run/docker.sock)
Nice. if it's committed, is it tied to a module's context, or is it only relevant for the client's local shell?
Would be relevant to the local shell. So, safer to commit than .env thanks to secret references, but still not magically portable, and not safe to use one from a stranger
technically if a hacker can get you to 1) load their env file, and 2) run their module, then the evil module could collaborate with the evil env file to steal secrets and files via default args
Default args will probably require a prompting / trust system (which luckily we just built for LLM access)
Module github.com/superhacker/l33th4x0r requests access to binding GH_TOKEN (type Secret). Accept?
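A sketch of what such a trust gate could look like, assuming approvals are remembered per (module, binding) pair. The `Gate` and `Request` types and the injected `ask` callback are hypothetical; the existing trust system for LLM access is not shown in this thread.

```go
package main

import "fmt"

// Request describes a module asking for a binding to fill a default argument.
type Request struct {
	Module  string
	Binding string
	Type    string
}

// Gate remembers which (module, binding) pairs the user has approved.
type Gate struct {
	approved map[string]bool
	ask      func(question string) bool // injected so the gate can be tested without a terminal
}

func NewGate(ask func(string) bool) *Gate {
	return &Gate{approved: map[string]bool{}, ask: ask}
}

// Allow returns true if the request was approved before or is approved now.
func (g *Gate) Allow(r Request) bool {
	id := r.Module + "\x00" + r.Binding
	if g.approved[id] {
		return true
	}
	q := fmt.Sprintf("Module %s requests access to binding %s (type %s). Accept?",
		r.Module, r.Binding, r.Type)
	if !g.ask(q) {
		return false
	}
	g.approved[id] = true
	return true
}

func main() {
	gate := NewGate(func(q string) bool {
		fmt.Println(q)
		return false // this demo always declines
	})
	ok := gate.Allow(Request{
		Module:  "github.com/superhacker/l33th4x0r",
		Binding: "GH_TOKEN",
		Type:    "Secret",
	})
	fmt.Println(ok)
}
```

Denials are deliberately not cached here, so a declined module prompts again next time; whether that is the right UX is an open choice.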
Got it. I was thinking maybe a module could define its base images in a .denv but it sounds like no
maybe, now it feels like things are getting complicated. maybe sticking to the local .env style is better
sticking with .env as in piggy-backing on the existing format?
or do you mean having our own .denv, just managing it just like a .env?
yeah this. So it's specific to my local client and not magically tied to module context
I feel like we got pretty far from the original Environment proposal
Not really - it just seems to all connect really well
(lots of hypothetical layers built on top for sure)
So we've gotten pretty far in the same way the 30th floor is far from the basement. But still the same building, foundation is strong!
Yeah totally, but we didn't hash out the design of Environment itself, unless everyone's good with it as-is!
I'm just waiting for bikeshedding
what if we tied the current query object (with available modules) to the Environment? could streamline how we do "llm.withQuery" @vital copper
does the shell have access to "dag.CurrentModule()"? is that conceptually what we should give the environment?
I don't think you can access the core API from there, but it would make sense since different modules can have different views of the core API
So conceptually it makes sense to query the core API as seen by a particular module
I didn't expect to find this conversation in the agent channel but I'm pleasantly surprised to see all the discussions around .env and .denv. That would be amazing to have! Right now every single command to my modules has password, user and sometimes token parameters that come from the host env. Would be great DX to externalize that to a .env
I think we will end up supporting .env as well as a superior format that takes advantage of Dagger secret references and other types
I like .env files too but they are a bandaid for secrets since you have to store the plaintext value in the file
.denv (name tbd) would be that superior format - would that work for you, or are you locked into .env specifically?
No, I am not locked in. I may actually prefer the more powerful format which can also be checked in to the repo? For example, I'd want dagger to resolve MYSECRET=env:MYPASSWORD even in CI, if not provided via CLI flag. Assuming the CLI flag takes precedence. That will let us keep the CI calls less verbose too.
I in fact created a module to ingest .env (read and set as env var) in my workflow because we have various other env vars (secret and non-secret) that we want to pass into the module.
I kind of like sticking with .env, that's a known format that IDEs know and support, and it's widely adopted. The thing that's tricky here is that .env was designed to not be checked into your git repo and made to tweak your app to act like it's running in an environment where those values are set as "real" environment variables
To somewhat counter that, the Symfony project has a built in way to encrypt those values and store them in the repo (it references the vars as PHP files (or YAML) but same concept). It requires a single key on the environment to decrypt the values at runtime.
I could see that as maybe a way for us to "checkin" what variables the Dagger CLI can access from the environment - I know we sandbox what is sent from the host machine - without having to be so verbose on what we pass to shell?
That specific example kind of overly complicates the existing secrets support, but the .denv example made me think of it
taking a stab at this
resuming... @vital copper should I start from your "eval" branch?
@dark maple yeah go for it
it's getting there, just integrating the ambient var access stuff now
had green CI earlier
Nice, are you aiming to merge today?
has the moment for QueryID arrived?! https://github.com/dagger/dagger/blob/548f702b2dd168b007ac2914401ed39fc1088a51/dagql/server.go#L95-L98
just getting caught up - love the idea, my biggest question is whether this file will be committed to the repo or not. I agree that since part of the point is to avoid secrets being in plaintext, it's now safe to commit, but I suspect in practice people will need local overrides, and if it's committed to the repo there's a risk of people accidentally committing their local changes. So I think we'll want to have an obvious way to have local overrides. Maybe .denv.local?
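The precedence being discussed (CLI flag first, then a local override file, then the committed file) boils down to a first-match lookup across layers. A small sketch, assuming each source has already been parsed into a key/value map; `.denv.local` and the `env:` reference syntax are taken from the messages above and are not settled.

```go
package main

import "fmt"

// Layer is one source of bindings, mapping a key to a value or reference.
type Layer map[string]string

// Resolve returns the value for key from the first layer that defines it.
// Callers pass layers from highest to lowest precedence.
func Resolve(key string, layers ...Layer) (string, bool) {
	for _, l := range layers {
		if v, ok := l[key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	flags := Layer{}                                                 // nothing passed on the command line
	local := Layer{"MYSECRET": "env:MY_LOCAL_PASSWORD"}              // e.g. .denv.local, not committed
	committed := Layer{"MYSECRET": "env:MYPASSWORD", "REGION": "eu"} // e.g. .denv, committed

	v, _ := Resolve("MYSECRET", flags, local, committed)
	fmt.Println(v) // the local override wins over the committed default
	v, _ = Resolve("REGION", flags, local, committed)
	fmt.Println(v) // falls through to the committed file
}
```

In CI, where no local file exists and no flag is passed, the committed layer is the only one that answers, which keeps the CI invocation short.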
fyi @vital copper this is what I'm trying to rebase on your branch
any major hurdles yet?
oh also - is there any gelling between "environment" and "context"? we have "contextual dirs", maybe they could play off environments somehow?
this is mostly thesaurusly motivated
Environment will just be another object (a la container or llm), right?
it would be the "matrix" through which the llm interacts with all other objects
so not visible as an object itself
used in a sentence: "my code instantiates a LLM then binds two containers to its environment, plus one secret and one string containing its assignment"
and it'll only be used with LLMs?
we're trying to make it so the same environment is presented to LLMs and humans
@vital copper going to push a buildable version of my branch soon. I'm taking a more cautious approach, first pushing a "naive split" from main, where I preserve as much as possible of your API. Then on top of that, we can move more things around
example: I kept "vars" and "objects" separate, but think they can be merged under "bindings" as a follow-up
sounds good!
Gotta disappear for calls - back in 1h or so
@vital copper sorry I didn't have time to actually rebase on your "remove variables" change. But I manually incorporated most of it. Will get to it right after my calls
OK almost done with that
rebased & pushed
possible bug? https://v3.dagger.cloud/dagger/traces/ca08cbbdaac1d205d2ac1f0216b5f259#bb7ea8cc5ad77276:EL1284
have to head to a game soon so approving for when that's fixed
@vital copper thought I fixed it, but still fails. Pushing anyway..
It's the same test, but different failure I think
Does this look correct?
func (b *Binding) AsString() (string, bool) {
	return dagql.UnwrapAs[string](b.Value)
}
hmmm it might not be, I've never used it to unwrap + type convert in one go
i don't think that'll work, no
try UnwrapAs with dagql.String first, and then going from there
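The suggestion amounts to a two-step conversion: unwrap to the wrapper scalar first, then convert to a plain Go string. The sketch below uses local stand-ins (`String`, `Wrapper`, `Nullable`, `unwrapAs`) that only mimic an assumed behavior; it does not import dagql, and the real `dagql.UnwrapAs` may differ.

```go
package main

import "fmt"

// String is a stand-in for a wrapper scalar type such as dagql.String.
type String string

// Wrapper is a stand-in for values that wrap another value.
type Wrapper interface{ Unwrap() any }

// Nullable wraps an inner value, for demonstration.
type Nullable struct{ Inner any }

func (n Nullable) Unwrap() any { return n.Inner }

// unwrapAs peels wrappers until it finds a value whose dynamic type is T.
func unwrapAs[T any](v any) (T, bool) {
	for {
		if t, ok := v.(T); ok {
			return t, true
		}
		w, ok := v.(Wrapper)
		if !ok {
			var zero T
			return zero, false
		}
		v = w.Unwrap()
	}
}

type Binding struct{ Value any }

// AsString unwraps to the scalar wrapper first, then converts to string.
func (b *Binding) AsString() (string, bool) {
	s, ok := unwrapAs[String](b.Value)
	if !ok {
		return "", false
	}
	return string(s), true
}

func main() {
	b := &Binding{Value: Nullable{Inner: String("hello")}}

	s, ok := b.AsString()
	fmt.Println(s, ok)

	// Asking for plain string directly fails: the stored value is a String
	// wrapper, and a type assertion does not convert between named types.
	_, direct := unwrapAs[string](b.Value)
	fmt.Println(direct)
}
```

The failure mode of the one-step version is silent (it just returns false), which would explain a test failing without an obvious type error.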
side note these llm tests seem pretty brittle with the hardcoded replay data that encodes specific tool calls etc
yeah you have to run a command to update them sometimes
if you need it: dagger call test update --pkg=./core/integration --run="TestLLM" --env-file=file://$PWD/.env -o . (I don't think you should though based on your change)
Good to know. Yeah not needed now, was just poking around to see how it worked
re-running test...
everything passes except rust-sdk tests?
@fading iron @blazing tangle @buoyant crescent I'm going to merge part 1, just to keep things moving. All tests are green
Merged
automatic updates fix like 2/3 of the brittleness, and though i still find them somewhat aesthetically offensive, they have already found quite a few bugs so at least they're doing their job XD
nice, yeah having the auto update from the start is key
the only obvious alternatives are setting up the mock responses in code (probably more brittle, at least in the sense that you've gotta change the mocks and their impls whenever you change the actual code) or actual e2e testing (subject to an obscene amount of nondeterministic behavior from the models)... if things eventually calcify i could see it being nicer to just be like mockLLMModel.ExpectPrompt($userInput); mockLLMModel.StubToolCall(tool, params, returns) but for now the replay thing is wayyyyyy more amicable to underlying shit changing rapidly
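For readers who have not seen the replay approach, a toy version: the fake model answers from a recording and reports a stale recording as soon as the prompt drifts, which is the cue to re-run the update command. `ReplayModel`, `Exchange` and `ErrStale` are made-up names for this sketch and do not reflect the actual test harness.

```go
package main

import (
	"errors"
	"fmt"
)

// Exchange is one recorded prompt and the reply the model gave to it.
type Exchange struct {
	Prompt string
	Reply  string
}

// ReplayModel answers prompts from a recording instead of calling a real model.
type ReplayModel struct {
	recording []Exchange
	next      int
}

// ErrStale signals that the code under test no longer matches the recording.
var ErrStale = errors.New("recording is stale: re-record the exchange")

// Send returns the recorded reply if the prompt matches the next exchange.
func (m *ReplayModel) Send(prompt string) (string, error) {
	if m.next >= len(m.recording) || m.recording[m.next].Prompt != prompt {
		return "", ErrStale
	}
	reply := m.recording[m.next].Reply
	m.next++
	return reply, nil
}

func main() {
	m := &ReplayModel{recording: []Exchange{
		{Prompt: "build ./cmd/booklit", Reply: "tool call: withExec"},
	}}
	reply, err := m.Send("build ./cmd/booklit")
	fmt.Println(reply, err)

	// Any change to the prompt (or an extra turn) invalidates the recording.
	_, err = m.Send("build ./cmd/booklit with CGO_ENABLED=0")
	fmt.Println(err)
}
```

Exact-match replay is deterministic and cheap, at the cost of failing on every intentional prompt change, hence the value of an automatic update step.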
Part 1 merged
looks good so far! do you have an idea for how to get the return value(s?) out, to replace the LLM.<type> getters?
relatedly, I did some experimenting in https://github.com/dagger/dagger/pull/9978 to support -i/--interactive, which might inform the design somehow since it involved changing those getters.
I made it so that calling LLM.<type> sets <Type> as the desired return type before calling .sync, and then if the model exits its loop (stops talking/asks a question) without returning that type, and you've run with -i, it prompts through the CLI and lets you continue the conversation.
demo: https://asciinema.org/a/8jHjMd4z8tdPhDbt1SdYFXvnS
I also tried adding an explicit returnFoo tool for returning a Foo type when the task was completed, but had trouble getting the model to consistently call it, so I ended up just checking that the final state matched the desired return type. But that's far from perfect - what if the type happened to match, but the model didn't actually complete its task? So I started piling more prompting into the currentSelection tool, and thought about adding a way to clear the selection to disambiguate that, but I'm not sure how reliable that would be.
forgot about that
I feel like exposing the query tool as Query_xxx in the top-level must confuse the llm
is that a side topic? just making sure im following lol
like exposing the full Query type?
there's a way it can get wedged at the moment, since once it selects another object it has no way to go back to Query
yes side topic sorry
Pushed a fix, thanks @hallow steppe for testing
@vital copper FYI I'm looking at your other branch, will try to merge with mine, hopefully not too many conflicts
Good morning @unkempt meteor , we're hitting a mysterious buildkit error in this branch, any chance you could help us debug?
It's in a branch that doesn't actually do anything buildkit related, but we do move things around in the core schema, so the buildkit error might be an artifact of something stupid in the schema... just not sure what
hallo yes
do you have a link?
i'm working from home today, so happy to help out on anything
@unkempt meteor simplest way is to run engine from my PR:
dagger -m github.com/shykes/dagger@environment-api -c 'cli | binary --platform=current | export ./dagger-env-api; engine | service dev | up'
Then try loading any module:
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ./dagger-env-api -m github.com/shykes/hello
Also could use a review on the PR while we're at it
Just merged 10010 since you approved it @unkempt meteor . Need to rebase on top of it and get some testing of the new $agent asap
If you don't mind, it might be related
thanks
will do - hopefully should clear y'all up to focus on rejekts/etc
quick clarification - are we aiming to merge every open llm pr for the release later today? just going to milestone them all
@unkempt meteor the release is severely at risk, because we have unresolved design decisions & testing, and it's been very hard to coordinate them because of travel & timezone difference
ie. Alex and I are currently developing in parallel and have had no chance to actually sync since friday
I'm sitting with @hallow steppe trying to reconcile
okay, anything you need shout for - i can hop into a voice-channel if needed
trying to dagger develop with the branch above fails to load any module, even python-sdk
time="2025-03-31T09:12:50Z" level=error msg="solve error: process \"codegen --output /src --module-source-path /src --module-name python-sdk --introspection-json-path /schema.json\" did not complete successfully: exit code: 1\ngithub.com/moby/buildkit/solver.(*edge).execOp\n\t/go/pkg/mod/github.com/dagger/buildkit@v0.0.0-20250128235329-9c8ee9e867a5/solver/edge.go:912\ngithub.com/moby/buildkit/solver/internal/pipe.NewWithFunction[...].func2\n\t/go/pkg/mod/github.com/dagger/buildkit@v0.0.0-20250128235329-9c8ee9e867a5/solver/internal/pipe/pipe.go:78\nruntime.goexit\n\t/usr/lib/go/src/runtime/asm_arm64.s:1223" client_hostname=Donnager.local client_id=r1rv9ziyw8vxm4h39ss6oidzv function= module=python-sdk session_id=cl3p6y7xfwadbne9ry4nlo7s8 spanID=2e0091c31156b592 traceID=a0f711204960cb8228ecb925ad7747fb
digging in
Just rebased on main FYI
i can repro
huh we're having bugs in codegen???
really no idea why that's suddenly appeared, looks like we're generating bad code
detective time
fyi, i do like the new bindings model a lot actually
feels nice to have llm as a "real" static type
haha yeah it feels cleaner
also paves the way for Environment to be more broadly used, not just for LLMs
oho
yea that's got cool implications
One unresolved design issue we have, is that the current environment API assumes the LLM and caller exchange data by setting bindings.
But in parallel we have been discussing changes to LLM->caller sharing, because it's annoying to always have to prompt the model with instructions like "make sure to write the result to the variable 'foo', that's f-o-o, don't forget!"
ookay
it's binding.type that's causing the issue
type is a reserved name in go
i can get the go sdk to handle this
aaarg
we can change to TypeName also
it's not a super-super important function in the API. More for convenience
Most clients won't call it
So maybe not worth pulling out the big guns
i'll rename it for now to unblock? but i'll also keep hacking
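One generic way a Go code generator can guard against this kind of collision is to check candidate identifiers against the keyword list in the standard library. This is only a sketch of that check with a hypothetical `safeIdent` helper; how the Dagger Go SDK codegen actually handles it is not shown here, and the thread goes with a rename instead.

```go
package main

import (
	"fmt"
	"go/token"
)

// safeIdent returns name unchanged unless it collides with a Go keyword,
// in which case it appends an underscore to make it a legal identifier.
func safeIdent(name string) string {
	if token.IsKeyword(name) {
		return name + "_"
	}
	return name
}

func main() {
	for _, field := range []string{"key", "type", "value", "func"} {
		fmt.Printf("%s -> %s\n", field, safeIdent(field))
	}
}
```

A generator would apply this wherever a schema name becomes a local variable, parameter, or unexported identifier.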
perfect thank you

what's the plan for updating docs to match this new style? are we good on that front?
Yep on it
With the caveat that the design is not complete, because of the unresolved UX issue
fixed + pushed
gonna work on fixing ci as well
if anyone has a moment, could i get an approve on https://github.com/dagger/dagger/pull/10000 ? i need that fix to be able to regen correctly (i just cherry-picked it over for now)
At some point, this seems to have broken:
we weren't actually outputting the updated php docs to the right place
we weren't linting to see if those docs had actually been updated
...
cc @digital pond @merry helm
@vital copper double-checking for when you start your day: at the moment there is no way for the LLM to pass data back to the caller?
Correct - just the current selection, no explicit return. I experimented with explicit return but didn't get to the point where it would consistently call it. Maybe possible with more prompting (maybe an "instructions" tool with a description?) - but needs testing across models
And that was with a single return, not multiple named bindings, if that's what we want
@vital copper what about auto-chaining "auto-saving" of selected objects by default? I remember we discussed that, did you try?
Like having them update in place?
I had trouble figuring out how to frame it so that led me to writing evals
The bit that feels awkward is when it transitions types
Can it happen in the bbi/mcp rather than making the llm call _save or whatever
Like ctr becomes file when you get the resulting compiled binary
right
I think we would only auto-save when it's the same type (like in original single obj implementation)
Generally not a fan of values being reassigned esp if it pollutes back into shell
thats ok, i only want it to happen for chainable functions that return the original type
But that means we'd still have to solve returning different types no? Doesn't feel like it buys much
so yes for this: Workspace.exec(command: "go mod init go-curl"): Workspace!
not for this Workspace.getExecOutput: String!
the main use case we can't live without is mutating our input objects. So there's probably a bunch of other cases where chained objects would get dropped but that's the case for single object too
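The rule being proposed ("only auto-save when the function returns the original type") can be stated in a few lines. A sketch under the assumption that each object carries a type name; `Object`, `Env` and `AutoSave` are illustrative and say nothing about where this would be plumbed into the real tool-result handling.

```go
package main

import "fmt"

// Object is a minimal stand-in for a typed value in the environment.
type Object struct {
	Type string
	ID   string
}

// Env maps binding names to objects.
type Env map[string]Object

// AutoSave rebinds name to result only when the result has the same type as
// the object currently bound under that name. It reports whether it saved.
func (e Env) AutoSave(name string, result Object) bool {
	cur, ok := e[name]
	if !ok || cur.Type != result.Type {
		return false
	}
	e[name] = result
	return true
}

func main() {
	env := Env{"workspace": {Type: "Workspace", ID: "ws@1"}}

	// Workspace.exec(...) returns Workspace!, so the binding is updated in place.
	fmt.Println(env.AutoSave("workspace", Object{Type: "Workspace", ID: "ws@2"}))

	// Workspace.getExecOutput returns String!, so the binding is left alone.
	fmt.Println(env.AutoSave("workspace", Object{Type: "String", ID: "out@1"}))

	fmt.Println(env["workspace"].ID)
}
```

This gives parity with the earlier single-object behavior, and leaves type transitions (container to file, for example) as the unsolved part.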
Trying to find where to plumb this in, to try
(code changed a lot, I'm struggling to find the right place to plumb this in)
I guess selectionToToolResult?
Yeah I'm going to need your help @vital copper
Furniture got moved around, I don't recognize the room
I agree it's not a panacea BUT @hallow steppe is right that it would at least give us parity with single-object and a good checkpoint for testers to use while we keep designing something better
what is the proposed code pattern? like how does this translate?:
dag.LLM().
	SetWorkspace("work", ...).
	WithPrompt("do your thing with $work").
	Workspace()
i'll be around, starting earlier today than usual π
before := dag.Workspace(dagger.WorkspaceOpts{
	BaseImage: "golang",
	Context:   dag.Directory(),
	Checker:   "go build ./...",
})
env := dag.Environment().
	WithWorkspaceBinding("workspace", before).
	WithStringBinding("assignment", task)
// Give the workspace to the LLM
coder, err := dag.LLM().
	WithEnvironment(env).
	WithPromptFile(dag.CurrentModule().Source().File("system.txt")).
	WithPrompt(`
<assignment>
$assignment
</assignment>
`).
	Sync(ctx)
if err != nil {
	return nil, err
}
after := coder.Environment().Binding("workspace").AsWorkspace()
// Return the container
return after.Container(), nil
lol it was incorrect though
(forgot the withEnvironment part)
pseudo-code version:
dag.LLM().
	WithEnvironment(dag.Environment().WithWorkspaceBinding("workspace", ...)).
	WithPrompt("do your thing in your workspace").
	Environment().Binding("workspace").AsWorkspace()
Meanwhile we're discussing a simplification where we could make the binding name optional when setting it. Would still be multi-object internally, but by default we would just pick a name for you. You would only need to pick a name if you need to set multiple bindings of the same type
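A sketch of how an optional name could behave, assuming the default is derived from the type name. That convention, and the `Env.Bind` helper itself, are guesses for illustration; the thread only says a name would be picked for you.

```go
package main

import (
	"fmt"
	"strings"
)

// Env holds bindings by name. This is a simplified, mutable stand-in.
type Env struct {
	bindings map[string]any
}

func NewEnv() *Env { return &Env{bindings: map[string]any{}} }

// Bind stores value under name and returns the name actually used. An empty
// name falls back to a default derived from the type name (an assumed
// convention), so a second unnamed binding of the same type is rejected and
// the caller has to name it explicitly.
func (e *Env) Bind(name, typeName string, value any) (string, error) {
	if name == "" {
		name = strings.ToLower(typeName)
	}
	if _, exists := e.bindings[name]; exists {
		return "", fmt.Errorf("binding %q already exists, pass an explicit name", name)
	}
	e.bindings[name] = value
	return name, nil
}

func main() {
	env := NewEnv()

	name, _ := env.Bind("", "Workspace", "ws-1")
	fmt.Println(name)

	_, err := env.Bind("", "Workspace", "ws-2") // same type, still unnamed
	fmt.Println(err)

	name, _ = env.Bind("scratch", "Workspace", "ws-2")
	fmt.Println(name)
}
```

The common single-object case needs no name at all, while multi-object use stays possible.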
whatever we do, we just have to be pretty confident that the model actually understands it (beyond like 50% success rate). it took a lot of iteration to get to where we are, so i'm a little bearish on a total paradigm shift just before releasing, unless mutation actually seems to bring a significantly higher baseline of understanding for one reason or another. but, we're going to be swimming upstream, since the entire Dagger model (all our GraphQL schema docs etc) is framed assuming immutability
like we saw how wild some models got just by introducing it to the concept of named variables; it's a delicate balance
I understand but, we've been discussing the need to simplify the API for a while. We had to get evals first, but the point of the evals was to be able to iterate on the API
also I don't think it needs to be up to the LLM to re-assign variables, that definitely gets too complicated. Just re-assign when a tool response comes back
yeah, all i'm saying is this is a very small amount of time to actually iterate
considering how long evals can take to run, across models with rate limits etc. etc., and how much variability there is inherent to the feedback loop
again i'm not against the switch, i'm just skeptical of the timing, especially if we think we'll need to iterate on the API some more anyway
I agree that there's very little we can iterate on without pushing back the release
Maybe I'm missing something though because we can't release if we can't get mutated objects out of LLM
we can't release with the Environments API if we can't get mutated objects out
the Environments API is also predicated on being valuable in other ways, but are we shipping that at the same time?
I think the following are realistic pre-release:
- Simplify binding API by making the binding name optional (no impact on LLM interface, so no eval required)
- Auto-save of bindings - does impact LLM interface, but simplifies it rather than complexifies it, and we have prior experience of the pattern in single object, it worked well
Everything else I can think of, would have to push back the release IMO
or are we just trying to get an API change out of the way?
Yeah the API change is the point - to avoid releasing then breaking API a week later
btw i use zed now if you wanna try the collaboration thing
oh!
nice
sidenote we have a shared slack with MCP team now
and incidentally with Zed team too
@dark maple if there's interest I can try a parallel track, around pre-declaring "desired bindings" for the model to fill in. when I spiked on it I liked the DX, and it might be a smaller shift from where we are now, main question is whether the models will respect it
(brb, running around a lot today, contractors are here)
ok we're going to try a mini-spike on our end, ping when you're back?
@dark maple back
@vital copper want to chat in team-audio?
sure
Notes:
- We're moving forward on environment API branch. We have a list of changes we want to make before release
- LLM input: I would prefer a way for named bindings to be discoverable by the LLM. (at the moment you need to give their exact ID to the LLM, by variable expansion in the prompt). --> @dark maple will spike
- LLM output: since environment API does not expose the LLM's current selection (left as an internal MCP implementation detail), we need a way for the LLM to "return" a value to the caller. --> @vital copper will spike
- Function masks. We need a way to restrict what the LLM sees, to help it not get lost. There are several ways to do this. We agreed to focus on 2:
1. Manually mask functions from core types (container, directory) to reduce the baseline cost
2. if possible, change MCP implementation to only expose "real" module functions, and hide field getters. cc @unkempt meteor @merry helm we need help on this
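Function masking as described in the notes can be sketched as an include list followed by an exclude list, echoing the includeFunctions/excludeFunctions arguments in the schema draft. The semantics below (an empty include list means "everything") are an assumption, and `visible` is a hypothetical helper.

```go
package main

import "fmt"

// visible applies an include list, then an exclude list, to function names.
// Assumed semantics: an empty include list means "everything".
func visible(all, include, exclude []string) []string {
	inc := toSet(include)
	exc := toSet(exclude)
	var out []string
	for _, name := range all {
		if len(inc) > 0 && !inc[name] {
			continue
		}
		if exc[name] {
			continue
		}
		out = append(out, name)
	}
	return out
}

func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

func main() {
	container := []string{"from", "withExec", "withMountedDirectory", "stdout", "publish"}

	// Hide one function the model should never reach for.
	fmt.Println(visible(container, nil, []string{"publish"}))

	// Or expose only a small working set.
	fmt.Println(visible(container, []string{"from", "withExec"}, nil))
}
```

Fewer exposed functions means fewer tool descriptions in the model's context, which is the "baseline cost" mentioned in the notes.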
FYI @quiet wing @vast jolt since you guys joined the call
SUCCESS RATE: 10/10 (100%) (with Gemini - other models TBD)
the nice thing about this is it'll naturally gel with -i for troubleshooting - if the model ends its turn without returning, we call for help
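The end-of-turn check implied here is simple: compare the declared outputs against what has been returned, and fall back to a human when running interactively. A sketch with hypothetical names (`missing`, `endOfTurn`) and string results standing in for real control flow.

```go
package main

import "fmt"

// missing lists the declared outputs that have no value yet, in declaration order.
func missing(wanted []string, outputs map[string]any) []string {
	var out []string
	for _, name := range wanted {
		if _, ok := outputs[name]; !ok {
			out = append(out, name)
		}
	}
	return out
}

// endOfTurn decides what to do when the model stops talking.
func endOfTurn(wanted []string, outputs map[string]any, interactive bool) string {
	todo := missing(wanted, outputs)
	switch {
	case len(todo) == 0:
		return "done"
	case interactive:
		return "ask-human" // with -i, prompt through the CLI and continue
	default:
		return "error" // without -i there is nobody to ask
	}
}

func main() {
	wanted := []string{"bin"}
	fmt.Println(endOfTurn(wanted, map[string]any{}, true))
	fmt.Println(endOfTurn(wanted, map[string]any{}, false))
	fmt.Println(endOfTurn(wanted, map[string]any{"bin": "file@1"}, false))
}
```

Because completion is defined by declared outputs rather than by the type of the current selection, a type that happens to match no longer counts as success by accident.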
@vital copper does it work even without prompting it specifically to "return this or that"?
yep
the full prompt is WithPrompt("Mount $repo into $ctr, set it as your workdir, and build ./cmd/booklit with CGO_ENABLED=0.") - the rest just comes from the tool framing
Nice. I was going to ask about returning multiple values but I guess that's included
this description ends up on the bin arg description for example:
WantFileBinding("bin", "The compiled Booklit binary."),
yep
Did you add a description field to Binding? Was thinking we could expose it for regular bindings also
for now i just added a separate type, but yeah i think it could make sense there. only reason I did a separate type was to avoid a "sometimes nil Binding"
need to
terminology too - "WantFileBinding" feels awkward ("Want" vs "With" too subtle)
gonna try updating the other evals to use this and see how it feels
@vital copper do you want to allow the LLM reading from the same LLM it returns to?

oh like accessing the same bindings it returned, as bindings, for subsequent turns?
yeah, do the "input bindings" and "output bindings" share the same namespace?
we're facing this question right away with @hallow steppe's reference example (toy workspace in, toy workspace out)
that's what i've done so far in my spike, yeah. which took a bit of hacking, right now MCP just directly mutates:
// TODO: is it appropriate to just mutate directly here?
m.env.objsByName[name] = &Binding{
	Key:   name,
	Value: obj,
	env:   m.env,
}
OK, so it makes sense to call them bindings then
hmm maybe
alternatively we could have an LLM have an 'output environment' or something, but it seems like it'd translate to the same number of hops either way
oh, or we could do LLM.output("foo"): Binding!
then you don't need the extra LLM.ENVIRONMENT.binding call
we can decide either tbh
i think the DX should drive it
yeah one way to think of it is like, "solve for $bin in $ctr = foo, $repo = bar, $bin = ?"
so how about:
env := dag.Environment().
	WithToyWorkspaceBinding("workspace", dag.ToyWorkspace()).
	WithStringBinding("assignment", assignment)
dag.LLM().
	WithEnvironment(env).
	WithPrompt("do your thing").
	Loop(dagger.LLMLoopOpts{MustReturn: []string{"workspace"}}).
	Environment().
	Binding("workspace").
	AsToyWorkspace()
hmm feels like a rough edge. and i think losing the description will hurt
(both in the model and in your own later understanding of the same code)
the descriptions are kind of nice since they're self-documenting and become part of your prompt
Would you want a different description for the input & output usage of the binding?
the workspace example is a good stressor of the load-bearing descriptions model though since it might be hard to describe an entire workspace's desired state
here's the full example where it felt natural to me:
m.LLM().
	WithEnvironment(
		dag.Environment().
			WithDirectoryBinding("repo",
				dag.Git("https://github.com/vito/booklit").Head().Tree()).
			WithContainerBinding("ctr",
				dag.Container().
					From("golang").
					WithMountedCache("/go/pkg/mod", dag.CacheVolume("go-mod")).
					WithEnvVariable("GOMODCACHE", "/go/pkg/mod").
					WithMountedCache("/go/build-cache", dag.CacheVolume("go-build")).
					WithEnvVariable("GOCACHE", "/go/build-cache").
					WithEnvVariable("BUSTER", fmt.Sprintf("%d-%s", m.Attempt, time.Now())),
			).
			WantFileBinding("bin", "The compiled Booklit binary."),
	).
	WithPrompt("Mount $repo into $ctr, set it as your workdir, and build ./cmd/booklit with CGO_ENABLED=0.")
really need an eval for the workspace pattern
if inputs and outputs both supported descriptions I would end up describing them differently, if that's what you're asking
OK then maybe we should call them with<Foo>Input and with<Foo>Output
outputs would be described in terms of their desired state, inputs would be described in how they should be used
works for me!
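A minimal sketch of the input/output split agreed on above, in plain Go with no Dagger dependency. `Env`, `Binding`, `WithInput`, `WithOutput` and `Describe` are hypothetical stand-ins rather than the real API, and every description except the one for `bin` is made up for illustration. Inputs are described by how they should be used, outputs by their desired state, and both can be rendered into context for the model.

```go
package main

import (
	"fmt"
	"strings"
)

// Binding is a hypothetical named, typed, described slot in an environment.
type Binding struct {
	Name        string
	TypeName    string
	Description string
}

// Env is a toy model of an environment; it is not the real Dagger type.
type Env struct {
	Inputs  []Binding
	Outputs []Binding
}

// WithInput returns a copy of the env with one more input.
// The description says how the input should be used.
func (e Env) WithInput(name, typeName, usage string) Env {
	e.Inputs = append(append([]Binding(nil), e.Inputs...), Binding{name, typeName, usage})
	return e
}

// WithOutput returns a copy of the env with one more desired output.
// The description says what state the output should end up in.
func (e Env) WithOutput(name, typeName, desired string) Env {
	e.Outputs = append(append([]Binding(nil), e.Outputs...), Binding{name, typeName, desired})
	return e
}

// Describe renders the declarations as text that could accompany a prompt.
func (e Env) Describe() string {
	var b strings.Builder
	for _, in := range e.Inputs {
		fmt.Fprintf(&b, "input %s (%s): %s\n", in.Name, in.TypeName, in.Description)
	}
	for _, out := range e.Outputs {
		fmt.Fprintf(&b, "output %s (%s): %s\n", out.Name, out.TypeName, out.Description)
	}
	return b.String()
}

func main() {
	env := Env{}.
		WithInput("repo", "Directory", "The source tree to build.").
		WithInput("ctr", "Container", "A Go toolchain container to build in.").
		WithOutput("bin", "File", "The compiled Booklit binary.")
	fmt.Print(env.Describe())
}
```

If the descriptions carry enough detail, the rendered text may cover most of what would otherwise go into the prompt.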
hmm can I WithFooOutput(&foo)??
i think it would be interesting if sometimes you end up not needing a prompt at all
because the descriptions cover everything
Yes I was wondering that also
this would be
but would need some SDK gymnastics i think
assignment ✓ workspace ✓ documentation ✓
- "how do you set the prompt?"
- "what do you mean?"
type LLM {
withPrompt(prompt: String="do your thing"): LLM!
}
in this snippet, what am I calling to get bin?
.Environment().Binding("bin")
or .Binding("bin")?
good vibes only
currently the former:
f, err := llm.Environment().Binding("bin").AsFile().Sync(ctx)
require.NoError(t, err)
bit of a mouthful
llm.Output("bin").AsFile() would help
sounds good
is this possible or is it a future optimization?
easy to do, if it passes
@dark maple
Are you OK with llm.Environment().Output("bin").AsFile()?
having LLM type separate from its environment has several benefits
The extra intermediary call I think is worth it - it brings structural clarity IMO
also makes it easier to plug in external MCP
(otherwise MCP server has a dependency on LLM which is weird)
strawman: what if we shortened it to Env? LLM.withEnv, LLM.env
I know the pattern is to not abbreviate things, but we already have Container.withEnvVariable. (confusing example for multiple reasons, including my own arguing in the past that it should just be called withEnv for DX)
I do love typing less
Works for me
woah its the guy from the emoji
updated quickstart snippet
environment := dag.Env().
WithToyWorkspaceInput("before", dag.Workspace()).
WithStringInput("assignment", assignment).
WantToyWorkspaceOutput("after")
return dag.LLM().
WithEnv(environment).
WithPrompt(`
You are an expert go programmer. You have access to a workspace.
Use the default directory in the workspace.
Do not stop until the code builds.
Complete the assignment: $assignment
`).
Env().Output("after").AsWorkspace().Container()
original https://docs.dagger.io/ai-agents/quickstart#edit-the-agent-file
@vital copper which parts of all this do you want to take? I have a few hours of free time ahead of me
probably easiest to just take all of it and push a first pass to your PR, we decided on a lot but the code change isn't that big (unless I'm forgetting something)
biggest change was adding return but that's working already (so far...)
the list:
- Environment -> Env (everywhere? including type name?) (trivial I think?)
- withFooBinding -> withFooInput (trivial)
- adding withFooOutput (done)
- adding return (done)
sweet once you have that pushed I'll build it and work on updating docs
ah another one:
- adding descriptions to inputs
biggest question there is whether it's a required argument
optional makes sense I think. I probably wouldn't need a description for the string var that's just used for prompt substitution
tbh i don't even know if I like the substitution for things like $assignment anymore. I might just concat it on the prompt string
with that change I'm less sure on optional description
// Write a Go program
func (m *CodingAgent) GoProgram(
// The programming assignment, e.g. "write me a curl clone"
assignment string,
) *dagger.Container {
environment := dag.Env().
WithToyWorkspaceInput("before", dag.Workspace(), "the coding workspace to complete the work").
WantToyWorkspaceOutput("after")
return dag.LLM().
WithEnv(environment).
WithPrompt(`
You are an expert go programmer. You have access to a workspace.
Use the default directory in the workspace.
Do not stop until the code builds.
Complete the assignment:` + assignment).
Env().Output("after").AsWorkspace().Container()
}
catching up sorry
nvm i got unconvinced
Meanwhile I'm still going to work on making inputs discoverable without var expansion
Don't panic, Kyle is only talking about one small cosmetic detail of his snippet
I love that even the tool setup looks like "structured prompting"
// Write a Go program
func (m *CodingAgent) GoProgram(
// The programming assignment, e.g. "write me a curl clone"
assignment string,
) *dagger.Container {
environment := dag.Env().
WithToyWorkspaceInput("before", dag.Workspace(), "these are the tools to complete the task").
WithStringInput("assignment", assignment, "this is the assignment, complete it").
WantToyWorkspaceOutput("after")
return dag.LLM().
WithEnv(environment).
WithPrompt(`
You are an expert go programmer. You have access to a workspace.
Use the default directory in the workspace.
Do not stop until the code builds.`).
Env().Output("after").AsWorkspace().Container()
}
You could try piggybacking on the currentSelection tool description and turning it into a general context dump. Just make 100% sure to not make it seem anything close to variables - which would be much simpler if they had descriptions, then we might not even need to mention their names?
or maybe description falls back to the name?
I'm super afraid of typing dagger.WithToyWorkspaceInputOpts{ "blah" }
lol yeah
@hallow steppe don't worry it will actually be dagger.EnvWithToyWorkspaceInputOpts{ "blah" }
dagger.EWTWIO 
@vital copper heads up rebasing on main + force-pushing
pushed the return tool on top of that, moving on to the rename
@vital copper does it seem feasible to you to move "withQuery" or equivalent to Environment itself?
Would be kind of cool if you could call dag.Env(dagger.EnvOpts{Privileged: true})
or perhaps even cooler: dag.CurrentEnv()
env(): initialize a new empty environment
currentEnv(): retrieve the current environment, including core API access and current module
you might actually be able to assign a Query as an input 
not sure how else it would move there, since environments are just a set of named values in the end
Well I've been considering since the beginning of this thread that "access to core API" and "access to current module" would be first-class properties of an environment
since they are also first-class properties in the shell's environment, for example
(and presumably they would also be recorded when we save environments to a file)
@dark maple @hallow steppe pushed Environment -> Env
what's LLM.attempt()?
right, i think the friction point is how/when the constructors are called. in shell you have an already-constructed instance of the current module in scope, plus constructors for your module's dependencies? is that right?
a bit of a hack, just a cache buster, added it in to support running evals N times in parallel without them just getting deduped. we should bikeshed that
@vital copper should I assume Environment.Bindings remains, or is it about to disappear and be replaced by Inputs and Outputs?
where'd we land on descriptions being optional?
from the POV of syncing them back to shell, I would only ever need to get the outputs
but it might not matter much since the syncing is bidirectional
like it might be redundant to sync inputs back, but i don't think it would be harmful, since any local changes made to them would have been synced to the LLM before it started its turn
so they might be skipped anyway because the digest is the same
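A rough sketch of the "skip it when the digest is the same" idea, assuming bindings can be compared by a content digest. Here values are plain strings and `digest` is a SHA-256 stand-in; real object digests would come from the engine, and `SyncBack` is a hypothetical helper name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// digest is a stand-in for an object digest computed by the engine.
func digest(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// SyncBack copies bindings from env into shell and returns the names it
// actually updated. Bindings whose digest already matches are skipped, so
// syncing unchanged inputs back is redundant but harmless.
func SyncBack(shell, env map[string]string) []string {
	names := make([]string, 0, len(env))
	for name := range env {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order

	var updated []string
	for _, name := range names {
		local, ok := shell[name]
		if ok && digest(local) == digest(env[name]) {
			continue // same digest: nothing to do
		}
		shell[name] = env[name]
		updated = append(updated, name)
	}
	return updated
}

func main() {
	shell := map[string]string{"repo": "tree-v1", "ctr": "golang-v1"}
	env := map[string]string{"repo": "tree-v1", "ctr": "golang-v1", "bin": "booklit-binary"}
	fmt.Println(SyncBack(shell, env)) // prints [bin]
}
```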
going out on a limb, but I'm interested in just making them required and seeing how it feels - like if it replaces words that you might otherwise put in a prompt, it might be better all around
What I mean is, in terms of API, we had:
type Environment {
bindings()
binding(...)
}
Now I guess it would become:
type Environment {
inputs()
input(...)
outputs()
output(...)
}
Also it really sucks to set optional args in Go
that's my not so secret other motivation yeah
is there an argument for or against exposing inputs, depending on how Envs might be used otherwise? what if we only supported accessing outputs and keeping it a black box?
i guess there could be scenarios where you want to inspect it ("dump the environment")
propose
Env().ToyWorkspaceOutput("after").Container()
instead of
Env().Output("after").AsToyWorkspace().Container()
i think you could just keep "bindings" and it'd be fine. in the end, outputs end up as a binding. any unfulfilled outputs would just not show up
the input vs. output is more of a setup-time distinction
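One way to picture "outputs end up as a binding, unfulfilled outputs just don't show up", as a toy model in plain Go. The type and method names (`Env`, `WantOutput`, `Fulfill`, `Bindings`) are hypothetical and only illustrate the setup-time versus read-time distinction.

```go
package main

import (
	"fmt"
	"sort"
)

// Env is a toy environment: inputs are bound at setup time, outputs are
// declared at setup time but only become bindings once fulfilled.
type Env struct {
	inputs    map[string]string
	desired   map[string]string // output name -> type name
	fulfilled map[string]string // output name -> value
}

func NewEnv() *Env {
	return &Env{
		inputs:    map[string]string{},
		desired:   map[string]string{},
		fulfilled: map[string]string{},
	}
}

func (e *Env) WithInput(name, value string) *Env {
	e.inputs[name] = value
	return e
}

func (e *Env) WantOutput(name, typeName string) *Env {
	e.desired[name] = typeName
	return e
}

// Fulfill records a value for a declared output.
func (e *Env) Fulfill(name, value string) error {
	if _, ok := e.desired[name]; !ok {
		return fmt.Errorf("output %q was never declared", name)
	}
	e.fulfilled[name] = value
	return nil
}

// Bindings lists inputs plus fulfilled outputs; unfulfilled outputs are absent.
func (e *Env) Bindings() []string {
	var names []string
	for name := range e.inputs {
		names = append(names, name)
	}
	for name := range e.fulfilled {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	env := NewEnv().WithInput("repo", "Directory#1").WantOutput("bin", "File")
	fmt.Println(env.Bindings()) // prints [repo]
	if err := env.Fulfill("bin", "File#1"); err != nil {
		panic(err)
	}
	fmt.Println(env.Bindings()) // prints [bin repo]
}
```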
Easier if we keep it consistent - otherwise we have to explain 3 words instead of 2
My immediate concern was just iterating inputs so I could expose them as tools.
Don't worry about it I will just focus on getting it working as-is
hmm i think it's hard to avoid a third word when your starting point is "input" and "output" but they fit the same shape
people might ask what they have in common
also, context dump: one issue I found with exposing them as tools, is sometimes their names sound like actions and fool the model into calling them instead of the actual function that does the thing. specifically in my case I ended up with a tool called eval that returned the string name of the eval to run, which it tried to call instead of running the Workspace_evaluate tool that it actually needed to call.
sort of a general risk around accepting wildcards into the top level namespace in our tool calling scheme
might be mitigated strongly with descriptions
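To make the `eval` vs `Workspace_evaluate` mix-up concrete, here is a small sketch of a check that flags input names which read like a function tool. The heuristic (an input name that is a case-insensitive prefix of a tool's method name) and the name `ActionLikeInputs` are assumptions for illustration, not something the tool-calling scheme does today.

```go
package main

import (
	"fmt"
	"strings"
)

// ActionLikeInputs returns the input names that are a case-insensitive prefix
// of some function tool's method name (the part after the last underscore).
// Such inputs could be mistaken by the model for the tool that does the work.
func ActionLikeInputs(inputs, tools []string) []string {
	var flagged []string
	for _, input := range inputs {
		for _, tool := range tools {
			method := tool[strings.LastIndex(tool, "_")+1:]
			if strings.HasPrefix(strings.ToLower(method), strings.ToLower(input)) {
				flagged = append(flagged, input)
				break
			}
		}
	}
	return flagged
}

func main() {
	inputs := []string{"eval", "repo"}
	tools := []string{"Workspace_evaluate", "Workspace_write"}
	fmt.Println(ActionLikeInputs(inputs, tools)) // prints [eval]
}
```

A flagged name could be renamed, or given a description that makes clear it only returns a value.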
@vital copper I'm trying to make the "input tools" work the same as "function tools" in terms of auto-selecting the result
could you point me to the correct incantation?
prev := m.Current()
m.Select(obj) // , args.Functions...)
return m.currentState(prev)
the prev business is to support being given a single initial object as its state, so it can go back to it. is that gone now?
relates to the optional-binding-name topic
yeah - at the moment binding names remain required. Tried to convince @hallow steppe that optional would be awesome, but once we pseudo-ported his example module, I unconvinced myself... because of go opts
And with "inputs & outputs" having names feels fine now
is there an extremist angle where we get rid of the names entirely and just have required obj + required description?
i guess that'd harm my explicit-vars-in-prompts flow
I think I'll have to copy-paste that, since it's wrapped in toolCallTosSelection which doesn't seem easy for me to reuse
oh I notice it's already copy-pasted in a few places
on the input descriptions
WithToyWorkspaceInput("before", workspace, "these are the tools to complete the task").
how does that work with the object's own description?
like the type description/docs?
yea
hmm i don't think they're very high leverage since they're not contextual/task-dependent, but could be a good fallback
i guess for types like Workspace you could make them higher leverage
but then it's awkward to have the required arg
something like Container's description is probably not very helpful though
(i dont even know what it is)
got it, at some version I was trying to keep my module descriptions optimized for the _select but I guess that doesn't matter now
bump
pushed "builtins as tools"
main disadvantage here is spreading the "API explosion" from Binding to Env
i am generally in favor of any changes to reduce typing though
Sounds good, it only saves a few chars so it's not worth it
will try some evals with this now
I recommend starting with a simple manual evaluation because I didn't actually test it yet 😬
oh @vital copper we also need to fix prompt mode to work with the new api
lol yeah just hit that
hmmmm i guess it'll have to keep getting the environment, syncing it, and re-setting it on the llm?
good question, yeah it's less magically transparent now
also no descriptions
rebasing on main..
@vital copper we're heading for dinner
then back at it
hmm just realized there's no way to declare desired outputs in prompt mode atm
also @dark maple I might have an answer for how it listed Container tools without selecting it after you assigned one: it probably peeked at the description of selectContainer, which lists all the tools
right not as magically transparent
i'm trying a thing like, "build foo and return bar as $bin" - and allowing return to just take yolo arguments
that did not work at all
it exported it instead
also, this was a little concerning
ctr=$(container | from golang) 0.0s
repo=$(git https://github.com/vito/booklit | head | tree) 0.0s
build ./cmd/booklit and return the binary as $bin 22.1s
🧑 build ./cmd/booklit and return the binary as $bin
  0.0s
🤖 0.7s | Input Tokens: 1,575 | Output Tokens: 4
selectDirectory 0.0s
! arg "id" does not match pattern ^Directory#\d+$: "repo"
🤖 It seems I misinterpreted "repo" as a Directory ID. I should use the repo() function to access the user input directory. Then, I will create a container, mount the directory, and build the binary.
  0.8s | Input Tokens: 1,601 | Output Tokens: 46
🤖 OK. Now I have the directory. I will create a container from alpine and copy the directory to it and build the binary.
  0.7s | Input Tokens: 2,189 | Output Tokens: 31
selectContainer 0.0s
! arg "id" does not match pattern ^Container#\d+$: "ctr"
🤖 I made the same mistake again. I should use the ctr() function to access the user input container.
  0.7s | Input Tokens: 2,241 | Output Tokens: 24
🤖 Now I have the container. I will add the directory to the container and build the binary.
  0.8s | Input Tokens: 6,832 | Output Tokens: 35
tl;dr it tried to use repo and ctr like vars in arguments
(this is gemini-2.0-flash)
at the end there it called ctr but those getters don't have telemetry wired up atm
it works, it's just not visible
also, bad test, it wasn't exposing return since there were no outputs declared
trying again
hmm, it tried but our plumbing isn't very amenable to this sort of thing (extra args of unknown types)
🤖 Finally, I can return the file as the $bin result.
  0.6s | Input Tokens: 1,723 | Output Tokens: 17
return: String! 0.0s
🤖 I need to call return with the bin parameter set to the ID of the file I extracted.
  0.7s | Input Tokens: 1,744 | Output Tokens: 27
return 0.0s
! unknown arg: "bin"
🤖 I apologize, I seemed to have reached the end of the interaction and there is no further action to take.
  0.6s | Input Tokens: 2,402
return: String! 0.0s
🤖 I am unable to return the file as $bin because the return function does not accept any arguments.
  0.5s
can keep pursuing if that seems like the right direction. I guess the fix would be to assume all args are valid and in the form Foo#123 and translate them to their real IDs
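A sketch of that fix: validate that an argument has the `Foo#123` shape seen in the errors above, then translate it to the real ID. `Handles`, `Add` and `Resolve` are hypothetical names, and the "real IDs" here are placeholder strings.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlePattern generalizes the per-type patterns from the trace
// (for example ^Directory#\d+$) to any type name.
var handlePattern = regexp.MustCompile(`^[A-Z][A-Za-z0-9]*#\d+$`)

// Handles maps short, model-friendly handles like "File#1" to real object IDs.
type Handles struct {
	counts map[string]int
	realID map[string]string
}

func NewHandles() *Handles {
	return &Handles{counts: map[string]int{}, realID: map[string]string{}}
}

// Add registers a real ID and returns the next handle for its type.
func (h *Handles) Add(typeName, id string) string {
	h.counts[typeName]++
	handle := fmt.Sprintf("%s#%d", typeName, h.counts[typeName])
	h.realID[handle] = id
	return handle
}

// Resolve translates a handle back to its real ID, rejecting anything that
// does not look like a handle (such as a bare variable name).
func (h *Handles) Resolve(arg string) (string, error) {
	if !handlePattern.MatchString(arg) {
		return "", fmt.Errorf("arg %q does not match pattern %s", arg, handlePattern)
	}
	id, ok := h.realID[arg]
	if !ok {
		return "", fmt.Errorf("unknown handle %q", arg)
	}
	return id, nil
}

func main() {
	h := NewHandles()
	handle := h.Add("Directory", "placeholder-real-id")
	id, err := h.Resolve(handle)
	fmt.Println(handle, id, err)

	_, err = h.Resolve("repo") // a variable name, not a handle
	fmt.Println(err)
}
```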
we should probably mask export
it seems to really enjoy calling it
still seeing it try to pass repo and ctr as args: https://v3.dagger.cloud/dagger/traces/1b1e4070a476951f130d2f5b53edb7ab
but it's actually useful sometimes... dilemma
alright i can't get gemini to call return with arbitrary args anymore for some reason
but why

I guess my description is ambiguous
"This is not a variable."
my description for return may be too strict - iterating...
yeah I definitely see how export would confuse
host.withDirectory ftw
@vital copper can you re-explain your idea for getting explicit outputs in prompt mode?
basically just trying to get this to magically work:
! ctr=$(...)
! repo=$(repo)
> build ./cmd/booklit and return the compiled binary as $bin
the goal would be that the model sees $bin and deduces that it can call return({"bin":"File#1"}) which will "just work" if we make return accept arbitrary values
alternatively we could do something like:
! ctr=$(...)
! repo=$(repo)
! bin=$(.prompt "build ./cmd/booklit")
(maybe - lots to figure out there)
the last one kinda makes more sense in the old model (single return value) - from a language/interpreter POV
since it's assigned on the outside
to map that to the current approach we'd need the subshell to know what variable it's being assigned to, set it as an output, and then ... ??? maybe just let the var syncing do its thing? or maybe grab its binding out and return it ... ?
but return probably works best if it specifies expected values in the arguments no?
yeah for sure
i just tried that since it's the first thing i thought someone might try, so wanted to see if it could work with an open-ended escape hatch
but, it seems to undermine it too much
without the named + described args it loses its mind
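A sketch of a `return` tool whose arguments come from the declared outputs, which is the direction the thread lands on. It assumes outputs are passed as `Type#N` handles; `Output`, `ReturnTool`, `NewReturnTool` and `Call` are hypothetical names. With no declared outputs the tool is not exposed at all, matching the "bad test" noted earlier.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Output is a declared, described output of an environment.
type Output struct {
	Name, TypeName, Description string
}

// ReturnTool accepts exactly the declared outputs as named arguments.
type ReturnTool struct {
	outputs map[string]Output
}

// NewReturnTool returns nil when no outputs are declared, meaning the tool
// should not be offered to the model.
func NewReturnTool(outputs []Output) *ReturnTool {
	if len(outputs) == 0 {
		return nil
	}
	byName := make(map[string]Output, len(outputs))
	for _, out := range outputs {
		byName[out.Name] = out
	}
	return &ReturnTool{outputs: byName}
}

func sortedKeys[V any](m map[string]V) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// Call validates the arguments against the declared outputs.
func (r *ReturnTool) Call(args map[string]string) error {
	for _, name := range sortedKeys(args) {
		out, ok := r.outputs[name]
		if !ok {
			return fmt.Errorf("unknown arg: %q", name)
		}
		if !strings.HasPrefix(args[name], out.TypeName+"#") {
			return fmt.Errorf("arg %q must be a %s handle, got %q", name, out.TypeName, args[name])
		}
	}
	for _, name := range sortedKeys(r.outputs) {
		if _, ok := args[name]; !ok {
			return fmt.Errorf("missing arg: %q", name)
		}
	}
	return nil
}

func main() {
	tool := NewReturnTool([]Output{{"bin", "File", "The compiled Booklit binary."}})
	fmt.Println(tool.Call(map[string]string{"bin": "File#1"}))
	fmt.Println(tool.Call(map[string]string{"bin": "File#1", "extra": "File#2"}))
}
```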
hmm do we still expose currentType and getting the currently selected value?
bit of a cop-out but i wonder if $_ is the simplest approach in the end, since prompt mode tends to be more of a stream of consciousness, and you stumble across the values you want, and want to pluck them out one at a time
which you can do pretty easily with bin=$_
yeah that works
last call though (regardless of selection)
that way you can get scalars
that just needs a bit of extra plumbing right? i guess a second state: selection, which is always Object, and result, which is a Typed?
oh Right it needs to be in the API
honestly it could be a low level call history API
LLM.currentBinding? kinda weird, since it's not a binding, but kinda want to reuse Binding so we get all of its typed fields
and shell just happens to get the last call and use its result
that way we don't have to worry about defining another abstraction
(and you can get the result of each call as a Binding)
we need a better history API
true
mm one more issue here. multi agent.. I guess it's ok as long as we don't allow multiple agents to work in parallel
(I mean $agent and the possible upgrade to multiple variables for prompt mode )
should we bind the last result to $agent | last-call | result ? meh looks horrible though
the trick will be having it handle objects (by GraphQL ID?) and scalars (by JSON I suppose?)
also arrays I guess
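One possible shape for that, sketched in plain Go: objects travel by ID, scalars by their JSON encoding, and arrays as a list of either. The `Encoded` wire format, `ObjectRef` and `Encode` are assumptions made up for this example, not an existing API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ObjectRef identifies an object by type name and ID instead of by content.
type ObjectRef struct {
	TypeName string
	ID       string
}

// Encoded is a hypothetical tagged representation of a call result.
type Encoded struct {
	Kind  string          `json:"kind"` // "object", "scalar" or "array"
	Type  string          `json:"type,omitempty"`
	ID    string          `json:"id,omitempty"`
	Value json.RawMessage `json:"value,omitempty"`
	Items []Encoded       `json:"items,omitempty"`
}

// Encode tags a result: objects by ID, scalars by JSON, arrays element-wise.
func Encode(v any) (Encoded, error) {
	switch x := v.(type) {
	case ObjectRef:
		return Encoded{Kind: "object", Type: x.TypeName, ID: x.ID}, nil
	case []any:
		items := make([]Encoded, 0, len(x))
		for _, item := range x {
			enc, err := Encode(item)
			if err != nil {
				return Encoded{}, err
			}
			items = append(items, enc)
		}
		return Encoded{Kind: "array", Items: items}, nil
	case string, bool, int, float64:
		raw, err := json.Marshal(x)
		if err != nil {
			return Encoded{}, err
		}
		return Encoded{Kind: "scalar", Value: raw}, nil
	default:
		return Encoded{}, fmt.Errorf("unsupported result type %T", v)
	}
}

func main() {
	enc, err := Encode([]any{ObjectRef{"File", "placeholder-id"}, "v1.2.3"})
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(enc)
	fmt.Println(string(out))
}
```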
Sorry didn't want to hijack yves thread, but i'm still not sure what's expected and what's a bug that needs fixing:
Is it normal that env | with-file-output foo $somefile "some description" | output foo | name returns an empty string and not foo ?
hmm it looks weird on paper but I think it makes sense - with-file-output is more like "I desire this output", output foo is "give me this output" but the output hasn't been fulfilled
oh nvm sounds like it's a silly bug (based on #1356669991470104627 message)
yes, didn't realize until just now
is it me or WithState method on LLMSession is called from nowhere ?
hmmm so tryna wrap my head around output bindings... there's no WithStringOutput, is that intentional? To use the Env API it's implicit that I need to make the output some dagger object? (other than string)
is it insane to not only want WithStringOutput, but also WithObjectOutput? like a map[string]any in go?
oh or wait does this just require multiple modules? I notice the example using WithToyWorkspaceOutput, i just want this level of programmability within the module i'm currently working on
not intentional, just didn't have time
the problem is that graphql doesn't support maps, or any
One thing to be cautious of: LLMs aren't very good at accurately reproducing data beyond trivial examples. One of my evals had a string with special characters and the eval would pretty consistently fail because the LLM subtly changed whitespace or special characters: https://github.com/vito/daggerverse/commit/7350e27037cdac07cf1ca4572b6a44d0a337a195
Using objects is much safer - it's a lot easier for an LLM to pass around "File#1" than it is to pass around the file's content.
I could be over-indexing on this one example, and maybe it'll get better with time, but intuitively it makes sense; it's sort of like sending over data and having an intern type it into their editor by hand, instead of just passing it around by reference.
For this reason I'm wondering if WithStringInput is a footgun - all the examples I could find in (admittedly limited) real world use were really just using it as a prompt variable. And in those cases it seemed kind of weird to tie it to the Environment. I'd be curious to see a strong use case
i think it also might give the wrong impression that environment is something you should be using for stringly-typed tasks... admittedly this isn't relevant for real applications, but when i'm building toys often the first thing i reach for is some dumb example, like ye olde "weather api"
ah duh i sometimes forget the wire protocol entirely XD
then reframing knowing all this, maybe we don't want string bindings or maps/objects, but just to be able to do the whole WithXInput/WithXOutput with module-local types? is that similarly difficult for technical reasons?
similar (same?) difficulty as self-calls, i think
actually if we could have an equivalent to interface {} in graphql that would be very handy
i think we've used JSON in that way in the past
maybe our destiny is to be the "we want generics!" mob but for graphql
tbh json would prolly be fine here