#Environment API


dark maple
#

consolidating

plucky ermine
#

Would this be more or less a replacement for the workspace concept?

dark maple
#

Well "workspace" was a usage pattern, prominently featured in our early examples, but was never a core concept in the API

#

So that usage pattern would continue, just built on this new version of the core API

#
type LLM {
  model: String!
  withPrompt(prompt: String!): LLM!
  history: [LLMMessage!]!
  lastReply(): String!

  withEnvironment(Environment!): LLM!
  environment: Environment
}
type Environment {
  with[Type]Binding(key: String!, value: [Type], overwrite: Bool, overwriteType: Bool, includeFunctions: [String!], excludeFunctions: [String!]): Environment!
  bindings: [Binding!]
  binding(key: String!): Binding
  encode: File!
}
type Binding {
  key: String!
  as[Type]: [Type]!
}
#

@vital copper @hallow steppe I am calling it Environment, call it a gut feeling... Maybe this Environment type is not just for LLMs... Maybe it's the missing piece for streamlining shell/llm integration. And perhaps also... for this? 😇 https://github.com/dagger/dagger/issues/9584

hallow steppe
#

Nice! So to oversimplify it, an Environment is a collection of objects

dark maple
#

Yes, it's a collection of bindings, which are basically named objects

#

(if you want to split hairs, bindings can be to any value in the Dagger API - which could be an object, scalar or array)

hallow steppe
#

ah right

dark maple
#

but I've used "object" to sometimes mean "value"

#

that part is definitely left ambiguous at the moment
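
To make the "collection of bindings" idea concrete, here is a minimal Go sketch. It assumes hypothetical `Environment` and `Binding` types, not the actual Dagger SDK, and stores values as `any` because the object/value distinction is still open at this point in the thread:

```go
package main

import "fmt"

// Binding is a named value. Value is `any` because, per the discussion,
// a binding may hold an object, a scalar, or an array.
type Binding struct {
	Key   string
	Value any
}

// Environment is an immutable collection of bindings (hypothetical type,
// not the real Dagger SDK).
type Environment struct {
	bindings map[string]Binding
}

// WithBinding returns a copy of the environment with one more binding,
// mirroring the chainable with[Type]Binding style from the schema sketch.
func (e Environment) WithBinding(key string, value any) Environment {
	next := make(map[string]Binding, len(e.bindings)+1)
	for k, b := range e.bindings {
		next[k] = b
	}
	next[key] = Binding{Key: key, Value: value}
	return Environment{bindings: next}
}

// Binding looks up a binding by key; ok is false when the key is unset.
func (e Environment) Binding(key string) (Binding, bool) {
	b, ok := e.bindings[key]
	return b, ok
}

func main() {
	base := Environment{}
	env := base.WithBinding("assignment", "write a curl clone").
		WithBinding("ports", []int{80, 443})

	b, _ := env.Binding("assignment")
	fmt.Println(b.Key, "=", b.Value)

	// The original environment is untouched by the chained calls.
	_, ok := base.Binding("assignment")
	fmt.Println("base still empty:", !ok)
}
```

Returning a copy from `WithBinding` keeps the sketch in line with the immutable, chainable style used elsewhere in the API.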

hallow steppe
#

So my definition of Environment gets expanded based on my types (if we get self calling) and dependencies. So I have to be careful with that because I shouldn't use Environment as an argument or return type since I can't pass those types around

hallow steppe
#

I guess the same applies to LLM today

#

yep makes sense

dark maple
#

OK now get this

#

what if the shell added a builtin .save and .load for saving and loading the current environment to a standard file?

#

then we define a filename convention for dagger to load these files automatically at standard locations. CLI flags to load from custom location... Basically what we discussed for .env, but with Dagger env files instead

hallow steppe
#

mmm typed env

dark maple
#

Now, these files, unlike .env and other primitive environment files, are 1) typed, and 2) safe to share because they contain secret references, not values

#

you can even sync them via your Dagger Cloud org - theoretically

#

And... you can plug them straight into a llm

hallow steppe
#

I would probably still have my LLM configuration in regular env vars from my .bashrc (or similar) since it's global

dark maple
#

Then, we can add default values for all argument types in modules, starting with secrets

#
func New(
  // +default="$GH_TOKEN"
  token *dagger.Secret,
)

--> Loaded with dag.CurrentEnv().Binding("GH_TOKEN").AsSecret()

#

humans and robots sharing the same environment, co-existing peacefully (what could go wrong)

#

cc @blazing tangle @fading iron for MCP implications 👆

#

(added Environment.export() Environment.encode() 😛 )

dark maple
#

Yeah so far I've been assuming we would want to piggyback on the actual .env format, for familiarity and ubiquity. But, I tried in practice and there's a fatal flaw: secret references

#

While it's super convenient to save dagger secret references (eg. op://foo/bar) directly in the .env, and have dagger automatically resolve them... It's not sustainable because it breaks other tools

#

so we can either 1) truly co-exist in the .env ecosystem, or 2) take advantage of dagger secret references, but we can't do both

#

I choose 2 🙂

#

So this would be our own file format, our own filename, and it would not be compatible with .env

buoyant crescent
#

idk, feels like maybe a false choice, could we not have both .denv and a .env?

#

.denv does sound way more powerful, but .env sounds way more convenient (i already have those files and don't wanna keep them in sync with another format)

dark maple
#

Quite possible we could still support .env. But it would no longer be the bearing wall of our env configuration system. So for the purposes of designing that bearing wall, I am discarding it as a viable option

buoyant crescent
#

yeah agreed that makes sense to me, at the very least .env is very loosely coupled to the more exciting option lol

hallow steppe
#

trying to wrap my head around having a .denv with CTR_DAGGER_OBJ=$(container | from alpine) where I wouldn't have it in code.

If I have a frequent flow I'm doing in shell, I'll write a function for that. I don't want to automatically have a bunch of boilerplate bindings in the environment unless I really want LLM to see them

#

I guess it makes defaulting secrets really easy. So that is a win

dark maple
#

@hallow steppe yeah the goal is not to replace your script logic with a serialized object (although it's intriguing that those two overlap). It's more for those snippets of script that are really env-specific configuration, like GH_TOKEN=$(secret op://sdfkjhsdkjfhsdjf/sdfdsf) or DOCKER_SOCKET=$(host | unix-socket /var/run/docker.sock)
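
A rough Go sketch of what "references, not values" could look like when reading such a file. The line syntax is borrowed from the shell snippets above, and the `Entry` type, the kind names, and the `op://vault/item/token` URI are all hypothetical; the real file format is undecided:

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is one line of a hypothetical .denv file.
type Entry struct {
	Key  string
	Kind string // "string" or "secret-ref" (illustrative names only)
	Raw  string // literal value, or the secret URI; never a resolved secret
}

// parseLine parses KEY=value or KEY=$(secret URI). A secret is kept as a
// reference so the file itself never contains the plaintext value.
func parseLine(line string) (Entry, error) {
	key, val, ok := strings.Cut(line, "=")
	if !ok || key == "" {
		return Entry{}, fmt.Errorf("invalid line: %q", line)
	}
	if strings.HasPrefix(val, "$(secret ") && strings.HasSuffix(val, ")") {
		uri := strings.TrimSuffix(strings.TrimPrefix(val, "$(secret "), ")")
		return Entry{Key: key, Kind: "secret-ref", Raw: strings.TrimSpace(uri)}, nil
	}
	return Entry{Key: key, Kind: "string", Raw: val}, nil
}

func main() {
	lines := []string{
		"GH_TOKEN=$(secret op://vault/item/token)",
		"REGION=eu-west-1",
	}
	for _, l := range lines {
		e, err := parseLine(l)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s %s\n", e.Key, e.Kind, e.Raw)
	}
}
```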

hallow steppe
#

Nice. if it's committed, is it tied to a module's context, or is it only relevant for the client's local shell?

dark maple
#

technically if a hacker can get you to 1) load their env file, and 2) run their module, then the evil module could collaborate with the evil env file to steal secrets and files via default args

#

Default args will probably require prompting / trust system (which luckily we just built for LLM access 🙂)

#

Module github.com/superhacker/l33th4x0r requests access to binding GH_TOKEN (type Secret). Accept?
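
A small Go sketch of that gate, assuming a hypothetical `Approver` callback in place of a real interactive prompt and a plain map in place of the environment. None of these names come from the Dagger codebase:

```go
package main

import (
	"errors"
	"fmt"
)

// Approver decides whether a module may read a binding. In a CLI this would
// prompt the user; here it is a plain function so the sketch stays testable.
type Approver func(module, key, typeName string) bool

// ErrDenied is returned when the approver rejects the request.
var ErrDenied = errors.New("access to binding denied")

// resolveDefault returns the binding value for a default argument only if
// the binding exists and the approver accepts the request.
func resolveDefault(env map[string]string, module, key, typeName string, approve Approver) (string, error) {
	val, ok := env[key]
	if !ok {
		return "", fmt.Errorf("binding %q not found", key)
	}
	if !approve(module, key, typeName) {
		return "", ErrDenied
	}
	return val, nil
}

func main() {
	env := map[string]string{"GH_TOKEN": "op://vault/item/token"}

	// Stand-in policy: print the request, accept one known module only.
	allowTrusted := func(module, key, typeName string) bool {
		fmt.Printf("Module %s requests access to binding %s (type %s)\n", module, key, typeName)
		return module == "github.com/example/trusted"
	}

	_, err := resolveDefault(env, "github.com/superhacker/l33th4x0r", "GH_TOKEN", "Secret", allowTrusted)
	fmt.Println("result:", err)
}
```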

hallow steppe
#

Got it. I was thinking maybe a module could define its base images in a .denv but it sounds like no

dark maple
#

aah I see

#

well maybe a module could ship with a default .denv?

hallow steppe
#

maybe, now it feels like things are getting complicated 😬 maybe sticking to the local .env style is better

dark maple
#

or do you mean having our own .denv, but managing it just like a .env?

dark maple
#

yeah simpler for sure

#

we can always explore extra conveniences later

hallow steppe
#

I feel like we got pretty far from the original Environment proposal 🤣

dark maple
#

Not really - it just seems to all connect really well

#

(lots of hypothetical layers built on top for sure)

#

So we got pretty far in the same way the 30th floor is far from the basement 🙂 But still the same building, foundation is strong!

hallow steppe
#

Yeah totally, but we didn't hash out the design of Environment itself, unless everyone's good with it as-is!

dark maple
#

I'm just waiting for bikeshedding 😛

dark maple
#

what if we tied the current query object (with available modules) to the Environment? could streamline how we do "llm.withQuery" @vital copper

#

does the shell have access to "dag.CurrentModule()"? is that conceptually what we should give the environment?

#

I don't think you can access the core API from there, but it would make sense since different modules can have different views of the core API

#

So conceptually it makes sense to query the core API as seen by a particular module

cunning idol
#

I didn't expect to find this conversation in the agent channel but I'm pleasantly surprised to see all the discussions around .env and .denv. That would be amazing to have! Right now every single command to my modules has password, user and sometimes token parameters that come from the host env. Would be great DX to externalize that to a .env

dark maple
#

I like .env files too but they are a bandaid for secrets since you have to store the plaintext value in the file

#

.denv (name tbd) would be that superior format - would that work for you, or are you locked into .env specifically?

cunning idol
#

No, I am not locked in. I may actually prefer the more powerful format which can also be checked in to the repo? For example, I'd want dagger to resolve MYSECRET=env:MYPASSWORD even in CI, if not provided via CLI flag. Assuming the CLI flag takes precedence. That will let us keep the CI calls less verbose too.

#

I in fact created a module to ingest .env (read and set as env var) in my workflow because we have various other env vars (secret and non-secret) that we want to pass into the module.

plucky ermine
#

I kind of like sticking with .env; that’s a known format that IDEs know and support, and it’s widely adopted. The thing that’s tricky here is that .env was designed not to be checked into your git repo, and was made to tweak your app to act like it’s running in an environment where those values are set as “real” environment variables

#

To somewhat counter that, the Symfony project has a built in way to encrypt those values and store them in the repo (it references the vars as PHP files (or YAML) but same concept). It requires a single key on the environment to decrypt the values at runtime.

I could see that as maybe a way for us to "checkin" what variables the Dagger CLI can access from the environment - I know we sandbox what is sent from the host machine - without having to be so verbose on what we pass to shell?

https://symfony.com/doc/current/configuration/secrets.html

#

That specific example kind of overly complicates the existing secrets support, but the .denv example made me think of it

dark maple
#

taking a stab at this

dark maple
#

resuming... @vital copper should I start from your "eval" branch?

vital copper
#

@dark maple yeah go for it

#

it's getting there, just integrating the ambient var access stuff now

#

had green CI earlier

dark maple
#

Nice, are you aiming to merge today?

vital copper
#

yep!

#

🤞

vital copper
#

just getting caught up - love the idea, my biggest question is whether this file will be committed to the repo or not. I agree that since part of the point is to avoid secrets being in plaintext, it's now safe to commit, but I suspect in practice people will need local overrides, and if it's committed to the repo there's a risk of people accidentally committing their local changes. So I think we'll want to have an obvious way to have local overrides. Maybe .denv.local?
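
One way the override order could work, sketched in Go. The layering (committed file, then local file, then CLI flags, later layers winning) is an assumption pieced together from this thread, and the file names are the ones floated here rather than a settled convention:

```go
package main

import "fmt"

// merge layers binding sets; later layers win over earlier ones.
func merge(layers ...map[string]string) map[string]string {
	out := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	// .denv: committed to the repo, holds shared references.
	committed := map[string]string{
		"REGISTRY": "ghcr.io",
		"GH_TOKEN": "op://team/gh/token",
	}
	// .denv.local: ignored by git, holds per-developer overrides.
	local := map[string]string{"GH_TOKEN": "op://personal/gh/token"}
	// CLI flags: highest precedence.
	flags := map[string]string{"REGISTRY": "localhost:5000"}

	env := merge(committed, local, flags)
	fmt.Println(env["REGISTRY"], env["GH_TOKEN"])
}
```

All values above are made-up placeholders; secrets stay references at every layer.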

vital copper
#

any major hurdles yet?

#

oh also - is there any gelling between "environment" and "context"? we have "contextual dirs", maybe they could play off environments somehow?

#

this is mostly thesaurusly motivated

thorn crypt
#

Environment will just be another object (a la container or llm), right?

dark maple
#

it would be the "matrix" through which the llm interacts with all other objects

#

so not visible as an object itself

#

used in a sentence: "my code instantiates a LLM then binds two containers to its environment, plus one secret and one string containing its assignment

thorn crypt
#

and it'll only be used with LLMs?

dark maple
#

@vital copper going to push a buildable version of my branch soon. I'm taking a more cautious approach, first pushing a "naive split" from main, where I preserve as much as possible of your API. Then on top of that, we can move more things around

#

example: I kept "vars" and "objects" separate, but think they can be merged under "bindings" as a follow-up

vital copper
#

sounds good!

dark maple
#

Gotta disappear for calls - back in 1h or so

#

@vital copper sorry I didn't have time to actually rebase on your "remove variables" change. But I manually incorporated most of it. Will get to it right after my calls

#

OK almost done with that 👆

#

rebased & pushed

vital copper
#

have to head to a 🏀 game soon so approving for when that's fixed

dark maple
#

back from quick lunch

#

fixing

#

Why the hell is it String! with an !, graphql type?

dark maple
#

@vital copper thought I fixed it, but still fails. Pushing anyway..

#

It's the same test, but different failure I think

#

Does this look correct?


func (b *Binding) AsString() (string, bool) {
    return dagql.UnwrapAs[string](b.Value)
}
vital copper
#

hmmm it might not be, I've never used it to unwrap + type convert in one go

#

i don't think that'll work, no

#

try UnwrapAs with dagql.String first, and then going from there
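
A self-contained Go sketch of why the two-step approach helps. It uses local stand-ins (`String`, `Wrapper`, `Named`, `unwrapAs`) rather than the real dagql package, and assumes the unwrap helper only matches exact types, as the reply above suggests:

```go
package main

import "fmt"

// String stands in for a wrapped scalar such as dagql.String
// (local stand-in; the real type lives in the dagql package).
type String string

// Wrapper models a value that wraps another value.
type Wrapper interface{ Unwrap() any }

// Named is a hypothetical wrapper used only for this example.
type Named struct {
	Name  string
	Inner any
}

func (n Named) Unwrap() any { return n.Inner }

// unwrapAs peels wrappers until it finds a T. It only matches exact
// types, which is why asking for plain `string` misses a String.
func unwrapAs[T any](v any) (T, bool) {
	for {
		if t, ok := v.(T); ok {
			return t, true
		}
		w, ok := v.(Wrapper)
		if !ok {
			var zero T
			return zero, false
		}
		v = w.Unwrap()
	}
}

// asString unwraps to the wrapped scalar type first, then converts.
func asString(v any) (string, bool) {
	s, ok := unwrapAs[String](v)
	if !ok {
		return "", false
	}
	return string(s), true
}

func main() {
	val := Named{Name: "assignment", Inner: String("build it")}
	_, direct := unwrapAs[string](val)
	s, twoStep := asString(val)
	fmt.Println(direct, twoStep, s)
}
```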

dark maple
#

side note these llm tests seem pretty brittle with the hardcoded replay data that encodes specific tool calls etc

vital copper
#

yeah you have to run a command to update them sometimes

#

if you need it: dagger call test update --pkg=./core/integration --run="TestLLM" --env-file=file://$PWD/.env -o . (I don't think you should though based on your change)

dark maple
#

re-running test...

dark maple
#

everything passes except rust-sdk tests? 🤷‍♂️

dark maple
#

@fading iron @blazing tangle @buoyant crescent I'm going to merge part 1, just to keep things moving. All tests are green

#

Merged

buoyant crescent
#

the only obvious alternatives are setting up the mock responses in code (probably more brittle, at least in the sense that you've gotta change the mocks and their impls whenever you change the actual code) or actual e2e testing (subject to an obscene amount of nondeterministic behavior from the models)... if things eventually calcify i could see it being nicer to just be like mockLLMModel.ExpectPrompt($userInput); mockLLMModel.StubToolCall(tool, params, returns) but for now the replay thing is wayyyyyy more amicable to underlying shit changing rapidly

dark maple
#

Part 1 merged

vital copper
# dark maple https://github.com/dagger/dagger/pull/10007

looks good so far! do you have an idea for how to get the return value(s?) out, to replace the LLM.<type> getters?

relatedly, I did some experimenting in https://github.com/dagger/dagger/pull/9978 to support -i/--interactive, which might inform the design somehow since it involved changing those getters.

I made it so that calling LLM.<type> sets <Type> as the desired return type before calling .sync, and then if the model exits its loop (stops talking/asks a question) without returning that type, and you've run with -i, it prompts through the CLI and lets you continue the conversation.

demo: https://asciinema.org/a/8jHjMd4z8tdPhDbt1SdYFXvnS

I also tried adding an explicit returnFoo tool for returning a Foo type when the task was completed, but had trouble getting the model to consistently call it, so I ended up just checking that the final state matched the desired return type. But that's far from perfect - what if the type happened to match, but the model didn't actually complete its task? So I started piling more prompting into the currentSelection tool, and thought about adding a way to clear the selection to disambiguate that, but I'm not sure how reliable that would be.

dark maple
#

I feel like exposing the query tool as Query_xxx in the top-level must confuse the llm

vital copper
#

is that a side topic? just making sure im following lol

#

like exposing the full Query type?

#

there's a way it can get wedged at the moment, since once it selects another object it has no way to go back to Query

dark maple
#

Pushed a fix, thanks @hallow steppe for testing

dark maple
#

@vital copper FYI I'm looking at your other branch, will try to merge with mine, hopefully not too many conflicts

dark maple
#

Good morning @unkempt meteor, we're hitting a mysterious buildkit error in this branch, any chance you could help us debug? 🙏

#

It's in a branch that doesn't actually do anything buildkit related, we do move things around in the core schema, so the buildkit error might be an artifact of something stupid in the schema.. just not sure what

unkempt meteor
#

do you have a link?

#

i'm working from home today, so happy to help out on anything

dark maple
#

@unkempt meteor simplest way is to run engine from my PR:

dagger -m github.com/shykes/dagger@environment-api -c 'cli | binary --platform=current | export ./dagger-env-api; engine | service dev | up'
#

Then try loading any module:

_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ./dagger-env-api -m github.com/shykes/hello
#

Also could use a review on the PR while we're at it 🙂 🙏

unkempt meteor
#

looking now

#

do you need help getting ci in a green state? i can take that

dark maple
#

Just merged 10010 since you approved it @unkempt meteor . Need to rebase on top of it and get some testing of the new $agent asap

dark maple
#

thanks

unkempt meteor
#

will do - hopefully should clear y'all up to focus on rejekts/etc 😄

#

quick clarification - are we aiming to merge every open llm pr for the release later today? just going to milestone them all

dark maple
#

@unkempt meteor the release is severely at risk, because we have unresolved design decisions & testing, and it's been very hard to coordinate them because of travel & timezone difference

#

ie. Alex and I are currently developing in parallel and have had no chance to actually sync since friday

#

I'm sitting with @hallow steppe trying to reconcile

unkempt meteor
#

okay, anything you need, shout - i can hop into a voice-channel if needed

hallow steppe
#

trying to dagger develop with the branch above fails to load any module, even python-sdk

time="2025-03-31T09:12:50Z" level=error msg="solve error: process \"codegen --output /src --module-source-path /src --module-name python-sdk --introspection-json-path /schema.json\" did not complete successfully: exit code: 1\ngithub.com/moby/buildkit/solver.(*edge).execOp\n\t/go/pkg/mod/github.com/dagger/buildkit@v0.0.0-20250128235329-9c8ee9e867a5/solver/edge.go:912\ngithub.com/moby/buildkit/solver/internal/pipe.NewWithFunction[...].func2\n\t/go/pkg/mod/github.com/dagger/buildkit@v0.0.0-20250128235329-9c8ee9e867a5/solver/internal/pipe/pipe.go:78\nruntime.goexit\n\t/usr/lib/go/src/runtime/asm_arm64.s:1223" client_hostname=Donnager.local client_id=r1rv9ziyw8vxm4h39ss6oidzv function= module=python-sdk session_id=cl3p6y7xfwadbne9ry4nlo7s8 spanID=2e0091c31156b592 traceID=a0f711204960cb8228ecb925ad7747fb
dark maple
#

Just rebased on main FYI

unkempt meteor
#

huh we're having bugs in codegen???

unkempt meteor
#

really no idea why that's suddenly appeared, looks like we're generating bad code

#

detective time

#

fyi, i do like the new bindings model a lot actually

#

feels nice to have llm as a "real" static type

dark maple
#

haha yeah it feels cleaner 🙂

#

also paves the way for Environment to be more broadly used, not just for LLMs

unkempt meteor
#

yea that's got cool implications

dark maple
#

One unresolved design issue we have, is that the current environment API assumes the LLM and caller exchange data by setting bindings.

But in parallel we have been discussing changes to LLM->caller sharing, because it's annoying to always have to prompt the model with instructions like "make sure to write the result to the variable 'foo', that's f-o-o, don't forget!"

unkempt meteor
#

it's binding.type that's causing the issue

#

type is a reserved name in go

#

i can get the go sdk to handle this
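
For illustration, a generic Go sketch of how a code generator can sidestep reserved words. `token.IsKeyword` is part of the standard `go/token` package; the underscore suffix and the `safeIdent` helper are assumptions, not what the Go SDK codegen actually does:

```go
package main

import (
	"fmt"
	"go/token"
)

// safeIdent returns name unchanged unless it is a Go keyword, in which
// case it appends an underscore. The suffix convention is an assumption.
func safeIdent(name string) string {
	if token.IsKeyword(name) {
		return name + "_"
	}
	return name
}

func main() {
	for _, n := range []string{"type", "key", "func"} {
		fmt.Println(n, "->", safeIdent(n))
	}
}
```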

dark maple
#

aaarg

#

we can change to TypeName also

#

it's not a super-super important function in the API. More for convenience

#

Most clients won't call it

#

So maybe not worth pulling out the big guns

unkempt meteor
#

i'll rename it for now to unblock? but i'll also keep hacking

unkempt meteor
#

what's the plan for updating docs to match this new style? are we good on that front?

hallow steppe
#

Yep on it 👍

dark maple
#

With the caveat that the design is not complete, because of the unresolved UX issue

unkempt meteor
#

fixed + pushed

#

gonna work on fixing ci as well

dark maple
#

@vital copper double-checking for when you start your day: at the moment there is no way for the LLM to pass data back to the caller?

vital copper
#

And that was with a single return, not multiple named bindings, if that's what we want

dark maple
#

@vital copper what about auto-chaining ("auto-saving") of selected objects by default? I remember we discussed that, did you try?

vital copper
#

Like having them update in place?

dark maple
#

right yeah

#

don't remember where we left that

vital copper
#

I had trouble figuring out how to frame it so that led me to writing evals

dark maple
#

Ah!

#

OK. Today we're going to try and run your evals for real

vital copper
#

The bit that feels awkward is when it transitions types

hallow steppe
#

Can it happen in the bbi/mcp rather than making the llm call _save or whatever

vital copper
#

Like ctr becomes file when you get the resulting compiled binary

dark maple
#

I think we would only auto-save when it's the same type (like in original single obj implementation)

vital copper
#

Generally not a fan of values being reassigned esp if it pollutes back into shell

hallow steppe
#

thats ok, i only want it to happen for chainable functions that return the original type

hallow steppe
#

so yes for this: Workspace.exec(command: "go mod init go-curl"): Workspace!
not for this Workspace.getExecOutput: String!

#

the main use case we can't live without is mutating our input objects. So there's probably a bunch of other cases where chained objects would get dropped but that's the case for single object too
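
The rule being proposed (re-assign a binding only when a tool call returns the same type it started from) can be sketched in Go as follows. `Object`, `Env` and `autoSave` are hypothetical stand-ins and the type names are plain strings for simplicity:

```go
package main

import "fmt"

// Object is a stand-in for a typed Dagger value.
type Object struct {
	Type  string
	State string
}

// Env maps binding names to objects (simplified, hypothetical).
type Env map[string]Object

// autoSave rebinds name to result only when the tool call returned the
// same type as the object it was called on; otherwise env is left alone.
func autoSave(env Env, name string, result Object) bool {
	current, ok := env[name]
	if !ok || current.Type != result.Type {
		return false
	}
	env[name] = result
	return true
}

func main() {
	env := Env{"workspace": {Type: "Workspace", State: "empty"}}

	// Workspace.exec(...): Workspace! is chainable, so it is saved.
	fmt.Println(autoSave(env, "workspace", Object{Type: "Workspace", State: "go mod init done"}))

	// Workspace.getExecOutput: String! changes type, so it is not saved.
	fmt.Println(autoSave(env, "workspace", Object{Type: "String", State: "ok"}))

	fmt.Println(env["workspace"].State)
}
```

The save happens when the tool response comes back, so the model never has to call a save tool itself.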

dark maple
#

Trying to find where to plumb this in, to try

#

(code changed a lot, I'm struggling to find the right place to plumb this in)

#

I guess selectionToToolResult?

#

Yeah I'm going to need your help @vital copper

#

Furniture got moved around I don't recognize the room 😛

#

I agree it's not a panacea BUT @hallow steppe is right that it would at least give us parity with single-object and a good checkpoint for testers to use while we keep designing something better

vital copper
#

what is the proposed code pattern? like how does this translate?:

dag.LLM().
  SetWorkspace("work", ...).
  WithPrompt("do your thing with $work").
  Workspace()
hallow steppe
#
    before := dag.Workspace(dagger.WorkspaceOpts{
        BaseImage: "golang",
        Context:   dag.Directory(),
        Checker:   "go build ./...",
    })

    env := dag.Environment().
        WithWorkspaceBinding("workspace", before).
        WithStringBinding("assignment", task)

    // Give the workspace to the LLM
    coder, err := dag.LLM().
        WithEnvironment(env).
        WithPromptFile(dag.CurrentModule().Source().File("system.txt")).
        WithPrompt(`
<assignment>
$assignment
</assignment>
        `).
        Sync(ctx)
    if err != nil {
        return nil, err
    }

    after := coder.Environment().Binding("workspace").AsWorkspace()

    // Return the container
    return after.Container(), nil
vital copper
#

i liked solomon's 4-liner better πŸ˜›

#

whered that go

dark maple
#

lol it was incorrect though

#

(forgot the withEnvironment part)

#

pseudo-code version:

dag.LLM().
 WithEnvironment(dag.Environment().WithWorkspaceBinding("workspace", ...)).
 WithPrompt("do your thing in your workspace").
 Environment().Binding("workspace").AsWorkspace()
#

Meanwhile we're discussing a simplification where we could make the binding name optional when setting it. Would still be multi-object internally, but by default we would just pick a name for you. You would only need to pick a name if you need to set multiple bindings of the same type

vital copper
#

whatever we do, we just have to be pretty confident that the model actually understands it (beyond like 50% success rate). it took a lot of iteration to get to where we are, so i'm a little bearish on a total paradigm shift just before releasing, unless mutation actually seems to bring a significantly higher baseline of understanding for one reason or another. but, we're going to be swimming upstream, since the entire Dagger model (all our GraphQL schema docs etc) is framed assuming immutability

#

like we saw how wild some models got just by introducing it to the concept of named variables; it's a delicate balance

dark maple
#

I understand but, we've been discussing the need to simplify the API for a while. We had to get evals first, but the point of the evals was to be able to iterate on the API

hallow steppe
#

also I don't think it needs to be up to the LLM to re-assign variables, that definitely gets too complicated. Just re-assign when a tool response comes back

vital copper
#

considering how long evals can take to run, across models with rate limits etc. etc., and how much variability there is inherent to the feedback loop

#

again i'm not against the switch, i'm just skeptical of the timing, especially if we think we'll need to iterate on the API some more anyway

dark maple
#

I agree that there's very little we can iterate on without pushing back the release

hallow steppe
#

Maybe I'm missing something though because we can't release if we can't get mutated objects out of LLM

vital copper
#

we can't release with the Environments API if we can't get mutated objects out

#

the Environments API is also predicated on being valuable in other ways, but are we shipping that at the same time?

dark maple
#

I think the following are realistic pre-release:

  • Simplify binding API by making the binding name optional (no impact on LLM interface, so no eval required)
  • Auto-save of bindings - does impact LLM interface, but simplifies it rather than complicating it, and we have prior experience of the pattern in single object, where it worked well

Everything else I can think of, would have to push back the release IMO

vital copper
#

or are we just trying to get an API change out of the way?

dark maple
#

Yeah the API change is the point - to avoid releasing then breaking API a week later

vital copper
#

btw i use zed now if you wanna try the collaboration thing

dark maple
#

oh!

vital copper
#

connor and i used it, it was pretty nifty

#

my username is vito

dark maple
#

nice 🙂

#

sidenote we have a shared slack with MCP team now

#

and incidentally with Zed team too

vital copper
#

@dark maple if there's interest I can try a parallel track, around pre-declaring "desired bindings" for the model to fill in. when I spiked on it I liked the DX, and it might be a smaller shift from where we are now, main question is whether the models will respect it

#

(brb, running around a lot today, contractors are here)

dark maple
#

ok we're going to try a mini-spike on our end, ping when you're back?

vital copper
#

@dark maple back

dark maple
#

@vital copper want to chat in team-audio?

vital copper
#

sure

dark maple
#

Notes:

  • We're moving forward on environment API branch. We have a list of changes we want to make before release
  • LLM input: I would prefer a way for named bindings to be discoverable by the LLM. (at the moment you need to give their exact ID to the LLM, by variable expansion in the prompt). --> @dark maple will spike
  • LLM output: since environment API does not expose the LLM's current selection (left as an internal MCP implementation detail), we need a way for the LLM to "return" a value to the caller. --> @vital copper will spike
  • Function masks. We need a way to restrict what the LLM sees, to help it not get lost. There are several ways to do this. We agreed to focus on 2:
    1. Manually mask functions from core types (container, directory) to reduce the baseline cost
    2. if possible, change MCP implementation to only expose "real" module functions, and hide field getters. cc @unkempt meteor @merry helm we need help on this 😛
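
A minimal Go sketch of a function mask, modeled on the includeFunctions/excludeFunctions arguments from the schema draft earlier in the thread. The exact semantics (include applied first, then exclude) are an assumption, and `visible` is a hypothetical helper:

```go
package main

import "fmt"

// visible filters function names: if include is non-empty only those are
// kept, then anything in exclude is dropped.
func visible(all, include, exclude []string) []string {
	inc := toSet(include)
	exc := toSet(exclude)
	var out []string
	for _, fn := range all {
		if len(inc) > 0 && !inc[fn] {
			continue
		}
		if exc[fn] {
			continue
		}
		out = append(out, fn)
	}
	return out
}

// toSet turns a list of names into a lookup set.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

func main() {
	container := []string{"from", "withExec", "stdout", "terminal", "publish"}

	// Hide a few functions from the model.
	fmt.Println(visible(container, nil, []string{"terminal", "publish"}))

	// Or expose only an explicit allowlist.
	fmt.Println(visible(container, []string{"from", "withExec"}, nil))
}
```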
vital copper
#

the nice thing about this is it'll naturally gel with -i for troubleshooting - if the model ends its turn without returning, we call for help

dark maple
#

@vital copper does it work even without prompting it specifically to "return this or that"?

vital copper
#

yep

dark maple
#

aweaome

#

awesome even

vital copper
#

the full prompt is WithPrompt("Mount $repo into $ctr, set it as your workdir, and build ./cmd/booklit with CGO_ENABLED=0.") - the rest just comes from the tool framing

dark maple
#

Nice. I was going to ask about returning multiple values but I guess that's included

vital copper
#

this description ends up on the bin arg description for example:

WantFileBinding("bin", "The compiled Booklit binary."),
#

yep

dark maple
#

Did you add a description field to Binding? Was thinking we could expose it for regular bindings also

vital copper
#

for now i just added a separate type, but yeah i think it could make sense there. only reason I did a separate type was to avoid a "sometimes nil Binding"

#

need to bikeshed terminology too - "WantFileBinding" feels awkward ("Want" vs "With" too subtle)

#

gonna try updating the other evals to use this and see how it feels

dark maple
#

@vital copper do you want to allow the LLM reading from the same LLM it returns to?

vital copper
#

oh like accessing the same bindings it returned, as bindings, for subsequent turns?

dark maple
#

yeah, do the "input bindings" and "output bindings" share the same namespace?

#

we're facing this question right away with @hallow steppe's reference example (toy workspace in, toy workspace out)

vital copper
#

that's what i've done so far in my spike, yeah. which took a bit of hacking, right now MCP just directly mutates:

                // TODO: is it appropriate to just mutate directly here?
                m.env.objsByName[name] = &Binding{
                    Key:   name,
                    Value: obj,
                    env:   m.env,
                }
dark maple
#

OK, so it makes sense to call them bindings then

vital copper
#

hmm maybe

#

alternatively we could have an LLM have an 'output environment' or something, but it seems like it'd translate to the same number of hops either way

#

oh, or we could do LLM.output("foo"): Binding!

#

then you don't need the extra LLM.ENVIRONMENT.binding call

#

we can decide either tbh

#

i think the DX should drive it

dark maple
#

so how about:

env := dag.Environment().
  WithToyWorkspaceBinding("workspace", dag.ToyWorkspace()).
  WithStringBinding("assignment", assignment)

dag.LLM().
  WithEnvironment(env).
  WithPrompt("do your thing").
  Loop(dagger.LLMLoopOpts{MustReturn: []string{"workspace"}}). // 👈
  Environment().
  Binding("workspace").
  AsToyWorkspace()
vital copper
#

hmm feels like a rough edge. and i think losing the description will hurt

#

(both in the model and in your own later understanding of the same code)

#

the descriptions are kind of nice since they're self-documenting and become part of your prompt

vital copper
#

the workspace example is a good stressor of the load-bearing descriptions model though since it might be hard to describe an entire workspace's desired state

#

here's the full example where it felt natural to me:

m.LLM().
    WithEnvironment(
        dag.Environment().
            WithDirectoryBinding("repo",
                dag.Git("https://github.com/vito/booklit").Head().Tree()).
            WithContainerBinding("ctr",
                dag.Container().
                    From("golang").
                    WithMountedCache("/go/pkg/mod", dag.CacheVolume("go-mod")).
                    WithEnvVariable("GOMODCACHE", "/go/pkg/mod").
                    WithMountedCache("/go/build-cache", dag.CacheVolume("go-build")).
                    WithEnvVariable("GOCACHE", "/go/build-cache").
                    WithEnvVariable("BUSTER", fmt.Sprintf("%d-%s", m.Attempt, time.Now())),
            ).
            WantFileBinding("bin", "The compiled Booklit binary."),
    ).
    WithPrompt("Mount $repo into $ctr, set it as your workdir, and build ./cmd/booklit with CGO_ENABLED=0.")
#

really need an eval for the workspace pattern

vital copper
dark maple
#

OK then maybe we should call them with<Foo>Input and with<Foo>Output

vital copper
#

outputs would be described in terms of their desired state, inputs would be described in how they should be used

hallow steppe
#

hmm can I WithFooOutput(&foo)??

vital copper
#

i think it would be interesting if sometimes you end up not needing a prompt at all

#

because the descriptions cover everything
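A minimal sketch of that idea, assuming nothing about the real implementation: `Env`, `WithInput`, `WithOutput` and `Describe` below are hypothetical local stand-ins (not the Dagger SDK), and the two input descriptions are invented for illustration. It only shows how described inputs and outputs could be rendered into prompt text, so that a separate prompt might become unnecessary.

```go
package main

import (
	"fmt"
	"strings"
)

// binding is a hypothetical record for one named input or output.
type binding struct {
	name, typeName, description string
}

// Env is a local stand-in for the proposed environment type, not the Dagger SDK.
type Env struct {
	inputs, outputs []binding
}

// WithInput returns a copy of the environment with one more described input.
func (e Env) WithInput(name, typeName, description string) Env {
	e.inputs = append(append([]binding{}, e.inputs...), binding{name, typeName, description})
	return e
}

// WithOutput returns a copy of the environment with one more desired output.
func (e Env) WithOutput(name, typeName, description string) Env {
	e.outputs = append(append([]binding{}, e.outputs...), binding{name, typeName, description})
	return e
}

// Describe renders every description as prompt text: inputs say how they
// should be used, outputs say what state is desired.
func (e Env) Describe() string {
	var b strings.Builder
	b.WriteString("Inputs:\n")
	for _, in := range e.inputs {
		fmt.Fprintf(&b, "- $%s (%s): %s\n", in.name, in.typeName, in.description)
	}
	b.WriteString("Outputs:\n")
	for _, out := range e.outputs {
		fmt.Fprintf(&b, "- $%s (%s): %s\n", out.name, out.typeName, out.description)
	}
	return b.String()
}

func main() {
	// The input descriptions here are illustrative assumptions.
	env := Env{}.
		WithInput("repo", "Directory", "Source tree to mount into $ctr.").
		WithInput("ctr", "Container", "Go toolchain container to build in.").
		WithOutput("bin", "File", "The compiled Booklit binary.")
	fmt.Print(env.Describe())
}
```

Whether the rendered text would go into the system prompt, the tool descriptions, or both is left open here.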

dark maple
vital copper
dark maple
#

assignment ✅ workspace ✅ documentation ✅

#
  • "how do you set the prompt?"
  • "what do you mean?"
#
type LLM {
  withPrompt(prompt: String="do your thing"): LLM!
}
hallow steppe
vital copper
#

llm.Output("bin").AsFile() would help

hallow steppe
#

sounds good

hallow steppe
vital copper
dark maple
#

Are you OK with llm.Environment().Output("bin").AsFile()?

#

having LLM type separate from its environment has several benefits

#

The extra intermediary call I think is worth it - it brings structural clarity IMO

#

also makes it easier to plug in external MCP

#

(otherwise MCP server has a dependency on LLM which is weird)

vital copper
#

strawman: what if we shortened it to Env? LLM.withEnv, LLM.env

I know the pattern is to not abbreviate things, but we already have Container.withEnvVariable. (confusing example for multiple reasons, including my own arguing in the past that it should just be called withEnv for DX)

hallow steppe
#

I do love typing less

vital copper
vital copper
hallow steppe
#

woah its the guy from the emoji

#

updated quickstart snippet

    environment := dag.Env().
        WithToyWorkspaceInput("before", dag.ToyWorkspace()).
        WithStringInput("assignment", assignment).
        WantToyWorkspaceOutput("after")

    return dag.LLM().
        WithEnv(environment).
        WithPrompt(`
            You are an expert go programmer. You have access to a workspace.
            Use the default directory in the workspace.
            Do not stop until the code builds.
            Complete the assignment: $assignment
            `).
        Env().Output("after").AsToyWorkspace().Container()

original https://docs.dagger.io/ai-agents/quickstart#edit-the-agent-file

dark maple
#

@vital copper which parts of all this do you want to take? I have a few hours of free time ahead of me

vital copper
#

probably easiest to just take all of it and push a first pass to your PR, we decided on a lot but the code change isn't that big (unless I'm forgetting something)

#

biggest change was adding return but that's working already (so far...)

#

the list:

  • Environment -> Env (everywhere? including type name?) (trivial I think?)
  • withFooBinding -> withFooInput (trivial)
  • adding withFooOutput (done)
  • adding return (done)
hallow steppe
#

sweet once you have that pushed I'll build it and work on updating docs

vital copper
#

ah another one:

  • adding descriptions to inputs
#

biggest question there is whether it's a required argument

hallow steppe
#

optional makes sense I think. I probably wouldn't need a description for the string var that's just used for prompt substitution

#

tbh i don't even know if I like the substitution for things like $assignment anymore. I might just concat it on the prompt string

#

with that change I'm less sure on optional description

#
// Write a Go program
func (m *CodingAgent) GoProgram(
    // The programming assignment, e.g. "write me a curl clone"
    assignment string,
) *dagger.Container {
    environment := dag.Env().
        WithToyWorkspaceInput("before", dag.ToyWorkspace(), "the coding workspace to complete the work").
        WantToyWorkspaceOutput("after")

    return dag.LLM().
        WithEnv(environment).
        WithPrompt(`
            You are an expert go programmer. You have access to a workspace.
            Use the default directory in the workspace.
            Do not stop until the code builds.
            Complete the assignment: ` + assignment).
        Env().Output("after").AsToyWorkspace().Container()
}
dark maple
#

catching up sorry

hallow steppe
#

nvm i got unconvinced

dark maple
#

Meanwhile I'm still going to work on making inputs discoverable without var expansion

dark maple
#

I love that even the tool setup looks like "structured prompting"

hallow steppe
#
// Write a Go program
func (m *CodingAgent) GoProgram(
    // The programming assignment, e.g. "write me a curl clone"
    assignment string,
) *dagger.Container {
    environment := dag.Env().
        WithToyWorkspaceInput("before", dag.ToyWorkspace(), "these are the tools to complete the task").
        WithStringInput("assignment", assignment, "this is the assignment, complete it").
        WantToyWorkspaceOutput("after")

    return dag.LLM().
        WithEnv(environment).
        WithPrompt(`
            You are an expert go programmer. You have access to a workspace.
            Use the default directory in the workspace.
            Do not stop until the code builds.`).
        Env().Output("after").AsToyWorkspace().Container()
}
vital copper
#

or maybe description falls back to the name?

hallow steppe
#

I'm super afraid of typing dagger.WithToyWorkspaceInputOpts{ "blah" }

vital copper
#

lol yeah

dark maple
#

@hallow steppe don't worry it will actually be dagger.EnvWithToyWorkspaceInputOpts{ "blah" }

vital copper
hallow steppe
#

brb renaming all of my modules to 3 letter acronyms

vital copper
#

dagger.EWTWIO :guy_fieri_chef_kiss:

dark maple
#

@vital copper heads up rebasing on main + force-pushing

vital copper
dark maple
#

@vital copper does it seem feasible to you to move "withQuery" or equivalent to Environment itself?

#

Would be kind of cool if you could call dag.Env(dagger.EnvOpts{Privileged: true})

#

or perhaps even cooler: dag.CurrentEnv()

#
env(): initialize a new empty environment
currentEnv(): retrieve the current environment, including core API access and current module
vital copper
#

you might actually be able to assign a Query as an input :notsureif:

#

not sure how else it would move there, since environments are just a set of named values in the end

dark maple
#

Well I've been considering since the beginning of this thread that "access to core API" and "access to current module" would be first-class properties of an environment

#

since they are also first-class properties in the shell's environment, for example

#

(and presumably they would also be recorded when we save environments to a file)

vital copper
#

@dark maple @hallow steppe pushed Environment -> Env

dark maple
#

what's LLM.attempt()?

vital copper
vital copper
# dark maple what's `LLM.attempt()`?

a bit of a hack, just a cache buster, added it in to support running evals N times in parallel without them just getting deduped. we should bikeshed that

dark maple
#

@vital copper should I assume Environment.Bindings remains, or is it about to disappear and be replaced by Inputs and Outputs?

hallow steppe
#

where'd we land on descriptions being optional?

vital copper
#

but it might not matter much since the syncing is bidirectional

#

like it might be redundant to sync inputs back, but i don't think it would be harmful, since any local changes made to them would have been synced to the LLM before it started its turn

#

so they might be skipped anyway because the digest is the same
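A rough sketch of why syncing unchanged inputs back would be harmless, under stated assumptions: each side is modeled as a plain map from binding name to content digest, and `digestOf` and `syncBindings` are hypothetical helpers, not engine code. SHA-256 stands in for whatever digest the engine actually uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// digestOf is a stand-in for whatever content digest the engine assigns to a value.
func digestOf(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// syncBindings copies name -> digest entries from src into dst, skipping
// entries whose digest already matches. It returns the sorted names written.
func syncBindings(dst, src map[string]string) []string {
	var written []string
	for name, digest := range src {
		if dst[name] == digest {
			continue // unchanged: nothing to sync
		}
		dst[name] = digest
		written = append(written, name)
	}
	sort.Strings(written)
	return written
}

func main() {
	shell := map[string]string{
		"repo": digestOf("source tree"),
		"ctr":  digestOf("golang container"),
	}
	llm := map[string]string{
		"repo": digestOf("source tree"),              // input, untouched by the model
		"ctr":  digestOf("golang container + build"), // changed during the turn
		"bin":  digestOf("compiled binary"),          // new output
	}
	// Only the changed and new bindings are written back; repo is skipped.
	fmt.Println(syncBindings(shell, llm))
}
```

A second sync in the same direction writes nothing, which is the redundancy described above.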

vital copper
dark maple
#

What I mean is, in terms of API, we had:

type Environment {
  bindings()
  binding(...)
}

Now I guess it would become:

type Environment {
  inputs()
  input(...)
  outputs()
  output(...)
}
dark maple
vital copper
#

that's my not so secret other motivation yeah 😛

vital copper
#

i guess there could be scenarios where you want to inspect it ("dump the environment")

hallow steppe
#

propose

Env().ToyWorkspaceOutput("after").Container()

instead of

Env().Output("after").AsToyWorkspace().Container()
vital copper
#

the input vs. output is more of a setup-time distinction

dark maple
#

Easier if we keep it consistent - otherwise we have to explain 3 words instead of 2

#

My immediate concern was just iterating inputs so I could expose them as tools.

#

Don't worry about it I will just focus on getting it working as-is

vital copper
#

hmm i think it's hard to avoid a third word when your starting point is "input" and "output" but they fit the same shape

#

people might ask what they have in common

#

also, context dump: one issue I found with exposing them as tools, is sometimes their names sound like actions and fool the model into calling them instead of the actual function that does the thing. specifically in my case I ended up with a tool called eval that returned the string name of the eval to run, which it tried to call instead of running the Workspace_evaluate tool that it actually needed to call.

sort of a general risk around accepting wildcards into the top level namespace in our tool calling scheme

#

might be mitigated strongly with descriptions

dark maple
#

@vital copper I'm trying to make the "input tools" work the same as "function tools" in terms of auto-selecting the result

#

could you point me to the correct incantation? 🙂

vital copper
#
                prev := m.Current()
                m.Select(obj) // , args.Functions...)
                return m.currentState(prev)
#

the prev business is to support being given a single initial object as its state, so it can go back to it. is that gone now?

#

relates to the optional-binding-name topic

dark maple
#

yeah - at the moment binding names remain required. Tried to convince @hallow steppe that optional would be awesome, but once we pseudo-ported his example module, I unconvinced myself... because of go opts

#

And with "inputs & outputs" having names feels fine now

vital copper
#

is there an extremist angle where we get rid of the names entirely and just have required obj + required description?

#

i guess that'd harm my explicit-vars-in-prompts flow 😛

dark maple
#

oh I notice it's already copy-pasted in a few places

vital copper
#

yeah

hallow steppe
#

on the input descriptions
WithToyWorkspaceInput("before", workspace, "these are the tools to complete the task").
how does that work with the object's own description?

vital copper
#

like the type description/docs?

hallow steppe
#

yea

vital copper
#

hmm i don't think they're very high leverage since they're not contextual/task-dependent, but could be a good fallback

#

i guess for types like Workspace you could make them higher leverage

#

but then it's awkward to have the required arg

#

something like Container's description is probably not very helpful though

#

(i dont even know what it is)

hallow steppe
#

got it, at some version I was trying to keep my module descriptions optimized for the _select but I guess that doesn't matter now

dark maple
#

pushed "builtins as tools"

vital copper
#

i am generally in favor of any changes to reduce typing though

hallow steppe
vital copper
dark maple
#

I recommend starting with a simple manual evaluation because I didn't actually test it yet 😬

#

oh @vital copper we also need to fix prompt mode to work with the new api

vital copper
#

lol yeah just hit that

#

hmmmm i guess it'll have to keep getting the environment, syncing it, and re-setting it on the llm?

dark maple
#

good question, yeah it's less magically transparent now

#

also no descriptions

#

rebasing on main..

#

@vital copper we're heading for dinner

#

then back at it

vital copper
#

hmm just realized there's no way to declare desired outputs in prompt mode atm

#

also @dark maple I might have an answer for how it listed Container tools without selecting it after you assigned one: it probably peeked at the description of selectContainer, which lists all the tools

dark maple
vital copper
#

i'm trying a thing like, "build foo and return bar as $bin" - and allowing return to just take yolo arguments

#

that did not work at all 😂

#

it exported it instead

#

also, this was a little concerning

✔ ctr=$(container | from golang) 0.0s

✔ repo=$(git https://github.com/vito/booklit | head | tree) 0.0s

✔ build ./cmd/booklit and return the binary as $bin 22.1s
│🧑 build ./cmd/booklit and return the binary as $bin
│ ┃ 0.0s
│
│🤖 0.7s ◆ Input Tokens: 1,575 ◆ Output Tokens: 4
│ ✘ selectDirectory 0.0s
│ ! arg "id" does not match pattern ^Directory#\d+$: "repo"
│🤖 It seems I misinterpreted "repo" as a Directory ID. I should use the repo() function to access the user input directory. Then, I will create a container, mount the directory, and build the binary.
│ ┃ 0.8s ◆ Input Tokens: 1,601 ◆ Output Tokens: 46
│🤖 OK. Now I have the directory. I will create a container from alpine and copy the directory to it and build the binary.
│ ┃ 0.7s ◆ Input Tokens: 2,189 ◆ Output Tokens: 31
│ ✘ selectContainer 0.0s
│ ! arg "id" does not match pattern ^Container#\d+$: "ctr"
│🤖 I made the same mistake again. I should use the ctr() function to access the user input container.
│ ┃ 0.7s ◆ Input Tokens: 2,241 ◆ Output Tokens: 24
│🤖 Now I have the container. I will add the directory to the container and build the binary.
│ ┃ 0.8s ◆ Input Tokens: 6,832 ◆ Output Tokens: 35

tl;dr it tried to use repo and ctr like vars in arguments

#

(this is gemini-2.0-flash)

#

at the end there it called ctr but those getters don't have telemetry wired up atm

dark maple
#

oh so my binding tools thing works?

#

I couldn't get it to work in my manual test 😅

vital copper
#

it works, it's just not visible

vital copper
#

trying again

#

hmm, it tried but our plumbing isn't very amenable to this sort of thing (extra args of unknown types)

│🤖 Finally, I can return the file as the $bin result.
│ ┃ 0.6s ◆ Input Tokens: 1,723 ◆ Output Tokens: 17
│ ✔ return: String! 0.0s
│🤖 I need to call return with the bin parameter set to the ID of the file I extracted.
│ ┃ 0.7s ◆ Input Tokens: 1,744 ◆ Output Tokens: 27
│ ✘ return 0.0s
│ ! unknown arg: "bin"
│🤖 I apologize, I seemed to have reached the end of the interaction and there is no further action to take.
│ ┃ 0.6s ◆ Input Tokens: 2,402
│ ✔ return: String! 0.0s
│🤖 I am unable to return the file as $bin because the return function does not accept any arguments.
│ ┃ 0.5s

can keep pursuing if that seems like the right direction. I guess the fix would be to assume all args are valid and in the form Foo#123 and translate them to their real IDs
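One way that fix could look, as a sketch only: `resolveArgs` and the `ids` table are hypothetical, and the handle pattern is generalized from the `^Directory#\d+$` and `^Container#\d+$` patterns in the logs above. Every argument is assumed to be a `Type#N` handle and is swapped for the real ID it refers to.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlePattern matches model-facing handles such as "File#1".
var handlePattern = regexp.MustCompile(`^([A-Za-z_][A-Za-z0-9_]*)#(\d+)$`)

// resolveArgs replaces every "Type#N" handle in args with the real ID from
// ids. It rejects values that are not handles (for example a bare "repo")
// and handles that point at nothing.
func resolveArgs(args, ids map[string]string) (map[string]string, error) {
	resolved := make(map[string]string, len(args))
	for name, handle := range args {
		if !handlePattern.MatchString(handle) {
			return nil, fmt.Errorf("arg %q is not a Type#N handle: %q", name, handle)
		}
		id, ok := ids[handle]
		if !ok {
			return nil, fmt.Errorf("arg %q refers to unknown object %q", name, handle)
		}
		resolved[name] = id
	}
	return resolved, nil
}

func main() {
	// Hypothetical mapping from model-facing handles to engine IDs.
	ids := map[string]string{"File#1": "real-file-id"}

	resolved, err := resolveArgs(map[string]string{"bin": "File#1"}, ids)
	fmt.Println(resolved, err)

	// A variable name used where a handle is expected is rejected.
	_, err = resolveArgs(map[string]string{"bin": "repo"}, ids)
	fmt.Println(err)
}
```

This does not address type checking: without a declared output type, nothing stops the model from returning a `Directory#1` under a name meant for a file.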

vital copper
#

we should probably mask export :thinkspin: it seems to really enjoy calling it

dark maple
vital copper
#

alright i can't get gemini to call return with arbitrary args anymore for some reason

vital copper
dark maple
#

I guess my description is ambiguous

vital copper
#

"This is not a variable."

vital copper
dark maple
#

I have a good feeling about _success or _done

#

or _get_feedback

#

_send_to_user

vital copper
#

that boy ain't right

#

it seems to just go into a loop like that for some reason

dark maple
#

yeah I definitely see how export would confuse

#

host.withDirectory ftw

#

@vital copper can you re-explain your idea for getting explicit outputs in prompt mode?

vital copper
#

basically just trying to get this to magically work:

! ctr=$(...)
! repo=$(repo)
> build ./cmd/booklit and return the compiled binary as $bin

the goal would be that the model sees $bin and deduces that it can call return({"bin":"File#1"}) which will "just work" if we make return accept arbitrary values

#

alternatively we could do something like:

! ctr=$(...)
! repo=$(repo)
! bin=$(.prompt "build ./cmd/booklit")
#

(maybe - lots to figure out there)

#

the last one kinda makes more sense in the old model (single return value) - from a language/interpreter POV

#

since it's assigned on the outside

#

to map that to the current approach we'd need the subshell to know what variable it's being assigned to, set it as an output, and then ... ??? maybe just let the var syncing do its thing? or maybe grab its binding out and return it ... ?

dark maple
vital copper
#

yeah for sure

#

i just tried that since it's the first thing i thought someone might try, so wanted to see if it could work with an open-ended escape hatch

#

but, it seems to undermine it too much

#

without the named + described args it loses its mind

#

hmm do we still expose currentType and getting the currently selected value?

#

bit of a cop-out but i wonder if $_ is the simplest approach in the end, since prompt mode tends to be more of a stream of consciousness, and you stumble across the values you want, and want to pluck them out one at a time

#

which you can do pretty easily with bin=$_

dark maple
#

yeah that works

#

last call though (regardless of selection)

#

that way you can get scalars

vital copper
#

👍 - that just needs a bit of extra plumbing right? i guess a second state: selection, which is always Object, and result, which is a Typed?

dark maple
#

honestly it could be a low level call history API

vital copper
#

LLM.currentBinding? kinda weird, since it's not a binding, but kinda want to reuse Binding so we get all of its typed fields

dark maple
#

and shell just happens to get the last call and use its result

#

that way we don't have to worry about defining another abstraction

#

(and you can get the result of each call as a Binding)

#

we need a better history API

vital copper
#

true

dark maple
#

mm one more issue here. multi agent.. I guess it's ok as long as we don't allow multiple agents to work in parallel

#

(I mean $agent and the possible upgrade to multiple variables for prompt mode )

#

should we bind the last result to $agent | last-call | result ? meh looks horrible though 😁

vital copper
#

the trick will be having it handle objects (by GraphQL ID?) and scalars (by JSON I suppose?)

#

also arrays I guess
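A small sketch of that split, assuming a hypothetical `Object` type and `encodeResult` helper (neither exists in the codebase as far as this thread shows): objects are passed by ID, lists of objects become a JSON array of IDs, and scalars or plain arrays are JSON-encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Object is a stand-in for any value that has a GraphQL ID.
type Object struct{ ID string }

// encodeResult serializes a call result: objects by ID, lists of objects as a
// JSON array of IDs, and everything else (scalars, plain arrays) as JSON.
func encodeResult(v any) (string, error) {
	switch x := v.(type) {
	case Object:
		return x.ID, nil
	case []Object:
		ids := make([]string, len(x))
		for i, obj := range x {
			ids[i] = obj.ID
		}
		b, err := json.Marshal(ids)
		return string(b), err
	default:
		b, err := json.Marshal(x)
		return string(b), err
	}
}

func main() {
	// The IDs below are placeholders, not real engine IDs.
	for _, v := range []any{
		Object{ID: "file-id-1"},
		[]Object{{ID: "file-id-1"}, {ID: "file-id-2"}},
		"a string scalar",
		42,
		[]string{"x", "y"},
	} {
		s, err := encodeResult(v)
		fmt.Println(s, err)
	}
}
```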

blazing tangle
#

Sorry didn't want to hijack yves thread, but i'm still not sure what's expected and what's a bug that needs fixing:

Is it normal that env | with-file-output foo $somefile "some description" | output foo | name returns an empty string and not foo ?

vital copper
#

hmm it looks weird on paper but I think it makes sense - with-file-output is more like "I desire this output", output foo is "give me this output" but the output hasn't been fulfilled
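To make the declared-versus-fulfilled distinction concrete, here is a toy model. All names (`Env`, `WantFileOutput`, `Fulfill`, `Output`, `File`) are hypothetical stand-ins and not the real API; the point is only that reading an output before anything has fulfilled it yields an empty placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// File is a stand-in for a file object; its zero value is an empty placeholder.
type File struct{ Name string }

// output tracks a declared output and whether anything has filled it in yet.
type output struct {
	description string
	file        File
	fulfilled   bool
}

// Env is a local model of an environment's outputs, not the Dagger SDK type.
type Env struct{ outputs map[string]*output }

// NewEnv returns an environment with no declared outputs.
func NewEnv() *Env { return &Env{outputs: map[string]*output{}} }

// WantFileOutput declares that a file output is desired under the given name.
func (e *Env) WantFileOutput(name, description string) {
	e.outputs[name] = &output{description: description}
}

// Fulfill records the value for a previously declared output.
func (e *Env) Fulfill(name string, f File) error {
	out, ok := e.outputs[name]
	if !ok {
		return errors.New("undeclared output: " + name)
	}
	out.file, out.fulfilled = f, true
	return nil
}

// Output returns the current file and whether the output has been fulfilled.
func (e *Env) Output(name string) (File, bool) {
	out, ok := e.outputs[name]
	if !ok {
		return File{}, false
	}
	return out.file, out.fulfilled
}

func main() {
	env := NewEnv()
	env.WantFileOutput("foo", "some description")

	// Declared but not yet fulfilled: the file name is empty.
	f, ok := env.Output("foo")
	fmt.Printf("%q %v\n", f.Name, ok)

	// After something fulfills the output, the real value is visible.
	_ = env.Fulfill("foo", File{Name: "booklit"})
	f, ok = env.Output("foo")
	fmt.Printf("%q %v\n", f.Name, ok)
}
```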

blazing tangle
#

yes, didnt realize until just now 😄

blazing tangle
#

is it me or WithState method on LLMSession is called from nowhere ?

buoyant crescent
#

hmmm so tryna wrap my head around output bindings... there's no WithStringOutput, is that intentional? To use the Env API it's implicit that I need to make the output some dagger object? (other than string)

#

is it insane to not only want WithStringOutput, but also WithObjectOutput? like a map[string]any in go?

#

oh or wait does this just require multiple modules? I notice the example using WithToyWorkspaceOutput, i just want this level of programmability within the module i'm currently working on

dark maple
dark maple
vital copper
#

One thing to be cautious of: LLMs aren't very good at accurately reproducing data beyond trivial examples. One of my evals had a string with special characters and the eval would pretty consistently fail because the LLM subtly changed whitespace or special characters: https://github.com/vito/daggerverse/commit/7350e27037cdac07cf1ca4572b6a44d0a337a195

Using objects is much safer - it's a lot easier for an LLM to pass around "File#1" than it is to pass around the file's content.

I could be over-indexing on this one example, and maybe it'll get better with time, but intuitively it makes sense; it's sort of like sending over data and having an intern type it into their editor by hand, instead of just passing it around by reference.

#

For this reason I'm wondering if WithStringInput is a footgun - all the examples I could find in (admittedly limited) real world use were really just using it as a prompt variable. And in those cases it seemed kind of weird to tie it to the Environment. I'd be curious to see a strong use case

buoyant crescent
buoyant crescent
#

then reframing knowing all this, maybe we don't want string bindings or maps/objects, but just to be able to do the whole WithXInput/WithXOutput with module-local types? is that similarly difficult for technical reasons?

vital copper
#

similar (same?) difficulty as self-calls, i think

dark maple
#

actually if we could have an equivalent to interface {} in graphql that would be very handy

vital copper
#

i think we've used JSON in that way in the past

dark maple
#

maybe our destiny is to be the "we want generics!" mob but for graphql

vital copper
buoyant crescent