#Privileged env API 🧵
cc @craggy temple @quaint geyser @quartz helm
Starting point: we agreed that we want shell/llm parity. In other words: for a given environment, the LLM and human (shell) interface should expose the same data and capabilities.
This means: function names are resolved with a search path that has 3 layers:
- Functions in current object
- Dependencies in current module
- Stdlib (a blended view of core API + blessed modules, curated by the engine)
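A rough sketch of how that three-layer lookup could behave, in case it helps the discussion. Everything here is hypothetical: the layer contents, the `resolve` helper, and the assumption that earlier layers shadow later ones are illustrative, not the engine's actual behavior.

```python
from typing import Callable, Dict, List, Optional

# A layer maps function names to callables. All names below are made up.
Layer = Dict[str, Callable[[], str]]


def resolve(name: str, layers: List[Layer]) -> Optional[Callable[[], str]]:
    """Return the first binding for `name`, searching layers in order.

    Assumption: earlier layers shadow later ones, like a shell search path.
    """
    for layer in layers:
        if name in layer:
            return layer[name]
    return None


# Hypothetical contents for each of the three layers.
current_object: Layer = {"build": lambda: "object.build"}
dependencies: Layer = {"build": lambda: "dep.build", "lint": lambda: "dep.lint"}
stdlib: Layer = {"lint": lambda: "stdlib.lint", "container": lambda: "stdlib.container"}

search_path = [current_object, dependencies, stdlib]
```

With this ordering, `build` resolves to the current object's function, `lint` to the dependency's, and a name like `host` that is in no layer resolves to nothing.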
I think developers will want granular control over which layers the LLM can access
Something like:
env(
showDependencies: bool=false,
showStdlib: bool=false,
): Env!
Or, a simpler numerical system:
env(
"""
Configure access level for the environment.
- Low: only explicit bindings are accessible.
- Medium: the current module's dependencies are also accessible
- High: the Dagger standard library is also accessible
Less access means fewer available tools, but more predictable and easier on cheap models.
"""
accessLevel: EnvAccessLevel=Medium
)
enum EnvAccessLevel {
Low
Medium
High
}
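To make the level semantics concrete, here is a minimal sketch of how a level might map onto the layers described above. The layer labels (`bindings`, `dependencies`, `stdlib`) and the `visible_layers` helper are assumptions for illustration only. The default of Medium mirrors the proposal.

```python
from enum import IntEnum
from typing import FrozenSet


class EnvAccessLevel(IntEnum):
    # Mirrors the proposed enum; the integer values are an assumption
    # that lets each level include everything below it.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def visible_layers(level: EnvAccessLevel = EnvAccessLevel.MEDIUM) -> FrozenSet[str]:
    """Return the (hypothetical) set of layers an LLM could see at `level`."""
    layers = {"bindings"}  # explicit bindings are always visible
    if level >= EnvAccessLevel.MEDIUM:
        layers.add("dependencies")
    if level >= EnvAccessLevel.HIGH:
        layers.add("stdlib")
    return frozenset(layers)
```

Each level is strictly cumulative here, which is what makes the scheme simple and also what limits it.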
In any case, host, export and Container.withExec(privileged) are not accessible: full sandboxing for LLMs.
what about when you do want host accessible?
specifically export seems sorta unavoidable for code-writing tasks
I don't think so actually. Obviously the result has to be exported at some point. But it doesn't have to be the llm exporting it necessarily
IDE makers seem to be converging towards a "multi-agent army" approach, with UX that allows you to run several experiments in parallel, and pick and choose which result you want to incorporate
Full sandboxing by default could be a killer feature of Dagger in this context
hmmm... so you rely on the MCP client to call an extra, non-dagger-mcp tool like write_file or whatever, and in our prompt mode, you do ! $source | export .?
Yes exactly
That leaves a gap for mcp-only workflows without a special dagger integration...
(added clarification)
that did not clarify
OK let me rewind
im giggling as i press enter fwiw
you rely on the MCP client to call an extra, non-dagger-mcp tool like write_file or whatever
TBD
Some sort of special integration
the thing that's hard there is you rely on the LLM to take all these dagger tools, get the contents of each file it changed, and send those to the IDE's write_file tools
is that what you mean?
because if that's what you mean we're very much on the same page, i am concerned that that's a lot of faff to leave to these robotic text generation interns
especially bc i'm fairly certain every mcp client has multiple different tools for this... like some write whole files, some write line numbered blocks, some write diffs, and they all have varying amounts of obs middleware in between the llm and the filesystem (llm->mcpclient->mcpserver->filesystem)
I think the ideal UX would be for the LLM to not actually call export or write_file, and instead for the user to do it, through a UX designed by their MCP client, or by Dagger, or by both
through a UX designed by their MCP client
the UXes designed by the mcp client are increasingly activated through MCP
like in zed
or similarly if you're making an agent setup in claude desktop, you're gonna pick a filesystem mcp server to do this
Yeah but I'm specifically referring to Nathan's thinking out loud on that Zed call, where he described a multi-agent scenario with per-agent sandboxing to allow users to pick and choose changes without agents stepping on each other's toes
im not disagreeing with the idea that us being sandboxed by default is a boon for these workflows, i'm saying that the part where we break out of the sandbox and apply the changes is a super critical piece of UX
Right
and the LLMs do need to make that happen somehow
Not necessarily the LLMs
They just need a way to write the file. It's not really their concern if that file is written to the end user's filesystem, or in a snapshot. Directory.withNewFile for example is just fine for them right?
In fact Zed's builtin tool I'm pretty sure goes to some sort of buffer, for the user to approve the change (or if it doesn't yet, it will soon, since they have the UX for that already in predictive editing)
i've been using it heavily, it goes to host disk
Right but do you agree that soon it won't? Surely IDEs are designing smarter ways to review and approve changes by the agent, just like they do for in-editor code changes?
but it keeps track of what it's written through the replace tool so you can review and then accept or reject
yeah, and this is a smart way of doing it -- it wouldn't be hard for them to buffer
Ah I see, the changes are made directly, but can be reverted. Makes sense
but from the llm's perspective, it's still requesting to write, even if it's buffered
in our world, it can write to a containerized filesystem, but when it's done doing all that and the human has seen the changes, what then?
Right, our equivalent of that is Directory.withNewFile()
yeah totally, but after it's done with that, how do i get the changes on my local machine?
when it's done doing all that and the human has seen the changes, what then?
<hand waving>
a UX designed by their MCP client, or by Dagger, or by both
</hand waving>
lemme provide one other bit of context lol - the pre-zed IDE plugin i've been using to do this, avante.nvim, originally had a non-mcp approach to doing this. via system prompts, it tells the llm it's gotta generate any and all code blocks in an XML format that the plugin knows how to parse, present, and apply when it's mixed into llm responses
avante.nvim is moving onto mcp write_file-ish tools because when you get the context window big enough, the llms start ignoring that instruction about XML
which to me implies that at the very least, we wanna expose some sort of content-for-export api that the llms can use to get stuff into those tools rather than expecting them to produce text-blocks that non-LLM client code knows how to apply
yeah i've been assuming this DX internally. I'd rely on the editor's MCP for dagger to write changes back to my host. Intermediate changes in a container for running tests, etc are valuable but if it passes those it should always write to the editor in the editor's preferred way (buffer, whatever)
i'm actually super curious now whether my assumptions are correct about zed's find_replace_file tool being the piece that collects changes into the "review changes" palette... like if i provide a different filesystem mcp i wonder if it'll still detect the changes
OK so regardless of whether we expose export or not, we have an unsolved problem: filesystem integration with IDEs - right?
yeah. i think the same problem kinda exists within shell/prompt mode, like if i get the LLM to make changes on my behalf, it'd be nice to review them before bulk-writing to host disk
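One way to picture the "review before bulk-writing" step is below. This is a sketch under heavy assumptions: snapshots are modeled as plain dicts of path to contents, `with_new_file` is only a stand-in loosely inspired by Directory.withNewFile, and `pending_changes` / `apply_approved` are hypothetical helpers, not any existing Dagger or MCP API.

```python
import difflib
from typing import Dict, Iterable

Snapshot = Dict[str, str]  # path -> file contents (toy model of a directory)


def with_new_file(snap: Snapshot, path: str, contents: str) -> Snapshot:
    """Return a new snapshot with `path` set; the input is left untouched."""
    updated = dict(snap)
    updated[path] = contents
    return updated


def pending_changes(host: Snapshot, sandbox: Snapshot) -> Dict[str, str]:
    """Map each changed path to a unified diff the user could review."""
    changes: Dict[str, str] = {}
    for path in sorted(set(host) | set(sandbox)):
        if host.get(path) == sandbox.get(path):
            continue  # unchanged (or absent on both sides)
        before = host.get(path, "").splitlines(keepends=True)
        after = sandbox.get(path, "").splitlines(keepends=True)
        diff = difflib.unified_diff(before, after, f"a/{path}", f"b/{path}")
        changes[path] = "".join(diff)
    return changes


def apply_approved(host: Snapshot, sandbox: Snapshot, approved: Iterable[str]) -> Snapshot:
    """Copy only the approved paths out of the sandbox into a new host snapshot."""
    result = dict(host)
    for path in approved:
        if path in sandbox:
            result[path] = sandbox[path]
        else:
            result.pop(path, None)  # approved deletion
    return result
```

The point of the split is that the LLM only ever calls something like `with_new_file`, while the diffing and the final apply are driven by the user, whichever UX ends up owning that step.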
i gotta sign out for dinner unfortunately lol but if you can't tell this has been very top of mind for me XD
OK, don't forget to weigh in on the other parts of the proposal, guys, when you have a minute
Yeah my flow previously has been to go in a few steps
- generate code changes
- make sure the tests pass
- now export them
All in prompt mode. I think this doesn't work now on the latest release, right?
personally i think
```
env(
  showDependencies: bool=false,
  showStdlib: bool=false,
): Env!
```
is preferable to levels
makes it easy to construct other arbitrary setups, especially if you also wanna layer in host access
env(deps:false, stdlib:true, host:true) or env(deps:false, stdlib:false, host:true) is kinda why i brought up the host thing
and those would not fit into a level-based scheme
yeah that's true. The levels could always be added later at the UX level, if we wanted
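A small sketch of that last point: if the API takes independent flags, levels can be layered on top as presets, but not the other way around. The `EnvFlags` class, the `host` flag, and the preset table are assumptions based on the examples in this thread, not an agreed design.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class EnvFlags:
    # Hypothetical flag set, following env(deps:..., stdlib:..., host:...)
    deps: bool = False
    stdlib: bool = False
    host: bool = False


# Levels expressed as presets over the flags (a possible UX-level shortcut).
LEVEL_PRESETS = {
    "Low": EnvFlags(),
    "Medium": EnvFlags(deps=True),
    "High": EnvFlags(deps=True, stdlib=True),
}


def level_for(flags: EnvFlags) -> Optional[str]:
    """Return the level name matching `flags`, or None if no level fits."""
    for name, preset in LEVEL_PRESETS.items():
        if preset == flags:
            return name
    return None


# Enumerate every flag combination and collect those no level can express.
all_combos = [EnvFlags(*bits) for bits in product([False, True], repeat=3)]
unreachable = [flags for flags in all_combos if level_for(flags) is None]
```

Of the eight possible combinations, only three correspond to a level; setups like `EnvFlags(stdlib=True, host=True)` have no level equivalent.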