#Terminal access 🧵
Yeah I'd like this specifically in the context of allowing LLMs to debug stuff interactively. e.g. give an LLM an interactive debugger shell and let it poke around and find stuff. But I'm sure there's plenty of possibilities here.
I feel like there's some annoyance in the plumbing required to implement this, but nothing too bad.
The big potential problem as you mentioned @odd shuttle is context... I suppose for an interactive shell where each command just outputs a few lines at a time it's no big deal, but if the command suddenly returns tons of data then you could easily overflow the context window. Also "visual" TUI programs like htop would be... interesting. I have no idea if an LLM would even be able to interpret it, much less how many terminal chars would actually be output
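One cheap guard against the overflow case is to clip each command's output before it reaches the model. A minimal sketch in Go, assuming a line-based budget; `clipOutput` and its `head`/`tail` knobs are made up for illustration and are not an existing Dagger API:

```go
package main

import (
	"fmt"
	"strings"
)

// clipOutput keeps the first `head` and last `tail` lines of a command's
// output and replaces everything in between with a one-line marker.
// Both limits are hypothetical knobs, not part of any existing tool.
func clipOutput(out string, head, tail int) string {
	lines := strings.Split(strings.TrimRight(out, "\n"), "\n")
	if len(lines) <= head+tail {
		return strings.Join(lines, "\n")
	}
	omitted := len(lines) - head - tail
	clipped := make([]string, 0, head+tail+1)
	clipped = append(clipped, lines[:head]...)
	clipped = append(clipped, fmt.Sprintf("[... %d lines omitted ...]", omitted))
	clipped = append(clipped, lines[len(lines)-tail:]...)
	return strings.Join(clipped, "\n")
}

func main() {
	// Simulate a command that suddenly returns a lot of data.
	var sb strings.Builder
	for i := 1; i <= 1000; i++ {
		fmt.Fprintf(&sb, "line %d\n", i)
	}
	fmt.Println(clipOutput(sb.String(), 2, 2))
}
```

This only bounds the number of lines; a single enormous line (or a TUI redraw full of escape sequences) would still need a byte or token cap on top.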
I guess there's some fundamental questions about what the tool call even looks like. Does the LLM just have the ability to send arbitrary key strokes to stdin? Can it ctrl-C 😄 ? etc
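To make the "arbitrary key strokes" option concrete: a tool call could take symbolic key names and translate them to the bytes a terminal would send. A rough sketch, where the naming scheme ("ctrl-c", "enter", ...) and the `encodeKey` helper are assumptions for illustration only:

```go
package main

import (
	"fmt"
	"strings"
)

// encodeKey turns a symbolic key name from a tool call into the bytes a
// terminal would send. The naming scheme is made up for this sketch.
func encodeKey(name string) ([]byte, error) {
	switch name {
	case "enter":
		return []byte{'\r'}, nil
	case "tab":
		return []byte{'\t'}, nil
	case "esc":
		return []byte{0x1b}, nil
	}
	if strings.HasPrefix(name, "ctrl-") {
		rest := strings.TrimPrefix(name, "ctrl-")
		if len(rest) == 1 && rest[0] >= 'a' && rest[0] <= 'z' {
			// A control character is the letter with its top three bits
			// cleared: ctrl-c is 0x03 (ETX), ctrl-d is 0x04 (EOT).
			return []byte{rest[0] & 0x1f}, nil
		}
	}
	return nil, fmt.Errorf("unknown key %q", name)
}

func main() {
	b, err := encodeKey("ctrl-c")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", b)
}
```

Note that byte 0x03 only turns into an interrupt when the process sits behind a pty with signal handling enabled in its line discipline; written to a plain pipe it is just another byte, so "can it ctrl-C" depends on how stdin is wired up.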
I think it would be easier to generalize a solution if we had a type for it
ie either a Terminal type, or maybe a more general Stream type? maybe?
or maybe a dagql equivalent of typed Go channels? Like you can get a "stream of <Foo>"? (as opposed to a bytestream)
Stream for sure would be nice, even on withExec stdout/stderr.
We had a terminal type at some point, I think it got removed but it's easy to re-add
another pretty different approach might be - what if a function could expose a nested dagql server? For rapid back and forth dagger function calls, decoupled from the "containerized function" machinery
higher-level than a raw Service, but lower-level than a full-blown module schema
not sure if what I'm saying even makes sense
What's the relationship with streams and interactive terminals? Are you imagining a set of functions for "REPL" so it can "send line", "receive line", etc. quickly?
Yeah exactly - quickly and with the same in-memory state
I guess that boils down to quickly 😉
for operations that are below the overhead line of executing containers for each dagql function call basically
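A sketch of the "send line" / "receive line" shape against one long-lived process that keeps its state in memory between calls. Everything here is hypothetical (`Session`, `SendLine`, `ReceiveLine`), and an in-process goroutine stands in for the real REPL so the example runs on its own:

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Session is a hypothetical handle on a long-running interactive process.
type Session struct {
	in  io.Writer
	out *bufio.Reader
}

// SendLine writes one line of input to the process.
func (s *Session) SendLine(line string) error {
	_, err := io.WriteString(s.in, line+"\n")
	return err
}

// ReceiveLine blocks until the process emits one full line of output.
func (s *Session) ReceiveLine() (string, error) {
	line, err := s.out.ReadString('\n')
	return strings.TrimRight(line, "\n"), err
}

// startAdder stands in for a real REPL: it keeps a running total in memory
// and answers every numeric line with the new total.
func startAdder() *Session {
	inR, inW := io.Pipe()
	outR, outW := io.Pipe()
	go func() {
		defer outW.Close()
		total := 0
		sc := bufio.NewScanner(inR)
		for sc.Scan() {
			n, err := strconv.Atoi(strings.TrimSpace(sc.Text()))
			if err != nil {
				fmt.Fprintln(outW, "error: not a number")
				continue
			}
			total += n
			fmt.Fprintln(outW, total)
		}
	}()
	return &Session{in: inW, out: bufio.NewReader(outR)}
}

func main() {
	s := startAdder()
	for _, line := range []string{"2", "40"} {
		if err := s.SendLine(line); err != nil {
			panic(err)
		}
		reply, err := s.ReceiveLine()
		if err != nil {
			panic(err)
		}
		fmt.Println(reply)
	}
}
```

Each call is just a pipe write or read, so the per-call cost is far below starting a container, and the second reply depends on state left behind by the first call.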
This gets us close to the question @quaint vine was asking on the call earlier - "a lot of agents are just composing lightweight API calls, is Dagger really useful"
--> right now Dagger brings benefits, with some constraints.
- The benefits are real - repeatability, caching, deep tracing, sandboxing by default etc. etc.
- The constraints are also real: for the "just lightweight API calls" part, you have to choose between 1) breaking everything up into Dagger types and functions (can be heavyweight, terminal access is one example), or 2) dropping to a raw stream/socket, but then you're escaping out of the Dagger benefits too.
maybe there's a middle ground where you get the benefits of mapping more operations to dagql functions, even those operations that can't map to a buildkit exec
Yeah @edgy tinsel and I were plotting on Monday around Theseus and have a path in mind that gets module function calls off of buildkit exec ops relatively soon. My goal was to support global function caching asap but Justin also pointed out it would also free us to run functions however we want, including as long-running processes that avoid full container stop/start overhead.
I'm definitely open to it, only possible downside is it makes function calls more mutable/unreproducible, but yeah I agree there's just tradeoffs there. I think it'd be possible to support multiple execution modes on a per-function basis too, etc.
For terminal specifically though, it feels like there's paths that avoid entanglement w/ all that. e.g. the BBI could:
- See that the LLM called `Terminal` and add some tools for interacting with the input/output (tbd what that actually is)
- The tools would just directly operate on the streams in the engine process
Upside is something like that is probably quicker, downside is it's highly "one-off" rather than built from primitives re-usable elsewhere.
> My goal was to support global function caching asap but Justin also pointed out it would also free us to run functions however we want, including as long-running processes that avoid full container stop/start overhead.
does this map into your head as having different "backends" for function calling? One could be a container and maybe some others could just return a stream of data (video, audio, etc) in a way that you can also compose pipelines around that primitive