Workshop feedback 🧵 | Dagger

primal bloom Mar 18, 2025, 7:25 PM

#

1. multi-object requires extra prompting

#

2. ollama instructions in docs are unclear: they make it look like you *have* to install llama-3.2 specifically, but in reality other models are also supported (I think?) cc @gilded raven

#

3. Boilerplate example in dagger init really needs a refresh. The object has no fields; the example functions have no clear purpose; there was confusion about whether it's OK to remove them ("is this an example function, or part of the scaffolding?"). Also: maybe we should add something agent-specific 🙂

gilded raven Mar 18, 2025, 7:27 PM

#

primal bloom 2. ollama instructions in docs are unclear: makes it look like you *have* to ins...

got it, i'll reword that section a bit

clever marsh Mar 18, 2025, 8:07 PM

#

primal bloom 1. multi-object requires extra prompting

Trying some system prompting for 1). I think the next thing to learn will be how much we need to lean on it, like whether us setting a system prompt is foundational to everything you build on top of this, or whether people will need to be able to override it, or whether we'll need to tune a system prompt for each model, etc etc

#

right now we're inconsistent there; OpenAI has no system prompt, but Gemini and Anthropic do, unsure why exactly. so I'll try setting the same one across all of them first

gilded raven Mar 18, 2025, 8:10 PM

#

https://github.com/dagger/dagger/pull/9891

by the way I wish we didn't have to do the YOUR IP stuff instead of localhost for ollama

GitHub

improve ollama setup docs by kpenfound · Pull Request #9891 · dagge...

clever marsh Mar 18, 2025, 8:24 PM

#

also: maybe this system prompt is only actually relevant to interactive dagger shell?

#

like it'll be pretty hard to programmatically use anything that's making up its own var names

primal bloom Mar 18, 2025, 8:24 PM

#

4. FYI our use cases for the workshop:

Daggerverse QA ( @paper ledge )
Voice assistant for booking medical appointments ( @finite perch )
In-game world editor ( @azure fable )
Discord thread summarizer

primal bloom Mar 18, 2025, 9:26 PM

#

5. for the game editor: they will need programmatic access to the response stream (they send it back to the game client that streams it to the player on screen) (fyi @clever marsh )

#

paper ledge Mar 18, 2025, 9:30 PM

#

I just got a screenshot from inside a container that claude figured out how to take 🤯

gilded raven Mar 18, 2025, 9:30 PM

#

that is very cool

primal bloom Mar 18, 2025, 9:31 PM

#

screenshot inception

#

#

6. single-object doesn't seem to work with claude-sonnet-3.5 on llm.11. Switching to multi-object worked. (cc @clever marsh @paper ledge )

Repro: llm | with-container $(container | from alpine) | with-prompt "I gave you a container. Do you see it?"

gilded raven Mar 18, 2025, 9:41 PM

#

primal bloom 6. single-object doesn't seem to work with claude-sonnet-3.5 on llm.11. Switchin...

If I prompt with something related to container tools like "I gave you a container. Can you run a command 'echo hello'" it seems to behave correctly

paper ledge Mar 18, 2025, 9:42 PM

#

primal bloom screenshot inception

I took a screenshot of this and this also blows my mind 🙂

✔ use ocr to summarize what the screenshot is talking about 3.3s
│🧑 use ocr to summarize what the screenshot is talking about
│ ┃ 0.0s
│
│🤖 It seems the context changes with the variable "ctr" might be resulting in repeated tasks. To
│ ┃ summarize the contents of the screenshot from earlier OCR results:
│ ┃
│ ┃ The conversation captures a demonstration where a user, Lev Lazinskiy, manages to take a
│ ┃ screenshot from within a container using a headless browser tool like Chromium. There is
│ ┃ excitement around this capability, as expressed by other participants, Kyle Penfound and Solomon,
│ ┃ who describe it as "very cool" and comment on the concept of "screenshot inception."
│ ┃
│ ┃ The essence is about innovatively capturing a screenshot within a containerized environment,
│ ┃ illustrating the power and flexibility of container usage. If you need more detailed insights or
│ ┃ further analysis, feel free to ask!
│ ┃ 3.2s ◆ Input Tokens: 2,976 ◆ Output Tokens: 142

primal bloom Mar 18, 2025, 9:44 PM

#

ha ha 🙂

primal bloom Mar 19, 2025, 2:15 AM

#

primal bloom 6. single-object doesn't seem to work with claude-sonnet-3.5 on llm.11. Switchin...

7. llm will sometimes overwrite a variable with a value of a different type. We should check for that and return an error, as a guardrail. In the case where this happened, it was a mistake; a type check would have helped it recover
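A minimal sketch of the guardrail described above, assuming variables are tracked as name -> (type, ID) pairs. `VariableStore` and `TypeMismatchError` are hypothetical names for illustration, not Dagger APIs.

```python
class TypeMismatchError(Exception):
    """Raised when a variable would be rebound to a value of another type."""


class VariableStore:
    """Hypothetical variable table: name -> (type_name, object_id)."""

    def __init__(self):
        self._vars = {}

    def set(self, name, type_name, object_id):
        existing = self._vars.get(name)
        if existing is not None and existing[0] != type_name:
            # Guardrail: return an error the model can read and recover from.
            raise TypeMismatchError(
                f"{name} is a {existing[0]}; refusing to overwrite it with a {type_name}"
            )
        self._vars[name] = (type_name, object_id)

    def get(self, name):
        return self._vars[name]


store = VariableStore()
store.set("ctr", "Container", "1234")
store.set("ctr", "Container", "4567")  # same type: allowed
try:
    store.set("ctr", "Directory", "xyxy")  # different type: rejected
    error_message = None
except TypeMismatchError as err:
    error_message = str(err)
```

The error text names both types so the model has something concrete to act on when it retries.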

weak basin Mar 19, 2025, 3:06 AM

#

gilded raven https://github.com/dagger/dagger/pull/9891 by the way I wish we didn't have to ...

I think using host.docker.internal should work, shouldn't it?

gilded raven Mar 19, 2025, 3:07 AM

#

weak basin I think using `host.docker.internal` should work, shouldn't it?

I think I saw someone using that, so it might, yeah

weak basin Mar 19, 2025, 3:07 AM

#

I know that for linux you need to do an extra hop to make it work but we can just add a note in the docs

#

using host.docker.internal SGTM!
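A small sketch of what the docs change could look like on the client side, assuming Ollama is listening on its default port (11434) and is reached through its OpenAI-compatible `/v1` path. `ollama_base_url` is a hypothetical helper, and whether `host.docker.internal` resolves from inside the Dagger engine is exactly what still needs testing. For the Linux "extra hop", one commonly used Docker option is `--add-host=host.docker.internal:host-gateway`; whether that applies to how the engine container is started is not confirmed here.

```python
def ollama_base_url(host: str = "host.docker.internal", port: int = 11434) -> str:
    """Build the base URL for Ollama's OpenAI-compatible endpoint.

    The default host is the name proposed in the thread; pass an explicit
    IP address to get the current "YOUR IP" behaviour instead.
    """
    return f"http://{host}:{port}/v1/"


default_url = ollama_base_url()
explicit_url = ollama_base_url("192.168.1.20")  # illustrative address only
```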

gilded raven Mar 19, 2025, 3:08 AM

#

I'll test it!

clever marsh Mar 19, 2025, 3:29 AM

#

primal bloom 7. llm will sometimes overwrite a variable with a different type. We should chec...

need to figure out how to square that with updating vars in-place. for example, how does it go from a Container to its directory("/out") without a variable changing its type? do we make an exception for that? or allow it only after the first duplicate call? thinkspin

primal bloom Mar 19, 2025, 3:30 AM

#

clever marsh need to figure out how to square that with updating vars in-place. for example, ...

In single-object at least, we differentiate based on return type for this exact reason. If the function returns the same type, "auto-chain". Otherwise, return the ID

#

Worked pretty well as a heuristic, basically it enshrines the pattern that when your object has a chainable function, it will be presented to the LLM as a mutation (with easy undo)
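A rough sketch of the return-type heuristic described above, under the assumption that each tool call yields an object with a type name and an ID. `Obj` and `apply_call` are hypothetical names, not the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    type_name: str
    object_id: str


def apply_call(current: Obj, result: Obj):
    """Return (new_current, reply_to_llm) for one tool call."""
    if result.type_name == current.type_name:
        # Chainable function: present it to the LLM as a mutation ("auto-chain").
        return result, f"ok: {current.type_name} updated"
    # Different return type: keep the current object and hand back the new ID.
    return current, f"{result.type_name} ID: {result.object_id}"


ctr = Obj("Container", "1234")
ctr, reply_same = apply_call(ctr, Obj("Container", "4567"))
ctr, reply_other = apply_call(ctr, Obj("Directory", "xyxy"))
```

After both calls the current object is still the Container with ID 4567, and the Directory is only known to the model by its ID.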

#

I admit I haven't thought through how much of our existing multi-object problems this would solve

#

it doesn't remove the need for the LLM to handle IDs. But it does make it less load-bearing

#

maybe we could drop hints like:

clever marsh Mar 19, 2025, 3:35 AM

#

still struggling to see how that would mechanically work, since to turn it into a var you need to somehow assign it to the result of that call. would we keep _save + "current state"?

primal bloom Mar 19, 2025, 3:41 AM

#

_select("base_container") -> selected: base_container=<1234> (type: Container)
withExec(...) -> selected: base_container=<4567> (type: Container)
withNewFile(...) -> selected: base_container=<8976> (type: Container)
_duplicate("dev") -> selected: dev=<8976> (type: Container)
withExec() -> selected: dev=<abab> (type: Container)
directory(...) -> selected: (unsaved)=<xyxy> (type: Directory)
_save("build") -> selected: build=<xyxy> (type: Directory)
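One way the trace above could work mechanically, as a sketch only. It assumes a "current selection" plus a table of saved names, and that a call returning the same type updates the selected name in place while a different type leaves the result unsaved. `Bindings` and its methods are hypothetical; here `duplicate` is just `save` under a new name.

```python
class Bindings:
    def __init__(self, initial):
        self.saved = dict(initial)  # name -> (object_id, type_name)
        self.name = None            # selected name, or None when unsaved
        self.value = None           # selected (object_id, type_name)

    def status(self):
        object_id, type_name = self.value
        label = self.name if self.name is not None else "(unsaved)"
        return f"selected: {label}=<{object_id}> (type: {type_name})"

    def select(self, name):
        self.name, self.value = name, self.saved[name]
        return self.status()

    def call(self, object_id, type_name):
        """Record the result of a tool call made on the current selection."""
        if self.name is not None and type_name == self.value[1]:
            self.saved[self.name] = (object_id, type_name)  # update in place
        else:
            self.name = None  # new type: result is selected but unsaved
        self.value = (object_id, type_name)
        return self.status()

    def save(self, name):
        self.saved[name] = self.value
        self.name = name
        return self.status()

    def duplicate(self, name):
        return self.save(name)


b = Bindings({"base_container": ("1234", "Container")})
trace = [
    b.select("base_container"),
    b.call("4567", "Container"),   # withExec(...)
    b.call("8976", "Container"),   # withNewFile(...)
    b.duplicate("dev"),
    b.call("abab", "Container"),   # withExec()
    b.call("xyxy", "Directory"),   # directory(...)
    b.save("build"),
]
```

Replaying the seven steps this way leaves `base_container` at 8976, `dev` at abab, and `build` pointing at the Directory xyxy.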

clever marsh Mar 19, 2025, 4:09 AM

#

@primal bloom lol, you'll like this

Should we still call these "variables," or should we adopt a new term?

Since they now auto-update, "variable" might be misleading. Would "binding," "reference," or "tracker" be more appropriate?

primal bloom Mar 19, 2025, 4:11 AM

#

objects?

clever marsh Mar 19, 2025, 4:12 AM

#

"binding" is growing on me, since the parallels to service bindings are kind of there now

#

in that it's a living stateful mutable thing

#

also: do we keep _undo? and maybe have per-variable history?

primal bloom Mar 19, 2025, 4:13 AM

#

yeah binding is great for the DX outside the sandbox (i.e. LLM.withFooBinding). From inside the sandbox, we might need another word (selecting / saving are not verbs that usually apply to a binding)

#

but the llm might disagree with me on that 🙂

primal bloom Mar 19, 2025, 4:14 AM

#

clever marsh also: do we keep `_undo`? and maybe have per-variable history?

Yeah I think we should keep it. Maybe we drop it temporarily until everything else works great though?

clever marsh Mar 19, 2025, 4:15 AM

#

one of the tests relies on it 😛 it's easy to keep

#

but i'll try without and see how it goes

#

the test pretty explicitly prompts it to do it. i remember there was a time it would do things like apt-get remove instead, but that might have just been poor prompt engineering on my part

primal bloom Mar 19, 2025, 4:16 AM

#

I meant more for the cognitive load on the LLMs, not so much complexity of our code

#

I guess it's not that much either - but who knows, I'm getting paranoid 🙂

clever marsh Mar 19, 2025, 4:17 AM

#

also, i'm assuming you still want individual _select_foo and _duplicate_bar tools? since that helps the model a lot

#

actually maybe we don't need _duplicate_* if we have _save

#

just select the old var, and then save a new var

primal bloom Mar 19, 2025, 4:18 AM

#

what's cool is that we can also let it query the history, with the corresponding IDs and commands. So it could use undo in a precise manner
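A sketch of what queryable history with precise undo might look like, assuming each binding keeps an ordered list of (command, resulting ID) entries. `BindingHistory`, `log`, and `undo_to` are hypothetical names for illustration.

```python
class BindingHistory:
    def __init__(self, object_id):
        self.entries = [("(initial)", object_id)]  # (command, resulting ID)

    def record(self, command, object_id):
        self.entries.append((command, object_id))

    def current(self):
        return self.entries[-1][1]

    def log(self):
        """History as the model would see it: index, command, resulting ID."""
        return [f"{i}: {cmd} -> <{oid}>" for i, (cmd, oid) in enumerate(self.entries)]

    def undo_to(self, index):
        """Drop every entry after `index` and return the restored ID."""
        if not 0 <= index < len(self.entries):
            raise IndexError(f"no history entry {index}")
        del self.entries[index + 1:]
        return self.current()


history = BindingHistory("1234")
history.record("withExec(...)", "4567")
history.record("withNewFile(...)", "8976")
listing = history.log()
restored = history.undo_to(1)  # roll back to the state after withExec
```

Because the log exposes indices alongside IDs and commands, the model can pick the exact point to return to instead of undoing one step at a time.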

primal bloom Mar 19, 2025, 4:18 AM

#

clever marsh also, i'm assuming you still want individual `_select_foo` and `_duplicate_bar` ...

I haven't had a chance to test it in practice, it just seemed potentially helpful in theory. You're the best judge of whether it's worth the tradeoff

finite perch Mar 20, 2025, 6:06 PM

#

Hey, sorry about taking so long to give some feedback and the code, but here's the code: https://github.com/ankykong/daggerClinicSearch

#

I'm gonna try to put together my notes and share those with you

#

Notes:

Without documentation, me and the Dagger engineer were both confused about certain aspects. This comes with rapid enhancements, but it was a bit confusing.

How to exactly use an LLM in Dagger itself was confusing. I wasn’t sure if the message is sent immediately or if there’s a way to trigger a message.

Being able to work with streaming messages would be nice too

I had issues with the Enum and passing it along to the LLM. It wouldn’t accept the Enum Definition that I created with the class.

Having a simplified approach to do API calls would be a nice to have as a lot of the scripts I’ll write are going to be API calls.

Gemini Woes were not fun to try and debug 😦

I wasn’t sure what tools/features come with any specific argument I passed along. For example, if I share a file, what tools should I have access to? What requires a separate workspace for the agent to work with?
