#Workshop feedback 🧵
- multi-object requires extra prompting
- ollama instructions in docs are unclear: makes it look like you have to install llama-3.2 specifically, but in reality other models are also supported (I think?) cc @gilded raven
- Boilerplate example in dagger init really needs a refresh. No fields in object; example functions have no clear purpose; there was confusion on whether it's OK to remove them ("is this an example function, or part of the scaffolding?"). Also: maybe we should add something agent-specific
got it, i'll reword that section a bit
Trying some system prompting for 1). I think the next thing to learn will be how much we need to lean on it, like whether us setting a system prompt is foundational to everything you build on top of this, or whether people will need to be able to override it, or whether we'll need to tune a system prompt for each model, etc etc
right now we're inconsistent there; OpenAI has no system prompt, but Gemini and Anthropic do, unsure why exactly. so I'll try setting the same one across all of them first
https://github.com/dagger/dagger/pull/9891
by the way I wish we didn't have to do the YOUR IP stuff instead of localhost for ollama
also: maybe this system prompt is only actually relevant to interactive dagger shell?
like it'll be pretty hard to programmatically use anything that's making up its own var names
FYI our use cases for the workshop:
- Daggerverse QA ( @paper ledge )
- Voice assistant for booking medical appointments ( @finite perch )
- In-game world editor ( @azure fable )
- Discord thread summarizer
- for the game editor: they will need programmatic access to the response stream (they send it back to the game client that streams it to the player on screen) (fyi @clever marsh )
I just got a screenshot from inside a container that claude figured out how to take 🤯
that is very cool
screenshot inception
- single-object doesn't seem to work with claude-sonnet-3.5 on llm.11. Switching to multi-object worked. (cc @clever marsh @paper ledge )
Repro: llm | with-container $(container | from alpine) | with-prompt "I gave you a container. Do you see it?"
If I prompt with something related to container tools like "I gave you a container. Can you run a command 'echo hello'" it seems to behave correctly
I took a screenshot of this and this also blows my mind
use ocr to summarize what the screenshot is talking about 3.3s
🧑 use ocr to summarize what the screenshot is talking about
   0.0s

🤖 It seems the context changes with the variable "ctr" might be resulting in repeated tasks. To
   summarize the contents of the screenshot from earlier OCR results:

   The conversation captures a demonstration where a user, Lev Lazinskiy, manages to take a
   screenshot from within a container using a headless browser tool like Chromium. There is
   excitement around this capability, as expressed by other participants, Kyle Penfound and Solomon,
   who describe it as "very cool" and comment on the concept of "screenshot inception."

   The essence is about innovatively capturing a screenshot within a containerized environment,
   illustrating the power and flexibility of container usage. If you need more detailed insights or
   further analysis, feel free to ask!
   3.2s · Input Tokens: 2,976 · Output Tokens: 142
ha ha
- llm will sometimes overwrite a variable with a different type. We should check for that and return an error, as a guardrail. In the case where this happened, it was a mistake; a type check would have helped it recover
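A minimal sketch of the guardrail suggested above, assuming variables are tracked as name -> (type, ID) pairs. `BindingStore`, `BindingTypeError`, and the method names are hypothetical and not part of Dagger's API; this only illustrates the idea of rejecting an overwrite that changes a variable's type.

```python
class BindingTypeError(TypeError):
    """Raised when an overwrite would change a variable's type (hypothetical)."""


class BindingStore:
    """Hypothetical variable store: name -> (type_name, object_id)."""

    def __init__(self):
        self._vars = {}

    def set(self, name, type_name, object_id):
        existing = self._vars.get(name)
        # Guardrail: refuse to overwrite a variable with a value of another type,
        # so the model gets an error it can react to instead of silently
        # clobbering the old value.
        if existing is not None and existing[0] != type_name:
            raise BindingTypeError(
                f"variable {name!r} holds a {existing[0]}, "
                f"refusing to overwrite it with a {type_name}"
            )
        self._vars[name] = (type_name, object_id)

    def get(self, name):
        return self._vars[name]
```

The error message names both types so the model can pick a new variable name and continue, which is the recovery path described above.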
I think using host.docker.internal should work, shouldn't it?
I think I saw someone using that, so it might, yeah
I know that for linux you need to do an extra hop to make it work but we can just add a note in the docs
using host.docker.internal SGTM!
I'll test it!
need to figure out how to square that with updating vars in-place. for example, how does it go from a Container to its directory("/out") without a variable changing its type? do we make an exception for that? or allow it only after the first duplicate call? 
In single-object at least, we differentiate based on return type for this exact reason. If the function returns the same type, "auto-chain". Otherwise, return the ID
Worked pretty well as a heuristic, basically it enshrines the pattern that when your object has a chainable function, it will be presented to the LLM as a mutation (with easy undo)
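A rough sketch of the return-type heuristic described above, under the assumption that each object is just a (type, ID) pair. `Obj`, `SingleObjectEnv`, and `apply` are hypothetical names for illustration, not the actual implementation: a result of the same type replaces the current object (auto-chain, with undo), and any other type is returned as an ID.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Obj:
    type_name: str
    id: str


@dataclass
class SingleObjectEnv:
    """Hypothetical single-object environment with an undo stack."""

    current: Obj
    history: List[Obj] = field(default_factory=list)

    def apply(self, result: Obj) -> Optional[str]:
        """Handle the result of calling a function on the current object."""
        if result.type_name == self.current.type_name:
            # Same type: present the call as a mutation ("auto-chain"),
            # keeping the previous state so it can be undone.
            self.history.append(self.current)
            self.current = result
            return None
        # Different type: leave the current object alone, hand back the ID.
        return result.id

    def undo(self) -> None:
        """Restore the previous state, if there is one."""
        if self.history:
            self.current = self.history.pop()
```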
I admit I haven't thought through how much of our existing multi-object problems this would solve
it doesn't remove the need for the LLM to handle IDs. But it does make it less load-bearing
maybe we could drop hints like:
still struggling to see how that would mechanically work, since to turn it into a var you need to somehow assign it to the result of that call. would we keep _save + "current state"?
_select("base_container") -> selected: base_container=<1234> (type: Container)
withExec(...) -> selected: base_container=<4567> (type: Container)
withNewFile(...) -> selected: base_container=<8976> (type: Container)
_duplicate("dev") -> selected: dev=<8976> (type: Container)
withExec() -> selected: dev=<abab> (type: Container)
directory(...) -> selected: (unsaved)=<xyxy> (type: Directory)
_save("build") -> selected: build=<xyxy> (type: Directory)
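One way the hint trace above could work mechanically, as a sketch only. `Selection` and its methods are hypothetical, and it assumes `_duplicate` and `_save` can share one operation (name whatever is currently selected). A same-type result updates the selected binding in place; a different type leaves the binding untouched and makes the selection unsaved.

```python
class Selection:
    """Hypothetical model of 'current selection' plus named bindings."""

    def __init__(self):
        self.bindings = {}  # name -> (object_id, type_name)
        self.name = None    # None means the selection is unsaved
        self.value = None   # (object_id, type_name) of the selection

    def _status(self):
        label = self.name if self.name is not None else "(unsaved)"
        object_id, type_name = self.value
        return f"selected: {label}=<{object_id}> (type: {type_name})"

    def select(self, name):
        self.name = name
        self.value = self.bindings[name]
        return self._status()

    def call(self, object_id, type_name):
        """Record the result of a function call on the selected object."""
        if self.name is not None and type_name == self.value[1]:
            # Same type: the selected binding follows the new state.
            self.value = (object_id, type_name)
            self.bindings[self.name] = self.value
        else:
            # Different type (or nothing saved): selection becomes unsaved.
            self.name = None
            self.value = (object_id, type_name)
        return self._status()

    def save(self, name):
        """Bind the current selection to a name and select it."""
        self.bindings[name] = self.value
        self.name = name
        return self._status()
```

Replaying the trace with this model leaves base_container at <8976> after dev is created, so saving under a new name does not disturb the old binding.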
@primal bloom lol, you'll like this
Should we still call these "variables," or should we adopt a new term?
Since they now auto-update, "variable" might be misleading. Would "binding," "reference," or "tracker" be more appropriate?
objects?
"binding" is growing on me, since the parallels to service bindings are kind of there now
in that it's a living stateful mutable thing
also: do we keep _undo? and maybe have per-variable history?
yeah binding is great for the DX outside the sandbox (ie. LLM.withFooBinding). From inside the sandbox, we might need another word (selecting / saving are not verbs that usually apply to a binding)
but the llm might disagree with me on that
Yeah I think we should keep it. Maybe we drop it temporarily until everything else works great though?
one of the tests relies on it. it's easy to keep
but i'll try without and see how it goes
the test pretty explicitly prompts it to do it. i remember there was a time it would do things like apt-get remove instead, but that might have just been poor prompt engineering on my part
I meant more for the cognitive load on the LLMs, not so much complexity of our code
I guess it's not that much either - but who knows, I'm getting paranoid
also, i'm assuming you still want individual _select_foo and _duplicate_bar tools? since that helps the model a lot
actually maybe we don't need _duplicate_* if we have _save
just select the old var, and then save a new var
what's cool is that we can also let it query the history, with the corresponding IDs and commands. So it could use undo in a precise manner
I haven't had a chance to test it in practice, it just seemed potentially helpful in theory. You're the best judge of whether it's worth the tradeoff
Hey, sorry about taking so long to give some feedback and the code, but here's the code: https://github.com/ankykong/daggerClinicSearch
I'm gonna try to put together my notes and share those with you
Notes:
Without documentation, the Dagger engineer and I were both confused on certain aspects. This comes with rapid enhancements, but it was a bit confusing.
How exactly to use an LLM in Dagger itself was confusing. I wasn't sure if the message is sent immediately or if there's a way to trigger a message.
Being able to work with streaming messages would be nice too
I had issues with the Enum and passing it along to the LLM. It wouldn't accept the Enum Definition that I created with the class.
Having a simplified approach to do API calls would be a nice to have, as a lot of the scripts I'll write are going to be API calls.
Gemini Woes were not fun to try and debug 😦
I wasn't sure what tools/features come with any specific argument I passed along. For example, if I share a file, what tools should I have access to? What requires a separate workspace for the agent to work with?