#Workshop feedback 🧡

primal bloom
#
  1. multi-object requires extra prompting
#
  1. ollama instructions in docs are unclear: makes it look like you have to install llama-3.2 specifically, but in reality other models are also supported (I think?) cc @gilded raven
#
  1. Boilerplate example in dagger init really needs a refresh. No fields in object; example functions have no clear purpose; there was confusion on whether it's OK to remove them ("is this an example function, or part of the scaffolding?"). Also: maybe we should add something agent-specific 🙂
gilded raven
clever marsh
# primal bloom 1. multi-object requires extra prompting

Trying some system prompting for 1). I think the next thing to learn will be how much we need to lean on it, like whether us setting a system prompt is foundational to everything you build on top of this, or whether people will need to be able to override it, or whether we'll need to tune a system prompt for each model, etc etc

#

right now we're inconsistent there; OpenAI has no system prompt, but Gemini and Anthropic do, unsure why exactly. so I'll try setting the same one across all of them first

gilded raven
clever marsh
#

also: maybe this system prompt is only actually relevant to interactive dagger shell?

#

like it'll be pretty hard to programmatically use anything that's making up its own var names

primal bloom
#

FYI our use cases for the workshop:

  1. Daggerverse QA ( @paper ledge )
  2. Voice assistant for booking medical appointments ( @finite perch )
  3. In-game world editor ( @azure fable )
  4. Discord thread summarizer
primal bloom
#
  1. for the game editor: they will need programmatic access to the response stream (they send it back to the game client that streams it to the player on screen) (fyi @clever marsh )
paper ledge
#

I just got a screenshot from inside a container that claude figured out how to take 🤯

gilded raven
#

that is very cool

primal bloom
#

screenshot inception

#
  1. single-object doesn't seem to work with claude-sonnet-3.5 on llm.11. Switching to multi-object worked. (cc @clever marsh @paper ledge )

Repro: llm | with-container $(container | from alpine) | with-prompt "I gave you a container. Do you see it?"

gilded raven
paper ledge
# primal bloom screenshot inception

I took a screenshot of this and this also blows my mind 🙂

✔ use ocr to summarize what the screenshot is talking about 3.3s
│🧑 use ocr to summarize what the screenshot is talking about
│ ┃ 0.0s
│
│🤖 It seems the context changes with the variable "ctr" might be resulting in repeated tasks. To
│ ┃ summarize the contents of the screenshot from earlier OCR results:
│ ┃
│ ┃ The conversation captures a demonstration where a user, Lev Lazinskiy, manages to take a
│ ┃ screenshot from within a container using a headless browser tool like Chromium. There is
│ ┃ excitement around this capability, as expressed by other participants, Kyle Penfound and Solomon,
│ ┃ who describe it as "very cool" and comment on the concept of "screenshot inception."
│ ┃
│ ┃ The essence is about innovatively capturing a screenshot within a containerized environment,
│ ┃ illustrating the power and flexibility of container usage. If you need more detailed insights or
│ ┃ further analysis, feel free to ask!
│ ┃ 3.2s ◆ Input Tokens: 2,976 ◆ Output Tokens: 142
primal bloom
#

ha ha 🙂

primal bloom
weak basin
gilded raven
weak basin
#

I know that for Linux you need to do an extra hop to make it work, but we can just add a note in the docs

#

using host.docker.internal SGTM!

gilded raven
#

I'll test it!

clever marsh
primal bloom
#

Worked pretty well as a heuristic, basically it enshrines the pattern that when your object has a chainable function, it will be presented to the LLM as a mutation (with easy undo)

#

I admit I haven't thought through how much of our existing multi-object problems this would solve

#

it doesn't remove the need for the LLM to handle IDs. But it does make it less load-bearing

#

maybe we could drop hints like:

clever marsh
#

still struggling to see how that would mechanically work, since to turn it into a var you need to somehow assign it to the result of that call. would we keep _save + "current state"?

primal bloom
#
  • _select("base_container") -> selected: base_container=<1234> (type: Container)
  • withExec(...) -> selected: base_container=<4567> (type: Container)
  • withNewFile(...) -> selected: base_container=<8976> (type: Container)
  • _duplicate("dev") -> selected: dev=<8976> (type: Container)
  • withExec() -> selected: dev=<abab> (type: Container)
  • directory(...) -> selected: (unsaved)=<xyxy> (type: Directory)
  • _save("build") -> selected: build=<xyxy> (type: Directory)
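A minimal sketch of how the hints above could be tracked mechanically. It assumes that a chainable call (one returning the same type as the selected binding) rebinds the selected name in place, and that any other return type becomes an unsaved selection. The `Sandbox` class, its method names, and the `call` stand-in are hypothetical illustrations, not Dagger's API; object IDs are passed in by hand instead of coming from real calls.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Selection:
    name: Optional[str]  # None means the selection is not saved under a name
    obj_id: str
    type_name: str

    def describe(self) -> str:
        label = self.name if self.name is not None else "(unsaved)"
        return f"selected: {label}=<{self.obj_id}> (type: {self.type_name})"


class Sandbox:
    """Hypothetical tracker for named bindings plus one current selection."""

    def __init__(self, bindings: Dict[str, Tuple[str, str]]):
        self.bindings = dict(bindings)  # name -> (object id, type name)
        self.current: Optional[Selection] = None

    def _require_selection(self) -> Selection:
        if self.current is None:
            raise RuntimeError("nothing selected")
        return self.current

    def select(self, name: str) -> str:
        obj_id, type_name = self.bindings[name]
        self.current = Selection(name, obj_id, type_name)
        return self.current.describe()

    def call(self, new_id: str, type_name: str) -> str:
        """Stand-in for an API call that returned object new_id of type_name."""
        prev = self._require_selection()
        if prev.name is not None and type_name == prev.type_name:
            # Chainable call: treat it as a mutation of the selected binding.
            self.bindings[prev.name] = (new_id, type_name)
            self.current = Selection(prev.name, new_id, type_name)
        else:
            # Different type (or nothing named): result is selected but unsaved.
            self.current = Selection(None, new_id, type_name)
        return self.current.describe()

    def save(self, name: str) -> str:
        cur = self._require_selection()
        self.bindings[name] = (cur.obj_id, cur.type_name)
        self.current = Selection(name, cur.obj_id, cur.type_name)
        return self.current.describe()

    def duplicate(self, name: str) -> str:
        # Mechanically the same as save: bind the current object to a new name.
        return self.save(name)


if __name__ == "__main__":
    sb = Sandbox({"base_container": ("1234", "Container")})
    print(sb.select("base_container"))
    print(sb.call("4567", "Container"))   # withExec(...)
    print(sb.call("8976", "Container"))   # withNewFile(...)
    print(sb.duplicate("dev"))
    print(sb.call("abab", "Container"))   # withExec()
    print(sb.call("xyxy", "Directory"))   # directory(...)
    print(sb.save("build"))
```

In this sketch `duplicate` and `save` end up identical, which fits the later suggestion in the thread that `_duplicate_*` may be redundant once `_save` exists.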
clever marsh
#

@primal bloom lol, you'll like this

  • Should we still call these "variables," or should we adopt a new term?

    Since they now auto-update, "variable" might be misleading. Would "binding," "reference," or "tracker" be more appropriate?

primal bloom
#

objects?

clever marsh
#

"binding" is growing on me, since the parallels to service bindings are kind of there now

#

in that it's a living stateful mutable thing

#

also: do we keep _undo? and maybe have per-variable history?

primal bloom
#

yeah binding is great for the DX outside the sandbox (i.e. LLM.withFooBinding). From inside the sandbox, we might need another word (selecting / saving are not verbs that usually apply to a binding)

#

but the llm might disagree with me on that 🙂

primal bloom
clever marsh
#

one of the tests relies on it 😛 it's easy to keep

#

but i'll try without and see how it goes

#

the test pretty explicitly prompts it to do it. i remember there was a time it would do things like apt-get remove instead, but that might have just been poor prompt engineering on my part

primal bloom
#

I meant more for the cognitive load on the LLMs, not so much complexity of our code

#

I guess it's not that much either - but who knows, I'm getting paranoid 🙂

clever marsh
#

also, i'm assuming you still want individual _select_foo and _duplicate_bar tools? since that helps the model a lot

#

actually maybe we don't need _duplicate_* if we have _save

#

just select the old var, and then save a new var

primal bloom
#

what's cool is that we can also let it query the history, with the corresponding IDs and commands. So it could use undo in a precise manner
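A rough sketch of what a queryable history with precise undo could look like for a single binding, assuming each step records the command that ran and the ID it produced. `BindingHistory`, `HistoryEntry`, `query`, and `undo_to` are hypothetical names for illustration only, and the IDs are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HistoryEntry:
    step: int
    command: str
    obj_id: str


class BindingHistory:
    """Hypothetical per-binding history of (command, resulting ID) pairs."""

    def __init__(self, initial_id: str):
        self.entries: List[HistoryEntry] = [HistoryEntry(0, "_select", initial_id)]

    @property
    def current_id(self) -> str:
        return self.entries[-1].obj_id

    def record(self, command: str, obj_id: str) -> HistoryEntry:
        entry = HistoryEntry(len(self.entries), command, obj_id)
        self.entries.append(entry)
        return entry

    def query(self) -> List[str]:
        # What the model would see when it asks for the history.
        return [f"{e.step}: {e.command} -> <{e.obj_id}>" for e in self.entries]

    def undo_to(self, step: int) -> str:
        # Roll back to the state right after the given step, dropping later ones.
        if not 0 <= step < len(self.entries):
            raise IndexError(f"no such step: {step}")
        del self.entries[step + 1:]
        return self.current_id


if __name__ == "__main__":
    history = BindingHistory("1234")
    history.record("withExec", "4567")
    history.record("withNewFile", "8976")
    print("\n".join(history.query()))
    print(history.undo_to(1))
```

Because every entry carries its own ID, the model can name the exact step it wants to return to instead of calling a blind undo repeatedly.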

primal bloom
finite perch
#

I'm gonna try to put together my notes and share those with you

#

Notes:

Without documentation, the Dagger engineer and I were both confused about certain aspects. This comes with rapid enhancements, but it was a bit confusing.

How exactly to use an LLM in Dagger itself was confusing. I wasn't sure if the message is sent immediately or if there's a way to trigger a message.

Being able to work with streaming messages would be nice too

I had issues with the Enum and passing it along to the LLM. It wouldn't accept the Enum Definition that I created with the class.

Having a simplified approach to do API calls would be nice to have, as a lot of the scripts I'll write are going to be API calls.

Gemini Woes were not fun to try and debug 😦

I wasn't sure what tools/features come with any specific argument I passed along. For example, if I share a file, what tools should I have access to? What requires a separate workspace for the agent to work with?