#agents

spring wave
#

@shrewd ermine ok try now

shrewd ermine
#

building now

#

ok i ran this one that worked on llm.8 and its the same response on llm latest

worn hill
#

whoever made this was a strange person

#

incredible dedication to Bear Grylls

storm gate
#

It's now possible to use the dockerfile-optimizer standalone without the Github PR flow (with dagger-llm.8)

$ dagger shell -m github.com/samalba/agents/dockerfile-optimizer
⋈ src=$(git https://github.com/samalba/demo-app | head | tree)
⋈ optimize-dockerfile-from-directory $src | file Dockerfile | contents

Thanks for the suggestion @smoky ocean

spring wave
#

dammit

quiet ether
spring wave
#

context: adding a hint to the model when its context changes, trying to avoid this:

✔ container | from alpine | .llm 1.7s
│🤖 Thank you for letting me know. How can I assist you with the current container context?
│ ┃ 1.4s ◆ Input Tokens: 6,971 ◆ Output Tokens: 20
LLM@xxh3:7bf41186fcf1a6ee
spring wave
#

instead of replying it ran a tool to reply -_-

#

@smoky ocean you can tag llm.9 whenever, brought back withPromptVar and made multi-object lazily enabled (when you set a var)

worn hill
#

it really really really wants to use tools to the point it just never responds

smoky ocean
#

0.17.0-llm.9 🧵

smoky ocean
#

🚨 🚨 🚨 new release: 0.17.0-llm.9. Now with revolutionary multi-object support 🙂 The design is not fully baked so we also left single-object, to avoid breaking existing modules and scripts.

#

@spring wave Ctrl-L doesn't work for me on that version

spring wave
#

(unironically - there's a sleep(100ms) in there to time the clearing for when the scrollback is flushed, quite unfortunate)

smoky ocean
#

worked in ghostty. not in zed

smoky ocean
#

@spring wave @shrewd ermine FYI I think llm.9 does break all modules, because of capitalization changes

spring wave
smoky ocean
spring wave
#

Zed might be eating the Ctrl+L keybind entirely, I know Cursor defaults to using it for chat stuff

#

or does it work outside of dagger

smoky ocean
#

OK in zed after re-starting the shell, it works on second try. can't repro the issue in zed. so I guess we're good 🙂

spring wave
#

also, tiny thing: I made it so submitting an empty shell input starts and immediately cancels a span, instead of doing nothing, since that's a muscle memory (to add spacing). lemme know if it's weird

smoky ocean
#
  • dagger -m github.com/samalba/agents/dockerfile-optimizer: ./main.go:66:76: undefined: dagger.Llm
#
  • dagger -m github.com/shykes/toy-programmer -> loads ok 🤷‍♂️
spring wave
#

ah, the type was renamed but the constructor is still dag.Llm - but I think that's a TODO

shrewd ermine
#

Nice!

smoky ocean
#

Oh I think it's a typo in the first module

#

OK I understand now - not a typo, Sam's module just exposes the LLM type in its API, which is relatively rare - but llm.9 does break that

#

Actually the problem is not actually exposing it's just referencing the type anywhere in his code

#

(in his case it's an internal function)

#

this is tricky because if I open a PR to his module, it will break for pre-llm.9 users

#

which as of right now is 100% of our small testing pool 🙂

#

hey I reference that type also in my melvin module...

#

yup it's broken also 😭

spring wave
#

bummer. ah well, good to get these out of the way sooner than later

smoky ocean
#

Yeah basically it means we have to be aggressive in getting people to update engine and modules

#

or, slow-roll on both

storm gate
spring wave
#

complete with the classic running prompt as shell first

#

i think it would have figured things out more easily if _objects included all objects, not just variables, with the name of the call that created it (like Container@xxh3:abcdef.from(args...)). not sure if worth it, maybe there's another way

smoky ocean
spring wave
#

yea

#

it would be nicer if it would have just recognized it from the tool output though

smoky ocean
#

I still think reusing the trick from single-object, with auto-saving each variable, would help overall a lot

spring wave
#

feels like it just needs a small hint

spring wave
#

but this is a very specific scenario, not sure if it'd be common

smoky ocean
#

we can always give it "rewind KEY" and "history KEY" stuff like that

spring wave
#

ah true

#

that feels like basically what it's trying to do now, though

#

just instead of keys it's object ids

storm gate
shrewd ermine
#

Ideally compat mode saves us from relying on something like that, but yes I'd use it 😄

shrewd ermine
spring wave
#

why do we configure a system prompt for anthropic and gemini but not openai? thinkspin

shrewd ermine
#

Special as in we might want to make it configurable or something more specific

spring wave
#

maybe it nudges it to use tools?

#

i think there's a withSystemPrompt already

shrewd ermine
#

I don't know if thats wired up? I thought for Gemini and anthropic it was passed to the constructor

#

oh yeah duh it adds a message as system. I don't know what happens if you do that to gemini. Let's see

#

oh wait no I had it the first time, it's not wired up ! no function "with-system-prompt" in type "LLM"

shrewd ermine
river belfry
#

I was looking at making the llm part work with llama.cpp. Has anyone already tried?
My understanding is that while both tools and streaming are supported, they are not supported at the same time.
So I started to change the code a bit to handle that, but it's not calling the tools anymore. For instance I have this message in the output "You can write this code in the workspace and then build it by calling ToyWorkspace_build.".
I'll have a deeper look, but if anyone has some ideas 🙂

smoky ocean
#

@spring wave want to bikeshed multi-object design later?

#

I could setup another eval maybe

river belfry
spring wave
#

multi-object prompt/metaphor engineering

lilac crystal
river belfry
#

I wonder if it isn't related, but with ollama and llama3.2 on my machine, I have the same behavior.
I'm running the version llm.9 and if I'm trying toy-programmer with the classical go-program "develop a curl clone" | terminal it does nothing.
By nothing I mean

✔ go-program "develop a curl clone" | terminal 4.7s
│🧑 You are an expert go programmer. You have access to a workspace
│ ┃ 0.0s
│
│🧑 Complete the assignment written at assignment.txt
│ ┃ 0.0s
│
│🧑 Don't stop until the code builds
│ ┃ 0.0s
Container@xxh3:f699019bc6b5b1b3

And nothing more.
So this looks like the behaviour I have with llama.cpp
(If useful https://dagger.cloud/eunomie/traces/9fbb1d84e75179ccf9687c71b06a9a2d)

merry scarab
#

If I am getting an error like this

│🤖 0.8s
│ ! POST "https://api.anthropic.com/v1/messages": 400 Bad Request
│ ! {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long:
│ ! 211147 tokens > 200000 maximum"}}
! input: llm.withContainer.withPrompt.id select: POST
! "https://api.anthropic.com/v1/messages": 400 Bad Request
! {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long:
! 211147 tokens > 200000 maximum"}}

Is there any easy way for me to see what the prompt was in dagger?

shrewd ermine
spring wave
shrewd ermine
#

oh i missed it, what was the regression on llm.9?

spring wave
#

v0.17.0-llm.10

smoky ocean
#

To everyone using Dagger's agent features: what do you think of the new website? Do you recognize the reasons you personally are excited about Dagger? https://dagger.io

Dagger is an open-source runtime for composable workflows. It's perfect for systems with many moving parts and a strong need for repeatability, modularity, observability and cross-platform support.

abstract iron
subtle surge
abstract iron
#

I've seen a company selling a product like that but I guess it can be made using dagger as well

subtle surge
shrewd ermine
shrewd ermine
#

@spring wave trying the . | .llm thing on llm.10. What's the flow look like? I tried

. | .llm
> are my tests passing?
spring wave
shrewd ermine
#

hmm i think it got confused about arguments. that makes sense I guess because I should give it variables for those

#

yup now we're good

spring wave
#

if your module constructor takes args you'll need to pass them to .

#

was that it?

#

so like . arg1 arg2 | .llm

shrewd ermine
#

I'm using the module from the quickstart and I forgot it doesn't use context directories. So I made a source=$(directory | with-directory / .) and it figured it out

#

multi object 🚀

spring wave
#

ah ok nice

woeful quiver
#

That new prompt for external access to modules is 🔥

smoky ocean
#

@spring wave quick feedback from ongoing mini-workshop: multi-object without auto-save requires custom prompting every time (at least on gpt-4o): "save to the same variable when you're done"

subtle surge
#

Hi new and old Daggernauts!

If you're new here, we host a Dagger Community Call every other week to showcase what the community is building. You can check out past calls here: https://www.youtube.com/@dagger-io/streams.

Are you working on a Dagger Agent project? We'd love to highlight your work in an upcoming call...and yes, there will be Dagger swag involved! 😃

Your project doesn't need to be finished. We love seeing work in progress and half-baked ideas.

If you're interested, DM me and I'll be happy to add you to the agenda or answer any questions. Looking forward to seeing what you're building!

smoky ocean
#

Workshop feedback 🧵

merry scarab
#

I'm stuck in a doom loop

│🤖 0.2s
│ ! POST "https://api.anthropic.com/v1/messages": 400 Bad Request
│ ! {"type":"error","error":{"type":"invalid_request_error","message":"messages.7:
│ ! `tool_use` ids were found without `tool_result` blocks immediately after:
│ ! toolu_016BtYBAoQfwMzn4P7CeBD2p. Each `tool_use` block must have a corresponding
│ ! `tool_result` block in the next message."}}
! input: llm.withPrompt.loop.withPrompt.loop.withPrompt.loop.setContainer.withPrompt.sync
! select: POST "https://api.anthropic.com/v1/messages": 400 Bad Request
! {"type":"error","error":{"type":"invalid_request_error","message":"messages.7: `tool_use`
! ids were found without `tool_result` blocks immediately after:
! toolu_016BtYBAoQfwMzn4P7CeBD2p. Each `tool_use` block must have a corresponding
! `tool_result` block in the next message."}}

Anyone seen this before?

merry scarab
spring wave
#

@merry scarab that's fixed on llm tip

merry scarab
spring wave
#

v0.17.0-llm.11

merry scarab
#

am i being dumb?

I expect this prompt to use the container I give it, but instead it tries to use ubuntu

● llm | with-container $(container | from alpine) | with-prompt "you have access to a container, us
│🧑 you have access to a container, use it to install chromium
│ ┃ 0.0s
│
│🤖 I'll help you install Chromium using a container. I'll use an Ubuntu base image and install
│ ┃ Chromium using apt.
│ ┃ 2.9s ◆ Input Tokens: 11,846 ◆ Output Tokens: 83
│
│ ✔ Container.from(address: "ubuntu:latest"): Container! 1.1s
│
│ ✔ remotes.docker.resolver.HTTPRequest 0.1s
│ ✔ remotes.docker.resolver.HTTPRequest 7.0s
shrewd ermine
merry scarab
storm gate
#

Question: since -llm.{10,11}, the cli fails to read my api keys from the env, and it now fails reading from a (non-existent) .env file. Did anything change or do we have a regression?

shrewd ermine
storm gate
shrewd ermine
#

oh but it does answer? 🤔 not like a 403 or something? What provider are you hitting? I'm mainly using ollama and gemini

spring wave
storm gate
#

It's not bothering at all, they stay collapsed by default and the overall flow works without errors.

spring wave
#

oh, yeah the collapsing was the solution 😛 - it should only expand if they all fail

#

would be nice to avoid them in the first place for sure

shrewd ermine
spring ocean
#

I'm from @smoky ocean's hack day. I was playing with the agent and Gemini and I was struggling to have it find tools:
I've attached the output from my interaction with the model. I have been asked to ping @spring wave

merry scarab
smoky ocean
#

@shrewd ermine I have function masks almost working 🙂

smoky ocean
#

pushing the branch

shrewd ermine
#

what's the UX? I pass a list of function names as opts?

smoky ocean
shrewd ermine
#

very cool. Building it now

#

still building 😅 I need @spring wave 's pc

shrewd ermine
#
✘ llm | with-container $ctr --function-mask=withExec,rootfs,directory 0.0s
! input: llm.withContainer index 0 out of bounds
│ ✘ LLM.withContainer(
│ │ │ functionMask: ["withExec", "rootfs", "directory"]
│ │ │ value: ✔ Container.withWorkdir(path: "/app"): Container! 0.0s
│ │ ): LLM! 0.0s
│ ! index 0 out of bounds
! input: llm.withContainer index 0 out of bounds
smoky ocean
#

😦

#

I wasn't able to test it, ran into unrelated LLM hang issues

#

probably something stupid. will look into it

shrewd ermine
#

ah got it, wasn't sure if i was holding it wrong

smoky ocean
#

@shrewd ermine is there a stack trace in the engine?

shrewd ermine
smoky ocean
#

ah...

#

I smell something stupid

shrewd ermine
#

haha yeah...

steep onyx
#

dagql arrays are 1-indexed, you probably want i+1

#

In the call to Nth
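
A minimal sketch of the off-by-one described above, assuming a list type whose Nth accessor counts from 1. The `oneIndexed` type and `collect` helper are hypothetical stand-ins for illustration, not the real dagql API.

```go
package main

import "fmt"

// oneIndexed is a hypothetical stand-in for a list whose Nth accessor
// counts from 1. It is not the real dagql type.
type oneIndexed []string

// Nth returns the element at 1-based position n.
func (l oneIndexed) Nth(n int) (string, error) {
	if n < 1 || n > len(l) {
		return "", fmt.Errorf("index %d out of bounds", n)
	}
	return l[n-1], nil
}

// collect walks the list with an ordinary 0-based Go loop and converts
// to the 1-based convention at the call site (i+1).
func collect(l oneIndexed) ([]string, error) {
	out := make([]string, 0, len(l))
	for i := range l {
		v, err := l.Nth(i + 1)
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	mask := oneIndexed{"withExec", "rootfs", "directory"}
	if _, err := mask.Nth(0); err != nil {
		fmt.Println("Nth(0):", err)
	}
	names, _ := collect(mask)
	fmt.Println(names)
}
```

Passing the raw loop index `i` instead of `i + 1` would fail on the very first iteration, which matches the "index 0 out of bounds" message.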

shrewd ermine
#

that would do it! I'm sure there's a fun story behind that

#

trying that

#

confirmed that fixes it 🙏

next speedbump

✘ llm | with-container $(container | from golang:latest) --function-mask=withExec,rootfs,directory | with-prompt "write a curl clone" | container | terminal 0.0s
! input: llm.withContainer.withPrompt.container instantiate: cannot instantiate dagql.Class[*github.com/dagger/dagger/core.Container] with core.maskedValue
spring wave
#

it started out as a "i don't want to deal with *int everywhere", how did it end up like this...

smoky ocean
#

I think my "clever" approach to add masking with minimal changes, might be a little too naive

#

will need to add a little more substance to it tomorrow

#

(tldr I wrap the actual value, of interface type dagql.Typed , with a simple wrapper type that keeps the original value embedded, and adds just the mask field:

type maskedValue struct {
 dagql.Typed
 mask []string
}

I was banking on the fact that my maskedValue still implements the Typed interface, with the original value - the perfect passthrough.

Except not at all, because callers try to cast it back to the original type, and I can't pass through type assertions (I guess))

#

I think I'll need to define a new interface, and use that instead of dagql.Typed across the whole llm.Withxxx call chain
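
A minimal sketch of why the wrapper breaks, and one possible shape for the "new interface". Everything here is an assumption for illustration: `Typed` is a simplified stand-in for dagql.Typed, and `Container`, `Unwrapper`, and `asContainer` are hypothetical names.

```go
package main

import "fmt"

// Typed is a simplified stand-in for dagql.Typed (assumption: the real
// interface has a different method set).
type Typed interface {
	TypeName() string
}

// Container is a hypothetical concrete type implementing Typed.
type Container struct{ Image string }

func (c *Container) TypeName() string { return "Container" }

// maskedValue mirrors the wrapper from the message above.
type maskedValue struct {
	Typed
	mask []string
}

// Unwrapper is one possible "new interface": anything that wraps a
// Typed value can hand back the original.
type Unwrapper interface {
	Unwrap() Typed
}

func (m maskedValue) Unwrap() Typed { return m.Typed }

// asContainer peels off any wrappers before asserting the concrete type.
func asContainer(v Typed) (*Container, bool) {
	for {
		u, ok := v.(Unwrapper)
		if !ok {
			break
		}
		v = u.Unwrap()
	}
	c, ok := v.(*Container)
	return c, ok
}

func main() {
	var v Typed = maskedValue{
		Typed: &Container{Image: "alpine"},
		mask:  []string{"withExec"},
	}

	// Method calls pass through the embedded interface.
	fmt.Println(v.TypeName())

	// The dynamic type is maskedValue, so a direct assertion fails.
	_, ok := v.(*Container)
	fmt.Println("direct assertion:", ok)

	c, ok := asContainer(v)
	fmt.Println("after unwrap:", ok, c.Image)
}
```

Embedding promotes methods, but a type assertion checks the dynamic type of the value, which is the wrapper. Callers therefore need an explicit unwrap step somewhere in the chain.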

shrewd ermine
#

Yeah makes sense!

smoky ocean
#

@spring wave @shrewd ermine multi-object DX bikeshed. How do you feel about:

LLM.bindContainer("foo", ...)
LLM.bindings()
LLM.binding("foo").container()
LLM.bindToyWorkspace("bar", ...)
LLM.bindings()
LLM.binding("bar").toyWorkspace()
LLM.bindString("baz", ...)
LLM.bindings()
LLM.binding("baz").string()

Benefits:

  • Consistency around the word "binding". Everything related to bindings has the common root "bind".
  • bindings() allows for listing existing bindings -> that's a gap currently
  • binding() groups all the getters. So that cuts the volume in half right there
  • binding() and bindings() will always be listed immediately after bindFoo because of the uppercase/lowercase sorting
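
A rough in-memory model of the proposed shape, to make the bikeshed concrete. This is only a sketch: the types below are hypothetical and do not use the Dagger SDK, and `AsString` stands in for the proposed `string()` getter.

```go
package main

import (
	"fmt"
	"sort"
)

// Container is a placeholder for a real object type.
type Container struct{ Image string }

// Binding is one named value. All typed getters live here.
type Binding struct {
	name  string
	value any
}

func (b Binding) Name() string { return b.name }

// Type reports the Go type of the bound value.
func (b Binding) Type() string { return fmt.Sprintf("%T", b.value) }

// Container returns the value if it is a container, or an error otherwise.
func (b Binding) Container() (*Container, error) {
	c, ok := b.value.(*Container)
	if !ok {
		return nil, fmt.Errorf("binding %q holds %s, not a container", b.name, b.Type())
	}
	return c, nil
}

// AsString returns the value if it is a string, or an error otherwise.
func (b Binding) AsString() (string, error) {
	s, ok := b.value.(string)
	if !ok {
		return "", fmt.Errorf("binding %q holds %s, not a string", b.name, b.Type())
	}
	return s, nil
}

// LLM carries only the bindings in this sketch. Setters return a copy.
type LLM struct{ bindings map[string]any }

func (l LLM) with(name string, v any) LLM {
	next := make(map[string]any, len(l.bindings)+1)
	for k, val := range l.bindings {
		next[k] = val
	}
	next[name] = v
	return LLM{bindings: next}
}

func (l LLM) BindContainer(name string, c *Container) LLM { return l.with(name, c) }
func (l LLM) BindString(name, s string) LLM               { return l.with(name, s) }

// Bindings lists every binding, sorted by name.
func (l LLM) Bindings() []Binding {
	out := make([]Binding, 0, len(l.bindings))
	for k, v := range l.bindings {
		out = append(out, Binding{name: k, value: v})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].name < out[j].name })
	return out
}

// Binding looks up a single binding by name.
func (l LLM) Binding(name string) (Binding, error) {
	v, ok := l.bindings[name]
	if !ok {
		return Binding{}, fmt.Errorf("no binding named %q", name)
	}
	return Binding{name: name, value: v}, nil
}

func main() {
	llm := LLM{}.
		BindContainer("foo", &Container{Image: "alpine"}).
		BindString("baz", "hello")

	for _, b := range llm.Bindings() {
		fmt.Println(b.Name(), b.Type())
	}
	if b, err := llm.Binding("foo"); err == nil {
		c, _ := b.Container()
		fmt.Println(c.Image)
	}
}
```

Grouping the getters on `Binding` means each new type adds one setter on `LLM` and one getter on `Binding`, rather than a setter and a getter both on `LLM`.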
shrewd ermine
#

Works for me! Would the value from .binding() have a GetType or something too?

smoky ocean
#

well it has type()

#

could be getType() or asType()

#

went for the shortest as a baseline

shrewd ermine
#

Yeah makes sense, was just trying to think about using values from bindings()

smoky ocean
#

alternative: replace LLM.bindFoo with LLM.withFooBinding

shrewd ermine
smoky ocean
#

In practice callers are supposed to know the type they want

shrewd ermine
#

Yeah I was trying to think of the case where the LLM was able to save to a new variable (if that's going to be allowed) and how to safely find that

#

And withFooBinding sounds good too

#

I do like the parallels to container.withServiceBinding

smoky ocean
#
ctr, err := dag.LLM().WithPrompt("please save the container to $foo. don't mess up please").Binding("foo").Container()
if err != nil {
 panic("you had one job")
}
proper stratus
#

Where can I find examples of using multi-object? Wanna try that out

smoky ocean
#

@proper stratus docs update coming very soon!

#

@proper stratus in the meantime, you can try it straight from the shell:

  1. Start dagger shell
$ dagger

Make sure it's v0.17.0-llm.11 (released today)

  2. Set a few variables in the shell
ctr=$(container | from alpine | with-new-file hi.txt "Hi Bob")
dagger_repo=$(git https://github.com/dagger/dagger)
  3. Switch to "prompt mode"
>
  4. Start prompting
I gave you a container and a git repository. First, open the file hi.txt in the container, and tell me its contents. Then, fetch the last stable release of the git repo, get the subdirectory docs/, and copy them into the container I gave you. Save the result to new_container
  5. Switch back to shell mode
!
  6. Check that the new container was created
$new_container | terminal
#

(it's simpler than it seems in written form)

spring wave
#

side note: been considering tab to swap between prompt/shell, assuming the input is empty (have to compete w/ tab completion)

smoky ocean
#

@proper stratus in code, you can use LLM.Set<Foo>() and LLM.Get<Foo>() where <Foo> is the binding type. If you're familiar with the single-object API, it's the same, except you need to specify a key

spring wave
#

is that from something?

smoky ocean
#

or Ctrl-<something>

smoky ocean
#

conveniently placed. prime location

#

in an up-and-coming neighborhood

spring wave
#

lol

proper stratus
#

If I run the module in Dagger shell, does the agent know the TUI output? I want to try giving the agent that output so it can help me improve my Dagger module performance.

smoky ocean
#

I believe @merry scarab was working on something very similar just today

#

also, @spring wave is working on allowing agents to access your current module's dependencies. That would allow you to install the modules of your choice, then have the agent call them directly

proper stratus
#

So currently if I give it a module, it just knows what I write in that module, not the dependencies I install in that module?

smoky ocean
#

yeah 1) it can't access the dependencies and 2) it can't call the module constructor, you have to call it and bind the object instance to the llm

smoky ocean
#

Today's workshop made me think about API integration.

@spring ocean and @wraith remnant worked on an agent that involves a lot of them. At the moment it's possible to write Dagger modules that wrap cloud APIs, and there are benefits to that - but it's labor-intensive. The DX is cumbersome and there are gaps, for example Dagger/GraphQL types don't map directly to JSON and OpenAPI (e.g. no maps). I believe @violet stump, @olive badge and @uneven depot brought this up in the past.

What if we added first-class support for external APIs somehow? Maybe as a special kind of dependency - imagine if your dagger.json could have remote APIs as a dependency, and the engine exposed that as a dagql module? The dependency source could be an OpenAPI/Swagger/GraphQL schema of some kind (I'm sure there are catalogs out there). They would be loaded by a special builtin SDK. Could be a big boost to our DX

uneven depot
#

That's a neat idea! I wonder if that same idea can extend to CLI tools also? That's what I end up wrapping more than APIs. I usually try to get an official image for the tool, if not pull an Alpine container and install it. CLI tools don't have a common structure though, like openAPI so, it's probably impossible. There's no guarantee a rest API follows the oapi spec also, so consumers may end up with weird errors that they can't directly identify because the api is wrapped in another SDK.

woeful quiver
#

I'm making a few changes and will polish and publish, but I thought it would be fun to write a database agent that can take a database connection and answer questions. I'm using an example database for dvd rentals.

shrewd ermine
#

👀

woeful quiver
#

I saw that yesterday, was asked for that very feature last night at the meetup 🔥

spring wave
#

FrogeAlarm llm has been merged into main FrogeAlarm

smoky ocean
# smoky ocean @spring wave @shrewd ermine multi-object DX bikeshed. How do you...

Follow-up to DX thread @spring wave @shrewd ermine. Should we consider spinning out a LLMEnvironment type, separate from LLM? The former would have the bindings & state management. The latter would have the prompting and endpoint routing. Soon there will be MCP that currently grafts onto LLM. But would now cleanly graft onto LLMEnvironment instead.

Maybe makes the modules code cleaner also? Clear delineation of the LLM vs. its environment?

shrewd ermine
#

you have my attention 🙂

spring wave
#

yeah, was thinking something similar

shrewd ermine
#

"environment" is the accepted industry term for where llm's interact with their tools and state right?

#

or is it more specific

smoky ocean
#

it will be 😇

#

i think the industry is stuck on "tools" and will soon realize that they need more. Environment in my opinion is the next logical evolution, and I think we should spearhead it.

#

An environment implies 1) objects 2) state 3) rules for how objects interact

#

all of which dagger can provide

#

cc @noble notch 👆

shrewd ermine
#

ship it!

smoky ocean
#

Environment API 🧵

noble notch
smoky ocean
#

loop() 🧵

storm gate
#

I switched from the llm tag release, to main. @worn hill it'd be nice if we could set the --allow-llm from an env var for the CI. I thought setting DAGGER_LLM_ALLOW=all would override the cli arg but it does not work

worn hill
storm gate
quiet ether
# smoky ocean or `Ctrl-<something>`

asked ChatGPT about that but didn't get any hot takes. Mostly referencing vim's modes and python's ! special character

having said that, C-/ maps to ^_ on my keyboard (I think a bunch of them do for some reason), which bash in my case then uses for undo. FWIW C-[ and C-] are not currently remapped to anything and it seems like bash doesn't use them

spring wave
#

quick idea: LLM.interrogate - like Container.terminal but for debugging an LLM. Runs the .sync and then pops you into interactive prompt-mode shell so you can ask it why it messed up

#

or, a way to pipe a LLM to .llm so you can load it as your current session, then you can at least change your function to return *LLM

quiet ether
#

Ideally you'd want to interrupt it when you know it's just not going anywhere, right?

spring wave
#

yeah that's another thing i've been wondering, if we do that we can get -i to do it automatically which is even better

#

for what i suggested you'd just splice it in after your last prompt before things go haywire, and hope it does it again. (same as splicing in .Terminal())

spring wave
#

MVP could just be grepping for "sorry" haha

woeful quiver
woeful quiver
smoky ocean
#

@spring wave probably safest and most portable to have _error builtin that llm can call to report an error

#

I love the idea of explicit LLM.terminal()

#

and I think prompt mode in the CLI should use it

#

separately, I think it would be SUPER powerful if you could just save variables of type LLM, and automatically the prompt mode shortcut can cycle through them. the default LLM would be a special case of this

I like this variable-based approach better than .llm which is too close to llm

spring wave
noble notch
noble notch
river belfry
woeful quiver
river belfry
shrewd ermine
polar loom
#

Hi
I've written an example of how to use an agent with Dagger in Python:
https://github.com/azorej/dagger-agent-example

Nothing special: I've used kpenfound/dag/workspace as the base for my workspace module and wrote a simple function to fix Dockerfile.

The most interesting part: I'm using a devcontainer to simplify setup, so it will be easier for others to try out the example.
I haven't seen a lot of use for devcontainers in the Dagger examples, and it's not very practical to have different versions of Dagger installed on one machine.

Therefore, it would be great if we could normalize the use of devcontainers.

#

btw, I am not sure how code generation works in Dagger.
Do I need to maintain a separate workspace module?
Or can I use the same module for both orchestration and the agent workspace?

smoky ocean
# polar loom btw, I am not sure how code generation works in Dagger. Do I need to maintain a ...

Ideally you would not need to maintain a separate module (while being free to, if you want)

There is a temporary limitation which prevents a module from calling itself via the Dagger API. We are working on allowing this. By extension, this also prevents a module from creating a binding to its own types, for a LLM to use.

This is why at the moment you have to separate the module being referenced by a LLM binding, and the module doing the binding.

--> hopefully this makes sense!

smoky ocean
#

I'm giving a live demo tonight... Should I show multi-object or not? 🙂

shrewd ermine
#

ok the ❌ isn't helpful lol. I would vote no just because the DX is still up for discussion (I think? unless the WithFooBinding is in) and the reliability is in question depending on your model and objects

woeful quiver
#

Also, the deprecation underlines (at least in Zed) with the current with<> makes my eyes wander like I wrote broken code/syntax

shrewd ermine
#

oh but isn't the deprecation warning for single object?

spring wave
proud sigil
#

Hi everyone, I've recently begun exploring Dagger, love the idea of building containers for AI agents. I'm curious if there are common patterns or best practices for picking which models to use. It could be because you want to try different models for the same task and compare. Or, you could be building something that benefits from multiple models each with a specialized task.

shrewd ermine
# proud sigil Hi everyone, I've recently begun exploring Dagger, love the idea of building con...

Welcome! Definitely check out https://docs.dagger.io/ai-agents#faq , it's a bit bare right now but we've been working on adding best practices as we can. As far as model selection, claude 3.7 and gpt-4o seem to be pretty capable in general. I've been enjoying gemini-2.0-flash too but you need to get the prompting just right for it to be successful. For coding tasks, qwen2.5-coder of whatever size you can run has been good too, but also needs just the right prompting and configuration

proud sigil
#

Thanks, I'll check out the FAQ. Do you think it would be worthwhile to build a module that could abstract away the model selection? As in, not have to fret about which is the current SOTA model for X and just have the module enable the current best? I know that sounds a bit abstract.

#

I imagine with more "vibe coding", you just forget about which model(s) and say "give me the best model that I can run on this machine right now for this task."

shrewd ermine
#

Yeah it's an interesting problem. Most of the functions I've been writing specify a default model but allow one to be passed in. The hard part that I've seen is that the prompting is somewhat model-specific so it's hard to just swap out the model and keep everything else the same

proud sigil
#

As it is, it's hard to keep track of which model identifier is right <model family><version>-<params>-<tuned for>-<quantized>. The naming conventions for these models is...rough.

#

Let alone the right prompt style/setup

shrewd ermine
#

Yeah I totally agree. It would be nice to have that kind of thing handled at some model router level since at the agent/Dagger level you don't necessarily know what models are available

proud sigil
#

It's something my team and I are looking into/building. I was looking at Dagger separately for workflow orchestration and then got nerd-sniped with agent containerization. Perhaps we can contribute.

shrewd ermine
#

Basically it would be cool if the agent could say "give me a model that meets these criteria" and the model server gives you the best fit

proud sigil
#

Which I think works great since containers may have the same functionality but access to different hardware resources or compute budget

#

So if the model registry could choose based on the resources available, it'd be a nice abstraction. Unless I'm misunderstanding the intent with containers.

shrewd ermine
#

Yeah it's a surprisingly similar problem to container orchestration/scheduling in platforms like kubernetes. The app doesn't say "put me on this node", it just says "give me a node with this cpu/memory and access to this volume"
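
The "give me a model that meets these criteria" idea can be sketched as a small scheduler-style filter. This is a sketch under stated assumptions: the registry, its fields, the model names, and the scores are all invented for illustration and do not describe any real model server.

```go
package main

import (
	"errors"
	"fmt"
)

// Model describes one entry in a hypothetical registry.
type Model struct {
	Name      string
	MemoryGB  int  // memory needed to serve the model
	ToolCalls bool // whether the model supports tool calling
	Quality   int  // arbitrary ranking score, higher is better
}

// Criteria is what the agent asks for, instead of naming a model.
type Criteria struct {
	MaxMemoryGB int
	NeedsTools  bool
}

// ErrNoModel is returned when nothing in the registry fits.
var ErrNoModel = errors.New("no model satisfies the criteria")

// Pick returns the highest-quality model that fits the criteria.
func Pick(registry []Model, c Criteria) (Model, error) {
	var best Model
	found := false
	for _, m := range registry {
		if m.MemoryGB > c.MaxMemoryGB {
			continue // does not fit the available resources
		}
		if c.NeedsTools && !m.ToolCalls {
			continue // missing a required capability
		}
		if !found || m.Quality > best.Quality {
			best, found = m, true
		}
	}
	if !found {
		return Model{}, ErrNoModel
	}
	return best, nil
}

func main() {
	registry := []Model{
		{Name: "small", MemoryGB: 4, ToolCalls: false, Quality: 1},
		{Name: "medium", MemoryGB: 16, ToolCalls: true, Quality: 2},
		{Name: "large", MemoryGB: 80, ToolCalls: true, Quality: 3},
	}
	m, err := Pick(registry, Criteria{MaxMemoryGB: 24, NeedsTools: true})
	fmt.Println(m.Name, err)
}
```

As noted in the discussion, picking the model is only part of the problem, since prompting tends to be model-specific.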

proud sigil
#

Yes, I think this is a good starting point.

smoky ocean
#

I was definitely thinking about adding models: [string] as an argument to LLM()

proud sigil
#

How would that work?

#

It can choose from the set or have fall backs if the first doesn't work?

smoky ocean
proud sigil
#

I like that affordance. I'd be curious how to build logic around the set of models. How to make it easier for the developer or the workflow to choose among the models in the set.

#

But allowing for multiple models is a good starting point IMO

smoky ocean
#

that would be the choice

proud sigil
#

Choice is good

smoky ocean
#

My personal coding agent 🧡

smoky ocean
#

as soon as 0.17 is out we can remove the custom install instructions from AI agent tutorial 🥳

woeful quiver
#

Is it possible to set a callback for the llm response? Meaning every single response from the llm I can capture and send somewhere?

smoky ocean
#

That history API is a bit barebones, but we can beef it up to distinguish messages by sender (LLM, user, tool)

#

I think for now you could filter it by emoji in the contents 😛
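
A minimal sketch of that emoji filter, assuming the history comes back as plain strings and that each entry starts with 🧑 for a user prompt or 🤖 for a model reply. Both the entry format and the `filterBySender` helper are assumptions, not a documented API.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed sender markers at the start of each history entry.
const (
	userMarker  = "🧑"
	modelMarker = "🤖"
)

// filterBySender keeps the entries that start with the given marker and
// returns their text with the marker removed.
func filterBySender(history []string, marker string) []string {
	var out []string
	for _, entry := range history {
		trimmed := strings.TrimSpace(entry)
		if strings.HasPrefix(trimmed, marker) {
			text := strings.TrimPrefix(trimmed, marker)
			out = append(out, strings.TrimSpace(text))
		}
	}
	return out
}

func main() {
	history := []string{
		"🧑 write a curl clone",
		"🤖 I'll start by creating main.go",
		"🧑 don't stop until the code builds",
		"🤖 The build succeeded",
	}
	for _, reply := range filterBySender(history, modelMarker) {
		fmt.Println(reply)
	}
}
```

Each model reply collected this way could then be forwarded wherever it needs to go, which approximates a response callback after the fact.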

woeful quiver
smoky ocean
#

@steep onyx @spring wave should I be worried that after running dagger develop with 0.17, my IDE autocompletes to dag.Llm and not dag.LLM ?

spring wave
#

if that reduces the worry 😛

#

i think we need to teach strcase about that acronym

#

ah, looks like we do that in core/ but the codegen code probably (hopefully) isn't loading core/
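
To illustrate what "teaching" a case converter about an acronym means, here is a small standalone sketch. It is not the strcase library or Dagger's codegen; `splitWords`, `toPascal`, and the `acronyms` table are hypothetical, and the code assumes ASCII identifiers.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// acronyms maps a lowercase word to its preferred spelling.
var acronyms = map[string]string{"llm": "LLM"}

// splitWords breaks an identifier into lowercase words, splitting on
// '-', '_' and on a lowercase-to-uppercase transition.
func splitWords(s string) []string {
	var words []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			words = append(words, strings.ToLower(string(cur)))
			cur = cur[:0]
		}
	}
	var prev rune
	for _, r := range s {
		switch {
		case r == '-' || r == '_':
			flush()
		case unicode.IsUpper(r) && unicode.IsLower(prev):
			flush()
			cur = append(cur, r)
		default:
			cur = append(cur, r)
		}
		prev = r
	}
	flush()
	return words
}

// toPascal joins the words in PascalCase, using the acronym table when
// a word has a preferred spelling. Assumes ASCII input.
func toPascal(s string) string {
	var b strings.Builder
	for _, w := range splitWords(s) {
		if a, ok := acronyms[w]; ok {
			b.WriteString(a)
			continue
		}
		b.WriteString(strings.ToUpper(w[:1]) + w[1:])
	}
	return b.String()
}

func main() {
	for _, name := range []string{"llm", "with-llm", "withLlm", "container"} {
		fmt.Println(name, "->", toPascal(name))
	}
}
```

Without the table entry, "llm" would come out as "Llm", which is the mismatch described above. The table has to be visible to whichever code path does the conversion.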

smoky ocean
#

(see very last error)

spring wave
#

looks like it tried to call a function with "app" in place of a FooID arg, essentially an unbound var

#

which didn't work because all the vars are app_*, and there's never just an app thinkspin wonder why it tried that

spring wave
spring wave
#

bots make typos too 🤗

smoky ocean
#

random question. if multiobj continues to cause problems, should we implement it as single object + shadowing?

#
  • the builtins system
smoky ocean
smoky ocean
#

@warped bramble @wraith remnant my guess is that your MCP pull request already works with multi-object... But only for models that don't need the crutch of a special system prompt

#

also @spring wave we could use the old "read the manual first" trick to inject the system prompt without making it a real system prompt - then it would work over mcp

spring wave
woeful quiver
spring wave
# smoky ocean @spring wave wdyt?

I do keep coming around to the idea that single-object is all we need for bootstrapping, and anything else can be implemented as a module that is able to maintain its own state (as the single object) . which i have done on a throwaway branch somewhere. Like I have a pretty strong feeling that different situations might call for different schemes, one of which being 100% control over the set of available tools to keep the model from jumping around and saving vars aimlessly.

smoky ocean
#

@spring wave I'm going to get back to dev mode today, I'm loaded up with demo feedback and papercuts. How do I do this in a way that doesn't conflict?

spring wave
smoky ocean
#

I'm thinking we should merge dagger mcp (hidden) asap to avoid conflict storm

warped bramble
#

One drawback of single-object (that i'm sure you're already aware of) is that just by bringing in dagger.Container's functions, you get ~70 tools. Which could grow quickly to the 128 tools limit.

spring wave
#

that's true for multi-object too

#

but, we never combine tools of multiple types, so that helps a bit

warped bramble
#

Ah sorry, i thought there was an indirection (not super familiar with it yet)

spring wave
#

i was only able to hit that limit by doing LLM.withLLM since it has a ton of getters/setters lol

smoky ocean
#

no you're right @warped bramble. in multi-object you get at most the tools of container; but it doesn't add up as you "unlock" more types
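A rough way to picture the tool-count point above, as a sketch only: if just the selected type's functions are bound, the tool list stays at that type's size, whereas combining every unlocked type would add up. The `Type_function` naming follows the `Query_...` tool names seen in traces in this channel; the function names and the Directory/File counts are made up for illustration, and only the ~70 figure for Container and the 128 limit come from the discussion.

```python
from typing import Dict, List

TOOL_LIMIT = 128  # the tool limit mentioned in the discussion


def tools_for(functions_by_type: Dict[str, List[str]], selected: str) -> List[str]:
    """Tools exposed when only the selected type's functions are bound."""
    return [f"{selected}_{fn}" for fn in functions_by_type[selected]]


def tools_for_all(functions_by_type: Dict[str, List[str]]) -> List[str]:
    """Tools exposed if every unlocked type's functions were combined."""
    return [f"{t}_{fn}" for t, fns in functions_by_type.items() for fn in fns]


# Illustrative counts: Container at ~70 is from the discussion,
# the Directory and File counts are invented for this example.
unlocked = {
    "Container": [f"fn{i}" for i in range(70)],
    "Directory": [f"fn{i}" for i in range(40)],
    "File": [f"fn{i}" for i in range(30)],
}

per_selection = len(tools_for(unlocked, "Container"))
combined = len(tools_for_all(unlocked))
print(per_selection, per_selection <= TOOL_LIMIT)
print(combined, combined <= TOOL_LIMIT)
```

With these made-up numbers the per-selection list fits under the limit while the combined list does not, which is the distinction being drawn.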

spring wave
#

(also, funny how a 128 limit is cropping up again, I remember that from the early Docker days with aufs :P)

smoky ocean
#

god...

#

did you hear the story of what it turned out to be? Which we discovered much later...

spring wave
#

hmmm was it the limit of the mount opts string length or something?

smoky ocean
#

yeah exactly. It wasn't actually 128 of anything - it just roughly landed at that number by chance with typical opt strings

#

and we all saw what we needed to see, to make sense of the world

spring wave
#

classic

#

the fix: mount everything under /d/

woeful quiver
spring wave
woeful quiver
spring wave
somber vault
#

Any trick to hide the progress bar so that it doesn't mess with python's input() reading from console?

spring wave
worn hill
spring wave
worn hill
spring wave
#

it does but v0.17.1 isn't a thing afaik

#

so i don't think dev versions would match it?

#

if you say v0.17.0 dev engines can at least still use it

#

you could tag a Llm version before the LLM bump i suppose

worn hill
#

maybe i'm making bad assumptions about the failures i'm tryna fix

smoky ocean
#

OK so: definitely don't publish llm modules targeting 0.17.0?

#
  • release 0.17.1 monday?
spring wave
smoky ocean
spring wave
#

looks like it ended up on an array maybe? those are currently not handled, might need something special like "select the Nth item"

#

also, yeah, at the moment you have to mention "dagger" for it to realize you want to use that module - ran into that too. try "lint the Dagger docs"

#

maybe it should be further scoped

smoky ocean
#

was going to try that next - but got that panic first

#

there's a subtlety here btw: sometimes you want an API endpoint to a module, and sometimes you want an API endpoint from the context of the module. At the moment Dagger doesn't clearly delineate the two.

Maybe the distinction becomes more important when we throw LLMs and their environments into the mix?

smoky ocean
#

This video by @kylepenfound kind of blew my mind. He demystifies the concept of a coding agent, and shows how to build your own from scratch. From zero to "robot ships a feature" in one video 🤯

If you're curious about coding agents, but not sure where to start... Watch it! 🧵

forest reef
#

Hello, I'm trying to create a simple assistant for simple Kubernetes issues. As with many real-world problems, it would be ideal to find a perfect solution and finish, but it seems necessary to allow human intervention or interruption at each attempt (e.g. each LLM call). From what I can see, dagger/agent currently only adjusts loops through prompts, but would more programmatic control be possible? (e.g. MaxTry or confirmation on every call?)

storm gate
# forest reef Hello, I'm trying to create a simple assistant for simple Kubernetes issues. As...

You have a couple of ways to proceed, first of all you don't have to let the LLM handle the main loop. We built demos doing both, and I prefer to keep the LLM loop small, as well as its toolset.

Then you handle the main logic in a bigger surrounding loop that will do things beyond what an LLM can do. For example call containers, call an API, or anything that the Dagger API can do outside of LLMs, etc... You can then include extra information when you re-call the LLM, which increases the LLM accuracy (tried with both OpenAI and Anthropic models).

Also note that even if the LLM tries several times with its own loop, you can limit it to a specific number of attempts by making it explicit in the prompt.
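The "bigger surrounding loop" described above could look roughly like this. It is a sketch under assumptions, not Dagger's API: `step` stands for one small LLM call, `check` for whatever non-LLM verification you run (a container, an API call), and `confirm` for an optional human gate. All three names are hypothetical.

```python
from typing import Callable, Optional


def run_with_budget(
    step: Callable[[str], str],              # hypothetical: one small LLM call, returns a proposal
    check: Callable[[str], Optional[str]],   # hypothetical: None on success, else an error message
    task: str,
    max_attempts: int = 3,
    confirm: Callable[[str], bool] = lambda proposal: True,
) -> Optional[str]:
    """Outer loop owned by your code rather than by the model."""
    feedback = ""
    for _ in range(max_attempts):
        # Re-call the LLM with extra information from the previous failure.
        prompt = task if not feedback else f"{task}\n\nPrevious attempt failed: {feedback}"
        proposal = step(prompt)
        if not confirm(proposal):
            return None  # a human rejected this attempt, stop here
        error = check(proposal)
        if error is None:
            return proposal
        feedback = error
    return None  # attempt budget exhausted
```

Passing `confirm=lambda p: input(f"Apply {p!r}? [y/N] ") == "y"` would give a confirmation prompt on every attempt, and `max_attempts` plays the role of the MaxTry asked about.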

shrewd ermine
shrewd ermine
fleet fiber
#

@shrewd ermine thanks for the YT videos on Agents, I'm going through them now.
https://www.youtube.com/watch?v=VHUi9ABdASA
https://www.youtube.com/watch?v=B7P04M9c1m0

This demo shows how an AI Agent can operate in a CI environment to assist in resolving test failures.

Code: https://github.com/kpenfound/greetings-api

Have questions? Ask us in Discord: https://discord.com/invite/dagger-io


This demo shows off a simple agent that automatically creates new features in a demo project. Features are designed and assigned as GitHub issues and the agent creates a pull request with the completed work.

Code: https://github.com/kpenfound/greetings-api/blob/main/SWE_AGENT.md

Have questions? Ask us in Discord: https://discord.com/invite/da...

spring wave
#

finally figured out those cryptic "mismatched function call/response" shaped errors - it's when an LLM tries to call a tool that doesn't exist, we were dropping that on the floor

smoky ocean
#

local git awareness 🧵

quiet ether
# spring wave finally figured out those cryptic "mismatched function call/response" shaped err...

I'm also getting these kind of errors quite often using Anthropic:

! POST "https://api.anthropic.com/v1/messages": 400 Bad Request {"type":"error","error":{"type":"invalid_request_error","message":"messages.33: `tool_use` ids were found without
│ ! `tool_result` blocks immediately after: toolu_01T2iDjHMNTfRJWtGAgqidc6. Each `tool_use` block must have a corresponding `tool_result` block in the next message."}}
! input: llm.setK3S.withPrompt.loop.setK3S.withPrompt.loop.setK3S.withPrompt.loop.setK3S.withPrompt.sync select: POST "https://api.anthropic.com/v1/messages": 400 Bad Request
! {"type":"error","error":{"type":"invalid_request_error","message":"messages.33: `tool_use` ids were found without `tool_result` blocks immediately after:
! toolu_01T2iDjHMNTfRJWtGAgqidc6. Each `tool_use` block must have a corresponding `tool_result` block in the next message."}}

seems like a 🐛 ?

spring wave
#

yep, same issue, have a fix on my llm-evals branch

#

it's two issues: 1. that the model tried to make that call (bad prompting), 2. that we dropped the bad call and ended up with garbled history
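One way to avoid the garbled history in the second issue is to answer every tool call, including ones for tools that don't exist. This is a minimal sketch, not the fix on the llm-evals branch: the dict shapes loosely mirror the `tool_use` / `tool_result` blocks named in the error above, and `answer_tool_calls` is a hypothetical helper.

```python
from typing import Any, Callable, Dict, List


def answer_tool_calls(
    tool_calls: List[Dict[str, Any]],
    tools: Dict[str, Callable[..., Any]],
) -> List[Dict[str, Any]]:
    """Return exactly one result per call, in order, even for unknown tools."""
    results = []
    for call in tool_calls:
        fn = tools.get(call["name"])
        if fn is None:
            # Don't drop the call: report the error so every tool_use stays paired.
            content, is_error = f"unknown tool: {call['name']}", True
        else:
            try:
                content, is_error = str(fn(**call.get("input", {}))), False
            except Exception as exc:
                content, is_error = f"tool failed: {exc}", True
        results.append({
            "type": "tool_result",
            "tool_use_id": call["id"],
            "content": content,
            "is_error": is_error,
        })
    return results
```

The error result also gives the model a chance to correct itself on the next turn instead of failing the whole request.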

quiet ether
#

even if the dagger function has cache buster

quiet ether
spring wave
quiet ether
#

I have a cache buster within my function but the trace doesn't even show the initial function call

spring wave
#

yep

#

that's true, cache busters technically have to be propagated all the way out now

#

i mean if we do a dagql persistent cache

spring wave
#

this might change with @steep onyx's work - he had to do something special for intra-session dagql cache hit telemetry

quiet ether
#

We need to find a way to set a pragma at the function level to hint the engine that the function should never be cached

#

Or just disable function caching altogether in prompt mode 🤔

#

There are many edge cases here I think

#

I'll open an issue tomorrow

smoky ocean
quiet ether
#

not sure what's the best way to handle that though. I'll open an issue to start the discussion tomorrow 🙏

smoky ocean
#

you mentioned a pragma to disable caching of a function. That's part of the proposal in 7428. Are you thinking of a pragma that would be llm-specific?

quiet ether
smoky ocean
steep onyx
smoky ocean
#

@spring wave @worn hill @wraith remnant @warped bramble just to point out a major unresolved point between MCP and main branch: if our tool bindings implementation requires injecting a system prompt, it won't work over MCP. I know it's a tricky tradeoff. Just want to clarify that it's a high-impact problem to solve..

warped bramble
#

Other question possibly related: do we care or not about MCP clients that don't update their tools list as we make more tools available dynamically ? Because I wonder if (maybe just a stop-gap) the tools we expose in MCP would be a static list of loader/getter/setter tools that are essentially an indirection on top of what LLMEnv would provide. (Keep in mind i'm not up to date with what the new multi object API should look like).

river belfry
#

Just to share some fun stuff (at least to me 😅)
I built a small agent that lets me start a dev environment based on (any?) codebase. It will install everything I need, without me having to worry about it. Depending on the model you use, it will even build and run the tests before giving the container back.
That's just a demo, so there's probably a lot to improve, but it's nice to play with.
dagger -c "dev-environment path/to/some/code | terminal"
(there's also another task that summarizes a subreddit, unrelated to the first one but also nice to test)
If you want to try it, be careful to select the model you want: by default it's tuned to use some of my local models for a fully local experience, including the models.
https://github.com/eunomie/local-agent

spring wave
#

system prompting

smoky ocean
#

@wraith remnant @warped bramble do you know if Cursor supports dynamic tool registration? Also I noticed in a MCP+Cursor video that the client asks for manual confirmation before each tool call. I wonder if that will be annoying with Dagger MCP, since that implies more intermediary internal calls

warped bramble
wraith remnant
shrewd ermine
warped bramble
worn hill
# river belfry Just to share some fun stuff (at least to me 😅 ) I built a small agent that all...

this is cool, have you thrown any lower level language (rust, c) stuff at it? https://github.com/redis/redis or https://github.com/tree-sitter/tree-sitter would be fun examples

smoky ocean
#

Release checklist 🧵

smoky ocean
#

@quiet ether re: your discord agent. Can you split it into 1) a discord module and 2) an example agent using it? We're trying to apply that model to all examples going forward, to maximize composability

#

(ideally that discord module would be reusable enough to be a basis for stdlib)

quiet ether
# quiet ether Roger!

@smoky ocean one thing that I was wondering is if I should try to make it work with multi-object by default. It's not a big deal though, because I can otherwise wrap all the tools that I need in a single workspace and use single-object as we currently showcase in multiple demos

shrewd ermine
#

what would the multiple objects be? discord client and _?

quiet ether
#

and potentially a third object to send notifications to somewhere else besides Discord?

smoky ocean
#

yes multi object. specifically multi object from prompt mode...

smoky ocean
#

by default the builtin agent has access to your module's dependencies

#

(right @spring wave )

quiet ether
#

i.e foo=$(my-module)

smoky ocean
#

not anymore 🙂 (at least that's the UX we want to enable)

#

install; prompt; boom it works

quiet ether
#

is that v0.17.1?

smoky ocean
#

of course that leaves the question of injecting config

#

which is why we need to dogfood asap

quiet ether
#

the only thing I'm missing is the ability to set multiple -m flags then. So I can make it work in the prompt without even creating a module at all

smoky ocean
#

maybe now I will finally get it 🙂

#

but init & install is a good start i think

#

maybe in the future, we will have a first class concept of environment, which you could initialize list load etc

quiet ether
shrewd ermine
spring wave
# smoky ocean (right <@108011715077091328> )

yep, it starts scoped to the toplevel Query now so it can call your module's constructor, and I think dependencies too, but actually not 100% sure - I remember we do things to avoid leaking module dependencies

smoky ocean
#

but would love to be wrong

spring wave
#

yes i think so

#

oh also - $_ is a thing, it'll always be the last object that the LLM operated on / "returned"

#

there may be some bikeshedding to do there, but i think it's an important mechanic

quiet ether
spring wave
#

that's the one that gets substituted with the last input right? so you don't have to go back and edit?

#

(i'm a fishy kinda guy)

worn hill
#

correct, but you can also use it in context, like i do with-dev go test ... and then when i rebuild i do dev && !!

#

(also im on zsh, and fairly certain that alias is POSIX, more ancient than even bash)

spring wave
#

right right

quiet ether
#

yep, exactly

worn hill
#

!$ is another one that's probably very useful in a dagger shell context, just the last arg of the last command

spring wave
#

what's the prevailing use case? !! | with-foo?

worn hill
#

yeah, either append or prepend

spring wave
#

i guess that can be $_ | with-foo once we support $_ in shell too (not just LLM response)

worn hill
#

i generally use prepend more, though, because append can also be <up arrow> | with-foo

quiet ether
#

in bash !! actually re-computes the last command, doesn't hold the actual value

worn hill
#

oh yeah it's unevaluated

spring wave
#

right it's more like a macro than a var

worn hill
#

yeah

quiet ether
#

so you'd do container | from alpine

and then something like

myfunc --ctr $(!!)

spring wave
#

and then your history will contain myfunc --ctr $(container | from alpine) right?

worn hill
#

lol that's a big "depends" i think

worn hill
#

at least on my config it saves the !! unless you tab-expand

quiet ether
worn hill
#

oh you're right... huh

#

nm i'm making shit up

shrewd ermine
#

@spring wave what does _scratch do and why does it get called so often in prompt mode?

spring wave
#

it resets the current state to nil, so there aren't any per-object tools

#

i've gotten rid of it on llm-evals

shrewd ermine
#

lol so basically a table flip, got it

spring wave
#

lol, yea pretty much. curious how it does on the llm-evals tool scheme. is it easily reproducible?

shrewd ermine
#

yeah i'm just setting source=$(directory | with-directory / .) and asking gemini to make changes to my project

spring wave
#

ah gemini specifically is the model i've been fixing, lemme try. are you using a particular agent? or that's it?

shrewd ermine
#

just prompt mode right now, trying to work on a no-code experience

woeful quiver
wraith remnant
spring wave
#

so theoretically selectFoo is only for 1) when you have no current selection or 2) when you want to go back to a different/older object

#

but, some models are keen to call it redundantly. those ones need a system prompt 😦

#

tried everything: putting something in the description => ignored. putting an explicit hint in the output => ignored. having selectFoo return an error if it was redundant => it just keeps doing it

wraith remnant
shrewd ermine
#

Ok, worked a generic sql module using

spring wave
woeful quiver
#

Is there a consensus on the best model to use in ollama for tool calling? List looks long and forum/reddit are a little all over the place in terms of recommendations: https://ollama.com/search?c=tools

shrewd ermine
river belfry
# worn hill this is cool, have you thrown any lower level language (rust, c) stuff at it? h...

I tried, and the results are... not consistent.
I can use it to work on a Rust project for instance, no problem.
But on complex codebases, the result really depends on the model. With my local qwen2.5 it works well on small codebases.
But tree-sitter, for instance (which also contains bindings to other languages), does not go well.
If I switch to openai/gpt-4o (I kept the defaults) it works great: it installs cargo and npm, installs some npm tools, and builds the project before opening the terminal.
I'd love to have the same thing fully locally, but I probably need a bigger machine and a bigger model for that 😅

somber vault
#

Figuring out the deployment part, would appreciate any advice. Rest is working PERFECTLY!
So far the setup:

  1. I want my app containerized to simplify running under k8s, ECS whatnot.
  2. My app inside container calls dagger engine itself.
  3. Ideally some images come cached within dagger engine inside the app container.

What I'm doing right now:
use dind as base and install Dagger + UV

LOAD_CACHE = """
import anyio
import dagger
from dagger import dag

async def main():
    async with dagger.connection():
        print("inside async with dagger")
        container = dag.container().from_("oven/bun:1.2.5-alpine")
        result = await container.with_exec(["bun"])
        print(await result.stdout())


if __name__ == "__main__":
    anyio.run(main)
""".strip()


base = (
    dag.container()
    .from_("docker:28-dind")
    .with_exec(["apk", "add", "curl"])
    .with_exec(["apk", "add", "python3"])
    .with_exec(["curl", "-fsSL", "https://dl.dagger.io/dagger/install.sh", "-o", "/tmp/install.sh"])
    .with_exec(["sh", "-c", "BIN_DIR=/usr/local/bin sh /tmp/install.sh"])
    .with_exec(["curl", "-fsSL", "https://astral.sh/uv/install.sh", "-o", "/tmp/install.sh"])
    .with_exec(["sh", "-c", "XDG_BIN_HOME=/usr/local/bin INSTALLER_NO_MODIFY_PATH=1 sh /tmp/install.sh"])
)


runtime = (
    base
    #.with_exec(["sh", "/usr/local/bin/dockerd-entrypoint.sh"], insecure_root_capabilities=True)
    .with_workdir("/app")
    .with_new_file("/app/load_cache.py", LOAD_CACHE)
    .with_exec(["uv", "run", "load_cache.py"], insecure_root_capabilities=True)
)

What's the best course of action?

nova bronze
#

Figuring out the deployment part, would

lean mural
#

Is there a generalized workspace that folks are using for agents yet, or are most folks hand-rolling one each time? I tried @shrewd ermine's module from the daggerverse, but also noticed that it's not used in his agent demos

shrewd ermine
lean mural
#

yeah i keep finding that the agent's context grows huge and the task fails out before it finishes. suspect i'll need to hand roll a workspace module

shrewd ermine
#

yeah exactly, the workspace pattern is perfect for that. I tried to make a generalized one with kpenfound/dag/workspace but I still ended up needing changes for each implementation. Maybe function masking on top of a very generalized workspace would be a solution

merry scarab
#

I do think we should ship a default workspace with dagger init or something just to help people get off on the right foot

shrewd ermine
#

another case for init templates 😛

merry scarab
fleet fiber
#

Was talking Silicon Valley (HBO show) with a friend today and he told me about Windsurf's commercial that has Russ from the show on it. Hilarious. BUT, what Windsurf's founders describe is where I think Dagger might be headed "Both collaborative and independently powerful"

https://youtu.be/3xk2qG2QPdU?si=x5L63_3k2DDuSv7z&t=44

Introducing the Windsurf Editor - the world's first agentic IDE. 🏄

In Windsurf, we have given the AI a previously unseen combination of deep codebase understanding, access to a powerful set of tools, and real time access to your actions. The result? A magical experience we call Cascade, the evolution of chat that keeps you truly in the flo...

warped bramble
#

I'm getting 503 errors using Gemini, is it just me ?

#

nvm, came back

shrewd ermine
#

yep I hit that too

smoky ocean
#

Cloudflare agent stuff

smoky ocean
#

@spring wave what does LLMEnv.intern() do exactly? Re-entrant ingestion into "ID system" + return ingested ID? So I have to call it at least once to ingest the value, but then I can safely call it several times and get the same result, without side effects on the env state?

#

(context: rebasing my Environment API branch on planet-eval 🙂 )

spring wave
#

the PR is ready for approval now

smoky ocean
#

@spring wave how do you use the expectedType argument in Get()?

spring wave
#

it's hard to consistently convince a model to make that mistake, so it's kind of "best effort" atm, may need refinement as we test more (for example, to handle 1 vs. "1")

#

it would probably also make sense to assert that the value matches that type name, but that's already handled other places so I didn't bother

smoky ocean
#

I see everything is locked to objects. I'm guessing it won't be too hard to expand to any dagql.Typed in the future? There used to be a check for objects in some parts

spring wave
#

it might make sense to do that again yeah, but it might also make sense to still keep non-Object types in a separate spot since the mechanics are so different. I split it up at a time where string vars were moved out into the LLM (because they were reduced to just prompt vars), but now they're back in the LLMEnv, and I just gave them their own map to keep it tidy

#

make sure you pull, you might not have those changes

#

i was going to ask earlier: do we anticipate passing other types as variables, or only strings? other scalars are easy, but arrays are where things get complicated

smoky ocean
#

At least other scalars yeah. Didn't think about arrays, might not be worth it

spring wave
#

are scalar values preserved in shell? or are they all strings?

#

foo=1 is a string i'd imagine

#

(though there may be other ways to set these for sure)

smoky ocean
#

My immediate concern for the env API, is splitting LLMEnv in two halves: a Dagger-facing backend called Environment, and a LLM-facing frontend called MCP. So trying to sort the implementation in those 2 buckets

#

Implementation as I understand is mostly unaffected (besides being split in two), except for the functionMask part, which will move to the individual binding instead of just the current selected object

spring wave
#

oh right, I want to try functionMask again, I had it almost working but the model ended up just getting confused, so I shelved it :/ (but kept some of the code intact for when I get back to it)

smoky ocean
#

The Type#number system will be in the backend Environment. But things like the concept of "current" object, the specific string replies, tool hints etc, move to MCP

spring wave
#

I'm tempted to try another model where instead of having a "current object" you gradually increase your scope of available functions and explicitly pass a self argument

smoky ocean
#

Hopefully Environment can stabilize while MCP keeps iterating like crazy on best LLM interface

spring wave
smoky ocean
#

(to be clear I'm using MCP in a loose sense. It would encapsulate both actual MCP protocol implementation, and BBI. I'm thinking we can just rebrand our BBI as "MCP", really a sort of "MCP+" 🙂)

#

I feel like the longer we wait before we split, the harder it will get

spring wave
#

looking forward to some code so this can start to become concrete in my head 😛

smoky ocean
#

yeah sorry, the last week or so I haven't been able to look at the code at all

#

thanks for keeping that core going

quiet ether
#

👋 how's the load secret from env var tools coming?

✔ call the Discord API and set the bot token using the DISCORD_TOKEN environment variable 5.0s
│🧑 call the Discord API and set the bot token using the DISCORD_TOKEN environment variable
│ ┃ 0.0s
│
│🤖 To work with the Discord API using the provided tools, I'll need to call  Query_discord  which requires a bot token parameter. The token should be provided through the environment, but I don't
│ ┃ have direct access to environment variables. Let me let you know what's needed:
│ ┃
│ ┃ The  Query_discord  function requires a bot token and this needs to be provided as a SecretID. You'll need to first create a secret with the Discord token and then use that secret with the
│ ┃ Discord API call.
quiet ether
# quiet ether 👋 how's the `load secret from env var` tools coming? ``` ✔ call the Discord ...

related:

✔ load the secret from the DISCORD_TOKEN env variable 12.4s
│🧑 load the secret from the DISCORD_TOKEN env variable
│ ┃ 0.0s
│
│🤖 I can help you load a secret using the name "DISCORD_TOKEN". I can use the  Query_loadSecretFromName  function for this purpose.
│ ┃ 2.7s ◆ Input Tokens: 2,427 ◆ Output Tokens: 95
│
│ ✔ loadSecretFromName(name: "DISCORD_TOKEN"): Secret! 0.0s
│
│🤖 Now I can retrieve the information about this secret:
│ ┃ 4.0s ◆ Input Tokens: 628 ◆ Output Tokens: 48
│
│🤖 3.3s ◆ Input Tokens: 702 ◆ Output Tokens: 37
│
│ ✘ Secret.name: String! 0.0s
│ ! secret not found: xxh3:a87f0af2887099fc
│
│🤖 I apologize, but it seems there was an error retrieving the secret. This could mean that either:
│ ┃
│ ┃ 1. The secret named "DISCORD_TOKEN" doesn't exist in the environment

smoky ocean
#

@spring wave @warped bramble @shrewd ermine @wraith remnant can we talk live about llm release in a little bit?

spring wave
smoky ocean
quiet ether
#

This actually gave me an idea. How about a flag to go along with "-c" so the shell doesn't automatically exit after running the commands? Python's REPL supports this with the -i flag

warped bramble
#

@quiet ether FYI i have an idea to make MCP work with multiple modules statically (not yet dynamically), could be in a follow up.

spring wave
smoky ocean
#

No problem, me too 😛

#

Want to ping us when you're back? The shell launch tornado will probably be over by then

#

Can you still talk today?

spring wave
river belfry
#

Are SetXxxx(name, value) and WithXxxx(value) equivalent?
I thought WithXxx was deprecated, but maybe not anymore?
What's the impact regarding tools? When I use WithXxxx I can see the tools (for instance using .LLM().Tools()) but not when doing the same thing with SetXxxx.
So I guess I'm missing something here, but not sure what 😉

smoky ocean
#
  1. The way it currently works is that both work, and are layered. There is a concept of "selected object", which you can set directly with WithXXX. The LLM can also change its own selection with internal tools. At a higher layer on top of that, there are named bindings (variables) which can be set in the LLM environment (SetXXX). The LLM can list them and select them
#

We are actively working to simplify this API. It's tricky because there are several variables in the equation:

  1. Best DX for the developer (ie. you)
  2. Best performing LLM interface (how the bindings are presented to the LLM, lots of tricks and iterations there)
  3. MCP support. We want the same system to work over LLM and MCP protocols.
  4. Keeping modules up-to-date with API changes (examples, early prototypes etc)
#

@river belfry when you do SetXxx("foo", bar) the object bar is available at the named binding "foo". I think in the current implementation there is a special tool called _objects or _list or something like that, and it will list the bindings
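To make the two layers described above concrete, here is a toy model. It is not the Dagger implementation: `ToyEnv`, its method names, and the built-in tool names `list_bindings` and `select` are all hypothetical stand-ins. The assumption being illustrated is that object tools come from the current selection, while a named binding only becomes a source of tools once it is selected.

```python
from typing import Any, Dict, List, Optional


class ToyEnv:
    """Toy model of the two layers: a selected object plus named bindings."""

    def __init__(self) -> None:
        self.selected: Optional[Any] = None
        self.bindings: Dict[str, Any] = {}

    def with_object(self, obj: Any) -> "ToyEnv":
        # Stands in for WithXxx(value): sets the selection directly.
        self.selected = obj
        return self

    def set_binding(self, name: str, obj: Any) -> "ToyEnv":
        # Stands in for SetXxx(name, value): adds a named variable only.
        self.bindings[name] = obj
        return self

    def select(self, name: str) -> "ToyEnv":
        # What the model would do itself through an internal selection tool.
        self.selected = self.bindings[name]
        return self

    def tools(self) -> List[str]:
        # Built-in tools are always present; object tools need a selection.
        builtin = ["list_bindings", "select"]
        if self.selected is None:
            return builtin
        methods = [
            m for m in dir(self.selected)
            if not m.startswith("_") and callable(getattr(self.selected, m))
        ]
        return builtin + methods


class Greeter:
    def hello(self) -> str:
        return "hi"


print(ToyEnv().with_object(Greeter()).tools())
print(ToyEnv().set_binding("g", Greeter()).tools())
```

In this toy model the second call shows no `hello` tool until something selects the binding "g", which is one possible reading of the difference observed in the question.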

quiet ether
#

just to validate: array types are not very well handled by the dagger llm yet, right? Getting some panics when I try to call functions which return those

shrewd ermine
#

correct, I think arrays of basic types are fine but not arrays of objects

spring wave
spring wave
#

@smoky ocean half baked idea related to return values and whether there's an implicit current state / return value, and building on the idea of letting it know upfront what type of value we want back: something like withFileSlot("bin", "The compiled binary.") - the goal being for the model to fill all the "slots"(?) before returning. Those could then all be synced back to the shell. Dunno about the code DX, I have a feeling the functional model of "many inputs in, one input out" may still be more intuitive (I generally prefer schemes that don't require you to make up names for things if you only have one thing), but something to think about

smoky ocean
#

mm but "returning" a binding is weird

#

but so is "returning a slot" 😛

spring wave
#

lol, i thought of it more like filling a slot but am not bound (heh) to the word at all

#

since there may be multiple of them

#

there's probably a better metaphor

#

what I really don't want is mutable bindings

smoky ocean
#

I think if LLM familiarity matters, we should go for something very very present in the training data, like returning or exporting

#

Or printing 🙂 (we don't have to actually print it)

#

Or showing ?

spring wave
#

yeah probably comes down to what the evals say

smoky ocean
#

What would a human user do

#

It would probably enter the values in a form

#

You could almost say it would be prompted to enter a value 😛

#

oh god we crossed the streams

#

I mean that is how they win in the end

warped bramble
woeful quiver
#

Dumb idea, but a user with a form would be to submit or even.. enter?

#

submitBinding

#

We're all going to submit to them in the end.. so might as well adapt now?

#

enterBinding 🤘

worn hill
spring wave
#

knowing the desired return type

shrewd ermine
#

Definitely eager to try this one. I've seen lots of good things about gemma3 but in the few prompts I've thrown at it, I'm not 100% convinced yet. But I'm a gemini fan so I know they can do it 🙂

spring wave
#

I'm tinkering with function masks again and it's working suspiciously well... tempted to sneak it in. It saves SO many tokens (2849 vs 13,470), and in turn makes things run much faster especially with Claude. Also cool to see the LLM planning ahead.

#

(that mask-less run ended with "overloaded"... I have been seeing that a lot)

lilac dagger
#

Hi, first post here: I tried using OpenAI's o3-mini but got the error in the pic, which shows that parallel_tool_calls is not supported by the model yet, while parallel_tool_calls is true by default per https://platform.openai.com/docs/api-reference/chat/create#chat-create-parallel_tool_calls. There is also a recent thread on the OpenAI forum about an o3-mini tool calling issue (https://community.openai.com/t/o3-mini-api-with-tools-only-ever-returns-1-tool-no-matter-prompt/1112390/3), so it seems we cannot expect to use parallel_tool_calls with o3-mini any time soon. I thought of turning the option off instead, but I tracked down the Dagger implementation and it doesn't seem to allow custom params for the API (https://github.com/dagger/dagger/blob/20e8a174fd9e45c7ae915d091167aa7ef18d822a/core/llm_openai.go#L117), so parallel_tool_calls is true by default and not configurable from the client side. So it seems like a dead end for o3-mini, unless I missed something? Is there any workaround, or is allowing custom parameters for the LLM API worth supporting in the near term?
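For illustration, this is roughly what "configurable from the client side" could mean when building a Chat Completions request body. It is a sketch, not Dagger's code: `build_chat_request`, the `extra` override dict, and the `NO_PARALLEL_TOOL_CALLS` set are hypothetical, and listing o3-mini there rests only on the error reported above.

```python
from typing import Any, Dict, List, Optional

# Assumption for illustration: models listed here reject the parameter.
NO_PARALLEL_TOOL_CALLS = {"o3-mini"}


def build_chat_request(
    model: str,
    messages: List[Dict[str, Any]],
    tools: List[Dict[str, Any]],
    extra: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a request body, sending parallel_tool_calls only where supported."""
    body: Dict[str, Any] = {"model": model, "messages": messages, "tools": tools}
    if model not in NO_PARALLEL_TOOL_CALLS:
        body["parallel_tool_calls"] = True
    # Caller-supplied parameters win over the defaults chosen above.
    body.update(extra or {})
    return body
```

Leaving the field out entirely for unsupported models, rather than sending false, avoids relying on how the provider treats the parameter for those models.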

merry scarab
#

Has anyone been able to get their agent to understand how to find a trace url for the work its doing?

I am trying to have this URL included in a markdown file that my agent is creating, but sadly right now it just says something along the lines of - **Dagger Cloud Trace**: N/A (local testing)

river belfry
#

While trying to use mistral-nemo I've got:
After the optional system message, conversation roles must alternate user/assistant/user/assistant/...
Is it something worth investigating (or is that something already known)?

merry scarab
#

Has anyone seen this error from (I think anthropic)?

input: daggerverseQa.doQa received error while streaming: {"type":"error","error":{"details":null,"type":"overloaded_error","message":"Overloaded"}     }
shrewd ermine
#

Yeah, I think they just need a break, same idea as google's 429

merry scarab
#

I have a feeling my demo is not going to go well 😦

shrewd ermine
#

usually they're not overloaded for long so 🤞

spring wave
#

I hit that pretty frequently with Claude :/ @merry scarab have you tried your demo with Gemini? it'll be much better with v0.17.2, super fast and no throttling/overloaded errors

merry scarab
#

I dont have a gemini account either, yet ..

smoky ocean
#

switch to openai?

merry scarab
#

openai has other errors - it tells me to f off because of a 30k token limit or something

#

Ugh........ I am so sad. This was working overall before but now it's not writing my file again 😭

why does this always seem to happen right before a demo

#

nvm it works!

#

🤞

#

The overloaded thing gets cached lol -- this is where I really wish I could have a flag or something to tell shell to DOIT or something

shrewd ermine
#

I've run into this too, we should not cache those errors
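A minimal sketch of the "don't cache those errors" idea, assuming a simple in-memory memoizer. The names `resultCache` and `getOrCompute` are hypothetical and not part of Dagger; the only point is that a failed call leaves no cache entry, so the next identical call runs again.

```go
package main

import (
	"errors"
	"fmt"
)

// resultCache memoizes successful results by key. Errors are never stored.
// (Hypothetical type for illustration; not safe for concurrent use.)
type resultCache struct {
	values map[string]string
}

func newResultCache() *resultCache {
	return &resultCache{values: map[string]string{}}
}

// getOrCompute returns the cached value for key, or runs compute and caches
// the result only if compute succeeded.
func (c *resultCache) getOrCompute(key string, compute func() (string, error)) (string, error) {
	if v, ok := c.values[key]; ok {
		return v, nil
	}
	v, err := compute()
	if err != nil {
		return "", err // not cached: the next call recomputes
	}
	c.values[key] = v
	return v, nil
}

// demo simulates one "overloaded" failure followed by a success and reports
// how many times the underlying call ran across three lookups.
func demo() (int, string, error) {
	cache := newResultCache()
	calls := 0
	call := func() (string, error) {
		calls++
		if calls == 1 {
			return "", errors.New("overloaded")
		}
		return "hello", nil
	}
	cache.getOrCompute("prompt", call)               // fails, nothing cached
	cache.getOrCompute("prompt", call)               // succeeds, result cached
	reply, err := cache.getOrCompute("prompt", call) // served from cache
	return calls, reply, err
}

func main() {
	calls, reply, err := demo()
	fmt.Println(calls, reply, err)
}
```

With this shape, re-running the same command after an overloaded error would hit the provider again instead of replaying the cached failure.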

smoky ocean
#

Masked functions 🧵

wraith remnant
#

custom params

smoky ocean
#

🚨🚨🚨 Dagger 0.17.2 is out, with many improvements to the LLM API. Make sure to upgrade and try running your agents again! Let us know if you see any issues or improvements!

woeful quiver
#

Updated my agents to v0.17.2 - everything works as expected. Any specific APIs we should try/test out?

spring wave
woeful quiver
worn hill
#

soooo @quiet ether @spring wave @smoky ocean the "we want dagger install'd deps to be available to the LLM" thing: this is obviously desirable for local modules, should it also apply to remote modules? eg dagger -m toy-programmer shell should have an LLM env where toy-workspace is callable? remotes-depending-on-remotes, too? like dagger -m dagger/dagger should have gerhard/daggerverse/notify?

smoky ocean
worn hill
#

imma just try to have the llm always feel like it's inside the module regardless of local/remote status

smoky ocean
#

That way it will match the shell .cd

spring wave
#

here's a delegable task: retry logic in the LLM loop. Each provider implementation checks for certain retryable errors and annotates the error response as such (with a wrapping error type); the outer loop checks for it and retries, backing off as appropriate

#

i'd wager that's pretty high priority
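The retry task described above could be sketched roughly like this. `RetryableError` and `withRetry` are hypothetical names, not existing Dagger code, and the doubling backoff is only one possible policy: a provider client wraps errors it considers retryable, and the outer loop uses `errors.As` to decide whether to try again.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// RetryableError is a hypothetical wrapper a provider client could use to mark
// an error (for example an "overloaded" or 429 response) as safe to retry.
type RetryableError struct {
	Err error
}

func (e *RetryableError) Error() string { return "retryable: " + e.Err.Error() }
func (e *RetryableError) Unwrap() error { return e.Err }

// withRetry calls fn until it succeeds, returns a non-retryable error, or
// maxAttempts is reached. The delay doubles after each retryable failure.
func withRetry(ctx context.Context, maxAttempts int, baseDelay time.Duration, fn func(context.Context) error) error {
	delay := baseDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = fn(ctx)
		if err == nil {
			return nil
		}
		var retryable *RetryableError
		if !errors.As(err, &retryable) {
			return err // not marked retryable: surface immediately
		}
		if attempt == maxAttempts {
			break
		}
		select {
		case <-time.After(delay):
			delay *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// demo fails twice with a retryable error, then succeeds.
func demo() (int, error) {
	calls := 0
	err := withRetry(context.Background(), 5, time.Millisecond, func(ctx context.Context) error {
		calls++
		if calls < 3 {
			return &RetryableError{Err: errors.New("overloaded")}
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println("calls:", calls, "err:", err)
}
```

Keeping the retry decision in the error type means each provider only has to classify its own failures; the loop itself stays provider-agnostic.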

shrewd ermine
#

debugging a thing in the gemini client implementation https://github.com/dagger/dagger/blob/main/core/llm_google.go#L221

if candidate.Content == nil {
  return nil, fmt.Errorf("no content?")
}

I think maybe we need to continue in this case rather than error. Anyone else have more context on cases when the Content is nil (with streaming)?

wraith remnant
shrewd ermine
spring wave
shrewd ermine
#

ah got it, yeah I don't think we're handling FinishReason at all right now are we
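One possible shape for "continue rather than error", as a sketch only. The `candidate` and `content` types below are simplified local stand-ins, not the real Gemini client types, and the assumption that a nil-content chunk carries only metadata such as a finish reason is unverified here.

```go
package main

import (
	"fmt"
	"strings"
)

// content and candidate are simplified stand-ins for a streamed response chunk.
type content struct {
	Text string
}

type candidate struct {
	Content      *content
	FinishReason string // "" while streaming, "STOP" on a normal finish
}

// collectText concatenates text from streamed chunks. Chunks with nil Content
// are skipped instead of being treated as errors; a finish reason other than
// "" or "STOP" is reported along with the text gathered so far.
func collectText(chunks []candidate) (string, error) {
	var sb strings.Builder
	for _, c := range chunks {
		if c.Content != nil {
			sb.WriteString(c.Content.Text)
		}
		switch c.FinishReason {
		case "", "STOP":
			// still streaming, or finished normally
		default:
			return sb.String(), fmt.Errorf("generation stopped early: %s", c.FinishReason)
		}
	}
	return sb.String(), nil
}

// demo feeds a stream that contains one chunk without content.
func demo() (string, error) {
	return collectText([]candidate{
		{Content: &content{Text: "hello "}},
		{Content: nil},
		{Content: &content{Text: "world"}, FinishReason: "STOP"},
	})
}

func main() {
	text, err := demo()
	fmt.Printf("%q %v\n", text, err)
}
```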

river belfry
#

I need to explore more, but on the same agent, same (local) model, I have worse results with 0.17.2 than I had before. 🤔
I was running a version based on commit a2aaf08158a64bc47e4d3fe143701b9dbb88d885 that was pretty good. Not sure what changed, I'll have a look at the LLM-related commits between this commit and the release.

spring wave
spring wave
shrewd ermine
#

once I write a PR review agent from my demo repo I'm going to have a review bake-off between the different big models 😎

worn hill
# worn hill soooo <@336241811179962368> <@108011715077091328> <@488409085998530571> the "we ...

so i've been poking around for a couple hours now and i cannot for the life of me figure out where to rip open a seam to mix module dependencies into LLMEnv. i think i've found a couple of the relevant pieces, like LLMHook.InstallObject exposes objects in the LLM env, core/schema/modulesource.go has pieces that iterate through module dependencies... ModSource.lazilyLoadSchema even calls mod.Install on each module in a ModDeps (although that feels like maybe a different meaning of Install)... coming from the outside, shell_fs.go has maybeLoadModule for bringing in modules, but that's got all its own definitions of modules that don't map to core types, and there's a lot of indirection between those shell modules and the LLM install hook.

where would you start with this? @spring wave @hidden tartan it feels somewhat related to codegen activities, but the calling context is very different

smoky ocean
spring wave
worn hill
#

@spring wave there's a spooky comment here though:
```go
// Serve a module's API in the current session.
//
// Note: this can only be called once per session. In the future, it could return a stream or service to remove the side effect.
func (r *Module) Serve(ctx context.Context) error {
	if r.serve != nil {
		return nil
	}
	q := r.query.Select("serve")

	return q.Execute(ctx)
}
```

lilac dagger
#

A question about "direct host access", which is disabled for Dagger Functions per https://docs.dagger.io/api/sdk/#differences: I'm developing an AI agent in the form of a Dagger module, and I'm trying to let it take an input source directory from the host (where my sample app lives) and do something agentic in my module functions, including reading, writing, and modifying code in my sample app or executing arbitrary commands. I haven't managed it so far: the best I got is taking in the host dir and manipulating it, with or without a container, but I cannot export it via code. Is it because the function context is only the module folder and exporting to the host via code is not allowed? The Dagger agent demos mostly publish the agentic results to a PR, but I just want to apply them to the host. It seems related to this issue https://github.com/dagger/dagger/issues/8235 ?

smoky ocean
#

it's all or nothing though

hidden tartan
worn hill
hidden tartan
#

Something like that

dag.moduleSource("xxx").Serve(ctx)
dag.moduleSource("bbb").Serve(ctx)

Boom xxx and bbb are queryable 😄

worn hill
#

in the shell context, i'm initially trying this one layer up in 2/3 callsites of maybeLoadModule (.cd and on startup, skipping the one in exec)

#
func (h *shellCallHandler) maybeLoadModuleAndDeps(ctx context.Context, path string) (*moduleDef, *configuredModule, error) {
    def, cfg, err := h.maybeLoadModule(ctx, path)
    if err != nil {
        return nil, nil, err
    }

    for _, dep := range def.Dependencies {
        digest, err := dep.Source.Digest(ctx)
        if err != nil {
            return nil, nil, err
        }
        _, err = h.getOrInitDef(digest, func() (*moduleDef, error) {
            return initializeModule(ctx, h.dag, dep.Source)
        })
        if err != nil {
            return nil, nil, err
        }
    }

    return def, cfg, nil
}

looks like this

hidden tartan
worn hill
#

dunno yet still building lol

hidden tartan
#

Okay let me know 😄

#

You could just call dep.Source.AsModule.Serve technically

#

If you want to serve that dep

#

I guess there's more to add it to the shell completion etc but that should be a one liner to only serve

lilac dagger
# smoky ocean I believe there is now an optional argument to `llm()` to give it "privileged" a...

Thanks for the llm hint. But a dumber question: without using llm/agent, just normal operations, is it possible for a module function to take an input source and write something like a hello.txt directly back to the source directory on the host via code, instead of via dagger shell? Here is some simple code that ran with no error, but nothing is created in the mysampleapp folder even with wipe set to True, while the file is created in the container as verified via terminal(). Not sure what I'm missing here.

worn hill
#

imma try it with just serve now, because the way i did it the shell wiring is incomplete anyways

merry scarab
# lilac dagger Thanks for the llm hint. But a dumber question is that without using llm/agent, ...

Yeah this is possible I did a demo showing this exact scenario this morning. My example used LLM but it works the same in all scenarios.

Check out the code and video

https://github.com/levlaz/agent-playground/tree/main/daggerverse-qa

https://www.youtube.com/live/uOSmyFx7O7Q?feature=shared&t=2851

Main thing is you need to use ‘export’ to get the file or directory back out to your local machine

https://docs.dagger.io/api/chaining/#export-directories-files-and-containers

worn hill
smoky ocean
#

Every time I see a PR by @worn hill, a little "victory trumpet" sound plays in my head because of that little trumpet-shaped avatar. Is it just me?

worn hill
#

that's the idea

#

it is a muted horn fwiw, so quiet victory trumpet

smoky ocean
#

I'll keep that in mind

lilac dagger
# merry scarab Yeah this is possible I did a demo showing this exact scenario this morning. My ...

Thanks @merry scarab ! I watched your live video this morning but noticed the difference: you use the export function from dagger shell instead of writing it in code. I'm asking whether it's possible to achieve it using pure code. I guess this comment from the Dagger team is relevant? https://github.com/dagger/dagger/issues/8226#issuecomment-2312479275 I also tried your code: the shell way of exporting works for me even when the target dir is an arbitrary path outside the module root, but the attached code doesn't work either under the module root or with an arbitrary path (it ran without error but nothing was exported).

smoky ocean
lilac dagger
smoky ocean
lean mural
#

is there a way to see the full LLM query? i keep blowing up because the input grows too large as the system runs the tests on repeat, but i haven't figured out how to tell what exactly is getting appended to the LLM request

smoky ocean
lean mural
smoky ocean
lean mural
smoky ocean
lean mural
#

Ok now the LLM is doing what I expect, but dagger keeps exiting after the LLM prompt cycle finishes:

const llmSpace = await dag.llm()
  .withWorkspace(ws)
  .withPromptFile(prompt)
  .sync();

return await llmSpace
  .workspace()
  .diff()

it never calls diff -- it exits at the llm call

lean mural
shrewd ermine
#

That looks right, but there could be 2 things going on:

  1. If you're looking in cloud, currently the LLM basically "takes over" the whole trace and hides everything before/after. We need to fix this
  2. At one point the tracing output in the terminal would basically push the output of diff off screen. I don't remember what the current state of this is, I thought it was fixed... but when that was happening, running the same command again to get the cached run would show me the correct output.
spring wave
lean mural
lean mural
#

@shrewd ermine have you seen this in your greetings api?

POST https://api.github.com/repos/lamalex/greetings-api/pulls/3/comments: 422 Validation Failed [{Resource:PullRequestReviewComment Field:pull_request_review_thread.path Code:invalid Message:} {Resource:PullRequestReviewComment Field:pull_request_review_thread.diff_hunk Code:missing_field Message:}]

its failing to write suggestions, https://v3.dagger.cloud/lamalex/traces/e8fd16f89c8f5aaa89d2db32192d6de5?span=2deb6b8a5cac88a3

shrewd ermine
eager fiber
#

spent yesterday loading a bunch of NYC open data into postgres and then wrote an agent to analyze it.

  • gpt-4o seems to consistently generate working queries, but then quickly ends up in rate limit land.

  • gpt-4o-mini avoids rate limits but generates mostly useless queries and spends a bunch of time just looping on nonsense

trying out some alternative models this morning

shrewd ermine
eager fiber
lean mural
shrewd ermine
shrewd ermine
#

gemini-2.0-flash

spring wave
#

🧡 to bikeshed prompt mode toggle

smoky ocean
quiet ether
quiet ether
worn hill
eager fiber
#

@shrewd ermine ya gemini limits seem better.

#

although you'll notice i asked about 2025 and got a 2024 response 😛

#

so have to play with the data and prompts a bit.

river belfry
# spring wave do you have code anywhere I can try out?

My code is there but not sure how easy it is to test it https://github.com/eunomie/local-agent
But I finally managed to find the first commit that changed the behavior:
https://github.com/dagger/dagger/commit/44370a44d
I'll check again with main to see.
Basically:

  • I'm running qwen2.5 as the (local) model
  • the app is a small tool containing an environment with functions like addpackages, tree, read, write
  • the goal is to create dev environments on the fly, by letting the LLM read the files and understand what to do.
  • before this commit, the llm would read the file tree, read the files, install packages
  • after, the llm reads the file tree, reads the files, but stops there. It never installs packages; in the best case it prints what it should do
    My guess would be the upgrade of openai-go from 0.1.0-alpha.61 to 0.1.0-beta.2
    I'm trying to upgrade to 0.1.0-beta.3 to see if that's any better
river belfry
feral birch
#

Hi all! I've been playing around with Dagger's agent use cases and am very excited to build an agent on top of it! Thinking ahead, I do have a question on how I can distribute my agent that's built on top of Dagger for others to run. I understand that I could ask my users to install Dagger and then do dagger install dagger run etc, with my own dagger module. But is there a way that I could package dagger runtime as part of my own executable and just ship one single binary to my users? Thanks for the help!

quiet ether
worn hill
# quiet ether just tested this and unblocks my use-case. Thx Connor ❤️

was chatting with @spring wave and there is one thing downstream of this that feels kinda necessary: the LLM needs a selectQuery tool. when it's selected something else (in my use cases, usually the type of a parameter i'm building up to pass into a query), once you've got the param all constructed, it can no longer pass it in to the query-rooted function you're trying to call. curious if you've hit that same thing in your experimentation.

steep onyx
#

Did a rebase of the engine-wide cache PR on main and picked up the new AllowLLM tests, which started failing. The main "problem" is that tests are getting cache hits/deduped-execution when calling the same modules with the same args, which results in only one of the test clients actually getting a prompt to ask if it's okay to use the LLM, causing others to fail randomly...

"problem" in quotes because I'm not sure yet if that's actually something to consider a bug. The point of the --allow-llm stuff is to not eat users' tokens without their permission, right? So in this case, not prompting a user whose tokens wouldn't be consumed anyways makes sense, I think? Would others agree? cc @worn hill

#

I got the tests to pass by mixing in the client's AllowedLLMModules settings into the cache key for function calls, but that's a sad way to fix it since it means less function caching everywhere just based on those settings.
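To make the tradeoff concrete, here is a rough sketch of what "mixing the client's settings into the cache key" means. This is not Dagger's actual cache key scheme; `cacheKey` and its inputs are hypothetical, and the hashing is only illustrative.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// cacheKey derives a key from the function name, its arguments, and the
// calling client's allowed-LLM-modules setting (hypothetical inputs).
func cacheKey(function string, args []string, allowedLLMModules []string) string {
	h := sha256.New()
	write := func(s string) {
		h.Write([]byte(s))
		h.Write([]byte{0}) // separator so adjacent fields cannot run together
	}
	write(function)
	write(strings.Join(args, "\x1f"))

	// Sort a copy so the order of the setting does not change the key.
	sorted := append([]string(nil), allowedLLMModules...)
	sort.Strings(sorted)
	write(strings.Join(sorted, "\x1f"))

	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	args := []string{"src=./app"}
	a := cacheKey("review", args, []string{"github.com/example/mod"})
	b := cacheKey("review", args, nil)
	fmt.Println("same function and args, different settings, same key:", a == b)
}
```

Two clients calling the same function with the same arguments now get different keys whenever their settings differ, which is why every client gets its own prompt but also why less work is shared.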

steep onyx
worn hill
# steep onyx Did a rebase of the engine-wide cache PR on main and picked up the new `AllowLLM...

yeah, i'd agree. i was hitting this with the old cache algos too, but added this cache buster to try to avoid it... it seemed to work and cause the desired-for-test cache invalidations but i won't pretend i had a complete understanding of why, i was just trying to hit all the cases i needed to hit and it seemed to unblock me

GitHub

Contribute to dagger/dagger-test-modules development by creating an account on GitHub.

#

@steep onyx that said the full caching on llm calls is gonna be really interesting... does it correctly factor in message history and whatnot? people definitely don't think of LLMs as being pure or "hermetic" (weird word to use here, i know, but i think you catch my meaning) and im curious how easy it's gonna be to get accidental cache hits that produce surprising behavior

#

those test examples are not what i'd call surprising fwiw, mostly cuz they each start from scratch so there's no collected message history

steep onyx
# worn hill <@949034677610643507> that said the full caching on llm calls is gonna be really...

For now I'm gonna toss in a CachePerSession on llm so that it just retains the behavior it currently has once we enable persistent dagql caching. Basically just delaying needing answers to those (tricky) questions.

I feel like not caching LLM calls is the right default, but we do definitely need the history to be cacheable (and transferable via remote caching). So there's some subtleties there to disentangle

worn hill
#

idk i could see caching feeling nice provided that all the history and env context bits are treated as part of the cache key

smoky ocean
#

🚨🚨🚨 Experimental MCP support merging soon! Thank you @wraith remnant @warped bramble 🙏

smoky ocean
smoky ocean
river belfry
#

Not sure who I should put in reviewers, but I have this one that bumps openai-go. This improves the behavior a bit when using llama.cpp and small models. Not entirely sure why, I wasn't able to find the exact commit between alpha.61 and beta.2 that degraded the results
https://github.com/dagger/dagger/pull/10005

quiet ether
river belfry
# quiet ether writing some sort of eval for this would be nice. ref: https://github.com/vito/d...

Based on your module I did that: https://github.com/lgtdio/llmeval
I added my .env to the git repo as there are no secrets here. It's using Docker Model Runner but that should be similar if we run the model using llama.cpp.
Basically it generates the reports for my main test case, where I want the LLM to generate a dev environment on the fly by inspecting the code base.

I ran it with dagger based on alpha.61 of openai-go (using this branch https://github.com/lgtdio/dagger/tree/llm-demo-2) and here is the result: https://github.com/lgtdio/llmeval/blob/main/reports/with-openai-alpha-61.txt
-> At the end of the report I added the history of the built container.
-> It works really well: it found the tools and used them

I also ran the same thing based on beta.3 (main) and here is the result: https://github.com/lgtdio/llmeval/blob/main/reports/with-openai-beta-3.txt
-> It's failing because it doesn't use the tools correctly
-> Instead of finding the tree tool it installs the tree package
-> It never uses the addPackage tool
-> It installs weird stuff in --force-broken-world mode
In the end that works, but it's way less efficient.

I still haven't found the change after alpha.61 that degraded the performance.
My prompt is also complex because the model wasn't able to find the tools at first, but that might depend on the model, especially when it's not so big. But at least it was working.

proper stratus
#

Multi-objects is great. However, I have issues with Dagger Shell. When switching from navigate to input mode, the output stops where I was, and I can't see what I'm typing. Sometimes it shows input and output but only a few lines before stopping. Also, output from long prompts is difficult to follow, and it seems to not show the complete output.

river belfry
#

Currently I'm mostly running local models (<14B for most of them, so not so big)
I'm seeing a lot of differences in behavior depending on the model. Would it be interesting to share a list of the models that work well, or that we'd recommend, based on the kind of task to perform?

shrewd ermine
#

<14b can be a bit rough, but I've mostly used qwen2.5-coder for code generation and it's been pretty good. Getting the prompting just right is the real challenge with the smaller models but once you get the constraints just right they're good

river belfry
#

I'm also using qwen2.5-coder in 14B, I can't really go more with my actual laptop 😕

shrewd ermine
#

yeah it might be worth the tradeoff to run 7b with a larger context length too, I haven't dug too deep on that config side

river belfry
shrewd ermine
#

That looks pretty good, I don't have anything as good as that 😛 I would add under constraints DO NOT USE THE CONTAINER TOOL. If it calls Container() from your dev-environment module to get the container object it will immediately overwhelm itself. Hopefully that won't be an issue in the next release

river belfry
#

Regarding https://docs.dagger.io/api/llm#environments-and-tools I wonder if we shouldn't add a small example. Like if we only want the ability for the llm to read, a small module that only contains a read func. Or if we want to go a bit further, a read function and a tree function that runs tree in a container on a specified directory (I like this one because it's not just restricting the scope of the llm, it's also extending it with custom functions).
What I mean by that is I understand what is written, but also because I know what to expect. And while that sounds clear, I don't know how easy for someone to go from this description to the creation of a small module that can act as an environment.
I'll see to open a PR with a small example and we can discuss it if that makes sense.

shrewd ermine
#

btw if there's more feedback on the current LLM docs, now is an excellent time to share 😄

hidden tartan
gloomy kindle
#

🤔 for prompt mode, should the results be written back to the variable? sorry, i'm struggling a bit to get the actual result back out

gloomy kindle
#

aha 😦

smoky ocean
#

welcome to the frontier 😁

gloomy kindle
#

i'm using 0.17.2 without those env changes, still applies?

shrewd ermine
#

ah, no, different answer there

#

( but I don't know off the top of my head )

smoky ocean
#

~~in 0.17.2 I believe:

  1. the llm can set variables
  2. they get synced back to your shell
  3. but you need to explicitly prompt the llm to do it~~

Nevermind I was completely wrong

gloomy kindle
#

ahaha

spring wave
#

the llm can't set variables in 0.17.2

#

@gloomy kindle i think you want $_

#

that'll be assigned as the last value returned by the LLM

gloomy kindle
#

yes 😄

#

that's what i want ❤️

#

thank you!

shrewd ermine
#

is there any way to get $_ in code??

gloomy kindle
smoky ocean
spring wave
spring wave
#

well

smoky ocean
#

or result of last tool call?

spring wave
#

both (they are the same)

smoky ocean
#

regardless of selection

#

ah ok

spring wave
#

since tool calls auto-select

gloomy kindle
#

thank you thank you 😄

spring wave
#

i suppose that also means $_ will always be an object, never a string, since LLMs never select non-objects

#

but now you can e.g. $agent | last-reply if that's what you want 😛

smoky ocean
#

well at least now I'm caught up on what the API is in main..

smoky ocean
#

Bug report by @bronze fern : _currentSelection tool is always sent to LLM even when environment is empty. It seems to confuse the LLM (it gives tainted responses that talk about selection)

merry scarab
river belfry
# shrewd ermine That looks pretty good, I don't have anything as good as that 😛 I would add und...

Just FYI, after a lot of different tries, it looks like I get better results by adding the list of available tools to a system prompt. This gives better results than having them inside the prompt file.
With that I get results very similar to what I had when we were using openai-go alpha.61 (I mean I have good results and I'll be able to demo based on main, and I'm really happy about that 🙂 )

river belfry
hidden tartan
river belfry
#

Based on ⬆️
We have a Tools function. I wonder if we shouldn't make available the list of tools directly. That way we can construct a kind of similar doc but specifically for the model used and the expected format, and send it to the (system) prompt. (It can be useful for small, local models)
Would that make sense? (Happy to try to do it, but wanted to validate the need first)
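A small sketch of what building such a tools doc for the system prompt could look like. The `tool` type, the descriptions, and the prompt wording are made up for illustration; only the tool names come from the module discussed earlier in the thread, and no existing Dagger API is assumed.

```go
package main

import (
	"fmt"
	"strings"
)

// tool is a hypothetical, minimal description of a callable tool.
type tool struct {
	Name        string
	Description string
}

// toolsSystemPrompt renders the tool list as plain text suitable for a system
// prompt. Small local models sometimes need this kind of explicit nudge.
func toolsSystemPrompt(tools []tool) string {
	var sb strings.Builder
	sb.WriteString("You can call the following tools:\n")
	for _, t := range tools {
		fmt.Fprintf(&sb, "- %s: %s\n", t.Name, t.Description)
	}
	sb.WriteString("Call a tool instead of describing what you would do.")
	return sb.String()
}

func main() {
	fmt.Println(toolsSystemPrompt([]tool{
		{Name: "tree", Description: "List the files in the workspace"},
		{Name: "read", Description: "Read a file from the workspace"},
		{Name: "addPackage", Description: "Install a package in the dev environment"},
	}))
}
```

Rendering it in code rather than by hand would let the format be adjusted per model while the descriptions stay in one place.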

river belfry
merry scarab
hidden tartan
#

That's actually better to set the system prompt so you don't send the instruction on every query in prompt mode, just noticed that while trying

woeful quiver
#

When LLMs are calling LLMs, it might be helpful to name them? cc @hidden tartan (e.g. LLM.WithName("bot1"))

smoky ocean
#

@river belfry @hidden tartan I'm not sure I understand, tool calling already works this way - the LLM endpoint already injects the same information in the context. Doing this duplicates it

#

Are your descriptions in that doc the same as what's in the comments of your functions?

river belfry
#

I'll try again to be sure, but what I saw is that it works well with big models, like when you use gpt, but with small models, if I don't add the tools again (sometimes in a different format), it doesn't work well: the LLM will for instance not find the tools to run. Especially with the qwen model, I'd say.

smoky ocean
smoky ocean
river belfry
#

Here is what chatgpt says to me when I ask it to improve my prompt:

Ah, you're super close, but here's the catch: LLaMA 3.2 1B is extremely small and may not reliably infer when to call a tool, even when told it can. Smaller models like this often need explicit prompting to take actions like calling tools.

smoky ocean
#

We could add this to the LLM type

#

Like if model == "qwen" { /* inject system prompt */ }

hidden tartan
shrewd ermine
quiet ether
#

@spring wave As I mentioned in the prod-dev sync, I'm getting into a situation where after my agent selects the first tool it needs, it seems like it gets stuck within that context and doesn't know how to use the other tools it knew about before selecting the current one. I'm currently using Claude 3.5 as an LLM, not sure if that matters.

spring wave
spring wave
#

basically we might need another tool, analogous to selectQuery but maybe not named that because the model might not understand

worn hill
#

it's definitely the sort of UX bug where it makes you wonder if you're doing something wrong, but the fact that both of us hit the exact same thing trying to use the interactive-onramp UX strongly implies this is not user error at all

quiet ether
smoky ocean
#

Be careful of split brain guys

quiet ether
#

@spring wave @worn hill thread about selectQuery tool

woeful quiver
#

FYI, you cannot use Rancher Desktop with Claude to register/test MCP servers - has to be Docker Desktop

river belfry
#

But the (important) result is that it works great 🙂

shrewd ermine
#

Amazing!

river belfry
#

(ok, it works better 😅 )

smoky ocean
#

Masking fields 🧵

merry scarab
#

I find myself constantly rate limited by 4o - are there any good patterns to either

  1. consistently reduce my token size
  2. get visibility into what the input tokens actually look like?
spring wave
#

the biggest problem is that it currently gets every single API exposed to it as a tool, depending on the currently selected object, which is where #1354656055925149716 came in, which showed promise but led to the model making more mistakes

merry scarab
#

Sorry if dumb question but can I use dall-e-3?

.model allows me to switch but then seeing this error when I try

│🤖 0.1s
│ ! POST "https://api.openai.com/v1/chat/completions": 403 Forbidden {
│ !         "message": "You are not allowed to sample from this model",
│ !         "type": "invalid_request_error",
│ !         "param": null,
│ !         "code": null
│ !     }
! input: llm.withQuery.withModel.withPrompt.sync select: POST "https://api.openai.com/v1/chat/completions": 403 Forbidden {
!         "message": "You are not allowed to sample from this model",
!         "type": "invalid_request_error",
!         "param": null,
!         "code": null
!     }
shrewd ermine
#

But to answer more specifically, the model has to support chat generation. And if you want to give it objects, it has to support tool calling

smoky ocean
#

Oh @steep onyx there's another papercut we could use help with...

The "EnvironmentHook" in core/env.go installs all core types in Environment.with[TYPE]Input and Output.as[TYPE] but really, half of those types can be removed...

merry scarab
smoky ocean
#

I'm thinking we could remove the following:

  • current-module
  • *type-def
  • env
  • error
  • function-*
  • generated-code
  • llm-token-usage
  • sdk-config
  • source-map
shrewd ermine
#

Not really sure what "sample" means here

steep onyx
#

I'm thinking we could remove the

fleet fiber
#

Learn the fundamentals of Anthropic's Model Context Protocol by building an MCP server can give any AI model superpowers. In this tutorial, we build an TypeScript server that provides Claude with additional context and the ability to modify data o...
smoky ocean
#

lol

#

OK @steep onyx @wraith remnant @worn hill @spring wave we're in countdown to release... The idea is to get the new environment API out, so we can port all examples & docs to it by the hack night tomorrow

#

@spring wave what say you? 🙂

#

Night crew ready to do final testing here in London

spring wave
#

fixing $_ is the last blocker i think? i can figure something out there

#

and maybe revive -i / life-alert since that was based on returning, which we have now (in an even more solid form)

steep onyx
spring wave
#

@smoky ocean oh, and the exposing bindings as tools. that needs more testing i think

shrewd ermine
#

eval it uuuuuup

smoky ocean
#

Then we can cut 0.18 tomorrow when the stakes are less high 🙂 Probably safe to assume there's another (hopefully small) release to be had tomorrow for last minute fixes

#

But hopefully we can freeze API today

bronze fern
storm gate
smoky ocean
#

My 2p.

2 pounds? 🙂

storm gate
smoky ocean
#

@spring wave how can we ( @steep onyx @wraith remnant @worn hill ) help you get that PR merged?

spring wave
#

particularly curious about the vars-as-tools bit, that's the biggest unknown at the moment, and there are other schemes we could try

smoky ocean
#

@spring wave so you're positive you saw that working at least once?

spring wave
#

i definitely see it call those tools, but what's worrying is that it seems to take their presence (or maybe the way they're described) as a sign that it can pass them by name in args to things, which if true could be a real wrench in the gears

#

so another thing we could try is to change/repurpose currentSelection to additionally list the known object IDs, MAYBE paired with a name, but that might have the same risks. ideally they'd have a description instead

#

that's the general area that needs de-risking atm

#

aside from that, just trying out existing agents and trying to find (not too cryptic) ways to break/confuse it

#

just got $_ working, will push soon. it's currently as-before, where it's only the last selected object, not arbitrary scalars, since that's by far the easiest to support and you can always just re-select whatever field you want from it

#

(pushed)

#

i'll add some telemetry to those env getters too so it's more obvious when it's using them

smoky ocean
#

Let me tweak the description for inputs as tools

spring wave
smoky ocean
#

@spring wave did you want to make description mandatory in inputs?

#

probably last call to do that if so 🙂

spring wave
#

...lemme dogfood it a bit

#

worth noting there isn't a way to add descriptions to shell vars. would be cool to use comments for that

ctr=$(container | from golang) # a Go image to use for building
spring wave
#

"A container for preserving broccoli"

shrewd ermine
#

(but actually what if "container | from golang" was the description)

smoky ocean
#

Oh right the shell... damn

spring wave
#

well, we could always accept a "" description

#

it's mostly about making it hard to forget, and easy to consistently provide

smoky ocean
#

yeah

#

Having descriptions everywhere also clarifies that inputs and outputs each have their own namespace...

spring wave
#

probably easier in the REPL than it would be in a script, though

#

having said that, right now var syncing doesn't work at all in a script anyway

smoky ocean
#

@spring wave this seems like dead code no? Should I remove?

spring wave
#

yeah noticed that, you can rm it

smoky ocean
#

could it have influenced some of the above?

shrewd ermine
#

ok i have to sleep, can't wait to try 0.18 (or 1.0?) when i wake up 🙏

smoky ocean
#

@spring wave we should fix that papercut also

@wraith remnant if you're available? 👆

spring wave
woeful quiver
wraith remnant
smoky ocean
#

@spring wave trying to not get pulled into too many changes at once, to avoid conflict. pushing soon

spring wave
woeful quiver
#

MCP Server issues

spring wave
#

@smoky ocean rebased + pushed required input descriptions

wraith remnant
smoky ocean
#

@spring wave I don't see Env.outputs and Env.output in the llm schema, ok for me to add?

#

btw I just had a very successful run with the only explicit prompt being "do it" 😛

spring wave
smoky ocean
#

It really feels like with this pattern we're getting closer to agents being declarative reactive functions themselves, not just the outside code envelope, but the LLM itself 🙂

spring wave
smoky ocean
spring wave
#

and why the idea of mutable bindings felt off

smoky ocean
#
env=$(
  .core | env |
  with-container-input base-container $(container | from alpine) |
  with-git-repository-input dagger-source $(git https://github.com/dagger/dagger) |
  with-file-output dagger-binary "The dagger command-line binary, built in Go from the latest stable release of Dagger, in a containerized dev environment" |
  with-container-output go-env "The go environment used to build the dagger CLI, with everything setup such that 'go build' works on the first try when entering the container"
)

result=$(llm | with-env $env | with-prompt "do it")

(sparing you the middle part)

│🤖 1.2s ◆ Input Tokens: 2,355 ◆ Output Tokens: 26
│ ✔ return(
│ │ │ dagger-binary:🤖 Container.file(path: "/bin/dagger"): File! 0.0s
│ │ │ go-env:🤖 Container.withWorkdir(path: "/app"): Container! 0.0s
│ │ ): String! 0.0s
│🤖 I've successfully prepared everything:
│ ┃
│ ┃ • Dagger Binary: The Dagger CLI binary has been built and is available.
│ ┃ • Go Environment: The containerized Go environment is set up with the Dagger source code mounted at /app .
│ ┃
│ ┃ You can now proceed with your tasks using these resources.
│ ┃ 1.8s ◆ Input Tokens: 2,433 ◆ Output Tokens: 62

spring wave
#

sweet

smoky ocean
#

It did try to cheat and return File#1

#

(before actually getting a file)

#

so that might be a weakness of the numerical ID system

#

maybe we need to make it look a little more random
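
One way to make IDs "look a little more random" is to derive the suffix from a hash instead of a counter. A minimal sketch, assuming a hypothetical `opaqueID` helper and a `Type#suffix` layout; none of this is Dagger's actual ID scheme, and the digest string is only sample input:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// opaqueID builds a handle like "File#<8 hex chars>" from a type name and a
// content digest. Hypothetical helper: the format and the 8-character
// suffix length are assumptions made for illustration.
func opaqueID(typeName, digest string) string {
	sum := sha256.Sum256([]byte(typeName + ":" + digest))
	return typeName + "#" + hex.EncodeToString(sum[:])[:8]
}

func main() {
	// A sequential ID such as File#1 is easy for a model to guess before
	// any file exists; a hash-derived suffix is not.
	fmt.Println(opaqueID("File", "xxh3:7bf41186fcf1a6ee"))
	fmt.Println(opaqueID("Container", "xxh3:7bf41186fcf1a6ee"))
}
```

The same object always maps to the same handle, so the model can still refer back to IDs it has already seen.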

spring wave
#

hmm yeah

#

what model?

smoky ocean
#

gpt-4o

wraith remnant
spring wave
#

env input chaining papercut

smoky ocean
#

@spring wave I'm a little confused trying to add Env.output() and Env.outputs(): it looks like the output definitions are saved in one place (outputsByName) but the actual values in another (objsByName), is that right?

spring wave
smoky ocean
#

I would have expected objsByName to become inputsByName, with a mirror outputsByName of the same type: map[string]*Binding.
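
The proposed layout, two maps of the same type, can be sketched as below. Only `inputsByName`, `outputsByName`, and `map[string]*Binding` come from the message above; the `Binding` fields, the constructor, and the method signatures are assumptions for illustration, and the sketch mutates the env in place for brevity.

```go
package main

import "fmt"

// Binding pairs a named slot with its description and, once known, a value.
// The field set is hypothetical.
type Binding struct {
	Name        string
	Description string
	TypeName    string // expected type for outputs, e.g. "File"
	Value       any    // nil until an output has been returned
}

// Env keeps inputs and outputs in separate namespaces of the same shape.
type Env struct {
	inputsByName  map[string]*Binding
	outputsByName map[string]*Binding
}

func NewEnv() *Env {
	return &Env{
		inputsByName:  map[string]*Binding{},
		outputsByName: map[string]*Binding{},
	}
}

// WithInput registers a value the model can read.
func (e *Env) WithInput(name string, value any, desc string) *Env {
	e.inputsByName[name] = &Binding{Name: name, Description: desc, Value: value}
	return e
}

// WithOutput declares a slot the model is expected to fill.
func (e *Env) WithOutput(name, typeName, desc string) *Env {
	e.outputsByName[name] = &Binding{Name: name, Description: desc, TypeName: typeName}
	return e
}

// Output looks up a single declared output by name.
func (e *Env) Output(name string) (*Binding, bool) {
	b, ok := e.outputsByName[name]
	return b, ok
}

func main() {
	env := NewEnv().
		WithInput("base-container", "alpine", "base image to build on").
		WithOutput("dagger-binary", "File", "the built dagger CLI binary")
	// The same name can exist on both sides without clashing.
	env.WithOutput("base-container", "Container", "the modified base image")
	out, _ := env.Output("dagger-binary")
	fmt.Println(out.TypeName, out.Value == nil)
}
```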

smoky ocean
spring wave
#

or that yeah

smoky ocean
#

OK I'll do that then?

#

Then need to go to sleep... Will you be able to carry the release today guys? I know it's getting late even for your timezones

steep onyx
#

My assumptions atm are that "when" == "when that environment-api PR is merged" and "version number" == "v0.18"

#

so just tell me if that's wrong

smoky ocean
wraith remnant
spring wave
#

current status: nothing in progress, was just dogfooding the required-descriptions stuff for my evals, and then gonna run them, which I'll do after @smoky ocean pushes the new Env.output API since I need it now. 😛

but, have to go on a 40m-ish car ride so that's it from me for a bit. pushed my evals changes in case anyone wants to try them while i'm in the 🚗

#

also want to try adding input descriptions to the tool descriptions, not sure if that's done yet

steep onyx
wraith remnant
smoky ocean
#

@spring wave do you use the type_ argument of WithOutput anywhere ?

#

my guess is that type enforcement was left as a todo?

#

or I'm blind

spring wave
smoky ocean
#

OK I think I'm done, testing real quick

#

I apologize in advance if there are sleep-deprivation bugs

#

@wraith remnant another papercut request... env doesn't work in the shell, you have to call .core | env...

#

mmm there's a Terminal type? 🤔 isn't that long deprecated?

steep onyx
#

I can append it to that list of things to hide from the env extensions

smoky ocean
#

@spring wave pushed

smoky ocean
smoky ocean
#

oh also missing in output arguments

#

(I think)

#

shall I add them too?

spring wave
spring wave
smoky ocean
#

oh you're right. Mmm then I guess the LLM ignored it in my eval

#

I was getting a little too cocky with the "do it" prompts 😛

spring wave
#

lol

smoky ocean
#

fixing it

spring wave
#

so, the currentSelection tool was originally added as a hint to the model so it knows when it's been given an initial selection, but now that isn't a thing

#

so i'm not sure if it's even needed anymore? unless the hint is still helping it? not sure

smoky ocean
#

Oh I see. Not needed the rest of the time?

spring wave
#

my other idea was to repurpose it into a general purpose "your current context" tool, which lists the inputs + descriptions in its description, that way we don't need all the getter tools

smoky ocean
spring wave
#

well, the advantage is a) not having to run those tools ever, and b) not risking confusing tool names because of what people named their inputs
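
The "your current context" idea can be sketched as a function that renders every input into one tool description, so no per-input getter tools are needed. The `Input` type, the function name, and the wording of the description are all hypothetical:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Input is a hypothetical (name, type, description) triple for one env input.
type Input struct {
	Name, TypeName, Description string
}

// contextToolDescription renders all inputs into a single tool description,
// so the model can see them without calling one getter tool per input.
func contextToolDescription(inputs []Input) string {
	// Sort a copy by name so the description is stable across runs.
	sorted := append([]Input(nil), inputs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	b.WriteString("Your current context. Available inputs:\n")
	for _, in := range sorted {
		fmt.Fprintf(&b, "- %s (%s): %s\n", in.Name, in.TypeName, in.Description)
	}
	return b.String()
}

func main() {
	fmt.Print(contextToolDescription([]Input{
		{"dagger-source", "GitRepository", "the Dagger source repository"},
		{"base-container", "Container", "an Alpine base image"},
	}))
}
```

Because the input names only ever appear inside a description string, they cannot collide with tool names.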

smoky ocean
#

I mean it doesn't change the API so we can let the evals decide 🙂

spring wave
#

In my testing simply putting stuff in tool descriptions is pretty high leverage

smoky ocean
#

For now I'll just fix the code as is, feel free to remove the hint

spring wave
#

/me writes an eval that sets an input called "return" :elmofire:

smoky ocean
#

(force) pushed

#

OK this is it for me...

spring wave
#

Night!

smoky ocean
#

Hopefully not too late for release?

steep onyx
#

The LLM integ tests are very upset on that PR right now, I'll work on updating them

spring wave
#

probably need a sdk all generate + docs generate too (I would but I'm on a laptop in a car and my legs are hot enough as it is)

#

hmm looks like we need a .sync before getting values out of the env - adding one to LLM.env. also just remembered I had a fix for .sync on another branch, gonna try pulling that in, and MAYBE -i support but that might be too much too late

quiet ether
#

feel free to ping / dm me

steep onyx
#

@spring wave @quiet ether does that sound right?

spring wave
steep onyx
#

or do I just need one provider?

spring wave
#

one provider should be sufficient

#

running evals now, noticed claude-3-5-sonnet-latest sometimes doesn't call return 😬 maybe more prompting needed?

spring wave
#

anyone know what causes this? my ./hack/dev is wedged

│ │ ✘ moduleSource(disableFindUp: true, refString: "/var/home/vito/src/dagger/docs"): ModuleSource! 30.0s
│ │ ! failed to resolve dep to source: failed to load local dep: select: failed to load sdk for local module source: failed to load local dep: select: local path "/var/home/vito/src/dagger/sdk/php/dev/php" does not exist: unknown builtin sdk
│ │ ! The "php" SDK does not exist. The available SDKs are:
│ │ ! - go
│ │ ! - python
│ │ ! - typescript
│ │ ! - php
│ │ ! - elixir
│ │ ! - java
│ │ ! - any non-bundled SDK from its git ref (e.g. github.com/dagger/dagger/sdk/elixir@main)

https://v3.dagger.cloud/dagger/traces/cdf57de1006c67015097cfffdfc7edaa

quiet ether
#

maybe sdk gen busted your local sdk folder? @spring wave

#

trying here

spring wave
#

hmm could be

❯ git clean -ffdnx
Would remove .dagger/dagger.gen.go
Would remove .dagger/internal/
Would remove .env
Would remove .jj/
Would remove .ropeproject/
Would remove bin/
Would remove sdk/dotnet/sdk/Dagger.SDK/introspection.json
Would remove sdk/php/vendor/
Would remove sdk/rust/target/
quiet ether
#

seen that happening in the past but with a different error

spring wave
#

now it's complaining about elixir - progress!

#

gonna try pruning the cache

quiet ether
#

elixir builds fail also in CI due to caching

#

I think the elixir cache is racy somehow

#

seen that happening quite a few times in CI and re-running generally fixes it

#

@steep onyx were you able to get the golden files? I have them here

steep onyx
spring wave
#

(pruning cache worked but now a fresh build is taking ages...)

steep onyx
# spring wave anyone know what causes this? my `./hack/dev` is wedged ``` │ │ ✘ moduleSource(d...

It's quite buried but eventually in the error trace I found this: https://v3.dagger.cloud/dagger/traces/cdf57de1006c67015097cfffdfc7edaa?span=db3424adfa16f059

host github.com not found

Which probably explains sorta what's happening; it tried to get the php sdk from git but then probably hits a fallback case where it assumes it's not meant to be a git ref and just a local ref

#

So might be a DNS problem :elmofire:

spring wave
#

ah nice find, saw that error later and decided to restart the engine, seems ok now

steep onyx
#

I pushed TestLLM fixes and sdk/doc regen to the PR, so hopefully CI is happy now

spring wave
#

@steep onyx sorry, got caught up in a rebase, trying to hoist over some fixes from the life-alert branch. are you getting close to a cut-off time? 😬

steep onyx
spring wave
#

@steep onyx ok, calling it for now - Claude 3.5 Sonnet still has some issues, but I don't think it's blocking

#

(just did a push -f)

steep onyx
spring wave
#

I temporarily renamed it to try to drop a stronger hint to Claude 3.5 but it didn't work well enough to be worthwhile

steep onyx
quiet ether
#

going to bed folks, it's quite late here 🙏 😴

steep onyx
storm gate
gloomy kindle
#

question around required descriptions - just on porting code to the new style, it feels a little clunky? i'm also not entirely sure what i should be describing? it feels like i'm just parroting the type comments from the type itself for simple examples

gloomy kindle
#

little nit about the new API - I have to repeat the magic string for the output var a couple times? it feels weird that it's not statically typed, and I could just get it wrong and get runtime errors. feels unavoidable though, and it was always present (although less obvious) with single object before.
I also don't have any suggestions as to how to avoid it 😭 but it does feel very magically dynamic, and different from the rest of our type system

#

(also I'm so far out of the loop, so you've probably discussed it all before)

shrewd ermine
shrewd ermine
#

getting API errors if I do llm | with-prompt (not supplying an env). Not supported?

bronze fern
#

Shell mode:

✘ llm 1.0s
│ ● llm: LLM! 11.1s
! Post "http://dagger/query": unexpected EOF

Prompt mode:

dagger
Dagger interactive shell. Type ".help" for more information. Press Ctrl+D to exit.
loading type definitions 0.1s
│ ┃ 0.0s
│
│🤖 Hello! How can I assist you today?
│ ┃ 1.4s ◆ Input Tokens: 1,089 ◆ Output Tokens: 11
shrewd ermine
bronze fern
#

Get similar with Gemini and Anthropic. Kills the engine when I run llm in shell mode.

proper stratus
#

I just hit the same error in two sessions

bronze fern
quiet ether
shrewd ermine
quiet ether
#

and checking where it broke

shrewd ermine
#

probably when the entire API was refactored 😅

quiet ether
#

yeah.. I'd assume it was the Environments API PR 😬

smoky ocean
#

Making a list of 0.18 papercuts 🧵

merry scarab
#

I really like the new env API 😍

quiet ether
#

checking..

quiet ether
#

@spring wave can you merge please? Had to step out for a bit