#agents

1 messages ยท Page 2 of 1

smoky ocean
#

@wild pasture I'm surprised you even get a llm implementation on 0.15.4? Very surprising to me. If you install dagger with this command then everything should work.

curl -fsSL https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION=0.17.0-llm.1 BIN_DIR=/usr/local/bin sh

See https://github.com/shykes/melvin/blob/main/README.md

wild pasture
#

mmm already on 0.17.0-llm.1
new error dropped:

wraith remnant
wild pasture
#

! input: llm panic while resolving Query.llm: runtime error: invalid memory address or nil pointer dereference
! goroutine 9748 [running]:
! runtime/debug.Stack()
! /usr/lib/go/src/runtime/debug/stack.go:26 +0x5e
! github.com/dagger/dagger/dagql.(*Server).resolvePath.func1()
! /app/dagql/server.go:736 +0x78
! panic({0x25ef180?, 0x4b21010?})
! /usr/lib/go/src/runtime/panic.go:785 +0x132
! github.com/dagger/dagger/core.(*SecretStore).GetSecretPlaintext(0xc000ac8450, {0x30d5110, 0xc000b22210}, {0xc0016a6570, 0x15})
! /app/core/secret.go:199 +0x1dc
! github.com/dagger/dagger/core/schema.(*secretSchema).plaintext(0xc000499f20?, {0x30d5110, 0xc000b22210}, 0xc00127a588, {})
! /app/core/schema/secret.go:182 +0x9a
! github.com/dagger/dagger/dagql.Func[...].func1({0xc000373640?, 0xc00127a588?, {0x0, 0x1, 0xc0007f61e0, 0xc000ac2610}, 0x0}, {})
! /app/dagql/objects.go:577 +0x49
! github.com/dagger/dagger/dagql.NodeFuncWithCacheKey[...].func1({0xc000373640, 0xc00127a588, {0x0, 0x1, 0xc0007f61e0, 0xc000ac2610}, 0x0}, 0xc000b22030)
! /app/dagql/objects.go:634 +0xfc
! github.com/dagger/dagger/dagql.Class[...].Call(0x3118c00?, {0x30d5110?, 0xc000b22210?}, {0xc000373640, 0xc00127a588, {0x0, 0x1, 0xc0007f61e0,

#

i was doing fil cuz the log is too long discord doesnt like it ๐Ÿ˜ฆ

smoky ocean
#

๐Ÿšจ๐Ÿšจ๐Ÿšจ New pre-release: v0.17.0-llm.2. Now with Anthropic support!

To install:

curl -fsSL https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION=0.17.0-llm.2 BIN_DIR=/usr/local/bin sh

Thank you @split shard @gloomy kindle @spring wave for wrangling our release system to support these pre-releases ๐Ÿ™

smoky ocean
#

The checksum issue should be fixed now...

wild pasture
#

amazing work kyle, guillaume and vito salute

spring wave
#

np! the main learning: for folks configuring ollama, you have to configure an IP reachable from the engine - so 192.168.xx.xx or 10.xx.xx.xx, not 127.0.0.1 or 0.0.0.0. Maybe we should automatically create a tunnel to the host?

smoky ocean
smoky ocean
#

@spring wave @wraith remnant quick note while preparing demo. We lost the ๐Ÿค–๐Ÿ’ญ on API calls... was pretty nice for long round trips

spring wave
#

doesnt it stream? or is this for a slow API that takes a long time to stream back?

wraith remnant
spring wave
#

@smoky ocean @wraith remnant try using loop instead of sync

#

i think i noticed sync gets hidden because we usually don't actually want to show it (i.e. container.sync)

smoky ocean
#

also shell doesn't print errors coming from function calls

#

(sorry rehearsing demo, will follow up after)

smoky ocean
#

@spring wave that change you mentioned, including stderr in withExec error under llm codepath... Is it in llm.2? If so I can simplify my toy-workspace module even further ๐ŸŽ‰

spring wave
#

yep it's in

smoky ocean
#

of course NOW the llm tries to access workspace.container()...

#

We may need a new pragma +nollm

#

or argument to llm: withToyWorkspace(hide:["container"])

#

๐ŸŽ‰ it worked! Thank you @spring wave

New workspace code:

package main

import (
    "context"
    "dagger/toy-workspace/internal/dagger"
)

// A toy workspace that can edit files and run 'go build'
type ToyWorkspace struct {
    // The workspace's container state.
    // +internal-use-only
    Container *dagger.Container
}

func New() ToyWorkspace {
    return ToyWorkspace{
        // Build a base container optimized for Go development
        Container: dag.Container().
            From("golang").
            WithDefaultTerminalCmd([]string{"/bin/bash"}).
            WithMountedCache("/go/pkg/mod", dag.CacheVolume("go_mod_cache")).
            WithWorkdir("/app"),
    }
}

// Read a file
func (w *ToyWorkspace) Read(ctx context.Context, path string) (string, error) {
    return w.Container.File(path).Contents(ctx)
}

// Write a file
func (w ToyWorkspace) Write(path, content string) ToyWorkspace {
    w.Container = w.Container.WithNewFile(path, content)
    return w
}

// Build the code at the current directory in the workspace
func (w *ToyWorkspace) Build(ctx context.Context) error {
    _, err := w.Container.WithExec([]string{"go", "build", "./..."}).Stderr(ctx)
    return err
}

cc @shrewd ermine @bronze fern ๐Ÿ‘†

#

I also removed the comments from read() and write() arguments, because they're self-evident, model doesn't need them. Saves me a lot of LoC ๐Ÿ™‚

shrewd ermine
#

lol every line counts!

smoky ocean
#

Starting a thread to coordinate launching the new repo github.com/dagger/agents

wraith remnant
#

sorry Alex laughcry

smoky ocean
noble notch
#
poll_question_text

๐Ÿฅค hey everyone, quick taste test: you already know about Dagger but many of your friends still don't, so which of these bite-sized descriptions do you think would get their attention the fastest?

victor_answer_votes

6

total_votes

11

victor_answer_id

1

victor_answer_text

Dagger: Composable agent runtime ๐Ÿค–โ†’๐Ÿ”„

smoky ocean
merry scarab
#

Good luck @smoky ocean - you are going to crush it! Remember to zoom in ๐Ÿ˜„

smoky ocean
#

Quick request @wraith remnant ๐Ÿ™ there is a typo in the readme.

Once this feature is merged (current target is 0.17), Melvin will support with a stable release of Dagger.

Could you fix it please? afk at the moment...

wraith remnant
#

Actually, there are several references to melvin still -- I shall reference to dagger/agents ?

smoky ocean
#

just say "a development build will no longer be required". no melvin mention

shrewd ermine
#

I can do some touch ups shortly too when I submit my demos

wraith remnant
smoky ocean
#

done. sorry! will give you permissions tonight

mystic steeple
#

i also think that section helps with index / observability on search

shrewd ermine
smoky ocean
dim cypress
merry scarab
# dim cypress I'm new to golang, can someone help me to understand where does this `llm.WithTo...

This is a bit of a dagger specific thing. When you install a dagger module you get bindings for all of the functions that exposes.

So in this case, llm.WithToyWorkspace() comes from the fact that the toy-workspace module is installed here: https://github.com/dagger/agents/blob/main/toy-programmer/dagger.json

--
more details below, feel free to skip if you dont want to fall into a rabbit hole (yet) ๐Ÿ˜„

You can see the details of ToyWorkspace here: https://github.com/dagger/agents/blob/main/toy-workspace/main.go

This one is written in go, but note that dagger modules can be written in many languages (go, typescript, python, php, and more coming soon) and be consumed from any other language.

You can learn more about dagger modules here: https://docs.dagger.io/api/module-structure

GitHub

Contribute to dagger/agents development by creating an account on GitHub.

GitHub

Contribute to dagger/agents development by creating an account on GitHub.

smoky ocean
#

@dim cypress note that Dagger is not Go-specific. What's your language of choice? There are SDKs for Typescript, Python, PHP

wraith remnant
#

Anthropic 3.7 (+ their agentic CLI -> Claude Code) released today -- need to explore ๐Ÿ‘€ ; wish its gonna be awesome for agents ๐Ÿคž

shrewd ermine
#

Gemini Client ๐Ÿงต

dim cypress
#

I guess I finally understand how dagger module works, but my IDE (VSCode) can't resolve module, I tried https://docs.dagger.io/api/ide-integration/ and restart IDE, but still the same

Dagger uses GraphQL as its low-level language-agnostic API query language, and each Dagger SDK generates native code-bindings for all dependencies from this API. This gives you all the benefits of type-checking, code completion and other IDE features for your favorite language when developing Dagger Functions.

dim cypress
woeful quiver
dim cypress
#

using qwen2.5-coder:14b and the ToyWorkspace, feels like the LLM is not really calling the function, it just explain the steps but not calling ToyWorkspace.Write , any ideas why?

wraith remnant
dim cypress
wraith remnant
#

Qwen tool calling

wraith remnant
#

๐Ÿ‘€
Was there a reason to not expose the with-system-prompt as a top-level primitive like with-prompt ? What's your sentiment around introducing it ?

shrewd ermine
#

since at least gemini and i think anthropic just allow setting one, it doesnt need to be chainable

wraith remnant
shrewd ermine
#

maybe that's what you were already talking about

wraith remnant
#

yes and it's not exposed ๐Ÿ‘

shrewd ermine
smoky ocean
#

I initially left it hidden just because of API churn: less boilerplate to refactor 5 times in a row.. now we could bring it back.

BUT

there is churn upstream also, I think openai actually deprecated the term "system" in favor of "platform" (model owner) and "developer" (mere mortals programming the agent)

shrewd ermine
#

๐Ÿš€ Prompt was very pushy lol

shrewd ermine
#

That gemini PR is ready for testing/feedback/review ๐Ÿ‘†

smoky ocean
#

@shrewd ermine nice! Is it safe to merge and test from a pre-release? Or is there a risk that it breaks other things?

smoky ocean
#

merged!

shrewd ermine
#

@smoky ocean coming soon to an agent repo near you...

smoky ocean
#

wat!

#

Can't wait ๐Ÿ™‚ @shrewd ermine . How's the model performing, is the translation reliable?

shrewd ermine
#

I haven't given it's workspace a checker yet, but it'll be checking against dagger functions matching the original

#

My next problem to overcome is that I'm providing all sdk reference inline with prompts and it's just too much. It's overwhelming the llm. So I'm giving the "dag-workspace" a reference lookup tool instead

smoky ocean
#

@shrewd ermine @bronze fern for changes to your respective modules only, feel free to just push directly to llm

#

for changes that might affect others, safer to do PRs (even if they end up being fast-tracked anyway)

#

This feels like a good time to tag llm.3? OK with you @spring wave @shrewd ermine ?

spring wave
#

yeah worth doing if not just for the API getter lowercasing change

#

i'm also cooking up some shell improvements, but that can wait til next release

smoky ocean
spring wave
#

the llm getters were all uppercase before, which we didn't notice probably because we use Go all the time

#

but in PHP/etc you'd end up with e.g. $llm->Container() which is weird

bronze fern
spring wave
#

wtf is going on here lol - running the same command in one shell is committing necromancy on an older session I had open?! /cc @steep onyx

#

happens on vito/shell-bbt (ish - local changes, but CLI only). maybe related to cross-session caching...?

steep onyx
#

you're running the same thing in each right? so I guess telemetry for it is getting sent to both clients because they are both open and the older client gets updates

#

isn't that how the telemetry pub/sub supposed to work (even though weird in this case)?

shrewd ermine
#

I was seeing an issue with directory | entries not displaying in shell but I'm afk at the moment so I can't provide any helpful details. Otherwise good for LLM.3 release

steep onyx
spring wave
#

it seems like the newer client doesn't get the buildkit-level telemetry at all (it just says pending)

steep onyx
#

Oh I wouldn't be surprised if this is our old friend buildkit edge merging...

spring wave
#

yeah could be

#

I hadn't factored in that shell is a very long lived client/session

#

seems to persist even after closing the old client. hmm

#

fine if i start a new one though

steep onyx
#

If you have access to the engine logs and you see any recent logs that contain "merging edges", that would help confirm

#

I wouldn't be surprised if the progress buildkit sends ends up being only the "original edge's client". And that also aligns with it persisting even after you close the old client and using on the one that was already merged

smoky ocean
#

0.17.0-llm.3 ๐Ÿงต

#

@spring wave you happy with your choice of MCP lib from that POC you had?

spring wave
#

yep, don't remember any major issues with it

#

seemed pretty well designed

smoky ocean
#

Same one @spark phoenix has been using. We're looking at fastest way to add server-side support.

I'm wondering how much I can reuse the existing bbi implementation vs. re-implementing a "Dagger introspection to X" adapter

#

Doesn't look like they allow passing an existing jsonschema

shrewd ermine
#

Do they take an openapi Schema? That's what Genai uses so we have a conversion for that now

smoky ocean
#

Doesn't look like it, it's a static Go API for declaring each type

#
  calculatorTool := mcp.NewTool("calculate",
        mcp.WithDescription("Perform basic arithmetic operations"),
        mcp.WithString("operation",
            mcp.Required(),
            mcp.Description("The operation to perform (add, subtract, multiply, divide)"),
            mcp.Enum("add", "subtract", "multiply", "divide"),
        ),
        mcp.WithNumber("x",
            mcp.Required(),
            mcp.Description("First number"),
        ),
        mcp.WithNumber("y",
            mcp.Required(),
            mcp.Description("Second number"),
        ),
    )
shrewd ermine
#

Oh ๐Ÿ˜ณ

shrewd fern
#

@smoky ocean, I can merge my shell branch in with the llm stuff so you guys can dogfood the filesystem navigation if you want, but may be risky if you hit a bug and I can only fix it the next day. Same thing for rebasing, I can do it easily but you may need it on your timezone.

smoky ocean
#

I would love to do it if we don't find any red flags

smoky ocean
#

That Properties looks a lot like raw unmarshaled jsonschema ๐Ÿ™‚

#

Trying now!

wraith remnant
#

it's crazy that google makes it so cheap compared to others -- I just hope it is good in agentic flows ๐Ÿคž

spring wave
#

here's the work so far on shell-bbt - it's a bit tedious to implement, but feels like a good balance

  • sticking with the change to not take over mouse input (scrolling and clicking)
  • instead, commands are now printed to the scrollback when they complete, similar to before - so you can just scroll like normal
  • in-progress commands are rendered with a different background, matching the prompt background
  • hitting <esc> still pops you into navigation mode, but now it has the distinct background so you can tell what's scrollback vs. the navigable history
spring wave
#

gonna do the rebase + re-merge shell-bbt + force-push /cc @smoky ocean

spring wave
#

done

smoky ocean
#

@shrewd ermine are you still using the top-level demo module? I'm going to split it up, the melvin parts will go under melvin/.

shrewd ermine
smoky ocean
#

done

#

please tell me if you find any issues!

#

I moved demo under melvin/demo, but later today will finish cleaning up by merging it into melvin/ proper

spring wave
#

web UI changes for LLM have been shipped

#

no more add-follow-v3 /cc @smoky ocean

shrewd ermine
#

I'm finding myself using this pattern in every checker

    @function
    async def check(
        self
    ) -> str:
        """Checks if the workspace meets the requirements"""
        cmd = (
            self.ctr
            .with_exec(["sh", "-c", self.checker], expect=ReturnType.ANY)
        )
        out = await cmd.stdout() + "\n\n" + await cmd.stderr()
        if await cmd.exit_code() != 0:
            raise Exception(f"Checker failed: {self.checker}\nError: {out}")
        return out

do we need a better DX for that? (specifically returntype.any -> exit_code -> stdout/stderr)

#

The reason I have to do this is because the raised exception does not include stderr. It just says process "foo" did not complete successfully: exit code: 1

#

anyway the dagger translator is getting better! Almost ready to PR I think. The prompt is something very special lol

spring wave
#

there's also a chance I change the errors back to directly embed, but it'll be a dive back down the frontend rabbit hole to figure out how to strip them back out

#

maybe something incredibly explicit like -----BEGIN STDOUT-----

shrewd ermine
#

I'm not sure ๐Ÿค” This is all I see when a function call fails (and all the llm sees) "foo" did not complete successfully: exit code: 1

spring wave
#

yeah, it's intentionally omitted from the default error string representation because it makes the UI and telemetry (and anywhere else it ends up) incredibly verbose.

#

but the data should be available on the error type

shrewd ermine
spring wave
#

the problem isn't necessarily that it shows it, the problem is that it's usually completely redundant with the logs also showing the same thing (only better, since they're interleaved)

spring wave
#

ah here it is

shrewd ermine
spring wave
#

the thing is, even if we do bring it back in the error string, the UI would just be stripping it back out, because otherwise in the above screenshot you'd be seeing the same thing twice

#

(only one would be poorly formatted)

#

might need to do something sophisticated like keep track of the same exact call having already errored elsewhere in the trace

#

hmm actually these aren't even the same

shrewd ermine
#

I guess. In general I'd like to see output from cached events most of the time (ignoring technical constraints ๐Ÿ˜› ). Pretty often I find myself looking at a trace and want to know what the output of a certain command was but it isn't shown because it was cached

#

oh right. So in the screenshot the one with no output was the one that the llm ran into. Eventually it gave up. My pipeline runs the test itself before returning the output (but after the llm has done everything) becase I'm struggling with llms giving up and I want to be sure the output is good manually

spring wave
#

this is even trickier because it looks like it was deduped at the LLB layer, not the DagQL layer - they're different calls (different space invaders, digests, etc), but they ended up the same at the LLB layer, because one of them calls withNewFile twice with the same content

#

it might still be possible to correlate them, though, I suppose they should have the same LLB vertex digests as outputs? maybe...?

shrewd ermine
#

because one of them calls withNewFile twice with the same content
Does that make things more complicated? I think it's the best primitive we have for an llm workspace write function, but maybe it needs to be synced/flattened

smoky ocean
#

@shrewd ermine @spring wave @bronze fern we're going to merge github.com/dagger/agents directly into dagger/dagger. It seemed like a good idea to have a separate repo, but having a throaway repo in the official dagger org ends up being worst of both worlds

I am going to open a PR to dagger/dagger which:

  1. Updates the README to mention both CI and AI use cases
  2. Adds the demo modules to a agents/ directory
  3. (waves hands)

Thoughts?

spring wave
# shrewd ermine > because one of them calls withNewFile twice with the same content Does that ma...

a little bit - it means I can't do something like "oh that same exact call failed over here, I can just show it the same way in both locations" - because they're actually different calls. Instead, if I want to e.g. show the logs in both places, the UI would have to aggregate based on something else, like buildkit vertex digests, assuming they actually are the same. (They might not be - iirc Buildkit distinguishes between vertex digests and cache keys - the latter are what actually dedupes things in the solver, but the former might be what gets sent over telemetry). Would you be able to send this telemetry to Honeycomb? I have a helper script if you need. Or, if you have a command I can run myself, that might be easier

shrewd ermine
shrewd ermine
spring wave
#

lol (from Cursor)

smoky ocean
#

Is any Cursor user not in yolo mode?

#

@steep onyx @spark phoenix quick engine question: how hard would it be to tweak the session-attachable that powers Container.terminal(), to support a "raw" mode without a term emulator? The idea would be to use that quickly add the ability for the dagger CLI to act as a stdio server (for the purpose of MCP integration)

#

Rough concept:

"""A large language model context""
extend type LLM {
  """Serve the current context for remote tool calling over MCP. This takes over the current client stdio streams, and converts it to a stdio server"""
  serveMCP(): Void
}
smoky ocean
#

I'm just not sure what a better version is

steep onyx
# smoky ocean <@949034677610643507> <@707661669819613324> quick engine question: how hard woul...

So you want the CLI to just proxy the stdio of an arbitrary service? Probably not too much work, the terminal attachable just proxies stdio to some io.Readers/io.Writers it's given: https://github.com/sipsma/dagger/blob/5f7a342344e9bc4fcf2b734c5a548727fa6025f0/engine/session/terminal.go#L55-L55

Right now the CLI passes it this: https://github.com/sipsma/dagger/blob/5f7a342344e9bc4fcf2b734c5a548727fa6025f0/cmd/dagger/terminal.go#L13-L13

But if it passed it just the raw os.Stdin/out/err then that'd probably do it

#

I don't know enough about MCP implementation details to have an opinion on the best approach, but if there's some way to just proxy a network protocol rather than proxy via stdio that feels cleaner/nicer and would be able to just re-use our existing support for network tunnels

smoky ocean
smoky ocean
#

LLM-friendly dagger cloud preview broken?

spring wave
smoky ocean
#

oh!

#

so it's merged?

smoky ocean
#

looks like it

#

@spring wave in the TUI, this seems to drop LLM spans:

dagger shell -c 'github.com/dagger/agents/toy-programmer go-program-qa "write me a curl clone"'

I see the prompt & replies for the dev agent. But for the QA agent, I only see the tool calls and the final reply. Not the intermediate prompt

spring wave
#

oh interesting, I think I see what you mean. it's the second human-emoji span, in the web UI, which is missing in the TUI?

smoky ocean
smoky ocean
smoky ocean
smoky ocean
#

https://x.com/willccbb/status/1894957149500965292

I've been wondering the same thing.

I think if we had eg. a Dagger object with "searchDiscord" and "searchDocs" functions we could build a decent "help bot" without sophisticated RAG or other information embedding at the prompt level.

if youโ€™re building a RAG system in 2025 just build a good search engine backend + let the model query it

spring wave
#

fyi: working on metrics for token usage now

#

seems really easy to run into token limits, to the point where I'm worried that either a) there's some silly thing consuming a bunch of tokens, or b) I'm being spoiled by AI services/tools (Cursor, Claude) that have partnerships that let them blow past the usual limits for individual accounts

smoky ocean
#

@spring wave fwiw I rarely hit limits on OpenAI. But had to pre-pay $100-ish, that got me to a certain "tier" where the limits are no longer an issue

spring wave
#

I'm trying out Claude 3.7 which has a pretty low limit atm (20k tokens per minute) - can try switching back

#

for that ^ I changed the TUI to show metrics by default, might need more thought (usually it's behind verbosity level 3)

velvet pawn
spring wave
#

ah interesting. maybe we should support that?

velvet pawn
#

It's "OpenAI API compatible", so as long as you can change the endpoint for openai, then you can point to it.

woeful quiver
spring wave
smoky ocean
#

@spring wave also make sure to tell Claude to "shut up and just do the thing", by default it will comment in detail on everything it's doing which wastes output tokens

shrewd ermine
spring wave
#

anthropic caching

smoky ocean
#

Not sure where those HTTP requests are coming from... verbosity=1, version=llm.3

shrewd ermine
#

I decided they're probably parts streaming from the llm's api, but that was my own theory ๐Ÿ™‚

spring wave
#

those might be docker image layer pulls

shrewd ermine
#

I don't think I've seen them on verbosity=1 before

smoky ocean
#

If someone is looking for a low-hanging fruit contribution: I would like to bring back the ๐Ÿค–๐Ÿ’ญ span during API round trip

--> Alex beat me to it ๐Ÿ™‚

#

It got lost in the refactors

smoky ocean
spring wave
#

still chewing on it though - it's unfortunate that it breaks the tool call chains

smoky ocean
steep onyx
#

Dropping random idea for a dagger agent: AI-driven "fuzz testing" of your app. I feel like one of dagger's superpowers is being able to easily write e2e tests that would normally be a nightmare to create thanks to all the primitives in the core API + external modules.

So idea would be to write an agent that's given a set of tool calls that the LLM can use to create stress/fuzz tests. I.e. "create a nightmare filesystem of complicated files/directories/symlinks/hardlinks and tests that verify my code works as expected with it", "spin up these five services with dependencies between them and then assault it with invalid requests, make sure it doesn't crash", etc. etc. All of those sorts of things are possible with the dagger core API, this would just automate creation of the scaffolding code around it.

Would love to give it a try if/when I have time but feel free to nerd snipe ๐Ÿ˜„

steep onyx
smoky ocean
smoky ocean
shrewd ermine
spring wave
#

๐Ÿšจ PSA: all shell work is now merged into main and I just force-pushed a rebase to llm /cc @smoky ocean @shrewd fern

spring wave
smoky ocean
#

I guess it's an overall UX problem of "I used to know the overall context at a glance, and now I don't". I think it's a combination of:

  • Loss of "depth" information: nested ๐Ÿง‘๐Ÿ’ฌ and ๐Ÿค–๐Ÿ’ฌ are flattened
  • Loss of ๐Ÿค–๐Ÿ’ญ
  • Loss of ๐Ÿค–๐Ÿ’ป
  • The occasional extra verbosity
  • maybe dropped function call spans?

Somehow the combination got me past the threshold of "I don't feel like I know what's going on exactly at all times, and I don't feel confident that the audience will either - which makes my demo less cool"

But I guess I shouldn't pinpoint one specific thing

smoky ocean
#

@spring wave I'm starting an eval thread, going to run the same reference workflow and record it, then point out issues

spring wave
#

Pushed the token counting + Anthropic caching.

I added token counting for OpenAI too but it always returns 0 for me thinkies - and there's a != 0 check so nothing is actually emitted. But, if it returns the token count for you, should work. Don't know what's up with that.

smoky ocean
#

Thank you!

#

Anyone have a gif for me? ๐Ÿ™‚

spring wave
#

between tasks atm - any preference?

  • what MCP clients support dynamically changing tools
  • implement graphql bbi
  • propagating stdout/stderr from module return value to LLM
  • what's using 10k input tokens
smoky ocean
spring wave
# smoky ocean I think stdout/stderr is the most urgent (sorry...) because it breaks my workhor...

Yeah no makes sense - I might have an idea that would help without requiring the frontend to strip the data out, so I'll look into that first, and if that starts taking too long I'll shift gears to just yeeting it into the string and stripping them in the frontend. (Though, even still it goes pretty strongly against OTel convention to put such large values in span errors)

the tl;dr is module-returned errors currently lose all the extra data we return from the API (in this case, stdout/stderr). Which is a shame since Errors are full objects and it'd be easy to attach more data to them. Then we just need to change the LLM implementation to also return all of that data, just like I hacked in for ExecError

#

is there a way to tell if you've been accidentally making an asciinema recording for hours without playing Ctrl+D russian roulette

smoky ocean
#

ps? ๐Ÿ™‚

spring wave
spring wave
# smoky ocean Anyone have a gif for me? ๐Ÿ™‚

here's the current state of llm: https://asciinema.org/a/wk1VtsAjThkyuZcG8rtpN13Qq

  • token usage metrics
  • message spans now show their duration
  • anthropic caching, saving ~12k tokens per roundtrip near the end, which is a ton considering 3.7-sonnet is limited to 20k per minute
  • now shows the "thinking" span always, instead of only when there's a message
    it gets pretty slow by the end, unsure why, maybe those cached tokens still take a long time for the LLM to process each time, they just don't count towards your limit? shrug
spring wave
# smoky ocean I think stdout/stderr is the most urgent (sorry...) because it breaks my workhor...

pushed a commit - wanna check when you have time? i tried it with your repro and it seems to work now. so far it's only implemented for Go SDK

GitHub
  • add Error.withValue which adds a named arbitrary JSON value
  • Error values get translated to GraphQL error extensions
  • SDKs convert GraphQL extensions back into proper Error
  • LLM now prints all...
smoky ocean
shrewd ermine
#

ok this is clearly not correct yet but it's starting to get pretty close

spring wave
shrewd ermine
smoky ocean
#

@tigran_iii sonnet: reliable workhorse, if a task is very well defined and i have a clear outline of how it should be done but i need something that writes code extremely well, perfect
grok: big model smell but undertrained, it sucks alone but if i have something that is difficult, i have a

smoky ocean
#

@spring wave got a panic in TUI ๐Ÿงต

smoky ocean
shrewd ermine
#

nice - if it helps any, here's my prompt

smoky ocean
#

Starting a thread for llm.4, seems worth it!

smoky ocean
#

Thread about human-in-the-loop

smoky ocean
#

๐Ÿšจ๐Ÿšจ๐Ÿšจ new release v0.17.0-llm.4 is out.

To install:

export BIN_DIR=/usr/local/bin # modify as needed

curl -fsSL https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION=0.17.0-llm.4 sh
shrewd ermine
#

@spring wave do you remember what your change was when I was trying to enable the host stuff? I just ran into the same issue when trying to pass tcp://localhost:3000 as a dagger.Service through the cli (on llm.3)

! query{host{service(host:"localhost", ports:[{backend:3000,frontend:3000,protocol:TCP}]){start}}}
! error: parse selections: parse field "service": init arg "ports" value as dagql.DynamicArrayInput ([PortForward!]!) using dagql.DynamicArrayInput: assign input object "Frontend" as {Elem:0 Value:3000 Valid:true} (dagql.DynamicOptional): cannot assign dagql.DynamicOptional into field of type *int
#

I'll try bumping to llm.4

#

yeah still have it on llm.4 with in both call and shell

shrewd ermine
#

๐Ÿ‘† this is a broken thing on the llm branch that needs fixed before we can merge

#

I started that branch from scratch 3 hours ago ๐Ÿคฏ

smoky ocean
#

boom boom boom

#

how many LoC?

#

@shrewd ermine want to maybe make it a separate repo? Since we're going to spin back out next week

shrewd ermine
#

It's inflated a bit because it also includes functions to run the game server and web client. Still comes in around 50

#

Yeah I'll add it to my agents repo, one sec

#

the agent itself is entirely included on the screen. Yeah the game logic is in the "tictactoeClient" but that just provides the tools "read" and "move" to the llm

#

just realized there's a bug making this less cool but it's not super noticeable in the video

#

actually I'm going to record it again because it's way cooler with the bug fix

smoky ocean
#

random thought: how do we get in on Levelsio flightsim action ๐Ÿ™‚ https://x.com/levelsio/status/1895543217460035895

It's still going!

https://x.com/levelsio/status/1895595750215860384

Added red visuals now if you're getting hit

This is so cool ๐Ÿ˜Š

@danperks_ from @Cursor built an actual working live radar screen for my flight sim:

https://t.co/ET5ITHaIDM

It shows everyone online live flying around with their altitude, speed etc.

I will try to add this as a minimap in the game too!

shrewd ermine
#

haha I used to mess around with NNs in flight sims and I could totally hook an agent up to one now. A lot of them have great APIs

smoky ocean
#

What if we did a silly pipeline integration. Like watch the event stream and push it to github; or to a database or something

#

or, maybe more obviously plug it into an agent? But the problem is that it's very streaming-oriented, dagger might not work great for this. or maybe it will?

spring wave
#

spooky dagql code

smoky ocean
#

๐Ÿ˜ญ worth a try

shrewd ermine
#

there's a TUI issue I've been struggling with for these. You can see in my recording earlier that the function returns string (the output of histor()), but all I see in the terminal when the call completes is the traces. I think it puts my function's output somewhere in scrollback but it's not immediately visible

spring wave
#

noticed that too - looking into it now, think i know why it happens but not sure what can be done about it yet

shrewd ermine
smoky ocean
#

Also @shrewd ermine can you spice it up and make the model explicit? You've got the multi-model setup and python makes the model optional argument look less lame than Go

#

That way we can say "Kyle plays tic-tac-toe with <model>"

#

Question is: which model? ๐Ÿ™‚

#

llama? Can try to get another retweet from ollama

shrewd ermine
wraith remnant
#

new readme is amazing

shrewd ermine
shrewd ermine
smoky ocean
#

Maybe gemini, since Jaana was so nice and gave such fantastic feedback yesterday ๐Ÿ™‚

#

also we have that native client now, let's flex it

#

@shrewd ermine I know it's getting late for you. Are you OK to re-record it with gemini?

shrewd ermine
#

sounds good! yeah totally. I would just run llm first but now my tui shows this instead of the nice model output we had before

spring wave
#

what if when something is running we pop the terminal into an alternate screen mode instead? (same way vim/htop/etc work)

i think that might be the only way to preserve prompt position thinkspin - will keep looking in a bit

spring wave
shrewd ermine
shrewd ermine
#

it didn't just let me win on the 3rd turn that time either lol

smoky ocean
#

Can you have two models play each other? ๐Ÿ™‚

#

I might try that

#

(I don't mean record it now - just curious if the module would support it)

shrewd ermine
#

sure can! it's supported

#

actually needs a small tweak to the prompt because right now I tell it that it's player O but otherwise supported

#

tictactoe is kind of boring though. I was thinking about hooking it up to something more interesting like chess, go, or civilization ๐Ÿ˜†

smoky ocean
#

@outer moth ๐Ÿ‘‹ welcome ๐Ÿ™‚ I will start a thread here to discuss possible fun integrations between Dagger and Assistant UI. I know you're busy this coming week so no pressure to reply!

bronze fern
shrewd ermine
#

the waiting part simplified the llm loop a lot because you didn't have to convince it to keep polling until the opponent moves

#

and then gemini did non-dagger parts of the demo too ๐Ÿ™‚

bronze fern
bronze fern
spring wave
# shrewd ermine there's a TUI issue I've been struggling with for these. You can see in my recor...

trying a less invasive fix: instead of flushing command progress to the scrollback immediately, flush only when you run a new command. that way when it finishes you'll see your last command at the top of the screen instead.

building on this, I could have it switch into nav mode while the command is running, and switch back when it's done. someone mentioned being able to type while something is running feels weird, might be nice to be swapped into the usual TUI control mode while you wait, since it doesn't support running multiple concurrently anyway. but, sometimes it's nice to pre-type the next thing to run. shrug (i guess that's an i keypress away? also, how we feeling about that keybind?)

shrewd ermine
#

Sounds good. I don't feel like my issue was flushing output but maybe we're saying the same thing. I wasn't actually seeing my output at all but the trace output was still there

spring wave
# shrewd ermine At the very end of this clip

oh, in that clip it looks like you switched from input mode to nav mode and stayed in that mode, which is what keeps the trace nav onscreen - did you expect to be switched back to input mode when it completed?

#

i'm fixing a different issue then, but it sounds related to the second thing I mentioned

#

the issue i'm fixing is that when you run a command that fills your screen height, after it exits the screen is just blank with your prompt sitting at the top

shrewd ermine
#

Yeah I cut the recording too short to tell but it just sits on that last frame when it's done

shrewd ermine
spring wave
#

ah maybe because right now logs are printed above the child spans - so it was offscreen

#

maybe they should be below? hyperthinkspin

shrewd ermine
#

That's what I was expecting but maybe I'm holding it wrong?

spring wave
#

hard to say, from the clip it was still running so everything there seems expected to me, and from my testing it does collapse when it's done

#

either way, pushed a fix for the thing i mentioned at least

shrewd ermine
noble notch
#

Consider submitting each of these as Show HNs, not just to Twitter. Just need to have a Readme (with a gif/video or interactive demo)

smoky ocean
#

Demo idea: a docker registry where you describe the image you want in the address, and it creates it for you on the fly

docker pull "<registry>/an image with latest cuda and pytorch installed please"

#

call it the magistry ๐Ÿ™‚

quiet ether
#

I'd have expected that "initialize a container from an alpine image" was going to automatically give me a container but my Anthropic llm seems to insistently trying to use docker for that.

โœ˜ llm | with-prompt "initialize a container from an alpine image" | container | terminal 4.1s
โ”‚๐Ÿง‘ initialize a container from an alpine image
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿค– I'll help you initialize a container using the Alpine Linux image. Alpine is a lightweight Linux distribution that's popular in containerization. Here's the command to do that:
โ”‚ โ”ƒ
โ”‚ โ”ƒ docker run -it alpine
โ”‚ โ”ƒ
โ”‚ โ”ƒ This command will:
โ”‚ โ”ƒ
โ”‚ โ”ƒ โ€ข docker run : Create and start a new container
โ”‚ โ”ƒ โ€ข -it : Provide an interactive terminal
โ”‚ โ”ƒ โ€ข alpine : Use the latest Alpine Linux image (it will pull it automatically if not present locally)
โ”‚ โ”ƒ
โ”‚ โ”ƒ If you want to run it in detached mode (in the background) instead, you can use:
โ”‚ โ”ƒ
โ”‚ โ”ƒ docker run -d alpine
โ”‚ โ”ƒ
โ”‚ โ”ƒ Would you like me to explain any additional options or would you like to do something specific with the Alpine container?
โ”‚ โ”ƒ 4.1s โ—† LLM Input Tokens: 31 โ—† LLM Output Tokens: 1

am I missing something here?

smoky ocean
quiet ether
#

I was trying for the model to give me a container without having to supply a with-container myself

quiet ether
quiet ether
# smoky ocean you need to give it access to an object (Container or otherwise)
โœ” container 0.0s
Container@xxh3:6934f6e558023746

โœ˜ llm | with-container $(container) | with-prompt "initialize a container with an alpine image" | container | terminal 8.2s
! input: llm.withContainer.withPrompt.container no response from model
โ”‚๐Ÿง‘ initialize a container with an alpine image
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿค– I can help you initialize a container from an Alpine image using the  from  function. This function requires an  address  parameter specifying the image's address from its registry.
โ”‚ โ”ƒ 3.5s โ—† LLM Input Tokens: 11,698 โ—† LLM Output Tokens: 88
โ”‚
โ”‚ โœ” Container.from(address: "alpine"): Container! 2.3s
โ”‚
โ”‚๐Ÿค– 2.4s
โ”‚ ! no response from model

๐Ÿค”

smoky ocean
smoky ocean
quiet ether
#

also 11k tokens just for that ๐Ÿ˜ฌ

quiet ether
smoky ocean
#

mmm no it only gets the equivalent of .help with arg schema.

#

but I guess it's a lot of functions

#

also claude talks alot by default

wraith remnant
#

Cooking with @sand knot @stray ice

smoky ocean
#

File edits ๐Ÿงต

merry scarab
smoky ocean
smoky ocean
# smoky ocean Finally getting around to posting this on social ๐Ÿ™‚ ๐Ÿ™

Re: cached tokens still slowing things down. @void flint explained to me that cached tokens still have to be run through a GPU, to rebuild the inference state - it's not like a traditional cache lookup where cost is zero. It's cheaper but still you have to re-run the whole sequence through a GPU. So maybe that's what we're seeing

spring wave
#

got it, that does feel like what was happening

devout magnet
#

What's the status of Dagger's dagger watch or shell reloading features for AI agent workflows? I'm planning a GitHub Actions pipeline where a Dagger-powered AI agent modifies files based on LLM responses and then re-runs tests. Does this require dagger watch or shell reloading, or can it be done with current Dagger functionality?

bronze fern
# devout magnet What's the status of Dagger's dagger watch or shell reloading features for AI ag...

You could keep your working directory state inside of the Dagger pipeline in a Directory or Container in a "workspace" until you have passing tests and then as a final step create a PR with that working state, or publish a container image, or export it locally, or anything else you like. Since you don't need to use your laptop filesystem for a workspace, you don't need the watch/reload.
Good examples here: https://docs.dagger.io/ai-agents#examples

State in the golang Container: https://github.com/dagger/agents/blob/main/toy-programmer/toy-workspace/main.go#L12
Updated as it changes: https://github.com/dagger/agents/blob/main/toy-programmer/toy-workspace/main.go#L33

State in an empty Directory: https://github.com/dagger/agents/blob/main/melvin/workspace/main.go#L15-L18
https://github.com/dagger/agents/blob/main/melvin/workspace/main.go#L35

river belfry
#

I played again with the AI agents ๐Ÿ™‚

On the things that works well, the harder was to have the workspace ๐Ÿ˜… then it's quite easy to just wrap a prompt in a call. And more than that, once you have the few modules for your case (so for instance java tooling here) then it's very easy to create new functions when needed. One other good thing is it mixes modules in multiple languages. The AI stuff is in Go, is uses under the hood the java module in java. It works ๐Ÿ™‚

No idea where this stuff is going, but at least that's fun!

shrewd ermine
river belfry
smoky ocean
#

Nice @river belfry ! Glad you're seeing the "fun" part, it's actually very important for early tech waves ๐Ÿ™‚ fun brings more people

shrewd ermine
smoky ocean
#

re workspace: I'm adding a shell() function today

river belfry
#

One thing really good for sure is this aspect of workspace/checker: the ability to have the llm running the commands to build/test without to have to code them. It's roughly a line in the prompt saying it has to check it's working. On the developer side it really cool, very convenient. And way better than to have generated code that doesn't build for instance.

woeful quiver
shrewd ermine
woeful quiver
shrewd ermine
# woeful quiver curious what you mean by stdlib?

In general it means a module or set of modules that provide very basic utilities on top of the core API that would be useful in most modules. Things like concurrency helpers, more advanced file ops / git ops, common cookbook patterns. Kind of one step beyond "featured" daggerverse modules

#

๐Ÿฆ™ now that we have anthropic and gemini routing built in, we should split out ollama too. It's currently leaning on openai configs, which works but it means you can't use both openai and ollama together. We don't necessarily have to use the official client, the openai client seems to work fine. It'll mean an env change for ollama users, so we should do it asap

smoky ocean
#

I don't see workspace becoming part of the core API. Actually I'm trying it to make it more clear in the demo that it's a user-defined type, because that's the most powerful part, and the hardest to understand

#

Re: stdlib. Maybe, but not 100% certain either. We haven't really figured out the patterns yet, I don't know if we can encapsulate everything in a single stdlib type

#

seems more like a pattern to me

shrewd ermine
#

Yeah if anything I could see a stdlib/featured interface that most workspaces would be extending. But not a usable workspace itself. But that currently wouldn't be possible anyway

smoky ocean
#

workspace + shell = DX breakdown

steep onyx
#

Is the LLM PR in a good place to review now? I started the other day but then held off after hearing I should wait cc @spring wave

smoky ocean
#

it. there's some git cleanup to do

spring wave
smoky ocean
#

Teasing MCP support (thank you @spring wave for the POC ) https://x.com/solomonstre/status/1896634723684004013

Update: many questions about MCP support. There is now a PR ๐Ÿ˜‡ link below.

1. Develop agent state & functions (the robot body) in Dagger
2. Connect to LLM endpoint directly, or to monolith via MCP or generated clients

We want Dagger to adapt to your existing stack

#

@shrewd ermine @woeful quiver @dense flare I feel like the next few demos should focus on something more impressive on the functionality side - something that is super hard to do without dagger, but super easy to do with dagger. (ie. my "magistry" idea as one example).

shrewd ermine
woeful quiver
shrewd ermine
#

The 'magistry' thing sounds actually really easy to implement and potentially a very sweet demo

smoky ocean
#

just be careful to not fall into the trap of making something that looks really impressive to us systems people discovering AI, but looks super trivial to AI engineers. You want the other way around ๐Ÿ™‚

shrewd ermine
#

Yep exactly. I was brainstorming ideas to iterate on the tictactoe demo and realized they all fell into that category

smoky ocean
#

10000x bonus if you find a way to tag along on levelsio flight thing ๐Ÿ™‚

#

What vercel did was insanely smart and something to aspire to @noble notch @bronze fern

#

(the aliens)

shrewd ermine
dense flare
steep onyx
bronze fern
steep onyx
#

Has anyone else struggled to get LLMs to know how to use slices as args correctly? I was just playing with as trivial an agent as I could think of to start, where I initialized the workspace with []*dagger.File and gave it a grep tool to search through a single dagger.File at a time, but it consistently tries to give a whole list of []*dagger.File as the arg for a single file, even when I yell at it in all caps in the prompt.

Works like a charm if I change my types to just be a single *dagger.File.

Just wondering if the LLM is being dumb vs. I'm hitting some limitation in what we allow it to do.

dense flare
#

tool_call_id error

smoky ocean
#

When you're trying to remember how to split a gif from the command-line, ask ChatGPT, doesn't work on the first try, you're going to copy paste back and forth...

Then you remember you can just write a one-liner agent command with dagger ๐Ÿ™‚

#

"you do it"

#

So.... Goose is basically an open source, multi-model clone of Claude Desktop.

Should we make an open source, multi-model clone of Claude Code?

smoky ocean
#

Also @spring wave - should we cut a llm.5 with the token counting + claude token caching? (thats the gif I'm trying to split btw) ๐Ÿ™‚

wraith remnant
#

What if, what if, we tried to make a SOTA agent with Dagger agents ? Looking for my next objective

https://x.com/skcd42/status/1867561928165560381

With o1 models and many providers doing native inference time scaling, the writing is on the wall that spending more compute on inference leads to better results on any given task. Alpha (Go|Code|Maths|Chess) have already proved that for their respective domain.
With our SOTA

smoky ocean
shrewd ermine
#

๐Ÿ™ i'm also working on a bug fix for the gemini client that I ran into but don't wait for me

smoky ocean
#

e2b ๐Ÿงต

spring wave
#

openai returns 0 for token usage for some reason

smoky ocean
#

Sorry I meant - the agent is failing to accomplish the task ๐Ÿ™‚

#

0.17.0-llm.5 ๐Ÿงต

uneven depot
shrewd ermine
#

oh, in that clip it looks like you

dense flare
#

Is it normal for the same input to the same model to produce widely differing solutions?

dense flare
wraith remnant
dense flare
smoky ocean
#

also: welcome go AI engineering ๐Ÿ˜

dense flare
# smoky ocean we don't, but should

when responses are inconsistent - is it better to try different models with the same prompt, or the same model with iterative changes to the prompt?

dense flare
#

when responses are inconsistent - is it

dense flare
#

debugging workspace

uneven depot
quiet ether
spring wave
wraith remnant
# spring wave Yeah I don't think I'd ever want AI directly in the path to production for a ser...

I guess I'm on the disagreeing side -- (I agree, just temporarily). The best parallel is autonomous cars: we don't want them to drive until proven that it just works.

It's just gonna be a continuum: from assisting on the highway to FSD driving end-to-end on your vacation, and, at some point, we have enough data to just let the car drive, then laws will forbid humans to drive on some lanes on the highway ๐Ÿคฃ

uneven depot
#

The parallel to autonomous cars is interesting. However, a vast majority of people don't trust autonomous cars or want to go anywhere near them still. Granted there's slightly less risk with AI in software dev based on the area but it's still a risk. To convince developers to use it, we have to prove that it can be somewhat consistent (I won't say deterministic) to be trusted. But we aren't there yet. Hopefully Dagger can help us get there ๐Ÿ™‚

spring wave
#

to me it's more about reproducibility and isolating from external dependencies in the critical path to production

#

fully automating to that level seems like there's a lot to lose or re-invent and very little to gain, vs. just having the same AI generate code for you

uneven depot
#

yes agree, reproducibility is key. I say this as an owner of a Tesla with FSD which I use for my daily commute. It's gotten pretty good at taking me to work every day without much variance. So, i think software dev can get there too. But there's a long way to go. Tesla has an insane amount of targeted, real world data to work with.

smoky ocean
#

I take Waymo several times a week btw and it's basically flawless, fyi ๐Ÿ™‚

uneven depot
#

I've seen those. It's funny how what you are doing with Dagger is providing the tools to the AI like Waymo does with it's gadgets on the car

shrewd ermine
#

An AI agent doesn't need to be in the path to production to be helpful. Your CI is basically validation. An agent can help build that validation and diagnose if something is wrong with it, but it doesn't necessarily need to be in the critical path to do those things. It can asynchronously do these tasks on broken PRs or opening new PRs to solve issues / expand validation

uneven depot
smoky ocean
#

QA is a great use case for insertion in a production CI IMO

wraith remnant
smoky ocean
#

There are also several startups building "unit tests generator agents". For example Tusk https://www.usetusk.ai

wraith remnant
shrewd ermine
#

It's not just trust though, it's also cost. Do you really need 10 llms to run on every commit in github, or is it something that runs only after the tests fail in a certain way or when a qa label is added

devout magnet
#

I may be wrong, but such features can help to make everything fully automated like SWE agent, we can trigger github action pipeline on any type of issue creation like bug or security to run the pipeline and fix the issue and commit, release and closing the issue, many things can be fully automated, so much potentials...
everything is powered by dagger and ai agents with dagger...

shrewd ermine
#

Yeah, this kind of automation is already possible - it's what we all already do in CI. We're just adding a new primitive to the toolbox for what can happen in the pipelines responding to VCS events ๐Ÿ™‚

smoky ocean
#

Quick update on workstreams relevant to agents. Let me know if I'm wrong or missing anything.

  • @spring wave is looking into Claude Code and how Dagger might integrate
  • @wraith remnant is working on his e2b project with @warped bramble , and exploring n8n integration as a proxy for "human-in-the-loop" and framework integration
  • @hidden tartan is working on generated clients (also needed for framework integration)
  • @shrewd fern is stabilizing the shell
  • @shrewd ermine @woeful quiver @dense flare @bronze fern @storm gate are building demo agents & related devrel content
  • @spark phoenix is focused on daggerizing swebench
  • @smoky ocean is trying to ugrade BBI to multi-object + taking mcp/bbi from @spring wave so he can focus on claude code

@lost topaz @quiet ether are catching up and are available to help -> correct? ๐Ÿ˜‡

steep onyx
smoky ocean
#

๐Ÿšจ๐Ÿšจ New release: v0.15.0-llm.5

shrewd ermine
#

After digging into the actual code some

wraith remnant
#

mcp

smoky ocean
#

It's a great privilege to be able to follow @spring wave 's traces live as he develops ๐Ÿ™‚ ๐Ÿฟ

spring wave
smoky ocean
#

Later:

๐Ÿ‘€
Remember this value: <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ" target="_blank">potato</a>

The official video for โ€œNever Gonna Give You Upโ€ by Rick Astley.

Never: The Autobiography ๐Ÿ“š OUT NOW!
Follow this link to get your copy and listen to Rickโ€™s โ€˜Neverโ€™ playlist โค๏ธ #RickAstleyNever
https://linktr.ee/rickastleynever

โ€œNever Gonna Give You Upโ€ was a global smash on its release in July 1987, topping the charts in 25 countries includi...

โ–ถ Play video
shrewd ermine
spring wave
spring wave
#

if there's a prompt engineering tip for that, it would be such a huge improvement

shrewd ermine
#

gemini was totally fine with it, but when I switched back to ollama qwen it would take the doc as a brand new assignment

spring wave
#

the only trick i really found was to add error checks before tool calls so they'd just fail until the required README was read

#

but, it's really clunky (the user sees errors, etc)

shrewd ermine
#

mine was good about looking up the information every time, but it would forget about it's original mission once it received it. Probably something about context windows ๐Ÿคท I'm learning

spring wave
#

Getting there... https://asciinema.org/a/CPEvr96l5dYoAyKodHdUiaKYZ /cc @smoky ocean (you can ignore the second half, forgot it was a cold cache ๐Ÿ˜… and then it blowed up)
Corresponding trace (TODO fix infinite pending message bubbles)

So far I've added a dagger llm command, which is like dagger shell but accepts text prompts instead. Under the hood it just does a Llm.withPrompt(...) loop, syncing after every input.

You can use /with to run a shell command and set its result as the LLM context, e.g. /with container | from alpine will do Llm.withContainer(...), or /with . will pass the current module to it, etc.

spring wave
#

dagger llm

wraith remnant
#

Grok 3 with thinking is just soo good with Dagger

#

One shotted this for my personal project:

name: Daily Scrape, Transform, and Publish

on:
  schedule:
    - cron: '0 0 * * *'  # Runs daily at midnight UTC
  workflow_dispatch:     # Allows manual triggering

jobs:
  daily_workflow:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Run Dagger pipeline
        uses: dagger/dagger-for-github@8.0.0
        with:
          verb: "call"
          module: "."
          args: "daily-workflow --code_dir=. --mistral_api_key=env:MISTRAL_API_KEY --orama_api_key=env:ORAMA_API_KEY"
          workdir: "."
        env:
          MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
          ORAMA_API_KEY: ${{ secrets.ORAMA_API_KEY }}
#

Just perfect

austere quest
#

awesome ๐Ÿ˜

dense flare
spring wave
#

merged dagger llm in to llm cc @smoky ocean

spring wave
#

also - I undid that change to keep the last command in the 'active' area after it completes, because the fact that it cut off the top with no way to read it felt worse than the behavior it intended to fix (having the output offscreen in the scrollback in some scenarios). /cc @shrewd ermine

#

(and also it wasn't the issue you initially ran into anyway, was just something else i noticed)

spring wave
smoky ocean
#

we hould try prefixing it with <type>://ID. eg. Directory://xxh3:...

spring wave
#

Oh yeah, maybe we can reuse the format from the CLI (Directory@xxh3:...) - but only if it solves the problem, URI scheme might be clearer to the LLM. I'll try that out

#

also, idea for dagger llm: /checkpoint foo => saves the current Llm object + state, so you can go back to it and continue later, sort of like rewinding the dialogue tree

smoky ocean
#

yes I've been saved once before (last year) by url scheme format being clear to LLM

spring wave
#

tried adding Foo@... prefix but then it just dropped the Foo@ prefix when it called back thinkspin - gonna try just adding something to the description, like "This function returns an object of type Foo"

#

getting a lot of Overloaded errors from Anthropic today

shrewd ermine
#

@spring wave running on a commit f3bccd836b5eb6d6e671d3bfa235aea19f4bfa21 (from 24h ago). Looks like something yesterday introduced a double output. Should I pull latest or is it still TODO?

spring wave
#

pull latest - should be fixed

shrewd ermine
#

yep looks good, thanks!

smoky ocean
#

0.17.0-llm.6 ๐Ÿงต

smoky ocean
#

๐Ÿšจ ๐Ÿšจ ๐Ÿšจ new release: 0.17.0-llm.6

This one has a new goodie... Which I highly recommend trying ๐Ÿ™‚

dagger llm --> our take on the Claude Code command-line experience... ๐Ÿ˜‡ courtesy of @spring wave

Try it and tell us what you think!

#

@spring wave btw you probably want a slash command to clear the context

#

@spring wave how do I get a list of slash-commands? I typed /help but the llm answered ๐Ÿ™‚

spring wave
#

Ok, the title of this one is a bit of a mouthful: "LLM calling LLM using a Dagger GraphQL MCP server module called via the flat BBI"
https://asciinema.org/a/nNjuad4p7WQEiQVXGTK66rMOH
https://v3.dagger.cloud/dagger/traces/f30fd419bb059cac83251fb3a91bb2f0?listen=bdaed855ce2487a8

I was going to revive the GraphQL BBI because I wanted to see if it would be any better at "thinking ahead" (#1346945909837008997 message), but then had an idea to implement the BBI as a Dagger module instead. Worked out pretty well I think.

spring wave
#

i got tab completion and completion checking working at least

smoky ocean
#

ok ๐Ÿ‘

smoky ocean
#

or orthogonal?

#

(I don't have a full handle on how tbh... It's like a halfway between BBIv1 and your gql approach)

spring wave
smoky ocean
#

I think I'm going to add LLM.withModule(string) to get rid of the annoying limitation where I have to install a dependency (and reload schema) before I can give my LLM a custom object

#

@spring wave but I don't want to lose "give any object instance to llm, not just a stateless module entrypoint"

spring wave
#

Oh I misread, dunno how it affects multi object

smoky ocean
#

just wondering if you discovered a new state-of-the-art approach to Dagger/LLM integration, that may invalidate some of our assumptions in ongoing BBI work

spring wave
smoky ocean
spring wave
#

I don't love the DX though, lots of GraphQL cruft, would be nice to just see the calls immediately without having to expand
(brb)

smoky ocean
#

LLM.set<Foo>(key string, value <Foo>) set the variable key to value, available to the LLM.

woeful quiver
shrewd ermine
#

Right now we're using the generative API / models for every provider because that's what agent and tool calling use cases need. Not to say that can't be supported but it would be something much different from what we have today I think

woeful quiver
#

FYI, I'm on llm.6 and following the example in the docs for the go-program - getting the following:

! could not parse arguments for function "go-program": missing 1 positional argument(s)

smoky ocean
#

Oh man it's my required arg

woeful quiver
#

Also, might be the Zed terminal - but its really hard to copy anything above the last output from the TUI - meaning almost impossible

smoky ocean
#

fixing now

smoky ocean
woeful quiver
woeful quiver
smoky ocean
#

fixed

#

shame-driven development ๐Ÿ™‚

woeful quiver
#

No shame here! Was just walking through the docs before the meetup presentation next week ๐Ÿ™‚

stray ice
#

nested llms is crazy

#

wtf

jaunty iron
# smoky ocean fixed

Oh man! That's why I got the error during the live as well!๐Ÿ˜‚ I thought it was me, now I can tell the truth! ๐Ÿ˜‚

jaunty iron
#

Eheh, no worries, I'll edit the recording... but it was funny, because I literally tried it 1h before and it was working

smoky ocean
#

Holy ๐Ÿ’ฉ๐Ÿ’ฉ๐Ÿ’ฉ๐Ÿ’ฉ dagger llm supports adding modules @spring wave what have you done

#

OK i have to share a recording of this...

smoky ocean
gloomy kindle
#

just a heads up, i'm gonna be pushing some stuff today to start prepping for merge - the idea is to get ci completely running and green by EOD!

#

i'm breaking down any/all changes into smaller pieces, with commit descriptions, so if any of the pieces look controversial or i remove or rename something that looks sus, then it can be reverted easily ๐Ÿ˜›

spring wave
smoky ocean
#

@gloomy kindle heads up there's probably a merge conflicts in our near future

#

shouldn't be too bad but depends how far you take your refactoring ๐Ÿ™‚ can you let me know when you push?

gloomy kindle
#

lol i've already solved two today ๐Ÿ˜ข

#

why not one more ๐Ÿ˜›

#

working on fixing the current one right now

smoky ocean
#

or ideally show me in advance, if you have a wip branch

gloomy kindle
#

i'm pushing one commit at a time, since it's a bit time consuming to run all ci locally

smoky ocean
#

getting to merge ๐Ÿงต

smoky ocean
#

Terminal access ๐Ÿงต

quiet ether
#

๐Ÿ‘‹ the dagger CLI doesn't support stdin, correct? i.e: I'd like to do something like some_command_useful_output | dagger -c "llm | with-prompt "blablabla" | with-input? so I can pass it to Dagger

smoky ocean
#

not sure how

quiet ether
wraith remnant
woeful quiver
#

When using a module with constructor args in shell, I'm not seeing the arguments documented in .help - makes it really difficult to determine the syntax and discover whats available

shrewd ermine
#

it looks like the way to see those is .help ., but I agree it should come up with just .help

smoky ocean
#

@quiet ether do you want the stdin stream or a buffered string is enough?

quiet ether
woeful quiver
# smoky ocean Can you paste the exact command?

Been thinking on this a bit, is there any reason to not show the help by default if there is an error with the input (missing args). That seems to be normal behavior for a lot of CLI programs and that would give the user some feedback on what they did wrong

smoky ocean
#

No reason not to

spring wave
#

@shrewd ermine any reason not to set up streaming for Gemini? noticed it's missing the fancy telemetry

#

can work on that now if not

smoky ocean
#

@spring wave separate telemetry question: I notice that the custom span for tool call ๐Ÿค–๐Ÿ’ป is still "old style" whereas the one for prompts and replies are using the new fancy UIActorEmojiAttr attribute. Should I change it over to do the same, or is there a reason it's been left behind?

#

context: trying to debug my llm-multiobject branch

spring wave
smoky ocean
#

Ah right

#

So now that tool calls don't map 1-1 to underlying function calls (at least in my branch) not sure what to do

#

mmm actually I do know. I have a half-dozen builtins, which I can show as special "ui emojis" messages. They don't have children so no need for pass through.

Then for regular tool calls, those do map 1-1, so I can keep the passthrough

spring wave
smoky ocean
#

oh I see. just add my custom span in my builtin function

#

and it will benefit from the pass-through, like regular function calls

spring wave
#

yep

smoky ocean
#

cool cool

#

@spring wave when using "UI actor emoji", can I still encode the actual tool call in the span, without relying on streaming content? Do I just put that in the span content? For the prompt that value is just "LLM prompt" not sure if it gets rendered anywhere?

spring wave
#

are you trying to put an emoji next to something that's not a message bubble?

smoky ocean
#

Yeah

spring wave
#

ah right now they go hand-in-hand

smoky ocean
#

Should I just make it a regular span with an inline emoji?

#

like the good old days (aka 2 weeks ago ๐Ÿ˜› )

spring wave
#

what is the content you want the emoji next to? thinkspin

smoky ocean
#

builtin name & args

spring wave
#

assuming it's not a function call or message bubble

#

ah ok - so like a shell command?

#

(kind of?)

smoky ocean
#

eg :

_load base

_save myctr

_objects

_scratch

#

(sadly .foo is not allowed, so I went for _)

spring wave
#

eh yeah you can just put the emoji in the name for now

#

if i find tiem i'll update the UI to support the emoji attribute

smoky ocean
#

So far I have 7 builtins:

  • _save: save the current object to a variable
  • _load: load a variable and make it the current object
  • _undo: undo the last action
  • _scratch: clear object selection
  • _type: show the type of a variable
  • _objects: list saved objects and their type
  • _current: print the current object ID
smoky ocean
smoky ocean
spring wave
#

side note: cursor's terminal does NOT like markdown output. grinds to a halt.

spring wave
#

@smoky ocean pushed what I have to get it off my machine, but seeing some TUI jankiness still, feel free to try

smoky ocean
#

@spring wave a /model would be cool

#

to show off the multi-model aspect

spring wave
smoky ocean
#

Yeah I would try to keep it, I think ChatGPT lets you do it (openai models only of course)

spring wave
#

adding support for it, you can always /clear ablobshrug

smoky ocean
#

Good combo with /compact and or /forget

#

ha ha ๐Ÿ™‚

spring wave
#

we should just send the old model history along to the new model

spring wave
#

(the /clear at the end did accidentally make it swap back to Claude, which is fixed now)

shrewd ermine
#

That's amazing

#

so I think we need to implement the real ollama client ๐Ÿ˜ฌ I guess it's not so bad because we're up to 3 official clients now

smoky ocean
#

Oh that reminds me @spring wave - with the multi-model focus, might be cool to show the model name prominently in the visible "identity" of the AI soehow in the trace. So that we can see "Deepseek said this"; "Gemini did that" etc

#

Maybe down the line there's a visual identity thing, like the space invaders but for LLMs ๐Ÿ™‚

shrewd ermine
smoky ocean
#

@shrewd ermine that line is actually prompt engineering ๐Ÿ˜›

shrewd ermine
#

yeah I have something to the same effect in my system prompt. The problem is the container type still gets added to the tools and adds all the tokens

smoky ocean
#

Yeah

#

I think multi-object might help

#

(mm but not really)

#

you're right we need an optional arg to hide fields

shrewd ermine
#

We have something similar in codegen already but it probably doesn't help because I'm guessing this goes in bbi

smoky ocean
#

multi-object ๐Ÿงต

spring wave
#

lemme know if it's annoying in the prompt, it's longer than i thought ๐Ÿ˜›

shrewd ermine
spring wave
#

has anyone tried "write me an agent that ..." yet? ๐Ÿ˜›

smoky ocean
#

then you should quote-tweet my claude code tweet with your video & say something like "this is amazing, I did XYZ"

woeful quiver
smoky ocean
#

then I'll retweet you ๐Ÿ˜

shrewd ermine
spring wave
#

yep

shrewd ermine
#

oh duh I have to rebuild cli facepalm

#

ok so /with directory doesn't take any args... what kind of voodoo magic does it need to say "get . from host?" (that specifically doesn't do it lol)

woeful quiver
#

oh man, running ollama and pgvector to embed documents is the first time I've heard the macbook pro (Apple M4 Pro) with 48GB RAM put the fans in overdrive ๐Ÿคฏ

woeful quiver
spring wave
shrewd ermine
#

is this a model issue or is it missing tools?

โœ” please read the contents of the file ./main.go 0.6s
โ”‚๐Ÿง‘ please read the contents of the file ./main.go
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿค– I am sorry, I cannot read the content of the file. The API only returns a digest of the file.
spring wave
#

that's what i've been seeing with gemini

#

ohhhh, maybe it doesn't support streaming + tools?

shrewd ermine
#

it's able to do directory things once I've done /with directory. Just gets stuck on file from there

#
โœ” what is in the current directory? 1.2s
โ”‚๐Ÿง‘ what is in the current directory?
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿค– 0.5s โ—† LLM Input Tokens: 1,087 โ—† LLM Output Tokens: 1
โ”‚
โ”‚ โœ” Directory.entries: [String!]! 0.0s
โ”‚
โ”‚๐Ÿค– The current directory contains the following files and directories: .circleci, .dagger, .git, .github,
โ”‚ โ”ƒ .gitignore, .gitmodules, DEMO.md, Jenkinsfile, LICENSE, README.md, dagger.json, go.mod, go.sum,
โ”‚ โ”ƒ main.go, main_test.go, and website.
#
โœ” /with container 0.0s
Container@xxh3:6934f6e558023746

โœ” /with directory 0.0s
Directory@xxh3:c02ee2fb89ab4f04

โœ˜ /with file 0.0s
! find module "file": input: moduleSource local path "file" does not exist
โ”‚ โœ˜ looking for module 0.0s
โ”‚ ! input: moduleSource local path "file" does not exist
โ”‚ โ”‚ โœ˜ moduleSource(refString: "file"): ModuleSource! 0.0s
โ”‚ โ”‚ ! local path "file" does not exist
shrewd ermine
#
๐Ÿค– When I try to read the file at ./main.go, my tool encounters an error.
โ”‚ โ”ƒ
โ”‚ โ”ƒ The error message is:
โ”‚ โ”ƒ
โ”‚ โ”ƒ "ID lookup failed: ./main.go"

getting this on ollama too. Tried a few different big models

smoky ocean
#

regression on llm.6?

shrewd ermine
#

this is on llm latest but i'm seeing the same on llm.6

#

Can you try on gpt-4o? Just give it a directory and ask it to tell you what's in a file

#

claude 3.7 is working thinkies

smoky ocean
#

testing

shrewd ermine
smoky ocean
#

perfect ๐Ÿ™‚ ๐Ÿ™

shrewd ermine
#

Nice, something is definitely off for ollama and gemini in this then. I can dig deeper tomorrow!

#

Thanks for testing!

uneven depot
#

Can I use copilot (any supported model) with this?

uneven depot
#

hmm, I don't want to run it locally necessarily. I can't really use bigger models that way. Is it even possible to integrate copilot with this? That's the only tool available to us in the Enterprise.

smoky ocean
#

Ideally you would get a direct access to the LLM endpoint powering your copilot instance. Probably a private Azure Cloud endpoints backed by OpenAI. But it's possible that Github doesn't expose that for anti-competitive reasons

uneven depot
uneven depot
shrewd ermine
#

is the Azure OpenAI Service an option for you? It seems like that's Microsoft's answer to the competition at the moment

bronze fern
#

Heard similar from someone I know at a big shoe company. Not allowed to go to public OpenAI, et al because they don't want employees to accidentally leak private info.

shrewd ermine
#

big shoes huh ๐Ÿ˜‚ I don't know if anyone's tested a private OpenAI setup yet but private Ollama works great

uneven depot
quiet ether
shrewd ermine
uneven depot
woeful quiver
#

Is there a good way to get the response as a string from the LLM? Meaing I gave it a prompt, and want that LLM's response as a string. Does that require I give it a workspace with a write tool?

shrewd ermine
#

llm | with-prompt "who are you?" | last-reply

spring wave
#

pushed iteration #(i lost count) at reasonable scrollback behavior /cc @shrewd ermine @smoky ocean

smoky ocean
#

should we cut a tag to get the latest improvements out ?

storm gate
#

For anyone interested in the Dockerfile-Optimizer agent, I finally have a new version that is smaller and producing much better results via an eval: https://github.com/dagger/agents/pull/20 - feedback is welcome, it's not finished yet (still have to re-wire the feature branch and the opening of a PR to not break the current behavior).

GitHub

๐Ÿšง WIP ๐Ÿšง
This is a complete rewrite of the initial implementation of the dockerfile-optimizer agent.
The new version implements a better pattern that aims for better results in the optimization, lea...

storm gate
smoky ocean
#

@storm gate nice! Is there a way to show the diff? It would make it feel more "real" in the demo than the agent saying "I did it"

bronze fern
#

I added line to my prompt for agent to cat out results to celebrate success. Could run diff too

#

I guess the โ€œcatโ€ was from the LLM since I didnโ€™t provide a shell ๐Ÿ˜›

smoky ocean
#

In my experience the more "real" artifact you show the more people get how cool it is. Of course opening a PR is awesome too ๐Ÿคฏ

storm gate
smoky ocean
lean mural
#

Does anyone know if it's possible to have an agent interact with a table loaded up in duckdb? I'm trying to get an agent to do some simple/fun EDA on a small data set from github, but I'm struggling about how to tie the pieces together

#

I feel like I can figure out how to do this with langchain in python, but where/how dagger intersects is thowing me

smoky ocean
# lean mural Does anyone know if it's possible to have an agent interact with a table loaded ...

๐Ÿ‘‹ hello! The way I would do it: first develop a small Dagger module that integrates with DuckDB. It doesn't have to do anything LLM-specific.

(you could also use a pre-made DuckDB module, but I searched on daggerverse.dev and couldn't find one)

Once the module is ready, instantiate a LLM (either in the command-line with 'dagger llm', or in code from another module) and bind a instance of your DuckDB module so it can use it.

do some simple/fun EDA on a small data set from github

Can you share a little more detail? ย Seems like a fun use case ๐Ÿ™‚

#

Or, you could try giving the LLM just a plain container, and see if it can figure out how to install duckdb software, and fetch data from github on its own.

In that alternative scenario you don't even have to develop a custom module: you let the LLM do all the work from an open-ended environment. But, you will get less predictable results. It's the defining tradeoff of agent engineering ๐Ÿ™‚

#

v0.17.0-llm.7 ๐Ÿงต

lean mural
# smoky ocean ๐Ÿ‘‹ hello! The way I would do it: first develop a small Dagger module that integr...

If this is helpful this is what I'm doing with python/langchain:

...
    con.execute("""
    CREATE SECRET secret2 (
        TYPE s3,
        PROVIDER credential_chain
    );
    """)

    result = con.sql("""
        CREATE TABLE IF NOT EXISTS github_pulls 
        AS (
            WITH data AS (
              SELECT 
                * exclude(is_locked, milestone_id, issue_number, is_pull_request, repository_id),
                row_number() OVER (PARTITION BY issue_id ORDER BY closed_at DESC) _row_num
              FROM 's3://launi-data-bucket/__unitystorage/catalogs/bdeab243-c684-4e4d-ab6c-c2c87596398d/tables/46d6ab44-359a-4d62-b888-682db22402e1/*.parquet'
              WHERE is_pull_request
            )
            SELECT * exclude(_row_num)
            FROM data WHERE _row_num = 1
        )
    """)
    con.close()

    db = SQLDatabase.from_uri(
        "duckdb:///./data.ddb",
        include_tables=['github_pulls'],
        sample_rows_in_table_info=3
    )

    toolkit = SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0))
    agent_executor = create_sql_agent(
        llm=OpenAI(temperature=0),
        toolkit=toolkit,
        verbose=True,
        agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION 
    )
    
    template = """/
    You are a SQL analyst that is querying a database of GitHub repository
    information named "github_pulls".

    Your job is to write and execute a query that answers the following question:
    {query}
    """

    prompt = PromptTemplate.from_template(template)
    agent_executor.run(
        prompt.format(
            query="Which creator_login_name has merged the most pull requests? Tell me their name and how many pull requests they have merged."
        )
    )
#

With the intent being there being another agent who is thinking of interesting questions to ask this agent, who actually runs the query

lean mural
#

So i guess if I were going to do this in dagger: iโ€™d have a duckdb module that exposes like, with_db, execute_sql, etc. and then thereโ€™d be a dagger LLM writing queries and calling execute_sql?

spring wave
#

@smoky ocean pushed more scrollback fixes/polish, if you wanna tag another release. ๐Ÿ˜…

  • Gave up on swapping to the Alt screen since that introduced new weird behavior. (interpreting scroll events as going through prompt history)
    • Don't really need it anyway, it's simpler without it and the end result (after things finish running) is still good.
  • Also got rid of the highlighted background since it was causing artifacting and reduced contrast.
  • Now prints the scrollback on exit, and fixed the spacing above the "Full trace at ..." message
  • Fixed an off-by-one error preventing the prompt from "hugging" the bottom of the screen
    https://asciinema.org/a/oCkA9UGG7bEq9MuE14reWyM6q
#

Can tag myself when I'm back at the pc

smoky ocean
#

17.0-llm.8 ๐Ÿงต

smoky ocean
#

๐Ÿšจ curl -fsSL https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION=0.17.0-llm.8 BIN_DIR=/usr/local/bin sh

smoky ocean
#

I just realized I can't do /with llm because the LLM type can't introspect itself ๐Ÿ˜ญ

#

multi-agent with zero line of code, just raw command-line, would have been sick

smoky ocean
#

I'll try when the kids are asleep

spring wave
#

this seems to help:

diff --git a/core/llm.go b/core/llm.go
index 210ef854b..76512e72b 100644
--- a/core/llm.go
+++ b/core/llm.go
@@ -667,6 +667,17 @@ func (llm *Llm) Set(ctx context.Context, key string, value dagql.Typed) (*Llm, e
        }
        llm = llm.Clone()
        llm.env.Set(key, value)
+       if obj, ok := dagql.UnwrapAs[dagql.Object](value); ok {
+               llm.history = append(llm.history, ModelMessage{
+                       Role:    "user",
+                       Content: fmt.Sprintf("The variable %s is set to %s@%s.", key, value.Type().Name(), obj.ID().Digest()),
+               })
+       } else {
+               llm.history = append(llm.history, ModelMessage{
+                       Role:    "user",
+                       Content: fmt.Sprintf("The variable %s is set to %v.", key, value),
+               })
+       }
        llm.dirty = true
        return llm, nil
 }

before: https://v3.dagger.cloud/dagger/traces/b913833bf8e945bafda382b6a12e1d02
after: https://v3.dagger.cloud/dagger/traces/72b5fd437d2323cc8c5b70c38ce4948e

#

hrmm that helps it along a bit in the simple one-object case, but it doesn't seem to improve the 'use ctr to build repo' prompt (granted, it's an incredibly vague prompt)

smoky ocean
#

you know what, instead of _load we can expose eg. _load_ctr and _load_src. In other words _load_<key>: load the <typename> "<key>"

#

one less indirection

#

trying

fleet fiber
#

Hiiiii wavefoxy I'm a dagger noob and running through the AI docs page using v0.17.0-llm.8 and storing my openai env's in .env, and I get this 404:

! input: toyProgrammer.goProgram POST "https://api.openai.com/chat/completion
โ”‚๐Ÿง‘ You are an expert go programmer. You have access to a workspace
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿง‘ Complete the assignment written at assignment.txt
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿง‘ Don't stop until the code builds
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿค– 0.2s
โ”‚ ! POST "https://api.openai.com/chat/completions": 404 Not Found```
#

I'm assuming these are OK:

OPENAI_MODEL=o3-mini```
smoky ocean
#

Also hello! ๐Ÿ™‚ welcome

fleet fiber
#

Fixed! unsetting worked

rich plaza
spring wave
#

ah ha, figured out how to enable token usage stats for OpenAI - had to flip a boolean in the params. pushed /cc @smoky ocean

spring wave
lyric goblet
#

What's a local model that runs on a macbook with 32gb of RAM that can be successfully used for this flow? I'm using qwq because it supports tool use and seems to run locally, but I got strange behavior when I tried to use the toy go coder with it. Is it just that a 32b parameter model is too greedy for this RAM profile? Is there a smaller model that still supports tool use that y'all might recommend?

shrewd ermine
lyric goblet
#

is there an example of a node/typescript coder workflow, rather than the go one? I'm less familiar with go, would be able to make more headway with an example in the node ecosystem

lyric goblet
#

๐Ÿค”

lyric goblet
bronze fern
#
import { dag, Container, func, object } from "@dagger.io/dagger";

@object()
export class ToyProgrammerTs {
  /**
   * Write a Go program based on the provided assignment.
   */
  @func()
  goProgram(assignment: string): Container {
    // Create a new workspace using the third-party module
    let before = dag.toyWorkspace();

    // Run the agent loop in the workspace
    let after = dag
      .llm()
      .withToyWorkspace(before)
      .withPromptVar("assignment", assignment)
      .withPromptFile(dag.currentModule().source().file("prompt.txt"))
      .toyWorkspace();

    // Return the modified workspace's container
    return after.container();
  }
}
lyric goblet
#

ah, so like, the fact that the app is written in "go" is just a property of the prompt, the implementation of the flow is language agnostic?

bronze fern
#

Yep, this was a tightly-coupled demo example, but the concept could be super generalized so you could provide a container with the right tools for a workspace, and a function to test/check things with, etc.

#

I think there's a more general example like that...

lyric goblet
#

interesting. So the tools that the agent gets access to - the ability to read and write from the file system - those are just baked into the Container?

smoky ocean
bronze fern
#

You can attach an object of any type to an llm instance and all of the functions on that object become tools for the LLM ๐Ÿ™‚

lyric goblet
#

ah, interesting.

bronze fern
#

So a core type like Container has a ton of functions/tools, prob too many

#

While in the toy-workspace we constrain down to 3 functions

#

read, write, build - keeps the agent out of trouble ๐Ÿ˜‰ and is surprisingly powerful when the agent can run in a loop: receiving the initial assignment and then errors and function return values as feedback as it loops, trying to reach the goal (and then stopping).

dense flare
#

agent loop repeats

proper stratus
#

Is it a bug that sometimes when I give the agent a container with Alpine image via the /with command, it pulls another image (e.g., Ubuntu) when I ask it to do something? It seems like it doesn't know the context.

shrewd ermine
smoky ocean
woeful quiver
merry scarab
#

Qwen is a bit drunk for me ๐Ÿ™‚

โ”‚๐Ÿค– Certainly! To understand what your Dagger module does, I'll need to take a look at the code in the
โ”‚ โ”ƒ https://github.com/levlaz/snippetbox repository. Specifically, I'm looking for a Dagger file, which is
โ”‚ โ”ƒ typically written in HCL (HashiCorp Configuration Language) or Go.

GitHub

Contribute to levlaz/snippetbox development by creating an account on GitHub.

uneven depot
#

This gave me a good laugh! Qwen must be sponsored by Hashicorp

merry scarab
#

@spring wave I wish it was possible to share a specific span/repo from dagger cloud that is not in the context of a repo - esp for the agent use case where I might be in vibe mode and wanting to share a link for someone to look at - I think right now these become "orphaned" local traces and there is no way to access them via the public url

#

Is there an easy way to see which model is being used?

โœ” llm  0.0s
Llm@xxh3:372bce69d14ff865

I feel like llm used to tell you more info, now its just this id it seems

spring wave
#

or if you're in dagger llm it'll be in the prompt

merry scarab
#

Thank you so much!

spring wave
#

we got rid of the object dumping because in a lot of cases it was super spammy and didn't show the most useful info

merry scarab
#

Hm, is this all cached? I am trying to switch from local qwen to anthropic using env var but it seems to still be using qwen even though my env has changed

#

nvm, i am dumb ๐Ÿ™‚

smoky ocean
#

small UX feature request, if anyone has time: would be awesome if /model could autocomplete model name ๐Ÿ˜› I find it super hard to memorize the model names

#

either that, or support shortnames like "claude-sonnet", "claude-sonnet-2.7" or even "claude"

#

or both ๐Ÿ™‚

merry scarab
#

Would love to see this, ran into it myself this morning

smoky ocean
#

Also, if anyone is looking for well-scoped contributions: token counting doesn't work with openai ๐Ÿ˜ญ

smoky ocean
#

(is there a way to make a single trace public?)

spring wave
#

no, i (and Lev :P) would very much like that too

smoky ocean
coarse epoch
#

hi @spring wave ๐Ÿ‘‹ been a while since our Cloud Foundry days

spring wave
#

whoa hey!

coarse epoch
#

i'm trying to run one of the agents, the dockerfile optimizer. i'm in a clone of the pandas repo, just trying to try it out but when i follow the instructions on the tutorial (https://docs.dagger.io/ai-agents/) i hit this positional arguments bug. maybe it's expecting me to pass in my api keys?

coarse epoch
bronze fern
# coarse epoch possibly a dumb question, but where is the code for the agentic loop? i see the ...

Great question, a bit of magic.

When you provide the LLM with an object like a Directory or ToyWorkspace using llm().withDirectory or llm.withToyWorkspace, all of the functions on that type are made available as tools for the LLM. Then you kick off the conversation with a prompt. The agent will loop in its own, requesting tools to be called with certain parameters and consuming the outputs until it sends a final response with no tool call.

For example the Dagger Directory type has these functions:

as-module
 as-module-source
 diff
 digest
 directory
 docker-build
 entries
 export
 file
 glob
name
 sync
 terminal
 with-directory
 with-file
 with-files
 with-new-directory
 with-new-file
 with-timestamps
 without-directory
without-file
 without-files

This may be too many or not the right tools for the job, thus in a lot of cases we provide a set of tools and a suitable execution environment in what we're calling a
"workspace"

This this little example has just 3 functions / tools ๐Ÿ™‚ to constrain the LLM and keep it on track. The prompt plays a big role. And some LLMs are better at tool calling and selecting tools than others. As you may have guessed, often the "bigger, smarter" models are better at it.

https://github.com/shykes/toy-programmer/blob/main/toy-workspace/main.go

read
write
build
coarse epoch
#

ah got it. so it's this part:

after := dag.Llm().
        WithWorkspace(ws).
        WithPrompt(`

that kicks it off

is that a part of the OSS repo? i'm curious to poke around if it is

bronze fern
#

yep! On the llm branch

coarse epoch
#

ah i see, so the "workspace" is a special case folder that tells the agent essentially what it's allowed to do/which tools to use

bronze fern
smoky ocean
#

yeah "workspace" is not a special case

shrewd ermine
#

yep, so you can grab any module from https://daggerverse.dev/ and give it to an LLM as dag.Llm().With{Module} and it'll have all the functions of that module available as tools

smoky ocean
#

(we really need a way to make that part more clear...)

coarse epoch
#

it's like Dagger's version of MCP in a way

shrewd ermine
#

exactly ๐Ÿ™‚

coarse epoch
#

very very cool

#

side note, using Claude Code to explain the Dagger codebase to me is very fun

#

i'm not sure i should have paid 61 cents for the privilege but it found the right code!

shrewd ermine
#

It summarized the loop really well

smoky ocean
#

@coarse epoch note we are adding actual MCP compat, so you can expose any Dagger object as a MCP server also

smoky ocean
#

did you try the dagger llm command yet?

coarse epoch
#

oh whoa that is sick

#

just tried it

coarse epoch
worn hill
smoky ocean
# coarse epoch oh whoa that is sick

you can bind any object to the llm context also. Same plumbing as explained above, just with a slicker UI.

Try: /with github.com/dagger/dagger/modules/wolfi

then:

please build me a container with git, go and python installed. Also add a text file that says "hello world". Then push it to the registry ttl.sh with an image name of your choice

bronze fern
lean mural
#

I feel like my dagger llm just keeps telling me how to do things, rather than doing them itself

smoky ocean
lean mural
#

no, i'm using openai, 4o i think

smoky ocean
#

I like to say "you have access to a container/workspace/whatever. Use it to accomplish your tasks. Don't tell the user how to do it, do it yourself". Then follow up with more specific prompts

lean mural
dense flare
#

If I already have a CI dagger module with test/build/etc functions, how can I give those to my agent directly, without recreating them in a workspace module?

dense flare
#

maybe the workspace.test() is just a wrapper which internally calls ci.test()

coarse epoch
stray ice
#

which is actually a very useful error, as you can iterate until you find the best model for your use case
maybe Gemini would be best as it allows more input tokens

woeful quiver
#

Wrote a technical content summarizer agent, that might have had a few too many long islands... ๐Ÿคฃ

It's using Ollama and qwen2.5-coder, and does the following:

Takes a URL and strips the content out of the website using cheerio (JS library)
Gives a reader workspace with get-content (this uses cheerio) and check-content against a min and max length and forbidden words
Then its asked to summarize the content for a non-technical audience

Right now its modifying the summary to fit the actual check-content tool ๐Ÿคฆ so it needs a little tweaking

Code is here - will be touching up a little more https://github.com/jasonmccallister/tech-summarizer-agent

bronze fern
#

Wrote a technical content summarizer

shrewd ermine
#

sneak peek ๐Ÿ™‚

#

if tests or linting fails on a PR, it'll comment the fix as a suggestion

coarse epoch
dense flare
# shrewd ermine sneak peek ๐Ÿ™‚

interesting, would like to see the code and maybe integrate into my demo as well. currently it only produces a diff in a single comment (and sometimes doesn't do the diff at all)

shrewd ermine
smoky ocean
bronze fern
#

Just tried a qwen model with llama.cpp:
llama-server -m qwen2.5-coder-7b-instruct-q5_k_m.gguf --host 0.0.0.0 --port 8000

when I ran a function using Llm and tool calling or via dagger llm /with xxx ... I hit:

got exception: {"code":500,"message":"Cannot use tools with stream","type":"server_error"}
srv  log_server_r: request: POST /v1/chat/completions 192.168.64.2 500

Seems like a llama.cpp limitation, perhaps.

Most GGUF models lack OpenAI-style tools
need some wrapper magic, it seems: vllm or ollama?

spring wave
#

Thoughts on today's OpenAI announcements

shrewd ermine
bronze fern
woeful quiver
#

Anyone else getting 502/503's from Gemini right now?

shrewd ermine
#

seems to be acting up a bit, yeah

#

oh cool, since ollama 0.5.13 you can set a default OLLAMA_CONTEXT_LENGTH to override the default 2048 ๐Ÿš€

shrewd ermine
#

if you're using ollama with ollama serve, set this ๐Ÿ‘† to something like 8192 and see qwen fly

worn hill
shrewd ermine
worn hill
#

if you can get it to fix a flaky test on camera...

shrewd ermine
#

fixes tests and lints btw. I did not emphasize the linting part in the video

worn hill
shrewd ermine
#

it's also the classic problem of "oh yeah but i run cyborg-vim closed beta that doesn't support linters so i just push"

worn hill
#

lol or "my build system is complicated enough i don't actually know how to apply linter diffs, especially inside my editor" coughuscough

#

demos like this do make me wonder if we're gonna see a resurgence of phab or gerrit style stacked-diff code review systems.... like someone should have to stamp the agent commit seperately from the rest of the PR

#

there's also a paid saas startup that does this on top of github but i can't remember the name

shrewd ermine
#

are you thinking of trunk?

worn hill
shrewd ermine
#

ah yes

spring wave
#

fyi: merged main into llm, had some nontrivial conflicts but I think we're good, lemme know if anything seems... off

smoky ocean
#

Feature request: a way to interrupt the llm, without killing the shell

#

@spring wave worth it for me to push a llm.9 ?

spring wave
#

shrug it's just merges from main I think

coarse epoch
smoky ocean
#

Could anyone repro a build error for me please?

dagger -m github.com/shykes/dagger@llm-multiobj -c 'cli | binary'

--> does this build for you? And if so, what's your engine version?

#

OK I can't build main either...

#

wth

shrewd ermine
#

what is your engine version? notsureif

smoky ocean
#

either 0.17.0-llm.8 can't build dagger main; or main is broken

shrewd ermine
#

ah. I have 0.17.0-llm.8 and 0.16.3 running so i can try both

smoky ocean
#

I'm getting elixir SDK errors on both main and my llm-multiobj branch

shrewd ermine
#

yeah i got the elixir error from llm.8

#
# github.com/dagger/dagger/.dagger
./sdk_elixir.go:52:19: sdkDev.Lint undefined (type *dagger.ElixirSDKDev has no field or method Lint)
./sdk_elixir.go:81:18: sdkDev.Test undefined (type *dagger.ElixirSDKDev has no field or method Test)
./sdk_elixir.go:97:16: sdkDev.WithBase undefined (type *dagger.ElixirSDKDev has no field or method WithBase)
./sdk_elixir.go:98:16: sdkDev.Generate undefined (type *dagger.ElixirSDKDev has no field or method Generate)
./sdk_elixir.go:164:16: sdkDev.WithBase undefined (type *dagger.ElixirSDKDev has no field or method WithBase)
merry scarab
merry scarab
#

@shrewd ermine why does your :looking for module take a min lol - mine was 20ish sec

shrewd ermine
#

uh because I lied and it still had to start 0.16.3 lol

smoky ocean
#

I am told main builds fine from 0.16.2

shrewd ermine
#

based on the error it feels like a merge issue

merry scarab
#

@smoky ocean this worked in 0.16.3

smoky ocean
#

ok thanks

smoky ocean
shrewd ermine
#

oh wait yeah you're right

smoky ocean
#

Ok this is melting my brain:

  • 0.16.3 can load main and llm-multiobj fine
  • llm.8 fails to load main, llm and llm-multiobj
  • But llm.8 doesn't fail to load any other module that I know of?
#

@spring wave are you getting any of these build errors?

#

OK I narrowed down the issue to github.com/dagger/dagger/elixir/dev. Somehow when loading that module, llm.8 returns a boilerplate module straight from dagger init

#

So it looks like we shipped llm.8 with a timebomb that goes off when you load sdk/elixir/dev (and possibly other modules from the dagger repo?) I'm going to say: something went wrong in the SDK-bundling system at build

#

Elixir SDK errors ๐Ÿงต

shrewd ermine
#

getting a surprising result from LastReply() on llm.8 ๐Ÿงต

proper stratus
#

Is there a way to break long line in shell?

spring wave
proper stratus
#

the \ does not work. Currently, if I want to write prompts with multiple lines, I have to write them outside the shell and paste them in.

spring wave
proper stratus
#

I tried with shift+enter but it didn't work. I'm fine with ctrl+o for now until shift+enter is supported

lean mural
#

I've got my agent writing some tests for me, but i haven't been able to get it to actually persist so once my dagger call ends it's all gone ๐Ÿซ  what do i need to do to save the work?

woeful quiver
lean mural
#

ah facepalm i used directory because the doc for export said export the image as a tarball

woeful quiver
#

Yeah something like dagger call <func> export --path=<somewhere> - if you are giving the repo as a directory you will probably want to export the entire directory

lean mural
#

ah, i wouldn't like just export my tests dir onto my existing tests dir?

smoky ocean
#

Onboarding improvements ๐Ÿงต

lean mural
shrewd ermine
#

and if you have a cloud trace that helps a lot with debugging what the agent is doing ๐Ÿš€

devout magnet
#

are we using this sdk for gemini? https://x.com/_philschmid/status/1900095644624134347

Weโ€™re thrilled to release a new TypeScript / Node SDK for @GoogleDeepMind Gemini!

It is a new unified SDK with huge improvements for all web developers โ€“ some of the highlights include:

- unified for Gemini Developer API and Vertex AI.
- Simpler initialization logic
- Cache

smoky ocean
shrewd ermine
woeful quiver
#

Has anyone tried/had success giving an agent two workspaces?

smoky ocean
woeful quiver
smoky ocean
#

At the moment (0.17.0-llm.8) you can only give a llm one object at a time

#

History inspection ๐Ÿงต

storm gate
smoky ocean
#

@shrewd ermine yesterday got a message from a friend trying out our agent stuff, he specifically called out your multi-agent demo module as particularly cool ๐Ÿ™‚

shrewd ermine
somber vault
#

Quick question, what's the preferred way to launch two dagger containers within single network? Application and a database as a real example

shrewd ermine
#

does the database need to be accessed other than by the application?

somber vault
#

Nope, throwaway setup, it's agents after all (:

shrewd ermine
#

the key line is the "with service binding" to connect the database container to the application container

somber vault
#
drupal.with_service_binding("db", mariadb) # assume "db" will be the hostname
...
.as_service(use_entrypoint=True) # <- this is the key line

That's it?

#

My boi that's clean

smoky ocean
#

Note if you want it to connect to an external DB (like one running on your laptop or an external server) you can bind that to your app container also. Service is an abstract type that can be backed by a container or a tunnel to host network

somber vault
#

neat, so I can take app container, swap service binding and throw this onto prod so to speak?

shrewd ermine
#

yup! The cool thing too is that you don't even need to change the service binding necessarily. Using the snippet from the cookbook as the example, you could have an optional arg to your function for mariadb dagger.Service, and if it's not set create it like in the snippet. But then you could pass it in from the cli as --mariadb tcp://localhost:3306

somber vault
#

Perfect, will try that out after checking how my pipeline runs on top of dagger

#

Finished porting all steps besides the database, but this should be ready in a couple minutes

lean mural
shrewd ermine
somber vault
#

Another silly question, say I expect command to fail often but still want to grab stdout stderr

result = await container.with_exec(["tsp", "compile", ".", "--no-emit"])
exit_code = await result.exit_code()
stdout = await result.stdout()
stderr = await result.stderr()
return exit_code, stdout, stderr

I found the .sync() method, not sure where to correctly get stdout and stderr

shrewd ermine
smoky ocean
#

@shrewd ermine your multi-agent demo has gpt-o1 as a constant for model name, but does that actually work?

shrewd ermine
smoky ocean
#

Were you able to use o1 at all? For me llm --model=o1 fails with an api error

shrewd ermine
#

I haven't tried ๐Ÿคท

smoky ocean
#

Multi-object eval 1 ๐Ÿงต

lean mural
#

Any techniques for trying to help dagger limit the amount of context it's sending to the LLM? I feel like i was making progress and now it's started blowing up due to exceeding token limits

lean mural
shrewd ermine
#

Yeah that'll help. We're going to have function masking so that you can limit which functions from your object get passed in, I think that's going to be big

lean mural
#

Is there a way to expose the container without adding a container() function so i'm not confusing the LLM with that tool, but can still access the workspace container? (i should probably go look at the toy programmer repo you just sent..)

smoky ocean
#

@lean mural we plan on adding a "function mask" so that you can hide certain functions (in this case, container()) when binding an object to the llm

#

for now you can use tricks like 1) adding a "do not use" comment in the function ๐Ÿ™‚ or 2) saying it explicitly in the prompt

quiet ether
#

๐Ÿ‘‹ does /prompt know how to handle reply types? in llm.8 at least how do I get a prompt reply and set that as the new /with object?

smoky ocean
#

"prompt reply" you mean the actual string message sent by the llm?

quiet ether
quiet ether
#

In my case the LLM is calling a tool that returns a container and I want to be able to do something with it

#

But I can't get a reference to that

smoky ocean
#

ah I see. well in llm.8 as long as the returned type is the same as the original type you set with /with, the llm keeps the latest reference.

quiet ether
#

Yeah.. they're different

#

I'm using the toy programmer fwiw. Which returns a Container and doesn't have a function to get a reference to it

smoky ocean
#

in any case you can't get it out euther way. but you can ask the llm to publish/export/open a terminal/expose port ๐Ÿ™‚

#

like in my demo when I say "give me a terminal"

quiet ether
#

In this case there's not

#

It's a response from a function call. I can add another function to retrieve it

smoky ocean
#

I don't understand the issue. it should be able to chain

#

"get the container from your workspace then publish it to ttl.sh"

quiet ether
#

It strange.. it doesn't know about it. Since the toy-programner doesn't have a way to retrieve it

#

The LLM only knows about the goprogram, goprogramqa

smoky ocean
#

ah you're doing multi agent? your "copilot" has access to a ToyProgrammer correct?

quiet ether
#

Yes

smoky ocean
#

ok - then that's expected yes, no function to get the workspace

#

for a demo I recommend doing single agent unless you specifically want to show multi agent as the topic

#

for a "basic" demo of dagger prompt mode it might be too meta

quiet ether
#

Yeah.. multi agent will probably be confusing. I'll call that toy programner with the shell and that's it

smoky ocean
#

you can also bind toy-workspace to the prompt mode and basically recreate toy-programmer from the CLI. Then show that you can do the same in code

#

both work

crisp scroll
#

I am using latest dagger with llm integration. When the llm tries to expose the port, it gets the following error : decode arg "port": cannot create Int from float64

#

This is the call that it is declaring to use:````
<invoke name="withExposedPort">
<parameter name="description">Plone HTTP port</parameter>
<parameter name="port">8080</parameter>
<parameter name="protocol">TCP</parameter>
</invoke>

#

Any idea how I could hint it ? Or is that a known bug of the integration ?

#

IOW, might not be totally fixed. Or I am looking at another flavour of that bug.

gloomy kindle
#

i can't see how it's used, is this different from what a +private field is?

quiet ether
gloomy kindle
#

oh I guess it's not private because we still need to access it from code

#

but the LLM shouldn't see it right

quiet ether
gloomy kindle
#

mm okay yeah fiddly

#

not sure we want to call this, but it doesn't seem plumbed through to anything

spring wave
#

gonna merge llm-multiobj in to llm and keep working on it

#

i think we might be able to just skip the feature flag, and have it only opt in to multi-object once you set variables thinkspin

shrewd ermine
gloomy kindle
#

neat neat

spring wave
#

except for interactive shell which automatically sets variables. maybe that one should be opt-in somehow? @shrewd ermine have you had a chance to play around with it yet? wondering how the demo UX is

shrewd ermine
spring wave
#

pushed the merge, here are the cliff's notes:

  • llm command is gone - use shell instead
  • press > at start of input to swap to prompt mode, ! to switch back
  • /with is gone - I'll add a .with builtin instead (e.g. .with $(container | from alpine))
  • vars that you set in shell are synced to the LLM, vars that the LLM set are synced back to the shell (though you usually have to tell it to set one)
  • in prompts you can use $vars (auto-completable) to explicitly reference objects, or not and see if it figures it out based on context
#

working on .with now

gloomy kindle
#

"autocompletable" ๐Ÿฅณ ๐Ÿฅณ

spring wave
#

still need to backport that to shell completion funnily enough

#

prompt can complete shell vars but not shell ๐Ÿ˜›

uneven depot
#

btw, dagger shell is displayed with strange hue on my terminal (ghostty). It wasn't like that before

spring wave
uneven depot
#

ah! ok cool. I'm on 0.16.3 so it hasn't shipped

smoky ocean
#

@spring wave should we celebrate multi-obj being merged with a llm.9 tag?

spring wave
#

the merge is just to avoid losing time to merge conflicts

#

but, should be soon

#

i'm adding a .llm builtin to replace /with and let you access the current LLM object:

# set state
container | from golang | .llm
# access current LLM state
.llm
# get state
.llm | container
# all in one
container | from golang | .llm | container
#

pushed ๐Ÿ‘†

smoky ocean
#

can we change the name of that builtin? looks a lot like llm...

quiet ether
#

does anyone have a clue why the model doesn't seem to figure out how to call withExposedPort?

edit: not sure if it's a claude specific thing

โœ” /with $(container | from alpine ) 2.0s
Container@xxh3:0991463b40bf15ca

โœ”  expose the port 8080 in the container 12.4s
โ”‚๐Ÿง‘ expose the port 8080 in the container
โ”‚ โ”ƒ 0.0s
โ”‚
โ”‚๐Ÿค– I'll help you expose port 8080 in the container using the  withExposedPort  function. This function requires the  port  parameter, and since you specified port 8080, I can use that directly.
โ”‚ โ”ƒ 3.6s โ—† LLM Input Tokens: 11,701 โ—† LLM Output Tokens: 105
โ”‚
โ”‚๐Ÿค– I apologize for the error in my previous attempt. Let me try again with the correct format:
โ”‚ โ”ƒ 2.5s โ—† LLM Input Tokens: 29 โ—† LLM Output Tokens: 78 โ—† LLM Input Tokens (cache writes): 11,805

I'm also not being able to see the call in the trace here: https://v3.dagger.cloud/marcos-test/traces/b81bf3d688e337cb4c844ce71b25a546

spring wave
#

.current-llm?

smoky ocean
#

oh sorry - I thought it returned the current selected object within the llm

spring wave
#

ah np, yea that'd be weird

smoky ocean
#

how do you feel about making it a special variable?

#

for example $copilot

spring wave
#

the trick is it also supports piping state to it

smoky ocean
#

then I could just set it

spring wave
#

i think we'd lose that with a var

smoky ocean
#

copilot=$($copilot | with-foo) ?

spring wave
#

true

smoky ocean
#

wait how do I pipe state to .llm?

spring wave
#

container | from alpine | .llm

#

it figures out the type and calls the appropriate withFoo

#

this is to replace /with for current demo flows, since all the slash commands are gone

smoky ocean
#

why not just set variables?

spring wave
# smoky ocean why not just set variables?

you can do that of course, this is in the spirit of compatibility with pre-multi-object demo flows, in case we're still giving them, since using vars changes how you prompt it. with this you just change from /with to piping at the end

#

bearing in mind the original plan was to feature-flag all of the multi-object stuff

shrewd ermine
#

super confused, on a fresh checkout of the llm branch I have

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   sdk/php/generated/LLM.php
        modified:   sdk/php/generated/LLMId.php

and restore/checkout doesn't actually fix ๐Ÿค”

#

those files don't actually exist, but I do have Llm.php and LlmId.php

#

except also they do exist thinkies

#
โžœ  dagger git:(llm) โœ—
> ls -la sdk/php/generated/LLM.php
-rw-r--r--  1 kylepenfound  staff  27015 Mar 14 13:36 sdk/php/generated/LLM.php
โžœ  dagger git:(llm) โœ—
> ls -la sdk/php/generated/Llm.php
-rw-r--r--  1 kylepenfound  staff  27015 Mar 14 13:36 sdk/php/generated/Llm.php
โžœ  dagger git:(llm) โœ—
> ls sdk/php/generated/L*php
sdk/php/generated/Label.php         sdk/php/generated/LabelId.php       sdk/php/generated/ListTypeDef.php   sdk/php/generated/ListTypeDefId.php sdk/php/generated/Llm.php           sdk/php/generated/LlmId.php
spring wave
#

probably some fun with case insensitive filesystems

#

guessing you're on macOS

#

we probably need to re-generate all of sdk/*

shrewd ermine
#
> git status
On branch llm
Your branch is up to date with 'upstream/llm'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   sdk/php/generated/LLM.php
        modified:   sdk/php/generated/LLMId.php

no changes added to commit (use "git add" and/or "git commit -a")
โžœ  dagger git:(llm) โœ—
> rm sdk/php/generated/LLM.php sdk/php/generated/LLMId.php
โžœ  dagger git:(llm) โœ—
> git status
On branch llm
Your branch is up to date with 'upstream/llm'.

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        deleted:    sdk/php/generated/LLM.php
        deleted:    sdk/php/generated/LLMId.php
        deleted:    sdk/php/generated/Llm.php
        deleted:    sdk/php/generated/LlmId.php

no changes added to commit (use "git add" and/or "git commit -a")

yeah very weird lol

smoky ocean
#

basic single-object workflows dont work in tip llm fyi

#

I just gave a demo and it broke

#

llm | with-container | with-prompt "do you see the container" -> doesn't see it

worn hill
#

lol we have reached the "damn we really need tests don't we" point of this branch

spring wave
#

did that prompt work before? i've found it always has trouble with that sort of phrasing, since it's more around what set of tools it has

shrewd ermine
#

I just did llm | with-container $(container | from alpine) | with-prompt "do you see the container" | last-reply on llm.8 with ollama and it was good

spring wave
#

this worked for me:

โœ” llm | with-container $(container | from golang) | with-prompt "what tools do you have? list them all" 43.7s
# ... lists all of Container, plus the multi-object tools (TODO it should not)
shrewd ermine
#

but yeah it responded with the list of tools

worn hill
#

false alarm then we haven't reached the "damn we really need tests don't we"

#

i really need tests but that's a personal problem

spring wave
#

ultimatesmug who needs'em

worn hill
smoky ocean
spring wave
#

rust codegen is broken by the LlmID -> LLMID change; it ends up as Llmid and doesn't pass a ends_with("Id") check ๐Ÿซ 

spring wave