Release checklist 🧵 | Dagger | Page 1

wind dome Mar 24, 2025, 10:05 PM

#

I'm stressing out a little over getting to a stable release this week... Doesn't help that my hands are tied with the shell launch. Can we try to make a checklist together @nocturne laurel @bright ore @mighty storm @crystal monolith ?

mighty storm Mar 24, 2025, 10:06 PM

#

Happy to help

nocturne laurel Mar 24, 2025, 10:09 PM

#

I'm working on a PR (+ big description) for all the R&D I've been doing on the tool calling scheme: https://github.com/dagger/dagger/pull/9956

GitHub

llm: refine tool calling scheme, add evals by vito · Pull Request ...

(This PR does a few things, still working on the description, quick PR up so I can go over the diff)
New evals system

Added the beginning of a new suite of evals so we can quickly get a sense of h...

#

I'll go over my other TODOs and list anything that seems delegateable here

crystal monolith Mar 24, 2025, 10:14 PM

#

I can only talk about MCP stuff.

- [DONE] i wanted to fix a bug, that's done, will push it.
- There's another bug that's exercised only in Cursor, not sure why, it feels like a "multi-object/single-object conflict" but it doesn't make sense why it would only happen in Cursor. I'm fine deprioritizing this bug if it helps.
- About to update the PR description to show how one can use Guillaume's mcp-client to better test it.
- Need Reviews

Happy to help review things too

wind dome Mar 24, 2025, 10:15 PM

#

Pulling in ecosystem crew @tribal pewter @sharp oasis @burnt warren

#

One possible gap is the end-to-end CLI experience. The killer use case, for now, seems to be "write simple modules, and compose them for your personal workflows". Strong continuity from dagger shell, and strong focus on the details of the CLI UX

#

So the fine details of "prompt mode", how to configure and customize it, how to get data in and out of it, etc. will get a lot of usage and eyeballs - but comparatively they're the least mature part because we built it last

#

It's like the typical user journey is the chronological inverse from our own builder's journey. What they will use first is what we built last, and vice-versa

#

https://tenor.com/view/benjamin-button-old-not-as-old-gif-10768449


#

The Benjamin Button journey

tribal pewter Mar 24, 2025, 10:21 PM

#

Agreed. This is the part I'm working on content for, so if I work out a script tonight maybe that'll help show where the gaps are?

wind dome Mar 24, 2025, 10:25 PM

#

tribal pewter Agreed. This is the part I'm working on content for, so if I work out a script t...

yes please!

#

One gap will probably be: modules that are actually composable

#

Right now we're getting lots of monolithic agents

#

Instead we want composable modules. Since the first agent you'll create is the one embedded in the CLI

sharp oasis Mar 24, 2025, 10:40 PM

#

I’ve been refining my database workspace, I can pull that out into its own repository if that’s helpful? I don’t feel like too many people are using agents with databases (in dagger at least) atm

wind dome Mar 24, 2025, 10:42 PM

#

Yes it would be helpful. The key would be to make it a regular module, that either a human or an LLM could use

#

Also another papercut:

- `dagger init` without a SDK: `dagger` fails to load the module. This makes it difficult to introduce a workflow based on CLI-only module composition

crystal monolith Mar 24, 2025, 10:56 PM

#

wind dome Also another papercut: - `dagger init` without a SDK: `dagger` fails to load th...

I can take a stab at this if it helps

sharp oasis Mar 24, 2025, 10:56 PM

#

wind dome Yes it would be helpful. The key would be to make it a regular module, that both...

It's pretty generic, just want to sort out a name - probably just going to go with database-workspace. Will push after dinner and getting the little ones down

wind dome Mar 24, 2025, 10:57 PM

#

which database? Any sql? or postgres only?

sharp oasis Mar 24, 2025, 10:57 PM

#

MySQL or Postgres 🙂

wind dome Mar 24, 2025, 10:58 PM

#

I would just call it sql then

#

Unless you plan on supporting non-sql databases, in which case call it db or something

#

Think of it as a precursor to a stdlib module

sharp oasis Mar 24, 2025, 10:59 PM

#

ok, I'll name it sql... I have used a lot of no-sql - that really should/could be a different module.

wind dome Mar 24, 2025, 10:59 PM

#

or, if you want to expose mysql-specific or postgres-specific modules, then split them into mysql and postgres (but given our DX, splitting modules with common code is probably not the place to start)

mighty storm Mar 24, 2025, 10:59 PM

#

wind dome Also another papercut: - `dagger init` without a SDK: `dagger` fails to load th...

Could you please detail how to repro the expected use-case:

1. you init a module without the `--sdk` field ? This works
2. You try anything after:

dagger_dev functions
✔ connect 1.6s
✘ load module 0.6s
! failed to serve module: input: moduleSource.asModule module name and SDK must be set
│ ✔ finding module configuration 0.6s
│ ✘ initializing module 0.0s
│ ! input: moduleSource.asModule module name and SDK must be set
│ │ ✘ ModuleSource.asModule: Module! 0.0s
│ │ ! module name and SDK must be set
Error: failed to serve module: input: moduleSource.asModule module name and SDK must be set

You expect that to work ?

tribal pewter Mar 24, 2025, 11:00 PM

#

Once 17.1 is out we'll want to update every example on the list to that release and make sure they're all in good shape

wind dome Mar 24, 2025, 11:00 PM

#

mighty storm Could you please detail how to repro the expected use-case: 1. you init a module...

Yes that's right. At step 2 you can just call dagger

#

I reported the issue, but I think I did it on discord... Now I can't find it. Should have made a github issue

sturdy ledge Mar 25, 2025, 5:23 AM

#

tribal pewter Once 17.1 is out we'll want to update every example on the list to that release ...

https://linear.app/dagger/issue/GTM-2816/ensure-all-examples-are-updated-to-0171

wind dome Mar 25, 2025, 4:55 PM

#

@nocturne laurel assuming we release tomorrow: what do we do about multi-object?

#

Do you want to charge ahead in the evals branch? But if so, we lose the current UX for prompt mode right? Can't give variables without expanding them in the prompt - can't receive variables back

nocturne laurel Mar 25, 2025, 5:00 PM

#

wind dome Do you want to charge ahead in the evals branch? But if so, we lose the current ...

This is my current plan yeah. Personally I prefer expanding vars in the prompt, since it's totally in line with how prompt vars otherwise work, and keeps things very clear if you change variables over time. To me, the magic `$_` variable also replaces the need for having the LLM set vars, since it aligns with the functional paradigm (LLMs just return a value, they don't muck with your environment and make up names with inconsistent schemes etc).

This is "just, like, my opinion" but I think it's worth trying since IMO it feels very aligned with our broader paradigm outside of LLMs - it aligns with functions, and even in shell we already wanted a "magic var" to refer to the last value returned, so reusing that proposal here feels good to me
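To make the "expand vars in the prompt" idea concrete, here is a minimal Go sketch. It is not Dagger's implementation: `expandPrompt`, the `vars` map, and the `Container#1` value are hypothetical, and it simply leans on the standard library's `os.Expand` for `$name` substitution, treating `_` as "the last value returned".

```go
package main

import (
	"fmt"
	"os"
)

// expandPrompt substitutes $name references in a prompt with values from vars.
// Unknown names are put back as "$name" so a typo stays visible in the prompt.
// (Hypothetical helper, not part of any Dagger API.)
func expandPrompt(prompt string, vars map[string]string) string {
	return os.Expand(prompt, func(name string) string {
		if v, ok := vars[name]; ok {
			return v
		}
		return "$" + name
	})
}

func main() {
	vars := map[string]string{
		"repo": "https://github.com/dagger/dagger",
		"_":    "Container#1", // assumption: "_" holds the last returned value
	}
	fmt.Println(expandPrompt("Clone $repo and build it inside $_.", vars))
}
```

The point of the sketch is that the model never has to discover or name variables: everything it needs is already spelled out in the text it receives.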

wind dome Mar 25, 2025, 5:00 PM

#

- For giving variables to llm: would be cool to "merge" them into the current object scope, like an overlay. So they would show up as tools

#

- For receiving variables from llm: if `set/get` variables was too heavyweight, maybe just a `_return` builtin tool? We could enshrine that a llm can behave like a function, ie. it can return a single value

#

_error / _return -> works well together

nocturne laurel Mar 25, 2025, 5:02 PM

#

wind dome - For receiving variables from llm: if `set/get` variables was too heavyweight, ...

Yeah I like this idea, since it'll be a more explicit signal that it thinks it succeeded at the task you gave it

#

we could even say "if a model ends its turn without returning, that's when we hand control back over to you"

wind dome Mar 25, 2025, 5:02 PM

#

might help if the LLM API has a way to specify what return type you expect? Then it would be _return_Container or something

wind dome Mar 25, 2025, 5:05 PM

#

nocturne laurel This is my current plan yeah. Personally I prefer the expanding vars in the prom...

I really don't mind that system existing at all, I might even use it occasionally, but I find it confusing that a llm can ambiently discover some things (core API, current module & dependencies) but not others (bindings you specifically gave it already). I think in that context, there will be a lot of support requests like "what do I do?" and we'll be copy-pasting "did you make sure to expand the variable in the prompt?"

So adding them as tools overlay seems like a good addition IMO

#

My only concern with that, was the weird asymmetry if some bindings can be written to (variables) and others can't (functions). But if the llm can't set variables anymore, and only return a value - then that asymmetry is resolved

nocturne laurel Mar 25, 2025, 5:10 PM

#

I'll play with it - really hard to say without seeing what it does, IME it's pretty delicate so keeping the toolset as focused as possible is a high priority. Maybe even having a tool to just list the current vars would help, but even that might end up being called all the time, confusing it into thinking vars are super important when really they're just the jumping off point

#

personally referring to values in the prompt felt the most intuitive to me, and I even suspect it's what most people would try if they know about prompt vars, but my expectations don't always match reality 😛

#

another option is to have the act of setting a variable inject something into the history, like "The variable $foo has been set to Container#1". That's actually what happens on main at the moment but I removed it for llm-evals

wind dome Mar 25, 2025, 5:16 PM

#

@nocturne laurel to be clear I'm proposing no new tool for listing variables

nocturne laurel Mar 25, 2025, 5:16 PM

#

yeah - I thought you were proposing one-tool-per-var

wind dome Mar 25, 2025, 5:17 PM

#

just add them to the list of functions. Basically llm should not see the difference between a var and an actual function in the current object

#

main difference I guess, is that the vars stay overlaid on top of all objects across selection

#

and maybe they are not prefixed with Type_?

#

none of which should require more "thinking" IMO

nocturne laurel Mar 25, 2025, 5:18 PM

#

wind dome just add them to the list of functions. Basically llm should not see the differe...

you're suggesting bringing back `select_varname` right?

wind dome Mar 25, 2025, 5:18 PM

#

nocturne laurel another option is to have the act of setting a variable inject something into th...

that would make MCP support harder (same as system prompt), so last resort IMO

wind dome Mar 25, 2025, 5:18 PM

#

nocturne laurel you're suggesting bringing back `select_varname` right?

No, something simpler. Let me illustrate:

#

dag.LLM().SetString("foo", "bar").WithQuery().WithPrompt("what tools do you see?").LastReply()

I see the following tools:

- `container` which returns a `Container`
- `git` which returns a `GitRepo`
- `foo` which returns a `string`

withPrompt("now create a container and select it. What tools do you see?")

Now I see:

- `Container_withExec`
- `Container_rootfs`
- ...
- `foo`

nocturne laurel Mar 25, 2025, 5:23 PM

#

oh i see, for non-object values? or both? if it's an object, does it change the selection?

#

and i suppose calling `foo` would return `{"result":"..."}`? (to be consistent with the current scheme on llm-evals)

#

i'm a little wary of doing something too different from regular prompt engineering here, since there are existing patterns even for passing large string values into a prompt

#

but, it's an interesting mechanic for it to be able to recall values later, rather than relying on message history

wind dome Mar 25, 2025, 5:34 PM

#

nocturne laurel oh i see, for non-object values? or both? if it's an object, does it change the ...

I guess both? For objects, I would keep the same behavior as everything else

wind dome Mar 25, 2025, 5:35 PM

#

nocturne laurel and i suppose calling `foo` would return `{"result":"..."}`? (to be consistent w...

yeah the more consistent the better

mighty storm Mar 25, 2025, 6:41 PM

#

wind dome Also another papercut: - `dagger init` without a SDK: `dagger` fails to load th...

What's the use-case: why not use `dagger --no-mod`? Or is it because you wanna gradually install modules and use them in your LLM agents

#

Just wanna clarify if it's: 1) enabling `moduleSource` without SDK field across the entire engine or 2) just bypassing in a smart way this security check ?

mighty storm Mar 25, 2025, 6:53 PM

#

mighty storm Just wanna clarify if it's: 1) enabling `moduleSource` without SDK field across ...

~~Hey @cyan vapor, what's your opinion on https://github.com/dagger/dagger/issues/9203 🙏 Helder said you were exploring this -- and we were thinking about taking it~~

mighty storm Mar 25, 2025, 7:01 PM

#

mighty storm What's the use-case: why not using the `dagger --no-mod` ? Or is it because you ...

Hey @fresh pivot, do you have some guidelines / opinions on how to solve https://github.com/dagger/dagger/issues/9203 ? 🙏

mighty storm Mar 25, 2025, 7:04 PM

#

mighty storm Hey @fresh pivot, do you have some guidelines / opinions on how to solv...

We'll explore a bit and add context on the issue in the meantime

fresh pivot Mar 25, 2025, 8:10 PM

#

mighty storm Hey @fresh pivot, do you have some guidelines / opinions on how to solv...

yes

#

we can just remove the message and the error case

wind dome Mar 25, 2025, 8:58 PM

#

nocturne laurel i'm a little wary of doing something too different from regular prompt engineeri...

also remember that over MCP, you only have bindings. Would be weird to have ability for object bindings but not scalar bindings

nocturne laurel Mar 25, 2025, 9:01 PM

#

bindings? is that an MCP term?

mighty storm Mar 25, 2025, 9:10 PM

#

fresh pivot yes

You mean CLI side or API side; changing the moduleSourceAsModule ? 👀

fresh pivot Mar 25, 2025, 9:10 PM

#

wherever the error is coming from, I think is safe to remove

#

essentially that code path just shouldn't error

mighty storm Mar 25, 2025, 9:11 PM

#

fresh pivot wherever the error is coming from, I think is safe to remove

Oki, because i've been trying to remove the checks here: https://github.com/dagger/dagger/blob/main/core/schema/modulesource.go#L2135 ; and i get some segfaults -- but pulling the thread, no worries, thanks, very helpful 🙏

Glad to know there's no big red flag 🙏

wind dome Mar 25, 2025, 9:13 PM

#

nocturne laurel bindings? is that an MCP term?

No I don't think MCP uses that term. I mean that when exposing a Dagger module over MCP (coming soon ™️ ) you can't inject prompts in the client LLM, you can only add bindings that will be exposed as tool calls over MCP

#

So "you can always inject the string in the prompt" doesn't apply when your module is consumed over MCP

nocturne laurel Mar 25, 2025, 9:15 PM

#

i see - but when you're only consuming over MCP, won't you also not be able to set variables anyway?

#

(or i just haven't seen how we do that)

wind dome Mar 25, 2025, 9:19 PM

#

nocturne laurel i see - but when you're only consuming over MCP, won't you also not be able to s...

Well the idea is that any environment dagger can give to a llm, it can expose over MCP. So in this case it might be dagger mcp -c 'foo | bar | baz'

#

But even if we didn't have a practical UX for exposing variables - we should still decouple tools from prompt as much as possible

#

I'm leaning on MCP as a simple proxy for a more general design rule.

#

it also applies for eg. composition of multiple objects & LLMs

nocturne laurel Mar 25, 2025, 9:22 PM

#

one trick I just did that helped a LOT was to expose a dummy currentSelection tool that just has "Your current selection: Container#1" as its description. With that change I no longer have to convince gpt-4o of anything after setting its state

wind dome Mar 25, 2025, 9:23 PM

#

nice 🙂

#

Yeah I think there are a LOT of unexplored tricks with tools alone

#

With MCP blowing up, I bet we're going to see a lot more of those tricks in the near future

nocturne laurel Mar 25, 2025, 9:24 PM

#

also, not sure if you saw but the system prompt and additional hints in llm-evals are only enabled for Gemini

#

once I get gpt-4o working without a prompt, I'll go back and try Gemini again without them

#

and maybe Gemini 2.5 which just came out won't need them either

tribal pewter Mar 25, 2025, 9:24 PM

#

nocturne laurel and maybe Gemini 2.5 which just came out won't need them either

it's the smartest gemini ever

nocturne laurel Mar 25, 2025, 9:25 PM

#

have you kicked the tires on it? how's it feel?

#

or is that just propaganda 😂

tribal pewter Mar 25, 2025, 9:27 PM

#

uh i just checked and its rate limit is 5RPM (vs 2000 on 2.0), so probably not ready for tool calling yet lol

nocturne laurel Mar 25, 2025, 9:28 PM

#

maybe each request takes a minute so it's fine :padme_right:

#

so it can "think"

tribal pewter Mar 25, 2025, 9:30 PM

#

it wrote me a short story about ordering pizza in 25 seconds

#

writing the story took 25 seconds, it wasn't a story about ordering pizza in that amount of time

fresh pivot Mar 25, 2025, 9:40 PM

#

mighty storm Oki, because i've been trying to remove the checks here: https://github.com/dagg...

yeah 🙂 that's the tricky bit, the error message protects from those segfaults or bad errors

#

you also need to fix that case to actually work

nocturne laurel Mar 25, 2025, 11:43 PM

#

ok maybe this test is being a little extreme, but this is a little spooky: (not all LLMs fail at this)

#

the test:

    weirdText := "-$@!&* BEGIN WEIRD FILE -$@!&*\nim some fun content\n---- END WEIRD FILE----"
    return withLLMReport(ctx,
        m.LLM().
            SetString("myContent", weirdText).
            SetString("desiredName", "/weird.txt").
            SetDirectory("dest", dag.Directory()).
            WithPrompt("I gave you a variable, a directory, and a filename. Can you write the content to the specified file in the directory?"),
        func(t testing.TB, llm *dagger.LLM) {
            content, err := llm.Directory().File("weird.txt").Contents(ctx)
            require.NoError(t, err)
            require.Equal(t, weirdText, content)
        })

#

i think the moral of the story is "don't set anything in the environment that you wouldn't trust an LLM to read and regurgitate accurately". the realistic workaround is probably to pass a File in instead.

wind dome Mar 26, 2025, 8:16 PM

#

@tribal pewter @mighty storm @bright ore @crystal monolith want to talk release while we wait for Alex?

#

Release window means we only have until tonight to merge everything

#

which seems impossibly tight given the current time & number of unknowns. But let's roll with it
