#mcp


tardy shadow
#

Just to confirm our understanding of the v2 of MCP:

  • Alex's current PR exposes one MCP server that exposes the graphQL API

The new implementation is:

  • dagger mcp can take 2 args:
    -- -m to specify a module
    -- -c to specify a command ? (not sure about that one)

Once the dagger mcp is running, it exposes the given module as an MCP server.

So from Alex's PR, we still expose the module

But, the underlying pipeline / plumbing is us creating an LLM instance of that given module + potentially the with-prompt / with-* binding ?

#

cc @proper ledge @slow sierra @clever oasis Just to be sure 🙏 (sorry for the direct ping)

slow sierra
#

Not 100% sure about the args

clever oasis
#

there's a lot of hand-waving in that second part 🙂

#

it would be pretty neat if it worked using the API somehow, since the BBI code is reliant on direct access to a *dagql.Server, which is more of an engine-side concept

slow sierra
#

We just went over the hand waving part 🙂 @tardy shadow and @proper ledge are taking a stab at it

#

gonna start from terminal as a transport and see where it leads

proper ledge
#

In the llm branch we see there's a core/bbi/gql driver that's not being used anywhere, but it already has some MCP server stuff. Do you think we should reuse this driver?

#

Ah I think this is probably related to @clever oasis 's implementation because i saw the same bbi tools in the PR, on the cli side

clever oasis
proper ledge
clever oasis
#

maybe BBIs can be implemented externally instead? thinkies - they're pretty high level tbh

proper ledge
#

right, we were thinking of doing this inside the engine and use Terminal or some session attachable to proxy

clever oasis
#

oh ok

tardy shadow
#

Do you think it makes sense to add this MCP integration as part of the LLM type, or create its own ?

clever oasis
#

dont think it matters much yet, hard to say from here 😅 - can always adjust

slow sierra
#

i recommend llm type

#

because otherwise you will have to duplicate the middleware magic

#

(dynamic type that changes based on other available types)

#

also, you shouldn't need to worry about bbi implementation

#

you should be able to only consume the bbi interface

#

to over-simplify: you want a new method in core.Llm that receives an io.Reader and io.Writer (or some other representation of a stdio stream), and instantiates and runs an MCP server with those streams as frontend and the bbi session as backend

#

note that the mcp.Tool type has a map[string]any field for its arguments schema that I think can be directly copied from bbi.Tool

#

(I'm willing to bet it's the same unmarshaled json schema under the hood)
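The shape described above (stdio streams as frontend, a tool-providing session as backend) could be sketched roughly as follows. This is a minimal sketch using only the standard library, not the engine code: the `Tool` and `Backend` types are hypothetical stand-ins for `bbi.Tool` and the BBI session, and the assumption that the arguments schema can be copied over as a `map[string]any` comes from the guess in the messages above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// Tool is a hypothetical stand-in for bbi.Tool; the real fields may differ.
type Tool struct {
	Name        string
	Description string
	Schema      map[string]any // unmarshaled JSON schema of the arguments
	Call        func(args map[string]any) (string, error)
}

// Backend is the only thing the server consumes: a list of tools.
type Backend interface{ Tools() []Tool }

type request struct {
	ID     json.RawMessage `json:"id"`
	Method string          `json:"method"`
	Params struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	} `json:"params"`
}

// Serve reads newline-delimited JSON-RPC requests from r and writes one
// response per handled request to w. Other methods are skipped in this sketch.
func Serve(r io.Reader, w io.Writer, b Backend) error {
	sc := bufio.NewScanner(r)
	enc := json.NewEncoder(w)
	for sc.Scan() {
		var req request
		if err := json.Unmarshal(sc.Bytes(), &req); err != nil {
			continue // ignore malformed lines
		}
		var result any
		switch req.Method {
		case "tools/list":
			tools := []map[string]any{}
			for _, t := range b.Tools() {
				tools = append(tools, map[string]any{
					"name":        t.Name,
					"description": t.Description,
					"inputSchema": t.Schema, // copied straight from the backend tool
				})
			}
			result = map[string]any{"tools": tools}
		case "tools/call":
			text := "unknown tool: " + req.Params.Name
			for _, t := range b.Tools() {
				if t.Name != req.Params.Name {
					continue
				}
				out, err := t.Call(req.Params.Arguments)
				if err != nil {
					out = err.Error()
				}
				text = out
			}
			result = map[string]any{"content": []map[string]any{{"type": "text", "text": text}}}
		default:
			continue
		}
		resp := map[string]any{"jsonrpc": "2.0", "id": req.ID, "result": result}
		if err := enc.Encode(resp); err != nil {
			return err
		}
	}
	return sc.Err()
}

// helloBackend exposes a single hard-coded tool, for illustration only.
type helloBackend struct{}

func (helloBackend) Tools() []Tool {
	return []Tool{{
		Name:        "hello",
		Description: "Greets the caller by name",
		Schema:      map[string]any{"type": "object"},
		Call: func(args map[string]any) (string, error) {
			return fmt.Sprintf("Hello %v!", args["name"]), nil
		},
	}}
}

// serveString runs Serve over in-memory streams instead of real stdio.
func serveString(input string) string {
	var out strings.Builder
	_ = Serve(strings.NewReader(input), &out, helloBackend{})
	return out.String()
}

func main() {
	if err := Serve(os.Stdin, os.Stdout, helloBackend{}); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Because `Serve` only sees an `io.Reader` and an `io.Writer`, the same loop could sit behind a terminal, a session attachable, or a test buffer without changes.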

tardy shadow
#

We've been blocked by the glibc issue this morning (that Erik gave a fix to 5 min earlier). We're currently trying to use the CLI as a proxy to the MCP server in the engine (under the LLM type's MCP() function for now)

To do that, for now hijacking the terminal sessionattachable

As soon as we have a hello world (MCP) working, we'll add the dagger modules to the server

dark egret
# tardy shadow <@430802613848506380> with <@707011193814122506> , we've been working on the BBI...

reading the code, still looks open ended as to what "hello world" means here, given that the hacked-in-MCP-tool is a stub. am i guessing correctly that y'all are hoping to wire dagger modules as tools back through to an LLM running in dagger? then maybe wire those out to a dagger-client-as-MCP-server (@clever oasis's PR) for external consumption?

not yet attacking running 3rd party MCPs inside dagger and wiring them through the same pipes?

GitHub

Signed-off-by: Tibor Vass

proper ledge
#

This is the hello world part: https://github.com/tiborvass/dagger/blob/08832a0bbc29006279802189fa81686efd3166d6/core/llm.go#L523

We are currently hard-coding a tool that simply returns Hello $name! but eventually yes, we will expose all the modules that the BBI instance will know about (right now it is limited to only one module, but that will change). This is all happening in the engine, and the CLI proxies it to the client.

GitHub

A programmable CI/CD engine that runs your pipelines in containers - tiborvass/dagger

proper ledge
dark egret
#

dope then i think the thing i'm contemplating hacking at might be non-overlapping... imagine i've got a a toy agent built in dagger calling modules as tools. i want that agent to interact with linear or gdrive, but i don't want to learn how their APIs work or figure out how to wire their API clients into my dagger SDK of choice. Both these things have MCP servers that already present these APIs to LLMs.

if i could configure and run those servers in module code and wire them up to my dagger-managed LLM, i can skip writing a bunch of module code.

GitHub

A server that integrates Linear's project management system with the Model Context Protocol (MCP) to allow LLMs to interact with Linear. - jerhadf/linear-mcp-server

#

if i'm understanding y'all's plans correctly, you're building all the pieces to easily turn module code into an MCP, but not the pieces that let you take an MCP off the shelf for use by a dagger LLM?

#

(hypothetically there's also a use case for an off-the-shelf MCP configured and running in dagger, but wired out to an external client, but that's maybe less exciting as a demo)

proper ledge
#

Correct

dark egret
#

sick

slow sierra
#

@dark egret just saw your update from this morning, you're working on MCP support also, complementary with what @tardy shadow and @proper ledge are doing?

proper ledge
#

He's doing client MCP and we're doing server MCP so it's complementary. That's what we figured out from the conversation last week (if I understood correctly)

slow sierra
#

oh nice

#

LLM.withMCPServer() or something like that? 🙂

#

cool

dark egret
#

it's kinda spooky to do it on top of so many stacked and unstable changes, so i'd rather get things onto trunk first anyways

slow sierra
#

yeah makes sense

dark egret
#

@proper ledge @tardy shadow cursor has a refresh button on their MCP UI that reloads tools, which at least implies that refreshing continuously isn't entirely out of reach...

also i just went and tried to corroborate @slow sierra 's statement that the MCP authors want to support dynamic tool changes, and i didn't even have to browse github issues, they've already got the notifications/tools/list_changed backchannel in their docs ... i don't think this existed last time i played with this

Model Context Protocol

Enable LLMs to perform actions through your server

proper ledge
#

Yes that's correct! I need to check if mcp-go implements list_changed, but it would make sense that they did.

clever oasis
dark egret
#

interesting

#

perhaps clients are just lagging behind servers, wouldn't be the first time

clever oasis
#

i think Claude Desktop is kind of just Baby's First MCP Client

dark egret
#

maybe dagger/BBI can be the first client that actually uses the whole protocol

#

i'd be curious if zed/cursor/goose use any more of the protocol than claude desktop... like if resources worked properly it might save a whole shit ton of pain around prompting the llm on how to use a given tool or set of tools

#

or their whole sampling server-completion thing

clever oasis
#

i've had a TODO to investigate that for a while now haha

dark egret
#

do you remember if cursor supported MCP at the hackathon? i think they sneakily launched support either like right before or right after, because in my head they still don't support it, but on git and in their docs it looks like they started supporting it on 1/30

clever oasis
#

sounds about right - pretty sure it came after. Hackathon was on 1/21

tardy shadow
#

which is kinda weird -- goose doesn't do that either

#

@clever oasis I didn't hear your response when Solomon was talking about the fact that those clients don't re-explore the types of complex objects (the way dagger does in the sync). Does it change something for the mcp implem ? We'll explore solutions on our end too

#

Currently digging other client's code to ensure that what I'm saying is true -- but from empirical tests it's the case

clever oasis
dark egret
clever oasis
#

that's what i thought but the context is the cursor 'refresh tools' button thinkies

dark egret
#

Does it change something for the mcp implem ?
it might make us necessary champions of MCP notifications/tools/list_changed, otherwise i suspect MCP servers wired out from dagger to goose or claude desktop or whatever will need to be obscenely careful about restricting their return types to the point of uselessness

#

like i doubt models today can handle us giving them our entire core API as tools in all of its nested inter-relatedness, much less all the additional APIs implied by custom module return types... the thing where we refresh tools based on available objects to manipulate is a big part of what makes llm.Sync work

slow sierra
#

UPDATE 👆 I clarified in a live conversation with Guillaume and Tibor that the issue is a) their MCP implementation doesn't implement notifications/tools/list_changed, and b) it's unclear which clients actually honor that

tardy shadow
slow sierra
#

Maybe we should provide the first MCP client that does it properly? 😛

#

That would be hilarious

#

Here is 1) a MCP server implementation that leverages the power of dynamic tools + 2) a MCP client that actually can use it

tardy shadow
slow sierra
#

@tardy shadow first step is to implement that protocol on server side right?

tardy shadow
# slow sierra <@274903880343748619> first step is to implement that protocol on server side ri...

Yes

So, we're actually notifying at the moment (the library does it), but only at the MCP server initialization for now

What we want in our MCP server support is an equivalent of tools.List(), which would be:

  • every time a tool call responds (with a different type), also send this notification and add the new tools to the server

I'm implementing that atm, exploring how the mcp-go lib handles it. The lib has a handler concept that might need some stretching, or a small change upstream (the maintainer seems quite open to contributions), but a temporary fork is doable too
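The "notify whenever a tool call unlocks new tools" behaviour described above could look roughly like this. It is a sketch under assumptions, not the mcp-go API: `Registry`, `OnToolResult`, and the `toolsForType` table are hypothetical names, and only the notification method string `notifications/tools/list_changed` is taken from the protocol docs mentioned in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"sort"
	"strings"
	"sync"
)

// Registry tracks the tools currently exposed and tells the client when
// that set changes.
type Registry struct {
	mu    sync.Mutex
	tools map[string]bool
	out   io.Writer // client-facing side of the transport
}

func NewRegistry(out io.Writer) *Registry {
	return &Registry{tools: map[string]bool{}, out: out}
}

// Add registers tool names and, only if at least one was new, emits a
// notifications/tools/list_changed message so the client can re-list.
func (r *Registry) Add(names ...string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	changed := false
	for _, n := range names {
		if !r.tools[n] {
			r.tools[n] = true
			changed = true
		}
	}
	if !changed {
		return nil
	}
	return json.NewEncoder(r.out).Encode(map[string]any{
		"jsonrpc": "2.0",
		"method":  "notifications/tools/list_changed",
	})
}

// List returns the current tool names in sorted order.
func (r *Registry) List() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	names := make([]string, 0, len(r.tools))
	for n := range r.tools {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// toolsForType is a hypothetical lookup from a returned object type to the
// tools that become available once an object of that type exists.
var toolsForType = map[string][]string{"Hello": {"Hello_alpine"}}

// OnToolResult should be called after every tool call with the type of the
// object it returned.
func (r *Registry) OnToolResult(typeName string) error {
	return r.Add(toolsForType[typeName]...)
}

// demo returns how many notifications were sent and the final tool list.
func demo() (int, string) {
	var wire strings.Builder
	r := NewRegistry(&wire)
	_ = r.Add("hello", "selectHello") // initial tool set
	_ = r.OnToolResult("Hello")       // first Hello object: new tools appear
	_ = r.OnToolResult("Hello")       // same type again: nothing changes
	n := strings.Count(wire.String(), "notifications/tools/list_changed")
	return n, strings.Join(r.List(), ",")
}

func main() {
	n, tools := demo()
	fmt.Println(n, tools)
}
```

Sending the notification only when the set actually changes keeps clients that do honor it from re-listing after every call.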

#

Sorry for the confusion this morning -- we needed to double check. I doubt though that goose handles it either. Will triple check once done

slow sierra
#

It would be funny (but maybe not practical) to have a "mcp/llm adapter" 🙂

#

ie a double-ended proxy: it's a mcp proxy but also a llm proxy

#

and it connects the mcp clients to the llm clients

tardy shadow
slow sierra
#

no this is different - this is purely to implement the mcp server side

#

it's just that dagger already has the code to dynamically notify a llm server of new tools

#

what if we think of MCP as a complicated way to reach a LLM that you can't actually connect to

#

(ie the LLM behind goose)

#

we can't prompt it

#

but we can give it tools

#

(dropping that whole train of thought - ignore)

tardy shadow
#

Ok, goose supports dynamic loading -- testing it out atm

slow sierra
#

did chatgpt get it right?

tardy shadow
# slow sierra

mmmh, I'm not on the nightly so surprised. But, notifications seem to have updated the list tool state (dynamic extension in a given session)

dark egret
dark egret
slow sierra
dark egret
tardy shadow
#

Coupled with the golden tests, it's a nice foundation for our test suite

slow sierra
#

@tardy shadow @dark egret would love to discuss with you both tomorrow 👀👀

slow sierra
tardy shadow
#

MCP coming on VSCode very soon

slow sierra
#

nice

slow sierra
#

https://news.ycombinator.com/item?id=43488151

I think there is a fundamental misunderstanding that MCP services are plug and play. They are not. Function names and descriptions are literally prompts so it is almost certain you would need to modify the names or descriptions to add some nuances to how you want these to be called. Since MCP servers are not really meant to be extensible in that sort of way, the only other alternative is to add more context into the prompt which is not easy unless you have a ton of experience. Most of our customers fail at prompting.

_pdp_


slow sierra
zinc glacier
#

Hi team, with MCP support released in vscode last week (https://github.com/microsoft/vscode-docs/blob/vnext/docs/copilot/chat/mcp-servers.md), a large new surface area has opened up for building agentic dev workflow solutions right inside the IDE. I've been prototyping some dev workflow scenarios with dagger modules and made some demo-able progress, and since I just found out about this vscode MCP preview, I'm eager to build an end-to-end dev experience through this portal. The most intuitive way of integrating it with my dagger module would be leveraging dagger as an MCP server, I guess? Which leads me to this thread for an MCP support update: is there any concrete timeline for when this is going to be available? Happy to hear more insights too 🙂

GitHub

Public documentation for Visual Studio Code. Contribute to microsoft/vscode-docs development by creating an account on GitHub.

slow sierra
#

@zinc glacier we agree 🙂 we are going all out on MCP support

#

experimental MCP server support is merged in main. You can run dagger mcp -m MODULE to expose any Dagger module as a MCP server with zero additional code

#

we're going to add client support soon cc @dark egret @tardy shadow

#

I think the most exciting use cases for Dagger & MCP will be:

  • Packaging and distribution of MCP servers cross-platform
  • End-to-end testing of MCP servers, with multi-model evals
  • MCP composition. Right now MCP servers are monolithic and can't really be composed with each other. Dagger's object model and typed composition change that
  • Multi-agent. Your MCP server may also be an agent. Dagger can be a runtime for both
  • End-to-end observability. Get a deep trace from LLM to MCP, all the way to individual function granularity
zinc glacier
#

My bad! Actually I did read through the MCP thread here but missed the latest msg in the agent main channel 😅 where you guys had the PR merged already!

slow sierra
dark egret
zinc glacier
dark egret
#

got the wiring up to the buildkit part (read: the hard part lol)

dark egret
#

rubberduck warning, long rambling message that is largely me trying to concretize the 10 billion LoC I've read today so it's not a waste of time tomorrow rubberduck

the pipe-y bits of mcp client functionality @clever oasis @thorn sand @proper ledge @tardy shadow:

trying to do stdio mcp servers first. ideally we are able to use a 3rd party MCP client impl like go-mcp, but coupling there between the client impl and exec.Cmd may force us to use a different lib or roll our own. ANYWAYS:

the hard part is the IPC element: in the engine process, i want to be able to call api methods and have the underlying impl talk to a bk-ran container on stdin and stdout. there are lots of session-attachable-adjacent things that do similar comms, but most of those are about exchanging data between an interactive mainclient and the engine. here i'm trying to communicate between the engine and a noninteractive bk-exec'd process. having dug through the code, there's definitely something sorta analogous on the backend of our terminal impl, but i'm not sure which of these MANY pipe/listen/forward situations in our codebase might be ones that actually wire up a bk process's stdios through the engine, even if they land somewhere else like a session attachable...

for terminal, does connect-to-bk-container-stdio always go through Service.startContainer? the terminal code makes me think that it does... whenever i trace down the stack from that Start call, I find all the schema callsites where one can start a container and svcID is passed as a callID of a directory or a container or whatever, and at the other end of the stack looks to be svc.startContainer's

svcProc, err := gc.Start(execCtx, bkgw.StartRequest{
        Args:         execOp.Meta.Args,
        Env:          env,
        Cwd:          execOp.Meta.Cwd,
        User:         execOp.Meta.User,
        SecretEnv:    execOp.Secretenv,
        Tty:          interactive,
        Stdin:        stdinCtr,
        Stdout:       stdoutCtr,
        Stderr:       stderrCtr,
        SecurityMode: execOp.Security,
    })

which looks like the only place we throw stdio pipes into buildkit, other than debugContainer... so do y'all think I should put stdio MCP servers through svc.startContainer too? provided this reading of the code is correct, sessionattachables are fully orthogonal to all this.
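Independent of how the pipes get wired through buildkit, the engine-side half of a stdio MCP client only needs a writer for the process's stdin and a reader for its stdout. The sketch below assumes newline-delimited JSON-RPC and uses in-memory pipes with a fake server in place of a bk-exec'd container; `StdioClient`, `fakeServer`, and `roundTrip` are hypothetical names, not code from the engine or from any MCP library.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
)

// StdioClient speaks newline-delimited JSON-RPC over a process's stdio.
type StdioClient struct {
	stdin  io.Writer
	stdout *bufio.Reader
	nextID int
}

func NewStdioClient(stdin io.Writer, stdout io.Reader) *StdioClient {
	return &StdioClient{stdin: stdin, stdout: bufio.NewReader(stdout)}
}

type response struct {
	ID     *int            `json:"id"`
	Result json.RawMessage `json:"result"`
	Error  *struct {
		Message string `json:"message"`
	} `json:"error"`
}

// Call sends one request and waits for the response carrying the same id,
// skipping notifications that arrive in between.
func (c *StdioClient) Call(method string, params any) (json.RawMessage, error) {
	c.nextID++
	req := map[string]any{"jsonrpc": "2.0", "id": c.nextID, "method": method, "params": params}
	if err := json.NewEncoder(c.stdin).Encode(req); err != nil {
		return nil, err
	}
	for {
		line, err := c.stdout.ReadBytes('\n')
		if err != nil {
			return nil, err
		}
		var resp response
		if err := json.Unmarshal(line, &resp); err != nil {
			return nil, err
		}
		if resp.ID == nil || *resp.ID != c.nextID {
			continue // a notification, or a reply to some other request
		}
		if resp.Error != nil {
			return nil, fmt.Errorf("%s: %s", method, resp.Error.Message)
		}
		return resp.Result, nil
	}
}

// fakeServer plays the role of the containerized MCP server: for every
// request it first emits a notification, then an empty tool list.
func fakeServer(in io.Reader, out io.Writer) {
	sc := bufio.NewScanner(in)
	for sc.Scan() {
		var req struct {
			ID int `json:"id"`
		}
		if json.Unmarshal(sc.Bytes(), &req) != nil {
			continue
		}
		fmt.Fprintln(out, `{"jsonrpc":"2.0","method":"notifications/tools/list_changed"}`)
		fmt.Fprintf(out, "{\"jsonrpc\":\"2.0\",\"id\":%d,\"result\":{\"tools\":[]}}\n", req.ID)
	}
}

// roundTrip wires client and fake server together with in-memory pipes.
func roundTrip() string {
	stdinR, stdinW := io.Pipe()
	stdoutR, stdoutW := io.Pipe()
	go fakeServer(stdinR, stdoutW)
	defer stdinW.Close()
	res, err := NewStdioClient(stdinW, stdoutR).Call("tools/list", map[string]any{})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(res)
}

func main() {
	fmt.Println(roundTrip())
}
```

Keeping the client ignorant of where the pipes come from means the same code would work whether they end up provided by svc.startContainer, exec.Cmd, or something else.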

GitHub

Description Build the SDK to create a new MCP Client form scratch Addtitional Information Look at the offcial MCP SDKs: Python: https://github.com/modelcontextprotocol/python-sdk?tab=readme-ov-file...

GitHub

An engine to run your pipelines in containers. Contribute to cwlbraa/dagger development by creating an account on GitHub.


proper ledge
#

@dark egret Possibly you could reuse our PipeIO session attachable from our MCP PR. You'd still need to wire it to the buildkit side of things, i can take a look at it if you want. Or we can jump on a call and you show me your code.

dark egret
#

that's where i started, and im trying to rule it out as orthogonal

slow sierra
#

I like the strategy of starting with one specific transport (stdio). Don't forget to keep an eye out for generalization at the same time - a delicate balance I know

#

especially with recent changes to the protocol to allow stateless HTTP - that's going to become increasingly popular in the future (ie. Cloudflare, Vercel & others selling mass backend hosting for them etc)

dark egret
proper ledge
#

Are you planning on hooking up to already running MCP servers ?

slow sierra
#

@clever oasis pushed

dark egret
# proper ledge Are you planning on hooking up to already running MCP servers ?

sorta, i'm planning on having bk solve user-provided container definitions and then wire those up to the engine-orchestrated LLM

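// MCPGSearch returns a container that runs the g-search-mcp server via npx.
func (m *McpCaller) MCPGSearch() *dagger.Container {
    return m.WolfiNode().WithExec([]string{"npx", "-y", "g-search-mcp"})
}

// MCPAWS returns a container based on a prebuilt AWS MCP server image.
func (m *McpCaller) MCPAWS() *dagger.Container {
    return dag.Container().From("ghcr.io/alexei-led/aws-mcp-server:latest")
}

// LLMWithMCP attaches both containers to an LLM and asks it to list its tools.
// WithMCP is the proposed method under discussion here.
func (m *McpCaller) LLMWithMCP() *dagger.LLM {
    return dag.LLM().
        WithMCP(m.MCPGSearch()).
        WithMCP(m.MCPAWS()).
        WithPrompt("list your available tools")
}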
#

wrote those before i started digging through this service thing, though, and if these all go through svc, you'd prolly do ctr.AsService(stdio: true) to make the semantics more clear?

proper ledge
#

Ok so if LLM is the MCP client, and the MCP servers live in the container, then yeah i don't think you need any of the session attachable stuff. (sorry i'm slow, just catching up). And now your question is what's the best way to expose an MCP server in a container, to the engine's LLM correct ?

dark egret
#

correct!

#

although im not entirely sure i'd expose the stdio bool all the way out through the api on the first pass... i'm not sure it has any userspace applications until there's an equivalent flag on dagger call myService up, and i think implementing that would probably use y'all's pipeio session attachable

#

but that's like step 4 lol, i'm on step 0 still

#

yeah i think a__ prefix hides it from codegen? or at least it can for functions...

#

or maybe it's fine to expose, it'd definitely be fine to expose if it's part of the actual interface, like WithMCP(m.MCPGSearch().AsService(stdio: true))

proper ledge
#

if you're doing step 0, do whatever works 😄

dark egret
#

anyways tomorrow imma start with tryna wire this through Service. this morning i was trying to do it inside engine/buildkit/client.go but that did not yield fruit

proper ledge
#

happy to help, i'm still new to the code but can dig around

dark egret
#

❀️ appreciate it, i am also still pretty new to it and i pretty consistently get on these deep branches where i want somebody to sanity check me so i don't spend multiple days working off of bad assumptions lol

dark egret
proper ledge
#

@dark egret we haven't considered other libs as it seemed to be the official one. It was making releases every couple of hours that's how fast it was changing. My main issue was not having a DefaultArray and the fact that cmd.Stderr was not being set which echoes what you're saying about it not being too flexible, but i thought of pushing a PR to make it so. I already have a PR out for the DefaultArray it just needs tests. I don't really care tbh

dark egret
proper ledge
#

yeah i didnt really need it other than to debug. Will try to think of a way

#

FYI we think env api broke our dagger mcp cli. We're investigating

proper ledge
proper ledge
#
dagger --progress plain -c 'h=$(github.com/shykes/hello); e=$(env | with-module-input hello $h "module to expose as an MCP server"); llm | with-env $e'

yields this error:

error: parse selections: parse field "withModuleInput": init arg "value" value as dagql.ID[*github.com/dagger/dagger/core.Module] (ModuleID!) using dagql.ID[*github.com/dagger/dagger/core.Module]: expected "Module" ID, got Hello! ID

Does this mean a Hello ID is not a Module ID from the Env's perspective ?

#

@clever oasis any ideas? (sorry for the endless pings)

clever oasis
proper ledge
#

ah, that would make more sense indeed

dark egret
#

early adopter penalties are funny, at least we're not actually meaningfully late to the party

proper ledge
#

I'll try to fork marklabs and get it to work for you, if we can upstream then good

dark egret
#

happy to help with that too once i've got a demo working

proper ledge
#

i also wanna make sure we can pass a custom httpserver to the sseserver because it's simply hiding it rn

#

Yeah i'd say just get it working and we can clean things up

dark egret
proper ledge
#

yeah it feels like they were content using the Rob Pike options design pattern, and declare themselves "idiomatic Go" 😄

dark egret
#

some of it is also clearly AI slop, and coding via giving claude instructions and TS code references to translate makes you way more likely to get something that works but doesn't compose properly

proper ledge
#

oh i didnt even think of that but i think you're probably right

dark egret
#

you can tell by the abundance of correctly-capitalized comments lol

dark egret
#

tomorrow i gotta play around with how these might interact with Environment

dark egret
#

the mcp client lib woes never end, the one i finally got working uses a default bufio.Scanner, max token size 64kb, and if a single response is longer than that it blocks indefinitely lolsob
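The 64 KB ceiling mentioned above is the default `bufio.MaxScanTokenSize`. With a default scanner an oversized line makes `Scan` return false with `bufio.ErrTooLong`; if a read loop then exits quietly, a caller waiting on that response would presumably wait forever, which may be the hang described here (that part is a guess about the library, not something verified). A minimal sketch of the limit and of raising it with `Scanner.Buffer`:

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// readLine scans one line from input and returns its length in bytes.
// maxToken <= 0 keeps the scanner's default 64 KB token limit.
func readLine(input string, maxToken int) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	if maxToken > 0 {
		// Start with a 64 KB buffer and let it grow up to maxToken.
		sc.Buffer(make([]byte, 0, 64*1024), maxToken)
	}
	if sc.Scan() {
		return len(sc.Bytes()), nil
	}
	return 0, sc.Err()
}

// tooLong reports whether err is the scanner's "token too long" error.
func tooLong(err error) bool { return errors.Is(err, bufio.ErrTooLong) }

// bigLine builds a single newline-terminated line of n payload bytes.
func bigLine(n int) string { return strings.Repeat("x", n) + "\n" }

func main() {
	line := bigLine(100 * 1024) // larger than the default limit

	_, err := readLine(line, 0)
	fmt.Println("default scanner error:", err)

	n, err := readLine(line, 1<<20)
	fmt.Println("enlarged scanner:", n, "bytes, error:", err)
}
```

An alternative that avoids a fixed ceiling altogether is reading with `bufio.Reader.ReadBytes('\n')`, at the cost of unbounded memory use for a misbehaving server.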

clever oasis
slow sierra
dark egret
slow sierra
#

Zed launched their new agent beta. Seems like MCP support is in. Strongly recommend joining their discord

#

There's an "agent-beta" channel there

dark egret
proper ledge
#

signed up!

dark egret
# slow sierra https://discord.com/invite/qSDQ8VWc7k

grrrrr now i'm hanging out in here watching ppl be all excited... the review multibuffer sounds like it's the exact polish i've been wishing for on top of the assistant editing shit i've already been doing in neovim for a year now

dark egret
#

woooo got an invite

dark egret
#

it's pretty slick, combo'd with claude 3.7 sonnet thinking it's quite good at code navigation out the box, like surprisingly little coaxing required to get it to load the right shit

#

the new zed agent's default toolset in "ask" mode:
Based on the function definitions currently available to me, I have access to the following tools:

  1. diagnostics: Gets errors and warnings for the project or a specific file.

    • Can check a specific file or provide a project-wide summary
    • Helps identify issues in your code that need fixing
  2. path_search: Finds files matching a glob pattern in the project.

    • Returns matching paths in paginated results (50 matches per page)
    • Useful for locating files with specific patterns or extensions
  3. thinking: A scratchpad for problem-solving without executing actions.

    • Helps work through complex problems or develop strategies
  4. fetch: Retrieves content from a URL and returns it as Markdown.

  5. read_file: Reads the content of files in the project.

    • Can read entire files or specific line ranges
    • For large files, returns an outline of symbols with line numbers
  6. now: Returns the current datetime in RFC 3339 format.

    • Can use either UTC or local timezone
  7. regex_search: Searches the entire project for text matching a regular expression.

    • Returns matching paths with excerpts containing matches
    • Results are paginated (20 matches per page)

These tools help me understand, navigate, and analyze your codebase to provide better assistance.

#

and in "write" mode

Based on the function definitions currently available to me, I have access to the following tools:

1. **find_replace_file**: Edits a specific part of a file by finding unique text and replacing it.
   - Requires unique text to find (with context) and replacement text
   - Good for targeted changes to files

2. **diagnostics**: Gets errors and warnings for the project or a specific file.
   - Can check a specific file or provide a project-wide summary
   - Helps identify code issues that need fixing

3. **code_symbols**: Provides an outline of code symbols in the project.
   - Can show public symbols across the project or all symbols in a specific file
   - Shows line number information for navigating large files

4. **symbol_info**: Provides detailed information about code symbols using LSP.
   - Finds declarations, definitions, references, implementations, and type definitions
   - More reliable than text searches for code navigation

5. **path_search**: Finds files matching a glob pattern in the project.
   - Returns matching paths in paginated results

6. **thinking**: A scratchpad for problem-solving without executing actions.
   - Useful for working through complex problems

7. **fetch**: Retrieves content from a URL and returns it as Markdown.

8. **read_file**: Reads the content of files in the project.
   - Can read entire files or specific line ranges

9. **create_file**: Creates a new file with specified content.
   - The most efficient way to create new files

10. **bash**: Executes a bash command and returns the output.
    - Must use the `cd` parameter to navigate to a project directory

11. **now**: Returns the current datetime in RFC 3339 format.

12. **batch_tool**: Runs multiple tool calls either sequentially or concurrently.
    - Efficient for complex operations requiring multiple steps

13. **regex_search**: Searches the project for text matching a regular expression.
    - Returns matching paths with excerpts containing matches
dark egret
#

out the gate i cannot get it to connect to my configured context servers, neither custom ones nor ones installed by extensions... i wanna give it DaggerDev so it can run tests 😭

#

also switching back and forth between ask and write is dope, afaict it keeps the built up history but changes out the tools... like you can get it on the right track with ask and then switch to write: disable thinking and enable all the actual writing tools

#

prolly the 2 most interesting tools:

  1. thinking: A scratchpad for problem-solving without executing actions.
    • Useful for working through complex problems
  2. batch_tool: Runs multiple tool calls either sequentially or concurrently.
    • Efficient for complex operations requiring multiple steps
clever oasis
#

@dark egret I've been tinkering with adding a think tool - wasn't a silver bullet or anything but pretty interesting, planning to try again

#

batch is interesting too

clever oasis
dark egret
#

implying that hey, maybe thinking is bad for actually doing shit, but good when you're explicitly not modifying state

dark egret
#

got dagger mcp -m ~/src/dagger wired into the zed agent. as expected and similar to many other MCP clients, zed doesn't seem to respect tools changing with selections. even if it did, there's some UX that's not gonna be cooperative: per-profile, users are expected to manually select which additional tools they want to add - so if the agent calls select, it's gonna add to a list of tools, but then a human has to look at the list via a UI selector.

funnily, it can default-instantiate and select DaggerDev, and when it does, those tools tell it about additional tools it should've gained, but it doesn't ever get them. cc @deft osprey

#

@tardy shadow @proper ledge how confident are we that we're actually sending notifications/tools/list_changed btw? reading code it looks like it should happen when calling mcp-go.MCPServer.SetTools, but im curious if y'all have seen the notification go across the "wire" with your own eyes

tardy shadow
dark egret
#

no rush, i confirmed in zed code they are definitely just another client on the list of clients that dont handle the notification

GitHub

GitHub is where people build software. More than 150 million people use GitHub to discover, fork, and contribute to over 420 million projects.

tardy shadow
deft osprey
dark egret
#

im not sure what i'm suspicious of but i'm suspicious

deft osprey
#

same, I am suspicious of that too

proper ledge
#

investigating

#

@dark egret we should probably add an env var or a flag that logs the pipe to a file
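One way the "log the pipe to a file" idea could be done is by wrapping the stdio pair so that every read and write is also copied to a log. This is a sketch only: the `MCP_PIPE_LOG` variable name and the `LogPipe` helper are hypothetical, and the echo loop in `main` merely stands in for the real server.

```go
package main

import (
	"io"
	"os"
	"strings"
)

// prefixWriter tags each chunk so both directions can share one log.
// The prefix is added per write, which is per line only if the peer
// writes whole lines at a time.
type prefixWriter struct {
	prefix string
	w      io.Writer
}

func (p prefixWriter) Write(b []byte) (int, error) {
	if _, err := io.WriteString(p.w, p.prefix); err != nil {
		return 0, err
	}
	if _, err := p.w.Write(b); err != nil {
		return 0, err
	}
	return len(b), nil
}

// LogPipe wraps a stdio pair so that everything read from in and written
// to out is also copied to log.
func LogPipe(in io.Reader, out, log io.Writer) (io.Reader, io.Writer) {
	return io.TeeReader(in, prefixWriter{"<- ", log}),
		io.MultiWriter(out, prefixWriter{"-> ", log})
}

// demoLog pushes one message in each direction and returns the log.
func demoLog(inMsg, outMsg string) string {
	var log strings.Builder
	in, out := LogPipe(strings.NewReader(inMsg), io.Discard, &log)
	_, _ = io.ReadAll(in)
	_, _ = io.WriteString(out, outMsg)
	return log.String()
}

func main() {
	var in io.Reader = os.Stdin
	var out io.Writer = os.Stdout
	// MCP_PIPE_LOG is a hypothetical variable name used for illustration.
	if path := os.Getenv("MCP_PIPE_LOG"); path != "" {
		f, err := os.Create(path)
		if err != nil {
			panic(err)
		}
		defer f.Close()
		in, out = LogPipe(in, out, f)
	}
	_, _ = io.Copy(out, in) // echo stands in for the real server loop
}
```

Logging to a file rather than stderr keeps the capture out of the way of clients that surface the server's stderr to the user.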

tardy shadow
#

On Goose, dynamic seems to be properly working ; here is the full session -- in parallel, triple checking on our custom client to ENSURE that it's indeed via the notification that it's updated

#

Confirming that we see the notifications/tools/list_changed notification on the wire cc @dark egret

 OPENAI_API_KEY=toto dagger_dev mcp -s -m /tmp/hello
overriding 'auto' progress mode to 'plain' to avoid interference with mcp stdio
Exposing module "hello" as an MCP server on standard input/output
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{},"clientInfo":{"name":"goose","version":"1.0.12"},"protocolVersion":"1.0.0"}}
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","capabilities":{"tools":{}},"serverInfo":{"name":"Dagger","version":"0.0.1"}}}

{"jsonrpc":"2.0","id":2,"result":{"tools":[{"description":"module to expose as an MCP server","inputSchema":{"type":"object","properties":{}},"name":"hello"},{"description":"Select a Hello by its ID.\n\nProvides the following tools:\n\n- Hello_alpine","inputSchema":{"type":"object","properties":{"id":{"description":"The Hello ID to select, in \"Hello#number\" format.","type":"string"}},"required":["id"]},"name":"selectHello"}]}}
{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"arguments":{},"name":"hello"}}                           
{"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"{\"selected\":\"Hello#1\"}"}]}}
{"jsonrpc":"2.0","method":"notifications/tools/list_changed","params":{}}
#

I guess it's not very helpful -- yes our implementation is compliant -- BUT, unusable by most mcp clients ... Goose works, Claude Desktop seems to have been broken by the multi-object (or I've been hallucinating in the past)

dark egret
dark egret
#

might work in reverse for stdin too lol

deft osprey
tardy shadow
tardy shadow
deft osprey
# tardy shadow Could you please send me the module you're running the mcp server on 🙏 (or the ...

Simple module "dagbox" that has one function:

// Calls to curl to get the contents of a URL and returns the result
func (m *Dagbox) Curl(ctx context.Context, url string) (string, error) {
    return dag.Container().
        From("cgr.dev/chainguard/wolfi-base:latest").
        WithExec([]string{"apk", "add", "--no-cache", "curl"}).
        WithExec([]string{"curl", "-L", url}).
        Stdout(ctx)
}

Config for Claude Desktop:

{
  "mcpServers": {
    "dagbox": {
      "command": "/Users/shad/forks/dagger/hack/with-dev",
      "args": [
        "dagger",
        "mcp",
        "-m",
        "/Users/shad/sandbox/dagger-example"
      ],
      "env": {
          "ANTHROPIC_API_KEY": "xxx",
          "OPENAI_API_KEY": "xxx"
      }
    }
  }
}

Prompt: "Fetch the content of the URL "github.com" and give me the body."

Result:

View result from selectDagbox from dagbox (local)

{
  `id`: `Dagbox#1`
}
{"previous":"Dagbox#1","selected":"Dagbox#1"}

View result from Dagbox_curl:

{
  `url`: `https://github.com`,
  `method`: `GET`
}
Tool 'Dagbox_curl' not found.
#

Specifying "with-dev" as command is my hack to use local dagger, that part works, at least

dark egret
tardy shadow
#

So, in order for it to work, you need to:

  1. run dagbox -> prompt: "use dagbox and list all the tools you have access to"
  2. select dagbox to jump into the context of the module (the thing that @dark egret is changing as part of his PR https://github.com/dagger/dagger/pull/10122) -> prompt: "select dagbox and Fetch the content of the URL "github.com" and give me the body."
tardy shadow
deft osprey
#

Ok, will try and let you know. Thanks for digging into it

slow sierra
#

@dark egret @tardy shadow @proper ledge @clever oasis after the call, can we bikeshed that "privileged" / selectRoot / core API access ux problems, because it seems like there are many parallel threads right now and maybe going in circles

dark egret
slow sierra
#

OK let's talk about that end state then

#

I'm on team audio if you have any juice left for this

slow sierra
#

@dark egret @floral shuttle @tardy shadow @proper ledge @clever oasis @deft osprey quick braindump of dependencies we discussed for end-to-end awesome goose:

  • "shell parity" -> for a given environment, the interface presented to the model over MCP should mirror closely the interface presented to a human user over the shell
  • context awareness -> goose informs dagger of the context directory to load, dynamically, without having to change config (using roots? other goose-specific protocol? hack for the demo?)
  • make a decision on "privileged": what does it mean, and how does the goose user configure it?
  • example app: we need a realistic demo scenario that involves a daggerized app, and a realistic dev loop with build, test, lint etc. It should not be the Dagger-dev module itself, too meta. Maybe the one from our quickstart?
  • model performance. generally getting our tool mapping implementation to a point where it works at its best. If that means restricting parameters (only this model, etc) then so be it
  • secrets injection. How to inject secrets into objects, eg. github token into a github module. At the moment dagger mcp doesn't support that
  • dagger -s -> last I tried, the -s flag was mandatory to work with goose
dark egret
tardy shadow
slow sierra
dark egret
tardy shadow
# dark egret https://linear.app/dagger/issue/PROP-77/mcp-support-demo (i suspect tibor can't ...

Hey @bleak drift ,

We're working on a simple but realistic MCP demo this week, and before diving into the flow, we want to make sure expectations are aligned.

As a first draft, we're building from the hello-dagger CI example, aiming for the most intuitive experience (free of any known issues in dagger main).

Here's what we're imagining for the baseline:

1. dagger # enter shell mode in the context of the hello-dagger module  
2. Run `test` or `publish` directly in shell mode # current documented flow  
3. Switch to prompt mode  
4. Ask: `run the tests and summarize the output`

This should work out of the box and serve as the foundational demo from the docs.

Now, we'd love your input, along with @dark egret and @proper ledge, on how we could evolve this into a more complete and immersive experience, as Solomon suggested here:
https://linear.app/dagger/issue/PROP-77/mcp-support-demo#comment-27ac7cad

floral shuttle
# tardy shadow Hey <@135620352201064448> , We're working on a simple but realistic MCP demo th...

I was just trying Claude desktop and couldn't get it to call "core" functions. Any tips? I also tried to have two tools, Golang as one (without --env-privileged) and Core as another. Have a video.

{
  "mcpServers": {
    "Golang": {
      "command": "dagger",
      "args": [
        "-s",
        "-m",
        "github.com/jpadams/mark-daggerverse/go",
        "mcp",
        "--env-privileged"
      ],
      "env": {
        "OPENAI_API_KEY": "anything",
        "DAGGER_CLOUD_TOKEN": "dag_dagger_XXXXXXXXX"
      }
    }
  }
}
tardy shadow
# floral shuttle

Hey Jeremy, thanks for the video 🙏

What I suspect: "Dynamic tooling doesn't work" in Claude Desktop -- So, on the second time you ask for the tools, it's actually lying to you, as in a Dagger sense, we probably changed scope and thus available tools (which Claude Desktop does not refresh)

Can you please retry with this prompt, "let's first create a new empty directory and call it source", directly from the beginning?

#

I would recommend just testing on goose -- that's the only client that we know for sure dynamically updates / refreshes the tools atm

#

btw, -s is not necessary anymore

floral shuttle
#

cool! not having much success with goose either... but

tardy shadow
floral shuttle
#

oh, actually works much better after removing the -s on goose

#

still need a -m even for a dedicated Core tool, right?

tardy shadow
#

So yes, you're right sorry

proper ledge
#

if you're using 0.18.2+, you no longer need -s and that has no impact on your demo. If you had -s problems it wouldn't even have started.

--env-privileged is in v0.18.3

tardy shadow
proper ledge
#

so no need for -s

floral shuttle
#

weird side effect of not having Container().From() is that my LLM tried to create the golang container I asked for by Dockerfile 😄

#
Here are the functions available for **Container#1**:

### Container Functions

- **Container_asService**: Turn the container into a service with optional exposed ports.
- **Container_asTarball**: Package the container state as an OCI image and return it as a tar archive.
- **Container_directory**: Retrieve a directory from the container's root filesystem.
- **Container_exitCode**: Get the exit code of the last executed command.
- **Container_file**: Retrieve a file at a specified path from the container.
- **Container_publish**: Package the container state as an OCI image and publish it to a registry.
- **Container_stderr**: Access the buffered standard error stream of the last executed command.
- **Container_stdout**: Access the buffered standard output stream of the last executed command.
- **Container_withDefaultArgs**: Configure default arguments for future commands.
- **Container_withDirectory**: Add a directory to the container's filesystem.
- **Container_withEntrypoint**: Set an OCI-style entrypoint for the container.
- **Container_withEnvVariable**: Set a new environment variable in the container.
- **Container_withExec**: Execute a command in the container and return a new snapshot of the container state.
- **Container_withExposedPort**: Expose a network port for the container.
- **Container_withFile**: Add a file to the container's filesystem.
- **Container_withLabel**: Add a label to the container.
- **Container_withNewFile**: Add a new file to the container's filesystem.
- **Container_withRegistryAuth**: Attach credentials for publishing to a registry.
- **Container_withSecretVariable**: Set an environment variable with a secret value.
- **Container_withServiceBinding**: Establish a runtime dependency on a network service.
- **Container_withUnixSocket**: Forward a socket to a specified Unix socket path in the container.
- **Container_withUser**: Use a different user for the command.
- **Container_withWorkdir**: Change the container's working directory.
- **Container_withoutDirectory**: Remove a directory from the container's filesystem.
- **Container_withoutEntrypoint**: Reset the container's OCI entrypoint.
- **Container_withoutFiles**: Remove specified files from the container.
- **Container_withoutLabel**: Remove a label from the container.
tardy shadow
tardy shadow
dark egret
tardy shadow
clever oasis
#

absolutely strange things it's doing to the tool names though lol

dark egret
clever oasis
#

hmm seems like it's expecting things to be called like do_a_thing which would become Do A Thing

#

if MCP normally has underscored names we could always do that conversion, then you'd get Trivy Scan Container and Container With New File at least

#

i wonder if that has any effect on evals thinkies

#

could also be a casing library bugging out because Foo_barBazBuzz is a cursed pairing of camel and snake case (maybe it saw _ and assumed snake)
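
A minimal sketch of the conversion being discussed, assuming we normalize the mixed `Foo_barBazBuzz` names to plain snake case before handing them to a client. `toSnake` and `humanize` are hypothetical helpers, not anything in the dagger codebase, and `humanize` only guesses at what the client's casing library does with underscores.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toSnake converts a mixed name such as "Container_withNewFile" into
// "container_with_new_file": existing underscores are kept as separators
// and one is inserted before each interior upper-case letter.
// Caveat: acronyms are split letter by letter ("HTTPServer" -> "h_t_t_p_server").
func toSnake(name string) string {
	var b strings.Builder
	prevSep := true // treat the start like a separator so no leading "_" is added
	for _, r := range name {
		switch {
		case r == '_':
			if !prevSep {
				b.WriteRune('_')
			}
			prevSep = true
		case unicode.IsUpper(r):
			if !prevSep {
				b.WriteRune('_')
			}
			b.WriteRune(unicode.ToLower(r))
			prevSep = false
		default:
			b.WriteRune(r)
			prevSep = false
		}
	}
	return b.String()
}

// humanize mimics what a client might display for a snake_case tool name:
// underscores become spaces and each word is capitalized (ASCII names only).
func humanize(snake string) string {
	words := strings.Split(snake, "_")
	for i, w := range words {
		if w == "" {
			continue
		}
		words[i] = strings.ToUpper(w[:1]) + w[1:]
	}
	return strings.Join(words, " ")
}

func main() {
	for _, name := range []string{"Container_withNewFile", "Trivy_scanContainer", "Foo_barBazBuzz"} {
		snake := toSnake(name)
		fmt.Printf("%s -> %s -> %s\n", name, snake, humanize(snake))
	}
}
```

Whether renaming tools this way affects evals is still the open question raised above; the sketch only shows that the conversion itself is mechanical.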

proper ledge
#

I panicked the dagger-engine (not sure if it's just my branch) and my best bet is that somehow CurrentModule has an empty "" InstanceID. If anyone has ideas or clues, i'm all ears

clever oasis
# proper ledge logs:

this might be fixed by pulling my branch. for a moment I had Binding.asFoo return a null Foo if it wasn't present, but that actually has really bad SDK semantics at the moment; instead of llm.Env().Output("foo").AsFoo().Sync(ctx) erroring it would return a Foo that tried to load a "" ID internally

#

so now there's a Binding.isNull instead

proper ledge
clever oasis
dark egret
#

shot in the dark, but every time anybody's hit an mcp bug in the last 3 days @clever oasis 's already fixed it somewhere-

using helloDagger.publish on my machine, or trivy.scanContainer on tibor's, called through MCP, hitting a cache miss, we hit this error in the engine and the demo stops:

error="map[error:process \"/runtime\" did not complete successfully: exit code: 2 kind:*buildkit.ExecError stack:<nil>]

calling dagger call $fn works, and then re-running via mcp will start working. known issue? or worth filing a bug?

proper ledge
#

i'm about to rebase on his branch

dark egret
#

on dagger-to-agents@main, prompting goose run -t "scan the hello-dagger application container for security vulnerabilities", hit this engine panic:

panic: no ':' separator in digest ""

goroutine 4648341 [running]:
github.com/opencontainers/go-digest.Digest.sepIndex({0x0, 0x0})
        /go/pkg/mod/github.com/opencontainers/go-digest@v1.0.0/digest.go:153 +0x94
github.com/opencontainers/go-digest.Digest.Encoded(...)
        /go/pkg/mod/github.com/opencontainers/go-digest@v1.0.0/digest.go:137
github.com/dagger/dagger/network.HostHash({0x0, 0x0})
        /app/network/hosts.go:16 +0x24
github.com/dagger/dagger/network.ModuleDomain(0x314a850?, {0x40011d7500, 0x19})
        /app/network/hosts.go:37 +0x4c
github.com/dagger/dagger/core.(*Container).WithExec(0x0?, {0x314a850, 0x400d5dbdb0}, {{0x4002964510, 0x9, 0x9}, 0x0, {0x0, 0x0}, {0x0, ...}, ...})
        /app/core/container_exec.go:115 +0x684
github.com/dagger/dagger/core/schema.(*containerSchema).withExec(0x4006b479f8?, {0x314a850, 0x400d5dbdb0}, 0x4009182400, {{{0x4002964510, 0x9, 0x9}, 0x0, {0x0, 0x0}, ...}, ...})
        /app/core/schema/container.go:858 +0x180
github.com/dagger/dagger/dagql.Func[...].func1({0x400ebcdd80?, 0x4009182400?, {0x0, 0x1, 0x400f03cf60, 0x400efec7e0}, 0x0, 0x0}, {{{0x40029643f0, 0x9, ...}, ...}, ...})
        /app/dagql/objects.go:704 +0x74
github.com/dagger/dagger/dagql.NodeFuncWithCacheKey[...].func1({0x400ebcdd80, 0x4009182400, {0x0, 0x1, 0x400f03cf60, 0x400efec7e0}, 0x0, 0x0}, 0x400fd09bf0)
        /app/dagql/objects.go:761 +0x10c
github.com/dagger/dagger/dagql.Class[...].Call(0x31a32c0?, {0x314a850?, 0x400d5dbdb0?}, {0x400ebcdd80?, 0x4009182400, {0x0, 0x1, 0x400f03cf60, 0x400efec7e0}, 0x0, ...}, ...)
        /app/dagql/objects.go:266 +0x130
github.com/dagger/dagger/dagql.Instance[...].call.func2()
        /app/dagql/objects.go:523 +0xa0
github.com/dagger/dagger/engine/cache.(*cache[...]).GetOrInitializeWithCallbacks.func1()
        /app/engine/cache/cache.go:189 +0x64
created by github.com/dagger/dagger/engine/cache.(*cache[...]).GetOrInitializeWithCallbacks in goroutine 4639903
        /app/engine/cache/cache.go:187 +0x4e0
dark egret
#

nice well then thats 2 examples

#

gonna go back to my previous checkout for the demo then

proper ledge
dark egret
#

our confluence of examples does mean it's triggered by something in dagger-to-agents@main, not your engine or goose branch

#

after this call btw i'd love to know what you're trying to do on those branches 🙂

proper ledge
dark egret
#

oh sick

#

dude if you get that working and we can layer in sam's shit we're getting to an actually exciting demo

#

you didn't see sam's shit but he's got a host filesync thing working based on an old branch from andrea

proper ledge
#

that was the whole goal of this week, but "LLMs"

slow sierra
#

I don't see how the filesync will plug into the LLM as-is.

deft osprey
dark egret
#

salute i am here to provide super descriptive names

dark egret
#

although tibor also already has a thing where the source dir gets passed as WithDirectoryInput, that might work with some coaxing

tardy shadow
proper ledge
proper ledge
#

I feel like i've seen this many times now (from prompt mode): │ ! failed to convert call inputs: decode arg "source" (string): decode "Directory" ID: failed to decode base64: illegal base64 data at input byte 9

Is this something you know about, or even better, something that has been fixed in the latest of your branch?

clever oasis
#

(Canada holiday today so I'll be mostly away - realized I only put that in the team channel!)

proper ledge
#

apologies for the ping!

proper ledge
tardy shadow
proper ledge
#

i first want to be able to pass a directory to a function via prompt mode, shouldn't be this hard

tardy shadow
proper ledge
#

with bindings. I do workdir=$(host | directory .) then go to prompt mode

#

what was the way to debug prompt mode LLM calls ?

dark egret
tardy shadow
proper ledge
#

it's possible it's just prompt engineering. We hit it with 4o, then upgraded to 4.1 and it disappeared and now i hit it consistently

dark egret
#

evals are important just because they'll speed us up

proper ledge
#

i think we need ONE MCP client we can trust, and that is no longer Goose for me

dark egret
proper ledge
#

we had our own mcp client; it's a slight change to the llm logic we already have. The SendQuery abstraction was reused so that all the providers could be reused.

dark egret
#

having our own in a demo (almost) entirely defeats the purpose of the demo

#

if we're writing our own client, the demo is only useful for going to other mcp clients and being like "look at this thing we could do with your product if only you fixed X" - we lose any value of going to real normal users and being like "look at this cool thing you can do with dagger via goose"

dark egret
#

tbh evals should prolly come before a static tool spike too just to have a target

proper ledge
dark egret
#

valid

proper ledge
#

that doesnt work in prompt mode either, so when it works in prompt mode we can see whatsup with goose

dark egret
#

omg i'm so dumb, i was responding to guillaume as though you hadn't already responded

proper ledge
#

np

dark egret
#

cool so we're good, im gonna violently break my demo setup now and see about fixing/filing some bugs

tardy shadow
slow sierra
tardy shadow
tardy shadow
dark egret
#

til goose session --debug to show full tool responses

tardy shadow
dark egret
#

Stdout lol

#

I also have an additional mcp.sh hack that collects the dagger mcp stderr to a file, will share in a sec

proper ledge
#

yeah I do 2>/tmp/debug.stderr.log too and tail -F it in a 3rd terminal, this way i have stdin, stdout and stderr 😄

dark egret
#

Yep exactly

tardy shadow
#
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{},"clientInfo":{"name":"goose","version":"1.0.18"},"protocolVersion":"2025-03-26"}}
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}

@slow sierra

tardy shadow
# tardy shadow ```json {"jsonrpc":"2.0","id":1,"method":"initialize","params":{"capabilities":{...
{"jsonrpc":"2.0","id":2,"result":{"tools":[{"description":"Select tools for interacting with the available objects.\n\nAvailable tools:\n- HelloDagger_build (returns Container!)\n- HelloDagger_buildEnv (returns Container!)\n- HelloDagger_publish (returns String!)\n- HelloDagger_test (returns String!)\n- cacheVolume (returns CacheVolume!)\n- container (returns Container!)\n- directory (returns Directory!)\n- git (returns GitRepository!)\n- host (returns Host!)\n- http (returns File!)\n- trivy (returns Trivy!)\n\nAvailable objects:\n- HelloDagger#1: helloDagger","inputSchema":{"type":"object","properties":{"tools":{"description":"The tools to select.","items":{"type":"string"},"type":"array"}},"required":["tools"]},"name":"selectTools"},{"description":"A tool for thinking through problems, brainstorming ideas, or planning without executing any actions","inputSchema":{"type":"object","properties":{"thought":{"description":"Your thoughts.","type":"string"}},"required":["thought"]},"name":"think"}]}}
dark egret
proper ledge
dark egret
#

this being the panic: no ':' separator in digest "" panic, yes

dark egret
#

we also have some sort of truncation wackiness going on, like first part of list tools response

{"jsonrpc":"2.0","id":2,"result":{"tools":[{"description":"Load tools for interacting with the available objects.\n\nAvailable tools:\n- HelloDagger_build (returns Container!):  Build the application container\n- HelloDagger_build_env (returns Container!):  Build a ready-to-use development environment\n- HelloDagger_publish (returns String!):  Publish the application container after building and testing it on-the-fly\n- HelloDagger_test (returns String!):  Return the result of running unit tests\n- Directory_as_git (returns GitRepository!): Converts this directory to a local git repository\n- Directory_diff (returns Directory!): Return the difference between this directory and an another directory. The difference is encoded as a directory.\n- Directory_digest (returns String!): Return the directory's digest. The format of the digest is not guaranteed to be stable between releases of Dagger. It is guaranteed to be stable between invocations of the same Dagger engine.\n- Directory_directory (returns Directory!): Retrieves a directory at the given path.\n- Directory_docker_build (returns Container!): Use Dockerfile compatibility to build a container from this directory.
goose run -t "list all of your available dagger tools, do not leave anything out. i don't care how long the response is. use complete names." --debug
starting session | provider: openai model: gpt-4.1
    logging to /Users/braa/.local/share/goose/sessions/20250418_154104.jsonl
    working directory: /Users/braa/src/dagger
Here is a comprehensive list of all available Dagger tools, including their complete method names:

---

### HelloDagger (Build and Test Workflow)
- **HelloDagger_build**
  Builds the application container.

- **HelloDagger_build_env**
  Builds a ready-to-use development environment.

- **HelloDagger_publish**
  Publishes the application container after building and testing it on-the-fly.

- **HelloDagger_test**
  Returns the result of running unit tests.

---

### Directory (File and Git Operations)
- **Directory_as_git**
  Converts this directory to a local git repository.

- **Directory_diff**
  Returns the difference between this directory and another directory. The difference is encoded as a directory.

- **Directory_digest**
  Returns the directory's digest. The format of the digest is not guaranteed to be stable between releases of Dagger. It is guaranteed to be stable between invocations of the same Dagger engine.

- **Directory_directory**
  Retrieves a directory at the given path.

- **Directory_docker_build**
  [The remainder of the method name was cut off in the tool listing. If you need the full signature or more details, please let me know.]

---

If you need more details or want the signature/parameters for any of these tools, let me know!

this breaks the trivy part of the demo once we've layered in the directory input because trivy's at the end of the selectTools list, and probably prevents chaining as well because "available objects" is at the very very end

#
starting session | provider: openai model: gpt-4.1
    logging to /Users/braa/.local/share/goose/sessions/20250418_154842.jsonl
    working directory: /Users/braa/src/dagger
Certainly! Here is the complete description of dagger__load_tools, with no omissions or editorializing:

typescript
namespace functions {

// Load tools for interacting with the available objects.
//
// Available tools:
// - Directory_as_git (returns GitRepository!): Converts this directory to a local git repository
// - Directory_diff (returns Directory!): Return the difference between this directory and an another directory. The difference is encoded as a directory.
// - Directory_digest (returns String!): Return the directory's digest. The format of the digest is not guaranteed to be stable between releases of Dagger. It is guaranteed to be stable between invocations of the same Dagger engine.
// - Directory_directory (returns Directory!): Retrieves a directory at the given path.
// - Directory_docker_build (returns Container!): Use Dockerfile compatibility to build a container from this directory. Only use this function for Dockerfile compatibility. Otherwise use the native Container type directly, it is feature-complete and supports all Dockerfile features.
// - Directory_entries (returns [String!]!): Returns a list of files and directories at the given path.
// - Dire
type dagger__load_tools = (_: {
// The tools to select.
tools: string[],
}) => any;


This is the full, unedited, available description. If you need more details about any listed tool, just let me know!
proper ledge
#

yes that's what i was referring to this morning or last night idk

#

goose only sees 9 available tools to select

dark egret
#
starting session | provider: openai model: gpt-4.1
    logging to /Users/braa/.local/share/goose/sessions/20250418_154908.jsonl
    working directory: /Users/braa/src/dagger
Here is the complete description of the dagger__load_tools tool, without editorializing or omissions:


// Load tools for interacting with the available objects.
//
// Available tools:
// - HelloDagger_build (returns Container!):  Build the application container
// - HelloDagger_build_env (returns Container!):  Build a ready-to-use development environment
// - HelloDagger_publish (returns String!):  Publish the application container after building and testing it on-the-fly
// - HelloDagger_test (returns String!):  Return the result of running unit tests
// - Directory_as_git (returns GitRepository!): Converts this directory to a local git repository
// - Directory_diff (returns Directory!): Return the difference between this directory and an another directory. The difference is encoded as a directory.
// - Directory_digest (returns String!): Return the directory's digest. The format of the digest is not guaranteed to be stable between releases of Dagger. It is guaranteed to be stable between invocations of the same Dagger engine.
// - Directory_directory (returns Directory!): Retrieves a directory at the given path.
// - Directory_docker_build (

This is the full excerpt provided by the underlying system as of April 2025.
proper ledge
#

but if you look at the mcp logs, it's there

dark egret
#

yep exactly

proper ledge
#

this is where i was like what the heck is goose doing

#

and i have this behavior, even without any system prompt by goose, just dagger's

#

they must be doing some kinda processing ... hence my joke that they probably ask another AI how to mess it up

dark egret
#

no joke, bug goes away if you switch to claude

proper ledge
#

geez

#

this was w 4.1 ?

#

This doesn't make sense, why would it work with dagger prompt mode and not goose ?

dark egret
#

4.1 and 4o have pretty much the same thing, which makes me think it's a context window thing. could possibly be goose mismanaging the context window.

#

claude has a different bug but it at least gets a complete load_tools description

proper ledge
#

hence the need to mitm the http at this point

dark egret
#

yep yep that comment makes more sense now lmao

tardy shadow
#

Ok, I'm getting confused about the best observability stack we need to isolate any client / dagger shell discrepancy.

Thinking from first principles, what would be, from our conversations, the most critical things to observe:

  • MITM http query to AI
  • Tool list before / after each response: $agent | tools vs mcp tool list
  • Eval intent of tool use per $assignment
  • Extraction of system prompt (goose's system prompt is dynamic) -- might influence
proper ledge
#

evals are always welcome, i think we also need to mcpify the existing ones

#

Also MITM is important now (either via proxy or just debug printing both goose and dagger and rebuilding) to understand if goose is tweaking things differently in the way they send their requests to openai.

dark egret
#

lol one sorta deranged way to mitm openai specifically is to use ollama serve

#

anybody got a beefy gpu?

floral shuttle
dark egret
#

that's a big if though

clever oasis
dark egret
#

openWorldHint | boolean | true | If true, the tool may interact with an "open world" of external entities
the other 3 annotations are unsurprising but this one is both a surprise to me and pretty relevant to us

dark egret
#

@clever oasis consolidating #agents message to here, with selectTools merged, how do we imagine module-object preselection working? it might get pretty immediately relevant for @tardy shadow 's hypothetical "tier 1" headless browser screenshot + cursor MCP demo because cursor isn't friendly to tools_changed notifications

#

does selectTools really even change anything about that apart from that our selections aren't really focused around single specific objects any longer?

#

(subtext: i'm wondering if I should try to bring back some part of the WithFooInput(foo, initialSelect:bool) PR to enable this)

clever oasis
#

@dark egret i would try without a manual 'initial select' thing and just try providing the constructed object as an input, and see if that's enough as-is

#

ideally the model would see that the object is available, maybe with a helpful description as a hint if needed, and also see that its tools are available, but still need to select them first (and have enough hints that it knows to do that immediately)

#

in theory it'll see something like this:

Available objects:
- DaggerDev#1: (description here)

Available tools:
- DaggerDev_test: Find test suites to run

and figure everything out from there, given "run the tests"

dark egret
#

hmmm lemme try it out real quick in cursor, this is basically hinging on the "selectTools returns a tool schema that the model understands as though it was part of the MCP tools list" thing, right?

#

i keep forgetting that that works because in my mental model of the clients it does not XD (my mental model is clearly wrong)

clever oasis
# dark egret hmmm lemme try it out real quick in cursor, this is basically hinging on the "se...

that's unrelated, doesn't hinge on it at all, it just hinges on the model's ability to select tools before using them, which is now an assumption made everywhere

previously: 'current selected object' was important because it determined the available tools => so we needed a way to pre-select an object

now: 'current selected object' is not a thing => model selects tools as appropriate for working with the available objects => so theoretically we don't need to do anything special anymore

#

selectTools returning tools is only relevant for: 1. teaching dumb models that all they've done is select tools, not invoke them, and 2. helping any MCP clients that don't support the 'tools changed' notification

dark egret
#

yeah we're saying the same thing, this is 2.

tardy shadow
dark egret
#

(both things are required)

clever oasis
dark egret
#

cursor has its own way of managing context-specific mcp servers: <project-root>/.cursor/mcp.json

tardy shadow
# dark egret (both things are required)

there's still a small possibility that this works anyway (with a bug on their end that makes the LLM actually call an MCP tool that doesn't exist before the select) -- trying it out

dark egret
#

@tardy shadow i was testing the same thing i think at the same time, on tiborvass/mcp-demo-wip, llms inside Cursor at least will not call the selected tools

tardy shadow
proper ledge
#

@dark egret hey sorry, just wanna say that my mcp-demo-wip branch is a constantly moving target, it's where i'm trying to get the Directory#1 issue fixed so we can get the test coverage demo working

dark egret
#

we're not the first people to ask cursor to implement tools_changed though https://forum.cursor.com/t/mcp-hosted-dynamic-tools-refreshing/49693

dark egret
proper ledge
#

For what you guys are trying to do, my branch or not my branch doesn't change much wrt Cursor

dark egret
proper ledge
#

And your recursive dependencies too i believe

dark egret
#

yeah yeah and that which is needed for the trivy thing

tardy shadow
#

So we do need your stopgap (which was your closed PR) Connor to unlock the use with static MCP clients 🤣 Full circle 🙏

proper ledge
#

Unless @dark egret you wanted to make it work, I can take a stab at doing autoselect for @sonic fox since my understanding is that Cursor won't cut it

dark egret
proper ledge
#

TBH, even if it worked, it'd be relying on a bug. Since it's not a security-wise sound thing to just yolo call a method that doesn't exist according to your local state

dark egret
proper ledge
#

ah ok, i was thinking of just passing selectedTools map[string]bool around in newMCP

#

make it a private API with _ prefix

dark egret
#

which brings us back to the original thing i was poking @clever oasis about, I confirmed that the selectTools returns tool schemas thing does not allow those tools to be called in cursor.

dark egret
clever oasis
proper ledge
#

I personally don't mind having a private API stop-gap solution if it unlocks interesting demos on the non-dynamic clients, since they're private, we can always find better ways later. In fact I haven't abandoned the experimentation on full static indirect API either.

dark egret
dark egret
clever oasis
dark egret
#

aka: static only if client doesn't support dynamic

proper ledge
dark egret
#

and when you frame it that way, the ability to do static under certain conditions almost becomes a requirement... like we don't want to document that our MCP support only works with clients that support the tools changed notification, ideally we work with all of them but better with the dynamic ones

proper ledge
#

Also it's a little more subtle than that, because it's not necessarily about handling the notification, it's about the tools list synchronization in the agentic loop

dark egret
clever oasis
#

hmm i wonder if all that's needed for static is adding a tool to call a function - basically a "legal" way for the model to call a tool that it doesn't actually have, but that it can recognize from the select_tools response

I've already been tempted to add support for chaining/batching to save token cost, a lot like that batch meta-tool you found @dark egret (I think from Zed?) - maybe once we add that we'll get 'static' support for free? thinkies

#

or is that already 100% what y'all are suggesting lol

dark egret
#

exactly, yeah! i do think we'd also need to change the tool descriptions/instructions/system prompt a lil based on the mcp client's list_tools capability, but that's basically the concept

clever oasis
#

i can give that a go if no one has it in flight, super curious how the evals handle it, would love to see token cost 📉

dark egret
#

❤️ i love that it's potentially 2 birds with one stone. idk if he's got anything going yet, but i do know a static toolcalling scheme is in @proper ledge's mental backlog

clever oasis
slow sierra
#

if you have call_tool do you still need select_tools?

clever oasis
#

you at least need something that tells you their schema + docs, once the model knows that it wants it

proper ledge
#

No need for select tools. But list tools and call tool should always take an object

dark egret
#

(so as to not confuse the LLMs by calling our functions tools when they're not actually callable via toolcall messages)

clever oasis
#

select_tools basically becomes list_functions + actually installing them as native tools, for clients that support it
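A minimal sketch of the static scheme discussed above: the client only ever sees two fixed meta-tools, one to list functions and one to call a function by name, so its tool list never has to change. The names `Registry`, `ListFunctions`, and `CallFunction` are hypothetical and this is not the actual Dagger implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Function is a hypothetical stand-in for a module function the model may call.
type Function struct {
	Description string
	Call        func(args map[string]any) (string, error)
}

// Registry holds every callable function behind two fixed meta-tools.
type Registry map[string]Function

// ListFunctions returns "name: description" lines, sorted by name.
func (r Registry) ListFunctions() []string {
	names := make([]string, 0, len(r))
	for name := range r {
		names = append(names, name)
	}
	sort.Strings(names)
	out := make([]string, 0, len(names))
	for _, name := range names {
		out = append(out, name+": "+r[name].Description)
	}
	return out
}

// CallFunction dispatches by name, so the model can invoke a function
// that was never registered as a native tool with the client.
func (r Registry) CallFunction(name string, args map[string]any) (string, error) {
	fn, ok := r[name]
	if !ok {
		return "", errors.New("unknown function: " + name)
	}
	return fn.Call(args)
}

// newDemoRegistry builds a registry with one illustrative entry.
func newDemoRegistry() Registry {
	return Registry{
		"Container.withExec": {
			Description: "run a command in the container",
			Call: func(args map[string]any) (string, error) {
				return fmt.Sprintf("ran %v", args["args"]), nil
			},
		},
	}
}

func main() {
	r := newDemoRegistry()
	fmt.Println(r.ListFunctions())
	out, err := r.CallFunction("Container.withExec", map[string]any{"args": []string{"echo", "hi"}})
	fmt.Println(out, err)
}
```

For clients that do support dynamic tool lists, the same registry could still back natively installed tools; the meta-tools are only the fallback path.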

dark egret
#

is closing discord to stop procrastinating the 2 branches i need to get rebased and merged to get the demo functionality on main

#

squint actually wait i might only have 1 branch, i think static fn calling might obviate any sort of initialSelect type thing

proper ledge
dark egret
#

rebasing both anyways but imma focus on the serveDependencies then

clever oasis
clever oasis
#

tried it out with Cursor - it does use chain_tools but is hamstrung by the fact that it never sees the updating select_tools description (where the available objects and tools are listed as it reaches more objects/types)

#

gonna look into workarounds (it's something i was curious about in the past and would improve caching behavior), and if they work maybe we can just get rid of the dynamically updating descriptions

dark egret
clever oasis
#

yeah - might be "and" instead of "or" - since i wouldn't want to get all the tool descriptions + schemas dumped into the context, this is more of a summary to see which tools can be selected

clever oasis
#

or, it could look like augmenting it into existing responses

{"result":"DaggerTest#1","new_tools":["DaggerTest_all","DaggerTest_specific"]}
dark egret
clever oasis
#

i'm just pointing out there are two levels of tool discovery: 1. learning what tools can be selected/described, which is just a list of names + return types, and 2. learning about tool(s) in full, which gives you all the descriptions + schemas + etc. which is way more verbose

dark egret
#

Yeah makes sense

#

Static could try to flatten that but I can see that that might be fucked, too many tools to pretend they're all "selected" at once

clever oasis
#

yeah - this is all mostly scheming to save tokens + context window usage

floral shuttle
#

Found it interesting while trying dagger mcp in Claude desktop, etc that there was always a selectFoo and a Foo whereas other servers don't have that.
Is that selecting the object with the functions vs an MCP server just having a set of functions at the top level? Related to #1346602677236400159 message?

clever oasis
tardy shadow
clever oasis
#

just wrote it manually + ran evals yeah

#

the prompt helped - without it gpt-4.1 would mess up the schema by skipping the params wrapper

proper ledge
#

@clever oasis @dark egret sorry i'm late with my branch not sure if it's worth anything for you now, but i wanted to do two versions, this one is the initial one: https://github.com/tiborvass/dagger/commits/mcption/

The second version will take object IDs as inputs to list_functions, and maybe i remove the arg schema function.

clever oasis
tardy shadow
#

mmmh, maybe 🤔

#

I was using dagger_dev call evals-across-models --models gpt-4.1 csv on his dev branch

proper ledge
#

Ah sorry will doublecheck later

tardy shadow
#

my bad -- i didn't recompile the engine

tardy shadow
clever oasis
dark egret
#

https://github.com/dagger/dagger/pull/10118 includeDependencies is ready to merge after @copper panther takes 1 more peek monday, which means we'll have the trivy demo working on main.

i've also started setting up a mutagen filesync + dagger mcp demo. i'm having some struggles with getting cache volumes to share as I'd expect them to, but it feels like user error more than an actual engine problem, so i'm shelving that to reexamine on monday. if i can get it working, the demo will require lots of ugly and error prone setup, BUT we'll be able to make a ts unit test fail on purpose and then ask goose to fix it with its "developer" tools enabled

GitHub

fixes #10116, replacing #9992
the main meat of this PR adds
module.Serve(ctx, dagger.ModuleServeOpts{IncludeDependencies: true})
to the API, enabling shell and other clients to serve the dependenci...

deft osprey
#

I am doing some introspection on one simple daggerized mcp server and I see we're exposing a think function (alongside the select_tools)? Is it part of the spec?

proper ledge
deft osprey
#

Is it common practice for other MCP implem?

dark egret
#

orthogonal to MCP

#

that tool also exists in prompt mode and for async agents, it measurably helped with evals

proper ledge
# deft osprey Is it common practice for other MCP implem?

What we do is not really common practice MCP because what most people do is expose static functions with a description and let the LLM handle the rest. We expose MCP tools that allow you to find the right dagger function to call. The way to navigate that is a bit more complicated and it helps the LLM a lot to ask it to think.

deft osprey
#

interesting, seems to be something that should be formalized in the spec at some point

dark egret
#

think is an increasingly common practice for tool calling in general, like zed has a built-in (non-mcp) think tool that it enables in "ask" mode and disables in "write" mode, but i don't really see how that relates to the model context protocol

slow sierra
#

I think the decision to include think in the core deserves a github discussion thread. It's not a trivial decision. It's commonly an outside tool, not in the core of any framework

#

Also I would love a discussion thread on this mutagen filesync idea. It sounds cool but I have no idea what it does or how it works. Beyond a certain level of investment (ie a few days full time) it starts being too much without a prior discussion

dark egret
# slow sierra Also I would love a discussion thread on this mutagen filesync idea. It sounds c...

yep yep, we're in strong agreement here, and i have 0 intent to take it any further than a demo to show the vibes w/ an external MCP client and further spark a discussion of what it might look like as an actual feature.

as for how it works: in userspace it's pretty violently faff-y, we run a mutagen agent service in dagger, pointed at a shared cache volume. outside of dagger, on the host box, we point a mutagen client at the dagger-hosted mutagen agent service. that sets up a sync that keeps a host directory matching the cache volume directory in its contents. further sessions can use that cache volume to read and write files as though they were bind-mounted on the host.

this all gets wacky really fast. the wire protocol for mutagen uses ssh, so restarting the mutagen Service means you gotta clean up known hosts on the host machine. sessions sometimes seem to interfere with each other and take down the mutagen sync, etc, etc

bleak drift
#

afaik think was part of the evals driven improvements Alex worked on for 0.18.5, right?

dark egret
#

yep

slow sierra
#

We can stand by our decision to embed think but we should have our eyes open about the fact that it's unusual and probably involves tradeoffs. Users will probably question it too

dark egret
# slow sierra I think the decision to include `think` in the core deserves a github discussion...

if i had to gamble i'd bet that the conclusion of this discussion will be a flag to make it configurable... i feel like it makes sense to include in prompt mode by default, but considering MCP it gets weird fast if there are other think tools available, and considering async agents it's not crazy to think that folks might want to implement their own or exclude it altogether to encourage the agent to "just do it"

dark egret
#

dependency constructors should be LLM-usable now in prompt mode/dagger mcp cc @tardy shadow @proper ledge - aka the demo from 2 weeks ago is working on main

sonic fox
#

@proper ledge @weary stag is it possible to start dagger mcp just with the core library?

#

getting this currently:

10  : │ ! module not found
1   : dagger mcp ERROR [0.5s]
1   : ! module not found and --core not specified

14  : moduleSource(refString: "."): ModuleSource!
14  : moduleSource DONE [0.0s]
proper ledge
#

The error message is not helpful because the flag was renamed and we forgot to rename it in the error message but I believe @tardy shadow fixed it already

sonic fox
proper ledge
#

If it's Gemini Anthropic, I believe you're hitting a known issue, or at least i've seen this before. I could be wrong but i've seen 400 errors before with Gemini Anthropic

sonic fox
#

I'm mostly trying to make the default coding agent module https://docs.dagger.io/quickstart/agent?sdk=go in goose

Agentic workflows need repeatability, modularity, observability and cross-platform support. Dagger provides a way to build reproducible workflows in any language with custom environments, parallel processing, and seamless chaining.

proper ledge
#

I meant Anthropic

sonic fox
sonic fox
#

using goose 1.0.14

proper ledge
sonic fox
#

was trying to get a very simple dagger mcp demo working but fails with a very simple create an alpine container and install curl

#

what version are you currently in?

dark egret
#

1.0.18

proper ledge
#

i think we had welcome fixes in 1.0.18 but 1.0.21 is out

dark egret
#

just between .17 and .18 was a significant fix

#

^yeah what tibor said

sonic fox
#

woot, I ran goose update and I'm still in 1.0.14 somehow, lol

proper ledge
#

hash -r ?

sonic fox
#

I'm on 21 now

#

testing

dark egret
dark egret
#

but same might go for certain distros package managers, wouldn't surprise me

sonic fox
#

cool, I have a sweet demo working now. Thx @proper ledge and @dark egret 🍺

dark egret
#

i think i'm like an inch away from a working mutagen+mcp demo, but i keep getting rate limited by anthropic... time to appease the LLM spirits

dark egret
#

lol my offering worked, but i still exceeded rate limits so the demo looks bad... gotta try o3 now i guess

#

... also holy shit, while it was failing to read the second run of tests back into claude bc of rate limits, my local agent setup:

  • correctly updated node dependencies
  • added a test preserving the "original" test behavior + code to make the intentionally broken test pass
  • modified my .dagger/main.go for x-platform runs (so it could run npm run test:unit run on the host or via HelloDagger.Test without one of them failing)
#

meanwhile i was just hitting the tokens/minute limit and ignoring changes it was making

#

omg it also fixed the demo with the x-platform runs fix somehow

#

but i threw it away 😭

tardy shadow
#

it's amazing ahahah

dark egret
#

and woooooo got it all working on o3... fixed my rate limit problems with a .goosehints that says don't list files recursively to infinite depth, it'll overload the context window and exceed rate limits. control depth or use rg to find what you need. (bad goose defaults for their "developer" toolset, not our fault)

more interestingly, i had to futz with the module setup to force in-session test reruns, WithEnvVariable("CACHEBUSTER", ...) is insufficient in shell/prompt/mcp environments, instead we've now gotta do

```go
// Return the result of running unit tests.
// Results from previous runs are cached by idempotency key.
func (m *HelloDagger) Test(
	ctx context.Context,
	// Supply a new idempotency key to force a new run.
	// +default=""
	idempotencyKey string,
) (string, error) {
	return m.BuildEnv().
		WithExec([]string{"npm", "run", "test:unit", "run"}).
		Stdout(ctx)
}
```

(also tq to @thorn sand for sanity checking me when i wasn't sure if the caching was doing what i thought it was doing lmao)
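Why an otherwise unused argument forces a rerun can be illustrated with a toy cache keyed on the argument value. This is only a simplified model of the behavior described above, not how the engine's cache is implemented; `memo` and `newMemo` are hypothetical names.

```go
package main

import "fmt"

// memo is a toy model of a call cache keyed on the function's arguments.
type memo struct {
	results map[string]string
	runs    int // number of times the "tests" really executed
}

func newMemo() *memo {
	return &memo{results: map[string]string{}}
}

// Test returns the cached result for a key it has seen before and only
// performs a real run when the key is new.
func (m *memo) Test(idempotencyKey string) string {
	if out, ok := m.results[idempotencyKey]; ok {
		return out
	}
	m.runs++
	out := fmt.Sprintf("test run #%d", m.runs)
	m.results[idempotencyKey] = out
	return out
}

func main() {
	m := newMemo()
	fmt.Println(m.Test(""))        // first call: real run
	fmt.Println(m.Test(""))        // same key: served from cache
	fmt.Println(m.Test("rerun-1")) // new key: forces a new run
}
```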
dark egret
#

tried to take the demo a little further just now and compose 3 modules together, but hit a wall:

  • hello-dagger (with my added cachevolume + mutagen mounting bits, usable by goose MCP client)
  • dev-env (takes a container and layers writefile, readfile, listfiles, etc on top)
  • agent (toy-programmer-esque llm.withEnv(env.withHelloDaggerInput.WithDevEnvInput).withPrompt.Sync, return helloDagger.BuildEnv)

hit a wall i wasn't expecting that makes this less cool to show, tool routing seems to work great but "cannot retrieve path from cache" errors prevent me from using dagger fns to manipulate files on the cache volume. i can probably work around this via withExecs. the demo also suffers from the obvious problem of the additional module necessitated by the lack of self-calls, but we're working on that.

i'll show the code anyways at demos tomorrow because i think the interface is interesting, and once we have self-calls it's potentially quite slick

slow sierra
#

@warm wolf 👋 this is where the MCP work is happening btw 🙂

#

@dark egret I think mutagen stuff is a distraction. just being honest

dark egret
slow sierra
#

I mean short term, trying to fit that in on top of the rest in the first demo

dark egret
#

you mean for showing externally?

dark egret
# dark egret yep yep, we're in strong agreement here, and i have 0 intent to take it any furt...

quoting myself @slow sierra , approach and intents have not changed- i'm just playing around with UXes we can potentially enable, mutagen is just a hack to make the basics work. this is not part of any external-facing thing.

yep yep, we're in strong agreement here, and i have 0 intent to take it any further than a demo to show the vibes w/ an external MCP client and further spark a discussion of what it might look like as an actual feature.

slow sierra
#

Darren was just telling me about MCP sampling & roots 🙂

deft osprey
#

After we're done with the first MCP demo, we should target getting MCP working with some popular frameworks like google-adk. The blocker seems to be with our select_tools that keep looping (like with some other clients), it'd be nice to have the ability to disable the dynamic tooling with something like dagger -m my-module mcp --static.

slow sierra
#

btw once we have generated clients, I think MCP support in frameworks becomes less important, because you can just integrate with their native tool calling system, which will probably work better.

tldr apps > frameworks in the priorities for MCP clients to support

deft osprey
#

After trying github's official mcp server (running it with docker) with google-adk, what I think would be great to have is the ability to use any of my dagger modules (ignoring the language they are written in) with a simple python agent that uses one of these frameworks.

slow sierra
#

I agree but I think it will be easier to achieve that with generated Python and Typescript clients, because it's more mature than mcp and has less moving parts

#

(but if it ends up working better with mcp, that's great too)

dark egret
#

Not a lot of benefits to having 2 competing MCP implementations, it's hard enough to polish one

the hard part is gonna be if the dynamic scheme works better in evals, this logic becomes less clear, or at least we may have to sacrifice capability for simplicity... plus if it does it's totally possible we can push MCP clients, over time, to support dynamic tool calling

slow sierra
#

Personally my money is on "GraphQL query builder over mcp"

#

raw graphql BUT with a twist: instead of learning the schema, we guide the model to build the query one function call at a time

#

then a separate tool to send it
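A rough sketch of that idea: one tool call appends one selection to the query, and a separate step renders the finished GraphQL document to send. `QueryBuilder`, `Select`, and `Build` are hypothetical names, and only string arguments are handled (quoted with `strconv.Quote`, which matches GraphQL string syntax for simple values like these).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Arg is one named string argument of a field.
type Arg struct{ Name, Value string }

type call struct {
	field string
	args  []Arg
}

// QueryBuilder accumulates a chain of field selections, one per tool call.
type QueryBuilder struct{ calls []call }

// Select appends one field to the chain; this is what a single
// model-issued tool call would do.
func (q *QueryBuilder) Select(field string, args ...Arg) {
	q.calls = append(q.calls, call{field, args})
}

// Build renders the chain as a nested GraphQL query; a separate "send"
// tool would submit this string.
func (q *QueryBuilder) Build() string {
	var b strings.Builder
	for _, c := range q.calls {
		b.WriteString("{ " + c.field)
		if len(c.args) > 0 {
			parts := make([]string, 0, len(c.args))
			for _, a := range c.args {
				parts = append(parts, a.Name+": "+strconv.Quote(a.Value))
			}
			b.WriteString("(" + strings.Join(parts, ", ") + ")")
		}
		b.WriteString(" ")
	}
	b.WriteString(strings.Repeat("} ", len(q.calls)))
	return strings.TrimSpace(b.String())
}

func main() {
	q := &QueryBuilder{}
	q.Select("container")
	q.Select("from", Arg{"address", "alpine"})
	q.Select("stdout")
	fmt.Println(q.Build())
}
```

The model never has to learn the whole schema up front; each step only needs to know which fields are valid on the current type.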

dark egret
#

yeah that does seem the natural choice

#

i'm just glad we're starting to figure out evals, without data it scares me even contemplating decisions like these XD

deft osprey
dark egret
#

sooooo with @clever oasis back and OTEL API stuff resurfacing the big question is: do you still wanna work out the whole static tool calling scheme? cc @proper ledge

clever oasis
#

no strong preference here - happy to let others carry the torch on that if there's interest

slow sierra
#

I could use a checkpoint on what our current LLM-facing API is, how it may or may not change in the future... It feels like a black box to me and also feels like it might churn completely from one week to the next. And I wrote the original version... I imagine everyone else feels even more that it's a black box.

Are we in a place where we can start stabilizing the API?

#

Are there big WIPs that we should be aware of?

#

eg. "declare_output" seems cool but I don't fully understand the implications

dark egret
#

my impression is that the LLM facing API is still deeply unstable

slow sierra
#

also we haven't resolved the discussion on current state of Env which doesn't match anyone's original vision - not LLM-agnostic; not simple; etc

clever oasis
#

+1 (to connor) - I see it as purely an implementation detail of better eval performance (+ shell DX), subject to change at any moment, and nothing should ever make assumptions about it

slow sierra
#

I agree with that theory but in practice, a lot of external-facing design discussions are heavily influenced by LLM-facing constraints

#

or maybe that's no longer true, and we're back in safer territory where API and implementation can be decoupled like with regular software?

clever oasis
slow sierra
#

trying to remember / disentangle 😅

clever oasis
#

i can think of the other way around, like static vs. dynamic tool schemes being influenced by limited clients, and external factors like prompt caching nudging us away from dynamic

slow sierra
#

The one that's occupying my mind right now is declare_output. Every time I saw it happen, it was in the context of a shell session, and it was very useful - basically replaced the old set and the LLM was able to set a shell variable for me.

But, it kind of broke my brain when trying to connect it to the system of externally-defined inputs and outputs. Can any LLM add any outputs to its environment at any time? Is that a good thing?

clever oasis
#

ah - that's exposed in the API as dag.Env(writable: true)

slow sierra
#

Also having a baked-in think tool which is a non-trivial design decision (kind of like including something in core dagger api or not). ie sophisticated devs will care about that as it might affect their other tools

slow sierra
dark egret
#

privileged has the same problem

#

and there was that whole convo about "levels" of privilege etc

#

that shit we should try to stabilize asap

clever oasis
slow sierra
#

Yeah but at least we had a semblance of API design discussion for privileged - what scares me is when things pop up in the API that have no visible API design thread anywhere in the record. That scares me because implementation changes very fast - if API starts moving at the rate of implementation we're in trouble

#

Like writeable I had never even heard mentioned ever until just now

#

So my request is - can we bring the LLM parts of the API back under the usual bikeshedding system. There was a sort of implicit exception made for that part because of the unique constraints. Wondering if now we can go back to normal or too soon basically

#

(I know our bikeshedding is not perfect and open to improving it in any number of ways - but that's orthogonal I think)

clever oasis
#

probably soon yeah - my mentality has been to just make a best guess at everything assuming we'll have one final round of bikeshed later, so maybe it's that time now?

dark egret
#

i think still too soon but you are 100% correct to apply pressure that we need to be trying to get there asap (basically what alex said)

slow sierra
# clever oasis probably soon yeah - my mentality has been to just make a best guess at everythi...

Well like you said, for changes that are truly only LLM-facing, it's OK to consider them implementation (never will be 100% since how LLMs understand our tools leaks into how they will respond to user prompts... but anyway close enough)

For LLM-facing changes that directly affect the API (ie. new flags in Env etc), then at the very least we should regain the habit of treating them like API changes, ie. notifying people, have an issue or a place to discuss or at least collect feedback for later. We can still do that and fast track it, we've been known to do that elsewhere.

#

Also just double-checking, did we get around to marking LLM and Env as experimental? Cause clearly they still are based on this conversation

dark egret
#

@tardy shadow @proper ledge i switched dagger-to-agents to use jeremy's trunk trivy module, should be on main now (i meant to PR but my upstreams are set up weird so i accidentally force-pushed facepalm , but i don't think i rewrote any history)

dark egret
#

oh yo @proper ledge hmu for a review when you get the in-process stdio tees open as a PR, would love to see that on main

proper ledge
slow sierra
#

FYI starting a discussion on broadening out support of MCP spec over time - thank you @blazing mortar for starting this! https://github.com/modelcontextprotocol/modelcontextprotocol/issues/362#issuecomment-2859692113

GitHub

This issue is a request to add Dagger to the list of applications that support the Model Context Protocol (MCP) on the MCP Clients page. The current trends and increasing hype surrounding MCP for A...

dark egret
# slow sierra FYI starting a discussion on broadening out support of MCP spec over time - than...

uh this is a bit awkward, but we don't currently even support tools as far as the MCP Client list is concerned. dagger mcp is an MCP server, not an MCP client. in the future it'd be great to get on the client list, but we'd need to ship client support for one of those columns first. currently that's not at the top of our list of priorities given that dagger modules already provide quite a bit of the programmability of MCP servers, plus i'm still holding out hope that we'll get a blessed go client mcp library

proper ledge
tardy shadow
#

With Solomon's MCP client runtime -- doesn't that unlock the client use-case ? 🤔

dark egret
tardy shadow
slow sierra
#

@dark egret you are completely right, I had missed the fact that the issue was about Dagger as a MCP client... So my comment was completely off topic 😛 I corrected my comment accordingly. Thanks for catching

proper ledge
warped halo
slow sierra
floral shuttle
warped halo
clever oasis
#

is there a dagger mcp --no-mod/-M equivalent?

#

like if i just want a global core dagger MCP server

proper ledge
#

yes it's --env-privileged

clever oasis
#

oh that skips loading a module?

#

it seemed to do a bunch of stuff still

proper ledge
#

ah my bad, i think we just talked about having a -M

#

--env-privileged adds the core api, and if you dont have any modules it will only have core, but i dont think we handle -M

#

happy to add it

clever oasis
#

ah ok it at least does the right thing if i run it outside of a module

proper ledge
#

yes

clever oasis
blazing mortar
dark egret
#

@slow sierra with the sdk/runtime approach to MCP clients, would we retain the ability for users to configure the module container with their arbitrary secrets, base images, withexecs and whatnot? i thought i was gonna find an example here but i suspect i'm looking in the wrong place

GitHub

Solomon's experimental Dagger modules. Contribute to shykes/x development by creating an account on GitHub.

clever oasis
#

@proper ledge did you say you had code pushed somewhere for static tool calling?

proper ledge
#

i wanted to also do a version where you pass in a receiver to list_tools so that it only lists the tools of that object

sonic fox
slow sierra
clever oasis
proper ledge
slow sierra
#

Baked into a closed-source fork of an open-source fork of a web browser

🔫

dark egret
proper ledge
#

@dark egret The special bindings for workdir (and result_dir) are just stopgaps for demos. I was reassured by the hack because mcp also has the notion (albeit weird) of Roots. But i'm still not sold on it, i'd rather expose host tools in a secure way.

slow sierra
clever oasis
#

trying out the new static tool scheme with Claude Code, since it should theoretically work now - but it fails when I run out of the dagger repo, with "error": "Connection failed: Connection to MCP server \"dagger\" timed out after 30000ms",

is there any way we could start the MCP server eagerly and load the module async?

proper ledge
#

it's claude code that's timing out ?

clever oasis
#

claude code timing out "connecting" to the dagger MCP server

proper ledge
#

interesting

#

does claude have mcp logs ?

clever oasis
#

it does but that's all it says

#

my guess is it bails before it sees the stdio mcp init response

proper ledge
#

(oh i guess needs a rebase)

proper ledge
clever oasis
#

it times out because the MCP server takes too long to start, since it's waiting for the module to load

#

once the caches are warm it's fine

proper ledge
#

aaahh

#

ok thanks, yeah that's bad

#

yeah it should be lazy. i'll think of a fix during lunch
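One possible shape for the fix, sketched under assumptions: kick off the module load in the background, answer the MCP handshake immediately, and only block the calls that actually need the module. `lazyModule`, `loadAsync`, `Initialize`, and `slowLoad` are hypothetical names and do not reflect the real server code.

```go
package main

import (
	"fmt"
	"time"
)

// lazyModule starts loading in the background and only blocks callers
// that actually need the loaded result.
type lazyModule struct {
	ready chan struct{}
	tools []string
	err   error
}

// loadAsync begins the load right away and returns without waiting.
func loadAsync(load func() ([]string, error)) *lazyModule {
	m := &lazyModule{ready: make(chan struct{})}
	go func() {
		defer close(m.ready) // signals completion after tools/err are set
		m.tools, m.err = load()
	}()
	return m
}

// Tools blocks until the background load has finished.
func (m *lazyModule) Tools() ([]string, error) {
	<-m.ready
	return m.tools, m.err
}

// Initialize stands in for the handshake reply; it never waits on the
// module, so a client with a connect timeout gets its answer quickly.
func Initialize() string { return "initialized" }

// slowLoad simulates a cold-cache module load.
func slowLoad() ([]string, error) {
	time.Sleep(50 * time.Millisecond)
	return []string{"call_method"}, nil
}

func main() {
	m := loadAsync(slowLoad)
	fmt.Println(Initialize()) // returns before the module is ready
	fmt.Println(m.Tools())    // waits for the load to complete
}
```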

clever oasis
#

cool cool

#

also seeing this weirdness:

● dagger:call_method (MCP)(method: "Container.withExec", args: {"args":["apk","add","--no-cache","cowsay"]}, self: "Container#2")…
  ⎿  Error: Error calling tool call_method: undefined

the call doesn't show up in the trace, almost seems like Claude Code is messing up something? sometimes it works, like this one was fine:

● dagger:call_method (MCP)(method: "Container.withExec", args: {"args":["echo","test"]}, self: "Container#2")…
  ⎿  {
       "result": "Container#3"
     }

wat

#

oh - it's just really confusing output, nvm

#

it puts the logs above it

clever oasis
dark egret
clever oasis
clever oasis
#

(confirmed)

tardy shadow
clever oasis
#

it's pretty confusing, but basically all the MCP code needs to respect the *dagql.Server that's passed in, so it sees the caller's API, instead of just the core API

#

for env(privileged: true) we were grabbing the Root() of the core API, and then at runtime we would use env.Root(), instead of srv.Root() (the *dagql.Server passed in)

proper ledge
#

nice catch

dark egret
#

party_blob @clever oasis merged @proper ledge 's mcp stdio PR so i can stop scripting and manually process managing my mcp processes on 18.7 party_blob

dark egret
#

woo with proper config, can confirm trivy demo works real good and easy... kinda crazy how much more stable it feels

#

other thing i'd recommend to folks is if you've got some mcp client config like

  dagger:
    args:
    - mcp
    - -m
    - ~/src/dagger-to-agents/hello-dagger
    - --env-privileged
    bundled: null
    cmd: dagger
    description: "dagger functions for the hello-dagger app"
    enabled: true
    env_keys: []
    envs:
      OPENAI_API_KEY: foo
      DAGGER_LOG_STDIN: /tmp/debug.stdin.log
      DAGGER_LOG_STDOUT: /tmp/debug.stdout.log
      DAGGER_LOG_STDERR: /tmp/debug.stderr.log
      # _EXPERIMENTAL_DAGGER_RUNNER_HOST: docker-container://dagger-engine.dev
      # _EXPERIMENTAL_DAGGER_CLI_BIN: ~/src/dagger/bin/dagger
    name: dagger
    timeout: 300
    type: stdio

you should really not hesitate to use "description" field, like until we get roots support wired up, it's not cheating to tell the client what module we're pointed at and what it's for

floral shuttle
dark egret
#

at least with 4o, that lil description field seems to have totally avoided the problem i was having before where it wants to immediately ls -R to figure out wtf is going on in the project dir

dark egret
clever oasis
#

it's possible that it only broke on my branch, from a refactor

dark egret
#

ah cool

dark egret
#

sweet lord @clever oasis the static tools PR is impressive... i set it up in zed, and accidentally created a profile where all the zed built-ins are disabled. i broke a test intentionally, then prompted "i've got a broken test, can you run it and fix it plz" ... claude 3.7 did all the test running, that part i expected, but even in the absence of client-provided edit_file tools, totally inside the sandbox, it globbed its way down into the appropriate test and impl files, then pulled out contents, Directory.withNewFile'd new contents back in, and then reran tests with the changes. It did all this on the first try with intentionally vague prompting.

clever oasis
#

sick

dark egret
#

incredibly sick, this makes me think that fully sandboxed edit tools are way more feasible than I thought they were a week ago

#

like we were discussing with goosemic

slow sierra
#

don't block on me for these tool scheme prs, i'll catch up

#

(or at least I'll try 😉 )

#

appreciate the ping though

dark egret
#

soooooo with static tools + hostdir reload, i just got a vibe-code-coverage-improvement demo mostly working in zed. their "follow edits" feature makes the demo look great. however, out the box, there are a couple issues:

  • I needed to configure a zed agent profile with access to file read and edit tools, but without the terminal tool. with the terminal tool, it'll either entirely ignore dagger or it'll get confused about sandboxing vs the host shell.
  • Because static tools doesn't get super descriptive about what all can be done with the loaded module until the agent calls the tools, the agent needs to be coaxed to explore dagger functionality at all. i added a zed "rule" that says "When developing code, when you'd normally run a command in your terminal, instead you should look through available dagger objects and methods to accomplish the task." this way it'll actually explore the API.
  • It did get confused about sandbox state at some point... with host dir reloading, some things reload automatically, but you can also save off old state for later use, and it'll find ways to run tests in old containers where its most recent changes aren't reflected. if you stop it and explain what's happening, it stops getting confused and acts right. i'm experimenting now with a zed rule that injects this context.
  • zed, unlike goose, preserves a single instance of the mcp server for the whole project, rather than per-chat. this makes repeated runs w/ new sessions kinda weird, like it'll go off and find all the objects you have lying around and use those to get results that may be from historical state.
  • one interesting tidbit, claude is getting good enough at this that one time when i had terminal enabled, it did a complete end run around our MCP and figured out how to use dagger call to run tests.
clever oasis
#

friendly reminder I'm heading to the airport this evening for PTO, back Tuesday - would be nice to merge static tools so it's not lingering, whenever ya have a moment to review/approve @dark egret @proper ledge
PR is all green: https://github.com/dagger/dagger/pull/10366

GitHub

Switch mostly from dynamic tools to static
Instead of listing available objects + methods in tool descriptions, there are new tools:

list_objects - list all known objects and their descriptions
li...

clever oasis
#

oops mcp usage specifically is broken atm, know what it is, looking into it:

1   : ! error starting MCP server: input: llm.withEnv.__mcp failed to convert tools to MCP: schema of arg "self" of tool "call_method" should have a "type" entry of type string, got []string
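The error above comes from a JSON Schema whose "type" entry is a list rather than a single string. One possible workaround, sketched here as an assumption and not necessarily the fix that was applied, is to collapse a nullable type list such as ["string", "null"] to its single non-null member before converting tools. `normalizeType` is a hypothetical helper.

```go
package main

import "fmt"

// normalizeType rewrites schema["type"] in place so that it is a single
// string. A list is accepted only if exactly one entry remains after
// dropping "null".
func normalizeType(schema map[string]any) error {
	var candidates []string
	switch t := schema["type"].(type) {
	case string:
		return nil // already in the expected form
	case []string:
		candidates = t
	case []any: // what encoding/json produces for JSON arrays
		for _, v := range t {
			s, ok := v.(string)
			if !ok {
				return fmt.Errorf("non-string entry in type list: %v", v)
			}
			candidates = append(candidates, s)
		}
	default:
		return fmt.Errorf("unsupported type entry: %T", t)
	}
	var kept []string
	for _, c := range candidates {
		if c != "null" {
			kept = append(kept, c)
		}
	}
	if len(kept) != 1 {
		return fmt.Errorf("cannot collapse type list %v to one type", candidates)
	}
	schema["type"] = kept[0]
	return nil
}

func main() {
	self := map[string]any{"type": []string{"string", "null"}}
	fmt.Println(normalizeType(self), self["type"])
}
```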
tardy shadow
#

Do we want to merge it prior to the hack night ? I mean we're still gonna ship it so maybe do it now yeah ahah -- as the evals still work ahah ; we gotta trust the evals 😝 but it's still a paradigm shift 🤣

dark egret
# tardy shadow ~~Do we want to merge it prior the hack night ? I mean we're still gonna ship it...

afaict it works quite a bit better than before in situations where there aren't other tools involved, but may be about the same in the presence of other tools.

i'm curious if anybody remembers the details of which clients support instructions cuz that would give me a more complete mental model of relative performance (i think it might be that goose works good bc it reads instructions and zed does not)

#

@tardy shadow the other thing is for the hack night you can be on the released version and have dynamic or dev build and have static

dark egret
#

basically what i was seeing yesterday in my agent setups stitching up other toolkits is that even claude will privilege the "terminal" tool and find that dependencies are consistently missing, and then not conclude from that that it should be using dagger

clever oasis
#

btw, interesting - Claude Code seems to have a "TODO list" mechanism now, which is exactly what i tried for static output saving (each output gets a TODO)

#

ended up scrapping it, but i still like the idea of a TODO list that the model can check in with at any time

clever oasis
#

though that also means there'd be two TODO lists 😛

proper ledge
#

oh that's interesting

#

@clever oasis is the input/output token count an average, or does it go up because of (or despite) the increase in number of attempts ?

clever oasis
#

it's an average across all attempts, so # of attempts increasing doesn't inflate it
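
A quick sketch of why a per-attempt average doesn't grow with the attempt count. The token numbers are made up for illustration and `meanTokens` is a hypothetical helper, not the eval code.

```go
package main

import "fmt"

// meanTokens returns the average token count per attempt
// (0 for an empty slice).
func meanTokens(perAttempt []int) float64 {
	if len(perAttempt) == 0 {
		return 0
	}
	total := 0
	for _, n := range perAttempt {
		total += n
	}
	return float64(total) / float64(len(perAttempt))
}

func main() {
	// Made-up token counts: same per-attempt usage, different attempt counts.
	few := []int{1000, 1200}
	many := []int{1000, 1200, 1000, 1200, 1000, 1200}
	fmt.Println(meanTokens(few), meanTokens(many))
}
```

Both means come out equal, so a higher average has to come from individual attempts using more tokens, not from running more of them.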

proper ledge
#

so overall it's just gemini doing a couple % worse (but not statistically significantly worse)

#

When the number of tokens increases is it because it's searching its way to the goal via some detour ?

clever oasis
#

could be any number of things - redundant tool calls (e.g. list_methods repeatedly), stumbling and having to try different approaches, larger system prompt

warped halo
dark egret
#

so i made the llm API key loading lazy a while ago, right? i was just going through and testing that w/ dagger mcp because theoretically it means we don't have to configure OPENAI_API_KEY=lul given that we never call loop, provider, or model on the mcp codepath.

and as expected, it's no longer required. that said, while testing, i keep hitting errors like this:

{"jsonrpc":"2.0","id":11,"result":{"content":[{"type":"text","text":"failed to sync: select: process \"/runtime\" did not complete successfully: exit code: 2\n\n\u003c_type\u003e\nEXEC_ERROR\n\u003c/_type\u003e\n\n\u003ccmd\u003e\n[\"/runtime\"]\n\u003c/cmd\u003e\n\n\u003cexitCode\u003e\n2\n\u003c/exitCode\u003e\n\n\u003cstderr\u003e\nunexpected status 200: get or init client: initialize client: failed to load module: load moduleSource(refString: \"~/src/dagger-to-agents/hello-dagger\").asModule: Module!: load: load base: load: failed to stat local path: failed to stat path: failed to get requester session: session for \"twser5q6zfebjl0q0ee271x65\" not found\n\u003c/stderr\u003e\n\n\u003cstdout\u003e\nfailed to return error: input: get or init client: initialize client: failed to load module: load moduleSource(refString: \"~/src/dagger-to-agents/hello-dagger\").asModule: Module!: load: load base: load: failed to stat local path: failed to stat path: failed to get requester session: session for \"twser5q6zfebjl0q0ee271x65\" not found\n \noriginal error: get parent name: input: get or init client: initialize client: failed to load module: load moduleSource(refString: \"~/src/dagger-to-agents/hello-dagger\").asModule: Module!: load: load base: load: failed to stat local path: failed to stat path: failed to get requester session: session for \"twser5q6zfebjl0q0ee271x65\" not found\n\u003c/stdout\u003e"}],"isError":true}}

to me, this reads like the engine is getting confused about which cli-side session it should be talking to... does that make any sense to anybody?
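
For anyone trying to catch this in a test harness, here is a rough sketch that decodes a response shaped like the one above and flags the "requester session" failure. The struct only mirrors the fields visible in that JSON. `sessionNotFound` is a hypothetical helper, and the sample text is trimmed from the real error.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// toolResponse mirrors only the fields visible in the response above.
type toolResponse struct {
	ID     int `json:"id"`
	Result struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
		IsError bool `json:"isError"`
	} `json:"result"`
}

// sessionNotFound (hypothetical) reports whether raw is a tool error whose
// text mentions the missing requester session.
func sessionNotFound(raw []byte) (bool, error) {
	var resp toolResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return false, err
	}
	if !resp.Result.IsError {
		return false, nil
	}
	for _, c := range resp.Result.Content {
		if c.Type == "text" && strings.Contains(c.Text, "failed to get requester session") {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Trimmed version of the error text from the message above.
	raw := []byte(`{"jsonrpc":"2.0","id":11,"result":{"content":[{"type":"text","text":"failed to get requester session: session for \"twser5q6zfebjl0q0ee271x65\" not found"}],"isError":true}}`)
	found, err := sessionNotFound(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(found)
}
```

Counting how often this fires with one CLI running versus several would be one way to check the multiple-CLI theory.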

proper ledge
#

YES, i've seen this error sometimes, even at the hackathon, and I don't know why it happens, and i don't know how it resolves itself (i think i just busted cache or idk)

dark egret
#

i think it might only occur when you've got multiple CLIs running at once

proper ledge
#

ah possible, that can be tested

zinc glacier
warped halo