#agents

merry scarab
#

🤖

merry scarab
#

Has anyone found any of the new "computer use" features that have any sort of free tier? It seems the Anthropic API is paid-only - I was hoping to do some basic experiments without dropping a CC yet 😄

smoky ocean
#

@subtle surge @merry scarab I share the desire for a new channel but I think this was a bit rushed

#

Dagger and... is for integration / combination with specific tools or tech. This feels different

merry scarab
smoky ocean
#

I think this is more like a use case

#

Dagger for...

#

And it should be more specifically about AI agents, "AI" is a bit too broad

#

Also possible that I'm overthinking it

merry scarab
#

Yeah I agree on the Dagger for... that was my intent in the first place for sure.

Also agree that it's about agents vs. all possible AI topics.

Issue is adding Dagger for... feels like an even bigger escalation than adding a new channel 🙂

smoky ocean
#

yeah true

#

For now I'll just move this to the top of the list, and see where it goes

mystic steeple
#

All I can say is: always say thank you to GPT. You never know, when Terminator judgment day comes, it will remember the kind humans and spare us

smoky ocean
novel nova
# smoky ocean And it should be more specifically about AI agents, "AI" is a bit too broad

AI Agent
Definition: An AI agent is a software entity that autonomously performs tasks on behalf of a user or another program. It is designed to perceive its environment, make decisions, and act to achieve specific goals.

Functions:

Autonomy: Operates independently without continuous human intervention.

Task-Oriented: Focuses on specific tasks such as data analysis, customer service, scheduling, or monitoring.

Learning: Can incorporate machine learning algorithms to improve performance over time.

Examples: Chatbots, virtual assistants (like Siri, Google Assistant), recommendation systems.

AI Avatar
Definition: An AI avatar is a digital representation of an entity, often human-like, that interacts with users in a more engaging and personal way. It combines visual and sometimes voice elements to simulate a human presence.

Functions:

Representation: Provides a visual and interactive interface for AI systems.

Engagement: Enhances user interaction and engagement by simulating a human-like presence.

Communication: Can use natural language processing (NLP) to understand and respond to user input.

Examples: Virtual customer service agents with human-like avatars, digital characters in video games, virtual influencers.

#

Your comment just reminded me of the difference between an AI agent and an AI avatar.

spiral hawk
#

I watched the demo a few weeks ago. What stuck with me was that it matched my experience of getting a chatbot to troubleshoot and fix some networking problems on macOS. But I had to ask it to explain its rollback plan before issuing some obscure CLI commands. Apple Intelligence, the other AI, OTOH, doesn't appear to be trained at this level.

long briar
#

AI = algorithm we don't understand because it's occluded by a "network"

subtle surge
quiet ether
#

@harsh stag the reason why we couldn't load your module in our call with @smoky ocean seems to be related to an old dependency in your project's main branch. If you run this in the shell .doc git@github.com:siafu/cover.ai/apps/reporter@feat/pytest-plugin, that works

harsh stag
#

@quiet ether

#

let me get up and running, I want to attempt a POC with the latest version of the agent

#

I have my company and another lined up

#

let's talk early this week if possible

smoky ocean
#

@harsh stag let me prepare instructions for you real quick

smoky ocean
#

@harsh stag check out github.com/shykes/melvin. I added a setup.sh script

#

(adapted from your gist @bronze fern thank you 🙏 )

harsh stag
#

ok I want to explain to you guys what i'm building

#

and the companies involved

harsh stag
smoky ocean
#

oh i think I pushed it to the wrong repo!

#

@harsh stag ok pushed

harsh stag
#

does it only work with an openai key?

#

it's running, using my openai key

smoky ocean
#

You can also set:

  • LLM_HOST
  • LLM_PATH
  • LLM_MODEL

Those should allow you to connect to any openai-compatible endpoint.

@shrewd ermine and @shrewd fern know how to get it working on local models.

smoky ocean
#

FYI @spring wave @quiet ether @shrewd ermine @worn hill @spark phoenix I'm going to move the llm/agent dev thread to here

#

I like this new API better:

⋈ llm | with-git-repository $(git https://github.com/dagger/dagger) | ask "show me all minor releases, each with the latest patch release"
Below are the minor releases along with their latest patch releases:

- **v0.1**: Latest patch release is **v0.1.0**
- **v0.2**: Latest patch release is **v0.2.36**
- **v0.3**: Latest patch release is **v0.3.13**
- **v0.4**: Latest patch release is **v0.4.2**
- **v0.5**: Latest patch release is **v0.5.3**
- **v0.6**: Latest patch release is **v0.6.4**
- **v0.8**: Latest patch release is **v0.8.8**
- **v0.9**: Latest patch release is **v0.9.11**
- **v0.10**: Latest patch release is **v0.10.3**
- **v0.11**: Latest patch release is **v0.11.9**
- **v0.12**: Latest patch release is **v0.12.7**
- **v0.13**: Latest patch release is **v0.13.7**
- **v0.14**: Latest patch release is **v0.14.0**
- **v0.15**: Latest patch release is **v0.15.3**
smoky ocean
#

@spring wave I wired middleware back in, it works for core types but inexplicably, causes modules to fail loading with missing Go types at module runtime build. Even when I comment out the payload of the middleware, even a noopModuleWithObject simply being called triggers the issue...

#
dagger -m github.com/shykes/dagger@llm shell -c 'dev | with-mounted-file .env .env | terminal -c "dagger -i shell -m github.com/shykes/hello"'
spring wave
#

@smoky ocean .env is a directory now?

smoky ocean
#

oops no

#

brainfart

smoky ocean
#

@spring wave I pushed a commit that re-enables the middleware, so the issue is reproducible

smoky ocean
#

I think lack of self-calls will be a blocker for making the most of this for agent dev

smoky ocean
#

@spark phoenix re-posting the 4 bugs I'm currently stuck on... (was initially on another thread, but here is better)

#

## Bug 1. "Can't instantiate..." Solved!

#

## Bug 2. "Field "withExec" not found..."

#

## Bug 3. Cannot create Int from float64

⋈ llm | with-directory $(directory) | with-prompt "write a file hello.txt with contents 'hello world' and set the permissions to 0600" | history
🧑 💬write a file hello.txt with contents 'hello world' and set the permissions to 0600
🤖 💻 withNewFile({"path":"hello.txt","contents":"hello world","permissions":384})
💻 error calling tool: decode arg "permissions": cannot create Int from float64
🤖 💻 withNewFile({"path":"hello.txt","contents":"hello world"})
💻 ok
🤖 💬I have created the file `hello.txt` with the contents "hello world". The file was created without setting specific permissions due to an earlier issue, so it has the default permissions. Would you like me to attempt setting the permissions again?
⋈
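
The error in this transcript is consistent with how Go's `encoding/json` treats numbers: when decoded into an untyped `any`, every JSON number becomes a `float64`, so `384` (the decimal form of octal `0600`) arrives as `float64(384)`. Below is a minimal sketch of a tolerant conversion, assuming the tool arguments are decoded into a `map[string]any`. `toInt` is a hypothetical helper for illustration, not the engine's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// toInt is a hypothetical helper: it accepts the float64 that encoding/json
// produces for JSON numbers, but only when the value is a finite whole number.
func toInt(v any) (int, error) {
	switch n := v.(type) {
	case int:
		return n, nil
	case float64:
		if math.IsInf(n, 0) || n != math.Trunc(n) {
			return 0, fmt.Errorf("cannot create Int from non-integer %v", n)
		}
		return int(n), nil
	default:
		return 0, fmt.Errorf("cannot create Int from %T", v)
	}
}

func main() {
	// Same argument payload as the failing withNewFile call.
	raw := []byte(`{"path":"hello.txt","contents":"hello world","permissions":384}`)

	var args map[string]any
	if err := json.Unmarshal(raw, &args); err != nil {
		panic(err)
	}

	perm, err := toInt(args["permissions"])
	if err != nil {
		panic(err)
	}
	// Print the decoded Go type and the permission bits in octal.
	fmt.Printf("%T -> %#o\n", args["permissions"], perm)
}
```

An alternative is `json.Decoder.UseNumber()`, which keeps numbers as `json.Number` so the caller can pick `Int64()` or `Float64()` explicitly.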
#

## Bug 4. Cannot load modules

This was introduced in my refactoring yesterday

$ dagger call -m github.com/shykes/hello functions
✘ Container.withExec(args: ["go", "build", "-ldflags", "-s -w", "-o", "/runtime", "."]): Container! 0.5s
# hello/internal/dagger
internal/dagger/dagger.gen.go:5461:33: undefined: Engine
internal/dagger/dagger.gen.go:5472:38: undefined: EngineCache
internal/dagger/dagger.gen.go:5483:43: undefined: EngineCacheEntry
internal/dagger/dagger.gen.go:5494:46: undefined: EngineCacheEntrySet
internal/dagger/dagger.gen.go:5648:31: undefined: Host
! process "go build -ldflags -s -w -o /runtime ." did not complete successfully: exit code: 1
Error: failed to serve module: input: module.withSource.initialize failed to initialize module: failed to call module "hello" to get functions: call constructor: process "go build -ldflags -s -w -o /runtime ." did not complete successfully: exit code: 1

--> Happens with any module

Note: this repro doesn't use dagger shell, because shell doesn't show the full error message for some reason

woeful flume
#

When will there be Dagger RAG agents that push your code for you?

smoky ocean
#

I think soon that will be something you can do yourself in a dagger module 🙂 (or find a module in daggerverse that already does it)

#

Added a setup script in the PR description

woeful flume
#

I see it coming too, can't wait

smoky ocean
#

You can try with this branch already, it's not finished but honestly pretty fun 🙂

smoky ocean
#

@woeful flume do you have a use case in mind? 🙂

smoky ocean
#

@gloomy kindle @shrewd fern I would like to hack together self-calls, I think it will be crucial for this feature to make sense to AI agent devs. Can you point me in the right direction? How would you go about it?

gloomy kindle
#

There need to be two separate passes - introspection and then codegen - atm it is one huge phase

#

To be able to codegen for yourself you need to know your own introspection

#

The options are either to hack it into every sdk manually or to cleanly separate out those phases at the sdk interface level

smoky ocean
#

Mmmm I don't sense an easy stopgap for next week's demo 🙂

gloomy kindle
#

Yeah 😢

#

When is next week's demo?

#

Maybe there's something I'm missing, but there's a reason it's not done 😢 Can reply in more detail when I'm back from PTO tomorrow

smoky ocean
#

@gloomy kindle sorry I didn't realize you were on vacation!

gloomy kindle
#

From the engine pov I think self calls just work - it's the codegen for those self calls that's missing

smoky ocean
gloomy kindle
#

Ooo yeah okay

#

Although, that doesn't stop you doing a self call

#

Just doing a self call on an object you've made yourself

smoky ocean
#

Right

gloomy kindle
#

But yes good point, that's a rather more tricky limitation

#

That probably requires some thinking about how to solve

smoky ocean
#

I was initially hoping to do a stopgap without bindings, but really without the full codegen it's useless, because you can't "weave" self-calls into the rest of your code

gloomy kindle
#

Yeah for id-ying your own objects, it's a bit miserable I think, I think we'd prooobably need to embed dagql ID construction into the SDKs? Or something similarly tricky

#

Not out of the question, I've wanted this anyways, but definitely not trivial

smoky ocean
#

I'm starting to think that we should prioritize self-calls aggressively, because it's a forcing function to address fundamental issues in our DX apparently

#

same for object persistence at the lower levels of the engine (buildkit caching etc)

gloomy kindle
#

I'd be happy to try picking this thread up again - I was working on it for a bit

smoky ocean
#

Nice, that would be a trio of longstanding platform gaps, that the LLM use case really needs, and might all get a much-deserved boost:

  1. Object persistence (via @steep onyx attacking the caching layer)
  2. Generated clients (via @hidden tartan who needs it for his experimental Docker SDK)
  3. Self-calls 🎉
smoky ocean
#

@shrewd ermine @spark phoenix quick status update:

  • I'm stuck on the remaining bugs. @spring wave offered to take a look at "bug 4", which is the most painful right now (can't load any module from the llm branch). He is timeboxed because of other commitments, so let's see if more help is needed today.

  • Meanwhile I'm turning my attention to melvin. Since I can't prompt from the melvin code yet (because of bug 4) I'm focusing on tools, and general project structure

  • I'm trying a "tools-first approach": what tools will the AI need for each part of the workflow? And what's the perfect environment to consume these tools? From there I design the corresponding sub-module. It has to be sub-modules, so that the top-level module can hook them up to a LLM (no self-calls means you can't hook up a LLM to your own module's types..)

  • Right now I'm focusing on a workspace sub-module, since I wrote several variations of it for my past llm modules. Basically a basic environment for editing files, mounting them in a container, and running commands. With easy passing of files in and out, and checking the history of changes (to supervise the work of models)

  • The next submodule, which is up for grabs, is github interactions. Imagine an LLM tasked with monitoring a github issue, communicating back and forth with users (including catching new comments, distinguishing them from your own, etc), perhaps with some abstraction for reporting a list of tasks and their status. I.e. a declarative API for managing those tasks in a stateful way, instead of a stateless firehose of github messages (like in my demo 🙂)

shrewd ermine
#

Nice the github one sounds similar to what @spark phoenix demoed

smoky ocean
#

Yeah exactly

#

Basically picture yourself driving that module from the shell, to accomplish a task. You can't use anything other than that module's API via dagger shell. Can you do it?

#

If you can do it, then the LLM can do it (basically)

#

After that, there are still big questions around the AI-specific parts. I.e. which module actually prompts the LLM to do what? If we split up the agent into multi-agent, how do the agents talk to each other?

But I feel like if we get the nice environments working first, we'll be better equipped to answer

#

(also until bug 4 is fixed, we don't have much choice 🙂)

hidden tartan
#

Because we could generate bindings to call the current module's functions

smoky ocean
#

Aha!

gloomy kindle
#

I'm not sure it fully solves it, you still need to know what your own functions are (the introspection part), which is tied into codegen itself.
Also still no way to get your own IDs.
But yes, it's a way to potentially get something (but we still fundamentally need to refactor the interface if we want this to be built into all modules)

hidden tartan
#

Yeah that's a lot of work indeed

#

I'm technically stuck on loading a module that doesn't have source for now, it's already a big challenge because the engine is assuming too many things when loading a module

gloomy kindle
#

Yeah are you working on untangling this? I'm happy to start on it, once I get back and get the next release prep started

#

Feels like it's blocking a few things

hidden tartan
#

To understand how it would work, I'm doing a side implementation of a Module that doesn't have source, and then I'll find a way to consolidate everything.
Ideally we would load a module differently depending on the task to perform/what's available

gloomy kindle
hidden tartan
#

So maybe a bunch of interfaces around Module could abstract that. I don't know yet because it's not working for now haha

#

Which one?

gloomy kindle
hidden tartan
#

I'll look at it

#

Oh yeah that's a very nice PR

smoky ocean
#

@harsh stag so you got it to build? 🙂

harsh stag
#

yes

#

I did

smoky ocean
#

ok so next steps:

  1. You can hook up core dagger objects to a llm, from the dagger shell, to get familiar with the possibilities

  2. as soon as we fix our remaining bug, you can start hooking up your own dagger modules to the llm so that it can drive them

harsh stag
#

ok

#

all good

#

I would like to share the MVP I'm working on

#

with you

smoky ocean
#

Try this as a start (from the dagger shell session):

llm | with-directory $(directory | with-new-file hi.txt "Hi Bob") | with-prompt "look for a text file, then read its contents. Who is the message addressed to?" | history
spring wave
#

@smoky ocean pushed a fix for bug 4

smoky ocean
#

THANK YOU! Was it very stupid?

#

actually don't answer that 😛

spring wave
#

it wasn't that stupid 😛 arguably something our codegen should handle (I left a comment + TODO)

smoky ocean
#

OK I feel less bad for not figuring it out then

spring wave
#

tl;dr there are some types that we exclude from module codegen (Engine, Host) - but codegen will still codegen fields that return those types, which is why it errors
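
To illustrate the failure mode described here: if codegen drops a set of types but still emits the fields that return them, the generated Go references undefined names, which matches the `undefined: Engine` errors in bug 4. Below is a minimal sketch of a consistent filter, assuming a heavily simplified schema model. `Field`, `TypeDef`, and `pruneExcluded` are hypothetical names, and the excluded set is limited to the two types named in the message:

```go
package main

import "fmt"

// Field and TypeDef form a deliberately simplified, hypothetical schema model.
type Field struct {
	Name       string
	ReturnType string
}

type TypeDef struct {
	Name   string
	Fields []Field
}

// pruneExcluded drops excluded types AND any field that returns one of them,
// so generated code never references a type that was not generated.
func pruneExcluded(types []TypeDef, excluded map[string]bool) []TypeDef {
	var out []TypeDef
	for _, t := range types {
		if excluded[t.Name] {
			continue
		}
		kept := TypeDef{Name: t.Name}
		for _, f := range t.Fields {
			if excluded[f.ReturnType] {
				continue
			}
			kept.Fields = append(kept.Fields, f)
		}
		out = append(out, kept)
	}
	return out
}

func main() {
	schema := []TypeDef{
		{Name: "Query", Fields: []Field{
			{Name: "container", ReturnType: "Container"},
			{Name: "engine", ReturnType: "Engine"},
			{Name: "host", ReturnType: "Host"},
		}},
		{Name: "Engine"},
		{Name: "Host"},
		{Name: "Container"},
	}
	excluded := map[string]bool{"Engine": true, "Host": true}

	// Print each surviving type with the number of fields it keeps.
	for _, t := range pruneExcluded(schema, excluded) {
		fmt.Println(t.Name, len(t.Fields))
	}
}
```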

smoky ocean
#

Ooh! I was wondering why those specific types

#

Thought it was an alphabetic order thing + race condition... that's how deep in the rabbit hole I was

#

testing now

spring wave
#

haha, i tried asking cursor "what do these types have in common?" after it indexed everything. but, no cigar. even though they're all literally listed together in the codebase somewhere, it just told me general stuff

smoky ocean
#

you just opened the floodgates @spring wave

spring wave
smoky ocean
#

@spring wave loading modules worked! But it doesn't seem to hook up the setter/getter in Llm for the module type?

#

Maybe I commented out that part while debugging and forgot to reconnect it?

#
dagger shell -m github.com/shykes/hello
⋈ llm with-hello $(.)
github.com/shykes/hello ⋈ llm | with-hello
Error: no function "with-hello" in type "Llm"
spring wave
#

seems like we shouldn't even need the module middleware anymore right?

smoky ocean
#

I don't know - didn't fully understand why we needed it in the first place

#

something about 2 different modes of introspection co-existing in the engine

spring wave
#

we needed it before because it had to add the agent fields to the module's type

smoky ocean
#

oooh

spring wave
#

but now the only thing that changes is the LLM type, and that's all through standard graphql inspection

#

i wonder if the schema actually is there, but the CLI has an outdated view of it, or something

smoky ocean
#

But will external clients (eg. my CLI session) find all the setter/getter fields in the LLM type, regardless of whether they reference module or core types?

spring wave
spring wave
#

but it's confusing because you'd expect to have that problem with the module functions too

#

looking into it

smoky ocean
#

nice thanks @spark phoenix !

#

I think I found a nice pattern for the workspace sub-module. Impatient to try it 🙂

#

@spark phoenix it feels like the pattern is to program the objects that the agent will interact with.

#

(eg. a workspace - makes sense that the workspace itself does not include prompts. It's not an agent, it's an object that an agent can use)

#

Ok I'm 99% there... pushed what I have. Quick lunch break then I try to plug a llm into it

#

Note: the LLM pattern brings a lot of clarity to what should be in constructor arguments, and what should be in WithFoo chained methods... depends on what you want inside vs. outside the sandbox

shrewd ermine
#

Nice that makes perfect sense

smoky ocean
#

Really happy with that "checker" pattern - in a perfect world it would be a dagger interface, but easier to use container + default args for now

smoky ocean
#

@shrewd ermine

start=$(git https://github.com/dagger/dagger | head | tree)
checker=$(container | from golang | with-default-args go build ./cmd/dagger)
ws=$(github.com/shykes/melvin/workspace --start=$start --checker=$checker)

# Does the start workspace build?
$ws | check

# Let's make a stupid change and check again
$ws | write cmd/dagger/foo.go 'package main typo typo typo' | check
spring wave
#

still looking into the shell thing but might have to timebox soon. the issue is similar to before: the field shows up with native graphql introspection, but it's not present in the fields for Llm listed under currentTypeDefs, so shell doesn't see it

smoky ocean
#

cc @steep onyx any chance we could bother you with this real quick?

#

We have another "I see it in the graphql introspection, but not in the dagger introspection" problem

spring wave
#

The interesting part is that we have a core type (LLM) that gets extended with new fields that refer to module types. I don't think we've ever had that dependency arrow direction before (core -> modules). I'm looking for places where we treat the core module as a "leaf" dependency, but from what I've found it should still end up installing to the same *dagql.Server that the other module dependencies install to. And yet, I'm seeing in the currentTypeDefs path for the core module that it doesn't see the fields via GraphQL native introspection. Despite them showing up from the outside.

smoky ocean
#

@shrewd ermine do you want to try the github submodule? Otherwise I'll give it a go in about 2h (after my board meeting 😛 )

shrewd ermine
smoky ocean
#

New bug just dropped...

⋈ llm | with-directory $(git https://github.com/dagger/dagger | head | tree | with-new-file hi.txt "Hello redpoint team") | with-prompt "read the contents of ./hi.txt. Who is the message addressed to?" | with-prompt "write a new file to hi-back.txt with a response from the Redpoint team, saying, wow this is so amazing." | history
🧑 💬read the contents of ./hi.txt. Who is the message addressed to?
🤖 💻 file({"path":"./hi.txt"})
💻 xxh3:b00afab666cda867
🤖 💻 Filecontents({"id":"xxh3:b00afab666cda867"})
💻 "Hello redpoint team"
🤖 💬The message is addressed to the "redpoint team."
🧑 💬write a new file to hi-back.txt with a response from the Redpoint team, saying, wow this is so amazing.
🤖 💻 withNewFile({"path":"hi-back.txt","contents":"Wow, this is so amazing. - The Redpoint Team"})
💻 error calling tool: toSelectable: unknown type "DirectoryID"
#

Never saw this error before... Maybe today's fixes introduced it as a regression?

smoky ocean
#

This one is fresh, probably introduced by a commit in the last 24h

shrewd ermine
#

can i see the exact timestamp of a span in cloud?

#

I think it might be coming from "case 3" https://github.com/shykes/dagger/blob/llm/core/bbi/flat/flat.go#L255

time="2025-02-05T03:38:51Z" level=debug msg="Loading tool from field" field=id type=Directory
time="2025-02-05T03:38:51Z" level=debug msg="Checking if type is an object" kind=SCALAR typeName=DirectoryID
time="2025-02-05T03:38:51Z" level=debug msg="Field returns non-object type. Tool will return its value" field=id type=Directory
bronze fern
bronze fern
shrewd ermine
spring wave
#

@smoky ocean pushed some incremental progress towards getting modules working. now it works with -m but it still doesn't pick up modules installed interactively

bronze fern
#
{
  "Version": 6,
  "Final": true,
  "ID": "ec509fb26a68ef32",
  "Name": "Llm.withPrompt",
  "StartTime": "2025-02-05T03:36:46.301423387Z",
  "EndTime": "2025-02-05T03:40:06.32252127Z",
  "Activity": {
    "CompletedIntervals": [
      {
        "Start": "2025-02-05T03:36:46.301423387Z",
        "End": "2025-02-05T03:40:06.32252127Z"
      }
    ],
    "EarliestRunning": "0001-01-01T00:00:00Z"
  },
  "ParentID": "f4772ea35cbed054",
  "Status": {
    "Code": 2,
    "Description": ""
  },
  "CachedReason_": [
    "span has children"
  ],
  "PendingReason_": [
    "span has completed"
  ],
  "CallDigest": "xxh3:89846ea5b17eb54c",
  "CallPayload": "ChV4eGgzOjM1ZmYxYjI3ZWJiMGUzZTMSBwoDTGxtGAEaCndpdGhQcm9tcHQiggEKBnByb21wdBJ4OnZ3cml0ZSBhIG5ldyBmaWxlIHRvIGhpLWJhY2sudHh0IHdpdGggYSByZXNwb25zZSBmcm9tIHRoZSBhZGRyZXNzZWUgb2YgdGhlIGZpcnN0IG1lc3NhZ2Ugc2F5aW5nIHdvdyB0aGlzIGlzIHNvIGFtYXppbmcuShV4eGgzOjg5ODQ2ZWE1YjE3ZWI1NGM=",
  "ChildCount": 5
}
shrewd ermine
bronze fern
#

yep, just got into that 👆

#

not sure how I got there 😂

shrewd ermine
smoky ocean
spring wave
#

alternatively we could refresh whenever a new module is served, but that seems like a bigger lift

smoky ocean
#

Weirdly, "bug 6" makes it impossible for the llm to modify a container or directory state, but it works fine with a module type. (eg. a melvin/workspace)

#

I got melvin to complete its first coding task 🙂

#

@spring wave wanna try?

#

a glimpse of the future, made possible by your dagql acrobatics

spring wave
#

yes please

smoky ocean
bronze fern
#

🍿

smoky ocean
#
#!/usr/bin/env dagger-llm shell -m github.com/shykes/melvin/workspace

# Starting point for the workspace
source=$(git https://github.com/dagger/dagger | tag v0.15.0 | tree)

# Checker container for the workspace. Here we try building the dagger CLI
checker=$(container | from golang | with-mounted-cache /go/pkg/mod $(cache-volume gomodcache) | with-default-args -- go build ./cmd/dagger)

# Setup workspace
ws=$(. --start $source --checker $checker)

# Run the agent!
agent=$(llm | with-workspace $ws | with-prompt "create a new go CLI at cmd/hello that just says hello. It should take an optional flag to say 'bonjour' in french instead. Use the 'check' tool to make sure the build is not broken")

# Get the result
result=$($agent | workspace)

# Print the diff
$result | diff

# Inspect interactively
$result | dir | terminal
bronze fern
#

source=$(...

need start? was trying to run line by line...

smoky ocean
#

Sorry should have checked that it runs first, but got over-excited 🙂

#

OK here's a one-liner version to run straight from your dagger-llm shell:

#
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 dagger-llm shell -m github.com/shykes/melvin/workspace <<'EOF'
llm |
with-workspace $(github.com/shykes/melvin/workspace --start $(git https://github.com/dagger/dagger | tag v0.15.0 | tree) --checker $(container | from golang | with-mounted-cache /go/pkg/mod $(cache-volume gomodcache) | with-default-args -- go build ./cmd/hello)
) |
with-prompt "write a new go CLI at cmd/hello that just says hello. It should take an optional flag to say 'bonjour' in french instead. Use the 'check' tool to make sure the build is not broken" |
workspace |
dir |
terminal
EOF
smoky ocean
bronze fern
#

oh, terminal from a directory?

smoky ocean
#

yeah

#

just interactively looking at the result

#

you can also replace | dir | terminal with | diff to see the changes

bronze fern
#

I wanted to end up with the go program in a golang container 🙂

#

to run myself. Not fair that the checker got to 😉

smoky ocean
#

Yeah 🙂 Same thought. I went for a very small piece that does as little as possible - just a sandbox for the LLM to write code, and get a green/red for its loop

#

What's nice is that it's agnostic - you can use it to write a frontend, or even docs

#

You can even have a checker that actually calls another llm and asks "does this look legit to you?"

#

(that would require making the checker a dagger interface rather than a container though)

bronze fern
#

totally! Guess I can foo=$(llm...dir) and put that in a container

spring wave
#

@smoky ocean pushed a couple UX things, feel free to rm

#
  1. spans for API calls that return a scalar now print the value to the span's logs, so now you see this instead of just nothing
#
  2. removed the silly string truncation, don't think we need it anymore now that we don't routinely pass giant schema inspection JSON strings around
#

bed time for me 👋

bronze fern
# bronze fern totally! Guess I can `foo=$(llm...dir)` and put that in a container
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ~/bin/dagger-llm shell -m github.com/shykes/melvin/workspace <<'EOF'
agentwork=$(llm |
with-workspace $(github.com/shykes/melvin/workspace --start $(git https://github.com/dagger/dagger | tag v0.15.0 | tree) --checker $(container | from golang | with-mounted-cache /go/pkg/mod $(cache-volume gomodcache) | with-default-args -- go build ./cmd/hello)
) |
with-prompt "write a new go CLI at cmd/hello that just says hello. It should take an optional flag to say 'bonjour' in french instead. Use the 'check' tool to make sure the build is not broken" |
workspace)
container | from golang | with-mounted-directory /app $($agentwork | dir) | with-workdir /app | terminal --cmd bash
EOF
#

works for me @smoky ocean 👆😁

#

not sure why it says "no checker configured"

smoky ocean
#

gonna make this code, like a demo module, will be nicer

#

but frustrating that I can't return the workspace from my demo module

bronze fern
#

Super cool. Changed my $foo above to $agentwork since I'm mounting some agent-produced work in my image after all 🙂

#

noticed I can't use a variable name like agent-promised-work with hyphens. Seems to be interpreted as a module name or function name.

#
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ~/bin/dagger-llm shell -m github.com/shykes/melvin/workspace <<'EOF'
# Agent does some work with LLM and tools in its workspace, including an automated qa check
agentwork=$(llm |
with-workspace $(github.com/shykes/melvin/workspace --start $(git https://github.com/dagger/dagger | tag v0.15.0 | tree) --checker $(container | from golang | with-mounted-cache /go/pkg/mod $(cache-volume gomodcache) | with-default-args -- go build ./cmd/hello)
) |
with-prompt "write a new go CLI at cmd/hello that just says hello. It should take an optional flag to say 'bonjour' in french instead. Use the 'check' tool to make sure the build is not broken" |
workspace)
# Human does a manual spot check
container | from golang | with-mounted-directory /app $($agentwork | dir) | with-workdir /app | terminal --cmd bash
EOF
smoky ocean
#

I pushed a code version. But it's not working, I think it's hitting what's left of bug 5 (module loading)

#
llm # workaround to global llm config bug

github.com/shykes/melvin/demo |
go-programmer "a terminal snake game with ncurses-like interface and a basic AI opponent. Heavy use of ascii art effects. make it psychedelic." |
terminal
Error: input: demo.goProgrammer load: Call: Query has no such field: "workspace"
smoky ocean
#

Updating my list of platform gaps for agent dev:

  1. Object persistence
  2. Generated clients
  3. Self-calls
  4. Type re-exporting (module A can export the types of module B)
  5. Type embedding
#

I wonder if this could be daggerized? https://news.ycombinator.com/item?id=42935659

brunohaid

Noice! Does anyone have a good recommendation for a local dev setup that does something similar with available tools? Ie incorporates a bunch of PDFs (~10,000 pages of datasheets) and other docs, as well as a curl style importer? Trying to wean myself off the next tech molochs, ideally with local functionality similar to OpenAIs Search + Reason, a...

spring wave
#

@smoky ocean just a heads up, the change I made to log the response from scalar API calls leaks secrets 😅 hopefully I can just re-use the censoring code

spring wave
#

the censoring code is a bit intertwined with Buildkit atm, thinking I can get away with just marking .plaintext sensitive internally, which has some precedent already (setSecret's plaintext arg) - trivial to implement, will push soon
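
The idea of marking a field as sensitive so its scalar result never reaches span logs can be sketched as below. This is not the engine's implementation: `FieldSpec` and `logValue` are hypothetical names, and the `***` placeholder is an arbitrary choice for the example:

```go
package main

import "fmt"

// FieldSpec is a hypothetical description of an API field.
type FieldSpec struct {
	Name      string
	Sensitive bool // when true, the returned value must never reach logs
}

// logValue returns the text to print in a span's logs for a scalar result.
// Sensitive fields are replaced by a fixed placeholder.
func logValue(f FieldSpec, value string) string {
	if f.Sensitive {
		return "***"
	}
	return value
}

func main() {
	// An ordinary scalar is logged as-is.
	fmt.Println(logValue(FieldSpec{Name: "contents"}, "Hello redpoint team"))
	// A field flagged as sensitive is redacted before logging.
	fmt.Println(logValue(FieldSpec{Name: "plaintext", Sensitive: true}, "sk-example"))
}
```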

#

pushed

smoky ocean
#

nice, huge ux boost for daggerized agents!

#

thank you

spring wave
#

taking a look at bug 6 unless there's a higher priority

smoky ocean
#

The toSelectable one?

spring wave
#

yep

smoky ocean
#

Yeah that one hurts - would be very helpful

#

Seems like a recent regression too

smoky ocean
#

Oh also I found a bug 7 🙂 But can deal with it

storm gate
#

I keep getting an error with the ".env"

Error: input: llmapp.foo select: failed to read secret file ".env": open .env: no such file or directory

I tried several locations but get the same error every time. Isn't it supposed to look in the working directory from where you run the CLI?

smoky ocean
#

@storm gate I think that's bug 7. Are you calling llm from a module or from the CLI?

storm gate
#

from a module, I want to experiment from the Go API directly

#

(so I created a dumb go mod)

smoky ocean
#

Yeah that's bug 7. Should be an easy fix.

Workaround: call llm without argument in the CLI (call or shell) before calling your module

smoky ocean
#

It's because I store the llm creds as a global variable engine-wide (or session-wide? 🤔), so even modules can use it. But you have to instantiate Llm once to trigger the loading of .env. Whoever loads it first, everyone gets to re-use it. The bug is that, if it's a module who loads it first, then the engine looks for .env in the module's runtime.
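
This is the behaviour you would expect from a lazily initialised shared value: the first caller decides where the file is looked up, and everyone after reuses the result. Below is a minimal sketch using `sync.Once`. `credStore`, its fields, and the directory names are hypothetical, and a map stands in for the filesystem. Whether the real engine also caches a failed load is not stated in the thread:

```go
package main

import (
	"fmt"
	"sync"
)

// credStore is a hypothetical load-once cache. fs stands in for the
// filesystem so the example needs no real files.
type credStore struct {
	once  sync.Once
	creds string
	err   error
	fs    map[string]string
}

// load resolves .env relative to the FIRST caller's directory and caches
// the outcome for every later caller, whatever directory they pass.
func (s *credStore) load(callerDir string) (string, error) {
	s.once.Do(func() {
		content, ok := s.fs[callerDir+"/.env"]
		if !ok {
			s.err = fmt.Errorf("open .env: no such file or directory (looked in %s)", callerDir)
			return
		}
		s.creds = content
	})
	return s.creds, s.err
}

func main() {
	fs := map[string]string{"/home/user/project/.env": "OPENAI_API_KEY=sk-example"}

	// Module loads first: its runtime directory has no .env, so the lookup fails.
	broken := &credStore{fs: fs}
	_, err := broken.load("/runtime")
	fmt.Println("module first:", err)

	// Workaround from the thread: the CLI loads first, then the module reuses it.
	warmed := &credStore{fs: fs}
	warmed.load("/home/user/project")
	creds, err := warmed.load("/runtime")
	fmt.Println("cli first:", creds, err)
}
```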

smoky ocean
#

@storm gate did the workaround fix it?

storm gate
spring wave
#

also what's that View: v0.13.2 doing there? thinkspin cc @smoky ocean

shrewd ermine
# bronze fern ```yaml _EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ~/bin/dagger-llm s...

Ran this with llama3.3 locally

root@ju9ikntv2jqee:/app# ls cmd/hello
main.go  main_test.go
root@ju9ikntv2jqee:/app# cat cmd/hello/main.go
package main

import (
        "flag"
)

func main() {
        language := flag.String("lang", "en", "Language")
        flag.Parse()

        if *language == "fr" {
                print("Bonjour\n")
        } else {
                print("Hello\n")
        }
}root@ju9ikntv2jqee:/app# go run ./cmd/hello
Hello
root@ju9ikntv2jqee:/app# go run ./cmd/hello --lang fr
Bonjour

very cool

spring wave
#

pushed fix for bug 6

❯ dagger-dev shell --no-mod
Dagger interactive shell. Type ".help" for more information. Press Ctrl+D to exit.
⋈ llm | with-directory $(git https://github.com/dagger/dagger | head | tree | with-new-file hi.txt "Hello redpoint team") | with-prompt "read the contents of ./hi.txt. Who is the message addressed to?" | with-prompt "write a new file to hi-back.txt with a response from the Redpoint
 team, saying, wow this is so amazing!" | history
🧑 💬read the contents of ./hi.txt. Who is the message addressed to?
🤖 💻 file({"path":"./hi.txt"})
💻 xxh3:b00afab666cda867
🤖 💻 Filecontents({"id":"xxh3:b00afab666cda867"})
💻 "Hello redpoint team"
🤖 💬The message is addressed to the "redpoint team."
🧑 💬write a new file to hi-back.txt with a response from the Redpoint team, saying, wow this is so amazing!
🤖 💻 withNewFile({"path":"hi-back.txt","contents":"Wow, this is so amazing! - The Redpoint Team"})
💻 ok
🤖 💬A new file named "hi-back.txt" has been created with the response from the Redpoint team.
⋈  
smoky ocean
spring wave
#

oh. i may have just broken that then. assumed it was AI slop 😅

smoky ocean
#

Probably 😛

spring wave
#

is there a repro for the withExec bug?

smoky ocean
shrewd ermine
smoky ocean
shrewd ermine
smoky ocean
#

so it's v1 of the ollama api?

#

I'm asking because eventually we could have something like LLM_ENDPOINT=ollama://wompbox.turkey-beta.ts.net/llama3.3 or something like that 🙂

shrewd ermine
#

I don't know the context on why it's called v1. But v1 is compatible with openai. It has similar but slightly different endpoints at / that ollama-native libraries use

smoky ocean
#

but was wondering if it should be ollama:// or llama3.3:// or maybe just http:// is enough

shrewd ermine
#

Ah got it. Since ollama can serve any number of models, I'm not sure where that should go. The /v1/ always has to get added somehow, though

#

ollama:// could work because it'll let us know to add /v1/ and other oddities we may discover. The model you might want configurable per-request still, even with openai or claude
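
A minimal Go sketch of the ollama:// idea discussed above, under stated assumptions: the Endpoint type and ParseEndpoint function are hypothetical names, the model is assumed to ride in the URL path, and "ollama" is assumed to mean an HTTPS host (as with the tailscale setup) serving the OpenAI-compatible API under /v1/. It is not the branch's implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Endpoint is a hypothetical resolved LLM endpoint: where to send
// OpenAI-style requests, and which model to ask for by default.
type Endpoint struct {
	BaseURL string // for example https://host/v1/
	Model   string // may be empty; callers can still override per request
}

// ParseEndpoint resolves a value such as "ollama://host/llama3.3".
// Assumption: the "ollama" scheme tells us to append /v1/ ourselves,
// and the URL path carries the default model name.
func ParseEndpoint(raw string) (Endpoint, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return Endpoint{}, err
	}
	if u.Host == "" {
		return Endpoint{}, fmt.Errorf("endpoint %q has no host", raw)
	}
	switch u.Scheme {
	case "ollama":
		return Endpoint{
			BaseURL: "https://" + u.Host + "/v1/",
			Model:   strings.Trim(u.Path, "/"),
		}, nil
	case "http", "https":
		// Plain URLs are used as given; the model is configured elsewhere.
		return Endpoint{BaseURL: u.String()}, nil
	default:
		return Endpoint{}, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	ep, err := ParseEndpoint("ollama://wompbox.turkey-beta.ts.net/llama3.3")
	if err != nil {
		panic(err)
	}
	fmt.Println(ep.BaseURL, ep.Model)
}
```

Keeping the scheme gives one place to add /v1/ and any other provider oddities, while the model stays overridable per request.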

steep onyx
#

Pulled the latest version of the dagger/llm branch and running llm first still gives me Error: input: llm select: failed to read secret file ".env": open .env: no such file or directory

did I miss another setup pre-req?

spring wave
#

yeah you'll need a .env file in the repo root, containing your OpenAI API key

steep onyx
#

thanks, was missing the M I was supposed TFR 😄

smoky ocean
#

@steep onyx you also need to call llm from the CLI first, before letting a module call dag.Llm() (bug 7)

steep onyx
shrewd ermine
#

for anyone else wanting to use their own hardware with ollama, here's my setup
OLLAMA_HOST=0.0.0.0 OLLAMA_ORIGINS=* ollama serve
and in another terminal
tailscale serve 11434
which lets me access ollama over tailscale with https
and then in another terminal you can pre-pull some models you might want to try, like
ollama pull llama3.3

smoky ocean
#

Taking a first stab at making the melvin/github module follow the workspace pattern

#

Maybe I should try Gerhard's notification module, and get a bunch of comm channels all at once?

#

@spark phoenix can I use that module to do the neat "edit comment in place" thing you did in your demo last year?

#

looking at the source code, it looks like it 💪

steep onyx
#

@smoky ocean @spring wave might have figured it out, at least I broke through the previous error and am now hitting:

Error: input: demo.goProgrammer Post "https://api.openai.com/v1/chat/completions": net/http: invalid header field value for "Authorization"

Dunno if I misconfigured something in my api key or what.

Either way, the diff that seems to fix it is:

sipsma@dagger_dev:~/repo/github.com/sipsma/dagger$ git diff
diff --git a/core/llm.go b/core/llm.go
index b24b18dc2..4bbb447b6 100644
--- a/core/llm.go
+++ b/core/llm.go
@@ -354,8 +354,8 @@ func (llm *Llm) messages() ([]openAIMessage, error) {
        return messages, nil
 }
 
-func (llm *Llm) WithState(ctx context.Context, objId dagql.IDType) (*Llm, error) {
-       obj, err := llm.srv.Load(ctx, objId.ID())
+func (llm *Llm) WithState(ctx context.Context, objId dagql.IDType, srv *dagql.Server) (*Llm, error) {
+       obj, err := srv.Load(ctx, objId.ID())
        if err != nil {
                return nil, err
        }
@@ -504,7 +504,7 @@ func (s LlmMiddleware) extendLlmType(targetType dagql.ObjectType) error {
                func(ctx context.Context, self dagql.Object, args map[string]dagql.Input) (dagql.Typed, error) {
                        llm := self.(dagql.Instance[*Llm]).Self
                        id := args["value"].(dagql.IDType)
-                       return llm.WithState(ctx, id)
+                       return llm.WithState(ctx, id, s.Server)
                },
                nil,
        )

Basically, we were storing a dagql.Server in the object instances for Llm, but I think we probably were hitting dagql cache and thus ending up using a server from a totally different client where workspace had never been stitched into the schema. So passing the server around explicitly instead seems to fix it by ensuring we are using the one where workspace had been installed.

I wouldn't be surprised if there's some more errors lurking after the Authorization one but my hypothesis is that if there are we could work around them with some well-placed .Sync(ctx) calls in the module code. TBD.

#

There's a similar problem with some other parts of the impl that were using the dagql.Server in the object, so I'll patch those up quick too and push this to the llm branch

spring wave
#

Ah yeah that sounds very similar to what I ran into with CoreMod

steep onyx
#

Oh actually I wonder if I'm getting those auth errors because I'm sourcing the key from a file w/ a \n at the end..

EDIT: no, doesn't seem like that fixed it

steep onyx
spring wave
#

I don't think the CoreMod case was caching related specifically, it was more like having two different sources of truth: the "pristine" Core schema installed into CoreMod.Dag vs. the "live/unified" schema that currentTypeDefs was trying to introspect. Didn't matter until now because the core schema never changed at runtime. But, yeah, it was another weird consequence of storing a dagql.Server on a long-lived object

smoky ocean
#

@steep onyx for that auth error, are you able to get a working query with just CLI core types? That would isolate whether it's linked to modules

steep onyx
# smoky ocean @steep onyx for that auth error, are you able to get a working query w...

Oh wait, okay there's multiple things:

  1. The \n was indeed a problem when sourcing from a file
  2. For some reason the values read from .env seem to be extremely cached, in that they never update even across separate dagger shell sessions until I restart the engine thinkspin

But now it works.

Next error was: process "go build ./..." did not complete successfully: exit code: 1

I had started with --interactive though and could see it failed to build just because there was only a main.go and no go.mod.

After manually running go mod init, the new error when trying to build is:

./main.go:143:16: invalid operation: cannot receive from non-channel termbox.PollEvent() (value of type termbox.Event)

Which I think just means the LLM generated go code that doesn't compile? So success 🎉 ? Sorta?
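
On the earlier Authorization error: a key sourced from a file usually ends in "\n", and Go's HTTP client rejects header values containing a newline when the request is sent. A small sketch of trimming the key before building the header follows. CleanKey and NewChatRequest are hypothetical helper names, and the request is only constructed, never sent.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// CleanKey strips surrounding whitespace, including the trailing newline
// that editors and `echo` typically leave at the end of a key file.
func CleanKey(raw string) string {
	return strings.TrimSpace(raw)
}

// NewChatRequest builds (but does not send) a chat completions request
// with a sanitized bearer token. Hypothetical helper for illustration.
func NewChatRequest(rawKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost,
		"https://api.openai.com/v1/chat/completions", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+CleanKey(rawKey))
	return req, nil
}

func main() {
	req, err := NewChatRequest("sk-example\n")
	if err != nil {
		panic(err)
	}
	// %q makes any stray control character visible.
	fmt.Printf("%q\n", req.Header.Get("Authorization"))
}
```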

smoky ocean
# steep onyx Oh wait, okay there's multiple things: 1. The `\n` was indeed a problem when sou...

Re: value of .env, the way I do it is... hacky.

I have a single global variable in the engine code. The first call to Llm (from any client, including module runtimes) triggers a callback to fetch file://.env (via new secrets provider code). Then it's persisted in the global variable for all other clients.

I do this to allow modules to "break out" and use the host's .env. But as a result yeah, it's engine-wide..

#

This 👆 is also why you have to call llm from the client, to force hydration of the .env from the host, before a module gets to call llm first. Otherwise, the module "wins" and my code tries to hydrate llm config from file://.env in the module runtime container

I tried to solve that problem by auto-hydrating, but actually didn't find a good place in the codebase to do it...
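
The "first caller wins" behavior described above can be sketched with sync.Once. This is only an illustration: Store, Config, and Loader are hypothetical names, and the real engine code may be structured differently (for instance, this sketch also remembers a failed first load, which the discussion does not confirm).

```go
package main

import (
	"fmt"
	"sync"
)

// Config stands in for the engine-wide LLM settings (hypothetical shape).
type Config struct {
	Source string
	Token  string
}

// Loader fetches configuration from wherever the calling client can reach,
// for example the host's .env or a module runtime container.
type Loader func() (Config, error)

// Store keeps one Config for every client of the engine.
type Store struct {
	once sync.Once
	cfg  Config
	err  error
}

// Get returns the shared config. Only the first caller's loader ever runs;
// loaders passed by later callers are ignored.
func (s *Store) Get(load Loader) (Config, error) {
	s.once.Do(func() { s.cfg, s.err = load() })
	return s.cfg, s.err
}

func main() {
	var store Store
	host := func() (Config, error) {
		return Config{Source: "host .env", Token: "sk-host"}, nil
	}
	module := func() (Config, error) {
		return Config{}, fmt.Errorf("open .env: no such file or directory")
	}

	// The CLI goes first, so the module later reuses the host's config.
	cfg, err := store.Get(host)
	fmt.Println(cfg.Source, err)
	cfg, err = store.Get(module)
	fmt.Println(cfg.Source, err)
}
```

Swapping the two calls in main reproduces the failure mode: the module's loader runs first and everyone inherits its result.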

#

./main.go:143:16: invalid operation: cannot receive from non-channel termbox.PollEvent() (value of type termbox.Event)

I have never seen this error before in my life

steep onyx
smoky ocean
#

Oh! I thought it was from our code 🙂

#

Normally it should continue in a loop until the code builds

steep onyx
smoky ocean
#

Pulling fixes now 🙂 Thank you!

smoky ocean
#

If anyone's around, let me know if you want to see a cool demo 🙂

#

@shrewd ermine more fun if you have the page open to see live updates 🙂

#

@shrewd ermine what's your favorite movie?

shrewd ermine
shrewd ermine
smoky ocean
#

(engine build... can't wait for those caching improvements)

#

of course out of nowhere the build is 10x slower

#

still building...

#

finally

#

OK @shrewd ermine I'm going to run this:

llm |
with-github-progress-report $(
  new-progress-report interstellar GH_TOKEN kpenfound dagger-modules 8
) |
with-prompt "You are the hero of the movie Interstellar. Retrace the whole story of the movie. As you go through it, share your journey with us in your progress report. Also keep us updated on the various tasks you accomplish throughout your journey" |
history
#

(quick reset as I configure my github token)

#

Ha ha that was terse... Let's try again

#
llm |
with-github-progress-report $(
  new-progress-report interstellar GH_TOKEN kpenfound dagger-modules 8
) |
with-prompt "You are the hero of the movie Interstellar. Retrace your whole journey, and send us updates as you experience it. Write the summary in movie script style. Also keep track of your tasks throghout the adventure." |
history
#

@shrewd ermine now I'm starting a parallel pipeline for a different movie (eg. different progress update) on the same issue.

shrewd ermine
smoky ocean
#

hold on, re-building the engine... I think there's a new bug, it seems that every completed session causes the engine to be unreachable...

#

(and by "re-building" I mean "re-running, but with a 120 second rebuild overhead")

#

Ok I'm preparing the double movie report, setting it up to record

shrewd ermine
#

Ok I got you back now

smoky ocean
#

I'm happy that it's so re-entrant, very compatible with Dagger's signature rapid dev loop

shrewd ermine
smoky ocean
#

(thanks to @spark phoenix 's marker trick, I'm using his module under the hood)

#

Ok but will you dare plug a brain into it? 🙂

shrewd ermine
#

Seems easy enough! can either do what we had working earlier and save the dir to a var, put that to the with-changes, or teach an llm to use the feature-branch module

smoky ocean
#

yeah actually that part wouldn't be LLM-connected I guess

shrewd ermine
# smoky ocean Ok but will you dare plug a brain into it? 🙂

could use some better prompting, but here's what I got
https://github.com/shykes/x/pull/9

export GITHUB_TOKEN=$(gh auth token)
_EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:1234 ~/bin/dagger-llm shell -m github.com/shykes/melvin/workspace <<'EOF'
# LLM work
agentwork=$(llm |
with-workspace $(github.com/shykes/melvin/workspace --start $(git https://github.com/shykes/x | head | tree)) | with-prompt "look around the repository and summarize its purpose in /README.md" | workspace)
# PR
github.com/kpenfound/feature-branch | with-github-token env:GITHUB_TOKEN | create github.com/shykes/x "add_readme_llm" --fork-name "shykes-x" | with-changes $($agentwork | dir | without-directory .git) | pull-request "Add Readme" "This adds a basic readme"
EOF

So in theory another agent can be driving this workflow instead. 1) read issue (not included), 2) make changes (agentwork), 3) make PR (final step)

smoky ocean
smoky ocean
#

We're going to need Dagger interfaces... lots of them

#

also, multi-object (named variables idea we discussed @spring wave) is going to be a must

smoky ocean
#

Could anyone check if they can reproduce my issue:

  • Run llm engine
  • Run a complete shell session against that engine
  • Try running a second session -> can't connect to the remote port
  • Engine looks like it's still running (no crash or obvious error message) but impossible to reach it
shrewd ermine
#

checking. Btw it looks like a binary was committed to melvin at github/github

smoky ocean
#

oh no it's the curse of Devin

shrewd ermine
#

can't reproduce that issue

devout magnet
smoky ocean
#

@spring wave I'm moving up a layer to Melvin, trying to build something cool on the primitives we have.

I was wondering... How do you feel about trying to implement multi-object? (ability to give the llm several objects with variable names). I'm 100% convinced it will be better. Without it, you find yourself doing more gluing together with traditional code than you really need to. Also that part of the Dagger DX is not great - eg. can't embed types, so if I have 2 modules and I want to give the same llm access to both, I need to create a wrapper type and basically copy-paste everything...

Of course the challenge is getting it to work 🙂 I have a reasonable idea of how it would work - but it's a BBI change. Not flat anymore... Closer to your graphql implementation I think.

wdyt?

spring wave
#

Trying to judge if context is going from n=1 to n=T or n=(T * V) (T=types, V=vars of that type)

smoky ocean
#

there's a state dagql.Typed. It would have to become state map[string]dagql.Typed

#

I actually started down that route when refactoring from agent to llm branch. Then realized "this is a significant BBI change, it's doable but one thing at a time"

smoky ocean
#

To break down the big problem into smaller problems, we could hack multi-object into the current flat BBI. We could inject special "meta-tools" prefixed with _ or something, always available, that would allow the model to list variables, select a variable, set a variable, etc. The key is the concept of "select", then the rest of the flat BBI applies to the currently selected variable. Could actually perform quite well
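
A rough Go sketch of the meta-tool idea, assuming hypothetical tool names _list, _set, and _select and a hypothetical Env type. The state becomes a map of named objects plus a "selected" name, and the existing flat tools would operate on whatever Current returns.

```go
package main

import (
	"fmt"
	"sort"
)

// Env holds named objects plus the name of the currently selected one.
type Env struct {
	vars     map[string]any
	selected string
}

// NewEnv returns an empty environment with nothing selected.
func NewEnv() *Env {
	return &Env{vars: map[string]any{}}
}

// Call dispatches the hypothetical "_"-prefixed meta-tools.
func (e *Env) Call(tool, name string, value any) (any, error) {
	switch tool {
	case "_list":
		names := make([]string, 0, len(e.vars))
		for n := range e.vars {
			names = append(names, n)
		}
		sort.Strings(names) // stable order for the model
		return names, nil
	case "_set":
		e.vars[name] = value
		return "ok", nil
	case "_select":
		if _, ok := e.vars[name]; !ok {
			return nil, fmt.Errorf("no variable named %q", name)
		}
		e.selected = name
		return "ok", nil
	default:
		return nil, fmt.Errorf("unknown meta-tool %q", tool)
	}
}

// Current returns the selected object; the regular flat tools would be
// generated from its type.
func (e *Env) Current() (any, bool) {
	v, ok := e.vars[e.selected]
	return v, ok
}

func main() {
	env := NewEnv()
	env.Call("_set", "code", "directory to review")
	env.Call("_set", "score", -1)
	if _, err := env.Call("_select", "score", nil); err != nil {
		panic(err)
	}
	names, _ := env.Call("_list", "", nil)
	cur, _ := env.Current()
	fmt.Println(names, cur)
}
```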

#

If you're already heading down the gql route, let me know, I can try this 👆 in parallel (but probably not today)

#

@spark phoenix one gap that I found, is that using Dagger interfaces as callbacks would be very valuable... That checker trick in workspace works super well, but it only works for running a raw container... Without interfaces I can't eg. have the agent's innermost devloop send updates as it encounters issues for example.

smoky ocean
#

Another task if anyone is looking to help: Llm.WithPromptTemplate(prompt string, values ...) with builtin support for mustache/handlebars templating would be nice. (Not sure what type we could use for values) --> actually there's something simpler we can do

#

Also WithPromptFile which would take a dagger.File instead of a string. That would allow me to move prompts into separate text files and grab them as contextual arguments... Would make the code way cleaner (half of my code is embedded strings atm).

smoky ocean
#

Feature request: add support for Claude Sonnet 3.5, and perhaps also Gemini?

smoky ocean
#

The DX with WithPromptFile + optional vars for templating is really nice

#
$ cat prompts/historian.txt
You are a historian. Given a period or historical event, provide a paragraph summarizing what you know about it; then bullet points about key facts and events.

The topic is: $topic
func (h Historian) Explain(
  ctx context.Context,
  // The topic to explain
  topic string,
  // +optional
  // +defaultPath="prompts/historian.txt"
  prompt *dagger.File,
) (string, error) {
  return dag.
    Llm().
    WithPromptFile(prompt, dagger.LlmWithPromptFileOpts{Vars: []string{"topic", topic}}).
    LastReply(ctx)
}

No more embedded strings for me 🙂
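
For readers wondering how $topic could be filled in from a flat list of key/value pairs, here is a standalone sketch using the standard library's os.Expand. Render is a hypothetical function, and this is not necessarily how WithPromptFile is implemented on the branch.

```go
package main

import (
	"fmt"
	"os"
)

// Render substitutes $name and ${name} placeholders in template using
// flat key/value pairs, e.g. []string{"topic", "the French Revolution"}.
// Placeholders with no matching key expand to the empty string.
func Render(template string, vars []string) (string, error) {
	if len(vars)%2 != 0 {
		return "", fmt.Errorf("vars must be key/value pairs, got %d items", len(vars))
	}
	values := make(map[string]string, len(vars)/2)
	for i := 0; i < len(vars); i += 2 {
		values[vars[i]] = vars[i+1]
	}
	return os.Expand(template, func(key string) string {
		return values[key]
	}), nil
}

func main() {
	prompt, err := Render("The topic is: $topic", []string{"topic", "the French Revolution"})
	if err != nil {
		panic(err)
	}
	fmt.Println(prompt)
}
```

A flat string list is clumsier than a map, but it fits APIs where optional arguments are limited to simple list types.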

smoky ocean
#

@shrewd ermine

. GH_TOKEN github.com/kpenfound/dagger-modules 8 | go-programmer --start $(git https://github.com/dagger/dagger | head | tree) "Improve the core/ and core/schema/ with a top-level file() function that returns an empty File, similar to how directory() returns an empty Directory. Calling file() (in graphql/dagql: '{ file(name: "foo", contents: "bar") { ... } }' is a convenience wrapper for '{ directory { with-new-file(path:"foo", contents: "bar") { file(path: "foo")} { ... } } }'. Look at core/file.go core/directory.go core/schema/file.go core/schema/directory.go. Depending on how the code is laid out, you may need to also look at core/query.go and core/schema/query.go. Please be careful to keep the change clean and concise. Necessary changes only"
smoky ocean
#

Can't wait for this:

llm --model openai://gpt-4o
llm --model google://gemini2
llm --model anthropic://claude3.5-sonnet
llm --model ollama://llama3.3
etc. etc.

Just abstract away the minutiae of setting up 1) hostname 2) path 3) auth 4) creds. Also I heard some providers have you select the model in the query, and others in the http header.

smoky ocean
#

@shrewd ermine FYI I just pushed some cleanup I hadn't committed

storm gate
#

Early support for Claude Sonnet as well as a LLMProvider abstraction to support others: https://github.com/shykes/dagger/pull/297 - I made it as a draft PR on @smoky ocean's fork because I don't want to break the branch. Claude Sonnet seems to work fine but I have issues with gpt4o, not sure if I intro'd a regression, I'll spend more time debugging.

GitHub

Draft PR to implement LLM providers for supporting more than OpenAI.
WIP...

smoky ocean
#

Btw @storm gate there is a feature request from @shrewd ermine to make the model configurable as an argument to llm() (so that you can mix and match different models in the session). I haven't started yet, so if you want to do it, it's all yours. Otherwise, just keep in mind that it's coming, in case that changes how you want to do your PR (to avoid conflicts I mean)

storm gate
smoky ocean
#

So it will only be the default llm settings that are global to the engine

#

Actually the Llm type already has the fields to persist its own copy of Model, Host, Path, Token, so if you add arguments to the type constructor, and fallback to using the corresponding field from the global variable as a default, it should be all you need.
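
The fallback described above, sketched in isolation. Settings and WithDefaults are hypothetical names; the four fields mirror the ones mentioned (Model, Host, Path, Token), and the sample values are placeholders.

```go
package main

import "fmt"

// Settings mirrors the four fields named in the discussion.
type Settings struct {
	Model, Host, Path, Token string
}

// WithDefaults returns a copy of s in which every empty field is filled
// from d, so per-call arguments win and the global config is the fallback.
func (s Settings) WithDefaults(d Settings) Settings {
	if s.Model == "" {
		s.Model = d.Model
	}
	if s.Host == "" {
		s.Host = d.Host
	}
	if s.Path == "" {
		s.Path = d.Path
	}
	if s.Token == "" {
		s.Token = d.Token
	}
	return s
}

func main() {
	// Engine-wide defaults (placeholder values).
	global := Settings{
		Model: "gpt-4o",
		Host:  "api.openai.com",
		Path:  "/v1/chat/completions",
		Token: "sk-global",
	}
	// A caller overrides only the model; everything else falls back.
	perCall := Settings{Model: "llama3.3"}.WithDefaults(global)
	fmt.Printf("%+v\n", perCall)
}
```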

#

The only reason we need that global variable really (and have to deal with the stickiness problems) is for the token, so that modules don't have to provide a token. But for model config it's fine I think

(until a random module decides to make 100 requests to o1 using my account... but we can worry about that later 😛 )

storm gate
#

Yeah makes sense, also it’ll allow to pass the config directly in code instead of relying on the .env

smoky ocean
#

@shrewd ermine FYI I stopped passing prompts as contextual dir, it makes the examples look more complicated. I used dag.CurrentModule().Source() instead

smoky ocean
#

@shrewd ermine I'm trying to decide how to divide up the different github-related modules

shrewd ermine
smoky ocean
#
  • We embed our own github client to get the issue contents, but then call aluzzardi/daggerverse/github-comment for the in-place editing, which uses its own client lib
  • My progress update logic is not super github-specific, so ideally I would move it into its own module. But in practice it only works with github today
#

@shrewd ermine can you explain the patch to github/ that loads the contents of the issue at initialization? What does that do?

shrewd ermine
#

Yeah since the github-comment module is write-only we didn't have a mechanism to read/store the issue title and body which I wanted to use that as the assignment if one wasn't provided. Not the cleanest

#

I should just split it out to a read-only issue module

smoky ocean
#

I think I'll move the progress report stuff to its own progress module, and keep it write-only (since it's a progress report) seems easier to understand

#

in practice it will be github-only (for now) but can be expanded later to support eg. discord, etc

shrewd ermine
#

Yeah that abstraction makes sense to me

smoky ocean
#

I can tell we're going to need interfaces soon... not looking forward to that

shrewd ermine
#

Layers on layers

smoky ocean
#

@shrewd ermine I'm adding a reviewer agent to give extra feedback to the coder agent, beyond "it builds". So far implementation is 6 lines 🙂

smoky ocean
#

@spring wave not demo-relevant, but just to confirm: I think multi-object, and in particular typed multi-object, will be a game changer. Especially for making it super easy to get "callbacks" from the model.

For example right now I'm slapping together a code-reviewer agent, to quickly tell me "hey does this seem good to you?" so I can wrap the "get it to build" devloop into another "get it to pass review loop" (and those will be actual for-loops, how cool is that).

BUT there is the question of the for-loop condition. I can easily get a long bla-bla-bla from the reviewer (last-reply). But in this case I need a boolean (merge / no-merge) or perhaps an int (review score 0-10) so my code can decide when to stop. So I'm missing a low-friction way to pass typed information back and forth. I can't use the state because there's only one slot at the moment and it's taken by the directory to review. So this is where having a 2nd typed object which can receive the score would be perfect.

And one more thing: in simple cases like a boolean or integer score, it will be extra convenient I am guessing, if we can attach scalars as variables. So I would just go:

score := dag.Llm().
  WithInteger("score", "Review score. 0/10 means unacceptable; 10/10 means perfect", -1).
  WithDirectory("code", "The code to review", source).
  WithPrompt("review the source code and set its score from 0 to 10").
  Integer("score")
#

The scalar thing is kind of a bonus. We can start with custom objects and then continue from there. I'm just trying to imagine the most lean possible DX, since simplicity seems to be the selling point here
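
To make the "actual for-loop" concrete, here is a sketch of the outer review loop with the two agents stubbed out as plain functions. Coder, Reviewer, and Iterate are hypothetical names; in the real setup each would wrap an LLM call, and the integer score is exactly the typed value that is hard to get back today.

```go
package main

import "fmt"

// Coder produces a new revision of the code given the last review feedback.
type Coder func(feedback string) string

// Reviewer returns a score from 0 to 10 plus free-form feedback.
type Reviewer func(code string) (score int, feedback string)

// Iterate alternates coding and reviewing until the score reaches passing
// or maxRounds is exhausted, and returns the last revision and its score.
func Iterate(code Coder, review Reviewer, passing, maxRounds int) (string, int) {
	var src, feedback string
	score := -1 // "not yet scored", like the -1 default in the example above
	for round := 0; round < maxRounds && score < passing; round++ {
		src = code(feedback)
		score, feedback = review(src)
	}
	return src, score
}

func main() {
	revisions := 0
	coder := func(feedback string) string {
		revisions++
		return fmt.Sprintf("revision %d", revisions)
	}
	// Stub reviewer that warms up over time: 4, then 6, then 8.
	scores := []int{4, 6, 8}
	reviewer := func(code string) (int, string) {
		s := scores[revisions-1]
		return s, fmt.Sprintf("%s scored %d/10", code, s)
	}
	src, score := Iterate(coder, reviewer, 8, 3)
	fmt.Println(src, score)
}
```

The maxRounds cap matters in practice: an easily swayed or overly harsh reviewer would otherwise end the loop too early or never.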

smoky ocean
#

damnit, it's getting stuck on missing go.sum and trying to manually edit it... need to give it a way to run shell commands

#

Ha ha it worked 🙂

shrewd ermine
#

oh nice! go 1.16, a fine vintage

smoky ocean
#

OK I found a way to punt on shell commands (my checker just runs go mod tidy && go build ./...)

#

Gotta try giving it something harder

shrewd ermine
#

What was the prompt for that out of curiosity

smoky ocean
#

The reviewer is too easy to sway. Trying again 🙂

#

"The code is just a placeholder with a giant FIXME. 8/10"

shrewd ermine
#

Yeah this is why I thought I was seeing prompt entropy with the reporter. I think it's just down to prompt crafting though. It will take that first part of the prompt and make sure it solves that, assuming the rest is bonus points or something

smoky ocean
#

coder agent reads its review score: 4. Coder agent is sad...

#

now at 6/10!

#

(just pushed everything)

smoky ocean
#

Hello newcomers! 👋 This is where the magic happens 🙂 The whole Dagger team is nearby, don't be shy, we're happy to answer questions and help you get set up

smoky ocean
#

@spring wave is there any way to make the LLM traces look awesome in the experimental cloud, without losing the cool emojis in stable cloud v3?

spring wave
smoky ocean
spring wave
#

they're spans with their message as logs instead of the span name (so they can stream)

#

which i think is worth keeping

smoky ocean
#

Ie. I saw now there's a span that says "LLM response" and you have to expand it to see its "logs". So is it OK if I just change that span name to the actual response? or weird?

#

Ah

#

Ah for the streaming

#

Yeah that makes sense

spring wave
#

i think it's weird - and poor UX, since you have to wait for the entire response to be done

smoky ocean
#

I did notice some streaming in the TUI... thought I was hallucinating at first 🙂 Very very cool

spring wave
#

so if you just make it '<emoji> response' and '<emoji> prompt' or something that's probably a good balance

smoky ocean
#

Also FYI the TUI seems a little more prone to locking, and display glitches

#

(this was on last morning's version though)

spring wave
#

are you on latest llm? i pushed a bug that was live for a bit

#

it was requesting the terminal's background color on every paint for the markdown logs 😅

#

yea try just pulling and rebuilding the CLI (don't need to rebuild the entire engine - go build -o ./bin ./cmd/dagger should do fine)

smoky ocean
#

Nice, will try that now

spring wave
#

i'm also trying to get the new bubbletea shell in a good enough state for the demo, i think i'm close. will ping when it's ready to try out

#

it's already pretty usable, has autocomplete and all that, just want to make some changes to the UI

#

that's on vito/llm-shell if you want to try building muscle memory early

smoky ocean
#

Can you remove the builtins .foo from the autocomplete while you're at it? 😛

spring wave
#

from what complete scenario?

smoky ocean
smoky ocean
spring wave
#

lol k

smoky ocean
#

Something else we learned / confirmed yesterday:

  • Controlling model name from the function is non-negotiable. Models are not interchangeable, they are an application concern. The same codebase will orchestrate different models for different tasks. We can't abstract that away. Cost; capabilities; performance; and even prompts; none of those things are portable across models.

  • As a convenience, we can choose a reasonable default; and also expose generic families of model, to give developers the choice of making their function more portable. Then it becomes a SWE decision: up to you how portable vs. optimized you want your code to be.

  • Corollary of the above: we may need to expose discovery of available models (ie. those that have a valid configuration in the current engine).

  • Routing from model name to llm endpoint (host + path + creds + raw client) can be hidden from the function code, and left to the operator to configure. The result may be something between secrets providers, and docker config (different creds for different registries)
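
One way to picture the split between function code (which names a model) and operator config (which knows endpoints and creds) is a small routing table. Everything here is an assumption for illustration: the Router and Route types, the TokenEnv field, and the sample hosts.

```go
package main

import (
	"fmt"
	"sort"
)

// Route is the operator-side configuration for one model (hypothetical).
type Route struct {
	Host     string
	Path     string
	TokenEnv string // name of the variable holding the credential, if any
}

// Router maps model names, chosen by function code, to routes that only
// the operator configures.
type Router struct {
	Default string
	Routes  map[string]Route
}

// Resolve returns the route for model, using the default when model is empty.
func (r Router) Resolve(model string) (string, Route, error) {
	if model == "" {
		model = r.Default
	}
	route, ok := r.Routes[model]
	if !ok {
		return "", Route{}, fmt.Errorf("model %q is not configured in this engine", model)
	}
	return model, route, nil
}

// Available lists the models that have a route, for discovery.
func (r Router) Available() []string {
	names := make([]string, 0, len(r.Routes))
	for name := range r.Routes {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	r := Router{
		Default: "gpt-4o",
		Routes: map[string]Route{
			"gpt-4o":   {Host: "api.openai.com", Path: "/v1/chat/completions", TokenEnv: "OPENAI_API_KEY"},
			"llama3.3": {Host: "wompbox.turkey-beta.ts.net", Path: "/v1/chat/completions"},
		},
	}
	name, route, err := r.Resolve("llama3.3")
	fmt.Println(name, route.Host, err)
	fmt.Println(r.Available())
}
```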

smoky ocean
#

@spring wave let's move the oss coding stuff here 🙂

#

@spring wave looks like my issue was a fluke... Ctrl-a Ctrl-e works in zed also now

spring wave
#

ok cool

#

im sure it's fine padme_right

smoky ocean
#

@gloomy kindle @compact swan is any of your current work (sdk config etc) relevant to a global client-side config? We might need that to clean up LLM config... Right now it's a .env in the current directory with a few flat vars... We're going to need something closer to a docker config, with multiple llm endpoints, each with their own config. How far off are we from having that kind of plumbing available?

spring wave
#

@smoky ocean pushed Ctrl+C fix

smoky ocean
#

HOLY 🤯💩🤯💩🤯💩🤯💩🤯💩 now I see it

#

streaming the spans and everything

#

erasing history now 🙂

#

actually tools is a bigger problem than history

#

@spring wave I don't know why, some things don't work as well in the default zed terminal. Ctrl-L doesn't clear, but in ghostty it does.

spring wave
#

hmm weird

#

ill try it out on my MBA (i finally got one i can use for testing 🎉)

smoky ocean
#

@spring wave copy-paste doesn't work FYI

#

(in ghostty)

spring wave
#

are you doing plain old cmd+v?

#

or you mean you can't select stuff to copy?

#

ah the second part is definitely true, maybe i just need to disable mouse input for this

smoky ocean
#

can't select

#

another bug: calling a function without required arguments, doesn't print the error

spring wave
smoky ocean
#

I'm getting the streaming but then it disappears at the end. Is that something I need to toggle?

spring wave
#

(and <tab> or i to return to input mode)

spring wave
#

one sorta clunky thing is with increased verbosity now you also see all the spans for the individual field selections

#

i'd love to tuck those under an internal span to hide them, but they're actually what drives the real evaluation

#

maybe i can change that query to just id or sync and then do those as a separate query thinkspin

smoky ocean
#

Ok I got it to work by pressing alt-+ a random number of times until it showed it.

#

I think no visual indicator of verbosity level right?

spring wave
#

only in non-input mode (<esc>)

smoky ocean
#

✔ llm | with-prompt "your name is bob" | with-container $(container) | with-prompt "You have access to  a container. install nodejs on it" | container
defaultArgs: []
entrypoint: []
mounts: []
platform: linux/amd64                                                                                                                          default│ │ │ ┃ main.main()
│ │ │ ┃         /app/cmd/init/main.go:115 +0x906
│ │ │ ! process "apt-get install -y curl" did not complete successfully: exit code: 2
│ │ │ ✘ .sync: ContainerID! = xxh3:5f8c1553c7884d06 0.5s
│ │ │ ! process "apt-get install -y curl" did not complete successfully: exit code: 2
│ │
│ │ ✘ withExec(stdin: "", args: ["apt-get", "update"], insecureRootCapabilities: true, noInit: false, redirectStderr: "", redirectStdout: "", expand:
│ │ ! process "apt-get update" did not complete successfully: exit code: 2
│ │ │ ✘ Container@xxh3:6934f6e558023746.withExec(args: ["apt-get", "update"], expand: true, expect: SUCCESS, experimentalPrivilegedNesting: true, inse
│ │ │ ┃ panic: exec: "apt-get": executable file not found in $PATH
│ │ │ ┃
│ │ │ ┃ goroutine 1 [running]:
│ │ │ ┃ main.main()
│ │ │ ┃         /app/cmd/init/main.go:115 +0x906
│ │ │ ! process "apt-get update" did not complete successfully: exit code: 2
│ │ │ ✘ .sync: ContainerID! = xxh3:f74eb22c5bd89241 0.3s
│ │ │ ! process "apt-get update" did not complete successfully: exit code: 2
│ │
│ │🤖 It seems there is an issue running the  apt-get update  command, which might be due to a base image that doesn't use  apt  as its package
│ │ ┃ manager. Let's first check which package manager is available in this container and then proceed with the installation steps for Node.js
│ │ ┃ accordingly. Can you please provide more details or context about the container's base image?
│ $ .Container: Container! = xxh3:2fbfa0d0748ed4a0 0.0s CACHED
│
│ $ loadContainerFromID(
│ │ │ id: $ Container.from(address: "alpine"): Container! = xxh3:ff04b88d02461bd7 0.0s CACHED
│ │ ): Container! = xxh3:0621482c6c1c7b28 0.0s CACHED

. ⋈
smoky ocean
spring wave
#

2 should do

#

so just +1 from default

smoky ocean
#

ok!

spring wave
#

could also do dagger shell -v

smoky ocean
#

@spring wave sorry separate question. I'm trying to make a barebones implementation of Workspace, that's so simple I can live code that in a demo

#

Basically read, write, build

#

My issue is that dag.Container().From("golang").WithExec([]string{"go", "build"}).Sync(ctx) will not return the full stderr in the error. So I can't just pass the error through to the model. It will only see "process exited with code 1 blablabla".

I have to manually set expect: Any, then check for exit code, get stderr, etc etc.

Am I holding it wrong? Is there a way to get the actual stderr in the error without my own glue code?

#

separately: calling a module function that returns an object, prints the ID

spring wave
#

i'd expect something like this:

_, err := dag.Container().WithExec(...).Sync(ctx)
if err != nil {
  var execErr *dagger.ExecError
  if errors.As(err, &execErr) {
    // ... access execErr.Stdout / execErr.Stderr / execErr.ExitCode
  }
}
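
For anyone who wants to try the pattern without the SDK, here is a stdlib-only sketch. `ExecError` below is a local stand-in written for this example (an exit code plus captured stderr), not the SDK's type, and `runBuild` and `describeFailure` are hypothetical helpers. The point is only that `errors.As` can dig the structured error out of a wrapped chain even when the message itself is terse.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecError is a hypothetical stand-in for an SDK exec error type.
type ExecError struct {
	Cmd      []string
	ExitCode int
	Stderr   string
}

// Error is deliberately terse: the message carries no stderr.
func (e *ExecError) Error() string {
	return fmt.Sprintf("process %q exited with code %d", e.Cmd, e.ExitCode)
}

// runBuild simulates a failed exec whose error has been wrapped on the way up.
func runBuild() error {
	inner := &ExecError{
		Cmd:      []string{"go", "build"},
		ExitCode: 1,
		Stderr:   "go: go.mod file not found",
	}
	return fmt.Errorf("sync container: %w", inner)
}

// describeFailure returns the text you might pass on to the model.
func describeFailure(err error) string {
	var execErr *ExecError
	if errors.As(err, &execErr) {
		return fmt.Sprintf("exit code %d: %s", execErr.ExitCode, execErr.Stderr)
	}
	return err.Error()
}

func main() {
	fmt.Println(describeFailure(runBuild()))
}
```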
#

the stdout/stderr was removed from the error message because it makes the UI insanely verbose and redundant

spring wave
#

that's standard go error handling?

smoky ocean
#

(in the context of my demo - showing Go SDK-specific tricks etc)

#

Yeah I guess

#

It's just that the audience isn't a Go audience, so I'm pushing it by showing them Go at all

spring wave
#

how about python 😛

#

but you're right, it's hard for LLMs to grok the full details when things fail, i think we need a way to feed them the telemetry report or something

#

i ran into that issue with the Claude MCP demo

#

it'd run a command, the command would print a clear failure that I could see, but Claude had to continue on blindly

smoky ocean
#

Why is it redundant to add stderr to the error?

#

In the context of BBI, I don't have any special information about the behavior of withExec, so I would just expect it to return the most informative error possible. If a command was executed and failed, the contents of its stderr seem like the most informative content possible, no?

spring wave
#

because those errors get displayed in the UI for span errors, so you end up with gigantic swathes of red text, poorly formatted because it's split Stdout/Stderr

#

and there isn't a good heuristic for cleaning it up

#

like when you run a command in go and it fails, you don't get an error with its entire stdout/stderr

smoky ocean
#

I wouldn't expect stdout to be there, just verbatim stderr, perhaps prefixed with "failed with exit code <n>: <stderr>"

spring wave
#

yep, still too noisy

#

trust me 🙂 there just isn't a great end down that path, i tried for a long time

smoky ocean
#

You mean for rendering? But isn't that just a rendering issue? "only render the first line of multi-line errors"?

spring wave
#

i tried that

#

and that breaks things that intentionally return multi-line errors

#

like linters

smoky ocean
#

Yeah I know you spent a lot of time on that rabbit hole... Just trying to replay the whole thing to understand the ramifications for agent dev specifically

#

I'm looking at your new LLM trace rendering and thinking - we're dealing with multi-line content already. You see a snippet, then you can manually click to expand. And it looks <chefs kiss> Couldn't we do the same for long errors?

#

Otherwise someone will have to special-case with-exec, forever. And also we'll have to tell everyone "never return multi-line errors, we don't like those"

spring wave
#

maybe for the web UI, but the TUI would still be uglified

#

i would rather try replaying telemetry data or something

#

we already persist it engine-side, just need to work out the interface

#

i'll think about it more, but it was definitely a frustrating thing to try and fix and workaround in all of our UIs

smoky ocean
#

Ok, I get the whole chain of reasoning, it's just going to be really tempting to just have a wrapper module with a MyExec that gets what I actually want which is stderr, and puts it in the error if the engine refuses to do it

#

Basically a human looking at the TUI has access to error information that my code (and by extension the LLM consuming it) can't

spring wave
#

right

#

theoretically that's also true of anything that prints logs that you don't directly access via stdout in the API

#

like if you do a container.withExec("echo hi").withExec("echo bye").stdout the agent will only see bye

#

it can't learn from anything that might have gone wrong earlier

smoky ocean
#

Yes, it's just that in the context of a command-line tool failing, the contents of stderr is not really logs, it's also an error message

#

I get that we're paying the price for the ambiguity of what stderr is for

spring wave
#

i guess my point is the medium for a human is the full logs/telemetry, and if we solve it that way we'll get more benefits; when something fails it might be more than just the failed command that's relevant to understanding why it failed, so in the same way that a human might go back and grok the full context, the agents should too

smoky ocean
#

There's a long tail of issues where just getting a few lines from stderr is all you need to get "oh right I forgot the go.mod" and keep moving
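
A rough sketch of that "few lines of stderr" idea, in plain Go. `summarizeFailure` is a hypothetical helper, not anything in the engine or SDK: it builds the "failed with exit code <n>: <stderr>" shape suggested earlier and keeps only the last few non-empty lines, on the assumption that the useful message tends to be near the end.

```go
package main

import (
	"fmt"
	"strings"
)

// summarizeFailure (hypothetical) formats an exit code and the tail of stderr
// into a single error string. Blank lines are dropped; if maxLines > 0, only
// the last maxLines remaining lines are kept.
func summarizeFailure(exitCode int, stderr string, maxLines int) string {
	var lines []string
	for _, l := range strings.Split(stderr, "\n") {
		if strings.TrimSpace(l) != "" {
			lines = append(lines, l)
		}
	}
	if maxLines > 0 && len(lines) > maxLines {
		lines = lines[len(lines)-maxLines:]
	}
	return fmt.Sprintf("failed with exit code %d: %s", exitCode, strings.Join(lines, "\n"))
}

func main() {
	stderr := "warning: cache miss\n\ngo: go.mod file not found in current directory or any parent directory\n"
	fmt.Println(summarizeFailure(1, stderr, 1))
}
```

This does not settle the rendering question above; it only bounds how much text a wrapper would hand to the model.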

#

Then for more complicated stuff, I love the idea of introspecting the telemetry. Love love it

#

@spring wave sorry to jump to a completely different topic (demo prep mode...)

It looks like at the moment, only a module's top-level type is available to attach to Llm, is that right? Or is there another issue with my code?

spring wave
#

ah you might be right

smoky ocean
#

Oooops sorry - it was prefixed by the module name doh

spring wave
#

oh cool

spring wave
smoky ocean
#

Side note: just ran into lack of self-calls. Need to make my demo more complicated by splitting it into 2 modules

smoky ocean
#

@spring wave is it an easy fix to print errors returned from function calls? example:

#

sorry for bombarding you

spring wave
#

should be fixed already

#

pushed a bit ago

smoky ocean
#

Got some ETOOBIGs

spring wave
#

side note: are you cranking up the verbosity? or maybe you have an old engine? those loadFooFromIDs shouldn't be showing up

#

(i think it's the latter)

spring wave
smoky ocean
#

Not sure. Let me re-run with your latest version and see

#

Quick notes:

  • Streaming is a game changer
  • Feels more verbose than I need it to be, I would love something slightly less verbose, but with lingering
  • Tool calls aren't clearly shown - but maybe that's because of the extra verbosity, ie. drowned in the noise a little
  • In view mode (esc), it doesn't auto-follow
#

(example of verbosity: lots of visible IDs)

smoky ocean
spring wave
#

too old!

smoky ocean
#

gah

void flint
#

Streaming may impact tool calling.

#

Some of the upstream models won’t run tool call when streaming is enabled

smoky ocean
#

Oh we don't support streaming to the models, just streaming their output tokens to the end user via otel

#

or is that what you mean by "support streaming"?

#

Also good to see you here @void flint !

spring wave
#

man, i had the cutest little demo of "use my apko module to install and run cowsay" and it worked first try, but now it doesn't work, for one of ~3 different reasons. one panic where the id arg wasn't provided somehow, one panic from OpenAI not returning any Choices - maybe API error. and now it installs the package, but then tries to call Apko.asContainer which just errors because it's blank. (instead of just chaining from Apko.wolfi(cowsay) which it just called)

#

or it just doesn't do the thing i asked...

spring wave
#

Figured out a trick so the tool calls reveal the function directly, so we don't need the 'fake Call' hack anymore: if you combine our passthrough and reveal attributes, that tells the UI: 1. reveal this span (the tool call), but then 2. instead of actually showing the span, show its children. Now you see this, instead of ContainerwithExec(id: "xxh3:...") in a non-chained form

#

@smoky ocean that'll need a new dev engine though 😅 feel free to skip if it's cutting too close

void flint
void flint
smoky ocean
#

Noted for ollama @void flint !

spring wave
#

@smoky ocean pushed some minor CLI-only fixes fyi (fix quitting too aggressively, improved spacing a bit)

smoky ocean
#

@spring wave any chance you could add 1) a mode that is less verbose while still lingering 2) showing emojis in that mode.

Feedback I'm getting: still not as clear and pretty as | history (in the specific context of a 7mn demo)

spring wave
#

will take a swing at it 👍 i'll try just having it only show revealed spans, which tracks 1:1 to history

shrewd ermine
smoky ocean
#

thank you! Also getting that feedback of "too much debug info on verbosity=2" from several people, they feel overwhelmed

spring wave
shrewd ermine
#

Which is why I risked the git pull lol

spring wave
#

haha k cool

spring wave
#

wow yeah that made a huge difference. sweet

#

@smoky ocean btw, just a heads up, Bubbletea can only paint within the visible screen region, so output will get cut off at the top.

  • To scroll up, hit <Esc> and then arrow keys/hjkl, but it works in a particular way (it's not a pager, it hops between spans).
  • We could bring back mouse events, if we're OK with saying just hold Shift to bypass it for copy/paste/whatever.
  • I could bring back mouse events only in navigation mode which could be a decent balance. (<Esc> => scroll)

Side note: the full non-cut-off output is printed on exit.

compact swan
smoky ocean
spring wave
#

some shell feedback - I find myself making this mistake a lot:

dagger on  llm [$!?] via 🐹 v1.23.2 took 1m56s
❯ dagger-dev shell -m github.com/shykes/melvin/demo
Dagger interactive shell. Type ".help" for more information. Press Ctrl+D to exit.

✔ .doc 0.0s
MODULE
  demo

ENTRYPOINT
  Usage: github.com/shykes/melvin/demo <token> <repo> <issue>

  REQUIRED ARGUMENTS
    token Secret
    repo string
    issue int

✘ . PAT https://github.com/vito/testctx 1 1m17s
! constructor: accepts at most 0 positional argument(s), received 3
│ ✔ load module . 1m17s

✔ github.com/shykes/melvin/demo PAT https://github.com/vito/testctx 1 0.4s
issue: 1
repo: https://github.com/vito/testctx

github.com/shykes/melvin/demo ⋈

I ran it from ~/src/dagger, but with github.com/shykes/melvin/demo as the current module, and I always expect . to be the current module, not the current working directory. So then it loads the ~/src/dagger module and runs its initializer, which expects completely different args, and fails without telling you which module failed. Takes a bit to figure out. (The solution is to use github.com/shykes/melvin/demo instead of . but that was the whole point of me doing -m, in my mind)

compact swan
spring wave
#

got rid of ETOOBIG 🫗

shrewd fern
smoky ocean
#

Welcome newcomers from the AI devtools meetup 🙂

Tomorrow I will update the repo to make it easy to replicate my demo.

gloomy kindle
#

it's pretty detached imo - but yes, we do need this 😄

#

the title maybe isn't 100% correct, "session" refers to the fact that we would load the config every session - it doesn't configure the client, it's instructions for how the client configures the engine (for the duration of its session)

#

but as i remember, you were suggesting that that config be merged into the engine config?

spring wave
#

pushed a change to just stop the CLI from auto-fetching tools, history, and lastReply

smoky ocean
spring wave
smoky ocean
#

I still like "print nothing"

#

I keep searching for a simple and intuitive way to communicate "a lot of things were built, they now exist in the cache"

#

I feel like once you know dagger 101, you just know that - so the information is redundant

#

We have the precedent of unix... command succeeds, if it has nothing further to say, it just prints nothing and exits

#

If we had a nice short ID, we could print that

smoky ocean
#

Ok let's start using threads 🙂 I worry that we're drowning newcomers in walls of implementation text

spring wave
#

🧵 how should the CLI print objects?

spring wave
#

Is it just me or is the behavior of OpenAI gpt-4o extremely volatile? Does it change due to load on their end or something?

#

Also, I'm trying out a new TUI cue where it renders a 'shadow' of a parent span after the last of its children so you don't need to scroll all the way up

smoky ocean
spring wave
#

Thought that might be related, looks like the default is 0 but I have no idea what the 'log probability' bit means.

The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
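
For intuition on what temperature does, here is the textbook softmax-with-temperature calculation in Go. This is the general technique, not a claim about how any provider implements sampling, and it says nothing about the automatic "log probability" behavior at 0. The function name and the example scores are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// softmaxWithTemperature turns raw scores (logits) into probabilities.
// Dividing by a small temperature sharpens the distribution (more
// deterministic); a larger temperature flattens it (more random).
// temperature must be > 0 in this sketch.
func softmaxWithTemperature(logits []float64, temperature float64) []float64 {
	// Subtract the largest logit first for numerical stability.
	maxLogit := math.Inf(-1)
	for _, l := range logits {
		if l > maxLogit {
			maxLogit = l
		}
	}
	probs := make([]float64, len(logits))
	sum := 0.0
	for i, l := range logits {
		probs[i] = math.Exp((l - maxLogit) / temperature)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

func main() {
	logits := []float64{2, 1, 0} // hypothetical scores for three candidate tokens
	fmt.Println("T=0.2:", softmaxWithTemperature(logits, 0.2))
	fmt.Println("T=0.8:", softmaxWithTemperature(logits, 0.8))
}
```

At the lower temperature nearly all of the probability lands on the top-scoring token, which is why low values feel more repeatable.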

smoky ocean
#

HELLO NEWCOMERS

If you are here because of the Civo Navigate demo, or the AI devtools meetup demo, I will push an update to the melvin repo tonight, which makes it easier to get started, and notify you here.

In the meantime, for those who haven't already done so, please say hello and tell us what are your thoughts on the demo, and what use case you have in mind 🙂

smoky ocean
#

@spring wave llm-shell bug report: ctrl-D triggers an immediate exit at any time, instead of only on an empty line edit. When the line is not empty, it's a common shortcut for "delete character to the right of the cursor"
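
The expected behavior can be written down as a small pure function. This is only a sketch of the readline-style convention described in the report; `onCtrlD` and the `action` type are hypothetical names, and the real shell's line editor is not shown here.

```go
package main

import "fmt"

type action int

const (
	actionExit          action = iota // empty line: leave the shell
	actionDeleteForward               // non-empty line: delete the rune under the cursor
	actionNone                        // cursor at end of line: nothing to delete
)

// onCtrlD (hypothetical) decides what Ctrl-D should do for a given line
// buffer and cursor position, and returns the updated buffer.
func onCtrlD(line []rune, cursor int) ([]rune, action) {
	if len(line) == 0 {
		return line, actionExit
	}
	if cursor < 0 || cursor >= len(line) {
		return line, actionNone
	}
	out := make([]rune, 0, len(line)-1)
	out = append(out, line[:cursor]...)
	out = append(out, line[cursor+1:]...)
	return out, actionDeleteForward
}

func main() {
	out, act := onCtrlD([]rune("helllo"), 3)
	fmt.Println(string(out), act == actionDeleteForward)
}
```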

smoky ocean
#

👋 @here ok folks, as promised I cleaned up the Melvin repo, and updated it with everything you need to replicate yesterday's demo ("containerize your AI agents!"), and get started playing with Melvin, and the experimental LLM branch of Dagger that it's built on.

https://github.com/shykes/melvin

Please let me know if you run into any issues getting started!


spring wave
smoky ocean
#

@spring wave I think I just saw a bug report from you? Can't find where... Anyway I fixed it, prompt.txt is now pushed 🙂

smoky ocean
#

Heads up, I'm making a small API adjustment that was frozen pre-demo.

--> withPrompt will be lazy
--> loop as an explicit call to send context and process replies & tool calls; to make it very clear when the loop actually happens
--> remove ask sugar to keep it simple
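
To make the lazy-vs-explicit distinction concrete, here is a toy model of it in plain Go. Nothing here is the real Llm API: `Agent`, `WithPrompt`, `Loop`, and the fake `send` function are all hypothetical stand-ins. The only thing the sketch shows is that queuing a prompt does no work, and the model is called only when the loop is run.

```go
package main

import "fmt"

// Agent is a toy, hypothetical stand-in for an LLM handle.
type Agent struct {
	pending []string                   // prompts queued but not yet sent
	history []string                   // replies received so far
	send    func(prompt string) string // fake model call
	calls   int                        // how many times send was invoked
}

// WithPrompt is lazy: it only queues the prompt and returns a new value.
func (a Agent) WithPrompt(p string) Agent {
	a.pending = append(append([]string{}, a.pending...), p)
	return a
}

// Loop is the explicit step where queued prompts are actually sent.
func (a Agent) Loop() Agent {
	a.history = append([]string{}, a.history...)
	for _, p := range a.pending {
		a.history = append(a.history, a.send(p))
		a.calls++
	}
	a.pending = nil
	return a
}

func main() {
	agent := Agent{send: func(p string) string { return "re: " + p }}
	queued := agent.WithPrompt("read the issue").WithPrompt("write a fix")
	fmt.Println("calls before loop:", queued.calls)
	done := queued.Loop()
	fmt.Println("calls after loop:", done.calls, done.history)
}
```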

shrewd ermine
smoky ocean
#

So I would say 50/50 🙂

#

Also. I think I'm going to remove LLM_MODEL from the .env global config. Just to make things simpler, and help us focus on making that an application concern

#

I think the next big features we need to add are:

  1. Model selection in the API (including the routing logic in the backend, so operators can decide where to route which model). Clean separation of concerns between app and infra.

  2. Multi-object BBI. Allowing one LLM to juggle any number of Dagger objects will unblock even more powerful composition scenarios. There are workarounds today, but they lean on some of our DX weak spots: wrapper types, interfaces, better to not force devs to use that on their first day...

  3. Polish the tracing experience. Still too verbose out of the box in the terminal (per user feedback, "I am overwhelmed")

  4. Generated clients.

  5. Object persistence

#

Bug report @spring wave : new shell works wrapped in a dagger interactive terminal (github.com/shykes/dagger@llm | dev | terminal --cmd=dagger,shell) but just barely.

smoky ocean
spring wave
#

kind of interesting how the shell doesn't block input while you're running the command now. you can just keep spamming and backgrounding things

#

woops it crashed (concurrent map writes)

spring wave
#

@smoky ocean two options (re: this crash, caused by running a bunch of stuff in parallel)

  • decide we don't need that, and disable input while something is running in the foreground, like before (except foo & - that'd be fine)
  • decide it's cool, and run each command in a subshell and backport any changes it makes to env back to the outer shell when it completes (last write wins, will have to keep track of changes manually)

edit: I'm just gonna wrap a mutex around it and call it a day - you can still type, but it'll queue up commands instead
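
A minimal sketch of the mutex approach, using only the standard library. `shellEnv` is a hypothetical stand-in for the shell's shared state, not the actual shell code. Without the lock, concurrent writes to the map can crash with "concurrent map writes"; with it, commands started in parallel take turns.

```go
package main

import (
	"fmt"
	"sync"
)

// shellEnv is a hypothetical stand-in for state shared by shell commands.
type shellEnv struct {
	mu   sync.Mutex
	vars map[string]string
}

// set records one command's change. Holding the mutex serializes writers,
// so the map is never written from two goroutines at once.
func (e *shellEnv) set(name, value string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.vars[name] = value
}

// runConcurrently starts n commands at once and returns how many
// variables ended up in the environment.
func runConcurrently(n int) int {
	env := &shellEnv{vars: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			env.set(fmt.Sprintf("VAR_%d", i), "set")
		}(i)
	}
	wg.Wait()
	return len(env.vars)
}

func main() {
	fmt.Println(runConcurrently(100))
}
```

Note that a plain `sync.Mutex` does not promise first-come-first-served ordering, so this serializes commands without guaranteeing the order they were typed in.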

merry scarab
#

Getting Started with shykes/melvin

smoky ocean
smoky ocean
harsh stag
#

I can't find a sitemap.xml on the dagger.io website

#

I want to train a model on the docs

#

a sitemap makes it easier to crawl

bronze fern
harsh stag
#

thanks @bronze fern

shrewd ermine
#

I remade my "marvin" demo in my own new agents repo https://github.com/kpenfound/agents/blob/main/go-coder/main.go#L37

More agents to come 🙂
But first I want to add go-programmer iterate <PR> <feedback>

Currently it can
solve a go coding assignment and give you a terminal with the solution: assignment "make a curl clone" | terminal

read an assignment from a github issue and push a PR with the solution: solve-issue GITHUB_TOKEN https://github.com/kpenfound/greetings-api 32

smoky ocean
devout magnet
#

Hey Dagger community! & @smoky ocean

I've been following the interesting conversations around AI, LLMs, and AI agents in this channel. My team has been actively building AI agents using LangGraph for the past six months, and we're now looking ahead to plan for some upcoming projects. This makes understanding Dagger's future direction in this space really important for our decision-making.

So, I'm curious about a couple of things:

  1. Does Dagger plan to offer any built-in framework-like functionalities to help with building AI agents, similar to something like LangGraph? Or is the focus more on providing a foundational base that developers can use to build whatever they need on top of existing AI/LLM tools and frameworks? Knowing this will help us decide whether to continue investing in LangGraph or explore Dagger's native capabilities.

  2. How does Dagger envision its integration with things like shell reload and distributed builds? For some of our projects, we face significant compute resource constraints and also need to minimize commute time for our developers. We're exploring how Dagger can help us manage these challenges, and understanding how these features might work together would be incredibly helpful.

Thanks in advance for any insights!

smoky ocean
# devout magnet Hey Dagger community! & <@488409085998530571> I've been following the interest...

Hi Bhaumin!

This is very new so we're still figuring it out...

It seems that Dagger can provide a great runtime for agents, that allows "containerizing" both the software tools, and the llm context calling the tools. It's not an AI framework like Langchain, but provides a new primitive that could drastically simplify the architecture of your agents, and therefore change how you use the framework.

I think in the same way that Dagger integrates with your CI platform while radically changing how you use it; Dagger will integrate with your agentic framework while radically changing how you use it.

devout magnet
# smoky ocean Hi Bhaumin! This is very new so we're still figuring it out... It seems that D...

I shared details with our team for this agent direction, and they're excited about its potential. We're anticipating something big. We'd love to try it out soon and are hoping for experimental support in the next Dagger release so we can provide feedback.

The Dagger ecosystem is a huge plus. We appreciate that it offers built-in observability and the ability to orchestrate workflows based on a single cache, which can be used in multiple parallel workflows. There's so much potential here. Great Work, Keep it up.

Thanks for your reply.

fierce citrus
#

Hello! I'm seconding this! I just saw your latest demo and was blown away by the potential of using dagger and containerized agents. Adding my reaction

  • Wrapping MCP servers? Possibly from https://hub.docker.com/u/mcp
  • Adding a break/token limit as a parameter to prevent the agent from going into an infinite loop?
    Thanks, and excited to toy with it.
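
One way the break/token limit idea could look, sketched in plain Go. This is not how the llm branch works; `runWithLimits`, the `step` type, and both error values are hypothetical. The sketch just shows a loop that stops on whichever comes first: the agent says it is done, the step cap is hit, or the token budget is exceeded.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrBudgetExceeded = errors.New("token budget exceeded")
	ErrTooManySteps   = errors.New("step limit reached")
)

// step is a hypothetical single agent turn: it reports how many tokens
// the turn used and whether the agent considers itself finished.
type step func(i int) (tokens int, done bool)

// runWithLimits drives the agent loop under a step cap and a token budget.
func runWithLimits(s step, maxSteps, maxTokens int) (stepsRun, tokensUsed int, err error) {
	for i := 0; i < maxSteps; i++ {
		tokens, done := s(i)
		stepsRun++
		tokensUsed += tokens
		if tokensUsed > maxTokens {
			return stepsRun, tokensUsed, ErrBudgetExceeded
		}
		if done {
			return stepsRun, tokensUsed, nil
		}
	}
	return stepsRun, tokensUsed, ErrTooManySteps
}

func main() {
	// A fake agent that never finishes and spends 100 tokens per turn.
	stuck := func(i int) (int, bool) { return 100, false }
	steps, tokens, err := runWithLimits(stuck, 10, 250)
	fmt.Println(steps, tokens, err)
}
```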
proper stratus
#

Could I use Claude with this agent or just openAI?

smoky ocean
#

Also local models - llama3.2, qwen, deepseek...

smoky ocean
smoky ocean
proper stratus
smoky ocean
river belfry
#

I think I'm missing something obvious but I'm not seeing what. I'm trying to follow https://github.com/shykes/melvin readme but constantly hit this issue:

✔ toyProgrammer: ToyProgrammer! 0.0s
✘ .goProgram(assignment: "develop a curl clone"): Container! 1.8s
! select: failed to read secret file ".env": open .env: no such file or directory

But I have a .env file containing a value for LLM_KEY. The file is in the current directory from where I'm running dagger-llm shell. Is there something else to do? Like a way to mount/share the .env file?

smoky ocean
#

Will add it back @river belfry . Basically, you need a .env file in your current directory, with LLM_KEY set to your openai token. It can be the plaintext value, or a dagger-style secret reference, for example op://..., vault://..., env://..., cmd://...

#

For example here is mine:

$ cat .env
LLM_KEY=op://Dev/sdhfkasdhaskdahskd/credential
$ 
#

This is temporary until we connect this to Dagger's builtin config system

cc @gloomy kindle 👉👈 🥹

river belfry
#

🤔
I have a .env in my current directory

$ cat .env
LLM_KEY="sk-pr....."

(it's an OpenAI key)
And I still got the error message

smoky ocean
#

Ah..

#

Ah you're hitting a small bug - you need to call llm from the CLI first

#

Then your modules will be able to call it

#

It's caused by my temporary hack to get LLM configuration into the engine

#

Starting a thread about adding MCP support / cc @worn hill @spring wave 🧵

river belfry
river belfry
smoky ocean
#

@river belfry it might be the " in the plaintext, I don't know if .env format supports quotes in that way
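
On the quotes question: there is no single .env standard, and parsers differ. Below is a small sketch of a tolerant parser that strips one pair of matching quotes around a value. `parseDotEnv` is a hypothetical helper written for this example; it makes no claim about how the llm branch actually reads the file.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDotEnv (hypothetical) reads KEY=VALUE lines, skipping blanks and
// # comments, and strips one pair of matching single or double quotes.
func parseDotEnv(content string) map[string]string {
	vars := map[string]string{}
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		key = strings.TrimSpace(key)
		value = strings.TrimSpace(value)
		if len(value) >= 2 {
			first, last := value[0], value[len(value)-1]
			if first == last && (first == '"' || first == '\'') {
				value = value[1 : len(value)-1]
			}
		}
		vars[key] = value
	}
	return vars
}

func main() {
	env := parseDotEnv("# local config\nLLM_KEY=\"sk-example\"\n")
	fmt.Println(env["LLM_KEY"])
}
```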

river belfry
#

I tried to remove them with no success

spring wave
#

fwiw I set LLM_KEY with a raw value and it works

smoky ocean
#

@river belfry maybe it was actually the "bug 7" and your plaintext was fine?

smoky ocean
#

⚠️ I pushed a fixed & improved README. It includes a quickstart option with a setup.sh script. Thank you @river belfry for testing the onboarding flow and catching the gaps!

smoky ocean
#

Quick recap of missing features:

  0. Binary release: we need a binary release of the llm branch, to make it easier to install.
  1. MCP frontend (discussion #1341123420246773882 ). Any Dagger module instantly becomes a MCP server.
  2. Clean LLM configuration (not requiring a .env in the local dir, not having to manually run llm from the CLI)
  3. More models. Clearly we need Claude; as well as llama3, qwen, deepseek, Gemini... The more the better! This includes the ability to choose the model from the code.
  4. Token budgets. That includes measuring token cost and enforcing token limit
  5. Less verbose TUI. Consistent feedback that the output is too overwhelming by default
  6. Multi-object. Being able to give each LLM more than one Dagger object, would make it easy to compose modules and build more powerful agents in less lines of code
  7. Generated clients, to integrate Dagger modules in your existing application #1334452944740814931
  8. Finish the shell. All our demos rely on it, but it's not documented and not stabilized.
  9. MCP backend. Less straightforward, but very doable. Would be great to be able to wrap any existing MCP server in a Dagger module and boom, use it.
  10. Object persistence. When calling Dagger modules from an async, distributed workflow platform like Temporal or equivalent - you'll need a way to save object state in-between events. That is already a priority in the engine roadmap.
  11. Enable host API inside functions. #1339366755134472282 message
smoky ocean
spring wave
#

Is 5) still the case with the latest llm branch? Should be much quieter than the original demo now

smoky ocean
#

I know that's always a delicate balance. Maybe the right answer is "you'll get used to it"

smoky ocean
#

FYI @jovial grail this is where we're hacking on that agent demo 🙂

smoky ocean
#

@shrewd ermine you mentioned returning a service endpoint - is there a missing feature in the engine llm branch to enable that? We can add it to the list 👆

shrewd ermine
smoky ocean
#

Oh right!

shrewd ermine
#

The idea is it'll let me run a GitHub webhook service wired into any agent I want. So not directly connected to the LLM feature but it helps

smoky ocean
#

Added 🙂

shrewd ermine
#

On 3) more models - what's the blocker? Documentation? I think I've tested like 15 different models on ollama with my code

smoky ocean
#

I'm going to start small, and move model selection into the llm() call. Initially it will only work with openai models, but at least will shift us towards the DX we want

#

then I'll try to add the "routing layer" that we talked about

#

(ie. based on requested model name, configure the right endpoint, taking existing standard config when it exists, and allowing augmenting it with dagger-specific config)
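
A guess at what such a routing layer might look like, as a plain Go sketch. None of this is the actual engine code: `routeModel`, the `endpoint` type, the name-prefix rules, and the default URLs are all assumptions made for illustration. The shape is: pick a provider from the requested model name, start from a default endpoint, and let a standard environment variable override it when set.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// endpoint is a hypothetical description of where to send requests.
type endpoint struct {
	Provider string
	BaseURL  string
}

// routeModel (hypothetical) maps a model name to an endpoint. The prefix
// rules and default URLs are assumptions for this sketch.
func routeModel(model string, getenv func(string) string) endpoint {
	provider, envVar, defaultURL := "other", "", ""
	switch {
	case strings.HasPrefix(model, "gpt-"), strings.HasPrefix(model, "o1"):
		provider, envVar, defaultURL = "openai", "OPENAI_BASE_URL", "https://api.openai.com/v1"
	case strings.HasPrefix(model, "claude-"):
		provider, envVar, defaultURL = "anthropic", "ANTHROPIC_BASE_URL", "https://api.anthropic.com"
	}
	ep := endpoint{Provider: provider, BaseURL: defaultURL}
	if envVar != "" {
		if v := getenv(envVar); v != "" {
			ep.BaseURL = v // existing standard config wins over the default
		}
	}
	return ep
}

func main() {
	fmt.Printf("%+v\n", routeModel("gpt-3.5-turbo", os.Getenv))
	fmt.Printf("%+v\n", routeModel("llama3.2", os.Getenv))
}
```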

shrewd ermine
#

Can we add a bunch of statically defined models temporarily like we had in the early demos?

smoky ocean
#

sure - I can do that first and deal with the "shared global config" issue later

#

More models

#

⚠️ pushed: first part of model selection API. You can now call eg llm(model: "gpt-3.5-turbo") or llm(model: "o1"). Still OpenAI only, but that will change soon

LLM_MODEL in .env is no longer supported. You can only configure the model from your code

shrewd ermine
#

I noticed the setup in melvin is a few commits behind shykes/dagger@llm. Should we get a pre-release branch+build on dagger/dagger similar to the cloak beta?

smoky ocean
#

Oh, I keep forgetting to update the dependency

#

I figured, better to have to manually update, but at least stay in control of the user experience

#

I think a pre-release build would be a good thing to have, soon

shrewd ermine
#

yeah sounds good. I have the dagger@llm branch locally and I'm running dagger shell -c 'engine | service llm | up'

smoky ocean
#

Debugging loop of the engine is SO SLOW

shrewd ermine
#

missing a field in the config parsing? /app/core/llm.go:485 /app/core/llm.go:662

smoky ocean
#

Yeah maybe - trying to get very very basic debug information out...

shrewd ermine
#

seems like a lot of stuff isn't cached on engine builds. Or at least the work being done after the changed source is mounted takes a long time

smoky ocean
#

It looks like my convention of splitting LLM_HOST from LLM_PATH goes against the grain of basically every API client... they all rely on mixing host & path into a single "base url". So I'm finding myself splitting & merging it along the way, for basically no gain
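
For reference, this is roughly what the single "base url" convention looks like with Go's `net/url`. `endpointURL` is a hypothetical helper, and the `chat/completions` path is just an example route. The sketch also rejects values with no scheme, since `url.Parse` treats a bare `host/path` string as a path rather than a host.

```go
package main

import (
	"fmt"
	"net/url"
)

// endpointURL (hypothetical) appends path elements to a single base URL
// that already carries scheme, host, and any path prefix.
func endpointURL(baseURL string, elem ...string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("base URL %q needs a scheme and host, e.g. https://host/v1", baseURL)
	}
	return u.JoinPath(elem...).String(), nil
}

func main() {
	full, err := endpointURL("https://api.relax.ai/v1", "chat", "completions")
	fmt.Println(full, err)

	_, err = endpointURL("api.relax.ai/v1", "chat", "completions")
	fmt.Println(err)
}
```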

shrewd ermine
#

Yeah I vote merge it. Scheme too.

smoky ocean
#

on it

smoky ocean
#

⚠️ pushed: second part of model selection.

We now honor standard env variables:

  • OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL
  • ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL, ANTHROPIC_MODEL

Note:

  • Variables may be overridden with a .env file in the current directory.
  • Variables for secret values may either contain the secret plaintext (basic convention) or a reference to the secret in 1password (op://...), hashicorp vault (vault://...), or a file (file://...)
  • all API endpoints are still queried by the standard OpenAI client library. This will break Anthropic endpoints, and will be fixed in a follow-up commit.
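
A small sketch of the two rules in the notes above: .env values override the process environment, and a secret value is either plaintext or a reference with a known scheme. `lookup` and `secretKind` are hypothetical helpers written for this example, not the engine's code, and the scheme list only covers the three named in this message.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lookup (hypothetical) returns the .env value when the key is present
// there, and falls back to the process environment otherwise.
func lookup(name string, dotenv map[string]string, getenv func(string) string) string {
	if v, ok := dotenv[name]; ok {
		return v
	}
	return getenv(name)
}

// secretKind (hypothetical) reports whether a value looks like a secret
// reference (op://, vault://, file://) or should be treated as plaintext.
func secretKind(value string) string {
	for _, scheme := range []string{"op", "vault", "file"} {
		if strings.HasPrefix(value, scheme+"://") {
			return scheme
		}
	}
	return "plaintext"
}

func main() {
	dotenv := map[string]string{"OPENAI_API_KEY": "op://Dev/example-item/credential"}
	key := lookup("OPENAI_API_KEY", dotenv, os.Getenv)
	fmt.Println(secretKind(key))
}
```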
#

Tomorrow:

  • I'll finish adding Anthropic support (will pull in Sam's initial PR with anthropic client, and test it all)
  • @shrewd ermine please let me know if it all works with local models
  • I'll update melvin to the latest version, and update the README
smoky ocean
#

I take it back @shrewd ermine . Watching @dim spruce setup melvin in real time: yes we should absolutely start shipping a pre-release binary of llm right away, just like cloak...

#

Melvin install party underway with @dim spruce 😁

dim spruce
#

👋

smoky ocean
# smoky ocean Quick recap of missing features: 0. **Binary release** we need a binary release...
  0. Binary release: we need a binary release of the llm branch, to make it easier to install.

@split shard could you help with this when it's your morning? 🙏 It's the most painful blocker right now - installing a dev version of the engine, from source, is just not a lot of fun. A binary release, like we did for cloak, would be 10x better.

  1. MCP frontend (discussion #1341123420246773882 ). Any Dagger module instantly becomes a MCP server.

@worn hill want to take it?

  2. Clean LLM configuration (not requiring a .env in the local dir, not having to manually run llm from the CLI)

@gloomy kindle we're due to sync tomorrow morning-pacific/evening-uk. @compact swan not sure if you'll be around, but welcome to join if so!

  3. More models. Clearly we need Claude; as well as llama3, qwen, deepseek, Gemini... The more the better! This includes the ability to choose the model from the code.

@shrewd ermine all ready for you to test with local models.
Anthropic is half-finished, but I can finish it tomorrow.

  4. Token budgets. That includes measuring token cost and enforcing token limit
  5. Less verbose TUI. Consistent feedback that the output is too overwhelming by default
  6. Multi-object. Being able to give each LLM more than one Dagger object, would make it easy to compose modules and build more powerful agents in less lines of code

@spring wave take your pick 🙂 Let me know what you choose, so I know what can be allocated to others (or if you find something else)

  7. Generated clients, to integrate Dagger modules in your existing application #1334452944740814931

The rebel alliance is counting on you @hidden tartan 🙂

  8. Finish the shell. All our demos rely on it, but it's not documented and not stabilized.

I already know @shrewd fern is on it, just wanted to highlight that it's very useful for the LLM work

  9. MCP backend. Less straightforward, but very doable. Would be great to be able to wrap any existing MCP server in a Dagger module and boom, use it.

@hidden tartan @shrewd fern maybe we can talk about this in the context of the Magic SDK discussion we were planning on having anyway.

  10. Object persistence. When calling Dagger modules from an async, distributed workflow platform like Temporal or equivalent - you'll need a way to save object state in-between events. That is already a priority in the engine roadmap.

I know @steep onyx is on it already.

  11. Enable host API inside functions. #1339366755134472282 message

This one is up for grabs.

#

On top of this 👆 there's the equally important work on examples, and just building cool modules on top of the llm branch, which is the whole point 🙂

gloomy kindle
#

r.e. binary release. this is a touch tricky, we've changed a lot in our release process, but should be doable (especially if we're not cutting external sdk releases as well).
but, instead, i think it's worth (even if we do ship a binary soon) seeing how soon we can merge the llm pr, and have an experimental feature in main - that also reduces us needing to multi-task cross-cutting features like shell improvements in multiple places, and needing to keep rebasing. I'll look through the llm pr, and see if i spot any major blockers.

#

@spring wave notice there's a lot of https://github.com/dagger/dagger/pull/9327 in the llm branch. imo, the api seems entirely reasonable, and it feels like we've kind of decided on the design now that it's in the llm branch? is there still bikeshedding to do, or any blockers to merging that soon?

#

similarly, with the shell changes

#

there's a few code cleanups i think we'd want to make before merging - but don't know whether it's premature:

  • bbi doesn't seem to depend on remotefn anymore? seems to be deadcode now.
  • there's a few TODOs around the dagql server middlewares, which we'd probably want to fix (and add tests for, just so we're not accidentally breaking anything else)

honestly, aside from the above (which i'm happy to pick up), i'd be very happy to land this very soon (assuming we're on the same page with the new Span API and the shell tui changes). without the middlewares/etc, the scope of the changes are really not particularly sprawling, if something goes wrong, i'm not too worried about it breaking the rest of the engine.

gloomy kindle
#

(pushed a couple things to try and clean up ci on that branch, trying to get an idea of what needs doing to get it passing)

void flint
#

👋 I'm just looking to get this setup and working with our own API compatible endpoint (https://api.relax.ai/v1) and I can't seem to get the env vars to load from .env correctly

I've followed the setup.sh from github.com/shykes/melvin and I think I've got the correct engine / binaries running

I have the following in .env:

OPENAI_BASE_URL=api.relax.ai
OPENAI_MODEL=llama-3-70b
OPENAI_API_KEY=<PLAINTEXT>

however these don't seem to be being loaded when run

❯ ~/bin/dagger-llm shell -c "llm | model"      
✔ connect 0.0s
✔ looking for module 0.2s
✔ loading type definitions 0.2s

✘ llm: Llm! 0.0s
! No valid LLM endpoint configuration
│ ∅ model router: []->[&core.LlmEndpoint{Model:"", BaseURL:"", Key:"", Provider:"other"}] 0.0s

I'm sure this is user error ... does anyone have any ideas of where I can start looking?

I've also tried:

OPENAI_BASE_URL=api.relax.ai/v1
OPENAI_BASE_URL=https://api.relax.ai/v1
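
One way to narrow this down is to check what a plain KEY=VALUE parser would actually read from the .env file. The sketch below is a debugging aid only, not how Dagger itself loads .env; the `parseDotenv` and `missingKeys` helpers are hypothetical, and the expected keys are just the three variables listed above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDotenv reads KEY=VALUE lines, skipping blank lines and # comments.
// Hypothetical debugging helper; not Dagger's own .env loader.
func parseDotenv(content string) map[string]string {
	vars := map[string]string{}
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		// Strip surrounding whitespace and optional quotes from the value.
		vars[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(value), `"'`)
	}
	return vars
}

// missingKeys reports which expected variables are absent or empty.
func missingKeys(vars map[string]string, expected []string) []string {
	var missing []string
	for _, k := range expected {
		if vars[k] == "" {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	// Sample content with the API key left out on purpose.
	content := "OPENAI_BASE_URL=https://api.relax.ai/v1\nOPENAI_MODEL=llama-3-70b\n"
	vars := parseDotenv(content)
	expected := []string{"OPENAI_BASE_URL", "OPENAI_MODEL", "OPENAI_API_KEY"}
	fmt.Println("missing:", missingKeys(vars, expected))
}
```

If all three keys parse correctly here, the file contents are probably fine and the question becomes whether the process reading them is running in the directory that holds the .env file.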

spring wave
# gloomy kindle <@108011715077091328> notice there's a lot of https://github.com/dagger/dagger/p...

Yeah - it was merged in to make web UI dev easier since it depended on the frontend-api branch + llm. I backported all the changes out from llm and back into that PR over the weekend, but I'm going to try and divide it up even further; technically llm only needs the internal plumbing (new attrs) and TUI changes, it doesn't need the API and SDK parts, which we don't want to rush. All the shell changes should also probably be a separate PR.

smoky ocean
#

@bronze fern very interested in that typescript video you were about to post! 👀👀

#

LLM branch review & pre-release binary

compact swan
#

Hi Solomon, please let me know when you chat with Justin. if I am awake, I would like to join that call.

smoky ocean
#

hoping to be ready in 10mn

nova bronze
#

Is the LLM logic able to pull in secrets from the environment to call functions?

Trying to use Gerhard's notify module for sending some messages and it expects a webhookURL argument that is a secret

smoky ocean
#

Looks like maybe that URL argument shouldn't be a secret?

nova bronze
#

it definitely should. It's a sensitive argument, anybody that can read it is able to post messages

gloomy kindle
smoky ocean
#

@gloomy kindle @compact swan joining dev audio

shrewd ermine
#

Is the LLM logic able to pull in secrets

smoky ocean
shrewd ermine
#

using the dagger-in-dagger method for the engine and it is having a rough time on the latest commit

smoky ocean
#

@shrewd ermine yet another instance of "you were right", I'm going to drop the mandatory loop() and make llm follow the standard "sync-able" system. Some calls will be lazy, others will force sync().

shrewd ermine
#

sounds good! guessing that means with-prompt is lazy but sync or history will execute?

shrewd ermine
spring wave
#

@smoky ocean shall I make that change to print TypeName@xxh3:... instead of object fields, as part of splitting out TUI changes?

smoky ocean
nova bronze
#

Is supplying multiple modules in a single Llm() supported?

smoky ocean
smoky ocean
#

Starting a thread for "clean LLM configuration" fix

shrewd ermine
smoky ocean
#

@shrewd ermine so the model selection is passed to the openai client lib within the engine implementation, and it magically works? (same endpoint for both models in your case)

shrewd ermine
smoky ocean
shrewd ermine
# smoky ocean oooh, like instrumenting ollama to do it?

yeah it just depends where we want to sit in the stack. I don't know how those responsibilities get divided in different orgs. If we're provider-aware for ollama we'd have the ability to list models and pull models. But is that something someone would expect dagger to do or something you'd expect of your infra team

#

maybe something best left to the infra until it's better defined

smoky ocean
shrewd ermine
spring wave
shrewd ermine
#

with a minor modification, I present: gemini

river belfry
#

🤔 I don't know what I'm doing wrong, but I can't get the melvin stuff working anymore.

  • with the latest commit from shykes/melvin
  • I'm building dagger-llm with no problem
  • the engine is up and running
  • inside dagger-llm shell I'm loading llm first with success model: gpt-4o(openai)
  • when running ./toy-programmer | go-program "develop a curl clone" | terminal everything looks ok, but nothing is created inside the container
dagger /app $ ls -a
.  ..
  • if I'm trying the default llm | with-prompt "llm, are you there?" | last-reply I just got a (no reply)
  • my openai key is stored in 1password, and accessed (as it asked me to unlock the vault)
    If anyone has an idea of what I'm doing wrong 🙏
shrewd ermine
shrewd ermine
#

toy-programmer is also missing loop, although @smoky ocean how's the progress on changing that behavior?

river belfry
#

yeah, I just added .Loop() and that works
So now I have something I know is working I can continue to try to translate everything in Java 🙂

merry scarab
#

I was still not able to get things to work but it was due to some weird networking issues. I restarted my Mac/Docker and things seem to be working finally. 😄

merry scarab
river belfry
#

I was able to translate toy-workspace and toy-programmer into Java. It's using a specific branch for dagger (basically the shykes/llm branch with the constructor support from https://github.com/dagger/dagger/pull/9523 - available at lgtdio/llvm-java)
-> There's no real benefit here except writing in Java and catching bugs in the Java SDK (already found and fixed 3 of them)
I updated the workspace part and the prompt, I'm now able to generate java/maven code. The prompt is more complex than the Go one, to ensure the generated code will work (and I haven't fixed everything). But that's a start.
At least it's fun to do 🙂
./nono-programmer | java-program "develop a clone of curl" | terminal --cmd=bash

GitHub

Note: To be merged after #9605

Allow defining a constructor on classes for a module object.

a default, empty, constructor calling super is required
only the first non empty constructor will be r...

merry scarab
smoky ocean
#

@river belfry that reminds me, would you mind opening a PR to the melvin repo with your Java example?

river belfry
merry scarab
river belfry
smoky ocean
#

@shrewd ermine what open models do you have working so far? And what's the coolest way we could share that online? I want to send one tweet per "headline" improvement 🙂

#

maybe a cool little gif of each? If we can make it catchy somehow

merry scarab
#

He has all the models

shrewd ermine
merry scarab
#

I tried using deepseek-r1:70b on Kyle's infra but it's not working for me at the moment.

✔ llm  0.0s
model: deepseek-r1:70b(other)

✘ llm | with-prompt "what model are you using" | loop | last-reply 34.8s
! Post "http://dagger/query": context canceled

● llm | with-prompt "what model are you using" | loop | last-reply 1m11s
smoky ocean
shrewd ermine
merry scarab
# shrewd ermine see if your engine died. It happens sometimes with dagger-in-dagger engines

It seems to be alive because I was running with setup.sh shortcut, but idk - is it working on your end?

I switched back to qwen and it worked right away

✔ llm  0.0s
model: qwen2.5-coder:32b(other)

✔ llm | with-prompt "what model are you using" | loop | last-reply 8.9s
I am based on the Qwen large language model created by Alibaba Cloud. How can I assist you today?
│🧑 what model are you using
│
│🤖 I am based on the Qwen large language model created by Alibaba Cloud. How can I assist you
│ ┃ today?
shrewd ermine
#

interesting, maybe it was having trouble loading the deepseek 70b model with multiple users 😂 that model does push the limits of that GPU

shrewd ermine
#

I've had that model working btw but only when I was the only user. And it's really bad at tool calls because it's so chatty

shrewd ermine
river belfry
#

Here is a draft PR that adds a ./toy-programmer | java-program to Melvin
This uses a specific maven/java workspace (created with the Java SDK)
And uses a specific llm-java branch the time we merge the constructor support (the branch has been refreshed a few minutes ago)
https://github.com/shykes/melvin/pull/4

GitHub

Warning: this is experimental and requires a specific dagger branch

This adds a new workspace, nono-workspace, dedicated for maven based projects.
This dagger module is written using the Java SDK. S...

shrewd ermine
#

@smoky ocean my 2 changes required for gemini (so far based on ongoing testing)

  • disable the automatic model routing so that it falls back to the openai client
  • comment out seed from the openai request. It's not supported in google's openai "compatible" api. How necessary is it for actual openai models?
smoky ocean
shrewd ermine
wraith remnant
#

Sam abstracted a bit and made the implementation per provider

smoky ocean
#

@wraith remnant but does it default to openai still? I guess so

wraith remnant
#

So, upon the call on the SendQuery interface, per provider, it will catch it

#

yes yes

smoky ocean
#

It looks like @shrewd ermine confirms that we can fallback to openai provider for gemini, if we make the 2 small changes above

#

(or you're saying we can have a gemini-specific provider that just calls the openai client in a slightly different way?)

shrewd ermine
smoky ocean
#

anyway - as long as you two are talking I'm happy 😉

wraith remnant
wraith remnant
#

I guess I'll see your pr

shrewd ermine
wraith remnant
#

Oh so it falls back to gemini, using the chatgpt library; yeah we're aligned then

#

You're just in advance ahah 🤣

#

But, wouldn't what we want is to keep the routing per provider though ? 🤔

shrewd ermine
wraith remnant
#

Yeah, but wouldn't what we really want is to have those providers implement their own specific: func (llm *Llm) sendQuery(ctx..., in which case it reuses the openai client with the seed off -- and not for openai routing for example ?

#

As I am implementing the abstraction for claude with an interface, extending the logic is then easy

#

No strong opinion though 😇

wraith remnant
shrewd ermine
#

yeah I know what you mean. The painful part is that the whole benefit of the "openai compatible api" is that we really shouldn't need to do that. For Ollama it's been perfect so far. The fact that gemini's openai API is missing seed is just a bug on their end. If we were going to actually route it differently I'd say we just use the core gemini api instead of the openai compatible one
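
A minimal sketch of the two positions side by side, assuming a hypothetical `Provider` interface: routing stays per provider, but the Gemini entry just reuses the OpenAI-style request with the seed left out. None of these names come from the engine code, and the model-name prefix check is an assumption for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Request is a simplified chat request; a nil Seed means "omit the field".
type Request struct {
	Model  string
	Prompt string
	Seed   *int
}

// Provider adapts a request for one backend. Hypothetical interface.
type Provider interface {
	Name() string
	Prepare(req Request) Request
}

// openAI passes the request through unchanged.
type openAI struct{}

func (openAI) Name() string                { return "openai" }
func (openAI) Prepare(req Request) Request { return req }

// geminiCompat reuses the OpenAI-compatible request but drops the seed,
// which the compatible endpoint reportedly does not accept.
type geminiCompat struct{}

func (geminiCompat) Name() string { return "gemini (openai-compatible)" }
func (geminiCompat) Prepare(req Request) Request {
	req.Seed = nil
	return req
}

// route picks a provider from the model name, falling back to openAI.
func route(model string) Provider {
	if strings.HasPrefix(model, "gemini") {
		return geminiCompat{}
	}
	return openAI{}
}

func main() {
	seed := 42
	req := Request{Model: "gemini-2.0-flash", Prompt: "hi", Seed: &seed}
	p := route(req.Model)
	fmt.Println(p.Name(), "seed set:", p.Prepare(req).Seed != nil)
}
```

Either way the OpenAI client does the actual call; the only per-provider difference in this sketch is the request tweak.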

wraith remnant
#

So we're aligned, it's just an implementation detail afterward

smoky ocean
#

⚠️ pushed: API call limits. llm(maxApiCalls:10) -> this will cap to 10 API calls total for the duration of that LLM instance. This deprecates loop(maxLoops) and paves the way to adopting the standard Dagger convention for laziness and sync

#

If you reach the cap, you will get an error

#

we can add token limit and maybe dollar limit in the future (might require a dagger cloud integration for that last one)

shrewd ermine
wraith remnant
#

half-ish

#

With pleasure 🙏

shrewd ermine
#

cool, i'll work on something else for now!

wraith remnant
#

🙏 (focus mode, seeya 👋 )

smoky ocean
#

I'm thinking that with multi-object, I'll take the opportunity to adjust the API as so:

Option 1:

type LLM {
  set<FOO>(key: String!, value: <FOO>!): LLM!
  get<FOO>(key: String!): <FOO>!
}

The explicit set and get might make it more clear what is happening.

But, we would lose with<FOO> which is also nice. wdyt?

Option 2:

Keep what we have, just with key argument:

type LLM {
  with<FOO>(key: String!, value: <FOO>!): LLM!
  <FOO>(key: String!): <FOO>!
}
smoky ocean
#

mmm weird, toy-programmer prompt now breaks for me, it tells me how to do it, instead of using tools to do it ...

Nevermind, it was a bug I introduced, now fixed

smoky ocean
shrewd ermine
#

I think I do, but I haven't actually done anything multi-object yet. So maybe after that's supported I'll realize it's gross

spring wave
#

I just have a negative gut reaction to getters/setters 😛 but now I understand the problem a bit more

#

setFoo behaves like withFoo, right? it just returns a modified LLM with the value set, as opposed to mutating?

smoky ocean
#

The problem I'm trying to avoid ⏬

smoky ocean
#

a compromise would be withFoo / getFoo

#

🚨🚨🚨 LLM.loop() is now no longer required. You can call sync() explicitly to force evaluation - otherwise just call lastReply, history or a getter function (container(), myObject()) and the state will be automatically synchronized
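
Roughly, the lazy/sync split looks like the sketch below: queuing a prompt does no work, and anything that needs a result forces evaluation first. The `Session` type and the `send` callback are stand-ins invented for this example; the real engine logic is more involved.

```go
package main

import "fmt"

// Session holds queued prompts and received replies. Hypothetical type.
type Session struct {
	pending []string // prompts queued but not yet sent
	replies []string // replies received so far
}

// WithPrompt is lazy: it returns a new Session with the prompt queued
// and makes no API call.
func (s Session) WithPrompt(p string) Session {
	return Session{
		pending: append(append([]string{}, s.pending...), p),
		replies: s.replies,
	}
}

// Sync forces evaluation: every pending prompt is sent, in order.
func (s Session) Sync(send func(string) string) Session {
	replies := append([]string{}, s.replies...)
	for _, p := range s.pending {
		replies = append(replies, send(p))
	}
	return Session{replies: replies}
}

// LastReply needs a result, so it syncs implicitly before answering.
func (s Session) LastReply(send func(string) string) string {
	synced := s.Sync(send)
	if len(synced.replies) == 0 {
		return ""
	}
	return synced.replies[len(synced.replies)-1]
}

func main() {
	calls := 0
	send := func(p string) string {
		calls++
		return "echo: " + p
	}
	s := Session{}.WithPrompt("llm, are you there?")
	fmt.Println("calls after WithPrompt:", calls)
	fmt.Println(s.LastReply(send))
	fmt.Println("calls after LastReply:", calls)
}
```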

--> Logging off for the day. Tomorrow attacking multi-object and/or MCP support, we'll see. + playing with the latest dagger shell improvements from Helder, and hopefully plugging that in!

#

Thank you for a fun few days everyone!

void flint
#

@shrewd ermine - re the messages earlier about deepseek. The model doesn't support tool calling at the moment, hence why you're not getting "great" results from it. I've also found that r1 with 70b params can hallucinate quite quickly. I've been running a 671b quantised model to get better results. I still haven't run the full model

To get tool calls "working" with DeepSeek R1, I've been using BAML (https://www.boundaryml.com/blog/deepseek-r1-function-calling) in between my code and model. BAML takes the random / unstructured output from any LLM, and coerces it into a user-defined format

Boundary

How to do tool-calling or function-calling with Deepseek R1

river belfry
#

I re-did my java test: https://github.com/shykes/melvin/pull/4
So it's using the default dagger-llm (no specific branch, just needed for the java sdk)
There's a new java module with a few commands:

  • find-bugs to find and explain them, but without changing anything
  • refactor to improve the code
    That's still a toy, but that's really fun to do! 🙂
GitHub

Warning: Experimental

This adds a basic java module to interact with Java/maven code.
Two main functions are exposed: find-bugs and refactor:

find-bugs: find bugs and explain them, propose some alt...

river belfry
#

❓ Is there a way we can directly edit files on the host from the workspace? I'm doing a ./java --source MyDir | foo | export MyDir but I'd like to avoid the last segment if possible

smoky ocean
river belfry
smoky ocean
#

@spring wave @shrewd ermine @wraith remnant heads up I'm moving us from shykes/dagger@llm to dagger/dagger@llm. Thanks to @split shard and @gloomy kindle we have a nice pre-release system, so we can start giving binary builds to users instead of having them build dagger with dagger 🙂 🙏

smoky ocean
smoky ocean
#

🚨🚨🚨 Pushed: a new README with much simpler initial setup. You can now install llm-enabled Dagger from a binary pre-release. Please give the new setup instructions a try to make sure they work 🙏

smoky ocean
#

@spring wave @steep onyx I have a new merge conflict rebasing llm on main... I think related to this commit:

commit 4bb955ca4e9791125de7aae4fa090759891acc11 (upstream/main, main)
Author: Erik Sipsma <erik@dagger.io>
Date:   Wed Feb 19 11:22:49 2025 -0800

    make function calls cached session-wide (#9621)
#

I'm going to poke around but I didn't author either side of the conflicting code so not very confident

shrewd ermine
smoky ocean
#

Love it

#

It just occurred to me: at the moment multi-model will not work if you combine OpenAI models & generic - because the engine uses OPENAI_BASE_URL for both

#

Starting a thread to discuss @nova bronze 's toy roasting agent 🙂

spring wave
#

@smoky ocean 🚨 gonna push -f soon: I removed all the commits that were split out into my two PRs, rebased on main, and then merged both of my PRs in on top. From now on we can just continuously merge from the 3 branches (vito/tui-llm, vito/shell-bbt, main - I'll manage the first two as changes are made to them), and once they're merged we can just rebase on main

spring wave
#

uh...done... i think?

remote: Resolving deltas: 100% (661/661), completed with 79 local objects.
remote: Bypassed rule violations for refs/heads/llm:
remote: 
remote: - Cannot force-push to this branch
remote: 
To github.com:dagger/dagger
 + 2273cd29a...bae2eb0f5 llm -> llm (forced update)
branch 'llm' set up to track 'upstream/llm'.
smoky ocean
#

Current status: upgrading github.com/shykes/melvin/workspace to use a checker interface instead of a container... If that works, I'll use interfaces for an onSave hook, then the programmer agent can eg. send updates to github, discord etc. from within its inner loop 😛

#

@shrewd ermine do you need help on the dag.Host support? Looks like it would unblock you in a big way

#

ugh getting stuck on interfaces... 😦 I never get it right on the first try

shrewd ermine
smoky ocean
#

Man1️⃣ Machine0️⃣

smoky ocean
#

Next blocker: figuring out the right pattern for using interfaces as callbacks in a stateful loop...

#

Specifically: if the callback needs to persist state across calls... how to do that.

For example, for a notification callback (so I can eg. send github & discord updates every time the dev agent checkpoints its workspace).

--> the github notification module kind of assumes its local state will be up-to-date. For example if you add 3 tasks, it expects that its state will contain an array of 3 tasks.

But if I wrap it in an interface that looks like this:

interface Notifier {
  notify(message: String!): Void
}

Then there's no way to "build state" from one notification to the next

#

So, no big deal you say: just make it a chained call:

interface Notifier {
  notify(message: String!): Notifier
}

Well guess what: now my "notify" function has to be lazy, because it returns an object, and therefore there is a blanket rule that you can't do error check on it

#

Not too bad for notify (would be nice to be able to know when a notification callback failed, but sure why not).

But it becomes a major problem for my other interface: Checker.

interface Checker {
  check(dir: DirectoryID!): Checker
}

Now my check function, whose only purpose is to maybe return an error, cannot return an error.
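
For comparison, here is the shape being asked for, written in plain Go where a method can return both the next state and an error. It only pins down the wish (chained state plus an eager error check) and says nothing about how Dagger interfaces could support it; `taskList` is a made-up implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Notifier returns its next state AND an error, so the caller can both
// accumulate state across calls and detect failures immediately.
type Notifier interface {
	Notify(message string) (Notifier, error)
}

// taskList is a hypothetical notifier that remembers every message.
type taskList struct {
	tasks []string
}

func (t taskList) Notify(message string) (Notifier, error) {
	if message == "" {
		return t, errors.New("empty notification")
	}
	// Copy before appending so earlier states are never mutated.
	next := append(append([]string{}, t.tasks...), message)
	return taskList{tasks: next}, nil
}

func main() {
	var n Notifier = taskList{}
	for _, msg := range []string{"task 1", "task 2", "task 3"} {
		var err error
		if n, err = n.Notify(msg); err != nil {
			fmt.Println("notify failed:", err)
			return
		}
	}
	fmt.Println("tasks recorded:", len(n.(taskList).tasks))
}
```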

smoky ocean
#

A separate but related problem: because a module can't receive or return another module's type, we can't use interfaces to implement a complete hook system.

Eg. my Workspace type can't define a hook interface that receives the Workspace as argument, for more flexibility.

So for example my notifier can't inspect Workspace.Diff() and make its own decision on whether to include it or not.

#

that said: HA HA HA it's working

#

I think actually a lot of this callback system might be advantageously replaced by multi-object. Instead of programming my workspace to call a notifier hook on each save(), I would just give the LLM both 1) the workspace and 2) the notifier, and let it decide when to send updates

shrewd ermine
#

@steep onyx is this error related to the bump to 0.16? Cannot query field "resolveContextPathFromCaller" on type "ModuleSource"

smoky ocean
steep onyx
shrewd ermine
#

Got it. I'm just doing dag.CurrentModule().Source().File("system.txt"). Based on the note in the release I thought it would be safe

steep onyx
#

I wonder if this is related to your success installing before with the different sha

shrewd ermine
#

"engineVersion": "v0.16.1-250219172554-bae2eb0f5765"
using cli + engine built off of dagger@llm

steep onyx
shrewd ermine
#

ok I have a pretty messed up setup right now so i'm bringing it all back up lol

#

no luck 😞 let me try to uncomplicate the setup a bit

#

ok I think we're good. I was on dev-engine in prod engine -> shell -> function -> dagger-in-dagger. Now I'm just on the llm release. It was a bit manual updating deps

shrewd ermine
jaunty iron
#

Q: maybe this was already mentioned (sorry), can I make the llm primitive talk to an openai compatible service running inside the Dagger engine? Like, I start one with gpu access, then I load the model, and eventually I test my agent?

smoky ocean
bronze fern
proper stratus
jaunty iron
smoky ocean
#

if you set ANTHROPIC_MODEL in the environment it should pick that as a default

smoky ocean
smoky ocean
# smoky ocean if you set `ANTHROPIC_MODEL` in the environment it should pick that as a default

That said @proper stratus even once you get the request routed correctly to anthropic (which should already work), you will hit another issue, which is that Anthropic endpoints don't work with the openai client libraries, and we haven't yet added the anthropic client library. @wraith remnant is finishing a PR for this, and we plan on merging today.

TLDR: by the end of the day, you should have a working anthropic implementation running on Dagger 🙂

proper stratus
#

Great! I will try that tomorrow morning.

smoky ocean
#

Hello everyone! On the menu for today:

  • Anthropic support merging soon thanks to @wraith remnant and @storm gate
  • We rebased on dagger main, and will get the sweet sweet performance boost contributed by @steep onyx 🙏 Will cut a release soon, this will require an upgrade
  • Some cool demos being built on dagger-llm, we will share videos and links
  • I would love to get MCP support prototyped today
  • Big improvements to dagger shell in a separate branch thanks to @shrewd fern , I'm thinking we should just merge that in also 😛
devout magnet
smoky ocean
#

@devout magnet yes if you look at the README for https://github.com/shykes/melvin , we updated it yesterday with install instructions that do this. There is a "pre-release" version of Dagger called v0.16-llm.1. Later today we will release v0.17.0-llm.1

devout magnet
smoky ocean
devout magnet
devout magnet
smoky ocean
#

@gloomy kindle @spring wave @split shard @wraith remnant @shrewd ermine be advised: I am tagging v0.17.0-llm.1

#

@wraith remnant do you want me to wait for your Anthropic PR to merge?

wraith remnant
smoky ocean
#

no problem we'll just release llm.2

smoky ocean
#

🚨🚨🚨 New pre-release has dropped: v0.17.0-llm.1. Please update your dagger with:

curl -fsSL https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION=0.17.0-llm.1 BIN_DIR=/usr/local/bin sh
#

This is rebased on Dagger 0.16, so among other things, you will get the performance boost & 1password & hashicorp vault integrations

mystic steeple
#

@smoky ocean have you used lmstudio? when I was doing the C# stuff with semantic kernel, i was only calling locally hosted models, wondering if i can do that with this PR or does it call out to the remote model provider API only? i.e. you need to get an api key / account on those providers to use the LLM Dagger type?

#

I somehow racked up a £100 bill and i errm, didnt want to run remote OpenAI stuff anymore because of that

#

like i would just wanna point it to localhost

merry scarab
shrewd ermine
#

what lev said 🙂

merry scarab
#

@shrewd ermine to the rescue

mystic steeple
#

ahh okay cool, that's good - the variables provided don't make it all too clear to me

merry scarab
#

Yeah, good point! I was also confused at first but the tldr is that ollama provides a compatible api

mystic steeple
#

in which case, ignore me will try on the weekend!!

shrewd ermine
#

yeah with ollama specifically we're taking advantage of the fact that they have an openai compatible API, so I set OPENAI_BASE_URL=http://localhost:11434/v1/. I'm not familiar with lmstudio to say what that would look like

merry scarab
#

@shrewd ermine might be a nice blog post or something showing how to get this working from scratch

smoky ocean
#

As a safety, you can also cap the number of API calls per llm instance (for example llm(maxApiCalls:10)). It's a pretty crude safety but better than nothing. We're going to add token caps also

smoky ocean
# void flint <@135620352201064448> - re the messages earlier about deepseek. The model doesn'...

I was wondering if one could implement this "DeepSeek + tool calling wrapper" as a dagger module 🙂

  • Run a DeepSeek instance ("thinker")
  • Also run a non-reasoning model that supports tool calling ("doer")
  • Wire them so that they talk to each other - the thinker tells the doer what tools to call
  • Wrap that in a DeepSeekWithToolCall type, that can be instantiated and prompted at will

I think the issue might be - how to get your object in there. You might have to instantiate the "doer" LLM on your own, then pass it as an argument

#

So maybe a more generic utility, to combine 2 Llms - one thinker, one doer - so that they collaborate
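
A rough sketch of that generic utility, assuming both models can be hidden behind one tiny `Model` interface: the thinker produces a plan, the doer receives the plan as its prompt, and the pair itself satisfies the same interface. All names and the hand-off prompt wording are invented for illustration.

```go
package main

import "fmt"

// Model is the smallest possible stand-in for an LLM: prompt in, text out.
type Model interface {
	Complete(prompt string) string
}

// ModelFunc lets a plain function act as a Model (handy for stubs).
type ModelFunc func(prompt string) string

func (f ModelFunc) Complete(prompt string) string { return f(prompt) }

// Pair combines a reasoning model with a tool-calling model.
// It implements Model itself, so it can be used anywhere a single model is.
type Pair struct {
	Thinker Model // reasons about the task, no tool calling
	Doer    Model // supports tool calling, follows the plan
}

func (p Pair) Complete(prompt string) string {
	plan := p.Thinker.Complete(prompt)
	return p.Doer.Complete("Carry out this plan using your tools:\n" + plan)
}

func main() {
	thinker := ModelFunc(func(prompt string) string {
		return "1. read the file 2. fix the bug"
	})
	doer := ModelFunc(func(prompt string) string {
		return "done after receiving: " + prompt
	})
	fmt.Println(Pair{Thinker: thinker, Doer: doer}.Complete("fix my program"))
}
```

Passing the doer in as an argument matches the concern above: the caller instantiates it with whatever objects it needs, then hands it over.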

shrewd ermine
mystic steeple
#

and some models support images right?

#

has anyone tried to dagger -> provide image -> get output from image?

mystic steeple
#

💀

#

could be an interesting use case for dagger watch... could screenshot your screen or a window, and boom - an assistant

wraith remnant
#

It shouldn't change anything on the openAI / fallback logic for gemini

shrewd ermine
#

at a glance, looks great! I can try building it if you're unsure

wraith remnant
#

yeah please 🙏 Especially for the retrocompat ; Anthropic still hallucinates a bit, I am digging that

#

But it can be a follow-up -- tools are discovered, the model is using them -- it's just that it doesn't seem as good as openAI

shrewd ermine
shrewd ermine
wraith remnant
#

Then good to go ✅🙏

shrewd ermine
#

Added my approval

nova bronze
worn hill
nova bronze
#

Hahah very cheeky 😅. It picked it up either from the location of the activity or the location of the "club"

smoky ocean
#

@wraith remnant @shrewd ermine @spring wave if anthropic is merged, do you want to tag v0.17.0-llm.2 ?

wraith remnant
spring wave
#

pushed

wraith remnant
#

Works perfectly ✅ ❤️ -- thanks again

smoky ocean
#

awesome 🙂

#

@wraith remnant can you share a gif that I can tweet 🙂

#

And can you guys tag 0.17.0-llm.2 if anthropic works? 🙏

smoky ocean
#

@bronze fern @shrewd ermine back to my computer in 5mn... sorry for disappearing. Got some gifs for me? multi model goodness maybe? 😛

shrewd ermine
bronze fern
#

X takes videos just fine

smoky ocean
#

Ok I'm at my computer. Pushing llm.2 so that everyone gets the Anthropic goodness

#

@spring wave I may have found a bug in the TUI/llm branch...

#

Does this work for you?

container | from alpine | with-service-binding api $(container | from nginx | with-exposed-port 8000 | as-service) | terminal
spring wave
#

nope - hangs

smoky ocean
#

Same for me

#

That's the bug

#

I assumed it's TUI/shell but maybe not?

#

v0.17.0-llm.2

wraith remnant
#

But works

smoky ocean
smoky ocean
#

Ah but it still does tool calling fine?

#

So maybe we need to tell it to shut up and just do the work? 🙂

bronze fern
#

put mine up on X @smoky ocean

#

has vid attached

wraith remnant
# spring wave nope - hangs

still getting this error from time to time: ! input: llm.withContainer.withPrompt.sync POST "https://api.anthropic.com/v1/messages": 400 Bad Request {"type":"error","error":{"type":"invalid_request_error","message":"messages: text content blocks must be non-empty"}}

Digging back after the gif is done

spring wave
#

@smoky ocean re: stdout/stderr, I pushed a change to include <stdout> and <stderr> in exec error responses, only in the LLM code path for now instead of everywhere. demo (too bad that's not publicly viewable)

smoky ocean
spring wave
#

I actually went through the trouble of having it render the progress UI by loading and replaying the client's telemetry, but that didn't even help because it's reliant on the command actually running where/when you expect it to. In my case the command was being evaluated prior to the llm.Sync (arguably a bug - I was passing a container as an arg, and I think that calls sync instead of id atm) so the call in llm.Sync ended up just hitting the same cached failure, without any new logs printed. Whereas if you extract it from the error itself, it'll always be there

#

Also at one point it rendered way too much and blew up my token rate limit 😅 so that method is a little fraught

smoky ocean
#

Looking into MCP support... So what's the deal with this stdio-server situation?

  • The convention seems to be: implement your MCP server as a stdio server?
  • But then surely the client apps expect a remote HTTP endpoint?

Should dagger expose an http endpoint? Or stdio, and if so, how?

bronze fern
smoky ocean
#

indeed!

#

Looks like Goose definitely executes directly

#

(with manual curation of which shell command to run for which mcp server... 😭)

spring wave
# smoky ocean Looking into MCP support... So what's the deal with this stdio-server situation?...

If it helps any, there's a stdio => HTTP proxy tool called @mcphub/gateway - it's how I was able to get Claude Desktop to talk to my MCP service running in Dagger:

env MCPHUB_SERVER_URL=http://localhost:1234/api/mcp npx @mcphub/gateway

Claude config (mine has extra WSL rubbish):

❯ cat /mnt/c/Users/surac/AppData/Roaming/Claude/claude_desktop_config.json
{
    "mcpServers": {
        "dagger": {
            "command": "wsl",
            "args": [
                "--cd",
                "~/hack/mcp-gql/",
                "env",
                "MCPHUB_SERVER_URL=http://localhost:8080",
                "npx",
                "@mcphub/gateway"
            ]
        }
    }
}
#

not ideal for prod obviously, but could help unblock

smoky ocean
#

@spring wave so Claude desktop also does direct exec + stdio?

#

looks like it

spring wave
#

yeah

#

which is... kind of weird tbh. you'd think it'd be easy to support a simple HTTP endpoint config

smoky ocean
#

yes

#

I'm curious, does mcp support nesting / multiplexing? In other words, in theory could you connect one MCP server that has all the tools ™️ then select from the client which tool you want in which session?

#

my guess is "yeah right"

spring wave
#

the set of available tools can in theory be changed at "runtime" - there's a protocol message for it

#

but i don't think Claude Desktop supports that

#

(that's going to be a theme - for Desktop anyway, different story if we control the client too)

smoky ocean
#

Goose is the same. Very static

spring wave
#

which is fine for the graphql BBI approach, since the set of tools never has to change, its understanding of the schema changes instead

smoky ocean
#

So, since the utility of MCP is defined by its clients - and there are actually very few of those - fair to say, "support MCP" actually means "support executing many specialized MCP servers as single-purpose stdio servers"

noble notch
devout magnet
mystic steeple
#

A thought, from a business perspective.

Assistant (something a user or customer will ACTUALLY use...) -> calls out to an agent? Multiple agents behind the scenes? Does the Customer UI even show what's happening? (Think of complicated workflows that could be retrieval and manipulation of data, then possibly dumped somewhere else, not back to the user at all.) Agents can or will possibly have multiple tools available, and an assistant may or may not be an agent (GPT-based bot) on the front.

What does THAT look like in a world of dagger

#

ignore dagger and the devops, how is a customer interacting with AI solutions that run on dagger. Those diagrams will be interesting.

smoky ocean
wild pasture
#

Hi everyone!

I'm trying to reproduce the legendary demo "containerize your agents".

Tested on 0.15 and 0.16, this attempts to reach the OpenAI API even when the endpoint is set to local ollama URL:

command:
OPENAI_BASE_URL="tcp://0.0.0.0:11434" OPENAI_API_KEY="notused" dagger shell

. ⋈ llm --model "gpt-4o" | with-prompt "hi " | ask yo | history

Error: input: llm.withPrompt.ask POST "https://api.openai.com/v1/chat/completions": 401 Unauthorized {
"error": {
"message": "You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.",
"type": "invalid_request_error",
"param": null,
"code": null
}
}

#

any help/advice is appreciated. In order to make this work, it seems you need to alias one of your ollama models to one of the openai model names (going by how the code looks). The rest I dunno 😦

smoky ocean
#

or is it llm.2 now @gloomy kindle @split shard ? 😇

gloomy kindle
#

see #1342304973421150218, still failing - i tried submitting a fix earlier today, did not manage to get a review

subtle surge
# noble notch

<@&1326978792110948425> folks, would love your input here. 👆