#static tool scheme

cinder skiff
#

@random heart re: naming - i tried out "method" and it seems to work pretty well. Meshes well with our existing 'object' metaphor, and no ambiguity with tools/functions

random heart
#

which commit on dagger is this ?

#

nvm 7e18cb44

celest dune
#

@cinder skiff what's the gist of the static tool strategy you're pursuing?

random heart
#

this is without chaining ?

random heart
#

thank you @cinder skiff ❤️

cinder skiff
#

i'm trying a run with a blank system prompt just to see if somehow they magically know what to do based on this new framing

random heart
#

hahaha

celest dune
#

are you sure requiring a 2-step process (select then call) will help more than it hurts?

cinder skiff
#

well, the alternative is to dump all of the schemas in list_available_methods which seems expensive

#

and the model's going to be hopeless without the schema

#

(also dumping all at once seems like it'd hurt the context window)

celest dune
#

you could separate list from get_schema

cinder skiff
#

that's essentially what it is, just under a different name

celest dune
#

it's just that "select a method before calling it" is not a pattern that the model will be familiar with, so intuitively it feels like it might get confused

random heart
#

conceptually it's the same as what i called "list", "describe" (getSchema), "call" in the first version of static
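For reference, the list / describe / call flow discussed above can be sketched roughly as follows. This is a minimal illustration, not the code from the branch: `Registry`, `Method`, `ListMethods`, `SelectMethods`, `CallMethod` and `ErrNotSelected` are hypothetical names, and the schema is kept as an opaque string.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Method is a hypothetical stand-in for one statically exposed method.
type Method struct {
	Name        string
	Description string
	Schema      string // argument schema, kept as a raw string in this sketch
}

// Registry holds every method plus the set the model has selected so far.
type Registry struct {
	methods  map[string]Method
	selected map[string]bool
}

func NewRegistry(ms ...Method) *Registry {
	r := &Registry{methods: map[string]Method{}, selected: map[string]bool{}}
	for _, m := range ms {
		r.methods[m.Name] = m
	}
	return r
}

// ListMethods returns names only, so listing stays cheap.
func (r *Registry) ListMethods() []string {
	names := make([]string, 0, len(r.methods))
	for n := range r.methods {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// SelectMethods returns the schemas of the named methods and marks them selected.
func (r *Registry) SelectMethods(names ...string) (map[string]string, error) {
	out := map[string]string{}
	for _, n := range names {
		m, ok := r.methods[n]
		if !ok {
			return nil, fmt.Errorf("unknown method %q", n)
		}
		r.selected[n] = true
		out[n] = m.Schema
	}
	return out, nil
}

// ErrNotSelected is returned when a method is called before its schema was read.
var ErrNotSelected = errors.New("method not selected; use select_methods first")

// CallMethod refuses to run a method whose schema was never requested.
func (r *Registry) CallMethod(name string) (string, error) {
	if _, ok := r.methods[name]; !ok {
		return "", fmt.Errorf("unknown method %q", name)
	}
	if !r.selected[name] {
		return "", ErrNotSelected
	}
	return "called " + name, nil
}

func main() {
	r := NewRegistry(Method{Name: "Container.withExec", Description: "Run a command.", Schema: `{"args": "[]string"}`})
	fmt.Println(r.ListMethods())
	_, err := r.CallMethod("Container.withExec")
	fmt.Println(err) // rejected: schema not read yet
	if _, err := r.SelectMethods("Container.withExec"); err != nil {
		fmt.Println(err)
		return
	}
	out, _ := r.CallMethod("Container.withExec")
	fmt.Println(out)
}
```

The point of the split is that listing stays small, and the schema is only paid for on the methods the model actually intends to call.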

cinder skiff
#

yeah this is basically the 'you need to RTFM before calling' error pattern

random heart
#

and in the description of call i see you specified that the llm has to first select it.

#

which i found works pretty well, though i usually mention the name of the tool to be more precise

#

ah, i guess you were trying to avoid confusion around calling select_methods in the MCP sense vs calling in the dagger sense (aka calling call_method)

#

maybe phrasing like "using select_methods" works

cinder skiff
#

ah yea good idea

random heart
#

i agree that we could easily rename select_methods to method_schemas maybe it's conceptually easier

#

another thing that could help is in call_method we could validate the schema and return an error with the expected schema

cinder skiff
#

renaming is easy, it's mostly a matter of what guides the model to actually call it. it took a few iterations to land on select_ and IME models are very keen to avoid reading manuals/etc. and will just steamroll forward

#

i'll give it a go either way, ideally it's understandable by us and the model lol

random heart
#

WDYT about returning schema in call_method if validation fails ?

#

or maybe just a message saying "RTFM call select_methods first" to have it reinforce the pattern
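Both ideas (return the expected schema, and nudge back toward select_methods) can live in the same error. A rough sketch, assuming a deliberately simplified schema where every argument is required and has one of three type names; `ArgSchema` and `Validate` are hypothetical, not the actual call_method code:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ArgSchema maps argument name to expected type ("string", "number", "boolean").
// Simplification for this sketch: every argument is required.
type ArgSchema map[string]string

// typeName maps a decoded argument value to the schema's type vocabulary.
func typeName(v any) string {
	switch v.(type) {
	case string:
		return "string"
	case float64, int:
		return "number"
	case bool:
		return "boolean"
	default:
		return "unknown"
	}
}

// Validate checks args against schema. On failure the error lists the problems,
// repeats the expected schema, and points the model back at select_methods.
func Validate(method string, schema ArgSchema, args map[string]any) error {
	var problems []string
	for name, want := range schema {
		v, ok := args[name]
		if !ok {
			problems = append(problems, "missing "+name)
			continue
		}
		if got := typeName(v); got != want {
			problems = append(problems, fmt.Sprintf("%s: want %s, got %s", name, want, got))
		}
	}
	for name := range args {
		if _, ok := schema[name]; !ok {
			problems = append(problems, "unexpected "+name)
		}
	}
	if len(problems) == 0 {
		return nil
	}
	sort.Strings(problems) // map iteration order is random; keep the message stable

	fields := make([]string, 0, len(schema))
	for name, typ := range schema {
		fields = append(fields, name+": "+typ)
	}
	sort.Strings(fields)

	return fmt.Errorf("invalid arguments for %s: %s; expected schema: {%s}; use select_methods to read the schema first",
		method, strings.Join(problems, ", "), strings.Join(fields, ", "))
}

func main() {
	schema := ArgSchema{"path": "string"}
	fmt.Println(Validate("Directory.file", schema, map[string]any{"path": "go.mod"}))
	fmt.Println(Validate("Directory.file", schema, map[string]any{}))
}
```

Sorting the problems and fields keeps the error text identical across runs, which should also avoid needless variation in what the model sees.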

celest dune
#

Let's coordinate closely if you guys don't mind - I'm going to open a separate issue to fix the DX of Env (not LLM-facing, but as we know things are entangled at the moment).

celest dune
#

My focus:

  1. Make the DX simpler, by addressing papercuts. The first papercut being: adding aliases in LLM so you can do simple things without explicitly instantiating an Env (kind of like Container aliases to Directory functions for its rootfs)

  2. Get Env back on track as a more general primitive, not just for LLMs. This one is where I see the most risk of entanglement - in both directions

royal fjord
#

I'm a bit confused on this DX / static tool track sorry 👀

#

Oooh:

  1. Make the DX simpler, by addressing papercuts. The first papercut being: adding aliases in LLM so you can do simple things without explicitly instantiating an Env (kind of like Container aliases to Directory functions for its rootfs)
    Simplify the API so that it's easier for the LLM, and make Env a top-level thing in the dev loop
cinder skiff
#

zed + dagger mcp works now 🎉

#

evals looking solid too, running a bunch of iterations locally to measure success rate + token usage

#

seems better than main even (some evals were a bit flaky there on certain models, now it seems more consistent)

random heart
#

That's awesome!!! Thank you!

cinder skiff
#

token usage seems significantly higher, which is a little surprising, will investigate

#

initial guesses:

  • list_available_methods now includes the description for each method, so that'll obviously cost more
  • various tools now return JSON formatted responses, so that'll add up a bit
  • maybe there's still something busting caches in tool descriptions?
random heart
#

@cinder skiff i had a commit that was taking only the first paragraph of descriptions

firstParagraph := strings.ReplaceAll(strings.Split(tool.Description, "\n\n")[0], "\n", " ")

Maybe that could also help as that should be sufficient to tell the LLM whether to call select_methods on it.
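Wrapped up as a helper, the one-liner above might look like this. `FirstParagraph` is a hypothetical name, and the `strings.TrimSpace` call is an addition in this sketch so that leading blank lines don't produce an empty summary:

```go
package main

import (
	"fmt"
	"strings"
)

// FirstParagraph keeps only the first paragraph of a description and
// flattens it onto a single line.
func FirstParagraph(desc string) string {
	// TrimSpace is an assumption added here; the rest mirrors the one-liner.
	first := strings.Split(strings.TrimSpace(desc), "\n\n")[0]
	return strings.ReplaceAll(first, "\n", " ")
}

func main() {
	desc := "Run a command.\nReturns the container.\n\nLong details that the listing can skip."
	fmt.Println(FirstParagraph(desc))
}
```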

Another thing i wanted to try is to pass in either an object ref or an object type to list_available_methods to make it less big.
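A rough sketch of the filtering idea, assuming each method records the object type it belongs to. `MethodInfo` and `ListMethods` are hypothetical, and the method names are only illustrative:

```go
package main

import (
	"fmt"
	"sort"
)

// MethodInfo is a hypothetical listing entry; Receiver is the object type
// the method belongs to.
type MethodInfo struct {
	Receiver string
	Name     string
}

// ListMethods returns "Receiver.Name" for every method, or only for methods
// on objectType when it is non-empty.
func ListMethods(all []MethodInfo, objectType string) []string {
	var out []string
	for _, m := range all {
		if objectType == "" || m.Receiver == objectType {
			out = append(out, m.Receiver+"."+m.Name)
		}
	}
	sort.Strings(out) // stable order regardless of input order
	return out
}

func main() {
	all := []MethodInfo{
		{Receiver: "Container", Name: "withExec"},
		{Receiver: "Container", Name: "stdout"},
		{Receiver: "Directory", Name: "file"},
	}
	fmt.Println(ListMethods(all, "Directory"))
	fmt.Println(len(ListMethods(all, "")))
}
```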

cinder skiff
#

lol - this explains some of it

random heart
#

hmm stupid

#

is it "available" that's tripping it ? Maybe it thinks that it changes constantly when in fact it doesn't once you have a type.

#

What happens if you try list_methods and mention in its description specifically that the methods of a type will never change

cinder skiff
#

could be a poorly worded system prompt too

#

slowing down for a bit if you wanna try anything! (dinner + errands)

celest dune
#

I finally got around to writing my thread