#Cannot decode `Directory#1`

tawdry veldt
#

there are 2 bugs in one. The first is that the LLM is not great at typechecking: it sometimes treats a HelloDagger#1 as a dagger.Directory: https://v3.dagger.cloud/tiborvass/traces/c218cb47a4d1c0fe518206ea305f957d

Thankfully I won't be hitting this since I'll be autoselecting it with @young solstice's PR.

The other bug is that when it does choose a directory, it uses the string Directory#1 as-is instead of looking up its base64 ID. Where is the correspondence between the LLM-friendly numbers and the dagger IDs?
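
To make the two failure modes above concrete, here is a minimal sketch of a handle table that checks the type prefix and then swaps the LLM-friendly handle for the real ID. This is not Dagger's actual implementation: `HandleRegistry`, `register`, `resolve`, `HandleError`, and the placeholder ID strings are all hypothetical names made up for illustration.

```python
class HandleError(ValueError):
    """Raised when a handle is malformed, mistyped, or was never created."""


class HandleRegistry:
    """Hypothetical table mapping handles like "Directory#1" to opaque IDs."""

    def __init__(self):
        self._ids = {}       # handle -> opaque object ID
        self._counters = {}  # type name -> last number handed out

    def register(self, type_name, object_id):
        """Mint the next handle for type_name and remember the ID behind it."""
        number = self._counters.get(type_name, 0) + 1
        self._counters[type_name] = number
        handle = f"{type_name}#{number}"
        self._ids[handle] = object_id
        return handle

    def resolve(self, handle, expected_type):
        """Return the ID behind handle, rejecting wrong types and unknown handles."""
        type_name, sep, _ = handle.partition("#")
        if not sep:
            raise HandleError(f"{handle!r} is not a handle")
        if type_name != expected_type:
            # Bug 1: e.g. a HelloDagger#1 passed where a Directory is expected.
            raise HandleError(f"expected {expected_type}, got {handle!r}")
        if handle not in self._ids:
            # Bug 2: e.g. Directory#1 passed although it was never created.
            raise HandleError(
                f"unknown handle {handle!r}; known handles: {sorted(self._ids)}"
            )
        return self._ids[handle]


registry = HandleRegistry()
hello = registry.register("HelloDagger", "placeholder-id-for-hello")

for handle, wanted in [(hello, "HelloDagger"), (hello, "Directory"), ("Directory#1", "Directory")]:
    try:
        print(handle, "->", registry.resolve(handle, wanted))
    except HandleError as err:
        print(handle, "-> error:", err)
```

Returning the list of known handles in the error is a design choice so that the model gets something it can correct itself with, rather than a bare decode failure.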

I'm testing on main.

https://dagger.cloud/tiborvass/traces/5bef0311f17844dd789f50488a038719

young solstice
#

what LLM impl are you using, btw? i'm a lil surprised by this error but maybe we just have different preferred providers

tawdry veldt
#

4o

young solstice
#

i never hit it with claude 3.5

tawdry veldt
young solstice
#

once i've opened a prompt, i go back to the shell and then do $agent | env | inputs | name

#

i am almost entirely certain there is a less stupid way than that though lmao

tawdry veldt
#

TIL $agent

young solstice
#

it used to be .llm

tawdry veldt
#

Now, i just repro'd the 4o behavior on claude 3.5: it's trying to pass Directory#1, but that was never created.

#

$agent | env | inputs | name returns just hello

sick scroll
#

it's a large messy pr atm, sorry - but the tool calling scheme is very different, so I'm curious if any of the stuff I've already done addresses the issue in any way

tawdry veldt
#

sure! will try now

sick scroll
#

the evals are in-repo now, so you can do stuff like this:

```shell
# run a few attempts of an eval, analyze the results, suggest a new system prompt
dagger-dev -m modules/evaluator call --model claude-3-7-sonnet-latest --docs ./core/llm_docs.md --initial-prompt ./core/llm_dagger_prompt.md evaluate --model gpt-4.1 --name BuildMulti

# or to run a single eval
dagger-dev -m modules/evaluator/evals call build-multi report
```

young solstice
#

i had procrastinated cloning them or learning how to use them in their original location, and now you have rewarded my procrastination

fervent temple
tawdry veldt
#

Sorry Alex, got sidetracked, will get back into testing your branch. And I think Guillaume is referring to handling prompts via Instructions in MCP; if we start relying heavily on it again, we'll need to prioritize that to make things work with MCP. /cc @young solstice

#

I'll try to add an eval and run it with your instructions above

young solstice
#

maybe a hot take but we should keep focus on a minimal mcp demo unless merging selectTools is hyper-imminent, and if it is we should acknowledge we're gonna have some serious logical merge conflicts to iron out

tawdry veldt
#

right now the intuitive prompt is hitting the Directory#1 issue, so i'm trying to fix that.

sick scroll
#

i don't think i'm far from merging, but there's a lot of clean-up to do (as in splitting up the PR potentially)

tawdry veldt
#

good news, the problem seems to go away with your branch! Will need to do more tests with MCP but it looks promising

sick scroll
tawdry veldt
#

Will send it to you later, but basically it's the QuickStart example, and the prompt for now is just “build dagger hello”