#Multi-object eval 1 ๐Ÿงต

1 messages ยท Page 1 of 1 (latest)

cedar cedar
#

OK I got it to actually see its objects @wild marlin ๐Ÿ‘ now onto actually performing a task

#

cc @hushed hill @sturdy mason @pseudo flame @hollow osprey @glossy finch

#

Multi-object nirvana here we come!

#

I hit an error with my github publish, env var not available somehow. Had to interrupt the agent with Ctrl-C. Wasn't sure what part of the context was still there. Which prompts need to be re-sent? Which variables are still set?

#

(note: I kind of want to annotate traces right now ๐Ÿ™‚

hushed hill
#

looks like it didn't stumble actually finding it's tools which is sweet

cedar cedar
cedar cedar
hushed hill
#

yeah looks like either the tool arg types were wrong or it hallucinated them a bit

cedar cedar
#

@wild marlin one interesting side effect of allowing use of hashes: I updated one of the variables, asked it to try again, but it tried from the same hash so I had to explicitly say "reload from the variable please"

cedar cedar
# cedar cedar

we had an actual engine / bbi bug in there at some point. We may have removed the fix alongside much of the bbi code

#

@wild marlin part 1: object has wrong credentials, agent gets an error

#

@wild marlin part2 : user fixes the object, asks agent to try again. but agent does not reload from variable (uses hash)

wild marlin
#

right right

cedar cedar
#

Part 3: user prompts to reload. this time object is used. We get to a new error ๐Ÿ™‚

wild marlin
#

well, technically it's not that it used the hash - it's that the current state never changed

#

(i think)

cedar cedar
#

oh, is there still a concept of current state? I thought that was replaced by explicit IDs

wild marlin
#

nope it's still there

cedar cedar
#

I guess it's a hybrid system: current state does the heavy lifting. Explicit IDs for passing objects as arguments

wild marlin
#

yeah

cedar cedar
#

(and/or more creative uses by the model)

#

creative->scary

wild marlin
#

and calling tools changes the current state to the new return value

cedar cedar
#

makes sense. then yeah you're right, it's the current state that's the problem

wild marlin
#

maybe setting variables could also set the current state? thinkies

cedar cedar
#

-> maybe we should show current selection in the prompt?

cedar cedar
wild marlin
#

would that break?

#

i think right now it's basically set a=1, b=2 -> prompt -> model selects a or b

cedar cedar
#

maybe setting variables could also set the current state?

But if I set 3 variables, which one would be set to the state? Last one wins?

wild marlin
#

yeah, and if the model needs a different one i presume it would just swap

cedar cedar
#

I think it might confuse the model if it's not the one doing the selecting

wild marlin
#

but, eh, it doesn't feel great

cedar cedar
#

at the moment we only tell the model about its new state when it selects it

#

what if we shaped the tools so that it has to start a "pipeline"; and the concept of currents state is only within that pipeline/session. If the pipeline completes, a value will be saved to a variable. If it's interrupted, the state is lost

#

(not sure that even makes sense)

#

OK next UI blocker: the agent says it's done. I can't really inspect the result, because I can't list current variables in the shell

#

I'm literally paying OpenAI to ask the LLM which variables it sees, because i can't see them ๐Ÿ˜›

wild marlin
#

lol, i ran into that too and tried running _env and .env in desparation

#

would be nice if $ supported tab completion too

wild marlin
# cedar cedar

are you able to confirm those are synced back to the shell?

cedar cedar
#

yes I opened a terminal in one of them ๐Ÿ™‚

cedar cedar
#

getting some random 1password lookup errors

#

(ignore the var names ๐Ÿ˜› )

#

error is from pressing >

#

@wild marlin just demoed this ๐Ÿ‘† to @glossy finch @hollow osprey @strong lynx ๐Ÿ™‚ they like it

#

Started talking about the looking question of "what about access to the shell environment - like the current module, core API, etc"

#

observed that the copilot's "privileges" will need to be configurable - you don't always want to give access to core API, or current module

#

from there: how to configure this?

#

from there: well there's the LLM API, llm | .... prompt mode is already a special case of that API. Could we make the relationship more ovious, so you can configure your "copilot" (the special global llm instance selected by prompt mode) using the llm API?

#

from there: what if you could set any number of variables of type llm, and > let you cycle through them?

#

Like copilot tabs ๐Ÿ™‚

wild marlin
#

@cedar cedar an idea for the var confusion: what if, when we set a var, we add a message like "The variable foo has been set to Container@xxh3:...", and if it changes, it says "The variable foo has changed from Contaienr@xxh3:... to Container@xxh3:..." - maybe that will be enough of a hint to the model

#

I actually tried that before embracing the 'just pass $foo' pattern and it did help

#

right now var changes are completely invisible to the model until it observes them (_objects), which it can't always know to do

cedar cedar
#

so you wrap it in an extra system prompt?

wild marlin
cedar cedar
#

Overall this first eval went really well!

#

A few papercuts. But the core plumbing held up well

#

Makes me want to try making more modules that I can compose in-model