#default-constructible loaded-module LLM function visibility


livid depot
#

putting aside potential api bikeshedding adjacent to privileged:true, how do we expect this to work?

#

like we want to initialize the module object so that the llm can call it, but atm the way "root" works is to pass the wholeass server

livid depot
#

so should i expect that having initialized the module means all its functions will show up in a callable way on that server? [the problem marcos was describing](#agents message) is that he doesn't wanna have to select the module object to call the functions, like with DaggerDev as the loaded module we should just be able to > run all the tests

#

because reading the code, to me, it feels like exposing those functions requires selecting that module object, and therefore not selecting the toplevel server with all the queries on it and whatnot

serene sky
#

yeah, I think that's the desired state. there's a tangential issue, since there's no way to go back to Query from the selected module object, so we've been talking about adding a selectRoot (or something) to fix that. but I think the desired initial state would still be that you only have the module object's tools immediately available (bikesheddable)

#

like for > run all the tests it would ideally already be scoped in tightly

#

and you'd only escape back out to the root if you need to (at the LLM's discretion I guess?)

livid depot
#

ah gotcha, i thought the bikeshedding was gonna be around env(privileged:true, module: .), and for both those to be set at once i think breaks the whole abstraction

serene sky
#

everything is bikesheddable 😛 - I'm also toying with another scheme that might remove the need for it, but not sold on it yet, still very experimental and might not work for dumber models

livid depot
#

i guess both those bikesheds are possible at once if env(privileged:true, module: .) initially selects the module but has selectRoot available

serene sky
#

tl;dr: no more 'current selection', there's just one tool, selectTools(tools: ["..."]), which lists all available tools and object IDs in its description, and all objects are passed around as arguments instead of swapping selections around (so there's a self: arg)

livid depot
#

like WithExec(self:ctr, cmd:"cowsay lul")?

serene sky
#

(ok not just one tool, there's still returning)

serene sky
#

but, sorry, might just be a side tangent, still eval'ing
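
To make the "no current selection" idea concrete, here is a minimal Python sketch, not the actual implementation. It assumes a hypothetical `ToolEnv` class, made-up object IDs like `Container#1`, and a container modelled as a plain list of commands. The only things taken from the discussion are the `selectTools` idea (its description lists all tools and object IDs) and the explicit `self` argument.

```python
from dataclasses import dataclass, field


@dataclass
class ToolEnv:
    """Toy model of the 'no current selection' scheme (all names are hypothetical)."""

    objects: dict = field(default_factory=dict)  # object ID -> (type name, value)
    catalog: dict = field(default_factory=dict)  # tool name -> function(self_value, **args)
    enabled: set = field(default_factory=set)    # tools the model has switched on

    def bind(self, type_name, value):
        """Store a value and hand back an ID the model can pass around."""
        obj_id = f"{type_name}#{len(self.objects) + 1}"
        self.objects[obj_id] = (type_name, value)
        return obj_id

    def select_tools_description(self):
        """The selectTools description lists every tool and every object ID."""
        return (
            "Tools: " + ", ".join(sorted(self.catalog))
            + " | Objects: " + ", ".join(sorted(self.objects))
        )

    def select_tools(self, tools):
        """Enable the named tools; unknown names are rejected."""
        unknown = [t for t in tools if t not in self.catalog]
        if unknown:
            raise KeyError(f"unknown tools: {unknown}")
        self.enabled.update(tools)

    def call(self, tool, self_id, **args):
        """Call an enabled tool; the receiver is an explicit argument, not a selection."""
        if tool not in self.enabled:
            raise PermissionError(f"{tool} has not been selected")
        type_name, value = self.objects[self_id]
        if tool.split(".")[0] != type_name:
            raise TypeError(f"{tool} cannot take a {type_name} as self")
        result_type, result = self.catalog[tool](value, **args)
        return self.bind(result_type, result)


env = ToolEnv()
# A container is modelled as just the list of commands run in it.
env.catalog["Container.withExec"] = lambda ctr, cmd: ("Container", ctr + [cmd])
ctr = env.bind("Container", [])
env.select_tools(["Container.withExec"])
ctr2 = env.call("Container.withExec", self_id=ctr, cmd="cowsay lul")
```

Every result gets its own ID and the original object is left untouched, so the model never has to track which object is "current"; the cost is that it must remember to pass `self` on every call.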

eternal ferry
#

Quick interjection: I think the monolithic concept of "root query object" (which tainted Env.withQuery() and still taints env(privileged)) doesn't make sense to expose to the LLM. It mixes 1) core API access and 2) module access, which are pretty different from a UX point of view

eternal ferry
#

problem is tool explosion

livid depot
#

lol [this](#1359291317364064357 message) vibes with me hard, it's a big smell that the reason we want privileged:true is because module funcs have parameters that can only be "built" via the core api

serene sky
livid depot
#

have we tested that the LLMs actually understand swapping back and forth from root, anyways? like if I try to get it to call ctr.WithFile(file) with a container selected, it's gotta pop out to root, construct the file, bind it, then pop back to ctr and call WithFile($file)

#

i imagine testing this requires that the selectRoot tool exists lol

#

and maybe that's step one here
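
The root-swapping dance described above can be spelled out as a rough Python sketch. Everything here is an assumption for illustration: the `Session` class, the `select` tool for popping back to an object, and the object names. Only `selectRoot` and the four-step sequence come from the discussion, and variable binding is reduced to a `$file` placeholder.

```python
class Session:
    """Toy 'current selection' model; `select` and the object names are hypothetical."""

    def __init__(self, objects, selected):
        self.objects = objects    # object name -> set of function names on it
        self.selected = selected  # the one object whose functions are visible
        self.trace = []           # (selection at call time, tool, args)

    def visible_tools(self):
        # Only the selected object's functions, plus the two navigation tools.
        return sorted(self.objects[self.selected] | {"select", "selectRoot"})

    def call(self, tool, *args):
        if tool not in self.visible_tools():
            raise LookupError(f"{tool} is not visible while {self.selected} is selected")
        if tool == "select" and args[0] not in self.objects:
            raise LookupError(f"no such object: {args[0]}")
        self.trace.append((self.selected, tool, args))
        if tool == "selectRoot":
            self.selected = "Query"
        elif tool == "select":
            self.selected = args[0]


session = Session({"Query": {"file"}, "ctr": {"withExec", "withFile"}}, selected="ctr")
session.call("selectRoot")         # pop out to root
session.call("file", "hello.txt")  # construct the file (binding to $file is elided)
session.call("select", "ctr")      # pop back to the container
session.call("withFile", "$file")  # finally, the call that was actually wanted
```

One intended call costs four tool calls, two of them pure navigation, which is the part worth testing against real models.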

serene sky
#

i did, and couldn't get it to behave well consistently, so i shelved it due to time constraints

livid depot
#

😢

serene sky
#

i could have been doing something dumb though

#

at the time it was a "maybe i can sneak this in for release" => computer said no

eternal ferry
#

I agree the Query functions should be an always-present overlay

#

I don't think we can get rid of "select" completely without finding another solution to tool explosion. If the tool list just keeps growing, eventually it will hit the limit

#

I guess we could bet on models just getting better

#

maybe with the core type haircut, the situation is different?

#

still I think query is a special case. never made sense to me that you could select it

#

OK how about this

#

as an alternative

livid depot
eternal ferry
#

@serene sky instead of selecting objects, we keep an ever-growing flat list of namespaced tools (basically like in single-object), but we add a "tool store" indirection, where the model can search the toolstore and enable/disable tools

#

difference is that it maps directly to the flat list of tools concept, that LLMs are trained for. Less juggling with "objects" to select etc

#

But the type system is still there. So hopefully the model can use type information to choose tools wisely

livid depot
#

searching tools by type sounds pretty sensible
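
A minimal Python sketch of the "tool store" indirection, assuming a hypothetical `ToolStore` class and made-up tool signatures (the `Type.function` names and argument types below are illustrative, not the real schema). It shows a flat namespaced list, search by name or by type, and enable/disable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str         # namespaced, e.g. "Container.withFile"
    self_type: str    # type of the object the function is defined on
    arg_types: tuple  # types of the remaining arguments
    returns: str      # return type


class ToolStore:
    """Hypothetical tool store: everything is searchable, only enabled tools are active."""

    def __init__(self, tools):
        self._tools = {t.name: t for t in tools}
        self._enabled = set()

    def search(self, text="", accepts=None, returns=None):
        """Find tools by name substring, by a type they accept, or by return type."""
        hits = []
        for t in self._tools.values():
            if text.lower() not in t.name.lower():
                continue
            if accepts is not None and accepts != t.self_type and accepts not in t.arg_types:
                continue
            if returns is not None and returns != t.returns:
                continue
            hits.append(t.name)
        return sorted(hits)

    def enable(self, *names):
        for name in names:
            if name not in self._tools:
                raise KeyError(name)
        self._enabled.update(names)

    def disable(self, *names):
        self._enabled.difference_update(names)

    def active(self):
        """The flat tool list the model would actually be shown."""
        return sorted(self._enabled)


store = ToolStore([
    Tool("Container.withExec", "Container", ("String",), "Container"),
    Tool("Container.withFile", "Container", ("String", "File"), "Container"),
    Tool("Directory.file", "Directory", ("String",), "File"),
    Tool("DaggerDev.test", "DaggerDev", (), "Void"),
])
producers = store.search(returns="File")  # what can give me a File?
consumers = store.search(accepts="File")  # what can I do with a File?
store.enable(*producers, *consumers)
```

Searching by type lets the model answer "what produces a File?" and "what takes a File?" without every tool being in the active list at once.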

serene sky
#

some evals are looking promising

serene sky
#

still totally need the function trimming btw, if only to hide export 😭 - with a narrower selection there's a MUCH higher chance of the LLM picking the right ones based on name only

eternal ferry
livid depot
#

so uh this convo went slightly off the rails but i also wasn't very clear with my intentions. short term, should we be looking to change the default shell startup to be something like env(privileged: true, module: .) so you boot into a shell/prompt with your sdk-defined functions available and then the LLM has to back out to root via a selectRoot tool? the tenor of this conversation was leaning towards the idea that selectTools might obviate such a thing.

serene sky
livid depot
#

ah, because selectTools still kinda implies you want some control over the initial selection

#

cool i will move forward then

serene sky
#

yeah exactly, the end goal is still there: "some means via the API to control what tools the model has immediately available"

#

unfortunately i'm seeing that models are keen to make up tool args (Container.withExec(..., env: [...])) and neglect to provide required ones (self). trying to system prompt my way out of it.
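
Besides system prompting, one option is to check each tool call against its schema and feed the problems back to the model. The Python sketch below assumes a simplified JSON-Schema-like dict with `properties` and `required`; the `validate_call` helper and the schema for `Container.withExec` are hypothetical, not the real tool definition.

```python
def validate_call(schema, args):
    """Return a list of human-readable problems; an empty list means the call is fine."""
    problems = []
    for name in schema["required"]:
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name in args:
        if name not in schema["properties"]:
            problems.append(f"unknown argument: {name}")
    return problems


# Hypothetical, simplified schema: `self` is the explicit receiver object.
with_exec_schema = {
    "properties": {"self": {"type": "string"}, "args": {"type": "array"}},
    "required": ["self", "args"],
}

# The failure mode from the thread: an invented `env` arg and no `self`.
bad_call = {"args": ["go", "test"], "env": ["FOO=bar"]}
problems = validate_call(with_exec_schema, bad_call)
# problems == ["missing required argument: self", "unknown argument: env"]
```

Returning the problem list as the tool result gives the model something specific to correct on the next turn.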

serene sky
livid depot
#

"I keep making the same mistake. I need to specify the container each time."
lmao it's so dumb and yet so self-aware at the same time

#

the "bad intern that's real good at googling" analogy echoes throughout lmao

#

omg that trace is insanely long

serene sky