#supporting lists
I think this can't be true? The following prompt for example: llm --model gemini-2.0-flash | with-directory $(git github.com/dagger/dagger| head | tree) | with-prompt "i gave you access to a directory at / and tools to work with it. the tool entries will tell you the contents of the directory. tell me the first item" | last-reply calls Directory.entries("/") and tells me .changes is the first entry
Right, so I think that case works for a specific reason, which is that Entries returns a list of strings. When we hit that case we actually just json marshal and i think it sees a list of strings: https://github.com/shykes/dagger/blob/431135c10af2d3a2a6051cf3841f26982f70ae48/core/bbi/flat/flat.go#L260
So the LLM can infer there's a list just because it's json.
But when we return a list of objects we return an ID for the type "list of that Object", but it's just a single ID for the whole list. That's where the LLM has no knowledge of lists or how to select elements from them.
Like if you change your example to use something that has a return type of []*Directory or anything like that, then I think the LLM will get confused again
It's a good point though that the problem is specifically with lists of objects and interesting the LLM can do it with lists of scalars just because it understands JSON
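A minimal sketch of the difference described above, assuming simplified stand-ins rather than the actual BBI code. The names `renderScalars`, `renderObjectList`, and `objectID`, and every entry other than `.changes`, are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectID is a hypothetical stand-in for an opaque object ID handed to the LLM.
type objectID string

// renderScalars mirrors the list-of-strings case: the values are JSON-marshaled,
// so the model sees an ordinary JSON array.
func renderScalars(values []string) (string, error) {
	b, err := json.Marshal(values)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// renderObjectList mirrors the list-of-objects case: the whole list collapses
// into one opaque ID, with nothing marking it as a list.
func renderObjectList(listID objectID) string {
	return string(listID)
}

func main() {
	entries, err := renderScalars([]string{".changes", ".dagger", "README.md"})
	if err != nil {
		panic(err)
	}
	// A JSON array: the model can read off the first item.
	fmt.Println(entries) // prints [".changes",".dagger","README.md"]
	// A single token: the elements are not reachable from it.
	fmt.Println(renderObjectList("DirectoryList#1"))
}
```

In the first case the model only needs to read JSON; in the second it has no way to learn that the ID stands for several objects.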
Gotcha, that makes sense. I haven't run into that yet!
Yeah I don't think it's a blocker per se, but I happened to hit it quickly and it's profoundly confusing.
The fact that it works with JSON in the list-of-scalars-case gave me an idea for a potential quick fix though, which is to represent lists of objects as a list of their IDs when presented to the LLM. Maybe they are good enough JSON parsers to work with that reliably
I'll try it quick
Yeah it sounds like a missing part of BBI that we don't provide a tool to access the list items
Yeah that would be even better but more work. If this quick fix works it basically relies on the LLM being given a json list encoded as a string and being able to reliably do stuff like "what is the 7th element of that list?"
I have no intuition on how good they would be at that. Seems plausible but also they struggle counting the number of r's in strawberry so 🤷
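A rough sketch of the quick fix being discussed, under the assumption that element IDs are plain strings. `presentObjectList` and `nthFromJSON` are hypothetical names; the second function only spells out the operation the model would have to carry out reliably on its own:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// presentObjectList is a hypothetical helper: rather than returning one ID for
// the whole list, it hands the model a JSON array with one ID per element.
func presentObjectList(elementIDs []string) (string, error) {
	b, err := json.Marshal(elementIDs)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// nthFromJSON spells out what the model would need to do with that string:
// parse the JSON array and pick the element at a 1-based position.
func nthFromJSON(encoded string, n int) (string, error) {
	var ids []string
	if err := json.Unmarshal([]byte(encoded), &ids); err != nil {
		return "", err
	}
	if n < 1 || n > len(ids) {
		return "", fmt.Errorf("position %d out of range (list has %d elements)", n, len(ids))
	}
	return ids[n-1], nil
}

func main() {
	encoded, err := presentObjectList([]string{"Container#1", "Container#2", "Container#3"})
	if err != nil {
		panic(err)
	}
	fmt.Println(encoded) // prints ["Container#1","Container#2","Container#3"]

	second, err := nthFromJSON(encoded, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(second) // prints Container#2
}
```

Whether a model performs the `nthFromJSON` step reliably for long lists is exactly the open question raised above.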
At the moment I believe BBI requires the state to be an object
Good timing since I'm diving back into BBI today and tomorrow to add multi-object, and could make other improvements while I'm at it
One thing I've been wondering: would there be a way to have weakly-typed setter functions?
Like the equivalent of Llm.WithAny(value any) in go?
Okay I'll just leave it alone for now then rather than try a fix (I'm probably spending a bit too much time here tbh anyways). It wasn't turning out to be a "quick fix" anyways
@sturdy jolt just so I understand the shape of the problem, you were focusing on arrays specifically right?
Oh OK I just realized - your issue is not even with the Dagger-facing part of the BBI. It's purely with the LLM-facing part
ie you do not need this:
var ctrs []*dagger.Container
dag.Llm().WithContainerArray(ctrs).WithPrompt(...)
It's more specifically arrays of objects. Right now the code handling that is confusing because when it sees a type of []Object (where Object is File, Directory, etc.) it treats it as just Object in some cases. e.g. isObjectType seems to return true when it's a list of Objects, not just a standalone Object: https://github.com/shykes/dagger/blob/431135c10af2d3a2a6051cf3841f26982f70ae48/core/bbi/flat/flat.go#L147
- That's due to the .Name() method on ast.Type just returning the element type name when it's a list (which is dubious, but in a library outside our control)
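To illustrate how a name-based check can mistake a list for a standalone object, here is a small sketch. `typeRef` is a hypothetical, simplified stand-in for the type-reference struct, and `isObjectByName` / `isStandaloneObject` are illustrative names, not the functions in flat.go:

```go
package main

import "fmt"

// typeRef is a hypothetical, simplified type reference: either a named type
// or a list wrapping an element type.
type typeRef struct {
	NamedType string
	Elem      *typeRef
}

// Name returns the named type, unwrapping lists down to their element type.
// This is the behavior that makes []Directory report the name "Directory".
func (t *typeRef) Name() string {
	if t.NamedType != "" {
		return t.NamedType
	}
	return t.Elem.Name()
}

// isList reports whether the reference wraps an element type.
func (t *typeRef) isList() bool { return t.Elem != nil }

// objectTypes is an illustrative set of object type names.
var objectTypes = map[string]bool{"File": true, "Directory": true, "Container": true}

// isObjectByName checks the name only, so it is also true for lists of objects.
func isObjectByName(t *typeRef) bool { return objectTypes[t.Name()] }

// isStandaloneObject additionally rules out list types.
func isStandaloneObject(t *typeRef) bool { return !t.isList() && objectTypes[t.Name()] }

func main() {
	dir := &typeRef{NamedType: "Directory"}
	dirList := &typeRef{Elem: dir}

	fmt.Println(isObjectByName(dir), isObjectByName(dirList))         // prints true true
	fmt.Println(isStandaloneObject(dir), isStandaloneObject(dirList)) // prints true false
}
```

Checking for the list wrapper before consulting the name is one way to keep the two cases apart.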
After that the return value for a list of objects is just a single ID which represents the entire list. Possible fixes include:
- Instead present that to the LLM as a json list of the ID types (relies on LLM understanding json, which seems plausible)
- Keep returning just a single ID for the whole list, but specifically tell the LLM that it's a list type and give it an extra "built-in" tool called something like selectNthElement, which gives it the ability to retrieve individual elements from any list ID (this matches how dagql call ID formats work)
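The second option could look roughly like the sketch below. It assumes a hypothetical in-memory `listStore` that maps a list ID to its element IDs and a 0-based index; neither is taken from the real implementation:

```go
package main

import "fmt"

// listStore is a hypothetical registry mapping a list ID to its element IDs.
type listStore map[string][]string

// selectNthElement returns the ID of the element at the given 0-based index,
// or an error the model can read and react to.
func (s listStore) selectNthElement(listID string, n int) (string, error) {
	elems, ok := s[listID]
	if !ok {
		return "", fmt.Errorf("%q is not a known list ID", listID)
	}
	if n < 0 || n >= len(elems) {
		return "", fmt.Errorf("index %d out of range for %q (length %d)", n, listID, len(elems))
	}
	return elems[n], nil
}

func main() {
	store := listStore{
		"ContainerList#1": {"Container#1", "Container#2"},
	}

	id, err := store.selectNthElement("ContainerList#1", 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // prints Container#2

	_, err = store.selectNthElement("ContainerList#1", 5)
	fmt.Println(err != nil) // prints true
}
```

Compared with the JSON-list option, the indexing happens in the tool rather than in the model, at the cost of one extra tool call per element.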
Correct, I don't need that (though obviously would be nice to have some day). The problem here arises when the LLM calls something that returns a list of objects and then tries to use it. See simple example here: https://github.com/dagger/dagger/pull/9628#issuecomment-2698853881
Ah I see
yeah that was one of the horrible horrible blocker bugs where I had to ask cursor+o1 for help
actually one of my nightmares is re-triggering these bugs as I go back and mess with BBI
I think that's what we talked about a few weeks ago before any of this had come together. It's definitely technically possible, though the classic tradeoff of type safety vs. convenience in some situations. If we don't have that we will end up with lots of autogenerated with* for permutations of with<Object>, with<Object>Array, etc. But idk if that's a bad thing
Let's revisit once we have multi-object in place. It could be that there's an easy fix
Yeah my other general comment after reviewing is that even though it will be tricky to figure out, we desperately need some tests somehow. Probably need some mock LLM backend
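One possible shape for such a mock, purely as a sketch: a backend that replays a fixed script of tool calls instead of talking to a model. The `backend` interface, `scriptedBackend`, and `runLoop` are all hypothetical and do not reflect the existing BBI or LLM code:

```go
package main

import "fmt"

// toolCall is a hypothetical record of one tool invocation requested by a model.
type toolCall struct {
	Tool string
	Arg  string
}

// backend is a hypothetical interface the tool loop would talk to.
type backend interface {
	// Next receives the previous tool result and returns the next call,
	// or false when the model is done.
	Next(lastResult string) (toolCall, bool)
}

// scriptedBackend replays a fixed list of calls and records what it was shown.
type scriptedBackend struct {
	script []toolCall
	seen   []string
}

func (s *scriptedBackend) Next(lastResult string) (toolCall, bool) {
	s.seen = append(s.seen, lastResult)
	if len(s.script) == 0 {
		return toolCall{}, false
	}
	call := s.script[0]
	s.script = s.script[1:]
	return call, true
}

// runLoop dispatches each requested call to the matching tool until the
// backend stops, returning every tool result in order.
func runLoop(b backend, tools map[string]func(string) string) ([]string, error) {
	var results []string
	last := ""
	for {
		call, ok := b.Next(last)
		if !ok {
			return results, nil
		}
		tool, found := tools[call.Tool]
		if !found {
			return results, fmt.Errorf("unknown tool %q", call.Tool)
		}
		last = tool(call.Arg)
		results = append(results, last)
	}
}

func main() {
	mock := &scriptedBackend{script: []toolCall{{Tool: "entries", Arg: "/"}}}
	tools := map[string]func(string) string{
		"entries": func(path string) string { return `[".changes"]` },
	}
	results, err := runLoop(mock, tools)
	if err != nil {
		panic(err)
	}
	fmt.Println(results, len(mock.seen))
}
```

A test can then assert both on the tool results and on what the backend was shown (`seen`), which is where a list-of-objects regression would surface.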
I was thinking of this kind of API:
LLM.set<Foo>(key: String, value: <Foo>): LLM!
LLM.get<Foo>(key: String): <Foo>!
Potentially we could add a third:
LLM.append<Foo>(key: String, value: <Foo>): LLM!
Yeah IIUC that seems reasonable to me. It's "stringly typed" of course but if someone really cares about that they are free to make their own "workspace object" that has fields for each of the objects/lists instead (which also of course will be 100x less painful after we have self calls)
Sorry the key: string are for multi object it's so I can give the LLM say, a workspace, a container and a github API endpoint (3 different object types) each at a different variable name
so it's orthogonal to the array problem
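A sketch of how the string-keyed set/get/append idea could behave, assuming objects are referred to by string IDs and that each call returns a new value rather than mutating in place. `env`, `binding`, and the method names are hypothetical and only mirror the API shape proposed above:

```go
package main

import "fmt"

// binding is a hypothetical record of what a variable name points at.
type binding struct {
	TypeName string
	IDs      []string
}

// env is a hypothetical immutable set of named variables given to the LLM.
type env struct {
	vars map[string]binding
}

// with returns a copy of the environment with one key replaced.
func (e env) with(key string, b binding) env {
	next := make(map[string]binding, len(e.vars)+1)
	for k, v := range e.vars {
		next[k] = v
	}
	next[key] = b
	return env{vars: next}
}

// Set binds a single object of the given type to key.
func (e env) Set(key, typeName, id string) env {
	return e.with(key, binding{TypeName: typeName, IDs: []string{id}})
}

// Append adds an object to the list stored at key, rejecting mixed types.
func (e env) Append(key, typeName, id string) (env, error) {
	cur, ok := e.vars[key]
	if ok && cur.TypeName != typeName {
		return e, fmt.Errorf("%q holds %s, cannot append %s", key, cur.TypeName, typeName)
	}
	ids := append(append([]string{}, cur.IDs...), id)
	return e.with(key, binding{TypeName: typeName, IDs: ids}), nil
}

// Get returns the IDs stored at key if they have the requested type.
func (e env) Get(key, typeName string) ([]string, error) {
	cur, ok := e.vars[key]
	if !ok {
		return nil, fmt.Errorf("no variable %q", key)
	}
	if cur.TypeName != typeName {
		return nil, fmt.Errorf("%q holds %s, not %s", key, cur.TypeName, typeName)
	}
	return cur.IDs, nil
}

func main() {
	e := env{}.Set("workspace", "Directory", "Directory#1")
	e, err := e.Append("builders", "Container", "Container#1")
	if err != nil {
		panic(err)
	}
	ids, err := e.Get("builders", "Container")
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // prints [Container#1]
}
```

The type name is checked at run time, which is the "stringly typed" tradeoff: mistakes show up as errors from `Get` or `Append` rather than at compile time.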
Yeah that's what I was imagining. I was just saying that there's sort of an equivalence between doing:
This:
type Workspace struct {
    Foo *dagger.File
    Bar *dagger.Directory
    Baz []*dagger.Container
}
// (pretend self-calls exist and this would work w/out separate module)
ws := &Workspace{Foo: foo, Bar: bar, Baz: baz}
llm := dag.Llm().WithWorkspace(ws)
And this:
llm := dag.Llm().
    SetFile("Foo", foo).
    SetDirectory("Bar", bar).
    AppendContainers("Baz", baz)
Like "multi-object" could be modeled either with string vals as you mentioned, or there's an equivalent representation of a custom object that has fields for each object.
In the longer term I like the "custom object" approach more since once self calls exist it's about as convenient and retains type safety, but in the meantime the "string key" approach works just fine and holds us over
ah yes. exactly