#Has anyone else struggled to get LLMs to
Anthropic, whatever model is chosen by default
Nice
Once I get to the review code stage I'll get a better idea if there's something wrong there, was just curious if this was a known issue or not
So far I've had success enforcing as many guardrails as I can. For example, rather than giving it a *File I'll give it a string of the file contents. Or if it's doing a search I might give it a bunch of file names that it can call read (a function in my workspace) on
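A rough sketch of that guardrail idea in plain Go, for anyone following along. Everything here is hypothetical (`Workspace`, `NewWorkspace`, `List`, `Read` are made-up names, and this does not use the Dagger SDK at all). The point is only the shape: the model sees file names and string contents, never a file object, and a bad name gets an error that lists the valid ones.

```go
package main

import (
	"fmt"
	"sort"
)

// Workspace is a hypothetical stand-in for a module that exposes tools to an LLM.
// It stores file contents as plain strings so the model never handles a file object.
type Workspace struct {
	files map[string]string // file name -> contents
}

// NewWorkspace copies the given files so later changes to the caller's map
// do not affect what the model can read.
func NewWorkspace(files map[string]string) *Workspace {
	copied := make(map[string]string, len(files))
	for name, contents := range files {
		copied[name] = contents
	}
	return &Workspace{files: copied}
}

// List returns the file names the model is allowed to read, in sorted order.
func (w *Workspace) List() []string {
	names := make([]string, 0, len(w.files))
	for name := range w.files {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// Read returns the contents of exactly one file, addressed by name.
// An unknown name produces an error that lists the valid names, which
// gives the model something concrete to correct itself with.
func (w *Workspace) Read(name string) (string, error) {
	contents, ok := w.files[name]
	if !ok {
		return "", fmt.Errorf("unknown file %q; valid names: %v", name, w.List())
	}
	return contents, nil
}

func main() {
	ws := NewWorkspace(map[string]string{
		"main.go": "package main",
		"go.mod":  "module example",
	})
	fmt.Println(ws.List())
	contents, err := ws.Read("main.go")
	fmt.Println(contents, err)
}
```

Keeping `Read` to a single string argument means there is no list-typed or object-typed parameter for the model to get wrong.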
Can you share a trace @iron bloom ?
Could be a BBI problem (i.e. how we map the Dagger API to LLM tools)
For some reason my tracing wasn't set up? I've been getting the "setup tracing at ..." nag message. I'll press undo a bunch of times and go back to that state to repro, one sec
... I can't get my traces sent to cloud for some reason? even after logout/login from the CLI? Idk, that's a separate issue, here's a gist with the local progress output for now:
https://gist.github.com/sipsma/9e3b5b0daf51356438cae3e51ef577a9
Looking closer, I can see that it just keeps trying to provide the list as an arg for a single dagger.File, which makes me agree it's probably just the LLM not understanding something
I moved on to something more interesting, a "chat with dagger engine" agent that uses the engine cache introspection APIs to let you ask questions about the cache state, so I'm not blocked here or anything
@iron bloom by the way for debugging you can ask the LLM: "what tools are available to you? Show a detailed table with name, description, and argument schema" very useful
Oh yeah I already have been doing that, that's sort of what led to the line of thought that ended in "chat with dagger engine". I feel like with enough tools you could give it access to its own logs, its own cpu/memory profiles, even just straight-up arbitrary read-only memory access and then have the engine debug itself, starting with just the cache because that's what's available as an api right now
There's definitely some way of generalizing this too so users could use it on their own apps. Maybe the whole DAP protocol could be passed as a dagger.Service arg and the LLM could use it to debug any remote application (idk enough details about DAP to be sure that'd work, but something like it maybe)
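For context, DAP (the Debug Adapter Protocol) frames each JSON message with a `Content-Length` header followed by a blank line, and requests carry `seq`, `type`, `command` and optional `arguments` fields. Here's a minimal sketch of that framing in plain Go, assuming nothing about how it would be wired through a dagger.Service; the `Request`, `Encode`, `Decode` and `DecodeFrame` names are made up for illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Request mirrors the basic shape of a DAP request message.
type Request struct {
	Seq       int             `json:"seq"`
	Type      string          `json:"type"` // "request" for requests
	Command   string          `json:"command"`
	Arguments json.RawMessage `json:"arguments,omitempty"`
}

// Encode serializes a request and prefixes it with the Content-Length header.
func Encode(req Request) ([]byte, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "Content-Length: %d\r\n\r\n", len(body))
	buf.Write(body)
	return buf.Bytes(), nil
}

// Decode reads one framed message: header lines up to a blank line,
// then exactly Content-Length bytes of JSON.
func Decode(r *bufio.Reader) (Request, error) {
	var req Request
	length := -1
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return req, err
		}
		line = strings.TrimRight(line, "\r\n")
		if line == "" {
			break // blank line ends the header section
		}
		name, value, ok := strings.Cut(line, ":")
		if ok && strings.EqualFold(strings.TrimSpace(name), "Content-Length") {
			length, err = strconv.Atoi(strings.TrimSpace(value))
			if err != nil {
				return req, err
			}
		}
	}
	if length < 0 {
		return req, fmt.Errorf("missing Content-Length header")
	}
	body := make([]byte, length)
	if _, err := io.ReadFull(r, body); err != nil {
		return req, err
	}
	err := json.Unmarshal(body, &req)
	return req, err
}

// DecodeFrame is a convenience wrapper for decoding a single in-memory frame.
func DecodeFrame(frame []byte) (Request, error) {
	return Decode(bufio.NewReader(bytes.NewReader(frame)))
}

func main() {
	frame, err := Encode(Request{Seq: 1, Type: "request", Command: "threads"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", frame)
	decoded, err := DecodeFrame(frame)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded.Command)
}
```

An LLM tool built on this would presumably expose a handful of narrow commands (list threads, read a stack trace) rather than the raw protocol, but that part is speculation.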
What's DAP?
this doesn't work on ollama in my experience