#llm | env | with-file file size capped at 80kB?


frank jay
#

Hello,

Did I miss something in the docs?

I'm trying to use Dagger's LLM module to analyze a CI build log and explain to me what the error is. The log is almost 1MB in size and tedious to search manually, so I figured I'd ask AI to do it for me.

HOWEVER, it seems like the LLM does not receive the full file -- it only receives 80kB of it (<10% of the total file size). I assumed it might be the LLM API provider's limit, so I tried both API tokens I have: OpenAI and Claude. Both show the same truncation at 80kB.

What's happening? Am I doing it wrong? How do I do it RIGHT then?

What's the "truncated": true flag mentioned by Claude (see screenshots)?

neat otter
#

you're doing it right. we have a limitation that's so limiting it ought to be considered a bug - we are currently placing an arbitrary 80kb truncation limit on strings returned from tools. this truncation should definitely be made configurable, but beyond that there's a balancing act between truncation being problematic and tools easily overloading context windows and exceeding tokens-per-minute rate limits

frank jay
neat otter
#

if we make it configurable the api docs will suffice

neat otter
#

re-reading your initial ask after chatting with @sand obsidian ... 1mb is like 250k tokens, right? on o3 that's a $2.50 query, and $0.75 on 4.1... o3 is limited to a 200k-token (~800kb) context window, 4.1 can take 1m tokens (~4mb)... so plugging that whole file into an LLM API is gonna tip over some models. 4.1 can handle it, o3 can't, so it may be necessary to teach the model to grep such a large file. i think these models also place lower per-request limits, too? like openAI notes 128k per request in the generic api docs but different numbers per model, but that might be out of date idk
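
For anyone who wants to redo that back-of-the-envelope math for their own file, here is a minimal sketch. It assumes the same rough heuristic used above (about 4 bytes of text per token) and the context window sizes quoted in this thread, which may be out of date. The dictionary keys are just labels for this example, not identifiers that any API is guaranteed to accept.

```python
# Rough rule of thumb from the message above: ~4 bytes of text per token.
# Real tokenizers vary by model and by content, so treat this as an estimate.
BYTES_PER_TOKEN = 4

# Context windows as quoted in this thread (may be out of date).
# The keys are labels for this sketch only.
CONTEXT_WINDOW_TOKENS = {
    "o3": 200_000,
    "gpt-4.1": 1_000_000,
}


def estimate_tokens(size_bytes: int) -> int:
    """Estimate a token count from a byte size using the heuristic above."""
    return size_bytes // BYTES_PER_TOKEN


def fits(size_bytes: int, model: str) -> bool:
    """Return True if the estimated token count is within the model's window."""
    return estimate_tokens(size_bytes) <= CONTEXT_WINDOW_TOKENS[model]


if __name__ == "__main__":
    log_size = 1_000_000  # a ~1MB log file
    print("estimated tokens:", estimate_tokens(log_size))
    for model in CONTEXT_WINDOW_TOKENS:
        print(model, "fits" if fits(log_size, model) else "does not fit")
```

The estimate ignores the system prompt, tool definitions, and earlier turns, all of which also count against the window, so a file that "just fits" by this math can still be rejected.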

neat otter
#

but fwiw, even after i do that and it ships, you might still have problems with that particular file 🙂

neat otter
#

@frank jay truncation will be removed in v0.18.6, slated to go out tomorrow

#

would love your feedback on it afterwards, especially given that I suspect your specific application may still fail, just differently

frank jay
#

it will probably still fail, it's rather a big file to ingest in one go. I was just surprised to see an undocumented limitation. IMO, considering that LLMs are getting frequent updates and newer, fancier models keep being released, the 80k limit might very soon become too low, as actual LLM limits might soon get much higher. A configurable value would make sense IMO. With a default of 80k -- why not. But documented nonetheless 🙂 Because I spent 2-3 hours for nothing, trying to figure out wtf I was doing wrong, bashing my head against the wall until I posted a question in General (which was ignored), so I spent some more hours, and then I reposted in the 'help' section 🙂

A single sentence in the docs or help pages of the 'with-file' function would have saved me more than half a day.

Thanks for the change. Will test it out as soon as I get a chance to.

neat otter
#

we'll get there in time, sorry you paid the early adopter penalty on this one. we've been changing a lot of the llm "internals" faster than we can document them, hence the "experimental" designation. we're working on stabilizing things, though, a big part of which will be improving documentation, see #1346602677236400159

frank jay
#

@neat otter interesting...

Claude query fails, as the prompt exceeds token limit:

│🤖 Now let's create and start updating our progress.log file:
│ ┃ 3.3s ◆ Input Tokens: 1,450 ◆ Output Tokens: 148 ◆ Token Cache Writes: 8,076
│ ✔ .withExec(args: ["sh", "-c", "echo \"$(date -u +%Y-%m-%dT%H:%M:%SZ) Task started: Analyzing CI log file content vs file size\" > /task/task_20250507T063837Z/progress.log"]): Container! 0.2s
│
│🤖 Now, let's analyze the log file content. First, I'll retrieve the file contents:
│ ┃ 2.9s ◆ Input Tokens: 1,603 ◆ Output Tokens: 57 ◆ Token Cache Writes: 8,186
│
│🤖 File.contents: String! 0.0s
│
│🤖 0.6s
│ ! POST "https://api.anthropic.com/v1/messages": 400 Bad Request {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long: 213897 tokens > 200000 maximum"}}
Error: input: llm.withEnv.withPrompt.withPrompt.env not retrying: POST "https://api.anthropic.com/v1/messages": 400 Bad Request {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long: 213897 tokens > 200000 maximum"}}
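
A "prompt is too long" failure like this one is the case where "teaching the model to grep" helps: instead of handing over the whole log, pass only the lines that look like errors plus a little context. The following is a minimal sketch of that idea in plain Python. It is not part of Dagger, and the pattern, function name, and 80,000-byte budget are assumptions chosen for illustration.

```python
import re
from collections import deque

# Hypothetical pattern; tune it to whatever your CI tool prints on failure.
ERROR_PATTERN = re.compile(r"error|failed|exception|fatal", re.IGNORECASE)


def extract_error_context(lines, before=3, after=3, max_bytes=80_000):
    """Yield matching lines plus surrounding context, stopping at max_bytes."""
    recent = deque(maxlen=before)  # non-emitted lines seen just before a match
    remaining_after = 0            # context lines still owed after a match
    used = 0                       # bytes emitted so far
    for number, line in enumerate(lines, start=1):
        is_match = bool(ERROR_PATTERN.search(line))
        if is_match or remaining_after > 0:
            # Emit any buffered "before" lines, then the current line.
            pending = list(recent) + [(number, line)]
            recent.clear()
            for n, text in pending:
                entry = f"{n}: {text.rstrip()}"
                used += len(entry.encode("utf-8")) + 1  # +1 for a newline
                if used > max_bytes:
                    return
                yield entry
            remaining_after = after if is_match else remaining_after - 1
        else:
            recent.append((number, line))


if __name__ == "__main__":
    sample = ["step 1 ok\n", "step 2 ok\n", "ERROR: build failed\n", "cleanup\n"]
    print("\n".join(extract_error_context(sample, before=1, after=1)))
```

To run it over a real log, open the file with `errors="replace"` and pass the file object as `lines`; the joined result is what would go into the prompt instead of the raw file contents.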
#

And OpenAI for some reason now sees fewer bytes -- instead of 80kB now it sees 64kB

│🤖 Here are the concrete measurements for your log file:
│ ┃
│ ┃ • Actual file size: 950,326 bytes
│ ┃ • Bytes read by this AI Agent (File_contents): 65,535 bytes (this is the maximum the function returned; actual string truncated well before the true file end).

Tried twice, got the same results.

#

Also, there's now this weird error:

│ ┃ --------
│ ┃
│ ┃ Let me know if you'd like code, scripts, or processes to work around this limitation or for deep analysis of truncated logs!
│ ┃ 12.3s ◆ Input Tokens: 314,593 ◆ Output Tokens: 592
✔ .output(name: "result"): Binding! 0.0s
✘ .asContainer: Container! 0.0s
! binding "result" undefined
Error: input: llm.withEnv.withPrompt.withPrompt.env.output.asContainer binding "result" undefined

"result" is an output name: https://gitlab.com/netikras/llm-toys/-/blob/main/dagger_agent/dagger_task.dagger#L145

It worked OK with 0.18.4

sand obsidian
frank jay
sand obsidian
#

outputs are sort of like a "fill in the blank" for the model: "result is a container with the task result", the model's job is to satisfy that type + description with a value that fits

#

if the model ends its turn without filling in those values, we treat that as a failure, and that's where -i comes in for troubleshooting
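
To make the "fill in the blank" idea concrete, here is a toy model of it in plain Python. This is not Dagger's implementation or API; the function and exception names are hypothetical, and the only thing borrowed from the thread is the wording of the error message. It shows why a declared output that the model never sets ends in a failure rather than an empty value.

```python
class UndefinedBinding(Exception):
    """Raised when a declared output was never filled in (hypothetical name)."""


def collect_outputs(declared, filled):
    """Return the filled values for every declared output, or fail.

    declared: mapping of output name -> description given to the model
    filled:   mapping of output name -> value the model set during its turn
    """
    results = {}
    for name in declared:
        if name not in filled:
            # The model ended its turn without satisfying this output.
            raise UndefinedBinding(f'binding "{name}" undefined')
        results[name] = filled[name]
    return results


if __name__ == "__main__":
    declared = {"result": "a container with the task result"}
    print(collect_outputs(declared, {"result": "<container>"}))
    try:
        collect_outputs(declared, {})  # model never set "result"
    except UndefinedBinding as err:
        print("failed:", err)
```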

frank jay
#

aaahh... so it's not up to dagger to create 'outputs' -- it's a suggestion of sorts to the LLM. Got it, thanks