Hello,
Is there no way to perform a file search and get a structured response in one go?
I need GPT to perform semantic search (or regular, context-based search) from the file and respond in a structured format.
I'm currently using the Assistants API so that I can use vector storage for the file search (to avoid uploading the file each and every time), though I'm open to alternatives if you believe some other set of endpoints would be better.
I've thought of the following options for how I can get a structured response after a file search:
-
Using function calling and
file_search, withparallel_tool_calls=True.
However, even if this worked, the two main issues would be that:
a) I would be using function calling instead of structured outputs even though I don't really need to return anything to the GPT, and
b) I wouldn't be able to guarantee that GPT will perform a file_search AND use function calling, sincetool_choiceonly takes one option. -
Asking GPT to provide a response using JSON and providing it in the prompt.
This usually works, but naturally the point of the structured response with strict=true is to guarantee the response matches the schema, rather than relying on parsing and hoping GPT doesn't hallucinate new properties. -
Using structured outputs.
This doesn't work.response_formatcannot be of typejson_schemaunless there are no tools available to the run OR all the tools are function calls.
If I do provide thefile_searchtool to the run, I get the following error:
openai.BadRequestError: Error code: 400 - {'error': {'message': 'Invalid tools: all tools must be of typefunctionwhenresponse_formatis of typejson_schema.', 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}
In my case, I need thefile_searchtool so that it performs semantic search on the file. -
Creating a thread and doing two consecutive runs
In the first run, I would ask GPT to list the data I need like I would in option 2, specifying thetool_choiceto be "file_search".
In the second run, I would ask GPT to then provide a structured response with no tools. It could then use the context (the previous message) to provide the output.
My issue with this is the unnecessary added latency and extra wasted tokens. -
Using the chat endpoint to convert the freeform response to a structured response
That's just option 4 with extra steps.
None of these options really seem "right".
I've currently gone with option 2 to save the latency and tokens, with the idea of switching to option 3 if I get too many incorrectly formatted responses from GPT.
I could also switch to using option 1, with the idea of switching to option 2 if I get too many invalid responses from GPT.
Still, I'm hoping there's a better solution.
Thanks in advance for any input!