Reported by @charred inlet
Run Evals on the OpenAI Platform with the file_search tool enabled (on either a single file or a vector store) using gpt-5, gpt-5-mini, or gpt-5-nano. Run a typical evals test dataset with a developer prompt that tells the model to search before writing its response, and to return "insufficient data." if it cannot find any relevant information.
The model searches, and if it can find relevant information, it includes it in its response. If the model cannot find any relevant information or the search fails, the model responds with "insufficient data."
In some runs the model searches and finds information via the file_search tool, but many other responses are empty (while using the same number of tokens as the ones that succeed).
Web
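To help quantify the behavior described above, here is a minimal sketch for triaging exported eval results. It assumes each row has been reduced to a dict with hypothetical `output_text` and `output_tokens` keys; these names are illustrative and are not the Evals export schema. It separates empty responses from "insufficient data." responses and real answers, and compares average output tokens per group.

```python
from collections import Counter

# Sentinel the developer prompt asks for when nothing relevant is found.
SENTINEL = "insufficient data"


def classify(output_text):
    """Label one response as 'empty', 'insufficient', or 'answered'."""
    text = (output_text or "").strip()
    if not text:
        return "empty"
    if SENTINEL in text.lower():
        return "insufficient"
    return "answered"


def summarize(rows):
    """Count responses per label and average their output tokens.

    `rows` is an iterable of dicts with the assumed keys
    'output_text' and 'output_tokens' (hypothetical field names).
    """
    counts = Counter()
    tokens = Counter()
    for row in rows:
        label = classify(row.get("output_text"))
        counts[label] += 1
        tokens[label] += row.get("output_tokens", 0)
    return {
        label: {
            "count": counts[label],
            "avg_output_tokens": tokens[label] / counts[label],
        }
        for label in counts
    }
```

If the "empty" group shows token usage similar to the "answered" group, that would be consistent with the report: tokens are being spent but no visible text is returned.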