Hello,
I have two issues, possibly related, that occur when the PDF is a scan (and therefore an image).
When I send the PDF to the OCR model, it returns an image. When I then run OCR on that image, it returns a bit of text and another image, on which I can run OCR again, and so on. This is not very comfortable to work with.
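
To make the repeated OCR passes concrete, here is a minimal sketch of the loop I end up doing by hand. `OcrResult` and `run_ocr` are hypothetical placeholders, not the real SDK types or calls; the sketch assumes each OCR pass returns some text plus zero or more nested images.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OcrResult:
    # Hypothetical shape of one OCR response: extracted text plus nested images.
    text: str
    images: List[bytes] = field(default_factory=list)


def ocr_all_layers(
    image: bytes,
    run_ocr: Callable[[bytes], OcrResult],  # placeholder for the real OCR call
    max_depth: int = 5,
) -> str:
    """Re-run OCR on every image returned by the previous pass and join the text."""
    if max_depth <= 0:
        # Stop recursing so a response that keeps returning images cannot loop forever.
        return ""
    result = run_ocr(image)
    parts = [result.text] if result.text else []
    for nested in result.images:
        nested_text = ocr_all_layers(nested, run_ocr, max_depth - 1)
        if nested_text:
            parts.append(nested_text)
    return "\n".join(parts)
```

With something like this, reading only the first layer corresponds to `max_depth=1`, which is what the incomplete answers look like.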
When I query the chat with this same PDF, I get a correct reading of the whole document. But when I run the same query through the API, only the first layer of information is read, and the answer is incomplete.
It seems that the chat and the API process the document differently, which leads to poor API performance.