ocr not extracting the text from the image | Mistral AI | Page 1

whole ocean Mar 6, 2025, 8:42 PM

#

Hey,

I have a pipeline for parsing documents, and I'm testing Mistral OCR, but the result output is a bit weird.

I attached the image, that's the output from mistral OCR

# Company structure 

BD is structured to serve customers by providing unique solutions. The data below represents the company structure for FY 2019.

## Revenue by geography

(millions of dollars)
![img-0.jpeg](img-0.jpeg)

## Revenue by segment

(billions of dollars)
![img-1.jpeg](img-1.jpeg)
![img-2.jpeg](img-2.jpeg)
![img-3.jpeg](img-3.jpeg)

Values in this exhibit reflect rounded numbers in billions and include Bard.
![img-4.jpeg](img-4.jpeg)

c4aea0858256a8d19b78089af114ded2e833782214b27a8a99893091.png

stable pasture Mar 7, 2025, 12:40 AM

#

Seeing the same thing in our testing; and it makes the API rather unusable for visually complex documents.

I'd expect the API to also return the extracted text fro the images it identifies w/o requiring recursive calls to OCR those images.

Not sure if this was intentional by Mistral, or an oversight?

spiral sparrow Mar 7, 2025, 8:05 AM

#

Good solution to use the recursive call ! But indeed it will be pretty good to have directly the text when include_image_base64=False or with another param

#

In the meanwhile I'm using pixtral-large-latest by converting PDF into images

#

"Not sure if this was intentional by Mistral, or an oversight?" => It's seem to be intentional by reading the structured_ocr cookbook (https://colab.research.google.com/github/mistralai/cookbook/blob/main/mistral/ocr/structured_ocr.ipynb). There is a replace_images_in_markdown function which is used after getting the response

finite sphinx Mar 7, 2025, 9:25 PM

#

im seeing the same thing where half the image isnt being OCR'ed and jsut classified as <image_1> etc.

finite karma Mar 10, 2025, 10:58 AM

#

Same on my side. Any fixes?

cold linden Mar 10, 2025, 12:31 PM

#

same. Do devs see an option to have that image incuded in parsing? location of it can affect quality of parsing.

finite karma Mar 10, 2025, 12:33 PM

#

As example, I used that image without markdown in response.

cold linden Mar 10, 2025, 12:35 PM

#

another problem which i see, is that i need to host somewhere this file, to be able to process it. It's fine but quite annoying tbh 😅

cold linden Mar 10, 2025, 1:12 PM

#

finite karma As example, I used that image without markdown in response.

problem in that case can be that this is "audio/mpeg", instead of image/

#

nvm, that looks similar to this problem, but at this case it's at least possibilityto process it after first "reprocessing"

queen wave Mar 10, 2025, 9:38 PM

#

Same problem. Text extraction on Images with API (type: "image_url") is not working in almost all cases, however if I input the same image on the chat interface (lechat), i get the text

river junco Mar 10, 2025, 9:58 PM

#

Same problem for me, also with pdf files

south willow Mar 16, 2025, 10:41 AM

#

Facing the same issue, support ignoring this

#ocr not extracting the text from the image