# Feedback on Mistral OCR: Feature Request for Educational Use (Handwriting & Fidelity)


languid mantle
#

Dear Mistral Team,
I am a teacher from Germany and I am currently exploring Mistral OCR for digitizing student assignments. While the performance is impressive, I have encountered a specific challenge for the educational sector.
The Problem: "Auto-Correction" of Errors
Mistral OCR often performs so well that it "cleans up" the text. For teachers, it is essential to have an absolutely faithful transcription. If a student makes a spelling or grammatical mistake, the OCR must reflect that exactly in the Markdown output. Currently, the model sometimes "fixes" these errors, which makes it impossible to use the digital version for grading or diagnostic purposes.
My Feature Request:

  1. High-Fidelity Mode: A setting or parameter that forces the model to prioritize literal character recognition over linguistic probability (avoiding "autocorrect").
  2. Handling of Non-linear Layouts: Better support for handwriting and margin notes, as students often use arrows or inserts that disrupt the logical flow.
  3. Contextual Awareness without Modification: Using models like Pixtral to understand the layout, but without altering the original text's orthography.
Is a "Fidelity Mode" or a specialized HTR (Handwritten Text Recognition) focus on your roadmap? Providing a GDPR-compliant, high-fidelity OCR solution for schools in the EU would be a massive benefit for the educational landscape.
I look forward to your thoughts on this.
Best regards,

Felix

stray kayakBOT
#
Thanks for the Feedback!

Thanks a lot for the feedback! If you need help in the future, you can contact support. Visit the following article to learn how to contact support.

languid mantle
#

API

distant breach
languid mantle
#

It smooths out language mistakes, probably by inferring words from the context of the sentence.

While this can be useful, I’d argue that a perfect OCR would be true to every letter.

It’s still really good, though.
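
For anyone who wants to measure how often this smoothing happens, here is a minimal sketch that compares a hand-typed reference transcription against the OCR text and lists every word that differs. It uses only Python's standard `difflib`. The function name `find_silent_corrections` and the sample sentences are made up for illustration, and the OCR output is assumed to have already been saved as a plain string; nothing here calls or represents the Mistral API.

```python
import difflib


def find_silent_corrections(reference: str, ocr_output: str) -> list[tuple[str, str]]:
    """Return (written, transcribed) pairs where the OCR text differs from the reference.

    `reference` is a manual, letter-for-letter transcription of the student's text;
    `ocr_output` is whatever the OCR returned. Both are compared word by word.
    """
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=ocr_words, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Covers replaced, inserted, and deleted word runs.
            changes.append((" ".join(ref_words[i1:i2]), " ".join(ocr_words[j1:j2])))
    return changes


# Hypothetical sample: the student's spelling versus a "cleaned up" OCR result.
reference = "Yesterday I goed to the libary with my freind."
ocr_output = "Yesterday I went to the library with my friend."

for written, transcribed in find_silent_corrections(reference, ocr_output):
    print(f"{written!r} -> {transcribed!r}")
```

Running this on a handful of manually transcribed pages would give a rough count of how many student errors get corrected away, which could make a fidelity request easier to back up with numbers.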