#is the gpt able to do that or is it bs-ing me?

1 messages · Page 1 of 1 (latest)

junior lily
#

I gave chat gpt my receipt and asked it to analysis it for me, he got the numbers wrong so I asked him to do it again and it was correct, then adked the question above and this is the answer it gave me. Now as far as I know gpt uses tesseract to read pictures so how come it can read it better if it is running a simple code?

cinder wraithBOT
#

Hi @junior lily. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

ChatGPT, developed by OpenAI, is a powerful AI model that can perform a variety of tasks, including data analysis and image processing. When you upload an image as part of your prompt, ChatGPT uses the GPT Vision model to interpret the image. This enables ChatGPT to answer questions about the image, or use information in the image as context for other prompts [source (https://help.openai.com/en/articles/9260256-chatgpt-capabilities-overview#h_87b98a8413)].

However, it's important to note that the capabilities of GPT-3 (and by extension, ChatGPT) are not fixed across all contexts. Sometimes, the model might not get it right the first time, but with a better prompt or a repeated request, it can steer itself toward the correct output [source (https://cookbook.openai.com/articles/techniques_to_improve_reliability#model-capabilities-depend-on-context)]. This might explain why ChatGPT was able to correctly analyze your receipt upon a repeated request.

As for your question about Tesseract, the knowledge sources provided do not mention the use of Tesseract for image processing in ChatGPT. It's possible that the AI uses a different method or model for image interpretation. If you have further questions about how ChatGPT processes images, I would recommend reaching out to OpenAI directly for more detailed information.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@junior lily kapa.ai is still learning and improving, please let us know how it did by reacting below

junior lily
#

@cinder wraith I don’t understand you
How different prompt control its ability to be accurate about OCR?
As for the tesseract it sometimes runs a code and in that code I can see that it uses it (idk if it always does it tho)

cinder wraithBOT
#

Let me check the knowledge base for you. One moment! :books:

#

The accuracy of ChatGPT's OCR (Optical Character Recognition) capabilities can indeed be influenced by the way you structure your prompts. This is part of a process known as "prompt engineering," which involves crafting your prompts in a way that guides the AI model towards the desired output. For example, if you're asking ChatGPT to analyze an image of a receipt, you might get better results by asking it to "read and summarize the information on this receipt" rather than simply asking it to "analyze this image" [source (https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-the-openai-api#h_f28eab5004)].

However, it's important to note that while prompt engineering can improve results, it doesn't guarantee perfect accuracy. The AI model might still make mistakes or misinterpret certain elements of the image. For example, the model may struggle with tasks requiring precise spatial localization, such as identifying specific items on a receipt, or it may misinterpret rotated or upside-down text [source (https://help.openai.com/en/articles/8400551-image-inputs-for-chatgpt-faq#h_eaab4187ad)].

As for your question about Tesseract, the knowledge sources provided do not mention the use of Tesseract in ChatGPT's image processing capabilities. It's possible that the AI uses a different method or model for image interpretation. If you have further questions about how ChatGPT processes images, I would recommend reaching out to OpenAI directly for more detailed information.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@junior lily kapa.ai is still learning and improving, please let us know how it did by reacting below