#GPTs vision

1 messages · Page 1 of 1 (latest)

jaunty vale
#

Can GPTs use vision to see pictures from their Knowledge files?

spiral craterBOT
#

Hi @jaunty vale. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the information provided in the extracts, GPTs do not currently support the ability to process images from their Knowledge files. The Knowledge feature in GPTs allows builders to upload files containing additional context, but only the text is currently processed source (https://help.openai.com/en/articles/8843948-knowledge-in-gpts).

However, OpenAI has developed a version of GPT-4, referred to as GPT-4 with Vision or gpt-4-vision-preview in the API, that can take in images and answer questions about them. This model is available to developers who have access to GPT-4 via the gpt-4-vision-preview model and the Chat Completions API, which has been updated to support image inputs. Images can be made available to the model by passing a link to the image or by passing the base64 encoded image directly in the request source (https://platform.openai.com/docs/guides/vision).

It's important to note that the Assistants API does not currently support image inputs, and GPT-4 with Vision does not receive image metadata source (https://platform.openai.com/docs/guides/vision).

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@jaunty vale kapa.ai is still learning and improving, please let us know how it did by reacting below

jaunty vale
#

@spiral crater using code interpreter, ChatGPT with GPT-4 can run code and display an image. can it see that image?

spiral craterBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Based on the information provided in the extracts, GPT-4 with Vision, or gpt-4-vision-preview, can process images and answer questions about them. However, this capability is currently limited to the Chat Completions API, and it does not extend to the code interpreter feature of ChatGPT.

While GPT-4 can execute code and display an image, it does not have the ability to "see" or interpret the image it has displayed. The image processing capability is specifically tied to the gpt-4-vision-preview model and involves passing an image URL or a base64 encoded image directly in the request to the API source (https://platform.openai.com/docs/guides/vision).

Therefore, if you're using the code interpreter feature of ChatGPT to run code and display an image, GPT-4 would not be able to interpret that image.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: