#How to get the text of an image using GPT-4 Vision in Python?

fresh pier
#

How do I get the text of an image using GPT-4 Vision in Python?

red voidBOT
#

Hi @fresh pier. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

To use GPT-4 with Vision (GPT-4V) to understand images, you can use the OpenAI API. Here's a basic example of how you can do this in Python:

from openai import OpenAI

client = OpenAI()

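# Base64-encoded image bytes (see encode_image below)
image_data = "..."

# In the v1 client, the method is client.chat.completions.create
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                # Images go in an image_url part; a base64 data URL works for local files
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
            ],
        },
    ],
    # Cap on output length; set it explicitly so the reply is not cut short
    max_tokens=300,
)

# The response is an object, so use attribute access rather than dict keys
print(response.choices[0].message.content)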

In this example, image_data should be the base64-encoded string of your image. You can get it by reading the image file in binary mode, base64-encoding the bytes, and decoding the result to a UTF-8 string. Here's a function that does this:

import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

You can use this function to get the base64 string of your image like this:

image_data = encode_image("Path_of_the_image.jpeg")

Please replace "Path_of_the_image.jpeg" with the path to your image file.
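
Since the question is about getting the text out of an image, here is a rough end-to-end sketch that ties the pieces together. The helper names (build_text_extraction_messages, extract_text), the prompt wording, the JPEG fallback, and the max_tokens value are my own assumptions for illustration, not something taken from the OpenAI docs. It also assumes the openai package (v1 or later) is installed and that OPENAI_API_KEY is set in your environment.

```python
import base64
import mimetypes


def encode_image(image_path):
    """Return the file's bytes as a base64 string."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


def build_text_extraction_messages(image_path):
    """Build a Chat Completions messages list asking the model to transcribe an image.

    Hypothetical helper: the name and the prompt wording are illustrative.
    """
    # Guess the MIME type from the file extension; falling back to JPEG is an assumption.
    mime_type = mimetypes.guess_type(image_path)[0] or "image/jpeg"
    data_url = f"data:{mime_type};base64,{encode_image(image_path)}"
    return [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Transcribe all text visible in this image. Return only the text.",
                },
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]


def extract_text(image_path, model="gpt-4-vision-preview"):
    """Send the image to the model and return its reply (hypothetical helper)."""
    # Imported here so the helpers above can be used without the openai package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=build_text_extraction_messages(image_path),
        max_tokens=500,  # arbitrary cap chosen for this sketch
    )
    return response.choices[0].message.content
```

You would then call something like print(extract_text("Path_of_the_image.jpeg")). Keep in mind the model's reply is generated text, so the transcription may not be exact for small or low-contrast text.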

Remember that GPT-4 with vision is currently available to all developers who have access to GPT-4 via the gpt-4-vision-preview model and the Chat Completions API. The Assistants API does not currently support image inputs[^1^].

[^1^]: (source: https://platform.openai.com/docs/guides/vision)
