#Base64 images


blissful jasper
#

According to the OpenAI documentation, I should be able to upload images as base64-encoded strings, but when I do, I get this error message:

Error code: 400 - {'error': {'message': 'Invalid image.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_image'}}

Here's my code for reference:

def proofread_text(image: str):
    print(image)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Proofread this text:"},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": image,
                        }
                    },
                ],
            }
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content

if waiting_for_file.get(user_id):
        file_bytes = await client.download_media(message, in_memory=True)

        if message.document.mime_type == "application/pdf":
            try:
                images = convert_from_bytes(file_bytes.getbuffer())

                for _, image in enumerate(images):
                    with io.BytesIO() as output:
                        image.save(output, format="PNG")
                        output.seek(0)
                        base64_image_data = base64.b64encode(output.read()).decode('utf-8')
                        response = proofread_text(base64_image_data)
                        await message.reply_text(response)
teal obsidian BOT
#

Hi @blissful jasper. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

The error message suggests that the OpenAI API does not recognize the image data you're sending as a valid image. There are a few possible causes:

  1. The base64 string is missing its data URL prefix: The string passed as url should start with data:image/jpeg;base64, or data:image/png;base64, depending on the image format. Your code passes the raw base64 string without this prefix. A user in a forum post (https://community.openai.com/t/879424) faced a similar issue.

  2. The image data is not correctly encoded: The image bytes must be base64-encoded and then decoded to a text string. The OpenAI documentation (https://platform.openai.com/docs/guides/vision) provides a Python function for encoding an image in base64.

  3. The image data is not correctly passed to the API: The image should be passed as a dictionary with a url key, as shown in the documentation (https://platform.openai.com/docs/guides/vision) and in a GitHub issue comment (https://github.com/openai/openai-python/issues/1635).
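
To make the first point concrete, here is a minimal sketch of building a data URL from raw image bytes. The helper name to_data_url and its mime_type default are illustrative and not part of the OpenAI SDK; the sketch uses only the standard library, and the PNG signature stands in for a real image.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Wrap raw image bytes in a base64 data URL (hypothetical helper)."""
    # b64encode returns bytes, so decode to get a plain text string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature is used here as placeholder image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
example_url = to_data_url(PNG_SIGNATURE)
```

The resulting string is what would go in the url field of the image_url object, assuming the bytes really are in the format named by mime_type.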

Here's how you can modify your code to ensure the base64 string is correctly formatted and passed to the API:

def proofread_text(image: str):
    print(image)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Proofread this text:"},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{image}",
                        }
                    },
                ],
            }
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content
#
if waiting_for_file.get(user_id):
        file_bytes = await client.download_media(message, in_memory=True)

        if message.document.mime_type == "application/pdf":
            try:
                images = convert_from_bytes(file_bytes.getbuffer())

                for image in images:
                    with io.BytesIO() as output:
                        image.save(output, format="PNG")
                        # getvalue() returns the whole buffer, so no seek(0) is needed
                        base64_image_data = base64.b64encode(output.getvalue()).decode('utf-8')
                        response = proofread_text(base64_image_data)
                        await message.reply_text(response)

If the error persists after adding the prefix, check that the image bytes are encoded correctly. If you're still facing issues, it might be helpful to post on the OpenAI Community forum (https://community.openai.com) for more specific help.
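
A quick local check before calling the API can catch a missing prefix or a corrupted payload. The sketch below is one possible approach, assuming PNG output as in your code; looks_like_png_data_url is a hypothetical helper name, and the check only inspects the PNG file signature, so it does not guarantee that the API will accept the image.

```python
import base64
import binascii

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
PNG_PREFIX = "data:image/png;base64,"


def looks_like_png_data_url(url: str) -> bool:
    """Return True if url is a base64 PNG data URL whose payload starts with the PNG signature."""
    if not url.startswith(PNG_PREFIX):
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet.
        payload = base64.b64decode(url[len(PNG_PREFIX):], validate=True)
    except (binascii.Error, ValueError):
        return False
    return payload.startswith(PNG_SIGNATURE)
```

Calling this on the string you are about to send, instead of printing the whole base64 blob, gives a yes/no answer on whether the prefix and encoding are plausible.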

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@heady stone kapa.ai is still learning and improving, please let us know how it did by reacting below

blissful jasper
#

@teal obsidian How could I make it send multiple images in the same request?

teal obsidian BOT
#

Let me check the knowledge base for you. One moment! :books:

#

You can send multiple images in the same request by including multiple image_url objects in the content array of your user message. Each image_url object should contain the URL or base64 data URL of a different image. (image_file objects are a content type of the Assistants API, not of chat.completions requests like yours.) Here's an example of how you can modify your code to send multiple images:

def proofread_text(images: list):
    image_contents = [{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image}"}} for image in images]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Proofread this text:"},
                    *image_contents
                ],
            }
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content

if waiting_for_file.get(user_id):
        file_bytes = await client.download_media(message, in_memory=True)

        if message.document.mime_type == "application/pdf":
            try:
                images = convert_from_bytes(file_bytes.getbuffer())
                base64_images = []

                for image in images:
                    with io.BytesIO() as output:
                        image.save(output, format="PNG")
                        # getvalue() returns the whole buffer, so no seek(0) is needed
                        base64_image_data = base64.b64encode(output.getvalue()).decode('utf-8')
                        base64_images.append(base64_image_data)

                response = proofread_text(base64_images)
                await message.reply_text(response)

This code will send all the images in the base64_images list in the same request. The * operator is used to unpack the list of image contents into the content array.

#

Please note that the number of images you can add to a conversation depends on various factors, including the size of the images and the amount of text accompanying them. As a general guideline, if you encounter issues, consider reducing the image quantity or size (source (https://help.openai.com/en/articles/8400551-image-inputs-for-chatgpt-faq#h_63671f4925)).
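
If a long PDF produces too many pages for one request, one option is to split the images into smaller groups and send one request per group. The sketch below shows only the splitting step; chunk_images is a hypothetical helper, and the right batch size is an assumption you would need to tune for your documents.

```python
from typing import Iterator


def chunk_images(images: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of images with at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(images), batch_size):
        yield images[start:start + batch_size]


# Placeholder strings stand in for base64-encoded pages.
pages = ["page1", "page2", "page3", "page4", "page5"]
batches = list(chunk_images(pages, 2))
```

Each batch could then be passed to proofread_text in turn, for example in a loop over chunk_images(base64_images, 4), with the replies sent back one per batch.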


wary lichen