Analyzing a PDF with gpt-4 | OpenAI | Page 1

agile rivet Dec 24, 2023, 8:49 AM

#

I noticed that ChatGPT-4 accepts pdf upload and can use the pdf to answer questions I have. I'm having trouble finding the right endpoint to hit that will accept my file input. I have looked at https://platform.openai.com/docs/api-reference/chat/create but there is no file parameter.

I found knowledge retrieval https://platform.openai.com/docs/assistants/tools/knowledge-retrieval , but I'm not sure if there's a better alternative that doesn't cost storage $ since I only want the document to be used for a single query.

What's the 'right' way to do this?

civic ibex Dec 24, 2023, 9:31 AM

#

To my knowledge, every time it accesses that file it counts as a retrieval and costs money: https://platform.openai.com/docs/assistants/tools/retrieval-pricing

OpenAI Platform

Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.

timid folio Dec 24, 2023, 11:01 AM

#

agile rivet I noticed that ChatGPT-4 accepts pdf upload and can use the pdf to answer questi...

I haven't tried it yet

#

But

#

You can upload files using the assistants API

#

I'll try it myself real quick because I haven't tried it out yet

timid folio Dec 24, 2023, 11:25 AM

#

Yeah so, you need to upload the files to the OpenAI servers to analyze it with GPT-4.
Here are the steps:

1. Create an assistant with the retrieval and code_interpreter tools. This worked for me; when I didn't include them, it said it couldn't read the file contents.
2. Create a thread.
3. Upload your file to the OpenAI servers.
4. Add a message to the thread with your file ID, e.g.

{
    "role": "user",
    "content": "What's in this PDF file?",
    "file_ids": [
        "file-dRKAcOvKtXVajMWM29M3Fw0r"
    ]
}

5. Run the thread.
6. Retrieve the response. This is what I got:

The PDF file contains a simple text document used as a demonstration for tutorials. It appears to be two pages long and contains repetitive phrases such as "And more text" interspersed with expressions of the content's tedium, like "Boring zzzzz" and "Oh how boring typing this stuff." The document concludes with "The end and just as well." indicating the end of the text. The document is straightforward and seems to serve as a basic example of a PDF file for demonstration purposes.
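
The steps above can be sketched in Python roughly like this. It is only a sketch, assuming the Assistants beta of the openai Python package as it looked in December 2023 (the `retrieval` tool type and `file_ids` on messages; later SDK releases renamed both, so check your installed version). The function name `start_pdf_question`, its arguments and the default model name are made up for illustration; `client` is expected to be an `OpenAI()` instance.

```python
def start_pdf_question(client, pdf_path, question, model="gpt-4-1106-preview"):
    """Upload a PDF, attach it to a new thread and start a run.

    `client` is assumed to be an `openai.OpenAI()` instance. The model name
    is an assumption; use whichever GPT-4 model your account offers.
    Returns the IDs needed to poll the run and clean up afterwards.
    """
    # Upload the file so the assistant can read it.
    with open(pdf_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="assistants")

    # Assistant with both tools enabled, as described in the steps.
    assistant = client.beta.assistants.create(
        model=model,
        tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],
    )

    # Empty thread, then a user message that points at the uploaded file.
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=question,
        file_ids=[uploaded.id],
    )

    # Start the run; it finishes asynchronously, so the caller has to poll.
    run = client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant.id,
    )

    return {
        "file_id": uploaded.id,
        "assistant_id": assistant.id,
        "thread_id": thread.id,
        "run_id": run.id,
    }
```

The function only starts the run. The answer shows up in the thread's messages once the run status reaches `completed`.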

#

Here are some docs that may help:

Creating an assistant: https://platform.openai.com/docs/assistants/how-it-works/creating-assistants
Uploading a file: https://platform.openai.com/docs/api-reference/files/create
Creating a thread: https://platform.openai.com/docs/api-reference/threads/createThread
Adding a message to the thread: https://platform.openai.com/docs/api-reference/messages/createMessage

agile rivet Dec 25, 2023, 2:03 AM

#

@timid folio thank you very much! i was curious how can i prevent/minimize any costs? does uploading then running the query on the assistant, then deleting the file prevent incurring costs?

#

do you know how chat.openai.com works with file uploads? is it essentially calling the same model with assistant?

timid folio Dec 25, 2023, 7:11 PM

#

agile rivet @timid folio thank you very much! i was curious how can i prevent/minim...

You could delete the files right after you finished analyzing the PDF

#

I'm not sure but since the price is calculated per day it may be free

#

If you host it for less than a day on the openai servers

#

But I'm not sure

agile rivet Dec 25, 2023, 11:28 PM

#

Thanks!! I'm still trying to figure out the best way to check status and delete after printing results
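
One way to check the status and delete the file afterwards is a small polling loop with the cleanup in a `finally` block. This is a rough sketch, assuming the December 2023 Assistants beta calls `client.beta.threads.runs.retrieve` and `client.files.delete`. The helper names, the timeout values and the `handle_result` callback are hypothetical, and whether deleting quickly avoids storage charges is not something this code can confirm.

```python
import time

# Statuses after which a run will not change any more.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_run(client, thread_id, run_id, poll_seconds=1.0, timeout_seconds=120.0):
    """Poll a run until it reaches a terminal status and return that status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in TERMINAL_STATUSES:
            return run.status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status} after {timeout_seconds}s")
        time.sleep(poll_seconds)


def finish_and_delete(client, thread_id, run_id, file_id, handle_result, poll_seconds=1.0):
    """Wait for the run, hand the final status to `handle_result`, then delete the file.

    The delete sits in `finally`, so the uploaded file is removed even if
    polling times out or `handle_result` raises.
    """
    try:
        status = wait_for_run(client, thread_id, run_id, poll_seconds=poll_seconds)
        handle_result(status)
        return status
    finally:
        client.files.delete(file_id)
```

`handle_result` is where the printing would go, for example a function that lists the thread's messages when the status is `completed`.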

agile rivet Dec 29, 2023, 2:33 AM

#

@timid folio when you have time, could you help me again?

i'm having trouble uploading the file. i am looking at the docs:

https://github.com/openai/openai-python/tree/main?tab=readme-ov-file#file-uploads
Request parameters that correspond to file uploads can be passed as bytes, a PathLike instance or a tuple of (filename, contents, media type).

my understanding, according to the openai api docs, is that i need to form a tuple with (filename, contents, media type). i understand filename can just be a string, but i don't understand what contents and media type should be.

i am uploading a file to my server via html
<form enctype="multipart/form-data">

when i check the type on my flask server it is:
<class 'werkzeug.datastructures.file_storage.FileStorage'> which i learned is what flask uses.

thanks again for your help!

timid folio Dec 29, 2023, 10:13 AM

#

agile rivet @timid folio when you have time, could you help me again? i'm having t...

Yeah so I've read the docs, but I'm not a python developer and I don't know what a tuple is lmao, so I gave ChatGPT some docs and your question and now I have a code snippet that looks alright but I have no idea if it will work to be honest, but here it is:

from flask import Flask, request
from openai import OpenAI
import mimetypes

app = Flask(__name__)
client = OpenAI()

def get_mime_type(file):
    # Try to detect MIME type, default to 'application/octet-stream' if not found
    mime_type, _ = mimetypes.guess_type(file.filename)
    return mime_type or 'application/octet-stream'

@app.route('/upload', methods=['POST'])
def upload_to_openai():
    try:
        # Step 1: Receive File in Flask
        file = request.files['file']  # 'file' is the name attribute in your HTML form

        # Step 2: Read File Contents
        file_contents = file.read()

        # Step 3: Determine Media Type
        media_type = get_mime_type(file)

        # Step 4: Create Tuple for OpenAI API
        filename = file.filename
        openai_tuple = (filename, file_contents, media_type)

        # Step 5: Use OpenAI Python Library to Upload
        client.files.create(
            file=openai_tuple,
            purpose="assistants"
        )

        return "File uploaded to OpenAI successfully!"
    except Exception as e:
        return f"Error uploading file: {str(e)}", 500

if __name__ == '__main__':
    app.run()

agile rivet Dec 30, 2023, 2:40 AM

#

i was able to get it to upload. i ended up using request.files['file_upload'].read()

#

seems to align with your generated solution
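
For reference, the (filename, contents, media type) tuple from the docs can be built straight from the object Flask hands over. This is a small sketch, assuming the upload object behaves like werkzeug's `FileStorage` (it has `.filename`, `.read()` and `.mimetype`). The helper name `as_upload_tuple` and the fallback values are made up for illustration.

```python
import mimetypes


def as_upload_tuple(storage, default_type="application/octet-stream"):
    """Turn a FileStorage-like upload into (filename, contents, media type)."""
    # Fall back to a placeholder name if the browser sent none.
    filename = storage.filename or "upload"
    # contents is simply the raw bytes of the file.
    contents = storage.read()
    # Prefer the media type sent with the upload, then guess from the
    # file extension, then fall back to a generic binary type.
    media_type = (
        getattr(storage, "mimetype", None)
        or mimetypes.guess_type(filename)[0]
        or default_type
    )
    return (filename, contents, media_type)
```

In a Flask route this would be used as `client.files.create(file=as_upload_tuple(request.files['file_upload']), purpose="assistants")`. Passing only the bytes also works, as noted above, but the tuple keeps the filename and media type attached to them.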

agile rivet Dec 30, 2023, 3:02 AM

#

thanks, i have made a lot of progress actually! not only am i able to upload the file to OpenAI, i can now create an assistant, thread, message, file, run it, and get the status!

#

my next hurdle is understanding how to unwrap SyncCursorPage[MessageFile]
client.beta.threads.messages.list(thread_id, **params) -> SyncCursorPage[ThreadMessage]

#

when i click on the link, it takes me to the ThreadMessage definition. trying to understand what SyncCursorPage does

#

doing a grep in the repo doesn't yield anything
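
As far as I can tell, `SyncCursorPage` is the SDK's generic wrapper for one page of a paginated list response, and the actual items sit in its `.data` attribute as a plain list. A minimal sketch of unwrapping it, assuming the December 2023 message shape where each `ThreadMessage` has a `role` and a list of `content` parts, and text parts carry `text.value`. The helper name `assistant_texts` is made up for illustration.

```python
def assistant_texts(page):
    """Collect the text of every assistant message on one page of results.

    `page` is assumed to be what client.beta.threads.messages.list(...)
    returns: an object whose `.data` is a list of messages.
    """
    texts = []
    for message in page.data:
        if message.role != "assistant":
            continue  # skip the user's own messages
        for part in message.content:
            # Content can also hold non-text parts such as image files.
            if part.type == "text":
                texts.append(part.text.value)
    return texts
```

Usage would be along the lines of `for text in assistant_texts(client.beta.threads.messages.list(thread_id)): print(text)`. By default the list endpoint should return the newest messages first.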

agile rivet Dec 30, 2023, 4:42 AM

#

#dev-chat message

timid folio Dec 30, 2023, 7:14 PM

#

agile rivet my next hurdle is understanding how to unwrap `SyncCursorPage[MessageFile]` `cli...

No idea to be honest, like I said, I'm a js dev and don't really know python

agile rivet Dec 30, 2023, 7:57 PM

#

Me too rofl. Tuple is just an immutable array
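
A quick illustration of that point, using the same three-part shape as the upload tuple (the values here are made up):

```python
# A tuple groups values in a fixed order, like an array that cannot be changed.
upload = ("report.pdf", b"%PDF-1.4", "application/pdf")

# Unpacking reads the parts back out by position.
filename, contents, media_type = upload

# Trying to change an element raises TypeError.
try:
    upload[0] = "other.pdf"
except TypeError as exc:
    error = str(exc)
```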
