# Analyzing a PDF with GPT-4

agile rivet
#

I noticed that ChatGPT with GPT-4 accepts PDF uploads and can use the PDF to answer my questions. I'm having trouble finding the API endpoint that accepts a file input. I have looked at https://platform.openai.com/docs/api-reference/chat/create but it has no file parameter.

I found knowledge retrieval (https://platform.openai.com/docs/assistants/tools/knowledge-retrieval), but I'm not sure whether there's a better alternative that doesn't incur storage costs, since I only want the document to be used for a single query.

What's the 'right' way to do this?

timid folio
#

But

#

You can upload files using the assistants API

#

I'll try it myself real quick because I haven't tried it out yet

timid folio
#

Yeah so, you need to upload the file to the OpenAI servers to analyze it with GPT-4.
Here are the steps:

  1. Create an assistant with the tools retrieval and code_interpreter. This worked for me; when I didn't include them, it said it couldn't read file contents.
  2. Create a thread
  3. Upload your file to the OpenAI servers
  4. Add a message to the thread with your file ID, e.g.
{
    "role": "user",
    "content": "What's in this PDF file?",
    "file_ids": [
        "file-dRKAcOvKtXVajMWM29M3Fw0r"
    ]
}
  5. Run the thread
  6. Retrieve the response, this is what I got:
The PDF file contains a simple text document used as a demonstration for tutorials. It appears to be two pages long and contains repetitive phrases such as "And more text" interspersed with expressions of the content's tedium, like "Boring zzzzz" and "Oh how boring typing this stuff." The document concludes with "The end and just as well." indicating the end of the text. The document is straightforward and seems to serve as a basic example of a PDF file for demonstration purposes.
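
For reference, steps 1 to 5 above could be sketched in Python roughly as follows. This is only a sketch under assumptions: `client` is taken to be an `openai.OpenAI()` instance, the function name `start_pdf_question` and the returned dict are made up for illustration, and the `retrieval` tool and `file_ids` parameter are the ones described in this thread, so they may differ in other versions of the API. The model name is left to the caller.

```python
def start_pdf_question(client, filename, pdf_bytes, question, model):
    """Run steps 1-5 from the list above and return the IDs needed later.

    `client` is assumed to be an openai.OpenAI() instance (hypothetical helper).
    """
    # Step 1: assistant with both tools, which the thread reports was needed.
    assistant = client.beta.assistants.create(
        model=model,
        tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],
    )

    # Step 2: an empty thread to hold the conversation.
    thread = client.beta.threads.create()

    # Step 3: upload the PDF as a (filename, contents, media type) tuple.
    uploaded = client.files.create(
        file=(filename, pdf_bytes, "application/pdf"),
        purpose="assistants",
    )

    # Step 4: user message that references the uploaded file by ID.
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=question,
        file_ids=[uploaded.id],
    )

    # Step 5: start the run; it completes asynchronously on the server.
    run = client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant.id,
    )

    return {
        "assistant_id": assistant.id,
        "thread_id": thread.id,
        "file_id": uploaded.id,
        "run_id": run.id,
    }
```

Step 6 (retrieving the response) is not included, since the run first has to finish before the thread's messages contain the answer.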
#

Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.

agile rivet
#

@timid folio thank you very much! i was curious how i can prevent or minimize costs. does uploading the file, running the query on the assistant, and then deleting the file avoid incurring costs?

#

do you know how chat.openai.com works with file uploads? is it essentially calling the same model through an assistant?

timid folio
#

I'm not sure but since the price is calculated per day it may be free

#

If you host it for less than a day on the openai servers

#

But I'm not sure

agile rivet
#

Thanks!! I'm still trying to figure out the best way to check the run status and delete the file after printing the results
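
One possible way to check the status and then clean up is a simple polling loop. This is a hedged sketch, not a confirmed recipe: `client` is assumed to be an `openai.OpenAI()` instance, the helper names `wait_for_run` and `delete_uploaded_file` and the timeout values are made up for illustration, and whether deleting the file right away actually avoids storage charges is not settled in this thread.

```python
import time

# Statuses after which a run will not change any more.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_run(client, thread_id, run_id, poll_seconds=1.0, timeout_seconds=120.0):
    """Poll the run until it reaches a terminal status, then return it."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)


def delete_uploaded_file(client, file_id):
    """Delete the uploaded file once the results have been printed."""
    return client.files.delete(file_id)
```

A typical order would be: wait for the run, print the thread's messages, then call `delete_uploaded_file` in a `finally` block so the file is removed even if printing fails.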

agile rivet
#

@timid folio when you have time, could you help me again?

i'm having trouble uploading the file. i am looking at the docs:

https://github.com/openai/openai-python/tree/main?tab=readme-ov-file#file-uploads
Request parameters that correspond to file uploads can be passed as bytes, a PathLike instance or a tuple of (filename, contents, media type).

my understanding, according to the openai api docs, is that i need to form a tuple with (filename, contents, media type). i understand filename can just be a string, but i don't understand what contents and media type should be.

i am uploading a file to my server via html
<form enctype="multipart/form-data">

when i check the type on my flask server it is:
<class 'werkzeug.datastructures.file_storage.FileStorage'> which i learned is what flask uses.

thanks again for your help!

timid folio
# agile rivet @timid folio when you have time, could you help me again? i'm having t...

Yeah so I've read the docs, but I'm not a Python developer and I don't know what a tuple is lmao. So I gave ChatGPT some docs and your question, and now I have a code snippet that looks alright, but to be honest I have no idea if it will work. Here it is:

from flask import Flask, request
from openai import OpenAI
import mimetypes

app = Flask(__name__)
client = OpenAI()

def get_mime_type(file):
    # Try to detect MIME type, default to 'application/octet-stream' if not found
    mime_type, _ = mimetypes.guess_type(file.filename)
    return mime_type or 'application/octet-stream'

@app.route('/upload', methods=['POST'])
def upload_to_openai():
    try:
        # Step 1: Receive File in Flask
        file = request.files['file']  # 'file' is the name attribute in your HTML form

        # Step 2: Read File Contents
        file_contents = file.read()

        # Step 3: Determine Media Type
        media_type = get_mime_type(file)

        # Step 4: Create Tuple for OpenAI API
        filename = file.filename
        openai_tuple = (filename, file_contents, media_type)

        # Step 5: Use OpenAI Python Library to Upload
        client.files.create(
          file=openai_tuple, 
          purpose="assistants"
        )

        return "File uploaded to OpenAI successfully!"
    except Exception as e:
        return f"Error uploading file: {str(e)}", 500

if __name__ == '__main__':
    app.run()
agile rivet
#

i was able to get it to upload. i ended up using request.files['file_upload'].read()

#

seems to align with your generated solution
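
To make the (filename, contents, media type) tuple concrete, here is a small sketch of building it from the object Flask hands over. It assumes the object behaves like werkzeug's `FileStorage` (a `filename`, a `mimetype`, and a `read()` method); the helper name `to_upload_tuple` and the `upload.bin` fallback name are made up for illustration.

```python
import mimetypes


def to_upload_tuple(storage):
    """Build (filename, contents, media type) from a FileStorage-like object."""
    filename = storage.filename or "upload.bin"  # fallback name is an assumption
    contents = storage.read()  # raw bytes of the uploaded file
    media_type = (
        getattr(storage, "mimetype", None)    # what the browser reported, if anything
        or mimetypes.guess_type(filename)[0]  # otherwise guess from the extension
        or "application/octet-stream"         # generic default
    )
    # A tuple is an immutable sequence: three values in a fixed order.
    return (filename, contents, media_type)
```

In a Flask view this would be used as `to_upload_tuple(request.files['file_upload'])`, and the result passed as the `file` argument of `client.files.create(..., purpose="assistants")`.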

agile rivet
#

thanks, i have made a lot of progress actually! not only am i able to upload the file to open ai, i can now create an assistant, thread, message, file, run it, and get the status!

#

my next hurdle is understanding how to unwrap SyncCursorPage[ThreadMessage]:
client.beta.threads.messages.list(thread_id, **params) -> SyncCursorPage[ThreadMessage]

#

when i click on the link, it takes me to the ThreadMessage definition. i'm trying to understand what SyncCursorPage does

#

doing a grep in the repo doesn't yield anything
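
As far as I can tell, `SyncCursorPage` is a pagination wrapper from the library's pagination module: the messages of the current page sit in its `.data` list, and each message's `content` is a list of parts where text parts carry the string in `.text.value`. Under that assumption, unwrapping could look like the sketch below; the helper name `message_texts` is made up for illustration.

```python
def message_texts(page):
    """Collect (role, text) pairs from a page of thread messages."""
    texts = []
    for message in page.data:          # .data is a plain list of messages
        for part in message.content:   # a message can have several content parts
            if part.type == "text":    # skip non-text parts such as images
                texts.append((message.role, part.text.value))
    return texts
```

Usage would be something like `message_texts(client.beta.threads.messages.list(thread_id))` once the run has completed.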

agile rivet
#

Me too rofl. Tuple is just an immutable array