# Is it possible to submit a PDF via the API (or a URL for a PDF) so that OpenAI can analyse it?
Hi @sullen vault. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Yes, it is possible to submit a PDF for analysis with OpenAI models, but it requires preprocessing. Vision-capable models such as GPT-4V and GPT-4o can analyze the content of a PDF, but they do not accept PDF files directly. Instead, you need to convert the PDF pages into images and then analyze those images.
Here is a general approach based on the knowledge sources:
- Convert PDF to Images: Use a library like pdf2image to convert each page of the PDF into an image. This is necessary because models like GPT-4V and GPT-4o can analyze images but not PDF files directly.

      from pdf2image import convert_from_path

      def convert_doc_to_images(path):
          images = convert_from_path(path)
          return images

- Encode Images: Convert the images to a base64 format to prepare them for analysis.

      import base64
      from io import BytesIO
      from PIL import Image

      def get_img_uri(img):
          buffer = BytesIO()
          img.save(buffer, format="jpeg")
          base64_image = base64.b64encode(buffer.getvalue()).decode("utf-8")
          data_uri = f"data:image/jpeg;base64,{base64_image}"
          return data_uri

- Analyze Images with GPT-4V or GPT-4o: Use the encoded images as input to the model for analysis.
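
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment
      # Placeholder prompt; replace with instructions suited to your documents.
      system_prompt = "Describe the content of this document page."

      def analyze_image(img_url):
          # img_url is a data URI such as the one returned by get_img_uri().
          response = client.chat.completions.create(
              model="gpt-4-vision-preview",  # or "gpt-4o"
              temperature=0,
              messages=[
                  {"role": "system", "content": system_prompt},
                  {
                      "role": "user",
                      "content": [
                          {
                              "type": "image_url",
                              # The image is passed as an object with a "url" key.
                              "image_url": {"url": img_url},
                          },
                      ],
                  },
              ],
              max_tokens=300,
              top_p=0.1,
          )
          return response.choices[0].message.content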
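Putting the three steps together, a minimal end-to-end sketch might look like the following. It assumes the `pdf2image` package (which needs poppler installed) and the `openai` package are available, that `OPENAI_API_KEY` is set, and that `gpt-4o` is the model used. The names `to_data_uri`, `build_messages`, and `analyze_pdf` are illustrative and do not come from the Cookbook.

```python
import base64
from io import BytesIO


def to_data_uri(jpeg_bytes: bytes) -> str:
    """Wrap raw JPEG bytes in a base64 data URI."""
    encoded = base64.b64encode(jpeg_bytes).decode("utf-8")
    return f"data:image/jpeg;base64,{encoded}"


def build_messages(system_prompt: str, data_uri: str) -> list:
    """Build the chat messages for a single page image."""
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": data_uri}},
            ],
        },
    ]


def analyze_pdf(path: str, system_prompt: str, model: str = "gpt-4o") -> list:
    """Return one model response per page of the PDF at `path`."""
    # Imported lazily so the helpers above work without these packages.
    from openai import OpenAI
    from pdf2image import convert_from_path

    client = OpenAI()  # assumption: API key comes from the environment
    results = []
    for page in convert_from_path(path):
        buffer = BytesIO()
        page.save(buffer, format="JPEG")
        messages = build_messages(system_prompt, to_data_uri(buffer.getvalue()))
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            max_tokens=300,
        )
        results.append(response.choices[0].message.content)
    return results
```

Sending one request per page keeps each call small, at the cost of the model not seeing context from neighbouring pages.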
For more detailed instructions, you can refer to the OpenAI Cookbook on parsing PDF docs for RAG (https://cookbook.openai.com/examples/parse_pdf_docs_for_rag).
Regarding model support, the models that can handle such tasks include gpt-4-vision-preview and gpt-4o, as they have vision capabilities. You can find more about model availability in the Batch API documentation (https://platform.openai.com/docs/guides/batch#model-availability).
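For the URL case raised in the question, one option is to download the PDF yourself and then feed the bytes into the same page-to-image conversion. The sketch below is an assumption about how that could be done, not something taken from the knowledge sources: `download_pdf` and `pdf_url_to_images` are hypothetical helper names, and `convert_from_bytes` is the `pdf2image` counterpart of `convert_from_path`.

```python
from urllib.request import urlopen


def download_pdf(url: str, timeout: float = 30.0) -> bytes:
    """Fetch a PDF from `url` and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        data = response.read()
    # A well-formed PDF begins with the "%PDF-" header.
    if not data.startswith(b"%PDF-"):
        raise ValueError("URL did not return a PDF document")
    return data


def pdf_url_to_images(url: str) -> list:
    """Download a PDF and convert each page to a PIL image."""
    # Third-party import kept local; pdf2image also needs poppler installed.
    from pdf2image import convert_from_bytes

    return convert_from_bytes(download_pdf(url))
```

The resulting images can then go through the same encode and analyze steps described above.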