#batch pdf uploading to the API

true aspen
#

Hi, I am using the Assistants API and batch uploading PDF files, but for some reason one of the files is successful and the other results in a failed status. I am trying to understand what could be causing the failed status so that I can correct it before passing the file in as a stream. Any help would be greatly appreciated.

silver harborBOT
#

Hi @true aspen. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

You can check the status of a batch at any time through the OpenAI API, which returns a Batch object. The status of a Batch object can be any of the following:

  • validating: the input file is being validated before the batch can begin
  • failed: the input file has failed the validation process
  • in_progress: the input file was successfully validated and the batch is currently being run
  • finalizing: the batch has completed and the results are being prepared
  • completed: the batch has been completed and the results are ready
  • expired: the batch was not able to be completed within the 24-hour time window
  • cancelling: the batch is being cancelled (may take up to 10 minutes)
  • cancelled: the batch was cancelled

If your batch status is failed, it means the input file has failed the validation process. Typical causes are an incorrect file format, a file that is too large, or invalid content inside the file.
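To see why validation failed, it can help to print whatever error details come back with the Batch object. Below is a minimal sketch, assuming the Batch object exposes an errors field whose data list holds entries with code, message and line attributes (that is how the Python SDK models it as far as I know, so please verify against your installed version). The helper name describe_batch_errors and the stand-in object are hypothetical, and the error code and message in the stand-in are made up for illustration.

```python
from types import SimpleNamespace


def describe_batch_errors(batch):
    """Return one human-readable line per error reported on a Batch-like object."""
    errors = getattr(batch, "errors", None)
    # `errors` may be None, and so may its `data` list, when nothing was reported
    items = getattr(errors, "data", None) or []
    lines = []
    for err in items:
        line_no = getattr(err, "line", None)
        where = f"input line {line_no}" if line_no is not None else "whole file"
        lines.append(f"{where}: [{err.code}] {err.message}")
    return lines


# Stand-in object for illustration only (assumed field names, invented values).
# With the real client this would be client.batches.retrieve("batch_abc123").
fake_batch = SimpleNamespace(
    status="failed",
    errors=SimpleNamespace(
        data=[
            SimpleNamespace(
                code="invalid_json_line",
                message="Line is not valid JSON.",
                line=3,
                param=None,
            )
        ]
    ),
)

for text in describe_batch_errors(fake_batch):
    print(text)
```

If the list comes back empty for a failed batch, the cause is probably not a per-line problem, and checking the file format and size is the next step.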

One of the forum posts suggests that a batch can fail when its output file would be too large (>5 GB). The user in that post resolved the problem by splitting the input file into smaller files.

Here is how you can check the status of a batch:

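from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# Retrieve the Batch object and print its current status
batch = client.batches.retrieve("batch_abc123")
print(batch.status)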

Please replace "batch_abc123" with your batch ID.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: