#how to create embeddings in batch?

agile topaz
#
      const batchSize = 200;
      let embeddings: any[] = [];

      for (
        let batchStart = 0;
        batchStart < chunks.length;
        batchStart += batchSize
      ) {
        const batchEnd = Math.min(batchStart + batchSize, chunks.length);
        const batchChunks = chunks.slice(batchStart, batchEnd);

        const batchEmbeddings = await Promise.all(
          batchChunks
            .flatMap((chunk, index) =>
              [...Array(Math.ceil(chunk.content.length / 8000))].map(
                (_, i) => ({
                  chunk,
                  index: index * Math.ceil(chunk.content.length / 8000) + i,
                  content: chunk.content.slice(i * 8000, (i + 1) * 8000),
                })
              )
            )
            .map(async ({ chunk, index, content }) => {
              const { embedding } = await embed({
                model: openai.embedding("text-embedding-3-small"),
                value: content,
              });
              return { chunk, index, embedding };
            })
        );

        embeddings.push(...batchEmbeddings);
      }

While generating embeddings like this, I am getting an error that says "Please try again after some time." What could be causing it? I am using the Vercel AI SDK.

pliant canyonBOT
#

Hi @agile topaz. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the knowledge sources provided, I don't have enough information to diagnose the specific error you're encountering. However, it's worth noting that OpenAI's API does have rate limits, and if you're making too many requests in a short period of time, you might encounter errors asking you to try again later.

In the code you've provided, each batch issues one embed request per 8,000-character slice, and Promise.all starts all of them at once. With a batch size of 200, that is roughly 200 or more simultaneous requests per batch, which could exceed OpenAI's rate limits.

You might want to consider adding some form of rate limiting or throttling to your code to ensure you don't exceed these limits. However, without more specific information about the error and the context in which it's occurring, it's difficult to provide a more precise answer.
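One way to add the throttling mentioned above is to cap how many requests are in flight at once. The sketch below is a minimal, library-free illustration. `mapWithConcurrency` and its parameters are hypothetical names, not part of the Vercel AI SDK or the OpenAI client, and a suitable limit depends on your account's rate limits.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each runner keeps claiming the next index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Hypothetical usage with the embed call from the question
// (`pieces` is assumed to be the array of 8,000-character slices):
//
// const vectors = await mapWithConcurrency(pieces, 5, async (piece) => {
//   const { embedding } = await embed({
//     model: openai.embedding("text-embedding-3-small"),
//     value: piece.content,
//   });
//   return embedding;
// });
```

Compared with a bare Promise.all over every slice, this keeps the request count per moment bounded while still preserving result order.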

For more information about OpenAI's API, you can refer to the official API documentation (https://platform.openai.com/docs/api-reference). If you're still encountering issues, you might want to consider reaching out to OpenAI's support or checking the OpenAI community forum (https://community.openai.com/) for help.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@agile topaz kapa.ai is still learning and improving, please let us know how it did by reacting below

agile topaz
#

@pliant canyon ,
What is the optimal approach to creating embeddings in batches without running into errors?

pliant canyonBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Creating embeddings in batches is a common practice to manage memory and API usage efficiently. Here's a general approach based on the information from the provided knowledge sources:

  1. Prepare your data: Organize your data into a suitable format. This could be a list of texts that you want to convert into embeddings.

  2. Define your batch size: Decide on a suitable batch size. This is the number of items you will process at once. The optimal batch size can depend on various factors, including the limitations of your machine or API usage limits.

  3. Create a function to generate embeddings: This function should take a list of texts as input and return a list of embeddings. The Cookbook examples do this with openai.Embedding.create, which was removed in version 1.0 of the openai Python package; the current equivalent is client.embeddings.create.

  4. Process the data in batches: Loop over your data, processing one batch at a time. For each batch, call your embedding function and then store the results.

Here's a simplified example of how you might do this:

from openai import OpenAI  # requires openai>=1.0

client = OpenAI()  # reads OPENAI_API_KEY from the environment
OPENAI_ENGINE = "text-embedding-3-small"  # example model name

def embed(texts):
    # One request embeds the whole list of texts.
    response = client.embeddings.create(
        input=texts,
        model=OPENAI_ENGINE
    )
    return [x.embedding for x in response.data]

BATCH_SIZE = 50  # or whatever size you choose
data = [...]  # your data here

for i in range(0, len(data), BATCH_SIZE):
    batch = data[i:i+BATCH_SIZE]
    embeddings = embed(batch)
    # Now do something with the embeddings, e.g. store them

This example is based on the Python code provided in the Zilliz (https://cookbook.openai.com/examples/vector_databases/zilliz/filtered_search_with_zilliz_and_openai#insert-the-data) and Milvus (https://cookbook.openai.com/examples/vector_databases/milvus/filtered_search_with_milvus_and_openai#insert-the-data) examples from the OpenAI Cookbook.

#

Remember to handle any potential errors and edge cases. The last batch needs no special handling when the number of items is not a multiple of the batch size: slicing past the end of a list simply returns the remaining items.

Also, keep in mind the rate limits of the OpenAI API. If you're making too many requests too quickly, you might need to add some delay between batches to avoid hitting these limits.

agile topaz
#

@pliant canyon ,
Could you give me an example of how to achieve this in Javascript?

pliant canyonBOT
#

Let me check the knowledge base for you. One moment! :books:

#

Based on the knowledge sources provided, here's a general approach to creating embeddings in batches using JavaScript:

  1. Prepare your data: Organize your data into a suitable format. This could be an array of texts that you want to convert into embeddings.

  2. Define your batch size: Decide on a suitable batch size. This is the number of items you will process at once. The optimal batch size can depend on various factors, including the limitations of your machine or API usage limits.

  3. Create a function to generate embeddings: This function should take an array of texts as input and return a Promise that resolves to an array of embeddings. In the examples provided, this is done using the openai.embeddings.create method.

  4. Process the data in batches: Loop over your data, processing one batch at a time. For each batch, call your embedding function and then store the results.

Here's a simplified example of how you might do this:

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<openai-api-key>",
});

// Embed a whole batch with a single request: `input` accepts an array of strings.
async function embed(texts) {
  const result = await openai.embeddings.create({
    input: texts,
    model: "text-embedding-3-small",
  });
  // Sort by `index` so the output lines up with `texts`.
  return result.data
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

const BATCH_SIZE = 50;  // or whatever size you choose
const data = [];  // your data here (an array of strings)

for (let i = 0; i < data.length; i += BATCH_SIZE) {
  const batch = data.slice(i, i + BATCH_SIZE);
  const embeddings = await embed(batch);
  // Now do something with the embeddings, e.g. store them
}

This example is based on the JavaScript code provided in the Supabase (https://cookbook.openai.com/examples/vector_databases/supabase/semantic-search#generate-openai-embeddings) example from the OpenAI Cookbook.

#

Remember to handle any potential errors and edge cases. The last batch needs no special handling when the number of items is not a multiple of the batch size: slice clamps to the end of the array and returns whatever items remain.

Also, keep in mind the rate limits of the OpenAI API. If you're making too many requests too quickly, you might need to add some delay between batches to avoid hitting these limits.
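The delay between batches mentioned above could be sketched as follows. This is a minimal illustration under stated assumptions: `toBatches`, `sleep`, `embedInBatches`, and the `embedBatch` callback are hypothetical names, and `embedBatch` stands in for whichever call actually embeds an array of texts (for example a wrapper around embedMany).

```typescript
// Split `items` into consecutive batches of at most `size` elements.
// `slice` clamps at the end of the array, so a shorter final batch
// falls out naturally.
function toBatches<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Resolve after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(() => resolve(), ms));

// Embed batch by batch, pausing `pauseMs` between requests.
// `embedBatch` is a hypothetical stand-in for the real embedding call.
async function embedInBatches(
  texts: readonly string[],
  batchSize: number,
  pauseMs: number,
  embedBatch: (batch: string[]) => Promise<number[][]>,
): Promise<number[][]> {
  const all: number[][] = [];
  const batches = toBatches(texts, batchSize);
  for (let b = 0; b < batches.length; b++) {
    all.push(...(await embedBatch(batches[b])));
    // Pause between batches, but not after the last one.
    if (b < batches.length - 1) await sleep(pauseMs);
  }
  return all;
}
```

A fixed pause is the simplest option; the right value depends on your rate limits and is something to tune rather than a known constant.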

agile topaz
#
      const batchSize = 200;
      let finalEmbeddings: any[] = [];

      for (
        let batchStart = 0;
        batchStart < chunks.length;
        batchStart += batchSize
      ) {
        const batchEnd = Math.min(batchStart + batchSize, chunks.length);
        const batchChunks = chunks.slice(batchStart, batchEnd);
        const values = batchChunks.map(({ content }) => content);

        const { embeddings } = await embedMany({
          model: openai.embedding("text-embedding-3-large"),
          values,
        });

        finalEmbeddings.push(...embeddings);
      }

Will this also result in a rate limit error?

Or is this the optimum way to generate embeddings in batch?

@pliant canyon

pliant canyonBOT
#

Let me check the knowledge base for you. One moment! :books:

#

The code you've provided seems to be a more optimized way of generating embeddings in batches. It's making use of the embedMany function to generate embeddings for a batch of texts at once, which should be more efficient than generating embeddings for each text individually.

However, whether this code will result in a rate limit error depends on the size of your batches and the frequency of your requests. OpenAI's API has rate limits, and if you're making too many requests in a short period of time, you might encounter rate limit errors.

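Separately, OpenAI offers an asynchronous Batch API, which is distinct from batching inputs with embedMany. The Batch API has two types of rate limits: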

  1. Per-batch limits: A single batch may include up to 50,000 requests, and a batch input file can be up to 100 MB in size. Note that /v1/embeddings batches are also restricted to a maximum of 50,000 embedding inputs across all requests in the batch.
  2. Enqueued prompt tokens per model: Each model has a maximum number of enqueued prompt tokens allowed for batch processing. You can find these limits on the Platform Settings page (https://platform.openai.com/settings/organization/limits).

There are no limits for output tokens or number of submitted requests for the Batch API today. Because Batch API rate limits are a new, separate pool, using the Batch API will not consume tokens from your standard per-model rate limits.

If you're still encountering rate limit errors, you might need to reduce your batch size or add some delay between batches. If you're making a large number of requests, you might also want to consider using OpenAI's Batch API, which is designed for processing large numbers of requests at once.
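If rate limit errors still occur occasionally, retrying with exponential backoff is a common complement to smaller batches. The sketch below is only an illustration: `backoffDelayMs` and `withRetry` are hypothetical helpers, the default delays are arbitrary, and deciding which errors are worth retrying (the `shouldRetry` callback) is left as an assumption for you to fill in. The AI SDK's embed and embedMany also appear to accept a maxRetries option, so it is worth checking the SDK documentation before adding your own wrapper.

```typescript
// Delay before retry number `attempt` (0-based):
// baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Call `fn`, retrying up to `maxRetries` times when it throws and
// `shouldRetry` returns true for the error.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  shouldRetry: (error: unknown) => boolean = () => true,
  baseMs = 500,
): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await fn();
    } catch (error) {
      // Give up once the retry budget is spent or the error is not retryable.
      if (attempt >= maxRetries || !shouldRetry(error)) throw error;
      const delay = backoffDelayMs(attempt, baseMs);
      await new Promise<void>((resolve) => setTimeout(() => resolve(), delay));
      attempt++;
    }
  }
}

// Hypothetical usage around the embedMany call from the question:
//
// const { embeddings } = await withRetry(() =>
//   embedMany({ model: openai.embedding("text-embedding-3-large"), values }),
// );
```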

agile topaz
#

@pliant canyon , In the above code what would be more beneficial? Adding Promise.all or not adding Promise.all?

pliant canyonBOT
#

Let me check the knowledge base for you. One moment! :books:

#

In the code you've provided, there's no need to use Promise.all. The embedMany function is called once per batch of texts, and it returns a single promise that resolves to the embeddings for all texts in the batch. Therefore, you can simply await this promise directly, as you're doing in your code.

Promise.all is useful when you have multiple promises that can be resolved concurrently. In this case, since you're generating embeddings for each batch of texts as a whole, there's only one promise per batch, so Promise.all isn't necessary.

Here's the relevant part of your code for reference:

const { embeddings } = await embedMany({
  model: openai.embedding("text-embedding-3-large"),
  values,
});

In this line, embedMany returns a single promise, which you're awaiting directly. This is the correct approach for this situation.
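For contrast, here is a small sketch of the two situations described above: a single promise per batch that is awaited directly, and several independent promises combined with Promise.all. `fakeEmbedMany` is a hypothetical stand-in that returns a one-element vector per text so the example runs without any API access; it is not the AI SDK's embedMany.

```typescript
// Hypothetical stand-in for an embedding call: one tiny vector per input text.
async function fakeEmbedMany(values: string[]): Promise<number[][]> {
  return values.map((v) => [v.length]);
}

// One promise per batch: awaiting it directly is all that is needed.
async function embedSequentially(batches: string[][]): Promise<number[][]> {
  const out: number[][] = [];
  for (const batch of batches) {
    out.push(...(await fakeEmbedMany(batch)));
  }
  return out;
}

// Several independent promises: Promise.all starts them together and
// returns their results in input order, whatever order they finish in.
async function embedConcurrently(batches: string[][]): Promise<number[][]> {
  const perBatch = await Promise.all(batches.map((b) => fakeEmbedMany(b)));
  return ([] as number[][]).concat(...perBatch);
}
```

Both functions return the same vectors in the same order. The concurrent version sends every batch at once, so against a real API it trades speed for a higher chance of hitting rate limits.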
