How to pass chunks to create embeddings? | OpenAI | Page 1

livid lintel Jan 30, 2024, 7:31 PM

#

I have an array of chunks, and I need to pass each chunk to the fetchEmbedding function to create an embedding with OpenAI. What is the best way to achieve this?
Currently I am passing it like this:

const response = await fetchEmbedding(args.chunks.map((chunk) => chunk));

export const fetchEmbedding = async (text: string[]) => {
  try {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: text,
    });
    return response;
  } catch (error) {
    console.log(error);
    return "Unable to create embeddings from Open AI.";
  }
};

But as you can see, fetchEmbedding receives the text as an array of chunks. Shouldn't it be just a single chunk?

near violetBOT Jan 30, 2024, 7:31 PM

#

Hi @livid lintel. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Passing the array is fine. The openai.embeddings.create function accepts either a single string or an array of strings as input, and for an array it returns one embedding per element. In your case, you're passing an array of chunks (which are presumably strings), so the call is valid as written. If you would rather process each chunk individually, you'll need to loop over your array of chunks and call the function once per chunk.

Here's an example of how you might modify your fetchEmbedding function to handle this:

export const fetchEmbedding = async (chunks: string[]) => {
  try {
    const embeddings = [];
    for (const chunk of chunks) {
      const response = await openai.embeddings.create({
        model: "text-embedding-ada-002",
        input: chunk,
      });
      embeddings.push(response);
    }
    return embeddings;
  } catch (error) {
    console.log(error);
    return "Unable to create embeddings from Open AI.";
  }
};

In this modified version of your function, we loop over the chunks array and call openai.embeddings.create for each chunk. The resulting embeddings are collected in the embeddings array, which is then returned by the function.

This approach is based on the information provided in the OpenAI Cookbook (https://cookbook.openai.com/examples/vector_databases/pinecone/gpt4_retrieval_augmentation#initializing-the-index) and the OpenAI API documentation (https://platform.openai.com/docs/api-reference).

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@livid lintel kapa.ai is still learning and improving, please let us know how it did by reacting below

livid lintel Jan 30, 2024, 7:33 PM

#

@near violet
Which is the better approach? And which is more cost-effective, i.e. uses fewer tokens?

near violetBOT Jan 30, 2024, 7:33 PM

#

Let me check the knowledge base for you. One moment! :books:

#

In terms of cost, the two approaches are equivalent. Embedding requests are billed by the number of input tokens, not by the number of requests, so the same chunks consume the same number of tokens whether you send them in one batched call or in one call per chunk. The total depends only on the size and number of your chunks.

What batching does save is overhead: fewer requests means fewer network round trips and fewer requests counted against your rate limit.

In terms of code efficiency, processing the chunks individually (as in the second example) might be slower because it involves more API calls. However, it could also be more robust because it allows you to handle errors on a per-chunk basis.

Ultimately, the best approach will depend on your specific use case and constraints. You might want to experiment with both approaches and monitor your token usage to see which one works best for you. You can use OpenAI's tokenizer tool (https://platform.openai.com/tokenizer) to help estimate costs and token usage.
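
One way to monitor token usage while experimenting is to add up the usage.total_tokens field that each embeddings response carries and compare the totals for both approaches. A minimal sketch: EmbeddingUsage and sumTokens are names made up for this example, and the token counts are invented.

```typescript
// Minimal shape of the part of an embeddings response used here.
// (Hypothetical local type; the real SDK response has more fields.)
type EmbeddingUsage = { usage: { total_tokens: number } };

// Add up the tokens reported across any number of responses.
function sumTokens(responses: EmbeddingUsage[]): number {
  return responses.reduce((total, r) => total + r.usage.total_tokens, 0);
}

// Invented numbers: one batched response versus three per-chunk
// responses for the same three chunks.
const batchedTotal = sumTokens([{ usage: { total_tokens: 42 } }]);
const perChunkTotal = sumTokens([
  { usage: { total_tokens: 10 } },
  { usage: { total_tokens: 12 } },
  { usage: { total_tokens: 20 } },
]);
console.log({ batchedTotal, perChunkTotal });
```

In real code you would collect the responses returned by fetchEmbedding and pass them to sumTokens instead of the literals above.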

Remember, you can also consider other strategies to reduce token usage, such as using shorter prompts, fine-tuning models, or caching common queries, as suggested in the OpenAI production best practices guide (https://platform.openai.com/docs/guides/production-best-practices).


livid lintel Jan 30, 2024, 8:14 PM

#

@near violet

      for (const chunk of args.chunks) {
        const response = await fetchEmbedding(chunk);
        if (typeof response === "string") {
          throw new Error(response);
        }
        await ctx.runMutation(internal.chatbook.addEmbedding, {
          chatId: args.chatId,
          content: chunk,
          embedding: response.data[0].embedding,
          metadata: args.metadata,
        });
      }

export const fetchEmbedding = async (text: string) => {
  console.log(text);
  try {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: text,
    });
    return response;
  } catch (error) {
    console.log(error);
    return "Unable to create embeddings from Open AI.";
  }
};

My fetch embedding func is too slow now, but with the previous method it was fast, why is it so?

near violetBOT Jan 30, 2024, 8:14 PM

#

Let me check the knowledge base for you. One moment! :books:

#

The slowdown is most likely because you're now calling the OpenAI API once per chunk and awaiting each call (plus a database mutation) before starting the next. In your previous implementation, you made a single API call with an array of chunks as input, so all of your chunks were processed in one round trip, which is generally faster than processing them one at a time.

In your new implementation, you're making a separate API call for each chunk. Each of these calls involves network latency and processing time, which can add up when you're dealing with a large number of chunks. This is likely why your new implementation is slower.

If speed is a concern, you might want to consider reverting to your previous implementation, or looking into ways to parallelize your API calls. However, keep in mind that making too many simultaneous requests to the API could lead to rate limiting.

Here's an example of how you might modify your fetchEmbedding function to send one request per chunk concurrently. Note that this differs from your original implementation, which sent all chunks in a single request:

export const fetchEmbedding = async (chunks: string[]) => {
  try {
    const responses = await Promise.all(chunks.map(chunk => 
      openai.embeddings.create({
        model: "text-embedding-ada-002",
        input: chunk,
      })
    ));
    return responses;
  } catch (error) {
    console.log(error);
    return "Unable to create embeddings from Open AI.";
  }
};

In this version of the function, we use `Promise.all` to send all of the API requests at once. This should be faster than sending the requests one at a time, but it could also lead to rate limiting if you're dealing with a large number of chunks. Be sure to test this implementation thoroughly to ensure it meets your needs.
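
If rate limiting becomes a problem, one option is to cap how many requests are in flight at once instead of firing them all together. The following is a sketch under that assumption, not part of the OpenAI SDK: mapWithConcurrency is a hypothetical helper, and the limit you pick depends on your account's rate limits.

```typescript
// Run `fn` over `items` with at most `limit` calls in flight at once.
// Results come back in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item to pick up

  // Each worker keeps pulling the next unprocessed index until none remain.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  };

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage (assumes the `openai` client and `chunks` array from this thread):
// const responses = await mapWithConcurrency(chunks, 5, (chunk) =>
//   openai.embeddings.create({ model: "text-embedding-ada-002", input: chunk }),
// );
```

As with Promise.all, a single rejected call rejects the whole operation, so the existing try/catch around the call still applies.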


livid lintel Jan 30, 2024, 8:31 PM

#

@near violet, give an example of how I can use batch embeddings to create embeddings. I am using React.

near violetBOT Jan 30, 2024, 8:31 PM

#

Let me check the knowledge base for you. One moment! :books:

#

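Sure, you can use the openai.embeddings.create function to create embeddings for a batch of texts. Here's an example of how you might do this in a React application. Run it on the server side (for example in an API route or backend function) rather than in the browser, so that your API key is not exposed:

import OpenAI from "openai";

// The default export is the client class, so create an instance first.
// By default the client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();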

export const fetchBatchEmbeddings = async (texts: string[]) => {
  try {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: texts,
    });
    return response;
  } catch (error) {
    console.log(error);
    return "Unable to create embeddings from Open AI.";
  }
};

// Usage
const texts = ['Text 1', 'Text 2', 'Text 3']; // Replace with your actual texts
const embeddings = await fetchBatchEmbeddings(texts);

In this example, fetchBatchEmbeddings is a function that takes an array of texts and returns a promise that resolves to the embeddings for those texts. You can use this function in your React components to fetch embeddings for a batch of texts.
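
When there are many texts, a single request may not be able to carry all of them, so a common pattern is to split the array into fixed-size batches and call fetchBatchEmbeddings once per batch. A minimal sketch of the splitting step: toBatches is a hypothetical helper and the batch size of 2 is chosen only to keep the example short.

```typescript
// Split `items` into consecutive slices of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Hypothetical texts; each inner array would be the `input` of one request.
const exampleBatches = toBatches(["a", "b", "c", "d", "e"], 2);
console.log(exampleBatches.length); // 3 batches: [a, b], [c, d], [e]
```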

This approach is based on the information provided in the OpenAI Cookbook (https://cookbook.openai.com/examples/embedding_wikipedia_articles_for_search#3-embed-document-chunks) and the OpenAI API documentation (https://platform.openai.com/docs/api-reference).


livid lintel Jan 30, 2024, 8:46 PM

#

@near violet

    try {
      const BATCH_SIZE = 100;
      for (
        let batchStart = 0;
        batchStart < args.chunks.length;
        batchStart += BATCH_SIZE
      ) {
        const batchEnd = batchStart + BATCH_SIZE;
        const batch = args.chunks.slice(batchStart, batchEnd);

        const response = await fetchEmbedding(batch);

        if (typeof response === "string") {
          throw new Error(response);
        }

        for (let i = 0; i < response.data.length; i++) {
          await ctx.runMutation(internal.chatbook.addEmbedding, {
            chatId: args.chatId,
            content: args.chunks[i],
            embedding: response.data[i].embedding,
            metadata: args.metadata,
          });
        }
      }

Does this code ensure that the correct embedding is associated with the corresponding text chunk when patching data to the database?
Does fetchEmbedding guarantee order preservation?

export const fetchEmbedding = async (text: string[]) => {
  console.log(text);
  try {
    const response = await openai.embeddings.create({
      model: "text-embedding-ada-002",
      input: text,
    });
    return response;
  } catch (error) {
    console.log(error);
    return "Unable to create embeddings from Open AI.";
  }
};

near violetBOT Jan 30, 2024, 8:46 PM

#

Let me check the knowledge base for you. One moment! :books:

#

Yes, order is preserved. When you pass an array of strings to the openai.embeddings.create function, it returns the embeddings in the same order as the input array: the first embedding in the response corresponds to the first text in the input array, the second embedding corresponds to the second text, and so on. Each item in response.data also carries an index field giving its position in the input, which you can use to double-check.

However, your loop does not associate the embeddings correctly once there is more than one batch. response.data is indexed relative to the current batch, but the loop reads args.chunks[i], which is indexed relative to the full array. For the first batch the two agree; for every later batch, the embedding of chunk batchStart + i is stored with the text of chunk i.

Here's the relevant part of your code with the lookup corrected to use the batch:

for (let i = 0; i < response.data.length; i++) {
  await ctx.runMutation(internal.chatbook.addEmbedding, {
    chatId: args.chatId,
    content: batch[i], // was args.chunks[i], which ignores batchStart
    embedding: response.data[i].embedding,
    metadata: args.metadata,
  });
}

In this loop, batch[i] is the i-th text chunk of the current batch (equivalently, args.chunks[batchStart + i]) and response.data[i].embedding is the corresponding embedding. These are passed to the internal.chatbook.addEmbedding mutation, which presumably adds the embedding to the database.

So, with that change, and as long as the internal.chatbook.addEmbedding mutation is correctly adding the embeddings to the database, your code should ensure that the correct embedding is associated with each text chunk.

#

This behavior is consistent with the information provided in the OpenAI Cookbook (https://cookbook.openai.com/examples/vector_databases/mongodb_atlas/semantic_search_using_mongodb_atlas_vector_search#step-2-setup-embeddings-generation-function) and the OpenAI API documentation (https://platform.openai.com/docs/api-reference).
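
If you would rather not rely on array position at all, you can pair chunks and embeddings through the index field that each item in response.data carries. That index is relative to the input array of that one request, so it must be looked up in the batch that was sent, not in the full chunk list. A minimal sketch: EmbeddingItem and pairByIndex are names made up for this example, and the sample data is invented.

```typescript
// The slice of an embeddings response item that this sketch relies on.
type EmbeddingItem = { index: number; embedding: number[] };
type Pair = { content: string; embedding: number[] };

// Pair every chunk in one batch with its embedding, using the `index`
// field rather than array position, and fail loudly on any mismatch.
function pairByIndex(batch: string[], data: EmbeddingItem[]): Pair[] {
  if (data.length !== batch.length) {
    throw new Error(`expected ${batch.length} embeddings, got ${data.length}`);
  }
  const pairs: Pair[] = new Array(batch.length);
  for (const item of data) {
    const inRange =
      Number.isInteger(item.index) && item.index >= 0 && item.index < batch.length;
    if (!inRange || pairs[item.index] !== undefined) {
      throw new Error(`unexpected or duplicate index ${item.index}`);
    }
    pairs[item.index] = { content: batch[item.index], embedding: item.embedding };
  }
  return pairs;
}

// Invented data, deliberately out of order to show the lookup.
const paired = pairByIndex(
  ["first chunk", "second chunk"],
  [
    { index: 1, embedding: [0.2] },
    { index: 0, embedding: [0.1] },
  ],
);
console.log(paired.map((p) => p.content));
```

In the batching loop from the question, this would be called as pairByIndex(batch, response.data) before writing each pair to the database.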
