#how can I insert into an already created vector store using AzureOpenAI
Hi @tardy canopy. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
To insert data into an already created vector store using Azure Cognitive Search and OpenAI, you can follow the steps outlined in the OpenAI Cookbook example (https://cookbook.openai.com/examples/vector_databases/azuresearch/getting_started_with_azure_cognitive_search_and_openai#insert-text-and-embeddings-into-vector-store).
Here's a Python code snippet from the example that demonstrates how to do this:
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchIndexingBufferedSender

# Convert the 'id' and 'vector_id' columns to string so one of them can serve as our key field
article_df['id'] = article_df['id'].astype(str)
article_df['vector_id'] = article_df['vector_id'].astype(str)

# Convert the DataFrame to a list of dictionaries
documents = article_df.to_dict(orient='records')

# Use SearchIndexingBufferedSender to upload the documents in batches optimized for indexing.
# search_service_endpoint, index_name, and search_service_api_key must point at the existing index.
with SearchIndexingBufferedSender(search_service_endpoint, index_name, AzureKeyCredential(search_service_api_key)) as batch_client:
    # Add upload actions for all documents
    batch_client.upload_documents(documents=documents)

print(f"Uploaded {len(documents)} documents in total")
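In this example, article_df is a DataFrame that contains your data, with 'id' and 'vector_id' as columns. The DataFrame is converted into a list of dictionaries, which is then uploaded to the Azure Search index using the SearchIndexingBufferedSender class. Because the index already exists, the keys of each dictionary must match the field names defined in the index. An upload action inserts a new document, or replaces an existing document that has the same key.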
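If you only need to add a handful of documents to the existing index, a plain SearchClient may be enough. The following is a minimal sketch, assuming the `azure-search-documents` package is installed; the helper names `make_search_client` and `upsert_documents` and the batch size are hypothetical and not part of the cookbook example. It uses `merge_or_upload_documents`, which updates a document whose key already exists and inserts it otherwise.

```python
from typing import Any, Dict, Iterable, List


def make_search_client(endpoint: str, index_name: str, api_key: str):
    """Build a SearchClient for an index that already exists.

    Hypothetical helper; assumes azure-search-documents is installed.
    """
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    return SearchClient(endpoint, index_name, AzureKeyCredential(api_key))


def upsert_documents(
    client: Any,
    documents: Iterable[Dict[str, Any]],
    batch_size: int = 500,  # assumed value, tune for your payload size
) -> List[str]:
    """Send documents in batches and return the keys of any that failed."""
    docs = list(documents)
    failed: List[str] = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        # Each result reports the document key and whether indexing succeeded.
        results = client.merge_or_upload_documents(documents=batch)
        failed.extend(r.key for r in results if not r.succeeded)
    return failed
```

You would call it as `upsert_documents(make_search_client(search_service_endpoint, index_name, search_service_api_key), documents)` and inspect the returned list for keys that need a retry.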
If your dataset doesn't already contain pre-computed embeddings, you can create embeddings using the OpenAI Python library:
import openai

# Example function to generate document embedding.
# Uses the pre-1.0 openai library interface; with Azure OpenAI, `engine` is the deployment name held in `model`.
def generate_document_embeddings(text):
    response = openai.Embedding.create(
        input=text, engine=model)
    embeddings = response['data'][0]['embedding']
    return embeddings

# Sampling the first document content as an example
first_document_content = documents[0]['text']
print(f"Content: {first_document_content[:100]}")

# Generate the content vector using the generate_document_embeddings function
content_vector = generate_document_embeddings(first_document_content)
print("Content vector generated")
In this example, the `generate_document_embeddings` function takes a text input, calls the OpenAI `Embedding.create` method, and returns the resulting embedding vector for that document.
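To embed every document before uploading it, rather than just the first one, something like the following sketch could work. It assumes version 1.0 or later of the `openai` package (the `AzureOpenAI` client, not the `openai.Embedding.create` call shown above). The helper names `make_azure_embedder` and `attach_embeddings`, the batch size, and the `content_vector` field name are assumptions; the vector field must match whatever your existing index defines.

```python
from typing import Any, Callable, Dict, List, Sequence

Embedder = Callable[[Sequence[str]], List[List[float]]]


def make_azure_embedder(endpoint: str, api_key: str, api_version: str, deployment: str) -> Embedder:
    """Return a batch embedder backed by Azure OpenAI (assumes openai>=1.0)."""
    from openai import AzureOpenAI

    client = AzureOpenAI(azure_endpoint=endpoint, api_key=api_key, api_version=api_version)

    def embed(texts: Sequence[str]) -> List[List[float]]:
        # For Azure OpenAI, `model` is the name of your embedding deployment.
        response = client.embeddings.create(input=list(texts), model=deployment)
        # Sort by index so the vectors line up with the input order.
        return [item.embedding for item in sorted(response.data, key=lambda d: d.index)]

    return embed


def attach_embeddings(
    documents: Sequence[Dict[str, Any]],
    embed: Embedder,
    text_field: str = "text",
    vector_field: str = "content_vector",  # assumed name; use your index's vector field
    batch_size: int = 16,  # assumed value
) -> List[Dict[str, Any]]:
    """Return copies of the documents with vector_field filled from text_field."""
    enriched: List[Dict[str, Any]] = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        vectors = embed([doc[text_field] for doc in batch])
        for doc, vector in zip(batch, vectors):
            enriched.append({**doc, vector_field: vector})
    return enriched
```

Passing the embedder in as a function keeps `attach_embeddings` independent of the OpenAI client, so it can be exercised with a stub before any real API calls are made.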