#how can I insert into an already created vector store using AzureOpenAI

tardy canopy
how can I insert into an already created vector store using AzureOpenAI

edgy glen (BOT)
Hi @tardy canopy. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

To insert data into an already created vector store using Azure Cognitive Search and OpenAI, you can follow the steps outlined in the OpenAI Cookbook example (https://cookbook.openai.com/examples/vector_databases/azuresearch/getting_started_with_azure_cognitive_search_and_openai#insert-text-and-embeddings-into-vector-store).

Here's a Python code snippet from the example that demonstrates how to do this:

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchIndexingBufferedSender

# Convert the 'id' and 'vector_id' columns to strings so one of them can serve as the key field
article_df['id'] = article_df['id'].astype(str)
article_df['vector_id'] = article_df['vector_id'].astype(str)

# Convert the DataFrame to a list of dictionaries
documents = article_df.to_dict(orient='records')

# Use SearchIndexingBufferedSender to upload the documents in batches optimized for indexing
with SearchIndexingBufferedSender(search_service_endpoint, index_name, AzureKeyCredential(search_service_api_key)) as batch_client:
    # Add upload actions for all documents
    batch_client.upload_documents(documents=documents)

print(f"Uploaded {len(documents)} documents in total")

In this example, article_df is a DataFrame that contains your data, with 'id' and 'vector_id' as columns. The DataFrame is converted into a list of dictionaries, which is then uploaded to the Azure Search index using the SearchIndexingBufferedSender class.
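
If your new records are not in a DataFrame, the same idea can be sketched with plain dictionaries. This is a minimal sketch, not the Cookbook's code: the helper names `prepare_documents`, `upload_in_batches`, and `make_search_client` and the batch size of 500 are assumptions made for illustration. The only Azure calls it relies on are `SearchClient` and its `merge_or_upload_documents` method from the `azure-search-documents` package.

```python
from typing import Any, Dict, Iterable, List


def prepare_documents(rows: Iterable[Dict[str, Any]], key_field: str = "id") -> List[Dict[str, Any]]:
    """Return copies of the rows with the key field cast to str (index keys are strings)."""
    prepared = []
    for row in rows:
        doc = dict(row)  # copy so the caller's data is not modified
        doc[key_field] = str(doc[key_field])
        prepared.append(doc)
    return prepared


def upload_in_batches(client: Any, documents: List[Dict[str, Any]], batch_size: int = 500) -> int:
    """Send documents to an existing index in fixed-size batches; return how many were sent."""
    sent = 0
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # merge_or_upload updates a document whose key already exists, otherwise inserts it
        client.merge_or_upload_documents(documents=batch)
        sent += len(batch)
    return sent


def make_search_client(endpoint: str, index_name: str, api_key: str):
    """Build a SearchClient for an index that already exists (needs azure-search-documents)."""
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    return SearchClient(endpoint, index_name, AzureKeyCredential(api_key))
```

To use it, you would pass the result of `make_search_client(...)` together with `prepare_documents(new_rows)` to `upload_in_batches`. Every document must match the schema of the existing index, including the vector fields.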

If your dataset doesn't already contain pre-computed embeddings, you can create embeddings using the OpenAI Python library:

import openai  # this snippet uses the legacy (pre-1.0) openai interface

# Example function to generate document embedding
# `model` must be set to the name of your embedding deployment before calling this
def generate_document_embeddings(text):
    response = openai.Embedding.create(input=text, engine=model)
    embeddings = response['data'][0]['embedding']
    return embeddings

# Sampling the first document content as an example
first_document_content = documents[0]['text']
print(f"Content: {first_document_content[:100]}")

# Generate the content vector using the generate_document_embeddings function
content_vector = generate_document_embeddings(first_document_content)
print("Content vector generated")

In this example, the `generate_document_embeddings` function is used to generate embeddings for a document. The function takes a text input and uses the OpenAI `Embedding.create` method to generate an embedding. The embedding is then returned by the function.
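
With version 1.0 or later of the `openai` package, the equivalent call goes through an `AzureOpenAI` client. The sketch below shows one way to attach embeddings to documents before uploading them. The helper names `make_azure_embedder` and `add_embeddings` and the field name `content_vector` are assumptions for illustration; use whatever vector field your existing index actually defines.

```python
from typing import Any, Callable, Dict, List, Sequence


def make_azure_embedder(endpoint: str, api_key: str, api_version: str,
                        deployment: str) -> Callable[[str], List[float]]:
    """Return a text -> vector function backed by an Azure OpenAI embedding deployment."""
    from openai import AzureOpenAI  # requires openai>=1.0

    client = AzureOpenAI(azure_endpoint=endpoint, api_key=api_key, api_version=api_version)

    def embed(text: str) -> List[float]:
        # With Azure OpenAI, `model` is the name of your deployment
        response = client.embeddings.create(model=deployment, input=text)
        return response.data[0].embedding

    return embed


def add_embeddings(documents: Sequence[Dict[str, Any]],
                   embed: Callable[[str], List[float]],
                   text_field: str = "text",
                   vector_field: str = "content_vector") -> List[Dict[str, Any]]:
    """Return copies of the documents with vector_field filled in where it is missing."""
    enriched = []
    for original in documents:
        doc = dict(original)  # copy so the input list is left untouched
        if vector_field not in doc:
            doc[vector_field] = embed(doc[text_field])
        enriched.append(doc)
    return enriched
```

Passing the embedding function in as an argument keeps `add_embeddings` independent of the client, so you can try it with a stub function before calling the real service.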