I am not sure what I am doing wrong. I am not an expert, just a university student (with basic formal education in Python) whose studies aren't related to AI, but I am building this skill on the side. With that said, I am trying to build a (personal) RAG system to answer questions from the content of books of my choosing.
My approach is to parse the .pdf or .epub file and save each chapter in a Document object. I then chunk each chapter into TextNodes with some metadata, using the llama_index sentence splitter.
Here is where I think the mistake most likely is, but I am not sure what it is. The Cohere embed endpoint requires the texts argument to be a list of strings, so I use this line to collect the text from each TextNode into a list: texts = [node.text for node in text_nodes] and then call:
response = co.embed(
    texts=texts,
    model="embed-multilingual-v3.0",
    input_type="search_document",
    embedding_types=["float"],
)
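To make the setup above concrete, here is a minimal, self-contained sketch of the text-collection step together with a shape check on the result. It does not call Cohere: FakeNode is a hypothetical stand-in for a llama_index TextNode (only its text attribute is used), and the vectors are made-up placeholder values in place of the real API output. With the real client, the nested list would presumably be read from the float entry of response.embeddings, which is an assumption about the SDK's response layout rather than something confirmed here.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class FakeNode:
    """Hypothetical stand-in for a llama_index TextNode (only .text is used)."""
    text: str


def collect_texts(nodes: Sequence[FakeNode]) -> List[str]:
    """Build the list of strings that is passed as the texts argument."""
    return [node.text for node in nodes]


def describe_vectors(
    texts: Sequence[str], vectors: Sequence[Sequence[float]]
) -> Tuple[int, int, int]:
    """Return (number of texts, number of vectors, length of the first vector)."""
    dim = len(vectors[0]) if vectors else 0
    return len(texts), len(vectors), dim


text_nodes = [FakeNode("first chunk"), FakeNode("second chunk"), FakeNode("third chunk")]
texts = collect_texts(text_nodes)

# Placeholder 4-dimensional vectors standing in for the API output (assumption:
# one vector per input text, in the same order as the texts list).
vectors = [[0.1, 0.2, 0.3, 0.4] for _ in texts]

n_texts, n_vectors, dim = describe_vectors(texts, vectors)
print(n_texts, n_vectors, dim)
```

If the number of vectors matches the number of texts, each chunk has its own embedding even when everything is printed inside one outer list.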
When I check either response or response.embeddings, I get a single float entry containing all the embeddings in one list, as in the attached picture. I hope this is enough context, but if I am missing something important, please let me know. Thank you in advance.