#embedding similarity issue

hidden ingot
I keep running into this error when trying to calculate cosine similarity between my search query and embeddings:

ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U34382'), dtype('<U23')) -> None

This is part of the code:

embedding = get_embedding(
    search_query,
    engine="text-embedding-ada-002"
)
df["similarities"] = df.ada_search.apply(lambda x: cosine_similarity(x, embedding)) # Errors here

Any ideas?

Additional info:

  • datatype of the 'ada_search' column in my df is 'object'
  • datatype of 'embedding' is 'list'
  • I'm saving the embeddings to a CSV and then loading the CSV back into a DataFrame. I'm willing to bet the round trip through the CSV is causing the datatype mismatch, but I'm not sure how to correct it.
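
For reference, '<U34382' in the error is NumPy's notation for a Unicode string of 34382 characters, which fits the CSV theory: after read_csv, each cell in 'ada_search' is probably the text of a list rather than a list of floats. Below is a minimal sketch of one possible fix, assuming the cells look like Python list literals such as "[0.1, 0.2, ...]". The toy DataFrame and the local cosine_similarity function are stand-ins made up for illustration so the example runs on its own; they are not the poster's real data or the helper from the original snippet.

```python
import ast
import io

import numpy as np
import pandas as pd


def cosine_similarity(a, b):
    # Illustrative stand-in for the helper used in the post (an assumption,
    # defined here only so the sketch is self-contained).
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy data standing in for the real embeddings (hypothetical values).
original = pd.DataFrame({
    "text": ["first", "second"],
    "ada_search": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
})

# Simulate the save-to-CSV / load-from-CSV round trip in memory.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
df = pd.read_csv(buffer)

# After the round trip, each cell is a str, not a list of floats.
cell_type_before = type(df.ada_search.iloc[0])

# Parse each string back into a list, then into a NumPy array.
# ast.literal_eval only accepts Python literals, so it is safer than eval.
df["ada_search"] = df.ada_search.apply(ast.literal_eval).apply(np.array)

# Toy query embedding; in the post this would come from get_embedding.
embedding = [1.0, 0.0, 0.0]
df["similarities"] = df.ada_search.apply(lambda x: cosine_similarity(x, embedding))
```

If the column was written from NumPy arrays rather than lists, the cells may look like "[1. 0. 0.]" with no commas, and ast.literal_eval will not parse that; in that case a different parser or a different storage format would be needed.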