#vector search stopped working

honest saddle
#

My code has not changed for a few weeks, and now the vector search method inside actions is no longer returning results when using the filter property.

#
const results = await vectorStore.similaritySearch(args.query, 5, {
  filter: (q) => q.eq("metadata.chatbotId", args.chatbotId),
});

The results are always empty. I triple-checked that documents exist whose metadata.chatbotId matches the value I am passing. Again, my code has not changed; this used to work some time ago.

#

This is using the LangChain Convex integration. I also tried vanilla vectorSearch, with no luck.

heavy lava
#

What version of convex are you on?

honest saddle
#

"convex": "^1.13.0",

heavy lava
#

What does your code using vanilla vector search look like? (if you feel comfortable sharing)

honest saddle
#
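// convex/ai.ts
import { v } from "convex/values";
import { internalAction } from "./_generated/server";
// cohereClient and DEFAULT_EMBEDDING_MODEL are defined elsewhere (not shown in this thread).
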
export const search = internalAction({
  args: {
    query: v.string(),
    chatbotId: v.id("chatbot"),
  },
  handler: async (ctx, args) => {
    const embeddings = await cohereClient.embed({
      texts: [args.query],
      model: DEFAULT_EMBEDDING_MODEL,
      inputType: "search_query",
    });
    const [vector] = embeddings.embeddings as number[][];

    const results = await ctx.vectorSearch("document", "byEmbeddings", {
      vector,
      filter: (q) => q.eq("metadata.chatbotId", args.chatbotId),
      limit: 5,
    });

    return results;
  },
});
#
// convex/schema.ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  document: defineTable({
    text: v.string(),
    embeddings: v.array(v.float64()),
    metadata: v.object({
      chatbotId: v.id("chatbot"),
    }),
  })
    .vectorIndex("byEmbeddings", {
      vectorField: "embeddings",
      filterFields: ["metadata.chatbotId"],
      dimensions: 1024,
    })
    .index("by_chatbot", ["metadata.chatbotId"])
})
#

When I remove the filter, it works, but of course I need the filter to restrict the search to a single chatbot's resources.

heavy lava
#

that's helpful to know! I'll try to repro and figure out why filtering is broken

wanton folio
#

I was able to reproduce. The issue seems related to the fact that we are filtering on a nested field (or at least I couldn't reproduce it without nesting). I am investigating to find the root cause. As a stopgap, my bet is that it will work if you remove the nesting, but I'm not sure how viable that is for you.
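
To illustrate the stopgap, here is a minimal sketch of what "removing the nesting" could look like for the document shape shared above. It assumes a top-level field named `chatbotId` (a hypothetical name, not something from this thread) and models the Convex ID as a plain string so the snippet runs on its own. The vector index's `filterFields` and the `q.eq(...)` filter would then need to reference `"chatbotId"` instead of `"metadata.chatbotId"`.

```typescript
// Hypothetical document shapes. In Convex the ID would be Id<"chatbot">;
// it is modeled as a plain string here so the sketch runs standalone.
type NestedDocument = {
  text: string;
  embeddings: number[];
  metadata: { chatbotId: string };
};

type FlatDocument = {
  text: string;
  embeddings: number[];
  chatbotId: string; // assumed top-level field name, not from the thread
};

// Move metadata.chatbotId to the top level so a vector index could list
// "chatbotId" in filterFields instead of the nested "metadata.chatbotId".
function flattenChatbotId(doc: NestedDocument): FlatDocument {
  return {
    text: doc.text,
    embeddings: doc.embeddings,
    chatbotId: doc.metadata.chatbotId,
  };
}

// Example: reshape one document before writing it back.
const flat = flattenChatbotId({
  text: "hello",
  embeddings: [0.1, 0.2],
  metadata: { chatbotId: "chatbot_123" },
});
console.log(flat.chatbotId);
```

Existing rows would have to be rewritten in the new shape, which is why this may not be viable for every project.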

honest saddle
#

It was working some time ago! My guess is that something changed recently

wanton folio
#

The bug is related to nested fields. We are working on rolling out a fix. The issue is a bit tricky because we always returned recently added results (from within the last hour) but, due to incorrectly handling the nesting, weren't returning historical results. So when you test, it might appear to be working, which I think led to the confusion (and to us not catching it).

wanton folio
#

We have fixed the issue and deployed the fix. Again, it only affected filtering on nested fields for data older than 1 hour. For historical data to become available, you should drop and re-add the index. We will look into doing this automatically next week, but it is easier to do it yourself for now.

For example, you can remove the filter field from the index, deploy, and then add the field back. Alternatively, you could add another filter field like "text" and deploy `filterFields: ["metadata.chatbotId", "text"],`. Adding the extra field forces the index to rebuild. Later you can remove the "text" filter field again. Thanks for the report and sorry about the trouble!
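
As a rough sketch of the second option, the temporary schema might look like the following. It is assembled from the schema posted earlier in the thread, and only the `filterFields` line differs. Treat it as an illustration rather than the exact file to deploy.

```typescript
// convex/schema.ts (temporary state, used only to force an index rebuild)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  document: defineTable({
    text: v.string(),
    embeddings: v.array(v.float64()),
    metadata: v.object({
      chatbotId: v.id("chatbot"),
    }),
  })
    .vectorIndex("byEmbeddings", {
      vectorField: "embeddings",
      // "text" is added only to trigger the rebuild; remove it in a later deploy.
      filterFields: ["metadata.chatbotId", "text"],
      dimensions: 1024,
    })
    .index("by_chatbot", ["metadata.chatbotId"]),
});
```

Once the rebuilt index is serving results for older documents, a follow-up deploy can restore `filterFields: ["metadata.chatbotId"]`.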