#"Too many bytes read" when using an index + filter, but no issues when removing the filter


stray sierra

Hey all,

My understanding is that I should use an index to minimize the number of returned results, and using a filter beyond that will only filter on that subset of documents (vs a whole table scan).

I'm seeing slightly different behavior than expected.

This query throws a "too many bytes" error (and the dashboard log shows it reads ~9k rows, even though my paginate function is limited in this case to just 1):

const novemberFirst = 1761955200000;
const { page, isDone, continueCursor } = await ctx.db
  .query("emailEvents")
  .withIndex("by_creation_time", (q) => q.gt("_creationTime", novemberFirst))
  .filter((q) =>
    q.and(
      q.neq(q.field("executionId"), undefined),
      q.or(
        q.eq(q.field("templateId"), undefined),
        q.eq(q.field("templateScheduledFor"), undefined),
        q.eq(q.field("campaignType"), undefined),
        q.eq(q.field("campaignTypeId"), undefined),
        q.eq(q.field("contactId"), undefined)
      )
    )
  )
  .paginate({
    cursor: null,
    numItems: 1
  });

But this query does not:

const novemberFirst = 1761955200000;
const { page, isDone, continueCursor } = await ctx.db
  .query("emailEvents")
  .withIndex("by_creation_time", (q) => q.gt("_creationTime", novemberFirst))
  .paginate({
    cursor: null,
    numItems: 1
  });

Notice I'm also using paginate, which should limit the results no matter what, correct?

I guess I can drop the filter() and manually filter the pages of events, but I'm just wondering where my gap in understanding might be.
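
For anyone trying the "drop filter() and filter the pages manually" route, here is a minimal sketch of what that could look like. The EmailEvent shape and the isMissingCampaignData name are assumptions made up for illustration (the real field types aren't shown in this thread), and the page array is a stand-in for what paginate() would return without filter():

```typescript
// Hypothetical document shape: only the fields the filter touches, with guessed types.
interface EmailEvent {
  _creationTime: number;
  executionId?: string;
  templateId?: string;
  templateScheduledFor?: number;
  campaignType?: string;
  campaignTypeId?: string;
  contactId?: string;
}

// The same condition as the filter() call, written as a plain predicate
// (hypothetical helper name).
function isMissingCampaignData(event: EmailEvent): boolean {
  return (
    event.executionId !== undefined &&
    (event.templateId === undefined ||
      event.templateScheduledFor === undefined ||
      event.campaignType === undefined ||
      event.campaignTypeId === undefined ||
      event.contactId === undefined)
  );
}

// Stand-in for one page returned by paginate() when no filter() is applied.
const page: EmailEvent[] = [
  {
    _creationTime: 1761955200001,
    executionId: "e1",
    templateId: "t1",
    templateScheduledFor: 1,
    campaignType: "c",
    campaignTypeId: "ci",
    contactId: "k",
  },
  { _creationTime: 1761955200002, executionId: "e2" },
  { _creationTime: 1761955200003 },
];

// Filter the page in code after it has been read.
const matches = page.filter(isMissingCampaignData);
```

One trade-off with this approach: each page reads at most numItems documents, but after filtering a page can contain fewer matches than numItems (possibly none), so the caller has to keep following continueCursor until isDone.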

stray sierra

I think I figured this one out – and learned something new in the process!

  • using withIndex limits the total number of documents scanned
  • using take() or paginate() returns up to "numItems" documents from those

However, using filter() on top of this means that in order to return "numItems", it has to keep scanning the documents in the index range until enough of them pass the filter. If few or none match, it ends up scanning all of them. In my case, this was something like 10k documents.

My misunderstanding was that I thought the process would go:

  • withIndex returns numItems, THEN filter is applied (limiting the number of docs read by numItems)

But in reality, it's:

  • withIndex selects ALL the docs in the index range, and filter is then checked against each one as it's read, until the requested numItems of documents have matched.
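
To make the difference concrete, here is a toy in-memory model of the two cases. This is not Convex's implementation, just a sketch of the behavior described above under the assumption that the filter is checked as documents are read; scanRange, Doc, and ScanResult are made-up names:

```typescript
// Toy model only: counts how many documents get read from an index range.
interface Doc {
  _creationTime: number;
  executionId?: string;
}

interface ScanResult {
  page: Doc[];
  docsRead: number;
}

// Walk the documents in index order, check `predicate` on each one read,
// and stop as soon as `numItems` documents have passed.
function scanRange(
  docs: Doc[],
  lowerBound: number,
  predicate: (doc: Doc) => boolean,
  numItems: number
): ScanResult {
  const page: Doc[] = [];
  let docsRead = 0;
  for (const doc of docs) {
    if (doc._creationTime <= lowerBound) continue; // outside the index range
    if (page.length >= numItems) break; // enough matches, stop reading
    docsRead++;
    if (predicate(doc)) page.push(doc);
  }
  return { page, docsRead };
}

// 10,000 documents in range; only the very last one passes the filter.
const docs: Doc[] = Array.from({ length: 10000 }, (_, i): Doc =>
  i === 9999
    ? { _creationTime: i + 1, executionId: "match" }
    : { _creationTime: i + 1 }
);

// No filter: the first document read already satisfies numItems = 1.
const withoutFilter = scanRange(docs, 0, () => true, 1);

// With a filter: every document is read before one finally matches.
const withFilter = scanRange(docs, 0, (d) => d.executionId !== undefined, 1);

console.log(withoutFilter.docsRead, withFilter.docsRead);
```

In this model the unfiltered scan reads a single document while the filtered one reads the whole range, which matches what I saw in the dashboard log.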

The docs do say this but it's easy to miss (https://docs.convex.dev/database/reading-data/indexes/):