#Uncaught Error: Too many documents read in a single function execution (limit: 32000).

modest bronze
#

I have a paginated query that uses a stream. My table has 4600 entries (which is very few). I use self-hosted Convex.
The paginated query fails with

Uncaught Error: Too many documents read in a single function execution (limit: 32000). Consider using smaller limits in your queries, paginating your queries, or using indexed queries with a selective index range expressions.
vapid venture BOT
#

Hi @modest bronze. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

That error means the single function execution (your paginated query) is reading more than Convex’s per-function read limits, even if your table has “only” 4600 rows.

A few key points from the docs and threads:

  • Convex enforces per-function limits on reads and scans (documents scanned, bytes read, etc.).[Txn limits; Read/write limits]
  • With filters or streams, you can easily scan far more documents than you think: documents that are filtered out still count as scanned.[Filtering perf; Queries that scale 1]
  • stream(...).query(...).paginate(...) does not have all the same protections as plain ctx.db.query().paginate(...); in particular, Convex can’t detect some “data-dependent” definitions and restart pagination, and you can end up scanning large ranges repeatedly.[Merging streams]

So it’s very possible for a paginated query that returns only, say, 50 docs to still scan more than 32,000 docs total across the stream’s internal queries.
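
To make the "scanned versus returned" distinction concrete, here is a minimal, self-contained TypeScript model. It is not Convex code: `paginateWithPostFilter`, `Doc`, and the `docsRead` counter are hypothetical names that only imitate how a scan followed by a filter might be charged, assuming every scanned document counts as one read.

```typescript
// Toy model (not the Convex API): a scan that filters AFTER reading each document.
type Doc = { id: number; text: string };

function paginateWithPostFilter(
  table: Doc[],
  matches: (doc: Doc) => boolean,
  numItems: number,
): { page: Doc[]; docsRead: number } {
  const page: Doc[] = [];
  let docsRead = 0;
  for (const doc of table) {
    if (page.length >= numItems) break; // page is full, stop scanning
    docsRead += 1; // every scanned document counts, even if it is filtered out
    if (matches(doc)) page.push(doc);
  }
  return { page, docsRead };
}

// 4600 documents, of which only every 1000th contains the search term.
const table: Doc[] = Array.from({ length: 4600 }, (_, i) => ({
  id: i,
  text: i % 1000 === 0 ? "needle" : "hay",
}));

// Ask for a page of 50 matches; the filter is so selective that the page never fills.
const result = paginateWithPostFilter(table, (d) => d.text.includes("needle"), 50);
console.log(result.page.length, result.docsRead);
```

In this model the page holds only a handful of matches, yet the whole table is scanned because the page never fills up. Any per-row lookup applied before the filter (for example a join that fetches a related document for each row) would presumably add further reads on top of that.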

From the Discord thread you pasted, Convex staff suggested streams specifically when a simple indexed query was scanning too much data for an ordered listing; in your case you’re already using streams and now hitting the global read limit, so you need to reduce per-function work.[Ordered listings Discord]

Concrete things to try, all consistent with the docs:

  1. Make each underlying query more selective via indexes

    Ensure every stream(ctx.db, schema).query("X") you’re using has a withIndex(..., q => ...) with a range expression that significantly narrows the index range, not just a bare withIndex("idxName") or a .filter(...) after the fact.[Queries that scale 1; Best‑practices DB]

  2. Reduce how much work happens inside a single paginated query

    Per the limits, you must keep each query/mutation light.[Read/write limits; Queries that scale 2]

    Patterns mentioned in the docs and Discord:

    • Use smaller page sizes (numItems) so each call scans fewer documents.
    • If you’re doing “bulk” style processing, use recursive scheduled functions or an action that calls many smaller mutations/queries, each operating on just one page.[Too many reads Discord]

  3. Be aware of stream pagination caveats

    The streams article calls out that:

    • Pagination cursors must be reused with the same stream definition.
    • Streams don’t automatically give you the “fully reactive” guarantees of ctx.db.query().paginate, and holes/overlaps can appear unless you pass continueCursor back as endCursor and effectively re-run queries.[Merging streams]

    If your stream definition is data-dependent or you’re re-running pages in ways that re-scan earlier ranges, this can multiply the number of docs scanned in one function execution.
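
The "one page per call" pattern mentioned above can be sketched independently of Convex. This is a rough illustration, not the library's implementation: `forEachPage`, `FetchPage`, and `fakeFetch` are hypothetical names, and the `Page` shape only loosely mirrors a paginated result with `page`, `isDone`, and `continueCursor`.

```typescript
// Hypothetical shapes, loosely modeled on a paginated result.
type Page<T> = { page: T[]; isDone: boolean; continueCursor: string };
type FetchPage<T> = (cursor: string | null, numItems: number) => Promise<Page<T>>;

// Drives many small calls instead of one large one; each call handles a single page.
async function forEachPage<T>(
  fetchPage: FetchPage<T>,
  numItems: number,
  handle: (items: T[]) => Promise<void> | void,
): Promise<number> {
  let cursor: string | null = null;
  let calls = 0;
  for (;;) {
    const result = await fetchPage(cursor, numItems);
    calls += 1;
    await handle(result.page);
    if (result.isDone) return calls; // number of separate calls that were needed
    cursor = result.continueCursor; // resume where the previous page ended
  }
}

// In-memory stand-in for a paginated query, so the sketch runs on its own.
const rows = Array.from({ length: 230 }, (_, i) => i);
const fakeFetch: FetchPage<number> = async (cursor, numItems) => {
  const start = cursor === null ? 0 : Number(cursor);
  const end = Math.min(start + numItems, rows.length);
  return {
    page: rows.slice(start, end),
    isDone: end >= rows.length,
    continueCursor: String(end),
  };
};

forEachPage(fakeFetch, 100, (items) => console.log(`handled ${items.length} rows`));
```

In a Convex action, `fetchPage` would presumably wrap something like `ctx.runQuery` with `paginationOpts: { numItems, cursor }`, so that each page is read in its own function execution and counted against its own limit.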

#

Because I don’t see your exact query/stream code in the sources, I can’t point to a specific line, but based on the docs the fix is:

  • tighten your .withIndex(..., q => ...) ranges,
  • avoid .filter(...) on large tables whenever possible (push constraints into indexes),
  • and ensure each single function execution (each paginated call) processes only a modest slice of data.
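
The difference between an index range and a filter after the fact can be illustrated with a small model. This is a sketch under simplifying assumptions, not Convex internals: a sorted array stands in for an index on a hypothetical `title` field, and `rangeScan`, `filterScan`, and `docsRead` are made-up names.

```typescript
// Toy model (not the Convex API): a sorted array stands in for an index on `title`.
type Row = { title: string };

// First position whose title is >= key (binary search; probes are not counted as reads here).
function lowerBound(index: Row[], key: string): number {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].title < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range scan: only rows with lower <= title < upper are read.
function rangeScan(index: Row[], lower: string, upper: string) {
  const rows: Row[] = [];
  let docsRead = 0;
  for (let i = lowerBound(index, lower); i < index.length && index[i].title < upper; i++) {
    docsRead += 1;
    rows.push(index[i]);
  }
  return { rows, docsRead };
}

// Full scan plus filter: every row is read, whether or not it matches.
function filterScan(index: Row[], keep: (row: Row) => boolean) {
  return { rows: index.filter(keep), docsRead: index.length };
}

// Zero-padded titles, so the array is already sorted by title.
const index: Row[] = Array.from({ length: 4600 }, (_, i) => ({
  title: `report-${String(i).padStart(4, "0")}`,
}));

// Same result set (titles starting with "report-12"), very different read counts.
const viaRange = rangeScan(index, "report-12", "report-13");
const viaFilter = filterScan(index, (r) => r.title.startsWith("report-12"));
console.log(viaRange.docsRead, viaFilter.docsRead);
```

In Convex terms, the range scan would correspond to something like `.withIndex("title", q => q.gte("title", lower).lt("title", upper))`. Note that this only works for constraints an ordered index can express, such as equality or a prefix; a substring match anywhere in the text cannot be written as an index range.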

If you can share the specific query + stream definition, I can map these recommendations more directly to your code using the same patterns as in the linked examples.

modest bronze
#

But how is that even possible? Here is my setup. I have a table called "report".

report: defineTable({
  title: v.optional(v.string()),
  content: v.string(),
  content_summarized: v.string(),
})
  .index("title", ["title"])
  .index("content", ["content"])
  .index("content_summarized", ["content_summarized"])

As you can see, I have indexes. But I cannot use them in my streamed query, because I apply a custom filter that does a string comparison. I also do a join so that I can merge in another stream. The function searchReportByFilter is the one that causes trouble.

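// Import paths assume the default Convex project layout.
import { paginationOptsValidator } from "convex/server";
import { v } from "convex/values";
import { stream } from "convex-helpers/server/stream";
import { query } from "./_generated/server";
import schema from "./schema";

export const searchReportByFilter = query({
  args: {
    paginationOpts: paginationOptsValidator,
    _term: v.optional(v.string()),
    title: v.optional(v.string()),
    // Read by the handler below but missing from the args as posted.
    sorting: v.optional(v.string()),
  },
  handler: async (ctx, args) => {
    // Set up the report stream: walks the table in creation order, with no index range.
    const reportsStream = stream(ctx.db, schema)
      .query("report")
      .withIndex("by_creation_time")
      .order(args.sorting === "oldest" ? "asc" : "desc");

    // "Join" users: one extra ctx.db.get for every report the stream scans.
    // (report_from is not part of the schema excerpt above.)
    const reportsWithUsersStream = reportsStream.map(async report => {
      const user = (await ctx.db.get(report.report_from)) || null;
      // The posted code also returned `organisation`, which is never defined.
      return { report, user };
    });

    // Start from a pass-through filter, then filter by the search term (_term).
    let reportsFiltered = reportsWithUsersStream.filterWith(async () => true);

    reportsFiltered = reportsFiltered.filterWith(async ({ report }) => {
      if (args._term == null || args._term.trim() === "") {
        return true;
      }
      // Case-insensitive substring match across all three text fields.
      const searchIn = (
        (report.title ?? "") +
        " " +
        (report.content ?? "") +
        " " +
        (report.content_summarized ?? "")
      ).toLowerCase();
      return searchIn.includes(args._term.toLowerCase());
    });

    // Apply the pagination options.
    return await reportsFiltered.paginate(args.paginationOpts);
  }
});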

How can I make use of the indexes in this case? And how can I avoid loading so many documents?