#Convex range queries


sly acorn
#

What is the recommended approach to implementing a virtual file system in Convex?
Our main approach right now is to store paths as strings like "//a/b/c". However, this has issues: full-text search does approximate matching, so we are not able to, for example, list all files under "//a/".
What are our options other than running multiple queries and building the tree from the parentIds of the folders/files?

near auroraBOT
#

Hi @sly acorn. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

There isn’t anything in the provided sources that directly describes a “virtual filesystem” pattern for Convex, so I can’t point you to an official, prescribed schema. I can only infer from the general guidance on indexes and search.

A few concrete points from the docs that apply:

  1. Full‑text search is the wrong tool for path hierarchies

    Full‑text search is designed for approximate keyword matching inside strings, not for prefix/range queries on structured paths. It will happily match "//a/b/c" when you search for "a", and you can’t express “all documents whose path starts with //a/” as an efficient search index query. [Full text search; Search behavior]

    So using a single path: "//a/b/c" string and a search index is expected to give you the problems you’re seeing.

  2. Convex range queries work over indexes, not search

    Range queries in Convex are expressed via .withIndex over a normal index, with chained .eq / .gt / .lt on the indexed fields. [Indexes intro; Index ranges; IndexRangeBuilder]

    Example from the docs:

    const messages = await ctx.db
      .query("messages")
      .withIndex("by_channel", (q) =>
        q
          .eq("channel", channel)
          .gt("_creationTime", Date.now() - 2 * 60000)
          .lt("_creationTime", Date.now() - 60000),
      )
      .collect();
    
#

Query performance depends on how specific the index range is: the narrower the range, the fewer documents are scanned. [Index ranges]

  3. What this implies for a filesystem‑like model

    Since I don’t have a Convex‑authored “filesystem schema” to quote, I can’t give you an official pattern, but based on the index docs:

    • Convex is optimized for indexed equality + range on structured fields, not for arbitrary prefix matching on a single string.
    • The docs explicitly recommend using indexes and/or pagination for any large table. [Best practices – database]

    That means your two main levers are:

    1. Model hierarchy explicitly (e.g. parentId, name, maybe type: "file" | "dir"), and index on those fields so you can do:
      // Pseudocode pattern, not from docs; assumes a "nodes" table with a "by_parent" index on parentId:
      const children = await ctx.db
        .query("nodes")
        .withIndex("by_parent", (q) => q.eq("parentId", someId))
        .collect();
      
      This is exactly the “multiple queries + parentId tree” approach you mentioned. It aligns with the general guidance to “use tables to separate logical object types” and query via ids and indexes. [Best practices – database; Reading data]
#
    2. Use an indexable path representation if you really want to keep a path field:

      • Convex indexes are lexicographically ordered on the full field value, so a prefix query has to be written as a range such as q.gte("path", prefix).lt("path", prefixEnd). That only works if you can compute a suitable upper bound prefixEnd. The docs show this range pattern for time ranges, but not for string prefixes. [Index ranges; IndexRangeBuilder]
      • The knowledge sources don’t show an example of using string prefix ranges on an index, so I can’t say with confidence that “path prefix” queries are officially supported or recommended.
  4. ConvexFS component

    There is a community component, ConvexFS, described as a “virtual filesystem for Convex, backed by Bunny.net Edge Storage & CDN.” [ConvexFS]
    However, the provided snippet only describes its high‑level features (paths, files, directories, reference‑counted blobs, etc.) and links to external docs; it does not include its internal schema or how it structures paths in Convex, so I can’t reuse its implementation details here.
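
Going back to the path-prefix range idea from the "indexable path representation" point: here is a minimal sketch of how an upper bound for a prefix could be computed. It is not from the Convex docs. The helper names prefixUpperBound and listUnder are hypothetical, the sorted array is only an in-memory stand-in for an index on path, and it assumes paths are plain ASCII and that the index orders strings the same way JavaScript's < operator does.

```typescript
// Hypothetical helper (an assumption, not a Convex API): returns the smallest
// string that sorts after every string starting with `prefix`, or null if no
// finite bound exists. Assumes code-unit (lexicographic) string ordering.
function prefixUpperBound(prefix: string): string | null {
  for (let i = prefix.length - 1; i >= 0; i--) {
    const code = prefix.charCodeAt(i);
    if (code < 0xffff) {
      // Bump the last character that can be bumped and drop everything after it.
      return prefix.slice(0, i) + String.fromCharCode(code + 1);
    }
  }
  return null; // empty prefix, or every character is already the maximum
}

// In-memory stand-in for a range scan over an index on `path`:
// keep entries with prefix <= path < upperBound.
function listUnder(sortedPaths: string[], prefix: string): string[] {
  const upper = prefixUpperBound(prefix);
  return sortedPaths.filter(
    (p) => p >= prefix && (upper === null || p < upper),
  );
}

const paths = ["//a/", "//a/b", "//a/b/c", "//ab/x", "//b/y"].sort();
const underA = listUnder(paths, "//a/");
console.log(underA); // "//ab/x" and "//b/y" fall outside the range
```

If this ordering assumption holds for your data, the same two bounds would be what you pass to q.gte("path", prefix).lt("path", upper) on a hypothetical by_path index. Note that the trailing "/" in the prefix matters: without it, "//a" would also match "//ab/x".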


Given the limits of the sources:

#
  • I can’t point you to a built‑in Convex feature that lets you do efficient “all files under //a/” queries on a single string field without either:
    • explicit hierarchical modeling (e.g. parentId), or
    • some custom path‑encoding plus a normal index.
  • The only clearly documented, scalable pattern is: use normal indexes on structured fields and design your queries around them; for a filesystem, that naturally pushes you toward a parentId (and/or name)–based model with indexed queries per directory. [Indexes intro; Best practices – database]
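
To make the parentId-based model above more concrete, here is a rough in-memory sketch, not taken from the docs. The FsNode shape, its field names, and the helpers listChildren, resolvePath, and listDescendants are all assumptions for illustration. The Map stands in for an index on parentId, so each listChildren call corresponds to one indexed query per directory.

```typescript
// Hypothetical node shape for an explicit hierarchy; field names are assumptions.
type FsNode = {
  id: string;
  parentId: string | null; // null marks a root-level entry
  name: string;
  type: "file" | "dir";
};

const nodes: FsNode[] = [
  { id: "1", parentId: null, name: "a", type: "dir" },
  { id: "2", parentId: "1", name: "b", type: "dir" },
  { id: "3", parentId: "2", name: "c", type: "file" },
  { id: "4", parentId: "1", name: "notes.txt", type: "file" },
];

// In-memory stand-in for an index on parentId.
const byParent = new Map<string | null, FsNode[]>();
for (const node of nodes) {
  const siblings = byParent.get(node.parentId) ?? [];
  siblings.push(node);
  byParent.set(node.parentId, siblings);
}

// Direct children of one directory (one indexed lookup).
function listChildren(parentId: string | null): FsNode[] {
  return byParent.get(parentId) ?? [];
}

// Resolve a path like "//a/b/c" by walking one segment at a time from the root.
function resolvePath(path: string): FsNode | null {
  let current: FsNode | null = null;
  for (const segment of path.split("/").filter((s) => s.length > 0)) {
    const parentId: string | null = current ? current.id : null;
    const next = listChildren(parentId).find((n) => n.name === segment);
    if (!next) return null;
    current = next;
  }
  return current;
}

// Every descendant of a directory, via a breadth-first walk.
function listDescendants(rootId: string): FsNode[] {
  const result: FsNode[] = [];
  const queue: string[] = [rootId];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    for (const child of listChildren(id)) {
      result.push(child);
      if (child.type === "dir") queue.push(child.id);
    }
  }
  return result;
}

const dirA = resolvePath("//a/");
const allUnderA = dirA ? listDescendants(dirA.id).map((n) => n.name) : [];
console.log(allUnderA);
```

The trade-off is the one you already noticed: listing everything under "//a/" costs one lookup per directory visited, so deep or wide trees mean many queries, while listing a single directory stays cheap.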

If you’re open to external code, ConvexFS’s docs/repo (linked from the component page) are likely the best reference for a production‑grade virtual filesystem on Convex, but they’re outside the provided knowledge set.
