What is the recommended approach to implementing a virtual file system in Convex?
Our main approach right now is storing paths as strings like "//a/b/c". However, this has issues: full text search does approximate matching, so we are not able to, for example, list all files under "//a/".
What are our options other than doing multiple queries and building the tree from the parentId of each folder/file?
#Convex range queries
Hi @sly acorn. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
There isn’t anything in the provided sources that directly describes a “virtual filesystem” pattern for Convex, so I can’t point you to an official, prescribed schema. I can only infer from the general guidance on indexes and search.
A few concrete points from the docs that apply:
- Full‑text search is the wrong tool for path hierarchies
Full‑text search is designed for approximate keyword matching inside strings, not for prefix/range queries on structured paths. It will happily match "//a/b/c" when you search for "a", and you can’t express “all documents whose path starts with //a/” as an efficient search index query. [Full text search; Search behavior]
So using a single path: "//a/b/c" string with a search index is expected to give you the problems you’re seeing.
- Convex range queries work over indexes, not search
Range queries in Convex are expressed via .withIndex over a normal index, with chained .eq/.gt/.lt on the indexed fields. [Indexes intro; Index ranges; IndexRangeBuilder]
Example from the docs:

    const messages = await ctx.db
      .query("messages")
      .withIndex("by_channel", (q) =>
        q
          .eq("channel", channel)
          .gt("_creationTime", Date.now() - 2 * 60000)
          .lt("_creationTime", Date.now() - 60000),
      )
      .collect();

The performance depends on how specific that range is. [Index ranges]
- What this implies for a filesystem‑like model
Since I don’t have a Convex‑authored “filesystem schema” to quote, I can’t give you an official pattern, but based on the index docs:
- Convex is optimized for indexed equality + range on structured fields, not for arbitrary prefix matching on a single string.
- The docs explicitly recommend using indexes and/or pagination for any large table. [Best practices – database]
That means your two main levers are:
- Model hierarchy explicitly (e.g. parentId, name, maybe type: "file" | "dir"), and index on those fields so you can do:

    // Pseudocode pattern, not from docs:
    const children = await ctx.db
      .query("nodes")
      .withIndex("by_parent", (q) => q.eq("parentId", someId))
      .collect();

This is exactly the “multiple queries + parentId tree” approach you mentioned. It aligns with the general guidance to “use tables to separate logical object types” and query via ids and indexes. [Best practices – database; Reading data]
- Use an indexable path representation if you really want to keep a path field:
  - Convex indexes are lexicographically ordered on the full field value; you can only express ranges like gte(path, prefix) / lt(path, prefixEnd) if you can compute a suitable upper bound. The docs show this pattern for time ranges, but not for string prefixes. [Index ranges; IndexRangeBuilder]
  - The knowledge sources don’t show an example of using string prefix ranges on an index, so I can’t say with confidence that “path prefix” queries are officially supported or recommended.
- ConvexFS component
There is a community component, ConvexFS, described as a “virtual filesystem for Convex, backed by Bunny.net Edge Storage & CDN.” [ConvexFS]
However, the provided snippet only describes its high‑level features (paths, files, directories, reference‑counted blobs, etc.) and links to external docs; it does not include its internal schema or how it structures paths in Convex, so I can’t reuse its implementation details here.
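For the "indexable path representation" option above, the hard part is computing the exclusive upper bound (the prefixEnd mentioned earlier). The following is a minimal sketch of that computation, not something taken from the Convex docs. It assumes paths are plain ASCII-like strings and that the index orders strings the same way JavaScript compares them; neither assumption is confirmed by the sources. The names prefixUpperBound and listUnder are hypothetical, and listUnder uses an in-memory array as a stand-in for an index on path.

```typescript
// Hypothetical helper: returns the smallest string that sorts after every
// string starting with `prefix`, assuming plain code-unit ordering.
// Returns null when no finite bound exists (empty prefix, or all 0xFFFF).
function prefixUpperBound(prefix: string): string | null {
  for (let i = prefix.length - 1; i >= 0; i--) {
    const code = prefix.charCodeAt(i);
    if (code < 0xffff) {
      // Bump the last "bumpable" code unit and drop everything after it.
      return prefix.slice(0, i) + String.fromCharCode(code + 1);
    }
  }
  return null;
}

// In-memory stand-in for a range scan over an index on `path`:
// keeps every path p with prefix <= p < upperBound.
function listUnder(paths: string[], prefix: string): string[] {
  const upper = prefixUpperBound(prefix);
  return paths.filter((p) => p >= prefix && (upper === null || p < upper));
}

// Assumption, untested against Convex: with a hypothetical index "by_path"
// on a `path` field, the same range might be expressed as
//   .withIndex("by_path", (q) => q.gte("path", prefix).lt("path", upper))
```

With this sketch, the prefix "//a/" gets the upper bound "//a0", because "0" is the character immediately after "/" in ASCII. That range includes "//a/b" and "//a/b/c" but excludes "//ab" and "//b".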
Given the limits of the sources:
- I can’t point you to a built‑in Convex feature that lets you do efficient “all files under //a/” queries on a single string field without either:
  - explicit hierarchical modeling (e.g. parentId), or
  - some custom path‑encoding plus a normal index.
- The only clearly documented, scalable pattern is: use normal indexes on structured fields and design your queries around them; for a filesystem, that naturally pushes you toward a model based on parentId (and/or name), with indexed queries per directory. [Indexes intro; Best practices – database]
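To make the parentId-based model concrete, here is a minimal in-memory sketch of how path resolution and directory listing could work with one lookup per path segment. It is an illustration under assumptions, not a Convex-authored pattern: the FsNode shape, the NodeStore class, and the idea of a compound (parentId, name) index are all hypothetical, and the Map lookups merely stand in for indexed queries.

```typescript
// Hypothetical node shape, following the parentId / name / type idea above.
type FsNode = {
  id: string;
  parentId: string | null; // null marks a top-level entry
  name: string;
  type: "file" | "dir";
};

// In-memory stand-in for two indexes: (parentId, name) and (parentId).
class NodeStore {
  private byParentAndName = new Map<string, FsNode>();
  private byParent = new Map<string, FsNode[]>();

  insert(node: FsNode): void {
    this.byParentAndName.set(JSON.stringify([node.parentId, node.name]), node);
    const parentKey = JSON.stringify(node.parentId);
    const siblings = this.byParent.get(parentKey) ?? [];
    siblings.push(node);
    this.byParent.set(parentKey, siblings);
  }

  // Resolves "//a/b/c" with one lookup per segment. In Convex, each lookup
  // would presumably be an indexed query with two .eq() conditions
  // (an assumption, not verified here).
  resolve(path: string): FsNode | null {
    const segments = path.split("/").filter((s) => s.length > 0);
    let parentId: string | null = null;
    let found: FsNode | null = null;
    for (const segment of segments) {
      const next = this.byParentAndName.get(JSON.stringify([parentId, segment]));
      if (next === undefined) return null; // a segment is missing
      found = next;
      parentId = next.id;
    }
    return found; // null for an empty path such as "//"
  }

  // Lists the direct children of one directory (a single indexed query).
  list(parentId: string | null): FsNode[] {
    return this.byParent.get(JSON.stringify(parentId)) ?? [];
  }
}
```

Listing everything under "//a/" then becomes: resolve "//a" to its node, call list on that node's id, and recurse into child directories only as deep as the caller needs. The cost is one query per directory visited, which is the trade-off the question describes.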