files: defineTable({
articleId: v.optional(v.id("articles")),
name: v.string(),
contentType: v.string(),
sizeBytes: v.number(),
isPrivate: v.boolean(),
isUploaded: v.boolean(),
createdAt: v.number(),
updatedAt: v.number(),
})
.index("by_is_uploaded_created_at", ["isUploaded", "createdAt"])
.index("by_article_id_created_at", ["articleId", "createdAt"]),
can i just use article_id for the index by_article_id_created_at
#f
12 messages · Page 1 of 1 (latest)
Hi @covert jay. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
You can define an index on just articleId if that’s all you need:
files: defineTable({
articleId: v.optional(v.id("articles")),
name: v.string(),
contentType: v.string(),
sizeBytes: v.number(),
isPrivate: v.boolean(),
isUploaded: v.boolean(),
createdAt: v.number(),
updatedAt: v.number(),
})
.index("by_article_id", ["articleId"]);
However, this changes what you can do efficiently:
- With
["articleId", "createdAt"], you can efficiently query byarticleIdand get results ordered bycreatedAt(because index order isarticleId, thencreatedAt, then_creationTime). [Sorting indexes; Querying with indexes] - With just
["articleId"], you can still query byarticleId, but you lose the ability to efficiently constrain or sort bycreatedAtwithin that index.
So:
- If you only ever need “all files for this article” and don’t care about
createdAtordering or ranges,["articleId"]is enough. - If you need “files for this article ordered by createdAt” or “files for this article in a createdAt range”, keep
["articleId", "createdAt"].
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@covert jay Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@unreal moon the index is used fgor this:
export const getOrphanedFiles = internalQuery({
args: {},
handler: async (ctx) => {
const TAKE_COUNT = 100;
const ONE_HOUR_MS = 1000 * 60 * 60;
const cutoffTimeMs = Date.now() - ONE_HOUR_MS;
const orphanedFilesByIsUploaded = await ctx.db
.query("files")
.withIndex("by_is_uploaded_created_at", q =>
q.eq("isUploaded", false).lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
const orphanedFilesByArticleId = await ctx.db
.query("files")
.withIndex("by_article_id_created_at", q =>
q.eq("articleId", undefined).lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
return [...orphanedFilesByIsUploaded, ...orphanedFilesByArticleId];
},
});
but for use cases where i want to get all files associated with a paper can i just use by_article_id_created_at for it?
Let me check the knowledge base for you again. One moment! :books:
Yes, you can (and should) reuse by_article_id_created_at for “all files for this article” queries.
A multi-field index can be used with only a prefix of its fields. That’s exactly the recommended pattern: instead of having both by_articleId and by_article_id_created_at, you keep just the compound index and sometimes only constrain on the first field. [Best practices]
So for “all files associated with a paper”, you can do:
const filesForArticle = await ctx.db
.query("files")
.withIndex("by_article_id_created_at", q =>
q.eq("articleId", articleId)
)
.collect();
This will:
- Efficiently use the index (range is all rows with that
articleId). - Return them ordered by
createdAt(then_creationTime) because that’s the index order. [Querying with indexes]
You don’t need a separate by_article_id index unless you specifically need ordering by articleId then _creationTime (without createdAt in between), which is a rarer case.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@covert jay Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@unreal moon export const getOrphanedFiles = internalQuery({
args: {},
handler: async (ctx) => {
const TAKE_COUNT = 100;
const ONE_HOUR_MS = 1000 * 60 * 60;
const cutoffTimeMs = Date.now() - ONE_HOUR_MS;
const orphanedFilesByIsUploaded = await ctx.db
.query("files")
.withIndex("by_article_id_is_uploaded_created_at", q =>
q.eq("articleId", undefined).eq("isUploaded", false).lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
return orphanedFilesByIsUploaded;
},
});
how do i check if it matches either of the first two condititon and past 1 hr
Let me check the knowledge base for you again. One moment! :books:
You can’t express an OR across index fields inside a single withIndex range; Convex index ranges must step through fields in order with eq on a prefix and then optional bounds on the next field. [Querying indexes]
So with an index:
.index("by_article_id_is_uploaded_created_at", [
"articleId",
"isUploaded",
"createdAt",
])
this is valid:
.withIndex("by_article_id_is_uploaded_created_at", q =>
q
.eq("articleId", undefined)
.eq("isUploaded", false)
.lte("createdAt", cutoffTimeMs)
)
but you cannot say “(articleId is undefined OR isUploaded is false) AND createdAt <= cutoff” in a single indexed range.
To implement:
matches either of the first two conditions and past 1 hr
you need two indexed queries and then merge the results:
export const getOrphanedFiles = internalQuery({
args: {},
handler: async (ctx) => {
const TAKE_COUNT = 100;
const ONE_HOUR_MS = 1000 * 60 * 60;
const cutoffTimeMs = Date.now() - ONE_HOUR_MS;
// 1. articleId is undefined AND older than 1h
const byArticleId = await ctx.db
.query("files")
.withIndex("by_article_id_is_uploaded_created_at", q =>
q
.eq("articleId", undefined)
.lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
// 2. isUploaded is false AND older than 1h
const byIsUploaded = await ctx.db
.query("files")
.withIndex("by_is_uploaded_created_at", q =>
q
.eq("isUploaded", false)
.lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
// 3. Union + dedupe (if needed)
const all = [...byArticleId, ...byIsUploaded];
const deduped = Object.values(
Object.fromEntries(all.map(doc => [doc._id, doc]))
);
return deduped;
},
});
This follows the pattern “no efficient `IN` / `OR` in a single index query → do multiple queries and union in code.” [[No efficient in/or](<https://stack.convex.dev/databases-are-spreadsheets#no-efficient-in--or>); [Union pattern](<https://stack.convex.dev/translate-sql-into-convex-queries#union>)]
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------