#f

12 messages · Page 1 of 1 (latest)

covert jay
#

files: defineTable({
articleId: v.optional(v.id("articles")),
name: v.string(),
contentType: v.string(),
sizeBytes: v.number(),
isPrivate: v.boolean(),
isUploaded: v.boolean(),
createdAt: v.number(),
updatedAt: v.number(),
})
.index("by_is_uploaded_created_at", ["isUploaded", "createdAt"])
.index("by_article_id_created_at", ["articleId", "createdAt"]),
can i just use article_id for the index by_article_id_created_at

unreal moonBOT
#

Hi @covert jay. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

You can define an index on just articleId if that’s all you need:

files: defineTable({
  articleId: v.optional(v.id("articles")),
  name: v.string(),
  contentType: v.string(),
  sizeBytes: v.number(),
  isPrivate: v.boolean(),
  isUploaded: v.boolean(),
  createdAt: v.number(),
  updatedAt: v.number(),
})
  .index("by_article_id", ["articleId"]);

However, this changes what you can do efficiently:

  • With ["articleId", "createdAt"], you can efficiently query by articleId and get results ordered by createdAt (because index order is articleId, then createdAt, then _creationTime). [Sorting indexes; Querying with indexes]
  • With just ["articleId"], you can still query by articleId, but you lose the ability to efficiently constrain or sort by createdAt within that index.

So:

  • If you only ever need “all files for this article” and don’t care about createdAt ordering or ranges, ["articleId"] is enough.
  • If you need “files for this article ordered by createdAt” or “files for this article in a createdAt range”, keep ["articleId", "createdAt"].

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@covert jay Kapa.ai is still learning and improving, please let me know how I did by reacting below.

covert jay
#

@unreal moon the index is used fgor this:
export const getOrphanedFiles = internalQuery({
args: {},
handler: async (ctx) => {
const TAKE_COUNT = 100;
const ONE_HOUR_MS = 1000 * 60 * 60;
const cutoffTimeMs = Date.now() - ONE_HOUR_MS;
const orphanedFilesByIsUploaded = await ctx.db
.query("files")
.withIndex("by_is_uploaded_created_at", q =>
q.eq("isUploaded", false).lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
const orphanedFilesByArticleId = await ctx.db
.query("files")
.withIndex("by_article_id_created_at", q =>
q.eq("articleId", undefined).lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
return [...orphanedFilesByIsUploaded, ...orphanedFilesByArticleId];
},
});

but for use cases where i want to get all files associated with a paper can i just use by_article_id_created_at for it?

unreal moonBOT
#

Let me check the knowledge base for you again. One moment! :books:

#

Yes, you can (and should) reuse by_article_id_created_at for “all files for this article” queries.

A multi-field index can be used with only a prefix of its fields. That’s exactly the recommended pattern: instead of having both by_articleId and by_article_id_created_at, you keep just the compound index and sometimes only constrain on the first field. [Best practices]

So for “all files associated with a paper”, you can do:

const filesForArticle = await ctx.db
  .query("files")
  .withIndex("by_article_id_created_at", q =>
    q.eq("articleId", articleId)
  )
  .collect();

This will:

  • Efficiently use the index (range is all rows with that articleId).
  • Return them ordered by createdAt (then _creationTime) because that’s the index order. [Querying with indexes]

You don’t need a separate by_article_id index unless you specifically need ordering by articleId then _creationTime (without createdAt in between), which is a rarer case.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

#

@covert jay Kapa.ai is still learning and improving, please let me know how I did by reacting below.

covert jay
#

@unreal moon export const getOrphanedFiles = internalQuery({
args: {},
handler: async (ctx) => {
const TAKE_COUNT = 100;
const ONE_HOUR_MS = 1000 * 60 * 60;
const cutoffTimeMs = Date.now() - ONE_HOUR_MS;
const orphanedFilesByIsUploaded = await ctx.db
.query("files")
.withIndex("by_article_id_is_uploaded_created_at", q =>
q.eq("articleId", undefined).eq("isUploaded", false).lte("createdAt", cutoffTimeMs)
)
.take(TAKE_COUNT);
return orphanedFilesByIsUploaded;
},
});

how do i check if it matches either of the first two condititon and past 1 hr

unreal moonBOT
#

Let me check the knowledge base for you again. One moment! :books:

#

You can’t express an OR across index fields inside a single withIndex range; Convex index ranges must step through fields in order with eq on a prefix and then optional bounds on the next field. [Querying indexes]

So with an index:

.index("by_article_id_is_uploaded_created_at", [
  "articleId",
  "isUploaded",
  "createdAt",
])

this is valid:

.withIndex("by_article_id_is_uploaded_created_at", q =>
  q
    .eq("articleId", undefined)
    .eq("isUploaded", false)
    .lte("createdAt", cutoffTimeMs)
)

but you cannot say “(articleId is undefined OR isUploaded is false) AND createdAt <= cutoff” in a single indexed range.

To implement:

matches either of the first two conditions and past 1 hr

you need two indexed queries and then merge the results:

export const getOrphanedFiles = internalQuery({
  args: {},
  handler: async (ctx) => {
    const TAKE_COUNT = 100;
    const ONE_HOUR_MS = 1000 * 60 * 60;
    const cutoffTimeMs = Date.now() - ONE_HOUR_MS;

    // 1. articleId is undefined AND older than 1h
    const byArticleId = await ctx.db
      .query("files")
      .withIndex("by_article_id_is_uploaded_created_at", q =>
        q
          .eq("articleId", undefined)
          .lte("createdAt", cutoffTimeMs)
      )
      .take(TAKE_COUNT);

    // 2. isUploaded is false AND older than 1h
    const byIsUploaded = await ctx.db
      .query("files")
      .withIndex("by_is_uploaded_created_at", q =>
        q
          .eq("isUploaded", false)
          .lte("createdAt", cutoffTimeMs)
      )
      .take(TAKE_COUNT);

    // 3. Union + dedupe (if needed)
    const all = [...byArticleId, ...byIsUploaded];
    const deduped = Object.values(
      Object.fromEntries(all.map(doc => [doc._id, doc]))
    );
#

return deduped;
},
});


This follows the pattern “no efficient `IN` / `OR` in a single index query → do multiple queries and union in code.” [[No efficient in/or](<https://stack.convex.dev/databases-are-spreadsheets#no-efficient-in--or>); [Union pattern](<https://stack.convex.dev/translate-sql-into-convex-queries#union>)]
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------