#Update Document when 24 hours old

short depot
#

Is it possible to update a document as soon as it's 24 hours old? I have an updated_at field which is the date and time that the document was last updated and I want to kickoff a task that will automatically update a document by hitting an external API as soon as the document is older than 24 hours.

cobalt sparrowBOT
#

Hi @short depot. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Yes, you can update a document as soon as it's 24 hours old in Convex. There are a couple of approaches you can use:

Using a Cron Job

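The most straightforward approach is to set up a cron job that runs periodically (e.g., hourly) and checks for documents that need updating. With an hourly schedule, a document may be refreshed up to an hour after it passes the 24-hour mark: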

// convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();

crons.hourly(
  "update old documents",
  { minuteUTC: 0 }, // At the top of every hour
  internal.yourModule.checkAndUpdateOldDocuments
);

export default crons;

Then create an action that checks for documents older than 24 hours and updates them:

import { internalAction } from "./_generated/server";
import { internal } from "./_generated/api";

export const checkAndUpdateOldDocuments = internalAction({
  args: {},
  handler: async (ctx) => {
    // Get documents that need updating (this internal query is yours to define)
    const docsToUpdate = await ctx.runQuery(internal.yourModule.getDocsOlderThan24Hours);

    // For each document, fetch data from the external API and update
    for (const doc of docsToUpdate) {
      // Make your API call
      const response = await fetch("https://your-api-endpoint.com");
      const data = await response.json();

      // Update the document with the API result
      await ctx.runMutation(internal.yourModule.updateDocWithApiResult, {
        documentId: doc._id,
        apiData: data,
        updated_at: Date.now(),
      });
    }
  },
});

This approach appears in the knowledge sources as a common pattern for periodic updates. As one community member noted in a Discord support thread: "Could calculating/populating your aggregates e.g. once every 24 hours work in your situation? That's what we are doing in our project."
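The `getDocsOlderThan24Hours` query is referenced above but not shown. Its core is a cutoff comparison on `updated_at`, which can be sketched in plain TypeScript. The names `staleCutoff`, `selectStale`, and `TimestampedDoc` are hypothetical, and the sketch assumes `updated_at` holds milliseconds as returned by `Date.now()`. In a real Convex query you would likely apply the same comparison through an index on `updated_at` (for example a range such as `q.lt("updated_at", cutoff)` inside `withIndex`) rather than filtering in memory; the index itself is an assumption about your schema.

```typescript
// Hypothetical helpers for illustration; not part of the Convex API.
const DAY_MS = 24 * 60 * 60 * 1000;

interface TimestampedDoc {
  _id: string;
  updated_at: number; // milliseconds since the Unix epoch, as from Date.now()
}

// Documents whose updated_at is strictly below this value are older than 24 hours.
function staleCutoff(now: number): number {
  return now - DAY_MS;
}

// In-memory version of the filter the query would apply.
function selectStale<T extends TimestampedDoc>(docs: T[], now: number): T[] {
  const cutoff = staleCutoff(now);
  return docs.filter((doc) => doc.updated_at < cutoff);
}
```

Passing `now` in as a parameter, instead of calling `Date.now()` inside the helper, keeps the comparison easy to test.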

Scheduling Individual Updates

If you need more precise timing, you could schedule individual updates for each document:

  1. When a document is created or updated, schedule an action to run exactly 24 hours later
  2. The action will make the API call and update the document

A sketch of this approach:

import { mutation } from "./_generated/server";
import { api } from "./_generated/api";

// When a document is created or updated
export const updateDocument = mutation({
  // ... (args are assumed to include documentId)
  handler: async (ctx, args) => {
    // Update the document. ctx.db.patch does not return the ID,
    // so keep using the ID passed in through args.
    await ctx.db.patch(args.documentId, {
      // your updates
      updated_at: Date.now(),
    });

    // Schedule an update in 24 hours
    await ctx.scheduler.runAfter(
      24 * 60 * 60 * 1000, // 24 hours in milliseconds
      api.yourModule.fetchAndUpdateFromExternalApi,
      { documentId: args.documentId }
    );
  },
});

Then create the action that will run after 24 hours:

import { v } from "convex/values";
import { action } from "./_generated/server";
import { api } from "./_generated/api";

export const fetchAndUpdateFromExternalApi = action({
  args: { documentId: v.id("yourTable") },
  handler: async (ctx, args) => {
    // Make your API call
    const response = await fetch("https://your-api-endpoint.com");
    const data = await response.json();

    // Update the document with the API result
    await ctx.runMutation(api.yourModule.updateDocWithApiResult, {
      documentId: args.documentId,
      apiData: data,
      updated_at: Date.now(),
    });
  },
});

This pattern is similar to what's shown in the Discord support thread on Migrations calling HTTP.

Remember that actions are the appropriate function type when making external API calls, while mutations should be used for database updates.
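Two details of per-document scheduling are worth working through: how long to wait, and what happens if the document is updated again before the scheduled job fires. The sketch below is plain TypeScript with hypothetical helper names (`delayUntilStale`, `isStillDue`). It assumes the scheduled job is given the `updated_at` value it was scheduled for, so it can compare that value with the document's current one and skip outdated runs.

```typescript
// Hypothetical helpers for illustration; not part of the Convex API.
const DAY_MS = 24 * 60 * 60 * 1000;

// Milliseconds to wait until a document last updated at `updatedAt` turns 24 hours old.
// Returns 0 when the document is already 24 hours old or older.
function delayUntilStale(updatedAt: number, now: number): number {
  return Math.max(0, updatedAt + DAY_MS - now);
}

// True only if the document has not been updated since the job was scheduled.
// A job scheduled for an older updated_at value can then be skipped.
function isStillDue(scheduledForUpdatedAt: number, currentUpdatedAt: number): boolean {
  return scheduledForUpdatedAt === currentUpdatedAt;
}
```

The result of `delayUntilStale` could serve as the delay argument to `ctx.scheduler.runAfter`, and `isStillDue` could guard the start of the scheduled function so that only the most recently scheduled job does any work.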

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

short depot
#

@cobalt sparrow I have 400k documents that will need to be updated every 24 hours, how do I schedule this?

cobalt sparrowBOT
#

Let me check the knowledge base for you again. One moment! :books:

#

For updating 400k documents every 24 hours in Convex, you'll need a strategy that handles this large volume efficiently. Here's how you can approach this:

Batch Processing with Self-Scheduling Functions

Since you're dealing with a large number of documents, you should process them in batches rather than all at once. This approach is recommended for large-scale operations in Convex:

// convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();

// Start the batch update process once per day
crons.daily(
  "start daily document updates",
  { hourUTC: 0, minuteUTC: 0 }, // At midnight UTC
  internal.yourModule.startBatchUpdates
);

export default crons;

Then implement a self-scheduling mutation pattern to process all documents in batches:

export const startBatchUpdates = internalMutation({
  args: { cursor: v.optional(v.string()) },
  handler: async (ctx, args) => {
    // Fetch one batch of documents
    const batch = await ctx.db.query("yourTable")
      .paginate({
        cursor: args.cursor ?? null, // null is the cursor for the first batch
        numItems: 100, // Process 100 documents at a time
      });

    // Schedule an action to update this batch
    await ctx.scheduler.runAfter(0, internal.yourModule.updateBatch, { 
      documentIds: batch.page.map(doc => doc._id) 
    });

    // If there are more documents to process, schedule the next batch
    if (!batch.isDone) {
      await ctx.scheduler.runAfter(0, internal.yourModule.startBatchUpdates, 
        { cursor: batch.continueCursor });
    }
  },
});

Then define the action that processes each batch:

export const updateBatch = internalAction({
  args: { documentIds: v.array(v.id("yourTable")) },
  handler: async (ctx, args) => {
    // Make your API call. This makes one shared call per batch;
    // move it inside the loop if the data differs per document.
    const response = await fetch("https://your-api-endpoint.com");
    const data = await response.json();

    // Update each document in the batch
    // (updateDocument is assumed to be an internal mutation here)
    for (const id of args.documentIds) {
      await ctx.runMutation(internal.yourModule.updateDocument, {
        documentId: id,
        apiData: data,
        updated_at: Date.now(),
      });
    }
  },
});

This pattern is similar to what's shown in [Stack: Queries that scale](<https://stack.convex.dev/queries-that-scale#part-2-batching-updates-from-frequently-changing--data>) for processing large datasets.
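For a sense of scale, the batch numbers above can be worked through directly. The sketch below is plain TypeScript with no Convex calls; `planBatches` and `BatchPlan` are hypothetical names, and the 24-hour window is an assumption taken from the question.

```typescript
// Hypothetical planning helper for illustration; not part of the Convex API.
interface BatchPlan {
  batchCount: number; // number of batches needed to cover every document
  docsPerSecond: number; // average throughput needed to finish inside the window
  spacingMs: number; // delay between batch starts if spread evenly over the window
}

function planBatches(totalDocs: number, batchSize: number, windowMs: number): BatchPlan {
  if (totalDocs <= 0 || batchSize <= 0 || windowMs <= 0) {
    throw new RangeError("totalDocs, batchSize, and windowMs must all be positive");
  }
  const batchCount = Math.ceil(totalDocs / batchSize);
  return {
    batchCount,
    docsPerSecond: totalDocs / (windowMs / 1000),
    spacingMs: Math.floor(windowMs / batchCount),
  };
}

// The numbers from this thread: 400k documents, batches of 100, a 24-hour window.
const plan = planBatches(400_000, 100, 24 * 60 * 60 * 1000);
```

With these inputs the plan is 4,000 batches, an average of about 4.6 documents per second, and 21.6 seconds between batch starts when spread evenly. If the external API is rate limited, that spacing could be used as the delay passed to `ctx.scheduler.runAfter` in place of `0`.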

## Consider Using Workpool Component

For high-scale operations like yours, Convex recommends using their Workpool component:

npm install @convex-dev/workpool


As mentioned in the documentation: "Workpool give critical tasks priority by organizing async operations into separate, customizable queues" [Convex Scheduling](<https://docs.convex.dev/scheduling>).

This would be particularly useful for your case as it provides better reliability with configurable retries and can handle the large volume more efficiently.

## Additional Considerations

1. **Reliability**: Use mutations for workflow orchestration as they are "idempotent, reliable, and retried until success" [Stack: Configure Cron Jobs at Runtime](<https://stack.convex.dev/cron-jobs#workflow>).

2. **Time Limits**: Be aware that actions have a 10-minute time limit, so your batches need to complete within that timeframe [Convex Cron Jobs](<https://docs.convex.dev/scheduling/cron-jobs>).

3. **Import Tools**: For initial data loading or complete refreshes, consider using Convex's import tools: "you can use the airbyte destination connector" or "command line CSV import (npx convex import)" (Discord support thread).
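The time limit in point 2 puts an upper bound on batch size when each batch is processed sequentially inside one action. A rough estimate, as a plain TypeScript sketch: `maxBatchSize` is a hypothetical helper, the per-document time is something you would have to measure yourself, and the safety factor is an assumption that leaves headroom for slow API responses.

```typescript
// Hypothetical sizing helper for illustration; not part of the Convex API.
const ACTION_LIMIT_MS = 10 * 60 * 1000; // the 10-minute action limit mentioned above

// Largest batch that fits in the time budget when documents are processed one by one.
// perDocMs: measured time for one API call plus one mutation, in milliseconds.
// safetyFactor: fraction of the limit you are willing to use (0 < safetyFactor <= 1).
function maxBatchSize(
  perDocMs: number,
  safetyFactor: number = 0.5,
  limitMs: number = ACTION_LIMIT_MS
): number {
  if (perDocMs <= 0) {
    throw new RangeError("perDocMs must be positive");
  }
  if (safetyFactor <= 0 || safetyFactor > 1) {
    throw new RangeError("safetyFactor must be in (0, 1]");
  }
  return Math.floor((limitMs * safetyFactor) / perDocMs);
}
```

For example, at 300 ms per document and half the limit as budget, `maxBatchSize(300)` gives 1,000, so a batch of 100 leaves a wide margin.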

This approach should allow you to reliably update your 400k documents every 24 hours while working within Convex's architecture and limits.
