#Update Document when 24 hours old
Hi @short depot. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Yes, you can update a document as soon as it's 24 hours old in Convex. There are a couple of approaches you can use:
## Using a Cron Job
The most straightforward approach is to set up a cron job that runs periodically (e.g., hourly) to check for documents that need updating:
// convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";
const crons = cronJobs();
crons.hourly(
  "update old documents",
  { minuteUTC: 0 }, // At the top of every hour
  internal.yourModule.checkAndUpdateOldDocuments
);
export default crons;
Then create an action that checks for documents older than 24 hours and updates them:
// convex/yourModule.ts
import { internalAction } from "./_generated/server";
import { internal } from "./_generated/api";
export const checkAndUpdateOldDocuments = internalAction({
  args: {},
  handler: async (ctx) => {
    // Get documents that need updating
    const docsToUpdate = await ctx.runQuery(internal.yourModule.getDocsOlderThan24Hours);
    // For each document, fetch data from external API and update
    for (const doc of docsToUpdate) {
      // Make your API call
      const response = await fetch("https://your-api-endpoint.com");
      const data = await response.json();
      // Update the document with the API result
      await ctx.runMutation(internal.yourModule.updateDocWithApiResult, {
        documentId: doc._id,
        apiData: data,
        updated_at: Date.now(),
      });
    }
  },
});
This approach is mentioned in the knowledge sources as a common pattern for handling periodic updates. As one community member noted: "Could calculating/populating your aggregates e.g. once every 24 hours work in your situation? That's what we are doing in our project." Discord support thread
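The answer above refers to a `getDocsOlderThan24Hours` query without showing it. As a rough illustration of the selection rule only, here is a minimal sketch in plain TypeScript. The `Doc` shape and the `selectStaleDocs` helper are hypothetical, and a real Convex query would more likely express the same cutoff as an index range on `updated_at` than as an in-memory filter.

```typescript
// Hypothetical document shape; only `updated_at` (ms since epoch) matters here.
interface Doc {
  _id: string;
  updated_at: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the documents whose last update is at least 24 hours before `now`.
function selectStaleDocs(docs: Doc[], now: number): Doc[] {
  const cutoff = now - DAY_MS;
  return docs.filter((doc) => doc.updated_at <= cutoff);
}

// Example data: "a" and "c" are past the 24-hour mark, "b" is not yet.
const now = Date.UTC(2024, 0, 2, 12, 0, 0);
const docs: Doc[] = [
  { _id: "a", updated_at: now - DAY_MS - 1 },
  { _id: "b", updated_at: now - DAY_MS + 1 },
  { _id: "c", updated_at: now - 2 * DAY_MS },
];
const staleIds = selectStaleDocs(docs, now).map((doc) => doc._id);
console.log(staleIds);
```

Because the cron only fires once an hour, a document selected this way can be up to one hour past the 24-hour mark before it is refreshed.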
## Scheduling Individual Updates
If you need more precise timing, you could schedule individual updates for each document:
- When a document is created or updated, schedule an action to run exactly 24 hours later
- The action will make the API call and update the document
This approach is demonstrated in a migration example:
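// When document is created/updated
import { mutation } from "./_generated/server";
import { api } from "./_generated/api";
export const updateDocument = mutation({
  // ... (args are assumed to include documentId: v.id("yourTable"))
  handler: async (ctx, args) => {
    // Update the document; db.patch does not return the id, so reuse args.documentId
    await ctx.db.patch(args.documentId, {
      // your updates
      updated_at: Date.now(),
    });
    // Schedule an update in 24 hours
    await ctx.scheduler.runAfter(
      24 * 60 * 60 * 1000, // 24 hours in milliseconds
      api.yourModule.fetchAndUpdateFromExternalApi,
      { documentId: args.documentId }
    );
  },
});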
Then create the action that will run after 24 hours:
import { v } from "convex/values";
import { action } from "./_generated/server";
import { api } from "./_generated/api";
export const fetchAndUpdateFromExternalApi = action({
  args: { documentId: v.id("yourTable") },
  handler: async (ctx, args) => {
    // Make your API call
    const response = await fetch("https://your-api-endpoint.com");
    const data = await response.json();
    // Update the document with the API result
    await ctx.runMutation(api.yourModule.updateDocWithApiResult, {
      documentId: args.documentId,
      apiData: data,
      updated_at: Date.now(),
    });
  },
});
This pattern is similar to what's shown in the Discord support thread on Migrations calling HTTP.
Remember that actions are the appropriate function type when making external API calls, while mutations should be used for database updates.
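If a document's `updated_at` is not "right now" at the moment you schedule (for example when backfilling existing documents), the delay to pass to `ctx.scheduler.runAfter` is the time remaining until the document turns 24 hours old. A minimal sketch of that arithmetic, assuming `updated_at` is stored in milliseconds since the epoch; `delayUntilStale` is a hypothetical helper, not part of Convex:

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Milliseconds to wait so a refresh lands when the document turns 24 hours old.
// Returns 0 when the document is already overdue, so it would run immediately.
function delayUntilStale(updatedAt: number, now: number): number {
  return Math.max(0, updatedAt + DAY_MS - now);
}

const updatedAt = Date.UTC(2024, 0, 1, 9, 30, 0);

// Scheduling six hours after the last update leaves 18 hours to wait.
const freshDelay = delayUntilStale(updatedAt, updatedAt + 6 * 60 * 60 * 1000);

// Scheduling two days after the last update means the refresh is already due.
const overdueDelay = delayUntilStale(updatedAt, updatedAt + 2 * DAY_MS);

console.log(freshDelay, overdueDelay);
```

Clamping at zero keeps the delay valid for documents that are already overdue.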
@cobalt sparrow I have 400k documents that will need to be updated every 24 hours, how do I schedule this?
Let me check the knowledge base for you again. One moment! :books:
For updating 400k documents every 24 hours in Convex, you'll need a strategy that handles this large volume efficiently. Here's how you can approach this:
## Batch Processing with Self-Scheduling Functions
Since you're dealing with a large number of documents, you should process them in batches rather than all at once. This approach is recommended for large-scale operations in Convex:
// convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";
const crons = cronJobs();
// Start the batch update process once per day
crons.daily(
  "start daily document updates",
  { hourUTC: 0, minuteUTC: 0 }, // At midnight UTC
  internal.yourModule.startBatchUpdates,
  {} // No cursor: start from the first page
);
export default crons;
Then implement a self-scheduling mutation pattern to process all documents in batches:
// convex/yourModule.ts
import { v } from "convex/values";
import { internalAction, internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";
export const startBatchUpdates = internalMutation({
  args: { cursor: v.optional(v.string()) },
  handler: async (ctx, args) => {
    // Fetch one batch of documents
    const batch = await ctx.db.query("yourTable")
      .paginate({
        cursor: args.cursor ?? null, // null is the cursor for the first batch
        numItems: 100, // Process 100 documents at a time
      });
    // Schedule an action to update this batch
    await ctx.scheduler.runAfter(0, internal.yourModule.updateBatch, {
      documentIds: batch.page.map((doc) => doc._id),
    });
    // If there are more documents to process, schedule the next batch
    if (!batch.isDone) {
      await ctx.scheduler.runAfter(0, internal.yourModule.startBatchUpdates,
        { cursor: batch.continueCursor });
    }
  },
});
export const updateBatch = internalAction({
  args: { documentIds: v.array(v.id("yourTable")) },
  handler: async (ctx, args) => {
    // Make your API call (one call, shared by every document in the batch)
    const response = await fetch("https://your-api-endpoint.com");
    const data = await response.json();
    // Update each document in the batch
    for (const id of args.documentIds) {
      await ctx.runMutation(internal.yourModule.updateDocument, {
        documentId: id,
        apiData: data,
        updated_at: Date.now(),
      });
    }
  },
});
This pattern is similar to what's shown in [Stack: Queries that scale](<https://stack.convex.dev/queries-that-scale#part-2-batching-updates-from-frequently-changing--data>) for processing large datasets.
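To get a feel for the numbers involved, here is a small back-of-the-envelope sketch in plain TypeScript. It uses the 400k document count from the question and the batch size of 100 from the example above; the `chunk` helper is hypothetical and only illustrates how ids get grouped into batches.

```typescript
// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const TOTAL_DOCS = 400000; // from the question
const BATCH_SIZE = 100; // from the paginate() call above
const DAY_SECONDS = 24 * 60 * 60;

// Number of batches one full pass over the table produces.
const batchCount = Math.ceil(TOTAL_DOCS / BATCH_SIZE);

// Sustained rate needed to finish one full pass within 24 hours.
const minDocsPerSecond = TOTAL_DOCS / DAY_SECONDS;

// Small demonstration of the grouping itself.
const sampleBatches = chunk(["a", "b", "c", "d", "e"], 2);

console.log(batchCount, minDocsPerSecond.toFixed(2), sampleBatches);
```

At this batch size a full pass is 4,000 batches, and the job needs to sustain a little under five documents per second on average to finish inside the 24-hour window.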
## Consider Using Workpool Component
For high-scale operations like yours, Convex recommends using their Workpool component:
npm install @convex-dev/workpool
As mentioned in the documentation: "Workpool give critical tasks priority by organizing async operations into separate, customizable queues" [Convex Scheduling](<https://docs.convex.dev/scheduling>).
This would be particularly useful for your case as it provides better reliability with configurable retries and can handle the large volume more efficiently.
## Additional Considerations
1. **Reliability**: Use mutations for workflow orchestration as they are "idempotent, reliable, and retried until success" [Stack: Configure Cron Jobs at Runtime](<https://stack.convex.dev/cron-jobs#workflow>).
2. **Time Limits**: Be aware that actions have a 10-minute time limit, so your batches need to complete within that timeframe [Convex Cron Jobs](<https://docs.convex.dev/scheduling/cron-jobs>).
3. **Import Tools**: For initial data loading or complete refreshes, consider using Convex's import tools: "you can use the airbyte destination connector" or "command line CSV import (npx convex import)" Discord support thread.
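For the time-limit point in the list above, a rough way to sanity-check a batch size is to divide the action's time budget by the expected cost per document. This is only a sketch under stated assumptions: the per-document latency figures are made up for illustration, and `maxBatchSize` is a hypothetical helper rather than anything Convex provides.

```typescript
const ACTION_LIMIT_MS = 10 * 60 * 1000; // the 10-minute action limit mentioned above

// Largest batch that fits in the limit when each document costs `perDocMs`,
// leaving `safetyMargin` (a fraction between 0 and 1) of the limit unused.
function maxBatchSize(perDocMs: number, safetyMargin: number): number {
  if (perDocMs <= 0) throw new Error("perDocMs must be positive");
  const budgetMs = ACTION_LIMIT_MS * (1 - safetyMargin);
  return Math.floor(budgetMs / perDocMs);
}

// Assumed latencies, for illustration only.
const fastEstimate = maxBatchSize(50, 0.5); // 50 ms per document
const slowEstimate = maxBatchSize(2000, 0.5); // 2 s per document, e.g. a slow external API

console.log(fastEstimate, slowEstimate);
```

With these assumed figures, a batch of 100 fits comfortably in both cases, but the headroom shrinks quickly if each document needs its own slow external call.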