#?
28 messages · Page 1 of 1 (latest)
Hi @cold willow. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
For clearing a table before loading data in an ETL process, Convex offers a few approaches:
Option 1: CLI (Recommended for ETL)
Use npx convex import with the --replace flag and an empty file. This is the most efficient one-liner:
npx convex import --replace --table <tableName> --format jsonLines /dev/null -y
For production, add --prod:
npx convex import --prod --replace --table <tableName> --format jsonLines /dev/null -y
This effectively clears the table by "importing" an empty dataset in replace mode. [YOLO MVP Tips]
If you need to clear all tables at once:
for tableName in `npx convex data`; do npx convex import --table $tableName --replace -y --format jsonLines /dev/null; done
Option 2: Dashboard
You can also clear a table via the Convex Dashboard by clicking the ⋮ overflow menu on the data page and selecting "Clear Table". This deletes all documents without removing the table itself. [Dashboard Data]
Option 3: Programmatic (via Mutation)
If you need to clear a table programmatically as part of your ETL logic, you can write a mutation that queries and deletes all documents:
export const clearAll = mutation(async (ctx) => {
const docs = await ctx.db.query("yourTable").collect();
await Promise.all(docs.map((doc) => ctx.db.delete(doc._id)));
});
For a typical ETL pipeline, the CLI approach (--replace with an empty file) is the most practical since it's scriptable and works well for both dev and production deployments.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@cold willow Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@modest raven assuming that there are hundred of thousands of rows to clear, what is the best practice
Let me check the knowledge base for you again. One moment! :books:
For clearing hundreds of thousands of rows, the CLI import approach is still the best practice, as it bypasses Convex's transaction limits entirely:
npx convex import --replace --table <tableName> --format jsonLines /dev/null -y
This is the recommended approach because it doesn't go through a mutation transaction, so it won't hit Convex's per-transaction read/write limits. [Data Import]
Why not a programmatic mutation?
If you try to delete hundreds of thousands of rows via a mutation, you will hit Convex's transaction limits. Specifically:
- There is a 4,096 read limit per function execution, and each delete also requires a read, so you'd hit the ceiling quickly with large tables. [Too Many Reads]
- Mutations enforce transaction limits to prevent long-lived transactions that could grind the database to a halt.
If you must do it programmatically
If for some reason you need to clear the table via code (e.g., as part of a larger workflow), the correct pattern is to use recursive scheduled mutations with pagination, processing a safe batch size (e.g., 500 documents) per mutation run:
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";
const MiB = 1 << 20;
export const clearTable = internalMutation({
args: {},
handler: async (ctx) => {
const docs = await ctx.db.query("yourTable").paginate({ cursor: null, numItems: 500 });
for (const doc of docs.page) {
await ctx.db.delete(doc._id);
const metrics = await ctx.meta.getTransactionMetrics();
if (
metrics.bytesRead.used > 4 * MiB ||
metrics.bytesWritten.used > 2 * MiB ||
metrics.databaseQueries.remaining < 500
) {
await ctx.scheduler.runAfter(0, internal.yourFile.clearTable);
break;
}
}
},
});
[[Write Performance](<https://docs.convex.dev/database/writing-data#write-performance-and-limits>)]
---
**Bottom line:** For an ETL process with hundreds of thousands of rows, use the CLI `--replace` with an empty file. It's the most efficient, scalable, and safe approach.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@cold willow Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@modest raven but how do imake the insert funcs wait till clear is done
Let me check the knowledge base for you again. One moment! :books:
Great question. Since the CLI --replace import is a synchronous command, the simplest approach is to sequence your ETL steps in your shell script — the import command blocks until completion, so you just run your insert logic after it:
# Step 1: Clear the table (blocks until done)
npx convex import --replace --table <tableName> --format jsonLines /dev/null -y
# Step 2: Load new data (only runs after clear is complete)
npx convex import --table <tableName> newdata.jsonl
If you're doing this programmatically (via mutations)
If you need to orchestrate the clear + insert entirely in code, the key constraint is that each mutation is its own transaction — you cannot span a clear and an insert across one atomic transaction for hundreds of thousands of rows due to transaction limits. [Write Performance]
The recommended pattern is to use the Convex Workflow component, which lets you define sequential steps with durability guarantees — step 2 (insert) only runs after step 1 (clear) completes:
// Pseudocode using the Workflow component
export const etlWorkflow = workflow.define({
handler: async (ctx) => {
await ctx.runMutation(internal.yourFile.clearTable, {}); // Step 1
await ctx.runMutation(internal.yourFile.bulkInsert, { data }); // Step 2
},
});
The Workflow component tracks state and retries automatically, so if the clear step fails partway through, it won't proceed to the insert. [Durable Workflows]
Bottom line: For a CLI-based ETL, shell sequencing is sufficient since npx convex import is synchronous. For a fully programmatic approach, the Workflow component is the safest way to ensure the insert waits for the clear to finish.
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@cold willow Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@modest raven how do i wrap this in a type:
Let me check the knowledge base for you again. One moment! :books:
Your question seems incomplete — "how do i wrap this in a type" doesn't specify what you'd like to type. Could you clarify what you're referring to? For example, are you asking about:
- Typing the workflow handler (e.g., annotating the return type of
workflow.define) - Typing a mutation or action context (e.g.,
MutationCtx,ActionCtx) - Typing arguments using validators and
Infer - Typing document shapes using
Doc<"tableName">
Based on the conversation so far, here are the most likely things you might want to type:
Typing a workflow handler
The docs recommend always annotating the return type of the handler to avoid type cycles: [Workflow Usage]
export const etlWorkflow = workflow.define({
args: { /* ... */ },
handler: async (ctx, args): Promise<void> => {
// ...
},
});
Typing a mutation context helper
import { MutationCtx } from "./_generated/server";
export async function clearTableHelper(ctx: MutationCtx) {
const docs = await ctx.db.query("yourTable").collect();
for (const doc of docs) {
await ctx.db.delete(doc._id);
}
}
Typing arguments with Infer
import { Infer, v } from "convex/values";
const rowValidator = v.object({ name: v.string(), value: v.number() });
type Row = Infer<typeof rowValidator>;
Please share more details about what you're trying to type and I'll give you a more precise answer!
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@cold willow Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@modest raven how do i wrap this in a type:
routes: defineTable({
route_id: v.string(),
route_short_name: v.string(),
route_long_name: v.string(),
route_desc: v.nullable(v.string()),
route_type: v.number(),
agency_id: v.string(),
route_color: v.nullable(v.string()),
route_text_color: v.nullable(v.string()),
}).index("by_route_id", ["route_id"]),
Let me check the knowledge base for you again. One moment! :books:
You can extract a TypeScript type from that table definition using the Infer type helper from convex/values, or by using the generated Doc type.
Option 1: Using Infer on a validator (recommended for reuse)
Define the validator separately, then use Infer to extract the type:
import { defineSchema, defineTable } from "convex/server";
import { v, Infer } from "convex/values";
const routeValidator = v.object({
route_id: v.string(),
route_short_name: v.string(),
route_long_name: v.string(),
route_desc: v.nullable(v.string()),
route_type: v.number(),
agency_id: v.string(),
route_color: v.nullable(v.string()),
route_text_color: v.nullable(v.string()),
});
// Extracted TypeScript type
export type Route = Infer<typeof routeValidator>;
export default defineSchema({
routes: defineTable(routeValidator).index("by_route_id", ["route_id"]),
});
Option 2: Using the generated Doc type
Once your schema is defined and you've run npx convex dev, Convex generates a Doc type you can use anywhere in your codebase:
import { Doc } from "../convex/_generated/dataModel";
// Includes system fields _id and _creationTime
type Route = Doc<"routes">;
Which to use?
- Use
Inferwhen you want a type without system fields (e.g., for inserting data or passing partial objects around). - Use
Doc<"routes">when you want the full document type including_idand_creationTime(e.g., for displaying data returned from a query). [Doc type]
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@cold willow Kapa.ai is still learning and improving, please let me know how I did by reacting below.
@modest raven why cant i do this:
import { Doc } from "./_generated/dataModel";
export const batchInsertRoutes = internalMutation({
args: {
routes: v.array(v.doc("routes")),
},
handler: async (ctx, args) => {
for (const route of args.routes) {
await ctx.db.insert("routes", route);
}
},
});
Let me check the knowledge base for you again. One moment! :books:
The issue is that v.doc("routes") is not a built-in validator in the standard convex/values v object. The standard v does not have a .doc() method — that's only available via the convex-helpers library or the typedV utility. [Helpers for schema]
How to fix it
Option 1: Use v.object(...) with your route fields directly
The cleanest approach is to export your route fields from your schema and reuse them:
// convex/schema.ts
export const routeFields = {
route_id: v.string(),
route_short_name: v.string(),
route_long_name: v.string(),
route_desc: v.nullable(v.string()),
route_type: v.number(),
agency_id: v.string(),
route_color: v.nullable(v.string()),
route_text_color: v.nullable(v.string()),
};
export default defineSchema({
routes: defineTable(routeFields).index("by_route_id", ["route_id"]),
});
// convex/routes.ts
import { internalMutation } from "./_generated/server";
import { v } from "convex/values";
import { routeFields } from "./schema";
export const batchInsertRoutes = internalMutation({
args: {
routes: v.array(v.object(routeFields)),
},
handler: async (ctx, args) => {
for (const route of args.routes) {
await ctx.db.insert("routes", route);
}
},
});
[Argument Validation without Repetition]
Option 2: Use convex-helpers doc validator
If you want v.doc()-like behavior, install convex-helpers and use its doc helper:
import { doc } from "convex-helpers/validators";
import schema from "./schema";
export const batchInsertRoutes = internalMutation({
args: {
routes: v.array(doc(schema, "routes")),
},
handler: async (ctx, args) => {
for (const route of args.routes) {
await ctx.db.insert("routes", route);
}
},
});
Note that `doc(schema, "routes")` includes system fields (`_id`, `_creationTime`), so you'd need to strip those before inserting. [[Helpers for schema](<https://stack.convex.dev/argument-validation-without-repetition#helpers-for-schema-definition--validation>)]
---
**Bottom line:** Use **Option 1** (exporting `routeFields` from your schema) — it's the idiomatic Convex pattern with no extra dependencies and works perfectly for bulk inserts.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------