#An efficient way to architect a YouTube-like comments section in DB

silent hearth

Quick explanation of the comments‑section bottleneck

I’m rewriting our prototype (React + json‑server) in Next.js with Convex and hit a data‑model problem around threaded comments.

Current model

type Post = { id: string; comments: string[]; ... }
type Comment = { id: string; replies: string[]; ... }
  • A Post keeps the IDs of its top‑level comments.
  • Every Comment keeps the IDs of its direct replies (so replies are just more Comment docs).

Why it’s breaking down

To show one post I now:

  1. Read the post → get the list of top‑level comment IDs.
  2. Paginate that list.
  3. For each comment, recursively fetch every reply to build a tree.

On a post with 10 comments where one branch is 13 levels deep this becomes dozens of round‑trips plus a lot of in‑route JSON stitching. Works with json‑server but will hammer a real DB in production.

What I’m after

A data‑model / query pattern that lets me fetch a comment tree (or at least the first N levels) in one cheap call instead of a cascade of recursive fetches.

Example in production (might take a minute to load because of the json‑server cold start): https://sheru.vercel.app/posts/1

Take a look at the network tab in the dev inspector and the sheer number of API round trips it takes to assemble a single comment chain.

Any and all suggestions appreciated.

solid agate BOT

Hi @silent hearth. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

To efficiently architect a YouTube-like threaded comments section in Convex, you should avoid recursive, client-driven fetching of each comment and its replies, as this leads to the N+1 query problem and excessive round-trips—especially problematic for deep or large threads.

Recommended Data Model

Instead of storing arrays of child IDs in each comment or post, model your comments as a flat table with a parentId field:

type Comment = {
  _id: Id<"comments">;
  postId: Id<"posts">;
  parentId?: Id<"comments">; // undefined for top-level comments
  body: string;
  // ...other fields
}
  • Top-level comments have parentId as undefined and a postId.
  • Replies have parentId set to the parent comment's _id.
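
As a rough sketch, the table and the two indexes used by the queries in this answer (by_postId and by_parentId) could be declared in convex/schema.ts along these lines. The posts table and its title field are placeholders assumed for illustration; only the comments fields come from the type above.

```typescript
// convex/schema.ts (sketch; the `posts` fields are hypothetical)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // Placeholder shape for posts; replace with your real fields.
  posts: defineTable({
    title: v.string(),
  }),

  comments: defineTable({
    postId: v.id("posts"),
    // Left unset for top-level comments.
    parentId: v.optional(v.id("comments")),
    body: v.string(),
  })
    // Lets a query read every comment on one post.
    .index("by_postId", ["postId"])
    // Lets a query read the direct replies of one comment.
    .index("by_parentId", ["parentId"]),
});
```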

Query Pattern

Fetch all comments for a post in a single query, then build the tree in memory. For large threads, you can paginate top-level comments and fetch all their direct replies in one go.

Example: Fetching all comments for a post

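import { query } from "./_generated/server";
import { v } from "convex/values";

export const getCommentsForPost = query({
  args: { postId: v.id("posts") },
  handler: async (ctx, args) => {
    // Fetch every comment on the post in one indexed read.
    // Requires an index named "by_postId" on the comments table.
    return await ctx.db
      .query("comments")
      .withIndex("by_postId", q => q.eq("postId", args.postId))
      .collect();
  },
});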

You can then reconstruct the tree structure in your React component.
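
A minimal sketch of that reconstruction step, assuming the flat rows carry _id, parentId and body as in the type above. IDs are plain strings here so the snippet runs outside Convex, and the names FlatComment, CommentNode and buildCommentTree are made up for illustration.

```typescript
// Minimal row shape; real rows would use Id<"comments"> instead of string.
type FlatComment = {
  _id: string;
  parentId?: string;
  body: string;
};

type CommentNode = FlatComment & { replies: CommentNode[] };

// Build a nested tree from a flat list in two linear passes.
function buildCommentTree(comments: FlatComment[]): CommentNode[] {
  // Pass 1: wrap every row in a node and index it by ID.
  const nodes = new Map<string, CommentNode>();
  for (const c of comments) {
    nodes.set(c._id, { ...c, replies: [] });
  }

  // Pass 2: attach each node to its parent, or treat it as a root.
  const roots: CommentNode[] = [];
  for (const c of comments) {
    const node = nodes.get(c._id)!;
    const parent =
      c.parentId === undefined ? undefined : nodes.get(c.parentId);
    if (parent) {
      parent.replies.push(node);
    } else {
      // Top-level comment, or a reply whose parent is not in this batch.
      roots.push(node);
    }
  }
  return roots;
}
```

Replies keep the order they had in the flat list, so sorting the query result once is enough. A reply whose parent was not fetched is surfaced as a root rather than dropped, which may or may not be what you want once pagination is involved.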

Pagination

If you want to paginate top-level comments and fetch their replies efficiently:

  1. Fetch the first N top-level comments (parentId is undefined).
  2. Fetch all replies whose parentId is in the set of those top-level comment IDs.

Both steps can run inside a single Convex query function, so the client makes one request instead of N+1.
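
The same idea extends to the "first N levels" asked about in the question: load one level at a time, with one batch per level instead of one fetch per comment. The sketch below is deliberately database-agnostic. fetchReplies is a hypothetical loader that returns the direct replies of every given parent, and Row and loadLevels are illustrative names, not Convex APIs.

```typescript
type Row = { _id: string; parentId?: string; body: string };

// Hypothetical batch loader: direct replies of all the given parents.
type FetchReplies = (parentIds: string[]) => Promise<Row[]>;

// Return the given top-level comments plus up to `maxDepth` levels of
// replies, calling the loader once per level.
async function loadLevels(
  topLevel: Row[],
  fetchReplies: FetchReplies,
  maxDepth: number,
): Promise<Row[]> {
  const all: Row[] = [...topLevel];
  let frontier: Row[] = topLevel;

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    // The comments found at this level become the parents for the next one.
    frontier = await fetchReplies(frontier.map(r => r._id));
    all.push(...frontier);
  }
  return all;
}
```

For the 13-level branch from the question, maxDepth = 3 would cost three batches regardless of how many comments sit on each level; deeper replies can then be loaded on demand behind a "show more replies" control.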

Why this works well in Convex

  • Convex queries are written in TypeScript and can return arbitrarily shaped data, so you can fetch all relevant comments and return them as a flat array or even as a pre-assembled tree if you wish.
  • You can use helpers like getManyFrom to fetch all replies for a parent ID through an index, and map it over a set of parent IDs to batch them (see "Database Relationship Helpers").
  • Convex's architecture mitigates the N+1 problem because query functions run close to the database, so many small indexed lookups cost far less than the same lookups made from the client (see "What about N+1?").

Example: Efficiently Fetching Replies

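// One page of top-level comments (parentId unset) for this post.
const topLevelComments = await ctx.db
  .query("comments")
  .withIndex("by_postId", q => q.eq("postId", args.postId))
  .filter(q => q.eq(q.field("parentId"), undefined))
  .take(10); // paginate

const parentIds = topLevelComments.map(c => c._id);

// Convex index ranges have no "in" operator, so run one indexed lookup per
// parent. They all execute inside the same query function, so the client
// still makes a single round-trip.
const replies = (
  await Promise.all(
    parentIds.map(parentId =>
      ctx.db
        .query("comments")
        .withIndex("by_parentId", q => q.eq("parentId", parentId))
        .collect()
    )
  )
).flat();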

Summary

  • Model comments as a flat table with parentId and postId.
  • Fetch all relevant comments in as few queries as possible (ideally one or two).
  • Assemble the tree in memory or in your query function.
  • Paginate top-level comments and batch-fetch their replies.

This approach is scalable, minimizes round-trips, and leverages Convex's strengths for real-time, reactive data fetching.