#Delta updates on query? - understanding bandwidth

chrome vector
#

Hello, I am building a card game with big ridiculous states (see picture).

I'm trying to understand data usage in detail again.
On games.getGame(id), does the subscription have to fetch the whole row every time? Are there partial updates and/or compression middleware that could allow for delta updates, similar to git patches?

Would a future optimization be to split up queries so they only grab a single column, in order to reduce database bandwidth? Or are there any other recommended steps to reduce bandwidth?

granite rose
#

hey @chrome vector -- we currently don't have a way to do partial reads or writes to a document but it's on our radar.

in the meantime, one recommendation is to break your document up into smaller pieces. for example, you could have a separate combatLog table that stores { gameId: Id<"games">, "action": ... } with an index on gameId. then, it'd be efficient to (1) append a new entry to the combat log and (2) only read a range of the log if needed.
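
A minimal sketch of what that split could look like in a Convex schema file. The `status` field and the index name `by_gameId` are hypothetical, and `vCombatAction` is only a placeholder because the real action validator is not shown in this thread.

```typescript
// convex/schema.ts (sketch under the assumptions stated above)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

// Placeholder: the real combat action validator is not shown in the thread.
const vCombatAction = v.any();

export default defineSchema({
  games: defineTable({
    // Remaining game state lives here, without the embedded log.
    status: v.string(), // hypothetical field
  }),
  // One document per log entry, so an append writes one small document.
  combatLog: defineTable({
    gameId: v.id("games"),
    action: vCombatAction,
  }).index("by_gameId", ["gameId"]),
});
```

With a table like this, appending would be a single `ctx.db.insert("combatLog", { gameId, action })`, and a range read would go through `ctx.db.query("combatLog").withIndex("by_gameId", (q) => q.eq("gameId", gameId))`.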

chrome vector
#

thanks for the answer @granite rose yeah I figured this was gonna be the answer. will keep it as is and then optimize when I go into alpha tests.

Even if I break out the tables though - and / or implement compression middleware - my use case would still benefit from delta updates:
The combat log, for example, is not just a log for analysis; it actually shows the opponent's last turns as part of the interface. See bottom right of this image.

In the schema, I have combatLog: v.array(vCombatAction),

So in my dream scenario, we have almost like a query planner middleware that automatically checks on v.array or v.object columns whether it is worth it to send diffs instead of raw data. In this case, it could just broadcast something like APPEND("X passed turn") instead of resending the entire array. On the game field object, it could detect that only 2 fields have been changed and can send a diff instead of the entire document.

Will likely implement a version of this AND/OR try breaking the documents into subdocuments when it becomes necessary to actually conserve bandwidth for me, but maybe it's a common enough use case to be worth an official middleware.
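
To make the diff idea concrete, here is a rough application-level sketch. It is not a Convex feature; all names (`ArrayPatch`, `diffArray`, `applyArrayPatch`, `diffFields`) are made up for illustration, and equality is checked with `JSON.stringify`, which assumes plain JSON-like values with stable key order.

```typescript
// An array patch is either "append these items" or "replace everything".
type ArrayPatch<T> =
  | { op: "append"; items: T[] }
  | { op: "replace"; value: T[] };

// Emit an append patch when `prev` is a prefix of `next`, else a full replace.
function diffArray<T>(prev: T[], next: T[]): ArrayPatch<T> {
  const isPrefix =
    next.length >= prev.length &&
    prev.every((item, i) => JSON.stringify(item) === JSON.stringify(next[i]));
  return isPrefix
    ? { op: "append", items: next.slice(prev.length) }
    : { op: "replace", value: next };
}

// Rebuild the new array on the receiving side.
function applyArrayPatch<T>(prev: T[], patch: ArrayPatch<T>): T[] {
  return patch.op === "append" ? [...prev, ...patch.items] : patch.value;
}

// Shallow object diff: returns only the top-level fields whose value changed.
// Fields removed in `next` are not reported by this sketch.
function diffFields(
  prev: Record<string, unknown>,
  next: Record<string, unknown>,
): Record<string, unknown> {
  const changed: Record<string, unknown> = {};
  for (const key of Object.keys(next)) {
    if (JSON.stringify(prev[key]) !== JSON.stringify(next[key])) {
      changed[key] = next[key];
    }
  }
  return changed;
}
```

For a combat log that only grows, `diffArray` would produce a small append patch per turn instead of the whole array.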

granite rose
chrome vector
#

hm I kinda want to have all moves on screen the whole time, implementing this via the pagination concept feels like a bit of a misuse. Like I would ALWAYS have to trigger loadMore as soon as status hits "CanLoadMore"

keen elbow
#

First off, that game screen looks amazing! I'm not sure it's the kind of game I'd go for, but it definitely has a lot of visual appeal.

As for the data retrieval process, here's a rough idea: Use two queries.

First, this piggybacks on the table-splitting idea proposed by @granite rose , so each move in the combat log would have to be its own document in a separate table.

The first query retrieves the entire collection of moves for a given game when the game loads (or perhaps use pagination to only get the most recent X moves, only loading moves older than that if the user requests). If it's the start of a game, it will be an empty array; otherwise an array of all moves up to the present time. Again, this query only runs once when the game first loads, and it stores the collected documents in a state variable, which is used to render the on-screen list.

The second query only retrieves the most recent move in the game using an index targeting the game ID, with the query ending in .order("desc").take(1). As each new record comes in via this query, it's appended to the full array in state.

This means that the only potentially-heavy query is the first one, but only if a game is in-progress and has lots of moves.

Would that work for this use case?
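
The client-side half of that idea could be sketched as below. The `Move` shape is an assumption: `_id` and `_creationTime` mirror Convex's system fields, while `text` and the function name `mergeLatestMove` are hypothetical.

```typescript
// Assumed shape of one combat log document as seen by the client.
interface Move {
  _id: string;
  _creationTime: number;
  text: string; // hypothetical payload field
}

// Merge the result of the "latest move" query into the list loaded at startup.
// Returns the same array when there is nothing new, so callers can skip a re-render.
function mergeLatestMove(moves: Move[], latest: Move | undefined): Move[] {
  if (latest === undefined) return moves; // no moves yet
  if (moves.some((m) => m._id === latest._id)) return moves; // already known
  // Append and keep chronological order.
  return [...moves, latest].sort((a, b) => a._creationTime - b._creationTime);
}
```

One caveat with taking only a single document: if two moves land between client updates, the earlier one would be skipped. Taking the last few moves and merging each of them the same way is one way to narrow that gap.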

chrome vector
#

Thanks for the input, once the game is stable enough (hopefully in like a month) I will start working on the more out there performance improvements.

granite rose
chrome vector
#

@granite rose just started playtesting and hit this 90Mb Bandwidth over 4 games. Optimization is starting to get more priority again for me.

#

also @keen elbow if you're interested in the game feel free to DM me

granite rose
#

cool! do you have a sense for how much bandwidth it should take per game as a lower bound? we can then see what optimizations we’d need to get there.

#

also curious how the tables and queries are set up for the game — maybe there are some easy wins like the document splitting idea from before

keen elbow
chrome vector
#

Appreciate all the attention you're giving even to more out there use cases like mine by the way 🙂

granite rose
chrome vector
#

I think the main issue is just that my game state representation is gratuitously verbose for development convenience, and I need to add an encoding like chess notation to compress it. The JSON for the battlefield alone is way too big when it could arguably be downsized to 64 bytes for the terrain types and maybe a couple more bytes for the hexId to entityId mapping. It really was designed for maximum TypeScript convenience and takes an awful amount of space.

That's what I'm considering quick wins atm.

At some point, I could go deep and only selectively load gamestate for the actions that need it and split up everything but that feels like a big challenge and will slow down development. Better encoding will be a data layer only change on the other hand and not touch game logic.
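
As a rough illustration of that kind of encoding, here is a sketch that stores one byte per hex. It assumes a 64-hex battlefield (which is how 64 bytes would cover the terrain types) and a made-up list of terrain names; the real terrain set is not shown in this thread.

```typescript
// Hypothetical terrain set; the index in this array is the byte code.
const TERRAIN = ["plain", "forest", "mountain", "water"] as const;
type Terrain = (typeof TERRAIN)[number];

// Encode one terrain type per hex as a single byte.
function encodeTerrain(hexes: Terrain[]): Uint8Array {
  return Uint8Array.from(hexes, (t) => TERRAIN.indexOf(t));
}

// Decode back to the verbose representation used by the game logic.
function decodeTerrain(bytes: Uint8Array): Terrain[] {
  return Array.from(bytes, (b) => {
    const t = TERRAIN[b];
    if (t === undefined) throw new Error(`Unknown terrain code ${b}`);
    return t;
  });
}
```

Keeping the encode/decode pair at the data layer means the game logic can keep working with the verbose TypeScript types. On the Convex side, the resulting bytes could be stored in a field validated with `v.bytes()`.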

granite rose
#

makes sense.

yeah, I think splitting stuff up into smaller documents will make a lot of stuff better automatically -- mutations will be cheaper when they only fetch what they need, and queries can be finer-grained, have fewer reactivity updates, get cached more effectively. but, understood how this then means pushing database access into your game logic.

on the other extreme, if you're storing everything in one big game state document, have you tried compressing the game state before writing it to the db? this is really quick & dirty, and it'll make the dashboard not that useful, but it could be worth trying.

i've used lz4js in queries/mutations and it works great:

import { v } from "convex/values";
import { mutation } from "./_generated/server";
import * as lz4 from "lz4js";

// `example` was not defined in the original snippet. This stand-in is a small
// JSON string; the test described below used a ~40KB JSON document.
const example = JSON.stringify({ id: 1, name: "example", tags: ["a", "b", "c"] });

export const compressionTest = mutation({
    args: {
        repetitions: v.number(),
    },
    handler: async (ctx, args) => {
        // Build the test payload by repeating the example document.
        const s: string[] = [];
        for (let i = 0; i < args.repetitions; i++) {
            s.push(example);
        }

        // Encode the joined string to UTF-8 bytes.
        console.time("encode");
        const encoder = new TextEncoder();
        const buf = encoder.encode(s.join("\n"));
        console.timeEnd("encode");

        console.time("compress");
        const compressed = lz4.compress(buf);
        console.timeEnd("compress");

        console.time("decompress");
        const decompressed = lz4.decompress(compressed);
        console.timeEnd("decompress");

        // Sanity check: the round trip must reproduce the original bytes.
        if (decompressed.length !== buf.length || !decompressed.every((value, index) => value === buf[index])) {
            throw new Error("Decompressed data does not match original data");
        }

        console.log(`[lz4] Compressed ${(buf.length / 1024).toFixed(2)}KB to ${(compressed.length / 1024).toFixed(2)}KB (ratio: ${(compressed.length / buf.length).toFixed(2)})`);
    },
});
#

was trying it with 10 repetitions of a 40KB json document (not a representative test for compression ratio, ofc)

encode: 0ms
compress: 14ms
decompress: 8ms
[lz4] Compressed 407.22KB to 15.33KB (ratio: 0.04)
chrome vector
#

I like the idea of this. However, I have also grown fond of the schema niceties. Is it possible to reuse my schema definition for validation of the uncompressed doc?

#

And thanks again for all the advice you've already given here.

granite rose
#

hmm, not that I'm aware of. one idea for a workaround would be to switch to using zod validators. it'd be less nicely integrated with everything but still pretty close.