Delta updates on query? - understanding bandwidth | Convex Community | Page 1

chrome vector Sep 30, 2024, 10:25 AM

#

Hello, I am building a card game with big ridiculous states (see picture).

I'm trying to understand data usage in detail again.
On games.getGame(id), does the subscription have to get the whole row every time? Are there partial updates and/or compression middleware that could allow for delta updates similar to git patches?

Would a sensible future optimization be to split up queries so that each grabs only a single column or so, in order to reduce database bandwidth? Or are there other recommended steps to reduce bandwidth?

granite rose Sep 30, 2024, 5:49 PM

#

hey @chrome vector -- we currently don't have a way to do partial reads or writes to a document but it's on our radar.

in the meantime, one recommendation is to break your document up into smaller pieces. for example, you could have a separate combatLog table that stores { gameId: Id<"games">, "action": ... } with an index on gameId. then, it'd be efficient to (1) append a new entry to the combat log and (2) only read a range of the log if needed.

chrome vector Sep 30, 2024, 7:49 PM

#

thanks for the answer @granite rose yeah I figured this was gonna be the answer. will keep it as is and then optimize when I go into alpha tests.

Even if I break out the tables though - and / or implement compression middleware - my use case would still benefit from delta updates:
The combat log, e.g., is not just a log for analysis; it actually shows the opponent's last turns as part of the interface. See bottom right of this image.

In the schema, I have combatLog: v.array(vCombatAction),

So in my dream scenario, we have almost like a query planner middleware that automatically checks on v.array or v.object columns whether it is worth it to send diffs instead of raw data. In this case, it could just broadcast something like APPEND("X passed turn") instead of resending the entire array. On the game field object, it could detect that only 2 fields have been changed and can send a diff instead of the entire document.
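A rough sketch of the diffing idea described above, written as plain TypeScript with no Convex involvement. The `Patch` type and the `diff` and `applyPatches` functions are hypothetical names made up for illustration; nothing like this is confirmed to exist in Convex. It only diffs top-level fields and treats an array that grew at the end as an append.

```typescript
// Hypothetical patch operations; none of these names are part of Convex.
type Patch =
  | { op: "set"; key: string; value: unknown }
  | { op: "append"; key: string; items: unknown[] }
  | { op: "delete"; key: string };

type Doc = Record<string, unknown>;

// Structural equality via JSON; adequate for plain JSON data with stable key order.
const same = (a: unknown, b: unknown): boolean =>
  JSON.stringify(a) === JSON.stringify(b);

// Compute a shallow (top-level) diff between two versions of a document.
function diff(prev: Doc, next: Doc): Patch[] {
  const patches: Patch[] = [];
  for (const key of Object.keys(next)) {
    const a = prev[key];
    const b = next[key];
    if (same(a, b)) continue;
    // An array that only grew at the end becomes an append.
    if (
      Array.isArray(a) &&
      Array.isArray(b) &&
      b.length > a.length &&
      same(a, b.slice(0, a.length))
    ) {
      patches.push({ op: "append", key, items: b.slice(a.length) });
    } else {
      patches.push({ op: "set", key, value: b });
    }
  }
  for (const key of Object.keys(prev)) {
    if (!(key in next)) patches.push({ op: "delete", key });
  }
  return patches;
}

// Apply patches to a copy of the previous document.
function applyPatches(prev: Doc, patches: Patch[]): Doc {
  const out: Doc = { ...prev };
  for (const p of patches) {
    if (p.op === "set") out[p.key] = p.value;
    else if (p.op === "append") out[p.key] = [...(out[p.key] as unknown[]), ...p.items];
    else delete out[p.key];
  }
  return out;
}
```

With a state like `{ turn: 4, combatLog: ["A moved"] }` changing to `{ turn: 5, combatLog: ["A moved", "X passed turn"] }`, `diff` would yield one `set` for `turn` and one `append` for `combatLog`, so the payload scales with the change rather than with the document.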

Will likely implement a version of this AND/OR try breaking the documents into subdocuments when it becomes necessary to actually conserve bandwidth for me, but maybe it's a common enough use case to be worth an official middleware.

granite rose Sep 30, 2024, 8:22 PM

#

makes sense!

btw, we currently have pagination (https://docs.convex.dev/database/pagination) for querying stuff like a combatLog table and efficiently sending updates as it changes, but we have some improvements in mind to make it easier to use.

chrome vector Sep 30, 2024, 8:53 PM

#

hm I kinda want to have all moves on screen the whole time, implementing this via the pagination concept feels like a bit of a misuse. Like I would ALWAYS have to trigger loadMore as soon as status hits "CanLoadMore"

keen elbow Sep 30, 2024, 10:22 PM

#

First off, that game screen looks amazing! I'm not sure it's the kind of game I'd go for, but it definitely has a lot of visual appeal.

As for the data retrieval process, here's a rough idea: Use two queries.

First, this piggybacks on the table-splitting idea proposed by @granite rose , so each move in the combat log would have to be its own document in a separate table.

The first query retrieves the entire collection of moves for a given game when the game loads (or perhaps use pagination to only get the most recent X moves, only loading moves older than that if the user requests). If it's the start of a game, it will be an empty array; otherwise an array of all moves up to the present time. Again, this query only runs once when the game first loads, and it stores the collected documents in a state variable, which is used to render the on-screen list.

The second query only retrieves the most recent move in the game using an index targeting the game ID, with the query ending in .order("desc").take(1). As each new record comes in via this query, it's appended to the full array in state.

This means that the only potentially-heavy query is the first one, but only if a game is in-progress and has lots of moves.
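The client-side half of this two-query idea might look roughly like the following. It is a sketch under assumptions: the `Move` shape and its `seq` field are hypothetical (only `_id` is something Convex documents are known to carry), and `mergeLatest` and `hasGap` are made-up helper names. The gap check is there because a `take(1)` subscription only ever shows the newest move, so if two moves are written between updates the earlier one would not arrive this way.

```typescript
// Hypothetical shape of one combat log document; field names other than _id are assumptions.
interface Move {
  _id: string;
  gameId: string;
  seq: number; // position of the move within the game
  action: string;
}

// Merge the latest move from the live subscription into the moves held in client state.
// Returns the same array when nothing changed, so a UI layer can skip re-rendering.
function mergeLatest(moves: Move[], latest: Move | null): Move[] {
  if (latest === null) return moves; // no moves in this game yet
  if (moves.some((m) => m._id === latest._id)) return moves; // already known
  return [...moves, latest].sort((a, b) => a.seq - b.seq);
}

// Detect a missed move: consecutive entries should have consecutive seq values.
function hasGap(moves: Move[]): boolean {
  let prev: number | undefined;
  for (const m of moves) {
    if (prev !== undefined && m.seq !== prev + 1) return true;
    prev = m.seq;
  }
  return false;
}
```

If `hasGap` ever returns true, one option would be to re-run the one-time full query to resynchronize.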

Would that work for this use case?

#

My idea for the one-time-only query came after reading about ConvexReactClient and its query method here: https://docs.convex.dev/api/classes/react.ConvexReactClient#query

FWIW, I've not actually used that method before, but it sounds like it would work.

chrome vector Oct 1, 2024, 12:39 PM

#

Thanks for the input, once the game is stable enough (hopefully in like a month) I will start working on the more out there performance improvements.

granite rose Oct 1, 2024, 7:02 PM

#

chrome vector hm I kinda want to have all moves on screen the whole time, implementing this vi...

makes sense -- I agree that from an API perspective, it's a bit awkward to be manually calling loadMore until you reach the end.

from a data loading perspective, however, it's accommodating the case where there are many (say thousands) of combat log entries in a game. then, it may make sense to stream them into the app in pages and not block interactivity on loading all of the log entries.

chrome vector Dec 11, 2024, 2:47 PM

#

@granite rose just started playtesting and hit this: 90 MB of bandwidth over 4 games. Optimization is starting to get more priority again for me.

#

also @keen elbow if you're interested in the game feel free to DM me

granite rose Dec 11, 2024, 2:52 PM

#

cool! do you have a sense for how much bandwidth it should take per game as a lower bound? we can then see what optimizations we’d need to get there.

#

also curious how the tables and queries are set up for the game — maybe there are some easy wins like the document splitting idea from before

keen elbow Dec 11, 2024, 2:54 PM

#

chrome vector also @keen elbow if you're interested in the game feel free to DM me

Appreciate the info. As I said earlier, this isn't really my type of game. My wife might be interested in taking it for a spin, though. I'll show her the screenshot and if she's interested, I'll DM you.

chrome vector Dec 11, 2024, 3:21 PM

#

granite rose also curious how the tables and queries are set up for the game — maybe there ar...

Not really, but I'm reconsidering provisioning a little server that just keeps the state in memory and implements its own websocket API. Would be a lot more inconvenient though. Will look into splitting up the state more. But I guess 0.2 MB of state * 100 game actions = 20 MB feels ballpark correct.

#

Appreciate all the attention you're giving even to more out there use cases like mine by the way 🙂

granite rose Dec 11, 2024, 3:55 PM

#

chrome vector Not really but I'm reconsidering provisioning a little server that just keeps th...

yeah, i’d be curious if we could find a way to make the bandwidth close to sizeof(action) * 100 actions. on one extreme convex could just sync an action log, but there should be ways to tweak the server data model to get close to this without fully upending everything.
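The arithmetic in this exchange can be written out as a small sketch. The 0.2 MB state size, the 100 actions, and the 90 MB over 4 games come from the thread; the 200-byte action size is a made-up figure purely for illustration, and decimal megabytes are assumed.

```typescript
const MB = 1e6; // decimal megabytes; an assumption about the units used in the thread

// Bandwidth when every action re-sends the whole state document.
function fullStateBandwidth(stateBytes: number, actions: number): number {
  return stateBytes * actions;
}

// Bandwidth when only the action itself is synced: sizeof(action) * actions.
function actionLogBandwidth(actionBytes: number, actions: number): number {
  return actionBytes * actions;
}

// Estimate from the thread: 0.2 MB of state re-sent on each of 100 actions.
const perGameFull = fullStateBandwidth(0.2 * MB, 100);

// Observed in playtesting: 90 MB over 4 games.
const observedPerGame = (90 * MB) / 4;

// Lower bound with a hypothetical 200-byte action (not a number from the thread).
const perGameLog = actionLogBandwidth(200, 100);
```

The estimate (about 20 MB per game) lands close to the observed 22.5 MB per game, which supports the reading that whole-state re-sends dominate. Under the assumed action size, the action-log lower bound would be around 20 KB per game, roughly a thousand times smaller.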

chrome vector Dec 11, 2024, 4:18 PM

#

I think the main issue is just that my gamestate representation is gratuitously verbose for development convenience, and I need to add an encoding, like chess notation, to compress it. The JSON for the battlefield alone is way too big when it arguably could be downsized to 64 bytes for the terrain types and maybe a couple more bytes for the hexId to entityId mapping. It really was designed for maximum TypeScript convenience and takes an awful amount of space.

That's what I'm considering quick wins atm.

At some point, I could go deep and only selectively load gamestate for the actions that need it and split up everything but that feels like a big challenge and will slow down development. Better encoding will be a data layer only change on the other hand and not touch game logic.
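As an illustration of the "data layer only" encoding idea, here is a minimal sketch that packs terrain into one byte per hex and expands it again for game logic. The terrain names, the `Hex` shape, and the assumption that hexIds are dense integers starting at 0 are all hypothetical; the real game's types were not shown in the thread.

```typescript
// Hypothetical terrain vocabulary; the real game's terrain types were not shown.
const TERRAINS = ["plains", "forest", "mountain", "water"] as const;
type Terrain = (typeof TERRAINS)[number];

// Verbose, development-friendly representation of one battlefield cell.
interface Hex {
  hexId: number; // assumed dense: 0..n-1
  terrain: Terrain;
}

// Pack the board into one byte per hex, indexed by hexId.
function encodeTerrain(hexes: Hex[]): Uint8Array {
  const out = new Uint8Array(hexes.length);
  for (const h of hexes) {
    out[h.hexId] = TERRAINS.indexOf(h.terrain);
  }
  return out;
}

// Expand the packed bytes back into the verbose form used by game logic.
function decodeTerrain(bytes: Uint8Array): Hex[] {
  return Array.from(bytes, (code, hexId) => {
    if (code >= TERRAINS.length) throw new Error(`Unknown terrain code ${code}`);
    return { hexId, terrain: TERRAINS[code] };
  });
}
```

For a 64-hex board this produces 64 bytes, against well over a kilobyte for the same data as JSON objects. Game logic keeps working with `Hex[]`; only the read and write paths would call `decodeTerrain` and `encodeTerrain`.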

granite rose Dec 11, 2024, 5:03 PM

#

makes sense.

yeah, I think splitting stuff up into smaller documents will make a lot of stuff better automatically -- mutations will be cheaper when they only fetch what they need, and queries can be finer-grained, have fewer reactivity updates, get cached more effectively. but, understood how this then means pushing database access into your game logic.

on the other extreme, if you're storing everything in one big game state document, have you tried compressing the game state before writing it to the db? this is really quick & dirty, and it'll make the dashboard not that useful, but it could be worth trying.

i've used lz4js in queries/mutations and it works great:

import { v } from "convex/values";
import { mutation } from "./_generated/server";
import * as lz4 from "lz4js";

// Stand-in payload: `example` was not shown in the original post
// (it was a roughly 40KB JSON document). Replace with representative data.
const example = JSON.stringify({ placeholder: "replace with a representative game state" });

export const compressionTest = mutation({
    args: {
        repetitions: v.number(),
    },
    handler: async (ctx, args) => {
        // Build the test input by repeating the sample document.
        const s: string[] = [];
        for (let i = 0; i < args.repetitions; i++) {
            s.push(example);
        }
        console.time("encode");
        const encoder = new TextEncoder();
        const buf = encoder.encode(s.join("\n"));
        console.timeEnd("encode");

        console.time("compress");
        const compressed = lz4.compress(buf);
        console.timeEnd("compress");

        console.time("decompress");
        const decompressed = lz4.decompress(compressed);
        console.timeEnd("decompress");

        // Round-trip check: same length and identical bytes.
        if (
            decompressed.length !== buf.length ||
            !decompressed.every((value: number, index: number) => value === buf[index])
        ) {
            throw new Error("Decompressed data does not match original data");
        }

        console.log(`[lz4] Compressed ${(buf.length / 1024).toFixed(2)}KB to ${(compressed.length / 1024).toFixed(2)}KB (ratio: ${(compressed.length / buf.length).toFixed(2)})`);
    },
});

#

was trying it with 10 repetitions of a 40KB json document (not a representative test for compression ratio, ofc)

encode: 0ms
compress: 14ms
decompress: 8ms
[lz4] Compressed 407.22KB to 15.33KB (ratio: 0.04)

chrome vector Dec 12, 2024, 7:41 PM

#

I like the idea of this. However, I have also grown fond of the schema niceties. Is it possible to reuse my schema definition for validation of the uncompressed doc?

#

And thanks again for all the advice you've already given here.

granite rose Dec 13, 2024, 12:08 AM

#

hmm, not that I'm aware of. one idea for a workaround would be to switch to using zod validators. it'd be less nicely integrated with everything but still pretty close.
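To make the "validate after decompressing" step concrete, here is a dependency-free sketch. It uses a hand-written type guard as a stand-in for a schema library's parse step, not zod itself, and the `GameState` shape is a hypothetical two-field slice since the real schema was not shown. With zod, the guard would be replaced by a schema object and its parse call.

```typescript
// Hypothetical slice of the game state; the real schema was not shown in the thread.
interface GameState {
  turn: number;
  combatLog: string[];
}

// Runtime check standing in for a schema library's parse step.
function isGameState(value: unknown): value is GameState {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const turn = record["turn"];
  const log = record["combatLog"];
  return (
    typeof turn === "number" &&
    Array.isArray(log) &&
    log.every((entry) => typeof entry === "string")
  );
}

// Parse the JSON text produced by decompression and validate it before use.
function parseGameState(json: string): GameState {
  const value: unknown = JSON.parse(json);
  if (!isGameState(value)) {
    throw new Error("Decompressed game state failed validation");
  }
  return value;
}
```

The same function could run on both the write path (before compressing) and the read path (after decompressing), so malformed state is caught at the boundary even though the database only sees opaque bytes.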
