#Extended validation in the schema


wheat parrot
#

The schema is an ideal place to express validation constraints, but it's pretty limited right now. I'm aware we can make zod work with convex, but the challenge is we then have to define loose validation with convex, and then extended validation separately with something else.

I'm going to come up with some sort of pattern for my own use, but curious if there's been any thought on this with the team.

feral minnow
#

It should be possible to wrap ctx.db to perform validation on writes (or reads), beyond what the database does. See Ian's latest on this topic: https://stack.convex.dev/typescript-zod-function-validation#can-i-use-zod-to-define-my-database-types-too

It's possible that in the future we'll add a layer that'll allow for this and similar needs like defaults and row/field-level security/authorization.
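
To make the "wrap ctx.db" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `DbWriter`, `withValidation`, and `makeMemoryDb` are illustrative names rather than Convex APIs, and the in-memory writer only stands in for the write side of `ctx.db`. Real `ctx.db` methods are async; this is kept synchronous for brevity.

```typescript
// Hypothetical sketch: none of these names are Convex APIs.
type Doc = Record<string, unknown>;

// Minimal stand-in for the write side of ctx.db (synchronous for brevity).
interface DbWriter {
  insert(table: string, doc: Doc): string;
}

// A validator returns a list of problems; an empty list means the doc is fine.
type Validator = (doc: Doc) => string[];

// Wrap a writer so every insert runs the table's validator first.
function withValidation(
  db: DbWriter,
  validators: Record<string, Validator>,
): DbWriter {
  return {
    insert(table, doc) {
      const validate = validators[table];
      const problems = validate ? validate(doc) : [];
      if (problems.length > 0) {
        throw new Error(`Invalid ${table} document: ${problems.join("; ")}`);
      }
      return db.insert(table, doc);
    },
  };
}

// In-memory writer used only to exercise the wrapper.
function makeMemoryDb(): DbWriter & { rows: Array<{ table: string; doc: Doc }> } {
  const rows: Array<{ table: string; doc: Doc }> = [];
  return {
    rows,
    insert(table, doc) {
      rows.push({ table, doc });
      return `${table}:${rows.length}`;
    },
  };
}

const memoryDb = makeMemoryDb();
const db = withValidation(memoryDb, {
  users: (doc) =>
    typeof doc.email === "string" && doc.email.includes("@")
      ? []
      : ["email must contain '@'"],
});
```

With this wrapper, `db.insert("users", { email: "nope" })` throws before anything reaches the underlying writer, while tables without a registered validator pass straight through.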


wheat parrot
#

I'm more looking at where the types for validation are set. I want to find a way to set them in line with the schema definition. I currently have to define a field as numeric in the schema, and then somewhere else define that it is limited to ten characters. I want to do those in the same place.

I'm thinking about making a higher level schema where field definitions can be expanded, and then running that through a parser to generate the schema that convex needs. But wanted to see if the team had thoughts first.
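
A rough sketch of what such a higher-level schema could look like, under stated assumptions: `RichField`, `toBaseSchema`, and `validate` are made-up names, and the "base schema" here is just a map of field names to primitive type names, not Convex's actual validator objects. The point is only that the storage type and the extra constraint live in one definition, and the loose schema is derived from it.

```typescript
// Hypothetical sketch: these types are illustrative, not Convex's schema API.
type BaseType = "string" | "number" | "boolean";

// A rich field carries the storage type plus optional extra constraints.
interface RichField {
  type: BaseType;
  maxLength?: number; // only meaningful for strings
  min?: number; // only meaningful for numbers
  max?: number; // only meaningful for numbers
}

type RichTable = Record<string, RichField>;

// Derive the loose, storage-level schema by dropping the extra constraints.
function toBaseSchema(table: RichTable): Record<string, BaseType> {
  const base: Record<string, BaseType> = {};
  for (const [name, field] of Object.entries(table)) {
    base[name] = field.type;
  }
  return base;
}

// Check a document against the full rich definition.
function validate(table: RichTable, doc: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const [name, field] of Object.entries(table)) {
    const value = doc[name];
    if (typeof value !== field.type) {
      problems.push(`${name}: expected ${field.type}`);
      continue;
    }
    if (
      typeof value === "string" &&
      field.maxLength !== undefined &&
      value.length > field.maxLength
    ) {
      problems.push(`${name}: longer than ${field.maxLength} characters`);
    }
    if (typeof value === "number") {
      if (field.min !== undefined && value < field.min) {
        problems.push(`${name}: below ${field.min}`);
      }
      if (field.max !== undefined && value > field.max) {
        problems.push(`${name}: above ${field.max}`);
      }
    }
  }
  return problems;
}

// One definition: storage type and the ten-character limit side by side.
const accounts: RichTable = {
  code: { type: "string", maxLength: 10 },
  balance: { type: "number", min: 0 },
};
```

Here `toBaseSchema(accounts)` yields only the primitive types, which is the part a database-level schema would need, while `validate(accounts, doc)` applies the full definition at write time.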

primal prairie
#

I'll expand on this a little bit, and it's another instance of having "compressed" the database layer and, like, the user ergonomics this early, just because of how much time it takes to build each. we've talked about ideas about a higher-level ORM-type thing that has more powerful validations on it that would go on top of the schema definition. the reason why we'd always have something like the current schema definition is we just happen to be using typescript right now to create the canonical definition of the types actually at rest in the database

#

so I'd view this other eventual layer as a great way to make everything really well conforming in, say, the JS/TS environment, which probably includes all the UDFs right now (and for a long time to come)

#

the reason we also need a baseline definition of the database types is we run into things like export/import, integration with SQL, streaming jobs out to other services, where there is a need to have a pretty fundamental definition of the column types. in all of these other places, more sophisticated validators may not run because of a lower-level interface with some other more basic system, like a data warehouse or DuckDB or something like that

#

so there still is a bit of a necessary distinction we need to maintain between the clarity on what actually is at rest in the database, and the code we run in, say, the UDF environment to make it more likely those values conform to certain constraints

#

so, when it comes to what the team is actually doing with higher-level validators, most of the active work right now is around @quaint cradle's work with zod

wheat parrot
#

Thanks for the breakdown! Keeping a bottom layer of universal types definitely makes sense, I'll deal with the separation of definitions for now.

primal prairie
#

cool. we'd love to see what you come up with, because we definitely know people want and need more power here

#

also @proper ingot's work with Effect + convex is maybe yet another way to have more powerful/opinionated value specifications that layer over our schema definition

quaint cradle
# wheat parrot I'm more looking at where the types for validation are set. I want to find a way...

I'm thinking about making a higher level schema where field definitions can be expanded, and then running that through a parser to generate the schema that convex needs.

The Zod post didn't call it out very loudly, but the Zod helpers already come with two functions: zodToConvex(z.*) and zodToConvexFields({ fieldName: z.*, ... }). Both produce convex validators recursively (v.*). So you can define your higher level schema in zod and generate Convex schema for defineTable etc.

I'd like to make some Zod helper that turns a zod schema ({ field: z.*, ... }) into both tables and an associated reader / writer wrapper that runs the right parser on the right table, so you could go full-zod without a lot of manual schlep. I don't want to put something out prematurely though, so I won't rush it out in the next couple weeks

wheat parrot
#

@quaint cradle oh nice! So they're able to convert a more complex parser, like z.string().email(), into a simple v.string() for the convex schema?

quaint cradle
wheat parrot
#

Awesome, definitely using this. Thank you!

proper ingot
#

Aye, this is exactly what I'm working on, but with the Effect ecosystem's Schema library rather than Zod. I also have a working Effect Schema to Convex Validator compiler (https://github.com/rjdellecese/effect-convex/blob/755f46db3012940f7d242ea6f55c5210352d23b4/src/schema-to-validator-compiler.ts), which exposes functions that operate exactly as Ian's zodToConvex and zodToConvexFields do. I also want to wrap other Convex APIs so that you can define your schema and read/write (decode/encode) data from your database in terms of a richer schema language (Schema in my case). I think it will be a little easier with Schema, because each Schema contains a decoder and an encoder (the type is actually Schema<From, To = From>), whereas in Zod you'll have to define a separate Zod decoder and encoder for each field.
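
To illustrate what "each Schema contains a decoder and an encoder" buys you, here is a minimal hand-rolled sketch. It is not the Effect Schema or Zod API: `Codec`, `dateFromMillis`, and `eventCodec` are hypothetical names, and the choice of storing dates as epoch milliseconds is an assumption made for the example.

```typescript
// Hypothetical sketch: a hand-rolled codec, not the Effect Schema or Zod API.
// Stored is the at-rest representation; App is the richer in-app type.
interface Codec<Stored, App> {
  decode(stored: Stored): App; // database value -> application value
  encode(app: App): Stored; // application value -> database value
}

// Assumption: dates are kept at rest as epoch milliseconds.
const dateFromMillis: Codec<number, Date> = {
  decode(stored) {
    const date = new Date(stored);
    if (Number.isNaN(date.getTime())) {
      throw new Error("invalid timestamp");
    }
    return date;
  },
  encode(app) {
    return app.getTime();
  },
};

interface StoredEvent {
  name: string;
  at: number;
}

interface AppEvent {
  name: string;
  at: Date;
}

// A document codec built from field codecs: one definition covers both
// the read path (decode) and the write path (encode).
const eventCodec: Codec<StoredEvent, AppEvent> = {
  decode: (stored) => ({
    name: stored.name,
    at: dateFromMillis.decode(stored.at),
  }),
  encode: (app) => ({
    name: app.name,
    at: dateFromMillis.encode(app.at),
  }),
};
```

A reader wrapper would call `eventCodec.decode` on documents coming out of the database and a writer wrapper would call `eventCodec.encode` before inserting, so the two directions cannot drift apart.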

#

My goals with this Effect/Convex library are broader than just this (see #general message for more), but I think this is the highest-value first feature! It would be really neat to see this functioning for Zod, too @quaint cradle. I'm sure you're much more familiar with the Convex JS codebase than I am, but if you ever want to compare notes on your implementation of this for Zod, let me know

proper ingot
#

This all makes me think of this cool diagram (figure 2) from some UW CS course notes (https://courses.cs.washington.edu/courses/cse341/04wi/lectures/13-dynamic-vs-static-types.html), and this awesome talk by Runar Bjarnason (https://www.youtube.com/watch?v=GqmsQeSzMdw), in which he coined (or at least popularized?) the phrase "Constraints Liberate, Liberties Constrain".

Any given user of Convex has a lot of control over the context in which they need or want to interface with their Convex database. I might choose to only interface with (or at least write to) Convex via a TypeScript application. This gives me a lot of power, if I use some nice high-level abstractions like Ian's Zod helpers, or my Effect/Schema library, to ensure that my data obeys certain rules/is correct in great detail. But the more you try to interface with other systems/languages, the more you need to relax the constraints on your data in order to have a constraint-definition language that actually applies its constraints to all of the systems/languages that you want to support (assuming, of course, that that's your goal). This is one reason, I think, why it's tricky for Convex to take their current Validator approach much further than it has been taken so far.

primal prairie
#

yep, and that's exactly why we haven't. even if more powerful things layer on top, we also want to make sure we keep available and unobscured a bare-bones representation of exactly what's in the database that will be universally recognized by all entities interacting with the database over the next 10+ years of your company when you have tons of systems and millions of lines of code 😛