#V10 Document Data Schema


surreal harbor
#

Some Discussion Ground Rules

I know most of you have opinions about just about anything/everything, but I would like to limit this conversation to developers who have been actively using the existing DocumentData API:

  1. Having prior knowledge of what the current API does will make it easier for me to explain proposed changes.
  2. This will make it easier to collect feedback on the proposal that is grounded in the current functionality of the system.
  3. This will also make it easier for me to understand the extent to which this will be a widely-breaking change (or not) across the developer community.

If you join this discussion, please begin by linking to a file or files in your git repo where you have created custom DocumentData definitions so that I can see how you are currently using the API.

#

What change are we considering?

Currently, in V8/V9, document data schemas are defined using basic objects with a few attributes, like:

someField: {
  type: String,
  required: true,
  default: ""
}

While this approach is nice in its simplicity, it has some limitations when it comes to supporting more sophisticated data structures like arrays of complex objects or embedded (inner) data structures - neither of which is currently supported without defining ancillary DocumentData objects. Furthermore, this approach requires a significant amount of logic in the DocumentData class to interpret how to parse/clean/validate input fields depending on their declared type.

The new approach that we are considering makes the fields of DocumentData more feature-rich and powerful as instances of a DataField class. These DataField instances are able to encapsulate parsing/cleaning/validation while allowing for more elegant recursive structures. For example, the field in my above example would now be declared as:

someField = new fields.StringField({required: true, default: ""})
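
A minimal sketch of what "encapsulating parsing/cleaning/validation" in a field instance could look like. The `DataField` and `StringField` classes below are hypothetical stand-ins written for illustration, and the `clean`/`validate` method names are assumptions rather than the actual API; only the `required` and `default` options come from the example above.

```javascript
// Hypothetical stand-ins for illustration only; not Foundry's actual classes.
class DataField {
  constructor(options = {}) {
    // `required` and `default` mirror the options used in the example above.
    this.required = options.required ?? false;
    this.default = options.default;
  }

  // Fill in the default for a missing value, otherwise cast to the field type.
  clean(value) {
    if (value === undefined) return this.default;
    if (value === null) return null;
    return this._cast(value);
  }

  // Throw if the (already cleaned) value is not acceptable for this field.
  validate(value) {
    if (value === undefined || value === null) {
      if (this.required) throw new Error("A value is required");
      return;
    }
    this._validateType(value);
  }

  _cast(value) { return value; }
  _validateType(value) {}
}

class StringField extends DataField {
  _cast(value) { return String(value); }
  _validateType(value) {
    if (typeof value !== "string") throw new Error("Value must be a string");
  }
}

const someField = new StringField({required: true, default: ""});
console.log(someField.clean(undefined)); // "" (the default)
console.log(someField.clean(42));        // "42"
```

With this shape, the containing data class no longer needs per-type branching: it can simply call `clean` and `validate` on each field.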
#

A more practical example

Under V9, the data definition for ActiveEffectData looks like this:

class EffectDurationData extends DocumentData {
  static defineSchema() {
    return {
      startTime: fields.field(fields.NUMERIC_FIELD, {default: null}),
      seconds: fields.NONNEGATIVE_INTEGER_FIELD,
      combat: fields.STRING_FIELD,
      rounds: fields.NONNEGATIVE_INTEGER_FIELD,
      turns: fields.NONNEGATIVE_INTEGER_FIELD,
      startRound: fields.NONNEGATIVE_INTEGER_FIELD,
      startTurn: fields.NONNEGATIVE_INTEGER_FIELD
    }
  }
}

class EffectChangeData extends DocumentData {
  static defineSchema() {
    return {
      key: fields.BLANK_STRING,
      value: fields.BLANK_STRING,
      mode: fields.field(fields.NONNEGATIVE_NUMBER_FIELD, {default: CONST.ACTIVE_EFFECT_MODES.ADD}),
      priority: fields.NUMERIC_FIELD
    }
  }
}

class ActiveEffectData extends DocumentData {
  static defineSchema() {
    return {
      _id: fields.DOCUMENT_ID,
      changes: {
        type: [EffectChangeData],
        required: true,
        default: []
      },
      disabled: fields.BOOLEAN_FIELD,
      duration: {
        type: EffectDurationData,
        required: true,
        default: {}
      },
      icon: fields.IMAGE_FIELD,
      label: fields.BLANK_STRING,
      origin: fields.STRING_FIELD,
      tint: fields.COLOR_FIELD,
      transfer: fields.field(fields.BOOLEAN_FIELD, {default: true}),
      flags: fields.OBJECT_FIELD
    }
  }
}

The classes EffectDurationData and EffectChangeData are defined as subclasses of DocumentData - but this is more of a workaround than an intentional design. These data objects are not useful on their own, and exist only to serve the needs of the parent ActiveEffectData class.

#

In the proposed V10 approach, ActiveEffectData can be declared with these inner objects in-line to declare nested data structures:

class ActiveEffectData extends DocumentData {
  static defineSchema() {
    return {
      _id: new fields.DocumentIdField(),
      changes: new fields.ArrayField(new fields.SchemaField({
        key: new fields.StringField({required: true, label: "EFFECT.ChangeKey"}),
        value: new fields.StringField({required: true, label: "EFFECT.ChangeValue"}),
        mode: new fields.NumberField({integer: true, initial: CONST.ACTIVE_EFFECT_MODES.ADD, label: "EFFECT.ChangeMode"}),
        priority: new fields.NumberField()
      })),
      disabled: new fields.BooleanField(),
      duration: new fields.SchemaField({
        startTime: new fields.NumberField({initial: null, label: "EFFECT.StartTime"}),
        seconds: new fields.NumberField({integer: true, positive: true, label: "EFFECT.DurationSecs"}),
        combat: new fields.ForeignDocumentField(documents.BaseCombat, {label: "EFFECT.Combat"}),
        rounds: new fields.NumberField({integer: true, positive: true}),
        turns: new fields.NumberField({integer: true, positive: true, label: "EFFECT.DurationTurns"}),
        startRound: new fields.NumberField({integer: true, positive: true}),
        startTurn: new fields.NumberField({integer: true, positive: true, label: "EFFECT.StartTurns"})
      }),
      icon: new fields.FilePathField({categories: ["IMAGE"], label: "EFFECT.Icon"}),
      label: new fields.StringField({required: true, label: "EFFECT.Label"}),
      origin: new fields.StringField({nullable: true, blank: false, initial: null, label: "EFFECT.Origin"}),
      tint: new fields.ColorField({label: "EFFECT.IconTint"}),
      transfer: new fields.BooleanField({initial: true, label: "EFFECT.Transfer"}),
      flags: new fields.ObjectField()
    }
  }
}

Notice that the changes array of EffectChangeData and the duration object of EffectDurationData have been folded into the ActiveEffectData schema. This example illustrates one level of depth with such recursive structures, but there is no depth limit; for example, duration could have its own inner object defined as a fields.SchemaField.
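
To illustrate why nesting has no inherent depth limit, here is a rough sketch in which each container field simply delegates to its inner fields. The class names echo the proposal, but these implementations (and the `clean` method and the `extra`/`ticks` keys) are assumptions made up for the example.

```javascript
// Hypothetical sketch of recursive fields; implementations are illustrative.
class NumberField {
  constructor({initial = null} = {}) { this.initial = initial; }
  clean(value) {
    if (value === undefined) return this.initial;
    return value === null ? null : Number(value);
  }
}

class SchemaField {
  constructor(fields) { this.fields = fields; }
  // Clean each declared key by delegating to the inner field, at any depth.
  clean(value = {}) {
    const cleaned = {};
    for (const [key, field] of Object.entries(this.fields)) {
      cleaned[key] = field.clean(value[key]);
    }
    return cleaned;
  }
}

class ArrayField {
  constructor(element) { this.element = element; }
  // Clean every entry of the array with the element field.
  clean(value = []) {
    return value.map(entry => this.element.clean(entry));
  }
}

const effectSchema = new SchemaField({
  changes: new ArrayField(new SchemaField({
    priority: new NumberField()
  })),
  duration: new SchemaField({
    rounds: new NumberField(),
    // A second level of nesting (invented keys), to show depth is not limited.
    extra: new SchemaField({ticks: new NumberField({initial: 0})})
  })
});

const cleanedEffect = effectSchema.clean({
  changes: [{priority: "20"}],
  duration: {rounds: "3"}
});
console.log(JSON.stringify(cleanedEffect));
// {"changes":[{"priority":20}],"duration":{"rounds":3,"extra":{"ticks":0}}}
```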

#

What are the anticipated benefits of this change?

  1. Our expectation is that in V10 and beyond more package authors will use the DocumentData API to define custom data structures, having the advantage of more robust data cleaning and validation.

  2. The new approach envisioned in V10+ will make it easy to generate standard Foundry-style form fields for data objects, and therefore forms to configure and customize the data attributes of your custom data structure.

  3. Additionally, a goal of V10 is to enable game system authors to easily define document type-specific data objects. It is common for game system data to contain arrays of objects or inner objects (for example things like data.attributes.strength). This new syntax for declaring data schema will make it much easier to define such complex schema.
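
As a rough idea of how form generation from field metadata (point 2) might work, the sketch below renders inputs from plain field descriptors. The descriptor shape (`type`, `label`, `initial`) and the `renderInput` helper are hypothetical; the label keys are borrowed from the V10 example above.

```javascript
// Hypothetical sketch: derive simple form inputs from field metadata.
// No HTML escaping or localization is attempted here.
const schema = {
  label: {type: "string", label: "EFFECT.Label", initial: ""},
  transfer: {type: "boolean", label: "EFFECT.Transfer", initial: true},
  seconds: {type: "number", label: "EFFECT.DurationSecs", initial: null}
};

function renderInput(name, field) {
  // Map the declared field type onto an HTML input type.
  const inputType = {string: "text", number: "number", boolean: "checkbox"}[field.type];
  const valueAttr = field.type === "boolean"
    ? (field.initial ? " checked" : "")
    : ` value="${field.initial ?? ""}"`;
  return `<label>${field.label} <input type="${inputType}" name="${name}"${valueAttr}></label>`;
}

const formHTML = Object.entries(schema)
  .map(([name, field]) => renderInput(name, field))
  .join("\n");
console.log(formHTML);
```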

Thoughts, feedback, questions, concerns?

somber ether
#

Is a SchemaField a shorthand for defining the inner DocumentData objects, or do they remain plain JS objects?

silk oriole
#

If you implement this, is it likely to be optional, or will you also tighten the schema for actors etc. so that we can't just define an empty object in the schema and populate it elsewhere? (Please no!)

brazen gate
#

Two questions really:

  1. How easy would the migration be, and how easy would it be to hook into things that already exist? For example, if I use DocumentData for custom themes, how easily will I be able to replace how I am already handling custom data with, say, system settings?
  2. Tangentially related: would this data structure allow additional add-ons specifically for rendering the data, i.e. a hook for a rendering plugin to replace handlebars?
surreal harbor
unreal pendant
#

IMO TypeScript already does schema for JS objects pretty well. I'm certain there are also other standardized JS/JSON schema systems. In a lot of ways I think you're trying to reinvent the wheel here.

surreal harbor
surreal harbor
#

> Would this data structure allow additional add-ons specifically for rendering the data, i.e. a hook for a rendering plugin to replace handlebars?
That is a stretch goal.

trail geyser
surreal harbor
hoary smelt
surreal harbor
somber ether
#

We've only merged in pretty minimal subclassing for DocumentData:

export class ArmorData extends BasePhysicalItemData {
    static DEFAULT_ICON = "systems/pf2e/icons/default-icons/armor.svg";
}

I've played around with using DocumentData for system data objects, but client-side schema validation complicated performing migrations

silent nova
hoary smelt
brazen gate
#

Oh yeah, I have a thought and a big ask: is there a chance the DB storage of this data can be updated when reading data, especially for arrays? For instance, right now if I use the string format for saving data in an array, it converts the array into an object: setProperty('data.someArray.0.someProperty') converts {someArray: [{someProperty: 'foo'}]} to {someArray: {0: {someProperty: 'foo'}}}

#

Not sure if that is something that can be adjusted in this DocumentData
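
The effect described here can be reproduced with a naive dotted-key expansion. The `expandDotted` helper below is written purely for illustration and is not Foundry's implementation; it only shows how a numeric path segment ends up as an object key when nothing knows the target is an array.

```javascript
// Illustration only: a naive dotted-key expansion, not Foundry's code.
function expandDotted(flat) {
  const expanded = {};
  for (const [path, value] of Object.entries(flat)) {
    const keys = path.split(".");
    let target = expanded;
    for (const key of keys.slice(0, -1)) {
      // Numeric segments such as "0" become plain object keys here.
      if (!(key in target)) target[key] = {};
      target = target[key];
    }
    target[keys[keys.length - 1]] = value;
  }
  return expanded;
}

const stored = {someArray: [{someProperty: "foo"}]};
const update = expandDotted({"someArray.0.someProperty": "bar"});

// Replacing the stored value with the expanded update loses the array type.
const result = {...stored, ...update};
console.log(Array.isArray(result.someArray)); // false
console.log(JSON.stringify(result)); // {"someArray":{"0":{"someProperty":"bar"}}}

// Writing the whole array back keeps it an array.
const wholeArray = stored.someArray.map((entry, i) =>
  i === 0 ? {...entry, someProperty: "bar"} : entry);
console.log(Array.isArray({...stored, someArray: wholeArray}.someArray)); // true
```

A schema that declares `someArray` as an array field would, in principle, have enough information to interpret the numeric segment as an index instead.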

surreal harbor
trail geyser
#

I personally do not recommend arrays in data, but that's just my advice.

surreal harbor
unreal pendant
#

@surreal harbor Does the schema need to cover only JSON data, or JS values too (i.e. function and class types)?

surreal harbor
unreal pendant
#

@surreal harbor Have you looked at AJV?

somber ether
#

That #_initializeData alters the contents of the _source property seems like it would make it hazardous to define a schema for system data when it comes time to perform a migration

#

A way for a system's migration framework to get its hands on source data prior to it getting run through schema validation would seem necessary

surreal harbor
# unreal pendant @surreal harbor Have you looked at AJV?

I had not, looking at it now. It looks pretty good for basic schema, but I'm looking at their syntax for things like inner objects or arrays of objects and it looks pretty clunky to me. Don't want to judge too quickly so I'm going to do a bit more reading

#

okay, never mind - this looks a bit more straightforward than I first thought, there's some good stuff here

unreal pendant
#

@surreal harbor Ok, well, if you do decide to roll your own: my criticism of what you've shown us is that the schema is defined with code. I think a good schema should be defined with data (i.e. JSON, XML or YAML), since that makes it more portable, testable, and consistent. If you define a schema with code, people will create things like loops to generate the schema, which makes it difficult for people to see what they need to match.

surreal harbor
#

Fields support some configuration options, like custom validation or cleaning functions, which would not be possible to define in JSON/XML/YAML/etc.

#

defining schema in code also allows for fields to reference related objects by reference rather than simply by name
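
A small demonstration of the point about functions and object references: standard JSON serialization silently drops both. The option names and the placeholder `BaseCombat` class here are illustrative only.

```javascript
// Field options with a custom validator; the option names are illustrative.
const options = {
  required: true,
  validate: value => value.length <= 32
};
const asJson = JSON.stringify(options);
console.log(asJson); // {"required":true}

const roundTripped = JSON.parse(asJson);
console.log(typeof roundTripped.validate); // undefined

// A class reference (a function value) is dropped in the same way.
class BaseCombat {} // placeholder class for the example
console.log(JSON.stringify({model: BaseCombat})); // {}
```

A data-only schema would therefore have to refer to validators and related classes by name and resolve them through some registry at load time.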

unreal pendant
#

But it's not really portable between say a system and a module implemented for a system (or multiple systems)

trail geyser
#

Uh, yea, it's not portable. The schema for system owned documents belongs to the system.

surreal harbor
unreal pendant
#

Well, the goal of any schema is validation between different segments of code: a contract for how data is communicated. For example, a module might want to validate its data before passing it to the system or another module.

trail geyser
#

This is for validating documents and updates to said documents, not about validating data one is passing to other modules.

#

You can get the schema from the document and validate against it at any time, if the goal is just to avoid passing invalid document data.

surreal harbor
#

I see what Kage means, the scope for this is intended to be more broad than just Documents, ideally this approach would be useful for all sorts of cases where a package author wants to define a data model

#

but I don't see where the problem comes in, if a module or system defines a data schema it can expose/export that schema such that other code can also use it

unreal pendant
#

I mean, in a more abstract case. People looking to integrate foundry databases outside of foundry. Compendium importing/exporting, third party services, etc.

#

I can give you one case, Moulinette allows Patron creators to create assets (which could eventually include documents). Would be useful for this service to validate system specific documents outside of foundry completely

#

I'm just recommending allowing portability for future cases outside of your immediate use case.

surreal harbor
unreal pendant
#

Sure, just giving some push back to help you see all the possible angles.

surreal harbor
#

Appreciate that - for a bit more context - do you currently subclass DocumentData in one of your packages?

unreal pendant
#

Not in a published package. But I am implementing a homebrew system in foundry.

#

Which of course, does have to extend DocumentData

surreal harbor
#

which is partly why I'm trying to understand in this thread how many devs would be affected by changes here

unreal pendant
#

Oh, well maybe I'm doing my system wrong lol

trail geyser
#

Not really.

somber ether
trail geyser
#

The DocumentData stuff did not exist when I created my system. A lot of others are in that boat.

elfin maple
somber ether
#

sure

#

I saw the ground rules message too

#

pf2e does but makes minimal use of it

#

it's nice for instanceof checking at the very least

#
  • setting per-actor/item type default icons
#

using it for system data remains scary

surreal harbor
somber ether
#

mainly the migration problem

surreal harbor
#

I would agree that the current system is not yet ideal for system data

somber ether
#

to the extent that we tighten a system-data schema, pre-migration data would be lost

surreal harbor
#

Migration meaning the need to migrate old incompatible data values to new compatible ones?

somber ether
#

right

surreal harbor
#

Presumably that migration could be done as part of initializing the DocumentData source?

somber ether
#

if pre-migration system data gets run through the schema shredder first, old properties would get dropped, etc.

somber ether
#

Though our migrations are async

surreal harbor
#

hairy

unreal pendant
#

Could I make an argument that schema (data structure), validation (data integrity), and cleanup (data upgrade, alterations and corrections) are three separate concerns and don't necessarily need to share a unified approach?

surreal harbor
#

@somber ether probably worth standardizing a workflow where migrations can/should occur using raw source data before documents are constructed.

#

which could allow async operations
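
A sketch of that ordering, under the assumption that schema cleaning drops undeclared keys. None of the functions below are Foundry API; `cleanAgainstSchema`, `migrateSource`, and `loadDocumentData` are invented names, and the awaited step stands in for something like fetching fresh item data from a compendium.

```javascript
// Hypothetical workflow sketch; none of these functions are Foundry API.
const declaredKeys = ["hp"];

// Stand-in for schema cleaning: anything not declared is dropped.
function cleanAgainstSchema(source) {
  return Object.fromEntries(
    Object.entries(source).filter(([key]) => declaredKeys.includes(key))
  );
}

// Migration runs on the raw source and may await other work.
async function migrateSource(source) {
  const migrated = {...source};
  if ("hitPoints" in migrated) {
    migrated.hp = await Promise.resolve(migrated.hitPoints); // simulated async step
    delete migrated.hitPoints;
  }
  return migrated;
}

// Migrate first, then clean, so legacy values survive under their new keys.
async function loadDocumentData(rawSource) {
  const migrated = await migrateSource(rawSource);
  return cleanAgainstSchema(migrated);
}

const legacy = {hitPoints: 12};
console.log(cleanAgainstSchema(legacy)); // {} (the legacy value is lost)
loadDocumentData(legacy).then(data => console.log(data)); // { hp: 12 }
```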

somber ether
#

I suppose we could reach inside game.data for original "dirty" data

#

the async bit is in part to grab fresh item data from compendiums

surreal harbor
#

it's a little off-topic for this thread, but maybe we can make figuring out a solution here an objective for V10 prototyping

#

pretty much all game systems need to perform data migrations from time to time, it would be good to provide a standard approach for that rather than needing each system to roll their own solution

surreal harbor
#

I'm not sure I agree with it though

#

data validation depends on schema, and curing/correcting invalid data also depends on both schema and validation

somber ether
#

Are the custom DocumentData subclasses you're thinking of for the purpose of this thread the sort that may inherit directly from DocumentData (rather than ActorData, etc.)?

surreal harbor
#

so the degree to which they are separate concerns is ... to me ... tenuous at best

somber ether
#

I don't think mixing light coercions with validation is any separation-of-concerns crime

unreal pendant
#

To me, the concern with data structure is preventing things like null accessors. Checking that data.duration.startTime is non-negative is pointless if data.duration is undefined or null. Data structure is about paths and types, while validation is about the data being within acceptable ranges. If you check that your data is structured correctly, validation becomes inherently more reliable.
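
The distinction drawn here can be sketched as two stages: a structural check (path and type) that guards the range check. The `checkStartTime` function is a made-up example, not part of any API.

```javascript
// Illustrative only: a structural check followed by a range check.
function checkStartTime(data) {
  const duration = data?.duration;
  // Structure: the path must exist before any value can be inspected.
  if (duration === undefined || duration === null) {
    return "structure: data.duration is missing";
  }
  // Structure: the value must have the expected type.
  if (typeof duration.startTime !== "number") {
    return "structure: data.duration.startTime is not a number";
  }
  // Validation: only now is a range check meaningful.
  if (duration.startTime < 0) {
    return "validation: data.duration.startTime must be non-negative";
  }
  return "ok";
}

console.log(checkStartTime({}));                          // structure: data.duration is missing
console.log(checkStartTime({duration: {startTime: -1}})); // validation: data.duration.startTime must be non-negative
console.log(checkStartTime({duration: {startTime: 5}}));  // ok
```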

trail geyser
#

I've got a general thought. When I was poking this all in my dev env, I kinda felt that the schema ended up duplicating the template data structure in template.json, feels like it's close to violating DRY (TS makes it worse due to wanting interfaces).

Do you think providing a default in the schema and removing that structure from template.json is something that could be done? With an appropriate transition period, of course.

surreal harbor
#

I'm not sure how to reconcile those, but what I think might be possible is to provide a utility function that would automatically generate the template.json file from a DocumentData class definition

#

so your DocumentData could be the "source of truth" for what the schema is, and then you could generate the necessary JSON file from there
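
One way such a utility might look, as a rough sketch: walk the schema and collect each field's initial value into a plain object that can be written out as JSON. The field shape used here (`initial` for leaves, `fields` for nested objects) and the `toTemplate` name are assumptions, and the output is only a template-style object, not the exact template.json layout.

```javascript
// Hypothetical sketch: derive a template-style object from a schema definition.
function toTemplate(schema) {
  const template = {};
  for (const [key, field] of Object.entries(schema)) {
    // Recurse into nested schemas; otherwise take the field's initial value.
    template[key] = field.fields ? toTemplate(field.fields) : field.initial ?? null;
  }
  return template;
}

const actorSchema = {
  attributes: {fields: {strength: {initial: 10}, notes: {initial: ""}}},
  level: {initial: 1}
};

console.log(JSON.stringify(toTemplate(actorSchema)));
// {"attributes":{"strength":10,"notes":""},"level":1}
```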

trail geyser
#

Yes, I really don't want some random dev (including me) to poke server side code.

surreal harbor
#

it's not elegant though, so maybe there is a better solution

trail geyser
#

I was largely just thinking to kill the server being responsible for initializing a new document and pushing that client side in schema.

#

At least for system data.

#

Ironically, @unreal pendant's wanting of a pure data schema would allow the server to validate system schema.

surreal harbor
#

yep, it would, at the expense of other features

unreal pendant
#

Why is that ironic? That's literally how and why the rest of the world uses schemas.

surreal harbor
trail geyser
blissful ledge
surreal harbor
#

Thanks for the link @blissful ledge, do you know who the author is in Discord?

blissful ledge
#

Ping @SWW13#0799

#

Hm, that didn’t work 😅

#

@unreal elk

haughty magnet
#

i think migrations are the only reason we don't use it in pf2e

blissful ledge
blissful ledge
surreal harbor
#

So that is not an option

haughty magnet
#

that's arbitrary though. i'm just saying that's the ideal solution for our problem.

elfin maple
trail geyser
#

There are plenty of reasons to deny that.

haughty magnet
#

being intentional doesn't mean it's not arbitrary

trail geyser
#

There are reasons, it's not just a random choice or whim.

haughty magnet
#

arbitrary from a math sense

#

regardless, i'm not arguing that anything should be changed. Just that we're stuck with a bad implementation because of the design decision.

surreal harbor
#

I certainly would say it is not an arbitrary decision, and it’s definitely in the will never happen category

#

There should be other good ways to improve system data migration though

blissful ledge
#

I think there could potentially be an option for allowing server side migrations of system data that does not depend on system code being executed. If there was a way to describe migrations declaratively at the data level, it might not be necessary to actually execute system code. It might limit a bit what kind of migrations can be performed, but it might cover most of the relevant cases.

Regardless, I think this is turning a bit off topic, so unless Atro wants to discuss (server side) migration improvements in more detail here, I would suggest we return to the original topic.

(I guess discussing how the DocumentData changes would interact with the common migration patterns that are currently used is on topic)
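
For a sense of what a declarative migration could look like, here is a sketch in which the steps are plain data and a small generic interpreter applies them. The operation names (`rename`, `setDefault`, `remove`) and the `applyMigration` function are made up for this example.

```javascript
// Hypothetical declarative migration format; the operation names are invented.
const migrationSteps = [
  {op: "rename", from: "hitPoints", to: "hp"},
  {op: "setDefault", key: "level", value: 1},
  {op: "remove", key: "legacyFlag"}
];

// A generic interpreter: it needs no system-specific code to run the steps.
function applyMigration(source, steps) {
  const data = {...source};
  for (const step of steps) {
    switch (step.op) {
      case "rename":
        if (step.from in data) {
          data[step.to] = data[step.from];
          delete data[step.from];
        }
        break;
      case "setDefault":
        if (!(step.key in data)) data[step.key] = step.value;
        break;
      case "remove":
        delete data[step.key];
        break;
      default:
        throw new Error(`Unknown migration op: ${step.op}`);
    }
  }
  return data;
}

console.log(applyMigration({hitPoints: 12, legacyFlag: true}, migrationSteps));
// { hp: 12, level: 1 }
```

Because the steps are serializable, they could in principle be shipped as JSON, at the cost of not covering migrations that need arbitrary logic or async lookups.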

stray pasture
#

Out of curiosity, is DocumentData something new to V9? Or is it just something that one indirectly uses (via class extensions)?

haughty magnet
#

i'm just saying, that, due to the migration, this entire conversation is useless because we (pf2e) cannot use it

stray pasture
blissful ledge
stray pasture
#

Ah, so it's more for system developers.

surreal harbor
blissful ledge
haughty magnet
#

we don't use this schema stuff

surreal harbor
#

I don't think the design of DocumentData inherently makes that problem worse - although what stwlam mentioned earlier in this thread about the need for async migrations is something that is not currently supported

haughty magnet
#

i think they're only really async due to the update methods being async

#

and we have a class of migrations that aren't just updating documents

somber ether
#

We sometimes pull the latest copy of a particular item from a compendium and swap out the actor's older version with it. That's always going to be async

stray pasture
surreal harbor
#

that is an option that can be considered (and has been in the past) - but I believe it's too limiting for our needs

silk oriole
#

Am I right in understanding that the intention is to give more flexibility for use cases like mine with the new DocumentData tools? In that I've got the basics in my system.json (defining that each actor has skills, stunts, etc.) but the specifics are implemented in code by my setup and character editing tools. It sounds like in future I'd be able to define a specific DocumentData model which makes the schema for the objects inside stunts, skills etc. if I wanted to with a lot more depth than can currently be supported by the top-level system.json schema?

surreal harbor
silk oriole
#

Sounds like something I'll want to consider upgrading to, then, as it would centralise a lot of the validation I currently handle myself. I'm glad it won't be an enforced migration and that I can keep doing it the way I am now until I'm ready to make the leap though.

blissful ledge
#

(not extending DocumentData, though, I think)

unreal elk
#

> it has some limitations [...] structures like arrays
One limitation I'm currently facing: I'm working on macro support and would like to have a list of macro IDs that are active.

> Thoughts, feedback, questions, concerns?

  1. I like the idea and already have a use in the near future for it.
  2. It would be really nice to deprecate the old format (with a converter function for the next 1 or 2 versions) so as not to break current modules; upgrading modules to new Foundry versions has been a great pita in the past.
  3. (more of a nitpick) the current set of fields don't really match their names, e.g. NONNEGATIVE_NUMBER_FIELD (is required but has a default set - not really an optional number). There is REQUIRED_POSITIVE_NUMBER without a default, but REQUIRED_NUMBER has a default.
  4. I'd really love to see support for custom documents on the server side (maybe with client-side only validation as a first step) because manually hacking in new documents requires wonky workarounds (see https://gitlab.com/SWW13/foundryvtt-stairways/-/blob/7800039e35698e752f5fd0f8e5808f2876401745/src/dataQuirks.js) and as far as I looked into the server code there was no reason why it shouldn't be possible with some modifications to the validation - much like the "custom" documents for systems.
surreal harbor
#

Thanks for the thoughts @unreal elk. A few responses:

> It would be really nice to deprecate the old format (with a converter function for the next 1 or 2 versions) so as not to break current modules; upgrading modules to new Foundry versions has been a great pita in the past.
I would try to provide backwards compatibility for anything that was previously a fields.* const.

> (more of a nitpick) the current set of fields don't really match their names, e.g. NONNEGATIVE_NUMBER_FIELD (is required but has a default set - not really an optional number). There is REQUIRED_POSITIVE_NUMBER without a default, but REQUIRED_NUMBER has a default.
The plan would be to retire these completely in favor of field instances where you have more control.

> I'd really love to see support for custom documents on the server side (maybe with client-side only validation as a first step) because manually hacking in new documents requires wonky workarounds
A separate issue, but there is a proposal for a "Basic Document" which gives a flexible template to use for somewhat arbitrary document types.

green pollen
green pollen
#

Also I'm not sure what we currently have to do to prevent updates is ideal. Right now we either have to use hooks or override _preCreate and _preUpdate and throw in them.

surreal harbor
green pollen
#

I'm not saying anything against throwing errors. It just feels a bit unusual compared to the rest of foundry, that I've seen so far. But that might just be subjective.

frank bear
# surreal harbor Thanks for the thoughts @unreal elk. A few responses: > It would be ...

Based on my experience with Storyteller: when you create your own document types, you need to implement the database yourself; in my case, it's a trivial JSON in the settings. The registration of the type itself also took me quite a long time because of the lack of guides about it.

But surprisingly, it turned out to be quite realistic to use.

I'm not sure that describing the schema in the database itself is really necessary, since reading/writing the file, at the data volumes a game involves, should not be too slow.

surreal harbor
frank bear
#

It will be pretty useful

inner lintel
#

I don't know if this is something worth considering, but it could be useful to have some sort of versioned schema

#

So you could have a v1 schema, and then when you do a major change you create a new v2 schema. You can tell the system to load the data using the v1 schema and then write a migration that converts between them

#

Documents could keep track of which version they are on to make tracking migrations easier rather than having to guess based on system version checks or data inspection

#

This would probably be easiest for a setup where the schema was stored in static json
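
A sketch of the versioned-schema idea: each record carries a version number, and upgrades are chained one version at a time. The `schemaVersion` key, the `upgrades` table, and the sample data shapes are all invented for illustration.

```javascript
// Hypothetical sketch of versioned data; `schemaVersion` is an invented key.
const upgrades = {
  // v1 -> v2: rename hitPoints to hp
  1: ({hitPoints, ...rest}) => ({...rest, hp: hitPoints, schemaVersion: 2}),
  // v2 -> v3: wrap hp in an object with a max
  2: ({hp, ...rest}) => ({...rest, hp: {value: hp, max: hp}, schemaVersion: 3})
};
const CURRENT_VERSION = 3;

function upgradeToCurrent(source) {
  // Data without a version marker is treated as v1.
  let data = {schemaVersion: 1, ...source};
  while (data.schemaVersion < CURRENT_VERSION) {
    const upgrade = upgrades[data.schemaVersion];
    if (!upgrade) throw new Error(`No upgrade from version ${data.schemaVersion}`);
    data = upgrade(data);
  }
  return data;
}

console.log(JSON.stringify(upgradeToCurrent({hitPoints: 12})));
// {"schemaVersion":3,"hp":{"value":12,"max":12}}
```

Tracking the version on the data itself means a migration never has to guess the shape of its input from system version checks or data inspection.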

surreal harbor
#

@inner lintel it's an idea that has value on its own, to be sure, but I am pretty hesitant about that given the complexity that might be required to pull it off in a way that plays nicely with every other component of the system.

#

Probably most useful to game systems where the data model changes more frequently than for core data types

stray pasture
surreal harbor