Site architecture question: content collection with dynamic data | Astro Lounge | Page 1

ripe vale Feb 6, 2024, 9:16 PM

#

I am rebuilding a marketing, data discovery, documentation, and blog type of website from Jekyll to Astro. Currently, I am rebuilding our data pages (https://gis.utah.gov/data/#data-categories).

The data pages lead to an index page and then to the child data page. For instance clicking the transportation category takes you to https://gis.utah.gov/data/transportation/ and then clicking roads and highways takes you to those data sets https://gis.utah.gov/data/transportation/roads-system/.

I have started to organize this as a content collection. All of the jekyll data is currently static so it's basically one to one. It's nice because I can generate the index page from the content collection filtered by a category prop. I can even show the same item in multiple indexes with a secondary category without having to manage a static index page and keep it synchronized.
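
The category index described above could be sketched roughly like this. It is a minimal sketch, not the code in the repo: the `DataEntry` shape, the `secondaryCategory` field name, and the `entriesForCategory` helper are all assumptions made for illustration.

```typescript
// Hypothetical shape of one data page entry; the field names are assumptions.
interface DataEntry {
  slug: string;
  title: string;
  category: string;
  secondaryCategory?: string;
}

// Return every entry whose primary OR secondary category matches, so the
// same item can appear in more than one index without a hand-kept list.
function entriesForCategory(entries: DataEntry[], category: string): DataEntry[] {
  const wanted = category.toLowerCase();
  return entries.filter(
    (entry) =>
      entry.category.toLowerCase() === wanted ||
      entry.secondaryCategory?.toLowerCase() === wanted,
  );
}

// In an Astro page the entries would come from the collection, roughly:
//   const entries = (await getCollection('data')).map((e) => ({ slug: e.slug, ...e.data }));
//   const items = entriesForCategory(entries, Astro.props.category);
// where the collection name 'data' and the category prop are assumptions.
```

Keeping the filter as a plain function means it does not care whether the entries came from a content collection, a glob, or a database query.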

One of the content items I just made points to a GitHub release. I used fetch to get the latest release and pull out the published date and asset links for the data page, so nobody has to remember to update them when that product ships a new version. It worked great, but it doesn't seem to fit the content collection model properly.

In the future we would like to build a system to keep the parts of these data pages in a database that we can use to keep everything synchronized across the platforms we duplicate this data to. So the content collection data will likely start to migrate from static content to more external requests for the metadata to create the pages during build time.

I'm being told in the chat that I am going against the grain and this probably isn't the correct design choice. I'm early in the development and do not want to make any decisions now that will end up being a major mistake. How would you architect this solution?

#

You can see what i've done in this preview

top level categories
https://gis-utah.netlify.app/products/sgid/categories/
utility category index
https://gis-utah.netlify.app/products/sgid/utilities/
item in the utility category
https://gis-utah.netlify.app/products/sgid/utilities/broadband/

The code is open source in https://github.com/agrc/gis.utah.gov/tree/main

bronze mango Feb 6, 2024, 9:21 PM

#

Thanks for spinning up the thread! It’s late here, but promise to check in tomorrow when I’m a little more awake 😄

ripe vale Feb 6, 2024, 9:32 PM

#

oh sorry for the ping

#

it's 2:32pm here :/

static quest Feb 6, 2024, 9:32 PM

#

Thanks for opening a thread Steve! I definitely see what you're running into here, there's currently not a great pathway for Content Collections to grow into a more dynamic use case.

static quest Feb 6, 2024, 9:33 PM

#

ripe vale oh sorry for the ping

Ha that's alright! We've got people from all over the world! I think everyone on core is used to notifications at all hours 😅

ripe vale Feb 6, 2024, 9:33 PM

#

it'd be nice if they showed tz info on your profile like slack

static quest Feb 6, 2024, 9:34 PM

#

For sure

ripe vale Feb 6, 2024, 9:34 PM

#

so do you think i should bail on the content collection?

#

or if new api's get added to help me, i will be able to migrate without too much trouble?

static quest Feb 6, 2024, 9:35 PM

#

I was just about to say... I think my main suggestion would be to eject from Content Collections here since you're not really gaining much from the schema validation if anything is fetched dynamically

#

You could definitely figure out your own approach to structuring the data, and even use zod to validate it at runtime if you want, and hopefully adopt newer concepts when they're available
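
A minimal sketch of what validating fetched data at runtime might look like. To stay dependency-free it uses a hand-written check instead of zod; with zod the same shape would be declared as a schema and parsed. The `PageMeta` fields and the `parsePageMeta` name are assumptions for illustration.

```typescript
// Hypothetical metadata shape for a data page; the field names are assumptions.
interface PageMeta {
  title: string;
  category: string;
  stewards: string[];
  description: string;
}

// Check an unknown value (for example a parsed JSON response) and either
// return it as PageMeta or throw with every problem listed.
function parsePageMeta(input: unknown): PageMeta {
  if (typeof input !== 'object' || input === null) {
    throw new Error('invalid page metadata: expected an object');
  }
  const record = input as Record<string, unknown>;
  const problems: string[] = [];

  for (const key of ['title', 'category', 'description']) {
    if (typeof record[key] !== 'string' || record[key] === '') {
      problems.push(`${key} must be a non-empty string`);
    }
  }

  const stewards = record.stewards;
  if (!Array.isArray(stewards) || !stewards.every((s) => typeof s === 'string')) {
    problems.push('stewards must be an array of strings');
  }

  if (problems.length > 0) {
    throw new Error(`invalid page metadata: ${problems.join('; ')}`);
  }

  return {
    title: record.title as string,
    category: record.category as string,
    stewards: stewards as string[],
    description: record.description as string,
  };
}
```

Failing loudly during the build is the point: a missing title from a remote source stops the build instead of producing a blank page.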

#

What format is your data in mostly? Do you gain much from Astro working with markdown files or is the data just JSON?

#

I guess I can share the gist of the new concepts we're considering, which is that we might offer the ability to seed a local sqlite database as part of the Astro build process. You'd be able to pull from any kind of datasource (local filesystem or remote), you'd have a normalized query API, and you'd have much more control than Content Collections currently offers. Does that sound appealing at all?

ripe vale Feb 6, 2024, 10:07 PM

#

static quest What format is your data in mostly? Do you gain much from Astro working with mar...

the data pages are html pages. only our blog is markdown really

#

that does sound appealing yes

static quest Feb 6, 2024, 10:07 PM

#

Cool, that's helpful context!

ripe vale Feb 6, 2024, 10:07 PM

#

i'm going to build our cms system and it will probably be a pgsql db in the google cloud

#

so a process to sync to a local db would be fine, or the ability to use an external db would remove a step for me

#

have different db connectors?

#

what sort of structure would you store data in the sqlite db?

#

and is this a secret i shouldn't mention if someone asked?

#

if i eject from content collections can you recommend a way to manage the data category index pages?

static quest Feb 6, 2024, 10:14 PM

#

ripe vale and is this a secret i shouldn't mention if someone asked?

Ha we haven't really shared any details publicly yet, but we've hinted at it a few times! Probably don't go around telling everyone, but it's not the end of the world if it comes up.

#

Structure would be relational db tables but wrapped with a similar DX to content collection schemas

static quest Feb 6, 2024, 10:17 PM

#

ripe vale if i eject from content collections can you recommend a way to manage the data c...

Are you asking about loading the data or more about rendering it? Presumably you'd write some of your own utils to structure the data how you need / call out to your backend. Rendering Markdown at runtime would be a little trickier but definitely not impossible

ripe vale Feb 6, 2024, 11:40 PM

#

i mean if i do it as is with static files and the file system

#

i can glob the files if i have them in folders for the index i guess

ripe vale Feb 6, 2024, 11:57 PM

#

though i won't have access to the data within the file

#

this sucks...

#

can i access astro exports from a glob file or something weird like that?

static quest Feb 7, 2024, 12:00 AM

#

Yeah import.meta.glob is a Vite feature that loads the modules for you. It's what Content Collections uses under the hood.

#

I guess I'm a bit confused why you wouldn't be able to access data within the file? All that the Content Collections API really does is import the files and validate the frontmatter schema. The files themselves still run through Astro's build pipeline and plugin system. You can access everything in the file manually through the import.meta.glob feature

#

It's definitely not as nice as Content Collections coordinating it all for you, but based on my understanding of your plans, it sounds like you want full control over how everything is loaded

ripe vale Feb 7, 2024, 12:11 AM

#

not really, i just want to pluck some metadata to show a title and description. i forget what glob gives you access to from the frontmatter

#

but i'll c.log it and see

#

seems with glob i only get the filename and url?

static quest Feb 7, 2024, 12:13 AM

#

You should get everything... what type of file is it?

ripe vale Feb 7, 2024, 12:13 AM

#

.astro

#

```
[
  {
    default: [Function: api-client] {
      isAstroComponentFactory: true,
      moduleId: '/dev/gis.utah.gov/src/pages/products/sgid/address/api-client.astro',
      propagation: undefined
    },
    file: [Getter],
    url: [Getter],
    [Symbol(Symbol.toStringTag)]: 'Module'
  }
]
```

#

const dataPages = await Astro.glob('./*.astro');

static quest Feb 7, 2024, 12:14 AM

#

You can access any other exports in the file

ripe vale Feb 7, 2024, 12:14 AM

#

ok

static quest Feb 7, 2024, 12:14 AM

#

So I'd just export the metadata you need
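
A rough sketch of that idea, assuming each page adds a named export such as `export const meta = { title, description }` to its frontmatter. The `meta` name, the `PageModule` and `IndexItem` types, and the `toIndexItems` helper are hypothetical.

```typescript
// What each globbed page module is assumed to expose: the `url` getter
// plus a hypothetical named export called `meta`.
interface PageModule {
  url?: string;
  meta?: { title: string; description: string };
}

interface IndexItem {
  url: string;
  title: string;
  description: string;
}

// Pluck title and description from every module that exports `meta`,
// skipping pages that have no route or no metadata export.
function toIndexItems(modules: PageModule[]): IndexItem[] {
  const items: IndexItem[] = [];
  for (const mod of modules) {
    if (!mod.url || !mod.meta) continue;
    items.push({ url: mod.url, title: mod.meta.title, description: mod.meta.description });
  }
  return items.sort((a, b) => a.title.localeCompare(b.title));
}

// In the index page frontmatter this would be used roughly as:
//   const dataPages = await Astro.glob('./*.astro');
//   const items = toIndexItems(dataPages);
```

Only the small `meta` object needs to be exported; anything fetched or formatted for the page body can stay local to the page.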

ripe vale Feb 7, 2024, 12:15 AM

#

and when i export a prop the scope all changes

#

const dateFormatter = Intl.DateTimeFormat('en-US', { dateStyle: 'full' });

const res = await fetch('https://api.github.com/repos/agrc/api-client/releases?per_page=1');
const data = await res.json();

const page = {
  title: 'UGRC API Client',
  category: 'Address',
  publishedDate: dateFormatter.format(new Date(data[0].published_at)),
  stewards: ['UGRC'],
  type: 'Desktop application',
  lastUpdate: 'dynamic',
  description: `
    The UGRC API Client is a cross-platform, stand-alone desktop geocoding tool designed
    to carefully guide you step-by-step through your address geocoding workflows.
  `,
  links: [
    {
      title: 'UGRC API Client for Windows',
      url: data[0].assets.find((asset) => asset.name.includes('win32-setup.exe')).browser_download_url,
    },
    {
      title: 'UGRC API Client for MacOS',
      url: data[0].assets.find((asset) => asset.name.includes('x64.dmg')).browser_download_url,
    },
  ],
};

#

so exporting page would be a bit difficult here since the formatter and fetch stuff aren't exported

#

but all i really want are the title and description

#

so i can pull those out

static quest Feb 7, 2024, 12:20 AM

#

I feel like it's probably easier to wrap this in a utility function rather than doing it all inline in the Astro page?
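
One way such a utility could look, as a sketch under assumptions rather than a definitive implementation. The function names (`describeRelease`, `getLatestRelease`), the types, and the idea of passing the wanted asset suffixes in are all hypothetical; the GitHub URL pattern and the `published_at`, `assets`, `name`, and `browser_download_url` fields are the ones used in the snippet above.

```typescript
interface ReleaseAsset {
  name: string;
  browser_download_url: string;
}

interface Release {
  published_at: string;
  assets: ReleaseAsset[];
}

interface DownloadLink {
  title: string;
  url: string;
}

// Long en-US date; the time zone is pinned to UTC so builds on different
// machines format the same release the same way.
const releaseDateFormatter = new Intl.DateTimeFormat('en-US', {
  weekday: 'long',
  year: 'numeric',
  month: 'long',
  day: 'numeric',
  timeZone: 'UTC',
});

// Pure step: turn one release into the fields the page needs. Assets that
// are not found are skipped instead of throwing on `undefined`.
function describeRelease(
  release: Release,
  wanted: { title: string; suffix: string }[],
): { publishedDate: string; links: DownloadLink[] } {
  const links: DownloadLink[] = [];
  for (const { title, suffix } of wanted) {
    const asset = release.assets.find((a) => a.name.includes(suffix));
    if (asset) links.push({ title, url: asset.browser_download_url });
  }
  return {
    publishedDate: releaseDateFormatter.format(new Date(release.published_at)),
    links,
  };
}

// Network step: fetch the newest release for a repo such as 'agrc/api-client'.
async function getLatestRelease(repo: string): Promise<Release> {
  const res = await fetch(`https://api.github.com/repos/${repo}/releases?per_page=1`);
  if (!res.ok) throw new Error(`GitHub responded with ${res.status} for ${repo}`);
  const releases = (await res.json()) as Release[];
  if (releases.length === 0) throw new Error(`no releases found for ${repo}`);
  return releases[0];
}
```

A page that has a GitHub release would call `describeRelease(await getLatestRelease('agrc/api-client'), [...])` in its frontmatter, and pages without one simply would not import the helper.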

ripe vale Feb 7, 2024, 12:24 AM

#

can you show me a sample?

#

it's specific to each data type. not every data type has a github release

#

for instance, this is the frontmatter that i will convert to an astro file for another dataset,

title: Utah Broadband Service Areas
category: Utilities
stewards:
  - UGRC
  - "The Governor's Office Of Economic Development (GOED)"
type: Polygon GIS data
application: 'https://broadband.ugrc.utah.gov/'
lastUpdate: Fall 2023
description: Broadband coverage polygons over any transmission technology, including fixed and mobile services.
hub:
  name: Utah Broadband Service
  featureServiceId: BroadBandService
  itemId: 2b479a30791c445eb135e05acf77dbcc
  layerId: 0
  openSgid: utilities.broadband_service

static quest Feb 7, 2024, 12:31 AM

#

Can you share an example of like two full files so I can see what you're working with? One that is fully static and one that has some dynamic stuff as well?

ripe vale Feb 7, 2024, 12:32 AM

#

yah one sec i'll commit something here in a minute

#

https://github.com/agrc/gis.utah.gov/tree/main/src/pages/products/sgid/address

#

there are two sample data sets

#

eventually we would like to pull some of this information from a CMS but I haven't built it yet

#

here's the globbed index page https://github.com/agrc/gis.utah.gov/blob/main/src/pages/products/sgid/address.astro

#

using the exports of those pages

#

i'll have to duplicate this for 27 categories and ~200 pages

#

right now i could query a google sheet for the update history

#

it could be nice to pull markdown from github readme's as well possibly for some of the apps

static quest Feb 7, 2024, 12:58 AM

#

It would be reasonable to put the data next to the page and then import it in the page itself, I think.

So if you had src/pages/products/sgid/address/address-points.astro, you could also have src/pages/products/sgid/address/_address-points.yml. In the page, you'd just do import data from './_address-points.yml' and on the index page you could use import.meta.glob('./**/*.yml')

#

Assuming you're using something like https://www.npmjs.com/package/@rollup/plugin-yaml

ripe vale Feb 7, 2024, 12:58 AM

#

i have not added that plugin

#

i don't have an issue with the yaml being in the file

#

what's the benefit

#

deduplication?

static quest Feb 7, 2024, 1:00 AM

#

I guess being able to use import.meta.glob? They could also just be .js or .ts files that export a function that fetches your data

#

And then you can call that function on the index page or on the page that renders it
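
A minimal sketch of that last suggestion, assuming each page gets a sibling module (for example a hypothetical `_address-points.ts`) that exports an async `getData()` function. The `PageData` and `DataModule` types and the `loadIndex` helper are made up for illustration.

```typescript
// What every page's data is assumed to provide at minimum.
interface PageData {
  title: string;
  description: string;
}

// Each data module exports one function; it may return static values or
// fetch from GitHub, a sheet, or a future CMS.
type DataModule = { getData: () => Promise<PageData> };

// Index-page side: call every module's getData() and keep the module path
// next to the result so a link can be derived from it.
async function loadIndex(
  modules: Record<string, DataModule>,
): Promise<Array<PageData & { path: string }>> {
  const paths = Object.keys(modules).sort();
  return Promise.all(
    paths.map(async (path) => ({ path, ...(await modules[path].getData()) })),
  );
}

// In an Astro index page the record would come from Vite's glob import, roughly:
//   const modules = import.meta.glob<DataModule>('./**/_*.ts', { eager: true });
//   const items = await loadIndex(modules);
// The detail page would import its own module and call getData() directly.
```

Because the index and the detail page call the same function, a dataset can move from static values to a remote source without either page changing.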
