#modules and deps thread


rocky hamlet
#

Yeah, this feels like a partly philosophical question. Modules are supposed to be modular, so they need to be standalone. But I can see my little scripts thing having 10, 30, 50 different modules that do things like:

  • Interact with S3
  • Interact with an RDS instance
  • Interact with some external API (Datadog in this specific spike I'm doing)

So now I'm coming into a world where my repo might have 50+ go.mod/pyproject.toml files. What do?

silk nacelle
past nymph
#

Catching up...

#

Just so I'm clear, this is about how to organize a "platform repo" made of many modules possibly calling each other, and not about "application repo" and how to organize modules co-existing with application code. Right?

rocky hamlet
#

For the purposes of this conversation, let's consider it the former. I'm thinking of the platform repo architecture, where most of the core functions and modules reside in a single 'platform' repository, and downstream repos use the module(s) in the platform repository as a dependency and each have their own CI module.

past nymph
#

So then the problem is specifically: lots of modules each with their own go.mod and pyproject.toml.

Isn't that a common and accepted pattern with or without dagger?

#

Also orthogonal to local dependencies within the repo, since Dagger modules don't use go.mod or pyproject.toml to express dependencies on other Dagger modules
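
To make the Dagger layer visible separately from the language layer, here is a minimal sketch that lists what each module's dagger.json declares. It assumes the config has a top-level `dependencies` array whose entries carry a `source` key; check that against the dagger.json files your Dagger version generates. The function name `dagger_dependencies` is made up for illustration.

```python
import json
from pathlib import Path


def dagger_dependencies(repo_root):
    """Map each dagger.json (path relative to repo_root) to its declared dependency sources."""
    root = Path(repo_root)
    result = {}
    for config in sorted(root.rglob("dagger.json")):
        data = json.loads(config.read_text())
        # Assumption: "dependencies" is a list of objects with a "source" key.
        # Plain-string entries are accepted too, in case the format differs.
        sources = [
            dep.get("source", "") if isinstance(dep, dict) else str(dep)
            for dep in data.get("dependencies", [])
        ]
        result[config.relative_to(root).as_posix()] = sources
    return result
```

Calling `dagger_dependencies(".")` from the repo root returns one entry per dagger.json, with an empty list for modules that declare no Dagger dependencies.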

rocky hamlet
#

Maybe then this is just inherent to the way that modules work. In order to have a standalone module that you can just run, you have to have your own dependency chain for that module, and that necessitates having a complete Node, Go, or Python package for each module. Maybe the question I'm asking is "how to best manage the supply chain of several dozen modules", which is perhaps an unsolved problem as of yet? Maybe you don't want to share the dep chain between modules, and bulk find-and-replace is the best thing to do when trying to, say, bump the Go version that a bunch of different modules use.

past nymph
#

I think the first step is to distinguish the 2 layers of the supply chain in question:

  1. The language-specific dependencies of each module (including language version itself)
  2. The Dagger dependencies

Those are managed differently, so depending on which part you're curious about, the answer is probably different

#

That's why I was asking specific follow-up questions

#

It seems like you're focusing on the language-specific layer. Things like bulk-updating the Go version for all Go modules in the repo, for example.
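
For reference, the bulk update can be as small as the sketch below. It is only an illustration of the find-and-replace approach, not anything Dagger provides: `bump_go_version` is a hypothetical helper that rewrites the `go` directive in every go.mod under a directory and leaves everything else, including any `toolchain` line, untouched.

```python
import re
from pathlib import Path

# Matches a go directive line such as "go 1.21" or "go 1.21.3".
# Lines like "toolchain go1.21.0" do not match because they do not start with "go ".
GO_DIRECTIVE = re.compile(r"^go[ \t]+\d+(?:\.\d+)*[ \t]*$", re.MULTILINE)


def bump_go_version(repo_root, new_version):
    """Rewrite the go directive in every go.mod under repo_root; return the changed paths."""
    changed = []
    for gomod in sorted(Path(repo_root).rglob("go.mod")):
        text = gomod.read_text()
        updated = GO_DIRECTIVE.sub(f"go {new_version}", text, count=1)
        if updated != text:
            # Only touch files whose directive actually changed.
            gomod.write_text(updated)
            changed.append(gomod)
    return changed
```

Something like `bump_go_version(".", "1.22")` run from the repo root would then update every module in one pass. The script edits files in place, so it is worth running on a clean working tree and reviewing the diff.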

#

We've actually planned support for sharing the same go.mod across multiple modules in the same repo. The primitives are available for SDKs to do that. I'm not sure if SDKs have that implemented already. But if not, it's definitely coming

#

I was curious how badly you would want that, and why

rocky hamlet
#

Got it. Ok cool. Yeah let's talk in some actual real world examples here so we are all 100% on the same page. Here's the tree of my repo rn:

```
├── README.md
├── ci
│   └── src
├── dagger.json
├── platform
│   └── src
│       ├── base
│       │   ├── dagger
│       │   └── dagger.json
│       ├── main.py
│       ├── pg
│       │   ├── dagger
│       │   └── dagger.json
│       ├── pyproject.toml
│       ├── s3
│       │   ├── dagger
│       │   └── dagger.json
│       └── vault
│           ├── dagger
│           └── dagger.json
├── pyproject.toml
├── requirements-dev.lock
└── requirements.lock
```

I have a top level dagger.json, with a mod called "platform". It's in Python, as you can see from the main.py

In my platform module, I have some submodules:

  • base: Just a standard base image. Go. No input, outputs a base container image
  • pg: Create an authed pg container to execute psql queries. Python
  • s3: S3-related operations. Go. Takes in an authed AWS container and a directory or file and does some S3-related operations
  • vault: Container to handle vault secrets. Go. Takes in a container and installs vault

A simple cron-like workflow that I'm approaching as part of this spike:

  • Install base image
  • Install vault
  • Auth with vault to get secrets
  • Auth to postgres
  • Perform a sql query, output a csv
  • Log into aws
  • Take the output csv from the postgres container and upload the csv to an s3 bucket


If `base`, `vault`, and `s3` all share some Go library dependency (let's pretend it's some security library like `legion`), is it a good or bad idea to try to manage these together? Or should `base`, `vault`, and `s3` each manage their own version of `legion` as a dep with its own lifecycle? I think it's the latter. But there does seem to be a significant advantage to, for instance, having all of your modules use the same base images for caching purposes. Maybe that can be achieved through a "layering" pattern though.
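
Even if each module keeps its own lifecycle, it helps to see how far the versions have drifted. Here is a rough sketch of that check, assuming each Go module has its own go.mod somewhere under the repo and that `legion` lives at a made-up path like `example.com/legion`. The helper name `dependency_versions` is hypothetical, and the parsing is a naive line match, so an entry inside a `replace ( ... )` block could be picked up as well.

```python
import re
from collections import defaultdict
from pathlib import Path


def dependency_versions(repo_root, module_path):
    """Map each required version of module_path to the go.mod files that pin it."""
    # A require entry reads "<module path> <version>", either inside a
    # require ( ... ) block or after the "require" keyword on one line.
    pattern = re.compile(
        r"^[ \t]*(?:require[ \t]+)?" + re.escape(module_path) + r"[ \t]+(v\S+)",
        re.MULTILINE,
    )
    root = Path(repo_root)
    versions = defaultdict(list)
    for gomod in sorted(root.rglob("go.mod")):
        match = pattern.search(gomod.read_text())
        if match:
            versions[match.group(1)].append(gomod.relative_to(root).as_posix())
    return dict(versions)
```

If `dependency_versions(".", "example.com/legion")` comes back with more than one key, the modules have drifted, and that is the point where deciding between a shared version and independent lifecycles actually matters.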