#Monorepo Advice


lament widget
#

Hey there!
I’m in the process of migrating one of our monorepos to validate our Dagger PoC. We currently use moon, mostly for task dependencies and caching. What’s the recommended Dagger way, if there is one already?

As a simplified context, let’s say I have 2 libs (l1 & l2) and an app (a1). a1 depends on both l1 & l2, l2 depends on l1, l1 is standalone. When using moon, we can configure our projects so that it can assess what to run depending on what changed in the context of the project graph. If l1 changed, everything’s gonna be rebuilt. Similarly, when working on a project, running a project local task will call the full upward chain, leveraging cache if any exists.

I plan on creating one Dagger module per project, plus a root Dagger module, and was wondering 1) whether it’s the right plan and 2) whether there’s a way to leverage Dagger’s internal DAG to implement the dependency graph, or if I have to implement it from scratch.
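
For reference, the project graph from the example above can be sketched in a few lines of plain Python. This is only an illustration of the behaviour described (what gets rebuilt when something changes, and what the upward chain of a task looks like); it is not moon's or Dagger's API, and the `DEPS`, `affected` and `build_order` names are made up for the sketch.

```python
# Hypothetical project graph from the example: project -> direct dependencies.
DEPS = {
    "l1": [],
    "l2": ["l1"],
    "a1": ["l1", "l2"],
}


def affected(changed, deps):
    """Return the changed projects plus everything depending on them, transitively."""
    result = set(changed)
    grew = True
    while grew:
        grew = False
        for project, direct in deps.items():
            if project not in result and any(d in result for d in direct):
                result.add(project)
                grew = True
    return result


def build_order(target, deps):
    """Return the target's upward chain, dependencies first (post-order DFS)."""
    order = []

    def visit(project):
        if project in order:
            return
        for dep in deps[project]:
            visit(dep)
        order.append(project)

    visit(target)
    return order
```

With this graph, a change in l1 marks all three projects as affected, while a change in l2 only marks l2 and a1.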

cunning parrot
#

Hey @lament widget! Are you planning to reimplement some moon logic in Dagger, or are you trying to use moon inside of Dagger?

Both are valid, the latter is probably simpler and should work more or less the same depending on where moon stores cache.

At a high level, Dagger’s layer caching is keyed on inputs. So if an input changes anywhere in the DAG, all the steps downstream of it are recomputed.

You can use this to your advantage as you’re designing your pipelines. You shouldn’t have to think about it too much, the chain of inputs will determine the correct shape of the dag.
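
As a rough illustration of "caching based on inputs", here is a toy model in plain Python. It is not how the Dagger engine is actually implemented; it only assumes that each step's cache key is derived from the step itself plus the keys of its inputs, using the l1/l2/a1 example from earlier in the thread. The `key` and `pipeline_keys` helpers are hypothetical.

```python
import hashlib


def key(step, *inputs):
    """Toy cache key: a hash of the step name and the keys/contents of its inputs."""
    h = hashlib.sha256()
    h.update(step.encode())
    for value in inputs:
        h.update(b"\0")
        h.update(value.encode())
    return h.hexdigest()


def pipeline_keys(l1_src, l2_src, a1_src):
    """Compute cache keys for a pipeline where l2 uses l1, and a1 uses l1 and l2."""
    l1 = key("build-l1", l1_src)
    l2 = key("build-l2", l2_src, l1)
    a1 = key("build-a1", a1_src, l1, l2)
    return {"l1": l1, "l2": l2, "a1": a1}
```

Changing only the l2 sources leaves the l1 key untouched (cache hit) but changes the keys for l2 and a1 (recomputed), without anyone declaring that dependency graph separately.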

lament widget
#

“… reimplement some moon logic […] use moon inside of dagger”
The former. I agree with you that the latter is likely simpler, but the goal of the PoC is to go hardcore mode, so I can have a clear vision of what it would look like. Keeping both moon and Dagger would add maintenance burden and bloat the repo, for virtually no added value.

“The chain of inputs will determine the correct shape of the dag.”
Gotcha (I think 😅). So if I understand correctly, I basically just write my dependencies implicitly by calling other (potentially external) Dagger functions in a function? But that would mean I have to give each function basically the git root directory as source, right? Otherwise I don’t see how calling a1.build would be able to trigger lib1.check and lib2.check, for example. I think that’d work if I do pass the root directory, but I foresee a serious headache filtering that to avoid destroying the cache efficiency.

wintry laurel
#

My understanding is, if you recreate what moon does for you, you'd also be creating your own change graph. Or maybe you can somehow call moon's change (action?) graph? At any rate, it would be a real pain to do, even if you can use moon's change graph, and I'd only go for it if you have very specialized work that needs doing. And even then, I believe you'd be better off using moon's facilities.

So, I'd personally suggest using moon's given tools to do the tasks it can do. You'd only need Dagger to pull in your repo into the container and run those tasks from moon's perspective. It's why you have a monorepo to begin with, right? 🙂 The only thing you'd need is an external moon cache for this to work efficiently, I believe.

What you can then do is, after running the tasks, such as lint, test and build using Dagger, you can then also use Dagger to create other artefacts after the build step, which moon more than likely won't do, like build and publish containers or packages and publish to a package manager, etc.

Hope I could help.

lament widget
#

I understand your recommendation, but this seems hard to maintain in my opinion. If we went for both, we’d have to maintain 2 build systems: the moon tasks and a Dagger wrapper. The very idea of switching to Dagger is to reduce our amount of YAML configuration and leverage its powerful as-code model. If we had to maintain both YAML tasks and a Dagger configuration on top, it would make little sense to migrate at all.
Moon already does everything on the monorepo I’m testing on. Building, testing and publishing. The issue is the amount of non-testable configuration this leaves us with.

I’ll continue my current full dagger proof of concept, as I’m starting to figure out how to design it. I don’t foresee any outcome where moon and dagger live together for us. It’s either we switch or we don’t, but we won’t introduce a layer of complexity if it doesn’t reduce any of our pain points.

cunning parrot
#

Yeah for sure, I think that recommendation makes sense when you have this type of system be a component of your overall build, but it sounds like here moon is doing it all and you're trying to fully migrate to dagger.

That makes sense for sure.

“So if I understand correctly, I basically just write my dependencies implicitly by calling other (potentially external) Dagger functions in a function? But that would mean I have to give each function basically the git root directory as source, right? Otherwise I don’t see how calling a1.build would be able to trigger lib1.check and lib2.check, for example. I think that’d work if I do pass the root directory, but I foresee a serious headache filtering that to avoid destroying the cache efficiency.”

You are thinking about it right and I think Dagger is a bit more intelligent than this, but I'm trying to dig up a good example of this somewhere already.

lament widget
#

Yeah, from what I’m currently seeing, passing the root and post-filtering per project seems to play nicely with the cache. I’ll try to send something as soon as I’m happy with my tinkering, to see if it makes sense.

wintry laurel
tacit lintel
#

You have this context directory feature that can help you automatically get your git repo as an argument without specifying it (it can still be overridden if you want to): https://docs.dagger.io/api/filters/#pre-call-filtering
This also works for your local dependencies as long as they belong to the same git repo, so +defaultPath=/ will always point to its root.

When you pass a directory to a Dagger Function as argument, Dagger uploads everything in that directory tree to the Dagger Engine. For large monorepos or directories containing large-sized files, this can significantly slow down your Dagger Function while filesystem contents are transferred. To mitigate this problem, Dagger lets you apply filter...

lament widget
#

Yeah, that’s already what I’m using at the top level. And then I post-filter for the project functions, to avoid wrong cache invalidation.
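
To show why post-filtering matters for the cache, here is a small plain-Python model. It assumes, as a simplification, that a function's cache key depends on the content of the directory it receives; the file paths and the `subtree` and `digest` helpers are hypothetical and are not Dagger API.

```python
import hashlib


def subtree(files, prefix):
    """Keep only the files under the given project path (the post-filter step)."""
    prefix = prefix.rstrip("/") + "/"
    return {path: content for path, content in files.items() if path.startswith(prefix)}


def digest(files):
    """Toy content digest of a directory: a hash over sorted paths and contents."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(b"\0")
        h.update(files[path].encode())
        h.update(b"\0")
    return h.hexdigest()


# Hypothetical repo layout, before and after an edit that only touches a1.
repo = {"libs/l1/lib.py": "v1", "apps/a1/main.py": "v1"}
edited = dict(repo, **{"apps/a1/main.py": "v2"})
```

Here the digest of the whole root changes after the edit, while the digest of the filtered `libs/l1` subtree stays the same, so a function that only receives the filtered directory would not be invalidated by an unrelated change in a1.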

nimble pelican
#

@lament widget re: your issue of having to pass source directory around. That's usually a symptom of needing multiple dagger modules in the repo, each with their own contextual directory access. In your example, a1, lib1 and lib2 should probably be dagger modules. Then mapping dependencies becomes easy

#

I also agree that you should not combine dagger with another "mega-build tool" like moon or nx, unless you need to for compatibility/legacy reasons. In that case Dagger will adapt to your constraints, but it's true that there's overhead to integrating 2 tools

lament widget
#

Ah yeah, I was making them packages of the main Dagger module, but indeed having them as modules themselves would be more flexible. Sorry, I’m still learning my way in.

nimble pelican
#

no worries, it's useful for us to see what's not obvious

#

for an extreme example of a module using contextual directories, try:

dagger -m github.com/dagger/dagger/version -c version

#

or dagger -m github.com/dagger/dagger/docs -c 'server | up'

lament widget
#

Ok, thanks for the tip. It unlocked pretty much all of my migration. I now have a Dagger module inside each project (lib, service, etc…) of my monorepo. Each module’s constructor defaults its source directory to the project src (relative to the git root context).
So each of them can manage its dependencies by directly calling a function from another module (the only requirement is to dagger install the dependency module).
And I put one last Dagger module at the root of the monorepo to orchestrate high-level functions (repo chores, CI entrypoint, local dev setup, etc…). It also declares the sub-project modules as dependencies.
I still have to clean it up so I don’t end up blind looking at my crap, but this seems super solid at first sight. The cache is also properly leveraged.
Also, this monorepo has multiple languages (Go & Python). I think I’ll give cross-language Dagger modules a try; this would be a solid demo of a multi-team monorepo (each dev could maintain their own module, in their native language).

It’s a tad less performant than moon in terms of runtime (due to containerization, of course), but the massive gain in maintenance and reproducibility is 💯 worth it!

#

Also, this makes it possible to easily extract a project (e.g. a lib which became stable and standalone) from the monorepo, without rewriting the entire build system.
Or even enable sparse checkouts, which we would not even have dared to consider before.
Which is a big ➕