#Dependencies within monorepo
Mono Repo chat some more.
I have a Supabase module and I can consume it from anywhere in my mono repo with the remote syntax:
{
"name": "rawkode.academy",
"sdk": "go",
"dependencies": ["github.com/RawkodeAcademy/RawkodeAcademy/dagger/supabase"]
}
but this means any time I change it I need to push first, then consume
Instead I'd like to use ../../dagger/supabase as the dependency, but without mounting the entire mono repo.
I would like the Dagger CLI to build each module individually with only the required mounts.
I assume this isn't there yet though?
Supabase has its own dagger.json, as does the service module I'm currently within that has the dependency
@golden quartz what's the path of the importing module? To help make my examples realistic
Happy to share my screen, if that helps
I'm not against mounting the whole repo
But BuildKit requires the context to be a zip, right?
Even this one directory is slow, like 40s
OK, so yes, the idea is you have to configure your repo so that the engine loads the entire monorepo, but we plan on adding support for include/exclude so that the parts you don't care about aren't actually transferred. cc @shrewd bison to make sure I'm not saying anything wrong here
The upcoming breaking change I mentioned, is in the config format for doing that
Before the change: your module says "my root is the monorepo"
After the change: the monorepo says: "I'm the root for this module"
Among other things, it makes it harder for an evil module to access parent directories it's not supposed to
Realistically, it feels like this is never going to work too well for a monorepo.
That zip step seems like a blocker
Because we can't use the cache without it?
Yeah that's the idea, you'd need to either only explicitly include the files/dir you want or to exclude ones you don't want (or a combination)
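A rough sketch of what include/exclude could mean in practice. This is an assumption for illustration, not Dagger's config format: the `Filter` type, its prefix-matching rule, and the `projects/...` paths are all made up, since the thread never says where the importing module lives.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a hypothetical include/exclude rule set (not Dagger's real config).
type Filter struct {
	Include []string // directory prefixes to load; empty means "everything"
	Exclude []string // directory prefixes to skip, applied after Include
}

// underDir reports whether p is dir itself or lies somewhere beneath it.
func underDir(p, dir string) bool {
	dir = strings.TrimSuffix(dir, "/")
	return p == dir || strings.HasPrefix(p, dir+"/")
}

// Keep reports whether a repo-relative path would be transferred.
func (f Filter) Keep(p string) bool {
	included := len(f.Include) == 0
	for _, dir := range f.Include {
		if underDir(p, dir) {
			included = true
			break
		}
	}
	if !included {
		return false
	}
	for _, dir := range f.Exclude {
		if underDir(p, dir) {
			return false
		}
	}
	return true
}

func main() {
	f := Filter{
		Include: []string{"dagger/supabase", "projects/rawkode.academy"},
		Exclude: []string{"projects/rawkode.academy/node_modules"},
	}
	for _, p := range []string{
		"dagger/supabase/main.go",
		"projects/rawkode.academy/dagger.json",
		"projects/rawkode.academy/node_modules/x/index.js",
		"projects/other-service/main.go",
	} {
		fmt.Printf("%-50s keep=%v\n", p, f.Keep(p))
	}
}
```

Exclude wins over include here, which is one possible choice; the real semantics may differ.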
Why is that?
One thing to clarify: BuildKit's local imports are cached, so once you load a local dir for the first time, subsequent loads will only transfer data that changed (until the cache is pruned of course, like all other cache).
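To make "only transfer what changed" concrete, here is a toy model. It is not BuildKit's actual algorithm; `Snapshot`, `Take`, and `ToTransfer` are invented names, and comparing content digests is just an assumed stand-in for whatever change detection the engine really uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// Snapshot maps a repo-relative path to a digest of that file's content.
type Snapshot map[string][32]byte

// Take builds a snapshot from in-memory file contents.
func Take(files map[string]string) Snapshot {
	snap := make(Snapshot, len(files))
	for p, content := range files {
		snap[p] = sha256.Sum256([]byte(content))
	}
	return snap
}

// ToTransfer returns the sorted paths in curr that are new or changed since prev.
func ToTransfer(prev, curr Snapshot) []string {
	var out []string
	for p, digest := range curr {
		if old, ok := prev[p]; !ok || old != digest {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	first := Take(map[string]string{"a/main.go": "v1", "b/main.go": "v1"})
	second := Take(map[string]string{"a/main.go": "v2", "b/main.go": "v1"})

	// First load: nothing is cached, so every file counts as new.
	fmt.Println(ToTransfer(nil, first))
	// Later load: only the file whose content changed.
	fmt.Println(ToTransfer(first, second))
}
```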
Realistically, it feels like this is never going to work too well for a monorepo.
Is it the need for configuring include/exclude? Like that's too much configuration overhead?
@shrewd bison in theory with the root-for approach, the loader could go parse the configs specified by that key, and determine what needs to be accessed. Since it's all determined by either 1) what the SDK might need (eg. go.mod), or 2) dependencies (eg. ../../dagger/supabase). So in theory, we could later optimize the load of a monorepo without requiring manual include/exclude (but we could still allow it of course)
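A minimal sketch of that idea, assuming the loader only needs to follow the `dependencies` key shown earlier in the thread. `RequiredDirs`, the in-memory repo, the `projects/...` layout, and the `github.com/example/remote-module` reference are hypothetical; real loading would also have to account for SDK files such as go.mod.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io/fs"
	"path"
	"sort"
	"strings"
	"testing/fstest"
)

// moduleConfig mirrors only the dagger.json field used in this thread.
type moduleConfig struct {
	Dependencies []string `json:"dependencies"`
}

// isLocal treats "./" and "../" references as local; the rest is assumed remote.
func isLocal(dep string) bool {
	return strings.HasPrefix(dep, "./") || strings.HasPrefix(dep, "../")
}

// RequiredDirs returns moduleDir plus every local dependency reachable from
// it, as sorted repo-relative paths. Paths that escape the repo root fail
// because fs.FS rejects them.
func RequiredDirs(repo fs.FS, moduleDir string) ([]string, error) {
	seen := map[string]bool{}
	var visit func(dir string) error
	visit = func(dir string) error {
		if seen[dir] {
			return nil
		}
		seen[dir] = true
		raw, err := fs.ReadFile(repo, path.Join(dir, "dagger.json"))
		if err != nil {
			return err
		}
		var cfg moduleConfig
		if err := json.Unmarshal(raw, &cfg); err != nil {
			return err
		}
		for _, dep := range cfg.Dependencies {
			if !isLocal(dep) {
				continue // remote modules are fetched separately
			}
			if err := visit(path.Join(dir, dep)); err != nil {
				return err
			}
		}
		return nil
	}
	if err := visit(path.Clean(moduleDir)); err != nil {
		return nil, err
	}
	dirs := make([]string, 0, len(seen))
	for d := range seen {
		dirs = append(dirs, d)
	}
	sort.Strings(dirs)
	return dirs, nil
}

// sampleRepo is a made-up monorepo layout held in memory.
func sampleRepo() fstest.MapFS {
	return fstest.MapFS{
		"dagger/supabase/dagger.json": {Data: []byte(`{"name":"supabase","sdk":"go"}`)},
		"projects/rawkode.academy/dagger.json": {Data: []byte(
			`{"name":"rawkode.academy","sdk":"go","dependencies":["../../dagger/supabase","github.com/example/remote-module"]}`)},
		"projects/unrelated/dagger.json": {Data: []byte(`{"name":"unrelated","sdk":"go"}`)},
	}
}

func main() {
	dirs, err := RequiredDirs(sampleRepo(), "projects/rawkode.academy")
	if err != nil {
		panic(err)
	}
	fmt.Println(dirs) // projects/unrelated is never touched
}
```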
Yes, that's already in the cards for local modules (though for the future, too much for the current PR), but loading modules like that from a git repo is going to require a lot more elbow grease: https://github.com/dagger/dagger/issues/6292
Because of the multi language support, a lot of the trivial local wrangling of dependencies has become a rather complicated process involving many short lived module build steps
And that's before we get to running actual jobs
My concern is that my monorepo is small and it's already hard. As I add many more services, can I get a solution that scales without exponential compute time and complexity?
Which I can, but I'd sacrifice modules and go back to my own client handling; maybe that's just the answer.
Of course I could also put all my modules in their own repo and build and publish them
But I think I'd still run into problems when I have service dependencies, like two services needing the same protobufs
Just to clarify, the concern right now is the performance? As opposed to the configuration complexity?
Do you mean computational complexity (computer takes time to complete the work) or human complexity (lots of concepts and tasks to juggle in your brain)?
Compute and performance, the config doesn't bother me.
There are two use-cases. One, CI: easy. Mount everything, rely on cache, happy Rawkode.
Two, local, on a laptop. I never want to build everything; I want to work on services and have their dependencies built to be consumed. This needs to isolate the DAG and be fast.
As we have services, I'm now refusing to write a Justfile or Makefile because everything can be a task / function
But for this use-case it isn't easy for me to see how it'll work
I wonder if dagger mod install at the root of my repository could detect all dagger.json files and do the codegen globally, without mounting the entire repository
Let's stick to my current example. I want to call "dagger up dev" in my Rawkode.academy directory. It needs the Supabase function from the root of the repository; I don't think this should require mounting everything. What can we do to make that work?
I keep ending up back at bazel and I think there's some things that map well. Dagger modules and functions are bazel rules.
Bazel has WORKSPACE and BUILD files.
dagger.json is kind of like a WORKSPACE and the BUILD is embedded within the rules.
What Dagger doesn't have yet is the bindings and dependencies that Bazel has.
This is something Earthly handles really well too
I'll need to put some thoughts into an issue
Like .editorconfig supports [root], perhaps dagger.json could too. The Dagger CLI would always look for this. Any dagger.json further down the directory tree would become a local module and could be referenced by name, like in Bazel: //sub-name-module, rather than ../../ or root juggling
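Roughly what that lookup might look like. Everything here is an assumption: dagger.json has no `root` field that I know of (it is borrowed from the spirit of .editorconfig's `root = true`), and `FindRoot`, `ResolveLabel`, and the `projects/...` layout are hypothetical names for the sake of the sketch.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"path"
	"strings"
	"testing/fstest"
)

// config holds a hypothetical "root" marker; it is not a real dagger.json field.
type config struct {
	Root bool `json:"root"`
}

// FindRoot walks upward from start until it finds a dagger.json marked as root.
func FindRoot(repo fs.FS, start string) (string, error) {
	dir := path.Clean(start)
	for {
		raw, err := fs.ReadFile(repo, path.Join(dir, "dagger.json"))
		if err == nil {
			var cfg config
			if json.Unmarshal(raw, &cfg) == nil && cfg.Root {
				return dir, nil
			}
		}
		if dir == "." {
			return "", errors.New("no root dagger.json found")
		}
		dir = path.Dir(dir)
	}
}

// ResolveLabel turns a Bazel-style "//path/to/module" label into a
// repo-relative directory, checking that a dagger.json exists there.
func ResolveLabel(repo fs.FS, from, label string) (string, error) {
	if !strings.HasPrefix(label, "//") {
		return "", fmt.Errorf("%q is not a root-relative label", label)
	}
	root, err := FindRoot(repo, from)
	if err != nil {
		return "", err
	}
	dir := path.Join(root, strings.TrimPrefix(label, "//"))
	if _, err := fs.Stat(repo, path.Join(dir, "dagger.json")); err != nil {
		return "", fmt.Errorf("%s is not a module: %w", label, err)
	}
	return dir, nil
}

// sampleRepo is a made-up monorepo layout held in memory.
func sampleRepo() fstest.MapFS {
	return fstest.MapFS{
		"dagger.json":                          {Data: []byte(`{"root": true}`)},
		"dagger/supabase/dagger.json":          {Data: []byte(`{"name": "supabase", "sdk": "go"}`)},
		"projects/rawkode.academy/dagger.json": {Data: []byte(`{"name": "rawkode.academy", "sdk": "go"}`)},
	}
}

func main() {
	dir, err := ResolveLabel(sampleRepo(), "projects/rawkode.academy", "//dagger/supabase")
	if err != nil {
		panic(err)
	}
	fmt.Println(dir)
}
```

The same label resolves to the same module no matter which directory you start from, which is the point of avoiding ../../.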
@golden quartz what's the earthly feature you're thinking of? I'll go look at it to better understand
There's 2:
- Local targets: BUILD ./any-subdirectory+target-name, also works upwards with BUILD ../../some+target
- Remote targets: IMPORT github.com/earthly/lib/utils/git:2.2.11 AS git
What's cool about it is that I can execute earthly at any directory within my monorepo, or at the root and build whatever is needed
Their IMPORT stuff is pretty new
I'm trying to map 1) what you like about bazel and 2) what you like about earthly to 3) what you see missing in dagger.
For example Bazel has "reference everything relative to project root" to avoid ../... But Earthly doesn't have that, correct?
What I like about Bazel is that I don't need any higher level orchestration. I put BUILD and WORKSPACE files in directories and it works out the rest.
What I like about Earthly is that it's kind of the same, everything is referenced via target names. This is how Zenith works too, after the codegen; but as we've discussed this runs in a container and we have problems with imports higher up the tree. Because Earthly runs this stuff on the host, the orchestration works as expected.
There's a lot of similarity between Earthly's IMPORT and Zenith modules after codegen. Earthly's Dockerfile syntax is probably very similar to the GraphQL.
What's missing in Dagger is the orchestration piece. We can get this at a service level with codegen and functions with dagger call, but there's no aggregation across the repository like we get with bazel and Earthly
Very mono repo specific, I know
I think the confusion comes from the fact that what Bazel and Earthly aggregate, is something completely different from Dagger functions. That particular problem (finding "scripts" or "rules" in a given monorepo) is something Dagger has not tackled yet. We're starting with a simpler primitive (functions) which has no equivalent in Bazel and Earthly
Maintaining a hierarchy of dagger modules in your monorepo is not equivalent to maintaining a hierarchy of Bazel or Earthly rules
You don't think a Zenith module is the same as an Earthly module? How is Dagger different?
Of course, Dagger could implement a parser for Bazel BUILD files and Earthfiles; but let's ignore that for now
I'm setting aside Earthly modules (import) which are roughly a clone of our older implementation of dagger do (down to the verb lol)
I'm talking about rules/targets
The main difference is that Dagger modules are just a collection of functions and types - just like Go modules. There is no implicit access to the monorepo's content. For example a build function hosted in your monorepo, doesn't have ambient access to your monorepo's source code: it needs to receive source code as an argument like any other function. This is different from Bazel/Earthly/Docker build/Makefile, where there is always implicit access to the current directory at least.
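A plain-Go analogy for "source code is an argument", deliberately not using the Dagger SDK. `GoFiles`, `CallWithSubdir`, and the repo layout are hypothetical; the point is only that the function sees exactly what the caller passes in and nothing else.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// GoFiles lists the .go files in src. It has no ambient working directory:
// it can only see the file tree the caller hands it.
func GoFiles(src fs.FS) ([]string, error) {
	var files []string
	err := fs.WalkDir(src, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && strings.HasSuffix(p, ".go") {
			files = append(files, p)
		}
		return nil
	})
	return files, err
}

// CallWithSubdir plays the role of the caller: it picks one directory of the
// repo and passes only that subtree to GoFiles.
func CallWithSubdir(repo fs.FS, dir string) ([]string, error) {
	src, err := fs.Sub(repo, dir)
	if err != nil {
		return nil, err
	}
	return GoFiles(src)
}

// sampleRepo is a made-up monorepo layout held in memory.
func sampleRepo() fstest.MapFS {
	return fstest.MapFS{
		"dagger/supabase/dagger.json":      {Data: []byte(`{"name":"supabase","sdk":"go"}`)},
		"dagger/supabase/main.go":          {Data: []byte("package main")},
		"projects/rawkode.academy/main.go": {Data: []byte("package main")},
	}
}

func main() {
	// Only the supabase subtree is visible inside GoFiles on this call.
	files, err := CallWithSubdir(sampleRepo(), "dagger/supabase")
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```

In a Makefile-style tool the equivalent function would just read the current directory; here the caller has to decide which slice of the monorepo is handed over.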
The current directory is bound late, but yes
Dagger could do that too?
Ah, but the challenge is that the modules run in BuildKit land and the inputs need to be known ahead of time?
I'm getting a little out of my depth of Zenith internals
Dagger Functions are intentionally designed to not allow ambient access, because that would compromise the simplicity and scalability of the composition model.
BUT we do plan on adding convenience on top, to eventually match the UX of the "bag of rules and shell scripts" model without compromising the composition model