OCI Registry Module Distribution | Dagger | Page 1

heady cypress Dec 18, 2023, 10:30 PM

#

Well, currently you’ve optimized for teams to have their own daggerverse repository, which you can cache and that will work fine.

#

But teams that have invested in their monorepo will be storing their modules in much larger repositories where the Dagger module code will be anywhere from 1-10% of the code base

#

Of course you can cache that, but how many multi GB or 10s of GB repositories do you want to do that for when you need a few 100MBs?

#

Then there’s monorepos that aren’t public

crisp storm Dec 18, 2023, 10:40 PM

#

In the context of a huge monorepo, the initial load will typically be from a local checkout (otherwise there's no point in embedding). So in practice that big checkout of the monorepo by Dagger may not happen; instead it will be a Host.Directory. We’ll still find a caching optimization for it. We have a ton of engineering planned on improving caching across the board next year.

heady cypress Dec 18, 2023, 10:53 PM

#

For module installation?

#

How can that use Host.Directory?

crisp storm Dec 18, 2023, 10:55 PM

#

Something like dagger call -m ./ci

#

(CLI does the Host.Directory call for you)

heady cypress Dec 18, 2023, 10:57 PM

#

And if that uses a dagger module higher up in my tree?

#

I don’t know if I’m missing something and these are silly questions 😅

crisp storm Dec 18, 2023, 10:58 PM

#

Not silly at all

crisp storm Dec 18, 2023, 10:59 PM

#

heady cypress And if that uses a dagger module higher up in my tree?

That is an unsettled topic 😁 cc @stuck ravine

heady cypress Dec 18, 2023, 11:03 PM

#

Technically, everything is going to become a module, and services that depend on the build output of other services in a nested hierarchy are currently not possible this way.

#

I can put together a real example of what I’d like to be able to do, if that helps

stuck ravine Dec 18, 2023, 11:19 PM

#

Yeah we've discussed some of this as part of this issue here: https://github.com/dagger/dagger/issues/5862

And there's a related discussion about how to actually specify dependencies in this sort of situation here: https://discord.com/channels/707636530424053791/1182712578154180648 (gonna go turn that one into an issue too after typing this)

Basically, we do want to support dependencies from one module to another within the context of a large monorepo. There's a placeholder way of getting it to work today with a root setting in dagger.json but we're gonna refactor it soon (sometime in the next month).

heady cypress Of course you can cache that, but how many multi GB or 10s of GB repositories do you want to do that for when you need a few 100MBs?
Truly huge repos like this definitely will call for a lot more intelligence in how we clone+cache git repos internally. I haven't invested a ton of research time into this yet but there do appear to be ways in newer versions of the git protocol to just clone+checkout parts of a repo, which is what we'll want in order to be able to pull only parts of the repo we actually need for loading each module.

That sort of optimization is less likely to come in the immediate future (probably better as part of the caching optimization work next year Solomon mentioned) but I'll make sure it's tracked in an issue.
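
For reference, the "clone+checkout parts of a repo" idea can be approximated today with stock git, using a partial clone plus a sparse checkout. This is a minimal sketch of that general technique, not how Dagger loads modules internally. The helper names, the repository URL, and the `ci` path are hypothetical, and it assumes git 2.25 or newer and a server that supports partial-clone filters.

```python
import subprocess
from typing import List


def partial_clone_commands(repo_url: str, dest: str, module_path: str) -> List[List[str]]:
    """Build the git commands that fetch only one subdirectory of a large repo.

    Hypothetical helper for illustration; it only builds argument lists.
    """
    return [
        # Partial clone: defer downloading file contents (blobs), take only the
        # latest commit, and start with a sparse working tree.
        ["git", "clone", "--filter=blob:none", "--sparse", "--depth=1", repo_url, dest],
        # Limit the working tree to the module directory; git then fetches only
        # the blobs needed for that path.
        ["git", "-C", dest, "sparse-checkout", "set", module_path],
    ]


def fetch_module(repo_url: str, dest: str, module_path: str) -> None:
    """Run the commands above, raising CalledProcessError if git fails."""
    for cmd in partial_clone_commands(repo_url, dest, module_path):
        subprocess.run(cmd, check=True)
```

Calling `fetch_module("https://example.com/org/monorepo.git", "monorepo", "ci")` (placeholder URL) would leave a checkout containing the `ci` directory plus the repo's top-level files, rather than the full tree.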

stuck ravine Dec 18, 2023, 11:20 PM

#

heady cypress I can put together a real example of what I’d like to be able to do, if that hel...

That would be super helpful. We can figure out what works best in the immediate term and then use it as input for the above-mentioned improvements we're planning too 🙏

stuck ravine Dec 18, 2023, 11:49 PM

#

(issue summarizing the discord thread I linked to w/ an initial strawman proposal on one possible specific approach: https://github.com/dagger/dagger/issues/6291)
