#Toolchain Overlays POC by kpenfound · Pu...

1 messages · Page 1 of 1 (latest)

rustic seal
#

Does this solution for middleware make sense?
Is this better than just writing a main module for composition?

rustic seal
#

One concern I have is that its adding complexity to how modules are loaded. To understand the API, we now have

  1. Load the main module (or blueprint)
  2. Load the main module's dependencies
  3. Load toolchains and add proxy functions to the toolchain module's constructor
  4. Load overlay toolchains and apply proxies to the toolchain functions

What we gain is the ability to change how a toolchain function is called (like middleware) without writing a new check in a main module (we might not even have a main module) and disabling the toolchain check in the dagger.json

compact folio
#

NOTE: this is a pretty pessimistic take having not used it for the use cases it's intended for, there's a good chance it's actually pretty nice all things considered, but leaving raw initial feedback because I'm guessing I won't be the only one

My 10 mile read on it is that it feels like an intimidating amount of complexity. Creating a third module that modifies how another module is used by your module feels like a lot to keep in your head. I'd be struggling to understand how all the components get joined together to result in the final interface, like you said.

I feel like I'd end up yearning for the relative simplicity of just using a module as a plain old dependency. Like the complexity of adding my own check that calls the module's check (~5 lines of code) would be preferable to the complexity of installing it as a toolchain and babysitting/modifying the checks that it brings along (through a combination of dagger.json and .env edits and overlay modules).

tl;dr this seems like a well designed solution for the problems we have, but I wonder if we could instead try to solve problems further upstream to avoid a complicated solution downstream

Some things to consider:

  • Can we have toolchains act as super-powered dependencies, and "just" support calling them in the user module and returning their types? This would be distinct from plain dependencies, as their types would be effectively copied into the target module, avoiding the diamond dependency problem. We could maybe even support overriding methods on toolchain-provided types if we need to, just like we used to support that for Container for a hot second near the start of Zenith. I'm not sure how you'd call the original though.
  • Do we need to have a "come to Jesus" moment with contextual dir args and try to find a new design? Feels like a lot of this hardship comes from how those work. (Failing to remember the details but I remember it coming up a lot when going over this, and they're in a sense the original reason toolchains exist)
  • Overlays feel like monkey-patching in Ruby, but in Ruby it's a last resort. Through that lens, it makes me wonder if there's a better way to do module composition that's less brittle. But I don't know what that is. What I do know is if people design modules such that they expect users to define overlays to fully use them as the primary configuration/customization/parameterization technique, that would feel like pretty bad DX. Maybe something around dependency injection?
  • How much of this can be solved by solid module design patterns? Things like "builder modules should always accept a configurable base Container" This example feels like poor module design (sorry I know it's just a quick demo) - if a user wanted to build with a different base image entirely they'd have to completely ignore the original Build and do their own thing which feels crazy.

Sorry for all the words, trying to throw things at the wall and see what sticks 😅

rustic seal
#

thanks for the detailed analysis @compact folio ! I think we're on the same page, thats why I wanted to have this discussion

Can we have toolchains act as super-powered dependencies

Thats basically the alternative today. You can already use a toolchain as a dependency, but you can't override types. I don't know if thats really needed

Do we need to have a "come to Jesus" moment with contextual dir args and try to find a new design?

We definitely circled around this a few times when @deft arch and I were brainstorming this. I think on Monday someone (maybe @cobalt night ?) mentioned creating a way to find the true module source separate from context dirs, which feels like a prerequisite for handling the main context in a smarter way

Overlays feel like monkey-patching in Ruby

100%. The original idea was more like a middleware, but where we ended up was essentially monkey-patching. I think this leads to the next point

Things like "builder modules should always accept a configurable base Container"

Yes, but here's where it gets tricky. Taking our CI for example, to use a generic Go toolchain we would need to pass in a base container configured with multiple custom Services, some from other toolchains. So we have some cross-toolchain composition required. With a middleware, we could do some lightweight composition to create that base container that Go.Test needs. Or in a main module, we create a Test check that does the composition, and then we disable Go.Test (you can disable individual checks in the dagger.json). The later works today, but we're trying to see if a middleware solution could be better.

I'm not convinced the overlay solution is worth it when the current workaround isn't that much different and doesn't add to module loading complexity. But maybe there's still something to the original middleware idea?

#

I've also been considering that we might need a more advanced system for routing api calls to function calls now that we have blueprints and toolchains. It was simple enough when we just had individual modules, but the proxying stuff is a hack that could be handled by a proper router. And with a router, the middleware concept becomes much less complex

deft arch
#

Hello! I will dive in after the week-end, but dropping a few thoughts:

  1. I think we can salvage this feature with a chiropractic adjustment. Instead of "module FOO has a toolchain BAR but loaded as an overlay to BAZ", it should be: "module BAR extends module BAZ". This moves the feature to the toolchain's dagger.json - and makes it orthogonal to toolchains. It narrows the focus to a specific goal: a less verbose DX for wrappers.

  2. Separately from all this, I agree that we should revisit context directories (+defaultPath and +ignore). With the new toolchain pattern, they are even more in the UX hotpath, so it's worth looking for every opportunity to improve it.

rustic seal
#

module BAR extends module BAZ

would you then install BAR as a toolchain instead of BAZ?

deft arch
#

Correct me if I'm wrong: in the current implementation you are required to install both as toolchains, whether you actually need both or not. What I'm proposing is that you would not be required to do that. Instead:

  • If you need BAR, you install BAR
  • If you actually need BAZ without any overlays, you install BAZ
  • If you need both, you install both

And regardless of which toolchains you install, overlays are irrelevant at that point

rustic seal
deft arch
#

Do we need to have a "come to Jesus" moment with contextual dir args and try to find a new design? Feels like a lot of this hardship comes from how those work. (Failing to remember the details but I remember it coming up a lot when going over this, and they're in a sense the original reason toolchains exist)

@compact folio did you have anything specific in mind here? I'm interested..

compact folio
#

not really 🫣

#

just extremely vague thoughts around dependency injection/discovery, Env, all that

#

but have to keep caching / cache key computation in mind, too

deft arch
#

Seems related somehow

#

tldr: create a new type for the underlying "file server" that can produce directory snapshots; support passing that as argument to create your own snapshots

compact folio
# deft arch What do you think of <@949034677610643507> 's proposal for https://github.com/da...

The part at the end feels intriguing to me, since what paths to load tends to change infrequently, and knowing that as early as possible seems like it'd have bigger performance implications for more potential systems (like smart checking, cache key computing, etc). I may have some bias though since it reminds me of my 'multiple .ignore files' idea 🙂

That's more of a "I want to know what we can find out about that approach" than a "I prefer that over LiveDir" - incrementally expanding a Directory does sound cool, but I wonder if it's more dynamic than we really need and would have a higher performance floor (as in higher price of admission)

#

I think the biggest part of the problem is there being one contextual dir - sometimes we want it to be the original module's source (e.g. Dang SDK), sometimes we want it to be the toolchain-using module's source, sometimes we want that to be an Env workspace

deft arch
rustic seal
#

I think starting by separating ModuleSource from context will help. Especially if we can expose the context directory in the API too, not just via magic arguments

deft arch
rustic seal
#

ah right, so then further separating "here's the whole context you can access in the API" from "here's a special version of that you can get with a magical argument"

#

or just avoiding the API completely so there's not a wrong way to do the same thing

#

ok fine its a hard problem 😛

deft arch
#

The idea in that issue (dynamic filtering) is to narrow the scope of the magic argument, and move as much logic as possible in regular code, inside an entrypoint function that is never cached.

The whole idea hinges on how fast we can make that never-cached entrypoint function (which is what @compact folio was concerned about earlier in this thread I think)

#

This "never-cached entrypoint" design does require a special argument type, and the exact shape of that type hasn't been defined. All we know is that it needs a way to create directory snapshots.

In that issue I call the type "FileServer", but it could also be broader in scope, and basically be "env 2.0".

deft arch
deft arch
sullen venture
#

but think there's some pretty reasonable options to reduce it further if needed

sullen venture
deft arch
compact folio
#

I like the fact that it's an argument, but I don't love that ignore lists become part of the argument spec - it's not so bad in some SDKs, but big JSON list comments in Go are unfortunate, and I worry about drift between those lists in general

#

(but it is convenient to define it all together, to be fair)

deft arch
compact folio
#

That's part of the appeal, to me, of pulling those "specs" out as something that can be committed and possibly generated. Like maybe it would be nice to have names for different views of the source tree, and map those names to Directory arguments somehow

deft arch
compact folio
#

ah right, views is similar but not the best DX - caller has to specify it, no way to default to one for an arg, per-module JSON config instead of files in the repo at appropriate paths

#

i like the idea of driving them from code, e.g. to deal with replace rules in a go.mod, but generating them feels like a good next step on top of that, since these things rarely change and are almost never dynamic afaik

#

sorry i realize i'm mostly nibbling around the edges of an idea without putting forward a proposal to pick apart

deft arch
compact folio
#

well, the generating is only a thing you'd do if you needed it, otherwise it'd be about the same as maintaining your .gitignore, which is a well-exercised muscle for most people i'd expect

#

(also was there a particular message re: their contextual dir constraints? or more a stand-in for the 'simple case'? or is their case actually hard?)

deft arch
compact folio
#

yeah, i'm just suggesting when you need more, it'd just be more repeating the same .gitignore pattern, just as other files representing views of your repo, instead of learning another ignore/path specification system

deft arch
compact folio
#

which could either be hand-maintained or generated, but that's an external exercise

deft arch
#

Mmmm... We did ship a way to attach ignore filters to specific arguments of specific functions... Via "toolchain customizations". So we could lean on that if we wanted.

Different design from your example - instead of the code referencing an ignore file, it's a config file referencing the code... But same general idea (maintain your filters in a standard config format alongside your code, instead of mixed within it)

compact folio
#

It'd be nice if we could make it more transparent, so you can configure filters once in a generic way and have things generally "just work" across toolchains.

Strawman idea: what if we could map file extensions to filters? e.g. %.go could represent all .go files and all the files they need , i.e. (**/*.go + and all // go:embeds + all go.mod replaced paths). The Go toolchain would use something like // +sources=["%.go"] for its source arg. %.go could default to **/*.go and be overridden by reading a conventional filename in the target directory (like .paths/go). The Go toolchain could provide a generator that finds the other dependencies and writes them to that file. In the CLI, foo/%.go could apply the same filters but scoped to foo/.

That's sort of designed on the theory that most args are generally accepting one or a handful of primary file types, and the dependencies descend from there.

rustic seal
# deft arch Correct me if I'm wrong: in the current implementation you are required to insta...

Here's that https://github.com/dagger/dagger/pull/11642
I think its pretty cool actually. I'm also less concerned about this than I was about overlays. Wdyt @compact folio ?

GitHub

This is an alternative to #11561
This introduces the concept of extending a module. If you extend a module, you will inherit all of its functions, but you can override any of them. Here&#39;s a...