#Upstream modules?

I will transcribe this into an issue, but would love to hear your knee-jerk reactions first
I'm not following yet. The dag.CurrentModule().Source() api lets you access the contents of your module's source dir. Is this something different?
If you set "source": "." then you're right, you can already do it. But many repos will set "source" to a subdirectory (especially the current default "./dagger").
So I'm assuming that the current best practice is "you can kind of do it, but the docs don't mention it, most modules don't do it, so it's uncharted territory"
Yeah that would sum it up
I remember @ebon tree saying something like "if we embraced it, Dagger could become a mainstream method for packaging software, like nix flakes". Which is a strong argument for doing it, but at the same time I worry about unintended consequences of mixing both models. Will best practices become more complicated, more confusing etc. Lack of clarity on what is the "correct way" of doing things is a major problem in Nix, and may end up being its downfall IMO. I want to avoid the same thing happening to us.
TLDR I am torn and would love your thoughts
What struck me the most was that I originally wrote the https://daggerverse.dev/mod/github.com/vito/bass@bf53c90d467e0eaf7125cfbfd005c65349125481 module just to have a module for Bass's CI stack (as you would normally do), but when it came time to publish the module, it actually felt like I had just published Bass itself, but available in a programmatic "build-it-yourself" way. At the time we even showed the README, so Bass's full README was rendered inline in Daggerverse, which was kind of neat. And to drive that home, the module ref is github.com/vito/bass, not github.com/vito/bass/dagger, so it really feels like it represents the application, not its "build process."
Basically we'd "just" need a few things:
- To define a standard interface for build() functions - which needs to account for building on different target platforms.
- This thread - some way to avoid having to pass the Bass source code into its own module.
- A command, maybe dagger run, that just takes a module ref, invokes its build() function, moves the binary to a local cache, and executes it. (With as much caching as possible for this to compete with running the binary 'natively'.)
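To make the first and third points concrete, here is a rough sketch in Go. Everything in it is an assumption for illustration: the `Builder` interface, the `Platform` type, `cachePath`, and `run` are hypothetical names, not part of Dagger. It only shows the "build once per ref and platform, then reuse" idea; actually executing the cached binary is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// Platform is a hypothetical target description, e.g. "linux/amd64".
type Platform string

// Builder is a hypothetical "standard interface": a module whose Build
// accepts a target platform and returns the built binary's bytes.
type Builder interface {
	Build(platform Platform) ([]byte, error)
}

// cachePath derives a stable local path from the module ref and platform,
// so repeated runs of the same ref can skip the build.
func cachePath(cacheDir, ref string, platform Platform) string {
	sum := sha256.Sum256([]byte(ref + "\x00" + string(platform)))
	return filepath.Join(cacheDir, hex.EncodeToString(sum[:8]))
}

// fakeModule stands in for a real module and counts how often it built.
type fakeModule struct{ builds int }

func (f *fakeModule) Build(p Platform) ([]byte, error) {
	f.builds++
	return []byte("binary for " + string(p)), nil
}

// run builds at most once per cache path; store plays the role of the
// local cache directory.
func run(store map[string][]byte, b Builder, cacheDir, ref string, p Platform) ([]byte, error) {
	key := cachePath(cacheDir, ref, p)
	if bin, ok := store[key]; ok {
		return bin, nil
	}
	bin, err := b.Build(p)
	if err != nil {
		return nil, err
	}
	store[key] = bin
	return bin, nil
}

func main() {
	store := map[string][]byte{}
	mod := &fakeModule{}
	for i := 0; i < 2; i++ {
		bin, _ := run(store, mod, "/tmp/cache", "github.com/vito/bass", "linux/amd64")
		fmt.Println(string(bin))
	}
	fmt.Println("builds:", mod.builds)
}
```

Keying the cache on both ref and platform is one possible choice; a real command would probably also need to key on the resolved commit.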
I'm somewhat encouraged by the fact that it doesn't feel like you need to write two modules or think of it as two different use cases - usually this would just be a subtle change to a build() function you already wrote, to make sure it fits the interface. But I do hear the worry about messaging/framing, if we pivot to support this but it feels disjointed in the end, that ain't good.
Worth noting that point 2 could maybe just be something that dagger run does automatically. It could take the ref, interpret it as both a module ref and a Git source ref, and automatically pass it in (but that implies a standard interface for "passing the source in")
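A minimal sketch of "interpret the ref as both a module ref and a Git source ref", assuming refs shaped like `host/owner/repo[/subdir][@version]`. The `SourceRef` type and `splitRef` function are hypothetical, and real ref parsing likely has more cases than this.

```go
package main

import (
	"fmt"
	"strings"
)

// SourceRef is a hypothetical result of reading one ref two ways.
type SourceRef struct {
	Module  string // ref used to load the module
	GitURL  string // clone URL for the same repository
	Version string // commit or tag after "@", empty if none
}

// splitRef assumes refs shaped like "host/owner/repo[/subdir][@version]".
func splitRef(ref string) (SourceRef, error) {
	path, version, _ := strings.Cut(ref, "@")
	parts := strings.Split(path, "/")
	if len(parts) < 3 {
		return SourceRef{}, fmt.Errorf("ref %q: want host/owner/repo", ref)
	}
	return SourceRef{
		Module:  ref,
		GitURL:  "https://" + strings.Join(parts[:3], "/"),
		Version: version,
	}, nil
}

func main() {
	r, err := splitRef("github.com/vito/bass@bf53c90d467e0eaf7125cfbfd005c65349125481")
	if err != nil {
		panic(err)
	}
	fmt.Println(r.GitURL, r.Version)
}
```

The Git half could then be passed to whatever the "standard interface for passing the source in" ends up being.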
I think this is onto something - it feels sort of like one of the things I was trying to express in https://github.com/dagger/dagger/pull/6843.
The dagger ci is going to be made for building+packaging dagger - while it's super cool to be able to run something like dagger call -m https://github.com/dagger/dagger.git --source https://github.com/dagger/dagger.git build or something like that, the main use case is going to be local dev or in CI where the repo is already cloned. The need to keep repeating --source=. ... export --path=. really starts to feel repetitive, especially when it turns a simple autoformat command into a massive exercise (though I suppose dagger script might potentially resolve this).
Maybe for "upstream modules", we could allow access to the top-level directory (through the host api or maybe something new)? Allowing reading/writing to it? It could be a capability or something that's added in dagger.json, which would also prevent any other modules from importing it (which closes the security hole of dependent modules being able to grab any contents)
I'm sort of on the fence, while that would be really nice for resolving my current issue, I could definitely see this impacting the reusability of the daggerverse, if users start enabling it everywhere without understanding exactly where we think it should be used.
@amber token there is also "dagger scripts" which will solve some of that
then you can dagger run build
will scripts only work when running locally? or can i also point it to a remote module? (and if I can, what happens if there's a file export?)
yes -m would work. export would die alone in darkness
(I guess go to the ephemeral checkout)
unless we find a convenience that makes sense
The general idea of scripts though, is that they're for using dagger in the context of your project. so expecting that you have a local checkout makes sense
so: scripts will not make dagger a general packaging platform. But will solve some of the problems that create the desire for upstream modules (inconvenience of passing all args all the time)
mmm ok maybe export could work?
Yeah I arrived down a similar line of thinking. Both of these might be solved by allowing a default value for a source argument somehow. As long as we still require you to define an argument in your module, it seems like that's win-win? The module would still be designed the "right way" (take source as an explicit input), users would be able to skip the arg in the CLI, and 'app distribution modules' would just build on that.
Isn't this as simple as dag.CurrentModule().Root() ? Then up to module dev to implement default values the usual way.
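For illustration, this is roughly what "implement default values the usual way" could look like. It is a sketch with stand-in types: `Directory` is reduced to a path, and `moduleRoot` is a hypothetical placeholder for the proposed `dag.CurrentModule().Root()`, which the thread goes on to note is not exposed.

```go
package main

import "fmt"

// Directory is a stand-in for the SDK's directory type, reduced to a path.
type Directory struct{ Path string }

// moduleRoot is hypothetical: it plays the role of the proposed
// dag.CurrentModule().Root(), which is not an existing API.
func moduleRoot() *Directory { return &Directory{Path: "."} }

type MyModule struct{}

// Build keeps src as an explicit argument; nil means the caller skipped it.
func (m *MyModule) Build(src *Directory) string {
	if src == nil {
		src = moduleRoot()
	}
	return "building " + src.Path
}

func main() {
	m := &MyModule{}
	fmt.Println(m.Build(nil))
	fmt.Println(m.Build(&Directory{Path: "./fork"}))
}
```

The argument still exists, so callers can override it; nothing forces the author to declare it, though, which is the ecosystem worry raised here.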
That seems like the most straightforward DX. I'm just worried about impact on the ecosystem as we're discussing. feasibility is not an issue imo
let me see... I remember there being something about the paths that didn't work out. But yeah, if that's already an option, I see the danger in encouraging it, since users might just skip the arg at that point
ok no, I don't think that's exposed. You have CurrentModule().Source() which is dagger/, and then CurrentModule().Workdir() which is /scratch (I think). We don't expose the outer "context" - I suppose that's the door that we closed
which is kind of good news, I bet if that door was opened so easily people would be reaching for it already
oh - what if you could configure a "default view" for any Directory argument? building on the 'views' feature we just added (which I have yet to use, so correct me if this makes no sense)
@amber token also had some ideas around views that might relate (don't remember details)
right I am aware it's not exposed - but I'm guessing could be easily, if we feel ok with consequences
got it
Quick checkpoint. My understanding is that there are several parallel problems and solutions that may be relevant to this, and we're not sure how complementary / overlapping they are. Namely:
Relevant problems:
- Can't use Dagger as an upstream packaging tool
- Always forcing explicit arguments to dagger call is annoying
- Dagger is not yet a slam dunk in monorepos
Relevant solutions:
- Allowing modules to access their root directory (not just their source)
- Directory views
- Dagger Scripts
Don't think I have specific context on the monorepos part, but aside from that it tracks yeah
Yeah, felt the need for access to the root dir recently. Kind of a knot to untangle there that I haven't gone back and fixed yet.
Yeah I left it pretty vague. Basically: there is no well-defined, known-to-be-awesome, best practice for embedding Dagger modules in a monorepo.
The absolute lowest-hanging fruit has been Mark's issue: "the monorepo is too big to load" so everyone is stuck at step 1. But once we solve that (with the views stopgap), there will for sure be other blockers.
For example, you can't really map the dependency tree of monorepo components to a dependency tree of Dagger Modules. Because a module can't embed its source code. So you need to "carry" the extra information of source argument, all the way down. I don't know how to solve that without dag.CurrentModule().Root()
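A toy sketch of what "carrying" the source argument looks like. The `Service` and `Lib` components and the monorepo paths are made up for illustration, and `Directory` is a stand-in reduced to a path.

```go
package main

import (
	"fmt"
	"path"
)

// Directory is a stand-in for the SDK's directory type, reduced to a path.
type Directory struct{ Path string }

// Subdir mimics selecting a subdirectory of a directory.
func (d Directory) Subdir(p string) Directory {
	return Directory{Path: path.Join(d.Path, p)}
}

// Lib is a hypothetical leaf component; it cannot locate its own code.
type Lib struct{}

func (Lib) Build(src Directory) string { return "lib(" + src.Path + ")" }

// Service depends on Lib, so it has to forward a slice of the repo to it.
type Service struct{ lib Lib }

func (s Service) Build(repo Directory) string {
	own := repo.Subdir("services/api")
	return "service(" + own.Path + ") + " + s.lib.Build(repo.Subdir("libs/core"))
}

func main() {
	fmt.Println(Service{}.Build(Directory{Path: "/repo"}))
}
```

Every layer has to know where its dependencies live in the repo, so the module dependency tree alone does not describe the build.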
tldr: the "no first-class support for upstream modules" doesn't only affect the open-source ecosystem: it also affects how you can organize a private monorepo, I think
I do like the general idea mentioned above and elsewhere of support for defaulting Directory arguments to values from the module context. That gives you almost the same power as full blown dag.CurrentModule().Root() but in such a way that you still treat these directories as arguments, which is a good thing because:
- Callers can override them (since you'd just be defining a default value), which guardrails modules to be easy to re-use, less hard coded, etc.
- It makes caching a lot simpler once we implement support for function caching
Is the extra magic of a "special default value" really worth it, compared to just letting devs implement the default value themselves?
Also wdyt of the impact on ecosystem of having this pattern (regardless of how we implement it) @undone grove ?
Can upstream modules peacefully coexist with downstream modules? ie. do we risk having a proliferation of modules that just don't bother to give you a source argument at all
(I guess someone else can always wrap it in a downstream module, and the ecosystem can decide if they like that?)
I think so, for the benefits I listed there. dag.CurrentModule().Root() would probably result in authors just skipping making an arg at all, which is a bit sad. I am more in favor of having a tiny bit of opinion + guardrails here. At least in terms of a next step, I'm not utterly opposed to dag.CurrentModule().Root() ever, my hot take is just that it's worth starting with a default arg value approach first and only layer that on later if proven necessary.
Wouldn't it be weird if all directories always had the module's source as a default value? How would it co-exist with the argument's actual default values etc.
I'm thinking if we do it that way, we'll need some sort of descriptor right? The module dev would designate some Directory arguments vs others
> ie. do we risk having a proliferation of modules that just don't bother to give you a source argument at all
That's what the default arg value approach would avoid; you'd still have to make it an arg (it just would be optional for callers). Which I do think is worth trying to preserve, at least for now
Yeah good point
So the tradeoff boils down to:
- Protect use of source arguments in the ecosystem
vs
- Protect simplicity of the DX
I'm imagining something like this:
func (m *MyModule) Build(
	// +default-from-root some/dir/from/the/context/root
	src *Directory,
) ...
With python/ts using annotations instead obviously, and a better name than default-from-root obviously too
So it's totally opt-in on individual args and controllable there. Could also perhaps reference views as @jed suggested too
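As a sketch of how such an opt-in comment could be read, assuming the placeholder name `default-from-root` from the snippet above (it is not a real Dagger pragma, and `defaultFromRoot` is a hypothetical helper):

```go
package main

import (
	"fmt"
	"regexp"
)

// pragmaRE matches a comment like "// +default-from-root some/dir".
// The pragma name is the placeholder from the thread, not a real pragma.
var pragmaRE = regexp.MustCompile(`^\s*//\s*\+default-from-root\s+(\S+)\s*$`)

// defaultFromRoot returns the path named by the pragma, if the comment has one.
func defaultFromRoot(comment string) (string, bool) {
	m := pragmaRE.FindStringSubmatch(comment)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	for _, c := range []string{
		"// +default-from-root some/dir/from/the/context/root",
		"// +optional",
	} {
		p, ok := defaultFromRoot(c)
		fmt.Printf("%q -> %q %v\n", c, p, ok)
	}
}
```

Arguments without the comment are left alone, which is what keeps the behavior opt-in per argument.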
Mainly that, though the simplicity of caching is very nice too. dag.CurrentModule().Root() wouldn't be an argument and thus wouldn't be a part of the cache key for the function (unless we jump through more hoops, which would also impact the code the author is writing). It doesn't matter yet but once we add function caching that may become more of a thorn
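A small sketch of the caching concern, assuming a cache key is derived only from the function name and its arguments. Function caching does not exist yet per the thread, so `cacheKey` is purely hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// cacheKey hashes only what a caller passes in: the function name and its
// arguments. Anything the function reads implicitly is invisible here.
func cacheKey(fn string, args ...string) string {
	sum := sha256.Sum256([]byte(fn + "\x00" + strings.Join(args, "\x00")))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Source passed as an argument: a new digest yields a new key.
	fmt.Println(cacheKey("build", "src@v1") != cacheKey("build", "src@v2")) // true

	// Source read implicitly inside the function: the key never changes,
	// even if the contents of the root directory did.
	fmt.Println(cacheKey("build") == cacheKey("build")) // true
}
```

With the source as an argument, even a defaulted one, invalidation falls out of the key for free.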
@undone grove +1 to basically everything. Would it be possible to specify a name of a view, too? I bet that's what you'd want most of the time:
// +default-view="code"
alternative, maybe too magical: we treat a string value as a view name
// +default="code"
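A sketch of the "string value as a view name" idea. The `views` table, the `code` view's patterns, and both helpers are assumptions made up for illustration; the fallback branch shows the ambiguity that makes this feel magical, since the same string could be either a view name or a plain value.

```go
package main

import (
	"fmt"
	"path"
)

// views is a hypothetical table of named views, each a list of glob patterns.
var views = map[string][]string{
	"code": {"*.go", "go.mod"},
}

// resolveDefault treats value as a view name when one exists, and otherwise
// as a single literal pattern.
func resolveDefault(value string) []string {
	if patterns, ok := views[value]; ok {
		return patterns
	}
	return []string{value}
}

// filter keeps the files whose base name matches any of the patterns.
func filter(files, patterns []string) []string {
	var kept []string
	for _, f := range files {
		for _, p := range patterns {
			if ok, _ := path.Match(p, path.Base(f)); ok {
				kept = append(kept, f)
				break
			}
		}
	}
	return kept
}

func main() {
	files := []string{"main.go", "README.md", "go.mod"}
	fmt.Println(filter(files, resolveDefault("code")))
}
```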
just to play devil's advocate: if you can pass include/exclude to currentModule().Root(), then you don't need views. And we did say views should be a temporary workaround...
so far the only true benefit I see to the "magical default arg" approach, is forcing an argument. Possibly the caching thing also but I didn't understand that. For everything else, I prefer real code: having a mini-DSL in the comments pokes a hole in the everything-as-codeness of it all. A small hole but still a hole. IDEs can't autocomplete, language toolchain can't catch errors, etc
I think I agree with @undone grove's point above - if we add currentModule().Root(), then I can see module authors just not exposing a directory arg at all (I probably even wouldn't). Maybe this is slightly less bad if we prevented modules using this API being used as deps (so this would have less of a developer ecosystem impact), but it's still kinda hm to me. That said, I do like the idea of controlling exact include/exclude if we had it as @chrome forge suggested. I think we should try and make views work and give them a go before we try adding this API even if we decide to add it eventually.
I like the default option a lot - don't particularly like the idea of having every single Directory arg default to the top-level, if we could specify a view, that would be helpful (but I would also like the ability to select a subdir in the default as well, alongside patterns). I do quite like @ebon tree's magic solution (maybe we could even drop the quotes? that might be too far though).