#"dev module" pattern
I don't want to go back to a ci source directory though
If we were to revert the creation of github.com/dagger/dagger/dev, that would leave some unresolved problems:
- What do we do with sdk/python/dev? That's a different pattern. Are we willing to remove that module too? And if not, why not? How do we achieve consistency within the repo?
- What do we call the source repo? I don't want to go back to ci/. Can we live with .dagger?
- What source directory should dagger init default to? The promise of the dev module pattern is that we could always default to . (since you would explicitly dagger init ./dev). But in honesty maybe that was only moving the problem, since we would still need a way to indicate that ./dev has a special relationship to . (to get rid of -m)
Doesn't have to be a different pattern in relation to the name of the module and how it's called, but what I'm hoping for has always been this:
- /sdk/python/dev includes most of what's in /dev for Python and then some (e.g., updating dependencies, type checking, explicit sub testing, preview docs build, etc.). These should be easily available when developing on /sdk/python in your terminal.
- Have /dev depend on /sdk/python/dev to implement the SDK interface that's going to be called in CI, but rather than implement those features, it's much simpler because it's making dag.PythonSdkDev() calls. Same benefits from a module having dag.Self()!
- For other SDKs to follow the same pattern. Implementations in their language (when it's useful), coordinated in the root's /dev.
Have /dev depend on /sdk/python/dev to implement the SDK interface that's going to be called in CI, but rather than implement those features, it's much simpler because it's making dag.PythonSdkDev() calls. Same benefits from a module having dag.Self()!
Yeah 100%, my initial PR actually went straight to that, but then I gradually scaled back my ambitions with smaller and smaller PRs. What you're describing is still the ultimate goal in my mind
But, is it /dev importing /sdk/python/dev, or / importing /sdk/python? That's the fork in the road here I think
(pattern 1: /dev submodule, pattern 2: / module + /.dagger or / source dir depending on the situation)
In other words, either way we need and want support for submodules in a monorepo. But for each module and submodule, what is the pattern for organizing them
Yeah, that sgtm (i.e, / importing /sdk/python), I actually moved to that in python's dev module (locally, with sdk/python/dagger.json -> dev) a few months back.
There's real benefits on using .dagger in terms of pattern exclusion because I've regretted putting under sdk what we codegen in Py and TS, but I'd be able to exclude **/.dagger/sdk easily. Why did you move dagger.json to dev? Was it just to align with sdk/python/dev or did you find real issues with having /dagger.json?
So I'm ok putting /sdk/python/dagger.json and having "source" point to dev or even .dagger.
However... we found an issue with two modules wanting "ownership" of sdk/<name>. For example both Elixir and PHP have modules support now, but they're not bundled so you need to use a path on --sdk. However, they had no simple way to get the SDK's sources in that module so the easiest solution was to move the runtime module's dagger.json to sdk/php/dagger.json and point "source": "runtime". It makes a lot of sense to say:
dagger init --sdk=github.com/dagger/dagger/sdk/php
Instead of:
dagger init --sdk=github.com/dagger/dagger/sdk/php/runtime
With context directory we'll be able to get the SDK's sources without moving dagger.json to the parent, but that'll bring back the need for runtime on something the users of these SDKs will need to use a lot. We already have a convention for putting that module under a runtime subdir, so we could have the engine just assume that and expect a directory with a runtime included. That gives you best of both worlds (once you have context directory):
- Keep runtime/dagger.json but make it a requirement when using --sdk= to reference a parent with that subdir. Even if it's a repo with only that.
- Move dagger.json to the parent for dev so you don't need -m ... while developing.
Link to my reasoning at the time: https://github.com/dagger/dagger/pull/7766#issuecomment-2195022396
TLDR was:
- consistency with sdk/python/dev
- the fact that if your project is itself a module, it seemed better (because otherwise you have to mash 2 modules together in the same dagger.json: the actual project, and the module to develop the project). However for this point, there is a proposed alternative, which is to make it easier to "mash these two modules together" - with a big focus on a module being able to test itself. So that might change the equation
Underpinning all this is uncertainty on what is "the right way" to manage the relationship between upstream software and dagger modules.
Between 1) a module to develop X, and 2) X as a module, what's the difference? Do we support both, or is one better than the other?
But yeah, with context directories some of the constraints will be removed - maybe once that domino falls, the rest will follow
I'm curious to try /sdk/python/dagger.json with "source": "." (for dev) but relying on "include"/"exclude" to get the right files. That's what was done in PHP I think. In this sense the module and project are the same. And it actually makes sense in this case. That can work today already.
Context directory does make it better though because you can have different functions needing a different slice of the project. I've certainly created several views in Python when that feature came out (for tests, lint, docs, etc), but it was cumbersome with views, so I reduced to just two (lint and default). It also allows you to test the function against another version of the files like in a git remote or PR, but if you still keep dagger.json on the project dir and depend on include/exclude, you may be loading more than you need every time. Unless you move "source": ".dagger" which has best of both worlds I guess.
Isn't relying on include/exclude a game of whack-a-mole, since the SDK can write any file at any time? You would have to keep the patterns up to date when the SDK makes changes, etc.
Best if you can depend more on includes rather than excludes. That was feedback I gave to the PHP folks because they basically relied on excludes, and they had so many lines in there that it was very easy to miss new additions to the SDK. So I agree that relying on include/exclude is harder to get right.
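To make the include-versus-exclude point concrete, here is a minimal sketch. It assumes plain fnmatch-style globs stand in for Dagger's real pattern matching, and every file name and pattern in it is hypothetical.

```python
from fnmatch import fnmatchcase


def select_with_exclude(files, exclude):
    """Denylist: keep every file that matches none of the exclude patterns."""
    return [f for f in files if not any(fnmatchcase(f, p) for p in exclude)]


def select_with_include(files, include):
    """Allowlist: keep only files that match at least one include pattern."""
    return [f for f in files if any(fnmatchcase(f, p) for p in include)]


# Hypothetical repo contents before and after an SDK starts writing a new file.
before = ["src/main.py", "pyproject.toml", "sdk/client.py"]
after = before + ["sdk_cache/index.bin"]

exclude = ["sdk/*"]                     # written before the new file existed
include = ["src/*", "pyproject.toml"]   # names only what the module needs

leaked = select_with_exclude(after, exclude)   # picks up the new SDK file
stable = select_with_include(after, include)   # unaffected by the new file
```

With the denylist, the new SDK-written file is loaded until someone adds a pattern for it. The allowlist keeps returning the same two entries.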
I will float an old proposal, which was to allocate a source directory per sdk, for clear ownership separation of concerns
.dagger/sdk/<NAME> -> reserved for the use of that SDK for this module
then you never have to worry about whack-a-mole, or mixing generated and non-generated files, what to ignore etc
or for a shorter path: .dagger/<SDK NAME>
Not entirely clear to me, do you mean Contextual modules or a variation of that?
no this would be specific to source directory
(so within one module as opposed to relationship between modules)
What would you put in .dagger/<SDK NAME>? Can you give an example? Seems like just a rename of ./sdk in non-Go SDKs.
Btw, Go works differently here. For Go to work like the others, you'd put the SDK under a subdir and in your module's go.mod have a replace rule (or workspace) to install from local dir instead of "published" package. Then the SDK's dependencies can be isolated to the SDK, instead of being mixed in the user's go.mod.
@scenic wing resuming this... Not a strong opinion, more of a thought experiment. I was talking about .dagger/python being the default source dir for dagger develop --sdk=python; .dagger/go being the default source dir for dagger develop --sdk=go; etc
i seem to remember we considered this one before
But I'm wondering if it would be viable to do it the other way around... @scenic wing do you think the SDKs could handle conflict with an existing codebase (pipeline embedded in an app) by "sneaking" the pipeline source in a native-friendly place in the app source code, so that it doesn't clash with the native tooling?
For example in Go, it might be internal/dagger.io/module or internal/dagger.io/self or some other reserved / non-clashing place?
Yes because I asked the same question last year
Reminder this is our dilemma:
Depends on the language but the way SDKs are structured, modules are a sub-package, with another sub-package inside (except Go which mixes the SDK with the user's code).
- What replaces dev modules
- Where does ci/ go
- What does dagger init --source default to
i think the question of what the default should be and what we do are very different - i don't really mind where we put ours, but i think the most obvious answer to dagger init should always be the current directory
I disable this behavior of the Go SDK every single time because it never works for me. Creates awful go package errors that I never figure out how to fix.
I have to manually do this:
mkdir foo
cd foo
go mod init foo
dagger init --sdk=go --source=.
dev/, ci/ can be anywhere, but for our purposes i would rather it not be dagger/ or .dagger (even if others do this)
that means our module would be called github.com/dagger/dagger/dagger which is not really great
But that's because the Go SDK is reusing the parent's go.mod, right?
Yes. Sorry I misunderstood what you meant by "mixing", just re-read it
What if we made it internal/dagger.io/self or internal/dagger.io/module or something like that
i think my biggest objection to this is that there's now layers of empty directories, it starts to feel a lot more java-like
Well if it's internal/dagger the server code might clash with the generated client code
i know some tools collapse them down nicely, but i often get very frustrated at tools that have lots of directory nesting
also internal has special significance to go code - our repo uses it for example
Why not do what other sdks do and put the sdk on a subdir of the module and have a replace rule in go.mod to install the sdk from that subdir?
That's the whole point
it's much more likely to clash
so different directories for different sdks?
that feels confusing to navigate
@analog egret let me know when you're at the second iteration of thinking about it
The knot:
What does dagger init --source default to
It depends whether the module is "standalone" or "contextual".
- Standalone: could be anything. The shorter the better (SDK has the module root to itself)
- Contextual: no good answer. All options seem flawed in some way. I'm looking for more.
how do you mean? the problem is, personally, unless the default is switched to the current directory, i will just keep using --source=. anyway. it's genuinely difficult for me to weigh up whether i would prefer internal/ci/dev/dagger/long/path/here/etc, because most of the time I will want "--source=."
it still doesn't matter for contextual? the root is still the same place, right? we're just deciding about where the source code is inside the root
@analog egret I also always type --source=. but you and I are not representative of every use case, we also need a great default for "daggerize your app repo", and --source . is not it
If the root is the right place, let's move /dev/* to /* and problem solved
i don't understand why we need to use the default
i think we should use dev/ or similar. i think the default should be ".".
What source directory should dagger init default to? The promise of the dev module pattern, is that we could always default to . (since you would explicitly dagger init ./dev). But in honesty maybe that was only moving the problem, since we would still need a way to indicate that ./dev has a special relationship to . (to get rid of -m)
the special relationship today is indicated through the presence of the .git dir/or a dagger.json in the root - but moving dagger.json into dev changed that entirely
i don't understand why we need to use the default
We don't need to use the default, it's just a convenient way to make sure the default is good. If we don't use the default, we still need to find the right default for everyone else, but without the benefit of dogfooding to be sure we designed it right
That's what I just wrote
right sorry, i was going to add some more, pressed enter too soon
We need to use @scenic wing 's mindmapping tool to keep up with the decision tree
I think the default should be for source to default to root directory. When you create a new module like this:
$ dagger init --sdk=go foobar
That would create a new directory foobar with dagger.json next to main.go.
But if you try to use an existing directory with files:
$ dagger init --sdk=go
The X directory is not clean and some files could be overwritten. You can specify `--source` to blah blah...
Do you want to continue? [Y/n]
If --source is specified explicitly then no need for prompt. Also, introduce a -y flag to skip the prompt.
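The prompt rule proposed above can be sketched as a small decision function. This is only an illustration of the proposal, not existing Dagger behavior; the function name and its parameters are hypothetical, and the -y flag is the one suggested in the message, not a flag that is known to exist.

```python
def should_prompt(dir_is_clean: bool, source_explicit: bool, assume_yes: bool) -> bool:
    """Return True when init should stop and ask for confirmation.

    Proposed rule: prompt only when the directory already has files
    AND the user neither passed --source nor the suggested -y flag.
    """
    if dir_is_clean:
        return False          # new or empty directory: nothing to overwrite
    if source_explicit:
        return False          # user chose the location explicitly
    if assume_yes:
        return False          # user opted out of the prompt
    return True


# Fresh directory created by `dagger init --sdk=go foobar`: no prompt.
fresh = should_prompt(dir_is_clean=True, source_explicit=False, assume_yes=False)

# Existing app directory with a bare `dagger init --sdk=go`: prompt.
existing = should_prompt(dir_is_clean=False, source_explicit=False, assume_yes=False)
```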
Decision one: root module (1) or dev module (2)
- if Root module:
- Where does module source code go?
- How to co-exist with an app?
- What if the app is a Dagger module?
- if dev module:
- How to auto-detect the dev module?
The problem with this, is that dagger develop at the root of an existing app will be very common. So we have to provide a good default for "not clean" module root. Asking devs to specify it is too much work left to the user
Root module, put sources in dev, don't be a part of an app, be a sub-package. Putting sources in a subdir fixes that. Same thing if app is a dagger module.
my pov summarized:
- the default source should be .
  - but you can configure this with --source=<foo>
  - if you try and set --source to a non-empty directory, dagger tells you to not do that, and if it was ., can even suggest you use dev/ci. this would require we fix the go.mod/go.sum thing so that users don't get tripped up by that (but we should do this anyways)
  - why .? relatively standard for most tools to do this. ls/tree/etc tools use the current directory. go mod init/hugo init/npm init all do it. i understand we're a bit special, but not doing the default is going against the grain, and is a fight uphill (that we've already seen from the original proposal to change this)
- dagger.json should always be the "root" (or the "context" from contextual modules). no .git dir shenanigans (this is a hard sell i guess, but i find this behavior trips me up a lot, but probably this is controversial)
  - we consolidate terminology - the "context" is the same as the "root".
- we should avoid altering the behavior of commands based on what's currently around - e.g. we shouldn't have behavior that puts --source=dev if . already has an app
no .git dir shenanigans
Do you mean actual use of the .git directory? I'm not aware of a proposal to do that
i mean more in our implementation of determining the root - if no dagger.json is found, we determine the root as where we find the .git directory
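As a rough illustration of the lookup described in this message (an assumption-laden sketch, not Dagger's actual implementation): walk up from the working directory, prefer the nearest dagger.json, and only fall back to the nearest .git directory when no dagger.json exists. The function name find_root and the demo tree are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_root(start: Path) -> Optional[Path]:
    """Nearest ancestor holding dagger.json, else nearest ancestor holding .git."""
    start = start.resolve()
    candidates = [start, *start.parents]
    for marker in ("dagger.json", ".git"):
        for directory in candidates:
            if (directory / marker).exists():
                return directory
    return None


# Demo on a throwaway tree: repo/.git plus repo/sub/mod, no dagger.json yet.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp).resolve() / "repo"
    workdir = repo / "sub" / "mod"
    workdir.mkdir(parents=True)
    (repo / ".git").mkdir()

    root_without_config = find_root(workdir)    # falls back to the .git directory
    (repo / "sub" / "dagger.json").write_text("{}")
    root_with_config = find_root(workdir)       # dagger.json now takes priority

    found_git_fallback = root_without_config == repo
    found_dagger_json = root_with_config == repo / "sub"
```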
There is no planned change to finding the module root though. That's already settled (it's where dagger.json lives, always)
Not sure what the problem is. dagger develop would stay the same. The behavior I described is only needed on init. The prompt is clarifying and when you know how it works, it just takes specifying --source or -y explicitly.
dagger init --sdk=foo is just an alias to dagger develop so I'm simplifying the discussion by just talking about develop. init is irrelevant
yes, this change just exists today. it's relevant in the case of installing dependencies. suppose you install ../foo, that goes above your dagger.json. you can still do this, as long as it's all in one git repo (this is how most daggerverse's do things)
Same point, it's only needed when a module is created for the first time.
No that's incorrect, it's needed when the SDK is setup for the first time
but also, i do think this is orthogonal - i'd love to get rid of it, but i don't feel hugely strongly about it
Of course. Still same point though.
So just to focus on the hardest part (at least for me): when 1) the SDK is initialized (whether via dagger develop --sdk or dagger init --sdk), and 2) the module root is not empty (e.g. the module is embedded in an existing project) --> where should the module's source directory point to by default?
It's important to me that there is a default, because it's a very common situation and it's too much work to ask users to choose each time
In that situation the possible defaults could be (all paths relative to the module root)
- . (most requested default, ideal for standalone modules, messy for embedded)
- .dagger
- dagger (current default in all cases)
- .dagger/<sdk>
- SDK-specific (based on the contents of repo), for example Go SDK might choose internal/dagger_module
We already have a default for dagger now and a lot of users would prefer for it to be "." (i.e., next to dagger.json).
yes . is a great default for standalone modules, but a bad one for embedded (case in point: we ourselves don't want to use it for our own repo)
Yeah, but today you have good for embedded but bad for standalone.
Yes also true
And none of those options have a sweet spot for both.
i think we should avoid anything that's based on the contents of the repo - this feels like it will be confusing, having the very first command that most dagger users will run do "magic" that might not always work isn't great
Which is why I'm still trying new options
Guys at this point it's useless to point out only the downsides of this or that option. They all have downsides. We have to talk in terms of tradeoff.
Well, one important data point is what's the most common use case going to be? We made a bet on the ./dagger default to favor embedded, but wasn't most of the feedback that users prefer the default to be .? Or we actually don't know if that's what most users feel (since most don't voice their opinion)?
dagger init should be in cloud actually. could we see what args people tend to run it with? if we want more concrete data there
Can you see the values of "source" easily?
It makes sense that we get the most complaints from standalone module devs, because they're the most active on our discord, and once you're a power user you will primarily be creating standalone modules.
But, there's selection bias. If the default is confusing, or generally adds friction, for embedded repos, that hurts onboarding - and that impact is silently losing people who could have become power users, but instead just gave up
I think it makes sense to change the default to . now, as a stopgap while we continue figuring out how to make the onboarding great
In theory that will make it harder for me to get traction around improving the default further - but I'm already struggling on that front, so won't make a difference either way
SDK-specific (based on the contents of repo), for example Go SDK might choose internal/dagger_module
Specifically this has promise I think
Let the SDK decide how to co-exist with an existing project
We kind of do that already with the "re-use existing go.mod" behavior, but currently it hurts more than it helps. Maybe we can improve that capability to make it part of the solution rather than a problem
I still don't know why it's better to mix the pipelines in an existing source instead of making it a sub-package. Same could be done in Python for example, but I prefer having in a package separate from the app.
Isn't that what internal/dagger_module would do for Go SDK?
Sub-package is good. The problem is what to call the sub-package right?
Not completely. The go.mod dependencies are mixed and the import changes from module to module. It could be import "dagger.io/dagger", same as non-modules.
Didn't understand that sorry.
I've said this multiple times but never got a straight answer. Basically doing what the other SDKs are doing. From the module's sources you'd use the SDK the same as you'd go get dagger.io/dagger but instead of installing from the web, the module's go.mod would have a replace rule to use the one from the subdir, or use a workspace (like I've been doing with uv).
this is fun, i kind of like this idea (it means the auto magic import of dagger.io/dagger will just kinda work, though dag would still not work)
we could still put the code wherever though right?
I like the idea of SDKs standardizing on a subdir name for this. One that's easy to add to exclude patterns without knowing their parent dirs.
Let's get specific because I am not understanding clearly what you are proposing.
Scenario: "let's daggerize my Go app!" (the app is dagger/dagger in this case)
$ git clone https://github.com/dagger/dagger
$ cd dagger
$ ls dagger.json
dagger.json: no such file or directory
$ ls go.mod go.sum
go.mod
go.sum
dagger init --sdk=go
What files have just been created?
You could still drop a generated file next to the user's sources for the entrypoint. Some SDKs do this.
We're talking about two different things at the same time. I was replying to something you said. To be clear, this suggestion is about where the SDK's files go, inside the source dir. Not where the source dir goes relative to the root dir.
To me those are implementation details. I care about the combination: where the SDK's files go relative to the root dir. How much of that is "source dir" vs "files relative source dir" is not as important to me, I'm open to anything if it solves the problem
(but understood)
A potential other idea to add to the list - if we want to have more opinionated defaults, could we have "templates"? If you want to daggerize your existing go app, dagger init --template=<name-of-a-go-template>, maybe even pulling from a remote git repo
We could have the generalized init that way, but still a way of doing more specialized migrations - here's an easy way to daggerize your Django app, here's one for your simple go cli, here's a cargo wrapper, etc
To me that feels better than trying to anticipate all generalized use cases, since we could specialize for each of those
It's because I see things separate in my mind and mixing things make it more confusing. A Dagger Module is an app in itself, in the language of the chosen SDK. The SDK library should be generated separately from the Module's own source code, as part of a workspace (in its own subdir).
OK you're talking about specifically the SDK's generated client library (as opposed to the SDK's generated server template)
A Dagger Module is an app in itself, in the language of the chosen SDK
That definitely makes sense, and has been my assumption also. I think in the case of an embedded module, I'm becoming more flexible on that point, and willing to explore alternatives if it solves the user problems
In a perfect world, we would choose source directories that let us have it both ways: works if the module is a standalone app; and also works if it's mixed with an existing app. That's why internal/dagger-module (sub-package + a collision-safe name) is attractive to me in Go. It works in both cases.
But isn't the advantage of "embedding" just to get access to the "context directory"? What's the use cases for having it be a part of the same existing app in the repo?
Less friction out of the box.
That confused me before but you're basically saying /internal/dev/internal/dagger, right?
No, not the intent (but possibly an accidental consequence). that would definitely be weird
If it's /internal/<module> and the module has internal/dagger that's what it would be.
But yeah you're right that maybe source would be ., and the SDK would just change where it puts the files in that source dir
No I don't mean <module>
I mean the literal "module"
github.com/dagger/dagger/internal/dagger/self (hesitating between self and module to convey what I mean)
So a hardcoded internal/dagger_module, same as the other proposal for the hardcoded .dagger?
Yes, with the caveats:
- Would be SDK-specific (not universal)
- Perhaps would not be a different source dir, but rather a new location in the source dir (which would remain .), I hadn't thought of that
So my amended proposal:
- Always default source to .
- Go SDK changes from <source>/main.go to <source>/internal/dagger/main/main.go (just realized main/ might be a good candidate)
- Perhaps other SDKs don't need to change anything, because source=. is already fine even if there's an existing python or typescript app there?
Other SDKs would conflict too.
Is there an equivalent trick, where the module's "main" code could live in a sub-package named in a way that avoids any conflicts?
And ideally, always use that sub-package location regardless of what's in the source dir? So no more "if the dir is non-clean do X, otherwise do Y". would be more robust and address @analog egret 's concern
The way I like to solve this is to consider the module its own "app". Installed as a dependency instead of being vendored on the existing app. Not sure why people would prefer it being mixed. Additionally, Python is a very old language and people have many different ways to structure their applications. The most compatible way to work with that is as a dependency.
So what happens if I dagger init --sdk=python at the root of a python app? That's what I need to know
Can't have an existing app there if it's ".".
It wouldn't overwrite existing files but the module wouldn't work.
So as a user what do I do? Manually use --source to the path of a new "app"?
I'm trying to think about how we can use the [target] argument here.
Yep. But is it too confusing to have --source default to [target]?
dagger init --sdk=python foobar
This would put dagger.json with "source": "." in foobar.
If you use:
dagger init --sdk=python --source=foobar
Then dagger.json in current dir, with "source": "foobar".
If you do:
dagger init --sdk=python
Then dagger.json in current dir, with "source": "dagger"
So default for --source still dagger, unless [target] is used, in which case it defaults to that value.
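The three cases just listed can be written down as one small resolution function. This is a sketch of the rule as stated in the message, not of the CLI's real code. The name resolve_init is hypothetical, and the behavior when both [target] and --source are given is an assumption, since the message does not cover it.

```python
from typing import Optional, Tuple


def resolve_init(target: Optional[str] = None,
                 source: Optional[str] = None) -> Tuple[str, str]:
    """Return (directory holding dagger.json, value of "source" in dagger.json)."""
    root = target if target is not None else "."
    if source is not None:
        # Explicit --source always wins (assumed to hold when [target] is also set).
        return root, source
    if target is not None:
        # A target directory implies a standalone module: sources sit next to dagger.json.
        return root, "."
    # Neither given: keep the current default subdirectory.
    return root, "dagger"


with_target = resolve_init(target="foobar")    # dagger init --sdk=python foobar
with_source = resolve_init(source="foobar")    # dagger init --sdk=python --source=foobar
bare = resolve_init()                          # dagger init --sdk=python
```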
It works for me, my concern is that it's confusing for others.
Yeah, so that's what I'm trying to conciliate, just need to focus for a bit. Some variation of that, maybe the --source is complicating things so I want to see if there's a good simplification here.
If the repo has an existing Python app, is there nothing in the layout of that app that we know for sure will be true? For example where subpackages are stored
If so, the solution might be to always make the module's "main" code a subpackage. Since it's not actually a real main app anyway.
If we adopted the rule that "your module's code is a library" it would solve a lot of things
Basic idea is to default to putting the "module" in a subdir, but if you specify a target directory (non-existent), then assume standalone.
Or leave source to be always ., and move the boilerplate from src/main/__init__.py to src/dagger/main/__init__.py, and the sdk from sdk/ to src/dagger/sdk/ so that it always works at the same paths
Still don't like vendoring the module's source in the repo's app like that. You also need to conciliate pyproject.toml, etc.
I take note of the fact that you don't like it. Indulge me in the thought experiment? What would it mean to conciliate pyproject.toml?
Not feasible. There's a lot of fragmentation in the ecosystem, multiple package managers, with different lock file formats, and ways to structure a project. It would take a lot from us to attempt it. Not to mention that it's just too invasive. The only sensible way this can work is to require the user to add some config in their pyproject.toml instead of doing it automatically.
OK but what kind of change are we talking about? Why is pyproject.toml involved at all
It's the equivalent of go.mod and package.json. It describes where the sources are and what the dependencies are, etc...