#Last mile review
1 messages ยท Page 1 of 1 (latest)
Container.Shell()->Container.Terminal(). I know it's longer to type, but I think it's better to have a clean model where the words are used consistently. OtherwiseContainer.WithDefaultShell().Shell()will be confusing.
dag.CurrentModule()is just long... I don't have a great alternative in mind, but feel like it's at least worth thinking of one. We've been pretty good so far at keeping our API from looking like the Windows API... It's hard because it never happens all at once, it just creeps up on you, one longer-than-necessary name at a time. Granted, this is really a nit.
dagger update: I'll address that one in the PR @thick burrow , but it feels like it has to at least update something (I'm thinking the dependencies). If we don't have time to implement, better to return an error "not yet implement" than change the meaning to something else IMO. Will comment in the PR
- :
"root-for": {"source": "."}. I really think that default value should be applied by the config parser, rather than explicitly writing it out on eachdagger.json. In other words,dagger initshould not write that field by default, and adagger.jsonwithout that field should be valid. I understand the argument that "the complexity is hidden away, you're not supposed to edit the file directly", but it's still valuable to avoid unnecessary complexity. People will read the file, and reason about how it works and what needs to be in it. I should be able to write a validdagger.jsonfrom memory without having to memorize arcane fields I don't actually need
I understand the argument that "the complexity is hidden away, you're not supposed to edit the file directly", but it's still valuable to avoid unnecessary complexity. People will read the file, and reason about how it works and what needs to be in it.
We will be adding complexity if we don't do this because the engine needs to do extra work (more find-ups) and in general have more complex internal logic around handling all the permutations of cases (what's a valid config, what's not, etc.). Right now it's very simple in that a valid config needs to have aroot-forsomewhere, and for the simple case ofroot dir == source dir, that meansroot-foris.. If there isn't aroot-forset in any find-up, it's invalid.
We can eat that if it's worth it, it's not a dealbreaker level of complexity, but:
- It's easy to explain why it exists (there must be a
root-forfor any valid module) - I really don't want to encourage users to be writing these files themselves. Limiting the config format to something easily writable from memory by a human is a huge constraint, and I just don't know why we'd take that on when we can just say "use the CLI to modify
dagger.json, if you write it yourself you are on your own"
(Not ignoring your other comments, just need to update/review some PRs first ๐ be back here soon)
Mmmm I just realized there is no source field
May I add that since include and exclude paths are relative to the root, it makes more sense to me to keep them next to in root-for.
you mean inside root-for? Or alongside it at the root of the config?
It would have to be inside. To be module specific.
Right now it's very simple in that a valid config needs to have a root-for somewhere, and for the simple case of root dir == source dir, that means root-for is .. If there isn't a root-for set in any find-up, it's invalid.
I don't understand this. If I'm loading a module and its config is {}, I know it is not the root for any sub-modules - only for it self. I can continue the find-up to the next subdirectory. I don't understand how that is not the simplest implementation. It makes me think I'm missing an important part of how find-up actually works.
Ah yeah I'll fixup that // Deprecated comment on Root
But there should be a Source field in that struct, right?
Example: the module is in a git repo at subdir foo/bar/mod/. Currently, if that module doesn't have a separate root, the engine sees foo/bar/mod/dagger.json has root-for entry of . and knows it can stop looking for the root. If we remove that, the engine has to always keep doing a find-up until it sees that there is no root-for in any dagger.json in any parent dir. Right now the cost is low, but I could see that becoming worse once we want to optimize loading of a large git repos. That's besides the other more complex logic it would entail.
No, there's no need for it. There is always a dagger.json in the source dir of a module, you never need to point to a separate source dir. In the case where root and the source dir are different, that's handled by root-for in a parent dagger.json
I think I may have severely misunderstood how the current implementation works
See this thread where I just explained how it works in a way that may be completely wrong? https://discord.com/channels/707636530424053791/1202337917436174336
But I think what I'm explaining (in that other thread) matches your last comments in the issue
Ah sorry, I for some reason left in "source": "." in all of those here, that's my bad: https://github.com/dagger/dagger/issues/6291#issuecomment-1911337000
But yeah all those values are just . because there is always a dagger.json in the source dir of a module (why it's not needed anymore). I'll update that comment for posterity, but it's literally just deleting "source": ".", everything else in the description should be unchanged
Updated that comment
OK so at some point along the way we misunderstood each other
Can we use my answer to Nipuna as a test case? What should I have answered?
I was so happy with my answer too ๐
It would be:
That way this works:
$ git clone YOUR_REPO repo
$ cd repo
$ dagger call -m ci test --dir .
Your Dagger config would look like this (at the root of the repo):
$ cat dagger.json
{
"root-for": [
{
source: "./ci"
}
]
}
And there would be a dagger.json in the ci/ dir colocated with your module's source code that looks like this:
$ cat dagger.json
{
"name": "foo",
"sdk": "go"
}
What's the point of root-for in this case? He doesn't need it (assuming the ci module doesn't depend on other modules in the monorepo)
And I don't get to drop the -m ci
(which was the point of the source field)
This sort of setup would be needed in the very specific case of:
- the module in
ci/requires ago.modin the parent dir (root of the repo) - you for some reason can't just put the module source code in
ci/in the root of the repo
Otherwise, yeah this would not be needed
In my mind the source field is pretty much orthogonal to the rest. Just a small optimization so that a module can tell its SDK where to load source code (instead of the default of the module's root)
In my head canon, -m was for referring to a module ref (git or local path) with the newly expanded ability to also be able to refer to deps
Yes totally
In this case I just want the root of my repo to be the module
So the baseline is: dagger init at the root of my repo; dagger develop at the root of my repo; drop a bunch of go files at the root of my repo. Conceptually simple, works fine, but pollutes my repo root.
So, add source: to tidy things up, but no fundamental change: my repo root is the module
Right but I thought one of the nice aspects of this was that we went back to "there is always a dagger.json in the module's source dir, easy to identify".
(I know we clearly diverged at some point ๐ which is fine, just trying to find all the differences)
I think we're just using "source dir" in different ways. There's always a dagger.json in the module's directory - by definition, no dagger.json means no module. But not all modules have source code
In the current implementation, it is just:
dagger init -m cifrom the root of the repo (all the source is initialized in theci/subdir, thedagger.jsons all setup for you etc.)
But not all modules have source code
Right that's true in the current implementation too
That's different though: that creates a module in ./ci, not .
I think the hang up is on whether you can decouple a module's source code from its root directory.
(I'm now thinking context-for might have been less confusing, since now I'm using "root" for two different thigns)
Okay, I think I'm seeing what you were imagining. What I implemented had essentially two points:
- There's a root dir, which is only used by sdks for building. It's where the
root-forentry for a module is located - There's a source dir, which is where the module source code is. Must either be the same as the root dir (simple case) or a subdir of it (case where go.mod in parent dir, etc.)
But what you're describing sounds like there's 3 points (correct if wrong):
- A root dir, where the
root-forentry is - A "source root dir" (? not sure what to call it? context?) which is where the
dagger.jsonwith the module's name, sdk, etc. config is - A "source subdir" which is where the actual source code is located
If so, the current implementation basically just collapses your point 2+3 to always be the same.
Yes that's right. Your number 2 I just called "the module directory" but now realize it would have helped to reserve a name for it.
Right now this terminology feels clearer (keeping same numbers as you):
-
"load context" dir, available to SDKs for loading, and security boundary for accessing
.. -
Root directory: always where dagger.json is.
-
Source directory: where the module's source code lives (defaults to the root)
Okay cool, pretty sure we're on the same page again ๐ค Thinking through the various permutations of settings that having those 3 directories as variables enables. I agree it gives even more flexibility than the current implementation (just in that you have another variable available to control).
One thing is that two modules can't share a root directory (2. in your most recent message), unless we go back to having a list of modules: [ ... ] again.
There are probably cases in which that might be annoying, but as far as I can think there would always be alternative options for how to layout directories that could achieve the same end effect.
The main thing that becomes possible now is the use case of "clone repo + dagger call (no -m)"
And will need to handle corner cases like:
- module A having "root directory" set to
./fooand "source directory"./foo/bar/baz - module B having "root directory" set to
./foo/barand "source directory" also set to./foo/bar
Layouts like that would just have to be invalid. Which is fine, but needs to be detected in such a way that it doesn't error out on slightly different but valid cases like:
- module A having "root directory" set to
./fooand "source directory"./foo/bar/baz - module B having "root directory" set to
./foo/otherbarand "source directory" also set to./foo/otherbar
(I'm just stream of consciousness dumping right now as I think of things)
Also need to deal with what @sudden heart mentioned before about where include/exclude is set. Obviously it being set in the "root directory" dagger.json makes sense, but it still would make sense for it to be a setting in the "load context directory" context-for entry (presuming it's renamed to that). Just in that maybe a go.mod is in the root of the repo (and maybe it refers to some other go packages that are relative to it), but that doesn't mean we should always load the whole root repo in its entirety...
yes exactly. That is a tradeoff, and one I'm comfortable with
I didn't get this part. What is wrong with that example layout?
btw I can talk live if that's easier
sorry if I'm making your life difficult today ๐
worth sweating the details on this
Started typing out why and realized that we could make it work ๐ Thought process:
I might invoke module A via e.g. dagger call -m github.com/org/daggerverse/foo/bar/baz. The engine loads the git repo, starts a find-up at foo/bar/baz until it finds the dagger.json (and also finds the context-for in possibly another parent dagger.json).
The thing we need to be careful about is that it will actually hit module B's dagger.json in the find-up first. I was thinking that would just make this invalid. But actually we can find the "right" dagger.json by checking the source path.
It's the same logic as already exists for finding the "right" dagger.json that contains the root-for a given module, it's just now we need to do that on two levels: what dagger.json contains the source for a given subpath and what dagger.json contains the root-for for a given subpath.
We're gonna need a lot integ tests ๐ But that's obviously fine, it's manageable I think. And I do like the flexibility
No it's all good, I'd rather get all these problems + use cases sorted out now rather than try to retrofit them in the future ๐ Our future selves will thank our current selves
you're welcome, future selves
does this mean that context-for becomes just an array of strings?
eg "context-for": ["./foo"]
I like making everything a list of object because it gives us more flexibility in the long run if we end up wanting an extra field. E.g. the include/exclude problem might call for context-for giving each entry the ability to have its own include/exclude.
yeah I get that. We should use a field other than source though, to avoid confusion
path?
Yeah SGTM
My CUE brain immediately thought: let's just make it a string now, and make it string|{...} later ๐