Last mile review | Dagger | Page 1

pseudo atlas Jan 31, 2024, 6:35 PM

#

Container.Shell() -> Container.Terminal(). I know it's longer to type, but I think it's better to have a clean model where the words are used consistently. Otherwise Container.WithDefaultShell().Shell() will be confusing.

#

dag.CurrentModule() is just long... I don't have a great alternative in mind, but feel like it's at least worth thinking of one. We've been pretty good so far at keeping our API from looking like the Windows API... It's hard because it never happens all at once, it just creeps up on you, one longer-than-necessary name at a time. Granted, this is really a nit.

#

dagger update: I'll address that one in the PR @thick burrow , but it feels like it has to at least update something (I'm thinking the dependencies). If we don't have time to implement, better to return an error "not yet implement" than change the meaning to something else IMO. Will comment in the PR

#

:"root-for": {"source": "."}. I really think that default value should be applied by the config parser, rather than explicitly writing it out on each dagger.json. In other words, dagger init should not write that field by default, and a dagger.json without that field should be valid. I understand the argument that "the complexity is hidden away, you're not supposed to edit the file directly", but it's still valuable to avoid unnecessary complexity. People will read the file, and reason about how it works and what needs to be in it. I should be able to write a valid dagger.json from memory without having to memorize arcane fields I don't actually need

thick burrow Jan 31, 2024, 7:06 PM

#

I understand the argument that "the complexity is hidden away, you're not supposed to edit the file directly", but it's still valuable to avoid unnecessary complexity. People will read the file, and reason about how it works and what needs to be in it.
We will be adding complexity if we don't do this because the engine needs to do extra work (more find-ups) and in general have more complex internal logic around handling all the permutations of cases (what's a valid config, what's not, etc.). Right now it's very simple in that a valid config needs to have a root-for somewhere, and for the simple case of root dir == source dir, that means root-for is .. If there isn't a root-for set in any find-up, it's invalid.

We can eat that if it's worth it, it's not a dealbreaker level of complexity, but:

It's easy to explain why it exists (there must be a root-for for any valid module)
I really don't want to encourage users to be writing these files themselves. Limiting the config format to something easily writable from memory by a human is a huge constraint, and I just don't know why we'd take that on when we can just say "use the CLI to modify dagger.json, if you write it yourself you are on your own"

#

(Not ignoring your other comments, just need to update/review some PRs first 🙂 be back here soon)

pseudo atlas Jan 31, 2024, 7:12 PM

#

Mmmm I just realized there is no source field

#

https://github.com/dagger/dagger/blob/2685e2f9d494a45577aae084d8683c8372f0a31e/core/modules/config.go#L16-L38

GitHub

dagger/core/modules/config.go at 2685e2f9d494a45577aae084d8683c8372...

Application Delivery as Code that Runs Anywhere. Contribute to dagger/dagger development by creating an account on GitHub.

sudden heart Jan 31, 2024, 7:23 PM

#

May I add that since include and exclude paths are relative to the root, it makes more sense to me to keep them ~~next to~~ in root-for.

pseudo atlas Jan 31, 2024, 7:25 PM

#

sudden heart May I add that since `include` and `exclude` paths are relative to the root, it ...

you mean inside root-for? Or alongside it at the root of the config?

sudden heart Jan 31, 2024, 7:28 PM

#

It would have to be inside. To be module specific.

pseudo atlas Jan 31, 2024, 7:28 PM

#

Right now it's very simple in that a valid config needs to have a root-for somewhere, and for the simple case of root dir == source dir, that means root-for is .. If there isn't a root-for set in any find-up, it's invalid.

I don't understand this. If I'm loading a module and its config is {}, I know it is not the root for any sub-modules - only for it self. I can continue the find-up to the next subdirectory. I don't understand how that is not the simplest implementation. It makes me think I'm missing an important part of how find-up actually works.

thick burrow Jan 31, 2024, 8:12 PM

#

pseudo atlas https://github.com/dagger/dagger/blob/2685e2f9d494a45577aae084d8683c8372f0a31e/c...

Ah yeah I'll fixup that // Deprecated comment on Root

pseudo atlas Jan 31, 2024, 8:14 PM

#

But there should be a Source field in that struct, right?

thick burrow Jan 31, 2024, 8:16 PM

#

pseudo atlas > Right now it's very simple in that a valid config needs to have a root-for som...

Example: the module is in a git repo at subdir foo/bar/mod/. Currently, if that module doesn't have a separate root, the engine sees foo/bar/mod/dagger.json has root-for entry of . and knows it can stop looking for the root. If we remove that, the engine has to always keep doing a find-up until it sees that there is no root-for in any dagger.json in any parent dir. Right now the cost is low, but I could see that becoming worse once we want to optimize loading of a large git repos. That's besides the other more complex logic it would entail.

thick burrow Jan 31, 2024, 8:17 PM

#

pseudo atlas But there *should* be a `Source` field in that struct, right?

No, there's no need for it. There is always a dagger.json in the source dir of a module, you never need to point to a separate source dir. In the case where root and the source dir are different, that's handled by root-for in a parent dagger.json

pseudo atlas Jan 31, 2024, 8:26 PM

#

I think I may have severely misunderstood how the current implementation works

#

See this thread where I just explained how it works in a way that may be completely wrong? https://discord.com/channels/707636530424053791/1202337917436174336

#

But I think what I'm explaining (in that other thread) matches your last comments in the issue

thick burrow Jan 31, 2024, 8:31 PM

#

Ah sorry, I for some reason left in "source": "." in all of those here, that's my bad: https://github.com/dagger/dagger/issues/6291#issuecomment-1911337000

But yeah all those values are just . because there is always a dagger.json in the source dir of a module (why it's not needed anymore). I'll update that comment for posterity, but it's literally just deleting "source": ".", everything else in the description should be unchanged

GitHub

Modules: replace `root` in `dagger.json` with `source` · Issue #629...

Currently, by default the engine loads the directory tree at a module's dagger.json and nothing else. However, to support cases where there are files/dirs the module relies on that are outside ...

#

Updated that comment

pseudo atlas Jan 31, 2024, 8:33 PM

#

OK so at some point along the way we misunderstood each other

#

Can we use my answer to Nipuna as a test case? What should I have answered?

#

I was so happy with my answer too 😛

thick burrow Jan 31, 2024, 8:37 PM

#

pseudo atlas Can we use my answer to Nipuna as a test case? What *should* I have answered?

It would be:

That way this works:

$ git clone YOUR_REPO repo
$ cd repo
$ dagger call -m ci test --dir .

Your Dagger config would look like this (at the root of the repo):

$ cat dagger.json
{
 "root-for": [
    {
      source: "./ci"
    }
 ]
}

And there would be a dagger.json in the ci/ dir colocated with your module's source code that looks like this:

$ cat dagger.json
{
 "name": "foo",
 "sdk": "go"
}

pseudo atlas Jan 31, 2024, 8:38 PM

#

What's the point of root-for in this case? He doesn't need it (assuming the ci module doesn't depend on other modules in the monorepo)

#

And I don't get to drop the -m ci

#

(which was the point of the source field)

thick burrow Jan 31, 2024, 8:39 PM

#

pseudo atlas What's the point of `root-for` in this case? He doesn't need it (assuming the ci...

This sort of setup would be needed in the very specific case of:

the module in ci/ requires a go.mod in the parent dir (root of the repo)
you for some reason can't just put the module source code in ci/ in the root of the repo

Otherwise, yeah this would not be needed

pseudo atlas Jan 31, 2024, 8:40 PM

#

In my mind the source field is pretty much orthogonal to the rest. Just a small optimization so that a module can tell its SDK where to load source code (instead of the default of the module's root)

thick burrow Jan 31, 2024, 8:41 PM

#

pseudo atlas (which was the point of the `source` field)

In my head canon, -m was for referring to a module ref (git or local path) with the newly expanded ability to also be able to refer to deps

pseudo atlas Jan 31, 2024, 8:42 PM

#

Yes totally

#

In this case I just want the root of my repo to be the module

#

So the baseline is: dagger init at the root of my repo; dagger develop at the root of my repo; drop a bunch of go files at the root of my repo. Conceptually simple, works fine, but pollutes my repo root.

So, add source: to tidy things up, but no fundamental change: my repo root is the module

thick burrow Jan 31, 2024, 8:43 PM

#

pseudo atlas In my mind the `source` field is pretty much orthogonal to the rest. Just a smal...

Right but I thought one of the nice aspects of this was that we went back to "there is always a dagger.json in the module's source dir, easy to identify".

(I know we clearly diverged at some point 😄 which is fine, just trying to find all the differences)

pseudo atlas Jan 31, 2024, 8:44 PM

#

I think we're just using "source dir" in different ways. There's always a dagger.json in the module's directory - by definition, no dagger.json means no module. But not all modules have source code

thick burrow Jan 31, 2024, 8:45 PM

#

pseudo atlas So the baseline is: `dagger init` at the root of my repo; `dagger develop` at th...

In the current implementation, it is just:

dagger init -m ci from the root of the repo (all the source is initialized in the ci/ subdir, the dagger.jsons all setup for you etc.)

thick burrow Jan 31, 2024, 8:45 PM

#

pseudo atlas I think we're just using "source dir" in different ways. There's always a dagger...

But not all modules have source code
Right that's true in the current implementation too

pseudo atlas Jan 31, 2024, 8:45 PM

#

thick burrow In the current implementation, it is just: 1. `dagger init -m ci` from the root ...

That's different though: that creates a module in ./ci, not .

#

I think the hang up is on whether you can decouple a module's source code from its root directory.

#

(I'm now thinking context-for might have been less confusing, since now I'm using "root" for two different thigns)

thick burrow Jan 31, 2024, 8:52 PM

#

Okay, I think I'm seeing what you were imagining. What I implemented had essentially two points:

There's a root dir, which is only used by sdks for building. It's where the root-for entry for a module is located
There's a source dir, which is where the module source code is. Must either be the same as the root dir (simple case) or a subdir of it (case where go.mod in parent dir, etc.)

But what you're describing sounds like there's 3 points (correct if wrong):

A root dir, where the root-for entry is
A "source root dir" (? not sure what to call it? context?) which is where the dagger.json with the module's name, sdk, etc. config is
A "source subdir" which is where the actual source code is located

#

If so, the current implementation basically just collapses your point 2+3 to always be the same.

pseudo atlas Jan 31, 2024, 9:04 PM

#

Yes that's right. Your number 2 I just called "the module directory" but now realize it would have helped to reserve a name for it.

Right now this terminology feels clearer (keeping same numbers as you):

"load context" dir, available to SDKs for loading, and security boundary for accessing ..
Root directory: always where dagger.json is.
Source directory: where the module's source code lives (defaults to the root)

thick burrow Jan 31, 2024, 10:08 PM

#

pseudo atlas Yes that's right. Your number 2 I just called "the module directory" but now rea...

Okay cool, pretty sure we're on the same page again 🤞 Thinking through the various permutations of settings that having those 3 directories as variables enables. I agree it gives even more flexibility than the current implementation (just in that you have another variable available to control).

#

One thing is that two modules can't share a root directory (2. in your most recent message), unless we go back to having a list of modules: [ ... ] again.

#

There are probably cases in which that might be annoying, but as far as I can think there would always be alternative options for how to layout directories that could achieve the same end effect.

#

The main thing that becomes possible now is the use case of "clone repo + dagger call (no -m)"

#

And will need to handle corner cases like:

module A having "root directory" set to ./foo and "source directory" ./foo/bar/baz
module B having "root directory" set to ./foo/bar and "source directory" also set to ./foo/bar

Layouts like that would just have to be invalid. Which is fine, but needs to be detected in such a way that it doesn't error out on slightly different but valid cases like:

module A having "root directory" set to ./foo and "source directory" ./foo/bar/baz
module B having "root directory" set to ./foo/otherbar and "source directory" also set to ./foo/otherbar

#

(I'm just stream of consciousness dumping right now as I think of things)

#

Also need to deal with what @sudden heart mentioned before about where include/exclude is set. Obviously it being set in the "root directory" dagger.json makes sense, but it still would make sense for it to be a setting in the "load context directory" context-for entry (presuming it's renamed to that). Just in that maybe a go.mod is in the root of the repo (and maybe it refers to some other go packages that are relative to it), but that doesn't mean we should always load the whole root repo in its entirety...

pseudo atlas Jan 31, 2024, 10:28 PM

#

thick burrow There are probably cases in which that might be annoying, but as far as I can th...

yes exactly. That is a tradeoff, and one I'm comfortable with

pseudo atlas Jan 31, 2024, 10:29 PM

#

thick burrow And will need to handle corner cases like: 1. module A having "root directory" s...

I didn't get this part. What is wrong with that example layout?

#

btw I can talk live if that's easier

#

sorry if I'm making your life difficult today 😅

#

worth sweating the details on this

thick burrow Jan 31, 2024, 10:38 PM

#

pseudo atlas I didn't get this part. What is wrong with that example layout?

Started typing out why and realized that we could make it work 🙂 Thought process:

I might invoke module A via e.g. dagger call -m github.com/org/daggerverse/foo/bar/baz. The engine loads the git repo, starts a find-up at foo/bar/baz until it finds the dagger.json (and also finds the context-for in possibly another parent dagger.json).

The thing we need to be careful about is that it will actually hit module B's dagger.json in the find-up first. I was thinking that would just make this invalid. But actually we can find the "right" dagger.json by checking the source path.

It's the same logic as already exists for finding the "right" dagger.json that contains the root-for a given module, it's just now we need to do that on two levels: what dagger.json contains the source for a given subpath and what dagger.json contains the root-for for a given subpath.

#

We're gonna need a lot integ tests 😄 But that's obviously fine, it's manageable I think. And I do like the flexibility

thick burrow Jan 31, 2024, 10:40 PM

#

pseudo atlas sorry if I'm making your life difficult today 😅

No it's all good, I'd rather get all these problems + use cases sorted out now rather than try to retrofit them in the future 🙂 Our future selves will thank our current selves

pseudo atlas Jan 31, 2024, 10:46 PM

#

you're welcome, future selves

#

does this mean that context-for becomes just an array of strings?

#

eg "context-for": ["./foo"]

thick burrow Jan 31, 2024, 10:54 PM

#

pseudo atlas does this mean that `context-for` becomes just an array of strings?

I like making everything a list of object because it gives us more flexibility in the long run if we end up wanting an extra field. E.g. the include/exclude problem might call for context-for giving each entry the ability to have its own include/exclude.

pseudo atlas Jan 31, 2024, 10:54 PM

#

yeah I get that. We should use a field other than source though, to avoid confusion

#

path?

thick burrow Jan 31, 2024, 10:55 PM

#

pseudo atlas path?

Yeah SGTM

pseudo atlas Jan 31, 2024, 10:55 PM

#

My CUE brain immediately thought: let's just make it a string now, and make it string|{...} later 😛

#Last mile review