#`install` vs `toolchain install`
<@&946480760016207902> An unresolved UX issue: there are 2 kinds of dependencies now. They feel disjointed, we need to unify the user experience for them. But how?
@noble prairie @nocturne owl in particular since we discussed this one before. But really all maintainers should follow this.. Because it impacts the fundamentals of our module dependency system - and in turn, the sandboxing model
I gave a demo of toolchains to @high nimbus. He asked: "what's the relationship between install and toolchain install"? I didn't have a good answer. In fact I accidentally typed dagger install instead of toolchain install in the demo... 
there's at least some precedent with yarn/npm, where our toolchains feel like 'dev dependencies'
also, fwiw - i've really liked the word 'workspace' since you've started using it even as a stand-in name for the new Env. feel like it clarifies nicely that it's a development time thing
dagger install --dev?
Jeremy asked a question which I pass on to you:
What if there was only one `dagger install`, with one type of dependency. But some dependencies could act as toolchains, and others not?
This could be done a few different ways:
1. By marking `isToolchain: true` in `dagger.json`, as you suggested @noble prairie.
2. With a special pragma on the main object type, in the same file as the "explicit workspace" API.
3. The "explicit workspace API" is the marker: no extra information needed.
Regardless of how we do it, it would have profound implications on the dependency & sandboxing model. Some dependencies can access their caller's workspace. That breaks the metaphor of module dependencies as libraries. Are we ok with that? How do we explain it?
This is another angle, where we retire the word "toolchain" and instead call them "dev dependencies".
I don't love this because it defines modules primarily as "libraries calling libraries". But that's not how users experience their project's module. They think of their project's module as their workspace - not as a library. So "dev dependencies" is potentially weird in that context. Users might think we're talking about their project's actual dev dependencies.
what about 'workspace modules'? dagger install -w/--workspace/--ws
We had a whole naming thread which was thoroughly debated until we settled on "blueprints" and later "toolchains", none of the other candidates felt intuitive enough, and we discussed quite a few candidates
Note that this goes beyond a name change. Potentially it removes the need for a separate concept altogether - just dependencies, just some need a special kind of access and others don't. I'd like to get your thoughts on that option
Could even be a "capability prompt" thing, like when a module needs access to your LLM creds. "Module xyz requests access to your workspace; y/n"
But then, it breaks the mental model of "dependencies are like when my lib calls another lib"
Although, workspace access would be materialized by passing a magic argument, so at least everything is still a function call..
yeah hm, that's another wrinkle, toolchain modules aren't exposed as libraries at all
3 sounds the most promising to me. Toolchains have a special constructor related to workspaces that makes it able to load as a toolchain. But it can also be used as a dependency
3 sgtm too if we can work out the details, the prompting sounds good if we think we need it, and we could save the prompt confirmation to dagger.json too
yeah it is intriguing - also could be more granular. In this model, potentially a dependency could expose regular functions, and also workspace-contextual checks. Because any function could potentially be marked as requiring a special workspace argument (if we decided to support that)
But, what about the mental model?
How do we explain this
And, I personally can't go back to "your module is like a library! And it can import other libraries". --> So my project is a library? Doesn't work
Also I still don't want to go back to "everyone creates .dagger on day one"
So I don't know how to square that circle..
I think it brings us back to "they're all just modules" and there's only dagger install. Some modules happen to have checks, generators, etc. There's no distinction to explain to someone who is a no-code user or someone who has a .dagger
Yeah but how do we explain how modules work in a simple and clear way, if in reality there are 2 very different kinds of modules that interact with your project in very different ways?
install vs toolchain install
I'm not thinking of this proposal as 2 very different kinds of modules? For example, our engine-dev toolchain (ignoring that its a "project specific" toolchain for now). We want to use it as both a toolchain and a dependency. Its just a module that exposes checks!
They do work very differently even today... It's just that the difference is hidden, in the routing of +defaultPath. With the explicit workspace API, that will be more visible.
I think it could work if we explain modules as something more like "plugins" to add capabilities to your project
I thought with the better context API / workspace API that difference could be eliminated - no more hidden routing
Well that won't eliminate the difference, it will make it more explicit (which is good).
Toolchains get the workspace argument... regular modules don't.
That new API doesn't address the split between install and toolchain install
Right but the new API + number 3 above does
> Toolchains get the workspace argument... regular modules don't.
This doesn't have to be true though
Correct, if we find an alternative design. Just pointing out that it doesn't exist yet.
> Right but the new API + number 3 above does
Yes potentially. But there still would be 2 kinds of modules in practice right? Those who use the new API, and those who don't. We need a way to explain that difference.
Actually 3 kinds of modules, if you count those that act as your workspace, and implement no functions of their own
> Yes potentially. But there still would be 2 kinds of modules in practice right? Those who use the new API, and those who don't. We need a way to explain that difference.
Do we though? Some modules provide checks, some dont. I dont think consumers of the module will care about how checks work under the hood
I'm assuming that when a user learns `dagger install` today, they understand it as "like installing a library, with new functions that I can call". Do you think that's true? If so, isn't there a risk that we break that understanding if dependencies can sometimes access their caller's workspace?
To be clear I kind of like the option of combining it all. But want to make sure we stress test it
Here's my strawman:
`dagger install` will add modules to your workspace. Modules are like plugins that add capabilities to your workspace. Those capabilities are implemented by dagger functions, so everything is sandboxed, programmable, etc.
- Capability 1: you can inspect and call your modules' functions directly, with explicit arguments. For when you need full control, direct access to the plumbing, flexible composition, etc.
- Capability 2: modules can "hook" into your workspace and add checks, generator functions, and other cool things. Higher-level APIs, more opinionated, but super simple, and modules can do more automatically on your behalf
With this framing, everything relies on the new APIs. Modules get more capabilities, they can hook into more and more aspects of the workspace. Shifts more control and difficulty from the end user to the module dev
I still like the word "toolchain" though... If we end up retiring it, I will miss it. Modules are technically accurate but on their own, very abstract.
While we're putting big UX changes on the table... What about differentiating workspaces from modules?
Right now dagger.json marks both. We're gradually expanding the dagger.json spec to accommodate the widening gap between them. With or without SDK; with or without source code... Now customizations, which are artificially separated from user defaults because of sandboxing - a separation which only makes sense if your module has code...
Is it time to discuss bifurcating modules (dagger.json) and workspaces? (dagger-workspace.toml? dagger.toml?) Or is this insanity
Or, we keep the unified dagger.json, but embrace that it's a workspace configuration. No more worry about sandboxed/unsandboxed: everything can go in there. And optionally your workspace may be imported as a module, if it has functions of its own. When imported as a module, not all workspace configuration applies (i.e. your workspace config may say "load my aws token from `op://top-secretpassword`", but that won't be honored when you're imported as a module in another workspace).
What about transitive dependencies? A imports B, B imports C. C needs access to a workspace.
- When user is developing A, what workspace does C get?
- When user is developing B, what workspace does C get?
This is where the concept of "dev dependency" would make sense, except that it's weird, because like I said earlier in the thread: "dev dependency" will be the most common kind of dependency: a module you install in your workspace to get more capabilities. It's weird that the normal kind of dependency has a qualifier, and the advanced/special one has no qualifier.
--> This I think is an argument for splitting the files, or at least the dependency configurations.
- Configuration 1: "I want this module added to my workspace, so I can use its capabilities. It should get my workspace" (this is like "toolchain", "dev dependency")
- Configuration 2: "I am developing a module. I want to access the functions of another module. It should NOT get my workspace"
> What about transitive dependencies? A imports B, B imports C. C needs access to a workspace.
> - When user is developing A, what workspace does C get?
> - When user is developing B, what workspace does C get?
depends how we decide workspaces work, but based on most of the assumptions we've made about how it could work, I'd say they all get the same workspace based on the original caller?
Sorry I late-edited the message you're replying to
I think fundamentally there are 2 kinds of dependencies:
- workspace -> module ("add capabilities to my project")
- module -> module ("let my functions call more functions")
all of this hinges on how we pass workspaces to modules. So even if it's not figured out yet it might help to define a starting point so we have the same frame of reference
Starting point: when a module is loaded, some of its functions may be marked by the engine as "workspace-aware". The criterion in the base design is "the function is the constructor of a module installed with `toolchain install`" (but the criteria can change if needed).
Functions marked as workspace-aware are passed an argument of type Workspace. If the function doesn't accept that argument, module loading fails.
Workspace argument gives the function access to the workspace's files, with dynamic filtering. Also it can inspect all checks in the workspace, and other advanced "hooking" capabilities.
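A rough Go sketch of that starting point, to make the load-time rule concrete. Every name here (`Module`, `Function`, `Load`, `AsToolchain`, `AcceptsWorkspace`) is made up for illustration and is not the engine's actual API; it only models "constructor of a toolchain-installed module is workspace-aware, and loading fails if it can't take a Workspace".

```go
package main

import (
	"errors"
	"fmt"
)

// Function is a hypothetical, simplified description of a module function.
type Function struct {
	Name             string
	AcceptsWorkspace bool // true if the signature has a Workspace parameter
	WorkspaceAware   bool // set at load time by the (modelled) engine
}

// Module is a hypothetical loaded module, reduced to its constructor.
type Module struct {
	Name        string
	Constructor Function
	AsToolchain bool // assumed flag: installed with `toolchain install`
}

// ErrNoWorkspaceParam is returned when a workspace-aware function
// cannot receive a Workspace argument.
var ErrNoWorkspaceParam = errors.New("workspace-aware function does not accept a Workspace argument")

// Load applies the base-design criterion: the constructor of a module
// installed as a toolchain is marked workspace-aware, and loading fails
// if that constructor does not accept a Workspace.
func Load(m Module) (Module, error) {
	if !m.AsToolchain {
		return m, nil // regular dependency: nothing is marked
	}
	m.Constructor.WorkspaceAware = true
	if !m.Constructor.AcceptsWorkspace {
		return m, fmt.Errorf("load %s: %w", m.Name, ErrNoWorkspaceParam)
	}
	return m, nil
}

func main() {
	ok := Module{Name: "jest", AsToolchain: true,
		Constructor: Function{Name: "New", AcceptsWorkspace: true}}
	bad := Module{Name: "alpine", AsToolchain: true,
		Constructor: Function{Name: "New"}}

	if _, err := Load(ok); err == nil {
		fmt.Println("jest loaded as a toolchain")
	}
	if _, err := Load(bad); err != nil {
		fmt.Println("error:", err)
	}
}
```

If the criterion changes later (e.g. any function, not just the constructor), only `Load` would need to change in this model.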
I have a POC for this part
(lunch break, will resume after)
I will try to incorporate this discussion into my proposal
> But there still would be 2 kinds of modules in practice right? Those who use the new API, and those who don't. We need a way to explain that difference.
Going back to this, why would the explanation of modules need to explain the Workspace type? Its easy enough to say that modules are collections of functions that use the dagger API. If Workspace is part of the API, thats part of it. So some functions can dynamically work with the host if they choose to.
If it's a sandboxing concern, we can make decisions on how Workspaces are initialized. Maybe they have to get explicitly passed in from a parent module, rather than omitted and filled in by default, as a `defaultPath` Directory could be today.
I think this is the root issue
my two cents:
- Agree with Kyle that not restricting or differentiating on who can accept a Workspace arg type is the most intuitive. Sandboxing/permissions need thought, but from the perspective of what's simplest/most-intuitive, trying to make some differentiation here seems complicated.
- "I think fundamentally there are 2 kinds of dependencies": agree, that's what's actually worth differentiating.
The dumbest simplest thing I can think of is that dagger.json has two fields:
```json
{
  "toolchains": [
    {"name": "foo", "source": "github.com/foo/bar"},
    {"name": "baz", "source": "./some/local/path"}
  ],
  "dependencies": [
    {"name": "abc", "source": "github.com/abc/123"},
    {"name": "qaz", "source": "idk"}
  ],
  "source": "<existing meaning of source, a module that will have toolchains mixed in>"
}
```
Just a strawman to throw in the ring. But it seems like it would work out logically consistent.
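For illustration, here is roughly how the two lists in that strawman could be read and told apart in Go. This is only a sketch: `Config`, `ModuleRef`, `Parse` and `GetsWorkspace` are hypothetical names, and the struct follows the strawman's fields, not the real `dagger.json` schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModuleRef is one entry in either list of the strawman config.
type ModuleRef struct {
	Name   string `json:"name"`
	Source string `json:"source"`
}

// Config mirrors the strawman fields only; it is not the real schema.
type Config struct {
	Toolchains   []ModuleRef `json:"toolchains"`
	Dependencies []ModuleRef `json:"dependencies"`
	Source       string      `json:"source"`
}

// Parse decodes a strawman config from JSON bytes.
func Parse(data []byte) (Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, err
	}
	return c, nil
}

// GetsWorkspace reports whether the named module is listed as a toolchain
// (workspace -> module) rather than as a plain dependency (module -> module).
func (c Config) GetsWorkspace(name string) bool {
	for _, t := range c.Toolchains {
		if t.Name == name {
			return true
		}
	}
	return false
}

func main() {
	raw := []byte(`{
	  "toolchains":   [{"name": "foo", "source": "github.com/foo/bar"}],
	  "dependencies": [{"name": "abc", "source": "github.com/abc/123"}]
	}`)
	cfg, err := Parse(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.GetsWorkspace("foo")) // true
	fmt.Println(cfg.GetsWorkspace("abc")) // false
}
```

The point of the split is that the decision "does this module get my workspace" is made by which list it sits in, not by anything in the module itself.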

> If its a sandboxing concern, we can make decisions on how Workspaces are initialized
I feel like there's something here so i'll try to describe it better.
What if Workspace can't be defaulted in code/pragma? Thats part of what makes defaultPath hairy.
Then, the dagger CLI can pass in the context's "default" workspace if I call something directly and don't pass a Workspace value. In this case, if we're going module -> module, the Workspace would explicitly get passed through. Don't need to worry about transitive dependencies automatically getting something we don't want them to have.
isn't that basically what we have today with cosmetic adjustments? (not necessarily a bad thing)
yeah mostly afaict
corollary is that you still need 2 distinct install commands (or flags) right?
yep, I'd just make it a flag probably
@glad ruin any thoughts on more strongly differentiating "workspace" from "module"?
I'd greatly prefer not to (less concepts the better, up to a point). And it doesn't seem like it's necessary based on above discussion
One blocker for me is which file to put workspace configuration like user defaults
@nocturne owl is it also your preference to basically keep the current system? `dagger toolchain install` vs `dagger install`? With maybe some cosmetic changes but nothing fundamental?
I think it's possible for us to get to a single `dagger install`, but it depends so heavily on pieces that don't exist yet that it's probably more pragmatic to take it 1 step at a time
I don't think there's a way to do that that also resolves transitive dependencies in a clean way
Because there are fundamentally 2 types of dependencies - that fundamentally will behave differently when transitive
yeah what I mean is we keep them separate, get the first version of this Workspace API figured out (and shipped), and then see if we have different problems
I can already tell you we have this problem though
(my bad for not having pushed my POC branch yet)
But the key is the A -> B -> C question: which workspace does C get? We can't answer atm
wouldn't this be a possible way to avoid the transitive dependency problem? #1466873214780051526 message
Honestly I didn't understand that message
Let me re-read it
I can try to explain better if its not clear!
Any chance you could rephrase it in terms of an answer to the A->B->C question?
Sure one sec
The question is:
- A imports B, B imports C
- C.fn() expects a workspace
- In the A workspace, user calls B.fn() which calls C.fn(). Which workspace does C.fn() receive?
- In the B workspace, user calls B.fn() which calls C.fn(). Which workspace does C.fn() receive?
So in the world of Workspaces being explicit:
A.fn(Workspace) gets a workspace, and the only way C.fn(Workspace) can receive it as a dependency is if its explicitly passed in.
So A(a Workspace) -> C.fn(a). It is not possible to call C.fn() without an argument as a dependency
and B.bar() calls C.fn().
So for A to call B.bar(), it would be A(a Workspace) -> B.bar(a) -> C.fn(a)
If I call B.bar() directly, not through code, we'll have B.bar(b Workspace) -> C.fn(b).
It always goes back to the main caller context
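The same chain written as plain Go, just to show that with explicit passing the workspace C sees is always whatever the outermost caller handed in. `Workspace`, `AFn`, `BBar` and `CFn` are stand-ins for the modules in the example, not real Dagger types.

```go
package main

import "fmt"

// Workspace is a hypothetical stand-in for the proposed Workspace type.
type Workspace struct{ Root string }

// CFn plays the role of C.fn(): it requires a workspace argument.
func CFn(ws Workspace) string { return "C.fn sees " + ws.Root }

// BBar plays the role of B.bar(): it forwards the workspace it was given.
func BBar(ws Workspace) string { return CFn(ws) }

// AFn plays the role of A's function: it forwards its workspace to B.
func AFn(ws Workspace) string { return BBar(ws) }

func main() {
	// Called from the A workspace: A -> B -> C all see A's workspace.
	fmt.Println(AFn(Workspace{Root: "/repos/A"})) // C.fn sees /repos/A

	// B.bar() called directly from the B workspace: C sees B's workspace.
	fmt.Println(BBar(Workspace{Root: "/repos/B"})) // C.fn sees /repos/B
}
```

There is no way to call `CFn` without an argument here, which is the property being argued for: nothing is handed to a transitive dependency implicitly.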
hopefully that's more clear and not just pseudocode overload
So the user has to explicitly pass their workspace? Wouldn't that break `dagger check`?
thats where the last part comes in
> Then, the dagger CLI can pass in the context's "default" workspace if I call something directly and don't pass a Workspace value. In this case, if we're going module -> module, the Workspace would explicitly get passed through. Don't need to worry about transitive dependencies automatically getting something we don't want them to have.
So the caller can specify the workspace, or it can be automagically contextual
OK so the distinction would not be between different kinds of dependencies, but different execution contexts: is the client a module basically
- If client is a module: no workspace in the current context. Client must pass a workspace itself
- If client is not a module: any function that expects a workspace automatically receives one, with access to the current client's workspace
Or even simpler: a client registers a workspace, or doesn't. The engine doesn't check for "is client a module", it just checks for "did this client register a workspace to access"
Modules when they connect, don't register a workspace
CLI could get flags for explicit control over what workspace to register.
- Default: workspace is the current git repo; fallback to current module; fallback to error
- `--workspace=PATH` -> explicitly set the workspace boundary
- `--no-workspace` -> explicitly disable workspace. Modules would use this
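Sketching the flag behavior above as a resolution function in Go. The flags are only proposed in this thread, so `Options`, `Context` and `Resolve` are hypothetical; the precedence when both flags are set is also an assumption (here `--no-workspace` wins).

```go
package main

import (
	"errors"
	"fmt"
)

// Options models the two proposed CLI flags (hypothetical).
type Options struct {
	WorkspacePath string // --workspace=PATH
	NoWorkspace   bool   // --no-workspace
}

// Context is what the CLI is assumed to know about where it was invoked.
type Context struct {
	GitRepoRoot string // empty if not inside a git repo
	ModuleRoot  string // empty if there is no current module
}

// ErrNoWorkspace is the final fallback in the proposed default chain.
var ErrNoWorkspace = errors.New("no workspace: not inside a git repo or a module")

// Resolve returns the workspace path to register, and whether one is
// registered at all. Assumption: --no-workspace beats --workspace.
func Resolve(opts Options, ctx Context) (path string, registered bool, err error) {
	switch {
	case opts.NoWorkspace:
		return "", false, nil // explicitly disabled; no error
	case opts.WorkspacePath != "":
		return opts.WorkspacePath, true, nil // explicit boundary
	case ctx.GitRepoRoot != "":
		return ctx.GitRepoRoot, true, nil // default: current git repo
	case ctx.ModuleRoot != "":
		return ctx.ModuleRoot, true, nil // fallback: current module
	default:
		return "", false, ErrNoWorkspace // fallback: error
	}
}

func main() {
	ctx := Context{GitRepoRoot: "/src/mymodules", ModuleRoot: "/src/mymodules/B"}

	path, ok, _ := Resolve(Options{}, ctx)
	fmt.Println(path, ok) // /src/mymodules true

	_, ok, _ = Resolve(Options{NoWorkspace: true}, ctx)
	fmt.Println(ok) // false
}
```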
I think this could work.... It's just missing a pragma somewhere to explicitly say "this type / this function" needs the client's context please. Basically like `+toolchain`... but probably not the right word. Otherwise it feels too magic.
Maybe I'm still misunderstanding Workspace because I feel like we're talking about completely different things. It sounds like you're saying a Module defines a Workspace and I'm saying the caller provides a Workspace to a Module
Yes, the caller (client) provides the workspace. Transitive dependencies are handled by making sure modules (which recursively call the engine with their own client) don't provide a workspace.
I guess I should have more explicitly distinguished "module B" from "module C"
- `dagger check` calls `B.fn()`, in the context of A: client provides a workspace: the git repo for A
- `B.fn()` calls `C.fn()`. The client is module B's function runtime: it provides no workspace.
This allows clean handling of transitive dependencies
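A toy Go model of that rule, assuming a client either registers a workspace when it connects or doesn't. `Client` and `CallWorkspaceFn` are invented for this sketch; what should happen when neither an explicit nor a registered workspace exists isn't settled in the thread, so the error here is an assumption.

```go
package main

import "fmt"

// Workspace is a hypothetical stand-in for the proposed Workspace type.
type Workspace struct{ Root string }

// Client models a connection to the engine. A nil Workspace means the
// client registered none (e.g. a module's function runtime).
type Client struct {
	Name      string
	Workspace *Workspace
}

// CallWorkspaceFn models the engine dispatching a call to a function that
// expects a workspace. An explicitly passed workspace wins; otherwise the
// client's registered workspace is injected; otherwise the call fails
// (assumed behavior).
func CallWorkspaceFn(c Client, fn string, explicit *Workspace) (string, error) {
	ws := explicit
	if ws == nil {
		ws = c.Workspace
	}
	if ws == nil {
		return "", fmt.Errorf("%s: client %q registered no workspace and passed none", fn, c.Name)
	}
	return fn + " runs against " + ws.Root, nil
}

func main() {
	repoA := &Workspace{Root: "/repos/A"}

	// `dagger check` in the context of A: the CLI client registers A's repo.
	cli := Client{Name: "dagger check", Workspace: repoA}
	out, _ := CallWorkspaceFn(cli, "B.fn", nil)
	fmt.Println(out) // B.fn runs against /repos/A

	// B.fn calls C.fn: the client is B's function runtime, no workspace.
	nested := Client{Name: "B function runtime"}
	if _, err := CallWorkspaceFn(nested, "C.fn", nil); err != nil {
		fmt.Println("error:", err)
	}

	// Unless B passes its own workspace through explicitly.
	out, _ = CallWorkspaceFn(nested, "C.fn", repoA)
	fmt.Println(out) // C.fn runs against /repos/A
}
```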
@nocturne owl TLDR yes clients define the workspace, not modules. But sometimes, the client runs recursively inside a module. In that case, in-module client defines no workspace. I think this works.
Ah no there's still a problem: there are 2 possible kinds of B->C dependencies:
- B needs to call C functions. It's a "runtime dependency". Even when calling in the context of B's git repo, C should not get workspace access
- Developer of B needs to call C functions directly, eg. to check. It's a "dev dependency".
If we don't differentiate, it's weird.
But if we call them "dev dependencies" and "regular dependencies", then that's confusing for the 99% of dagger users who will never actually develop dagger functions, but have to type `dagger install --dev`, otherwise nothing works like they expect
So far: negative progress for me on this issue
I'm still not connecting with this, sorry. I can type out a real world scenario in a sec. But why wouldn't B pass a Workspace to C as an argument? Doesn't that avoid all of this?
```
$ cd ~/src/mymodules/B
```
```
$ dagger check -l
c:lint
```
```
$ dagger check
```
--> This calls `C.lint()` with workspace auto-set to `~/src/mymodules`
As the developer of module B, this may or may not be what I want
- Option 1: *no*, not what I want. C is actually a runtime dependency to B: one of my functions calls `C.lint` with explicit arguments. I have no use for `C.lint()` for my own project, it's weird that it's there
- Option 2: *yes*, this is what I want. C is a dev dependency to B. My functions don't call functions from C. I just need `C.lint` as a check in my project to help me develop B
Problem: how to differentiate
Again the term "dev dependency" is perfectly accurate technically, BUT confusing for our users because they don't expect anything other than dev dependencies - runtime dependencies are for power users.
But that sounds entirely like enabling/disabling module's checks, not about whether they get the correct workspace
Yes, the problem of getting the correct workspace is solved in this design.
It's about enabling / disabling based on intended use of the dependency - because there are fundamentally 2 kinds of dependencies
ok using our own modules right now as an example, we have modules that we use as dependencies and as toolchains, and they're useful as both
Yes. We still get to control which are exposed as toolchains (dev dependencies) and which are not. Merging it down to a single kind of dependency would prevent us from making that distinction. All dependencies that can be used as toolchains/dev-dependencies will be, even if that's not what we want
I think for me, it's a question of who you want to make the workflow most simple for. I guess if it's developers who are consumers of a fully prepared (by DevOps folks) repo with checks and maybe some tools for local dev/test, I'd want them to have the cleanest/shortest/simplest commands with least cognitive load. `dagger check`
Next level of complexity is for devs who install, say, the jest toolchain on their own. They can take a little more complexity. `dagger install jest` and maybe config defaults?
Next level is for DevOps engineers, toolchain/module devs. More flags, more complexity, because it's my job to manage that complexity for devs.
Ok I see what you mean for the case of discovering checks but i'm not sure how it connects to workspaces
Checks are a good use case because to provide useful checks, you always need workspace access (otherwise what are you checking)
But the whole point is to illustrate the fundamental difference between runtime deps and dev deps, and the need to distinguish them in the UX somehow, without actually calling them "dev deps" because that will confuse 99% of users
so if we're optimized for the user and not the module dev (and we keep 2 separate kinds of dependencies), one improvement would be to swap it. Instead of dagger toolchain install and then dagger install for module deps, it could be dagger install for toolchains and dagger dependency install for mod deps
Yep. My comment was mostly at that level, though I know it's very thorny below that level (as this discussion shows), and it might be argued that you optimize the other way.
Front office/back office
But does it really boil down to "automatically run checks on these modules, and not these modules" or is there more wrapped up in Workspace usage?
I think it's more fundamental. "This module is to help me develop my project" vs "This module is for my dagger functions to call other dagger functions"
"check" is just one way to help me develop my project. Tomorrow it will be "ship", "up", "fix", whatever. Also applying user defaults.
Honestly this makes me want to create a separate config file at the root of the repo (or wherever), just for setting up dagger and configuring which modules to enable for which paths in the project, with which values - and nothing else
It's barely a "dependency" it's just a project config.
Then `dagger.json` is for modules, it has dependencies, the normal kind: runtime dependencies - functions calling functions, no workspace magic.
If you want to use other dagger modules to develop your module project which may or may not be a dagger module, that's not a dependency at all. You just add that to your simple project config, at the root of the repo or wherever. Orthogonal to your dagger.json
reading this as "If you want to use other dagger modules to develop your module project", I agree, but I don't know if I'd separate the dependencies (into separate files). More like a config to refine how to check/ship/up/fix. Especially when you have a monorepo with subprojects that you want to run subsets of the checks on for each one
As for keeping the dependencies in separate lists, I think its honestly fine. But I don't think there's any reason to explain "there are 2 types of modules", its more like there are 2 ways to install a module. But its all the same modules. I think thats the part I've been stuck on
Yeah I just think it might actually be simpler, instead of having 2 types of dependencies in dagger.json, to have 2 different config files - one for modules and one for project/workspace config, and each has just one kind of dependency: basically workspace->module vs. module->module. And in the case of workspace->module, we wouldn't even use the word "dependency". Just like when you configure nginx to use this or that module, it's not a dependency. Just a config that installs/enables a nginx module. Just semantics
so (ignoring the actual ergonomics for a second), we might dagger workspace install jest (install a module into the workspace) or dagger dependency install alpine (install a module to use in our dagger code)
Right. And in this last variation, those 2 commands would change 2 different config files
Got it. I'm pretty indifferent on the 2 config files. I agree we need one to configure things, but no real opinion on whether the dagger.json would still track the module or not.
I was trying to catch up on the thread, and was thinking we should swap the commands (then this message)
I wonder if it shouldn't be something like dagger develop install <lib-I-want-to-integrate-in-my-module>
That way we keep that under the dagger develop command that already exists and we stay in the same phase, that is writing custom functions.
And to have `dagger install <module>` that exposes checks, generators, etc makes sense as an end user I think. I'm installing a module, it provides me features I can use right away. Contrary to installing a lib that I need to integrate into my own custom module to make available.
this sounds pretty good to me!