#`install` vs `toolchain install`

hazy root
#

An unresolved UX issue: there are 2 kinds of dependencies now. They feel disjointed; we need to unify the user experience for them. But how?

#

@noble prairie @nocturne owl in particular since we discussed this one before. But really all maintainers should follow this, because it impacts the fundamentals of our module dependency system and, in turn, the sandboxing model

#

I gave a demo of toolchains to @high nimbus. He asked: "what's the relationship between `install` and `toolchain install`?" I didn't have a good answer. In fact I accidentally typed `dagger install` instead of `toolchain install` in the demo... facepalm

noble prairie
#

there's at least some precedent with yarn/npm, where our toolchains feel like 'dev dependencies'

#

also, fwiw - i've really liked the word 'workspace' since you've started using it even as a stand-in name for the new Env. feel like it clarifies nicely that it's a development time thing

#

dagger install --dev?

hazy root
#

Jeremy asked a question which I pass on to you:

What if there was only one dagger install, with one type of dependency. But some dependencies could act as toolchains, and others not?

This could be done a few different ways:

  1. by marking isToolchain: true in dagger.json as you suggested @noble prairie .
  2. With a special pragma on the main object type: in the same file as the "explicit workspace" API
  3. The "explicit workspace API" is the marker: no extra information needed

Regardless of how we do it, it would have profound implications on the dependency & sandboxing model. Some dependencies can access their caller's workspace. That breaks the metaphor of module dependencies as libraries. Are we ok with that? How do we explain it?

hazy root
# noble prairie `dagger install --dev`?

This is another angle, where we retire the word "toolchain" and instead call them "dev dependencies".

I don't love this because it defines modules primarily as "libraries calling libraries". But that's not how users experience their project's module. They think of their project's module as their workspace - not as a library. So "dev dependencies" is potentially weird in that context. Users might think we're talking about their project's actual dev dependencies.

noble prairie
#

what about 'workspace modules'? dagger install -w/--workspace/--ws

hazy root
#

Could even be a "capability prompt" thing, like when a module needs access to your LLM creds. "Module xyz requests access to your workspace; y/n"

But then, it breaks the mental model of "dependencies are like when my lib calls another lib"

Although, workspace access would be materialized by passing a magic argument, so at least everything is still a function call..

noble prairie
#

yeah hm, that's another wrinkle, toolchain modules aren't exposed as libraries at all

noble prairie
#

3 sgtm too if we can work out the details, the prompting sounds good if we think we need it, and we could save the prompt confirmation to dagger.json too

hazy root
#

But, what about the mental model?

#

How do we explain this

#

And, I personally can't go back to "your module is like a library! And it can import other libraries". --> So my project is a library? Doesn't work

#

Also I still don't want to go back to "everyone creates .dagger on day one"

#

So I don't know how to square that circle..

nocturne owl
#

I think it brings us back to "they're all just modules" and there's only dagger install. Some modules happen to have checks, generators, etc. There's no distinction to explain to someone who is a no-code user or someone who has a .dagger

hazy root
#

install vs toolchain install

nocturne owl
#

I'm not thinking of this proposal as 2 very different kinds of modules? For example, our engine-dev toolchain (ignoring that it's a "project specific" toolchain for now). We want to use it as both a toolchain and a dependency. It's just a module that exposes checks!

hazy root
#

I think it could work if we explain modules as something more like "plugins" to add capabilities to your project

hazy root
#

Actually 3 kinds of modules, if you count those that act as your workspace, and implement no functions of their own

nocturne owl
#

> Yes potentially. But there still would be 2 kinds of modules in practice right? Those who use the new API, and those who don't. We need a way to explain that difference.

Do we though? Some modules provide checks, some don't. I don't think consumers of the module will care about how checks work under the hood

hazy root
#

To be clear I kind of like the option of combining it all. But want to make sure we stress test it

#

Here's my strawman:

  • dagger install will add modules to your workspace. Modules are like plugins, that add capabilities to your workspace. Those capabilities are implemented by dagger functions, so everything is sandboxed, programmable, etc.
  • Capability 1: you can inspect and call your module's functions directly, with explicit arguments. When you need full control, and direct access to the plumbing, flexible composition etc.
  • Capability 2: modules can "hook" into your workspace and add checks, generator functions, and other cool things. Higher-level APIs, more opinionated, but super simple and modules can do more automatically on your behalf

With this framing, everything relies on the new APIs. Modules get more capabilities, they can hook into more and more aspects of the workspace. Shifts more control and difficulty from the end user to the module dev

#

I still like the word "toolchain" though... If we end up retiring it, I will miss it. Modules are technically accurate but on their own, very abstract.

#

While we're putting big UX changes on the table... What about differentiating workspaces from modules? 😅 😇

#

Right now dagger.json marks both. We're gradually expanding the dagger.json spec to accommodate the widening gap between them. With or without SDK; with or without source code... Now customizations, which are artificially separated from user defaults because of sandboxing - a separation which only makes sense if your module has code...

Is it time to discuss bifurcating modules (dagger.json) and workspaces? (dagger-workspace.toml? dagger.toml?) Or is this insanity

#

Or, we keep the unified dagger.json, but embrace that it's a workspace configuration. No more worry about sandboxed/unsandboxed. Everything can go in there. And optionally your workspace may be imported as a module - if it has functions of its own. When imported as a module, not all workspace configuration applies (i.e. your workspace config may say "load my aws token from `op://top-secretpassword`", but that won't be honored when you're imported as a module in another workspace).

hazy root
# nocturne owl > Yes potentially. But there still would be 2 kinds of modules in practice right...

What about transitive dependencies? A imports B, B imports C. C needs access to a workspace.

  • When user is developing A, what workspace does C get?
  • When user is developing B, what workspace does C get?

This is where the concept of "dev dependency" would make sense, except that it's weird, because like I said earlier in the thread: "dev dependency" will be the most common kind of dependency: a module you install in your workspace to get more capabilities. It's weird that the normal kind of dependency has a qualifier, and the advanced/special one has no qualifier.

--> This I think is an argument for splitting the files, or at least the dependency configurations.

  • Configuration 1: "I want this module added to my workspace, so I can use its capabilities. It should get my workspace" (this is like "toolchain", "dev dependency")
  • Configuration 2: "I am developing a module. I want to access the functions of another module. It should NOT get my workspace"
nocturne owl
#

> What about transitive dependencies? A imports B, B imports C. C needs access to a workspace.
>
> When user is developing A, what workspace does C get?
> When user is developing B, what workspace does C get?

depends how we decide workspaces work 😛 but based on most of the assumptions we've made about how it could work, I'd say they all get the same workspace based on the original caller?

hazy root
#

I think fundamentally there are 2 kinds of dependencies:

  • workspace -> module ("add capabilities to my project")
  • module -> module ("let my functions call more functions")
nocturne owl
#

all of this hinges on how we pass workspaces to modules. So even if it's not figured out yet it might help to define a starting point so we have the same frame of reference

hazy root
# nocturne owl all of this hinges on how we pass workspaces to modules. So even if its not figu...

Starting point: when a module is loaded, some of its functions may be marked by the engine as "workspace-aware". The criterion in the base design is "the function is the constructor of a module installed with `toolchain install`" (but the criteria can change if needed).

Functions marked as workspace-aware are passed an argument of type Workspace. If the function doesn't accept that argument, module loading fails.

The Workspace argument gives the function access to the workspace's files, with dynamic filtering. It can also inspect all checks in the workspace, and use other advanced "hooking" capabilities.
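
A minimal sketch of this loading rule, in Python for brevity rather than as engine code. `Workspace`, `ModuleLoadError`, `load_module` and the `installed_as_toolchain` flag are all hypothetical names made up for illustration; this is not the POC and not the real engine API.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Workspace:
    """Hypothetical stand-in for the Workspace type discussed above."""
    root: str


class ModuleLoadError(Exception):
    """Raised when a workspace-aware function cannot accept a Workspace."""


def accepts_workspace(fn) -> bool:
    """Return True if fn declares a parameter annotated with Workspace."""
    params = inspect.signature(fn).parameters.values()
    return any(p.annotation is Workspace for p in params)


def load_module(constructor, installed_as_toolchain: bool) -> dict:
    """Apply the base-design criterion: the constructor of a module
    installed as a toolchain is marked workspace-aware. Loading fails
    if a workspace-aware constructor has no Workspace parameter."""
    workspace_aware = installed_as_toolchain
    if workspace_aware and not accepts_workspace(constructor):
        raise ModuleLoadError(
            f"{constructor.__name__} is workspace-aware but takes no Workspace"
        )
    return {"constructor": constructor, "workspace_aware": workspace_aware}
```

If the criterion changes later, only the line that computes `workspace_aware` would need to change in this sketch; the validation step stays the same.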

#

I have a POC for this part 🙂

#

(lunch break, will resume after)

#

I will try to incorporate this discussion into my proposal 🙂

nocturne owl
#

> But there still would be 2 kinds of modules in practice right? Those who use the new API, and those who don't. We need a way to explain that difference.

Going back to this, why would the explanation of modules need to explain the Workspace type? It's easy enough to say that modules are collections of functions that use the dagger API. If Workspace is part of the API, that's part of it. So some functions can dynamically work with the host if they choose to.

If it's a sandboxing concern, we can make decisions on how Workspaces are initialized. Maybe they have to get explicitly passed in from a parent module, rather than omitted as a default, as a defaultPath Directory could be today.

glad ruin
#

delurk my two cents:

  1. Agree with Kyle that not restricting or differentiating on who can accept a Workspace arg type is the most intuitive. Sandboxing/permissions need thought but from the perspective of what's simplest/most-intuitive, trying to make some differentiation here seems complicated.

  2. "I think fundamentally there are 2 kinds of dependencies": agree that's what's actually worth differentiating.

The dumbest simplest thing I can think of is that dagger.json has two fields:
```json
{
  "toolchains": [
    {"name": "foo", "source": "github.com/foo/bar"},
    {"name": "baz", "source": "./some/local/path"}
  ],
  "dependencies": [
    {"name": "abc", "source": "github.com/abc/123"},
    {"name": "qaz", "source": "idk"}
  ],
  "source": "<existing meaning of source, a module that will have toolchains mixed in>"
}
```

Just a strawman to throw in the ring. But it seems like it would work out logically consistent.
nocturne owl
#

> If it's a sandboxing concern, we can make decisions on how Workspaces are initialized

I feel like there's something here so I'll try to describe it better.

What if Workspace can't be defaulted in code/pragma? That's part of what makes defaultPath hairy.

Then, the dagger CLI can pass in the context's "default" workspace if I call something directly and don't pass a Workspace value. In this case, if we're going module -> module, the Workspace would explicitly get passed through. Don't need to worry about transitive dependencies automatically getting something we don't want them to have.

hazy root
#

@glad ruin any thoughts on more strongly differentiating "workspace" from "module"?

hazy root
#

One blocker for me is which file to put workspace configuration like user defaults

hazy root
#

@nocturne owl is it also your preference to basically keep the current system? `dagger toolchain install` vs `dagger install`? With maybe some cosmetic changes but nothing fundamental?

nocturne owl
#

I think it's possible for us to get to a single dagger install, but it so heavily depends on pieces that don't exist yet that it's probably more pragmatic to take it 1 step at a time

hazy root
#

Because there are fundamentally 2 types of dependencies - that fundamentally will behave differently when transitive

nocturne owl
#

yeah what I mean is we keep them separate, get the first version of this Workspace API figured out (and shipped), and then see if we have different problems

hazy root
#

I can already tell you we have this problem though

#

(my bad for not having pushed my POC branch yet)

#

But the key is the A -> B -> C question: which workspace does C get? We can't answer atm

hazy root
#

Let me re-read it

nocturne owl
#

I can try to explain better if it's not clear!

hazy root
#

Any chance you could rephrase it in terms of an answer to the A->B->C question?

nocturne owl
#

Sure one sec

hazy root
#

The question is:

  • A imports B, B imports C
  • C.fn() expects a workspace
  • In the A workspace, user calls B.fn() which calls C.fn(). Which workspace does C.fn() receive?
  • In the B workspace, user calls B.fn() which calls C.fn(). Which workspace does C.fn() receive?
nocturne owl
#

So in the world of Workspaces being explicit:
A.fn(Workspace) gets a workspace, and the only way C.fn(Workspace) can receive it as a dependency is if it's explicitly passed in.
So A(a Workspace) -> C.fn(a). It is not possible to call C.fn() without an argument as a dependency

and B.bar() calls C.fn().
So for A to call B.bar(), it would be A(a Workspace) -> B.bar(a) -> C.fn(a)

If I call B.bar() directly, not through code, we'll have B.bar(b Workspace) -> C.fn(b).

It always goes back to the main caller context
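
A minimal sketch of that explicit-passing flow, written as plain Python functions. The `Workspace` class and the function names `a_fn`, `b_bar` and `c_fn` are hypothetical stand-ins for A.fn, B.bar and C.fn above, not real module code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workspace:
    """Hypothetical Workspace value; only the root path matters here."""
    root: str


def c_fn(ws: Workspace) -> str:
    # C never discovers a workspace on its own: it only sees what it is handed.
    return f"C.fn ran against {ws.root}"


def b_bar(ws: Workspace) -> str:
    # B forwards the workspace it received; it does not create one.
    return c_fn(ws)


def a_fn(ws: Workspace) -> str:
    # A is the entry point here, so the workspace is A's caller context.
    return b_bar(ws)
```

Calling `a_fn(Workspace("repo-a"))` threads `repo-a` all the way down to `c_fn`, while calling `b_bar(Workspace("repo-b"))` directly hands `repo-b` to `c_fn`. Calling `c_fn()` with no argument is a `TypeError`, which mirrors "it is not possible to call C.fn() without an argument as a dependency".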

#

hopefully that's more clear and not just pseudocode overload 😂

hazy root
#

So the user has to explicitly pass their workspace? Wouldn't that break `dagger check`?

nocturne owl
#

that's where the last part comes in

> Then, the dagger CLI can pass in the context's "default" workspace if I call something directly and don't pass a Workspace value. In this case, if we're going module -> module, the Workspace would explicitly get passed through. Don't need to worry about transitive dependencies automatically getting something we don't want them to have.

So the caller can specify the workspace, or it can be automagically contextual

hazy root
#

OK so the distinction would not be between different kinds of dependencies, but different execution contexts: is the client a module basically

#
  • If client is a module: no workspace in the current context. Client must pass a workspace itself
  • If client is not a module: any function that expects a workspace automatically receives one, with access to the current client's workspace
#

Or even simpler: the client registers a workspace, or doesn't. The engine doesn't check for "is client a module", it just checks for "did this client register a workspace to access"

Modules when they connect, don't register a workspace
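
A minimal sketch of the "did this client register a workspace" check, in Python. `Client`, `MissingWorkspaceError` and `workspace_for_call` are hypothetical names for illustration only; the real engine would decide this at a different layer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Client:
    """A connected client. workspace is None when none was registered,
    which is the case for a module's own function runtime."""
    name: str
    workspace: Optional[str] = None


class MissingWorkspaceError(Exception):
    """A function expects a workspace but none is available."""


def workspace_for_call(client: Client, explicit: Optional[str] = None) -> str:
    """Pick the workspace for a function that expects one.

    An explicitly passed workspace always wins. Otherwise the engine
    falls back to whatever the client registered. It never asks
    whether the client is a module."""
    if explicit is not None:
        return explicit
    if client.workspace is not None:
        return client.workspace
    raise MissingWorkspaceError(f"client {client.name!r} registered no workspace")
```

With this shape, a CLI client that registered `repo-a` gets it injected automatically, while the runtime of module B, which registered nothing, has to pass a workspace to C itself or the call fails.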

#

CLI could get flags for explicit control over what workspace to register.

  • Default: workspace is the current git repo; fallback to current module; fallback to error
  • `--workspace=PATH` -> explicitly set the workspace boundary
  • `--no-workspace` -> explicitly disable workspace. Modules would use this
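
A minimal sketch of that resolution order, in Python. The function name, its parameters and `NoWorkspaceError` are hypothetical; the flags are the ones proposed above and do not exist in the CLI today. One assumption not stated in the thread: if both flags are given, `--no-workspace` wins.

```python
from typing import Optional


class NoWorkspaceError(Exception):
    """No git repo and no module to fall back to."""


def resolve_workspace(
    git_root: Optional[str],
    module_root: Optional[str],
    workspace_flag: Optional[str] = None,  # value of the proposed --workspace=PATH
    no_workspace: bool = False,            # the proposed --no-workspace
) -> Optional[str]:
    """Return the workspace path to register, or None to register none."""
    if no_workspace:
        # Assumption: disabling takes priority over an explicit path.
        return None
    if workspace_flag is not None:
        return workspace_flag
    if git_root is not None:
        # Default: the current git repo.
        return git_root
    if module_root is not None:
        # First fallback: the current module.
        return module_root
    # Final fallback: error.
    raise NoWorkspaceError("not in a git repo or a module; pass --workspace=PATH")
```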
#

I think this 👆 could work.... It's just missing a pragma somewhere to explicitly say "this type / this function needs the client's context please". Basically like `+toolchain`... but probably not the right word. Otherwise it feels too magic.

nocturne owl
#

Maybe I'm still misunderstanding Workspace because I feel like we're talking about completely different things. It sounds like you're saying a Module defines a Workspace and I'm saying the caller provides a Workspace to a Module

hazy root
#

I guess I should have more explicitly distinguished "module B" from "module C" 🙂

#
  • `dagger check` calls `B.fn()`, in the context of A: client provides a workspace: the git repo for A
  • `B.fn()` calls `C.fn()`. The client is module B's function runtime: it provides no workspace.

This allows clean handling of transitive dependencies

#

@nocturne owl TLDR yes clients define the workspace, not modules. But sometimes, the client runs recursively inside a module 🙂 In that case, in-module client defines no workspace. I think this works.

#

Ah no there's still a problem: there are 2 possible kinds of B->C dependencies:

  • B needs to call C functions. It's a "runtime dependency". Even when calling in the context of B's git repo, C should not get workspace access
  • Developer of B needs to call C functions directly, eg. to check. It's a "dev dependency".

If we don't differentiate, it's weird.

But if we call them "dev dependencies" and "regular dependencies", then that's confusing for the 99% of dagger users who will never actually develop dagger functions, but have to type `dagger install --dev`, otherwise nothing works like they expect

#

So far: negative progress for me on this issue

nocturne owl
#

I'm still not connecting with this, sorry. I can type out a real world scenario in a sec. But why wouldn't B pass a Workspace to C as an argument? Doesn't that avoid all of this?

hazy root
# nocturne owl I'm still not connecting with this, sorry. I can type out a real world scenario ...

```
$ cd ~/src/mymodules/B
```
```
$ dagger check -l
c:lint
```
```
$ dagger check
```

--> This calls `C.lint()` with workspace auto-set to `~/src/mymodules`

As the developer of module B, this may or may not be what I want

- Option 1: *no*, not what I want. C is actually a runtime dependency to B: one of my functions calls `C.lint` with explicit arguments. I have no use for `C.lint()` for my own project, it's weird that it's there

- Option 2: *yes*, this is what I want. C is a dev dependency to B. My functions don't call functions from C. I just need `C.lint` as a check in my project to help me develop B

Problem: how to differentiate
#

Again the term "dev dependency" is perfectly accurate technically, BUT confusing for our users because they don't expect anything other than dev dependencies - runtime dependencies are for power users.

nocturne owl
#

But that sounds entirely like enabling/disabling module's checks, not about whether they get the correct workspace

hazy root
#

It's about enabling / disabling based on intended use of the dependency - because there are fundamentally 2 kinds of dependencies

nocturne owl
#

ok using our own modules right now as an example, we have modules that we use as dependencies and as toolchains, and they're useful as both

high nimbus
#

I think for me, it's a question of who you want to make the workflow most simple for. I guess if it's developers who are consumers of a fully prepared (by DevOps folks) repo with checks and maybe some tools for local dev/test, I'd want them to have the cleanest/shortest/simplest commands with least cognitive load. dagger check

Next level of complexity is for devs who install, say, the jest toolchain on their own. They can take a little more complexity. dagger install jest and maybe config defaults?

Next level is for DevOps engineers, toolchain/module devs. More flags, more complexity, because it's my job to manage that complexity for devs.

hazy root
#

"check" is just one way to help me develop my project. Tomorrow it will be "ship", "up", "fix", whatever. Also applying user defaults.

#

Honestly this makes me want to create a separate config file at the root of the repo (or wherever), just for setting up dagger and configuring which modules to enable for which paths in the project, with which values - and nothing else

It's barely a "dependency", it's just a project config.

Then `dagger.json` is for modules, it has dependencies, the normal kind: runtime dependencies - functions calling functions, no workspace magic.

If you want to use other dagger modules to develop your module project which may or may not be a dagger module, that's not a dependency at all. You just add that to your simple project config, at the root of the repo or wherever. Orthogonal to your dagger.json

nocturne owl
#

reading this as "If you want to use other dagger modules to develop your module project", I agree, but I don't know if I'd separate the dependencies (into separate files). More like a config to refine how to check/ship/up/fix. Especially when you have a monorepo with subprojects that you want to run subsets of the checks on for each one

#

As for keeping the dependencies in separate lists, I think it's honestly fine. But I don't think there's any reason to explain "there are 2 types of modules", it's more like there are 2 ways to install a module. But it's all the same modules. I think that's the part I've been stuck on

hazy root
#

Yeah I just think it might actually be simpler, instead of having 2 types of dependencies in dagger.json, to have 2 different config files - one for modules and one for project/workspace config, and each has just one kind of dependency: basically workspace->module vs. module->module. And in the case of workspace->module, we wouldn't even use the word "dependency". Just like when you configure nginx to use this or that module, it's not a dependency. Just a config that installs/enables an nginx module. Just semantics

nocturne owl
#

so (ignoring the actual ergonomics for a second), we might run `dagger workspace install jest` (install a module into the workspace) or `dagger dependency install alpine` (install a module to use in our dagger code)

nocturne owl
#

Got it. I'm pretty indifferent on the 2 config files. I agree we need one to configure things, but no real opinion on whether the dagger.json would still track the module or not.

opaque zealot
# nocturne owl so if we're optimized for the user and not the module dev (and we keep 2 separat...

I was trying to catch up on the thread, and was thinking we should swap the commands (then this message)
I wonder if it shouldn't be something like dagger develop install <lib-I-want-to-integrate-in-my-module>
That way we keep that under the dagger develop command that already exists and we stay in the same phase, that is writing custom functions.

And to have `dagger install <module>` that exposes checks, generators, etc. makes sense as an end user I think. I'm installing a module, it provides me features I can use right away. Contrary to installing a lib that I need to integrate into my own custom module to make available.
