#Workspaces aka "modules v2"
Let's get the prototype going
Overall I agree with the whole proposal. First questions I have are
- What would the config.toml look like with some customizations?
- Why .dagger/config.toml instead of dagger.toml or dagger_workspace.toml?
- Nested workspaces shadow the parent directories, does that mean they inherit anything those directories add? including toolchains?
- What would the config.toml look like with some customizations?
tbd. We should combine customizations & user defaults; from the user's point of view the distinction is arbitrary.
- Why .dagger/config.toml instead of dagger.toml or dagger_workspace.toml?
dagger.toml feels too close to dagger.json
dagger_workspace.toml feels too long
.dagger/config.toml feels kind of like .github or other existing config formats. Plus it has "config" in the name
.dagger.toml is another good option.
- Nested workspaces shadow the parent directories, does that mean they inherit anything those directories add? including toolchains?
You mean modules
Good questions, I would say no by default but configurable.
LGTM, ship-it
In terms of the meat of the proposal I'm onboard. Agree with Kyle that the sooner we have a prototype the better
- I technically have hang-ups on why we couldn't just have a dagger.toml with different sections indicating whether the current thing is a workspace or a module, dropping dagger.json entirely. But I don't actually care very much. Don't block anything on that.
Part 2 -> #1468165016019669166
I think this LGTM. Does this fully get rid of terms like toolchain and blueprint etc? Looks like it does (and I like it), just wanted to confirm.
Would an analogy for monorepos be nx workspace? I am not an nx expert but it's hugely popular in my company.
yes it would remove the word "toolchain" in favor of an expanded and sharpened definition of "module". All modules would have the "toolchain" capability.
Blueprints I don't know, haven't explicitly defined how they would be impacted. But probably they would be less needed?
I was under the impression toolchains replaced blueprints
In the current design they are complementary:
- Toolchains are for no-code configuration of your project. End user chooses which toolchains to install based on their needs. Reduces the need to develop their own dagger module from day one. This is a mainstream feature - every user is expected to install at least one toolchain on initial setup.
- Blueprints are for centralized configuration of multiple projects. Unlike toolchains, a blueprint is monolithic: it configures your whole project, take it or leave it. This is a niche feature: centralized platform teams in charge of daggerizing many projects with a cookie-cutter stack, can use blueprints.
--> Both features are definitely disrupted by this new design proposal, for sure
I think I like it. The distinction between workspace and module seems to solve a lot of the recent confusion between modules, checks, toolchain, etc.
That seems nice, and I like the "no init" part, just dagger install and voilà.
Regarding dagger dependency add I still wonder if we shouldn't put that under dagger develop. Like dagger develop add-dependency .... So we nest everything regarding developing modules under one place, but it might be a bit long to type/non intuitive. The advantage of putting that under develop is that for "end users", the ones interested in workspaces and not in developing modules, there's one single sub command to ignore, and develop seems a good name.
Thinking while typing, should we also move dagger init under develop? dagger develop init to say we are explicitly working on creating a module.
Blueprints are for centralized configuration of multiple projects. Unlike toolchains, a blueprint is monolithic: it configures your whole project, take it or leave it. This is a niche feature: centralized platform teams in charge of daggerizing many projects with a cookie-cutter stack, can use blueprints.
Functionally they are the same, aren't they? Sure, toolchains recently gained some customization options, but, I could create a monolithic toolchain and that would be exactly like a blueprint right?
Yeah I agree dagger develop install is one possible option. I don't like dagger develop because it's kind of a catch-all command... And much of it will be replaced with dagger generate.
But: dagger dependency add is NOT my final preference. In fact it was made up by claude code
Design know: Part 1 - Module vs. Workspace
I've been thinking about this and I think the only way to sequence this without bricking our own CI is to
- build the ability to read a config.toml to load toolchains and checks as we do today
- release it
- migrate our toolchains and customizations to the config.toml
- refactor that into Workspace + the rest of this proposal
I'm updating the gist to add details, and get us ready for implementation
Countdown to bikeshed
Terminology check: custom modules vs reusable modules?
Another option: project modules vs reusable modules
Any preference @brittle wharf ?
custom module, or project module?
project feels more precise (technically any user-written module could be considered a "custom module")
you sound like Claude lol
except my intelligence is not artificial 
@here updated with details https://gist.github.com/shykes/e4778dc5ec17c9a8bbd3120f5c21ce73
Looks great, I think it captures everything we talked about
If no workspace exists yet: If in a git repository, creates .dagger/config.toml at the repo root
Is there a way to explicitly create a workspace other than install? Would I manually create a .dagger/config.toml? Otherwise there's no way to create nested workspaces
Find workspace: Walk up from current directory looking for .dagger/config.toml. If not found, workspace is empty (no modules installed).
Child .dagger/config.toml shadows parent
Worth mentioning that to use the child workspace you should run dagger from that child directory
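The lookup rule quoted above (walk up from the current directory, nearest .dagger/config.toml wins, so a child shadows its parent) can be sketched in a few lines of Python. This is only an illustration of the rule as described in the thread, not the engine's implementation; the name find_workspace is made up for the example.

```python
from pathlib import Path
from typing import Optional

# Location of the workspace config relative to a candidate directory.
CONFIG_RELATIVE_PATH = Path(".dagger") / "config.toml"


def find_workspace(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` holding .dagger/config.toml.

    Returns None when nothing is found, which the thread describes as an
    empty workspace (no modules installed).
    """
    current = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / CONFIG_RELATIVE_PATH).is_file():
            return candidate
    return None
```

Because the search stops at the first match, running from inside a child directory that has its own .dagger/config.toml never reaches the parent's config.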
Find workspace: Walk up from current directory looking for .dagger/config.toml. If not found, workspace is empty (no modules installed).
- some mention of the --no-workspace flag
@coral kernel @brittle wharf I am still unsure what to call the directory with a .dagger dir
Is it a workspace? But then that contradicts "the repo is the workspace"
dagger workspace config?
I worry about a scenario where we say that it is not a workspace, but in practice everyone calls it a workspace anyway, causing confusion for everybody
alternative: that is the workspace, which means there can be multiple workspaces per repo, and we find a name for the "boundary"
aaand a bird just shat on my head
Left side or right side ? left is luck I heard (sorry, it sucks...)
to me this makes sense because it's where we said 'project modules' would go
so workspace always == entire repo, .dagger/ denotes a project, often they're 1:1 (in small/non-monorepo case)
Is it a workspace? But then that contradicts "the repo is the workspace"
Isn't it "the repo is the workspace unless you have a .dagger/config.toml that says what the workspace really is?"
What about "project modules" ?
I don't think one automatically leads to the other
my other choices were "bespoke module", "custom module"
artisanal
i can say "a module specific to your project" without formally defining what is or isn't a project
I was gonna say exactly this, I really don't think I'll ever remember which is which 
ok different angle:
- .dagger marks a workspace
- workspace config can be a local path: must always be relative to the config file
- workspace API: paths are rooted in the workspace. If you need other files in the repo from outside the workspace, that's a workspace config we can add, to "mount" external files in the workspace
- For 90% of installs, the workspace is the repo
If you need other files in the repo from outside the workspace, that's a workspace config we can add, to "mount" external files in the workspace
Something like include = ["../../package.json"]? Would that mean my Workspace source is a Directory where my workdir has some parent dirs?
with a separate workspace config it becomes quite natural
more like path aliases:
[fs.aliases]
".github" = "../../.github"
"vendor/foo" = "../../libs/foo"
ok i prefer that. It's a bit like go replace
yeah previously there was nowhere to put such a config without compromising the portability of modules. No such worry with workspace config
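A rough sketch of how the alias idea above could be resolved, and how an alias that escapes the workspace could be detected so the user can be warned. Everything here is hypothetical: the [fs.aliases] table is a throwaway example from this thread, the function names are invented, and which directory relative targets resolve against is passed in explicitly because the thread has not settled it.

```python
import posixpath


def resolve_aliases(base_dir: str, aliases: dict[str, str]) -> dict[str, str]:
    """Map each alias name to a normalized absolute target path.

    `base_dir` is the directory that relative targets are resolved against
    (an assumption of this sketch, supplied by the caller).
    """
    return {
        name: posixpath.normpath(posixpath.join(base_dir, target))
        for name, target in aliases.items()
    }


def is_outside(workspace_root: str, path: str) -> bool:
    """Report whether `path` lies outside `workspace_root` (both absolute)."""
    root = posixpath.normpath(workspace_root)
    return posixpath.commonpath([root, posixpath.normpath(path)]) != root
```

With base_dir "/repo/apps/web", the alias ".github" = "../../.github" resolves to "/repo/.github", which is_outside flags as leaving the workspace. That is the explicit, warnable case discussed in the thread.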
seems fine for now I guess. I anticipate we'll have situations pretty quickly where that won't work but we can deal with it when it comes up
let's walk through an example! better to stress test now
Regarding the configuration reference: we naturally ended up adding pinnedRef in the dagger.json. In above spec, go = "github.com/dagger/go-toolchain@v1.0", where would it live ?
another benefit of this "replace" approach is that you could even go beyond the repo, and alias files from your local system (assuming you trust the workspace)
oh that's a throwaway example, not final schema. we should finalize that next
(as in now)
will focus on that as soon as bird shit is removed from head
I still think the replace directives make it non portable, but at least it's explicit.
it makes the workspace non portable. But modules remain portable
and workspaces don't aspire to be portable
Well not 100% true
i still think it's nice for a workspace to be portable
yes agreed. but we already want things like 1passwords urls in there - with user defaults
But because it's explicit we can warn the user about escaping the portability/security of the workspace
Yeah although i distinguish between resources you are allowed/denied access, and completely swapping the resource to something else (more of a security risk)
can you give an example of the difference ?
scenario:
I'm in a subproject of a monorepo. I need to include a few repo-wide configs for my tools to use, such as a .prettierignore. That's required for my prettier module's dynamic filtering. If I include = ["../../../.prettierignore"], the content of that file still expects to be at a specific relative path to my subproject.
If it's a named mount, my reusable toolchain will have no idea where to put it.
@coral kernel I was picturing fs.aliases.".prettierignore" = ".prettierignore" so the issue is not where to put it, but the fact that the file contents are tightly coupled to their location in the repo. So aliases would break stuff all the time
thats what I was thinking of when i asked this #1468070450524459029 message
like if everything is in a directory at the correct relative path and we set the workdir based on the workspace's location, its all good
maybe its more complicated and doesn't actually help but maybe we could include other workspaces instead of files
it would be a problem in dagger.io too because we have go.mod replace rules that point to things like ../api/ and ../daggerverse/ - so those always have to be at the same relative location
sure.
- 1password urls: if you don't have the right one, you just don't have access to the secrets and thus to the resource.
- llvm = ../../llvm: it's not about allowing/denying access, it's about a potentially sensitive folder (because outside workspace) being given without warning to an untrusted module.
From a security standpoint 1 and 2 are the same. Workspace configuration loads modules, and gives them access to sensitive resources (credentials, out-of-workspace directories). So if you're in an untrusted workspace (for example a checkout of an evil repo), either of those are equally bad.
the reason I said this, even though it sounds more complicated, is because its a layer on top of Directory that can be smarter. So if I include a workspace 2 levels up, thats the root of the dir, but we can signal in the resulting Workspace that my workdir is where the user wants it to be. That the Module can still mount and filter a single Directory and set the workdir
ok new NEW angle:
- .dagger marks a workspace
- In workspace config (outside module sandbox), paths have the normal meaning: local paths are relative to the config file; absolute paths are relative to your local host filesystem.
- NOTE: this means workspace config can reference files outside of the workspace root. Whether to allow this is a policy decision. Probably we want to deny it by default, with possible overrides if you trust the workspace and really know what you're doing. But in any case: no namespacing. / means / in the system
- In workspace API (inside module sandbox): we distinguish the workspace root (where .dagger lives) from the workspace's git context (the git repository inside which the workspace lives). The API lets you access one or the other. [nah, then module dev doesn't know which one to use]
- In workspace API, 2 options:
OPTION 1: paths are rooted in the workspace root. If you want workspace to be rooted further up, you configure that in the workspace config. root="../..". Maybe a special variable to mean the git root, since that would be common: root="$gitroot"
OPTION 2: paths are always rooted in the git root... And you rely on filters. So a workspace is always a view of the repo
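To make the two options concrete, here is a small sketch of how a path passed to the workspace API might map onto the repo under each one. It is not the proposal's implementation: the function names are invented, and the root setting and the $gitroot variable are only the ideas floated in this thread. Option 2 is the same resolution with the git root always used as the root.

```python
import posixpath
from typing import Optional

GIT_ROOT_VARIABLE = "$gitroot"  # special value suggested in the thread


def effective_root(workspace_dir: str, git_root: str,
                   root_setting: Optional[str] = None) -> str:
    """OPTION 1: the workspace root, optionally moved by a `root` setting."""
    if root_setting is None:
        return posixpath.normpath(workspace_dir)
    if root_setting == GIT_ROOT_VARIABLE:
        return posixpath.normpath(git_root)
    # A relative setting such as "../.." is resolved against the workspace dir.
    return posixpath.normpath(posixpath.join(workspace_dir, root_setting))


def resolve_api_path(root: str, api_path: str) -> str:
    """Resolve a sandbox path against `root`.

    Inside the sandbox "." and "/" both mean the root, and ".." cannot
    climb above it.
    """
    relative = posixpath.normpath("/" + api_path).lstrip("/")
    return posixpath.join(root, relative) if relative else root
```

Under option 1 a workspace in /repo/cloud sees "/" as /repo/cloud unless its config moves the root; under option 2 the caller would always pass the git root, and narrowing the view is left to filters.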
what does / meaning host rootfs / buy you? seems like that would always be a footgun to me
option 2 sounds the most flexible but I still think workdir relative to git root is important to preserve
/ always meaning git repo root i find pretty ergonomic, i would always prefer that to ../../../../ in those fringe scenarios (but even ../ tbh)
In the context of workspace config specifically: user doesn't know how modules work. As far as they know they're just configuring another devops tool. So it makes sense that all paths work in the familiar way.
Inside the module sandbox: different story. [actually not always true] "path physics" are different, eg. workspace.Directory(".") is the same as workspace.Directory("/") anyway. Directory.directory("/") vs Directory.(".") etc
workdir relative to git root is important to preserve
In the sense that modules should be able to know the client's current workdir?
relative to the root, yeah. Specifically with child workspaces. As part of the Workspace API
Mmmm you mean for things like per-subproject config like we were exploring yesterday?
I initially liked the idea of giving the module access to the client's workdir. The more info the better right? But I wonder if it could be a footgun. For example now the module dev has to decide - what do I do with the workdir? Do I scan for all go.mod in the workdir, or in the whole workspace? What's the convention? etc
yeah thats what I mean by child workspace https://gist.github.com/shykes/e4778dc5ec17c9a8bbd3120f5c21ce73#source-code-location
Workdir is where a reusable module should run its tools, like go test or whatever. So a module would probably mount the Workspace.source and then set the workdir to mountpath + Workspace.workdir
Disagree, you want the go module to find all go.mods in the workspace, expose each as an object, and for each of those objects, the corresponding test() function mounts that go.mods directory in the container running go test. IMO the client's actual workdir should be irrelevant in all that
Where the client workdir is important is for filtering artifacts by path. For example dagger check --path=. which means "only check artifacts which trace back to my current workdir". That could even be the default... But the difference is that the module code doesn't know the workdir -> it's the engine doing the filtering
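The engine-side filtering described here can be pictured as a plain prefix test on the path each artifact traces back to. The sketch below assumes a made-up Artifact record with a workspace-relative path; neither the type nor the function is part of the proposal, and --path is a flag floated in this thread rather than an existing one.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Artifact:
    name: str  # e.g. "go:test" (hypothetical naming)
    path: str  # directory the artifact traces back to, rooted at the workspace


def filter_by_path(artifacts: list[Artifact], path_filter: str) -> list[Artifact]:
    """Keep only artifacts located at or below `path_filter`."""
    base = PurePosixPath(path_filter)
    # is_relative_to compares whole path components, so "/cloudy" is not under "/cloud".
    return [a for a in artifacts if PurePosixPath(a.path).is_relative_to(base)]
```

The module never sees the client's workdir in this picture: it reports every artifact, and the engine applies the filter afterwards.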
It's not about check discovery, it's about the context where the subproject exists. If my module works dynamically across the whole repo there's no use for child workspaces. But if I have a nextjs project in a child workspace at /foo/bar and a /.prettierignore, I'm expected to run prettier check from /foo/bar. And if I run dagger check from /foo/bar which has its own workspace, I don't want it to go find checks in other places
In the very last option I suggested (.dagger marks a workspace, but all workspaces are rooted in repo root) then it boils down to filtering.
- Exclude by default: workspace /foo/run must explicitly include /.prettierignore in its config.
- Include by default: workspace /foo/run must explicitly exclude everything except what it needs: /foo/run and /.prettierignore
I don't think it's about filtering though, it's about intent. I want to run tools in /foo/run regardless of what else is included in the workspace, so the Workspace API must tell the module the workdir relative to the repo root
sorry I'm confused. We can discuss it live tomorrow maybe?
There's a lot of assumptions that seem abstract to me
yeah that works! I'm not sure how the filtering and Artifacts change the underlying problem with child workspaces
We'll have to start there - I don't know what the problem is with child workspaces. And also I spent zero time thinking about child workspaces so far (I think claude actually snuck in that term)
maybe another example like @brittle wharf mentioned.
i'm in the repo dagger/dagger.io
working on cloud in /cloud. It has its own .dagger but we're in a subdir of the repo root
I have to include ../api because the go.mod has a replace that expects that.
So now my resulting workspace directory looks like
/
/api
/cloud <-- workdir
any checks or tools my checks might run should be run from /cloud in the workspace's resulting directory. I never need them to care that they could build or test /api
In that particular example, ideally the module follows the replace directive on its own, so there's nothing you need to do
(That probably doesn't address your point though)
ok but from a module author's point of view. I have a directory, something like Workspace.directory. I mount it in a golang container at /app. When I run go test, how do I know I'm supposed to run it from /app/cloud unless the workspace tells me the workdir?
Because the module is scanning the whole workspace for go.mod, and each match will be one entry in a dynamic list of GoModule objects. Each with its own test() function
So IMO the module should focus on building the whole list of possible go modules to test. And let the engine filter based on user workdir, flags etc
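As a rough picture of the discovery step argued for here, the sketch below scans a directory tree for go.mod files and returns one entry per match. It runs against a local directory for simplicity; a real module would go through the workspace API, and the function name is invented for the example.

```python
from pathlib import Path


def find_go_modules(workspace_root: Path) -> list[str]:
    """Return the workspace-relative directory of every go.mod, sorted.

    Each entry would back one GoModule-style object with its own test()
    function; deciding which of them to run is left to the caller.
    """
    return sorted(
        go_mod.parent.relative_to(workspace_root).as_posix()
        for go_mod in workspace_root.rglob("go.mod")
    )
```

In the dagger.io example from the thread this would list both api and cloud, and a path filter applied afterwards would narrow the run to cloud.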
Right so thats what I mean, we've then lost the user intent. I ran dagger check from /cloud which is its own workspace. I had to include /api as a side-effect of the go.mod's replace, but I'm not asking dagger to go run go test in /cloud AND /api, I'm only concerned with /cloud here
What I'm hoping is that user can use path filtering for that. dagger check --path=. for example
In this specific example, api and cloud are tightly coupled, so it makes sense that you can check both in the same workspace
But if you only want to check /api then you would just say that: dagger check --path=/api
I'm pretty concerned about losing the user intent and trying to be too smart though. That would be a worse experience than just running go test myself from /cloud because thats what i want to test
(dont want to derail, i just like the idea of dagger check <pattern> and that pattern can be either a path pattern (starts with / or .), or an artifact pattern like go:*)
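The pattern rule suggested here is simple enough to sketch: anything starting with "/" or "." is read as a path, everything else as an artifact pattern. The helper names are invented, and glob-style matching for artifact patterns is an assumption based on the go:* example.

```python
from fnmatch import fnmatchcase


def classify_pattern(pattern: str) -> str:
    """Return "path" for patterns starting with "/" or ".", else "artifact"."""
    return "path" if pattern.startswith(("/", ".")) else "artifact"


def artifact_matches(pattern: str, artifact_name: str) -> bool:
    """Case-sensitive glob match of an artifact name against a pattern like "go:*"."""
    return fnmatchcase(artifact_name, pattern)
```

One consequence of this rule is that an artifact name could never itself start with "." or "/", which may or may not be acceptable.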
Separate topic: found a way to simplify module initialization based on feedback from @brittle wharf @proven viper @restive swift earlier today
pushing to the gist
Well the good news is with the testcontainer pattern you can run go test
But there might be more to check in /cloud than the go tests.
So dagger check with filtering is a nice way to aggregate
If you specifically want to run go test in one module, in a super narrow fast devloop, dagger can't really compete with that anyway
I mean if we were talking about a workspace at the repo root and we happen to want to run just the checks for /cloud, passing a filter makes sense. But if I'm in /cloud and it has its own workspace, it feels weird to not respect that
But if /cloud is tightly coupled to /api, I question that you should have a separate workspace in /cloud in the first place. It might not be the right use of a workspace
IMO it's more for organization split, rather than delineating the software stack
"I don't want to ask 50 people for permission to add .dagger in the whole monorepo. Easier for me to do it in /devrel/tiger-projects or whatever"
But I think we could make dagger check --path=. the default (ie. by default, only artifacts linked to your workdir are shown to you)
So dagger check -l by default would filter that way.
Also I was thinking that dagger check --git-uncommitted might be a good default...
So dagger check out of the box would only check artifacts that 1) are linked to your workdir, and 2) touched by your uncommitted git changes
that would make it more realistic for a devloop
then you add dagger check --watch and we're in business
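The proposed default (only artifacts linked to your workdir and touched by uncommitted changes) amounts to intersecting two path tests. The sketch below takes the changed files as input; in practice they might come from something like git status --porcelain, but where the engine would get them is not specified in the thread, and all names here are hypothetical.

```python
from pathlib import PurePosixPath


def is_under(path: str, base: str) -> bool:
    """True when `path` equals `base` or lies below it."""
    return PurePosixPath(path).is_relative_to(PurePosixPath(base))


def default_selection(artifact_dirs: list[str], workdir: str,
                      changed_files: list[str]) -> list[str]:
    """Select artifact directories that are both under the workdir and
    contain at least one changed file."""
    selected = []
    for directory in artifact_dirs:
        linked = is_under(directory, workdir)
        touched = any(is_under(changed, directory) for changed in changed_files)
        if linked and touched:
            selected.append(directory)
    return selected
```

An artifact outside the workdir is skipped even if its files changed, which matches the "I'm only concerned with /cloud" case raised earlier.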
I think that accomplishes the same thing so that works. I think workspaces can delineate software stacks as well as organizational splits, but we can wait and see how it plays out. I can depend on ../api and also have no control over that other project and therefore never care about running checks for it
yeah maybe there's a way to make that subtlety easily configurable in the workspace config? Like 2 different kinds of path filters?
Or "disable checks for these paths" or something
I'm doing a pass on the toml config schema
@lofty shadow I'm lifting the dependency schema as-is and transposing it into the toml file. Any opinion on preserving the list format from dagger.json, vs. making it a map?
From a user readability point of view, a map looks nicer.
no, doesn't matter
i think if we just prototype this with dagger.io we should have a pretty good feel for it as we go if somethings wrong
Agreed
made a bunch of edits to the proposal. See comments for changelog: https://gist.github.com/shykes/e4778dc5ec17c9a8bbd3120f5c21ce73
If you are looking for an extra wrinkle in path handling: #1468216262558617775 message
About git usage
This is actually a very interesting question
Git is not used at the repository root. We're using a private VCS instead. However, in order to make module-to-module dependencies work, we had to git init an empty repository at the root. It's not really used and doesn't contain anything meaningful.
Basically, they don't use git but have to mkdir .git right now to get a context root. Feels potentially related to and possibly solved by all the above, just flagging
really weird. I wonder if the findUp that looks for a .git could be configurable to look for some other marker? although configuring it would probably be more characters than just running mkdir .git
@coral kernel @brittle wharf looks like the first question for backwards compat is: should there be a way to configure a single "main module" in the workspace? So you could do either module= or modules= depending on the style you want?
And if we do: would we consider it a backwards compat thing only? Or a good and necessary convention that some people would continue using in their project?
my only idea for doing this in a backwards compat way was to have a config in the workspace like temporary_backwards_compat_main_module="/dagger.json". As ugly as possible to be clear that its temporary so people can migrate
One possible migration hook that could be pretty clean: we could deprecate the source field in dagger.json (since in the new model, a dagger module is a cleanly separated software package, the source directory can always be the module directory).
Then we can detect modules with a source field not set to ., and treat them as legacy "main modules", and from there, decide what to do with them
that would allow us to clearly separate our UX between "backwards compat" and "normal ongoing usage"
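The detection hook described above could look roughly like this. It is a sketch under assumptions: it only inspects the source field of dagger.json, it treats a missing field the same as ".", and the function name is made up. What to do once a legacy module is detected is left open, as in the thread.

```python
import json
import posixpath


def is_legacy_main_module(dagger_json_text: str) -> bool:
    """Return True when dagger.json sets `source` to something other than "."."""
    config = json.loads(dagger_json_text)
    # Assumption of this sketch: no `source` field means the module directory itself.
    source = config.get("source", ".")
    # Normalize so that "./" and "." are treated alike.
    return posixpath.normpath(source) != "."
```

A module with "source": ".dagger" would be flagged for the backwards compat / migration flow, while one with "source": "." would be loaded as a pure module.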
- For pure modules: clean break. If you want to call your module's functions, you need to either A) install it in a dev workspace, or B) explicitly load it with dagger call -m .
- For legacy "project modules": trigger backwards compat / migration flow (tbd)
yeah seems good. For legacy mode we should make sure its clear that they should migrate because we'll eventually take away legacy mode
Maybe we can auto-migrate
Then they just need to decide when they're ready to commit the diff
Or, a new dagger migrate command
And if you're running on a new engine version, auto-call dagger migrate maybe?
btw I love that I could just tell claude this:
Read the discord thread https://discord.com/channels/707636530424053791/1468070450524459029 (use the discord-reader skill)
Especially the last part about migration
And it did
the reason i was in favor of something like this is that there's no magic handling that isn't explicitly enabled. Setting that field isn't migrating, but its acknowledging that a migration needs to happen. That could happen with dagger migrate or something when they're ready
so the immediate effect of upgrading to v0.20.0 is that dagger call is broken. But you can enable backwards compat mode in a new .dagger/config.toml until you have a chance to do the migration
a temporary dagger enable-backwards-compat could even create that config
@coral kernel do we have good examples of real-world repos using Dagger, other than our own? open-source I mean
neat detail: migration could be handled by a module
Nice thanks. Looking at this one: https://github.com/bliporg/blip/blob/main/dagger/main.go
It's a nice clean example of a "main module"
You can do:
dagger call test-core
dagger call lint-core
dagger call format-core
dagger call lint-modules
dagger call lua-dev
Note: oh damn that's our friends Gaetan and Adrien. That startup shut down, but the project is still open source. Not actively developed though..
I thought you picked that on purpose lol, I see some commits of your own in there
OK so there's no 100% real-world prod CI setup in these examples
I did not
There's runme but I don't know if they use it as their CI, it looks more like a daggerverse module for others to get a build of runme (I doubt anyone uses that module if that's the case)
there's openmeter
oh yeah missed that one
ok that's the most real
Status update:
- I have the general backwards compat problem mostly scoped out
- I have a "harness" that I like for detecting backwards compatibility issues at runtime, reliably, and triggering a compat or migration flow
- I think we can handle backwards compat as a form of migration (to keep actual loading & runtime logic clean)
- BLOCKER: I don't know what to do with legitimate use of dagger call FOO in the main module. Should we keep a concept of "main module" or not. And if not: how to avoid breaking users in a painful way?
@brittle wharf @coral kernel if you're around I could use a sounding board
Basically:
- I'm scared to allow a concept of main module, and then we weaken our new model - now there's two ways to setup your project, more concepts to explain, fragmentation etc.
- But I'm scared to forbid it, because it breaks existing users in a way that can't be auto-migrated, and can feel like we're crippling the UX for arbitrary reasons ("before I could type dagger call playground, now I have to dagger call engine-dev playground" -> that's an actual situation we had on our own repo)
I guess what you're proposing @coral kernel is a middle ground, where we allow it but as a time-limited, only-use-for-migration feature
I'm in https://discord.com/channels/707636530424053791/1029799417324257420 if you want to dig in
(had a long sync conversation about this) Will update the gist & soon push a POC branch
Prototyping the migration tool as a dagger module
I've got a first migration tool POC that seems to work
@brittle wharf I'm reaching the point in my prototype where i need to define a Workspace type. I'm doing it from scratch for now, but if you have a branch somewhere with your own version of the type, I can take a look to harmonize early. no rush
pushed here, but consider it scraps to yoink - nothing's really working yet: https://github.com/vito/dagger/tree/workspace-api
i hit two hurdles: 1. how to have Workspace.directory route file syncs to the host from a module function call, and 2. the need for Dang to support constructors; I chose to focus on 2 so I could write some test modules (and because Mark needed it in https://discordapp.com/channels/707636530424053791/1468959453591240928), but haven't circled back
For 1. I was picturing the same general trick as eg. getting outer client's LLM creds, and loading user defaults from the outer client's .env. There are utilities for "escalating" to the right client higher up in the session, I forget the exact funct name
it's also missing the bit where a Workspace argument taints the cache key with 'never cache'
Ah right!
Small note on that: if the workspace is backed by a git source, there's still some caching, but it's the whole git tree that is part of the cache key
Update: I'm debugging, then will ping you all for a pre-review & discussion
@dim smelt did you end up folding my changes in? or should i try rebasing on your PR?
or more generally: lemme know how i can help!
No not yet
trying to get my prototype working enough for demo & review today
given you have tons of Context already, might be worth glancing at the pr
@topaz seal hello! I would be interested in your thoughts on the pr... It's still a prototype but should be functional enough to test the UX
Sure, I'll have a look. I was just asking Claude to summarize it to me
Ha ha
You can see the design proposal here: https://gist.github.com/shykes/e4778dc5ec17c9a8bbd3120f5c21ce73
I try to keep them in sync
But I recommend playing with it before reading the code
@brittle wharf want to try connecting parts 1 and 2 today? And ideally start working towards part 3
yea sounds good!
@brittle wharf want to take 11812 for a spin, see how the UX feels? It's rough but somewhat usable
Next I'm adding -C with support for local and remote paths
building now - what's a good sequence to try it out?
oh i'll follow the giant comment
This is my standard "playground":
dagger call \
engine-dev playground \
with-directory --path=./src/dagger --source=https://github.com/dagger/dagger \
with-directory --path=./src/demo-react-app --source=https://github.com/kpenfound/demo-react-app \
terminal
Things that should work:
- Detect a "legacy" project-module and tell you to migrate
- Install a module in the workspace
- Correctly load modules installed in the workspace
- Correctly handle dagger -m in harmony with workspaces
At the moment I'm fixing dagger install and dagger workspace info to detect workspace engine-side
quick comment (I haven't looked at the code to know if that's possible)
Would it be possible to always expand the first level of modules? If I install two modules in the workspace, I have something like that:
[modules]
jest.source = "github.com/dagger/jest"
plop.source = "modules/plop"
But it would be more readable and more open to edit to have this
[modules]
[modules.jest]
source = "github.com/dagger/jest"
[modules.plop]
source = "modules/plop"
I guess that's not high priority, just a feedback while playing with it
Very easy to do, since that's how it worked initially, but I thought it looked crowded with large numbers of modules. For example after migrating our main repo. I can change it back
oh yeah, it's only from some tests with 2 modules, so maybe not to change
also I think it mostly depends how much we expect to edit this file by hand. If everything is doable with dagger commands, then we don't care that much I think
i think i prefer the latter too - if there are a ton of modules i'm ok with the file being longer, vs. being deceptively dense and easy to overlook. low stock on this, just a first impression, opinion may change with real use
Makes sense. And yes @topaz seal the goal is to be read and written by humans
Now that dagger install and dagger workspace info are implemented engine-side, I officially have a top-level dag.Workspace() that returns the client's current workspace. I don't love it, but not sure how else to implement those features engine-side. wdyt?
It's basically the equivalent of dag.CurrentEnv(), I think?
@coral kernel you ready for some testing?
absolutely!
@coral kernel I could use some stress testing of the UX in 11812... Hoping to connect that to @brittle wharf 's workspace API so we can start writing our first experimental modules and try them e2e
@brittle wharf want a demo? I'm in lounge with Kyle
FYI @topaz zenith @crisp dove @restive swift this is the thread for my demo earlier
I just pushed dagger -C with support for remote workspaces
fixed: dagger functions should no longer print module dependencies
@dim smelt any reason it's core.DagqlWorkspace and not core.Workspace?
seems like it renames fine, can push
hehe - so I was curious about writing config.toml while preserving the user's comments, and found this - https://pkg.go.dev/github.com/neongreen/mono/lib/toml
the "hehe": i poked around the repo, and they use Dagger: https://github.com/neongreen/mono/tree/main/.dagger
that's vibe coded 🤷‍♂️ sorry
@brittle wharf I worry about the meaning of dag.Workspace() when called inside a dagger function.
np, pushed the rename
I guess it should mean "the current client's workspace", so if the client is a dagger function runtime -> the function's own "workspace"?
it's like dag.Host() all over again
is it that, or is it just empty + something the function can build, like a sandbox?
like dag.Workspace().WithRoot(Directory) or something, or dag.Workspace(Directory) to be explicit
Yeah so that's definitely the wrong name lol
Because as implemented it represents the client's current workspace (for purposes of installing modules into it, and getting its path)
I didn't get to the point where we might want to create a virtual workspace "from the outside" like we do for Env. Frankly I'd rather postpone that discussion to avoid breaking our brains. We seem to be doing fine without it so far
at least that's where my head was at so far
I guess dag.CurrentWorkspace() would more accurately mirror CurrentEnv()
i see - it's used internally with (dagql.Server).Select
Should be exposed to the client also, see dagger install
makes sense
and then the question would become: what does CurrentWorkspace() mean in a function?
(or do we not expose it at all)
exposing it does feel weird, the caching semantics are unclear, and arguably broken? nothing forces the function to be no-cache
when we chatted live about this earlier I had the same question for dag.Engine() since I'm working on the generated clients right now
- confusing: yes. and I don't know what to do about it
- broken: I don't think so. It would not give you the same workspace as the one you receive in argument. It would be your own client's workspace. When called from a dagger function, it would basically return an empty workspace I think
An alternative: we don't expose it at all.
Instead, the CLI finds another way to implement dagger install and dagger workspace info.
It could be special client params. I already defined a bunch in the PR:
- ExtraModules -> specify modules to load on the fly (for -m)
- SkipWorkspaceModules -> don't load modules from the workspace config (also used by -m)
- RemoteWorkdir -> for -C on git modules
Meh but it would be super weird to add client params to modify the workspace config, or query its path. That feels too much like a shadow API
I think dag.CurrentWorkspace() is the least bad option right now
- Matches an established pattern of CurrentEnv(), CurrentModule()...
- Gets the job done for CLI
- Confusion is manageable for module devs -> there will be lots of examples of the correct way; and if they call CurrentWorkspace() they will quickly see it's a dead end
- Caching I think is a non-issue -> calling from a module, whatever's in there, is as cached as the module itself
Caching I think is a non-issue -> calling from a module, whatever's in there, is as cached as the module itself
Am I correct that:
1. The Workspace type is the thing that gives you access to all the various client resources (local dirs, etc.)?
2. Calling dag.CurrentWorkspace() from a module function is not the expected way to use workspaces, it's just something that might technically be possible? And the expected way is that a Workspace type gets passed as arg through functions?
Just checking, because if calling dag.CurrentWorkspace() is actually the definitive way for a function to get access to the workspace, then I'm very concerned about caching behavior.
1 and 2 are correct yeah (afaict)
Yes indeed
@brittle wharf I pulled your changes, renaming to CurrentWorkspace() and moving on
sweet
Next I'm going to look into workspace config & migration
-> bring the migration code into a dagger migrate
-> experiment with a dagger workspace config to set / get workspace config keys
-> pre-fill config keys on install
@dim smelt looking into why module init fails, to get going:
❯ dagger-dev module init --sdk=github.com/vito/dang/dagger-sdk wstest
✔ connect 0.1s
✔ currentWorkspace(skipMigrationCheck: true): Workspace! 0.0s
✘ .moduleInit(name: "wstest", sdk: "github.com/vito/dang/dagger-sdk"): String! 0.3s ERROR
✘ export directory / to host /home/vito/src/dagger 0.0s ERROR
! failed to mount directory: no buildkit session group in context
OK I now need this. Will check it out
I got a similar error earlier. Probably a timing issue. Possible that the engine is trying to call session attachables before they're available. Might be worth looking through the git history of the workspace branch for fixes to similar issues
@dim smelt mind if i force-push a rebase? wanted to pull in erik's caching stuff in case it helps with any of this (update: it did not. but now i know that!)
This is pushed to PR 11812? I can have a separate cache-expert codex try to look
yeah, that repros on upstream/workspace
interestingly it's when it's mounting the codegen'd directory to export that it errors, not when it tries to write to the host - that part's fine, as demonstrated by other commands
ah, i think i know a fix, testing now (just call export through DagQL, instead of directly)
yep, that did it:
❯ dagger-dev module init --sdk=github.com/vito/dang/dagger-sdk wstest
✔ connect 0.2s
✔ currentWorkspace(skipMigrationCheck: true): Workspace! 0.0s
✔ .moduleInit(name: "wstest", sdk: "github.com/vito/dang/dagger-sdk"): String! 0.3s
Created module "wstest" at /home/vito/src/dagger/.dagger/modules/wstest
Installed in /home/vito/src/dagger/.dagger/config.toml
Yep that or wrap moduleInit in DagOp
is there a rule of thumb?
i think this one's mostly glue so maybe not worth dagoping itself? no idea
it's marked DoNotCache since it's an API used primarily to interact with the host
That distinction will go away soon too, but right now there's two worlds: outside of dag-op (normal world) and inside dag-op (inside of buildkit solver context)
I'd say be biased towards calling through dagql, wrapping in dag-op gets you persisted cache for the operation right now, but in the near future that won't matter and we can make any arbitrary thing persisted cache by flipping that on
I just pushed config hints on module install
now trying to add comments preservation with that neongreen lib
a few weird things going on over here, possibly compounding:
1. tried module init'ing a Dang module that takes a Workspace as an argument. except, the Workspace type is nowhere to be found. Directory, even Generator types work fine. inspected the schema JSON that gets passed to the Dang SDK, and Workspace is indeed missing, along with Query.currentWorkspace. can't find any type filters.
2. to rule out it being a Dang thing, decided to dagger module init a Go SDK module.
  - first weird thing: it didn't create a main.go for me
  - second weird thing: when I ran dagger develop -m ./.dagger/modules/gowstest it got stuck on 1 when trying to load the Dang wstest module - looks like when ANY module is broken, nothing can load
  - third weird thing: when I ran cd ./.dagger/modules/gowstest; dagger develop it generated everything just fine in that directory, but it ALSO clobbered the go.mod in the root of my repo (workspace?)
now that the go SDK is codegen'd, I'm seeing *dagger.Workspace is missing there too, so it's at least consistent
- when I try to reference *dagger.Workspace from my Go SDK module, it seems to fail silently; its entrypoint just stops appearing in dagger functions and reappears when I swap it back to a Directory arg
oh - Workspace is listed in TypesHiddenFromModuleSDKs. @dim smelt guessing unintentional, removing
Yeah it's an AI misunderstanding, it knows only the CLI currently uses or needs CurrentWorkspace(), but doesn't know about Part 2, so just hid everything. sorry
makes sense. moving on to this one, since i'm seeing the same behavior in Dang (referencing Workspace makes the entrypoint disappear)
got it - need to treat Workspace as a magical arg, it was being seen as an unsupported type and causing it to be hidden
pushed both
@dim smelt pushed Workspace.directory(), Workspace.file(), and arg injection.
some notes:
- added Workspace.clientID: String! - currently exposed through API, probably shouldn't be but left it for tinkering/troubleshooting in case it helps
- also pushed a revert of #1468070450524459029 message since arg injection made it redundant by just forcing Workspace args to be optional, but not 100% sure on the direction here
i can now dagger module init --sdk github.com/vito/dang/dagger-sdk dangmod as:
type Dangmod {
  let source: Directory!
  new(source: Workspace!) {
    self.source = source.directory(".")
    self
  }
  pub ls: [String!] {
    source.entries
  }
}
and run:
❯ dagger-dev call dangmod ls
✔ connect 0.1s
✔ loading type definitions 0.7s
✔ parsing command line arguments 0.0s
✔ dangmod(
┆ source: currentWorkspace(skipMigrationCheck: true): Workspace!
): Dangmod! 0.0s
✔ .ls: [String!] 0.0s
.dagger/
.git/
app.dang
beautiful
I'm very curious about dynamic filtering, will it get us real-world perf boosts
i'm only testing in a tiny side repo, but i basically didn't notice any overhead at all. i'll try putting more junk in there and filtering
new problem: if I create a new file foo and re-run, it's not picked up
ah ha - it's not the syncing, it seems like it's the ls being cached, but it's not showing as 'cached' in the trace. (tested via dagger call dangmod source terminal and running ls in there)
@lofty shadow do you have a preference between / alternative to these two approaches? (presented in one diff since they're conveniently near each other)
@@ -493,17 +503,6 @@ func (fn *ModuleFunction) CacheConfigForCall(
dgstInputs := []string{cacheCfgResp.CacheKey.CallKey}
- // Mix in the parent object's field content so that functions called on
- // objects with different field values (e.g. a Directory that changed on
- // the host) get different cache keys.
- if parentObj, ok := dagql.UnwrapAs[*ModuleObject](parent); ok && len(parentObj.Fields) > 0 {
- parentFieldsJSON, err := json.Marshal(parentObj.Fields)
- if err != nil {
- return nil, fmt.Errorf("marshal parent fields for cache key: %w", err)
- }
- dgstInputs = append(dgstInputs, hashutil.HashStrings(string(parentFieldsJSON)).String())
- }
-
var ctxArgs []*FunctionArg
var workspaceArgs []*FunctionArg
var userDefaults []*UserDefault
@@ -933,6 +932,18 @@ func (fn *ModuleFunction) Call(ctx context.Context, opts *CallOpts) (t dagql.Any
returnValue = returnValue.WithSafeToPersistCache(safeToPersistCache)
}
+ // If this function accepts Workspace args, set a content digest on the
+ // result derived from the actual output. This ensures downstream calls
+ // that reference this result get a different cache key when the result
+ // content changes (e.g. a Directory synced from a changed host workspace).
+ // Content digests propagate through the DAG via ID references.
+ if returnValue != nil && fn.hasWorkspaceArgs() {
+ contentDigest := hashutil.HashStrings(string(outputBytes))
+ returnValue = returnValue.WithID(
+ returnValue.ID().With(call.WithContentDigest(contentDigest)),
+ )
+ }
+
return returnValue, nil
}
the problem being solved: the directories synced at runtime and stashed in the module fields had no bearing on the cache key for downstream function calls
first attempt (-) fixes it by marshaling parent object fields and mixing it into the downstream func digest
second attempt (+) fixes it by making module functions content-addressed on their result payload, which is interesting, but i'm not sure if e.g. the objects encoded into IDs will have the caching semantics we want
my guess is there's a less obvious but cleaner third approach, or the second one is actually fine?
(making claude try harder, no rush)
Thinking, I am surprised this is an issue (though I believe you). I have memories of it and would have guessed we have tests for it, looking around
yeah, I'm a little surprised too, but I think I can see how we wouldn't notice; we've always forced cache keys to be part of arguments
The first attempt is not right in that it almost certainly shouldn't be parentFieldsJSON that we mix in, it should be the ID digests of any IDAble parent fields, but given that adjustment it does make sense
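A rough Go sketch of the adjustment described above: mix the digests of ID-able parent fields into the cache key, rather than the marshaled field JSON. The field struct, hashStrings, and cacheKey are stand-ins invented for this example; the real logic lives in CacheConfigForCall and uses dagql types.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// field is a stand-in for a module object field. Digest is non-empty only
// for ID-able values (for example a Directory); plain scalars leave it empty.
type field struct {
	Name   string
	Digest string
}

// hashStrings hashes the parts with a separator so ("ab","c") != ("a","bc").
func hashStrings(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// cacheKey mixes the digests of ID-able parent fields into the call key,
// in a stable (name-sorted) order. Scalar fields are skipped in this sketch.
func cacheKey(callKey string, parentFields []field) string {
	inputs := []string{callKey}
	sorted := append([]field(nil), parentFields...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	for _, f := range sorted {
		if f.Digest != "" {
			inputs = append(inputs, f.Name, f.Digest)
		}
	}
	return hashStrings(inputs...)
}

func main() {
	before := cacheKey("ls", []field{{Name: "source", Digest: "dir-1"}})
	after := cacheKey("ls", []field{{Name: "source", Digest: "dir-2"}})
	fmt.Println("key changes when the directory changes:", before != after)
}
```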
and never exposed host() to module functions
is the PR up to date? need to check something
here's the latest attempt (took the liberty of pushing @dim smelt ) - https://github.com/dagger/dagger/commit/fcde6631a3db4255bbe507ebdb0b7d51c7c19ef6
just pushed now
(brb)
in core/schema/workspace.go I would have expected dagql.Func("currentWorkspace", s.currentWorkspace). to be dagql.FuncWithCacheKey("currentWorkspace", s.currentWorkspace, dagql.CachePerCall).
Right now every workspace has the same cache key I think
Which would explain this
I think you gotta make that change either way otherwise anything that accepts Workspace args won't get cache invalidated, i.e. right now I bet new itself is cached
lemme know if I'm missing something
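To make the stale-result symptom concrete, here is a toy Go model (not dagql): with one shared cache key, every lookup of currentWorkspace returns whatever the first call stored, while a per-call key forces re-evaluation on each new call. The cache type, resolve, and the "@callID" key format are all invented for illustration.

```go
package main

import "fmt"

// cache is a toy result cache keyed by a string digest.
type cache struct{ entries map[string]string }

func newCache() *cache { return &cache{entries: map[string]string{}} }

// getOrInit returns the stored value for key, computing it only on a miss.
func (c *cache) getOrInit(key string, compute func() string) string {
	if v, ok := c.entries[key]; ok {
		return v
	}
	v := compute()
	c.entries[key] = v
	return v
}

// resolve looks up "currentWorkspace". With perCall set, the key also
// includes the call ID, so each new call misses the cache.
func resolve(c *cache, perCall bool, callID string, read func() string) string {
	key := "currentWorkspace"
	if perCall {
		key += "@" + callID
	}
	return c.getOrInit(key, read)
}

func main() {
	state := "v1"
	read := func() string { return state }
	for _, perCall := range []bool{false, true} {
		state = "v1"
		c := newCache()
		resolve(c, perCall, "call-1", read)
		state = "v2" // the workspace changed on the host
		fmt.Printf("perCall=%v second call sees %s\n", perCall, resolve(c, perCall, "call-2", read))
	}
}
```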
I'm still working on incorporating migration
ah yep that makes sense, and fixed it
- will roll back the rest
i think that might make https://github.com/dagger/dagger/commit/307f56e9c5ada168f18457eec2a02a6e0879611a redundant too? trying a revert
yep - works without it
I'm glad because those were my pure human tokens, so I still have a purpose in this world 🥲
@dim smelt pushed a Workspace integ test suite, some of it's intentional and the rest is just documenting current behavior - for example the fact that you can dagger call a Workspace-taking module and it gains access to the host filesystem is interesting (wrong? or just powerful? don't remember which we decided)
https://github.com/dagger/dagger/blob/08b115c7f591179e1e3fb2dd0b4b091309592c67/core/integration/workspace_test.go#L47-L70
Mmm how can it gain access, isn't Workspace.directory() rooted in the git repo?
dagger migrate works
sorry yes i don't really mean arbitrary access, just couldn't remember whether we expect -m invocations to also pick up Workspace automatically
That's how it's expected to work, yes. At least until we decide to change it
It allows for cleanly incorporating -m in the workspace model without breaking compat
oh wait, that test isn't even doing -m, i misread it as plain module use but it's also installing it into the workspace
- it's calling it toolchain style
-m works exactly like an ephemeral module install in the workspace
I like it because it keeps everything consistent
without breaking old commands which is a miracle honestly
sgtm! with it being rooted in the workspace, that'll be a safer alternative to host() anyway
pushed a test + fix for that
Pushed: toml formatting the way you suggested @brittle wharf @topaz seal
@dim smelt case to handle in dagger migrate: when the main module has relative dependencies, they need to be made relative to the new dagger.json location. for https://github.com/vito/dang that meant putting ../../ at the start of the sdk and both dependencies
side note: it'd be nice if those could all be absolute, relative to the repo root.
wonder if that works? haven't tried
Yeah I had that working in the migrate module, but have to tune it again in dagger migrate
No, absolute paths rooted at the repo root don't work. I think that is a bad idea (although I see the benefits). Users looking at a config file on their computer expect absolute paths to be relative to their computer. my 2c
yeah that's fair. these just get really really fiddly when you move modules around. but i guess the same is approximately true for renames too.
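A small Go sketch of the path fix-up described above for dagger migrate: a dependency path that was relative to the old dagger.json directory gets rewritten relative to the new one. The rebase helper and the directory names in main are hypothetical; only the "../../" effect comes from the thread.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// rebase rewrites a dependency path that was relative to oldDir so that it
// is relative to newDir instead. oldDir and newDir are both given relative
// to the repo root.
func rebase(oldDir, newDir, dep string) (string, error) {
	target := filepath.Join(oldDir, dep) // dep as seen from the repo root
	rel, err := filepath.Rel(newDir, target)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil // keep forward slashes in config files
}

func main() {
	// Hypothetical layout: dagger.json moves from the repo root to a
	// directory two levels down, so "sdk" becomes "../../sdk".
	got, err := rebase(".", "a/b", "sdk")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

This also shows why these paths get fiddly: every move of the module changes the prefix on every relative dependency.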
One random thought. What about a dagger workspace config command, in the same spirit as git config? Something like:
- dagger workspace config will print the toml file in a pretty-print version (always the same, whatever the way it's stored)
- dagger workspace config [key], for instance dagger workspace config modules.my-module.source, to print a key
- dagger workspace config [key] [value], for instance dagger workspace config modules.my-module.alias true, to set a key
The goal would be to interact with the toml file in a controlled way, so that users don't need to edit the file by hand.
Happy to play around with this idea if that makes sense.
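A minimal sketch of the get/set semantics proposed above, in Go over a nested map standing in for the parsed TOML. The get and set helpers are invented for illustration; they split keys on "." and so do not handle module names that themselves contain dots.

```go
package main

import (
	"fmt"
	"strings"
)

// get walks a dotted key such as "modules.my-module.source" through nested
// tables and reports whether the key exists.
func get(cfg map[string]any, key string) (any, bool) {
	var cur any = cfg
	for _, part := range strings.Split(key, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = m[part]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

// set assigns value at a dotted key, creating intermediate tables as needed.
// It fails if an intermediate key already holds a non-table value.
func set(cfg map[string]any, key string, value any) error {
	parts := strings.Split(key, ".")
	cur := cfg
	for _, part := range parts[:len(parts)-1] {
		next, exists := cur[part]
		if !exists {
			child := map[string]any{}
			cur[part] = child
			cur = child
			continue
		}
		child, ok := next.(map[string]any)
		if !ok {
			return fmt.Errorf("%q is not a table", part)
		}
		cur = child
	}
	cur[parts[len(parts)-1]] = value
	return nil
}

func main() {
	cfg := map[string]any{}
	if err := set(cfg, "modules.my-module.alias", true); err != nil {
		panic(err)
	}
	v, ok := get(cfg, "modules.my-module.alias")
	fmt.Println(v, ok)
}
```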
@topaz seal
If there's a specific thing you'd like me to contribute, let me know and I'll jump on it (and to avoid conflicts)
workspace config would be great actually
sounds good to me, I'll have a look then
also there's a bug where dagger migrate and dagger workspace info are too aggressively cached.
cd demo-react-app
dagger workspace info
-> "error, please migrate" ✅
dagger migrate
dagger workspace info
-> ok ✅
rm -fr .dagger
dagger workspace info
-> ok ‼️
dagger migrate
-> "nothing to migrate" ‼️
Another bug: dagger migrate doesn't show enough telemetry by default. You have to increase the verbosity to 2 or 3, otherwise it looks like it's hanging after "connect"
hit what feels like a 1% chance bug with module aliasing (promoting fields to Query): my dang module provides a Dang.dang(args: [String!]!) function, originally exposed for building + running dang <args> for an LLM. but, Dang.dang clobbers the original Query.dang so all the promoted fields fail with required arg "args" not provided. dang.
theoretical fix: when aliasing, move the original constructor to _foo? or, don't alias functions that would clobber?
ooh that might explain why call -m migrate migrate didn't work during my demo
@brittle wharf actually I didn't understand the "required args" not provided. Your constructor requires arguments basically? And aliasing breaks the mechanism for providing them?
it's because the aliased fields internally re-call through the constructor, like Query.foo wants to reroute to Query.dang.foo (where the workspace gets passed through), but by that point the Query.dang has been swapped out for the alias
so all the aliases end up calling the Query.dang(args: [String!]!) instead
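A toy Go model of the clobbering, using a plain map for the root Query fields (the real schema code is more involved). promote and the "constructor:"/"alias:" labels are invented; the skipCollisions branch corresponds to the "don't alias functions that would clobber" idea floated above.

```go
package main

import "fmt"

// promote adds an alias on the root for every function of a module. When
// skipCollisions is set, names already taken on the root (such as the
// module's own constructor) are left alone and reported back.
func promote(root map[string]string, module string, funcs []string, skipCollisions bool) []string {
	var skipped []string
	for _, fn := range funcs {
		if _, taken := root[fn]; taken && skipCollisions {
			skipped = append(skipped, fn)
			continue
		}
		root[fn] = "alias:" + module + "." + fn
	}
	return skipped
}

func main() {
	// Query.dang is the module constructor; the module also has a function
	// named dang, which is what triggers the bug.
	root := map[string]string{"dang": "constructor:dang"}
	promote(root, "dang", []string{"dang", "build"}, false)
	fmt.Println("naive promotion:", root["dang"])

	root = map[string]string{"dang": "constructor:dang"}
	skipped := promote(root, "dang", []string{"dang", "build"}, true)
	fmt.Println("with collision check:", root["dang"], "skipped:", skipped)
}
```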
Sorry have to drop to demo calls
The branch is yours
I was about to push my rebase on main.
(done)
Actually I think there's a problem with this - it means downstream functions never get cached. We kind of want to treat the constructor as a cache key calculator, without it being broken every time by the currentWorkspace call. Right now if I run dagger-dev call dangmod ls it runs ls every time even with no changes to the workspace directory
Yeah I was kind of wondering about that. I think that's a pre-existing issue though separate from what you're all doing
How devastating is it? I can definitely try to address it as part of the current cache updates
I coincidentally am working on modfunc stuff rn
pretty devastating - this is part of the 'special sauce' to make workspaces viable
the previous approach was:
- when a Workspace arg is detected, mark the function as DoNotCache
- unpack IDs in the response and add them to the returned content digest: https://github.com/dagger/dagger/pull/11812/changes/ad5d027d612d6a8ffc58b7ecdcba6ecb54c7339a
which sounds reasonable in my head, but i'm not sure if that code actually, you know, worked
though even unpacking IDs isn't accurate enough - if the constructor read the content of a file and baked it into the object fields, we would also want downstream caches busted
My suggestion:
- Leave in the cache key change to workspace, that's better than DoNotCache because it stops currentWorkspace from re-executing literally every time it's ever encountered in an ID. CachePerCall means that a new call that includes it will always evaluate currentWorkspace but if you append more selections after that to the ID it won't re-eval. I think that's what you want here
- Try that in combination w/ the commit for "unpack IDs in the response and add them to the returned content digest"
  - I can see how that'd work now. In main we prefer content digests of inputs when making digests for new appended calls, including the parent. So if the parent has a content digest, then that should be what's used for the subsequent selection on ls
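A toy Go illustration of the second bullet: if an appended selection derives its digest from the parent's content digest when one is set, two constructor calls with different recipes but identical content land on the same key for ls. The callID struct and withField are invented for this sketch and are not the dagql call.ID API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// callID is a toy stand-in for an ID with two digests.
type callID struct {
	recipe  string // digest of how the value was produced
	content string // optional digest of what the value contains
}

// digest prefers the content digest when one is set.
func (id callID) digest() string {
	if id.content != "" {
		return id.content
	}
	return id.recipe
}

func hashStrings(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// withField appends a selection; its recipe is derived from the parent's
// preferred digest, so parents with equal content share downstream keys.
func (id callID) withField(field string) callID {
	return callID{recipe: hashStrings(id.digest(), field)}
}

func main() {
	first := callID{recipe: "constructor@call-1", content: "files-abc"}
	second := callID{recipe: "constructor@call-2", content: "files-abc"}
	fmt.Println("ls shares a key:", first.withField("ls").digest() == second.withField("ls").digest())
}
```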
True, may want to incorporate that into that withContentDigestFromReturnedIDs method
what's the new way to mix in to the result's content digest? looks like WithID is no more
you'd just mix stuff into contentDigest: https://github.com/dagger/dagger/pull/11812/changes/ad5d027d612d6a8ffc58b7ecdcba6ecb54c7339a#diff-6e7d0c02c8bd065fb312e4109724ee002cb041cd44c4c6e97abe267ce31cedf4R152
That ends up getting set on the ID via call.WithContentDigest a few lines below that
maybe i'm uncaffeinated
- but that was previously being merged back into returnType with returnType.WithID(returnValue.ID().With(call.WithContentDigest(x))) after updating the ID, but WithID is gone now
not sure if that's a trivial 'the helper is gone now because it's unused' or 'it's gone now because caching works differently and it's too late by then'
Oh sorry I'm the uncaffeinated one, I guess you're rebasing on main now?
WithContentDigest exists on ObjectResult and Result
But it's not on the interface wrapper AnyResult and AnyObjectResult
But you need the interface wrappers in this part of the code cause it's arbitrary types only known at runtime
So you just gotta add WithContentDigest to the iface
sweet, ty!
that worked ish - confirmed an appropriate content digest is being set (only changes when files change), but the downstream function still runs every time.
i'll keep digging to get more familiar, no rush
in modfunc.go the CacheConfigForCall method is the authority on finalizing the cachekey for a function call
So printlning there should eventually reveal a discrepancy
hm. println debugging shows the cache key is stable, but it's still re-running for some reason
did you verify ls is for sure re-running vs. just not showing up as CACHED for some reason?
yeah - it logs
pushed to vito/wip-workspace-cache-debug if you want to poke around
here's the logs i see:
❯ docker logs -f dagger-engine.dev 2>&1 | grep '!!!'
time=2026-02-12T16:40:13.028Z level=WARN msg="!!! withContentDigestFromReturnedIDs" value=Dangmod@xxh3:067dc85c1604ea46 returnedIDs=map[xxh3:d494b13dcc2d3f4a:0xc005136d40]
time=2026-02-12T16:40:13.028Z level=WARN msg="!!! withContentDigestFromReturnedIDs mixing" dgstInputs=[sha256:a6316ebc2c4bc39651683f65eb55d92def9b105bb3f5f0b8ffcbad0e4b11a8b1]
time=2026-02-12T16:40:13.028Z level=WARN msg="!!! withContentDigestFromReturnedIDs mixing" contentDigest=xxh3:26040607f2c4ebeb
time=2026-02-12T16:40:13.029Z level=WARN msg="!!! parent has content digest" d=xxh3:26040607f2c4ebeb
time=2026-02-12T16:40:13.029Z level=WARN msg="!!! computing cache key" h1=xxh3:26040607f2c4ebeb h2=xxh3:6a6249929b30ee40 h3=xxh3:857db0722beee7c4 h4=dangmod
time=2026-02-12T16:40:13.029Z level=WARN msg="!!! DA FINAL CACHE KEY" key=xxh3:d6e41065e7c9f7d1
time=2026-02-12T16:40:13.029Z level=WARN msg="!!! DA EVEN MORE FINAL CACHE KEY INPUTS" inputs=[xxh3:d6e41065e7c9f7d1]
time=2026-02-12T16:40:13.029Z level=WARN msg="!!! DA EVEN MORE FINAL CACHE KEY" digest=xxh3:624176e8e5fc6f1a
time=2026-02-12T16:40:18.286Z level=WARN msg="!!! withContentDigestFromReturnedIDs" value=Dangmod@xxh3:05593baa30417c02 returnedIDs=map[xxh3:834d5e289d373ca4:0xc006d54f00]
time=2026-02-12T16:40:18.286Z level=WARN msg="!!! withContentDigestFromReturnedIDs mixing" dgstInputs=[sha256:a6316ebc2c4bc39651683f65eb55d92def9b105bb3f5f0b8ffcbad0e4b11a8b1]
time=2026-02-12T16:40:18.286Z level=WARN msg="!!! withContentDigestFromReturnedIDs mixing" contentDigest=xxh3:26040607f2c4ebeb
time=2026-02-12T16:40:18.286Z level=WARN msg="!!! parent has content digest" d=xxh3:26040607f2c4ebeb
time=2026-02-12T16:40:18.286Z level=WARN msg="!!! computing cache key" h1=xxh3:26040607f2c4ebeb h2=xxh3:6a6249929b30ee40 h3=xxh3:857db0722beee7c4 h4=dangmod
time=2026-02-12T16:40:18.287Z level=WARN msg="!!! DA FINAL CACHE KEY" key=xxh3:d6e41065e7c9f7d1
time=2026-02-12T16:40:18.287Z level=WARN msg="!!! DA EVEN MORE FINAL CACHE KEY INPUTS" inputs=[xxh3:d6e41065e7c9f7d1]
time=2026-02-12T16:40:18.287Z level=WARN msg="!!! DA EVEN MORE FINAL CACHE KEY" digest=xxh3:624176e8e5fc6f1a
and here's the repo i'm using to test: https://github.com/vito/wstest
with dagger-dev call dangmod ls -E
gonna see if claude can write an integ test and figure it out in parallel
pulled vito/wip-workspace-cache-debug, built dev cli/engine, pulled vito/wstest and ran dagger call dangmod ls -E, I'm getting module not found https://dagger.cloud/dagger/traces/61abe0daec363c3d4c61d4ccb2b66187?listen=7fc5ace18aed79ab&listen=aeb3e40b6d67a8a9&listen=b2af2eb6f674616a&listen=179701a9c8358136&listen=051168fac10c6b07
huh. that's with dev cli?
yeah I double checked
ba2677b31df22be1e4568eee1b7d13c05c0d30e7 from your dagger repo
my trace doesn't even have that load module: span - https://dagger.cloud/dagger/traces/892d79f0687c026c6d994c0d520180d5
could it be a ~/go/bin/dagger taking priority in $PATH? that happens to me now and then from a rogue go install
No I don't think so
sipsma@dagger_dev:~/repo/github.com/sipsma/wstest$ which dagger
/home/sipsma/repo/github.com/sipsma/dagger/bin/dagger
sipsma@dagger_dev:~/repo/github.com/sipsma/wstest$ dagger version
dagger v0.19.12-20260211024413-dev-b83bcf240d80 (docker-container://dagger-engine.dev) linux/arm64/v8
I'll triple check
tried with a fresh clone of the repo locally, too
I'm trying deleting everything including the stable v0.19.11 engine that builds the dev engine in case that's somehow using stale cached data
omfg
I have too many dagger clones/worktrees due to having agents do things in parallel
one sec....
if it's any consolation i'm having my own existential crisis as Claude wrote what seems to be a legitimate test for the problem, but it's passing 🫠
lemme see if it fails without the "fix"...
Yeah it works now, very subtle difference
sipsma@dagger_dev:~/repo/github.com/sipsma/wstest$ which dagger
/home/sipsma/repo/github.com/sipsma/dagger2/bin/dagger
sipsma@dagger_dev:~/repo/github.com/sipsma/wstest$ dagger version
dagger v0.19.12-20260212164452-dev-ba2677b31df2 (docker-container://dagger-engine.dev) linux/arm64/v8
Not sure how you want to proceed with contributions to workspace branch.
So I pushed this to a branch on my fork: https://github.com/eunomie/dagger/commit/2a21bf9faa924ab8e02f2211d437b940ea94869f
It allows reading and writing configuration:
- dagger workspace config: prints the full file
- dagger workspace config [key]: print the value or the sub tree. For instance dagger workspace config modules.jest.source or dagger workspace config modules.jest
- dagger workspace config [key] [value]: set the value. For instance dagger workspace config modules.jest.source github.com/dagger/jest or dagger workspace config modules.my-module.config.branches main,master (arrays are supported)
With
[modules.eslint]
source = "github.com/dagger/eslint@main"
[modules.eslint.config]
source = "./path" # Directory
baseImageAddress = "" # string
packageManager = "" # string
$ dagger workspace config modules.eslint.config
source = "./path"
baseImageAddress = ""
packageManager = ""
or
$ dagger workspace config modules.eslint
source = "github.com/dagger/eslint@main"
config.source = "./path"
config.baseImageAddress = ""
config.packageManager = ""
But we can only write one value at a time.
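For the sub-tree output shown above, a minimal Go sketch of how nested tables could be flattened into dotted keys. flatten is a made-up helper, not the code in the commit, and it prints keys in sorted order, whereas the output above follows file order.

```go
package main

import (
	"fmt"
	"sort"
)

// flatten collects every leaf under v into out, using dotted keys relative
// to the sub-tree root ("config.source" rather than nested tables).
func flatten(prefix string, v any, out map[string]any) {
	m, ok := v.(map[string]any)
	if !ok {
		out[prefix] = v
		return
	}
	for k, child := range m {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		flatten(key, child, out)
	}
}

func main() {
	eslint := map[string]any{
		"source": "github.com/dagger/eslint@main",
		"config": map[string]any{
			"source":           "./path",
			"baseImageAddress": "",
			"packageManager":   "",
		},
	}
	flat := map[string]any{}
	flatten("", eslint, flat)

	keys := make([]string, 0, len(flat))
	for k := range flat {
		keys = append(keys, k)
	}
	sort.Strings(keys) // sorted for stable output; file order is not kept
	for _, k := range keys {
		fmt.Printf("%s = %q\n", k, flat[k])
	}
}
```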
If that sounds good I guess the commit can be cherry-picked to the workspace branch.
Just push to the branch directly
- Make sure to pull first
- Communicate first to avoid overlap as much as possible
- Push early and often
diff --git a/dagql/cache.go b/dagql/cache.go
index 7a1877f94..d6cecef3e 100644
--- a/dagql/cache.go
+++ b/dagql/cache.go
@@ -905,6 +905,10 @@ func (c *cache) GetOrInitCall(
}
}
+ if key.ID.Field() == "ls" {
+ slog.Warn("!!! DA YET EVEN MORE FINAL CACHE KEY", "digest", storageKey)
+ }
+
ctx = ctxWithStorageKey(ctx, storageKey)
if ctx.Value(cacheContextKey{storageKey, c}) != nil {
The storage key (I can't wait to be rid of this) is changing each time even though the dagql cache key is the same. Looking into why...
@topaz seal those new workspace config tests are failing - but, they feel like legit failures to me: at least one of them is failing because it tries to load modules that the config references but those modules don't exist. i feel like config plumbing commands (among others) shouldn't fail on that sort of thing
for some reason the result is being marked as not safe to persist cache even though it doesn't have any named secrets in it, continuing to pull on thread
oops. I'll have a look
that's weird, because if I try by hand, my commands do not try to load modules
here's a trace, sorry: https://dagger.cloud/dagger/traces/701fc29cfc5df323e4372224aae45ce4?listen=e8b73ffdb9f75176&span=f54b24a458cac141#e8b73ffdb9f75176
yes, I can reproduce it when I'm running the tests, but when I'm doing the same thing by hand it works as expected, with what seems to be a different trace
(with the config file referencing a non existing module)
$ dagger workspace config
✔ connect 0.0s
✔ currentWorkspace: Workspace! 0.0s
✔ .configRead: String! 0.0s
[modules.my-module]
source = "modules/my-module"
alias = true
So either my tests by hand are wrong, or the integration tests are doing something I don't understand
hmm. are you testing in a git repo?
(throwing things at the wall to see what sticks - the tests do that)
(the ability to see all the commands and open them as if it wasn't nested is really nice still, amazed each time I'm looking at them)
Running dagger check -l in dagger/dagger (post-migration) doesn't work.. fails with a weird engine error. FYI
I need to go cooking.
If I can I'll fix the tests tonight, but I'm not sure so maybe tomorrow morning. Sorry for the disturbance. Feel free to just ignore them 
Do not hesitate to ping me with new tasks for tomorrow depending on the progress you'll make during the day.
On my list I wrote the telemetry issue on dagger migrate but anything else is fine to me.
Thanks Yves! I just finished with the call marathon, after debrief I will start working on this branch again
@brittle wharf figured it out, here's a funny workaround that makes it cached as expected by just changing the dang code:
diff --git a/.dagger/modules/dangmod/main.dang b/.dagger/modules/dangmod/main.dang
index 91efbf2..731507e 100644
--- a/.dagger/modules/dangmod/main.dang
+++ b/.dagger/modules/dangmod/main.dang
@@ -6,7 +6,7 @@ type Dangmod {
self
}
- pub ls(buster: String! = ""): [String!] {
+ pub ls(buster: String! = ""): [String!]! {
print("im listing bro")
source.entries
}
The problem is that the dagql.DynamicNullable was not propagating safeToPersistCache when deref'ing a result, so the DynamicNullable[Result[T]] was safeToPersist, but the Result[T] itself wasn't.
I bet this explains why none of our integ tests caught it, we probably always return types that are non-nullable from other SDKs (or at least in the tests we have)
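A simplified Go model of that bug, for anyone who wants to see the shape of it. The result and nullable types here are invented stand-ins, not the dagql DynamicNullable or Result types: the wrapper carries the safe-to-persist flag, and the buggy deref hands back the inner result without copying it over.

```go
package main

import "fmt"

// result is a toy stand-in for a cached result with a persistence flag.
type result struct {
	value         string
	safeToPersist bool
}

// nullable is a toy stand-in for a nullable wrapper around a result.
type nullable struct {
	inner         result
	valid         bool
	safeToPersist bool
}

// derefDropping models the bug: the inner result is returned as-is, so the
// wrapper's safeToPersist flag is lost.
func (n nullable) derefDropping() (result, bool) {
	return n.inner, n.valid
}

// deref models the fix: the flag is propagated onto the inner result.
func (n nullable) deref() (result, bool) {
	r := n.inner
	r.safeToPersist = n.safeToPersist
	return r, n.valid
}

func main() {
	n := nullable{inner: result{value: "entries"}, valid: true, safeToPersist: true}
	buggy, _ := n.derefDropping()
	fixed, _ := n.deref()
	fmt.Println("buggy:", buggy.safeToPersist, "fixed:", fixed.safeToPersist)
}
```

A non-nullable return type never goes through the wrapper, which matches why changing [String!] to [String!]! worked around it.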
I'll send you an engine fix in a sec (but also gonna send out a fix for main too since it's an issue there)
nice catch, incredibly flukey that we even found that lol, [String!]! is totally what the code should have been
i have a failing workspace test for it now, looking forward to a solid green 
no worries about the failing tests btw! progress marches on
confirmed the test passes with the ! fix and cache key changes, and fails on upstream/workspace with the ! fix and without the cache key changes (that probably was easier to write than to read, sorry)
pushed an engine-side fix here https://github.com/sipsma/dagger/commit/3c85e5ca40350e8e74e39d9b083e4a8c799e0114
(pr for main too now https://github.com/dagger/dagger/pull/11855)
Going to pick the next task on the list. Anything already in flight?
@brittle wharf when you mentioned getting workspace API to work, with a first module using it etc. was that on the workspace branch?
yep, pushed everything except proper content addressed caching, working on that now
nice
I'm going to focus on plumbing user defaults from workspace config
probably will add something like Workspace.Modules().Config() while I'm at it
@dim smelt just did a rebase + push -f with the content hashing changes (and main)
pulled workspace first so pretty sure i didn't clobber anything
what should we do next?
I'm wrangling EnvFile & user defaults plumbing
was thinking maybe I'll take a swing at lock file after?
how far are we from green CI? would be nice to get on a baseline but not sure what's involved. when i took a peek there were a ton of migrate errors in the tests
still need to fix the alias clobbering (Dang.dang eating Query.dang) too
yeah makes sense. at the moment we're intentionally breaking and requiring migration. so if we stick with that we'll need to migrate the test environments themselves
oh yeah let me re-read your explanation
@brittle wharf how do you feel about the current approach to require migration, but also make it very easy?
My thinking was: one-time user pain, but much cleaner codepaths, less risk of inconsistent behavior or bugs
until you migrate, you're kind of living in an illusion... things don't really work the way you think they do
i see the appeal but i have a feeling people are gonna come and complain
a global warning might be better
should be easy enough to do with that `telemetry.GlobalWriter` thing we added
might also get ci green - should i give that a go? giving it a go if not just to assess scope of ci damage
A warning is fine, but what happens after... We need to pretend like your not-a-workspace dagger.json is actually a workspace
The runtime compat layer is not trivial to get right
yeah just landed there, assumed it was just an extra guard but nope
Since migration is implemented engine-side, we could just apply the migration "virtually", and internally it still looks like a workspace
was just typing that
maybe this is what forces 'virtual workspaces'?
eh. ok i can be convinced to just error
unless we chuck it into an llm and it figures it out, it's kind of a lift
How about we just punt for now? We can always do it later. And if we do it, better to do it a bit later IMO
(but pre-merge)
main downside is harder to get green ci... but we can just migrate the test modules
sgtm, looking into that now
`ci:bootstrap` also fails because `dagger/dagger` itself is unmigrated, but if we do that we lose all CI, so i'll see if i can have it migrate first somehow i guess?
not a bad way to test `migrate` in the meantime
it makes sense that in the workspace branch our repo itself would be migrated. .dagger/config.toml can co-exist with the old dagger.json
ah interesting, that'd be preferable
normally migrate moves the `dagger.json` into `.dagger/modules/foo`, so just undo that and update the config to match? (or something)
looks like `migrate` is missing a step to move the old `.dagger/*` source into `.dagger/modules/foo/`?
```
❯ tree .dagger
.dagger
├── config.toml
├── go.mod
├── go.sum
├── main.go
└── modules
    ├── dagger-dev
    │   └── dagger.json
    ├── gowstest
    │   └── dagger.json
    └── wstest
        └── dagger.json
```
which explains this (no source code, couldn't find module name)
on it
pushed a migration in the meantime, feel free to undo and force-push over it
`migrate --recursive` would be nice too. not quite `develop -r` since that meant dependencies, this one would just find all migratable `dagger.json`s and do it
adding that, and a `dagger migrate -l` to list migratable modules
punting on `migrate --recursive` for now, actually it doesn't make sense as i originally imagined since we'd want one workspace for the entire repo
one confusion opportunity i hit was when you run `dagger migrate` in a subdir it no-ops because it just sees the root level workspace
The tests should be fixed now (I just pushed, I'll watch the CI but they are passing on my machine). It was effectively a difference in behavior between manual tests (CLI calls) and integration tests, where SkipWorkspaceModules was not taken into account
I pushed a commit that improves the visible telemetry when doing a dagger migrate
See before and after
nice thank you
@dim smelt wdyt of doing a merge + ship asap, so we can dogfood without breaking CI? we're in a nice compatibility phase on HEAD at the moment but that only lasts until we actually start using Workspace everywhere
mm you really think we can sneak it in this quickly? It would certainly be reassuring, but what about the migration/backwards compat issues? Also it's very vibe-coded, I kept a close eye on the overall design and architecture, but did NOT review that code line by line..
btw right now I'm filling the gaps in migrate (source code not moved etc) + adding informative terminal output during migration
i priced some of that in to 'asap' - but also it's Friday so realistically it'd be a bit, i did forget about the blocking migration though (pre-coffee)
basically just mean in terms of features, as opposed to moving on to part 3 (artifacts)
hmm maybe i could carve out the Workspace API + caching changes, and have it also work in the old world?
@dim smelt is modules/migrate/ dead code?
(just chasing down lints)
Yes
Warning I'm modifying dagger migrate atm (for filling gaps) but yeah modules/migrate is dead
So to summarize the plan:
- Stabilize parts 1 & 2
- Ship asap
?
btw there's another migration aspect: the code of toolchains themselves, from +defaultPath to Workspace
yup - that one's quite a bit harder, right? you'd need it to write code
Should we anticipate these edge cases by preparing an agent skill for users?
I also found another tricky one on workspace migration: if your main module has local dependencies that are not dagger-specific: like go-mod-replace, or importing local paths from your python/typescript code. So a skill could help with that also
sgtm, i guess we could embed them and write them out if you pass a flag (with a message letting them know about it)
would be cool if we just ran the agent ourselves, but setting up auth and all that is still a bit thorny for a guided experience
side q: if I cd into a module that has dagger.json and run dagger call, that should pick up the module in the workdir, and not the outer workspace, right?
like, ./dagger.json > ../../.dagger/config.toml > ../../dagger.json in terms of precedence
Yeah I was thinking that too. Could be an agentic function...Perfect use case. But it's another unknown.
rebased & pushed
We should go over an important detail of parts 1 & 2: workspace roots & paths... Both inside and outside the sandbox. We made progress but worth double-checking the details since it's foundational to everything else
Also I'm going to make dagger init an explicit alias to dagger workspace init
There's several distinct things:
- What is a workspace: it's a configuration context for dagger, scoped to a directory. Every client always has a workspace.
- Workspace detection: as soon as a client connects, the engine searches for the workspace directory. That directory can be explicitly marked by .dagger, or implicitly by falling back to the current git root, or as a last resort, the current directory.
- Workspace initialization: when the client calls dagger workspace init, where is .dagger created? This might be different from workspace detection depending on what we decide. --> cd ./subdir/of/my/repo && dagger workspace init -> does that create the workspace in the current directory, or the repo root? One is more "logical", but the other is more likely to be what most users want.
- Workspace fs sandbox. When a module reads files from the workspace from inside the sandbox: what is the filesystem sandbox? We know it's not always the workspace itself -> modules may need to access files anywhere in the repo, regardless of where the workspace itself lives. So, even if the workspace is not at the repo root, the fs sandbox probably is.
- Relative vs absolute paths in the fs sandbox. Absolute paths are expected to start from the root of the fs sandbox. But what about relative paths? There are several possible answers there: also relative to the sandbox root? relative to the workspace path? relative to the client workdir? This forces us to answer the underlying question: why would a module use a relative path?
- Entanglement with part 3? We have an unresolved thread with @coral kernel on the overlap between "sub-workspaces" and "dynamic artifacts" which complicates the discussion on relative paths. Basically, if I have 5 "apps" in my workspace, do I create sub-workspaces for them (with a module installed 5 times), or do I leave it to "dynamic artifacts" (via part 3) to handle that?
Still on this - what should happen when I run dagger functions and dagger call in core/integration/llmtest/go-programmer/? It's not a legacy module, it's just a module, and we want to run its functions. Right now it just finds the outer Workspace
i have a fix for it locally (LLM'd) but want to make sure it's in line with The Plan
Or is the expectation to pass explicit -m .?
I thought we landed on implicit -m . but that could be old info
though now I'm confused because if dagger init is aliased to dagger workspace init, and dagger module init always initializes within a workspace, there's no way to init a standalone module in an arbitrary directory anymore
Yeah that was before I worked out end-to-end backwards compat and migration... Now I think we need to revisit in that context
You would specify the path as an optional 2nd directory
dagger module init foo -> goes in workspace
dagger module init foo . -> goes in current dir
that part's fine, but it still installs it into a Workspace, right?
I think if you specify the path explicitly, it would disable auto-install
ah ok
that part is not implemented (or tested) btw
I guess implicit -m . could be ok. We could make it another client param.
Implementation: if the current workdir is inside a dagger module (dagger.json), and skipWorkdirModule is not set, and the client didn't already set module (the client param for -m), then set module to the workdir module -> aka "default -m ."
We could make that behavior configurable in the workspace config if needed
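A small Go sketch of the "default -m ." rule as described above, just to pin down the order of the checks. `ClientParams` and `ResolveModule` are hypothetical names; only `module` and `skipWorkdirModule` come from the message, and the workspace-config toggle is not modeled.

```go
package main

// ClientParams is a hypothetical subset of the client connection parameters.
type ClientParams struct {
	Module            string // value of -m, empty when the client did not set it
	SkipWorkdirModule bool   // opt-out of the implicit "-m ." behavior
}

// ResolveModule returns the module the client should load.
// workdirModule is the path of the module (dagger.json) that contains the
// current workdir, or "" when the workdir is not inside a module.
func ResolveModule(p ClientParams, workdirModule string) string {
	// An explicit -m always wins.
	if p.Module != "" {
		return p.Module
	}
	// No implicit module when opted out or when not inside a module.
	if p.SkipWorkdirModule || workdirModule == "" {
		return ""
	}
	// Otherwise behave as if the client had passed "-m .".
	return workdirModule
}
```

An empty return value here means "no module selected", in which case the client would see whatever the detected workspace provides.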
At least for a transition period that makes sense to me. Eventually we'll get to a point where that will surprise someone because all they know are workspaces and we can revisit it
FYI I renamed the PR + added a description to match Parts 1 + 2
Now editing the PR to explain parts 1 & 2 as a combined change: "Workspaces" aka "modules v2"
- Part 1 introduces the concept of Workspace and combines it with modules to create a new configuration and dependency model
- Part 2 introduces the Workspace API
Workspaces aka "modules v2"
Renaming this thread to reflect the merging of parts 1 and 2
@brittle wharf one looming topic for discussion: how the workspace API should mutate the workspace.
Ideally we start simple, then we can layer more powerful concepts?
Ultimately it intersects with things like:
- dagger call sometimes prompts you to apply a changeset
- Functions can return an Env and that will mutate the LLM's environment when called
Would be nice to have a unified model that we can converge, without complexifying everything.
--> Deja-vu, I think we discussed all this at the beginning of "evolution of Env", but now we're much further along in implementation so it's no longer a hypothetical future problem
Update: cleaned up plumbing of workspace config, it co-exists with .env to apply the same user defaults. Old-style .env loading is disabled by default but can be enabled in the workspace config
FYI: there's a rampup.md at the root of the repo (temporary) that I use to ramp-up new agents. I just say "read rampup.md" then I prompt them. works well
Going to write this up in a gist so we can track it. Then will implement dagger workspace init
@dim smelt tried migrate again to test the module source moving - almost there, but ended up with some weird duplication:
❯ tree .dagger/
.dagger/
├── config.toml
├── main.dang
└── modules
    └── dang
        ├── dagger.json
        ├── main.dang
        └── modules
            └── dang
                └── dagger.json

5 directories, 5 files
(previously it was just .dagger/main.dang)
cd ./subdir/of/my/repo && dagger workspace init -> does that create the workspace in the current directory, or the repo root? One is more "logical", but the other is more likely to be what most users want.
Request for opinions. It's a superficial UX bikeshed, but important detail.
I vote -> defaults to repo root. Plus an optional argument to specify the path explicitly
would have expected this:
❯ tree .dagger/
.dagger/
├── config.toml
└── modules
    └── dang
        ├── dagger.json
        └── main.dang
i think i would even be OK with it erroring if it's not in the repo root, at least for now
feels like it'd be easy to miss and assume it init'd in the work-dir, if it init's in repo root
Mmm but it needs to be possible to do it - in large monorepos, it's explicitly requested by power users
Maybe we error if not in git root and explicit path not given?
that sounds pretty good
cd ./some/subdir
dagger init
ERROR: are you sure you want to initialize the workspace in a subdirectory of the git repo? If so, run 'dagger init .' Otherwise, run this command again from the root of the repo
the case i fear is:
- user assumes workspaces are roughly equal to modules, thinks you init them everywhere
- run init, it "succeeds" but they don't notice it did it at the root of the repo => they assume it's a bug
could be overworrying tho
Maybe a bit aggressive for beginners though... A softer version would be a warning: works out of the box for standard users. Power users see the warning and can course-correct
cd ./some/subdir
dagger init
WARNING: initializing dagger workspace at the git root (/foo/bar). To initialize in the current directory, run "dagger init ."
I don't love any of these options...
Maybe we just create it in the current dir, and just make it very clear where it is. Seems pretty reasonable
seems OK too. if monopeople are gonna be doing it in the happy path, the warnings will feel weird. if a user is overdoing it with workspaces they can always simplify later
lol did not mean to type monopeople there but leaving it
I'm converting dagger check to the new workspace system. At the moment it is broken
hm, bunch of failing workspace tests now: https://dagger.cloud/dagger/traces/a9951ac4262276684009d56a27b2510a?span=d308436bee9c9199
pushed change to check & generate. it's substantial so might be broken (check scanner assumed a main module)
want to get a basic workspace init working. Then will be afk for 2-3 hours. Then will check back in
@dim smelt pushed a change for implicit -m . and fixing up other broken aspects of -m but i think this'll need another pass - initializeWorkspace in particular got pretty hairy, but I want to see how much it helps with the tests. left a FIXME(vito) to keep track
added more workspace tests along the way, covering dagger module init <name> <dir> (which now works) and making sure they work when nested beneath a workspace
ok will be back at keyboard in 30mn or so, will take a look. Still ok for me to push dagger workspace init or are you holding a lock on that part of the code?
(I'm back at it, claude wants `init` to work CLI-side, but I want the CLI as dumb as possible. Might add a client param to avoid chicken and egg problem - auto-detect workspace at connect, in order to create a workspace...)
I also found it struggling between the client side / server side split
Yeah I have to yell at it about that every time. I'll add it to rampup.md
Finished workspace init (not yet pushed)
Now cleaning up handling of remote workspaces. Claude slapped it into a duplicated code path, I'm DRY-ing it
pushed a commit that should fix module loading in nested execs, which accounted for a ton of test failures i think. took $9.206 and 71% of the 200k context window to find it, but the fix looks legit
(that HTML file is the pi agent session log tree, pretty neat)
Need to switch all the tests from init to module init
@dim smelt quick check in - I got the module tests down from ~270 failures to ~160, but now im getting the feeling we'd be better off extracting JUST the Workspace API + arg + caching support, shipping that on its own, and then switching our toolchains to it. that way we can unblock dogfooding, and take our time with the CLI DX (and tests)
FYI I pushed a commit late last night, cleaning up a part of my vibe coding. Hope it doesn't break anything (it shouldn't but you never know)
Ah I see, you're saying ship part 2 before part 1? With part 1 rebased on the part 2 PR? Works for me
yeah exactly. cool, will go for that tmrw!
actually @brittle wharf , if we merge part 2 first, what will be the workspace API accessing? If there's no actual concept of workspace yet
pushed a fix to the "dagger check" crash
also pushed improvements to the engine playground skill, and a simple qa skill that runs through a simple scenario in the playground and reports back.
(i used those to automate the bugfix loop)
I pushed a new commit that runs dagger migrate in parallel.
It looks like something's wrong as they all finished at the same time but that's not the case, it's just they are waiting for the same thing (the SDK). I put a sleep in one of them just to ensure it's good.
See the before and after
I also fixed an issue that was removing all hints when doing a dagger module init
I'm currently working on two bug fixes:
✅ when running dagger module init, it creates the module under .dagger/modules/MY_MODULE but also at the root... removing that
🔲 I'll have a look at the "cache" issue: running a module, changing the configuration, running the module, configuration is not used (as we saw in yesterday's demo)
(I continue to add more before/after, if that's too annoying / useless let me know, it's just as a way to quickly expose the changes)
Here is the fix for the module init
Concrete answer: repo root.
In terms of other concepts: the same directory that +defaultPath="/" is rooted in. It'll start as just a dynamic, content addressed alternative.
General answer: whatever works for our dogfooding, since this will be an experimental unannounced API, we don't have to commit to anything beyond what helps us test + ship Workspaces faster overall.
ok. that does make sense. hopefully it won't make part 1 much harder to merge. but merge conflicts are easier to resolve now than they used to be
so this will plumb into toolchains
yup yup
btw I think `dagger check` is still crashing the engine
I'll have a look at the "cache" issue: running a module, changing the configuration, running the module, configuration is not used (as we saw in yesterday's demo)
I can't reproduce it
My test scenario:
- dagger migrate on dagger/dagger
- dagger call go base terminal then go version -> 1.25.7
- edit .dagger/config.toml to change the go version to 1.24 and uncomment this line
- dagger call go base terminal then go version -> 1.24.13
So this is working. But I saw the issue yesterday, so not sure why.
were you using dagger cloud?
if so it could be a different engine
no, just running the playground on my machine
@brittle wharf @topaz seal @coral kernel I tried porting a few of our own modules to workspace API. Not sure where to put the ignore filters. For now I make them regular []string arguments, which is neat. But, the discrepancy between include/exclude and ignore makes it a little bit inconvenient
@brittle wharf I'm watching the kids this morning so can only do async. But available here if you want to coordinate the branch split / rebase
sweet - sounds good, I have a draft PR up here that I'm testing: https://github.com/dagger/dagger/pull/11874
How much work to rebase part 1 on it, do you think?
Also: how will 11874 address breaking changes for toolchains?
I don't think adoption is huge yet, but it is in the docs and there's a quickstart using it
so we should assume there are others out there using toolchains in prod
well, the point of 11874 is to not bring any breaking changes, so there should be nothing to address
small-to-medium? happy to handle it when the time comes
Wouldn't it break the meaning of +defaultPath in toolchains? Or would you keep that part
I guess it makes more sense to keep it
yeah unless i'm missing something i just have to literally not touch that code path
- this is all additive
And then, the breakage that is left, is when we modify our toolchain modules to receive a Workspace -> but that's just a normal case of requiring a newer engine
we'll see if the ci run agrees with me
yep exactly
Cool cool. Makes sense to focus some extra effort on the cache invalidation part, making sure the injection magic happens when we want, and doesn't happen when we don't want. etc. Which 11874 allows for
and the idea is that the older (newer than now, older than Workspaces 'for real') outer engine, once we ship Workspace 11874, will still be able to run the Workspace PR's (11812) checks once we transition
yeah unlike what happened last night when I started modifying toolchains in the branch
Should I try rebasing workspace now?
if you want! you could try doing a 'soft' rebase where you just realign it on top of the same commits that were extracted into 11874, like just re-ordering in git rebase -i
i could try that later too if 11874 is looking good
OK - maybe better to let you take the wheel on both sides. I have a demo call coming up soon (roughly 1h) then dentist appointment... Will be back online 3pm my time - makes more sense if you own the whole operation today IMO
works for me
I'll use my remaining hour to work on other parts of workspace
btw - dang is a pretty nifty debugging tool now, with the new REPL and :doc. earlier I did dagger-dev -s run dang to start its REPL pointing to dev dagger, and did :doc to verify that Workspace args were being made optional. (there was a weird i/o buffering thing caused by the wrapping, but it worked once i realized that's all it was). more info: #daggernauts message
What does :doc mean?
it's sort of like a GraphQL schema explorer - but based on the language runtime types + bindings, so you can also explore the builtin functions defined on strings, lists, etc.
and now it loads the Dagger module + dependencies in the cwd, just like dagger
https://asciinema.org/a/kpZm7kn9l6FZeh2d :doc is at the end
Is :<foo> part of the dang syntax? Or a special repl thing like .<foo> in dagger shell?
oh yea not dang syntax, just a special repl command
not sure how widespread this will be, but introducing the Workspace type is apparently a breaking change for modules that define their own Workspace type: https://dagger.cloud/dagger/traces/4397c5fa68a499abc307baef89e1c81a?span=a4d0520ec5dc7546
Oh right... I think we have this problem for every new core type.
yeah, and unfortunately Workspace was a common convention for LLM use case
(but I thought we namespaced module types though?)
If module foo defines type bar, doesn't that become type FooBar in the gql schema?
yeah - i wonder if this is overly defensive? it looks like it was added shortly after we decided to not allow modules to extend core types: https://github.com/dagger/dagger/blob/515ec6f8195741b15e57e6e7b78f8a87e4cd0f7a/core/module.go#L323-L324
oh, but in this case the module is literally called Workspace
so it doesn't get namespaced by the usual logic, because it's the module type itself
Ah the issue is the module name, I see
Eminent domain then
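Roughly, the namespacing rule being discussed could be sketched like this in Go. This is an illustration only: `pascal`, `NamespaceType`, `ClashesWithCore`, and the tiny `coreTypes` set are invented for the example and are not the logic in core/module.go.

```go
package main

import (
	"strings"
	"unicode"
)

// coreTypes is a tiny hypothetical sample of core type names.
var coreTypes = map[string]bool{
	"Container": true,
	"Directory": true,
	"Workspace": true,
}

// pascal turns names like "foo" or "my-mod" into "Foo" or "MyMod".
func pascal(s string) string {
	parts := strings.FieldsFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	var b strings.Builder
	for _, p := range parts {
		r := []rune(p)
		r[0] = unicode.ToUpper(r[0])
		b.WriteString(string(r))
	}
	return b.String()
}

// NamespaceType prefixes a module-defined type with the module name, except
// for the module's own main type, which keeps its name as-is.
func NamespaceType(module, typeName string) string {
	prefix := pascal(module)
	name := pascal(typeName)
	if name == prefix {
		return name
	}
	return prefix + name
}

// ClashesWithCore reports whether the namespaced name collides with a core type.
func ClashesWithCore(module, typeName string) bool {
	return coreTypes[NamespaceType(module, typeName)]
}
```

With this rule, a `Workspace` type inside module `foo` becomes `FooWorkspace` and is fine, while the main type of a module literally named `workspace` stays `Workspace` and collides with the new core type.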
yeah, there shouldn't be too many modules named 'workspace' anyway 
https://daggerverse.dev/search?q=workspace
initial reaction: 
but actually they're 99% forks, the rest are @coral kernel or greetings-api
yeah we used that name a lot ~1 year ago. Clearly anticipating this
heading back from dentist. I'm looking into full backwards compatibility: instead of error if a migration is needed, just a warning.
https://github.com/dagger/dagger/pull/11874 is ready for review - just babysitting CI now (random lints from previously-skipped packages, + buildkit flakes that are already fixed on main / will be fixed in the next week)
https://github.com/dagger/dagger/pull/11874
First part of workspaces, (workspace API) that we're fast-tracking for early testing.
was that meant for #maintainers? π
Yes, trying to get more eyeballs on it
Update: found a good design for clean compat mode. Also will delete a lot of code, no more toolchain support in module loading. Instead, just a single compat gate at workspace loading. Then everything downstream is simpler and cleaner
While claude churns away on that, I'm going to check out your workspace-API branch @brittle wharf and try porting our toolchains to it
sweet. Workspace.search is interesting - were you imagining that as an attachable that runs rg directly on the host? in the meantime, I guess you can just do Workspace.directory(...).search
also thinking I could/should just remove Workspace.install and .moduleInit from that PR
Workspace.findUp i imagine to be trivial to add since there's already a Host API for that
Yeah I was just thinking in terms of feature parity. No opinion on implementation, I just was thinking "I will probably need this". I didn't even think of .directory().search(), I guess it could work - but need to spend brain cycles thinking about include/exclude arguments in one place vs the other
@brittle wharf I stacked a draft PR on top of yours, just for API dogfooding on our modules
https://github.com/dagger/dagger/pull/11876
wip & fyi
nice. just pushed this:
+ Workspace.findUp
- Workspace.install
- Workspace.modInit
Ah shrinking the API footprint? makes sense
yeah, those feel more suited for 11812 anyway, they just kind of came along for the ride
i punted on Workspace.search - it would be cool to skip the filesync, but it's a bit of a pandora's box (runtime dependency on client-side rg?)
Yeah I can always Workspace.directory(include:..., exclude:...).search() right? or just .glob(), whichever fits the situation
Thinking about blueprints... Isn't a blueprint just a module in the workspace config, with alias=true? (alias: current terminology to mean "mount this module's functions directly at the root, instead of namespacing them")
ha ha, I was struggling to find the right term to use in the config (alias? topLevel? noNamespace? main?). But maybe it's actually blueprint π
Between toolchains and modules moving to a thin compat on top of workspaces: lots of complex and potentially slow code getting removed. Fingers crossed we get a slight speedup out of it!
First impressions porting a small toolchain (helm-dev) to workspace API @brittle wharf:
- DX is strictly better. An explicit argument is more clear than pragma incantation
- I can also split up code in utility functions at my leisure, without worrying about the absence of self-calls causing me to lose pragmas... I just pass the freaking workspace argument, and it works as intended. Huge win
- Loading the module is super slow, even on warm cache. Maybe it already was before? See screenshot: 47sec, 19sec, 13sec. That seems crazy long for loading a super small module written in go
- The helm checks are not showing up in dagger check -l... I have no idea why 🤷‍♂️ (duh, I was running it with stable dagger)
@brittle wharf I'm adding a compat bridge for legacy +defaultPath, configurable in config.toml. dagger migrate will set the compat field (legacyDefaultPath=true) automatically, so we can keep the desired clean definition of +defaultPath for everyone else.
But also, we can probably deprecate +defaultPath completely down the road, no reason to not call dag.CurrentModule().something() since module always gets fully loaded, so no need to worry about the cache police
PR is ready for another look (https://github.com/dagger/dagger/pull/11874)
- @dim smelt - not sure what you meant by this https://github.com/dagger/dagger/pull/11874#pullrequestreview-3817138828 - do you know if there's something lacking for blueprints, or just seeking an explicit test?
- @lofty shadow - addressed all feedback, some of it by just kicking the can down the road to 11812, main question is whether the path resolution looks good to you: https://github.com/dagger/dagger/pull/11874/commits/68236bb04f6be9bbdd7a525ebd7906e7edf69ea5
also did the unconditional content-addressing for module functions change, nothing broke - 
I saw some secret-related failures in TestModule that are not familiar flakes: https://dagger.cloud/dagger/checks/github.com/dagger/dagger@ab4d1292e2495fcebbc7dcd41f39e51187c78dbc?check=testSplit%3AtestModules&listen=a291eb52ee102852&listen=226341e187df69db&listen=69166e0aa4c09b70&listen=8628b33e68bd1b21&listen=0034f8608ce1366d&listen=b2a4fadb78f00f6e&listen=60d76420fd1ead56#5d6eb2e65d742def
I wouldn't be shocked if the content addressed function parents were impacting those. Worth seeing if it's repro-able. It looks like the tests in question are setSecret (as opposed to secret providers), which is an utter nightmare when it comes to caching.
TBH if it's that, I'd probably just go back to the Workspace-only implementation for now. I funnily enough am doing battle with setSecret right now in the dagql cache work, so whatever I land on there might help this situation too, can revisit then.
hm yeah "nothing broke" was a bit overconfident - these are also potentially interesting:
Error: Not equal:
expected: "Hello from B"
actual : "Hello from A"
Error: make request: input: test.upper set call inputs: convert arg "msg": failed to get deps for DynamicID: module not found: test [traceparent:972913fc2fbd2d3af91eac9afb5f12e4-54c015d7b46974ea]
Error: make request: input: test.upper set call inputs: convert arg "msg": failed to get deps for DynamicID: module not found: test [traceparent:972913fc2fbd2d3af91eac9afb5f12e4-54c015d7b46974ea]
I actually managed to trigger that exact one in my e-graph PR too... weird. Must be triggering the same thing somehow.
Either way, definitely seems not worth blocking the PR on this, I'd just go back to the previous impl for now. I'll need to rebase on this anyways, can look into all together with the new egraph stuff
pushed a revert + tidy
this test-modules run failed, but assuming solver related flake - rerunning: https://dagger.cloud/dagger/checks/github.com/dagger/dagger@28e4860801b777b580c71658a6dba189ab82df79?check=testSplit%3AtestModules&run=2bebed05-fe19-4775-847e-ffba17c77ac6
Yeah that one is
not sure what you meant by this https://github.com/dagger/dagger/pull/11874#pullrequestreview-3817138828 - do you know if there's something lacking for blueprints, or just seeking an explicit test?
Yesterday I realized that blueprints can be cleanly folded into the workspace system, similar to toolchains. So they'll have the same backwards compat / migration needs as toolchains. I implemented that in the workspace branch for both. Following that same symmetry, I was wondering if in your PR you'd want to also add support for workspace API in blueprints the way you did for toolchains.
I guess where the symmetry breaks, is that we use toolchains but not blueprints. So strictly from a dogfooding perspective, your PR only needs to touch toolchains.
So my comment was not so much a request, but a FYI so you can make an informed decision on where to draw the line in your PR
In technical terms, from old -> new:
- toolchain in your dagger.json -> module in your config.toml with legacy-default-path: true for backwards compat of +defaultPath (to be manually removed when your module adopts workspace API and only uses +defaultPath to mean "my own files"; note: no way for the module to indicate it has made that change..)
- main module in your dagger.json -> module in your config.toml with blueprint: true
- blueprint in your dagger.json -> module in your config.toml with blueprint: true + legacy-default-path: true
gotcha - well, I added an extra integ test covering the blueprint + workspace arg path just to make sure we have some coverage there, don't think there's much else to do in this PR
will merge on green
merged! now to rebase...
@dim smelt I'm confused by this Workspace.Rootfs field - I thought the idea was to skip an eager filesync for host dirs?
currently:
dagger HEAD(REBASING 109/135) * 1m9s
Mmm "rootfs" is the term I chose to mean "the root of the sandbox inside which all paths are evaluated". Basically a new word for what we called "the module's context directory" in modules v1. Implementation is supposed to be similar to modulesource context dir and reuse similar plumbing -> use the git root for remote workspace; use host session attachables for local workspace; all abstracted away cleanly. At some point the vibe coded implementation looked correct, but maybe it accidentally got reverted/broken when I focused on other parts (most recently bug fixes, DRY & backwards compat)
yea right now it's a dagql.ObjectResult[*Directory] field populated by calling host.directory when local:
if isLocal {
	// Local: Rootfs = host.directory(detected.Root)
	hostPath = detected.Root
	err := dag.Select(ctx, dag.Root(), &rootfs,
		dagql.Selector{Field: "host"},
		dagql.Selector{
			Field: "directory",
			Args: []dagql.NamedInput{
				{Name: "path", Value: dagql.String(detected.Root)},
			},
		},
	)
	if err != nil {
		return nil, fmt.Errorf("creating rootfs directory: %w", err)
	}
} else {
	// Remote: Rootfs = the cloned git tree passed in.
	rootfs = prebuiltRootfs
}
oh yeah that's wrong... sorry I've become my worst nightmare: sloppy vibe coding CEO
I did say "understand and reuse the same plumbing as contextdirectory, with abstraction over local vs git sources" -> it made a very convincing case that it had done that
lol np. i'll try to finish this rebase, but might bail and try going for a merge instead, if it turns into a minefield
(kinda already getting there, more code changed than i anticipated between the two PRs, some of which was refactored prior to merge, notably path arithmetic)
oh. ok. that was the last commit with conflicts.
I was going to say "agents are great at complicated merge conflicts" but now I worry I got conned
I think agents will truly help track the conceptual parts of the merge, as a safety to make sure important changes don't get dropped
now weighing whether a rebase or merging main into the PR is the lesser evil. nice thing about merge commits is all the conflict resolutions are in one place (the merge) rather than sneakily melded in haphazardly through the commit history
makes sense to me, that history will eventually get cleaned up before merging back into main anyway
@dim smelt pushed the merge - left the Rootfs brokenness for now so it's still a pure merge
for fixing the rootfs, it kinda seems like we just need to handle both states and hope the branching doesn't get out of hand: either it has a Rootfs Directory, or it's a virtual workspace backed by a host path
Yeah - but we already handle that for a module's context directory. So hopefully we can just copy that
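A minimal Go sketch of the "two states" idea above, assuming hypothetical names (Workspace, Directory, Subdir, loadHost) that are not the engine's real types. It only illustrates that a remote workspace carries a ready-made rootfs, while a local one keeps just a host path and loads directories when asked:

```go
package main

import (
	"fmt"
	"path"
)

// Directory stands in for an already-materialized directory (hypothetical type).
type Directory struct{ ID string }

// Workspace is in exactly one of two states (hypothetical type):
// remote, with a rootfs that is already available, or
// local, with only a host path and nothing synced up front.
type Workspace struct {
	rootfs   *Directory
	hostPath string
}

// Subdir resolves a path inside the workspace. loadHost is only called
// for local workspaces, and only at the moment a directory is requested.
func (w *Workspace) Subdir(sub string, loadHost func(hostPath string) (*Directory, error)) (*Directory, error) {
	switch {
	case w.rootfs != nil:
		return &Directory{ID: path.Join(w.rootfs.ID, sub)}, nil
	case w.hostPath != "":
		return loadHost(path.Join(w.hostPath, sub))
	default:
		return nil, fmt.Errorf("workspace has neither a rootfs nor a host path")
	}
}
```

The branching stays in one method here, which is the property the "hope the branching doesn't get out of hand" comment is after.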
@brittle wharf I'm just back from my meetings. I have an uncommitted fix for "no main object" bug (handling of module "original name" vs. installed name was broken in my backwards compat commit). Will commit, then hopefully it will cleanly rebase on your merge
@brittle wharf then I can focus on the eager rootfs, try to fix my vibe-coded mess. Unless you're already doing it
go for it
Since I'll be off thursday-friday, consider the branch yours for the rest of the week - whatever you want to take over, take over
I like how fast we're moving and don't want to derail that
I started a de-facto convention on the branch, feel free to try it also if you want:
-> I track in-progress work in hack/designs/<name>.md. This contains 1) context 2) design 3) task list with status
When I start a design, I tell claude to write that file - and NOT go in plan mode (if it enters plan mode, I exit it). That way we can actually talk about the design normally, without the weird plan mode UX that swallows the dialogue.
Then I tell claude to follow the task list, one commit by task, and atomically change the task list status in the same commit. That way I can always kill the agent and resume clean.
I'm sure you guys have equivalent workflows. Just wanted to let you know the interface for mine, so you can plug in. If you look in hack/designs you'll see the current state of what I'm working on. Especially handy within a big PR, since there's so much going on
@dim smelt One thing that'll help in your absence is confirmation of what commands we're committing to breaking. That'll help with fixing up the TestModule suite - a quick triage of those failures would go a long way if you have a sec! (should that test pass, or does it change to command XYZ)
(nbd if no time, i'll make guesses, easy enough to adjust in the end if i go astray)
(pushed tentative fix to rootfs laziness)
OK. I'll be on my phone. So we have a little buffer
About to get in a car for a few hours, will go over the failures
--> hack/designs/workspace-rootfs-laziness.md
harder than I thought to get the information I need from this page on mobile..
Captain's log: wrangling with module tests, whittled down from 137 failures to 43 failures.
Here are some notes (not exhaustive, just the recurring themes):
- [ ] recursive call detected
  - this is caused by how aliasing works - we should maybe avoid aliases working by schema stitching
  - TestModule/TestSecretNested
  - TestFloat
  - TestCodegenOptionals
  - TestWrapping
  - TestModulePreFilteringDirectory
- [x] missing --with-self-calls for dagger module init
  - TestSelfCalls
- [x] dagger module init: source in ./foo/, dagger.json in .
  - many such cases
- [x] dagger develop broke for unbundled SDKs
  - TestUnbundleSDK
  - [workspace 6da9f58a7] fix(cli): skip workspace modules when running dagger develop
- [ ] missing LICENSE
The first checklist item is the broadest amount of failures - basically the aliasing system is wreaking havoc for a bunch of different cases, related to breakage I also hit in dang (the Dang.dang clobbering the entrypoint). Going back to the drawing board for that aspect - I don't think we can just yeet all the blueprint'd module's fields up to Query like that, there are just way too many scenarios where it collides either with its own functions, or with a dependency, or with core APIs, etc.
Current approach is to expose type Workspace { defaultModule: String } which returns either the blueprint module's name or the singularly loaded (-m or CWD) module's name, and have the CLI treat that as the entrypoint.
Overall trying to avoid making widespread changes to the tests, making backwards compat. changes where possible to bridge the gap, but when the dust settles we should take a step back and figure out which side should really change
Current approach is to expose type Workspace { defaultModule: String } which returns either the blueprint module's name or the singularly loaded (-m or CWD) module's name, and have the CLI treat that as the entrypoint.
By "treat as entrypoint" you mean "prepend every query with that module name", similar to how CLI handled it before?
yeah
I feel dumb because I still don't understand how aliasing breaks everything. I guess it's because some functions shadow others in the top-level query? But which functions are shadowing ("clobbering") which other functions exactly? Aliased module shadows core functions?
it's confusing. I had Claude draw me a graph in one case
basically, in the happy path, module foo defining bar, you get type Query { foo, bar } where under the hood bar calls Query.foo.bar
breakage 1: define foo module with foo function -> this results in recursive call detected, because Query.foo calls Query.foo.foo.
I worked around this by aliasing the constructor as _blueprint_foo and having the aliases call that, but then hit
breakage 2: define dep in your module, install module named dep => your dep clobbers your dependency's constructor, so you can no longer call it
breakage 3: define a function called container in your module => well, now you've done it! container.from() will now call your container instead.
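The three breakages above all come down to name collisions on the top-level Query once a module's functions are hoisted there. A rough Go sketch of that check, assuming a hypothetical helper HoistCollisions and plain string names (the real schema stitching works on dagql types, not strings):

```go
package main

// HoistCollisions reports which of a module's functions would clash with an
// existing top-level Query field if they were hoisted there, and what they
// would clash with. All names are illustrative, not engine identifiers.
func HoistCollisions(moduleName string, functions, installedModules, coreFields []string) map[string]string {
	taken := map[string]string{}
	for _, f := range coreFields {
		taken[f] = "core API"
	}
	for _, m := range installedModules {
		taken[m] = "installed module constructor"
	}
	// The module's own constructor lives at Query.<moduleName>.
	taken[moduleName] = "own constructor"

	collisions := map[string]string{}
	for _, fn := range functions {
		if reason, ok := taken[fn]; ok {
			collisions[fn] = reason
		}
	}
	return collisions
}
```

For module foo with functions foo, dep, container and bar, an installed module dep, and core field container, this flags foo (breakage 1), dep (breakage 2) and container (breakage 3), leaving only bar safe to hoist.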
mmm, so the way it works today, aliasing can work if we forbid installing other modules (just the one aliased module) and we don't expose its constructor, just its functions directly?
not sure i follow
is the end goal really to change the schema layout like that? feels like more trouble than it's worth but maybe i'm missing the upside
So far we've always erred on the side of making the CLI dumber when possible. So it seemed to make sense to just let the engine show the client what it's supposed to see, without the CLI having to participate
I didn't expect it would be so complicated
But, my previous point was, we're not trying to build a generalized schema aliasing feature either. We have a pretty narrow list of things that need to work
- Backwards compat with a project's "main module" when migrating its dagger.json to a workspace
- Handling -m gracefully
- Implementing blueprints --> If it helps, we can cut this requirement, which was my point earlier
Requiring that the client check for an entrypoint module, and honor it, seems like a workaround: now every client has to implement this, or they're broken. Plus it's an extra round trip.
Reformulating: instead of supporting any number of aliased modules, co-existing with any number of non-aliased modules, we could simply support a single entrypoint module, that would be exclusively served - no other modules allowed. This would work for -m, blueprints, and for backwards compat with main modules
Should be possible, can investigate tomorrow
- one semi blocking question though, is what to do about the constructor? I don't think we can drop it entirely, even dagger call invokes it when you pass flags (dagger call --constructor-flag func --func-flag)
I would consider dropping support for that, if it turns out to be the best solution overall
- Backwards compat -> you can set module configuration in the workspace config, and you can migrate off of using aliasing
- -m -> This one would be the most painful I guess, no workaround
- blueprint -> can configure module in the workspace config
But yeah that is a major downside of doing it purely in the schema
Maybe the engine could return a special header or something, as part of client connection handling? To make it more "special" than querying Workspace.something?
one thing i'm wondering is if that's really simplifying the client, or just enshrining a very particular client's POV to all clients, not all of which agree
for example dagger shell and dang's REPL put you in a context that has your module, its dependencies, and core APIs all available
could always be a param, i suppose, just like we have Module.serve(includeDependencies) (edit: no it can't, since this is all pushed engine side, and that particular approach even assumes the core API is available, which we don't want)
down to 18 (but i think actually 16) failures in the module suite
- pushed legacy module handling down into workspace.Detect so it's squashed as early as possible
- support contextual git args for legacy defaultPath
the vast majority of the failures are just from LICENSE no longer being created by dagger module init, not sure if that was a conscious choice (vaguely recall there being some discussion on its merit)
what seems reasonable is:
- keep LICENSE generation for standalone modules (which should fix all the tests)
- don't do it for workspace modules, since a) they're not meant for redistribution/reuse anyway, and b) they're likely covered by a repo-wide LICENSE already
edit: down to 2 failures with that fixed!
trying out the Query hoisting revival now
@brittle wharf was looking at the workspaces PR out of curiosity since the dagql cache updates are in the final stretches (https://github.com/dagger/dagger/pull/11856). We are gonna have A LOT of rebase conflicts to sort through lol
Fortunately my PR simplifies quite a bit of the caching aspects of module loading and other APIs in core/schema, so I don't think it's fundamentally in conflict. More like needing to reconcile two new-but-better worlds.
One question though: I saw that almost all of the toolchain-related code in schema/modulesource.go is deleted in your PR. I am currently working through a toolchain test failure that's gonna require reworking how toolchains are loaded... should I even bother? Are toolchains actually gone with your PR? Or did they just move somewhere else?
I guess the other relevant question is timing. I want my PR merged early next week. Not sure if that's the same as the workspaces PR or if it'll take longer
Can't speak to dang, but the fact that dagger shell has different behavior is a major PITA and complicates everything. I think fragmentation on this point is bad.
Clients already have considerable control anyway:
- extraModules in client params, with control over naming and aliasing
- skipWorkspaceModules
With modules v2 I think serving dependencies at the top level (like shell does) is obsolete.
hmm, i can buy that but having the core API available in a repl/shell feels pretty important for playing around, which is mostly what repls are for. will think on it, i do think it's a noble goal if we can make it work, only really have practical concerns
claude made progress on an alternative implementation that has the major tests passing at least, I pushed it to see a full CI run and it looks plausible (17 module test failures, some of which are constructor focused which makes sense). need to take a closer look still, i had it working away while out to anniversary dinner and came back to a confident sounding commit
one cute option could be putting all of core beneath its own entrypoint. this is the sort of fun stuff i wouldn't mind diving into dagql to support (if it doesn't already). the main risk might be caching, with different IDs hitting the same APIs, but maybe that's where something something egraphs come in
I'd say just merge yours first and i'll try and figure it out haha
ah yeah core API I don't think we need to hide? unless the client is explicitly requesting aliasing / hiding core?
I've always wished we had done this. I don't think caching would be hard, the part that mildly scares me is making sure codegen works (i.e. client bindings don't suddenly go from dag.Container() -> dag.Core().Container())
I initially had a hideCoreModule client param but that turned out to be a bad idea
yes toolchains are gone. Now there are just modules; and they are mounted individually into the workspace at their respective namespaces. There is no more "main module" and no more mounting of toolchains into a main module
(we kept backwards compat though; toolchains in a dagger.json get automatically transformed into modules in a workspace)
@brittle wharf maybe a quick sync about the aliasing thing later?
@dim smelt i'm around if you're around!
Almost - shuttling Merlin back from a doctor's appointment, while wrapping up last week's gazette
15mn?
sgtm!
@brittle wharf sorry 5 more minutes... if you're still around
np, still around
(kids are still off school, and they have many questions)
UI / telemetry glitch -> looks like nothing is happening
Same session after cranking up verbosity to 3
As I type this, I realize: how do we cleanly handle workspace injection for nested clients?
taking a detour today to fix https://github.com/dagger/dagger/issues/11887 and any related weird telemetry
@brittle wharf I was thinking we could have a top-level field in config.toml called "entrypoint" eg.
entrypoint = "dagger-dev"
[modules.dagger-dev]
source = "..."
And corresponding api: Workspace.entrypoint(): String!
to replace blueprint = true? sgtm in shape, fixes the 'what if i set multiple to true' question
yes exactly. Still slightly worried about fragmentation ("this command honors entrypoint; this one doesn't") but at least the concept is easy to explain
And wondering if codegen should honor entrypoint
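To illustrate why a single top-level entrypoint key sidesteps the "what if i set multiple to true" question, here is a small Go sketch. WorkspaceConfig, ModuleConfig and Validate are hypothetical names for this example only; the real config.toml schema is still being decided in this thread:

```go
package main

import "fmt"

// ModuleConfig mirrors a [modules.<name>] table (hypothetical shape).
type ModuleConfig struct {
	Source string
}

// WorkspaceConfig mirrors the proposed config.toml layout (hypothetical shape).
// Entrypoint is a single string, so at most one module can be designated.
type WorkspaceConfig struct {
	Entrypoint string
	Modules    map[string]ModuleConfig
}

// Validate checks that the entrypoint, when set, names a configured module.
func (c WorkspaceConfig) Validate() error {
	if c.Entrypoint == "" {
		return nil // no entrypoint is a valid configuration
	}
	if _, ok := c.Modules[c.Entrypoint]; !ok {
		return fmt.Errorf("entrypoint %q is not a configured module", c.Entrypoint)
	}
	return nil
}
```

With per-module blueprint = true flags the validator would also have to reject two modules claiming the role; with one key that state cannot be expressed.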
@brittle wharf I'm testing workspace API. Is there something about Workspace.directory() that would cause it to be copied read-only in a container? I'm getting unusual "permission denied" in a seemingly trivial container, just base image + workspace directory copied in
doesn't ring any bells 
it just calls Host.directory under the hood (assuming host workspace)
It's from within a dev engine too, so maybe it's something related to nesting? Will try to repro
@brittle wharf it looks like it's caused by the specific base image I was using (markdownlint pre-built image).
Mmm super weird though
Oh I get it. Container.withDirectory() doesn't honor the current user configured in the container image (unlike Dockerfiles I believe). Which explains this:
- dagger -M -c 'container | from tmknom/markdownlint | with-directory . $(directory) | with-exec touch foo' -> permission denied
- dagger -M -c 'container | from tmknom/markdownlint | with-exec touch foo' -> OK
@brittle wharf manual testing of workspace branch. It looks like entrypoint/aliasing is too aggressive: it only shows the designated entrypoint, and not the other modules. In 0.9.11 dagger functions will show a mix of toolchains + main module's functions
noted, will check when i get back to it unless you or claude figure it out
Also there might be an engine crash when listing checks... Will try to repro more precisely
Mmmmm.... maybe a big module DX win... @brittle wharf the Betclic team reminded me that real-world juggling of npm cache volumes is a major issue...
- Can't share the cache between concurrent writers
- Global lock across all apps is a major bottleneck
- No clean way to namespace cache volumes by app
--> But if everything is relative to the workspace, now we have something to namespace by app: the path in workspace
Will try in our docusaurus module
oof, just merged main in and there are quite a few conflicts from a mixed bag of PRs:
- https://github.com/dagger/dagger/pull/11908
- https://github.com/dagger/dagger/pull/11902
- https://github.com/dagger/dagger/pull/11890
- (so far)
and a lot of these conflicts are like:
<<<<<<<< HEAD
========
[hundreds of lines of deleted code, good luck finding the needle and figuring out where it needs to be reapplied]
>>>>>>>> upstream/main
we should at the very least stop merging anything that touches toolchains imo. it's just rearranging the deck chairs on the titanic. literally the entire test suite is gone
i'm done with the conflicts at least, more worried about if we keep that pace. will push soon (done)
@brittle wharf I'm all for a well-scoped freeze - what should the scope be?
toolchains, any ModTree stuff unrelated to fixing native-ci issues (so 11902 above for example is fine), and honestly anything nonessential that touches modules (11908 for example could probably wait until after just to avoid friction)
11908 was a user request - but yeah makes sense
@brittle wharf what task can I safely pick on the branch?
or maybe the opposite - what has a lock on it?
I could take the "aliasing is too strict" thing
@dim smelt wanna quickly sync and go over the commits made while you were out, or do you trust them enough to just fix anything else in post?
I have absolute trust BUT we should still talk
(to make a plan)
lol k, can hop in chat whenever
@dim smelt went over the commits myself, here are the notables:
- dagger.json > ../.dagger/config.toml
  - load only the specified module
  - but we still want the workspace - what does that mean?
  - CURRENT STATE: workspace.Detect finds a dagger.json and effectively does a 'legacy migration' - so we IGNORE the outer defined workspace
  - QUESTION: what would NOT ignoring the outer workspace look like? What is the end goal?
- added dagger module install
  - dagger install auto-detects and runs module install
  - https://github.com/dagger/dagger/pull/11812/changes/13e41fb46282bfb3f3826601500458a2577e1d93
  - Is this a good enough heuristic so we don't need two commands?
  - Or should we remove this because it's too much going on in one command, and update more tests to use module install?
- dagger module init in empty directory inits standalone module in that directory
  - https://github.com/dagger/dagger/pull/11812/changes/90489000b01b137ba2336159c98f94d6814e65dd
  - This is to keep a bunch of tests passing - otherwise they initialize a workspace and install the module into .dagger/modules/foo/ and config.toml
- dagger module init --source is specifically source dir, not dagger.json dir, which is what the positional arg is
  - https://github.com/dagger/dagger/pull/11812/changes/4047379a2cbc2177f1a9fdc57f8126358715f652
  - Assuming not controversial
- dagger module init --with-self-calls
  - https://github.com/dagger/dagger/pull/11812/changes/59c6e9c5b002f61821d83e059e4cda2015037a04
  - Just maintaining an existing flag
- dagger develop skips workspace modules
  - https://github.com/dagger/dagger/pull/11812/changes/6da9f58a74c30de9d59a2c99cb5b9e6d885a7fb2
  - Otherwise chicken-egg (can't dev your module if unrelated modules broken)
- Aliasing mechanism ripped out
  - https://github.com/dagger/dagger/pull/11812/changes/fe0a3eb947410f2a4b34bf81aad8c687a44f2e7c
  - too fraught - caused many test failures
  - previous attempted fixes:
- Push migration logic down into Workspace.Detect
  - https://github.com/dagger/dagger/pull/11812/changes/4e92b04b07a5d538e74d9821902d385bbfb5ccef
  - (further cleaned up later)
  - Unsure how this plays with -m, if at all
- add GitRepository/GitRef support for legacy defaultPath
- don't load workspace modules in install/uninstall/update
- bring back LICENSE creation
  - https://github.com/dagger/dagger/pull/11812/changes/5a54aa65b776f70e5e39094a5409914c5436125c
  - questionable that this is done CLI-side (lowest friction)
- engine serves dependencies, too
  - https://github.com/dagger/dagger/pull/11812/changes/a488ba2bc59e20ff8653cb1410e9d96e1f6874e9
  - needed for TestInterfaces otherwise concrete type is not loaded for caller
  - but I'm slightly sus on this change
dagger install auto-detects and runs module install
https://github.com/dagger/dagger/pull/11812/changes/13e41fb46282bfb3f3826601500458a2577e1d93
Is this a good enough heuristic so we don't need two commands?
Or should we remove this because it's too much going on in one command, and update more tests to use module install?
I would vote for remove it - too much magic IMO
cool, I'll look into that now
CURRENT STATE: workspace.Detect finds a dagger.json and effectively does a 'legacy migration' - so we IGNORE the outer defined workspace
Here is the intended workspace detection algorithm:
- Primary: find-up .dagger
- Fallback: find-up dagger.json, filtering for "migration triggers": either existence of toolchains, or source path that is not . (dagger.json with no migration triggers are ignored)
- Fallback: find-up .git
- Fallback: use workdir
So the intended design is not to unilaterally ignore outer defined workspaces whenever a dagger.json is found.
- Explicitly defined workspace .dagger always takes precedence
- dagger.json is only considered as a fallback for workspace if it has the migration markers
- NOT IN SCOPE: handling of CWD modules which is orthogonal to workspace detection
QUESTION: what would NOT ignoring the outer workspace look like? What is the end goal?
The end goal is that workspace detection always works the same, and is easy to explain and understand.
Handling of CWD modules is a superficial convenience that doesn't need to alter how workspaces are loaded. It's similar to dagger -m -> a customization of which modules are loaded in the workspace
done - and all TestModules tests passing, woo
OK, so my Detect logic needs to change - looking into that now. Basically:
- treat a legacy (toolchains, blueprints, or non-. source dir) dagger.json as if it were migrated as .dagger/
- otherwise, treat CWD dagger.json as if it were passed as -m, but that's an orthogonal code path that interacts with above, not overriding it (strikethrough edit: too prescriptive, think it might still be nice to consolidate into Detect so it's all in one place, TBD)
I think wherever -m is handled (handling of module-related client params) is probably where CWD dagger.json should also be handled. But yeah don't need to be too prescriptive
But yeah, legacy dagger.json and CWD dagger.json are completely orthogonal. You could have a CWD dagger.json in an otherwise perfectly regular .dagger/config.toml (in fact that will happen all the time)
engine serves dependencies, too
https://github.com/dagger/dagger/pull/11812/changes/a488ba2bc59e20ff8653cb1410e9d96e1f6874e9
needed for TestInterfaces otherwise concrete type is not loaded for caller
but I'm slightly sus on this change
Oooh I didn't think of interfaces...
It seems that "serve dependencies" is really a combination of 2 things: 1) add the dependency's types, 2) add the dependency's constructor to the root query. Maybe in this case we want to do one without the other? It makes sense to have the dependency's types, but it's weird if dagger functions also shows my module's dependencies' constructors, right?
yeah that sounds ideal i think?
So in workspace-land does it mean:
- Load every module from the workspace (including ExtraModules and CWDModule if relevant)
- Also load every dependency of those modules (but only expose the types)
- Make sure all types are cleanly namespaced by module name
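A rough Go sketch of the plan described in the three bullets above: workspace modules get their constructor on the root query, while their (transitive) dependencies contribute types only. ServePlan and Served are hypothetical names for illustration and are not the branch's actual implementation.

```go
package main

import "sort"

// Served says how one module is exposed to the client (hypothetical type).
type Served struct {
	Name        string
	Constructor bool // true: constructor on the root query; false: types only
}

// ServePlan returns every module that must be loaded, sorted by name.
// workspaceMods are the modules listed in the workspace (plus any extra or
// CWD module); deps maps a module name to its direct dependencies.
func ServePlan(workspaceMods []string, deps map[string][]string) []Served {
	constructor := map[string]bool{}
	queue := []string{}
	for _, m := range workspaceMods {
		constructor[m] = true
		queue = append(queue, m)
	}
	// Breadth-first walk over dependencies; anything first reached this way
	// is served for its types only.
	for len(queue) > 0 {
		m := queue[0]
		queue = queue[1:]
		for _, d := range deps[m] {
			if _, seen := constructor[d]; !seen {
				constructor[d] = false
				queue = append(queue, d)
			}
		}
	}
	names := make([]string, 0, len(constructor))
	for n := range constructor {
		names = append(names, n)
	}
	sort.Strings(names)
	plan := make([]Served, 0, len(names))
	for _, n := range names {
		plan = append(plan, Served{Name: n, Constructor: constructor[n]})
	}
	return plan
}
```

A module that is both a workspace module and someone else's dependency keeps its constructor, since workspace modules are registered before the dependency walk starts.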
so, -m pretty clearly has to be at least somewhat handled by the CLI considering it's a flag, but maintaining support for CWD dagger.json in "the same spot" seems like it would mean re-implementing FindUp logic - we can't just use Host.findUp because now this happens before we even have a client
the reason I was considering pushing it into Detect is we already do a findUp there, and perhaps we just need a way to influence 'extra modules' from there too
also: when the user uses -m some/other/ref while they happen to be in/under a dir with dagger.json, we expect that to ignore the dagger.json right?
(doing the findUp CLI-side is fine btw if we still want that in the end, just surfacing nuance as I find it)
Yeah the CLI should not have to do any find-up (and I don't think it does today?)
it does, by virtue of using dag.ModuleSource(".").ConfigExists(ctx) - which happens after client init
To be clear when I said "handle -m and CWD in the same spot" I meant in the same spot engine-side. So I should have said "handle ExtraModules and CWD in the same spot"
--> As far as I know this is how I had already implemented it?
If I remember correctly, there's a "pending modules" system to avoid the deadlock between client connection, loading modules, establishing session attachables
did that not work properly?
hard to say, it was a bit of a blur getting down from 168 failing tests
that system is still there, but I think that specific part got lost in the mix while pushing more down into workspace.Detect
(working on fixing this up now)
maybe we should start a fresh set of tests that are workspace-native? Not to replace the existing tests, but to remove some of the pressure on the workspace implementation by clarifying the purpose of each suite?
we do have the workspace suite, and I don't find the module suite to be particularly confusing, but more tests with clearer purpose is never a bad idea
the workspace suite is more focused on the Workspace API, though, we're probably missing coverage for a bunch of specific CLI / code organization scenarios, i agree
(back from dentist)
dagger module init in empty directory inits standalone module in that directory
https://github.com/dagger/dagger/pull/11812/changes/90489000b01b137ba2336159c98f94d6814e65dd
This is to keep a bunch of tests passing - otherwise they initialize a workspace and install the module into .dagger/modules/foo/ and config.toml
Instead of "is workdir empty", we should use "is there an initialized workspace".
- If there's an initialized workspace -> create in .dagger/modules/ regardless of what's in workdir
- If there is no initialized workspace -> create in workdir
Will keep things more consistent & predictable for users inside an initialized workspace.
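The proposed rule fits in a few lines of Go. ModuleInitDir is a hypothetical helper, and passing an empty workspaceRoot to mean "no initialized workspace" is an assumption of this sketch:

```go
package main

import "path"

// ModuleInitDir picks where `dagger module init <name>` would create the
// module under the proposed rule. workspaceRoot is the directory containing
// .dagger, or "" when no workspace has been initialized.
func ModuleInitDir(workdir, workspaceRoot, name string) string {
	if workspaceRoot != "" {
		// Initialized workspace: always under .dagger/modules/, regardless
		// of what the workdir contains.
		return path.Join(workspaceRoot, ".dagger", "modules", name)
	}
	// No workspace: standalone module in the workdir.
	return workdir
}
```

The decision no longer depends on whether the workdir is empty, only on whether a workspace exists.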
@brittle wharf do you think we should do another spinoff from workspace, and get the plumbing merged first, without any of the UX? Take advantage of backwards compat / migration, to convert everything to workspaces internally, but without any of the user-facing changes?
It would take a lot of discipline to protect the plumbing from the legacy UX
but if we do it, it would be just a couple weeks difference between the two PRs I'm guessing
wdyt, worth it?
update: it did work properly lol, I brought it back to how it worked before and all the Module tests pass. Must have been a red herring, or maybe the things I fixed were fixed by another change upstream
will keep it in mind, don't have a strong feeling at the moment
@dim smelt what do we expect for this test?
https://dagger.cloud/dagger/traces/de234c0102206e0f628927338c656709?span=76572aaa5affdf6d
Currently, the dagger module init calls are also initializing the workspace, which goes against #1468070450524459029 message but that's beside the point I think; these are the Workspace tests so that more implies they're just missing an explicit dagger init.
Right now the test asserts that when you run with your workdir in a nested module, dagger functions does NOT show the workspace modules - only the current module's functions, blueprint-style.
But I think the test is wrong based on our conversation - it should show all the workspace modules, and the CWD/blueprint module's functions
pushed - dagger functions and dagger call show other modules in addition to the blueprint, updated the test
Makes me feel better
(sorry missed those messages... catching up)
But I think the test is wrong based on our conversation - it should show all the workspace modules, and the CWD/blueprint module's functions
Yes I think so, unless it causes genuine UX issues to do it this way..
I think if we don't do it this way, it can become confusing and frustrating for "workspace-native" users. Example:
- I have my workspace setup (~/repo/.dagger)
- I init a module in my workspace (~/repo/.dagger/modules/api)
- For developing my module, I initialize a sub-workspace with exactly the modules I need to develop my module (~/repo/.dagger/modules/api/.dagger)
- For mysterious reasons, when developing my module, my specialized workspace doesn't work...
addressed this - now there's a ServedMods type which allows per-module configuration of whether the constructor is installed
plus dagger module init only doing workspace stuff when the workspace is already init'd (instead of 'is dir empty?')
Nice. I've been drafting a detailed lockfile design doc, will push on the branch
- will attack multi-env support in config
pushed a first pass at lockfile implementation
nice. merged main back in so CI can run
@brittle wharf I noticed in main, modules can query Workspace.root which seems to leak the workspace path relative to the host fs. Two problems with that 1) the workspace might be remote, 2) I'd rather not leak this info to modules
--> OK to remove?
sgtm, I left things pretty open expecting to whittle away at things
clientID is also exposed, might go away later but figured it could help with troubleshooting while we work on it
On the other hand, it would be AWESOME to give modules some sort of "stable identifier" for their workspace, to eg. namespace their cache volumes off of
--> Finally we can solve the nightmare that is NPM cache volumes
With workspace id + app path in workspace, I can safely create a different node_modules cache per app, and set its concurrency to LOCKED -> no corruption on concurrent writes, and no global lock across all apps ...
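A minimal sketch of the namespacing idea above, assuming a module can obtain some stable workspace identifier. The `workspaceID` value and the `cacheVolumeName` helper are hypothetical: nothing in the thread says what the identifier looks like or how it would be exposed. The sketch only derives a per-app volume name; creating the volume and setting its sharing mode to LOCKED is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// cacheVolumeName derives a stable cache volume name from a workspace
// identifier and an app path inside that workspace.
// Both inputs are assumptions: the thread only wishes for a "stable identifier".
func cacheVolumeName(prefix, workspaceID, appPath string) string {
	// Normalize so "apps/web/" and "./apps/web" map to the same volume.
	clean := path.Clean(appPath)
	// The NUL separator keeps ("a", "b/c") distinct from ("a/b", "c").
	sum := sha256.Sum256([]byte(workspaceID + "\x00" + clean))
	return fmt.Sprintf("%s-%s", prefix, hex.EncodeToString(sum[:8]))
}

func main() {
	// Two apps in the same (hypothetical) workspace get two different volumes.
	fmt.Println(cacheVolumeName("node-modules", "ws-1234", "apps/web"))
	fmt.Println(cacheVolumeName("node-modules", "ws-1234", "apps/api"))
}
```

Because the name depends on both the workspace and the app path, each app would get its own node_modules cache, so a locked volume would only serialize writers of the same app rather than all apps.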
cc @unborn scroll I know you suffered from this. The nightmare will soon be over
I wanted to have a look at using generate for dagger develop, based on workspaces to avoid duplicate efforts.
But on workspace branch both dagger check -l and dagger generate -l don't work.
Was it working before or not at all? (to understand if that's a new problem introduced recently or not)
What's the error? I'll try to repro
Error: checks from module "dagger-cli": "dagger-cli": no main object
I just ran dagger migrate on a clone of dagger/dagger then dagger check -l at the root
new error
ok, I can have a look at the recent changes and find why
@dim smelt are you using a skill to generate docs like https://github.com/dagger/dagger/blob/workspace/hack/design/bugs/workspace-missing-spans.md ? Or is it initially hand written?
It's not fully formalized as a skill, I wrote a few lines in AGENTS.md (but didn't get merged). I regularly tell the agent to keep the design doc updated, so that another engineer can cleanly take over. I also make sure the task list stays up to date, and I say "when you complete a task, change the status in the tasklist in the same commit". It's ad hoc and imprecise, but mostly good enough
Definitely NOT hand written
basically it's my persistence layer to keep the agents as stateless / ephemeral as possible
I've pushed a fix to the workspace branch, with a similar hack/design/bugs file. I don't know if we want to have that for all of the bugs we are fixing, but that's interesting. https://github.com/dagger/dagger/blob/006094c32c727452c055f8017e41d6523746c2bc/hack/design/bugs/workspace-checks-no-main-object.md
FYI I pushed the workspace branch with:
- merge from main
- fix so it can correctly build, especially on darwin (cmd/dagger was referencing some parts of core, adding a dependency on buildkit that can't build on darwin)
more details about the build issue here: https://github.com/dagger/dagger/blob/workspace/hack/design/bugs/workspace-darwin-build-failure.md
I pushed a first design draft and implementation plan to use dagger generate instead of dagger develop. I'll iterate more on them, but it might be interesting (maybe I should have created a different branch?)
Quick overview of the current progress on dagger generate for modules:
$ dagger module init --sdk=go plop
connect 0.0s
Workspace.moduleInit(name: "plop", sdk: "go", license: "Apache-2.0"): String! 7.2s
Created module "plop" at /root/src/current/.dagger/modules/plop
Installed in /root/src/current/.dagger/config.toml
$ cd .dagger/modules/plop
$ ls
dagger.json main.go
$ cat dagger.json
{
"name": "plop",
"engineVersion": "v0.20.2-20260310101303-dev-a99318c0ce55",
"sdk": {
"source": "go"
}
}
$ cat .dagger/config.toml
[modules.develop]
source = "sdk:go:develop"
$ dagger generate
develop:generate 24.0s 2×
Apply changes?
dagger.gen.go +233
go.mod +53
go.sum +95
internal/
internal/dagger/
internal/dagger/dagger.gen.go +17103
6 files changes, +17484 lines
Apply Discard
One thing to notice. We have go, typescript, etc as shortcuts for embedded (or not) sdks.
I was wondering what to put for the "generate/develop" module. So I thought about sdk:go:develop. Using : to express this is not a real path (I mean sdk/go/develop is a path, but only if we think about the dagger sources)
And I think it expresses the intent quite well: what this module is for.
It maps to the sdk/go/develop relative path. We often have a sdk/LANGUAGE/runtime module, so having sdk/LANGUAGE/develop can make sense.
But all that is up for discussion.
This sdk:go:develop module is a pretty short dang one, with a pre-built codegen binary.
It still needs some refinements and a bit of work, but it's working quite well for now. One thing in particular is to work on the migration path and other SDKs.
And that also handles the init. So if the module directory only contains .dagger/config.toml and dagger.json as shown above, everything else will be created.
One of the really good aspects here is that, with the Changeset integration in the CLI, it's very explicit.
One part still missing is the .gitattributes/.gitignore files
$ dagger generate
develop:generate 24.0s 2×
Apply changes?
dagger.gen.go +233
go.mod +53
go.sum +95
internal/
internal/dagger/
internal/dagger/dagger.gen.go +17103
main.go +37
7 files changes, +17521 lines
So I added this plop module to the dagger/dagger workspace.
If I dagger functions from the root of the workspace, I can see it. If I dagger call plop container-echo ... it's working as expected.
But dagger functions plop is failing with Error: no function "plop" in type "DaggerDev"
Is that a known issue? Is someone working on it? (to avoid conflict / duplication of work)
If not, happy to have a look
seems related to the "clobbering" issues that @brittle wharf was looking into?
hmm don't think so, the clobbering is gone, that looks like it's trying to call daggerDev.plop instead of Query.plop? haven't familiarized myself with the new dagger call <ref> model yet
I can have a look at that. The call works, it's only the functions that doesn't. I don't think that's a big issue, was mostly wondering if it was a known one.
@topaz seal for develop->generate, maybe a different branch? wdyt
we're already struggling to get this one merged, I worry that we're adding faster than we're stabilizing
I'm already working on a different branch for that, yes. And I will probably open a PR -> workspace, at least to discuss it, then squash the result once we agree it's ok. (or after the workspace branch is merged, we'll see)
Go is working, I'm looking at Typescript and Python, and need to ensure it's still compatible with the old world so other SDKs can still work.
@topaz seal that branch is for the discussion we had in SF?
Yes, I already have an implementation working (as in the messages above)
Including a migration path from develop to generate
Just thinking about module migration with dagger generate instead of develop. Migrating from develop to generate doesn't really change how that works. It's mostly a change in how the generation function is called. So maybe this can be generic: provide a way for SDKs to work using generate without requiring changes at the SDK level. A kind of module that gets the required info and then calls the existing codegen from the module SDK.
That way we can keep this change as minimal as possible on SDKs, we don't need to update them all. But we still make the change with a workspace for each module and the removal of the dagger develop command.
Not sure if that's clear, as I'm cooking at the same time
but I think we can find a nice path so we change the UX with as few changes as possible (since even that minimum is already big)
I'll explore that asap.
I'm giving a try at spinning out the workspace plumbing + backwards compat in its own PR, with as little new UX as possible.
I have a first POC working that way. A generic sdk:compat:develop module that uses the sdk.source from dagger.json to run the existing codegen. That way, we can have a fully compatible layer, meaning we can move to removing dagger develop and only using dagger generate without reworking all SDKs, with the switch handled by the dagger migrate command.
Then each SDK can work on providing a proper develop module.
And I'm also thinking about a way to then move from the generic compat module to an SDK-specific one, I think it can be achieved using dagger generate again
@dim smelt what about a dagger module list (or dagger module --list but as we have dagger module init, dagger module install...) to just list the installed modules in the workspace?
Nothing fancy, just displaying in a table the name (and * if it's aliased) and the source. Nothing more.
I was thinking about that as a quick way to know what's available without loading any module, looking at functions, or reading the config file.
love it
$ dagger module list
connect 0.0s
Workspace.configRead: String! 0.0s
Name Source
changelog ../toolchains/changelog
ci ../toolchains/ci
cli ../toolchains/cli-dev
dagger-dev* modules/dagger-dev
docs ../toolchains/docs-dev
elixir-sdk ../toolchains/elixir-sdk-dev
engine-dev ../toolchains/engine-dev
go ../toolchains/go
go-sdk ../toolchains/go-sdk-dev
helm ../toolchains/helm-dev
installers ../toolchains/installers
java-sdk ../toolchains/java-sdk-dev
php-sdk ../toolchains/php-sdk-dev
python-sdk ../toolchains/python-sdk-dev
release ../toolchains/release
rust-sdk ../toolchains/rust-sdk-dev
sdks ../toolchains/all-sdks
security ../toolchains/security
test-split ../toolchains/test-split
typescript-sdk ../toolchains/typescript-sdk-dev
It's currently reading the config file on client side (asking the engine the config, then parsing the toml inside the CLI).
I don't know what to think about that. I did it that way because it's just easy, but it means there's a small part of the toml logic that is on the client side.
Would it be better if I instead added a moduleList function in the workspace schema? I wasn't sure it's very useful and didn't want to add extra functions if we can avoid it, but I can quickly do it if that sounds better.
Also, the source is a raw path, so relative to the config.toml. Should I make them relative to the workspace root instead? Or does that not matter?
I think it's better to keep it engine-side, even though there is a price to pay (higher latency even for basic operations).
It avoids split brain, and will handle special case for example CWD-module, -m ...
very nice - though IMO these paths should be either relative to where you ran the command from (CLI cwd), or absolute; relative to config.toml feels confusing
Yeah I thought about that. I think having them relative to the workspace root can be nice
Name Source
changelog toolchains/changelog
ci toolchains/ci
cli toolchains/cli-dev
dagger-dev .dagger/modules/dagger-dev
docs toolchains/docs-dev
elixir-sdk toolchains/elixir-sdk-dev
engine-dev toolchains/engine-dev
go toolchains/go
go-sdk toolchains/go-sdk-dev
helm toolchains/helm-dev
installers toolchains/installers
java-sdk toolchains/java-sdk-dev
php-sdk toolchains/php-sdk-dev
python-sdk toolchains/python-sdk-dev
release toolchains/release
rust-sdk toolchains/rust-sdk-dev
sdks toolchains/all-sdks
security toolchains/security
test-split toolchains/test-split
typescript-sdk toolchains/typescript-sdk-dev
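The difference between the two listings above is just a change of base directory. A rough sketch of that conversion, assuming the config file lives in .dagger at the workspace root and that sources are local paths (remote refs would need to be passed through untouched). The `sourceFromRoot` helper is hypothetical, not the actual implementation on the branch.

```go
package main

import (
	"fmt"
	"path"
)

// sourceFromRoot rewrites a module source written relative to the config
// file so that it is relative to the workspace root instead.
// configDir is the directory holding config.toml, relative to the root.
// Assumption: source is a local, slash-separated path, not a remote ref.
func sourceFromRoot(configDir, source string) string {
	// path.Join also cleans the result, which resolves ".." segments.
	return path.Join(configDir, source)
}

func main() {
	for _, src := range []string{"../toolchains/changelog", "modules/dagger-dev"} {
		fmt.Printf("%-26s -> %s\n", src, sourceFromRoot(".dagger", src))
	}
}
```

Paths relative to the CLI's current directory, as also suggested above, would need one more step: making the result relative to wherever the command was run.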
sweet! though i regret to inform you that there are merge conflicts preventing CI from running
will take a swing at it if i have downtime - looks small
Yeah it's still WIP
pushed a few test fixes
now ripping out lockfile (thought it was needed for plumbing PR but actually can move to porcelain)
nice. i've been dogfooding Workspace from an LLM perspective - added Workspace.glob, Workspace.exists, Workspace.search. huge wins so far, the Doug + Dev modules feel a lot clearer and don't have to be re-initialized with a new snapshot on every turn
oh man I can't wait to dagger install doug in any project
Workspace.search is a little interesting - tries rg client-side, falls back on grep, so the main question is portability/compatibility - hopefully that's enough? that aside, feels super worth it
also maybe dagger install claude ?
oh also I really want withMountedWorkspace now
it's perfect for MCP servers
ship it pls
ha ha coming soon
thinking out loud: for LLM I think we may want "Workspace + rolling Changeset" - a workspace based on a directory on the host, but with a Changeset that keeps growing and is transparently applied to all APIs (somehow).
either that, or a way to sync changes back to a workspace eagerly, but that feels less cool (now you need to keep agents from stepping on each other's toes) instead of having sandboxed agent trees.
the problem: when Doug does an edit, its grep tool won't see its changes, because Doug's edit just returns Changeset - changes aren't applied until the user presses ctrl+s.
'somehow' is doing a lot of work there, there could be another way of looking at it altogether
(like maybe syncing back is the answer after all, not completely off the table)
: when Doug does an edit, its grep tool won't see its changes, because Doug's edit just returns Changeset - changes aren't applied until the user presses ctrl+s.
Feels like we're witnessing the collision of 2 worlds: pure chainable functions vs. traditional syscalls with side effects. How do we reconcile these 2 worlds in our DX?
(I remember we discussed this hypothetical future, where there's 2 different ways to do everything: return the artifact, or call (at the time) Env.export() or whatever)
Now with Workspace, and the implementation being much further along, the problem is much more concrete
The good news: with explicit Workspace arguments, it's much easier to split up a big top-level function with side-effects, into smaller pure functions that return this or that artifact. (Was basically impossible with argument pragmas)
Last iteration. I'm good to push it to the workspace branch but as we split in multiple PRs I wonder if that's the right place or if I should keep it on the side for now
I suspect this can go to the workspace branch, the other PR is non-CLI-UX plumbing
for the coding agents case, it's really tempting to reach for git: agents commit on their respective branches, and leave your checked out tree alone, like container-use did, with automatic syncing back and forth via vanilla git push or pull
@brittle wharf going to need your help on moving towards merge
I'm trying to get this plumbing spin-off ready, but now having doubts about whether it was worth it. Going over it now
eh my initial instinct on seeing the PR was positive at least, happy to help
(despite saying it might not be worth it last time you asked)
I asked the agent to take a pass at remaining test failures, with the goal of passing the original main tests (makes sense since we target backwards compat). But it looks like it took it too literally, and reverted big chunks of the implementation to pre-workspace - especially check. I'm inspecting the damage
if we can get all the checks passing on a significant amount of changes that's worth it on its own
yeah that was my hope
Let me revert the last dumb changes. I think the core had promise
I pushed two changes to workspace branch:
- dagger module list as shown above
- a fix so that dagger functions MODULE works as expected (dagger call MODULE --help was already working, it was only about functions)
I have one more change, but I'm not entirely sure if that's the right thing to do.
When running dagger functions MODULE it loads everything. All the modules. So I have a version where it loads the blueprint modules plus any module with the same name as the MODULE we are focusing on. Both, because at this point we don't know for sure if it's a module or a function of one of the blueprint modules.
This gives interesting results (dagger functions python-sdk from 15s to 10s -> see attached screenshots, state was for both without cache but dang SDK loaded) but I'm not entirely convinced by the changes, especially since it adds a FocusModule field to ClientMetadata. It works, but I'm not sure the design is great and I fear it complicates the possibilities even more (as we already have things like SkipWorkspaceModules).
Also this only matters if we haven't already run a dagger functions, for instance, which will already load all modules from the root. If we have, the optimisation is kind of useless.
That's why I haven't pushed it to the workspace branch for now and here is the corresponding commit.
Let me know what you think.
@topaz seal quick question: where do you put your worktrees?
my dagger/dagger clone is inside ~/dev/src/github.com/dagger/dagger.
The associated worktrees are at ~/dev/src/github.com/dagger/dagger-worktrees For instance ~/dev/src/github.com/dagger/dagger-worktrees/workspace
At first I wanted to put them inside dagger/dagger but I've read several times that this is not the best thing to do.
The directory name is the branch name (that's just a convention) and I have several aliases that allow me to quickly create a worktree for a branch or a PR, open a new dedicated tmux session, and launch claude inside. That way most of the time I just daggerdev MYBRANCH and I'm done.
ok cool. i'm investigating having agents able to automate them, related to this - representing branches of work as worktrees just makes a ton of sense because then Workspace.search, glob etc. can just run from those checkouts, and the user can keep an eye on them by just hanging out in the worktree directories. it came to a similar conclusion, of putting them beside the original repo, not inside of it (which tends to confuse editor configs and risks getting synced up multiple times with bad filtering)
i'll tell it to do ~/src/foo-worktrees/<branch> for ~/src/foo and see how it goes
thank you that second one was on my list! I think even dagger functions with no arg works weird on our own repo?
weird how? on the workspace branch a dagger functions works just fine (or at least with this fixed version). It loads all the modules and displays functions and modules
ok great! for me it only printed the main module's functions, not the toolchains
I wonder if the issue wasn't because of the alias -> blueprint
The .dagger/config.toml was still using alias = true but the engine is using blueprint. If I'm right I changed it.
So the aliased/blueprinted functions are now visible at the root of a dagger functions, the daggerdev is not displayed anymore and all the other modules are visible
ah yeah we had some churn on the design. I remember we decided to not use the term "blueprint" anymore. Then ran into issues with aliasing (cc @brittle wharf)
@topaz seal did you remove blueprint in our config file then?
Ok, I think I missed something then.
No I changed alias to blueprint in the config file.
I thought the opposite, that we were moving to blueprint instead of alias, at least that was my understanding as the config structs only referenced blueprint and not alias, even if there's the autoAlias function somewhere
So are we going back to alias = true? Or something else?
I don't remember, too much context switching. let me refresh my memory
@brittle wharf how warm is your cache on this part?

the last thing i remember is realizing alias = true should become blueprint = true, which turned into Workspace.defaultModule plus a planned tweak to the config to make it clear that you can only have one. and that we needed to make it work without actually merging the 'default module's fields into the schema Query, because that caused tons of issues
(asking Claude to read the thread and explain to me what the status was on it)
Current state / consensus
The consensus from the thread is:
1. Schema-level aliasing is dead: too many breakage vectors, abandoned
2. entrypoint = "module-name" at the top level of config.toml was the agreed direction (Feb 24),
replacing per-module blueprint = true
3. CLI handles it: Workspace.defaultModule returns the entrypoint name, CLI prepends it to queries
4. But it was never fully implemented/landed: the config structs still reference blueprint,
there's an autoAlias function floating around, and nobody cleaned up the gap between the agreed
design and the actual code
The confusion in today's messages (eunomie switching alias → blueprint, you not remembering) stems
from this incomplete transition. The design decision was made, but the code wasn't fully updated to
match.
Bottom line: You and Vito agreed on entrypoint as the concept and config keyword, with CLI-side
handling via Workspace.defaultModule. But the implementation still has remnants of blueprint/alias
that need to be reconciled.
@topaz seal
Let me check the design doc in the branch. It should always be up to date. If it's not, it's my bad
Thanks. I probably have things to change to use the Workspace.defaultModule then.
Not sure I'll have time today for the change, but I can do that on Monday (to align all the things I've pushed, and the other branch I have)
ah yea that all sounds right 
@topaz seal let me double check the state of that "defaultModule" approach. I remember we were not quite done bikeshedding it (I was worried about fragmentation, because a client can't just connect - they have to know the defaultModule dance)
@brittle wharf @topaz seal
The "entrypoint" / client fragmentation problem
Your concern (raised Feb 24 and again just now) is:
If aliasing is handled CLI-side via Workspace.defaultModule, then every client has to "know the dance": query
Workspace.defaultModule, then prepend it to every query. A client that just connects and introspects the schema won't see the "right"
thing. This creates fragmentation:
- dagger call has to know about it
- dagger functions has to know about it
- dagger shell has to know about it
- dagger check has to know about it
- dang has to know about it
- Any third-party client has to know about it
- Each command might implement it slightly differently (or not at all)
You said on Feb 24: "Still slightly worried about fragmentation ('this command honors entrypoint; this one doesn't') but at least the
concept is easy to explain"
And just now: "I was worried about fragmentation, because a client can't just connect - they have to know the defaultModule dance"
The tension
There are really only two options, and both have downsides:
| Approach | Pro | Con |
| --- | --- | --- |
| Schema-level (engine hoists functions into Query) | Clients "just work": connect, introspect, done | Causes recursive calls, dependency clobbering, core API shadowing; Vito documented 3 breakage categories |
| Client-level (CLI queries defaultModule and prepends) | No schema breakage | Every client must implement the dance; fragmentation risk |
The earlier attempt at a middle ground, a single exclusively-served entrypoint module (Feb 20: "we could simply support a single
entrypoint module, that would be exclusively served - no other modules allowed"), was rejected because it would drop constructor
support and break -m.
Unresolved question
Is there a way to make the engine handle it without schema stitching? For example:
- A special connection-level parameter that tells the engine "resolve bare function names against this module first"
- A query-routing layer below the schema that redirects, rather than merging types
- Keeping full schema but marking the entrypoint module distinctly in introspection metadata so clients can discover it without a
separate query
This is the open design question. Want to hash it out here, or take it back to the thread with Vito?
We should seriously consider just plugging a designer agent straight into discord
For the record, today I'm working on 2 blockers:
- Entrypoint module (formerly blueprint=true, alias=true..) as discussed above
- Paths in workspace API
I will make sure these blockers are resolved consistently across workspace and workspace-plumbing; and I will make sure the design (both the decided and undecided parts) is clearly and consistently documented in design docs + PR descriptions.
Then:
- I hope to get tests passing workspace-plumbing, and from there: merge
@brittle wharf do you think codegen should honor entrypoint, or not?
- Pro: I like the consistency (every client sees the same thing)
- Con: when you change your workspace config to change the entrypoint, you break your generated clients (dag.Foo().Bar() must be changed to dag.Bar(), or vice-versa)
I don't think it can, even if we wanted to - you'd run into the same problems with nested fields colliding
I'm also playing with the idea of making the core module more real (dag.Core().Container()), and subject to the same entrypoint rules. It would just be set as an entrypoint by default. Could be a nice simplification of the model, fewer special cases
hmm that could change the equation a bit, yeah
personally i don't have much motivation in applying the same sugar to codegen
feels like the juice isn't worth the squeeze. i still prefer to think of it as a CLI UX layer on top of drier primitives (API + codegen)
It kind of breaks the experience if eg. you want to build a dependency in your test code - in the CLI that's dagger call build but in the code it's dag.Foo().Build()
If it's not worth it in code, it kind of begs the question of whether it's worth it in the CLI
(I guess we need it in the CLI, for backwards compat with main module)
that's where i disagree - CLI commands are handwritten, it makes sense to optimize to reduce keystrokes. for code, you write it once, and hope the sands don't shift from underneath
OK, so we'd be defining entrypoint as a CLI convenience only? Or for hypothetical non-CLI clients: "use as a convenience hint for end user, when applicable"
yeah i suppose so. it's a progressive enhancement type thing, right now it's CLI only but perhaps it's relevant for other things later
but wait, the way you propose implementing entrypoint, would other modules be completely hidden? If it's implemented by the CLI prepending every query with foo
they don't get hidden - the other modules just get mixed in. so you still have the same collision concerns, between your entrypoint functions and module names, but it's at least much less delicate (underlying API still works, collision is just command shadowing, can emit a warning etc)
So the CLI queries twice I guess? And does client-side "stitching"?
dagger functions ->
- One query to list functions in the entrypoint
- Another query to list functions at the toplevel
- CLI merges the results of both queries?
i'm not sure if it's two queries; i recall there being one big custom GraphQL introspection call at one point (maybe currentTypeDefs)
I'm investigating
(ie. doing my homework)
You're right it's not 2 queries. But there's also a regression in dagger functions the way it's currently implemented:
- in main, dagger functions outputs 1) all functions of the main module 2) the name of all toolchain modules
- in workspace, dagger functions outputs 1) the name of the entrypoint module, 2) the name of all other modules
ah, hm could have sworn i had that working at one point (with a test and everything). but it's all a blur
Codex seems to think there's a way to do "schema stitching lite" with the best of both worlds. Let me run that through the bs-meter and report back
Proposal: priority-based field resolution for entrypoint modules - entrypoint-priority-resolution.md
sounds reasonable! pretty simple, just don't clobber
How do you feel about inserting a dag.Core(), always set as an entrypoint for now - but keeps options open for later?
I know you and @lofty shadow were discussing that as a desirable thing eventually, separate from this
no objection in principle
@dim smelt i have a kinda cool demo of workspaces + worktree automation if you're around
Interested, joining lounge
Update: I'm rebasing workspace, and trying to get workspace-plumbing green
feel free to ping for reviews/help. I went on a grand LLM adventure and it led me back home
i can start chipping away at CI failures if there's nothing else in particular
OK let me push in-progress fixes first then
@brittle wharf I think a review on workspace-plumbing would be good. I don't think any slop made it in, but I might have missed a spot
There's a "ledger" doc that should have details on ongoing changes and especially drift with the workspace branch, to inform the future rebase
Well that rebase went horribly wrong... I just remembered we gave up on that, and merge main instead
what's your policy for merging vs. rebasing on the plumbing pr? rebase? i just merged dang runtime, and rebased it on main. only issue was a conflict in workspace_test.go since the first commit nukes it, but i just brought it back to resolve that, since a later commit restored it from main anyway
since the rebase went smoothly i'm assuming rebase, will push -f on confirmation
I've been rebasing
@dim smelt pushed a rebase + minor fixes for TestWorkspaces
these TestTelemetry failures are pretty sus, looking through the PR to find what might have done it
https://dagger.cloud/dagger/traces/453599d7afc79f66197cb54d6c409abd?span=11cd3a18dc824534
--- expected
+++ actual
@@ -7,21 +7,27 @@
β°β΄β starting session X.Xs
-β load module: ./viztest X.Xs
-ββ΄β finding module configuration X.Xs
-ββ΄β initializing module X.Xs
-ββ΄β inspecting module metadata X.Xs
+β load workspace: . X.Xs
+ββ΄β load extra module: ./viztest X.Xs
+β
+ββ΄β ModuleSource.moduleName: String! X.Xs
+β β viztest
+β
+ββ΄β ModuleSource.blueprint: ModuleSource! X.Xs
+β
+ββ΄β ModuleSource.toolchains: [ModuleSource!]! X.Xs
+β
β°β΄β loading type definitions X.Xs
β parsing command line arguments X.Xs
-β viztest: Viztest! X.Xs
-β .pending: Void X.Xs ERROR
-ββ΄β container: Container! X.Xs
-ββ΄$ .from(address: "alpine"): Container! X.Xs CACHED
-ββ΄β withEnvVariable NOW=20XX-XX-XX XX:XX:XX.XXXX +XXXX UTC m=+X.X X.Xs
-ββ΄β withExec sleep 1 X.Xs
-ββ΄β withExec false X.Xs ERROR
-β ! exit code: 1
-β°β΄β withExec sleep 1 X.Xs
+β Viztest.pending: Void X.Xs ERROR
+β°β΄β Viztest.pending: Void X.Xs ERROR
+ ββ΄β container: Container! X.Xs
+ ββ΄$ .from(address: "alpine"): Container! X.Xs CACHED
+ ββ΄β withEnvVariable NOW=20XX-XX-XX XX:XX:XX.XXXX +XXXX UTC m=+X.X X.Xs
+ ββ΄β withExec sleep 1 X.Xs
+ ββ΄β withExec false X.Xs ERROR
+ β ! exit code: 1
+ β°β΄β withExec sleep 1 X.Xs
first part makes sense, second part seems extremely weird (nested beneath itself??)
ohh maybe we have an extra core.AroundFunc being added somewhere 
I'm going to go ahead and apologize in advance, just in case
I'm going to rewrite my commit messages to shut up the DCO error
@brittle wharf besides CI errors, did you spot anything stupid or worrying in the code?
nothing egregious stood out yet, besides github's review UI kind of shitting the bed with the size of it
only thing i noted so far is currentTypeDefs(includeCore) defaulting to true is problematic with the Go SDK (https://github.com/dagger/dagger/issues/8810)
haven't pushed since earlier
ok making headway on the telemetry stuff, it's from the entrypoint proxy
I'm looking into filterCore in the CLI, that's more logic CLI-side than I was hoping for
yep
pushed telemetry fixes
- fix duplicate telemetry in entrypoint proxy wrapper
- fix proxy wrapper resulting in constructor call being marked 'internal' and hidden
- error early if resolving -m module ref fails
- fix case where a type Broken struct {}; func (Broken) Broken() {} module would result in 0 functions
  - this is a little fishy - the test happened to hit it, and i'm not 100% confident in the work-around, similar to the clobbering bug but instead of clobbering you just don't get the function
Thanks. I'm still digging into the "fat CLI", I'm unhappy with the current implementation, trying to disentangle what happened and why
Fat CLI dimensions across main, workspace, and workspace-plumbing - fat-cli-branches-table.md
@dim smelt pushed a fix for TestContainer suite - it was just one failure, but it caught the fact that fields need to be exposed as entrypoints, too
this one seems similar - call --foo func --bar doesn't work I guess? (constructor args)
https://dagger.cloud/dagger/traces/8b92ec22183b36d2448d7e6f71ab49d8?span=fab141ae0331e74f
pushed a fix for that one (and maybe others) - the constructor flag handling is simpler now, just installs every constructor arg as a flag on call
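A minimal sketch of the "install every constructor arg as a flag on call" idea, using only Go's standard flag package. This is an assumption for illustration, not the actual CLI code: parseCall and the argument names are hypothetical, and the real CLI may use a different flag library. It shows why call --foo ... func --bar can work: parsing stops at the first non-flag argument, so the constructor flags are consumed and the function name plus its own flags are left over.

```go
package main

import (
	"flag"
	"fmt"
)

// parseCall registers every constructor argument as a string flag on a
// hypothetical "call" flag set, then parses args. It returns the constructor
// values that were set and the remaining arguments (function name and its flags).
func parseCall(ctorArgs []string, args []string) (map[string]string, []string, error) {
	fs := flag.NewFlagSet("call", flag.ContinueOnError)
	for _, name := range ctorArgs {
		fs.String(name, "", "constructor argument "+name)
	}
	// The standard flag package stops parsing at the first non-flag argument.
	if err := fs.Parse(args); err != nil {
		return nil, nil, err
	}
	set := map[string]string{}
	fs.Visit(func(f *flag.Flag) { set[f.Name] = f.Value.String() })
	return set, fs.Args(), nil
}

func main() {
	set, rest, err := parseCall(
		[]string{"foo"},
		[]string{"--foo", "hello", "func", "--bar", "x"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(set, rest) // map[foo:hello] [func --bar x]
}
```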
pushing more fixes for this class of failure: entrypoint function that collides with core function and is thus not registered
https://dagger.cloud/dagger/traces/6ca5c20849fbacdd26f4d5fe34bc7230?span=e979ce13b95551b2
This is technically where we're breaking, if this is a thing people are doing at all. Might be mostly tests. At the moment, it's all stick and no carrot, since dagger call file doesn't do anything either since we hide the core APIs. What if we didn't? 
@brittle wharf do you think we could review that plumbing PR together? I'm still worried about the CLI enfattening
sure
Thanks - I'm free for the next 2 hours
Does this table make sense to you? https://gist.github.com/shykes/ea184939bd6b912c4f3a812f19235e5f
kind of? it doesn't sound extremely actionable or precise, like if i fed this back into an LLM without context i think i'd get a bunch of different perspectives/fixes for one thing or another
at this point i wonder if it would be better to just do a clean room implementation and describe the high level goal
most of this seems better saved for after workspace-plumbing too? my understanding is this branch is meant to be a bridge between the two worlds so we can merge more quickly, with some temporary scaffolding
I caught a few low hanging fruits (slop cleanup)
yeah temporary scaffolding is fine, but don't want to allow unnecessary slop into the PR just because a demented digital brain decided it was necessary scaffolding
let me push the fix I have, then we look at the result together?
sure
the remaining design puzzle I think, is constructor args
(lemme know when there's a good time to look through the plumbing, I just wanna see whether it's gonna be a wrench thrown into the cache work mainly, but don't wanna review stuff that's changing)
(which you had flagged as a possible issue)
Want to chat in 5mn?
We can go over it together
Sure yeah
@lofty shadow @brittle wharf joining lounge
pushed:
- undid my changes to test suites
- restored call constructor flags
now to rebase on main now that tuist is merged...
force pushing
@lofty shadow @brittle wharf flagging an issue in my task: the specific case of dag.foo().foo() when Foo is an entrypoint. In this particular case, we need to decide if we allow the module's inner function to shadow its constructor. If we do, then it's no longer true that an entrypoint module is always reachable in 2 ways: via its top-level alias functions, and via its constructor.
This has 2 possible implications: 1) UX (users can't always rely on going through the constructor), 2) ID repeatability -> if the constructor is not reachable, we can't make dag.foo() an eager "canonical" alias to dag.foo().foo() such that the ID for dag.foo() is indistinguishable from the ID for dag.foo().foo().
So my question is: how should we handle such a conflict?
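To make the conflict concrete, here is a rough Go sketch. It is a toy model, not engine code: Module and rootFields are hypothetical names, and it assumes the "allow shadowing" option. Each function of the entrypoint module gets a top-level alias, and when a function has the same name as the constructor, the alias takes the constructor's slot.

```go
package main

import "fmt"

// Module is a hypothetical, simplified view of an entrypoint module:
// its constructor name on the root and the functions on its main object.
type Module struct {
	Constructor string
	Functions   []string
}

// rootFields maps each root-level field name to the call chain it resolves to,
// assuming an inner function is allowed to shadow the constructor.
// The second result reports whether the constructor was shadowed.
func rootFields(m Module) (map[string]string, bool) {
	fields := map[string]string{m.Constructor: m.Constructor + "()"}
	shadowed := false
	for _, fn := range m.Functions {
		if fn == m.Constructor {
			// The alias overwrites the constructor entry below.
			shadowed = true
		}
		fields[fn] = m.Constructor + "()." + fn + "()"
	}
	return fields, shadowed
}

func main() {
	fields, shadowed := rootFields(Module{
		Constructor: "foo",
		Functions:   []string{"foo", "bar"},
	})
	fmt.Println(fields["foo"], fields["bar"], shadowed) // foo().foo() foo().bar() true
}
```

In this model the root field foo no longer resolves to the constructor, which is the loss of the "reachable in 2 ways" property described above.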
I'm still chewing on the 'thin outer shell' idea I mentioned, seems like it could help here too. CLI talks to a thin outer DagQL server that delegates to a normalized DagQL server (maybe even with core { ... }). All IDs go through the normalized server, outer server foo() delegates to foo().foo() on the normalized one, etc
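A rough sketch of that shape in Go, under heavy assumptions: normalized and outer are hypothetical stand-ins rather than DagQL types, and an "ID" here is just the joined call chain. The point it illustrates is that if the outer layer only rewrites an alias into its full chain and then delegates, every ID is minted by the normalized layer, so the alias and the full chain cannot diverge.

```go
package main

import (
	"fmt"
	"strings"
)

// normalized stands in for the inner server: every object is reached by its
// full call chain, and the ID is derived from that chain only.
type normalized struct{}

func (normalized) ID(chain ...string) string {
	return strings.Join(chain, ".")
}

// outer stands in for the thin shell the CLI talks to. It knows the
// entrypoint module's constructor and which names are entrypoint functions.
type outer struct {
	inner       normalized
	constructor string
	functions   map[string]bool
}

// ID rewrites a leading entrypoint alias into constructor + function,
// then delegates to the normalized server. Other chains pass through.
func (o outer) ID(chain ...string) string {
	if len(chain) > 0 && o.functions[chain[0]] {
		chain = append([]string{o.constructor}, chain...)
	}
	return o.inner.ID(chain...)
}

func main() {
	o := outer{constructor: "foo", functions: map[string]bool{"foo": true, "file": true}}
	fmt.Println(o.ID("foo") == o.inner.ID("foo", "foo")) // true
	fmt.Println(o.ID("file", "size"))                   // foo.file.size
	fmt.Println(o.ID("container"))                      // container
}
```

Both collision cases mentioned in this thread go through the same rewrite here: a function named like the constructor (foo) and a function named like a core function (file).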
pushed a couple of commits that implement this, looked promising locally - handles both cases (conflict with core, conflict with self constructor)
tried to get some wider CI runs before pushing but for some reason checks stopped running for my PR (https://github.com/vito/dagger/pull/416) - maybe native ci bug? cc @crisp dove @topaz zenith - tl;dr: pushed to my PR the first time, checks ran; pushed again, no checks; pushed again, no checks; pushed to upstream PR instead, checks ran.
I checked Inngest and nothing looked obviously wrong but don't 100% know what I'm looking for yet
tests that it fixed:
- dagger call engine-dev test --run TestCall/TestCoreChaining/return_file/size --pkg ./core/integration/ (file())
- dagger call engine-dev test --run TestModule/TestFloat/go/float64 --pkg ./core/integration/ (foo().foo())
I see that the last two commits had checks run for them. Are those the commits that were pushed upstream? If I understood correctly you are saying that in your vito dagger org you are not seeing checks run for the github.com/vito/dagger remote?
Yeah vito org for vito remote. It looks like checks eventually ran, so I guess it was either Inngest or GHA rate limits? What was surprising was this seemed like a period of relatively low activity, and I waited a good 10 mins, and then it worked immediately once I pushed to upstream, plenty of room for coincidence though
looks like the checks ran ~16 mins later for the last commit (force-push @ 23:44PM => load @ 00:00:31)
I wonder when the event arrived. Let me see
Is this the commit you are referring to: abfd53e?
These are the timelines of the events we received (for that PR we received only 3).
3:05 -> PR gets opened. Event processed within the minute
3:30 -> commit gets pushed. Event processed at 4:00, 30 minutes after
3:44 -> commit gets pushed. Event processed at 4:01, 16 minutes after.
The Inngest events for each were correctly sent at the right time and the Inngest functions were queued and started at exactly the same time. The culprit here is the rate limit (last two pictures). Creating the commit event took 16 minutes. We tried to create it immediately but we got a rate limit from GitHub telling us exactly when we need to retry. We are now using the response from GitHub to tell Inngest when that step must be retried, so it waited for about 16 minutes before attempting again, and that time it worked.
I wonder why your org was rate limited though, it doesn't look like it has that much activity to be honest. These are the last few commit events we have for the past 3 days: