#**Better context API**
<@&946480760016207902>
Better context API. We need a better way for dagger functions to interact with their context (everything outside their sandbox). Currently we have a "shadow API" that has evolved organically. This is becoming a blocker in several ways. Example: we need the ability to dynamically filter files uploaded from the client. Toolchains need a more explicit way to receive the context from their project (this is related to 1 - toolchain-centric UX).
Add to this another point:
We needed "contextual access to current checks". Tried to add it to Env. In doing so, had to break the internal logic of Check to match Env. Specifically, Env doesn't have a concept of a "main module", whereas Check and my ongoing improvements to it did... 11650 broke all my dev branches by adopting the Env view, which was the only way to merge.
But we don't have a working plan to reconcile the "Env view" with the "everything else view"
--> we need to reconcile
Expose checks on current environment.
Add a current-env function for dagger shell
Checks can be retrieved from the current environment. So from a module function:
dag.CurrentEnv().Checks()
Or using...
Might be time to bring this back: https://github.com/dagger/dagger/discussions/10370
@burnt locust refreshing myself on Env... is my understanding correct that Env.withModule is basically toolchains, minus the namespacing? eg. env | with-module github.com/kpenfound/jest will mount the functions for jest directly in the root. Current best practice to avoid conflicts is: "caller is responsible for not mounting modules in the same env that have conflicting functions"?
yep! and: yes it could probably just be given checks instead 90%+ of the time
side tangent: for that use case, being able to run a single test becomes especially important, not sure what the state of the art is there w/ checks
I'm looking for a way to undo the damage done by 11620 11650, it basically rams an Env-like model on Check
because of the completely different meaning of the word "module" in both types
at this point it's a free-for-all architecturally, we need to get out of this situation asap or it will get even more inextricable than it already is
yes sorry
If i'm not mistaken, our de facto "Context shadow API" has 6 parts:
- contextual dirs
- host()
- Env
- toolchains
- implicit docker & git auth access
- session attachables reachable from standard core API: Directory.export(), Container.terminal(), secret(), etc
Can you explain that? Like instead of withModule, something like withModuleChecks?
Here I think you mean: LLMs need to be severely restricted in what tests are actually run, because their devloop is so rapid & there can be so many running in parallel?
Dev: "with coding agents my token costs dwarf my regular compute costs"
Dagger: "challenge accepted"
it's:
- the 90% use case of exposing modules to agents is exposing checks, so instead of exposing functions directly we could just give the LLM a tool to run checks, just like `dagger checks`
- agents really strongly want to be able to re-run an individual test (not just an individual suite), which you can't currently do with `dagger checks` afaik. e.g. if they run the tests and see one fail, it really wants to rerun just that one, to the point where it may flail around if it doesn't have an easy way to do so
Super clear thanks
re: filtering. Yes one way or the other it must be exposed as a first class citizen. My "dynamic checks" PR had some issues as we know, I've been polishing a better proposal, still need to post it (last week didn't have much async time, catching up this week)
Note: I'm bundling filtering & test splitting into the same problem
90% checks: also clear. Makes sense to go in that direction yes. We're making a big bet on checks on the human-facing side. Would make sense to do the same on the LLM-facing side
But a pre-requisite would be to reconcile the Env model with the rest of the API, so they can evolve together
I'm looking for a shortcut to do that asap, to stop the drift
How do you feel about:
1. add `Env.withRootModule` (or some other equivalent name)
2. deprecate `Env.withModule`, replace with `Env.withToolchain` - which is exactly the same thing except for the namespacing (easy to add namespacing? or OK to leave the discrepancy in behavior in the short term?). Or, even simpler, just rely on actual toolchains, and move "toolchain assembly" to the actual parent module? So you only need a single `Env.withRootModule`?
3. this way I can refactor `Env.checks()` to just pass through to `Env.parentModule.checks()` which unbreaks it
1. What does that do? (edit: deduced later)
2. SGTM - namespacing is probably fine either way, easy to adjust later since the manner in which tools are exposed is plumbing internal to the LLM. For example we could probably name each tool `<toolchain_name>.<func>` just like you do in the CLI
3. Oh that explains 1 - it's the module toolchains are overlayed onto? Can it be an empty module? I ask because `Env.workspace` is ultimately what determines the context dir for everything including toolchains (so the root module isn't necessarily what provides the context dir)
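To make the namespacing trade-off concrete, here is a rough Python sketch, not Dagger code. `mount_flat` and `mount_namespaced` are hypothetical helpers: the first mimics mounting every module's functions directly in the root (the caller must avoid conflicts), the second mimics the `<toolchain_name>.<func>` naming mentioned above.

```python
from typing import Dict, List


def mount_flat(modules: Dict[str, List[str]]) -> Dict[str, str]:
    """Mount every function in one root namespace (hypothetical model).

    Returns a mapping of tool name -> providing module, and raises if two
    modules expose the same function name.
    """
    tools: Dict[str, str] = {}
    for module, functions in modules.items():
        for fn in functions:
            if fn in tools:
                raise ValueError(
                    f"conflict: {fn!r} provided by {tools[fn]!r} and {module!r}"
                )
            tools[fn] = module
    return tools


def mount_namespaced(modules: Dict[str, List[str]]) -> Dict[str, str]:
    """Prefix each function with its toolchain name, so names never collide."""
    return {
        f"{module}.{fn}": module
        for module, functions in modules.items()
        for fn in functions
    }


# Two toolchains that both expose a "test" function (made-up example data).
MODULES = {"jest": ["test"], "go": ["test", "lint"]}
namespaced = mount_namespaced(MODULES)
```

With this data, the flat mount fails on `test` while the namespaced mount yields `jest.test`, `go.test` and `go.lint`.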
Oh that explains 1 - it's the module toolchains are overlayed onto?
Yes.
Can it be an empty module?
Yes I don't see why not, any Module can be walked for checks after all.
I ask because Env.workspace is ultimately what determines the context dir for everything including toolchains (so the root module isn't necessarily what provides the context dir)
There's a longer conversation about how to converge a) how Env interacts with modules with b) how the engine itself interacts with modules, so that Env can become the authoritative virtualization layer that we want it to be...
But for the purpose of short-term unblocking: I guess the requirement is that dag.currentEnv().checks() returns the correct set of checks for the parent module, correctly wired to the same workspace as the functions calling dag.currentEnv() in the first place. I don't know enough about the plumbing of currentEnv(), but I would guess you've already had to deal with this situation, so that when a regular function calls dag.currentEnv().workspace(), it actually gets the same context the module was loaded from?
I guess dag.currentEnv() returns an Env with workspace set to nil, which disables the "virtualization" of workspace, and it passes through to the "real" context?
https://github.com/dagger/dagger/blob/412064595f695bed3d5b7e327f8b9c3167717e7d/core/schema/env.go#L143-L178
When an explicit env isn't set, currentEnv creates one on-the-fly with the current module installed and its context dir as the workspace
Conversely, when calling Env.checks() when workspace has been set to a virtualized Directory, then I would expect myVirtualizedEnv.checks() to return checks that, if called, executed in the context of that virtual workspace
OK, so that's what we want right?
calling a check from currentEnv().checks() will be indistinguishable from calling the same checks from the same module directly from the CLI? (sorry if that wasn't clear)
yeah, at least in principle - that fallback code I added was an educated guess tbh
makes sense from a design POV to keep them aligned that way
Re your other point: I think it would be super useful to have a LLM-optimized way of calling (and receiving feedback from) checks
There's an annoying amount of complexity buried in Env/LLM to support "live objects" which are objects that get re-instantiated with the latest Env.workspace as it changes over time. That was all to support dagger call test specific because Specific() lives on a DaggerTests object, which needs to be 'live' otherwise it'll run with stale code after changes are made. If we just worried about running checks that can maybe be cleaned up + simplified.
(Sorry this is getting more off topic, can prob put a pin in it at this point if there are other things to go over)
why does that object specifically need to be live more than other objects?
something specific to how our test function works?
Tricky to explain
- The model starts with a tool called `Test`, which gets called with the current env workspace `Workspace1`
- However, `DaggerDev.test()` returns an object, called `DaggerTest#1` to the LLM, which it then needs to call a `DaggerTest_specific` tool against
- The LLM calls `DaggerTest_specific(DaggerTest#1)`, and it fails (happy path, bad code)
- The LLM then makes edits, yielding `Workspace2`
- But then the LLM just calls `DaggerTest_specific(DaggerTest#1)` again, instead of starting 'from the top' by calling `Test` and getting a `DaggerTest#2`
So WITHOUT the fix it would just re-run the tests against Workspace1, and the LLM would think its edits weren't applied and go into a tailspin.
WITH the fix, DaggerTest#1 is actually re-instantiated on every use with the current Env workspace, just like calling dagger call test specific does (entire new CLI session, calls test().specific() instead of calling .specific() on an existing object).
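A toy Python model of the stale-vs-live behaviour described above. Every class here (`Workspace`, `Env`, `StaleTests`, `LiveTests`) is a hypothetical stand-in, not the engine's implementation: the stale object captures the workspace it was built from, and the live one re-reads the env's current workspace on each call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workspace:
    """Immutable snapshot of the source tree (hypothetical stand-in)."""
    name: str
    tests_pass: bool


class Env:
    """Holds the current workspace; an edit swaps in a new snapshot."""

    def __init__(self, workspace: Workspace):
        self.workspace = workspace


class StaleTests:
    """Without the fix: bound forever to the workspace it was created from."""

    def __init__(self, workspace: Workspace):
        self._workspace = workspace

    def specific(self) -> str:
        ws = self._workspace
        return f"{ws.name}: {'pass' if ws.tests_pass else 'fail'}"


class LiveTests:
    """With the fix: looks up the env's workspace again on every use."""

    def __init__(self, env: Env):
        self._env = env

    def specific(self) -> str:
        ws = self._env.workspace
        return f"{ws.name}: {'pass' if ws.tests_pass else 'fail'}"


env = Env(Workspace("Workspace1", tests_pass=False))
stale = StaleTests(env.workspace)  # plays the role of DaggerTest#1, no fix
live = LiveTests(env)              # plays the role of DaggerTest#1, with fix

first_stale, first_live = stale.specific(), live.specific()

# The LLM edits the code, which yields a new workspace snapshot.
env.workspace = Workspace("Workspace2", tests_pass=True)

second_stale, second_live = stale.specific(), live.specific()
```

After the edit, the stale handle still reports against `Workspace1`, which is the tailspin scenario; the live handle reports against `Workspace2`.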
So the problem is introduced by having a chained call from an intermediate object
ah I see. too much immutability for the LLM to handle
what do you mean by "the fix" in this context?
ah, it's the "annoying amount of complexity to support live objects" I guess
So are you thinking we would basically give up on "everything a LLM sees is objects and functions", and instead split LLM integration in 2 paths 1) a much simpler "checks API" for interacting with modules, and 2) higher-precision functions/objects system for individually injected bindings?
In other words, even though "modules are just a special kind of object", we would stop trying to get LLMs to understand that, and instead bifurcate the LLM interface in 2 distinct parts?
(speaking of bindings, that reminds me I never finished the env bindings API cleanup I had started with generic objects...)
I don't know what I'd prescribe in the long run since LLM isn't a key focus atm and it's hard to get a sense of how people are using it - just putting an option on the table (refocusing LLM/Env around toolchains) in case some of the baggage in LLM's interpretation of Env gets in the way of our other plans
Like, toolchains make a ton of sense for the coding agents LLM use case, but not as much sense for other use cases. I guess that's where the bifurcation would come from, but it's basically already here, just today it's 'Env workspace + withModule vs. withInputBinding'
yeah good point. it's already somewhat bifurcated. this would make the bifurcation even more pronounced (no object IDs exposed to the llm when calling checks, most likely)
First step: https://github.com/dagger/dagger/pull/11726
Toolchain-centric UX. With the arrival of toolchains, checks, user defaults... The "ideal" way of using Dagger is changing. But how exactly? There are still unknowns, and these unknowns are slowing down implementation. https://github.com/dagger/dagger/issues/11695#issuecomment-3757509323 . Note @burnt locust I am writing down my thoughts there.
@burnt locust I replied in that thread. Thanks for being patient.
Gaps in CI workflow. In theory, every dagger module now has a built-in CI workflow: dagger check. In practice, there is more to CI/CD than checks. The bare minimum is to publish what you checked. But how to coordinate all the possible permutations of "check" and "ship", without falling into the pseudo-code nightmare Dagger was meant to prevent? https://github.com/dagger/dagger/discussions/11653
And, a first pass at replying here: https://github.com/dagger/dagger/discussions/11653#discussioncomment-15564066
@burnt locust fun fact: we do have one small piece of persistent mutable context in the engine: the host filesync cache, namespaced by client id. I guess a persistent Env would absorb that
cc @weary timber
makes sense to me!
Mmmm @burnt locust I was thinking about "toolchains v2" (my nickname for the new API with explicit env). A few unanswered questions I have:
- We talked about toolchains requiring an explicit env... But what about the inverse? Will explicit env only be available to toolchains? And if not - how does a non-toolchain module get access to that env, without getting those pesky cache invalidation issues?
- Related to that: what happens to `currentEnv()`? Toolchains wouldn't need it anymore. But for non-toolchains, it would be the only way to access the env. How do we reconcile that? Do we keep both? But seems awkward
@burnt locust @languid plover quick sketch for environment management, just to start the conversation...
This is for managing environments outside the sandbox.
- `dagger env list`: list available environments for this project. There is always a default environment (called `default`).
- `dagger env create TEMPLATE [NAME]`: create a new environment from the given template. Env templates must be configured in `dagger.toml` (shooting from the hip here :man_shrugging:). Example env template names: `ci-pr`, `ci-main`, `prod`, `agent-dev`... etc. If a name is not specified, the env gets a cute memorable name like `agent-dev-cute-panda-42`
- `dagger env destroy NAME`
- `dagger --env NAME`: execute this command in the given env. Error if the env doesn't exist.
Example dagger.toml:
[templates.gha]
defaults.clientDockerConfig = "file://~/.docker/config.json"
defaults.publish.githubToken = "env://GITHUB_TOKEN"
defaults.go.lint.source = "/myapp"
ignore.go.lint.source = ["*", "!**/*.go", "**/broken/**"]
permissions.default = false
permissions.llmAuth = true
permissions.dockerAuth = true
permissions.gitAuth = false
scale-out = true
auto-update = true # just making shit up here
[templates.release.defaults]
npmToken = "op://ajeflnsdvlsd/npm/credential"
...
IMPORTANT NOTE: IMO matching git branches/tags to environments should be a CI-specific configuration.
In other words, CI is responsible for creating / destroying environments on git events. Which template to use for which kinds of environments could be configurable, either in the Dagger Cloud API, or in a special [ci] config above.
Something like:
[ci]
onSemverTag.template = "release"
onOtherTag.template = null # No environment created
onPullRequest.template = "ci-pr"
onTrunk.template = "ci-main"
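As a rough illustration of how a CI runner might consume a `[ci]` mapping like the one above, here is a Python sketch. The event names, the `classify` and `template_for` helpers, and the semver regex are all assumptions made up for the example; nothing here is an existing Dagger or Dagger Cloud API.

```python
import re
from typing import Optional

# Mirrors the [ci] table above; None means "no environment created".
CI_CONFIG = {
    "onSemverTag": "release",
    "onOtherTag": None,
    "onPullRequest": "ci-pr",
    "onTrunk": "ci-main",
}

# Loose semver tag matcher such as v1.2.3 or 1.2.3-rc.1 (assumed convention).
SEMVER_TAG = re.compile(r"^v?\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$")


def classify(event: str, ref: str, trunk: str = "main") -> Optional[str]:
    """Map a (hypothetical) git event to a key of the [ci] table."""
    if event == "pull_request":
        return "onPullRequest"
    if event == "tag":
        return "onSemverTag" if SEMVER_TAG.match(ref) else "onOtherTag"
    if event == "push" and ref == trunk:
        return "onTrunk"
    return None  # e.g. a push to a feature branch: nothing configured


def template_for(event: str, ref: str) -> Optional[str]:
    """Return the env template to instantiate, or None to skip."""
    key = classify(event, ref)
    return CI_CONFIG.get(key) if key is not None else None
```

The point of the indirection is that the conditional on git context lives in this one mapping, and everything downstream only sees the selected template.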
Pretty cool! Completely different from what I thought we were talking about, but cool. I thought we were solving for configuring projects within a git context. So multiple groups of sub paths, ignore lists, customizations, etc. I guess this is more about modifying behavior based on where/why dagger is being run
it's a variation that came to me after. it's still pretty close, main difference is there's an indirection between the git context and environment config
but I think I might be conflating two different things though: 1) interface between function sandbox and its surrounding execution context, and 2) stateful client environment persisted engine-side, manageable by outer clients
wdyt @burnt locust ?
having a hard time parsing what's what
- but i guess that goes for a lot of configs without the docs on hand
what's [templates.gha]? my brain is wedged on that since we just left GHA
and is ignore.<x>.<y>.<z> a namespace for setting ignore rules, with x being toolchain, y being func, and z being arg?
this replaces .env too i suppose? (defaults.*)
in any case this does feel better than the dagger.json + .env mix we have now, even if it's mostly vibes
one thing i'd say is it's kind of hard to tell what the x/y/z components are here, i wonder if an explicit per-toolchain or per-module scheme would be clearer, but i can see how that might get unwieldy if you have multiple templates
or maybe the different templates should be different files
[toolchains.go.defaults.lint]
source = "/myapp"
[toolchains.go.defaults.test]
packages = [ "./..." ]
- i'm trusting https://string.is/toml-formatter for the conventional TOML wisdom here, but also garbage-in garbage-out to some extent
what I originally put in was:
[toolchains.go]
defaults.lint.source = "/myapp"
defaults.test.packages = ["./..."]
what you have might be fine, i've honestly not used TOML much and don't have strong instincts there
Yeah those are really broad strokes in the specific config schema... It's basically:
- Trying a concept of "env template" to distinguish the source-controlled configuration of an environment, and the actual persistent mutable stateful environment
- Transposing user defaults from `.env`
- Transposing toolchain customizations from `dagger.json`
- Making up random configuration keys that seem vaguely plausible, to see what feels right in that config file
NOTE: the gha env template was supposed to replace our current .env.gha -> a configuration specifically for our Github Actions runner environment (the one being deprecated)
I'm confused about the templates. When an Env is created from a template, what does it represent? an instance of source + configuration? Where does it live?
See #1463332258453651569 message
I was picturing stateful client environments, persisted on the engine
Is this separate from replacing defaultPath + special toolchain constructors, or still solving that too?
I was focusing on defining a centralized place inside which we could figure that out - either by transposing customizations and user defaults as is, or I guess changing how they work? So I think, decoupled
got it, I was trying to connect the dots and they weren't connecting
@languid plover @burnt locust separate topic, less brain-melting hopefully. Tactical design issue for explicit env in toolchains...
What I was imagining, though I'm sure we're about to poke some holes in it, is that where most toolchains have a Directory field today that is constructed as a context directory, they'd now have an Env instead that contains source and other things. I don't think that would need to be restricted to toolchains
I think we can retire currentEnv. I only ever used it in Doug, and it seems clear at this point that doug should be a toolchain, so now it would just have it via the constructor. We would have to re-construct Doug every time the Env changes but that's already true for the source code input which is also in the constructor.
re: non-toolchain Env access - I feel pretty comfortable with not allowing non-toolchains to access Env. It feels like that's the "dividing line" to me between programmatic API use and CLI/dev toolchain use
Yes but what do we do about non-toolchain modules using +defaultPath?
hmm I guess I'd expect that to still work and just always be relative to the module source's context like it was before, but in both toolchain and non-toolchain case
the latter point being a way to keep the two concepts orthogonal
Makes sense, like we discussed
But how would we explain the relationship between Env and +defaultPath?
They are kind of related... But that relationship would not be materialized or acknowledged
(meaning, there would be an Env under the hood, but we wouldn't let you see it)
maybe not an issue in practice
I think we could explain +defaultPath without indicating that it's related to Env at all - like it's closer to an overrideable go:embed (Go specific there but you get the idea)
Yeah that makes sense. If our bet on toolchain-centric model turns out to be correct, the majority of new dagger functions will be developed in a toolchain anyway, and +defaultPath will become more of a niche power user feature. And hopefully the new standard way to get files from the project context, via Env, will be simpler and the community will like it. Then it will be the "come to jesus moment" you had talked about
Trying env configuration again, without the concept of persistent engine-side client environments (I will call those "sandboxes" and spin them off to a separate orthogonal design)
List env configurations: they are just files that you can use.
$ ls *.env.toml
gha-legacy.env.toml
shykes.env.toml
ci-main.env.toml
ci-pr.env.toml
release.env.toml
Apply an env configuration case by case
dagger --env=shykes.env.toml ...
Set the default env configuration for the current module
dagger env use gha-legacy.env.toml
(Presumably this would be persisted in dagger.json or a companion file.)
Env configuration example
(Trying @burnt locust 's version for size)
# ci-main.env.toml
[toolchains.go]
defaults.lint.source = "/myapp"
defaults.test.packages = ["./..."]
ignore.lint.source = ["*", "!**/*.go", "**/broken/**"]
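To show how an ignore list like `["*", "!**/*.go", "**/broken/**"]` could be evaluated, here is a small Python sketch. It assumes gitignore-like semantics (rules apply in order, the last matching rule wins, `!` re-includes, and a pattern without `/` matches at any depth). That is an assumption for illustration, not a description of Dagger's actual matcher, and `pattern_to_regex` and `is_ignored` are hypothetical helpers.

```python
import re
from typing import List


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified glob into a regex over '/'-separated paths."""
    if "/" not in pattern:
        pattern = "**/" + pattern  # bare names match at any depth (assumed)
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")        # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")     # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_ignored(path: str, rules: List[str]) -> bool:
    """Apply rules in order; the last rule that matches decides."""
    ignored = False
    for rule in rules:
        negate = rule.startswith("!")
        regex = pattern_to_regex(rule[1:] if negate else rule)
        if regex.fullmatch(path):
            ignored = not negate
    return ignored


RULES = ["*", "!**/*.go", "**/broken/**"]
```

Under these assumptions the rules read as "ignore everything, keep Go files, but drop anything under a `broken` directory".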
Includes / overlays
We'll probably need a basic include or overlay system, to reuse common configurations. The simpler the better.
# ci-main.env.toml
includes = ["base.env.toml"]
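One possible reading of "include or overlay", sketched in Python: included files are merged first, then the including file's own keys win. The merge rule (tables merge recursively, everything else is replaced), the cycle check, and the `deep_merge` and `resolve` helpers are assumptions for illustration; plain dicts stand in for already-parsed TOML files, and the `base.env.toml` contents are made up.

```python
from typing import Any, Dict, Tuple

Config = Dict[str, Any]


def deep_merge(base: Config, overlay: Config) -> Config:
    """Return base with overlay applied; nested tables merge key by key."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # scalars and arrays are replaced wholesale
    return merged


def resolve(name: str, files: Dict[str, Config], _seen: Tuple[str, ...] = ()) -> Config:
    """Resolve `includes` depth-first, then overlay the file's own keys."""
    if name in _seen:
        raise ValueError("include cycle: " + " -> ".join(_seen + (name,)))
    config = dict(files[name])
    includes = config.pop("includes", [])
    result: Config = {}
    for included in includes:
        result = deep_merge(result, resolve(included, files, _seen + (name,)))
    return deep_merge(result, config)


# Parsed stand-ins for two env config files (contents are hypothetical).
FILES = {
    "base.env.toml": {
        "toolchains": {"go": {"defaults": {
            "lint": {"source": "/myapp"},
            "test": {"packages": ["./..."]},
        }}},
    },
    "ci-main.env.toml": {
        "includes": ["base.env.toml"],
        "toolchains": {"go": {"defaults": {"test": {"packages": ["./cmd/..."]}}}},
    },
}

resolved = resolve("ci-main.env.toml", FILES)
```

Here `ci-main.env.toml` inherits the lint source from the base file and overrides only the test packages.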
This version is simpler:
- no "templates" since there is no more persistent engine-side state to worry about.
- one file per env, so the schema is simpler
Looks like a good start!
I proudly showed my first draft to @rugged zodiac. Mentioned that it includes dynamic filtering. He immediately asked - "so if it's dynamic, how will we introspect the filters for smart checks?". That reminded me of your suggestion @burnt locust to perhaps have an intermediary representation of those ignore paths as data, and allow for them to be generated, somehow. So I started looking in that direction... No slam dunk so far, but it seems doable.
@burnt locust when you were thinking of generating ignore paths, did you mean "generate at runtime so clients can introspect", or "generate at dev time to be checked in git"? or tbd?
the latter
Ok I've been (lightly) exploring the former
since they're unlikely to change often, feels like it matches the pace of all other generate-eable things and makes things more auditable
True, and dagger generate will make the ux easier to hook up
found my last message on the topic: #1455259924908216381 message
which i still think is interesting ("primary file extension" based configuration like %.go to minimize toil) - but still required a pragma of some kind
I'm thinking of generated from actual code
yeah - like parsing go.mod etc?
right. the question is when and what happens to the output - runtime or dev time
in this, the %.go is an alternative way of organizing the ignore rules, rather than function or arg based, to hopefully be as far-reaching as possible
I guess if we go for "dev time" we can layer convenience pragmas on top as needed
if I understand correctly, that's an additional step where we take some of those generator functions, and move them to builtins registered on standard file extensions?
the steps I took to get there were:
- how can we move ignore rules out of 10 pragmas in 10 functions? => move them out to a file, reference file from args
- but how do i tell other modules/toolchains i use about my ignore files? => customizations? name files after modules/functions/args??
- but customizations are not great UX, and naming after functions/args is finicky and probably means a lot of symlinks
- what if the toolchain args just configured what "primary file extensions" it cares about, which maps to an ignore file at a conventional path, and the toolchain can provide a generator to maintain it
basically trying to reduce the problem to 'most toolchains/functions/build tools/etc care about one primary file type, and usually there's just one correct set of files for anything looking for a particular file type'
so hopefully it'd work across multiple modules/toolchains without too much babysitting
plus some cleverness like having foo/bar/%.go use the same ignore filters but scope the results to foo/bar/ (if that ever comes up)
Quick sketch of a proposal I want to put together this afternoon:
- Toolchains receive an explicit `Workspace` constructor argument. Invalidates cache, allows dynamic file access.
- Fields can be of `WorkspaceDir` or `WorkspaceFile` type. This can be used for invalidation (include/exclude paths are tracked)
- If necessary for more aggressive caching on workspace paths: ignore paths for each type may be generated & stored in `dagger.lock` (general lockfile feature). This could be used for faster smart checks
- Bring back "dynamic checks" branch and adapt the plumbing to support "dynamic artifacts". NOTE: would be cool if we explicitly tied artifacts to underlying `WorkspaceDirs` (via fields). So you could reference dagger artifacts via the workspace paths they reference, somehow
- Assuming support for dynamic artifacts above: `check` & `ship` become methods of the artifact itself.
- Assuming the above: less pressure for a generalized declarative workflow dsl: it's implicit that before shipping an artifact you also check it. And if that artifact references other artifacts, well... gotta check & ship those too!
- Also want to try tying "ship" to an explicit registry publish, to take advantage of the obvious tie-in to publishing artifacts. Might want to rename to "publish" in the process
- Git filtering: want to tie that to workspace configuration. Apply workspace config based on git context. Then everything downstream of that is agnostic
- Test splitting & filtering: special annotation to declare a collection type wrapping a list of objects. The collection type can define methods that support parallel processing. Engine may call in split batches (test splitting). Client may filter the collection before calling (test filtering). Note: this may overlap with dynamic artifacts. Same collection type?
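A loose Python sketch of the "collection type" idea from the last bullet: one wrapper around a list of objects that the client can filter and the engine can split into batches. `Case`, `CaseCollection`, and the round-robin split strategy are all hypothetical; the proposal itself does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Case:
    """One item in the collection, e.g. a single test (hypothetical)."""
    name: str


class CaseCollection:
    """Wraps a list of objects and exposes batch-friendly methods."""

    def __init__(self, items: Iterable[Case]):
        self.items: List[Case] = list(items)

    def filter(self, predicate: Callable[[Case], bool]) -> "CaseCollection":
        """Client-side filtering: narrow the collection before calling."""
        return CaseCollection(c for c in self.items if predicate(c))

    def split(self, batches: int) -> List["CaseCollection"]:
        """Engine-side splitting: round-robin into at most `batches` groups."""
        groups = [self.items[i::batches] for i in range(batches)]
        return [CaseCollection(g) for g in groups if g]

    def run(self, runner: Callable[[Case], bool]) -> Dict[str, bool]:
        """Run every item; each batch could be dispatched in parallel."""
        return {c.name: runner(c) for c in self.items}


suite = CaseCollection(Case(n) for n in ["TestA", "TestB", "TestC", "TestD", "TestE"])
batches = suite.split(2)
only_c = suite.filter(lambda c: c.name == "TestC")
```

Re-running a single failed test then becomes a filter on the same collection rather than a separate API, which lines up with the earlier point about agents wanting to rerun exactly one test.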
So yeah... <@&946480760016207902>
Overall sounds like the direction we've been crawling towards. I have 1000 more questions though lol.
- `WorkspaceDir`/`WorkspaceFile` feels weird. Maybe it's ok if it's the only way. Can't we determine that any file/directory came from Workspace? Or we enforce that the field itself should just store `Workspace` instead of pulling stuff out in the constructor
- `ship` -> `publish` sounds like a pretty big reduction in scope, but what about the rest of the things you want to ship?
- git filtering: what would a workspace configuration do? Like produce specific artifacts based on the git event? Or git state? Like from a CI perspective it's really about whether to take an action because we pushed a tag, and not necessarily because we're on a tag.
`WorkspaceDir`/`WorkspaceFile` feels weird. Maybe it's ok if it's the only way. Can't we determine that any file/directory came from Workspace?
Yeah maybe. But it needs to be made introspectable somehow. Right now it seems like dark llb magic & possibly unreliable to determine that info from an arbitrary Directory
Or we enforce that the field itself should just store `Workspace` instead of pulling stuff out in the constructor
The problem with that, is that any function tainted by Workspace must be cache-invalidated. In this design I try to keep the taint limited to the constructor
ship -> publish sounds like a pretty big reduction in scope, but what about the rest of the things you want to ship?
Yeah I don't know. Holding on this one for now. Will depend on how the "artifact-aware publishing" idea shapes up
git filtering: what would a workspace configuration do?
This is roughly my strawman of ci-pr.env.toml. Just calling it "workspace config" instead of "env config" for now, for a clean slate disconnected from the baggage of Env. (we may still merge back into Env in the end, just easier this way for now).
So a workspace config is a centralized place for all the configuration that doesn't fit in code. User defaults, customizations, permissions, etc.
If you select workspace config based on git context (event or state, tbd), then you keep the conditional in only one place. Instead of, say annotating your functions one by one to say "do this only on PRs, do this only on tag" etc. It's more monolithic and I think simpler.
Like produce specific artifacts based on the git event? Or git state? Like from a CI perspective its really about whether to take an action because we pushed a tag, and not necessarily because we're on a tag.
Git state vs git event: good question.
I think that with this model, we could keep local execution agnostic to git context in general. Let's say you want to run a release locally. As a user, how do you tell dagger to run a "real" release with all credentials loaded, etc? And how to avoid you fat-fingering a release every other time?
In this model, you would explicitly select a workspace config that provides a release environment: dagger --workspace-config=./release.workspace.toml ship. Or perhaps dagger --env=workspace ship.
The release function itself might check the git state, for safety: make sure the head is a clean commit, mapped by a semver tag, perhaps check that that tag is not already released... etc. All custom logic that the toolchain dev is responsible for.
Then the only thing missing is for CI to map git events to workspace config. This is a CI-specific configuration. It would be something like: "on push to a PR, run in the ci-pr environment"; "on push to main, run in ci-main"; "on push to a new semver tag, run in release". This could be in a small Ci-specific config section in the repo; or it could be configured in Dagger Cloud
So maybe in CI, Dagger Cloud can first look at the workspace configs to decide which one it should use for the git context (or event). Or is that something that would be handled automatically whenever I run dagger check?
2 options:
- Dagger Cloud has its own per-repo configuration, where you select which workspace config to apply for which kind of git event
- The repo has a top-level config for this mapping (`dagger.ci.toml` or something), which Dagger Cloud reads directly
Either way it would apply to all dagger operations: call, check, ship...