#Workspaces aka "modules v2"

1 messages Β· Page 2 of 1

crisp dove
#

GitHub Apps authenticating with an installation access token use the installation's minimum rate limit of 5,000 requests per hour. If the installation is on a GitHub Enterprise Cloud organization, the installation has a rate limit of 15,000 requests per hour.

For installations that are not on a GitHub Enterprise Cloud organization, the rate limit for the installation will scale with the number of users and repositories. Installations that have more than 20 repositories receive another 50 requests per hour for each repository. Installations that are on an organization that have more than 20 users receive another 50 requests per hour for each user. The rate limit cannot increase beyond 12,500 requests per hour.
The rate limit seems to be per installation, not for the entire application. Which means it was the installation in your org that got rate limited thinkspin

#

Should we continue to hide this error to the user? Or should we create a load step that stays in pending until we can actually start? Or maybe mark it as failed with a clear message saying when it will be retried?

Most of Native CI is built in a way that if GitHub's API isn't working they should continue to work (if the Git operations API is down of course we are screwed). But we decided to treat rate limit errors differently, if we hit a rate limit we do not continue, we stop there and retry at the specified time.

brittle wharf
brittle wharf
#

Pushed a fix for TestModule/TestIgnore - the coremod path that constructs currentTypeDefs from GraphQL schema introspection wasn't respecting directives, so entrypoint functions promoted to Query didn't have them. I've been thinking lately that the fact that we have two competing schema representations is unfortunate, it would be nice if currentTypeDefs even in the module path worked purely from GraphQL introspection so we had a single source of truth

dim smelt
#

two distinct schema representations meaning: raw graphql & typedefs?

brittle wharf
#

Yeah - to be clear, it still makes sense for us to have a higher level API representation for modules and SDKs to deal with. I just don't like that sometimes during dev we have split behavior between the two, and since we already need GraphQL, it'd be nice if it was authoritative (also for Dang - currently some pragmas don't have directives defined for them, like +cache, so there's no way to express it until the directive is added)

dim smelt
brittle wharf
#

also happy to sync sometime, feeling pretty good about the new entrypoint proxy approach but would feel better if we picked it apart a bit

dim smelt
#

("rebasing" the english words πŸ™‚

brittle wharf
#

ah, well the last few commits fix those entrypoint edge cases, but curious about the UX tweak

#

I'm adding some more specific tests for it now, so we're not relying on tests that 'happened' to catch it

dim smelt
#

Well the remaining edge case is that your entrypoint module may or may not be reachable at its name, depending on whether it has an eponymous function

brittle wharf
#

ah true - like you can't explicitly call the constructor

dim smelt
#

I'm thinking it might be better to make it consistent, and never expose an entrypoint's constructor, so the rule is easier to memorize and less surprizing. But then, we'd need a UX device to set those arguments when you need to, without adding a hack that makes everything worse - that's where my proposal comes in

brittle wharf
#

yeah, was going to suggest just not having the constructor too. there's another test failing from that (for silly reasons - functions lists its constructor, which throws off whitespace in the table, fn - => fn -)

dim smelt
#

TLDR since we know config,toml is coming, with a nice user-friendly config UX that maps to constructor args (modules.foo.config.hello = "world"), it would make complete sense to have a CLI equivalent of that - just a way to set these nice key-value pairs in the CLI, overriding whatever is in the config file. From the user's point of view, it's just config either way. Not "arguments" to a "constructor", that's a module dev concern. I'm thinking we can kill 2 birds with 1 stone: ship an objectively useful feature that fits in the workspace design; and add a UX device for passing constructor args from the CLI without exposing the constructor function itself

#

Also - it would simplify CLI arg parsing a little bit, no ambiguous booleans to deal with, at least before the first subcommand πŸ™‚

#

Before:

dagger call --token=env://TOKEN --enable-cgo build --pkgs=./...
-- CONNECT
-- QUERY { mymod(token:"env://TOKEN", enableCgo: true) { build(pkgs:["./..."]) { ... } } }

After:

dagger call --config token=env://TOKEN --config enable-cgo=true build --pkgs=./...
-- CONNECT headers=extraModuleConfig:{"token":"env://TOKEN", "enableCgo":true}
-- QUERY { build(pkgs:["./..."]) { ... } }

Note that the extraModuleConfig values would be sent as a new connect param, like workspace, extraModules..

It's the exact same pattern as extraModules actually

dim smelt
#

@brittle wharf if you're on board with the idea, I'll try implementing it on top of current branch tip.

#

Or, if you prefer, I can let you cherry-pick the parts you like yourself

brittle wharf
dim smelt
#

getting a bit sick of entrypoints tbh

#

they better be useful, because they're consuming so much of our time..

#

eg. now I have to figure out how to reconcile "pass key-value config for the entrypoint" and also "pass key-value config for non-entrypoint modules"

#

might have to be 2 different connect params

brittle wharf
#

wouldn't be the first entrypoint feature that you regret πŸ˜‚

dim smelt
#

Can I run another idea by you? It was actually the very first thing I implemented, I threw it away but might be simpler in the end...

  • No "entrypoint module"
  • You can configure individual function aliases, eg. aliases.playground = ["dagger-dev", "playground"]
  • At migration, the legacy module is introspected, and aliases configured invididually

Tradeoff:

  • Slightly more verbose config...
  • ...but more fine-grained control over your config
  • Slightly more moving parts at migration...
  • ...but less moving parts at runtime
  • Less special cases to worry about (now the user's responsibility when configuring)
  • Less concepts to explain ("entrypoint module" vs "regular module")
brittle wharf
#

So would the GraphQL schema be the same as it is now, and it becomes purely a CLI concern? Almost like git aliases?

dim smelt
#

I was thinking we'd keep the hard-earned engine-side schema changes (seems to work nicely with your inner/outer architecture). Just the mechanism for configuring it would be more fine-grained

brittle wharf
#

ah ok. i interpreted aliases.playground = ["dagger-dev", "playground"] like it was just setting an argv expansion, but i guess you're mapping it to a graphql path? Would flags be supported?

dim smelt
#

yeah you would set entrypoint functions individually instead of module-wide

#

So in the example above:

dag.playground() would be aliased to dag.daggerDev().playground(). Exact same implementation of entrypoint functions as today. But you configure them one funciton at a time.

#

So for example, conflicts become the user's problem

#

The big win IMO is that we don't have to introduce a major concept of "entrypoint module" which happens to solve our immediate problem, but at the cost of extra UX moving parts that we may not need at all

#

The argument in a nutshell: function aliases are the smallest possible feature that actually solves our problem (backward compat + ability to customize your project for less keystrokes)

brittle wharf
#

I think I like it? Initial impression is pretty positive, aliases are super intuitive as a cross-cutting outer layer feature, it has more flexibility than entrypoint modules, the alternative I see is to define a Dang entrypoint module but even with Dang that's still a bit more plumbing than just throwing aliases in your config. For example you can just YOLO a bunch of aliases that touch a bunch of different modules, if you were to do that with an entrypoint module you'd need to maintain the dependencies etc

dim smelt
brittle wharf
#

easy enough UX too, something like dagger alias build -- dang build?

#

didn't we have something like that before...? feels familiar

dim smelt
#

It was in my very first iteration of the workspace proposal, if you've looked into that, maybe you've seen that design? I don't remember exactly its lifespan or exposure

#

Will draft a spec

brittle wharf
#

@dim smelt will the new approach mean we stop propagating constructor args to entrypoint functions? or will that all be the same?

dim smelt
#

I was thinking we could just say: "you can't pass flags in an alias. It's up to you to make sure no flags are needed. Use modules.foo.config as needed"

That way we can leave a lot of the existing code intact.

--> when explicitly calling a contructor (dagger call engine-dev playground) everything works the same. Constructor args are regular args.

--> when calling an alias, no special way to pass arguments to the "skipped" functions. It's the user's job to make sure those are not needed

We have a precedent with check, generate..

#

The only downside I see, is that it would break backwards compat when auto-migrating a legacy dagger.json I guess...

#

(very annoying detail)

#

@brittle wharf mmm what if an alias function accepted the flattened set of all the functions in the target path?

brittle wharf
brittle wharf
dim smelt
brittle wharf
#

hmm that'd be more prone to collisions with longer alias chains, hard to say what people will be doing years from now but i could see someone running into it

dim smelt
#

But tbh the only reason I care is to get backwards compat..

#

(specifically not breaking those existing dagger call --for-constructor foo --for-foo

brittle wharf
#

what if we keep going down the entrypoint proxy road, but reserve a toplevel field like new or init or with or [...] which takes the constructor args? and stop propagating constructor args to the entrypoints?

#

that way dagger call [...] takes the args from new/init/with, and for foo().foo() you get foo() at the toplevel and new/init/with() if you want to explicitly call the constructor

#

(new/init/with's return type would be Query, which is a legit thing you can do afaik)

dim smelt
#

So clients would have a special rule that they should follow init and expose it to the user as the top-level entrypoint function if it exists and returns Query?

#

it makes sense conceptually

brittle wharf
#

yeah, I think of it more like pushing more configuration into graphql, instead of bespoke client init headers

dim smelt
#

that part does make sense yeah

#

(more discoverable)

#

I like the flattening better though because it doesn't add a special dance that every client has to know

brittle wharf
#

i've been tempted to do that before for some types of global configuration

dim smelt
#

and, we could flatten in a way that avoids conflict. if engineDev and playground both have an argument token, prefix them with the function Name. playground(token:, engineDevToken: )

Still graphql, still discoverable (I agree that's a really nice benefit vs. opaque key-value config keys). But zero client magic which is also nice.

#

@brittle wharf looks like we now have 2 stacked decisions:

  1. Customization primitive

a. entrypoint module?
b. function aliases?

  1. How to expose hidden-path arguments

a. config/header overlay?
b. reserved init?
c. flattened alias args?

#

Maybe we can throw in a 3d decision to make things more fun? πŸ˜„

brittle wharf
#

now we need a whiteboard πŸ˜›

#

i want pro/cons columns

dim smelt
#

Actually not all permutations make sense, so it's really a total of 4 options:

  • Entrypoint module + config/header
  • Entrypoint module + reserved init
  • Entrypoint module + flattened alias args
  • Function aliases + flattened alias args
brittle wharf
#

@lofty shadow Do you forsee any caching gotchas with having a type Query { with(args...): Query! }?

lofty shadow
#

idk I guess it might be a corner case, cause right now Query isn't ever materialized as a result in the cache. As long as we don't need the result of Query.with(...) cached, then it probably works out

brittle wharf
#

That would still be accurate afaict, it's just these IDs would have a with call at their base that explicitly returns a Query thinkspin

brittle wharf
#

sounds like might be fine? padme_right

lofty shadow
lofty shadow
brittle wharf
#

wait, in other graphql servers, query is an actual field? i thought it was always just an optional parsing thing 🀯

brittle wharf
#

i am officially sniped

dim smelt
#

I don't think it's a field. It's the top-level type

#

But you can define your own fields that return that type

#

separately there's query / mutation / subscription which is.. an "operation"? (forget the exact term from the spec)

brittle wharf
#

and subscriptions

#

i think all of those are outside the query, like keywords

lofty shadow
dim smelt
#

not all DAGs have roots, so if the DAG has a root, that is literally a corner case πŸ˜›

brittle wharf
#

i think that's just how graphql schemas work; the schema itself defines the 'root object type', and it's almost always named 'Query'

dim smelt
#

yeah that was just an attempt at a joke πŸ™‚ I will cease and desist

brittle wharf
#

well, at least the cost of entry for trying these ideas is pretty low

#

but it depends on how we measure success πŸ˜›

#

like, init seems like it would be relatively easy, but partly because it's closer to what the client does today

#

(i think i prefer with since it's reconfiguring Query, like other with* APIs do)

dim smelt
#

Yeah... Honestly success for me is

  1. backwards compat for existing 'dagger call'

  2. smallest possible deviation from the target design to avoid bagage (namespaced modules installed in a workspace, thin client)

  3. cross as few one-way doors as possible

dim smelt
brittle wharf
#

true

dim smelt
#

(if one makes the CLI considerably fatter than the other, I could change my mind on that)

#

@brittle wharf yeah your with would make the client less fat than having it select and introspect an entrypoint module. So it would be an improvement

#

@brittle wharf if you're feeling it, let's do that

#

I'll let you make the call if that's OK, I feel like I'm spiralling on this thing, just want to move forward

brittle wharf
#

I'll give it ago and see how it feels πŸ‘

brittle wharf
#

hm, I think dagger call with --arg1 val1 would "just work" if we wanted to deprecate call flags to make the client even dumber thinkspin

dim smelt
brittle wharf
#

yeah, just saying it's an easy cleanup path to take later

dim smelt
#

I'll be honest from the end user's point of view, seeing that with in the CLI is super confusing

brittle wharf
#

is there a better name?

#

i guess at that point, the module name, if we don't want to claim a term. so back to square 1 in a way

dim smelt
brittle wharf
#

think i have a working prototype - pushing to get a ci run

#

targeted relevant tests pass, which never passed at the same time before

dim smelt
#

Nice. I was looking at removing the Workspace.defaultModule dead code, only used in dagger mcp, but that subcommand is a rabbit hole of its own...

brittle wharf
#

lol yeah

dim smelt
#

Going to take a stab at rebasing workspace onto workspace-plumbing

#

(not pushing, just experimenting)

brittle wharf
brittle wharf
dim smelt
brittle wharf
dim smelt
#

@brittle wharf I'm back in the fight πŸ˜„ How can I help?

brittle wharf
#

nice nice

#

things are looking pretty good

dim smelt
#

That tweet from some random startup shipping a test splitting / scale-out CLI really made me want to get moving again on dynamic artifacts / ship / test splitting&filtering

#

If there are no specific tasks for me, I'll try rebasing workspace and continuing dev there. Hopefully with minimal slop

brittle wharf
#

no specific tasks other than reviewing the changes i've been making, I suppose - the rebase might be more valuable, up to you

#

some interestings:

  • For entrypoint modules, we no longer expose the constructor under the module name - it's redundant with with and was breaking tests that expected functions to just list the entrypoints, so it's gone now, and you just pass flags to call like before. (Constructors for non-entrypoint modules still show up as expected.)
  • I had to make small changes to the Go and Python SDK for the new Query behavior. Previously they both special-cased Query and named it "Client" instead which led to weird stuff like ClientID, loadClientFromID etc. - now Query is just a regular type; the Go SDK Client struct just embeds it, and the Python SDK Client type subclasses it. Doesn't seem to have broken anything, thankfully.
dim smelt
#

@brittle wharf no regrets on with so far?

#

I'm going to spin out the lockfile plumbing, to chip away at the diff size in workspace

brittle wharf
# dim smelt <@108011715077091328> no regrets on `with` so far?

none yet - haven't found an achilles heel or anything, just a couple of tests doing weird things that happened to conflict in interesting but not really significant (practical) ways:

  • a Python test used with_ as a function name to test Python keyword collisions, I just changed it to use a different Python keyword
  • another test did a bare dagger call to test that the module was able to be constructed and asserted that it returned ModuleName@xxh3:... - but now that returns Query, so I updated the test. Briefly went down a rabbit hole of having the constructor return the module type, but that's a no-go, the whole point is to return Query since it has the other module constructors, etc.
dim smelt
#

What's the rule for clients? In addition to connecting, they must introspect for Query.with: Query and if it exists, they must expose its arguments as top-level flags to their users?

#

How does that work for SDKs?

#

Specifically generated client for eg. our test code (generated by introspecting the workspace)

brittle wharf
#

how do I try that out? looking for docs on standalone client generation

dim smelt
#

I think the usual dagger client...

#

I guess @coral kernel 's test-container branch would be the most advanced use of it

#

(used by our test code to orchestrate test dependencies)

coral kernel
#

yeah I think the docs for that are in my brain

dim smelt
#

Of course it's pre-workspace

#

But, it intentionally uses the top-level dagger.json, which is workspace-ish in design and intent

coral kernel
#

It does use the dependencies in the same way a module does though, so I think a client would still be defined with a dagger.json probably?

dim smelt
#

In our case it uses the toolchains - which will become modules.foo in workspace config

#

So what we'll need will be in the workspace config

coral kernel
#

I might be crossing contexts with what you're talking about for clients and the testcontainers work

#

but the generated client for the testcontainers PR uses dependencies, not toolchains

dim smelt
#

toolchains/engine-dev/dagger.json?

coral kernel
#

no, the root one

#

where the client is defined

dim smelt
#

I've been assuming all those generated clients would use toolchains in top-level dagger.json. For SDK tests it's how we'll solve the circular dependency between engine-dev and sdk modules

dim smelt
#

Ah you add them back as deps

#

Is that out of necessity because dagger client doesn't pick up toolchains?

coral kernel
#

yes but also the generated client doesn't necessarily need all the toolchains in its api. Same reason we separated toolchains and dependencies in the first place

#

thats why I was saying I'd expect the generated client to generate off of a dagger.json instead of a workspace

dim smelt
#

Wrong delineation. Yes a given client may not need all the toolchains. But you won't solve that by moving toolchains to deps. Then it just becomes "a given client may not need all the deps".

#

If dagger/dagger doesn't implement its own dagger functions, it won't be a module, and there won't be a dagger.json. No runtime dependencies since there will be no runtime

#

So for the use case of "my app's test code uses dagger to orchestrate dependencies", that will come from the workspace config

#

"My workspace has these modules installed, so my test code can call those modules to get things done"

#

--> and yes the test code may not need all the modules, we don't have a way to segment this further (except sub-workspaces I guess)

coral kernel
#

yeah makes sense to me. I currently think of it more like a custom module, which would have a dagger.json and dependencies

#

but when you put it that way it makes sense to tie it to the workspace instead

dim smelt
#

It does mean that there will be 2 kinds of codegen: from the modules in a workspace (like here), and from the dependencies in a module (when generating client for a module runtime). We haven't really dealt with this distinction yet, I think @topaz seal will be surfacing it soon πŸ™‚

#

As for how entrypoint modules affect this... πŸ€·β€β™‚οΈ

#

We could follow your suggestion @brittle wharf and ignore them completely in codegen

#

But, it's not really up to the client

brittle wharf
#

lol dangit, we already claim the word with in SDKs:

// With calls the provided function with current Query.
//
// This is useful for reusability and readability by not breaking the calling chain.
func (r *Query) With(f WithQueryFunc) *Query {
    return f(r)
}

@dim smelt if you disliked that name now's your chance

dim smelt
#

I didn't dislike it really. Made sense to me

brittle wharf
#

only issue is there's already SDK meaning for With specifically, at least in Go

dim smelt
#

init was a solid choice too

#

or config?

brittle wharf
#

yeah my only worry there is it sounding like workspace init or something

dim smelt
#

But what does Query.with mean today?

#

That's not a core API thing

#

Never heard of it

brittle wharf
#

it's the generic chaining helper that we add to all objects in codegen, so you can do foo.WithBar().With(myHelper()).WithBaz()

#

it actually didn't exist before, it exists now because now Query is ID-able like any other type

dim smelt
#

Oh, the Go SDK auto-adds a With to every ID-able type?

brittle wharf
#

yep

dim smelt
#

And it's not available in any released version of the SDK?

#

In that case, the answer is obvious -> break

#

Since we're not actually breaking anyone

brittle wharf
#

yep. i could just special case it

dim smelt
#

Even auto-adding With to every type seems excessive but anyway

#

Query is definitely a special case

brittle wharf
#

@dim smelt what do we expect for dagger query? currently a ton of tests are failing from my removing the outer constructor named after the module, since a bunch of tests depend on it, like this one: https://dagger.cloud/dagger/traces/a1dee328379034164adb5fc7644136f9?span=90b2981fe338b6d9
should it remain as {modName{foo}} for compatibility, or change to {foo} for alignment with the new model?

for compatibility, I suppose we could make it run with how a module would see the schema (no entrypoint)

dim smelt
brittle wharf
#

yup. prior to the HEAD commit, the first example still worked, because we were additionally registering the constructor like before.

dim smelt
#

Ah I see

#

query is pretty niche no? We could conceivably break it. Maybe with a --no-entrypoint flag as an escape hatch?

brittle wharf
#

yeah probably

dim smelt
#

@brittle wharf we may not even need a special flag

#

One follow-up UX feature I was planning, was to add support for -m [NAME=]ADDRESS -> load a module under the given name. When use that form, you just load that module in the standard namespaced way - you don't make it an entrypoint. So, this will work:

dagger -m modName=modAddress <<.
{
 modName {
  foo
 }
}
.
#

We don't even have to ship it now. We can wait until someone complains πŸ™‚

#

One less special case

#

also planned: support for multiple invocations of -m πŸ˜› (makes more sense with the namespaced form)

dim smelt
#

@brittle wharf one nice side effect of the workspace model, I think: engine becomes easier to scale. Because the engine gets cleaner metadata straight from the client connect params.

brittle wharf
#

pushing a batch dagger query test update + rebase on main

dim smelt
#

My plan for the day:

  1. Complete spin-out of lockfile plumbing from workspace, open new PR on top of workspace-plumbing
  2. Push my rebase of workspace on top of workspace-plumbing + lockfile, make it as clean as possible, then call for a new wave of testing. FYI @compact monolith
  3. (stretch) first pass at dynamic artifacts & provenance
dim smelt
#

Still polishing, but worth taking a look

brittle wharf
#

the failing checks fit in the github pane now πŸ₯Ή (the 2 pending below are DCO and netlify)

dim smelt
#

Working on the linting errors (harness trouble, of the dagger kind)

brittle wharf
#

sorry, missed your message

dim smelt
#

that works too 😁 thanks

#

@brittle wharf it's possible that the agent clobbered your commits when force-pushing its own fixes... 😬

#

(wasn't at my computer earlier)

brittle wharf
#

seems fine πŸ‘

dim smelt
brittle wharf
#

oh, workspace-plumbing πŸ˜›

dim smelt
#

Ah. nice.

dim smelt
#

I cleaned up the lockfile branch, it should have on failing checks except those of workspace-plumbing (which is the upstream branch). Also cleaned up the history.

It's ready for review

brittle wharf
#

@dim smelt is it expected that dagger functions in an empty directory (no workspace, no modules) simply returns empty now? there's a test for it, so just deciding which side is wrong πŸ™‚ - it used to expect module not found but I suppose now it's just an implicit empty workspace?

dim smelt
brittle wharf
#

sounds fine to me

#

maybe it should print something like No functions found. to be more explicit?

dim smelt
#

As long as it's on stderr

#

Update:

  • Rebased lockfile on latest workspace-plumbing, ready for review
  • Attempting a rebase (or really, a logical replay) of workspace from lockfile, wish me luck!
dim smelt
#

Update: I managed to move more logic into lockfile, so should be able to shrink workspace that much more

#

(the trick was to persist moduleSource() operations in the lockfile. Those already exist in main, so it's not workspace-specific. But workspace branch makes heavy use of it because of .dagger/config.toml - it just becomes a big consumer of an existing low-level lockfile lookup, alongside any other client calling dag.moduleSource())

dim smelt
#

workspace is rebased on lockfile πŸ™‚ Now 19 commits and relatively clean

#

Tomorrow I can finally get back to actual feature dev πŸ™‚ dynamic artifacts first

dim smelt
#

@brittle wharf should I execute a cascading rebase from your latest commits, or wait a little if more are coming up shortly? (trying to keep the whole stack as up-to-date as i can)

#

btw did you notice, the diff on workspace is now smaller than the one on workspace-plumbing πŸ™‚

#

lockfile is the smallest, but not insignificant

brittle wharf
dim smelt
#

Roger

#

Note, it's pretty cheap to run it (at least in keystrokes and brain cycles, not sure about tokens)

dim smelt
#

@brittle wharf should I run the cascade?

brittle wharf
#

fixing TestShell now, and whatever weird failure is going on in test-base, and then we should be all green

#

the dependency issue is resolved, and addresses something that did feel off to me before (fixing TestInterface by serving transitive dependencies), so feeling better about that too

#

the fix there was to only do that for entrypoint modules, which keeps dagger call get-duck quack working the same way it did before (CLI used to set includeDependencies: true) without polluting dependencies between modules

#

(just context dumping, seemed worth noting down)

dim smelt
brittle wharf
#

correct

dim smelt
#

Clears the way for maybe allowing modules to return types from other modules?

#

(as a follow-up I mean)

#

Since it's served anyway...

dim smelt
brittle wharf
#

i think we'd still need to sort out some form of isolating schemas from one another, like proxying between servers or something, otherwise you can have weird incompatibilities

#

might be not that hard to figure out though

#

what got me down this path was ci:bootstrap - it exposed this because the workspace had go as a toolchain, and engine-dev as a toolchain, but engine-dev also used go as a module, so these two stepped on each other's toes:

  • go
  • engine-dev > go
    specifically, the workspace was setting customizations (ignore paths for go:lint), but then engine-dev's dependency just clobbered it in the module set
brittle wharf
#

@dim smelt quick question: if my module has a go function, and i have a go dependency, which should be exposed as Query.go?
currently: dependency wins, but TestShell wants the module function to win

dim smelt
brittle wharf
#

client (shell, call, functions, etc)

#

i.e. when entrypoints are enabled

#

earlier we defined the precedence as core > dependencies > modfuncs but that was before we added the proxy layer, so flipping it might be easy

dim smelt
#

I don't think we want to serve dependencies at all to the client (except for the dependency types)

brittle wharf
#

by that i mean the toolchain pattern

dim smelt
#

I don't think we ever did except for shell -> but now we have modulesV2 which makes that concept obsolete

#

oooh. sorry

brittle wharf
#

yeah sorry switching topics πŸ˜›

dim smelt
#

Ok so:

  1. Workspace has a module called go
  2. That module has a function also called go
  3. And that module is set as entrypoint

dagger call go -> what gets called? That one?

#

Or, do you mean, installed module vs. CWD module?

#

cascade underway

brittle wharf
dim smelt
#

Ok and by "your module" you mean "there happens to be a dagger.json in the workdir, because you're actually developing a module" -> that kind of scenario

brittle wharf
#

sorry, i mean your workspace entrypoint module

dim smelt
#

AH OK

[modules.go]

source = "..."


[modules.main]

entrypoint = true
# The main module happens to have a function "go"
brittle wharf
dim smelt
#

I guess entrypoint wins? Seems like the most entrypoint-ey behavior? wdyt

#

Or, we detect it and return an error

brittle wharf
#

hmm there's an extra wrinkle: shell has entrypoint go , but lets you do .deps | go to explicitly call the dependency

#

and another wrinkle: currently, since the dependency wins, you have no way of calling your entrypoint's clobbered function, since even calling the with constructor just returns Query. maybe that shouldn't be with: Query!, it should be new: EntrypointType!, and always called by the client, instead of being conditionally present

#

then, in the shell - dependency wins, and you call . | go to disambiguate in the other direction instead

#

flipping that approach to make the entrypoint win, and support .deps | go, i think we'd need to reserve a Query.deps or something notsureif

dim smelt
#

cascadedone

#

We can adjust the shell builtins if needed. We could kill .deps

#

The shelll model predates toolchains, modulesv2..

#

And to be clear, shell means "deps" in a different, pre-workspace way

brittle wharf
dim smelt
#

ie dagger -m foo -c bar is not consistent with dagger -m foo call bar

#

Actually let's not use the word "dependency" to mean "a module installed in the workspace" it's too ambiguous at least in this context

brittle wharf
#

sounds good, i agree, it's not like scripts are using .deps either

dim smelt
#

So we can probably simplify the shell:

  • Commands resolve directly to the served schema (no shell-specific stacking of module functions, module dependencies (not relevant aymore ayway), cor efunctions)
  • Deprecate .deps
  • Other?
brittle wharf
#

yup yup

#

i'll check if hack/build still works after these changes

#

not sure how widespread dagger shell scripts are or if it's just us πŸ˜›

dim smelt
#

With all the churn, I forget if we currently have a client param for toggling serving of core constrcuctors in the schema?

#

I guess we must, to prevent dagger call container from calling core container() ?

brittle wharf
dim smelt
#

Good morning. I'm attacking the Artifacts design in earnest. First, transposing the "part 3" design doc onto the reality of today's implementation... A lot has changed. Now wrangling the details to see how realistic our assumptions were

#

@brittle wharf one issue is the reliability of the provenance tracking - recording workspace API calls and attaching on each object the paths that were involved in its creation. Then presenting that to the user.

--> That's much less useful if we allow workspace injection on the "leaf" (verb functions).

So I'm considering restricting workspace injection so that it's possible everywhere except verbs

#

(to gently corral module developers into materializing more, without making it too inconvenient)

#

The other option is to only allow it on constructors, but looking at my own dogfooding - it would be a PITA

brittle wharf
#

hmm, that reveals a tension - putting Workspace on the verbs rather than the constructor is what makes coding agents much easier to implement

dim smelt
#

of course 🫠

#

Can you give an example?

#

The downside of allowing it, is that modules that rely on it, will not benefit from the artifact provenance, and the UX as we sketched it relies heavily on that being accurate (so that eg. users can make sense of the artifacts based on where in the workspace they're from)

brittle wharf
#

yeah - it's the whole complexity with "live objects" - LLM used to store IDs that tracked where defaultPath was used, added a special reload: true field to the ID protobuf, and had Dagql re-load those arguments on use, so they won't get stuck with old data

#

(and by 'used to' i mean that's what happens on main and i ripped out on llm-workspace)

dim smelt
#

Oh wait - I'm also proposing that we allow keeping Workspace as a field. Does that help?

brittle wharf
dim smelt
#

Yeah

brittle wharf
#

shipit

dim smelt
#

It would taint invalidation + also would track it as having workspace-wide provenance (so it would match all filters)

brittle wharf
#

honestly either way i'd do what maximizes what's right for the non-LLM use case. i'll adjust as needed

dim smelt
#

Yeah but it's not necessarily just LLMs - for example my docusaurus module needs live runtime access to the workspace itself at runtime (currently via nodejs fs hook, but hopefully soon with Container.withMountedWorkspace 🀞 ). So in that case it's not an issue of module design, I cannot materialize ahead of time. Maybe the same problem as you?

brittle wharf
#

allowing storing Workspace seems cool, i already basically have to do that for the LLM use case, but it's stored inside LLM where you won't get in trouble for it πŸ˜›

dim smelt
#

(note to self find a gif for that)

brittle wharf
#

@lofty shadow is this in the category if things that are fixed by theseus? just double checking I haven't regressed anything. At one point I did lose that fix for caching-the-dagql-cache but that's back now. https://dagger.cloud/dagger/traces/a7950bb2d8ab1d311fa976ce992ad36e?span=07d25d9b8e70298e
tl;dr: Error: failed to generate code: failed to mount directory: failed to get other directory ref: no active sessions [traceparent:a7950bb2d8ab1d311fa976ce992ad36e-befdf518347422a1]

lofty shadow
brittle wharf
#

@dim smelt should dagger.Connect(ctx) skip loading workspace modules by default? thinkspin - that was the root cause of test-split:test-base failing - some introspection-related tests under cmd/codegen/... didn't expect modules to be loaded, and was missing the code for said module dependencies, which led to a cryptic failure

#

i added an option for it for now - dagger.Connect(ctx, dagger.WithSkipWorkspaceModules())
but alternatively we could have that be the default for backwards compat, and only load workspace modules if you explicitly pass dagger.WithWorkdir(path)?

dim smelt
brittle wharf
#

yep

dim smelt
#

I see. Yeah i understand, previously loading modules was 100% cli side

brittle wharf
#

yeah, my instinct is to not load by default - all current call sites are probably not expecting that to do anything related to the workdir

#

(including native ci for example)

#

speaking only to dagger.io/dagger, not client-gen, that might be different

dim smelt
#

so it's not just about loading modules from the workspace, but even what workspace to load in the first place?

#

see: "workspace identity and access control" there was an elaborate and painful design dance there. but I didn't consider "native apps" using only the pure client library

brittle wharf
#

yup - at some point we'd expect native ci to load a remote workspace right?

dim smelt
#

also good for @coral kernel 's use case - generated clients in test code. we want to load the current workspace & its modules there

#

so I guess we should first figure out: what's the intended use case behind the failing tests?

#

I think we might be good with current defaults actually - unless i'm missing a use case

brittle wharf
#

these tests are very unit-y and have nothing to do with modules, just using the core API as its test specimen for codegen logic

dim smelt
#

mm yeah so it's the equivalent of "dagger call -M just to be safe" but with the library now

#

what to do what to do

#

I guess it wouldn't be a problem if core didn't get shadowed.. but reverting that would create even more problems

#

also: if core were namespaced this would be a non-issue 😭

dim smelt
brittle wharf
#

well - in this case it's not that core gets shadowed, it's that the test runs with dependency module code excluded, and it blows up trying to load/introspect the workspace

dim smelt
#

so should we switch the param to loadWorkspaceModules: bool ?

brittle wharf
#

and flip the default?

#

how does remote workspace loading work anyway?

dim smelt
brittle wharf
#

ah i see

dim smelt
#

orthogonal to module loading though

#

so we can probably leave that on

brittle wharf
#

what if it was dagger.WithWorkspace(ref)?

#

too far?

dim smelt
#

well that's the idea no? doesn't each client param get a dagger.With... connect opt?

brittle wharf
#

no i mean - if unset, no workspace/module loading, if set, we pass it through explicitly. so you explicitly do WithWorkspace(".")

dim smelt
#

land mine

brittle wharf
#

but not sure if that goes against a 'there's always a workspace' mantra

dim smelt
#

rubiks cube

brittle wharf
#

backflip desk chair

dim smelt
#

pick your analogy 😁

#

it's not a "no", more of a "oh boy"

#

at the moment:

  • Client may specify a workspace (local to them, or remote)

  • If they don't, the engine picks a default

  • For mon-module clients, the default is .

  • For module clients, the default is to inherit the caller's workspace

brittle wharf
#

what led me to that is thinking from the extreme angle where the client wants anything it runs to never have any access to the workdir, because it's just a lowly daemon, and that might (at worst) be a security issue

dim smelt
#

but the client is never given access to files it can't already access (unless they're a module)

brittle wharf
#

in that case we could maybe have there still be a workspace, but it's a virtual, empty one? (followup to prev msg, not reply to yours)

dim smelt
#

giving an empty workspace is definitely possible but we would need a clear rule for when you get that, and it would have to fit in the overall model

#

in other words - what's the use case

#

(I've explored that idea before so agree it's a plausible tool to use if we need it)

brittle wharf
dim smelt
#

Is it material to what we're testing, or an accident of the test harness

brittle wharf
#

those aren't tests - they're all the dagger DX commands for module/workspace management

#

the only thing affected by this at all is one fringe codegen test that did a dagger.Connect, that happened to be beneath a module, but didn't want it loaded

dim smelt
#

Oh! Mmm which commands are those πŸ€”

#

It's to save loading time I guess

#

which makes me think: probably module loading should, and could, be way lazier?

brittle wharf
#

it's also so commands don't fail if you have broken modules, so you can get out of the broken state

dim smelt
#

Ideally a client should not have to skip loading modules for performance reasons, at all

dim smelt
#

Like a "safe mode"

dim smelt
brittle wharf
brittle wharf
dim smelt
#

Small note to self: open a PR to change those connect params to the less weird dagger.ConnectFooOpt maybe?

dim smelt
#

mmm maybe taking it too far

#

technically there could be a shim Query: foo module constructor simply by reading the module name from the toml. Then when the client calls that function - only then does it actually load the module

#

But, what type would it return πŸ€·β€β™‚οΈ doesn't work

brittle wharf
#

yeah, the end goal is to have the schema fully realized by the time you're connected, so it's kind of foundational

dim smelt
#

Still on the side of keeping the current default, I think

#

But it boils down to the primary "happy path" for someone importing the library

#

--> If the primary use case is orchestrating your test dependencies from test code (testcontainers pattern) then better to load by default

brittle wharf
#

one distinction: in the testcontainers/client-gen use case, your codegen'd SDK will be generated against your modules. in the dagger.io/dagger case, it won't, but you could still do non-codegen things like getting the current workspace's checks (though, i'm not entirely sure how that would work: we talked about hiding currentWorkspace)

dim smelt
#

Yeah accessing the current workspace from an external client is a gap. I needed it for my docusaurus live "JIT workspace" hack. In that case the orchestrating dagger function gets the workspace by argument injection, and passes it to the helper tool by writing its ID in an env var 😬

#

But, in that particular case once I get Container.withMountedWorkspace I won't need the hack anymore

#

Maybe we just say - workspace access is for modules only?

#

But perhaps feels like an arbitrary limitation

#

If I want to write a little go utility that connects to Dagger and lists the current workspace's checks or whatever - I would be frustrated that I can't, for a weird internal reason I don't understand

#

@lofty shadow do you still feel strongly that currentWorkspace() should be deprecated?

brittle wharf
#

have we talked about a workspace(ref): Workspace! API? seems like that's what would be natural for native CI

#

moduleSource(ref).asModule.checks -> workspace(ref).checks

lofty shadow
dim smelt
lofty shadow
#

Technically, my concern mostly only applies to functions making that call. They should just be accepting a workspace argument instead.

When it comes to external clients who are not running inside a cached function call, I don't care and it doesn't matter.

dim smelt
#

Mmm I see, so kind of like Host in that sense? Could be the special case that we hide from modules?

lofty shadow
dim smelt
#

But would it "make sense" in the overall design πŸ€” hyperthinkspin

#

ETOOMANYDIMENSIONS

lofty shadow
#

If it's really a thorn, I can see the world in which we allow it in function calls, it'd be something like tracking whether the function call made a currentWorkspace call and then treating it the same as if it had accepted the workspace as an argument in terms of caching (trickier because you have to make that decision later after the call already started, but possible in the new caching system).

It's just hard to explain that to users, especially if it's like indirect like a util function that was calling it or a case where it's called only conditionally, etc. If the only way for a function to accept a workspace was an arg, it just seems simpler and cleaner for everyone

dim smelt
#

Would be kind of hilarious if we didn't let an external client query it, but let an external client register inline dang functions that receive a workspace, and invoke that πŸ˜›

#

Or, some other way to smoothly bridge "external client" and "module"

#

maybe related to the big changes underway in codegen, SDK/runtime split?

#

If we could somehow frame "write a small go utility that calls dagger.io/dagger" as a special case of "write a dagger module that doesn't do much" it could be an interesting continuity

#

lots of hand waving going on there

compact monolith
dim smelt
#

I mean, if:

  • Developing a dagger module in Go feels very go-native now, can co-exist easily with native go codebase
  • Modules are narrowly defined as "a software bundle to extend what the dagger engine can do with a workspace"

Then it could make sense to say "if you want your native Go program to interact with a workspace, just embed a little dagger module in your project, load it from your program, then call its functions"

lofty shadow
lofty shadow
#

Don't think it's worth overcomplicating

dim smelt
dim smelt
compact monolith
#

When I read through the markdown design docs I was under the impression accessing a workspace would only be from an arg and thought nothing of it, seems sensible

dim smelt
#

(they can't be called, so there's no argument for them to receive)

compact monolith
#

Ah understood.

dim smelt
#

It would be cool if clients could register inline modules πŸ˜› But I will drop that

brittle wharf
dim smelt
#

Coincidentally, one open question in #1486795473908924547 is whether we should expose all discoverable objects as artifacts, or only objects with "verbs"

dim smelt
#

Yeah at the moment in my proposed change, you could keep that workspace arg

#

Mmmm maybe I should expose workspace provenance at the function level, rather than object level?

#

Or, perhaps both

#
  • One is "show me artifacts connected to this path, or invalidated by this diff" (risk of missing some, if your objects are too lazy)

  • The other is "show me checks, or other functions, that are connected to this path, or invalidated by this diff" (no risk of missing. a function that takes a workspace argument or has a receiver that has a workspace field, would always match)

brittle wharf
#

@dim smelt I'm working through the shell tests, there's currently .stdlib and .core builtins - nuke those too? they're impossible to support with the new model, you just get Query

#

supporting all of these is sort of like moving deck chairs on the titanic if we expect dang to replace it.

but, also, maybe this is a cue to add a core { ... } getter in entrypoint mode, instead of having core be inlined into the initial Query

dim smelt
#

Mmmm... that makes the shell actually less useful but I understand why - client-side entrypoint can change query to query. But engine-side, you're stuck with the same schema for the whole session

#

but, also, maybe this is a cue to add a core { ... } getter in entrypoint mode, instead of having core be inlined into the initial Query

What do you mean by "in entrypoint mode"?

brittle wharf
#

the client POV (call, shell), as opposed to module POV

#

(whether dang replaces shell is lateral tbh, we'd likely have the same needs)

dim smelt
brittle wharf
dim smelt
#

Ah I see. But if we want a top-level core getter, why not always have it? Then we would have the engine-side equivalent of .core, available to all clients

#

Like a builtin core module

brittle wharf
#

i agree, but fear for scope creep this close to merge - changing one side is much more malleable

dim smelt
#

Ah I see. If we're talking "merge fast, finish later" then I say break .core in shell now, unbreak it in a follow-up PR that adds toplevel core

#

(when we get around to it)

brittle wharf
#

sgtm

#

in a similar vein, when you run .help now you'll get a faceful of load-foo-from-ids - i'm thinking that's fine for now, the fix is to add a single ID loader (in my other PR)

dim smelt
#

We could just hardcode a filter in the .help handler for now?

#

(or just leave it)

brittle wharf
#

eh sure i can add a filter πŸ‘

brittle wharf
dim smelt
#

You used the list from shell's existing "show from core" list?

#

If you're not, don't forget to remove it, it might be dead code now?

brittle wharf
#

removed that - this is just with the load-X-from-id filtering

dim smelt
#

oh yeah this is fine

brittle wharf
#

rebased to fix DCO, everything should be green now. go:lint and go:check-tidy have been getting stuck and retried in CI, not sure why. they pass locally

dim smelt
#

amazing...

#

thank you!

#

will cascade tomorrow morning

brittle wharf
#

sweet. doing a self review now for any low hanging fruit (+ erik's comments)

brittle wharf
#

Did some substantial cleanups, and also just realized the newly added HideCoreAPI client param is a glorified currentTypeDefs(hideCoreAPI: true) that just synthetically strips out non-module fields on Query so investigating if there's anything better to do there.

The challenge is we can't literally hide the core API because the CLI bootstraps by calling currentTypeDefs. So we at least can't hide that, but then if we remove client-side filtering current-type-defs would show up in dagger functions. πŸ˜…

dim smelt
brittle wharf
#

well - all the load-foo-from-ids too, and with (since that gets exposed in a different way)

dim smelt
dim smelt
dim smelt
#

@brittle wharf where did you land on skipWorkspaceModules?

brittle wharf
# dim smelt oh right. By "single" I meant "dozens"

i'm exploring both paths:

  1. (preferable) seeing how far HideCoreAPI can go such that currentTypeDefs doesn't have to "lie" about the schema
  2. screw it, embrace the cold void of client-side filtering since we need some of it regardless, punt until we do core { ... } or something
brittle wharf
brittle wharf
#

ah, CLI also needs Query.address for flag parsing

dim smelt
#

@brittle wharf remind me what's the context for your current filtering streak?

#

specific to dagger shell?

brittle wharf
#

no, this is more around functions and call since shell actually does want the core API exposed

dim smelt
#

ah, functions shows too much?

brittle wharf
#

nothing is broken at the moment (besides functions listing with), but the way we achieve it is pretty goofy. The client sets HideCoreAPI for functions and call but this actually has no influence on the schema unlike other client params; the net effect is literally just that currentTypeDefs does a shakedown on the Query type it returns so it only includes module-provided fields

dim smelt
#

ah I see. so it's all still there in the schema, but the schema introspection function hides it

brittle wharf
#

yup

dim smelt
#

but wouldn't adding more client-side special case filtering make it more goofy, not less?

brittle wharf
#

i don't think so. what's goofy to me is setting a client param as if it affects the schema, and then having introspection APIs lie about the schema, while the CLI still happily makes use of schemas that currentTypeDefs says aren't there.

three options to resolve that goofiness:

  • make the client param real, so the schema is ACTUALLY filtered down (pulling on this thread now)
  • get rid of the client param, and just move it to a currentTypeDefs arg - at least then no one's lying, it's just a different preferred view of the schema
  • client-side filtering: since we're already inconsistent between shell and the rest, frame it as a UX concern of call and functions; the filtering logic is dead simple at this layer, just omit functions whose source module is ""
dim smelt
#

OK the only thing I would watch is the slippery slope of "well we already client-filter there, what's the harm in also client-filtering here?" - compounded by agents

brittle wharf
#

yeah. i mean, it depends what layer decides that any given thing is being filtered; there's pretty clearly a gradient beyond the client/server layer. some things we hide from the user, but the client needs them under the hood. as long as that's the case, there will necessarily be client-side filtering (or at least the decision to filter - arg to currentTypeDefs is the middle ground but even that doesn't buy much if the client-side STILL has to filter after that, for e.g. with)

dim smelt
#

But hiding it from introspection and allowing a client to call it directly - kind of like hiding a CLI subcommand - actually does the trick pretty neatly. I personally would have just left that alone, knowing we're going to scope core etc soon anyway

#

Anyway I can live with all these options

#

But have a feeling we don't agree on the direction to aim for zero client-side filtering over time. I think it's 100% achievable and good

brittle wharf
# dim smelt But have a feeling we don't agree on the direction to aim for zero client-side f...

It's a little frustrating to hear this after bending over backwards to support it for the past few weeks, and saying that it's preferable and in progress above. πŸ˜…

It's easy to say it's 100% achievable and good but it's another thing to actually commit to its implementation. It's a noble goal that no one would disagree with in a vacuum. When it starts to feel like it's coming at the expense of internal API consistency and by letting the client "cheat," that's where I start to second-guess where the boundaries really are. That doesn't mean I disagree with the goal. I just don't want to lie to ourselves to achieve it.

There are some things that make me second-guess doing this right now, but they're practical/tactical concerns, not doubting the spirit of the goal. I see a risk that we're over-indexing on a single client and trying to push its inconsistent desires down into the engine to tell ourselves it's simpler, without validation that all other clients will want those same things, and without ever being able to eliminate all forms of filtering client-side anyway, because not every single UX concern can be reduced to "does the client see it?" - there's a client and a user, and they want different things.

The tensions I see are:

  • currentTypeDefs is ostensibly just a Dagger API for introspecting the current GraphQL schema at a higher level. If it doesn't match the schema (without being given a filter), that's inconsistent. If the client has to call both currentTypeDefs and do GraphQL schema introspection to know what it really can do, we have something more nuanced than a 'client/server boundary' and haven't really made the client any simpler.
  • There isn't a single client. There's a shell client, and a functions and call client, because they don't agree on what the user should see - which is debatable at a UX level! In another universe maybe we embraced the simplicity, removed dagger functions and dagger core, and had just dagger call with a categorized --help like shell does. Would we change how all clients see the API because we decided to change our own client's UX?

All of that is resolved by just adding an arg to currentTypeDefs instead of making it a client param. Debatably it's still the client deciding to filter, but it's acknowledging that even the same client may want different perspectives of the unified schema at different times, which seems more flexible for other clients in the future, and avoids turning the client params into too much of a monolith (have to reconnect to get different view).

Lemme know if I'm missing some sort of puzzle piece that makes a bigger picture snap into place

dim smelt
#

@brittle wharf sorry about that - I withdraw it

#

I got carried away by the vibe-shedding

#

I'm good with whatever you think is the right call

brittle wharf
#

all good. i know i've provided enough friction to give that impression, wanted to clear up my pov

brittle wharf
#

pushed - added currentTypeDefs(hideCore: Boolean) and further fixes + simplifications to how shell works. gotta run to a basketball game, but hopefully it's all green FrogePray

brittle wharf
#

figuring out how to keep .cd working in shell now. shell is still very module-oriented, with .cd taking a module ref and focusing on the module's main object. previously that used to also make the toolchains available as toplevel shell functions, but that no longer works on our branch, which would break hack/build. checking if there's an obvious fix for that atm, in the spirit of keeping this PR backwards compatible.

NB: we could consider making .cd workspace-oriented, but workspace loading is all on client init now, so that would mean making a new client (maybe doable but that's intertwined with TUI init atm). or we could add a Workspace equivalent of Module.serve.

last and easiest option is to just remove the --no-mod from our hack/build shebang and let it load the workspace, assuming using shell like this isn't that widespread.

dim smelt
#

btw you can see the same intent in the ancestor of Workspace, now defunct "evolution of Env" proposal: https://github.com/dagger/dagger/discussions/10370

extend type Env {
  "Change the filesystem location to a local or remote address."
  chdir(location: String!, noLoad: Bool): Env!
}

(Sorry, discussions don't support anchor links πŸ€·β€β™‚οΈ )

brittle wharf
dim smelt
#

I think it boils down to feasibility - how hard is it to allow clients (in this case dagger shell) to call currentWorkspace().load(...) (or chdir or whatever we wat to call) and change the whole workspace under their feet.

dim smelt
#

@brittle wharf if the answer is "yes it's doable, but let's merge this first, even if it means breaking .cd for a little bit", I could live with that

(but we should probably agree in advance on how long a breaking window is acceptable, to keep ourselves honest later)

brittle wharf
#

good to know. still sorting out the options; adding Module.serve(entrypoint: Boolean) could be another incremental step

dim smelt
dim smelt
#

Update: I'm spinning out what I think is the next primitive from the part 3 workspace design doc: Collection. I think it can be specific and detailed enough that we can have a productive review, and unblock other parts - including ship πŸ™‚

brittle wharf
brittle wharf
#

all green again, I think shell is in a decent state

compact monolith
#

This is the lockfile PR that needs to go in before the remaining workspaces PR?

dim smelt
#

Then the lockfile PR layers on top of that - we will merge that next

#

(by the way I will need reviews on that πŸ™‚

dim smelt
#

@brittle wharf you may have already found a solution, but actually .cd could just force a re-connect and create a brand new client...

brittle wharf
#

What i've got now seems to work well enough. Don't want to sink too much time into shell beyond what's necessary for compatibility tbh. I considered recreating a new client, but that seemed potentially quagmire-bound

dim smelt
#

cool cool

brittle wharf
#

Yeah and the TUI init flow is pretty tightly coupled to engine client init. Maybe swappable at runtime, but that dag var ends up in places that might need to be replaced, or updated on the Frontend, etc. etc.

As a quick sanity check, this behaves the same as it did on main now:

cd ~/src/dagger # directory of worktree subdirs
dagger -M
.cd workspace-plumbing
.pwd # prints /home/vito/src/dagger/workspace-plumbing
cli | binary | export ./foo
=> /home/vito/src/dagger/foo

(the .pwd is inconsistent with where export writes to, but it was like that on main too)

traces for comparison:

#

so I think the PR is officially ready now πŸŽ‰ - this checks my last box, which was to make sure hack/build doesn't break

dim smelt
#

<@&946480760016207902> πŸ™‹ πŸ‘† πŸ™ partyparrot

#

@brittle wharf did you have to make API changes to get .cd to work?

brittle wharf
dim smelt
#

Ah I see, Module.serve() already exists, duh. But why did you need entrypoint: true specifically? Did Module.serve change its behavior in this PR to be namespaced by default? (which would make sense)

brittle wharf
# dim smelt Ah I see, `Module.serve()` *already exists*, duh. But why did you need `entrypoi...

Module.serve was always namespaced (it added the module's constructor into Query), and before it was shell's responsibility to 'zoom in' to the main object and make its toplevel functions available when you do .cd. Now shell just reflects Query for its toplevel functions so I needed a way for it to do set up the entrypoint stitching to maintain the previous behavior, so .cd uses entrypoint: true, and literal module refs (like github.com/vito/dang | .help) use entrypoint: false.

dim smelt
brittle wharf
#

noticed some shell things that still don't work, backfilling tests and fixing them
(edit: was a narrower case than i thought, not blocking review)

dim smelt
brittle wharf
brittle wharf
#

re: TestModules perf, there's a chance this is load-bearing, we'll see if the new runs are faster: https://github.com/dagger/dagger/pull/11995/commits/7bb95c03614cc35f6ae76f4501b005ca27426f34

I had to make that change for the test suite to be runnable at all with with-dev go test - without it, everything failed trying to load the outer repo's toolchains because path joining messed up somewhere (core/integration/toolchains/changelog). Not sure why that wouldn't be failing when running with dagger call engine-dev test, but if it was doing something and succeeding, that could explain things taking longer

brittle wharf
#

noticed one final(?) cleanup to do: removing Workspace.defaultModule - not used anymore!

brittle wharf
#

@dim smelt @lofty shadow any hesitation around merging once CI is green? i brought the analytics back for module install, only open question atm

dim smelt
brittle wharf
dim smelt
#

Well that's exciting πŸ™‚ Yeah good on my end!

#

No last minute regrets @brittle wharf ?

brittle wharf
lofty shadow
dim smelt
#

Yay πŸ™‚

dim smelt
#

@brittle wharf thinking through UX breakage in workspaces...

#

Since we're narrowing the scope of dagger -m to mean "load this module in the current workspace", it will break some use of dagger -m.

For example dagger -m github.com/dagger/dagger@pull/xyz/head call engine-dev playground terminal...

#

but also possibly in CI... dagger -m $GITHUB_REF check ?

#

I think that was one of the reasons I was hesitant about -C for workspace selection. Is it obvious and easy enough to find, for something so fundamental? Workspace as positional arg felt like a promotion if that makes sense

#
  • dagger check github.com/dagger/dagger@pull/xyz/head -> "Dagger, check this branch"
  • dagger generate ./sdk/go -> "Dagger, generate this directory"
  • dagger ship github.com/dagger/dagger@v1.0.0 -> "Dagger, release our 1.0!"

(I don't actually know if we could pull this off, but at least superficially it feels very readable)

#

One big downside: if your command needs positional args for anything else, it gets complicated (but maybe still doable?)

#

The big big draw of dagger -C / dagger --workdir is that it's a familiar concept -> like a built-in cd

#

The big downside is also the familiarity: git -C and go -C don't support git remotes... Wil anyone think to use -C github.com/foo/bar? Or will it be even more surprising because -C is familiar as a local-only thing, and you don't even think to look there?

Possibly the problem is worse for agents since familiarity gets baked in the training set

brittle wharf
#

i don't remember fully why i didn't like the arg - i think there were just more scenarios where you had to type it? like dagger check . go:lint. i agree the ergonomics are nice in those other examples though

dim smelt
brittle wharf
#

technically that's 100% more characters than . πŸ˜›

dim smelt
#

LOL

#

for what it's worth, I actualy want to deprecate go:lint notation...

#

But plenty other examples of commands that need positional args, eg. dagger call

brittle wharf
#

do you mean focusing on paths as the way to communicate?

dim smelt
#

i just think I found something much better πŸ™‚

#

byproduct of test selection & test splitting

#

I'm centralizing the design docs in modules-v2 branch, hack/designs/modules-v2

#

splitting up the docs, sharpening them and getting them individually ready for review & implementation, defining the sequence, it was too many moving parts to do it across gists and PRs, so I'm centralizing there

#
  • artifacts -> plumbing to expose the object graph as a first-class citizen. Artifacts are identified by coordinates not path. Available dimensions for those coordinates can be discovered and filtered. Schema path becomes an implementation detail.

  • collections -> builds on artifacts to make the coordinate system way more useful with keyed collections (maps of objects). Every collection type is an artifact dimension πŸ™‚ So you can eg. dagger check --go-test=TestFoo --go-module=./myapp but also dagger list go-tests --go-module=./myapp

  • execution plans: a DAG of actions on artifacts. Can be reviewed, filtered - for example by action kind (only check; check + generate..) and action name ("lint", "test")

#

I also think we could experiment with a separate CLI (dag?) that takes things even further and is 100% schema generated.

eg. it would just be:

  • dag go tests --module=./myapp run
  • dag go test run TestFoo
  • dag go modules
  • dag go module lint ./myapp
  • dag helm charts test

No way we can back that into dagger but could be a fun experiment in a 100% custom CLI for your workspace

#

or could be fun if dagger generates & builds that custom CLI for you under the name of your workspace 😁

then it would be eg.

anam-dev go tests --module=./myapp

#

actually that's cooler in a way because the custom CLI could hardcode the upstream url of the workspace. so it could work even outside of a local checkout. but would also detect a local workspace and use that

#

(that last part was a total sidetrack πŸ™‚ )

dim smelt
lofty shadow
#

@brittle wharf I forget where we were having the conversation about how Query is a corner case in a hard-to-articulate way, but I now hit the point where it actually hit me in the face while rebasing on the workspace commit, so it's easy to articulate now laughcry

The concrete reason it's a corner case is that in dagql all objects are ID-able, but there's no recipe ID for the root Query object. There's no query/selection that produces the root. So it is ID-able but also not ID-able.

I can deal with it for sure, but does make me question if this will be an eternal thorn in our sides. I basically have to pepper the code in a bunch of places saying that "it's okay for there to not be a call frame in this one particular case". Which is not a big deal for now. Just food for thought.

dim smelt
lofty shadow
#

IIUC

dim smelt
#

ah right

#

The Curse of the Entrypoint ℒ️

brittle wharf
#

Also all the entrypoint proxy stuff that Query.with resides in should never show up in the DAG thankfully, since it's purely routing sugar; all IDs etc. delegate to an inner "Canonical()" server

#

(Not that I don't believe you, I think my face just needs to be smashed into the same wall)

#

Some of this is just downstream of how we happened to structure the ID/Call protobuf stuff, maybe we could tweak it so instead of having a required field {Type!, Parent!, Field!, [Arg!]!} it's something like {Type!, [{Field!, [Arg!]!}]} - so root would just be {"Query", []}?

lofty shadow
dim smelt
#

The mythical land of theseus

lofty shadow
# brittle wharf Ah hm, feels like our equivalent of `nil` in Buildkit being how `llb.Scratch()` ...

Yeah, I before the workspace rebase had to use a single line of kludge in one place to special case Query, but it didn't work anymore because now Query sometimes is a thing with state (even if it's technically do-not-cache and such). And instead I had to special case it multiple places.

It's possible with more searching I can simplify it back again. The problem is just that the code worked off the assumption that ID-able objects always have a call frame

dim smelt
#

Where cache hits grow on trees, and streets are paved with green CI checks. Not a single flake in all the land.

lofty shadow
#

Once I finish this rebase everyone else can start visiting too

dim smelt
#

They say in the Land if Theseus, compute is free. The people consume as many functions as they wish, without a care in the world. No matter what object ID they need, they have but to kneel down to grab it

lofty shadow
#

I think btw that module loading cache might be broken or severely hampered with just the workspace changes on main. the load workspace step is very slow in repeated runs. I had to test it off main to see if that was new or not.

I haven't looked into it cause I'm just assuming I'll fix it as part of the rebase with theseus. Just an FYI though that if we do a release now w/out theseus that may be regressed majorly

dim smelt
#

The only limit to the size of your DAG is your imagination. For my DAG is your DAG - as there is but one true DAG.

#

You think this is is praise, but actually it's psychological pressure, I will continue to quote increasingly cringey verses from the Book of Theseus until the branch is merged

#

"Oh all-powerful host!" the pilgrim cried out. "Grant me the knowledge of the elusive etree! For I have traveled far, and my people hunger for more incremental builds"

#

You can tell this is not AI-generated because the AI would be funnier, probably

lofty shadow
#

I can assure you I have enough internally motivated pressure already, doing quite literally everything I can to finish it. Just a monstrous amount of change to manage, even if it's all improvements. It'll be next week though cause I am on vacation starting on the 11th so it's a sprint until then πŸ™‚

dim smelt
#

ha ha didn't mean to actually pressure you 😁 just excited

lofty shadow
dim smelt
#

let me know when it makes sense to rebase, that way we can help indirectly with more testing

dim smelt
#

We've reinvented the fork bomb from first principles. Would be funnier if it could trigger an infinite scale-out loop

lofty shadow
#

woohoo All of TestModule|TestWorkspace passing after my merge commit of the workspace changes https://dagger.cloud/dagger/traces/ab0655880f14283271cb6e54a3d660f1

Pulling in the workspace changes does seem to have regressed perf, the tests took several minutes longer and then engine is consuming 2GB of RSS vs. 1.2ish GB before the commit. But I'm sure that'll be addressable, I've had to go through several "fix perf" sidequests throughout this PR anyways. Been able to fix it with enough pprof'ing.

The good thing is it's working.

#

I do have more to merge in from main, the dagger up PR in particular is probably another one that'll require some thinkies but hopefully not as bad as the workspace one.... 🀞

dim smelt
#

Hurray!

#

Re: perf, my naive guess is that we may have regressed in how lazily we load modules

dim smelt
#

It should be much more superficial

lofty shadow
# dim smelt Re: perf, my naive guess is that we may have regressed in how lazily we load mod...

could be, also could be something very non-obvious but significant around typedefs. In my PR I had to refactor every reference to a typedef to be a cached canonical ObjectResult[T] rather than just T, mainly for caching reasons. But that had the nice side effect of reducing engine RSS by several GB (during TestModule anyways) and also reducing the CPU overhead of the go garbage collector by ~half.

I tried to maintain that in my rebase but I mighta missed something. My PR also did a bunch of related stuff to minimize how often we repeatedly install schemas to servers, might need to check if that plays nicely with the multilayered "entrypoint and canonical" server thing. I thought it did but who knows.

lofty shadow
lofty shadow
dim smelt
#

In the Land of Theseus, "distance" is an obsolete concept. Any destination, no matter how remote, is always two seconds journey away.

#

The laws of physics still apply, but in a better, elevated form that reframes the very nature of the universe. To experience this new frame, one must only believe.

brittle wharf
lofty shadow
#

the fact that it just suddenly worked now again after the merge commit without having to think about it is all that matters now

brittle wharf
#

i wonder what it was thinkies - but perhaps i should instead spend these freed up mental cycles elsewhere

lofty shadow
lofty shadow
#

Finished the merge commits, no more conflicts on the PR https://github.com/dagger/dagger/pull/11856 hyperfastparrot

Large swaths of tests are passing but the full CI run flushed out some more stuff I didn't run locally. So far everything looks immediately obvious except for TestInterface, will have to look into that one.

Pushing fixes for what bubbled up there now. After that there's still gonna be some code churn because despite 10s of thousands of lines of deletion there's still a ton dead code left to sweep up, but will be cosmetic things here on out. A lot of dead code hanging on by a thread and just needs one adjustment to let it all fall out. Everything that requires actual thinking has been converted over and ported (famous last words, but feel good about it at this point, there's literally just not that many lines of code left that have gone untouched)

dim smelt
#

@brittle wharf I just remembered why -C was a problem... It's those damn entrypoints again...

dagger call -> when there's an entrypoint where do the constructor args go? That was the blocker

#

In that situation, it seemed like an elegant solution to make the workspace itself the first arg:

dagger call github.com/vito/dang --foo=bar build --bar=baz
#

Alternative:

dagger -C github.com/vito/dang call --foo=bar build --bar=baz

I guess it's not too terrible, same UX we already have today

dim smelt
#

Cleaning up workspace

#

It appears when I replayed workspace over workspace-plumbing, I trusted the agent a little too much, and things went seriously wrong... A lot of initially good code was thrown away, and intentionally re-implemented wrong. Like the agent looked at workspace-plumbing, and extrapolated missing features from scratch while discarding the hard-earned design from old workspace... facepalm

Anyway I'm fixing it. A good forcing function to cleanup and prep for final review anyway

#

Areas of focus right now:

  • install, update were broken
  • Migration. Super important...
  • Workspace selection (was -C/--workdir, then positional arg... cleaning that up)
  • Various polish
  • Using new entrypoint (was still stuck on old "blueprint" / "alias" unfinished cncepts)
  • Finish wiring in lockfile
compact monolith
#

Is it too late to put a small feature request into this? Can I set progress per env in .dagger/config.toml?

compact monolith
#

Another question that may lead to a feature request: Can I pass args in aliases defined in .dagger/config.toml? Usecase: setting config.<parameter>=<value> for constructors, but aliasing function calls and inputs

compact monolith
#

Yeah, since --progress=logs landed all our Gitlab jobs use it, but locally the default interactive display is great

dim smelt
#

we could also just change the non-tty default to logs (not mutually exclusive)

compact monolith
#

I'd be fine with that. plain is a lot for end-users, and is really only useful for module writers digging into what's happening in their modules

brittle wharf
compact monolith
#

Well we don't use dots anymore since logs landed. The dots don't do anything for anyone really, end-users want the commands and their outputs, module maintainers want the interactive terminal for -i

brittle wharf
compact monolith
#

Not been raised as an issue yet for anything other than dockerbuild which outputs basically nothing in logs output

#

I'm guessing that's to do with handing it off to buildkit and may be possible to improve in the future post-theseus?

brittle wharf
lofty shadow
harsh imp
lofty shadow
# harsh imp Looking forward to this. Being able to easily convert Dockerfiles and compose wi...

Oh to be clear, the conversion is internal right now. When you call Directory.dockerBuild the implementation internally converts the dockerfile to the dagger ops. It's not that the Dockerfile is converted to dagger SDK code.

However now that you mention it, it would be much easier to implement something like Dockerfile->Dagger SDK code now. That's just not how the baseline feature works, that'd be a nice extra later

harsh imp
#

Definitely see all the pieces being put together to achieve that. Healthchecks being the most recent. Compose is very popular for testing and it does not work the same across container runtimes. For example, we were evaluating moving from docker desktop to podman. I know a team that had to move back to docker desktop because podman didn't work the same. I can see Dagger filling that gap around being universal.

harsh imp
#

Ah ok I lied. dagger up being the most recent puzzle piece πŸ™‚

dim smelt
#

@brittle wharf what happens if my workspace config has an entrypoint module, and on top of that my CWD is inside a sub-module? Wouldn't that require having two entrypoint modules?

brittle wharf
#

don't recall off the top of my head between workspace-plumbing and the stuff on top of that thinkies

#

i could see that just prioritizing the cwd module

dim smelt
#

right, so the cwd entrypoint nukes the parent's entrypoint (but nothing else)

#

Looks like the situation can't happen in main/workspace-plumbing, because at loading it doesn't differentiate between "compat workspace loaded from eligible dagger.json" and "nearest dagger.json I can find".

#

Found another major regression in compat mode, will fix it in workspace.

  • blueprint and toolchains must be completely ignored in module loading. But right now they're not. Somehow that also got lost in the rebase of workspace over workspace-plumbing.
compact monolith
#

@dim smelt How can I create a workspace for testing purposes? Test submodule would previously create the --source directory it needed in the test.

Example:

Old method under test:

func (m *Module) Thing(ctx context.Context, source *dagger.Directory) { ... }

Old test: source := dag.Directory().WithNewFile("file-needed", "contents") -> dag.Module().FunctionThatRequiresDirectory(source))

Updated method under test:

func (m *Module) Thing(ctx context.Context, workspace *dagger.Workspace) { ... }

New test: ❔

dim smelt
# compact monolith <@488409085998530571> How can I create a workspace for testing purposes? Test su...

This is the PR for workspace: https://github.com/dagger/dagger/pull/11812

I'm working on polishing it now.

You can play with in a playground with:

dagger -m github.com/dagger/dagger@pull/11812 call engine-dev playground terminal
GitHub

Introduce workspaces: project-level configuration separate from module definitions.
Dagger currently conflates project configuration and module definitions in a single dagger.json. This PR separate...

compact monolith
#

I mean specifically in a test module rather than in general. Is there a planned future method of creating a workspace in a module, or will I have to put test data in the test module's real workspace?

#

(Talking about the type *dagger.Workspace)

dim smelt
#

Oh I see. You want to create a synthetic workspace for testing?

compact monolith
#

Yep. We currently create a directory containing test files for testing modules in a submodule. Want to move over to workspaces (removes the need for --source on most things, and makes transporting files between funtions easier) but could do with the ability to create a workspace

dim smelt
#

I could try adding it... @brittle wharf any red flags?

In the past I think it threw us off (or at least threw me off) when discussing Env, but maybe it was just abstract complexity that caused brains to melt. Maybe now that most of it is implemented, it's not so bad?

My main concern is crossing wires between "workspace as direct gateway to mutating external state" and "workspace as snapshotted immutable state"

#

Mmm an alternative could be... Test module creates a directory inside its sandbox container, then opens a client connection setting that directory to be the workspace. Then loads modules, which will inherit the workspace.

brittle wharf
brittle wharf
dim smelt
brittle wharf
dim smelt
#

I guess dagger.Connect() using the generated client provided by the runtime? πŸ€”

compact monolith
#

There may be a bit of a workaround I can use before getting in too deep. What does the *dagger.Workspace resolve to in a submodule?

my-module/
β”œβ”€β”€ .git/
β”œβ”€β”€ dagger.json
β”œβ”€β”€ main.go
└── tests/
    β”œβ”€β”€ dagger.json
    └── main.go

If tests/main.go includes a function that has a parameter of type *dagger.Workspace what do I get?

dim smelt
compact monolith
#

So dagger call -m tests <args> will result in the root dir of the parent module being the workspace?

#

I think there's a world where I shift the test files out of functions and into actual files in the module and just read them from the workspace as an acceptable compromise

dim smelt
dim smelt
compact monolith
#

I think so. Currently each test creates the file it needs first with source := dag.Directory().WithNewFile() then passing that directory into the function being tested, whether thats docker building or service deploying. This is a nice-to-have, no test files litered around in the module, created files are thrown away seamlessly at the end of each test etc. I'm not opposed to creating actual files to test with instead, keeping testing is more important than being precious over how we test.

Something like this:

my-module/
β”œβ”€β”€ .git/
β”œβ”€β”€ dagger.json
β”œβ”€β”€ main.go
└── tests/
    β”œβ”€β”€ data/
    β”œ   β”œβ”€β”€ Dockerfile.base/
    β”œβ”€β”€ dagger.json
    └── main.go

And functions in tests/main.go get access to Dockerfile.base from the *dagger.Workspace arg in the test function. I'll trial it tomorrow morning πŸ™‚

dim smelt
#

Yeah if I understand correctly, if you're using Dagger to generate the files in the test workspaces... and Dagger doesn't support loading creating a synthetic Workspace object... Then you will need a small client-side bridge that exports the dagger directory to a tmp dir locally; then connects a new client using that tmp directory as a workspace

#

In the future I could see us adding the missing bridge so you don't have to do that export... But it won't be right away because the design stack is piled so high already

compact monolith
#

I'll test some workarounds for now

dim smelt
#

OK thanks for the report! Please let us know how it goes

dim smelt
brittle wharf
#

I guess that could become with<ModuleName>? thinkspin

#

and call could namespace constructor flags like --mod1-foo --mod2-foo if it needs to disambiguate? thinkspin

#

seems doable, at least, if we want it

dim smelt
#

sounds great and super simple πŸ˜›

#

In this particular case I just need to resolve the ambiguity between CWD and -m

#

Just trying to see where to handle it (let the plumbing fail, or enforce it before passing to the plumbing)

brittle wharf
#

is it "just" cli convenience of being able to run workspace-wide functions everywhere, like linters/formatters?

dim smelt
#

With the upcoming artifact-centric commands, collections, checks, ship etc - I think the exact schema path will matter less and less, so the cosmetic value of entrypoints will diminish over time (my guess) since it applies primarily to dagger call

brittle wharf
#

makes sense

dim smelt
#

Also I found a bug in my current code - -m and CWD are allowed to both set entrypoint, so dedup is not tight enough

#

btw @brittle wharf I reverted workspace selection, no more weird positional args and --.

But I also didn't go back to overloading --workdir. Instead there's a new clean --workspace / -W (didn't want to break -w unilaterally but if there's consensus that it's worth it, I would love to do that too)

brittle wharf
#

i actually kinda like -W, makes it seem important like -C πŸ˜› - but that may be my bias from using -w a lot

dim smelt
#

yeah -W is good I agree

compact monolith
#

Root module:

// A generated module for Daggerworkspacesubmodule functions

package main

import (
    "context"

    "dagger/daggerworkspacesubmodule/internal/dagger"
)

type Daggerworkspacesubmodule struct{}

func (m *Daggerworkspacesubmodule) ViewWs(ctx context.Context, ws *dagger.Workspace) ([]string, error) {
    files, err := ws.Directory(".").Entries(ctx)
    if err != nil {
        return nil, err
    }
    return files, nil
}

Tests submodule:

// A generated module for Tests functions

package main

import (
    "context"
)

type Tests struct{}

func (m *Tests) TestWorkspaceInSubmodule(ctx context.Context) ([]string, error) {
    things, err := dag.Daggerworkspacesubmodule().ViewWs(ctx)
    if err != nil {
        return nil, err
    }
    return things, nil
}
#

Calling root module:

dagger call view-ws
βœ” connect 0.1s
βœ” load module: . 0.1s
βœ” parsing command line arguments 0.0s

βœ” daggerworkspacesubmodule: Daggerworkspacesubmodule! 0.0s
βœ” .viewWs(
  ┆ ws: currentWorkspace(skipMigrationCheck: true): Workspace!
  ): [String!]! 0.1s

.git/
.gitattributes
.gitignore
LICENSE
dagger.gen.go
dagger.json
go.mod
go.sum
internal/
main.go
tests/

Calling tests submodule - no workspace directory contents:

dagger -m tests call test-workspace-in-submodule
βœ” connect 0.1s
βœ” load module: tests 0.2s
βœ” parsing command line arguments 0.0s

βœ” tests: Tests! 0.0s
βœ” .testWorkspaceInSubmodule: [String!]! 0.0s



Setup tracing at https://dagger.cloud/traces/setup. To hide set DAGGER_NO_NAG=1
#

So that approach isn't going to work and I'm still searching for a way to do this