`.stdlib | .help git` | Dagger | Page 1

jaunty echo Apr 21, 2025, 4:47 PM

#

Oh yeah, I see. So basically, when serving dependencies, if a dependency has the same name as something that's already in Query, it will override it. In this case it's in the hardcoded subset for .stdlib, which is why it shows up there.

Then, if we take a step back, and you have a module with a dependency named "git", your dag.Git will return Git from the dependency rather than GitRepository from core. Shouldn't that be disallowed?

vocal python Apr 21, 2025, 5:11 PM

#

it either feels like something that should error for the person who did the naming OR something we should allow because it enhances programmability (like you could ostensibly wrap the core here and shadow, thus extending the core objects... tbh just describing that makes me pretty certain that it'll blow up due to type incompatibilities)

jaunty echo Apr 21, 2025, 5:38 PM

#

Yeah, and it also wouldn't work since you don't get both. Actually, it's not just dependencies. A module named Git would itself obfuscate core git.

vocal python Apr 21, 2025, 5:39 PM

#

erroring for the person who did the naming would imply that we wanna reserve all the names from the core API... isn't there some prior art for reserved keywords? where's that code at?

jaunty echo Apr 21, 2025, 5:41 PM

#

I think it's just a matter of looking into current typedefs before always replacing, when a module is loaded.

vocal python Apr 21, 2025, 5:41 PM

#

jaunty echo I think it's just a matter of looking into current typedefs before always replac...

that's gonna make it error not for the person doing the naming, but for the person doing the installing, right?

#

(that does seem easier to implement, for sure, and i'm cool doing it that way if we're both cool with that tradeoff of who hits the error)

jaunty echo Apr 21, 2025, 5:42 PM

#

Should also happen for the person doing the naming, if said person tries to load it.

vocal python Apr 21, 2025, 5:43 PM

#

yeah, but only in shell, right?

jaunty echo Apr 21, 2025, 5:43 PM

#

No, I'd enforce it always.

vocal python Apr 21, 2025, 5:43 PM

#

ah, so enforce it on server-side Install

jaunty echo Apr 21, 2025, 5:43 PM

#

Server-side load.

vocal python Apr 21, 2025, 5:43 PM

#

yeah

#

Serve

#

sorry words lmao, my mental model of all this stuff oscillates from clear to fuzzy depending on the week

jaunty echo Apr 21, 2025, 5:44 PM

#

😁

vocal python Apr 21, 2025, 5:45 PM

#

cool i think i can move forward with that. i should expect some broken tests with the new desired behavior, right?

jaunty echo Apr 21, 2025, 5:46 PM

#

Yes, it implies our version module would have to be renamed, and respective tests should now expect an error.

vocal python Apr 21, 2025, 5:47 PM

#

plus there are these test modules like git that should error

jaunty echo Apr 21, 2025, 5:47 PM

#

Maybe ask for a quick consensus on this behavior?

vocal python Apr 21, 2025, 5:47 PM

#

yeah can do

jaunty echo Apr 21, 2025, 5:48 PM

#

vocal python plus there are these test modules like git that should error

Yeah, a Git module could be renamed to GitTest or something. But if the test is for shadowing behavior it can be replaced by a general one on naming conflict error.

vocal python Apr 21, 2025, 5:55 PM

#

@lucid violet @halcyon urchin @golden vessel @fresh jackal looking for a quick consensus - any concerns about erroring in module.Serve if a module name tries to shadow the core API names? essentially we'd be disallowing modules named git, version, cacheVolume, container, etc so that we can present an api to LLMs and shell users that feels more similar to SDK code with direct access to module constructors

#

PR here since it's lost in the thread chaining, but it doesn't yet have the error code i'm intending to add

vocal python Apr 21, 2025, 9:01 PM

#

forging onwards with this, assuming lack of engagement means "it's probably fine"

lucid violet Apr 21, 2025, 9:10 PM

#

vocal python forging onwards with this, assuming lack of engagement means "it's probably fine...

makes sense to me 👍

fresh jackal Apr 21, 2025, 9:54 PM

#

what's the goal of this?

#

if it's eliminating shadowing entirely, don't forget the shell also allows functions of the current object to shadow dependencies of the current module. So you will still have some shadowing

vocal python Apr 21, 2025, 10:00 PM

#

intent right now is specifically to eliminate shadowing of core functions by module names. the end goal, though, is towards SDK/shell/prompt parity -- making shell and prompt offer the same "flat" access to dep modules that you can get in an sdk with dag.ModuleName(). We already have this mostly working for shell since 9992, but that client-side fix interacts poorly with the Env install hooks, so i'm fixing it by moving the same logic server side

#

it's quite a long chain of if-this-then-that lol

#

but your question does raise a corollary: we don't actually strictly need to stop the shadowing for the goal to be accomplished. an alternative would be to adapt the tests to acknowledge that deps can override core impls, it just starts looking real crazy in some places like .stdlib | .help git returning module-defined help text

#

there's probably also a way to have the client be aware of this shadowing, analogous to core.client.defaultDeps

fresh jackal Apr 21, 2025, 10:40 PM

#

When I said "shell parity" I didn't specifically mean "everything should be flat like in the SDKs"

#

I don't understand why .stdlib | .help git would return anything other than the doc for stdlib git - that's literally what it's for

#

If it returns something else, IMO it's a bug

vocal python Apr 21, 2025, 10:43 PM

#

yeah, agreed, this thread is an effort to not ship that bug

fresh jackal Apr 21, 2025, 10:47 PM

#

Can't we just fix the bug? The change you propose will break an unknown number of modules on daggerverse, and it feels like the beginning of a game of whack-a-mole

vocal python Apr 21, 2025, 10:55 PM

#

not sure if it's feasible or not, in my head fixing it would be a matter of continuing to let modules shadow core but keeping around a pre-module-load representation of the typedefs for .stdlib? @jaunty echo does that sound feasible to you?

fresh jackal Apr 21, 2025, 10:55 PM

#

Oh I see, it's the recent addition of pre-loading all modules that broke it?

vocal python Apr 21, 2025, 10:56 PM

#

yeah, at least once we move that pre-loading of deps server-side

#

the counter-argument, anyways, is that the unknown number of modules we're gonna break on daggerverse are already doing something bad that we don't want them to do: hiding core Query functionality from their importers

#

speculatively, you might even be able to do some really nasty stuff, like create a module called SetSecret with a constructor that takes name, plaintext and exfiltrates the plaintext

#

https://github.com/dagger/dagger/pull/10118/commits/5ba0f0170025f240cdca6ef6824fb09c3c412408 is roughly the "refuse to serve core-shadowy-modules" change

GitHub

move shell/prompt dependency serving server-side by cwlbraa · Pull...

fixes #10116, replaces #9992
this PR adds
module.Serve(ctx, dagger.ModuleServeOpts{IncludeDependencies: true})
to the API, enabling shell and other clients to serve the dependencies of modules to e...

fresh jackal Apr 21, 2025, 11:21 PM

#

We specifically designed the shell to allow shadowing. There's a whole UX built around it. We can always discuss changing it (like anything else) but we shouldn't roll back parts of it without thinking through the end-to-end ux implications, as a kneejerk reaction to a regression.

It didn't occur to me that there could be a security issue with shadowing. I guess that would be a good motivation to remove it. But let's start with an issue focusing on that, then.

We can't yolo this change and tell module devs "you were doing something bad"

#

The potential pain is exacerbated by another recent module breakage (can't call your types Env anymore)

vocal python Apr 21, 2025, 11:24 PM

#

regardless of the shell, how was this not an issue in SDKs?

#

i'll play around with it i guess... i do wanna be clear that the yoloing here is directly in service of the dagger install $module; dagger shell --prompt "ask LLM to use $module" experience, not apropos of nothing

jaunty echo Apr 22, 2025, 8:15 AM

#

fresh jackal We specifically designed the shell to allow shadowing. There's a whole UX built ...

It's not about the shell, it's deeper than that. If you have a module called "git", then you don't have core's git anymore because Query { git(x): GitRepository! } becomes Query { git(y): Git! }. Same with a module that depends on another called "git". I don't think that should be allowed. If there's already a Query.foo in the schema, it shouldn't be allowed to be overridden like that.

#

Try this (dagger init --sdk=go git):

package main

import (
    "dagger/git/internal/dagger"
)

type Git struct{}

func (m *Git) Test() *dagger.Directory {
    return dag.Git("github.com/dagger/dagger").Head().Tree()
}

Without self-bindings codegen won't replace dag.Git so the IDE won't see any problem here, but the API schema has changed:

$ dagger call test
✔ git: Git! 0.0s
✘ .test: Directory! 1.1s
! marshal: json: error calling MarshalJSON for type *dagger.Directory: returned error 422: {"data":null,"errors":[{"message":"Cannot query field \"head\" on
! type \"Git\".","locations":[{"line":1,"column":43}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}},{"message":"Unknown argument \"url\" on field
! \"Query.git\".","locations":[{"line":1,"column":7}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}]}

#

By throwing an error, the author would be forced to rename the module to avoid the conflict.

#

Note that you could have a module "my-git" as a dependency, aliased to just "git". ~~In that case the shadowing logic in shell would apply correctly.~~ The problem is when the schema gets replaced. EDIT: actually the alias becomes the new name even when expanding the schema so no cookie there.

jaunty echo Apr 22, 2025, 9:13 AM

#

@vocal python when I said "load" and not "serve" I meant even dagger init --sdk=... git should fail. Wherever the schema gets replaced. We already do that check for types:

$ dagger init --sdk=go git-repository
$ dagger functions -m git-repository
Error: input: failed to get schema: failed to get schema for module "git-repository": type "GitRepository" is already defined by module "daggercore"

vocal python Apr 22, 2025, 4:10 PM

#

jaunty echo @vocal python when I said "load" and not "serve" I meant even `dagger in...

is there a "load" codepath server-side that both init and serve share? ModDeps.lazilyLoadSchema?

jaunty echo Apr 22, 2025, 4:11 PM

#

vocal python is there a "load" codepath server-side that both `init` and `serve` share? ModDe...

I don't think it needs to be in serve. If load fails, it won't be usable anywhere.

#

Was thinking more in line with asModule.

vocal python Apr 22, 2025, 4:12 PM

#

jaunty echo I don't think it needs to be in `serve`. If load fails, it won't be usable anywh...

we're on the same page, i'm just trying to figure out where the actual code path is to get that consistent error

jaunty echo Apr 22, 2025, 4:13 PM

#

@halcyon urchin, wdyt?

vocal python Apr 22, 2025, 4:13 PM

#

jaunty echo Try this (`dagger init --sdk=go git`): ```go package main import ( "dagger/...

also this is a fantastic example, and makes it even less likely that we're breaking modules that might do this accidentally (bc they're already broken)

jaunty echo Apr 22, 2025, 4:18 PM

#

Could be in Query.moduleSource

vocal python Apr 22, 2025, 4:24 PM

#

the supply chain vulnerability is also real in shell right now, although i could only get it working for git and not so much for CacheVolume

cat ../test-shadow/main.go

package main

import (
    "dagger/git/internal/dagger"
)

type Git struct{}

func New(url string) *Git {
    // exfiltrate
    return &Git{}
}

// Returns a container that echoes whatever string argument is provided
func (m *Git) ContainerEcho(stringArg string) *dagger.Container {
    return dag.Container().From("alpine:latest").WithExec([]string{"echo", stringArg})
}

dagger install ../test-shadow
dagger -c "git google.com | container-echo woopsie | stdout"
✔ connect 0.2s
✔ load module 0.4s
✔ serving dependency modules 0.0s
✔ load module 0.1s

✔ git(url: "google.com"): Git! 0.2s
✔ .containerEcho(stringArg: "woopsie"): Container! 1.2s
✔ .stdout: String! 0.1s

woopsie

jaunty echo Apr 22, 2025, 4:29 PM

#

How about core/module.go -> func (mod *Module) Install? That's when a module's types are added to the schema, I think. And that's where it validates whether your module has a type that conflicts with an existing type. We just need another validation to also check that the constructor doesn't conflict with a field under Query.

#

Even better, core/object.go -> func (obj *ModuleObj) installConstructor(...)

vocal python Apr 22, 2025, 4:34 PM

#

jaunty echo Even better, `core/object.go` -> `func (obj *ModuleObj) installConstructor(...)`

yeah i'll try this first, nice and specific

jaunty echo Apr 22, 2025, 4:37 PM

#

There's a dag.Root().ObjectType().Extend(...). That's where the constructor is added to Query. Extend is a generic function for all object types, but it doesn't return an error.

#

Every other use of Extend seems to depend on a type's name, which already can't be overridden. installConstructor is the only place where there may be a conflict now, afaict.
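To make the difference concrete, here is a minimal sketch of an unchecked extend next to a guarded one. It is a toy model only: `queryType`, `field`, `extend`, `extendChecked`, and the `Owner` bookkeeping are hypothetical names for illustration and are not Dagger's actual `Extend` or `installConstructor` code.

```go
package main

import (
	"errors"
	"fmt"
)

// field is a simplified stand-in for a GraphQL field on Query.
type field struct {
	Name       string // e.g. "git"
	ReturnType string // e.g. "GitRepository"
	Owner      string // "core" or the module that installed it (assumed bookkeeping)
}

// queryType is a toy model of the Query object type, not Dagger's real type.
type queryType struct {
	fields map[string]field
}

var errFieldConflict = errors.New("field already defined on Query")

// extend mimics the behavior described in the thread: last writer wins, no error.
func (q *queryType) extend(f field) {
	q.fields[f.Name] = f
}

// extendChecked is the hypothetical guarded variant: it refuses to replace
// a field that a different owner already installed.
func (q *queryType) extendChecked(f field) error {
	if existing, ok := q.fields[f.Name]; ok && existing.Owner != f.Owner {
		return fmt.Errorf("%w: %q is owned by %q", errFieldConflict, f.Name, existing.Owner)
	}
	q.fields[f.Name] = f
	return nil
}

// newCoreQuery returns a Query that only knows core's git field.
func newCoreQuery() *queryType {
	q := &queryType{fields: map[string]field{}}
	q.extend(field{Name: "git", ReturnType: "GitRepository", Owner: "core"})
	return q
}

func main() {
	modCtor := field{Name: "git", ReturnType: "Git", Owner: "git"}

	unchecked := newCoreQuery()
	unchecked.extend(modCtor)
	fmt.Println("unchecked:", unchecked.fields["git"].ReturnType)

	checked := newCoreQuery()
	err := checked.extendChecked(modCtor)
	fmt.Println("checked:", checked.fields["git"].ReturnType, "err:", err)
}
```

In the unchecked case core's field is silently gone; in the checked case the module author gets an error at install time and core's field survives.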

vocal python Apr 22, 2025, 4:51 PM

#

jaunty echo Every other use of `Extend` seems to depend on a type's name which can't already...

feels like it, yeah- this is why git works for the vuln demonstration and cacheVolume doesnt - git is unique in that the returned type GitRepository has a different name than the Query constructor fn Git
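The asymmetry between git and cacheVolume can be sketched as two separate checks: one on the module's type name and one on its constructor name. This is a simplified illustration; `coreTypes`, `coreQueryFields`, `typeName`, `ctorName`, and `conflicts` are hypothetical, and the casing helpers assume a camelCase module name with no separators, which is not how the real name mangling works.

```go
package main

import (
	"fmt"
	"strings"
)

// Toy subset of the core schema, using only names mentioned in the thread.
var coreTypes = map[string]bool{"GitRepository": true, "CacheVolume": true, "Container": true}
var coreQueryFields = map[string]bool{"git": true, "cacheVolume": true, "container": true}

// typeName upper-cases the first letter: "git" -> "Git" (simplified assumption).
func typeName(module string) string {
	if module == "" {
		return ""
	}
	return strings.ToUpper(module[:1]) + module[1:]
}

// ctorName lower-cases the first letter: "Git" -> "git" (simplified assumption).
func ctorName(module string) string {
	if module == "" {
		return ""
	}
	return strings.ToLower(module[:1]) + module[1:]
}

// conflicts reports which of the two checks a module name would trip.
// Per the thread, only the type check exists today; the constructor check is the proposal.
func conflicts(module string) (typeConflict, ctorConflict bool) {
	return coreTypes[typeName(module)], coreQueryFields[ctorName(module)]
}

func main() {
	for _, m := range []string{"git", "cacheVolume", "myGit"} {
		tc, cc := conflicts(m)
		fmt.Printf("%-12s type conflict: %-5v constructor conflict: %v\n", m, tc, cc)
	}
}
```

A module named git passes the type check because core's type is GitRepository, not Git, so only a constructor check would catch it. A module named cacheVolume is already rejected by the type check.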

jaunty echo Apr 22, 2025, 4:52 PM

#

https://tenor.com/view/yes-agree-nods-gif-8035514461046022328


vocal python Apr 22, 2025, 7:36 PM

#

i'm tripping over a somewhat opaque error that seems to pop up in a bunch of different places, including sdk lints

failed to get configured module: input:2: moduleSource Cannot query field "__schemaVersion" on type "Query".


#

i un-shadowed our dagger/dagger/version module already, too, thinking that that might be related, but it seems like it's not

fresh jackal Apr 22, 2025, 8:20 PM

#

calling this a supply chain attack is a major stretch

#

please open an issue so others can chime in

#

In the past I've had modules that worked fine, until they broke because the core API had expanded and I had to rename them. Or maybe it's the name mangling that changed. Or a regression in how an SDK handles name conflicts. In every case it's a MAJOR pain in the butt when your modules work and then don't.

Please don't take my feedback lightly. Ask other people who make modules and try to keep them working.

jaunty echo Apr 22, 2025, 8:52 PM

#

fresh jackal In the past I've had modules that worked fine, until they broke because the core...

That can already happen if you have a module named "foo" and we decide to add "Foo" to the core API:

type Query {
  foo: Foo!
}

From then on you'll get an error if you try to use the module due to the type conflict. It's the constructor that doesn't have the same protection atm, so let's say we add this to core instead:

type Query {
  foo: FooBar!
}

Then your module will still work, but won't be able to use core's dag.Foo(). Same for a different module that uses "foo" as a dependency.

I think it's great if we can keep modules working, but maybe through some other solution around namespacing rather than continue to allow this replacement of fields under Query, which leads to confusing behavior.

vocal python Apr 22, 2025, 9:18 PM

#

fresh jackal please open an issue so others can chime in

https://github.com/dagger/dagger/issues/10245

GitHub

🐞 Module constructors can shadow some core API functions · Issu...

What is the issue? discovered in pursuit of #10118, modules can be constructed to shadow core APIs like "git", at which point their functions can override core module functions. their typ...

sonic cipher Apr 22, 2025, 9:46 PM

#

Isn't that why we had the . prefix in the past? Now that we're sharing the same namespace for modules and core types, I agree with the security risk (ambiguous API at the very least).

vocal python Apr 23, 2025, 4:53 PM

#

im not aware of the history here, are we implying . would allow ppl to shadow dag.Git() with their own thing but call .git in shell to continue to access the underlying core impl?

#

the namespacing feels like it's probably pretty straightforward to handle in shell because we have .deps in shell to just "fix the bug" and make sure the global namespace git stays pointed to the stdlib while .deps | git would point to the module.

but in SDKs, no such api exists today afaict?

fresh jackal Apr 23, 2025, 4:58 PM

#

In the very first shell prototype, core API was in the same namespace as builtins, eg. .git, .container etc. But it felt weird to testers so we dropped it.

#

And the namespacing was handled by stealing the familiar concept of search path from unix shell, and adapting it.

Well-defined sequence when resolving a name (functions -> deps -> stdlib)
Builtins for explicit scoping (.stdlib, .deps. We should consider .scope to group it and make it easier to find and configure)
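As a rough illustration of that search-path idea, the sketch below resolves a name by walking scopes in order and supports explicit scoping by searching a single scope. The `scope` type and `resolve` function are hypothetical and model only shell-level lookup, not what happens to the underlying API schema.

```go
package main

import "fmt"

// scope is one level of the lookup path, e.g. "functions", "deps", "stdlib".
type scope struct {
	Name  string
	Names map[string]bool
}

// resolve walks the scopes in order and returns the first scope that defines name.
func resolve(path []scope, name string) (string, bool) {
	for _, s := range path {
		if s.Names[name] {
			return s.Name, true
		}
	}
	return "", false
}

func main() {
	// Order follows the thread: functions -> deps -> stdlib.
	path := []scope{
		{Name: "functions", Names: map[string]bool{"build": true}},
		{Name: "deps", Names: map[string]bool{"git": true}},
		{Name: "stdlib", Names: map[string]bool{"git": true, "container": true}},
	}

	for _, n := range []string{"build", "git", "container", "nope"} {
		where, ok := resolve(path, n)
		fmt.Println(n, where, ok)
	}

	// Explicit scoping, in the spirit of `.stdlib | git`: search only one scope.
	where, _ := resolve(path[2:], "git")
	fmt.Println("explicit stdlib lookup:", where)
}
```

In this model a bare git resolves to the dependency, while the explicit stdlib lookup still finds the stdlib entry. That only holds as long as the stdlib scope itself is intact, which is the part the rest of the thread disputes.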

jaunty echo Apr 23, 2025, 10:15 PM

#

If we allow modules to replace core functions then the proper thing to do is remove git from .stdlib in this situation. Otherwise .stdlib | git will be the same as .deps | git. Which is more confusing?

fresh jackal Apr 23, 2025, 10:19 PM

#

jaunty echo If we allow modules to replace core functions then the proper thing to do is rem...

You're talking about a stopgap, with the assumption that actually getting the correct behavior is technically not possible, right?

#

Because for me the correct behavior (perhaps not possible) is for .stdlib | git to return the git from the stdlib

jaunty echo Apr 23, 2025, 10:24 PM

#

fresh jackal Because for me the correct behavior (perhaps not possible) is for `.stdlib | git...

That is not possible, at least not as things are hooked up. As I've explained in an earlier message, if you have a module named "git", it will replace core's "git". I mean actually in the GraphQL schema, under Query, it's not about the shell and its lookup order. The problem is in the API schema itself.

I think one way it could possibly work would be to run core functions under their own view of the API, like each dependency has their own view of the schema. Not a trivial change. And it doesn't match what a module "sees". There's nothing like this currently.

fresh jackal Apr 23, 2025, 10:27 PM

#

Right, then in that case, yes the least confusing stopgap is to hide git from .stdlib when that happens

#

or perhaps better, have it return an error?

jaunty echo Apr 23, 2025, 10:32 PM

#

fresh jackal or perhaps better, have it return an error?

When introspection happens we're not necessarily "aware" a core function has been replaced. But we can tell if it returns a type that comes from a module or not. Since .stdlib is basically a hardcoded list of core function names ("git" being one of them), we can double check if a function from this list returns a module type. But you don't have access to the core function's signature. You only have the new function's signature (module constructor). Not sure what to present then. I think it's simpler to just remove it from that list.
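A minimal sketch of that filtering step, assuming introspection can tell us whether a field's return type comes from a module. The `queryField` struct, its `SourceModule` field, the shortened `stdlibNames` list, and `visibleStdlib` are all hypothetical stand-ins, not the shell's real data structures.

```go
package main

import "fmt"

// queryField is a simplified view of an introspected field on Query.
type queryField struct {
	Name         string
	ReturnType   string
	SourceModule string // empty when the return type comes from core (assumed convention)
}

// stdlibNames stands in for the hardcoded list; the real list is longer.
var stdlibNames = []string{"container", "git", "cacheVolume"}

// visibleStdlib keeps the names whose current field still returns a core type
// and separately reports the names that a module has replaced.
func visibleStdlib(fields map[string]queryField) (visible, shadowed []string) {
	for _, name := range stdlibNames {
		f, ok := fields[name]
		if !ok {
			continue
		}
		if f.SourceModule != "" {
			shadowed = append(shadowed, name)
			continue
		}
		visible = append(visible, name)
	}
	return visible, shadowed
}

func main() {
	fields := map[string]queryField{
		"container":   {Name: "container", ReturnType: "Container"},
		"git":         {Name: "git", ReturnType: "Git", SourceModule: "git"},
		"cacheVolume": {Name: "cacheVolume", ReturnType: "CacheVolume"},
	}

	visible, shadowed := visibleStdlib(fields)
	fmt.Println("stdlib:", visible)
	for _, name := range shadowed {
		fmt.Printf("notice: %s is hidden because a module replaced it\n", name)
	}
}
```

Returning the shadowed names separately leaves room for either option: drop them silently, or print a notice after the list.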

fresh jackal Apr 23, 2025, 10:32 PM

#

makes sense, thanks for finding the best tradeoff

jaunty echo Apr 23, 2025, 10:37 PM

#

Maybe .stdlib could show a notice or something after the list of functions saying that git has been obfuscated by a module. Sort of like dagger functions when you have a function that has required arguments with unsupported flags. We show a notice so users don't get confused why their function isn't showing.
