#`.stdlib | .help git`
1 messages ยท Page 1 of 1 (latest)
Oh yeah, I see. So basically, when serving dependencies, if the dependency has the same name as something that's already in Query it will override it. In this case it's in the hardcoded subset for .stdlib that's why it shows in there.
Then, if we take a step back, and you have a module with a dependency named "git", your dag.Git will return Git from the dependency rather than GitRepository from core. Shouldn't that be disallowed?
it either feels like something that should error for the person who did the naming OR something we should allow because it enhances programmability (like you could ostensibly wrap the core here and shadow, thus extending the core objects... tbh just describing that makes me pretty certain that it'll blow up due to type incompatibilities)
Yeah, and it also wouldn't work since you don't get both. Actually, it's not just dependencies. A module named Git would itself obfuscate core git.
erroring for the person who did the naming would imply that we wanna reserve all the names from the core API... isn't there some prior art for reserved keywords? where's that code at?
I think it's just a matter of looking into current typedefs before always replacing, when a module is loaded.
that's gonna make it error not for the person doing the naming, but for the person doing the installing, right?
(that does seem easier to implement, for sure, and i'm cool doing it that way if we're both cool with that tradeoff of who hits the error)
Should also happen for the person doing the naming, if said person tries to load it.
yeah, but only in shell, right?
No, I'd enforce it always.
ah, so enforce it on server-side Install
Server-side load.
yeah
Serve
sorry words lmao, my mental model of all this stuff oscillates from clear to fuzzy depending on the week
๐
cool i think i can move forward with that. i should expect some broken tests with the new desired behavior, right?
Yes, it implies our version module would have to be renamed, and respective tests should now expect an error.
plus there are these test modules like git that should error
Maybe ask for a quick consensus on this behavior?
yeah can do
Yeah, a Git module could be renamed to GitTest or something. But if the test is for shadowing behavior it can be replaced by a general one on naming conflict error.
@lucid violet @halcyon urchin @golden vessel @fresh jackal looking for a quick consensus - any concerns about erroring in module.Serve if a module name tries to shadow the core API names? essentially we'd be disallowing modules named git, version, cacheVolume, container, etc so that we can present an api to LLMs and shell users that feels more similar to SDK code with direct access to module constructors
PR here since it's lost in the thread chaining, but it doesn't yet have the error code i'm intending to add
forging onwards with this, assuming lack of engagement means "it's probably fine"
makes sense to me ๐
what's the goal of this?
if it's eliminating shadowing entirely, don't forget the shell also allows functions of the current object to shadow dependencies of the current module. So you will still have some shadowing
intent right now is to specifically eliminating shadowing of core functions by module names. the end goal, though, is towards SDK/shell/prompt parity -- making shell and prompt offer the same "flat" accesses to dep modules that you can get in an sdk with dag.ModuleName(). We already have this mostly working for shell since 9992, but that client-side fix interacts poorly with the Env install hooks, so i'm fixing it by moving the same logic server side
it's quite a long chain of if-this-then-that lol
but your question does raise a corrolary, like we don't actually strictly need to stop the shadowing for the goal to be accomplished, an alternative would be to adapt the tests to acknowledge that deps can override core impls, it just starts looking real crazy in some places like .stdlib | .help git returning module-defined help text
there's probably also a way to have the client be aware of this shadowing, analogous to core.client.defaultDeps
When I said "shell parity" I didn't specifically mean "everything should be flat like in the SDKs"
I don't understand why .stdlib | .help git would return anything other than the doc for stdlib git - that's literally what it's for
If it returns something else, IMO it's a bug
yeah, agreed, this thread is an effort to not ship that bug
Can't we just fix the bug? The change you propose will break an unknown number of modules on daggerverse, and it feels like the beginning of a game of whack-a-mole
not sure if it's feasible or not, in my head fixing it would be a matter of continuing to let modules shadow core but keeping around a pre-module-load representation of the typedefs for .stdlib? @jaunty echo does that sound feasible to you?
Oh I see, it's the recent addition of pre-loading all modules that broke it?
yeah, at least once we move that pre-loading of deps server-side
the counter-argument, anyways, is that the unknown number of modules we're gonna break on daggerverse are already doing something bad that we don't want them to do: hiding core Query functionality from their importers
speculatively, you might even be able to do some really nasty stuff, like create a module called SetSecret with a constructor that takes name, plaintext and exfiltrates the plaintext
https://github.com/dagger/dagger/pull/10118/commits/5ba0f0170025f240cdca6ef6824fb09c3c412408 is roughly the "refuse to serve core-shadowy-modules" change
We specifically designed the shell to allow shadowing. There's a whole UX built around it. We can always discuss changing it (like anything else) but we shouldn't roll back parts of it without thinking through the end-to-end ux implications, as a kneejerk reaction to a regression.
It didn't occur to me that there could be a security issue with shadowing. I guess that would be a good motivation to remove it. But let's start with an issue focusing on that, then.
We can't yolo this change and tell module devs "you were doing something bad"
The potential pain is exacerbated by another recent module breakage (can't call your types Env anymore)
regardless of the shell, how was this not an issue in SDKs?
i'll play around with it i guess... i do wanna be clear that the yoloing here is directly in service of the dagger install $module; dagger shell --prompt "ask LLM to use $module" experience, not apropos of nothing
It's not about the shell, it's deeper than that. If you have a module called "git", then you don't have core's git anymore because Query { git(x): GitRepository! } becomes Query: git(y): Git!. Same with a module that depends on another called "git". I don't think that should be allowed. If there's already a Query.foo in the schema, it shouldn't be allowed to be overridden like that.
Try this (dagger init --sdk=go git):
package main
import (
"dagger/git/internal/dagger"
)
type Git struct{}
func (m *Git) Test() *dagger.Directory {
return dag.Git("github.com/dagger/dagger").Head().Tree()
}
Without self-bindings codegen won't replace dag.Git so the IDE won't see any problem here, but the API schema has changed:
$ dagger call test
โ git: Git! 0.0s
โ .test: Directory! 1.1s
! marshal: json: error calling MarshalJSON for type *dagger.Directory: returned error 422: {"data":null,"errors":[{"message":"Cannot query field \"head\" on
! type \"Git\".","locations":[{"line":1,"column":43}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}},{"message":"Unknown argument \"url\" on field
! \"Query.git\".","locations":[{"line":1,"column":7}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}]}
By throwing an error, the author would be forced to rename the module to avoid the conflict.
Note that you could have a module "my-git" as a dependency, aliased to just "git". In that case the shadowing logic in shell would apply correctly. The problem is when the schema gets replaced. EDIT: actually the alias becomes the new name even when expanding the schema so no cookie there.
@vocal python when I said "load" and not "serve" I meant even dagger init --sdk=... git should fail. Wherever the schema gets replaced. We already do that check for types:
$ dagger init --sdk=go git-repository
$ dagger functions -m git-repository
Error: input: failed to get schema: failed to get schema for module "git-repository": type "GitRepository" is already defined by module "daggercore"
is there a "load" codepath server-side that both init and serve share? ModDeps.lazilyLoadSchema?
I don't think it needs to be in serve. If load fails, it won't be usable anywhere.
Was thinking more in line with asModule.
we're on the same page, i'm just trying to figure out where the actual code path is to get that consistent error
@halcyon urchin, wdyt?
also this is a fantastic example, and makes it even less likely that we're breaking modules that might do this accidentally (bc they're already broken)
Could be in Query.moduleSource
the supply chain vulnerability is also real in shell right now, although i could only get it working for git and not so much for CacheVolume
cat ../test-shadow/main.go
package main
import (
"dagger/git/internal/dagger"
)
type Git struct{}
func New(url string) *Git {
// exfiltrate
return &Git{}
}
// Returns a container that echoes whatever string argument is provided
func (m *Git) ContainerEcho(stringArg string) *dagger.Container {
return dag.Container().From("alpine:latest").WithExec([]string{"echo", stringArg})
}
dagger install ../test-shadow
dagger -c "git google.com | container-echo woopsie | stdout"
โ connect 0.2s
โ load module 0.4s
โ serving dependency modules 0.0s
โ load module 0.1s
โ git(url: "google.com"): Git! 0.2s
โ .containerEcho(stringArg: "woopsie"): Container! 1.2s
โ .stdout: String! 0.1s
woopsie
How about core/module.go -> func (mod *Module) Install? That's when a module's types are added to the schema, I think. And that's where it validates that your module has a type that conflicts with an existing type. We just need another validation to also check if the constructor doesn't conflict with a field under Query.
Even better, core/object.go -> func (obj *ModuleObj) installConstructor(...)
yeah i'll try this first, nice and specific
There's a dag.Root().ObjectType().Extend(...). That's where the constructor is added to Query. Extend is a generic function for all object types, but it doesn't return an error.
Every other use of Extend seems to depend on a type's name which can't already be overridden. installConstructor is the only place where there may be a conflict now, afaict.
feels like it, yeah- this is why git works for the vuln demonstration and cacheVolume doesnt - git is unique in that the returned type GitRepository has a different name than the Query constructor fn Git
i'm tripping over a somewhat opaque error that seem to pop up in a bunch of different places, inluding sdk lints 
failed to get configured module: input:2: moduleSource Cannot query field "__schemaVersion" on type "Query".
i un-shadowed our dagger/dagger/version module already, too, thinking that that might be related, but it seems like it's not
calling this a supply chain attack is a major stretch
please open an issue so others can chime in
In the past I've had modules that worked fine, until they broke because the core API has expanded and I now had to rename it. Or maybe it's the name mangling that has changed. Or a regression in how a SDK handles name conflicts. In every case it's a MAJOR pain in the butt when your modules work and then don't.
Please don't take my feedback lightly. Ask other people who make modules and try to keep them working.
That can already happen if you have a module named "foo" and we decide to add "Foo" to the core API:
type Query {
foo: Foo!
}
From then on you'll get an error if you try to use the module due to the type conflict. It's the constructor that doesn't have the same protection atm, so let's say we add this to core instead:
type Query {
foo: FooBar!
}
Then your module will still work, but won't be able to use core's dag.Foo(). Same for a different module that uses "foo" as a dependency.
I think it's great if we can keep modules working, but maybe through some other solution around namespacing rather than continue to allow this replacement of fields under Query, which leads to confusing behavior.
Isn't why we had the . prefix in the past? Now we're sharing the same namespace for modules and core types, I agree with the security risk (ambiguous API at the very least).
im not aware of the history here, are we implying . would allow ppl to shadow dag.Git() with their own thing but call .git in shell to continue to access the underlying core impl?
the namespacing feels like it's probably pretty straightforward to handle in shell because we have .deps in shell to just "fix the bug" and make sure the global namespace git stays pointed to the stdlib while .deps | git would point to the module.
but in SDKs, no such api exists today afaict?
In the very first shell prototype, core API was in the same namespace as builtins, eg. .git, .container etc. But it felt weird to testers so we dropped it.
And the namespacing was handled by stealing the familiar concept of search path from unix shell, and adapting it.
- Well-defined sequence when resolving a name (functions -> deps -> stdlib)
- Builtins for explicit scoping (
.stdlib,.deps. We should consider.scopeto group it and make it easier to find and configure)
If we allow modules to replace core functions then the proper thing to do is remove git from .stdlib in this situation. Otherwise .stdlib | git will be the same as .deps | git. Which is more confusing?
You're talking about stopgap with the assumption that actually getting the correct behavior is technically not possible, right?
Because for me the correct behavior (perhaps not possible) is for .stdlib | git to return the git from the stdlib
That is not possible, at least not as things are hooked up. As I've explained in #1362525848673976531 message, if you have a module named "git", it will replace core's "git". I mean actually in the GraphQL schema, under Query, it's not about the shell and its lookup order. The problem is in the API schema itself.
I think there's one way it could possibly work would be to run core functions under their own view of the API, like each dependency has their own view of the schema. Not a trivial change. And it doesn't match what a module "sees". There's nothing like this currently.
Right, then in that case, yes the least confusing stopgap is to hide git from .stdlib when that happens
or perhaps better, have it return an error?
When introspection happens we're not necessarily "aware" a core function has been replaced. But we can tell if it returns a type that comes from a module or not. Since .stdlib is basically a hardcoded list of core function names ("git" being one of them), we can double check if a function from this list returns a module type. But you don't have access to the core function's signature. You only have the new function's signature (module constructor). Not sure what to present then. I think it's simpler to just remove it from that list.
makes sense, thanks for finding the best tradeoff
Maybe .stdlib could show a notice or something after the list of functions saying that git has been obfuscated by a module. Sort of like dagger functions when you have a function that has required arguments with unsupported flags. We show a notice so users don't get confused why their function isn't showing.