#`sdk.config` design discussion

1 messages ยท Page 1 of 1 (latest)

hidden creek
#

the reason to do it this way is that we need some way of marking the module this way - other config mechanism are kind of lacking.

  • extending SYSTEMENV from https://github.com/dagger/dagger/pull/7255 means that users need to run special engines to run special modules - this just makes it harder to manage and deploy engines, and makes all of the stuff we're trying to solve with more ephemeral engines and automatic provisioning way harder.
  • pulling automagically from the environment sucks, and means that a caller needs to configure their environment in a specific way to call a module - it breaks some of the ease of just doing -m path/to/module which hurts ability to load modules in cloud, or in shell, etc.

this mechanism is not for configuring secrets, or automatically pulling data from the caller environment - it's a simple key-value map of env vars, that are hardcoded for building+running that module. e.g. you could set linker flags for the go compiler, or enable specific go features (that need env vars set, e.g. GOEXPERIMENT)

#

but, @fossil wigeon points out that with @dawn torrent's work in https://github.com/dagger/dagger/pull/9504, there's a .env file. I think the purposes are a bit different - this file is for caller config, while the sdk.env key is for module author config

#

because this pr adds a new user-facing feature to dagger.json, and boxes us in a little bit, i just want to make sure this sounds reasonable to most folks before we go and do it.

#

cc @wraith summit @jaunty holly @rustic spear

#

๐Ÿค” actually, we can unblock the release for today, if we just revert the env key from our original pr (and then add it in again)
since folks are at civo navigate today

#

@wraith summit does that sound alright to you? i can throw that together

wraith summit
#

Yes, that sounds good to me

#

Thanks Justin

jaunty holly
#

@hidden creek seems to me that the purpose of both files are different indeed and this new env config in the SDK only works for the go runtime for the moment

#

I don't see any strong indications to revert it IMO

hidden creek
#

makes sense to avoid continuing to special case go as much as possible

fossil wigeon
#

The reason to be careful here is not just about sdk.env vs .env. Someone may have something else in mind that's a better idea and then it's harder to take back.

hidden creek
#

if it's part of the module config, then it makes sense to have it in dagger.json imo, instead of splitting it out further into more files

#

but, yes, that's why i think we should discuss more (since this would be an annoying decision to undo)

fossil wigeon
#

Yeah, it's the taking it back that's the issue. But ๐Ÿ‘ from me as a module authoring config, in contrast with possible future caller specific configs.

jaunty holly
wraith summit
#

My 2 cents would be: things like secrets should come from .env file as they are caller specific and we dont want to checkin those. But things like GOPRIVATE should come from a config that is checkedin with the module code,

hidden creek
hidden creek
#

but no harm done, just glad we caught this and didn't maybe steer us into a corner that was hard to get out of

hidden creek
dawn torrent
#

what's sdk.env?

hidden creek
# dawn torrent what's `sdk.env`?

we updated the sdk field in dagger.json to be a map (instead of a direct string), so that we can apply pinning (separately, still tbd).

but while there, we added an env field to that sdk map, which allows the module author to modify the environment variables that the module is built and runs in.

dawn torrent
#

ah cool. Sdk configuration, much needed! Is it literally passed as env variables in the sdk's runtime?

hidden creek
#

very literally, if you put "VARNAME": "VALUE", you get VARNAME equal to VALUE. no interpolation, no host interaction, etc

#

pretty much just stiched onto a runtime env

dawn torrent
#

will people use it to pass secrets occasionally?

#

mmm wouldn't it be inconvenient to mix config you shouldn't check into git probably, inside a dagger.json that otherwise should be checked in?

hidden creek
#

In this case, we actually do want to check this config in - like GOPRIVATE, otherwise every user who calls the module will have to add it to their own files (which makes it harder to just use)

hidden creek
hidden creek
dawn torrent
#

seems more like a SDK-specific config field to me.

#

with a sdk-specific schema

#

subtle difference but makes a difference for the UX I think

hidden creek
#

Yeah, was hoping to make a bit more generic, 1. to avoid this and make sdk drift less likely (growing problem as we adopt more SDKs, which would otherwise need to implement this kind of behavior themselves every time), and 2. it's useful as a feature in its own right - we can use it to enable certain build settings for go, or to allow more easily configuring a local module without diving into the code

#

If we do SDK specific, we'll be reimplementing private dependencies for almost every single one of them, with config schema changes that it'll be hard to keep in sync with each other

dawn torrent
#

they don't need to be in sync

#

every sdk will be configured differently

#

users will not pass GOPRIVATE to the Java SDK

#

so there will be an implicit schema anyway. No downside to making it explicit.

#

and lots of upside

hidden creek
dawn torrent
#

like being able to list available options...

hidden creek
#

i think even if we had the explicit options, being able to configure the runtime env of the sdk is important

#

see the hack above, users have asked for this before

dawn torrent
hidden creek
#

๐Ÿค” not sure i understand

#

we could put it top-level in env if that's the objection

jaunty holly
hidden creek
rustic spear
#

Allowing any env to be set does open up a pandora's box in terms of which values are expected to be supported by the SDK and which will just break. Like an env var that would just break the go sdk would be GOROOT or something.

That's okay if we are okay saying "you are on your own for whether this env var you're setting will actually always work". But if not then I feel like we'd have to limit the settings to those the SDK officially supports (and has tests for, etc.)

#

It's just a classic "super powerful footgun" vs "limited but safe" problem

#

I'd personally probably air on the side of making the settings specific to SDKs and requiring they officially support it. I just don't want to users to come to us asking why setting OBSCURE_PYTHON_ENV_SETTING_WE_DIDNT_ANTICIPATE in a 5 level deep dependency of their module is causing their module builds to occasionally fail, or something like that

hidden creek
rustic spear
#

do you mean users running go build in a withExec?

hidden creek
#

i kind of mean more generally, like a user running go build themselves

#

i don't know if we need to handhold developers here

rustic spear
# hidden creek i don't know if we need to handhold developers here

I personally view the SDKs as a degree of hand-holding in that we want them to "just work" and are something we officially support. It's just harder to make guarantees around "officially support" when we also allow any user of any sdk to configure it in almost arbitrary ways.

#

Like just imagining the docs for this feature it's either going to be:

  1. Here's valid settings that are supported for SDK xyz
  2. You can set any env var you want on SDK xyz, but no guarantees of what actually works (both now and in the future). Here's some env vars we currently have tested working: ...
#

I do totally agree that 1. creates more overhead for SDK developers and users who need to wait for support for a given SDK to add support for any missing configuration

#

Maybe there's a compromise solution:

  • Make SDK-specific configuration
  • But an SDK is free to add specific configuration for "arbitrary env vars", if the SDK developer is willing to support that
hidden creek
# rustic spear I do totally agree that `1.` creates more overhead for SDK developers and users ...

i think the amount of overhead is really high actually - it means that every single env var we might want to add, we should go through a wider design thing (since it's a user-facing change). under the current work, we've kind of got the groundwork laid for private dependencies for most sdks. if we restrict this to just controlling GOPRIVATE, then we have to go and do this work for each single sdk

#

and while today it's just private deps, i don't neccessarily think it's gonna stop with that - i think we'll want to continue to allow further configuration of sdk runtime envs, which means we'll need to do it for all the sdks again

fossil wigeon
#

In Python sdk.env doesn't fix private dependencies. It just needs secrets for that. I suspect TS is the same.

hidden creek
hidden creek
fossil wigeon
#

I don't know if there's equivalents to the GOPRIVATE thing.

fossil wigeon
rustic spear
dawn torrent
#

I have to go on stage in a few minutes ๐Ÿ˜„ But there's a chance to discuss this further before it merges yeah?

hidden creek
#

i mean, i feel a bit outvoted here, so happy to move to a sdk-specific config if it feels like that's the way to go? but it feels like that's just solving the immediate problem, i can't shake the feeling that we'll eventually end up exposing this flexibility at some point when someone wants it

dawn torrent
#

feeling like this @hidden creek ? ๐Ÿ˜›

hidden creek
rustic spear
hidden creek
#

if everyone is doing it, we should centralize it ๐Ÿ˜›

hidden creek
jaunty holly
hidden creek
#

worth emphasising that it's only at the module level - you're only affecting yourself as the module author - everything else in your dep tree is unaffected

#

you can't configure your deps runtime env

rustic spear
hidden creek
#

mm, but if you can't build the module, then it won't get published to daggerverse - you could try making a dependency with bad code as well

rustic spear
dawn torrent
#

To be clear I'm proposing that each SDK has a way of specifying its own configuration on the fly. So the Go SDK can map the half-dozen or so Go env variables (or flags) that it wants to make configurable, and expose that as SDK config

fossil wigeon
#

For Python I'd want both sdk specific settings (to move current ones from pyproject.toml to dagger.json), but also a generic envs to add add to the runtime container as that can be useful to configure uv (the tool supports configuring via env vars). To join both features, I'm willing to add the envs field to the sdk specific config.

dawn torrent
#

It's strictly better from a SDK DX and module dev DX point of view. It's just more work for us to implement.

#

Also if there is any chance of secrets being in there, having it all in dagger.json will be a problem.. Not sure what to do about that

hidden creek
#

for secrets, i'd kind of like to have secret mappings next to an envs field - then you could use a user-defined .env file to load them, or connect them to a secret provider with a cli flag, etc. but the module would define which secrets it can accept into it's environment

jaunty holly
hidden creek
#

is there a reason an sdk wouldn't implement the env? would we want to not do this for our core sdks?

dawn torrent
jaunty holly
rustic spear
dawn torrent
dawn torrent
#

I'm skeptical that a Go/Python/Typescript SDK is so complex to configure that you can't define a config schema in advance for it

hidden creek
#

ah, i rest my case ๐Ÿ˜› i'm happy with just manually adding the config settings somewhere in dagger.json

hidden creek
#

let's just pick an option as to where it goes so we can try and unblock shipping private go deps ๐Ÿ˜›

#

honestly, anything that isn't adding SYSTEMENV_GOPRIVATE sounds fine to me

rustic spear
fossil wigeon
hidden creek
rustic spear
hidden creek
#

i'd rather that go into the new engine.json if we can though

rustic spear
hidden creek
#

having to manually spawn and configure an engine container kinda sucks from a usability perspective, and also really does not gel very well with any future stabilizing/drivers work we're planning soon

#

cool awesome, i can take moving that at some point ๐Ÿ˜„

rustic spear
#

that's why system env is implemented as an API call rather than literal in-line os.Getenv

hidden creek
#

i imagine we'd have sdk.config as an object in dagger.json - that whole object gets passed to SDK.Runtime, and it can choose what to do with it. we'd add GOPRIVATE initially.

hidden creek
jaunty holly
#

or we see this Systemenv <> dagger.json translation different?

hidden creek
#

allowing you to configure your deps in such an invasive way kinda breaks a lot of our dependency model, we go so far out of our way to isolate them

jaunty holly
hidden creek
#

well, yeah ๐Ÿ˜›
what's safe for the dagger infra person to configure (like proxying settings) isn't safe for the module author to configure (since they don't know where it's running)

#

and vice versa

#

but we can't just let the infra person arbitrarily configure everything in the env for the same arguments we don't want the module author to - except now, you've made it possible to write a module that only runs on a specific engine

#

which is the whole thing we'd want to avoid

jaunty holly
#

I'd assume if two configs overlap, the module one should always win for that module runtime container?

hidden creek
#

i don't think they'd end up overlapping

#

engine sdk config would control things like proxies/mirrors/other global settings (maybe debug? telemetry for the security concious?)

#

but the module author shouldn't be concerned with such things

rustic spear
#

yeah GOPROXY should never be allowed in the module's dagger.json, the go sdk should actually error out if it's set there I think

jaunty holly
#

so yeah.. seems like we need that global SDK engine config ASAP @hidden creek

rustic spear
hidden creek
#

just like you'd mark a GOPRIVATE right?

fossil wigeon
jaunty holly
#

FWIW the Go runtime is not forkable right?? since it's kind of "special"?

fossil wigeon
hidden creek
#

i think there's a difference here though:

  • mirrors/proxies shouldn't affect the runtime of a module ideally - it should still run the same, even when configured differently. you'd use these when in a super locked down environment, where you need to pull from certain places.
  • sdk config in dagger.json can include details (not secrets!) for private registries/repositories/etc - if those deps are required to allow it to run.
hidden creek
#

and is kind of pre-req for doing anything else that we're talking about

#

it also feels low-risk, since we'd initially just be adding goprivate config for the go sdk.

fossil wigeon
hidden creek
jaunty holly
#

since the first "blocker" people run into when running dagger in corp environment is: "how do I configure my internal index"?

#

and as we're discussing, it can be done in two ways: at the engine or module level

hidden creek
#

to summarize kind of my thoughts on the separation: there are 3 places to put config:

  • the engine (today via engine.toml, engine.json, mounting certs into the engine, setting env vars in the engine -> we should consolidate all this to engine.json one day)
  • the client (loading .git/config, and pulling git credentials)
  • the module

i think generally - we should aim to move as much config as possible out of the engine. the more that's in the engine, the harder it becomes to load modules - i think we need to avoid (as much as is possible) needing to have a pet engine to load a specific module. if you need that, then it becomes so much harder to host your compute, to run your daggerverse, to load your cloud modules, to interact with shell etc.

#

so if there's a private dependency, we should do whatever we can to make sure that people end up configuring how that works through the module / the client / some dance between the two

#

because otherwise, we end up with lots of pet engines, and it's so much harder to get your whole org using dagger

jaunty holly
#

so in that example, configuring the index at the module's pyproject.toml should be ok

hidden creek
#

i read that as allowing the equivalent of a GOPROXY. you shouldn't be able to do that, since ideally that shouldn't have an affect on building the module - it's an infra concern.

fossil wigeon
#

For context, the config for using a custom base image for the runtime container was added for two reasons: 1) affect the installation phase with needed settings, however it still lacks support for securely loading secrets which is where the biggest need for customization is; 2) adding extra OS packages. We can solve some things with a custom sdk-specific config, both on the module and the engine, but still need a proper secrets solution.

rustic spear
#

My understanding was that we should source things like GOPROXY from some sort of engine config, but today the only "engine config" available for it is "env var on the engine container". But with the stuff Justin has done more recently and is planning that would expand to:

  1. engine config file in the engine container
  2. runtime engine config sourced from the client
  3. etc.
#

so still not in the module configuration, but in a place that doesn't have to resort to provisioning special engines

jaunty holly
jaunty holly
fossil wigeon
jaunty holly
#

so @hidden creek we're replacing SYSTEMENV with sdk.config in engine.json for now to unblock private go modules?

hidden creek
#

No, to unblock private modules we're adding a GOPRIVATE field in sdk.config in dagger.json

fossil wigeon
hidden creek
#

Credentials are gonna come from the gitconfig - since the go ecosystem generally uses ssh for these

#

We don't need to propagate creds, we just need the gitconfig file

jaunty holly
#

either SSH_AUTH_SOCK or .gitconfig

fossil wigeon
hidden creek
#

Yee, if they can pull from git ssh

#

I still think we'll need explicit pat support eventually

#

But like we did this in two stages for private modules, I think we can do that here

fossil wigeon
hidden creek
#

Details I cannot remember, but see rajats wip pr

#

(on mobile now, sorry)

jaunty holly
fossil wigeon
#

I'm asking because I want to make sure all SDKs can pull private dependencies via git, not just Go.

jaunty holly
fossil wigeon
fossil wigeon
#

Ah, but that's only the Go SDK

jaunty holly
#

initially yes

hidden creek
#

Should be easy to extend

#

We can do that before merging even if it's a hard requirement

fossil wigeon
#

I mean, I'd say why only Go? All SDKs can benefit from having git auth.

#

Otherwise I need to keep giving workarounds.

jaunty holly
fossil wigeon
#

For private repos. In Python, the most common source for private repos is also git.

#

Not everyone deploys a private index as it requires infrastructure.

jaunty holly
#

๐Ÿ‘

#

SGTM, we can add support for python and ts as well

#

I don't think it should be a huge amount of work

#

specially if we configure the global git config file

fossil wigeon
#

Can it be general, or does every SDK need added code to support it?

hidden creek
#

I think it should be possible to move to the other side of the sdk interface

#

From where it is now

#

(i.e. to the caller side)

#

Or we extract the logic and hook it into moduleSdk as well

fossil wigeon
#

@wraith summit, up for it? ๐Ÿ™‚

wraith summit
#

SGTM (though i need to reread this thread when i am actually awake)

fossil wigeon
hidden creek
#

(and also GOPRIVATE in the dagger.json config)

wraith summit
#

Hi Justin, just thinking out loud as I am implementing sdk specific config.

#

given that our sdk is an interface core.SDK, and any module that implements that interface is essentially an SDK. it means the config can be of any shape depending on the sdk.

hidden creek
#

yes!

#

it's a map

wraith summit
#

this means we need to defer the marshalling/unmarshalling of dagger.json

hidden creek
#

i was thinking about this lol yeah

wraith summit
#

i was thinking the sdk.config will be json.RawMessage and then defer the processing of this config until we know the SDK.

#

and then possibly add another function to sdk interface Configure(ctx context.Context, rawConfig []byte) error

#

which will then take care of validating the shape of that sdk as per the sdk specific config?

#

this seems doable in theory as I think about it, but also thought to run through you incase you see any obvious flaws with this approach

#

OR. maybe create that config.json file as part of exporting moduleSource dir. and then the sdk can load it on the fly during codegen and moduleRuntime function call.

wraith summit
#

hmm, it works with deferred parsing for gosdk ๐ŸŽ‰ . trying for typescript sdk now.

#

still need to think about how to make this part of api ๐Ÿค”

wraith summit
#

Hi @fossil wigeon @hidden creek i am slightly confused by how to pass SDK config for other structs. right now it seems all sdk implement some way of reading user supplied config (e.g. python has Discovery.Config, typescript has analyzeModuleConfig)

#

I was thinking to create a new config file, which will be read originally from dagger.json, and then read that when we call codegen/runtime functions inside the moduleSDK

#

right now the flow seems:

dagger call -> read sdk config -> call sdk.codegen via graphapi -> which creates the container for running codegen and also run codegen (and similarly for runtime)

#

if we have to inject our customizations for codegen and runtime, it should be before (or part-of) the sdk.codegen function

#

my first question is, do we plan to stop using those workarounds once we implement sdk.config feature?

#

my 2nd question is: for customizing the image, there are two customizations we need right now (which can change later)

  • the gitconfig file that we fetch from user's host system
  • the env variables that we inject in the container that runs codegen/runtime
fossil wigeon
wraith summit
#

it seems like it has to be code in SDK

#

when calling codegen, we pass module source and introspectionJson

#

and the codegen function in SDK accept those two params

#

so we will have to change sdk runtime code to allow mounting the additional files (e.g. gitconfig now and .netrc later)

#

i was thinking it could be additional apis on container (but that also seems wrong). or maybe new functions in sdk interface

hidden creek
#

instead of raw bytes

#

but otherwise - sounds good to me, I think having an explicit Configure method sounds good (then it can return itself)

#

we could even put it in the constructor actually?

fossil wigeon
#

so we will have to change sdk runtime code to allow mounting the additional files (e.g. gitconfig now and .netrc later)
You could also mount those files on the container where the SDK module runs. So the SDK then pulls from dag.CurrentModule().Source()....

#

Yeah, the custom sdk config can be seen as a file, just like git config and netrc.

#

So better if we have a general way to load these. For the latter two, a dagger.File is easiest because we just need to re-mount them in the runtime container, which could come from dag.CurrentModule().Source().

#

The custom sdk.config as a dagger.JSON in the constructor is the easiest in that case. So maybe best to not mount as a file too so we're not forced to load its contents and can just unmarshal directly.

#

It seems to me, if we had a "base" and "setBase" functions in the SDK module that the engine would call, it could mount the git config and netrc automatically.

#

Among other things like passing standard stuff like sdk.env (instead of making it part of the custom sdk.config).

#

Like call constructor โ†’ get "base" container field โ†’ mount stuff in container โ†’ withBase(altCtr).codegen(...)

hidden creek
#

yeah that sounds good

#

significantly better than special casing gitconfig and friends

wraith summit
#

ah. i totally missed the constructor part here.

#

that should solve the prob I was running into. let me try that out and report back

#

i was too focused on codegen function call.

fossil wigeon
#

The config as a dagger.JSON in the constructor also simplifies that setup, otherwise I need that mod.load(mod_source) to get the user's source files.

wraith summit
#

๐Ÿ‘

#

having it as dagger.JSON, how do we add those sdk specific configs to api though?

fossil wigeon
#

You mean the types for ux, validation, etc?

wraith summit
#

yeah

#

validation, we could possibly do in constructor itself

fossil wigeon
#

Is there a simple util that produces a json schema from a struct?

wraith summit
#

we should have something (we do that today in dagql somewhere)

fossil wigeon
#

That would have to be executed by the SDK module though, right? When would this information be needed, other than getting the json schema of dagger.json (how do you get that by the way?) or the queried by dagger config or something?

wraith summit
#

I'll implement the other part first, maybe we will get the answer during that implementation

wraith summit
#

I am working through to add gitconfig to module-sdk and then last part would be to add tests for these.

#

but if there is any feedback in the meantime, it will be very helpful

hidden creek
#

looks okay i think, one note - i don't think json.RawMessage is quite right here, I think probably map[string]json.RawMessage is better?

#

we want to validate it's a map, passing a list or single string there isn't valid

wraith summit
#

I will cross check the behavior but i think i am doing a strict check when unmarshalling

#

So it will error out if the format is not the expected one

#

But also map is good, that will fail fast

#

๐Ÿ‘๐Ÿป will make that change

wraith summit
#

~~@fossil wigeon I am working through following:

Like call constructor โ†’ get "base" container field โ†’ mount stuff in container โ†’ withBase(altCtr).codegen(...)

and this seems doable but it means we have to refactor all SDKs (which seems better to me as it will make things consistent, but also means more work and potentially a backward incompatible change).

e.g. today in Typescript SDK, we expect to read base-image, runtime (bun or node) and runtime version from package.json. we probably need to change this to use sdk.config now. and to make sure private deps work, we need to call constructor -> constructor load sdk config -> get base image -> we add gitconfig/netrc -> codegen further calls more customizations (e.g. installing private deps) -> codegen/moduleRuntime.

does that make sense?~~

ignore this for now

#

I am working through the refactoring of typescript sdk as a reference change

dawn torrent
#

maybe dumb idea, but could we have the sdk (optionally) register a withConfig function and allow any arguments schema? Just enforce function name, and return type. Then introspect that schema to validate the config field on a per-sdk basis

#

bonus: we can reuse all the "magic strings" for converting string arguments to eg. a secret, a file etc. so lots of convenience would be available to sdks and their users for free

#

I like that it takes advantage of our magic dagql plumbing

hidden creek
#

lol that's very neat. i imagine more annoying to implement though than this quick solution.

dawn torrent
#

i'm sure it's a little difficult, but for any other platform, it would be absolutely impossible, so in the end it seems worth it

hidden creek
#

that magic string parsing logic lives only in the cli (e.g. that's why it doesn't work in // +default)

#

(we should move it)

#

so that that would become possible, but it's not plumbed through that way right now

dawn torrent
#

right, I just meant it would become architecturally possible. Didn't mean to trivialize the implementation part ๐Ÿ˜

fossil wigeon
#

Yeah, put a ๐ŸŽ€ on it and ship it! ๐Ÿ™‚

dawn torrent
wraith summit
#

this is essentially what we are doing right? by passing the rawConfig in constructor. instead we want to call another function in that SDK?

#

the other things still remain the same

fossil wigeon
#

No because it's no longer a dagger.JSON you send in.

#

You'd introspect the function's arguments to figure out the schema.

#

Also solves having to report what's that part of the json schema for dagger.json, because the engine could do it on its own.

dawn torrent
#

Right and sdk doesn't have to parse json

wraith summit
#

I see. this also solves the 2nd problem (how to expose the struct fields to user) that we were discussing yesterday

fossil wigeon
#

Yes, exactly.

wraith summit
#

ok.. this solves the config problem. but the refactoring of sdk is still required to use that config right?

#

i mean the strikedthough msg i sent above regarding analyzemoduleconfig/base/codegenbase/codegen/moduleruntime functions and the order in which they are called?

fossil wigeon
#

I think we need to bridge both for a while, i.e., deprecation path. So it's like, read package.json and if it has custom configs there, copy them into the new structure from dagger.json/withConfig (if both, dagger.json has priority), and continue from there. This way you don't break right away and give a chance to upgrade. Remove at next minor bump.

#

You'd do this after the withConfig though, because you need the sources, but then overwrite only if not set.

wraith summit
#

๐Ÿ‘ - yeah that makes sense. that is what i did in my WIP refactoring of typescript sdk.

#

could you talk a little on how the introspection would be done? is there an example in our codebase where we do something similar?

wraith summit
wraith summit
#

Hi @hidden creek, would you have few mins to huddle about implementing "withConfig function".

I have a few questions around it to make sure I understood it correctly (e.g. the withSDK function currently returns *ModuleSource, do we want to add withConfig function as a chain here? or we want the withSDK to now return *SDK (or something like that) and have a withConfig function with it?).

hidden creek
#

i would rather have withSDK take a config param

#

i'm not sure i get why we need to have a separate withConfig at the api level

#

my understanding was that withConfig would be a method in the SDK runtime itself? or even part of the constructor?

#

maybe i'm completely misunderstanding the previous discussion?

hidden creek
wraith summit
#

my understanding was that withConfig would be a method in the SDK runtime itself? or even part of the constructor?

so I added this as part of constructor, and that seems to have worked for me. What I am confused about is this statement:

maybe dumb idea, but could we have the sdk (optionally) register a withConfig function and allow any arguments schema? Just enforce function name, and return type. Then introspect that schema to validate the config field on a per-sdk basis

#

i also did a poc to explicity add withConfig function in moduleSDK modules (e.g. typescript runtime), and that works as well, but I can't get my brain around Then introspect that schema to validate the config field on a per-sdk basis specifically.

hidden creek
#

so, if the constructor defines a paramater called "myPersonalConfigSetting: string", then this should be automatically filled by the engine from the contents of dagger.json's sdk.config.myPersonalConfigSetting

#

the full raw json config would never be provided to the SDK runtime

#

only the extracted parameters

wraith summit
#

ok.... could you pls give me a pointer on how that would be achieved?

hidden creek
#

๐Ÿ˜› yeah, it's kinda tricky. you need to somehow get the typedef of the sdk runtime module

#

not quite sure if it's plumbed through to where you need it

#

then you can look at the constructor type signature, and iterate through the args there to extract what you need

wraith summit
#

what is the downside of passing the raw config and let sdk parse and report error if its invalid...

hidden creek
#

there's a lot of advantages to having it be explicit here - we can have the engine validate it consistently, we can have consistent capitalization, we can document the values for it much easier, etc.

wraith summit
#

e.g.

https://github.com/dagger/dagger/commit/ff9e91bbefc52e3af3d0e2698493c9795417bc97#diff-a8b5f7f08501c3951b9991d2450b8b0a68ac764af369185bbf7044d9654ac58cR61

    // +optional
    sdkSourceDir *dagger.Directory,
    // +optional
    rawConfig dagger.JSON,
) (*TypescriptSdk, error) {
    var c Config
    if len(rawConfig) > 0 {
        decoder := json.NewDecoder(strings.NewReader(string(rawConfig)))
        decoder.DisallowUnknownFields()
        err := decoder.Decode(&c)
        if err != nil {
            return nil, fmt.Errorf("failed to marshal typescript config %s. error: %w", rawConfig, err)
        }
    }
return &TypescriptSdk{
        SDKSourceDir: sdkSourceDir,
        moduleConfig: &moduleConfig{},
        config:       &c,
    }, nil
}```
hidden creek
wraith summit
#

๐Ÿ‘

#

it may not be plumbed through, but could you point me to how it may be done for something similar already in code, I can work through to make sure it works for our case.

#

i am thinking it may be done in a similar fashion how args for constructor are passed along?

hidden creek
#

okay, it might be possible to get the typedefs from sdk.mod.Self.ObjectDefs (where sdk is a moduleSDK)

#

from there, it should be possible to get the "main" object

#

then you can follow any code that introspects Object TypeDefs

fossil wigeon
#

I think there's a benefit of using a separate withConfig instead of the constructor. The constructor has the sdkSourceDir argument which shouldn't be a config in dagger.json, and we may need others like this in the future. Decoupling ensures you're not limited later.

wraith summit
#

this was very useful chat/suggestions. thanks Justin/Helder.

hidden creek
#

yeah fair

#

decoupling is fine

wraith summit
#

there's a lot of advantages to having it be explicit here - we can have the engine validate it consistently, we can have consistent capitalization, we can document the values for it much easier, etc.

and

also means it's very nice to define defaults, optionals, etc.

@hidden creek, iterating over these comments, I am adding a withConfig function to the sdk runtime. and assuming that this function with have a few arguments (arg1: type1, arg2: type2.....), and the engine will introspect the signature of this function (with static name of withConfig), and try to call the function with Input args of arg1, arg2 (which will be fetched from dagger.json -> sdk.config in the engine).

does that sound about right?

hidden creek
#

yes!

#

sorry, i should have clarified that when we originally had this discussion

#

sdk.config design discussion

wraith summit
#

๐Ÿ‘ gotcha. thank you. just wanted to make sure I understood you correctly here

hidden creek
#

does that seem kind of reasonable to you? you should be able to get good result by converting the TypeDefs you get during arg introspection into dagql.Inputs

#

then you can DecodeInput on that to convert from the json decoded field in sdk.config

#

then you'll end up with a dagql.Type or something, which you can then feed into a srv.Select when you call withConfig

#

if that makes any amount of sense ๐Ÿ˜† that's how i'd do the plumbing

wraith summit
#

I still thought passing direct config to sdk is better as this is between engine and sdk anyways, so we can skip all this and just let sdk parse the raw config.

but i also don't understand all the internals, so I trust your judgement here.

hidden creek
#

let me know how you get on, and ping me if you have any questions ๐Ÿ˜„

#

i got very distracted yesterday, so wasn't very responsive sorry ๐Ÿ˜›

wraith summit
#

no worries. i understand you get lots of questions from a lot of ppl :), so all is good.

#

i will ping you if I get stuck.

fossil wigeon
wraith summit
#

I guess, i am still struggling a bit to understand how would we expose that schema for dagger.json? because this schema can only be evaluated only after initializing sdk as this is sdk specific. and loading dagger.json is the first thing we do.

if we are going to lazy initialize that part (by parsing the config into map[string]json.RawMessage), it means we are doing this right after we are calling the constructor of the SDK, in which case it is already too late to load the schema?

fossil wigeon
#

You don't need an instance of the main object to get the schema via withConfig, just need to introspect the module, i.e., load it.

wraith summit
#

you mean calling module.initialize for that right? (sorry, if my terminology is a bit off here). i think that is where we call the empty function to get the details about the module objects/functions etc.

fossil wigeon
#

You don't necessarily need to call it, just need to hook in after it occurs.

wraith summit
#

i see.... so what we are saying is that the schema validation can happen earlier and the initialization can happen after we have called the constructor on the module sdk?

fossil wigeon
wraith summit
#

No, I'm saying the schema validation can happen during initialization, but before calling the constructor on the module sdk.

gotcha. sorry, when i said initialization i meant calling the constructor ๐Ÿ‘

#

let me rephrase that a bit:

so validation can happen during initialization (module.initialize) ... but the actual config set function will be called after the constructor (as we need to store the config on the instantiated object).

fossil wigeon
#

Yes, it's two parts: 1) validation and 2) applying.

#

For the validation part we need to remember where it'll be needed:

  1. When loading dagger.json, to validate if sdk config is valid
  2. When exposing the JSON API schema for dagger.json (I believe there's a separate binary for that?)
  3. Not really today but we may want to expose it somehow for some clients. For example, our CLI could help you fill that config knowing its correct types, rather than you having to manually edit dagger.json. This could be reused by the JSON API schema on the previous point.
wraith summit
#

thank you so much Helder. it seems like my changes are being done in ^^ direction only. I will open a draft PR once I have this working.

fossil wigeon
wraith summit
#

๐Ÿ‘ sure, i will check that out. thanks for the headsup

wraith summit
#

I have now rebased it on top of 9505 (basically main, as it is now merged), and have things working mostly.

#

the PR has up-to-date code now, but the CI is not happy yet.

#

the PR now has following changes:

  • gitconfig attachable
  • SDK Config support in dagger.json
  • support for optional withConfig function

I will also polish and push reference implementation for typescript next.

wraith summit
#

Hi @hidden creek, there is one last issue (the famous words "last issue"), that is blocking me here. when linting with dev engine, I am getting a invalid id format error. I am suspecting it is happening when a moduleSDK loads goSDK as runtime sdk. but I am having trouble making a fix for it.

#

can you think of anything obvious (or if you can jump on a huddle with me for 10 mins, it will be really great).

wraith summit
hidden creek
#

๐Ÿ‘€ that's odd - this is in your main pr right? latest code up to date?

wraith summit
#

pr branch. to reproduce, just checkout and call dagger call -h with dev engine.

hidden creek
#

okay looking at it

#

the full error for reference:

failed to load dependencies as modules: failed to load module dependencies: select: failed to get module runtime: failed to call sdk module moduleRuntime: select: failed to get generated context directory: select: failed to generate code: failed to get modified source directory for go module sdk codegen: select: InvalidArgument: InvalidArgument: rpc error: code = InvalidArgument desc = invalid id: unsupported scheme &url.URL{Scheme:"", Opaque:"", User:(*url.Userinfo)(nil), Host:":0", Path:"", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"6i7x5vaikt0aaj01weekhv7pu", RawFragment:""}
#

what i suspect is happening is that you're making a query with an empty url object

#

something needs to be called with .String i guess

#

okay, adding this in getUnixSocketSelector fixes the bug for me, still not sure it works fully:

    if clientMetadata.SSHAuthSocketPath == "" {
        return nil, nil
    }

but we shouldn't try and mount an empty socket

#

really not sure why we don't get the error id is not set instead, but ๐Ÿค”

wraith summit
#

oh. this is useful. I will verify. thank you

wraith summit
#

I think I finally have all CI passing (except elixir, which seems like a known issue atm) and the features working (sdk config and git config). I will work on adding tests for it now

(also worried that if new changes will change how git attachable works before I could land this PR, saw some discussions around it in engine-dev channel).

hidden creek
#

aha, no, it shouldn't break that!

#

mostly my change is just confusion around how it's meant to work - maybe i need to read your PR tests to see how you handle this

#

but also, nice! i'll give this a proper lookover tomorrow โค๏ธ

hidden creek
#

okay, i've looked through ๐Ÿ‘€

#

it looks really good

#

aside from the TODOs in there which you know about, i don't think i have any comments - i left a comment about wanting a test for WithConfig

#

but aside from that.. the tests for doing GOPRIVATE which i know you're working on

#

is there anything you're blocked on now?

#

i would like to get this in during the next "merge window"

wraith summit
#

Hey Justin, thank you for reviewing the PR. yes, I am working on adding the tests as we speak (they are in different branch right now to avoid polluting the PR branch).

#

I will try to add more tests today, and maybe run into a scenario where I need your time. I will let you know.

#

but last night, I had my first test run successfully (I need to go back and double check that I didn't hack anything into the code to make that work)

wraith summit
#

Hi Justin, this PR is ready for review:

https://github.com/dagger/dagger-test-modules/pull/7 - add a go module to simulate a private dependency
https://github.com/dagger/dagger/pull/9323 - the only test failing is because of the private dep repo is not configured correctly.

GitHub

This PR adds a simple golang module to simulate the private dependencies in integration tests.

hidden creek
#

yup!

#

left some comments ๐ŸŽ‰

#

and merged the dagger-test-modules pr