content-addressed secrets footgun | Dagger | Page 1

gaunt canopy Oct 7, 2022, 4:03 PM

#

this is only an issue for content-addressed files created via withNewFile, other files (including stdout/stderr) should be fine as long as the LLB that created them used secrets appropriately

#

content-addressed secrets footgun

#

maybe we could have an InMemoryFile (or better name) that doesn't expose secret but still has an id : FileID?

gaunt canopy Oct 7, 2022, 4:32 PM

#

hm except that's not possible because withNewFile just builds on a Directory which might not be originally in-memory

opal rampart Oct 7, 2022, 5:37 PM

#

Ugh … right.

And it’s not the first big problem we encounter we secrets.

Fundamentally:

Either the secret gets leaked in plaintext (e.g. withNewFile aside, even with Exec(“vault get -o /secret”).File(“/secret”).Secret(), the secret ends plaintext on a layer and cache exported)
Or if it doesn’t leak into layers, then it won’t work if the step that produced it is cached (as Kyle pointed out in the vault demo, it worked only the first time. Once cached, the secretid mapped to nothing)

#

withNewFile fits in the “leaking bucket”, except instead of being a cache layer it’s in the instructions themselves

#

Had a couple of half baked ideas:

We design secrets like in buildkit with providers that gets JIT invoked when a secret is needed, bypassing cache. POC here: https://github.com/dagger/dagger/pull/3124
We rethink SecretIDs so they work a lot more like FileIDs. Except it’s always an in-memory file.

GitHub

[POC] Secret Providers by aluzzardi · Pull Request #3124 · dagger/d...

[NOTE: for discussion only, not meant to be merged]
This PR introduces the concept of "secret providers".
Example:
{
core {
image(ref: "alpine") {
exec(input: {
...

#

/cc @sacred magnet @autumn hawk

autumn hawk Oct 7, 2022, 5:44 PM

#

Obviously the long-term fix is to remove the "IDs are cookies" part.. So we're talking short-term remediation right?

gaunt canopy Oct 7, 2022, 6:01 PM

#

@opal rampart for the first one, some consolation is that reading from stdout (i.e. not using -o) will be fine since that'll end up in the /dagger mount which won't get exported

#

JIT secret fetching sounds most ideal to me, but there are tricky parts there, e.g. secrets may have multiple fields (aws access key + secret key) which must be fetched and used together, but passed as different env vars. so you have to avoid fetching the same secret multiple times because it might be regenerated in-between.

sacred magnet Oct 7, 2022, 6:52 PM

#

autumn hawk Obviously the long-term fix is to remove the "IDs are cookies" part.. So we're t...

Making IDs not be cookies will require persisting the actual value somewhere so it can be mapped from the ID (probably in our daggerd boltdb, at least in my most recent thinking), so then we'd be responsible for persisting user secrets. I guess the shortest reasonable path to that would be that we have built-in support for vault or similar? I'm not sure, but there's a lot of implications and design to think through I believe.

sacred magnet Oct 7, 2022, 6:53 PM

#

gaunt canopy JIT secret fetching sounds most ideal to me, but there are tricky parts there, e...

Agree that JIT fetching is probably the fastest path and is not really that bad even in the long term provided we make it easy for users to setup these secret vendor services (or whatever we want to call them)

opal rampart Oct 7, 2022, 7:31 PM

#

gaunt canopy <@707661669819613324> for the first one, some consolation is that reading from `...

I must say I’m not super familiar with output mounts, but if we can grab the stdout without running the command again (cached), doesn’t that mean it’s still committed to disk somewhere?

gaunt canopy Oct 7, 2022, 7:32 PM

#

oh, yeah, it's on disk somewhere, just pointing out it won't end up in a pushed image

#

which isn't all that helpful i guess (when would you push a vault read)

opal rampart Oct 7, 2022, 7:33 PM

#

Oh yep yep. It’ll be in the cache export though, less bad but still

gaunt canopy Oct 7, 2022, 7:34 PM

#

ah yeah true

autumn hawk Oct 8, 2022, 7:54 AM

#

What if we embraced the cookie model but split it from IDs?

change IDs from being a “recipe” to a simple hash of the recipe
a lookup table of recipes (keyed by hash) is sent back and forth as a http header. Like an actual cookie 🙂
both client and server use their shared lookup table to resolve IDs

Benefits:

IDs are short
IDs contain no plaintext secrets
No complicated persistence to implement in the engine
No complicated logic in SDKs either: send empty cookie on first request; for each subsequent gql request, send the cookie received from the previous response
Cookie header may contain plaintext secrets BUT much easier to keep secure because that header remains private to the SDK and engine. Doesn’t touch client code or extension code; people don’t copy paste it around; Biggest risk is http request dump leaking into logs in case of error
Bonus: playground could use the lookup table to make graphql stitching easier: auto-complete IDs instead of copy-pasting them? 😇

opal rampart Oct 11, 2022, 4:31 PM

#

That was the dream. @sacred magnetattempted an implementation in the past -- didn't work out in the end because we have no place to store " a lookup table of recipes (keyed by hash)" persistently

#

(when you run a cached thing, you'll only get the "hash" -- but the code that fills in hash->llb hasn't run)

opal rampart Oct 11, 2022, 4:41 PM

#

opal rampart Ugh … right. And it’s not the first big problem we encounter we secrets. Funda...

Also -- the ID leak is a a symptom of the underlying problem, not the main problem.

Loading a secret leaks plaintext secrets to layers and is exported to cache (e.g. File().Secret() --> leak). Leaking the secret in the ID is a worse shape of the same problem.
When secrets are not leaking, they're actually broken by cache. e.g. Kyle's vault extension calling AddSecret --> first time it works, second time it's called it will be cached and NOT AddSecret, returning the secret hash from cache --> broken secret

#

basically, with the current design, secrets "have" to leak in order for them to work

#content-addressed secrets footgun