#Accessing Host Env


dull vessel
#

Hello!

I'm working on a PoC to explore using Dagger to retro-fit an existing CI.

I'm currently struggling with trying to access the Host Env. The existing CI is dense with Environment Variables, and there's a lot of embedded steps.

For example, we connect to Hashicorp vault to pull secrets into the CI, and this requires 4 envvars set in CircleCI's context (vault address, role, namespace, and an OIDC token).

My hope had been to create a Module that abstracts away the Vault items so end users can focus on just writing their steps, and if they need secrets, they can import and init the module:

func (m *MyDaggerModule) Build(
  ctx context.Context,
  // +defaultPath="/"
  source *dagger.Directory,
) (string, error) {
  vClient, err := vault.NewClient(vault.ClientConfig{}, dag)

  dag.Container().From("alpine:latest").
  WithEnvVariable("GITHUB_TOKEN", vClient.Fetch("GITHUB_TOKEN")).Etc()

// etc etc

Sadly this doesn't seem to work, as the Env isn't accessible from the Dagger function.

Even running something like:

os.Environ()

Shows a limited set of Env vars.

The story is different if running "dagger run env", but I'm unclear on the difference between run and call.

It seems like the preferred method is something along the lines of defining each variable you want from the Host Env and passing it explicitly...

func (m *MyDaggerModule) Build(
  ctx context.Context,
  // +defaultPath="/"
  source *dagger.Directory,
  // vaultAddr the Address of HC Vault
  vaultAddr string,
) (string, error) {

dagger call build --vault-addr env://VAULT_ADDR

But for an existing CI with 30+ env... this is not really feasible nor scalable. It kind of shatters the dream of building out setup modules that abstract away the setup and get out of the way of end users.

I'm sure the use of Vault in EVERY CI step is likely bloated, and maybe the answer is forcing developers to be more intentional with usage.

Seeking guidance on what the best practice is here?
I had been hoping to bundle everything up into Dagger, but the lack of access to the Host Context makes this very difficult. Even a .env file isn't realistic, since I couldn't make a function to read and bundle the env without running it as privileged.

digital basin
dull vessel
#

Yes! I feel like that is very much along the lines I'm thinking.

Also the related thread:
https://github.com/dagger/dagger/issues/9584

I think is exactly the same issue: https://github.com/dagger/dagger/issues/6723

GitHub

Overview Dagger should natively support .env files. It's a de facto standard for managing environment-specific configuration in a lightweight, portable way. It is widely supported, included by ...

GitHub

Somewhat related to #6112 but a bit more general Passing a large number of flags and args when using dagger call can become way too tedious to type out by hand. One example would be support for fil...

#

I'm thinking my mistake is in trying to bootstrap Vault into Dagger.

I think the preferred path is likely to have the calling environment already have Vault set up, and then pass in secrets explicitly

dagger call build --github-token vault://my_vault_tenant_ci_github_token

but honestly, that's gross

Was really hoping to avoid any kind of bootstrapping whatsoever, so the pipelines could be SUPER portable.

A sad statement, but if I come back to my devs like "You gotta set up and authenticate the vault cli in your local env before you can run Dagger" it'll fail

I think a lot of the env-var grossness can be fixed by migrating more values out of the Env into Vault, but there's still the bootstrapping problem.

It'd be great to just do

dagger call vault-setup --vault-addr env://VAULT_ADDR

but the container nature of it means it'd just... go away 😦

digital basin
#

cc @latent finch who has thought a lot about this.

dagger call vault-setup --vault-addr env://VAULT_ADDR
what would be the ideal artifact left on the user's system after this command?

onyx agate
#

@dull vessel decoupling the secret provider from the secret consumer is precisely what makes the function portable.

#

What you are talking about is getting less portability in exchange for more convenience

latent finch
#

FWIW for Vault, you don't need to set up and authenticate the vault cli or anything like that. You just need to set the same env vars that you're currently wanting to pass in to the function. I think the default secrets idea will help here a lot to remove the need to pass lots of secret reference args. If you want to bake a vault client into your top level function instead, it's totally possible, but it breaks the true portability like solomon mentioned

dull vessel
#

Full disclosure, I think the existing pipelines I'm trying to replace are already super gross, so it's likely that any solution is going to be equally gross unless it's completely pivoted

#

That's likely a BIG part of the problem XD

latent finch
#

Makes sense! On the portability spectrum, here's how I think of it

  1. Current state: lots of variables defined in the CI platform. It means all of the specialized tooling only works if it's run in that platform
  2. Put the secrets in a proper secret vault. Now the specialized tooling can run anywhere it can connect to the secret vault from.
  3. An interface between the tooling and the secret provider. Now the tooling doesn't need to connect to the secret provider, it can connect to any provider that can give the values it needs. So now you can have multiple (prod /dev) vault instances, or developers using 1password and CI/prod using vault, etc
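
A minimal Go sketch of step 3 in the list above, assuming a hypothetical `SecretProvider` interface (not a Dagger API): the tooling asks for a secret by name, and any backend that satisfies the interface can answer. `EnvProvider`, `MapProvider`, and `Publish` are illustrative names only.

```go
package main

import (
	"fmt"
	"os"
)

// SecretProvider is a hypothetical interface: tooling asks for a secret by
// name and does not care which backend answers.
type SecretProvider interface {
	Get(name string) (string, error)
}

// EnvProvider reads secrets from the process environment (for example in CI).
type EnvProvider struct{}

func (EnvProvider) Get(name string) (string, error) {
	v, ok := os.LookupEnv(name)
	if !ok {
		return "", fmt.Errorf("secret %q not set in environment", name)
	}
	return v, nil
}

// MapProvider stands in for any other backend (Vault, 1Password, a test fixture).
type MapProvider map[string]string

func (m MapProvider) Get(name string) (string, error) {
	v, ok := m[name]
	if !ok {
		return "", fmt.Errorf("secret %q not found", name)
	}
	return v, nil
}

// Publish is the "tooling": it names the one secret it needs and accepts
// any provider. It never prints the secret itself.
func Publish(p SecretProvider) (string, error) {
	token, err := p.Get("GITHUB_TOKEN")
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("publishing with a %d-character token", len(token)), nil
}

func main() {
	// A developer machine and CI can swap providers without touching Publish.
	msg, err := Publish(MapProvider{"GITHUB_TOKEN": "dev-token"})
	fmt.Println(msg, err)
}
```

Swapping `MapProvider` for `EnvProvider{}` changes where the value comes from without changing `Publish`, which is the decoupling the list describes.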
onyx agate
#

Hopefully we can start incrementally cleaning up the grossness 🙂 Getting a quick incremental win is key. Will motivate you & the team to keep going until it's perfectly clean.

dull vessel
#

Actually, I think you're onto something here. Your comment "what would be the ideal artifact" I think hints at the answer.

Could make a Vault function that boots up Vault and spits back the artifact needed for the CLI to work. It's not intrinsic to the system then, but more... if you want to use Vault, you call Vault first. I think that's what we're looking at?

dagger call vault-init
???
dagger call build --github-token vault://GITHUB_TOKEN

Middle layer is missing

#

would be neat if the dagger call vault-init could boot a vault provider that would live off to the side, so it could then fill in for later calls that say vault://

onyx agate
#

Your object might look like:


func New(
  // +optional
  vaultAddress string,
  // +optional
  vaultRole string,
  // +optional
  vaultNamespace string,
  // +optional
  token *dagger.Secret,
  // +default="GITHUB_TOKEN"
  githubTokenPath string,
) *MyModule {
  return &MyModule{
    VaultAddress: vaultAddress,
    VaultRole: vaultRole,
    VaultNamespace: vaultNamespace,
    Token: token,
    GithubTokenPath: githubTokenPath,
  }
}

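func (m *MyModule) Build() *dagger.Container {
  // dag.Vault is assumed to be a Vault module installed as a dependency
  vault := dag.Vault(m.VaultAddress, m.VaultRole, m.VaultNamespace, m.Token)
  // assumes Get returns a *dagger.Secret
  ghToken := vault.Get(m.GithubTokenPath)
  // use your token at will, for example as a secret env var in a container
  return dag.Container().From("alpine:latest").WithSecretVariable("GITHUB_TOKEN", ghToken)
}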
dull vessel
#

My devs want to do the minimum amount of work possible 🤣

dull vessel
#

My ultimate goal would be to curate both a Step Library, and Pipeline library

90% of devs will just use the curated library.

but the other 10% could customize from there. Add in some steps if needed.

The dream is more something like:

type GoPipeline struct {
  PreTestSteps []DaggerStepInterface
  // ...
}

something like that. If they need to hook into different parts in the flow, they can. They define their custom steps, and then register them with the parent pipeline.
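
A rough sketch of the "register custom steps with the parent pipeline" idea, in plain Go with no Dagger calls. The `Step` interface stands in for the `DaggerStepInterface` named above, and `NewStep`, `RegisterPreTest`, and `Log` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Step is a hypothetical stand-in for DaggerStepInterface.
type Step interface {
	Name() string
	Run() error
}

// funcStep adapts a plain function to the Step interface.
type funcStep struct {
	name string
	fn   func() error
}

func (s funcStep) Name() string { return s.name }
func (s funcStep) Run() error   { return s.fn() }

// NewStep wraps a function as a named Step.
func NewStep(name string, fn func() error) Step {
	return funcStep{name: name, fn: fn}
}

// GoPipeline is the curated pipeline; PreTestSteps is the hook point.
type GoPipeline struct {
	PreTestSteps []Step
	Log          []string // names of stages that ran, in order
}

// RegisterPreTest adds a custom step that runs before the test stage.
func (p *GoPipeline) RegisterPreTest(s Step) {
	p.PreTestSteps = append(p.PreTestSteps, s)
}

// Run executes the registered pre-test steps, then the curated test stage.
func (p *GoPipeline) Run() error {
	for _, s := range p.PreTestSteps {
		if err := s.Run(); err != nil {
			return fmt.Errorf("pre-test step %q: %w", s.Name(), err)
		}
		p.Log = append(p.Log, s.Name())
	}
	p.Log = append(p.Log, "test") // the curated test stage would run here
	return nil
}

func main() {
	p := &GoPipeline{}
	p.RegisterPreTest(NewStep("license-check", func() error { return nil }))
	if err := p.Run(); err != nil {
		fmt.Println("pipeline failed:", err)
		return
	}
	fmt.Println("stages run:", p.Log)
}
```

The 90% case would call `Run` on an untouched pipeline; the other 10% register steps first.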

dull vessel
#

It's separation of concerns... and I think that's a big anti-pattern within the existing CI.

Existing CI uses a giant "Golfbag" Docker Image that has everything and the kitchen sink. It's awful... we literally spend 50% of our CI Credits just pulling the image every run (CircleCI doesn't let you cache runner images :( )

It literally has every script, every SDK, and a pile of pre-baked envvars.

The existing vault integration literally just dirwalks the namespace and dumps every secret into a setenv file, so the contents of the vault become env secrets.

It's convenient in a way, but also you really have no idea what is required.

I don't actually know what secrets are needed by the Snyk delta scan, because it slurps up envvars and makes reference to scripts.

So the thinking is one of functional programming? Be explicit about inputs and outputs?
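
One way to picture "explicit inputs", as a hedged sketch: each step declares the names it needs, and a check reports what is missing before anything runs, instead of the step slurping the whole environment. `StepSpec`, `Missing`, and the secret names are hypothetical and not taken from the existing CI.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// StepSpec is a hypothetical description of a CI step that names every
// input it needs up front.
type StepSpec struct {
	Name     string
	Requires []string
}

// Missing returns the declared inputs that lookup cannot supply (absent or
// empty), sorted for stable output.
func (s StepSpec) Missing(lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range s.Requires {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// The step and secret names here are illustrative assumptions.
	scan := StepSpec{Name: "snyk-delta-scan", Requires: []string{"SNYK_TOKEN", "GITHUB_TOKEN"}}
	if missing := scan.Missing(os.LookupEnv); len(missing) > 0 {
		fmt.Printf("%s cannot run, missing inputs: %v\n", scan.Name, missing)
		return
	}
	fmt.Printf("%s has everything it declared\n", scan.Name)
}
```

With a declaration like this, the question "what does the Snyk scan need?" is answered by reading `Requires` rather than by tracing scripts.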

latent finch
#

I've spent a lot of time thinking about this 🙂

So the thinking is one of functional programming? Be explicit about inputs and outputs?

I think that's definitely a big part of it. Like you pointed out, it's pretty normal today to just expose all of the secrets your pipeline might need to every tool/script and hope they all find what they're looking for.

CI platforms also expect you to copy/paste secrets into their platform, and 1) they're not a secrets provider, and 2) even if your secret doesn't leak while you're aware of it, it generally stays there long after you stopped using it and will eventually leak. Saw this case first hand

On the secrets provider side, we also have 2 subcategories of providers: products designed for humans to use like 1password, and products designed for machines like Vault. The designs and access patterns are quite different, so it's not trivial to say CI uses Vault and devs use 1password.

And further, should CI need secrets? Yes, eventually, but it can probably be broken down into workflows or pipelines that rely on some authenticated process and those that don't. And if they're broken down that way, designing access and roles for developers probably gets a bit easier.

And that's just the basics. IMO if you follow exactly how every CI platform tells you to set up their platform, you've dug a very tricky hole to crawl out of

dull vessel
#

I have always hated putting secrets in CI Platforms.
They get forgotten, you rotate a secret, and a month later realize some job somewhere broke because you killed the token it used to call home... uff.

This is why I thought I was being more clever with Vault being baked in, but I also see the point of portability.

It is a bummer though, as setting up the secret provider for the Dagger engine becomes less turnkey

latent finch
#

Yeah for sure, the main portability you're losing by baking in the Vault client is that you're probably going to be hardcoding the secret paths. If that's something that makes sense for the problem you're trying to solve, I'd say go for it! You could try to keep the abstraction alive by keeping that vault implementation a detail of the main dagger module and having the rest of the modules/functions use the dagger Secret type. Then in the future, if you do reach for more portability or our provider implementation makes more sense for you, it would be an easy change

dull vessel
#

I think my initial thought was to give the vault client a fallthrough system:

  • look for token file
  • look for OIDC token file
  • initiate OIDC Flow

The CircleCI environment has the OIDC Token, so that's what it would use

Most devs working locally won't, so it would kick off an interactive OIDC flow.

The end artifact of all these flows though is the token file, which is what Dagger's provider needs.

So it might be that the "vault setup" step becomes about generating the token file as an artifact so it can provide it to Dagger.
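
The fallthrough described above could be sketched roughly like this in Go. `TokenSource`, `FromFile`, and `Resolve` are hypothetical names, the file path is a placeholder, and the two OIDC steps are stubbed out because their real implementation depends on the Vault and CircleCI setup.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// TokenSource is one link in the chain: it returns a token, or an error if
// it cannot supply one.
type TokenSource func() (string, error)

// FromFile returns a source that reads a token from path.
func FromFile(path string) TokenSource {
	return func() (string, error) {
		b, err := os.ReadFile(path)
		if err != nil {
			return "", err
		}
		tok := strings.TrimSpace(string(b))
		if tok == "" {
			return "", fmt.Errorf("%s is empty", path)
		}
		return tok, nil
	}
}

// Resolve tries each source in order and returns the first token found.
// If none succeeds, it wraps the last error seen.
func Resolve(sources ...TokenSource) (string, error) {
	lastErr := errors.New("no token sources configured")
	for _, src := range sources {
		tok, err := src()
		if err == nil {
			return tok, nil
		}
		lastErr = err
	}
	return "", fmt.Errorf("no token found: %w", lastErr)
}

func main() {
	// Stub for the steps this sketch does not implement.
	notImplemented := func(what string) TokenSource {
		return func() (string, error) {
			return "", errors.New(what + " is not implemented in this sketch")
		}
	}
	_, err := Resolve(
		FromFile("/tmp/example-vault-token"), // placeholder path
		notImplemented("OIDC token file exchange"),
		notImplemented("interactive OIDC flow"),
	)
	if err != nil {
		fmt.Println("no token:", err)
		return
	}
	fmt.Println("token resolved")
}
```

Whichever source wins, the output is the same kind of artifact, which matches the point that every flow ends in a token file.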

onyx agate
#

i strongly recommend approaching this as 2 distinct problems:

  1. Overall complexity and reliability. Does the system become exponentially more complex and brittle as it grows, or not.

  2. Turnkey convenience. How many characters must an end user type a) for initial setup (one-time cognitive cost) and b) each loop iteration

Terms like "grossness" and "smell" and "non optimal" are ambiguous as to which you're referring to.

dull vessel
#

The ballooning was... if I continued down the path I was on, the number of arguments each step would require, if it had things like the Vault module built in the way I was thinking.

It's been a good discussion, and I think it's made me pause and take a step back to reconsider the approach

#

I think I see a better "golden path", bringing forward principles from other areas of programming (Separation of Concerns, Least Privilege, etc).

I'm sure there's issues to sort through still, but it gives me a lot to think about, and I think I understand the paradigm better.

onyx agate
#

That 👆 leaves the problem of not having to pass those 4 bootstrap arguments each time. That's a "turnkey convenience" issue, and at the moment we don't have a good solution to it - you just have to deal with it on your end, whether with a wrapper script, CI configuration etc. But we will solve it 🙂
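
As an illustration of the wrapper-script option, here is a small Go sketch that builds the `dagger call` argument list from a list of environment variable names. `FlagFor` and `CallArgs` are hypothetical helpers, and the variable names are assumed for the example. The `env://NAME` form is copied from the command quoted earlier in the thread; whether a given argument accepts it depends on the argument's type and the Dagger version in use.

```go
package main

import (
	"fmt"
	"strings"
)

// FlagFor converts an environment variable name into a kebab-case flag,
// for example VAULT_ADDR becomes --vault-addr.
func FlagFor(envName string) string {
	return "--" + strings.ToLower(strings.ReplaceAll(envName, "_", "-"))
}

// CallArgs builds the arguments for `dagger call <function>`, passing each
// named variable as an env:// reference rather than as a literal value.
func CallArgs(function string, envNames []string) []string {
	args := []string{"call", function}
	for _, name := range envNames {
		args = append(args, FlagFor(name), "env://"+name)
	}
	return args
}

func main() {
	// The four bootstrap variables; these names are assumptions.
	bootstrap := []string{"VAULT_ADDR", "VAULT_ROLE", "VAULT_NAMESPACE", "VAULT_TOKEN"}
	args := CallArgs("build", bootstrap)
	// Print the command instead of executing it, so the sketch has no side effects.
	fmt.Println("dagger " + strings.Join(args, " "))
}
```

A real wrapper would pass `args` to `exec.Command("dagger", args...)`; printing keeps the sketch safe to run anywhere.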

#

Generally we prioritize complexity & reliability first, turnkey convenience second. Because it's easier to add convenience later, but impossible to fix exponential complexity and reliability after the fact