Separation of environment code and data | Dagger | Page 1

grave oak Jun 29, 2023, 7:23 PM

#

Starting a thread for inevitable bikeshedding around this important topic 🙂

#

And copying @subtle thorn with whom I discussed this several times 🙂

subtle thorn Jun 29, 2023, 7:52 PM

#

Since you're inviting bikeshedding, happy to honor the invitation with my thoughts on the topic.

To me, it's very analogous to where Dockerfiles are: some dockerized repos will want to have their Dockerfile, while for other repos that don't or are unlikely to ever have Dockerfiles (think source code of gcc or binutils), there are dockerfile-only repos pointing to the external source code.

Note that even if you have a dockerfile you can always build the repo with another dockerfile of your choosing.

I'm not exactly sure how zenith wants to separate the source code and the environment in practice.
What I think is important is that the average user who just wants to contribute to a project and run the jobs should have as little friction as possible. So I hope they don't have to specify both the source and the environment.

grave oak Jun 29, 2023, 7:57 PM

#

adding @marsh hamlet @halcyon shard @quasi snow @final lodge @rocky dove @spark vale

#

So I hope they don't have to specify both the source and the environment.

Yes we will make sure of that

marsh hamlet Jun 29, 2023, 7:57 PM

#

agree with @subtle thorn , I think that solving for the "platform engineer" use case is important, but lots of people just want to bundle their build logic and repository together, so it should be painless for them to do that while also unlocking all of this extra great functionality. I feel like that is totally doable under this proposal, but it might just take some iteration to get right

rocky dove Jun 29, 2023, 7:59 PM

#

We were discussing having a way of configuring the defaultContext of an environment, which will let callers skip having to specify env and source. And if you set defaultContext to your own source code, you recreate the embedded use case essentially. It's just a lot more flexible now and encourages use of generic re-usable environments, which should be a killer feature
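
A minimal sketch of the fallback described above, assuming an environment carries an optional default context and that an explicitly passed source always wins. `Environment`, `default_context`, and `resolve_source` are hypothetical names for illustration, not Dagger's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Environment:
    """Hypothetical stand-in for an environment definition."""
    name: str
    # Where the environment runs by default, e.g. a git reference or a path.
    default_context: Optional[str] = None


def resolve_source(env: Environment, source: Optional[str] = None) -> str:
    """Pick the source an environment runs against.

    An explicit source takes priority; otherwise fall back to the
    environment's default context, if it declares one.
    """
    if source is not None:
        return source
    if env.default_context is not None:
        return env.default_context
    raise ValueError(
        f"environment {env.name!r} has no default context; pass a source"
    )
```

Under these assumptions, a generic environment leaves `default_context` unset and always needs a source, while one that sets it to its own repository behaves like the embedded case.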

grave oak Jun 29, 2023, 8:01 PM

#

We have discussed this point at length in person 🙂 Where we have (roughly) settled:

A resounding YES to supporting both the "advanced platform team" and "I just want an easy repeatable way to build and test my simple software" use cases
In theory we could achieve this by supporting both "embedded environment code" and "separate environment code". In practice though, we end up with only the embedded pattern, and "separate" falls by the wayside. I know this because that's the situation we're in today.
Instead we think we can support both use cases by sticking to always separating the environment code, and still making it very easy to support the "easy" use case. Just because "embed in the app repo" is familiar, doesn't mean it's the only possible pattern, or even the best.

#

One way we can avoid having to specify the env + app source each time, is with git remote mapping.

subtle thorn Jun 29, 2023, 8:06 PM

#

So if i understand correctly, your bet is that by forcing (with some inevitable friction) the env to live in a separate git repo, you'll gain a richer modular ecosystem of envs / bricks of envs? Or did I misunderstand ?

#

There's also the inevitable: how would it work in a monorepo ?

rocky dove Jun 29, 2023, 8:09 PM

#

Yes I think that's accurate, though the friction should be tiny, just a matter of configuring one field

rocky dove Jun 29, 2023, 8:11 PM

#

subtle thorn There's also the inevitable: how would it work in a monorepo ?

Should still work in that you can define all your environments there and run them against source code elsewhere in the monorepo, either using local path references or git references. And same idea of needing to set defaultContext to point to source code when it makes sense

#

Examples where you wouldn't want defaultContext would be a generic "go tools" environment, which @final lodge has been prototyping. Basically a bunch of APIs that make it easy to build/test/do general go stuff. E.g. he put in a StaticBinary helper that sets all the relevant flags and env vars for you to enable that.

Then examples where you do want defaultContext might be when you have very custom build steps that can't be made totally generic (which is obviously the case fairly often). Then you would likely point the default context to whatever source it is tuned for. But the nice thing is even when you have very specialized environments, they can still make use of and wrap other existing generic environments.
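
To make the "generic helper wrapped by a specialized environment" idea concrete, here is a rough sketch. It assumes the helper only needs to produce a `go build` command line plus environment variables. Disabling cgo with `CGO_ENABLED=0` is a common way to get a statically linked Go binary, but the flags the real StaticBinary prototype sets are not stated in this thread. `static_binary`, `build_myapp`, and the paths are hypothetical.

```python
from typing import Dict, List, Tuple


def static_binary(package: str, output: str) -> Tuple[List[str], Dict[str, str]]:
    """Generic helper: return (argv, extra env vars) for a static Go build.

    Assumption: disabling cgo is enough for the package being built.
    """
    env = {"CGO_ENABLED": "0"}  # avoid linking against the system C library
    argv = ["go", "build", "-o", output, package]
    return argv, env


def build_myapp() -> Tuple[List[str], Dict[str, str]]:
    """Specialized step: reuse the generic helper, then add project settings."""
    argv, env = static_binary("./cmd/myapp", "bin/myapp")
    env["GOOS"] = "linux"  # project-specific choice layered on top
    return argv, env
```

The specialized function stays small because the generic one owns the shared knowledge, which is the reuse the message above is pointing at.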

grave oak Jun 29, 2023, 8:15 PM

#

subtle thorn So if i understand correctly, your bet is that by forcing (with some inevitable ...

We think we can make the friction go away 🙂

#

Especially considering that there is also friction in the embedded model. For example, where do the files go? Should each user choose a directory name, or should Dagger impose a convention? If the files are spread across multiple locations in the repo (eg in a monorepo), what is the relationship between those files? etc

#

Also: what if I improved my test pipeline, and I want to apply it against older versions of my code?

subtle thorn Jun 29, 2023, 8:17 PM

#

The way I understand what you're saying is that it's as if the dev image (aka env) knew exactly what kind of project it is and how to compile/test it, so all you really need is just a "FROM generic-go-dev-env" and so the Dockerfile becomes just a pointer to a dev env.

grave oak Jun 29, 2023, 8:17 PM

#

the environment is not an image, it's a DAG

subtle thorn Jun 29, 2023, 8:18 PM

#

yes, that's where the analogy breaks down

#

😄

grave oak Jun 29, 2023, 8:18 PM

#

oh it’s an analogy sorry. let me re read

subtle thorn Jun 29, 2023, 8:19 PM

#

I'm using a dockerfile analogy because that's what most people using docker are using today. And that's what i'm most familiar with. Sorry if it's not very clear

grave oak Jun 29, 2023, 8:20 PM

#

It's clear 🙂

#

I guess Dagger operates at a higher level than the Dockerfile

#

Dockerfile -> "here's how to build my application into a container"

#

Dagger environment -> "here's a place where you can do things, including possibly building your application"

subtle thorn Jun 29, 2023, 8:24 PM

#

So @rocky dove in your example, how would a repo owner specify that they want to use a generic go dev env ?

grave oak Jun 29, 2023, 8:27 PM

#

Either in the command line, eg. dagger -e github.com/tiborvass/supergodev or dagger -e universe/go; or possibly pre-configured in a config file:

environments:
  github.com/myproject/foo: universe/go
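
A sketch of how such a mapping might be consulted, assuming the keys are `host/owner/repo` strings and that the project's git remote is normalized to that form before the lookup. `normalize_remote` and `environment_for` are hypothetical helpers, not existing Dagger functions.

```python
import re
from typing import Mapping, Optional


def normalize_remote(url: str) -> str:
    """Reduce common git remote spellings to a 'host/owner/repo' key."""
    url = url.strip()
    # scp-like form: git@github.com:owner/repo.git
    match = re.match(r"^[\w.-]+@([\w.-]+):(.+)$", url)
    if match:
        url = f"{match.group(1)}/{match.group(2)}"
    else:
        # URL form: drop the scheme, then any user info before the host.
        url = re.sub(r"^[a-z][a-z0-9+.-]*://", "", url)
        url = re.sub(r"^[^/@]+@", "", url)
    url = url.rstrip("/")
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url


def environment_for(remote_url: str, mapping: Mapping[str, str]) -> Optional[str]:
    """Return the environment mapped to this remote, or None if unmapped."""
    return mapping.get(normalize_remote(remote_url))
```

With the mapping above loaded as a dict, both the SSH and HTTPS remotes of `github.com/myproject/foo` would resolve to `universe/go`, so the user never has to pass the environment by hand.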

#

Another idea is that you can very easily create local environments by just editing eg. ~/.dagger/env/supergoenv -> environment code goes there

#

And then you can just dagger -e supergoenv
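
One way the `-e` argument could be resolved, sketched under the assumption that a directory under `~/.dagger/env/` takes precedence and anything else is treated as a remote reference. The precedence rule and the `resolve_environment` name are illustrative guesses, not settled behavior.

```python
from pathlib import Path


def resolve_environment(ref: str, dagger_home: Path) -> str:
    """Prefer a local environment directory; otherwise treat ref as remote.

    dagger_home is the directory holding local environments under 'env/',
    e.g. Path.home() / ".dagger".
    """
    local = dagger_home / "env" / ref
    if local.is_dir():
        return str(local)
    return ref
```

Here `resolve_environment("supergoenv", Path.home() / ".dagger")` would return the local directory if it exists, while `universe/go` would pass through unchanged for remote resolution.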

subtle thorn Jun 29, 2023, 8:37 PM

#

I guess i'll rephrase my question, not from the platform engineer's view but from the consumer's view: when i arrive on a new github project, I imagine i just run dagger and it shows me all the tasks i can do, and say i'd like to build the project and i do something along the lines of dagger do build. Where in the chain of events, is the go-dev env specified ? Did the consumer have to manually pull go-dev beforehand so that it shows up in their ~/.dagger ? I'd argue that it has to be a config in the source repo, however small.

limber stirrup Jun 29, 2023, 8:37 PM

#

I guess both local and remote envs have valid use cases.

I guess with this new env structure, extending / wrapping other envs would also be possible
