Codegen extravaganza | Dagger | Page 1

ivory tapir Sep 9, 2022, 1:29 AM

#

First there are 2 areas of codegen:

Server-side (generating server code from schema)
Client-side (generating client code)

#

On the client-side we are exploring 2 techniques:

Pre-written queries. Client code is generated from a set of pre-written gql queries.
Query builder. Client code is generated by schema introspection, and allows creating queries in native code instead of graphql

#

Pre-written queries is how most (all?) generators work today. So it has the benefit of more existing code and ecosystem awareness. But someone has to write the graphql query first, and the structure of the graphql query leaks into the generated types
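To make the "structure leaks into the generated types" point concrete, here is a minimal sketch of what a generator might emit for one pre-written query. Everything in it is an assumption made for illustration: the query text, the names `alpineReleaseQuery`, `AlpineReleaseResponse` and `decodeAlpineRelease`, and the canned response are hypothetical and not taken from any real generator.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical pre-written query that a generator would take as input.
const alpineReleaseQuery = `query { core { image(ref: "alpine") { file(path: "/etc/alpine-release") } } }`

// AlpineReleaseResponse is the kind of type a generator could emit for the
// query above. Its nesting mirrors the query: to reach one string the caller
// has to walk Core.Image.File.
type AlpineReleaseResponse struct {
	Core struct {
		Image struct {
			File string `json:"file"`
		} `json:"image"`
	} `json:"core"`
}

// decodeAlpineRelease decodes the "data" part of a response into the
// generated type.
func decodeAlpineRelease(data []byte) (*AlpineReleaseResponse, error) {
	var out AlpineReleaseResponse
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

func main() {
	fmt.Println(alpineReleaseQuery)

	// Canned payload standing in for a real server reply.
	resp, err := decodeAlpineRelease([]byte(`{"core":{"image":{"file":"3.16.2\n"}}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", resp.Core.Image.File)
}
```

Changing the shape of the query changes the shape of the struct, which is the leak being described.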

#

Query builder has the potential to remove the need to write graphql queries. This can be good or bad depending on how you look at it: less need to learn a new language or concept; but also less opportunity for all dagger developers to share, reuse and learn from a growing ecosystem of graphql queries. Main downside is that it’s uncharted territory: we have to write this code ourselves

#

One possibility is that we keep bridges between gql queries and query builder. Perhaps we could provide 2-way conversion: export the raw query from a native go/ts query builder call; and generate a go/ts query builder snippet from a gql query

#

again the main downside is that it would be all new code aka lots of work

boreal fern Sep 9, 2022, 2:05 AM

#

ivory tapir One possibility is that we keep bridges between gql queries and query builder. P...

(I was just thinking more about this too. Agree on the lots of work part, but there is a big "cool" factor here besides the more practical benefits. It would be like google translate for API calls or something)

ivory tapir Sep 9, 2022, 2:11 AM

#

Another important factor is how easy it is to create and share graph extensions.

Right now it’s cumbersome; something reserved for power users. So if the community is not reusing each other’s gql queries because they’re writing them in their native language (causing fragmentation) then extensions cannot fill the gap because not enough people will be able to write them.

However if we make it 10x easier to write an extension, to the point where you could realistically write one on your first day using dagger (as opposed to your 30th day) then that would make it less of a problem that we’re not growing an ecosystem of gql queries: we’ll be growing an ecosystem of graph extensions instead

#

In fact even if we generated client code from pre-written queries, we would still need extensions to be very easy to write, because those queries get pretty complex and the complexity leaks into generated code. The only tool we have to abstract away the complexity is an extension. So one way or the other: all roads lead to Extension DX as the critical bottleneck to our overall DX. We have to make extensions 10x easier to create

#

And THAT requires (we think) a code-first DX + somehow beating the state of the art of gql code-first which is still too complicated

#

TLDR if we can make it 10x easier to create extensions then it’s ok to not have everything figured out on client code generation

#

By 10x I mean that writing an extension should be as easy as writing a script (which today is not the case)

#

Let me know @boreal fern if I forgot anything 🙂

#

Oh yeah I’m going to tweak the core API proposal (small tweak) to leave a door open for a possible trick to help make writing extensions 10x easier down the road

tired isle Sep 9, 2022, 10:56 AM

#

Regarding query builder, some extra context: There's a PR (#174) that started about a month ago, it's purely experimentation

Query builder pattern tries two things:

Low level query builder (mostly for Go):

It really sucks doing manual queries in Go (and in any other typed language), because of expected return types, etc

The low level query builder API provides a programmatic way to write queries and bind their result

Example here: https://github.com/dagger/cloak/pull/174/files#diff-1b6185b0a44826e95e82f077b90cc982a13af4c6db19799cfdeba9391bccb9cd

Rough sketch:

var contents string
root := Query().
    Select("core").
    Select("image").Arg("ref", "alpine").
    Select("file").Arg("path", "/etc/alpine-release").Bind(&contents)

This generates query{core{image(ref:"alpine"){file(path:"/etc/alpine-release")}}} and, after executing, contents will contain "3.16.2\n"
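For anyone who wants to poke at the idea, here is a self-contained sketch of how a chain like this could produce that query string and bind the result. It is not the code from the PR: the `Selection` type and the `Build` and `Fill` methods are hypothetical names, only string arguments and string leaves are handled, and the response is a canned map rather than a real server reply.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Selection is one field in the chain, linked back to its parent.
type Selection struct {
	parent *Selection
	name   string
	args   map[string]string
	bind   *string
}

// Query returns the root of a chain.
func Query() *Selection { return &Selection{args: map[string]string{}} }

// Select adds a child field under the current one.
func (s *Selection) Select(name string) *Selection {
	return &Selection{parent: s, name: name, args: map[string]string{}}
}

// Arg sets a string argument on the current field.
func (s *Selection) Arg(key, value string) *Selection {
	s.args[key] = value
	return s
}

// Bind records where the value of the current field should be written.
func (s *Selection) Bind(dest *string) *Selection {
	s.bind = dest
	return s
}

// Build renders the chain, leaf to root, as a compact query string.
// Go's %q quoting is used as an approximation of GraphQL string escaping.
func (s *Selection) Build() string {
	inner := ""
	for cur := s; cur.parent != nil; cur = cur.parent {
		field := cur.name
		if len(cur.args) > 0 {
			keys := make([]string, 0, len(cur.args))
			for k := range cur.args {
				keys = append(keys, k)
			}
			sort.Strings(keys) // stable output regardless of map order
			parts := make([]string, 0, len(keys))
			for _, k := range keys {
				parts = append(parts, fmt.Sprintf("%s:%q", k, cur.args[k]))
			}
			field += "(" + strings.Join(parts, ",") + ")"
		}
		if inner != "" {
			field += "{" + inner + "}"
		}
		inner = field
	}
	return "query{" + inner + "}"
}

// Fill walks a decoded response along the chain and writes the leaf value
// into the bound variable, if any.
func (s *Selection) Fill(data map[string]interface{}) error {
	var names []string
	for cur := s; cur.parent != nil; cur = cur.parent {
		names = append([]string{cur.name}, names...)
	}
	var cur interface{} = data
	for _, name := range names {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return fmt.Errorf("parent of %q is not an object", name)
		}
		if cur, ok = obj[name]; !ok {
			return fmt.Errorf("missing field %q", name)
		}
	}
	str, ok := cur.(string)
	if !ok {
		return fmt.Errorf("leaf is not a string")
	}
	if s.bind != nil {
		*s.bind = str
	}
	return nil
}

func main() {
	var contents string
	q := Query().
		Select("core").
		Select("image").Arg("ref", "alpine").
		Select("file").Arg("path", "/etc/alpine-release").Bind(&contents)
	fmt.Println(q.Build())

	// Canned response standing in for the server.
	resp := map[string]interface{}{"core": map[string]interface{}{
		"image": map[string]interface{}{"file": "3.16.2\n"}}}
	if err := q.Fill(resp); err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", contents)
}
```

The point of the linked-list shape is that each `Select` returns a new node, so a partial chain can be reused as the prefix of several queries without being mutated.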

Query Builder + Code Gen

This is code generation (as we know it today) built on top of query builder

Provides the advantage of codegen (e.g. using static types such as core, alpine, etc) with the "dynamism" of query builder

Why? With the standard codegen model (e.g. operations), we lose a lot of advantages compared to writing gql queries manually.

Specifically: it's a request/response model (e.g. function call = operation call), so you can't build "in code" chaining or parallelism

Example:

alpine today: with codegen, we're doing one gql request to grab the alpine image, then one request FOR EACH apk add, etc

Not only is it slower, but it also breaks the "builder pattern" (e.g. for each apk you have to check the error code, etc)

alpine today:

output, err := core.Image(ctx, "alpine:3.15")
if err != nil {
    return nil, err
}

fs := &output.Core.Image

for _, pkg := range pkgs {
    output, err := core.Exec(ctx, fs.ID, core.ExecInput{
        Args:    []string{"apk", "add", "-U", "--no-cache", pkg},
        Workdir: "/mnt",
    })
    if err != nil {
        return nil, fmt.Errorf("failed to install %s: %w", pkg, err)
    }
    fs = output.Core.Filesystem.Exec.Fs
}

return fs, nil

query builder alpine POC:

func alpine(packages ...string) *Filesystem {
    fs := core.Image("alpine")
    for _, pkg := range packages {
        fs = fs.Exec("apk", "add", pkg).FS()
    }
    return fs
}

stdout, err := alpine("curl", "jq", "bash").Exec("ls -l").Stdout(ctx)

#

There's a LOT of weird/undefined behavior/downsides though, it's far from being a proper POC

hollow brook Sep 9, 2022, 11:16 AM

#

tired isle Regarding query builder, some extra context: There's a PR (#174) that started ab...

Thank you for the context, that's one of the reasons we use ORMs (e.g., Django's; I'm thinking only about how it builds queries, not the actual object-to-table mapping). Makes sense.
