`moduleTypes`, self calls, codegen, `dagger generate` and commit generated files | Dagger | Page 1

olive bobcat Apr 3, 2026, 4:35 PM

#

I have an issue with moduleTypes and I'm not sure how to solve it.

First, some history: moduleTypes was extracted when I worked on self calls. The way self calls work is fairly simple: the module itself is present in the type schema returned by the engine, so we can generate the necessary code. Before moduleTypes, we invoked the module with an empty function name and it responded with its types. Since the module needs those types for codegen, moduleTypes was extracted as a separate phase.
One nice side effect is that dagger functions doesn't need to execute the module at all, it only needs the types.
moduleTypes only depends on base types, since a module can't expose types from a dependency.

Outside of self calls, it gives cleaner phase separation, but it's not critical.

Now, the problem: I'm working on replacing dagger develop with dagger generate. The idea is to commit generated files, moving all that logic from runtime to a dev-time phase. The runtime shouldn't need any codegen at all.
But the generated code only contains the invocation boilerplate; it doesn't include the moduleTypes part (anymore). So even with generated code present, it can't answer a dagger functions call. It also can't serve the types needed to generate self-call bindings, which makes the generation somewhat useless.

My idea is to also generate moduleTypes. During generation we'd have something like:

1. analyze module source code
2. generate type defs (moduleTypes) with local persistence
3. send types to the engine
4. generate bindings and invocation boilerplate (the entrypoint), using the schema with dependencies (and with the analyzed types if self calls are enabled)
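
A minimal sketch of the first two generation steps, assuming the analysis can be done purely by parsing the source. `FunctionDef`, `ObjectDef`, `analyze` and `moduleSource` are hypothetical names made up for illustration; they are not the real Dagger type defs or SDK code. Sending the types to the engine and generating bindings are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// FunctionDef and ObjectDef are hypothetical, simplified stand-ins for the
// type defs that moduleTypes produces; they are not the real Dagger types.
type FunctionDef struct {
	Name string   `json:"name"`
	Args []string `json:"args"`
}

type ObjectDef struct {
	Name      string        `json:"name"`
	Functions []FunctionDef `json:"functions"`
}

// analyze parses the module source (no build, no execution) and collects
// the exported methods declared on the given object type.
func analyze(src, object string) (ObjectDef, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "main.go", src, 0)
	if err != nil {
		return ObjectDef{}, err
	}
	def := ObjectDef{Name: object}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv == nil || len(fn.Recv.List) == 0 || !fn.Name.IsExported() {
			continue
		}
		if receiverName(fn.Recv.List[0].Type) != object {
			continue
		}
		f := FunctionDef{Name: fn.Name.Name, Args: []string{}}
		for _, field := range fn.Type.Params.List {
			for _, name := range field.Names {
				f.Args = append(f.Args, name.Name)
			}
		}
		def.Functions = append(def.Functions, f)
	}
	return def, nil
}

// receiverName returns the type name of a method receiver, with or without
// a pointer.
func receiverName(expr ast.Expr) string {
	if star, ok := expr.(*ast.StarExpr); ok {
		expr = star.X
	}
	if ident, ok := expr.(*ast.Ident); ok {
		return ident.Name
	}
	return ""
}

// moduleSource is a made-up module used only for this example.
const moduleSource = `package main

type MyModule struct{}

func (m *MyModule) ContainerEcho(stringArg string) string { return stringArg }

func (m *MyModule) helper() {}
`

func main() {
	def, err := analyze(moduleSource, "MyModule")
	if err != nil {
		panic(err)
	}
	// Serialize the type defs so they could be persisted next to the
	// generated code.
	out, err := json.MarshalIndent(def, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The point of the sketch is only that the type defs come out of source analysis, without building or running the module.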

This also means the engine needs a way to query the module for its types, a bit like the legacy empty-function-name approach. But what I'd like is to get the types without any build step.

The benefits: we remove all codegen from runtime. And the cost of self calls becomes negligible 🎉, since it only exists at dagger generate time. If that's the case, we could consider removing self calls from experimental and enabling them by default. Performance was a blocker.

My question is: I'm not sure how to persist this moduleTypes result. If we could describe it as a single query, that would help a lot, but today that's not the case. Maybe it's something we can make work. Or other ideas?

Any ideas or feedback welcome

#

cc @uncut wraith @brisk loom @granite sleet @sly reef

sly reef Apr 3, 2026, 5:48 PM

#

olive bobcat I have an issue with `moduleTypes` and I'm not sure how to solve it. First, som...

Can you explain a little more? sorry...

why does the change make moduletypes disappear?

olive bobcat Apr 3, 2026, 7:41 PM

#

no problem, I have a hard time explaining my problem 😅
So, my issue is when we remove all codegen from the runtime.
Whether it's develop or generate, in fact. But let's say we generated the files of a Go module and committed them. And then we want to list functions and call functions from this module.
The generated files contain the entrypoint, something like this:

func invoke(ctx context.Context, parentJSON []byte, parentName string, fnName string, inputArgs map[string][]byte) (_ any, err error) {
        _ = inputArgs
        switch parentName {
        case "MyModule":
                switch fnName {
                case "ContainerEcho":
                        // ...
                        return (*MyModule).ContainerEcho(&parent, stringArg), nil
                case "GrepDir":
                        // ...
                        return (*MyModule).GrepDir(&parent, ctx, directoryArg, pattern)
                default:
                        return nil, fmt.Errorf("unknown function %s", fnName)
                }
        default:
                return nil, fmt.Errorf("unknown object %s", parentName)
        }
}

With that, the SDK runtime can just grab those files, pull the dependencies and build it. No more introspection, codegen, etc. And we can dagger call FUNCTION because we just use the entrypoint.
But when we want to run dagger functions for instance, or dagger check, etc., we must have access to the different types exposed by the module.
Before self calls, the problem didn't exist because the entrypoint was used to return the types exposed by the module. If that were still the case, it would be easy.
But that was removed for self calls, because the exposed types are required before the codegen.
But getting those types means introspecting the code, using the SDK to create the module (containing the types), etc.
All that is a bit like a second, more limited, entrypoint/runtime execution, and the corresponding code is never exported to the host, not committed.
One of the ideas I have is to keep it like that, but export the result of moduleTypes to the host. During a generate it's created, then exported with the rest of the code. Then when the engine needs to retrieve it, it's already there: no more introspection / codegen.

...
rubberduck
...

I think I just got an idea 🤔
Maybe I can bring back the entrypoint branch with the empty function name, but in a way that works with self calls. That could be nice, and even help to debug this aspect. Not 100% sure it can work, but I can investigate that
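
A rough sketch of what such an empty-function-name branch could look like, assuming the type defs are emitted as plain data at generation time so they don't depend on the generated bindings. `moduleTypeDefs`, `FunctionDef` and `ObjectDef` are hypothetical names, and the simplified `invoke` signature is not the real generated one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FunctionDef and ObjectDef are hypothetical, simplified type defs,
// not the real Dagger API types.
type FunctionDef struct {
	Name string `json:"name"`
}

type ObjectDef struct {
	Name      string        `json:"name"`
	Functions []FunctionDef `json:"functions"`
}

// moduleTypeDefs stands in for data emitted at generation time from static
// analysis of the module source (an assumption of this sketch).
var moduleTypeDefs = []ObjectDef{{
	Name:      "MyModule",
	Functions: []FunctionDef{{Name: "ContainerEcho"}, {Name: "GrepDir"}},
}}

// invoke is a simplified entrypoint: an empty function name returns the
// module's types, anything else is dispatched as a regular call.
func invoke(parentName, fnName string) (any, error) {
	if fnName == "" {
		return moduleTypeDefs, nil
	}
	switch parentName {
	case "MyModule":
		switch fnName {
		case "ContainerEcho":
			// The real call into the module would go here.
			return "echo", nil
		default:
			return nil, fmt.Errorf("unknown function %s", fnName)
		}
	default:
		return nil, fmt.Errorf("unknown object %s", parentName)
	}
}

func main() {
	res, err := invoke("", "")
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(res)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Whether this can work with self calls still depends on the types being available before the bindings are generated, which the sketch sidesteps by hard-coding them.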

sly reef Apr 3, 2026, 7:46 PM

#

All that is a bit like a second, more limited, entrypoint/runtime execution, and the corresponding code is never exported to the host, not committed.

That's the part I don't understand. Why is that part not exported to the host? Isn't it part of codegen?

#

In my mind, when you split the moduleTypes function, you split the runtime interface - instead of exposing a single entrypoint (dispatch), a runtime now has to expose 2 entrypoints (dispatch and moduleTypes). So when an SDK generates a runtime, now it generates a runtime with 2 entrypoints. So, whether that 2-entrypoint runtime is generated at runtime or not - what's the difference? What's special about that 2nd entrypoint?

#

Ah is it more like 2 different runtimes, each with their own entrypoint - to avoid the overhead of building one big runtime? (Because moduleTypes requires a much smaller runtime?)

#

Or, should the second entrypoint just be literally a graphql schema?

brisk loom Apr 3, 2026, 7:51 PM

#

Yeah, can’t you just generate a graphql schema of the types and export it on the host?

sly reef Apr 3, 2026, 7:52 PM

#

But those schema files would have to be exposed via a standard runtime interface, which at the moment is not defined

sly reef Apr 3, 2026, 8:11 PM

#

OK @olive bobcat I'm starting to understand. It's not that dagger functions breaks, it's just that it's still doing codegen at runtime, so at the moment only half the problem is solved: dagger call doesn't do codegen anymore, but dagger functions (and dagger call --help, and dagger check -l...) still do

sly reef Apr 3, 2026, 8:27 PM

#

Digging a little deeper...

It looks like moduleTypes() already returns a Container... With a weird runtime contract but I'm sure there are excellent legacy reasons for the weirdness 🙂

In any case, if there's a container, that means there's an entrypoint... And if there's an entrypoint, there's source code somewhere - probably generated. So why not generate that moduleTypes entrypoint source code, and commit that?

sly reef Apr 3, 2026, 8:46 PM

#

sly reef Digging a little deeper... It looks like `moduleTypes()` already returns a `Con...

Answering myself after some research:

YES that would probably be the right thing to do
BUT currently the moduleTypes entrypoint source code is not generated... It runs codegen dynamically. So we will need to expand codegen to generate a moduleTypes entrypoint from schema. Then we can commit that (and the SDK's moduleTypes function can be changed to just build it)

sly reef Apr 6, 2026, 11:32 PM

#

@olive bobcat FYI I prototyped one possible solution: https://github.com/dagger/dagger/pull/12915

GitHub

wip: prototype moduleTypes persistence by shykes · Pull Request #1...

Problem
Every time you run dagger functions, the engine calls the SDK's moduleTypes hook, which runs live codegen inside a container just to figure out what functions and types your module ...

olive bobcat Apr 7, 2026, 2:23 PM

#

Yes, I know the issue is a bit weird, and I have a hard time explaining it clearly 😅
I can explain why this is a container, like the runtime (TL;DR: it's mostly because of the way we deal with the engine version)
I'll have a look at your prototype. I also prototyped something in that spirit that stored the module def in dagger.json, but it wasn't that nice. Just a quick experiment.
The first version of moduleTypes wrote JSON representing the types. But at that time we had a lot of discussions about whether this would cause more pain, as it means we have a second kind of serialization for what is already covered by the API. But maybe I'll go back to that. Especially since it means we don't need the API to get the types: we are not making calls, we just analyse the source code and generate a representation of the signatures. I'd love to have that in the API directly, but I'm not sure that's doable (because of references).

sly reef Apr 7, 2026, 5:45 PM

#

olive bobcat Yes, I know the issue is a bit weird, and I have hard time to clearly explain it...

Yeah my prototype (really it's a POC) has 2 parts:

1. New core API to export eligible types to JSON. Eligible = can safely be exported/imported without losing information.
2. Go SDK exports that JSON at generation; and moduleTypes() just loads it at runtime

olive bobcat Apr 8, 2026, 3:01 PM

#

@brisk loom What do you think about that approach? This reminds me a lot of the JSON attempt I made at the beginning of moduleTypes/self calls (but more generic). Maybe it's the right time to do that again.

brisk loom Apr 8, 2026, 3:29 PM

#

Seems quite similar to https://github.com/dagger/dagger/discussions/12614 but both ways

GitHub

Introspection API idea · dagger dagger · Discussion #12614

In the context of client generation and various discussion with @shykes and @helderco. We conclude that the introspectionJSON arg in SDK module should not be required since they are other ways to g...

#

So if I understand well, we would have:

1. Introspection of the modules to extract types
2. Store the result as a JSON artifact (or even better if it's a GQL schema)
3. Have a way in the engine to load that artifact as a module so we can answer dagger functions

Am I right, or is there something missing?

olive bobcat Apr 8, 2026, 3:42 PM

#

I think that's the main idea. The goal for me is that when a dagger call happens, there's no codegen at all.
That means dagger generate (the develop phase) must run both moduleTypes and codegen, and the results of both must be exported.
That way dagger functions will be much quicker, as it only needs to read and load a representation of the module, even if there's no cache.

The persistence above, even if scoped to moduleTypes for now, is more general. It's "allow storing and loading a representation of an object". So that means without IDs. In that way it's closer to the JSON representation I did at the start.
But my fear with it is that we are introducing a second way to serialize objects. Maybe that's not a problem, but it's a bit more complexity.
GQL schema, maybe it's possible, but I guess we will not do that for all types, it's more like a new type defs api. no?

brisk loom Apr 8, 2026, 3:46 PM

#

A totally different idea, but can't we generate another entrypoint specifically to return these moduleTypes?

#

GQL schema, maybe it's possible, but I guess we will not do that for all types, it's more like a new type defs api. no?
If we only need an artifact for the type defs, I think a GQL schema is enough, no?
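
For illustration, a sketch of rendering type defs as a GraphQL schema artifact, assuming a very reduced type def model. `ArgDef`, `FunctionDef`, `ObjectDef` and `renderSDL` are hypothetical names, and the argument and return types in the example are guesses made up for the sketch, not taken from a real module.

```go
package main

import (
	"fmt"
	"strings"
)

// ArgDef, FunctionDef and ObjectDef are hypothetical, simplified type defs.
type ArgDef struct {
	Name string
	Type string
}

type FunctionDef struct {
	Name    string
	Args    []ArgDef
	Returns string
}

type ObjectDef struct {
	Name      string
	Functions []FunctionDef
}

// renderSDL writes one object as a GraphQL type definition, with each
// function rendered as a field and its arguments as field arguments.
func renderSDL(obj ObjectDef) string {
	var b strings.Builder
	fmt.Fprintf(&b, "type %s {\n", obj.Name)
	for _, fn := range obj.Functions {
		b.WriteString("  " + fn.Name)
		if len(fn.Args) > 0 {
			args := make([]string, 0, len(fn.Args))
			for _, a := range fn.Args {
				args = append(args, a.Name+": "+a.Type)
			}
			b.WriteString("(" + strings.Join(args, ", ") + ")")
		}
		b.WriteString(": " + fn.Returns + "\n")
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// Types below are assumed for the example only.
	obj := ObjectDef{
		Name: "MyModule",
		Functions: []FunctionDef{{
			Name:    "containerEcho",
			Args:    []ArgDef{{Name: "stringArg", Type: "String!"}},
			Returns: "Container!",
		}},
	}
	fmt.Print(renderSDL(obj))
}
```

A schema file like this only describes signatures, so it would cover the type defs use case but not a general "store and load any object" mechanism.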

sly reef Apr 8, 2026, 4:00 PM

#

brisk loom A total different idea but can't we generate another entrypoint specially to ret...

yes that's where I started, but it's much more work IMO
