> so I started work on removing the | Dagger | Page 1

onyx tangle Jan 21, 2026, 5:43 PM

#

I did notice once significant use of LLB left in core/schema/query.go for the schema introspection json file. I don't think it will be too hard to dag-op-ify though

#

after that there's probably some small loose ends scattered about but I think the schema file is last significant one?

dull portal Jan 21, 2026, 5:45 PM

#

oh..... that's right! I think I ran into it when I was tackling the container.File work, and it got pretty hairy so I left it as is, but I can give it a try next.

#

I think it was mostly the codegen portion of the build would break, and it made the dev cycles a bit more difficult, if I recall correctly.

onyx tangle Jan 21, 2026, 11:34 PM

#

Also, Dockerfile builds technically still use LLB, but don't worry about those.

I have an idea on how to isolate+bridge such that dockerfiles still go through buildkit but then the results are converted to what dagql understands. Only doable as part of the solver-removal work though, not beforehand.

dull portal Jan 22, 2026, 9:20 PM

#

I'm starting to feel a bit stuck on debugging the schemaJSONFile work. I'm trying to setup a dagop under https://github.com/dagger/dagger/compare/main...alexcb:dagger:dagop-schemaJSONFile?expand=1#diff-a2bab836c6f25d97c19d9ec4f5cbe987a91b5584bfa04e3488e7728d08be10f4R154

but when I try to do a build, I get errors during the codegen process:

Error: failed to serve module: failed to load dependencies as modules: failed to load module dependencies: failed to load dependencies as modules: failed to load module dependencies: failed to call module "codegen" to get functions: Error: "go" not found

It seems like the sdk runtime container that's being returned is wrong (since the go binary isn't there):

I'll keep hacking away at it, but was just curious if @onyx tangle had any ideas.

GitHub

Comparing dagger:main...alexcb:dagop-schemaJSONFile · dagger/dagger

Automation engine to build, test and ship any codebase. Runs locally, in CI, or directly in the cloud - Comparing dagger:main...alexcb:dagop-schemaJSONFile · dagger/dagger

#

here's my relevant printfs

ACB eval of {ClientID:s31ezi1njh9zu349dplftsjzk SessionID: SecretToken: Hostname: ClientStableID: ExecID:vn0adwxstriet6vdr3sskxc79 Internal:true CallID:0xc002aa86c0 EncodedModuleID:ChV4eGgzOjk1M2EzY2U5NTU5NDYwYWMSnQEKFXh4aDM6MTNiNzgxM2I2YzI0NTU0YRKDARIQCgxNb2R1bGVTb3VyY2UYARoMbW9kdWxlU291cmNlIjMKCXJlZlN0cmluZxImOiQvaG9tZS9hbGV4L2RhZ2dlci9kYWdnZXIvY21kL2NvZGVnZW4iEwoNZGlzYWJsZUZpbmRVcBICGAFKFXh4aDM6MTNiNzgxM2I2YzI0NTU0YVgBEl8KFXh4aDM6OTUzYTNjZTk1NTk0NjBhYxJGChV4eGgzOmU4YWY3ZGNlNWJmNGRiOGQSCgoGTW9kdWxlGAEaCGFzTW9kdWxlShV4eGgzOjk1M2EzY2U5NTU5NDYwYWNYARJ2ChV4eGgzOmU4YWY3ZGNlNWJmNGRiOGQSXQoVeHhoMzoxM2I3ODEzYjZjMjQ1NTRhEhAKDE1vZHVsZVNvdXJjZRgBGgh3aXRoTmFtZSIRCgRuYW1lEgk6B2NvZGVnZW5KFXh4aDM6ZThhZjdkY2U1YmY0ZGI4ZA== EncodedFunctionCall:[123 34 78 97 109 101 34 58 34 34 44 34 80 97 114  ... CallerClientID: ParentIDs:map[] CacheMixin:xxh3:7bef3336d1891a0a HostAliases:map[] ExtraSearchDomains:[] RedirectStdinPath: RedirectStdoutPath: RedirectStderrPath: SecretEnvNames:[] SecretFilePaths:[] SystemEnvNames:[] EnabledGPUs:[] SSHAuthSocketPath: NoInit:false AllowedLLMModules:[] ClientVersionOverride:} failed exit code: 2 [traceparent:38c0ecd6fcd97e67fba1c479ec4ef72a-96a4f0a7a62b3685]
ACB load failed Error: "go" not found

onyx tangle Jan 22, 2026, 9:29 PM

#

dull portal I'm starting to feel a bit stuck on debugging the `schemaJSONFile` work. I'm try...

having not read it yet, my immediate thought is the cache key might be wrong in that the cache key for the object passed around is not content hashed based on the schema. That would explain the behavior at least.

I suspect this API might be best following a different pattern than the usual dag-op. I don't think we need the whole lazy thing where the same API gets invoked two times, once outside dagop and once inside. It can actually be much simpler.

The whole point of this API is to be able to pass around the schema json as a File, but avoiding creating an ID that's passed between clients which contains the entire schema json string (which can be MBs).

Ideas:

Don't use dag-op wrappers or anything. Just directly create a new MutableRef, write the schema.json to it, make it an ImmutableRef and then return a File with that ref set. The only risk is whether our code correctly handles Refs created outside of DagOp yet, not sure since I don't know if we've tried yet.
Update File to support an additional field under LLB and Ref: Contents []byte. If Contents is set, then just pretend its a file with the given bytes. Would require updating all the relevant methods on File to handle that case in addition to the existing LLB+Ref cases, but probably not too bad

Those would be mutually exclusive options. But the end result of either of them would be that you can pass around references to the schema json file without having to actually pass the entire contents as an ID string from one client to another (i.e. during SDK module calls, the ID of the schema json file are won't actually have the full string in it).

#

If I had to pick, 2. might be the simplest?

#

Also, I'm just thinking out loud here, so take with salt

dull portal Jan 22, 2026, 9:32 PM

#

those are definitely helpful ideas! I'll keep on hacking at it -- I too was tempted to try creating a ref outside of the dagop, which might be possible especially since the schema should always be added to an empty directory.

#

I also added a panic incase we were passing around an empty digest, which never got hit, so I think that's a good thing 😅

onyx tangle Jan 22, 2026, 9:35 PM

#

yeah that's important! But also important that the digest of the arg being passed around changes when the schema changes. I guess you could check by println'ing schemaJSONFile.ID().Digest() for the id that's used here

dull portal Jan 27, 2026, 12:04 AM

#

So I'm running into the issue of s.srv.View getting lost when I get a callback in the dagop context, which I discovered due to a printf under: https://github.com/dagger/dagger/blob/3ee8da3aa724ffeca12aa7de8b73b33361f4f54a/core/schema/query.go#L274

I initially tried passing along the expected view as an arg, e.g.

        WithArgument(call.NewArgument(
            "expectedView",
            call.NewLiteralString(s.srv.View.String()),
            false,
        ))

then tried changing the Server.Schema() func to accept a specific View argument: https://github.com/dagger/dagger/blob/3ee8da3aa724ffeca12aa7de8b73b33361f4f54a/dagql/server.go#L450-L493

however I don't think that worked, since the tests kept on failing -- in particular some of the go client code isn't being generated correctly since the wrong schema is still being returned, which was ultimately giving me some ./main.go:15:13: dag.ContextGit undefined (type *dagger.Client has no field or method ContextGit) errors.

so I put in a terribly ugly hack, which I store the schema under a global map with a random ID, and pass along a reference to that ID instead, under: https://github.com/dagger/dagger/blob/3ee8da3aa724ffeca12aa7de8b73b33361f4f54a/core/schema/query.go#L330

But that's not an acceptable solution (memory leak, super hacky, etc).

So the ultimate question: How can I keep a reference the dagql.Server of the original call (or the equivalent original view) from within the dagop callback?

dull portal Jan 27, 2026, 5:54 PM

#

@onyx tangle I got all the tests passing for this now under https://github.com/dagger/dagger/pull/11733 however I'm still stuck figuring out how to get the correct view inside the dagop context.

GitHub

chore: convert __schemaJSONFile to dagop by alexcb · Pull Request ...

onyx tangle Jan 27, 2026, 5:55 PM

#

dull portal <@949034677610643507> I got all the tests passing for this now under https://git...

sorry yes on my queue to look into soon, I will let you know!

dull portal Jan 27, 2026, 5:55 PM

#

ah thanks, no worries.

#

I have to run out at some point today to deal with a flat tire on the car 🙁

#

but I can shift on to https://github.com/dagger/dagger/pull/11734 (completely unrelated) so I'm not blocked.

onyx tangle Jan 27, 2026, 9:31 PM

#

dull portal So I'm running into the issue of `s.srv.View` getting lost when I get a callback...

When you passed the view as an argument, did you also include the view in the digest of the operation? Right now with dagql IDs, if you override the digest then it doesn't matter if you change the args, so if any args change you gotta incorporate that into any custom digest

#

I actually may fix this soon by changing custom ID digests to work by just accepting a list of extra digest mixins, but doesn't exist yet

onyx tangle Jan 27, 2026, 9:31 PM

#

dull portal I have to run out at some point today to deal with a flat tire on the car 🙁

Oh I just read this, sorry to hear and no problem!

#

no rush at all

dull portal Jan 27, 2026, 9:33 PM

#

onyx tangle Oh I just read this, sorry to hear and no problem!

I found the friendliest tire shop, got it all sorted out and am back home. took all of 10-15mins at the shop.

dull portal Jan 27, 2026, 9:33 PM

#

onyx tangle When you passed the view as an argument, did you also include the view in the di...

I'm pretty sure I tried that... I had it under this commit: https://github.com/dagger/dagger/blob/3ee8da3aa724ffeca12aa7de8b73b33361f4f54a/core/schema/query.go#L344

#

    newID := dagql.CurrentID(ctx).
        WithArgument(call.NewArgument(
            "expectedView",
            call.NewLiteralString(uglyID),
            //call.NewLiteralString(s.srv.View.String()),
            false,
        )).
        WithDigest(hashutil.HashStrings(
            // dgst.String(),
            fmt.Sprintf("%v", rand.Float64()),
            s.srv.View.String(),       <-- I had this in here, even before I started tossing in random values :)
        ))
    ctxDagOp := dagql.ContextWithID(ctx, newID)

#

I think part of the issue I was running into is when I'm inside the dagop, it always points to the same server that's configured under the worker: https://github.com/dagger/dagger/blob/3ee8da3aa724ffeca12aa7de8b73b33361f4f54a/engine/buildkit/worker.go#L113

onyx tangle Jan 28, 2026, 8:59 PM

#

dull portal I think part of the issue I was running into is when I'm inside the dagop, it al...

What happens if you A) consistently use s.srv rather than CurrentDagqlServer(ctx) and B) Force a sync evaluation of the dagop (i.e. eval the LLB) inside the implementation?

#

Trying to plumb context through the buildkit solver in a way that actually works consistently is essentially impossible (which is why we still get random CI flakes about sessions missing), but if you do it all synchronously it might avoid the problem? Also shouldn't be a regression since we were implicitly evaling before this anyways I think

#

Side note: I've been seeing evaling spans in cloud traces: https://github.com/sipsma/dagger/blob/2408141c2463e4c8e0685cd40e457c5132fcc109/core/container.go#L1865-L1865, was that left in by accident?

GitHub

dagger/core/container.go at 2408141c2463e4c8e0685cd40e457c5132fcc10...

A portable devkit for CI/CD pipelines. Contribute to sipsma/dagger development by creating an account on GitHub.

dull portal Jan 28, 2026, 9:02 PM

#

onyx tangle Side note: I've been seeing `evaling` spans in cloud traces: https://github.com/...

hmmm, I think I put that in with the telemetry.Internal() flag to hide it.

onyx tangle Jan 28, 2026, 9:02 PM

#

dull portal hmmm, I think I put that in with the `telemetry.Internal()` flag to hide it.

It shows up still, for whatever reason

#

at least I've seen it

dull portal Jan 28, 2026, 9:03 PM

#

I can try removing it, and see if the idtui golden tests pass.

dull portal Jan 28, 2026, 9:05 PM

#

onyx tangle What happens if you A) consistently use `s.srv` rather than `CurrentDagqlServer(...

I was initially just using s.srv, but never tried the force sync. I'll give it a try.

onyx tangle Jan 28, 2026, 9:10 PM

#

dull portal I can try removing it, and see if the idtui golden tests pass.

example here: https://dagger.cloud/dagger/checks/github.com/dagger/dagger@fab82885ee35cc6d3809356a1d18e369c5a80389?check=testSplit%3AtestModules&listen=c2a055d335f334a5&listen=dfb6991b18580f2e&listen=739ec16c9a5b0b67&listen=5d389ab1f69f5d12&listen=8b3e7753176b4be8&listen=a6b1ae33706dde29&listen=82db9ca811eeeb07&listen=801ccb19ef41da05&listen=2d8220310a96a43f&listen=327bfb698fb00e8b&listen=88f2a1743ec98bae#0d2fa20ab3e27f12

Dagger Cloud

Browse and visualize Dagger traces.

dull portal Jan 28, 2026, 9:10 PM

#

oh yeah.... that totally shouldn't be there 🙁

dull portal Jan 28, 2026, 10:01 PM

#

I managed to compare the contents of the schema, what I expect on the left (outside the dagop), and what I'm getting (inside the dagop).

#

somehow there's a test that's changing the schema? 😕

#

and this is produced using s.srv and an eval done under:

    f, err := DagOpFile[*core.Query, schemaJSONArgs](ctxDagOp, s.srv, parent.Self(), args, nil, WithStaticPath[*core.Query, schemaJSONArgs](schemaJSONFilename))
    if err != nil {
        return inst, err
    }

    if _, err := f.Evaluate(ctx); err != nil {        <--------------------------- here
        return inst, err
    }

    dop, err := dagql.NewObjectResultForID[*core.File](f, s.srv, newID)
    if err != nil {
        return inst, err
    }
    return dop, nil

dull portal Jan 28, 2026, 11:07 PM

#

onyx tangle Trying to plumb context through the buildkit solver in a way that actually works...

so given it's "essentially impossible", what are your thoughts in the way I have it implemented -- storing the contents in a global map, and simply passing a key (the hash of the content), via the dagql call?

If you think that's ok, I can clean it up a bit, and possibly add some sort of reference counting to delete the contents once the f.Evaluate(ctx) returns.

onyx tangle Jan 28, 2026, 11:12 PM

#

dull portal so given it's "essentially impossible", what are your thoughts in the way I have...

No we can't do that. It's too hacky and we're gonna regret it immediately.

Is your latest attempt pushed to your branch? I can take a look to see what's going wrong

dull portal Jan 28, 2026, 11:13 PM

#

one sec.... I'll push the version with the eval call

#

it contains a call to panic() whenever it detects a missmatch under: https://github.com/dagger/dagger/pull/11733/changes/9f6b81290247fd9b9fa61d3f5f4cb3409cc5fe3e#diff-a2bab836c6f25d97c19d9ec4f5cbe987a91b5584bfa04e3488e7728d08be10f4R162-R165

which I can then pull up a diff of the schema by running:

docker logs dagger-engine.dev |& grep 'ACB want' > /tmp/want
docker logs dagger-engine.dev |& grep 'ACB got' > /tmp/got
meld <(cat /tmp/got | awk '{print $3}' | base64 -d | jq .) <(cat /tmp/want | awk '{print $3}' | base64 -d | jq .)

#

we're gonna regret it immediately.
😂

onyx tangle Jan 28, 2026, 11:46 PM

#

dull portal > we're gonna regret it immediately. 😂

Yeah to be clear I'm glad you tried it to verify this is all possible to get working 😄 I just know in my bones that it would become a headache quickly

#

(looking at what you pushed now)

dull portal Jan 28, 2026, 11:48 PM

#

I snuk in a RangeSorted function, so the fields should be produced in a stable order now,

onyx tangle Jan 28, 2026, 11:49 PM

#

dull portal I snuk in a `RangeSorted` function, so the fields should be produced in a stable...

What were you running to repro the bug? Just anything with a module?

dull portal Jan 28, 2026, 11:49 PM

#

./hack/with-dev go test -count=1 -v -run TestModule/TestContextDirectory ./core/integration/ hit it pretty quickly.

onyx tangle Jan 29, 2026, 12:52 AM

#

dull portal `./hack/with-dev go test -count=1 -v -run TestModule/TestContextDirectory ./core...

Found one possibility: https://github.com/sipsma/dagger/blob/21f887838c3897cee576f161b795eb027b6ac812/core/schema/wrapper.go#L135-L135

The CacheMap for the dagop is set to return that CacheKey, but it's just a mix of the self digest (which is Query in this case, so nothing) and args digest (which is also not helpful)

#

I'm profoundly looking forward to the day where we can get rid of those extra layer of cache keys and all we care about is the dagql cache key...

#

but ironically have to get over this hump first 😄

#

I'm checking if my theory is correct, just an fyi in case you're looking around still too

onyx tangle Jan 29, 2026, 1:20 AM

#

eh okay, that may be a factor but not the whole story, still digging

#

seeing if there's a way to fix the thing with the server being constant in the worker

onyx tangle Jan 29, 2026, 1:37 AM

#

okay that dagql server field on worker is actually an interface that gets you the "right" dagql server, problem is that it gets you the right one based on your client id.

that doesn't quite work in the way we call it though: https://github.com/sipsma/dagger/blob/21f887838c3897cee576f161b795eb027b6ac812/core/moddeps.go#L216-L216

onyx tangle Jan 29, 2026, 2:28 AM

#

I think i got something working. I pushed it to your branch as a separate commit.

#

It basically just passes the schema as an arg in the ID. I'm attempting to do it in such a way that the ID w/ the schema attached stays internal only and doesn't cross IPC/network boundaries

#

TestModule was passing for me, we'll see if everything else is okay with it

#

If it works I think that's probably good enough to hold us over. It should become much more trivial once the solver is out of the way

dull portal Jan 29, 2026, 8:23 PM

#

I see you're just passing the contents along over the args -- if that works and you're not worried about the memory overhead, then maybe that's fine?

The current code has a comment under https://github.com/sipsma/dagger/blob/21f887838c3897cee576f161b795eb027b6ac812/core/schema/query.go#L146-L151

    // LLB marshalling takes up too much memory when file ops have a ton of contents, so we still go through
    // the blob source for now simply to avoid that.

which made it seem like we wanted to avoid passing large data around, but perhaps that's only a LLB limitation and totally fine for dagops? or at least fine for now and we'll just accept the extra dagop serialization overhead?

onyx tangle Jan 29, 2026, 8:24 PM

#

dull portal I see you're just passing the contents along over the args -- if that works and ...

Right that's why I mentioned above that this approach does it in such a way the schema attached stays internal only and doesn't cross IPC/network boundaries

dull portal Jan 29, 2026, 8:25 PM

#

ah, gotcha!

#> so I started work on removing the