0.20.6 & constructor args | Dagger | Page 1

gentle cradle Apr 30, 2026, 4:24 PM

#

🧵

#

@heady haven note that if 0.20.5 didn't have the issue, it's a much narrower set of possible causes (workspace-plumbing was already merged in 0.20.5)

#

cc @daring sapphire @ebon brook

pallid matrix Apr 30, 2026, 4:27 PM

#

I swore I reproduced this locally on my box, but when I retried it things worked. It's possible I repro'd wrong, but the CI running in cluster definitely didn't work. I've been running around and fighting flux, so I haven't verified in-cluster yet

daring sapphire Apr 30, 2026, 4:27 PM

#

@pallid matrix yeah if you can provide a repro that would help us tons! I'll try it myself and see if i can repro

ebon brook Apr 30, 2026, 4:28 PM

#

I can't repro in v0.20.6 FWIW

#

I see the constructor args in --help

pallid matrix Apr 30, 2026, 4:42 PM

#

Ok, I have repro'd in our cluster. I'm not using --help in cluster though. I'm trying to run the command dagger call <args> all, and the initialization args that worked in 20.5 aren't recognized in 20.6

#

Could this be incompatibility between client 20.5 and engine 20.6? That is a difference in cluster vs my local.

ebon brook Apr 30, 2026, 4:46 PM

#

pallid matrix Could this be incompatibility between client 20.5 and engine 20.6? That is a di...

could be. Just verified constructor arguments work with both v0.20.6 client and server. Checking with v0.20.5 client now

#

just to double check you're calling your function like dagger --constructor-arg value call $myfn correct?

ebon brook Apr 30, 2026, 4:48 PM

#

ebon brook could be. Just verified constructor arguments work with both v0.20.6 client and ...

yep, seems like it breaks with client v0.20.5 and server v0.20.6

#

I'd upgrade since both v0.20.5 and v0.20.4 were bumpy releases @pallid matrix

#

they had a few regressions from the workspaces work which were fixed in v0.20.6

gentle cradle Apr 30, 2026, 4:49 PM

#

@pallid matrix Do you mind sharing the actual names of the args? Wondering if there's a naming conflict maybe 🤔

ebon brook Apr 30, 2026, 4:50 PM

#

gentle cradle @pallid matrix Do you mind sharing the actual names of the args? Wonderi...

It doesn't seem to. My arg is called --api-src

gentle cradle Apr 30, 2026, 4:51 PM

#

(sorry I meant @pallid matrix )

pallid matrix Apr 30, 2026, 5:22 PM

#

Mine was like --pipeline-repositry

ebon brook Apr 30, 2026, 5:26 PM

#

in any case I was able to repro @gentle cradle

#

bumping the client to match v0.20.6 in the engine fixes it
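A minimal sketch of guarding against this kind of client/engine skew in CI, assuming you can obtain both version strings yourself. The helper name `versions_match` is hypothetical and not part of Dagger.

```shell
# versions_match CLIENT ENGINE
# Succeeds only when both version strings are identical after dropping an
# optional leading "v", so "v0.20.6" and "0.20.6" compare equal.
# Hypothetical helper for illustration; not a Dagger command.
versions_match() {
  [ "${1#v}" = "${2#v}" ]
}
```

For example, `versions_match "$client_version" "$engine_version" || exit 1` near the top of a CI script, where both variables are placeholders you would fill in from `dagger version` and from your engine deployment.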

#

20.4 and 20.5 were bumpy releases and we accidentally introduced some regressions there 😬

pallid matrix Apr 30, 2026, 5:37 PM

#

will 20.6 client work with a 20.6 engine where a module has asked for like 20.3 functionality?

ebon brook Apr 30, 2026, 6:06 PM

#

pallid matrix will 20.6 client work with a 20.6 engine where a module has asked for like 20.3 ...

yes, we do have backwards compatibility

#

you should be able to verify it locally though

pallid matrix Apr 30, 2026, 6:07 PM

#

Ah, good point. It has worked locally with older engine defs in my local env.

pallid matrix May 1, 2026, 4:59 PM

#

I think I'm seeing a pretty massive performance degradation in 0.20.6. I have a set of modules with tests, and a CI job that runs through all tests in a kube cluster with dagger-engine deployed (all tests use that common deployed engine). In 0.20.5, I could run through the entire suite in about 30m if cache was stale, <10m if cache existed. Sometimes as fast as 3-4m. Since upgrading to 0.20.6, I can't get a run through the suite at all as it takes >1 hr and times out in our CI. Worse, I would expect to see some benefits of caching if I re-run the job, but I don't seem to, as even a re-run takes >1 hr and fails.

Am I missing something in config related to cache or performance changes in helm charts maybe?

ebon brook May 1, 2026, 6:12 PM

#

pallid matrix I think I'm seeing a pretty massive performance degradation in 0.20.6. I have a...

cc @woeful stream and @past stirrup 🙏

gentle cradle May 1, 2026, 6:14 PM

#

pallid matrix I think I'm seeing a pretty massive performance degradation in 0.20.6. I have a...

Can you share the command used to trigger the degraded test run?

#

(Not for repro, but looking for clues as to which codepaths are activated, that might be responsible for the regression)

pallid matrix May 1, 2026, 6:21 PM

#

I have a test script that does 2 things:

pushd module/tests
dagger install --progress=report ..
dagger call --progress=dots all
popd

And iterates over all our modules. What I have noticed is that when I run this locally, all of the module tests' dagger.json and go.[mod|sum] files are getting updated (I noticed changes in my local git), whereas before that did not happen. Currently, most modules have a dagger version of 0.20.3 in their dagger.json.

Given the changes I'm noticing in local files, it feels like more is occurring in this process than used to.

pallid matrix May 1, 2026, 6:50 PM

#

I split up our CI run into multiple separate jobs, 1 for each module, and a single module's test suite is taking between 7-15 minutes. Even modules where the testing complexity is low.

gentle cradle May 1, 2026, 6:52 PM

#

@pallid matrix any chance you could share a dagger cloud trace URL? That would help enormously

pallid matrix May 1, 2026, 6:52 PM

#

Unfortunately not. It's all run local

#

My gut based upon what I'm seeing speed-wise locally is that dagger install step is updating modules and dagger.json, and those dep updates prevent cache hits and overall slows things down.

gentle cradle May 1, 2026, 6:57 PM

#

pallid matrix Unfortunately not. It's all run local

You can still export traces to dagger cloud even when running locally

#

Oh wait you dagger install on each run?

pallid matrix May 1, 2026, 6:58 PM

#

Yes, for each module's tests to ensure it is running against the latest module version.

gentle cradle May 1, 2026, 7:00 PM

#

So these are tests for dagger modules? If so, wouldn't you want each module's tests to live inside that module's repo?

pallid matrix May 1, 2026, 7:02 PM

#

They do. We have a repo, and each module is in a dir in that repo with its tests:

<module>/tests

The tests are a separate dagger module that uses the module one level up (..). I've run across code changes in the module (API changes usually) not being recognized in the tests unless I tell dagger to go update the dep (i.e. the module itself). I'm currently doing that with dagger install ..

#

This is new also. In my parallel job invocation, I got this failure:

ERROR: Job failed (system failure): pod "dagger/runner-dgmlauop4-project-474-concurrent-2-thv3idzy" is disrupted: reason "EvictionByEvictionAPI", message "Eviction API: evicting"

Never seen that before either.

#

I'm also seeing really high load on the engine.

#

If a trace would help, how do I send one to dagger cloud from my local setup?

gentle cradle May 1, 2026, 7:06 PM

#

pallid matrix They do. We have a repo, and each module is in a dir in that repo with its...

So these are local deps? module A imports module B in the same repo?

pallid matrix May 1, 2026, 7:06 PM

#

gentle cradle So these are local deps? module A imports module B *in the same repo*?

Correct.

gentle cradle May 1, 2026, 7:07 PM

#

pallid matrix If a trace would help, how do I send one to dagger cloud from my local setup?

dagger login -> once logged in, it happens automatically. You'll see a trace URL. You can also press w from the TUI, or call dagger -w

gentle cradle May 1, 2026, 7:07 PM

#

pallid matrix Correct.

So these are never pinned, you always get A and B for the same snapshot of the repo. So you absolutely do not need to run dagger install each time

#

(probably not the root cause of this regression, but you never know)

#

Generally running dagger install at runtime is a bad idea and indicative of a problem somewhere. In some cases, the problem might be on our side, i.e. you actually have no choice but to do it that way as a workaround for a limitation of dagger. Hard to tell without seeing your code
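A minimal sketch of the test loop with the per-run install step dropped, as suggested above. It assumes the `<module>/tests` layout described in the thread and that each test module already lists its parent module as a local dependency in its dagger.json. The function name `run_module_tests` is hypothetical.

```shell
# run_module_tests ROOT
# For every <module>/tests directory directly under ROOT, run that test
# module's "all" function. Returns non-zero if any module's tests fail.
# Note: root, failed and tests_dir are plain (global) shell variables.
run_module_tests() {
  root="${1:-.}"
  failed=0
  for tests_dir in "$root"/*/tests; do
    # Skip the unexpanded glob when no module has a tests directory.
    [ -d "$tests_dir" ] || continue
    echo "==> $tests_dir"
    # The subshell keeps the caller's working directory unchanged,
    # which replaces the pushd/popd pair from the original script.
    ( cd "$tests_dir" && dagger call --progress=dots all ) || failed=1
  done
  return "$failed"
}
```

Calling `run_module_tests .` from the repo root would run each suite in turn. Because nothing in the loop rewrites dagger.json or go.mod, a clean `git status` afterwards is one quick way to confirm that no dependency updates happened during the run.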

pallid matrix May 1, 2026, 7:12 PM

#

I can take out the install step and see if:
A) it solves the speed issue
B) If no issues arise down the line

If that solves the problem and no issues pop up, I'm happy to chalk this up to PEBKAC. 😄

pallid matrix May 1, 2026, 7:13 PM

#

gentle cradle So these are never pinned, you always get A and B for the same snapshot of the r...

That is what I thought, but I do recall having issues at one point with changes not making it to the test module. That was local devel though, where I probably had to do a develop/install because I had local generated code.

woeful stream May 1, 2026, 7:19 PM

#

pallid matrix I have a test script that does 2 things: `pushd module/tests dagger install --p...

Hey, if you could have a self contained repro that would greatly help tracking it down 🙏 Something showing the regression as a standalone between 0.20.5 and .6

pallid matrix May 1, 2026, 7:33 PM

#

I'll try to see if I can create one. So far, the dagger install step does not seem to be a cause. I am seeing the engine under heavy load and I'm not sure why. It could be that is the issue.

pallid matrix May 4, 2026, 1:43 PM

#

Hrm, I'm no longer sure there is a performance degradation in 0.20.6. I reverted to 0.20.5 and ran the parallel jobs I ran in 0.20.6, and it does not do well at all. 5/15 pass, with the failures being various timeouts/broken pipes/etc.

We're running dagger as a statefulset behind a kubernetes service with cache on ebs gp3. There are no requests/limits on the dagger-engine helm chart we apply, and the engine should be on its own node (aside from cluster required pods like istio). I discovered that even though limits were not set, defaults of requests 100m/256Mi and limits of 500m/1Gi were applied. I bumped those to 3000m/5Gi and 3500m/10Gi and I saw the benefits of caching again (the 5 prior jobs completed almost instantly), but the engine is struggling under the load from the past 10 failed jobs.

The limits increase test was done under 0.20.5 and I still need to do the same with 0.20.6. Is there a resource sizing guide for the dagger engine? I would've thought 3 CPUs would be more performant running 10 jobs in parallel. How do you size the resources for the engine?
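A minimal sketch of how the requests/limits actually applied to the engine pod could be inspected, since defaults can appear even when the chart values set none (for example from a LimitRange in the namespace or from chart defaults; which one applied here is not stated in the thread). The function name `show_engine_resources` and the pod name in the usage note are hypothetical.

```shell
# show_engine_resources NAMESPACE POD
# Prints one line per container of POD: the container name, a tab, and the
# resources (requests/limits) that Kubernetes recorded in the pod spec.
show_engine_resources() {
  kubectl get pod "$2" --namespace "$1" \
    --output jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
}
```

For example, `show_engine_resources dagger dagger-engine-0`, substituting the namespace and pod name of your own engine statefulset.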

woeful stream May 4, 2026, 5:23 PM

#

pallid matrix Hrm, I'm no longer sure there is a performance degradation in 0.20.6. I reverte...

Could you try on 0.20.3 ? My guess is that we might have introduced something on 0.20.4 🙏

pallid matrix May 4, 2026, 5:48 PM

#

With the original resource limits (100/500, 256/1) or the expanded limits?

woeful stream May 4, 2026, 6:16 PM

#

pallid matrix With the original resource limits (100/500, 256/1) or the expanded limits?

As you want, wouldn't be surprised for it to work with the original resources 😇

pallid matrix May 4, 2026, 8:40 PM

#

0.20.3 is performing much better. I dropped to the original resources (100/500, etc) and 13/15 passed and 2 failed in about 30m runtime. The runtime is looking like it would be similar to when I just had a single job iterating through all the module tests one by one. Once this finishes I'll try again and see if caches hit and failures are retried successfully.

I didn't see the load in the dagger engine pod (measured via uptime) hit as high a number as it did in 0.20.5 or 0.20.6.

gentle cradle May 4, 2026, 8:57 PM

#

pallid matrix 0.20.3 is performing much better. I dropped to the original resources (100/500,...

We have a theory. If you had a way to send us a before / after trace on dagger cloud, that would be super helpful

pallid matrix May 4, 2026, 9:02 PM

#

I can probably open things up to do that. 0.20.3 and which other version?

You mentioned i just have to do a dagger login to send a trace. I assume i need an account or something?

gentle cradle May 4, 2026, 9:08 PM

#

pallid matrix I can probably open things up to do that. 0.20.3 and which other version? You m...

Yes. If you don't have an account, just call dagger -w; it will send you to the web UI, with an option to set one up if needed (also works by pressing w while inside the TUI)

#

All traces are private by default, so you can safely share a URL, only your org and Dagger employees with support "super-admin" access will be able to access the trace

pallid matrix May 5, 2026, 11:50 AM

#

Do you want a trace from a single one of the parallel jobs, traces for all the parallel jobs, or a trace of when I ran all of the tests in one sequential run?

pallid matrix May 5, 2026, 2:59 PM

#

@gentle cradle ^^

gentle cradle May 5, 2026, 3:40 PM

#

pallid matrix Do you want a trace from a single one of the parallel jobs, traces for all the p...

Anything works as long as there's a slower one and a faster one, and they're the same command

gentle cradle May 6, 2026, 6:53 AM

#

Update: I'm following my educated guess on a possible root cause, while waiting for the traces.

If anyone has a theory on the root cause, or is already working on a fix, please let me know!

pallid matrix May 6, 2026, 12:00 PM

#

I created some traces, but the performance wasn't the same as what I reported. Trying again this morning with the original process

pallid matrix May 6, 2026, 1:29 PM

#

Ok, I've got 2 series of traces, one using 20.6 and another using 20.3. I don't see a way to share the set, only specific traces though.

pallid matrix May 7, 2026, 11:54 AM

#

@gentle cradle ^^

gentle cradle May 7, 2026, 2:27 PM

#

pallid matrix Ok, I've got 2 series of traces, one using 20.6 and another using 20.3. I don't...

Yes, individual trace URLs are the available mechanism for sharing at the moment

pallid matrix May 7, 2026, 7:14 PM

#

#

#

@gentle cradle ^^

woeful stream May 8, 2026, 9:25 PM

#

pallid matrix 0.20.3 is performing much better. I dropped to the original resources (100/500,...

Erik has made a PR that could fix one of the root causes: https://github.com/dagger/dagger/pull/13117

pallid matrix May 8, 2026, 9:28 PM

#

@woeful stream Were the traces I provided of any use?

woeful stream May 8, 2026, 9:29 PM

#

pallid matrix @woeful stream Were the traces I provided of any use?

I'll triple check later today or Monday 🙏 We've hit some internal things that we dug into, and I'll correlate later. Thanks for providing those! 😇
