@gerhard @Erik Sipsma @jedevc @matipan | Dagger | Page 1

While I do not think that we should be focusing exclusively on "engine stability", our CI reliability is not currently at a point where we can be confident in the test results. About 33% of runs fail, mostly for reasons unrelated to the changes being made.

Here is a bird's-eye view of the checks on main: many commits have failing checks (see screenshot).

Since our CI switched to v0.11.8, 10 out of 15 runs have passed, a 66% success rate. The first of the 5 failures was a genuine mistake, and the remaining 4 are known issues (most recent failure first):

We are also seeing genuine intermittent failures in non-Engine test suites, such as https://dagger.cloud/dagger/traces/6ee45731fe7aa860bcf48883866eac21?span=8cea5c72281296b2

The truth is that the team is shifting focus to the v0.12.0 release, and the current state of main no longer allows us to ship a v0.11 patch release. We could create a new release branch, start backporting from main and then release a new patch, but this seems like a lot of work. Most likely, we will end up releasing the latest fixes, such as https://github.com/dagger/dagger/pull/7717, in v0.12.0.

As a change of scenery, we can resume packaging & prod architecture work, and accept that our CI is likely to remain at the current stage - 66% reliable with a run duration of ~20 mins
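To put the cost of staying at this level in rough terms, here is a back-of-the-envelope sketch in Python. The run counts and the ~20 min duration are the figures quoted above. The retry model (each re-run passes independently with the same probability) is purely an assumption for illustration, not something we have measured. Note that 10/15 rounds to 67%, which is the same figure quoted as 66% above.

```python
# Figures quoted in this thread.
total_runs = 15
passed_runs = 10
known_issue_failures = 4  # failures attributed to known issues
run_minutes = 20          # approximate duration of a single run

pass_rate = passed_runs / total_runs
known_issue_rate = known_issue_failures / total_runs

# Assumption (not measured): every re-run is independent and passes with the
# same probability, so the number of attempts until the first green run is
# geometric with mean 1 / pass_rate = total_runs / passed_runs.
expected_attempts = total_runs / passed_runs
expected_minutes = expected_attempts * run_minutes

print(f"pass rate: {pass_rate:.0%}")
print(f"runs failing on known issues: {known_issue_rate:.0%}")
print(f"expected attempts until green: {expected_attempts}")
print(f"expected minutes until green: {expected_minutes}")
```

Under that assumption, a change needs 1.5 runs on average to go green, so roughly 30 minutes of CI time instead of 20.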
