#🚀 v0.12.0 - 9th July 2024
Hello all, kicking off this thread wayy early, since we have a proposed date, and quite a few things in flight before then 😄
So, the idea is we skip this and next week for releases, and aim for v0.12 the week after, to avoid clashes with 4th of July holiday.
main is open for breaking changes at this point I think - so we can start working through some of the drafts in the milestone. I think there's still potentially conversations around what's okay to break, etc, we should avoid breaking things in daggerverse as much as possible.
If we hit an urgent issue that means we have to release a v0.11.9 (hopefully not), the plan is to branch v0.11 off of v0.11.8, and cherry-pick any fixes to there.
cc @hard shoal @toxic finch @gloomy oasis @buoyant ledge @lean nest @noble basin @sullen zephyr @timber spade @lucid birch
So can I do the breaking change on the TS dependencies & drop commonJS support?
As a non ts dev, that's up to you I guess! The thing to help decide is who this is going to break, and what their migration path will look like.
If it's a significant number who will need to take manual action, or the migration path is particularly tricky, it might be worth polling #typescript to see if they have opinions.
Okay, I'll tackle this problem next week then; first I'm focusing on context dir
👍
What's the main motivation for waiting 2 weeks - marketing or engineering?
(not sure which, based on "avoid clashes with 4th of July holiday")
2 weeks is an estimate of how long it will take us to get everything merged & ready for the release. The truth is that we can cut the release anytime, including the 4th of July. The date above is more of an estimate, we can adjust as needed.
This tracks what we think we want: https://github.com/dagger/dagger/milestone/45 . There are a bunch more things in flight here: https://github.com/dagger/dagger/pulls . Pretty sure there are changes coming which are in neither of these two places.
Are there things that you think we absolutely need to have in v0.12.0? If there is anything that you think we can defer post v0.12.0, that is likely to help move the release date closer.
most of the things in the milestone currently are breaking changes - those should be done in v0.12.0, or will have to wait till v0.13.0
I think the second we merged breaking changes into main, we accepted that anything in the milestone that is not yet ready, will have to wait for a follow-up
I don't like keeping main in a paralyzed unreleasable state in the middle of a period of frequent emergency fixes
see
this is the plan for emergency fixes
Are there things that you think we absolutely need to have in v0.12.0? If there is anything that you think we can defer post v0.12.0, that is likely to help move the release date closer.
I would love to have the luxury to answer that question, but it seems my hands are tied really
we can make more releases in the v0.11.X series, we just won't do so directly from main - this will avoid us accidentally pulling in huge changes that might also break legitimately
I don't think we've had to backport for releases in the past. Assume we will need to make at least one more .11 patch release. Are we comfortable actually doing this? The productivity tax is worth it?
i've done a lot of backported releases in the past with buildkit - overhead has generally been fairly minimal, i'm not too worried about it
the alternative seems to me that we merge all of our breaking changes within a tiny window, and try and release immediately after to avoid closing main for too long, which means we have less time for internal testing
-
My marketing perspective is that it's better to release before July 4 than after, if possible, because that date coincides with the sharpest dropoff in engagement in our community. So it will be timed exactly for minimal impact
-
My engineering perspective is that even with a good plan B for patch releases, it's not healthy to keep main unreleasable for too long. So the sooner we get it out of the way, the better.
i'm not entirely convinced about being ready for the 4th - even if we drop everything but the breaking changes from the milestone, i think some of those deserve some careful thought to avoid breaking the daggerverse
To be clear I don't actually care what goes in the milestone.
So I'm not saying we should rush everything in the milestone
right, but we should try and avoid making breaking changes outside of minor releases
everything with the kind/breaking label
some of those are only breaking in theory - like https://github.com/dagger/dagger/pull/7500
we can definitely strip down the release and only do the kind/breaking changes
the reason for setting and coordinating a date is to try and give us time to get the release ready and in a relatively stable state after a large number of recent refactorings
not necessarily for making a large marketing splash
at least from my point of view
OK. Then I'll leave it alone
My 2c, in the future maybe we should consider flipping it: merge breaking changes into a breaking-changes branch, and integrate there, then merge at the last minute for release. That way main is releasable at (almost) all times
that's kind of what we're doing, except they're all just individual prs
that seems to be a major difference, since we're measuring time to release those PRs in weeks
if we have a separate long-lived branch, now we need to manage that - what release does that go in? who tests that? who keeps it up-to-date with main?
i would strongly push back on having more than one long lived branch to work on
Yeah that's fair. I just think right now we're blind to the tradeoff this entails: we have mergeable PRs that are stuck in limbo for a long time, because they have a breaking change, and our process for managing those is "park all the breaking PRs until we are ready for a 3-week long stressful sprint to a minor release, with a not-completely-defined process"
if all of the prs are merged earlier, and the respective owners have the cycles to get those done sooner, we can release sooner than the suggested date
It kind of looks like we're holding on some of these PRs not because they're not ready, but because we're scared of merging a breaking change into main too soon
i mean, sort of, it's good to batch these changes together
if we just merge every breaking change as soon as it's ready, we lose users' trust, it makes it so much harder to convince users to upgrade to new versions
making upgrading a predictable and easy experience (even if that takes more engineering time) is definitely worth it in my experience
a badly executed release ends up with us attempting to recover for weeks after, our users frustrated, and everyone in critical bug-fixing mode.
That's fine with me
I'm just pointing out that you're already maintaining a virtual release branch
@hard shoal most of the api breaking changes are related to your work - do you feel confident in getting those changes in this week?
The sum of all PRs that you want to batch together as soon as the window opens for a breaking release.
I think you should consider frontloading some of that "batch it together" work to keep the release window from growing too large (3 weeks is a long time)
Also didn't we just merge a breaking change last week, thereby triggering this whole machinery without having any control over it?
i don't think the cadence of a release every 3 weeks is unreasonable - we've settled into a cadence of 1-2 weeks recently, but we have done longer and we have done shorter
given that each of these is a reasonably large commitment of time (i'm working to automate more, but until then), i'm actually more convinced that a slightly longer period than we currently have would be better
From my point of view:
- Hey we just merged a breaking change
- So we should probably cut a 0.12 release very soon. We can't make patch releases until we do
- Also we should batch together as many breaking changes as possible
- We have a half dozen of those pending, but it will be another 2 weeks to get them all ready to go
- When would you like to release 0.12?
--> Does it matter when I want to release 0.12?
we have merged breaking changes between clients and engines before - the changes in the milestone potentially break engine<->module compatibility - there's an impact on the daggerverse here, i don't think we should just rush into that
i need to head off for the day now, will pick this up tomorrow
Which one(s) break engine-module compat?
this changes an api served to modules
if older modules are connected to newer engines (like a dependency from the daggerverse), you will get stuck
(Non urgent topic, please head off if needed)
captured some of this in https://github.com/dagger/dagger/issues/7640
Ah okay, thanks! Those are the ones that are exceptionally painful for users since they can get stuck waiting on transitive deps to update
Ah nice will take a look as soon as possible
yeah, i want to avoid having users pin to ancient versions of dagger and not upgrading
since this then hurts our ability to get users to adopt new functionality/etc when we add it
Yep, it’s a python 2->3 type situation
one day i think we should move towards a release branch setup, like containerd/buildkit/etc - i think something like that is super important for giving users reassurance about long term support, but we have too much in flux to think seriously about that right now, imo
To be clear this issue (module compatibility, version checks etc) is infinitely more important than a 0.12 release date.
I'm not sure if anything I'm saying about the release date might negatively impact module compat, but if it does, module compat wins
@timber spade looks like you already have a PR in that's worthy of a patch release -
Yes, I'm catching up on the details in this thread, previously just read the part about incoming module-engine incompat.
Provided main in its current state is releasable, I'll go ahead and do that and set up a separate release thread.
Provided main in its current state is releasable, I'll go ahead and do that and set up a separate release thread.
I think part of the thread above is caused by the fact that main in its current state is not patch-releasable, because we have at least one breaking change. But don't quote me on that, I don't know for sure what the state is.
Yep I was just double checking and there's 2 commits that are breaking: https://github.com/dagger/dagger/commit/87a0f1a74a763c2bdf6c2a2ab99fc39cda96917a and https://github.com/dagger/dagger/commit/fef32cb3d04ff26ca9ef5e80e6b5dba5634bd1b0
So will need to go with the separate branch approach I think. Opening a separate release thread now to figure that out
Btw if there's anything you need a review on tonight ping me, I'm gonna work late
@ocean egret @dapper socket for daggerverse - since we expect to introduce breaking changes, do we have an easy way to "check" the published modules against a to-be-released version of dagger
i think the idea is to avoid breaking everything, but if we can catch things that we do, it would be great to get advance warning, and downstream any contributions, or maybe work out how to avoid making those changes in the first place
Would these breaking changes be caught by something like a dagger functions <module> call or similar?
they should be, yes!
if i'm allowed to ask for the moon, what would be even cooler would be a way to check dagger main against the daggerverse on a nightly basis, and be able to track and graph increased/decreased rates of failure
but obviously one-off would probably be fine here
Sounds doable! Would we only care about the latest version of each module or do we want to check against all versions? Or maybe we care to check only module versions that were built with dagger v0.x.y and above?
oof those are all good questions
all versions is "interesting", but not sure, i think there's likely to be a lot of drift, etc, at least at this stage in the ecosystem - might be worth trying? i think the latest is most important though
Built with v0.x.y is a good constraint to have, but we haven't broken any module compatibility since we launched modules in v0.10.0
so we could have that as the lowest minimum for now
Sounds good. I'll get to it!
cheeeeers 🎉 should hopefully make our release (and all future releases) much smoother
we have a way to check that @round geode @ocean egret it's just that it's not automated
let's go to lounge and I can explain how we're doing it
👀
To resurrect the topic of breaking changes and their relationship to releases: if we decided to be more strict about avoiding breaking changes, could we? And what price would we pay?
surfacing something from a dm with @hard shoal: tomorrow morning i'm going to look at trying to have dynamic schemas based on the engineVersion - so the breaking changes avoid breaking the daggerverse
this is possible - but generally we want to make breaking changes for things like APIs where the previous decision turns out to not be the right one.
- maybe there's an inconsistency in an old api, and we want to update those older ones as we've seen how people use it in the wild (https://github.com/dagger/dagger/pull/7293)
- maybe a bunch of methods that we added are no longer needed, and we've gone through the deprecation process, and want to take them out now (https://github.com/dagger/dagger/pull/6934)
- we want to change defaults in a way that if people were relying on this, it breaks users (https://github.com/dagger/dagger/pull/7136)
there are other approaches - buildkit has never broken backwards compat
however, there are big costs to this:
- innovation is restricted - every single change has to be made thinking about compatibility, so you can't just "fix a bug", sometimes you need multiple fixes, for all the different variations of client/engine combos
- codebase becomes much more complicated - all of these edge cases take handling, which both makes it harder to make simple changes, but also prevents outside contributions
personally, i think the "never break" is the right approach, but if we paid those costs, we'd be stuck with some weird api things indefinitely, and we'd slow engineering pace down - maybe one day, but i'd like to push that into the future until it's unavoidable
The part that I keep thinking about, is that a lot of these breaking changes are very superficial - changes in the graphql schema plus a little plumbing. If we could find a way to encapsulate these changes as different versions of an "internal module" - as opposed to special cases mixed everywhere in the code - then perhaps we could have our cake and eat it too?
That wouldn't make all breaking changes go away, but it would allow us to tier them in a 80/20 way maybe
If core could become a "proper" module, that's definitely doable
I guess we just gotta work out how to do it finally 😛
To summarize:
- I 100% empathize with the idea that we still need freedom to break things - or progress will grind to a halt which could kill us
- On the other hand, I also worry that too much breakage, too frequently, can create a "death by 1000 cuts" situation, that could also kill us. Especially when considering the secondary effects, like growing incentives to group breaking changes together (which trains users to not trust minor updates) and rushing breaking changes to keep the release window small (which increases the risk of quality problems)
cc @timber spade food for thought maybe - i wonder if when we move out the llb stuff you were thinking about, if we could have core be a "special" module, but one that speaks serialized Ops (either through graphql/json/etc)
Just a few notes on the impact of these PRs:
- api: align Container.withNewFile signature with Directory.withNewFile
  - High impact. It's a commonly used API, and there's no solution for deprecation because it changes which args are optional or not.
  - Can be temporarily replaced today with Container.newFile and Directory.withNewFile.file to avoid the breaking change, or maintain backwards compatibility with different engine versions (used in our ci module).
  - On a positive note, it's an easy change for users to make and doesn't affect Python.
- api: skip entrypoints by default in withExec
  - Backwards incompatible (because it flips the default), but users can prepare in advance.
  - Can avoid the breaking change by setting skipEntrypoint: true|false today (or stop relying on the entrypoint), as skipEntrypoint will continue to work, just deprecated.
- api: return absolute path on export instead of boolean
  - Very low impact. If there are affected users, they probably don't know the boolean is useless.
- Remove deprecated schema fields
  - Very low impact, as these have been deprecated for quite a while.
- Codegen: Functions that return the Void type should generate a client that only returns an error
  - Affected users are most likely Go users, consuming a function that doesn't return a value, from another module.
  - Almost all core APIs that return Void are considered internal (for SDKs or CLI), since Module.serve may be used here and there.
@hard shoal for 0.12, maybe we punt on Container.withNewFile? I worry that it will be the straw that breaks the camel's back. wdyt?
I'm ok with that.
When would we punt it to?
If we don't change it now, for the reasons we want to, will we ever?
Can be temporarily replaced today with Container.newFile and Directory.withNewFile
I guess we have a one-time card we can play for non-deprecatable breaking changes: a one-time change of the API to remove the withXX style 😛 withExec becomes exec, etc. Then we would get a clean deprecation (at the expense of forcing absolutely everyone to change their code within the deprecation window). Not saying we should go for that nuclear option, but it is there.
If this approach works, we'll only be breaking this in modules that have v0.12 in their engineVersion - all old modules will continue to work
I think it's worth trying that approach if we can - we can get the best of both worlds - we remove the issue in new code, but don't break old module code
So can I do the breaking change on the TS dependencies & drop commonJS support?
How are you planning on dropping support? This just in the code-gen side of things ?
I'm currently testing with the latest image built from main (5231aa7de19266430581da054604ce533cc85898), should I be targeting something different? @round geode
That one should be good yes 🙂
So I've already run the whole recrawl twice and in both cases I haven't seen any errors. I'm syncing with Marcos in a bit for a quick review to see if I'm doing things correctly.
This recrawl is only checking for the latest version of all modules. 566 modules were checked
Bundling this up into a dagger module that does the whole thing and putting that in a workflow dispatch would be fairly straightforward and could be an interesting community call demo
I was looking for errors in the wrong table. We do have quite a few; so far we've found a few modules that are using the Terminal type, which is marked as an invalid type. @round geode I can send over the entire output string of the code generation command, or I can try to build something that processes this and tries to group by error. Which one do you prefer?
It may be way too late for that, but just in case: any chance we could morph the breaking Container.terminal() change into a non-breaking Container.terminal().attach()?
I don't see how we have to break the CLI behavior to get the new code behavior. We could just change the CLI code so that it calls the new attach() when receiving a Terminal return value
Would be one less headache
Out of 566 modules that were scanned, 74 failed which is ~13% error rate roughly
First file contains a summary created by ChatGPT that is supposed to cluster the errors; I haven't checked each one individually yet and it seems to have duplicated the Terminal error, but it might be a good enough proxy. The second is a deduplicated output that contains the error encountered for each module+ref. I validated each of those failures using a dagger functions call as well and I can see that it fails with the same error
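(Editor's sketch, not the crawler that was actually used.) The "74 of 566" figure and the "group by error" idea above can be reproduced with a few lines of Go. The `result` type and its field names are hypothetical, made up for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// result is a hypothetical record of one module check. The field names are
// illustrative and are not taken from the real daggerverse crawler.
type result struct {
	Module string
	Err    string // empty when the check passed
}

// failureRate returns the percentage of failed checks out of total.
func failureRate(failed, total int) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(failed) / float64(total)
}

// groupByError counts failures per distinct error message.
func groupByError(results []result) map[string]int {
	counts := make(map[string]int)
	for _, r := range results {
		if r.Err != "" {
			counts[r.Err]++
		}
	}
	return counts
}

func main() {
	// Numbers from the thread: 74 failures out of 566 scanned modules.
	fmt.Printf("failure rate: %.1f%%\n", failureRate(74, 566))

	// Made-up sample data, only to show the grouping.
	sample := []result{
		{Module: "a", Err: "invalid type Terminal"},
		{Module: "b"},
		{Module: "c", Err: "invalid type Terminal"},
		{Module: "d", Err: "unknown pragma"},
	}
	counts := groupByError(sample)
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output order
	for _, k := range keys {
		fmt.Printf("%d x %s\n", counts[k], k)
	}
}
```

Tracking the rate per nightly run, as suggested earlier in the thread, would only need the `failureRate` value stored alongside a date.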
Yeah this has been on my brain's backburner for a while. The biggest trickiness is that everything I can think of would still result in some API that may end up with backwards incompatible changes. i.e. if core is a true module then it still talks to the engine somehow and that API can break.
It still is most likely worth doing since I'd rather deal with breaking changes to an internal-only API than a user-facing one, but I'm not sure yet whether it actually ends up reducing our maintenance burden overall, which is also the primary source of pain for dealing with version compat across breaking changes in core today.
It's just all hard to say since everything is still fuzzy+hypothetical. Agree that once I experiment with moving off LLB we'll be able to get a better concrete idea on what this would look like in practice.
100% to be clear I'm not talking about fully swappable core modules with a stable public API (which would just shift the problem as you mentioned), I'm picturing more a private, go-only API, something just barely above llb/solve. An architecture that would give us some breathing room by making things more modular internally
Okay, so manually tracking through this, lots of the errors are from changing Container.Terminal to return a Container type (see https://github.com/dagger/dagger/pull/7758)
so we should have this as a changelog entry. However, it's weird that the *Terminal type is now gone.
any thoughts on this? #1254757484057464975 message
if doable we could get that Terminal type back, no breaking change, get our cake and eat it too
Sorry, missed this - I think the ergonomics of the new Terminal make a lot of sense - I would rather users didn't have to add attach into their code as well.
I'm gonna investigate the dynamic module api thing, which is on my todo list today - maybe we can pull that trick here too
Another idea I had was that we could alias Terminal to Container
fyi some of these are failing on recent dagger versions as well - e.g. when we changed the pragma syntax slightly
a bit of hackery later, this is actually super doable: https://github.com/dagger/dagger/pull/7759
the end result will look something like: https://github.com/jedevc/dagger/blob/2d5fab40bca48d228a1cfb41a89e76ed8b4622c3/core/schema/query.go#L60
at schema install time we just need a little check, and we can choose exactly what kind of api we want to serve at that point
for example, this will let us shim in the old Terminal type for old dagger versions, etc
one thing i have thought about here - having dev-... versions in engineVersion really hurts, because we can't compare version numbers elegantly - thankfully, i think this is a bit of an edge case, users are installing from main or dev versions already there, so that's already a bit of no-man's land
at some point, we should switch to go-style pseudo-versions (https://github.com/dagger/dagger/issues/7760)
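(Editor's sketch, not the code in the linked PR.) The "little check at schema install time" could look roughly like the following, assuming plain vMAJOR.MINOR.PATCH strings in engineVersion. The function names and the choice to give unparseable dev-... versions the newest API are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "v0.12.0" into {major, minor, patch}.
// It reports ok=false for anything else, including dev-... versions.
func parseVersion(v string) (parts [3]int, ok bool) {
	fields := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(fields) != 3 {
		return [3]int{}, false
	}
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil || n < 0 {
			return [3]int{}, false
		}
		parts[i] = n
	}
	return parts, true
}

// atLeast reports whether version a is greater than or equal to b.
func atLeast(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return true
}

// useLegacyAPI decides whether a module declaring engineVersion should be
// served the pre-v0.12 API shape (hypothetical helper).
func useLegacyAPI(engineVersion string) bool {
	v, ok := parseVersion(engineVersion)
	if !ok {
		// Assumption: versions we cannot compare (e.g. dev-...) get the
		// newest API, since they are installed from main anyway.
		return false
	}
	return !atLeast(v, [3]int{0, 12, 0})
}

func main() {
	for _, ev := range []string{"v0.11.8", "v0.12.0", "dev-5231aa7"} {
		fmt.Printf("%s -> legacy API: %v\n", ev, useLegacyAPI(ev))
	}
}
```

The dev-... branch is exactly where the comparison stops being elegant; go-style pseudo-versions would embed a base version and timestamp so that such builds sort properly.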
@hard shoal does this seem somewhat reasonable from your POV? we can install different versions of the withFile function implementation depending on what version we detect from the daggerVersion
Interesting! Yeah, we can try it. Currently trying to rebase the PRs, had a lot of conflicts after the testctx PR 🙂
ahh sorry! that was gonna cause a lot of conflicts regardless sadly, but at least now it looks like tests are running a bit faster 🎉
Totally fine, I was looking forward to trying out those changes and it needed to be done 🙂
turns out this is doable, but with much pain 😢 i've ended up having to introduce a new dagql concept of "views", which allow multiple different perspectives on a single server.
without this, you end up with a server for each version of the schema, which means that ids valid in one schema might not be valid in another schema (and so isn't a viable approach)
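(Editor's sketch, not dagql's real implementation.) One way to picture "views" is a single server whose field definitions are tagged with the views they are visible in, so every caller shares the same objects and IDs and only the visible schema differs. All type and view names below are hypothetical:

```go
package main

import "fmt"

// view names a perspective on the schema; the values used below are made up.
type view string

// field is one schema field plus the set of views it is visible in.
// An empty set means "visible in every view".
type field struct {
	Name    string
	Returns string
	Views   map[view]bool
}

// server holds a single set of fields shared by all views, so there is only
// one server (and one ID space) regardless of how many views exist.
type server struct {
	fields []field
}

// lookup returns the definition of a field as seen from view v.
func (s *server) lookup(name string, v view) (field, bool) {
	for _, f := range s.fields {
		if f.Name != name {
			continue
		}
		if len(f.Views) == 0 || f.Views[v] {
			return f, true
		}
	}
	return field{}, false
}

func main() {
	s := &server{fields: []field{
		// Old modules still see terminal returning the Terminal type.
		{Name: "terminal", Returns: "Terminal", Views: map[view]bool{"legacy": true}},
		// New modules see terminal returning a Container.
		{Name: "terminal", Returns: "Container", Views: map[view]bool{"current": true}},
		// Unchanged fields are shared by every view.
		{Name: "withExec", Returns: "Container"},
	}}
	for _, v := range []view{"legacy", "current"} {
		f, _ := s.lookup("terminal", v)
		fmt.Printf("%s view: terminal returns %s\n", v, f.Returns)
	}
}
```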
anyways, i remain very hopeful, it's just... taking time to hack through
Hey, could I get your opinion on https://github.com/dagger/dagger/pull/7794#issuecomment-2199927154 ? 😄
So I can add the breaking changes from our dependencies if that's okay
Did you look at what the update path looks like?
Also, we need to make sure that this doesn't break any ts modules that work today in the daggerverse
Ohhh yeah, I need to check that
hey everyone! Gerhard will be pinging the owners of certain PRs soon to add details into the blog post draft. I apologize for the short notice, but please add your content to the draft tomorrow. Once you add the content, Neela will clean it up.
This will give Solomon enough time to review on Wednesday before the US holiday and PTOs, so we can get it out shortly after the release. Thanks all!
update: think it's almost there, i've also started work on "unbreaking" the already merged breaking changes
technically it should be readable enough if anyone is curious to review, but going to tidy up commit history tomorrow to make it a bit easier (i also want to rebase-merge this one)
👋 any update on this?
Yup, I’m on it, normally nothing on the user end, it just drops support for commonjs but we set it in the runtime
Most modern js is compatible with ES2016 so should be good
I’m still checking and trying some local tests to catch any issues
Hello hello. Given the unfinished design discussions that we want to wrap up before release:
- Container autoprint, and autoprint in general
- TUI "lingering"
- Traces integration into TUI
- Breaking change to export string argument?
Should we start planning for a possible delay in the release?
I'm available today (on phone while traveling) and tomorrow (fully) on west european business hours, to participate in the discussion.
But it seems optimistic that everything is 100% wrapped up across many TZs, and any changes we agree to are shipped and confidently tested by tuesday morning Pacific.
wdyt?
apparently you're good at reading minds 😄
i was gonna ask this exact thing at the end of today, as us folks came online
but yes, i absolutely think we should be delaying - the milestone has 15 open items on it
Yeah, agree on delaying the release too.
Yup agree too!
@round geode Did we have any issue with the Go CI, I keep hitting an issue with:
dagger call -m dev --source=.:default sdk go test
--- FAIL: ExampleGitRepository (2.24s)
panic: input: git.tag.tree.file resolve: failed to update submodules for https://github.com/dagger/dagger: git error: exit status 128
stderr:
fatal: No url found for submodule path 'engine/telemetry/opentelemetry-proto' in .gitmodules
[recovered]
panic: input: git.tag.tree.file resolve: failed to update submodules for https://github.com/dagger/dagger: git error: exit status 128
stderr:
fatal: No url found for submodule path 'engine/telemetry/opentelemetry-proto' in .gitmodules
I can reproduce it locally, but it's also happening in GH: https://github.com/dagger/dagger/actions/runs/9839011704/job/27160149479?pr=7794
Java and Rust are also breaking even if I didn't touch the code in the PR (https://github.com/dagger/dagger/pull/7794/files): https://github.com/dagger/dagger/actions/runs/9839011704/job/27160149690?pr=7794
https://github.com/dagger/dagger/actions/runs/9839011704/job/27160150335?pr=7794
Are we aware of this issue?
I also see the issue on main btw
Container autoprint, and autoprint in general
I am focusing on that one at the moment
hm i think this is actually potentially a weird buildkit issue i've seen before - if you clear your cache, does it start working again?
I mean it fails on the CI & locally
So I'm not sure cache is the case, but yeah locally I start from a clean engine
It also makes test all & testdev fail
Check main job
It's because you deleted the .gitmodules in your PR
Looks like it matches the error I see in tests
dammit, i only half removed the submodule
hate submodules, genuinely full of misery and nastiness
No problem
I approve your PR, merge it whenever you want
hm, still looks like something is wrong
will follow-up
there might be something about https://github.com/moby/buildkit/issues/4260
Approved, only test fails, I'm checking the traces
Ok, it fails but doesn't seem related: https://dagger.cloud/dagger/traces/63386d3cc7c8a36774227cab82e9f7ba?span=b44159eb4412ea7a
Merge it whenever you want btw
@round geode There's still an issue with the CI, different from the one before: https://dagger.cloud/dagger/traces/dc57d7472c38dd0117cc967fca89b36f?span=385f95a830e5e5bb#385f95a830e5e5bb:L62
I'm not able to run go test . -update on my computer, am I missing something?
Oh I guess I need to regenerate the doc
hm i might have missed that one
serves me right for trying to balance 100 prs at a time
Yeah I have the same issue haha
pulled that into https://github.com/dagger/dagger/pull/7843
@timber spade are we doing https://github.com/dagger/dagger/issues/7775 for this release?
@craggy osprey @hard shoal any objections if we delay https://github.com/dagger/dagger/issues/7777 ? It's gonna require some go SDK misery to get it to work
@noble basin what's needed for https://github.com/dagger/dagger/issues/7597 ? is this just going to be updating the RELEASING notes?
@hard shoal do you need a hand on adapting the breaking changes with the views/etc? happy to lend a hand on those if needed
if anyone has a moment to think about https://github.com/dagger/dagger/issues/7760 that would also be wonderful to include in this release - but there's some very fiddly subtlety about ordering, e.g. the practice of comparing to not-yet released versions is probably a bit confusing in this case.
I'm very split on what to do with https://github.com/dagger/dagger/issues/6894 - I would like to implement the fix I suggested with a fancier understanding of max-parallelism to handle nesting, but this sounds like a "medium t-shirt" task potentially. Removing the option is a "x-small", so could be done very easily.
Yeah that'll only take a few minutes, will send out a PR in a few
Adding changie and tracking the changes since the last release. It was on my list for today, but I ran out of time. Will be on it tomorrow since I have 2 more Helm changes which go together for v0.12.0
(my dev environment refuses to start because it gets stuck on resolve image config for docker-image://docker.io/docker/dockerfile:1 despite all other networking working......... 🤬 so once I can actually do anything, will send out a PR in a few)
Lol, I actually went with "Vpn vpn" in this case, which just fixed it (???????)
I was reading some other issues/PRs that made it sound like keeping the .git dir by default could present caching issues perhaps. Not sure if we’ve got opinions from @timber spade yet.
That was a concern in the past and is worth re-checking now (will comment on the issue)
Otherwise the exponential backoff can keep going indefinitely and wait too long.
Fixes #7775
@timber spade re: "legacy" ... not sure where we landed on allowing non-module code to invoke modules?
(not urgent convo, we can postpone this thread until after the release)
@timber spade basically wondering if it could be a path to test dagger modules using native frameworks if e.g. we have bindings to invoke
I still think we want to do it, there was a time period where I thought I was gonna get to it imminently, but 10k other things took priority https://github.com/dagger/dagger/issues/5993
Yep, makes sense. Maybe if we codegen "self" (?) that could be used for testing purposes in vanilla go test
That's a route, yes, I'd be more in favor of something that's sort of a hybrid where we just make it easy to wrap external frameworks by implementing a small module and then still invoke it with dagger, sandboxed, etc. https://github.com/dagger/dagger/issues/6724
Supporting native frameworks is the way to go without a doubt; we're not gonna write custom test frameworks for every SDK language obviously. But just supporting e.g. go test directly without module wrapping reverts us back to requiring certain host dependencies at the right version, etc. (which gets especially painful for languages like Python/TS/etc.), which I'd really like to avoid.
So I think it's worth finding some path that lets you e.g. write _test.go files the same way you normally would, but then you still invoke the tests using dagger. Same idea for other test frameworks in other languages.
And then to solve the problem w/ there being many different test frameworks for languages like python/ts, we make the support for a given test framework implemented as a module (one idea in that issue is a module function that accepts another module as an argument, which is possible today).
Hazy-ish idea still but I think there's something there.
Even if it's neatly wrapped behind a e.g. dagger.Connect() that returns a codegen'd dag? In which case it would be very similar to invoking modules from within a module
Thinking about the invocation part wrapped by dagger ... it's somewhat "lossy" since we'd lose that part of the native experience (for instance, in Go you can click to run one test from vscode, in TS when using jest the tool can do live watch and re-test, etc etc)
Right but when you invoke code in a module we are taking care of all the setup of the language-specific tooling at the right version, etc. in the sandboxes we create for the user. If they are invoking direct from the host then the onus is back on them to do all that
Oh I might have misunderstood what you were saying here, you're saying that when you invoke native code directly on the host, the code itself is just wrapped up into a container rather than executed direct on the host per-se?
Oh yeah. Looks like it’s on the host but it’s not. Similar to when you invoke a dependent module
dag.MyModule would behave like dag.OtherImportedModule — just codegen’d
Just a convenience to pilot the module for testing using native local code, but it’s as if you were issuing dagger calls
I think it would be different than https://github.com/dagger/dagger/issues/5993 then. For that, we just wanted to make modules invokable via a library in the same way "legacy" SDKs enable you to invoke core APIs from a library. I don't think we could support containerizing that in general (the code might just be a compiled library inside some binary that's now executing on the host, can't really just teleport it to the engine in a container in the general case).
I see, yeah I think that would be nice to have for the IDE integration use case you mentioned. I can see how we'd have dag basically just switch behavior depending on if it's already in a dagger container or not (if not in container, ship the test code to the engine and invoke the test there; otherwise just run the test). But I think it would be highly specific to this "test" use-case (it requires the source code is available, etc.)
🙏 thaaaanks!
does anyone need reviews, help, etc with anything on the milestone? happy to trade, i've got a couple prs that could use some eyes 👀
@toxic finch do we need a typescript issue in the milestone for the current deps issue etc you're working on?
Hmm no I think it's okay, I'll finish the package manager config tomorrow
Awesome awesome 😎
@toxic finch, there's an issue with a TypeScript test. It's failing in main:
dagger call -m dev --source=.:default test custom --pkg=./core/integration --run="Module/TypescriptRuntimeDetection"
https://dagger.cloud/dagger/traces/f6e7a847b5fdff1b14b0c3dccf91be05?span=a3d0d1b65eb276af
Error Trace: /app/core/integration/module_typescript_test.go:616
/app/testctx/testctx.go:163
Error: Not equal:
expected: map[string]interface {}{"runtimeDetection":map[string]interface {}{"version":"node@20.15.0"}}
actual : map[string]interface {}{"runtimeDetection":map[string]interface {}{"version":"node@20.15.1"}}
Diff:
--- Expected
+++ Actual
@@ -2,3 +2,3 @@
(string) (len=16) "runtimeDetection": (map[string]interface {}) (len=1) {
- (string) (len=7) "version": (string) (len=12) "node@20.15.0"
+ (string) (len=7) "version": (string) (len=12) "node@20.15.1"
}
Test: TestModule/TestTypescriptRuntimeDetection/should_detect_pinned_lts_node_version
Hey, I fixed the tests. The lts image has changed, and the extra logic needed to accept a sha256 would be pretty tricky because we need an alpine image for the runtime.
I tried a few things but it wasn't good enough IMO:
- Reparse the version to see if a @ is there, meaning we need to add the extra sha after -alpine
- Check if alpine is already present, then ignore it.
However, this hides a lot from the user; I think they expect to just set a version, not an actual docker image version.
I want to get your thoughts on that though, I can make the changes if you think that's okay.
@toxic finch, how about this one? https://dagger.cloud/dagger/traces/b6a88533029fc34be86bdbbed3d12424?span=3f6b973e3e91eec9#d805936f09d01772
the quest to fix a bunch of random dev versions in engineVersion continues 😢 https://github.com/vito/daggerverse/pull/4 (cc @gloomy oasis)
FYI I'll be on a podcast recording from 4pm to 6pm FR (3pm-5pm UK, 7am-9am SF). Then available again for release support.
@round geode @hard shoal yesterday Justin said I should check with both of you on the status of breaking changes in the milestone - what to do about them; whether we should consider dropping some from the release; and what would be consequences if we did.
@hard shoal can i get a review on https://github.com/dagger/dagger/pull/7831 ? the rebasing on this one is not super enjoyable
honestly looks like we're gonna get them all in 🙂
Yeah, I know what you mean! 👀
WithNewFile is a similar hell i imagine
picking up https://github.com/dagger/dagger/pull/6934 again now
Haven't been able to take on https://github.com/dagger/dagger/pull/7773 for the compat layer, but the rest is ok.
Feel free to ping me with any release blockers (i.e. reviews), my current task isn't one
hallo there https://github.com/dagger/dagger/pull/7831
@gloomy oasis ^
also this one: https://github.com/dagger/dagger/pull/7692
i'm also working more on https://github.com/dagger/dagger/pull/7858, i have an idea for how to do version compat for dev versions
so that i don't need to go through the entire daggerverse and do this
back
good news 🙂 i have a clever idea for making sure that dev versions in engineVersion don't make life absolutely miserable 🙂
as requested 
In terms of timing for tomorrow - I'm probably looking at kicking off the release towards the evening my time, so that there's a day to finish off anything last minute, do reviews, run any last minute manual tests, etc.
Slightly after the community call probably
@round geode does that leave enough time for actually releasing on thursday?
Yeah, I'll run into my evening 🙂
If it all goes to plan, I'll head off earlier than usual on Friday 🙂
Maybe a good opportunity to test a EU->US release handoff?
Oh yeah of course 🙂 sorry, that slipped my mind 🙂
Let's go with that then, I'll try and get everything into a good-to-tag state
And @timber spade will pick up from there
I really want to see if I can get https://github.com/dagger/dagger/issues/7760 tidied up and done, since it will hugely improve dev builds - that said, not the end of the world if it doesn't
Currently, all releases are tagged using semantic versioning - e.g. v0.11.9, the latest release. This tag is builtin to the resulting CLI and Engine binaries, and is used (among other things) for d...
I'll look now
Unless I should wait
I think we might have to drop https://github.com/dagger/dagger/issues/6894 though (cc @noble basin)
Happy to have a review in-principle 🙂 but the code is definitely broken right now for reasons I don't understand (first commit should be good though)
@timber spade I just ran into the mother of all examples for the internal/external call discrepancy, 15mn after we discussed it:
- Parallelize a function by using goroutines/errgroups: now it needs a context
- Function is super low-level, dozens of direct callers, need to pass the context from all of those
- Half of the callers didn't take a context, repeat the cycle with hundreds of their own callers
Actually in this case it may not be worth plumbing a context all the way down
This is when doing calls to other functions in your own module right?
Just checking if this is fixed by self calls ^
To some extent, this is just Go being Go and something you always have to deal with (the perfect use case for AI tools, completely mindless+tedious changes that are hard to automate). But self calls would help a bit by enforcing consistency through codegen
Yes correct
Yes it's a general case of Go being Go, it's just that with self calls, I wouldn't need a context at all, because that particular function is 100% lazy
oops, found a weird terminal mess up issue in dagger login: https://github.com/dagger/dagger/pull/7879
Without this, we're just writing directly on top of the TUI, oops!
Before:
❯ dagger login jedevc
─ ←↑↓→: move home: first end: last enter: zoom +/-: verbosity=1 q: quit ────────────────...
I also hit an issue where the terminal stays open for a few seconds, then goes away. Worked fine while it was there. Happened consistently too. Anyone else got that?
@sullen zephyr, are you ok with https://github.com/dagger/dagger/pull/7857#discussion_r1673755631 (bikeshedding)?
left a comment
if there's time, got a tiny one to sneak in assuming CI passes: https://github.com/dagger/dagger/pull/7881
yeah, there's still some bigger pieces that still aren't in 😱
So, what's the verdict on pulling the trigger today?
there's a couple things still in flight - i'm not sure about https://github.com/dagger/dagger/pull/7794
since tom is on holiday today
i'm kinda tempted to not take that one - since it potentially breaks users, and tom isn't around to help sort that out if it goes wrong
i'm working on fixing and writing tests for https://github.com/dagger/dagger/pull/7858
If you're not sure about taking it, don't take it
The whole theme of this release is quality, definitely not the time to squander it with a rushed timebomb
then there's just https://github.com/dagger/dagger/pull/7773
we just need the compat part right @hard shoal?
Warning: This is a breaking change in Go! 💥
Fixes #6868
When depending on a module's function that doesn't return a value (returns only error or nothing):
func (m *SomeModule) DoSomething() ...
Yep, been working on getting https://github.com/dagger/dagger/pull/7880 out the door today.
also a fun thing i've worked out we can do for next time - we'll be able to merge+test breaking changes for any future release we like 😄 and keep it disabled for normal users
so we won't need to keep these prs around until the very last second
That would be great! It's a lot of work to keep up.
Oh nice, is that an offshoot of the compat mode work?
yup - all you'll need to do is put a view for something like "0.13.0" - and it'll be hidden for everyone 🎉
a nice side effect of the views, and having everything use semver https://github.com/dagger/dagger/pull/7858
Realize we need to update the dagger call -m help text. Which is preferred?
The example below changes the current one:
- s/github repo/git repo/
- drops the local path example (e.g. "/path/to/some/dir")
- uses a real working example module ref
- shows the same format as the Daggerverse publish page (USERNAME/ORG etc bikeshed possible)
-m, --mod string Path to the module directory containing the dagger.json config file.
Either local path or git repo:
GITSERVER/USERNAME/REPOSITORY[/SUBPATH][@VERSION] (e.g.
"github.com/dagger/dagger/dev@main")
@sullen zephyr or others got a preferred set of these?
Another nice one is prob 1, 3
I think just 1 is good
if we do 3, we should use a canonical hello-world example
our own ci is probably not the right example here, it's definitely a deep end to throw users in
definitely use hello (eg. github.com/shykes/hello or equivalent) and not our own repo, since it happens to be in flux /dev, no /dev, etc
Yep, this is the current state
-m, --mod string Path to dagger.json config file for the module or a directory
containing that file. Either local path (e.g. "/path/to/some/dir") or
a github repo (e.g. "github.com/dagger/dagger/path/to/some/subdir")
this should be ready for a proper look now: https://github.com/dagger/dagger/pull/7858
Okay, something like this look nice @round geode ?
$ dagger help call
...
-m, --mod string Path to the module directory containing the dagger.json config file.
Either local path or git repo
$ dagger help publish
...
-m, --mod string Path to the module directory containing the dagger.json config file
i think we should keep the reference to a git repo for now
oh it is there
sorry, it's been a day
yeah that lgtm
@round geode 3 questions:
- Are we still releasing today
- How can I help
- Are we still handing off to the US so that you can go to bed at a normal hour
we have one more from myself, and another couple from @hard shoal
if we do it, i'll definitely be handing off
Can the US start a release?
yup for sure 🎉
i'm happy to keep hacking on my bits for a bit more, but if @hard shoal has to head off, we should probably hold a bit more
i'm really not a fan of releasing tomorrow, but potentially we could do so monday
These two are very nearly there. I'll be pushing to get it done today.
I personally would be okay with tomorrow, but we'd probably want to do it so it overlaps between timezones so in the worst case scenario of fires there's more time than just a single workday to get through them all.
I've been doing a lot of "laying without actually falling asleep" recently for some reason so I've been online later than usual, but can be here at like 8am PT to pick up
So today is definitely off the table?
laying without actually falling asleep
The worst
I'm reviewing @jed's PR again now. Provided that and the other remaining things are merged by like say 1pm our time I think we can do the release today?
I just need to avoid it going too late since I have some IRL deadlines tonight (picking houses to tour)
@hard shoal can we group the docs updates into the v0.12 milestone?
Oh yes 
i'm okay with that 👍
Cool, and to be 100% extra clear I mean that I would do it then since it's already late for you
anything I can do to help ?
how nicely volunteered 🎉 #maintainers message
hope you're doing okay btw, that sounds incredibly miserable 😢
It could be much worse and seems to always happen this time of year 🤷♂️ I just am still groggy at like 9 am lol
@jed do you think it's crucial to get https://github.com/dagger/dagger/pull/7858 in for v0.12.0 or can we merge it right after? Just thinking about how it touches on parts related to CI quite a bit, which is always a risk
yes it is
OK then we need to plan a major developer relations effort, to get people to migrate their modules
mm this is fair - there's a few other related changes to version compat that i might pull in to another pr - but that sounds fair to me, i don't object terribly
it's only for users who consume dev/etc
We bought ourselves some time with compat mode obviously. But they still need to port their code, or they effectively aren't really upgraded
Which they will soon realize when eg. they try to call the cool new Terminal() etc
Then what happens
-> we need to have a good answer to that question
cc @lean nest @craggy osprey @high ember
If it will take too long to split out or it will still have some parts that touch on CI, then we should just go with it as is I think. But if it's super trivial to pare down to something a bit more minimal then I think that's worth it
As a start: all of us need to update our own modules, and report back on the experience
One more huge thank you @round geode (and everyone else who helped) for this compat mode - without it, we basically couldn't release at all as is (would be blocked not just on planning this devrel effort, but actually executing it in parallel)
Compat mode is all @round geode, just been cheerleading him on!
^ Yeah I was mostly just paranoid about this months ago and added the groundwork for it but Justin did all the actual implementation to get it working
Done a few times. You can use rg to find *Container, *Directory... like the common ones.
Maybe we can do some automation around. That 👆 😆
maybe gofmt could cover it?
There's an escape hatch to reduce work if needed through "star" import.
Only Go SDK affected, right?
does this work?: gofmt -l -w -r 'Container -> dagger.Container'
Container -> dagger.Con... oh come on! That 👆 😉
Noice! 😍
seems like it works, we just need one for every type 😛
@gloomy oasis also had the good idea of pointing users to . imports
Get the list from dag.TypeDefs(), daggerize it! 😛
fyi - this is the kind of info we should be capturing in our release notes
we need the new import line too, right?
a bit late now - but actually, if anyone has a moment - going through all the .changes breaking changes and adding migration instructions as a multi-line part of the body would be super useful
I can look, draft, hunt down info, if that's helpful?
Others might have the right info at hand
doesn't work due to conflicts 😦
./dagger.gen.go:25:6: Tracer already declared through dot-import of package dagger ("concourse/internal/dagger")
./internal/dagger/dagger.gen.go:27:6: other declaration of Tracer
./dagger.gen.go:38:6: DaggerObject already declared through dot-import of package dagger ("concourse/internal/dagger")
./internal/dagger/dagger.gen.go:52:6: other declaration of DaggerObject
./dagger.gen.go:40:6: ExecError already declared through dot-import of package dagger ("concourse/internal/dagger")
./internal/dagger/dagger.gen.go:96:6: other declaration of ExecError
but the idea was to just add . "mymod/internal/dagger" to the imports
grr
ExecError can technically be moved
moving DaggerObject is a pain
I tried for that one, it's so linked into the go codegen it was not particularly simple
gofmt works but there are a lot of things to fix since it also includes e.g. ContainerWithExecOpts
what does that change to?
Everything gets a dagger. prefix?
yep, using Go's official formatting engine, which is great for code mods like this, except we have to tell it each thing to fix by hand
Didn't we completely forget to mention this as a breaking change in the release post?
It's probably our biggest breaking change
@lean nest 👆
i think i can script this based on go build error output, 1 sec possibly many secs
hm there's actually a good reason to not do this 😢
sorry, just realized - atm, we've never made big breaking changes to modules - so we can assume that any non-semver engineVersion is compatible with v0.11.9
if we publish v0.12.0 without this, we end up in a state where dev versions after v0.12.0 aren't compatible like this
we could just very nicely tell everyone to not use engineVersion with dev versions like that 🤔
mayyybe it's okay
Mm I'll just keep reviewing it as is for now, don't want more last minute changes on last minute changes 😄
crap, found a flaw in my dastardly plan
(I'm posting comments as I go)
fyi the changes to dev/mage are kind of related to the core dev chat yesterday
instead of doing two dagger calls for the cli+engine, we can just do one
might have found a bug:
⬢ [fedora-toolbox:38] ❯ dagger-dev develop
✔ connect 0.1s
✔ moduleSource(refString: "."): ModuleSource! 0.0s
✔ ModuleSource.kind: ModuleSourceKind! 0.0s
✔ ModuleSource.resolveContextPathFromCaller: String! 0.0s
✔ ModuleSource.resolveFromCaller: ModuleSource! 0.0s
✔ ModuleSource.asModule: Module! 0.1s
✔ Module.sdk: String! 0.0s
✔ ModuleSource.resolveFromCaller: ModuleSource! 0.0s
✔ ModuleSource.sourceSubpath: String! 0.0s
✔ ModuleSource.resolveFromCaller: ModuleSource! 0.0s
✔ ModuleSource.asModule(engineVersion: "latest"): Module! 0.1s
✔ Module.generatedContextDiff: Directory! 0.0s
✔ Directory.export(path: "/home/vito/src/daggerverse"): String! 0.0s
daggerverse/concourse on main [$!?] via 🐹 v1.22.1
⬢ [fedora-toolbox:38] ❯ go build .
# concourse
./dagger.gen.go:110:13: undefined: JSON
./dagger.gen.go:124:13: undefined: JSON
./dagger.gen.go:200:9: undefined: JSON
./dagger.gen.go:210:9: undefined: JSON
./dagger.gen.go:248:12: undefined: JSON
./dagger.gen.go:260:12: undefined: JSON
./dagger.gen.go:277:12: undefined: JSON
./dagger.gen.go:289:12: undefined: JSON
are we somehow not qualifying references to scalar types in generated code for dagger.gen.go?
here's my somewhat hacky script if anyone wants to try it:
set -e -u -x
sources="$(ls *.go | grep -v dagger.gen.go)"
while true; do
  go build . 2>&1 | grep undefined: | awk '{print $NF}' | sort | uniq > /tmp/undefined
  if [ ! -s /tmp/undefined ]; then
    echo "done"
    break
  fi
  echo "fixing $(wc -l /tmp/undefined) undefined symbols: $(cat /tmp/undefined | xargs)"
  for x in $(cat /tmp/undefined); do
    gofmt -l -w -r "$x -> dagger.${x}" $sources
    # NOTE: on Mac you might need to change this to `-i ''`
    sed -i -e "s/dagger.${x}:/${x}:/" $sources # gofmt seems to mistakenly replace field usage, but not definition
  done
  goimports -w $sources
done
which currently goes into a tailspin due to the above issue, but otherwise fixes my code
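If a Go version of the symbol-collection step is easier to reuse, here is a small sketch of just the grep/awk/sort/uniq part of the script above. `undefinedSymbols` is a hypothetical helper; the first sample lines come from the build output earlier in the thread, and the `Container` line is invented for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// undefinedSymbols scans `go build` output and returns the sorted, unique
// names reported as "undefined: NAME". It mirrors the shell pipeline
// grep undefined: | awk '{print $NF}' | sort | uniq.
func undefinedSymbols(buildOutput string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(buildOutput))
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, "undefined:") {
			continue
		}
		fields := strings.Fields(line)
		// Like awk's $NF, take the last whitespace-separated field.
		seen[fields[len(fields)-1]] = true
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	sample := `# concourse
./dagger.gen.go:110:13: undefined: JSON
./dagger.gen.go:124:13: undefined: JSON
./main.go:12:9: undefined: Container`
	// Each name would then be fed to a rule like: gofmt -r 'NAME -> dagger.NAME'
	fmt.Println(undefinedSymbols(sample))
}
```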
they're all fields in MarshalJSON/UnmarshalJSON
I'm guessing the trail is:
- https://github.com/jedevc/dagger/blob/0f4398e194ea24c8a52da82185a7e356a3206969/cmd/codegen/generator/go/templates/module_objects.go#L301
- which calls this and needs to be updated? https://github.com/jedevc/dagger/blob/0f4398e194ea24c8a52da82185a7e356a3206969/cmd/codegen/generator/go/templates/module_objects.go#L430
I think it might be line 420 specifically - since JSON is a type alias
it could also be:
diff --git a/cmd/codegen/generator/go/templates/module_objects.go b/cmd/codegen/generator/go/templates/module_objects.go
index f4a91c0dd..b4d2286b2 100644
--- a/cmd/codegen/generator/go/templates/module_objects.go
+++ b/cmd/codegen/generator/go/templates/module_objects.go
@@ -444,7 +444,7 @@ func (spec *parsedObjectType) concreteFieldTypeCode(typeSpec ParsedType) (*State
s.Id(typeName(typeSpec))
case *parsedIfaceTypeReference:
- s.Op("*").Id(formatIfaceImplName(typeSpec.name))
+ s.Op("*").Id(formatIfaceImplName(typeName(typeSpec)))
default:
return nil, fmt.Errorf("unsupported concrete field type %T", typeSpec)
i'll try things out and put up a PR
nope, that's definitely not it
well, this works, but i don't know if it breaks other things - looking for a way to deduce whether the qualifier is needed
diff --git a/cmd/codegen/generator/go/templates/module_objects.go b/cmd/codegen/generator/go/templates/module_objects.go
index f4a91c0dd..795c23470 100644
--- a/cmd/codegen/generator/go/templates/module_objects.go
+++ b/cmd/codegen/generator/go/templates/module_objects.go
@@ -417,7 +417,7 @@ func (spec *parsedObjectType) concreteFieldTypeCode(typeSpec ParsedType) (*State
s.Op("*")
}
if typeSpec.alias != "" {
- s.Id(typeSpec.alias)
+ s.Id("dagger." + typeSpec.alias)
} else {
tp := typeSpec.GoType()
if basic, ok := tp.(*types.Basic); ok {
i'm assuming that would break with locally-defined aliases?
maybe parsedPrimitiveType needs a moduleName field too
ohhh i think i see what's happening - it's seeing this as an alias type - but the alias type is being removed
is go somehow analyzing the dagger.gen.go?
i think it must be
that would also do it
Damn, I'm having "no space left on device" issues
are you on v0.11.9 for the stable engine?
Yes
i think this is the 75% issue
i've been having this too, my disk without dagger is about 50% full
so there's no way dagger can have 75%
have you tried rm -rfing your home directory? worked for me
I already deleted more than 100GB but something's gobbling up my disk. Is there an easy way to see which processes are consuming disk space in macos?
"our innovative device cleaning techniques will leave your disks sparkling and empty"
https://www.derlien.com/ is solid - but the trick is to install it before running out of disk space
Disk Inventory X, disk usage utility for Mac OS X
Jesus, problem seems to have gone away after closing Safari 😮
And closing orbstack. But already closed orbstack and didn't fix.
71GB for dagger-engine.dev
That's supposed to be stopped here https://github.com/sipsma/dagger/blob/2a9bb343e013194b9153739fe17a06c55282a841/core/schema/sdk.go#L469-L469 (supposed to anyways)
what does this mean again?
✔ connect 0.1s
✘ initialize 2.7s
! get module name: invalid character 'i' in literal false (expecting 'l')
✔ resolving module ref 0.1s
✔ installing module 2.6s
✘ analyzing module 0.0s
! get module name: invalid character 'i' in literal false (expecting 'l')
(something about parsing 'failed' instead of 'false'? but not sure where)
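That guess is consistent with how Go's encoding/json behaves: feed it a plain-text body starting with "failed" and it reads "fa", expects the literal false, and gives up at the "i". A minimal sketch; the body strings here are assumptions, not the actual response from the engine.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeBody tries to parse a response body as JSON, the way a GraphQL
// client would before looking at the "data" or "errors" keys.
func decodeBody(body string) error {
	var v interface{}
	return json.Unmarshal([]byte(body), &v)
}

func main() {
	// A plain-text error body: the decoder sees 'f', 'a', then 'i'
	// where it expected the 'l' of "false".
	fmt.Println(decodeBody("failed to get schema"))

	// The same message wrapped in a GraphQL-style errors array parses fine.
	fmt.Println(decodeBody(`{"errors":[{"message":"failed to get schema"}]}`))
}
```

The first call reports `invalid character 'i' in literal false (expecting 'l')`, the same text as in the trace above.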
there will be "something" in the logs
time="2024-07-11T19:08:03Z" level=error msg="failed to serve request" client_hostname=dev.fedora client_id=x5f2gntynfc29xadebh2oz3kd error="failed to get schema: failed to get schema for module \"concourse\": failed to create function \"resource\": failed to find mod type for function \"resource\" arg \"source\" type" session_id=zqvcnhamyhwhefldv2h6e526v
also the issue with that is here: https://github.com/dagger/dagger/blob/main/engine/server/session.go#L1095-L1114
hmm, seems unhappy with the JSON field, even though it compiles
remember graphql indicates failure not through http status codes 🙂
The spec for gql-over-http allows different status codes afaik, but yeah I bet this error just needs to write a gqlerror body instead of whatever go http writes: https://github.com/dagger/dagger/blob/2a9bb343e013194b9153739fe17a06c55282a841/engine/server/session.go#L816
So what we do for panics but there too
I've had that issue a lot with default values. For example, if my module didn't have quotes around the value. Also when I tried to use a variable to set a default value.
do you have a wip commit somewhere? i can also start digging in
will push one now
@round geode https://github.com/dagger/dagger/pull/7886
kinda surprised by the error tbh, though I haven't tried to run this module in a while. seems unrelated to the original issue
I didn't notice anything else other than the comment I left in https://github.com/dagger/dagger/pull/7858#discussion_r1674488441 but I'm seeing if I can do some manual steps locally to build + provision an engine with a v0.12.0 version string and then run through some stuff. Just out of caution
yeah, i'm actually gonna cherry-pick the important changes over
the main change in here is that it affects dev versions
which should mostly affect us - i think we can merge after the release, and let it soak on main for some time
You hit that just because you had miscompiling code right? I saw it before but only during dev when I had bugs in local engine code. If it can get hit by users in normal course of things I'll fix it quick before the release (it's super quick)
at this point it's not miscompiling code, as far as I can tell - this is with the codegen fix for JSON -> dagger.JSON, I don't see anything wrong
but then at runtime it seems unable to find the JSON type
Ah okay. Eh I should just fix that in case it gets hit by users somehow
lemme try this with v0.11.9 to see if it's not a new regression
yeah fails with v0.11.8 too (cc @round geode - so I think that PR might be OK as-is, except I should add a test)
we should definitely have some tests for the JSON case yeah
i don't think we do anywhere
it's possible the test just won't pass anyway because of the existing breakage, and this entire issue is kind of a nothingburger for v0.12 as a result. writing it up anyway, we'll see, maybe i can find the original breakage too
yep
✘ TestGoJSONField 20.6s
! test failed
┃
┃ Error Trace: /app/core/integration/module_test.go:1400
┃ Error: Received unexpected error:
┃ input: container.from.withMountedFile.withWorkdir.withExec.withNewFile.withExec.stdout resolve: process "dagger --debug query" did not complete successfully: exit code: 1
┃
┃ Stderr:
┃ Error: make request: invalid character 'i' in literal false (expecting 'l')
┃ Test: TestModule/TestGoJSONField
✔ /.dagger-cli session --label dagger.io/sdk.name:go --label dagger.io/sdk.version:n/a 20.6s
Waiting for checks to go green to merge:
- https://github.com/dagger/dagger/pull/7773
- https://github.com/dagger/dagger/pull/7880
Have to leave for now but will be back in a couple of hours.
this time it's:
failed to get schema: failed to get schema introspection JSON: introspection query failed: input: __schema.types[62].fields[0].type.ofType panic while resolving __Type.ofType: unknown type: Json
@round geode, since you're consolidating the docs updates in one PR, feel free to close these:
possibly - looking in the logs from earlier, I see:
❯ docker logs -f dagger-engine.dev 2>&1 | grep 'Json'
time="2024-07-11T19:08:03Z" level=debug msg="module did not find scalar" mod=concourse scalar=Json
time="2024-07-11T19:41:46Z" level=debug msg="module did not find scalar" mod=concourse scalar=Json
which could be the root cause for the module being unable to find the type, but I'm not sure why it wouldn't have failed earlier
oh, maybe that's tied to the same lookup action that returned the error
is it weird that there's a TypeDef for Json in the first place? as opposed to treating it like a built-in?
@round geode fwiw I don't think we need to worry about this for v0.12, since it's just as broken as before, so don't hang around past 9pm for this unless you have nothing better to do 😛
yeaaaaaa we can ship this bug 😄
fixed, i think: https://github.com/dagger/dagger/pull/7886/commits/a94e630cb1040a6072ef108782cdb4cb7ce25d18#diff-cec7dba357d87e010c3f8fccb971e2410da5f50c0703100439ca5a4f92fca62eR17
tl;dr:
strcase.ConfigureAcronym("JSON", "JSON")
will merge once CI is ✅
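To make the Json vs JSON mismatch concrete, a toy camel-caser in Go. This is a deliberately simplified, hypothetical stand-in and not how the strcase library is implemented; `caser`, `newCaser` and `toCamel` are invented names. It only shows the general idea of an acronym table overriding the default casing.

```go
package main

import (
	"fmt"
	"strings"
)

// caser is a toy camel-caser with an optional acronym table.
// Hypothetical type for illustration only.
type caser struct {
	acronyms map[string]string
}

func newCaser(acronyms map[string]string) caser {
	return caser{acronyms: acronyms}
}

// toCamel upper-cases the first letter of each word and lower-cases the rest,
// unless the whole input is registered as an acronym. ASCII input is assumed.
func (c caser) toCamel(s string) string {
	if keep, ok := c.acronyms[s]; ok {
		return keep
	}
	words := strings.FieldsFunc(s, func(r rune) bool {
		return r == '_' || r == '-' || r == ' '
	})
	var b strings.Builder
	for _, w := range words {
		b.WriteString(strings.ToUpper(w[:1]) + strings.ToLower(w[1:]))
	}
	return b.String()
}

func main() {
	plain := newCaser(nil)
	fmt.Println(plain.toCamel("JSON")) // the type name gets mangled to Json

	fixed := newCaser(map[string]string{"JSON": "JSON"})
	fmt.Println(fixed.toCamel("JSON")) // the acronym entry keeps it as JSON
}
```

If one side of a lookup goes through the caser and the other does not, the names stop matching, which lines up with the "module did not find scalar ... scalar=Json" debug lines above.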
yep lol
here's a weird one: i've seen this flake a little bit: https://dagger.cloud/dagger/traces/d4dd92153f36ae896e23293adfffef10?span=32c867899772b3fc#32c867899772b3fc:L19
is this familiar to anyone?
i really don't get this one, it just simply "does not seem possible"
saw that one just today too, in another PR. I think it's a new test?
it's a new test, yeah, but the issue seems to be coming from go codegen
i guess this looks similar to the weird python equivalent where we get "main module not found"
or no, this error comes from the cli
If it is, I'm borderline happy because maybe I could finally repro that failure locally... I've run so many loops of that test + others in parallel to get the flake to happen and never have 🤬
Possibly still related somehow in that if the directory for the module source is "confused" that could explain either failure (maybe)
my worry is that somehow this would have appeared when introducing views (which is also when these tests appeared)
however, views shouldn't affect presentation of non-core modules
so i'm just kinda confused 😄
if it does help, i have seen this issue locally @timber spade
huh, i just tried running all of TestLegacy, and I got something even weirder:
Error Trace: /app/core/integration/legacy_test.go:295
Error: Received unexpected error:
input: container.from.withMountedFile.withWorkdir.withExec.withNewFile.withNewFile.withExec.stdout resolve: process "dagger --debug call skip stdout" did not complete successfully: exit code: 1
Stderr:
1 : 2fedb8a31a0e3ca7: initialize
2 : 63945c60b2c885cf: resolving module ref
2 : 63945c60b2c885cf: resolving module ref DONE [0.1s]
3 : b09987408c78c01d: installing module
3 : b09987408c78c01d: installing module ERROR [3.1s]
3 : b09987408c78c01d: ! input: module.withSource.initialize resolve: failed to initialize module: failed to call module "test" to get functions: call constructor: process "go build -o /runtime ." did not complete successfully: exit code: 1
Stderr:
main.go:3:8: package dagger/test/internal/dagger is not in std (/usr/local/go/src/dagger/test/internal/dagger)
1 : 2fedb8a31a0e3ca7: initialize ERROR [3.2s]
1 : 2fedb8a31a0e3ca7: ! input: module.withSource.initialize resolve: failed to initialize module: failed to call module "test" to get functions: call constructor: process "go build -o /runtime ." did not complete successfully: exit code: 1
Stderr:
main.go:3:8: package dagger/test/internal/dagger is not in std (/usr/local/go/src/dagger/test/internal/dagger)
Error: input: module.withSource.initialize resolve: failed to initialize module: failed to call module "test" to get functions: call constructor: process "go build -o /runtime ." did not complete successfully: exit code: 1
Stderr:
main.go:3:8: package dagger/test/internal/dagger is not in std (/usr/local/go/src/dagger/test/internal/dagger)
Test: TestLegacy/TestExecWithEntrypoint
"package dagger/test/internal/dagger is not in std" huh, why should it be in std?
Fix for that schema error formatting: https://github.com/dagger/dagger/pull/7888
This error will happen when the import path doesn't exist, so if the module was named foo and you tried to import dagger/bar/internal/dagger or similar, you'd get this
mmmmmm
idk why for 100% sure, but might be that if an import path doesn't start with a URL-looking thing and the import path doesn't exist in the codebase, then it assumes it must be from the stdlib
the problem is, the module is called test, the path is test
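A rough sketch of the heuristic guessed at above, assuming the rule is "no dot in the first path element means it looks like a standard-library path". The helper name looksLikeStd is hypothetical and this is only a simplified model, not the go command's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeStd reports whether the first element of an import path contains
// no dot. Assumption: this approximates why a missing package such as
// dagger/test/internal/dagger is reported as "not in std".
func looksLikeStd(importPath string) bool {
	first, _, _ := strings.Cut(importPath, "/")
	return !strings.Contains(first, ".")
}

func main() {
	for _, p := range []string{
		"fmt",
		"dagger/test/internal/dagger",
		"github.com/dagger/dagger",
	} {
		fmt.Printf("%-30s std-looking=%v\n", p, looksLikeStd(p))
	}
}
```

Under this model the missing dagger/test/internal/dagger directory is the real problem, and the "not in std" wording is just how the failure gets reported.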
Right I'm not questioning that, just saying why the error is extra confusing
This does sound an awful lot like the python flake and the one you mentioned above
like different heads of the same monster
(could be wrong, just a suspicion)
Like if the Directory is just missing files or dirs that are supposed to be there, that feels like it could explain all of them
mmm i wonder how that could possibly be happening
I don't know, that's why I gotta repro locally, but "thankfully" this one appears to be easier to trigger locally than the python one maybe?
yeah, i can get weird things to happen just running all of TestLegacy together over and over
Trying now (really really hoping this isn't another edge merging issue, but that is coming to mind as a possibility in that it could swap one directory for another if incorrectly merged)
are you using dagger call or dev-engine+go test?
i was doing dagger --progress=plain -m dev call --source=.:default test custom --pkg=./core/integration --run="TestLegacy"
it's potentially possible there's something weird about my setup
I started with ./hack/dev + go test and ran it 50 times in a row successfully. Trying dagger call now
Oooo
Stderr:
12 : [57.3s] | main.go:3:8: package dagger/test/internal/dagger is not in std (/usr/local/go/src/dagger/test/internal/dagger)
🎉 i'm not crazy!
Just in terms of time, it's super late for you Justin and you mentioned you were still working on splitting out changes from https://github.com/dagger/dagger/pull/7858, right? At this point the latest I would feel comfortable starting the release is in an hour (3 my time) so idk if it's realistic to get everything sorted by then? Should we just plan to start it in EU tomorrow and hand over to US?
Awesome, no worries, looking now. There's the docs PR too right? But that's it?
i suddenly became convinced that this was very tricky logic and i didn't trust myself to have done it right
docs are in https://github.com/dagger/dagger/pull/7884 🎉
the one other thing i would have tried to do, but don't really have time now, was to prep release notes - i was gonna try and tidy up the summaries, and for the breaking changes at least, add information on how to migrate
I am happy to tidy up. For instructions on how to migrate, some of them are pretty obvious and I can add high level descriptions there. But ideally we'd have something fleshed out with examples in each language, etc. Which is just going to take a while.
So either:
- I can start the release after the above PR is merged (shouldn't take past 3 my time unless I find anything major), format the release notes nicely to the extent feasible and then we follow up in the next few days (modulo weekend) with more instructions
- We delay release, write up full instructions, then do the release
I'd vote 1, but if others have opinions let me know
i'd vote 1 too - i think getting it out would be great
There's a couple legit looking test failures there
yeah, looking at those now 🙂
then i'm gonna go 😄
i'm kinda worried the behavior is legitimately wrong - there's a thing where if there is no engineVersion, it should serve v0.9.9
since we used to not generate the engineVersion
Please yes 😄 (in that it's late, not that we don't enjoy your company)
i have a theory it could be this: https://github.com/dagger/dagger/pull/7887
One more docs thing we'll need eventually:
https://github.com/dagger/dagger/pull/7890/files
cc @high ember
right, i pushed some final fixes to this, hopefully that does it
i'm off - pizza beckons 
i'll be around tomorrow morning to pick up anything, and help people who are trying to do migrations 😄
approved!
i cleaned up the milestone, i think everything in it should be good to merge now (assuming tests etc)
From a read it LGTM, provided tests pass I'll approve and merge. If there's anything more though I will fix it but would then be getting a bit tenuous in terms of doing the release fully. Realistically I'd just need to take an hour-ish break at some point, so provided there's not somehow a time-sensitive fire that should be okay
The tests still failed legitimately in that PR, looking
merged the JSON fix
Looks good to have taken that PR - one small note before I definitely well and truly go to bed 🛏️, before tagging and releasing, it would be good to sanity check a cli and engine with a semver --version to make sure it doesn't all just fall apart
Yeah I started that earlier, currently I'm waiting for CI to finish in the rest of the changes in the milestone https://github.com/dagger/dagger/milestone/45, will do manual checks off latest main again in the meantime
Hey, I'm back and around for maybe 1h if anything comes up.
Was just gonna say I saw a legit failure in the python runtime module, but then saw you pushed, so presumably fixed 😄
We need the change to Void of course, but is the python runtime module PR 100% necessary for v0.12 or can it wait for 0.12.1? Just if it comes down to making a call there
marshal: json: error calling MarshalJSON for type *dagger.GeneratedCode: input: container.from resolve: failed to resolve image docker.io/library/python:3.10.10-slim-bookworm: failed to resolve source metadata for docker.io/library/python:3.10.10-slim-bookworm: docker.io/library/python:3.10.10-slim-bookworm: not found
Yeah
Yeah, that's reverted. Been having flakes until now. Hoping for 🥦
I'll watch too
call function "TestPublish": context canceled grrr
CI has been very frustrating lately. When things fail so much you start learning to ignore them which isn't good.
For the ProjectLayout flake we can add a few t.Log to get the state of the container in a few places. If expected files are in place and with the right contents for example.
We never got to 0 flakes, but like 2 weeks ago they were relatively rare IME, but in the last week more popped up. It's mostly been the weird typescript stuff for me, but there's definitely others
That would be useful yeah. Did you ever manage to see that one locally btw?
I don't think so! I did hack for 30 mins or so to add those calls and simulate a failure in order to see if I'd get the needed information but I couldn't make it work properly only on failure. Didn't want to make those calls every time, even on success, but better that temporarily than having nothing.
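One possible way to get those diagnostics only on failure is a cleanup that checks Failed() before logging. This is a minimal sketch, not what the test suite does: logOnFailure, the tb interface, and fakeT are hypothetical names, and the dump callback stands in for whatever would inspect the container state.

```go
package main

import "fmt"

// tb is the subset of testing.TB used here; *testing.T satisfies it.
type tb interface {
	Cleanup(func())
	Failed() bool
	Logf(format string, args ...any)
}

// logOnFailure registers a cleanup that calls dump and logs its result,
// but only when the test has been marked as failed.
func logOnFailure(t tb, dump func() string) {
	t.Cleanup(func() {
		if t.Failed() {
			t.Logf("state at failure:\n%s", dump())
		}
	})
}

// fakeT is a tiny stand-in for *testing.T so the sketch runs outside `go test`.
type fakeT struct {
	failed   bool
	cleanups []func()
	logs     []string
}

func (f *fakeT) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }
func (f *fakeT) Failed() bool      { return f.failed }
func (f *fakeT) Logf(format string, args ...any) {
	f.logs = append(f.logs, fmt.Sprintf(format, args...))
}

// finish runs the registered cleanups in last-in, first-out order.
func (f *fakeT) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	t := &fakeT{}
	logOnFailure(t, func() string { return "(hypothetical directory listing)" })
	t.failed = true // simulate a failed assertion
	t.finish()
	fmt.Println(len(t.logs), "log line(s) recorded")
}
```

In a real test the dump callback would have to query the container while the session is still open, so the extra calls are only paid for when the flake actually happens.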
Yeah I tried running the test by itself in a loop, running all python tests in a loop, etc. just in the background for multiple hours and never saw it once. The new Legacy tests do have an oddly similar looking flake which I can repro locally, so hopefully it's the same thing and once there's breathing room I'll be able to actually dig into it
Yeah, I noticed that!
I wonder after the OTel instrumentation in the test suite @gloomy oasis, where connect() should go, because of the parallelization? There's some tests with only a connect at "top-level" and then multiple sub-tests with the calls. Previously you'd need every subtest to have their connect if you wanted to use t.Parallel().
Yeah, it's technically a regression to the time before Justin refactored t.Parallel()/connect() usage. I've thought about moving it into middleware or something so it only happens just before each test runs. It's not super clear whether holding those sessions open is a genuine issue or theoretical though (even at the time of the original refactor afaik)
tangent: right now having a top-level connect() also breaks the span hierarchy since everything ends up beneath the dagger session CLI instead of beneath each sub-test, which I have a local fix for, but need to isolate to the minimal change.
So it's best not to share connect() then?
yeah. afaik the downside is minor, but it's technically better to do it in each sub-test
Yeah. This one for example: https://github.com/dagger/dagger/blob/87e206640497f628ca9f4955fb79397654f51d58/core/integration/module_python_test.go#L422-L436
Maybe move that common part to a setup function?
Manually setup v0.12.0 engine+CLI locally, can call unmodified CI module functions fine, run module tests configured to use those fine, etc.
Then did dagger develop to upgrade CI module to v0.12.0, which worked, but then when I try to call engine lint this is the only error I see by default:
sipsma@dagger_dev:~/repo/github.com/sipsma/dagger$ _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://dagger12 ~/dagger call -m dev --source=.:default engine lint
Setup tracing at https://dagger.cloud/traces/setup. To hide: export SHUTUP=1
✔ connect 0.2s
✘ initialize 1.6s
! input: module.withSource.initialize resolve: failed to initialize module: failed to call module "dagger-dev" to get functions: call constructor: process "go build -o /runtime ." did not complete successfully: exit code: 1
✔ resolving module ref 0.1s
✘ installing module 1.4s
! input: module.withSource.initialize resolve: failed to initialize module: failed to call module "dagger-dev" to get functions: call constructor: process "go build -o /runtime ." did not complete successfully: exit code: 1
✔ ModuleSource.resolveFromCaller: ModuleSource! 0.2s
✔ ModuleSource.asModule: Module! 1.0s
✘ Module.initialize: Module! 0.3s
! failed to initialize module: failed to call module "dagger-dev" to get functions: call constructor: process "go build -o /runtime ." did not complete successfully: exit code: 1
If I set --progress=plain then I get the actual incompat problems:
Stderr:
# github.com/dagger/dagger/dev
./helm.go:20:10: undefined: Directory
./helm.go:33:25: undefined: Container
./helm.go:46:5: undefined: File
./helm.go:71:50: undefined: ContainerWithNewFileOpts
./cli.go:162:36: cannot use dagger.ContainerWithNewFileOpts{…} (value of type dagger.ContainerWithNewFileOpts) as string value in argument to ctr.WithExec([]string{…}).WithDirectory("/nix", dag.Directory()).WithNewFile
./cli.go:163:4: unknown field Contents in struct literal of type dagger.ContainerWithNewFileOpts
./main.go:110:29: undefined: GoLintOpts
./sdk_elixir.go:114:34: cannot use dagger.ContainerWithNewFileOpts{…} (value of type dagger.ContainerWithNewFileOpts) as string value in argument to ctr.WithNewFile
./sdk_elixir.go:115:4: unknown field Contents in struct literal of type dagger.ContainerWithNewFileOpts
./test.go:229:65: unknown field Contents in struct literal of type dagger.ContainerWithNewFileOpts
./test.go:229:65: too many errors
Those are the known incompatibilities, but the fact that I can't see anything unless I do --progress=plain seems wrong. Was this a known issue? cc @gloomy oasis
Yeah, I've hit that a lot. I mentioned it on Discord.
Didn't have time to file an issue.
hmm maybe it's not revealing internal spans when it should? I disabled the top-level error log reporting since it's in principle redundant with the progress report and we have that longstanding issue to dedupe all that. But seems like there's a gap here
Mentioned here #1260743246917795840 message
specifically I think calls beneath Module.initialize are marked internal: true because they're dagql.Select "self" calls
My main concern is just that it's gonna make upgrading even more painful. I can just put it in bold in the release notes. But if there's any super low risk + quick ways to fix it, seems significant enough to do
but we should probably have those be revealed if they're in error state
let me see if there's a quick fix - do you have a repro available?
Early this morning you could have a missing required flag error, and it didn't show with TUI.
Like dagger-dev call -m dev
Or an error with the value to -m.
eh ok I'll just back out the change for now, I think I fixed something like that (https://github.com/dagger/dagger/commit/4424b1fe16d9d88852882a435c200c7f90bc7291) but better to play it safe
Oh that came after. I'm rebuilding dev and trying it out again.
dagger call -m dev --source=.:default --version v0.12.0 engine container --platform linux/arm64 export --path=/home/sipsma/engine.tar
dagger call -m dev --source=.:default --version v0.12.0 cli file --platform linux/arm64 export --path=/home/sipsma/dagger
docker load -i /home/sipsma/engine.tar
docker tag a1ed7546b53f1170739034d353a649740e19e8b19535fc0f091bfe9cd62e0e69 dagger12
docker run -d -v dagger12:/var/lib/dagger --name dagger12 --privileged dagger12
_EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://dagger12 ~/dagger develop -m dev
_EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://dagger12 ~/dagger call -m dev --source=.:default engine lint
(highly manual to get a semver tagged local engine + cli right now)
Also was having issues using the python runtime module directly. Some errors flashed in TUI and disappeared in the end, but showed in progress=plain.
thanks! will use this to test when I get back to it
It's probably doable just with ./hack/dev engine+cli too, I think it's just the upgrade to v0.12.0 in dagger.json that's needed
Approved, will merge on green
Alice just reminded me that besides picking out houses to tour by tonight we also have a zoom meeting w/ another realtor at 6..... so we'll see what I can do. I just don't wanna push a tag, disappear for an hour and come back to everything broken at 7pm lol.
I'll at minimum do the release notes since that's before any tag pushes. I'll decide then whether to keep going or wait for EU to pick up
zero pressure on my end to rush this
Working on the notes now, it would be very nice to link to the blog post where there's more details on how to fix the various incompatibilities, but obviously a bit of a catch-22 😄
So I think we can just edit the release notes after it's out to link to that.
Another thing I just realized, it would be especially bad for us to have a long time gap between the Engine+CLI release and daggerverse being updated in this case. Due to the fact that module publishes using v0.12.0 exclusive APIs will fail until daggerverse is upgraded (right? I'm not misthinking about this?)
So from that perspective I think we do have to wait until it's working hours for infra+cloud people to push the engine tag. Obviously we can't avoid any gap between engine release + daggerverse updates right now, but should at least keep that minimal.
- Also of course worth thinking about in the long term how to get rid of this problem, but for another time
Copilot's suggested autocompletion in the release notes made me laugh here 😄
Release notes: https://github.com/dagger/dagger/pull/7899 described the hand edits I made there
(Gonna be afk for a bit, on return I'll get as much ready as possible and leave a hand-off comment summarizing all the above)
Back, gonna do some more manual testing of version/compat stuff for now
Tested compat stuff with my fake v0.12.0 engine+CLI and went through updating all our ./dev + submodules to v0.12.0 gradually. Went super smoothly 🎉 Could go one at a time and dependency modules were still working when using v0.11.9
To save time for others later when we need to actually do this, diff is in links below
To summarize the current state of things for Justin tomorrow:
- Didn't go forward with engine tagging yet because of time and that I realized we needed to avoid a long gap between engine+cli release and daggerverse updating due to the fact that v0.12.0 modules will fail to publish to daggerverse. So for this release in particular should wait until infra+cloud team is available to help with that.
- Only one extra merge after you went to bed to ensure that the errors hit when upgrading incompatible modules are visible in the TUI by default: https://github.com/dagger/dagger/pull/7896
- Sent out release notes PR here, with some hand edits (feel free to update any more however you'd like obviously): https://github.com/dagger/dagger/pull/7899
- Diff of updates needed to dev module for v0.12.0 here: https://github.com/sipsma/dagger/commit/26bc2ae522430588c8e8d7ed4816f87c01234461
I'm gonna spend a little time poking at that flake in TestLegacy/PythonLayout around missing files now because it is proving to be highly annoying in CI. If I find anything before bed will update. EDIT: did make some progress actually, opened issue here with what I found so far: https://github.com/dagger/dagger/issues/7900
awesome, thanks for the writeup - i had a small realization that some of the tests i threw together last night will actually fail on tags - which isn't terrible, but still worth fixing
so i'll put that together, and take over the release notes
idea is to tag shortly after my lunch here, hopefully this maximizes number of folks online
there we go: https://github.com/dagger/dagger/pull/7901
Hi @round geode just a reminder that I'm available today to help
focusing on the docs and marketing side of things at the moment
hallo, having another pair of 👀 on https://github.com/dagger/dagger/pull/7899 would be good
i'd also like to get https://github.com/dagger/dagger/pull/7901 in as a last minute change
aha, also let's bump our base images: https://github.com/dagger/dagger/pull/7902
while waiting for tests to pass, will go grab lunch
jumping on a call right now, will get to this queue right after
@hard shoal you around to look through the above?
good point - is there another way of having a line break there? if you leave a blank line then markdown formats them with loads of extra space, see https://gist.github.com/jedevc/d6c17196be0d16c8a8ad0efc36265b26/e8b6e68153170bb766d9687cda218b4f8d3ce646
the other option apparently is to leave two spaces at the end of a line
I see, so it creates a <br>.
which i think is morally terrible, so i decided to not do that 😛
exactly yeah
yup, ofc 🙂
Anything that I can help with @round geode @hard shoal ?
^ some of the linked above prs
i'll hold to merge until then 🎉
going to pull in https://github.com/dagger/dagger/pull/7887 to hopefully reduce our test flakes before tagging
back
woop, well all prs are in now 😄
just to confirm - this is gonna be the right blog post url? https://dagger.io/blog/dagger-0-12
That's a question for @lean nest or @kind flame
@silent @lean nest @kind flame Can we remove "release" from that graphic
It's just Dagger 0.12
going to tag in 5 mins
fyi @noble basin @hard shoal @sullen zephyr - speak now or hold your peace 😄
i should have enough time to run the whole release - but any follow-ups etc, will need to be handled by the us team
Oh god KSP
Yeah I can try
need a quick approve on https://github.com/dagger/dagger/pull/7903
This PR was not auto-generated.
cc @gerhard https://github.com/dagger/dagger/actions/runs/9909287007/job/27377007275#step:11:518
since the pr failed to auto-generate
@round geode done
just updating it with some more things
Yeah @round geode @hard shoal @timber spade @gloomy oasis just to prepare you for what's coming: people are definitely going to get tripped up by the combo of 1) compat mode kicking in without their knowledge 2) new Terminal() mysteriously not working
I just hit it again
No impact on the release itself, just preparing you for the post-release wave of support
It's compounded by the fact that the error isn't a clean Terminal: never heard of that. It's actually Terminal exists but it doesn't have this method you're trying to use (because it's a breaking change and the old Terminal type doesn't support chaining)
So the real problem is further obfuscated
@noble basin @ocean egret could I request an accelerated LGTM + docs deploy on this page: https://github.com/dagger/dagger/pull/7893 🙏
I would like this to be live when we announce 0.12
Expand the "Adopting Dagger" page with a framework for successful adoption.
i also need a docs review on https://github.com/dagger/dagger/pull/7884
@round geode any chance you could merge my docs PR too? 🙂
(that way we sneak it in the same manual docs deploy)
thanks, merged!
@round geode do you have a prod docs deploy in your launch checklist?
OK then, one less thing on my own list
lol in doing the release, i remember why i ended up writing https://github.com/dagger/dagger/pull/7705
really don't love the fancy dance around prs
@round geode before I write that comment in the PR: is it still the case that removing auto-cli-download from the SDKs would solve a lot of pain on the engineering and release side?
(Impact on the user would be: your custom client now has a new runtime dependency: the dagger CLI. If it's not there, you get dagger: command not found)
Asking since it seems related to 7705
yeah, it definitely would - tbf, now is exactly the right time to bikeshed removing that in v0.13 😄
then let's do it
sadly, we still need that little dance
or maybe not
when my head is clear again on monday i shall have an answer 😄
Anything that simplifies the infernal version matrix is a win
agreed. i've now just become convinced that versioning is actually one of the hard problems
oop
this is a new error: https://github.com/dagger/dagger/actions/runs/9909851288/job/27378873881#step:3:96
Error: input: moduleSource.withContextDirectory.asModule resolve: failed to create module: select: failed to update codegen and runtime: failed to generate code: failed to get modified source directory for go module sdk codegen: select: error committing 1iglevyg6wrglgvs8vf22ehgg: database not open
thankfully, unrelated to the recent release i think, lint job is still using 0.11.9
hm that's weird though, because it's not like it was one worker that saw the issue
could potentiallllly be some weird disk flake
it's from buildkit, but i've never seen this before ever
re-running and hoping to never see it again
Sure! Just finished a pairing session, taking a short break, will do it first thing when I'm back.
already done (but thank you)
blog is live - https://dagger.io/blog/dagger-0-12
Let me know when it's safe to post about it!
@dapper socket @ocean egret @noble basin @compact lava the playground, daggerverse, cloud, etc should all be good to upgrade to v0.12.0
@noble basin @ocean egret install.ps1 and install.sh need updating on cloudfront
Please @ me too because I need to know for social.
There’s some fixes needed in some snippets.
@noble basin @craggy osprey dagger-for-github pr: https://github.com/dagger/dagger-for-github/pull/135
docs should be live @sullen zephyr!
happy to share more widely now, the only missing piece is helm
and github actions
@hard shoal can you DM me and we hop on a direct call? Team is in team audio, so we can't go there.
ah, I see Framer didn't save some snippets in the right languages. I just updated and re-published.
@hard shoal can you refresh, and see if that fixes the issues you saw?
It's more than that. We can hop on a call
Doing it now.
I'm heading off for the weekend in about half an hour FYI y'all
dagger-for-github is done
doing install now
helm PR approved & merged
anything else in the queue?
@round geode @noble basin if I tweet about the release right now, is it bad?
Let me do a quick check first. 2 mins. @round geode ?
OK, install.sh is now up-to-date. Want to confirm @round geode ?
so I can tweet? 🙂
ffs what is this doing here again? https://github.com/dagger/dagger/actions/runs/9910903565/job/27382386113#step:3:93
well I just lost my whole twitter thread
2nd time around it's better, more refined.
OK, v0.12.0 looks correct at the surface.
@noble basin are our ci workers healthy? noticing some v0.11.9 workers having more issues than usual
could there maybe be something where doing the release spins up a ton more workers? and this then overcrowds nodes
@sullen zephyr, which Wolfi version should we use in https://docs.dagger.io/manuals/user/containers ?
github.com/dagger/dagger/dev/wolfi@v0.12.0?
Problem with a std lib in our main repo (to be used externally) is that it takes a long time to load. Still going
Took 2m39s
We need to check the modules used in our docs.
on that note, worth bumping its /apko dependency, I just bumped it + Bass to Dagger v0.12 which 1) removes that noisy checkVersionCompatibility, and 2) fixes telemetry in the Bass SDK so you can actually see what it's doing
I can put up a PR for that
@hard shoal I would avoid referencing modules in dagger/dagger for now.
That will one day become our stdlib, but it's too soon, we need the flexibility to move things around
Just asking which wolfi is updated to use externally.
Seems your repo has it fixed, just needs to publish (needs a tag).
Oh, but might as well also update the apko dependency like Alex said.
fyi - looking with @noble basin it appears these ran on a node with no v0.12.0 jobs scheduled, it doesn't seem like the release of v0.12.0 would have triggered this issue
fingers crossed it's a particularly weird infra flake outside of our control, but given i've never seen this, i really don't have any good ideas (cc @timber spade @gloomy oasis who may have seen this weird buildkit-y error before)
@gilded grail, golang module also needs an update for https://docs.dagger.io/manuals/user/directories
@sullen zephyr what about the linter one in https://docs.dagger.io/manuals/user/files ?
uh oh. yeah that one moved.
Didn't realize that was in there
@sullen zephyr, also hello in https://docs.dagger.io/manuals/user/call
Yeah hello is fine
I added github.com/shykes/hello but didn't touch the daggerverse one
But needs a version bump in the doc.
Stderr:
# hello
./main.go:38:19: undefined: Container
Ah!
Latest (tagged) version uses a dev engine: https://github.com/shykes/daggerverse/blob/hello/v0.2.0/hello/dagger.json
@hard shoal I added you as collaborator on both repos
Modules from our team are likely to have dev versions in them so it's breaking in the docs rn.
in case you need to rush a change
i'm off now
i'd say ping me if it's urgent, but i'll probably be napping 🙂 (staying up late last night to finish all the bits off is taking its toll now)
I did some digging and concluded that this is a side-effect of heavy contention. When there are many CI jobs using the same Dagger Engine, our pipelines will flake. I just watched the same job pass when there is a single Engine & CLI test running on a single Engine. It passed the first time in 10 mins.
When we were seeing the failures, we had 8 CI jobs running at the same time and then the load looked like this:
Yeah never saw that before either. It's from boltdb: https://github.com/sipsma/buildkit/blob/d2c730c0f9ab50dd4178f84dd8c75392d4062254/vendor/go.etcd.io/bbolt/errors.go#L9-L9
With the buildkit part of the error here: https://github.com/sipsma/buildkit/blob/d2c730c0f9ab50dd4178f84dd8c75392d4062254/solver/llbsolver/ops/exec.go#L396-L396
The thing is, if you are that far along in the buildkit code, I think the boltdb must have been open and at least read successfully by that point. Which, if correct, would imply that it's suddenly closing.
Running out of disk comes to mind as a possibility, but just a guess. Not sure what boltdb does in that case off the top of my head
Also possible the engine got a SIGTERM, started shutting down (including closing DBs), but this particular operation hit this error before anything else and managed to make it out in time to the client?
👉 https://github.com/dagger/dagger/pull/7909
I've updated @sullen zephyr's modules, but there's a lot of references to @gilded grail github.com/kpenfound/dagger-modules/golang and that's failing due to Go version.
docs/current_docs/integrations/snippets/Jenkinsfile:1
docs/current_docs/integrations/snippets/actions.yml:1
docs/current_docs/integrations/snippets/argo-workflow.yaml:1
docs/current_docs/integrations/snippets/azure-pipelines.yml:1
docs/current_docs/integrations/snippets/buildspec.yml:1
docs/current_docs/integrations/snippets/circle.yml:1
docs/current_docs/integrations/snippets/gitlab-docker.yml:1
docs/current_docs/integrations/snippets/gitlab-kubernetes.yml:1
docs/current_docs/integrations/snippets/tekton-dagger-task.yaml:1
docs/current_docs/manuals/user/artifacts/consumption/export.mdx:3
docs/current_docs/manuals/user/artifacts/production/directories.mdx:1
docs/current_docs/manuals/user/artifacts/production/inspect.mdx:1
docs/current_docs/manuals/user/functions/arguments.mdx:1
docs/current_docs/manuals/user/functions/chaining.mdx:1
docs/current_docs/manuals/user/host/host-fs.mdx:1
docs/current_docs/manuals/user/remotes/remote-repositories.mdx:1
docs/current_docs/manuals/user/visualization/cloud-get-started.mdx:1
I have to go away for the weekend now! 👋
@craggy osprey not sure if you can help for the Kyle stuff
I'll check. Worst case, I'll just fork his dagger-modules and we can point to my copy.
@noble basin thanks for merging the dagger-for-github PR to upgrade to v0.12, I'm going to increment the action version from v5 to v6 to reflect
currently it has v5.12.0 any objection to moving that to v6.0.0 @round geode
since there are breaking changes
I'll hold off in case we're using the action and depending on the version
dagger-fork ➤ grep -r dagger-for-github@v5 git:main
./docs/current_docs/integrations/snippets/google-cloud-run/main.yml: uses: dagger/dagger-for-github@v5
./docs/current_docs/integrations/snippets/actions-ghcr.yml: uses: dagger/dagger-for-github@v5
./docs/current_docs/integrations/snippets/actions.yml: uses: dagger/dagger-for-github@v5
just docs, it looks like
@ocean egret looks like we're not depending on dagger-for-github action in our CI, correct?
So I'll move forward with bumping our action to v6
Amazing work people! Thank you so much
and a special thank you for @round geode for running this release like a boss
bumping Daggerverse to v0.12.0 so we don't overlap cc @ocean egret @craggy osprey
@round geode @craggy osprey seems like Daggerverse doesn't like being bumped to v0.12.0. Publishing Go modules <0.12.0 fails due to the changes in type aliasing. I can't troubleshoot now since I have to run, but I can check after dinner if nobody has figured it out yet: https://github.com/dagger/dagger.io/pull/3849
cc @lucid birch in case you have some time to help us track this down? 🙏
@dapper socket yeah I raised the same issue yesterday, those aliases are gone
yeah I'm aware. But dagger install still works. However.. seems like whatever Daggerverse does, it's just not happy with it
not sure what needs to be changed in Daggerverse to still make it initialize older modules? Haven't followed in-depth the compat mode thing.
Ohh ok I thought you were referring to daggerverse's pipelines
this is actually daggerverse failing the mod builds
yes, this would be Daggerverse publish flow
gotta run now, will check back after dinner 🙏
I’ll jump on after a bit too. Guessing we’ll need to update all the daggerverse modules internally, yep
This module is broken https://github.com/shykes/hello/blob/main/dagger.json
engineVersion should not be a dev version
@sullen zephyr mentioned this earlier in the thread, but looks like this one got forgotten
If engineVersion is a dev version, the only possible interpretation of this is to serve the latest API (there's not really anything clever we can do here, sadly until the semver everywhere work gets done and merged)
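The rules mentioned in this thread (no engineVersion means v0.9.9, a dev version means the latest API) could be modelled roughly like this. It is a sketch under assumptions, not the engine's code: servedAPIVersion is a hypothetical name, and treating anything that is not a plain vX.Y.Z string as a "dev version" is a guess made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// plainSemver matches versions such as v0.11.9.
// Assumption: everything else that is non-empty counts as a dev version.
var plainSemver = regexp.MustCompile(`^v\d+\.\d+\.\d+$`)

// legacyDefault is what gets served when dagger.json has no engineVersion,
// per the discussion earlier in the thread.
const legacyDefault = "v0.9.9"

// servedAPIVersion picks which API version a module would be served.
func servedAPIVersion(engineVersion, latest string) string {
	switch {
	case engineVersion == "":
		return legacyDefault
	case plainSemver.MatchString(engineVersion):
		return engineVersion
	default:
		// Dev version: no way to know what it is compatible with.
		return latest
	}
}

func main() {
	// "some-dev-build" is a made-up placeholder, not a real version string.
	for _, v := range []string{"", "v0.11.9", "some-dev-build"} {
		fmt.Printf("engineVersion=%q -> %s\n", v, servedAPIVersion(v, "v0.12.0"))
	}
}
```

This also shows why a dev engineVersion breaks older modules after the release: they silently get the v0.12.0 API instead of the one they were written against.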
Publishing https://github.com/shykes/daggerverse/tree/main/hello should work though
If it doesn't, I briefly looked through the daggerverse module loading flow, and it's honestly very unfamiliar 😢 I can take a look on Monday potentially, maybe there's some bug on dagger / daggerverse about the way module init works (sorry, on mobile so reading code is hard)
Note for the future - because of module compat, we should actually be able to bump daggerverse before a release, so we can avoid a scramble after
I guess we are in a pickle about what to do with dev versions now, since technically we have no idea whether they're v0.11 compatible or v0.12 compatible 🤔
I guess we should attempt to purge them as much as possible from the daggerverse, but not sure what we should do if we encounter them.
I kind of wonder if daggerverse should not allow users to publish modules with a non semver version at all going forward (unrelated, just musing)
Nothing explicit should need to be done - it should just work 🤔 but that said, maybe daggerverses loading logic is weird, I could just be very unfamiliar
Thx for replying so late your time Justin
I was actually aware of this but totally forgot to check this module's dagger.json
For some reason I assumed it should work given that's the module that we use pretty much anywhere
I guess we should disallow indexing modules in Daggerverse without a semver engine version then
Or at least, in the case that it fails, modify the error message to include a warning about semver as well
I didn't update shykes/hello because what's in our docs is shykes/daggerverse/hello. That one I updated and tagged. Didn't check if we still auto publish or not, but tested directly
Another idea could be to fail by default if a non-semver version is used in dagger.json, unless a flag is used?
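To make the "fail by default unless a flag is used" idea concrete, here is a minimal sketch of what such a check could look like. It is not Daggerverse's or the CLI's actual code: `checkEngineVersion`, `moduleConfig`, the `allowDev` switch, and the rule "anything other than plain vMAJOR.MINOR.PATCH counts as a dev version" are all assumptions made for illustration, and the `v0.11.9-dev` value is a made-up example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// moduleConfig mirrors only the dagger.json fields discussed in this thread.
type moduleConfig struct {
	Name          string `json:"name"`
	EngineVersion string `json:"engineVersion"`
}

// releaseVersion matches plain vMAJOR.MINOR.PATCH values such as v0.12.0.
// Assumption: anything else (suffixes, empty strings) is a dev version.
var releaseVersion = regexp.MustCompile(`^v\d+\.\d+\.\d+$`)

// checkEngineVersion is a hypothetical helper: it rejects a dagger.json whose
// engineVersion is not a release version, unless allowDev is set.
func checkEngineVersion(raw []byte, allowDev bool) error {
	var cfg moduleConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return fmt.Errorf("parse dagger.json: %w", err)
	}
	if allowDev || releaseVersion.MatchString(cfg.EngineVersion) {
		return nil
	}
	return fmt.Errorf("module %q: engineVersion %q is not a semver release",
		cfg.Name, cfg.EngineVersion)
}

func main() {
	release := []byte(`{"name": "hello", "engineVersion": "v0.12.0"}`)
	dev := []byte(`{"name": "hello", "engineVersion": "v0.11.9-dev"}`) // made-up value

	fmt.Println(checkEngineVersion(release, false)) // accepted
	fmt.Println(checkEngineVersion(dev, false))     // rejected with an error
	fmt.Println(checkEngineVersion(dev, true))      // accepted because of the opt-in
}
```

The same error path would also be a natural place for the semver warning suggested above.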
Working on a problematic part of Daggerverse where we're doing a dag, err := dagger.Connect(ctx, dagger.WithLogOutput(os.Stderr))
and then using alpine/git doing WithExec([]string{"show", "--no-patch", "--format=%ct", commit}).Stdout(ctx) essentially a git show.
Stderr:
[dumb-init] show: No such file or directory
So we're hitting Skip entrypoint by default in WithExec() https://github.com/dagger/dagger/blob/main/.changes/v0.12.0.md?plain=1#L12-L14
Confirmed. I have to change the code to one of these:
WithExec([]string{"show", "--no-patch", "--format=%ct", commit}, dagger.ContainerWithExecOpts{UseEntrypoint: true}) ✅
WithExec([]string{"git", "show", "--no-patch", "--format=%ct", commit}) ✅
Unfortunately this shows up in my IDE:
WithExec([]string{"show", "--no-patch", "--format=%ct", commit}, dagger.ContainerWithExecOpts{SkipEntrypoint: false}) ❌ which doesn't work
===
I used the top line and that seems to fix it, @dapper socket
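For anyone skimming, a tiny model of why the two ✅ forms behave the same. This is only an illustration of the v0.12 default described in the changelog (entrypoint skipped unless UseEntrypoint is set), not Dagger's implementation; `resolveArgs` is a hypothetical name, and "HEAD" stands in for the real commit.

```go
package main

import "fmt"

// resolveArgs models which command ends up being executed for a WithExec call.
// Illustration only: the real logic lives in the engine.
func resolveArgs(entrypoint, args []string, useEntrypoint bool) []string {
	if !useEntrypoint {
		// v0.12 default: the args are executed as-is.
		return args
	}
	// Opt-in: the image entrypoint is prepended to the args.
	return append(append([]string{}, entrypoint...), args...)
}

func main() {
	entrypoint := []string{"git"} // the alpine/git image sets ENTRYPOINT ["git"]
	args := []string{"show", "--no-patch", "--format=%ct", "HEAD"}

	// Default: the first element is "show", which is not a binary in the image.
	fmt.Println(resolveArgs(entrypoint, args, false))

	// UseEntrypoint: true gives the same argv as spelling out "git" by hand.
	fmt.Println(resolveArgs(entrypoint, args, true))
}
```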
Also changed this a bunch in the modules part of things. Not sure if necessary, but was following convention.
Though strangely the dagger.gen.go keeps reverting to
import (
...
"main/internal/dagger"
"main/internal/telemetry"
...
)
when I delete it and re-gen with dagger develop
even though
{
"name": "daggerverse",
"sdk": "go",
...
🤷♂️
Because in module I recently created called interactive I get this in dagger.gen.go:
import (
...
"dagger/interactive/internal/dagger"
"dagger/interactive/internal/telemetry"
...
)
oh! I see the difference: module github.com/dagger/dagger.io/daggerverse in our go.mod, versus module dagger/interactive in my new module's go.mod.
Second form is preferred. It makes it explicit that it's a git command, so it's easier to understand. As for the last part, it's only in Go that it doesn't work, because of the way we create the opts struct.
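A rough sketch of the opts-struct point, assuming the client only forwards fields that differ from their zero value. The `execOpts` type and `buildArgs` function are stand-ins written for this example, not the generated SDK code.

```go
package main

import "fmt"

// execOpts is a stand-in for an options struct with plain bool fields.
type execOpts struct {
	SkipEntrypoint bool
	UseEntrypoint  bool
}

// buildArgs mimics a client that only sends fields set to a non-zero value.
// Assumption: this is how the option gets lost; a Go bool cannot distinguish
// "explicitly false" from "not set".
func buildArgs(opts execOpts) map[string]bool {
	sent := map[string]bool{}
	if opts.SkipEntrypoint {
		sent["skipEntrypoint"] = true
	}
	if opts.UseEntrypoint {
		sent["useEntrypoint"] = true
	}
	return sent
}

func main() {
	// SkipEntrypoint: false is the zero value, so nothing is sent and the
	// server-side default (skip the entrypoint) applies.
	fmt.Println(buildArgs(execOpts{SkipEntrypoint: false})) // map[]

	// UseEntrypoint: true is non-zero, so it is sent.
	fmt.Println(buildArgs(execOpts{UseEntrypoint: true})) // map[useEntrypoint:true]
}
```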
I agree the second form is clearer if the entrypoint is simple like with this one:
https://github.com/alpine-docker/git/blob/master/Dockerfile#L10
ENTRYPOINT ["git"]
But I suspect with DBs for service containers and some other cases, the first form will get a lot of use, which will make that use case more verbose:
https://github.com/docker-library/postgres/blob/master/Dockerfile-alpine.template#L219
ENTRYPOINT ["docker-entrypoint.sh"]
https://github.com/docker-library/postgres/blob/master/docker-entrypoint.sh 356 line script with vital setup
Oh, maybe not, because we won't need WithExec() on the service container itself perhaps...
Still need some coffee. Early here. Trying it out.
There’s things you can do programmatically. If you have a common start you can put it in a var. If you still need to set the option you can put that in a var and reuse to reduce verbosity:
opts := dagger.ContainerWithExecOpts{
UseEntrypoint: true,
}
ctr.
WithExec(…, opts).
WithExec(…, opts)
See changes in Redis snippets in https://github.com/dagger/dagger/pull/7905/files
Like this example I had
// Database service used for application tests
database := dag.Container().From("postgres:latest").
WithEnvVariable("POSTGRES_PASSWORD", "test").
WithExec([]string{"postgres"}). // <<<< this breaks
WithExposedPort(5432).
AsService()
// Run application tests
out, err := client.Container().From("golang:1.22").
WithServiceBinding("db", database). // bind database with the name db
...
would need to update the WithExec either way...got it
var execOpts = dagger.ContainerWithExecOpts{
UseEntrypoint: true,
}
...
WithExec([]string{"postgres"}, execOpts).
...
or
...
WithExec([]string{"postgres"}, dagger.ContainerWithExecOpts{
UseEntrypoint: true})
...
I'll work on this. About 20 references to update to a new version that will likely be kpenfound/dagger-modules/golang@0.2.0
Kyle just gave me push access to his dagger-modules and go-releaser repos.
https://gist.github.com/jpadams/2f65118b1595ca8b3f2a40b8fc168b77
only one little thing left: https://github.com/dagger/dagger/pull/7904
just need to bump our ci workers to use the latest release 😄
uh oh
from that pr, noticed that some of the jobs seem to terminate, and then never actually exit
fyi @gloomy oasis @timber spade, fingers crossed it's not telemetry draining again
considering one of the jobs is canceled, I wonder if this is spot instances again
Mm but looking at the timestamps, the job finished way before the cancellation
It could somehow still be that ofc
does the cancel come first, or does it happen when GHA realizes the runner is missing?
(I don't expect you to know that offhand, will ask in #infrastructure :D)
I do see a spot interruption here: https://grafana.ci.dagger.cloud/d/feedbe64-5c48-410d-8dfa-30dcfaba246e/karpenter?orgId=1&from=1721058831800&to=1721061861002 - but it does seem like it came later
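If it helps when lining this up against the job timestamps: the from and to values in that Grafana link are Unix timestamps in milliseconds. A small sketch for turning them into UTC times; `window` is a helper name made up for this example, and it assumes absolute timestamps (a relative range such as now-6h would fail to parse).

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// window extracts the from/to query parameters of a dashboard URL and
// returns them as UTC times. Both are expected to be epoch milliseconds.
func window(raw string) (from, to time.Time, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return from, to, err
	}
	q := u.Query()
	fromMs, err := strconv.ParseInt(q.Get("from"), 10, 64)
	if err != nil {
		return from, to, fmt.Errorf("from: %w", err)
	}
	toMs, err := strconv.ParseInt(q.Get("to"), 10, 64)
	if err != nil {
		return from, to, fmt.Errorf("to: %w", err)
	}
	return time.UnixMilli(fromMs).UTC(), time.UnixMilli(toMs).UTC(), nil
}

func main() {
	link := "https://grafana.ci.dagger.cloud/d/feedbe64-5c48-410d-8dfa-30dcfaba246e/karpenter?orgId=1&from=1721058831800&to=1721061861002"
	from, to, err := window(link)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the window boundaries and its length for comparison with the
	// job's finish and cancellation times.
	fmt.Println(from.Format(time.RFC3339), to.Format(time.RFC3339), to.Sub(from))
}
```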
This is still in the works and is a part of the gen2 runners, but we have on demand runners in the new ci cluster: dagger-gen2-v<VERSION>-<size>-od. It is not considered production yet but soon
Feel free to use them in PRs if it helps you troubleshoot. We are working on them so expect some instability
hmmmm this theory looks much more likely to me now, did a re-run, and saw none of the same issues
still worth keeping an eye out for, if there is an issue we will definitely see it once we deploy more widely
The golang module was updated for v0.12 and the docs are updated/published.
I found a couple other example one-liners to fix. Will do that now.
Including dagger -m github.com/dagger/dagger/linters/markdown@88d89e8d15ab6ad9ca4043a920d3cd735a6405fd call rules contents which I'm guessing doesn't work due to the change in CI structure. But looking now.
Can see it in places like: https://docs.dagger.io/manuals/user/chaining/
This vestigial tail remains: https://github.com/dagger/dagger/tree/main/linters
As someone who does not know this module inside and out, I'm not sure how to get to a markdown linter. Would be cool to be able to search within the module or expand all the possible "call trees". I'll go look at code now.
dagger -m github.com/dagger/dagger/dev call --help
Full trace at https://dagger.cloud/dagger/traces/b88703c275a0b51c0d1b793f6d6233bc
✔ connect 1.3s
✔ initialize 3.7s
✔ prepare 0.0s
Call a module function
USAGE
dagger call [options] [arguments] <function>
FUNCTIONS
check Check that everything works. Use this as CI entrypoint.
cli Develop the Dagger CLI
dev Creates a dev container that has a running CLI connected to a dagger engine
docs Develop the Dagger documentation
engine Develop the Dagger engine container
go Dagger's Go toolchain
helm Develop the Dagger helm chart
scripts Run Dagger scripts
sdk Develop Dagger SDKs
test Run all tests
version
ARGUMENTS
--source Directory [required]
--docker-cfg Secret
--version string
I mean...I'll go look at Daggerverse now 😆 https://daggerverse.dev/mod/github.com/dagger/dagger/dev@133917c6f9ce36d8cfdc595d9b7bd2c14cbc2c20
Except now, the CLI examples really want to be back...Have an issue for that, will prioritize.
Looks like the old intent was markdown linting of the docs...https://github.com/dagger/dagger/commit/3055bbd443941d87bfe93513cbe1ba411404e836
So going to try that one.
So guess this is pretty similar area: dagger call -m github.com/dagger/dagger/dev --source https://github.com/dagger/dagger\#main docs lint
but we don't have a function to show the rules...so I'll likely put in a different linter example or anything that returns File contents, which was the example's intent.
Did it with https://github.com/dagger/dagger/pull/7945
@gloomy oasis, yeah, just seen this misery again after merging the 0.12 branch: https://github.com/dagger/dagger/actions/runs/9970579309/job/27549786121?pr=7939
the call fails, but the job hangs on for another 8 minutes
hm, specifically it seems that if an error occurs, a weird hang can appear
in this case, we see an otel tcp proxy listen failure from context cancelled - this is definitely weird, but probably "okay": https://github.com/dagger/dagger/actions/runs/9970573229/job/27549768418?pr=7936#step:3:138
what's strange is how this failure doesn't actually seem to propagate anywhere
what in the actual world is happening
we never see an initialize DONE
in all the failure cases
but how? analyzing module does complete, those defers are right next to each other
that would definitely explain the hang - if the initialize never hangs up somehow, then we would hang until it's completed
i wonder if this has something to do with https://github.com/dagger/dagger/pull/7881
okay, no something seems wrong there, but, unrelated... what? https://dagger.cloud/dagger/traces/dbc42634c596df5e871f481e1012c698?span=3eea241a8be54c30#4e608f9ebee0fb4d
there's a withexec that seems to fail deep in the tree, that isn't propagated upwards
installing module is never seen by the frontend weirdly: https://github.com/dagger/dagger/actions/runs/9970573229/job/27549768418?pr=7936#step:3:89
but it is seen in the corresponding trace in cloud: https://dagger.cloud/dagger/traces/dbc42634c596df5e871f481e1012c698
wrote this up into a tracking issue: https://github.com/dagger/dagger/issues/7954
After applying #7904, we've seen jobs hang weirdly. See an example: https://github.com/dagger/dagger/actions/runs/9970573229/job/27549768418?pr=7936#step:3:89 https://dagger.cloud/dagger/traces...
hmm, now i wanna see how this looks without the cause/effect shenanigans, maybe that exec codegen was incorrectly attributed to this withExec
ohhhhh i forgot that that's a thing 👀
i guess it's a visualization level thing? can it be disabled in cloud?
also, this is totally the wrong place to discuss this one, so gonna spin up a new 🧵
I noticed our docs need to be updated to mention multiple git servers...I tried to just edit it/submit a PR but I'm getting an error and not sure how to resolve. Any help would be much appreciated. @void pendant @dapper socket @compact lava
team audio if you want
FYI if folks were curious, stupid marketing guy didn't realize he needed to fork the repo to submit a PR...🤦♂️ Feel free to make fun of me now.
@lucid birch @void pendant @lean nest Jeremy and I just noticed we don't seem to have any docs for interactive debugging... searching for terminal brings up the old way of doing things, and there's a debugging section in the docs with nothing on the new feature. We're curious if there's a plan to add these. I'm looking to add Interactive Debugging to our website as I think it's a really compelling feature, but I think it's important that I link to docs so we don't fumble the conversion etc. I can certainly hold off on putting it up if folks feel it's better.
Vikram has been gone since we've publicized it, so he hasn't worked on it. I just created an issue to ensure it doesn't get lost and Vikram can help find a spot on his backlog when he comes back on Monday https://linear.app/dagger/issue/DOCS-312/add-interactive-debugging-to-the-docs