#CI refactor
1 messages ยท Page 1 of 1 (latest)
The main PR: https://github.com/dagger/dagger/pull/8355
We're going to split it up into incremental changes, starting asap
Tentative plan (can be parallelize)
github.com/dagger/dagger/versiongithub.com/dagger/dagger/modules/gogithub.com/dagger/dagger/modules/supermodgithub.com/dagger/dagger/modules/superdir(might as well rename dirdiff while we're at it)github.com/dagger/dagger/go/sdk
Then:
github.com/dagger/dagger/engine(bigger chunk)
Let's coordinate here
I'm taking version
FYI @summer root @shell portal
Got sidetracked by in-person meetings, but wrapping up my day with a first draft of the version PR
Left some comments on the split-out PR, will try and find some time today and make the changes I want myself (and push them as a separate commit, so we can easily revert)
But left as comments just in case I can't find the time!
Notes from check-in call:
- Participants: @gentle orchid, @somber dust, @brisk raptor
- Version PR: overall looks good. Failing xxx-dev checks in CI, because the dev version string is not the same, and the format matters.
- Problem: no way to differentiate clean from dirty checkout (necessary for -dev)
- Solution: partial git diff using the git-dir and inputs directories we already have. yay!
- @gentle orchid and @brisk raptor will pair on the diff implementation
- @somber dust will take over tomorrow and make the dev format identical (for go tooling interop)
@gentle orchid quick 5mn break then ready to pair ๐
joined dev-audio, ready when you are
@somber dust we fixed dev detection logic ๐ You can now call Version.dev() and it will correctly tell you if any of the input files have changed
nice
picking this up now ๐
๐ข working on this module is not wonderfully fun - it takes me about 30s just to compute a version identifier
okay, almost there ๐
just working on something with the dev detection - it currently doesn't work with newly added files, so trying an experiment to see if i can get that working
hmmm had a realization on a ci failure: https://github.com/dagger/dagger/actions/runs/11053632481/job/30708466631?pr=8564
the git cloned repository doesn't have any tags in it
i think we can update that for our clone - but the issue is, it won't work well for the entirely remote pattern where we do -m github.com/dagger/dagger - because that's a shallow clone (and so won't have the tags)
๐ข ahh it does work if you specify the tag in the git url
ahhhh i worked out how to entirely remove the weird .changes/ markdown file analysis, and just git tags, but because we don't have access to all of them, i can't
@somber dust maybe you could just find all the tags with git ls-remote?
there's a trick for that iirc
good point, i can try doing that
unfortunately, one issue with this - now we need to have ssh auth sock in the module
oh well, we can just hardcode the upstream url ๐ sucks, but better than pulling from "origin", which is potentially wrong (e.g. for me, this is jedevc/dagger, which has no tags, and sometimes has a collection of absolutely wild test tags)
that's a good point - it should probably be based on the "source of truth" for dagger anyway
super sad thing i just discovered - if you do llb.Git using a commit sha that's not on the default branch... you can't
which means you can't dag.Git with a commit that's not on the default branch
gonna try and fix that upstream, since there's no good reason for that limitation
it sucks, cause it means you can't do -m github.com/dagger/dagger@<commit sha> you have to do -m github.com/dagger/dagger@refs/... (and really hope there's a ref that points to the thing you want)
e.g. imagine hell if you push while a ci job is running ๐ and you get the new content instead of the old one
I already address this with VersionTags()
(joining another call as you know, sorry)
i refactored this around a bit, i was looking to remove all of the changie stuff entirely while we were here, but yes
@somber dust here's an old but relevant issue ๐ https://linear.app/dagger/issue/DEV-3233/bug-dagger-apis-git-checkout-doesnt-have-all-tags
yeah, i think we need an "unshallow" option for cases like this
or similar
technically, all i want is certain branches to be unshallowed
(and all the tags)
okay, writing up for the day and passing off to @brisk raptor @gentle orchid:
- i'm happy with the current state of the module, i got it working to the previous version formats
- i also added some very small improvements since we have all this rich git data now - i killed the changelog
.mdparsing to get the latest release, and now we get this from git (goodbye fragile changelogs, it's my gift to you @brisk raptor). i also did something clever, and we set better engine image tags in the cli when building from a branch that's not main - tidied and refactored around, and got everything to use the same wolfi base (to avoid extra overhead)
- changed how the
Devfunctions works and renamed it toDirty- it now operates exactly likegit status(so handled added files or staged changed, whichgit diffdoesn't), i grabbed the logic from similar in buildx (where i worked on this problem before)
I really like this btw:
$ dagger call -m github.com/dagger/dagger/version@pull/8564/head --git-dir=https://github.com/dagger/dagger#pull/8564/head
_type: Version
imageTag: 72a848aa27ceca12609a12821d6c3732b62df9d7
lastReleaseVersion: v0.13.3
nextReleaseVersion: v0.13.4
version: v0.13.4-240926174249-f36402ee4e22
At a glance, you've got the version, the image tag used by it, and finally the last/next version
I think it's done? Just needs reviews, but if it could avoid being merged until tomorrow, I'd like another glance through it with fresh eyes ๐
Nice, thanks!
@somber dust can you explain how the distconsts part works?
not related to the current version pr right?
yes I mean specifically this PR's use of distconsts
I'm familiar with our existing use of distconsts in general
? it doesn't use distconsts
oh sorry. engine/consts
sorry, getting a little distracted here too lol
Oh you literally just changed the comment
nevermind
all good ๐
So should I merge it?
if it looks good to you, approve it ๐
if there is a rush, feel free to merge, but i'd like to take another glance at it tomorrow morning when i've had coffee to make sure that the logic actually pans out - also if it breaks on merge, i'd like to be around to help fix, and i'm just about to sign off and pack for san diego ๐
@somber dust are you sure you didn't accidentally remove the un-shallowing in your refactor?
and added more ๐ข
it sucks so so so much, but i have some ideas for how to get it out in the future (and doing dagger builds offline isn't really a thing anyways, so not really the most urgent thing)
contextual git
(i think i will make this my hackathon topic)
So if I understand correctly, a downside of our current approach is that, even if we don't need to fetch more from upstream, because we're working from a local checkout, we will still execute those commands and they will require talking to upstream, even though they will end up downloading nothing new ?
But at least, no wasted downloads - I'm guessing if there's nothing to fetch, it's just a wasted roundtrip
yup ๐ข
buttt - one cool thing is it means that if someone tagged a release, it does mean you still get the right result, even if you haven't re-pulled (on your local repo)
yeah, git is good about not transferring data twice - it just goes, "you already have everything, go away now"
Is there a potential cache invalidation problem with these commands?
I guess not (except for the obscure case where you pulled commits just before they were tagged)
yeah, it's a realllllly obscure case
so in the 99.99% case: new commits upstream, so new git dir, so new fetch command, so you get fresh tags, and your version is up to date
exactly
but in between git pulls: proper caching of those fetches
not too bad!
and the code looks readable and clean
i really like the way you made the new VersionTag struct, i think it's a very neat way to think about this
because now you just go "git give me every release for a component in this repo", and by having it sorted, it's really easy to determine which the latest is
merged ๐ ๐

Right it kind of just fell into place:
- "Argh I wish I had this information"
- "Oh wait, I could easily fetch it"
- "Ah the data structure just writes itself!"
first win as the little ci task force feels good ๐
lol, and it immediately turned red on main
@shell portal @summer root we're definitely pulling you into the next one ๐
oh no! Our fault, or flake?
Something different about main?
tl;dr - git fetch by default does not want to update a ref that you have checked out
hm, actually there's a reason that flag tells you not to use it lol
lemme push a different fix
merged that but now found even more things sorry
git repos are a nightmare
shallow git repositories are so weird to work with (and turns out technically i've been working with a shallow repo for the last year lol)
@gentle orchid @brisk raptor fix as promised: https://github.com/dagger/dagger/pull/8589
i'm heading off to go pack now ๐
Thank you!
Got a few problems not fixed by the PR:
- New error replaced the
.git not founderror
$ call -m github.com/dagger/dagger/version@pull/8589/head
invoke: no releases found
- Execution of the "unshallow" is considerably slower than before, do we really need to fetch this much?
- Version string for a stable release tag no longer produces a clean stable version
$ dagger call -m github.com/dagger/dagger/version@pull/8589/head --git-dir=https://github.com/dagger/dagger#v0.13.2
v0.13.4-010101000000-dev-ac5c7f4a3265
(same problem in main)
$ dagger call -m github.com/dagger/dagger/version --git-dir=https://github.com/dagger/dagger#v0.13.2
v0.13.4-010101000000-dev-ac5c7f4a3265
It's so slow
Yes - there are no past releases with no git dir, this is expected for this, we could change it but the return result is invalid then
I can look into it for local execution, but without contextual git... yes. I can look for optimizations, but we rushed to merge and so it's slower ๐ I think that's an alright compromise for now
Will glance into this when I finish packing
digging into this - this is because that release is technically "dirty"
the contents of the git repo are entirely different from the contents of the base directory
so when we do a git status, we're looking at completely different results
essentially you've got the git directory from the release, but the contents of pr 8589
oh right - I need to override inputs also
looking into a quick optimization right now, and also worked out a clever way to remove the .changes input
sigh contextual git is gonna be such a lifesaver
this is such a nightmare without it
while sitting around in the airport, if i have some acceptable wifi, i might try writing some tests for this module
i'd like to be confident that it works
i've managed to shave off a few seconds of fetch by cramming all the git fetch into one single command
okay pushed some changes, all in separate commits (but should be squashed when merged)
- i've fixed the case where no git metadata is available
- i've added some optimizations to group all the git fetches together
- i've done a cool trick where we can actually load the
.changes/.nextfile from git, which is actually way more accurate for some edge cases, but we still need to hold onto the old way for when no git metadata is around (v0.13.4 will have that fix, so i've added a note we can remove it)
But beyond optimizations, it just seems much slower than my first implementation - was there something I just forgot to fetch that was very important? I definitely had all the tags... What else was missing?
We need the repo history (at least the list of commits) - without that, we can't determine the right tag just using git.
For example given a random commit hash with no history, how do you know what the previous version was?
So I needed to pull the history of the checked out commit
Ah it's the previous version - right
I was only using it to determine if a commit was tagged
@somber dust is that fix PR ready to merge?
I almost did, but saw a failed check. Seems to be gone now
I would hold off, since I'd like to be around to fix it if it breaks again ๐
OK but this is killing me:
$ dagger call -m github.com/dagger/dagger dev terminal
โ connect 0.3s
โ initialize 3.2s
โ prepare 0.0s
โ daggerDev: DaggerDev! 5.6s
! call constructor: process "/runtime" did not complete successfully: exit code: 2
โ invoke: input: version.version resolve: call function "Version": process "/runtime" did not complete successfully: exit code: 2
โ
โ Stdout:
โ invoke: input: container.import.withMountedTemp.withMountedTemp.withWorkdir.withWorkdir.withDirectory.withDirectory.withExec.withExe
โ withExec.withExec.directory resolve: process "sh -c git checkout $(git rev-parse HEAD)" did not complete successfully: exit code: 12
โ
โ Stderr:
โ fatal: not a git repository (or any of the parent directories): .git
โ fatal: not a git repository (or any of the parent directories): .git
โ version: Version! 0.1s
โ Version.version: String! 5.3s
! call function "Version": process "/runtime" did not complete successfully: exit code: 2
โ invoke: input: container.import.withMountedTemp.withMountedTemp.withWorkdir.withWorkdir.withDirectory.withDirectory.withExec.withE
โ withExec.withExec.directory resolve: process "sh -c git checkout $(git rev-parse HEAD)" did not complete successfully: exit code:
โ
โ Stderr:
โ fatal: not a git repository (or any of the parent directories): .git
โ fatal: not a git repository (or any of the parent directories): .git
โ Wolfi.container(packages: ["git"]): Container! 5.2s
โ Container.directory(path: "."): Directory! 0.1s
! process "sh -c git checkout $(git rev-parse HEAD)" did not complete successfully: exit code: 128
Full trace at https://dagger.cloud/dagger/traces/78d52f6eee06c50a7887f0bd36059652
Error: response from query: input: daggerDev resolve: call constructor: process "/runtime" did not complete successfully: exit code: 2
Stdout:
invoke: input: version.version resolve: call function "Version": process "/runtime" did not complete successfully: exit code: 2
Stdout:
invoke: input: container.import.withMountedTemp.withMountedTemp.withWorkdir.withWorkdir.withDirectory.withDirectory.withExec.withExec.withExec.withExec.directory resolve: process "sh -c git checkout $(git rev-parse HEAD)" did not complete successfully: exit code: 128
Stderr:
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
Run 'dagger call dev terminal --help' for usage.
It's bad luck, but literally 5 times a day I point to this example as the proof of how wonderful and unique Dagger is
and now it's broken...
Sure go for it - but if it breaks, you're on deck to fix it
lol fair enough
I'll be on deck to reverse the PR immediately ๐
@somber dust other than possible risk of something going wrong, is there any known issue with the PR at this point?
OK then
Fyi, I think we normally do merges as squash-and-merge, instead of rebate-and-merge (even though rebase-and-merge is my personal preference ๐)
Ha ha, I just clicked on that without thinking about it, since a) it's obviously the right way to do it IMO and b) it's usually disabled so I never have to think about it
I guess it was recently enabled?
Segfault in the compiler?
Why is it trying to publish?
Immmmm gonna say that's not our fault
We publish every commit on main
oh right
Not tagged, but they still go to the registry and clj
Yeahhh that job also runs as test publish I think, so we would have tried to build it in the PR
If it properly breaks again, weiiiird
Anyways! On that note, I'm off to bed lol, big travel day for me tomorrow

Safe travels, see you on the other side!
@somber dust are you able to join?
assuming we're skipping today? given there's fastly at the same time?
one question for offline - what's the plan to move forwards with the gha module?
are we holding that work in light of native ci?
if we're going ahead with it, can we import it into modules/ soon? i'd like to have it all live in the same place if we can (i was looking at making some fixes/changes, and it's fiddly to work on across multiple repos)
also if we're going ahead, i was gonna try and simplify to remove the complex dev engine bits, remove the last bits of mage, and convert everything over to use it
the reasoning is - our ci config is now split: this makes changes hard.
i think we either need to go backwards (to what we had before), or go forwards (and get everything consistent) - i have a massive preference for forwards ๐
can't make the call today sorry conflict
good news! the version module is working correctly on tags ๐
just tagged the release v0.13.4, expected pain and suffering, all just works 
I think we should continue. Feels like an important stepping stone, to make Dagger a golden example of "retrofit CI" first, before making it a golden example of "native CI". Feels like a natural progression and helps us keep momentum.
I can open the PR to move gha into modules
re: dev engine. I did think "mm this should print be dagger-in-dagger with a regular gha config" wdyt?
awesome ๐ i'll plan to spend a decent amount of time on thursday/friday doing some work here ๐
mmm yeah, it's almost this - last time we tried, we hit some horrible issues - it's worth trying again though for sure now we have flakes a bit more under control
the reason it's hard, is then this case becomes triple-nested dagger
Noticed the recurring invite is still on? I can't make it today, just gone out for a haircut ๐
Me and @matipan have mostly finished up in https://github.com/dagger/dagger/pull/8715, would appreciate sync/async comments - the idea is to start with this import + feature, then to keep iterating on it in main so we keep up velocity on this (it's an important part of one-click releasing now)
@somber dust yeah the call is canceled. Will remove the invite now. Didn't have time yesterday since it was so busy
@shell portal @somber dust I'm still reviewing that GHA pr, sorry for being slow - it's surprisingly difficult to review the API of a module in a PR, I got used to just having the daggerverse API docs page..
Note, @summer root @finite lava this ๐ might be a good use case for your "open in Cloud" button idea. I would love to click a button in that PR right now, and see the module page in Cloud, with the docs, an interactive shell to play around and everything
Update: this has been de-prioritized, which makes me sad. I am going to try carrying it on my own as a side project. I think it's a waste that our own CI is something we're slightly ashamed of, and don't proudly point our users to it as a golden example. I hope we can free up engineering cycles to do that soon.
As a reminder, this is the original PR: https://github.com/dagger/dagger/pull/8355
We started breaking it up into more digestible pieces.
So far we got one piece merged: the version submodule. I will try to find another small win to regain momentum.
Going to resurrect this game plan
Next up: modules/go
Then I will try cmd/dagger which depends on go and version
Then, an easy one: helm which depends on cmd/dagger
So much good stuff rotting in that big PR... makes me sad
And special mini-boom: https://github.com/dagger/dagger/pull/8869 (the interface thing)
This depends on #8868 , please review that first
In github.com/dagger/dagger/modules/go, you can now pass "overlays" which can be installed on the base container to modify it. T...
If you have any cycles to review, I'd appreciate it ๐
those tend to catch conflicts quickly
(left some coffee shop comments, since I wanted a chance to weigh in!)
Some concerns about the overlays, otherwise, I'm a fan of the direction generally - will help out some more next week โค๏ธ
Hopefully this aligns with release improvements enough that I can steal some time away to work on things here
hey aren't you on vacation?
yeah the overlays are an experiment. Cool use of interfaces IMO, and directionally correct, but lots of room for improvement. Some of it is really because interfaces themselves are rough...
either way thank you!
Yeaaa the weather here in Denver is dreadful though, and things are closing from the storm - so casually looking at interesting work things.
Hopefully Salt Lake on Monday should be a little bit clearer, so will take some time back then.
meanwhile in SF
Jealous ๐ญ๐ญ
started working on moving more actions into our gha management: https://github.com/dagger/dagger/pull/8995
i am starting to wonder about the Pipeline logs part of the report though
it's quite difficult to read, and i'm generally never using it - and just skipping to the cloud trace (or even the github logs view)
wdyt about replacing it with just a link to the github action logs, since we already have the trace view?
an example: https://github.com/dagger/dagger/actions/runs/11912487630?pr=8995
generally really liking the summaries though, with multiple jobs, it gives a very nice summary of each individual job all in one page
the cli module lgtm - https://github.com/dagger/dagger/pull/8870
just waiting for checks to pass, then will merge ๐ (update: merged)
did a follow-up in https://github.com/dagger/dagger/pull/8997, looks like the logic i originally added way-back-when that detects if the git repo had changes to it was wrong, and the new cli module somehow triggered it
helm is ready for review. https://github.com/dagger/dagger/pull/8888
Quick update:
- I missed a gap in the helm modul (thank you Justin for catching it)
- Helm and all SDKs share a dependency: the release logic.
- In the full refactor, I solve this by moving those release utilities to the
release/module - I'm going to open a separate PR for the bare minimum
release/module, to unblockhelm/. We can add more release logic in there as a follow-up, but I know that's a sensitive topic for you Justin so I expect some bikeshedding. Hopefully by keeping it very small in my initial PR, we can avoid holding back the helm merge too long
update: release logic was refactored 3 weeks ago, so that part of my original PR doesn't apply cleanly anymore. Going to try to figure out what the refactor does
@somber dust I think that's your refactor. Specifically it's githubRelease() that now receives a dagger.VersionGit
I can't understand where that type comes from. The version module? But it has no Git type
Oh right it does - it's just in a different file
So here we are, I can no longer easily break out githubRelease into a separate module, because it takes a type from another module, which is not allowed
hm, that is not ideal, i think have a few ideas for how we could do it, and engine changes we could make to unblock it - it's not impossible, just fairly difficult right now
up to you for how you want to progress - i've got to head off for the day now, but i've planned and scoped the shell autocompletion work, and expect it to take pretty much all of my time until it's done
Why even carry that Git field around since it's queried directly from the version module?
That is definitely the correct priority ๐
i don't think there's any reason it couldn't just take a directory?
just looking through it
All invocations of githubRelease pass the unmodified dagger.Git field, which itself is received unmodified from calling the version module
So I think we can remove that parameter altogether
yeah, we could probably just call it directly
If later someone needs to parametrize it, they can just add a string. But for now it doesn't even seem that anyone needs it
yeah, should be pretty doable
OK cool, I will try after my childcare break ๐ (I'm on PTO today but sick kids at home, so this is my break for the day)
release module sgtm - or even just a generic "utils" module
^ if that helps unblock faster
insert meme: when refactoring CI is your fun breal for the day
Thanks Justin! Can't wait to play with autocomplete, it's going to be epic
Especially in future v2 when we add daggerverse search + autocomplete of module addresses based on daggerverse index ๐คฏ
i think we might need to consider rewriting the entire shell parser ๐ (after shipping an initial version)
but i'm getting ahead of myself
well that took longer than I expected... the whole day to be precise
but it's pushed
question - should ci jobs triggered through our modules/gha helper be able to access details of the events that triggered them?
just realized when thinking about @oblique ginkgo's use case in #maintainers message, and also about changing our release pipelines to all use modules/gha (which need to know the tag that triggered them)
i think we should probably support this, would definitely be useful for our use case, and maybe others as well ๐
but not sure what this would look like? a basic idea would be to look for args of a specific "format", and inject those.
e.g. if the target function can take an arg --github-ref, then it should automagically inject GITHUB_REF into it.
haven't had a proper look about these yet this week sorry, will manage to find some time next week
@brisk raptor for when you're online: do you have any context for the code here? https://github.com/dagger/dagger/pull/9161#discussion_r1878273553
it seems incorrect to me, and it seems safe to remove (I think github handles this already), but I might be missing something fairly important here.
Super weird I remember dealing with this typo already at some point.
probably a regression across refactors
seems to have been introduced in https://github.com/shykes/gha/commit/c4940abbfcb0f113e25ac4e05cd329a22d29a5e5
yeah the same repositoryUrl issue existed there though
since there's not a corresponding git.repository_url
but it's not worked since that refactor
but i think this logic can just be removed entirely - it's all available in env vars anyways
I don't know what you're referring to, sorry
i'm saying, i think this entire block of logic probably can just be removed?
i can see what the logic is trying to handle, but github already does this and provides these environment variables
what i'm trying to understand is why this was setup in the first place - was their some case that needed handling that we shouldn't break?
kind of annoying that we've lost that history in dagger/dagger. I planned on removing that repo
no, Github does not provide this. You have to manually inject it in your yaml
The use case is hiding GIthub's dumb extra layer of variable interpolation
i'm so confused
we literally use it like this
github also documents this behavior https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/store-information-in-variables#default-environment-variables
Mmm you're right - then I can't remember the exact reason I needed it. I think it was a variable that they didn't expose as env vars by default
You can also read most of the github context data in environment variables
I think it's that "most" that bit me at some point
Also, I used this mechanism for secrets
the secrets mechanism is slightly different (we do need that bit)
(but I think disabled it in the current version because I was worried it might be too magic)
No I mean, I used to auto-select secrets by looking at env var lookups
we don't do that ๐ค
you still have to explicitly provide them
i think i'd like to remove the github env bits - it won't break anyone, since the behavior is already broken right now
if a user reports a case where it won't work, we'll at least find that case
the git history is not deleted. it's in shykes/gha.
we discussed how to do this
together - we did in the code review in https://github.com/dagger/dagger/pull/8715 and the followup conversations around that
Rough summary:
Imports https://github.com/shykes/gha into modules/
Add WithJob to allow combining all SDKs together into one github workflow together
Remove the Settings object - we started out ke...
Well I'm going to delete that (it's polluting daggerverse results)
yeah makes sense
It was a neat trick, and got me excited about using Paul's go-bash library for more things. I wish I could remember the variable that I couldn't get the normal way
this potentially could break usage of it for users who were using it and haven't yet migrated - could we hide the results from daggerverse instead?
Sure, it wasn't at the top of my todolist anyway. Good opportunity to dogfood the module retirement / migration process
(at the moment there is no way to mark a module as deprecated or internal, for use by daggerverse. But @gentle orchid is on it I believe as part of #p-daggerverse-quality )
hoping to start a bit of a brain storming session here: https://github.com/dagger/dagger/actions/runs/12275890055/job/34252281126#step:3:121
during our release, looks like i missed the fact that we renamed an arg but it wasn't updated in a workflow
not even autogenerating our gha would have helped here
is there something that would have helped us catch this? i really like the idea of being able to "statically type" this
e.g. if it was somehow possible to say, here's a dagger function call - take it's id, and "replay" the id in the context of gha
i might think about this idea
sorry I missed this ๐
sounds like a more fundamental problem that we need a Native CI to solve
(ie let's get to a point where the gha workflow simply never contains any function name or arguments- because there's only one entry point and everything is configured in the module)
yeah, for test this is alright, but curious if we have a long-term plan for how this would work for releasing? we need to pass the tag we're releasing in (along with other metadata sometimes, that github populates)
if we had contextual env or contextual events or something like that, that would let us do that