#maintainers
1 messages Β· Page 12 of 1
i can't remember every single interaction here, sorry, there's more context written up in https://github.com/dagger/dagger/pull/7858/
but if it's all possible, i think we should keep the versions of client/engine that are built from a single build the same
not doing this feels like weird issues around the corner
so if the engine has a hash in the id, so should the client
ok
and the hash should be the same? if the engine had a hash of engine files, and the cli a hash of cli files, things would break?
i'm not 100% sure what will happen, but i remember writing the dev-export function specifically to solve this problem and keep them the same
just making sure i understand how it works so I can make our CI as simple and efficient as possible
is dev-export used for the "dev-engine: true" CI workflows?
yes, and also for the hack/dev script
without the dev-export, it was possible to hit a scenario where you build a cli and engine together that refused to talk to each other
because one of them was consistently newer than the other
they need to have the same version identifier always so they can always talk to each other
I'm a little worried that this entire load-bearing wall is built on changie files but anyway
we could actually get rid of the changie bit
we use .changes/.next mostly
which we manage manually
so I saw π
it's really interesting how understanding of the pipeline leads to understanding of the product behind it
like watching how a product is assembled at the factory helps you understand how it's made
hello folks, when adding a new field to
@tepid nova I know you said that there is work on the roadmap to make this variable unecessary, but it bites lots of us today when inadvertently left set. #daggernauts message cc @spark cedar
https://github.com/dagger/dagger/issues/8448
I don't really feel like I'm getting much visibility into nested TypeScript modules with the CLI or Dagger Cloud.
Can someone maybe take a look?
Can you send a trace URL, and where you expect to find nested info?
I'm guessing by "nested" you mean executing a container which itself calls the dagger api?
I only see the entrypoint function, none of my subsequent or nested @func() are isolated in anyway shape or form
Threaded ^^
@tepid nova It would be good if we could tackle Env propagation in a similar fashion to directories π
What are you trying to do? It would be fantastic if we could allow argument annotations on functions, this would allow pulling the environment from .envrc or .env files Why is this important to you...
Hey @rancid turret I'll need your help on that one please: https://github.com/dagger/dagger/pull/8436#issuecomment-2353133546
Update ModuleFunction.setCallInputs to apply ignore patterns on Directory that have ignore as metadatas.
Update ResolveDirectoryFromCaller arguments to supports optional ignore argument.
Update CLI...
Found a potential security issue. dagger call FUNCTION where the function returns Secret or Secret[], prints all the secrets and there's no obfustication.
^ fyi @hasty basin we were discussing this #p-daggerize-the-world message
i don't think this is directly a security issue, but it's definitely worth fixing asap imo
π I'm trying to exclude everything except my module's metadata in dagger.json and I'm getting some weird behavior. I have this dagger.json
{
"name": "lala",
"sdk": "go",
"source": ".dagger",
"exclude": [
"**/*",
"!.dagger/**/*",
"!dagger.json"
],
"engineVersion": "v0.13.0"
}
However, when calling dagger functions I intermittently get an error with module name and SDK must be set. Any ideas @spark cedar @rancid turret @civic yacht ?
π huh
yes, it's very strange
marcos:tmp/lala (β |kind-kind)$ dagger functions
β connect 0.8s
β initialize 0.2s
! get module name: input: module.withSource.initialize resolve: module name and SDK must be set
β analyzing module 0.1s
! get module name: input: module.withSource.initialize resolve: module name and SDK must be set
β Module.initialize: Module! 0.0s
! module name and SDK must be set
marcos:tmp/lala (β |kind-kind)$ dagger functions
β connect 0.8s
β initialize 0.4s
Name Description
container-ech Returns a container that echoes whatever string argument is provided
marcos:tmp/lala (β |kind-kind)$ dagg
i wonder if the include/exclude pattern isn't fully deterministic
probably a weird race somewhere
I was thinking about the same
I'll open an issue
@tepid nova @wild zephyr I wrote a comment on my PR about the errors API I threw together, but bikeshedding there would probably get too off topic/broaden the scope - so I carbon copied the meat of it into the more appropriate issue: https://github.com/dagger/dagger/issues/8421#issuecomment-2353430304
Just pinging in case you had a half written comment and didn't notice my edit π
What's the technical limitation that prevents a module from returning a type from another module? π
I need to stop asking questions in your dev channel. Should these be in #daggernauts ?
yes, that's a better spot for modules, but this is a good spot for engine stuff. That kind of issue does kind of straddle π
seems like that's the case π±
works: (exclude: **/.git, **/*, !.dagger/**/*, !dagger.json)
doesn't work: (exclude: !.dagger/**/*, !dagger.json, **/.git, **/*)
exclude order is mixed. Will open an issue and tackle this one π
Ooooo that makes sense - could be something like we convert it to a map and then back to a list?
Feels like a v0.13.1 issue for sure π
Are SecretStores unique per module or per client?
yes, addressing that now
per-client, with each invocation of a function being a unique client. Secrets are automatically transferred between clients by providing them in arguments to functions or by returning them from functions.
Ah, that's why this doesn't work π’
IIRC (and I very well may not be remembering correctly), it's expected in buildkit filtering logic that the order matters here. The first case is saying:
- ignore everything (
**/*) to start - but then on top of that don't ignore
.dagger/**/*anddagger.json
Whereas the second case is saying:
- Don't ignore
.dagger/**/*anddagger.jsonto start - but then on top of that ignore everything (
**/*)
Were you saying that you get a different order every time you run it?
that's how it gets printed in the TUI at least.. I'd assume that order matters π
Ah nevermind, I thought at first you were saying you changed the configuration to that order and got the unexpected behavior, yeah that all makes sense then
gotcha, yes, will fix this as soon as I finish my call π βοΈ
Was catching up on what I missed during PTO and saw this thread, simplifying now SGTM but left some more context on the previous approach here: https://github.com/dagger/dagger/pull/8341#issuecomment-2354084402
Are there any plans to standardise some common functions into formal Dagger API(s)? β An example of what I meant is dag. HTTP, it's very handy. I'd love to see dag. Zip/Unzip, and for sure, there are use cases that'll come up with other common functions as APIs. π€
hello folks, in core/service.go, I have an instance of *core.Service, how do I convert this to a dagql.Instance[*core.Service] instance?
ah π so a dagql.Instance is a result that is returned from the api
so you need to get one by calling Select
you can't directly convert them
requesting comments on https://github.com/dagger/dagger/pull/8318
i would like to switch to including the git dir by default for git clones, specifically this solves the very real issue https://github.com/dagger/dagger/issues/8387 where context directories work different when loaded locally vs from git
passing structs between modules
is someone else also having issues running dagger locally due to GitHub timeout issues?
I may have messed up my Docker installation somehow (I was trying to do weird things there). re-installing Docker fixed it.
could someone please help me re-run this action run: https://github.com/dagger/dagger/actions/runs/10907039367/job/30269750383?pr=8479
Can I set a default value for Container arguments? Eg. a string that would be used as a remote image ref, same as CLI arguments
So if you call From repeatedly it would always use that value? If so, no, doesn't exist right now
No I just mean this:
func Foo(
// +optional
// +default="alpine:latest"
c *dagger.Container,
)
Ohhh those default args π no that doesn't exist either yet, last I checked anyways
Makes sense to me though
Right, it maps directly to what the CLI does, so I actually didn't even think twice before using it - was surprised it wasn't there. I'll create an issue
I have to change my function to take a string ref as argument, because I can't manipulate the platform after receiving that container
Yep that makes sense, that API was very pre-modules and doesn't really work in this model very well
Initially I wanted func Foo(c *dagger.Container, p dagger.Platform) but had to change to func Foo(c string, p dagger.Platform) So I can do dag.Container(dagger.ContainerOpts{Platform:p}).From(c)
Question on our build: I see that we build the CNI plugins in different environments depending on the engine's target distro (ubuntu, wolfi, alpine). But we don't do that for other binaries bundled with the engine - or even for the engine itself. What's special about the CNI plugins that requires this?
I thiiiink they're linked to system libraries which are gonna be different on each system?
OK so CNI plugins depend on system libraries in a more specific way that other binaries in the way don't - right?
just making sure i understand
Also, would it make sense to use the same, distro-specific build environments for all binaries, even though we technically only need it for cni plugins? It might help with caching - only one build environment instead of 2. Thoughts?
Yes yes yes
Atm, we have a problem that you can't actually build with race
Oh?
Ah - I saw it needs CGO but assumed everything else worked
Which is very very platform specific
I see, so building with race works when the build env is ubuntu or wolfi, but not if it's alpine?
Maybe I can't quite remember I haven't tried recently
It should work on all of them
But I think it only currently works on Ubuntu?
Ok. My immediate takeaway is: it's a good idea to build everything in a distro-specific build env, instead of just the CNI plugins
Yes I think this is correct
For the engine, definitely
Cli doesn't need this afaict
Right but if we're building it all - can't hurt either right? (thinking of minimizing cache footprint maximizing caching)
Mm sure, that works
In any case runc has its own special-case build env... Not sure if that could ever be streamlined. It has xx-apk and xx-go magic that I'm afraid to touch π
@spark cedar ah you tricked me π With a shared build env across all binaries in the engine build, it makes slightly less sense to break out these binaries into their own modules
(need to pass the env container as argument to each)
I'm still going to try though! Let's see what happens
Ahahaaaa no trick planned I promise, you just happened to stumble into my little peeve I've been meaning to fix in the ci for ages
Yeah runc is a huge nightmare to cross-compile without that magic; the end result is cross-compilation and statically linked binaries IIRC. Maybe xx-apk/xx-go could be packaged up into a reusable "cross-compile my cgo into a static linked binary" module though? π€·ββοΈ
That sounds cool π At the moment the magic is not a blocker at all, so file under "future bonus optimizations"
@still garnet I'm working on integrating resource monitoring for execs so we can track cpu/memory/disk/network usage of individual processes and find performance bottlenecks. Most immediate goal right now is to use this to figure out what specifically in our CI is hogging the most disk bandwidth since that seems to potentially be a big bottleneck for us. But obviously want this to be a base for bigger TUI+cloud features for both us + users, so attempting to do something that is more than throwaway code even if it's barebones to start.
- cc @spark cedar @wet mason @stray heron et al, relevant to all your interests π
Have the cgroup monitoring up and running, can print metrics directly to engine logs for each exec and such. Trying to figure out the best way to hook it up to just the TUI over OTEL. Starting a thread π§΅
could someone please retrigger these github actions: https://github.com/dagger/dagger/actions/runs/10914096343/job/30291666110?pr=8427 and https://github.com/dagger/dagger/actions/runs/10914096349/job/30291664856?pr=8427
Hi @spark cedar π when you have change, could you pls review https://github.com/dagger/dagger/pull/8479
done
just an fyi - i think i'm looking at extending the graphql introspection api for dagql to include querying directives: https://github.com/jedevc/dagger/blob/48d0fb32f381ba3cc0aee4120d5ef5bcdeb22994/dagql/introspection/types.go#L139-L142
i have a pretty strong use case for this, and given we already have data like "impure" here, it would be quite nice to access this info from codegen clients
i remember when talking about this before, we were blocked because of https://github.com/graphql/graphql-spec/issues/300 (which doesn't seem to be going anywhere), but because we have dagql, we have complete power to write our little custom extension to the introspection api
mostly flagging
just in case someone has some huuuuge objection to this π
Is there any way SDKs could generate a constructor from top-level fields?
While splitting our CI module into sub-modules, and using context dir, I'm finding myself using fields + constructor more and more, as opposed to individual function args. For example it makes way more sense to pass source as a constructor argument, instead of passing it to each individual function. This is especially true in Go where ignore patterns have to be copy-pasted in each function.
The result is that I have a lot of constructor boilerplate. I wonder if we could reduce that boilerplate somehow?
func New(source *dagger.Directory, version string, platform dagger.Platform) MyModule {
return MyModule{
Source: source,
Version: version,
Platform: platform,
}
}
type MyModule struct {
Source *dagger.Directory
Version string
Platform dagger.Platform
}
--> Could we make that shorter
I've been doing the same with credentials. Better to pass them via constructor than each and every function.. It would definitely be nicer if there's a way to reduce that boilerplate.
Hi all, what does your inner loop look like today for changing a line in the engine? I know that the slow rerun of pipelines is being worked on, but in the meantime, how do I rebuild the engine leveraging go's cache?
Hi all, what does your inner loop look
@rancid turret Could I get a review on https://github.com/dagger/dagger/pull/8436 when you're available please? π
Update ModuleFunction.setCallInputs to apply ignore patterns on Directory that have ignore as metadatas.
Update ResolveDirectoryFromCaller arguments to supports optional ignore argument.
Update CLI...
hey i have a naming question π€
what is RelHostPath? what is the "module root" in the description? i'm really struggling to understand all the conflated terms we use around here (see https://github.com/dagger/dagger/issues/8424)
I think RelHostPath is the path to the ContextDirectory on the host?
is that true? or am i misunderstanding something hugely here?
and if it is, under which scenarios is actually possible for the ContextDirectory to not be the hosts path value from -m <path>?
Dropping a quick bookmark before I forget: there are a few tests that use the python image, most notably the services tests (httpService). That might be a cause for some extra test load/slowdown - the image is much larger than I thought at the time. It can likely be swapped out for a more minimal httpd based image. /cc @civic yacht (httpd example, using busybox: https://github.com/dagger/dagger/pull/8442/files#diff-cf1e6cc74aade66b528a7d3c9fdaa760bcc1e89ac851e582a46fad7aa3287574R229-R235)
@fair ermine i think this was added as part of context dirs?
This crashes for me:
dagger call -m github.com/dagger/dagger dev
https://dagger.cloud/dagger/traces/28fa4c78653f8386a253f7f1ba063651
fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/main/aarch64/APKINDEX.tar.gz
WARNING: updating and opening https://dl-cdn.alpinelinux.org/alpine/v3.20/main: Permission denied
fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/community/aarch64/APKINDEX.tar.gz
WARNING: updating and opening https://dl-cdn.alpinelinux.org/alpine/v3.20/community: Permission denied
ERROR: unable to select packages:
git (no such package):
required by: world[git]
Can anyone reproduce?
Looks like there's a missing apk update?
hm, looks like a download failed? or the file failed to be written?
https://dl-cdn.alpinelinux.org/alpine/v3.20/main: Permission denied definitely seems wrong
those are warnings (not sure if they're related), look at the last error
right, but if we can't write the package update, that'll explain the error
i've seen these kind of issues in alpine before
This has the warning but no error for me:
dagger core container \
from --address=alpine \
with-exec --args=apk,update \
with-exec --args=apk,add,git \
with-exec --args=git,version \
stdout
If you remove the apk update you get the error
actually nevermind - different error but it fails either way
is this a transient alpine repository error? Or a pipeline error on our end?
This worked 24h ago for me on main
this is working for me perfectly π€
!!
what version of dagger? I just upgraded to 0.13.1 on mac
(didn't try just before)
@civic yacht could this conceivably be related to the 0.13.1 release?
Everything else seems to work fine. Feels a little too surgical to be a low-level system issue
same, i just updated
if you visit the url https://dl-cdn.alpinelinux.org/alpine/v3.20/main manually, does it work?
it should be a directory listing
v0.13.2
Nothing obvious, agree it feels like some weird transient possibly-network error, that url works for me
wooo who's your isp? name and shame
(the curl works - let me check the dagger call)
I don't understand how that turned into an apk "permission denied" error though
It's a local San Francisco ISP, they are generally great so don't want to name and shame
yup it all works now. Sorry for the false alarm
If the apk error had been more clear, I would have figured it out on my own I think
"permission denied" just sent me down the wrong path completely
oh so much love to local isps, i would love to have one, unfortunately i am stuck with the potentially one of the least loved isps in the uk π
idk know the exact error Safari is implying there, but "can't find the server" sounds like DNS maybe?
oh just saw this, yeah lol
I had to do the same thing to my router after moving, I got weird ephemeral resolution problems until I switched from the ISP default DNS to 1.1.1.1 and 8.8.8.8
It's the relative path from the host location, so yeah the path to the context dir from the host, it's because the path to the context dir is different after the module is loaded, so we use that variable instead
ah okay,
any objections if i rename that property then?
because i think it should be ContextDirPath
at least for the LocalModuleSource
Hmmm yeah okay
great, i'll throw something together tomorrow π
thanks! was really struggling in this area today, thanks for confirming my guesses about how this all works π
At what point do we set the "next" version? It feels a little weird that our dev version strings are based on that "next" version, and even that we set one at all, since the actual decision is not made at the time of release, of what the next version will be. So if there's a file in the file that documents that pseudo-decision, that file is lying. Maybe no bad consequences to that - but it still feels weird to build our tooling on this unsteady ground
Edit: I guess it's reasonable to say: by default next version is a patch. Then we can manually bump to minor or major at the time that is decided. Assuming it doesn't break the factory somehow, that some artifacts are built with a dev patch version that eventually never gets released (because we bumped the minor instead)
Yes this is exactly what we've been doing, its true that there are some pre-releases (dev versions) for versions that are never actually released
But it's okay - because the only constraints that needs satisfying is that the dev version built before a release is smaller than the actual version that was released
fwiw that's also what Go does - when you pin to a commit after the latest released version, it "guesses" and fills in a +0.0.1 patch bump and appends the timestamp/commit
@fair ermine FYI I just found a bug in context dir, in remote modules, relative paths are interpreted as absolute (only in remote modules, not local)
Yep I'm aware of this one, I'll work on a fix next week
I think this was pointed out back in https://github.com/dagger/dagger/pull/7744#discussion_r1701317137 ?
@still garnet can you explain that one?
You're saying 2 different directory arguments, to avoid re-uploading non-test files when you only changed the tests?
yep exactly - right now the testing feedback loop can be pretty bad, because updating a _test.go file triggers a rebuild of everything on every run, adding (I guess) minutes to each feedback cycle
But doesn't go test build a test binary with everything in it anyway?
I will gladly do this, can you give me an approximation of the files to include or exclude in each?
If you give me a general idea I can give it a shot
not sure what you mean - the bit that's slow/annoying is that we rebuild the engine (and all other deps of the source dir) every time, even though building those shouldn't need any _test.go files
MVP i suppose is just excluding *_test.go file from the engine/CLI builds - maybe that could even be applied after-the-fact? π€ is there a withoutGlobs("**/*_test.go")?
@tidal spire how do I use your supermod check?
i can give it a go some other time too, already had a false start on it
There is no withoutGlobs but I agree we are missing a bunch of filtering functions in the core Directory type
Also a Directory.wrap (to replace the dag.Directory().WithDirectory("foo/bar", dir) with dir.Wrap("foo/bar"))
it's at least possible with buildkit - bass has (glob dir ! ./**/*_test.go) and it has nice cache-bust-avoidance semantics you'd expect
Just to confirm @still garnet, should I try excluding **_test.go from the engine source argument, and add a separate arguments with only those files? For caching benefits + re-uploading benefits
Filtering them after the fact will help with cache busting but not with upload
(but I don't know how good buildkit is at deduplicating individual file uploads)
yeah that's a good start - though the other arg might just need to be everything, assuming tests import code from the repo too. but that's fine, since the tests are gonna need to re-run anyway - I just want to avoid the whole engine/cli/sdk/whatever-else rebuild
I can just merge the 2 arguments
I'm stuck on something for the version logic... It impacts best practices for multi-module possibly.
I want the version module to be importable by everyone else, without passing arguments. But that requires it being able to get all the information it needs from the context, to construct the version string.
- Current git tag β
I can get this from
.git - Current git commit β same
- Digest β
I have an
inputsargument with contextual default - Clean checkout or not β -> I don't know how to get that information. I need it to decide whether to return <tagname>, <nextTag>-<commit>, or <nextTag>-<digest>
Possible strategies:
- Get the whole repo checkout from context, and run
git diff. But then I'm pulling in the whole repo for every module importing me... So it's a non-starter - Use
inputsas a partial checkout, rungit diff. Git will see a bunch of missing files (those I filtered out). But maybe I can land on my feet and get a diff only for the relevant files? - Take
dev boolargument: caller has to specify if this is a dev build or not. Annoying because I have to plumb that argument through everywhere, which makes the pattern less useful - Use a special file as shadow argument, eg.
./version/DEV? If the file exists, set dev mode?? Feels weird but would preserve the pattern. wdyt
Also this kind of makes me want a config format to inject arguments anywhere... Then I could 1) add a dev bool argument, and 2) to avoid plumbing it everywhere: dagger call -m github.com/dagger/dagger/engine --config github.com/dagger/dagger/version.dev=false container
Or in config file form:
$ jq .config dagger.json
{
"github.com/dagger/dagger/version": {
"dev": true
}
}
Any suggestions on the above would be appreciated π do you have a preference?
β― dagger core version
β connect 52.4s
β initialize 0.4s
β prepare 0.0s
β version: String! 0.0s
Full trace at https://dagger.cloud/dagger/traces/4d6131af87f939f3c6d357910923b685
v0.13.2
A new release of dagger is available: v0.13.2 β v0.13.2
To upgrade, see https://docs.dagger.io/install
https://github.com/dagger/dagger/releases/tag/v0.13.2
π€ Why does it think it's out of date? cc @wet mason
OK this is my ideal solution: https://github.com/dagger/dagger/issues/8520
getting the same msg locally
...yeah i think it might be time for another patch, it's on every command π looks like an issue with the code
π’
@wild zephyr https://github.com/dagger/dagger/pull/8521 π
v0.13.3 - 20th September 2024
Ugh. Sorry. I donβt know what happened, I tested it with a dummy image. Must have messed up a rebase
I don't suppose it would be possible to see individual files being downloaded in the traces? π
Or more generally, greater detail on what's going on during that upload
Sometimes I feel it would be faster to git clone the whole repo from remote, than upload it locally to my Docker-for-mac VM... but that can't be right, can it?
it could be. I think Buildkit even does fancy things to avoid cloning the repo fresh each time, but don't quote me on that
when using the commit in dev version, preference between short or full commit ID?
Also: when in a dev version (eg. <next-version>-<commit> or <next-version>-<digest>), what is the correct value of github.com/dagger/dagger/engine.Tag? I think that refers to the docker tag of the registry image to auto-download. Should it be <next-version>? Or should that value not be injected? Or does it not matter because that feature just won't work anyway?
Right now that tag is entangled with the git tag used for version - so it's hard to follow which tag is which in the pipeline code
What's the desired behavior here @civic yacht @still garnet ?
Use the git commit maybe?
We publish images off main tagged using the git commit, so yes
e.g. registry.dagger.io/engine:<git commit> (just making sure we're on the same page)
btw you can now inspect the current behavior easily while I develop it :
dagger call -m github.com/dagger/dagger/version@pull/8355/head version
EDIT: nevermind, -m strips .git from the context, so that breaks my module
damn
This should work then:
dagger call -m github.com/dagger/dagger/version@pull/8355/head --git-dir https://github.com/dagger/dagger#pull/8355/head version
Mmmm that fails with:
marshal: json: error calling MarshalJSON for type *main.Version: json: error calling MarshalJSON for type *dagger.Directory: input: loadDirectoryFromID resolve: load: load base: load base: parse field "git": Query.git has no such argument: "keepGitDir"
Not my lucky day
Isn't that related to a recent breaking change?
Yeah it changed to the opposite; we keep the git dir by default in Query.git and there's now instead an option to remove it (discardGitDir)
But I think this error comes from the CLI, as opposed to a module?
Also, shouldn't that new behavior also apply to -m? π
(eg. keep .git by default when loading a remote module)
I need to check the implementation for that, one sec. But the tradeoff is that .git is not really stable and thus causes unexpected cache invalidations, so including it in the module context might hurt performance
Yeah I can see that
But, consumers could still decide to strip, right? So each SDK could easily remove .git to be more cache efficient?
Honestly, I don't even want to access .git, it's just a workaround for this: https://github.com/dagger/dagger/issues/8520
I just need a way to get current commit, current tag, dirty/clean checkout state, etc. For the purposes of assembling our version string without requiring arguments
Yeah I think that's the better approach overall for modules specifically since .git is only really needed for running git subprocesses
Using .git has major downsides: 1) I have to upload all of it, including objects, which is big. and 2) As Justin pointed out, the git dir may not actually be at .git within the worktree.
But, that makes the above pretty high priority, because it's very common for parts of your pipeline to need information about the current repo, and right now you can't 1) have that, and 2) use the state-of-the-art contextual patterns
Wait, sorry, that PR that changes to discardGitDir is approved but not merged yet
https://github.com/dagger/dagger/pull/8318
So that shouldn't be applicable
Trying this locally to see what's happening
Yeah I think it's not too bad to implement. Trivial for modules pulled from git obviously. Local modules can probably just forward the git metadata in headers in their requests (as long as that doesn't add too much startup overhead latency)
Local modules can probably just forward the git metadata in headers in their requests (as long as that doesn't add too much startup overhead latency
Didn't understand that part
I was thinking more like - if a local module calls dag.Context().Git().Head() or whatever, the CLI just executes git rev-parse HEAD in the module directory (or ideally the equivalent without requiring git being installed)
Yeah that approach requires a session attachable so the API would reach back to the CLI and ask it for that info. For our own internal simplicity it would be nice if loading the git metadata was fast enough that we could just always do it on local module load and forward the metadata then so it's available if queried, rather than requiring the session attachable
It's just performance + internal complexity tradeoffs, either would work
repros for me, looking at it
I see. Makes sense
One benefit of the session attachable approach, is that it would warm us up for more uses of that technique. Things like contextual auth, native execution of some mac commands (to allow daggerizing xcode workflows), native LLM bridges (to take advantage of eg. ollama running locally)
I hope I'm wrong, but my first suspicion is that the version module has a dependency on a module named githttps://github.com/dagger/dagger/pull/8355/files#diff-1b17528b6c0a57a068c3cb37a8eea4c740f786edecb6af4733ebc5d0c85f9968R6
Which is overriding the core Query.git... There is supposed to be protections against this and we have tests, but possible something is going wrong there. Trying something locally
oooooh
FYI I pushed a small update to the branch, to gracefully handle absence of a git directory
So this should work now:
dagger call -m github.com/dagger/dagger/version@pull/8355/head version
@civic yacht unrelated question (sorry for pestering), is there any benefit to splitting a big contextual directory argument, into say, 3 smaller ones, with mutually exclusive ignores? Does this actually prevent re-uploading too much, say if files changes in one of the 3 but not the others... Or is buildkit smart enough to dedup those uploads anyway?
Ha ha amazing, it now works and look at this:
Buildkit filesync is smart enough to dedupe reloading files that haven't changed.
Leaving them combined would also save you in cases where there's overlap in the directories. If you split them up then any overlap in the dirs will get duplicated and loaded separately into each contextual dir.
But if you leave them combined you don't need to think at all about whether there's overlap
So overall I'd say combining them when it makes sense is the right approach
$ dagger call -m github.com/dagger/dagger/version@pull/8355/head
v0.13.0-b11e6ad29856a079612a2fb3f855e8fac324654d
Old non-rebased branch π
$ dagger call -m github.com/dagger/dagger/version@pull/8355/head \
--inputs=https://github.com/dagger/dagger \
--git-dir=https://github.com/dagger/dagger \
--changes=https://github.com/dagger/dagger \
version
v0.13.4-d3a0217-edcc585da90cb89048264f90ed2dde0105439fa8
Pretend my context is main π π
(Note in the first example, lack of git commit in the version, that's because of the lack of git-dir in the context, I just handle it gracefully for now)
OK. So for this request by Alex: #maintainers message
The solution is to keep 1 argument, but just make sure to exclude test files in builds that don't require it, to avoid cache busting? ie. upload is already efficient, it's just caching of some build operations that isn't
Yeah exactly
More version introspection goodies: support for component versions
$ dagger call -m github.com/dagger/dagger/version@pull/8355/head \ # dear new version module
--git-dir https://github.com/dagger/dagger#sdk/python/v0.11.0 \ # pretend you're at this commit
version-tags # What are all your tagged component versions at this commit?
- _type: VersionTag
commit: 210aedfb
component: sdk/elixir
version: v0.11.0
- _type: VersionTag
commit: 210aedfb
component: sdk/go
version: v0.11.0
- _type: VersionTag
commit: 210aedfb
component: sdk/php
version: v0.11.0
- _type: VersionTag
commit: 210aedfb
component: sdk/python
version: v0.11.0
- _type: VersionTag
commit: 210aedfb
component: sdk/typescript
version: v0.11.0
Let me know how this goes, I had some more follow up thoughts wondering what the actual end behavior would be here. There's a subtle difference in caching around "call digest" and "content digest" that might go either way and impact this indirectly. If you still don't see the ideal caching I'll poke at it
I've got my hands full at the moment, so didn't get around ot it
Yeah for sure, just ping me in the indefinite future when you get to it π
@civic yacht re: engine image tag in nightly builds... Is it the full commit sha, or short?
full commit sha
$ dagger call -m github.com/dagger/dagger/version@pull/8355/head \
image-tag
919f2758766cb5ad5e18fc85554871e535bdf37b
$ dagger call -m github.com/dagger/dagger/version@pull/8355/head \
--git-dir https://github.com/dagger/dagger#v0.13.2 \
image-tag
v0.13.2
Honestly this is fun π
Oh no I just realized that github.com/dagger/dagger/version.inputs may be causing a re-upload of most of the source files... Once to compute the version digest, then another time to actually build π
My next mystery:
dagger call -m github.com/dagger/dagger/engine@pull/8355/head container
https://dagger.cloud/dagger/traces/b66d272fe2ccf04af9c100b3e6886c28
marshal: json: error calling MarshalJSON for type *dagger.Container: input: dumbInit resolve: failed to collect IDs: failed to convert field "Platform": "" is an invalid OS component of "": OSAndVersion specifier component must match "^([A-Za-z0-9_-]+)(?:\\(([A-Za-z0-9_.-]*)\\))?$": invalid argument
This arg? https://github.com/dagger/dagger/blob/90c205c82c0681387e40304d6549958dd46e5267/version/main.go#L31
It should only be uploaded at most once per session (and have caching even across sessions, modulo pruning of course), when do you see it re-uploaded?
it's just that there are other args in other functions that include most of the same files, to build from them
so i worry about duplication of uploads between those different args to different functions
Ohh I see, yeah that makes perfect sense and is something that can be fixed, though it's in the bucket of the other filesync changes we talk about periodically that amount to "reimplement the whole thing"
The problem is buildkit caches each dir load from a given client individually in it's own little cache. I think what we want is more like something where a given stable client has a single filesync cache dir for it's entire /, and individual file sync calls just fill in the requested part of the tree. Then if there's ever overlap in distinct contextual dir loads, they will get deduped
I like how we're getting to the root of the problem and seeing the downsides of implementing something like https://github.com/dagger/dagger/issues/7777.
I've noticed that a ton of people have been looking for git modules in the Daggerverse and that your shykes/daggerverse/git module is heavily used, so I figured there might be some gaps in the core Dagger API's git functionality that perhaps your module has been filling.
Could this be assigned to me, please?
sure, done π left some notes on implementation details, we should auto-generate the json schema if possible, since otherwise we'll likely end up with out-of-date docs (there's a fair few incoming changes here)
Thank you
As discussed a few minutes ago: https://github.com/dagger/dagger/issues/8546
I finally got around to fixing this in upstream github.com/shykes/gha. Will open a PR to use it later today (should be just a dependency bump + re-generate)
I have no idea what's going on here: https://dagger.cloud/dagger/traces/c226bb313814920f76b83d531079c291
dagger call docs server as-service up
-> Python error about uv run??
"It's probably an error with a dependency, but which one?" Pretty hard to tell
You need to click >> on the command to see the logs: https://dagger.cloud/dagger/traces/c226bb313814920f76b83d531079c291?span=65f56a76c7feee08
It shouldn't be showing "uv run is experimental". Do you have a pinned version of uv? Maybe you're on an branch that's not been rebased for quite a while? If so, I suspect you'll see uv-version = "0.2.x" (don't know which x) in sdk/python/dev/pyprojec.toml.
Yes, you are in fact. If you're working on the CI module, it depends on sdk/python/dev and that uses uv since it's a python module. I suspect that error could be a fluke though.
No I was just trying to visualize one of the docs PR, what's on that PR is a black box to me. It may have made changes to the python sdk but that would be surprising to me since it's a docs PR
dagger call -m github.com/dagger/dagger@pull/8321/head docs server as-service up
Yes, but the python dev dependency is still in that pull request.
As I said, it may have been a fluke. Does it return the same error consistently?
I rebased the PR, you can try again.
I'm doing a check on all opened PR's today, if there any priority to review, feel free to ping!
would be great to get this PR merged: https://github.com/dagger/dagger/pull/8427
This adds expanding environment variables in WithNewFile. ref: #7951. If this looks OK, I will add similar changes to other API's in this same PR.
I'm checking it
There's an issue with the PHP SDK generation, I'll put a message in their channel, holding the merge till it's fix
right, I just commented on the ticket. there was a thread in PHP channel a few days back regarding this: #php message
Thanks! I'm gonna check what I can do about that
@spark cedar https://github.com/dagger/dagger/pull/8427#issuecomment-2354429740
Should we merge that one since it's a specific PHP CI issue? Can be fixed later no?
even though it's unrelated, we should still fix it before merging
making a green check go to red on main intentionally is something we should avoid
(also there's no rush for this - we're not planning for another release for this week, or the week after)
I think defaultPath="./some/file" might have a bug / is not working when calling remote modules. Could that be a thing @fair ermine ?
Yes there's a known bug
Relative paths are interpreted as absolute - is that the bug you're seeing @wild zephyr ?
As discussed with @spark cedar and @tidal spire today: starting a thread about ongoing refactoring of our CI
Low-hanging fruit: https://github.com/dagger/dagger/issues/8562
Relative paths are interpreted as
Yeah I need to fix that one
@spark cedar Also ready for a review: https://github.com/dagger/dagger/pull/8549
I am trying to add a test that mounts a secret and then verify it. but for some reason I am always getting "secret not found". any pointers what I could be doing wrong?
I was using wrong dagger client
hey folks, I think this PR is now ready for final review/merge: https://github.com/dagger/dagger/pull/8427
fyi all, we just merged some changes that affect how the cli is versioned - see https://github.com/dagger/dagger/pull/8564
there's currently a known issue in https://github.com/dagger/dagger/pull/8581m, but there may be more - so if you see weird behaviour then please shout!
totally separate from the above - noticed something weird... it seems like the apko module never caches correctly (see https://dagger.cloud/dagger/traces/6a73d805b6b04adc8ca14b3c14f63a04?span=c7dab25f378ba74d)
we always seem to have a very expensive import step - and given how much we use the apko module in our ci (through wolfi), that's probably not helping our build performance π’
unless i'm misreading it, i don't really understand how import could be cached at a buildkit level, we're just streaming an entire archive into containerd, and i think that's generally a little expensive
I had a prototype of a apk module that used merge-op to optimize apk image building: https://github.com/sipsma/daggerverse/tree/main/apk, wonder if it's worth reviving?
The README has the TODOs, but I forget the size of them in terms of effort
we could also optimize import? if that's the actual issue.
if we know the digest of the file (i think we should be able to get it efficiently), then we can add an annotation to the loaded image? then if we try loading that file again, we can see it's already in the store and skip that step
Ohh you mean import as in the core operation? I thought you meant something that the apko module was doing itself
Yeah agree that's worth optimizing
yeah, confirmed locally, just running dagger call version for dagger/dagger gives me this: https://dagger.cloud/jedevc/traces/4c1b5d97c65d7a8573315d314c6dec1c
everything is "cached", but we still do a very expensive wolfi step
My instinct is that we'd be better off not needing to do the import at all since it means we are doing container building "outside the dag" and then importing the result into the dag as an opaque blob; it would be better to build the container entirely in dagger and avoid all this entirely.
But on a practical level it makes sense to optimize import if possible anyways and that might be a faster path to speeding up our builds, in which case shipit
@spark cedar @civic yacht just to confirm, what's the issue with the wolfi/apko modules? Is it an issue with module implementation or engine implementation? Or both?
My understanding is that the engine implementation is inefficient and is likely improvable, but there's also a route for a module implementation that avoids the problem entirely, so more so an engine problem but kinda both.
Basically, the module shells out to apko to get a container tarball and then imports that into the engine. But if the module instead built the container entirely using dagger ops, we'd avoid this in the first place.
I see! Now I understand your reference to mergeop
Perhaps one could fork apko to be dagger-aware π
Any chance we could get a 0.13.4 out relatively soon, to fix dagger call -m github.com/dagger/dagger?
just thinking out loud here. have we considered the possibility of running a local registry with all images required for our CI pipeline pre-pulled to avoid flakes due to network latencies?
in our GitHub CI, this registry could be running inside the same k8s cluster where the CI pods are spinned up
Given we're doing the offsite next week, probably not π I don't really wanna end up debugging releases from the plane if things go wrong π
but you can pull a release directly off of main - which is what i'm currently doing to test the new versioned modules
yes π i did https://github.com/dagger/dagger/pull/7110 to try this out
the experiment didn't really go anywhere, but i've been meaning to bring it back
i tried again fairly recently: https://github.com/jedevc/dagger/tree/revive-cache-common-images
but again, got pulled into other things
would you be interested in continuing that work?
my plate is currently full with other things, some of which are not moving due to difficultly in debugging the failures in CI. I am considering following may help us improve the overall posture in debugging the failures:
- Index and track the versions used for different CI executions (engine version, SDK versions and other versions as required)
- Tag all tests with these versions (and thus makes it easy for us to query all tests that were using a given combination of these versions)
- Add the ability to index the test results output into a database (and thus add the ability to ask questions like "show me since when this specific test is failing/flaking. ok, what changed in that particular version". In my past experience of managing a huge (and super slow) corpus of test suites, these kind of queries helped a ton to narrow down the problems.
- reduce the number of network requests we do in CI (in a sensible and non-intrusive manner). Whenever the network is involved in CI, the flakiness is going to be there. we just need to identify patterns and either fix the flakiness (by using local image registry or other resources) or add an ability to automatically detect and retry some of these.
I do want to do POC on some of these in my spare time (to avoid de-prioritizing other things on my plate) but if any of these makes sense or have been discussed in past, that would be great to hear.
I am also evaluating if some of these can be queried from our existing traces (going to play around with graph queries manually).
for the first few ones, @stray heron is the right person to talk to there
he's been using honeycomb for ingesting trace data which has a lot of this information already
we're also working on getting this in dagger cloud
Is it currently possible to use a different OTEL collector to store and inspect the traces of a Dagger module when it runs? βΒ If so, is there any guidance or documentation that directs me on what to tweak?
you can connect dagger through to standard otel env vars (no docs really, it should just all work)
but i really can't recommend it for anything that's not development, most otel viewers produce way too much noise, and make it very hard to navigate
Gotcha; yeah, that's fine. Another question -Β will there be any intention to write a debugger for Dagger that goes beyond what we have with Terminal()? βΒ Context π #1230026862860963920 message
@civic yacht just noticed a weird thing while diving into a flake - https://github.com/dagger/dagger/pull/8586
my logic to kill runaway processes doesn't seem to be applying for our 30 minute timeout... so want to add some more logging, and want to make sure we're not sending multiple signals twice
@Erik Sipsma just noticed a weird thing
Hi, any way to Printf debug from a Go Dagger module (without changing function signatures) ? I was hoping to see some kind of dagger.Log or dagger.Debug
are the TestModule tests executed twice in CI?
Hi Folks, I am trying to make changes for "adding namespace to cache volumes". While doing so I noticed that in func (s *cacheSchema) cacheVolume (https://github.com/dagger/dagger/blob/main/core/schema/cache.go#L39), the current module info is not available. Because this function is called from inside my module (by calling dag.CacheVolume("volume-name"), i was thinking this info should be populated already.
any pointers why that might be the case?
how are you attempting to determine the module name you're in?
do you have an example module/engine changes?
it should be available from somewhere
just not quite sure where
I think it's dag.CurrentModule().Name(ctx)? Unless that always points to the top level module?
yah, that would be to access from the current module, but to access it from inside the schema api server, i think you just need to borrow the impl for currentModule to get it to work
I tried with s.srv.currentModule and parentquery.currentmodule, and both of them returned no module error
Also the getcurrent functioncall also returned empty string
I was going to try this with other api calls (instead of cache volume) to double check
@spark cedar could u pls elaborate a bit on βborrow the current implementation for current moduleβ
That function gets the current module
So just look at what it's doing, and do that π
Hm though that implies there's something else weird
I'm not really sure off the top of my head, but I'd try and trace, why it's not set and add debug statements
Won't really have time to dive into the details this week
You could also be inspired by the object namespacing? We namespace by module name there
So you could get the module name in the same way as from there
yeah, its user error. apparently there were multiple calls to the cachevolume function, and I was referring to the wrong one, where the current module was not available.
Hi Folks, I am running into an issue when using dagger cache volumes. It could be an error in my understanding of cache volumes, but from an end-user perspective, this looks like wrong behavior: https://github.com/dagger/dagger/issues/8628
On mobile, but the reasoning is - populate cache is updating the contents of the cache correctly, but list cache is being cached
The volume is the same, the command is the same, so you get the same result
The volume contents itself isn't part of the cache key to lookup previous cache
Unlike WithDirectory
Agreed this is maybe a bit weird to understand, but the alternative behavior of somehow not caching this would be even more unexpected imo
I'd recommend thinking of cache volumes not as persistent or consistent in any way really - but just as content that "might" exist and can be used to speed up an operation
(Thank you for taking out time to reply over the weekend).
I think from end user perspective if the cache is changed, in next step they would expect the new cache contents to reflect.
just thinking out loud, could our invalid go mod cache issue be triggered due to this behavior (the go mod cache content is changed, but the go build command is using old cache contents?)
Hm, but then if this is the case, and multiple writers use the same cache volume, then they'll constantly be rebuilding - because the contents are always different than expected. Changing this is a pretty big departure from how they've always worked, and how they work in dockerfiles as well.
Maybe? Doesn't seem like this is the cause of that issue, we seem to get a completely corrupted go cache
Also that's not what's happening in your example - the cache directory is changing, but the step isn't being re-run. The reverse isn't possible (barring some very weird bug) - if the step is re-run, it'll always use the newest version of the cache volume available.
@meager summit In my local testing, of v0.13.3, for the lifetime of a given engine instance, the state of a given file or directory within a cache-volume can only be set at most once (not sure, but it seemed like sometimes I could add a directory and sometimes I could not. #1292225631248973885 message Might be an internal optimization??).
func (f *Foo) Nginx() *dagger.Container {
return dag.Container().From("nginx:latest").
WithMountedCache("/nginx-foo-cache", dag.CacheVolume("foo-cache")).
WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /nginx-foo-cache/%s", "nginx-was-here")})
}
func (f *Foo) PopulateCache(ctx context.Context, input string) *dagger.Container {
ctr := dag.Container().From("alpine:latest").
WithMountedCache("/foo-cache", dag.CacheVolume("foo-cache")).
WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", input)}).
WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", "dog")}).
WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", "cat")}).
WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", "fuzz")}).
WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", "gads")}).
WithExec([]string{"sh", "-c", fmt.Sprintf("echo '%s' > /foo-cache/bar.txt", input)})
ctr = f.OneMore(ctx, ctr)
return ctr.WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", "last-one")})
}
func (f *Foo) OneMore(ctx context.Context, ctr *dagger.Container) *dagger.Container {
return ctr.WithExec([]string{"sh", "-c", fmt.Sprintf("mkdir -p /foo-cache/%s", "one-more")})
}
func (f *Foo) ListCache(ctx context.Context) (string, error) {
return dag.
Container().
From("alpine:latest").
WithMountedCache("/foo-cache", dag.CacheVolume("foo-cache")).
WithExec([]string{"sh", "-c", "ls -al /foo-cache/ && cat /foo-cache/bar.txt"}).
Stdout(ctx)
}
As you noted, I could call PopulateCache() again with a different argument and get a Container returned that will have the old state plus whatever the new state clobbers on top, but that won't necessarily be preserved in the cache-volume state if it conflicts with what is already there, I'm assuming.
Need to beef up the details here: https://devel.docs.dagger.io/manuals/developer/cache-volumes
Hi @hasty basin, thanks for trying this out.
What I noticed is that the cache-volume has the new data (when I checked using .Terminal). but the next step (which is actually using that cache volume) is returning the already executed and cached result, so end up using old data.
Changing this is a pretty big departure from how they've always worked, and how they work in dockerfiles as well.
I tried this with Dockerfile, and it seems the next step returns the updated content. Kindly refer to the screenshots.
Having said that, I ack that I have not used cached volumes a lot in Docker context, so if I am misunderstanding something here, please do let me know.
I did another test to split populating and listing cache in different dockerfiles. and now I do see when running the listing Dockerfile, it uses the cached result.
The reason it works is because the cache for the first step is busted, so the second step is also busted
Because it's dependent on the first step
What we're doing in dagger is not having that dependency
I guess the dockerfile comparison doesn't match here sorry - we just don't have that dependency, so it can be cached
Generally, you shouldn't use cache volumes for outputs to avoid surprises like this
π, thank you for clarifying this further.
Also heh, the build view has come a long way in docker desktop π I used to be in the team that was working on it
Hello,
First time that I've seeing this error
Stderr:
# github.com/Excoriate/daggerverse/terragrunt
./dagger.gen.go:988: not enough arguments in call to (*Terragrunt).TgExec
have (*Terragrunt, string, *"github.com/Excoriate/daggerverse/terragrunt/internal/dagger".Directory, string, string, string, string, string, string, string, bool, bool, bool, bool, string, []string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, bool, bool, bool, bool, int, bool, bool, bool, bool, string, bool, bool, string, string, bool, bool, bool, bool, bool, string, string, bool, bool, bool, bool, bool, bool, bool)
want (*Terragrunt, string, *"github.com/Excoriate/daggerverse/terragrunt/internal/dagger".Directory, string, string, string, string, string, string, string, bool, bool, bool, bool, string, []string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, bool, bool, bool, bool, int, bool, bool, bool, bool, string, bool, bool, string, string, bool, bool, bool, bool, bool, string, string, bool, bool, bool, bool, bool, bool, bool)
./terragrunt_cmds.go:280:14: m.TgCmd
It's in the context of a function that has a large number of arguments. Is there any known limitation? β As the number of arguments was reduced, the error disappeared, and the module was successfully regenerated (through dagger develop). The error is somewhat counterintuitive β it states that there were not enough arguments when calling this function, but it was correctly called β when the number of arguments was removed, it went away.
Another non-related question βοΈ , when inspecting the code generated by Dagger, I see that the type for a given function is meant to return *dagger. Container as a type returns just *Container. Why? Shouldn't it be *dagger? Container instead?
Example:
func (r *TerragruntCmd) Exec(command string, opts ...TerragruntCmdExecOpts) *Container {
q := r.query.Select("exec")
for i := len(opts) - 1; i >= 0; i-- {
// `source` optional argument
if !querybuilder.IsZeroValue(opts[i].Source) {
q = q.Arg("source", opts[i].Source)
}
// `module` optional argument
if !querybuilder.IsZeroValue(opts[i].Module) {
q = q.Arg("module", opts[i].Module)
}
// `envVars` optional argument
if !querybuilder.IsZeroValue(opts[i].EnvVars) {
q = q.Arg("envVars", opts[i].EnvVars)
}
}
q = q.Arg("command", command)
return &Container{
query: q,
}
}
From a function which's written (in the dagger module's code) as:
func (c *TerragruntCmd) Exec(
// command is the terragrunt command to execute. It's the actual command that comes after 'terragrunt'
command string,
// source is the source directory that includes the source code.
// +defaultPath="/"
// +ignore=[".terragrunt-cache", ".terraform", ".github", ".gitignore", ".git", "vendor", "node_modules", "build", "dist", "log"]
source *dagger.Directory,
// module is the module to execute or the terragrunt configuration where the terragrunt.hcl file is located.
// +optional
module string,
// envVars is the environment variables to pass to the container.
// +optional
envVars []string,
) (*dagger.Container, error) {}
...
Hi Folks, There are a few PR's which could use some review:
- https://github.com/dagger/dagger/pull/8427 (feat(engine): support expanding variables in more functions) - almost ready for merge
- https://github.com/dagger/dagger/pull/8557 (use '.dagger' as source dir during 'init' and 'develop' when current dir is not empty) - ready for review
- https://github.com/dagger/dagger/pull/8593 (chore(ci): don't send interrupt signal for foreground process) - we may end up just closing this PR
@final star no rush at all but when you're ready here's the two small issues I mentioned:
- https://github.com/dagger/dagger/issues/8513 Support for disabling the automatic init process in execs
- https://github.com/dagger/dagger/issues/8573 Support for setting the size of tmpfs mounts
Of course happy to help/chat about them more anytime too
Hi @civic yacht, I am making changes for namespacing of cache volume, and wanted to clarify a thing π§΅
which of these is prolly easier?
The tmpfs mount one. Disabling automatic init is a couple extra steps of plumbing
cool, i'm taking a stab at https://github.com/dagger/dagger/issues/8620 first but the tmpfs one will be next then
tiny thing that did come out of discussion around improving ci stability last week (though I think we're all pretty good at this already) - if you click the merge button on a PR, it's your job to monitor main afterwards for any failures, any new flakes - and if there are any issues, to investigate/revert/fix/etc. so for merging other folks prs, only do so if you're happy with that burden π
cc @stray heron @civic yacht @still garnet @rancid turret @fair ermine @final star @wild zephyr @tepid nova and anyone else who does merges π
on another note, starting π§΅for v0.13.4
- Merged the expanded variables just now - thanks π
- Pinged others on the changes to use
.dagger- this is a particularly thorny issue, so should probably be reviewed by the folks who have been leading that discussion. - I think we can close the interrupt signal one, I don't think this is the root cause of those flakes.
Thanks for your patience btw, we're all back from the offsite now, so should have more bandwidth for reviews π
Hi @spark cedar, a question about how ID is created. I am trying to trace how the CacheVolumeID is constructed. could you please point me to the code where this logic sits.
ids are computed from the the query that created them
Hey @meager summit , I'm curious how you're debugging and/or locally testing your changes. Any hint or guide that I can follow, besides some like that I recall you guys shared some time ago?
I am usually following 3 ways:
- my first preference is to add an integration test and run it, ensure that I am able to reproduce the problem i am trying to fix and then fix it.
- if that does not work for some reason or I get stuck, I use
hack/devto start the dev version and iterate the scenario usinghack/with-dev dagger call.....and then try to add an integration test for it - the last fallback that I use when I am totally stuck or unable to follow code flow, I am using panics at selected locations, which gives me stacktrace, which I am then using to understand the flow and then adding an integration test for it
having said that, I am still pretty new to this code base and I am sure other experienced folks have better workflows to test the changes, in which case I would also love to hear their approaches.
Thanks so much for sharing your experience @meager summit !
This is one of those things where automation is intensely valuable. Literally watching CI is nearly always a waste of everybodyβs time, ideally we find a way to notify the committer on failure
oh agreed - we should have a bot or something that notifies when there are things failing (sadly might be a bit noisy right now until we sort more flakes, but it would cool to hook into the discord notification hook that's already used in #releases)
but i think the general idea of making sure that when you merge things that you follow-up on any issues/failures from it still holds π
fyi, feel free to pick up any automation pieces you like π they're all super valuable, it's just they get lost in the noise of day-to-day π’
yo random thing, will y'all hate me for using gofumpt ? it comes wired into my lazyvim go defaults, and is kinda a pain to switch back to gofmt (wasted an hour yesterday attempting)
fumpt is a stricter gofmt, and will put a bunch of unrelated line diffs into changes, but is a superset of gofmt, like gofmt(gofumpt(file)) = gofumpt(file)
A stricter gofmt. Contribute to mvdan/gofumpt development by creating an account on GitHub.
ahhh it tried very hard to reformat some of the schema things and kept putting newlines in weird places?
i switched it out from my vim - https://github.com/jedevc/dots/blob/722639b6de9e8346ad5a8f237e50011a92786d58/modules/nvim/config/lua/plugins/main.lua#L47-L55
i did like the grouping of consts and vars though - the weird newlines with inline structs i found really strange
Didn't something like this work for you?
return {
-- conform.nvim config
{
"stevearc/conform.nvim",
opts = {
formatters_by_ft = {
go = { "goimports", "gofmt" },
},
},
},
}
https://github.com/stevearc/conform.nvim/blob/master/doc/conform.txt#L345
no joke, i have this exact block of config open from yesterday and it did not solve my problem 
"stevearc/conform.nvim",
opts = {
formatters = {
shfmt = {
prepend_args = { "-i", "4" },
},
},
formatters_by_ft = {
go = { "goimports", "gofmt" },
},
},
}
hmm
my thinking that it might be insufficient was basically that https://github.com/LazyVim/LazyVim/blob/main/lua/lazyvim/plugins/extras/lang/go.lua also configures gofumpt in mason, none-ls, and nvim-lspconfig, AND lazyvim likes to merge collections rather than overriding them
but now y'all are making me thinking i'm missing something else basic
hm, do you define this before/after the rest of lazyvim stuff?
i think mine is at the very end now
So if you override the LSP config you can set that to false.
imma try to crib yours, i have no idea how ordering works here so i cannot answer your question, that block is just a plugins/conform.lua file
also big fan that i have a lazyvim problem and i immediately have 2 other users coming to my rescue XD so thank you both
i honestly just switched to it in the last month or so, so have been learning it from scratch myself π honestly really loving it, so happy to sell new people on it π
i've been using it for ~18mo now, came from doom emacs, and it's a lovely onramp to the larger nvim ecosystem. plus it breaks on update waaaaay less often then the equivalent emacs frameworks
omfg that formatters_by_ft config block does work, i had just decided partway through that i was gonna try to uninstall fumpt from Mason (so i was checking Mason UI to see if it had taken) instead of checking whether it was actually running against my files
... i must've initially had the nesting wrong when i changed to that testing approach
i also think at some point that avante+claude literally told me that the nesting was the problem but i didn't trust it
yo so i either need a pair or maybe my description of the problem will point immediately to what i'm doing wrong here.
I'm trying to repro #8620, but i can't find the noisy logs that the user is pointing to as a problem. my approach for testing this is to script up a ./hack/with-dev ./bin/without-logs-and-traces.sh ./bin/dagger call check -vvv , hoping i can use the same thing on main vs my branch to see these noisy logs, then have them disappear following my changes.
i expect to see logs somewhere in the CLI output complaining that the trace endpoint is not configured. am i looking in the wrong place? cc @still garnet or @wild zephyr who have some context
Wow, I took this exact same path! Probably around the same time too, maybe a bit earlier..haha
suspiciously the traces are showing up in dagger cloud... my without-logs-and-traces.sh looks like this ```#!/bin/bash
export OTEL_EXPORTER_OTLP_ENDPOINT=""
export OTEL_EXPORTER_OTLP_LOGS_ENDPOINT=""
exec "$@"
oh like i need to be logged out, perhaps?
unsurprisingly logging out does stop sending to dagger cloud, but doesn't produce "noisy" logs unless we're counting Setup tracing at https://dagger.cloud/traces/setup. To hide: export GOAWAY=1
a classic dumb onboarding question, how are y'all dealing with github review request notifications? I'm used to the slack app managing and notifying me, AI says no such thing exists for discord
My go to is https://github.com/dagger/dagger/pulls?q=is%3Apr+is%3Aopen+user-review-requested%3A%40me which shows all the PRs where my review was requested but I haven't gotten back yet
I ignore all the GitHub notifications entirely, and use email - and have a series of filters to sort everything to the right folders (but this does mean I drink from the firehose and read almost every issue and pr that goes in)
I have mixed recommendations on whether this is a good approach π
these options are both very sad compared to the slack option lol
You can write an email filter to check for your review requested, if you can match by content
But yes, only if you like email π
looool more funny lazyvim problems, it hates the markdown formatting in our CHANGELOG.md
disabling markdown extras will solve this, i've needed the rendering magic all of one time
:noa w will write the file without triggering autoformat, if that's the issue you're hitting (I have the same problem all the time with markdown files)
it also has like 100000 inline lsp/linter warnings
so it can just die as far as i'm concerned
It's autogenerated from changie π
oh interesting am i doing the wrong thing by writing into it directly, then?
(we can change some options about spacing, but that's about it)
Ah, probably - if you wanna write a changelog, you want to use changie new - it puts a new changelog file into .changes
Some of this should be in CONTRIBUTING.md - and if it's not, we should update it π
Can't quite remember what's covered there off the top of my head
i read it at some point but immediately forgot and forged onwards
oh interesting we don't squash merge, do we
inferring from commit message instructions
The rule of thumb is to avoid squash merge for huge PRs where all the commits were nicely split up and should be retained in the git history for easier git-blame (problem in the past was massive PRs got squash merged and now show up as massive opaque blobs in the git history that's hard to get any context from).
But if a PR is pretty small and you accumulate a couple fixup-type commits then it's fine to use squash merge.
(IMO, I think this is where we have landed as a tribe generally speaking)
makes sense, i'm more used to the rule being "no massive but well-structured-commit-wise PRs", but i'd venture to guess that rule makes more sense in a SaaS environment where you've got feature flags and other release mechanisms
Yeah it's always better to split up into smaller digestible PRs when possible, but we still encounter enough circumstances where that is genuinely infeasible that we have unavoidable huge PRs from time-to-time.
Hello, guys, I have a question regarding the Plaintext function of a dagger secret. According to its definition, it seems if the secret has an empty string, it just returns the empty string β I'd expect an error; why should a secret have an empty string as a value?
func (r *Secret) Plaintext(ctx context.Context) (string, error) {
if r.plaintext != nil {
return *r.plaintext, nil
}
q := r.query.Select("plaintext")
var response string
q = q.Bind(&response)
return response, q.Execute(ctx)
}
Not sure I follow Alex. That logic is not currently checking for an empty string π€
having said that, empty string is still an "allowed" value if someone wants to define a secret to be that
β¨ v0.13.5 - 10th October 2024
Hello, am I the only one for which the traces just load indefinitely ? ex: https://dagger.cloud/dagger/traces/e49e618fdf256241eae7d38e40b5eee8
Loads for me!
Iβve been having this happen too, clearing safari cache helped once
That's precisely my point: why would someone want an empty secret? β I'd say a more explicit and strict approach to returning an error if the secret is set to empty would be preferable unless there is a practical reason or use cases where an empty secret is required.
IMO it's not about why would someone want an empty secret or not, it's mainly about that there's no right answer to this question as it's "subjective" to whatever someone wants to do with that. i.e: maybe some Dagger user wants to treat empty secrets as a special way to make a decision in the pipeline flow for example. Keep in mind that not defining a secret and an empty secret are two different things.
Since there's no right answer for this as it generally depends on userspace, I think it's ok to allow it
Gracias Marco π§ I'm aware of the difference between setting a secret and assigning it an empty string; I was just a bit curious. It makes sense to actually set a default secret if one comes empty (from a dagger module -> installing/using -> dagger module) point of view.
in dagger cloud UI, is there a way to differentiate cached result vs fresh evaluation?
any objections if we add a new kind/flake label to github? i keep wanting to filter to find all our ci flakes with area/ci, but adding kind/bug doesn't feel like it adequately filters everything: https://github.com/dagger/dagger/issues?q=sort%3Aupdated-desc+is%3Aissue+label%3Aarea%2Fci+is%3Aopen+
i've just been searching for flake in the title, but label is better π
kind/flake exists now π
Hi @spark cedar is there a way to identify if the function was called from mod B directly vs mod B function was called from mod A?
i have a module that accepts cache-volume-id as input. when loaded using id, it should work. but if i try to access it using name, it should not get access to same volume, and rather should create a different one.
right now, (after some of my partial changes for namespacing cache volume), it seems dagger caches the results after I accessed it using id, and now when I try calling it using name it returns same volume.
if you access it by name it should scope it using the name of the module that's calling - not who's calling that module
so if mod A is calling CacheVolume it should be scoped as A + id, but if mod A calls mod B which calls CacheVolume it should be scoped as B + ID
it should never try and determine it's parent
well hah, i guess you have a problem though if you're passing the id
because now you're getting different results from different modules
π hah, fun, this is pretty much the same case as secrets + sockets
so the way this works for secrets is that SetSecret returns a dagql.Instance to a Secret call
so SetSecret only gets called once - in the module where it's set
so you would likely want something similar for CacheVolume - the setting module calls that, but it returns a dagql.Instance to GetCacheVolume (or something like that) - which is now pre-scoped
cc @civic yacht just checking my reading of this as similar to secrets is right
so you would likely want something similar for CacheVolume - the setting module calls that, but it returns a dagql.Instance to GetCacheVolume (or something like that) - which is now pre-scoped
Yeah, I did that already. Erik helped me identify that in a different thread.
now the prob is that it seems mod B is caching something it should not.
do you have a pr π kinda difficult to help debug without that
but i suspect GetCacheVolume equivalent needs to be marked as impure
sure, i can get a draft PR going.
likewise with CacheVolume
wait no, only CacheVolume should need to be impure? GetCacheVolume should be fine to be pure, depending on implementation.
func (f *Bar) UseCacheVolumeId(ctx context.Context, id string) (string, error) {
return dag.Container().
From("alpine:latest").
WithMountedCache("/bar", dag.LoadCacheVolumeFromID(dagger.CacheVolumeID(id))).
WithExec([]string{"sh", "-c", "ls /bar/bar.txt"}).
Stdout(ctx)
}
func (f *Bar) UseCacheVolumeName(ctx context.Context, name string) (string, error) {
return dag.Container().
From("alpine:latest").
WithMountedCache("/bar", dag.CacheVolume(name)).
WithExec([]string{"sh", "-c", "ls /bar/bar.txt"}).
Stdout(ctx)
}
@still garnet yo so i've climbed up a tree regarding the dagger/dagger dev loop and OTEL config trying to fix this lil bug. the bug is unimportant at this point, but what combined me, @civic yacht , and @astral zealot cannot figure out is that somehow using a dev CLI plus engine (via ./hack/dev or more manual methods calling ./bin/dagger directly) never seems to hit the sdk/go/telemetry/init.go log line that you can hit when repro-ing against released artifacts.
is there some special handling of telemetry configuration in the dev build process or the release that we're just not finding?
to minimally repro, looking for "failed to emit telemetry" logs coming out of the cli:
OTEL_EXPORTER_OTLP_ENDPOINT="" OTEL_EXPORTER_OTLP_LOGS_ENDPOINT="" dagger call check
^ has the log, but v does not:
OTEL_EXPORTER_OTLP_ENDPOINT="" OTEL_EXPORTER_OTLP_LOGS_ENDPOINT="" ./hack/with-dev dagger call check
we've tried all sorts of other invocations, ./bin/dagger, checking out released tags, panicing instead of logging, etc, the dev cli always seems to find a way to get traces into dagger cloud despite being configured not to
h
o
w
a
welp. keyboard
π windows
how are you configuring it not to send to Cloud? i think you would need it to ignore your $HOME config state (ie. dagger logout) - don't think it cares about OTEL_*
on the released version, the bug only repros while logged in β ie it does care about the $HOME state and the OTEL_ simultaneously
there's also a really awkward way that Cloud config interacts with OTEL_* env vars, which miiiight be related, just putting it out there in case it clears up some other mystery
iirc if you have a http:// (non-TLS) endpoint configured in OTEL_* the Cloud config picks it up too (magically due to OTel defaults) and barfs trying to send to Cloud over http:// instead of https://
can hop in a chat if you want?
@civic yacht @astral zealot we figured it out, the whole time we had been assuming ./hack/with-dev rebuilds the ./bin/dagger, but we needed to run ./hack/dev prior
so indeed, dumb devloop problems
I should have noticed that when I talked to you
So when the error stopped showing up yesterday, was it because we randomly run ./hack/dev at one point after making the code changes?
yes, probably before i got on a call with any of you
which is also funny because the whole time, on every invocation, i'm fairly certain we were unwittingly confirming that my changes here fix the bug and work exactly as expected XD
π€·ββοΈ π
```## ./hack/with-dev
Usage: ./hack/with-dev ...
with-dev runs the specified command with the dagger context environment
variables set (similar to dev above, but does not rebuild the engine).
this is also clearly documented
Hi Folks, is there a way to enable TRACE logs for buildkit in dagger engine?
if you're using hack/dev you should be able to set TRACE=true in the env
that worked. thanks Justin.
good luck reading the incredibly verbose logs π
ha ha. yeah. that is true. but i am having hard time reaching the conclusion on cache volume dir issue, so trying to build some understanding of how caching works/get-busts in buildkit.
would it be ok if we added a timestamp to this prefix:
prefix := fmt.Sprintf("[buildkit] [client=%s] ", client.clientID)
This maybe useful for corelating timestamps when analyzing logs? I realize it will take submitting a PR upstream.
v0.13.6 π§΅
Which cloud config?
by Cloud config there I meant the code that configures OTel to send telemetry to Dagger Cloud (if you've run dagger login)
fyi - for anyone interested, tomorrow at 4pm UTC, we're having our core dev architecture discussion: anyone interested in working on the engine itself is welcome to drop in π
anybody got tips on what tests they might run to benchmark a global eschewing of buildkit netns pooling? been trying dagger call test specific --run=TestModule but it's heavy af on my machine, feels questionably consistent, and i've hit 30m timeouts and flakes a couple times with --parallel=12.
I'm trying to use dagger call dev terminal as a usable "dagger version manager"... But wiping the cache every time is making it not practical. Is there a way to re-use the engine state between calls dev terminal?
Not sure if I'd try to check this with our tests. Maybe writing a more controlled benchmark would be easier? What are you trying to verify?
Are you manually wiping the cache?
IIRC what's currently happening is that each version that you build has its own cache volume. We could add a flag to the dev command I guess if you want/need to override that behavior and share the state across all versions
https://github.com/dagger/dagger/pull/8712 for more context, basically we want to avoid bk's netns pool (we do this by setting unique hostnames for every hostnameless container) so that modifications to that netns' config don't bleed into bk's process-shared netns pool (thereby bleeding into other random containers) ... in theory the pool is a performance optimization to avoid the overhead of making new netnses for every build container, but it's unclear how big of a hit it incurs.
so in terms of what i'm trying to benchmark, basically want to see the effect on some pipeline that produces a shit ton of containers... TestModule fits that, but it's too many
Oh, I see.. I think it's ok not to use TestModule for this benchmark as it'll definitely bring a lot of noise.
I'd just create a basic pipeline that spins N containers and benchmark it between the two approaches
yeah fair, shoulda just done that out the gate rather than futzing with test
hey folks, is there a way to hide an API from SDK's?
context is that I am adding an api "cache", which is called internally by "cacheVolume" api. I don't want the new "cache" api to show up in SDK as an option.
hmm may be double underscores.
@rancid turret @spark cedar I gave a sneak preview of the dsh branch yesterday... It blew an appropriate number of minds π But I noticed subcommands don't seem to work $() is that a known issue?
Yup, known issue π
But it worked before no? We were only missing the plumbing to detect objects in arguments
Mm I will defer to @rancid turret on this one π I've been more helping out on tui side
No it's not just objects, it's not hooked in to state+querybuilder+id yet. I was going to hack on that next, but if I'd known you would demo this today I'd have tried to get it ready. π
Should be easy enough to make sure it hooks to normal scalars, let me see.
Can I get .git back please π
It's fine, I didn't demo to the whole audience, just gave a sneak preview in the restaurant after. Small audience, but they were already thinking about something similar, so they were excited and got the value right away π
In the current build, can I do .core container for example?
@spark cedar @rancid turret can you give me a working example of calling core functions in the latest build? I was going to show .git ... in my demo, but it's not there anymore
Yes, you can!
.git URL should still be there. For core functions you can just prefix with .core ..., I hope there's no regression.
Oh you're right, .git is still there. It just got removed from .help
Oh yeah, .git isn't in .help, but it wasn't there before. You didn't add it during the hackaton and I haven't touched .help other than changing the order a bit.
.core works for me:
β― dagger-dev shell -c '.core container | from alpine | terminal'
Note that you still need to run this from a module.
I'm trying to use `dagger call dev
@tepid nova Where can I see more about this dsh?
dagger shell
fyi we're reaching the point of merge for https://github.com/dagger/dagger/pull/8557 - speak now or forever hold your peace, ideally we'll get it in this week for the release this week
Hi Justin, I am looking at adding this testcase https://github.com/dagger/dagger/pull/8557#discussion_r1808953579. Kindly do not merge until I could add that to the regression suite.
created a new kind/dx label on github - we need something for cases where we can improve rough edges, but that aren't really bugs or features: https://github.com/dagger/dagger/labels/kind%2Fdx
adding this testcase has surfaced some other testcases, for which I have to make some additional adjustments. should be able to submit the PR tonight
Quick feedback on our repo layout & impact on daggerverse. I spoke to a relatively new user, and he found himself using github.com/dagger/dagger/modules/XXX assuming it was the blessed, official standard library. I emphasized that, though it was an incubator for a future stdlib, it was nowhere near finished, and should be considered a module for internal use - it could break at any time, have some project-specific opinions etc.
--> I think we should consider changing the directory name from modules to something more explicit, like modules/internal, and flag the need for that internal setting on Daggerverse cc @hasty basin @tidal spire
How far away do you think these modules are from good enough to be blessed/featured? Compared to the average daggerverse module
Indeed, because we've thought a few of them were good enough to be featured.
At the moment I would say we can treat them the same as other "nominated" modules. Some of them could be featured on a case by case basis. Others not. Either way we should not give them a special status beyond what eg. one of Mark's modules might get. In other words, they're not yet stdlib material
It's kind of a wild west still, for example I'm looking at a PR right now that merges my gha module and completely refactors the API... Who gets to decide on such a change is not clear at all, what is the "maintenance chain" etc
Any of those modules are just one dagger/dagger PR away from being blown up and rebuilt
Yep, we liked the looks of
https://daggerverse.dev/mod/github.com/dagger/dagger/sdk/python/dev for example
and someone suggested https://github.com/dagger/dagger/tree/main/modules/alpine
Do you think a proper "featured module" indicator would remove the need for changing the directory since it sounds like thats what they're looking for?
No, I think "featured" takes care of false negatives, but we also need a way to handle false positives
yep, most common ideas being
- semver tagged are "more official" (has issues of folks not keeping up with tagging on good candidates or folks tagging immature modules just because others doing it)
- special naming of module (difficult, need to reserve words or patterns, and have to rename before real release, but could be very clear:
foo-experimental) - special directory structure excludes (maybe easier:
experimental/foo,internal/,examples/,test(s)/) - special "not-for-general-use" semaphore file or metadata in
dagger.jsonor even in code files (e.g. comments) or README - less formal indicators in README like "experimental," "unstable," "not for production," or "work in progress."
will review/benchmark some more
fyi, go/php publishing is bit broken on main - fix is here: https://github.com/dagger/dagger/pull/8756
[async] @rancid turret @spark cedar hey, quick q: with the shell (e.g. with a long running session), were you able to grab "fresh" content from directories? I'm doing something adjacent and noticed that the upload is only happening once for the entire session, I'm guessing caching
That's intentional so we don't pick up changes to local dirs in the middle of pipeline execution and create inconsistencies. I.e. if you edit a local file while something is running you don't usually want half the pipeline to use one version and the other half to use a different one.
Yup, that makes sense ... however, is there a way to force it to comply? π
Yeah we could definitely add that as an option somehow. In the meantime, I wonder if you just provide the same path but with different exclude values if it would trigger again... like even if you provide exclude values that don't match anything in the dir. Pure 100%
hack if it works at all, but worth a shot if you are just searching for a quick unblock
@civic yacht π
Unfortunately I don't think I'm in the right path (I'm up in cmd/), don't have access to that ... as a workaround I can just do another withEngine I think
I smell a watch POC π€©
I think this is the same behavior we observed with pocketci @astral zealot where we couldn't re-sync files after the dagger session already started π¬ ref: #maintainers message
Hi @civic yacht , I was just trying out NewInstanceForCurrentID for CacheVolume namespacing, but it seems like it still caches the object across module calls? is this probably due to the fact that when we call dag.CacheVolume("name") from different modules, the query is not yet namespaced, and hence the caching?
Hi @Erik Sipsma , I was just trying out
Any idea how I could start a custom dagger-engine from a dagger pipeline, and point to it (_EXPERIMENTAL_DAGGER_RUNNER_HOST) from the main dagger cli on the host? Not dagger-in-dagger, I just want to start a custom dagger-engine that I can use normally from the host. I have a PoC that works well in docker-compose and I'd like to daggerize it.
almost sure this is not the answer you are looking for.... but... would hack/dev and then hack/with-dev work for this?
yeah not really, I am starting a custom engine directly from a stable version, I don't need to compile it
do you have code that you are using to start the dagger engine in this case?
Is the custom dagger engine in an image? You can do _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-image://<image ref> if so
yay, this is still broken lol - https://github.com/dagger/dagger/pull/8763
@spark cedar Can i use current version of oci-worker-gc-keepstorage flag to limit disc usage while waiting https://github.com/dagger/dagger/pull/8725 to be released?
yup
Would anyone else be interested in a feature to generate automatic constructors for types? I'm finding myself defining a type, and then creating a "constructor" for it manually, and repeating the huge number of fields over-and-over.
Curious if I'm missing a neater way of doing this in go.
@rancid turret @fair ermine is this just a go issue? or does this pattern also seem like something worth covering at an api level instead of just go?
Only Python has a convenience for this, because I felt the same thing π
what's the python convenience for it?
There's conveniance on TypeScript but it's very hard to parse and not supported yet (maybe I'll after once my refactor is finished), I would love a constructor at the API side
@object_type
class Bar:
result: str
@object_type
class Foo:
bar = function(Bar)
Foo.bar above is equivalent to:
@function
def bar(self, result: str) -> Bar:
return Bar(result=result)
Basically uses Bar's constructor as its own function. Works within Python, not just the API but API doesn't know the difference.
huh cool
okay, we should have something like that for go, but sadly, don't think it can work without the api
I would love this feature:
- If a module's top-level type has public fields, and doesn't explicitly define a constructor, the API exposes a default constructor with each public field as an argument
- Defining a constructor explicitly disables this feature
- Undefined: how to differentiate optional from mandatory arguments in the constructor?
Yeah, that's how it works in Python too, except for the second point because non-Go SDKs use the main objects own constructor for the module constructor, instead of an external function. As for the optional args, you could use the // +optional and // +default pragmas in the struct fields for that effect.
Oh, there's a distinction here. If you're thinking about just the module constructor, then the SDK just registers the main object's natural constructor as the module constructor. Additionally, Python's @object_type generates a constructor automatically based on class attributes (due to dataclass). The Python convenience that I showed is to create functions in an object that instantiate another custom type, when the arguments are the same.
but if I don't define a natural constructor in my Python object, then the SDK won't register one with the engine, and the engine's default constructor could kick in right?
As of now it always registers one. Every Python class has a natural constructor, the default being just without arguments.
I think Go can do the same without the API, just in codegen. Because you don't use the New function directly in code.
@rancid turret @tepid nova I recall you discussing dagger call surprisingly resolving module function that return scalar types in the past but I can't remember if that resulted into an issue or if we agreed on that being the default behavior.
i.e:
type Lala struct{}
func New() *Lala {
return &Lala{}
}
func (m *Lala) Prepare(ctx context.Context) string {
return "prepared"
}
dagger call in this case will evaluate the Prepare function to show the user the output of the command. I'm mostly bringing that again as @little hinge pointed it out to me via DM and the fact that he was surprised by that behavior
IMHO we should disable this behavior if possible as its spooky π surprising that things get executed even when the user didn't explicitly request it.
Yes, I plan on taking care of that very soon: https://github.com/dagger/dagger/issues/8516
thx Helder. My apologies for the late ping. Wasn't intended
Np, I came here for another reason and then saw your msg. Wasn't notified π
More constructors for Go
Hi @spark cedar, I am trying to debug an issue with uninstall and noticed something about the format of source in dagger.json:
with main branch:
"source": "github.com/shykes/daggerverse/hello@hello/v0.3.0",
with v0.13.5:
"source": "github.com/shykes/daggerverse/hello@54d86c6002d954167796e41886a47c47d95a626d
Please notice the hello after the @ symbol with main branch. is that expected?
yup
that's the actual name of the tag https://github.com/shykes/daggerverse/releases/tag/hello%2Fv0.3.0
we do some clever tag matching when we first resolve the tag - v0.3.0 in this case is actually shorthand for hello/v0.3.0, so we resolve the tag at this point in time
one of the reasons we do this is suppose the module refers to a sibling local dependency - like ../a-utility - then we need to make sure that they definitely point to the same tag+commit - we don't want one of them to point to hello/v0.3.0 and another to point to a-utility/v0.3.0, which might not actually be the same commits
thanks for this info. I will check more on my side to debug that uninstall issue.
right, i did saw that PR, and that is the reason I reached out here.
The issue I am trying to solve is to find a good way to match dependency when uninstalling it (without actually loading the source of the dependency).
If I understand correctly, we should be able to uninstall using:
- github.com/shykes/daggerverse/hello
- github.com/shykes/daggerverse/hello@hello/v0.3.0
- github.com/shykes/daggerverse/hello@v0.3.0
- hello (by name that is specified in
dagger.json
but there could be scenarios where this git repo has been deleted, so we cannot rely on source loading.
mmm, okay i would probably do a suffix check for the version:
if i uninstall github.com/shykes/daggerverse/hello@v0.3.0, then i look for an exact match of github.com/shykes/daggerverse/hello, and match the version that is exactly v0.3.0 or ends in /v0.3.0
actually - we should do the inverse of this logic: https://github.com/jedevc/dagger/blob/82020b944bf4ec25a445f33632dcda1250550f1b/core/schema/git.go#L453-L474
is it possible to install two versions of same dependency?
what a good question π€
so i think you can - as long as they have separate names, it should be possible
hmm, it seems like it overwrites the existing one
mm, yeah, dagger install probably handles it badly
but if you manually edit dagger.json, i think you could get it working?
oops. i may have introduced a regression with "init & develop litter" changes. (Found it while trying above scenario π¦ )
π what's the regression? we're delaying the release till tomorrow, so there's still time to fix
dagger init --sdk go --source .
dagger use github.com/shykes/daggerverse/docker@docker/v0.4.1 --name docker
dagger develop -> this removes the dependency from dagger.json
looking into this first. sorry about that.
i wonder if we need this restriction anymore
anyways, in that case, we don't need to consider matching versions at all π€
my 2 cents would be to keep things simple and therefore keep this restriction in place.
yeah agreed
I just tried this with fresh build of engine (after cleaning up existing volumes/containers to remove cache), and this seems to have worked fine.
so i am wondering if some sort of caching came into play when I ran into that issue π¦
mm, yeah i can't repro this either
well, i don't see how this would have caused it tbf
this is the cloud trace (incase it is helpful to identify why this might have happened) - https://dagger.cloud/rajatjindal/traces/bbda5d7bf17a87674380d17bc78d7d84
coming back to the original issue that we were discussing:
right, i did saw that PR, and that is the reason I reached out here.
The issue I am trying to solve is to find a good way to match dependency when uninstalling it (without actually loading the source of the dependency).
If I understand correctly, we should be able to uninstall using:
github.com/shykes/daggerverse/hello
github.com/shykes/daggerverse/hello@hello/v0.3.0
github.com/shykes/daggerverse/hello@v0.3.0
hello (by name that is specified in dagger.jsonbut there could be scenarios where this git repo has been deleted, so we cannot rely on source loading.
do we agree that we need to support all above variations of uninstall (assuming modules can have only one version of a dependency)?
i think there's two versions - we can uninstall by name, or by refstring - for refstring, we can just remove the versions entirely, and compare just the symbolic parts
that should handle all the above scenarios
π I will work on that tomorrow morning. thank you for your time today to review/merge other PR's
hm, just wanted to flag https://dagger.cloud/dagger/traces/8be63cc20bab1d4ddfc68aa9b3139160?span=5b0d52711fefc80c
go build std failing with go: error obtaining buildID for go tool compile: signal: segmentation fault (core dumped)
definitely not happening reliably, just noticed on a few jobs on main
I noticed that "moduleSource" api evaluates successfully for a local-source-module even if that does not exist, but fails for a git-source-module. is that intentional?
π§΅
Want to give a shoutout to all the initialization optimizations @civic yacht has been doing recently. v0.13.6 is noticeably faster on CI where, in my case, everything runs from scratch every time. I think the biggest one is pre-packaging some of the Go module stuff in the Engine image.
Great to hear! More improvements coming soon too π
was doing work with custom cache policies - and decided to take the leap of proposing a new engine config file format (instead of continuing to piggy-back off of buildkit's engine.toml file)
https://github.com/dagger/dagger/pull/8800
not fully ready, but any early comments would be appreciated π
let's talk about v0.13.7 π§΅ π
custom dagger engine config file
I made an issue tracking my requests for shell polish: https://github.com/dagger/dagger/issues/8807
cc @spark cedar @rancid turret
(sidenote: "shell polish" is kind of the polish notation for a polish notation shell)
If you're looking for something to bikeshed on: https://github.com/dagger/dagger/issues/8808
is it possible to exec into a container that was started using dagger pipeline? if so, could someone please point me to instructions on how can I do it.
I started a nsenter container, and I see the process of container that I want to exec into. I can find the file-system (using /proc/<pid> etc), but if possible I want to exec into the exact container to check some process details etc
nsenter -a -t $pid should get you in the same namespace as the target process. What are you looking for exactly?
I was trying to check if I can find some logs or reproduce a problem a user reported (puppeteer not able to start chromium using dagger).
oh, I see.
you could also use Terminal and run the command manually within the container
that's what I'm doing nowadays instead of running nsenter as Terminal is simply awesome!
yeah, in my case the command is hung, and I wanted to exec into it to see if there is anything obvious.
makes sense. You can run Terminal and then run the command manually in the background
hey folks, could I please get some follow up reviews on the following PR's:
- https://github.com/dagger/dagger/pull/8745 (dagger uninstall command)
- https://github.com/dagger/dagger/pull/8724 (dagger cache volumes are namespaced)
Thank you @meager summit !
as discussed with @spark cedar in San Diego
Hello, how can i export the engine dev into another platform ?
shell
$ dagger call dev-export --platform linux/amd64
Setup tracing at https://dagger.cloud/traces/setup. To hide: export SHUTUP=1
Error: response from query: input: daggerDev.devExport resolve: call function "DevExport": process "/runtime" did not complete successfully: exit code: 2
Stdout:
invoke: input: container.from.withExec.withEnvVariable.withFile.withExec.withDirectory.withExec.asTarball resolve: container image to tarball file conversion failed: failed to solve for container publish: process "/dev/.buildkit_qemu_emulator go build std" did not complete successfully: exit code: 1
Stderr:
regexp: /usr/local/go/pkg/tool/linux_amd64/compile: signal: segmentation fault (core dumped)
Am I the only one getting a segfault on the qemu emulator ? π
Does anyone recognize these orphaned traces? usually when this happens it's from a new integration test that's missing OTel trace propagation somewhere. https://dagger.cloud/dagger/ci?orphaned=true
Maybe it's from new benchmarking stuff? https://dagger.cloud/dagger/traces/c549cc2e4f8c72c2bbf2e1f84b42b22a /cc @civic yacht
That benchmark was setup a while ago by @fair ermine and @wet mason I think?
They should be coming from here:
https://github.com/dagger/dagger/blob/main/.github/workflows/benchmark.yml
This is pretty old stuff -- @fair ermine made some changes semi-recently
It's not doing much -- just running some dagger commands, e.g. init functions
Technically, when I open a dagger session from within an existing session (thanks to the powers of experimentalPrivilegedNested:true), is that a new session, or inheriting the parent session?
It's all the same session
They are different "clients" all attached to the same session
nice thanks
so different clients can see a different scope of the same session?
(for security purposes)
Yeah exactly, there's per-client state and session-wide state
So sessions are not nested. Are clients nested? ie. is there a nestable object in there anywhere? or not needed, just a flat list of clients attached to a session?
It's a flat list of clients attached to a session. We do track the "stack" of function calls, and function calls are all clients in a session, which is the closest thing I can think of to anything "nested" but we only use that for otel-related purposes
Nice, thanks!
So ther'es a 1-1 mapping of sessions to traces in cloud?
(a trace always starts from exactly one session; a session always generates exactly one trace?)
Short answer: yes.
Longer answer with caveats: in our CI we actually start up separate dev engines as services and run tests/etc. against those. In that case you do end up with multiple sessions all grouped under a single trace (the dev engine services forward OTEL data to the "outer" engine).
That however is hyper-specific to our CI, no user would do that. So practically speaking you can think of it as 1-to-1
I've been wondering why we build & run these dev engines in GHA scripts, instead of doing it in dagger? Too many layers of nested dagger?
Oh that's separate, that has to do with the fact that we want to run our CI modules themselves against a dev engine.
I was referring to this: https://github.com/sipsma/dagger/blob/d6e5f371e62aa961e731aa7a56349681f1dda11a/.dagger/test.go#L194-L194, where we are running the dev engine as a service in the outer "stable" engine.
The difference between the two is extremely subtle and mind-numbing, but that's par-for-the-course when it comes to these types of bootstrapping problems π€·ββοΈ
We have discussed ways of reconciling the two so we don't need those GHA scripts, but, you know, time+bandwidth and all that
I'm struggling with the module type system a little bit... Trying to implement shell features that call functions from an installed dependency
Dumping my problem in case someone with the required knowledge has pity on me...
This is what I have to play with:
// moduleDef is a representation of a dagger module.
type moduleDef struct {
Name string
MainObject *modTypeDef
Objects []*modTypeDef
Interfaces []*modTypeDef
Enums []*modTypeDef
Inputs []*modTypeDef
// the ModuleSource definition for the module, needed by some arg types
// applying module-specific configs to the arg value.
Source *dagger.ModuleSource
}
From there, I already know how to enumerate the dependencies to print their name and description:
// ...
for _, dep := range h.mod.Source.AsModule().Dependencies(ctx) {
name, _ := dep.Name(ctx)
desc, err := dep.Description(ctx)
}
well that was a massive waste of my time
It's not trivial at the moment because the introspection only has the types for the current module, along with the core ones. Not dependencies. So it's a prereq to load different moduleDef results per module, not just the one.
Thanks, it makes me feel better that it's not trivial - makes me feel less stupid π
Yeah, that's why it hasn't been done yet π
@rancid turret FYI we should talk about variables soon, there is a dependency on that part of the design from other projects. Depending on how we handle variable persistence, especially in interactive mode, a lot of assumptions might change in higher-level tooling
(specifically the ability to copy-paste any command from the history, and trust that there is no side effect from the interactive session)
one option is to simply disable setting variables. Read-only.
But that might be too surprising to bash users. (to be continued π
GitHub now provide insights into performance metrics of Github actions - https://github.com/dagger/dagger/actions/metrics/performance
It seems that we might need to rebase and merge: https://github.com/dagger/dagger/pull/8832, got some related errors yesterday on CI: https://github.com/dagger/dagger/pull/8805/files#diff-33ef32bf6c23acb95f5902d7097b7a1d5128ca061167ec0716715b0b9eeaa5f6R278
hey folks, I am trying to evaluate adding a new api container.Stat that would be similar to golang's os.Stat. While I am able to make slow progress in understanding how to add this new type, I wanted to make sure I ask here that this is something which is possible.
one thing that I am currently debating on with myself (and trying to check if there is an established pattern in code) is that should the api return *Stat, and then user can access its fields directly, or should this be like another api e.g. (*Stat).IsDir() , which is essentially another call to engine?
unless I'm mistaken, the way our SDKs work, you don't get to choose between returning a struct or individually querying each scalar field. Graphql queries that return objects automatically get generated in such a way that you have to individually query each field, even though a raw gql query could trivially get them all in batch. It's a frustrating limitation of our current DX IMO.
Wasn't .Stat basically unnecessary if you have an .Exists?
I am trying to see how different options would look like. I already implemented β.Existsβ api, and it seems to be working fine, but because .Stat also came up as an option in the last comment, i wanted to see how it would look like
I would like Stat! If you could use it to get file size, symlink information, file type, etc (back to pto π )
Atm you have to hack it with withExec and it's kinda difficult
@meager summit it would be interesting to compare 1) a Stat() function that returns a new struct with all the useful fields, and 2) adding the fields directly to Field(). Since the current DX requires individually querying each field anyway... there's not much benefit in grouping some of those fields in a separate struct.
Yeah, i agree. This is why i wanted to check if it is possible to return a complete struct from the api today, without the need to query individual fields. Looks like the best way forward would be to add fields to Field(). I will try out and open a draft PR tomorrow
It's definitely possible to do that - just not with our current generated bindings
For a module that has a very data-centric API (lots of querying of structured data), it's probably cleaner to query it with a raw graphql query today, than with the official codegen bindings
So there are 2 variables to this equation:
- Making sure short-term improvements like
Statwork well with the current DX - Challenging the current DX and looking for long-term improvements to it, when appropriate
We have a TODO in the CLI to have a file type in File because we don't know if the contents is text or binary for example. Size is already there.
API inconsistency with AsService using entrypoint - issue #8140 - π§΅
β¨ v0.14.0 - 8th November 2024
ref mutated by policy to non-image
basic q, but when using ./hack/dev on macos, is there an easy way to view engine logs? i'm not seeing everything in cli output, right?
docker logs dagger-engine.dev 2>&1 | less is what I do on linux, should be same on macos afaik
cool the ones i was looking at in orbstack ui are the right ones then (not a fan of the gui though lol)
π€ I have a failing lint check in this PR, but I have no idea how to fix it... https://github.com/dagger/dagger/pull/8868
Major improvements to github.com/dagger/dagger/modules/go, with all required features to build and test Dagger.
New configuration settings:
Download and build caches
Base container
Lldflags
Inject...
these are the lines it's complaining about: https://dagger.cloud/dagger/traces/b50b883ccaed76112e7c54af4984a6e5?span=dcec5c27de850397&logs
Yeah I see the lines... it's all over the "tidy xxx" spans, across many modules that I haven't touched?
oh wait
not I hadn't seen those lines. in my trace it'ds drowned by weird diffs on a million go.sum files
Thanks @civic yacht . I am unable to reproduce that linting output locally, but at least I know what to fix π
This would be very neat to have... https://github.com/dagger/dagger/issues/6365
Any chance I could get a quick LGTM review on https://github.com/dagger/dagger/pull/8868 ?
(not release-sensitive)
Probably want to wait until after the release, just in case
Hi @still garnet, in the attatched screenshot of a trace from dagger cloud UI, does it mean that the actual command which was last running is exec go test OR it is stuck at start entrypoint.sh.
I am having problems to identify where is my execution exactly stuck.
it's saying they're concurrent - since the bars occupy the same horizontal space. so the go test should be what it's waiting on - those start spans are just services that it depend on (two registries and the Dagger engine being tested)
got it. thanks. just FYI, the loading of logs is failing with out of memory error
There's a pretty hard limitation there unfortunately - WASM can only use up to 2GB of RAM. I've already sunk a lot of time optimizing it, but maybe something regressed - I can take another look soon π
thank you.
PAT via git credential store https is a good update
Ty team !
huh what is happening on main π https://dagger.cloud/dagger/traces/1761def81bd0097b5854b924c7572d77?span=9d7ce06e9320e348
dunno, was just looking into it, suspect something must have changed in the external world since it seems like it appeared "out of band" from changes we pushed
It's happening in PRs too, so at least it's consistent
β― docker buildx imagetools inspect docker.io/arm64v8/alpine:latest
Name: docker.io/arm64v8/alpine:latest
MediaType: application/vnd.oci.image.index.v1+json
Digest: sha256:7d206216871a9feecd6c6689d895889577fdcb28f3f789d8243b3a12f38a2730
Manifests:
Name: docker.io/arm64v8/alpine:latest@sha256:ea3c5a9671f7b3f7eb47eab06f73bc6591df978b0d5955689a9e6f943aa368c0
MediaType: application/vnd.oci.image.manifest.v1+json
Platform: linux/arm64/v8
Annotations:
org.opencontainers.image.source: https://github.com/alpinelinux/docker-alpine.git#7d63673353bd39d92ba42f6effcc199aeebd45ee:aarch64
org.opencontainers.image.url: https://hub.docker.com/_/alpine
org.opencontainers.image.version: 3.20.3
com.docker.official-images.bashbrew.arch: arm64v8
org.opencontainers.image.base.name: scratch
org.opencontainers.image.created: 2024-11-12T00:54:34Z
org.opencontainers.image.revision: 7d63673353bd39d92ba42f6effcc199aeebd45ee
Name: docker.io/arm64v8/alpine:latest@sha256:a8ba68c1a9e6eea8041b4b8f996c235163440808b9654a865976fdcbede0f433
MediaType: application/vnd.oci.image.manifest.v1+json
Platform: unknown/unknown
Annotations:
com.docker.official-images.bashbrew.arch: arm64v8
vnd.docker.reference.digest: sha256:ea3c5a9671f7b3f7eb47eab06f73bc6591df978b0d5955689a9e6f943aa368c0
vnd.docker.reference.type: attestation-manifest
looks semi-reasonable to me π€
we're not requesting a specific platform in that test, so not sure why we could get that error
However, it doesn't happen locally on my linux arm64 box, dagger call test specific --run='TestContainer/TestFromImagePlatform passes
π€ yayyy
If anyone has an x86 machine and the dagger repo pulled, would be curious if dagger call test specific --run='TestContainer/TestFromImagePlatform' passes for you locally
ohh i wonder what mirror.gcr.io has there
since we use that as a mirror for our tests
hm that's the same, nvm
so the switch that happened is they now produce oci image indexes - instead of single manifests
(I can actually repro if I change the test locally from arm64->amd64)
Ah nice, that still shouldn't result in that error... not sure if dagger or buildkit/containerd issue happening
Think I might see the problem in our code, trying a fix
yeah this repros for me
huh @tidal spire just hit this too - and mentioned that he can't even do docker pull on that image
(although i guess that's the matching platform thing)
Issue/fix here on the goreleaser side https://github.com/goreleaser/goreleaser/issues/5269
What happened? Tests are failing trying to pull docker.io/{ARCH}/alpine when running on machines opposite of {ARCH} (Full test failure below). What I mean is this fails to pull amd64 when running o...
ohhh yeah i see
the image is now annotated with the platform - technically i guess it should never have been possible to pull arm64v8/alpine on amd64
but because it had no platform annotation it was fine π€
I think the problem in our code is that we default to the host platform if not explicitly set on a container and then use that to resolve the image. In the "single manifest" case that never mattered since platform was ignored, but not it is actually interpreted and we try to find the right platform manifest in the oci index
yeah i'm kinda tempted to say that this is actually intended behavior?
Yeah exactly. The fix on goreleaser to always pass --plaform did the job
I think the fix might be to not request the host's platform when resolving the image unless it was very explicitly set. Trying that now
hm, but we need some way of preferring the host platform if it's available right?
if there's amd64 and armv64, we want amd64 on amd64 and armv64 on armv64 - but if we're on a third arch that doesn't have that, i think erroring out makes sense
instead of falling back to one of the others?
Yeah I guess the behavior would get super hard to explain and inconsistent if we don't just treat this as an error case
Probably better to keep it simple and require users explicitly set non-default platforms on Container if they want images for that platform.
I'll update the test with that and see if I can make the error message here a lil better
the other option would be to maybe filter out the unknown/unknown architectures, and then if there's only one arch left, grab that?
fiddly, and still feels a bit weird to explain there
oh also this - the docker cli fails here too, so i guesssss this is fine?
we've got error-for-error compat π
That's what I was aiming towards but then after thinking about it realized idk how to explain/doc that in a reasonable way, which is probably a sign it's too complicated to be worth it π
yeahhh i'm not thrilled at that idea either π
i've messaged the official images maintainer to ask how intentional this is, and whether docker pull might be updated to handle this?
reply: essentially this is "unfortunate and expected" - this erroring out should probably have been the original behavior, even for single manifests, and there's a chance that that might be introduced in the future
so im futzing around trying to emulate packet loss inside a dagger container and i'm watching logs come out in a way that violates a bunch of my previous assumptions about how WithExec works... the logs look like this, basically it looks like an apt-get install and my lil script are running concurrently
β HelloDagger.testPacketLoss: String! 3m20.0s
... omitted for brevity ...
β Container.withExec(args: ["apt-get", "install", "-y", "iproute2", "curl", "iputils-ping"]): Container! 3m19.4s
β ca-certificates curl iproute2 iputils-ping libatm1 libbpf0 libbrotli1
β libbsd0 libcap2-bin libcurl4 libelf1 libldap-2.5-0 libldap-common libmd0
β libmnl0 libnghttp2-14 libpam-cap libpsl5 librtmp1 libsasl2-2
β libsasl2-modules libsasl2-modules-db libssh-4 libxtables12 openssl
β publicsuffix
β 0 upgraded, 26 newly installed, 0 to remove and 0 not upgraded.
β Need to get 4445 kB of archives.
β After this operation, 11.0 MB of additional disk space will be used.
β Ign:1 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 openssl arm64 3.0.2-0ubuntu1.18
β Ign:2 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 ca-certificates all 20240203~22.04.1
β Ign:3 http://ports.ubuntu.com/ubuntu-ports jammy/main arm64 libelf1 arm64 0.186-1build1
β Ign:4 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 libbpf0 arm64 1:0.5.0-1ubuntu22.04.1
β Ign:5 http://ports.ubuntu.com/ubuntu-ports jammy/main arm64 libmd0 arm64 1.0.4-1build1
β Ign:6 http://ports.ubuntu.com/ubuntu-ports jammy/main arm64 libbsd0 arm64 0.11.5-1
β Ign:7 http://ports.ubuntu.com/ubuntu-ports jammy/main arm64 libmnl0 arm64 1.0.4-3build2
β Ign:8 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 libxtables12 arm64 1.8.7-1ubuntu5.2
β Ign:9 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 libcap2-bin arm64 1:2.44-1ubuntu0.22.04.1
β exec apt-get install -y iproute2 curl iputils-ping 3m19.4s | Disk Read: 54 MB | Disk Write: 2.2 MB | Network Rx: 2.1 kB | Network Tx: 9.1 kB
β Container.withExec(args: ["sh", "-c", ETOOBIG:sha256:1c32e0340dcc78d6c37760da86b80bb584ff34fb6390557618d107a2fdae7f7d], insecureRootCapabilities: true): Container! 3m19.4s
β Container.stdout: String! 3m19.4s
TestPacketLoss looks like this:
func (m *HelloDagger) TestPacketLoss(ctx context.Context) (string, error) {
return dag.Container().
From("ubuntu:22.04").
WithExec([]string{"apt-get", "update"}).
WithExec([]string{"apt-get", "install", "-y", "iproute2", "curl", "iputils-ping"}).
WithExec([]string{
"sh", "-c",
"tc qdisc del dev eth1 root || tc qdisc add dev eth1 root netem loss 25% && " +
"echo 'Network emulation configured with 25% packet loss' && " +
"echo '-------------------' && " +
"ping -c 10 8.8.8.8 &&" +
"for i in {1..10}; do " +
"echo \"Request $i:\" && " +
"curl -v -s -w '\\nHTTP Status: %{http_code}\\nTime: %{time_total}s\\n' " +
"http://httpstat.us/200 || true; " +
"echo '-------------------'; " +
"sleep 1; " +
"done"}, dagger.ContainerWithExecOpts{InsecureRootCapabilities: true}).
Stdout(ctx)
}
am i misreading the output here?
I would like to request a "design consensus gate" on new top-level functions in thre engine API
That top-level namespace is user-facing in various ways, and not just a hidden implementations detail. So it needs some amount of ux design.
Agreed, no way to really enforce this other than tribal knowledge, but that's okay for now.
I guess we could do something fancy like detect whether there was a change to the API schema file and automatically apply a label to the PR like Needs API review that causes it a CI check to fail until the label is manually removed, which would serve as a gate. But probably not worth it until proven necessary.
yeah a manual gate is totally fine, this is me asking the tribe π
I like that bikeshedding on seemingly inconsequential api calls often surfaces important questions on the overall architecture and its direction, which make it worthwhile
Souuuunds good π we should probably try and get design thoughts for any and all bigger API changes π
Not just top level
PSA: looks like sdk/java is failing engine-scan, so I'm putting together a function to bump those pom.xml dependencies
am I taking crazy pills or does mvn versions:use-latest-versions simply not work
mvn versions:display-dependency-updates shows all kinds of updates available, but use-latest-versions isn't updating the pom.xml file at all?
ah - needed versions:update-properties. PR up: https://github.com/dagger/dagger/pull/8942
Relates to #8768 (comment)
rebased - should be fixed now π
Just resolved conflicts in https://github.com/dagger/dagger/pull/8442 - any chance I could get a β so it has more time to bake for release? Sorry if you're looking at it already @civic yacht / @spark cedar π
i'll start using it properly asap π left one nit, and had a couple clarifying questions, but LGTM π
too early to be demoing the new ui at kubecon? π
go for it 
merged 
π± π± hype hype β€οΈ
@spark cedar saw https://github.com/dagger/dagger/pull/8442#issuecomment-2474842072 - in all honesty I was just cutting scope and getting open-PR-anxiety, will keep refining!
yup! sorry i meant to say in there that it should be for a follow-up, i didn't want to delay π
needing a sanity check, added some println debugging help around SecretStore.AddSecretFromOtherStore but for the life of me i cannot get anything to print.... i've tried passing test specific --debug && docker logs -f dagger-engine.dev 2>&1 |grep '!!!' but still nothing.... i'm fairly certain this path literally must be called because it's the only appearance of "not found in other store"
running out so haven't looked, but could be worth trying a panic() π
What test are you running to try to trigger it ? Have you tried using https://github.com/cwlbraa/dagger/blob/981b651a0d7a49730da42a1fc76030933a51d72f/core/services.go#L296 ? (I presume so as you're checking inside the engine logs)
i'm running TestModule/TestSecretNested... doesn't seem to hit the panics either
draft PR with actual test that should hit the !ok case: https://github.com/dagger/dagger/pull/8948
ah, wait i might know what i did... do docker logs only work for ./dev/hack and not an engine running instead of test specific?
they should work for both π I am testing on hack/dev now
weird
docker logs dagger-engine.dev will only work for the engine created via ./hack/dev. For test specific the engine is running as a nested service so you have to look at the service output to see the engine logs
yeah that's what i thought
Ok my bad πΌπ
(that's the main reason I still use ./hack/dev, it's less ephemeral which makes things like that easier)
(would love for that to not be true somehow someday π )
is there a better way to see those logs than clicking into the cloud trace?
With ./hack/dev and bklog above, I am able to see the outputs:
time="2024-11-13T22:59:04Z" level=debug msg="π debug my pipeline" client_hostname=buildkitsandbox client_id=4pffpi9e92bjkq0w3di6fume2 function= module=generator object=Generator session_id=k9nck951bhx8xoh7886iksli7 spanID=21a2fd8f37b21d33 traceID=f6a02cd5d2a8b3f9fcd1498458b2b74f
yeah i found them in cloud
so it's definitely printing, sanity confirmed
although still kinda annoying to access, lots of clicking required
I personally do the commands you do: docker logs -f dagger-engine.dev 2>&1 |Β grep π locally and ./hack/dev locally and rerun the specific test
but then you're seeing the engine logs of the outside one, which isn't the one running the test, right? or does the outside one end up having the inside one's logs, too?
I don't know with 100% certainty, but I get those output:
time="2024-11-13T22:59:04Z" level=debug msg="π debug my pipeline" client_hostname=buildkitsandbox client_id=4pffpi9e92bjkq0w3di6fume2 function= module=generator object=Generator session_id=k9nck951bhx8xoh7886iksli7 spanID=21a2fd8f37b21d33 traceID=f6a02cd5d2a8b3f9fcd1498458b2b74f
``` where you can see that the module calling the function is actually generator, which is the module definition you inlined
interesting, i'll try that then
other sorta random related thing, trying to pass -v to test specific chokes error: parse selections: parse field "specific": init arg "verbose" value as dagql.DynamicOptional (Boolean) using dagql.DynamicOptional: cannot create Boolean from int64
when i try this technique i don't get these logs in the dev engine logs... im doing dev && with-dev dagger call test specific --run="TestModule/TestSecretNested"
Oh, did you use bklog.G(ctx).Debug(...) ? I had to add the ctx to the function definition
ah, i'm just fmt.Printlning, lemme check if that's the problem
yeah keep us in touch ; Below seems to have worked on my side:
--- a/core/secret.go
+++ b/core/secret.go
@@ -6,6 +6,7 @@ import (
"sync"
"github.com/moby/buildkit/session/secrets"
+ "github.com/moby/buildkit/util/bklog"
"github.com/opencontainers/go-digest"
"github.com/vektah/gqlparser/v2/ast"
)
@@ -85,10 +86,12 @@ func (store *SecretStore) AddSecret(secret *Secret, name string, plaintext []byt
return nil
}
-func (store *SecretStore) AddSecretFromOtherStore(secret *Secret, otherStore *SecretStore) error {
+func (store *SecretStore) AddSecretFromOtherStore(ctx context.Context, secret *Secret, otherStore *SecretStore) error {
otherStore.mu.RLock()
secretVals, ok := otherStore.secrets[secret.IDDigest]
otherStore.mu.RUnlock()
+ bklog.G(ctx).Debug("π debug my pipeline")
+
if !ok {
return fmt.Errorf("secret %s not found in other store", secret.IDDigest)
}
diff --git a/engine/server/client_resources.go b/engine/server/client_resources.go
index 57cc6edd9..aa2f97def 100644
--- a/engine/server/client_resources.go
+++ b/engine/server/client_resources.go
@@ -81,7 +81,7 @@ func (srv *Server) addClientResourcesFromID(ctx context.Context, destClient *dag
// don't attempt to add the secret if it doesn't exist and was optional
continue
}
- if err := destClient.secretStore.AddSecretFromOtherStore(secret, srcClient.secretStore); err != nil {
+ if err := destClient.secretStore.AddSecretFromOtherStore(ctx, secret, srcClient.secretStore); err != nil {
return fmt.Errorf("failed to add secret from source client %s: %w", srcClient.clientID, err)
}
}
nope, even using bklog.G(ctx).Debug, running test specific, i can't see them in docker logs dagger-engine.dev
@obsidian rover what commands are you running and what docker-engine provider
this matches what i'm seeing, and then i'm able to see the nested engine logs in cloud
This is when you do dev && with-dev dagger call test specific ...? That would be expected because dagger call specific builds a dev engine and then runs it as a service. So with that command you are:
- building a dev engine and running it in docker as
dagger-engine.dev - connecting to
dagger-engine.devand runningdagger call test ...which builds and runs a dev engine inside ofdagger-engine.devas a service (so a dev engine in a dev engine)
My general process is:
- prefer just doing
dagger call test ...(nowith-dev) - if I really need to look through the engine logs locally or otherwise muck around with engine internals on a persistent engine, then
- build dev engine with
./hack/dev - run tests against it directly like
./hack/with-dev go test -v -run='TestWhatever' ./core/integration - look through logs as needed (
docker logs dagger-engine.dev), exec into it and poke arond (docker exec -it dagger-engine.dev sh), etc.
- build dev engine with
Like I said earlier, I really wish we could just consolidate everything to dagger call test and rm ./hack/dev, but we would need to add some sort of feature like "persistent services" that don't disappear when dagger call test goes away (or something along those lines)
So ./hack/dev then ./hack/with-dev bash and then running: go test -count=1 -v -run TestModule/TestSecretNested/pass_embedded_secrets_between_modules/double_nested_and_called_repeatedly ./core/integration
ahhhh tq the go test technique is what i was missing
i was not confused about why i wasn't seeing logs, fwiw, i was confused why @obsidian rover was π
annoyingly on macos the go test technique doesn't work, though
# github.com/dagger/dagger/engine/sources/gitdns
engine/sources/gitdns/source_unix.go:41:20: undefined: syscall.Unshare
engine/sources/gitdns/source_unix.go:41:36: undefined: syscall.CLONE_FS
engine/sources/gitdns/source_unix.go:41:55: undefined: syscall.CLONE_NEWNS
engine/sources/gitdns/source_unix.go:54:3: unknown field Pdeathsig in struct literal of type "syscall".SysProcAttr
engine/sources/gitdns/source_unix.go:81:19: undefined: mount.Mount
engine/sources/gitdns/source_unix.go:87:21: undefined: syscall.Mount
engine/sources/gitdns/source_unix.go:87:75: undefined: syscall.MS_BIND
# github.com/dagger/dagger/engine/buildkit
engine/buildkit/linux_namespace.go:108:21: undefined: unix.CLONE_NEWNET
engine/buildkit/linux_namespace.go:155:18: undefined: unix.Setns
engine/buildkit/linux_namespace.go:170:18: undefined: unix.Setns
FAIL github.com/dagger/dagger/core/integration [build failed]
FAIL```
which isn't terribly surpriseing
i can just become linux π
Oh that's annoying. It's fixable though. I'm not sure wha tthe import chain is that leads to the core/integration pkg requiring those packages, but I'm sure it could be trimmed one way or another. Or we could add darwin buildtags for those (that just do some no-op since the functionality is not actually needed for the integ test pkg).
Or that π
imma put this on my list of random chores
The java-dev and javascript-dev actions are failing on GitHub actions with error:
Error: The template is not valid. System.InvalidOperationException: Maximum object size exceeded```
can we upload these logs as artifacts instead of using GITHUB_OUTPUT?
mmm yeah this is annoying, we can definitely write the log outputs to a file - but we also produce a job summary, so we might also be hitting the limits of that
the other option would be to just trim the logs out?
or maybe even just not include the logs? though having in the summary is kinda neat
will put this on my todo list to think about π
fix for this https://github.com/dagger/dagger/pull/8966
do we have any flake detection infra yet or are we the flake detectors?
Us, it's usually pretty painfully obvious enough though. If you are seeing any and there's not an issue open for it, please open one and add the "flake" label.
The only flakes I just ignore without opening an issue for are when external services error or there's some transient network error. I.e. I get a 500 from yarn packages once or twice a week. In the longer term we should not even ignore those and see what we can do to lower our dependencies on external services working, but for now they are rare enough
cool imma make a point of searching for the failures in github then, like i just saw a closed one reappear
Started seeing this one a bunch recently: Cloud trace: https://dagger.cloud/dagger/traces/ce4f383ede89e9bf39d00fe89aacd444?span=087b98784bf05440#909ebe7a106b977f GHA job: https://github.com/dagger/...
pretty painfully obvious
this is a curse of the expert thing, but in the past month i've probably mistaken my own failures for flakes like 5 or 6 times only to reevaluate later XD β you definitely don't want me blindly submitting flake issues
If anyone wants to try their hand at helping with the shell: https://github.com/dagger/dagger/issues/8969
When using vanilla buildkit cache export - _EXPERIMENTAL_DAGGER_CACHE_CONFIG, which is better: regisdtry or object storage? In object storage, can I point at a subdirectory within an S3 bucket, or do I need to point at the root of the bucket?
cc @civic yacht since we discussed this recently
I started ingesting test results into a dashboard for detecting flakes
If this is something that interests others, we can add it to our infra
This happened with me as well, lol
You can point to subdirs in s3: https://github.com/moby/buildkit?tab=readme-ov-file#s3-cache-experimental
I'd say that registry is probably a simpler place to start just because the auth should get sourced from your host's ~/.docker/config and the infra is a little easier to setup
Yeah that makes sense in retrospect...
oh right, thanks for the tip. is the performance close enough at small scale that a poc using one will "feel" as fast as the other?
got a fix in https://github.com/dagger/dagger/pull/8982
should hopefully stop those jobs from failing!
also, got a pr that fixes the linting issue on main: https://github.com/dagger/dagger/pull/8986
Yeah wouldn't worry about it much yet. I think the biggest difference between the two is just how good your bandwidth is between the registry's backend vs. s3 (though if the registry is backed by S3 then who cares)
Is it a known issue that Container.asService doesn't inherit the ports exposed in the OCI image? I remember that was an issue a long time ago, and thought it had been fixed? regression?
Example:
$ dagger shell -c '.container | from nginx | as-service'
Error: response from query: input: container.from.asService.endpoint resolve: no ports exposed
(looking for an issue in the archives, but can't find one of course...)
I thought it was working as well
I also thought we fixed this recently, but I found the issue, and the reason we don't:
- Each Dagger 'exposed port' implies a healthcheck
- However
EXPOSEin aDockerfilein the wild apparently means "I might listen on this port" in a lot of cases, not "I will listen on this port" - So if we automatically treated
EXPOSEthe same asWithExposedPortit would mean some off-the-shelf images hang on start waiting for ports that will never be bound
Pretty similar bag of hurt as with ENTRYPOINT tbh.
Here's the bikeshed: https://github.com/dagger/dagger/pull/5158
Probably worth reviving the discussion - maybe we can just say "if you do that you're wrong"
Thanks for finding it! This is a good reminder that we should put these kinds of discussions in issues if possible
(ie. design bikeshedding)
Maybe we could handle it like entrypoint? Pick an opinionated default that implements "the dagger way" β’οΈ and have an optional flag to disable it?
so in my quest to try to get exec metrics into clickhouse, i took on a sidequest to find the code in the CLI where the metrics arrive from the engine... this led me to these db metric "Export" functions which, tbh, read more like imports to me in this context? i do understand this is an OTEL interface fwiw, and that the frontends implement it, but they take a resourceMetrics and save shit on the ui DB. anyways, at that point, the trail goes cold bc "Export" is deeply ungreppable and goto reference also seems to deadend one step up the stack where frontend_pretty and frontend_plain call this db export fn... engine-side, i can find this metrics pubsub equivalent call via references to metricsdata.ResourceMetrics that looks like it's probably the spot where the engine sends the metrics over the wire, but nothing on the client?
what am i missing here? my tools seem to be failing me bc these fns have to called somewhere, right? do you happen to know where the metrics flow into these frontend Export calls @civic yacht @still garnet @spark cedar ? i'm suspicious that there's some magic happening with the pubsub or i'm missing some embedded otel shit running somewhere
guessing you're looking for these:
- https://github.com/dagger/dagger/blob/fb5e3f5262a15a0c03b0f3cabf23d526a3c81b5e/engine/telemetry/cloud.go#L27 - this sets up the exporters to Cloud, notably missing one for metrics atm
- https://github.com/dagger/dagger/blob/fb5e3f5262a15a0c03b0f3cabf23d526a3c81b5e/cmd/dagger/engine.go#L68-L82 - this wires up the CLI telemetry pipeline, sending to both the frontend and to Cloud using the above function
tehehe at least im on the right track, i already had both these files open bc they're clearly where i need to make changes, but they don't quite get me where i wanna go in terms of understanding, gonna keep digging in there then
@still garnet knowing that these are the right pieces lets me phrase my question more clearly: in the case where telemetry/init.go's metrics exporter is pointed back at the CLI, i'm trying to figure out where the CLI initially receives them to pass them to the frontend db
annnnnd phrasing the question like that makes me read https://github.com/dagger/dagger/blob/fb5e3f5262a15a0c03b0f3cabf23d526a3c81b5e/cmd/dagger/engine.go#L73-L75 different
yup there ya go!
the frontend exporters are just passed to the telemetry config, and will be installed alongside any others (whether from OTEL_* env vars, or Cloud)
ahhhhh you know why i was refusing to read this as what i was looking for, is i had assumed cmd/dagger/engine.go was engine-side, but it's the CLI's representation of the engine, the engine is in cmd/engine/
tq, things are clearer now, and this is the periodic read element i was hoping to find https://github.com/cwlbraa/dagger/blob/16fd4686f55c5e3ad4322aaf7a830f59c83a4814/sdk/go/telemetry/init.go?plain=1#L448
yeah feel free to rename/refactor for clarity, can totally see how that would be confusing
naw, the layout is consistent and makes sense, i'll probably just add a comment
the level of abstraction provided by otel libs is very convenient, but coming into it fairly raw and used to more typical otel uses (exporting to datadog or prom), it's both very interesting and very disorienting to see how flexible those abstractions are
and to fully close the loop with the answered question, Export is called by otel readers
yeah for sure - we're also doing interesting things like sending 'unfinished' spans through our data pipeline, which the data pipeline technically supports, but some destinations (like Honeycomb) definitely do not lol
yeah, @wet mason was telling me about all that yesterday in dms, it's clever how it's just mucking with the data model a bit to make things nullable
the absence of that functionality is also really annoying in fintech where there's a lot of long-running background batch tasks generating traces... not having the parent trace for the whole task before it's finished can make traces read weird when datadog already the completed child spans
agreed, it's a pretty goofy oversight imo
i remember my day 1 experience with jaeger being like '...but why doesnt it have the root span yet?'
its extra annoying when you get product devs being like "hey, my long-running task isn't being scheduled"
like... did you check the logs?
tbh this was really only a problem at brex because we went on a datadog cost cutting crusade and made them delete bunch of "useless" logs lmao
@civic yacht, currentTypeDefs doesn't include types from dependencies even though they'll be included in the schema seen by the module. Is there a reason not to include them there?
Hi @civic yacht re: https://github.com/dagger/dagger/pull/8724 (cache namespacing). if we use m.Source.ID().Digest().String() as the namespace key, then with any changes done to module source, it returns a different key. I dont think we want that right?
If anyone wants to try @wet mason 's secrets provider first hand:
#!/usr/bin/env dagger shell -q -m github.com/aluzzardi/dagger@secret-providers-poc
# op is only available as amd64 package on alpine
platform=linux/amd64
.core container --platform=$platform |
from alpine |
with-exec sh,-c,"echo https://downloads.1password.com/linux/alpinelinux/stable/ >> /etc/apk/repositories" |
with-file \
/etc/apk/keys/support@1password.com-61ddfc31.rsa.pub \
$(.http https://downloads.1password.com/linux/keys/alpinelinux/support@1password.com-61ddfc31.rsa.pub) |
with-exec apk,update |
with-exec apk,add,1password-cli |
with-file /bin/dagger $(cli | binary --platform=$platform) |
with-service-binding dagger-engine $(engine | service dev) |
with-env-variable _EXPERIMENTAL_DAGGER_RUNNER_HOST $(engine | service dev | endpoint --scheme=tcp) |
terminal --cmd=sh,-c,'op account add; eval $(op signin); dagger shell'
Hi Justin, how strongly do you feel about some "behind the scenes" magic such as resolving tag v0.3.0 automatically to hello/v0.3.0 for a module named github.com/shykes/daggerverse/hello@v0.3.0
this gets saved in dagger.json as github.com/shykes/daggerverse/hello@hello/v0.3.0.
now when trying to uninstall this dependency using dagger uninstall github.com/shykes/daggerverse/hello@v0.3.0, unless I resolve the git dependency, during uninstallation, I get the version mismatch and uninstall fails.
sometimes my hack/dev script hangs at serving module step. I tried restarting docker/laptop but the issue remains. it gets resolved automatically after a bit (random time). has anyone else faced a similar issue ever?
Maybe its docker or github throttling me, but i dont see anything relevant in engine logs at that time
This OTel version bump PR is good to go, if anyone has a moment to review: https://github.com/dagger/dagger/pull/8991
- It also includes a feature (with tests): now the Dagger CLI will pick up
OTEL_RESOURCE_ATTRIBUTESand include them in its OTel resource. Operators can use this to add engine-side labels to identify which traces ran on what hardware/engine config/etc.
hangs for how long? that step can take a while, since it's where the module gets built, and right now the Dagger module gets rebuilt for a lot of different changes
i think I've seen it stuck for more than 10m today, and sometimes it is just stuck until I kill it. I am trying to find trace, found this one which took ~7m (https://dagviz.production.dagger.cloud/rajatjindal/traces/315a809f42aa6630abc554bce91391b0)
an interesting possible timeout flake... does anybody have a hunch as to whether this is timing out bc our engine has crashed, or if our engine has shut down bc this timed out? there's a mix of :53 lookup failures as well as http do: Post "http://dagger/query": dial tcp 10.87.0.38:1234 which imply to me that the engine has disappeared at some point in the run
also shoutout to @still garnet v2 webgui chokes on this super hard but it works great in v3
hm so the tests work by spinning up an engine and having it talk to that
you should be able to see the engine logs for the engine the tests ran against
it should be directly in the .asService or in the .withExec for that engine
lol i jinxed the wasm
https://v3.dagger.cloud/dagger/traces/c542a64c690b08dde82a78e17f46c025?span=c25c7af525b9b808&logs#L275191-c25c7af525b9b808 loads eventually, but i don't know how to use the engine logs to answer my question... i guess it doesn't looked like it crashed, so i can infer it shut down?
just FYI, I've opened a draft PR (https://github.com/dagger/dagger/pull/9011) to run tests using https://github.com/gotestyourself/gotestsum with archiving of the results in json format. Next step I want to try (as part of this PR) is to ingest those results in a tests dashboard to help easily identify flakes and performance of these tests.
Starting a thread to continue our discussion on secrets from yesterday @wet mason . When I have enough clarity I plan on opening an issue for more structured and lasting conversation.
When using vanilla buildkit cache export
@still garnet question π
is it intentional that ExecErrorOutput always seems set to false inside a module?
ExecError.Stdout always seems empty π€
yes
for modules, the intent is to trust the error log hoisting to do its job in the UI, otherwise the user sees 2-3x the amount of output, in worse representations
it's only kept for backwards-compatibility with non-module use cases
hm okay, so the problem is that i now can't check to see if the ExecError has a specific stderr and do a specific thing with it
can we fix the stderr API to work even if the container fails? π€
my thinking is generally these were decisions made a while back, mostly as crutches
ah nice
yeah, sorry, i hadn't realized we were deprecating the stderr field
we should mark all the fields in all the sdks as such
or maybe just remove them? it's a breaking change to depend on them now
end goal of my change is really to just clean up the errors UI, and part of that required ripping off the band-aid
so i guess alternatively we could continue collecting it, but remove it from the Error() display
e.g. the change breaks https://github.com/jedevc/dagger/blob/main/.dagger/sdk.go#L177-L184
oo, ok yeah even that case would be broken then, if we remove it from Error()
is there any way we could keep populating that field? and somehow strip it out at the display layer instead?
unfortunately not, unless we want to do regex style string munging, but sometimes it even ends up being nested
i even tried something like "only display the first line" - but some people legitimately use multi-line errors
can we do version compat with this?
maybe old modules could always get it, and then we just remove the ExecError fields entirely for this in newer modules?
i'd like to avoid breaking modules that use ExecError like this π’ (it's easy to hack our case to work, but it's kinda frustrating to break old modules)
hmm i feel like that might just kick the can down the road, modules would probably silently bump their engine and that Error() check would be just as silently broken, no?
(not saying no, just trying to understand)
yeah, that's true, but at least having an engineVersion dependency means that all currently published modules keep working
oh i guess because you can wrap errors π€ π€
yeah... i looked for any possible alternative and didn't really find any, because they so often just get turned back into strings or wrapped or etc.
we could have the go sdk detect if it wraps an ExecError, and clear the stdout/stderr before sending it back?
doesn't work, because when a module returns an error, it's just a marshalled string
hmm i'm pretty sure i tried that and there were still common cases where it didn't work, but i could try again
it's a very brittle path for perserving type info
yeah agreed
other option? we could break Error, and have that not return the stdout/stderr, so wrapping doesn't have it, but leave the Stdout/Stderr fields as-is?
it's still a bit breaking, but it feels okay, since technically I think we should have been using .Stderr in our code in the first place
that doesn't require any type-level magic
same as this right?
keep the fields, remove the display?
it would at least be better for sure
only thing is I wonder how many people are using .Error() as a shortcut for "I don't know/care whether it's stderr or stdout"
you'll also have to type-cast to get to it
blech, sorry, feels like we're painted into a corner a bit here
i'm also trying to deal with the OTel bump, it still breaks with the Bass SDK, for a completely different reason, despite the intent of the replace rules being to prevent that sort of breakage (it's because the Bass SDK depends on the released dagger.io/dagger/telemetry package, and the new replace rules break compat. with it)
mm yeah, if we can paint ourselves out of the corner somehow temporarily, then that's good - we should revisit this for sure, exec error feels a bit weird now we have withExec's expect, etc
Need a hand with that? I can also dive in a bit tomorrow
does using expect let you grab Container.Stderr and Container.Stdout after the failure?
yup, you can
one thing it doesn't do which is annoying, is you need to know which thing in a chain failed - ExecErr propagates nicely, so if it happens in the first step, it still works nicely, Stdout/Stderr only get the last one
would appreciate it, but i'm a bit doom-pilled on it, i'm not sure there's a way to make it work without just requiring everyone to bump Bass/Apko and hope for no more backwards incompatible changes. would love to be wrong, but it's in a lovely intersection of dependency hells
okay, i'm about to head off for my day soon, but i will try and see if i can get any mileage on this tomorrow π
no worries, I can pick that one up and look for alternatives, maybe I missed something. if nothing else, I can bring back the fields
or might try and weird ExecError detection in dispatch
okay, lemme know!
there's an accidental tracking issue here: https://github.com/dagger/dagger/issues/9023 π
@still garnet amusingly i was just reading this earlier today: https://abenezer.org/blog/hyrum-law-in-golang
felt somewhat relevant π

lovely to have a name for this
i glossed right over the fact this has a dagger shell shebang, subshells, and variable substitution
Yeah it's an actual bash syntax parser