#maintainers


dense dust
#

I'm also experiencing something similar
The engine hangs without printing anything to the console:

❯ cd dev
❯ gh repo clone https://github.com/vito/bass
Cloning into 'bass'...
remote: Enumerating objects: 14062, done.
remote: Counting objects: 100% (14062/14062), done.
remote: Compressing objects: 100% (3681/3681), done.
remote: Total 14062 (delta 10181), reused 13882 (delta 10071), pack-reused 0
Receiving objects: 100% (14062/14062), 29.02 MiB | 18.13 MiB/s, done.
Resolving deltas: 100% (10181/10181), done.
❯ cd bass
❯ export _EXPERIMENTAL_DAGGER_JOURNAL=$HOME/dagger.io/api/journals/$(date +%s).log
❯ cd dagger
❯ dagger functions
#

the terminal instance remains stuck after that (cannot use CTRL+C / CTRL+Z)

#

container logs are being generated, so maybe the engine is not completely stuck

#

restarting the engine container fixes it, but the terminal that exec'd that command remains frozen

wild zephyr
#

👋 seems like the combination of codegen and connecting to the engine via tcp is broken.

 _EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://localhost:12345 dagger call -m github.com/shykes/daggerverse/hello@9d5ee5168db3e77f9e88f36b89df7ab7760579ad
✘ ModuleSource.asModule: Module! 2.6s
  ✘ Module.withSource(
      source: ✔ moduleSource(refString: "github.com/shykes/daggerverse/hello@9d5ee5168db3e77f9e88f36b89df7ab7760579ad"): ModuleSource! 1.6s
    ): Module! 2.6s
    ✘ Container.directory(path: "/src"): Directory! 1.8s
      ✘ exec /usr/local/bin/codegen --module-context /src --module-name hello --propagate-logs=true --introspection-json-path /schema.json 0.2s
      ┃ runc run failed: unable to start container process: error during container init: error mounting "/r
      ┃ kit/buildkitd.sock" to rootfs at "/.runner.sock": stat /run/buildkit/buildkitd.sock: no such file o
      ┃ ory                                                                                                

Error: input: resolve: moduleSource: asModule: failed to create module: failed to update codegen and runtime: failed to generate code: failed to get modified source directory for go module sdk codegen: process "/usr/local/bin/codegen --module-context /src --module-name hello --propagate-logs=true --introspection-json-path /schema.json" did not complete successfully: exit code: 1

just checking if this is a regression indeed. cc @rancid turret @civic yacht @spark cedar @dawn mauve

wild zephyr
civic yacht
wild zephyr
wild zephyr
civic yacht
# wild zephyr on top of this, is there a reason not to make the engine listen to the buildkitd...

It's always been required in order for nested execs to work (modules+codegen use those). We could change it to always listen on the unix socket, but I'm very slightly hesitant, since there could be some obscure case where someone setting up a custom engine actually doesn't want it to listen on that sock for whatever reason. The better fix would be to just have nested execs connect using whatever listener is actually set up, which is only slightly more complicated.
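
A rough sketch of the "connect using whatever listener is actually set up" idea, for illustration only. `pickNestedAddr` and the address list are hypothetical and not taken from the engine code; the sketch prefers a unix socket when one is configured and otherwise falls back to the first listener.

```go
package main

import (
	"fmt"
	"strings"
)

// pickNestedAddr is a hypothetical helper: given the addresses the engine
// is actually listening on, return the one a nested exec should dial.
// It prefers a unix socket and falls back to the first configured address.
func pickNestedAddr(listeners []string) (string, error) {
	if len(listeners) == 0 {
		return "", fmt.Errorf("engine has no listeners configured")
	}
	for _, addr := range listeners {
		if strings.HasPrefix(addr, "unix://") {
			return addr, nil
		}
	}
	return listeners[0], nil
}

func main() {
	// TCP-only engine, as in the failing report above.
	addr, err := pickNestedAddr([]string{"tcp://localhost:12345"})
	fmt.Println(addr, err)
}
```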

wild zephyr
tepid nova
#

This error looks very suspicious:

Error: input: resolve: moduleSource: asModule: failed to create module: failed to update module dependencies: failed to initialize dependency modules: failed to initialize dependency module: failed to create module: failed to update codegen and runtime: failed to generate code: failed to get modified source directory for go module sdk codegen: process "/usr/local/bin/codegen --module-context /src --module-name dagger --propagate-logs=true --introspection-json-path /schema.json" did not complete successfully: exit code: 1

Stderr:
Error: generate code: template: alias.go.tmpl:56:3: executing "_dagger.gen.go/alias.go.tmpl" at <ModuleMainSrc>: error calling ModuleMainSrc: failed to generate type def code for EngineSource: failed to convert method CLI to function def: default value "registry.dagger.io/engine" must be valid JSON: invalid character 'r' looking for beginning of value
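
The tail of that message matches what Go's `encoding/json` reports when it is handed a bare, unquoted string. A minimal reproduction, assuming nothing about how codegen validates defaults internally (`validDefault` is a made-up helper):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validDefault is a hypothetical check: it reports whether raw parses
// as a JSON value.
func validDefault(raw string) error {
	var v interface{}
	return json.Unmarshal([]byte(raw), &v)
}

func main() {
	// A bare string is not valid JSON; the quoted form is.
	fmt.Println(validDefault(`registry.dagger.io/engine`))
	fmt.Println(validDefault(`"registry.dagger.io/engine"`))
}
```

The first call fails on the leading `r`, the second succeeds, which suggests the default was passed along without JSON quoting somewhere.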
civic yacht
tepid nova
#

earlier I had the same error but with testdata instead of registry.dagger.io

#

It's related to my machine somehow. Jeremy is running the same version of my module, from git, and it all works fine

civic yacht
spark cedar
#

ok, request-for-help 😄 i've started seeing an absurd failure in test/testdev on main recently, e.g. https://github.com/dagger/dagger/actions/runs/7972593385/job/21764667627 (after merging https://github.com/dagger/dagger/pull/6288 and https://github.com/dagger/dagger/pull/6611, though not sure if that's relevant)
I'm getting this insane internal go failure: this kind of error shouldn't be exposed to go users - googling runtime: name offset base pointer out of range does not take you anywhere helpful really - it feels like something memory corruptiony?

these failures happen often enough that they add a lot of flakiness, but i'm really out of ideas for how to debug this properly

spark cedar
spark cedar
#

AGH, do you know what's really fun to realize right after cutting v0.10 🙂

#

i guess we can do it for v0.10.1? 🤔

still garnet
#

yeah not the worst thing in the world

spark cedar
#

i might add this to RELEASING so it doesn't get forgotten next time 😄

still garnet
#

who knows, maybe we'll do a v0.11.0 in a shorter amount of time anyway, there are lots of fun things we could do with modules now that we have the foundation (I have a dagger exec spike that works like nix run for example)

tepid nova
civic yacht
tepid nova
tepid nova
tepid nova
#

@still garnet planting a seed for later: I think the lack of compat with the docker network model (n-to-n connectivity) is going to hurt use cases that rely on docker-compose compatibility, or at least easy migration. See #1213207059974197308

tepid nova
bronze hollow
#

Is the Dagger engine injecting OTEL_TRACES_EXPORTER=otlp into all my pipelines?

tepid nova
#

Question. I want to run a headless Dagger service. Eg. boot a machine, and have the machine pre-configured to start a Dagger service, and to keep it running. I'm sure I can get that to work in a hacky way today. But what would it take to make it not hacky - production-ready, even?

This is related to the production architecture and compute driver discussion FYI @stray heron @civic yacht @spark cedar 🙂

#

My guess is that a "detached mode" would be needed (otherwise I need to keep a pseudo-client running at all times, which involves hacky glue)

#

Also things like restart policy, etc

civic yacht
tepid nova
#

Maybe there's an easier stopgap we can start with...

  • Full-blown detached mode for any client at any time, like you said, will take serious work.
  • Meanwhile, what if engine provisioning (via compute drivers, or even more basic, via a config file in the engine container) allowed configuring "init services", and the engine is responsible for calling those services when it starts, and keeping them running? Then there is no need for a detached mode: the engine is its own client, and doesn't need to detach from itself.
tepid nova
#

I'm hitting a pretty serious performance bottleneck when running a function against an array of custom objects... Example:

$ dagger call \
 -m github.com/dagger/dagger/linters/markdown@pull/6816/head \
 lint --source https://github.com/dagger/dagger \
 checks \
 format --text 'Error at {{.FileName}} line {{.LineNumber}}'

✘ MarkdownReport.checks: [MarkdownCheck!]! 29.6s
  ✘ exec /runtime 29.6s

Error: response from query: input: markdown.lint.checks call function "Checks": process "/runtime" did not complete successfully: exit code: 255

This runs for a solid 45 seconds then fails without an error message.

The same call on a smaller repo (with fewer linter results) works just fine:

$ time dagger call \
 -m github.com/dagger/dagger/linters/markdown@pull/6816/head \
 lint --source https://github.com/kpenfound/dagger-modules \
 checks \
 format --text 'Error at {{.FileName}} line {{.LineNumber}}'

Error at dockerd/README.md line 9
Error at fly/README.md line 4
Error at golang/README.md line 5
Error at golang/README.md line 6
Error at golang/README.md line 7
Error at netlify/README.md line 4
Error at proxy/README.md line 1
Error at proxy/README.md line 5
Error at proxy/README.md line 6
Error at proxy/README.md line 7
Error at proxy/README.md line 10
dagger call -m github.com/dagger/dagger/linters/markdown@pull/6816/head lint   0.72s user 0.28s system 10% cpu 9.619 total
#

Is there any low-hanging fruit to pick here, or will it require deep optimizations of either our GraphQL server implementation (for more concurrent query evaluation) or the function call logic (i.e. solving cache invalidation so that we can cache function calls)?
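
To make "more concurrent query evaluation" concrete, here is a generic fan-out sketch: each element of an array is processed in its own goroutine and results keep their input order. `mapConcurrently` is a hypothetical helper and says nothing about how the GraphQL server is actually structured.

```go
package main

import (
	"fmt"
	"sync"
)

// mapConcurrently applies fn to every item in its own goroutine and
// returns the results in input order.
func mapConcurrently(items []string, fn func(string) string) []string {
	out := make([]string, len(items))
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(i int, item string) {
			defer wg.Done()
			out[i] = fn(item) // each goroutine writes a distinct index
		}(i, item)
	}
	wg.Wait()
	return out
}

func main() {
	checks := []string{"dockerd/README.md line 9", "fly/README.md line 4"}
	formatted := mapConcurrently(checks, func(c string) string {
		return "Error at " + c
	})
	for _, line := range formatted {
		fmt.Println(line)
	}
}
```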

rancid turret
#

Yeah, that's the performance bottleneck I hit too in the codegen module. Ok in the CLI, especially when querying state but very bad when using it as a dependency.

tepid nova
tepid nova
#

Is dagger --focus deprecated now?

still garnet
#

-v = don't remove things when they finish, otherwise the same
-vv = show even really fast things
-vvv = show internal things too

there are probably a million ways to reconfigure this, so just having an increasing verbosity (like curl) felt like a bit less mental overhead vs. separate flags for every little thing
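
The levels described above can be read as cumulative thresholds. A minimal sketch, assuming a hypothetical `view` struct rather than the CLI's real types:

```go
package main

import "fmt"

// view is a made-up description of what the progress UI shows.
type view struct {
	KeepFinished bool // -v: don't remove things when they finish
	ShowFast     bool // -vv: show even really fast things
	ShowInternal bool // -vvv: show internal things too
}

// viewFor maps a verbosity count to the cumulative set of options.
func viewFor(verbosity int) view {
	return view{
		KeepFinished: verbosity >= 1,
		ShowFast:     verbosity >= 2,
		ShowInternal: verbosity >= 3,
	}
}

func main() {
	for v := 0; v <= 3; v++ {
		fmt.Printf("%d %+v\n", v, viewFor(v))
	}
}
```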

still garnet
#

maybe -vv and -vvv should be swapped though

tepid nova
#

@civic yacht I'm making my way to reviewing the views/config PR, but still going through docs review & analytics... I suspect there will be some UX bikeshedding involved.

How do you feel about preemptively splitting the view implementation, from the UX to manage it?

That way, we could fast-track a solution to @copper snow 's problem, and he can start porting things over to 0.10. Sure, he has to manually edit his dagger.json while we bikeshed the UX, but I'm sure it's a tradeoff he would be happy with. Same for the other people who have been complaining about this.

WDYT?

civic yacht
civic yacht
# tepid nova @civic yacht I'm making my way to reviewing the views/config PR, but st...

Actually, even easier, what if I just mark the new dagger config subcommands as hidden, so if they end up in the next release they aren't official and we can update them based on the bikeshedding? Then we can fast track the current PR, it's good to go now.

Will mainly just save some time in terms of rewriting the integ tests added for views to not rely on the CLI, which they currently do.

tepid nova
#

No docs either then right?

#

(I think you mentioned docs in the same PR)

civic yacht
copper snow
#

I don't mind editing a config file at all

tepid nova
#

What version of CUDA do we use for the gpu-enabled engine image?

#

(note, can't wait for that stuff to be functions)

tepid nova
#

@tidal spire and I are wondering if we could selectively cache execution of some of the "module initialization" functions?

#

to chip away at the overhead

tidal spire
#

Mainly my assumption is there are at least 3 stages: runtime building, dependency installation, and function execution. It seems to me that none are cached where the first two could be without risk, if that granularity is possible

tepid nova
#

@spark cedar is your port of our CI to Functions merged yet?

spark cedar
#

Not yet! I've been on holiday the last week, so it's still sitting in the PR - planning on hacking on it in any spare moments I get during kubecon, and will get back on it fully next week 🎉

civic yacht
# tidal spire Mainly my assumption is there are at least 3 stages: runtime building, dependenc...

In theory they should be cached actually (https://github.com/sipsma/dagger/blob/aec81d10a55db52937a5e67e5ad8870b29fe9a35/core/module.go#L140-L140) but there's not a test for that yet and I agree that it seems like they re-run more often than they should currently. Could be a bunch of things (e.g. in the past there was an issue with the schema introspection result not being sorted and thus semi-random).

If you have a repro off-hand where you're seeing it run too often, feel free to post it quick in an issue. Otherwise I can make one. I think I know how to write an automated integ test for this now (testing cache is kind of tricky, but I think we can do it with a custom SDK maybe), so may be able to repro that way
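
The unsorted-introspection problem mentioned above is a general one: a cache key derived from an unordered collection changes from run to run unless the input is sorted first. A minimal sketch, where `cacheKey` and the type names are illustrative and not the engine's actual hashing:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// cacheKey hashes a set of names. Sorting a copy first makes the key
// independent of the order the names arrive in.
func cacheKey(names []string) string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	h := sha256.New()
	for _, n := range sorted {
		h.Write([]byte(n))
		h.Write([]byte{0}) // separator so ("ab","c") differs from ("a","bc")
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := cacheKey([]string{"Container", "Directory", "File"})
	b := cacheKey([]string{"File", "Container", "Directory"})
	fmt.Println(a == b)
}
```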

civic yacht
civic yacht
tidal spire
tepid nova
#

TUI feature request: follow dagger-in-dagger sessions, and visualize them all in the same call tree

still garnet
tepid nova
#

maybe - I think the issue for me is I'm execing dagger call so I'm getting TUI-in-TUI

#

but the innermost TUI seems to just wipe the rest

obsidian rover
#

Hello 👋,
Is there a way to modify the dagger engine so that it works with a module stored on a private repo? I am ok with doing ugly / non-scalable things on my local engine, but no host access (if possible). The best would be to be able to rely on a PAT, or a file. Other question: do you see a path where an engine can use several PATs? Would that mean restarting the engine between those keys?

#

The git implementation relies on the SSH socket for auth on private repos: https://pkg.go.dev/dagger.io/dagger#GitOpts (client side). However, I don't see how this could scale for an engine deployed as part of a server worker

civic yacht
# obsidian rover Hello 👋 , Is there a way to modify the dagger engine so that it works with a mo...

Yeah we haven't implemented that yet. There are multiple possible approaches, but I think it would make sense to detect whether the client has any git creds set (PAT, file, or ssh auth sock) and then automatically use that for module loading only (i.e. every module itself making Git calls by itself should not have automatic access to the caller's git creds). I think this would probably mean implementing a new session attachable
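
A sketch of what the client-side detection step might look like, under loud assumptions: `detectGitCreds`, `GIT_PAT` and `GIT_CREDENTIALS_FILE` are invented for illustration; only `SSH_AUTH_SOCK` is a standard variable. It reports which sources are set without reading their values.

```go
package main

import "fmt"

// detectGitCreds is a hypothetical probe. GIT_PAT and GIT_CREDENTIALS_FILE
// are made-up names; SSH_AUTH_SOCK is the standard ssh-agent variable.
func detectGitCreds(getenv func(string) string) []string {
	var found []string
	for _, name := range []string{"GIT_PAT", "GIT_CREDENTIALS_FILE", "SSH_AUTH_SOCK"} {
		if getenv(name) != "" {
			found = append(found, name)
		}
	}
	return found
}

func main() {
	// Use a fake environment so the example is deterministic.
	env := map[string]string{"SSH_AUTH_SOCK": "/tmp/agent.sock"}
	fmt.Println(detectGitCreds(func(k string) string { return env[k] }))
}
```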

obsidian rover
obsidian rover
civic yacht
#

So yeah, no workarounds yet, just need to implement it

obsidian rover
civic yacht
tepid nova
#

FYI we spotted an extra variation of the "dagger behind corporate proxy" issue in the wild: goproxy blocked, so go sdk can't build the runtime container.

tepid nova
spark cedar
#

@fair ermine @rancid turret any chance you could find time this week to connect the CI modularization work to the dev-SDKs? I've avoided porting those since @rancid turret mentioned you'd be able to handle it, and it feels like it's getting to a good enough place to look at that now 🎉

#

I think the only thing I have left in the list is to do cli publishing, then I can start writing the mage wrappers to provide some basic compat 😄

tepid nova
#

Can you all keep me in the review loop 🙏 I heard from several users that they expect our own CI to be a reference for best practices, and a showcase for our DX. So I suddenly care about the design details for that reason.

spark cedar
#

Happy to have reviews on it from whenever, either in a discord thread or in GitHub, whichever works - atm it's pretty much a direct port from the magefile structure.
Happy to do bike shedding, but I also do have a slight preference to merge fast and iterate after, since getting the whole team using dagger modules to build dagger should hopefully find a lot of useful feedback that maybe we wouldn't have found (I've started a long unstructured list of these in the PR itself)

tepid nova
rancid turret
spark cedar
#

yeah sure - alternatively, it'll take a few minutes for me to port the existing mage python to modules, so i might do that instead 🙂

#

no worries, that prioritization sounds good 😄

tepid nova
#

FYI I started cleaning up my queue of issues, and moving my recent requests from Discord to Github

#

@rancid turret I'm assigning all my CLI issues to you, for tracking - but feel free to dispatch, delegate etc

#

A lot of my issues tend to never get picked up, unless I pick battles (I am reminded of this as I clean up my old issues). So I will pick battles 😛

#

I love closing old issues that have been solved 🙂

obsidian rover
#

I'm open to help on any of those (easy) 👀, I miss the engine work ahah 🙏 (feel free to assign)

tepid nova
spark cedar
#

tiny bit of tui feedback @still garnet 👀
would it be possible to show the platform for a step? i'm doing a bunch of multi-platform stuff, and it's not immediately apparent which one is which 😢 (which is fiddly to understand when they are producing logs)

still garnet
#

With otel you’ll be able to create arbitrary spans which might be the solution here

still garnet
#

Maybe it could tho

crystal lark
#

Not sure this is the best place to ask, but I'm looking into https://github.com/dagger/dagger/issues/6887 around creating a CLI image. If anyone has thoughts/ideas around this, esp how to test, I'd be all ears. Happy to pair if anyone has time, but I'm off for the easter weekend after today.

GitHub

What are you trying to do? The dagger docs for gitlab / github / etc have each job installing curl followed by installing the dagger CLI to use for a given CI run. It would be much nicer DX, and pr...

spark cedar
#

so, our ci is kind of in a bit of a weird place atm - we're doing a ton of work to overhaul the whole thing to dagger functions: https://github.com/dagger/dagger/pull/6843
but when this is done, the place to put the cli image would be in ci/cli.go with a new Container method, similar to how we already have in ci/engine.go. then, we'd also mod the Publish function to publish the container as well

#

until then it's a bit tricky - you could start hacking in internal/mage/cli.go, which uses mage files to build everything - it would just be a new function there, and we'd have cli:publish-image or something that would build and publish a new cli image

#

i'm also off for the easter weekend, bank holidays ftw 🇬🇧

crystal lark
#

Cool. I'm hoping the GPU bit is easy with wolfi -- we have an extra packages repo at https://packages.cgr.dev/extras that contains packages for nvidia-driver and nvidia-tools. That does add a lot of weight to the image though -- I think you'll want variants (there's also license differences). I'd also note that having git pulls in a shell and some other deps.

spark cedar
#

ideally, with the new ci structure, it should be a lot easier to add as many variants as we want without too much trouble 🎉

crystal lark
#

Great. For the moment I created a couple of images with apko, I'll take a look into how to build in Dagger next week. I'm not sure if the apko images are useful to anyone but they're available at amouat/dagger-cuda and amouat/dagger (the second one doesn't include CUDA libraries).

tepid nova
#

Picking up something we dropped in the mad race to ship Zenith: secret backends. Adding schemes like docker:..., op://..., vault://..., depot:..., etc. for better handling by the CLI
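
A minimal sketch of how such references could be split into a backend scheme and a backend-specific remainder. `parseSecretRef` and the example values are hypothetical; this is not an existing CLI feature.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSecretRef is a hypothetical parser: it splits "scheme:rest" or
// "scheme://rest" into its two parts. ok is false when no scheme is present.
func parseSecretRef(ref string) (scheme, rest string, ok bool) {
	i := strings.Index(ref, ":")
	if i <= 0 {
		return "", ref, false
	}
	return ref[:i], strings.TrimPrefix(ref[i+1:], "//"), true
}

func main() {
	// Example values are invented for illustration.
	for _, ref := range []string{"op://vault/item/field", "docker:registry-password", "plain-value"} {
		scheme, rest, ok := parseSecretRef(ref)
		fmt.Printf("%q -> scheme=%q rest=%q ok=%v\n", ref, scheme, rest, ok)
	}
}
```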

#

I don't think we ever opened an issue about that

rancid turret
#

I have an inkling there may be an issue about secret providers. Can't find it, must have been a comment.

tepid nova
spark cedar
rancid turret
#

It seems to be same flakiness around networking, giving off unrelated errors elsewhere. Same happens in other SDKs.

Establishing connection to Engine... Error fetching branch: error fetching branch from origin: exit status 128
Error fetching branch: error adding fork as remote: exit status 3
Error fetching branch: error fetching branch from origin: exit status 128
Error fetching branch: error adding fork as remote: exit status 3
1: connect
spark cedar
#

aha, thanks

rancid turret
tepid nova
#

Do we still support arbitrary SDKs? While reading this thread (#go message) I just realized the Go SDK is hardcoded in the engine.

civic yacht
rancid turret
#

Not ready yet. If it's a script/program you can just dagger run <cmd>. If you want to connect to the API externally, I use DAGGER_SESSION_TOKEN=test dagger listen --allow-cors with https://studio.apollographql.com/, for example.
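
For connecting from code rather than from Apollo Studio, the request can be sketched as below. Assumptions, not confirmed by this thread: the listener is at `http://127.0.0.1:8080`, the GraphQL endpoint is `/query`, and the session token is sent as the HTTP basic-auth username with an empty password. The sketch only builds the request; it does not send it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newQueryRequest builds a GraphQL POST request. The "/query" path and the
// token-as-basic-auth-username convention are assumptions.
func newQueryRequest(baseURL, token, query string) (*http.Request, error) {
	payload, err := json.Marshal(map[string]string{"query": query})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/query", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.SetBasicAuth(token, "")
	return req, nil
}

func main() {
	req, err := newQueryRequest("http://127.0.0.1:8080", "test", "{ __typename }")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```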

civic yacht
#

@rancid turret I got a test failure in a PR and this failure locally:

    multi.go:85: 19: exec dagger --debug init --sdk=python --name=test --source=. DONE
    module_python_test.go:645: 
                Error Trace:    /home/sipsma/repo/github.com/sipsma/dagger/core/integration/module_python_test.go:645
                Error:          Received unexpected error:
                                input: container.from.withMountedFile.withWorkdir.withExec.file resolve: lstat /work/requirements.lock: no such file or directory
                Test:           TestModulePythonLockHashes
    suite_test.go:296: 
--- FAIL: TestModulePythonLockHashes (27.82s)

Don't think my changes in the PR should have impacted this (just logger changes), is it failing locally for you too?

#

Wait sorry I accidentally ran against the wrong engine locally, it's passing locally.. I did get a failure in CI. I will try it again and see if it's a flake

civic yacht
#

Okay yeah the CI failure is good ol failed to compute cache key: failed to get state for index 0 on so just that dumb flake from buildkit, not a test issue

tepid nova
#

I'm getting a segfault when calling a function from the dagger repo:

dagger call -m github.com/dagger/dagger/sdk/python/runtime base-image

Output:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2c023c]

goroutine 1 [running]:
python-sdk/internal/dagger.(*Directory).WithoutDirectory(...)
    /src/sdk/python/runtime/internal/dagger/dagger.gen.go:2235
main.New(0x0)
    /src/sdk/python/runtime/main.go:63 +0x10c
main.invoke({0x3fe3d0, 0x638e00}, {0x40001824b8, 0x4, 0x8}, {0x40001823f0?, 0x4000104c68?}, {0x0?, 0x40000100e0?}, 0x10?)
    /src/sdk/python/runtime/dagger.gen.go:800 +0x174
main.main()
    /src/sdk/python/runtime/dagger.gen.go:591 +0x4b0
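
The trace shows `main.New(0x0)` calling a method on a nil `*Directory`. A sketch of the usual guard, with a stand-in `Directory` type and invented names; the real constructor in `sdk/python/runtime/main.go` is not shown here and may need a different fix (for example a default value).

```go
package main

import (
	"errors"
	"fmt"
)

// Directory is a stand-in for the generated SDK type, not the real one.
type Directory struct{ entries []string }

// WithoutDirectory dereferences its receiver, so calling it on a nil
// pointer panics, which is the shape of the trace above.
func (d *Directory) WithoutDirectory(name string) *Directory {
	out := &Directory{}
	for _, e := range d.entries {
		if e != name {
			out.entries = append(out.entries, e)
		}
	}
	return out
}

// New guards the optional argument before using it.
func New(source *Directory) (*Directory, error) {
	if source == nil {
		return nil, errors.New("source directory is required")
	}
	return source.WithoutDirectory("runtime"), nil
}

func main() {
	_, err := New(nil)
	fmt.Println(err)
}
```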
tepid nova
#

Possible bug report: in his live demo, Kyle got a "no ports exposed" when calling dag.Container().From("nginx").AsService().Up()

--> I've used this exact pipeline before, and the nginx image does expose port 80. It looks like it may be a regression, where we are ignoring exposed ports in the image like we did in older versions?

tidal spire
#

it could have been from 10.2 -> 10.3 because I did this last night without WithExposedPort and it worked

bronze hollow
#

Do we have a nice way to work with paths and modules yet that isn't command line arguments?

spark cedar
#

Any objections for when we land our functional-ized ci https://github.com/dagger/dagger/pull/6843 ?
Happy to get this in today 😄 and then we see what happens next week, and get some battle testing in. The only thing to be aware of, is if I've made a mistake in the release pipeline at any point, then this will require manual intervention for when we do the next release, so I likely would need someone on hand to help approve, etc, as we go.

#

cc @stray heron 👀

bronze hollow
#

OK. Thanks. I've got modules that call modules that call modules and I need like 4 arguments passed down 😅

bronze hollow
#

If I set "root": "." inside dagger.json, I should be able to use dag.host().directory within that module to access those files; correct?

#

Wait, was root removed?

#

I feel like I'm going round in circles. Can modules access their own source?

rancid turret
bronze hollow
#

Ahh. Then I probably don't want to continue with that hack if source means bringing in much more than the module needs 😅

#

Thanks

tepid nova
#

@bronze hollow you can exclude I think

rancid turret
#

@spark cedar, what's the status of ./hack with the CI module?

spark cedar
#

./hack/make is a proxy to the ci module

#

./hack/dev is still a little bit special, it keeps the old load-to-docker behavior

#

but uses the ci module to build the engine still

#

there is a dev function you can call, but it feels confusing to not have persistence for it i think

#

eventually, the aim is to completely kill it, but i think we need dagger script for that

rancid turret
#

I'm used to building with ./hack/dev and running with ./hack/with-dev. dev builds for Linux now.

spark cedar
#

ohhh, you're on mac?

rancid turret
#

Yep 🙂

spark cedar
#
go: github.com/pjbgf/sha1cd@v0.3.0: read "https:/proxy.golang.org/@v/v0.3.0.zip": stream error: stream ID 265; INTERNAL_ERROR; received from peer
go: github.com/sergi/go-diff@v1.3.2-0.20230802210424-5b0b94c5c0d3: read "https:/proxy.golang.org/@v/v1.3.2-0.20230802210424-5b0b94c5c0d3.zip": stream error: stream ID 281; INTERNAL_ERROR; received from peer
goroutine 1 [running]:
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.func1()
    /go/pkg/mod/go.opentelemetry.io/otel/sdk@v1.24.0/trace/span.go:405 +0x25
rancid turret
#

The command returns non-zero and that's ok, just the resulting panic that's not.

fresh harbor
#

I found an issue on dagger session on version 0.11.0. The session port and token (json format) show and disappear in around 1 second.

The workaround at the moment is to use --progress=plain instead.

still garnet
#

@fresh harbor running session interactively like that is pretty unusual, i don't think we expect much from that experience. you can pass -v to keep it from disappearing though

fresh harbor
tepid nova
#

@spark cedar where should I go to check out the latest and greatest of our daggerized CI?

tidal spire
spark cedar
tepid nova
#

Why not merge it right away? The current ci/ directory is completely broken anyway, isn't it?

tepid nova
spark cedar
#

We're using it for all our CI now (we did the last release using it)

tepid nova
#

Ah! Sorry I interpreted the #ci anchor in the URL above, as a branch name. as in dagger -m github.com/dagger/dagger#ci. I now see that I was mistaken!

#

this tells me that the repo#branch format in a git ref is not a good one. repo@branch would make more sense
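
For comparison, the `@` form already used for module refs elsewhere in this channel splits cleanly, while a `#` fragment is left attached to the source. A small sketch (`splitRef` is a hypothetical helper, not the CLI's parser):

```go
package main

import (
	"fmt"
	"strings"
)

// splitRef splits "source@version" at the last "@". A ref without "@"
// is returned whole with an empty version.
func splitRef(ref string) (source, version string) {
	if i := strings.LastIndex(ref, "@"); i >= 0 {
		return ref[:i], ref[i+1:]
	}
	return ref, ""
}

func main() {
	for _, ref := range []string{
		"github.com/dagger/dagger/linters/markdown@pull/6816/head",
		"github.com/dagger/dagger#ci",
	} {
		source, version := splitRef(ref)
		fmt.Printf("source=%q version=%q\n", source, version)
	}
}
```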

spark cedar
#

Aha no worries 🙂 we've still got a little shim mage layer in there, which we can get rid of bit by bit (although I feel like we need scripts to get rid of it entirely...)

#

Is there an active proposal for scripts? I feel like I've got some ideas for it

#

Oh there's a GitHub issue somewhere I think, I'll go have a hunt for it

tepid nova
#

this?

tepid nova
#

UX regression: dagger call <FUNC> --help no longer works to list arguments, if there is a mandatory argument

tepid nova
tepid nova
#

In main when I call dagger [ANYTHING] --help, there is a weird behavior where "Usage of global:" prints right away, then after the engine is initialized etc, the rest is printed. Not sure if that's you @obsidian rover ?

obsidian rover
# tepid nova In `main` when I call `dagger [ANYTHING] --help`, there is a weird behavior wher...

I don't think so no

But, I don't seem to see it. 🤔

go build ./cmd/dagger
➜  dagger git:(main) ✗ ./dagger functions --help
List available functions in a module.

This is similar to `dagger call --help`, but only focused on showing the
available functions.

Usage:
  dagger functions [flags] [FUNCTION]...

Flags:
  -m, --mod string   Path to dagger.json config file for the module or a directory containing that file. Either local path (e.g.
                     "/path/to/some/dir") or a github repo (e.g. "github.com/dagger/dagger/path/to/some/subdir")

Global Flags:
  -d, --debug             show debug logs and full verbosity
      --progress string   progress output format (auto, plain, tty) (default "auto")
  -s, --silent            disable terminal UI and progress output
  -v, --verbose count     increase verbosity (use -vv or -vvv for more)
➜  dagger git:(main) ✗ ./dagger sdfzxc --help
Error: unknown command "sdfzxc" for "dagger"
Run 'dagger --help' for usage.
obsidian rover
tepid nova
#

I built latest main, and it went away. So I guess it was in main at some point, but was since fixed

tepid nova
#

@spark cedar what is the target argument in github.com/dagger/dagger.dev() ?

#

looks like just a convenience workdir to be mounted in the interactive container, right?

crystal lark
#

Hey, just coming back to the Wolfi CI base image. I see the refactoring landed which is cool, and it doesn't seem much work to add in the Wolfi images. Do you have any advice on how to test changes though?

#

I'm happy to open a PR for those, I just wanted to check I wasn't missing something first -- I'm far from an expert with Go.

spark cedar
#

yup, a pr would definitely be appreciated 😄

#

if you're interested in adding wolfi, would definitely be interested in taking it - not sure about switching to the default right away, but we could definitely publish wolfi variants to test with for a release cycle or so? wdyt @stray heron

crystal lark
bronze hollow
#

@tepid nova Are you still working on the scripting idea with access to the local system?

spark cedar
#

i'm tempted to try and pick it up at some point, since without it, our own ci is a bit of a pain to work with locally

bronze hollow
#

Yeah, I'm finding myself really hating running functions atm 😅

#

It's fine for individual projects, but when there are dependencies it becomes super tedious

#

I'll manually configure a client to make it easier for now, but keeping an eye on the scripting stuff. Thanks, @spark cedar

crystal lark
#

@spark cedar what's a good test to exercise the CI base image?

spark cedar
#

oo, uh, just running the test suite should be good actually - the integration tests work by spinning up a dev engine (which uses the base image), and then testing against that

#

oh i see, yeah, we should probably also propagate the WithBase logic from the builder pattern up to the Engine object as well

crystal lark
#

Also, the tests are a bit overwhelming, even just the core engine tests

tender lava
#

Hey guys, I was just curious when I was trying to understand some Dagger 'internals' while troubleshooting one of my functions: there is a bunch of unused code, particularly some variables like error messages. See this one for example (https://github.com/dagger/dagger/blob/3266131d6d24f4fc3109d98351456e6365701e7b/core/cache.go#L32-L32). Any reason? Happy to push some boy-scout pull requests by the way 😄

spark cedar
# tender lava Hey guys, I was just curious when I was trying to understand some Dagger 'intern...

Heya @tender lava 🎉 Yeah, honestly, these are super fiddly - unfortunately, while there are linters to catch these, they're usually incredibly aggressive (see this recent PR of mine https://github.com/dagger/dagger/pull/7174), because some of these variables might be used in external code/packages/etc.

However, this one genuinely does look like something we could remove 😄 if you have the time, any scout pull requests for this and any others like this would be very appreciated ❤️

spark cedar
#

🧵 want to have a discussion about engine logs 🙂

crystal lark
spark cedar
river root
night prawn
#

Hey guys, we're intrigued by Dagger and want to give it a shot with our Spring Boot project. Has anyone already put together an example using Dagger with Spring?

tender lava
# night prawn Hey guys, we're intrigued by Dagger and want to give it a shot with our Spring B...

I haven't found any, but I think a great place to start looking is the Daggerverse. Perhaps you can get a good starting point by looking for some Java-related modules? E.g.: https://daggerverse.dev/mod/github.com/jcsirot/daggerverse/java@c591e9e0b99def2fc8a67bb090fca5cd06cf6a1d (if there's no example yet, it'd be great if you could provide the first one 😉)

tepid nova
#

Summarizing recent discussions about packaging the engine:

  • The engine OCI image should be the One True Artifact. All other release material should be derived from it.
  • The engine OCI image should include the CLI. It should be the canonical way to distribute the CLI.
  • The term "Dagger Engine" should be expanded to mean the CLI, in addition to the "buildkit runtime" (bk, runc and associated glue)
  • There should be no distinction between "configuring the dagger engine" and "configuring the dagger CLI". All engine configuration should be handled by the CLI. Communications between the CLI and the buildkit runtime should be kept private and considered unstable.
  • Examples of communications between the CLI and bk runtime that should be made private in the future: _EXPERIMENTAL_DAGGER_RUNNER_HOST; buildkit.toml and its contents; /var/run/buildkit.sock. Generally anything with the name "buildkit" in it.
  • Running the Dagger Engine as a server ("server mode") means running the official OCI image directly, with the CLI as entrypoint
  • Running the Dagger Engine as a client ("client mode") means running the CLI on the native system, configured to bootstrap a server. Both the native CLI, and the OCI bootstrap material, should be derived from the One True Artifact
  • In client mode, the system for bootstrapping a server is called compute drivers. It needs to be improved.
  • The default compute driver needs to be improved to remove the dependency on a remote registry. Native packages should include the One True Artifact, and the CLI should use that artifact to bootstrap a server. Download from registry.dagger.io should be a fallback / optimization, not a hard requirement.
  • The above will break for downstream packagers - some of which rebuild from source in violation of our trademark policy (not blaming them, they are not aware). We will need to contact them and help them fix their builds - or request that they pull their package if they are unwilling or unable to fix.

FYI @stray heron @spark cedar @civic yacht

civic yacht
#

Summarizing recent discussions about

tepid nova
#

When building the CLI with dagger call --source=. cli file, uploading the context takes a long time. Should we use a view to filter out unnecessary contents?

spark cedar
#

trying to work out if i can avoid uploading the entire git directory now 😄

tepid nova
#

This makes me want contextual modules even more

spark cedar
tepid nova
spark cedar
#

yeaaaaaaa i know 😢

#

regardless 😄 it's really late for you

#

right?

tepid nova
#

There's no urgent action required in that thread. But it's been a fun creative process

tepid nova
#

Also the peace and quiet are addictive

spark cedar
#

eugh that sucks 😦 well i guess imagining contextual modules is a good way to spend the time 😄

tepid nova
spark cedar
spark cedar
#

I notice a lot of log lines coming from here

#

e.g. time="2024-05-02T10:47:33Z" level=error msg="unhandled otlpcommonv1.AnyValue -> attribute.Value conversion" type="*v1.AnyValue_ArrayValue"

still garnet
#

It's annoying because it goes from a slice of a set of types to a bunch of separate StringSlice, BoolSlice, etc things except not all types are supported

spark cedar
#

argh very nice 😱

#

wait but isn't this all OTEL - why would some have a different interface?

still garnet
still garnet
#

what's also fun is, what if the array is empty? then we don't even know what 'primitive slice' type to fake out. gotta just pick one, i guess?

#

and hopefully they're all the same type 🫠 - all around just an unfortunate type conversion
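
(A rough sketch of the conversion problem described here, using stand-in types rather than the real OTLP or OpenTelemetry ones. `Kind`, `Converted`, `typed` and `convertArray` are made-up names for illustration. It shows why empty and mixed-type arrays are awkward: an empty array has no element to infer the slice type from, and a typed slice can't hold mixed values, so this sketch arbitrarily falls back to strings in both cases.)

```go
package main

import "fmt"

// Kind names the typed slice an array was converted to.
// Illustrative stand-in, not an OpenTelemetry type.
type Kind string

const (
	StringSlice  Kind = "StringSlice"
	BoolSlice    Kind = "BoolSlice"
	Int64Slice   Kind = "Int64Slice"
	Float64Slice Kind = "Float64Slice"
)

// Converted is a typed slice plus the kind that was chosen for it.
type Converted struct {
	Kind   Kind
	Values any
}

// typed converts in to []T, reporting false if any element is not a T.
func typed[T any](in []any) ([]T, bool) {
	out := make([]T, 0, len(in))
	for _, v := range in {
		t, ok := v.(T)
		if !ok {
			return nil, false
		}
		out = append(out, t)
	}
	return out, true
}

// convertArray maps a loosely typed array onto exactly one typed slice.
func convertArray(in []any) Converted {
	if len(in) == 0 {
		// No element type to go on: pick one arbitrarily.
		return Converted{Kind: StringSlice, Values: []string{}}
	}
	switch in[0].(type) {
	case string:
		if out, ok := typed[string](in); ok {
			return Converted{Kind: StringSlice, Values: out}
		}
	case bool:
		if out, ok := typed[bool](in); ok {
			return Converted{Kind: BoolSlice, Values: out}
		}
	case int64:
		if out, ok := typed[int64](in); ok {
			return Converted{Kind: Int64Slice, Values: out}
		}
	case float64:
		if out, ok := typed[float64](in); ok {
			return Converted{Kind: Float64Slice, Values: out}
		}
	}
	// Mixed or unsupported element types: format everything as strings.
	out := make([]string, len(in))
	for i, v := range in {
		out[i] = fmt.Sprint(v)
	}
	return Converted{Kind: StringSlice, Values: out}
}

func main() {
	fmt.Println(convertArray([]any{true, false}).Kind)
	fmt.Println(convertArray([]any{}).Kind)
	fmt.Println(convertArray([]any{int64(1), "x"}).Values)
}
```

Whether a string fallback is acceptable is a judgment call; the alternative is dropping the attribute and logging, which is roughly what the error line above suggests happens today.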

spark cedar
#

"empty arrays are not supported"

still garnet
tepid nova
#

@wet mason how do you feel about making volume management pluggable, by allowing an "init function" to be called by the engine at startup, with the volume data mounted into that function? That way the admin can implement any logic they want for persistence, including filesync to the CLI for e.g. magical integration with the CI's "just give me a directory to persist" feature

fresh harbor
#

I have a question about Dagger on Windows: how do I run Dagger from source (like ./hack/dev does) on Windows?

spark cedar
#

ah, you can manually call the things that ./hack/dev does if you like - so you'd chain together dagger calls.

fresh harbor
#

I'm on PowerShell. When I run ./hack/dev, the terminal asks me to open it with another program. 😂

So I ended up with a script that is similar to ./hack/dev, but in PowerShell:

$DaggerSrcRoot = Join-Path $PSScriptRoot ".." | Resolve-Path
$DaggerBinPath = Join-Path $DaggerSrcRoot "bin"
$MageDir = Join-Path $DaggerSrcRoot "ci" "mage"

Push-Location $MageDir -StackName "Mage"
go run main.go -w $DaggerSrcRoot engine:dev
Pop-Location -StackName "Mage"

# The mage task exports the Windows binary without the `.exe` suffix, and Windows cannot execute it without one
Copy-Item -Path (Join-Path $DaggerBinPath "dagger") -Destination (Join-Path $DaggerBinPath "dagger.exe")

$Env:_EXPERIMENTAL_DAGGER_CLI_BIN = Join-Path $DaggerBinPath "dagger.exe"
$Env:_EXPERIMENTAL_DAGGER_RUNNER_HOST = "docker-container://dagger-engine.dev"
$Env:_DAGGER_TESTS_ENGINE_TAR = Join-Path $DaggerBinPath "engine.tar"

# Execute the command passed in the arguments.
if ($args.Count -gt 0) {
    $originalPath = $Env:Path
    $Env:Path = $DaggerBinPath + ";" + $Env:Path

    Start-Process -FilePath $args[0] -ArgumentList $args[1..$args.Count] -Wait -NoNewWindow

    $Env:Path = $originalPath
}
tepid nova
#

calling dagger functions in the dagger repo took almost 30 seconds at the "initialize" phase with no details on what was going on:

still garnet
obsidian rover
#

Hello everyone,

As part of the multi SCM (github, gitlab ...) support implementation, Justin found a library exposing the function we are looking for to separate the root of a repo from the subdirs (without all the internal Go dependencies 👼 ) -> https://pkg.go.dev/golang.org/x/tools/go/vcs#RepoRootForImportPath.

However, it wasn't up-to-date on 2 important points, so I had to cherry-pick some commits. It is currently hosted on a filtered-branch fork: https://github.com/grouville/go-vcs

What do you think is best:

  1. Marcos proposed to integrate it as a subpackage directly in Dagger, in my PR. It is only 7 files, 4 of them with code, so manageable. If so, where should I put it?
  2. Keep it on my fork
  3. Move to Dagger

removed the personal ping, sorry Justin 🙏 cc @wild zephyr @wet mason
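
(For readers unfamiliar with the problem: a minimal sketch of what "separating the root of a repo from the subdirs" means. This is not the `golang.org/x/tools/go/vcs` implementation; `splitRepoRoot` is a hypothetical helper that assumes the root is always exactly `host/owner/repo`, which holds for github.com but not for hosts with nested groups. That limitation is part of why a real library is attractive.)

```go
package main

import (
	"fmt"
	"strings"
)

// splitRepoRoot splits "host/owner/repo/sub/dir" into the repo root
// ("host/owner/repo") and the subdirectory ("sub/dir").
// Assumption: the root is exactly three path segments.
func splitRepoRoot(ref string) (root, subdir string, err error) {
	parts := strings.Split(strings.Trim(ref, "/"), "/")
	if len(parts) < 3 {
		return "", "", fmt.Errorf("ref %q: want at least host/owner/repo", ref)
	}
	for _, p := range parts {
		if p == "" {
			return "", "", fmt.Errorf("ref %q: empty path segment", ref)
		}
	}
	return strings.Join(parts[:3], "/"), strings.Join(parts[3:], "/"), nil
}

func main() {
	root, subdir, err := splitRepoRoot("github.com/shykes/daggerverse/hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("root:", root)
	fmt.Println("subdir:", subdir)
}
```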

tepid nova
wild zephyr
still garnet
worldly flare
civic yacht
#

There's a caching fix here which likely makes a difference for non-Go SDKs (it seems we have been busting cache for all function calls, even for python/ts codegen, since v0.11.0): https://github.com/dagger/dagger/pull/7336

spark cedar
#

How to call RepoRootForImportPath

worldly flare
tepid nova
#

@civic yacht hypothetically if I wanted to scope out an ugly hacked POC of contextual modules, where in the codebase should I look? Even vague pointers are fine 🙂

#

Things to worry about off the top of my head:

  1. Modifying the loading logic, so that -m foo can sometimes return foo/.dagger instead
  2. Implementing dag.context().directory()
    • first step would be to carry that information around in the session state I suppose
    • if I start from the implementation of dag.CurrentModule() and copy-pasta from there, perhaps
    • Maybe looking at how SDKs access the root of the git repo, for local imports, could help? Not sure where to look for that
civic yacht
# tepid nova Things to worry about off the top of my head: 1. Modifying the loading logic, s...

For the loading logic:

For the context directory that gets passed around:

So, there's a lot 😄 Starting vague like you said but happy to help more

tepid nova
#

@still garnet @civic yacht re: ripping off the bandaid, and caching functions by default, with a pragma. May I suggest that we make that pragma ttl=<n> where <n> is a duration in milliseconds, eg. ttl=100. I've had good success using this pattern, both to explain operation caching, and to make its API simpler (as opposed to cachebusters etc). I think we should converge towards that as a standard wherever we expose operation caching
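
(A small sketch of how a `ttl=<n>` pragma could be interpreted, assuming `<n>` is a non-negative integer number of milliseconds as suggested above. `parseTTL`, `cachedResult` and `usable` are hypothetical names; nothing here reflects how the engine actually caches function calls.)

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseTTL parses a hypothetical "ttl=<n>" pragma, where <n> is milliseconds.
func parseTTL(pragma string) (time.Duration, error) {
	pragma = strings.TrimSpace(pragma)
	if !strings.HasPrefix(pragma, "ttl=") {
		return 0, fmt.Errorf("pragma %q: expected ttl=<n>", pragma)
	}
	ms, err := strconv.ParseInt(strings.TrimPrefix(pragma, "ttl="), 10, 64)
	if err != nil || ms < 0 {
		return 0, fmt.Errorf("pragma %q: <n> must be a non-negative integer", pragma)
	}
	return time.Duration(ms) * time.Millisecond, nil
}

// cachedResult is a stored function result and the time it was produced.
type cachedResult struct {
	value    string
	storedAt time.Time
}

// usable reports whether the cached result is still within its TTL at now.
func (c cachedResult) usable(now time.Time, ttl time.Duration) bool {
	return now.Sub(c.storedAt) < ttl
}

func main() {
	ttl, err := parseTTL("ttl=100")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	c := cachedResult{value: "hello", storedAt: time.Now()}
	fmt.Println("fresh right away:", c.usable(c.storedAt, ttl))
	fmt.Println("fresh after ttl:", c.usable(c.storedAt.Add(ttl), ttl))
}
```

One nice property of this shape is that `ttl=0` naturally means "never reuse a cached result", so no separate cache-busting knob is needed.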

radiant vortex
#

not sure where to post this... any chance I'm like... bricking the dagger cloud instance? getting connection timeouts after trying to load some of my recent (giant) builds

tepid nova
#

dagger cloud issue

tepid nova
#

Contextual modules & loading modules

tepid nova
#

Opening another thread related to contextual modules... Testing the waters before I propose this in the issue.

What if, when importing a standalone module, that module inherited the parent's context directory? I feel like this could be either completely dumb and terrible, or very awesome. Can't decide yet.

tepid nova
#

Are we able to ship a combined CLI/engine image for the next release? @stray heron @astral zealot @civic yacht . Seems like a nice incremental step towards more prod-readiness work

tepid nova
#

Can I load / introspect the official SDKs as regular modules, eg. dagger call them etc?

civic yacht
tepid nova
#

Continuing the packaging discussion... @stray heron @civic yacht @spark cedar @still garnet @tidal spire should we consider bundling everything into the CLI? (ie. the buildkit runtime)

civic yacht
# tepid nova Continuing the packaging discussion... <@796825768600141844> <@94903467761064350...

I'd love that, and have done something similar in the past w/ bincastle, but getting everything smushed into there will be a challenge. Containerd+runc would be easy. CNI binaries are doable provided they are written in Go. dnsmasq is just some C program though. The qemu emulators I have no idea 😄 etc.

There's ways of dealing with all of that of course, but that would be the main source of difficulty in trying to get a standalone binary. There's other stuff around platform specific issues too, though relatively straightforward.

tepid nova
#

the alternative is an installer which drops the oci image in your home directory

#

and the CLI fails (or at least shows a big warning) without it

civic yacht
spark cedar
#

We don't necessarily need it in the home dir, it could also potentially be in like /usr/share

#

I like the idea though

wet mason
#

@spark cedar it's silly but ... any chance we can move the flags at the end of the command so we get to see what call it is (e.g. sdk python test [flags]). GH truncates the command ...

Not ideal; eventually we want to be able to set the trace name programmatically, but it is what it is for now

/cc @still garnet

still garnet
#

i'm not sure if those can even be moved since they have to be passed to the entrypoint in particular

#

this might be fixable as part of browse-by-function

#

(to some extent, at least, since the trace starts before we even know what function is gonna be called)

spark cedar
#

Yeah sorry 😦 those are flags to the constructor, and currently have to appear at the beginning.
The alternatives would be to:

  • Allow constructor args to appear anywhere - probably a bit of a weird user experience, it's out of order.
  • Rework our CI to add these to the end instead of the constructor
#

That's a super simple change we could apply for v0.11.5

civic yacht
# tepid nova Continuing the packaging discussion... <@796825768600141844> <@94903467761064350...

random tangential thought, mostly for down the line: if we had this plus ideally some sort of non-privileged support (whether rootless usernamespaces, wasm, gvisor, etc.) I feel like we could also solve a lot of the various woes around filesync. Basically, filesync doesn't need to be a hard requirement if you just run the whole engine where the files are. Obviously we'd still support filesync to a remote engine, it just wouldn't be strictly needed for those cases where you have to sync like 500GB of data (or where you want to get a content checksum without transferring everything, etc.)

tepid nova
tepid nova
#

btw @hybrid widget you can now dagger call docs site and dagger call docs server, to build the docs website and server, respectively

fair ermine
#

@tepid nova Can I start working on this issue or the design is not ready?
/cc @rancid turret

#

(I forget to put the issue, my bad)

tepid nova
tepid nova
#

@civic yacht @still garnet any chance we could bikeshed live this morning? 🙏

civic yacht
still garnet
#

i'll be around 👍

obsidian rover
#

Hello core engineers,

I have a question regarding: https://github.com/dagger/dagger/blob/417743dd7a085ef6e99faedb4b51b5b0437782bf/cmd/dagger/module.go#L152-L159. I'm not sure I fully grasp how it is evaluated.

Context
I am having a segfault on a dagger init, inside a subdir where the parent dir has a git context:

mkdir dep1
cd dep1/
git init
mkdir foo
cd foo/
dagger_dev init --name=toto --source=. --sdk=go # segfault

The segfault seems to come from a change I introduced, which incorrectly makes the foo dir context of kind GIT_SOURCE. However, even though I understand that the change causes it, I haven't been able to isolate the underlying reason.

In my debug logs, it seems that the resolveFromCaller func is first called with the . source dir context. As I set the source to . in the dagger init command above, that is expected. Then, surprisingly, I see it called again with a foo dir context. This only happens when the dep1 parent dir has a git context.

From what I understand, the second resolveFromCaller call is the one segfaulting, with the foo dir context, but it seems to be triggered only when we use the export function: https://github.com/dagger/dagger/blob/417743dd7a085ef6e99faedb4b51b5b0437782bf/cmd/dagger/module.go#L159. It does not happen when that line is commented out.

I suspect some sort of lazy evaluation coming into play, however, it is a bit cryptic:

  • which API can I use from the moduleSource to force evaluation, to debug ?
rancid turret
#

Are you aware of the difference between the 3 directories in modules? Source, root and context?

obsidian rover
rancid turret
#

Basically the context directory is either the parent dir where .git is, or it's the root directory if not in git. The root directory is where dagger.json is. And the source directory is where "source" points to, which defaults to the root directory.

So, if not in a git repo and there's no "source" in dagger.json, they're all the same dir. But if it's in a subdir in a git repo, and the sources are in another subdir, then they're all different directories.
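
(The rules above can be sketched roughly as follows. This is only an illustration of the description in this message, not the engine's code: `moduleDirs` and `resolveDirs` are hypothetical names, the `exists` callback stands in for real filesystem checks, and paths are treated as forward-slash paths.)

```go
package main

import (
	"fmt"
	"path"
)

// moduleDirs holds the three directories described above.
type moduleDirs struct {
	Context string // nearest parent dir containing .git, or Root if not in a git repo
	Root    string // dir containing dagger.json
	Source  string // where "source" points to; defaults to Root
}

// resolveDirs computes the three directories for a module whose dagger.json
// lives in root. source is the (possibly empty) "source" value from
// dagger.json. exists is injected so the sketch runs without touching disk.
func resolveDirs(root, source string, exists func(p string) bool) moduleDirs {
	root = path.Clean(root)
	dirs := moduleDirs{Context: root, Root: root, Source: root}
	if source != "" {
		dirs.Source = path.Join(root, source)
	}
	// Walk up from root until a directory containing .git is found.
	for dir := root; ; dir = path.Dir(dir) {
		if exists(path.Join(dir, ".git")) {
			dirs.Context = dir
			break
		}
		if dir == path.Dir(dir) { // reached "/" or "."
			break
		}
	}
	return dirs
}

func main() {
	// Fake filesystem: only the parent dir dep1 is a git repo.
	files := map[string]bool{"/work/dep1/.git": true}
	exists := func(p string) bool { return files[p] }

	fmt.Printf("%+v\n", resolveDirs("/work/dep1/foo", "", exists))
	fmt.Printf("%+v\n", resolveDirs("/tmp/standalone", "", exists))
}
```

With the layout from the question (git init in dep1, module in dep1/foo), context comes out as dep1 while root and source are both dep1/foo, which matches the "foo dir context" vs "." distinction being debugged.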

obsidian rover
#

I tried developing the dagger/internal in the multi-stage-build folder, but it says that the source is already set to the ci folder (which is true). Digging 👀

spark cedar
#

Splitting out some of the weird issues around the plain progress 🧵

tepid nova
#

@hybrid widget @stray heron @tidal spire what would it take to get rid of the unique IDs in the docs URLs? We keep talking about it but no clear path to actually getting it done

#

If you're waiting for your PR to be reviewed, it helps to manually ask specific people to review.

hybrid widget
#

I figured that as we're going to merge the dev and user manuals soon, we will remove the remaining IDs during that process

#

... because the URL structure will change anyway once we merge those manuals so it makes sense to remove the IDs at the same time

tepid nova
#

OK cool. And you are supportive of this change, right?

My rationale is that 1) we're not getting the intended benefits of never breaking URLs, because in practice we archive pages and break URLs anyway whenever we refactor, and 2) we're paying a prohibitive cost because docusaurus does not work well with this pattern, and we don't have the engineering cycles to make tooling that would

tepid nova
#

We're hitting a wall with this command:

dagger call -m github.com/dagger/dagger/docs/current_docs/manuals/developer/snippets/state-functions/go@pull/7431/head

Output:

# github.com/dagger/dagger/docs/current_docs/manuals/developer/snippets/state-functions/go/internal/telemetry
internal/telemetry/proxy.go:50:14: cannot use proxySpan{…} (value of type proxySpan) as "go.opentelemetry.io/otel/trace".Span value in return statement: proxySpan does not implement "go.opentelemetry.io/otel/trace".Span (missing method AddLink)
internal/telemetry/proxy.go:59:20: cannot use proxySpan{} (value of type proxySpan) as "go.opentelemetry.io/otel/trace".Span value in variable declaration: proxySpan does not implement "go.opentelemetry.io/otel/trace".Span (missing method AddLink)
still garnet
tepid nova
#

I'm looking at our CI and seeing dag = dagger.Connect(). Is this some sort of compat layer so that the same code works in a Dagger function and also outside of one?

#

Or is it that this part of the code simply hasn't been ported to dagger functions yet?

spark cedar
#

I did wonder whether we should autogenerate files in each sub package

#

But there's never been a huge demand for that much splitting and compartmentalization yet

tepid nova
#

From my experience today reviewing & merging docs PRs: I think our best practice of mapping "test failure" to "dagger call error, probably passed through from an underlying withExec error" is simply not good

#

it's less work to develop a dagger function that way (just wrap exec and call it a day) but the end user experience is bad.

Errors are just not readable.

#

I propose a new best practice where a test or lint failure does not translate to a dagger call error

#

but instead is presented as structured data to be queried by the caller

#

Then if I want my CI to fail when there are non-zero linter failures - just add an if statement in my glue script
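
(A sketch of what "structured data to be queried by the caller" could look like, written as plain Go rather than as a real Dagger module. `LintIssue`, `LintReport` and `exitCode` are hypothetical names; the point is only that findings are returned as data, and the pass/fail decision lives in the caller's glue.)

```go
package main

import "fmt"

// LintIssue is one finding, reported as data rather than as an error.
type LintIssue struct {
	File    string
	Line    int
	Rule    string
	Message string
}

// LintReport is what a hypothetical lint function would return,
// whether or not it found anything.
type LintReport struct {
	Issues []LintIssue
}

// Passed reports whether the linter found nothing.
func (r LintReport) Passed() bool { return len(r.Issues) == 0 }

// Summary renders the report in a readable form for humans.
func (r LintReport) Summary() string {
	if r.Passed() {
		return "lint: no issues"
	}
	s := fmt.Sprintf("lint: %d issue(s)", len(r.Issues))
	for _, i := range r.Issues {
		s += fmt.Sprintf("\n  %s:%d [%s] %s", i.File, i.Line, i.Rule, i.Message)
	}
	return s
}

// exitCode is the "if statement in my glue script": the caller, not the
// lint function, decides that findings should fail the build.
func exitCode(r LintReport) int {
	if r.Passed() {
		return 0
	}
	return 1
}

func main() {
	report := LintReport{Issues: []LintIssue{
		{File: "docs/index.md", Line: 12, Rule: "MD013", Message: "line too long"},
	}}
	fmt.Println(report.Summary())
	fmt.Println("exit code:", exitCode(report))
}
```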

#

@spark cedar FYI I'm going to start a series of small incremental PRs to spin out some of our shared utilities into standalone modules. Hopefully this can evolve over time into an actual standard library. And in the meantime, catch inefficiencies in both the code and the cache (eg. lots of avoidable apk add)

#

starting with our go base image, then from there our alpine base image (which I'm going to port to wolfi to get the benefits of apko)

tepid nova
tepid nova
tepid nova
#
$ dagger call \
 -m github.com/dagger/dagger@pull/7461/head --source=https://github.com/dagger/dagger#pull/7461/head \
  engine \
  container \
  terminal

Output:

go: finding module for package github.com/dagger/dagger/ci/internal/dagger
go: github.com/dagger/dagger/ci imports
        github.com/dagger/dagger/ci/internal/dagger: module github.com/dagger/dagger@latest found (v0.11.4), but does not contain package github.com/dagger/dagger/ci/internal/dagger

😭

still garnet
#

deterministic system clock

tepid nova
fair ermine
#

@still garnet @rancid turret @civic yacht I see that we didn't implement the custom scalar registration in our API, is there a particular reason?
I'm currently working on a POC to get that done for TypeScript, which led me to also update the API to add a WithScalar method

rancid turret
spark cedar
#

yeah, there was just a lot of weird complexity with doing custom scalars, that made it trickier than expected - the value was also not super clear

#

also, it required more python/typescript work than i wanted to require to get the main part of it through

#

i think i'd rather have custom enum support, that should cover most of the cases we'd want scalars for

#

@rancid turret just wanted to maybe continue our discussion about deprecating/removing the direct sdk provisioning we briefly discussed yesterday?

In digging through all the stuff last week, there's just a lot going on with each SDK needing all this stuff, that raises the barrier for new SDKs, and it's also some of the fiddliest tricky process management logic - each language has slightly different ways of doing everything, which isn't super great. It also makes OTEL spans and such need to start at the SDK, instead of at the CLI itself.

If we changed this, we'd be removing support for doing go run my_ci.go (and similar), and require wrapping it all in dagger run go run my_ci.go.

#

It'd be a pretty big breaking change, but we are doing v0.12 relatively soon, so maybe could be bundled there?

fair ermine
#

Thanks for the details!

rancid turret
# spark cedar <@768585883120173076> just wanted to maybe continue our discussion about depreca...

New SDKs don't have to worry about it. You can see this old note in https://github.com/dagger/dagger/blob/main/sdk/CONTRIBUTING.md#automatic-provisioning:

Note As we’re working towards standardizing project entrypoints and shift to always use the dagger CLI as the common way to interact with Dagger, the automatic provisioning will possibly become deprecated eventually.

So it's more of a legacy feature for the SDKs that we maintain, optional otherwise.

rancid turret
wild zephyr
#

👋 just updated to v0.11.5 and seems like for some reason, my first dagger call took ~20s to "connect" after the engine finished starting.

below the output of the engine startup logs. cc @spark cedar @civic yacht

wild zephyr
# wild zephyr

it also seems to be consistent. Just reproed it 5 times

spark cedar
#

do you have a trace, etc? could it be that the image needs to be downloaded?

#

oh i see, connect is taking time

wild zephyr
spark cedar
#

hm, that's odd, i can't reproduce

#

connect seems instant

wild zephyr
#

you're docker rm -fv the engine before each try, correct?

spark cedar
#

yup

wild zephyr
spark cedar
#

hm i can't think that we changed anything here

#

do you have a cloud trace?

wild zephyr
wild zephyr
#

not too much to see there

#

@spark cedar just confirmed it's some sort of race while starting and connecting to the engine the first time. If I start the engine manually and then dagger develop, it connects instantly

spark cedar
#

hm so this logic hasn't changed

#

but it does make sense

wild zephyr
#

let me check v0.11.4

spark cedar
#

the retry has no max timeout - and has exponential backoff

#

newBuildkitClient is much more reasonable

#

(i remember writing it to have a 1 second backoff with logic in buildkit to monitor the connection)

#

@still garnet could we move the retry logic to after newBuildkitClient? Or is this deliberate?

wild zephyr
#

I see.. still surprising given that in order to get to the 20s mark that'd mean that I should have tried to connect several times before that... 🤔

#

I mean.. 10,20,40,80,160......
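
(Working through that arithmetic, assuming the sequence is in milliseconds and simply doubles with no cap, which the chat does not confirm: `attemptsToReach` is a throwaway helper that counts how many delays are slept through before the cumulative wait reaches a target.)

```go
package main

import "fmt"

// attemptsToReach counts how many doubling delays (starting at firstMs) are
// slept through before the cumulative wait reaches targetMs.
// It returns the number of delays and the cumulative wait in milliseconds.
func attemptsToReach(firstMs, targetMs int64) (attempts int, totalMs int64) {
	if firstMs <= 0 {
		return 0, 0 // avoid looping forever on a non-positive delay
	}
	for delay := firstMs; totalMs < targetMs; delay *= 2 {
		totalMs += delay
		attempts++
	}
	return attempts, totalMs
}

func main() {
	n, total := attemptsToReach(10, 20_000)
	fmt.Println(n, total) // prints "11 20470"
}
```

Under those assumptions it takes 11 failed attempts to pass the 20s mark, and the last sleep alone is 10240 ms, so narrowly missing the engine on one attempt costs about as much as all the previous waits combined.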

spark cedar
#

yeah i'm not super sure why this would have just appeared - though, perhaps again, subtle timing changes now cause you to miss a backoff, so this takes longer

#

we know the plain progress changes have some weird timing changes/etc

wild zephyr
wild zephyr
still garnet
wild zephyr
#

I'll build from source and add some print statements to see what's happening 🙏

spark cedar
tepid nova
spark cedar
#

I wonder if we should spend some cycles on perf of modules, because it's tricky to consume if you have to make this tradeoff

tepid nova
#

Probably a good thing that we incur some of that pain, since it probably affects our users also.

And yes agree we should look at perf.

#

a low-hanging fruit is function call caching (we pay that tax for loading also)

still garnet
spark cedar
#

Caching helps a lot already, but there's still a lot of extra cost on each fresh module initialization

#

Parsing expensive, go mod tidy being weirdly expensive (and not pulling from a shared cache between modules), etc

tepid nova
#

Definitely sounds like good pain to incur!

#

I guess what I'm saying is: my PR making things slow is a good thing and is an argument for merging it 😛

spark cedar
#

Agreed 🎉

#

I'll try and do a proper review tomorrow (not just complaining about speed 😂)

still garnet
#

pain-driven development elmofire

fair ermine
#

@spark cedar I approved this PR https://github.com/dagger/dagger/pull/7438 so hopefully we can get this merged asap.
Also looking for a double check by @still garnet or @civic yacht

On top of that, the changes you made to the SDK generation are super useful and I need them in most of my new PRs related to core API development.

GitHub

Fixes #7391
This started as just adding a new signals enum, and allowing it to be attached to stop. However, I realized - not every signal should necessarily cause a process to terminate, so we sho...

fair ermine
spark cedar
#

hm, i'm having a bit of an issue - i can't seem to run dagger-0.11.5 call sdk go generate - it hangs during the codegen session connection

#

for some reason, i don't see this in ci, but definitely do locally

#

dumping a trace for the session indicates we get stuck during client.Wait - we just spin waiting to connect to buildkit

#

oh

#

nested clients will be able to connect direct over http to the engine, no grpc tunneling required

#

could we be hitting this?

#

hm no that's not it, there's no nesting here

civic yacht
spark cedar
#

could still be something shim related

rancid turret
#

Is there a shim still?

spark cedar
#

shim-removal related, sorry

#

it's a client running in dagger, but it connects out to a different dagger server

#

mm okay looks like something might have broken: i modded the go generate script to try and connect to the connected service, and i can't do a dns lookup:

#

(i do manage to get through on v0.11.4)

civic yacht
# spark cedar

Looking, but this only happens on your machine? Not CI? thinkspin

spark cedar
#

yeaaa something there definitely worries me 😢

#

ci is now using v0.11.5 on main

#

are you able to run dagger call sdk go generate? or does it hang for you?

civic yacht
fresh harbor
#

I just found a similar issue with dagger call sdk elixir generate on my local machine. It hangs while exporting schema.json with codegen -o schema.json. Is that related?

spark cedar
#

definitely sounds like it

spark cedar
fresh harbor
spark cedar
#

Cool somehow we had the same issue, somehow seems to be host dependent 😭

fair ermine
#

I think this is my only blocker for now, you can check the rest but it looks pretty natural as an implementation

civic yacht
#

✨ v0.11.6 release - May 29th

tepid nova
tepid nova
#

Anyone else getting an error on this?

dagger call  -m github.com/dagger/dagger --source=https://github.com/dagger/dagger docs lint
civic yacht
tepid nova
civic yacht
#

I have orbstack on macos direct too, can try quick there instead of linux

tepid nova
#

I just don't understand how my local docker install factors in any of this

civic yacht
#

It is quite weird orbstack even shows up at this point though, yeah

tepid nova
#

possibly a registry auth callback?

civic yacht
tepid nova
#

I'm having trouble understanding what actually fails from that trace

#

Looks like it's this: exec /home/nonroot/entrypoint.sh -c .markdownlint.yaml -- ./docs README.md

civic yacht
#

Okay yeah I get the exact same error on macos+orbstack...

tepid nova
#

I am concerned that we have a strange issue, but also relieved that it's not just me 😛

#

Maybe a qemu exec handler thing?

#

wait execveat is a linux thing. How is orbstack reporting an error from a linux syscall, on my mac?

#

Oh maybe it's just the hostname from the vm?

civic yacht
#

If I was forced to guess maybe orbstack does some binfmt magic at some point, which would allow it to hook into every exec made inside the Linux VM, which somehow resulted in an EACCES when the binary was the wrong platform? 🤷‍♂️ At least it appears it will be fixed in the release tomorrow!

tepid nova
#

yeah binfmt magic is the only explanation I see

#

Can't wait to merge everything into one binary, so that I can just grab a build from main and start using it

#

although this is pretty cool:

dagger -m github.com/dagger/dagger --source=https://github.com/dagger/dagger dev terminal
civic yacht
tepid nova
#

I guess I could do:

dagger call -m github.com/dagger/dagger --source=https://github.com/dagger/dagger \
  dev \
  with-exec --experimental-privileged-nesting --args=dagger,call,-m,github.com/dagger/dagger,--source,https://github.com/dagger/dagger,docs,lint \
  stdout
#

Still getting the orbstack error 😭

civic yacht
# tepid nova Still getting the orbstack error 😭

With _EXPERIMENTAL_DAGGER_RUNNER_HOST set? I think that would be necessary still at this point since I'm guessing the bug impacts every exec, including any made by the "outer" dagger call in the above command

tepid nova
tepid nova
#

I am really stuck on the CI errors from https://github.com/dagger/dagger/pull/7461....

Any help would be appreciated.

It appears the repo is missing ci/internal/dagger/. Which makes sense because it is git-ignored. But there is also a manual hack by @silent @spark cedar to import it, in order to wire codegened dag. into util packages which otherwise would not get access to it (because we lack a flexible dagger sdk codegen I think).

Anyway without this ci/internal/dagger/ my repo is bricked, but I don't know how anything worked in the first place since it is git-ignored...

tepid nova
# tepid nova I am really stuck on the CI errors from https://github.com/dagger/dagger/pull/74...

OK I found the issue. It is caused by interference between Dagger modules and Go modules when dealing with submodules. Specifically when initializing a dagger submodule, the Go SDK will look for an existing go.mod in the parent directories. If it exists, it will skip creating a new go.mod for the new submodule. Instead the dagger submodule will be part of the same go module as its parent. I suspect this is usually fine, but it broke Justin's clever hack of importing github.com/dagger/dagger/ci/internal/dagger in our CI code. The hack worked fine while there were no submodules, because the code is only evaluated after codegen, which generates the file to be imported. But now we have submodules (which are dependencies). Before codegen for the parent module, the dependencies must be loaded. That calls go mod tidy in the submodules. go mod tidy applies to the entire go module. This causes an error because the entire go module is broken: it is missing a generated file. Therefore we have a chicken and egg problem. Parent can't load because dependency can't load because parent hasn't yet loaded.

The fix was to manually run go mod init inside each submodule

#

TLDR it is not (always) safe for dagger submodules to share the same go.mod. At the very least we need a way to disable this. Honestly I thought sharing a go.mod between dagger modules was an opt-in for optimization. I am not a fan of this being a default. When things go wrong, it makes things very mysterious and hard to debug. I'm amazed I even resolved it.

fair ermine
fair ermine
#

@still garnet I'm currently working on the issue introduced by 0.11.5 when the package manager is set to yarn v4, is there a way to display logs from the runtime module?
I tried slog.Warn but I couldn't see my logs when calling dagger, even with dagger --progress=plain --debug

still garnet
fair ermine
#

Yeah it seems --debug doesn't reveal encapsulated logs, I'm gonna check on cloud

still garnet
#

i retire (spot the issue)

for exp, spans := range byExporter {
    exp := exp
    eg.Go(func() error {
        slog.Debug("exporting spans to subscriber", "spans", len(spans))
        return exp.ExportSpans(ctx, spans)
    })
}
civic yacht
still garnet
#

doesn't affect main - these are all local changes, but who knows how long it's been lurking while i test all this

civic yacht
still garnet
#

yeah agreed - i think there's at least one benign race atm from last time I looked into it, but seems super worth staying on top of

#

i think we run our test commands with -race but that doesn't help much for the integration tests

#

(at least, i want to believe it's a benign race)
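
(For the "spot the issue" loop above: presumably the problem is that `exp` is re-bound per iteration but `spans` is not, so with pre-1.22 loop semantics every goroutine can end up reading the same `spans` variable. Below is a self-contained sketch of the fixed pattern. It uses `sync.WaitGroup` and strings instead of `errgroup` and real exporters, and `exportAll` is a made-up name. From Go 1.22 on, each iteration gets fresh variables and the re-binding is unnecessary.)

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// exportAll starts one goroutine per exporter and re-binds BOTH loop
// variables before the closure captures them.
func exportAll(byExporter map[string][]string) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)
	for exp, spans := range byExporter {
		exp, spans := exp, spans // the snippet above only re-binds exp
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			out = append(out, fmt.Sprintf("%s:%d", exp, len(spans)))
		}()
	}
	wg.Wait()
	sort.Strings(out) // goroutine order is not deterministic
	return out
}

func main() {
	fmt.Println(exportAll(map[string][]string{
		"cloud": {"span1", "span2"},
		"tui":   {"span3"},
	}))
}
```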

tepid nova
#

In case anyone has a clue

tidal spire
#

do you need to pass the source to Go().Env() every time? it's there for some but not others

tepid nova
#

Then in main.go it's wired as:

// Dagger's Go toolchain
func (ci *Dagger) Go() *GoToolchain {
    // FIXME: ugly glue, use interface?
    return &GoToolchain{Go: dag.Go(ci.Source)}
}
tidal spire
#

Ah, missed that part

tepid nova
#

Note the use of a glue wrapper type GoToolchain, really what I want is:

// Dagger's Go toolchain
// (returns a type defined in another module)
func (ci *Dagger) Go() GoProject {
    return dag.Go(ci.Source)
}
#

I'm sorry if I already reported this issue yesterday (losing track of interconnected issues):

dagger call -m github.com/dagger/dagger --source=https://github.com/dagger/dagger sdk go lint

Does this 👆 work for you all?

tidal spire
tepid nova
tepid nova
#
dagger call -m github.com/shykes/daggerverse/wolfi container directory --path=/usr/local/bin entries
#

Sidetracked by trying to get IDE auto-complete working in my new submodules... Gave up

tepid nova
#

New CI error...

#
dagger call -m github.com/dagger/dagger@pull/7461/head --source=https://github.com/dagger/dagger#pull/7461/head sdk go lint

My output:

[...]
Only in modified/sdk/go: .changie.yaml
Only in modified/sdk/go: .gitattributes
Only in modified/sdk/go: .gitignore
Only in modified/sdk/go: CHANGELOG.md
Only in modified/sdk/go: LICENSE
#

Is that an actual legitimate lint error?

#

what is "modified" in this context?

hasty basin
#

It feels like a more hi-fidelity/first-class support for output for certain tools like linters, test runners would be helpful.
Though not your current issue, it seems.

tepid nova
#
1) I have no idea what this Go lint error means, 2) I have no idea how my PR (which does not touch the Go SDK) triggers it
#

Also my IDE autocomplete is still broken.. Gah

still garnet
#

yea that could definitely be more descriptive - that's the failure for when files need to be re-generated. haven't ruled out wacky things happening, but that's the part I recognize

#

huh yeah your PR only changes ci/ eh

tepid nova
#

I think my PR changes the image in which that codegen linting is performed (can't be sure as there are layers of util functions, and my lsp is broken)

#

Ah, it looks like my PR does change the behavior of dagger call sdk go generate somehow

#

(note, I've never seen so many cache misses on apk add)

#
$ dagger call -m github.com/dagger/dagger \
 --source=https://github.com/dagger/dagger \
 sdk \
 go \
 generate \
 directory --path=sdk/go \
 glob --pattern .changie.yaml
$
$ dagger call -m github.com/dagger/dagger@pull/7461/head \
 --source=https://github.com/dagger/dagger \
 sdk \
 go \
 generate \
 directory --path=sdk/go \
 glob --pattern .changie.yaml
.changie.yaml
$
#

Right now I'm very happy that I can do this:

dagger call -m github.com/dagger/dagger@pull/7461/head --source=https://github.com/dagger/dagger go env terminal
#

I take it back, when I add --experimental-privileged-nesting to that command, it hangs

tepid nova
#

I don't get it, is go generate supposed to remove .changie.yaml?

civic yacht
# civic yacht same, looking quick

This one is a panic happening in the engine for obscure reasons, seems to have existed for a while but there's missing integ test coverage of that flag on terminal. Will fix, but also the CLI should just error, not hang, which is probably separate.

tepid nova
tepid nova
civic yacht
civic yacht
# tepid nova I don't get it, is `go generate` supposed to *remove* `.changie.yaml`?

The Only in modified/sdk/go: ... output is from diff and says that those files exist only in the modified dir, which is the one produced after running codegen. The "modified" comes from here: https://github.com/sipsma/dagger/blob/fdfde1d7cbcf0ac05f7f1c9ffb694f0da6412505/ci/util/diff.go#L18-L18

I think what's happening is:

  1. The call to diff where "original" and "modified" are set here: https://github.com/sipsma/dagger/blob/fdfde1d7cbcf0ac05f7f1c9ffb694f0da6412505/ci/sdk_go.go#L27-L27
  2. The original dir (i.e. before codegen) is using util.GoDirectory which has explicit Includes that don't end up including .changie.yaml and those other files listed: https://github.com/sipsma/dagger/blob/fdfde1d7cbcf0ac05f7f1c9ffb694f0da6412505/ci/util/go.go#L14-L14
  3. Previously, the modified dir from GoSDK.Generate also used that same util.GoDirectory call, which meant that it had the same inclusion/exclusion rules and thus those files weren't in either the original or the modified dir: https://github.com/sipsma/dagger/blob/fdfde1d7cbcf0ac05f7f1c9ffb694f0da6412505/ci/sdk_go.go#L57-L57 (util.GoBase internally calls GoDirectory)
  4. But in your PR those includes/excludes aren't being set in GoSDK.Generate anymore, so .changie.yaml and the others are showing up but only in the modified dir: https://github.com/dagger/dagger/pull/7461/files#diff-3abbae04ff9bb5976eb507218a2f1ae3aa054364678f05278baa94df480b9317R57
  5. Which results in the diff being non-empty
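
The five steps above can be sketched in isolation: filter one side, leave the other unfiltered, and the diff reports the extra files. This is only an illustration, not the code in ci/util. The filter and onlyIn helpers, the file list, and the include patterns are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// filter keeps the files whose base name matches at least one include
// pattern. An empty pattern list means "no filtering".
func filter(files, includes []string) []string {
	if len(includes) == 0 {
		return append([]string(nil), files...)
	}
	var kept []string
	for _, f := range files {
		for _, pat := range includes {
			if ok, _ := path.Match(pat, path.Base(f)); ok {
				kept = append(kept, f)
				break
			}
		}
	}
	return kept
}

// onlyIn returns the files present in b but missing from a, sorted.
func onlyIn(a, b []string) []string {
	inA := make(map[string]bool, len(a))
	for _, f := range a {
		inA[f] = true
	}
	var extra []string
	for _, f := range b {
		if !inA[f] {
			extra = append(extra, f)
		}
	}
	sort.Strings(extra)
	return extra
}

func main() {
	repo := []string{
		"sdk/go/dagger.gen.go",
		"sdk/go/go.mod",
		"sdk/go/.changie.yaml",
		"sdk/go/LICENSE",
	}
	includes := []string{"*.go", "go.mod", "go.sum"} // hypothetical include list

	original := filter(repo, includes) // filtered side, as in step 2
	modified := filter(repo, nil)      // unfiltered side, as in step 4

	for _, f := range onlyIn(original, modified) {
		fmt.Println("Only in modified:", f)
	}
}
```

Applying the same filter to both sides (or to neither) makes onlyIn return nothing, which is the state described in step 3.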
#

I honestly can't tell what the goal of GoDirectory is anymore; I'm fairly sure it was added in the extremely early stages of the previous mage implementation and looks highly vestigial to me. It perhaps made sense when we were loading this all from host dirs, but that's obviously not the case anymore (the underlying dir from the host was already loaded by this time).

Good example of the sort of languishing we can clean up now I suppose.

tepid nova
#

Thanks! I suspected something like that but main doesn't have an equivalent to dagger call go env terminal to inspect the contents of pre-generate directory

#

makes me wish for arbitrary terminal at the failed state, like earthly does (someone asked for that recently)

civic yacht
tepid nova
tepid nova
#

I'm struggling with the overall code structure of our ci module. It's pretty "deep": many of the dagger functions are thin wrappers around hidden go packages (eg. build) which themselves have their own system of util packages (./util). Container.With is used heavily with various middleware. I find that it makes it difficult to reason about the overall structure when refactoring. Also traces are not very readable because it's all one giant module, so you only see a dump of all core function calls, mixed together

#

For example here is a trace view of dagger call check. It has 2000+ children, almost all core calls

fair ermine
#

@still garnet Do you know where I can see the lines displayed by this kind of log:

I'm trying to debug my broken enum implementation, it seems enums are not registered in the schema but I cannot understand why

still garnet
fair ermine
#

How can I add --extra-debug to the engine kickstart from ./hack/dev ?

#

Ok I think I found a way

still garnet
fair ermine
#

Yeah I would love that, because the traces tell me that I successfully registered the enums but they are not installed in the schema for some reason.
And I cannot dig more without these extra-debug lines

tepid nova
#

Ah! It appears my newest CI failure is actually a timeout! In the Rust SDK tests. Known issue? What should I do?

civic yacht
tepid nova
#

btw I'm trying to figure out which check failed, so I can run the corresponding dagger function locally. That's tricky to do at the moment

#

Ah actually I get the full output in a non-tty output:

$ gh pr checks  |grep fail
dagger call --source=.:default --host-docker-config=file:/home/runner/.docker/config.json sdk java test    fail    0    https://dagger.cloud/dagger/traces/3703249e22571224ab8c846d1179f3cb    Timeout after 5 minutes
dagger call --source=.:default --host-docker-config=file:/home/runner/.docker/config.json sdk rust test    fail    0    https://dagger.cloud/dagger/traces/2610a5aff67bcad3ad68f6dfd983abc1    Timeout after 5 minutes
dagger call --source=.:default --host-docker-config=file:/home/runner/.docker/config.json test important --race=true --parallel=16    fail    0    https://dagger.cloud/dagger/traces/a8ef7bcf8d9dc066ace6a3d0e74908e2    exit code 1
test / dagger-runner-v2    fail    10m9s    https://github.com/dagger/dagger/actions/runs/9320319363/job/25656859608    
test / dagger-runner-v2    fail    10m11s    https://github.com/dagger/dagger/actions/runs/9320319368/job/25656858730    
testdev / dagger-runner-v2-dind    fail    9m27s    https://github.com/dagger/dagger/actions/runs/9320319384/job/25656862722    
fair ermine
still garnet
#

actually scratch that, dagql already knew about enums, this is a very Dagger-specific part

#

i generally just search and pattern-match based on a prior thing (like scalars)

fair ermine
#

Yeah this is what I've done too but I forgot one place haha

#

Still got an issue with IDable enum but it's still progress

still garnet
#

love it when a newly added test fails 48/100 times, and then passes 100 times in a row when i add --debug somewhere important 🫠

tepid nova
tepid nova
fair ermine
#

@still garnet I'm confused about something, I get the error:

Error: get module name: returned error 400 Bad Request: failed to get schema introspection JSON: introspection query failed: input: __schema.types[15].fields[2].args[0].type.ofType panic while resolving __Type.ofType: unknown type: StatusID

For the code

/**
 * Enum for Status
 */
@daggerEnum()
class Status {
  /**
   * Active status
   */
  static readonly ACTIVE: string = "ACTIVE"

  /**
   * Inactive status
   */
  static readonly INACTIVE: string = "INACTIVE"
}

@object()
export class Enums {
  @field()
  status: Status = Status.ACTIVE

  @func()
  setStatus(status: Status): Enums {
    this.status = status

    return this
  }

  @func()
  getStatus(): Status {
    return this.status
  }
}

I think it's because I didn't implement IDable for enums but I don't understand how to do it, could you give me a hint?

still garnet
#

enums shouldn't need to be IDable, that's for objects

fair ermine
#

So no DynamicID?

still garnet
#

StatusID also doesn't seem like it should be a thing :thinkspin:

#

oh wait

#

your enum is a class? huh interesting

#

is that how enums are usually represented in TypeScript? i thought there was something more primitive

fair ermine
#

You can do it in many ways, but enums like that cannot be decorated

#

So you cannot specify to Dagger to expose it or not

still garnet
#

could it just be exposed when it's used? (referenced by an exposed thing)

#

anyway yeah, enums should behave more like scalars, not like objects

#

no ID needed, they're just passed around as string values
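
To make the "scalar, not object" point concrete, here is a minimal Go sketch of an enum that is nothing more than a validated string. It mirrors the Status example above. The type, constants, and ParseStatus helper are illustrative assumptions and are not taken from the engine or any SDK.

```go
package main

import "fmt"

// Status is a closed set of string values; there is no ID and no object
// wrapper, the value itself is what travels over the wire.
type Status string

const (
	StatusActive   Status = "ACTIVE"
	StatusInactive Status = "INACTIVE"
)

// ParseStatus validates a raw string against the known members.
func ParseStatus(raw string) (Status, error) {
	switch s := Status(raw); s {
	case StatusActive, StatusInactive:
		return s, nil
	default:
		return "", fmt.Errorf("invalid Status value %q", raw)
	}
}

func main() {
	s, err := ParseStatus("ACTIVE")
	fmt.Println(s, err)

	_, err = ParseStatus("PAUSED")
	fmt.Println(err)
}
```

The caller only ever sends and receives the plain string, which matches the input described in the next message.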

fair ermine
#

Okay, makes sense, it's actually the input I send when I try the invocation in TypeScript

#

I just send ACTIVE for example

fair ermine
#

Ok quick update on the enum support:

  • Registration works! I can call dagger functions (with a small update in the CLI)
  • dagger call doesn't work for now, I'm trying to figure out why
#

Even if the call isn't working yet, I would love to get a review from you @still garnet or @civic yacht, both to give me hints and to correct possible implementation errors that I made
The PR: https://github.com/dagger/dagger/pull/7498

You can ignore the TS changes for now. I also did a couple of optimizations in the internals; I'm mostly interested in a review of the core and dagql changes

GitHub

This PR implements custom enum support from a user module and the TypeScript support for it.
Current state

Enum TypeDef GraphQL schema extension
Enum registration using the GQL API
Internal Enu...

#

(The PR is pretty big, sorry for that)

tepid nova
#

I'm trying to understand the dependency graph in our CI, between SDKs and engine.

  • There are dagger functions for developing our SDKs (test, lint, etc). They have a dependency on engine, for integration tests
  • To build the engine I need a private package called build.
  • build seems to build both the engine and the SDKs
#

--> context: I'm trying to spin out the toolchain for developing each SDK into its own dagger module

civic yacht
tepid nova
#

Makes sense. But SDKs also have a dependency on engine for integration tests... Isn't that a circular dependency in the making?

civic yacht
#

if that makes sense

tepid nova
#

but dagger call sdk go test builds & runs an engine, then binds it as a service into the SDK's go environment, and runs the SDK tests

#

so isn't that sdks -> engine ?

#

dagger call sdk go generate also requires a live engine, not sure what for (introspect graphql schema perhaps?)

civic yacht
#

Yeah I think it should work out since the engine needed for those tests + generate can just be provided as a *Service as an arg to the parts of the SDK modules that need it. So they shouldn't need an actual dep on an engine module

tepid nova
#

What I'm trying to do, is make sdk/go a dagger module, with sdk/go/.dagger/ as its source directory. Then move dagger functions to test, lint, build etc the Go SDK in that module

#

Then I was hoping this would become possible:

$ cd sdk/go
$ dagger call --source=. test
$ dagger call --source=. lint
$ dagger call --source=. publish

etc.

#

If I have to pass an engine service as a required argument, it's less convenient - and beyond convenience, I'm not even sure how I would instantiate it

#

With contextual modules, the hope is that pattern would then become even better, with --source becoming optional:

$ export DAGGER_MODULE=github.com/dagger/dagger/sdk/go
$ dagger call test
$ dagger call lint
$ dagger call publish
#

Maybe because of the special embedded nature of the Go SDK, the engine build just has builtin knowledge of how to build the Go SDK, and the sdk/go module doesn't know how to build its own SDK - it imports that logic from the engine?

#

If that maps to reality, it makes sense that it would be reflected in the dependency graph of our modules

#
$ export DAGGER_MODULE=github.com/dagger/dagger
$ dagger call build  # builds the engine
$ dagger call test # tests the engine
$ dagger call build-go-sdk # builds the builtin go sdk (special case)
#

(I'm assuming core/ is where the engine lives today in the repo layout) looks like that's wrong, I'll just assume engine toolchain is at the root of the repo

#

As a side note, I keep going back and forth on whether the daggerized toolchain should be woven into the upstream repository structure (like I'm trying to do now), or have its own separate structure, free to abstract away the messy realities of the upstream repo layout

civic yacht
#

I see, yeah that makes sense but is tough... The thing the engine needs to actually do in order to package the SDKs in itself is access the source code of those SDKs (except go, which is the one special case, the engine actually just builds a binary from cmd/codegen/ for it).

If the engine can access all that source code then there's no explicit dependency needed on SDKs. But if the SDK source code is inaccessible due to the fact that they are separate modules, then the only way would be a circular dep between the modules.

We don't allow circular deps at the moment, but I do sorta wonder if there's a way to allow it. Theoretically, the fact that we completely hide transitive deps should open a path to making circular deps allowed, but it's tricky of course.
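
For illustration, rejecting circular deps amounts to cycle detection over the module dependency graph. The sketch below is a generic depth-first search, not the engine's actual check. The findCycle function and the module names in main are assumptions made up for the example.

```go
package main

import "fmt"

// findCycle walks module -> dependencies depth-first and returns the first
// dependency cycle it finds, or nil if the graph is acyclic.
func findCycle(deps map[string][]string) []string {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var stack []string

	var visit func(mod string) []string
	visit = func(mod string) []string {
		state[mod] = inProgress
		stack = append(stack, mod)
		for _, dep := range deps[mod] {
			switch state[dep] {
			case inProgress:
				// dep is already on the stack: slice out the loop.
				for i, m := range stack {
					if m == dep {
						cycle := append([]string(nil), stack[i:]...)
						return append(cycle, dep)
					}
				}
			case unvisited:
				if cycle := visit(dep); cycle != nil {
					return cycle
				}
			}
		}
		stack = stack[:len(stack)-1]
		state[mod] = done
		return nil
	}

	for mod := range deps {
		if state[mod] == unvisited {
			if cycle := visit(mod); cycle != nil {
				return cycle
			}
		}
	}
	return nil
}

func main() {
	// Hypothetical layout: the engine bundles an SDK, and that SDK's tests
	// need an engine.
	deps := map[string][]string{
		"engine":     {"sdk/python"},
		"sdk/python": {"engine"},
	}
	fmt.Println(findCycle(deps))
}
```

Passing the engine in as a service argument, as suggested earlier in the thread, removes the "sdk/python" to "engine" edge from this graph.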

tepid nova
#

Two very different patterns

tepid nova
civic yacht
tepid nova
#

But why is the source code of the Python SDK needed to build and release the Dagger engine?

civic yacht
tepid nova
#

Ah, I see. The python source code is actually bundled in the OCI image of the engine!

#

That's the part I missed

civic yacht
tepid nova
#

So technically it's not correct to say that the Python SDK tests depend on the engine. Since the engine can't be built without the Python SDK

#

Or if you do, it means there are 2 Python SDKs in the picture: the one being tested, and the one bundled into the engine that is a test dependency

#

Hopefully the python SDK integration tests don't trigger codepaths in the engine that lead to running the python SDK itself 🙂

civic yacht
tepid nova
#

Looks like a lightweight embedded module cache 🙂

civic yacht
tepid nova
#

This is super helpful, I might model this as something like sdkCache or equivalent in the engine build, to make it more clear what's going on. I think this can still be spun out in separate modules. And it will force us to surface the actual dependencies in our pipeline

#

For the Go SDK, you mentioned the engine build does something special - it builds a binary? What is that binary called and where does it go?

#

Separate question: when you think about "the engine" as a software component, where does it live in the repo? Is it one directory? Several? (thinking about where the dagger module for the engine's dev toolchain would live)

civic yacht
# tepid nova For the Go SDK, you mentioned the engine build does something special - it build...

It's this codegen binary: https://github.com/sipsma/dagger/blob/1878470ffe35f32c995759208895f2472c39de8a/ci/build/sdk.go#L132-L132

It is also specifically putting it in the golang:alpine image and saving that image with the codegen bin installed in the content store.

Go is special like this because it's the one SDK that's not technically a module, which solves the bootstrapping problem around SDKs-are-actually-just-modules-too.

tepid nova
#

I see, so the codegen binary is functionally equivalent to an entire SDK module. For each dagger function that an SDK module must implement, codegen implements an equivalent subcommand?

civic yacht
civic yacht
tepid nova
#

Thanks for the tour 🙂

tepid nova
#

I am torn between 1) the nice Dockerfile-like idea of "attaching" modules to different directories in the repo.

#

and 2) a clean dagger module structure that can hide the mess of eg. engine core and dagql being 3 directories for 1 logical component

civic yacht
# tepid nova This is hurting my head. I'm trying to fit it into the contextual modules idea. ...

I feel like generally speaking both need to be viable options for users, but it does feel nice to just have your module code live exactly along side your actual code. Like you don't have to go searching all over the place (as I do whenever today I need to update a GHA yaml).

and 2) a clean dagger module structure that can hide the mess of eg. engine core and dagql being 3 directories for 1 logical component
I guess the rule of thumb might be to put a module in the first directory that encapsulates all its context. So for the engine, that would indeed be the top of the repo. But our SDKs are mostly independent as far as I can think, so for us this may work out

#

It does feel important we don't get overly opinionated here though since people's needs and desires around directory layout get into religious-level convictions

tepid nova
#

I feel like pattern 2 (module structure is a "view" of the underlying repo's logical components). But don't know how to transpose the promising design for context directory

tepid nova
#

But you're right of course. I would settle for finding the perfect pattern for us.

#

That would be a great start 🙂

civic yacht
tepid nova
#

@civic yacht say we discovered that it's not enough for context directories to be auto-filtered. That developers need a way to specify filters for their context directory. With magical arguments we didn't discuss a way to do that. Do you think it could be a pragma? Like:

func Build(
  source *Directory, // +optional +include=*.go,go.mod,go.sum
)

Could that work?

civic yacht
tepid nova
#

What if we relied on git remotes to magically pass the right context directory 🙂

func BuildEngine(
  source *Directory, // +optional +default=https://github.com/dagger/dagger +include=engine/,core/,dagql/
)
{}

func BuildGoSDK(
  source *Directory, // +optional +default=https://github.com/dagger/dagger/go/sdk
)
{}

The magic is that when setting the default value, if the workdir is in a git repository, and the remote matches, then the local checkout would be used (at the right subdirectory) rather than fetching a remote copy of the repo.

That way you could navigate your repo at will, and the right directory would be passed to the right function regardless of your workdir, as long as you're inside the repo
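
A rough sketch of the matching step this idea implies: compare the default's URL with the local checkout's remote, and if they refer to the same repo, return the subdirectory to use locally. Everything here is hypothetical, including the normalize and resolveSource names and the handling of only two GitHub-style remote spellings. It is not an existing Dagger feature.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize reduces common remote spellings to "host/owner/repo".
// Only https:// and scp-like git@host:owner/repo forms are handled.
func normalize(remote string) string {
	r := strings.TrimSuffix(remote, ".git")
	r = strings.TrimPrefix(r, "https://")
	r = strings.TrimPrefix(r, "git@")
	r = strings.Replace(r, ":", "/", 1) // git@github.com:owner/repo
	return strings.TrimSuffix(r, "/")
}

// resolveSource reports whether defaultRef points into the repository
// identified by localRemote, and if so which subdirectory it names.
func resolveSource(defaultRef, localRemote string) (string, bool) {
	def := normalize(defaultRef)
	repo := normalize(localRemote)
	if repo == "" {
		return "", false
	}
	if def == repo {
		return ".", true
	}
	if strings.HasPrefix(def, repo+"/") {
		return strings.TrimPrefix(def, repo+"/"), true
	}
	return "", false
}

func main() {
	// Default value copied from the BuildGoSDK example above.
	sub, ok := resolveSource(
		"https://github.com/dagger/dagger/go/sdk",
		"git@github.com:dagger/dagger.git",
	)
	fmt.Println(sub, ok)
}
```

When the second return value is false, the fallback would be to fetch the remote copy, as described above for the standalone case.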

#

Nice bonus is that it works equally well whether the module is embedded or standalone

#

If it's standalone, it will just always fetch the remote repo. Perhaps overrideable with --context

#

Then you're free to structure your module, and your upstream repo, any way you want. They can match, or not

civic yacht
tepid nova
queen ibex
#

i'm gonna ask it here because it's more focused than in #general .
In regards to caching, you may cache your credential files from docker login etc...

#

Mark made a video about how to generate said files externally and mounting them instead

#

how hard would it be instead to add a way to block some section of the FS from being cached ?

#

Reverse example from gitlab-ci:

#

something along the lines of func (c *Container) WithoutCache(path string)
seems also in line with the following "WithoutSecretVariable":

tepid nova
#

@still garnet is Service.start the same as Service.sync?

#

I know a bunch of people are on vacation. Who can I ask questions about our CI?

still garnet
#

i don't think it does (just looking at code)

fair ermine
#

@still garnet Do you have a query somewhere that can dump the whole GQL schema of the API? I would like to verify something in it but I only find partial queries in our integration tests
I want to dump: functions, types, enums etc...

tepid nova
still garnet
#

but, i guess maybe you could see it that way :thinkies: if the intent expressed with Service is "i want a service running asynchronously" and Sync is "make that happen"?

#

Could be one angle to simplify it?

tepid nova
#

I'm trying to remember your past explanation of it. What stuck with me was "you don't actually need Service.start except in this very specific case where you need to access a particular field (was it endpoint?) in advance so you need to explicitly start it first"

still garnet
#

It was more to have total control over when it starts and stops, so you don't have to worry about the grace period. For example to keep a long running docker daemon that spans many uses in a test suite

tepid nova
#

Related, I am seeing users confused about our relationship to Docker when it comes to entrypoint, CMD, basically what arguments are executed when.

There was a suggestion of documenting a "matrix" to explain it. I kind of feel like we should make it all simple enough that a matrix is not needed.

--> related to Container.asService being configured by the previous withExec. I was wondering if you'd be OK with a Container { start(args: [String!]): Service! } as a clearer alternative (you pass the args when creating the service)

still garnet
#

Probably considered that at one point in the bikeshedding but would need to dig it up to remember why we didn't go for it

still garnet
#

Maybe duplication with params to withExec. Which some of those should also be changed to withFoo imo

tepid nova
#

Yeah, we have a good reference point with the recent implementation of terminal() and corresponding config

#

initially it was based on last exec too, but was deemed too confusing. We avoided using entrypoint, and went for explicit argument + dedicated default (withDefaultTerminalCmd)

#

There's a "policy" emerging which is to treat the entrypoint field in an OCI image as dumb data. Dagger can always read it and write it, for the purpose of interoperating with systems that rely on it. But Dagger itself shouldn't rely on it, because it has a strictly better system: code.

tepid nova
#

@civic yacht @still garnet @wet mason how do you feel about changing the structure of modules so that the generated client lib is always in its own subdirectory, instead of mixed in with the module implementation like today?

tepid nova
#

I'm seeing functions like Foo() error getting generated bindings like Foo() (dagger.Void, error) is that normal?

civic yacht
fair ermine
#

@still garnet I'm happy to tell you that I got the input/output to work. I think there are still mistakes in my implementation but it works, I'm gonna update the PR with an example!

fair ermine
#

@civic yacht If you got 10 minutes, I also would love to get your opinions on this, I'm going to write some tests too, let me know if there's anything wrong with my implem 😄

civic yacht
tepid nova
#

Oh just saw your comment

tepid nova
#

go workspaces: WHY?

#
go: finding module for package github.com/dagger/dagger/ci/internal/dagger
go: github.com/dagger/dagger/ci imports
    github.com/dagger/dagger/ci/internal/dagger: module github.com/dagger/dagger@latest found (v0.11.6), but does not contain package github.com/dagger/dagger/ci/internal/dagger
#

@jed FYI I am utterly stuck again

still garnet
#

(looking into it)

tepid nova
#

Could anyone help me understand this linting error in my PR?

dagger call -m github.com/dagger/dagger@pull/7461/head --source=https://github.com/dagger/dagger engine lint
#

I can't tell if it's a genuine linting error, or a build error in my module that somehow only manifests itself when calling engine().lint() :thinkspin:

still garnet
#

will put up a small PR for this fix

#

(will wait for CI to be sure)

still garnet
wet mason
#

@still garnet [async] Hey, so I'm trying to get a session attachable terminal (e.g. the engine can pop open a terminal back in the client via the session rather than using the websocket+Terminal workaround), can you point me in the right direction to take "over" the TUI? I see we're using Frontend.Background() in some other place to do something similar, should I plumb that through?

https://github.com/dagger/dagger/compare/main...aluzzardi:dagger:session-terminal?expand=1

/cc @civic yacht

civic yacht
spark cedar
#

any hot takes on how/where to display cloud urls with the tui? 🧵

tepid nova
#

What's the difference between:

  • ci/engine.go
  • ci/mage/engine.go
  • ci/build/builder.go

Is there a clean separation of concerns?

spark cedar
#

the other two, nothing conceptual that stops these from being merged imo

spark cedar
#

the build package is designed to provide an abstraction between the higher level "build orchestration" and the lower level "commands that we have to run"

tepid nova
#

Gerhard is showing me his WIP contribution of a "publish wolfi image" function, to publish the new wolfi variant of the engine image. He was working on a change to ci/mage/engine.go. Which file should he change instead?

spark cedar
#

I might be getting confused with some ongoing work

#

Is there a publish function in mage/engine.go?

#

If so it can go in there

tepid nova
#

OK so at the moment publishing is still done using mage, but an alternative implementation exists as a dagger function - just not yet used in CI?

#

@spark cedar which should we merge first? 7349, or a new PR to add Wolfi image to publish() ?

#

ie. should Gerhard rebase his publish change to 7349?

spark cedar
#

Whichever one is ready first - it doesn't matter imo, the rebase/conversion shouldn't be that tricky

tepid nova
#

How ready is 7349 7483?

spark cedar
tepid nova
#

🤯

#

utils.DaggerCall

#

This is great. So either way, Gerhard's PR is the same, plus or minus 5 lines of convenience glue in mage

civic yacht
still garnet
#

I'll do a quick self-review and call out anything "interesting"

still garnet
#

@civic yacht ok self-review done!

wild zephyr
#

👋 am I crazy or am I seeing that dagger init --sdk x currently uploads all the CWD context to the engine? cc @civic yacht. I did this on a fairly large project and dagger init is taking forever...

tepid nova
#

Slight frustration: dagger.json has 2 different formats for expressing file filters:

  • .include and .exclude
  • views.[].patterns

I find myself migrating filters from one to the other, and it makes it harder

civic yacht
# wild zephyr 👋 am I crazy and I'm seeing that `dagger init --sdk x` currently uploads all t...

It uploads only what it needs, which depends on the SDK value. The Go SDK is not especially clever yet and just asks for every .go file + vendor/, due to the fact that theoretically any of that may be needed to compile the user's code. Ideally it should do more introspection to figure out what it actually needs.

However, that should be true of all the module commands, not just init. It wouldn't be surprising if init is slowest since it's gonna be the first one and subsequent commands can reuse the local dir sync cache.

tepid nova
#

I'm assuming I can concatenate include and exclude as-is into patterns, with ! as a prefix on each exclude?
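
If that assumption holds (it is not confirmed here), the migration is a mechanical merge of the two lists. A minimal sketch, with toPatterns as a hypothetical helper and example globs made up for illustration:

```go
package main

import "fmt"

// toPatterns merges include and exclude lists into a single pattern list,
// writing each exclude with a leading "!".
func toPatterns(include, exclude []string) []string {
	patterns := make([]string, 0, len(include)+len(exclude))
	patterns = append(patterns, include...)
	for _, ex := range exclude {
		patterns = append(patterns, "!"+ex)
	}
	return patterns
}

func main() {
	include := []string{"**/*.go", "go.mod"}
	exclude := []string{"**/vendor/"}
	fmt.Println(toPatterns(include, exclude))
}
```

Includes are kept first and excludes last, on the assumption that later patterns take precedence, as in gitignore-style lists.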

civic yacht
wild zephyr
# civic yacht It uploads only what it needs, which depends on the SDK value. The Go SDK is not...

The Go SDK is not especially clever yet and just asks for every .go file + vendor/, due to the fact that theoretically any of that may be needed to compile the users code. Ideally it should do more introspection to figure out what it actually needs.

Erik, seems to also be uploading **/*? upload /home/marcos/Projects/tracetest from xps (client id: q4pkow1ooz7mqj2h4rm0go3pe) (exclude: **/.git) (include: **/go.mod, **/go.sum, **/go.work, **/go.work.sum, **/vendor/, **/*.go, dagger.json, ./**/*)

wild zephyr
civic yacht
civic yacht
wild zephyr
wild zephyr
#

I think the **/* might be the culprit then, which doesn't seem to be prefixing the source

#

which ends up uploading everything to the engine context

civic yacht
wild zephyr
#

@civic yacht found something.. might be a 🐛

#

setting --source dagger doesn't actually upload **/*

#

which should have the same effect as not setting the flag.. but doesn't seem to be the case

civic yacht
#

Ah! Interesting, that does smell like a bug...

#

Can you open an issue about this? Trying to not get super distracted from tasks so I can finish them but don't want to lose track of this

wild zephyr
tepid nova
#

Does anyone have experience in the Python SDK, to help me navigate unreadable error messages?

#

(unreadable to the untrained eye)

wild zephyr
wild zephyr
tepid nova
#

sorry had to go on a call

#

but appreciate the offer

#

I'll push my broken PR and ask for help there

spark cedar
#

🧵Release thread for next dagger release!

fresh harbor
tepid nova
#

From sdk python lint:

return util.DiffDirectoryF(ctx, t.Dagger.Source, t.Generate, pythonGeneratedAPIPath)

Does this compare the raw source to the re-generated source, to make sure all generated code was committed?

tepid nova
#

I feel like I aced a trick question on the exam

civic yacht
#

Uh... I got this weird typescript error in most of the TS integ tests in my PR out of nowhere (completely independent of the other TS issues I was just fixing):

    multi.go:85: 12  :     [2.3s] | node:internal/process/promises:289
    multi.go:85: 12  :     [2.3s] |             triggerUncaughtException(err, true /* fromPromise */);
    multi.go:85: 12  :     [2.3s] |             ^
    multi.go:85: 12  :     [2.3s] | Error [TransformError]: Transform failed with 5 errors:
    multi.go:85: 12  :     [2.3s] | /src/dagger/src/index.ts:3:0: ERROR: Transforming JavaScript decorators to the configured target environment ("node21.3.0") is not supported yet
    multi.go:85: 12  :     [2.3s] | /src/dagger/src/index.ts:5:4: ERROR: Transforming JavaScript decorators to the configured target environment ("node21.3.0") is not supported yet
    multi.go:85: 12  :     [2.3s] | /src/dagger/src/index.ts:8:4: ERROR: Transforming JavaScript decorators to the configured target environment ("node21.3.0") is not supported yet
    multi.go:85: 12  :     [2.3s] | /src/dagger/src/index.ts:11:4: ERROR: Transforming JavaScript decorators to the configured target environment ("node21.3.0") is not supported yet
    multi.go:85: 12  :     [2.3s] | /src/dagger/src/index.ts:24:4: ERROR: Transforming JavaScript decorators to the configured target environment ("node21.3.0") is not supported yet

Then repro'd locally, then tried on main on a whim and it seems to be consistently happening there too, completely out of nowhere 😭😭😭

I'm guessing that we have some unpinned dep somewhere that got updated out-of-band and broke everything? Or I'm crazy. Or both.

cc @fair ermine when you are online

civic yacht
# civic yacht Uh... I got this weird typescript error in most of the TS integ tests in my PR o...

Found it, tsx wasn't pinned and was published an hour ago: https://github.com/privatenumber/tsx/releases

This fixes it, though it will take either an engine release or tsx pushing a fix in order for our users to pick it up automatically: https://github.com/dagger/dagger/pull/7573

I suppose it would also be possible to specify the TS SDK using a git ref rather than using the engine-builtin one as another short term workaround for users

fair ermine
#

Ohhhh my bad on this one! Thanks for the fix

#

I’ll verify and merge your PR soon

civic yacht
#

Just so users can have a short term workaround

#

Thank you though btw!

fair ermine
#

I'm not sure it would, we're not pulling the SDK source from Git but from the engine bin artifacts (like Go)

#

I'm going to verify though

#
dagger init --sdk=github.com/dagger/dagger/sdk/typescript/runtime@main --name test
✘ ModuleSource.asModule: Module! 0.2s
! failed to create module: select: failed to update codegen and runtime: failed to generate code: failed to call sdk module codegen: select: call function "Codegen": process "/runtime" did not complete successfully: exit code: 2

Error: failed to generate code: input: moduleSource.withContextDirectory.withName.withSDK.withSourceSubpath.asModule resolve: failed to create module: select: failed to update codegen and runtime: failed to generate code: failed to call sdk module codegen: select: call function "Codegen": process "/runtime" did not complete successfully: exit code: 2

Hmm it's not doing great

fresh harbor
fair ermine
#

Yeah this is what I suspected

I'm not sure it would, we're not pulling the SDK source from Git but from the engine bin artifacts (like Go)

civic yacht
#

Ah right I forgot TS was not self contained either, thought it was just Go

fair ermine
#

Nope, TS, Go & Python share this property if I remember correctly

civic yacht
#

so i'll double check that it resolves the problem. But if not we probably need to do an engine release or else every TS module user is completely broken atm

fair ermine
#

I think it's better to stick with the version we set for now, and update it periodically

#

Oh yeah okay, let me know

#

I got my head stuck on the enums, sorry I didn't catch that earlier

#

I had something that works but it's pretty hacky, I've been fixing the refactor since this morning

civic yacht
fair ermine
#

Making a Codegen Typescript program would be pretty funny tbh and pretty hard too but possible I think

civic yacht
#

😢 It does not appear that tsx v4.13.2 fixes it, I still hit the error unless I go back to v4.13.0...
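
For context on why pinning matters here: an unpinned global install picks up whatever was published last. A minimal sketch, assuming a hypothetical helper `pinnedSpec` (not something in the SDK today), that only ever produces an install argument with an exact version:

```go
package main

import (
	"fmt"
	"regexp"
)

// exactVersion matches a plain version like 4.13.0 (no ranges, no dist-tags).
var exactVersion = regexp.MustCompile(`^\d+\.\d+\.\d+$`)

// pinnedSpec is a hypothetical helper: it returns "name@version" for npm and
// refuses anything that is not an exact version (e.g. "latest" or "^4").
func pinnedSpec(name, version string) (string, error) {
	if !exactVersion.MatchString(version) {
		return "", fmt.Errorf("%s: %q is not an exact version", name, version)
	}
	return name + "@" + version, nil
}

func main() {
	spec, err := pinnedSpec("tsx", "4.13.0")
	if err != nil {
		panic(err)
	}
	fmt.Println("npm install -g", spec)
}
```

The version used in `main` is the last one reported as working above; any other value is the caller's choice.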

fair ermine
#

Transforming JavaScript decorators to the configured target environment ("node21.3.0") is not supported yet
This error is pretty weird tho

#

It might be linked to some configs inside the TS SDK that trigger the error

civic yacht
#

@fair ermine can you double check too for a sanity check? It's also worth opening an issue on the tsx repo, but the maintainer appears to be extremely aggressive about immediately closing issues if there isn't a minimal repro included. I have a baby's grasp of all the es vs cjs etc. issues at play here, so if you can see if there's any way to repro it in a minimal dir of just package.json/tsconfig.json/etc. that would be extremely helpful

fair ermine
#

I'll try

#

(btw you pinged the wrong tom)

civic yacht
#

Also, if there's any way for users to work around this (is there some way for them to specify a tsx version in some package.json/package-lock.json/etc. that would force the module to use that version during the npm install -g tsx) that would be good, but based on the code it looks like this runs pretty early

fair ermine
#

And maybe you can help me with something: when does the typedef installation happen in the Dagger Engine?

Because it seems the inputs may be transformed before the module loads its types, or I might be missing a registration somewhere
Here's an example of the output (I logged at error level because it's easier to track):

2024-06-07 19:02:18 time="2024-06-07T17:02:18Z" level=error msg=TypeDef.ToInput.TypeDefKindEnum enum=Status values=0
2024-06-07 19:02:18 time="2024-06-07T17:02:18Z" level=error msg=TypeDef.ToTyped kind=ENUM_KIND
2024-06-07 19:02:18 time="2024-06-07T17:02:18Z" level=error msg=TypeDef.ToTyped.TypeDefKindEnum enum=Status values=0
2024-06-07 19:02:18 time="2024-06-07T17:02:18Z" level=error msg="installing enum" enum=Status name=enums values=2
2024-06-07 19:02:18 time="2024-06-07T17:02:18Z" level=error msg=DynamicEnumValues.Install e=EnumTypeDef values="[Active Inactive]"

The Install is called there: https://github.com/dagger/dagger/pull/7498/files#diff-5eb7740241ddbf3efbef98d43f59b2081a4b6f7bfd00a3daba69d36f107d53bbR226-R238

But the ToTyped happens there: https://github.com/dagger/dagger/pull/7498/files#diff-99b6c62a22f12302b5f3d3032f9866a2adb2940d30022cf614b846c8a9ae3459R327-R352

I would have expected the installation to happen before, but it doesn't.. or it seems like it's done in parallel
As far as I understand, at the time the functions ToInput & ToTyped are called, the enumeration has no values in its typedef... I'm pretty confused
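
One way the log lines above could arise, sketched as a plain-Go model: if something copies the typedef before the install runs, that copy never sees the values added later. The `registry` and `enumDef` types here are hypothetical stand-ins, not the engine's real types, and this is only a guess at the ordering, not a diagnosis.

```go
package main

import "fmt"

// enumDef is a stand-in for a module enum typedef; the real engine types differ.
type enumDef struct {
	Name   string
	Values []string
}

// registry models the schema: Install fills in values, Snapshot reads them.
type registry struct {
	enums map[string]*enumDef
}

func newRegistry() *registry { return &registry{enums: map[string]*enumDef{}} }

// Install registers the enum together with its values.
func (r *registry) Install(name string, values ...string) {
	r.enums[name] = &enumDef{Name: name, Values: values}
}

// Snapshot copies the typedef as it exists right now. A caller holding a
// snapshot taken before Install never sees the values added afterwards.
func (r *registry) Snapshot(name string) enumDef {
	if def, ok := r.enums[name]; ok {
		return enumDef{Name: def.Name, Values: append([]string(nil), def.Values...)}
	}
	return enumDef{Name: name}
}

func main() {
	r := newRegistry()
	early := r.Snapshot("Status") // taken before Install, like the values=0 lines
	r.Install("Status", "Active", "Inactive")
	late := r.Snapshot("Status") // taken after Install, like the values=2 line
	fmt.Println(len(early.Values), len(late.Values))
}
```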

fair ermine
fair ermine
civic yacht
fair ermine
#

So normally all my typedefs should be filled with values...

#

That's so strange

fair ermine
wild zephyr
#

👋 checking if this behavior is expected. I have the following module

package main

type Lala struct {
    Ctr *Container // +private
}

const defaultVersion = "3.19"

func New() *Lala {
    return &Lala{
        Ctr: dag.Container().From("alpine:" + defaultVersion),
    }
}

func (m *Lala) Base(version string) *Lala {
    m.Ctr = dag.Container().From("alpine:" + version)
    return m
}

func (m *Lala) Container() *Container {
    return m.Ctr
}

1/2....

wild zephyr
# wild zephyr 👋 checking if this behavior is expected. I have the following module ```go pa...

If I call dagger call base --version 3.20 container then in the logs I see that the engine seems to be checking both the 3.19 and 3.20 alpine images even though it seems to be pulling only the 3.20 one. Looks like even though the 3.19 one doesn't get pulled, it's being queried somehow when the module gets initialized.

This really confused me a bit since I wasn't expecting anything related to 3.19 in the logs. cc @civic yacht does this seem correct?

  ✔ container: Container! 0.0s
  ✔ Container.from(address: "alpine:3.19"): Container! 2.2s
    ✘ remotes.docker.resolver.HTTPRequest 0.5s
      ✘ HTTP HEAD 0.5s
    ✔ HTTP GET 0.6s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP HEAD 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP GET 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP GET 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.1s
      ✔ HTTP GET 0.1s
    ✔ remotes.docker.resolver.HTTPRequest 0.3s
      ✔ HTTP GET 0.3s
✔ Lala.base(version: "3.20"): Lala! 1.1s
  ✔ Container.from(address: "alpine:3.20"): Container! 0.9s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP HEAD 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP GET 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP GET 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.2s
      ✔ HTTP GET 0.2s
    ✔ remotes.docker.resolver.HTTPRequest 0.1s
      ✔ HTTP GET 0.1s
✔ Lala.container: Container! 0.2s
✔ Container.sync: ContainerID! 0.0s
  ✔ cache request: pull docker.io/library/alpine:3.20 0.0s
  ✔ cache request: pull docker.io/library/alpine:3.20 0.0s
  ✔ pull docker.io/library/alpine:3.20 0.0s
still garnet
wild zephyr
still garnet
#

it's a bit unfortunate yeah. i don't think number of layers should matter though since it's just resolving the ref, not pulling

wild zephyr
#

it'll only be once I guess since that will be cached afterwards.. just thinking about downsides here

still garnet
#

it would be nice if we could optimize the ID(ctx) call somehow, since we really don't even want a resolved value stored in this case, it's more like a logical reference
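
A rough sketch of the "logical reference" idea in plain Go, outside the Dagger API: the struct keeps only the version string and resolves the image at the point of use, so the default is never looked up when the caller overrides it. The `resolver` type is a hypothetical stand-in for the registry lookup.

```go
package main

import "fmt"

// resolver stands in for the registry lookup and records what was resolved.
type resolver struct{ resolved []string }

func (r *resolver) From(ref string) string {
	r.resolved = append(r.resolved, ref)
	return "container(" + ref + ")"
}

const defaultVersion = "3.19"

// lala keeps a logical reference (the version), not a resolved container.
type lala struct {
	version string
	res     *resolver
}

func newLala(res *resolver) *lala { return &lala{version: defaultVersion, res: res} }

// Base only records the requested version; nothing is resolved yet.
func (m *lala) Base(version string) *lala {
	m.version = version
	return m
}

// Container resolves the image at the point of use.
func (m *lala) Container() string { return m.res.From("alpine:" + m.version) }

func main() {
	res := &resolver{}
	newLala(res).Base("3.20").Container()
	fmt.Println(res.resolved) // only the overridden version was looked up
}
```

In the module above, the equivalent change would presumably be to store the version in the struct and call `dag.Container().From` only inside `Container()`; whether that fits depends on what else needs the container.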

fair ermine
#

Couldn't make it shorter, it seems the maintainer is really picky on the details

wild zephyr
fair ermine
#

Hopefully he will not rekt me though, there's a blocked issue from last year related to that: https://github.com/privatenumber/tsx/issues/393
But it's a different kind of context of execution imo, my issue is more related to a breaking change I guess

civic yacht
#

I also think I just got a better-than-nothing workaround working:

sipsma@dagger_dev:/tmp/test$ dagger version
dagger v0.11.6 (registry.dagger.io/engine) linux/arm64
sipsma@dagger_dev:/tmp/test$ dagger init --name=test --sdk=github.com/sipsma/dagger/sdk/typescript/standalone@4c85e5047e9c4a1ed71b82a17fe710e53c1b93d8 --source=.
10:57:13 WRN no LICENSE file found; generating one for you, feel free to change or remove license=Apache-2.0
Initialized module test in .
sipsma@dagger_dev:/tmp/test$ dagger call container-echo --string-arg=yo stdout
yo

I pushed this standalone version of the TS SDK that wraps the underlying TS SDK to change the tsx version. It works as a git ref though, which is the important part https://github.com/sipsma/dagger/blob/4c85e5047e9c4a1ed71b82a17fe710e53c1b93d8/sdk/typescript/standalone/main.go

Can you give it a shot too to confirm @fair ermine?

#

dagger init --sdk=github.com/sipsma/dagger/sdk/typescript/standalone@4c85e5047e9c4a1ed71b82a17fe710e53c1b93d8 being the important part

fair ermine
#

I'm checking

fair ermine
# civic yacht I also think I just got a better-than-nothing workaround working: ```console sip...

It works!

dagger init --sdk=github.com/sipsma/dagger/sdk/typescript/standalone@4c85e5047e9c4a1ed71b82a17fe710e53c1b93d8 --name foo
Initialized module foo in .

dagger functions
Name             Description
container-echo   Returns a container that echoes whatever string argument is provided
grep-dir         Returns lines that match a pattern in the files of the provided Directory

dagger call container-echo --string-arg "hello" stdout       
hello
civic yacht
#

Or even better tsx fixes the issue but won't count on that

civic yacht
# fair ermine Alright, keep me updated!

PR to put that sdk in our repo for a bit more officiality: https://github.com/dagger/dagger/pull/7584

Also, once that's merged and I post the workaround to relevant channels, I'll open an issue to track making all of our SDKs inherently usable as git refs. The codegen bin problem is hard but we can take a shortcut at first by just making the codegen bin an optional constructor arg to SDKs. I think if we had that then the TS SDK would have already worked as a git ref and this would have been less convoluted

  • Go is a whole other complication since it's "special" but various options there too
civic yacht
fair ermine
#

Approved!

wet mason
#

what's the latest way to re-generate the SDKs once changing the core signatures?

civic yacht
wet mason
#

yep, did that, I think it messed up with my git history

#

probably human error

civic yacht
#

Hm yeah I just tried too and it worked. One gotcha I just realized is that after an API change you also need to run docs generate (for the api schema used in docs) which isn't included in the above command, need dagger call --source=.:default docs generate export --path . for that

#

Probably worth making a one stop shop generate function

wet mason
#

it worked, but somehow there's a rebase -i that got undone (i squashed everything and now it's unsquashed), felt like I went back in time with cache

#

probably a fluke i did myself

#

I was in the middle of git surgery so I probably messed up that myself

civic yacht
#

Weird, we exclude .git so I don't think it should show up in the output theoretically

wet mason
#

@still garnet what's the coolest way to print a call.ID?

still garnet
#

@wet mason pipe it to ./cmd/dump-id - but it got substantially less cool at one point, since it still truncates like the TUI does (Container.foo)

#

so there's no big structural dump atm

tepid nova
#

Any objections to committing a go.work at the root of the repo? I know in theory you can/should keep it outside of the repo, to allow full control of your entire Go setup independently of one repo (or something). As a non-fanatic Go developer, who doesn't have a custom Go workspace on my machine, it means out of the box, IDE autocomplete doesn't work in a fresh checkout.

  1. Am I doing it wrong?
  2. If not, would adding a go.work at the root fix my problem, like I think it would?
  3. If so, any objections to me opening a PR to add one?
wet mason
#

basically to print to the user where they are

still garnet
wet mason
#

oh yep, but I wanted to pretty print it instead 🙂

still garnet
#

ah, in that case yeah idtui.DumpID, though it might need some adjustments to be truly useful for what you're doing, not sure

#

the primitives are all there, just might not be conducive to a full recursive show-me-everything dump, since the TUI abbreviates into things like Container.fooBar instead of showing what Container is

wet mason
#

I was injecting a PS1 using String()

#

pretty neat but ... not suitable as soon as the call chain gets longer

still garnet
#

yeah haha

wet mason
#

so yeah, now i'm dealing with longer IDs which don't render well in String
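
For the prompt use case, one option is to abbreviate the chain yourself instead of relying on String(). A minimal sketch, assuming the call chain is already available as a list of names; `shortChain` is a hypothetical helper and does not touch call.ID or idtui:

```go
package main

import (
	"fmt"
	"strings"
)

// shortChain is a hypothetical helper: it renders a call chain for a prompt,
// keeping only the last `keep` calls and marking the elided prefix with "…".
// A non-positive keep, or a chain that already fits, is rendered in full.
func shortChain(calls []string, keep int) string {
	const sep = " > "
	if keep <= 0 || len(calls) <= keep {
		return strings.Join(calls, sep)
	}
	return "…" + sep + strings.Join(calls[len(calls)-keep:], sep)
}

func main() {
	chain := []string{"container", "from", "withExec", "withMountedDirectory", "terminal"}
	fmt.Println(shortChain(chain, 2))
}
```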

still garnet
#

sweet

wet mason
#

at least you can figure out where you are when the call stack gets more than 3-4 layers deep

#

you can add as many breakpoints as you want so it comes in handy, otherwise you have no clue where you are

#

@still garnet also, for the record, when CI fails I don't look into GHA anymore, I go straight to Cloud to figure out the problem in no time

wet mason
#

@still garnet oh btw -- service logs are not attached to traces, right? at least I don't see the terminal service logs. Not that it matters really

still garnet
#

next step is to delete .github i guess

wet mason
#

I was about to say the same haha

#

well, specifically, the trace name

still garnet
#

i think we're actually hitting the limit for # of checks you can have

#

there are tons of errors in the API server logs

wet mason
#

like

#

right now I just cmd+click on all traces to open them up in separate tabs

#

to figure out who's who

still garnet
wet mason
#

[separate thread, async] @still garnet @dense dust btw, I just noticed that when navigating traces via cloud, I don't see all the pending ones? Right now I see my PR but only the successful ones

still garnet
#

yeah they only show up once GHA finds a runner and actually starts the dagger CLI

#

oh, you mean after that point too?

wet mason
#

oh no

#

yeah

#

I see them yellow on GH

still garnet
#

could be an issue with the grouping, i see running things in the local tab at least

wet mason
#

like

#

if you go on cloud right now, look at the attachable PR (the top one)

#

this is not in there

wet mason
#

not duplicates but checks from different git pushes?

#

oh i thought i did

#

nevermind

wet mason
#

@still garnet [less important, neat trick] is it possible to add span attributes at any point in the stack? Would be pretty sweet to flag terminal sessions to connect the debugger

still garnet
tepid nova
#
  • I have no idea where that log at the top-level comes from (ie. which span produced it)
  • issues() returns an array of objects
  • text() returns a string, for each object in the array. How many times was it called in this trace, and what did it actually return? From looking at the trace I can't figure it out
still garnet
#

is that reproducible?

tepid nova
#

I think so, I can push the code to make sure others can reproduce it

#

You can see me making a few calls in a row on the dagger org

#

(not sure why it says "Helder Correla" in there btw)

#

Unrelated issue: our git repo seems to break dag.Git()?

dagger call -m github.com/shykes/core \
 git --url=https://github.com/dagger/dagger \
 branch --name=main \
 entries
fatal: No url found for submodule path 'telemetry/opentelemetry-proto' in .gitmodules

civic yacht
#

If so, then docker rm -fv dagger-engine-32e0269fab8e9d98 (presuming you're on v0.11.6) should fix it I think

tepid nova
tepid nova
# still garnet is that reproducible?

It is now:

dagger call \
 -m github.com/dagger/dagger/ci/std/go@pull/7587/head --source https://github.com/dagger/dagger#main:sdk/python/runtime \
 lint \
 issues \
 text
tepid nova
# tepid nova It is now: ```console dagger call \ -m github.com/dagger/dagger/ci/std/go@pull...

I confirmed in raw graphql that the logs come from the return value of the final function:

dagger query -m github.com/dagger/dagger/ci/std/go#pull/7587/head <<<'{go(source:"ChV4eGgzOjE0NzY1ZjcwMjc4N2ZjZjkSfwoVeHhoMzoxNDc2NWY3MDI3ODdmY2Y5EmYKFXh4aDM6MTk3NTgxMzZjZDNlMDVkNBINCglEaXJlY3RvcnkYARoJZGlyZWN0b3J5IhwKBHBhdGgSFDoSc2RrL3B5dGhvbi9ydW50aW1lShV4eGgzOjE0NzY1ZjcwMjc4N2ZjZjkSXAoVeHhoMzoxOTc1ODEzNmNkM2UwNWQ0EkMKFXh4aDM6MzJjYzk0OTUzYWZiZDRhYxINCglEaXJlY3RvcnkYARoEdHJlZUoVeHhoMzoxOTc1ODEzNmNkM2UwNWQ0EmsKFXh4aDM6MzJjYzk0OTUzYWZiZDRhYxJSChV4eGgzOmVmZGIyZjM1NzY0MmNjYzASCgoGR2l0UmVmGAEaBmJyYW5jaCIOCgRuYW1lEgY6BG1haW5KFXh4aDM6MzJjYzk0OTUzYWZiZDRhYxJzChV4eGgzOmVmZGIyZjM1NzY0MmNjYzASWhIRCg1HaXRSZXBvc2l0b3J5GAEaA2dpdCIpCgN1cmwSIjogaHR0cHM6Ly9naXRodWIuY29tL2RhZ2dlci9kYWdnZXJKFXh4aDM6ZWZkYjJmMzU3NjQyY2NjMA=="){lint{issues{text}}}}'

Response:

{
    "go": {
        "lint": {
            "issues": [
                {
                    "text": ": # python-sdk\n./discovery.go:41:13: undefined: ModuleSource\n./discovery.go:50:14: undefined: Directory\n./discovery.go:106:48: undefined: File\n./discovery.go:111:43: undefined: File\n./discovery.go:118:39: undefined: File\n./discovery.go:124:52: undefined: Directory\n./discovery.go:132:31: undefined: Directory\n./discovery.go:140:58: undefined: ModuleSource\n./main.go:81:16: undefined: Directory\n./main.go:87:13: undefined: Container\n./discovery.go:140:58: too many errors"
                }
            ]
        }
    }
}
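
For anyone consuming this from code rather than the trace, a small sketch of decoding that response shape. The struct mirrors the JSON above; `issueTexts` is just an illustrative name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// response mirrors the shape of the query result shown above.
type response struct {
	Go struct {
		Lint struct {
			Issues []struct {
				Text string `json:"text"`
			} `json:"issues"`
		} `json:"lint"`
	} `json:"go"`
}

// issueTexts decodes the response and returns one string per issue.
func issueTexts(raw []byte) ([]string, error) {
	var r response
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}
	texts := make([]string, 0, len(r.Go.Lint.Issues))
	for _, issue := range r.Go.Lint.Issues {
		texts = append(texts, issue.Text)
	}
	return texts, nil
}

func main() {
	raw := []byte(`{"go":{"lint":{"issues":[{"text":"./main.go:87:13: undefined: Container"}]}}}`)
	texts, err := issueTexts(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(texts), texts[0])
}
```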
still garnet
#

@tepid nova ooh, yeah there's definitely something interesting happening here haha, will take a deeper look but i think we just haven't had to deal with the case of chaining from an array result

tepid nova
still garnet
#

nope - we don't emit scalar query results in the trace, only object results

#

we could consider that, but we'd have to figure out how we want to handle large values (i.e. File.contents)

dense dust
dense dust
tepid nova
dense dust
tepid nova
#

does it use the git metadata of my local dir, or of the module loaded with -m?

dense dust
#

that's why we were thinking about not surfacing that kind of data on local runs, it's way more useful to have it when a call happened in a CI context

tepid nova
#

even in a CI context I'm wondering if the git metadata we're showing is always relevant

dense dust
#

the best would be to have a dagger primitive that allows call grouping

#

that way groups would be created by the call author instead of inferred by context metadata

#

I mean, you could create your own groups

tepid nova
# dense dust local dir

So regardless of CI or local, if I do dagger call -m REMOTE_MODULE and don't pass local dirs as arguments, then git metadata from the local dir is irrelevant

#

we just happen to not do that often in CI. but it's not a best practice written in stone

#

I guess that will start happening more when CI starts skipping the checkout step, and pulling from dagger instead

#

or in post-CI scenarios like pocketCI

#

basically the current git metadata is a good stopgap but needs to be changed eventually. It's not a UI problem. UI changes might buy us some time

dense dust
dense dust
#

ideally we would be able to group different calls under a same "session ID" or something

#

but we would need to find a primitive for that, either in the CLI or sdks as a new API field maybe?

still garnet
#

if we made passing in git repo args the norm instead, i think it'd be faster, and also any IDs you get out would be reproducible, which could be an interesting property to build on later (i.e. for publishing recipes alongside artifacts, for provenance)

#

@tepid nova there could be a connection to contextual modules here - basically, the CLI could have sugar to look at the $GITHUB_REF/etc. env vars (which it already needs to do, for labels/metadata), and use that to construct the context Directory

#

that way you don't need to run a checkout step
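
A sketch of what that sugar might look like, assuming the standard GitHub Actions variables GITHUB_SERVER_URL, GITHUB_REPOSITORY and GITHUB_REF. `contextRef` is a hypothetical helper, and the `url#ref` output just follows the address style used elsewhere in this channel; nothing here is an existing CLI feature.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// contextRef is a hypothetical helper: it builds a "url#ref" string from the
// environment variables GitHub Actions sets. getenv is injected so the logic
// can be exercised without touching the real environment.
func contextRef(getenv func(string) string) (string, error) {
	server := getenv("GITHUB_SERVER_URL")
	repo := getenv("GITHUB_REPOSITORY")
	ref := getenv("GITHUB_REF")
	if server == "" || repo == "" || ref == "" {
		return "", fmt.Errorf("missing GitHub Actions environment variables")
	}
	// Branch refs look like refs/heads/main; keep the short branch name.
	// Other refs (tags, pull requests) are passed through unchanged.
	ref = strings.TrimPrefix(ref, "refs/heads/")
	return server + "/" + repo + "#" + ref, nil
}

func main() {
	ref, err := contextRef(os.Getenv)
	fmt.Println(ref, err)
}
```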

tepid nova
tepid nova
dense dust
tepid nova
#

My immediate problem is that if the module itself doesn't build, golangci-lint returns an error, I'm looking for a way to differentiate that error from an actual linting error
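
One possible angle, assuming golangci-lint attributes compile failures to a `typecheck` pseudo-linter in its report (worth verifying against the version in use): split the issues on that name. The `issue` struct below is a reduced, hypothetical view of a report entry with only the two fields the sketch needs.

```go
package main

import "fmt"

// issue holds the two fields this sketch needs from a lint report entry.
type issue struct {
	FromLinter string
	Text       string
}

// splitIssues separates compile failures from genuine lint findings, assuming
// build errors are reported under the "typecheck" linter name.
func splitIssues(all []issue) (build, lint []issue) {
	for _, i := range all {
		if i.FromLinter == "typecheck" {
			build = append(build, i)
		} else {
			lint = append(lint, i)
		}
	}
	return build, lint
}

func main() {
	build, lint := splitIssues([]issue{
		{FromLinter: "typecheck", Text: "./main.go:87:13: undefined: Container"},
		{FromLinter: "errcheck", Text: "error return value is not checked"},
	})
	fmt.Println(len(build), "build errors,", len(lint), "lint findings")
}
```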

tidal spire
#

Looks good so far

if the module itself doesn't build
you mean the go module being linted and not the dagger module right?

tepid nova
#

@still garnet I'm working on spinning out our Go linting pipeline into ci/std/go. Looking at the otel-related part, which calls Tracer().

ctx, span := Tracer().Start(ctx, "lint "+path.Clean(pkg))
ctx, span := Tracer().Start(ctx, "tidy "+path.Clean(pkg))

So I'm hitting a concrete case of what we discussed in the abstract before: an escape hatch to raw otel, wrapped in a Dagger-specific API. What happens if I remove these calls from the code?

#

I'm guessing I'll still see the individual test spans in the Trace, but perhaps all mixed together, instead of neatly organized in a properly labeled parent span?

still garnet
#

yep

tepid nova
#

Actually the grouping will remain the same, since it's already one WithExec per custom span. So it's really just the label right?

still garnet
#

iirc without those labels there was a ton of stuff interleaved and it was impossible to grok, so i think it's a bit of both

#

there's also a buildkit bug lurking here, I noticed its solver seems to get confused with parallelism and span contexts, so things end up in the wrong groups 😬

#

easily repro'able with dagger-dev call -m github.com/vito/daggerverse/viztest many-spans --n 10 -v

tepid nova
#

@still garnet do you agree that we're in the realm of "something that ideally we would hide, but as a stopgap we don't, for now"? If not, we should discuss that first. My take is that Tracer() should disappear, and be replaced by a proper call to the Dagger API - and for practical reasons we will do this gradually.

#

For example, in this specific case of our Go linter, my first thought was: "I'll just make those Dagger function calls". Which is very practical. BUT won't show up in Traces, because of lack of self-calls (another source of pain in our CI right now).

#

If we had self calls, I would remove this particular Tracer() call in a heartbeat, and replace it with a self-call of a per-package function call. Then if that is not as neat in Trace, that's a Trace problem.

still garnet
tepid nova
#

(the same way golangci-lint is just using otel in its code)

still garnet
#

ah ok, I misread your proposal at the end there then, conflated it with past topics

tepid nova
#

What I'm proposing is:

  1. Let's make our otel DX consistent across the board. You can use otel libraries in your code, and it will just work. That is true whether your code is a tool wrapped in a container by Dagger (eg. golangci-lint), a library imported by a Dagger Function, or a snippet inlined directly into a Dagger Function (eg. our go linter here). The otel-specific code should work the same either way (which is not the case with the magic Tracer()).

  2. If using a full-blown otel library is inconvenient in some cases ("I just want a neat label attached to this withExec call, so it looks pretty in Traces"), then perhaps we are missing a Dagger-specific convenience in the Dagger API, so you don't have to learn otel at all in these simple cases. That's what we do with calling Dagger Functions (you emit otel traces without having to learn otel). Self-calls will help here. Perhaps there is a bit more we can do, for example adding an optional "description" argument to Container.withExec?

#

I see Tracer() as an awkward middle ground between 1 and 2. Kind of the worst of both worlds.

still garnet
#

hmm it feels like more meaning is being assigned to Tracer() than the reality of what it is. It's super easy to remove, users will just start doing otel.Tracer("") or something instead. There's no magic there, it's pretty common with OTel to have a helper func/var defined for the tracer so you don't have to repeatedly decide what instrumentation library to pass in (or punt and put "" everywhere), so it seemed harmless to include in the SDK, but we can remove it if the concern is making the middle ground slightly too convenient.

#

it came along as a low-conviction guess as to what would be helpful, and has a dumb static value passed as the instrumentation library, so I'm not married to it, but I'm surprised it's the lightning rod of this discussion 😛

tepid nova
still garnet
#

fwiw we've also discussed gradually adding other helpers to SDKs, for example fs.FS conveniences for go

#

so if we're banning any helpers from SDKs, that logic would extend to that too, no?

tepid nova
#

If it's very easy to swap out, and no deep magic involved, then great, makes me feel like I understand what's going on as a user, and have options to control my destiny

still garnet
#

we don't actually use the value that gets passed to otel.Tracer("foo") anywhere, so yeah it's easy enough to let people just do otel.Tracer(""), but at one point I thought it might be helpful to automatically put something like the current module ref there

#

probably not necessary though, we already have ways of tracking which module is doing what

tepid nova
#

And the use of fs.FS is not a stopgap that becomes unnecessary in a planned improvement of the SDK (self calls)

still garnet
#

well, learning fs.FS is not a requirement either, so it depends on where you're coming from. if you're already familiar with otel, they're both just conveniences for using what you already set out to use. if your goal is better presentation/visualization and you don't know anything about otel, then yes I'd agree, and there are likely solutions there not involving reaching for OTel (which I deeply hope do not involve bringing back APIs like .pipeline("foo"))

#

i just don't think it's fair to say using OTel in a function call is always a stopgap

#

i do think fixing self-calls should be a high priority

tepid nova
civic yacht
tepid nova
#

It (lack of self-calls) does make Traces much less useful for large modules. It creates a strong incentive to split up a large module into several local submodules, just so you can get that sweet cross-module call tracing. But sometimes that's just not practical, or a good idea.

still garnet
#

yeah agreed. It's especially painful for anyone that has one big CI job that runs everything. The caching implications are a nice incentive for folks to do self-calls instead of calling sibling methods, too. You initially pay the API call overhead, but it might pay for itself later (🤞)

tepid nova
#

Not sure if this is known or not. Apologies if I'm stating the obvious. At the moment, async buildkit spans are not always attributed to the "right" function call.

Example just now: https://dagger.cloud/dagger/traces/c5e156cd0cc5e3de3f16f1e90b3ec2a1

  • I call withExec().file()
  • Exec is triggered by file()
  • Exec fails
  • Exec error is attributed to file() instead of withExec()
still garnet
# tepid nova Not sure if this is known or not. Apologies if I'm stating the obvious. At the m...

Stating the obvious part so everyone's on the same page: this gets back into lazy vs. eager; "did withExec fail or did it cause failure later?" - we went on to call file so obviously withExec itself didn't fail.

We're already capable of bridging that gap - and even do partly, since you can see the failed run in the bar to the right. Now it's a question of attributing failure to the originating span, not just time cost. Will think about it, but also trying to be conscious of "cooking the books" too much - making a span look failed when really it didn't could be confusing in other scenarios. Maybe it's fine to make it go full errored state, or maybe we want a middle state (like "this span itself was fine, but it caused problems later").

Tangent: (maybe I should have opened with this:) I could also play with the idea of yoinking the "effect" spans out of their unlazied point (which tends to be uninteresting and frankly confusing with parallelism) and place them in the spot that produced them instead.
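
To make the lazy vs. eager point concrete, a toy model in plain Go (not engine code): `WithExec` only records a step, `File` forces evaluation, and the error is wrapped with the name of the step that produced it so the failure can be attributed to its origin. All names here are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// errExit is a sample failure used by the demo below.
var errExit = errors.New("exit code: 1")

// step is a recorded, not-yet-executed operation.
type step struct {
	name string
	run  func() error
}

// pipeline models a lazy chain: WithExec only records; File forces evaluation.
type pipeline struct{ steps []step }

func (p *pipeline) WithExec(name string, run func() error) *pipeline {
	p.steps = append(p.steps, step{name: name, run: run})
	return p
}

// originError names the step that produced the failure, not the call that
// happened to trigger evaluation.
type originError struct {
	Origin string
	Err    error
}

func (e *originError) Error() string { return e.Origin + ": " + e.Err.Error() }

// File evaluates all recorded steps and attributes any failure to its origin.
func (p *pipeline) File() error {
	for _, s := range p.steps {
		if err := s.run(); err != nil {
			return &originError{Origin: s.name, Err: err}
		}
	}
	return nil
}

func main() {
	p := (&pipeline{}).WithExec("withExec", func() error { return errExit })
	fmt.Println(p.File()) // surfaced by File, attributed to withExec
}
```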

#

For that last part, we'd probably want to give those 'effect' spans a special UI cue so they don't get confused with actual child spans. Maybe it's a special kind of drawer

tepid nova
civic yacht
tepid nova
#

My use case is that I'm returning a json file as a *File (felt too weird to return a string) and I just want to print it on my terminal

#

I changed it to return a string, problem solved

still garnet
#

Could call contents?

tepid nova
#

probably means File is underutilized

tepid nova
#

This is great, tired-melted-brain-solomon is a whole different dimension of dogfooding

#

It's a good approximation of "haven't used Dagger for long, just trying to get things done, this is a lot to take in"

civic yacht
#

tbh there is a distinct advantage to exporting to /dev/stdout in that it streams the bytes to the stdout whereas contents sends it all as one string in memory... not that I suggest relying on exporting to /dev/stdout to keep working forever, wasn't intended per-se 😄
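
The difference in plain Go terms, as a generic sketch rather than what the CLI does internally: reading everything and then writing holds the whole payload in memory at once, while `io.Copy` moves it through in chunks for ordinary readers.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// printAll reads everything into memory first, like returning one big string.
func printAll(w io.Writer, r io.Reader) (int64, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return 0, err
	}
	n, err := w.Write(data)
	return int64(n), err
}

// printStream hands the copy to io.Copy, which for ordinary readers moves the
// data through a bounded buffer instead of holding it all at once.
func printStream(w io.Writer, r io.Reader) (int64, error) {
	return io.Copy(w, r)
}

// demo runs both strategies on the same payload and returns what each wrote.
func demo(payload string) (string, string) {
	var all, streamed strings.Builder
	_, _ = printAll(&all, strings.NewReader(payload))
	_, _ = printStream(&streamed, strings.NewReader(payload))
	return all.String(), streamed.String()
}

func main() {
	all, streamed := demo("{\"ok\":true}\n")
	fmt.Print(all, streamed)
}
```

Both produce the same bytes; only peak memory differs, which starts to matter for large files.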

tepid nova
#

Just to be clear, exporting to /dev/stdout does not work, because it goes in the runtime container's /dev/stdout..

civic yacht
#

that raises more questions actually but I'll leave those unanswered for now lol

tepid nova
#

I went looking in that trace because my bytes were nowhere to be found

#

@civic yacht @still garnet could I walk you through my dogfood PR tomorrow? I'm not discovering any new issues, but it's definitely changing how I prioritize some of the known issues

#

For example I am now stuck on interfaces, which is a first for me.

bronze hollow
#

The contextual modules and toolchains issues are pretty quiet atm, I was wondering if there's still active discussions happening and how I could help?

tepid nova
bronze hollow
#

Cool. I'm hoping this new thing solves all my problems 😁

spark cedar
#

👀 mentioned this idea to @obsidian rover:
once we merge module support for vanity urls/etc, would we want to allow something like dagger -m dagger.io/ci ...?

tepid nova
#

support for vanity urls -> ?

#

@spark cedar @tidal spire @stray heron @still garnet if you're around, I'd love to show you my findings on our own CI so far, before EU goes to bed

spark cedar
tepid nova
spark cedar
#

yup, it's exactly like go modules lol

#

as in

#

we "borrow" all the code for it

#

with some improvements for our use case
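
For anyone not familiar with how Go does this: the vanity host serves a `go-import` meta tag whose content is `prefix vcs repo-root`. A minimal sketch of parsing that content string; whether the module support reuses exactly this tag is an assumption based on the messages above, and the example values are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// goImport holds the three fields of a go-import meta tag's content attribute.
type goImport struct {
	Prefix   string // import path prefix served by the vanity host
	VCS      string // version control system, e.g. "git"
	RepoRoot string // where the source actually lives
}

// parseGoImport splits the content string "prefix vcs repo-root" into fields.
func parseGoImport(content string) (goImport, error) {
	f := strings.Fields(content)
	if len(f) != 3 {
		return goImport{}, fmt.Errorf("go-import: want 3 fields, got %d", len(f))
	}
	return goImport{Prefix: f[0], VCS: f[1], RepoRoot: f[2]}, nil
}

func main() {
	// dagger.io/ci is the vanity path floated above; the mapping is made up.
	imp, err := parseGoImport("dagger.io/ci git https://github.com/dagger/dagger")
	fmt.Println(imp.Prefix, "->", imp.RepoRoot, err)
}
```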

tepid nova
#

Will the daggerverse scraper be able to find them?

spark cedar
#

this is a good question 😄 i think it should, from my knowledge of how daggerverse works, it just runs these against a dagger engine - if a dagger engine can understand them, so can daggerverse

still garnet
#

yep, all that logic got pushed into the engine

tepid nova
#

We may have issues with duplicates (same module at several addresses) but that seems manageable

still garnet
#

iirc there's a way with Go to deduce which one is the "canonical" ref

#

but we'd have to respect that

#

hmm and it might be via Go code

spark cedar
#

or Symbolic

#

or tbf, 100s of other little properties it has

still garnet
#

ah yea or that. but I guess you'd still have a theoretical problem if there are many vanity URLs for the same source. but don't do that. 😛

spark cedar
#

hehe, i wonder if vanity urls are supported, i could run a little server that autogenerated modules

obsidian rover
#

Hello I am having a hard time debugging this issue on the codegen side (showing for rust sdk, but php and others also fail):

1215 |  pub async fn clone_url( &self, ) -> Result<String, DaggerError> { let mut query = self.selection.select("cloneURL"); query.execute(self....
     |  --------------------------------------------------------------- other definition for `clone_url`
1216 | /// The URL to clone the root of the git repo from
1217 |  pub async fn clone_url( &self, ) -> Result<String, DaggerError> { let mut query = self.selection.select("cloneUrl"); query.execute(self....
     |  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ duplicate definitions for `clone_url`

I cherry-picked this commit from Justin: https://github.com/dagger/dagger/commit/741b2b23d3f356604ea2b890a8915530a614891f#diff-6e7d0c02c8bd065fb312e4109724ee002cb041cd44c4c6e97abe267ce31cedf4R287, which replaces an implementation of CloneURL with a graphql field. Since then I'm having a hard time regenerating the sdks + understanding where the root of the issue comes from 🙏

I'm not sure how to unblock the situation, any hint appreciated 🙏 👼 I'll keep looking in the meantime 👀
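
Reading the compiler output, the two methods select `cloneURL` and `cloneUrl`, which both become `clone_url` once snake-cased. A small sketch of that collision, assuming a typical camelCase to snake_case rule (the SDK generators may use a different one); `toSnake` and `collisions` are illustrative helpers, not codegen functions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toSnake lowercases a camelCase field name, inserting "_" at word boundaries
// and treating a run of capitals (URL) as one word.
func toSnake(name string) string {
	var b strings.Builder
	runes := []rune(name)
	for i, r := range runes {
		if unicode.IsUpper(r) {
			prevLower := i > 0 && unicode.IsLower(runes[i-1])
			prevUpper := i > 0 && unicode.IsUpper(runes[i-1])
			nextLower := i+1 < len(runes) && unicode.IsLower(runes[i+1])
			if prevLower || (prevUpper && nextLower) {
				b.WriteByte('_')
			}
			b.WriteRune(unicode.ToLower(r))
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// collisions groups schema field names by their generated snake_case name and
// returns only the groups with more than one source field.
func collisions(fields []string) map[string][]string {
	seen := map[string][]string{}
	for _, f := range fields {
		key := toSnake(f)
		seen[key] = append(seen[key], f)
	}
	for key, names := range seen {
		if len(names) < 2 {
			delete(seen, key)
		}
	}
	return seen
}

func main() {
	fmt.Println(collisions([]string{"cloneURL", "cloneUrl", "htmlURL"}))
}
```

If that is what is happening, the schema presumably exposes both spellings at once (old and new), so the fix would be on the schema side rather than in each SDK.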

wet mason
#

@civic yacht Not familiar with the latest in Go codegen ...

The generated invoke function takes an inputArgs map[string][]byte that is not being used (but is being passed elsewhere AFAIK). Is there a reason (e.g. depends on what we're codegen'ing -- sometimes it's used, sometimes not) or is this a leftover?

civic yacht
wet mason
#

a fresh dagger init gives go warnings (default golangci-lint config), it was driving my OCD crazy 🙂

obsidian rover
tepid nova
#

oh nevermind, I guess that changed with the dagql refactoring?

obsidian rover
wet mason
#

@civic yacht struggling to get CI green on the terminal attachable PR, but once that is good, ok to merge, or thoughts regarding removing WithDefaultTerminalCmd?

civic yacht
tepid nova