#maintainers


rancid turret
#

For now I just disabled bash syntax so it’s more like sh. It didn’t affect variables anyway, from the context of the script.

#

We need to decide what we want to support. Those are available with a .. prefix to avoid a name clash with a function name. Except function, because that’s not a shell builtin; it’s part of bash syntax, and that was disabled because of export.

tepid nova
#

can I get an example of valid value for _EXPERIMENTAL_DAGGER_CACHE_CONFIG ?

civic yacht
final star
#

what's up with this thing where github reports "pending" for all its dagger checks? https://github.com/dagger/dagger/pull/9027 https://github.com/dagger/dagger/pull/9030 https://github.com/dagger/dagger/pull/9031 @fair ermine @spark cedar

GitHub

Follow-up to #8466.
Some tools (like git checkout) return valid 128 exit codes.
We only exclude the range 128+, because this is the exit code for a process that was terminated by signal(). 0 isn't...
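
The 128+ convention in that PR description can be sketched in shell. Shells report a process killed by signal N as exit status 128+N, so a status above 128 is read as a signal, while 128 itself is treated as an ordinary exit code. `describe_exit` is a hypothetical helper name, not something from the PR.

```sh
# Hypothetical helper: classify an exit status using the 128+N signal convention.
# Statuses above 128 are read as "killed by signal N"; 128 and below are
# treated as ordinary exit codes (git checkout can legitimately return 128).
describe_exit() {
  if [ "$1" -gt 128 ]; then
    echo "signal $(( $1 - 128 ))"
  else
    echo "exit $1"
  fi
}

describe_exit 128   # ordinary exit code
describe_exit 137   # 128 + 9, i.e. SIGKILL
```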

GitHub

does what it says on the tin, the memstat double-print is an ugly consequence of a merge conflict resolution gone poorly.
fixes #9012

GitHub

Decouple provisioning from the TS SDK to dynamically load it depending on the context of execution.

If used as a library: provision from env then from binary (install if it doesn't exist)....

civic yacht
#

Looks like the GHA checks run+pass and the checks are not pending in dagger.cloud, so feels like the checks themselves are just not getting marked as done and/or GHA is having issues with showing the latest status?

tepid nova
#

@hybrid widget @tidal spire for generating the gif examples 🙂

#!/usr/bin/env dagger shell -q -m github.com/shykes/daggerverse/termcast

print "# Let's start with a simple command" |
enter |
exec "dagger shell -c '.container | from alpine | with-exec apk,add,git,openssh,rsync'" |
gif |
..export ./demo.gif

First run will be slow, it has to build a bunch of tooling from scratch

#

@still garnet I still have that problem where the TUI output is somehow not intercepted... 😦 But otherwise it all works

fair ermine
spark cedar
#

hm, i don't think this is related to an engine change, since all the jobs here are using the same engine version that they were before, and that was working

#

potentially a cloud change?

spark cedar
wild zephyr
spark cedar
#

hm, it definitely seems fixed now 😄

still garnet
#

huh, so i guess the theory is the api workers became saturated retrying errors that were previously not retried?

rancid turret
#

@still garnet, I'm getting a panic on TUI and non-tty stdin:

❯ dagger shell --no-load <<<'.help'
...

ADDITIONAL COMMANDS
  .core         Load a core Dagger type
  .doc          Show documentation for a type, or a function
  .help         Print this help message

Use ".help <command>" for more information.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x18 pc=0x105a507cc]

goroutine 217 [running]:
github.com/muesli/cancelreader.(*fallbackCancelReader).Read(0x140001e6080, {0x140007f6100, 0x100, 0x100})
    /go/pkg/mod/github.com/muesli/cancelreader@v0.2.2/cancelreader.go:53 +0x4c
github.com/charmbracelet/bubbletea.readAnsiInputs({0x1060d1df0, 0x140001810e0}, 0x1400014c070, {0x14da59c98, 0x140001e6080})
    /go/pkg/mod/github.com/charmbracelet/bubbletea@v1.1.1/key.go:565 +0x88
github.com/charmbracelet/bubbletea.readInputs(...)
    /go/pkg/mod/github.com/charmbracelet/bubbletea@v1.1.1/key_other.go:12
github.com/charmbracelet/bubbletea.(*Program).readLoop(0x1400042c8c0)
    /go/pkg/mod/github.com/charmbracelet/bubbletea@v1.1.1/tty.go:94 +0x8c
created by github.com/charmbracelet/bubbletea.(*Program).initCancelReader in goroutine 1
    /go/pkg/mod/github.com/charmbracelet/bubbletea@v1.1.1/tty.go:86 +0x110

It does show the output but doesn't seem to close properly in the end. What's the proper way to get that stdin? It used to work correctly, but I can see v0.14.0 has the issue (main as well).
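
For anyone reproducing this: the here-string makes stdin a pipe-like stream rather than a terminal, which is the condition that matters here. A minimal sketch for checking that from a script, using the standard `[ -t 0 ]` test (`stdin_kind` is a made-up helper name):

```sh
# Report whether standard input is attached to a terminal.
stdin_kind() {
  if [ -t 0 ]; then
    echo "tty"
  else
    echo "not-a-tty"
  fi
}

# Piped input, as in the repro above, is not a terminal.
echo '.help' | stdin_kind
```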

still garnet
#

ironically the release title for bubbletea v1.1.1 is Don't panic!

spark cedar
#

is there a reason we don't use h.stdin there in the line you link to?

rancid turret
spark cedar
#

yeah, i think this fixed it for the echo "" | dagger call case

still garnet
#

(pretty unfamiliar with this code, still reading)

rancid turret
rancid turret
still garnet
still garnet
spark cedar
#

Hm, but WithInput(nil) is definitely valid 🤔

#

the doc string for WithInput tells you to do it 😛

still garnet
#

ahh

#

i think it's RestoreTerminal

spark cedar
#

mmmmm

#

that would do it 😄

rancid turret
#

Yeah 🧐

still garnet
#

(brb, vet)

rancid turret
#

So it's a regression?

spark cedar
#

i think it's just a bug in bubbletea - if there's no input, then it will fail to RestoreTerminal - which only gets called in this way by the shell

#

i guess you could also do it via echo "" | dagger core container from --address=alpine terminal

#

this also crashes

spark cedar
#

ohhh yeah, okay, we used to WithInputTTY

rancid turret
#

Why doesn't the same thing happen with dagger query?

spark cedar
#

i can open a pr that will handle it for this specific case, but i think we still need the bubbletea upstream fix if no /dev/tty is available

spark cedar
wild zephyr
#

so not sure why not returning the error was causing some checks to be left in a pending state

spark cedar
#

note - we still need a bubbletea fix

#

but this should at least handle your case @rancid turret

#

it still does really weird things if no /dev/tty is available, but that's very uncommon if stdout/stderr/stdin are ttys 😛

still garnet
wild zephyr
#

which is also strange as the nil dereference happens after the status is effectively updated 🤔

still garnet
#

oh yep that seems likely, didn't see the nil deref fix

#

maybe those nil deref jobs kept getting added back to the river queue?

wild zephyr
still garnet
tepid nova
#

Fun fact: you can get a dev build of the CLI that thinks it's a stable release, so it agrees to connect to your stable engine 🙂

#
$ dagger call -m github.com/dagger/dagger@v0.14.0 --source https://github.com/dagger/dagger#main cli binary --platform=darwin/arm64 export --path ./dagger-dev

$ ./dagger-dev version
dagger v0.14.0 (registry.dagger.io/engine:v0.14.0) darwin/arm64
still garnet
#

Think that'll break some of the TUI functionality, like duration accounting (withExecs will probably say 0s)

#

but i'm thinking of fixing that anyway, so old traces look right in cloud v3

obsidian rover
tepid nova
#

Does anyone have experience using buildkit cache export to S3 with dagger? 🙏 I managed to inject the configuration into my session, the engine is clearly doing something at export time, but nothing gets uploaded in the end... The exact same config string works just fine with docker buildx build

#

Ah, I got one lead in engine logs:

time="2024-11-23T02:43:00Z" level=error msg="error running cache export for client j1jp4r4d3wqdd9k5uy3fk8n47" client_hostname=mbsh4.local client_id=j1jp4r4d3wqdd9k5uy3fk8n47 error="failed to check file presence in cache: operation error S3: HeadObject, context canceled" session_id=wqhht3fg95go9xgnd3kiwof3m spanID=e1571202493b78c7 traceID=6ace9b43a215147d32986e1552f3b9bd
#

Is there any way to get more details in the logs? 😭

tepid nova
#

Slightly more logs...

time="2024-11-23T03:01:26Z" level=debug msg="getting remotes for jarz0d34dv0vhyfjr4v57w1hf::i56o1idyahbuy67m9bqrimfqc" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:26Z" level=debug msg="got remotes for jarz0d34dv0vhyfjr4v57w1hf::i56o1idyahbuy67m9bqrimfqc" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:26Z" level=debug msg="getting remotes for jarz0d34dv0vhyfjr4v57w1hf::j6cawqukhzvlblmkq2vl1pqfx" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:26Z" level=debug msg="got remotes for jarz0d34dv0vhyfjr4v57w1hf::j6cawqukhzvlblmkq2vl1pqfx" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:27Z" level=debug msg="finalizing exporter" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:44Z" level=debug msg="finalized exporter" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:44Z" level=debug msg="waited for cache export" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:44Z" level=error msg="error running cache export for client c8p0i5017e2cq7kh6b3rh807c" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c error="failed to check file presence in cache: operation error S3: HeadObject, context canceled" session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:44Z" level=debug msg="done running cache export for client c8p0i5017e2cq7kh6b3rh807c" client_hostname=mbsh4.local client_id=c8p0i5017e2cq7kh6b3rh807c session_id=8af6qk5739pf9dd2eyx9fojpz spanID=5219b6416f96a325 traceID=aeb1d99f0a7d086b91577f8645e36066
time="2024-11-23T03:01:44Z" level=error msg="failed to flush telemetry" clientID=c8p0i5017e2cq7kh6b3rh807c error="map[error:context canceled\ncontext canceled\ncontext canceled kind:*errors.joinError stack:<nil>]" isMainClient=true mainClientID=c8p0i5017e2cq7kh6b3rh807c sessionID=8af6qk5739pf9dd2eyx9fojpz
time="2024-11-23T03:01:44Z" level=debug msg="removing session; stopping client services and flushing" session=8af6qk5739pf9dd2eyx9fojpz
time="2024-11-23T03:01:44Z" level=debug msg="stopped services" session=8af6qk5739pf9dd2eyx9fojpz
time="2024-11-23T03:01:44Z" level=debug msg="session removed" session=8af6qk5739pf9dd2eyx9fojpz
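
With this much debug output, it can help to cut the engine log down to just the error lines. A rough sketch, assuming the logfmt-style layout shown above (a `msg="..."` field followed later by an `error="..."` field, with no embedded double quotes); `errors_only` is a hypothetical helper name:

```sh
# Read engine log lines on stdin and print "msg | error" for each
# level=error line. Lines without an error="..." field are skipped.
errors_only() {
  grep 'level=error' |
    sed -n 's/.* msg="\([^"]*\)".* error="\([^"]*\)".*/\1 | \2/p'
}
```

Usage would be something like piping the engine container's logs into `errors_only`.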
#

Could this be related to our dagger-specific cache exporter issues?

#

testing with dagger run...

#

yup it works with dagger run

#

(yay but also boohoo)

civic yacht
# tepid nova Slightly more logs... ``` time="2024-11-23T03:01:26Z" level=debug msg="getting ...

Oh Marcos noticed this the other day too, at some point a timeout of 10s got added to the CLI's shutdown process: https://github.com/marcosnils/dagger/blob/8c0d24b399be77a90d8c4bfa21496a296a6ec39d/engine/client/client.go?plain=1#L754

Which inadvertently applied to the cache export (since that runs when the client is closing the session). Didn't notice because our integ tests aren't exporting a ton of data so they always made it out in under 10s

#

Need to fix that, probably just not having a timeout when cache export is enabled
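
In shell terms, the idea is roughly: only apply the shutdown deadline when no cache export is pending. This is a sketch, not the actual fix; the real logic lives in the Go client linked above. `close_session`, `CACHE_EXPORT_ENABLED` and `SHUTDOWN_TIMEOUT` are made-up names, and it assumes the GNU coreutils `timeout` command is installed.

```sh
# Run a shutdown command ("$@") with a deadline, unless cache export is
# enabled, in which case it is allowed to run to completion.
# CACHE_EXPORT_ENABLED and SHUTDOWN_TIMEOUT are hypothetical variables.
close_session() {
  if [ -n "${CACHE_EXPORT_ENABLED:-}" ]; then
    "$@"
  else
    # timeout exits with status 124 when the deadline is hit
    timeout "${SHUTDOWN_TIMEOUT:-10}" "$@"
  fi
}
```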

#

But you can build a CLI with that timeout rm'd to unblock quick

civic yacht
tepid nova
#

ah I see. well my dagger run is for a tiny program that runs a tiny pipeline, so it might simply be the overhead of loading a module that causes the issue

obsidian rover
# still garnet https://tenor.com/uXSy.gif

Hey Alex,

I am a bit confused on how to use the dynamic purity feature you implemented to ensure that any socket or pat being retrieved shows up in the dagql call as a secret / socket to be passed to functions:

We currently do like that for the PAT.

But my understanding was that I could replace this selection by marking the field selectors as impure when selecting them: here for the PAT and that I could add it there for the socket

Plus, as unixSocket is impure, then anyone calling it would make it impure

My tests show that it's not enough. What am I missing ? 🙏

still garnet
#

(i changed the pr name to 'elective purity' or something)

obsidian rover
# still garnet this looks like it was written against the old version of my PR for dynamic puri...

Mmmh, thanks. I was doing it wrong indeed 🙏 ; Now, there's still something that confuses me: we used to rely on the withAuthToken selector to ensure that the PAT retrieved shows up in the dagql call to functions.

As the core.unixSocket is marked as Impure, and Purity is not set (commented), my current understanding was that the core git API makes an impure call when selecting the socket, and, when creating the new instance for current ID, this would automagically be part of the dagql ID, removing the necessity for an __internalWithSocket and withAuthToken call as this is constructed directly here, from here or here from there.

However, I tried every variation of purity / impurity (for the unixSocket + IpSocket) + inside the git API, setting it as pure or not ; but removing the withAuthToken does seem to make it disappear from the dagql call for functions calling it ;

On your elective purity PR, you mentioned that we need to self select the field with purity: true ?

Summary: I'm still doing something wrong ahah, and not sure how to unblock myself 🙏

still garnet
#

dagql elective purity no jutsu

civic yacht
#

@obsidian rover did something change in the config for gitlab we're using in our integ tests? I got a weird failure in my PR. Then I saw main passed 12h ago but just re-ran CI on main and hit the same weird error there: https://dagger.cloud/dagger/traces/e098e2462cc91e1254d3a1bbe18df97a?span=e02b84e361a6d970

In TestCLI/TestDaggerInstall/GitLab_public/git/sad. Not an emergency if so, just wondering what could have changed to cause that to fail consistently

obsidian rover
civic yacht
still garnet
obsidian rover
# still garnet fwiw, seeing the same failure in my PR: <https://v3.dagger.cloud/dagger/traces/4...

Yeah there's something odd: multi.go:85: 27 : [1.9s] | 22 : parseRefString: gitlab.com/dagger-modules/test/subdir/dep2@323d56c9ece3492d13f58b8b603d31a7c511cd41 DONE [0.2s]

The source ref is: gitlab.com/dagger-modules/test/more/dagger-test-modules-public, from vcsTestCase named GitLab public. It seems truncated weirdly

parseRefString prints an impossible ref here: https://v3.dagger.cloud/dagger/traces/fdc6f94b5168ef19cf43163fa85355e9#L297-ff0aa4dda54d634a.

still garnet
#

is that /subdir supposed to not exist?

#

since the test is expecting an error

tidal spire
#

double checking - the buildkit s3 exporter does not include dagger cache volumes right? @civic yacht

obsidian rover
#

gitlab error

tepid nova
#

Just hit this random python error while querying the core api (unrelated to python in every way)


✘ .withExec(args: ["uv", "run", "--isolated", "--frozen", "--package", "codegen", "python", "-m", "codegen", "generate", "-i", "/schema.json", "-o", "/gen.py"]): Container! 1.9s
error: Failed to prepare distributions
  Caused by: Failed to build `codegen @ file:///src/sha256:66b11cafd456559c6a162a3349d8409b474f5a6476ed31c9534ef046dada37d6/sdk/python/dev/c
  Caused by: Failed to resolve requirements from `build-system.requires`
  Caused by: No solution found when resolving: `hatchling`
  Caused by: Failed to download `hatchling==1.26.3`
  Caused by: Failed to fetch: `https://files.pythonhosted.org/packages/72/41/b3e29dc4fe623794070e5dfbb9915acb649ce05d6472f005470cbed9de83/ha
one-any.whl.metadata`
  Caused by: Request failed after 3 retries
  Caused by: error sending request for url (https://files.pythonhosted.org/packages/72/41/b3e29dc4fe623794070e5dfbb9915acb649ce05d6472f00547
-1.26.3-py3-none-any.whl.metadata)
  Caused by: client error (Connect)
  Caused by: tcp connect error: Network unreachable (os error 101)
  Caused by: Network unreachable (os error 101)
#

Is the dagger engine hitting pythonhosted.org on every core api call every time it loads github.com/dagger/dagger?

#

I think it is

still garnet
#

found the issue - it's actually from https://github.com/dagger/dagger/pull/9054 (edited original msg), and it's because we've been using semconv v1.24.0 meanwhile the otel SDK version we're using wants v1.25.0 - should be a trivial bump, i think, this sort of dependency pain is pretty localized

tepid nova
tepid nova
#

@still garnet I'm filing an issue, and it's not urgent, but just in case you know the answer off the top of your head: I can't get the TUI to work inside asciinema, inside dagger... For some reason the TUI output is excluded from the recording

still garnet
#

in dagger shell/terminal?

tepid nova
#

unrelated to the shell

#

I had the same problem last year when I made that asciinema module... still breaking my teeth on it now

#

(was hoping to finally automate the production of our docs gifs)

still garnet
#

hmmm, shot in the dark, but try asciinema rec --cols 80 --rows 24 - maybe something isn't passing along a window size?

tepid nova
# still garnet hmmm, shot in the dark, but try `asciinema rec --cols 80 --rows 24` - maybe some...

Nevermind, I may have tracked down the issue, there is a huge difference between stable releases of asciinema, and the dev branch on their repo. The latter appears to be an in-progress rust rewrite. I was using that to make it easier to inject the tool into arbitrary containers (to allow recording any command on any container). But the tty problem seems to go away when I just use the stable release. So, I may redirect my efforts towards finding another way to record in any container.

still garnet
#

but maybe it'll be fine on your machine 🤷‍♂️

meager summit
#

Hi folks,

I have a question about how View works (cross-posting the question from here: https://github.com/dagger/dagger/pull/8865#discussion_r1862896017)

  • so let's say the current release is v0.14.0
  • now I changed the AsService api so that for BeforeVersion("v0.15.0") it serves the old api, and for AfterVersion("v0.15.0") it serves the new one.
  • now when we run the tests, they run with the current version of dagger, which would be v0.14.1-.....
  • that would mean that the api visible to tests would be the old one, so we don't need any changes in existing tests yet.
  • now let's say v0.15.0 is released. this would mean tests would suddenly start seeing the v0.15.0 version of the api, and the tests would fail as the new api is no longer backward compatible.
  • now we would need to make changes to the existing tests?

does ^^ make sense? It almost feels like I am missing something obvious here.
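
The scenario above boils down to a threshold comparison on the version the tests run with. A small sketch of that gating, under the assumption that only the base MAJOR.MINOR.PATCH is compared and any pre-release suffix is ignored (so v0.15.0-... already counts as v0.15.0); how dagger's views actually treat pre-release suffixes is not stated in this thread. `api_view` and the other helper names are hypothetical.

```sh
# Print "MAJOR MINOR PATCH" for a version like v0.14.1-dev, dropping the
# leading "v" and any "-suffix" (assumption: pre-release tags are ignored).
base_version() {
  v=${1#v}
  v=${v%%-*}
  old_ifs=$IFS
  IFS=.
  set -- $v
  IFS=$old_ifs
  echo "${1:-0} ${2:-0} ${3:-0}"
}

# Succeed if version $1 is strictly lower than version $2.
version_lt() {
  set -- $(base_version "$1") $(base_version "$2")
  if [ "$1" -ne "$4" ]; then
    [ "$1" -lt "$4" ]
  elif [ "$2" -ne "$5" ]; then
    [ "$2" -lt "$5" ]
  else
    [ "$3" -lt "$6" ]
  fi
}

# Which API a client at version $1 sees, given a threshold $2.
api_view() {
  if version_lt "$1" "$2"; then echo old; else echo new; fi
}

api_view v0.14.1-dev v0.15.0   # tests today: below the threshold
api_view v0.15.0 v0.15.0       # after the release: at the threshold
```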

spark cedar
#

yes. welcome to complexity hell lol 😄

#

so, if you glance at future_test.go you'll see an example of where we've done this

#

but. this is maybe kind of irrelevant? the next release is going to be v0.15.0? so we should update the "current" releases to be v0.15.0-...

#

which sidesteps the problem entirely

meager summit
#

so we should update the "current" releases to be v0.15.0-...

ah, yeah that may work. The prob is with the existing tests (which are not necessarily testing this api but depend on it).

spark cedar
#

yeaaah okay um

spark cedar
#

i don't really want to figure this out with future_test.go, i'm not sure that approach is very fun

#

fyi @meager summit, went through and looked at your open prs, and left some comments - trying to make sure you're not blocked!

meager summit
#

fyi @meager summit, went through and looked at your open prs, and left some comments - trying to make sure you're not blocked!

that is super helpful. thank you. It is still a secret, but I was planning on pinging someone on Monday to help move me forward with these PRs. No points for guessing who this someone is.

#

Justin, while you are here, I have one more question for you 🙂

I am working on chore(tests): archive test results in json format with CI runs (https://github.com/dagger/dagger/pull/9011) and as part of that I need a few pieces of information to be attached to the test results:

  • current PR number
  • current branch
  • current commit
  • current GitHub action run id

This information is available as env variables when triggering the GitHub actions. Is it possible to make changes to our .dagger module to pass this information to the test function?
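
For reference, collecting those four values from the default GitHub Actions variables could look like the sketch below. `GITHUB_REF`, `GITHUB_HEAD_REF`, `GITHUB_REF_NAME`, `GITHUB_SHA` and `GITHUB_RUN_ID` are the standard Actions variables; `ci_metadata` and the key=value output format are made up for illustration, and how the values would then reach the .dagger module is left open.

```sh
# Print the CI metadata as key=value lines, based on GitHub Actions defaults.
# On pull_request events GITHUB_REF looks like refs/pull/<number>/merge.
ci_metadata() {
  pr=""
  case "${GITHUB_REF:-}" in
    refs/pull/*/merge)
      pr=${GITHUB_REF#refs/pull/}
      pr=${pr%/merge}
      ;;
  esac
  echo "pr=$pr"
  # GITHUB_HEAD_REF is only set for pull requests; fall back to the ref name.
  echo "branch=${GITHUB_HEAD_REF:-${GITHUB_REF_NAME:-}}"
  echo "commit=${GITHUB_SHA:-}"
  echo "run_id=${GITHUB_RUN_ID:-}"
}
```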

spark cedar
#

uh, getting the current commit (and branch+PR to some extent) should be possible using t.Dagger.Git which can get info for that.
but getting the action run id, we'd want to inject it through an env var, manually using a flag (probably on the dagger constructor)

#

i'd kind of like to avoid injecting github-specific stuff into dagger config, it feels like a bit of a smell, but not sure really how to keep it moving

#

i'm trying to get our ci into a shape that's less tied to the specifics of github actions

meager summit
spark cedar
#

dagger cloud gets passed these automatically

spark cedar
#

do we currently have a way in core of getting a dagger Directory from a directory on the engine host?

#

i have some ideas for how to implement it, but just wanted to check before i spent time on hacking it in

tepid nova
#

no thank you...

#

how to open the gates of hell

spark cedar
#

👀 this is an internal impl detail i need it for

#

it's not for an external api

#

without it, it will be harder to add this feature 😛 (it's for contextual git)

tepid nova
#

Ah nevermind then 🙂

civic yacht
spark cedar
#

Hm okay well maybe I'll try and come up with something better 🤔

civic yacht
tepid nova
#

What's the recommended best practice for installing a dev version of the engine on my system? I know where to put the CLI, but where should I put the engine image?

#

If I only install the CLI, it seems to auto-download a dev build of the engine, but unless we recently implemented "magical build on demand" from our registry, I'm guessing it's just main

tepid nova
#

is there a plan for supporting custom certificates & proxy config without requiring building a custom engine, or messing with the engine image in any way?

final star
#

also where do you put the CLI?

tepid nova
#

I set up a new bare metal lab machine to ssh into for performance (especially when on a plane 🙂). I just want the system install to be for dagger-dev rather than stable release. But otherwise it's a persistent system-wide install

spark cedar
#

Yeah I think as @obsidian rover said above - we'll do it as part of that issue

Which I would like to do, but priorities haven't let me get to it atm 😭

#

But it is definitely on my little list of things to do 🎉

#

Maybe after my little visit to git land?

#

We have a lot of enterprise type users asking for this, so I really want to get to it 🎉

wild zephyr
spark cedar
#

might interest you too @obsidian rover ^

#

the basic idea is to just make the current git implementation one of many backends - for now, we'll add Directory.asGit, but we can also live-load these details from the client as well later

tepid nova
#

Rant(status:optional, length:short, urgency:low, areyoustill:reading?)

Rant of the day: prefixing commit messages & PR titles with feat(some-component): or chore: hurts readability. IMO titles are for data, not metadata. If a commit does a thing, the commit should be "do the thing"

final star
#

especially around bugs where it's already mildly tricky to get the right verb tense, putting bug(engine): in front reads weird

#

bug: make thing do correct behavior

#

wait is doing the correct behavior a bug now?

tepid nova
#

Is it by setting _EXPERIMENTAL_DAGGER_DEV_CONTAINER?

#

Basically, how to install a dev build of cli+engine pair on my system

spark cedar
spark cedar
#

Which shells out to a mage script that I want to get rid of, but for now is still necessary

tepid nova
#

What is the relationship between ./hack/dev and dagger call dev-export?

spark cedar
#

Hack/dev -> weird mage thing -> dev-export

tepid nova
#

Context: I don't have a go dev environment installed on the host system, and would prefer not to. ./hack/dev requires a certain go version installed on the host

#

Can I replace ./hack/dev with 1) calling dev-export and 2) setting env vars myself?

spark cedar
#

Yup

tepid nova
spark cedar
tepid nova
#

OK

#

so docker import + export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-image://xxx?

spark cedar
#

Yup I think that should work - it's what the weird mage script is doing under the hood anyways

tepid nova
#

This makes me want "one binary" more

#

@spark cedar ha ha! I found a way to use ./hack/dev directly without sullying my host, using the power of dagger:

$ dagger shell -m github.com/dagger/dagger -c 'go | env | with-file /bin/dagger $(cli | binary)  | terminal --experimental-privileged-nesting'
dagger /app $ ./hack/dev
spark cedar
#

🤔 how does the terminal there have access to the host's docker?

tepid nova
#

funny you should mention that. the command just failed

#

complains about lack of .git

spark cedar
#

But if you pass the docker socket it should work 👀

tepid nova
#

so I guess I didn't even make it as far as lack of docker access

spark cedar
#

Oh yeah I hit this lol yesterday - go env doesn't include the .git directory

tepid nova
#

Here's what I tried:

COMMIT=$(dagger core git --url=https://github.com/dagger/dagger head commit)
dagger call -m github.com/dagger/dagger@$COMMIT dev-export -o ./dagger-$COMMIT
docker import ./dagger-$COMMIT/engine.tar registry.dagger.io/engine:$COMMIT
./dagger-$COMMIT/dagger core version

Last command fails with:

$ ./dagger-$COMMIT/dagger core version
✘ connect 1.2s
! start engine: failed to run container: 2f9b918a4258bf650a247b29e4fc987f0802f99491816fc1e48404c34ecb4db7
! docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--debug": executable file not found in $PATH: unknown.
! : failed to run command: exit status 127
│ ✘ starting engine 1.2s
│ ! failed to run container: 2f9b918a4258bf650a247b29e4fc987f0802f99491816fc1e48404c34ecb4db7
│ ! docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--debug": executable file not found in $PATH: unknown.
│ ! : failed to run command: exit status 127
│ │ ✘ create 1.2s
│ │ ! failed to run container: 2f9b918a4258bf650a247b29e4fc987f0802f99491816fc1e48404c34ecb4db7
│ │ ! docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--debug": executable file not found in $PATH: unknown.
│ │ ! : failed to run command: exit status 127
│ │ │ ✔ exec docker ps -a --no-trunc --filter name=^/dagger-engine- --format {{.Names}} 0.0s
│ │ │ ✔ exec docker inspect --type=image registry.dagger.io/engine:8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 0.0s
│ │ │ ✘ exec docker run --name dagger-engine-8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 -d --restart always -v /var/lib/dagger --privileged registry.dagger.io/engine:8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 --debug 1.1s
│ │ │ ┃ 2f9b918a4258bf650a247b29e4fc987f0802f99491816fc1e48404c34ecb4db7
│ │ │ ┃ docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--debug": executable file not found in $PATH: unknown.
│ │ │ ! failed to run command: exit status 127

Error logs:

✘ exec docker run --name dagger-engine-8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 -d --restart always -v /var/lib/dagger --privileged registry.dagger.io/engine:8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 --debug 1.1s
2f9b918a4258bf650a247b29e4fc987f0802f99491816fc1e48404c34ecb4db7
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--debug": executable file not found in $PATH: unknown.
! failed to run command: exit status 127
Error: start engine: failed to run container: 2f9b918a4258bf650a247b29e4fc987f0802f99491816fc1e48404c34ecb4db7
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "--debug": executable file not found in $PATH: unknown.
: failed to run command: exit status 127
#
$ docker ps -a | grep engine
2f9b918a4258   registry.dagger.io/engine:8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916   "--debug"                About a minute ago   Created                                                                     dagger-engine-8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916

wat

#

I see that dagger tries to execute this command:

docker run --name dagger-engine-8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 -d --restart always -v /var/lib/dagger --privileged registry.dagger.io/engine:8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 --debug

Is that the right command?

#

2 hours in, I am giving up on installing a dev build of dagger on my home server

civic yacht
tepid nova
tepid nova
wild zephyr
#

as those are the only ones that publish the CLI and the Engine

tepid nova
#

Ah, so this is the same as dagger call github.com/dagger/dagger/cmd/dagger@$COMMIT binary I believe -> will get the right CLI. Then the CLI itself downloads the right engine, as long as it's from a commit on main (and perhaps not too old I guess)

wild zephyr
tepid nova
wild zephyr
tepid nova
#

@wild zephyr how soon does an engine image become available from main on the registry?

#

I tried pulling the current head commit, and it doesn't work. Same for a 2-day old commit.

$ docker pull registry.dagger.io/engine:d1e140d84910b0d0bc5427d845e6bdf4d2d16e83
Error response from daemon: manifest unknown
#

Looks like it's weekly

wild zephyr
#

we only publish the engine when any of the engine files gets modified

#

it's part of the GHA workflow trigger basically

tepid nova
#

Ah. That makes my life a little harder

civic yacht
#

CI on main is also currently stuck with a bunch of jobs pending waiting for a GHA runner for some reason, so the last two commits haven't run the publish job yet

tepid nova
#

Because I have to manually find the last commit that changed the engine, from the history of the commit I actually want, and build that
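
That search can be done with plain git rather than by eye. A minimal sketch: `last_commit_touching` is a hypothetical helper, and the paths to pass are an open question, since the exact set of files that triggers the publish workflow isn't given here.

```sh
# Print the most recent commit, reachable from revision $1, that modified
# any of the remaining path arguments. Must be run inside a clone.
last_commit_touching() {
  rev=$1
  shift
  git log -1 --format=%H "$rev" -- "$@"
}
```

For example, `last_commit_touching HEAD engine/` would print the newest commit in HEAD's history that touched anything under `engine/`, assuming that directory is one of the relevant paths.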

#

OK most recent working commit I found: 8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916

tepid nova
tepid nova
#

I'm thinking maybe my version of docker is too old?

civic yacht
wild zephyr
civic yacht
#

Yeah... When I run that dev CLI the engine starts successfully using the entrypoint as expected... My docker versions are

Client: Docker Engine - Community
 Version:           27.3.1
 API version:       1.46 (downgraded from 1.47)
 Go version:        go1.22.7
 Git commit:        ce12230
 Built:             Fri Sep 20 11:40:38 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.0.3
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       662f78c
  Built:            Sat Jun 29 00:02:44 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.7.18
  GitCommit:        ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc:
  Version:          1.7.18
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
wild zephyr
#

@tepid nova does docker run --rm --privileged -v /var/lib/dagger registry.dagger.io/engine:v0.14.0 --debug actually work?

#

^ if that doesn't work then yes, maybe a docker version issue

tepid nova
wild zephyr
#
  • if you haven't upgraded docker yet 😛
tepid nova
#

OK now that I upgraded docker, running these commands myself works, but still getting the error when calling dagger... monka_think

#

(note, upgrading docker was a painful affair, I'm on an old ubuntu and ended up downloading the static binaries and manually writing the systemd unit for dockerd)

wild zephyr
tepid nova
civic yacht
wild zephyr
tepid nova
#

Trying with the most recent main commit that seems to work: c811cc8b23c6398b4bf3b3ea358733759b9b9257

wild zephyr
tepid nova
#

Ah it works for me on latest commit!

#

so weird

#

For the record, this worked:

sudo rm /usr/local/bin/dagger ; curl -fsSL https://dl.dagger.io/dagger/install.sh | sudo -E DAGGER_COMMIT=c811cc8b23c6398b4bf3b3ea358733759b9b9257 BIN_DIR=/usr/local/bin sh ; /usr/local/bin/dagger core version

And this continues to fail:

curl -fsSL https://dl.dagger.io/dagger/install.sh | sudo -E DAGGER_COMMIT=8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 BIN_DIR=/usr/local/bin sh ; /usr/local/bin/dagger core version

So I think it might be a regression in main? Or perhaps in how we built the engine images from main.

civic yacht
# tepid nova For the record, this worked: ```console sudo rm /usr/local/bin/dagger ; curl -f...

8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 works for me and I checked both platform images and they have the entrypoint set, so I suspect something in your docker state is messing with it. If you imported/pulled/etc. the image corresponding to 8cfb4bc89fbb1a2cba30ecb35feaed5a81fc2916 earlier and it had some manifest cached that ended up not having an entrypoint, that might explain it.

tepid nova
#

ah yeah, that's probably it 👍 thank you

final star
#

whoall works frequently on the engine from a macos host other than me?

#

i've only this morning realized how unfriendly the tests are to macos hosts, like i've been running ./hack/dev dagger call test specific like the hack scripts are doing anything meaningful here, when really they just double wrap everything in my built code for no real gain... all of this is in pursuit of being able to profile an engine while i run tests against it

#

i've got a lot of ideas as to how to do this effectively, but the decision matrix of which to pursue is rough:

  1. enhance .dagger/test.go to pull pprof profiles after test runs, so you can call something like dagger call test profile -o /tmp/profile.p (worried this is gonna give me profile files locally that have no function names since the binaries weren't built on my machine, but inside dagger)
  2. teach the integration tests that they need to put linux binaries into the containers they create, then teach hack (and prolly dev-export too?) to cross compile the CLI if it's on macos
  3. give up, pull test cases into manual test modules (this is the easiest, but has no benefit for other macos engine devs)
  4. super give up, spend a couple days building a reproducible & semi-persistent linux setup (this seems deranged given that we're working on a tool that promises to make this sort of thing unnecessary, but less deranged if we acknowledge there's a legit bootstrapping problem here)
#

having typed that, i think 2 sounds best?

meager summit
#

sorry, I don't have a lot of context around the pprof work you are doing. do you mind giving a little context of how you are using the profiling here

final star
#

lol, yeah, sorry it's very x-y, i'm a couple layers deep rn.

i'm looking to pull profiles (initially pprof, but i wanna play with fgprof because we likely have disk-write perf problems) from an engine while i run ModuleTests against them... to get hyper-specific i'm looking to understand the time cost of OCI importing built-ins

GitHub

🚀 fgprof is a sampling Go profiler that allows you to analyze On-CPU as well as Off-CPU (e.g. I/O) time together. - felixge/fgprof

GitHub

An engine to run your pipelines in containers. Contribute to dagger/dagger development by creating an account on GitHub.

tepid nova
#

Calling maintainers... We have an opportunity to try one of those "AI software devs" to farm out issues humans don't have time to get to. I need a list of potential candidates for such issues...

@civic yacht @meager summit @spark cedar @stray heron @fair ermine @obsidian rover @final star @rancid turret @surreal berry @wet mason @still garnet 🙏 any suggestions?

final star
#

the known technique that linux-host-devs do to do this sorta thing is dev && with-dev go test -run "TestModule" ./core/integration && curl docker-container://dagger-engine.dev:6060/debug/pprof/profile

-- this doesn't work on macos bc core/integration likes to load your local built CLI into the container

#

now regretting having not made a thread

final star
final star
tepid nova
#

What's the go-to test suite for every day engine dev? dagger call test specific?

final star
#

depends on how tight you're trying to make the loop. I'll use specific for running 3 or 4 tests and tryna keep it simple

#

running go test directly gets very nice and useful when you're iterating on the test itself or profiling

#

and then in terms of subsets of tests i've found myself mostly in TestModule & TestService, but that's largely a function of what i'm working on

#

regardless, i can't actually get any large subset of the tests to run locally (or dagger-call-locally) in a reasonable time, so it's always small subsets. i wish ./hack/most-tests was maintained just to have a quicker, sub 10m sanity-check-all-the-things

civic yacht
final star
#

i guess i shouldn't say sanity-check, that issue is not what i'd like, i more want to run everything that isn't resource-intensive

tepid nova
final star
#

yeah, actually quite adjacent to flake detection

civic yacht
#

would be cool to somehow have heuristics for "most useful" tests (based on code-coverage, number of times a non-flake error occurs in unmerged PRs, etc.) thinkies

final star
#

most efficient is what i'm looking for lol, useful/time

#

the one problem though with any auto-test-subset-detection is you can really shoot yourself in the foot by assuming that some test you need to run is running when the system has suddenly decided that it should not, and with a big corpus, even showing a diff "heres the ones we're skipping" doesn't end up human-comprehensible

spark cedar
meager summit
#

Hi @spark cedar I am working through the scenarios of dagger update, and one scenario is when user runs dagger update <just-name>@<version>.

based on our discussion on the call the other day, I am thinking it should trigger the update and change the version to specified version.

however, when we are parsing this using parseRefString function, it returns the hasVersion: false. I made the change for it to parse the version correctly, and that fixes the testcase (and i am keeping an eye on CI for this).

BUT - what do you think about this change?

spark cedar
#

🤔 so the reasoning for the way it is currently is that local modules can't currently be versioned

#

actually this is easy

#

i don't think this is a valid case

#

you shouldn't be able to uninstall by name@version

#

that's not a module ref

meager summit
#

I am currently talking about update scenario.

spark cedar
#

right or update

meager summit
#

User can do “dagger update name”

spark cedar
#

oh i see, if we have update-to-version semantics

#

bleh

#

okay, then i think this is subtly different than module ref semantics.
we should remove the @ before passing to parseRefString

meager summit
#

But… it would mean it will just behave like “dagger update name”?

spark cedar
#

sorry, i'm not making sense, my bad

#

i mean, we remove it, and keep it 😛

#

we don't change the behavior of parseRefString - it's not valid to do dagger install ./local/mod@version

#

for the update case, you take the "@version" off first, and save that - that's the version to upgrade to 🙂

#

(also, realized - make sure we have a test case for trying to update a local source module - that shouldn't be possible)

#

since local modules aren't versioned, you shouldn't be allowed to update them

meager summit
#

I will add an explicit testcase and validation for it 👍🏻
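
A minimal sketch of the "take the @version off first" step, in plain shell rather than the actual Go code around parseRefString. The name split_update_target is made up for illustration; it only splits on the last @ and does not check whether the module is local or remote.

```shell
# Hypothetical helper, not part of the dagger codebase.
# Splits "<name>@<version>" on the last "@"; version is empty when absent.
split_update_target() {
  arg=$1
  case "$arg" in
    *@*)
      name=${arg%@*}      # everything before the last "@"
      version=${arg##*@}  # everything after the last "@"
      ;;
    *)
      name=$arg
      version=
      ;;
  esac
  printf 'name=%s version=%s\n' "$name" "$version"
}

split_update_target "hello@v1.2.3"
split_update_target "hello"
```

The bare name would then go through the existing ref parsing unchanged, and the saved version is the one to upgrade to.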

rancid turret
spark cedar
#

this feels like a useful way to eventually get rid of ./hack/dev - but it feels weird that i can load the docker image using the docker socket directly, but i can't export to the ./bin directory 🤔

#

it feels like i want a WriteableDirectory type or something to be able to directly write to from a function

#

this is somewhat neater with the shell fyi - but it still feels like a bit of an api disparity - one of these functions can write stuff to the host, one of them can't

final star
#

til withunixsocket, last time i tried to kill hack i was put off by the fact i still needed a script to orchestrate the load-to-docker on the host (figured this was probably mitigated by shell, but i didn't even consider using the docker socket)

#

works on macos partycat

tepid nova
tepid nova
#

<@&946480760016207902> I messed up and merged a PR with a committed binary. I want to fix it before too many people pull the tainted history. Do I have your blessing to force-push an amended version of that commit?

spark cedar
#

👍 sounds good! i've got a local copy of the latest state of main as well, so worst case, i can push that if we make a mistake 🙂

tepid nova
#

Rewriting history

#

⚠️ attention maintainers & contributors ⚠️ we had to rewrite history on main. If you pulled main in the last 12 hours, you may have failures next time you pull. If that's the case, you'll have to delete the old main branch:

git checkout <another-branch>
git branch -D main
git fetch origin
git checkout main

Sorry about the inconvenience

storm wind
#

⚠ **attention maintainers &

tepid nova
# spark cedar inspired by this: https://github.com/dagger/dagger/pull/9124

Continuing the chain of inspiration 🙂

#!/bin/bash

set -ex

SOURCE=$1
if [ -z "$SOURCE" ]; then
        echo >&2 "Missing source"
        exit 1
fi

TMP=$(mktemp -d)
echo "Installing from $TMP"
cd "$TMP"
dagger shell -i -m "$SOURCE" <<'EOF'
        container |
        from index.docker.io/docker |
        with-unix-socket /var/run/docker.sock $(host | unix-socket /var/run/docker.sock) |
        with-workdir /root/dagger |
        with-directory . $(dev-export --platform=current) |
        with-env-variable COMMIT $(.deps | version | git | head | commit) |
        with-new-file load.sh 'docker tag $(docker load -i engine.tar | sed -n -e "s/^Loaded image ID: //p") registry.dagger.io/engine:$COMMIT' |
        with-exec sh,load.sh |
        file dagger |
        export ./dagger
EOF
civic yacht
tepid nova
#

Using a recent dev build of dagger, all of sudden I'm getting the same error loading pretty much any module:

error: parse selections: parse field "pin": ModuleSource has no such field: "pin"

tepid nova
tepid nova
#

OK I think I'm screwing up somewhere in my technique for building & installing dev engines...

#

How is this possible (fresh out of a dev-export):

./dagger version
dagger v0.15.0-241209202543-8dac6d5f8db7 (registry.dagger.io/engine:eb738ebe8bf53a80f8061d377dd04934e1489fce) darwin/arm64

--> note the different commit IDs for CLI (8dac6d5f8db7) and engine image (eb738ebe8bf53a80f8061d377dd04934e1489fce). what's up with that???

#

this is from:

dagger call -m github.com/helderco/dagger@helder/dev-4805-shell-navigation-model dev-export --platform=current export --path=.
#

The commit I'm building from is indeed 8dac6d5f8db7. So why won't the CLI build inject that same commit into its engine image ref?
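
For spotting this kind of mismatch quickly, here is a rough sketch that pulls both commits out of a `dagger version` line and compares them. It assumes the line is shaped like the one pasted above (a dev version ending in a 12-character commit, then the engine image in parentheses); check_version_match is a made-up name, not a dagger command.

```shell
# Hypothetical helper: compares the CLI commit and the engine image tag
# found in a `dagger version` line shaped like the one pasted above.
check_version_match() {
  line=$1
  # 12-hex-char commit suffix of a dev version string
  cli=$(printf '%s\n' "$line" | sed -n 's|^dagger v[^ ]*-\([0-9a-f]\{12\}\) (.*|\1|p')
  # tag of the engine image inside the parentheses
  engine=$(printf '%s\n' "$line" | sed -n 's|.*(registry\.dagger\.io/engine:\([^)]*\)).*|\1|p')
  if [ -z "$cli" ] || [ -z "$engine" ]; then
    echo "unknown"
    return 0
  fi
  case "$engine" in
    "$cli"*) echo "match" ;;
    *) echo "mismatch cli=$cli engine=$engine" ;;
  esac
}

check_version_match "dagger v0.15.0-241209202543-8dac6d5f8db7 (registry.dagger.io/engine:eb738ebe8bf53a80f8061d377dd04934e1489fce) darwin/arm64"
```

Release versions without a commit suffix come back as "unknown" here, since there is nothing to compare.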

tepid nova
#

I'm very confused right now

#

@civic yacht sorry to bother you... every other core maintainer is either sleeping, on vacation or sick 😅 am I doing something very stupid above?

#

My theory:

  • when building CLI from commit foo, it will attempt to download an engine from registry.dagger.io/engine:foo.
  • By loading the corresponding engine.tar into docker then tagging it as registry.dagger.io/engine:foo, I can trick the dev CLI into using the correct dev engine

Reality:

  • The CLI from commit foo attempts to download an engine from registry.dagger.io/engine:bar, where bar is a commit on main
  • Therefore I don't know how to reliably inject my engine.tar
civic yacht
# tepid nova How is this possible (fresh out of a `dev-export`): ``` ./dagger version dagger...

Yeah for a local build of a CLI (i.e. not an official release or a build of the CLI off of main), there's no published image for the corresponding engine for us to default to (since it's all based off local code). I guess the current behavior is to default to the engine image corresponding to main, which isn't entirely unreasonable since that's the "closest" image that's actually published somewhere, but it's not going to be a dev engine built off of your branch by default.

But what you typically do in this situation is tell the CLI which engine to connect to via _EXPERIMENTAL_DAGGER_RUNNER_HOST. So you'd need to build the dev engine and load it into docker and then point to it with that env var.

tepid nova
civic yacht
#

Justin's PR does that and defaults the container in docker to "dagger-engine.dev" (but can be overridden via the name arg), so you'd need to set _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://dagger-engine.dev
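
As a small sketch of that setup: the helper below only sets the env var, assuming a dev engine container is already running in docker under the given name. use_dev_engine is a made-up function name; the default container name is the one mentioned above.

```shell
# Hypothetical helper: point the CLI at an already-running dev engine container.
# Defaults to the "dagger-engine.dev" container name; pass another name to override.
use_dev_engine() {
  name=${1:-dagger-engine.dev}
  export _EXPERIMENTAL_DAGGER_RUNNER_HOST="docker-container://$name"
}

use_dev_engine
echo "$_EXPERIMENTAL_DAGGER_RUNNER_HOST"
# After this, a locally built CLI run from the same shell should connect to
# that container instead of pulling a published engine image.
```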

tepid nova
#

I guess what I need to do to simplify install, is parametrize my build to inject the registry image I want in the CLI binary

#

looks like that's not exposed in dev-export though

civic yacht
#

(probably, going off memory here)

tepid nova
#

yeah I'm familiar with the build logic because I recently refactored it all to use github.com/dagger/dagger/modules/go

#

now there's a nice clean values argument which abstracts away the -X ldflags 🙂

#

What's missing is an optional argument in github.com/dagger/dagger/cmd/dagger.Binary() to override the image tag

#

when you're trying to build a go binary, but you have a dependency on the Dagger Python SDK and pythonhosted.org is down...

#

(one nice benefit of splitting up modules: github.com/dagger/dagger/cmd/dagger doesn't have the python dependency, and therefore it still builds.)

#

@civic yacht OK I think I managed to hack it just with the dagger CLI 🙂

fair ermine
spark cedar
fair ermine
spark cedar
#

Thanks!

final star
#

@civic yacht is there a quick way to totally invalidate engine caches? the WithEnv(CACHE_BUST) thing prolly works fine for my current application, but im curious if there's some hack to throw out the entire engine cache without rebuilding it from scratch

tepid nova
#

there used to be a dagger --no-cache

civic yacht
final star
#

guy_fieri_chef_kiss slick and easy

#

interesting that it's undocumented afaict?

#

(this is me implicitly offering to document lol)

#

unless there's a reason we dont wanna

civic yacht
storm wind
#

Continuing the chain of inspiration 🙂

final star
tepid nova
#

I think I found a regression: loading a module without an sdk field now fails, but used to work

civic yacht
tepid nova
civic yacht
spark cedar
#

v0.15.1 🎉

final star
#

idk how much y'all macos engine devs have played with different container runtime provisioners, i was getting fed up with orbstack producing untrustworthy container stats, and started playing with colima again... it turns out, once you start your colima VM with enough memory and enable rosetta, by my cursory measurements it's significantly faster than orb for ./hack/dev builds, like 20%+, both with a cold cache or a warm one

colima start --vm-type=vz --vz-rosetta --cpu=8 --memory=16 --disk=500
Colima vs Orb

(Don’t know how warm) Orb dev took 2m 13s

Orb, pruned cache, unlimited cpu: 3m 49s

Pruned cache. 8cpu/16ram:

Colima dev took 2m 48s

Orb dev took 3m 26s

Colima dev took 3m 8s

Warm cache, 8cpu/16 ram:

Orb dev took 33s

Colima took 25s

Warm cache, change to bk worker.go, 8cpu/16 ram:

Orb took 57s

Colima took 46s
#

extremely bullshit benchmarking, fwiw, single runs here, but still
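
For the arithmetic behind the "20%+" ballpark, a quick sketch using the single-run timings above, converted to seconds. speedup_pct is a made-up helper; it reports how much less time the second number is relative to the first.

```shell
# Hypothetical helper: percent time saved going from $1 seconds to $2 seconds.
speedup_pct() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

speedup_pct 206 168  # pruned cache: Orb 3m26s vs Colima 2m48s -> 18.4
speedup_pct 33 25    # warm cache: Orb 33s vs Colima 25s       -> 24.2
speedup_pct 57 46    # warm cache + worker.go change: 57s vs 46s -> 19.3
```

With one run per configuration these are only rough indications, not a measured average.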

civic yacht
# final star idk how much y'all macos engine devs have played with different container runtim...

Interesting, curious what's making a difference there. I've always wondered about the connection from the CLI -> engine for these macos cases, since there's a lot of data that needs to cross a VM boundary, which could plausibly be more/less efficient for different underlying implementations. But also a zillion other things that could impact this.

(Realize that's probably not a rabbit hole worth going down at the moment amongst all possible rabbit holes 😄 , good to know either way)

final star
#

i have the fridays rn so i'm playing with tools when i told myself i would backfill tests... the other thing i wanted to make work was colima/containerd/nerdctl instead of docker, but seems like we might rely on some specific docker cli output from docker load that's ever-so-slightly different with nerdctl

dev
~/src/dagger/.dagger/mage ~/src/dagger
> dagger call dev-export --platform=darwin/arm64/v8 export --path=/Users/braa/src/dagger/bin
✔ connect 12.8s
✔ load module 1m18s
✔ parsing command line arguments 0.0s

✔ daggerDev: DaggerDev! 11.0s
✔ .devExport(platform: "darwin/arm64/v8"): Directory! 1m10s
✔ .export(path: "/Users/braa/src/dagger/bin"): String! 1.1s

/Users/braa/src/dagger/bin

Full trace at https://dagger.cloud/dagger/traces/8bf485b80d073ef10f90c6bbe251b4d8
Error: unexpected output from docker load: unpacking overlayfs@sha256:c1e560a0d2a6b878a25bb361bb208c5e7996c42e2b700c45c392e0df7ba67999 (sha256:c1e560a0d2a6b878a25bb361bb208c5e7996c42e2b700c45c392e0df7ba67999)...
Loaded image: overlayfs:

exit status 1
~/src/dagger
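
To illustrate the output difference, here is a sketch of a more tolerant parser for load output. It is not the parsing the dev tooling actually does, and parse_loaded_image is a made-up name. It accepts both the "Loaded image ID: ..." and "Loaded image: ..." line shapes and ignores other lines such as the "unpacking ..." one above.

```shell
# Hypothetical helper: read `docker load`-style output on stdin and print the
# first image reference or ID it reports. Lines in other shapes are skipped.
parse_loaded_image() {
  sed -n -e 's/^Loaded image ID: //p' -e 's/^Loaded image: //p' | head -n 1
}

printf 'Loaded image ID: sha256:abc123\n' | parse_loaded_image
printf 'unpacking something...\nLoaded image: example.test/engine:dev\n' | parse_loaded_image
```

Note that in the nerdctl run above the reported name is just "overlayfs:", so even a tolerant parser would still need a usable reference from somewhere else.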
tepid nova
tepid nova
tepid nova
meager summit
#

Hi @spark cedar , when we call dagger call some-fn-that-returns-container up, does the last up part start a different server? What I am observing in the logs is that the view is v0.14.0 (from dagger.json) up until the function returns, and then it is v0.15.2.

spark cedar
#

yes

#

so each caller gets a different server

#

the up here is actually a different server, because it's being called from the cli

#

imo - this is expected - we should not attempt to fix this

#

we should fix the case where Container.up is called from another module

#

but from the cli, it should follow the new version - the reason is, if we follow the design of "use the version returned by the last thing", it suddenly breaks all sorts of things - e.g. it means that you can't Container.terminal if a module uses a super old version of dagger when we modified it to make it chainable

#

when you call up on the cli - it should use the latest api - regardless of who returned it.
we shouldn't do something clever and try and use the version declared by the module - it hugely complicates the version matrix, i'm really hesitant to make versioning even more complex than it already is

#

hm, well maybe i'm wrong here, up isn't actually versioned

#

(for AsService)

#

maybe we should jump into #911305510882513037 since i feel unsure of what's actually going on here

#

Container.up breakages

meager summit
#

Give me 2 mins

spark cedar
civic yacht
#

@obsidian rover I'm gonna need to update all the git repos we use in our integ tests, I found most of the creds but not the one for "Azure DevOps public", is that hiding somewhere or do I need to go through you?

The context is sort of funny: I have had to re-work a few parts of dagql for this PR and saw that this integ test for custom SDKs started failing. Turns out, that test should have started failing a long time ago because the type of the introspectionJson arg changed from string->File, but for whatever reason it kept running without an error. I'm not even 100% sure yet what I did to cause it to fail correctly, so I guess I just accidentally fixed a bug but now have to deal with the consequences of updating all the integ test repos 😂

spark cedar
obsidian rover
#

checking 👀

spark cedar
#

Since currently my pipeline is @ Guillaume, pls pls help me

obsidian rover
obsidian rover
spark cedar
obsidian rover
wet mason
wet mason
wild zephyr
#

Private packages in modules 🧵

spark cedar
spark cedar
#

through an absolutely incredible coincidence, we haven't been importing this package (which sets the rules up) - so we haven't actually been validating any of these 😱

spark cedar
#

and when you enable it you get such delightful errors: Fragment "TypeRef" cannot be spread here as objects of type "__Type" can never be of type "__Type".'

#

ah of course, __Type is not __Type, makes perfect sense

final star
#

GraphQL Introspection is often weird but that’s extra weird

civic yacht
civic yacht
#

@Erik Sipsma is it intentional that we

tepid nova
tepid nova
#

I would love to hear everyone's ideas for "a config file for modules"... It came up earlier with @wet mason (in the context of secrets providers, eg. "can they be designed independently"). I personally really feel the need for it... @upbeat hare I know you guys have your in-house equivalent, I believe a yaml file? Are you happy with your current design? Could it be generalized to being a "config file for my module"?

#

Basically I want the equivalent of an .env file but for my module. Perhaps we could use an actual .env file? But it may not be structured enough?

tepid nova
tepid nova
#

I'm doing something wrong in my custom otel spans, but I'm not sure what... These │ ✔ Missing.plaintext: String! 0.0s are mysteriously appearing... Does that ring a bell?

upbeat hare
# tepid nova I would love to hear everyone's ideas for "a config file for modules"... It came...

👋 We do have our in-house config file for a contextual module in yaml, yes.
We are somewhat happy with the design 😅 , our yaml config file generation/parsing moved into its own module as the complexity began to increase, and consuming any field of the config now is a bit sad as it needs error checking for every field

config := dag.Config("file.yaml")
version, err := config.Version()
if err != nil { ... }

Having said that, having a way to pass values to a module through a .env would be quite helpful.
At the moment, what we've seen in our consumers is that they wrap dagger calls around a makefile, to avoid passing a bunch of input parameters that never change
Having a .env will likely render makefiles irrelevant to an extent (for the cases where they are only used for passing inputs, which we've already seen)

wet mason
#

Did we remove the ability to start a graphql server?

tepid nova
# wet mason Did we remove the ability to start a graphql server?

I think dagger listen still works, are you having trouble with the token?

This worked for me in the shell:

server() {
 container |
 from alpine |
 with-file /bin/dagger $(github.com/dagger/dagger/cmd/dagger | binary) |
 with-exposed-port 8080 |
 with-env-variable DAGGER_SESSION_TOKEN onedag |
 with-default-args -- dagger listen --listen 0.0.0.0:8080 --allow-cors |
 as-service --experimental-privileged-nesting
}

client() {
 container |
 from alpine |
 with-file /bin/dagger $(github.com/dagger/dagger/cmd/dagger | binary) |
 with-exec apk add curl |
 with-service-binding dagger $(server)
}

client | with-exec curl http://onedag:@dagger:8080/query | stdout
final star
#

i've noticed some behavior messing around with TestModule locally that smells like an engine memory leak... i run tests and regularly prune the cache between runs, but after 4-5 of these pretty heavy, 8-15m long test runs, im fairly certain i consistently see the engine getting OOM'd -- no panic logs make it to docker, the engine process just disappears and my tests start 502'ing

#

annoyingly this would take an hour+ to repro via scripting if my hypothesis is correct, so curious if anybody knows how to catch evidence of the OOMkill after-the-fact

#

(made extra annoying by colima/macos virtualization lol)
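
One possible after-the-fact check, as a sketch only: the Linux kernel log usually records OOM kills, so filtering dmesg output from wherever the engine actually runs (the VM, in the colima case) may show it. find_oom_kills is a made-up helper and the sample line is illustrative, not a real log. Separately, if the container still exists, docker inspect exposes a .State.OOMKilled field.

```shell
# Hypothetical helper: filter kernel log text on stdin for OOM-kill records.
# Returns non-zero when nothing matches (standard grep behaviour).
find_oom_kills() {
  grep -E 'Out of memory: Kill(ed)? process|oom-kill:'
}

# Illustrative input only; real use would pipe in dmesg output from the
# machine or VM that runs the engine.
printf '[1.0] boot ok\n[999.9] Out of memory: Killed process 4242 (dagger-engine)\n' | find_oom_kills

# If the engine container has not been removed yet:
#   docker inspect --format '{{.State.OOMKilled}}' dagger-engine.dev
```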

final star
#

circling back on memory utilization, putting #s to things before moving on - all numbers here are me looking at process-specific dev engine memory utilization via btop:

  • a fresh dev engine uses like 85M.
  • running a slightly-trimmed version of ModuleTest (PR forthcoming, there are 4 tests in here that should be treated as benchmarks and run in serial/isolation)
    • it'll peak at ~9.3G memory utilization, then after tests are complete fall back down to ~8.4G at rest...
    • rerunning the same suite will make it climb even higher. with the warmed cache it climbs up to 11.2G during the run, 10.6G after
    • pruning and rerunning, during the prune we get up to 11.2g, then during the run up to 13G before getting OOMkilled (my box has 16G total, 15.5 userspace, at least 1G of which is consumed by the released engine at any given time)
#

smells like a leak, no? i should prolly look at CI engine #s to see if they crawl upwards in the same way...

final star
#

in CI, things look flat, but it's likely we paper over the hypothetical leak by running dagger-in-dagger each time

tepid nova
wild zephyr
#

Does anyone know if there's a non-hacky way to get the client sessionID in userspace? I can't seem to find anything in the SDK and/or generated code that exposes that. Silently pinging @still garnet @spark cedar @civic yacht @paper epoch

the reason why I'm trying to do this is so I can reference a service FQDN by just knowing its hostname

hexed portal
#

Hey everyone, I want to reach out regarding the issue https://github.com/dagger/dagger/issues/6990. The company I work for needs to have an option to mount the volume from the host, and it is currently not supported.
I would love to work on that feature if you agree, and I did the initial investigation. My thoughts are that having run options that would be passed to containerd will probably be a good approach. When specifying a container, run options could be passed either through a new function, or through functional options for Up, Stdout, etc. I would personally prefer the new function, but I don't have a strong opinion.
I'm happy to work on a proposal, POC or have a complete PR for a review. However, I guess the proposal route would be better since the API would be the most important thing to do right.
Please let me know if you are open for this, so I can start working on it right away :).

GitHub

What are you trying to do? I want to be able to make changes to code locally and see them reflected in my running Dagger services similar to how docker-compose and docker run --v works. In particul...

final star
# wild zephyr Does anyone know if there's a non-hacky way to get the client sessionID in users...

the closest thing i can find via grep is potentially a way to set it SessionID? I vaguely recall someone asking a similar question a while back...

i imagine using the helpers around contexts and ClientMetadata in engine/opts.go is probably hacky for whatever application you're looking at, though...

GitHub

An engine to run your pipelines in containers. Contribute to cwlbraa/dagger development by creating an account on GitHub.

hexed portal
#

[WIP] dagger watch by aluzzardi · Pull R...

hasty basin
final star
# hasty basin I was just wondering about Orbstack this morning. Nice to see this testing 🙂

annoyingly there does seem to be a bit of a performance vs stability tradeoff between orb and colima... i've somehow broken the colima network 3-4 times since doing all that perf testing and I don't think i've ever broken the orb equivalent. restarting the vm fixes it though.

there's also some memory_commit config that you gotta add to colima to keep redis from whining.

i have MACOS_ENGINE_DEV.md in my TODO to document this in a less ephemeral spot...

storm wind
#

What I'm proposing is:

tepid nova
hushed wyvern
#

Hello everyone,
I'm new to Dagger and not sure if this is the place to discuss issues for a newbie like me. Currently, I'm trying to expose ports as a service for the development cycle, and I have a few questions that I hope can be answered:

  1. I'm not sure whether using Dagger to replace Docker Compose to run services for the development process is a good idea or not. For example, running MySQL, Redis, Memcached, MongoDB, phpMyAdmin (all the services that my application depends on).
  2. Is there any document describing how to start them in parallel and maintain persistent data during development? I'm trying to do as shown in the code below, but it seems that the exposed services cannot be accessed by their ports from the host machine. Is there anything wrong with my approach?
    https://gist.github.com/PhuongTMR/c4eca5508d976a189b0eca5d094ac379
hushed wyvern
#

Hello everyone,

leaden glade
#

While trying to understand why my PR (https://github.com/dagger/dagger/pull/9322) build is failing, I enabled (thanks @spark cedar ) the new merge preview on GitHub to get the link to the dagger cloud traces. But when I follow the link (https://dagger.cloud/dagger/traces/1e2071af9aa48db6c40aa171f4ae9761) I have a 500
If I try to add a v3. in front (just in case 😅 ) I have a different kind of error

GitHub

Javadoc comments are wrapped between HTML <p> tags. In this case & is not a valid entity and generates the following error:
error: bad HTML entity

  • <p>&lt...
wild zephyr
#

maybe it was a temporary hiccup?

leaden glade
wet mason
#

@civic yacht @spark cedar @rancid turret @tidal spire 👋

Working on cleaning up the new secret providers API

For context: the current API is SetSecret("name", "plaintext"). The new API (in its current, POC state) is MapSecret("vault://foo/bar") (returns a dagger.Secret as well).

Main difference is: 1) Secrets don't have names anymore 2) Can't set plaintext value 3) Secrets are mapped to a URI

I want to rename MapSecret to something else.

Option 1: NewSecret

Option 2 (my favorite, because more consistent, but breaking): Secret.

We have dag.Container to create a new container, dag.Directory ... it kinda makes sense to have dag.Secret to create a new secret (e.g. foo := dag.Secret("vault://foo/bar")). However, Secret() already exists and it's used to look up a secret by name (which doesn't make sense in the new "world" since secrets don't have names).

Thoughts on this one? IMHO Secret is the consistent answer, however if we'd rather not break things, NewSecret is a fallback. Is secret lookup by name really used?

rancid turret
#

Secret providers API

tidal spire
#

Running dagger functions in dagger/dagger

dagger functions
✔ connect 0.9s
✔ load module 3m56s

what can we do to make this better?

civic yacht
#

Running dagger functions in dagger/

spark cedar
#

hm, main is now failing with weird docker pull failures (the target pr i just merged was green before i clicked merge, i suspect something environmental?)

#
1   : [53.2s] | Error: failed to serve module: input: moduleSource.withContextDirectory.asModule failed to create module: select: failed to update module dependencies: failed to initialize dependency modules: failed to initialize dependency module: select: failed to create module: select: failed to update module dependencies: failed to initialize dependency modules: failed to initialize dependency module: select: failed to create module: select: failed to update codegen and runtime: failed to generate code: failed to call sdk module codegen: select: failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/library/composer/blobs/sha256:3ae0d9dfc4dada15e6a030ba7b9c9a3b16f9f5a7597a4d46ff24226e91b91db7: 403 Forbidden
1   : [53.2s] | 
#

i can access the library/composer image 👀

#

have we changed something docker hub creds related? @astral zealot @stray heron

stray heron
#

Did you try pulling the image locally? What happens if you delete it & loop retry?

spark cedar
#

Works perfectly locally 🤔

final star
tepid nova
wet mason
#

is it me or CI seems dead?

#
7   : │ │ │ │ exec docker pull registry.dagger.io/engine:c427926cc12fc0b01d1ad894e810dc6bb6948353 ERROR [0.7s]
7   : │ │ │ │ ! failed to run command: exit status 1
7   : │ │ │ │ [0.7s] | Error response from daemon: manifest unknown
#

not dead, but test provision is failing because of a missing tag

#

also this:

                +2025/01/09 XX:XX:XX WARN failed to set up OTel resource error="1 errors occurred detecting resource:\n\t* conflicting Schema URL: https://opentelemetry.io/schemas/1.25.0 and https://opentelemetry.io/schemas/1.26.0"
#

I see main is green, rebasing ...

wet mason
#

yep that fixed it

wet mason
#

actually no, php still failing (it's been a while). I see you did a php change @spark cedar, is it supposed to have fixed CI or is that unrelated?

spark cedar
#

mmm this is a known flake

#

i'll open an issue so we can track it 🙏

#

php-dev constantly failing was the fix i worked on

#

this one is unrelated 😢

final star
#

sooo say i'm inside of a go test process running inside dagger, and im adding otel spans for better obs... if i want to get git metadata about the repository, how might i do that? seems like we strip out the .git directory, but is there a reverse api where i can ask the engine what SHA i'm on?

tepid nova
final star
#

the same in this case

#

my initial approach didn't involve module code fwiw, that's what i'm gonna try next. i was just inside core/integration/testctx.go

#

but if i can get the git info from the dagger/.dagger module and pass it down to the go test process that would at least solve the problem (although it wouldn't be as easily reusable from the testctx-as-a-library-others-might-use perspective)

tepid nova
#

Trick question: if I run a container with dagger, and inside that container I run the dagger CLI and expose ports on the "host" (my container)... Is there any way to dynamically inspect those ports from the top-level client?

#

eg.

#!/usr/bin/env dagger shell

INNER=$(
  container |
  from alpine |
  with-file /bin/dagger $(github.com/dagger/dagger/cmd/dagger | binary) |
  with-exec --experimental-privileged-nesting dagger shell -c 'container | from nginx | with-exposed-port 80 | up'
)

# FIXME: how to introspect ports forwarded by the inner "up" and expose them here?
# (assuming I don't know them in advance)
<something something> $INNER | up
leaden glade
#

While working on the Java SDK, I'm curious about your workflows.
For instance, I want to run dagger functions in a Java module. So for that I need the runtime module.
Once I'm inside a dev engine (I'm doing that with ./hack/dev zsh if that's the right way) is there an easy way to rebuild the runtime module? (in my case the sdk/java/runtime I'm working on) Something quicker than to rebuild the full engine.
Maybe there's a dagger command for that I haven't found?
And second question, is there an easy way, once it's built, to explore this runtime module content? (the container returned by moduleRuntime)

meager summit
#

Hi @spark cedar, re: windows terminal issue, when I run echo ".core | container | from alpine | terminal" | dagger shell, it works fine, but when I type in exit, the terminal clears and stdout is NOT restored until I press any key.

does that ring any bell?

tepid nova
#

Anything we can do about that dagger CLI build being not fully cacheable?

civic yacht
#

so first step is rm'ing SetSecret, which is the biggest blocker by far, after that it's just a few more adjustments

tepid nova
#

It looks like it's wolfi stuff

#

Called twice, once by dagger/dagger/version and another time by dagger/dagger/modules/go, which somehow makes a big difference.

Besides function call caching, isn't there something we could do in the implementation of wolfi/apko itself? I see fresh http requests to the wolfi registry every time, maybe we could pin versions or digests or something at the wolfi level?

#

maybe vendor an apko index file or something?

#

I'll look into it

#

(unless you tell me not to bother 🙂 )

civic yacht
#

I think something like that could help in theory, provided the apko lib supports it or we can implement that ourselves. Specifically that might help avoid extra requests happening in this chunk of code: https://github.com/sipsma/dagger/blob/9a18a0bd7f346ad9a9c2d05e2d7ac77999d3051d/modules/alpine/main.go#L99-L136

But the majority of the http steps you're seeing in that trace are from here: https://github.com/sipsma/dagger/blob/9a18a0bd7f346ad9a9c2d05e2d7ac77999d3051d/modules/alpine/main.go#L207-L207

Which is just the raw buildkit HTTP source op. It should just be making an HTTP HEAD request, get an etag and see it already has the download cached and re-use that. Those also run in parallel so the overhead is just a bunch of HEADs, so hopefully not adding tons of time. It does look like buildkit's HTTP source implementation supports some fast paths if you specify an extra Checksum opt, so maybe there's a path to using that? But would require core API changes to dag.HTTP
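
For reference, a minimal sketch of the etag revalidation idea described above. It assumes a plain net/http client and a throwaway httptest server, not buildkit's actual HTTP source implementation, and it uses a conditional GET with If-None-Match rather than a HEAD. The names cachedFetcher and newIndexServer are made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// cachedFetcher (hypothetical) remembers the last ETag and body per URL.
type cachedFetcher struct {
	etags  map[string]string
	bodies map[string][]byte
}

func newCachedFetcher() *cachedFetcher {
	return &cachedFetcher{etags: map[string]string{}, bodies: map[string][]byte{}}
}

// fetch returns the body and whether it was served from the local cache.
func (c *cachedFetcher) fetch(url string) ([]byte, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	// Ask the server to skip the body if our cached copy is still current.
	if etag, ok := c.etags[url]; ok {
		req.Header.Set("If-None-Match", etag)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotModified {
		return c.bodies[url], true, nil
	}
	if resp.StatusCode != http.StatusOK {
		return nil, false, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, false, err
	}
	if etag := resp.Header.Get("ETag"); etag != "" {
		c.etags[url] = etag
		c.bodies[url] = body
	}
	return body, false, nil
}

// newIndexServer (hypothetical) serves one resource with a fixed ETag.
func newIndexServer() *httptest.Server {
	const etag = `"v1"`
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", etag)
		fmt.Fprint(w, "APKINDEX contents")
	}))
}

func main() {
	srv := newIndexServer()
	defer srv.Close()

	f := newCachedFetcher()
	for i := 0; i < 2; i++ {
		body, cached, err := f.fetch(srv.URL)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s (cached=%v)\n", body, cached)
	}
}
```

The second request still costs a round trip, which matches the point above: revalidation is cheap but not free, so a checksum-based fast path would only save the remaining request overhead.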

#

Hard to say if all that's worth it since it would be almost entirely invalidated once function caching is a thing, which is hopefully not too far away thanks to the new secret stuff Andrea is doing

tepid nova
#

nice, thanks for the details!

spark cedar
spark cedar
#

hallo y'all - was curious if i could get a little discussion started on https://github.com/dagger/dagger/issues/8421 again - i came up with a use case where it's quite painful to work against this, so was hoping to try and work on a fix at some point

#

specifically @fair ermine and @rancid turret since I'm struggling with how you'd express this type-safely in typescript and python

fair ermine
final star
#

i'm real close to done with the initial benchdev PR, I've got 1 TODO left: make some call on how to handle caching for these runs.

summarizing the change: benchmarks are doing dagger call test all --bench and running on compute that's guaranteed 1 run per node. you can trigger them pre-merge with a "benchmark" PR label, and then they run on cron against main. each benchmark runs in serial so as to avoid x-bench pollution. in follow-up PRs im gonna try to add the span data needed to graph individual benchmark durations over time on main, and for pre-merge runs dagger cloud displays the spans

@still garnet @civic yacht seeking opinions: ideally individual go test benchmarks should be more order-agnostic than they are right now, but we're also building a dev engine to run against, and I don't particularly want to prune the whole engine cache because that can be slow. any bright ideas for the best way to set up bench tests so the first one isn't consistently slower than the rest?

spark cedar
#

have we changed anything particularly significant? or has this always been the case?

final star
spark cedar
#

hmmm poking around honeycomb actually indicates it's been about like this for a while

#

so ignoring the massive spike:

#

all of the little bumps are at about the 20minute mark

#

and the bottom baseline is about 10minutes

#

maybe i'm honey-combing wrong tho

final star
#

and you can see the trend you're describing, new year, around the 7th the peaks get taller (what got committed evening jan6, morning 7th?)

still garnet
#

Go 1.24 interactive tour

spark cedar
wet mason
#

@still garnet e.g. current API returns a "selector" with the accessor etc

Wondering if we could just return the Secret, with a reference to the URI and that's that

#

The other tricky part is leaking -- we don't want modules creating "root" secrets (e.g. file://...), instead scope them to their own module

still garnet
#

ah gotcha

wet mason
#

[/cc @civic yacht ^^]

#

I was thinking of leaving it like that, and when we do remove the "old secrets API", we might revisit that

civic yacht
# wet mason [/cc @civic yacht ^^]

I'll take a look at the PR again closely today. There's some changes in my mostly unrelated PR that might end up simplifying things here by making it easier to control the caching around all this, we'll see

meager summit
#

for some reason when I am running dagger v0.15.2 commands on my windows laptop I am getting "exec format error", while v0.15.1 works fine.

meager summit
#

Hi Justin, would you have some time to discuss the sdk string to struct changes?

wet mason
#

@rancid turret @spark cedar 👋 I'm wondering what's the simplest way to introspect a module nowadays to list all available functions etc via the API?

#

(/cc @civic yacht)

#

back in the days there was a hidden __sdl in each module, and a way to programmatically load modules/introspect, not sure if that's still around

wet mason
#

working my way backwards from the CLI code

tepid nova
#

I know last year Helder had a reusable introspection module, I forget what it was called

wet mason
still garnet
#

probably just using the Dagger API, there's like CurrentModule.TypeDefs, Module.objects, etc. (or maybe Module.initialize.objects)

still garnet
wet mason
#

a module ref

wet mason
tepid nova
#

@wet mason you can look at the shell source code, it had to implement all this very recently, and it's encapsulated in a relatively condensed codebase

wet mason
tepid nova
#

Helder had to deal with the API sprawl and pull it together

wet mason
tepid nova
#

so maybe you can piggy back on that

wet mason
#

private helper code, like thousands of lines

tepid nova
#

It's just that dagger functions (and the rest of the CLI) is huge and complicated

#

and shell was an opportunity to clean up and simplify with a smaller footprint and less baggage

#

might work to your advantage

wet mason
#

wondering if that could have been moved up the stack, directly into the API, rather than the client itself

still garnet
#

should we expose a Module.schema: JSON (or something) that returns its GraphQL schema in the "standard" format?

#

then you'd just do one big json.Unmarshal or whatever
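
A rough sketch of what that "one big json.Unmarshal" could look like, assuming the schema came back in the standard GraphQL introspection shape (`__schema.types[].fields[]`). Module.schema does not exist today, and the sample JSON and the Publisher type are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// introspection mirrors only the slice of the standard GraphQL
// introspection result that is needed to list an object's functions.
type introspection struct {
	Schema struct {
		Types []typeDef `json:"types"`
	} `json:"__schema"`
}

type typeDef struct {
	Kind   string     `json:"kind"`
	Name   string     `json:"name"`
	Fields []fieldDef `json:"fields"`
}

type fieldDef struct {
	Name        string `json:"name"`
	Description string `json:"description"`
}

// sampleSchema is made-up data standing in for a real module schema.
const sampleSchema = `{
  "__schema": {
    "types": [
      {"kind": "OBJECT", "name": "Publisher", "fields": [
        {"name": "publishFile", "description": "Upload one file"},
        {"name": "username", "description": ""}
      ]},
      {"kind": "SCALAR", "name": "String", "fields": null}
    ]
  }
}`

// functionsOf returns the field names of the named object type.
func functionsOf(raw []byte, typeName string) ([]string, error) {
	var doc introspection
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	for _, t := range doc.Schema.Types {
		if t.Kind == "OBJECT" && t.Name == typeName {
			names := make([]string, 0, len(t.Fields))
			for _, f := range t.Fields {
				names = append(names, f.Name)
			}
			return names, nil
		}
	}
	return nil, fmt.Errorf("type %q not found", typeName)
}

func main() {
	names, err := functionsOf([]byte(sampleSchema), "Publisher")
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```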

wet mason
tepid nova
still garnet
wet mason
tepid nova
still garnet
#

to be clear i don't think it's wrong at all that we have our own introspection API too, i think it just ends up being less convenient in some cases, and maybe we can add a helper

wet mason
#

Oh wow ... we actually have source mapping? e.g. function to source file/line

#

that'd be pretty cool to include in telemetry

still garnet
wet mason
#

neat

tepid nova
#

In theory we could even link back to the corresponding link on github, since we have references to the module source code and exact version too?

wet mason
#

yeah

#

or the opposite -- overlay the source code with telemetry data

still garnet
tepid nova
#

That may be true but also our SDKs are just not good at querying data in the Dagger API - for API introspection or anything else. So it would also be nice to fix that. Maybe it makes the need for a dump escape hatch less acute?

#

I run into this all the time for my own modules

still garnet
#

yeah true - gets back to one of the older bikesheds 😛

#

actually, that's kind of what i did with daggerverse at the beginning - i just did one big query and unmarshalled into my own struct type. maybe it still does that?

tepid nova
#

Just having a first class, well-documented GraphQL escape hatch in all SDKs would go a long way

still garnet
wet mason
#

sweet, thank you!

still garnet
#

pretty tedious writing the query + schema defs, but it's a viable escape hatch. would be nice if we could do something like https://github.com/shurcooL/graphql which infers the query from the struct
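
To make the "infer the query from the struct" idea concrete, here is a small stdlib-only sketch. It is not how shurcooL/graphql is implemented; queryFor and moduleQuery are hypothetical, arguments are not handled, and the field names are only illustrative.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// moduleQuery is a hypothetical result shape; the query is derived from it.
type moduleQuery struct {
	CurrentModule struct {
		Name    string
		Objects []struct {
			Name string
		}
	}
}

// queryFor builds a GraphQL selection set from the shape of v.
func queryFor(v any) string {
	var sb strings.Builder
	writeSelection(&sb, reflect.TypeOf(v))
	return "query" + sb.String()
}

func writeSelection(sb *strings.Builder, t reflect.Type) {
	// Look through pointers and slices to the element type.
	for t.Kind() == reflect.Ptr || t.Kind() == reflect.Slice {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return // scalars have no sub-selection
	}
	sb.WriteString(" {")
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// An explicit `graphql:"..."` tag wins; otherwise lower-case the first letter.
		name := f.Tag.Get("graphql")
		if name == "" {
			name = strings.ToLower(f.Name[:1]) + f.Name[1:]
		}
		sb.WriteString(" " + name)
		writeSelection(sb, f.Type)
	}
	sb.WriteString(" }")
}

func main() {
	fmt.Println(queryFor(moduleQuery{}))
}
```

The same struct can then be the json.Unmarshal target for the response, so the query and the result type cannot drift apart.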

wet mason
#

what's the cleanest module we have around, using the latest best practices (constructors etc) @still garnet @tepid nova ?

wet mason
#

by anyone

hasty basin
#

Daggerverse modules are in score order.

#

couple of good ones

wet mason
#

Thank you!

tepid nova
#

@wet mason I like github.com/dagger/dagger/modules/go it's pretty meaty and works well

#

recently refactored

#

& used in a real project (ours)

wet mason
#

Not a huge deal, but it gets noisy

#

I have a module with ONE useful function (for now), but when I introspect, I see 6-7 public functions (most of them are just my constructor fields, saved as Public fields)

wet mason
#

Same here -- publishFile etc are the meaty ones, but they get mixed up with e.g. username and password which are just the constructor arguments, stored inside public fields

tepid nova
#

Yeah I use +private a lot to hide fields.

#

(but sometimes I find it useful to keep them public, depends on how I want people to use the module)

wet mason
wet mason
#

@obsidian rover

failed to resolve source metadata: failed to get credentials: error getting credentials - err: signal: killed, out: ``

Does that ring a bell?

Happens when I try to pull docker images

wild zephyr
wet mason
wild zephyr
#

@wet mason re module introspection: actually @tidal spire and Guillaume have been working in a "loader" library which has been spawned from the daggerverse code. Maybe that's something useful: https://github.com/dagger/dagger.io/pull/4173

wild zephyr
tepid nova
sullen trail
#

Hey there.
Taking a shot at https://github.com/dagger/dagger/issues/7721 (tbh. with a limited amount of go knowledge). I think I got through passing the rewriteTimestamp:bool down from Container publishArgs/exportArgs to the actual publish operations in buildkit (making them actual properties on Container.export() and Container.publish()).

However, for passing SOURCE_DATE_EPOCH itself, I think it should probably not be picked up by the engine but by the client during session initialization (a "frozen" value that would then be used for all calls within that session), so wdyt? Would you prefer something where the epoch is:

  • picked up by the engine from the environment globally (IF the client is creating the engine, it would pass it from its environment as env)
  • read from env SOURCE_DATE_EPOCH during client creation and passed to the engine as optional ClientMetadata.SourceDateEpoch (similar to how CloudToken or DNT are passed at client creation time) <- my current preference
  • an explicit optional "Epoch" field on Container.withExec/export/publish Args ?
  • anything else?
GitHub

What are you trying to do? I'd love to have binary reproducible builds (motivation: https://reproducible-builds.org/) in dagger. Why is this important to you? Beside its nice security propertie...

fresh harbor
#

I'm trying to fix the "return list of containers" bug (https://github.com/dagger/dagger/issues/8202) with this PR https://github.com/dagger/dagger/pull/9425. It has a simple test case that covers the issue, but the fix feels a bit hacky to me. I'd appreciate it if anyone could guide me toward the right fix.

GitHub

What is the issue? From this snippet: $ cat main.go package main import ( "dagger/reproduce-bug/internal/dagger" ) type ReproduceBug struct{} func (m *ReproduceBug) Containers() []*dagger...

rancid turret
#

List of containers

coarse heron
#

Did you know that every SDK module contains the source files for every other SDK module? And for the whole Dagger engine and all of its tests, for that matter.

Give it a try: dagger core module-source --ref-string "./sdk/python/runtime" resolve-from-caller as-module source context-directory directory --path=sdk entries

I started a discussion about it on GitHub: https://github.com/dagger/dagger/discussions/9303

sullen trail
#

Reproducible Builds

spark cedar
spark cedar
tepid nova
#

If we wanted to get the filesystem state of a container-backed Service during or after execution, could we? ie. does buildkit allow getting that state at all?

civic yacht
tepid nova
#

Would it be cleaner to do after execution than during?

#

(I'm trying to imagine a possible "persistent services" feature where services could be configured to be persisted across engine restarts, or even engine migrations)

civic yacht
# tepid nova Would it be cleaner to do *after* execution than *during*?

No that would make it worse, the problem is just that the codepath we have to go through for services is something bk calls a "gateway container", which is different than the containers made for exec-ops. The codepaths there are pretty hardcoded to assume that the filesystems are ephemeral, so you would have to jump through a million hoops to trick everything to make them not-ephemeral. Sure it's possible some way or another but probably a nightmare.

It might be possible to change services to not use gateway containers and instead be exec-ops, but IIRC that's what @still garnet tried in the first services v1 impl but then hit weird buildkit scheduler bugs and some other problems I can't remember now and moved off of them. Some stuff has changed since then, but even if that's plausible now it would be a significant lift to make that move

civic yacht
#

To be clear, on a purely conceptual level this should all be totally possible, the only blockers are implementation details

tepid nova
#

I guess the escape hatch is cache volumes

civic yacht
tepid nova
#

(but in the future could be some other primitive we concoct, that also builds on buildkit bind mounts)

#

what about one-off exporting files from the gw container's fs into the DAG, kind of like Host.Directory() ?

#

so you wouldn't get the whole FS state as a Directory object, but you could query for some files to be exported, and under the hood it would be implemented with a bunch of non-atomic walk readfile etc. Maybe too hacky to be worth it

civic yacht
tepid nova
#

Makes sense. Dropping this particular angle 🙂

#

OK another question: is there any way for a Dagger module to receive a generic object (ie. an interface with no specified methods), and somehow introspect the whole schema of the concrete implementation?

#

And from there, build queries to call that object?

#

This 👆 is what it would take to turn any Dagger object into an agent, without having to patch the engine

still garnet
#

it's technically possible given the content/structure of IDs, but the APIs don't exist for it. IDs contain their type, and any modules they need, so you should be able to load/serve those modules and inspect the schema for its type

civic yacht
#

Yeah, as long as you are willing to be dynamic and not have codegen'd apis for calling it, that would be theoretically possible and supportable. But not really possible today most likely

#

I'm not even sure if interfaces would be the right approach there, sounds more like something different like a type called Dynamic that has generic methods for getting the apis and calling them in type-unsafe ways ("stringly typed" calls)

#

I'd want to be 100% sure that's the only option to support that use case before pursuing it but sounds plausible

tepid nova
#

well the alternative is to patch the engine...

civic yacht
#

Actually I'm pretty sure some early iteration of the module API had a field on Module named call(args: JSON): JSON which is fairly similar conceptually. I think I removed it because there was no use and it was becoming a pain to maintain atm, but maybe there's a use for it now

civic yacht
tepid nova
#

My use case would be to "agentify" any dagger object by plugging it into a llm. eg.:

func (m *Langdag) Agentify(env *dagger.Object, llm *dagger.Service, prompt string) *dagger.Object
civic yacht
#

Oh okay sure, basically just want to give up type safety in terms of your module code and do everything dynamically w/ the llm. Yeah, we'd just need a new type like Object that has fields for making calls with strings/json/whatever

#

I don't think it would be a huge effort to add support for that

#

Another angle would be to create a new SDK for this. If you implement an SDK you handle receiving args/returning values as raw JSON, which is close to what you're describing anyways. So that would actually be plausible today. Not convinced that's easier than us just adding support for dagger.Object but food for thought

tepid nova
#

Is there an escape hatch at the moment? 🙂

#

I'm trying to find out what's the path of least resistance for a POC:

  1. Module
  2. Engine patch (implement this in the core API)
  3. External tool
civic yacht
#

Well now that I think about it, today it is possible to pass arguments of type Module, and I guess that implies that in your module code you could do e.g.

func (m *Langdag) Agentify(mod *dagger.Module) {
   mod.Serve() // replaces your schema with the schema for calling the mod
   dag.GraphQLClient().MakeRequest(...) // make raw gql queries to the newly loaded schema, including introspection
}

Not 100% sure if it works but also can't think of why it wouldn't 😄

You can then pass around any other dynamic state as type string (including IDs serialized to strings)

#

That's my best stab at what's possible for a POC atm. But adding support for dagger.Object really doesn't strike me as that hard (often famous last words of course, but might actually be true), it wouldn't require anything very new because of what @still garnet mentioned about how IDs work

tepid nova
#

passing a dagger.Module doesn't get me much, because I can just load it myself from ref (Andrea's POC does that)

#

I was hoping that passing a dagger.Object allows me to setup my object state with complete freedom, eg. env := dag.Foo(src).WithToken(bar).WithSource(bla); dag.Agentify(env)

still garnet
#

Object does sound neat... would it have fields for introspection (typeName?), and a dynamic call API? (maybe a step too far: a asFoo field for every type?)

civic yacht
#

env.MarshalJSON() exists and would let you pass it as type string, would that work?

civic yacht
tepid nova
#

My fallback I think, is to use @wet mason 's current implementation, which is: 1) load module from ref + 2) introspect module constructor (and only module constructor) and try to infer what to pass, eg. secret argument bar of module foo is loaded from env variable FOO_BAR
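
The env-variable convention in step 2 could be sketched like this. Only the foo/bar to FOO_BAR mapping comes from the message above; the handling of dashes and other punctuation is a guess, camelCase is not split, and envVarFor and lookupSecret are made-up names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"unicode"
)

// sanitize upper-cases letters and digits and turns everything else into "_".
func sanitize(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return unicode.ToUpper(r)
		}
		return '_'
	}, s)
}

// envVarFor derives the variable name for a constructor argument,
// e.g. module "foo", argument "bar" gives FOO_BAR.
func envVarFor(module, arg string) string {
	return sanitize(module) + "_" + sanitize(arg)
}

// lookupSecret reads the inferred variable from the environment.
func lookupSecret(module, arg string) (string, bool) {
	return os.LookupEnv(envVarFor(module, arg))
}

func main() {
	os.Setenv("FOO_BAR", "s3cret")
	value, ok := lookupSecret("foo", "bar")
	fmt.Println(envVarFor("foo", "bar"), ok, len(value))
}
```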

#

ooh, I could pass a dagger shell script instead of a ref?

#

for custom init

#

.... which makes me think this could be an experimental shell builtin?

#

github.com/my/agent --token=FOO | with-home-directory ./home/dir | with-db tcp://localhost:4242 | .agentify

#
  • Arbitrary object construction ✅
  • Access to typedefs for introspection ✅
  • Access to query builder / client ✅
  • Avoid spelunking in the engine internals ✅
  • Avoid custom wrapper tool ✅
#

Downside: not programmable. So, not nestable

tepid nova
#

I'm looking for the implementation of the id() resolver for custom types in core... Any pointers would be appreciated 🙂

tepid nova
#

Ah, found a trail at core.ModuleObject.Install

#

Aha! Looks like dagql is where I might need to attach myself? Same level as ID()

#

@still garnet does it even remotely make sense to patch dagql itself, so that every object gets a Prompt() alongside an ID()? (where prompt is my POC implementation of introspecting that object's fields, and plugging them into a llm as tools)

#

@still garnet quick dagql question if you're around.

func (s *Server) installObject(class ObjectType) {
    class.Extend(
        FieldSpec{
            Name:           "prompt",
            Description:    "prompt a LLM with this object as environment",
            Type:           class.Typed(),
            ImpurityReason: "I have no idea what I'm doing",
            Args: []InputSpec{
                {
                    Name: "prompt",
                    Type: String,
                },
            },
        },
        func(ctx context.Context, self Object, args map[string]Input) (Typed, error) {
            promptArg, ok := args["prompt"]
            if !ok {
                return nil, fmt.Errorf("no prompt specified")
            }
            prompt := args["prompt"].???    // <------ 🤔
            ctx, span := Tracer().Start(ctx, "[👨] "+prompt)
            // insert magic here
            span.End()
        },
    )

How do I get the actual string value for the prompt arg ?

#

I'm going to try ToLiteral().Display() since it's the only method I can find that returns a string

still garnet
tepid nova
#

Noooo unrelated docker install fail 😭

Error: input: container.from failed to resolve image "docker.io/library/alpine:latest" (platform: "linux/arm64"): failed to resolve source metadata for docker.io/library/alpine:latest: DeadlineExceeded: failed to get main client caller: no active session for w450zguly0n08y955d3xkpwp0: context deadline exceeded
tepid nova
#

I may be jinxing it, but this feels like perhaps the perfect layer for my POC

#

@still garnet how do I declare that my argument is of type string? String gets a type error:

           Args: []InputSpec{
                {
                    Name: "prompt",
                    Type: String, // <-- wrong

                },
            },
still garnet
#

String("") should do

tepid nova
#

😢 😢 😢 😢 😢 😢

Error: Post "http://dagger/query": command [docker exec -i dagger-engine-v0.15.2 buildctl dial-stdio] has exited with exit status 137, make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=
Run 'dagger call dev --help' for usage.
tepid nova
#

back on track

#
⋈ directory | prompt hi
Error: input: directory.prompt panic while resolving Directory.prompt: runtime error: invalid memory address or nil pointer dereference

⋈ directory | .doc prompt
prompt a LLM with this object as environment

USAGE
  prompt <prompt>

REQUIRED ARGUMENTS
  prompt string

RETURNS
  Directory - A directory.

Use "prompt <prompt> | .doc" for available functions.

⋈ directory | prompt hi
Error: input: directory.prompt panic while resolving Directory.prompt: runtime error: invalid memory address or nil pointer dereference

goroutine 29708 [running]:
runtime/debug.Stack()
    /usr/lib/go/src/runtime/debug/stack.go:26 +0x5e
github.com/dagger/dagger/dagql.(*Server).resolvePath.func1()
    /app/dagql/server.go:689 +0x78
panic({0x21bfe20?, 0x3ca8f50?})
    /usr/lib/go/src/runtime/panic.go:785 +0x132
github.com/dagger/dagger/core.(*Directory).PBDefinitions(0x3cb2e00?, {0x29d3470, 0xc00065aa20})
    /app/core/directory.go:51 +0x2a
github.com/dagger/dagger/core.collectDefs({0x29d3470, 0xc00065aa20}, {0x29ac2a0?, 0x0})
    /app/core/telemetry.go:27 +0xa2
github.com/dagger/dagger/core.AroundFunc.func1({0x29ac2a0, 0x0}, 0x0?, {0x0, 0x0})
    /app/core/telemetry.go:139 +0x730
github.com/dagger/dagger/dagql.Instance[...].call.func1.1()
    /app/dagql/objects.go:441 +0x30
github.com/dagger/dagger/dagql.Instance[...].call.func1()
    /app/dagql/objects.go:474 +0x3f0
github.com/dagger/dagger/dagql.Instance[...].call(0x29fc880, {0x29d3470, 0xc00065a6f0}, 0xc0000c0cc0, 0xc000635740, 0xc00065a930)
    /app/dagql/objects.go:482 +0x229
github.com/dagger/dagger/dagql.Instance[...].Select(0x29fc880, {0x29d3470, 0xc00065a6f0}, 0xc0000c0cc0, {{0xc0005aa510, 0x6}, {0xc000999ba0, 0x1, 0x1}, 0x0, ...})
    /app/dagql/objects.go:387 +0xac5
github.com/dagger/dagger/dagql.(*Server).resolvePath(0xc0000c0cc0, {0x29d3470, 0xc00065a6f0}, {0x29d8460?, 0xc0006344c0?}, {{0xc0005aa510, 0x6}, {{0xc0005aa510, 0x6}, {0xc000999ba0, ...}, ...}, ...})
    /app/dagql/server.go:698 +0x154
github.com/dagger/dagger/dagql.(*Server).Resolve.func1()
    /app/dagql/server.go:457 +0x7e
github.com/dagger/dagger/dagql.(*Server).Resolve.(*ErrorPool).Go.func3()
    /go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/pool/error_pool.go:30 +0x23
github.com/sourcegraph/conc/pool.(*Pool).worker(0x1b99e74?)
    /go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/pool/pool.go:154 +0x69
github.com/sourcegraph/conc/panics.(*Catcher).Try(0xc000e30930?, 0xc0008727d0?)
    /go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/panics/panics.go:23 +0x42
github.com/sourcegraph/conc.(*WaitGroup).Go.func1()
    /go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/waitgroup.go:32 +0x4d
created by github.com/sourcegraph/conc.(*WaitGroup).Go in goroutine 29707
    /go/pkg/mod/github.com/sourcegraph/conc@v0.3.0/waitgroup.go:30 +0x73


⋈
#

progress!

tepid nova
#

current status: trying to understand the difference between Typed.Extend() and Fields[Typed].Install()

#

dagql has a Tracer() but sending custom spans doesn't seem to work, or at least they don't show up in the TUI or Dagger Cloud

tepid nova
#

Update:

✅ Successfully hooked a prompt function in every object
✅ Successfully passed the prompt to LLM (hardcoded token for now)
🚧 trying to introspect dagql.ObjectType.. No dice so far. Maybe raw graphql introspection?

still garnet
tepid nova
#

So far I'm doing Server.Schema[Object.Type().Name] and from there I get an ast.Typedef with a lot of good stuff. Will see how far it gets me

#

(and indeed I looked at introspection to figure that out)

#

Now doing some LLM tool call plumbing to connect my code to @wet mason 's . Then the final touch should be figuring out how to build & send a query

#

then we can start having some fun with the call conventions 🙂

meager summit
#

do we have an example of SDK impl which is not part of dagger/dagger repository?

tepid nova
wild zephyr
meager summit
#

thanks Marcos. I am just trying to test some changes I am making to the sdk config structure in dagger.json, and want to test them with an sdk which is not a built-in sdk.

tepid nova
#

Trying to understand what is a Class in dagql 🤔

#

since there is already ObjectType

wild zephyr
spark cedar
#

i'm not 100% sure why this logic appears in both places? does this not have the effect of selecting something twice?

#

🤔 how does Nth even get set actually

tepid nova
#

@still garnet re: how to plug in the llm config... Actually the cleanest way would be neither engine config, nor attaching to the query (although I might still do the latter for my POC). It would be session attachables no? Like either a custom one just for that, or more pragmatically, use the new secret provider stuff which is exactly what you suggested 😛 I'm just trying to not have my PR depend on another fast-moving PR if possible 🙂

still garnet
#

yeah agree

tepid nova
#

maybe I can copy-paste an ugly throwaway session-attachable hack from the secret providers PR, then just use the real thing when it merges?

#

@still garnet what was the incantation again to go from dagql.Server to *core.Query ?

still garnet
#

srv.Root().(*core.Query)

tepid nova
#

that's what I tried but it won't let me

still garnet
#

what's it say?

tepid nova
#
compiler: impossible type assertion: srv.Root().(*Query)
    *Query does not implement dagql.Object (missing method Call)
#

(I'm inside core)

still garnet
#

oh - maybe it's dagql.Instance[*core.Query]

tepid nova
#

Nice. I can't say if it works - but it definitely stopped complaining 🙂

#

thank you!

still garnet
#

@tepid nova wdyt of merging https://github.com/dagger/dagger/pull/9327 but marked experimental? too early? there's a bunch of other changes on that branch that would be nice to get in, but maybe I can split them out

tepid nova
still garnet
#

yeah, i think so

#

(2 weeks is a long time ago 😵‍💫 - but i do remember editing it as things changed)

tepid nova
#

My thoughts

  1. Obviously shipping something incrementally useful soon would be nice
  2. I worry that "experimental" might become "good enough, let's revisit in 6 months" or even worse "hey don't break my code I was using that!" Not sure how to avoid that.
  3. Looks SDK-specific no? Would this work out of the box in other SDKs, or would it require more design work?
#

(I know it's a real core API call, but the nesting seems to require SDK heavy lifting)

still garnet
#

re 3), i don't think there's a silver bullet that avoids per-SDK integration. the amount of work for each is pretty light and low-touch, it's just a matter of mapping to the natural "context propagation" pattern in each language (with in Python, callbacks in TS, context.Context in Go, etc)
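
For the Go side, the "natural context propagation pattern" looks roughly like the sketch below. It uses only the standard library and a toy span type, not the OpenTelemetry API or anything from the PR; startSpan, span and nestedPath are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// spanKey is an unexported key type so other packages cannot collide with it.
type spanKey struct{}

// span is a toy stand-in for a tracing span: a name plus its parent.
type span struct {
	Name   string
	Parent *span
}

// startSpan creates a child of whatever span is already in ctx and
// returns a derived context carrying the new span.
func startSpan(ctx context.Context, name string) (context.Context, *span) {
	parent, _ := ctx.Value(spanKey{}).(*span)
	s := &span{Name: name, Parent: parent}
	return context.WithValue(ctx, spanKey{}, s), s
}

// path renders the span and its ancestors, outermost first.
func (s *span) path() string {
	if s.Parent == nil {
		return s.Name
	}
	return s.Parent.path() + " > " + s.Name
}

// nestedPath shows that nesting falls out of passing ctx down the call chain.
func nestedPath() string {
	ctx, _ := startSpan(context.Background(), "test all")
	_, child := startSpan(ctx, "TestModule")
	return child.path()
}

func main() {
	fmt.Println(nestedPath())
}
```

Python's with blocks and TS callbacks carry the same "current span" implicitly, which is why each SDK needs its own thin mapping.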

#

i was able to do it in Python/TS without being an expert in either which seems like a good enough sign

tepid nova
#

What feels weird is that it's half SDK-backed and half engine-backed.

still garnet
#

like i said - that's pretty unavoidable afaict; your proposed APIs from side conversations would also need a lot of SDK special-casing. all of the core logic is in the engine; the SDK side changes are just to make the API feel like something that you would actually want to use, and so it's able to interoperate with any existing OTel integrations you might have, which are still going to be dependent on the language-native context propagation patterns.

#

basically - if we do it only in the engine and just have SDKs use the API as if it were any other codegen'd API, that means we lose automagic OTel integration as a feature.

#

but, if we're still not comfortable committing to it even in experimental state, that's fine; I can pull out the other parts to merge separately

tepid nova
tepid nova
#

Super niche core question: are core ID types (like core.SecretID) nullable? Or do I need a *core.SecretID to get a null value?

still garnet
#

they're not nullable by default

final star
#

poking through typescript module init traces, 2 suspicious things, both of which sit on a line where idk whether to distrust the telemetry or the thing being instrumented:

1, containers seem to have absurd netns TX numbers... like corepack use yarn Rxs 1.1Mb (sensible) but apparently Txs 113Mb (wtf) @civic yacht
2, container construction calls show up repetitively a lot, and we only get logs and a duration measurement for the first invocation. looks like caching, sure, cool, BUT there seems to be 2 ways of getting a cache hit that are displayed differently? or some get the cache hit displayed and others don't? or i just don't know how to read the output here? see screenshots. how do i interpret the middle of trace one? what's going on there? @still garnet

lmao i hate discord so much, it doesn't show my filenames: 1st screenshot is top of trace, cache miss; 2nd is middle, 0 duration but not explicitly CACHED; 3rd is bottom, CACHED

tepid nova
still garnet
#

Valid bool field - it's true if the value is provided
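
A small sketch of the "value plus Valid flag" pattern described here, assuming a generic wrapper. `Optional`, `Some`, `GetOr` and the `SecretID` stand-in are hypothetical names written for this example; they are not copied from dagql or core.

```go
package main

import "fmt"

// SecretID is a stand-in for a non-nullable ID type (assumption, not core.SecretID).
type SecretID string

// Optional pairs a value with a Valid flag. Valid is true only if a value was provided.
type Optional[T any] struct {
	Value T
	Valid bool
}

// Some wraps a provided value.
func Some[T any](v T) Optional[T] {
	return Optional[T]{Value: v, Valid: true}
}

// GetOr returns the wrapped value when one was provided, else fallback.
func (o Optional[T]) GetOr(fallback T) T {
	if o.Valid {
		return o.Value
	}
	return fallback
}

func main() {
	var missing Optional[SecretID] // zero value: Valid is false, so this reads as "null"
	given := Some(SecretID("abc"))

	fmt.Println(missing.Valid, given.Valid)
	fmt.Println(missing.GetOr("none"), given.GetOr("none"))
}
```

The zero value of the wrapper plays the role of null, so no pointer to the ID type is needed.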

tepid nova
#

@hasty basin re: recorder module. If it's a PITA to separate the commits don't worry about it

hasty basin
#

I've already cherry-picked the history of the module. Will do the snippets part separately after

tepid nova
#

@still garnet oh no

    s.srv.OnInstallObject(func(selfType dagql.ObjectType, install func(dagql.ObjectType)) {
        &agentSchema[selfType]{
            srv:      m.srv,
            selfType: selfType,
        }.Install(m.srv)
    })
#

This is the linchpin of the whole thing... How do I get a dynamic dagql type definition into a Go generic type bracket 😭

still garnet
#

wanna pair?

tepid nova
#

yes please 🙂

tepid nova
#

@still garnet this is what I'm going to try to make work:

m.srv.OnInstallObject(func(bodyType dagql.ObjectType, install func(dagql.ObjectType)) {
  class := dagql.NewClass(dagql.ClassOpts[*core.Agent]{
    // Instantiate a throwaway agent instance from the type
    Typed: core.NewAgent(bodyType),
  })
  install(class)
})

Does that look right?

#

What's not 100% clear (writing it down to help think through it) is how exactly Typed is used

still garnet
#

Yup!

tepid nova
#

Because that will determine how my core.Agent will be consumed by NewClass exactly

#

Oh wait this is what my instinct still tells me to do:

// [...]
    Typed: core.NewAgent(dagql.Instance[bodyType]),
// [...]

If bodyType is eg. *core.Directory, won't dagql.Instance[*core.Directory] give the agent everything it needs? It can call .ObjectType() to get the type back for introspection by NewClass. And it doesn't have to take the type & instance separately

#

("body" as in the body of the robot 🙂

still garnet
# tepid nova Oh wait this is what my instinct still tells me to do: ```golang // [...] T...

i think bodyType will be a dagql.Class[*core.Directory], not a *core.Directory

the Typed field is actually godoc'd:

    // Typed contains the Typed value whose Type() determines the class's type.
    //
    // In the simple case, we can just use a zero-value, but it is also allowed
    // to use a dynamic Typed value.
    Typed T

in your case, you need a dynamic Typed value, whose sole purpose is to have Type() *ast.Type so the dagql.Server knows what type is being installed into the schema

#

so, i think the right thing to do is something like &core.AgentType{Inner: bodyType} with a Type() that returns &ast.Type{NamedType: Inner.Type().NamedType + "Agent", NonNull: true}

tepid nova
#

I already have that (I think) I just get the type name from the instance:

type Agent struct {
  // [...]
  self      dagql.Object
}

func (a *Agent) Type() *ast.Type {
    return &ast.Type{
        NamedType: a.self.Type().NamedType + "Agent",
        NonNull:   true,
    }
}
still garnet
#

but you won't have an instance at this point in time

tepid nova
#

OK that's the part that I don't understand. I guess in my case there's no easy way to make a "zero-value" like the Typed godoc recommends

still garnet
#

yeah - it has to be determined from runtime values, so there's no way for a zero-value to do it

tepid nova
#

Got it. Then I'll just add an extra argument to NewAgent to take both the bodyType and an optional body

#

(using self and body interchangeably, can't decide 🙂

#

like your internalId

still garnet
#

yep yep

tepid nova
#

Is it too cute to try to pass a single self interface{} type and try to cast it either as a dagql.Object or dagql.ObjectType? It's annoying to have to pass both, and have to trust the caller that they always match
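
One way the single-argument idea could look, as a sketch only. `Object` and `ObjectType` here are tiny stand-in interfaces (the real dagql interfaces have more methods), and `resolveSelf`, `dirType` and `dirInstance` are hypothetical names. The instance case is checked first so that an instance always yields a matching type.

```go
package main

import (
	"errors"
	"fmt"
)

// ObjectType and Object are simplified stand-ins for the dagql interfaces.
type ObjectType interface{ TypeName() string }
type Object interface{ ObjectType() ObjectType }

type dirType struct{}

func (dirType) TypeName() string { return "Directory" }

type dirInstance struct{}

func (dirInstance) ObjectType() ObjectType { return dirType{} }

var errBadSelf = errors.New("self must be an Object or an ObjectType")

// resolveSelf accepts either an instance or a bare type.
// With an instance, the type is derived from it, so the two cannot disagree.
func resolveSelf(self any) (ObjectType, Object, error) {
	switch v := self.(type) {
	case Object:
		return v.ObjectType(), v, nil
	case ObjectType:
		return v, nil, nil
	default:
		return nil, nil, errBadSelf
	}
}

func main() {
	for _, self := range []any{dirType{}, dirInstance{}, 42} {
		typ, body, err := resolveSelf(self)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(typ.TypeName(), "has body:", body != nil)
	}
}
```

The trade-off is that the mismatch check moves from the caller to a runtime type switch, so a wrong argument only fails when the code runs.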

#

@still garnet follow-up question, is it accurate that NewClass will perform Go reflection on core.Agent to infer the fields to install?

#

Or do I still need to manually install fields somewhere?

still garnet
tepid nova
#
    s.srv.OnInstallObject(func(bodyType dagql.ObjectType, install func(dagql.ObjectType)) {
        class := dagql.NewClass[*core.Agent](dagql.ClassOpts[*core.Agent]{
            // Instantiate a throwaway agent instance from the type
            Typed: core.NewAgent(dagql.Instance[bodyType]),
        })
        class.Install(
            dagql.Func("withPrompt", s.withPrompt).
                Doc("add a prompt to the agent context").
                ArgDoc("prompt", "The prompt. Example: \"make me a sandwich\""),
            dagql.Func("run", s.run).
                Doc("run the agent"),
            dagql.Func("history", s.history).
                Doc("return the agent history: user prompts, agent replies, and tool calls"),
            dagql.Func("as"+s.selfType.TypeName(), s.asObject).
                Doc("convert the agent back to a " + bodyType.TypeName()),
        )
        install(class)
    })
#

Childcare break...

tepid nova
#

Back at it. Managed to get it to build... Let's see if it actually works 🙂

tepid nova
#

panic in the general area you would expect...

leaden glade
#

I have a first PR for the Java SDK (to run Java modules) that looks pretty good for now: https://github.com/dagger/dagger/pull/9422
Not everything is covered, but it allows running Java modules, calling them from Java or other languages, etc.
Optional args and default values are covered. Constructors are not yet handled.
The PR is starting to get a bit big, so I'd like to stop adding anything to it (except if we find bugs of course).
dagger init --sdk=java and dagger develop are working as expected.
Can I get some 👀 and feedback from <@&946480760016207902> 🙏 ?
If you have any questions about it, do not hesitate to ask. I can also jump on a call to explain why, how, etc. if that helps with the review.

GitHub

This PR allows to create Dagger modules using Java.
$ dagger init --sdk=java my-java-module

$ tree my-java-module
my-java-module
├── dagger.json
├── pom.xml
└── src
    └── main
        └── java
...

spark cedar
rancid turret
#

Telemetry: cache miss reasons

final star
#

verbose tui trace confusion

hybrid widget
#

Engine error on k8s cluster

fair ermine
#

Also what happens if I bind the same service 2 times but with different names? Will it work as expected and the service be reachable under both names, or will it fail? Maybe that's the reason

leaden glade
#

What's the best way to generate an introspection JSON file?
I'm using a simple go run cmd/introspect/main.go but I wonder if this also exists as a dagger command so that I can call it without building Go code

fair ermine
#

Like … call sdk typescript generate -o .

leaden glade
# fair ermine In the dagger repository, you can do dagger -m .dagger call sdk <your sdk> gener...

yep, but I'm looking at doing roughly the opposite. At least for now, the Java SDK is built by reading a local introspection JSON file or by calling dagger with a kind of custom query that mimics the introspection (kind of, because if the introspection query changes, that has to be reflected in the Java code).
And I'd like to replace the query with a proper call to dagger.
The advantage of today's approach (so not with a dagger call sdk java generate) is that the local dev experience is better. It's just a mvn install call.
So I've seen the introspect/main.go that does exactly what I'd like, but I'm wondering if it's exposed somewhere, without the full sdk/codegen aspect.

spark cedar
#

I think there's an API method called something like __schemaFile somewhere

#

Mmm I lie actually

#

That's odd, I could have sworn we did that at some point

rancid turret
#

introspection.json

tepid nova
#

question: when an API client gets an ID in their response, how big is that ID? Is it the full uncompressed recipe, or is it a compressed version, or even a digest that the engine can use in a lookup later (I guess that last one is unlikely)

civic yacht
# tepid nova question: when an API client gets an ID in their response, how big is that ID? I...

It's base64 and uncompressed atm. We made changes a while back that de-dupe it so each vtx in the DAG only appears once, etc. but it can still get large for really complex DAGs. I think we should move to passing digests around for better performance but we'd need a persistent (on-disk) cache of digest->ID mappings, whereas today that dagql caching is just in-memory and per-session. So once we move all caching logic from buildkit->dagql we should be able to do that
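
To make the "pass digests around" idea concrete, here is a rough sketch of an in-memory digest to ID mapping. Everything in it is an assumption for illustration: `idCache` is a hypothetical type, the "recipe" string is made up, and the real ID encoding is not specified in this thread beyond being base64 and uncompressed.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"sync"
)

// idCache maps a short digest to the full ID it was computed from.
// It is in-memory only, like the per-session caching described above.
type idCache struct {
	mu  sync.RWMutex
	ids map[string]string
}

func newIDCache() *idCache {
	return &idCache{ids: map[string]string{}}
}

// Put stores the full ID and returns its digest in "sha256:<hex>" form.
func (c *idCache) Put(id string) string {
	sum := sha256.Sum256([]byte(id))
	digest := "sha256:" + hex.EncodeToString(sum[:])
	c.mu.Lock()
	c.ids[digest] = id
	c.mu.Unlock()
	return digest
}

// Get resolves a digest back to the full ID, if it is known.
func (c *idCache) Get(digest string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	id, ok := c.ids[digest]
	return id, ok
}

func main() {
	// Made-up recipe text, base64-encoded to mimic a large uncompressed ID.
	recipe := []byte(`container.from(address: "alpine").withExec(args: ["apk", "add", "git"])`)
	id := base64.StdEncoding.EncodeToString(recipe)

	cache := newIDCache()
	digest := cache.Put(id)
	got, ok := cache.Get(digest)
	fmt.Println("id bytes:", len(id), "digest bytes:", len(digest), "round trip ok:", ok && got == id)
}
```

The digest has a fixed size regardless of how complex the DAG is, but a lookup only works where the mapping exists, which is why a persistent cache is the prerequisite mentioned above.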

civic yacht
leaden glade
#

sdk vendoring, library split

tepid nova
#

@leaden glade @rancid turret @spark cedar @fair ermine I don't want to pollute your other thread (https://discord.com/channels/707636530424053791/1334447130256867390) but there is a related topic that is on my mind: generated clients. At the moment they are a tightly coupled component of a monolithic SDK. But I think it's time to work on decoupling them. We've discussed this in the past but never got around to it - IMO it's time.

🧵

meager summit
#

one minor suggestion on v3 cloud ui. can we add a mouseover msg in the "time it took for the trace" div. e.g. "failed after 3.5m", "running since 2m 27s", "used cache". that way I don't have to remember the color encoding. (pls ignore if that does not make sense).

meager summit
#

Hi Folks, during one of my changes, I am getting this error:

Error: failed to get module SDK: input: module.withSource.sdk.source get field "source": reflect: call of reflect.Value.Field on zero Value

any pointers on how we can fix this? The SDK is a struct in this case (I am working on changing sdk from a string to a struct), which is nil because the user has not configured the sdk yet.

this works fine when the sdk is initialized in dagger.json

spark cedar
#

do you have code that you can share?

meager summit
spark cedar
#

🤔 where does the error actually come from?

#

i suspect it's somewhere in dagql

meager summit
#

oh, i think i fixed it by uncommenting some code which was commented out 13 months ago

#
index 295fe28e5..3cf471cba 100644
--- a/dagql/objects.go
+++ b/dagql/objects.go
@@ -1102,9 +1102,9 @@ func getField(obj any, optIn bool, fieldName string) (res Typed, found bool, rer
        }
        objV := reflect.ValueOf(obj)
        if objT.Kind() == reflect.Ptr {
-               // if objV.IsZero() {
-               //      return nil, false, nil
-               // }
+               if objV.IsZero() {
+                       return nil, false, nil
+               }
                objT = objT.Elem()
                objV = objV.Elem()
spark cedar
#

failed to return error: input: currentFunctionCall.returnError failed to get requester session: session for "hncyh7pd4t5f2vju9z31uuoc7" not found

#

i wonder if it's got something to do with the underlying engine getting shut down, the client being closed, but the module is still running? (in which case "context cancelled" would be the right error, but it could be getting lost?)

#

only guessing that, since it seems to happen at the same time as a lot of other jobs on the same runner

wet mason
meager summit
#

is there a way to visualize via-server calls vs direct calls in the dagger ui? I think it would be useful to know what level of nesting we are in.

leaden glade
#

Based on this comment in the java PR (https://github.com/dagger/dagger/pull/9422#discussion_r1935476374) I did some tests about module names. I created an e2e module in multiple languages to see if they can call each other. That works quite well for Go, PHP and Java. But for both Python and TypeScript it's not working. In both cases, a dagger function right after the dagger init will complain about not finding the module: No @dagger.object_type decorated class named E2e was found for Python, and could not find module entrypoint: class E2e from import. Class should be exported to benefit from all features. from TypeScript.

meager summit
# meager summit oh, i think i fixed it by uncommenting some code which was commented out 13 mont...

Hi @still garnet, I noticed that you had committed this code (but commented out) in this commit (13 months ago :P): https://github.com/dagger/dagger/commit/353ba56468fbb908cf29e2ccd50b187d178e95de#diff-9eea0c3c5756d18267e5948dc786fda84991c46a1694b9be7904931e7bfd639eR681

without this I am getting following error:

Error: failed to get module SDK: input: module.withSource.sdk.source get field "source": reflect: call of reflect.Value.Field on zero Value

do you know what would be the side effect of uncommenting this code. (I am still going through the tests to see if it impacts any test).

meager summit
#

The snippet, i may have messed up the link

still garnet
#

hmm I may have tried it as a fix and then realized it's a secondary issue - if I'm reading it right it seems to imply we're trying to get the source field off of a nil value, which is a little strange because you'd expect the whole object to just be sdk: null rather than e.g. sdk: {source: null}

meager summit
#

I think the error msg says “trying to read field on null value”, which i read as “sdk is nil and you are trying to read a field on it”

#

But i can print the type on that error msg to be sure
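
For reference, a stripped-down sketch of the failure mode and the guard from the diff above. This is not the dagql `getField`; it is a hypothetical, simplified version with a made-up `SDKConfig` struct, showing that calling `Elem()` on a nil pointer yields a zero `reflect.Value`, on which field access panics.

```go
package main

import (
	"fmt"
	"reflect"
)

// SDKConfig is a made-up struct standing in for the new sdk config type.
type SDKConfig struct {
	Source string
}

// getField returns the named exported field of a struct or pointer to struct.
// A nil pointer reports found=false instead of panicking with
// "reflect: call of reflect.Value.Field on zero Value".
func getField(obj any, name string) (val any, found bool) {
	v := reflect.ValueOf(obj)
	if v.Kind() == reflect.Ptr {
		if v.IsNil() {
			return nil, false // nothing to descend into
		}
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return nil, false
	}
	f := v.FieldByName(name)
	if !f.IsValid() || !f.CanInterface() {
		return nil, false
	}
	return f.Interface(), true
}

func main() {
	var unset *SDKConfig // user has not configured the sdk
	_, found := getField(unset, "Source")
	fmt.Println("unset found:", found)

	val, found := getField(&SDKConfig{Source: "go"}, "Source")
	fmt.Println("set:", val, found)
}
```

The guard hides the panic, but it does not answer the question of why a nil object is being descended into in the first place.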

still garnet
#

yeah agree

#

i'm just not sure why it's descending into a nil value and trying to select fields off of it - that seems like a bug

meager summit
#

I have it as *SDKConfig, a new struct i am adding to type Module (to change sdk in dagger json file to a struct)

#

I thought maybe i need to send it back as dagql.Instance but i struggled converting this struct to a dagql instance

still garnet
meager summit
tepid nova
#

So, am I right that there is a regression in the caching of function calls?

  • It used to be cached within the same session
  • Now it is never cached even within the same session.

Is that correct @civic yacht?

civic yacht
# tepid nova So, am I right that there *is* a regression in the caching of function calls? -...

Not exactly, they have always been cached within a session in terms of buildkit operations. 4 months ago we made a change that caused them to not be cached in the dagql level of caching, but that's not a ton of extra overhead since the bulk of the expensive work gets covered by buildkit caching for now.

This fix gets us more dagql-level caching back, but the main motivation was to avoid confusing duplicate telemetry

#

It's confusing due to the multiple cache systems in play obviously

tepid nova
#

"Will the real client.gen.go please stand up" 😭

tepid nova
#

What's surprising is that I noticed a sharp increase of in-session cache misses way more recently than 4 months... like in the last 2 weeks maybe. But maybe it was related to something I was doing

hasty basin
#

Hitting this https://github.com/dagger/dagger/pull/8991#issuecomment-2612710834 or something similar while trying to merge last of Solomon's docs gif recorder things in. dagger -i develop drops me in a sandbox with

root@buildkitsandbox:/src# cat dagger.json
{
  "name": "bass-sdk",
  "engineVersion": "v0.11.9",
  "sdk": "go",
  "source": "."
}

hmmm...

spark cedar
#

This is caused by the otel version bump, it's the top line change in the changelog

#

The bass sdk version needs updating - or whatever depends on it

#

Which may be vito's apko module

hasty basin
#

The bass sdk version needs updating - or

meager summit
#

I think I need to change something in the bitbucket.org:dagger-modules/private-modules-test.git/cool-sdk repo to fix some tests that are failing in one of my PRs. could someone please give me access to this repo?

meager summit
#

access to private test modules

tepid nova
#

🙋 tentatively escalating..

tepid nova
meager summit
#

I have been seeing The operation was canceled. error in GitHub action runs, and it seems like it could be because the runner was stopped for some reason. could someone with the access check why that might be the case:

spark cedar
#

depends on the check though, do you have a link?

tepid nova
#

Following up on today's discussion on the future of the "remote engine protocol" and _EXPERIMENTAL_DAGGER_RUNNER_HOST... @spark cedar @civic yacht @still garnet @paper epoch @rancid turret @stray heron

https://github.com/dagger/dagger/issues/9516

GitHub

Problem Officially, there is no supported way to connect remotely to an engine. But there is an escape hatch (_EXPERIMENTAL_DAGGER_RUNNER_HOST) and lots of people use it in production. In fact it h...

leaden glade
#

Java PR https://github.com/dagger/dagger/pull/9422 has been approved (thanks @rancid turret )
But I don't have write access, so could anyone with write access (I guess <@&946480760016207902> ?) merge it? I guess a squash is best due to the number of commits 😅
Let me know if you need any support from me to merge it.

leaden glade
leaden glade
# leaden glade Rebased, and pushed

Looks like something's wrong (or just unexpected from my knowledge). It tries to access sdk/typescript while checking sdk/java for instance

2336: │ go(
2336: │ │ │ source: Directory.withDirectory(
2336: │ │ │ │ directory: no(digest: "sha256:4ad4fb9dd46b164f2c00b86b85a05e6d9ffacc5f2ef1ef22c007ee662e634724"): Missing
2336: │ │ │ │ exclude: [".git", "bin", "**/.DS_Store", "**/node_modules", "**/__pycache__", "**/.venv", "**/.mypy_cache", "**/.pytest_cache", "**/.ruff_cache", "sdk/python/dist", "sdk/python/**/sdk", "go.work", "go.work.sum", "**/*_test.go", "**/target", "**/deps", "**/cover", "**/_build"]
2336: │ │ │ │ path: "/"
2336: │ │ │ ): Directory!
2336: │ │ ): Go!
2337: │ Container.from(address: "golang:1.23.2-alpine"): Container!
2338: │ Container.withRootfs(
2338: │ │ │ directory: Directory.withDirectory(
2338: │ │ │ │ directory: Directory.directory(path: "sdk/typescript"): Directory!
spiral fog
spiral fog
#

Update: synced versions between modules and repos etc. but this time it didn't work at all. To unblock our ci/cd, we're moving all private repo modules inside the repositories.

spark cedar
#

could you put the update in the issue?

#

sorry, just trying to make sure we don't lose that 😄

spiral fog
#

will do first, need to unblock ci/cd pipelines sadcat

rancid turret
spark cedar
spark cedar
#

thinking about enums

obsidian rover
wet mason
#

@still garnet 👋 hey, is there a way with dagger.Connect to have the log output not be interactive?

still garnet
#

isn't it already not interactive? it should just do plain logging

spark cedar
spark cedar
#

and daggerverse-private is a mirror of the test-modules?

obsidian rover
#

And it's not, I misunderstood 😿

#

Let me check

#

Do you specifically want github, or is bitbucket or gitlab ok?

obsidian rover
# spark cedar and daggerverse-private is a mirror of the test-modules?

It could be, feel free to push on top of that I'm ok -- the only test using it is this one -- easy to change and it can become a private github repo that mirrors the test-module with a PAT. Otherwise, this could work: https://github.com/dagger/dagger/blob/f920f96ffd0a0252073c28f39608ad2796e28985/core/integration/git_test.go#L332-L336 (credentials on 1password) ; or https://github.com/dagger/dagger/blob/f920f96ffd0a0252073c28f39608ad2796e28985/core/integration/git_test.go#L347-L351 (same, credentials on 1password)

obsidian rover
tepid nova
#

When using export-to-docker, what name should I give it as argument? Is there a rule I have to follow to get the corresponding CLI to pick it up?

#

Should it be registry.dagger.io/engine:<git-commit> ?

tepid nova
#

🙏 🙏 🙏 🙏 🙏 Can someone explain to me why the CLI git commit doesn't match the engine tag? 😭

dagger v0.15.4-250207020442-075eac5c0946 (registry.dagger.io/engine:736eabb66f8cbe32ecac21cd49d2696e41110084) linux/amd64
                            ^^^^^^^^^^^^   👈                 👉     ^^^^^^^^^^^^^^^

I'm trying to build a CLI/engine pair such that, once I loaded the engine into docker, the CLI will pick it up

#

Oh it's the last commit of upstream main, rather than the commit it was built from I guess (how does it compute that I don't know)
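
A quick sketch for checking the mismatch programmatically. It assumes, based only on the single sample above, that a dev CLI version has the shape `<base>-<timestamp>-<short commit>`; `splitDevVersion` and `sameCommit` are hypothetical helpers, and a base version that itself contains a dash would not parse with this.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDevVersion splits "<base>-<timestamp>-<commit>" into its parts.
// The layout is inferred from one sample string, not from a spec.
func splitDevVersion(v string) (base, stamp, commit string, ok bool) {
	parts := strings.Split(v, "-")
	if len(parts) != 3 {
		return "", "", "", false
	}
	return parts[0], parts[1], parts[2], true
}

// sameCommit reports whether the short commit is a prefix of the full engine tag.
func sameCommit(short, engineTag string) bool {
	return short != "" && strings.HasPrefix(engineTag, short)
}

func main() {
	cliVersion := "v0.15.4-250207020442-075eac5c0946"
	engineTag := "736eabb66f8cbe32ecac21cd49d2696e41110084"

	base, stamp, commit, ok := splitDevVersion(cliVersion)
	fmt.Println(base, stamp, commit, ok)
	fmt.Println("CLI commit matches engine tag:", sameCommit(commit, engineTag))
}
```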

spark cedar
#

But we do this because we need something stable that's been published to a registry - every main commit gets published, so we can use that

#

But generally - dev CLI builds aren't really intended to be used without a dev engine, this is just used as something somewhat sane (it used to be just get the latest build of main, which for an old cli build would break a lot of things)

rich island
#

Hey, is there anywhere one can get a recap at what is possible with Dagger shell in the latest release?

tepid nova
# spark cedar But generally - dev CLI builds aren't really intended to be used without a dev e...

Thanks. The reason I need this is to work with a dev engine (+corresponding CLI) on my system.

This requires:

  1. building the CLI
  2. building the engine
  3. loading the engine into the docker engine at a name that the CLI will look for.

That step 3 👆 is proving hard to automate. I guess I need to call <MOD>/version | merge-base to get the right commit? But I need to pass as arguments the commit I'm building, plus the latest main commit, looked up by me. Any chance load-to-docker could do this by default if I don't specify a name?

#

I just want the equivalent of make; make install for dagger

tepid nova
rich island
#

Shell status

leaden glade
#

I have a few PRs open for the Java SDK, but I can't get a green CI at all.
For instance this PR https://github.com/dagger/dagger/pull/9533, which simply pins a dependency to fix a critical-severity vulnerability inside the Java module template.
Is there anything I can do to improve the situation? I find it really hard to know if I'm introducing something bad or not (I'm sure not in this specific PR, but I have 3 others open).
Side note: I'm not able to follow the links to dagger.cloud from the GHA results. I always get a "No traces available".

GitHub

fixes vulnerability inside transitive dependency:
org.eclipse.parsson:parsson │ CVE-2023-7272

This should fix the scan CI issue on other PRs

spark cedar
#

that shouldn't have occurred there

#

i mean that error makes sense 🤔 why is cacerts being imported on windows

#

fyi it looks like that's happening on main as well

#

so it's not your pr

#

ah yup, because of that cmd/dagger imports engine/buildkit which imports engine/buildkit/cacerts

#

so we need to have it not do that 🤔

#

looks like we're just using it for various env vars

#

i think we should probably move those to engine/consts.go

spark cedar
#

i think it probably requires a bit of rethinking how we do building and reasoning about versions

tepid nova
spark cedar
#

yeahhhh several hours later i'm not actually sure if this solves the real issue

#

the real issue is that we need every component that we build to be able to be aware of various "globals"

#

e.g. in this case, "is this build a dev build?"

#

previously, we've been inferring this from git - but actually this is wrong, and we shouldn't do this. e.g. currently there's very subtly different behavior if you commit all your changes and have a clean state, vs, if you create just one different file

#

technically you can solve this by just passing a boolean everywhere, but it's so ridiculously messy that it makes our CI so much more complicated than before

spark cedar
#

</braindump>

tepid nova
spark cedar
#

mmm, but having something like ./hack/build build + start the engine, and then killing the engine that built it is a little bit annoying

#

all of this logic is so annoyingly fragile

#

the ideal end state i want is that ./hack/build builds+starts an engine (using dagger shell), and outputs a cli to bin/

#

then, we would remove hack/dev and hack/with-dev - because that cli would just work and connect to the already started engine

spark cedar
#

💡 okay, actually i think i've worked out a way to avoid this whole thing maybe - that said, i still want a way to be able to "mark" a dev build explicitly, instead of relying on git metadata (rubber ducking is good)
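
One common Go mechanism for marking a build explicitly, sketched under the assumption that a link-time flag is acceptable here. The variable name `devBuild` and the helper `isDevBuild` are hypothetical; the `-X` linker flag itself is standard Go tooling and only works on string variables.

```go
package main

import "fmt"

// devBuild is meant to be overridden at link time, for example:
//
//	go build -ldflags "-X main.devBuild=true"
//
// The default is a release build, so nothing is inferred from git state.
var devBuild = "false"

// isDevBuild interprets the link-time flag. Anything other than "true" is a release build.
func isDevBuild(flag string) bool {
	return flag == "true"
}

func main() {
	fmt.Println("dev build:", isDevBuild(devBuild))
}
```

With this, a clean checkout and a dirty one produce the same answer unless the build command says otherwise.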

spark cedar
#

@tepid nova https://github.com/dagger/dagger/pull/9555
^ this should mostly do what you want. ./hack/build is a dagger shell script 🎉 the built bin/dagger will always connect to the engine that was just started

#

i'm gonna work on using this to fully purge mage entirely 🙂

#

in a perfect world, hack is either removed, or just becomes a handy little directory of dagger shell scripts

#

any core engine devs with any objections to the above plan?

still garnet
tepid nova
#

is the default "RUNNER_HOST" endpoint still a unix socket with the name "buildkit" inside it?

#

And if so, isn't that technically a lie since the engine protocol has been changed to no longer be a passthrough to buildkit?

#

asking because I saw a user say "I connect to the buildkit socket in production and it works fine" and I wasn't sure which socket they meant

spark cedar
#

Yeah I think it still is

#

Mildly inconvenient to change, since then old clients will just hang on connection

tepid nova
#

ok that's actually reassuring, just a naming inconvenience - not a mysterious new part of the stack I was missing

#

@spark cedar I am about to disappear for 2 days for conference reasons, but FYI I am eager to pick up a thread I started discussing with @stray heron today : the relationship between "compute drivers" (https://github.com/dagger/dagger/issues/5583) and "stable engine protocol" (https://github.com/dagger/dagger/issues/9516). And in particular the fact that they may be incompatible?

--> Stabilize engine protocol: we officially support provisioning engines yourself out of band; take on the burden of decoupled CLI & engine versioning

--> Compute drivers: equip CLIs with a low-level provisioning interface (containerd? cri?), so that it can manage the provisioning of its own engine. Thus coupling CLI & engine versioning, and not supporting out-of-band provisioning of engines.

How do we reconcile those? --> to be discussed in those issues

spark cedar
#

And maybe some version info

spark cedar
tepid nova
#

When I brought it up at the last maintainers call, I totally forgot to dig into the details of the compute drivers discussion... After reloading it in memory, I realized my mistake. We can't discuss one without the other IMO.

spark cedar
#

But I think (in summary of that upcoming response) I prefer the idea of a more stable protocol and doing things out of band. That said, I think we need to be really careful to control the amount of effort this is gonna take - I'm happy with changing this protocol over time, as long as we keep some core parts the same. If we need to redesign an entirely new protocol, or fully document and support it, it's going to take a while (and I'm kinda still not fully sure of the benefits of doing so, beyond it would be nice).

#

I think that means personally I'd scrap the compute drivers - and move all provisioning out of band. We can build all of our logic server side if we want to do something magic in cloud (aka the beast project)

tepid nova
#

If there's no auto-provisioning interface, technically that means the docker run auto-provisioning remains a weird special case?

spark cedar
#

Lol yes, was just typing that

#

Maybe there's something we could do there

#

E.g. for a Linux system, an install could be a systemd oci image or something? Instead of running under docker

#

But tldr, I think we still have one case of auto-provisioning (but we could hide that almost completely I guess... if we bundle the engine into the CLI, and solve those problems)

tepid nova
#

I just worry that choosing to stabilize RUNNER_HOST, and building the production architecture from that constraint, ends up being circular. "It's what we had in the beginning, so we built the architecture around that, so now it's all we have"

#

"It's all we have, so now we have a feature in the CLI for manually managing our engines. dagger engine list; dagger engine start prod-engine-2".

spark cedar
#

I would probably pare back the runner host env var a lot, simplifying it to only tcp, unix, tls, etc. Very very simple.
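
A sketch of what a pared-back runner host value could accept. The scheme list follows the message above, but the exact spelling of the schemes, the `parseRunnerHost` name, and the example socket path are assumptions made for this example, not the current behavior of the CLI.

```go
package main

import (
	"fmt"
	"net/url"
)

// allowedSchemes is the deliberately small set of transports (assumed spelling).
var allowedSchemes = map[string]bool{"unix": true, "tcp": true, "tls": true}

// parseRunnerHost validates a runner host value and returns its scheme and address.
// For unix sockets the address is the socket path, otherwise it is host:port.
func parseRunnerHost(raw string) (scheme, addr string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", fmt.Errorf("invalid runner host %q: %w", raw, err)
	}
	if !allowedSchemes[u.Scheme] {
		return "", "", fmt.Errorf("unsupported runner host scheme %q", u.Scheme)
	}
	if u.Scheme == "unix" {
		addr = u.Path
	} else {
		addr = u.Host
	}
	if addr == "" {
		return "", "", fmt.Errorf("runner host %q has no address", raw)
	}
	return u.Scheme, addr, nil
}

func main() {
	// The socket path below is a placeholder, not the engine's real default.
	for _, raw := range []string{
		"unix:///run/example/engine.sock",
		"tcp://10.0.0.5:1234",
		"docker-container://some-engine",
	} {
		scheme, addr, err := parseRunnerHost(raw)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println(scheme, addr)
	}
}
```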

tepid nova
#

Reminds me of docker contexts, or before that - docker machine

spark cedar
tepid nova
spark cedar
#

I mean the alternative is to combine the CLI and engine as you've suggested before

#

But I think that's hugely limiting

#

Because now you have to run it on the same machine

tepid nova
#

they could be coupled components without being merged into the same component

spark cedar
#

You can't offload compute resources (which will be hugely useful for llms, etc)

spark cedar
tepid nova
#

Because today that's only available on docker

#

If you want to run dagger on anything other than the local docker engine, you have to eject out of auto-provisioning, and manually provision out of band

#

stabilizing RUNNER_HOST is enshrining the "manual out of band" as the one true way to provision

#

And maybe it is... but I'm worried that it might not be, because once we cross that door there's no going back

spark cedar
#

Extending the provisioning mechanism to everything is what I'd call the compute drivers approach

spark cedar
#

But I think that also leads to docker contexts or similar

tepid nova
#

Hence my proposed design debate: "stabilize runner-host vs. compute drivers"

spark cedar
tepid nova
spark cedar
#

So I think we end up there anyways

#

But how would you pick which driver to use

#

Do you just look at the env? And see what you can find?

tepid nova
#

Yeah you would still need to configure that in the CLI. But configuring a provisioning driver is not the same as configuring individual instances

spark cedar
#

Oh sure - happy with that

#

So can you just not connect to an already running instance?

#

E.g. you've run it in kubernetes. Similar to a setup today.

#

Or do we need to build a kubernetes driver to enable this

tepid nova
#

yes in this model the current method of deploying engines would be replaced by something lower level

#

since the whole point would be to couple versions. So you get rid of "Hey I upgraded my CLI to 0.14, any chance we could upgrade the Kub cluster to that version by next week?"

spark cedar
#

Yeah that's nice

tepid nova
#

Ideally that conversation no longer happens between humans. It happens between the CLI and the kub driver. "Hey I need an engine for commit 424242". or maybe "hey here's a custom image I'm uploading, deploy me an engine from that please"

spark cedar
#

Yeah I guess this is the substrate you've suggested before

tepid nova
#

right. Last chance for discussing it basically

#

If we don't incorporate it into the plan now, it goes down the trash forever I think

spark cedar
#

Eh I don't think it's gone forever (no door gets closed forever), but I do think it makes it significantly harder

tepid nova
#

mmm yeah some doors do indeed get closed forever trust me

spark cedar
#

I remember trying to implement this before, but trying to avoid the triple-nested containers was not very fun.

#

I might need more time to remember the details

#

Honestly, I think having platform specific drivers would be the easiest way to go about it - i.e. a kubernetes operator or something for kubernetes

#

I think this is where compute drivers ended up

tepid nova
#

maybe compute drivers are just different "transports" to get to a remote-host session?

spark cedar
#

With some knowledge of provisioning right?

#

Or something has some knowledge of it

tepid nova
#

yes, call the driver and say "hey give me a session to an engine with these properties (we standardize the config there)" then the driver either gives you a session (could just be stdio proxy) or returns an error, eg. "sorry can't give you this version" or "sorry I don't support direct upload" or "sorry I don't have this architecture"

#

So it's basically "remote-host protocol but with a twist"

spark cedar
#

Yeah!

tepid nova
#

then that becomes the open interface for open and proprietary hosting solutions

spark cedar
#

I'm here for it

#

We can build it pretty easily

tepid nova
#

the trick is to define the driver configuration carefully, so that it works for the implementations we have today, but is future-proof for when we introduce distributed caching, clustering capabilities, fully stateless engines etc

#

ok now I really have to disappear 🙂 to be continued

spark cedar
#

I'll try and write up something I guess?

tepid nova
# spark cedar I'll try and write up *something* I guess?

🙏 perhaps in the "stabilize engine protocol" issue, since that one is the freshest, and it will force us to discuss compute drivers in the context of the obvious default path (which is to just open the protocol as-is and call it a day)

leaden glade
#

Quick question regarding vulnerability scans. Trivy is run on PRs with the following settings: --scanners=vuln --vuln-type=library --severity=CRITICAL,HIGH --show-suppressed
Right now it's failing because of multiple vulnerable Java packages. I fixed some of them https://github.com/dagger/dagger/pull/9533 but one is still there.
The status raised by trivy is affected, meaning there's no available fix at this time.
In this specific case it should happen soon: the fix has been merged into the project but not yet released. Once it is released we can upgrade this dependency.

In the meantime, it means that even after merging the mentioned PR the scan check will still fail.
I was wondering about the expected behavior here. Is that what we want, i.e. failing on things we can't (right now) fix? Or should we add, for instance, --ignore-unfixed and only focus on the ones that have a fix available?

GitHub

fixes vulnerability inside transitive dependency:
org.eclipse.parsson:parsson │ CVE-2023-7272
io.netty:netty-handler | CVE-2025-24970

This should fix the scan CI issue on other PRs

spark cedar
#

👋 need to confirm a design decision we made (not released yet, holding releasing until more clarity).
as part of working on private go dependency in modules, we did https://github.com/dagger/dagger/pull/9454. we needed a way for a module to mark a dependency with GOPRIVATE (see discussion in #1318581231465533450 message) - so we added an sdk.env field to dagger.json

little hinge
#

How does cache volume/layer persistence behave when multiple engine instances run concurrently? (using distributed caching with the same dagger cloud token)

For example, let’s say two engines start up and both download the same initial cache volumes, and functions lazy-download cache layers during execution as expected. They are both running the same dagger function call, which mounts cache volumes of the same names, with cache mode shared, from the same branch, but I've run it twice, and thus on two distinct machines and engines. In this example each engine is gracefully shut down after the function call finishes.

So one finishes first and begins persisting its caches, while the other finishes slightly later (but while the first is still persisting) and begins to persist its caches. My understanding is that Dagger’s cache sharing is scoped to an individual engine/buildkit, and that there’s no coordination/awareness between separate engine instances, so it’s up to the remote caching service to handle any possible divergence/conflicts/merging.

final star
#

noticed something a little weird while mucking around in the TS SDK's runtime module: it's running a bit behind in terms of go versions and go.sum versions, but when i dagger develop or go mod tidy in there and the versions all get bumped, it seems to dramatically slow down the go-sdk-codegen step of module init... like this can make a fairly large difference in module init time, 13.3s vs 3s

ideally most of those deps get pulled from the go sdk builtin container, but how are we handling bumping those deps over time or giving ourselves a way to guarantee that our 1st party modules are using the same go module versions that we bundle in the engine image? cc @spark cedar @civic yacht @still garnet @rancid turret @fair ermine

GitHub

An engine to run your pipelines in containers. Contribute to cwlbraa/dagger development by creating an account on GitHub.

wet mason
#

[async] 👋 Is there a way to get combined stdout and stderr, @civic yacht @spark cedar?

civic yacht
# wet mason [async] 👋 Is there a way to get combined stdout and stderr, <@9490346776106435...

Not atm. It would be extremely trivial to add an API where stdout/stderr are just appended to one another, but if we wanted to add support for returning the string where they are interleaved in the actual order stdout/stderr was written (which is probably what would actually be expected) we'd need to do something fancy. Mainly tricky because we'd need to support all 3 APIs of "just stdout", "just stderr" and "interleaved".

#

Would probably need to prepend each written line of stdout/stderr streams with either timestamps or something indicating whether they are stdout vs. stderr and then do a lot of sorting/trimming for each of the api implementations

wet mason
#

I was looking for interleaved unfortunately :/

Patching swebench to use dagger for evaluation, and at its core, it parses the combined output to figure out which tests passed, failed, etc.

civic yacht
#

Ah I see, the only other possibility I can think of is a setting on withExec that says "stdout/stderr are the same pipe", so then they are interleaved and .stdout and .stderr just return the same (interleaved) string. Which feels a little weird but is the most plausible in terms of an implementation that's not super complicated and doesn't have a ton of potential performance overhead
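
For comparison, this is how the "same pipe" approach looks with Go's standard `os/exec` package (not Dagger's withExec, where this setting is only a proposal at this point). When `Stdout` and `Stderr` are set to the same writer, `os/exec` hands the child a single pipe for both file descriptors, so the output comes back interleaved in the order it was written. `runCombined` is a hypothetical helper name, and the example assumes a POSIX `sh` is available.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runCombined runs a command with stdout and stderr wired to the same
// buffer, so both streams are captured as one interleaved string.
func runCombined(name string, args ...string) (string, error) {
	var buf bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &buf
	cmd.Stderr = &buf // same writer as Stdout: the child shares one pipe for both
	err := cmd.Run()
	return buf.String(), err
}

func main() {
	out, err := runCombined("sh", "-c", "echo one; echo two 1>&2; echo three")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

The trade-off is the one mentioned above: once the streams share a pipe there is no way to recover "just stdout" or "just stderr" afterwards.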

wet mason
civic yacht
wet mason
wet mason
civic yacht
wet mason
wet mason
tidal spire
#

I'm trying to re-enable dag.Host().Service() in a local build of my engine. I've commented out this line https://github.com/dagger/dagger/blob/main/core/schema/query.go#L22 which successfully codegens the .Host() fields, but I must be missing another piece that needs to be changed. My code using dag.Host().Service() runs, but it seems to just hang infinitely. Any ideas?

tepid nova
#

@tidal spire I asked o1 for you:

Possible Explanation

When you see a “hang” or infinite wait in a Dagger + GraphQL resolver, it usually means that the query is never completing, often because the relevant resolver wasn’t registered with the DAG/GraphQL server. In other words, your “host” field might exist, but the “service” field on that host type does not actually get installed, causing the query to stall.

Dagger uses code-generation heavily. Each GraphQL field needs two parts:
1. The resolver function in Go (for example, “func (s *hostSchema) service(…) …”)
2. A registration that says “this function resolves the ‘service’ field on Host objects” (i.e. “dagql.Fields[*core.Host]{ … }.Install(s.srv)”).

If you only have the “func (s *hostSchema) service” code but never do something like:

dagql.Fields[*core.Host]{
    dagql.Func("service", s.service).
        Doc("Access services on the host"),
}.Install(s.srv)

then “service” won’t make it into the final schema. You’d see .Host() codegen appear, but calling .Host().Service() may attempt to execute a non-existing field and wind up stuck.

What to Check

1. Locate where your “hostSchema” or equivalent is installed (for example, “func (s *hostSchema) Install() {}”).
2. Ensure it has a block that registers your “service” function, something like the dagql.Fields[*core.Host]{ … }.Install(s.srv) block shown above.
3. Re-run your codegen or recompile. Your .Host().Service() calls should now resolve properly, rather than hanging forever.

Usually, the fix is just making sure both the resolver implementation and the field registration are re-enabled. Once they match up, the “service” calls should work again without hanging.

tidal spire
#

I will give my branch to qwen2.5-coder:32b and see what it can find

civic yacht
#

Where does it hang specifically? If not obvious one brute method is to tail engine logs and send it SIGQUIT, you'll get a stack trace of all goroutines. Slightly less brute is to uncomment this line, re-run ./hack/dev and then use pprof to look at goroutines (curl -s -v http://localhost:6060/debug/pprof/goroutine > ~/gr.pprof && pprof -http localhost:8080 ~/gr.pprof)

tidal spire
#

enabling dag.Host()

steel basin
#

just when I got used to seeing ETOOBIG

tepid nova
#

@still garnet I tried rebasing llm on main, and getting some light merge conflicts relative to your otel emoji stuff. Would you mind doing it, I'm afraid of breaking something 🙂

#

(trying to keep up with main to avoid rot)

#

also to stop the "upgrade to 0.15.4" nag 😛

still garnet
#

@tepid nova done

tepid nova
#

@still garnet @civic yacht gut check. I think we should add token count as an otel metric in the llm branch 🙂 wdyt?

still garnet
#

can we show the $ cost? 😛

tepid nova
#

I noticed every AI tracing/observability product has that, it's like the number one feature

tepid nova
#

But that would be a Cloud feature IMO - doesn't make sense to hardcode that stuff in the engine

#

As long as model info & token count is sent up - we can do everything else from there
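
A tiny Go sketch of that split, under stated assumptions: the engine side only emits a usage record (model name plus token counts), and whoever consumes it applies its own price table to get a $ figure. `Usage`, `Price`, `CostMicroUSD`, the model name and the prices are all placeholders invented for illustration; none of them come from a real provider or from the llm branch.

```go
package main

import "fmt"

// Usage is the hypothetical record emitted per LLM call.
type Usage struct {
	Model        string
	InputTokens  int64
	OutputTokens int64
}

// Price is expressed in micro-USD per million tokens so the arithmetic
// stays in integers. The values used below are placeholders.
type Price struct {
	InputPerMTok  int64
	OutputPerMTok int64
}

// CostMicroUSD computes the cost of one usage record from a price table
// supplied by the consumer. It reports false for unknown models.
func CostMicroUSD(u Usage, prices map[string]Price) (int64, bool) {
	p, ok := prices[u.Model]
	if !ok {
		return 0, false
	}
	total := u.InputTokens*p.InputPerMTok + u.OutputTokens*p.OutputPerMTok
	return total / 1_000_000, true
}

func main() {
	// Placeholder price table: 2 USD and 8 USD per million tokens.
	prices := map[string]Price{
		"example-model": {InputPerMTok: 2_000_000, OutputPerMTok: 8_000_000},
	}
	u := Usage{Model: "example-model", InputTokens: 1000, OutputTokens: 500}
	if cost, ok := CostMicroUSD(u, prices); ok {
		fmt.Printf("%s: %d micro-USD\n", u.Model, cost)
	}
}
```

Keeping the price table out of the emitter means price changes never require an engine release.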

#

Btw my micro-agent fun cost me almost $20 of OpenAI tokens this month so far...

still garnet
#

yeah not sure how all that works, I mean if it's something we can get easily out of the API it seems like it'd be nice to see that in the TUI too

#

since we're already integrating with OpenAI/etc at that level

tepid nova
#

I think the API clients all have it

tepid nova
#

Should we retire #1121837200712142878 ? In practice everyone is using either #engine-dev, or language-specific channels eg. #python , #php etc. Seems redundant to have another channel on top.

tepid nova
#

When you really need the op binary in your dagger dev environment and you're too lazy to follow the 10-step instructions for installing it on alpine:

⋈ dev | with-file /usr/local/bin/op $(container | from 1password/op:2 | file /usr/local/bin/op) | terminal

$ op --help
1Password CLI brings 1Password to your terminal.

Usage:  op [command] [flags]
[...]
tepid nova
#

@spark cedar before you log off, could you give me pointers for how we could get llm branch connected to a proper client-side config system? My hackish workaround is the number one issue for people getting started with the melvin demo at the moment

spark cedar
#

i genuinely don't really know what it should look like, but it's probably some stateful api under a new top-level client/session object that the cli hooks into

#

stateful apis suck, but better in graphql imo, than some side-channel (like more session attachable hell)

tepid nova
#

Don't we have engine.json wired already? Or something like that?