#maintainers


still garnet
still garnet
civic yacht
north jay
#

cpuguy83 1527 made an issue for getting

tawny flicker
still garnet
tawny flicker
#

just saw that when pulling :p

#

Thanks

ancient kettle
#

@civic yacht @still garnet @wet mason Is there any good way to capture the graphql queries that are run during a pipeline?

civic yacht
#

Probably just in dagger session so it doesn't have to be replicated in each sdk

ancient kettle
#

Thank you!

civic yacht
celest totem
#
GitHub

Refactor marshal to use an approach similar to #5007.
Given this benchmark:
func BenchmarkMarshal(b *testing.B) {
s := struct {
A string json:"a,omitempty"
B in...

GitHub

It's a minor improvement to querybuilder performance, given this benchmark:
func BenchmarkBuilder(b *testing.B) {
for i := 0; i < b.N; i++ {
var contents string
root := Query().
Sele...

still garnet
#

nice!

tepid nova
#

@rancid turret I think you should take over the top comment - your summary in the bottom comment is essential reading, I would just take over the editorial work of keeping that top comment up-to-date and exhaustive

rancid turret
tepid nova
#

@stray heron @rancid turret @cosmic cove sorry I'm only looking at 4205 now. Failed to protect my async time sooner

#

sadly I have questions, so we'll need another 24h cycle at least

tepid nova
#

@wet mason @civic yacht @still garnet do you guys have consensus on 4205? I'm trying to parse the many layers of comments to understand who is proposing what exactly

still garnet
#

seems to me like we're reaching consensus on bits and pieces, but don't have a full proposal yet. Guessing @rancid turret is working on that

rancid turret
#

Yes, I planned on doing it today but will work on that tomorrow. Main thing is still the general purpose err := ctr.Sync(ctx) that just forces a buildkit evaluation (doesn't trigger default entrypoint/args). See if we can get unblocked on the name choice at least.

There's more context on Lazy execution is confusing, just click on the issues that have #4205 as proposals.

slate hawk
obsidian rover
slate hawk
#

sadly no since it's a new system, I'll try to build it using the main Dockerfile, thank you!

obsidian rover
slate hawk
#

Yep, managed to pull the image, thanks again!

ancient kettle
#

It takes graphviz dot (and a few other formats) and renders to ASCII/Unicode (and other formats)

still garnet
ancient kettle
still garnet
#

thanks! i was losing hope last night, but after a few changes things started to really click

olive halo
#

beautiful 🤩

fair ermine
#

Please review when you have a moment 🙂

unborn elm
#

Hello everyone, Dagger looks very interesting and we would like to try it in our Spring Boot microservices. Is there any example already built for Spring?

north jay
#

Looking at building my own engine and cli for supply chain reasons.
Wondering if you'd consider a bootstrapper that can build dagger without dagger in tree?

tepid nova
#

We were just talking about that (I'm subscribed to your moby packaging repo 🙂)

civic yacht
tepid nova
wild zephyr
rancid turret
tepid nova
still garnet
#

Commit? ducks

rancid turret
still garnet
#

between the two, I prefer Sync - it seems to be in common enough use (sync/async discussions, synchronization in general, etc) to be familiar outside of the Unix context, it's also much shorter

rancid turret
#

That purpose still seems clearer to me with sync instead of checkpoint which is similar to status.

tidal spire
#

Container.🚀() 👍

rancid turret
#

I've been writing a separate proposal for the "multiple pipeline synchronization" (in https://github.com/dagger/dagger/issues/4205), seemed nice until now when I find myself wondering if it's worth it.

Essentially doing err := client.Sync(ctx, lint, test) as a convenience/syntax sugar on top of the language's concurrency:

Pros

  • Very easy/simple to use;
  • Reduces concurrency burden on user to implement correctly;
  • Uniform in all SDKs, which eases adoption (doing concurrency directly is different in all of them);

Cons

  • Requires custom code in all SDKs, can be source of drift (hard to do via API);
  • Would only work by calling .Sync() internally, so not usable for concurrently calling other fields, meaning you should still learn how to do concurrency correctly;

That last point is the killer and why now I don't think it's worth it.

tepid nova
still garnet
#

I wonder if we could have a version of this that works by constructing a parallel GraphQL query? like { foo: container { from("alpine") { id } }, bar: directory { id } } - we currently have no way to express that with the SDKs

rancid turret
still garnet
#

neat! did you plan to have the user provide names for the query fields, or for codegen to handle that internally?

rancid turret
#

Automatically, it's just needed to match the responses:

query {
    id1: container {
        ... withExec(["go", "test", ...]) {
            id
        }
    }
    id2: container {
        ... withExec(["gofmt", ...]) {
            id
        }
    }
}
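
A rough sketch of the automatic aliasing, assuming each selection is already rendered as a GraphQL string. `buildParallelQuery` is a hypothetical helper, not something in the codegen today; the `idN` naming just follows the query above.

```go
package main

import (
	"fmt"
	"strings"
)

// buildParallelQuery wraps each selection in an auto-generated alias
// (id1, id2, ...) so that sibling selections can share one request and the
// responses can be matched back by position.
func buildParallelQuery(selections ...string) string {
	var b strings.Builder
	b.WriteString("query {\n")
	for i, sel := range selections {
		fmt.Fprintf(&b, "  id%d: %s\n", i+1, sel)
	}
	b.WriteString("}")
	return b.String()
}

func main() {
	fmt.Println(buildParallelQuery(
		`container { from(address: "alpine") { id } }`,
		`directory { id }`,
	))
}
```

Because the alias is derived from the argument position, the caller never has to name anything: the value for the second selection is read from `id2` in the response.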
tepid nova
#

I worry that our original intent (to expose the query builder as a low-level, but usable plumbing) has been lost along the way - I wouldn't even know how to experiment with alternative native DX without becoming a codegen and dagger internals expert

#

since our current DX is opinionated, and we shipped it saying "let's just ship it and see how devs like it, we can always adjust later" -> but I worry that's actually not true

tepid nova
rancid turret
#

These are the main issues with our type-safe query builder:

  • No sibling selection, requiring multiple requests;
  • Needs multiple requests to be done concurrently, otherwise blocking;

Laziness is actually good for performance, but depending on the language's concurrency has been problematic. It's also more verbose. That's why I'm thinking of ways to improve both in our DX.

rancid turret
tepid nova
#

yes we definitely are adjusting based on user feedback. but so far within certain constraints, it has to fit the existing DX. I mean specifically: how easy is it to try new things quickly (like an experimental alternative SDK) without rebuilding the foundation. We haven’t really tried that kind of experiment yet - for example the “pipeline builder” model, or other ideas to stick more closely to graphql

tepid nova
#

where does hack/make engine:build export the binary?

tepid nova
#

As I play with the Pipeline call, and now that I can visualize the result, I realize that I can't group steps in a sequence of logical steps (for example "build, then test"). Instead I can only represent nesting ("build, which includes test"). The only exception is if I can encode an explicit dependency between the pipelines, like mounting a file from one to the other. If there is no explicit dependency, then as a workaround I can create a fake one - but that's weird

tepid nova
# tepid nova As I play with the `Pipeline` call, and now that I can visualize the result, I r...

I wonder if the With() proposal couldn't be the solution? If you rename it to Step() and add a mandatory name and description, then it starts looking like a different grouping method, with cleaner delineation between nesting (the contents of the callback) and sequencing (chaining).

dag.
  Step("build", func(_ dagger.Client) error {
    // Build step goes here
    return nil
  }).
  Step("test", func(_ dagger.Client) error {
    // Test step goes here
    return nil
  })

Is this a dumb idea?
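
To illustrate the distinction, a toy model with no Dagger dependency. The `group` type and its `Step` signature are assumptions for this sketch only, simplified from the snippet above. Whatever the callback adds is nested inside the step, while each chained `Step` becomes a sibling.

```go
package main

import (
	"fmt"
	"strings"
)

// group is a hypothetical node in the pipeline visualization tree.
type group struct {
	name     string
	children []*group
}

// Step adds a named child to g, lets fn populate that child (nesting), and
// returns g itself so the next chained Step becomes a sibling (sequencing).
func (g *group) Step(name string, fn func(step *group)) *group {
	child := &group{name: name}
	g.children = append(g.children, child)
	if fn != nil {
		fn(child)
	}
	return g
}

// render returns the tree as indented lines, one group per line.
func (g *group) render(depth int) string {
	var b strings.Builder
	b.WriteString(strings.Repeat("  ", depth) + g.name + "\n")
	for _, c := range g.children {
		b.WriteString(c.render(depth + 1))
	}
	return b.String()
}

func main() {
	root := &group{name: "ci"}
	root.
		Step("build", func(s *group) {
			s.Step("compile", nil) // nested inside "build"
		}).
		Step("test", nil) // sibling of "build", not a child of it
	fmt.Print(root.render(0))
}
```

The key design choice is that `Step` returns the receiver rather than the new child, which is what keeps "build" and "test" at the same level.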

rancid turret
#

It sounds like there's actually a dependency between "build" and "test". If test := build.Pipeline("test").WithExec(...), maybe we can flag in Pipeline("test") somehow to show next to "build", but after. I've seen some cases of users building a report system where they want to visually see "build" next to "test" (for example), when one is actually a dependency of the other. So I see a visualization need there.

#

As for the Steps in sequence, wouldn't this make it easier for people to fall into the trap of running sequentially when things can be run in parallel?

wild zephyr
tepid nova
#

This is not what I’m talking about, sorry my explanation probably wasn’t clear. I’ll share a code snippet

#
package main

import (
    "context"
    "os"

    "dagger.io/dagger"
)

func main() {
    ctx := context.Background()

    dag, err := dagger.Connect(ctx, dagger.WithLogOutput(os.Stdout))
    if err != nil {
        panic(err)
    }
    defer dag.Close()

    build := dag.
        Pipeline("build").
        Git("https://github.com/dagger/dagger").
        Branch("main").
        Tree().
        DockerBuild()
    test := build.
        Pipeline("test").
        WithExec([]string{"version"})

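    // Stdout triggers evaluation; surface any failure instead of discarding it.
    if _, err := test.Stdout(ctx); err != nil {
        panic(err)
    }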
}

And the corresponding run: https://dagger.cloud/runs/2ecd263b-a58b-4b50-8789-bf00fa42b178

still garnet
#

I see what you mean - you basically want to use a pipeline to represent the build phase, but then "pop the stack" and pass it to a test phase, but you can't because the pipeline is baked in to the resulting container. There's still an explicit dependency in that case, but nesting the pipelines doesn't align with your mental model. Is that right?

tepid nova
#

test is "inside" build but I want them to be siblings

still garnet
#

What happens if you do this?

id, err := build.ID(ctx)
test := dag.Pipeline("test").Container(dagger.ContainerOpts{ID: id}).WithExec([]string{"version"})
tepid nova
#

I can hack it by artificially replacing the "chain" by a reference:

package main

import (
    "context"
    "os"

    "dagger.io/dagger"
)

func main() {
    ctx := context.Background()

    dag, err := dagger.Connect(ctx, dagger.WithLogOutput(os.Stdout))
    if err != nil {
        panic(err)
    }
    defer dag.Close()

    build := dag.
        Pipeline("build").
        Git("https://github.com/dagger/dagger").
        Branch("main").
        Tree().
        DockerBuild()
    test := dag.
        Pipeline("test").
        Container().
        WithRootfs(build.Rootfs()).
        WithEntrypoint([]string{"dagger"}).
        WithExec([]string{"version"})

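    // Stdout triggers evaluation; surface any failure instead of discarding it.
    if _, err := test.Stdout(ctx); err != nil {
        panic(err)
    }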
}

tepid nova
still garnet
#

I think my suggestion might not work since the container ID still has the pipeline, but curious how it deals with the already existing 'test' pipeline

tepid nova
#

So yeah my "withrootfs" hack is a dirtier version of your hack. It works as expected (they are siblings)

#

So the problem is specific to the chaining of Pipeline calls, which implies nesting even when you don't mean it as nesting

#

Hence my proposal to make nesting and sequence more distinct with a Container.Step(func)

tepid nova
#

note that my hack is not only cumbersome, it's actually not correct because I don't get all the metadata in the image

#

I mean the container! 😛

still garnet
tepid nova
still garnet
#

oh interesting

tepid nova
still garnet
tepid nova
#

Another way to put it: could Pipeline be merged into the proposed With?

rancid turret
tepid nova
rancid turret
#

You're right, somehow I was thinking of our chain as expressing dependency, but that's in inputs.

#

However, if you have a dir := ctr.Directory() doesn't the directory depend on container?

tepid nova
#

yes, chaining implies dependency, in this case

  1. Build container (or whatever produced ctr)
  2. Get directory
#

but it’s different from nesting

#

ie this would be incorrect:

  1. Build container
    1a. Get directory
rancid turret
#

Yep

#

Ok, now I see your reasoning on With. 😇

#

But I'm having a hard time visualizing usage deeper than top level, like with our CI for example (sdk / xxx / yyy). But it's late for me, I'm sluggish now 🙂

still garnet
#

Maybe something like Pipeline vs SubPipeline, BranchPipeline. Don't know a great name, but yeah

rancid turret
#

Yeah, Pipeline wouldn't nest, while SubPipeline would.

#

I already see users using the term "subpipeline", I think meaning nesting.

tepid nova
#

Do we document anywhere how to install the dagger engine OCI image?

#

I just realized (remembered really) that we don't have install instructions for the engine itself

civic yacht
tepid nova
#

No latest tag... So opinionated

civic yacht
#

@still garnet Thought of another potential reason to modify how we dedupe services, just want to double check it makes sense (making thread)

rancid turret
#

@civic yacht & @still garnet, if you solve an LLB with multiple chained ExecOps and the first one fails, do you know if it's possible to detect if the failed exec was the last one or not?

obsidian rover
#

./hack/dev doesn't seem to work anymore? Has it been deprecated/replaced? 🤔

tepid nova
#

You can look at .github to find the new entrypoints

civic yacht
bronze hollow
#

Is it possible to expose a Container run by Dagger to the host? Like if I do:

client.Container().From("postgres").
        WithMountedCache("/var/lib/postgresql/data", client.CacheVolume("postgres")).
        WithEnvVariable("POSTGRES_USER", "postgres").
        WithEnvVariable("POSTGRES_PASSWORD", "postgres").
        WithEnvVariable("POSTGRES_DB", "postgres").
        WithExposedPort(5432).Stdout(ctx)

Will I ever be able to use local tools to play with that database? 😄

fair ermine
#

@civic yacht Is there any chance the new domain system creates flaky tests in the CI?
I have some errors on the 5052 CI but they do not seem related to my changes

For instance

#11 220.3 #5 0.252 See 'docker run --help'.
#11 220.3 #5 ERROR: process "docker-entrypoint.sh sh -e -c echo zj6i1ndhzly6yn70pgdfpstdx-from-outside > /tmp/from-outside\ndocker run --rm -v /tmp:/tmp alpine cat /tmp/from-outside\ndocker run --rm -v /tmp:/tmp alpine sh -c 'echo zj6i1ndhzly6yn70pgdfpstdx-from-inside > /tmp/from-inside'\ncat /tmp/from-inside" did not complete successfully: exit code: 125
#11 220.3 
#11 220.3 #3 service KJ53UQVCTGCUQ
#11 220.3     container_test.go:3194: 
#11 220.3             Error Trace:    /app/core/integration/container_test.go:3194
#11 220.3             Error:          Received unexpected error:
#11 220.3                             input:1: container.from.withMountedCache.withServiceBinding.withEnvVariable.withExec.stdout process "docker-entrypoint.sh sh -e -c echo zj6i1ndhzly6yn70pgdfpstdx-from-outside > /tmp/from-outside\ndocker run --rm -v /tmp:/tmp alpine cat /tmp/from-outside\ndocker run --rm -v /tmp:/tmp alpine sh -c 'echo zj6i1ndhzly6yn70pgdfpstdx-from-inside > /tmp/from-inside'\ncat /tmp/from-inside" did not complete successfully: exit code: 125
#11 220.3                             Stdout:
#11 220.3                             
#11 220.3                             Stderr:
#11 220.3                             docker: error during connect: Post "http://EHOHEVQ3FHJ7I:2375/v1.24/containers/create": dial tcp: lookup EHOHEVQ3FHJ7I on 10.88.0.1:53: no such host.
#11 220.3                             See 'docker run --help'.
#11 220.3                             
#11 220.3                             Please visit https://dagger.io/help#go for troubleshooting guidance.
#11 220.3             Test:           TestContainerInsecureRootCapabilitesWithService
#11 220.3 #3 CANCELED
#11 220.3 ------
#11 220.3  > :
#11 220.3 #5 0.252 docker: error during connect: Post "http://EHOHEVQ3FHJ7I:2375/v1.24/containers/create": dial tcp: lookup EHOHEVQ3FHJ7I on 10.88.0.1:53: no such host.

More details on the Run: https://github.com/dagger/dagger/actions/runs/4940600943/jobs/8833006029?pr=5052
PR: https://github.com/dagger/dagger/pull/5052

#

And when I try to run the tests locally, I get the following error:

    container_test.go:2844: 
                Error Trace:    /Users/tomchauveau/Documents/DAGGER/dagger/core/integration/container_test.go:2844
                Error:          Received unexpected error:
                                Post "http://dagger/query": EOF
                                Please visit https://dagger.io/help#go for troubleshooting guidance.
                Test:           TestContainerWithUnixSocket

Even for basic tests like TestContainerWith. Should I raise an issue, or might it come from my setup?

I'm basically just running the command ./hack/dev go test -v -count=1 -run TestContainerWith $(pwd)/core/integration/ (or the same with another test), but the result is the same 😦

civic yacht
civic yacht
fair ermine
#

I could make it work somehow after multiple runs, but not every time

#

Same errors on main, but not every time

civic yacht
fair ermine
#

Ah! Makes sense

#

This one always fails yeah

civic yacht
#

The thread is kind of long, but the tl;dr is that there were some issues that got fixed, but there's still a bug that only appears on macOS. We haven't prioritized investigating+fixing it yet. For the time being, you can probably just skip running that one directly on your local machine

fair ermine
#

No problem, ty!

civic yacht
#

secret scrubbing

still garnet
civic yacht
fair ermine
ancient kettle
celest totem
#

is there any scenario where we intentionally avoid running Go tests in parallel? just thinking about this as it seems we don't have t.Parallel in a few test cases (e.g. a few secret integration tests, a few file integration tests, etc.).

wild zephyr
#

Add resource usage monitoring for build ...

obsidian rover
civic yacht
civic yacht
#

@celest totem is running into an issue where he can't run any tests when he doesn't have internet access because even if all the images are pulled to the local cache we still fail when trying to resolve the tag->sha: https://github.com/dagger/dagger/issues/5135

At this point I think just updating the tests to use image SHAs is probably the quickest unblock, but mentioned another feature we could add to make this work too in that issue. Let us know if anyone else has any ideas

celest totem
#

a quick update: I've spent some time in the past days going through the secrets code and particularly the secrets scrub feature (https://github.com/dagger/dagger/issues/4864), seems we already had a PR (https://github.com/dagger/dagger/pull/4944) and it was closed due to inactivity. I made tests work and did some benchmarks, etc. I think it generally achieves what we expect and the code could be improved too (I think there's a lot of string operations going on, multiple string scans, splits, etc.), so I'm spending some iterations around this. I will also make sure that we extend tests to cover all scenarios. Some notes and draft stuff, very dirty for now -based on the original PR-: https://github.com/matiasinsaurralde/dagger/tree/secret-scrub-multi-write-2

GitHub

A programmable CI/CD engine that runs your pipelines in containers - Issues · dagger/dagger

GitHub

We need to scrub even if the secret is written in multiple os.Stdout.Write().
Current state
After investigation, our system split big secrets into 32KB chunks.
And our secrets are limited to < 1...

GitHub

A programmable CI/CD engine that runs your pipelines in containers - GitHub - matiasinsaurralde/dagger at secret-scrub-multi-write-2

still garnet
obsidian rover
dense dust
#

I'll publish now

obsidian rover
dense dust
#

LMK if something wasn't clear about the process, I think it's not documented in the README yet

obsidian rover
dense dust
celest totem
# celest totem a quick update: I've spent some time in the past days going through the secrets ...

raising a draft PR here, based on the previously existing branch with some improvements: https://github.com/dagger/dagger/pull/5149

GitHub

Picking up #4944 with some refactoring:

Simplify SecretScrubWriter write logic and only use []byte for matching and replacements. I also explored strings.Replacer and bytes.Replacer but didn't...

tepid nova
granite coral
#

@stray heron I am gonna do a few things with the dagger rust sdk. First, I am retrofitting the CI onto how you're doing it in hack/make. I've already got tests working with a properly cached base image.

2nd, I will consolidate the rust crates into features instead. This will drastically reduce the release (sdk:rust:bump) process, and it shouldn't take that long either. I.e. the new setup will only have dagger-sdk and dagger-codegen. The parts codegen needs to actually generate will just use feature imports instead, so that it doesn't pollute the clients.

3rd, append the sdk to the existing workflows in .github/workflows/*

The reason for not migrating the CI was the awkwardness of it. The old CI setup doesn't work that well with the official dagger setup, so I am sort of rewriting it; it isn't that much work either.

ps. don't know if this is the right channel for these kinds of things, but I wanted to open the talk here, instead of dm so people could follow along if they want to 😄

stray heron
#

gerhard 1272 i am gonna do a few things

celest totem
rancid turret
rancid turret
#

Command extensions

obsidian rover
#

@civic yacht, would you have some time today to sync with me? It's related to https://github.com/dagger/dagger/issues/5069#issuecomment-1568277016. The main question is: would it be possible to mount files as secrets from a path? @rancid turret told me that we used to rely on a dagger implementation of "fsync" (from what I understood) that could lead to the implementation of a new API endpoint enabling this feature?

rancid turret
# obsidian rover @civic yacht, would you have some time today to sync with me ? it's rel...

The important question is if it's possible to transfer a file from the client to the engine, keeping it a secret from BuildKit. We could then avoid sending their content via API with setSecret (bypassing encoding issues), and instead use setSecretFile(path: String!) (for example) to load the content as bytes in the in-memory store. We used to have secrets loaded from files but it used llb.Local (I think), so not sure if it's possible to do it secretly or not.

rancid turret
tepid nova
#

The main problems are 1) leaking the file into the cache, and 2) it breaks the composition model

rancid turret
#

@obsidian rover I just tested a Host.setSecretFile(name: String!, path: String!) and it works fine. Commented on the issue.

obsidian rover
obsidian rover
still garnet
civic yacht
#

Erik Sipsma 3294 would you have some

obsidian rover
#

Can't repro locally, when running just these tests

still garnet
#

yeah, seen it in a few runs now

civic yacht
celest totem
#

seeing some issues on my latest builds too

rancid turret
#

context deadline exceeded

civic yacht
rancid turret
tepid nova
#

Note for later watching the "dagger on equinix metal" demo:

  • The buildkit unix socket is leaking into user configurations. If it's part of our public API, we need to own it and standardize a name other than "/var/run/buildkit". If it's not, we need to hide it and give operators an alternative (eg. Airbyte production setup uses that path)

  • Exposing the engine tcp port on non-localhost... I know we talked about it. Seeing this talk reminds me how dangerous it is. Right now I want to disable non-localhost altogether. When someone inevitably shoots themselves in the foot and gets hacked because of an unsecured remote access to our privileged container... Saying "but we said not to do this in the video!" will not be enough.

tepid nova
#

Zoom Q&A from that demo

tepid nova
#

On the bright side: clearly there is great interest in seeing these "recipes" for deploying Dagger to different environments.

#

Maybe it's time to converge snippets from these open presentations, and our own snippets from production customers, clean them up and open-source them? We talked about doing it, but we didn't talk about when. Maybe the answer is: now? What does everyone think?

civic yacht
#

Also, for /var/run/buildkit, that’s a default value we inherit from buildkit, but it’s configurable by users. We can change the default very easily too

fresh harbor
#

Hi @stray heron ,

I want to discuss releasing the Elixir SDK a bit to make sure I'm aligned with you:

  • Renaming the package from dagger_ex to dagger.
  • Having a dagger user on hex.pm, or creating an organization for publishing the package. I've never tried an organization, but it should work like a user publishing a package.
  • Implementing ./hack/make sdk:elixir:publish. I can take a look at this part.
  • Changing the LICENSE: it needs to change to comply with dagger, am I right?
  • What version should we start with? The same version as the dagger engine (0.6.1 for example)?

Then we can test the release mentioned in https://github.com/dagger/dagger/pull/5201 once all the tasks above are done. Am I missing something?

GitHub

This imports https://github.com/wingyplus/dagger_ex under sdk/elixir as an experimental SDK.
@gerhard action items (part of this PR)

Open PR with https://github.com/wingyplus/dagger_ex history im...

stray heron
# tepid nova Note for later watching the "dagger on equinix metal" demo: - The buildkit unix...

Being able to access Dagger Engine via TCP port is a must-have for me. Using something like nc or socat for this purpose, or some other type of TCP to UNIX proxy, would be an extra hoop to jump through. Doable, but it would make Dagger less versatile out of the box. Binding to a UNIX socket by default is a good strategy.

Docker's ability to bind to a TCP port (although disabled by default) is the only reason why I am still using it today (have been running it on Linux & connecting via a Tailscale tunnel to it for a bunch of years now).

I was thinking the same thing about /var/run/buildkit.

tepid nova
#

@stray heron I remember that you said that last time, and I want to take that into account - good OX is important. I'm just worried about the path we're taking right now.

stray heron
tepid nova
#

@stray heron can you remind me in that equinix metal setup, why there is both a tcp port and unix socket involved?

#

tcp for the graphql server, unix for the "dagger remote API"/wrapped buildkit grpc ?

#

but that can't be right

stray heron
#

TCP so that we connect to it from local, laptop -> K8s -> Dagger Engine.

#

It's the same as DOCKER_HOST=tcp://100.81.87.121:2375. In this case it was _EXPERIMENTAL_DAGGER_RUNNER_HOST=tcp://remote.dagger.engine:30000

tepid nova
#

what about the unix socket?

#

OK so we are talking about tcp and unix for the same component - clients would use one or the other depending on the situation

stray heron
#

Yes. We did not use the unix socket in the demo. It was there from when we added self-hosted GitHub runners. These run on the same host as the Dagger Engine and then connect to it via that shared unix socket which you saw mentioned yesterday in the demo.

tepid nova
#

OK thanks. I have to run into a meeting, to be continued. So far I do feel that tcp support beyond localhost is not worth it. This can be done in userland.

stray heron
#

Having used both setups, it is simpler & more elegant to use free GitHub runners which connect to remote Dagger Engines via TCP. Doing this via WireGuard / Tailscale is what I'm thinking.

stray heron
#

I have another experiment which runs Dagger Engine on Fly.io Machines. That also requires the Dagger Engine to bind TCP to 0.0.0.0. With nearly-instant shutdown, start & scale-up + volume cloning & native WireGuard integration, that makes for the simplest production-ready setup that I am currently aware of. K8s is easy for me, but I know users that would prefer to not have it in their stack.

tepid nova
#

You can just ssh port forward

#

Or setup something more custom yourself

#

Allowing 0.0.0.0 implies that it’s safe to do so, when in fact it’s not. It invites a use case that is an anti-pattern. In return we get a small convenience for an unusual use case. Just tunnel the port!

tepid nova
#

(playing devil's advocate a bit)

#

We could even add a tunnel-command flag to dagger run so that it can call the ssh tunnel command on the fly, no out-of-band scaffolding needed

#

like RSYNC_CONNECT_PROG for those who remember that 🙂

wild zephyr
# stray heron I have another experiment which runs Dagger Engine on Fly.io Machines. That also...

there's another thing we can leverage if we want to avoid exposing the TCP way until we make a decision, which is basically relying on DOCKER_HOST ssh capabilities. If you set DOCKER_HOST=ssh:// in your Dagger pipelines and point it to a capable Docker server, that will still run Dagger pipelines the same way as using the _EXPERIMENTAL_DAGGER_RUNNER_HOST variable. The only caveat is that the server needs to have the docker daemon installed instead of just the Dagger engine

civic yacht
#

And once we move away from the _EXPERIMENTAL_DAGGER_RUNNER_HOST env I am guessing it could look a bit different in terms of ux, but we can use the same plumbing

tepid nova
#

@stray heron the markdown format in sdk/go/README.md is a bit weird, is it related to the go.dev tooling requirements, or just a neutral formatting choice? Specifically it's the link format, it uses a format I'm not familiar with (all brackets, with a footer reference)

stray heron
still garnet
#

yep that's normal Markdown; I use that format all the time, makes it easier to read paragraphs without long URLs intermingled

tepid nova
#

good to know 🙂 thanks

tepid nova
#

That is my new favorite markdown hack, why aren't we doing this everywhere, it's so much more readable

#

@civic yacht I'm trying cd ci ; dagger do in our own repo but getting errors about missing go.mod.

#

Am I doing something stupid?

civic yacht
tepid nova
#

Thanks

#

Tried running ci:sdk:go:lint but getting cryptic error

#
│ │ │ │ │ │ │ █◀──╯ CANCELED exec golangci-lint run -v --timeout 5m
█ │ │ │ │ │ │ │ ERROR dagger do --config ./ci ci:sdk:go:lint
┃ │ │ │ │ │ │ │ Loading+installing project...                                                       
┃ │ │ │ │ │ │ │ Running command "ci:sdk:go:lint"...                                                 
┃ │ │ │ │ │ │ │ Error: process "/entrypoint" did not complete successfully: exit code: 1            
█◀┴─┴─╯ █ │ │ │ ERROR [ci sdk go lint]
┃       ┃ │ │ │ input:1: pipeline.pipeline.pipeline.container.from.withMountedDirectory.withWorkdi  
┃       ┃ │ │ │ r.withExec.stderr failed to load cache key: error fetching default branch for repo  
┃       ┃ │ │ │ sitory https:: exit status 128                                                      
┃       ┃ │ │ │                                                                                     
┃       ┃ │ │ │ Please visit https://dagger.io/help#go for troubleshooting guidance.                
┻       │ │ │ │ 
        │ │ │ █ ERROR git://
        │ │ │ ┃ fatal: unable to access 'https://': Could not resolve host: info                    
        ┻ ┻ ┻ ┻ 
#

Also found a bug in cloud I think, since that error doesn't show up in cloud at all

civic yacht
tepid nova
#

It's pure docs, but had to re-run engine-race-detection

rancid turret
#

That's ready for review btw

civic yacht
tepid nova
celest totem
oak sandal
# stray heron Being able to access Dagger Engine via TCP port is a must-have for me. Using som...

I use Dagger engine over TCP too. Currently, it's in a private network, but once there's mTLS support, I'd really like to use it over public internet too. It was also relatively simple for us to switch from Buildkit to Dagger Engine. We also want to use other projects that rely on Buildkit and have a shared instance to manage. This is probably not intended behaviour but it would be nice to have that extensibility

tepid nova
#

Production-grade TLS support is a lot of work. If anything goes wrong, we're on the hook to troubleshoot and fix it. Add to that the security implications, which sets an even higher bar. That's cycles we're not spending on another feature or quality improvement. I question whether that's worth it.

#

Wouldn't you rather hook up a TLS implementation you already trust and is operationally familiar?

civic yacht
celest totem
#

hello! I've been going through the pipeline freeze issue that @civic yacht described here: https://github.com/dagger/dagger/issues/5294
have spent the past days going through the multiple discussions around it and implementing the solution suggested by Erik. Seems that adding a mutex to prevent concurrent Send calls from the gRPC server might fix the issues.
for now my implementation overwrites the PB generated code as it was faster to test it this way. I'm running a final round of tests (and also Dagger tests based on buildkit patch for this) and will wrap it up so that it doesn't mess with PB generated code 👍
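
A minimal sketch of the mutex idea, for readers following along. This is not the actual patch: `sender`, `lockedSender`, and `recordingSender` are hypothetical names, and `sender` only stands in for a gRPC server stream. It shows a wrapper that serializes `Send` so two goroutines never write to the same stream at once.

```go
package main

import (
	"fmt"
	"sync"
)

// sender is a hypothetical stand-in for a gRPC server stream.
type sender interface {
	Send(msg string) error
}

// recordingSender exists only for the demo: it flags overlapping Send calls.
type recordingSender struct {
	mu       sync.Mutex
	inFlight int
	overlap  bool
	sent     int
}

func (r *recordingSender) Send(msg string) error {
	r.mu.Lock()
	r.inFlight++
	if r.inFlight > 1 {
		r.overlap = true
	}
	r.mu.Unlock()

	// A real stream would write to the wire here.

	r.mu.Lock()
	r.inFlight--
	r.sent++
	r.mu.Unlock()
	return nil
}

// lockedSender serializes Send calls so only one runs at a time.
type lockedSender struct {
	mu    sync.Mutex
	inner sender
}

func (l *lockedSender) Send(msg string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.inner.Send(msg)
}

// sendConcurrently fires n Sends from n goroutines and waits for all of them.
func sendConcurrently(s sender, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = s.Send(fmt.Sprintf("msg-%d", i))
		}(i)
	}
	wg.Wait()
}

func main() {
	rec := &recordingSender{}
	sendConcurrently(&lockedSender{inner: rec}, 100)
	fmt.Println("sent:", rec.sent, "overlap:", rec.overlap)
}
```

Wrapping the stream rather than editing generated code is one way to keep the PB output untouched, which seems to be the direction described above.
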

GitHub

There is a lot more context in this whole thread: #5229 tl;dr Long-standing issue with buildkit where certain grpc operations just freeze indefinitely until client disconnects. Reported a few years...

celest totem
#

Pipeline freeze possibly due to Buildkit...

median charm
#

New old user here, been lurking since the old CUE-days. I've just managed to "self build" the Dagger Graphql through dotnet via the dagger run dotnet run-dance and I'm wondering to what extent this is supposed to be dynamic (as in being compiled runtime) or rather if it's supposed to be a one off built client released in versions (i e the graphql is ingested via dagger run dotnet run when the client is built and compiled and that's that)?

celest totem
#

a quick one: is anyone able to reproduce this issue? seems to be happening on latest main and caused by a recent change -still figuring out which one-:

go test -v -timeout 30s -count 1 -run TestContainerWithMountedSecretOwner github.com/dagger/dagger/core/integration 
...

the failing test is TestContainerWithMountedSecretOwner/userid.

tepid nova
#

We started a DX thread somewhere, which I would like to continue and resolve, but I can't find it...

It was about the awkwardness of Pipeline() for logical grouping, and the idea of using With instead.

  1. Does anyone remember where that thread is?
  2. Any opinions on that? 🙂
celest totem
wild zephyr
still garnet
#

We started a DX thread somewhere which I

celest totem
#

I m in a linux machine BTW not sure if

celest totem
rancid turret
still garnet
rancid turret
fresh harbor
celest totem
still garnet
#

@civic yacht How easy/hard would it be to change dagger do to not start its own session, and leave it to the SDK instead? I'm trying to bring more clarity to https://linear.app/dagger/issue/DAG-1801/engine-autostart-dx, and right now there are a few points on the side of no-session-in-dagger run, but that victory will be shortlived if dagger do is harder to split up in the same way. /cc @fair ermine

civic yacht
still garnet
#

Ah interesting, but we would need to use the right version of the Go SDK, right? So kind of a lateral move

civic yacht
# still garnet Ah interesting, but we would need to use the right version of the Go SDK, right?...

Yeah, but we already have dependencies in the engine code on the go sdk and just use an override in go.mod. So if you are using an officially released CLI, that will end up using the corresponding officially released SDK, which will point back to the corresponding CLI 🙂 Maybe we could have dagger do set _EXPERIMENTAL_DAGGER_CLI_BIN to its own executable, just to save some time re-checking.

still garnet
# civic yacht Yeah, but we already have dependencies in the engine code on the go sdk and just...

But that won't solve the issue where you can dagger run or dagger do with a v1.0.0 CLI (and therefore API), but run SDK code that depends on the v0.5.0 API, and ends up talking to v1.0.0 because it prioritizes $DAGGER_SESSION_PORT/etc. - right?

In the past we've talked about adding an API version compatibility check to prevent this, which is still on the table, but I'm also exploring going back to SDK-initiated sessions, which resolves a lot of issues but leaves a question-mark for dagger do

civic yacht
# still garnet But that won't solve the issue where you can `dagger run` or `dagger do` with a ...

Ohh sure, yeah that's another dimension to all this. Interestingly, the fact that dagger do runs code containerized gives us more ability to deal with this gracefully.

Right now, the entrypoints run as a nested session, so that's served from the shim and thus tied to the engine version. But in theory we could identify the version of dagger the user's code depends on (as part of the runtime setup https://github.com/dagger/dagger/blob/7ab8a6983dc8139d5f540cc63a8203805718785a/core/project.go#L257-L264) and then use that to ensure that we serve them a session at the right version.

tepid nova
#

SDK-initiated sessions won’t solve this, because in production, operators will require control of their engine version; it won’t be negotiable for many

#

If we don’t design with that version decoupling in mind, operators will work around the design, making the problem worse

#

that’s on top of the other benefits of “thin SDKs”

still garnet
#

Seems like that forces the question of whether engine version should always === API version. Also related to @civic yacht's grand experiment of having the API run as an engine frontend

#

The buildkit API moves at a much slower pace, compared to say a large monorepo with multiple projects advancing their Dagger integration at different paces. It could be painful to have to keep bumping them to match exact versions, or even semver

tepid nova
#

Generally I think GraphQL recommends against versioning APIs, right?

still garnet
#

Apparently, but you have to deal with physics at some point no? 😄

tepid nova
#

Does a CLI-centric UX change the parameters of this problem, by introducing CLI version in the equation?

#

In that UX, CLI is upstream in the control path to both engine and sdk

#

so maybe version checking etc could take place there? This is a half-baked thought at best

still garnet
#

Yeah, so one option is to keep the CLI serving the session API, and have SDKs run a compatibility check when they first connect to the session. That way we just have one version encompassing the CLI (incl. progress streaming protocol), the API, and the engine

#

Definitely the easiest for us since it collapses a few cells in the compatibility matrix into one

#

The downside is that upgrading Dagger can become kind of all-or-nothing, imagining a large monorepo running against one big shared engine setup, so people could end up stuck on old versions for quite a while to avoid the pain of upgrading projects they're less familiar with. But this is imagining the worst-case scenario
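
A rough sketch of what a connect-time compatibility check could look like. Everything here is an assumption: `checkCompatible` and `parseMajorMinor` are hypothetical helpers, and the policy (same major version, and same minor version while on 0.x) is just one plausible rule, not something decided in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor extracts the major and minor numbers from a "vX.Y.Z" string.
func parseMajorMinor(v string) (major, minor int, err error) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("malformed version %q", v)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("malformed version %q: %w", v, err)
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, fmt.Errorf("malformed version %q: %w", v, err)
	}
	return major, minor, nil
}

// checkCompatible applies an assumed policy: majors must match, and while
// the major is 0 the minors must match too.
func checkCompatible(sdkAPI, sessionAPI string) error {
	sdkMajor, sdkMinor, err := parseMajorMinor(sdkAPI)
	if err != nil {
		return err
	}
	sesMajor, sesMinor, err := parseMajorMinor(sessionAPI)
	if err != nil {
		return err
	}
	if sdkMajor != sesMajor || (sdkMajor == 0 && sdkMinor != sesMinor) {
		return fmt.Errorf("SDK expects API %s but session serves %s", sdkAPI, sessionAPI)
	}
	return nil
}

func main() {
	fmt.Println(checkCompatible("v0.5.0", "v1.0.0"))
	fmt.Println(checkCompatible("v0.6.2", "v0.6.3"))
}
```

An SDK would run something like this once, right after it connects to the session, and fail fast with a readable error instead of hitting schema mismatches later.
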

still garnet
tepid nova
still garnet
#

true, then you just have to deal with the dagger CLI that the user actually runs. I guess we could make it a thin pass-through that keeps a local cache of each version and re-execs to the appropriate one
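
To make the pass-through idea concrete, here is a small sketch of the decision step only. The cache layout (`<cacheDir>/dagger-<version>`) and both function names are hypothetical; a real pass-through would also need to download missing versions and then replace the current process with the chosen binary.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// cachedBinaryPath returns where a given CLI version would live in the local
// cache. The layout is an assumption made for this sketch.
func cachedBinaryPath(cacheDir, version string) string {
	return filepath.Join(cacheDir, "dagger-"+version)
}

// pickBinary decides whether the running CLI can serve the project itself or
// should hand off to a cached binary of the version the project requires.
func pickBinary(cacheDir, running, required string) (path string, reexec bool) {
	if required == "" || required == running {
		return "", false // nothing to do: the running CLI already matches
	}
	return cachedBinaryPath(cacheDir, required), true
}

func main() {
	path, reexec := pickBinary("/tmp/dagger-cache", "v1.0.0", "v0.5.0")
	fmt.Println(path, reexec)
}
```
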

fair ermine
rancid turret
fair ermine
#

Interesting!

still garnet
#

@civic yacht @rancid turret FYI: I think it might be time to get rid of the Stdout/Stderr error wrapping. It's causing Buildkit to raise a cryptic error when I try to pass refs to a gateway container:

unexpected type for reference, got *core.ref

which is coming from here:

    for _, m := range req.Mounts {
        resultID := m.ResultID
        if m.Ref != nil {
            ref, ok := m.Ref.(*reference)
            if !ok {
                return nil, errors.Errorf("unexpected type for reference, got %T", m.Ref)
            }
            resultID = ref.id
        }

(code: https://github.com/moby/buildkit/blob/2ff0d2a2f53663aae917980fa27eada7950ff69c/frontend/gateway/grpcclient/client.go#L771)

Kinda gross that this is an interface but in practice it relies on a concrete implementation, but the easiest step right now is probably to just clean the wrapping up, assuming its only purpose was Stdout/Stderr collection, which is now somewhat redundant with the TUI

rancid turret
#

The engine does raise a new custom ExecError instead of the original SolveError, is that the issue?

still garnet
#

not the new error types, which are great

#

though maybe pieces of this are still involved in that path?

rancid turret
#

But the new ExecError comes from there.

still garnet
#

hmm, yeah, unfortunate... maybe there's somewhere else that we can wrap, or maybe I can handle the wrapping by adding a type assertion + Unwrap() bkgw.Reference or something
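
A sketch of the "type assertion + Unwrap()" option, with stand-in types. `Reference` here is a simplified, hypothetical interface rather than the real `bkgw.Reference`, and `concreteRef`/`wrappedRef` are invented for the example. The point is just that a wrapper can expose what it wraps, so the concrete type can be recovered before handing a ref to code that insists on it.

```go
package main

import "fmt"

// Reference is a simplified stand-in for bkgw.Reference.
type Reference interface {
	ID() string
}

// concreteRef plays the role of the concrete type the gateway client expects.
type concreteRef struct{ id string }

func (r *concreteRef) ID() string { return r.id }

// wrappedRef plays the role of a wrapper that adds extra behavior
// (for example, Stdout/Stderr collection) around another reference.
type wrappedRef struct{ Reference }

// Unwrap exposes the wrapped reference.
func (w *wrappedRef) Unwrap() Reference { return w.Reference }

// unwrapRef peels wrappers until it reaches a reference with no Unwrap method.
func unwrapRef(ref Reference) Reference {
	for {
		u, ok := ref.(interface{ Unwrap() Reference })
		if !ok {
			return ref
		}
		ref = u.Unwrap()
	}
}

func main() {
	inner := &concreteRef{id: "abc"}
	wrapped := &wrappedRef{Reference: inner}
	fmt.Printf("%T -> %T\n", wrapped, unwrapRef(wrapped))
}
```
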

rancid turret
#

Yeah

tepid nova
#

Is there a guide or cookbook recipe, or any sort of material, on how to use privileged exec? Ideally in the context of running a service, eg. kubernetes or dockerd? cc @still garnet @tidal spire @wild zephyr

tidal spire
wild zephyr
fresh harbor
stray heron
fresh harbor
stray heron
#

Not at all! That was a great catch. I will update the issue next.

fresh harbor
fresh harbor
#

I would like to share about the progress on elixir sdk a little bit:

GitHub

A programmable CI/CD engine that runs your pipelines in containers - Issues · dagger/dagger

tepid nova
#

Is there a guide somewhere on how to add content to the cookbook?

#

I see a docs/current/cookbook/snippets directory, but there are only 5 snippets in there

tepid nova
#

Ok I see, so there's a gradual migration underway, and docs/current/cookbook/snippets is where new snippets should go?

obsidian rover
tepid nova
#

Ok thanks !

tepid nova
#

with_mounted_file("/var/run/docker.sock", client.host().file("/var/run/docker.sock"))
rancid turret
tepid nova
#

thank you I had missed that

#

I can see how the asymmetry can be confusing

tepid nova
#

Oh actually there is no asymmetry

#

The correct snippet is with_unix_socket("/var/run/docker.sock", client.host().unix_socket("/var/run/docker.sock"))

#

I'm working on adding this to the cookbook 🙂

#

I'm trying this:

_, err = client.
  Container().
  From("docker").
  WithUnixSocket(
    "/var/run/docker.sock",
    client.Host().UnixSocket("/var/run/docker.sock"),
  ).
  WithExec([]string{"docker", "run", "--rm", "alpine", "uname", "-a"}).
  Sync(ctx)

but getting this:

Connected to engine 7be02c9cdded
panic: input:1: host.unixSocket eval symlinks: lstat /private/var/run/docker.sock: no such file or directory
#

Any suggestions welcome... I'm not sure what's wrong now

#

checking if there's an opt to follow symlinks in WithUnixSocket

#

Oh wait, there's no /var/run/docker.sock on my mac. What am I missing (or what obvious thing did I forget?)

rancid turret
#

Try From("docker:cli")~

#

Is docker running in the host?

tepid nova
#

yes (via d4mac)

#

docker ps works

rancid turret
#

Does that put the socket in a different location?

tepid nova
#

that’s what I’m trying to find out. but it seems that yes

rancid turret
#

Can you ls it?

tepid nova
#

no

rancid turret
#
❯ ls -lha /var/run/docker.sock
lrwxr-xr-x  1 root  daemon    37B May 31 09:42 /var/run/docker.sock -> /Users/helder/.docker/run/docker.sock
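
Since `/var/run/docker.sock` can be a symlink (as in the `ls` output above) or missing entirely, one option is to resolve the socket path on the host before handing it to the SDK. A minimal sketch using only the standard library; `findDockerSocket` is a hypothetical helper, and the `~/.docker/run/docker.sock` candidate is taken from the symlink target shown above, not from any documented default.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findDockerSocket returns the first candidate path that resolves on this
// host, following symlinks along the way.
func findDockerSocket(candidates []string) (string, error) {
	for _, c := range candidates {
		resolved, err := filepath.EvalSymlinks(c)
		if err == nil {
			return resolved, nil
		}
	}
	return "", fmt.Errorf("no docker socket found in %v", candidates)
}

func main() {
	candidates := []string{"/var/run/docker.sock"}
	if home, err := os.UserHomeDir(); err == nil {
		// Assumed per-user location, based on the symlink target above.
		candidates = append(candidates, filepath.Join(home, ".docker", "run", "docker.sock"))
	}
	sock, err := findDockerSocket(candidates)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("resolved socket:", sock)
}
```

The resolved path could then be passed to `client.Host().UnixSocket(...)` in place of the hard-coded one.
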
wild zephyr
#

👋 currently we don't have a "friendly" way to enable magicache in the engine which doesn't involve telling users to manually spin up the engine and supply the DAGGER_CACHESERVCE* variables themselves. Shall we add support for adding those automatically to the engine when first spawned if they're present at the session handler level?

Additionally, I've been bitten sometimes in the past by trying to disable the DAGGER_SERVICES feature for troubleshooting purposes after realizing that I needed to kill the engine for that to take effect. So the fact that some variables take effect on every run and others only when the engine is spawned is somewhat confusing. Do you have any thoughts about this or shall we create an issue to keep track and discuss further there?

thoughts? @tepid nova @still garnet @civic yacht @ancient kettle

wild zephyr
still garnet
#

it'll cache them too aggressively, since they don't have a direct dependency on the code they're testing

hasty basin
tidal spire
civic yacht
# wild zephyr 👋 currently we don't have a "friendly" way to enable magicache in the engine wh...

If you mean that we would allow clients to set the env var rather than setting it when the engine starts, then no that's not possible yet and it's unclear if it would be possible to get that working. The reason being that:

  1. When the cache service is enabled, it swaps out the whole cache manager used by the engine, so making this per-client would require somehow using one cache manager for some clients and another for other clients
  2. Cache mounts specifically need to be synced before the engine starts serving any clients, it's not possible today to sync them later

So for now this has to be configured when starting the engine container. I think the best way to improve the experience would be:

  1. Improve the custom provisioned engine experience in general (e.g. maybe CLI interfaces for doing it rather than the whole "you're on your own" thing today)
  2. Related to above, add an "operator API" that lets admins configure engine global settings at runtime. I could see it being possible to implement a global swap of the cache service from disabled->enabled (as opposed to trying to make it per-client).
celest totem
#

is there a specific reason to use Alpine -besides size, etc.- as the dev engine’s base image? just curious as we’ve been discussing a related issue today

fair ermine
#

@civic yacht Hey, I'm hitting a funny issue on the Ts entrypoint.

For multiple arguments, arguments are by default ordered alphabetically, but in the function signature they can be in any order, so it might happen that the arguments are sent in the wrong place.
I read your code in the Go SDK but could not find any kind of sorting, so how do you make sure arguments are sent in the right order?

For instance:

async function build(client: Client, repo: string, branch: string, subpath: string): Promise<string> {}

The thing is that input are given in the following way

{
   "args": { "branch": "", "repo": "https://github.com/dagger/dagger", "subpath": "" },
   "parent": null,
   "resolver": 'Query.build'
}

So branch will be inserted as first arg, instead of repo (in Typescript)

#

I could reintrospect the function to know in which order I shall send arguments but it might not be the best solution, I don't know if you did something special in Go
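
For illustration, the re-introspection approach boils down to mapping the named `args` onto the declared parameter order. A sketch in Go (to match the other snippets in this channel; the same idea would apply in TypeScript). `orderArgs` is a hypothetical helper and says nothing about how the Go SDK actually handles this.

```go
package main

import "fmt"

// orderArgs maps named arguments onto positional ones, following the order
// in which the parameters are declared in the function signature.
func orderArgs(params []string, args map[string]any) ([]any, error) {
	ordered := make([]any, 0, len(params))
	for _, name := range params {
		v, ok := args[name]
		if !ok {
			return nil, fmt.Errorf("missing argument %q", name)
		}
		ordered = append(ordered, v)
	}
	return ordered, nil
}

func main() {
	// Declared order from the build(client, repo, branch, subpath) example.
	params := []string{"repo", "branch", "subpath"}
	args := map[string]any{
		"branch":  "",
		"repo":    "https://github.com/dagger/dagger",
		"subpath": "",
	}
	ordered, err := orderArgs(params, args)
	fmt.Println(ordered, err)
}
```
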

wild zephyr
# civic yacht If you mean that we would allow clients to set the env var rather than setting i...

@civic yacht that's the thing, we're not setting the env var when the engine starts through the CLI as we do for caching config or services (https://github.com/marcosnils/dagger/blob/fdfb3528af6dff7256c727503949ab46cd8fed27/internal/engine/docker.go#L96). I'm not suggesting to change that behavior, I'm mostly suggesting for engine provider to be able to use that variable if present when provisioning the engine. Otherwise, the only way for users to start the engine with magicache is if they do it manually outside the SDK provisioning flow

ancient kettle
civic yacht
# wild zephyr <@949034677610643507> that's the thing, we're not setting the env var when the e...

Oh I didn't remember that we did that, and actually the "-e", CacheConfigEnvName serves no purpose anymore since the engine hasn't interpreted that for a while (I'll send a fix).

We can do that but I really don't think it's a good UX because it will only work if you are auto-provisioning for the first time. If you've run dagger before and then try to set that it will have no effect.

The other thing is that magicache is only usable in AWS right now, where I think very few to zero users would be relying on the auto-provisioning, so I'm not sure if it would be useful in that sense either.

wild zephyr
# civic yacht Oh I didn't remember that we did that, and actually the `"-e", CacheConfigEnvNam...

yeah.. agree, that's why I was originally thinking that we should eventually come up with a way to bump this so we can do it at runtime as you mentioned. I also agree that given that magicache is still AWS specific, for UX purposes, it might not be worth adding it at the engine provisioning phase. So in summary, I think we're aligned to hold off on this one for a bit until it's clearer how the magicache toggle would be used best. Thx for the input Erik 🙏

@ancient kettle is there a specific use-case you have in mind which would benefit from enabling magicache with the docker provisioner?

ancient kettle
civic yacht
ancient kettle
wild zephyr
civic yacht
wild zephyr
ancient kettle
#

@civic yacht @still garnet I'm getting this error when running engine:test on Dagger GHA EKS

failed to get stream processor for application/vnd.docker.image.rootfs.diff.tar.zstd: no processor for media-type.

Does that ring any bells?

FAIL: TestEngineExitsZeroOnSignal - https://github.com/dagger/dagger/actions/runs/5359103142/jobs/9723222159#step:6:5612
FAIL: TestEngineSetsNameFromEnv - https://github.com/dagger/dagger/actions/runs/5359103142/jobs/9723222159#step:6:5669
FAIL: TestRemoteCacheRegistry - https://github.com/dagger/dagger/actions/runs/5359103142/jobs/9723222159#step:6:7327
FAIL: TestRemoteCacheS3/buildkit_s3_caching - https://github.com/dagger/dagger/actions/runs/5359103142/jobs/9723222159#step:6:7528
FAIL: TestClientWaitsForEngine

Digging a bit deeper, it's definitely the devEngineContainer function - https://github.com/dagger/dagger/blob/23be7ce76167167fd12ada64dd2a0339d258b8f3/core/integration/engine_test.go#L16C6-L16C24

Which is loading a tar file. I'll keep digging, but if either of you have any insight, I'd be grateful.

civic yacht
# ancient kettle <@949034677610643507> <@108011715077091328> I'm getting this error when running ...

Huh okay yeah I think it's:

  1. We compress all cache layers using zstd (it's way faster than gzip while also achieving a better compression ratio)
  2. We are hitting the remote cache for the dev engine builds
  3. When we try to load the engine tar for those tests the media type ends up being vnd.docker... instead of the oci type (which is supported with zstd) and get an error because there isn't a handler configured for docker+zstd for some reason, even though docker has supported zstd for 3+ years.

1+2 are good things, 3 is weird. I will dig a bit in buildkit/containerd to see why those handlers aren't setup when trying to do the import.
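
A toy model of the failure in point 3, for anyone trying to picture it. This is not containerd's API: `processors` is a made-up registry keyed by media type, and the handler name is a placeholder. It only shows why a lookup by the Docker zstd media type fails when just the OCI one is registered, and how registering the missing type resolves it.

```go
package main

import "fmt"

const (
	ociZstd    = "application/vnd.oci.image.layer.v1.tar+zstd"
	dockerZstd = "application/vnd.docker.image.rootfs.diff.tar.zstd"
)

// processors is a toy registry of stream processors keyed by media type.
type processors map[string]string

// lookup returns the processor registered for a media type, if any.
func (p processors) lookup(mediaType string) (string, error) {
	name, ok := p[mediaType]
	if !ok {
		return "", fmt.Errorf("failed to get stream processor for %s: no processor for media-type", mediaType)
	}
	return name, nil
}

func main() {
	p := processors{ociZstd: "zstd-decompress"}
	if _, err := p.lookup(dockerZstd); err != nil {
		fmt.Println("before:", err)
	}
	// Hypothetical fix: register the same handler for the Docker media type.
	p[dockerZstd] = "zstd-decompress"
	name, _ := p.lookup(dockerZstd)
	fmt.Println("after:", name)
}
```
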

ancient kettle
#

I'm glad it's using the cache. 🙂

#

Which you commented on. 🙂

civic yacht
ancient kettle
#

😆

#

Glad I could help you be nostalgic. 😉

civic yacht
#

I agree with past me that this might need an upstream fix, but it should be pretty small+quick from what I can tell, will try it out

ancient kettle
#

I'm gonna do a walk + some lunch, but I'll have my phone with me and can be back at computer, if ya need me.

civic yacht
# ancient kettle Let me know if I can help with that.

Have a potential fix, trying it out by just changing dagger code because it only involves registering something in containerd w/ an init func (will upstream if it works). It's hard to repro locally so I pushed the commit to your PR to try it out

ancient kettle
civic yacht
# ancient kettle Awesome! I'll check on those tests. 🙂

The change is to the engine, so I think in order to confirm the fix we'll need to let the tests run, wait a bit for the node to spin down (so local cache doesn't get re-used), and then run the tests again with source code unmodified to hit the zstd cache layers

ancient kettle
#

Node killed. Gonna run those jobs again. (I'm curious about the rust and elixir SDK failures...)

civic yacht
ancient kettle
#

(current run -- #3 -- is with existing nodes)

#

@civic yacht Ok. 3 green runs in a row.

ancient kettle
civic yacht
# ancient kettle <@949034677610643507> Should we cherry-pick that commit into main?

I was just working on the upstream fix to buildkit, have it all ready now, I can cherry-pick it into our main but if it's no rush it's probably easier to just upstream and then pick it up in dagger once merged. I don't think it will be very controversial, but worst case I'll just fallback to fixing in dagger with that commit I pushed to your PR

ancient kettle
civic yacht
ancient kettle
civic yacht
celest totem
celest totem
#

finally some initial GPU access from Dagger! daggerfire

hasty basin
#

Awesome! I don't get the eggplant though...an auberGinePU? 🍆

civic yacht
wild zephyr
tepid nova
hollow gorge
#

Hi All! love what dagger is doing! I am trying to get a better understanding of how the dagger engine is working: you need to spin up a docker container that runs the dagger engine. Then you can use that container to run pipelines in other containers that you specify with the SDK (e.g. client.container().from("python")). How does this work? are these containers also being spun up on the host machine where I also run the dagger engine container, or are they running inside the dagger engine container, and if so, how does that work (dind, docker.sock volume, or something to do with buildkit)? If I exec in the dagger engine container, I do see some stuff about buildkit, but the docker command is not available. Is there some documentation about how this works?

tepid nova
tepid nova
#

We use Dagger for the merge/diff experiments - it exposes a higher-level interface to buildkit, comparable to HLB. However instead of a custom language, Dagger has a GraphQL API, and generated SDKs on top of that.

#

The equivalent of "merge" is Directory { withDirectory }. Under the hood it's llb copy, but in PR 5400 above, it will be optimized to use the fancy new llb merge instead - without breaking the GraphQL API 🙂

#

Here's an example (in Go) of using withDirectory: https://github.com/dagger/examples/blob/main/go/multistage/main.go#L23-L29

    project := client.Git("https://github.com/dagger/dagger").Branch("main").Tree()

    build := client.Container().
        From("golang:1.20").
        WithDirectory("/src", project).
        WithWorkdir("/src").
        WithExec([]string{"go", "build", "./cmd/dagger"})
GitHub

Contribute to dagger/examples development by creating an account on GitHub.

tepid nova
#

How/when do we release docs please? I have a docs contribution that got merged, and I'm eager to see it on the docs website 🙂

wild zephyr
#

👋 @civic yacht @still garnet is there a possibility that there's a caching regression in v0.6.3? I was trying a very basic example today and I can't get the exec op cached whatever I do. Here's the repro:

-- ci/main.go --
package main

import (
    "context"
    "os"

    "dagger.io/dagger"
)

func main() {
    ctx := context.Background()
    client, err := dagger.Connect(ctx, dagger.WithLogOutput(os.Stderr))
    if err != nil {
        panic(err)
    }
    defer client.Close()

    testClient := client.Pipeline("test")
    src := testClient.Host().Directory(".", dagger.HostDirectoryOpts{Include: []string{"hola.txt"}})

    testClient.
        Container().
        From("docker.io/library/alpine@sha256:82d1e9d7ed48a7523bdebc18cf6290bdb97b82302a8a9c27d4fe885949ea94d1").
        WithMountedDirectory("/test", src).
        WithExec([]string{"cat", "/test/hola.txt"}).Sync(ctx)
}
-- hola.txt --
hola

Tried reverting that to v0.6.2 and it works correctly

wild zephyr
wild zephyr
still garnet
#

the problem is that container hostnames are derived from LLB digests, and I think the difference in v0.6.3 is that host Directories now contain the session ID in their LLB, so that causes the hostname to change in every new Dagger session, which busts the cache
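
A toy illustration of that mechanism. The "definition" string and both helpers are invented for the sketch; real LLB is a protobuf graph, not a URL-like string. It shows that once a per-session value is part of the bytes being hashed, the derived hostname changes every session even though the directory is the same.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// definition renders a toy stand-in for a host directory's LLB. When
// sessionID is non-empty it becomes part of the definition.
func definition(hostDir, sessionID string) string {
	d := "local://" + hostDir
	if sessionID != "" {
		d += "?session=" + sessionID
	}
	return d
}

// hostnameFor derives a short hostname from the digest of a definition.
func hostnameFor(def string) string {
	sum := sha256.Sum256([]byte(def))
	return hex.EncodeToString(sum[:])[:12]
}

func main() {
	fmt.Println("no session: ", hostnameFor(definition(".", "")))
	fmt.Println("session one:", hostnameFor(definition(".", "session-one")))
	fmt.Println("session two:", hostnameFor(definition(".", "session-two")))
}
```

Because the hostname feeds into the exec, a different hostname means a different cache key, which matches the "can't get the exec op cached" symptom in the repro.
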

#

This will be partly fixed when services-v2 lands, since only services will have hostnames now. But it still means that client containers downstream of a service that depends on a host dir will have their cache busted, since the service hostname changing would propagate to it.

#

iirc the change to include the session ID in host directory LLB had to do with having dagger do support host directory inputs/outputs (I forget which) - and maybe that change is no longer necessary with the new architecture. If so we can maybe roll that change back, and just let dagger do be broken for now instead, since it's experimental anyway. cc @civic yacht

wild zephyr
civic yacht
wild zephyr
#

shall we retract v0.6.3 until we fix this? WDYT?

civic yacht
#

you still get the correct behavior, just with less caching

#

would be the same as if you were running the dagger engine on a machine with low disk space (which would trigger aggressive pruning of the local cache). In general cache should always be considered best effort rather than a "guarantee" of sorts

still garnet
civic yacht
civic yacht
wild zephyr
#

so @still garnet @civic yacht Alex's patch posted above fixes this in main (just verified it). Shall I go ahead and open a PR for it?

civic yacht
civic yacht
#

cc @stray heron (who just opened an issue about this), see above

still garnet
#

The downside of that fix is that anyone who was using services without explicitly exposing a port will break. (Sometimes ports aren't known ahead of time, or something like that, @tepid nova gave rabbitmq as an example)

#

Still very low impact I guess, and easy workaround (expose a random port)

wild zephyr
civic yacht
#

Side note: we need testing around caching. The interaction that caused this was extremely non-obvious (local source having session ID and setting of hostnames in execs), there's very little chance we'll be able to catch anything like this going forward without having integ tests around it.

The problem is that testing caching is surprisingly hard... I'll think about it quickly and either open an issue or a PR w/ an initial test for this situation if I think of an easy approach

still garnet
#

@civic yacht +100. I had a test in Bass that tries to test cache busting, and I thought I found a decent trick for it, but the test flakes all the time, and I haven't had time to dig into why. 😭 The tl;dr of the approach was to mount a cache volume and have each cache-bust append to a file in that volume, and check that the # of writes matches the expected # of cache busts. https://github.com/vito/bass/blob/main/pkg/runtimes/testdata/globs.bass
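
The counting trick can be modeled in a few lines. This is an in-memory analogy, not the Bass test: the `cache` type is hypothetical, its `runs` counter plays the role of the file appended to in the cache volume, and the keys stand in for operation digests.

```go
package main

import "fmt"

// cache memoizes results by key, playing the role of the engine's cache.
type cache struct {
	results map[string]string
	runs    int // how many times an operation actually executed
}

// exec returns the cached result for key, or runs op and records the run.
func (c *cache) exec(key string, op func() string) string {
	if out, ok := c.results[key]; ok {
		return out
	}
	c.runs++
	out := op()
	c.results[key] = out
	return out
}

func main() {
	c := &cache{results: map[string]string{}}
	op := func() string { return "hola" }
	c.exec("digest-a", op)
	c.exec("digest-a", op) // same key: served from cache
	c.exec("digest-b", op) // new key: a cache bust
	fmt.Println("executions:", c.runs)
}
```

A test then asserts that the number of executions equals the number of expected cache busts; anything higher means something unexpected leaked into the key.
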

#

This was actually how I found the original issue with the hostnames, so it was at least somewhat helpful

civic yacht
still garnet
#

ah interesting, yeah that could be a factor

#

Maybe we could run these tests against a temporary engine with GC disabled

civic yacht
civic yacht
#

Actually, for this very particular case I think I have something that is simpler by just opening two clients and running the same operation in both, with the pruning prevented by keeping the first client open. Pushing the test to your PR @wild zephyr

wild zephyr
civic yacht
still garnet
#

@civic yacht just pushed my Container.import changes to session-frontend if you want to take a look - not 100% sure I did the lease stuff properly, and it's kind of hard to test reliably at the moment. Pushed for now so I can context-switch to some Progrock changes: https://github.com/dagger/dagger/pull/5315#issuecomment-1636086684

#

I decided to pull in my stableDigest() helper from the services-v2 branch, since I noticed FileID was busting the cache key a lot. Didn't seem like there was a quick path to Buildkit "source of truth" cache keys based on yesterday's convo, so that'll have to do for now, but it has all the issues you mentioned in https://github.com/dagger/dagger/pull/5452#issuecomment-1636062194

wild zephyr
#

@civic yacht @still garnet getting

DONE 619 tests, 52 failures in 193.705s
Stderr:

Please visit https://dagger.io/help#go for troubleshooting guidance.
exit status 1

when running ./hack/make engine:test in my local branch for this PR (https://github.com/dagger/dagger/pull/5452). Doesn't seem real since CI only showed 4 failures. Just tried to re-run and getting the same results each time. Any pointers on what could be causing that many failures on my end?

still garnet
#

do you have the full output?

wild zephyr
civic yacht
wild zephyr
#

brb gotta step out for ~30m

still garnet
#

looks like ~everything is failing to resolve HTTP/git service hostnames, which is odd because those definitely expose a port

#

oh, maybe they call WithExposedPort after WithExec or something like that

#

but yeah not sure why that'd be different for your local checkout vs CI or Erik's machine thinkies

#

just checked and they both expose their port before WithExec - but also hopefully everyone using this API is doing that in the right order, since now it actually matters

still garnet
#

@wild zephyr and I decided to go for the alternative fix after noticing that our docs actually tell you to do WithExec().WithExposedPort(), so the impact of skipping hostname would actually be pretty high.

This PR removes the llb.SessionID instead, t.Skips the related tests, and yoinks over the cache-busting test from the original PR: https://github.com/dagger/dagger/pull/5468

tepid nova
#

what’s the TLDR of the DX change that we’re making?

still garnet
wild zephyr
#

@stray heron question about the new release note process https://github.com/dagger/dagger/pull/5408#issue-1788536705. IIUC changie new requires a PR number so it can later reference that in the changelog. One thing that's not quite clear to me is that when doing changie new, you don't always have a PR number beforehand since the PR will be created after you commit your changes and effectively open it 🤔 . Am I missing something here?

stray heron
# wild zephyr <@796825768600141844> question about the new release note process https://github...

The simplest thing would be to open the PR, then push an extra commit with the changie new content.

The other option would be to reference issues instead of PRs. Maybe we support both?

In the interest of making this easy to search / link to / reference, would you like us to continue this in a new GitHub Discussion? It can also be on that same PR. Discord will make this really difficult to track down for future selves.

wild zephyr
wild zephyr
high heart
#

As the creator of changie, let me know if I can be of help. There are lots of configuration values available if you need anything more advanced than the starting template.

stray heron
# high heart As the creator of changie, let me know if I can be of help. There are lots of co...

Hi! It's nice to put a name to the author of changie.dev.

We did a few config changes, the project's documentation made it fairly straightforward: https://github.com/dagger/dagger/blob/main/.changie.yaml

This reference also helped: https://github.com/DelineaXPM/dsv-github-action/blob/main/.changie.yaml

Will be happy to give more feedback as we build experience with it. So far, it was smooth to learn & use - thank you!

wild zephyr
high heart
high heart
tepid nova
obsidian rover
#

@rancid turret. Regarding the mount of binary data as secrets. I'm well advanced, currently writing the integration tests

Two questions:

  1. How do we want to scope the host interaction on this method: https://github.com/dagger/dagger/pull/5500/files#diff-09f798f10567cb9017b829bfa481df0a90e444368d36bd8f7025a69d1d2521c9R96-R101 ?

I don't see people copying their secret files in the workdir of the project ... Do we add the exclude / include option as another security layer ?

  2. Do you know where the 128000 byte limit comes from. Is it a secret store limitation ? https://github.com/dagger/dagger/pull/5500/files#diff-4242183254eaee9649c3ed82114bd292d61bd23a0d7b2ff701c3bb16d3fb8fabR76 (cc @celest totem)
rancid turret
# obsidian rover <@768585883120173076>. Regarding the mount of binary data as secrets. I'm well a...
  1. That's up to them I guess. Not sure what you mean by the exclude/include option here. You're not using llb.Local, the file is being read outside of buildkit. However, @civic yacht will this work with the API in runner? Not sure how host access like this will work.
  2. According to Tanguy in https://github.com/dagger/dagger/pull/4944, it's because of how we're passing the secret to the shim via os/exec.Cmd. Not sure if that's changed. However, again with https://github.com/dagger/dagger/pull/5415 that should no longer be a limitation.
obsidian rover
rancid turret
#

Don't see why not.

obsidian rover
#

I remember it was a choice/request from Solomon, used as a behavioral security layer: when using any host API, you can be sure that the API is not checking outside the scope of the project. Just double checking as you're the API master, if you're up I'm up 😇 🙏

fair ermine
civic yacht
# rancid turret 1. That's up to them I guess. Not sure what you mean by the `exclude`/`include` ...

> That's up to them I guess. Not sure what you mean by the exclude/include option here. You're not using llb.Local, the file is being read outside of buildkit. However, @civic yacht will this work with the API in runner? Not sure how host access like this will work.
That change is totally backwards compatible from the client/API perspective, so it should work the same as today

> Do you know where the 128000 byte limit comes from. Is it a secret store limitation ?
I'm not 100% sure. I don't see how it could possibly be a secret store limitation; my best guess is that it was referring to Linux limits on the size of env vars. But I don't know if that's really going to apply when mounting files. I'd say that you should see if you can ignore size limits for your current effort @obsidian rover, but be sure to add a test with a huge secret (like 128MB+ or something absurd) and see if it fails or not. We can go from there if it fails or is unusably slow or something like that.

rancid turret
civic yacht
# rancid turret How does it work when an api field receives a path the a file in the host? Won’t...

No the quick gist of the new architecture is that even though the api server is in the runner, any session specific requests (e.g. local dirs, unix socket forwarding, etc.) will be routed back to the correct client and work the same as today. In buildkit terminology, the clients all have the same session attachables that they have today, they are just hooked up to our custom controller in the runner rather than the vanilla buildkit controller.

rancid turret
#

But that’s for buildkit stuff right? If the input is a string for a path to a file, where’s the code in core/schema running that does a io.ReadAll on that file?

civic yacht
# rancid turret But that’s for buildkit stuff right? If the input is a string for a path to a fi...

Ah okay, I understand the question now, yeah directly reading a file based on a path like that will no longer work. And (catching up on the whole problem now) we don't want to just do a standard local import because then the secret host file will end up in the cache, which defeats the purpose.

Luckily, the changes we're making to move the gql server -> runner result in us having very fine grained access to the low level session methods. So, what I think we will do is add some utils that hook directly into the localdir sync and let us read files into memory rather than being forced to write them to buildkit's cache as a local import.

Let me 100% confirm that will be possible, one min

civic yacht
# rancid turret But that’s for buildkit stuff right? If the input is a string for a path to a fi...

Yeah it should be possible without tons of effort. The code in buildkit where local dir sync actually happens is here: https://github.com/sipsma/buildkit/blob/d9a6afdf089a7c4b97cac704a60ad70c21086f12/source/local/local.go#L188-L217

And we now have access to the same object as what's set to caller there, so we can call those same methods except read the files to wherever we want rather than to buildkit's cache. It looks like atm we'll probably still need to sync to a path inside the runner container, but as long as it's a random tmpfs directory that's cleaned up after being read into memory that should be totally fine I think. It should also be possible in theory to sync the files directly to in-memory buffers instead, but looks like that'd be a lot more effort, so don't need to start there.

Basically @obsidian rover, I think you should continue w/ the current approach and then depending on the order of merging PRs, I'll either update mine to handle this correctly or help out updating yours to.

obsidian rover
# civic yacht > That's up to them I guess. Not sure what you mean by the exclude/include optio...

We synced with Helder today. I updated the current PR: https://github.com/dagger/dagger/pull/5500/files#diff-3981512bfc30a22bf4870ef86642b70161c68ab4768680a62302a8c845c1ed6eR184. Basically, our secrets break after 512000 bytes.

As discussed with Helder, it doesn't seem to be immediately blocking, as the current size is enough to mount most private keys used as secrets, as shown in this docs example: https://github.com/dagger/dagger/pull/5502. This could be a follow-up fix, WDYT?

civic yacht
#

Saw this in my gh notifications, interesting movement towards being able to run buildkitd directly on macos hosts: https://github.com/moby/buildkit/pull/4059#issuecomment-1649549037 Plenty of caveats of course, doesn't mean we'll be able to use it ootb, but cool to see nonetheless

GitHub

This PR almost fixes compilation of buildkitd on macOS.
The only thing left is missing implementation of goInChroot function in github.com/docker/docker/pkg/chrootarchive. And here lies THE EVIL. b...

fair ermine
fair ermine
#

This PR: https://github.com/dagger/dagger/pull/5488 is blocked by a strange Go issue. Everything is fine except one Go test in the Go SDK itself, this one: https://github.com/dagger/dagger/actions/runs/5670762890/job/15374801748?pr=5488#step:4:1266
The source code of the test is here: https://github.com/dagger/dagger/blob/736927938824e8c35d28aec7287e79c1f89ff3fd/sdk/go/client_test.go#L100

It seems that writing to the io.Pipe() makes the code freeze. I can reproduce the issue locally, and the code simply stops at the first print.

    if cfg.LogOutput != nil {
        fmt.Println("Before print")
        n, err := fmt.Fprintf(cfg.LogOutput, "Creating new Engine session... \n")
        fmt.Println("after display: ", n, err)
    }
    fmt.Println("After display")

Output

Before print
# nothing, it's stuck here

I have already seen this kind of issue on a previous project, but usually we can fix it with a flush; however, I cannot call that method here.
Does anyone with good Go experience have an idea of what I can do to find the issue? It's strange that it is actually blocking the code, not even returning an error.
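
For context on the blocking itself: io.Pipe() has no internal buffer, so a Write only returns once a reader has consumed the data. A minimal sketch of that behavior, assuming the freeze comes from nothing reading the pipe at the moment of the log write (the function name is illustrative, this is not the SDK code):

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// readFirstLine writes a log line into an io.Pipe and returns what the
// reader side received. Without the goroutine draining the read end, the
// Fprintf call below would block forever.
func readFirstLine(msg string) (string, error) {
	pr, pw := io.Pipe()

	lines := make(chan string, 1)
	go func() {
		// Drain the read side concurrently; this is what unblocks the writer.
		line, _ := bufio.NewReader(pr).ReadString('\n')
		lines <- line
	}()

	if _, err := fmt.Fprintf(pw, "%s\n", msg); err != nil {
		return "", err
	}
	// Closing the writer signals EOF to the reader.
	if err := pw.Close(); err != nil {
		return "", err
	}
	return <-lines, nil
}

func main() {
	line, err := readFirstLine("Creating new Engine session...")
	fmt.Printf("%q %v\n", line, err)
}
```

So a test that hands an io.Pipe writer as the log output needs its reading goroutine running before the first log line is written, otherwise the writer side stalls exactly as described.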

#

@stray heron This is the only thing holding the PR to be merged, all other tests are perfectly fine!

#

According to this question https://stackoverflow.com/questions/47486128/why-does-io-pipe-continue-to-block-even-when-eof-is-reached it makes sense why it's still blocking, but then how am I supposed to pass this test? The blocking behavior is expected, but it's not right in my case :/

#

@rancid turret I remember you were against an option to track engine loading, but that looks like the only effective way to work around this problem. Or should we change the test to use something other than io.Pipe()? wdyt?

bronze hollow
#

There's no Dagger Cloud channel (Unless I missed it?), but I've got a feature request:

  • OIDC token minting for AWS/GCP/et al

Cloud could inject a short lived credential into my BuildKit instance to save me dropping long lived tokens into my builds

stray torrent
fair ermine
#

@still garnet Hey 🙂 Ready for your final review! That should be good for a merge now 🚀

civic yacht
civic yacht
# bronze hollow There's no Dagger Cloud channel (Unless I missed it?), but I've got a feature re...

I just re-read this and realized I may have misinterpreted your message, sorry 😅 (I got thrown off when I saw the words "cloud", "tokens" and "buildkit instance" because we've been having internal conversations around those topics, but a totally different context). I think what you're describing is more that you have your own builds with dagger and rather than providing a static token as a secret you'd want something more like a dynamic secret that vends out short lived oidc tokens to the Execs implementing the build, is that correct?

rancid turret
#

Was that just testing File.secret in a service?

#

Ah, looks like it.

still garnet
#

yep!

rancid turret
#

I'm going to remove Host.envVariable now.

rancid turret
#

GitHub is not seeing changes to my PRs: https://www.githubstatus.com

We found an issue with repos that affects ability of some of the customers to merge pull requests. We are working on mitigation.

#

Wait, does os.Getenv work inside our CI's mage tasks (dev engine, nesting, etc...)?

still garnet
#

@worldly flare 👋 I noticed in your presentation today you mentioned wanting c2c sockets. I was just wondering if anyone would need that last night (assuming you mean Unix sockets). Whenever you have a moment, I'm curious about the use case 🙏

rancid turret
worldly flare
#

afaik, nerdctl and podman will only talk to sockets directly, not over tcp (like docker cli supports)

worldly flare
# still garnet interesting, thanks!

If I can get those to work like a TCP socket, then I wouldn't need c2c. My goal is to test hof container runtime integrations from within a container (so I can avoid the packer VM setup we use today, or use it less often anyway)

If you could test the same for Dagger in Dagger for the three main container runtimes, then we ought to be able to use the same techniques

wet mason
worldly flare
#

so the context is running a container with dind as a mounted service, but for nerdctl / podman

With docker:

  1. Daemon container: https://github.com/hofstadter-io/hof/blob/_dev/test/dagger/dockerd.go#L26
  2. Mount service: https://github.com/hofstadter-io/hof/blob/_dev/test/dagger/dockerd.go#L55

Then my application in the container exec's out to the docker cli. What I'm after is the same for nerdctl | podman

The problem I have is nerdctl seems to only want to talk to local sockets, i.e. the tcp here does not work: https://github.com/hofstadter-io/hof/blob/_dev/test/dagger/dockerd.go#L57

I could look into running multiple processes in the same container, but I don't really want to. (hof + containerd) or (hof + socat?)

GitHub

Framework that joins data models, schemas, code generation, and a task engine. Language and technology agnostic. - hofstadter-io/hof

worldly flare
worldly flare
#

actually, I wonder if mounting a service binding as a unix socket rather than tcp would solve my issue?

worldly flare
#

getting closer with socat

│ │   █ [11.4s] ERROR exec bash -c set -euo pipefail socat -d -d UNIX-LISTEN:/run/containerd/containerd.sock,reuseaddr,unlink-early,user=root,group=root,mode=777 TCP4-CONNECT:global-containerd:2375 & sleep 1 ls -lh /run/containerd nerdctl info pkill socat
│ │   ┃ 2023/07/28 23:00:17 socat[15] W unlink("/run/containerd/containerd.sock"): No such file or directory
│ │   ┃ 2023/07/28 23:00:17 socat[15] W ioctl(5, IOCTL_VM_SOCKETS_GET_LOCAL_CID, ...): Inappropriate ioctl for device
│ │   ┃ 2023/07/28 23:00:17 socat[15] N listening on AF=1 "/run/containerd/containerd.sock"
│ │   ┃ total 0
│ │   ┃ srwxrwxrwx 1 root root 0 Jul 28 23:00 containerd.sock
│ │   ┃ 2023/07/28 23:00:18 socat[15] N accepting connection from AF=1 "<anon>" on AF=1 "/run/containerd/containerd.sock"
│ │   ┃ 2023/07/28 23:00:18 socat[15] N opening connection to AF=2 10.87.1.112:2375
│ │   ┃ 2023/07/28 23:00:18 socat[15] E connect(5, AF=2 10.87.1.112:2375, 16): Connection refused
│ │   ┃ 2023/07/28 23:00:18 socat[15] N exit(1)
│ │   ┃ time="2023-07-28T23:00:28Z" level=fatal msg="failed to dial \"/run/containerd/containerd.sock\": context deadline exceeded: connection error: desc = \"transport: error while dialing: dial unix:///run/containerd/containerd.sock: timeout\""

but socat is unable to connect to the mounted service binding

civic yacht
# worldly flare actually, I wonder if mounting a service binding as a unix socket rather than tc...

It's a hack and not something I recommend per-se, but if you are looking for other short terms solutions and weren't aware, you can share a unix socket across two exec containers using a CacheVolume. Underneath the hood, a CacheVolume is just a bind mount to the two containers, so if a server binds+listens a unix socket on that cache volume, then it will show up in the client container under its mount of that volume

still garnet
fair ermine
rancid turret
fair ermine
#

Yaaay

rancid turret
obsidian rover
fair ermine
wet mason
#

That would require running 2 processes in each container (e.g. WriteFile an entrypoint.sh and run that instead of the process)

#

I could look into running multiple processes in the same container, but I don't really want to. (hof + containerd) or (hof + socat?)

Yeah, not ideal, however if that works we could look into doing a socat-like in the dagger shim to proxy unix->tcp traffic out of the box for services /cc @still garnet

still garnet
# wet mason > I could look into running multiple processes in the same container, but I don'...

I've been wondering if there are legit use cases for tcp <=> unix; so far I've been focusing on tcp <=> tcp, and unix <=> unix seems like it would fit well with our existing Socket type (just need Container.unixSocket alongside Host.unixSocket). I think unix <=> tcp is a bit of an oddball and at that point you might as well just run your own socat service, which you could build on top of both of those primitives

#

I spent some time seeing if Socket could be assimilated into Service but at this point think it's probably clearer to keep them separate

wet mason
#

Yeah, I think tcp<->tcp and unix<->unix does the job

#

agree that if you need tcp<->unix, you need something custom
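
As a rough idea of what "something custom" could look like, here is a stdlib-only sketch of a unix -> tcp forwarder, roughly the job `socat UNIX-LISTEN:... TCP4-CONNECT:...` does in the log above. The function names and the echo server used to exercise it are illustrative assumptions, not anything from the shim.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
)

// proxyUnixToTCP accepts connections on a unix listener and forwards each
// one to tcpAddr, copying bytes in both directions.
func proxyUnixToTCP(ln net.Listener, tcpAddr string) {
	for {
		client, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func() {
			defer client.Close()
			upstream, err := net.Dial("tcp", tcpAddr)
			if err != nil {
				return
			}
			defer upstream.Close()
			go io.Copy(upstream, client) // client -> upstream
			io.Copy(client, upstream)    // upstream -> client
		}()
	}
}

// roundTrip puts a TCP echo server behind the proxy, sends msg through the
// unix socket, and returns the echoed reply.
func roundTrip(msg string) (string, error) {
	tcpLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer tcpLn.Close()
	go func() {
		c, err := tcpLn.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		io.Copy(c, c) // echo everything back
	}()

	dir, err := os.MkdirTemp("", "proxy")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	sock := filepath.Join(dir, "p.sock")
	unixLn, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	defer unixLn.Close()
	go proxyUnixToTCP(unixLn, tcpLn.Addr().String())

	conn, err := net.Dial("unix", sock)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	return bufio.NewReader(conn).ReadString('\n')
}

func main() {
	reply, err := roundTrip("hello")
	fmt.Printf("%q %v\n", reply, err)
}
```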

still garnet
fair ermine
rancid turret
#

@civic yacht maybe this rings a bell but my brain isn't working atm 🙂 Python is failing a provision test because of checkVersionCompatibility not existing in the API (https://github.com/dagger/dagger/actions/runs/5742052801/job/15565207204?pr=5558#step:5:226). My PR is https://github.com/dagger/dagger/pull/5558 and that field was merged in https://github.com/dagger/dagger/pull/5315. Why didn't the test use the dev engine? Test is here: https://github.com/dagger/dagger/blob/cf82b979ed764724d548b0acf280bb2ddb84a32c/sdk/python/tests/engine/test_download.py.

#

By the way, that's being checked inside dagger.Connection.

civic yacht
# rancid turret <@949034677610643507> maybe this rings a bell but my brain isn't working atm 🙂 ...

Yeah it's confusing, but I think I know what's happening:

  1. Those provision tests normally don't run as part of PRs (because they take a long time), they only run when we push to main and when the version bump file is modified: https://github.com/dagger/dagger/actions/runs/5742052801/workflow?pr=5558#L16
  2. Normally that version bump file is only modified during a release, but you have an exceptional case where you're migrating it to a new location, so it's causing the provision tests to kick off
  3. That's triggering this branch of our yaml-shell (😩 Zenith plz save us): https://github.com/dagger/dagger/actions/runs/5742052801/workflow?pr=5558#L229 Which is running using the most recently released engine image
#

I'm trying to think of a way of undoing the catch-22 here... obviously moving those version files around is pretty rare, but it's an annoying situation to be in nonetheless

rancid turret
#

Ah, makes total sense. I think I was bitten again by Live Grep not searching in hidden files.

civic yacht
#

If we allow overrides for merging PRs w/ failing tests, I guess that's the last resort 🫤

civic yacht
rancid turret
#

But even if I caught that you're saying that just by the path changing, we have this problem?

#

Yeah, it makes sense.

#

Any ideas?

civic yacht
# rancid turret But even if I caught that you're saying that just by the path changing, we have ...

Yeah exactly, the only solution I can think of besides overriding the failed tests and merging anyways would be to make those provisioning tests actually work on arbitrary PRs. The thing that makes that really hard is that the tests are meant to test real-world provisioning, which means pulling from the actual registry we publish images to, which obviously becomes really iffy when it comes to just running on every push to every PR...

#

I'm thinking more though

rancid turret
#

I think we just merge with the fail.

#

Wait, will that make every merged commit fail until release?

civic yacht
# rancid turret I think we just merge with the fail.

I think that's probably the most productive, if somehow something goes wrong it'll be caught when pushed to main and we can fix it right away. Also, quite a bit of the code paths involved in provisioning end up exercised anyways in various integ/unit tests all over the place, having these very "real-world" e2e provisioning tests is mostly for an extra blanket of security around not releasing broken stuff and to get some macos test coverage (which occasionally runs into unique bugs not on linux)

> Wait, will that make every merged commit fail until release?
No it shouldn't; everything works as expected on main because we actually publish images to our registry on main. Those provision tests in that case run after the publish is done and run their tests with it https://github.com/dagger/dagger/actions/runs/5742052801/workflow?pr=5558#L221-L224

rancid turret
#

we actually publish images to our registry on main

Ah yes, true 🙂

rancid turret
#

Thanks @civic yacht 🙏 I actually knew how this works just completely forgot. Even felt strange to have the provision tests run on PR.

civic yacht
rancid turret
#

Strange that I can't even run that test locally. Keeps giving me a connection refused when trying to download the cli from the test suite. Even when checking out v0.6.4 tag. Might have to reboot 🙂

civic yacht
# rancid turret Strange that I can't even run that test locally. Keeps giving me a connection re...

I just gave it a shot on my mac directly but gave up once I hit the error ERROR: Package 'dagger-io' requires a different Python: 3.9.6 not in '>=3.10', which I don't have time for right now😅 But the tests all passed on main CI, so yeah probably something local

Tangential, but your comment about rebooting spurred me to try that to see if it would magically fix dagger shell on MacOS and it actually did! #daggernauts message

So thank you 😆

fresh harbor
#

Hi, I found that the _README.md in the docs directory says to run make web from the repository root, but the Makefile is already gone. So I'm not sure whether the process has changed to another command or whether I can skip this step?

GitHub

A programmable CI/CD engine that runs your pipelines in containers - dagger/dagger

tepid nova
#

In that thread, we reached the conclusion that the engine image should be split in 2 functionally distinct components. We need to agree on what to call those components, to avoid future confusion. And then we need to agree on how to package and distribute them (one OCI image or two?). And of course we need to bikeshed our way to those answers.

tepid nova
#

What's the canonical way to check for an ExecError in raw graphql?

civic yacht
tepid nova
#

Ah interesting, I didn't even know about graphql extensions

#

I'm confused because I thought the reason we chose this inferior DX for exec error handling was backwards compat, but it's still listed as a breaking change in the 0.8 release. Makes me feel like maybe we should have gone for the more straightforward experience if we were going to break anyway. Sorry if I'm missing context

rancid turret
#

What's the more straightforward experience?

tepid nova
#

If Sync fails, it means Dagger failed to sync (no need to check for special errors in each SDK). If I want to know an exit code, I check exitCode.

rancid turret
#

Yeah, but there's multiple things to consider here.

#

You're not required to check for ExecError if you don't care about the type of error. Previously, the stdout, stderr and exit code were encoded in the error message, which means lots of people were using regex to parse the message and extract these bits. They're now easily accessible in ExecError. Making multiple calls to ctr.ExitCode, ctr.Stdout and ctr.Stderr would be a chore, but the main issue was caching. We didn't think of a way to continue using a container after a failure without caching said failure.

#

That's still a thorn because some people require that. The workaround today is to sh -c "<cmd>; echo $? > /exit_code" and get it from that file, but the caveat is caching the failure.
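
A local sketch of that wrapper trick with os/exec, assuming `sh` is available. The command runs in a subshell so that a bare `exit` inside it does not end the wrapper before the status is written, and the wrapper itself exits 0. The temp-file location is an arbitrary choice for the example; in a container you would read the file back from the container instead.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strconv"
	"strings"
)

// runCapturingExitCode runs cmd through `sh -c`, records its exit status in
// a file, and returns that status. The wrapper always exits 0 because the
// last thing it runs is `echo`.
func runCapturingExitCode(cmd string) (int, error) {
	dir, err := os.MkdirTemp("", "exit")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	codeFile := filepath.Join(dir, "exit_code")

	// Parentheses run cmd in a subshell so `exit N` cannot skip the echo.
	script := fmt.Sprintf("( %s ); echo $? > '%s'", cmd, codeFile)
	if err := exec.Command("sh", "-c", script).Run(); err != nil {
		return 0, err
	}

	raw, err := os.ReadFile(codeFile)
	if err != nil {
		return 0, err
	}
	return strconv.Atoi(strings.TrimSpace(string(raw)))
}

func main() {
	code, err := runCapturingExitCode("exit 3")
	fmt.Println(code, err)
}
```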

tepid nova
#

Isn't caching the failure what I want when I'm executing test suites?

#

I proposed months ago taking that workaround and making it part of the core API, to get a quick solution

rancid turret
#

Well it depends. If it's a failed test yes, if it's an unexpected failure like the network was down then no.

#

So we'd need controls for people to decide if to cache or not, but that's not easy to generalize. Not to say it can't be done, just that no good solution has come up yet.

tepid nova
#

Don't I still have the symmetrical problem now though? When implementing my test pipelines, either I get zero caching on failed tests, or I have to implement wrappers myself, but no way to distinguish a failed test from a failed exec for other reasons.

rancid turret
#

Yes, but if we cached the failure for you, that would be a different problem.

#

By the way, you do have a bit more control over what type of failure you have.

#

If it's not an ExecError when running a test suite then it's an unexpected error.

#

But you can also have an exec error with unexpected errors if it happens during the execution of that exec.

tepid nova
#

On the other issue of "multiple calls would be a chore", I think that's a hasty conclusion. First, in regular GraphQL it's very easy to query for as many fields as you want. So there's a limitation of our current arbitrary codegen'ed DX rather than our API. Separately, there was the option of defining rollup logic to avoid querying after each exec. For example stderr/stdout could be concatenated; exitCode could make the last error sticky (so error->success->success results in an exitCode 'error'). Lots of possible solutions we didn't explore
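
To make the rollup idea concrete, a tiny sketch: output is concatenated across execs and the exit code is sticky, so error -> success -> success still reports the error. The `ExecResult` type and `rollup` function are hypothetical names for this example; nothing like this exists in the API as far as this thread says.

```go
package main

import "fmt"

// ExecResult holds the observable outcome of one exec step.
type ExecResult struct {
	Stdout   string
	Stderr   string
	ExitCode int
}

// rollup folds a chain of exec results into one: stdout and stderr are
// concatenated in order, and the most recent non-zero exit code wins.
func rollup(results []ExecResult) ExecResult {
	var out ExecResult
	for _, r := range results {
		out.Stdout += r.Stdout
		out.Stderr += r.Stderr
		if r.ExitCode != 0 {
			out.ExitCode = r.ExitCode // sticky: later successes don't reset it
		}
	}
	return out
}

func main() {
	total := rollup([]ExecResult{
		{Stdout: "a", ExitCode: 2},
		{Stdout: "b"},
		{Stdout: "c"},
	})
	fmt.Printf("%q %d\n", total.Stdout, total.ExitCode)
}
```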

tepid nova
#

Just backing up, I see that my initial assumption (we chose this path primarily for backwards compat) was wrong, clearly there are a lot of design constraints at play here. So I don't know if I agree with the design we picked, but I am no longer confused 🙂

#

I guess in the context of Zenith, the biggest pain will be losing cache on entire test suites because of one failed test.

rancid turret
#

Let me clarify some things on how this developed:

  • ExecError was primarily an improvement to avoid users having to regex the error message to fetch those fields;
  • Sync was a separate and general issue, but ended up replacing the need for the _, err := ctr.ExitCode(ctx) pattern;
  • We wanted to fix ExitCode() but there was attrition because of backwards compatibility. The issue of caching failures comes from here, but it hasn't been discussed at length, it's just a can of worms we keep pushing down the road;
  • People were using ExitCode() incorrectly, like if exitCode != 0 when that never worked. It would have been fixed, but then there were a lot of other people who depended on the bug actually so it would break for them if we fixed it 🤷‍♂️;
  • I was looking into a simple way to fetch multiple sibling fields from our API and I have a proposal for this, but there's some tricky corner cases;
  • Based on all of this, it seemed reasonable to just replace the ExitCode baggage with something new. It ended up being ExecError and Sync as they were already there for other reasons but made ExitCode() unnecessary.

Having said all of that, I remind you that the "caching failure" is still a problem that needs solving, especially for two use cases that I find recurring (I have it as the main thing in my DX problems list atm). ExecError wasn't meant to be the end of the story, just a simpler improvement for now.

tepid nova
#

OK thanks for the reminder. Part of my reaction is that this is the headline of our 0.8 release, so it's hard not to interpret it as "the end of the story"; at the very least we're presenting it like a milestone meaningful enough to be a headliner.

rancid turret
#

It’s just because it’s everywhere so may require changing more code compared to other deprecations that were simple renames.

#

Btw I originally had the other deprecations first in the blog post and the exit code after in the first api section then sdks. In review the feedback was it should come first but it wasn’t meant as the headline of the post, just required more explanation.

fair ermine
wild zephyr
celest totem
fair ermine
tepid nova
#

I like that one because we’re chasing each other upwards - presumably towards the perfect design 😁

still garnet
#

lol, I thought that too spiderman_pointing

spiral fog
#

I observed unusual behavior while running a demo on the dagger/dagger@v0.8.1 codebase. The dagger workflow, which is version 0.8.1, initiated a new engine with version 'v0.8.2', which resulted in a version conflict for 'gale' since it explicitly uses 'v0.8.1'.

fresh harbor
fair ermine
#

docs: initiate Elixir quickstart documen...

fair ermine
#

@still garnet Can I get your opinion really quick on whether we actually need to switch to a real query builder: https://github.com/dagger/dagger/issues/5609
It might not solve our issues so I suggested another approach 😄

GitHub

What are you trying to do? #5594 fix an issue created by our custom query builder but include an issue if a string has the same value as one defined as an enum. To fix this issue, we need to switch...

fresh harbor
#

@still garnet I'm trying to find the root cause of the Elixir SDK test failures, and I see something strange: every time the Elixir SDK tests fail, the engine shows a log like

177: [21.9s] 2023/08/15 15:43:42 http2: server: error reading preface from client localhost: rpc error: code = Unknown desc = failed to get client metadata for session call: failed to base64-decode x-dagger-client-metadata: illegal base64 data at input byte 0

The snippet above is from PR 5628: https://github.com/dagger/dagger/actions/runs/5868523844/job/15912901044?pr=5628#step:4:881.

I'm not sure why it's related to the SDK client. Maybe you can give some guidance.

cc @stray heron

GitHub

A programmable CI/CD engine that runs your pipelines in containers - Fix utf8 labels causing hang, clean up + fix label collection in general · dagger/dagger@2aae6af

still garnet
#

or vice versa actually, not sure

#

from that error it looks like it might be an outdated CLI

#

I think Gerhard mentioned that might be happening after it fails to connect with the new one? it falls back to downloading a stable one? so if that's the case, that error might be secondary, and we need to figure out why it went for the fallback

fresh harbor
# still garnet yeah, so that's directly related to the change in that PR; that `x-dagger-client...

I tried logging the CLI version with this patch:

$ git diff
diff --git a/internal/mage/sdk/elixir.go b/internal/mage/sdk/elixir.go
index 534ba700..83b397f3 100644
--- a/internal/mage/sdk/elixir.go
+++ b/internal/mage/sdk/elixir.go
@@ -94,6 +94,7 @@ func (Elixir) Test(ctx context.Context) error {
                        WithEnvVariable("_EXPERIMENTAL_DAGGER_RUNNER_HOST", endpoint).
                        WithMountedFile(cliBinPath, util.DaggerBinary(c)).
                        WithEnvVariable("_EXPERIMENTAL_DAGGER_CLI_BIN", cliBinPath).
+                       WithExec([]string{"/.dagger-cli", "version"}).
                        WithExec([]string{"mix", "test"}).
                        Sync(ctx)
                if err != nil {

The engine version output

171: exec /.dagger-cli version
171: > in sdk > elixir > test > 1.14.5
171: [0.09s] dagger devel () linux/arm64
171: exec /.dagger-cli version DONE

T_T

stray heron
fair ermine
#

@still garnet I fixed the enum and string conflict in the Node SDK with this PR: https://github.com/dagger/dagger/pull/5645, which should delay the need for a true query builder.
I'm not satisfied with what currently exists, so I might create my own as an open source project later

still garnet
still garnet
fresh harbor
stray heron
stray heron
#

FWIW, here is the important change:

fresh harbor
#

Thank you for your information. 🚀

#

It still happens on the dagger-runner

stray heron
#

As long as the github-paid-runner succeeds, we are good.

#

Btw, these errors are just temporary, we will have the dagger-runner jobs fixed soon. cc @ancient kettle

fresh harbor
#

I see.

fresh harbor
fair ermine
#

Has anyone ever run into this issue?

dagger session
1: connect
1: > in init
1: starting engine 
1: starting engine [2.44s]
1: starting session 
2023/08/21 16:30:11 http2: server: error reading preface from client localhost: rpc error: code = Unknown desc = failed to get client metadata for session call: failed to unmarshal x-dagger-client-metadata: invalid character 'e' looking for beginning of value
fresh harbor
#

I found it while fixing the Elixir SDK connect failure issue.

fair ermine
#

I'm using a dagger binary compiled from main... I'm up to date :/

civic yacht
fair ermine
#

I'll empty my image storage and let you know

fair ermine
tepid nova
#

I want to start a thread about cache volumes... Will make it an issue if it goes anywhere. Paging @still garnet specifically since our discussion about namespacing on friday is what got me climbing down this particularly deep rabbit hole...

tepid nova
#

@still garnet FYI drafting a response to your servicesv2 ping now

fresh harbor
#

Hi, I found that the Elixir SDK hangs when running with dagger run (https://github.com/dagger/dagger/issues/5666). After adding the --debug option, the TUI shows context deadline exceeded on http://dagger/query, but the graphql query is still performed. Has anyone ever run into this issue?

GitHub

A programmable CI/CD engine that runs your pipelines in containers - Issues · dagger/dagger

tepid nova
civic yacht
#

Also, if anyone has a sec for a quick shipit, I made a traditional "yaml compilation error" that wasn't noticed in PR since it was in a job that can only run on pushes to main 😞 : https://github.com/dagger/dagger/pull/5690

#

Thankfully, the plan is to move that to a Zenith Check so we won't have to deal with that anymore in the future 😁

still garnet
#

Is there any way we could make these dagger-runner checks opt-in somehow until we expect them to be stable? It's causing a ton of noise in PRs, and is training me to ignore them, which isn't good. The only occurrence of dagger-runner I can find in the repo is in a specific workflow (_hack_make.yml) which doesn't seem related, so is this some sort of infra-level setting? cc @ancient kettle (not sure who to ping, just guessing)

tepid nova
#

@oak sandal moving our discussion about p2p, caching etc here 🙂

tepid nova
#

centralized vs p2p caching

fair ermine
#

@mellow bolt FYI Netlify docs is failing on some CI, did we break anything this week?

tepid nova
#

@civic yacht sorry to bother you about this while you're deep in Zenith Checks. Not urgent.

--> When we switch to the containerd bk backend, you mentioned "all engine drivers will need to do, is provide a containerd socket".

mellow bolt
#

I can build the docs locally

mellow bolt
# fair ermine <@529471641479151617> FIY Netlify docs is failing on some CI, did we break anyth...

I discovered someone else experiencing the exact same issue, and it began on the same day for them as it did for us:
https://answers.netlify.com/t/builds-stopped-working-out-of-sudden-error-while-installing-dependencies-in-opt-build-repo-netlify-plugins/100256
It seems the problem might be on their end. Despite efforts, including upgrading the Node version, I haven't managed to replicate it locally.

wild zephyr
#

👋 I was trying to test something with dagger listen in Dagger main latest commit and seems like dagger listen stopped working altogether? command just hangs. Doesn't happen in 0.8.4. cc @still garnet @civic yacht

fresh harbor
#

Hi, I sent a PR to the engine https://github.com/dagger/dagger/pull/5712 to try to fix the Elixir hang when running with dagger run. I just found that it hangs because of cmd.SysProcAttr.Setpgid, I'm not sure why. 😂 The fix works, and Gradle also works, but I'm not sure if it's the right way to fix it.

GitHub

This issue discover in #5666, the Elixir got hang when running with dagger run. After debugging, the process hang somehow when using SysProcAttr.Setpgid. Remove it fixes the issue.
Luckily, it fixe...

tepid nova
#

Starting a thread about cross-session IDs and all the associated tears and blood

fair ermine
celest totem
#

does anyone know if latest main is broken, or is my box perhaps in some weird state?
getting the following when trying ./hack/dev bash:

[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x17ca344]

goroutine 1 [running, locked to thread]:
main.setDaggerDefaults(0xc000360000, 0x0)
    /app/cmd/engine/config.go:31 +0xe4
main.main.func3(0xc0004e3080?)
    /app/cmd/engine/main.go:252 +0x4da
github.com/urfave/cli.HandleAction({0x1926620?, 0x1d500a8?}, 0xc00047f340?)
    /go/pkg/mod/github.com/urfave/cli@v1.22.12/app.go:524 +0x50
github.com/urfave/cli.(*App).Run(0xc00047f340, {0xc0000500c0, 0x4, 0x4})
    /go/pkg/mod/github.com/urfave/cli@v1.22.12/app.go:286 +0x7db
main.main()
    /app/cmd/engine/main.go:401 +0x1214
panic: runtime error: invalid memory address or nil pointer dereference
celest totem
celest totem
#

The above issue occurs on latest main when _EXPERIMENTAL_DAGGER_SERVICES_DNS is set to 0. The reproduction flow is:

  • setupNetwork is called, _EXPERIMENTAL_DAGGER_SERVICES_DNS is set to 0.
  • setDaggerDefaults takes netConf (returned by setupNetwork).
  • netConf is nil and setDaggerDefaults attempts to interact with it. (I found this while rebasing the GPU access branch on top of latest main.)

Is this worth fixing @still garnet? I think your PR will change a lot of this code: https://github.com/dagger/dagger/pull/5557
GitHub

This PR combines a few efforts that have been piling up throughout our chaotic Zenith adventures; sorry for the huge PR:

Switch services to run via the gateway interface instead of via Solve() (pr...
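
For reference, the nil dereference described above can be sketched in isolation. This is a minimal sketch, not the engine code: networkConfig, engineConfig, the bool parameter, and the "example.internal" domain are all hypothetical stand-ins, and only the shape of the bug (a nil netConf flowing into the defaults function) comes from the report.

```go
package main

import "fmt"

// networkConfig is a hypothetical stand-in for whatever setupNetwork returns.
type networkConfig struct {
	DNSDomain string
}

// engineConfig is a hypothetical stand-in for the engine configuration.
type engineConfig struct {
	DNSSearchDomain string
}

// setupNetwork mirrors the reported behavior: with services DNS disabled
// there is no network config, so it returns nil.
func setupNetwork(dnsEnabled bool) *networkConfig {
	if !dnsEnabled {
		return nil
	}
	return &networkConfig{DNSDomain: "example.internal"}
}

// applyDefaults skips the network-derived defaults when netConf is nil
// instead of dereferencing it.
func applyDefaults(cfg *engineConfig, netConf *networkConfig) {
	if netConf == nil {
		return
	}
	cfg.DNSSearchDomain = netConf.DNSDomain
}

func main() {
	for _, enabled := range []bool{false, true} {
		cfg := &engineConfig{}
		applyDefaults(cfg, setupNetwork(enabled))
		fmt.Printf("dns enabled=%v search domain=%q\n", enabled, cfg.DNSSearchDomain)
	}
}
```

Whether the right fix is a guard like this or never returning nil in the first place probably depends on how much of this code the linked PR rewrites.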

oak sandal
#

I've been looking into various containerd remote snapshotters over the past couple of days, and at their proposed image formats (Nydus, overlaybd, etc.). They all have strong arguments against tar: https://www.cyphar.com/blog/post/20190121-ociv2-images-i-tar . The ideal story in a perfect world: developers have containerd on their machine (via Docker's containerd image store or nerdctl), Buildkit supports building these newer formats, and Harbor (maybe others will too) supports storing them. So people can build, run CI/CD via Dagger, and deploy using the same stack, all sharing the same cache and able to lazy-pull efficiently with low latency, etc.
In reality, it doesn't look easy to get people to migrate to the new shiny thing. Maybe build/CI is doable, but migrating deployment runtimes doesn't seem like an easy task.

Also, the newer formats are still being debated, and it's not clear which ones are better or going to stick around in a year. I also just listened to the Changelog podcast where @tepid nova talked about how the community may never agree on any of the formats.
Do you folks think there will be specific use cases or domains where we'll be able to share image formats across the stack? Or is it worth making it easier for users to migrate to the newer formats, with all the tooling necessary to enable that (Docker and OCI kind of did that, I feel)? Really curious what you think.

civic yacht
#

I was looking into various containerd

fair ermine
#

@civic yacht @still garnet @stray heron
I just opened a big issue about buildkit rootless https://github.com/dagger/dagger/issues/5763
This one aims to close all sub issues and propose a temporary solution with a piece of documentation, lmk your opinion 🙂

mellow bolt
#

Is this error familiar to anyone? I'm running a job on GitHub Actions with our dagger-runner. The pipeline is written in TS. Thanks 🙏

ancient kettle
mellow bolt
#
    "@dagger.io/dagger": "^0.6.4",
#

I guess I should bump to 0.8.4

ancient kettle
mellow bolt
#

thanks @ancient kettle. Will try that

mellow bolt
slate rivet
tepid nova
#

I could have sworn there was a Container.Mount() which allowed you to access the contents of a mount after it changed? I remember it vividly because it was such a mindblowing moment for me, I didn't even know buildkit supported this, and it was such a painful gap in the pre-cloak version.

But @wild zephyr pointed out that it doesn't exist in the current API? Does anyone remember this, or am I going crazy?

still garnet
tepid nova
#

Holy 💩 so Container.Directory preserves the modified contents of a mount post-exec?

tepid nova
#

File too I guess?

still garnet
#

yeah File works the same way

tepid nova
#

One more item on my list of reasons to kill cache volumes

still garnet
tepid nova
#

I have shelved this pet peeve of mine because we're so busy... But my gut feeling remains that cache volumes / cache mounts are a relic of the past and we can do better

remote jetty
#

Has anyone ever run into this issue?

tepid nova
#

Just realized ExperimentalPrivilegedNesting and WithEntrypoint don't work well together 😭

still garnet
#

nesting + entrypoints

remote jetty
#

Hey, is there a plan to support cache rotation i.e. some sort of logrotate functionality or can I make a new feature request? This is what happened when I started using Dagger in our CI environment 💥. I'm testing it on a project that I'd say is on the extreme end of the scale, a front-end monolith, and we commit very frequently throughout the day.

wild zephyr
#

Hey is there a plan to support cache

tepid nova
#

@dull stream can you please explain dagger to me?

stray torrent
# fresh harbor Hi, I found the [_README.md](https://github.com/dagger/dagger/blob/main/docs/_RE...

Hi! I just saw this as I was trying to figure out how to run docs locally as well. The readme is indeed incorrect. @fresh harbor

The correct readme is here: https://github.com/dagger/dagger/blob/main/website/README.md

cc @hybrid widget

Note that Node v20.6.0 has a bug in it that breaks the Docusaurus build. It's a Docusaurus issue and not specific to the Dagger setup.

https://github.com/facebook/docusaurus/issues/9291

tepid nova
#

Do we have an official script for cross-compiling the Dagger CLI to multiple architectures and OSes?

tepid nova
tidal spire
#

The guide would need some modification for the package to build, but other than that it should work 🙂 go build ./cmd/dagger

wild zephyr
tepid nova
#

what's the motivation for using goreleaser?

#

(I'm going to guess less shell scripts)

#

So does ./hack/make exec out to goreleaser?

#

and not to go build at all?

wild zephyr
# tepid nova So does `./hack/make` exec out to goreleaser?

./hack/make doesn't use goreleaser: it builds the binary with the standard go build command, which only builds a single binary for the OS and arch the dev is currently on. IIRC the decision to use goreleaser (which basically execs go build underneath) was because it also helps a lot with the publishing side of the CLI: S3 upload, Homebrew, etc.
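
For anyone who just wants the go build part without goreleaser, cross-compiling comes down to setting GOOS and GOARCH per target. A rough sketch follows; the target list and the bin/dagger-OS-ARCH output names are made up for illustration and are not what the release pipeline uses. It only constructs and prints the commands, it does not run them.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// target is one OS/architecture pair to build for.
type target struct{ OS, Arch string }

// buildCmd prepares (but does not run) a go build invocation for one target.
// The output path is a hypothetical naming scheme.
func buildCmd(t target) *exec.Cmd {
	out := fmt.Sprintf("bin/dagger-%s-%s", t.OS, t.Arch)
	cmd := exec.Command("go", "build", "-o", out, "./cmd/dagger")
	// go build reads the target platform from these two variables.
	cmd.Env = append(os.Environ(), "GOOS="+t.OS, "GOARCH="+t.Arch)
	return cmd
}

// hasEnv reports whether the command's environment contains the exact entry.
func hasEnv(cmd *exec.Cmd, kv string) bool {
	for _, e := range cmd.Env {
		if e == kv {
			return true
		}
	}
	return false
}

func main() {
	targets := []target{{"linux", "amd64"}, {"linux", "arm64"}, {"darwin", "arm64"}, {"windows", "amd64"}}
	for _, t := range targets {
		cmd := buildCmd(t)
		fmt.Printf("GOOS=%s GOARCH=%s %s\n", t.OS, t.Arch, strings.Join(cmd.Args, " "))
	}
}
```

Calling cmd.Run() on each entry from the repo root would do the actual builds; packaging and publishing is the part goreleaser takes care of.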

tepid nova
wild zephyr
fair ermine
#

@still garnet @civic yacht Do you know if it's possible to just provide an already-running buildkit daemon for Dagger to use?
I know that we are doing a lot of things in https://github.com/dagger/dagger/blob/dd9d63a0cc7faf5f692cb34ef29e6d3e9d1f1a4a/internal/mage/util/engine.go

I disabled --privileged to see what happens and, as expected, it doesn't work. I think I can fix the error by customizing the buildkit daemon with rootlesskit (at least I can try), but I don't see where we launch the buildkit daemon. I can see the entrypoint script, but it actually executes dagger-engine :/

I'm looking at https://github.com/dagger/dagger/blob/e5feaea7f1d6eb01d834e83b180fc3daed3463ea/cmd/engine/main.go but I'm not sure I'm actually reaching this part. I can see the error mkdir: can't create directory '/sys/fs/cgroup/init': Read-only file system
Which happens here https://github.com/dagger/dagger/blob/dd9d63a0cc7faf5f692cb34ef29e6d3e9d1f1a4a/internal/mage/util/engine.go#L46

#

As a workaround, I would like to directly provide an engine (if maybe there's a way to avoid this error to run a rootless engine)
My purpose is to see what can work and what cannot work but to do so I need to find a way to run the engine rootless (which looks quite complex haha)

#

Or I can just compile ./cmd/engine and run it, if it works?

spark cedar
#

vito Erik Sipsma Do you know if it s

tepid nova
#

Did the protocol for dagger listen change in the last few months? I could have sworn dagger listen would print to stdout everything I need to connect, and I could pass the session token as a regular bearer token. But now it looks like the token doesn't get printed, and I have to use ~digest~ basic auth - am I going crazy or is this a relatively recent change?
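
For context, this is roughly what a basic-auth request against the session looks like from Go. It's a sketch under assumptions: the token is sent as the basic-auth username with an empty password, and the endpoint and token values in main are placeholders, not something dagger listen printed.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newSessionRequest builds a GraphQL POST that carries the session token as
// the basic-auth username with an empty password (assumed convention).
func newSessionRequest(endpoint, token, query string) (*http.Request, error) {
	payload, err := json.Marshal(map[string]string{"query": query})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(token, "")
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder endpoint and token, for illustration only.
	req, err := newSessionRequest("http://127.0.0.1:8080/query", "example-token", "{ __typename }")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Authorization"))
}
```

Passing the request to http.DefaultClient.Do would send it; a bearer-token variant would set the Authorization header directly instead of calling SetBasicAuth.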

wild zephyr
tepid nova
#

but if I type dagger listen, does it require authentication by default?

still garnet
wild zephyr
still garnet
#
Error: introspection query: make request: input:26: Unknown type "__Type".
input:59: Unknown type "__InputValue".
input:68: Unknown type "__Type".

hello, darkness, my old friend...

tepid nova
#

Engine release question: it looks like when we publish the engine container, we use the "devEngineContainer" function to build it. Is that right? I wasn't sure since there's "dev" in the name, which implies "not production".

wild zephyr
tepid nova
fair ermine
#

@wild zephyr I think this issue is fixed: https://github.com/dagger/dagger/issues/5727#issuecomment-1734215567, I'm not sure there's anything we should do about it: client.Secret expects an ID, but you use a human-readable name in your issue
Should we close it?

GitHub

What is the issue? Given this example: func main() { ctx := context.Background() client, err := dagger.Connect(ctx, dagger.WithLogOutput(os.Stderr)) if err != nil { panic(err) } _, err = client.Con...

tepid nova
#

@dull stream are you able to explain Dagger on this channel?

wet mason
#

[for tomorrow] @tidal spire hey, just checking but if you don't do docker stop at the end of GHA, is the container stopped at all or GHA just kills the VM right away?

tidal spire
wet mason
#

@tidal spire no rush / need, either way it's not good for us, was just wondering what was actually happening

tidal spire
#

Yeah unknown, we just know that without the stop + extra time we weren’t getting the completed sync

fair ermine
#

How can I pull the engine from our registry? docker pull registry.dagger.io/engine is not working lmao, is there anything I should configure?

#
$ docker pull registry.dagger.io/engine                                                                      
Using default tag: latest
Error response from daemon: manifest unknown
#

Oh I need to put the exact tag, my bad

hasty basin
still garnet
tepid nova
#

Any objections to enabling CORS by default in the nested dagger session exposed in the shim? My assumption is that if you trust a container enough to make nested dagger queries, you also trust it to delegate that authority to a webapp. It may or may not be good security, but it's not malicious

#

Also, if a user trusts that container enough to expose one of its ports to the host network, restricting CORS doesn't enhance security since there is no relevant concept of a trusted domain in that context: either you have access to the port or you don't.

OK alternatively, it may be the case that enabling CORS by default is not a good idea - instead perhaps we need to make configurable what the trusted domains are, so that the port can be forwarded on the host network with restricted CORS

Thinking through it even more... Maybe I can just do that in userland. I'll need an http proxy instead of socat 😇 but not the end of the world
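
The userland option could look something like this: an HTTP wrapper that only answers CORS for an allowlist of origins. This is a sketch, not anything in the shim; withCORS, demoHandler, and allowedOrigin are names invented here, and the inner handler just writes "ok" where a real setup would forward to the nested session.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withCORS adds CORS response headers only for origins in the trusted set.
func withCORS(trusted map[string]bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if origin != "" && trusted[origin] {
			w.Header().Set("Access-Control-Allow-Origin", origin)
			w.Header().Set("Access-Control-Allow-Headers", "Authorization, Content-Type")
			w.Header().Set("Vary", "Origin")
		}
		// Answer preflight requests here; everything else goes to the backend.
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler wraps a placeholder backend with the given trusted origins.
func demoHandler(trusted ...string) http.Handler {
	set := make(map[string]bool, len(trusted))
	for _, o := range trusted {
		set[o] = true
	}
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return withCORS(set, backend)
}

// allowedOrigin sends one request through h and returns the
// Access-Control-Allow-Origin header of the response.
func allowedOrigin(h http.Handler, origin string) string {
	req := httptest.NewRequest(http.MethodGet, "/query", nil)
	req.Header.Set("Origin", origin)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get("Access-Control-Allow-Origin")
}

func main() {
	h := demoHandler("http://localhost:3000")
	fmt.Printf("trusted: %q\n", allowedOrigin(h, "http://localhost:3000"))
	fmt.Printf("untrusted: %q\n", allowedOrigin(h, "http://evil.example"))
}
```

Swapping the placeholder backend for httputil.NewSingleHostReverseProxy pointed at the session URL would give the "http proxy instead of socat" version.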

tepid nova
#

Is there a way to know the latest released version of Dagger?

still garnet
rancid turret
#

I think that's just because the other releases are not semver (they have a prefix).

still garnet
rancid turret
#

Ah, yes 👍

civic yacht
#

Pretty interesting new filesystem, PuzzleFS, being worked on in Linux: https://lwn.net/Articles/945320/ Meant to make container images more efficient by doing away with layers and instead assembling them via (de-duplicated) blocks. (cc @spark cedar this is what I was talking about the other day)

civic yacht
median charm
#

I'm writing a new guide for Dagger on OpenShift with Gitlab Runners. Is there any option to render a preview locally for the guides?

tidal spire
median charm
tidal spire
rancid turret
#

@civic yacht re: the help blurb from the Go SDK, I created an ErrBlurb (public) that wraps the api error, including ExecError. Isn't that a breaking change if users are doing if err.(*ExecError)? Really don't want to find the substring in the error message and delete from .Error() 😅

#

+1 from me to delete that blurb altogether though.
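
To make the breaking-change concern concrete: a direct type assertion stops matching once the error is wrapped, while errors.As keeps working as long as the wrapper implements Unwrap. The ExecError and ErrBlurb definitions below are simplified stand-ins written for this example, not the SDK's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecError is a simplified stand-in for the SDK error type.
type ExecError struct{ ExitCode int }

func (e *ExecError) Error() string { return fmt.Sprintf("exit code %d", e.ExitCode) }

// ErrBlurb is a simplified stand-in for a wrapper that appends help text.
type ErrBlurb struct{ Err error }

func (e *ErrBlurb) Error() string { return e.Err.Error() + "\n(help blurb)" }

// Unwrap lets errors.As and errors.Is see through the wrapper.
func (e *ErrBlurb) Unwrap() error { return e.Err }

// exitCodeByAssertion only matches when err itself is an *ExecError.
func exitCodeByAssertion(err error) (int, bool) {
	e, ok := err.(*ExecError)
	if !ok {
		return 0, false
	}
	return e.ExitCode, true
}

// exitCodeByAs walks the Unwrap chain looking for an *ExecError.
func exitCodeByAs(err error) (int, bool) {
	var e *ExecError
	if errors.As(err, &e) {
		return e.ExitCode, true
	}
	return 0, false
}

func main() {
	var err error = &ErrBlurb{Err: &ExecError{ExitCode: 2}}
	code, ok := exitCodeByAssertion(err)
	fmt.Println("type assertion:", code, ok)
	code, ok = exitCodeByAs(err)
	fmt.Println("errors.As:", code, ok)
}
```

So users who already use errors.As would be fine with the wrapper, and users doing the bare assertion would silently stop matching.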

civic yacht
tidal spire
fair ermine
#

Has anyone ever hit this issue? Repro: ./hack/dev go test -v -count=1 - $(pwd)/cmd/shim/

package github.com/dagger/dagger/cmd/shim
        imports github.com/dagger/dagger/core
        imports github.com/dagger/dagger/core/reffs
        imports github.com/dagger/dagger/engine/buildkit
        imports github.com/dagger/dagger/engine/sources/blob
        imports github.com/moby/buildkit/cache
        imports github.com/moby/buildkit/snapshot
        imports github.com/containerd/containerd/snapshots/overlay/overlayutils: build constraints exclude all Go files in /Users/tomchauveau/go/pkg/mod/github.com/containerd/containerd@v1.7.2/snapshots/overlay/overlayutils

I hit the same issue when I try to run integration test locally too.

I would prefer to fix that to work on https://github.com/dagger/dagger/issues/5791, otherwise I can do a local repro but it's way simpler with unit test

I already tried go clean -testcache, go mod tidy etc... It doesn't fix the issue

civic yacht
subtle heath
#

Hey dagger team.
I noticed you guys are using changie for changelogs, and we're looking at switching over to that for our changelogs in Pulumi now.
Seems both of us had the same issue of wanting to use PR numbers in changelogs. I see you've added PR as a custom entry for changie https://github.com/dagger/dagger/blob/main/.changie.yaml and we're probably going to do the same, but I've raised https://github.com/miniscruff/changie/issues/562 to see if it makes sense to get this better supported by changie somehow.
Also curious if y'all have any input on changie use that you think would be useful for us to hear.

GitHub

A programmable CI/CD engine that runs your pipelines in containers - dagger/dagger

GitHub

Is your feature request related to a problem? Please describe. We've been using an internally developed tool at Pulumi for some of our changelog management. We're looking at switching over ...

tribal fable
#

Context: I'm trying to set up a test env so we can contact test containers directly while developing our services.
To avoid having to reconfigure the apps to use random ports, it would be nice to be able to set the IP on client.Host().Tunnel() to 127.0.0.2. Then we can be sure there will be no port collision.

wild zephyr
wild zephyr
tribal fable
wild zephyr
#

Indeed i can use that as a solution. I

tepid nova
tribal fable
tepid nova
fair ermine
#

Is it possible to mount a socket from another container? (not from the host, I know we already have a method for that)

I'm stuck on the PostgreSQL module, which requires a socket to use psql, and I couldn't find a way to mount it...
I might try another way, like a Prisma project, but it would be so cool to make psql usable, especially for pg_dump later

spiral fog
fair ermine
#

Ok I found a workaround using an actual Prisma example. The shared volume isn't working though :/
I'll put this on hold for now, there are certainly other tools that can use the TCP/IP connection to create backups

fair ermine
#

Is there a way @civic yacht to call the dagger engine from a dagger module?
I would like to implement a dagger publish solution but I'm not able to connect to the dagger engine from dagger

#

I need to use the dagger CLI inside a dagger module, but I cannot make it work (/cc @spark cedar @still garnet)

fair ermine
#

I cannot setup CNI on my mac, I do not understand why though.. I could find a workaround to run dagger inside dagger but it's still broken with DNS

spiral fog
fair ermine
#
    engine := dagger.
        Worker().
        Container("amd64").
        WithUnixSocket("/var/run/docker.sock", dag.Host().UnixSocket("/var/run/docker.sock")).
        WithExec(nil, ContainerWithExecOpts{InsecureRootCapabilities: true}).
        WithExposedPort(1234, ContainerWithExposedPortOpts{Protocol: Tcp}).
        AsService()
fair ermine
fair ermine
#

My error is pretty funny to debug

 buildkitd: plugin type="bridge" failed (add): failed to list chains: running [/sbin/iptables -t nat -S --wait]: exit status 3: iptables v1.8.9 (legacy): can't initialize iptables table `nat': iptables who? (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
#

Ok I'll check that

spiral fog
fair ermine
#

Because I have the dagger cli available, I can run dagger --help etc.. but I cannot execute anything with the engine

#

I'll try your way tho, maybe it will bypass the CNI issue

spiral fog
fair ermine
fair ermine
spiral fog
#

do you have docker cli in the image you're using?

fair ermine
#

Nope, I do not see it in your code either, are you using it?

#

I see you're simply using dagger().your cmd, do you have docker installed somewhere?

spiral fog
#

can you try to add docker to WithExec([]string{"apk", "add", "curl"}). line? Maybe my usage doesn't trigger the client connection

fair ermine
#

dagger mod init should trigger docker

#

That's soooo strange

spiral fog
#

yes, I'm not sure why it's working 😄 looking the code again

fair ermine
#

I tried to ctrl+f docker, I'm not seeing it

spiral fog
#

did you add ContainerWithExecOpts{ExperimentalPrivilegedNesting: true}?

#

maybe it's trying to create new client since you didn't give permission to access existing engine

fair ermine
#

You're so right, omg

spiral fog
#

It's an interesting error. I knew it because I encountered the same issue.

fair ermine
#

Ok so it's funny bc downloading the CLI and allowing nested privileges works, but binding the engine manually does not because of CNI, I'm really confused

fair ermine
#

@still garnet Modules published by dagger mod publish do not appear at the top of daggerverse, but they can be seen if we write the name of the module in the search bar

fair ermine
spiral fog
spark cedar
#

So had a bit of a thought last night - are we planning to support native multiplatform builds (without qemu emulation)? buildx does this by having a bizarrely complicated client, which I'd actually really like to avoid - it makes client/server side cooperation really fiddly and horrible, and also doesn't fit with how we'd like people to eventually deploy dagger (having a platform team deploy the bits and pieces, and then the dev teams just connect to it).

Ideally, what I'd love would be to just connect to a single dagger engine, that then had multiple worker backends with native architectures. To me this sounds a little bit like the "clustered buildkit" we've discussed before, but the baby case - a small number of engines, no complex scheduling requirements (all steps go to their native architecture).

From the configuration perspective, I'd imagine configuring one dagger "primary", which would be configured to connect to multiple "secondaries" that it could offload other architecture's builds to. From an architecture perspective, with https://github.com/dagger/dagger/issues/5484, we could deploy the "primary" component as a container which does everything except actually running the builds, and then have the "secondary" component deployed alongside it (essentially like a networked buildkit worker?) Then we could configure other "secondaries" as well.

The big issue with something like this would be that we'd need shared storage between all the secondaries. However, this would be significantly easier than trying to co-ordinate shared storage between multiple buildkits, since there's only one primary to coordinate everything. There is still the question of what that shared storage layer would look like, I don't really have strong opinions (beyond letting people configure what they want).

Curious about people's thoughts - it feels like a nice feature to try out "clustered buildkit" work, and test the waters for that.
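
To illustrate the "baby case" scheduling described above (every step goes to the secondary with its native architecture, nothing smarter), here is a toy sketch. The primary, step, and route names and the addresses are all hypothetical; nothing like this exists in the engine today.

```go
package main

import "fmt"

// step is one unit of work tagged with the platform it must run on.
type step struct {
	Name     string
	Platform string
}

// primary holds the secondaries it was configured with, keyed by platform.
// The addresses are placeholders.
type primary struct {
	secondaries map[string]string
}

// route picks the secondary whose native platform matches the step.
// There is deliberately no load balancing or fallback to emulation.
func (p *primary) route(s step) (string, error) {
	addr, ok := p.secondaries[s.Platform]
	if !ok {
		return "", fmt.Errorf("no secondary registered for platform %q", s.Platform)
	}
	return addr, nil
}

func main() {
	p := &primary{secondaries: map[string]string{
		"linux/amd64": "tcp://amd64-worker:1234",
		"linux/arm64": "tcp://arm64-worker:1234",
	}}
	for _, s := range []step{
		{Name: "compile", Platform: "linux/arm64"},
		{Name: "test", Platform: "linux/amd64"},
		{Name: "package", Platform: "linux/riscv64"},
	} {
		addr, err := p.route(s)
		if err != nil {
			fmt.Printf("%s: %v\n", s.Name, err)
			continue
		}
		fmt.Printf("%s -> %s\n", s.Name, addr)
	}
}
```

The hard part the sketch leaves out is exactly the one raised above: how results produced on one secondary become visible to the others.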

tepid nova
spark cedar
#
  1. Caching yes, storage no. e.g. if I pull an image during the build, I actually need to get it to the executor somehow. Or for an ultra fun example, what if for a single build I want to mix-and-match steps that use arm64 and amd64, and want to use the same cache-volumes/layer caches?

If we ran multiple buildkits then this could potentially be the solution? But this leads to so many issues like I mentioned, e.g. if I run a module, then it kind of needs to be run on each engine, even though in reality, we could just run it once - there's loads of edge cases. But if we try and do the idea of 1 buildkit that supports multiple architectures, then we get to sidestep all of those problems at once, and keep a really elegant client-side experience.

Maybe this isn't entirely incompatible with the distributed cache? I think it's probably more likely to be orthogonal.

tepid nova
#

Oh I see

#

sounds like having a bunch of full-featured engines, with the added capability of redirecting jobs to each other, might be easier

#

or, alternatively, a bunch of regular engines + a control plane redirects to the right one

#

that sounds less complicated than splitting the engine along a new executor interface, solving the storage issues etc

civic yacht
# spark cedar 2. Caching yes, storage no. e.g. if I pull an image during the build, I actually...

I think the distributed cache in one form or another would be involved here. Today when engines use it, they are just querying it to ask the service "do you have any cached layers that match this operation I'm about to perform?" and also periodically syncing their own mapping of operation cache keys -> layers to the service. So if there's 2 engines, one x86 and one arm64, the cache service does already allow them to share cache. If they are performing ops that include the platform in the cache key, they won't have any cache matches between each other; if they are performing ops that don't include the platform in the cache key, then they can share cache.

The biggest impediment right now is that the cache synchronization happens async periodically, every 5 minutes by default, which is probably too much a delay to be especially useful over the context of a single build. There's pretty low-hanging fruit in terms of significantly optimizing that though.

Either way, the general idea of the distributed cache service (mapping operation cache keys -> cached results) seems likely to be something we'd want for any sort of distributed worker type setup.
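
A toy illustration of the cache key point: when the platform is part of the key, an x86 engine and an arm64 engine can never match each other's entries; when it isn't, they can. The op struct and the hashing scheme are invented for this sketch and are not how Buildkit computes its cache keys.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// op is a simplified operation. PlatformSpecific marks ops whose result
// depends on the platform they ran on (hypothetical field).
type op struct {
	Definition       string
	Platform         string
	PlatformSpecific bool
}

// cacheKey hashes the definition, and the platform only when it matters.
func cacheKey(o op) string {
	h := sha256.New()
	h.Write([]byte(o.Definition))
	if o.PlatformSpecific {
		h.Write([]byte{0}) // separator between definition and platform
		h.Write([]byte(o.Platform))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// A source fetch produces the same bytes everywhere, so the key is shared.
	fetchAMD := op{Definition: "git clone example/repo", Platform: "linux/amd64"}
	fetchARM := op{Definition: "git clone example/repo", Platform: "linux/arm64"}
	fmt.Println("fetch keys equal:", cacheKey(fetchAMD) == cacheKey(fetchARM))

	// An exec runs platform-native binaries, so the key must differ.
	execAMD := op{Definition: "go build ./...", Platform: "linux/amd64", PlatformSpecific: true}
	execARM := op{Definition: "go build ./...", Platform: "linux/arm64", PlatformSpecific: true}
	fmt.Println("exec keys equal:", cacheKey(execAMD) == cacheKey(execARM))
}
```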

civic yacht
spark cedar
#

Native multi-arch builds with distributed engines

celest totem
fair ermine
#

https://github.com/dagger/dagger/pull/6118 the default NodeJS client is almost done, I still need to update the code generator but the logic works fine.
I'm waiting for your feedback. It required several changes to the internals to make it work; this is the best solution I found after (a lot of) tests

GitHub

Expose a Node client that can be used without a Dagger connection set up.
Abstract the connection to the engine outside the NodeJS generated code.
Add test for default client.
Refactor provisioning...

rancid turret
still garnet
#

Stumbled across an awkward bug with how git refs are resolved in Buildkit, which made one of my tests fail in a surprising way: https://play.dagger.cloud/playground/lh6rroNcvzL
tl;dr currently Buildkit is resolving our v0.9.3 tag to sdk/go/v0.9.3 which is a different commit from v0.9.3. More details in the playground link. Will TODO either fixing or opening an issue and move on for now.
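
One way this kind of mix-up can happen (an assumption for illustration, not a confirmed diagnosis of what Buildkit does): git ls-remote style patterns match the tail of a ref name, so v0.9.3 matches both refs/tags/v0.9.3 and refs/tags/sdk/go/v0.9.3, and taking the first match after sorting picks the wrong one. The helper names below are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tailMatches returns every ref that equals name or ends in "/"+name,
// sorted lexically.
func tailMatches(refs []string, name string) []string {
	var out []string
	for _, r := range refs {
		if r == name || strings.HasSuffix(r, "/"+name) {
			out = append(out, r)
		}
	}
	sort.Strings(out)
	return out
}

// exactTag only accepts the fully qualified tag ref.
func exactTag(refs []string, name string) (string, bool) {
	want := "refs/tags/" + name
	for _, r := range refs {
		if r == want {
			return r, true
		}
	}
	return "", false
}

func main() {
	refs := []string{
		"refs/heads/main",
		"refs/tags/v0.9.3",
		"refs/tags/sdk/go/v0.9.3",
	}
	fmt.Println("tail matches:", tailMatches(refs, "v0.9.3"))
	ref, ok := exactTag(refs, "v0.9.3")
	fmt.Println("exact match:", ref, ok)
}
```

Qualifying the ref fully before resolving it sidesteps the ambiguity, at the cost of having to know up front whether the name is a tag or a branch.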

rancid turret
still garnet
tidal spire
#

Anyone know why an image published from dagger has

"Metadata": {
   "LastTagTime": "0001-01-01T00:00:00Z"
}

instead of a real timestamp?

still garnet
#

assuming this is part of metadata that goes into container/image digests

tidal spire
#

Got it. FWIW a docker build produces a real timestamp here
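
Side note on the value itself: 0001-01-01T00:00:00Z is what Go's zero time.Time serializes to, which suggests (a guess, not verified against the code) that the field is simply never set on this path. A tiny sketch, with imageMetadata as a made-up struct for the example:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// imageMetadata is a hypothetical struct with just the field in question.
type imageMetadata struct {
	LastTagTime time.Time
}

// encode marshals the metadata to JSON and returns it as a string.
func encode(m imageMetadata) string {
	b, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// The field is left at its zero value on purpose.
	fmt.Println(encode(imageMetadata{}))
	// prints {"LastTagTime":"0001-01-01T00:00:00Z"}
}
```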

spiral fog
#

Do we have any change related to how cache works in the dev engine or a new change in the main?

I'm trying to access my files in a cache volume with dagger v0.9.3 and devel: v0.9.3 works as expected, but with devel I can't see any files in the volume.

devel:

/opt/homebrew/bin/dagger export artifacts --output .gale/artifacts
✔ dagger download artifacts [1.23s]
┃ Asset exported to ".gale/artifacts"
• Cloud URL: https://dagger.cloud/runs/56474944-9e4a-4ad3-a6c0-e89345613aed
• Engine: 20d26947aee5 (version devel ())
⧗ 5.08s ✔ 42 ∅ 12
✔ dagger download artifacts [4.23s]
┃ Asset exported to ".gale/artifacts"
• Cloud URL: https://dagger.cloud/runs/8155e655-7133-43d9-ae6f-fa1206209d91
• Engine: 49d372dbc70d (version v0.9.3)
⧗ 10.95s ✔ 88 ∅ 7

Executed function: https://github.com/aweris/gale/blob/c0bed86a298782335454b1282ff63d5d06ad26c6/daggerverse/actions/artifacts/main.go#L83-L101

I can export the files in the screenshot with v0.9.3 but getting empty directory with devel version

civic yacht
#

Do we have any change related to how

spark cedar
#

but the commit is wrong - i'll submit an upstream pr for this one, it should be pretty simple

spark cedar
primal stone
#

I didn't see this error reported in the #help channel, so maybe this is worth mentioning here:

• Engine: 9051862dce9f (version v0.9.4)
⧗ 1.05s ✔ 3 ✘ 1
fork/exec ./test.py: exec format error

on OS X 12.7.1

wild zephyr
#

👋 I've noticed that buildctl inside the 0.9.4 engine seems to be different from v0.9.3, as I only see dial-stdio available? I used to use buildctl prune to remove the cache without stopping the engine. Is there a way to achieve the same result in the latest version?

spark cedar
#

i did have the thought - should we just have some basic cache management commands on the dagger cli? it would be so useful for me while developing dagger at least, though maybe it doesn't have great value to end-users.

wild zephyr
spark cedar
#

nice, well that's at least 3 votes to do something 😄

#

do we already have a gh issue? if not, we should probably make one 😄

still garnet
#

don't think so! (also searched prune now and didn't see anything)

spark cedar
obsidian rover
obsidian rover
obsidian rover
#

Not sure I understand how to turn it off and on? My guess is that one of the times I ran my PR I hit this bug, and it now seems to be stuck like that on CI (wild guess)

rancid turret
#

By "on and off" I meant a re-run 😛

obsidian rover
torpid tapir
#

Sorry, I'd like to understand the dagger engine open-source code. How does the dagger engine "call" the different platforms' APIs?

#

I'm assuming that the daggerengine calls the various platforms' api directly and then perhaps offers webhooks to receive requests from the platforms.

rancid turret
#

Dagger engine and platforms

bronze hollow
#

I'm a little behind on recent developments, so this may be an easy question to answer:

Currently, I'm using dagger up to spin up a backend stack for a frontend application, and then using bun dev to run my frontend locally and consume the backend.

Is it possible for the frontend to run in Dagger too, with syncing / reloading of code?

tepid nova
#

in theory, you could do it yourself on top of the current engine, but it would require quite a bit of plumbing, and running an additional helper tool on the host

bronze hollow
#

Thanks. I'll stick with the local step for now 👍🏼

#

Using mprocs atm to spin up the dagger up and the local commands, which does work quite nicely

tepid nova
#

great, more wrappers 😛 I’ll check it out, thanks for the link

tepid nova
#

Hey @rancid turret I'm working on cosmetic improvements to dagger functions, and running into a problem with FuncCommand, was wondering if you could point me in the right direction?

#

Basically:

  • I am adding logic to dagger functions so that it can process arguments directly, without the FuncCommand boilerplate. The clean way to do this is to make FunctionListCmd a raw cobra.Command instead of a FuncCommand. I plan on doing that, but it's a bit more work (because I need to steal the module loading logic from FuncCommand) so I'm doing a stopgap for now
  • The stopgap is to slightly patch FuncCommand, so that in addition to an Execute field (currently used by FuncListCmd), there is also an ExecuteAll, which does exactly the same thing, except it receives the arguments passed by cobra.
  • PROBLEM: these arguments include -m NAME_OF_MODULE! I don't understand why, since I assume that is a cobra flag and is processed somewhere along the way. Which means cobra should not include it in the args... but it does!

This is where I'm stuck

rancid turret
#

You can just change Execute to pass the args in. It's only used by dagger functions.

#

Then you'll be able to pass the parsed arguments which should remove -m (and it's simpler).

#

FuncCommand will be removed soon, when we put everything under dagger run.

tepid nova
#

That’s what I did

#

but somehow -m foo is still mixed into the supposedly parsed arguments

#

hence my problem

rancid turret
#

But which args are you passing in? To not have -m it has to be c.Flags().Args().

tepid nova
#

I was just passing a []string argument received by RunE

rancid turret
#

Yep, try using flags from cmd, flags, err := fc.load(c, a, loader).

rancid turret
rancid turret
tepid nova
bronze hollow
#

I'm trying to configure a module with a struct:

Threading

tepid nova
tepid nova
rancid turret
#

Which arguments are you trying to use? By this point flags have been parsed already so you can get their value.

rancid turret
tepid nova
#

Ah ok

#

I want to do eg dagger functions -m MODULE foo bar baz so I actually walk the pipeline myself

rancid turret
#

With ExecuteAll or Execute? It can make a difference because Execute has escape points, so you may be traversing during load and consuming the args.

#

What you want is c.Flags().Args(). You should be able to use that inside the current Execute in dagger functions without having to pass args.
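
The raw-args versus parsed-args distinction can be shown with the standard library flag package, used here as an analogy so the example has no dependencies. Cobra builds on pflag, whose FlagSet has the same Args() method, but unlike pflag the stdlib parser stops at the first positional argument. positionalArgs is a helper invented for this sketch.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// positionalArgs parses raw and returns the -m value plus whatever is left
// over once flags have been consumed.
func positionalArgs(raw []string) (string, []string, error) {
	var module string
	fs := flag.NewFlagSet("functions", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	fs.StringVar(&module, "m", "", "module reference")
	if err := fs.Parse(raw); err != nil {
		return "", nil, err
	}
	// Args returns only the non-flag arguments that remain after parsing.
	return module, fs.Args(), nil
}

func main() {
	raw := []string{"-m", "MODULE", "foo", "bar", "baz"}
	module, rest, err := positionalArgs(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("raw args:   ", raw)
	fmt.Println("module:     ", module)
	fmt.Println("parsed args:", rest)
}
```

The raw slice still contains -m MODULE, while the parsed remainder is just foo bar baz, which is the slice you want to walk.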

tepid nova
#

@rancid turret it worked 🙂 no more cobra issues, thanks

spark cedar
#

Is there a way when using service tunnels to get what port was assigned automatically?
https://github.com/dagger/dagger/blob/main/core/schema/host.graphqls#L69-L71
The only way I've worked out how to do it is to call ports on the service, and then work out how that corresponds with the same index? I quite like the endpoint helper, but it doesn't seem like there's any way to get that to work in this case (since it uses the frontend port, not the backend port, which makes sense).

civic yacht
#

Is there a way when using service

rancid turret
#

Does anyone know why make engine:lint complains about modules in docs/versioned_docs/version-zenith/ but not in core/integration/testdata/modules/go/ (for example undefined: Optional)? I mean when you haven't run dagger mod sync locally.

#

For context, engine:lint lints all .go files in the repo, including doc snippets.

civic yacht
#

Why does make engine:lint complain on

tepid nova
civic yacht
#

replace up command with chainable core a...

wild zephyr
#

Thread about WithMountedX and inconsistent Exports. Coming from here: https://discord.com/channels/707636530424053791/1196918439185494116

Seems like calling Container().WithMountedDirectory("/tmp/foo", somedir).Directory("/tmp/foo").Export(ctx, "foo") does export all the files contained in the mounted directory (I thought this wasn't possible, not sure if something changed). However, if I call Export or Publish at the *dagger.Container level, the underlying tar archive doesn't contain the mounted directory, which is what I initially expected. Is this working as designed? It feels weird to me that the two behave differently.

wild zephyr
#

👋 recent issue we've found with @astral zealot hacking on our webhook module (https://github.com/franela/webhook). Seems like calling dagger within the same dagger session doesn't work. Repro: dagger run sh -c "dagger call -m github.com/shykes/daggerverse/hello message && dagger call -m github.com/shykes/daggerverse/hello message"

I get this error on the second execution:

Error: query module objects: json: error calling MarshalJSON for type *dagger.Module: returned error 400 Bad Request: failed to merge schemas of [daggercore hello hello]: conflict on type "Query" field "loadHelloFromID": field re-defined

1: load call ERROR: query module objects: json: error calling MarshalJSON for type *dagger.Module: returned error 400 Bad Request: failed to merge schemas of [daggercore hello hello]: conflict on type "Query" field "loadHelloFromID": field re-defined

Seems like a valid issue to open @spark cedar @still garnet @civic yacht ?

spark cedar
#

has anyone else got the vibe that the number of log lines in our test/testdev jobs has grown a lot recently? it feels like logs now take longer to load/scroll through, and downloading them to check indicates that we're now at about 50MB of logs per run 😱
is there anything we can do in our tests to maybe make them a little quieter?

rancid turret
#

Yeah, I think there was a change in December that led to that. Was it during the switch to dagger runners?

tidal spire
#

Do we have any docs/readmes about configuring garbage collection?

#

I expected to find it in the operator manual if it were documented anywhere

tidal spire
spark cedar
#

oh sorry, i meant to respond with more lol

#

this isn't really documented well at all, even in upstream buildkit 😦

#

I'm happy to try and find some time to scratch down some notes (since it pretty much needs to be pulled directly from the source code 😱), which we could use to write up some better quality docs

tidal spire
#

I would love that! I know we've shared some gc flags with people in the past but I don't think we've tracked that anywhere

#

If it's all buildkit specific configs, it might make sense to document that upstream and we can provide some reference to that in the operator manual

spark cedar
#

which isn't unrelated

spark cedar
#

Does anyone here have any context for what e.serverMu is protecting in our buildkit controller? (🤞 @civic yacht @still garnet ) https://github.com/dagger/dagger/blob/main/engine/server/buildkitcontroller.go#L280-L284

I'm trying to understand it, and it appears to be held for way longer than we need in lots of scenarios (like in the above instance, to shut down the server). This is sort of related to a deadlock I've been investigating where shutting down a progrock writer stalls (since someone is connected to it), and then the entire dagger engine locks up. I originally thought this might be related to a buildkit upgrade, but actually I think this has potentially been an issue for a while, though I struggle to imagine 1. why I would only start hitting it now, and 2. what kind of weird nesting circumstances could actually trigger this.

#

My proposed "fix" would be to just hold the lock while we're accessing the servers map - but if there's another reason we need synchronisation here, I'm missing it
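A minimal sketch of that proposal, not the actual buildkit controller code: the mutex is held only while the servers map is read and modified, and the potentially slow shutdown runs after the lock is released. The controller and server types, the onClose hook, and removeServer are all hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// server stands in for a per-session server (hypothetical type).
type server struct {
	onClose func() // optional hook so callers can observe the shutdown
}

// Close simulates a shutdown that may take a long time.
func (s *server) Close() error {
	if s.onClose != nil {
		s.onClose()
	}
	return nil
}

// controller guards its servers map with mu (hypothetical type).
type controller struct {
	mu      sync.Mutex
	servers map[string]*server
}

// removeServer holds mu only while touching the map, then shuts the
// server down after unlocking, so a stalled Close cannot block every
// other caller that needs the map.
func (c *controller) removeServer(id string) error {
	c.mu.Lock()
	srv, ok := c.servers[id]
	delete(c.servers, id)
	c.mu.Unlock()

	if !ok {
		return fmt.Errorf("no server %q", id)
	}
	return srv.Close()
}

func main() {
	c := &controller{servers: map[string]*server{
		"session-1": {onClose: func() { fmt.Println("closing session-1 outside the lock") }},
	}}
	fmt.Println(c.removeServer("session-1"))
	fmt.Println(c.removeServer("session-1"))
}
```

The trade-off is that Close is no longer serialized with map access, which is only safe if nothing else relies on the lock for that ordering.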

civic yacht
spark cedar
#

ok nice 😄 i'm gonna open a pr then to try and remove those late unlocks

#

"maybe that will make my deadlocks go away" - jed, 2024

wet mason
#

@civic yacht Hey quick q -- are cache volumes retrieved lazily or fully at startup? (in the calling import cache step)

civic yacht
wet mason
#

thank you! 🙏

#

@civic yacht Actually -- does the step "importing cache" include the volumes? Looking at the code it seems like no? (trying to debug import cache taking minutes)

wet mason
#

/cc @wild zephyr

civic yacht
#

FYI/good news for anyone developing on the engine who has run into the infuriating problem where you have to clear the dev engine cache in order for tests to actually re-run after making changes, this PR fixes it (as a side-effect of more general better version handling): https://github.com/dagger/dagger/pull/6469 cc @still garnet @spark cedar @rancid turret

GitHub

Currently based on top of #6386, 3 most recent commits are unique
Goals:

Add engine version module was most recently updated with to dagger.json for the purposes of tracking as metadata

Currently...

spark cedar
#

@still garnet @civic yacht based on our discussion about progrock/buildkit/etc in github, figured we might start a thread to share updates 👀

rancid turret
#

@still garnet, core/schema/typedef.graphqls should be removed, right?

still garnet
bronze hollow
#

Having some trouble with PortForward on Services:

 Look at json field for more details
    Syntax Error GraphQL request (1:41) Expected Name, found String "backend"

    1: query{host{tunnel(native:false, ports:[{"backend":5432,"frontend":55432,"protocol":"TCP"}]
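For reference, the parser is complaining at column 41 because the ports entry is written as JSON with quoted keys, whereas a GraphQL input object literal uses bare field names (and bare enum values). The sketch below only shows the literal shape the parser expects, using the field names from the error above. The portForward type and tunnelArgs helper are hypothetical, protocol is assumed to be an enum, and where the JSON-style encoding was introduced is not established here.

```go
package main

import (
	"fmt"
	"strings"
)

// portForward mirrors the fields seen in the failing query (hypothetical type).
type portForward struct {
	Backend  int
	Frontend int
	Protocol string // assumed to be a GraphQL enum value such as TCP
}

// graphQL renders the port forward as a GraphQL input object literal:
// field names are unquoted, and so is the enum value.
func (p portForward) graphQL() string {
	return fmt.Sprintf("{backend:%d,frontend:%d,protocol:%s}", p.Backend, p.Frontend, p.Protocol)
}

// tunnelArgs builds the tunnel(...) selection with its argument list.
func tunnelArgs(native bool, ports []portForward) string {
	parts := make([]string, len(ports))
	for i, p := range ports {
		parts[i] = p.graphQL()
	}
	return fmt.Sprintf("tunnel(native:%t, ports:[%s])", native, strings.Join(parts, ","))
}

func main() {
	fmt.Println(tunnelArgs(false, []portForward{
		{Backend: 5432, Frontend: 55432, Protocol: "TCP"},
	}))
}
```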
bronze hollow
#

Also I'm noticing that if I run a container locally that speaks to a Dagger exposed service, it hangs indefinitely. Is this a known limitation?

Basically, supabase gen types typescript unfortunately uses a container to speak to the database and generate typescript code. When that database is a Dagger service, I can see it has generated the TS types, but the command never ends.

spark cedar
#

can someone check me on how SIGINT works? i'm finding that if i start a dagger client with go run main.go and send CTRL+C, the session process receives it before we even get the chance to handle it at the top level, where i could do a nice service shutdown.

tidal spire
#

Having some trouble with PortForward

still garnet
#

can someone check me on how SIGINT works

karmic mulch
#

curious about what motivated using graphql in dagger, was it to reduce data fetching?

tepid nova
# karmic mulch curious about what motivated using graphql in dagger, was it to reduce data fetc...

It was an accident of the design. @wet mason and @civic yacht started out in grpc, building on top of buildkit (which is also grpc). Then along the way they realized they were building something that looked a lot like GraphQL, and as an experiment, tried using GQL instead. The result was so successful that we made the switch. It's just a near-perfect match for dynamically querying a DAG, which is what Dagger is all about

#

The initial goal was to replace CUE (a static declarative configuration language) with a dynamic declarative API, which could be queried from any language.

spark cedar
still garnet
#

Hm i might have discovered something

tepid nova
#

It's a small detail, but it irks me. I feel like the --help output is polluted by ultra-niche subcommands and flags that don't belong there.

  • dagger completion: this command is not important enough to pollute every single top-level usage. I say either hide it, or move it into a helper tool.

  • --cpuprofile, --pprof: either hide these, or make them environment variables (and document them properly of course)

  • --silent, --progress: move these to the relevant subcommands, like call, run... They are not relevant to every subcommand.

I know these are small details, but they add up. If we're not strict now, soon our CLI will be as bloated as the Docker CLI...

When you run --help and the output is concise and readable, it really makes a difference in the UX

rancid turret
#

I’m dealing with that currently.

#

Those things are actually a part of my list. Atm I’m wrapping text.

tepid nova
rancid turret
tepid nova
tepid nova
#

I find myself wishing that we did not honor OCI image entrypoints by default... Anyone else had this feeling?

tepid nova
#

Kind of like the way Kubernetes manifest ignores a bunch of fields in the docker image, and just makes you re-write them in the kub config. At the time it irked me, but now I get it. Clean slate. And more control over the overall experience for the wrapping tool (then: kube. now: dagger)

still garnet
#

didn't we have a big bikeshedding discussion about this a while back?

#

(in GitHub somewhere)

tepid nova
#

I don't see how we could have avoided that 🙂 But I don't remember it.

#

I think our focus was on maximum compatibility and familiarity for Docker users. Mapping every Docker concept to Dagger seamlessly.

Now I'm realizing that's not super helpful in practice, at least not by default.

I would much rather know that, by default, From("foo").WithExec("bar") will always actually execute the binary "bar" in the image "foo", unless I specifically tell Dagger that I care what the image author put in the entrypoint (most of the time I don't).

still garnet
#

yeah exactly

tepid nova
#

Entrypoint was an early attempt at making containers like functions - but now Dagger is doing that, better

#

So it should go the way of the Dodo, like ONBUILD (aka the most misunderstood Dockerfile command)

still garnet
#

I struggled with this for a while in Bass, since there the syntax is ($ go build --foo) and typing ($ build --foo) because you know the entrypoint is go just felt nonsensical. I eventually had to add support for ENTRYPOINT/CMD to support publishing images, but had to square their semantics with everything else.

tepid nova
#

I feel less alone 🙂

#

@still garnet you feel it would be worth a small breaking change?

still garnet
#

Personally yeah, if we think it's better in the long-term, the sooner we rip off the band-aid the better. But it might be a painful band-aid

tepid nova
civic yacht
tepid nova
tepid nova
#

engine architecture detail for future 1.0 discussion. To cement the concept that the engine container is a private implementation detail of the CLI, and not meant to be managed separately: what if in the future we configured the image such that the daemon is only reachable by the CLI that provisioned it, for example with an ssh or http/tls tunnel using a dynamically injected keypair, or some other tunneling transport. The more portable the better.

tepid nova
#

I'm trying to open a PR that makes dagger completion hidden (while we're fixing UI details), but can't find the word "completion" anywhere relevant in the source. Can anyone point me in the right direction?

I'm guessing it's a builtin cobra feature, but can't find where it's enabled (maybe enabled by default?!?)

rancid turret
#

It's built-in, created by default by cobra, yes.

tepid nova
#

For those following engine internals, and production architecture, I updated the issue on "compute drivers" (previously engine drivers) to keep the conversation moving forward: https://github.com/dagger/dagger/issues/5583

GitHub

This issue was previously named "engine drivers", but "compute drivers" is proving more clear. Problem The Dagger CLI has a builtin “compute driver”: a software interface which ...

pulsar sage
obsidian rover
#

Engine not flushing all events on sigint

rancid turret
#

@civic yacht, how's the context dir determined? Find a parent .git or default to nearest dagger.json?

civic yacht
obsidian rover
#

Quick question:

Context
Following a debugging session on https://linear.app/dagger/issue/DEV-3402/git-metadata-not-showing-up-for-node-and-python-cloud-runs

Because of how permissive each language is, it is easier to run a Dagger pipeline (older version, not modules) from outside its git repo/context in Python and Node than in Go. I tried dagger run node --loader ts-node/esm typescript_sdk/copy.ts from outside the repo where it was implemented, and the same in Python; both work.

Consequence
We more often get Dagger Cloud info without git labels for these languages, as people run them from outside the git context

Actual Question
Do you think it makes sense for dagger run to change the workdir context based on the path set by users when running dagger run, in order to have more consistent label collection? It looks like opening a rabbit hole...

wet mason
#

@tidal spire 👋 Hey, I vaguely remember you having a GHA to invoke dagger, do I remember that correctly?

tidal spire
wet mason
#

sweet!

#

very nice thank you

tidal spire
wet mason
#

@tidal spire can you call a local module using that?

tidal spire
#

Should be able to, yeah, but I think you need to actions/checkout first

wet mason
#

@tidal spire is there a way to see what arguments are being sent?

wet mason
#

sorry i did miss that!

wet mason
#

@tidal spire another one sorry ... were you ever able to use multi-line args? I'm trying a few different things and getting stuck with a broken command

#

@tidal spire nevermind got it -- there's some weird escaping happening: if the command is given as a multiline yaml string (e.g. |), then each continued line must end with \ so it doesn't break the shell script

tidal spire
#

ah yeah that makes sense!

obsidian rover
#

Can we dagger shell in a container with a service binding? I have set up a service in a module function, binding it to my current module, and when I do:

dagger call add-secret shell
✔ ModuleSource.resolveFromCaller: ModuleSource! 0.1s
✔ ModuleSource.asModule: Module! 0.5s
  ✔ Module.withSource(
      source: ✔ ModuleSource.resolveFromCaller: ModuleSource! 0.0s
    ): Module! 0.5s
    ✔ Container.import(
        source: ✔ Directory.file(path: "go-module-sdk-image.tar"): File! 0.0s
      ): Container! 0.4s

Error: unknown command "shell" for "dagger call add-secret"

Implementation:

func (m *LambdasApi) AddSecret() *Container {
    s := m.LocalStack()
    return dag.Container().
        From("amazon/aws-cli").
        WithServiceBinding("localstack", s)
}
obsidian rover
tepid nova
tepid nova
#

Has anyone encountered this error before while building the engine?

 /go/pkg/mod/github.com/vito/progrock@v0.10.2-0.20240119030128-52ef9ee1a291/console/trace.go:7:2: package log/slog is not in GOROOT (/usr/local/go/src/log/slog)
note: imported by a module that requires go 1.21
/go/pkg/mod/github.com/moby/buildkit@v0.13.0-beta3/solver/result/result.go:4:2: package maps is not in GOROOT (/usr/local/go/src/maps)
note: imported by a module that requires go 1.21
#

Oh I guess my build container has version 1.20, and that's too old

#

?

#

Love how I found that out by the way:

dagger -m github.com/shykes/daggerverse/dagger call \
  engine release --version=0.9.10 source \
  go-base with-exec --args go,version stdout
still garnet
tepid nova
#

Yeah even better 🙂

#

I feel like we haven't taken advantage of the fact that we can write custom functions that just pop a terminal whenever they want (does that work?)

#

If my function calls Container.Terminal(), does the end user get a terminal?

still garnet
still garnet
tepid nova
#

I guess it works with up too then? So I could call eg. -m docker-compose call up and it would bring up my services?

still garnet
#

i think up is different; in that case the Service.up API call itself is what starts your tunnel, I don't think we special-case the Service return value in any way

tepid nova
#

Oh I see - your function has to return a Terminal type

#

I thought just the fact of calling Terminal (ie. half-way through the implementation) would make it happen

#

uh oh, installing Go 1.22 didn't fix the error...

still garnet
#

oh right yea, you can't just spawn a terminal willy-nilly; has to be the return value

still garnet
tepid nova
#

oh nevermind, it appears to have worked!

#

pushing my changes and calling from the remote git repo, worked cleanly

#

Are you able to reproduce?

dagger -m github.com/shykes/daggerverse/dagger call engine release --version=0.9.10 source cli --arch $(uname -m) --operating-system $(uname -s) export --path ./dagger-0.9.10
#

ah, but I'm still getting an error by making the same exact call, in Go from another module... 😭

#

Repro:

dagger -m github.com/shykes/daggerverse/daggy call \
  container \
  terminal
still garnet
#

yep happens here too

#

maps and log/slog are both new in 1.21 so it doesn't seem like the bump was respected

#

wait, let me try with main, @spark cedar pushed a fix to respect the go.mod's reported toolchain

tepid nova
#

The error itself isn't that weird. It's the fact that it occurs when I call the function in one way, and doesn't occur when it's called in another way

spark cedar
tepid nova
#

Out of the blue, dagger completely hangs for anything that requires the engine container. This happens for both 0.9.10 and main. The CLI itself works, eg. I can run dagger --help without problem.

#

It worked this morning. The only change to my system is that I updated OrbStack. Did that happen to anyone else?

#

dagger --debug makes no difference

#

I wiped all dagger-related containers, images and volumes on my local docker/orb install

obsidian rover
rancid turret
tepid nova
#

I ended up nuking everything + rebuilt ./hack/dev, and everything worked fine after that