#dagger shell

1 messages · Page 1 of 1 (latest)

olive rampart
jaunty field
#

Thanks!

delicate tartan
#

@glacial cedar @olive rampart 👋

glacial cedar
#

ah there we go

delicate tartan
#

thanks for joining @glacial cedar !

glacial cedar
#

@olive rampart I'm around the next 45m or so if now works? otherwise in about three hours

olive rampart
#

Hey @glacial cedar, welcome!

#

Yes, now it works! Do you prefer going on Zoom, or something else? Video in DMs here is 720p for me.

glacial cedar
#

google meet works I assume? I haven't used zoom in years

olive rampart
#

Yeah it works, never initiated one. Do you want to send me the link via email?

glacial cedar
#

do you need it by email too?

olive rampart
#

No, it was just to keep it private 🙂

glacial cedar
#

sent in any case.

#

ah, of course

glacial cedar
#

OK I understand the problem now, I need to think about it for a bit and try a couple of ideas.

glacial cedar
#

I started looking into this today. When we met with Helder the other day I explained to him that subshells inherit all file descriptors from the parent, meaning that when starting a command substitution like foo | bar $(baz), baz may consume the pipe before bar has had a chance to. This is a problem for you because you want each beginning of a pipeline to act as the top-level command, and given that you use stdin/stdout pipes to "handshake" the sides of each pipe, it makes sense to use "no stdin" as a signal that you're at the start of a pipeline.

I suggested to him that there are ways in shell to close file descriptors for the entire remainder of the shell's life, but I couldn't remember the exact syntax. I found it, it's exec <&-. For example, in bash:

$ echo foo | cat
foo
$ echo foo | cat $(cat >/dev/null)

You can see the problem I described earlier; the cat inside the subshell consumes the pipe first, because it inherits stdin. It can be rewritten as follows to avoid that:

$ echo foo | cat $(exec <&-; cat &>/dev/null)
foo

Of course you wouldn't want users to have to write this themselves, so I suggested that you could fairly easily rewrite the parsed syntax tree to insert the statement. Code follows...

#
$ cat prototype.go
//go:build ignore

package main

import (
    "context"
    "fmt"
    "os"

    "mvdan.cc/sh/v3/interp"
    "mvdan.cc/sh/v3/syntax"
)

func main() {
    file, err := syntax.NewParser().Parse(os.Stdin, "")
    if err != nil {
        panic(err)
    }
    syntax.Walk(file, func(node syntax.Node) bool {
        switch node := node.(type) {
        case *syntax.CmdSubst:
            // Rewrite command substitutions from $(foo; bar) to $(exec <&-; foo; bar)
            // so that all the original commands run with a closed (nil) standard input.
            node.Stmts = append([]*syntax.Stmt{{
                Cmd: &syntax.CallExpr{Args: []*syntax.Word{litWord("exec")}},
                Redirs: []*syntax.Redirect{{
                    Op:   syntax.DplIn,
                    Word: litWord("-"),
                }},
            }}, node.Stmts...)
        }
        return true
    })
    runner, err := interp.New(
        interp.StdIO(nil, os.Stdout, os.Stdout),
        interp.ExecHandlers(execHandler),
    )
    if err != nil {
        panic(err)
    }
    if err := runner.Run(context.TODO(), file); err != nil {
        panic(err)
    }
}

func litWord(s string) *syntax.Word {
    return &syntax.Word{Parts: []syntax.WordPart{&syntax.Lit{Value: s}}}
}

func execHandler(next interp.ExecHandlerFunc) interp.ExecHandlerFunc {
    return func(ctx context.Context, args []string) error {
        hc := interp.HandlerCtx(ctx)
        fmt.Printf("Stdin=%v args=%q\n", hc.Stdin, args)
        return nil
        // return next(ctx, args)
    }
}
$ go run prototype.go <<<'top1 | top2 $(sub1 | sub2)'
Stdin=&{0xc00008a4e0} args=["sub2"]
Stdin=<nil> args=["sub1"]
Stdin=&{0xc00008a420} args=["top2"]
Stdin=<nil> args=["top1"]
#

you can see how top1 and sub1, which start each of the pipelines, now see a nil stdin reader.

#

I think that should resolve your issue? I didn't want to go any route which would go against established shell semantics, such as adding an option like "do not inherit stdin in subshells", because that feels oddly specific and against the documented behaviors. It's easy enough to have any behavior you want by manipulating the syntax tree as shown above, so I think this is the most flexible approach.

#

Happy to chat again anytime this weekend to help clarify. Next week I'm busy with work Monday-Friday so I won't be able to do video calls, I'm afraid - but I should still be able to respond to Discord.

olive rampart
#

Yay! It works @glacial cedar 🎉, thanks!

❯ dagger shell -m modules/wolfi -c 'container | with-directory /mnt $(.core container | from alpine | with-exec echo,-n,sdk/go | stdout) | directory /mnt'
_type: Directory
digest: sha256:cd109cb13b8a922e99168241d648f79ff74fe29a552f425e189c0ead510b0f53
entries:
    - .changes
    - .changie.yaml
    - .gitattributes
    - .gitignore
    - CHANGELOG.md
    - LICENSE
    - README.md
    - client.go
    - client_test.go
    - dag
    - dagger.gen.go
    - examples_test.go
    - fs.go
    - generate.go
    - go.mod
    - go.sum
    - internal
    - provision_test.go
    - querybuilder
    - telemetry
#

@glacial cedar, I have another question. All the handlers get blocked on reading stdin, until the previous stdout is closed, which is ok. However, if a command in the middle of the pipe line returns an error, then the next commands continue after that point with a stdin of <nil>. Is there a way in the exec handler to check if the command errored so that it can bail after getting unblocked from reading stdin? Right now those continue on to the default handler and I can see in the logs the "xxx" command not in $PATH error. This is from my own logging so it doesn't show to the user when the debug mode is turned off, and it doesn't affect the result of the command, but it would be better to not continue processing.

I tried setting pipefail param, not sure if this is correct:

interp.Params("-e", "-u", "-x", "-o", "pipefail"),
glacial cedar
#

All the handlers get blocked on reading stdin, until the previous stdout is closed, which is ok.

FWIW I don't see this behavior:

//go:build ignore

package main

import (
    "context"
    "fmt"
    "io"
    "os"
    "strings"
    "time"

    "mvdan.cc/sh/v3/interp"
    "mvdan.cc/sh/v3/syntax"
)

func main() {
    file, err := syntax.NewParser().Parse(os.Stdin, "")
    if err != nil {
        panic(err)
    }
    runner, err := interp.New(
        interp.StdIO(nil, os.Stdout, os.Stdout),
        interp.ExecHandlers(execHandler),
    )
    if err != nil {
        panic(err)
    }
    if err := runner.Run(context.TODO(), file); err != nil {
        panic(err)
    }
}

func execHandler(next interp.ExecHandlerFunc) interp.ExecHandlerFunc {
    return func(ctx context.Context, args []string) error {
        hc := interp.HandlerCtx(ctx)
        fmt.Printf("Stdin=%v args=%q\n", hc.Stdin, args)
        if hc.Stdin == nil {
            fmt.Fprintf(hc.Stdout, "%s (first)\n", args[0])
        } else {
            var sb strings.Builder
            for {
                var buf [1]byte
                _, err := io.ReadFull(hc.Stdin, buf[:])
                if err == io.EOF {
                    break
                }
                if err != nil {
                    return fmt.Errorf("Stdin.Read: %w", err)
                }
                b := buf[0]
                if b == '\n' {
                    break
                }
                sb.WriteByte(b)
            }
            line := sb.String()
            fmt.Fprintf(hc.Stdout, "%s - %s\n", line, args[0])
        }
        fmt.Println("sleeping for 5s")
        time.Sleep(5 * time.Second)
        return nil
    }
}
$ go run prototype.go <<<'foo | bar | baz'
Stdin=&{0xc00008a360} args=["baz"]
Stdin=&{0xc00008a420} args=["bar"]
Stdin=<nil> args=["foo"]
sleeping for 5s
sleeping for 5s
foo (first) - bar - baz
sleeping for 5s
#

this is a simple handler where the first part of the pipe, foo, writes a line - then the others read that line, add a bit more, and write it again. you can see that all three get to the sleep just fine, even though none of them finishes nor closes stdout.

#

Your second issue about failing commands has me confused quite a bit, and I cannot reproduce it. I think we would have to do another call for you to screenshare and show me, or share a small reproducer Go program.

#

In particular I dont understand how you could possibly see a nil stdin - HandlerContext.Stdin is set when each of the commands in a pipe is started, and they are all started concurrently, so one command in the pipe failing should not affect the value of stdin for any following commands.

#

I also assume you should not be using the default/fallback exec handler context at all, so I'm also confused why you're seeing it doing something.

#

I'll largely be unavailable for calls this coming week until Saturday I'm afraid, but if you can share a reproducer for this - either as a standalone example here, or by pushing to that PR and telling me what to run with just Go installed to reproduce the issue - then I'll investigate in the evening.

olive rampart
#

@glacial cedar, thanks for your response! I’m going to adapt your simple example to provide clarity but in the meantime, here’s some explanations.

#

this is a simple handler where the first part of the pipe, foo, writes a line - then the others read that line, add a bit more, and write it again. you can see that all three get to the sleep just fine, even though none of them finishes nor closes stdout.

The blocking on stdin part isn’t the problem, I was just providing context. The problem is, let’s say, if command bar returns an error, I don’t want baz to proceed into the Sleep line. That’s basically what my question was about.

It’s not affecting the result, which is correct, I just want to avoid the extra processing and potential problems I’m not foreseeing.

#

In particular I dont understand how you could possibly see a nil stdin - HandlerContext.Stdin is set when each of the commands in a pipe is started, and they are all started concurrently, so one command in the pipe failing should not affect the value of stdin for any following commands.

I only see nil when a previous command in the pipeline returns an error, but to be clear, I’m not checking hc.Stdin == nil directly. Sorry for misleading as I kind of forgot about this detail! I have a function reading content from stdin and that may return nil if there’s no “proper” content or "handshake". So I think basically stdin could be EOF or something, which returns a nil "handshake". I’ll double check with the repro.

#

I also assume you should not be using the default/fallback exec handler context at all, so I'm also confused why you're seeing it doing something.

That’s because I’m returning next in case the first command doesn’t match an API function, but when there’s an error in the middle of the pipeline, later commands also see the “parsed input” as nil so they end up falling back to returning next. It’s by logging to stderr the error returned from next that I see the “command not in $PATH” error, but if I can exit the handler right after reading the input in this case, this can be avoided as well.

Maybe if stdin != nil but returns EOF or similar? Was just wandering if there’s a more clear “signal“ I could use to check if it’s an error condition. HandlerContext doesn’t have enough information here, but could be checking if ctx is being canceled.

#

There’s another question I was leaving for later but it’s sort of related to this (i.e., default handlers). Shell builtins, for example, run before my exec handler, so I can’t have a function named echo because the shell’s own echo is executed instead. I was wondering if I could do something about it in CallHandler, but at that point I can’t check stdin for the handshake and validate which is which, right?

Do you have any suggestion?

olive rampart
olive rampart
glacial cedar
#

typed nils are a bit of a trap in Go

#

@olive rampart I should have clarified that I did a fix in master yesterday, my bad

#

please let me know if you still run into the other issues after this fix and I will look more closely

olive rampart
#

Maybe one solution could be to check if input is empty but not nil. That could mean the previous one failed:

//go:build ignore

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "os"
    "time"

    "mvdan.cc/sh/v3/interp"
    "mvdan.cc/sh/v3/syntax"
)

func main() {
    log.SetFlags(0)

    file, err := syntax.NewParser().Parse(os.Stdin, "")
    if err != nil {
        panic(err)
    }
    runner, err := interp.New(
        interp.StdIO(nil, os.Stdout, os.Stdout),
        interp.ExecHandlers(execHandler),
    )
    if err != nil {
        panic(err)
    }
    if err := runner.Run(context.TODO(), file); err != nil {
        log.Default().Fatalf("Error: %s", err.Error())
    }
}

func execHandler(next interp.ExecHandlerFunc) interp.ExecHandlerFunc {
    return func(ctx context.Context, args []string) error {
        hc := interp.HandlerCtx(ctx)
        fmt.Fprintf(hc.Stderr, "Stdin=%v args=%q\n", hc.Stdin, args)

        var s string
        if hc.Stdin == nil {
            s = "(first)"
        } else {
            b, err := io.ReadAll(hc.Stdin)
            if err != nil {
                return fmt.Errorf("Stdin.Read %v: %w", args, err)
            }
            s = string(b)
            if s == "" {
                // bail if no chain from stdin (i.e., assuming previous command failed)
                return nil
            }
        }

        if args[0] == "bar" {
            return fmt.Errorf("some error while processing bar [stdin: %q]", s)
        }

        fmt.Fprintf(hc.Stdout, "%s - %s", s, args[0])
        fmt.Fprintf(hc.Stderr, "%v sleeping for 5s [stdin: %q]\n", args, s)
        time.Sleep(5 * time.Second)
        return nil
    }
}

But would feel more at ease with a proper check instead of assuming an empty stdin means failure.

glacial cedar
#

the previous command might have started and done the handshake OK, but then failed just after, no? or does writing to stdout to do the handshake mean that the command has succeeded?

olive rampart
#

Things to note from the code:

  • The first command just sets an initial state, but the rest of the handler should execute the same way whether it's the first or not.
  • I'm simulating an error after reading from input on purpose. That's because our errors occur during processing, after getting the chain state.
  • Writing state to stdout occurs at the end of the exec, but I could write something before all processing as a signal to next commands.
  • Without the s == "" check, baz continues to run and writes - baz to stdout even though bar returned an error (note empty stdin contents).
glacial cedar
#

either way you could indeed write one more line or message to stdout to signal "I succeeded"

#

(I think synchronizing with stdin/stdout for e.g. failures is better and more flexible than set -e -o pipefail)

olive rampart
glacial cedar
#

sounds good. that is indeed what I would do. you could also consider a JSON message describing a failure. does this model not work to stop when one part of the pipe fails?

#

given that each part waits for a message (handshake) from the previous to do any work

#

(I would also write JSON messages in case of failure, because empty stdin contents could be caused by other unintended scenarios or edge cases)

olive rampart
# glacial cedar (I would also write JSON messages in case of failure, because empty stdin conten...

Yeah, that's what I was wondering. Errors can occur in multiple places. I'll need to make sure (for now and future) that any errors get caught in the same place, or are wrapped in a function that writes the right thing to Stdout before returning the error. Isn't there anything that could be done in the library, so that the handler would have this information more naturally? Like something in HandlerContext maybe, for example?

glacial cedar
#

I'm not sure - because like I said, both ends of a pipe get started concurrently, so HandlerContext does not contain any information about what the "previous" command in the pipe did. Because it may not have finished yet. We could possibly add a synchronization API, but ultimately, stdin and stdout already exist for this purpose - synchronizing between the two ends of a pipe. They're just based on bytes, so you need to marshal/unmarshal, but they work perfectly fine for this use case :)

And yes I would wrap most of your code inside a func() error and then a little bit of code calling that which then writes the success or failure result as a JSON message.

#

ExecHandler and HandlerContext are also for any command being run, not just for pipelines. There's no strict notion of "the previous step in a pipeline", because you could be in nested pipelines. For example, foo | bar could be written like foo | { bar; } too, or foo | (bar)

#

really the only aspect that links foo and bar in those examples is stdin+stdout

olive rampart
glacial cedar
olive rampart
glacial cedar
#

CallHandler is indeed the only API you can use for this right now. I would suggest a slightly hacky solution but which would work - rewrite echo foo bar into .echo foo bar, and then in ExecHandler you can do whatever you want with .echo. Adding the dot (or any other prefix or rewrite) would let you do your own logic in ExecHandler as long as the resulting name does not match a shell builtin.

#

you could also, for example, rewrite echo foo bar into .shellbuiltin echo foo bar - whatever you find easiest.

olive rampart
#

Yeah, that's the route I was thinking on taking.

glacial cedar
#

it's a bit hacky but it will work today. I'm not sure what other kind of API would be nice in this space.

olive rampart
#

Is there an easy way to check if a command is a builtin or not?

glacial cedar
#

good question...

olive rampart
#

@glacial cedar btw, how does pipefail work, in mvdan/sh specifically?

glacial cedar
#

it is supported and it should work, but as with everything else, there might be a bug :)

olive rampart
#

The way I understand it, it ensures that a pipeline returns a failure if any command in the pipeline fails, not just the last one, but it does seem to work like that without it in mvdan/sh.

glacial cedar
#

if you can show me a difference between bash -c 'xxx' and gosh -c 'xxx' then I can most likely fix it pretty quickly

#

BTW given that your shell is interactive, I would set the new Interactive runner option. Right now it just enables the use of shell aliases. But bash has lots of little behaviors which are only turned on in "interactive mode".

olive rampart
glacial cedar
#

sort of; see what cmd/gosh does for example

#
    if e, ok := interp.IsExitStatus(err); ok {
        os.Exit(int(e))
    }
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
#

(any other error would usually be unexpected or coming from your handler)

olive rampart
#

So when you use ExitStatus error, you're usually sending the error message through Stderr so you don't need the err.Error() (i.e., usual error message) at that point?

glacial cedar
#

if you mean programs/handlers in general, yes.

#

you could return whatever Go error type you want here to be clear. set -e -o pipefail should still work, because those simply check whether err is non-nil

#

so e.g. if you wanted to use your own error type with the error message everywhere

#

I'd probably do that

olive rampart
glacial cedar
#

cmd/gosh uses the interactive mode everywhere. I honestly think that is what most users would expect anyway. non-interactive mode is weird, e.g. aliases not working at all.

#

An interactive shell is one started without non-option arguments (unless -s is specified) and without the -c option, whose standard input and error are both connected to terminals (as determined by isatty(3)), or one started with the -i option.
this is the bash logic, which IMO is bonkers

olive rampart
#

Ok, thanks a lot for your help! 🙏

olive rampart
#

@delicate tartan, what do you expect to happen when you use a subshell to expand a chain that does not end in an object? Pass it through to the API, or pass it to the flag? Example in #1296732593282224148 message is passing to flag, but does it make sense to bypass the flag only on objects (for its ID) or for every type?

delicate tartan
olive rampart
# delicate tartan This, right? 👇 ```shell ❯ dagger shell -m modules/wolfi -c 'container | with-d...

The alternative is to bypass the flag when a subshell is used and use the result directly in the API (like IDs). It could be a way to support values that aren't currently supported by the flags, but then the above example wouldn't work. IDs are a special case. If it's an object, I add an "id" to the query and bypass the flag. It's the other ones I'm wondering about, if there's a use case for it.

delicate tartan
#

But during the hackathon, it seemed like we had a way to pick and choose for each argument right? So for the example above:

  • with-directory, /mnt, sdk/go
  • The handler sees the first arg, /mnt, doesn't detect an object, processes it the usual way
  • The handler sees the second arg, sdk/go, doesn't detect an object, processes it the usual way
  • An API call is crafted with 2 strings

--> Oh now I realize, this should error...

#

I guess I don't understand your example

olive rampart
#

The handler processes the arguments via the same flags that dagger call uses. However, in order to support stitching IDs, I made it such that the ID doesn't go through the flag. Otherwise I'd have to prefix the value with id:xxx or something. Current implementation is that if not an object, the value is passed on to the underlying flag. So in the example above, it's like you typed --directory sdk/go in dagger call.

delicate tartan
#

But $(.core container | from alpine | with-exec echo,-n,sdk/go | stdout) returns the string sdk/go right?

#

So it's like dagger call container with-directory /mnt sdk/go?

olive rampart
#

Yep, that's why I also think that's less surprising and why it's built like that.

#

Was just wondering that maybe in the future we could support more advanced stuff through a .builtin command or something, that returns something not currently supported by a flag.

#

Let's just cross that bridge when we get there.

#

I do have a way for the .builtin to decide to bypass the flag, but have no use case for it yet.

delicate tartan
#

I was wondering about the builtins too. For example I would love a .doc that you can insert anywhere

#

I think I'm confused by what "bypass the flag" means in practice, it seems to work differently from the version we worked on together. I don't want to give you an uninformed opinion.

#

In the last version I saw, it seemed practical to only rely on passing objects around. But granted we had not yet gotten to implementing id:xxx

#

In any case to answer your initial question: assuming there is no error in the example, then I would expect that example above to fail with an error. Otherwise I would be very confused (and in fact I am still confused now)

#

(the confusion comes from the fact that my subshell prints a string, but that string is interpreted in a special magic way because it comes from a subshell)

#

or alternatively, the confusion could come from me just misunderstanding what you explained 😛

#

(Actually a | .doc would be absolutely amazing right now)

#

Since we don't have an equivalent of --help

#

That alone would be a killer feature of the shell

olive rampart
#

Ok, it's like this. When the handler processes arguments it can tell if the value comes from a command expansion or not (ie, you see the encoded chain state in that value). So if I see that a value is a call chain, I resolve that query and replace the value. In other words, the with-directory handler would see the following arguments:

args: ["/mnt" `{"calls": ["..."]}`]

After resolving the query for the second argument and replacing that value, the following arguments will be passed on to flags.Parse:

args: ["--path", "/mnt", "--directory", "sdk/go"]

Now these values go through the custom flags that dagger call uses.

#

Suppose that you return an object instead (e.g., $(.git URL)). The resolved ID won't be included in the flags to parse because we already know the final value:

args: ["--path", "/mnt"]

But both values will be added on the query builder just the same:

withDirectory(path="/mnt", directory="<id>")
delicate tartan
olive rampart
#

Like this:

❯ container | with-directory /mnt $(.git github.com/dagger/dagger/sdk/go) | directory /mnt

I can see that the second argument to the with-directory handler is an encoded chain, and see if it's an object that has an "id" field. I just turn that into a query, resolve it, and store that value. But I don't need to pass it on to a custom flag (in this case, it's the directoryValue flag, which can turn a filesystem path into a dagger.Directory object).

#

So this returns a scalar, thus I pass it on to the flag:

$(.core container | from alpine | with-exec echo,-n,sdk/go | stdout)

But this returns an object so I can get the ID and bypass the flag:

$(.git github.com/dagger/dagger | directory sdk/go)
olive rampart
delicate tartan
#

I see, thanks

#

If I understand correctly, there is a decision of how to decide whether an argument is to be pre-processed as "special", or passed-through as is.

  • Option 1: decide based on origin: is the argument "dynamic" (a lazy pipeline def to be resolved to get the value), or "static" (just a literal value)
  • Option 2: decide based on content: regardless of dynamic or not: is the argument a typed object, or a scalar?

The original example you shared with me implied option 1 (go/sdk may be a string, but it's dynamic so we're considering it special). And at the moment I prefer option 2 (go/sdk is a string, it doesn't matter whether it's dynamic or not - it should be processed the same)

--> Does that summary make sense to you?

olive rampart
#

Yes. But in the original example, go/sdk is option 2 (#1296732593282224148 message). Otherwise it would return an error in that case. It's dynamic but resolved as a string so processed the same way as if go/sdk was passed in directly.

delicate tartan
#

(catching up)

delicate tartan
#

So, yes option 2 (decide based on content) makes more sense to me, and it seems we already agree, and the original example followed option 2? So in other words, we're aligned? 🙂

#

IMO supporting special builtins could be handled by having the builtins send special objects, with special types that don't exist in the engine API, but would be intercepted by the shell handler, and processed accordingly. Then you just need the ability for the builtin to output a literal object

olive rampart
#

Hey @glacial cedar 👋, If a command substitution returns an error, what happens to its stdout?

.core container | from $(wat) | workdir

In this case I don't even see from being executed and workdir executes but sees an empty input (not nil), however I'm sure I write the "error state" on "wat" command's stdout before returning the error. I suppose that if from doesn't get executed it won't pipe the error condition to workdir.

Is there a way to chain this properly other than having wat not return an error (just writing that it did to stdout)? I wouldn't know that command is running from a subshell anyway.

#

In that command I'm actually seeing the error from workdir having an empty stdin rather than the error from wat not being something valid.

glacial cedar
#

@olive rampart can you clarify what you would like to happen, in terms of "chain this stuff properly"?

olive rampart
#

Ideally I want workdir to know that the previous command errored but I get that from is dependent on $(wat) succeeding in order to run.

glacial cedar
#

ack. let me dig a little.

olive rampart
#

Looking at Runner.expandErr and I see the error printed to stderr, I didn't know where that came from.

glacial cedar
#

from a quick look, bash and us agree on the subshell and pipe semantics relating to errors, so I think in principle this could work. I'll try doing a Go example.

$ bash -c 'set -e -o pipefail; echo foo | { cat; echo $(echo bar; false); } | { cat; echo baz; }'
foo
bar
baz
$ gosh -c 'set -e -o pipefail; echo foo | { cat; echo $(echo bar; false); } | { cat; echo baz; }'
foo
bar
baz
#

ok I'm pretty sure it's a bug with handler errors :)

olive rampart
#

Go doesn't give the same result?

glacial cedar
#
    err := r.execHandler(r.handlerCtx(ctx), args)
    if status, ok := IsExitStatus(err); ok {
        r.exit = int(status)
        return
    }
    if err != nil {
        // handler's custom fatal error
        r.setErr(err)
        return
    }
    r.exit = 0

the code currently treats custom handler errors, i.e. errors other than interp.NewExitStatus, as fatal - and it stops the entire command containing the command substitution.

you can for example replace your custom error with NewExitStatus, although then in your example from $(wat), from will not see that wat failed, because when using command substitutions, only stdout is passed along - the exit status is set in $?:

$ x=$(true); echo $?; x=$(false); echo $?
0
1

so in your case I think what makes the most sense is:

  • use NewExitStatus for your exec handler, to let command substitutions and pipelines behave like normal shell when any errors happen
  • when your exec handler fails, write a json payload to stdout to describe the error
  • when parsing the json payloads (either from stdin or from command substitutions), detect if they describe an error, and act accordingly
#

(the fact that we currently treat custom handler errors as fatal is likely something to fix regardless, but for now I think this will suffice, and I also don't think you want nor need custom errors anyway)

#

something I'll warn you about: command substitutions without quotes (which I assume most of your users will write) will word split the result of stdout, so that means that for e.g. JSON with spaces, it will be split into multiple arguments.

$ printf "%s\n" one two three
one
two
three
$ printf "%s\n" "one two three"
one two three
$ printf "%s\n" $(echo '{"foo": "bar"}')
{"foo":
"bar"}
$ printf "%s\n" "$(echo '{"foo": "bar"}')"
{"foo": "bar"}
olive rampart
#

Ok, I already do most of that. I write the error state to stdout before returning the err, so you're saying I should use NewExitStatus instead in that return. I'm going o try it 👍

glacial cedar
#

so if you are e.g. writing results to stdout as JSON, perhaps avoid any whitespace, or use base64 encoding, to avoid this issue.

#

I would probably do base64 because avoiding whitespace is hard, e.g. {"foo":"bar baz"} will still cause issues

#

alternatively, use the syntax tree walking code to ensure all command substitutions are within double quotes.

#

but modifying the syntax tree like this may break the user doing something that relies on it on purpose

olive rampart
#

Cool, now I see the json with the error as an argument of the outer command, even without surrounding quotes and base64 encoding.

#

Wait, I haven't properly confirmed it's not split into multiple args 😅

glacial cedar
#

if your JSON is what encoding/json produces by default, it probably is 😉

#

this will be very important for e.g. .foo $(one) $(two) three to work correctly

olive rampart
#

Yeah, instead of 2 args I have 12 😂

glacial cedar
#

base64 encoding makes sense in general anyway, for the sake of ensuring exactly the same string is passed along

olive rampart
#

Yep! Thanks for the tip!

glacial cedar
#

another FYI: bash does not support null characters, \0, in strings at all - not even when they come from stdout and they are interpreted as arguments with "$(write-null-to-stdout)". I don't suppose you need to write bytes, but FYI if you try to write binary or other non-plaintext to stdout, it often will not work.

#

we don't have that limitation, but it's best to avoid nulls anyway.

delicate tartan
#

In latest commit, sub-pipelines don't seem to work:

.core container | from alpine | with-directory /src $(.core git "https://github.com/shykes/daggerverse") | with-workdir /src | with-exec ls | stdout
no function ".core" in type "Container"
Error: no function "stdout" in type "DaggerDev"
olive rampart
#

Yeah, haven't pushed the changes this week yet, sorry. Got sidetracked fixing a few error conditions for a better user experience.

#

Fixed now.

olive rampart
delicate tartan
#

while I'm writing my list to santa: <dep> <func> to call a function of a dependency would be super useful. eg. go build to do the equivalent of dagger -m go call build

delicate tartan
#

@olive rampart do you think there's a realistic path to dynamically load modules from the same session? ie. once we have dagger shell working without loading a module. Would be cool to give the ability to .load MODULE from there. I'm thinking of version switching in particular...

#

oh wow imagine if .load didn't affect the global state of the shell, but you could just pipe the loaded module object into the pipeline...

#

.load github.com/dagger/dagger@v0.13.0 | .doc 🙂

#

Or as a shorthand: .load @v0.13.0 | .doc (would apply to the currently loaded module?)

olive rampart
#

Yes, that will be easy to do after that. Executing a function from a dependency too.

#

Just need to put introspection into a map, scoped by module name or something.

delicate tartan
#

Ha, what if "use this object as the top-level scope" were its own builtin? Since nothing has side-effects by default (nice immutable outputs by default)

#
# Load without a module
$ dagger shell

# Load a module dynamically, without setting the state
> .load github.com/dagger/dagger | .doc

# Set the state this time:
> .load github.com/dagger/dagger | .use
olive rampart
#

I was writing .use but you were faster 😂

#

Yeah, but what about defaulting to .use by default if you start the shell with a module (i.e., -m)

delicate tartan
#

Honestly this is such a fun foundation to build a shell experience on top of. Zero side effects by default.

delicate tartan
olive rampart
#

Yep, exactly.

delicate tartan
#

Powershell fans: "hey that's our thing!"

olive rampart
#

I wonder about core functions. If you're in a module and want to run a core function, it'll still load the module even if you don't need it.

#

I can make it lazy and only load if the first command isn't a built-in.

delicate tartan
#

Yeah it can be:

  • .load [MODULE] loads a module. Without argument, load the current module if any.
  • dagger shell --load to force loading the module. Or, if we decide we like auto-loading, we can reverse it to dagger shell --no-load
#

Personally I think auto-loading the current module is what I'd expect. Also it's nice that once I have a prompt, 100% of commands are super fast by default

#

But worth experimenting

#

@olive rampart would .use even work for objects that are not the top-level of a module??? Like .load github.com/dagger/dagger | sdk | python | .use ? Then test would run Python SDK tests; .functions would show available functions in the python sdk; etc

#

FYI @plucky moss @wintry slate @scenic flare @slender folio - watch this thread for explorations of how valuable we could make the interactive shell experience (as opposed to batch shell)

olive rampart
#

But that's presuming you start off from the same state. For example, if you run test it would append to sdk | python | test. That's only for stuff that requires a query, not for .functions which only depends on the introspection data.

delicate tartan
#

I see, so it's all possible, but we'd have to keep a "top-level type type + query" global, persisted across commands. Today that top-level type + query is implicitely the loaded module type

#

(I keep forgetting we have this dual pointer: type for some things, query for some others)

#

@olive rampart do you have any ideas on the "reload local files" problem? I hit that right away trying to use it for local dev

#

One cool idea was to combine with Andrea's "watch" prototype:

test | .watch 🙂

#

Actually that is a separate thing. I would still expect a way for local dirs to be reloaded between commands. Explicitly is fine

#

Maybe .load does that also?

#

FYI @low imp since I know you're thinking about improving filesync. We're discussing how to expose local dirs in the shell UX

#

(just for your awareness)

#

(also mentioned you in the github issue about this)

olive rampart
#

For context @low imp, in interactive mode you run commands against the API with the same withEngine() session.

#

But the CLI isn't caching any inputs or outputs. It makes the same query, and the same calls if you repeat them.

low imp
#

Part of what I'm doing involves taking control of the filesync code in the engine, including the lines that currently enforce "once-per-session". So we'll be able to add a flag to the dagger API that supports re-loading

#

Would argue we should keep the default behavior as is, but an optional flag that forces reloading makes sense to me

delicate tartan
#

OK cool. Note that the shell is embedded in the CLI. So we have full control of both CLI and API behavior.

#

ie. things that would be a hacky workaround with dagger call, could become elegant solutions in the shell, even without changes to the API.

#

In some cases, perhaps it can give us a facade to the API

#

(hopefully this makes sense)

low imp
#

Yeah it might make sense to just enable that flag that forces reloading by default in the shell

#

Even if not the default elsewhere

delicate tartan
#

For example, the weird case of export / -o might be elegantly solved with a builtin, eg. .directory | with-new-file 'hello there' hello.txt | .export ./hello.txt

#

How much does your ongoing improvement require client cooperation @low imp ?

#

I guess it doesn't matter since all SDKs embed the CLI anyway

#

It would only matter in the remote API scenario (which we need to address sooner or later)

low imp
delicate tartan
#

@low imp ok cool. So basically we should assume everything will work mostly the same from the CLI's point of view, only better, and with more granular control over what to reload when? Like granularity at the path level?

olive rampart
# delicate tartan

🙂 The Python REPL uses >>> for new prompts and ... for continuing on multiline. The gosh experiment from the sh library uses $ by default in interactive mode and > for continuing multiline (for comparison, and that we may also want to consider multiline support).

low imp
delicate tartan
#

@olive rampart maybe .freeze / .unfreeze to control the local filesystem snapshotting?

#

@low imp with your changes, do you think it would be reasonable to "unfreeze" local directories by default? Meaning, if I run test twice in a row, all local changes would be taken into account. Would that be prohibitively slow? Or might it preserve the "feels 10x faster" magic of the shell?

#

I think it would be way better to un-freeze by default, assuming the perf penalty isn't too bad. It's very surprising to run test twice locally, and have the local changes silently ignored after the first call

low imp
# delicate tartan <@949034677610643507> with your changes, do you think it would be reasonable to ...

We won't know for sure until we try, but I'd predict it'll be fast enough that you could unfreeze by default. The biggest overhead will be from the need to walk the entire dir that's being synced (minus excludes), which has overhead based on the number of files/dirs. Also the overhead of retransfering any files that changed, but that's expected.

I think you'd have to get to pretty extreme cases for that to be a noticeable burden though. Like node_modules type cases, but you usually exclude those types of dirs anyways

#

One thing to think about is if you do sync a local dir at a given state and store it in a variable (that's a thing with the shell, right?), then should that be frozen? Or should we actually retransfer every time the local dir reference appears anywhere in the DAG of a call

#

My instinct would be that you could unfreeze-by-default on the first reference to a local dir, but once it's in the DAG of a call or stored in a variable it's frozen in it's state at that time

#

I think that would be most intuitive, but not super obvious either way

delicate tartan
#

Mmm, we haven't wired in variables, so still TBD what the ideal experience would be there. But yeah if makes sense that if you can pipe something to another command, you can also set it to a variable. If for some reason we decide not to allow that, we could detect the type and return an error

#

tldr we have options

olive rampart
#

Variables don't work only when used in different prompts in interactive mode. That's because (currently) each prompt resets the runner, with a new script (the read line).

jaunty field
#

I think this is very cool, as another way of interacting with Dagger.
However, I am still looking for a good way to interop with the cli tools that are used in my actions. E.g. having a container with some tools like kubectl and helm to carry out commands on my source.

I have been thinking if it would make sense to go the other way: taking an existing bash script, analysing it to detect required tools and inputs/outputs so that it can be wrapped in one or more dagger functions. I don't mind defining my pipelines in bash, I often prefer this, but would love to utilize Dagger capabilities like caching, locking dependencies, monitoring.

delicate tartan
jaunty field
#

That is one way to look at it, basically superpowering an existing script by seamlessly adding caching, locking of dependencies, fancier logging - all the Dagger niceties.

I have not had a chance to look at it, but mvdan's 'sh' go library seems like a nice way of parsing an existing bash script to try and extract inputs, outputs and dependencies.

delicate tartan
glacial cedar
#

^ I think this is doable to some extent but it would have to be dynamic, i.e. as you interpret/run a script, rather than as a static analysis from just parsing the script. Shell as a language is just too hard to do static analysis on, because even telling what functions/programs get executed is basically impossible once the user does ${foo} bar baz.

jaunty field
#

I guess one could set strict rules, if one wants an automatic conversion - the golden path.

It might also be the wrong way to go. All in all, I would just love some good interop between Dagger and other CLIs I use.

olive rampart
#

@delicate tartan, I pushed https://github.com/dagger/dagger/pull/8860 for review but there's an issue when a function has required arguments, because they're validated in that function's handler, before .doc is executed. To fix it I need quite a bit of refactor to lazify these commands and only resolve in the end, but that may mean I don't have enough time to add other stuff. Would you rather avoid using .doc on functions with required arguments and get more in (like no-module mode), or make sure this is resolved?

GitHub

Closes #8831
Examples
Module constructor
NoteBikeshedding needed on the constructor's usage line.

> .doc
USAGE
.config [options]; <function>

RETURNS
DaggerDev

OP...

delicate tartan
#

maybe if you make a clear list of other improvements you're delaying, other can help? I would gladly contribute a S-size improvement

olive rampart
olive rampart
#

To be clear, .doc works now. You just can't use it on a function with required arguments.

delicate tartan
olive rampart
# delicate tartan any builtin that you can insert in the pipeline, right?

No, that's done, and it's not the refactor I'm talking about. I'm talking about delaying execution until the end of the pipeline. This means not validating arguments and not parsing their values. Basically just "record" the passed in args as is to the next handler. On the last step, instead of just building the query for the call chain, also validate and parse argument values.

delicate tartan
#

oh I see! mmm

#

ok maybe there's a simpler way then

#

it does feel a bit magic if one particular builtin "neutralizes" arg checking before its own position in the pipeline. doesn't feel very DAG-esque

#

what about .doc [FUNC] ?

olive rampart
#

Yeah, that was the other solution I thought of and was just about to suggest.

#

So:

- cli | binary | .doc
+ cli | .doc binary

Same for functions:

- cli | binary | .functions
+ cli | .functions binary

No arg implies module constructor:

.doc

But awkward for a core function:

- .core container | .doc
+ ??

If you can use it both ways it becomes two ways of doing the same thing, so I'm trying to think of only one way it works.
What should happen if you use:

cli | binary | .doc

Possibly an error, i.e., if .doc doesn't have an arg, it expects the command to be the first (i.e., stdin to be nil)?

delicate tartan
#

well .doc without an argument would document the whole object type

#

btw I strongly prefer .container over .core container. The . already means "core"

olive rampart
#

Yeah, but not all core functions have an alias, just generalizing.

delicate tartan
#

cli | binary | .doc -> print docs for entire type CliBinary (or File? can't remember the return type)

#

cli | .doc binary -> print a subset of the Cli type docs, specifically its binary function

#

So .doc always shows a type doc, possibly filtered by function

#

so actually it could be .doc [FUNCTION...] ie. multiple function names allowed

olive rampart
#

Questions:

  • What about .config and .container?
  • If the returned type is a string, what do you expect to see?
  • What's the purpose of the multiple functions? Can you please look at the current output in the PR ? That's more similar to --help, but you may be thinking more like daggerverse, right?
delicate tartan
#

What about .config and .container?

  • .container | .doc would show the docs of the Container type
  • .config | .doc I'm not sure... I think .load might deprecate .config altogether?

Is that what you were asking?

#

If the returned type is a string, what do you expect to see?

The doc for the String type. I guess a minimal version would just be the word String with nothing else?

#

What's the purpose of the multiple functions?

No particular purpose, but it helped clarify the underlying model, and the fact that it always works the same. .doc always prints docs for a type, optionally filtered by function name.

Can you please look at the current output in the PR ? That's more similar to --help, but you may be thinking more like daggerverse, right?

I will check it out. Yes I was thinking more like daggerverse or go doc, ie. more like API docs. But I don't have an exact idea of the format, and --help is already not too far. I think the main difference would be showing types more prominently (something that is lacking in --help)

olive rampart
#

.container | .doc would show the docs of the Container type

But Query.container has an optional argument. How do you document the function itself?

delicate tartan
#

.doc .container ?

#

or in the case of builtins: .help .container perhaps?

#

Strictly speaking, .doc .container should not work, because it's a filter on the docs for the current module type. We could make an exception to always allow .container but it's weird. .help .container works better I think

#

Also .help without arguments can probably show the arguments syntax right there in the list, which will help

olive rampart
#

I had .help [cmd] in mind for builtins, but that's a different kind of output from .doc. For example .help .git or .help .doc.

delicate tartan
#

Ah I see you were setting me up for .core 😛

#

Wouldn't .help FOO show the same format as .doc FOO ? You need the function name and description; name and description of each argument; return type; probably could unify the formats no?

olive rampart
#

Well, you're describing documenting types instead of functions with .doc. It's functions that equate to a command, have a usage line, etc.

delicate tartan
#

Yes but a type doc is mostly made of a list of function docs. .doc FOO filters that list to show only the function doc for FOO. Which IMO would have the same format as a builtin doc

olive rampart
#

What about core functions that don't have an alias? .git is different from .core git.

delicate tartan
#

or to enshrine the shell's role in shaping future changes to the core API 😇

#

But then shell scripts aren't as easily transcribed to code

olive rampart
#

So you suggest removing the .git URL shorthand altogether? There's still core functions that don't have a built-in. Do you suggest generating a .<core func> for all of them?

delicate tartan
#

So you suggest removing the .git URL shorthand altogether?

I really don't want to, but there is a dilemma there.

There's still core functions that don't have a built-in. Do you suggest generating a .<core func> for all of them?

Yes for sure. I think we got most of them?

#

The dilemma is: do we prioritize convenience of the shell syntax, or 1-1 equivalence with code?

#

But I think the dilemma is only specific to .git, since it fills a specific weakness of the core API (can't use those very convenient git refs)

olive rampart
delicate tartan
#

.git https://github.com/dagger/dagger#main:docs vs git https://github.com/dagger/dagger | branch main | tree | directory docs

#

well the load-xxx aren't needed I think

#

since the shell handles that, right?

#

.default-platform we can skip IMO

#

I guess there is a de-facto separation of "core functions we want to show" and "core functions we want to hide"

olive rampart
#

well the load-xxx aren't needed I think
In normal cases yes, but there may be a use case for it. Would it be better to disable entirely or just hide under .core (no longer shown in .help)?

delicate tartan
#

Yeah, .core as an escape hatch, seems prudent. We don't even need to hide it. Just make it clear it's raw API access, and not usually needed for most users

olive rampart
#

I'm having an itch between .doc .git and .help .git 😅

delicate tartan
#

yeah

#

I think allowing .doc .git is reasonable

olive rampart
#

Yeah, I think so too. Just allow for core functions but not other builtins.

delicate tartan
#

Actually it does make sense for .doc to show which builtins are available in a given context (assuming some of them are only available at top-level? or are they all available anywhere?)

olive rampart
#

No, only a few can be anywhere.

#

The rest can only be the first command.

#

But .doc .git would be a special case, specifically when .doc is the first command in the pipeline.

delicate tartan
#

what are other builtins that work anywhere, like .doc?

#

I feel like we're going to discover more as we go 🙂

#

like .json?

olive rampart
#

Only .functions, but we need to rethink that. It's easy to configure, so trivial for future commands.

delicate tartan
#

I wonder if we'll eventually take this too far, like create a whole parallel engine API in the CLI 😛

#

Who knows this might be a useful stepping stone to removing the engine's dependency on buildkit in the long run.

#

I think other candidates are the "weird calls" in the core API, like .export and .up

#

("weird" because they have side effects on the client context)

olive rampart
#

When doing .container | .doc, meaning we document the Container type, I imagine listing its functions and descriptions like .functions, only showing arguments for a function if you do .container | .doc with-exec. Do you imagine listing all the arguments for every function like in Daggerverse, rather than just the list though?

delicate tartan
#

I think it will require some experimentation, but in doubt I would look at go doc, it's pretty awesome

#

Probably won't translate directly, but still a good reference point

olive rampart
#

I just think it's too much for the CLI.

#

On a web page it's different.

delicate tartan
#

go doc is a CLI though

#
$ go doc encoding/json
package json // import "encoding/json"

Package json implements encoding and decoding of JSON as defined in RFC 7159.
The mapping between JSON and Go values is described in the documentation for the
Marshal and Unmarshal functions.

See "JSON and Go" for an introduction to this package:
https://golang.org/doc/articles/json_and_go.html

func Compact(dst *bytes.Buffer, src []byte) error
func HTMLEscape(dst *bytes.Buffer, src []byte)
func Indent(dst *bytes.Buffer, src []byte, prefix, indent string) error
func Marshal(v any) ([]byte, error)
func MarshalIndent(v any, prefix, indent string) ([]byte, error)
func Unmarshal(data []byte, v any) error
func Valid(data []byte) bool
type Decoder struct{ ... }
    func NewDecoder(r io.Reader) *Decoder
type Delim rune
type Encoder struct{ ... }
    func NewEncoder(w io.Writer) *Encoder
type InvalidUTF8Error struct{ ... }
type InvalidUnmarshalError struct{ ... }
type Marshaler interface{ ... }
type MarshalerError struct{ ... }
type Number string
type RawMessage []byte
type SyntaxError struct{ ... }
type Token any
type UnmarshalFieldError struct{ ... }
type UnmarshalTypeError struct{ ... }
type Unmarshaler interface{ ... }
type UnsupportedTypeError struct{ ... }
type UnsupportedValueError struct{ ... }
#
$ go doc encoding/json Encoder
package json // import "encoding/json"

type Encoder struct {
        // Has unexported fields.
}
    An Encoder writes JSON values to an output stream.

func NewEncoder(w io.Writer) *Encoder
func (enc *Encoder) Encode(v any) error
func (enc *Encoder) SetEscapeHTML(on bool)
func (enc *Encoder) SetIndent(prefix, indent string)
#
$ go doc encoding/json Encoder.Encode
package json // import "encoding/json"

func (enc *Encoder) Encode(v any) error
    Encode writes the JSON encoding of v to the stream, with insignificant space
    characters elided, followed by a newline character.

    See the documentation for Marshal for details about the conversion of Go
    values to JSON.
#

Seems like a good compromise. They manage to fit each function on a single line, until you ask for that function's doc.

#

They do it by hiding the description and showing the args (you suggested the opposite), I think that's the part that needs experimentation. Maybe we can get both? 🙂 But "one line per function unless you asked specifically about that function" seems like a good rule to follow

#

Also there's the problem of multi-language... what syntax do we use to print types and arguments? GraphQL, CLI, language-specific? go doc doesn't have that problem.

olive rampart
#

cli syntax (kebab)

#

That means function signature is a CLI usage line

delicate tartan
#
  • Makes sense for function and arg names, at least by default
  • You could imagine a flag to .doc to make that configurable in the future. .doc --sdk=python or .doc --graphql ?
  • In any case: what about types? Those don't exist in CLI usage messages
olive rampart
#

Usage lines are currently like this:

func1 <req1> <req2> [options]
func2 [options]
func3 <req>
delicate tartan
#

yeah makes sense. Then more details only if I call .doc func1?

#

But when I do call .doc func1, we'll need to talk about types right? Return type at least. Possibly argument types? Would be very useful.

olive rampart
#

what about types? Those don't exist in CLI usage messages
For a function, its usage is just some text under a "Usage" heading. For example:

> cli | binary | .doc
Build the CLI binary

USAGE
  ... | binary [options]

RETURNS
  File

OPTIONAL ARGUMENTS
  --platform Platform

Use "... | binary | .functions" for available functions.

So for a type, it just has other headings. Notice the "RETURNS" heading. Haven't been sure to keep it or not, but I like to know when navigating functions.

#

If you simply do .doc, then you're basically documenting the main object, so you can have a section there for the constructor usage.

delicate tartan
olive rampart
# delicate tartan Just checking, but with the last discussion above, that would become `cli | .doc...

Yes, that's right. As for required args, they're currently rendered like this:

> .container | .doc with-directory
Retrieves this container plus a directory written at the given path.

USAGE
  ... | with-directory <path> <directory> [options]

RETURNS
  Container

REQUIRED ARGUMENTS
  path string           Location of the written directory (e.g., "/tmp/directory").
  directory Directory   Identifier of the directory to write

OPTIONAL ARGUMENTS
  --exclude []string   Patterns to exclude in the written directory (e.g. ["node_modules/**", ".gitignore",
                       ".git/"]). (default: [])
  --include []string   Patterns to include in the written directory (e.g. ["*.go", "go.mod", "go.sum"]).
                       (default: [])
  --owner string       A user:group to set for the directory and its contents.

                       The user and group can either be an ID (1000:1000) or a name (foo:bar).

                       If the group is omitted, it defaults to the same as the user.

                       (default: "")
  --expand bool        Replace "${VAR}" or "$VAR" in the value of path according to the current environment
                       variables defined in the container (e.g. "/$VAR/foo"). (default: false)

Use "... | with-directory | .functions" for available functions.
#

Is there still a need for .functions? If you're in .container | .doc with-directory you could see the list of functions with .container | with-directory | .doc. Basically via the documentation of the returned type.

delicate tartan
#

I think .doc deprecates .functions

olive rampart
#

If we do add .help [builtin] at some point (which I want to), then .help .container will be equivalent to .doc .container. Unless we stick to using .help .container to document the Query.container function and .container | .doc to document the Container type.

#

We can't add a "constructor" section in Container doc like with the main object (.doc) because multiple core functions can return the same core type.

delicate tartan
#

Yeah constructors are a special case across all of this, because we don't yet have a cohesive design for dealing with them

olive rampart
#

With dependencies we could have wolfi | .doc document the Wolfi type (main object from the wolfi dependency), including its constructor arguments (same as just .doc for current module), however, that's not possible if it has required arguments. 🤔

#

We'd have to .doc wolfi, but that conflicts with a wolfi function in the current module.

delicate tartan
#

Mmm, what if .doc also listed available types? Then you could do .doc FUNCTION or .doc TYPE. That would allow .doc Wolfi

olive rampart
#

@delicate tartan, do you want to try it? I've pushed the updates to the .doc PR, but haven't updated the description yet.

#

I've added all core functions to .<name>, just some may be hidden from .help, but all accessible. This way you can use .help .load-directory-from-id but that function doesn't show on .help. It does show on .core (shows everything).

delicate tartan
#

@olive rampart yes! thanks for the ping

#

@olive rampart you should definitely make sure it gets merged in time for 0.14

#

no downside to silently getting the last shell updates in each release while the command is hidden (as long as they don't break the feature of course)

olive rampart
#

Yeah, I'm updating the PR's description before dropping for today.

#

Done

olive rampart
# delicate tartan Yeah it can be: - `.load [MODULE]` loads a module. Without argument, load the c...

@delicate tartan, I can spin up a quick version of this (as in 15 mins) if dagger shell doesn't load the module by default , so you'd have to add --load to do that. You can also .load when in interactive. Doing it twice would reload the module. That's the quick version, which defaults to allow core functions from anywhere.

To default to loading the module and require --no-load will require a bit more customization, that I probably won't have time today.

As for .load [MODULE] and .load github.com/dagger/dagger | sdk | python | test (nad ... | .use), I have an idea on how to do that without much effort (but not in time for v0.14).

delicate tartan
#

Oh I see, it would be parsed so that .load without argument loads that module

delicate tartan
#

once in a lifetime bikeshedding opportunity 🙂

glacial cedar
olive rampart
#

I started on .load but there's a bit more changes needed to make that work. In particular, re-creating builtin commands built from module introspection.

delicate tartan
#

i think this works better

jaunty field
delicate tartan
#

@olive rampart FYI .cache-volume doesn't work

olive rampart
delicate tartan
#

yeah we tried that also, weird

olive rampart
#

Ah, it's getting filtered I just didn't realize the introspection function I'm using was the one doing that. Specifically:

            // for SDKs only
            "builtinContainer",
            "generatedCode",
            "currentFunctionCall",
            "currentModule",
            "typeDef",
            // not useful until the CLI accepts ID inputs
            "cacheVolume",
            "setSecret",
            // for tests only
            "secret",
            // deprecated
            "pipeline",

Gonna lift that up.

olive rampart
#

@delicate tartan, you can play with dynamic module loading:

❯ dagger call -m github.com/dagger/dagger@pull/8938/head dev terminal
olive rampart
#

@delicate tartan, since every core function is now runnable with .<func>, what do you think about removing the fallback on normal functions? For example if you run container with the wolfi module, it'll run that module's Container function, but if you do the same in a module that doesn't have it it'll execute core's container. I think it's short enough to avoid the ambiguity by making you run .container if core is what you need, and error on container if the module doesn't have that function. Got slightly confused one time because I didn't put . in the beginning (typo) and got a different result than I expected.

delicate tartan
jaunty field
#

What are the possibilities regarding linting or even an LSP?
Would be pretty neat with some diagnostics and completion when working with a shebang Dagger shell script.

delicate tartan
delicate tartan
#

Also, here's a conversation starter on the important but difficult topic of script portability. ie, how much do we want dagger shell scripts to be safely copy-pasted between users and environments - and how does that affect convenience?

https://github.com/dagger/dagger/issues/8988

GitHub

Problem One benefit of dagger shell is that it could potentially match the convenience of bash scripts. 👍 One drawback is that it could potentially match the portability of bash scripts 👎 Since the...

delicate tartan
delicate tartan
#

@olive rampart do you want to talk live?

olive rampart
delicate tartan
#

yes

olive rampart
#

@delicate tartan, for calling dependencies like wolfi container, we'll need to load it before running. What I can do is if your first command isn't a function but matches the name of a dependency, then I can automatically load it (potentially erroring after that's done if the function doesn't exist). So it'll feel as if you're doing .load wolfi | container.

delicate tartan
#

@olive rampart makes sense. It seems that there is a combined design problem: 1) calling deps 2) "entering" a module 3) passing constructor arguments 4) reloading local modules 5) "entering" any object as if they were a module

delicate tartan
olive rampart
# delicate tartan <@768585883120173076> makes sense. It seems that there is a combined design prob...

I can do 5) easily, along with 2) by having .use set a starting "state" instead of just a ref to a module. The state already includes a ref to a module, that's why you can .load module | func, so it works out.

For 4) I’ve thought about .load --force my-module to force a reload, but if you want to switch back to a module that’s already loaded, .load does seem like it’ll load it again. So what if we rename that to .mod, so it’s a “get or load” kind of thing, then either one of:

  • still keep .load to force reload
  • add a flag like .mod --refresh [module]
  • maybe do it automatically with a .mod [module] | .watch or .watch [module].

As for 1) and 3), since .load, .config can be used when chained, so it returns a state in that case, and only if it’s the first argument, it sets those args globally. If omitted it’s implied that the constructor is used without any arguments. We can make it always return state, so if you want to save it globally, just .use up to that point.

#

There’s a sticky point with <dep> <func> (e.g., go build) because of the constructor, so I’ll just punt that for now and use the more explicit version. Forcing a reload is also not possible right now though, so I’ll keep this particular aspect out of the examples.

Then, putting it all together:

  • Use current module (assume --no-load):
-⋈ .load | .use
+⋈ .mod | .use
dagger-dev ⋈ 
  • Call function in local module, without saving it (assume --no-load):
-⋈ .load | version
+⋈ .mod | version
  • Run a function from a dependency:
-dagger-dev ⋈ .load go | build
+dagger-dev ⋈ .mod go | build
  • Call current module function with constructor setting:
-dagger-dev ⋈ .config --source=$SRC; build
+dagger-dev ⋈ .config --source=$SRC | build
dagger-dev ⋈ 
  • Save constructor config for next calls:
-dagger-dev ⋈ .config --source=$SRC
+dagger-dev ⋈ .config --source=$SRC | .use
dagger-dev ⋈ 
  • Run a function from a dependency, with constructor setting:
+dagger-dev ⋈ .mod go | .config $SRC | build
  • “Root” at another object:
+dagger-dev ⋈ .config --source=$SRC | sdk | python | .use
dagger-dev ▶ DaggerDevPythonSdk ⋈ .doc
OBJECT
  DaggerDevPythonSdk
  ...
  • Use two “states” interchangeably (syntax from var to pipe still to be worked out):
dagger-dev ⋈ PY=.config --source=$(.host | directory . | --include sdk/python)$ | sdk | python
dagger-dev ⋈ TS=.config --source=$(.host | directory . | --include sdk/typescript)$ | sdk | typescript
dagger-dev ⋈ $PY | test
dagger-dev ⋈ $TS | test
#

Maybe allow putting a name with .use [name] so it can be referenced easily afterwards? So using it as the last command saves it, but as the first command “loads” it. Without a name, it becomes the “default” when saving, and as the first one, lists what’s been “saved”:

dagger-dev ⋈ .config --source=$(.host | directory . | --include sdk/python)$ | sdk | python | .use py
dagger-dev ⋈ .config --source=$(.host | directory . | --include sdk/typescript)$ | sdk | typescript | .use ts
dagger-dev ⋈ .config --source=$(.host | directory . | --include sdk/go)$ | sdk | go | .use
dagger-dev ▶ GoSdk ⋈ .use py | test
dagger-dev ▶ GoSdk ⋈ .use ts | test
dagger-dev ▶ GoSdk ⋈ test
dagger-dev ▶ GoSdk ⋈ .use
SAVED ???
  (default)   dagger-dev ▶ GoSdk
  py          dagger-dev ▶ PythonSdk
  ts          dagger-dev ▶ TypescriptSdk
dagger-dev ▶ GoSdk ⋈ .mod | .use
dagger-dev ⋈ .use
SAVED ???
  (default)   dagger-dev
  py          dagger-dev ▶ PythonSdk
  ts          dagger-dev ▶ TypescriptSdk
delicate tartan
#

What I mean is that those 5 things might not be doable independently

#

(let me catch up)

delicate tartan
#

I think the solution might be related to self-calls...

#

what if we removed the distinction between "installed modules" (dependencies) and "main module" (the one you're .useing) and instead, everything had to be installed

delicate tartan
#

Consequently:

  • Every function call from a module would require 2 parts: <MOD> <FUNC> [ARGS...].
olive rampart
delicate tartan
#
  • core API could takeover top-level namespace (container instead of .container), if we find a way to handle possible conflicts between deps and core functions
delicate tartan
# olive rampart Not sure what you mean. Self-calls mean something within a module's code, but do...

I'm not 100% sure, but maybe something like this:

Before:

$ dagger shell -m github.com/dagger/dagger

# Call function from the "main" module:
⋈ cli | binary

# Call function from a dependency:
⋈ go build

After:

$ dagger shell -m github.com/dagger/dagger

# The "main" is just another dependency
⋈ dagger-dev cli | binary

# Call function from a dependency:
⋈ go build

Not sure this makes sense... But feels like there's something there. is that what you had in mind @plucky moss ?

#

Or even more radical: you can only access modules as dependencies:

#!/usr/bin/env dagger shell

.install github.com/dagger/dagger
.install github.com/shykes/daggerverse/wolfi

CLI=$(dagger cli | binary)
CTR=$(wolfi container | with-file /bin/dagger $CLI)

(not sure about a global .install, just throwing stuff at the wall)

#

One thing to keep in mind, is that the current dagger/dagger module is not representative of the best practice for monorepos, because it is not yet fully split up into sub-modules. As we split it up, more useful stuff will be available as dependencies rather than top-level functions

olive rampart
# delicate tartan I'm not 100% sure, but maybe something like this: ## Before: ```console $ dagg...

That should be:

⋈ dagger-dev | cli | binary
⋈ go . | build

Because then constructor arguments become the arguments to the module (the go module has a required source arg). But I think it's cumbersome to repeat the module you started the shell with so I'm keen to have an alias for "self". I had a discussion with Justin about this in https://github.com/dagger/dagger/pull/8611#issuecomment-2435191302, where he suggested ".", but that has a meaning in bash (alias for source).

delicate tartan
#

eg. instead of this:

.load github.com/dagger/dagger | cli | binary

Now that cli is a sub-module, it would actually be:

.load github.com/dagger/dagger | cli binary

(ie. the top-level cli function is a deprecated shim already)

delicate tartan
#

It might all be dynamically specified in the script

olive rampart
#

Right, but they're not mutually exclusive.

delicate tartan
#

go . | build

mm good point about the required constructor arguments. How were you planning on handling them? I thought you were going to handle the go build format?

olive rampart
#

It's just a shortcut for a common case, but the long form should still be available.

olive rampart
#

That works with .load go | .config /src | build, but not with the shorter go build format. See when I said:

There’s a sticky point with <dep> <func> (e.g., go build) because of the constructor, so I’ll just punt that for now and use the more explicit version.

delicate tartan
olive rampart
#

With that discussion with Justin, his suggestion:

That said, my best alternative suggestion is to require the module name as a prefix for all module commands (with a magic . builtin or something that gets the current module). Not a huge fan of that either though, so 🤷

My reply:

Me neither, let’s discuss. The single . is an interpreter builtin by the way (used to source a script). Would rather avoid that one if possible.

Thought of .mod as well if not a replacement for .load. But anyway, that could be just a shortcut if we find one later. In the meantime I can make sure dagger-dev | ... works consistently with loaded/installed modules.

#

What about .use for any part of the chain, not just the root of a module?

delicate tartan
#

The nice thing if every module is exposed as a function, whether it's a dependency or not (eg. go | ..., dagger | ... or even .self | ...) is that there's always an obvious place to attach constructor arguments

olive rampart
#

Yes, that's exactly it.

delicate tartan
#

But one possibility is that we just remove the concept of .self and simply say "you don't need it - there's no such thing as entering a module - just installing it"

#

(but maybe that's not practical - I don't know)

olive rampart
#

When you say "just installing it", you mean .use?

#

Or add it as a dependency? Like saving it in dagger.json?

delicate tartan
#

I meant adding as a dependency. Either by saving it as dagger.json, or in memory / temp if there's no current dagger.json

#

(but I'm not sure of the answer, just playing with 10 different dimensions of the problem...)

olive rampart
#

I kind of like being able to run a function from any module wiht .load <module> | <function>.

delicate tartan
#

yeah that definitely feels right

#

but it's not that different from .install <module> | <foo> <function>

olive rampart
#

And I think saving any state with .use saves you typing if you want to. So you can for example | .use a constructor config, then next functions will apply that.

olive rampart
delicate tartan
#

Yeah maybe .install is not the right keyword for what I meant. alias?

#

Sorry I'm kind of thrashing

olive rampart
#

I see, so it's like my idea of .load <module> | .use <alias>, except that could be an alias to any state, not just a module.

delicate tartan
#

sorry I missed .use <alias>. Let me re-read your examples

olive rampart
delicate tartan
#

No matter what angle I take, I end up stuck on the same problem: a module has 2 namespaces - top-level functions, and dependencies - and we only know how to present one to the user. Therefore we have a conflict.

olive rampart
#

I don't see a conflict. You can think of a dependency as a named module ref. For example, if you're outside a module:

⋈ .load github.com/sagikazarmark/daggerverse/gh@main | .doc

But if you're in our root module, that's saved as a dependency, so you can access the pinned version with:

dagger-dev ⋈ .load gh | .doc

And if you have a local module in a gh subdir, you can disambiguate with:

dagger-dev ⋈ .load ./gh | .doc
#

That's the long form, the rest is sugar.

delicate tartan
#

Well the conflict is that there could be a top-level function called go, and a dependency called go (in fact that conflict does exist in dagger/dagger)

olive rampart
#

For a currently loaded module you can omit the .load.

delicate tartan
#

so go | build is ambiguous

olive rampart
#

I'm also keen on reserving the go | build shortcut for current module, and require a .load go or .mod go for the dependency.

delicate tartan
#

yes that might be the answer

#

I've been assuming that installing a dependency should feel like installing a CLI tool: "install go; call go; boom!" but maybe that's not the killer feature I imagine

olive rampart
#

Yeah, too many chances for conflict with that.

delicate tartan
#

small detail but I worry about introducing builtins that are not strongly typed

#

but for .load there may be no choice

#

I just don't know how that would ever be moved into the core api...

#

(what would be its return type, since it can return literally any type_

olive rampart
#

Well, you can think abstractly that .load returns a dagger.Module. I've resisted the urge to equate it to a constructor because it limits doing .doc after it if there's required arguments. Thus the need for .load <module> | .config ....

#

But I don't think any of these builtins we've been talking about are a good fit for the API.

#

Only .login, .logout, .install, .uninstall... which are already cobra commands.

delicate tartan
#

I think that .load <module> isn't ideal because it can't take full advantage of our type system, and it would be the only command in that situation:

  • Can't have module-specific constructor arguments
  • Can't have module-specific return value

I think it's possible to tweak it to fit better in our type system. Here's a strawman:

Installing a module

⋈ .install github.com/dagger/dagger
dagger-dev
⋈ .modules | .doc
dagger-dev       Dev environment for the Dagger Engine
⋈ .modules | .doc dagger-dev
dagger-dev [options]

OPTIONAL ARGUMENTS
  --source Directory    (default: "/")
  --docker-cfg Secret

Calling functions from a module

⋈ .modules | dagger-dev --source=https://github.com/dagger/dagger@v0.13.0 | dev | terminal

Entering a module

⋈ .modules | dagger-dev | .shell
olive rampart
delicate tartan
#

Right. In this model everything would have to be installed first

#

That way every module always has a clean, properly typed constructor

#

And feature parity between script and code

olive rampart
#

Does it also mean that if you .install a module, you also get its dependencies in .modules, or you'd .install go, for example?

I'm still confused by .install because that currently persists on dagger.json. Assuming that:

  • we want to keep that feature in the shell
  • we want an in-memory kind of thing, for the current session only
  • we move the current .use to the stacked behavior of .shell

Then, how about .use for the in-memory thing? So .use <module> means load it in current session and make it available to .modules, while .install persists as a dependency of the module in current workdir, maybe also followed by an implicit .use?

Also, how about shortening .modules to .mod or .mods?

south idol
#

hey 👋 is it cool if I report some bugs I'm running into with dagger shell (latest from main) here?

#

everything goes smoothly until it is about to start the terminal, then the shell goes unresponsive. I have to wait or kill it manually. (if I kill it, I think the previous containers are not getting cleaned up)

#

some other unrelated thoughts/feedback:

  • Is it possible to refer to the object created by New that all my dagger functions are methods of? it's not really obvious how that interacts with the shell.
  • One of the first things I tried to do was write a function like
func WithSources(ctr *dagger.Container) *dagger.Container {
  return ctr.WithMountedDirectory(<my sources>)
}

and then call it like

$ .container | from alpine | with-sources | ...

but unfortunately this didn't work. had to do with-sources $(.container | from alpine) ... instead which feels less "ergonomic" to me.

  • (this may already be in the works or something you've thought about) If you create and bind services using the shell, it would be very helpful to have some kind of status for those background services displayed in the prompt line, and also if background services exit or fail.
delicate tartan
south idol
# delicate tartan Is `WithSources` part of your module?

yeah:


func New(
    ctx context.Context,
    // +defaultPath="/"
    // +ignore=[".git", "bin", "dev", "*.pem", ".config.yaml", "*.txt", ".vscode", ".idea", "**/*.test", "**/*.out", "coverage.txt", "pkg/envoy/files/envoy-*", "**/node_modules"]
    source *dagger.Directory,
) (*Pomerium, error) {
    return &Pomerium{
        Src: source,
    }, nil
}

func (m *Pomerium) WithSources(ctr *dagger.Container) *dagger.Container {
    return ctr.WithMountedDirectory("/src", m.Src)
}
delicate tartan
#

actually do you mind explaining what WithSources is supposed to do? What's the use case

south idol
#

it felt like the natural way to write

dag.Container().From("alpine").WithMountedDirectory("/src", m.Src)

in the shell

#

except that in the shell, i (ostensibly) can't refer to m.Src, so I wrote a wrapper around it

delicate tartan
#

But what would you want the module to actually do?

#

I'm trying to understand the lifecycle of source, ctr etc to figure out the nicest DX for it

south idol
#

I wanted that to be a general purpose function that could be used to mount the local sources into a container, like for setting up a build environment.

.container | from golang:alpine | with-sources | with-exec make,build

<oops, make isn't installed, edit and retry>

.container | from golang:alpine | with-exec apk,add,make | with-sources | with-exec make,build
delicate tartan
#

I see, yeah you can't extend/monkey-patch types, but you can wrap them. eg.

south idol
#

I think I expected that the first argument of a method could be piped in from the previous command. not sure why I expected it to do that, or if other people would expect it to do that 😄

delicate tartan
#
func New(
    ctx context.Context,
    // +defaultPath="/"
    // +ignore=[".git", "bin", "dev", "*.pem", ".config.yaml", "*.txt", ".vscode", ".idea", "**/*.test", "**/*.out", "coverage.txt", "pkg/envoy/files/envoy-*", "**/node_modules"]
    source *dagger.Directory,

    // +optional
    base *dagger.Container,
) (*Pomerium, error) {
    if base == nil {
      base = dag.Container().From("alpine")
    }
    return &Pomerium{
        Src: source,
        Base: base,
    }, nil
}

func (m *Pomerium) Container() *dagger.Container {
   return m.Base.WithMountedDirectory("/src", m.Src)
}

#

Then you just call container

#

It turns out that at the moment, the dagger shell doesn't have a great way to pass arguments to the module's constructor (that New function) but we're working on that 🙂 see the 500 lines of design discussion above

#

In dagger call it would be:

# Override arguments with random values
dagger call --source=https://github.com/dagger/dagger#v0.10.0:docs --base=index.docker.io/alpine container

In code it would be:

dag.Pomerium(dagger.PomeriumOpts{
  Source: dag.Directory().WithNewFile("hello there", "hello.txt"),
  Base: dag.Container().From("alpine"),
}).Container()
#

In the shell at the moment I think it's:

.config --source=$(.directory | with-new-file 'hello there' hello.txt) --base=$(.container | from alpine)
container

But we 100% are going to change that .config syntax to something else

#

We just have to choose with @olive rampart from one of the 500 possible alternatives 🙂

olive rampart
#

Yeah 🙂 right now that would be:

> .config --source=$(.directory | with-new-file hello.txt 'hello there' ) --base=$(.container | from alpine) 
> container

So when the first command it doesn’t return anything, so it needs its own prompt or a ;. Can be chained if after a .load, but need the pipe.

south idol
#

what does .config actually do right now? does it reload changes to the module/recompile the code?

olive rampart
#

No, it just saves the arguments for the constructor. After setting once it’s reused when you call module functions.

south idol
#

ohh, so when you call module functions it always calls the constructor first?

olive rampart
#

Yes.

#

Like in code or dagger call. It’s the same.

south idol
#

aha that makes sense. i assumed it was calling it once on startup

olive rampart
#

It may seem like that if you have arguments with subshells, but it’s the subshell that’s executing and the value saved for later, when the constructor needs to execute 🙂

delicate tartan
#

@olive rampart I think any way you slice it, the best UX for passing constructor arguments is as flags to a command which has the module's name in it, or is clearly associated with that module

#

eg. from @south idol 's example above:

  • pomerium --source=$FOO --base=$BAR | container
  • .new pomerium --source=$FOO --base=$BAR | container
  • .new --source=$FOO --base=$BAR | container
  • .self --source=$FOO --base=$BAR | container
  • . --source=$FOO --base=$BAR | container
olive rampart
delicate tartan
#

yeah I know, I'm just trying to think outloud because I'm overwhelmed by all the dimensions of the problem

delicate tartan
# olive rampart Does it also mean that if you `.install` a module, you also get its dependencies...

Does it also mean that if you .install a module, you also get its dependencies in .modules, or you'd .install go, for example?

In order to see a module in .modules, you would have to install it.

I'm still confused by .install because that currently persists on dagger.json.

Yeah me too. That's one of the variables I was trying to isolate. But I agree it complicates the rest.

Assuming that:
we want to keep that feature in the shell
we want an in-memory kind of thing, for the current session only
we move the current .use to the stacked behavior of .shell
Then, how about .use for the in-memory thing?

I guess it would also work, eg.

.container | from alpine | .shell
Container ⋈ .doc
<shows doc for Container>

So .use <module> means load it in current session and make it available to .modules, while .install persists as a dependency of the module in current workdir, maybe also followed by an implicit .use?

Well that's a different model than my example above, so I guess I'd have to compare the two options side by side.

Also, how about shortening .modules to .mod or .mods?

Sure, if .modules survives long enough we can shorten it 🙂

olive rampart
#

Well that's a different model than my example above, so I guess I'd have to compare the two options side by side.

Which one? The most recent one or the one I was commenting there? If so, not really, just meant renaming .install in your example to .use, so that the former persists on dagger.json (offload to cobra like today) while the latter doesn't (like .load). Since today's .use would be renamed to .shell:

⋈ .use github.com/dagger/dagger
dagger-dev
⋈ .mods | .doc
dagger-dev       Dev environment for the Dagger Engine
⋈ .mods | .doc dagger-dev
dagger-dev [options]

OPTIONAL ARGUMENTS
  --source Directory    (default: "/")
  --docker-cfg Secret
#

Similarly, for the go dependency, and if running the shell on dagger/dagger (with --no-load):

⋈ .use .
⋈ .use go
⋈ .mods | .doc
dagger-dev       Dev environment for the Dagger Engine
go               ...
#

Without --no-load, that first .use . would be implied.

delicate tartan
#

Ah you're thinking of like an "in-memory install"?

olive rampart
#

Yes

delicate tartan
#

For simplicity I was ignoring the problem of in-memory vs. on-file (dagger.json) because it's an extra dimension that melts my brain

#

so I just pictured .mods would show what's in the dependencies of the current dagger.json

#

if you call .install outside of a module, it would just install in ~/.dagger/dagger.json or something like that

#

But I agree it's a little weird that everything is immutable, no side effects etc. except module loading. Especially since you already implemented the ability to load them remotely without side effects

olive rampart
#

Yes 😄

#

In this scenario, multiple calls to .use would be as if you could do multiple -m in dagger call.

#

So you can start with one (the -m flag), and add more if you want. Similarly, you can do dagger call -m go --source=. build, thus the .use go. It's the same mechanism. That's what powers .load today.

delicate tartan
#

I was wondering what would happen if we just removed an indirection... And just tried to make as many of these things appear as regular commands in the shell

#

eg. what if you could just execute a remote module as if it were a command?

#
GOMOD=$(github.com/dagger/dagger/modules/go --source=https://github.com/dagger/dagger)
CLI=$( $GOMOD | binary ./cmd/dagger )
CTR=$( github.com/shykes/daggerverse/wolfi | with-file /bin/dagger $CLI )

$CTR | terminal
delicate tartan
#

If you throw in awareness of the local filesystem: just like dagger makes remote git repos runnable (shout out to @proud gorge for that choice of words), it makes sense that it makes your local directories runnable also. So you could ask the shell to run a local directory, and it would do so by loading it as a module

#

eg.

$ dagger shell -m github.com/dagger/dagger/modules/go
⋈ CLI=$(. --source=github.com/dagger/dagger | binary ./cmd/dagger)
⋈ CTR=$( github.com/shykes/daggerverse/wolfi | with-file /bin/dagger $CLI )
delicate tartan
#

I have a very good feeling about this one! 👆

delicate tartan
#

Done with final edits. Going to bed!

#

@olive rampart looking forward to discussing the above with you tomorrow

#

@azure prawn @halcyon laurel the masterpiece 🙂 👆 thank you for your help

halcyon laurel
#

Yay!!! ☺️

delicate tartan
#

@olive rampart if you're around today, I'm available to sync on all the above

delicate tartan
#

@olive rampart just getting a tea refill and then I can call you (if not too late?)

delicate tartan
#

I'm using the shell in main, and when with-exec fails, I don't get any useful outout. Example:

⋈ .container | from alpine | with-exec sh,-c,'echo >2 a useful error; exit 1' | stdout
Error: input: container.from.withExec.stdout process "sh -c echo >&2 error....; exit 1" did not complete successfully: exit code: 1
plucky moss
delicate tartan
#

A few other trace details pop up when using them for long-running shell sessions:

  • The relative gantt chart when entering a span (already mentioned elsewhere today)
  • Color code in the gantt chart for completed/failed/running would be nice. Right now I get red for failed, but then other colors are for returned type I think. IMO that's not as useful as always using the color code to communicate status (context: long shell session, tons of spans, looking for what's currently running)
  • When looking at logs for an error: either a "copy all" button, or a link to the raw contents (to make it easy to select all / copy)
unique quail
#

i can pick this up when i'm done with autocomplete? (which is going well)

#

one interesting little design thing came up when i was playing with a basic autocomplete - suppose i have this:
container | with-exec \t - it can show all the flags that with-exec can take.
but what should we show for the required arguments? we could show nothing, and trust the user, but it would be cool if we could somehow indicate to the user what should go there

plucky moss
plucky moss
#

answering: ls \t completes args, ls -\t completes flags

unique quail
#

but with-exec is a bit more unclear

plucky moss
#

yeah i don't think there's a sensible autocomplete for with-exec thinkspin

#

maybe you could look at the type of the arg

#

if it's File or Directory, do fs completion

unique quail
#

okay, yeah, we should maybe just not in that case

#

that's pretty easy

plucky moss
#

fwiw, here's fish flag completion

#

notably not conforming to my color scheme FrogeRage

unique quail
#

as a note, reading through the source code for zsh's completion is wild

#

it's genuinely incredibly clever

#

the fact it's all almost entirely written in shell scripts is sort of wonderful, and the way it handles parsing + completion + expansion + evaluation all at the same time is very very cool

unique quail
plucky moss
#

have you looked into go-prompt? it has a pretty good popup menu UI

unique quail
#

i'm really sad that no readline library for go has both: 1. a good ui, 2. any sort of vim emulation

#

including go-prompt 😛

#

in the back of my head i'm wondering if it's worth shipping the zsh/fish/readline lib as wasm in the cli 😛

#

^ problem to think about later

upbeat shale
unique quail
#

very initial wip, quite a lot of stuff missing

unique quail
#

^ above should be ready-for-review now 😄

#

kinda hitting a bit of an issue with trying to autocomplete things of the form container |, since it's not a valid parse tree (it needs something after the pipe to parse correctly)
@glacial cedar i threw together some quick hacks in https://github.com/mvdan/sh/compare/master...jedevc:mvdan-sh:allow-partial-parse-tree, with the idea that I could get Parse to return a best-effort partial parse tree, instead of only returning nil if we failed. is that something you're open to? or have some other idea of what might be a good way to resolve that?
my fallback option is to just pull in tree-sitter for the auto-completion, but it would feel weird to ship two entirely separate shell parsers in the cli 😢

glacial cedar
#

I started something in this space for a different project wanting auto completion of incomplete shell: https://github.com/mvdan/sh/commits/recover-errors

The idea is sound but not finished, so I'll need to find a few hours this week to finish it up with more tests.

GitHub

A shell parser, formatter, and interpreter with bash support; includes shfmt - Commits · mvdan/sh

unique quail
#

yes! error recovery was the other kind of approach i was thinking 🙂
nice to know that this is also something you've thought of before 😄

#

we're not at all blocked on shipping any completion on this fyi, i can come up with plentiful hacks to get that use case working

olive rampart
#

@delicate tartan, we already have a global flag --workdir. That's supposed to replace -m, --mod, right?

delicate tartan
olive rampart
delicate tartan
#

yeah I assumed, what other option is there?

olive rampart
#

The problem is if users want to use two different values, one for the global flag, the other for shell. But not sure what that use case is.

#

And it'll show as a global flag, not a flag of shell.

delicate tartan
#

🤷‍♂️

#

We should probably test the UX itself before going too far into polishing little details like this (unless it's super easy of course)

#

I think "direct module execution" (eg. ⋈ github.com/dagger/dagger | dev | terminal) is a no-brainer. But "module navigation as filesystem navigation" is more of a gamble, I'd give it 70% chance of success at the moment

olive rampart
delicate tartan
#

oh nice

olive rampart
#

There's an issue with executing off of .deps (e.g., .deps | wolfi | container), but will be fixed tomorrow.

#

There's changes to .doc. There's a differece between .doc . and . | .doc. The first documents the module, and is where you find the constructor arguments. The second documents the main object, which is what the module returns. Before, the entrypoint and module name were documented in the main object doc. I like the separation now.

delicate tartan
#

Do you still think merging .doc and .help is a good idea?

#

(I personally think so, to reduce cognitive load)

#

worth the extra effort of carefully designing the output, so that it's not confusing. Builtins, functions etc

olive rampart
olive rampart
delicate tartan
delicate tartan
#

@olive rampart I didn't have as much time as I wanted to review the PR, but will do today. Do you feel like some of these PRs are mergeable as-is, so we can more easily test in main?

olive rampart
# delicate tartan <@768585883120173076> I didn't have as much time as I wanted to review the PR, b...

There's only two PRs:

GitHub

This PR introduces support for shell autocompletion. We allow auto-completing module commands, chained commands, builtins, arg flags, etc.

There's still some important functionality missin...

GitHub

Implements first part of #9035
Notes

documentation for a module and its constructor/entrypoint:

-.load <module> | .doc
+.doc <module>

documentation for a module’s mai...

delicate tartan
#

Can we make sure 9097 gets into 0.15? Will make testing and demos easier

delicate tartan
#

I played with it yesterday and loved it

olive rampart
delicate tartan
olive rampart
glacial cedar
#

@unique quail I've rebased https://github.com/mvdan/sh/commits/recover-errors/ and updated it with what feels like pretty complete initial support for recovering from missing tokens, including your container | example.

I wrote a bunch of tests as well as an initial API to collect the missing tokens by node. I'm struggling with this side of the API a bit more; once the parser has given you a syntax tree with "recovered" positions for missing tokens, what do you do to realise what kind of tab-completion you should be doing? my thinking is that you need to know...

  • how many tokens are missing; there may be multiple, e.g. in foo=$(bar | we're missing at least one word/command, and then the closing parenthesis, in that order. hence the prototype API provides a []missingToken result, probably in left-to-right order.
  • the position of the missing tokens. it's not always going to be at the end of the input; for example, imagine foo > ; bar, where the first command is missing the word (file) to redirect to. Here, a boolean like "this missing token would need to go directly at the end of the input" might suffice for knowing what to complete.
  • the "kind" of missing token, again to aid the use case of tab completion. are we missing a specific shell token like )? or any word? or a file, e.g. due to a redirect? are we starting a new word, or are we in the middle of one?
  • the context leading up to the missing token, so that e.g. given foo | bar | you can know "ah, I'm piping from the commands foo and bar, so I know the following commands will be available here". my naive thinking is to just record the parent node where the token is missing, in this case a binary command (pipe), but a path from the root of the syntax tree via a []syntax.Node may be more useful to e.g. know if you're inside a subshell, or inside a function, or to navigate up the tree and look at sibling nodes.
GitHub

A shell parser, formatter, and interpreter with bash support; includes shfmt - Commits · mvdan/sh

#

the other quirk of a nice API for tab completion use cases is that, in many cases, tab completion works when there are no missing tokens, e.g. foo | bar | baz where baz could take specific forms of arguments or flags. So, if we design an API for tab completion solely around missing tokens, then it's not really useful in the general case.

#

I'm happy to do a quick call at some point to understand what API would work well for you, as at this point I'm trying to guess a bit. The parser support is basically done, and you could write all of this logic yourself; but its rather tricky to get right, and I think we should expose it in the syntax package itself in some way.

#

in any case, I'll prototype an API specifically for tab completion next, which would support missing tokens, but still work when there are zero missing tokens. this will be rather different than the prototype API I've shared above, and I'm not sure if both will be worth exposing.

unique quail
#

hey thanks @glacial cedar 🎉
so for doing the auto-completion right now, it's this logic: https://github.com/dagger/dagger/blob/f2bcf227a435fff401857eb505649e963f3d209e/cmd/dagger/shell_completion.go#L26-L77
essentially - i navigate the parse tree to try and find the "smallest statement" (which lets sub-shell completion work fine as well)
then i essentially try and build a "completion context" for where the cursor currently is - the way this works, is by walking the parse tree (up until we find a node that's past the cursor)

#

so actually, i think that needing information on what the next missing token is might not be necessary for how it's currently working? I think just having a best-effort parse tree, so if you have foo | bar | , then having the Y branch of the first pipe have its Y unset would probably be good enough. I know that it's a pipe, so I can work out the context, and then I'm pretty much past the cursor, so I don't really need to look at Y there

#

maybe i'm missing a case where this approach doesn't work very well - it's definitely useful to have the missing tokens in an api somewhere, but i think just having an incomplete tree might be enough for what i'm trying to do

glacial cedar
#

👍 the parser support is done, so I'll clean it up, add more docs, and merge it to master. I'll leave it to you to continue with your logic on top for now

unique quail
#

or is that already handled?

glacial cedar
#

I should clarify that the way the parser support works, no nodes are left as nil when they are incomplete - rather, positions are left as "incomplete". in the case of a missing command, for example, this means &Stmt{Position: incompletePos}

#

leaving entirely nil nodes is not a complete solution for error recovery as sometimes only part of a node is missing, e.g. "forgot to close this quote

unique quail
#

perfect! thanks for clarifying 😄

glacial cedar
#

@unique quail here you go - let me know how you get along with it. I wrote a bit of a godoc with an example, and there are tests as well for a variety of cases.

#

note that I didn't go out of my way to make sure every little edge case is covered; just the main ones I could think of. we can add more of those if p.recoverError() { branches and test cases as needed.

#

I also added a hidden flag to shfmt, so we can do quick tests with it:

$ shfmt --exp.recover=3 <<<'foo | '
foo |

also see the --to-json flag, which pretty prints the syntax tree in JSON format for debugging.

#

(always use the latest master when trying this out, as I just pushed a small follow-up fix)

unique quail
#

picking this up now, thanks so much @glacial cedar 🎉

#

from a quick play, this definitely seems to be what i want 🙂

unique quail
#

one thing i am hitting that i'm curious about - i can get a token <recovered>, and this pretty reasonable to handle for when there's one token to recover

but suppose in the case of foo $(bar | \t - the CmdSubst is a <recovered> position, and so is the BinaryCmd that is the pipe.
the problem is - given these, it's slightly tricky to work out that i should complete items for bar, not for anywhere else.

the piece of information that feels missing is some "guessed" position or similar, which is capped at the End of a parent (but not quite that? since for a Stmt, we derive the End from the children themselves so those also count as <recovered>)

#

i might be a bit distracted though - really the end goal is to find the leaf node below the cursor

#

gonna pick this up again tomorrow 🙂

glacial cedar
#

not sure I follow - CmdSubst will have a recovered end position, for its closing parenthesis, but otherwise it should have a start position. Similarly, BinaryCmd should have a missing "right side command" position, but its start and pipe operator positions should be there.

but yes, figuring out what exactly you need to complete is non-trivial. which is why I started prototyping an API for completions in #1296732593282224148 message. my intuition is still that we should provide something, because DIY in this case is far too error prone.

#

I can spend some time on it this weekend if you give your thoughts about what I wrote in the linked message above, as that's my latest thinking on it

glacial cedar
#

Or I might write a demo with a few test cases being completed, to force myself into figuring out the minimum useful api needed.

unique quail
#

mm, yup, it's very much non-trivial 😆

#

knowing the missing tokens would probably be useful, you're right - with the information there though is that enough to work out which node those would correspond to?

unique quail
glacial cedar
#

👍 I'll get back to you with a draft API to deal with the more complex cases

glacial cedar
#

@unique quail FYI you probably need RecoverErrors to be higher than 1, otherwise multiple missing tokens like foo | { bar; $(baz will not work. I think a reasonable setting is probably 5 or 10, most users are likely going to stay under 3 or so

unique quail
#

yeah, i know - the problem is i don't think i can handle those cases anyways?

#

this is the whole problem with not knowing what to complete on - i can't actually work out which node the cursor is on easily

#

so should i be trying to complete bar or baz entirely depends on where the cursor is

glacial cedar
#

well I always assume that you parse up to the cursor position when the user hits "tab" to autocomplete, so in the case above you would be completing what is at the end - baz

#

parsing anything after the cursor when tab is hit is I think unnecessary - and complicates things a bit in terms of figuring out what to complete

unique quail
#

ohh

#

sigh

#

of course that's what you do

#

😛

unique quail
#

@olive rampart ^ this is ready for review now

glacial cedar
#

@unique quail does that mean that you don't want/need me to prototype an API to help with more complex "tab-complete incomplete input" cases? or at least not yet?

#

totally fine with me either way, I just want to understand if you're happy to build directly on the syntax tree with recovered positions for now

unique quail
#

i think not yet, i think this works "good-enough" for now - at least for the current state of stability of the shell

#

at some point, i definitely think we'll need something a bit more fancy, but i think it's probably worth holding until we get more complex requirements instead of trying to design something now that then hits issues when we get more complex asks in the future 😄

glacial cedar
#

👍 I'm totally down with tackling issues as they become tangible, and agree with this prorotyping approach

olive rampart
#

Hey @glacial cedar 👋 Quick question. Is there a way to hook into the final output of a command, when at the end of a pipe line, other than through coordinating via the stdout buffer's Write (in interp.StdIO)?

For example, consider this:

foo | bar; wat

Let's say those three commands use the same handler code and write their result to the handler's stdout without knowing if it's going to be piped or not. So they pass around the lazy value, not resolving. Only at the end of the chain I want to resolve the lazy value and replace with the result before printing to the screen.

Currently my approach is a custom io.Writer wrapping buf.Bytes and processing the final output from that buffer after the runner finishes. However, now I want to process each command's output individually as they return instead of the whole output in the end.

I can hook into the custom io.Writer's Write method for that, just wondering if there's an actual hook from the library for this.

glacial cedar
#

there is no "out of the box" way to do this, writing your own io.Writer logic for this sounds fair enough to me

#

the only alternative that comes to mind would be api in the interpreter to report which "piece" of the original syntax it's currently on, but I'm not sure what that could look like

scenic flare
#

@olive rampart here's a very rough multi-line patch support I have locally I implemented as I needed it for the hackathon

diff --git a/cmd/dagger/shell.go b/cmd/dagger/shell.go
index 589b81c97..f8ff0f815 100644
--- a/cmd/dagger/shell.go
+++ b/cmd/dagger/shell.go
@@ -407,6 +407,7 @@ func (h *shellCallHandler) runInteractive(ctx context.Context) error {
         }
 
         var line string
+        var cmds []string
 
         err := h.withTerminal(func(stdin io.Reader, stdout, stderr io.Writer) error {
             var err error
@@ -431,8 +432,29 @@ func (h *shellCallHandler) runInteractive(ctx context.Context) error {
             } else {
                 rl.SetPrompt(prompt)
             }
-            line, err = rl.Readline()
-            return err
+
+            for count := 0; count < 2; {
+                line, err = rl.Readline()
+                if err != nil {
+                    return err
+                }
+
+                cmds = append(cmds, line)
+                if strings.Count(line, "'")%2 != 0 {
+                    cmds[len(cmds)-1] += "\n"
+                    count++
+                    rl.SetPrompt("> ")
+                    continue
+                } else if count > 0 {
+                    cmds[len(cmds)-1] += "\n"
+                    continue
+                }
+                break
+            }
+
+            line = strings.Join(cmds, "")
+            cmds = cmds[:0]
+            return nil
         })
         if err != nil {
             // EOF or Ctrl+D to exit

it only works with ' for now. Just sharing in case it might be worth to build from there

#

here's an example:

. ⋈ .core | container | from alpine | with-new-file /test 'hello
> there
> how
> are
> you
> today?' | terminal
dagger / $ ls
bin    dev    etc    home   lib    media  mnt    opt    proc   root   run    sbin   srv    sys    test   tmp    usr    var
dagger / $ cat test
hello
there
how
are
you
today?dagger / $

edit: note that the character counting is very brittle and we'd probably have to implement something more robust when it comes to line parsing surely

olive rampart
glacial cedar
#

yep indeed it does, look at how cmd/gosh shows continuation prompts

#
$ gosh
$ echo 'start of a command
> continuation of the
> command'
start of a command
continuation of the
command
#

I would definitely avoid any approach that tries to search/count/tokenize the input string yourselves because with shell syntax that's always a bad idea

scenic flare
olive rampart
#

@delicate tartan, would you expect to be able to use a module's dependency as a "symlink" to that module (same behavior as -m)?

(dagger-dev) ⋈ .cd alpine
(alpine) modules/alpine ⋈ .pwd
/Users/helder/Projects/dagger/modules/alpine
jaunty field
#

With the current state of the shell, is it possible to parallelize in any way?
E.g. if I would like to build several containers in parallel and publish them once all succeed.

olive rampart
# jaunty field With the current state of the shell, is it possible to parallelize in any way? E...

Yes:

⋈ _echo foo & _echo bar & _wait

If you run that several times you should see random ordering. The _ prefix means these are interpreter builtins (like bash builtins, lower level than dagger shell builtins which start with a .), and it's to avoid a conflict with function names. Replace _echo xxx with whatever pipelines you have, the & will make them run in parallel, and the final _wait is to wait for them to finish, like bash's wait.

jaunty field
#

That is awesome, thanks.