#Simplify CLI proposal
One of the most significant parts is that most of it actually just comes along with supporting the core API in the CLI, which I think we mostly all agreed we'd want independent of simplifying the command structure.
So I kind of feel like it might be worth pursuing from that angle alone. If we're happy with the end result then that's great, if we're not there's still room to do more iteration on the CLI UX from there.
Not sure if that helps. Happy to talk it all through more too
supporting the core API in the CLI, which I think we mostly all agreed we'd want independent of simplifying the command structure
Yes agree on that
@robust shoal hope you don't mind me sending half-baked questions and reactions here, instead of a well thought-out paragraph in the issue. Don't feel like I have something structured enough to say, yet
@robust shoal how do you feel about python/go/ts one-liners as a possible alternative to CLI args & flags?
ie. where does it fit in all this
another random question - is it actually possible to get shell autocomplete for this? I feel like perhaps shells won't support it, because the autocomplete isn't static for a given command (changes based on value of -m)
Oh yeah we should absolutely support that too either way. I think that we should continue to support the posix-y commands+flags because it's more approachable and easier in a lot of cases, but the lang-snippets approach should exist too
I think it works pretty orthogonally with the proposal, e.g.
dagger run --lang=python 'dag.my_module().my_binary().build(somearg="someval").export(path="./bin/my-binary")'
is the python equivalent to:
dagger run -m my-module my-binary build --some-arg=someval export --path=./bin/my-binary
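To make the correspondence between the two forms concrete, here is a minimal sketch of a translator from the flags form to the Python chain. It is not how the CLI is implemented; the helper names (`kebab_to_snake`, `cli_to_python_chain`) are made up, and it assumes every flag is written as `--name=value` and that names map to snake_case (so `--some-arg` becomes `some_arg`, where the message above spells it `somearg`).

```python
def kebab_to_snake(name: str) -> str:
    """Convert a CLI-style name (my-binary) to a Python-style one (my_binary)."""
    return name.replace("-", "_")


def cli_to_python_chain(module: str, tokens: list[str]) -> str:
    """Render CLI tokens as a Python call-chain string (illustration only)."""
    calls: list[tuple[str, dict[str, str]]] = []
    for token in tokens:
        if token.startswith("--"):
            if not calls:
                raise ValueError(f"flag {token!r} appears before any function")
            key, sep, value = token[2:].partition("=")
            if not sep:
                raise ValueError(f"flag {token!r} must use the --name=value form")
            # A flag always belongs to the most recently named function.
            calls[-1][1][kebab_to_snake(key)] = value
        else:
            calls.append((kebab_to_snake(token), {}))

    parts = [f"dag.{kebab_to_snake(module)}()"]
    for name, args in calls:
        rendered = ", ".join(f'{k}="{v}"' for k, v in args.items())
        parts.append(f"{name}({rendered})")
    return ".".join(parts)


chain = cli_to_python_chain(
    "my-module",
    ["my-binary", "build", "--some-arg=someval", "export", "--path=./bin/my-binary"],
)
print(chain)
```

Under those assumptions this prints `dag.my_module().my_binary().build(some_arg="someval").export(path="./bin/my-binary")`.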
I haven't set up autocompletion previously, but yeah my understanding is that you need to generate it ahead of time and tell your shell about it. So we could support exporting the autocompletions for a user module, but it would not be dynamic, would need to be updated whenever the module changes etc. So not a great experience.
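A rough sketch of why the completions can't be static: the candidates for the next word depend on which module was picked with -m and on the return type of everything typed so far. The `SCHEMAS` table and the type names in it are invented for illustration; a real version would have to get this from the module's schema.

```python
# Hypothetical, hand-written schemas: for each type, function name -> return type.
SCHEMAS = {
    "my-module": {
        "root": "MyModule",
        "types": {
            "MyModule": {"my-binary": "Binary"},
            "Binary": {"build": "Directory"},
            "Directory": {"export": "Void"},
        },
    },
    "other-module": {
        "root": "Other",
        "types": {"Other": {"lint": "String"}},
    },
}


def complete(module: str, typed: list[str], prefix: str = "") -> list[str]:
    """Return candidate function names for the next position in the chain."""
    schema = SCHEMAS[module]
    current = schema["root"]
    for name in typed:
        if name.startswith("--"):
            continue  # flags do not change the type we are chaining from
        current = schema["types"].get(current, {})[name]
    # Scalars such as Void or String have no entry, so nothing can follow them.
    candidates = schema["types"].get(current, {})
    return sorted(n for n in candidates if n.startswith(prefix))
```

The same command line prefix gives different answers per module (`complete("my-module", [])` versus `complete("other-module", [])`), which is exactly what a pre-generated completion script can't know without being regenerated.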
The other thing we could do that would be more work but a better UX is use TUI magic for this. So an interactive mode that helps you build the command you want to run or something like that
Which would be very cool in the long run; hard to imagine us having time in the near future to do it but can remain an open option
If you're referring to the tui autocompletion idea, then yeah that would be an option. I suppose it's that or just do our own thing with the TUI that's close enough to a unix shell autocompletion experience (but not literally wrapping one underneath). If that makes sense
the my-container shell example feels like a special case somehow
private because it doesn't return a scalar
It would require adjustments to the core API (currently shellEndpoint), but I think we can make it work coherently with the overall idea.
If you think about how the export APIs in core work today, they are essentially the equivalent of "side effect only" functions that users can define. The difference is because they are in the core api they have special privilege to directly interact with the caller's host resources. I am imagining that the new shell function on Container is like that. It connects back to the caller, sees that it has a tty available and connects to that. So just like export can interact w/ the caller's host filesystem, shell can interact with the caller's tty.
Basically just refactor it to be more aligned w/ how all the other core APIs work, which is probably a good thing in general anyways
There could be a Host.Terminal
Or maybe it's Host.Stdout, Host.Stderr, Host.Stdin, of type Stream, and they are the default values of three new options to `WithExec`
(not sure about this at all)
That would be super cool and useful, but I think it would be orthogonal to this. Basically the whole idea here relies on being able to chain off of types, so there would have to be a shell function on container still in the same way there's export on file/directory/container. That doesn't contradict what you're saying at all though, it can all coexist peacefully.
Yeah that makes sense. The part that's throwing me off is that I don't know how to explain a hypothetical Container.Shell() function in pure API terms (outside of the context of the CLI).
Would it be: "Execute a shell and attach an interactive terminal to it"? But that opens a few questions:
- What is the return value?
- What shell are we executing?
- How is it different from WithExec? Why not use that?
- What interactive terminal am I attaching, and how do I attach it?
So these questions led me down a path of trying to represent that mysterious "interactive terminal" in the API somehow, so that I could explain a Shell function that interacts with it
It seems hard to fit the dagger shell square peg into the Core API round hole, without some sort of a "stream" type
What is the return value?
Void, it's a side-effect only function. The other APIs in core that are side-effect only like the various exports should have been Void, but we didn't have it at the time so they use a placeholder bool instead. Those APIs are spiritually Void though.
What shell are we executing?
Yeah we would need to decide on the details here. Something like "Container entrypoint+default args by default, plus args to the Shell API to override those" could be a start of the bikeshed
How is it different from WithExec? Why not use that?
Yeah agree, maybe just a better name than Shell could help here. Something more specific like:
- AttachToCaller
- AttachDebugShell
- etc.
There's other possibilities like making this API be on Service instead, though that would increase verbosity of how you call it from a CLI (even if from a programmatic perspective that's not a big deal). I kind of like this problem in a way though because it's another case where user modules are going to be facing the exact same issue: divergence between what's a nice CLI API and what's a nice programmatic API.
I hinted at ways we can go about addressing that under The Core API was not optimized for a CLI interface in my comment, so this may tie back to all that. Basically, advanced module authors could have opt-in ways of configuring visibility of functions on the CLI in the programmatic clients (fortunately this is actually pretty easy to implement now). Details/bikeshedding TBD
What interactive terminal am I attaching, and how do I attach it?
The caller's TTY (or technically pty, whatever). Same way that export interacts with the caller's filesystem, etc.
how to explain a hypothetical Container.Shell() function in pure API terms
Also, whatever it ends up being called and wherever exactly it lives, the behavior between the CLI and the API clients would be consistent. Programmatic API clients all rely on the session for anything that involves "host/caller interaction", so either dagger run, the "magic automatic dagger session subprocess", or in the case of running inside a Module it's technically the shim. Either way, it's again the same as export; for export the API is implemented by talking back to the session process and exporting the files. For this shell case the API would be implemented by talking back to the session process and attaching to its tty if it's available (error otherwise).
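As a toy model of that symmetry (not the real session code): both side-effect functions return nothing and do their work by calling back into something that represents the caller's host. `CallerSession`, `NoTerminalError`, `export` and `terminal` are all hypothetical names, and the "filesystem" is just a dict.

```python
class NoTerminalError(RuntimeError):
    """Raised when the caller has no tty to attach to."""


class CallerSession:
    """Stand-in for the session process that represents the caller's host."""

    def __init__(self, has_tty: bool) -> None:
        self.has_tty = has_tty
        self.files: dict[str, bytes] = {}       # pretend host filesystem
        self.attached: list[list[str]] = []     # commands attached to the tty

    def write_file(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def attach_tty(self, command: list[str]) -> None:
        if not self.has_tty:
            raise NoTerminalError("caller has no tty available")
        self.attached.append(command)


def export(session: CallerSession, path: str, data: bytes) -> None:
    """Side-effect only: writes to the caller's filesystem, returns nothing."""
    session.write_file(path, data)


def terminal(session: CallerSession, command: tuple[str, ...] = ("sh",)) -> None:
    """Side-effect only: attaches to the caller's tty, errors if there is none."""
    session.attach_tty(list(command))
```

Whether the caller is the CLI, an SDK client or the shim only changes which object plays the `CallerSession` role; the two functions look the same from the API side.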
Got it
Then I guess, when we finally get around to "sandboxing" the GraphQL API, to remove the tether to the buildkit grpc API, this would add one more item on the list of features to add to the GraphQL API:
- Filesystem upload/download (to sandbox for Host.Directory, Directory.Export and Container.Export)
- Socket forwarding (for H2C and C2H networking)
- Secret lookup (or maybe that's already sandboxed with the current secret API? Does the session binary need to touch that?)
- New: tty attach
Is that right?
That's right, but I'd say that tty attach doesn't really complicate anything beyond the points you listed above it. If we've tackled those then tty attach is easy on top of the rest of the plumbing needed for the rest
Yeah makes sense. OK so that leaves me with the confusing name + relationship to WithExec
So I'm leaning towards either:
- Make it a flag to Container.WithExec, like attach: bool
- Call it Container.Attach() and it only does the attaching - actual execution is still controlled by WithExec
But CLI experience is not great for either of those
A gotcha of intertwining this with WithExec is that an interactive session is very far from hermetic, but WithExec is cached. So I think we'd have to always invalidate the cache of WithExec when there's a caller tty attached. Possible I suppose, there are some low-level annoyances/performance considerations to deal with but they could be dealt with.
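One way to picture "always invalidate when a tty is attached" is a cache key that stops being deterministic in that case. This is only a sketch of the idea, not how the engine's cache works; `exec_cache_key` and the nonce approach are assumptions made for illustration.

```python
import hashlib
import json
import uuid


def exec_cache_key(args: list[str], tty_attached: bool) -> str:
    """Return a cache key for an exec; interactive execs never share a key."""
    payload: dict[str, object] = {"args": args}
    if tty_attached:
        # A fresh random nonce makes every interactive exec a guaranteed cache miss.
        payload["nonce"] = uuid.uuid4().hex
    encoded = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()
```

Two non-interactive calls with the same args get the same key, while two interactive calls never do, which is the behavior (and the performance cost) being described.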
That's what we'd avoid via a single field that has Void return type and/or putting it on Service instead.
What do you think about this:
- We add a graphql field to Container with a very specific name like attachCallerTerminal(<optional args for overriding command to run, tbd>) Void.
- In the raw graphql and programmatic interfaces (i.e. codegen clients), it has that name of AttachCallerTerminal. It's usable if really desired and behavior is consistent (as mentioned above) but it has such a specific name that it's not easily confused with the other WithExec APIs.
- We add the aforementioned general support for module authors to optionally customize how certain APIs are presented in the CLI specifically. We use that general mechanism in the core API to give the API a more friendly name like shell or attach when it's being presented in the CLI.
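A minimal sketch of what that opt-in presentation metadata could look like, purely as a data model. `ApiField`, `cli_name` and `cli_command_table` are hypothetical names; nothing here reflects an existing annotation in any SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApiField:
    api_name: str                    # name in GraphQL and codegen clients
    cli_name: Optional[str] = None   # optional friendlier name, CLI only

    def name_for_cli(self) -> str:
        """Fall back to the API name when no CLI-specific name is set."""
        return self.cli_name or self.api_name


# Assumed example fields on Container, following the proposal above.
CONTAINER_FIELDS = [
    ApiField("attachCallerTerminal", cli_name="shell"),
    ApiField("export"),
]


def cli_command_table(fields: list[ApiField]) -> dict[str, str]:
    """Map each CLI-visible name back to the API field it invokes."""
    return {f.name_for_cli(): f.api_name for f in fields}
```

With this shape the API keeps its very specific name, and only the CLI layer consults `cli_name`; fields without it behave exactly as they do today.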
@robust shoal I think I'm warming up to Container.Shell after all. It's a slight misuse of the term, because you can run a shell in non-interactive mode... For correctness we should call it eg. InteractiveShell or ShellWithTerminal... But maybe in this context the intent is clear.
(In the back of my mind there is the use case of llb's shlex -> a convenience function to wrap sh -c <CMD>. Which would be the other contender for a Shell function, but I feel ready to just ignore that.)
Also Shell could use a simpler logic for determining what the hell gets executed (something that still confuses me with the current dagger shell). I think we should disregard entrypoint altogether, and just make it an argument to Shell, with a default value of sh if not specified
Things that would NOT affect what Container.Shell executes:
- Past calls to Container.DefaultArgs
- Past calls to Container.Entrypoint
- Past calls to Container.WithExec
- Entrypoint or Command in the base image
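Spelled out as a sketch, the rule is small enough to fit in one function: the only input that matters is the argument, with sh as the default. `ContainerState` and `shell_command` are made-up names used to show what gets ignored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContainerState:
    """Everything that is deliberately ignored when picking the shell."""
    entrypoint: list[str] = field(default_factory=list)
    default_args: list[str] = field(default_factory=list)
    last_exec: list[str] = field(default_factory=list)


def shell_command(state: ContainerState, shell: Optional[str] = None) -> list[str]:
    """Return the command to run; `state` is accepted but intentionally unused."""
    return [shell if shell is not None else "sh"]
```

So a container built with a Python entrypoint and some default args still opens `sh` unless the caller passes something else explicitly.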
The one use case I sort of like having some default is if you want to drop right into a repl in the container, in which case it's nice if the container sets up the default for you. But I'm pretty low opinion here; I agree 100% on past calls to Container.WithExec not affecting it, and also agree that DefaultArgs is too confusing. I'd be okay with starting with Entrypoint not having any effect and going from there
Maybe there's a WithDefaultShell?
I was brushing up on unix terminology for this stuff
We could have Container.Terminal(Shell Optional[string])
Container.WithDefaultShell
Terminal(): open an interactive terminal to the container
Actually for future sandboxing of the API (ie not important now) we could have a Terminal type
with a proper terminal control graphql API
that could be a genuinely awesome contribution to the state of the art
Oh totally, I think what we'd be doing here would be a first step towards a nicer+fuller API around all this (well, technically, shellEndpoint was but that was just meant to be a prototype)
We still need to do bikeshedding on the equivalent of all the above but for what replaces dagger up. I think it's mostly the same type of considerations to work through (requires a backwards compatible addition to Service along the lines of listen that returns Void and results in the caller's host network being listened on, basically a shortcut api to the fuller Host tunnel APIs)
But just speaking generally, pending the rest of the bikesheds needing paint, are you onboard with the general idea here?
@vital crescent let me know if you have any thoughts on the whole idea here (know you've been planning on it and there's a million other things too, just a gentle ping). If we're onboard with the general approach we can start working on the pre-reqs in the background as we continue to sort out the API details and such.
ChatGPT says it might look like this:
type Query {
  # Retrieves the current state or initializes a terminal.
  getOrCreateTerminal(id: ID): Terminal
}

type Terminal {
  id: ID!
  content: String
  status: TerminalStatus

  # Sends a keystroke to the terminal.
  sendKeystroke(key: String!): Terminal

  # Writes text to the terminal.
  writeText(text: String!): Terminal

  # Resizes the terminal.
  resize(columns: Int!, rows: Int!): Terminal

  # Changes the terminal's settings.
  updateSettings(settings: TerminalSettings!): Terminal

  # Retrieves the terminal's current settings.
  getSettings: TerminalSettings

  # Closes the terminal.
  close: Terminal

  # Flushes the terminal's input and output.
  flushIO: Terminal

  # Reads output from the terminal.
  readOutput: String

  # Sets the terminal to raw mode.
  setRawMode(enable: Boolean!): Terminal
}

# Terminal settings (can include various configuration options).
input TerminalSettings {
  # Define specific settings here.
}

# Enum for terminal status.
enum TerminalStatus {
  ACTIVE
  INACTIVE
  CLOSED
}
doesn't Service already have Start?
Yeah but that returns a ServiceID. I think we just want a field on Service that returns Void and does the whole process of Start and setting up the host tunnel. So it would just be a shortcut for the CLI essentially that encapsulates existing functionality (rather than doing anything new or fancy)
Is it really a hard requirement that it return void? I forget the reason
is it to show less noise in --help?
No, just that it returns a scalar. So technically ServiceID is okay, but the end effect I'd want would be something like this:
dagger run my-server listen -p 8080:8081 (where listen is the new field that returns Void)
So we want the API to result in the service listening on the caller's host, basically as a side-effect of just having executed. That's where Void comes in; it's for those side-effect only apis like export, shell/terminal and whatever we would add here
sorry I still don't get why the scalar requirement? what happens differently if I dagger run my-server start?
oh it's missing the h2c
but that's orthogonal to returning a scalar no?
Yeah that. The scalar requirement is what I wanted for this new CLI approach because it helps resolve all the ambiguities around "am I operating on a pipeline, the final return value, or both" and also just keeps the general number of cases the CLI needs to handle simpler and more predictable. We don't need to think about every core type, every user type etc.
And it works out nicely here because side-effect only functions return void, which is a scalar that is very easy to handle
I think that can be solved by simply pretty-printing all the scalars if you end your pipeline on a struct (and no flags to customize that)
seems less arbitrary that way. You call a pipeline of functions, the CLI prints the result
if you want side effects, here are functions you can call
Yeah totally, I was thinking we'd handle that case too (what Helder refers to as "simple objects"), but that is spiritually in line with what I'm saying. I was just trying to avoid the CLI seeing the return value was a Container and then deciding what to do with it as a special case of that particular type (same thing for every other core type etc.)
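A small sketch of that rule as the CLI might apply it: never branch on a specific core type, only on "void, scalar, or object whose scalar fields can be printed". `render_result` is a hypothetical helper, objects are modeled as plain dicts, and Void is modeled as `None`.

```python
from typing import Any

SCALARS = (str, int, float, bool)


def render_result(value: Any) -> str:
    """Decide what the CLI prints for the final value of a pipeline."""
    if value is None:
        return ""  # Void: the function was called for its side effect only
    if isinstance(value, SCALARS):
        return str(value)
    if isinstance(value, dict):
        # "Simple object": print its scalar fields, skip nested objects.
        lines = [f"{k}: {v}" for k, v in value.items() if isinstance(v, SCALARS)]
        return "\n".join(lines)
    raise TypeError(f"cannot render value of type {type(value).__name__}")
```

Note there is no `if it is a Container` branch anywhere; anything with side effects has to be an explicit function call at the end of the chain.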
yeah I'm definitely warming up to that concept
I know we said "remoting" the API is out of scope. But there is one concern: the API of those side effect functions will be partly a lie after remoting...
I think it would be about the same level of lie as today imo, what are you thinking of that makes it more of one?
if the graphql server is remote, Directory.Export can't write to the local filesystem, yet it takes a path as argument, what will it do with it?
Oh I see, so the graphql server is already remote (runs in the engine container), we are using the buildkit session stuff to communicate back to the buildkit client in the CLI/session-binary/shim. If you mean there's just pure SDK and no session binary w/ buildkit client etc. then that just means something else in the SDK needs to handle all that.
There's plenty of options there, all a buildkit session really truly is underneath the hood is a stream between the client and server where the server sends requests to the client (inverts the relationship), so there's various ways of implementing that. Websocket is an obvious one that complements a graphql client
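To illustrate the "inverted relationship" in isolation, here is an in-memory toy with two queues standing in for the stream. It is not buildkit's session protocol; `Stream`, `Client`, `Server` and the `read_host_file` method name are invented for the example.

```python
from collections import deque
from typing import Callable


class Stream:
    """In-memory stand-in for one bidirectional connection."""

    def __init__(self) -> None:
        self.to_client: deque = deque()
        self.to_server: deque = deque()


class Client:
    """Opens the stream, then answers requests that arrive from the server."""

    def __init__(self, stream: Stream, handlers: dict[str, Callable[[str], str]]) -> None:
        self.stream = stream
        self.handlers = handlers

    def serve_one(self) -> None:
        method, payload = self.stream.to_client.popleft()
        self.stream.to_server.append(self.handlers[method](payload))


class Server:
    """Sends requests *to* the client over the stream the client opened."""

    def __init__(self, stream: Stream) -> None:
        self.stream = stream

    def call_client(self, method: str, payload: str) -> None:
        self.stream.to_client.append((method, payload))

    def read_reply(self) -> str:
        return self.stream.to_server.popleft()
```

Swapping the two deques for a websocket would keep the same roles: the client connects, and from then on the server is the one issuing calls.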
But overall I don't think any of this makes that situation any better or worse, it's about the same
Right, by "remoting the gql API" (not sure what to call it) I meant collapsing the features of the buildkit session stuff into the graphql API, so that our API actually expresses 100% of what you can do with Dagger in one schema. The reason I thought that might affect the CLI UX, is because I was assuming it would require changing the schema, specifically how side effect functions work. For example Directory.Export might return a URL to a tarball, or to a mutagen or rsync websocket, or something. And it wouldn't take a path anymore (not the server's concern where the client will store the contents it's downloading). And so on.
And in turn those API changes would change the CLI UX, since in this proposal it's coupled to the API.
For example Directory.Export might return a URL to a tarball, or to a mutagen or rsync websocket, or something
I see, I mean that would be a "breaking change, rethink things" type moment. There definitely would be reasonable options still though. If those APIs return scalars of particular types, we could have the CLI just handle that new set of primitive scalar types. I think the key at that time would be to find the right small set of primitive scalar types so that we don't go all the way back to trying to specialize the CLI to handle tons of different cases.
To be less abstract, at that time we'd be making breaking API changes anyways so we could change shell/terminal to not return void anymore but return TerminalStreamID. Then the code that exists today as a "buildkit session attachable" (i.e. callback) would be replaced with code that conceptually does the same thing, it's just that it isn't a buildkit session attachable anymore and instead operates based on that stream ID.
export would return a URL scalar type, which the CLI could have generic handling for.
The host tunneling parts of services today are also implemented as buildkit session attachables, so there it would be a SocketID or something, etc.
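A sketch of what "generic handling for a small set of primitive scalar types" might look like on the CLI side in that future. The type names come from the messages above; the handler table, `handle_scalar`, and the action strings it returns are placeholders, not a design.

```python
from typing import Callable

# One generic handler per primitive scalar type; values describe the action taken.
SCALAR_HANDLERS: dict[str, Callable[[object], str]] = {
    "Void": lambda value: "",
    "URL": lambda value: f"download from {value}",
    "TerminalStreamID": lambda value: f"attach tty to stream {value}",
    "SocketID": lambda value: f"forward socket {value}",
}


def handle_scalar(type_name: str, value: object) -> str:
    """Dispatch on the scalar's type name only, never on core object types."""
    try:
        handler = SCALAR_HANDLERS[type_name]
    except KeyError:
        raise ValueError(f"unsupported return type {type_name!r}") from None
    return handler(value)
```

The table stays small as long as new features reuse existing scalar types, which is the property worth protecting.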
Basically, I think the "side-effect-only" void approach makes sense today (and for our current time frames), but we could switch to something like the above pretty seamlessly as part of a big effort like that. We wouldn't be locking ourselves in to a particular architecture.
Makes sense
I'm dedicating the rest of today to reviewing and commenting on these CLI issues. I've read a few yesterday and have comments
same! will try to summarize my thoughts from this discussion, thank you Etik for indulging me
I think it would be very useful to include a collection of real-world examples using real modules,
to anchor the discussion
Yes, sgtm
Great questions and answers here! This was insightful. In general I'm on board, apart from fleshing out a few details. I have a few notes and quite a few more thoughts to put out and need to write all of that but I'll write in the issues (and create at least a new one).
Didn't know anything about "remoting". First I'm learning about it
Sorry I missed this question!
YES I am completely on board with the general idea
I made up that word yesterday, but it's a reference to a very old design discussion that we agreed to shelve many moons ago. github.com/dagger/dagger/issues/3595
Actually forget that issue, here are our current docs:
Using the SDK, your program prepares API requests describing pipelines to run, then sends them to the engine. The wire protocol used to communicate with the engine is private and not yet documented, but this will change in the future. For now, the SDK is the only documented API available to your program.
- "The wire protocol is private and not yet documented" is a reference to the "buildkit session stuff"
- "This will change in the future" is the shelved "remoting" work I'm talking about
My main concern is that if we tie the CLI 1:1 with the API, we may be pulled into doing something that's not the best for one or the other and the API affects several "interfaces". Leaky abstraction and all. I've seen some compelling arguments though.
If the CLI is so powerful I worry that users will tend to resort more to using the CLI rather than making a higher-order module (because it's easier), but be frustrated with some limitations and request more advanced CLI features, and more commonly write bash scripts or makefiles just to alias dagger commands, rather than create a new module.
That reminds me of docker run and fig/docker compose that came out to simplify how to call it. I'm not sure if there are lessons learned from docker compose that we should heed before going down that path of having a simple config file to declare more lengthy dagger cli commands in a project (cli thing, not api thing).
At least I'm convinced we need users to feel it's easy enough to declare their project functions at a higher level, leading to dead simple cli commands (like dagger run build), therefore avoiding wrapping dagger in bash/makefiles.
Maybe users will understand that a complicated command warrants a higher level module and relying on big commands is only for more advanced users or uses, even debugging. We can wait and see what way it goes.
@vital crescent I agree that designing for the least common denominator between CLI and API is a risk. So far it seems like the risk is worth it compared to the alternatives. But it's still worth remembering the risk
If the CLI is so powerful I worry that users will tend to resort more to using the CLI rather than making a higher-order module (because it's easier), but be frustrated with some limitations and request more advanced CLI features, and more commonly write bash scripts or makefiles just to alias dagger commands, rather than create a new module.
This is an important point. We are missing an important piece to the puzzle: one-liner scripts. We need those to address this issue.
The key is to provide a gradual learning curve, so you can start simple, then graduate to the next level of complexity, with just enough gap between the levels that it feels smooth.
The levels are:
1. Pure CLI. Easiest. You're calling functions from the CLI. Makes sense. There are limitations, makes sense too.
2. Scripts. Need to choose a language. But no module to create, you get a repl, or you pass a one-liner as an argument, or you write a script to a regular file. Dependencies and codegen are handled for you, so it's not the same as today's ci.py. It's much nicer.
3. Module. When your script gets bigger, you upgrade to a full-blown module.
We need level 2, otherwise the jump is too wide between levels 1 and 3, and you get the problem you're talking about @vital crescent : people get stuck at level 1, and write bigger and bigger shell scripts. Maybe wrapper tools appear. The experience worsens.
one-liner scripts. We need those to address this issue
I'm itching to implement that btw. But it doesn't address the use case I'm thinking about.
I imagine when it's time to share "dagger tasks" with the team, a user commits this to git instead of a module:
# Makefile
build:
	dagger -m github.com/xxx/daggerverse/golang run --src="." build export --path="bin/"
docs:
	dagger -m github.com/yyy/daggerverse/docusaurus run ... etc ...
So they tell the team to make build instead of dagger run build.
Could also be a user that doesn't feel confident in programming. Maybe a simpler, config language based SDK for simple cases like that?
I think it's inevitable that we will see some of that. The important thing is not to make it impossible, but to "channel" those use cases so that as the Makefile grows, there's an easy path to upgrade to the next level
Just appending one other possibility to all of the above (which I agree with both in terms of the concerns and mitigations): our current dagger mod init is very barebones and could be way more useful for helping out in these types of situations. More "starter templates" could go a long way towards shortening the leap required to get to level 3.
And even something like allowing users to take an existing complicated CLI command or lang one-liner and automatically turn it into module code that they can invoke more easily is not really as crazy as it might sound I don't think
Started on some of the plumbing we'll need for all of this, hooked it up to the current dagger call just to see what it looks like completely raw:
package main

func New() *Test {
	return &Test{
		Ctr: dag.Container().From("alpine:3.18"),
	}
}

type Test struct {
	Ctr *Container
}
sipsma@dagger_dev:~/repo/github.com/sipsma/daggerverse/test$ dagger call ctr file --path /etc/alpine-release contents
✔ dagger call ctr file contents [1.47s]
┃ 3.18.5
• Cloud URL: https://dagger.cloud/runs/aecc25e4-fd8a-4981-9ce5-92f8e53a7685
• Engine: 63c25ac7c1cb (version devel ())
⧗ 2.18s ✔ 66 ∅ 7
But then if you run --help you get, as we were expecting, this absurd output (discord won't let me paste it for some reason, had to take screenshot)
So I suspect getting that tamed (and related adjustments to core API) is going to be the bulk of actual work (plumbing part was super easy, I'll send out a PR for the plumbing only w/out updating CLI since it's pretty standalone and testable independently)
@worldly junco I agree with what you said in the meeting chat earlier today about being suspicious of going down the road of APIs being annotated w/ metadata that customizes how it's presented in the CLI specifically. I'm suspicious of it too and don't want to jump 100% into it immediately, but I do think there would be some important differences relative to the old "entrypoints" stuff:
1. We can start out just w/ the core API having this ability, not user facing in any way.
2. If we wanted to expose it to users, then it would be a completely optional + advanced feature. There's support in all the SDKs now for "opt-in" annotations on everything, which would be a pretty clear route for supporting this.
But we don't even need to jump into it immediately either. I'm thinking we can just see how long we can go without anything like that and then only jump to 1. if there are no other viable options.
I don't actually think this super-long help message is a problem...
First, I think we should invest more in dagger functions as the canonical way of discovering. --help to me is more of a hack - a nice hack, but still a hack.
On top of that, I think there's plenty we can try on the CLI side, without involving the API at all, before we have to even consider messing with the API layer.
For example, --help and/or dagger functions could default to not showing core functions. Or there could be custom logic that is aware of the core types, and groups the functions by category to add order to the mess
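Both ideas fit in a few lines on the CLI side without touching the API. The function list, the `core` flag and the category names below are made up for illustration; `help_listing` is a hypothetical helper.

```python
from collections import defaultdict

# Assumed shape of what the CLI already knows about each available function.
FUNCTIONS = [
    {"name": "build", "core": False, "category": "module"},
    {"name": "test", "core": False, "category": "module"},
    {"name": "with-exec", "core": True, "category": "execution"},
    {"name": "export", "core": True, "category": "output"},
    {"name": "file", "core": True, "category": "filesystem"},
    {"name": "directory", "core": True, "category": "filesystem"},
]


def help_listing(functions: list[dict], show_core: bool = False) -> dict[str, list[str]]:
    """Group function names by category, hiding core functions by default."""
    groups: dict[str, list[str]] = defaultdict(list)
    for fn in functions:
        if fn["core"] and not show_core:
            continue
        groups[fn["category"]].append(fn["name"])
    return {category: sorted(names) for category, names in sorted(groups.items())}
```

By default only the module's own functions show up; passing `show_core=True` brings the core ones back, but grouped rather than as one long flat list.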