#DX wishlist
Starting a thread
1. Cross-module type export. We really need it. I know there is a risk of dependency matrix from hell, but there are possible mitigations - they are worth the effort. Modules are too cumbersome for true agent composition otherwise.
2. Better laziness model. The current rule "returning an object is always lazy; returning a scalar is always sync" - is too heavyweight and mysterious. When building more ambitious modules, it gets really confusing.
3. Send object fields as struct fields. I know I was against "leaking" the concept of fields vs. functions in the API, because I liked that GraphQL didn't differentiate. But it has leaked anyway, and the current DX makes it very painful to get a struct from another module. If I call a function that returns an object, and that object has 10 regular fields (not functions) - I should get all 10 fields for free in a single query.
4. Self-calls. Not being able to pass my own module's object to my LLM forces me to artificially split up my agent into more modules than needed.
Cross-module type export. We really need it. I know there is a risk of dependency matrix from hell, but there are possible mitigations - they are worth the effort. Modules are too cumbersome for true agent composition otherwise.
I'd say I'm open to this; in practice the ecosystem hasn't ended up with that many modules that have highly complex dependency DAGs. So I think if we enabled this, 95% of cases would be fine, but the mitigations for the other 5% will result in either really ugly generated code or headache-inducing dependency hell problems for end users. But I don't think it's worth blocking on because of those cases anymore.
Better laziness model. The current rule "returning an object is always lazy; returning a scalar is always sync" - is too heavyweight and mysterious. When building more ambitious modules, it gets really confusing
Very open to changing something here, but would strongly advocate for retaining the ability to have laziness, even if something about how it's "shaped" or default behavior, etc. change. I think there might be paths to retaining the idea of laziness but reconceptualized to be based on "functions as args"/"interfaces", for instance.
Extra subscribed to this subtopic as it's extremely relevant to Theseus since quite a bit of the more confusing parts of laziness result from buildkit/LLB. So as we swap things out we want to make sure the new implementation matches what we actually want in terms of lazy behavior.
Most helpful would be examples of where the current behavior got in your way or became confusing.
silent cc @worthy grail on the above
Send object fields as struct fields
I think this one is mostly a Go specific problem? Or at least ts/python don't suffer from it.
Agree it's a problem but not sure yet if it's something to solve on the engine/gql level vs. just improving the DX of individual sdks
in practice the ecosystem hasn't ended up with that many modules that have highly complex dependency DAGs
I think part of this goes back to the lack of cross-module types IMO. At least I've run into it a lot
Haha true, it's a chicken-egg situation. By opening the door we may also exacerbate the problem we were trying to avoid
Yeah, if there's another way of explicitly saying "hey this type is special and will probably be passed around" rather than just doing it for all types, that's ok I think
The problems arise when you end up with two types that have the same name but are from different deps. Either literally completely different deps that happen to be using the same name or (even worse) two different versions of the same dep that appear in your dependency DAG. So declaring something special about the type wouldn't really avoid that I don't think.
Like I said, most cases won't hit it but then to deal with those that do possible choices are:
- Error out, but then what is the user supposed to do to fix it?
- Do extra type namespacing where the type is prefixed with the direct dependency that returns it. This might be the least bad option, but it will make already long type names even longer for those cases, and there are degenerate cases where you'd have to do it multiple times and the names would get truly beyond absurd
- Change SDKs so that each dep is a separate imported pkg; basically piggyback on each language's import namespacing
- however, this doesn't fix the problem when there's multiple versions of the same dep in the DAG. I wouldn't know what to do there other than full on version resolution, which is a famously hard problem
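To make the prefix option concrete, here is a minimal sketch in plain Go. It is not SDK codegen: `ExportedType` and `GeneratedNames` are hypothetical names invented for illustration, and it assumes each (dependency, type) pair appears once. It only prefixes a type with its direct dependency when two deps collide on the same name, and it does nothing for the multiple-versions-of-one-dep case.

```go
package main

import (
	"fmt"
	"sort"
)

// ExportedType is a hypothetical record of a type that reaches a module
// through one of its direct dependencies.
type ExportedType struct {
	Dep  string // direct dependency that returns the type
	Name string // type name as declared by its defining module
}

// GeneratedNames keeps the bare name when it is unique across all deps and
// prefixes it with the direct dependency's name only when names collide.
func GeneratedNames(types []ExportedType) map[ExportedType]string {
	count := map[string]int{}
	for _, t := range types {
		count[t.Name]++
	}
	out := make(map[ExportedType]string, len(types))
	for _, t := range types {
		if count[t.Name] > 1 {
			out[t] = t.Dep + t.Name
		} else {
			out[t] = t.Name
		}
	}
	return out
}

func main() {
	names := GeneratedNames([]ExportedType{
		{Dep: "Alpha", Name: "Util"},
		{Dep: "Beta", Name: "Util"},
		{Dep: "Beta", Name: "Report"},
	})
	var lines []string
	for t, n := range names {
		lines = append(lines, fmt.Sprintf("%s.%s -> %s", t.Dep, t.Name, n))
	}
	sort.Strings(lines) // map iteration order is random, so sort for stable output
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Only the colliding `Util` types pick up a prefix; `Report` keeps its short name, which matches the "95% of cases would be fine" expectation.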
Yeah I acknowledge it opens up dependency hell 😱 I think the main use case has to do with interfaces though. So maybe rather than assuming each module using a type needs to depend on the same version of that type, each can have its own interface and if the type passed in doesn't match it just errors. Might be easier to come up with a real use case/example to solve
Trying to pivot it back from "here's the solution I want" to "here's the problem I'm trying to solve"
Right, yeah that's why interfaces support exists in its limited form today, obviously incomplete though. So yeah I guess an option is "make interfaces good" as opposed to just literally "return concrete types from other modules".
I'm very onboard with that since interfaces are useful even outside of just avoiding dependency hell, so making them not suck has even more benefits
Thoughts on 4. Self calls?
Oh nothing to add other than 💯. Interestingly, lack of self call support is actually one source of limitations around interfaces: https://github.com/dagger/dagger/issues/6366
And I don't think there's any blocker on it other than getting it prioritized
Yes I completely agree that we don't want to lose laziness - just make it feel more like a superpower rather than a chaos agent
No strong opinion on how to design this, open to anything really
Most useful would be examples of cases where it was a chaos agent (anytime, obviously not gonna solve this today)
This is a good reference. At this scale (200 lines-ish), you start running into the "state machine" phase of the design: you start wanting your own version of "these functions unlazy these other functions"; you start wondering how monolithic vs. composable you want the individual parts of the workflow to be. The more composable, the more the question becomes important
https://github.com/dagger/agents/blob/main/melvin/main.go
So to be clear, it's not really a "bug"; it's not even really a specific missing feature; it's more of an unknown on "what's the best design, and is it good enough", that gradually looms larger and larger as your module grows.
https://github.com/dagger/agents/blob/main/melvin/main.go#L167
func (task *GoProgrammingTask) Source(ctx context.Context) (*dagger.Directory, error) {
	task, err := task.withReviewLoop(ctx)
	if err != nil {
		return nil, err
	}
	return task.Workspace.Dir(), nil
}
This is the "output" function. You call Source, and it gives you the result. Note the .withReviewLoop() call, which is the de facto sync().
Wait so in this example you want withReviewLoop to be lazy right? Or is it that lazy is confusing for most people who aren't dagger experts?
I'm leaning on that kind of laziness for concurrency in a function I'm working on now but I know the behavior isn't obvious
well in this case there's the extra complication that I'm calling my own function directly (native self-call) so that throws an additional dimension in the mix
Here I rely on it not being lazy. If I called it via the engine (graphql self call) then I guess it would be lazy
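The difference between the two call paths can be modelled in plain Go. This is only a toy sketch, not how the engine works: `Pipeline`, `With` and `Sync` are hypothetical names, and the "lazy when called via the engine" behaviour is the guess stated above rather than something confirmed here. A step recorded through `With` does nothing until `Sync`, while calling the Go function directly runs it on the spot.

```go
package main

import "fmt"

// Pipeline is a hypothetical stand-in for a lazily evaluated object: With
// only records a step, and nothing runs until Sync is called.
type Pipeline struct {
	steps []func()
}

// With returns a new pipeline with one more recorded step. It copies the
// slice so the receiver is left untouched.
func (p Pipeline) With(step func()) Pipeline {
	steps := append(append([]func(){}, p.steps...), step)
	return Pipeline{steps: steps}
}

// Sync runs every recorded step in order and returns how many steps ran.
func (p Pipeline) Sync() int {
	for _, s := range p.steps {
		s()
	}
	return len(p.steps)
}

func main() {
	var log []string
	review := func() { log = append(log, "review loop ran") }

	// Lazy path: the step is recorded but not executed.
	p := Pipeline{}.With(review)
	fmt.Println("after With:", len(log))

	// Eager path: a direct Go call, like a native self-call, runs immediately.
	review()
	fmt.Println("after direct call:", len(log))

	// Forcing the lazy pipeline finally runs the recorded step.
	p.Sync()
	fmt.Println("after Sync:", len(log))
}
```

The same function body behaves differently depending on how it is reached, which is the extra dimension a native self-call adds.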
Yeah I just realized my use case has the same issue 🤣
mmm, i think i need a bit more clarification before i can understand this. is what's being suggested that if we have a "field", then that turns into a go/python/ts field, instead of an awaitable/fetchable function?
if so... yeah, this is interesting. we'd probably want to rework parts of our core api if we did this as well, some things are "function"-y, some things are "scalar"-y
one big reason this is hard though is this: suppose i have a method Foo.MakeChangeToBaseObject and some fields Foo.A and Foo.B. If I access Foo.A and Foo.B I can, but if I do Foo.MakeChangeToBaseObject.A, then how do we know what that is without a request to the engine? (which needs an await/ctx or equivalent)
if we did that suggestion, it feels like all functions (not fields) would have to change to take a ctx and return an error
It does feel like the issue to solve is that you can't fetch multiple fields at once - you need lots of little queries, which has crappy perf and crappy dx.
I wonder if instead of making the fields just accessible, if we had a better DX for doing multiple fields.
e.g. maybe in go that could look like being able to pass a custom object in:
result, err := dag.Container().From("alpine").Unpack(struct {
	RootFS *dagger.Directory
	Stdout string
}{})
if err != nil { /* error handling */ }
_, _ = result.RootFS, result.Stdout // do things with result
^ possible now with go generics
I wonder if we could create "auto-interfaces" for objects passed between modules like this.
so we would "graduate" an object to an interface (no graphql schema change, since we don't actually use graphql interfaces for our interfaces).
so if a module exposes Foo with a function Thing(a: int) int
then if another module returns Foo, it gets stitched into its own schema - but as an interface, instead of that exact type, with the implementation going to the original one.
then, if you have conflicting interfaces (e.g. because of different versions of a dep for example), then you error - but you can error with a really nice message, something like "v0.0.X has Foo.Thing(a: int) int, but v0.0.Y has Foo.Thing(b: float) float"
the underlying implementation would always forward the fields/functions to the original creator module - this avoids unmarshalling the json fields into two different versions, which forces module authors to think about the version compat of their own internal representation which sucks.
what's really cool about this is that as long as the returned interface is a super-set of at least the current interface (e.g. suppose my dependency has upgraded Foo so it has a Thing2 method, but I haven't upgraded to that), then I can still consume that object, I just don't get access to Thing2.
obviously, this kinda gets a bit screwed up if the module completely changes behavior between versions, but keeps the same interface - but this is unavoidable if we want this feature at all
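The superset rule and the "really nice message" can be sketched in a few lines of Go. This is an illustration only: `Interface` and `Satisfies` are hypothetical, and signatures are modelled as plain strings rather than anything from the real schema.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// Interface maps a function name to its signature, written as a plain string.
// This is a hypothetical model, not the engine's schema representation.
type Interface map[string]string

// Satisfies reports whether provided offers every function that wanted needs
// with an identical signature. Extra functions in provided are allowed.
// On mismatch it returns an error naming each conflicting function.
func Satisfies(provided, wanted Interface, providedVer, wantedVer string) error {
	var problems []string
	for name, sig := range wanted {
		got, ok := provided[name]
		switch {
		case !ok:
			problems = append(problems,
				fmt.Sprintf("%s is missing %s%s", providedVer, name, sig))
		case got != sig:
			problems = append(problems,
				fmt.Sprintf("%s has %s%s, but %s has %s%s",
					wantedVer, name, sig, providedVer, name, got))
		}
	}
	if len(problems) == 0 {
		return nil
	}
	sort.Strings(problems) // stable message regardless of map order
	return errors.New(strings.Join(problems, "; "))
}

func main() {
	old := Interface{"Foo.Thing": "(a: int) int"}
	newer := Interface{"Foo.Thing": "(a: int) int", "Foo.Thing2": "() string"}
	changed := Interface{"Foo.Thing": "(b: float) float"}

	// A superset is accepted: Thing2 is simply not visible to the consumer.
	fmt.Println(Satisfies(newer, old, "v0.0.Y", "v0.0.X"))

	// A changed signature is rejected with both versions spelled out.
	fmt.Println(Satisfies(changed, old, "v0.0.Y", "v0.0.X"))
}
```

Note that this check is purely structural, so it cannot catch the case mentioned above where behaviour changes but the interface stays the same.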
IMO, out of all the items suggested, I think this is the one I would want us to pin down and try and design first?
I think it's the most unclear one, with pretty huge implications across the ecosystem, and fairly reaching consequences through the engine.
good point...
If only we had separated from the start 1) query building from 2) query sending ...
we would have avoided so many of these issues
you mean the "cross-module type export" issue? Issue number 1 in the list correct?
I think that would've actually compounded the problems further because there would be yet another layer of laziness and whole layer of types (or just not having type safety at all)
Agree this is the way to go, I tried an experiment with that at one point. My other thought at the time was to also do something with type embedding so users could just do e.g.
result, err := dag.Container().From("alpine").Unpack(struct {
	dagger.ContainerRootFS
	dagger.ContainerStdout
}{})
With ContainerRootFS and ContainerStdout being autogenerated types like
type ContainerRootFS struct {
	RootFS *dagger.Directory
}
type ContainerStdout struct {
	Stdout string
}
So then users get pretty close to full type safety (i.e. don't have to remember and write the names/types of the fields themselves). Would be a meaningful addition to our codegen complexity though of course.
ooh, i was trying to think of how you'd do it type-safely
that's kinda interesting
It's not 100% type safe because you could still put like dagger.ContainerRootFS inside a selection on a dagger.Directory, but that's at least pretty obvious in terms of typos. Maybe there's some weird magic to prevent even those mistakes, but probably not
Also, because of the restrictions on not adding type params to individual methods in go, it technically would have to be something more like:
result, err := dagSelect(dag.Container().From("alpine"), struct {
	dagger.ContainerRootFS
	dagger.ContainerStdout
}{})
but same idea either way. I'm not holding out hope for go ever fixing that problem based on the issue for it, so even though it's slightly uglier we can file that under the "when in rome" doctrine
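A self-contained sketch of that free-function shape is below, using a mock instead of the real client so it runs on its own. Everything here is hypothetical: `Object`, `Query`, `Select`, and the per-field `ContainerStdout` / `ContainerExitCode` types are stand-ins, not generated Dagger code. It uses `reflect.VisibleFields` to collect the promoted fields of the embedded structs, asks for all of them in one call, and fills the struct.

```go
package main

import (
	"fmt"
	"reflect"
)

// Object is a hypothetical stand-in for a remote object. Query receives every
// requested field name at once, so one call models one round trip.
type Object struct {
	fields  map[string]any
	Queries int
}

func (o *Object) Query(names []string) (map[string]any, error) {
	o.Queries++
	out := make(map[string]any, len(names))
	for _, n := range names {
		v, ok := o.fields[n]
		if !ok {
			return nil, fmt.Errorf("unknown field %q", n)
		}
		out[n] = v
	}
	return out, nil
}

// Hypothetical generated selection types, one per field.
type ContainerStdout struct{ Stdout string }
type ContainerExitCode struct{ ExitCode int }

// Select is a free function because Go methods cannot declare their own type
// parameters. It fills every exported field of shape, including fields
// promoted from embedded structs, using a single Query call.
func Select[T any](obj *Object, shape T) (T, error) {
	v := reflect.ValueOf(&shape).Elem()
	if v.Kind() != reflect.Struct {
		return shape, fmt.Errorf("shape must be a struct, got %s", v.Kind())
	}
	var names []string
	var wanted []reflect.StructField
	for _, f := range reflect.VisibleFields(v.Type()) {
		if f.Anonymous || !f.IsExported() {
			continue // skip the embedded struct itself; its fields follow
		}
		names = append(names, f.Name)
		wanted = append(wanted, f)
	}
	res, err := obj.Query(names)
	if err != nil {
		return shape, err
	}
	for _, f := range wanted {
		val := reflect.ValueOf(res[f.Name])
		if !val.IsValid() || !val.Type().AssignableTo(f.Type) {
			return shape, fmt.Errorf("field %s: type mismatch", f.Name)
		}
		v.FieldByIndex(f.Index).Set(val)
	}
	return shape, nil
}

func main() {
	obj := &Object{fields: map[string]any{"Stdout": "hello\n", "ExitCode": 0}}
	res, err := Select(obj, struct {
		ContainerStdout
		ContainerExitCode
	}{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q %d (queries: %d)\n", res.Stdout, res.ExitCode, obj.Queries)
}
```

As noted above, nothing stops a caller from embedding a selection type that belongs to a different object; in this sketch that surfaces as a runtime "unknown field" error rather than a compile error.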
I disagree but I'm already overbooked to die on 3 different hills today, so will drop this one
10k message bikeshedding thread successfully avoided
~~avoided~~ delayed
I'm just waiting for @deft radish to merge client generators, then it will be super easy for me to make alternative generators to put my money where my mouth is
that makes me realize, one cool side effect of decoupling client generators, is that I could use the same "API extension" logic from the Go SDK, but use an alternative Go client generator
if you have conflicting interfaces (e.g. because of different versions of a dep for example), then you error - but you can error with a really nice message, something like "v0.0.X has Foo.Thing(a: int) int, but v0.0.Y has Foo.Thing(b: float) float"
That's dependency hell though, what is the user supposed to do to fix that?
the underlying implementation would always forward the fields/functions to the original creator module - this avoids unmarshalling the json fields into two different versions, which forces module authors to think about the version compat of their own internal representation which sucks.
That's how interfaces work today thankfully
what's really cool about this is that as long as the returned interface is a super-set of at least the current interface (e.g. suppose my dependency has upgraded Foo so it has a Thing2 method, but I haven't upgraded to that), then I can still consume that object, I just don't get access to Thing2.
Yeah I'm fully onboard with this idea because of stuff like that, it seems just strictly better for all types from other modules or core to be interfaces.
But I think it doesn't avoid the choice I mentioned earlier in terms of namespacing (error out, prefix the type with the direct dependency, or make each dep a separate imported pkg)
Honestly I'd just be fine w/ the namespacing-by-prefix solution. There would be cases where the names get really outrageous if types pass through multiple layers of deps where there was conflict that resulted in an extra prefix being needed. But I don't imagine it would be all that common to arise and users can always use type aliases to alleviate the pain for consumers of their module
That's dependency hell though, what is the user supposed to do to fix that?
yeah, not a lot - ideally we'd be able to give hints as to what's going on and suggest updates... but if we want this feature, i don't see a way round it
The other ways around it are the namespacing things I mentioned
Which I'd honestly prefer since then at least it doesn't drop you into dependency hell errors, you just get really ugly autogenerated names in corner cases
fair. but then it's really not actually that useful imo - you can return those types, but they won't work with each other at all. if i return a Util type from A and B takes a Util type, there's no way to plug them into each other (since one is UtilA and the other is UtilB)
you can only really return types and consume them, it's much harder to actually construct and pass them in
(unless we allow UtilA = UtilB somehow if they're the same underlying type, though not sure how that would work in practice)
Well if their interfaces satisfy each other, then you can pass one interface value as the other interface value. And if the autogenerated interface value is too "wide" (e.g. has methods you don't actually care about satisfying), then that's when you'd use custom interfaces
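Go's own structural interfaces show the shape of this. The sketch below is plain Go with hypothetical names (`UtilA`, `UtilB`, `Named`, `NewUtil`, `Describe`, `Greet`); whether SDK-generated interfaces would map onto native ones this directly is an assumption.

```go
package main

import "fmt"

// UtilA and UtilB model the interface generated for the same Util type as
// seen through two different dependencies (hypothetical names).
type UtilA interface {
	Name() string
	Version() string
}

type UtilB interface {
	Name() string
	Version() string
}

// Named is a narrower custom interface for consumers that only need Name.
type Named interface {
	Name() string
}

// util is the concrete implementation owned by the defining module.
type util struct{ name, version string }

func (u util) Name() string    { return u.name }
func (u util) Version() string { return u.version }

// NewUtil plays the role of module A returning its Util type.
func NewUtil(name, version string) UtilA { return util{name, version} }

// Describe plays the role of module B accepting its own view of Util.
func Describe(u UtilB) string { return u.Name() + "@" + u.Version() }

// Greet accepts anything that satisfies the narrow custom interface.
func Greet(n Named) string { return "hello " + n.Name() }

func main() {
	a := NewUtil("util", "v0.0.1")

	// A UtilA value can be passed where UtilB is expected because its
	// method set covers everything UtilB requires.
	fmt.Println(Describe(a))

	// The same value also satisfies the narrower custom interface.
	fmt.Println(Greet(a))
}
```

The two generated interfaces never reference each other, yet values flow between them, and the narrow `Named` interface covers the case where the generated one is too wide.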
oh i see
right, we're suggesting both options
sorry, getting late here, head fuzzy with interfaces
Yes for sure, today you can technically accomplish this by defining lots of custom interfaces, it's just that doing that every time for every type is a huge PITA (and there are the other limitations around interfaces, like not being able to have your own types satisfy your own interfaces, which will be fixed by self call support)
Should we create issues for each of these, and move the convos there, if we agree they all make sense?
i think the idea works, though i think i need to do a worked example to understand what this actually will look like
like, in a graph of deps with types exported through each other, who sees what, what names, and then what "implied links" are there (from modules that don't explicitly reference each other, but do so through the derived interface of a dependency module 😱)
yeah, i think that's what's tripping me up - we'd be introducing a "new" type of implied dependency between modules, where a module can end up calling another module that it doesn't explicitly depend on - but one that the developer hasn't explicitly used interfaces to declare