#DX wishlist

cinder adder
#

Starting a thread 🙂

#
  1. Cross-module type export. We really need it. I know there is a risk of dependency matrix from hell, but there are possible mitigations - they are worth the effort. Modules are too cumbersome for true agent composition otherwise.

  2. Better laziness model. The current rule, "returning an object is always lazy; returning a scalar is always sync", is too heavyweight and mysterious. When building more ambitious modules, it gets really confusing.

  3. Send object fields as struct fields. I know I was against "leaking" the concept of fields vs. functions in the API, because I liked that GraphQL didn't differentiate. But it has leaked anyway, and the current DX makes it very painful to get a struct from another module. If I call a function that returns an object, and that object has 10 regular fields (not functions) - I should get all 10 fields for free in a single query.

  4. Self-calls. Not being able to pass my own module's object to my LLM forces me to artificially split up my agent into more modules than needed.

modern tusk
#

Cross-module type export. We really need it. I know there is a risk of dependency matrix from hell, but there are possible mitigations - they are worth the effort. Modules are too cumbersome for true agent composition otherwise.
I'd say I'm open to this; in practice the ecosystem hasn't ended up with that many modules that have highly complex dependency DAGs. So I think if we enabled this 95% of cases would be fine, but the mitigations for the other 5% will result in either really ugly generated code or headache-inducing dependency hell problems for end users. But I don't think it's worth blocking on because of those cases anymore

Better laziness model. The current rule, "returning an object is always lazy; returning a scalar is always sync", is too heavyweight and mysterious. When building more ambitious modules, it gets really confusing.
Very open to changing something here, but would strongly advocate for retaining the ability to have laziness, even if something about how it's "shaped", its default behavior, etc. changes. I think there might be paths to retaining the idea of laziness but reconceptualized to be based on "functions as args"/"interfaces", for instance.

Extra subscribed to this subtopic as it's extremely relevant to Theseus since quite a bit of the more confusing parts of laziness result from buildkit/LLB. So as we swap things out we want to make sure the new implementation matches what we actually want in terms of lazy behavior.

Most helpful would be examples of where the current behavior got in your way or became confusing.
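
For readers who have not hit the rule being discussed, here is a minimal toy model of it. It does not use the Dagger SDK; `Pipeline`, `WithStep` and `Stdout` are hypothetical names, and the only point is that the object-returning call records work while the scalar-returning call forces it.

```go
package main

import "fmt"

// Pipeline is a hypothetical stand-in for an object type.
// It records steps but does not run them.
type Pipeline struct {
	steps []string
	log   *[]string // shared record of what has actually executed
}

// WithStep returns an object, so nothing executes here (lazy).
func (p Pipeline) WithStep(name string) Pipeline {
	steps := append(append([]string{}, p.steps...), name)
	return Pipeline{steps: steps, log: p.log}
}

// Stdout returns a scalar, so it forces every recorded step to run (sync).
func (p Pipeline) Stdout() (string, error) {
	for _, s := range p.steps {
		*p.log = append(*p.log, s)
	}
	return fmt.Sprintf("ran %d steps", len(p.steps)), nil
}

func main() {
	executed := []string{}
	p := Pipeline{log: &executed}.WithStep("build").WithStep("test")
	fmt.Println(len(executed)) // still zero: the chain has only been described

	out, _ := p.Stdout()
	fmt.Println(out, len(executed)) // the scalar call ran both steps
}
```

The surprise in larger modules is that which calls "run things" depends only on the return type, which is hard to see at the call site.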

#

silent cc @worthy grail on the above

#

Send object fields as struct fields
I think this one is mostly a Go specific problem? Or at least ts/python don't suffer from it.

Agree it's a problem but not sure yet if it's something to solve on the engine/gql level vs. just improving the DX of individual sdks

mellow stone
#

in practice the ecosystem hasn't ended up with that many modules that have highly complex dependency DAGs
I think part of this goes back to the lack of cross-module types IMO. At least I've run into it a lot

modern tusk
mellow stone
#

Yeah, if there's another way of explicitly saying "hey this type is special and will probably be passed around" rather than just doing it for all types, that's ok I think

modern tusk
#

The problems arise when you end up with two types that have the same name but are from different deps. Either literally completely different deps that happen to be using the same name or (even worse) two different versions of the same dep that appear in your dependency DAG. So declaring something special about the type wouldn't really avoid that I don't think.

Like I said, most cases won't hit it but then to deal with those that do possible choices are:

  • Error out, but then what is the user supposed to do to fix it?
  • Do extra type namespacing where the type is prefixed with the direct dependency that returns it. This might be the least bad option, but it will make already long type names even longer for those cases, and there are degenerate cases where you'd have to do it multiple times and the names would get truly beyond absurd
  • Change SDKs so that each dep is a separate imported pkg; basically piggyback on each language's import namespacing
    • however, this doesn't fix the problem when there's multiple versions of the same dep in the DAG. I wouldn't know what to do there other than full on version resolution, which is a famously hard problem
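
The namespacing-by-prefix option from the list above can be sketched roughly as follows. This is an illustration under assumptions, not how the codegen works: `ExportedType` and `ResolveNames` are hypothetical, and a real implementation would operate on the schema rather than on plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// ExportedType is a hypothetical record: a type name and the direct
// dependency that exports it.
type ExportedType struct {
	Dep  string
	Name string
}

// ResolveNames keeps a bare name when it is unique and prefixes it with
// the dependency name only when two dependencies export the same name.
func ResolveNames(types []ExportedType) []string {
	count := map[string]int{}
	for _, t := range types {
		count[t.Name]++
	}
	out := make([]string, 0, len(types))
	for _, t := range types {
		if count[t.Name] > 1 {
			out = append(out, t.Dep+t.Name)
		} else {
			out = append(out, t.Name)
		}
	}
	sort.Strings(out) // stable output for display
	return out
}

func main() {
	fmt.Println(ResolveNames([]ExportedType{
		{Dep: "A", Name: "Util"},
		{Dep: "B", Name: "Util"},
		{Dep: "A", Name: "Report"},
	}))
}
```

Note that two versions of the same dependency would still produce identical prefixed names here, which is exactly the case the last bullet says prefixing does not solve.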
mellow stone
#

Yeah I acknowledge it opens up dependency hell 😱 I think the main use case has to do with interfaces though. So maybe rather than assuming each module using a type needs to depend on the same version of that type, each can have its own interface and if the type passed in doesn't match it just errors. Might be easier to come up with a real use case/example to solve

#

Trying to pivot it back from "here's the solution I want" to "here's the problem I'm trying to solve"

modern tusk
mellow stone
#

Thoughts on 4. Self calls?

modern tusk
cinder adder
modern tusk
cinder adder
# modern tusk Most useful would be examples of cases where it was a chaos agent (anytime, obvi...

This is a good reference. At this scale (200 lines-ish), you start running into the "state machine" phase of the design: you start wanting your own version of "these functions unlazy these other functions"; you start wondering how monolithic vs. composable you want the individual parts of the workflow to be. The more composable, the more the question becomes important

https://github.com/dagger/agents/blob/main/melvin/main.go

So to be clear, it's not really a "bug"; it's not even really a specific missing feature; it's more of an unknown on "what's the best design, and is it good enough", that gradually looms larger and larger as your module grows.

#

https://github.com/dagger/agents/blob/main/melvin/main.go#L167

func (task *GoProgrammingTask) Source(ctx context.Context) (*dagger.Directory, error) {
    task, err := task.withReviewLoop(ctx)
    if err != nil {
        return nil, err
    }
    return task.Workspace.Dir(), nil
}

This is the "output" function. You call Source, and it gives you the result. Note the .withReviewLoop() call, which is the de facto sync().

mellow stone
#

Wait so in this example you want withReviewLoop to be lazy right? Or is it that lazy is confusing for most people who aren't dagger experts?

#

I'm leaning on that kind of laziness for concurrency in a function I'm working on now but I know the behavior isn't obvious

cinder adder
#

Here I rely on it not being lazy. If I called it via the engine (graphql self call) then I guess it would be lazy

mellow stone
#

Yeah I just realized my use case has the same issue 🤣

worthy grail
#

if so... yeah, this is interesting. we'd probably want to rework parts of our core api if we did this as well, some things are "function"-y, some things are "scalar"-y

#

one big reason this is hard though is this: suppose i have a method Foo.MakeChangeToBaseObject and some fields Foo.A and Foo.B. If I access Foo.A and Foo.B I can, but if I do Foo.MakeChangeToBaseObject.A, then how do we know what that is without a request to the engine? (which needs an await/ctx or equivalent)

#

if we did that suggestion, it feels like all functions (not fields) would have to change to take a ctx and return an error
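
A toy illustration of the problem described above, with no Dagger SDK involved. `Foo`, `ResolveA` and `demo` are hypothetical; `ResolveA` only simulates the engine round trip, which is why it takes a ctx and returns an error.

```go
package main

import (
	"context"
	"fmt"
)

// Foo is hypothetical. A is a snapshot taken when the object was fetched.
type Foo struct {
	A       string
	pending []string // lazy calls recorded since the snapshot
}

// MakeChangeToBaseObject is lazy: it records the call and cannot know
// the new value of A locally, so the snapshot it carries is stale.
func (f Foo) MakeChangeToBaseObject() Foo {
	return Foo{A: f.A, pending: append(append([]string{}, f.pending...), "change")}
}

// ResolveA stands in for a request to the engine.
func (f Foo) ResolveA(ctx context.Context) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	a := f.A
	for _, call := range f.pending {
		a += "+" + call
	}
	return a, nil
}

// demo returns the stale field value next to the resolved one.
func demo() (stale, resolved string, err error) {
	changed := Foo{A: "base"}.MakeChangeToBaseObject()
	resolved, err = changed.ResolveA(context.Background())
	return changed.A, resolved, err
}

func main() {
	fmt.Println(demo())
}
```

Plain field access on the result of a lazy call can only return what was known before the call, so the accurate value has to come from something shaped like `ResolveA`.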

#

It does feel like the issue to solve is that you can't fetch multiple fields at once - you need lots of little queries, which has crappy perf and crappy dx.
I wonder if, instead of making the fields just accessible, we had a better DX for fetching multiple fields.
e.g. maybe in go that could look like being able to pass a custom object in:

result, err := dag.Container().From("alpine").Unpack(struct{
  RootFS *dagger.Directory
  Stdout string
}{})
if err != nil { /* error handling */ }
_, _ = result.RootFS, result.Stdout // do things with result
#

^ possible now with go generics 👀
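
One way the struct-as-selection idea might work under the hood is to derive the list of fields to request from the struct type, so that all of them go out in a single query. The sketch below is an assumption-heavy illustration: `Selection` is a hypothetical helper, the lower-camel-case naming rule is a guess, and `string` stands in for `*dagger.Directory`.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Selection lists the field names of shape as one selection set,
// lower-casing the first letter of each Go field name.
func Selection(shape any) (string, error) {
	t := reflect.TypeOf(shape)
	if t == nil || t.Kind() != reflect.Struct {
		return "", fmt.Errorf("shape must be a struct, got %v", t)
	}
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		name := t.Field(i).Name
		names = append(names, strings.ToLower(name[:1])+name[1:])
	}
	return "{ " + strings.Join(names, " ") + " }", nil
}

func main() {
	sel, err := Selection(struct {
		RootFS string // stand-in for *dagger.Directory
		Stdout string
	}{})
	fmt.Println(sel, err)
}
```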

worthy grail
#

then if another module returns Foo, it gets stitched into its own schema - but as an interface, instead of that exact type, with the implementation going to the original one.
then, if you have conflicting interfaces (e.g. because of different versions of a dep for example), then you error - but you can error with a really nice message, something like "v0.0.X has Foo.Thing(a: int) int, but v0.0.Y has Foo.Thing(b: float) float"

#

the underlying implementation would always forward the fields/functions to the original creator module - this avoids unmarshalling the json fields into two different versions, which forces module authors to think about the version compat of their own internal representation which sucks.

#

what's really cool about this is that as long as the returned interface is a super-set of at least the current interface (e.g. suppose my dependency has upgraded Foo so it has a Thing2 method, but I haven't upgraded to that), then I can still consume that object, I just don't get access to Thing2.
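
The superset rule and the "really nice message" on conflict could be sketched like this. It is only an illustration: `Interface` and `CheckCompatible` are hypothetical, and signatures are compared as plain strings rather than as real schema types.

```go
package main

import (
	"fmt"
	"sort"
)

// Interface maps a function name to its signature,
// e.g. "Thing" -> "(a: int) int". Both are hypothetical.
type Interface map[string]string

// CheckCompatible returns an error if got is missing, or disagrees on,
// any function that want declares. Extra functions in got are ignored,
// so a superset is accepted.
func CheckCompatible(typeName string, want, got Interface) error {
	names := make([]string, 0, len(want))
	for name := range want {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic error order
	for _, name := range names {
		gotSig, ok := got[name]
		if !ok {
			return fmt.Errorf("%s.%s is missing from the returned object", typeName, name)
		}
		if gotSig != want[name] {
			return fmt.Errorf("consumer expects %s.%s%s, but provider has %s.%s%s",
				typeName, name, want[name], typeName, name, gotSig)
		}
	}
	return nil
}

func main() {
	mine := Interface{"Thing": "(a: int) int"}
	upgraded := Interface{"Thing": "(a: int) int", "Thing2": "() string"}
	changed := Interface{"Thing": "(b: float) float"}

	fmt.Println(CheckCompatible("Foo", mine, upgraded)) // superset: accepted
	fmt.Println(CheckCompatible("Foo", mine, changed))  // conflict: descriptive error
}
```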

#

obviously, this kinda gets a bit screwed up if the module completely changes behavior between versions, but keeps the same interface - but this is unavoidable if we want this feature at all

#

IMO, out of all the items suggested, I think this is the one I would want us to pin down and try and design first?
I think it's the most unclear one, with pretty huge implications across the ecosystem, and fairly far-reaching consequences through the engine.

cinder adder
#

we would have avoided so many of these issues

cinder adder
modern tusk
modern tusk
# worthy grail It does feel like the issue to solve is that you can't fetch multiple fields at ...

Agree this is the way to go, I tried an experiment with that at one point. My other thought at the time was to also do something with type embedding so users could just do e.g.

result, err := dag.Container().From("alpine").Unpack(struct{
  dagger.ContainerRootFS
  dagger.ContainerStdout
}{})

With ContainerRootFS and ContainerStdout being autogenerated types like

type ContainerRootFS struct {
  RootFS *dagger.Directory
}

type ContainerStdout struct {
  Stdout string
}

So then users get pretty close to full type safety (i.e. don't have to remember and write the names/types of the fields themselves). Would be a meaningful addition to our codegen complexity though of course.

worthy grail
#

that's kinda interesting

modern tusk
#

Also, because of the restrictions on not adding type params to individual methods in go, it technically would have to be something more like:

result, err := dagSelect(dag.Container().From("alpine"), struct{
  dagger.ContainerRootFS
  dagger.ContainerStdout
}{})

but same idea either way. I'm not holding out hope for go ever fixing that problem based on the issue for it, so even though it's slightly uglier we can file that under the "when in rome" doctrine
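
To make the constraint concrete: Go methods cannot declare their own type parameters, so the generic entry point has to be a package-level function. A rough sketch under assumptions follows. `Source`, `Select`, `WithExitCode` and `WithStdout` are all hypothetical, and `Source` is just a map standing in for an object that would really be resolved by the engine.

```go
package main

import (
	"fmt"
	"reflect"
)

// Source is a hypothetical stand-in for a lazy object; here it simply
// holds already-resolved values keyed by field name.
type Source map[string]any

// Hypothetical generated field groups, in the spirit of
// ContainerRootFS and ContainerStdout.
type WithExitCode struct{ ExitCode int }
type WithStdout struct{ Stdout string }

// Select is package-level because a method could not take T.
// It fills every exported field of shape, including embedded ones.
func Select[T any](src Source, shape T) (T, error) {
	out := shape
	v := reflect.ValueOf(&out).Elem()
	if v.Kind() != reflect.Struct {
		return out, fmt.Errorf("shape must be a struct, got %s", v.Kind())
	}
	return out, fill(v, src)
}

func fill(v reflect.Value, src Source) error {
	for i := 0; i < v.NumField(); i++ {
		f := v.Type().Field(i)
		if !f.IsExported() {
			continue // unexported fields cannot be set via reflection
		}
		if f.Anonymous && f.Type.Kind() == reflect.Struct {
			if err := fill(v.Field(i), src); err != nil {
				return err
			}
			continue
		}
		val, ok := src[f.Name]
		if !ok {
			return fmt.Errorf("no field %q on source", f.Name)
		}
		rv := reflect.ValueOf(val)
		if !rv.IsValid() || !rv.Type().AssignableTo(f.Type) {
			return fmt.Errorf("field %q has the wrong type", f.Name)
		}
		v.Field(i).Set(rv)
	}
	return nil
}

func main() {
	src := Source{"ExitCode": 0, "Stdout": "hello"}
	res, err := Select(src, struct {
		WithExitCode
		WithStdout
	}{})
	fmt.Println(res.Stdout, res.ExitCode, err)
}
```

Because the embedded fields are promoted, the caller reads `res.Stdout` directly without having typed the field name or its type anywhere.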

cinder adder
modern tusk
cinder adder
#

avoided delayed

#

I'm just waiting for @deft radish to merge client generators, then it will be super easy for me to make alternative generators to put my money where my mouth is

#

that makes me realize, one cool side effect of decoupling client generators, is that I could use the same "API extension" logic from the Go SDK, but use an alternative Go client generator

modern tusk
# worthy grail I wonder if we could create "auto-interfaces" for objects passed between modules...

if you have conflicting interfaces (e.g. because of different versions of a dep for example), then you error - but you can error with a really nice message, something like "v0.0.X has Foo.Thing(a: int) int, but v0.0.Y has Foo.Thing(b: float) float"
That's dependency hell though, what is the user supposed to do to fix that?

the underlying implementation would always forward the fields/functions to the original creator module - this avoids unmarshalling the json fields into two different versions, which forces module authors to think about the version compat of their own internal representation which sucks.
That's how interfaces work today thankfully

what's really cool about this is that as long as the returned interface is a super-set of at least the current interface (e.g. suppose my dependency has upgraded Foo so it has a Thing2 method, but I haven't upgraded to that), then I can still consume that object, I just don't get access to Thing2.
Yeah I'm fully onboard with this idea because of stuff like that, it seems just strictly better for all types from other modules or core to be interfaces.

But I think it doesn't avoid the choice I mentioned here in terms of namespacing: #1344474673840394360 message

Honestly I'd just be fine w/ the namespacing-by-prefix solution. There would be cases where the names get really outrageous if types pass through multiple layers of deps where there was conflict that resulted in an extra prefix being needed. But I don't imagine it would be all that common to arise and users can always use type aliases to alleviate the pain for consumers of their module

worthy grail
#

That's dependency hell though, what is the user supposed to do to fix that?
😛 yeah, not a lot - ideally we'd be able to give hints as to what's going on and suggest updates... but if we want this feature, i don't see a way round it 😦

modern tusk
#

Which I'd honestly prefer since then at least it doesn't drop you into dependency hell errors, you just get really ugly autogenerated names in corner cases

worthy grail
#

fair. but then it's really not actually that useful imo - you can return those types, but they won't work with each other at all. if i return a Util type from A and B takes a Util type, there's no way to plug them into each other (since one is UtilA and the other is UtilB)

#

you can only really return types and consume them, it's much harder to actually construct and pass them in

worthy grail
modern tusk
worthy grail
#

oh i see

#

right, we're suggesting both options

#

sorry, getting late here, head fuzzy with interfaces 😛

modern tusk
# worthy grail right, we're suggesting both options

Yes for sure, today you can technically accomplish this by defining lots of custom interfaces, it's just that doing that every time for every type is a huge PITA (and there are the other limitations around interfaces, like not being able to have your own types satisfy your own interfaces, which will be fixed by self call support)

cinder adder
#

Should we create issues for each of these, and move the convos there, if we agree they all make sense?

worthy grail
#

like, in a graph of deps with types exported through each other, who sees what, what names, and then what "implied links" are there (from modules that don't explicitly reference each other, but do so through the derived interface of a dependency module 😱)

#

yeah, i think that's what's tripping me up - we'd be introducing a "new" type of implied dependency between modules, where a module can end up calling another module that it doesn't explicitly depend on - but one that the developer hasn't explicitly used interfaces to declare