#@vito maybe a stupid idea. Have we ever


timber pumice
#

What's the goal? (And what do you mean by serialized state?)

real raft
#

the serialized fields of the object, that the engine collects to persist objects when chaining calls

#

no specific short term goal, was just pondering my experience instantiating my module's types; and whether there are lurking DX improvements there

#

for example we have loadXXXfromID at the root query which is horrible

#

and no generic new for a given type

real raft
#

and separately, the same exact object might be instantiated by different chains of calls - leading to different IDs for the same object

real raft
#

but what about foo.bar() with:

func (foo Foo) Bar() *Container {
  return dag.Container().From("alpine:latest")
}
timber pumice
#

should work the same way yea

#

IMO recipe-addressed content is strictly better than content-addressed values, since it gives you the magical power to derive the value if it's missing in the destination. It also lets us be as lazy as possible. One of the DagQL experiments I tried was to have queries like foo { bar { baz { id } } } actually just statically return the ID without even evaluating foo.bar.baz, so things can be evaluated with maximal parallelism later
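To illustrate the "statically return the ID" idea above, here is a minimal sketch in which an ID is derived purely from the chain of calls, so nothing has to be evaluated to obtain it. The `Call` type and `RecipeID` function are hypothetical names for illustration, not DagQL's actual API, and the real ID encoding is likely richer than a flat hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Call is a hypothetical step in a recipe: a field name plus its arguments.
type Call struct {
	Field string
	Args  []string
}

// RecipeID hashes the chain of calls itself. No call is executed, so the ID
// is available immediately and evaluation can be deferred.
func RecipeID(chain []Call) string {
	h := sha256.New()
	for _, c := range chain {
		fmt.Fprintf(h, "%s(%s);", c.Field, strings.Join(c.Args, ","))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Roughly the shape of: foo { bar { baz { id } } }
	chain := []Call{{Field: "foo"}, {Field: "bar"}, {Field: "baz"}}
	fmt.Println(RecipeID(chain))
}
```

Because the ID depends only on the recipe, two different chains that happen to yield the same value still get different IDs in this sketch, which is the trade-off discussed in this thread.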

#

It's also deeply integrated into the TUI and Cloud UI. I know you're not arguing against it, just pointing out how leveraged it is at the moment 😛

real raft
#

Yeah I can definitely see that

#

I'm wondering if content-addressed has a role though, perhaps complementary to recipe-based IDs?

real raft
#

It seems like a gap that, when different recipes lead to the exact same content, only buildkit knows about that, and dagger doesn't

timber pumice
#

best would be to have both, yeah. I wanted to do this for artifact publishing to Daggerverse

#

basically an ID with an attestation for what content digest it should produce
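One way to picture "an ID with an attestation" is a recipe ID paired with the digest the evaluated content is expected to have. The sketch below assumes SHA-256 digests and uses hypothetical names (`AttestedID`, `Verify`); it is not an existing Dagger or Daggerverse type.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// AttestedID is a hypothetical pairing of a recipe with the content digest
// that evaluating the recipe is expected to produce.
type AttestedID struct {
	Recipe         string
	ExpectedDigest string
}

// ErrDigestMismatch is returned when content does not match the attestation.
var ErrDigestMismatch = errors.New("content does not match attested digest")

// Digest returns the hex-encoded SHA-256 of content.
func Digest(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// Verify checks content produced from the recipe against the attested digest.
func (id AttestedID) Verify(content []byte) error {
	if Digest(content) != id.ExpectedDigest {
		return ErrDigestMismatch
	}
	return nil
}

func main() {
	artifact := []byte("published artifact bytes")
	id := AttestedID{
		Recipe:         `container.from("alpine:latest")`, // illustrative recipe string
		ExpectedDigest: Digest(artifact),
	}
	fmt.Println(id.Verify(artifact))
	fmt.Println(id.Verify([]byte("something else")))
}
```

The recipe stays the primary address, so the value can still be derived when missing, while the digest lets a consumer check that what was derived is what was published.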

real raft
#

(and in some cases even buildkit may not know, in the case of recipes that don't compile to llb at full resolution)

real raft
#

Is it fair to say that we are conflating 1) function memoization and 2) addressable objects?

#

There is loss of information there, and perhaps that information is currently worthless, but in the abstract we are losing information today - right?

#

For example given an object, the engine doesn't know how many different chains returned it

#

(this might just be "Solomon learns DAG 101", sorry)

timber pumice
#

this probably ties into Erik's mention of giving Dagger its own content store and swapping out LLB (or was it integrating with LLB? don't remember)

real raft
#

This traces back specifically to custom ID-able types, which in my defense are very new, and still not fully understood by us mere mortal developers using the platform 🙂

#

and also don't exist in buildkit

real raft
#

it's like we introduced a whole new layer of capabilities, that are incredibly powerful, currently mixed into the pre-existing "almost llb" design, and eventually we will go back and layer it more cleanly once we understand it better

#

Perhaps eventually object ID will be flipped: the ID is the content; and recipes (plural) are attached metadata saying "this is how to reproduce this artifact"
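A rough sketch of that flipped layout, under the assumption that content can be digested at all: the content digest is the key, and every recipe observed to produce it is attached as metadata. It also makes the "how many chains returned this object" question answerable. `Index`, `Record`, and `Recipes` are hypothetical names, not engine APIs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Index maps a content digest to the recipes known to produce that content.
type Index struct {
	recipes map[string][]string
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{recipes: map[string][]string{}}
}

// Record notes that recipe produced content and returns the content digest.
// Recording the same recipe twice for the same content is a no-op.
func (ix *Index) Record(recipe string, content []byte) string {
	sum := sha256.Sum256(content)
	digest := hex.EncodeToString(sum[:])
	for _, known := range ix.recipes[digest] {
		if known == recipe {
			return digest
		}
	}
	ix.recipes[digest] = append(ix.recipes[digest], recipe)
	return digest
}

// Recipes lists every recorded way to reproduce the content with this digest.
func (ix *Index) Recipes(digest string) []string {
	return ix.recipes[digest]
}

func main() {
	ix := NewIndex()
	// Two illustrative call chains that end up with identical content.
	d1 := ix.Record("foo.bar", []byte("same bytes"))
	d2 := ix.Record("baz.qux", []byte("same bytes"))
	fmt.Println(d1 == d2, len(ix.Recipes(d1)))
}
```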

real raft
#

not sure I believe my own BS here 😛

#

since the "content" in that content-addressed ID will most of the time itself be a recipe... Like a Container field in a custom App object...

#

All the way down to actual blobs that we're not permanently storing anyway

#

and therefore cannot be retrieved without a recipe to create them...

#

so not sure what these "not recipe" IDs would even be used for

timber pumice
#

there'll probably be a few details to shake out if/when we start using IDs for persistence (beyond the in-memory cache) - whole different set of concerns there. the ID design is meant to accommodate it (e.g. the 'impure' metadata so we can know not to persist the result), but it hasn't seen a trial by fire yet.

#

the blob() API underlying local file syncs is a good example of an actually content-addressed ID, but as you would expect, it only works if the content is already in the store 😛
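The limitation described here can be sketched as a lookup that fails when the digest is absent, next to a recipe-backed lookup that can rebuild the content. This is a toy model with hypothetical names (`Store`, `Resolve`), not how blob() is implemented.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// Store is a toy content store keyed by content digest.
type Store map[string][]byte

// ErrMissing means the digest is not in the store and no recipe was supplied.
var ErrMissing = errors.New("content not in store and no recipe available")

// Put stores content under its SHA-256 digest and returns the digest.
func (s Store) Put(content []byte) string {
	sum := sha256.Sum256(content)
	digest := hex.EncodeToString(sum[:])
	s[digest] = content
	return digest
}

// Resolve returns the content for digest. A purely content-addressed lookup
// (recipe == nil) fails when the content is absent; with a recipe, the content
// is derived and stored instead.
func Resolve(s Store, digest string, recipe func() []byte) ([]byte, error) {
	if content, ok := s[digest]; ok {
		return content, nil
	}
	if recipe == nil {
		return nil, ErrMissing
	}
	content := recipe()
	s.Put(content)
	return content, nil
}

func main() {
	source := Store{}
	digest := source.Put([]byte("file contents"))

	destination := Store{} // the content was never synced here
	_, err := Resolve(destination, digest, nil)
	fmt.Println(err)

	content, _ := Resolve(destination, digest, func() []byte { return []byte("file contents") })
	fmt.Println(string(content))
}
```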

real raft
#

I have no idea what meta: true/false is, first time I hear about that

timber pumice
#

We don't use it anywhere now, since the only things that needed it before were pipeline() and withFocus(), which we've moved away from

#

The idea was that you could take an ID and remove any 'meta' calls from its DAG to yield a canonical representation that you can use for persistent cache keys, so APIs like pipeline (or I suppose your ideal OTel alternative) don't bust cache keys
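A minimal sketch of that canonicalization, simplified to a linear chain of calls rather than a full DAG: calls flagged as meta are dropped before hashing, so the persistent key is unaffected by them. The `Call` struct, its `Meta` flag, and the `Canonical` and `Key` functions are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Call is a hypothetical step in an ID's recipe. Meta marks calls that only
// annotate the result (for example pipeline or withFocus).
type Call struct {
	Field string
	Arg   string
	Meta  bool
}

// Canonical returns a copy of the chain with all meta calls removed.
func Canonical(chain []Call) []Call {
	out := make([]Call, 0, len(chain))
	for _, c := range chain {
		if !c.Meta {
			out = append(out, c)
		}
	}
	return out
}

// Key hashes a chain of calls into a cache key.
func Key(chain []Call) string {
	h := sha256.New()
	for _, c := range chain {
		fmt.Fprintf(h, "%s(%q);", c.Field, c.Arg)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	plain := []Call{
		{Field: "container"},
		{Field: "from", Arg: "alpine:latest"},
	}
	labeled := []Call{
		{Field: "pipeline", Arg: "build", Meta: true},
		{Field: "container"},
		{Field: "from", Arg: "alpine:latest"},
	}
	// Raw keys differ; canonical keys are meant to match.
	fmt.Println(Key(plain) == Key(labeled))
	fmt.Println(Key(Canonical(plain)) == Key(Canonical(labeled)))
}
```

In this sketch the raw key stands in for the query cache key and the canonical key for the persistent one, so both can coexist for the same ID.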

#

But it's pretty expensive and complicated to do that conversion

#

Interestingly it's a case where you'd want that canonicalization to apply to persistent caches but not the query cache - because pipeline and withFocus actually do affect the in-memory content, at least with how they were implemented before (they would store their value in a field on the object)