#a little bit - it means I can't do
Yeah happy to wire it up to honeycomb if you want! I don't need anything debugged on my end so it would be strictly for your research purposes
thanks!
Actually I'm about to PR this code so maybe that's easier?
actually tbh I can probably just poke through the data in clickhouse
so feel free to punt
that way i'm not reliant on LLM making the exact same choices
i just need to take a closer look at the span attributes
oh wait. i did a whole debug UI. good job past alex
they do indeed have different effect IDs 😭
"EffectIDs": [
"sha256:3095f90aaeacb47f9f4be5b1b2292f39fd2cc142e499bf02bc40319b4d1ff8e1"
],
vs.
"EffectIDs": [
"sha256:eb2fd9ae179c7f8e8971dcee6f734713f4e6e91a83b891bafca0df962f30d5d8"
],
cc @subtle pelican @rare hornet not urgent, but while we're ship-of-theseusing out of Buildkit - would it be possible to expose the actual cache key that Buildkit uses for deduping?
Here's the scenario:
- Container A ran against directory.withNewFile("a", "foo")
- Container B ran against directory.withNewFile("a", "foo").withNewFile("a", "foo") (same content)
- Container A ran and failed, logs showed up at the withExec like normal
- Container B "ran" but its withExec just shows pending and has no logs, yet its stdout returned the error immediately
So: different call ID (unavoidable really), but also a different LLB vertex digest (sad panda), and Buildkit's solver must have deduped based on the actual cache key. We never see a span for Container B's effect ID (aka vertex digest).
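To make the scenario concrete, here's a toy model in Go of a solver that dedupes on a content-based cache key instead of the vertex digest. This is not Buildkit's API: `vertex`, `solver`, `effectID`, `cacheKey` and the recipe/content strings are all made up for illustration, and the real cache key computation is far more involved. It only shows why Container B's effect ID never gets a span.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// vertex is a hypothetical stand-in for an LLB vertex.
type vertex struct {
	recipe  string // the chain of operations that produced the input
	content string // what the input actually contains
}

func digest(s string) string {
	return fmt.Sprintf("sha256:%x", sha256.Sum256([]byte(s)))
}

// effectID mimics a vertex digest: it changes whenever the recipe changes.
func (v vertex) effectID() string { return digest(v.recipe) }

// cacheKey mimics a content-based key: it depends only on the content.
func (v vertex) cacheKey() string { return digest(v.content) }

// solver runs each distinct cache key once and remembers which effect IDs
// actually executed (and therefore got a span).
type solver struct {
	results map[string]string // cache key -> result
	spans   map[string]bool   // effect IDs that emitted a span
}

func newSolver() *solver {
	return &solver{results: map[string]string{}, spans: map[string]bool{}}
}

func (s *solver) solve(v vertex) string {
	key := v.cacheKey()
	if res, ok := s.results[key]; ok {
		return res // deduped: no span is emitted for v.effectID()
	}
	s.spans[v.effectID()] = true
	res := "exit 1: " + v.content // pretend the exec failed
	s.results[key] = res
	return res
}

func main() {
	a := vertex{recipe: `withNewFile("a","foo")`, content: "a=foo"}
	b := vertex{recipe: `withNewFile("a","foo").withNewFile("a","foo")`, content: "a=foo"}
	s := newSolver()
	fmt.Println(s.solve(a) == s.solve(b))     // both get the same error back
	fmt.Println(a.effectID() == b.effectID()) // but their effect IDs differ
	fmt.Println(s.spans[b.effectID()])        // and B's effect ID never got a span
}
```

If the cache key were exposed, telemetry could join on it instead of the effect ID and attach A's logs to B's withExec.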
Here's the code where that happens https://github.com/dagger/agents/pull/12/files#diff-f4a19aab2fda91e219a5dc0ac51d6c4cd43b2f9f427ce9b670ff5c83f3dee927R57
I think all of our workspace modules have worked the same way. So we could flatten the files/directories but the core issue remains
Well I don't want to spend much (or ideally any) time/effort exposing more buildkit stuff vs. just replacing it with our own stuff. But if you want to see if there's some quick hack to expose that cache key you could just apply it to the dagger/buildkit fork. I think the cache key you want is the one used in this interface: https://github.com/sipsma/buildkit/blob/a1a1cc0e6c4538424d619c8c454186a2f9061e9f/solver/cachemanager.go#L64-L64
But that interface is extremely convoluted, so I couldn't say off the top of my head precisely where to hook in. There's some internal docs on parts of it here
In terms of the new Theseus world, this will all just be handled by call ID digests. Either an ID can have multiple digests (i.e. the recipe digest vs content-based one) or we "dag-ify everything" so that there's no "built-in" notion of laziness and what's modeled today w/ effect IDs just become different fields. Sort of hard to describe quickly, not sure if that makes sense
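A rough sketch of what the "an ID can have multiple digests" option could mean for telemetry, in Go. Everything here is hypothetical (`callID`, `spanIndex`, the field names, the digest strings); it's not the planned design, just one way to picture looking a span up by either digest.

```go
package main

import "fmt"

// callID is a hypothetical ID that carries more than one digest.
type callID struct {
	recipeDigest  string // derived from the chain of calls
	contentDigest string // derived from the result; "" if not known yet
}

// spanIndex lets telemetry find a span under any digest an ID is known by.
type spanIndex struct {
	byDigest map[string]string // digest -> span name
}

func newSpanIndex() *spanIndex {
	return &spanIndex{byDigest: map[string]string{}}
}

// record registers a span under every non-empty digest of the ID.
func (ix *spanIndex) record(id callID, span string) {
	for _, d := range []string{id.recipeDigest, id.contentDigest} {
		if d != "" {
			ix.byDigest[d] = span
		}
	}
}

// lookup tries the recipe digest first, then falls back to the content digest.
func (ix *spanIndex) lookup(id callID) (string, bool) {
	if span, ok := ix.byDigest[id.recipeDigest]; ok {
		return span, true
	}
	if id.contentDigest == "" {
		return "", false
	}
	span, ok := ix.byDigest[id.contentDigest]
	return span, ok
}

func main() {
	a := callID{recipeDigest: "recipe-A", contentDigest: "content-X"}
	b := callID{recipeDigest: "recipe-B", contentDigest: "content-X"}

	ix := newSpanIndex()
	ix.record(a, "Container A withExec") // only A actually ran

	span, ok := ix.lookup(b) // B is found through the shared content digest
	fmt.Println(span, ok)
}
```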
Justin and I have two possible approaches in mind for how to deal with all the stuff today that is lazy LLB and uses effect IDs. We haven't reached the point where we've hit that crossroads yet, but making sure telemetry can deal with it in a headache-free way will obviously be an important factor
cool, ty for the braindump - don't need to do anything now really, it's just a very particular scenario so I was curious how it might be handled in the brave new world
As an addendum, for this very particular scenario I actually think we'd be able to normalize pretty easily so that directory.withNewFile("a", "foo").withNewFile("a", "foo") just straight up returns the same ID as directory.withNewFile("a", "foo"). The withNewFile impl would just see "I'm the same operation as the previous one, nothing to do, just return parent dagql.Instance[core.Directory]"
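Sketched in Go, that normalization might look something like this. The `directory` type and its `id`/`lastOp` fields are hypothetical simplifications, not the real core.Directory or dagql.Instance; the point is only that a repeated identical call returns the parent unchanged, so the ID stays the same.

```go
package main

import "fmt"

// directory is a hypothetical, heavily simplified stand-in for core.Directory.
type directory struct {
	id     string // recipe-style ID built from the chain of calls
	lastOp string // the operation that produced this directory
}

// withNewFile returns the receiver itself when it would repeat the exact
// operation that produced it, so the caller gets the same ID back.
func (d *directory) withNewFile(path, contents string) *directory {
	op := fmt.Sprintf("withNewFile(%q, %q)", path, contents)
	if d.lastOp == op {
		return d // same operation as the previous one: nothing to do
	}
	return &directory{id: d.id + "." + op, lastOp: op}
}

func main() {
	root := &directory{id: "directory"}
	once := root.withNewFile("a", "foo")
	twice := once.withNewFile("a", "foo")

	fmt.Println(once.id)             // directory.withNewFile("a", "foo")
	fmt.Println(twice.id == once.id) // true
}
```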
But those sorts of normalizations are case by case, we can't possibly handle all of them like that, so the more general problem stands