a little bit - it means I can't do | Dagger | Page 1

ruby frigate Feb 26, 2025, 9:00 PM

#

Yeah happy to wire it up to honeycomb if you want! I don't need anything debugged on my end so it would be strictly for your research purposes

stone shore Feb 26, 2025, 9:00 PM

#

thanks!

ruby frigate Feb 26, 2025, 9:01 PM

#

Actually I'm about to PR this code so maybe that's easier?

stone shore Feb 26, 2025, 9:01 PM

#

actually tbh I can probably just poke through the data in clickhouse

#

so feel free to punt

#

that way i'm not reliant on LLM making the exact same choices 😛

#

i just need to take a closer look at the span attributes

#

oh wait. i did a whole debug UI. good job past alex

they do indeed have different effect IDs 😭

https://v3.dagger.cloud/kpenfound/traces/3dabde63ed3a0b6c3cfe80b69881f022?listen=6a1fad3e2b3fec60&listen=6b3bdb9b5dbe28f6&listen=258644302e896d58&listen=811cf5e7e48dbeb5&showHidden=f73a7d9cc2efd45c&showHidden=1a880d1ed0669f4a&span=dff09196d48b1955&debug

  "EffectIDs": [
    "sha256:3095f90aaeacb47f9f4be5b1b2292f39fd2cc142e499bf02bc40319b4d1ff8e1"
  ],

vs.

  "EffectIDs": [
    "sha256:eb2fd9ae179c7f8e8971dcee6f734713f4e6e91a83b891bafca0df962f30d5d8"
  ],

#

cc @subtle pelican @rare hornet not urgent, but while we're ship-of-theseusing out of Buildkit - would it be possible to expose the actual cache key that Buildkit uses for deduping?

Here's the scenario:

Container A ran against directory.withNewFile("a", "foo")
Container B ran against directory.withNewFile("a", "foo").withNewFile("a", "foo") (same content)
Container A ran and failed, logs showed up at the withExec like normal
Container B "ran" but its withExec just shows pending and has no logs, yet its stdout returned the error immediately

So: different call ID (unavoidable really), but also a different LLB vertex digest (sad panda), and Buildkit's solver must have deduped based on the actual cache key. We never see a span for Container B's effect ID (aka vertex digest).
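A minimal sketch of why the two digests can disagree in the scenario above. Everything here is hypothetical: `Dir`, `RecipeDigest`, and `ContentDigest` are toy stand-ins, not Dagger or Buildkit APIs. Hashing the chain of operations stands in for the vertex digest, and hashing the resulting files stands in for the content-based cache key.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// Dir is a toy directory (hypothetical, not a Dagger type): the ordered
// operations that built it (the "recipe") plus the resulting files.
type Dir struct {
	ops   []string
	files map[string]string
}

// WithNewFile returns a copy of d with the file written and the op recorded.
func (d Dir) WithNewFile(path, contents string) Dir {
	files := make(map[string]string, len(d.files)+1)
	for k, v := range d.files {
		files[k] = v
	}
	files[path] = contents
	ops := append(append([]string{}, d.ops...),
		fmt.Sprintf("withNewFile(%q,%q)", path, contents))
	return Dir{ops: ops, files: files}
}

// RecipeDigest hashes the operation chain (stand-in for a vertex digest).
func (d Dir) RecipeDigest() string {
	sum := sha256.Sum256([]byte(strings.Join(d.ops, "\n")))
	return fmt.Sprintf("sha256:%x", sum)
}

// ContentDigest hashes the resulting files in sorted path order
// (stand-in for a content-based cache key).
func (d Dir) ContentDigest() string {
	paths := make([]string, 0, len(d.files))
	for p := range d.files {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	h := sha256.New()
	for _, p := range paths {
		fmt.Fprintf(h, "%q=%q\n", p, d.files[p])
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil))
}

func main() {
	a := Dir{}.WithNewFile("a", "foo")
	b := Dir{}.WithNewFile("a", "foo").WithNewFile("a", "foo")
	fmt.Println("recipe digests equal: ", a.RecipeDigest() == b.RecipeDigest())
	fmt.Println("content digests equal:", a.ContentDigest() == b.ContentDigest())
}
```

In this toy model the recipe digests differ because B's chain has one more op, while the content digests match. Anything keyed on the recipe digest never sees work that was deduped on the content key.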

ruby frigate Feb 26, 2025, 9:08 PM

#

Here's the code where that happens https://github.com/dagger/agents/pull/12/files#diff-f4a19aab2fda91e219a5dc0ac51d6c4cd43b2f9f427ce9b670ff5c83f3dee927R57

I think all of our workspace modules have worked the same way. So we could flatten the files/directories but the core issue remains

subtle pelican Feb 26, 2025, 10:07 PM

#

stone shore cc @subtle pelican @rare hornet not urgent, but while we're ship-...

Well I don't want to spend much (or ideally any) time/effort exposing more buildkit stuff vs. just replacing it with our own stuff. But if you want to see if there's some quick hack to expose that cache key you could just apply it to the dagger/buildkit fork. I think the cache key you want is the one used in this interface: https://github.com/sipsma/buildkit/blob/a1a1cc0e6c4538424d619c8c454186a2f9061e9f/solver/cachemanager.go#L64-L64

But that interface is extremely convoluted, so I couldn't say off the top of my head precisely where to hook in. There's some internal docs on parts of it here

In terms of the new Theseus world, this will all just be handled by call ID digests. Either an ID can have multiple digests (i.e. the recipe digest vs content-based one) or we "dag-ify everything" so that there's no "built-in" notion of laziness and what's modeled today w/ effect IDs just become different fields. Sort of hard to describe quickly, not sure if that makes sense 😅

#

Justin and I have two possible approaches in mind for how to deal with all the stuff today that is lazy LLB and uses effect IDs. We haven't reached the point where we've hit that crossroads yet, but making sure telemetry can deal with it in a headache-free way will obviously be an important factor

stone shore Feb 26, 2025, 10:18 PM

#

subtle pelican Well I don't want to spend much (or ideally any) time/effort exposing more build...

cool, ty for the braindump - don't need to do anything now really, it's just a very particular scenario so I was curious how it might be handled in the brave new world

subtle pelican Feb 26, 2025, 10:21 PM

#

As an addendum, for this very particular scenario I actually think we'd be able to normalize pretty easily so that directory.withNewFile("a", "foo").withNewFile("a", "foo") just straight up returns the same ID as directory.withNewFile("a", "foo"). The withNewFile impl would just see "I'm the same operation as the previous one, nothing to do, just return parent dagql.Instance[core.Directory]"

But those sorts of normalizations are case by case, can't possibly handle all of them like that, so the more general problem stands
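A rough sketch of that kind of normalization, under loud assumptions: `Directory` below is a hypothetical toy that only tracks an ID string and the last op applied. It is not `dagql.Instance[core.Directory]`, and the real withNewFile impl may look nothing like this.

```go
package main

import "fmt"

// Directory is a hypothetical stand-in: it tracks only an ID and the
// last operation that produced it.
type Directory struct {
	id     string
	lastOp string
}

// WithNewFile returns the parent unchanged when the requested op is
// identical to the one that produced the parent. Writing the same file
// with the same contents twice is a no-op in this toy model.
func (d Directory) WithNewFile(path, contents string) Directory {
	op := fmt.Sprintf("withNewFile(%q,%q)", path, contents)
	if d.lastOp == op {
		return d // same operation as the previous one, nothing to do
	}
	return Directory{id: d.id + "." + op, lastOp: op}
}

func main() {
	root := Directory{id: "directory"}
	once := root.WithNewFile("a", "foo")
	twice := root.WithNewFile("a", "foo").WithNewFile("a", "foo")
	fmt.Println("same ID:", once.id == twice.id)
}
```

This only catches an op repeated immediately after itself. Something like withNewFile("a", "foo").withNewFile("b", "x").withNewFile("a", "foo") still gets a new ID, which is the case-by-case limitation.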
