#DagQL + Buildkit cache misses
I'm also seeing some unexpected cache misses at the Buildkit level:
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=37cd3dd1329dbf3f
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=76d6a21b74e63135
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=7e87c5849bbd02de
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=a05239199ce5b6d9
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=8f7477785b3aecb5
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=4a388795811a5533
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=a85d1d7446140068
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=21976696df2c8882
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=e3a7c1208ba9f038
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=547ba60c9f790af3
- https://dagger.cloud/dagger/traces/ff74ad4609f4e7bd800b5c86f462e15f?span=f67b3545e7e575f9
These are the same vertex digest and are created by the same (albeit repeated) API call, as evidenced by the new way this shows up with the cause/effect tracking, where I would normally expect just a single entry here:
Huh, I'm not 100% sure, but I think this may have been fixed in https://github.com/dagger/dagger/pull/7823
That PR fixed this behavior, which was accidentally introduced earlier that day, so you may just have been unlucky enough to be running on a commit from that time range?
Think you might be right - it looks OK on main: https://dagger.cloud/dagger/traces/0ce430659db68f24b38e17a105bb73f5?span=64199e6177011bb5
phew!
still a bit odd that the buildkit caches were busted, but maybe things were GCed when the session went away?
Yeah
I'll poke at that a little bit to make sure there's not something deeper with unexpected cache invalidations across sessions going on, but pruning is a possible explanation.
I also can't tell yet whether the fact that everything still worked in this scenario is a good or a bad thing. I guess "resilience" is arguably a nice property, in that things still worked but were not optimal, but it might be worth adding a check that nested clients aren't initializing a new session, and at least logging loudly if that unexpectedly happens.
yeah - DagQL has unit tests for caching, but I guess we want to also externally verify it. We might even want a test that asserts Buildkit-level caching is working, to make sure DagQL level caching isn't covering for it? 🤷‍♂️
thanks for poking at it!
The cache misses you were seeing in apk add git were also from the run with these engine logs, right? https://dagger.cloud/dagger/traces/14bee421794b09d2012b0a65aed7e286?span=59de326987a2595b&logs#59de326987a2595b:L2083
If so, there is still only one log for actually running that container: https://dagger.cloud/dagger/traces/14bee421794b09d2012b0a65aed7e286?span=59de326987a2595b&logs#59de326987a2595b:L102
Which would suggest it is indeed being cached by buildkit, despite all the session closes and opens happening
What's weird is you can see differences in the log output, and the spans are all named 'exec foo', so it seems like it actually is running each time