caching math | Dagger

zinc quarry Mar 11, 2024, 5:07 PM

#

oh they're on different nodes

#

That, Dagger will not handle. And caching will not be shared in real time

proven mesa Mar 11, 2024, 5:08 PM

#

Wait! I don't want to share cache across different nodes

zinc quarry Mar 11, 2024, 5:08 PM

#

(only synced in the background if you use dagger cloud)

proven mesa Mar 11, 2024, 5:08 PM

#

Cache will be local per node

#

Just want to spread the 40 runs on 4 or more nodes

zinc quarry Mar 11, 2024, 5:11 PM

#

I see. Then yeah, if you want Jenkins to route those 40 runs across nodes with maximum flexibility, you'll need those 40 CI entries in the Jenkins config. Runs that land on the same node will indeed share that node's local cache, without any explicit "math"

#

you won't be able to fully optimize with a wrapper dagger run, because that would require deciding in advance which of the 40 jobs are lumped together on one node.

#

so there's a small tradeoff of clustering efficiency vs cache efficiency. But only for relatively cheap lookup operations. The bulk of the caching will work the same with or without wrapper
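
To make the "deciding in advance" point concrete, here is a minimal Go sketch of the kind of static grouping a single wrapper run would need. The `assignRoundRobin` helper and the `project-NN` job names are hypothetical, made up for illustration; they are not part of Dagger or Jenkins.

```go
package main

import "fmt"

// assignRoundRobin statically splits jobs across a fixed number of nodes.
// Hypothetical helper: it fixes the job-to-node grouping up front, which is
// exactly the decision a scheduler like Jenkins would otherwise make at run time.
func assignRoundRobin(jobs []string, nodes int) [][]string {
	if nodes <= 0 {
		return nil
	}
	groups := make([][]string, nodes)
	for i, job := range jobs {
		groups[i%nodes] = append(groups[i%nodes], job)
	}
	return groups
}

func main() {
	// 40 illustrative job names; real names would come from the CI config.
	jobs := make([]string, 40)
	for i := range jobs {
		jobs[i] = fmt.Sprintf("project-%02d", i+1)
	}
	for n, group := range assignRoundRobin(jobs, 4) {
		fmt.Printf("node %d: %d jobs, first=%s\n", n, len(group), group[0])
	}
}
```

With 40 separate CI entries, the scheduler can instead place each run on whichever node is free, at the cost of some cheap cache lookups being repeated.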

#

hopefully I am making sense 😁

proven mesa Mar 11, 2024, 5:13 PM

#

Perfect. But now that I think about it, even if there are some cache misses, it won't "matter" too much. Well, only for the first batch; after that the cache will be hydrated.
And the node_modules cache will be hot anyway.
And git will already have most of the git objects there, so worst case it will download a few changes twice.

unborn warren Mar 11, 2024, 5:14 PM

#

What about cache volumes? would they conflict because they have the same name?

zinc quarry Mar 11, 2024, 5:14 PM

#

cache volumes at the moment are never namespaced so you should always assume there is a conflict, and design your pipelines accordingly
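
One way to "design accordingly" when two pipelines must not touch the same volume is to build the distinguishing parts into the volume name yourself. The sketch below is only an assumption about how that could be done: `cacheVolumeName` is a hypothetical helper, and the resulting string would be whatever name you pass when creating the cache volume.

```go
package main

import (
	"fmt"
	"strings"
)

// cacheVolumeName joins its parts into a single volume name.
// Hypothetical helper: since volumes are not namespaced automatically,
// the caller adds the namespace (project, toolchain, ...) by hand.
func cacheVolumeName(parts ...string) string {
	cleaned := make([]string, 0, len(parts))
	for _, p := range parts {
		p = strings.ToLower(strings.TrimSpace(p))
		p = strings.ReplaceAll(p, "/", "-")
		if p != "" {
			cleaned = append(cleaned, p)
		}
	}
	return strings.Join(cleaned, "-")
}

func main() {
	// Same purpose, different projects: two distinct names, so no conflict.
	fmt.Println(cacheVolumeName("build-output", "web/frontend"))
	fmt.Println(cacheVolumeName("build-output", "api"))
	// A deliberately shared, generic cache keeps one fixed name.
	fmt.Println(cacheVolumeName("npm-cache"))
}
```

Anything that is safe to share between runs can keep a single fixed name; only state that would conflict needs the extra parts.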

unborn warren Mar 11, 2024, 5:15 PM

#

Right that's what I thought..

proven mesa Mar 11, 2024, 5:15 PM

#

Oh, you are making sense! Don't worry.
I'm just "over-engineering" it sometimes.
I started out using Go's concurrency after the common steps, but then I thought this would be hard to spread across nodes, so I'm adapting to a single "project" per run

(For cache volumes I use the "Shared" option, and it's for a generic cache, e.g. npm's cache for node_modules, not the actual node_modules.)

unborn warren Mar 11, 2024, 5:16 PM

#

ah.. ok so if you did use cache volumes for actual node_modules, you'd have issues. IDK how well npm handles a mutating cache.
