#caching math
oh they're on different nodes
that, Dagger will not handle. And caching will not be shared in real time
wait! I don't want to share the cache across different nodes
(only synced in the background if you use Dagger Cloud)
I see. Then yeah, if you want Jenkins to route those 40 runs across nodes with maximum flexibility, you'll need those 40 CI entries in the Jenkins config. Caching will indeed be shared locally on each node without explicit "math"
you won't be able to fully optimize with a wrapper dagger run, because that would require deciding in advance which of the 40 jobs are lumped together on one node.
so there's a small tradeoff of clustering efficiency vs cache efficiency. But only for relatively cheap lookup operations. The bulk of the caching will work the same with or without wrapper
hopefully I am making sense 😁
Perfect, but now that I think about it, even if there are some cache misses, it won't "matter" too much. Well, only for the first batch; after that the cache will be hydrated.
And node_modules cache will be hot anyway.
And git will also have most of the git objects there, so worst case it will download a few changes twice.
What about cache volumes? Would they conflict because they have the same name?
cache volumes at the moment are never namespaced so you should always assume there is a conflict, and design your pipelines accordingly
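One way to "design accordingly" is to put the namespace into the cache volume name yourself. Rough sketch only, plain Go with no Dagger calls: `cacheKey` and the example names are made up for illustration, and the assumption is that the resulting string is what you'd pass as the key when creating the cache volume (e.g. `client.CacheVolume(key)` in the Go SDK).

```go
package main

import (
	"fmt"
	"strings"
)

// cacheKey builds a cache volume name from a base name plus optional
// namespace parts (e.g. a project path). Hypothetical helper, not part
// of Dagger: it only produces the string you would use as the volume key.
func cacheKey(base string, parts ...string) string {
	segs := []string{base}
	for _, p := range parts {
		p = strings.TrimSpace(p)
		if p == "" {
			continue // skip empty namespace parts
		}
		// Normalize so "apps/web" and "Apps/Web" map to the same volume.
		segs = append(segs, strings.ToLower(strings.ReplaceAll(p, "/", "-")))
	}
	return strings.Join(segs, "-")
}

func main() {
	// Generic cache that is meant to be shared: no namespace.
	fmt.Println(cacheKey("npm-cache"))
	// Per-project state that must not collide: namespaced by project.
	fmt.Println(cacheKey("node-modules", "apps/web"))
}
```

Volumes that are safe to share keep a plain name; anything project-specific gets the project in its name so two pipelines never mount the same volume by accident.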
Right that's what I thought..
Oh you are making sense! Don't worry.
I'm just "over-engineering" it sometimes.
I started with using Go's concurrency after the common steps, but then I thought that this would be hard to spread across nodes, so I'm adapting to a single "project" per run
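For context, the two shapes look roughly like this. Just a sketch under assumptions: `buildProject` stands in for the real per-project steps, and the `PROJECT` env var is a made-up way for a Jenkins job to pick one project per run.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// buildProject stands in for the per-project pipeline steps (hypothetical).
func buildProject(name string) string {
	return "built " + name
}

// runAll fans the projects out over goroutines inside one process,
// which means they all run on whichever node started that process.
func runAll(projects []string) []string {
	results := make([]string, len(projects))
	var wg sync.WaitGroup
	for i, p := range projects {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			results[i] = buildProject(p) // each goroutine writes its own slot
		}(i, p)
	}
	wg.Wait()
	return results
}

// selectProjects returns only the requested project when one is given,
// so the CI system can schedule one project per run on any node.
func selectProjects(all []string, only string) []string {
	if only == "" {
		return all
	}
	return []string{only}
}

func main() {
	all := []string{"api", "web", "docs"}
	// PROJECT is an assumed env var set by each CI job.
	for _, r := range runAll(selectProjects(all, os.Getenv("PROJECT"))) {
		fmt.Println(r)
	}
}
```

With `PROJECT` unset this is the "wrapper" shape (everything on one node); with it set, each of the CI entries runs one project and the scheduler decides the node.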
(For cache volumes I use the "Shared" option, and it's for generic caches, e.g. npm's download cache for node_modules, not the actual node_modules.)
ah.. ok so if you did use cache volumes for actual node_modules, you'd have issues. IDK how well npm handles a mutating cache.