#I've been thinking about the layer cache
@solemn onyx in the case of OSV-Scanner, doesn't it automatically handle the caching for you? See: https://google.github.io/osv-scanner/experimental/offline-mode/#specify-database-location
So if you just mount a cache volume at the directory that OSV_SCANNER_LOCAL_DB_CACHE_DIRECTORY points to and always bust the cache when calling the osv-scanner step, you should get behavior similar to running it on your local machine: the database update is delegated to the underlying program, without any Dagger-specific caching around it
that's generally the same way I prefer to handle app dependencies as well.
I'm usually more a fan of adding the /root/.npm/cache folder as a cache volume instead of doing the two-step "copy package.json && npm install" layer caching thing TBH
re: OCI artifact DBs, if the ghcr rate limit is an issue, I'd just set up a pull-through cache and that's it. So basically addressing the issue like any other OCI rate limiting problem
I usually prefer warming the cache upfront if possible to make sure I get consistent results (i.e. build time).
Yeah but for that use case you need offline support right?
Correct. The problem is there is no usable cache key, either in the OCI tag for trivy or in OSV's database. So basically, I always have to download it. Or set an arbitrary TTL for busting the cache.
@solemn onyx there's something I'm not following. In trivy's case for example, shouldn't running
trivy --cache-dir $TRIVY_TEMP_DIR image --download-db-only
tar -cf ./db.tar.gz -C $TRIVY_TEMP_DIR/db metadata.json trivy.db
rm -rf $TRIVY_TEMP_DIR
be enough to download the vulnerability db so future pipelines can use it?
you can schedule that to run as often as you need to keep your Vuln DB updated so your build pipelines use it
why do you necessarily need to have a cache key?
You are right...in that case a cache busting TTL set to 6 hours (or however often the DBs are rebuilt) should work.
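A rough sketch of what such a TTL-based cache buster could look like: derive a key from the current time, rounded down to the TTL window, and pass it to the DB download step (for example as an environment variable) so the step re-runs at most once per window. The function name `ttl_cache_key` and the 6-hour default are assumptions for illustration, not part of Dagger or trivy.

```python
import time
from typing import Optional


def ttl_cache_key(ttl_seconds: int = 6 * 60 * 60, now: Optional[float] = None) -> str:
    """Return a key that stays constant within one TTL window.

    Hypothetical helper: `now` is only there to make the function testable;
    by default the current Unix time is used.
    """
    current = time.time() if now is None else now
    # Integer division buckets the timestamp into windows of ttl_seconds.
    return str(int(current // ttl_seconds))
```

Any two pipeline runs inside the same window get the same key and so hit the cache; the first run after the window rolls over gets a new key and downloads the DB again.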
My point about trivy:
- it uses OCI to distribute its DB, so I was wondering if there could be a more Dagger-native way to manage it
- setting a cache TTL still feels like a workaround
- every tool tackles this problem differently, I was wondering if there was something to be done about that
Yeah... It's a hard problem to generalize I think. One thing I'd like to see is maybe an interface to handle this cache warmup use-case and then having different modules that implement that interface
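To make the interface idea a bit more concrete, here is a minimal sketch of what it might look like. Everything here is hypothetical (`CacheWarmer`, `TrivyWarmer`, `plan_warmup` are made-up names, not an existing Dagger API); the only real detail is the trivy command quoted earlier in this thread.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Protocol


class CacheWarmer(Protocol):
    """Hypothetical contract each scanner module would implement."""

    name: str

    def warm_command(self, cache_dir: PurePosixPath) -> List[str]:
        """Return the command that downloads the DB into cache_dir."""
        ...


class TrivyWarmer:
    """Example implementation using the trivy command from this thread."""

    name = "trivy"

    def warm_command(self, cache_dir: PurePosixPath) -> List[str]:
        return ["trivy", "--cache-dir", str(cache_dir), "image", "--download-db-only"]


def plan_warmup(warmers: List[CacheWarmer], root: PurePosixPath) -> Dict[str, List[str]]:
    """Give each warmer its own subdirectory under root and collect the commands."""
    return {w.name: w.warm_command(root / w.name) for w in warmers}
```

A scheduled pipeline could then run each planned command against a shared cache volume, and other tools would only need to supply their own `warm_command`.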