#Persisting bazel state
Hi there! It looks like what you need is a Cache Volume. This is commonly used when running build tools inside Dagger, to persist tool-specific state in between invocations.
Doesn't the volume only keep file caches? In this case the state is in the memory of the Bazel server.
There are two steps to using Cache Volumes:
- Add them to your Dagger logic (I will share API reference in a second)
- Make sure the Dagger cache is persisted between runs. This is infrastructure-specific, and may require zero work or a little one-time setup, depending on your environment.
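For the first step, here is a minimal Python sketch of what mounting a Cache Volume could look like, assuming the Dagger Python SDK. The image passed in, the volume name `bazel-output-base`, and the `/cache/bazel` mount path are illustrative assumptions, not values from this thread. Note that a Cache Volume persists files only (here, Bazel's output base); it does not keep the in-memory state of a running Bazel server.

```python
"""Sketch: run Bazel in Dagger with its output base on a Cache Volume."""

OUTPUT_BASE = "/cache/bazel"  # assumed mount path inside the container


def bazel_argv(output_base, targets):
    """Build the Bazel command line.

    --output_base is a Bazel startup option, so it must come before
    the `build` command.
    """
    return ["bazel", f"--output_base={output_base}", "build", *targets]


async def build_with_cache(image, targets):
    """Run `bazel build` with on-disk state persisted across invocations."""
    import dagger  # imported lazily; assumes the Dagger Python SDK is installed

    async with dagger.Connection() as client:
        cache = client.cache_volume("bazel-output-base")  # assumed volume name
        return await (
            client.container()
            .from_(image)  # assumed: an image that has `bazel` on PATH
            .with_directory("/src", client.host().directory("."))
            .with_workdir("/src")
            .with_mounted_cache(OUTPUT_BASE, cache)
            .with_exec(bazel_argv(OUTPUT_BASE, targets))
            .stdout()
        )
```

It could be invoked with something like `asyncio.run(build_with_cache("<your-bazel-image>", ["//..."]))`. The second step (keeping the Dagger cache itself around between runs) is outside this sketch, since it depends on the infrastructure.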
Ah, so what's the best practice when running jobs in short-lived CI runners? To have them connect to a long-running Bazel server running on the side, shared by all CI runners?
AFAIK most orgs that use Bazel eventually get to a place where you don't really run short-lived CI runners, since the expense of the analysis phase (the phase where Bazel discovers all targets in the repo and their configuration) is just too high.
E.g. in our repo a build with a warm worker might take under a minute (depending, of course, on how many targets got invalidated), but if you run that build with a cold worker, the analysis phase adds 4-5 minutes on top of that.
In our CI setup a single runner (VM) runs a single CI run, so there are no conflicts between jobs. If Dagger could leave a container running on the host and the next run could reuse it, then that could possibly work.
I see. If I understand correctly, your current GitHub Actions workflows involve 1) a bunch of hard-to-debug logic, which 2) among other things, connects to a long-running Bazel server, to take advantage of shared state.
Did I get that right?
And if so, that means that your Bazel project is already set up for Remote Execution, correct?
Pretty much, except we only use remote caching at the moment but not remote execution.
Running Bazel is just one step of our pipeline; we also upload artifacts, trigger deployments, write GitHub PR comments, etc.
Having this all in a proper statically typed language is very appealing.
Makes sense. This gives me a good picture, thanks.
The short version is: yes, I think this should work just fine in Dagger. You should be able to get all the benefits of a Daggerized pipeline (local testing, proper types, reusable code, etc.), while also keeping your current Bazel architecture.
I recommend a very incremental, staged approach to trying Dagger. You could start with a single GHA job, perhaps one that would particularly benefit from being Daggerized. If that job happens to include Bazel, then you would simply "lift and shift" the Bazel logic from GHA to Dagger.
Yes, Dagger will run Bazel in a container, but that container can connect to a host service. So you would arrange for the Bazel cache server, still running on the host, to be addressable from the container at the usual hostname and port.
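A minimal sketch of that container-to-host arrangement, assuming the Dagger Python SDK at 0.9 or later and a Bazel remote cache that speaks HTTP on host port 9090. The port, the `bazel-cache` alias, and the image are hypothetical placeholders, not details from this thread.

```python
"""Sketch: let Bazel inside Dagger reach a cache server running on the host."""

CACHE_ALIAS = "bazel-cache"  # hypothetical hostname seen from the container
CACHE_PORT = 9090            # hypothetical port of the cache server on the host


def remote_cache_argv(alias, port, targets):
    """Build a Bazel command line pointing --remote_cache at the host service."""
    return ["bazel", "build", f"--remote_cache=http://{alias}:{port}", *targets]


async def build_with_host_cache(image, targets):
    """Run `bazel build` in a container that can reach the host's cache server."""
    import dagger  # imported lazily; assumes the Dagger Python SDK is installed

    async with dagger.Connection() as client:
        # Expose the host port to containers as a Dagger service.
        host_cache = client.host().service(
            [
                dagger.PortForward(
                    backend=CACHE_PORT,
                    frontend=CACHE_PORT,
                    protocol=dagger.NetworkProtocol.TCP,
                )
            ]
        )
        return await (
            client.container()
            .from_(image)  # assumed: an image that has `bazel` on PATH
            .with_service_binding(CACHE_ALIAS, host_cache)
            .with_directory("/src", client.host().directory("."))
            .with_workdir("/src")
            .with_exec(remote_cache_argv(CACHE_ALIAS, CACHE_PORT, targets))
            .stdout()
        )
```

Binding the service under an alias means the Bazel invocation only needs the hostname and port, so the same command line can keep working if the cache server later moves off the host.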
You don't even need to worry about running the Bazel server itself in Dagger (although you can do that too if you'd like, but it's orthogonal to the rest)
If the above goes super well, and you decide you love Dagger, later you can be more radical in your "Daggerization" and replace the GHA runner altogether. You would just run your own long-running service, which would include a Bazel cache server and a GitHub webhook router, and then you have your entire CI logic in a single self-contained DAG. But forget I said that for now!
Thanks a lot for taking the time to answer my questions. I'll see if I can create a small proof of concept, and I'll probably be asking more questions by then.
No problem. Let us know if you get stuck, we're happy to help.
Here's the blog post for the new host services feature. Look for "container to host": https://dagger.io/blog/dagger-0-9