#Run code on the host

cyan gust
#

No, Dagger doesn't allow this. The properties that make Dagger's graph useful require all operations to be containerized. But you can do this from your own code, intermixed with calls to the Dagger API.

#

One "escape hatch" that is available is accessing host services. Dagger can orchestrate containers so that they have access to specific network endpoints running on your host network. You could set up an SSH service, then ssh into that from the containers. Or, if the software you need to run on the host has a server mode, just expose that instead.
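
A minimal sketch of the host-service escape hatch, assuming the Dagger Python SDK (`dagger.connection`, `dag.host().service`, `with_service_binding`) and some HTTP server already listening on the host. The port `8080`, the alias `hostsvc`, and the `alpine` image are illustrative assumptions, not something from this thread.

```python
HOST_PORT = 8080        # hypothetical: port of a server already running on the host
SERVICE_ALIAS = "hostsvc"  # hypothetical: hostname the container uses to reach it


def probe_command(alias: str, port: int) -> list[str]:
    """Build the argv the container runs to call the forwarded host service."""
    return ["wget", "-qO-", f"http://{alias}:{port}/"]


async def call_host_service() -> str:
    # Imported lazily so the helper above can be used without the SDK installed.
    import dagger
    from dagger import dag

    async with dagger.connection():
        # Expose one TCP port of the host to containers as a Dagger service.
        host_svc = dag.host().service(
            ports=[dagger.PortForward(backend=HOST_PORT, frontend=HOST_PORT)]
        )
        return await (
            dag.container()
            .from_("alpine:3.20")
            .with_service_binding(SERVICE_ALIAS, host_svc)
            .with_exec(probe_command(SERVICE_ALIAS, HOST_PORT))
            .stdout()
        )
```

Running it would look like `asyncio.run(call_host_service())`. The container only ever sees a network endpoint, so the host process itself stays outside Dagger's graph.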

#

But if you're using Dagger, containerizing should be the rule, and "escaping" the containers the exception. Otherwise you're going against the grain of the platform.

formal ledge
#

I'm not Linux ninja enough to know, but is it possible to run containers in a way that lets a process outlive the container? I'm still trying to find ways to make stateful Bazel work with Dagger; there is no advertised way to start the Bazel server and then configure the client to talk to a specific server. But I'll see if I can find some hidden flags for it, there are plenty of them in Bazel 😛

cyan gust
#

You can have a container connect to another container as a service, but it will only last for the duration of your Dagger session - so the Bazel server will be wiped after each run, which is not what you want.

#

One great solution would be to persist the state of the Bazel server each time; then you can use cache volumes like we discussed in the beginning. But I don't know enough about Bazel to tell you if that's possible or not.

#

I think you mentioned long start times

formal ledge
#

Yeah, it's not really possible to persist the state since it's just stored in the memory of the Bazel process. I think I'll just have to leave Dagger out of the Bazel part of the pipeline for now.

formal ledge
#

Could I mount a named pipe into the Dagger container and then pipe commands to the host? That does feel very hacky 😉

cyan gust
#

If the only thing you need to run on the host is the Bazel server, I would just run it as a "regular" service, without involving Dagger, and forward the TCP port via Dagger's container-to-host (C2H) networking feature.

#

I don't think it's worth rigging a way for your Dagger logic to run commands on the host, if the only command you'll run is that bazel server

formal ledge
#

During local dev the client and server run on my laptop

#

And in CI they run on the CI host. AFAIK there is no way to start a remote Bazel server and then issue commands to it with the client. I at least have not found a way to do so. It might also not make sense to do so since the client and the server need access to the same set of source files.

unreal rampart
#

Additionally, Bazel starts different servers depending on the user ID and workspace directory of the build. Apparently that's how they support multi-tenancy across multiple projects and users.

#

@formal ledge let me check whether I can find anything on how to make a client connect to an existing server

cyan gust
#

fking bazel

#

that's what happens when you want your software to wrap everything, and never the other way around. Headaches.

#

If that's true, then cache volumes should be viable

#

specifically a cache volume with the shared option

#

@unreal rampart what docs page are you looking at?

unreal rampart
#

now skimming through the code to see how the client <> server communication works

cyan gust
#

FYI @burnt grove 🙂

formal ledge
#

Yeah, this is pretty much correct @unreal rampart
The issue is not really the caches that Bazel uses. They can either be on disk or on remote servers, both of which work fine with Dagger.

The issue is that Bazel also keeps in-memory state to make incremental builds faster. Doing the initial build without that state might add anywhere from a few seconds to several minutes to your build, depending on how many Bazel targets you have in your repo.

cyan gust
#

Right, I now remember that you explained that earlier

#

So what happens when the first client transparently spawns the server, then exits - it just keeps running in the background, detached from the initial process group?

#

a sort of auto-daemon mode in the user's session I guess
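
The "auto-daemon" behaviour guessed at here can be sketched generically on POSIX: the client starts the server in a new session, so the server is no longer part of the client's process group and keeps running after the client exits. This is only an illustration of the idea, not Bazel's actual implementation (its install base ships its own `daemonize` helper); `spawn_detached` and `runs_in_own_session` are hypothetical names.

```python
import os
import subprocess


def spawn_detached(argv: list[str]) -> subprocess.Popen:
    """Start argv in a new session, detached from the caller's process group."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # the child calls setsid() before exec (POSIX only)
    )


def runs_in_own_session(pid: int) -> bool:
    """True if the process with this pid is in a different session than the caller."""
    return os.getsid(pid) != os.getsid(0)
```

For example, `spawn_detached(["sleep", "60"])` returns immediately, and the `sleep` process is not killed by signals sent to the caller's process group.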

#

and @formal ledge do you know how the client connects to that server? A named pipe or unix socket at a specific location in the workspace?

#

Ironically this is quite similar to how Dagger handles its engine container 🙂

unreal rampart
#

seems like gRPC over TCP

#

seems like the architecture is very "same sandbox" oriented. I can see from the code that the connect function does some pid level checking to make sure the server is running, etc.

#

there might be a way to hack this by running the dagger engine in non-sandbox mode (no pid namespace basically) and then sending the server_info.rawproto file to the bazel client container. However, it's very very hacky and will probably require some very custom ugly setup

cyan gust
#

I strongly discourage doing that 😛

formal ledge
#

Thanks for really digging into this. This is all very interesting, I'm curious if you've ever had other stateful processes that haven't really fit into the Dagger model?

#

Would it make sense for Dagger to have some kind of persistent containers that can live through many invocations?

formal ledge
#

Alrighty, do you want me to create a GH issue to track this?

burnt grove
#

Which got a lot faster in Bazel 7 with Skymeld btw

#

And yes, there is no official way to separate the "server" and the CLI; the server is actually shipped inside the CLI. The server exists only to keep the state in memory and to watch the workspace. Same with Buck2.

burnt grove
#

fun part, the server is actually a jar file that lives in the install base:

~/c/g/z/zml (master)> ls $(bazel info install_base)
total 238328
drwxr-xr-x  14 steeve  wheel   448B Dec 11 23:25 ./
drwxr-xr-x   5 steeve  wheel   160B Dec 11 22:58 ../
-rwxr-xr-x   1 steeve  wheel   116M Dec  8  2033 A-server.jar* <----------- BAZEL SERVER
-rwxr-xr-x   1 steeve  wheel     5B Dec  8  2033 build-label.txt*
-rwxr-xr-x   1 steeve  wheel    72K Dec  8  2033 build-runfiles*
-rwxr-xr-x   1 steeve  wheel    50K Dec  8  2033 daemonize*
drwxr-xr-x   8 steeve  wheel   256B Dec 11 22:58 embedded_tools/
-rwxr-xr-x   1 steeve  wheel    32B Dec  8  2033 install_base_key*
-rwxr-xr-x   1 steeve  wheel    51K Dec  8  2033 libcpu_profiler.dylib*
-rwxr-xr-x   1 steeve  wheel    16K Dec  8  2033 linux-sandbox*
drwxr-xr-x   6 steeve  wheel   192B Dec 11 22:58 platforms/
-rwxr-xr-x   1 steeve  wheel   106K Dec  8  2033 process-wrapper*
drwxr-xr-x   9 steeve  wheel   288B Dec 11 22:58 rules_java/
-rwxr-xr-x   1 steeve  wheel   150K Dec  8  2033 xcode-locator*
burnt grove
#

in theory you can also leverage --repository_cache and --disk_cache; they are made mostly for cross-WORKSPACE sharing, but they also work for (somewhat) stateless builds

#

but for a truly stateless system a remote cache is the way to go, since you only download the invalidated leaves of the build graph (google "Build without the Bytes")
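
A small sketch of how those caching options combine on the command line, assuming Bazel's `--disk_cache`, `--repository_cache`, `--remote_cache`, and `--remote_download_minimal` flags (the last one is the "Build without the Bytes" mode). The helper name `bazel_build_args` and all paths and URLs are hypothetical.

```python
from typing import Optional, Sequence


def bazel_build_args(
    targets: Sequence[str],
    disk_cache: Optional[str] = None,
    repository_cache: Optional[str] = None,
    remote_cache: Optional[str] = None,
) -> list[str]:
    """Build a `bazel build` argv with whichever caches are configured."""
    args = ["bazel", "build"]
    if disk_cache:
        # Local action cache on disk, shareable across workspaces.
        args.append(f"--disk_cache={disk_cache}")
    if repository_cache:
        # Cache for downloaded external repositories.
        args.append(f"--repository_cache={repository_cache}")
    if remote_cache:
        args.append(f"--remote_cache={remote_cache}")
        # Build without the Bytes: skip downloading intermediate outputs.
        args.append("--remote_download_minimal")
    return args + list(targets)
```

For a somewhat stateless local setup one might call `bazel_build_args(["//..."], disk_cache="/cache/disk", repository_cache="/cache/repo")`; for a remote cache, pass only `remote_cache`.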

cyan gust
#

Now we're getting somewhere 🙂

burnt grove
#

you can use GCS itself as a bazel cache, too, but it's rather slow

cyan gust
#

Well, in the case of running Bazel in Dagger, the execroot directory will be persisted on a Dagger cache volume, which can be distributed across nodes just like the rest of the Dagger cache. So the end result, I think, will be stateless Bazel, or at least stateless enough.
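
A sketch of what persisting Bazel's on-disk state on a Dagger cache volume could look like, assuming the Dagger Python SDK (`dag.cache_volume`, `with_mounted_cache`, `CacheSharingMode.SHARED`) and Bazel's `--output_user_root` startup option. The image name, the cache key, and the mount path are hypothetical, and the image is assumed to have `bazel` on its `PATH`.

```python
BAZEL_OUTPUT_ROOT = "/bazel-cache"  # hypothetical mount path for Bazel's output tree
BAZEL_IMAGE = "registry.example.com/bazel-build:latest"  # hypothetical image with bazel


def bazel_command(target: str) -> list[str]:
    """Startup options must come before the `build` command."""
    return ["bazel", f"--output_user_root={BAZEL_OUTPUT_ROOT}", "build", target]


async def build_with_cache(target: str = "//...") -> str:
    # Imported lazily so the helper above can be used without the SDK installed.
    import dagger
    from dagger import dag

    async with dagger.connection():
        # The volume outlives the container, so on-disk state survives between runs.
        cache = dag.cache_volume("bazel-output-root")
        return await (
            dag.container()
            .from_(BAZEL_IMAGE)
            .with_directory("/src", dag.host().directory("."))
            .with_workdir("/src")
            .with_mounted_cache(
                BAZEL_OUTPUT_ROOT, cache, sharing=dagger.CacheSharingMode.SHARED
            )
            .with_exec(bazel_command(target))
            .stdout()
        )
```

The in-memory state of the Bazel server is still lost between runs; only what Bazel writes under the mounted path is kept.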

burnt grove
#

yeah, although the problem is that the cache is huge, so snapshotting/restoring is something to watch out for

#

For Zenly iOS that was like 5 to 8 GB for a cold build

cyan gust
#

@formal ledge I believe we have a possible solution to your problem, which will not require a long-running Bazel server on the host after all! Thanks to @burnt grove

#

I'm going to create a "Dagger and Bazel" channel to celebrate 🙂

burnt grove
#

it's just that on GitHub Actions snapshot/restore is painfully slow

cyan gust
#

We don't use the GitHub Actions cache 😛

#

With Dagger Cloud distributed caching, the snapshots go to the closest storage bucket. We have an edge in each AWS and GCP region, so for self-hosted GHA runners in either of those clouds it should be very fast. If running in managed GitHub Actions (Azure), it will fall back to Cloudflare R2, which is quite good too.

burnt grove
#

how does that work? is it a blob?

cyan gust
#

For layer cache it's just the BuildKit state directory, so a bunch of layers. Cache volumes (just a bunch of bind mounts) are just directories that we snapshot. I don't know what the snapshotting method is, @midnight tinsel will know. I'm guessing something straightforward like one tarball per volume maybe? 🤷‍♂️
