#Question on engine state
We're experimenting with the "cache seeding" idea, as a stopgap for lack of proper persistent storage
The general idea: build custom engine images which are pre-warmed by running an ephemeral service on the vanilla image, having it load a module and perhaps even run checks; then snapshot the resulting state and bake it into the engine image.
Then on deployment, we have the option of deploying a blend of vanilla and specialized images. No persistent storage - further writes to cache are discarded.
The theory is that the layers that are common to every run of a given check, are expensive enough that the speedup is worth it
Right now we're trying to prototype this at small scale, to validate whether it is worth it
But, we're running into an implementation issue: you can't just snapshot containerd overlay layers, and bake them into an OCI image. It causes an "overlay on overlay" situation which breaks the final engine
Still a viable option, I did realize there’s one wrinkle in that the engine strictly requires /var/lib/dagger to NOT be overlay (cause you can’t do overlay-on-overlay) but if the cache is stored in an image layer it’ll be overlay. So this idea requires a small dance of mv’ing the cache at startup, but I doubt that would be a huge overhead. Just FYI
Oh
Lol

So the question is: how difficult would it be to get, say, a super low-level "export state / import state" engine-wide, to support the same idea conceptually?
I'll take this as an encouraging sign that this is possible and possibly even a good idea? 🙂
Actually, rather than an imperative API call, it might make more sense to make it a configurable directory in the engine image
-> at startup, if the engine finds anything in that path, it ingests it
If you haven’t tried just doing a rsync of the cache off overlay to a bind mounted non-overlay dir yet (so local only rsync, just probably faster than mv), I would start there. If overhead is really bad we could try more
Ultra bonus if it supports ingesting & merging several such directories, that would allow us to experiment with even more strategies 😇
It needs to be more like a full replacement than an additive ingest. Even after theseus we are still using containerd snapshots which cannot be merged like that
Haven't tried yet. If it's at all possible to use image-baking, there is a strong benefit: tooling. We are a Dagger-savvy shop (obviously) so anything that involves assembling and moving lots of images is relatively easy for us. Would require less bespoke tooling than an rsync-based setup
Ah. Would it be basically engine doing the equivalent of the rsync, itself?
OK I will try
Yeah to be clear I only mentioned rsync cause I think it would be faster than the mv command due to parallelism, but I could be misinformed and maybe mv is already parallel
But wait, after I rsync it, can I still pack it into an OCI layer? Or no
Here's what we already do:
- Build a dagger engine container
- start it in an ephemeral service with a dedicated state cache
- run a client against it that loads the given remote module
- cleanly shut the engine down once module loading completes
- snapshot the seeded state back into the returned container image
We already do this with /var/lib/dagger in a cache volume (not rsync, but same idea)
But, I do this with an engine configured to use buildkit native snapshotter (plain files) because I assumed that if we use overlay, the resulting state cannot be safely packed into the image (regardless of whether it was mv-ed or rsynced out prior to baking)
sorry if I'm not clear 🙂
If the issue is packaging rather than extraction, perhaps I could wrap the raw directory in a tarball, and the OCI image contains the tarball?
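For reference, the tarball idea could look roughly like the sketch below. The function names are hypothetical, and it only covers packing and unpacking; the unpack would still have to run at startup into a non-overlay directory, so it plays the same role as a copy or mv would.

```shell
# Hypothetical helpers; names and paths are illustrative only.

# pack_state DIR TARBALL: archive the contents of DIR (not DIR itself).
pack_state() {
    tar -C "$1" -cf "$2" .
}

# unpack_state TARBALL DIR: restore the archive into DIR, creating it if needed.
unpack_state() {
    mkdir -p "$2"
    tar -C "$2" -xpf "$1"
}
```

At bake time this would be something like pack_state /var/lib/dagger /seed.tar, and at startup unpack_state /seed.tar /var/lib/dagger, where /seed.tar is an assumed location.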
Here's what I'm suggesting spelled out more:
Baking step:
- Run engine with a custom engine state dir (arbitrary, but let's say /var/lib/dagger2); this is a setting that can be controlled by an arg to the engine process (and/or config file)
- Load modules, do whatever you want to be baked
- Stop engine, build + publish image
Starting pre-baked engine step:
- Run pre-baked engine, but with default engine dir /var/lib/dagger
- On engine startup, before even starting the engine process, do a mv /var/lib/dagger2 /var/lib/dagger
  - I am wondering if rsync might be faster because it does stuff in parallel, but that's just an optimization maybe. That's the only reason I mentioned rsync
- Then run as normal
There's a million variations and potential little optimizations you could try but that's the gist of the simplest approach I think
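A minimal sketch of the startup step described above, assuming the baked state lives in /var/lib/dagger2 and the live state dir is /var/lib/dagger. The seed_state function is hypothetical, and it uses cp rather than mv so the image layer is left untouched; treat it as one of the variations, not the definitive entrypoint.

```shell
# Hypothetical entrypoint helper; the function name is illustrative only.

# seed_state SRC DST
# Copies baked state from SRC (part of the image, so overlay-backed) into DST
# (expected to be a bind mount or volume), but only when DST is still empty.
seed_state() {
    src=$1
    dst=$2
    # Nothing baked into this image: behave like a vanilla engine.
    [ -d "$src" ] || return 0
    mkdir -p "$dst"
    # Never clobber state that is already present in the destination.
    if [ -n "$(ls -A "$dst")" ]; then
        return 0
    fi
    # "SRC/." copies the directory contents, including dotfiles; -a preserves
    # ownership, modes, timestamps and symlinks.
    cp -a "$src/." "$dst/"
}
```

In an entrypoint this would run as seed_state /var/lib/dagger2 /var/lib/dagger, followed by exec "$@" to hand over to the real engine command.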
what's the difference between that and just changing /var/lib/dagger ?
Not following the question. What I'm suggesting avoids any overlay-on-overlay problems entirely
you will never try to run overlay-on-overlay
In the baking step (while the baking ephemeral engine is running), should /var/lib/dagger2 be in a cache volume, or not in a cache volume, or indifferent?
Cache volume (or in docker terms, a bind mount from the host). Then the baking doesn't have to be extremely slow + expensive.
OK. At the moment we already bake in a cache volume (bind mount), but mounted at /var/lib/dagger
Then we copy the contents of the cache volume out into /var/lib/dagger in the container proper. And publish that
But if I understand correctly, the crucial step is to avoid that content being mounted into /var/lib/dagger directly when the final engine runs
Yeah the problem is that you want the final engine state dir to be a bind mount from the host (to avoid overlay-on-overlay). But if you make that bind mount, you are masking all of the data underneath it, which defeats the point of course
OK. The part that seems magical to me, is that /var/lib/dagger2 which will be mounted as an overlay (because part of the image), can be copied into the cache volume at deployment, and somehow nothing is lost in translation
So the dagger process can't access it directly, but somehow rsync can
(this is definitely a me problem 🙂 )
just comprehension
To be clear this is something that'd run probably as part of the entrypoint and before the actual engine process is exec'd. It's nothing too magical, it would just be copying data from one directory (/var/lib/dagger2) to another (/var/lib/dagger).
The problem isn't that the engine couldn't use /var/lib/dagger2, it's just that it's overlay and thus trying to use it will result in the native snapshotter being used. But if the data is copied to /var/lib/dagger (which isn't overlay) then that's not an issue
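To make the "is this directory overlay or not" distinction concrete, here is a small sketch for checking what filesystem backs a path. It assumes a stat that accepts GNU coreutils syntax (-f -c %T); the helper names are hypothetical, and this is not how the engine itself decides which snapshotter to use.

```shell
# Hypothetical helpers; names are illustrative only.

# fs_type_of PATH: prints the filesystem type name backing PATH.
# Relies on GNU coreutils syntax; other stat implementations differ.
fs_type_of() {
    stat -f -c %T "$1"
}

# is_overlay_type NAME: succeeds if NAME looks like an overlay filesystem.
is_overlay_type() {
    case $1 in
        overlay|overlayfs) return 0 ;;
        *) return 1 ;;
    esac
}
```

Inside the pre-baked container, is_overlay_type "$(fs_type_of /var/lib/dagger2)" would be expected to succeed, while the same check on a bind-mounted /var/lib/dagger should fail.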
To be clear this is all quite stupid, it's limitations in the kernel and in containerd, it's just what we have to work with for the time being
IIRC you technically can use overlay as a lowerdir, but not an upperdir. Theoretically that could save us but it would require containerd snapshotters knowing how to be "split brain" between two state dirs, which isn't gonna happen in a short time frame
Yeah stupid and simple. I think it's worth trying as a first step
so the question is: do we want the engine itself to support this as an optional config key or hook - or should we keep it in the image entrypoint
(will do it in entry point for prototype regardless)
I'd consider this a short term hack. Not worth enshrining more.
side note: what I like about this caching/deployment strategy in general, is that it feels like a step towards stateless & even ephemeral engines post-theseus
Yeah I agree, it's the utterly simplest version of a "remote cache import" as you could get. But conceptually it is close to where we're headed
i'm assuming that even post-theseus, it will be useful to configure different engines to read and write cache differently (eg. pre-warm for this or that specialized workload). It's just that it will no longer require moving data in advance- the engine will be able to do its own moving
@primal lotus wouldn't it be possible to bundle these module-loading layers in the same way that we're bundling the SDK runtimes now? As part of the engine build process we generate the required .tar files and then add them to the content store, so everything can be bundled as part of the same OCI image and we don't have to do anything in the engine entrypoint
It wouldn't really work the same. What we do with the SDKs is bundle very bespoke base images that we then hardcode certain SDKs to use for the base of their runtimes. It's not actually cache logic, it's more like hardcoding some base images and telling the SDKs to use that.
Trying to get that to work with "arbitrary user modules" would be really complicated since you can't just hardcode anymore
IIRC I originally tried having the SDK bundling use buildkit local cache import but obviously was a nightmare, which is how we ended up with the hardcoded images
Post-theseus, I see a possibility that these 2 very different kinds of bundling will start converging?
There could be a first-class concept of "cache pool", that you configure this or that engine to read from, write to etc
some of which would be pre-configured, public read-only, private, etc
Yeah 100% the SDK bundles would just become an additional local cache you import from, not bespoke base images. May potentially help performance since I am not even convinced the base images really work as consistently as we want today (it's just too confusing and convoluted to think through and get working perfectly)
@primal lotus I think I'm doing something stupid, what's the env var I should set to use a given image in my local docker engine? Failures are silent so it's hard to know for sure
I'm trying this:
# Copy my local vanilla image
docker image tag registry.dagger.io/engine:v0.20.0 dagger-vanilla-engine:v2
# Have dagger auto-start a new engine container with an empty cache (control experiment)
time _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-image://dagger-vanilla-engine:v2 dagger core version
v0.20.1
--> output should be v0.20.0 😭
that looks correct to me
I mean in terms of the env var
trying too
I'm presuming you actually ran docker image tag, not dagger image tag, right?
oh..yes 🙂
$ docker image ls dagger-vanilla-engine:v2
REPOSITORY TAG IMAGE ID CREATED SIZE
dagger-vanilla-engine v2 ef79ee63b9e5 2 weeks ago 676MB
$ docker image ls registry.dagger.io/engine:v0.20.0
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.dagger.io/engine v0.20.0 ef79ee63b9e5 2 weeks ago 676MB
No engine started:
$ docker container ls | grep vanilla | wc -l
0
Here's what I get 
sipsma@dagger_dev:~/repo/github.com/sipsma/dagger/.github/workflows$ docker image tag registry.dagger.io/engine:v0.20.0 dagger-vanilla-engine:v2
sipsma@dagger_dev:~/repo/github.com/sipsma/dagger/.github/workflows$ time _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-image://dagger-vanilla-engine:v2 dagger core version
✘ connect 0.0s ERROR
! start engine: parse runner host: parse "docker-image://dagger-vanilla-engine:v2": invalid port ":v2" after host
real 0m0.835s
user 0m0.095s
sys 0m0.127s
do you have dagger aliased to dagger --cloud or something?
oh... duh I had DAGGER_CLOUD_ENGINE=1 🤦♂️
But... now that I'm getting the same as you: isn't it weird that we can't set the image tag?
maybe it's reserved for setting the CLI version?
Yeah that's a bug for sure, but it does seem to work if you drop the tag
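As a workaround sketch for the tag parsing issue, a helper like the one below could build the runner host value without the tag. The runner_host_for function is hypothetical; it only reflects what was observed in this thread (a ":v2" suffix is rejected as an invalid port, an untagged name seems to work).

```shell
# Hypothetical helper; the function name is illustrative only.

# runner_host_for IMAGE: prints a docker-image:// runner host for IMAGE with
# any trailing ":tag" removed. A ":port" in the registry part is left alone.
runner_host_for() {
    image=$1
    last=${image##*/}              # final path component, e.g. "engine:v0.20.0"
    case $last in
        *:*) image=${image%:*} ;;  # strip the tag only
    esac
    printf 'docker-image://%s\n' "$image"
}
```

Usage would be _EXPERIMENTAL_DAGGER_RUNNER_HOST=$(runner_host_for dagger-vanilla-engine:v2) dagger core version. With the tag dropped, docker presumably resolves the name to its latest tag, so the local image would need to be tagged without a version first (docker image tag registry.dagger.io/engine:v0.20.0 dagger-vanilla-engine).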