#Manually provision an engine
if dagger kills an engine when it starts it's because it finds a difference between engine versions and it garbage collects the old ones
what's your current setup and what are you trying to achieve? so we can give the best advice here
When upgrading engines on a busy repo, it’s really difficult to land the upgrade, as new PRs running on old engine versions will kill the engine upgrade PRs.
But the other use case is to pre-start the right engine on the runner host with all of the required flags, like setting the cache dir, rather than letting the action start the engine.
gotcha.
starting the engine manually is basically starting a container with some specific flags and a particular name so the dagger CLI can find it and use it. This is the code that performs that: https://github.com/dagger/dagger/blob/9419454effd67b2d0abff2a18f46490c041823bb/engine/client/drivers/docker.go?plain=1#L64
the command that ends up executing is basically this one: docker run -d --restart always --privileged -v /var/lib/dagger --name dagger-engine-v0.19.8 registry.dagger.io/engine:v0.19.8
the important thing to note from the above is the container name as well as the fact that the image and the name version should match
that's pretty much it. If you then run a dagger pipeline with a client on the v0.19.8 version, it will pick up that engine automatically and use it without provisioning a new one
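A minimal sketch of the manual start described above, assuming the flags quoted in this thread are still what the Docker driver uses for your version. The helper name engine_run_cmd is made up for illustration and is not part of Dagger; it only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# engine_run_cmd is a hypothetical helper, not part of Dagger.
# It prints the docker command for a given engine version, using the same
# flags as the command quoted in this thread.
engine_run_cmd() {
  version="$1"
  # The container name suffix and the image tag must be the same version,
  # otherwise a client of that version will not pick the engine up.
  printf 'docker run -d --restart always --privileged -v /var/lib/dagger --name dagger-engine-%s registry.dagger.io/engine:%s\n' \
    "$version" "$version"
}

# Print the command for v0.19.8 (the version used in this thread).
engine_run_cmd "v0.19.8"
```

To actually start the engine, run the printed command, for example with engine_run_cmd v0.19.8 | sh.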
Thank you for that.
But is there any way to prevent dagger from killing these in favour of other versions? I’d rather it just fail with an error saying engine mismatch.
yes, if you set the variable DAGGER_LEAVE_OLD_ENGINE=1, that will prevent it from killing mismatching engines
take into account that if you do that, you might end up with multiple engines on the same node
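A small sketch of that setup, assuming the variable is set in the environment of whatever runs the dagger client (for example the CI runner service). The list_engines helper is hypothetical and only wraps docker ps so that leftover engines on the node are easy to spot.

```shell
#!/bin/sh
# Keep engines of other versions running instead of garbage collecting them.
export DAGGER_LEAVE_OLD_ENGINE=1

# list_engines is a hypothetical helper, not part of Dagger.
# It shows every running container whose name contains "dagger-engine",
# together with its image, so side-by-side engines are visible.
list_engines() {
  docker ps --filter 'name=dagger-engine' --format '{{.Names}}\t{{.Image}}'
}
```

Calling list_engines on the runner host should then show one line per engine version that is still running.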
Thanks!!
Ok, any other ideas? I think what is happening is that some older, stale PRs might be pushed to, and they still contain the old config, as no one had updated from main.
So it nukes it down to v0.19.3
We didn't use v0.19.3 almost from the beginning. We were on .6 until yesterday.
I have no idea where this is coming from; I would have to check every open PR branch to see
I need the server to somehow deny this. I guess I could set the env on the runner too
will dagger also do this the other direction? such that if I have a 0.19.x running and someone kicks off an old commit build with 0.18.x, will dagger / SDK do the right thing?
There is apparently a breaking change (withDirectory.directory -> source) and I need to account for long-lived release branches
Yes, it does this in both directions
which I also think is the cause of our other problems, with the layers or cache
as it writes into the same dir
I think maybe engines need to namespace the dir further by version
I just set the DAGGER_LEAVE_OLD_ENGINE flag on our runner, and now I see two engines running side by side
yes. If you set the _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://$container-name that will force Dagger to always try to use that engine. If a client tries to use a newer version module on an older engine, that will fail since we don't support forward compatibility yet.
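A sketch of pinning the client to one engine, assuming the engine container was started with the name dagger-engine-v0.19.8 as in the docker run command earlier in this thread. The helper runner_host_for is made up for illustration.

```shell
#!/bin/sh
# runner_host_for is a hypothetical helper, not part of Dagger.
# It builds the docker-container:// value for an engine container that was
# started with the name dagger-engine-<version>.
runner_host_for() {
  printf 'docker-container://dagger-engine-%s\n' "$1"
}

# Pin every dagger client in this environment to the v0.19.8 engine.
_EXPERIMENTAL_DAGGER_RUNNER_HOST="$(runner_host_for v0.19.8)"
export _EXPERIMENTAL_DAGGER_RUNNER_HOST
echo "$_EXPERIMENTAL_DAGGER_RUNNER_HOST"
```

Because the value is tied to one version, long-lived branches on another version would need their own value, or no value at all so the client provisions its own engine.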
Not sure if they will succeed tho given that they use the same dir
@prisma temple Not sure, at least GH action will upgrade and downgrade
We do use that flag too
I don't know if this is the GH action itself that does it or the engine, but the result is that the versions keep going up and down
😄
I already use the _EXP... var, problem is it is globally set...
if that variable is set, it shouldn't do any form of garbage collection. Let me verify really quick. I'm not 100% sure
I do see from the logs that GC is still happening, unless I don't understand what is considered garbage
but it was doing clean up, you can see from the logs I shared
right, I think what I want to do for the 0.18 is drop the env var, then if one of those builds starts, dagger will start it up automatically
Oh wait, do you mean the docker-container://$container-name part is the important bit, not the var itself?
Because we set it to a hostname, where Docker runs
but... (see my newer post), if there already was a way to set just the registry dagger uses on that auto-start pull....
pretty sure the image GC dagger does matches the container name itself
ya name matches too
just double checked this and it's effectively NOT doing any GC when using the docker-container scheme
I was clarifying if the _EXPERIMENTAL_DAGGER_RUNNER_HOST env var is important, or the actual value needs to be docker-container... because we use _EXPERIMENTAL_DAGGER_RUNNER_HOST but set it to a network hostname
this is my experience as well
Ok interesting...
But how to deal with the shared mount dir?
if both engines will use the same volume
Or is this volume not host mounted?
it's not shared if you started the engine with --volume /var/lib/dagger. That creates an anonymous volume that is only valid for that engine
yes, the dagger SDK will always start the appropriate engine depending on the SDK / CLI version
by default it GCs existing running engines unless you set the DAGGER_LEAVE_OLD_ENGINE env var
@prisma temple but you were asking me if we use a shared EBS volume. If the engine does not share the volume with the host, why would this be a problem? Or were you just assuming we host mounted the volume?
So there's really no issue then running multiple engines in parallel?
no there's not as long as they don't share the state directory
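To make the difference concrete, here is a sketch of the two mount forms for a manually started engine. The anonymous form is the one from the docker run command in this thread. The bind-mount form, the helper state_mount_flag, and the host path /mnt/dagger-state are assumptions for illustration; the idea is just that a host directory gets one subdirectory per engine version so two engines never share state.

```shell
#!/bin/sh
# state_mount_flag is a hypothetical helper, not part of Dagger.
# With no host directory it prints the anonymous-volume form from this thread;
# with a host directory it prints a bind mount namespaced by engine version.
state_mount_flag() {
  version="$1"
  host_root="${2:-}"
  if [ -z "$host_root" ]; then
    # Anonymous volume: Docker creates a fresh volume for each container.
    printf '%s\n' '-v /var/lib/dagger'
  else
    # Bind mount: one host subdirectory per version (host path is an example).
    printf '%s %s/%s:/var/lib/dagger\n' '-v' "$host_root" "$version"
  fi
}

state_mount_flag v0.19.8
state_mount_flag v0.19.8 /mnt/dagger-state
```

The printed flag would replace the -v /var/lib/dagger part of the docker run command when starting the engine by hand.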
yes, I was assuming you were intentionally mounting the engine state dir in the host
by that you mean we didn't manually start it with host mounted volume which is shared?
*you did start it you mean?
yes, I was assuming you were manually starting the engine with a shared host volume
Dagger doesn't share the cache volume across engines when starting multiple in the same machine
it's not supported
Yes, perfect. Understood.
To confirm, no, we didn't. It was auto-started by the dagger client.
Which then still is strange that we are seeing so many weird errors. I was already thinking that we can write that off to the volume sharing. 😄
by setting this ENV var, can I expect two versions of dagger to then play nice on the same host? Is the default way dagger runs that container with an anonymous volume?
yes
What happens when I set this:
_EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://$container-name
But the container does not exist? Will the client try to infer the version from the name somehow or will it just fail as it can't find this container?
Probably fail
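Since the behaviour with a missing container is not confirmed here, one option is to check for the container before pinning the client to it. This is only a sketch: engine_exists, pin_engine, and the DOCKER override are hypothetical names, and the check only confirms that the container exists, not that the engine inside it is healthy.

```shell
#!/bin/sh
# engine_exists is a hypothetical helper, not part of Dagger.
# DOCKER can be overridden (e.g. DOCKER=true) to dry-run without a daemon.
engine_exists() {
  "${DOCKER:-docker}" container inspect "$1" >/dev/null 2>&1
}

# pin_engine (also hypothetical) exports the runner host only when the
# container is present, so a missing engine is reported up front.
pin_engine() {
  name="$1"
  if engine_exists "$name"; then
    export _EXPERIMENTAL_DAGGER_RUNNER_HOST="docker-container://$name"
  else
    echo "engine container $name not found" >&2
    return 1
  fi
}
```

A CI step could then run pin_engine dagger-engine-v0.19.8 || exit 1 before invoking dagger.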