Concerns about having a local engine | Dagger | Page 1

trim surge Oct 6, 2022, 7:51 PM

#

I would rank that as number 1, but there's some other tricky stuff we'd have to port into each language

#

The support for gateway containers has a lot of custom go code on top of the underlying grpc api

timber osprey Oct 6, 2022, 7:51 PM

#

Yeah I know, just food for thought 🙂

trim surge Oct 6, 2022, 7:51 PM

#

No totally I agree with the general sentiment

timber osprey Oct 6, 2022, 7:51 PM

#

But yeah, if that were the case, then each SDK would just spin up the dagger docker image

#

Nothing else

#

No dependency on the binary

hearty matrix Oct 6, 2022, 7:51 PM

#

What problem are we talking about?

timber osprey Oct 6, 2022, 7:51 PM

#

No bootstrapping of dagger-buildkitd with dagger

trim surge Oct 6, 2022, 7:52 PM

#

Yes the dependency on the binary is a product of lack of resources to reimplement lots of code in each SDK, which can always be addressed by supplying more resources, but we'll need to go through carefully and make sure we know how much we're taking on

#

It's worth more thought though I agree

timber osprey Oct 6, 2022, 7:52 PM

#

hearty matrix What problem are we talking about?

A bunch 🙂 sorry at a doc appt on my phone can’t type a lot

hearty matrix Oct 6, 2022, 7:53 PM

#

Ok, going on a call in 5mn but will catch up after. At first glance that list looks like a list of design decisions, rather than a list of problems

trim surge Oct 6, 2022, 7:53 PM

#

^ Yeah it ties into our architecture (we spin up the local engine as binary which just talks to the separate daemon binary, currently buildkitd, soon to be our buildkitd wrapper). It also thus ties into multitenant support, everything we were talking about in terms of session-state yesterday, etc. etc.

timber osprey Oct 6, 2022, 7:53 PM

#

hearty matrix What problem are we talking about?

For instance, SDKs depend on the binary. How to package, distribute, etc.

If cloak could fully run as a docker container then the problem goes away

#

Or the fact that we have a weird arch (see my comment on buildkit embedding PR)

hearty matrix Oct 6, 2022, 7:54 PM

#

but if we packaged it as a container, wouldn't we lose host { } ?

trim surge Oct 6, 2022, 7:54 PM

#

Another thing we'd have to reimplement in each SDK: provisioners (just thinking out loud)

timber osprey Oct 6, 2022, 7:55 PM

#

hearty matrix but if we packaged it as a container, wouldn't we lose `host { }` ?

Exactly. Because of the lack of filesync

#

It’s not the case for buildkit because the SDK has filesync

trim surge Oct 6, 2022, 7:55 PM

#

hearty matrix but if we packaged it as a container, wouldn't we lose `host { }` ?

It's technically possible to reimplement the equivalent of host in each SDK, it's just probably not straightforward at all

#

(probably)

timber osprey Oct 6, 2022, 7:55 PM

#

If the SDK were to upload local, then cloak could run anywhere (container, hosted, etc). Doesn’t matter anymore

hearty matrix Oct 6, 2022, 7:55 PM

#

have you been talking to Olivier recently Andrea? 😉

timber osprey Oct 6, 2022, 7:55 PM

#

Nope

#

Haha 🙂

#

It ties into the virtualized host stuff, sessions, local dirs on the playground

#

With filesync all of that doesn’t matter anymore

hearty matrix Oct 6, 2022, 7:56 PM

#

that's a pretty major design change that would affect a lot of things. I think we should talk about the actual problem part a bit more before going too deep down this rabbit hole

trim surge Oct 6, 2022, 7:57 PM

#

It's all coming to a head in multiple areas (cloud, embedding buildkit, etc.) so we should have that discussion soon, but I agree there's no time to dive fully at this precise moment 🙂

hearty matrix Oct 6, 2022, 7:57 PM

#

is the problem "packaging a binary in a SDK is too hard"?

trim surge Oct 6, 2022, 7:58 PM

#

hearty matrix is the problem "packaging a binary in a SDK is too hard"?

That's one, but also just that the architecture becomes much more complicated with multiple daemons talking to one another. It would be nice to simplify

hearty matrix Oct 6, 2022, 7:58 PM

#

I also don't see the relationship to packaging buildkit (wouldn't we have to do that inside the container anyway?)

timber osprey Oct 6, 2022, 7:58 PM

#

hearty matrix is the problem "packaging a binary in a SDK is too hard"?

Comment from yesterday on the buildkit PR explains a bit

timber osprey Oct 6, 2022, 7:59 PM

#

hearty matrix I also don't see the relationship to packaging buildkit (wouldn't we have to do ...

There’s no more buildkit just graphql, with this approach

Whereas now there’s a sdk —> localbin —> container over 2 different protocols

hearty matrix Oct 6, 2022, 7:59 PM

#

Concerns about having a local engine

timber osprey Oct 6, 2022, 8:01 PM

#

Concerns about having a local engine AND a containerized engine at the same time

To illustrate the issue

#

Two engines

#

Because one must run on the host, the other in a container

hearty matrix Oct 6, 2022, 8:02 PM

#

Right I understand (but that's too long for a discord channel name 🙂)

timber osprey Oct 6, 2022, 8:03 PM

#

Either we make buildkit run on the host (not possible), or we make cloak run in a container (limitation: filesync)

#

Anyway

#

It’s incredibly hard to do so it’s just day dreaming at this point

trim surge Oct 6, 2022, 8:05 PM

#

If the problem was filesync alone I'd actually categorize it at the edge of possible (many possibilities besides just literally reimplementing the filesync api), but I think the random pile of other assorted issues probably puts it over the edge of realistic in the medium term.

But worth double-checking those assumptions

#

And thinking through what the shortest possible path to getting something working would be

#

Trying to finish dagger.json stuff asap so not much detail, but made a stub for further discussion: https://github.com/dagger/dagger/discussions/3280

GitHub

Dagger Architecture Possibilities · Discussion #3280 · dagger/dagger

TODO: actual outline Concerns here: #3187 (comment) Ideas around whether we can re-implement parts of the buildkit client/gateway api (filesync and gateway containers being big ones, but also many ...

hearty matrix Oct 6, 2022, 10:41 PM

#

back, catching up

#

https://tenor.com/view/pizza-disaster-fire-gif-12899078

hearty matrix Oct 7, 2022, 12:54 AM

#

if the engine is always remote, then we would have to either 1) add file streaming to the graphql API or 2) expose parts of the buildkit/grpc API to all clients, right?

#

in both cases, all SDKs would have to implement it

#

we would also have to change the secrets API to support sending secret value in the api (similar options 1 or 2 to do that)

trim surge Oct 7, 2022, 12:57 AM

#

hearty matrix if the engine is always remote, then we would have to either 1) add file streami...

Yes that, among other things.

I do wonder if there could be a shortcut though. If there's a file transfer API with native support for the languages we support and we can just expose that over a service from the engine, maybe there's a relatively cheap path to accomplishing this

#

That's a lot of "ifs" obviously

hearty matrix Oct 7, 2022, 12:58 AM

#

we could ship the engine with an ssh server

#

make ssh part of the api

#

gql + ssh

#

rsync or sftp for sync

#

regular ssh session for attach

trim surge Oct 7, 2022, 12:59 AM

#

We need native ssh clients in each language then. Unless we shell out to ssh, but shelling out to ssh is not conceptually different than shelling out to local engine

hearty matrix Oct 7, 2022, 12:59 AM

#

v1 just a black box container with duct tape. we could bundle it in a binary later but less urgent now

hearty matrix Oct 7, 2022, 1:00 AM

#

trim surge We need native ssh clients in each language then. Unless we shell out to ssh, bu...

I don’t know for sure where the pain is today, so hard to say if forkexecing ssh is meaningfully different

trim surge Oct 7, 2022, 1:00 AM

#

(I'd also suggest replacing ssh with websockets if we are just going to run rsync through it, but implementation detail)

trim surge Oct 7, 2022, 1:01 AM

#

hearty matrix I don’t know for sure where the pain is today, so hard to say if forkexecing ssh...

Yeah it's a different kind of pain, if it's better or worse I don't know yet

hearty matrix Oct 7, 2022, 1:15 AM

#

trim surge (I'd also suggest replacing ssh with websockets if we are just going to run rsyn...

oh I didn’t mean anything as clean as implementing the ssh protocol. I meant literally running openssh in our container with a custom configuration and hooks. actually rsync files to an actual dir in the container then plug that into buildkit client somehow

#

duct tape basically 🙂

trim surge Oct 7, 2022, 1:20 AM

#

hearty matrix oh I didn’t mean anything as clean as implementing the ssh protocol. I meant lit...

Oh totally, but if we want to connect to the openssh server in the engine from a script in, e.g., javascript, then we either need a native javascript ssh client or to shell out to ssh. I was just thinking websockets because that's what we are using for service otherwise. But doesn't matter, point is we need a transport.

Curiosity got the best of me and I took a brief detour to google native lang rsync implementations:

https://github.com/gokrazy/rsync
https://github.com/isislovecruft/pyrsync
https://github.com/WebDeltaSync/WebRsync
I don't really trust the python/js ones, but having given them a very brief glance I am surprised how simple they are. It makes me wonder if it would be easier to implement the rsync protocol in python/js than implementing the buildkit filesync stuff. I truly have no idea, it just is an intriguing line of thought.

#

I'm sure this all occurred to Tonis and he made his decision to implement the filesync the way he did for some reason, so I expect it's more complicated than it seems

hearty matrix Oct 7, 2022, 1:25 AM

#

there’s also sftp

#

my head hurts

#

Not to stress you out, but once we ship a design built around a concept of local engine, it will be very hard to rip it out later

#

OK another idea: how about a fork-exec helper of reduced scope, designed as a stopgap for a future where we reimplement it all in each SDK?

#

for example a dagger-sync-workdir tool?

trim surge Oct 7, 2022, 1:44 AM

#

hearty matrix OK another idea: how about a fork-exec helper of reduced scope, designed as a st...

I'm thinking something along these lines too. Like implement our SDKs such that all the functionality like this (filesync, connecting to service containers, etc.) can be implemented via shelling out right now and then later swapped for native impls
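A rough sketch of the "shell out now, swap for native later" idea, assuming the SDK hides the backend behind a small interface. Every name here (`FileSync`, `ShellOutFileSync`, `NativeFileSync`, `upload_workdir`) is hypothetical, and the helper is a stand-in Python one-liner rather than any real dagger binary.

```python
import subprocess
import sys
from typing import Protocol, Sequence


class FileSync(Protocol):
    """Hypothetical interface the rest of the SDK would depend on."""

    def sync(self, local_dir: str) -> str: ...


class ShellOutFileSync:
    """Stopgap backend: fork-exec a helper and read its reply from stdout."""

    def __init__(self, helper_cmd: Sequence[str]):
        # helper_cmd is an assumption, e.g. ["dagger-sync-workdir"].
        self.helper_cmd = list(helper_cmd)

    def sync(self, local_dir: str) -> str:
        result = subprocess.run(
            [*self.helper_cmd, local_dir],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


class NativeFileSync:
    """Placeholder for a future in-language implementation."""

    def sync(self, local_dir: str) -> str:
        raise NotImplementedError("native filesync not written yet")


def upload_workdir(syncer: FileSync, local_dir: str) -> str:
    # Callers only see the interface, so the backend can be swapped later.
    return syncer.sync(local_dir)


# Stand-in helper for demonstration only: echoes the directory it was given.
FAKE_HELPER = [sys.executable, "-c", "import sys; print('synced:' + sys.argv[1])"]
```

With this shape, moving an SDK from the helper to a native implementation would only change which class is constructed, not the calling code.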

#

Router goes into our buildkitd wrapper

#

Yeah this needs more thought

#

But it is probably right. I gotta finish up the other stuff at this exact moment obviously but this can't wait long either

hearty matrix Oct 7, 2022, 1:46 AM

#

We would still need to bundle buildkit though. At best we get the luxury of bundling it in a container image for now, instead of a binary. But we still need the bundling for a bunch of reasons.

trim surge Oct 7, 2022, 1:47 AM

#

hearty matrix We would still need to bundle buildkit though. At best we get the luxury of bund...

Yes 100%, the bundling becomes more important than ever here

hearty matrix Oct 7, 2022, 1:48 AM

#

In this scenario would we still want to bundle both the engine + client utility in the dagger binary?

#

Another issue is the provisioners, it seems wrong to reimplement them in every SDK, even long term

trim surge Oct 7, 2022, 1:49 AM

#

hearty matrix In this scenario would we still want to bundle both the engine + client utility ...

My default would be to just bundle everything into one binary still to start because it makes it easier to think about and change. And once we have solidified designs we could consider breaking up if there's a need to

timber osprey Oct 7, 2022, 1:49 AM

#

catching up

timber osprey Oct 7, 2022, 1:49 AM

#

hearty matrix https://tenor.com/view/pizza-disaster-fire-gif-12899078

and this

hearty matrix Oct 7, 2022, 1:49 AM

#

hearty matrix Another issue is the provisioners, it seems wrong to reimplement them in every S...

it's not an issue of hard engineering (like eg. implementing filesync) it's the fragmentation that is a permanent problem

timber osprey Oct 7, 2022, 1:50 AM

#

Erik you know the internals of bk better than me, but I think rsync or whatever would be a big downgrade compared to native right?

#

AFAIK a ton of work went into filesync to be snappy for that particular use case

hearty matrix Oct 7, 2022, 1:50 AM

#

@timber osprey tally so far:

1. Problem solved by removing local engine
   - ? unclear for me at the moment
2. Implications of removing local engine
   - Filesync
   - Secrets
   - Provisioners
   - Host API (workdir, env)
   - API is no longer 100% graphql?

timber osprey Oct 7, 2022, 1:50 AM

#

(although it doesn't feel like that at times 😂)

trim surge Oct 7, 2022, 1:51 AM

#

timber osprey Erik you know the internals of bk better than me, but I think rsync or whatever ...

Honestly I don't know, I think the filesync service is built around similar concepts in terms of diff copying. I know rsync has heuristics around block based diffing too. But I have never looked into either in depth to have an informed opinion
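For readers unfamiliar with "diff copying", here is a deliberately simplified sketch of the general idea: hash fixed-size blocks and only resend the ones that differ. This is not rsync's actual algorithm (rsync uses a rolling checksum so it can find matches at any offset) and it says nothing about how buildkit's filesync works internally. The function names and the tiny block size are illustrative assumptions.

```python
import hashlib
from typing import List


def block_digests(data: bytes, block_size: int) -> List[bytes]:
    """Strong hash of each fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4) -> List[int]:
    """Indexes of blocks in `new` that a receiver holding `old` is missing."""
    old_digests = block_digests(old, block_size)
    missing = []
    for index, digest in enumerate(block_digests(new, block_size)):
        # A block must be sent if it is past the end of the old data
        # or if its content differs from the block at the same position.
        if index >= len(old_digests) or old_digests[index] != digest:
            missing.append(index)
    return missing
```

A fixed-offset comparison like this breaks down as soon as bytes are inserted near the start of a file, which is the case rolling checksums are designed to handle.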

timber osprey Oct 7, 2022, 1:51 AM

#

hearty matrix @timber osprey tally so far: 1. Problem solved by removing local engine ...

provisioners less so I believe?

even in that world, there would be a "dagger" CLI (only, at this point, more like a daggerctl than a daggerd)

trim surge Oct 7, 2022, 1:51 AM

#

I would be biased to agree with what you're saying though because I doubt all the work went into filesync for nothing

timber osprey Oct 7, 2022, 1:52 AM

#

provisioning could be done in one offs

#

not that it changes anything but yeah

trim surge Oct 7, 2022, 1:52 AM

#

hearty matrix @timber osprey tally so far: 1. Problem solved by removing local engine ...

API is already not 100% graphql today due to services (expose streams over websockets). That's not a bad thing either, I think it's fine as long as it doesn't keep growing with more and more protocols/endpoints/etc.

timber osprey Oct 7, 2022, 1:52 AM

#

trim surge I would be biased to agree with what you're saying though because I doubt all th...

Yeah. I would assume, in a naive way, that there MUST be a reason why they're not rsyncing stuff around

hearty matrix Oct 7, 2022, 1:52 AM

#

timber osprey provisioners less so I believe? even in that world, there would be a "dagger" C...

But then we'd still be fork-execing to dagger from the SDK, isn't that what you wanted to avoid?

trim surge Oct 7, 2022, 1:53 AM

#

timber osprey Yeah. I would assume, in a naive way, that there MUST be a reason why they're no...

Agreed, that's where I'm at too. But at the same time rsync is widely used and "okay enough" IME so I don't know what the reasoning was

timber osprey Oct 7, 2022, 1:53 AM

#

timber osprey provisioning could be done in one offs

so that was my question

#

does the sdk need to provision?

trim surge Oct 7, 2022, 1:53 AM

#

timber osprey does the sdk need to provision?

Something needs to provision. I think the fact that if the engine isn't running it will be spun up for you is extremely extremely nice

hearty matrix Oct 7, 2022, 1:54 AM

#

with the current assumptions (avoid managing stateful daemons) yes

trim surge Oct 7, 2022, 1:54 AM

#

hearty matrix with the current assumptions (avoid managing stateful daemons) yes

Yeah that too

hearty matrix Oct 7, 2022, 1:54 AM

#

part of that requires illusion/polyfill, but it's worth it because fundamentally there's no reason for buildkit daemons to be pets, should be cattle

timber osprey Oct 7, 2022, 1:55 AM

#

by provisioning we mean spinning up the container? or more complex provisioning (e.g. k8s etc)?

hearty matrix Oct 7, 2022, 1:55 AM

#

unclear

trim surge Oct 7, 2022, 1:56 AM

#

timber osprey by provisioning we mean spinning up the container? or more complex provisioning ...

Starting with just container, but maybe someday more complex, nice to leave "more" complex open

#

@timber osprey what do you think about the idea of retaining the local binary, but don't run it as an engine, just shell out to it for some utilities we don't want to rewrite in every language (mentioned above somewhere I think)

timber osprey Oct 7, 2022, 1:57 AM

#

the former -- it's really nice to have. however:

if we don't, not that weird (that's what you do with virtually all tooling)
and if we do, not rocket science for every SDK to do, right?

#

e.g. not rocket science --> compared to the freaking codegen and xdx 🙂

#

the bar is already high

hearty matrix Oct 7, 2022, 1:57 AM

#

Ideally it would all be a single, idempotent action that can be called on demand by the SDK

#

ie. the first engine.Start may do a lot. Subsequent calls may be faster because images are already downloaded, kub services deployed, etc. But conceptually it's all bundled together
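A toy model of what "idempotent" means here, assuming the host's state can be inspected before each step. The `host` dictionary and the step names are placeholders, not real provisioning calls.

```python
from typing import Dict, List


def start_engine(host: Dict[str, bool]) -> List[str]:
    """Bring `host` to the 'engine running' state and return the steps taken.

    Each step is skipped when the host is already in the desired state,
    so calling this repeatedly is safe and later calls do less work.
    """
    steps = []
    if not host.get("image_present"):
        host["image_present"] = True      # placeholder for pulling the image
        steps.append("pull image")
    if not host.get("engine_running"):
        host["engine_running"] = True     # placeholder for starting the container
        steps.append("start container")
    return steps
```

The first call on a fresh host performs both steps, a second call performs none, and a call after the engine has stopped only restarts the container.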

timber osprey Oct 7, 2022, 1:58 AM

#

trim surge @timber osprey what do you think about the idea of retaining the local bi...

yeah, like a helper binary

timber osprey Oct 7, 2022, 1:58 AM

#

hearty matrix ie. the first `engine.Start` may do a lot. Subsequent calls may be faster becaus...

that's why I was asking simple container vs k8s deployments etc

hearty matrix Oct 7, 2022, 1:59 AM

#

to get an answer? 🙂

timber osprey Oct 7, 2022, 1:59 AM

#

timber osprey by provisioning we mean spinning up the container? or more complex provisioning ...

for the latter, I don't know honestly. Would I feel comfy giving root access to a tool to do stuff on my k8s cluster?

#

vs having an official helm chart, CF template, etc

#

same feeling as "use the language you're familiar with" -- "use the native tooling you're already familiar with"

#

but debatable

hearty matrix Oct 7, 2022, 2:00 AM

#

timber osprey for the latter, I don't know honestly. Would I feel comfy giving root access to ...

Some shops give access to kub namespaces. So you could use that

#

If your kub admin won't let you deploy stuff on kub, then you don't use the kub provisioner

#

in that case "provision" may just be "connect to this known URL"

#

maybe provisioner is the wrong term?

timber osprey Oct 7, 2022, 2:07 AM

#

timber osprey same feeling as "use the language you're familiar with" -- "use the native tooli...

with the assumption that there's not a one-size fits all deployment, we'd have to replicate what 3rd party tools are doing

#

like, ok, deploy to my k8s cluster. but I happen to use istio or whatever service mesh, so i need those extra labels. Oh and rbac is configured this way so blah blah

trim surge Oct 7, 2022, 2:09 AM

#

hearty matrix in that case "provision" may just be "connect to this known URL"

I would consider setting BUILDKITD_HOST env to be the dumbest possible provisioner implementation. Agree the term provisioner is not totally accurate
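As a sketch of that "dumbest possible provisioner", assuming the SDK checks an environment variable first and only provisions something itself when it is unset. The function name, the `fallback` callback, and the example addresses are hypothetical.

```python
from typing import Callable, Mapping


def resolve_engine_address(
    env: Mapping[str, str],
    fallback: Callable[[], str],
    var: str = "BUILDKITD_HOST",
) -> str:
    """Return the engine address, preferring a pre-provisioned one from `env`."""
    address = env.get(var)
    if address:
        # "Provisioning" is just connecting to a known address.
        return address
    # Nothing configured: let a real provisioner (e.g. start a container) decide.
    return fallback()
```

An SDK could call this with `os.environ` and a function that starts a local container, so users who manage their own engine only have to set one variable.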

timber osprey Oct 7, 2022, 2:11 AM

#

timber osprey like, ok, deploy to my k8s cluster. but I happen to use istio or whatever servic...

(not complaining about k8s fragmentation -- just wondering if complex "provisioning" is better handled outside of SDKs, into native tooling like cloudformation, terraform, helm, etc etc -- in which case we only need "basic default provisioning" which is doable in every SDK, I think)

#

but anyway -- doesn't change anything for filesync, secrets, etc

hearty matrix Oct 7, 2022, 2:16 AM

#

timber osprey like, ok, deploy to my k8s cluster. but I happen to use istio or whatever servic...

yeah agreed

#

but we either 1) need something pluggable in the client to invoke stateless engines remotely (possibly pre-provisioned) or 2) everyone needs to manage pet daemons

#

on kubernetes I think correct invocation is actually with a one-off exec (as opposed to declarative kubectl apply)

#

on a remote server: ssh + exec

trim surge Oct 7, 2022, 2:22 AM

#

I mentioned the shell out utility idea here: https://github.com/dagger/dagger/discussions/3280#discussioncomment-3819142
Think that would be a step up from current state if nothing else. Could also be stepping stone to native clients or just stay like that forever

hearty matrix Oct 7, 2022, 2:23 AM

#

trim surge I would consider setting `BUILDKITD_HOST` env to be the dumbest possible provisi...

would be DAGGER_HOST though right?

trim surge Oct 7, 2022, 2:23 AM

#

hearty matrix would be DAGGER_HOST though right?

yes once Guillaume's PR is merged 🙂

hearty matrix Oct 7, 2022, 2:36 AM

#

this helper binary would need to be a local proxy for the duration of the session. Since it needs to handle callbacks from the server

#

so the http2/grpc plumbing needs to flow through the helper or too much work left to the sdk

#

this actually solves the provisioning problem too

#

gives the helper a hook to handle that too

#

same helper can wrap your shell script to solve session management

#

host API could remain: callbacks would flow back from remote engine to local helper

#

oh and socket forwarding too (adding that to tally)

#

dagger client ?

#

or dagger-client

trim surge Oct 7, 2022, 2:42 AM

#

Yes to all the above, dagger client is fine, in my current thinking this would be a hidden command from end users, just meant to be shelled out by sdks

#

If we continue to use buildkit's session management (as we do today), then we'll need to either A) fix session sharing upstream or B) have dagger client be long-running for full duration of session and accept commands over pipes (brings us back to two daemons technically I guess, but in slightly different form)

Or we can make our own session concept and not expose buildkit's from daggerd. Usual tradeoff of more work for more customizability.

#

Just thinking out loud

hearty matrix Oct 7, 2022, 3:00 AM

#

doesn’t the problem go away if we do everything in one buildkit client connection?

trim surge Oct 7, 2022, 3:03 AM

#

hearty matrix doesn’t the problem go away if we do everything in one buildkit client connectio...

Yes, but that also means that the helper binary needs to run for the duration of the session. So it's not shelling out for individual functionality, it's running continuously and just accepting commands for individual pieces of functionality over pipes. Which is all totally fine, just thinking about it
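To make the "long-running helper that accepts commands over pipes" shape concrete, here is a minimal sketch. It assumes a line-delimited JSON protocol, which is an invention for illustration; the helper is a stand-in Python script, and the operation names (`sync_dir`, `forward_socket`) are hypothetical.

```python
import json
import subprocess
import sys

# Stand-in for the helper binary: reads one JSON command per line on stdin,
# writes one JSON reply per line on stdout, and exits when stdin is closed.
HELPER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    cmd = json.loads(line)
    reply = {"id": cmd["id"], "ok": True, "echo": cmd["op"]}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class HelperSession:
    """Keeps one helper process alive for the whole session."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self.next_id = 0

    def call(self, op, **params):
        # Send one command and block until the matching reply line arrives.
        self.next_id += 1
        request = {"id": self.next_id, "op": op, "params": params}
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        self.proc.stdin.close()      # EOF tells the helper to exit
        code = self.proc.wait()
        self.proc.stdout.close()
        return code


session = HelperSession([sys.executable, "-c", HELPER_SOURCE])
first = session.call("sync_dir", path=".")
second = session.call("forward_socket", path="/hypothetical.sock")
alive_between_calls = session.proc.poll() is None
exit_code = session.close()
```

Both calls go to the same process, which is the point: the helper's lifetime matches the session rather than a single command.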

hearty matrix Oct 7, 2022, 3:05 AM

#

that’s what I’m saying, we need that anyway because of the callback-driven nature of filesync (not to mention socket forwarding)

#

no way around it imo

#

it’s basically exactly the same as today except instead of running the engine locally you’re running a local proxy to the engine

trim surge Oct 7, 2022, 3:15 AM

#

hearty matrix it’s basically exactly the same as today except instead of running the engine lo...

Yes this has somehow come full circle... I think there's probably cleanups in the details to be made, but our current architecture is probably not very far off from reasonable even though it seems weird

hearty matrix Oct 7, 2022, 3:17 AM

#

it’s suddenly way less weird if you fork-exec a helper proxy rather than a server talking to another server

#

Now actually feel that we could keep the fork-exec forever (but keep the option open to change our minds later)

#

remaining part is the API now having a non-gql component

trim surge Oct 7, 2022, 3:18 AM

#

hearty matrix it’s suddenly way less weird if you fork-exec a helper proxy rather than a serve...

Yeah I think it may just be more a change in terminology rather than any huge changes in implementation

hearty matrix Oct 7, 2022, 3:21 AM

#

still would be a pretty major change not running a server locally, plus lots of new grpc plumbing

#

but smaller diff than I feared at the beginning of this thread

trim surge Oct 7, 2022, 3:21 AM

#

hearty matrix still would be a pretty major change not running a server locally, plus lots of ...

Yeah and I'm now questioning whether even that is all that beneficial actually.

#

If we are sending commands over pipes to a local proxy running for the duration of the session, then why not just do what we're doing now: run the router in that local proxy and send graphql through it?

#

idk I need to sleep on it, brain is sputtering

hearty matrix Oct 7, 2022, 3:27 AM

#

That’s where making a list of current pain would be helpful

#

then we can weigh cost & benefit

timber osprey Oct 7, 2022, 5:20 AM

#

Catching up

#

Sorry about the “food for thought”, it turned into a full banquet

hearty matrix Oct 7, 2022, 5:28 AM

#

https://tenor.com/view/monty-python-explode-meaning-of-life-gif-13575259

timber osprey Oct 8, 2022, 2:01 AM

#

trim surge If we are sending commands over pipes to a local proxy running for the duration ...

Good point. Need to think more about it. I guess the major difference is it’s scoped to a much smaller functionality rather than being the full engine

Basically, SDKs would use that binary as an “implementation stopgap”, with the end goal of eventually supporting it natively (and the SDK could actually support it natively from day 1)

That point of view changes things quite a bit. For instance it would be ok for the node sdk to just embed the proxy binary inside the npm package itself (I’ve seen a few packages doing that). Since it’s more of a “.dll” than an engine.

Embedding the full engine has bigger implications. Every script has its own instance, no multi tenancy. Services die off as soon as the script is done. We talked about how cloak attach was weird because it wanted to attach to an existing server rather than spawning its own

#

On the operations side, taking playground as an example:

Marcos will probably run dagger in a container, and mount the docker socket inside so that dagger can launch itself again as a daggerd container and communicate through stdio, which is plenty weird (although we could probably make that better regardless for the use case of running dagger in a container)

But then if it runs as a container: what to do with local files etc? (problem we do have in playground)

timber osprey Oct 8, 2022, 2:08 AM

#

hearty matrix That’s where making a list of current pain would be helpful

Yes

timber osprey Oct 8, 2022, 2:11 AM

#

hearty matrix or `dagger-client`

Related to point above

Could potentially be a much smaller binary embedded in the SDK. Basically a dagger.so library-ish, but over fork exec, to stopgap whatever the SDKs can’t implement on their own

trim surge Oct 8, 2022, 2:14 AM

#

I have to call it a night but now that I have a slight bit of breathing room plan is to think about this so we can draw out options some more. But yes first step will be to just write out the problems explicitly, ensure they are real, etc. I'll update this discussion (which is currently just my dumping grounds, but feel free to dump thoughts too, I'll clean it up once thoughts are formed): https://github.com/dagger/dagger/discussions/3280

timber osprey Could potentially be a much smaller binary embedded in the SDK. Basically a dagger.so library-ish, but over fork exec, to stopgap whatever the SDKs can’t implement on their own
I'd like this a lot; there are devils in the details related to session crap but I have been thinking about it on and off today and have some vague ideas on shortcuts we can look into. The local dirs specifically have an insanely complicated caching scheme, there's a small chance we might be able to use them independent of sessions, which will simplify things greatly I think

timber osprey Oct 8, 2022, 2:14 AM

#

(Hope i’m not adding to confusion here, just brainstorming, not convinced either way)

trim surge Oct 8, 2022, 2:15 AM

#

timber osprey (Hope i’m not adding to confusion here, just brainstorming, not convinced either...

No the architecture is complicated and we need to make sure if we continue going down this route it's at least consciously. But even better would be simplifying it. Worthy exercise to go through

timber osprey Oct 8, 2022, 2:20 AM

#

I’ve been thinking about filesync/rsync etc, and what would be the simplest possible thing to do (relatively speaking).

Turns out, you can “wrap” gRPC over websockets (found an example repo on GitHub). And you can “proxy” buildkit (Erik you implemented that in the early days of the typescript SDK when TS was talking gRPC directly).

So in theory: engine could proxy buildkit’s filesync as is. Wrap it in WS.

Go SDK natively imports bk, wraps it in WS, exposed as SDK functions

Small binary uses the Go SDK to do just that. It’s embedded by SDKs that can’t implement the session stuff

Someday: we replace gRPC over WS by our own thing over WS, SDKs use that directly, no more binary

#

(e.g. /ws/v1/filesync is just a proxy for the filesync bits in gRPC over WS. /ws/v2/filesync can be a simpler protocol in the future that can be implemented natively in TS etc)
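One way to read the versioned-endpoint idea, sketched from the client's side: an SDK picks the newest filesync endpoint it can speak natively, and otherwise falls back to the embedded helper on v1. The two paths come from the message above; the selection function and the notion of an SDK advertising "native versions" are assumptions.

```python
from typing import Iterable, Tuple

FILESYNC_ENDPOINTS = {
    1: "/ws/v1/filesync",  # buildkit's gRPC filesync proxied over WebSocket
    2: "/ws/v2/filesync",  # hypothetical simpler protocol for native clients
}


def choose_filesync_endpoint(
    server_versions: Iterable[int],
    sdk_native_versions: Iterable[int],
) -> Tuple[str, str]:
    """Return (endpoint path, "native" or "helper") for this SDK."""
    server = set(server_versions)
    native = sorted(server & set(sdk_native_versions), reverse=True)
    if native:
        # The SDK can speak this version itself, so no helper binary is needed.
        return FILESYNC_ENDPOINTS[native[0]], "native"
    if 1 in server:
        # SDKs that cannot implement the session protocol use the helper on v1.
        return FILESYNC_ENDPOINTS[1], "helper"
    raise RuntimeError("no usable filesync endpoint")
```

Under this reading, the Go SDK would report v1 as native because it imports the buildkit client, while a TypeScript SDK would use the helper until a v2 exists.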

timber osprey Oct 8, 2022, 2:26 AM

#

timber osprey I’ve been thinking about filesync/rsync etc, and would would be the simplest pos...

Err: by “exposed as SDK functions” I didn’t mean we expose bk. I meant the filesync bits work under the hood in the Go SDK because internally it uses bk client, but that’s an implementation detail.

Then in theory if this could be compiled as a .so and linked to JS, it would work too. Since it’s not possible, it’s embedded bin+fork/exec

timber osprey Oct 8, 2022, 2:28 AM

#

trim surge I have to call it a night but now that I have a slight bit of breathing room pla...

100% — let’s make sure the problems are real before diving into a costly implementation

Have a great weekend everyone!

trim surge Oct 8, 2022, 2:31 AM

#

timber osprey 100% — let’s make sure the problems are real before diving into a costly impleme...

You too!

trim surge Oct 8, 2022, 4:42 AM

#

timber osprey I’ve been thinking about filesync/rsync etc, and would would be the simplest pos...

Yes to this, sync-as-a-service (or similar) in the very long term feels like a nice approach. Interestingly, we might actually replace local source llb ops with cache mounts running on gateway containers if we do that (maybe).

Slight tangent, but I was thinking a few days ago about how you’d implement “hot reload” of generated code clients whenever a change is made to your code first extension schema. One possibility would be to run mutagen (two way remote syncing service) over a cloak service. The HLB authors actually told me about that idea, they ran it directly over bk ssh sockets. So then you sync local extension code changes to the remote service over websocket, it generates client code and syncs that back to you.

I guess you could enable general hot reload use cases with this approach too? Like frontend dev tools and stuff. Or hot reload of extensions into the cloak engine too I suppose.

Either way, not high priority but kind of interesting and semi related in that it also involves file syncing. But mutagen doesn’t have native clients in non-go (I think, didn’t check though), so doesn’t really answer any of those questions.

hearty matrix Oct 10, 2022, 10:21 PM

#

So, I understand the concern about having a complicated architecture with one local engine and possibly more remote engines. I find the potential solution (local engine is replaced by a helper proxy) elegant.

But. I have a concern of my own on the UX side: no more local engine means there is always a stateful daemon that needs to be installed and managed out of band. No more “install SDK, it just works!”. This leads to a UX similar to docker engine: soon you need to manage infrastructure on the side before you can do anything. Enter Docker Machine, boot2docker, Docker Toolbox, Vagrant, and of course podman, kubernetes… That is to say, a horrible and fragmented UX. Everyone endlessly tending to pet-like stateful engines, and arguing over the best tool to do so. Turning everyone into grumpy sysadmins.

How do we avoid that?

#

Unlike docker engine, our engine can get away with being stateless because it is based on buildkit which only has a cache to worry about. It would be a shame if we wasted that opportunity with a pet management UX a-la docker machine

trim surge Oct 11, 2022, 4:29 PM

#

hearty matrix So, I understand the concern about having a complicated architecture with one lo...

I agree that's one of the things we need to figure out and that we should find any way we can to hide the requirement of persistently running daemons. But I also think it's mostly orthogonal to whether the local engine exists or not.

In today's current state w/ the local engine, there is still a persistent stateful daemon (buildkitd). We could in theory fix that by making our buildkitd wrapper ephemeral while still retaining the local engine.

In a world where we get rid of the local engine (either have a helper binary or put everything in the client SDKs), we will still need to solve the problem of running buildkitd functionality ephemerally.

So it's not that the problems have no interaction with one another, it's more that making buildkitd ephemeral and cattle-like will require its own independent set of solutions.
