I did read you mentioning sessions etc | Dagger | Page 1

cyan stag Nov 1, 2022, 10:37 PM

#

Yeah long-running engine is going to require a lot more changes since it would by default result in there only ever being one buildkit session, which breaks a lot. Fixable, but most likely not by the next SDK release.

Last night what I did was put buildkitd+cloak in an image and then use commandconn to do docker exec into that container to exec the engine. But then I later realized that -v /:/host doesn't really work on macos, so I think we probably need to fall back to just having SDKs do a local fork/exec of the binary, same as before.

Either way, need the stdio-dialer

fallow pike Nov 1, 2022, 10:57 PM

#

Hey @cyan stag and @last wyvern, quick check-in before going to bed... need some feedback from me?

cyan stag Nov 1, 2022, 10:58 PM

#

fallow pike Hey <@949034677610643507> and <@707661669819613324>, quick check-in before going...

No nothing urgent at all! We are good to go for now

fallow pike Nov 1, 2022, 10:59 PM

#

Ok, can you sum up for me how we are on provisioning in the other SDKs? There's quite a bit of discussion but I don't have time to go through.

cyan stag Nov 1, 2022, 11:00 PM

#

fallow pike Ok, can you sum up for me how we are on provisioning in the other SDKs? There's ...

I will but first I want to double-check that everyone else is on the same page, and before that I want to finish the docs I'm writing right now 🙂

I will absolutely write the summary out though once we have it (either tonight or tomorrow)

fallow pike Nov 1, 2022, 11:01 PM

#

Ok 👍 I've put a pause on that feature until we know what's best 🙂

#

Oh man, I'm behind 67 commits 😲 (from main)

cyan stag Nov 2, 2022, 6:36 PM

#

Just to update here, I'm currently building on the previous prototype but this time instead of doing commandconn to docker exec I'm bundling cloak in the dagger-engine image and then having the SDK copy it out of there locally (for its platform) and using that to talk to buildkitd running in the dagger-engine image. Makes packaging way easier (just an image, not an image plus also figure out how to bundle binaries in pypi+npm) and also makes it way easier to switch to the approach where engine is fully remote in the future.

I'm just running through it w/ go first to make sure it actually works (so don't take as final design or anything), but if anyone has any immediate thoughts lemme know. The end result would be that SDKs need to do a little bit of shelling out to docker (similar to what engine does today), but I'm thinking that's a good tradeoff given the packaging work it saves us and other benefits
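
A rough sketch of the "copy the binary out of the image" step described above, assuming it is done by shelling out to the docker CLI (`docker create`, `docker cp`, `docker rm`). The image reference, the in-image path layout, and the container name are hypothetical placeholders, not taken from the prototype.

```python
import platform
import subprocess

# Hypothetical image reference; the thread does not name the real one.
IMAGE = "example.invalid/dagger-engine:dev"


def binary_path_in_image(os_name, arch):
    """Assumed layout: one client binary per platform inside the image."""
    return f"/usr/bin/cloak-{os_name}-{arch}"


def copy_out_commands(image, os_name, arch, dest, container="cloak-extract"):
    """Return the docker invocations that extract one file from an image.

    `docker create` makes a stopped container from the image, `docker cp`
    copies a path out of it, and `docker rm` removes the container again.
    """
    src = f"{container}:{binary_path_in_image(os_name, arch)}"
    return [
        ["docker", "create", "--name", container, image],
        ["docker", "cp", src, dest],
        ["docker", "rm", container],
    ]


def host_platform():
    """Map the host OS/arch to Go-style names (linux/darwin, amd64/arm64)."""
    os_name = platform.system().lower()
    machine = platform.machine().lower()
    arch = {"x86_64": "amd64", "aarch64": "arm64"}.get(machine, machine)
    return os_name, arch


def copy_out(image, dest):
    """Run the extraction for the host platform; needs a working docker CLI."""
    os_name, arch = host_platform()
    create, cp, rm = copy_out_commands(image, os_name, arch, dest)
    subprocess.run(create, check=True)
    try:
        subprocess.run(cp, check=True)
    finally:
        # Always remove the temporary container, even if the copy failed.
        subprocess.run(rm, check=True)
```

Note that `docker create` needs the image to define a command or entrypoint; an image that runs buildkitd would normally have one.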

last wyvern Nov 2, 2022, 9:27 PM

#

cyan stag Just to update here, I'm currently building on the previous prototype but this t...

Hit roadblocks with commandconn'ing the engine from a container?

#

(the volume hack I assume?)

cyan stag Nov 2, 2022, 9:28 PM

#

Oh I've abandoned the volume hack at this point, but my new approach of having the sdks obtain the cloak binary out of the container image is working

last wyvern Nov 2, 2022, 9:28 PM

#

(to me it makes sense -- long running engine > engine as a command in the container > engine copied from the container and executed locally)

last wyvern Nov 2, 2022, 9:28 PM

#

last wyvern (to me it makes sense -- long running engine > engine as a command in the contai...

in order of "getting closer to the real deal"

cyan stag Nov 2, 2022, 9:29 PM

#

last wyvern (to me it makes sense -- long running engine > engine as a command in the contai...

Yep that's exactly it, there's a nice step-by-step path

last wyvern Nov 2, 2022, 9:29 PM

#

option 1 is the long term approach
option 2 is kinda long term as well: SDKs will always exec a command (dial-stdio). We cheat to start and dial-stdio spawns a new engine instead of proxying traffic to long running
option 3 is the stopgap

#

what I liked about option 2 is that there kinda wouldn't be any SDK changes -- we just change the meaning of dial-stdio (and btw, that's how buildkit does it -- dial-stdio is a proxy from stdio to unix)
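
For reference, a minimal sketch of what "a proxy from stdio to unix" can look like, assuming a plain byte-for-byte bridge between the process's stdin/stdout and a unix socket. The function names are made up for illustration and this is not buildkit's implementation.

```python
import os
import socket
import threading


def pump(read_chunk, write_all, on_eof):
    """Copy bytes from read_chunk() to write_all() until EOF, then call on_eof()."""
    while True:
        chunk = read_chunk()
        if not chunk:
            break
        write_all(chunk)
    on_eof()


def proxy_stdio(sock, stdin_fd=0, stdout_fd=1):
    """Bridge two file descriptors and a connected socket in both directions."""
    def write_stdout(data):
        # os.write may write fewer bytes than asked, so loop until done.
        while data:
            data = data[os.write(stdout_fd, data):]

    # stdin -> socket runs on a helper thread; on EOF we half-close the
    # socket so the server sees that the client is done sending.
    upstream = threading.Thread(
        target=pump,
        args=(lambda: os.read(stdin_fd, 65536), sock.sendall,
              lambda: sock.shutdown(socket.SHUT_WR)),
        daemon=True,
    )
    upstream.start()
    # socket -> stdout runs on the calling thread and ends when the server closes.
    pump(lambda: sock.recv(65536), write_stdout, lambda: None)


def dial_stdio(socket_path):
    """Connect to a unix socket and proxy this process's stdio to it."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        proxy_stdio(sock)
```

The SDK side only ever sees a child process with stdin/stdout, which is why swapping what sits behind the command would not need SDK changes.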

cyan stag Nov 2, 2022, 9:30 PM

#

The code I have works on linux-client<->linux-host and now I'm just making some updates so it works on darwin-client<->linux-host, once I see that working I'll send out a draft PR. Code will need cleanup but we can get consensus on the details of the plan

last wyvern Nov 2, 2022, 9:31 PM

#

but if it's not possible, the next best thing is option 3 (downside is extra work for the SDKs, copying stuff into the user machine). Not the end of the world

cyan stag Nov 2, 2022, 9:32 PM

#

last wyvern what I liked about option 2 is that there *kinda* wouldn't be any SDK changes --...

We can go to option 2 once we have localdir syncing, which @round dove is looking into, so we're already taking steps towards it

last wyvern Nov 2, 2022, 9:32 PM

#

awesome

cyan stag Nov 2, 2022, 9:32 PM

#

last wyvern but if it's not possible, the next best thing is option 3 (downside is extra wor...

Yeah I'm doing what I can to make that minimally painful, but right now there's a TODO for cleaning up old binaries

#

it's going in ~/.cache/dagger

last wyvern Nov 2, 2022, 9:32 PM

#

yeah to me, option 3 is way better than "just talk to a CLI that got installed somehow, disregard compat issues"

cyan stag Nov 2, 2022, 9:33 PM

#

last wyvern yeah to me, option 3 is way better than "just talk to a CLI that got installed s...

Yeah trying to package binaries for every package manager scares me

last wyvern Nov 2, 2022, 9:34 PM

#

cyan stag it's going in ~/.cache/dagger

pre-emptive nit: should probably include <os>-<arch>-<version>. Making sure that multiple SDK versions on the same machine don't end up using "whatever the other SDK pulled"

cyan stag Nov 2, 2022, 9:35 PM

#

last wyvern pre-emptive nit: should probably include <os>-<arch>-<version>. Making sure that...

Oh I have TODOs for all that yeah

#

I'm doing all the good stuff like including the sha with name, downloading it to a tmp file and then rename(2) it to the final path, etc. etc.
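
A small sketch of that install pattern, assuming a file name built from `<os>-<arch>-<version>` plus a digest, written to a temp file in the cache directory and then renamed into place. The `dagger-engine-` prefix and the 16-character digest suffix are illustrative choices, not the actual naming scheme.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def cached_name(os_name, arch, version, sha256_hex):
    """Hypothetical naming scheme: <os>-<arch>-<version> plus a short digest."""
    return f"dagger-engine-{os_name}-{arch}-{version}-{sha256_hex[:16]}"


def install(data, cache_dir, os_name, arch, version):
    """Write data into cache_dir atomically and return the final path."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()
    final = cache_dir / cached_name(os_name, arch, version, digest)
    if final.exists():
        return final  # an earlier run (or another SDK process) installed it

    # The temp file lives in the same directory so the rename stays on one
    # filesystem, which is what makes it atomic.
    fd, tmp = tempfile.mkstemp(dir=cache_dir, prefix="tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.chmod(tmp, 0o700)
        os.replace(tmp, final)  # rename(2) on POSIX
    except BaseException:
        os.unlink(tmp)
        raise
    return final
```

Because readers only ever see the final name after the rename, a half-downloaded binary is never visible under it, and two SDK versions get two different files.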

round dove Nov 2, 2022, 9:35 PM

#

cyan stag We can go to option 2 once we have localdir syncing, which <@108011715077091328>...

posted an update here btw: https://github.com/dagger/dagger/issues/3629#issuecomment-1301189176 tl;dr - gRPC over HTTP possible, seems like a good bet for the various server->client reqs we might need to support, aiming to figure out how much the client needs to implement and try to avoid any hard dependency on github.com/moby/buildkit

cyan stag Nov 2, 2022, 9:35 PM

#

But also making sure i'm only doing stuff that's available in nodejs+python stdlibs (nodejs lacks flock...)
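
One stdlib-only way to get mutual exclusion without flock is an exclusive-create lock file. This is a sketch of that general technique, not what the prototype does; the function names and the timeout values are arbitrary. Python exposes exclusive create through `os.O_EXCL`, and Node's `fs.open` has the `wx` flag for the same purpose.

```python
import os
import time


def acquire_lock(path, timeout=10.0, poll=0.05):
    """Create path exclusively; return True once we own it, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one process can succeed at a time.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
            continue
        # Record the owner's pid to help with debugging stale locks.
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True


def release_lock(path):
    """Drop the lock by removing the file."""
    os.unlink(path)
```

Unlike flock, a lock file like this is not released automatically if the owning process crashes, so some stale-lock handling would still be needed.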

last wyvern Nov 2, 2022, 9:35 PM

#

cyan stag Yeah trying to package binaries for every package manager scares me

go:embed of a go binary in a Go SDK is 🤢

last wyvern Nov 2, 2022, 9:38 PM

#

round dove posted an update here btw: <https://github.com/dagger/dagger/issues/3629#issueco...

One thing to possibly consider: there might be a way to do gRPC over websockets. Like, as a transport wrapper

#

Like, not a "simpler websocket endpoint", but literally "over websockets"

#

which someday could become a simpler websocket endpoint (e.g. v1 is gRPC as is, v2 is a simpler protocol)

#

I found this POC online, not sure how good it is: https://github.com/tmc/grpc-websocket-proxy

round dove Nov 2, 2022, 9:40 PM

#

is there a practical difference between that and a HTTP/2 connection upgrade? the latter seems to be what docker<->buildx<->buildkit does

last wyvern Nov 2, 2022, 9:41 PM

#

Fair.

The only practical difference is we'll probably end up using websockets one way or another. GraphQL subscriptions are websockets-based

That's why I picked websockets for dagger service attach -- could have been http/2, but I figured we'll pull in WS for other reasons (e.g. if we expose logs over GraphQL, it's probably going to be a WS subscription)

#

But ... it's a nice to have. Like, priority -12

round dove Nov 2, 2022, 9:42 PM

#

gotcha. i don't feel strongly really, just curious. sounds like something we can slip in later too if/when we want it right? (including if "when" is just after I finish my prototype)

#

also what are our plans for graphql subscriptions?

last wyvern Nov 2, 2022, 9:43 PM

#

yeah, agree. Eventually I think we'll have our own protocol (not sure if you agree with this) so that SDKs can implement it directly. Right now with the gRPC based approach, SDKs will have to spawn a binary to handle this

round dove Nov 2, 2022, 9:44 PM

#

round dove also what are our plans for graphql subscriptions?

("none yet but want to keep the door open" works for me)

cyan stag Nov 2, 2022, 9:44 PM

#

round dove ("none yet but want to keep the door open" works for me)

There's a long standing TODO in my brain for "Figure out how we could use subscriptions"

last wyvern Nov 2, 2022, 9:44 PM

#

last wyvern yeah, agree. Eventually I think we'll have our own protocol (not sure if you agr...

but anyway -- whether we do this or not, at that point we could switch to another protocol

last wyvern Nov 2, 2022, 9:45 PM

#

round dove also what are our plans for graphql subscriptions?

none yet

I think in the future the first use case is logs

Right now for instance, the only way for SDKs to grab logs is from stderr

As soon as the Engine is a long running container and we communicate exclusively over HTTP, that trick is gone -- we'll need to stream logs somehow, subscriptions sound like a good candidate

#

e.g. right now we're cheating: the API is GraphQL + stderr. Make the engine a "true" API server, and we need to shove logs somewhere else
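
To make the "GraphQL + stderr" point concrete, here is a toy sketch of a client that spawns a child process and treats its stderr as the log channel. The child is a stand-in script, not the engine, and having it answer on stdout is an assumption made only to keep the example self-contained.

```python
import subprocess
import sys

# Stand-in for the engine: answers on stdout and logs on stderr.
FAKE_ENGINE = (
    "import sys\n"
    "print('log: starting', file=sys.stderr)\n"
    "print('{\"data\": {\"ok\": true}}')\n"
    "print('log: done', file=sys.stderr)\n"
)


def run_and_split():
    """Run the stand-in and return (api_response, log_lines)."""
    proc = subprocess.run(
        [sys.executable, "-c", FAKE_ENGINE],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip(), proc.stderr.splitlines()


if __name__ == "__main__":
    response, logs = run_and_split()
    print("response:", response)
    print("logs:", logs)
```

This only works while the SDK owns the engine process. Once the engine is a remote HTTP server there is no stderr pipe to read, which is the gap subscriptions would fill.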

round dove Nov 2, 2022, 9:47 PM

#

last wyvern yeah, agree. Eventually I think we'll have our own protocol (not sure if you agr...

gRPC/protobuf is a good candidate for implementing in other languages too, no? I haven't tried in non-Go yet but I thought that was the point

last wyvern Nov 2, 2022, 9:49 PM

#

round dove gRPC/protobuf is a good candidate for implementing in other languages too, no? I...

Yep. But if we decide we want a simpler protocol than the buildkit one (because it's hard to implement), then we might as well reconsider transport (gRPC is not the easiest to work with)

round dove Nov 2, 2022, 9:50 PM

#

it did take a while to grok. seems really handy once you know it, there's a lot of moving parts

cyan stag Nov 2, 2022, 9:59 PM

#

just tried to test on macos but I can't run git because I updated to ventura and now I have to reinstall all dev command line tools including git........

While I'm waiting @last wyvern we're gonna need a container image registry by the next SDK release and associated automation for releasing to it when we do engine releases. Do you have any thoughts on what registry to use? I'm also guessing it would be nice to use a vanity URL so we can switch backends in the future, but I have no clue if that's possible with registries, never tried before

last wyvern Nov 2, 2022, 10:10 PM

#

cyan stag just tried to test on macos but I can't run git because I updated to ventura and...

we're supposed to chat with the docker folks about rate limiting etc (/cc @midnight widget @stiff temple)

Using a vanity URL would be nice to avoid being tied to one provider

It's not a widespread practice but I've seen a few doing this

cyan stag Nov 2, 2022, 10:11 PM

#

last wyvern we're supposed to chat with the docker folks about rate limiting etc (/cc <@4884...

If rate limiting or other issues become a blocker there's always the fallback option of just putting the image tarball somewhere and having the SDKs download it and docker load it in. But if we can have a solid registry setup by the next release that seems ideal

last wyvern Nov 2, 2022, 10:57 PM

#

@cyan stag Yeah agreed. Especially since we still need to run the "engine" image for buildkit itself

#

I mean, not a hard requirement for next week, but we could have an engine image that includes the engine binaries and also a "runner" binary (buildkitd), so you pull this one thing and done

cyan stag Nov 2, 2022, 10:58 PM

#

last wyvern I mean, not a hard requirement for next week, but we could have an engine image ...

that's what I've implemented

#

(draft PR is imminent)

last wyvern Nov 2, 2022, 10:58 PM

#

yeah, neat!

#

so either way -- even if we pull tarballs, we still got to distribute the image right?

cyan stag Nov 2, 2022, 11:05 PM

#

last wyvern so either way -- even if we pull tarballs, we still got to distribute the image ...

Yep exactly, I was just thinking if it's easier to put a tarball in S3 or github releases and not have to deal with ratelimiting, then we can fall back to it

#

https://github.com/dagger/dagger/pull/3647 Gonna go do more cleanup passes to fix the stuff mentioned in the description, but if there's any high level concerns with the approach let me know

#

Do we have a timeline on talking to dockerhub about rate limiting? It would be good to figure that out sooner than later. I can make an issue too (presuming there is not one already)

round dove Nov 2, 2022, 11:17 PM

#

lol, I just implemented another singleConnListener before finding we had one already. (not that it's a ton of work, just funny that we've now run into such an odd thing twice.)

cyan stag Nov 2, 2022, 11:22 PM

#

round dove lol, I just implemented another `singleConnListener` before finding we had one a...

We do weird things here I think

#

(in a good way 🙂 )

round dove Nov 2, 2022, 11:22 PM

#

elmofire

midnight widget Nov 2, 2022, 11:27 PM

#

last wyvern we're supposed to chat with the docker folks about rate limiting etc (/cc <@4884...

On the topic of hosting providers:

- @stiff temple is talking to Docker about their special startup program
- @dapper grotto mentioned that ECR is a very strong candidate: pricing is basically S3, will be hard for Docker to beat that since they're on top of AWS
- Either way I agree that having a vanity URL would be good for future flexibility. Note that docker hub short name is not a vanity url! since obviously it's tied to Docker Hub

last wyvern Nov 2, 2022, 11:46 PM

#

midnight widget On the topic of hosting providers: - <@933501536624054272> is talking to Docker...

Chatted with @dapper grotto about ECR and so forth.

So far my preference would be:

Vanity URL + ECR.

- With a vanity URL users don't see the "real" registry address, and that's the only thing Docker Hub has going on
- Vanity requires infrastructure which will be on AWS, might as well have the registry there and not pay traffic between the 2
- ECR is cheaper and way more stable than Hub

IF vanity URLs don't work ... I'd suggest we go with GitHub

- We're already "leaking" github.com/dagger/dagger, might as well also leak ghcr.io/dagger if vanity doesn't work
- I trust the stability more

Hub: this would be last preference, depending on the startup program

If vanity doesn't work, we get a "vanity-ish" image (hub short name)

stiff temple Nov 3, 2022, 12:15 AM

#

last wyvern Chatted with <@707661676056674346> about ECR and so forth. So far my preference...

Hub startup program might be "promo pricing for a year and then 10 (basic + remove-rate-limiting) to 30k(add advanced reporting)/year after that"

last wyvern Nov 3, 2022, 12:25 AM

#

stiff temple Hub startup program might be "promo pricing for a year and then 10 (basic + rem...

As a comparison, on ECR that price buys us between 100TB & 600TB of data transfer -- roughly 2 to 12 million image pulls outside of AWS. On AWS->AWS it's 2.5 times that
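
The pull counts follow from simple division, assuming roughly 50 MB transferred per pull and decimal units (1 TB = 10^12 bytes). A quick check of the arithmetic:

```python
TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def pulls(transfer_bytes, image_bytes):
    """Whole image pulls that fit in a data-transfer budget."""
    return transfer_bytes // image_bytes


image = 50 * MB  # assumed size of one image pull
low, high = pulls(100 * TB, image), pulls(600 * TB, image)
print(f"{low:,} to {high:,} pulls")  # prints "2,000,000 to 12,000,000 pulls"
```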

midnight widget Nov 3, 2022, 12:27 AM

#

what’s the pricing like on Github?

stiff temple Nov 3, 2022, 12:30 AM

#

midnight widget what’s the pricing like on Github?

https://docs.github.com/en/billing/managing-billing-for-github-packages/about-billing-for-github-packages#about-billing-for-github-packages

#

https://github.com/pricing/calculator?feature=packages

#

So for example where image is 50MB (and we store several copies, maybe up to 1TB storage) and 100TB of egress