From Docker to Dagger — The Changelog: S... | Dagger | Page 1

next ravine Aug 22, 2023, 2:20 AM

#

That's the reason I'm here 🙂

spare grotto Aug 22, 2023, 2:48 AM

#

The core engine software and SDKs are open source. We sell a complete application delivery platform as a subscription. The platform includes a proprietary control plane + supported & certified builds of the engine

#

Popular features of the control plane include: distributed cache orchestration (a must for scaling out on ephemeral CI runners); pipeline observability; troubleshooting & collaboration for dev teams; and SCM integration

#

The commercial subscription is in early access

#

fyi @hard pulsar 👆

hard pulsar Aug 22, 2023, 1:56 PM

#

Thanks @spare grotto for the insight. If we wanted to use our own orchestration tool like Jenkins or Argo, is caching something open-source users will have to roll on their own? Or will it not be possible to cache at all? I'm not sure which caching you're speaking of, though. I'm thinking of Docker images, dependency caches, build caches, etc.

spare grotto Aug 22, 2023, 2:48 PM

#

hard pulsar Thanks <@488409085998530571> for the insight. If we wanted to use our own orches...

You can use Jenkins or Argo in combination with Dagger Cloud (tentative name for our commercial control plane). It handles the distribution of caching across all the nodes of your jenkins/argo/… cluster.

maiden pendant Aug 22, 2023, 3:21 PM

#

But is it technically possible to implement the Dagger Cloud distributed cache feature on our own? Is it possible from a technical point of view, and do you intend to add, e.g., documentation on how one could achieve it? Or will it be the "secret" selling point of the premium Dagger service?

hard pulsar Aug 22, 2023, 3:40 PM

#

Right now we are using the Kubernetes worker node host volume storage for caching. If I can keep using that it's good enough for me.

spare grotto Aug 22, 2023, 3:41 PM

#

dropping off kids at school, will reply after

hard pulsar Aug 22, 2023, 3:45 PM

#

No rush from my end. Thanks for your time.

spare grotto Aug 22, 2023, 4:01 PM

#

maiden pendant But is it technically possible to implement the Dagger Cloud distributed cache f...

Yes, the distributed cache service will be a proprietary cloud service, so not open source. The open-source engine is built on a project called BuildKit, which has built-in facilities for batch-exporting and batch-importing cache data to/from a storage service like S3 or a registry, in a stateless way. We expose those hooks as experimental features, mostly because they have many limitations and we don’t want to support them.
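
For readers who want to see what the batch export/import hooks mentioned above look like at the BuildKit level, here is a minimal sketch. It assumes the `docker buildx build` CLI with its `--cache-from` and `--cache-to` flags and the `registry` cache backend. The helper name and the registry reference are hypothetical, and this is not Dagger's own interface for the experimental feature.

```python
from typing import List


def buildx_cache_args(cache_ref: str, context: str = ".") -> List[str]:
    """Build an argv for `docker buildx build` that imports and exports cache via a registry.

    cache_ref is a hypothetical image reference used only to hold cache data.
    """
    return [
        "docker", "buildx", "build",
        # Import cache that earlier runs exported to the registry.
        "--cache-from", f"type=registry,ref={cache_ref}",
        # Export cache for all layers, including intermediate ones.
        "--cache-to", f"type=registry,ref={cache_ref},mode=max",
        context,
    ]


if __name__ == "__main__":
    # registry.example.com/ci/cache:main is a placeholder, not a real endpoint.
    print(" ".join(buildx_cache_args("registry.example.com/ci/cache:main")))
```

The function only assembles the command so it can be inspected or passed to `subprocess.run`. Depending on the builder driver in use, registry cache export may require a non-default builder.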

#

So if you absolutely don’t want to buy anything from the makers of the open source engine, it is possible to build your own distributed caching solution from scratch, but it will be more expensive to develop in the end.

#

On persistent machines there is no issue.

#

Our philosophy with monetization is that some features just work better in a centralized service instead of being distributed across all engines. Once we identify those features, we package them into Cloud. If the engine can do it better, we package the feature in engine.

hard pulsar Aug 22, 2023, 4:15 PM

#

Ok thanks for the information, much appreciated.

maiden pendant Aug 22, 2023, 4:23 PM

#

Ok thanks a lot for the information.

spare grotto Aug 22, 2023, 4:25 PM

#

believe it or not, one thing we’re worried about at the moment is whether there is too much dependency on our commercial product. We didn’t plan it this way: caching from the engine has limitations; we looked for the best caching experience; determined it requires a cloud service to be awesome. And here we are.

#

The last thing we want is to push out Daggernauts from the community because our commercial product doesn’t fit their requirements.

#

But on the other hand: for the platform & community to thrive we need to sell something. Making that thing a centralized service separate from the engine, based on design & engineering decisions, allows us to protect the engine being open source and keep incentives aligned: we will never be tempted to make the engine less good, or (ahem) relicense it.

maiden pendant Aug 22, 2023, 4:38 PM

#

I highly appreciate your openness on this topic. To be honest, I am not afraid of paying money for good work or a good product; I am more afraid for the community. I think for many people this could raise the barrier to trying out something new. I think the Dagger engine is already a great product as is, but the only two things missing, from my side, before I would recommend rolling it out for bigger projects are visualization and the ability to scale the machines horizontally, on-prem or in the cloud, without losing the cache. And from what I have understood so far, both of these are exactly the candidates for the cloud offering?

spare grotto Aug 22, 2023, 6:35 PM

#

maiden pendant I highly appreciate your openness on this topic. To be honest I am not afraid of...

Yes exactly. Those are the two features we offer in Dagger Cloud. It's still in early access, but it is available. We are happy to get you set up if that helps.

charred wolf Aug 22, 2023, 6:59 PM

#

@junior dome can help anyone who would like early access to Dagger Cloud. Please fill out the request form, and we'd be happy to set up some time to do a demo and discuss your use cases - https://dagger-io.typeform.com/to/RDGtJLic?utm_source=homepage&typeform-source=discord

hard pulsar Aug 22, 2023, 7:27 PM

#

In order for us to be potential customers I have to prove to my leadership team that paying for a build tool is going to provide value for the business. In my opinion and experience at my company that is a build tool that allows us to do the following things...

Customizable: Allows for varying build workflows and workflow configurations depending on a team's needs.
Policies: Ability to enforce organizational policies in the build workflows.
Rollouts: Be able to publish pipeline code changes for policy updates or bug fixes and have them take effect immediately.
Compatibility: Steps work on a developer’s laptop and in CI.
Multi-Language: Ability to create pipelines for multiple languages.

Here are some of the things being asked of our team that have prompted us to start searching for a replacement for Makefiles and Groovy code.

Support multi-module Go repositories.
Only build what has changed.
Support other languages: Python, Java.

#

Word got around the organization that we've done a good job with Go language support, and now developers want us to build similar tooling for other languages in use at the company. That is a great feeling, but we can't do it with Makefiles and Groovy code anymore: the breadth of support for each language, plus the number of languages themselves, adds a lot of complexity for us. That is why we need to mature our build tooling into something that can carry us into the future. Hence, we started looking for alternatives. I was not looking for a paid tool, but if it can help our organization accomplish some of these goals and there is no open-source (let's be honest, I mean free) tool to do so, I may be able to convince my leadership that this is a worthwhile investment. The thing that scares me is that if I can't convince them and a feature we really need is a paid one, we just lost a bunch of time. And to be clear, it wasn’t obvious to me when I first started looking at Dagger that it was a company; I thought it was an open-source community project.

bleak lotus Aug 23, 2023, 1:03 PM

#

@hard pulsar currently, pretty much everything you listed above can be accomplished without using the paid Dagger Cloud offering. We are on a similar path at my company but we will also be evaluating Dagger Cloud at some point to see if it benefits. The OSS engine capabilities are what we are most interested in.

I noticed that you didn't list "Visualization/Metrics" as a requirement. If that becomes a priority, that may be one of the things that is tougher to solve without paying for Dagger Cloud.

hard pulsar Aug 23, 2023, 1:54 PM

#

That's great to hear! Dagger looks like a great tool, and I would like to see the Dagger Cloud offering as well, assuming it is a self-hosted tool. I'm not sure, with Cloud in the name.

bleak lotus Aug 23, 2023, 2:01 PM

#

It's currently not, but that's a requirement for us too, and afaik they are working on it. Our data (cache) can't go outside our firewall, but the metadata about the cache can, so we would need to self-host the cache.

charred wolf Aug 23, 2023, 2:04 PM

#

@steep lynx to answer as she is leading the product roadmap, but @bleak lotus is exactly right 🙂

hard pulsar Aug 23, 2023, 2:09 PM

#

@bleak lotus we plan on using metrics provided by Argo for now.

spare grotto Aug 23, 2023, 5:11 PM

#

Yes, Dagger Cloud is a managed cloud service (no on-prem version), but we are planning a "bring your own bucket" feature so that the actual caching data stays on your infra. Orchestration of moving the cache data around between nodes, and to/from cold storage, remains a managed cloud service.

#

For larger customers who need everything to run on their infra, we will consider deploying and managing a single-tenant version on the customer's infra. This is an arrangement some SaaS vendors have implemented successfully to accommodate the needs of large customers without compromising the unique benefits of a managed service.

spare grotto Aug 23, 2023, 5:31 PM

#

Note that on the caching side, it's possible to have distributed caching without Dagger Cloud, but it will generally be slower. The larger the scale, the wider the gap in performance.

hard pulsar Aug 23, 2023, 5:36 PM

#

Caching to the internet I would have thought to be slower than caching on local disk or local network storage.

barren vector Aug 23, 2023, 9:15 PM

#

hard pulsar Caching to the internet I would have thought to be slower than caching on local ...

I’m not an expert, but if there is a bring-your-own-bucket option, in principle the large data could live in a local bucket backed by something like MinIO (I think that is what it is called).

spare grotto Aug 23, 2023, 11:32 PM

#

hard pulsar Caching to the internet I would have thought to be slower than caching on local ...

Yes you do need cold cache to live in your region. We support all AWS, GCP and Cloudflare R2 regions at the moment. Azure coming soon.

Local storage is always used as hot cache no matter what. But with ephemeral distributed machines, your local filesystem gets wiped all the time, so getting the right data to the right machine at the right time becomes the bottleneck.

#

Assuming same-region cold storage, as you scale the bottleneck actually becomes the orchestration. You can’t download all the data into all your nodes all the time. So you need to download the smallest amount of data that gives you the best cache hit rate for a given workload. How to figure out that subset? That’s the hard part.

#

Since Dagger Cloud receives telemetry from all runs, it has the data to make those decisions. The more telemetry it receives, the more efficiently it can orchestrate the movement of data. At some point you can start moving data preemptively based on past workload patterns
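
To make the "smallest amount of data for the best hit rate" problem described above concrete, here is a toy sketch. It is an illustration only, not how Dagger Cloud works: the `CacheBlob` type, the `expected_hits` estimate, and the greedy hits-per-byte heuristic are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CacheBlob:
    key: str
    size_bytes: int
    expected_hits: float  # hypothetical estimate derived from past run telemetry


def pick_blobs(blobs: List[CacheBlob], budget_bytes: int) -> List[CacheBlob]:
    """Greedily pick blobs with the most expected hits per byte that fit the budget."""
    ranked = sorted(
        (b for b in blobs if b.size_bytes > 0),
        key=lambda b: b.expected_hits / b.size_bytes,
        reverse=True,
    )
    chosen: List[CacheBlob] = []
    remaining = budget_bytes
    for blob in ranked:
        # Skip blobs that no longer fit; smaller ones later in the ranking may still fit.
        if blob.size_bytes <= remaining:
            chosen.append(blob)
            remaining -= blob.size_bytes
    return chosen


if __name__ == "__main__":
    candidates = [
        CacheBlob("go-mod", 100, 10.0),
        CacheBlob("npm", 50, 1.0),
        CacheBlob("base-image", 500, 20.0),
    ]
    for blob in pick_blobs(candidates, budget_bytes=200):
        print(blob.key)
```

A greedy ratio ranking is a simple stand-in for a knapsack-style choice. The hard part the message points at is producing a trustworthy `expected_hits` per workload, which this sketch simply takes as input.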

#

Separately, we pick lower-hanging fruit:

BuildKit cache export doesn’t persist cache volumes, so your go build or npm build takes minutes each time because it redownloads all packages each time.
BuildKit cache import/export happens synchronously at the end of a run. Dagger moves data to/from cold storage continuously in the background.

spare grotto Aug 23, 2023, 11:41 PM

#

barren vector I’m not an expert, but if there is a bring your own bucket in principle the larg...

Yes correct

hard pulsar Aug 24, 2023, 2:30 PM

#

We have to regularly clear our Go module cache; if we don't, it will just grow and grow. Does it account for the need to purge the cache once in a while?
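
The thread does not answer this question here. As a general illustration of size-bounded purging, independent of Dagger, the sketch below picks least-recently-used entries to delete until the cache fits a budget. The `Entry` tuple layout and the function name are hypothetical.

```python
from typing import List, Tuple

# (path, size_bytes, last_used_unix_time): a hypothetical inventory of cache entries
Entry = Tuple[str, int, float]


def entries_to_evict(entries: List[Entry], max_total_bytes: int) -> List[str]:
    """Return paths to delete, least recently used first, until the total fits the budget."""
    total = sum(size for _, size, _ in entries)
    evict: List[str] = []
    for path, size, _ in sorted(entries, key=lambda e: e[2]):
        if total <= max_total_bytes:
            break
        evict.append(path)
        total -= size
    return evict


if __name__ == "__main__":
    inventory = [("a", 100, 3.0), ("b", 200, 1.0), ("c", 300, 2.0)]
    print(entries_to_evict(inventory, max_total_bytes=350))
```

The function only decides what to remove and leaves deletion to the caller. For the Go module cache specifically, `go clean -modcache` removes the whole cache rather than trimming it.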

#

We do have minio here that we could use.
