Hello. I am currently reading about | Dagger | Page 1

true grotto Oct 25, 2024, 12:52 AM

#

vapid sage Oct 25, 2024, 1:32 AM

#

Welcome! Super intrigued by some of your work on GitHub around Hanabi 🙂

The final example on that page makes it seem that Dagger is replacing the pipeline steps and is not replacing the pipeline itself. Meaning that with Azure DevOps, I would still use YAML files to stitch together all of my Dagger functions.

Today, you would have some minimal ADO YAML to define a few things about the VM runner and the trigger (e.g. PR on the main branch), but as for the rest, you can do all of that stitching in Dagger code and then just call one function that serves as the entrypoint to your pipelines. For example, a single call to dagger call ci might call build, test, and release functions under the hood.

note: we'll update the examples. It's common now to have a Dagger module right in your software project (often in .dagger directory) so your Dagger functions are checked out along with the rest of your source code. You can also specify the "context" (default source fed into your Dagger module) to be your project, so the invocation gets very succinct.

trigger:
- main

pool:
  name: 'Azure Pipelines'
  vmImage: ubuntu-latest

steps:
- script: curl -fsSL https://dl.dagger.io/dagger/install.sh | BIN_DIR=$HOME/.local/bin sh
- script: dagger call ci
  env:
    DAGGER_CLOUD_TOKEN: $(DAGGER_CLOUD_TOKEN)

Or it might make sense for you to call those functions separately because of how you want things to show up in ADO or another CI (e.g. as separate steps): dagger call build, dagger call test, dagger call release
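As a rough sketch of that second style, the same stub can be split into one ADO step per Dagger function. The function names build, test, and release are the ones mentioned above; the displayName values are illustrative only, and the sketch assumes DAGGER_CLOUD_TOKEN is defined as a pipeline variable and mapped on each step that calls dagger.

```yaml
# Sketch only: same trigger and pool as the stub above, one ADO step per Dagger function.
trigger:
- main

pool:
  name: 'Azure Pipelines'
  vmImage: ubuntu-latest

steps:
- script: curl -fsSL https://dl.dagger.io/dagger/install.sh | BIN_DIR=$HOME/.local/bin sh
  displayName: 'Install Dagger CLI'

# Each dagger call shows up as its own step in the ADO run view.
- script: dagger call build
  displayName: 'Build'
  env:
    DAGGER_CLOUD_TOKEN: $(DAGGER_CLOUD_TOKEN)

- script: dagger call test
  displayName: 'Test'
  env:
    DAGGER_CLOUD_TOKEN: $(DAGGER_CLOUD_TOKEN)

- script: dagger call release
  displayName: 'Release'
  env:
    DAGGER_CLOUD_TOKEN: $(DAGGER_CLOUD_TOKEN)
```

The trade-off is the one described in this thread: more YAML to maintain, in exchange for per-step visibility in the CI UI.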

Either way, you can call your dagger functions the same way from your local machine as from any CI. So your pipelines are now portable 😁

Is that the case? Do people typically use Dagger to replace all of their YAML files? Is the point of Dagger to solve the problem of making individual CI steps be able to run locally (and not solving the problem of having to use YAML)?

We do have folks who would like to replace ALL of their YAML 🙂

One move in that direction is using Dagger to generate the YAML you need, so you don't have to mess with writing/editing it. An example of that approach is the gha module for generating GitHub Actions YAML (very similar to ADO): https://daggerverse.dev/mod/github.com/shykes/gha@bf36a0b37a75882e4a985179f090d86d2ad1dfb4

Another more Dagger-maximalist approach can be seen in demos like this "pocket CI" 🙂
#1298351651970617344 message

Love to hear more about your current project 😃
Let us know how we can help


hardy topaz Oct 25, 2024, 4:36 AM

#

I recommend reading this section on how Dagger integrates with your CI. It's not specific to Azure but very relevant.

https://docs.dagger.io/adopting#integrating-with-ci

Note: we should probably link to, or embed, this section in each specific CI integration page.

Adopting Dagger | Dagger

Once you have completed the quickstart, and learned the basic concepts, it's time to take the next step: adopting Dagger in your project. We call this daggerizing. But how does one daggerize a project, exactly?

true grotto Oct 25, 2024, 10:41 AM

#

> Super intrigued by some of your work on GitHub around Hanabi
How did you know that? Haha!

Jeremy Adams and solomon, thank you very much for the detailed replies!

Some follow-up questions:

1) The second example on the Azure DevOps documentation page is a pipeline that has 3 steps that are all Dagger functions. Imagine a similar pipeline, but with 30 steps instead of 3, with all of those steps still being Dagger functions. Is it possible to point the Dagger CLI at this YAML pipeline file to execute it on your local computer? Or do you have to manually construct an identical 30-sequence-long function chain, as documented here?
2) In Azure DevOps pipelines, there are some predefined / built-in variables, like System.AccessToken, which is useful for e.g. subsequent queries to the Azure DevOps HTTP API. In the "stub" Dagger invocation YAML that Jeremy provided above, is there a way to pass in all of the built-in variables, or does that have to be done manually for each variable? (To clarify, the built-in variables are environment variables, so I assume those would not automatically get passed in to a Docker container.)
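On the ADO side of the second question, a hedged sketch: going by ADO's documented behavior, non-secret predefined variables are already exposed to script steps as environment variables (uppercased, with dots replaced by underscores, so Build.SourceBranchName becomes BUILD_SOURCEBRANCHNAME), while secret ones such as System.AccessToken have to be mapped explicitly on each step. This only gets the values into the environment of the dagger CLI process on the agent. How a value is then handed to a Dagger function depends on the arguments your module defines and is not shown here.

```yaml
steps:
- script: dagger call ci
  env:
    DAGGER_CLOUD_TOKEN: $(DAGGER_CLOUD_TOKEN)
    # Secret variables are not exposed to scripts automatically,
    # so System.AccessToken is mapped by hand for this step.
    SYSTEM_ACCESSTOKEN: $(System.AccessToken)
    # Non-secret predefined variables (e.g. BUILD_SOURCEBRANCHNAME)
    # should already be present in the step's environment without any mapping.
```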

true grotto Oct 25, 2024, 12:06 PM

#

3) In Azure DevOps (and GitHub Actions), there is a concept of a "matrix", where you can run multiple jobs in parallel. Above, Jeremy recommends using a single pipeline as a single entry point to Dagger. But in this setup, a single CI runner / agent is stuck running the entire company's pipelines. With this setup, how would I run multiple jobs in parallel? Is there some easy-to-use abstraction from within a Dagger function that I can pass a brand new job to an entirely different AZDO agent? (To clarify, Azure DevOps has the concept of an "agent pool", such that you can add N computers to your agent pool, and then jobs will be passed to whichever computer is free and has unqueued work.)

hardy topaz Oct 25, 2024, 2:28 PM

#

true grotto 3) In Azure DevOps (and GitHub Actions), there is a concept of a "matrix", where...

the docs link above addresses most of these questions I think 🙂

#

Daggerizing leads to simplifying. It's common to merge several large CI pipelines into a single one that just wraps dagger call. This usually leads to massive simplification of the CI configuration, as complex YAML/Groovy/shell spaghetti is replaced by clean code. Taken to the extreme, this process reduces the entire CI configuration to a single dagger call, with everything else happening inside Dagger. Although this sometimes happens, in practice most projects converge to a middle ground, where the CI configuration shrinks to just enough dagger call invocations to take advantage of proprietary CI features. Usually these features are job scaling, and job visualization. The more dependent you are on these proprietary CI features, the more granularity you will need to keep in your CI configuration.

true grotto Oct 25, 2024, 2:35 PM

#

hardy topaz > Daggerizing leads to simplifying. It's common to merge several large CI pipeli...

> Although this sometimes happens, in practice most projects converge to a middle ground
Thanks, that partly answers question 3. I am more interested in the "sometimes" part, though. Is there any documentation for that? Or documentation on how to use Dagger to emulate a traditional pipeline matrix? I am envisioning an AZDO pipeline that hands off to Dagger, and then the main entry-point Dagger function invokes Kubernetes to run the individual jobs, or something along those lines.

#

(But maybe you don't want to do that because then you can't run the pipeline code anymore on your local laptop?)

vapid sage Oct 25, 2024, 2:39 PM

#

true grotto > Although this sometimes happens, in practice most projects converge to a middl...

If the matrix is spinning up infrastructure that needs to be controlled by the CI-specific YAML, then I'd keep the matrix in ADO.
If the matrix is something that can be handled within Dagger (e.g. building for multiple versions of node or go, or multi-arch/OS), then a for loop or two can traverse all the possibilities.

https://docs.dagger.io/cookbook#perform-a-matrix-build
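For the first case (keeping the matrix in ADO), a minimal sketch might look like the following. Each matrix leg is scheduled as its own job, so legs can land on different agents in the pool. The job name, the NODE_VERSION variable, and the --node-version flag are all hypothetical: the flag assumes your Dagger build function accepts a matching argument.

```yaml
jobs:
- job: build
  strategy:
    matrix:
      # One job per entry; ADO hands each one to whichever agent is free.
      node18:
        NODE_VERSION: '18'
      node20:
        NODE_VERSION: '20'
  pool:
    vmImage: ubuntu-latest
  steps:
  - script: curl -fsSL https://dl.dagger.io/dagger/install.sh | BIN_DIR=$HOME/.local/bin sh
    displayName: 'Install Dagger CLI'
  # --node-version is a hypothetical argument of a hypothetical build function.
  - script: dagger call build --node-version=$(NODE_VERSION)
    displayName: 'Build (Node $(NODE_VERSION))'
    env:
      DAGGER_CLOUD_TOKEN: $(DAGGER_CLOUD_TOKEN)
```

The second case (a loop inside a Dagger function) keeps everything on one agent, which is what the cookbook link above covers.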


hardy topaz Oct 25, 2024, 2:41 PM

#

vapid sage If the matrix is spinning up infrastructure that needs to be controlled by the C...

☝️ exactly. Until Dagger gains the ability to scale out function calls of course 😇

true grotto Oct 25, 2024, 2:45 PM

#

Thanks for the link. Yeah, I see now, you can only emulate a matrix that runs on a single ~~Docker container~~ pipeline agent.
Whereas with a "real" matrix, each job would be running on a separate ~~Docker container~~ pipeline agent.
So using a matrix in this way won't actually speed up anything, because a typical ~~Docker container~~ pipeline agent will only have like 1 CPU.
[Edit - sorry, I meant pipeline agent. I see that the Dagger code does actually spawn multiple containers. But they would all be running on the same pipeline agent.]

vapid sage Oct 25, 2024, 2:54 PM

#

true grotto Thanks for the link. Yeah, I see now, you can only emulate a matrix that runs on...

With your edit, yes, the Dagger Engine will take advantage of the resources of the CI runner or laptop to execute the DAG you've described in code by creating and orchestrating multiple containers under the hood. The compute you use for the ADO runner will definitely affect things. Parallel execution and caching are automatic, so it's definitely possible that you'd see performance gains with your pipeline. Prob best to just give it a try 🙂

true grotto Oct 25, 2024, 2:56 PM

#

Thanks Jeremy. Can you address questions 1 + 2 above? Namely, 1) is there a GitHub Actions or AZDO pipeline emulator that can tie in with the Dagger CLI? And 2), is there an easy way to automatically pass all AZDO CI variables into Dagger without having to manually specify each one?
