# Hi folks! I’m Maarten from HackerOne.

bright summit
#

Hey! Welcome 🙂

It's definitely possible, and I don't think it's bad practice.

Dagger is just code, so you can do anything that you can do in Go, Python, or TypeScript.

The way I would imagine the thing you're describing working is:

  1. Write a test function that accepts some parameters (like the directory for the thing you're trying to test, and maybe even a test command if it's different across subprojects).

  2. Write a function or a long CLI command string (or both!) that calls function 1 with all the monorepo parameters defined.
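
As a rough illustration of step 2, the per-project parameters can live in a small table and be expanded into one `dagger call` command each. This is only a sketch: the function name `test`, its `--source` and `--command` flags, and the project paths are hypothetical and depend on how the function from step 1 is actually declared.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """Parameters for one sub-project of the monorepo (all values are made up)."""

    source: str   # directory to test
    command: str  # test command, which may differ per sub-project


# Hypothetical monorepo layout.
PROJECTS = [
    Project(source="./services/api", command="pytest -q"),
    Project(source="./libs/core", command="python -m unittest"),
]


def dagger_call(project: Project) -> str:
    """Build the CLI string for one project.

    Assumes a Dagger function named `test` that takes `source` and
    `command` arguments; adjust the names to match your own module.
    """
    argv = [
        "dagger", "call", "test",
        f"--source={project.source}",
        f"--command={project.command}",
    ]
    # shlex.join quotes any argument that contains spaces.
    return shlex.join(argv)


if __name__ == "__main__":
    for project in PROJECTS:
        print(dagger_call(project))
```

Keeping the parameters in one table means the same data can feed either the generated CLI strings or a wrapper function that loops over the projects.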

Is there a particular SDK that you were most interested in?

magic pollen
#

SDK is going to be Python!

magic pollen
#

Thanks for sharing! Then I would guess that, given a monorepo with multiple projects, you would:

  1. Create a Dagger module per project
  2. Implement a test function in each module for each project
  3. Create a root module which includes the other modules and awaits each test invocation
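
The fan-out in step 3 could look roughly like the sketch below. It uses plain `asyncio` and stand-in coroutines instead of real Dagger modules, so the names `run_all`, `check_api`, and `check_core` are hypothetical; in a real root module each stand-in would be replaced by a call into the corresponding sub-module.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# A per-project test entry point: an async callable returning the test output.
TestFn = Callable[[], Awaitable[str]]


async def run_all(tests: Dict[str, TestFn]) -> Dict[str, str]:
    """Start every project's tests, then await them all together."""
    names = list(tests)
    # gather() schedules all coroutines concurrently and keeps input order.
    outputs = await asyncio.gather(*(tests[name]() for name in names))
    return dict(zip(names, outputs))


# Stand-ins for the per-project module functions (hypothetical).
async def check_api() -> str:
    await asyncio.sleep(0)  # placeholder for the real container run
    return "api: ok"


async def check_core() -> str:
    await asyncio.sleep(0)  # placeholder for the real container run
    return "core: ok"


if __name__ == "__main__":
    results = asyncio.run(run_all({"api": check_api, "core": check_core}))
    for name, output in results.items():
        print(name, "->", output)
```

With `gather()`, an exception in any one project propagates to the caller, which is usually what you want for a test run that should fail as a whole.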

An approach like this raises a couple of questions:

  • Can you control the parallelism of the generated dag by submitting the job to a cluster of machines? Or is the entire dag executed on a single machine?

  • Each project might require its own host dependencies (files, secrets, env variables). Do you require passing all of these as arguments to the cli? Or is there also a way to store these in code? Similar to Bazel .bazelrc for config and BUILD files for files/directories?

bright summit
#

Can you control the parallelism of the generated dag by submitting the job to a cluster of machines? Or is the entire dag executed on a single machine?

Right now the entire DAG is executed on a single machine. It can do many things in parallel, but you are limited to scaling vertically. Vertical scaling does work quite well for many people, though, and it would be great to hear your experience if you give it a try.

Horizontal scaling is still a WIP. One approach is to use the existing CI config to dispatch different steps to different Dagger engines.
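
One way to picture the CI-dispatch approach: each CI job gets a shard index and runs only its slice of the projects against its own engine. The sketch below is an assumption about how that split might be done, not a Dagger feature; `SHARD_INDEX`, `SHARD_TOTAL`, and the project paths are hypothetical values that the CI config would provide.

```python
import os
from typing import List, Sequence


def shard(projects: Sequence[str], index: int, total: int) -> List[str]:
    """Return the projects assigned to shard `index` out of `total` (0-based)."""
    if total < 1 or not 0 <= index < total:
        raise ValueError("need total >= 1 and 0 <= index < total")
    # Sort first so every CI job computes the same assignment.
    return sorted(projects)[index::total]


if __name__ == "__main__":
    # Hypothetical environment variables set per job by the CI config.
    index = int(os.environ.get("SHARD_INDEX", "0"))
    total = int(os.environ.get("SHARD_TOTAL", "1"))
    projects = ["libs/core", "services/api", "services/web", "tools/cli"]
    for project in shard(projects, index, total):
        print(project)
```

Each job would then run the per-project test function only for the paths it prints.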

Each project might require its own host dependencies (files, secrets, env variables). Do you require passing all of these as arguments to the cli? Or is there also a way to store these in code? Similar to Bazel .bazelrc for config and BUILD files for files/directories?

The concept of "host" changes a bit in Dagger: remember that everything is going to be executed in a container runtime. I would suggest using a function like base() to define all the dependencies. Check out this example from that project I shared before: https://github.com/levlaz/boundary-layer/blob/master/dagger/src/main/__init__.py#L22

In this case it returns the project-specific build container, but the fact that it returns a container means that you can chain it indefinitely. So in your case, you may have a global base() function that includes the most common dependencies shared across all projects, and then a similar local base() function within a given project that chains on the things that project needs.

The benefit of this approach is that you only have to build the original base() once and then incrementally add things in; everything will be cached as much as possible automatically.
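
To make the chaining idea concrete, here is a toy model in plain Python. The `Container` class below is a stand-in written for this example and is not Dagger's `Container` type; the step strings and the `api_base()` name are also made up. It only shows the shape: a global base() returns a value, and a project-local function extends it without modifying it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Container:
    """Toy stand-in for a container: an immutable, ordered list of steps.

    This is NOT Dagger's Container type; it only mimics the chaining style.
    """

    steps: Tuple[str, ...] = ()

    def with_step(self, step: str) -> "Container":
        # Return a new object so the shared base is never mutated.
        return Container(self.steps + (step,))


def base() -> Container:
    """Global base: dependencies shared by every project (example values)."""
    return Container().with_step("from python:3.12").with_step("pip install pytest")


def api_base() -> Container:
    """Project-local base: chains project-specific steps onto the global one."""
    return base().with_step("pip install -r services/api/requirements.txt")


if __name__ == "__main__":
    for step in api_base().steps:
        print(step)
```

Because every project-local chain starts with the same steps as base(), that shared prefix is the part a layer cache can reuse across projects.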

magic pollen
#

Thanks for the context! I’ll try to experiment!

#

Also, I read somewhere that all files exported to the host are written once, at the end of the DAG execution. Is that right? I’m asking because I have the following use case:

  1. Build a NixOS vm inside a container using https://github.com/nix-community/nixos-generators
  2. Export the image to the host
  3. Run the image using Lima (because I assume you cannot do virtualization inside a Dagger container). Communicate with Lima on the host from within a container using https://github.com/msoap/shell2http
  4. Start a testinfra test suite inside a container against the running vm

If it’s possible to export the image during execution, instead of at the end, all these steps could become a single Dagger invocation!

#

Alternatives to step 3 would be:

  • Ability to run a dagger function on the host, not in a container
  • Ability to run a VM from within a container
  • Ability to run a Dagger function directly in a vm

bright summit
#

What is Lima?

(because I assume you cannot do virtualization inside a Dagger container).

You can run Docker in Docker with Dagger, so if Lima can run inside of a container then you can run it as a service as a part of your pipeline.

Alternatives to step 3 would be:

  • Ability to run a dagger function on the host, not in a container
  • Ability to run a VM from within a container
  • Ability to run a Dagger function directly in a vm

It's not possible to do any of these, afaik.

magic pollen
#

Gotcha! Then I’m going to try to upload the image during the build to the macOS host and boot the vm there.

bright summit
#

Sorry, I am confused about how VMs are getting into the mix 🙂

What is the purpose of Lima here, why not just run the container without Lima?

magic pollen
#

Haha no worries, it’s also a bit complicated!

The VM part is necessary to run an ISO created by a Dagger build. Dagger is used to create an OS image that I want to run tests against. Unfortunately, it’s not possible to load an OS image into a container (afaik), and therefore a VM is necessary.