#Help with "Dagger-in-Dagger": Running Dagger Engine as a Service for Pytest


limpid stump
#

Hi!
I’m experimenting with a "Dagger-in-Dagger" setup for module integration tests. My goal is to have a pytest container connect to a Dagger Engine running as a Dagger Service.

While I know this isn't the standard pattern, I'm trying to understand the feasibility of the architecture. Currently, the client container hangs/fails to connect to the engine service despite a health-check loop.

import dagger

@dagger.object_type
class DaggerTest:
  @dagger.function
  async def run_nested_dagger(self) -> str:
      """Starts a nested Dagger Engine and runs 'dagger version' against it."""
      version = "0.20.8"
      engine_svc = (
          dagger.dag.container()
          .from_(f"registry.dagger.io/engine:v{version}")
          .with_exposed_port(8080)
          # .with_mounted_cache(
          #    "/var/lib/dagger", dagger.dag.cache_volume("nested-engine-data")
          # )
          .with_exec(
              ["/usr/local/bin/dagger-entrypoint.sh", "--addr", "tcp://0.0.0.0:8080"],
              experimental_privileged_nesting=True,
              insecure_root_capabilities=True,
          )
          .as_service()
      )
      client = (
          dagger.dag.container()
          .from_("python:3.11-slim")
          .with_exec(["apt-get", "update", "-qq"])
          .with_exec(["apt-get", "install", "-y", "-qq", "curl"])
          .with_exec(
              [
                  "sh", "-c", f"curl -L https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION={version} BIN_DIR=/usr/local/bin sh",
              ]
          )
          .with_service_binding("dagger-engine", engine_svc)
          .with_env_variable("DAGGER_HOST", "tcp://dagger-engine:8080")
          .with_exec(
              [
                  "sh", "-c", "until dagger version; do echo 'Waiting for Dagger Engine...'; sleep 1; done",
              ]
          )
      )
      return await client.with_exec(["dagger", "version"]).stdout()
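
One way to narrow down a hang like this is to replace the unbounded `until` loop with a bounded TCP probe, so that "the engine port is not reachable" can be told apart from "the port is open but the client is misconfigured". The following is only a sketch using the Python standard library; `wait_for_tcp` is a hypothetical helper name, not part of Dagger, and the host and port are the ones used in the snippet above.

```python
import socket
import time


def wait_for_tcp(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves the port is open, not that the
            # service behind it is a healthy engine.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Inside the client container this could be called as `wait_for_tcp("dagger-engine", 8080)` before any `dagger` command runs. A `False` result would point at the service binding or the engine's listen address rather than at the CLI.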
abstract topaz
#

And, assuming you do need control over the version, what is your version matrix exactly? And what will be the lifecycle of future changes to the matrix (when do you plan on adding or removing a target engine version)?

#

Nesting is a strong suit of Dagger, often underappreciated, so depending on your requirements I'm hoping we can find a great solution for you 🙂

limpid stump
#

Hi Solomon!
The short answer:
I am building a Dagger-native SDK where Dagger functions can dynamically bind services to containers based on a configuration file. My immediate goal is to treat the Dagger Engine as one of these services, allowing a pytest container to run integration tests against a module in a fully isolated, nested environment.
While a version matrix is a future goal, the current hurdle is the 'Dagger-in-Dagger' connectivity. I want the Dagger Engine to be treated as just another service in my orchestration logic.

The Context:
I previously developed a multi-language SDK that ensured reproducible command execution across local and CI/CD environments (GitLab) using Docker and Docker Compose. After spending some time with Dagger, I’m convinced that it has solved many of the problems I had much better and is moving in the right direction for the future of CI/CD.
I am now reimplementing this vision to leverage Dagger's architecture as Shiryu. A key feature of my previous solution was its "sidecar" capability, which allowed developers to define docker-compose files to spin up necessary databases or APIs for any specific command. My goal is to replicate that logic by using service configuration files that are parsed inside Dagger functions and bound to the command's runner container at runtime. In this setup, the Dagger Engine is the most complex service I need to orchestrate.
My current work involves using the SDK Shiryu on itself for development. It is currently working in GitLab and GitHub. Most of the common files necessary for a python project can be dynamically generated or updated by the init Dagger functions.
The work-in-progress is in these repositories (specifically in the feat-add-python-sdk branch):

GitHub

Generic Programming Language SDK. Contribute to fvonbergen/shiryu development by creating an account on GitHub.

magic aspen
#

@limpid stump we kind of do this ourselves in our dagger/dagger repo where we spin-up a dev-engine for our integration tests and then we run e2e flows there. Not sure if you've checked that already.

While a version matrix is a future goal, the current hurdle is the 'Dagger-in-Dagger' connectivity. I want the Dagger Engine to be treated as just another service in my orchestration logic.

You can surely do this, but keep in mind that it will make the test suite slower, since it requires starting a new Dagger engine container, populating the cache, etc.

Do you specifically want the Dagger engine to be treated as another service managed by your orchestration logic, or is it OK to make an exception here? I'm mostly asking because the other way to approach this is to use the engine's nesting capabilities, which allow module functions to access the outer engine to run whatever you need.

limpid stump
#

Hi @magic aspen !
I wasn't aware of those integration tests. Do you mean the ones located here: https://github.com/dagger/dagger/tree/main/e2e ? If so, I’ll need to take some time to dive into them, as I'm not very familiar with Go.
I realize these tests will take longer to run. Is there a way to just share a socket or the underlying communication protocol between the engine and the spun-up containers? Right now, I'm still trying to understand how the engine-to-client communication works and how to properly debug it.
Honestly, I'm interested in both approaches, as they would both give me a much better insight into Dagger's internals. But if I had to choose, I’d prefer to learn how to spin up the engine inside a Dagger function. Even though I know it will take significantly longer to run, it feels like the better alternative for what I'm trying to achieve.

abstract topaz
#

All SDKs support this out of the box. There is no need to modify any code anywhere except for setting that argument in withExec.

#

if you're relying on dagger as a core component for your project (which sounds awesome) then there's no need to abstract away the building and running of your runtime from within the runtime

limpid stump
#

Hi @abstract topaz !
Thank you! I started working with your proposal. Adding experimental_privileged_nesting was trivial, but I encountered an error that took me some time to isolate.
In my Dagger function, I use dagger.dag.current_module().source() to retrieve the module version. However, when nesting the container, it seems the Dagger engine is unable to identify the current module context.
The error is quite long. The last lines are:

        ...
        except TransportQueryError as e:
            if error := _query_error_from_transport(e, request):
>               raise error from e
E               dagger.QueryError: failed to get current module: no current module

I have isolated the issue within my repository.

Reproduction Steps
To verify that the function works successfully on its own, you can run:

dagger --mod=https://github.com/fvonbergen/shiryu.git@debug-dagger-module-testing-inside-pytest call get-module-version

To run the failing unit test suite locally (where the function is nested):

dagger --mod=https://github.com/fvonbergen/shiryu.git@debug-dagger-module-testing-inside-pytest call python tester unit --experimental-privileged-nesting=True

Context & Logs

Questions

  1. Is this expected behavior when utilizing privileged nesting?
  2. Is there a recommended way to handle module context or versioning in nested environments, or is falling back to a hardcoded default the only option?
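
For the fallback option in question 2, a minimal sketch of what "fall back to a default" could look like is shown here. It assumes the version lookup is wrapped in an async callable and that the nested failure can be recognised by the "no current module" text from the traceback above. `module_version_or_default` and `DEFAULT_VERSION` are hypothetical names, and real code might catch `dagger.QueryError` rather than the broad `Exception` used to keep this self-contained.

```python
import asyncio
from typing import Awaitable, Callable

DEFAULT_VERSION = "0.0.0-dev"  # hypothetical placeholder value


async def module_version_or_default(
    fetch: Callable[[], Awaitable[str]],
    default: str = DEFAULT_VERSION,
) -> str:
    """Return the fetched version, or `default` when no module context exists."""
    try:
        return await fetch()
    except Exception as exc:
        # Only swallow the specific nested-context failure; re-raise anything else.
        if "no current module" in str(exc):
            return default
        raise
```

Matching on the message text is brittle, so this is best treated as a stopgap rather than a long-term design.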

magic aspen
#

checking this now @limpid stump

limpid stump
#

Thanks, @magic aspen !
Let me know if you need anything else from me.

limpid stump
#

Hey @magic aspen!
Just following up on this—did you happen to find anything interesting, or see if it's a known quirk with dagger-in-dagger?
Just wanted to check in and see if you had any insights on whether this looks like a bug or just an incompatible use of dagger.dag.current_module().source() on my side.

abstract topaz
#

(the issue talks about accessing the current workspace - but it also applies to the current module. Both are contextual information available to a module, but not to the containers it runs, even if they have nesting enabled)

#

The workaround: every dagger object is loaded by ID. Each type has its own load-from-id function:

  • dag.load_module_source_from_id() to load a ModuleSource
  • dag.load_directory_from_id() to load a Directory
  • and so on

So, when executing your nested container, you can inject the ID into the container, as an env variable. Then the nested client can retrieve the env variable and load the object from it. You can use this trick to pass the current module.

dag.container().etc-etc-etc.with_env_variable("CURRENT_MODULE_ID", await dag.current_module().id())
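
On the receiving side, the nested client only has to read the variable and pass it to the matching load-from-id function. The sketch below keeps the Dagger call abstract so it runs with the standard library alone: `load_from_env_id` is a hypothetical helper, and `loader` stands in for something like `dag.load_directory_from_id` (the exact ID type that function expects may differ between SDK versions, which is an assumption to verify).

```python
import os
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


def load_from_env_id(
    var_name: str,
    loader: Callable[[str], T],
    environ: Mapping[str, str] = os.environ,
) -> Optional[T]:
    """Read an object ID from the environment and hand it to `loader`."""
    raw_id = environ.get(var_name, "").strip()
    if not raw_id:
        return None  # nothing was injected by the outer function
    return loader(raw_id)
```

Returning `None` when the variable is missing lets the same helper run both on the host, where the module context is available directly, and inside the nested container.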
limpid stump
#

Hi @abstract topaz !
Thanks for the quick reply and for pointing out the issue/workaround. I'll test out injecting the ID into the nested container shortly.
Appreciate the help!

limpid stump
#

Hi @abstract topaz ,
Following up on the issue with dagger.dag.current_module().source() blowing up inside the nested test containers (dagger-in-dagger).
The problem is that I have a metadata helper that relies on dagger.dag.current_module().source(). This works on the host, but that context is lost inside a nested dagger-in-dagger environment. Since our test containers need to run "clean" commands without passing down or fetching Dagger IDs, the environment variable solution will not work for me.
Considering that it is a known upstream limitation tracked here: https://github.com/dagger/dagger/issues/13054, I am going to hardcode the metadata values until this is fixed upstream.
Thanks for your patience, and the amazing support!

GitHub

Automation engine to build, test and ship any codebase. Runs locally, in CI, or directly in the cloud - Issues · dagger/dagger