#Joining concurrent DAG branches?

north harbor
#

Hi 🙂 Is there already some kind of common best practice on how to join parallel/concurrent branches of the DAG? Like parallel "build" and "test" steps that should be joined before follow-up steps can be started?

Right now I'm mostly just using an empty container and mounting folders from the build and test containers into it so that the engine knows it has to execute both steps, but this feels a bit weird.

solemn meteor
#

Hey! Yes there is. Which SDK are you using?

#

In all SDKs, to run pipelines concurrently, you need to send concurrent requests to the engine. These are usually wrapped in a language construct (error group, task group...) whereby the calling code waits for the whole group to finish before continuing.

north harbor
#

@solemn meteor thank you! Sorry, I forgot to mention that I'm looking for such a "synchronization point" within the engine itself. No need to export anything to the host 🙂

Basically having a single container that receives dagger.Files from the test and build containers should do such a "joining of the graphs", right?

I'm mostly wondering if that would be a pattern or an anti-pattern 😅

solemn meteor
#

I know what you meant, just wondering why you want it that way.

#

I don't think it's necessarily an anti-pattern, but it does feel like a workaround. You can see it being mentioned in "DX: Pipeline Synchronization" (https://github.com/dagger/dagger/issues/4205), see Problem 2: Workaround 1.

#

There's a new issue for this in "Proposal: Helper for multiple pipeline synchronization" (https://github.com/dagger/dagger/issues/5083), where I looked into supporting it with the API directly, but the problem is accepting multiple different id types in the schema (see the "Alternatives considered" section). So the suggested improvement is to make a multi-query request, or even just abstract the language's concurrency.

#

By the way, with Sync it becomes a bit simpler:

```diff
 _, err := c.Container().
-    From("alpine").
     WithMountedDirectory("/test", test.Directory("/tmp")).
     WithMountedDirectory("/lint", lint.Directory("/tmp")).
-    WithExec([]string{"ls"}).
-    ExitCode(ctx)
+    Sync(ctx)
```

#

A while back I actually tested this pattern against using the language's concurrency and didn't see any noticeable performance improvement.