#Stuck on "DBG frontend exporting spans"

gentle walrus
#

The Dagger pipeline runs fast when executed locally but takes forever on our cluster when executed via Argo Workflows.
It always seems stuck at some commands (e.g. an "upload" step). I enabled the debug logs to see what is going on, and every time it is stuck, the last log line is "DBG frontend exporting spans".

What is this? What does it export? Where? Is this necessary or can this be disabled?

rugged hinge
#

Hey there! Does it get stuck on random steps, or is it always at the same one?

Have you followed our docs here: https://docs.dagger.io/ci/integrations/argo-workflows/ on how to set up Dagger in Argo?

gentle walrus
#

Yes, with one difference where I had to also specify the command for the docker engine sidecar: command: ["dagger-entrypoint.sh"]

It feels like it's always the same steps.

rugged hinge
#

Do you have kubectl access to the pod where your engine is running in your k8s cluster? If so, you could try setting the _EXPERIMENTAL_DAGGER_RUNNER_HOST variable to the k8s pod name as shown here: https://docs.dagger.io/ci/integrations/kubernetes/#example and check whether the dagger call works OK when you run your pipeline from your local machine against the remote engine 🙏
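
A minimal sketch of that check, assuming the `kube-pod://<pod>?namespace=<ns>` address form from the linked Kubernetes docs. The pod name, the `dagger` namespace default, and the helper names are placeholders for illustration, not values from this thread:

```python
import os
import subprocess


def runner_host(pod_name: str, namespace: str = "dagger") -> str:
    """Build the engine address for a pod (namespace default is an assumption)."""
    return f"kube-pod://{pod_name}?namespace={namespace}"


def call_remote_engine(pod_name: str, dagger_args: list[str],
                       namespace: str = "dagger") -> int:
    """Run the local dagger CLI against the engine pod and return its exit code."""
    env = dict(os.environ)
    # Point the CLI at the remote engine instead of a locally started one.
    env["_EXPERIMENTAL_DAGGER_RUNNER_HOST"] = runner_host(pod_name, namespace)
    return subprocess.run(["dagger", *dagger_args], env=env).returncode
```

If the same pipeline is also slow when driven from a local machine this way, the slowdown is more likely on the engine side (for example its storage) than in the Argo setup.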

gentle walrus
#

I'm not sure what you are suggesting. The pipeline is generally working; it just gets stuck for a while when it hits these DBG ... logs.

#

Not stuck in the sense of not continuing

#

but in the sense of taking some minutes before continuing.

#

The pipeline takes 1-2 min locally but 1 h on the cluster.

rugged hinge
#

Is there any chance you can connect Dagger Cloud to your pipeline so we can see what's happening from our side?

gentle walrus
#

It's a paid service for companies, right?

#

What are these "DBG frontend exporting spans"? What is happening in this stage?

rugged hinge
#

So you're good to create an account and use a token in your Argo server to send telemetry there.

gentle walrus
#

If this is not configured, nothing is sent outside our network? And it is also not trying to send anything in these steps (having connection issues and hence being stuck)?

rugged hinge
#

Also, it's recommended to put the engine state volume /var/lib/dagger on a physical disk rather than network block storage, if you can. Dagger is mostly disk-intensive, since each operation you perform in a container gets its own overlay CoW snapshot.
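
One rough way to compare candidate volumes is to time synced writes on each. This is a sketch with a hypothetical helper, not a Dagger tool, and the block count and size are arbitrary assumptions:

```python
import os
import tempfile
import time


def fsync_write_seconds(directory: str, blocks: int = 64,
                        block_size: int = 1 << 20) -> float:
    """Time writing `blocks` chunks to a temp file in `directory`, syncing each."""
    payload = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(blocks):
                handle.write(payload)
                handle.flush()
                # Force the data to the device so the page cache does not hide latency.
                os.fsync(handle.fileno())
        return time.perf_counter() - start
    finally:
        # Clean up the temp file even if a write fails.
        os.remove(path)
```

Running it once against a directory on the network-backed volume and once against a node-local directory gives a crude comparison; a large gap would be consistent with the storage being the bottleneck.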

gentle walrus
#

You mean to mount a PVC to /var/lib/dagger?

rugged hinge
gentle walrus
#

ah you mean local to the node the build is running on?
We don't have specific build nodes atm.

rugged hinge
gentle walrus
#

I will need to check that with our IT.

rugged hinge
#

cc @humble sequoia @hollow arch

gentle walrus
#

Does it make sense to have a separate deployment of the Dagger engine (ensuring both will run on the same node)?

rugged hinge
gentle walrus
#

Can a single Dagger engine handle requests from multiple build jobs at the same time?

rugged hinge
gentle walrus
#

We are looking into configuring a dedicated build node, and I'm planning to make the engine a standalone deployment and not a sidecar.
Will report back how that goes 🙂

rugged hinge
gentle walrus
#

Still with the sidecars: When I launch two pipeline executions at the same time, then the second one fails to connect to the socket. Is that expected? Is that also a limitation when deploying a single engine? Or should it be able to run multiple builds in parallel?

rugged hinge
#

Have you checked that both pods land on the same node?

#

Oh wait

#

You're using sidecars

gentle walrus
#

yes, two workflows each with their own sidecar

#

The second one, starting and running in parallel, cannot connect to the socket.

#

The socket is mounted as an emptyDir.

rugged hinge
gentle walrus
#

As we are looking into switching to a separate engine deployment, it might not matter for us, but at least consider it reported.

#

I will report back if the issue persists with a separate engine deployment.

gentle walrus
#

Can confirm that parallel builds are not a problem for the separate deployment. We now also have the dedicated PVC.

#

Build time went down significantly

#

I'm happy now

gentle walrus
#

Thanks for the help

real moss
#

Great to hear! You're most welcome 🙂