#web app traces visualization in dagger cloud


gleaming oar
#

Hi Dagger Team,

I'm using Dagger for my CI/CD pipelines, and I'm exploring how to get a better observability picture for my web applications alongside my pipelines.

Dagger Cloud's tracing is very useful for understanding what's happening inside my Dagger pipelines, and the spans are a great help when I need to debug. However, I'm running into a challenge when I try to connect these traces with user interactions in my web applications that are being built and deployed using Dagger.

Specifically, my web application uses its own tracing (via OpenTelemetry with Deno), and I'm trying to figure out if there's a recommended way to visualize those traces alongside, or linked with, the Dagger pipeline traces directly within Dagger Cloud.

For example, imagine a user places an order through a "purchase hub" type of web app. I'd ideally like to trace that whole user journey, and see how that relates to any Dagger operations that may be involved, all within Dagger Cloud without relying on external tracing tools.

Any guidance would be much appreciated.

Thanks!

topaz canyon
#

Hello! Normally, anytime Dagger executes a tool inside a container, and that tool emits open telemetry spans, Dagger should pick up those spans automatically, and forward them to Dagger Cloud so that you can view them in context. We use this feature for our own tests, it's pretty sweet.

#

There may be a glitch somewhere that is preventing this from working out of the box in your case. We will gladly help you find out what it is!

#

silently pinging @spiral sandal who is the expert in this regard. It's pretty late for him right now, but I'm sure he will have an opinion on this tomorrow

gleaming oar
#

Thanks for the quick response. It's helpful to know about Dagger's automatic span pickup for tools running in the containers.

I should clarify that I'm interested in getting traces from my web application running as a service, not just from Dagger's testing tools.

My web application itself is instrumented with OpenTelemetry (using Deno and the OTel libraries), and the plan is to generate spans to track user journeys and application-level events. The goal is to see this information alongside the Dagger pipeline execution within Dagger Cloud.

Here's an example:

  1. main.ts:
/*main.ts*/
import { type Span, trace } from "@opentelemetry/api";

const PORT: number = 8000;

const tracer = trace.getTracer("dice-lib");

function rollOnce(min: number, max: number) {
    return Math.floor(Math.random() * (max - min + 1) + min);
}

export function rollTheDice(rolls: number, min: number, max: number) {
    return tracer.startActiveSpan("rollTheDice", (span: Span) => {
        const result: number[] = [];
        span.setAttribute("rolls", rolls);
        for (let i = 0; i < rolls; i++) {
            result.push(rollOnce(min, max));
        }
        span.end();
        return result;
    });
}


const handler = async (req: Request): Promise<Response> => {
    const pathname = new URL(req.url).pathname;
    if (pathname === "/rolldice") {
        const rollsParam = new URL(req.url).searchParams.get("rolls");
        console.log("Got request for ", rollsParam);
        // Reject a missing or non-numeric value, as the error message promises.
        const rolls = parseInt(rollsParam ?? "", 10);
        if (Number.isNaN(rolls)) {
            return new Response(
                "Request parameter 'rolls' is missing or not a number.",
                { status: 400 },
            );
        }
        return new Response(JSON.stringify(rollTheDice(rolls, 1, 6)));
    }
    return new Response("Not found", { status: 404 });
};
Deno.serve({ port: PORT }, handler);
#
  2. index.ts
import { dag, object, func, Service, Directory } from "@dagger.io/dagger";

@object()
export class OtelDagger {
  @func()
  baseService(source: Directory): Service {
    const service = dag
      .container()
      .from("denoland/deno:latest")
      .withDirectory("/app", source)
      .withWorkdir("/app")
      .withEnvVariable("OTEL_EXPORTER_OTLP_PROTOCOL", "http/json")
      .withExec(['deno', 'run', '-A', "main.ts"])
      .withExposedPort(8000)
      .asService();
    return service;
  }
}
spiral sandal
#

Those should work too - they should appear beneath the asService span (the one that also has the service logs + is running for the duration of the service)

gleaming oar
#

@spiral sandal
Thanks for your help.

As you can see in the Dagger Cloud UI, I am only getting the logs under the asService span, but not the traces from dice-lib.

From the Deno documentation (https://docs.deno.com/runtime/fundamentals/open_telemetry/#quick-start), it appears that Deno's OpenTelemetry implementation, by default, sends telemetry data to localhost:4318 (which can be changed through the OTEL_EXPORTER_OTLP_ENDPOINT environment variable).

So, my question is: is there a way I can configure the Dagger pipeline or service to pick up the telemetry data being sent from my Deno application (i.e., to pick up traces from the localhost:4318 endpoint) so that they are displayed as spans under asService in Dagger Cloud?


spiral sandal
#

if you check out the env, it should also be automatically setting all the standard OTEL_* env vars as appropriate
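One way to check this from inside the service is to log every OTEL_* variable at startup. The following is a minimal sketch, not part of the original repro: `pickOtelEnv` is a hypothetical helper name, and the sample values are made up for illustration.

```typescript
// Hypothetical debugging helper: return only the OTEL_* entries of an
// environment map, sorted by name, so they can be logged at startup.
function pickOtelEnv(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of Object.keys(env).sort()) {
    const value = env[name];
    if (name.startsWith("OTEL_") && value !== undefined) {
      picked[name] = value;
    }
  }
  return picked;
}

// Example with a made-up environment; the values are illustrative only.
const sampleOtelEnv = pickOtelEnv({
  PATH: "/usr/bin",
  OTEL_EXPORTER_OTLP_PROTOCOL: "http/protobuf",
  OTEL_DENO: "true",
});
console.log(sampleOtelEnv);
```

In main.ts this could be called as `pickOtelEnv(Deno.env.toObject())`, which needs env permission (already granted by `-A`). The output then shows up in the service logs under the asService span.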

#

in the deno docs it says this:

To enable the OpenTelemetry integration, run your Deno script with the --unstable-otel flag and set the environment variable OTEL_DENO=true:
so maybe you need a withEnvVariable("OTEL_DENO", "true")

gleaming oar
#

OTEL_EXPORTER_OTLP_PROTOCOL only sets the format of the OTel data sent to the collector.

  baseService(source: Directory): Service {
    const service = dag
      .container()
      .from("denoland/deno:latest")
      .withDirectory("/app", source)
      .withWorkdir("/app")
      .withEnvVariable("OTEL_DENO", "true")
      .withExec(['deno', 'run', '-A', '--unstable-otel', "main.ts"])
      .withExposedPort(8000)
      .asService();
    return service;
  }

I tried this with withEnvVariable("OTEL_DENO", "true") and my pipeline fails:

│ ● .asService: Service! 7.3s
│ ┃ Download https://registry.npmjs.org/@opentelemetry%2fapi                                                                                  
│ ┃ Download https://registry.npmjs.org/@opentelemetry/api/-/api-1.9.0.tgz                                                                    
│ ┃ Listening on http://0.0.0.0:8000/                                                                                                         
● .up: Void 7.3s
┃ 05:21:49 INF tunnel started port=8000 protocol=tcp http_url=http://localhost:8000 description="tunnel 0.0.0.0:8000 -> bmmmkloek6nfc.8038nfba
┃ m8s.dagger.local:8000"                                                                                                                      

Error: Post "http://dagger/query": command [docker exec -i dagger-engine-v0.15.2 buildctl dial-stdio] has exited with exit status 137, make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=
topaz canyon
#

That error may be unrelated, I encountered it also a few days ago

normal peak
#

@spiral sandal can you please help us here?

spiral sandal
#

yeah that error seems unrelated, i would try again

normal peak
spiral sandal
#

not really - that's just a higher level way to create spans so you don't have to use the OTel SDK, whereas it seems like you're trying to use existing direct OTel integration

normal peak
#

@spiral sandal @topaz canyon Hey guys can anyone please help to solve this or let us know in case we need to create an issue for this?

spiral sandal
# normal peak @spiral sandal @topaz canyon Hey guys can anyone please help to s...

If you create an issue, please include a repro - as far as I know we're doing everything right here according to the OTel spec, so to proceed we'll need a repro.

To be clear:

  • Dagger automatically configures standard OTEL_* env vars, configuring exporters for traces, logs, and metrics. This happens for every container and service, unconditionally, so you shouldn't need to enable anything.
  • Deno appears to also require you to set OTEL_DENO=true and run it with --unstable-otel to enable its OTel integration.
  • We are actively using Dagger's automatic OTEL_* configuration in various places (namely our own integration test suite), so it for sure works, but I have run into tools in the past that misinterpret the OTel spec, so maybe there's a bug in Deno or something.
normal peak
spiral sandal
#

@normal peak I'm confused: why does this run its own OTLP data receiving service? Don't you just want Deno's tracing data to show up in the Dagger trace? Or are you trying to also send OTLP data to a service of your own?

Can confirm there's a crash that's caused by the data sent by Deno (looks like it sends some OTel logs that don't have a traceID set), looking into it
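To illustrate the failure mode, here is a minimal sketch of a receiver that tolerates log records without a trace ID instead of failing on them. It is not the engine's actual code or fix: the `LogRecord` shape and both function names are hypothetical, and the only assumption is that trace IDs arrive as 32-character lowercase hex strings, as in OTLP/JSON.

```typescript
// Hypothetical, simplified log record: real OTLP records carry more fields.
interface LogRecord {
  body: string;
  traceId?: string;
}

const TRACE_ID_PATTERN = /^[0-9a-f]{32}$/;

// A trace ID is usable if it is 32 hex characters and not all zeros.
function hasUsableTraceId(record: LogRecord): boolean {
  const id = record.traceId;
  return id !== undefined && TRACE_ID_PATTERN.test(id) && !/^0+$/.test(id);
}

// Split records into those that can be attached to a trace and those that
// cannot, so the second group can be kept as plain logs rather than crashing.
function partitionByTraceId(
  records: LogRecord[],
): { linked: LogRecord[]; orphaned: LogRecord[] } {
  const linked: LogRecord[] = [];
  const orphaned: LogRecord[] = [];
  for (const record of records) {
    (hasUsableTraceId(record) ? linked : orphaned).push(record);
  }
  return { linked, orphaned };
}
```

The design choice here is to degrade gracefully: a record with no trace context is still a valid log line, it just cannot be placed under a span.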

normal peak
spiral sandal
normal peak
spiral sandal
#

Just make sure when you send to Dagger that you don't override OTEL_EXPORTER_OTLP_PROTOCOL - that has to stay as the value Dagger provided. So I ended up making this change in your repro:

diff --git a/src/poc_dagger_deno_otel/src/index.ts b/src/poc_dagger_deno_otel/src/index.ts
index 6b9fca1..ed17fca 100644
--- a/src/poc_dagger_deno_otel/src/index.ts
+++ b/src/poc_dagger_deno_otel/src/index.ts
@@ -16,7 +16,6 @@ export class OtelDagger {
       .from("denoland/deno:latest")
       .withDirectory("/app", source)
       .withWorkdir("/app")
-      .withEnvVariable("OTEL_EXPORTER_OTLP_PROTOCOL", "http/json")
       .withEnvVariable("OTEL_DENO", "true")
       .withEnvVariable("OTEL_DENO_CONSOLE", "true")
       .withExec(["deno", "task", "start"])

It's fine to set it to that when emitting to your own OTLP service of course, it just has to match what the backend expects. (Dagger uses http/protobuf.)
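The mismatch can be made concrete with the two OTLP/HTTP encodings defined by the OpenTelemetry spec, where the protobuf one is named `http/protobuf`. The sketch below is illustrative only: `otlpHttpContentType` and `protocolsMatch` are hypothetical helper names, not part of Dagger or Deno.

```typescript
// Map an OTEL_EXPORTER_OTLP_PROTOCOL value to the Content-Type that an
// OTLP/HTTP exporter sends. "grpc" is rejected because it is not OTLP/HTTP.
function otlpHttpContentType(protocol: string): string {
  switch (protocol) {
    case "http/protobuf":
      return "application/x-protobuf";
    case "http/json":
      return "application/json";
    default:
      throw new Error(`not an OTLP/HTTP protocol: ${protocol}`);
  }
}

// An exporter and a backend only understand each other if they agree on
// the encoding; a JSON payload cannot be parsed as protobuf wire format.
function protocolsMatch(exporter: string, backend: string): boolean {
  return otlpHttpContentType(exporter) === otlpHttpContentType(backend);
}

console.log(protocolsMatch("http/json", "http/protobuf"));
```

With the override in place the exporter sends JSON while the receiver expects protobuf, so the check above returns false for that pair.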

After I made that change the engine crashed when starting the server, but that's kind of good because it at least meant Deno was sending data there. So once the crash fix is released, it should work. If I find time I'll double check and see what that looks like, but I have a lot on my plate right now 🙂 - maybe you could try it out with a dev engine

normal peak
#

@spiral sandal We tried with the latest releases of Deno and Dagger, but we still can't see the OTel logs in Dagger Cloud. We checked the Docker logs to troubleshoot the issue.

We found the errors below, which might help with further troubleshooting:

2025-02-20 10:50:53 time="2025-02-20T05:20:53Z" level=error msg="error unmarshalling logs request" error="map[error:proto: cannot parse invalid wire-format data kind:*errors.prefixError stack:<nil>]"
---
2025-02-20 11:12:33 time="2025-02-20T05:42:33Z" level=error msg="log has no resource" 
---
2025-02-20 11:13:12 time="2025-02-20T05:43:12Z" level=debug msg="exporting metrics to clients" clients=2
2025-02-20 11:13:12 time="2025-02-20T05:43:12Z" level=error msg="error exporting metrics" err="map[error:export to xynmih9nvco6ku2spoz02hxom: failed to unmarshal resource metrics: unknown aggregation from pb: *v1.Metric_Histogram\nunknown aggregation from pb: *v1.Metric_Sum\nunknown aggregation from pb: *v1.Metric_Histogram\nunknown aggregation from pb: *v1.Metric_Histogram kind:*fmt.wrapError stack:<nil>]"
---
normal peak
spiral sandal
#

ah i can at least repro the log has no resource one, with dagger call deno-run-with-env --source=example up

spiral sandal
#

ok, I have https://github.com/dagger/dagger/pull/9716 which helps to some extent but I think some things still need to be handled in code, and at that point I'm a bit out of my depth. Maybe @stiff drift can help? I think the remaining work is to ensure the trace context continues from the trace + span ID provided via env vars, similar to how the TypeScript SDK detects it, except in this case it's a "pure TypeScript" codebase and not a Dagger module using the SDK

GitHub

Add (mostly AI generated) transformations to handle different types of metrics data (cc @cwlbraa)
More gracefully handle log records with missing data (resource / trace ID)
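As a rough illustration of continuing a trace from IDs passed through the environment, the sketch below parses a W3C `traceparent` value (version 00 format) into its trace ID and parent span ID. It assumes the value arrives in an environment variable named `TRACEPARENT`; that name, and the `parseTraceparent` helper itself, are assumptions for illustration and not something confirmed in this thread.

```typescript
// Fields of a W3C traceparent value: version-traceId-parentSpanId-flags.
interface TraceParent {
  version: string;
  traceId: string;
  parentSpanId: string;
  sampled: boolean;
}

// Parse a traceparent string; return null for missing or malformed input.
function parseTraceparent(value: string | undefined): TraceParent | null {
  if (!value) return null;
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/
    .exec(value.trim());
  if (!match) return null;
  const [, version, traceId, parentSpanId, flags] = match;
  // The spec forbids version "ff" and all-zero trace or span IDs.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) {
    return null;
  }
  return {
    version,
    traceId,
    parentSpanId,
    sampled: (parseInt(flags, 16) & 1) === 1,
  };
}

// Example value taken from the W3C Trace Context specification.
console.log(
  parseTraceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"),
);
```

In a Deno app this could be fed with `Deno.env.get("TRACEPARENT")`. The parsed IDs would then have to be turned into a parent context for the app's root spans, which is the part described above as still needing to be handled in code.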

stiff drift
# spiral sandal ok, I have https://github.com/dagger/dagger/pull/9716 which helps to some extent...

The TypeScript SDK relies on the OTEL environment variables when used with connection: https://github.com/dagger/dagger/blob/eb738ebe8bf53a80f8061d377dd04934e1489fce/sdk/typescript/src/connect.ts#L35

The trace context is already propagated for each call thanks to: https://github.com/dagger/dagger/blob/eb738ebe8bf53a80f8061d377dd04934e1489fce/sdk/typescript/src/common/graphql/client.ts#L55

Is there any extra work required to correctly handle traces produced from Deno?

normal peak
#

@spiral sandal Please let us know what we can give from our side to solve this

normal peak
#

@spiral sandal Any hopes? Please

boreal jay
# normal peak @spiral sandal Any hopes? Please

Hey @normal peak

We're a small team with competing priorities, and we can't keep debugging this directly, since the issue seems to be in your own implementation rather than in the Dagger SDK itself.

Please continue to investigate and try things on your side. If you get this to work and find a bug in our SDK, please submit a patch on GitHub.
