#Dagger CLI's memory usage


minor shale
#

👋 Hello!

I have a pipeline that is quite resource intensive.
My understanding of Dagger is that the engine should be the one consuming all the resources, since that's where the work happens.

This is largely true, and I can see it use around 12GB of memory.
However, I can also see that the memory usage of the CLI process keeps increasing throughout the dagger call (~20min total), and grows to almost 1GB.

In our CI platform I've currently allocated 1GB of memory for the CLI, which is leading to this specific pipeline getting OOM killed...

Do you have any insights on why the memory usage is so high in the CLI? Also, any recommendations on how much memory to allocate to the CLI?

Thank you so much for your help!

inner abyss

hey Vincent! the CLI actually does quite a bit of work, as it's the one in charge of sending all the telemetry to Dagger Cloud. Is there a chance you could take some memory snapshots and send them to us so we can see where the memory is being allocated? It's possible that there's a memory leak somewhere which we might have to optimize.

In order to take a heap dump of the CLI, you need to set PPROF=localhost:6060 (or whatever port you prefer) as an env variable. Then, once the CLI has been running for some time, you can get the profile with curl -o heap.pprof http://localhost:6060/debug/pprof/heap

If you could take 2 or 3 profiles across the lifespan of the CLI, that will allow us to compare them and understand what's currently allocating so much memory 🙏
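For anyone following along, the snapshot steps above could be scripted roughly like this. It is only a sketch: the helper names heap_url and snapshot_heaps, the output file naming, and the intervals are made up for illustration, and it assumes the CLI was started with PPROF set as described.

```shell
#!/bin/sh
# Sketch: capture several heap profiles from a running dagger CLI.
# Assumes the CLI was started with PPROF=<addr> (e.g. localhost:6060).
# heap_url and snapshot_heaps are hypothetical helper names, not part of dagger.

# heap_url ADDR
# Prints the heap profile endpoint for the given pprof address.
heap_url() {
  printf 'http://%s/debug/pprof/heap\n' "$1"
}

# snapshot_heaps ADDR COUNT INTERVAL_SECONDS OUT_DIR
# Saves COUNT heap profiles, INTERVAL_SECONDS apart, as OUT_DIR/heap-N.pprof.
snapshot_heaps() {
  addr="$1"; count="$2"; interval="$3"; out_dir="$4"
  mkdir -p "$out_dir"
  i=1
  while [ "$i" -le "$count" ]; do
    # --fail makes curl exit non-zero on HTTP errors instead of saving an error page
    curl --silent --fail -o "$out_dir/heap-$i.pprof" "$(heap_url "$addr")" \
      || echo "snapshot $i failed (is the CLI still running?)" >&2
    # wait between snapshots, but not after the last one
    if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    i=$((i + 1))
  done
}

# Example usage (my-function is a placeholder for your own function):
#   PPROF=localhost:6060 dagger call my-function &
#   sleep 300
#   snapshot_heaps localhost:6060 3 300 ./profiles
```

If you have a Go toolchain around, two of the resulting files can be diffed locally with go tool pprof -base ./profiles/heap-1.pprof ./profiles/heap-3.pprof, which shows what grew between the first and last snapshot.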

minor shale
#

Thanks for the quick response! I've just DM'd you the profiles.

I'm also not using Dagger Cloud, and have set DO_NOT_TRACK=1 in CI.

Would you have recommendations on the memory to allocate to the CLI?

lofty badge
#

Hmm, I don't see anything too weird in the profiles. The largest one is 387.4MB, but it looks like it's just a lot of telemetry data being emitted over time, which slowly adds up. Is it a very long running + busy function that it's calling? (in terms of spans emitted)

minor shale
#

It runs for 20-30 mins and does multiple thousands (maybe almost 10k) of WithExec calls. At first I thought it might be a GitHub Actions logging issue, so I started using the --silent flag when calling dagger (it's used in the call that generated the profiles), but I guess the data is sent anyway

#

I'm guessing I just need to bump my memory limits on the CLI

lofty badge
#

Oof yeah that sounds like a lot to fit into 1GB, would recommend just bumping it for now, curious to know what it plateaus at

#

There are optimizations we could do if it really came down to it, but the cost of entry there seems high vs. other priorities, if it's an easy enough tweak on your end 🙏
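
To find out where the CLI's memory plateaus before settling on a new limit, one option is to sample its resident memory while the pipeline runs. The sketch below is an illustration only: track_peak_rss is a made-up helper name, and it assumes a ps that supports -o rss= (Linux procps and macOS do), which reports RSS in KiB.

```shell
#!/bin/sh
# Sketch: sample a process's resident memory until it exits and report the peak.
# Assumes `ps -o rss= -p PID` is available; values are in KiB.
# track_peak_rss is a hypothetical helper name, not part of dagger.

# track_peak_rss PID INTERVAL_SECONDS
# Prints the highest RSS (in KiB) observed while PID was alive.
track_peak_rss() {
  pid="$1"; interval="$2"
  peak=0
  # kill -0 sends no signal; it only checks that the process still exists
  while kill -0 "$pid" 2>/dev/null; do
    rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
    # rss is empty if the process exited between the two checks
    if [ -n "$rss" ] && [ "$rss" -gt "$peak" ]; then peak="$rss"; fi
    sleep "$interval"
  done
  echo "$peak"
}

# Example usage (my-function is a placeholder for your own function):
#   dagger call my-function &
#   track_peak_rss $! 5
```

The number it prints is only as fine-grained as the sampling interval, so it makes sense to leave some headroom above it when setting the CI memory limit.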