#Engine garbage collecting aggressively in v0.11.0

1 messages · Page 1 of 1 (latest)

lapis stone
#

Coming from here: #1223351179753422958 message

Just wondering if someone has noticed a noticeable decrease in the pipelines caching. I've just put together a very basic pipeline that transfers ~500mb from the host to the function and I notice that after a few seconds the pipeline finishes, I see some garbage collection happening in the engine even though I have enough disk space in the container and I start from a clean state every time

#
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=5arexgv126yiuaqy7f46u9k56 snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=btbnvtjy35a77x2pa2jcq55lh snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=jssmab40nli9bke1z7mg3sttt snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=nabssv9e0r7yv7x8t59qsl1hu snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=pfn8v90cd9m6sjlnmvto1ki5g snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=r544rvk30i7y6x1qpybxx1imq snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="remove snapshot" key=rocj24tudnne99ldfan78p4sc-view snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="schedule snapshotter cleanup" snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/46/jssmab40nli9bke1z7mg3sttt snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/57/5arexgv126yiuaqy7f46u9k56 snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/31/r544rvk30i7y6x1qpybxx1imq snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/38/nabssv9e0r7yv7x8t59qsl1hu snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/43/btbnvtjy35a77x2pa2jcq55lh snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/55/pfn8v90cd9m6sjlnmvto1ki5g snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="removed snapshot" key=buildkit/52/rocj24tudnne99ldfan78p4sc-view snapshotter=overlayfs
time="2024-04-09T16:51:21Z" level=debug msg="snapshot garbage collected" d=4.656707ms snapshotter=overlayfs

cc @scarlet grove @frozen folio any ideas?

#

haven't checked if this also happens in v0.10.x

#

this is all what the pipeline is doing:

// Returns lines that match a pattern in the files of the provided Directory
func (m *Lala) Test(ctx context.Context, dir *Directory) ([]string, error) {
    return dir.Directory("./").Entries(ctx)
}

and my dir context is ~500mb

jolly snow
#

I think I've also experienced this behaviour in v0.10.3 when creating a pipeline that builds a Go program for multiple platforms. It came to my attention when I saw that re-running the same dagger call function (with the same args) took always a few seconds.

I'm now on v0.11.0 and I'm also seeing a lot of removed snapshot messages in the engine logs after calling the same function multiple times.

The disk space of the Docker VM is ~50% available. The dirty, temporary solution was to disable the GC in the engine.toml file.

scarlet grove
#

Hm, I don't think the removed snapshots are related to any cache busting we're seeing going on - I can see them happening even just on a normal run that ends up perfectly cached

#

Maybe the reason we might be seeing "more" of them on v0.11.0 is that we've done a lot of log cleanup - lots of old noisy messages have been silenced, so maybe these just stand out more than they used to

#

@jolly snow are you using dagger cloud? If you are, sharing a trace of both an uncached and a cached run would be really helpful! There's definitely inefficiencies around, I'm curious what you're seeing though.

#

@lapis stone I think it's likely there is some caching regression, but it doesn't seem related to these logs from what I can tell

lapis stone
#

my reading about the behavior is that those garbage collections seem to be removing? usable cache snapshots that then need to be re-computed again afterwards 🤔

scarlet grove
#

hmmm

#

how full is your cache?

#

as in, how much space does it currently appear to be taking up?

lapis stone
scarlet grove
#

hm