#Using export on files or directories, causes huge disk-space used, which ends gitlab-ci pipeline

hybrid oak
#

I'm using dagger to automate my build system, which runs in a docker container.
So I'm using docker_build on my Dockerfile and executing the needed CMake commands in there.

At the end I export the needed zip file to the host directory using ctr.file("zipfile.zip").export("output/zipfile.zip").

Running this locally is fine, but in my gitlab-ci environment the 20GB of runner disk space is completely filled up before the end of the pipeline. I'm trying to run a for loop with 20 executions in it, but by the 8th iteration the gitlab-runner disk is full.

Some code:

    from pathlib import Path

    import anyio
    import dagger

    # config, cfg_files, build_type and outdir are defined elsewhere in the script;
    # common_release_build_step and unpack_and_encode are my own helpers (not shown).
    async with dagger.Connection(config) as client:
        context_dir = client.host().directory(".", exclude=["docs/", ".git*", ".cache"])
        build_ctr = await context_dir.docker_build(dockerfile="soft/Dockerfile")

        async def build_configs(cfg: Path, build_type: str, semaphore: anyio.Semaphore):
            # The semaphore allows only one build at a time.
            async with semaphore:
                rel_bld = await common_release_build_step(
                    cfg, build_type, build_ctr, context_dir
                )

                unpacked_ctr = await unpack_and_encode(rel_bld, f"{build_type}_Output")
                await unpacked_ctr.file("SW.zip").export(f"{outdir}/SW.zip")

        async with anyio.create_task_group() as tg:
            semaphore = anyio.Semaphore(1)
            for cfg in cfg_files:
                tg.start_soon(build_configs, cfg, build_type, semaphore)

Any clues? It seems that "snapshots" are being created in the Docker volumes for every run. Can anyone give me some tips? Thx!

minor schooner
#

It'd be great to have a bit more context about what your app and what you're building currently look like

#

is each execution building the same thing?

hybrid oak
#

Hi Marcos, OK will give you some more context.

#

I'm building an embedded application using cmake in the container. For some reason, I need to have a new, clean environment for each build. (some files need to change) So I am building the container first, and next I am using cmake in that "fresh" container each time to execute the cmake build script.

#

The disk space is only getting eaten up once I add the "await unpacked_ctr.file(f"SW.zip").export(f"{outdir}/SW.zip")" line, so it seems like the "snapshotting" mechanism is being activated because of that?

hybrid oak
#

@minor schooner Any more feedback? I'm quite blocked on this, I am just trying to find some information on how the snapshotting is working with Dagger itself.. If I can turn it off or so?

hybrid oak
#

@minor schooner OK, the export is not causing this. Rather, just spinning up multiple "big" containers in a loop seems to "save" the container itself to a snapshot for each execution. Is this for caching purposes?

minor schooner
#

how are you seeing this currently?

minor schooner
#

here's a very basic, high-level sketch in Go of how I'd approach this.

given the following directory structure:

.
├── a
├── b
├── c
├── d
├── go.mod
├── go.sum
└── main.go

5 directories, 3 files

If I need to create a pipeline that builds the a, b, c and d subprojects out of a base container, I'd do something like:

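    package main

    import (
        "context"
        "os"
        "path/filepath"

        "dagger.io/dagger"
    )

    func main() {
        ctx := context.TODO()

        c, err := dagger.Connect(ctx, dagger.WithLogOutput(os.Stdout))
        if err != nil {
            panic(err)
        }
        defer c.Close()

        // Load the project directory from the host, skipping .git.
        hd := c.Host().Directory(".",
            dagger.HostDirectoryOpts{
                Exclude: []string{".git"},
            })
        es, err := hd.Entries(ctx)
        if err != nil {
            panic(err)
        }

        // Shared base container, created once and reused by every iteration.
        base := c.Container().From("alpine")

        for _, e := range es {
            // Crude check: entries without a file extension are the subproject directories.
            if len(filepath.Ext(e)) == 0 {
                _, err := base.WithMountedDirectory("/app", hd.Directory(e)).
                    WithExec([]string{"echo", "cmake here"}).
                    File("out.zip").Export(ctx, "out.zip")
                if err != nil {
                    panic(err)
                }
            }
        }
    }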

^ the most important thing here is the WithMountedDirectory call on each loop iteration, so that it doesn't produce a layered FS over the base container each time.
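
For reference, roughly the same idea in the Python SDK used in the original question might look like the sketch below. It is a minimal, untested sketch: the "soft/Dockerfile" path and the exclude list come from the question, while the "/src" mount point, the placeholder build command, and the output_path helper (which gives each config its own export target) are assumptions made up for illustration. The real build step has to produce SW.zip for the export to succeed.

```python
import sys
from pathlib import Path


def output_path(outdir: str, cfg: Path) -> str:
    """Hypothetical helper: give every config its own export target."""
    return str(Path(outdir) / cfg.stem / "SW.zip")


async def build_all(cfg_files: list[Path], outdir: str) -> None:
    # Imported lazily so the helper above works without the SDK installed.
    import dagger

    async with dagger.Connection(dagger.Config(log_output=sys.stderr)) as client:
        src = client.host().directory(".", exclude=["docs/", ".git*", ".cache"])
        # Build the image once; every iteration starts from this same container.
        base = src.docker_build(dockerfile="soft/Dockerfile")

        for cfg in cfg_files:
            await (
                base
                # Mount the sources instead of copying them into a new layer.
                .with_mounted_directory("/src", src)
                .with_workdir("/src")
                # Placeholder: replace with the real CMake invocation for cfg.
                .with_exec(["echo", f"cmake here for {cfg.name}"])
                .file("SW.zip")
                .export(output_path(outdir, cfg))
            )
```

With a list of config paths it could be started with something like anyio.run(build_all, cfg_files, "output").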

#

LMK if this helps @hybrid oak

hybrid oak
#

Well, I dug a bit deeper into this issue yesterday and did try out with_mounted_directory, but I think the issue lies somewhere else.

#

The container I am using in the pipeline is based on a Dockerfile, so I am using a dagger docker_build step to build it. The container itself is around 3GB in size.

#

When I execute the loop "for cfg in cfg_files", which runs around 17 times, I use around 19GB of disk space in gitlab-ci for this execution. This disk space mainly goes into the volume attached to the dagger-engine, in the runc-overlayfs/snapshots folder.
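
To see which snapshot directories grow between iterations, a small script along these lines could be run on the runner after each build. It is only a sketch: dir_size_bytes and largest_children are helper names made up here, the location of the snapshots folder on the runner has to be filled in by hand, and hard-linked files are counted once per link, so the totals are approximate.

```python
import os
import stat


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def largest_children(root: str, top: int = 5) -> list[tuple[str, int]]:
    """Return the `top` biggest immediate subdirectories of root as (name, bytes)."""
    sizes = [
        (entry.name, dir_size_bytes(entry.path))
        for entry in os.scandir(root)
        if entry.is_dir(follow_symlinks=False)
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Calling largest_children on the snapshots folder before and after one loop iteration would show whether each iteration adds one large new snapshot or many small ones.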

#

For reference, the git repo I am adding to the container for execution is around 200MB, and the output generated is around 1.5MB per execution.

minor schooner
#

are you cloning a different repo each loop iteration?

#

seems like each of your loop iterations is incorrectly duplicating data somehow

hybrid oak
#

Nope, nothing is being cloned again. I'm literally just executing the CMake command, and a Linux "wine" command to execute a Windows exe file inside the container...

minor schooner
#

@hybrid oak any chance we could jump into a quick #911305510882513037 session whenever you have time to check this out

#

or maybe there's a way you could share a simple repro with us that exposes this volume increase behavior?

#

I've tried to reproduce it myself but wasn't able to 😬

hybrid oak
#

@minor schooner An audio session is going to be difficult, I guess, due to the time difference; I'm in CEST. If you ever have time between 9am and 6pm CEST, I will be available. I can also look into making a small repo for this one, although of course having an EXE somewhere to use with wine COULD be challenging to share...

hybrid oak
#

@minor schooner For now I "fixed" the issue by choosing the bigger gitlab-runner size, from 20GB -> 100GB, but of course this is just a band-aid 😄

royal garden
#

Having re-read the thread, it seems to me that you are somehow copying the entire container image from Docker into Dagger. That is the only explanation for you filling 20GB of disk space on iteration 8 (container image is 3GB).

Without looking at the code, or even being able to run it locally, it's very difficult to debug this. I can appreciate that this may be difficult to share, so maybe we could see the dagger run output with --focus=false? Attaching a screenshot of what that looks like in one of my daggerized projects:

If you use dagger call -v, that will work too.

minor schooner
#

> If you use dagger call --focus=false, that will work too.

afaik --focus=false doesn't work in dagger call. ref: #maintainers message

royal garden
#

That makes sense. If the runner has 20GB of disk space, the image is 3GB, and it fails on the 8th iteration, that is something that I would check. I also don't fully understand what zipfile.zip is and how big it is. 200MB was mentioned, and also 1.5MB, but neither could explain the high disk usage.
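
Putting the numbers from this thread side by side, as a rough back-of-the-envelope sketch: it assumes disk usage grows linearly with the iterations and ignores whatever else is stored on the runner. The function name is made up for illustration; the figures are the ones reported above.

```python
def growth_per_iteration_mb(total_mb: float, iterations: int) -> float:
    """Average disk growth per loop iteration, assuming linear growth."""
    return total_mb / iterations


REPO_MB = 200  # reported size of the git repo added to the container

# Report 1: a 20GB runner is full by the 8th iteration.
full_by_8th = growth_per_iteration_mb(20_000, 8)
# Report 2: about 19GB used after roughly 17 iterations.
after_17 = growth_per_iteration_mb(19_000, 17)

for label, mb in [
    ("full by 8th iteration", full_by_8th),
    ("19GB over 17 iterations", after_17),
]:
    print(f"{label}: ~{mb:.0f}MB per iteration, {mb / REPO_MB:.1f}x the repo size")
```

Either way the growth per iteration is in the gigabyte range, which is much closer to the size of the container image than to the 200MB repo or the 1.5MB output.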

#

Also worth checking: is the Engine connected to the Cloud?

minor schooner
#

I did a small test locally also by starting from a base Dockerfile and wasn't able to repro

hybrid oak
#

Hi @minor schooner , sorry was in a meeting. I do have some time still now?

hybrid oak
#

If @ashen socket does not have the time, I can just share my script, which will not be functional, but I can share the dockerfile and my python script.. Would this be useful as well?

ashen socket
#

i will say, and this is a bit of a cop-out, 20GB is not much space for Dagger - is there an easy way to increase it?

#

as a general rule, space tends to be the cheapest variable to change, and some out-of-the-box CI defaults are too low to really benefit from caching. caches tend to add up quickly, but we definitely don't want to be wasteful, so if there is an issue here it's absolutely a priority. just looking for a faster path to being unblocked 🙂

#

in my experience the simplest answer to 'no space left on device' is that the disk usage plateau for your caching requirements is just higher than expected, so it'd help to bump it and see what value it plateaus at
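
one way to check for that plateau is to log the used space after each iteration and see whether the growth flattens out. a rough sketch, where shutil.disk_usage is the standard-library call and the has_plateaued helper, its window, and its tolerance are made up for illustration:

```python
import shutil


def used_gb(path: str = ".") -> float:
    """Used space on the filesystem containing path, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).used / 1e9


def has_plateaued(samples: list[float], window: int = 3, tolerance_gb: float = 0.5) -> bool:
    """True if the last `window` samples differ by no more than tolerance_gb."""
    if len(samples) < window:
        return False
    recent = samples[-window:]
    return max(recent) - min(recent) <= tolerance_gb
```

appending used_gb() for the engine's volume to a list after every build and calling has_plateaued on it would tell you whether the cache has settled or is still growing with each iteration.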

hybrid oak
#

Yeah, as I said, I increased to the 100GB runners in gitlab now, which is OK. But I would like to be able to understand the reason for the "snapshotting" of the complete container every time, rather than bandaiding it..

ashen socket
#

oh got it, didn't catch that

hybrid oak
#

np

ashen socket
#

how much disk usage has it gotten up to with the higher limit?

hybrid oak
#

Around 30GB

minor schooner
#

so basically, I think that by refactoring the pipeline a bit, we should be good

hybrid oak
#

Yeah, I'm pretty sure "the unnoticed thing" is this "wine XXX.exe" step executing, plus installing the packages needed to run wine in Docker. But I can't remove this, as I really need it in the execution. Moving it from the Dockerfile to Dagger did not change anything, btw.

minor schooner