#Dagger storage location


flint sluice
#

Hi, where can I find all the locations where dagger stores data? I know it uses the docker daemon under the hood (/var/lib/docker/), but I guess that this is not the only location...

I am trying to run a pipeline on GitHub Actions after freeing up disk space with the maximize-build-space action. That gives me about 100GB under /docker/, a location the action creates. I then move the Docker data there and symlink it back, so that I have more space and can build the image with no problems:

sudo mv /var/lib/docker /docker/
sudo ln -s /docker/docker /var/lib/docker
sudo systemctl restart docker

After building the image, I convert it to Singularity using a function that does pretty much this:

my_container = dag.container().from_("busybox")
# Adapted from github.com/shykes/x/singularity
sif_file = (
    dag.container()
    .from_("quay.io/singularity/docker2singularity")
    .with_file("img.tar", my_container.as_tarball())
    .with_exec(["singularity", "build", "img.sif", "oci-archive://img.tar"]) # CRASHES HERE
    .file("img.sif") 
)

This crashes with the error:

INFO:    Starting build...
INFO:    Fetching OCI image...
INFO:    Extracting OCI image...
FATAL:   While performing build: packer failed to pack: while unpacking tmpfs: while unpacking layer sha256:8cd7a829f9590817f5a94f1e50408777dba173caa39f5471a1c4e2d8f5b01696: unpack entry: usr/local/lib/python3.10/dist-packages/libtransformer_engine.so: unpack to regular file: short write: write /tmp/build-temp-251354134/rootfs/usr/local/lib/python3.10/dist-packages/libtransformer_engine.so: no space left on device

which suggests that the /tmp location in the container using quay.io/singularity/docker2singularity image is mapped somewhere outside the /docker location I created for docker data.
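One way to check which partition is actually filling up is to print the free space for each candidate path before and after the build step. The following is a minimal sketch using only the Python standard library. The helper name and the path list are assumptions to adjust to the runner's layout, and when run inside a container it reports that container's view of the filesystem:

```python
import shutil


def free_space_report(paths):
    """Return {path: free GiB}; paths that cannot be inspected map to None."""
    report = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Missing path or no permission to inspect it.
            report[path] = None
            continue
        report[path] = round(usage.free / 2**30, 2)
    return report


if __name__ == "__main__":
    # Hypothetical candidates; adjust to your runner.
    candidates = ["/", "/tmp", "/var/lib/docker", "/docker"]
    for path, free_gib in free_space_report(candidates).items():
        if free_gib is None:
            print(f"{path}: not available")
        else:
            print(f"{path}: {free_gib} GiB free")
```

Comparing the numbers right before and right after the failing step should show which mount point loses space during the conversion.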

#

For completeness, after the conversion, I would do

# This part is not reached, but I added it for completeness
await (
    dag.container()
    .from_("quay.io/singularity/docker2singularity")
    .with_file("container.sif", sif_file)
    .with_secret_variable(name="SINGULARITY_DOCKER_USERNAME", secret=username)
    .with_secret_variable(name="SINGULARITY_DOCKER_PASSWORD", secret=password)
    .with_exec(
        [
            "singularity",
            "push",
            "container.sif",
            f"{uri}",
        ]
    )
    .stdout()
)
outer trout
#

@flint sluice assuming you use the default docker provisioner for the engine (it's optional) then the engine data is all in a standard docker volume, mounted into the engine container (I forget the exact mount point)

flint sluice
#

Thank you @outer trout ! I ran some additional tests and it seems that in fact the disk quota error happens on the partition where docker data is kept. The reason is that the conversion of Docker to Singularity is actually taking up a lot of space. I'll keep investigating

outer trout
#

if the issue is accumulation of cache data, you could mount a cache volume in the container so that intermediate files are not added to the cache

#

but it's surprising that cache data would cause this, because dagger will normally handle garbage collection to avoid filling the disk

flint sluice
#

I think that the issue is caused by the singularity build command that tries to create too many files inside the container

outer trout
#

depending on how you setup your disk mounts, it might be worth trying a cache volume. under the hood that sets up a bind mount into the container, so the files written to that directory won't go into a copy-on-write layer.
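A minimal sketch of the cache-volume suggestion with the Python SDK, assuming `dag` is the same Dagger client used in the snippets above (it is passed in as a parameter so the function does not depend on how the module imports it). The function name `build_sif`, the volume name `singularity-tmp` and the `/scratch` mount point are illustrative assumptions. `SINGULARITY_TMPDIR` is pointed at the mount so that Singularity's intermediate files should land in the volume rather than in the container's copy-on-write layer:

```python
def build_sif(dag, source):
    """Convert `source` (a Dagger container) to a SIF file, keeping
    Singularity's scratch directory on a cache volume.

    The function name, volume name and mount point are hypothetical.
    """
    scratch = dag.cache_volume("singularity-tmp")
    return (
        dag.container()
        .from_("quay.io/singularity/docker2singularity")
        # Files written under /scratch go to the cache volume, not to a layer.
        .with_mounted_cache("/scratch", scratch)
        .with_env_variable("SINGULARITY_TMPDIR", "/scratch")
        .with_file("img.tar", source.as_tarball())
        .with_exec(["singularity", "build", "img.sif", "oci-archive://img.tar"])
        .file("img.sif")
    )
```

Note that a cache volume persists between runs, so this moves the intermediate files out of the layer cache but does not by itself reduce the peak disk usage of the conversion.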

flint sluice
#

Thank you @outer trout and @nocturne parrot !
It seems that the issue is caused by the large amount of storage needed by intermediate files created by Singularity during the conversion ["singularity", "build", "img.sif", "oci-archive://img.tar"] and I could not find ways to address this with singularity. It is not a Dagger issue.

However, I am wondering whether I could save up some storage somewhere else.

For instance, I could skip converting the Dagger container to a tarball before passing it to singularity build, and instead try singularity pull img.sif docker://my_dagger_container... Is there a way to pull an existing Dagger container from inside another container, as if it were hosted on a remote registry? With the command below I can reach the Docker daemon, but I cannot see the containers currently available in Dagger (I'm not an expert in this):

dagger core \
  container \
  from --address "quay.io/singularity/docker2singularity" \
  with-unix-socket --path /var/run/docker.sock --source /var/run/docker.sock \
  terminal

Instead, I would like to access existing Dagger containers from within another Dagger container as if they were on some Docker registry (idk if this makes sense)

nocturne parrot
# flint sluice Thank you @outer trout and @nocturne parrot ! It seems that the i...

no, you can't access the containers directly because Dagger and Docker store images in different locations and in different formats. If singularity accepts an OCI registry as an argument, you can spin up a registry within your dagger pipeline and push the image there. Here's an example of how you could achieve that: https://github.com/dagger/dagger/issues/6411#issuecomment-2354072245

It still requires you to call the AsTarball function, but you can store the tarball in a WithMountedTemp path so it gets deleted after the step finishes

#

let me know if that makes sense 🙏

flint sluice
# nocturne parrot no, you can't access the containers directly because both Dagger and Docker stor...

Interesting... I may actually try another workaround.

I am not very familiar with how Dagger caching works, but AFAIU you suggest mounting the container tarball in a temp location. Something like this?

my_container = dag.container().from_("busybox")
# Adapted from github.com/shykes/x/singularity
sif_file = (
    dag.container()
    .from_("quay.io/singularity/docker2singularity")
    .with_mounted_temp("/tmp")
    .with_env_variable("SINGULARITY_TMPDIR", "/tmp")
    .with_env_variable("SINGULARITY_CACHEDIR", "/tmp")
    .with_file("/tmp/img.tar", my_container.as_tarball())
    .with_exec(["singularity", "build", "img.sif", "oci-archive://img.tar"])
    .file("img.sif") 
)

This way all the files created by Singularity under /tmp and the Docker tarball will not be replicated (would they?) into Dagger's cache. However, as suggested by @outer trout , I guess that Dagger's GC should automatically try to avoid issues with disk quota...

Wouldn't serving a registry in Dagger introduce some duplication of the container images anyway?

nocturne parrot
#

@flint sluice are you using Docker for Mac by any chance?

#

oh, I see.. so we're maxing out the GHA runner disk space?

#

interesting..

flint sluice
# nocturne parrot interesting..

exactly. But the reason is that I am basing my container on nvcr.io/nvidia/pytorch:24.05-py3 (which is quite large) and converting the result to singularity... It's not a Dagger issue. I was just wondering whether I was doing things right

nocturne parrot
#

well.. maybe a workable approach could be to split the pipeline into two jobs? One that builds the image and pushes it to the GHCR registry, and a second one that runs the singularity build command against that GHCR endpoint

#

that seems the most reasonable thing to do IMO

#

and since everything is within the GH network, it should still be reasonably fast

#

you can pass the GHCR URL from one job to the other using GHA's inputs / outputs job thingy
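A rough sketch of that hand-off, assuming the first job has already pushed the image and holds its reference as a string. Step outputs in GitHub Actions are set by appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable. The helper names, the output name `image_ref` and the example reference are all hypothetical:

```python
import os


def write_step_output(name, value, output_file=None):
    """Append `name=value` to the GitHub Actions step-output file."""
    output_file = output_file or os.environ["GITHUB_OUTPUT"]
    with open(output_file, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


def singularity_build_command(image_ref, sif_name="img.sif"):
    """Build the argv that converts a registry image into a SIF file."""
    prefix = "docker://"
    # Accept references with or without the docker:// scheme.
    if image_ref.startswith(prefix):
        image_ref = image_ref[len(prefix):]
    return ["singularity", "build", sif_name, prefix + image_ref]


if __name__ == "__main__" and "GITHUB_OUTPUT" in os.environ:
    # First job: record the pushed reference (hypothetical value).
    write_step_output("image_ref", "ghcr.io/example-org/example-image:latest")
```

The step output still has to be declared as a job output in the workflow file, so that the second job can read it through `needs.<job>.outputs.image_ref` and pass it to `singularity_build_command`. For a private GHCR image, the second job would also need the `SINGULARITY_DOCKER_USERNAME` and `SINGULARITY_DOCKER_PASSWORD` variables used earlier in this thread.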

flint sluice
#

Yes, I will try some workaround like that. My main goal here was to understand whether I was misusing Dagger or missing some features that could have helped me
