#Manually run container based on a failing Dagger with_exec?


limber dagger
#

This question is only half related to Dagger; most of it probably has more to do with Buildkit / Docker. I wasn't sure how to phrase it better. Here's the case...

I've written a pipeline in Dagger, and everything is working as expected. One of the steps in my pipeline uses with_exec to run a test suite. The test suite reports a handful of failures, and I need to find out what exactly causes them.

Ideally, to shorten the feedback loop, I want to enter a container at the failing step of the pipeline: look around the file system, try running the tests manually, add printf statements to the test code, etc. Something in the surrounding container environment makes the tests fail, and I just need to understand what it is.

Once I know why the tests are failing, I would go back and update the pipeline (set ENV, add missing permissions, files, etc.) based on the findings.

Because Dagger uses buildkit, all pipeline "steps" are cached, i.e. buildkit stores the cache somewhere on disk. So, I assume, there must be a way to "manually" restore the cache and make (?) an image out of it. That image could then be entered with docker run -it image-restored-from-cache bash to do manual forensics.

P.S. I initially thought that the cache was analogous to Docker's "intermediate" layers. However, running docker images -a does not reveal the cache I am looking for, presumably because the buildkit cache and intermediate Docker images are two completely different things.

Any ideas on how to make an image out of a failing Dagger step? 🤔


limber dagger
#

One way I think I could work around this is to:

  • temporarily replace my with_exec with a publish step,
  • publish an image,
  • run a container from the image (and separately start a database service, because my tests need a DB),
  • while in the container, figure out why the tests are failing,
  • improve my Dagger pipeline based on the findings.

But for some reason I am failing to publish the image to a local registry 🤔 E.g. this fails:

Dagger.Container.publish("localhost:5000/testuser/testrepo:latest")

(btw I am using the Elixir SDK)

limber dagger
#

The steps I took:

  • replaced the with_exec that runs the tests with a step that saves the container as a .tar file, e.g. Dagger.Container.export(container, "/tmp/debugging.tar"),
  • restored the .tar file as an image: docker load --input /tmp/debugging.tar,
  • created a temporary docker network: docker network create eugene-is-testing,
  • ran a PostgreSQL service in that network: docker run --hostname postgres --network eugene-is-testing -e POSTGRES_PASSWORD=postgres -d postgres:14.5-alpine,
  • finally, entered the image for debugging: docker run --network eugene-is-testing -it 7fef67df4890 bash
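
For anyone repeating this, the steps above can be collected into a small helper script. This is only a sketch, assuming the tarball was already written by the export step; the DRY_RUN switch, the function names and the prepare/enter subcommands are made up for illustration, and by default the script only prints the docker commands instead of running them.

```shell
#!/bin/sh
# Sketch of the manual debugging steps: load the exported tarball, start a
# PostgreSQL service on a temporary network, then enter the loaded image.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them.
set -eu

DRY_RUN="${DRY_RUN:-1}"
TARBALL="${TARBALL:-/tmp/debugging.tar}"   # path given to Dagger.Container.export
NETWORK="${NETWORK:-eugene-is-testing}"    # temporary docker network
PG_IMAGE="${PG_IMAGE:-postgres:14.5-alpine}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Load the image and start the database the tests depend on.
prepare() {
  run docker load --input "$TARBALL"
  run docker network create "$NETWORK"
  run docker run --hostname postgres --network "$NETWORK" \
    -e POSTGRES_PASSWORD=postgres -d "$PG_IMAGE"
}

# Open a shell in the loaded image; $1 is the ID or tag printed by docker load.
enter() {
  run docker run --network "$NETWORK" -it "$1" bash
}

main() {
  case "${1:-}" in
    prepare) prepare ;;
    enter)   enter "${2:?usage: enter <image-id>}" ;;
    *)       echo "usage: $0 prepare | enter <image-id>" >&2 ;;
  esac
}

main "$@"
```

Saved under a hypothetical name such as debug.sh, it would be used as DRY_RUN=0 sh debug.sh prepare, followed by DRY_RUN=0 sh debug.sh enter with the image ID that docker load printed. The two phases are kept separate because the image ID is only known after the load.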
#

I only wish the process of getting inside the container for debugging were shorter. Maybe there could be a way to ask dagger/buildkit to "hang" and let the developer enter the "hanging" step's container? 🤔

Nevertheless, it's nice to know there's an escape hatch for now: saving and restoring the image manually 👍

waxen marten
#

In the meantime, for simple, non-private stuff, you can use ttl.sh for very short-lived images in the cloud, or some other registry if you have one you can use. This way you don't have to export and load a tar file. Soon we'll have networking between the host and containers; then you can push to your localhost instead (if you don't have or can't use a publicly accessible registry).
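
A rough sketch of what the ttl.sh route could look like from the shell side. It assumes ttl.sh reads the image lifetime from the tag (for example 1h) and that the publish call takes the container and the address, in the same shape as the export call earlier in the thread; the ttl_ref helper and the NAME variable are made up for illustration. The script only prints the address and the follow-up command.

```shell
#!/bin/sh
# Build a short-lived ttl.sh image reference to publish the failing step to.
# Assumption: ttl.sh takes the time to live from the tag, e.g. ":1h".
set -eu

# $1: image name, $2: lifetime tag (defaults to 1h)
ttl_ref() {
  printf 'ttl.sh/%s:%s\n' "$1" "${2:-1h}"
}

# ttl.sh is public, so use a name that is hard to guess (hypothetical scheme).
NAME="${NAME:-dagger-debug-$(date +%s)-$$}"
REF="$(ttl_ref "$NAME" 1h)"

# 1. In the pipeline, publish instead of running the tests, e.g.
#    Dagger.Container.publish(container, "<REF>")
echo "publish to: $REF"
# 2. On the host, pull and enter the image before it expires.
echo "then run:   docker run --rm -it $REF bash"
```

If the tests need the database, the printed docker run command would still need the --network flag pointing at the network the PostgreSQL container is attached to.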

limber dagger
#

@waxen marten thanks! Good to know better debugging is on the radar 👍