#DeadlineExceeded when pulling from company artifactory

regal quiver
#

Hi!
First of all thanks for all your work, absolutely love Dagger (although I only have basic knowledge of CI).
I'm trying to convert our corporate project pipeline, which mainly consists of building and running C# projects and running Python integration tests, to Dagger.
Everything works perfectly when run locally, but I'm running into an issue when triggering the pipeline in GitLab CI.
I set up the following simplified pipeline for debugging purposes:

import os
import sys

import anyio

import dagger
import logging
import httpx


logging.basicConfig(
    format="%(levelname)s [%(asctime)s] %(name)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=logging.DEBUG,
)


async def test():
    async with dagger.Connection(
        dagger.Config(log_output=sys.stderr)
    ) as client:

        # Registry password, handed to the engine as a secret
        secret_password = client.set_secret("password", os.getenv("SECRET_FOO"))

        python = (
            client.container()
            .with_registry_auth(
                os.getenv("REGISTRY_URL"), os.getenv("REGISTRY_USER"), secret_password
            )
            # Image reference: registry host and path, without a URL scheme
            .from_(
                "artifactory.COMPANY.com/foo/bar/foobar:3.0"
            )
            .with_exec(["python", "-V"])
        )

        # execute
        version = await python.stdout()

    print(f"Hello from Dagger and {version}")


if __name__ == "__main__":
    anyio.run(test)

which leads to the following error:

dagger.exceptions.QueryError: DeadlineExceeded: failed to do request: Head "https://artifactory.COMPANY.com/v2/foo/bar/foobar/manifests/3.0": dial tcp 80.72.131.83:443: i/o timeout

Proxy is properly set via HTTP_PROXY Envs, and httpx logs lead me to believe that it works as expected.
I'm not even sure I need the .with_registry_auth() call.
I can pull the desired image just fine before running the dagger pipeline.
I know about the open issues on GitHub regarding private registries, could it be that this is the issue I'm also running into?

I'm happy to provide more information as necessary!
Thanks!

regal quiver
#

Maybe of relevance: Trying an image available over a public registry such as "python:3.11-alpine" leads to the same issue.

sand phoenix
regal quiver
#

Sure! Thanks for caring!

regal quiver
# sand phoenix Hey. Can you give a repro with the public registry? Does it happen without `.wit...
  • Same behavior with and without with_registry_auth()
  • This only happens in GitLab CI. The local pipeline works flawlessly, including pulling images from the artifactory.

Repro pipeline:

import os
import sys
import anyio
import dagger

async def test():
    async with dagger.Connection(
        dagger.Config(log_output=sys.stderr, timeout=100)
    ) as client:
        python = (
            client.container().from_("python:3.11-alpine").with_exec(["python", "-V"])
        )

        version = await python.stdout()

    print(f"Hello from Dagger and {version}")


if __name__ == "__main__":
    anyio.run(test)

This pipeline runs fine locally.
Running the pipeline in GitLab leads to dagger.exceptions.QueryError: DeadlineExceeded: failed to do request: Head "https://registry-1.docker.io/v2/library/python/manifests/3.11-alpine": dial tcp 34.205.13.154:443: i/o timeout

#

I'm afraid I need to modify the .gitlab-ci.yml to remove company related data...
The image has docker-cli already installed.
GitLab setup uses /var/run/docker.sock instead of TLS, which is why I removed the variables suggested in the docs for GitLab CI.

.docker:
  image:
    name: "<internal image we use to run python tests>"
    entrypoint: [""]
  services:
    - docker:${DOCKER_VERSION}-dind
  variables:
    DOCKER_VERSION: "20.10.16"
    http_proxy: "<corporate proxy>"
    https_proxy: "<corporate proxy>"
    HTTP_PROXY: "<corporate proxy>"
    HTTPS_PROXY: "<corporate proxy>"
    no_proxy: "<...>"
    NO_PROXY: "<...>"
    SSL_CERT_FILE: $REQUESTS_CA_BUNDLE

build:
  extends: [.docker]
  script:
    - cd test_suite
    - pip3 install -i $ARTIFACTORY_PYPI_SERVER --extra-index-url $ARTIFACTORY_PYPI_SERVER_EXTRA --upgrade -r requirements.txt
    - export SSL_CERT_FILE=$REQUESTS_CA_BUNDLE
    - python test_public.py
#

Let me know if I can provide any additional information!

sand phoenix
# regal quiver I'm afraid I need to modify the .gitlab-ci.yml to remove company related data......

Ah that's what I thought. So, for the registry, the dagger engine needs those certificates. The SSL_CERT_FILE won't be used by the Python SDK. Even though you're trying to use a public image, you're still configuring your CI environment to go through your corporate proxy, which requires those certificates, I assume. Let me check for a moment what we have in the documentation today for setting that up in a custom runner.
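
To see which of these settings exist only on the client side, a check like the one below could be run in the CI job before the pipeline starts. This is a rough sketch and not part of the Dagger SDK: the helper name `client_side_settings` is made up for illustration, and it only reports what the Python process sees. It says nothing about what the engine container has.

```python
import os

# Variables taken from the .gitlab-ci.yml above. They are set for the
# job's Python process; according to this thread, SSL_CERT_FILE is not
# used by the SDK when the engine talks to a registry.
WATCHED = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY", "SSL_CERT_FILE", "REQUESTS_CA_BUNDLE")


def client_side_settings(env=None):
    """Return the watched variables that are set (non-empty) in `env`.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    found = {}
    for name in WATCHED:
        # Proxy variables are commonly given in upper or lower case.
        value = env.get(name) or env.get(name.lower())
        if value:
            found[name] = value
    return found


if __name__ == "__main__":
    for name, value in client_side_settings().items():
        print(f"{name}={value}")
```

If a variable shows up here but registry pulls still time out, that matches the explanation above: the setting reached the client, not the engine.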

#

It's not in a guide yet because the design on this isn't finalized. It'll become much easier to configure at some point.

regal quiver
#

Ohh, that would make sense!
So a way to fix this would be by using a modified version of the engine with our custom certificates in /etc/ssl/certs ?

#

This is very helpful, much appreciated!

sand phoenix
#

I believe @hot granite has already given lots of advice on this issue. I've seen a lot of users asking for it, but I've never tried it myself.

#

But you can see those conversations here on discord, I'll help find some.

regal quiver
sand phoenix
#

Also pinging @shadow tiger.

regal quiver
#

Yeah right, that's the one! I will try to implement your suggestion and will report back! Thank you so much!

sand phoenix
regal quiver
#

Maybe I'm misunderstanding the documentation, but using FROM registry.dagger.io/engine:0.6.2 as base image should work, right?

sand phoenix
#

Yes, are you getting an error?

#

You should be able to run that image directly but just change how it's run by adding a volume for your certs.

regal quiver
#

Yes!

manifest unknown
sand phoenix
#

Ah, it's v0.6.2

regal quiver
#

ohh damn. Thanks 😁

#

Fixed it! 😄

sand phoenix
#

Awesome 🙂

shadow tiger
#

@regal quiver congrats on getting that working! Let me know if you're interested in checking out your pipelines in Dagger Cloud as well. We're working with some early adopters who need the visibility and caching services 🙂

regal quiver
#

Oh, sorry for the confusion, I was referencing the manifest unknown issue! 🙂
I think I'm not quite there yet, but I think I'm close..

Do I need to modify my .gitlab-ci.yml when using _EXPERIMENTAL_DAGGER_RUNNER_HOST ?

regal quiver
#

So this is what my script looks like right now in .gitlab-ci.yml

- docker run --rm --privileged --name "custom_dagger_engine" -d -v dagger-engine:/var/lib/dagger artifactory.company.com/foobar/dagger/engine:latest
- export SSL_CERT_FILE=$REQUESTS_CA_BUNDLE
- export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://custom_dagger_engine
- python test.py

This doesn't work though, I'm hitting a SessionError...

Error: new client: buildkit failed to respond: failed to list workers: Unavailable: connection error: desc = "error reading server preface: command [docker exec -i custom_dagger_engine buildctl dial-stdio] has exited with exit status 1, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=Error: No such container: custom_dagger_engine"
Traceback (most recent call last):
[...]
dagger.exceptions.SessionError: Failed to start Dagger engine session: Command '/root/.cache/dagger/dagger-0.6.2 session --label dagger.io/sdk.name:python --label dagger.io/sdk.version:0.6.2 --label dagger.io/sdk.async:true' returned non-zero exit status 1.

Something obvious I'm doing wrong?

sand phoenix
#

It's saying it can't find that container, so I'd check if it's actually running and not exited with an error.
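
One way to make that check explicit before the pipeline starts is sketched below. The function names are hypothetical; the only Docker feature assumed is `docker ps --format '{{.Names}}'`, which prints one name per running container.

```python
import subprocess


def running_names(ps_output):
    """Parse `docker ps --format '{{.Names}}'` output into a set of names."""
    return {line.strip() for line in ps_output.splitlines() if line.strip()}


def is_engine_running(name, ps_output=None):
    """Return True if a running container is named exactly `name`.

    When `ps_output` is None, `docker ps` is invoked. It lists running
    containers only, so an engine that has exited is reported as absent.
    """
    if ps_output is None:
        ps_output = subprocess.run(
            ["docker", "ps", "--format", "{{.Names}}"],
            check=True,
            capture_output=True,
            text=True,
        ).stdout
    return name in running_names(ps_output)


if __name__ == "__main__":
    if not is_engine_running("custom_dagger_engine"):
        raise SystemExit("custom_dagger_engine is not running")
```

Matching on the exact name avoids a false positive from a container whose name merely contains `custom_dagger_engine`.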

#

Also, you need to mount your certs into that container, as I said, the SSL_CERT_FILE env won't be picked up by the SDK or the CLI.

#

Maybe you've copied them inside your custom image, if so, ignore what I said.

regal quiver
#

Yes, that's exactly what I did, should have mentioned that! I simply forgot to remove the line, sorry!

sand phoenix
#

Are you trying to run that locally?

#

With the custom runner

regal quiver
#

Does it make a difference whether I'm running the image with -d or not?

sand phoenix
#

You should need -d

regal quiver
#

No, that's in GitLab

sand phoenix
#

You can test locally, it's a faster feedback loop until you make it work.

#

You can use docker:dind to make sure you're not getting anything from your host to muddy the waters

regal quiver
sand phoenix
#

In https://hub.docker.com/_/docker look for the section "Start a daemon instance". It's basically dind = docker in docker. It allows you to run docker containers from within a container, this way your "host" environment is clean which is closer to what's happening in your gitlab CI. Notice the docker image you're using there.

#

I didn't mean that section, you can just mount your host docker socket into a docker:cli container. That way you have the clean "host" environment but you're reusing your docker daemon to run the containers.

#

If you're not experienced in this sort of stuff, then just do what you know.

#

In your CI YAML, after spinning up the runner, try using docker ps and check the logs to see if it's running correctly by the time you run python.

regal quiver
#

Thanks a ton, will report back!

regal quiver
#

Container is up and running and has the correct name by the time python starts 😞.

shadow tiger
#

@regal quiver want to chat briefly?

#

you can jump into dev-audio

#

sometimes easier to just look at your screen 🙂

#

or show you mine

#

To Helder's point above, you can start a clean docker-in-docker environment like this

docker run -d --rm --privileged docker:24-dind

which will emit a container id: e.g. ef0527ed9f80dff8aa2e2f5a819a8109a2445e50343c85e40fc8e553a4c71498

Then I just executed a shell in there and started using it

docker exec -it ef0527ed9f80dff8aa2e2f5a819a8109a2445e50343c85e40fc8e553a4c71498 sh
#

in that shell I can run a dagger engine or any other container

#

I'm back in dev-audio btw 😁

regal quiver
#

Sure! Give me a second

#

Very generous offer

regal quiver
#

Does not quite do the trick yet, but I think I need to tweak some things in .gitlab-ci.yml. We did not use dind, since we were using docker sockets, right @shadow tiger ?

This will have to wait until tomorrow though, it's late for me. Thanks again!

regal quiver
#
#3 from python:3.11-alpine
Hello from Dagger and Python 3.11.4

It worked! 🥳 🥳
I have never been more happy to see those lines. 😁
I removed any traces of dind in .gitlab-ci.yml, mounted the socket and used a regular python image, and it just worked.
Thank you so much @shadow tiger @sand phoenix!

regal quiver
#

So just a quick recap in case anyone else is facing a similar problem in the future:

  • I needed a custom certificate, which is why I took the registry.dagger.io/engine:v0.6.2 image as a base, modified it to my needs and deployed it to our internal artifactory.
  • Since we're using docker.sock , I used a regular python image in .gitlab-ci.yml, not using docker-in-docker at all.
  • Before running the python dagger pipeline in CI, I'm running docker run --rm --privileged --name "custom_dagger_engine" -d -v /var/run/docker.sock:/var/run/docker.sock -v dagger-engine:/var/lib/dagger artifactory.company.com/foobar/custom_dagger_engine and export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://custom_dagger_engine to point to my custom dagger engine
  • Running the python pipeline is now successful
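
The engine start-up step from this recap can be sketched in Python as follows. The function names are hypothetical, and the image name is the placeholder used in this thread. The socket mount from the recap is left out here, since a later reply in this thread reports that it is not needed.

```python
import shlex


def engine_run_command(name, image, volume="dagger-engine"):
    """Build the `docker run` argv that starts a detached custom engine."""
    return [
        "docker", "run", "--rm", "--privileged",
        "--name", name,
        "-d",
        # Named volume that holds the engine state between runs
        "-v", f"{volume}:/var/lib/dagger",
        image,
    ]


def runner_host(name):
    """Value for _EXPERIMENTAL_DAGGER_RUNNER_HOST that targets a container."""
    return f"docker-container://{name}"


if __name__ == "__main__":
    cmd = engine_run_command(
        "custom_dagger_engine",
        "artifactory.company.com/foobar/custom_dagger_engine",
    )
    # Print the shell equivalents instead of executing anything
    print(shlex.join(cmd))
    print(f"export _EXPERIMENTAL_DAGGER_RUNNER_HOST={runner_host('custom_dagger_engine')}")
```

Building the argv as a list keeps the container name in one place, so the `--name` value and the runner host cannot drift apart.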

hot granite
#

👋 Mounting /var/run/docker.sock shouldn't be necessary since the engine doesn't use docker

regal quiver
#

oh, good to know. Will try to run it without it then and report back!

regal quiver
#

Thanks for the heads up, works as expected without!

sand phoenix
regal quiver
#

Sure!

build:
  image:
    name: "<internal image we use to run python tests>"
    entrypoint: [""]
  variables:
    http_proxy: "<corporate proxy>"
    https_proxy: "<corporate proxy>"
    HTTP_PROXY: "<corporate proxy>"
    HTTPS_PROXY: "<corporate proxy>"
    no_proxy: "<...>"
    NO_PROXY: "<...>"
  script:
    - docker login $ARTIFACTORY_URL --username $ARTIFACTORY_USER --password $ARTIFACTORY_PASSWORD
    - pip3 install -i $ARTIFACTORY_PYPI_SERVER --extra-index-url $ARTIFACTORY_PYPI_SERVER_EXTRA --upgrade -r requirements.txt
    - DAGGER_ENGINE_CONTAINER_NAME=custom_dagger_engine
    # Start the engine only if no container with that name is already running
    - |
      if [[ $(docker ps -a --filter="name=$DAGGER_ENGINE_CONTAINER_NAME" --filter "status=running" | grep -w "$DAGGER_ENGINE_CONTAINER_NAME") ]]; then
        echo "Dagger engine already running, not starting ..."
      else
        echo "No dagger engine container running, starting ..."
        docker run --rm --privileged --name $DAGGER_ENGINE_CONTAINER_NAME -d -v dagger-engine:/var/lib/dagger artifactory.company.com/custom/dagger/engine
      fi
    - export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-container://$DAGGER_ENGINE_CONTAINER_NAME
    - python pipeline.py
#

The ugly if clause checks whether an engine container is still running, to avoid a name conflict in that case. For some reason, it is not getting killed after jobs.
But to be honest, I'm not even sure this is a bad thing, since caching is improved this way, I guess? Open to suggestions though.

hot granite
#

The ugly if clause checks whether an engine container is still running, to avoid a name conflict in that case. For some reason, it is not getting killed after jobs.

do you have stateful self-hosted runners? That would cause the engine container not to be stopped between runs. I also think there might be an easier way than to start the engine manually which is to use docker-image://artifactory.company.com/custom/dagger/engine instead of docker-container. Since you already have the certificates included in your image, you can just let dagger handle the engine lifecycle for you

#

so basically:

build:
  image:
    name: "<internal image we use to run python tests>"
    entrypoint: [""]
  variables:
    http_proxy: "<corporate proxy>"
    https_proxy: "<corporate proxy>"
    HTTP_PROXY: "<corporate proxy>"
    HTTPS_PROXY: "<corporate proxy>"
    no_proxy: "<...>"
    NO_PROXY: "<...>"
  script:
    - docker login $ARTIFACTORY_URL --username $ARTIFACTORY_USER --password $ARTIFACTORY_PASSWORD
    - pip3 install -i $ARTIFACTORY_PYPI_SERVER --extra-index-url $ARTIFACTORY_PYPI_SERVER_EXTRA --upgrade -r requirements.txt
    - export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-image://artifactory.company.com/custom/dagger/engine
    - python pipeline.py

^ that should give you the same behavior I think
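
For anyone who launches the pipeline from a wrapper script instead of the CI YAML, the same setting can be sketched in Python. The helper names and the wrapper itself are hypothetical; the variable name, the `docker-image://` scheme and the image name come from this thread. The sketch passes the variable to a child process, which is the same thing the `export` line does for the shell.

```python
import os
import subprocess
import sys

# Image name as used in this thread (placeholder for the internal registry)
ENGINE_IMAGE = "artifactory.company.com/custom/dagger/engine"


def pipeline_env(engine_image, base_env=None):
    """Return a copy of the environment with the runner host set to an image."""
    env = dict(os.environ if base_env is None else base_env)
    env["_EXPERIMENTAL_DAGGER_RUNNER_HOST"] = f"docker-image://{engine_image}"
    return env


def run_pipeline(script="pipeline.py"):
    """Run the pipeline script as a child process with the variable set."""
    return subprocess.run(
        [sys.executable, script],
        env=pipeline_env(ENGINE_IMAGE),
        check=True,
    )


if __name__ == "__main__":
    run_pipeline()
```

Copying the environment keeps the proxy variables from the job intact while adding only the runner host.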

shadow tiger
#

also guessing you can move export _EXPERIMENTAL_DAGGER_RUNNER_HOST=docker-image://artifactory.company.com/custom/dagger/engine to the variables: section, right?

...
variables:
  _EXPERIMENTAL_DAGGER_RUNNER_HOST: "docker-image://artifactory.company.com/custom/dagger/engine"
...
regal quiver
#

Not quite sure about the stateful part, but I will ask around whether this is the case.

Oh, interesting, I didn't realize it is possible to point directly to images (this is not in the docs https://github.com/dagger/dagger/blob/main/core/docs/d7yxc-operator_manual.md#configuration I think?).

#

Does letting dagger handle the engine lifecycle provide any benefits except a cleaner script?

shadow tiger
#

I'm guessing that a cleaner script is the main benefit.

#

I think the dagger engines are sticking around because an approach like this is at play

#

So the custom_dagger_engine container keeps running even after the job container that started it has finished.

regal quiver
#

Thanks, that's very helpful! Correct me if I'm wrong, but wouldn't this be a good thing in terms of caching?

shadow tiger
#

Yes, I'm guessing that if your custom_dagger_engine is staying around on the GitLab Runner Host, then it's using that docker volume for storage (-v dagger-engine:/var/lib/dagger) that will contain dagger engine cache and you should see CACHED in your log output for some steps.

#

here's my local dagger engine on my laptop

docker ps
CONTAINER ID   IMAGE                              COMMAND                  CREATED        STATUS        PORTS           NAMES
3b89b7a25fc9   registry.dagger.io/engine:v0.6.3   "dagger-entrypoint.s…"   2 hours ago    Up 2 hours                    dagger-engine-7bfecba5f007b4fa
#
docker inspect -f '{{ .Mounts }}' 3b89b7a25fc9
[{volume bd20e546eb43f5484345eb7b8f5f1981c8271f998da4ae65377c1f09df5947d3 /var/lib/docker/volumes/bd20e546eb43f5484345eb7b8f5f1981c8271f998da4ae65377c1f09df5947d3/_data /var/lib/dagger local  true }]
regal quiver
#

Thank you! Uptime is 3 hours, and inspect results in the following:

+ docker inspect -f '{{ .Mounts }}' e6c5f3e66a35
[{volume dagger-engine /var/lib/docker/volumes/dagger-engine/_data /var/lib/dagger local z true }]
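
The same check can be scripted if the mounts are requested as JSON with `docker inspect -f '{{ json .Mounts }}' <container>`. The sketch below is an illustration with a hypothetical helper name; it only parses that JSON and does not call Docker itself.

```python
import json


def cache_mount(mounts_json, destination="/var/lib/dagger"):
    """Return the mount entry backing `destination`, or None if absent.

    `mounts_json` is the text printed by
    `docker inspect -f '{{ json .Mounts }}' <container>`.
    """
    for mount in json.loads(mounts_json):
        if mount.get("Destination") == destination:
            return mount
    return None


if __name__ == "__main__":
    # Sample shaped like the inspect output above (named volume dagger-engine)
    sample = json.dumps([
        {
            "Type": "volume",
            "Name": "dagger-engine",
            "Source": "/var/lib/docker/volumes/dagger-engine/_data",
            "Destination": "/var/lib/dagger",
            "Driver": "local",
            "Mode": "z",
            "RW": True,
        }
    ])
    mount = cache_mount(sample)
    print(mount["Type"], mount["Name"])
```

A `Name` of `dagger-engine` corresponds to the `-v dagger-engine:/var/lib/dagger` flag from the job script, while the laptop example above shows a generated volume name instead.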