#almost able to start a github runner from dagger but need some help


small pendant
#

Hi folks

thanks to the amazing help from people here i am almost able to start many github self-hosted runners using dagger in a cross-platform way. I am using the go api so the plan is to start as many goroutines as possible. I really could not have achieved this without your help!

Now during execution of the workflow dagger-for-github somehow ends up looking for my module in the wrong location. i have a ./ci/e2e directory but during the execution it is not being found. i could use a pair of eyes to figure out what i am doing wrong.

Thank you!!!

#

the checkout step is putting the code somewhere different than where the dagger-for-github step is trying to find it .. i think

#

maybe i can hard code the path where the code will be cloned to by the previous step ... or maybe there is a better way ...
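
One alternative to hard-coding the clone path, sketched here under assumptions: GitHub Actions exports GITHUB_WORKSPACE as the default checkout directory, so the module directory can be derived from it. The helper name resolveModuleDir and the relative path ci/e2e are illustrative only, and whether the variable is visible inside a containerised runner depends on how the runner is started.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// resolveModuleDir (hypothetical helper) joins the checkout root with the
// module's relative path. It does not touch the filesystem.
func resolveModuleDir(workspace, rel string) (string, error) {
	if workspace == "" {
		return "", errors.New("workspace is empty; is GITHUB_WORKSPACE set?")
	}
	return filepath.Join(workspace, rel), nil
}

func main() {
	// GITHUB_WORKSPACE is set by GitHub Actions to the default checkout directory.
	dir, err := resolveModuleDir(os.Getenv("GITHUB_WORKSPACE"), "ci/e2e")
	if err != nil {
		fmt.Println("cannot resolve module dir:", err)
		return
	}
	// A Dagger module directory is expected to contain a dagger.json file.
	if _, err := os.Stat(filepath.Join(dir, "dagger.json")); err != nil {
		fmt.Println("no dagger.json under", dir, "->", err)
		return
	}
	fmt.Println("module found at", dir)
}
```

Running this as a workflow step just before the dagger step would show whether the checkout and the module lookup agree on the same directory.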

small pendant
#

ok so i finally figured out that even dagger core version is returning the same error.

#

this is dagger in dagger and i was hoping that passing these options when calling the runner entry point would do the trick

    _, err = runner.
        // Terminal().
        WithExec([]string{"sh", "-c", "./entrypoint.sh"},
            dagger.ContainerWithExecOpts{
                InsecureRootCapabilities:      true,
                ExperimentalPrivilegedNesting: true,
            }).
#

i think i know what's going on ... i probably need a go compiler when the ci/cd is running dagger

neat cloak
#

👀

#

catching up

neat cloak
#

Assuming your CI runner is doing dagger call, then you shouldn't need go installed in the CI runner

#

Do you have a Trace URL to share by any chance? Could help peel off the layers

small pendant
#

i don't have tracing in the runner setup so far .. this all works on my machine but not in ci

#
/home/runner/_work/_tool/go/1.22.6/x64/bin/go
/home/runner/_work/_temp/bin/dagger
/home/runner/_work/pf-rel/pf-rel
{
  "name": "runner-tools",
  "sdk": "go",
  "dependencies": [
    {
      "name": "apko",
      "source": "github.com/vito/daggerverse/apko@1eed54a12ffcd897efe2907d4dd0c015dc9f89d9"
    }
  ],
  "source": ".",
  "engineVersion": "v0.12.7"
}
1   : connect
1   : connect DONE [0.0s]

Error: dagger develop on a module without an SDK requires either --sdk or --source

here is the error

#
    - name: start dagger
      shell: bash
      run: |
        set -o pipefail
        which go && which dagger && pwd
        cd ./ci/e2e && cat dagger.json && dagger develop

relevant part of the workflow
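
The dagger.json printed above does declare "sdk": "go", yet dagger develop reports a module without an SDK. A small check like the following sketch can confirm what a process actually sees in its working directory. The struct covers only the fields visible in the printed file, and treating sdk as a plain string is an assumption that matches this v0.12.7 file; other versions may use a different shape.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// moduleConfig mirrors only the dagger.json fields shown in the log above.
type moduleConfig struct {
	Name          string `json:"name"`
	SDK           string `json:"sdk"`
	Source        string `json:"source"`
	EngineVersion string `json:"engineVersion"`
}

// checkModuleConfig (hypothetical helper) parses dagger.json content and
// fails when no SDK is declared.
func checkModuleConfig(raw []byte) (moduleConfig, error) {
	var cfg moduleConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return cfg, fmt.Errorf("parse dagger.json: %w", err)
	}
	if cfg.SDK == "" {
		return cfg, errors.New("dagger.json has no sdk field")
	}
	return cfg, nil
}

func main() {
	// Read the file from the current working directory, as dagger develop would.
	raw, err := os.ReadFile("dagger.json")
	if err != nil {
		fmt.Println("cannot read dagger.json:", err)
		return
	}
	cfg, err := checkModuleConfig(raw)
	if err != nil {
		fmt.Println("config problem:", err)
		return
	}
	fmt.Printf("module %q uses sdk %q with source %q\n", cfg.Name, cfg.SDK, cfg.Source)
}
```

If this check passes while dagger develop still fails, the file itself is probably fine and the problem is more likely in how the nested dagger session resolves the module.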

#

i also verified docker ps -a and dagger core version all work as expected

#

i could be missing some mount points that dagger is expecting not sure at this point

small pendant
#

i injected the DAGGER_CLOUD_TOKEN env secret using WithSecretVariable("DAGGER_CLOUD_TOKEN", it.DaggerToken) when starting the runner,
then verified i have the correct token just before i call dagger develop in the workflow file. still getting what i think is an auth error so i can't share traces:

23:36:03 ERR failed to emit telemetry error="traces export: failed to send to http://api.dagger.cloud/v1/traces: 405 Method Not Allowed"
23:36:03 ERR failed to emit telemetry error="traces export: failed to send to http://api.dagger.cloud/v1/traces: 405 Method Not Allowed"
Error: dagger develop on a module without an SDK requires either --sdk or --source
neat cloak
#

I recommend calling dagger login at the top-level of all this (your dev machine)

#

then as long as you wrap the whole thing in a top-level dagger call or dagger run, you will see the whole trace all the way down

#

which is what we need here anyway

#

(dagger in dagger in dagger...)

small pendant
#

i already have tracing working on my dev machine

neat cloak
#

ah! Awesome. Can you share a link to that? normally it should show the whole stack all the way down

small pendant
#

but that works, that's not what's failing .. i was trying to share the ci run that is failing ... lemme do it still

#

oh wait you mean the actual runner trace ? or the workflow execution trace ?

#

i think you mean the former

neat cloak
#

whatever is failing (I don't completely understand what), it should all be part of that one trace, right?

small pendant
#

starting the runner is not failing ..

neat cloak
#

ok but the runner executes a workflow, and that workflow fails right?

small pendant
#

startrunner --> start github workflow --> call dagger

#

startrunner is a dagger function and its working on my desk

neat cloak
#

is there an error anywhere in that chain?

small pendant
#

as i said it works on my desk ...

#

also i can't get traces to work in the ci ... i am getting an auth error like i said above

#

DAGGER_CLOUD_TOKEN is in the env but it still can't get past the auth error

limpid talon
#

@small pendant I think you might be hitting the same issue @earnest ice highlighted here: #maintainers message.

small pendant
#

oh cool lemme take a look .. i was getting desperate

limpid talon
#

confirmed, just tested this in a quick dagger program with the SDK and it also has the same issue:

package main

import (
    "context"
    "fmt"

    "dagger.io/dagger"
    "dagger.io/dagger/dag"
)

func main() {
    ctx := context.Background()
    defer dag.Close()

    // Install git, curl and the dagger CLI in an alpine container, then run
    // a nested "dagger call" through ExperimentalPrivilegedNesting.
    out, err := dag.Container().From("alpine").
        WithExec([]string{"apk", "add", "git", "curl"}).
        WithExec([]string{"sh", "-c", "curl -LO https://dl.dagger.io/dagger/install.sh && chmod +x install.sh"}).
        WithExec([]string{"sh", "-c", fmt.Sprintf("DAGGER_VERSION=%s BIN_DIR=/usr/local/bin ./install.sh", "v0.12.7")}).
        WithExec([]string{"sh", "-c", "git clone https://github.com/shykes/daggerverse && cd daggerverse && dagger call hello"}, dagger.ContainerWithExecOpts{ExperimentalPrivilegedNesting: true}).Stdout(ctx)

    if err != nil {
        panic(err)
    }

    fmt.Println(out)
}
small pendant
#

unfortunately the workaround is not going to work for me 😦 i can't predict which workdir will be used. the runner is meant to be used by many git repos 😦 and each one has its own e2e folder where dagger functions live

limpid talon
small pendant
#

indeed

#

i can pass anything i want to the dagger function

limpid talon
#

yes, ok this works:

package main

import (
    "context"
    "fmt"

    "dagger.io/dagger/dag"
)

func main() {
    ctx := context.Background()
    defer dag.Close()

    // Host docker socket, so the CLI in the container can reach the engine.
    dockerSock := dag.Host().UnixSocket("/var/run/docker.sock")

    // Same setup as before, but instead of nesting, point the dagger CLI
    // at the already-running engine container through the mounted socket.
    out, err := dag.Container().From("alpine").
        WithExec([]string{"apk", "add", "git", "curl", "docker-cli"}).
        WithExec([]string{"sh", "-c", "curl -LO https://dl.dagger.io/dagger/install.sh && chmod +x install.sh"}).
        WithExec([]string{"sh", "-c", fmt.Sprintf("DAGGER_VERSION=%s BIN_DIR=/usr/local/bin ./install.sh", "v0.12.7")}).
        WithUnixSocket("/var/run/docker.sock", dockerSock).
        WithEnvVariable("_EXPERIMENTAL_DAGGER_RUNNER_HOST", "docker-container://dagger-engine-1f73f772dd1ba563").
        WithExec([]string{"sh", "-c", "git clone https://github.com/shykes/daggerverse && cd daggerverse/hello && dagger call hello"}).Stdout(ctx)

    if err != nil {
        panic(err)
    }

    fmt.Println(out)
}
#

^ because I already have the docker socket where the engine is running, I can just use that socket to tell the Dagger CLI to create a new dagger session using the existing engine instead of using the nesting which is what's currently causing the issue.

cc @earnest ice this same stopgap should also work for our pocket-ci case instead of running an inner engine like we're doing today.

small pendant
#

oh ... now the hope is that engine stays up right ? if it dies for any reason this will have to be restarted

limpid talon
small pendant
#

i thought that was the container id at the end

limpid talon
#

oh wait, it doesn't.. because we're setting the container ID

small pendant
#

lemme try it looks good

#
❯ ]docker-ps
CONTAINER ID   IMAGE                               COMMAND                  CREATED        STATUS        PORTS     NAMES
64548d1ee182   registry.dagger.io/engine:v0.12.7   "dagger-entrypoint.s…"   30 hours ago   Up 30 hours             dagger-engine-1f73f772dd1ba563

this is on my machine
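
The value after docker-container:// in the earlier snippet matches the NAMES column of this output rather than the CONTAINER ID. If the name cannot be hard-coded, it could be looked up at run time. The sketch below assumes the engine container name always starts with dagger-engine-, as it does in the output above; findEngineContainer and runnerHost are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// findEngineContainer scans newline-separated container names and returns
// the first one that starts with "dagger-engine-" (an assumed naming scheme).
func findEngineContainer(names string) (string, bool) {
	for _, line := range strings.Split(names, "\n") {
		name := strings.TrimSpace(line)
		if strings.HasPrefix(name, "dagger-engine-") {
			return name, true
		}
	}
	return "", false
}

// runnerHost builds the value used for _EXPERIMENTAL_DAGGER_RUNNER_HOST.
func runnerHost(name string) string {
	return "docker-container://" + name
}

func main() {
	// List only the names of running containers.
	out, err := exec.Command("docker", "ps", "--format", "{{.Names}}").Output()
	if err != nil {
		fmt.Println("docker ps failed:", err)
		return
	}
	name, ok := findEngineContainer(string(out))
	if !ok {
		fmt.Println("no running dagger engine container found")
		return
	}
	fmt.Println("_EXPERIMENTAL_DAGGER_RUNNER_HOST=" + runnerHost(name))
}
```

A lookup like this also gives an early, readable failure when the engine is not running, instead of a late connection error.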

limpid talon
#

@small pendant you don't need to set the _EXPERIMENTAL_DAGGER_RUNNER_HOST. As long as the SDK and the dagger CLI inside the pipeline have the same version, it will work automatically

small pendant
#

that dagger-engine-1f73f772dd1ba563 looks like it's not the container id

limpid talon
#

if the docker socket is there

#

you can remove that _EXPERIMENTAL_DAGGER_RUNNER_HOST. As long as the SDK and the CLI have the same version
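
Since this relies on the SDK and the CLI being the same version, a quick comparison before starting the pipeline may save a debugging round. This is only a sketch: the sample CLI output line is an assumed shape for what the CLI prints, and the helper names are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionRe matches a plain semantic version with a leading "v", e.g. v0.12.7.
var versionRe = regexp.MustCompile(`v\d+\.\d+\.\d+`)

// extractVersion returns the first vX.Y.Z token in s, or "" if there is none.
func extractVersion(s string) string {
	return versionRe.FindString(s)
}

// sameVersion reports whether both strings contain the same vX.Y.Z token.
func sameVersion(a, b string) bool {
	va, vb := extractVersion(a), extractVersion(b)
	return va != "" && va == vb
}

func main() {
	// Assumed shape of the CLI's version output; adjust to what the CLI prints.
	cliOutput := "dagger v0.12.7 (registry.dagger.io/engine:v0.12.7) linux/amd64"
	// Version pinned for the SDK side, e.g. engineVersion from dagger.json.
	sdkVersion := "v0.12.7"
	fmt.Println("versions match:", sameVersion(cliOutput, sdkVersion))
}
```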

small pendant
#

ok lemme remove the ExperimentalPrivilegedNesting then

small pendant
#

dockerSock := dag.Host().UnixSocket("/var/run/docker.sock")
what is the equivalent in the modules api? i.e. turn a socket from a string into an object?

limpid talon
small pendant
#
Error: response from query: Post "http://:mem/query": command [docker exec -i dagger-engine-1f73f772dd1ba563 buildctl dial-stdio] has exited with exit status 137, make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=
00:19:50 WRN canceling... (press again to exit immediately)
failed to wait for command termination: exit status 2
make: *** [Makefile:53: runner-start] Error 1

unfortunately the build never completes, although on the github ui i see that it is complete. something is keeping the dagger engine that is started like this from completing. the only error i see is the above
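
A note on the exit status 137 in the error above: by the usual shell and Docker convention, a status above 128 means the process was terminated by signal (status - 128), so 137 corresponds to signal 9, which is SIGKILL on Linux. That only says the docker exec process was killed, not who killed it or why. A minimal sketch of the decoding, with a hypothetical helper name:

```go
package main

import "fmt"

// signalFromExitStatus applies the common "128 + signal number" convention.
// It returns the signal number and true when the status fits that pattern.
func signalFromExitStatus(status int) (int, bool) {
	if status > 128 && status <= 128+64 {
		return status - 128, true
	}
	return 0, false
}

func main() {
	// 137 comes from the docker exec error, 2 and 1 from the lines after it.
	for _, status := range []int{137, 2, 1} {
		if sig, ok := signalFromExitStatus(status); ok {
			fmt.Printf("exit status %d: likely terminated by signal %d\n", status, sig)
		} else {
			fmt.Printf("exit status %d: ordinary non-zero exit\n", status)
		}
	}
}
```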

limpid talon
small pendant
#

i have the issue whether i remove it or not .. something weird is happening .. lemme re-check and leave it in just in case

small pendant
#

never stops running even with that flag .. 😦

limpid talon
small pendant
# limpid talon do you see anything in the logs?
Error: response from query: Post "http://:mem/query": command [docker exec -i dagger-engine-1f73f772dd1ba563 buildctl dial-stdio] has exited with exit status 137, make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=

this is the only error i saw. i ended up removing dagger-for-github and it looks like it's working now. not sure what it was.
