Salut @solomon ! | Dagger | Page 1

cedar depot May 7, 2025, 5:20 PM

#

When run locally the logs are slightly different

┃ registry.gitlab.com/<path>/syncer2:syncer-v0.1.1@sha256:4bb984f5f8c4a6aa9806294cd674957ad9b2571923d4cfd338385f6bb1d16c4c
│ │ ✘ remotes.docker.resolver.HTTPRequest 1.0s
│ │ │ ✘ HTTP HEAD 1.0s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.3s
│ │ │ ✘ HTTP HEAD 0.3s
│ │ ✘ remotes.docker.resolver.HTTPRequest 1.0s
│ │ │ ✘ HTTP HEAD 1.0s
│ │ ✘ remotes.docker.resolver.HTTPRequest 1.0s
│ │ │ ✘ HTTP HEAD 1.0s
│ │ ✘ remotes.docker.resolver.HTTPRequest 1.0s
│ │ │ ✘ HTTP HEAD 1.0s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.2s
│ │ │ ✘ HTTP HEAD 0.2s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.2s
│ │ │ ✘ HTTP HEAD 0.2s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.1s
│ │ │ ✘ HTTP HEAD 0.1s
│ │ ✘ remotes.docker.resolver.HTTPRequest 0.2s
│ │ │ ✘ HTTP HEAD 0.2s

I vaguely remember reading somewhere that the HEAD showing up as an error is "normal", related to a wrong log level, and that at the end of the day the whole POST or PUT completes successfully.
Also, although I am running dagger with the same verbose mode (-v), no other request types (GET, POST, etc.) are shown, unlike when running via the K8s executor.

wintry root May 7, 2025, 5:22 PM

#

I was thinking of a dagger cloud url, it's a web view of the whole trace where you can drill down in more detail

cedar depot May 7, 2025, 5:25 PM

#

Oh .. ok . I haven't been using the WebUI;
Let me create an account then and rerun that thing.

#

thx!

cedar depot May 7, 2025, 5:42 PM

#

All good. I can see the traces in Dagger Cloud . Not sure how I can share those though? 🤔

wintry root May 7, 2025, 5:44 PM

#

cedar depot All good. I can see the traces in Dagger Cloud . Not sure how I can share those ...

At the moment there are only 2 sharing options:

You can make all traces public for a given repository (I think @pearl cobalt ?) - probably not what you want
You can share a trace URL with a Dagger team member, if they have support/admin privileges we can look at it to help debug --> this is what you want

TLDR: you can safely just share a trace URL here with no special setting, and by default only Dagger team & your own org members can see it

cedar depot May 7, 2025, 5:47 PM

#

I'll take the TLDR; route then 😉

https://v3.dagger.cloud/clovrlabs/traces/dc2c8c8780f2a8592541e09d277fa34a

wintry root May 7, 2025, 5:48 PM

#

cedar depot I ll take the TLDR; route then 😉 https://v3.dagger.cloud/clovrlabs/traces/dc2...

Thanks, can you also share a trace of the same thing running successfully locally?

cedar depot May 7, 2025, 5:48 PM

#

just did 🙂

wintry root May 7, 2025, 5:49 PM

#

Sorry I only see one trace URL, it seems to be from a CI run and seems to fail

cedar depot May 7, 2025, 5:50 PM

#

https://v3.dagger.cloud/clovrlabs/traces/aacf9c21d56f2d3e38c91835e46af5bf?listen=853d384c074cae02&listen=af1716b36e201e1e

#

This one is the local one succeeding ☝️

wintry root May 7, 2025, 5:51 PM

#

Mmm, quick question: on your local machine, does your regular docker config have permissions to push that image?

#

(without using the token)

cedar depot May 7, 2025, 5:51 PM

#

For what it's worth,
the crossplane function running the same way in the k8s environment manages to push to gitlab using the same token as well.

cedar depot May 7, 2025, 5:52 PM

#

wintry root Mmm, quick question, on your local machine does your regular docker config have ...

Yes

#

I can logout and test

wintry root May 7, 2025, 5:52 PM

#

cedar depot I can logout and test

yes please, I have one possible theory, that would help test it

cedar depot May 7, 2025, 5:52 PM

#

testing now.

#

It worked: I could publish.

wintry root May 7, 2025, 5:53 PM

#

Ah ok. Then my theory is wrong 😦

#

I was observing that you don't give the same image address to withRegistryAuth and publish

#

So was thinking maybe the token doesn't actually get used - and it only works locally because you're logged in

cedar depot May 7, 2025, 5:54 PM

#

wintry root I was observing that you don't give the same image address to `withRegistryAuth`...

This is odd. It should be the same

wintry root May 7, 2025, 5:54 PM

#

withRegistryAuth: .../platform/crossplane/syncer2
publish: .../platform/crossplane/syncer2-debug:syncer-v0.1.1
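A minimal sketch of how such a mismatch could be spotted before publishing. It only compares the repository part of the two references; whether a mismatch actually matters depends on how Dagger matches credentials, and in this thread aligning the two addresses did not change the outcome. The helper names and the example addresses are hypothetical (the real paths are elided above).

```python
def repository_of(ref: str) -> str:
    """Return an image reference without its optional @digest and :tag."""
    ref = ref.split("@", 1)[0]       # drop a trailing digest, if any
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    if last_colon > last_slash:      # a colon after the last slash is a tag, not a port
        ref = ref[:last_colon]
    return ref


def same_repository(auth_address: str, publish_ref: str) -> bool:
    """True when both references point at the same registry/repository."""
    return repository_of(auth_address) == repository_of(publish_ref)


# Hypothetical addresses standing in for the elided ones in the thread.
auth = "registry.example.com/group/syncer2"
target = "registry.example.com/group/syncer2-debug:syncer-v0.1.1"
print(same_repository(auth, target))
```

Running a check like this on the values passed to withRegistryAuth and publish makes a "-debug" suffix left over from testing easy to notice.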

cedar depot May 7, 2025, 5:55 PM

#

yes very true.

#

I could change it to see ?

wintry root May 7, 2025, 5:56 PM

#

Worth a shot

#

Another possible root cause, of course, could be that env://GL_TOKEN just doesn't get the right token in the CI environment. But I assumed you already checked that.

cedar depot May 7, 2025, 5:56 PM

#

does not explain why this is working locally though ..

wintry root May 7, 2025, 5:56 PM

#

cedar depot does not explain why this is working locally though ..

Right.

cedar depot May 7, 2025, 5:57 PM

#

wintry root Another possible root cause, of course, could be that `env://GL_TOKEN` just does...

Yes I triple checked .. Even changed the token to make sure ..

#

Also, it worked when I changed only the CI part to use the docker executor.

#

Ok .. so I just tried locally with the same URL for both withRegistryAuth and publish .. and it still works.
Testing now via CI

#

OK . running again 'cause it failed for another reason.

#

😦 same thing I am afraid ...

#

There is one difference between the local and ci trace: the METHOD for a successful local run is "PUT" while the failed CI run is "POST"

#

🤔 If I run locally but change the gitlab URL to something I know is wrong, it fails with the exact same trace and messages.

#

(and I don't know what to do with this 🙂 yet )

pearl cobalt May 7, 2025, 6:18 PM

#

wintry root Another possible root cause, of course, could be that `env://GL_TOKEN` just does...

take into account that when running locally, the engine / cli also uses your ~/.docker/config.json setting as well. So if you're logged in via docker login it will just work automatically

#

@cedar depot I'd run docker logout locally to reproduce the auth issue

cedar depot May 7, 2025, 6:19 PM

#

@pearl cobalt I tried while logged out and it works locally

#

I am logged out @pearl cobalt

pearl cobalt May 7, 2025, 6:20 PM

#

cedar depot I am logged out @pearl cobalt

👍 can you confirm your ~/.docker/config.json doesn't have creds there?

cedar depot May 7, 2025, 6:20 PM

#

very much empty yes.

#

{
        "auths": {}
}
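For anyone repeating this check, here is a small sketch that lists where a Docker config could still be getting credentials from. It assumes the standard `auths`, `credsStore` and `credHelpers` keys of `~/.docker/config.json`; the function name is made up for illustration. An empty `auths` object alone is not proof of being logged out if a credential store or helper is configured.

```python
import json


def stored_credential_sources(config_text: str) -> list[str]:
    """List the places in a Docker config that may still provide credentials."""
    config = json.loads(config_text)
    sources = []
    for registry, entry in config.get("auths", {}).items():
        if entry:  # entries can be empty objects when a helper holds the secret
            sources.append(f"auths:{registry}")
    if config.get("credsStore"):
        sources.append(f"credsStore:{config['credsStore']}")
    for registry, helper in config.get("credHelpers", {}).items():
        sources.append(f"credHelpers:{registry}={helper}")
    return sources


# The config pasted above: nothing is stored anywhere.
print(stored_credential_sources('{"auths": {}}'))
```

An empty list means the local run cannot be falling back on `docker login` credentials, which is what the pasted config shows.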

#

Also the same code & token works using the docker executor via CI

#

So it feels to me that this is not really an auth issue.

#

Whatever the message keeps telling me 🙂

pearl cobalt May 7, 2025, 6:22 PM

#

cedar depot very much empty yes.

this seems to fail locally for you? https://v3.dagger.cloud/clovrlabs/traces/0162dbe4ca13b78958f8556b89fe31fc

#

looks like an auth issue

#

I understood it was still working locally? Sorry, I'm a bit confused

cedar depot May 7, 2025, 6:23 PM

#

No, the latest test was me changing the URL to see what kind of message I would get. And it turns out I get the same "not sufficient permissions" kind of message.

#

just running it now with the proper URL and being logged out

#

and it works.

pearl cobalt May 7, 2025, 6:24 PM

#

ok, what happens if you pass an incorrect token? Does it fail?

cedar depot May 7, 2025, 6:24 PM

#

yes

#

running it now with a bad token locally

pearl cobalt May 7, 2025, 6:25 PM

#

ok, I saw that. Have you validated that in CI the GL_TOKEN is correctly set?

cedar depot May 7, 2025, 6:26 PM

#

Oh yes

pearl cobalt May 7, 2025, 6:26 PM

#

can you share a snippet of your .gitlab-ci.yml file?

cedar depot May 7, 2025, 6:26 PM

#

sure

#

thx

pearl cobalt May 7, 2025, 6:26 PM

#

feel free to DM if you can't make it public

cedar depot May 7, 2025, 6:27 PM

#

I have actually debugged it all the way to the code and displayed the plaintext() version of the secret.

#

So I am pretty sure that the value is the same I am using locally.
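A possible alternative to printing the plaintext value in CI logs is to compare a short fingerprint of the token in both environments. This is only a sketch: the function names are hypothetical, and it assumes the token is available as a plain string (for example from the GL_TOKEN environment variable). The raw value is hashed on purpose, so a stray trailing newline in CI would show up as a different fingerprint.

```python
import hashlib


def fingerprint(secret: str, length: int = 12) -> str:
    """Short SHA-256 prefix of the raw secret, safe enough to paste in a log."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()[:length]


def has_surrounding_whitespace(secret: str) -> bool:
    """Detect leading/trailing whitespace, a common CI variable pitfall."""
    return secret != secret.strip()


# Hypothetical values: the same token, once with a trailing newline.
local_token = "glpat-example"
ci_token = "glpat-example\n"
print(fingerprint(local_token) == fingerprint(ci_token))
print(has_surrounding_whitespace(ci_token))
```

If the fingerprints match in both environments, the token value itself can be ruled out without ever exposing it.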

pearl cobalt May 7, 2025, 6:33 PM

#

cedar depot I have actually debug it all the way to the code and displayed the plaintext() v...

I'd assume you already tried that secret locally and it works?

cedar depot May 7, 2025, 6:34 PM

#

Well yes, I did a manual push

#

after a docker login using this very token.

#

So yeah, I feel as solid as I can be on the token side .. with such an obvious error message, I had to 1000x check this.

wintry root May 7, 2025, 6:53 PM

#

(sorry @cedar depot I am in a meeting... will come back to this after)

pearl cobalt May 7, 2025, 7:16 PM

#

had a quick call with Seb and something strange is definitely happening. Will try to repro using the same settings he's using in gitlab 🙏

#

we weren't able to make it work

cedar depot May 7, 2025, 7:29 PM

#

Thanks for your help both. Really appreciated 🤗

wintry root May 7, 2025, 7:38 PM

#

OK - sorry about that @cedar depot and thank you for your patience

pearl cobalt May 7, 2025, 8:23 PM

#

@cedar depot I wasn't able to repro using the public gitlab CI. Here's the code https://gitlab.com/marcosnils/dagger-ci-test as well as the gitlab CI job where this passes: https://gitlab.com/marcosnils/dagger-ci-test/-/jobs/9966685384

these are also the token permissions that I'm currently setting.

I'll try with the k8s executor tomorrow 🙏

pearl cobalt May 7, 2025, 9:29 PM

#

ok, I was also able to make it work with the local k8s executor here
let's continue checking tomorrow Seb

#

Seb, one thing that I'd like to try if you can tomorrow is to recycle your dagger engine nodes just to make sure there's no stale caching problem that might be happening here 🙏

cedar depot May 8, 2025, 7:35 AM

#

Thanks a lot @pearl cobalt for trying to reproduce the issue.
And I think you got it right. It was a cache issue. (On second thought, I don't think we really cleaned up the cache yesterday.)
This morning all I did is:

Go on each dagger engine pod
Delete everything under /var/lib/dagger
Restart the daemonset
Run the GitLab CI with 0 changes

-> 🎉 The job succeeded
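The steps above can be sketched as a small dry-run helper that only prints the kubectl commands rather than running them. The namespace, daemonset name and pod name are assumptions for illustration; only the /var/lib/dagger path comes from the thread. Note that the `*` glob does not match hidden files.

```python
import shlex


def cleanup_commands(
    pods: list[str],
    namespace: str = "dagger",          # assumed namespace
    daemonset: str = "dagger-engine",   # assumed daemonset name
) -> list[list[str]]:
    """Build the kubectl commands for: wipe engine state on each pod, then restart."""
    commands = []
    for pod in pods:
        # Destructive: removes the engine's local cache/state on that pod.
        commands.append(
            ["kubectl", "exec", "-n", namespace, pod, "--",
             "sh", "-c", "rm -rf /var/lib/dagger/*"]
        )
    commands.append(
        ["kubectl", "rollout", "restart", f"daemonset/{daemonset}", "-n", namespace]
    )
    return commands


# Hypothetical pod name; print the commands instead of executing them.
for cmd in cleanup_commands(["dagger-engine-abc12"]):
    print(shlex.join(cmd))
```

Keeping it as a dry run makes it easy to review the commands before wiping anything on a real cluster.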

#

Only one of the pods had any cache (8 GB). I will dig further to see whether it could have been a disk space issue (running Talos as the underlying OS).

#

I was wondering 🤔 Is there a way to clean up the cache in a k8s context other than starting a pod with the dagger CLI and running the prune command (on each node)?

#

Again, thanks a lot for the time spent investigating. Let me know if you need me to test things any further.

wintry root May 8, 2025, 7:54 AM

#

Amazing! Glad you were able to solve it. Good thinking @pearl cobalt 🙂

wintry root May 8, 2025, 8:00 AM

#

cedar depot I was wondering 🤔 Is there a way to clean up the cache in a k8s context other ...

I think that's probably the best way available... We are working on making Dagger work better in a cluster. Right now we are concentrating our efforts on two major blockers:

Finally decoupling cache storage from compute in the engine architecture
Stabilizing the interfaces for remote engines.

Removing those two blockers paves the way to a stateless, cluster-aware engine, and from there a serverless auto-scaling engine... the sky is the limit 🙂

cedar depot May 8, 2025, 8:10 AM

#

Using the HostPath does feel a little odd for the dagger-engine to me. Also, if we could avoid running the runners and engine with elevated privileges, that would be awesome. In Talos, the default pod policy enforced does not allow this, so a little more tweaking is needed to make dagger-engine work in that environment.
Looking forward to the coming changes!

wintry root May 8, 2025, 8:37 AM

#

Once we have a stateless engine with stable interfaces for remote access, we won't need the crutch of daemonset + hostpath 👍

Escalated privileges: this is a framing issue. Dagger is not a containerized app: it's a container runtime and orchestrator. It's only packaged as an OCI image for convenience, but architecturally it's a host service. You can use docker and kubernetes to provision it, but fundamentally they operate at the same layer. And in the future it might even replace them for specialized workloads (although that is not our goal)

From a security standpoint: approach securing dagger the way you approach securing docker or kubernetes. Mostly that means treating it like a host service and making the host machine the security boundary.

cedar depot May 8, 2025, 9:01 AM

#

🤔 something is definitely weird here. I just realized I had left in some debug stuff we did yesterday with @pearl cobalt .
I removed it, ran the CI again and got this error (unrelated to the part that was previously failing)

https://v3.dagger.cloud/clovrlabs/traces/62cd3f1abd2851c122ef7478fcbeaa7f?span=aca2bf1b4522bb0d

It is complaining that "semantic-release" is not available in PATH, even though it ran fine in the previous with_exec.
I will upgrade to the latest dagger version to see.

cedar depot May 8, 2025, 10:21 AM

#

FYI: I upgraded to 0.18.5 and so far all is fine. I did try to use the latest version, but the runners would fail with a dependency error.

#

This is the trace when using 0.18.6

https://v3.dagger.cloud/clovrlabs/traces/79fdf1c44661c4c8ecd5cef61b9587bb

pearl cobalt May 8, 2025, 2:44 PM

#

wintry root Amazing! Glad you were able to solve it. Good thinking @pearl cobalt 🙂

woot! having said that, some further investigation is needed here since wiping the cache shouldn't have been necessary for this to work. I'll try to repro it and open an issue. (sudo dagger agent do this for me in the background cc @silent depot )

pearl cobalt May 8, 2025, 3:15 PM

#

cedar depot This is the traces when using 0.18.6 https://v3.dagger.cloud/clovrlabs/traces/7...

@cedar depot it looks as if you forgot to run dagger develop, or you have a mismatch between your production's and engine's version.

#

also just to make sure, are you committing the python generated sdk folder into your repo? That might cause some issues as well

near needle May 8, 2025, 3:19 PM

#

cedar depot This is the traces when using 0.18.6 https://v3.dagger.cloud/clovrlabs/traces/7...

are both the engine and cli using v0.18.6?

#

looking at the code, i think i would expect this to happen when a v0.18.5 cli calls a v0.18.6 engine

#

the source of the error was added in https://github.com/dagger/dagger/pull/10118 (cc @pallid torrent)
i think we should have bumped this? https://github.com/dagger/dagger/blob/main/engine/version.go#L32-L34

cedar depot May 8, 2025, 3:46 PM

#

near needle are both the engine and cli using v0.18.6?

Very true they are not. I did not figure I had to upgrade the helm package for the engine but it totally makes sense. Will do now and report back. Thx!

near needle May 8, 2025, 3:51 PM

#

near needle the source of the error was added in https://github.com/dagger/dagger/pull/10118...

gonna bump this in https://github.com/dagger/dagger/pull/10363

GitHub

chore: bump minimum engine version to v0.18.6 by jedevc · Pull Req...

We now use the IncludeDependencies arg unconditionally in the CLI (see #10118) - but this arg was only introduced in v0.18.6, so we should error cleanly if there's a mismatch here.
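To illustrate the kind of check the PR description refers to, here is a minimal sketch of a minimum-engine-version gate. It is not Dagger's actual implementation (that lives in the Go file linked above); the function names are hypothetical and it only handles plain `vMAJOR.MINOR.PATCH` strings, not pre-release suffixes.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'v0.18.6' or '0.18.6' into a comparable (major, minor, patch) tuple."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def engine_is_compatible(engine_version: str, minimum_engine_version: str) -> bool:
    """True when the engine is at least the minimum version the CLI requires."""
    return parse_version(engine_version) >= parse_version(minimum_engine_version)


# Values from the thread: the engine stayed on 0.18.5 while the minimum became 0.18.6.
if not engine_is_compatible("v0.18.5", "v0.18.6"):
    print("engine too old for this CLI: upgrade the engine deployment")
```

Comparing integer tuples instead of raw strings avoids ordering mistakes such as "0.18.10" sorting before "0.18.6".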

cedar depot May 8, 2025, 3:54 PM

#

thx @near needle , bumping the version to 0.18.6 at the engine level did the trick. Might be good to add it somewhere in the documentation. When running locally the upgrade is done for you automatically, so it was not obvious (at least for me 🙂 )

wintry root May 8, 2025, 3:57 PM

#

this is one of the reasons the interfaces are not yet stabilized... cli/engine versioning matrix hell

pearl cobalt May 8, 2025, 4:45 PM

#

cedar depot thx @near needle , bumping the version to 0.18.6 at the engine level ma...

glad we finally got it working Seb. Justin's PR above should show a proper message when this happens. In this case it was a recent change where we forgot to bump the minimum engine version check 🙏

cedar depot May 8, 2025, 4:47 PM

#

pearl cobalt glad we finally got it working Seb. Justin's PR above should show a proper messa...

Glad I could be of help 🙂 Keep up the good work guys!

silent depot May 8, 2025, 5:28 PM

#

pearl cobalt woot! having said that, some further investigation is needed here since wiping t...

@crude quiver can you open up an issue on dagger/dagger for this?
