#kubernetes

proud oyster
#

This channel could get interesting, because there are three ways to use Dagger and Kubernetes together:

  1. Dagger on Kubernetes. This is the most common, since many CI runners run on Kubernetes these days, and in that case we recommend running the Dagger Engine as a "sidecar" to the CI runner, using a Kubernetes Daemonset.

  2. Kubernetes on Dagger. This is less common, but very cool. You can run an ephemeral kubernetes (or k3s) cluster for testing purposes, on Dagger itself.

  3. kubectl on Dagger. Regardless of where your Dagger engine is running, sometimes your pipeline may need to deploy to a remote kubernetes cluster.

This channel is a good place for discussing all of those things 🙂

half canyon
cerulean mountain
orchid blade
#

Testing Helm charts is the exact use-case I have right now! 💥

The Helm chart is in the same git repository as the application. We use Traefik as a reverse proxy so the application can provide different UIs according to the target audience. Helm templates and proxies are confusing enough to those familiar with them, now imagine the complexity it adds to front-end engineers who do not have prior experience with the topics. We want to enable them to change and add new routes, and the only way that will happen is if we simplify testing! I have picked Dagger for that, and I have been checking out how much I can use it to achieve the goal.

I need to build the app, send the image somewhere local (preferably), install the Helm chart in a Dagger-provided Kubernetes cluster, and run the Playwright tests against this deployment

ionic hare
#

Following @orchid blade thread above, we have a very similar use-case in our builds that I'll be working on next quarter so I'm quite curious to see what discoveries and pitfalls you run into

lavish summit
#

@proud oyster hi and thanks for all the work so far! I have a use case for a CI system that is limited in where it can be deployed and run. I'm looking into running Dagger as a Deployment + ReplicaSet exposed via a TCP port. The Dagger pods would run as privileged. Then I would point the SDK to the K8s Service via the experimental flag. Has anyone tried this yet, or is it on the roadmap? Thanks 🙂

proud oyster
#

(cc @shell turret @low gorge for when they are back from weekend)

lavish summit
#

That's great! What's the best way to connect? Europe/CEST time zone here. I am planning to solve the storage issue using EFS (AWS's super fast type of NFS), which is attachable to multiple pods, and run Dagger in EKS. That way I can have round-robin balancing to Dagger pods behind a Service without thinking about cache misses or hits: as soon as a build has run and cached something the first time, the centralised EFS mount (mounted as rw on multiple pods) will make it available to subsequent builds

proud oyster
# lavish summit That's great! What's the best way to connect? Europe/CEST time zone here. I am p...

I think you might run into performance and concurrency issues with that approach to distributing the cache data. I will wait for the infra experts to weigh in.

Context is that we explored this architecture for our own Dagger Cloud service, and decided against it. Instead we are building a managed control plane that can orchestrate storage and distribution of cache data between all your engines and object storage services

#

But it's possible that I'm misremembering, and both approaches are equally valid. Either way we're happy to help you in finding the ideal setup

proud oyster
lavish summit
errant panther
#

I did a small POC for our Kubernetes cluster. I deployed the Dagger Engine as a Deployment (only 1 replica, no Service), and then connected from my local machine with _EXPERIMENTAL_DAGGER_RUNNER_HOST=kube-pod://buildkit-podname (prerequisite: set the right kube context beforehand)
But as Solomon already pointed out, I think you will run into concurrency issues if you try to share the cache with RWX volumes
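A minimal sketch of how the runner host value from the message above could be assembled in Go. Only the `kube-pod://buildkit-podname` form appears in this thread; the `namespace` and `container` query parameters and the `runnerHost` helper are assumptions for illustration, so check them against the Dagger documentation for your version before relying on them.

```go
package main

import (
	"fmt"
	"net/url"
)

// runnerHost builds a kube-pod:// value for _EXPERIMENTAL_DAGGER_RUNNER_HOST.
// The namespace and container query parameters are assumptions for this
// sketch; empty values are simply left out of the URL.
func runnerHost(pod, namespace, container string) string {
	q := url.Values{}
	if namespace != "" {
		q.Set("namespace", namespace)
	}
	if container != "" {
		q.Set("container", container)
	}
	u := url.URL{Scheme: "kube-pod", Host: pod, RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Pod name taken from the message above.
	fmt.Println("_EXPERIMENTAL_DAGGER_RUNNER_HOST=" + runnerHost("buildkit-podname", "", ""))
}
```

The printed line can be exported in the shell that runs the Dagger CLI or SDK, after selecting the right kube context.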

low gorge
#

This might help: https://www.youtube.com/watch?v=c93_EsedP1s . To browse the code docker run -it --entrypoint nvim registry.dagger.io/equinix-demo-day-2023 .

As for unprivileged & rootless, it's complex:

Watch the full Demo Day! https://www.youtube.com/watch?v=-siv1ga0l_o

In this segment, Fen Aldrich and Gerhard Lazu show us a Dagger Demo on Equinix Metal.

Fen Aldrich, Developer Advocate - Equinix
Gerhard Lazu, Software Engineer - Dagger
Kyle Penfound, Solutions Engineer - Dagger

Read Gerhard Lazu's blog post https://gerhard.io/talk/dagger-on-...

GitHub

What are you trying to do? The goal is to support setting a Exec privileged so that processes like docker-in-docker can be run. Why is this important to you? Some of our projects are Kubernetes con...

GitHub

Instead of using --privileged with the default buildkit container, use a rootless one which makes it safer: https://github.com/moby/buildkit/blob/master/docs/rootless.md#docker
One reason to not re...

cerulean mountain
proud oyster
south thorn
languid badge
#

I think I missed the call... were yall talking about hosting buildkit in k8s by chance

#

Trying to sort out if I can do non-privileged dagger engine. Getting some errors related to dns if I run unprivileged

buildkitd: install resolv.conf: remount /etc/resolv.conf to upstream alias: operation not permitted
proud oyster
#

@languid badge yes it was about hosting Dagger engine (including its embedded buildkit) on kubernetes. Recording will be up soon, and we're happy to discuss the specifics here!

languid badge
#

great timing lol

proud oyster
#

there's a PR in progress by @shell turret

languid badge
#

🔥 literally my exact setup. I've got the dagger engine remote though - maybe that's not required...

proud oyster
#

Without knowing your specific constraints, I would recommend sticking to the setup as documented as much as possible

#

Remote is possible but it's not the most common, if you hit issues there will be less pooled knowledge

languid badge
#

👍 - no constraints exactly, was just thinking to persist cache a little more. I think there's S3 options available as well though if I have individual engines like this right?

low gorge
#

Hi @languid badge

There will be a blog post which ties everything together: pull requests, issues (re unprivileged & rootless), previous community call video, Equinix Demo Day 2023 talk, etc. Will drop the link to the blog post here as soon as it goes live. For now, this is a good one to follow: https://github.com/dagger/dagger/pull/5446

FWIW: cc @shell turret

languid badge
#

doooope. ty!

south thorn
rose oak
#

I just watched that video and several times they talk about a blog post. I can't find that blog post.

proud oyster
rose oak
#

The way they talked in the community meeting it sounded like it was going to be published that day. That was where my confusion comes from.

proud oyster
#

Yeah I understand the confusion. We caught the problem late in the process, in other words: we shelved the blog post after it was mentioned in the demo.

As a team we like to ship fast which sometimes comes at the cost of imperfect synchronization (a delicate balance). So in exchange for the occasional confusion, you get a lot more features 🙂

#

Note that the PR is not merged, so it's not authoritative: what is documented there may not be exactly what is supported in the future. But it's directionally correct and comes from our own infra - so worth looking at

rose oak
#

yeah, this is very GitHub Actions-centric. I am hoping to run dagger in a k8s pod created by Argo Workflows.

#

Seems that isn't a setup you have all really looked at yet.

proud oyster
#

the kub part, yes. quite common

#

the argo workflow part, not as much

#

but the argo-specific part shouldn't matter in your case

fathom carbon
#

will it be possible to use something like a container registry or cloud bucket for caching?

cerulean mountain
#

So, you can use buildkit's default cache exporters (https://github.com/moby/buildkit/#export-cache) with the _EXPERIMENTAL_DAGGER_CACHE_CONFIG env variable, which gets passed directly to buildkit in your runs. However, we know there are gotchas and open issues with using the basic buildkit cache. Because we want something solid that also has better performance through enhancements like caching of volumes (like for pip cache), etc, we've been investing in the Dagger cache service. Happy to show that to you if you're interested 🙂. cc @half canyon
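To make the cache exporter option more concrete, here is a small Go sketch that assembles a value for the environment variable. It assumes the variable accepts the same comma-separated `key=value` form that BuildKit's `buildctl --export-cache` uses (for example `type=registry,ref=...`); that format, the `cacheConfig` helper, and the registry reference are assumptions for illustration, not something confirmed in this thread.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cacheConfig renders "type=<kind>" followed by the remaining attributes in
// sorted order, joined by commas. The overall format is an assumption that
// mirrors BuildKit's --export-cache syntax.
func cacheConfig(kind string, attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := []string{"type=" + kind}
	for _, k := range keys {
		parts = append(parts, k+"="+attrs[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	cfg := cacheConfig("registry", map[string]string{
		"ref":  "registry.example.com/ci/cache", // hypothetical cache image reference
		"mode": "max",
	})
	fmt.Println("_EXPERIMENTAL_DAGGER_CACHE_CONFIG=" + cfg)
}
```

Sorting the attributes keeps the rendered value stable between runs, which makes it easier to diff CI configuration.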

GitHub

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit - GitHub - moby/buildkit: concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit

fathom carbon
proud oyster
#

@dire oasis I am very curious to learn more about how you've leveraged kubernetes for per-job and per-repo resource quotas in your builds. I'm confident we can address your concerns (how to avoid losing those benefits) but I'm sure there are tradeoffs involved, I'd like to understand them better.

#

cc @cerulean mountain I'm moving the party here 😁

dire oasis
#

Thanks @proud oyster. I'm trying to digest/understand the information about DAG and how it will help, I still feel apprehensive about Dagger Engine's ability to scale and the reliance on Docker. Everything else though makes me excited about Dagger. Though I think I need to stop asking and start trying Dagger. I want to try and put Dagger through the paces, maybe throw quite a few concurrent builds at it to see how Dagger Engine handles them.

proud oyster
#

That sounds good 👍 Some apprehension is appropriate for a relatively young product

#

Note that there is no hard dependency on Docker. It's only used as a convenience default to bootstrap the engine. In a kubernetes daemonset configuration, no docker needed

dire oasis
low gorge
sacred osprey
glossy egret
#

is there a good way to share cache between engine instances?

#

i see buildkit has a few options for external cache, perhaps could use the registry or s3

#

I see "magicache" mentioned in the Helm PR?

half canyon
# glossy egret is there a good way to share cache between engine instances?

Hey @glossy egret 👋 Indeed there are some buildkit approaches, but yes, we have a cloud caching service (sometimes called "magicache" 🙂 ) that is in early access that makes sharing cache between engines and CI runs super easy and includes extras like the ability to use cache volumes (like for go, node, python deps). We have customers running that setup in k8s.

proud oyster
#

@glossy egret we originally hoped that buildkit's cache export features would be enough, but it turns out there are many limitations, some of them fundamental, so we built a distributed cache service (which @half canyon mentioned)

glossy egret
#

Cool, didn't know Dagger was doing anything commercial

proud oyster
#

It's been mostly under wraps, we have customers in early access but haven't announced anything yet

proud oyster
#

@low gorge @daring wraith 👋 for future Dagger+Kubernetes discussions 🙂

sacred osprey
#

and @sacred osprey (the old man running the homelab on k3s)

sacred osprey
#

Dude! Where's my gitea !?

south thorn
sacred osprey
#

Future-proofing dagger

amber narwhal
#

I'm just gonna leave this idea here: I've been using Garden (https://garden.io/) lately and it's awesome....but Dagger is awesomer and it's capable of most of the things Garden does. It just needs a reusable module....

Accelerate the DevOps workflow. Build, deploy, and test in production-like environments with one platform.

cerulean mountain
amber narwhal
#

Image builds already exist in native Dagger

#

So it might make more sense to rebuild the rest

half canyon
amber narwhal
#

Well, the basic features are build, deploy, test and run.

Build is kinda given, but build also "loads" a container image into the local Kubernetes cache (no idea how they do it).

Deploy is basically a nice wrapper around Helm, Kustomize and all the rest.

The thing that makes Garden stand out is that you can define dependencies between the different steps (which is also given in Dagger).

I'm not very familiar with their testing capabilities, but I'm pretty sure it's also just some high level API around running containers in a cluster.

So I think it's mostly just supporting the different deployment strategies and creating some glue around the different steps to make it easier for people to use. (For example: Garden is configured through YAML, can be organized into modules, etc.)

#

I understand YAML is not really for Dagger, but the nice thing about Garden is that it's easy to use and the API it provides is just enough and super easy at the same time.

I'm not saying we need YAML, but need the simplicity that it provides for Garden.

#

Another project I used Garden: https://github.com/bank-vaults/vault-secrets-webhook

Interestingly, I used Garden here so that I can avoid running the entire CI pipeline which is slow at the moment. Just running a local Kind cluster, deploying and building everything with Garden ended up being super easy.

GitHub

A Kubernetes mutating webhook that makes direct secret injection into Pods possible. - GitHub - bank-vaults/vault-secrets-webhook: A Kubernetes mutating webhook that makes direct secret injection ...

#

Happy to show you how I use Garden from a user perspective @half canyon

amber narwhal
#

Anybody working on that by any chance?

sharp hedge
#

but agree they have multiple places for dagger to leverage their existing CI

amber narwhal
#

I'm not talking about developing Flux itself. I'm talking about their image automation feature that automatically deploys new images based on certain policies. In order to achieve that they continuously poll container registries and then update the gitops repo with the new image.

Instead of that, I want dagger to update the gitops repo with the new image tags automatically after a new image is pushed.

It solves a whole lot of issues with Flux's own automation, namely write access to the gitops repo, multi-tenancy and a bunch of other issues.
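The core of that idea, bumping an image tag in a GitOps manifest right after a push, can be sketched in plain Go. This is only an illustration under stated assumptions: the `setImageTag` helper, the manifest layout, and the `ghcr.io/acme/app` repository are hypothetical, and a real pipeline would still need to commit and push the change to the GitOps repo.

```go
package main

import (
	"fmt"
	"regexp"
)

// setImageTag rewrites the tag on every "image: <repo>:<tag>" entry for the
// given repository and leaves all other images untouched.
func setImageTag(manifest, repo, tag string) string {
	re := regexp.MustCompile(`(image:\s*` + regexp.QuoteMeta(repo) + `):\S+`)
	return re.ReplaceAllString(manifest, "${1}:"+tag)
}

func main() {
	// Hypothetical manifest fragment from a GitOps repository.
	manifest := "containers:\n" +
		"  - name: app\n" +
		"    image: ghcr.io/acme/app:1.0.0\n" +
		"  - name: proxy\n" +
		"    image: traefik:v2.10\n"

	fmt.Print(setImageTag(manifest, "ghcr.io/acme/app", "1.1.0"))
}
```

Because the pipeline that pushed the image already knows the new tag, nothing has to poll the registry, and write access to the GitOps repo stays with the CI job.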

sharp hedge
#

oh, it makes sense

sacred osprey
#

Build OKD using Dagger?

sacred osprey
#

Awesome k8s/nodejs troubleshooting session with @violet hatch and @cerulean mountain ! I think I have a new lab buddy with Noe because he knows nodeJS and I know k8s.

rocky scarab
#

Hello everyone,
I've been trying to run dagger inside a container in k8s for two days, but without success so far.

level=fatal msg="failed to mount {Type:overlay Source:overlay Target: Options:[index=off lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/310/fs:/var/lib/containerd/[................] on \"/tmp/initialC1941586119\": operation not permitted"
: exit status 1

Kubernetes cluster on EKS version 1.24.
My image is built with ghcr.io/containerd/nerdctl as the base (FROM ghcr.io/containerd/nerdctl) and we have a symlink: ln -s `which nerdctl` /usr/local/bin/docker

the volumes mounted (docker is not needed but we have another cluster on 1.22 that will need docker.sock):

volumes:
  - name: docker-sock
    hostPath:
      path: "/var/run/docker.sock"
  - name: containerd-lib
    hostPath:
      path: "/var/lib/containerd"
  - name: tmp-host
    hostPath:
      path: "/tmp"
  - name: containerd-dir
    hostPath:
      path: "/run/containerd"

security on template:

securityContext:
  runAsUser: 0
  runAsGroup: 1001
  fsGroup: 1001
  fsGroupChangePolicy: "OnRootMismatch"

on container:

  securityContext:
    privileged: true
    capabilities:
      add:
        - ALL
    runAsUser: 0

Everything works fine locally with Docker when binding docker.sock in docker-compose.

I have tried a lot of things. If you have any idea how to make it work, thank you!

sharp hedge
#

Running Dagger in K8S

sacred osprey
timber solstice
#

Has anyone run dagger with EKS fargate?

cerulean mountain
worldly gate
#

Is there a reason the socket must be hostpath mounted? I see one example that doesn't do this. The Tekton example just uses emptyDir? https://docs.dagger.io/213240/tekton Is that example wrong?

    - name: dagger-socket
      emptyDir: {}
    - name: dagger-storage
      emptyDir: {}
...
      volumeMounts:
        - mountPath: /var/run/buildkit
          name: dagger-socket
        - mountPath: /var/lib/dagger
          name: dagger-storage
low gorge
#

Use Dagger with Tekton | Dagger

shadow dragon
#

Hey, I'm looking into developer tools for working in k8s. I found quite a few (okteto, devspace, tilt, skaffold, and I may be missing some)... It is kind of a tangent to dagger, but if I get it right, most of these tools provide some way to watch resources, then build a Docker image "just in time" and push it to a local k8s cluster with minimum delay.

I was wondering if someone here has used any of these tools and combined it with dagger by any chance? I'm thinking the step of building the Docker image may be better implemented through dagger, but I'm not sure just yet. In particular I'm not sure if dagger has any support for watching resources... sounds like I may just have to use some golang file watcher and implement it myself.

Thx!

shrewd hazel
#

I noticed the new dagger docs no longer show a path to using Dagger in k8s. Is this an oversight? Or does the new version no longer work with k8s? (Yeah, I have yet to actually use Dagger and wanted to get started with Dagger in k8s).

proud oyster
#

By the way that guide is for production optimization. On k8s the Dagger CLI will work out of the box, as long as you support DinD

shrewd hazel
#

That kubernetes guide is still valid.
Great to know. I'll work on getting a Dagger Engine and my first workflows going in the next few days.

shrewd hazel
#

What would we be missing without using the Dagger Cloud distributed cache?

#

Or asked maybe in a better way, what is cached?

versed holly
# shrewd hazel Or asked maybe in a better way, what is cached?

Hey Scott!!

Caching in Dagger is the main feature that can potentially speed up your pipelines. When you write a Dagger function such as:

func (m *Stdout) Echo(ctx context.Context) (string, error) {
    return dag.Container().
        From("alpine:latest").
        WithExec([]string{"sh", "-c", `(echo "This is stdout" ; echo "This is stderr" 1>&2)`}).
        Stdout(ctx)
}

If you run this function twice you will see that the second time will be much faster. This is because each layer required to execute this function has already been computed and can thus be re-used. In this case, the image alpine:latest has already been fetched and the command has already been executed and we know its output. So there is no need to re-execute any step. If I were to modify the contents of the WithExec step then that will need to be executed on the next run and then cached.

The other quite relevant aspect of caching is CacheVolume, there is a guide that does a better job at explaining this: https://docs.dagger.io/user-guide/cloud/572923/get-started/#step-4-use-cache-volumes-with-the-experimental-dagger-cloud-cache.

Related to your first question, there is an issue where we are debating "How to scale Dagger in production". Caching is one of the topics when we talk about scaling. There is a comment that @low gorge wrote that does a great job at explaining the different ways of scaling Dagger in prod and how that affects Dagger's caching capabilities: https://github.com/dagger/dagger/issues/6486#issuecomment-1910551524

GitHub

Problem We haven't conclusively answered the question: "what is the best way to scale Dagger in production?". This is in part because there is a wide variety of requirements and prefe...

shrewd hazel
#

So, if I understand correctly, the Dagger Cloud distributed cache is the docker build cache being stored externally from the Dagger engine? And if I understand what Gerhard wrote (nice write-up btw), we could run Dagger engine as a never-dying service and the build cache would be available locally nonetheless? If yes, that's the answer I'm looking for, as we'd want a full time dagger engine (or multiple engines depending on load) running in our clusters. We don't intend to sell the CI service as its own thing, but the process of development will entail Dagger for CI.

Also, it would be a welcome option for us in the future, if we could self-host our own distributed Dagger cache, especially if we find the ephemeral usage of Dagger to be more cost effective. It could be a paid option for sure. 🙂 Thing is, we are purposely avoiding any 3rd party out-of-cluster management solutions. We feel our platform must be 100% independent of 3rd parties. This is not to say, however, that users of the platform can't buy into any 3rd party solutions they may need; in fact they should. We just don't want lock-in on our side (whereby, we know anyone buying into our platform is being locked in. It's the nature of a platform. 🙂 ).

low gorge
# shrewd hazel So, if I understand correctly, the Dagger Cloud distributed cache is the docker ...

Thanks!

There is a bit more to the Dagger Cloud Cache. @eternal kraken puts it best in a few minutes here: https://pod.gerhard.io/2#t=17m5s

If you can make the same volume available to Dagger Engine, state from previous operations (a.k.a. cache) can be re-used and builds will be more efficient.

I personally don't bother stopping the Dagger Engine in my K8s setup. Dagger is always ready to service requests, the only wait time is the CI runners (ARC in my case: https://github.com/actions/actions-runner-controller).

If you run the Dagger Engine as a daemonset on a dagger-runner node type, and only schedule CI runners on those nodes, then you don't need to worry about provisioning it. In this scenario, everything stays on your cluster.

Upgrades require some more thought, but nothing too involved.

We are actively working in this area, I expect us to have a bunch of production-related improvements over the coming months. Also relevant to this discussion: https://github.com/dagger/dagger/issues/5583

Other video resources:

shrewd hazel
#

Wow! Thanks Gerhard. Are you German, by any chance? 😄
This is all very interesting and to some points way over my head, but I'll keep doggy paddling away. 🙂

#

Never mind. Swiss, not German. 🙂

stuck oracle
#

Has anyone spent time getting dagger engine configured / spun up and managed by ArgoCD inside of k8s? About to embark down that path and figured I'd ask if there are any demons there compared to the typical setup path for applications managed by argo.

shrewd hazel
#

@stuck oracle - I'm also embarking on this path - sort of. Not with ArgoCD at first, but in running the Dagger engine persistently in a k8s cluster. The answers to the questions I posed above, if you didn't happen to read them, will probably interest you too. Especially the point made in Gerhard's post about vertical and horizontal scaling of Dagger engines and the fact that the upgrade process needs to be carefully designed when using the Dagger Engine long-running/ persistently, more than likely needing a blue-green deployment process (that is my recollection of my comprehension of his article. I may have misunderstood or mixed things up).

So, TL;DR and AFAIK, using ArgoCD alone won't completely work for long-running/non-ephemeral Dagger Engines. Well, the auto-updating won't. As I see it, you'd need something like Argo Rollouts on top of ArgoCD too.

#

Anyone with real knowledge, please do correct me, if I am wrong.

proud oyster
#

The only thing I will add is that although we support long-running the engine container today, we are discussing moving away from that architecture, because it introduces versioning headaches (CLI and container are tightly coupled).

So it's best to keep that in mind before investing too much custom configuration or tooling.

shrewd hazel
#

@proud oyster - If you all decide to not support the long-running engine, then my comment above about desiring a self-hosted distributed cache would be even more significant. 🙂

After reading the stuff given to me above, I don't see the coupling between the CLI and the Engine as a terribly hard to solve challenge, is it? In k8s?

The version of Engine running depends on the CLI used (so sort of looking at the dependency backwards) and because the pods running the CLI will be much more short-lived, the number of Engine versions that need to run is two at most. The hardest part would be the trigger to allow the newer versions of the CLI to be used in the CI workflow pods (for lack of a better name), indicating the newest Engine is up and running and ready for work. And, also knowing when to kill the older version Engine. But, actually, I don't see those as big issues either. 🤔 Of course, my experience in this is limited. So, I'd love to hear about why it's a more difficult challenge than I can imagine. 🙂

Could it be that supporting the two paths in the Engine development is the harder part? 🤔

Or that k8s usage with long-running Engines avoids the commercial side of your enterprise to a point? 🤔 (Which it doesn't have to be.) 😉

proud oyster
#

A few points:

  • This is unrelated to commercialization strategy. We don't make product design decisions based on monetization.
  • We plan on supporting self-hosted distributed cache regardless.
  • The issue of CLI/engine coupling is complicated. One aspect is that CI configuration (where CLI version is managed) and Kubernetes configuration (where long-running container version is managed) have different lifecycles; often they are not even owned by the same people. And there is not always 1-1 mapping between them. An organization may have multiple CI configurations, and multiple kubernetes configurations, and possibly multiple permutations of them. In a simple stack, it's mildly tricky; in a more complex enterprise it's a nightmare.
  • On top of this, the engine container is simply not designed to be a long-running service. It's not safely multi-tenant; its remote communication protocol is private at the moment (unlike the GraphQL API exposed by the CLI)
  • Lastly, the operational model is currently split between 2 very different architectures: long-running container (on some kub installations) and CLI-managed (everyone else). Having 2 operational models makes everything more complicated.

Some further reading:

GitHub

Question I want to start a discussion on an important aspect of Dagger's architecture: whether to make the engine stateful or stateless. This is a complex topic with important ramifications, th...

GitHub

This issue was previously named "engine drivers", but "compute drivers" is proving more clear. Problem The Dagger CLI has a builtin "compute driver": a software interface which ...

shrewd hazel
#

We plan on supporting self-hosted distributed cache regardless.

Ok. This makes me very happy. 🙂

And in fact, this comment now pushes me to think about what we will want to achieve with Dagger differently. Thanks for that!!!! 🙂

This is unrelated to commercialization strategy. We don't make product design decisions based on monetization.

Ah, a breath of fresh air! OMG! 🙂 Hard to believe, but awesome if you can continue to achieve it. And this now makes me much happier to take on Dagger as a solution and as a partner and to support it once we can. Very nice! 😊

low gorge
low gorge
# shrewd hazel @stuck oracle - I'm also embarking on this path - sort of. Not with Argo...

Actually Argo CD should be sufficient to manage one or more Engines on Kubernetes. I keep my Dagger Engines pinned to a specific minor version, e.g. 0.6.x, 0.9.x, etc. and only apply patch upgrades, which worked fine for the last 12 months.

When a new patch bump goes out, the running Engine will be stopped gracefully, which introduces minimal disruption.

So what are the hard parts of long-running Engines? It all ties back to the pre-provisioned concept, meaning that the CLI is given an existing Engine to work with, and this might cause issues. For example, running Dagger CLI 0.6.x against a 0.9.x Engine will not work.

Compute Drivers (linked to above) is what we are currently exploring as a potential solution to the above problem.

As for the stateful vs stateless, I have to spend more time considering this approach. I have been in the stateful camp for as long as I can remember, so that feels more natural. Same for bare metal, immutable infrastructure and declarative outer shells.

shrewd hazel
#

@low gorge - What's the reason for needing the older version along side the newer version?

low gorge
shrewd hazel
#

So, the workflow code is also "tied" to the versions?

#

I can see how that could be a real problem....

low gorge
#

It's the CLI version, specifically dagger. There weren't many breaking changes - a few per year - but to make sure that everything works well together, the recommendation is to always use the same version for both the CLI and the Engine.

For example, we often use different patch versions of the CLI vs the Engine in our own dagger/dagger workflows, but I don't remember us ever using different minor versions. Behaviour is undefined, so keeping the minors in sync is strongly recommended.
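The "keep the minors in sync" rule from the message above is easy to turn into a pre-flight check in CI. The following Go sketch assumes plain `major.minor.patch` version strings with an optional `v` prefix; the `compatible` helper is hypothetical and not part of Dagger.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor extracts the major and minor numbers from a version string such
// as "0.9.3" or "v0.9.3".
func majorMinor(v string) (int, int, error) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("invalid version %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid major in %q: %w", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid minor in %q: %w", v, err)
	}
	return major, minor, nil
}

// compatible reports whether the CLI and Engine share the same major and
// minor version; differing patch versions are tolerated.
func compatible(cli, engine string) (bool, error) {
	cMaj, cMin, err := majorMinor(cli)
	if err != nil {
		return false, err
	}
	eMaj, eMin, err := majorMinor(engine)
	if err != nil {
		return false, err
	}
	return cMaj == eMaj && cMin == eMin, nil
}

func main() {
	for _, pair := range [][2]string{{"0.9.3", "0.9.7"}, {"0.6.4", "0.9.7"}} {
		ok, err := compatible(pair[0], pair[1])
		fmt.Printf("cli=%s engine=%s compatible=%v err=%v\n", pair[0], pair[1], ok, err)
	}
}
```

A check like this can fail a job early when a pre-provisioned Engine and the CLI in the CI image have drifted apart.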

proud oyster
# low gorge Actually Argo CD should be sufficient to manage one or more Engines on Kubernete...

I'm also in the "stateful compute" camp, but that doesn't necessarily mean the engine container itself should be stateful. For example I love to ssh+rsync files into a long-running server. sshd is long-running, but the rsync server is ephemeral and short-lived. They work great together.

In this analogy, half of our community is deploying the engine like sshd, and the other half like rsync. Eventually we will need to pick one, and I think the rsync model is a better north star to aim for.

low gorge
# proud oyster I'm also on "stateful compute" camp, but that doesn't mean necessarily the engin...

Got it. I will go over https://github.com/dagger/dagger/issues/5484 and continue this conversation there so that it's easier to reference in the future. 👍

GitHub

Question I want to start a discussion on an important aspect of Dagger's architecture: whether to make the engine stateful or stateless. This is a complex topic with important ramifications, th...

shrewd hazel
#

I'm also on "stateful compute" camp
Me too. But, only because I've been using VMs and bare metal servers in the past. Kubernetes (and indirectly Docker) offers that paradigm shift to more ephemeral/ stateless usage of applications and it is more "cloud-like" in the end. 🙂 The only thing making this difficult is the fact that practically every application has some files or persisted data they work on and need config to run. This all needs to be "hooked up" and that then becomes the complication - the challenge... 🙂

proud oyster
#

Yes that is one complication. The other is that Dagger itself is a container orchestrator, although one with very different goals and design from kubernetes. It's possible for Kubernetes to run a container that itself runs more containers, but it can be awkward sometimes.

proud oyster
shrewd hazel
#

Is there any ETA on the self-hosted Dagger distributed cache, [mentioned above](#kubernetes message)? 😊 I'm just looking for a ball-park like, "it's actually around the corner" aka a few weeks away or "it's still in planning stage with no ETA yet" aka it will come soon™. 😛

proud oyster
shrewd hazel
# proud oyster still planning stage. but there are workarounds available. It depends what your ...

Thanks for the quick reply. What would you consider to be examples of architectural constraints, say in k8s? This is exactly the open question currently in my mind. I'm asking, because I just watched Kyle's Argo Workflows with Dagger video (again) and read the guide, but I'm uncertain about how to get the "CACHED" results he got, without Dagger Cloud. He didn't mention using Dagger Cloud in the video (which was a miss to sell it 😛 ), and I highly doubt he had a workaround going. But, this scenario of using Argo Workflows to trigger CI runs is where I'm heading. Caching is secondary for now for sure, but it brought me back to remembering what you said about the plans on the self-hosted cache and my question. 🙂

shrewd hazel
#

Hmm.. I just watched the video again, and I didn't realize he said "cache persisted within my runs within kubernetes". So, I think I'm still out to lunch about what caches what and how. Sorry. But, any explanation would be greatly appreciated. 🙂

proud oyster
#

No worries. I promise it will get better over the next few months. With Functions launched, production readiness is the new top priority.

#

Caching architecture is actually quite simple:

  • Each engine always has one local cache. It's stored in the engine container's local filesystem.
  • Optionally, an engine can sync its local cache to a remote storage service. This requires a centralized orchestration service. Dagger Cloud is the only such orchestration service today. It combines orchestration & storage for ease of use (at the expense of flexibility).

So the two main parameters of your production architecture are:

  1. can you use Dagger Cloud, and
  2. how persistent is your dagger engine's local storage?

The more persistent your local storage, the less you need distributed caching (which today means Dagger Cloud). If your local storage is very ephemeral, and you can't or won't use Dagger Cloud for distributed caching, then today your best bet is to find ways to make your local storage less ephemeral.

In the future we will decentralize cache orchestration, removing the need for a centralized service altogether. The engines will have configurable storage drivers for plugging directly into commodity object storage. A decentralized design is more challenging to build, but the reward is that you will have more options to distribute cache, relieving the pressure to make local storage more persistent.

How to make your local storage more persistent depends on your compute architecture, and how much flexibility you have in changing it. Those are the constraints I was referring to earlier.

I hope this helps!

shrewd hazel
#

Oh yes. Very much. Thank you so much for taking the time.

Correct me if I am wrong, Kyle mentioned setting up a volume for Dagger. I thought this was only for the gomod cache, but it is for all caching used by Dagger? 🤔

If the volume is not for "all caching" (which I don't think it is), what path would I need to create a volume on to get a persisted cache for the Dagger Engine? If I know that, I think I might have what I need for an ephemeral setup, which might be a naive take on all this, but I also think would be an awesome start. 🙂

On a side note: I rarely get super enthusiastic about any OSS project I tackle. Some, the rare but very important ones, I've gotten very close to and have become a sort of an evangelist and some even a member of the team. Dagger is a project that is giving me this very good vibe and feeling of wanting to give back as soon as I can. Not sure how I will do that or when, but I thought I'd mention it (and I hope it doesn't come off as blowing smoke up your you-know-what 🙂 It's heartfelt and true. 😊 ).

stuck oracle
#

When running the dagger engine in K8s, how would I access environment variables established through helm? I.e. I want to set up various secrets for our functions to leverage when they run. It's not clear to me that's possible and that all values need to be passed in?

proud oyster
#

Yes when calling a Dagger Function you need to pass everything explicitly as arguments. There is a native secret type which you can use to securely pass those secrets.

#

This works exactly the same regardless of where your dagger engine runs, by design. It ensures your functions are as portable and reproducible as possible.

stuck oracle
#

I'm going to start the work over here to get dagger running in my K8s cluster. I need to set up a new node pool in GKE and I know that Dagger wants local SSDs to perform well. There are some docs on Google about creating node pools with "local SSDs with block storage", and I assume this is the route that I want to be going. https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/local-ssd-raw#node-pool

I wanted to reach out and double check that this is the right option to go with. Assuming I can get this node pool spun up properly, I was curious about how I should configure Dagger through helm to leverage these SSDs properly for its cache. Does anyone have any example helm chart setups that might be a good guide for me to at least reference in getting that set up properly?

Google Cloud

This page explains how to provision Local SSD storage on clusters and provides examples of how workloads can consume data from Local SSD-backed raw block storage.

low gorge
low gorge
# stuck oracle Im going to start the work over here to get dagger running in my K8s cluster. I ...

I had a look at that documentation, and adapted this - didn't test:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: "dagger-engine-pv"
spec:
  capacity:
    storage: 375Gi
  accessModes:
  - "ReadWriteOnce"
  persistentVolumeReclaimPolicy: "Retain"
  storageClassName: "local-storage"
  local:
    path: "/var/lib/dagger"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: "kubernetes.io/hostname"
          operator: "In"
          values:
          - "gke-test-cluster-default-pool-926ddf80-f166"
#

Not sure if a filesystem is created automatically.

shrewd hazel
#

@low gorge - could it be you answered my question too above?

what path would I need to create a volume on to get a persisted cache for the Dagger Engine?
So `/var/lib/dagger`?

low gorge
shrewd hazel
#

But, hoping to keep hooking up the volume (by making sure the engine containers are started on the same nodes)

#

How are you triggering the workflows to start?

proud oyster
#

@shrewd hazel to be clear you cannot share the same state directory between engines.

shrewd hazel
#

Ok. So, back to stateful. This is the understanding that is missing because Dagger is like a black-box in my mind. 😊
So, if I create a daemon set with node affinity, I can run say two engines on two nodes and have a local volume attached and all is good in terms of cache (hopefully).

Then each CLI instance would need to be "guided" to the one or other engine. If the same workflows are called over and over, they should be guided to the same engines. By guided, I mean setting _EXPERIMENTAL_DAGGER_RUNNER_HOST. Am I getting closer to a correct understanding? 😛

proud oyster
#

Note that using a daemonset vs. regular deployment is up to you, it boils down to the kind of persistence you want. We document daemonset + tight coupling to a CI runner configuration, because it's a reasonable default for the most common use of dagger-on-kub.

But there's nothing magical about dagger that somehow requires a daemonset no matter what.

#

Generally we're shifting towards decoupling your dagger compute architecture from CI compute architecture when you can. Keeps your options open.

low gorge
# shrewd hazel Ok. So, back to stateful. This is the understanding that is missing because Dagg...

The simplest & most reliable setup that I am aware of - and is available today - is 1 Dagger Engine per K8s node with a dedicated local disk.

Store the Dagger Engine unix socket on hostPath, and mount it into any pod that needs it. Not putting unnecessary pressure on the container network stack will help.

New CI runners will spin up around the Engine, mount the Dagger Engine unix socket, and get to work. I wouldn't worry too much about jobs picking a specific Engine. As jobs run, cache will spread naturally across all Engines. If one K8s node becomes unavailable for any reason, you should have at least one more to run the workloads.

As you become more comfortable with this setup, you will add new K8s clusters, and then you will have multiple redundancies.

For KubeCon EU, I am preparing a talk which covers this very setup. Dagger Engines are spread across UK, France, Germany & Poland, one per K8s node. GitHub runners spin around them on demand. This is what that looks like:

If you're up for it, I'm happy to talk more after KubeCon, and even set up a live pairing session. Perhaps others will be interested too.

shrewd hazel
#

@low gorge - Thanks so much. Ok. Let me run with this understanding now. I'm getting closer... 🙂

proud oyster
#

One best practice in this setup (tell me if you agree @low gorge) is to configure client pods at the infra level, rather than leak the config into the app/CI logic, i.e. make sure to set _EXPERIMENTAL_DAGGER_RUNNER_HOST in the Kubernetes config running your CI job, rather than in the CI configuration itself. That way the CI configuration remains portable.

low gorge
low gorge
shrewd hazel
#

Much appreciated @low gorge. I'll have a look tomorrow. 🙂

#

@low gorge - You noted you're using Github runners. We don't wish to be dependent on Github to trigger CI workflows. Our goal is to allow any git merge that can send off a webhook to be the CI trigger. Do you see any issues with that direction?

proud oyster
#

that sounds even better to me 😁

#

how do you plan on processing the webhooks?

#

I'll let @low gorge answer on operational caveats today (I can't think of any). But in terms of where we want to go: the goal is full self-hosting for your dagger-based workflows. Meaning that the event handlers that call dagger functions (for example a webhook server, or ci runner) should themselves be runnable as dagger functions.

#
  • Today: run Github Actions or any other event trigger service alongside the dagger engine

  • Tomorrow: run event trigger service on (not alongside) the dagger engine.

shrewd hazel
#

how do you plan on processing the webhooks?
We'll be using Argo Workflows. 🙂

#

Which is probably overkill. Theoretically, we could simply run our own webhook servers to kick off Dagger functions. 🙂

proud oyster
candid verge
#

I have a question about this setup. I did it at the end of 2023, but I was facing an issue with Dagger Cloud cache sync: after the dagger run, the cache was not synced to Dagger Cloud.
Support told me I had to stop the engine as a workaround for this issue.
So in the end I had to:

  • migrate my daemonset to a sidecar docker engine

  • start the dagger engine in my pod by calling the dagger cli

  • afterwards, stop the dagger container with a timeout to leave some time for the cache to sync.

Is this issue fixed, or is it still a limitation?

shrewd hazel
#

I think I found a very minor issue with the helm chart, in particular with the values.yml file. The node tolerations and affinity entries, which are commented out, need to be at the level of engine. If someone less in the know (like me) just uncomments them where they are and assumes the entries should be under image, then tolerations and/ or affinity won't work. Should I put in a PR to fix it? Or just an issue?

GitHub

Application Delivery as Code that Runs Anywhere. Contribute to dagger/dagger development by creating an account on GitHub.

shrewd hazel
#

Btw, I have successfully launched my first Dagger engines. 😊

shrewd hazel
#

The k8s guide shows connection to the dagger engine via a local machine with kubectl connectivity. Is there another way to get communication going with the engine pods, but internal to the cluster?

Gerhard noted this above:

Store the Dagger Engine unix socket on hostPath, and mount it into any pod that needs it. Not putting unnecessary pressure on the container network stack will help.
But my k8s-fu isn't the greatest here. Looking at the Argo Workflows guide, I'd need something like this env to add to the engine pod.

env:
- name: "_EXPERIMENTAL_DAGGER_RUNNER_HOST"
  value: "unix:///var/run/dagger/buildkitd.sock"

Will that do it? And if yes, how to get it into the helm chart values?

versed holly
# shrewd hazel The k8s guide shows connection to the dagger engine via a local machine with kub...

Correct. You have to share the volumes between the dagger engine and the container pods so that the socket is available. For example, in our case we are setting up github runners that will communicate to the engine pod. In the runner pod spec we have:
Volumes:

volumes:
- name: varrundagger
  hostPath:
    path: /var/run/dagger

Volume mounts:

- name: varrundagger
  mountPath: /var/run/buildkit

And finally, like you share above, the env variable:

env:
- name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
  value: unix:///var/run/buildkit/buildkitd.sock
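
One easy slip in this setup is a mismatch between the mountPath in the runner pod and the path in _EXPERIMENTAL_DAGGER_RUNNER_HOST. As a rough preflight check, the Go sketch below parses the variable and verifies that a unix socket actually exists at that path inside the pod. The helper name socketPath is made up for illustration; this is not part of the Dagger CLI or SDK.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// socketPath extracts the filesystem path from a unix:// runner host value.
// (Hypothetical helper, for illustration only.)
func socketPath(runnerHost string) (string, error) {
	u, err := url.Parse(runnerHost)
	if err != nil {
		return "", err
	}
	if u.Scheme != "unix" {
		return "", fmt.Errorf("expected unix:// scheme, got %q", u.Scheme)
	}
	if u.Path == "" {
		return "", fmt.Errorf("no socket path in %q", runnerHost)
	}
	return u.Path, nil
}

func main() {
	path, err := socketPath(os.Getenv("_EXPERIMENTAL_DAGGER_RUNNER_HOST"))
	if err != nil {
		fmt.Println("bad runner host:", err)
		return
	}
	// The socket is only visible if the hostPath volume is mounted
	// at the directory the variable points to.
	info, err := os.Stat(path)
	if err != nil {
		fmt.Println("socket not found, check the volume mount:", err)
		return
	}
	if info.Mode()&os.ModeSocket == 0 {
		fmt.Println(path, "exists but is not a unix socket")
		return
	}
	fmt.Println("engine socket found at", path)
}
```

Running something like this as the first step of a CI job makes a wrong mount fail fast with a readable message instead of a connection error later on.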
versed holly
shrewd hazel
#

Correct. You have to share the volumes

stuck oracle
#

Maybe I've missed this along the way, but where do y'all host your Charts? I.e. I'm working on my helm setup right now and need to know the dagger helm chart name / repo / version. Would also love to see what can be configured with values files

stuck oracle
#

Hmmm. Getting dagger spun up in a cluster today. I have Dagger running, but I'm not sure what I'm doing wrong here on the node selection to get the daemon set to only attach to a certain node pool.

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: pool
                operator: In
                values:
                  - ci-runners
stuck oracle
#

For what it's worth, I spent the whole of today working on getting Dagger running inside of a K8s cluster and ran into a couple of roadblocks. The big one was that I couldn't get the dagger CLI running in a self-hosted gitlab runner, either through the typical way outlined in the docs or by attempting to publish my own docker image which already had it installed. Kept running into this issue with the runner (https://gitlab.com/gitlab-org/charts/gitlab-runner/-/issues/477).

I also opened up this issue on the Dagger github about getting an official CLI docker image. (https://github.com/dagger/dagger/issues/6887)

GitHub

What are you trying to do? The dagger docs for gitlab / github / etc have each job installing curl followed by installing the dagger CLI to use for a given CI run. It would be much nicer DX, and pr...

shrewd hazel
#

The output in the terminal via my Coder pod looks a bit off, but the dagger engine daemons + cli in another "coder pod" is working. Yay! 😄

shrewd hazel
#

Seems all the dagger output doesn't get shown?

stuck oracle
#

Run Dagger on Kubernetes | Dagger

stuck oracle
#

Is there a dagger engine version 0.10 on the helm registry? It seems like the engine on version 0.1.1 is 0.9.10

stuck oracle
#

I'm realizing that I have my persistent volume set up properly on my cluster, but I haven't set up a PVC for the volume to be used by the dagger engine. Is there any particular setup that I should be using for that claim?

stuck oracle
#

Im realizing that I have my persistent

stuck oracle
#

Here's maybe an interesting one.... Has anyone gotten the dagger engine to run within a Tailscale network inside of k8s?

proud oyster
#

Not to my knowledge, but whatever the best practices are for tailscale on k8s, they should apply to dagger as well

shrewd hazel
#

Does anyone have an open source example of a CI pipeline using Dagger in k8s by chance? Something more than the argo-workflows example/guide? I'm looking for some inspiration. 😊

versed holly
#

Hey Scott! Are you looking for example setups or specific dagger functions?

shrewd hazel
#

Hey Scott! Are you looking for example

heady dagger
heady dagger
#

is the source code for this available? Marcos made it. I would like to refactor it to use talos linux. I can copy most of the source code from the video frames, but as it is a year old I was wondering if there were improvements. https://www.youtube.com/watch?v=u1Q6RNaQHTY

In this demo, Marcos shares his experiences with running Kubernetes in Dagger and explores different possibilities for testing pipelines.

Want to ask the presenter a question about the demo? Join us on the Demo Discord Forum here to discuss this specific demo:
https://discord.com/channels/707636530424053791/1120935751069208699

heady dagger
proud oyster
heady dagger
shrewd hazel
#

Kubernetes | Dagger

south thorn
weak cypress
#

👋 I'm evaluating options on where to run a whole CI system and I have a very basic question...
I suppose there is no workaround for the Dagger Engine requiring root capabilities? That prevents the usage of GKE Autopilot, which would have been a great way to reduce the burden of maintaining a k8s cluster for our team.

daring wraith
#

Hey @weak cypress! 👋 nice to meet you!

Yeah (unfortunately) we don't support doing rootless mode - there's a bunch of permissions that the dagger engine needs to be able to create isolation between the containers that it starts and manages. Without those permissions it can't do that very easily, or do all the fancy networking stuff it does.

There's some more info in https://docs.dagger.io/faq/#can-i-run-the-dagger-engine-as-a-rootless-container

south thorn
south thorn
tulip wedge
#

Hey so I was just reading this post: https://dagger.io/blog/argo-cd-kubernetes and that's pretty similar to what I've been thinking for our dagger setup once we get a little further in. One question I've been mulling over is whether or not to be concerned about pre-warming the node's local cache when a fresh node comes up. Does anyone do anything like that?

Powerful, programmable CI/CD engine that runs your pipelines in
containers, pre-push on your local machine and/or post-push in CI

versed holly
tulip wedge
#

Are you referring to Dagger's own cache?

south thorn
worn olive
versed holly
worn olive
ionic hare
#

Nice! Clearly need to play w/ Mermaid

elder bone
#

Has anyone tried Dagger on K8s Windows Node?

daring wraith
#

answered here: #1242180731548209293 message
we read all the channels here! it's fine to just ask in one place, we'll get back to you ❤️

weak cypress
#

👋 I'm trying to upgrade the engine to v0.12 today and I'm getting an odd error when templating the helm chart. We use argoCD, which internally does helm template ...

metadata.labels: Invalid value: "v\"0.12.0\"": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?'

Seems like the PR fixed the double vv$appVersion but introduced an incorrect label: v"$appVersion" (note the quotes after the v)
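
To see why the quoted value is rejected, the validation can be reproduced locally. The Go sketch below uses the regex printed in the error message above, anchored to the whole string, plus the 63-character limit Kubernetes applies to label values. The function name validLabelValue is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// labelValueRE is the pattern quoted in the error message, anchored so the
// whole string has to match.
var labelValueRE = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

// validLabelValue reports whether s is acceptable as a Kubernetes label value.
// (Hypothetical helper; Kubernetes also caps label values at 63 characters.)
func validLabelValue(s string) bool {
	return len(s) <= 63 && labelValueRE.MatchString(s)
}

func main() {
	// The first value is what the broken template rendered, the second is
	// what it was presumably meant to render.
	for _, v := range []string{`v"0.12.0"`, "v0.12.0"} {
		fmt.Printf("%q valid=%t\n", v, validLabelValue(v))
	}
}
```

The embedded double quotes are what break the match; the v prefix itself is fine.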

GitHub

And then add the "v" prefix in front of everywhere it needs to be.

Signed-off-by: Justin Chadwell

daring wraith
#

Bleh

#

You're entirely right

#

i absolutely love templated yaml 😢

#

sorry about that

weak cypress
#

I was in the middle of filing an issue and submitting the PR 😄 Thanks to you 🙂

proud oyster
#

Would be fun to have an official method to generate the Kub configuration with a Dagger pipeline, instead of helm

#

I honestly don't see much value in supporting helm in the first place

#

makes me want to write the module to replace it 😛

weak cypress
#

Sorry to bother again about this... any chance the fixed chart can be released on the oci://registry.dagger.io/dagger-helm?

low gorge
weak cypress
#

Seems like the helm/chart/v0.12.0 tag needs to be moved to f7944784dab9a8b5fc76190bad6502e374ed5f4c?

low gorge
weak cypress
#

Thanks

low gorge
#

Great to know that this unblocks you 💪

south thorn
#

We'll cover a Kubernetes use case in our community call tomorrow. Come to watch the demo and ask all your questions 🙂 #general message

cerulean mountain
#

cc @ionic hare we can continue chatting about ideas here 🙌

ionic hare
#

Tilt-like workflows thread, let's roll

solemn storm
versed holly
solemn storm
torpid crescent
#

@cerulean mountain I don't know if this is something already known or somehow with any possible solution.. But I noticed that if I try to get logs doing kubectl from the host to the k3s running inside dagger, I got:

❯ kubectl logs test
Error from server: Get "https://10.87.0.25:10250/containerLogs/default/test/test": proxy error from 10.87.0.25:6443 while dialing 10.87.0.25:10250, code 502: 502 Bad Gateway

can it be related to the egress selector in k3s?

#

(to be clear, not an issue for any crucial test I want to do, but just noticed it)

torpid crescent
#

Last time I saw it in a similar situation was fixed editing --egress-selector-mode... don't want to give you wrong directions though

cerulean mountain
#

can it be related to the egress selector in k3s?
that's what I was thinking about

torpid crescent
#

Ok

#

If so, should be easy

cerulean mountain
#

pushing a new version of the module with that fix 🙏

torpid crescent
#

Awesome 🔥🔥

cerulean mountain
torpid crescent
south thorn
torpid crescent
#

@cerulean mountain the latest version of rancher/k3s image is breaking the k3s module at the level of the cgroup fix script... I think it would be a good idea to pass the name and the tag of the image as an argument, what do you think? I can open a PR in case

digital night
torpid crescent
#

nope, just randomly picked an old-enough one... so I can only tell that is working with rancher/k3s:v1.28.1-k3s1

digital night
cerulean mountain
#

I'll check out why the module stopped working now

#

@torpid crescent found the cause of the issue, fixing now so it works in the latest version

digital night
cerulean mountain
#

also added a way to optionally specify the image via dagger call --name foo --image rancher/k3s:latest server up

south thorn
#

Seeing y'all use this module is awesome ❤️ Thanks for reporting the issue @torpid crescent !

solemn storm
cerulean mountain
#

I don't have a personal need for that module now, so it's not likely that I'll create it any time soon

solemn storm
amber narwhal
#

I just wanted to say @cerulean mountain I so want to try your module, I just don't have a lot of time these days. But it's on my list. Expect me to test the sh*t out of it. 😄

cerulean mountain
amber narwhal
#

Thanks! It took me a minute to figure out how to use it. (needed to start the service manually)

daring wraith
#

Aha wonderful, I had the same question lol 😆 unfortunately starting the service seems to trigger some very strange dns issues in the engine for me that I need to debug

candid verge
#

Hi, I was trying to use the k3s module with a service binding and fetch the config in a container, but it was not working because the file is not yet generated.
Do you think there is a way to use it without starting it outside of my module?
My goal is to run tests on a Kubernetes operator, so as a first step I would like to start a cluster.
Actually I did something, but I don't think it's a very clean method: https://gist.github.com/Dudesons/a75b8f9d12b389837a0cebd01181b024

Gist

poc_operator_with_k3.go. GitHub Gist: instantly share code, notes, and snippets.

amber narwhal
candid verge
amber narwhal
#

No, I mean simply remove the goroutine and just wait for the service to start.

candid verge
#

OK, I just tried it and yes, it's working. I forgot I can call the Start method, thank you.

#

During my tests I left a time.Sleep(30 * time.Second) in place. When I remove it, k3s is up but coredns and another service are not yet started, so the cluster is not fully ready.

torpid crescent
#

We integrated this already in our gh CI, and it is all green 🙂 a lot to improve in caching and co, so I might have questions, be prepared 😅

cerulean mountain
south thorn
south thorn
south thorn
#

@hybrid pecan is livestreaming the daggerizing of OpenUnison, which uses Helm Charts.

If you are getting started with Dagger and Kubernetes, this stream might be helpful for you! https://www.youtube.com/watch?v=ZDm8e4cS8ek

This live stream series will demonstrate how open source projects can leverage Dagger to enhance their build automation and development environments. Dagger enables developers to define their pipelines and environments using general-purpose programming languages, increasing flexibility and maintainability compared to traditional configuration-ba...

torpid crescent
# south thorn Happening now 👆

https://youtu.be/9FGOATYtpBM "refined" recording here

With this tutorial we test a dummy Kubernetes tool using @Dagger and CI/CD pipelines, we dive deep into modern DevOps practices era, learning how to streamline your workflow, and ensure robust, scalable deployments.

We start from scratch to equip you with the knowledge and skills to help you enhancing your Kubernetes testing experience.

00:00 ...

south thorn
digital walrus
#

any idea when this feature will be available: "Bring-your-own storage for distributed caching"? Java mvn builds on short-lived containers are really slow

proud oyster
# digital walrus any idea when this feature will be available β€œBring-your-own storage for distrib...

No date yet but it's in progress. So definitely this year 🙂 We'll refine the target date as soon as we can.

In the meantime there are options on the infrastructure side. For example if you deploy the Daemonset from our docs, and point your dagger clients to that, you will benefit from the local storage of your kubernetes nodes. It won't be perfect caching, but it should be a noticeable improvement. And the longer-lived your kubernetes nodes, the more noticeable the improvements.

digital walrus
rigid wing
#

dagger operator on k8s - my manager is concerned that running the dagger operator as a daemonset is not safe. How can I convince him? Note: we are already running the jenkins operator, but I would like to offer alternatives like dagger. Thanks in advance

versed holly
#

Hi @rigid wing! What is the concern that your team is expressing?

If you are using Jenkins, you can think of dagger as a complement to it rather than an alternative. It will work alongside that infrastructure and make your pipelines faster and less dependent on jenkins-specific config files

cerulean mountain
tulip tapir
#

managers in a nutshell

rigid wing
#

Right 🙂 Well, the biggest concern is that it runs as a daemonset and with elevated privileges...

hasty briar
amber narwhal
#

@cerulean mountain do you think it would be possible to support container imports in your k3s module?

cerulean mountain
#

I think it's tricky but doable. Mostly because there's a chicken-and-egg problem about k3s being up before actually calling k3s ctr images import. So the module will need to have some funky logic in its entrypoint script so it loads the images after k3s is effectively up. I guess the main limitation to implement this correctly is that the k3s command can't be executed remotely and needs to run in the same place where the k3s server is running

amber narwhal
#

Hm...I just found the registry example in your module. I guess that's a good alternative.

amber narwhal
#

Do you think that's a better approach to get container images built with Dagger installed into k3s @cerulean mountain ?

cerulean mountain
#

that way you have a clear separation about what depends on what and have better caching IMO

amber narwhal
cerulean mountain
amber narwhal
#

Let me see if I can get this working. I'll send a PR if I do.

torpid crescent
#

hello! I'm testing 0.13 with a module of mine that is based on @cerulean mountain's K3s module:

func (m *Interlink) NewInterlink(
    ctx context.Context,
    manifests *dagger.Directory,
    // +optional
    kubeconfig *dagger.File,
    // +optional
    localRegistry *dagger.Service,
    // +optional
    localCluster *dagger.Service,
    // +optional
    // +default="dciangot/docker-plugin:v1"
    pluginImage string,
    // +optional
    pluginEndpoint *dagger.Service,
    // +optional
    pluginConfig *dagger.File,
) (*Interlink, error) {

    //K3s := dag.K3S(m.Name, K3SOpts{Image: "rancher/k3s:v1.28.1-k3s1"}).With(func(k *K3S) *K3S {
    K3s := dag.K3S(m.Name).With(func(k *dagger.K3S) *dagger.K3S {
        return k.WithContainer(
            k.Container().
                WithEnvVariable("BUST", time.Now().String()).
                WithDirectory("/manifests", manifests).
                WithExec([]string{"sh", "-c", `
cat <<EOF > /etc/rancher/k3s/registries.yaml
mirrors:
  "registry:5000":
    endpoint:
      - "http://registry:5000"
EOF`}).
                WithServiceBinding("registry", m.Registry).
                WithServiceBinding("plugin", pluginEndpoint),
        )
    })

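    // Start returns an error if the service fails to come up; surface it.
    if _, err := K3s.Server().Start(ctx); err != nil {
        return nil, err
    }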
    return m, nil
}
weak cypress
#

👋 What's the latest guidance around running dagger in k8s (permissions-wise)?

    privileged: true
    capabilities:
      add:
        - ALL

Are ALL capabilities really required?
Or is there a list of them I could set up to run the engine with enough access?

Context: I'll be meeting with the team that runs K8s as a service for our company and they already expressed concerns about the wide capabilities required

daring wraith
#

🤔

#

i'm not sure why we have the separate capabilities field to add ALL

#

is it not enough to have privileged: true

#

cc @low gorge @versed holly - i don't think we use anything extra over buildkit in terms of privileges, and i've only seen that in the context of setting privileged

#

but note that privileged is effectively running the pod as root - we don't support running dagger in "rootless" mode (yet, maybe one day, though it's not anywhere on the roadmap)

torpid crescent
#

mmm, apparently the K3s module is now unable to resolve dagger services from inside a pod... that is weird because it was working in v0.11. For instance, if I start from this, I cannot get any pod running inside the k3s cluster to resolve "registry" https://github.com/marcosnils/daggerverse/blob/main/k3s/examples/go/main.go , any idea?

GitHub

Personal collection of Dagger modules. Contribute to marcosnils/daggerverse development by creating an account on GitHub.

vocal meadow
#

Hello, everyone 👋
I'm stumped here and any help is appreciated.
I'm finishing the demo for KCD and I'm using @cerulean mountain's k3s module to spin up a cluster with argo-workflows and argo-events installed. The workflow follows the example given in the docs (https://docs.dagger.io/integrations/argo-workflows/), and up to here, so far so good. The problem is that the dagger-engine sidecar container fails to start with the error in the image attached.
I'm running dagger in WSL2 on Windows 10.

Dagger provides a programmable container engine that can be invoked from an Argo Workflow to run a Dagger pipeline. This allows you to benefit from Dagger's caching, debugging, and visualization features, whilst still keeping all of your existing Argo Workflows infrastructure.

vocal meadow
#

Argo Workflows | Dagger

cerulean mountain
torpid crescent
cerulean mountain
#

and pods don't share the host /etc/hosts file 🤔

#

you might be able to resolve services by the service hostname, but I don't think the binding name works

torpid crescent
#

For registry I can't tell, but for the other service I'm quite sure... I can retry the pipeline with dagger v0.11 and see

cerulean mountain
torpid crescent
#

and the latest integration-test of the repo is green... I'm looking in dagger cloud for the traces, one sec

cerulean mountain
#

maybe it's a k3s change? Could you try printing the /etc/hosts file inside the interlink pod?

#

if you see plugin and registry there it's because k3s is setting those there somehow

#

you generally do that with HostAliases in k8s

torpid crescent
#

yeah, in fact, the current behavior should be the correct one..

#

that was the "trick", and I removed it in the current version... ok, now it makes sense, right? case closed

torpid crescent
cerulean mountain
amber narwhal
#

@cerulean mountain is there a reason why your k3s module requires 0.12.4? If not, do you mind bumping it down to 0.12.0? (I've been stuck on 0.12.0 for two months now due to a regression)

quartz lantern
#

👋 We're considering mounting /var/lib/docker in a PV to give us persistent caching across node restarts in k8s; I was just wondering if you had any thoughts on whether that was a good idea.

daring wraith
#

/var/lib/docker? i think you probably want /var/lib/dagger if you're looking for dagger caching 😄

#

but yes, this totally will work - but you do need to make sure that you only have one dagger instance accessing it at a time (if you try and have multiple users of it, the subsequent users will fail out, since it locks the entire db)
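
A rough sketch of that setup, assuming a CSI storage class that supports the ReadWriteOncePod access mode (which lets only a single pod use the volume). All names and the size are hypothetical, and the image tag is simply one that appears later in this channel.

```yaml
# Sketch only: engine state on a volume that a single pod may mount.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dagger-engine-state          # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod               # at most one pod can use the volume
  resources:
    requests:
      storage: 100Gi                 # placeholder size
---
apiVersion: v1
kind: Pod
metadata:
  name: dagger-engine                # hypothetical name
spec:
  containers:
    - name: engine
      image: registry.dagger.io/engine:v0.18.8   # example tag; pin your own
      securityContext:
        privileged: true             # the engine runs privileged
      volumeMounts:
        - name: state
          mountPath: /var/lib/dagger # engine state and cache directory
  volumes:
    - name: state
      persistentVolumeClaim:
        claimName: dagger-engine-state
```

The access mode is what enforces the "one instance at a time" constraint at the Kubernetes level; the engine's own locking still applies on top of it.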

quartz lantern
#

That's the one! Thanks, Jed 🙂

wheat owl
#

Hi!πŸ‘‹πŸ» I'm trying to build a container using dagger from a local dir/repo, and then publish it to a container registry (k3d/k3s on Docker Desktop for Mac) accessible locally as 127.0.0.1:5000.
I already searched the docs but I'm unable to find relevant documentation for this use case...
Is it possible, and is there an easy way to do it? TIA!

versed holly
#

Hi!πŸ‘‹πŸ» I'm trying to build a container

wheat owl
#

Hi again everyone 👋🏻, I'm trying to push an image to a local registry using Dagger, but I'm encountering a TLS error due to a certificate verification issue.
Here's the error I get:

Function execution error: resolve: failed to export: failed to push git.localhost:8443/demo/my-nginx-1:latest: failed to do request: Head "https://git.localhost:8443/v2/demo/my-nginx-1/blobs/sha256:024f2d8883919b1b7a966d1383e87249ff31ddfac3f24a828d7b19c3e953fae9": tls: failed to verify certificate: x509: certificate is valid for 08711bf71c6310b05f686aa8698fa573.07f116773d9e9f9d0869a70a95762eab.traefik.default, not git.localhost

Pushing with docker works. With Docker, I would typically solve a similar problem by configuring an insecure registry or by ignoring self-signed certificates.
I know it is possible to "Configure the Engine to use Custom Certificate Authorities", but I'd like an easier solution for a Kubernetes-based local dev environment I'm setting up.
If there's no other option, I don't mind creating a custom runner, but I would prefer not to have to dump the CA files and then add them to a volume for the custom runner.
Maybe this can be done with a specific ENV var, or with a directive in the dagger configuration files?
Thanks in advance!

worldly gate
lyric breach
#

Connecting to dagger engine times out after upgrading Argo Workflow to the latest version (v3.5.11), dagger engine version v0.13.3

lyric breach
#

I am actually confused on which engine is used, the sidecar's or one installed in the cluster with helm?

vocal meadow
#

I am actually confused on which engine

south thorn
tardy garnet
#

Hey folks, is there a way to run the dagger engine without privileged access in k8s?

weak cypress
# tardy garnet Hey folks Is there a way to run the dagger engine without privileged access in k...

I'm afraid there is not.
You can check this demo where @proud oyster expands a bit on the topic (after ~13:00): https://youtu.be/Sn1w51Vh0mM?t=759

Join Nipuna Perera, Director of Cloud Engineering at Fidelity Investments as he shares insights on integrating Dagger into enterprise workflows. From handling compliance and security challenges to building seamless CI/CD pipelines, see how Dagger transforms software delivery in complex environments.

Want to learn more or have questions? Join us...

sacred osprey
#

Good news, I'm moving my homelab to a colocation site and taking the opportunity to do a bit of refactoring.
I'm considering using Talos Linux for my k8s, because @low gorge is a big fan, but looking at the support matrix (https://www.talos.dev/v1.9/introduction/support-matrix/), it seems like they are phasing out community support in the next version and going the Enterprise route.
Unless someone has a better idea, I'll probably boot with Fedora and run k3s.

Table of supported Talos Linux versions and respective platforms.

low gorge
# sacred osprey Good news, I'm moving my homelab to a colocation site and taking advantage to do...

Hey!

My interpretation of that table is:

  • Community support for 1.8 ended on 2024-12-17, when 1.9 was released
  • Community support for 1.9 ends on 2025-04-15, when 1.10 will be released (this needs to be confirmed since 1.10 stable is not out yet)

In my experience, the Sidero Labs team will do the right thing if it's a genuine bug. Here is the last one that I reported which was backported to 1.8 https://github.com/siderolabs/extensions/pull/580, even though 1.8 is technically out of community support.

I personally run a few homelabs on different versions of Talos, the oldest one being v1.5.5, which is long overdue an upgrade. Since I don't do upgrades in place, I am looking for those few hours when I can restore from backup on one of my newer homelab machines. In practice, 1.5 has been rock solid for me, and while I don't expect to get any support, there hasn't been any need for it.

If you do decide to go down the k3s path, you should probably talk to this like-minded friend: https://github.com/tailscale/tailscale/issues/10814#issuecomment-2479977752

Be on the lookout for generic device plugin issues on k3s - relevant if you want to use Tailscale, GPUs or any hardware devices in containers.

sacred osprey
# low gorge Hey! My interpretation of that table is: - Community support for 1.8 ended on 2...

Oh! Your interpretation is plausible and more encouraging than mine, thank you!
Since I don't know the Sidero folks, I was not sure what to expect. Your endorsement for homelab use is just what I needed.
Since I intend to use it as a community resource, I want to do all the config as code.
I suppose a good start would be https://github.com/onedr0p/cluster-template
unless you have a better suggestion?

GitHub

A template for deploying a Talos Kubernetes cluster including Flux for GitOps - onedr0p/cluster-template

heady dagger
#

Sidero is a great product suite but niche.

low gorge
# sacred osprey Oh! Your interpretation is plausible and more encouraging than mine, thank you! ...

That looks like a great resource, this is the first time that I come across it - just starred it.

Here is an alternative that I am familiar with: https://github.com/mischavandenburg/homelab . If you keep pulling on that thread, you will find a wealth of information from Mischa.

While you may already know about https://makeitwork.tv/from-homelab-to-production/ , I share all the code from the talk with members. This is complimentary for loyal fans like yourself. Just subscribe and I'll take care of the rest 👍

heady dagger
sacred osprey
hardy lynx
#

Hey all! I just recently started looking into Dagger and I was wondering if there were features or modules that would allow dagger to execute similar to tilt https://tilt.dev/
I'm looking for hot reloads on file changes with automatic build and deployments for local development.
If not I think this would be a killer feature to add into dagger.

amber narwhal
proud oyster
#

We will build it I promise 🙂

@hardy lynx if you could note your interest in that issue 👆 that would help prioritize it! thanks

hardy lynx
# proud oyster We will build it I promise 🙂 @hardy lynx if you could note your inte...

✨ amazing! I know this is a stretch and I will mention it on the issue, but do you think dagger would ever invest in a local web ui similar to tilt? I find this feature of tilt very useful as a dev/devops engineer. It would give visibility into dag status + logs for local services + buttons to trigger dag functions such as tests while hot reload is active. I like tilt features but I am not a fan of their domain specific language and it is ill suited for reuse in a ci pipeline.

proud oyster
hallow lodge
#

Hi everyone, I'm using Dagger to create ephemeral containers for integration tests. Now I have to test an application that is supposed to run inside a Kubernetes cluster (it needs to run virtctl to connect to a virtual machine created via Kubevirt), is there a way to create such an environment using Dagger?

#

To be more precise: I'd like to spin up a cluster, and inside of it a VM and a pod running my containerized application that I want to test

#

It might be completely out of scope wrt what Dagger can do so I figured I'd be better off asking before trying something impossible 😅

gritty coyote
gritty coyote
fringe wagon
#

Ola!
I had an issue trying to deploy the dagger engine using helm on our CI clusters. We're using flux and I think there's a bug with the version label when the version contains invalid characters.
Anyway, I proposed https://github.com/dagger/dagger/pull/9679 which should address the issue.

GitHub

The Helm app.kubernetes.io/version label is not sanitized. This can lead to incorrect template generation on some system updating the version.
For example with flux, we need to use an OCI registry ...

bitter idol
#

Who's headed to Kubecon? 😄

daring wraith
#

a few of us from dagger will be there, we've got a booth 😄 looking forward to seeing ya 👋

bitter idol
#

Sweet! Thought I saw you guys on the sponsor list - do you know which booth you're at?

daring wraith
#

oh i actually don't know off the top of my head! I think @south thorn might?

harsh creek
#

This might actually belong here

south thorn
# bitter idol Sweet! Thought I saw you guys on the sponsor list - do you know which booth you'...

We will be at Platform Engineering Day (co-lo day before KubeCon). You'll see us at a table there.

At KubeCon, our booth is #N453.

And as Justin mentioned, we'd love to see you at the Hack Night too on April 1st! You don't need a KubeCon ticket to attend, so everyone is welcome!

Make sure to register to save your spot: https://lu.ma/hlx7s6ym

Dagger is an open-source runtime for composable workflows - perfect for AI agents and CI/CD automation alike. Whether you're streamlining DevOps with modular,...

peak bolt
#

is it possible to point my dagger CLI at the dagger engine running on a remote cluster?

peak bolt
shrewd hazel
# peak bolt to clarify, i want to run the dagger cli on my laptop.

AFAIK, this isn't possible, because the communication between the engine and runners is internal to the node they are both running on. Dagger doesn't have a service to connect to. But AFAIK, you should be able to just install Dagger CLI locally and run your modules locally. The idea being, you can run the tasks/ pipelines anywhere you would have a docker-like runtime running.

Just FYI (and humble bragging 😊), in my k8s setup, I'm running the CLI inside a [Coder workspace](https://coder.com/docs/user-guides/workspace-access) on the node with the engine for development of the CI pipeline. This allows me to set up things like webhooks and the like to control CI processes and be able to see them work "in action".

gritty coyote
shrewd hazel
# gritty coyote I believe you are looking for this configuration? Setting `_EXPERIMENTAL_DAGGER_...

@gritty coyote - @peak bolt mentioned a "remote cluster", which means k8s to me. I guess that needs clarification, but if Dagger is running in a k8s cluster, there is no service for it and no way to create a service for it, AFAIK (I'd love to be told I'm wrong). Without making a service available, there is no way to get access to the runner from outside the cluster. The only thing that points to me being wrong is the tcp option. But then, one would need to know how to present the port from the runner/ engine perspective and that is the piece of knowledge I'm missing to also help.

vale ore
# shrewd hazel @gritty coyote - @peak bolt mentioned a "remote cluster", whi...

You can def connect to remote resources, it's a common pattern and lots of folks do it.

The main "gotcha" is you're responsible for securing the connection yourself.

So tcp is a decent choice if you are port forwarding via SSH or using something like tailscale.

In the same way that kubectl can connect to a remote k8s service, the dagger cli can point to a pod using this form: kube-pod://<podname>?context=<context>&namespace=<namespace>&container=<container> - as long as your kubectl is already configured to know how to reach this service.

peak bolt
shrewd hazel
vale ore
# shrewd hazel I'm confused. I'm looking at my Dagger engine pod in k8s. There are no ports ope...

I think you could open a port if you wanted to in the kubernetes config, but instead of doing that I think using the kube-pod: option with a properly configured kubeconfig file is the way to go

The overall point is that there is no dagger-specific way to connect to stuff securely, however you do it today you should choose the best corresponding option from this list: https://docs.dagger.io/configuration/custom-runner/#connection-interface

A runner is the "backend" of Dagger where containers are actually executed.

shrewd hazel
proud oyster
#

It's fair to say that it's possible but experimental - because there are lots of different possible architectures, and lots of different preferences in the community. So we want to learn more before we declare a certain architecture better than others.

shrewd hazel
#

I took a second look. There is a port setting for the engine in values.yaml. I set that and the engine pod now has an exposed port, which would allow me to create a service and expose to the outside world (if needed). I'll stick to the socket solution though, as it is working for us. At least now though, I know I can get the engine API exposed, should I need it, which wasn't clear to me before. Thanks all and sorry for my ignorance.

proud oyster
whole notch
#

Hi. I have a project with multiple repos in BitBucket. I'm leveraging autoscaled BB CI runners on my EKS cluster to run CI jobs. Now, spinning a dagger engine in every CI job is a bit.. underwhelming. It's wasteful and does not give me the benefit of a dagger cache.

I'm looking into spinning up dagger engine on the same EKS using your Helm chart as a daemonset. Instructions say I'm to configure CI runners to point to dagger engine pods. The problem is, my CI runners are auto-scaled and when a CI job is enqueued, there is no way to know on which node it will be scheduled. So if my CI definition yaml points to a pod name, there's a very good chance that pod will be running on a completely different node, in a completely different AZ.

Is there a way to pin dagger-client pods to dagger-engine daemonset pods running on the same node, no matter how many nodes are in the cluster?

EDIT: one of the reasons I'm concerned about this is because some projects have multi-gig codebases. And copying them to another server doesn't feel right, when there already IS a perfectly fine dagger engine running on the same server 🙂

whole notch
#

Hi. I have a project with multiple repos

fringe wagon
#

Hey there, I think there's an issue either in the documentation or the Helm chart wrt host mount path.
In the helm chart, the mount is of the form:

/run/dagger-{{ include "dagger.fullname" . }}

which, by default would lead to /run/dagger-dagger.

However, in the doc for the GitLab CI runner the example mentions host_path = "/run/dagger", which would lead to a corrupted mount. I don't know if it's worth an issue, but just wanted people to be aware.

cerulean mountain
#

seems like after this PR (https://github.com/dagger/dagger/pull/9845), if you install the helm chart with the default instructions provided here: https://docs.dagger.io/ci/integrations/kubernetes/#example, the defined volumes are mounted as follows:

        serviceAccountName: default
        terminationGracePeriodSeconds: 300
        volumes:
        - hostPath:
            path: /var/lib/dagger-dagger-dagger-helm
            type: ""
          name: varlibdagger
        - hostPath:
            path: /run/dagger-dagger-dagger-helm
            type: ""
          name: varrundagger
    updateStrategy:

which seems a bit odd to me, since we have a good amount of dagger names in the path 😛

This section covers different strategies for deploying Dagger on a Kubernetes cluster.

#

seems like that should be fixed 🙏 cc @low gorge @versed holly

cerulean mountain
heady dagger
half canyon
cerulean mountain
#
        terminationGracePeriodSeconds: 300
        volumes:
        - hostPath:
            path: /var/lib/dagger-dagger-dagger-helm
            type: ""
          name: varlibdagger
        - hostPath:
            path: /run/dagger-dagger-dagger-helm
            type: ""
          name: varrundagger
    updateStrategy:

that's what I get

half canyon
#

I'll look at my testing output on RKE2 from the other day and see what I got there

half canyon
#

got same results using docs instructions...are we prefixing with ${namespace}- or something?

#

I have some recommendations from ChatGPT...based on the fact that it installs differently through the Rancher UI compared to helm upgrade.

some options are to do one of:

  1. Override fullname explicitly in values.yaml:

     fullnameOverride: dagger-engine

  2. Change your fullname helper to just .Release.Name in _helpers.tpl:

     {{- define "dagger.fullname" -}}
     {{- .Release.Name | trunc 63 | trimSuffix "-" }}
     {{- end }}

Seems we may want to also move toward calling our release dagger vs dagger-helm, but not sure who that would break at this point. Maybe later.

Will explore more tomorrow.

#

Also look at

path: /run/{{ include "dagger.fullname" . }}

and ensure it's not

path: /run/dagger-{{ include "dagger.fullname" . }}
shrewd hazel
#

It was a stock install via Rancher UI many moons ago. I've been upgrading regularly though with no problems.

cerulean mountain
fallow arrow
#

Hey @cerulean mountain I am trying to use your k3s module within my company and running into a strange issue. I had to point it to our internal registry mirrors. To do that, I have to add /etc/rancher/k3s/registries.yaml so that the server picks it up when it starts. In your module, the folder /etc/rancher/k3s is a cache volume to persist the k3s.yaml (KUBECONFIG). So what I did was modify the module to set K3S_KUBECONFIG_OUTPUT to a different folder and set that as the cache. However, now I can't bring the server up without changing the cache name every time. The config it tries to use is stale (certs are wrong). It works if I change the --name every time. I can't figure out what's going on and was wondering if you ran into something similar.
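
For context, a minimal sketch of what such a registries.yaml can look like. The mirror hostname and CA path are hypothetical placeholders, and the `configs` section is only needed when the mirror requires a private CA or credentials.

```yaml
# /etc/rancher/k3s/registries.yaml (sketch only)
mirrors:
  docker.io:
    endpoint:
      - "https://registry-mirror.example.internal"   # hypothetical internal mirror
configs:
  "registry-mirror.example.internal":
    tls:
      ca_file: /etc/ssl/certs/internal-ca.pem        # hypothetical CA bundle path
```

k3s reads this file at server start, which is why it has to be in place before the server comes up.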

cerulean mountain
cerulean mountain
fallow arrow
#

Oh! good to hear we are on the same page! 😄 sorry that it's breaking for you too

#

I could get it to work by removing the /var/lib/rancher cache. But that means it bootstraps a new cluster and resources every time

cerulean mountain
#

it used to work where the same cluster could be started multiple times though

fallow arrow
#

Right, I was seeing that behavior just a couple of days ago. It's a coincidence that I tested it today by moving the kubeconfig cache and it broke, which made me think my change broke it

cerulean mountain
#

ok @fallow arrow seems like v1.29.15-k3s1 works

#

I assume it's related to the fact that the IP of the service container changes on every run and the certificates become invalid, which makes sense. That will make it somewhat hard to make it work in newer versions.

fallow arrow
#

fwiw, I notice that local-path-provisioner is missing in the latest version on kube-system

cerulean mountain
#

seems to be there in latest?

fallow arrow
#

hmm I didn't see it.. another thing, every time I start the server it spins up a new node but isn't able to clean up old ones. Are you seeing the same?

#

pods are also stuck in Terminating

#

btw, I see the local-path-provisioner in latest, false alarm

cerulean mountain
fallow arrow
#

works fine when I remove the /var/lib/rancher cache though. But then again that bootstraps everything.

#

latest is on v1.32 so that's 3 kube versions ahead of the working 1.29

cerulean mountain
#

ok, found a "good enough" workaround, I believe

#

if I remove the server/tls folder when calling Server, that will bootstrap the TLS certificates automatically. Thing is that the old node will still appear. I believe that's inevitable @fallow arrow since the previous state will still be stored in ETCD

#

I think that has always happened and I'd expect it to be like that. Eventually the node will become NotReady

fallow arrow
#

so rm -rf server/tls before server start?

cerulean mountain
fallow arrow
#

what's the full path?

cerulean mountain
#

/var/lib/rancher/k3s/server/tls

fallow arrow
#

I think a cache bust is also needed

cerulean mountain
#

publishing now

fallow arrow
#

does your helm example still work?

cerulean mountain
#

I think I was the one causing this issue in v0.1.9 since I've moved the /var/lib/rancher to a cache volume instead of a mounted temp

cerulean mountain
fallow arrow
#

Error: INSTALLATION FAILED: Kubernetes cluster unreachable

cerulean mountain
#

seems to work here @fallow arrow

candid verge
#

Hi, I'm running dagger with self hosted github runner pod.
The pod is composed of 2 containers:

  • github runner with a dagger cli
  • dind container

Sometimes I can see some very slow step from dagger "internal actions".

29  : loadPackage DONE [11.0s]
22  : go SDK: load runtime DONE [53.4s]
30  : loadPackage DONE [23.2s]
1   : with-source with-aws-creds --src=~/.aws/ with-kube-config --src=~/.kube/ with-remote-ci-config apply --env=dev --region=eu-west-1 --account=eustaging --stack=market_trends --tfPlan=./untracked_files/tfplan_eustaging
10  : │ load module
33  : │ │ inspecting module metadata
16  : │ │ initializing module DONE [1m16s]
18  : ModuleSource.asModule DONE [1m16s]
34  : Module.serve: Void
34  : Module.serve DONE [0.0s]

On some jobs the function execution takes 1min40-2min, but the whole job takes 5-6min.
In another call, where I send a notification to Slack, firing the notification takes about 1s, but the whole dagger execution takes 35-40s.

Could this be due to running dagger in a dind container, and should I remove my dind container and use dagger directly with the host engine?
For the moment I haven't installed dagger as a daemonset because when a dagger version changes for a test / rollout, each team will handle the upgrade by specifying the new agent label

versed holly
candid verge
#

so I should try to replace my dind container with the dagger one? (tbh I don't remember why I use dind instead of dagger directly)

versed holly
#

So far running Dagger standalone, connecting via UDS and mounting /var/lib/dagger with xfs has given us the best performance improvements. We wrote about it here: https://dagger.io/blog/argo-cd-kubernetes. There is a section called "Nodes" where we briefly explain how we did it and the reason behind it.

You can try that out! Out of curiosity, are there any CPU limits on the runner container? What kind of hardware are you rocking there?

#

Happy to help with the setup 👍

candid verge
#

it's running in an eks cluster

#

We don't have a specific hardware setup; this is actually the first time we are deploying Dagger in our CI, in order to migrate to it.
I will try to set up dagger as a sidecar of the container + a volume with xfs + uds

#

I'll keep you posted next week; I will be off at the end of this week

#

thank you for all this information

versed holly
hasty briar
#

I'm unsure where this one belongs, #kubernetes or #github , but here goes. I'm hosting GitHub Runners via ARC in AKS. All has been fine up to (and including) v0.16.1 using the Dagger Helm Chart, and hostPath mounts from the runner pods for /var/run/dagger.

With v0.18.6 (from v0.16.1) this is now failing. My runners are unable to connect to the Dagger Engine deployed to the Host. My pipelines endlessly error with:

! connection error: desc = "transport: Error while dialing: dial unix /run/dagger/engine.sock: connect: no such file or directory"
moby.buildkit.v1.Control/Info
moby.buildkit.v1.Control/Info ERROR [0.0s]

One oddity I have spotted, is that the runner pods have the /run/dagger/buildkitd.sock, but the Dagger-Engine DaemonSet pods have the /run/dagger/engine.sock

Even though my RunnerDeployment configuration is (with cuts for sensitivity) this:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: dagger-runners
  namespace: runner-pool
spec:
  replicas: 2
  template:
    spec:
      organization: --snip--
      image: ghcr.io/--snip--:v0.61.0
      labels:
        - build
        - dagger-runner
      dockerEnabled: true
      dockerdWithinRunnerContainer: true
      securityContext:
        fsGroup: 1001
        fsGroupChangePolicy: OnRootMismatch
      containers:
        - name: runner
          volumeMounts:
            - name: dagger
              mountPath: /run/dagger
          env:
            - name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
              value: unix:///run/dagger/engine.sock
      volumes:
        - name: dagger
          hostPath:
            path: /run/dagger

I'm not sure where the buildkitd.sock is coming from, and why the engine.sock is not available on the runner. Any pointers? 🥹

proud oyster
#

@eternal kraken @versed holly 👆 at first glance this looks like a case of the engine changing its default socket path, and either 1) the latest helm chart still using the old path, or 2) latest helm chart using the correct path but needing some sort of manual migration from older helm chart?

cerulean mountain
cerulean mountain
candid verge
#

Hi, I'm trying to deploy dagger in kubernetes as a sidecar of my github runner, but when I start my runner the dagger engine crashes, and it's not very clear what I'm missing.
I'm sharing my github runner manifest + pod logs + events
(This is a first step; the second step will be to create a pvc with xfs, but before creating a webhook mutator I wanted to validate the setup)

candid verge
#

Hi, I'm trying to deploy dagger in

candid verge
#

So far running Dagger standalone,

limber folio
#

What permissions does the dagger cli need in order to talk to the dagger engine instance running in cluster?

versed holly
limber folio
#

Sorry, should have been more specific, I was using the kube-pod URL setting in the env var, which seemed to require a service account that can access the kube API. Thanks for getting back to me.

I'm trying to configure my gitlab runner to use it and am finding it to be quite slow, but I can't tell where it is slow because the dagger cli was not giving any output. I've just added a -v and am now getting output for the job

limber folio
#

🤔 -v only worked in outputting the log on one job

versed holly
# limber folio Sorry, should have been more specific, I was using the `kube-pod` URL setting in...

Ah, good to know 👍 . Does that mean you were able to make it work with the sa configured? Internally buildkit ends up doing kubectl exec so it needs create on pods/exec and get on pods (FYI: https://github.com/moby/buildkit/blob/0f85fe73978ed3d0b40d935438fa0a16eebbb0ae/client/connhelper/kubepod/kubepod.go#L21-L33)

Performance using a tunnel is usually not great even for simple tasks. Is the gitlab runner running on the same cluster as the engine?
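
As an illustration, the permissions described above (create on pods/exec, get on pods) could be granted with a Role and RoleBinding like the sketch below. The namespaces and the service account name are hypothetical and need to match your own engine and runner setup.

```yaml
# Sketch only: minimal RBAC for a client that connects through kube-pod://
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dagger-kube-pod-client     # hypothetical name
  namespace: dagger                # assumed namespace of the engine pods
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dagger-kube-pod-client
  namespace: dagger
subjects:
  - kind: ServiceAccount
    name: gitlab-runner            # hypothetical service account of the CI job
    namespace: gitlab              # hypothetical namespace of the runner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dagger-kube-pod-client
```

Scoping the Role to the engine's namespace keeps the exec permission away from other workloads in the cluster.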

limber folio
#

Yeah, got it working with the properly configured sa and role. So would it be better to use tcp://<address:port> connection rather than kube-pod://<podname>?

#

The runner is on the same cluster as the dagger engine pods; they are set up in the daemonset from the helm chart

versed holly
#

TCP would definitely be better. Nothing beats a unix domain socket though, so if they are running on the same host you could mount /var/run/buildkit and use unix:///var/run/buildkit/buildkitd.sock. That said, I think I misread your message

The heavy lifting is mostly done by the engine itself. If the pipeline is being slow then there's a few things worth checking. My first question would be: are you sending a dagger.Directory from the client to the engine? If so, I would first test using tcp:// (preferably unix://) instead

limber folio
#

Yes, we are loading the directory, I'll see about mounting the socket

versed holly
#

Nice, let me know if I can help! If you are using our helm chart, then doing this on the gitlab runner pod should be enough:

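  # On the runner container: point the client at the mounted socket
  env:
    - name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
      value: unix:///var/run/buildkit/buildkitd.sock
  volumeMounts:
    - name: varrundagger
      mountPath: /var/run/buildkit
# On the pod spec: expose the host directory that holds the engine socket
volumes:
- name: varrundagger
  hostPath:
    path: /var/run/dagger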
limber folio
#

On the pod or in the runner config.toml? I assume the latter

versed holly
#

I'm not entirely sure how Gitlab runners are configured. However it is done, the end result needs to be a volume that mounts the host's socket directory (/var/run/dagger in the snippet above) into the container of the runner

#

That way you have access to the socket the dagger engine opened

limber folio
#

Just looking at the dagger pod, it seems to have /run/dagger-dagger-engine-dagger-helm and /var/lib/dagger... should I be using the run path?

versed holly
limber folio
#

Installed via the oci image with flux helm repo and release

#

And the socket is engine.sock

#

Job completed in 5 minutes, still no log output showing. Assuming the pods do not replicate their cache between each other?

#

Anyway, gonna leave it alone for now and check it again on Monday, thanks for the help @versed holly

limber folio
#

couldn't help myself, took a look again this morning, removed the devbox setup, running registry.dagger.io/engine:v0.18.8 directly as the CI Job image and the dagger call commands directly instead of via devbox run and it's all a lot faster. Makes me wonder what is it devbox is doing that is impacting things so much.

visual abyss
#

Hi, I am running the Dagger engine inside a K3s cluster. The K3s pods can utilize the NVIDIA GPU, but the Dagger builds cannot.

I'm getting the following error when running:
dagger -m github.com/samalba/dagger-modules/nvidia-gpu call has-gpu

I deployed the engine using the Helm chart with the following overrides (see the photo).

Has anyone tested GPU utilization from the engine running inside Kubernetes?
Thanks in advance.

versed holly
visual abyss
harsh creek
#

I am running Dagger in k8s using ephemeral-storage. I have set resource limits but Dagger engine does not seem to be aware of the virtual disk size and thus does not garbage collect causing diskpressure and eventual eviction. How do I make Dagger aware of the disk size? I have no desire to use a custom gc policy unless that is the only way to handle it

harsh comet
#

We are running two Dagger Engines in two parallel running Argo Workflows/Steps. One of the two was unable to connect to the Dagger sidecar via socket. Is there a limitation?

harsh comet
#

I've been trying for two days to include a Docker registry certificate and cannot get it to work. It seems the only option is to build a custom engine Docker image or to mount the certificate into the engine image. No other, simpler configuration seems to be available.

high ermine
#

Hi, I'm trying to deploy dagger on a K8s cluster but I have an issue with the readiness probe. Even if I use the official dagger engine image, the readiness probe fails and I get the following Event: Error: start engine: no driver for scheme "" found

Used image: registry.dagger.io/engine:v0.18.2

Could anyone give any hint about how this can be fixed ?

daring wraith
daring wraith
#

what have you set it to? 👀

high ermine
#

# Environment variables
env:
  - name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
    value: "kubernetes"

#

it's my first time using it, so most probably I'm doing it wrong
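
The empty scheme in the error suggests the value is parsed as a URL, so a bare word like "kubernetes" would not match any driver. As a sketch only, these are the scheme-qualified forms that appear elsewhere in this channel; the socket path depends on how the engine was installed.

```yaml
# Sketch only: the value carries a scheme (unix://, tcp://, kube-pod://).
env:
  - name: _EXPERIMENTAL_DAGGER_RUNNER_HOST
    # Unix socket shared from an engine on the same node:
    value: unix:///run/dagger/engine.sock
    # Other forms mentioned in this channel:
    #   tcp://<address:port>
    #   kube-pod://<podname>?context=<context>&namespace=<namespace>&container=<container>
```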

daring wraith
high ermine
#

Thank you

high ermine
#

is there any other way to connect from another pod to dagger without using _EXPERIMENTAL_DAGGER_RUNNER_HOST ? Don't know, having dagger behind a service for example ?

#

I mean, inside of same K8s cluster

twilit dust
#

Anyone running dagger in an openshift environment? Tried the instructions but I have not had success. I think due to SCC issues?

shrewd hazel
#

@twilit dust - Yeah. Dagger needs root privileges to do its thing. So, SCC will definitely get in the way. Though, I believe you can get around it. You just have to set up a service account with wider permissions and assign it to Dagger to use.
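
One possible shape for that, as a sketch: a dedicated service account bound to the privileged SCC. This assumes an OpenShift 4.x cluster that ships the `system:openshift:scc:privileged` ClusterRole (worth confirming on your cluster first); the names and namespace are hypothetical.

```yaml
# Sketch only: let a dedicated service account use the privileged SCC.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dagger-engine                    # hypothetical name
  namespace: dagger                      # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dagger-engine-privileged-scc     # hypothetical name
  namespace: dagger
subjects:
  - kind: ServiceAccount
    name: dagger-engine
    namespace: dagger
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged  # assumed to exist on the cluster
```

The engine pods then need to run under this service account for the SCC to apply.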

twilit dust
shrewd hazel
wet linden
#

Hey, I was going over the docs on how to set up k8s.

wet linden
#
  1. https://archive.docs.dagger.io/0.9/194031/kubernetes/
  2. https://docs.dagger.io/ci/integrations/kubernetes

There are a couple of things I want to understand and ask:

  1. Currently we run our CI in k8s on temporary nodes, i.e. the number of nodes increases or decreases based on load. Our current pipeline looks something like this: Git push -> Trigger GHA -> GHA controller creates a new pod to run the pipeline -> New pod builds and pushes ...... Now:
    1.a: The 1st link says I am required to install the dagger CLI locally. With 0.18.12, is that still required?
    1.b: Because of the temporary nodes, we won't have a predictable pipeline time, since nodes would be deleted. To prevent this, I was thinking of mounting an EFS volume on the nodes, where we can store the docker cache. I found this tool to mount it: https://github.com/kubernetes-sigs/aws-efs-csi-driver
  2. My understanding is that the Dagger engine caches its pipeline in Docker's cache, so there is no separate path that needs to be cached.
  3. Dagger would still need to be triggered by GitHub Actions. What will I need to change here. If dagger CLI isn't installed, how will this work? There's a GitHub actions for Dagger, which will install the CLI I assume. But that CLI will be installed inside GitHub's pod, so how will it talk to Dagger's controller?

If anyone has done 3, please share the steps.

PS: I have little to no knowledge of k8s, compared to containers.

#

Also, what I said above, is this even possible?

proud oyster
#

@wet linden the Dagger engine does not use the Docker cache (or any other feature of docker). It stores its cache in a local state directory.

If you run the Dagger engine in Docker or Kubernetes (pretty common), then that local state directory will be in a volume. It's up to you to manage that volume to balance data persistence, performance, reliability etc.

However you should be mindful of the following:

  1. Dagger does not support concurrent writes to its state directory
  2. Dagger is very IO-intensive, so if you mount its state directory from a remote source with poor IO latency, you will get poor performance
wet linden
# proud oyster @wet linden the Dagger engine does not use the Docker cache (or any ot...

Thanks. Then I guess mounting EFS for Dagger doesn't make sense either.

In my initial PoC, I ran a persistent node with the dagger CLI installed in a self-hosted GitHub Actions runner. But GHA became our bottleneck, since it only supported 1 job at a time, while one Dagger engine can run multiple pipelines at a time (as far as I know, and based on some testing).

So, I know Dagger has experimental support (https://github.com/dagger/dagger/issues/9516 and/or https://docs.dagger.io/configuration/custom-runner/#connection-interface) for a remote engine, i.e. the dagger client/CLI can run in GHA pods and the engine can run on a persistent machine. This should ideally be the best solution, right?

proud oyster
# wet linden Thanks. Then I guess mounting EFS to even to dagger doesn't make sense. In my i...

Yes, with the current version of the engine there are 2 well-tested architectures for a self-hosted GitHub Actions cluster on Kubernetes:

  1. Run a dagger engine on each node of your CI cluster, using a daemonset. Then configure your CI runner to connect to its local node's dagger engine using a unix socket. This is the default configuration in our official helm chart, and in the docs.

  2. Run a dagger engine on a separate machine, and connect to it remotely with _EXPERIMENTAL_DAGGER_RUNNER_HOST. We've labeled it experimental to reserve the right to break the protocol in future releases, but it works well. You can also run a cluster of engines, and load-balance across them, although that's slightly less charted territory. And blindly load-balancing a wide variety of dagger workloads tends to lower your cache hit rate (cache locality is strongest for successive runs of the same pipeline). One very promising architecture is to run dedicated engines for certain pipelines, and configure _EXPERIMENTAL_DAGGER_RUNNER_HOST so that successive runs of the same workflow are always routed to the same engine (or pool of engines).
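
One way to route successive runs of the same workflow to the same engine is to hash the workflow name onto a fixed pool of engine addresses and export the result as the runner host before calling dagger. The Go sketch below is only an illustration of that idea: Dagger does not provide this routing itself, and pickEngine, the workflow names and the tcp:// addresses are all hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickEngine maps a workflow name onto one engine address from the pool.
// The same name always maps to the same address while the pool is unchanged.
func pickEngine(workflow string, engines []string) string {
	if len(engines) == 0 {
		return ""
	}
	h := fnv.New32a()
	h.Write([]byte(workflow)) // Write on a hash.Hash never returns an error
	return engines[h.Sum32()%uint32(len(engines))]
}

func main() {
	// Hypothetical engine addresses; replace with your own.
	pool := []string{
		"tcp://dagger-engine-0.ci.internal:1234",
		"tcp://dagger-engine-1.ci.internal:1234",
		"tcp://dagger-engine-2.ci.internal:1234",
	}
	for _, wf := range []string{"build-api", "build-web", "e2e-tests"} {
		fmt.Printf("%s -> %s\n", wf, pickEngine(wf, pool))
	}
}
```

Plain modulo hashing reshuffles most workflows whenever the pool size changes, which costs cache hits; a consistent-hashing scheme or a static workflow-to-engine map would avoid that at the price of more configuration.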

#

Typically, the main constraint for any architecture is cache distribution.

We're working hard to decouple storage and compute in the engine, which will make everything much simpler. Soon!

glad locust
#

For 3

Dagger would still need to be triggered by GitHub Actions. What will I need to change here. If dagger CLI isn't installed, how will this work? There's a GitHub actions for Dagger, which will install the CLI I assume. But that CLI will be installed inside GitHub's pod, so how will it talk to Dagger's controller?

That's correct, the dagger-for-github action (https://github.com/dagger/dagger-for-github) will install the CLI. Some more examples here: https://docs.dagger.io/ci/integrations/github-actions

The GitHub Actions pod will be able to use the dagger CLI to run your dagger functions, and those will be executed on the dagger engine specified by _EXPERIMENTAL_DAGGER_RUNNER_HOST, like Solomon mentioned. More info on that here: https://docs.dagger.io/configuration/custom-runner/#connection-interface

wet linden
#

Thanks both of you. Now I have a better understanding of how dagger works and what I need to do to reach a desired solution.

#

Just curious @proud oyster, doesn't decoupling the CLI and engine risk Dagger Cloud's business? From what I remember, one of the features Dagger Cloud offers is cross-region caching. I believe this experimental feature makes it possible, in a way, to replicate that?

#

@glad locust with this experimental flag, we will most likely be using the 6th option (IP address and port). But I don't see a port being exposed by the dagger engine in docker ps, nor is it mentioned anywhere in the docs.

proud oyster
# wet linden Just curious @proud oyster, decoupling CLI and engine doesn't risk the D...

We used to sell hosted distributed caching as an experimental service, but paused that, because of the engine limitations.

Yes, once we decouple storage and compute, we make it easier to e.g. use an S3 bucket for shared cache distribution. In theory you could say it hurts our business opportunities. But in practice, not really: storing engine cache on an S3 bucket should be the bare minimum. There is a lot more value that a commercial product can add beyond that.

#

Also: you mention decoupling dagger CLI and engine, but that's different and not what I mean by "decoupling compute and storage"

wet linden
#

Hmmm, makes sense.

you mention decoupling dagger CLI and engine, but that's different and not what I mean by "decoupling compute and storage"

Yeah, I was reading https://github.com/dagger/dagger/issues/9516 and confused CLI/Engine with compute and storage.
So what the team is trying to achieve is: separation of responsibility between the CLI and the engine and, within the engine, separation of compute and storage.

Great work man. I just feel like any org that I join and has a broken CI/CD, I ask them to replace it with Dagger. Like an elixir which heals everything XD.

glad locust
# wet linden @glad locust with this experimental flag, we will most likey be using t...

You can configure the engine to listen on a TCP port by passing the extra args --addr tcp://0.0.0.0:1234 to the engine container. However, if you're running the engine on the same node as your CI runners, either as a sidecar or daemonset, connecting over a unix socket is the more common approach. That is achieved by creating a shared volume between the CI runner pod and the engine pod for the socket.

wet linden
# glad locust You can configure the engine to listen on a tcp port by passing the extra args `...

Hey Kyle,
If the current setup is: Remote Dagger Engine and GHA running in k8 pods.
Then, to do a dagger call, I would need to do a partial clone of the repo in the k8s pod and run dagger on the remote machine. How will it do the build if all the source code that needs to be built resides on the k8s pod?

A hack I know around this: do a partial clone of the repo, then clone the repo again inside a function and pass the source code to the build function.

Is there a better way to do this?

glad locust
# wet linden Hey Kyle, If the current setup is: Remote Dagger Engine and GHA running in k8 po...

regardless of where it runs, dagger always has this client<->server relationship between the CLI and engine. Anything passed to a function through an argument, like your source Directory, is transferred to the engine at runtime and likewise anything exported from the engine is transferred to the client side.

You mention partial clone - if the entire source needed to build isn't cloned for the dagger call, you could optimize this interaction by passing the git ref you want to build as the argument rather than a local directory. For example dagger call build --source https://github.com/myorg/myproject@abcd1234 instead of dagger call build --source .

steep thorn
#

Quick question, when a dagger container is run on k8s, is that container run as a pod or is the engine using the container runtime directly or is it more at the cgroup level?

proud oyster
#

dagger engine runs as a privileged pod on k8s; then it runs its own containers itself by hitting the kernel directly

#

basically dagger can use k8s as a provisioner, but it doesn't rely on it as a runtime

#

same with docker, podman etc

steep thorn
#

Okay cool, so would that be through the cgroups api then? Kind does something similar to get CRI-O running in containers

proud oyster
steep thorn
#

Is the dagger runtime OCI compliant as a result?

fallow olive
# proud oyster dagger engine runs as a privileged pod on k8s; then it runs its own containers i...

Found this message through searching : )
I'm wondering what your stance is on this approach. Basically, I love your product and have used it heavily for the last week.
The problem is that we run our CI on Jenkins through Kubernetes (each CI job runs on an ephemeral pod).
My org won't allow me to deploy a privileged pod, since they see it, quite fairly, as a vulnerability/threat.
I see Dagger as a product that will help me write better CI/CD code from many perspectives. I hope I'm not being rude here, but I'd really appreciate any advice, documentation, or community insights that could help me communicate the value and security posture of Dagger in a k8s environment. For example, we currently build images with kaniko because of this : )

proud oyster
cyan lily
#

Based on our previous discussion, it seems that the Dagger Engine uses a container-in-container pattern.
I was wondering about the resource limits and requests defined in the pod spec: should the containers launched by Dagger respect those?
From what I've tested on my cluster, it doesn't seem like they do.

proud oyster
# fallow olive Found this message through searching : ) I'm wondering what's your stand on this...

Hello! It's a fair question. Dagger is vertically integrated: it bundles its own container runtime, orchestrator and cache system. This is what makes its unique features possible.

You should take that vertical integration into account when securing Dagger.

Since Dagger is a system component, you should focus on securing it at the node level, not the pod level. If you consider your dagger workloads untrusted, then you should run them in a separate cluster, or in a segregated set of nodes in the same cluster.

fallow olive
weak cypress
#

For what it's worth: that's how we approached our internal security review (we had a very similar environment, Jenkins running in k8s with ephemeral worker pods). It helped a lot when Solomon put it in that context. You need to think of dagger as its own system (not another workload that you bundle into a k8s cluster, if that makes sense).

We have moved our CI workloads into their own k8s cluster, isolated from any other workloads and ran many privileged engines

steep thorn
nova dome
#

Hey everyone. I'm currently using the k3s module to stand up a kubernetes cluster to do some testing.

However, part of my tests require that I use CAPD (Docker implementation of CAPI). Which requires the docker socket for creating "Machines" that are backed by Docker. It appears that there is no docker socket running in this env.

My question is, should I start using the kind cluster instead (which module, there are quite a few in the daggerverse?) Or should I try to get the k3s module to work with CAPD?

Thanks for any advice!

nova dome
# nova dome Hey everyone. I'm currently using the [k3s module](https://daggerverse.dev/mod/g...

It appears that CAPD cannot connect to the cluster that it has created:

E0811 20:10:44.313611 1 cluster_accessor.go:262] "Connect failed" err="error creating HTTP client and mapper: cluster is not reachable: Get "https://x.x.x.x:6443/?timeout=5s\": context deadline exceeded" controller="clustercache" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/test-cluster" namespace="default" name="test-cluster" reconcileID="d3e99ed5-a9fe-4e36-8295-c4e40e0a0b11"

So I'm not really sure what the right way is to get k3s to communicate with the Docker machines that end up being created on the host machine. I must be fundamentally missing a network link or something similar.

cerulean mountain
nova dome
#

@limber folio looks like we have something worth trying here.

nova dome
#

So it looks like I'd need to modify the way I'm defining the DockerMachineTemplate (among other things)

Right now I'm using, for example:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: quick-start-default-worker-machinetemplate
  namespace: test-cluster
spec:
  template:
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock

In the k3s repo, I can install the provider and then use:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: k3s-control-plane
spec:
  template:
    spec: {}

Which evidently does not need to mount the docker socket at all.

nova dome
#

@cerulean mountain just a quick update.
I fired up the dagger container that starts the k3s cluster, and then followed the steps starting at Install Providers

Unfortunately, the same issue arises with the networking. The management cluster is still not able to communicate with the workload cluster:

2025-08-12T15:27:27Z    ERROR   Reconciler error        {"controller": "kthreescontrolplane", "controllerGroup": "controlplane.cluster.x-k8s.io", "controllerKind": "KThreesControlPlane", "KThreesControlPlane": {"name":"test1-control-plane","namespace":"default"}, "namespace": "default", "name": "test1-control-plane", "reconcileID": "de38f8fd-390c-4b62-8425-0ae829263f1f", "error": "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://x.x.x.x:6443/api/v1?timeout=30s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)", "errorCauses": [{"error": "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://x.x.x.x:6443/api/v1?timeout=30s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"}]}

So somehow, we need to set up the routes properly when firing up the k3s container to begin with.

#

We're starting our service roughly like this:

func (m *MyProject) newKubernetesService(name string, opts ...KubernetesServiceOpts) (*KubernetesService, error) {
    k3s := dag.K3S(name)

    base := k3s.Container().
        WithExec([]string{"mkdir", "-p", "/var/lib/rancher/k3s/agent/images"})

    for _, o := range opts {
        for _, image := range o.LoadImages {
            id := m.randomName("name-", 8)
            imagePath := fmt.Sprintf("/var/lib/rancher/k3s/agent/images/%s.tar", id)
            base = base.WithMountedFile(imagePath, image.AsTarball())
        }
    }

    server := base.AsService(dagger.ContainerAsServiceOpts{
        Args: []string{
            "sh", "-c",
            "k3s server --cluster-init --bind-address $(ip route | grep src | awk '{print $NF}') --disable traefik --disable metrics-server --egress-selector-mode=disabled > /dev/null 2>&1",
        },
        InsecureRootCapabilities: true,
        UseEntrypoint:            true,
    })

    return &KubernetesService{
        Service:    server,
        KubeConfig: k3s.Config(),
    }, nil
}

And then later on:

// Execute the end-to-end tests
func (m *MyProject) EndToEndTest(ctx context.Context) (*dagger.File, error) {
    k8sOpts := KubernetesServiceOpts{
        LoadImages: []*dagger.Container{m.Container()},
    }

    k8s, err := m.newKubernetesService("e2e-test", k8sOpts)
    if err != nil {
        // Propagate the error instead of silently returning a nil file
        return nil, err
    }

    // Start the k3s service before running any tests against it
    _, err = k8s.Start(ctx)
    if err != nil {
        return nil, err
    }
    // Testing code....
}
cerulean mountain
#

if by any chance you have the time to create a public repo with your ongoing efforts that will make things easier for me

nova dome
heady dagger
#

does anyone have the source on how to install daggerverse in a k8 cluster? I found a video that shows this but I can't find an implementation.

proud oyster
south thorn
analog coral
#

I'm clearly confused by something related to the dagger engine deployed on kubernetes nodes. I've deployed the engine using the helm chart in the dagger repo, pretty much unchanged. We're using gitlab, and I've configured the runner to be privileged and mount the /var/run dir for access to the dagger socket. However, my dagger jobs don't seem to connect to the local engine and instead seem to want to try to connect to dagger cloud:

$ echo "Dagger Engine: ${_EXPERIMENTAL_DAGGER_RUNNER_HOST}" # collapsed multi-line command
Dagger Engine: unix:///run/dagger/engine.sock
1 : connect
1 : [0.0s] | cloud url=https://dagger.cloud/traces/setup
2 : ┆ starting engine
2 : ┆ starting engine DONE [0.0s]
3 : ┆ connecting to engine

analog coral
#

Are image pull secrets filtered down to the engine that pulls the images? I've got the dagger engine deployed on kubernetes as a daemonset using the dagger helm chart. I set imagePullSecrets in the values.yaml and verified the pods have the pull secret in their config. All my dagger modules use custom base images, and I'm getting a 401 for every pull the dagger engine is trying to make.

I've verified the credentials in the pull secret are in fact valid.

#

Do I maybe need to do something to add registry credentials inside the module? (i.e. dag.Container().WithRegistryAuth(...).From(...))

zenith imp
#

I'm curious why the helm chart and documentation reference a pod instead of just creating a Service with spec.internalTrafficPolicy set to "Local" and using that via tcp?

analog coral
frank niche
#

Help! How can I limit the memory a dagger-engine consumes? I have a Pod running a dagger-engine with memory and CPU limits, but they seem to be ignored. The engine consumes the whole node's memory until the kubelet node checks start to error out. Is there any kind of upper limit one can impose on the dagger-engine and the container executions it starts?

left shuttle
#

@proud oyster I'm reading https://docs.dagger.io/ci/integrations/kubernetes, and I'm trying to figure out whether there's a way yet to control scheduling or guarantee resources (for each function that is going to be run)? Or is that something not yet available?

peak bolt
#

@cerulean mountain I am trying to get your K3s go example working and wonder if you can help? I cloned your repo and I am running the example found here https://github.com/marcosnils/daggerverse/tree/main/k3s/examples/go

Maybe it is working and I am confused but I run this command and expect it to display the cluster info in my terminal

dagger call k-3-skubectl --args="cluster-info" stdout

However, it never returns. Here is a trace where I let it run for 5min before canceling.
https://dagger.cloud/cafe/traces/2fa2f2f9430c88ab013759db82c4c128?listen=c7af574b0c3ad054#d5829dcb1e6ed718

I am running podman on a Mac. It is probably something simple but if you could give me some suggestions of how to debug I would appreciate it, thanks
