#Dagger pausing / taking much longer on with-exec than expected


ocean needle
#

When running the command Dagger seems to do nothing for almost a minute, then "Initializing modules..." + the rest of the Terraform output shows. Is this possibly Dagger mounting caches etc?

keen moon
@ocean needle Here's a trace: https://dagger.cloud/mjb/traces/ac289f84cd9cc3717e6f1584c9dfcf94...

I grabbed a main.tf from ChatGPT that had a bunch of AWS provider stuff (I'm not a TF expert) and ran time terraform init manually inside a Dagger container via terminal.

dagger core container from --address hashicorp/terraform with-file --path /app/main.tf --source ./main.tf with-workdir --path /app terminal
Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real    0m 5.58s
user    0m 2.18s
sys    0m 0.79s

then ran the same thing via with-exec:

dagger core container from --address hashicorp/terraform with-file --path /app/main.tf --source ./main.tf with-workdir --path /app with-exec --args time,terraform,init stderr

real    0m 5.61s
user    0m 2.24s
sys    0m 0.61s
dagger core container from --address hashicorp/terraform with-file --path /app/main.tf --source ./main.tf with-workdir --path /app with-exec --args time,terraform,init stdout

Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/random versions matching "~> 3.0"...
- Finding hashicorp/null versions matching "~> 3.0"...
- Finding hashicorp/archive versions matching "~> 2.0"...
- Finding hashicorp/external versions matching "~> 2.0"...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Installing hashicorp/aws v5.100.0...
- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)
- Installing hashicorp/random v3.7.2...
- Installed hashicorp/random v3.7.2 (signed by HashiCorp)
- Installing hashicorp/null v3.2.4...
- Installed hashicorp/null v3.2.4 (signed by HashiCorp)
- Installing hashicorp/archive v2.7.1...
- Installed hashicorp/archive v2.7.1 (signed by HashiCorp)
- Installing hashicorp/external v2.3.5...
- Installed hashicorp/external v2.3.5 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
#

So I guess now, I could try a Cache Volume for the providers?

#

Global, local, both?

Is this accurate?

Terraform maintains two primary locations for caching provider plugins:

Local Cache (Project-specific):
Within your Terraform project's working directory, a .terraform/providers subdirectory is created. This directory stores the provider plugin binaries specifically for that project after a terraform init operation.

Global Cache (Shared):
You can configure a global provider plugin cache by setting the plugin_cache_dir in your Terraform CLI configuration file (.terraformrc on Linux/macOS or terraform.rc on Windows). This global cache acts as a shared repository for all projects, preventing redundant downloads of the same provider versions.

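  • The cache has no default location; it is only used when plugin_cache_dir (or the TF_PLUGIN_CACHE_DIR environment variable) is set, and the directory must already exist.
  • A common choice on Linux/macOS is ~/.terraform.d/plugin-cache. The similarly named ~/.terraform.d/plugins (%APPDATA%\terraform.d\plugins on Windows) is a separate local mirror directory, not the plugin cache.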

When terraform init is executed, Terraform first checks the global cache (if configured) for the required provider versions. If found, it copies or symlinks the plugins to the local project's .terraform/providers directory. If not found in the global cache, Terraform downloads the providers from the Terraform Registry and stores them in both the global cache (if configured) and the local project cache.
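
For reference, a minimal shell sketch of switching on the shared cache described above. The paths are placeholders picked for illustration (a temp directory stands in for $HOME so the snippet is safe to run), and the two options are alternatives; TF_CLI_CONFIG_FILE and TF_PLUGIN_CACHE_DIR are the environment variables Terraform documents for the CLI config location and the plugin cache.

```shell
#!/bin/sh
# Sketch: enable Terraform's shared provider plugin cache.
# DEMO_HOME is a throwaway stand-in for $HOME (an assumption for this example).
DEMO_HOME="$(mktemp -d)"
CACHE_DIR="$DEMO_HOME/.terraform.d/plugin-cache"

# Terraform does not create the cache directory itself, so create it first.
mkdir -p "$CACHE_DIR"

# Option 1: a CLI configuration file that sets plugin_cache_dir.
cat > "$DEMO_HOME/.terraformrc" <<EOF
plugin_cache_dir = "$CACHE_DIR"
EOF
export TF_CLI_CONFIG_FILE="$DEMO_HOME/.terraformrc"

# Option 2: the environment variable, no config file needed.
export TF_PLUGIN_CACHE_DIR="$CACHE_DIR"

# Show what was written.
cat "$TF_CLI_CONFIG_FILE"
```

In a Dagger pipeline the same directory would presumably be the mount point of a cache volume, which is what the rest of this thread experiments with.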

ocean needle
#

Those times track, that's roughly how long the Terraform operation takes in Dagger too. The issue I'm facing is that for nearly a minute Terraform doesn't seem to do anything

#

Almost like Dagger is doing something first, but I can probably check with --progress=plain, let me try that

ocean needle
#

Ok so --progress=plain confirms it's not Dagger doing anything, it's definitely weird Terraform behaviour with caches. It finds the cache, but takes forever to act on it:

118 : ┆ [1.2s] | Initializing the backend...
118 : ┆ [24.2s] | 
118 : ┆ [24.2s] | Successfully configured the backend "s3"! Terraform will automatically
118 : ┆ [24.2s] | use this backend unless the backend configuration changes.
118 : ┆ [45.9s] | Initializing modules...
118 : ┆ [46.2s] | Initializing provider plugins...
118 : ┆ [46.2s] | - terraform.io/builtin/terraform is built in to Terraform
118 : ┆ [46.2s] | - Reusing previous version of hashicorp/tls from the dependency lock file
118 : ┆ [46.2s] | - Reusing previous version of hashicorp/aws from the dependency lock file
118 : ┆ [46.2s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
118 : ┆ [46.7s] | - Using previously-installed hashicorp/tls v4.1.0
118 : ┆ [1m6s] | - Using previously-installed hashicorp/aws v6.6.0
118 : ┆ [1m7s] | - Using previously-installed hashicorp/awscc v1.50.0
118 : ┆ [1m7s] | 
118 : ┆ [1m7s] | Terraform has been successfully initialized!
keen moon
#
dagger -M -c 'container | from hashicorp/terraform | with-file /app/main.tf ./main.tf | with-workdir /app | with-mounted-cache /app/.terraform/providers $(cache-volume "tf-prov") | with-exec terraform,init | stdout'
#

I used a local cache volume and it made my run faster: from 5.5 secs down to 0.3 secs (on the second run)

#

haven't tried global cache yet

ocean needle
#

Try with the AWS provider, the external provider is tiny in comparison

keen moon
#

I'm doing all of these tests with

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
    archive = {
      source  = "hashicorp/archive"
      version = "~> 2.0"
    }
    external = {
      source  = "hashicorp/external"
      version = "~> 2.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "slow_bucket" {
  bucket = "slow-terraform-init-example-${random_id.rand.hex}"
}

resource "aws_iam_role" "slow_role" {
  name = "slow-init-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "random_id" "rand" {
  byte_length = 4
}

resource "null_resource" "noop" {
  triggers = {
    always = timestamp()
  }
}

resource "archive_file" "zip" {
  type        = "zip"
  source_dir  = "./some-dir"  # you can create a dummy dir
  output_path = "./some.zip"
}

resource "external" "example" {
  program = ["echo", "{\"foo\": \"bar\"}"]
}
ocean needle
#

Ah ok

keen moon
#
Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/random versions matching "~> 3.0"...
- Finding hashicorp/null versions matching "~> 3.0"...
- Finding hashicorp/archive versions matching "~> 2.0"...
- Finding hashicorp/external versions matching "~> 2.0"...
- Installing hashicorp/aws v5.100.0...
- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)
- Installing hashicorp/random v3.7.2...
- Installed hashicorp/random v3.7.2 (signed by HashiCorp)
- Installing hashicorp/null v3.2.4...
- Installed hashicorp/null v3.2.4 (signed by HashiCorp)
- Installing hashicorp/archive v2.7.1...
- Installed hashicorp/archive v2.7.1 (signed by HashiCorp)
- Installing hashicorp/external v2.3.5...
- Installed hashicorp/external v2.3.5 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
#

offscreen stuff 🙂 from last Dagger Cloud run

ocean needle
#

Terraform isn't using the cached providers there

#

Installing hashicorp/aws...

#

It'll say Using previously-installed hashicorp/aws if it's using a cached provider
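
A quick way to check this on a saved log, sketched in shell. summarise_init is a hypothetical helper name; it only counts the two message forms that appear in this thread ("Installing ... v<version>" and "Using previously-installed") and makes no claim about other Terraform output.

```shell
#!/bin/sh
# Summarise a `terraform init` log read from stdin:
# how many providers were downloaded vs. reused from an existing install.
summarise_init() {
  awk '
    /- Installing [^ ]+ v/          { installed++ }
    /- Using previously-installed / { reused++ }
    END { printf "installed=%d reused=%d\n", installed, reused }
  '
}

# Example with three lines in the format shown in this thread.
printf '%s\n' \
  '- Installing hashicorp/aws v5.100.0...' \
  '- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)' \
  '- Using previously-installed hashicorp/null v3.2.4' | summarise_init
```

A run that reports only installed providers was not served from a Terraform provider cache, however fast it finished.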

keen moon
#

but it did everything in 0.3 seconds. I think it's finding it in the cache vol, err, maybe the layer cache then... since the output looks like a first run

ocean needle
#

Yeah must be for that speed

#

So there's some combination of Dagger caching and Terraform caching I need to find

#

It should be that fast with cached providers

keen moon
#

I see what you mean. If I run manually in a terminal, subsequent runs look like:

Initializing the backend...
Initializing provider plugins...
- Reusing previous version of hashicorp/archive from the dependency lock file
- Reusing previous version of hashicorp/external from the dependency lock file
- Reusing previous version of hashicorp/aws from the dependency lock file
- Reusing previous version of hashicorp/random from the dependency lock file
- Reusing previous version of hashicorp/null from the dependency lock file
- Using previously-installed hashicorp/archive v2.7.1
- Using previously-installed hashicorp/external v2.3.5
- Using previously-installed hashicorp/aws v5.100.0
- Using previously-installed hashicorp/random v3.7.2
- Using previously-installed hashicorp/null v3.2.4
ocean needle
#

Yeah that's Terraform recognising the cache and using it, that should be equally fast but I'm not getting that

keen moon
#

That was me without Dagger Cache vol. Let me try manual runs, with cache vol

ocean needle
#

Ok improvement here, not telling Terraform to use a global cache:

116 : ┆ Container.withExec DONE [0.9s]
116 : ┆ [0.9s] | Initializing the backend...
116 : ┆ [1.6s] | 
116 : ┆ [1.6s] | Successfully configured the backend "s3"! Terraform will automatically
116 : ┆ [1.6s] | use this backend unless the backend configuration changes.
116 : ┆ [1.9s] | Initializing modules...
116 : ┆ [2.3s] | Initializing provider plugins...
116 : ┆ [2.3s] | - terraform.io/builtin/terraform is built in to Terraform
116 : ┆ [2.3s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
116 : ┆ [2.6s] | - Reusing previous version of hashicorp/tls from the dependency lock file
116 : ┆ [2.6s] | - Reusing previous version of hashicorp/aws from the dependency lock file
116 : ┆ [3.0s] | - Installing hashicorp/awscc v1.50.0...
#

I'm getting the impression I need to use the lightest possible Terraform caching, and rely on Dagger caching much more here

#

Rather than trying to use both in some way

keen moon
#

oh nice. Yeah, I guess I was using layer cache when I got that near instant result.

ocean needle
#

I think this'll cause an error with Terraform trying to install providers that already exist in .terraform but I'm testing that now

keen moon
#

@stray flare got any TF caching tips for us 😄

ocean needle
#

That's what I'm doing, I've got four Terraform dirs and four .terraform caches, one for each mounted in <dir>/.terraform/

keen moon
#
dagger /app $ time terraform init
Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/random versions matching "~> 3.0"...
- Finding hashicorp/null versions matching "~> 3.0"...
- Finding hashicorp/archive versions matching "~> 2.0"...
- Finding hashicorp/external versions matching "~> 2.0"...
- Installing hashicorp/archive v2.7.1...
- Installed hashicorp/archive v2.7.1 (signed by HashiCorp)
- Installing hashicorp/external v2.3.5...
- Installed hashicorp/external v2.3.5 (signed by HashiCorp)
- Installing hashicorp/aws v5.100.0...
- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)
- Installing hashicorp/random v3.7.2...
- Installed hashicorp/random v3.7.2 (signed by HashiCorp)
- Installing hashicorp/null v3.2.4...
- Installed hashicorp/null v3.2.4 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real    0m 5.73s
user    0m 2.23s
sys    0m 0.77s
dagger /app $ time terraform init
Initializing the backend...
Initializing provider plugins...
- Reusing previous version of hashicorp/random from the dependency lock file
- Reusing previous version of hashicorp/null from the dependency lock file
- Reusing previous version of hashicorp/archive from the dependency lock file
- Reusing previous version of hashicorp/external from the dependency lock file
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/external v2.3.5
- Using previously-installed hashicorp/aws v5.100.0
- Using previously-installed hashicorp/random v3.7.2
- Using previously-installed hashicorp/null v3.2.4
- Using previously-installed hashicorp/archive v2.7.1

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real    0m 0.92s
user    0m 0.50s
sys    0m 0.12s
#

that had the more desired effect

ocean needle
#

I also had the global cache for providers because downloading and installing AWS provider takes 8x longer than everything else

keen moon
#
dagger -M -c 'container | from hashicorp/terraform | with-file /app/main.tf ./main.tf | with-workdir /app | with-mounted-cache /app/.terraform $(cache-volume "tf-dir") | terminal'

then run time terraform init twice in there

ocean needle
#

So that's .terraform cached, no global cache

keen moon
#

right

ocean needle
#

Just checking, dagger core engine local-cache prune does clear cache volumes too right?

stray flare
#

catching up, is the current thinking that local caches are working as expected but global cache isn't?

ocean needle
#

Not fully sure, I'm not seeing the speed keen moon is getting from local caches, but I'm re-testing after pruning the cache now

#

dagger core engine local-cache prune then:

First run:

116 : ┆ [1.0s] | Initializing the backend...
116 : ┆ [1.6s] | 
116 : ┆ [1.6s] | Successfully configured the backend "s3"! Terraform will automatically
116 : ┆ [1.6s] | use this backend unless the backend configuration changes.
116 : ┆ [2.0s] | Initializing modules...
...
116 : ┆ [10.7s] | Initializing provider plugins...
116 : ┆ [10.7s] | - terraform.io/builtin/terraform is built in to Terraform
116 : ┆ [10.7s] | - Reusing previous version of hashicorp/tls from the dependency lock file
116 : ┆ [10.8s] | - Reusing previous version of hashicorp/aws from the dependency lock file
116 : ┆ [10.9s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
116 : ┆ [11.3s] | - Installing hashicorp/tls v4.1.0...
116 : ┆ [14.6s] | - Installed hashicorp/tls v4.1.0 (signed by HashiCorp)
116 : ┆ [14.8s] | - Installing hashicorp/aws v6.6.0...
116 : ┆ [1m41s] | - Installed hashicorp/aws v6.6.0 (signed by HashiCorp)
116 : ┆ [1m41s] | - Installing hashicorp/awscc v1.50.0...
116 : ┆ [1m51s] | - Installed hashicorp/awscc v1.50.0 (signed by HashiCorp)
#

Second run (one small change to inputs to ensure Dagger doesn't cache the whole thing in 5s flat, testing the cache volume not the layer cache):

116 : ┆ [0.8s] | Initializing the backend...
116 : ┆ [22.3s] | 
116 : ┆ [22.3s] | Successfully configured the backend "s3"! Terraform will automatically
116 : ┆ [22.3s] | use this backend unless the backend configuration changes.
116 : ┆ [44.6s] | Initializing modules...
116 : ┆ [45.0s] | Initializing provider plugins...
116 : ┆ [45.0s] | - terraform.io/builtin/terraform is built in to Terraform
116 : ┆ [45.0s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
116 : ┆ [45.2s] | - Reusing previous version of hashicorp/tls from the dependency lock file
116 : ┆ [45.3s] | - Reusing previous version of hashicorp/aws from the dependency lock file
116 : ┆ [1m4s] | - Using previously-installed hashicorp/aws v6.6.0
116 : ┆ [1m6s] | - Using previously-installed hashicorp/awscc v1.50.0
116 : ┆ [1m6s] | - Using previously-installed hashicorp/tls v4.1.0
#

Definitely not seeing the same speeds
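
To see where the time goes in --progress=plain output like the two runs above, the gaps between consecutive timestamps can be computed. A rough sketch in shell and awk: init_gaps and the 5-second threshold are made up for illustration, and the timestamp format ([24.2s], [1m6s]) is assumed from the logs in this thread.

```shell
#!/bin/sh
# Print every pause of 5 seconds or more between consecutive log lines.
# Reads --progress=plain style output on stdin.
init_gaps() {
  awk '
    # Convert "24.2s" or "1m6s" to seconds.
    function secs(t,   m, s) {
      m = 0
      if (index(t, "m") > 0) {
        m = substr(t, 1, index(t, "m") - 1)
        t = substr(t, index(t, "m") + 1)
      }
      s = t
      sub(/s$/, "", s)
      return m * 60 + s
    }
    match($0, /\[[0-9.]+m?[0-9.]*s\]/) {
      now = secs(substr($0, RSTART + 1, RLENGTH - 2))
      text = substr($0, RSTART + RLENGTH)
      sub(/^ \| ?/, "", text)
      if (seen && now - prev >= 5) printf "%.1fs before: %s\n", now - prev, text
      prev = now
      seen = 1
    }
  '
}

# Example using three lines from the second run above.
printf '%s\n' \
  '116 : ┆ [0.8s] | Initializing the backend...' \
  '116 : ┆ [22.3s] | ' \
  '116 : ┆ [44.6s] | Initializing modules...' | init_gaps
```

Each reported gap is time spent after the previous line was printed, so long gaps spread across the backend, module, and provider steps suggest the whole process is slow rather than one step.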

#

Interestingly the second run doesn't download any modules, so the cache is definitely being utilised. The first run downloads the expected modules from git

#

Yeah, the cache is there: if I remove the function call that runs init -> plan, open a terminal and run tree, the .terraform dir is there in all four directories, with providers in all four

stray flare
#

it looked like it did clear cache volumes 🤔

keen moon
#

oh, I see, it looks reset on the first run above...got it.
We might want to note that in the docs entry since it's pretty vague

ocean needle
#

It does clear cache-volumes yes

#
Initializing modules...
Initializing provider plugins...
- terraform.io/builtin/terraform is built in to Terraform
- Reusing previous version of hashicorp/awscc from the dependency lock file
- Reusing previous version of hashicorp/tls from the dependency lock file
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/awscc v1.50.0
- Using previously-installed hashicorp/tls v4.1.0
- Using previously-installed hashicorp/aws v6.6.0

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real    1m 6.28s
user    1m 4.65s
sys    0m 1.20s

This is with a .terraform cache volume

#

Might be better off ditching local cache volumes and trying the global provider cache only, not sure why this is behaving this way

stray flare
#

yeah i'm trying to think of anything that could cause this because with the cached modules it should be basically instant. I'm not sure what terraform would be doing in that time

keen moon
#

Thing is, when I connect via terminal, run terraform init, disconnect, and then do it again, I don't get any apparent cache vol benefit... could it be that the engine needs to be shut down to flush the cache vol to disk?

stray flare
#

no, i think thats a terminal thing

#

the cache volumes are working here, thats confirmed, the question is around performance

ocean needle
#

Testing a global provider cache only, no local caches. This might provide a middle ground where init doesn't take >1m

keen moon
#
dagger core engine local-cache prune

time dagger -M -c 'container | from hashicorp/terraform | with-file /app/main.tf ./main.tf | with-workdir /app | with-mounted-cache /app/.terraform $(cache-volume "tf-dir") | with-exec time,terraform,init | stderr'

real    0m 5.45s << from terraform init
user    0m 2.21s
sys    0m 0.60s

dagger -M -c   0.29s user 0.20s system 7% cpu 6.385 total

===
next run


real    0m 5.45s << clearly layer cached
user    0m 2.21s
sys    0m 0.60s

dagger -M -c   0.13s user 0.11s system 24% cpu 0.949 total << less than 1sec
ocean needle
#

Is there any special behaviour around symlinks? Terraform should be symlinking to the global provider cache, which should be basically instant?

desert crescent
#

seems to be working fine for you @keen moon ?

keen moon
#

seems to be using layer cache and not cache volumes

Because the output looks like a fresh install both times 👆 and the run times are identical.
I guess I'm getting the cached output of the time command

ocean needle
#

There's more funny stuff going on with Terraform here (using global cache only, no local for these tests):

terraform init -backend=false :

real    0m 22.65s
user    0m 22.03s
sys    0m 0.36s

terraform init -reconfigure -backend-config <we have a bunch of these to reconfigure the backend>:

real    1m 5.15s
user    1m 3.89s
sys    0m 0.97s
#

Going back to local caches, will test without backend

stray flare
#

yeah i was wondering about the backend reconfigure but in the logs it seemed like it was only the first 20s

ocean needle
#

On my laptop that backend reconfigure init takes 6s

stray flare
#

are you using the hashicorp/terraform container or some other container with the cli installed?

ocean needle
#

Hashicorp/terraform, using a sha instead of tag

stray flare
#

oh wait, this is running dagger locally right?

#

can you share the sha?

ocean needle
#

sha256:b3d13c9037d2bd858fe10060999aa7ca56d30daafe067d7715b29b3d4f5b162f

#

Yes running locally

keen moon
#

noting that these are created in user homedir after terraform init

~ $ tree .terraform.d
.terraform.d
├── checkpoint_cache
└── checkpoint_signature
#

But I see mention in CICD guides of

export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"

and using that as the cache vol

stray flare
ocean needle
#

Yes

stray flare
ocean needle
#

The Dagger arch is 386

#

Inside the container

#

Oh, you mean the difference between my laptop (arm64) and Dagger (i386)

stray flare
#

yeah exactly, if the engine is local

#

try this sha sha256:f5ac787eee9d292b6a3b97d40f04019ce08189d356233fc73d5ec7ef8529cce2

ocean needle
#

It is local

ocean needle
#

No cache:

real    0m 23.10s
user    0m 7.05s
sys    0m 3.23s
stray flare
#

yasss 🚀 you should be able to pin to the manifest sha for multiarch compat. The sha I sent is the arm64 one
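
A small shell sketch of the check that surfaced the problem here: compare the host architecture with the one the container reports. oci_arch and check_emulation are hypothetical helper names, and the container value is assumed to be read by hand, for example from the platform suffix that terraform version prints (such as linux_386) or from uname -m inside the container.

```shell
#!/bin/sh
# Normalise an architecture string to the name used in image platforms.
# Accepts `uname -m` values or a Terraform-style suffix such as linux_386.
# Only common values are covered; anything else is passed through unchanged.
oci_arch() {
  arch="${1#linux_}"
  case "$arch" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    i386|i686|386) echo 386 ;;
    *)             echo "$arch" ;;
  esac
}

# Compare host and container architectures.
# A mismatch suggests the container is running under emulation.
check_emulation() {
  host="$(oci_arch "$1")"
  container="$(oci_arch "$2")"
  if [ "$host" = "$container" ]; then
    echo "native ($host)"
  else
    echo "emulated ($container on $host)"
  fi
}

# The combination from this thread: arm64 laptop, 386 image.
check_emulation arm64 linux_386
```

With the values from this thread the check reports an emulated 386 container on an arm64 host, which lines up with the slow, CPU-bound init times above.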

ocean needle
#

With cache:

real    0m 2.58s
user    0m 1.19s
sys    0m 0.60s
#

Yeah I'll need a multi-arch image because our laptops are arm64 and our gitlab runners are amd64

#

Do I need a SHA here? If I just use hashicorp/terraform:1.12.2 do I lose any performance / does Dagger re-pull etc?

stray flare
#

tagged versions are checked against the registry manifest to confirm the sha still matches what you have locally, and it only re-pulls if that changes (which hashicorp will never do). only latest will re-pull

ocean needle
#

In that case I'll use 1.12.2, which should be a manifest and handle arch automatically for me here, right?

stray flare
#

yes exactly

ocean needle
#

A bit more testing later and that's done it: local caches only, no global provider cache, <5s init, down from ~1m5s-1m50s

#

You've both been tremendously helpful, I would not have thought about emulation here 🙏