#Dagger pausing / taking much longer on with-exec than expected
Here's a trace: https://dagger.cloud/mjb/traces/ac289f84cd9cc3717e6f1584c9dfcf94. terraform init with cached providers takes 1m18s in Dagger. Copy pasting that command out of the trace and running locally takes ~3s
When running the command Dagger seems to do nothing for almost a minute, then "Initializing modules..." + the rest of the Terraform output shows. Is this possibly Dagger mounting caches etc?
I grabbed a main.tf from ChatGPT that had a bunch of AWS provider stuff (not a TF expert) and ran time terraform init manually inside of a Dagger container via terminal.
dagger core container from --address hashicorp/terraform with-file --path /app/main.tf --source ./main.tf with-workdir --path /app terminal
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real 0m 5.58s
user 0m 2.18s
sys 0m 0.79s
then ran the same thing via with-exec:
dagger core container from --address hashicorp/terraform with-file --path /app/main.tf --source ./main.tf with-workdir --path /app with-exec --args time,terraform,init stderr
real 0m 5.61s
user 0m 2.24s
sys 0m 0.61s
dagger core container from --address hashicorp/terraform with-file --path /app/main.tf --source ./main.tf with-workdir --path /app with-exec --args time,terraform,init stdout
Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/random versions matching "~> 3.0"...
- Finding hashicorp/null versions matching "~> 3.0"...
- Finding hashicorp/archive versions matching "~> 2.0"...
- Finding hashicorp/external versions matching "~> 2.0"...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Installing hashicorp/aws v5.100.0...
- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)
- Installing hashicorp/random v3.7.2...
- Installed hashicorp/random v3.7.2 (signed by HashiCorp)
- Installing hashicorp/null v3.2.4...
- Installed hashicorp/null v3.2.4 (signed by HashiCorp)
- Installing hashicorp/archive v2.7.1...
- Installed hashicorp/archive v2.7.1 (signed by HashiCorp)
- Installing hashicorp/external v2.3.5...
- Installed hashicorp/external v2.3.5 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
So I guess now, I could try a Cache Volume for the providers?
Global, local, both?
Is this accurate?
Terraform maintains two primary locations for caching provider plugins:
Local Cache (Project-specific):
Within your Terraform project's working directory, a .terraform/providers subdirectory is created. This directory stores the provider plugin binaries specifically for that project after a terraform init operation.
Global Cache (Shared):
You can configure a global provider plugin cache by setting the plugin_cache_dir in your Terraform CLI configuration file (.terraformrc on Linux/macOS or terraform.rc on Windows). This global cache acts as a shared repository for all projects, preventing redundant downloads of the same provider versions.
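- There is no default location: the global cache is disabled unless plugin_cache_dir (or the TF_PLUGIN_CACHE_DIR environment variable) is set. A commonly used path on Linux/macOS is ~/.terraform.d/plugin-cache.
- On Windows, a comparable choice is %APPDATA%\terraform.d\plugin-cache. Note that terraform.d/plugins is a different thing: Terraform treats it as a local provider mirror, not as the plugin cache.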
When terraform init is executed, Terraform first checks the global cache (if configured) for the required provider versions. If found, it copies or symlinks the plugins to the local project's .terraform/providers directory. If not found in the global cache, Terraform downloads the providers from the Terraform Registry and stores them in both the global cache (if configured) and the local project cache.
Those times track, that's roughly how long the Terraform operation takes in Dagger too. The issue I'm facing is that for nearly a minute Terraform doesn't seem to do anything
Almost like Dagger is doing something first, but I can probably check with --progress=plain, let me try that
This is accurate yes, though I initially tried to only use local-to-project terraform caches rather than global
Ok so --progress=plain confirms it's not Dagger doing anything, it's definitely weird Terraform behaviour with caches. It finds the cache, but takes forever to act on it:
118 : ┆ [1.2s] | Initializing the backend...
118 : ┆ [24.2s] |
118 : ┆ [24.2s] | Successfully configured the backend "s3"! Terraform will automatically
118 : ┆ [24.2s] | use this backend unless the backend configuration changes.
118 : ┆ [45.9s] | Initializing modules...
118 : ┆ [46.2s] | Initializing provider plugins...
118 : ┆ [46.2s] | - terraform.io/builtin/terraform is built in to Terraform
118 : ┆ [46.2s] | - Reusing previous version of hashicorp/tls from the dependency lock file
118 : ┆ [46.2s] | - Reusing previous version of hashicorp/aws from the dependency lock file
118 : ┆ [46.2s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
118 : ┆ [46.7s] | - Using previously-installed hashicorp/tls v4.1.0
118 : ┆ [1m6s] | - Using previously-installed hashicorp/aws v6.6.0
118 : ┆ [1m7s] | - Using previously-installed hashicorp/awscc v1.50.0
118 : ┆ [1m7s] |
118 : ┆ [1m7s] | Terraform has been successfully initialized!
dagger -M -c 'container | from hashicorp/terraform | with-file /app/main.tf ./main.tf | with-workdir /app | with-mounted-cache /app/.terraform/providers $(cache-volume "tf-prov") | with-exec terraform,init | stdout'
I did a local cache volume and it made my run faster
from 5.5 secs down to 0.3 secs (on second run)
haven't tried global cache yet
Try with the AWS provider, the external provider is tiny in comparison
I'm doing all of these tests with
terraform {
  required_version = ">= 1.5.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.0"
    }
    archive = {
      source  = "hashicorp/archive"
      version = "~> 2.0"
    }
    external = {
      source  = "hashicorp/external"
      version = "~> 2.0"
    }
  }
}
provider "aws" {
  region = "us-east-1"
}
resource "aws_s3_bucket" "slow_bucket" {
  bucket = "slow-terraform-init-example-${random_id.rand.hex}"
}
resource "aws_iam_role" "slow_role" {
  name = "slow-init-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}
resource "random_id" "rand" {
  byte_length = 4
}
resource "null_resource" "noop" {
  triggers = {
    always = timestamp()
  }
}
resource "archive_file" "zip" {
  type        = "zip"
  source_dir  = "./some-dir" # you can create a dummy dir
  output_path = "./some.zip"
}
# The external provider exposes a data source, not a resource.
data "external" "example" {
  program = ["echo", "{\"foo\": \"bar\"}"]
}
Ah ok
Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/random versions matching "~> 3.0"...
- Finding hashicorp/null versions matching "~> 3.0"...
- Finding hashicorp/archive versions matching "~> 2.0"...
- Finding hashicorp/external versions matching "~> 2.0"...
- Installing hashicorp/aws v5.100.0...
- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)
- Installing hashicorp/random v3.7.2...
- Installed hashicorp/random v3.7.2 (signed by HashiCorp)
- Installing hashicorp/null v3.2.4...
- Installed hashicorp/null v3.2.4 (signed by HashiCorp)
- Installing hashicorp/archive v2.7.1...
- Installed hashicorp/archive v2.7.1 (signed by HashiCorp)
- Installing hashicorp/external v2.3.5...
- Installed hashicorp/external v2.3.5 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
offscreen stuff 🙂 from last Dagger Cloud run
Terraform isn't using the cached providers there
Installing hashicorp/aws...
It'll say Using previously-installed hashicorp/aws if it's using a cached provider
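A rough way to check this automatically is to scan the init output for the two kinds of provider line. This is only a sketch: the function name classify_init_output is made up for illustration, and it relies on nothing but the log wording quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: classify `terraform init` output read from stdin.
# Prints "downloaded" if any provider was installed fresh,
# "cached" if providers were only reused, and "unknown" otherwise.
# Patterns are not anchored, so prefixed --progress=plain lines also match.
classify_init_output() {
  awk '
    /- Installing /                 { installed++ }
    /- Using previously-installed / { reused++ }
    END {
      if (installed > 0)   print "downloaded"
      else if (reused > 0) print "cached"
      else                 print "unknown"
    }
  '
}
```

Usage would be something like piping the init output into it, e.g. terraform init | classify_init_output, or feeding it the stdout captured from a with-exec step.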
but it did everything in 0.3 seconds. I think it's finding it in the cache vol, err, maybe the layer cache then... since the output reads like it's a first run
Yeah must be for that speed
So there's some combination of Dagger caching and Terraform caching I need to find
It should be that fast with cached providers
I see what you mean. If I run manually in a terminal. Subsequent runs look like
Initializing the backend...
Initializing provider plugins...
- Reusing previous version of hashicorp/archive from the dependency lock file
- Reusing previous version of hashicorp/external from the dependency lock file
- Reusing previous version of hashicorp/aws from the dependency lock file
- Reusing previous version of hashicorp/random from the dependency lock file
- Reusing previous version of hashicorp/null from the dependency lock file
- Using previously-installed hashicorp/archive v2.7.1
- Using previously-installed hashicorp/external v2.3.5
- Using previously-installed hashicorp/aws v5.100.0
- Using previously-installed hashicorp/random v3.7.2
- Using previously-installed hashicorp/null v3.2.4
Yeah that's Terraform recognising the cache and using it, that should be equally fast but I'm not getting that
That was me without Dagger Cache vol. Let me try manual runs, with cache vol
Ok improvement here, not telling Terraform to use a global cache:
116 : ┆ Container.withExec DONE [0.9s]
116 : ┆ [0.9s] | Initializing the backend...
116 : ┆ [1.6s] |
116 : ┆ [1.6s] | Successfully configured the backend "s3"! Terraform will automatically
116 : ┆ [1.6s] | use this backend unless the backend configuration changes.
116 : ┆ [1.9s] | Initializing modules...
116 : ┆ [2.3s] | Initializing provider plugins...
116 : ┆ [2.3s] | - terraform.io/builtin/terraform is built in to Terraform
116 : ┆ [2.3s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
116 : ┆ [2.6s] | - Reusing previous version of hashicorp/tls from the dependency lock file
116 : ┆ [2.6s] | - Reusing previous version of hashicorp/aws from the dependency lock file
116 : ┆ [3.0s] | - Installing hashicorp/awscc v1.50.0...
I'm getting the impression I need to use the lightest possible Terraform caching, and rely on Dagger caching much more here
Rather than trying to use both in some way
oh nice. Yeah, I guess I was using layer cache when I got that near instant result.
I think this'll cause an error with Terraform trying to install providers that already exist in .terraform but I'm testing that now
@stray flare got any TF caching tips for us 😄
I see other folks using the whole .terraform directory as their cache volume
https://github.com/Excoriate/daggerverse/blob/42848834710d3286621b7e289424c2a0173dbc46/terraform/dagger/main.go#L90-L95
That's what I'm doing, I've got four Terraform dirs and four .terraform caches, one for each mounted in <dir>/.terraform/
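For several Terraform directories, the mount arguments can be generated rather than written by hand. The sketch below only prints with-mounted-cache fragments in the same syntax as the dagger -c pipelines in this thread; it does not call Dagger. The function name, the tf- volume prefix and the example paths are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one with-mounted-cache fragment per Terraform
# directory, each with its own cache volume so the directories do not share
# a .terraform directory.
cache_mount_args() {
  for dir in "$@"; do
    # Derive a volume name from the path: drop the leading "/" and
    # replace the remaining "/" with "-".
    name=$(printf '%s' "$dir" | sed 's#^/##; s#/#-#g')
    # The $(cache-volume ...) part is printed literally for the Dagger shell.
    printf 'with-mounted-cache %s/.terraform $(cache-volume "tf-%s")\n' "$dir" "$name"
  done
}

# Example with two made-up directories:
cache_mount_args /app/network /app/compute
```

Each printed line is meant to be joined into a pipeline with " | ", between the with-workdir and with-exec steps.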
dagger /app $ time terraform init
Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/random versions matching "~> 3.0"...
- Finding hashicorp/null versions matching "~> 3.0"...
- Finding hashicorp/archive versions matching "~> 2.0"...
- Finding hashicorp/external versions matching "~> 2.0"...
- Installing hashicorp/archive v2.7.1...
- Installed hashicorp/archive v2.7.1 (signed by HashiCorp)
- Installing hashicorp/external v2.3.5...
- Installed hashicorp/external v2.3.5 (signed by HashiCorp)
- Installing hashicorp/aws v5.100.0...
- Installed hashicorp/aws v5.100.0 (signed by HashiCorp)
- Installing hashicorp/random v3.7.2...
- Installed hashicorp/random v3.7.2 (signed by HashiCorp)
- Installing hashicorp/null v3.2.4...
- Installed hashicorp/null v3.2.4 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real 0m 5.73s
user 0m 2.23s
sys 0m 0.77s
dagger /app $ time terraform init
Initializing the backend...
Initializing provider plugins...
- Reusing previous version of hashicorp/random from the dependency lock file
- Reusing previous version of hashicorp/null from the dependency lock file
- Reusing previous version of hashicorp/archive from the dependency lock file
- Reusing previous version of hashicorp/external from the dependency lock file
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/external v2.3.5
- Using previously-installed hashicorp/aws v5.100.0
- Using previously-installed hashicorp/random v3.7.2
- Using previously-installed hashicorp/null v3.2.4
- Using previously-installed hashicorp/archive v2.7.1
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real 0m 0.92s
user 0m 0.50s
sys 0m 0.12s
that had the more desired effect
I also had the global cache for providers because downloading and installing AWS provider takes 8x longer than everything else
dagger -M -c 'container | from hashicorp/terraform | with-file /app/main.tf ./main.tf | with-workdir /app | with-mounted-cache /app/.terraform $(cache-volume "tf-dir") | terminal'
then run time terraform init twice in there
So that's .terraform cached, no global cache
right
Just checking, dagger core engine local-cache prune does clear cache volumes too right?
catching up, is the current thinking that local caches are working as expected but global cache isn't?
Not fully sure, I'm not seeing the speed Jeremy is from local caches, but re-testing after pruning cache now
dagger core engine local-cache prune then:
First run:
116 : ┆ [1.0s] | Initializing the backend...
116 : ┆ [1.6s] |
116 : ┆ [1.6s] | Successfully configured the backend "s3"! Terraform will automatically
116 : ┆ [1.6s] | use this backend unless the backend configuration changes.
116 : ┆ [2.0s] | Initializing modules...
...
116 : ┆ [10.7s] | Initializing provider plugins...
116 : ┆ [10.7s] | - terraform.io/builtin/terraform is built in to Terraform
116 : ┆ [10.7s] | - Reusing previous version of hashicorp/tls from the dependency lock file
116 : ┆ [10.8s] | - Reusing previous version of hashicorp/aws from the dependency lock file
116 : ┆ [10.9s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
116 : ┆ [11.3s] | - Installing hashicorp/tls v4.1.0...
116 : ┆ [14.6s] | - Installed hashicorp/tls v4.1.0 (signed by HashiCorp)
116 : ┆ [14.8s] | - Installing hashicorp/aws v6.6.0...
116 : ┆ [1m41s] | - Installed hashicorp/aws v6.6.0 (signed by HashiCorp)
116 : ┆ [1m41s] | - Installing hashicorp/awscc v1.50.0...
116 : ┆ [1m51s] | - Installed hashicorp/awscc v1.50.0 (signed by HashiCorp)
Second run (one small change to inputs to ensure Dagger doesn't cache the whole thing in 5s flat, testing the cache volume not the layer cache):
116 : ┆ [0.8s] | Initializing the backend...
116 : ┆ [22.3s] |
116 : ┆ [22.3s] | Successfully configured the backend "s3"! Terraform will automatically
116 : ┆ [22.3s] | use this backend unless the backend configuration changes.
116 : ┆ [44.6s] | Initializing modules...
116 : ┆ [45.0s] | Initializing provider plugins...
116 : ┆ [45.0s] | - terraform.io/builtin/terraform is built in to Terraform
116 : ┆ [45.0s] | - Reusing previous version of hashicorp/awscc from the dependency lock file
116 : ┆ [45.2s] | - Reusing previous version of hashicorp/tls from the dependency lock file
116 : ┆ [45.3s] | - Reusing previous version of hashicorp/aws from the dependency lock file
116 : ┆ [1m4s] | - Using previously-installed hashicorp/aws v6.6.0
116 : ┆ [1m6s] | - Using previously-installed hashicorp/awscc v1.50.0
116 : ┆ [1m6s] | - Using previously-installed hashicorp/tls v4.1.0
Definitely not seeing the same speeds
Interestingly the second run doesn't download any modules, so the cache is definitely being utilised. The first run downloads the expected modules from git
Yeah, the cache is there. If I remove the function call that runs init -> plan, terminal in and run tree, the .terraform dir is there in all four directories, and has providers in all four
@desert crescent does
dagger core engine local-cache prune
https://docs.dagger.io/configuration/cache/#manual-pruning
Only affect layer cache?
and NOT cache volumes
it looked like it did cache volumes 🤔
oh, I see, it looks reset on the first run above...got it.
We might want to note that in the docs entry since it's pretty vague
It does clear cache-volumes yes
Initializing modules...
Initializing provider plugins...
- terraform.io/builtin/terraform is built in to Terraform
- Reusing previous version of hashicorp/awscc from the dependency lock file
- Reusing previous version of hashicorp/tls from the dependency lock file
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/awscc v1.50.0
- Using previously-installed hashicorp/tls v4.1.0
- Using previously-installed hashicorp/aws v6.6.0
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
real 1m 6.28s
user 1m 4.65s
sys 0m 1.20s
This is with a .terraform cache volume
Might be better off ditching local cache volumes and trying the global provider cache only, not sure why this is behaving this way
yeah i'm trying to think of anything that could cause this because with the cached modules it should be basically instant. I'm not sure what terraform would be doing in that time
Thing is, when I connect via terminal, run terraform init, then disconnect and do it again, I don't get any apparent cache vol benefit... could it be that the engine needs to be shut down to flush the cache vol to disk?
no, i think thats a terminal thing
the cache volumes are working here, thats confirmed, the question is around performance
Testing a global provider cache only, no local caches. This might provide a middle ground where init doesn't take >1m
dagger core engine local-cache prune
time dagger -M -c 'container | from hashicorp/terraform | with-file /app/main.tf ./main.tf | with-workdir /app | with-mounted-cache /app/.terraform $(cache-volume "tf-dir") | with-exec time,terraform,init | stderr'
real 0m 5.45s << from terraform init
user 0m 2.21s
sys 0m 0.60s
dagger -M -c 0.29s user 0.20s system 7% cpu 6.385 total
===
next run
real 0m 5.45s << clearly layer cached
user 0m 2.21s
sys 0m 0.60s
dagger -M -c 0.13s user 0.11s system 24% cpu 0.949 total << less than 1sec
Is there any special behaviour around symlinks? Terraform should be symlinking to the global provider cache, which should be basically instant?
maybe an internet / connection thing?
seems to be working fine for you @keen moon ?
seems to be using layer cache and not cache volumes
Because the output has the form of a new install both times 👆 and the run times are identical
I guess I'm getting the cache of the time command
hm not that i'm aware of, they're technically different filesystems but it should be fine
so it doesn't seem to be picking up the cache volumes? Maybe we're setting it in the wrong path?
There's more funny stuff going on with Terraform here (using global cache only, no local for these tests):
terraform init -backend=false :
real 0m 22.65s
user 0m 22.03s
sys 0m 0.36s
terraform init -reconfigure -backend-config <we have a bunch of these to reconfigure the backend>:
real 1m 5.15s
user 1m 3.89s
sys 0m 0.97s
Going back to local caches, will test without backend
I was using a path that another user had in their terraform module, but might be the wrong approach (local cache)
yeah i was wondering about the backend reconfigure but in the logs it seemed like it was only the first 20s
On my laptop that backend reconfigure init takes 6s
are you using the hashicorp/terraform container or some other container with the cli installed?
Hashicorp/terraform, using a sha instead of tag
sha256:b3d13c9037d2bd858fe10060999aa7ca56d30daafe067d7715b29b3d4f5b162f
Yes running locally
noting that these are created in user homedir after terraform init
~ $ tree .terraform.d
.terraform.d
├── checkpoint_cache
└── checkpoint_signature
But I see mention in CICD guides of
export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"
and using that as the cache vol
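A minimal sketch of that setup, assuming the path from the export line above. Terraform does not create the plugin cache directory itself, so the helper creates it first. The function name and the optional base-directory argument are made up for illustration; in Dagger the resulting directory would presumably be the mount target of the cache volume.

```shell
#!/bin/sh
# Hypothetical helper: prepare a Terraform plugin cache directory and export
# TF_PLUGIN_CACHE_DIR. $1 is the base directory (defaults to $HOME).
setup_plugin_cache() {
  base="${1:-$HOME}"
  TF_PLUGIN_CACHE_DIR="$base/.terraform.d/plugin-cache"
  # The directory must exist before terraform init runs.
  mkdir -p "$TF_PLUGIN_CACHE_DIR"
  export TF_PLUGIN_CACHE_DIR
}
```

Calling setup_plugin_cache before terraform init in the same shell is enough for Terraform to pick up the variable.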
what type of machine are you on? arm macbook?
Yes
that sha looks like it's the 386 arch, so the slowness is probably emulation!
The Dagger arch is 386
Inside the container
Oh, you mean the difference between my laptop (arm64) and Dagger (i386)
yeah exactly, if the engine is local
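A quick sanity check for this kind of mismatch is to compare the architecture reported on the host with the one reported inside the container (for example via uname -m in each). The sketch below only normalises the common spellings and compares them; the function names are hypothetical, and a mismatch is just a hint, since an amd64 host can usually run 386 binaries without emulation.

```shell
#!/bin/sh
# Hypothetical helpers: normalise architecture names to the Go/OCI spelling.
normalize_arch() {
  case "$1" in
    x86_64|amd64)      echo amd64 ;;
    aarch64|arm64)     echo arm64 ;;
    i386|i686|386|x86) echo 386 ;;
    *)                 echo "$1" ;;
  esac
}

# Succeeds (exit 0) when the host arch ($1) and container arch ($2) differ.
arch_mismatch() {
  [ "$(normalize_arch "$1")" != "$(normalize_arch "$2")" ]
}
```

For the case in this thread, arch_mismatch arm64 i386 succeeds, which is the signal that the container is probably running under emulation.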
try this sha sha256:f5ac787eee9d292b6a3b97d40f04019ce08189d356233fc73d5ec7ef8529cce2
It is local
Huge improvement
No cache:
real 0m 23.10s
user 0m 7.05s
sys 0m 3.23s
yasss 🚀 you should be able to pin to the manifest sha for multiarch compat. The sha I sent is the arm64 one
With cache:
real 0m 2.58s
user 0m 1.19s
sys 0m 0.60s
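To compare runs like these without eyeballing them, the real line from time can be converted to seconds. This sketch assumes the "real 0m 2.58s" layout shown in this thread (it also accepts the no-space variant "0m2.580s"); the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: read `time` output on stdin and print the wall-clock
# ("real") duration in seconds with two decimals.
real_seconds() {
  awk '
    $1 == "real" {
      t = $2 $3                 # join "0m" and "2.58s" into "0m2.58s"
      split(t, p, "m")          # p[1] = minutes, p[2] = seconds with "s"
      sub(/s$/, "", p[2])
      printf "%.2f\n", p[1] * 60 + p[2]
    }
  '
}
```

With the numbers above, 23.10s without cache versus 2.58s with cache works out to roughly a 9x difference.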
Yeah I'll need a multi-arch because our laptops are arm64 and our gitlab runners are amd64
Do I need a SHA here? If I just use hashicorp/terraform:1.12.2 do I lose any performance / does Dagger re-pull etc?
tagged versions will be checked against the registry manifest to verify that the sha still matches what you have locally, but it will only re-pull if that changes (which hashicorp will never do). only latest will re-pull
In that case I'll use 1.12.2, which should be a multi-arch manifest and handle arch automatically for me here, right?
yes exactly