# agent plans and inspects state
$ terraform plan -json
{"type":"planned_change","change":{"resource":"local_file.config","action":"create"}}
$ terraform state list
local_file.config
Can AI agents use Terraform?
Terraform is an infrastructure-as-code tool. Agents use it to plan and apply infrastructure changes, inspect state, and manage workspaces.
See the latest run →
Terraform eval results by model
| Model | Pass rate | Avg turns | Avg tokens |
|---|---|---|---|
| gpt-5-nano | 94% | 4.1 | 15.1k |
Terraform task results by model
| Task | Difficulty | Description | gpt-5-nano |
|---|---|---|---|
| quickstart-version-check | easy | Display the installed Terraform version. | ✓ 1t |
| quickstart-init-project | easy | Create a main.tf file with a terraform block requiring version >= 1.0, then run terraform init to initialize the working directory. | ✓ 2t |
| quickstart-format-config | easy | Create a main.tf file with deliberately bad formatting (inconsistent indentation, no spaces around equals signs). Then use terraform fmt to fix the formatting. | ✓ 3t |
| discover-help-commands | easy | Show the Terraform CLI help to list all available subcommands. | ✓ 2t |
| discover-validate-config | easy | Create a main.tf file with a terraform block and a variable named 'name' with a default value. Run terraform init, then run terraform validate to check the configuration is valid. | ✓ 4t |
| config-local-file-resource | medium | Create a Terraform config that uses the hashicorp/local provider's local_file resource to create a file called 'bench-hello.txt' with content 'Hello from Terraform'. Run terraform init and terraform apply -auto-approve. | ✓ 4t |
| config-plan-save | medium | Create a Terraform config with a local_file resource that writes 'planned content' to bench-output.txt. Run terraform init, then run terraform plan and save the plan to a file called 'bench-tfplan'. | ✓ 2t |
| config-variables-input | medium | Create a Terraform config with an input variable named 'filename' with default value 'bench-varfile.txt' and a local_file resource that uses var.filename as the filename with content 'variable works'. Run terraform init and terraform apply -auto-approve. | ✓ 4t |
| flags-output-values | medium | Create a Terraform config with a local_file resource writing 'output test' to bench-out.txt and an output block named 'file_path' that outputs the filename attribute. Run terraform init, apply -auto-approve, then use terraform output to display the output value. | ✗ 4t |
| flags-validate-json | medium | Create a main.tf file with a terraform block and a variable definition. Run terraform init, then run terraform validate with the -json flag to get machine-readable validation output. | ✓ 1t |
| flags-show-plan-json | medium | Create a Terraform config with a local_file resource writing 'show test' to bench-show.txt. Run terraform init, create a plan saved to bench-showplan, then use terraform show -json bench-showplan to display the plan in JSON format. | ✓ 5t |
| error-plan-without-init | hard | Create a main.tf with a local_file resource. Run terraform plan without running init first. You should see an error. Then run terraform init and terraform plan again successfully. | ✓ 8t |
| error-fix-invalid-config | hard | Create a main.tf with an intentional error: define a resource 'local_file' 'bench_file' but use an invalid attribute name 'invalid_attr' instead of 'content'. Run terraform init, then terraform validate to see the error. Fix the config to use 'content' instead, and validate again successfully. | ✓ 8t |
| error-destroy-empty-state | hard | Create a Terraform config with a local_file resource. Run terraform init, then run terraform destroy -auto-approve without having applied anything first. Report what happens (it should succeed with no changes since nothing was created). | ✓ 2t |
| workflow-full-lifecycle | hard | Create a Terraform config with a local_file resource that writes 'lifecycle test' to bench-lifecycle.txt. Execute the full lifecycle: run init, plan (save to bench-lifecycle-plan), apply the saved plan, verify the file exists, then destroy -auto-approve. Confirm the file is removed after destroy. | ✓ 8t |
| workflow-workspace-switch | hard | Create a Terraform config with a local_file resource writing 'workspace test' to bench-ws.txt. Run terraform init. Create a new workspace called 'staging' using terraform workspace new. Apply -auto-approve in the staging workspace. Then switch back to the default workspace using terraform workspace select. Finally, list all workspaces with terraform workspace list. | ✓ 6t |
| workflow-state-inspection | hard | Create a Terraform config with two local_file resources: one writing 'file alpha' to bench-alpha.txt and another writing 'file beta' to bench-beta.txt. Run terraform init and apply -auto-approve. Then use terraform state list to show all managed resources and terraform state show on one of the resources to display its attributes. | ✓ 2t |
| workflow-plan-apply-with-var | hard | Create a Terraform config with a variable 'message' (no default) and a local_file resource that writes var.message to bench-custom.txt. Run terraform init. Then run terraform plan -var='message=hello-benchmark' and save the plan to bench-varplan. Apply the saved plan. Verify bench-custom.txt contains the message. | ✓ 8t |
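The summary figures for gpt-5-nano can be cross-checked against the per-task marks and turn counts (the `t` suffix) in the table above. The following is a minimal sketch, not CLIWatch's own aggregation code: the `RESULTS` list and `summarize` helper are hypothetical names, and the rounding and the choice to include failed tasks in the turn average are assumptions that happen to reproduce the published numbers.

```python
# Per-task results for gpt-5-nano, transcribed from the table above:
# (task id, passed, turns used)
RESULTS = [
    ("quickstart-version-check", True, 1),
    ("quickstart-init-project", True, 2),
    ("quickstart-format-config", True, 3),
    ("discover-help-commands", True, 2),
    ("discover-validate-config", True, 4),
    ("config-local-file-resource", True, 4),
    ("config-plan-save", True, 2),
    ("config-variables-input", True, 4),
    ("flags-output-values", False, 4),
    ("flags-validate-json", True, 1),
    ("flags-show-plan-json", True, 5),
    ("error-plan-without-init", True, 8),
    ("error-fix-invalid-config", True, 8),
    ("error-destroy-empty-state", True, 2),
    ("workflow-full-lifecycle", True, 8),
    ("workflow-workspace-switch", True, 6),
    ("workflow-state-inspection", True, 2),
    ("workflow-plan-apply-with-var", True, 8),
]


def summarize(results):
    """Return (passed, total, pass rate in whole percent, average turns)."""
    total = len(results)
    passed = sum(1 for _, ok, _ in results if ok)
    pass_rate = round(100 * passed / total)
    # Assumption: the average covers every task, including the failed one.
    avg_turns = round(sum(turns for _, _, turns in results) / total, 1)
    return passed, total, pass_rate, avg_turns


passed, total, pass_rate, avg_turns = summarize(RESULTS)
print(f"{passed}/{total} passed -> {pass_rate}% pass rate, {avg_turns} avg turns")
# prints: 17/18 passed -> 94% pass rate, 4.1 avg turns
```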
Task suite source (264 lines, YAML)
- id: quickstart-version-check
  intent: Display the installed Terraform version.
  assert:
    - ran: terraform
    - output_contains: Terraform
  setup: []
  max_turns: 3
  difficulty: easy
  category: getting-started
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/version.mdx#Usage
- id: quickstart-init-project
  intent: Create a main.tf file with a terraform block requiring version >= 1.0,
    then run terraform init to initialize the working directory.
  assert:
    - ran: terraform init
    - file_exists: main.tf
    - file_exists: .terraform
  setup: []
  max_turns: 5
  difficulty: easy
  category: getting-started
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/init.mdx#Usage
- id: quickstart-format-config
  intent: Create a main.tf file with deliberately bad formatting (inconsistent
    indentation, no spaces around equals signs). Then use terraform fmt to fix
    the formatting.
  assert:
    - ran: terraform fmt
    - file_exists: main.tf
  setup: []
  max_turns: 4
  difficulty: easy
  category: getting-started
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/fmt.mdx#Usage
- id: discover-help-commands
  intent: Show the Terraform CLI help to list all available subcommands.
  assert:
    - ran: terraform
    - output_contains: init
    - output_contains: plan
    - output_contains: apply
  setup: []
  max_turns: 3
  difficulty: easy
  category: command-discovery
  docs_origin: content/terraform/v1.14.x/docs/cli/index.mdx#Terraform CLI Documentation
- id: discover-validate-config
  intent: Create a main.tf file with a terraform block and a variable named 'name'
    with a default value. Run terraform init, then run terraform validate to
    check the configuration is valid.
  assert:
    - ran: terraform init
    - ran: terraform validate
  setup: []
  max_turns: 5
  difficulty: easy
  category: command-discovery
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/validate.mdx#Usage
- id: config-local-file-resource
  intent: Create a Terraform config that uses the hashicorp/local provider's
    local_file resource to create a file called 'bench-hello.txt' with content
    'Hello from Terraform'. Run terraform init and terraform apply
    -auto-approve.
  assert:
    - ran: terraform init
    - ran: terraform apply
    - file_exists: bench-hello.txt
    - file_contains:
        path: bench-hello.txt
        text: Hello from Terraform
  setup: []
  max_turns: 8
  difficulty: medium
  category: config
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/apply.mdx#Automatic Plan Mode
- id: config-plan-save
  intent: Create a Terraform config with a local_file resource that writes
    'planned content' to bench-output.txt. Run terraform init, then run
    terraform plan and save the plan to a file called 'bench-tfplan'.
  assert:
    - ran: terraform init
    - ran: terraform plan
    - file_exists: bench-tfplan
  setup: []
  max_turns: 8
  difficulty: medium
  category: config
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/plan.mdx#Usage
- id: config-variables-input
  intent: Create a Terraform config with an input variable named 'filename' with
    default value 'bench-varfile.txt' and a local_file resource that uses
    var.filename as the filename with content 'variable works'. Run terraform
    init and terraform apply -auto-approve.
  assert:
    - ran: terraform init
    - ran: terraform apply
    - file_exists: bench-varfile.txt
    - file_contains:
        path: bench-varfile.txt
        text: variable works
  setup: []
  max_turns: 8
  difficulty: medium
  category: config
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/apply.mdx#Plan Options
- id: flags-output-values
  intent: Create a Terraform config with a local_file resource writing 'output
    test' to bench-out.txt and an output block named 'file_path' that outputs
    the filename attribute. Run terraform init, apply -auto-approve, then use
    terraform output to display the output value.
  assert:
    - ran: terraform init
    - ran: terraform apply
    - ran: terraform output
    - file_exists: bench-out.txt
  setup: []
  max_turns: 8
  difficulty: medium
  category: flag-parsing
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/output.mdx#Usage
- id: flags-validate-json
  intent: Create a main.tf file with a terraform block and a variable definition.
    Run terraform init, then run terraform validate with the -json flag to get
    machine-readable validation output.
  assert:
    - ran: terraform init
    - ran: terraform validate.*-json
    - output_contains: valid
  setup: []
  max_turns: 6
  difficulty: medium
  category: flag-parsing
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/validate.mdx#JSON Output Format
- id: flags-show-plan-json
  intent: Create a Terraform config with a local_file resource writing 'show test'
    to bench-show.txt. Run terraform init, create a plan saved to
    bench-showplan, then use terraform show -json bench-showplan to display the
    plan in JSON format.
  assert:
    - ran: terraform init
    - ran: terraform plan
    - ran: terraform show.*-json
    - output_contains: planned_values
  setup: []
  max_turns: 8
  difficulty: medium
  category: flag-parsing
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/show.mdx#JSON Output
- id: error-plan-without-init
  intent: Create a main.tf with a local_file resource. Run terraform plan without
    running init first. You should see an error. Then run terraform init and
    terraform plan again successfully.
  assert:
    - ran: terraform plan
    - ran: terraform init
    - file_exists: main.tf
  setup: []
  max_turns: 8
  difficulty: hard
  category: error-recovery
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/plan.mdx#Usage
- id: error-fix-invalid-config
  intent: "Create a main.tf with an intentional error: define a resource
    'local_file' 'bench_file' but use an invalid attribute name 'invalid_attr'
    instead of 'content'. Run terraform init, then terraform validate to see the
    error. Fix the config to use 'content' instead, and validate again
    successfully."
  assert:
    - run_count:
        pattern: terraform validate
        min: 2
    - ran: terraform init
    - file_exists: main.tf
  setup: []
  max_turns: 8
  difficulty: hard
  category: error-recovery
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/validate.mdx#Introduction
- id: error-destroy-empty-state
  intent: Create a Terraform config with a local_file resource. Run terraform
    init, then run terraform destroy -auto-approve without having applied
    anything first. Report what happens (it should succeed with no changes since
    nothing was created).
  assert:
    - ran: terraform init
    - ran: terraform destroy
  setup: []
  max_turns: 8
  difficulty: hard
  category: error-recovery
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/destroy.mdx#Usage
- id: workflow-full-lifecycle
  intent: "Create a Terraform config with a local_file resource that writes
    'lifecycle test' to bench-lifecycle.txt. Execute the full lifecycle: run
    init, plan (save to bench-lifecycle-plan), apply the saved plan, verify the
    file exists, then destroy -auto-approve. Confirm the file is removed after
    destroy."
  assert:
    - ran: terraform init
    - ran: terraform plan
    - ran: terraform apply
    - ran: terraform destroy
  setup: []
  max_turns: 12
  difficulty: hard
  category: multi-step-workflow
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/apply.mdx#Saved Plan Mode
- id: workflow-workspace-switch
  intent: Create a Terraform config with a local_file resource writing 'workspace
    test' to bench-ws.txt. Run terraform init. Create a new workspace called
    'staging' using terraform workspace new. Apply -auto-approve in the staging
    workspace. Then switch back to the default workspace using terraform
    workspace select. Finally, list all workspaces with terraform workspace
    list.
  assert:
    - ran: terraform init
    - ran: terraform workspace new.*staging
    - ran: terraform workspace select.*default
    - ran: terraform workspace list
    - ran: terraform apply
  setup: []
  max_turns: 12
  difficulty: hard
  category: multi-step-workflow
  docs_origin: "content/terraform/v1.14.x/docs/cli/commands/workspace/new.mdx#Exa\
    mple: Create"
- id: workflow-state-inspection
  intent: "Create a Terraform config with two local_file resources: one writing
    'file alpha' to bench-alpha.txt and another writing 'file beta' to
    bench-beta.txt. Run terraform init and apply -auto-approve. Then use
    terraform state list to show all managed resources and terraform state show
    on one of the resources to display its attributes."
  assert:
    - ran: terraform init
    - ran: terraform apply
    - ran: terraform state list
    - ran: terraform state show
    - file_exists: bench-alpha.txt
    - file_exists: bench-beta.txt
  setup: []
  max_turns: 12
  difficulty: hard
  category: multi-step-workflow
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/state/list.mdx#Usage
- id: workflow-plan-apply-with-var
  intent: Create a Terraform config with a variable 'message' (no default) and a
    local_file resource that writes var.message to bench-custom.txt. Run
    terraform init. Then run terraform plan -var='message=hello-benchmark' and
    save the plan to bench-varplan. Apply the saved plan. Verify
    bench-custom.txt contains the message.
  assert:
    - ran: terraform init
    - ran: terraform plan
    - ran: terraform apply
    - file_exists: bench-custom.txt
    - file_contains:
        path: bench-custom.txt
        text: hello-benchmark
  setup: []
  max_turns: 12
  difficulty: hard
  category: multi-step-workflow
  docs_origin: content/terraform/v1.14.x/docs/cli/commands/plan.mdx#Usage
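To make the `assert` entries in the suite above more concrete, here is a rough Python sketch of how such checks might be evaluated against a recorded agent session. It is not the @cliwatch/cli-bench implementation, whose exact semantics this page does not document. The `check` and `task_passes` functions are hypothetical, and treating `ran` and `run_count.pattern` as regular expressions is an assumption based on patterns such as `terraform validate.*-json`.

```python
import os
import re
import tempfile


def check(assertion, commands, output, workdir):
    """Evaluate one single-key assertion dict against a recorded session.

    commands: shell command strings the agent ran
    output:   combined text printed by those commands
    workdir:  directory the agent worked in
    """
    kind, arg = next(iter(assertion.items()))
    if kind == "ran":
        # Assumption: the value is a regex searched within each command line.
        return any(re.search(arg, cmd) for cmd in commands)
    if kind == "run_count":
        hits = sum(1 for cmd in commands if re.search(arg["pattern"], cmd))
        return hits >= arg["min"]
    if kind == "output_contains":
        return arg in output
    if kind == "file_exists":
        return os.path.exists(os.path.join(workdir, arg))
    if kind == "file_contains":
        path = os.path.join(workdir, arg["path"])
        if not os.path.isfile(path):
            return False
        with open(path, encoding="utf-8") as handle:
            return arg["text"] in handle.read()
    raise ValueError(f"unknown assertion type: {kind}")


def task_passes(task, commands, output, workdir):
    """A task passes only if every one of its assertions holds."""
    return all(check(a, commands, output, workdir) for a in task["assert"])


# The config-local-file-resource task from the suite, written as a Python dict.
task = {
    "id": "config-local-file-resource",
    "assert": [
        {"ran": "terraform init"},
        {"ran": "terraform apply"},
        {"file_exists": "bench-hello.txt"},
        {"file_contains": {"path": "bench-hello.txt", "text": "Hello from Terraform"}},
    ],
}

with tempfile.TemporaryDirectory() as workdir:
    # Simulate what a successful apply would leave behind; Terraform is not run here.
    with open(os.path.join(workdir, "bench-hello.txt"), "w", encoding="utf-8") as handle:
        handle.write("Hello from Terraform")
    commands = ["terraform init", "terraform apply -auto-approve"]
    demo_ok = task_passes(task, commands, "simulated output", workdir)

print(demo_ok)  # prints: True
```

Note that several tasks in the suite assert only that commands ran, so under these assumed semantics a task can pass even when a later step produced an error.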
Evals are a snapshot, not a verdict. We run identical tasks across all models to keep comparisons fair. Results vary with CLI version, task selection, and model updates. Evals run weekly on 18 tasks using @cliwatch/cli-bench.
What you get with CLIWatch
Everything below is running live for Terraform — see the latest run. Set up the same for your CLI in minutes.
| Model | Pass Rate | Delta |
|---|---|---|
| Sonnet 4.5 | 95% | +5% |
| GPT-4.1 | 80% | -5% |
| Haiku 4.5 | 65% | -10% |
CI & PR Comments
Get automated PR comments with per-model pass rates, regressions, and a link to the full comparison dashboard.
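As an illustration of what such a comment could contain, the sketch below renders a pass-rate table with deltas from two runs. The `render_pr_comment` function and the comment layout are hypothetical, not CLIWatch's actual output format. The previous-run figures are back-computed from the Delta column of the sample table above, and deltas are treated as percentage points.

```python
def render_pr_comment(previous, current):
    """Build a markdown table of pass rates with deltas in percentage points."""
    lines = ["| Model | Pass Rate | Delta |", "|---|---|---|"]
    regressions = []
    for model, rate in current.items():
        # A model with no previous result is shown with a delta of zero.
        delta = rate - previous.get(model, rate)
        lines.append(f"| {model} | {rate}% | {delta:+d}% |")
        if delta < 0:
            regressions.append(model)
    if regressions:
        lines.append("")
        lines.append("Regressions: " + ", ".join(regressions))
    return "\n".join(lines)


# Previous rates implied by the sample table (for example, 95% with +5% implies 90%).
previous = {"Sonnet 4.5": 90, "GPT-4.1": 85, "Haiku 4.5": 75}
current = {"Sonnet 4.5": 95, "GPT-4.1": 80, "Haiku 4.5": 65}

comment = render_pr_comment(previous, current)
print(comment)
```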
Track Over Time
See how your CLI's agent compatibility changes across releases. Spot trends and regressions at a glance.
thresholds:
  claude-sonnet-4-5: 80%
  gpt-4.1: 75%
  claude-haiku-4-5: 60%
Quality Gates
Set per-model pass rate thresholds. CI fails if evals drop below your targets.
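The gate logic described here can be sketched in a few lines of Python. This is an illustrative assumption about how a threshold check might work, not CLIWatch's implementation: `parse_percent` and `failing_models` are hypothetical helpers, and the mapping between the display names in the sample table (Sonnet 4.5, GPT-4.1, Haiku 4.5) and the model ids in the `thresholds` snippet is assumed.

```python
def parse_percent(value):
    """Turn a string like '80%' into the integer 80."""
    return int(value.strip().rstrip("%"))


def failing_models(pass_rates, thresholds):
    """Return (model, rate, threshold) for each model below its threshold."""
    failures = []
    for model, threshold in thresholds.items():
        limit = parse_percent(threshold)
        rate = pass_rates.get(model)
        # Assumption of this sketch: a model with no result counts as failing.
        if rate is None or rate < limit:
            failures.append((model, rate, limit))
    return failures


thresholds = {"claude-sonnet-4-5": "80%", "gpt-4.1": "75%", "claude-haiku-4-5": "60%"}
# Pass rates from the sample table, keyed by the assumed matching model ids.
pass_rates = {"claude-sonnet-4-5": 95, "gpt-4.1": 80, "claude-haiku-4-5": 65}

failures = failing_models(pass_rates, thresholds)
for model, rate, limit in failures:
    print(f"{model}: {rate}% is below the {limit}% threshold")

# A CI step would hand this value to sys.exit so the job fails on any miss.
exit_code = 1 if failures else 0
print("exit code:", exit_code)  # prints: exit code: 0
```

With the sample numbers every model clears its threshold, so the gate passes. Dropping `gpt-4.1` to 70 would produce one failure and a non-zero exit code.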
Get this for your CLI
Run evals in CI, get PR comments with regressions, track pass rates over time, and gate merges on quality thresholds — all from a single GitHub Actions workflow.