CLI Agent-Readiness Directory

How well do popular CLIs work with AI coding agents? Each CLI is independently evaluated on its own set of real-world tasks. Scores reflect per-CLI agent-readiness, not cross-CLI rankings; different CLIs have different tasks and complexity levels.

Last updated March 6, 2026 · Grade reflects average pass rate.

CLI	Category	Pass Rate	Avg Turns	gpt-5-nano	Tasks
A · Agent Ready · 6 CLIs
AWS	Cloud	100%	3.3	100%	4
Git	Developer Tools	100%	4.3	100%	4
GNU Make	Developer Tools	100%	2.8	100%	4
Go	Developer Tools	100%	4.0	100%	4
pip	Package Managers	100%	2.5	100%	4
Terraform	Cloud / DevOps	94%	4.1	94%	18
B · Almost There · 6 CLIs
curl	Developer Tools	89%	2.8	89%	18
Docker	Cloud / DevOps	78%	2.7	78%	18
gh	Developer Tools	75%	3.7	75%	4
jq	Data Processing	88%	1.9	88%	16
npm	Package Managers	89%	3.1	89%	19
Stripe	Developer Tools	86%	4.8	86%	14
C · Room to Grow · 7 CLIs
Cargo	Package Managers	63%	3.6	63%	19
Fly.io	Cloud	58%	3.5	58%	19
kubectl	Cloud / DevOps	50%	3.0	50%	4
pnpm	Package Managers	63%	3.7	63%	19
psql	Database	50%	5.5	50%	4
Supabase	Database	71%	4.6	71%	14
Vercel	Developer Tools	50%	2.4	50%	14

This is an early-stage showcase. Each CLI has its own task suite, so cross-CLI comparison is approximate. Our goal is standardized, comparable evals. We're adding more tasks and models regularly.
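
As a rough illustration of how a pass rate might map to a letter grade, here is a minimal sketch. The cutoffs (A at 90% and above, B at 75% and above, C otherwise) are assumptions chosen to be consistent with the table above; the page does not state the actual thresholds, and the function names are hypothetical.

```python
def pass_rate(passed: int, total: int) -> int:
    """Percentage of tasks passed, rounded to the nearest whole percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total)


def letter_grade(rate: int) -> str:
    """Map a pass rate to a grade using ASSUMED cutoffs (not from the source)."""
    if rate >= 90:
        return "A"
    if rate >= 75:
        return "B"
    return "C"


# Hypothetical example: 17 of 18 tasks passing.
rate = pass_rate(17, 18)
grade = letter_grade(rate)
```

Under these assumed cutoffs, a CLI passing 17 of 18 tasks would land in the A group, while one passing half its tasks would land in C.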

Explore evals

Detailed eval results for each CLI — per-model pass rates, per-task breakdowns, and the full task suite source.

# agent checks repo status and history
$ git status --porcelain
  M  src/index.ts
  ?? new-file.ts

Git · Developer Tools

The ubiquitous version control system. Agents use it to commit changes, manage branches, resolve conflicts, and navigate repository history.

View analysis →

# agent extracts data from API response
$ echo '[{"name":"a","score":95},{"name":"b","score":72}]' |
    jq '.[] | select(.score > 80) | .name'
  "a"

jq · Data Processing

A lightweight command-line JSON processor. Agents use it to parse, filter, transform, and format JSON data from APIs and files.

View analysis →
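
To make the jq filter in the example above concrete, the following sketch reproduces the same selection in Python on the same sample input. It is only an illustration of what the filter does, not how jq is implemented; the variable names are hypothetical.

```python
import json

# Same sample input as the jq example above.
raw = '[{"name":"a","score":95},{"name":"b","score":72}]'

records = json.loads(raw)

# Equivalent of: .[] | select(.score > 80) | .name
high_scorers = [record["name"] for record in records if record["score"] > 80]

print(high_scorers)
```

Only the record with a score above 80 survives the filter, which matches the single name jq prints.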

# agent manages dependencies
$ npm ls --depth=0 --json
  {"dependencies":{"express":{"version":"4.21.0"},
   "typescript":{"version":"5.7.0"}}}

npm · Package Managers

The Node.js package manager. Agents use it to install dependencies, run scripts, manage versions, and publish packages.

View analysis →
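
An agent that runs `npm ls --depth=0 --json` still has to read the result. The sketch below shows one way to pull top-level package versions out of output shaped like the npm example above. The helper name `top_level_versions` is hypothetical, and the sample string is the abbreviated output shown on this page rather than a full npm response.

```python
import json

# Abbreviated sample output, copied from the npm example above.
npm_ls_output = (
    '{"dependencies":{"express":{"version":"4.21.0"},'
    '"typescript":{"version":"5.7.0"}}}'
)


def top_level_versions(output: str) -> dict[str, str]:
    """Return {package name: version} for the top-level dependencies."""
    tree = json.loads(output)
    dependencies = tree.get("dependencies", {})
    # Some entries may lack a version (for example, missing packages), so default to "".
    return {name: info.get("version", "") for name, info in dependencies.items()}


versions = top_level_versions(npm_ls_output)
print(versions)
```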

# agent checks pod health
$ kubectl get pods -n production -o json
  {"items": [{"metadata": {"name": "api-7d4.."},
   "status": {"phase": "Running"}}]}

kubectl · Cloud / DevOps

The Kubernetes command-line tool. Agents use it to manage clusters, inspect workloads, debug pods, and apply manifests.

View analysis →

# agent creates a PR
$ gh pr create --title "Fix auth bug" --body "..."
  https://github.com/org/repo/pull/42
 
$ gh pr view 42 --json state,statusCheckRollup

gh · Developer Tools

GitHub's official CLI. Agents use it to create PRs, manage issues, trigger workflows, and query repository data.

View analysis →

# agent inspects a running container
$ docker ps --format json
  {"ID":"a1b2c3","Names":"api","Status":"Up 2h"}
 
$ docker logs api --tail 10

Docker · Cloud / DevOps

Container management CLI. Agents build images, run containers, manage volumes, and inspect running services.

View analysis →
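
`docker ps --format json` prints one JSON object per container, one per line, rather than a single JSON array, so the output has to be parsed line by line. The sketch below assumes that shape. The first sample line comes from the Docker example above; the second container and the helper name `parse_docker_ps` are hypothetical.

```python
import json


def parse_docker_ps(output: str) -> list[dict]:
    """Parse line-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in output.splitlines() if line.strip()]


sample_output = "\n".join(
    [
        '{"ID":"a1b2c3","Names":"api","Status":"Up 2h"}',
        # Hypothetical second container, added for illustration only.
        '{"ID":"d4e5f6","Names":"worker","Status":"Exited (1) 5 minutes ago"}',
    ]
)

containers = parse_docker_ps(sample_output)

# Treat any status beginning with "Up" as running.
running = [c["Names"] for c in containers if c["Status"].startswith("Up")]
print(running)
```

Passing the whole output to a single `json.loads` call would fail as soon as more than one container is listed, which is why the sketch splits on lines first.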

# agent plans and inspects state
$ terraform plan -json
  {"type":"planned_change","change":{
   "resource":"local_file.config",
   "action":"create"}}

Terraform · Cloud / DevOps

Infrastructure as code tool. Agents plan and apply infrastructure changes, inspect state, and manage workspaces.

View analysis →

# agent lists EC2 instances
$ aws ec2 describe-instances --query \
    'Reservations[].Instances[].{Id:InstanceId,
     State:State.Name}' --output json
  [{"Id":"i-0a1b2c","State":"running"}]

AWS · Cloud

Amazon Web Services CLI. Agents manage cloud resources, configure services, and query infrastructure across AWS.

View analysis →

# agent deploys to production
$ vercel --prod --yes
  Deploying to production...
  https://app.example.com

Vercel · Developer Tools

Frontend deployment platform CLI. Agents deploy projects, manage environment variables, and configure domains.

View analysis →

# agent retrieves a customer
$ stripe customers list --limit 1
  {"data": [{"id": "cus_abc123",
   "email": "user@example.com"}]}

Stripe · Developer Tools

Payment infrastructure CLI. Agents listen to webhooks, trigger test events, and manage Stripe resources.

View analysis →

# agent checks app status
$ fly status --json
  {"Name":"api","Status":"deployed",
   "Machines":[{"region":"iad","state":"started"}]}

Fly.io · Cloud

Edge computing platform CLI. Agents deploy apps, manage machines, scale regions, and monitor deployments.

View analysis →

# agent runs a migration
$ supabase db push
  Connecting to remote database...
  Applying migration 20260208_add_users.sql...
  Finished supabase db push.

Supabase · Database

Open-source Firebase alternative CLI. Agents manage databases, run migrations, generate types, and manage edge functions.

View analysis →

# agent builds and tests a Rust project
$ cargo test
   Compiling mylib v0.1.0
     Running unittests src/lib.rs
  test result: ok. 3 passed; 0 failed

Cargo · Package Managers

The Rust package manager and build system. Agents use it to create projects, manage dependencies, run tests, and build optimized binaries.

View analysis →

# agent tests an API endpoint
$ curl -s https://httpbin.org/get | jq .origin
  "203.0.113.42"
 
$ curl -w '%{http_code}' -o /dev/null -s https://example.com

curl · Developer Tools

The universal command-line HTTP client. Agents use it to make API calls, download files, test endpoints, and debug HTTP traffic.

View analysis →

# agent runs tests with coverage
$ go test -cover ./...
  ok  example.com/myproject  0.003s  coverage: 85.7%
 
$ go build -o server .

Go · Developer Tools

The Go programming language toolchain. Agents use it to build binaries, run tests, manage modules, and format code.

View analysis →

# agent manages a monorepo
$ pnpm add lodash
  + lodash 4.17.21
 
$ pnpm ls --depth=0

pnpm · Package Managers

Fast, disk-efficient JavaScript package manager. Agents use it to install dependencies, run scripts, and manage monorepo workspaces.

View analysis →

# agent manages Python dependencies
$ pip3 install requests
  Successfully installed requests-2.31.0
 
$ pip3 freeze > requirements.txt

pip · Package Managers

The Python package installer. Agents use it to install libraries into virtual environments, freeze requirements, and check installed packages for dependency conflicts.

View analysis →

# agent queries a database
$ psql -c "SELECT name, score FROM users ORDER BY score DESC LIMIT 3"
   name  | score
  -------+-------
   alice |    95

psql · Database

The PostgreSQL interactive terminal. Agents use it to run queries, manage schemas, import/export data, and administer databases.

View analysis →

# agent runs build targets
$ make build
  gcc -Wall -O2 -o app main.c
 
$ make test

GNU Make · Developer Tools

The classic build automation tool. Agents use it to run build targets, manage dependencies between tasks, and automate project workflows.

View analysis →

Want to know how well AI agents can use your CLI?

Track pass rates, compare models, and catch regressions across releases.
