# agent extracts data from API response
$ echo '[{"name":"a","score":95},{"name":"b","score":72}]' |
    jq '.[] | select(.score > 80) | .name'
  "a"

Can AI agents use jq?

A lightweight command-line JSON processor. Agents use it to parse, filter, transform, and format JSON data from APIs and files.

See the latest run →
88% overall pass rate1 model tested16 tasksvjq-1.73/6/2026

jq eval results by model

ModelPass rateAvg turnsAvg tokens
gpt-5-nano88%1.95.8k

jq task results by model

Taskgpt-5-nano
sort-keys-outputmedium
Print the contents of bench-unordered.json with all object keys sorted alphabetically.
1t
quickstart-pretty-printeasy
The file bench-raw.json contains minified JSON. Use jq to pretty-print it to stdout.
1t
quickstart-extract-fieldeasy
Extract the 'city' field from the nested 'address' object in bench-person.json. Print just the string value.
3t
quickstart-array-firsteasy
Get the first element from the JSON array in bench-items.json and print it.
2t
discover-versioneasy
Check what version of jq is installed and print it.
1t
discover-compact-flageasy
Print the contents of bench-data.json as a single compact line (no pretty-printing).
1t
config-null-input-constructmedium
Without any input file, use jq to construct and print a JSON object with keys 'project' (value 'benchmark') and 'version' (value 1). Do not create any input files.
1t
error-optional-operatorhard
The file bench-mixed.json contains an array with both objects and non-objects. Extract the 'name' field from each element without producing errors for non-object elements.
3t
config-arg-variablemedium
Use jq's --arg feature to pass the value 'production' as a variable named 'env', then construct a JSON object {"environment": $env} and print it. Do not use any input file.
1t
config-from-file-filtermedium
Write a jq filter to bench-filter.jq that selects objects where .status == "active", then run jq with that filter file against bench-users.json.
5t
raw-string-outputmedium
Extract the 'name' field from bench-record.json and print it as a raw string (no JSON quotes around it).
1t
slurp-combine-inputsmedium
There are three separate JSON files: bench-a.json, bench-b.json, bench-c.json. Use jq to slurp them all into a single JSON array and print it.
1t
error-exit-statushard
Use jq with the exit-status flag to check bench-check.json. The filter should test if .enabled is true. Print the result and ensure jq exits with a non-zero status if the value is false.
1t
tutorial-extract-transformhard
Read bench-commits.json (an array of commit objects). For each commit, extract just the 'message' and 'author' fields into a new object like {"msg": .message, "who": .author}. Print the resulting array.
2t
workflow-aggregate-reporthard
Read bench-sales.json (an array of sale records with 'region' and 'amount' fields). Group the sales by region, then for each region compute the total amount. Print the result as an array of {"region": ..., "total": ...} objects.
3t
workflow-merge-and-enrichhard
Merge data from two files: bench-products.json has [{"id": 1, "name": "Widget"}, ...] and bench-prices.json has {"1": 9.99, "2": 24.99, ...}. Produce an array where each product object gains a 'price' field looked up by its id. Write the result to bench-enriched.json.
2t
Task suite source249 lines · YAML
- id: quickstart-pretty-print
  intent: The file bench-raw.json contains minified JSON. Use jq to pretty-print
    it to stdout.
  assert:
    - ran: jq
    - output_contains: Alice
    - output_contains: scores
  setup:
    - echo '{"name":"Alice","scores":[90,85,92],"active":true}' > bench-raw.json
  max_turns: 3
  difficulty: easy
  category: getting-started
  docs_origin: docs/content/tutorial/default.yml#pretty-printing
- id: quickstart-extract-field
  intent: Extract the 'city' field from the nested 'address' object in
    bench-person.json. Print just the string value.
  assert:
    - ran: jq
    - output_contains: Portland
  setup:
    - "echo '{\"name\": \"Alice\", \"address\": {\"city\": \"Portland\",
      \"state\": \"OR\", \"zip\": \"97201\"}}' > bench-person.json"
  max_turns: 3
  difficulty: easy
  category: getting-started
  docs_origin: docs/content/manual/manual.yml#Object Identifier-Index
- id: quickstart-array-first
  intent: Get the first element from the JSON array in bench-items.json and print it.
  assert:
    - ran: jq
    - output_contains: Widget
  setup:
    - "echo '[{\"id\": 1, \"name\": \"Widget\"}, {\"id\": 2, \"name\":
      \"Gadget\"}, {\"id\": 3, \"name\": \"Sprocket\"}]' > bench-items.json"
  max_turns: 3
  difficulty: easy
  category: getting-started
  docs_origin: docs/content/tutorial/default.yml#array-indexing
- id: discover-version
  intent: Check what version of jq is installed and print it.
  assert:
    - ran: jq
    - output_contains: jq-
  setup: []
  max_turns: 3
  difficulty: easy
  category: command-discovery
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: discover-compact-flag
  intent: Print the contents of bench-data.json as a single compact line (no
    pretty-printing).
  assert:
    - ran: jq
    - output_contains: '"a":1'
  setup:
    - "echo '{\"a\": 1, \"b\": 2, \"c\": 3}' > bench-data.json"
  max_turns: 4
  difficulty: easy
  category: command-discovery
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: config-null-input-construct
  intent: Without any input file, use jq to construct and print a JSON object with
    keys 'project' (value 'benchmark') and 'version' (value 1). Do not create
    any input files.
  assert:
    - ran: jq
    - output_contains: benchmark
    - output_contains: version
  setup: []
  max_turns: 5
  difficulty: medium
  category: config
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: config-arg-variable
  intent: "Use jq's --arg feature to pass the value 'production' as a variable
    named 'env', then construct a JSON object {\"environment\": $env} and print
    it. Do not use any input file."
  assert:
    - ran: jq
    - output_contains: environment
    - output_contains: production
  setup: []
  max_turns: 5
  difficulty: medium
  category: config
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: config-from-file-filter
  intent: Write a jq filter to bench-filter.jq that selects objects where .status
    == "active", then run jq with that filter file against bench-users.json.
  assert:
    - ran: jq
    - file_exists: bench-filter.jq
    - output_contains: Alice
    - output_contains: Carol
  setup:
    - "echo '[{\"name\": \"Alice\", \"status\": \"active\"}, {\"name\": \"Bob\",
      \"status\": \"inactive\"}, {\"name\": \"Carol\", \"status\": \"active\"}]'
      > bench-users.json"
  max_turns: 6
  difficulty: medium
  category: config
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: raw-string-output
  intent: Extract the 'name' field from bench-record.json and print it as a raw
    string (no JSON quotes around it).
  assert:
    - ran: jq
    - output_contains: hello-world
  setup:
    - "echo '{\"name\": \"hello-world\", \"type\": \"greeting\"}' >
      bench-record.json"
  max_turns: 5
  difficulty: medium
  category: flag-parsing
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: slurp-combine-inputs
  intent: "There are three separate JSON files: bench-a.json, bench-b.json,
    bench-c.json. Use jq to slurp them all into a single JSON array and print
    it."
  assert:
    - ran: jq
    - output_contains: '"id": 1'
    - output_contains: '"id": 3'
  setup:
    - "echo '{\"id\": 1}' > bench-a.json"
    - "echo '{\"id\": 2}' > bench-b.json"
    - "echo '{\"id\": 3}' > bench-c.json"
  max_turns: 5
  difficulty: medium
  category: flag-parsing
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: sort-keys-output
  intent: Print the contents of bench-unordered.json with all object keys sorted
    alphabetically.
  assert:
    - ran: jq
    - verify:
        run: jq -S '.' bench-unordered.json
        output_contains: apple
  setup:
    - "echo '{\"zebra\": 1, \"apple\": 2, \"mango\": 3, \"banana\": 4}' >
      bench-unordered.json"
  max_turns: 5
  difficulty: medium
  category: flag-parsing
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: error-optional-operator
  intent: The file bench-mixed.json contains an array with both objects and
    non-objects. Extract the 'name' field from each element without producing
    errors for non-object elements.
  assert:
    - ran: jq
    - output_contains: Alice
    - output_contains: Bob
    - output_contains: Carol
  setup:
    - "echo '[{\"name\": \"Alice\"}, 42, {\"name\": \"Bob\"}, null, {\"name\":
      \"Carol\"}]' > bench-mixed.json"
  max_turns: 6
  difficulty: hard
  category: error-recovery
  docs_origin: docs/content/manual/manual.yml#Optional Object Identifier-Index
- id: error-exit-status
  intent: Use jq with the exit-status flag to check bench-check.json. The filter
    should test if .enabled is true. Print the result and ensure jq exits with a
    non-zero status if the value is false.
  assert:
    - ran: jq
    - output_contains: "false"
  setup:
    - "echo '{\"enabled\": false, \"name\": \"test\"}' > bench-check.json"
  max_turns: 6
  difficulty: hard
  category: error-recovery
  docs_origin: docs/content/manual/manual.yml#Invoking jq
- id: tutorial-extract-transform
  intent: "Read bench-commits.json (an array of commit objects). For each commit,
    extract just the 'message' and 'author' fields into a new object like
    {\"msg\": .message, \"who\": .author}. Print the resulting array."
  assert:
    - ran: jq
    - output_contains: msg
    - output_contains: who
    - output_contains: alice
    - output_contains: resolve null pointer
  setup:
    - >
      cat > bench-commits.json << 'EOF'

      [
        {"sha": "abc123", "message": "fix: resolve null pointer", "author": "alice", "date": "2025-01-15"},
        {"sha": "def456", "message": "feat: add search endpoint", "author": "bob", "date": "2025-01-16"},
        {"sha": "ghi789", "message": "docs: update README", "author": "carol", "date": "2025-01-17"}
      ]

      EOF
  max_turns: 8
  difficulty: hard
  category: multi-step-workflow
  docs_origin: docs/content/tutorial/default.yml#object-construction
- id: workflow-aggregate-report
  intent: "Read bench-sales.json (an array of sale records with 'region' and
    'amount' fields). Group the sales by region, then for each region compute
    the total amount. Print the result as an array of {\"region\": ...,
    \"total\": ...} objects."
  assert:
    - ran: jq
    - output_contains: west
    - output_contains: east
    - output_contains: "300"
    - output_contains: "500"
  setup:
    - |
      cat > bench-sales.json << 'EOF'
      [
        {"region": "west", "amount": 100},
        {"region": "east", "amount": 200},
        {"region": "west", "amount": 150},
        {"region": "east", "amount": 300},
        {"region": "west", "amount": 50}
      ]
      EOF
  max_turns: 10
  difficulty: hard
  category: multi-step-workflow
  docs_origin: docs/content/manual/manual.yml#Builtin operators and functions
- id: workflow-merge-and-enrich
  intent: "Merge data from two files: bench-products.json has [{\"id\": 1,
    \"name\": \"Widget\"}, ...] and bench-prices.json has {\"1\": 9.99, \"2\":
    24.99, ...}. Produce an array where each product object gains a 'price'
    field looked up by its id. Write the result to bench-enriched.json."
  assert:
    - ran: jq
    - file_exists: bench-enriched.json
    - file_contains:
        path: bench-enriched.json
        text: Widget
    - file_contains:
        path: bench-enriched.json
        text: "9.99"
  setup:
    - "echo '[{\"id\": 1, \"name\": \"Widget\"}, {\"id\": 2, \"name\":
      \"Gadget\"}]' > bench-products.json"
    - "echo '{\"1\": 9.99, \"2\": 24.99}' > bench-prices.json"
  max_turns: 10
  difficulty: hard
  category: multi-step-workflow
  docs_origin: docs/content/manual/manual.yml#Invoking jq

Evals are a snapshot, not a verdict. We run identical tasks across all models to keep comparisons fair. Results vary with CLI version, task selection, and model updates. Evals run weekly on 16 tasks using @cliwatch/cli-bench.

What you get with CLIWatch

Everything below is running live for jq see the latest run. Set up the same for your CLI in minutes.

ModelPass RateDelta
Sonnet 4.595%+5%
GPT-4.180%-5%
Haiku 4.565%-10%

CI & PR Comments

Get automated PR comments with per-model pass rates, regressions, and a link to the full comparison dashboard.

Pass rateLast 30 days
v1.0v1.6

Track Over Time

See how your CLI's agent compatibility changes across releases. Spot trends and regressions at a glance.

thresholds:
  claude-sonnet-4-5: 80%
  gpt-4.1: 75%
  claude-haiku-4-5: 60%

Quality Gates

Set per-model pass rate thresholds. CI fails if evals drop below your targets.

Get this for your CLI

Run evals in CI, get PR comments with regressions, track pass rates over time, and gate merges on quality thresholds — all from a single GitHub Actions workflow.

Compare other CLI evals