CLI Agent Experience

Language

Match user's language: Respond in the same language the user uses.

Overview

Audit a CLI tool's agent-friendliness across 11 dimensions, split into base dimensions (which determine the grade) and bonus dimensions (improvement hints that never penalize the grade). Produces a scorecard with actionable suggestions.

Input: A CLI tool — source code, installed binary, or GitHub repo.

Output: Base grade + bonus hints + prioritized fix list.

Workflow

Progress:

  • Step 1: Identify the CLI tool
  • Step 2: Gather evidence
  • Step 3: Score each dimension
  • Step 4: Present report
  • Step 5: Fix gate

Step 1: Identify the CLI

| Input | Action |
|---|---|
| Source code in cwd | Read entry point, help output, output formatting code |
| Installed binary name | Run `<tool> --help`, sample commands |
| GitHub repo URL | Clone/fetch, read source |
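As a sketch, the dispatch table above can be collapsed into a small heuristic. Everything here, including the helper name `classify_input`, is illustrative and not part of the skill; a real check would also consult the filesystem:

```python
import re

def classify_input(value: str) -> str:
    """Heuristically classify what kind of CLI reference the user gave.

    Returns one of: "repo", "source", "binary".
    """
    if re.match(r"https?://|git@", value):
        return "repo"    # GitHub repo URL -> clone/fetch, read source
    if "/" in value or value.endswith((".py", ".go", ".rs", ".js")):
        return "source"  # path-like -> read entry point in cwd
    return "binary"      # bare name -> run `<tool> --help`
```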

Step 2: Gather Evidence

  1. `<tool> --help` and subcommand help (hierarchy depth)
  2. `<tool> <command> --help` for 2-3 representative commands
  3. Check for `--json`, `--quiet`, `--dry-run`, `--yes`/`--force` flags
  4. Read output formatting code in source (stdout vs stderr separation)
  5. Read error handling code (exit codes, error structure)
  6. Run a sample command if safe (read-only) to see actual output format
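The flag check in step 3 can be sketched as a scan over captured help text. `probe_help_text` and `AGENT_FLAGS` are hypothetical names, not part of the skill, and a naive substring scan can miss flags documented elsewhere, so treat a miss as a prompt to read the source rather than proof the flag is absent:

```python
AGENT_FLAGS = ["--json", "--quiet", "--dry-run", "--yes", "--force"]

def probe_help_text(help_text: str) -> dict[str, bool]:
    """Report which agent-relevant flags appear in captured --help output."""
    return {flag: flag in help_text for flag in AGENT_FLAGS}

# In practice the help text would come from something like:
#   subprocess.run([tool, "--help"], capture_output=True, text=True).stdout
```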

Step 3: Score Each Dimension

Score each dimension from 0 to 3. Read `references/dimensions.md` for the full checklists.

Base dimensions (determine grade, /21):

| # | Dimension | Key question |
|---|---|---|
| 1 | Structured Output | Has `--json` for composability? Default output token-efficient? |
| 2 | Exit Codes | Meaningful graduated codes, not just 0/1? |
| 3 | Non-Interactive Mode | Can run fully unattended? TTY detection? |
| 5 | Self-Documentation | `--help` has examples? Required vs optional clear? |
| 8 | Actionable Errors | Error codes + failing input echo + recovery hints? |
| 9 | Consistent Grammar | Noun-verb hierarchy? Consistent naming? |
| 11 | Token Efficiency | Output concise by default? Pagination? `--limit`/`--field`? |

Bonus dimensions (hints only, /12):

| # | Dimension | Key question | Why bonus |
|---|---|---|---|
| 4 | Idempotency | Safe to retry? Declarative commands? | Domain-dependent (browser/chat inherently side-effectful) |
| 6 | Composability | `--quiet`? stdin? Batch ops? Filtering? | Depends on tool scope |
| 7 | Dry-Run | `--dry-run` with structured diff? | Domain-dependent (API wrappers, REPL) |
| 10 | Agent Teaching | Skills/docs for agent consumption? | Forward-looking, industry immature |

Domain adaptation: Bonus dimensions have domain-dependent ceilings. When a tool maxes out what's possible in its domain, mark X/3 (domain ceiling). See references/dimensions.md § Domain Adaptation.
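The base score maps to the letter grade shown in the Step 4 report (A 18-21, B 14-17, C 10-13, D 6-9, F 0-5). A minimal sketch of that mapping, with an invented function name:

```python
def base_grade(total: int) -> str:
    """Map the base score (sum of 7 dimensions, each 0-3, max 21)
    to the letter grade used in the report."""
    if not 0 <= total <= 21:
        raise ValueError("base score must be within 0..21")
    for floor, grade in [(18, "A"), (14, "B"), (10, "C"), (6, "D")]:
        if total >= floor:
            return grade
    return "F"
```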

Step 4: Present Report

```
[CLI Agent Experience] <tool-name>

═══ Base Score ═══
 1. Structured Output:    <score>/3  <one-line>
 2. Exit Codes:           <score>/3  <one-line>
 3. Non-Interactive:      <score>/3  <one-line>
 5. Self-Documentation:   <score>/3  <one-line>
 8. Actionable Errors:    <score>/3  <one-line>
 9. Consistent Grammar:   <score>/3  <one-line>
11. Token Efficiency:     <score>/3  <one-line>
                          ─────────
Base:                     <total>/21

Grade: A (18-21) | B (14-17) | C (10-13) | D (6-9) | F (0-5)

═══ Bonus ═══
 4. Idempotency:          <score>/3  <one-line>
 6. Composability:        <score>/3  <one-line>
 7. Dry-Run:              <score>/3  <one-line>
10. Agent Teaching:       <score>/3  <one-line>
                          ─────────
Bonus:                    +<total>/12

═══ Top Improvements ═══
Base items first (affect grade), then bonus items (hints):
  1. What to change, why, and a concrete before/after example
  2. Priority: High (blocks agent usage) / Medium (degrades experience) / Low (forward-looking)
```
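To illustrate the kind of before/after example an improvement item might carry, here is a hedged sketch of an actionable error (dimension 8) paired with a graduated exit code (dimension 2). The field names, error code, and helper are invented for illustration, not a standard:

```python
import json
import sys

def actionable_error(code: str, failing_input: str, hint: str) -> str:
    """Build a machine-readable error: error code, echo of the
    failing input, and a recovery hint."""
    return json.dumps({"error": code, "input": failing_input, "hint": hint})

# Before (opaque):   prints "error: bad value" and exits 1
# After (actionable):
#   sys.stderr.write(actionable_error(
#       "E_INVALID_REGION", "eu-wst-1",
#       "Unknown region; did you mean eu-west-1? Run `<tool> regions list`.",
#   ) + "\n")
#   sys.exit(2)  # graduated code: 2 = usage error, not generic failure
```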

Step 5: Fix Gate

Ask user:

  • Fix all — apply improvements to source code
  • Pick items — select specific dimensions to fix
  • None — use report as reference

References

Read references/dimensions.md for the full scoring checklist per dimension with examples and anti-patterns.
