skill-tester
Skill Tester & Analyzer
A meta-skill for deeply testing and auditing other Claude skills. It instruments test runs to capture raw API call traces, records all script stdin/stdout/stderr with timing, and runs deterministic security scans followed by dedicated security and code review subagents against any scripts embedded in the skill.
Session Directory Layout
<report_root>/<skill_name>_<YYYYMMDD_HHMMSS>/
├── manifest.json # Validation results and session metadata (created by setup_test_env.py)
├── sandbox/ # Isolated workspace for script execution
├── inventory.json # Skill structure scan
├── scan_results.json # Deterministic security findings (B9 — runs first)
├── prompt_lint.json # Deterministic prompt quality findings (B11 — runs first)
├── prompt_review.json # AI prompt quality analysis (receives prompt_lint as input)
├── api_log.jsonl # All Claude API calls (one JSON object per line)
├── script_runs.jsonl # All script executions with I/O
├── security_report.json # AI security analysis (receives scan_results as input)
├── code_review.json # Code quality review
├── session_report.html # Claude Code session trace (API calls, tool use, conversation)
└── report.html # Unified interactive HTML report
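As a rough illustration of how the two `.jsonl` logs in this layout could be consumed, the sketch below reads each file line by line and counts the records. It assumes only what the layout states (one JSON object per line); the helper names `read_jsonl` and `count_records` are hypothetical and not part of the skill.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed object per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_records(session_dir):
    """Count entries in the two JSONL logs of a session directory.

    A missing log counts as zero records (for example, Audit mode
    may never produce script_runs.jsonl; this is an assumption).
    """
    session = Path(session_dir)
    counts = {}
    for name in ("api_log.jsonl", "script_runs.jsonl"):
        log = session / name
        counts[name] = sum(1 for _ in read_jsonl(log)) if log.exists() else 0
    return counts
```

Calling `count_records("<report_root>/<skill_name>_<YYYYMMDD_HHMMSS>")` on a real session directory would return a dictionary keyed by log file name.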
Modes
| Mode | Description | Phases Run | Command |
|---|---|---|---|
| Full (default) | Complete analysis: scan → prompt-lint → test → security → review → report | All (2-9) | /st:run |
| Audit | Static analysis only, no test execution | 2-4, 6-7, 9 | /st:audit |
| Trace | Runtime capture only, no security/code review | 2, 5, 8, 9 | /st:trace |
| Report | Regenerate HTML from existing session data | 9 only | /st:report |
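The "Phases Run" column uses a compact range notation. A minimal sketch of expanding it into explicit phase numbers follows; the `MODE_PHASES` mapping and `expand_phases` helper are hypothetical names that restate the table above, not code from the skill.

```python
# Phase specs copied from the Modes table (Full is "All", i.e. 2-9).
MODE_PHASES = {
    "full": "2-9",
    "audit": "2-4, 6-7, 9",
    "trace": "2, 5, 8, 9",
    "report": "9",
}


def expand_phases(spec):
    """Expand a spec such as "2-4, 6-7, 9" into a list of phase numbers."""
    phases = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            phases.extend(range(int(start), int(end) + 1))
        else:
            phases.append(int(part))
    return phases
```

Expanded this way, Audit omits phases 5 and 8, while Trace omits phases 3, 4, 6, and 7.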
Commands
| Command | Mode | Phases | Purpose |
|---|---|---|---|
| /st:init | All | 1 | Set up session: target, mode, prompts, report location |
| /st:run | Full | 2-9 | Execute all analysis phases |
| /st:audit | Audit | 2-4, 6-7, 9 | Static analysis only |
| /st:trace | Trace | 2, 5, 8, 9 | Runtime capture only |
| /st:report | Report | 9 | Regenerate HTML from session data |
| /st:status | N/A | None | Show session state |
| /st:resume | Any | Variable | Resume interrupted session |
Interpreting Results
Security Severity Levels
| Level | Meaning | Action |
|---|---|---|
| CRITICAL | Active exploit risk (e.g., shell injection, RCE, hardcoded production key) | Block: do not use skill; fix immediately |
| HIGH | Likely data exposure or privilege escalation | Fix before production |
| MEDIUM | Defense-in-depth gap; not immediately exploitable | Fix in next iteration |
| LOW | Style/practice issue with minor security implications | Note in report |
| INFO | Observation, no risk | Informational only |
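One way to apply the table is to take the most severe finding in a report and look up its action. The sketch below assumes each finding is a dictionary with a `severity` key; that shape is hypothetical, since the schema of `security_report.json` is not documented here, and the function names are illustrative only.

```python
# Severity levels from least to most severe, as listed in the table.
SEVERITY_ORDER = ("INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL")

# Actions copied from the table's "Action" column.
ACTIONS = {
    "CRITICAL": "Block: do not use skill; fix immediately",
    "HIGH": "Fix before production",
    "MEDIUM": "Fix in next iteration",
    "LOW": "Note in report",
    "INFO": "Informational only",
}


def worst_severity(findings):
    """Return the highest severity among findings, or None if there are none.

    Assumes each finding is a dict with a "severity" key (hypothetical schema).
    """
    levels = [f["severity"].upper() for f in findings]
    if not levels:
        return None
    return max(levels, key=SEVERITY_ORDER.index)


def recommended_action(findings):
    """Map the worst severity found to the action from the table."""
    worst = worst_severity(findings)
    return ACTIONS[worst] if worst else "No findings"
```

Under these assumptions, a report containing one LOW and one HIGH finding resolves to "Fix before production".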
Code Quality Score (0–10)
| Range | Interpretation |
|---|---|
| 9–10 | Production-ready |
| 7–8 | Minor improvements needed |
| 5–6 | Significant gaps — refactoring advised |
| < 5 | Major issues — rework required |
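The bands above can be read as simple thresholds. The sketch below is one possible reading; the table lists integer ranges only, so treating a fractional score such as 8.5 as belonging to the lower band is an assumption, and `interpret_score` is a hypothetical helper name.

```python
def interpret_score(score):
    """Return the interpretation band for a code quality score (0 to 10).

    Assumption: fractional scores fall into the band whose lower
    bound they meet, so 8.5 is read as "Minor improvements needed".
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        return "Production-ready"
    if score >= 7:
        return "Minor improvements needed"
    if score >= 5:
        return "Significant gaps: refactoring advised"
    return "Major issues: rework required"
```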