The Agent Skills Directory

[COMMAND_EXECUTION]: The skill executes local Python scripts (run_benchmarks.py), GitHub CLI commands (gh pr close, gh issue close), and Git commands (git worktree remove) to manage benchmark workflows and cleanup.- [PROMPT_INJECTION]: The skill possesses an indirect prompt injection surface by ingesting and processing untrusted data.
Ingestion points: Benchmark task definitions in BENCHMARK_TASKS.md and execution results in result.json.
Boundary markers: No delimiters or instructions to ignore embedded commands are present in the prompt templates.
Capability inventory: The agent can execute shell commands, interact with GitHub repositories, and spawn subagents.
Sanitization: There is no evidence of validation or sanitization of ingested content before it is passed to the reviewer subagent.- [DATA_EXFILTRATION]: The skill accesses a hidden directory within the user's home folder (~/.amplihack/.claude/runtime/benchmarks/suite_v3/) to retrieve benchmark results. While this appears to be the tool's intended runtime path, accessing locations outside the project workspace increases potential data exposure risks.

model-evaluation-benchmark