The Agent Skills Directory

[COMMAND_EXECUTION]: The skill orchestrates complex workflows by executing the claude CLI and various bundled Python scripts (e.g., run_loop.py, aggregate_benchmark.py) using the subprocess module. These operations are transparently managed and are necessary for the skill's primary purpose of skill development.
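As a rough illustration of the pattern this finding describes, the sketch below runs a child process through the subprocess module with an argument list rather than a shell string. It is not the skill's actual code: the helper name run_script and the timeout value are assumptions made for this example, and a one-line Python command stands in for a bundled script such as run_loop.py.

```python
import subprocess
import sys


def run_script(args, timeout=60):
    """Run a command given as an argument list and return (returncode, stdout).

    Hypothetical helper for illustration only. Passing a list (and leaving
    shell=False, the default) means no shell parses the arguments.
    """
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout,      # assumed limit; raises TimeoutExpired if exceeded
        check=False,          # let the caller inspect the return code
    )
    return completed.returncode, completed.stdout


# Stand-in for invoking a bundled script with the current interpreter.
code, out = run_script([sys.executable, "-c", "print('ok')"])
```

Keeping the command as a list makes each invocation easy to log and review, which is one way such calls can remain visible to the user.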
[PROMPT_INJECTION]: The skill is designed to ingest and process user-provided test prompts and feedback strings. This creates an interface for indirect prompt injection (Category 8); however, this risk is inherent to the skill's function as a developer tool for testing AI behavior and is mitigated by the user's active participation in the evaluation loop.
[DATA_EXFILTRATION]: The generate_review.py script starts a local HTTP server on 127.0.0.1 to provide a browser-based interface for reviewing evaluation results. This server only accesses files within the user-specified workspace and is restricted to local access.
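The two properties noted in this finding, loopback-only binding and confinement to a workspace directory, can be sketched with the standard library as follows. This is a minimal sketch under stated assumptions, not the contents of generate_review.py: the names is_inside_workspace and make_local_server are hypothetical, and the real script may enforce these limits differently.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def is_inside_workspace(workspace, candidate):
    """Return True if candidate, resolved against workspace, stays inside it.

    Hypothetical helper: resolves symlinks and ".." segments before comparing.
    """
    root = Path(workspace).resolve()
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        return False
    return True


def make_local_server(workspace, port=0):
    """Create (but do not start) an HTTP server bound to 127.0.0.1 only.

    Hypothetical helper: files are served from the workspace directory,
    and port=0 asks the OS for any free port.
    """
    handler = partial(
        SimpleHTTPRequestHandler,
        directory=str(Path(workspace).resolve()),
    )
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling serve_forever() on the returned server would start handling requests. Because the socket is bound to the loopback address, other machines on the network cannot connect to it.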

skill-creator