skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: SAFE
Risk categories analyzed: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes several local Python scripts (e.g., run_eval.py, run_loop.py, package_skill.py) to manage benchmark data and package files. It also invokes the claude CLI via subprocess calls to run triggering tests and generate improved skill descriptions (a subprocess sketch follows this list).
  • [REMOTE_CODE_EXECUTION]: To verify the behavior of newly created or modified skills, the skill spawns parallel subagents via the TaskCreate tool to execute test prompts in an isolated environment.
  • [EXTERNAL_DOWNLOADS]: The evaluation viewer loads the SheetJS library from a public CDN (cdn.sheetjs.com) so users can inspect spreadsheet outputs directly within the generated HTML report (see the CDN sketch below).
  • [DATA_EXFILTRATION]: The skill requires PII (full name and email address) from the user to populate the created-by metadata field in the generated skill's YAML frontmatter, which is then stored on the local file system (see the frontmatter sketch below).
  • [PROMPT_INJECTION]: The skill has an indirect prompt injection surface: it processes user-provided test prompts and feedback, which are then interpolated into prompts for subagents and LLM-based optimization cycles (see the interpolation sketch below).
  • Ingestion points: Reads evaluation prompts from evals.json and qualitative feedback from feedback.json (via a local web server).
  • Boundary markers: Uses XML-style tags (e.g., <skill_content>, <scores_summary>) to structure data provided to the model.
  • Capability inventory: Includes the ability to execute shell commands, write to the file system, and spawn subagent tasks.
  • Sanitization: Implements standard HTML escaping in the evaluation viewer UI and relies on model guardrails during the skill generation process (see the escaping sketch below).
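The COMMAND_EXECUTION finding centers on child-process calls to the claude CLI. A minimal sketch of what a triggering test might look like, assuming a one-shot -p prompt flag and a hypothetical test prompt; only the CLI name and the use of subprocess come from the findings:

```python
import subprocess

def run_triggering_test(test_prompt: str, timeout: int = 120) -> str:
    """Invoke the claude CLI non-interactively and return its stdout.

    The -p (one-shot prompt) flag is an assumption for illustration; only the
    claude CLI and subprocess usage are taken from the audit findings.
    """
    result = subprocess.run(
        ["claude", "-p", test_prompt],  # spawns a child process: the COMMAND_EXECUTION surface
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    result.check_returncode()  # fail loudly on non-zero exit codes
    return result.stdout

if __name__ == "__main__":
    # Hypothetical triggering test: does the skill activate for this request?
    print(run_triggering_test("Create a new skill that lints YAML frontmatter."))
```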
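For the EXTERNAL_DOWNLOADS finding, the relevant mechanism is a script tag in the generated report that points at cdn.sheetjs.com. A sketch of how the viewer's HTML shell might embed it; only the host is confirmed by the audit, and the file path is an assumption based on SheetJS's usual distribution layout:

```python
# Only the cdn.sheetjs.com host appears in the findings; the exact path below
# is an assumption for illustration.
SHEETJS_CDN_URL = "https://cdn.sheetjs.com/xlsx-latest/package/dist/xlsx.full.min.js"

VIEWER_SHELL = """<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Evaluation Viewer</title>
    <!-- EXTERNAL_DOWNLOADS: fetched from a public CDN when the report is opened -->
    <script src="{cdn_url}"></script>
  </head>
  <body>
    <div id="sheet-preview"></div>
  </body>
</html>
"""

def render_viewer_shell() -> str:
    """Return the HTML shell for the evaluation viewer (spreadsheet rendering omitted)."""
    return VIEWER_SHELL.format(cdn_url=SHEETJS_CDN_URL)
```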
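The DATA_EXFILTRATION finding is narrow: the PII stays on the local file system, written into the generated skill's frontmatter. A sketch of that write, with hypothetical field names and filename; only the created-by field and the name/email inputs come from the findings:

```python
from pathlib import Path

# The created-by field and the name/email inputs are from the findings; the
# surrounding frontmatter fields and the SKILL.md filename are assumptions.
FRONTMATTER = """---
name: {skill_name}
description: {description}
created-by: {full_name} <{email}>
---
"""

def write_skill_stub(skill_dir: Path, skill_name: str, description: str,
                     full_name: str, email: str) -> Path:
    """Write a skill stub whose frontmatter records the author's PII locally."""
    skill_md = skill_dir / "SKILL.md"
    skill_md.write_text(FRONTMATTER.format(
        skill_name=skill_name,
        description=description,
        full_name=full_name,  # PII persisted to disk, not transmitted elsewhere
        email=email,
    ))
    return skill_md
```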
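The PROMPT_INJECTION finding and the boundary-marker note describe the same pipeline: untrusted text from feedback.json is wrapped in XML-style tags and handed to the model. A minimal sketch, assuming a simple feedback schema and a <user_feedback> tag; the audit itself names only <skill_content> and <scores_summary>:

```python
import json
from pathlib import Path

def build_optimization_prompt(skill_md: Path, feedback_path: Path) -> str:
    """Assemble the prompt for an LLM-based optimization pass.

    feedback.json is user-controlled, so anything inside <user_feedback> is an
    injection surface; the boundary tags only mark where untrusted text starts
    and ends, they do not neutralize instructions embedded in it.
    """
    feedback = json.loads(feedback_path.read_text())          # untrusted input
    notes = "\n".join(item["comment"] for item in feedback)   # hypothetical schema

    return (
        "Improve this skill's description using the feedback below.\n"
        f"<skill_content>\n{skill_md.read_text()}\n</skill_content>\n"
        f"<user_feedback>\n{notes}\n</user_feedback>"
    )
```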
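Finally, the sanitization credited to the viewer UI amounts to escaping untrusted strings before they are written into the report. A sketch using the standard library; the row structure is assumed:

```python
import html

def render_feedback_row(author: str, comment: str) -> str:
    """Render one feedback row for the evaluation viewer.

    html.escape converts <, >, &, and quotes so markup in user comments is
    displayed as text rather than interpreted by the browser.
    """
    return (
        "<tr>"
        f"<td>{html.escape(author)}</td>"
        f"<td>{html.escape(comment)}</td>"
        "</tr>"
    )
```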
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Mar 28, 2026, 04:44 PM