skill-creator
Audit Result: Pass
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: SAFE
Risk Tags: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes several local Python scripts (e.g., run_eval.py, run_loop.py, package_skill.py) to manage benchmark data and package files. It also invokes the claude CLI via subprocess calls to run triggering tests and generate improved skill descriptions (a sketch follows this list).
- [REMOTE_CODE_EXECUTION]: To verify the behavior of newly created or modified skills, the tool spawns parallel subagents via the TaskCreate tool to execute test prompts in an isolated environment.
- [EXTERNAL_DOWNLOADS]: The evaluation viewer loads the SheetJS library from a public CDN (cdn.sheetjs.com) so users can inspect spreadsheet outputs directly within the generated HTML report.
- [DATA_EXFILTRATION]: The skill collects PII (the user's full name and email address) to populate the created-by metadata field in the generated skill's YAML frontmatter, which is then stored on the local file system (see the frontmatter sketch below).
- [PROMPT_INJECTION]: The skill has an indirect prompt-injection surface: user-provided test prompts and feedback are interpolated into prompts for subagents and LLM-based optimization cycles.
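A minimal sketch of the command-execution surface described in the COMMAND_EXECUTION finding, assuming Python's standard subprocess module. The script names and the claude CLI are taken from the audit; the function names and the -p flag are assumptions, not the skill's actual source.

```python
# Hedged sketch of the command-execution surface; names and flags
# below are assumptions except for the script names and `claude` CLI.
import subprocess

def run_trigger_test(test_prompt: str) -> str:
    # The skill shells out to the `claude` CLI; a user-supplied
    # test prompt flows directly into the child process arguments.
    result = subprocess.run(
        ["claude", "-p", test_prompt],  # `-p` (print mode) is an assumption
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

def run_local_eval() -> None:
    # Local helper scripts named in the audit are executed the same way.
    subprocess.run(["python", "run_eval.py"], check=True)
```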
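The DATA_EXFILTRATION finding concerns where the collected PII ends up. Here is a sketch of how the created-by field might be written into the generated skill's YAML frontmatter; the field name comes from the audit, while the writer function and the other frontmatter fields are hypothetical.

```python
# Hypothetical writer: only the `created-by` field name is attested
# by the audit; everything else here is illustrative.
from pathlib import Path

def write_skill_file(path: Path, full_name: str, email: str, body: str) -> None:
    frontmatter = (
        "---\n"
        "name: example-skill\n"                 # assumed field
        f"created-by: {full_name} <{email}>\n"  # PII stored on disk
        "---\n"
    )
    path.write_text(frontmatter + body, encoding="utf-8")
```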
- Ingestion points: Reads evaluation prompts from evals.json and qualitative feedback from feedback.json (via a local web server).
- Boundary markers: Uses XML-style tags (e.g., <skill_content>, <scores_summary>) to structure data provided to the model; the interpolation sketch after this list shows how these markers and the prompt-injection surface interact.
- Capability inventory: Includes the ability to execute shell commands, write to the file system, and spawn subagent tasks.
- Sanitization: Implements standard HTML escaping in the evaluation viewer UI and relies on model guardrails during the skill generation process (see the escaping sketch below).
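To show how the boundary markers and the PROMPT_INJECTION surface interact, here is an assumed reconstruction of an optimization-cycle prompt builder. Only the tag names and file names come from the audit; the function and variable names are illustrative. A feedback entry containing a literal closing tag could still break out of its delimited section, which is exactly the indirect injection risk flagged above.

```python
# Assumed reconstruction: tag and file names are from the audit,
# the rest is illustrative.
import json
from pathlib import Path

def build_optimizer_prompt(skill_path: Path, feedback_path: Path) -> str:
    skill_content = skill_path.read_text(encoding="utf-8")
    feedback_entries = json.loads(feedback_path.read_text(encoding="utf-8"))
    feedback_text = "\n".join(str(entry) for entry in feedback_entries)

    # XML-style boundary markers delimit untrusted data, but a feedback
    # entry containing a literal </feedback> tag can still escape its
    # section -- the injection surface the audit flags.
    return (
        f"<skill_content>\n{skill_content}\n</skill_content>\n"
        f"<feedback>\n{feedback_text}\n</feedback>\n"
        "Rewrite the skill description to address the feedback above."
    )
```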
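And a minimal sketch of the escaping step credited to the evaluation viewer, assuming Python's standard html module; the actual viewer code may differ.

```python
# Minimal sketch of the viewer's HTML escaping, assuming the standard
# library; function and variable names are assumptions.
import html

def render_output_cell(raw_model_output: str) -> str:
    # html.escape neutralizes <, >, &, and quotes so model output
    # cannot inject markup or script into the generated report.
    return f"<td>{html.escape(raw_model_output, quote=True)}</td>"
```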
Audit Metadata