skill-creator
Warn
Audited by Gen Agent Trust Hub on Mar 6, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
- [COMMAND_EXECUTION]: The script
scripts/run_eval.pyutilizes the Pythonsubprocessmodule to execute theclaudeCLI tool for evaluation purposes. Furthermore,eval-viewer/generate_review.pyinvokes thelsofsystem utility to identify and manage processes occupying network ports. - [REMOTE_CODE_EXECUTION]: The skill implements a workflow where it dynamically writes temporary instruction files to the
.claude/commands/directory and subsequently triggers their execution via theclaudeCLI. This enables the runtime generation and application of agent behaviors based on evaluation datasets. - [EXTERNAL_DOWNLOADS]: The skill facilitates communication with Anthropic's official API using the
anthropicPython client library inscripts/improve_description.pyandscripts/run_loop.pyto perform automated description optimization. Additionally, the visualization component ineval-viewer/viewer.htmlloads the SheetJS library from a public CDN (cdn.sheetjs.com) for spreadsheet processing. - [PROMPT_INJECTION]: The skill possesses an indirect prompt injection surface as it ingests untrusted data from evaluation sets and user feedback files. This data is interpolated into prompts used for description optimization in
scripts/improve_description.pywithout explicit sanitization, though it employs XML-style boundary markers like<current_description>to mitigate accidental misinterpretation. Evidence Chain: 1) Ingestion points:eval_set.jsonandfeedback.json. 2) Boundary markers: XML tags used in optimizer prompt. 3) Capability inventory: Subprocess command execution, file system modification, and network operations. 4) Sanitization: No explicit validation or filtering observed for query content.
Audit Metadata