skill-creator
Pass
Audited by Gen Agent Trust Hub on Mar 26, 2026
Risk Level: SAFE
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The script `scripts/run_eval.py` executes the `claude` command-line interface via `subprocess.Popen` to test skill triggering behavior.
- [COMMAND_EXECUTION]: The script `eval-viewer/generate_review.py` executes `lsof` and `kill` via `subprocess.run` to manage the local network port used for the evaluation viewer.
- [EXTERNAL_DOWNLOADS]: The skill utilizes the official `anthropic` Python package to connect to the Anthropic API for the purpose of optimizing skill descriptions using large language models.
- [EXTERNAL_DOWNLOADS]: The evaluation viewer template (`eval-viewer/viewer.html`) loads the SheetJS library from a well-known public CDN (cdn.sheetjs.com) to enable the rendering of spreadsheet files in the browser.
- [PROMPT_INJECTION]: The skill processes user-defined test cases from `evals.json` and `eval_set.json`. These prompts are executed through the agent context, creating a surface for indirect prompt injection if the test data is untrusted. This behavior is associated with the intended primary skill purpose as a development and testing tool.
  - Ingestion points: `evals/evals.json` and `eval_set.json` files processed in `scripts/run_eval.py`.
  - Boundary markers: Test queries are passed as arguments to the `claude` CLI.
  - Capability inventory: The skill can execute shell commands and modify files to facilitate the benchmarking process.
  - Sanitization: The scripts focus on structural parsing and formatting; the content of the test prompts themselves is passed directly to the model for execution.
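
The ingestion and boundary behavior described above can be illustrated with a minimal sketch. This is not the code in `scripts/run_eval.py`; the function names `load_queries` and `run_query`, the `"query"` field name, and the flat JSON-list layout are all assumptions made for illustration. It shows structural validation of test cases followed by passing each query to a child process as a single argument, without a shell.

```python
import json
import subprocess
import sys
from pathlib import Path


def load_queries(path):
    """Return the query strings from an eval file (hypothetical layout)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the file is a JSON list of objects with a "query" string.
    queries = []
    for item in data:
        query = item.get("query") if isinstance(item, dict) else None
        if not isinstance(query, str) or not query.strip():
            raise ValueError(f"malformed test case: {item!r}")
        queries.append(query)
    return queries


def run_query(base_cmd, query, timeout=60):
    """Run base_cmd with the query appended as one argv element."""
    # No shell=True: the query is never parsed by a shell, so quotes or
    # semicolons in it cannot start extra commands. The text can still
    # influence whatever model receives it.
    proc = subprocess.Popen(
        [*base_cmd, query],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    # Stand-in command that echoes its argument; a real run would pass the
    # CLI invocation here instead.
    echo_cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
    print(run_query(echo_cmd, "hello; echo pwned"))
```

Passing the query as a list element guards against shell injection only. As the sanitization note says, the prompt content itself reaches the model unchanged, so untrusted eval files remain an indirect prompt injection surface.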
Audit Metadata