skill-evals-optimize

Pass

Audited by Gen Agent Trust Hub on Mar 13, 2026

Risk Level: SAFE | COMMAND_EXECUTION | PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill is granted permission to execute arbitrary Python commands via Bash(python:*) and any shell script within the scripts/ directory. This provides a broad attack surface for command execution if the agent is manipulated.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection through external data sources.
      • Ingestion points: The agent reads the evaluation dataset opencode_skill_loading_eval_dataset.jsonl and the steering guide evals/skill-loading/docs/skill-optimization-steering.md during its workflow.
      • Boundary markers: No explicit delimiters or boundary markers are defined to help the agent distinguish data from instructions within these files.
      • Capability inventory: The agent can execute shell scripts (Bash), run Python code, read files, and search directories (Glob, Grep).
      • Sanitization: No sanitization or validation steps are mentioned before the agent processes or acts on the content of the dataset or steering guide.
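The boundary-marker and sanitization gaps noted above could be narrowed along the following lines. This is a minimal sketch, not part of the audited skill: the marker strings and the helper names `load_records` and `wrap_untrusted` are hypothetical, and it assumes the dataset is plain JSONL with one JSON object per line.

```python
import json

# Hypothetical delimiters; the audited skill defines none.
BEGIN = "<<<UNTRUSTED_DATA>>>"
END = "<<<END_UNTRUSTED_DATA>>>"


def load_records(jsonl_text):
    """Parse JSONL text, keeping only lines that decode to JSON objects."""
    records = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of passing them on
        if isinstance(value, dict):
            records.append(value)
    return records


def wrap_untrusted(text):
    """Wrap text in boundary markers so it can be labeled as data.

    Marker strings already present in the text are replaced with a
    placeholder so the content cannot close the block early.
    """
    cleaned = text.replace(BEGIN, "[removed]").replace(END, "[removed]")
    return f"{BEGIN}\n{cleaned}\n{END}"
```

Wrapping only marks where untrusted content starts and ends; it does not by itself stop an agent from following instructions inside the block, so it would need to be paired with an explicit instruction to treat delimited content as data.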
Audit Metadata
  Risk Level: SAFE
  Analyzed: Mar 13, 2026, 07:26 AM