swe-bench

Pass

Audited by Gen Agent Trust Hub on Feb 27, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [SAFE]: All components of the skill, including the instructions and the reproduction script template, are consistent with its stated purpose of solving coding issues and do not exhibit malicious intent.
  • [COMMAND_EXECUTION]: The skill requires the execution of shell commands to run reproduction scripts and verification tests (e.g., python reproduce_issue.py, pytest), which is expected behavior for a developer-centric agent skill.
  • [PROMPT_INJECTION]: The skill presents an indirect prompt injection surface (Category 8) due to its processing of external data.
  • Ingestion points: Untrusted GitHub issue descriptions are read and processed during Phase 1 in SKILL.md.
  • Boundary markers: The instructions do not define specific delimiters or guardrails for isolating external issue content.
  • Capability inventory: The skill has access to shell command execution and Python script execution.
  • Sanitization: No input validation or sanitization of the issue content is specified in the workflow.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 27, 2026, 08:38 AM