swe-bench
Pass
Audited by Gen Agent Trust Hub on Feb 27, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [SAFE]: All components of the skill, including the instructions and the reproduction script template, are consistent with its stated purpose of solving coding issues and do not exhibit malicious intent.
- [COMMAND_EXECUTION]: The skill requires the execution of shell commands to run reproduction scripts and verification tests (e.g.,
python reproduce_issue.py,pytest), which is expected behavior for a developer-centric agent skill. - [PROMPT_INJECTION]: The skill presents an indirect prompt injection surface (Category 8) due to its processing of external data.
- Ingestion points: Untrusted GitHub issue descriptions are read and processed during Phase 1 in
SKILL.md. - Boundary markers: The instructions do not define specific delimiters or guardrails for isolating external issue content.
- Capability inventory: The skill has access to shell command execution and Python script execution.
- Sanitization: No input validation or sanitization of the issue content is specified in the workflow.
Audit Metadata