code-review

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
  • [Indirect Prompt Injection] (HIGH): The skill processes untrusted external content and possesses side-effect-heavy capabilities.
  • Ingestion points: Reads repository source files (*.ts, *.tsx), pull request descriptions (/resolve_pr), and implementation plans.
  • Boundary markers: None identified. The instructions do not define delimiters to separate untrusted code from agent instructions.
  • Capability inventory: Executes shell commands including grep, ./scripts/log-skill.sh, npm run build, and npm test.
  • Sanitization: None. The skill directly executes build scripts from the environment being reviewed.
  • [Remote Code Execution] (HIGH): The 'Performance Review' and 'Pre-merge' workflows trigger 'npm run build' and 'npm test'.
  • Evidence: workflows/performance-pass.md and checklists/pre-merge.md.
  • Risk: If the code under review contains a malicious package.json with lifecycle hooks (e.g., prebuild or test), the agent will execute attacker-controlled code locally.
  • [Command Execution] (MEDIUM): The skill executes a local instrumentation script without verifying its contents.
  • Evidence: SKILL.md contains ./scripts/log-skill.sh "code-review" "manual" "$$".
  • Risk: While likely intended for telemetry, this establishes a pattern of executing local scripts that could be swapped or modified by other malicious skills or PRs.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 09:17 AM