sherlock-review

Pass

Audited by Gen Agent Trust Hub on Mar 18, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill mandates a 'Minimum Findings Enforcement' policy where every investigation must surface a minimum weight of observations (e.g., at least 3). By stating that an investigation finding nothing is a 'failed investigation', the skill pressures the agent to generate findings regardless of actual code quality, potentially resulting in hallucinated issues or biased reviews.
  • [COMMAND_EXECUTION]: The skill encourages the agent to 'execute locally' and run commands such as npm test and benchmarks on untrusted codebases to verify implementation claims. This capability can be exploited if a malicious pull request contains dangerous code within the test suite or benchmark configurations, leading to unauthorized command execution in the agent's environment.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it processes external, untrusted data (PR descriptions, commit messages, and source code) to perform deductive reasoning without defined safety boundaries.
  • Ingestion points: Source code changes, PR descriptions, and commit history are listed as primary evidence sources in SKILL.md.
  • Boundary markers: Absent. The skill does not provide delimiters or instructions to treat external data as untrusted or to ignore embedded instructions.
  • Capability inventory: The skill allows for git diff, npm test, benchmark execution, and log analysis across its agents (e.g., qe-code-reviewer).
  • Sanitization: Absent. There is no requirement or mechanism described to sanitize code or metadata before ingestion or execution.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 18, 2026, 08:19 AM