receiving-code-review

Pass

Audited by Gen Agent Trust Hub on Feb 18, 2026

Risk Level: SAFE
Full Analysis
  • Prompt Injection (SAFE): No instructions found that attempt to bypass safety filters or override system constraints. The behavioral constraints (e.g., avoiding performative agreement) are persona-specific and do not target agent safety protocols.
  • Data Exposure & Exfiltration (SAFE): The skill uses tools such as grep and the GitHub API (gh api) for legitimate development tasks (verifying code usage and replying to PR comments). It targets no sensitive file paths or unauthorized network endpoints.
  • Indirect Prompt Injection (LOW): Flagged because the skill is designed to ingest and process untrusted data (external code review feedback).
      • Ingestion points: External reviewer feedback and GitHub PR comments.
      • Boundary markers: None explicitly defined for the ingested input.
      • Capability inventory: File system searching (grep), GitHub API interaction (gh api), and code implementation (file writing).
      • Sanitization: The skill's explicit verification and evaluation step ("Verify before implementing") acts as a manual, logic-based sanitization layer against malicious or incorrect external instructions.
  • Unverifiable Dependencies (SAFE): No external package installations or remote script executions are present.
  • Obfuscation (SAFE): No encoded strings, zero-width characters, or hidden text detected.
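
The analysis notes that no boundary markers are defined for ingested feedback and that no zero-width characters were detected. As a rough illustration of what such safeguards could look like, here is a minimal Python sketch. The marker format, the function names (`find_zero_width`, `wrap_untrusted`), and the set of code points checked are all assumptions made for this example; they are not part of the audited skill or of the auditor's tooling.

```python
import secrets
from typing import List, Optional

# Hypothetical set of invisible code points to flag; illustrative only,
# not the list used by the audit.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def find_zero_width(text: str) -> List[int]:
    """Return the indices of zero-width characters found in text."""
    return [i for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def wrap_untrusted(feedback: str, token: Optional[str] = None) -> str:
    """Enclose reviewer feedback in boundary markers.

    The markers carry a random token so that text inside the feedback
    cannot easily forge the closing marker. The marker syntax is an
    assumption for this sketch.
    """
    token = token or secrets.token_hex(8)
    return (
        f"<<UNTRUSTED_REVIEW {token}>>\n"
        f"{feedback}\n"
        f"<<END_UNTRUSTED_REVIEW {token}>>"
    )


if __name__ == "__main__":
    comment = "Please rename this variable.\u200b"
    print(find_zero_width(comment))
    print(wrap_untrusted(comment))
```

Markers of this kind only delimit the untrusted text; evaluating what the feedback asks for would still fall to the "Verify before implementing" step described above.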
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 18, 2026, 01:23 PM