performance-testing-review-ai-review

Pass

Audited by Gen Agent Trust Hub on Feb 27, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill processes untrusted input from pull requests and interpolates it directly into LLM prompts, creating an indirect prompt injection surface.
  • Ingestion points: The example Python orchestrator script (ai_review.py) ingests code diffs and PR descriptions via the reviewer.get_pr_diff() function.
  • Boundary markers: The provided prompt templates (e.g., review_prompt and security_analysis_prompt) do not use specific delimiters or XML tags to isolate the untrusted code_diff from the core instructions.
  • Capability inventory: The skill is designed to execute system commands via subprocess and perform network operations via the GitHub API and Anthropic SDK.
  • Sanitization: The provided code samples do not demonstrate any sanitization, filtering, or length-limiting of the ingested diff content before it is processed by the model.
  • [COMMAND_EXECUTION]: The orchestration scripts rely on executing external CLI tools to perform static analysis and secret detection.
  • Evidence: The Python script uses subprocess.run to execute sonar-scanner and subprocess.check_output to run semgrep. The GitHub Action workflow executes trufflehog and sonar-scanner as shell commands.
  • Context: These executions are necessary for the skill's primary function but rely on the security of the underlying environment and the presence of verified binaries in the PATH.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 27, 2026, 09:00 AM