multi-model-reviewer
Pass
Audited by Gen Agent Trust Hub on Mar 29, 2026
Risk Level: SAFECOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The script
scripts/multi_model_review.pyexecutes several local command-line interface (CLI) tools, includinggemini,codex, andclaude. It utilizesasyncio.create_subprocess_execto invoke these binaries, passing large prompts containing user-sourced code and specifications as command-line arguments. - [DATA_EXFILTRATION]: The skill reads source code and specification files from the local filesystem (via
_collect_specs,_collect_programs, and_collect_tests) and transmits this data to external AI service providers, such asapi.openai.com. While this is required for its function as a multi-model reviewer, it involves the outbound transmission of potentially sensitive project data to third-party servers. - [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection. It ingests untrusted data from local project files and interpolates it directly into system prompts for multiple LLMs. This creates a surface where a malicious actor could embed instructions within code comments or documentation to manipulate the review output.
- Ingestion points: Files read from
--spec-dir,--program-dir, and--test-dirviaSpecProgramTestCollectorinscripts/multi_model_review.py. - Boundary markers: Uses markdown code blocks (```yaml) and headers (## SPECIFICATION) to delimit content within the prompt.
- Capability inventory: Performs network requests via
httpxto OpenAI and local Ollama endpoints, and executes local binaries viasubprocess. - Sanitization: None. The content is read directly from the filesystem and truncated for length without escaping or filtering for embedded instructions.
Audit Metadata