multi-model-research

Verdict: Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM
Tags: CREDENTIALS_UNSAFE, COMMAND_EXECUTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [Unverifiable Dependency] (MEDIUM): The script scripts/research.py attempts to import MultiModelClient from llm_client.py, but this file is missing from the skill payload. This 'ghost dependency' is responsible for managing all API communications and credential handling, making it impossible to verify if the skill exfiltrates keys or performs unauthorized network requests.
  • [Credential Risk] (MEDIUM): The skill requires four distinct API keys (OpenAI, Gemini, Perplexity, XAI) and instructs users to store them in a ~/.env file. Given that the code handling these secrets is not provided for review, this represents a significant trust requirement and a potential vector for credential theft if the missing logic is malicious.
  • [Indirect Prompt Injection] (LOW): The skill aggregates content from web searches and external LLMs to create a synthesized report. The prompts/chairman_synthesis.md prompt does not use delimiters or boundary markers for the {responses_with_reviews} variable, allowing untrusted data to potentially hijack the 'Chairman' model's instructions.
    ◦ Ingestion points: scripts/research.py (ingests data from Perplexity and four external LLM responses).
    ◦ Boundary markers: Absent; uses simple curly-brace interpolation without XML tags or 'ignore instructions' guards.
    ◦ Capability inventory: File system write access (via uv run output and Obsidian tool); broad network access via API requests.
    ◦ Sanitization: None; neither the script nor the prompts escape or filter content received from external models before processing.
  • [Command Execution] (LOW): The skill uses uv run to execute local Python scripts. While standard for this environment, the implementation passes arbitrary user-supplied query strings as arguments, which could lead to command injection if the invoking shell or script mishandles quoting.
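The boundary-marker gap flagged above can be closed at interpolation time. The sketch below is a hypothetical remediation, not code from the skill: the {responses_with_reviews} variable name comes from the audited prompt, but the tag name and helper functions are assumptions.

```python
def wrap_untrusted(content: str, tag: str = "external_response") -> str:
    """Wrap untrusted model output in XML-style boundary markers.

    Escapes any embedded closing tag so injected text cannot
    terminate its own boundary early and hijack the instructions.
    """
    escaped = content.replace(f"</{tag}>", f"&lt;/{tag}&gt;")
    return f"<{tag}>\n{escaped}\n</{tag}>"


def build_synthesis_prompt(template: str, responses: list[str]) -> str:
    # Fence each external response in its own boundary block before
    # interpolating into the chairman template, so the synthesis model
    # can be told to treat everything inside the tags as data.
    bounded = "\n\n".join(wrap_untrusted(r) for r in responses)
    return template.format(responses_with_reviews=bounded)
```

Delimiting alone does not stop injection, but paired with a system-prompt instruction to treat tagged content as data it raises the bar considerably.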
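On the credential finding: because the code that reads ~/.env is missing from the payload, users could load and validate the keys themselves before granting the skill access. A minimal stdlib-only sketch, assuming the four key names below (the actual variable names used by the skill are unverifiable):

```python
from pathlib import Path

# Key names are assumptions; the skill's real env-var names cannot
# be confirmed because llm_client.py is absent from the payload.
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "PERPLEXITY_API_KEY",
    "XAI_API_KEY",
)


def load_env(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file, skipping comments."""
    found: dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        found[key.strip()] = value.strip().strip('"').strip("'")
    return found


def missing_keys(env: dict[str, str]) -> list[str]:
    # Report absent keys up front rather than failing deep inside
    # an unreviewable API call.
    return [k for k in REQUIRED_KEYS if not env.get(k)]
```

Loading keys locally at least makes the trust boundary explicit: the user sees exactly which secrets leave the file.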
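The quoting risk in the command-execution finding disappears when the query is passed as a discrete argv element rather than through a shell string. A hedged sketch; the --query flag and script invocation shape are assumptions, since the skill's actual launch command was not published:

```python
import subprocess


def build_command(query: str) -> list[str]:
    # Each argument is its own list element, so quotes, semicolons,
    # or backticks in the query reach research.py verbatim instead
    # of being interpreted by a shell.
    return ["uv", "run", "scripts/research.py", "--query", query]


def run_research(query: str) -> subprocess.CompletedProcess:
    # shell=False is the subprocess.run default; no shell parsing occurs.
    return subprocess.run(
        build_command(query), check=True, capture_output=True, text=True
    )
```

Building the argv list this way makes the LOW severity rating defensible; interpolating the query into a single shell string would not be.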
Audit Metadata
  • Risk Level: MEDIUM
  • Analyzed: Feb 17, 2026, 06:36 PM