hypothesis-generation

Pass

Audited by Gen Agent Trust Hub on Apr 11, 2026

Risk Level: SAFE (flagged categories: COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses the Bash tool to execute local scripts and LaTeX compilation workflows (e.g., xelatex, bibtex). A mandatory instruction in SKILL.md requires running python scripts/generate_schematic.py, but this script is missing from the provided files, which could lead to execution errors or unexpected agent behavior if the agent attempts to create or locate it.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8c) because it ingests untrusted data from external scientific sources via WebFetch and WebSearch and processes it to generate hypotheses and reports.
      • Ingestion points: Scientific papers, reviews, and search results fetched from PubMed and the open web (SKILL.md).
      • Boundary markers: There are no instructions or delimiters provided to ensure the agent ignores or sanitizes potential instructions embedded within the external research data.
      • Capability inventory: The skill possesses Bash execution privileges and file system access (Write, Edit), creating a path for malicious instructions in data to potentially trigger command execution.
      • Sanitization: The skill lacks any defined validation or sanitization mechanisms for content retrieved from external literature.
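The missing boundary markers and sanitization noted above could be approximated along the following lines. This is a minimal sketch and not part of the audited skill: the marker strings, the pattern list, and the function names `wrap_untrusted` and `flag_suspicious` are all hypothetical, and a short pattern list like this one would catch only the most obvious injection attempts.

```python
import re

# Hypothetical marker strings; the audited skill defines no such delimiters.
BEGIN = "<<<UNTRUSTED_EXTERNAL_CONTENT>>>"
END = "<<<END_UNTRUSTED_EXTERNAL_CONTENT>>>"

# Illustrative patterns only; a real filter would need a much broader list.
SUSPICIOUS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"\brun (the following|this) (command|script)\b", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
]


def flag_suspicious(text: str) -> list[str]:
    """Return the first match of each instruction-like pattern found in text."""
    hits = []
    for pattern in SUSPICIOUS:
        match = pattern.search(text)
        if match:
            hits.append(match.group(0))
    return hits


def wrap_untrusted(text: str) -> str:
    """Strip marker look-alikes from text, then wrap it in boundary markers."""
    cleaned = text
    # Repeat until stable so that removing one marker cannot assemble another.
    while BEGIN in cleaned or END in cleaned:
        cleaned = cleaned.replace(BEGIN, "").replace(END, "")
    return f"{BEGIN}\n{cleaned}\n{END}"
```

Under this sketch, fetched text would pass through `flag_suspicious` for review and through `wrap_untrusted` before reaching the agent, alongside an instruction to treat everything between the markers as data only. Delimiters reduce the risk but do not eliminate it, so limiting Bash and file-write privileges while processing external content remains the stronger control.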
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 11, 2026, 10:28 PM