hypothesis-generation
Pass
Audited by Gen Agent Trust Hub on Apr 11, 2026
Risk Level: SAFE (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill utilizes the `Bash` tool to execute local scripts and LaTeX compilation workflows (e.g., `xelatex`, `bibtex`). A mandatory instruction in `SKILL.md` requires running `python scripts/generate_schematic.py`, but this script is missing from the provided files, which could lead to execution errors or unexpected agent behavior if the agent attempts to create or locate it.
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8c) because it ingests untrusted data from external scientific sources via `WebFetch` and `WebSearch` and processes it to generate hypotheses and reports.
  - Ingestion points: Scientific papers, reviews, and search results fetched from PubMed and the open web (`SKILL.md`).
  - Boundary markers: There are no instructions or delimiters provided to ensure the agent ignores or sanitizes potential instructions embedded within the external research data.
  - Capability inventory: The skill possesses `Bash` execution privileges and file system access (`Write`, `Edit`), creating a path for malicious instructions in data to potentially trigger command execution.
  - Sanitization: The skill lacks any defined validation or sanitization mechanisms for content retrieved from external literature.
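
The gaps noted above (no boundary markers, no sanitization, a missing script) could be narrowed with a small pre-processing layer. The following is a minimal sketch under stated assumptions, not part of the audited skill: the delimiter strings, the pattern list, and the function names `wrap_untrusted`, `flag_suspicious`, and `script_exists` are all hypothetical. Only the path `scripts/generate_schematic.py` comes from the findings.

```python
import re
from pathlib import Path

# Hypothetical delimiters; the audited skill defines none.
BEGIN = "<<<UNTRUSTED_EXTERNAL_CONTENT>>>"
END = "<<<END_UNTRUSTED_EXTERNAL_CONTENT>>>"

# Illustrative patterns only; a real filter would need a broader, tested list.
SUSPICIOUS = re.compile(
    r"ignore (?:all )?(?:previous|prior) instructions|run the following command",
    re.IGNORECASE,
)


def wrap_untrusted(text: str) -> str:
    """Wrap fetched text in delimiters, removing delimiter look-alikes inside it."""
    cleaned = text
    # Repeat until stable so that removal cannot splice a new delimiter together.
    while BEGIN in cleaned or END in cleaned:
        cleaned = cleaned.replace(BEGIN, "").replace(END, "")
    return f"{BEGIN}\n{cleaned}\n{END}"


def flag_suspicious(text: str) -> list[str]:
    """Return instruction-like phrases found in fetched text, for review."""
    return [match.group(0) for match in SUSPICIOUS.finditer(text)]


def script_exists(skill_root: str,
                  relative_path: str = "scripts/generate_schematic.py") -> bool:
    """Check that a required script is present before attempting to run it."""
    return (Path(skill_root) / relative_path).is_file()
```

Phrase matching of this kind is easy to evade, so it is best treated as a review signal and not as a defense on its own. The delimiters only help if the skill's instructions also tell the agent to treat everything between them as data.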
Audit Metadata