NYC
skills/ovachiever/droid-tings/biomni/Gen Agent Trust Hub

biomni

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSCREDENTIALS_UNSAFE
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill documentation explicitly states that it executes LLM-generated code with 'full system privileges'. This capability allows the agent to perform any operation the host user can, including modifying system files, installing software, or accessing sensitive data.
  • REMOTE_CODE_EXECUTION (HIGH): The core functionality involves 'dynamic analysis pipeline creation' where the agent autonomously writes and runs Python code. This is a direct RCE vector; if an attacker can influence the agent's input (Indirect Prompt Injection via research data), they can execute arbitrary code on the user's machine.
  • EXTERNAL_DOWNLOADS (MEDIUM): The framework automatically downloads approximately 11GB of biomedical databases on first use. This large-scale data ingestion from external sources increases the attack surface for data poisoning and supply chain attacks.
  • CREDENTIALS_UNSAFE (LOW): The setup instructions involve exporting 'ANTHROPIC_API_KEY' and other LLM provider keys. Given the agent's ability to execute code, these credentials are at risk of exfiltration if the agent is compromised.
  • PROMPT_INJECTION (LOW): The skill is highly vulnerable to Indirect Prompt Injection (Category 8).
  • Ingestion points: Processes external GWAS files, h5ad datasets, and PubMed literature indices.
  • Boundary markers: None mentioned in the skill definition to separate data from instructions.
  • Capability inventory: Full Python code execution and system access via agent.go().
  • Sanitization: Documentation suggests running in Docker/VMs as a mitigation, acknowledging the lack of internal sanitization.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 05:58 PM