hypothesis-testing

Pass

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: LOW
Full Analysis
  • Prompt Injection (SAFE): No patterns were detected that attempt to bypass safety guidelines, override system instructions, or extract system prompts. The instructions are focused on scientific methodology.
  • Data Exposure & Exfiltration (SAFE): The skill does not reference sensitive file paths (e.g., .ssh, .aws) and contains no hardcoded credentials or exfiltration logic.
  • Obfuscation (SAFE): All text is human-readable. No Base64, zero-width characters, or homoglyphs were found.
  • Unverifiable Dependencies & Remote Code Execution (SAFE): No external package installations or remote script executions (e.g., curl|bash) are present.
  • Privilege Escalation (SAFE): No commands requesting administrative or root access (e.g., sudo) are included.
  • Persistence Mechanisms (SAFE): The skill does not attempt to modify shell profiles, cron jobs, or startup services.
  • Indirect Prompt Injection (LOW):
  • Ingestion points: The skill uses tools like WebSearch and WebFetch to ingest data from external sources.
  • Boundary markers: None identified. Content from the web is processed directly by the agent.
  • Capability inventory: The skill only has access to read-only tools (WebSearch, WebFetch, Read, Grep, Glob). It cannot write files or execute code.
  • Sanitization: None specified.
  • Assessment: The vulnerability surface exists but the risk is negligible because the skill lacks any capabilities to perform side effects or persist malicious instructions beyond the immediate reasoning task.
  • Dynamic Execution (SAFE): No runtime compilation, library injection, or unsafe deserialization patterns were detected.
Audit Metadata
Risk Level
LOW
Analyzed
Feb 16, 2026, 04:43 AM