trustworthy-experiments

Pass

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: LOW
Full Analysis
  • [COMMAND_EXECUTION] (SAFE): The Python scripts sample_size.py and srm_check.py perform statistical calculations using the Python standard library. They use argparse for safe command-line input handling and do not execute external system commands or shell processes.
  • [DATA_EXFILTRATION] (SAFE): No network-capable libraries (such as requests or urllib) are used, and no attempts to access sensitive file system paths (e.g., .ssh, .aws) were found.
  • [REMOTE_CODE_EXECUTION] (SAFE): The skill does not perform any dynamic code evaluation (eval, exec) or download remote scripts from the internet.
  • [PROMPT_INJECTION] (SAFE): The SKILL.md file contains behavioral guidelines to maintain an 'experimentation lead' persona, but it does not include malicious instructions to bypass AI safety filters or extract system prompts.
  • [EXTERNAL_DOWNLOADS] (SAFE): The skill contains no references to external packages or remote dependencies; all logic is self-contained within the provided scripts.
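The findings above (no network libraries, no shell or subprocess use, no eval/exec) are the kind of claim a reader can spot-check with a small static scan. The following is a minimal sketch using Python's standard ast module. It is not the method used by Gen Agent Trust Hub; the function name scan_source and the deny-lists are hypothetical and chosen only for illustration.

```python
import ast

# Assumed, illustrative deny-lists; not the auditor's actual rule set.
NETWORK_MODULES = {"requests", "urllib", "http", "socket", "ftplib"}
PROCESS_MODULES = {"subprocess"}
DYNAMIC_CALLS = {"eval", "exec"}


def scan_source(source: str) -> list[str]:
    """Return a list of findings for one Python source string."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Collect top-level module names from import statements.
        if isinstance(node, ast.Import):
            modules = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [(node.module or "").split(".")[0]]
        else:
            modules = []
        for name in modules:
            if name in NETWORK_MODULES:
                findings.append(f"network import: {name}")
            elif name in PROCESS_MODULES:
                findings.append(f"process import: {name}")

        if isinstance(node, ast.Call):
            func = node.func
            # Direct calls such as eval(...) or exec(...).
            if isinstance(func, ast.Name) and func.id in DYNAMIC_CALLS:
                findings.append(f"dynamic call: {func.id}")
            # Attribute calls of the form os.system(...).
            if (isinstance(func, ast.Attribute)
                    and isinstance(func.value, ast.Name)
                    and func.value.id == "os"
                    and func.attr == "system"):
                findings.append("shell call: os.system")
    return findings
```

To apply it to the audited scripts, one could pass the text of sample_size.py and srm_check.py to scan_source and expect an empty list if the findings hold. A scan like this only catches direct, literal usage; it will miss aliased or obfuscated calls, so it complements a manual review rather than replacing it.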
Audit Metadata
Risk Level: LOW
Analyzed: Feb 16, 2026, 02:15 AM