bootstrap-realtime-eval

Pass

Audited by Gen Agent Trust Hub on Mar 30, 2026

Risk Level: SAFE
Full Analysis
  • Input Sanitization: The skill uses a Python script that slugifies user-provided evaluation names, which mitigates risks associated with special characters in directory names.
  • Path Validation: The script includes explicit checks to verify that the target output directory is a subdirectory of the intended evaluation root, effectively preventing path traversal and protecting the rest of the file system.
  • Controlled File Operations: The tool uses standard Python libraries for file management and requires an explicit flag before overwriting existing data, so file operations stay intentional and predictable.
  • Local Resource Usage: The skill operates using local scripts and standard developer tools (such as pandas and pytest) without initiating external network requests or downloading untrusted content.
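
The first three checks above can be illustrated with a short Python sketch. This is not the audited script, whose source is not shown in this report; the function names (slugify, resolve_eval_dir, prepare_eval_dir) and the force parameter are hypothetical, chosen only to show how slugification, a containment check, and an overwrite flag might fit together.

```python
import re
import unicodedata
from pathlib import Path


def slugify(name: str) -> str:
    """Reduce a free-form name to lowercase ASCII letters, digits, and hyphens."""
    # Drop accents and any non-ASCII characters.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of other characters (slashes, dots, spaces) into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
    if not slug:
        raise ValueError("evaluation name has no usable characters")
    return slug


def resolve_eval_dir(eval_root: Path, name: str) -> Path:
    """Return the target directory, refusing anything outside eval_root."""
    root = eval_root.resolve()
    target = (root / slugify(name)).resolve()
    # Defense in depth: resolve() follows symlinks, so a link that points
    # outside the root is rejected here even though the slug looks harmless.
    # Path.is_relative_to requires Python 3.9 or newer.
    if target == root or not target.is_relative_to(root):
        raise ValueError(f"{target} is not a subdirectory of {root}")
    return target


def prepare_eval_dir(eval_root: Path, name: str, force: bool = False) -> Path:
    """Create the evaluation directory; reuse an existing one only with force=True."""
    target = resolve_eval_dir(eval_root, name)
    if target.exists() and not force:
        raise FileExistsError(f"{target} already exists; pass force=True to reuse it")
    target.mkdir(parents=True, exist_ok=True)
    return target
```

In this sketch a name such as "../../etc/passwd" slugifies to "etc-passwd", so it lands inside the evaluation root instead of escaping it, and a second call for the same name fails unless force=True is passed.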
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 30, 2026, 01:53 PM