bootstrap-realtime-eval
Pass
Audited by Gen Agent Trust Hub on Mar 30, 2026
Risk Level: SAFE
Full Analysis
- Input Sanitization: The skill uses a Python script that slugifies user-provided evaluation names before using them as directory names, which mitigates risks from special characters such as path separators.
- Path Validation: The script explicitly checks that the target output directory is a subdirectory of the intended evaluation root, which guards against path traversal into the rest of the file system.
- Controlled File Operations: The tool uses standard Python libraries for file management and requires an explicit flag before it overwrites existing data, so that file operations are intentional and predictable.
- Local Resource Usage: The skill operates using local scripts and standard developer tools (such as pandas and pytest) without initiating external network requests or downloading untrusted content.
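
The first three findings above (slugified names, a containment check against the evaluation root, and an explicit overwrite flag) can be illustrated with a minimal sketch. This is not the audited script: the function names `slugify`, `resolve_output_dir`, and `prepare_output_dir`, the `overwrite` parameter, and the exact slug rules are all assumptions made for illustration.

```python
import re
from pathlib import Path


def slugify(name: str) -> str:
    """Hypothetical slug rule: lowercase, runs of non-alphanumerics become one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("evaluation name contains no usable characters")
    return slug


def resolve_output_dir(eval_root: Path, name: str) -> Path:
    """Build the target path and confirm it sits strictly below eval_root."""
    root = eval_root.resolve()
    target = (root / slugify(name)).resolve()
    try:
        # relative_to raises ValueError when target is not under root.
        target.relative_to(root)
    except ValueError:
        raise ValueError(f"{target} is outside the evaluation root {root}") from None
    if target == root:
        raise ValueError("target must be a subdirectory, not the root itself")
    return target


def prepare_output_dir(eval_root: Path, name: str, overwrite: bool = False) -> Path:
    """Create the output directory; refuse to reuse an existing one without overwrite."""
    target = resolve_output_dir(eval_root, name)
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists; pass overwrite=True to reuse it")
    target.mkdir(parents=True, exist_ok=True)
    return target
```

In this sketch the slug step already strips path separators and dots, so the containment check acts as a second layer of defense (for example, against a symlink inside the root) rather than the only safeguard.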
Audit Metadata