skill-testing-framework
Warn
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION)
Full Analysis
- COMMAND_EXECUTION (MEDIUM): The skill is architected to execute arbitrary local scripts defined in test configuration files. While the primary execution script (`run_tests.py`) is missing from the provided files, the documentation (`SKILL.md`) and examples (`assets/test_template.json`, `references/test_patterns.md`) explicitly detail fields for `script` and `args` to be run by the framework. If an attacker can influence a test suite's content, they can achieve arbitrary command execution on the host.
- INDIRECT_PROMPT_INJECTION (MEDIUM): The skill possesses a Category 8 attack surface as it is designed to ingest and act upon external test definitions and baseline output files.
  - Ingestion points: Files processed by `generate_test_template.py`, `validate_test_results.py`, and the referenced `run_tests.py`.
  - Boundary markers: None identified in the provided templates or scripts to distinguish between test data and malicious instructions.
  - Capability inventory: File system read/write (`validate_test_results.py`), directory creation, and script execution (referenced in documentation).
  - Sanitization: None. The scripts perform direct file operations and pattern matching using values directly from the input files.
- DATA_EXPOSURE (LOW): The `validate_test_results.py` script includes a `--create-baseline` feature that uses `shutil.copy2` to duplicate files. If mismanaged, this could be used to copy sensitive files into a baseline directory for later exfiltration or unauthorized access.
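The COMMAND_EXECUTION and DATA_EXPOSURE findings both come down to file paths taken directly from input files. The sketch below shows one possible way to narrow that surface. It is not the skill's code: `run_tests.py` was not available for review, and the helper names (`resolve_inside`, `run_test_script`, `create_baseline`) and the directory layout are hypothetical. Only the `script` and `args` field names and the use of `shutil.copy2` come from the audited material. It assumes Python 3.9+ for `Path.is_relative_to`.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def resolve_inside(base_dir: Path, candidate: str) -> Path:
    """Resolve `candidate` against `base_dir`; reject anything that escapes it.

    resolve() follows symlinks and collapses "..", so absolute paths,
    parent-directory traversal, and outward-pointing symlinks are all refused.
    """
    base = base_dir.resolve()
    target = (base / candidate).resolve()
    if not target.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"path escapes {base}: {candidate!r}")
    return target


def run_test_script(suite_dir: Path, test: dict, timeout: float = 30.0):
    """Run one test entry (hypothetical shape: {"script": ..., "args": [...]})."""
    script = resolve_inside(suite_dir, test["script"])
    if script.suffix != ".py" or not script.is_file():
        raise ValueError(f"not a Python script inside the suite: {script}")

    args = test.get("args", [])
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        raise ValueError("args must be a list of strings")

    # An argument list with no shell means args are never shell-interpreted.
    return subprocess.run(
        [sys.executable, str(script), *args],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )


def create_baseline(output_dir: Path, baseline_dir: Path, name: str) -> Path:
    """Copy one output file into the baseline directory, confining both ends."""
    src = resolve_inside(output_dir, name)
    dst = resolve_inside(baseline_dir, name)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return dst
```

Confining paths this way limits which files can be executed or copied, but it does not make an untrusted test suite safe: a script that lives inside the suite directory still runs with the caller's privileges, so isolation (for example a container or a low-privilege user) would still be needed.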
Audit Metadata