The Agent Skills Directory

[SAFE]: The skill's handling of agent outputs and user prompts was analyzed for prompt injection vulnerabilities. The skill does ingest external data for evaluation purposes (specifically in the instructions for the grader and comparator agents), but it manages results with structured JSON schemas and uses safe DOM APIs (such as textContent) in its web viewer. Processing evaluation data locally in this way is standard practice for benchmarking utilities and does not present an exploitable vulnerability.
[SAFE]: The calls to subprocess.run (to invoke lsof) and to os.kill in eval-viewer/generate_review.py were evaluated. Both are used only to manage the network port for the tool's own visualization server, ensuring that a previous instance is cleared. The inputs are validated and the behavior is appropriate for a local developer tool, so it poses no risk of unauthorized command execution or privilege escalation.
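
For readers unfamiliar with the pattern described in the second finding, the following is a minimal sketch of how a local tool might clear its own port with lsof and os.kill. It is not the code from eval-viewer/generate_review.py; the function names (validate_port, parse_pids, pids_using_port, free_port) are hypothetical, and the sketch assumes a Unix-like system with lsof on the PATH.

```python
import os
import signal
import subprocess


def validate_port(port: int) -> int:
    """Reject anything that is not a plain TCP port number (hypothetical helper)."""
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port!r}")
    return port


def parse_pids(lsof_output: str) -> list[int]:
    """Extract PIDs from `lsof -t` output, ignoring any non-numeric tokens."""
    return [
        int(token)
        for token in lsof_output.split()
        if token.isascii() and token.isdigit()
    ]


def pids_using_port(port: int) -> list[int]:
    """Return PIDs that lsof reports for the port (this can include clients)."""
    port = validate_port(port)
    try:
        # Arguments are passed as a list with no shell, so the port value
        # cannot be interpreted as shell syntax.
        result = subprocess.run(
            ["lsof", "-t", "-i", f":{port}"],
            capture_output=True,
            text=True,
            timeout=5,
            check=False,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # lsof is missing or hung: treat the port as having no known owners.
        return []
    return parse_pids(result.stdout)


def free_port(port: int) -> list[int]:
    """Send SIGTERM to other processes using the port; return the PIDs signalled."""
    signalled = []
    for pid in pids_using_port(port):
        if pid == os.getpid():
            continue  # never signal the current process
        try:
            os.kill(pid, signal.SIGTERM)
        except (ProcessLookupError, PermissionError):
            continue  # already gone, or owned by another user
        signalled.append(pid)
    return signalled
```

The properties the finding relies on are visible here: the port is checked to be an integer in range before use, the command is run without a shell, and only process IDs reported for that port are signalled.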

skill-eval