The Agent Skills Directory

[SAFE]: The skill is entirely instructional and documents best practices for AI agent evaluation. All provided code snippets, such as those demonstrating the use of subprocess for running tests or exec() for code evaluation, are clearly presented as examples within the context of a sandbox or testing environment. No malicious patterns, such as prompt injection, unauthorized data access, or obfuscation, were detected. All external links point to well-known research benchmarks and official documentation.

agent-evaluation