The Agent Skills Directory

[SAFE]: The skill is purely informational, providing methodologies, prompt templates, and theoretical frameworks for evaluating AI model outputs.
[SAFE]: No executable scripts, binaries, or shell commands are included in the file.
[SAFE]: External references are limited to academic papers on arXiv and reputable industry blogs, which are standard for documentation and research-oriented skills.
[SAFE]: The prompt templates for direct scoring and pairwise comparison include specific instructions to mitigate length and position bias, a security best practice for building robust evaluation systems.

advanced-evaluation