llm-evaluation

LLM Evaluation

Evaluate and validate LLM outputs for quality assurance using RAGAS and LLM-as-judge patterns.

Quick Reference

LLM-as-Judge Pattern

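import re

# `llm` is assumed to be an async chat client whose response exposes `.content`.
async def evaluate_quality(input_text: str, output_text: str, dimension: str) -> float:
    """Ask a judge model to rate output_text on one dimension; return a score in [0.1, 1.0]."""
    response = await llm.chat([{
        "role": "user",
        "content": f"""Evaluate for {dimension}. Score 1-10.
Input: {input_text[:500]}
Output: {output_text[:1000]}
Respond with just the number."""
    }])
    # Judges sometimes add extra text (e.g. "8/10"), so extract the first integer.
    match = re.search(r"\d+", response.content)
    if match is None:
        raise ValueError(f"No score found in judge response: {response.content!r}")
    return min(max(int(match.group()), 1), 10) / 10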
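
The judge pattern above scores one dimension per call. As a minimal sketch of how it might be extended, the example below scores several dimensions concurrently and applies a pass/fail gate on the mean. Everything beyond the original prompt is an assumption made for illustration: `FakeJudge`, `parse_score`, `evaluate_dimensions`, `passes_gate`, the dimension names, and the 0.7 threshold are hypothetical and not part of the skill itself. The fake judge exists only so the sketch runs without a real model.

```python
import asyncio
import re
from dataclasses import dataclass


@dataclass
class FakeResponse:
    content: str


class FakeJudge:
    """Hypothetical stand-in for a real async chat client; always answers '8'."""

    async def chat(self, messages: list) -> FakeResponse:
        return FakeResponse(content="8")


def parse_score(raw: str) -> float:
    """Extract the first integer from the judge reply and map 1-10 onto 0.1-1.0."""
    match = re.search(r"\d+", raw)
    if match is None:
        raise ValueError(f"No score found in judge response: {raw!r}")
    return min(max(int(match.group()), 1), 10) / 10


async def evaluate_dimensions(llm, input_text: str, output_text: str,
                              dimensions: list) -> dict:
    """Score one output on several dimensions concurrently."""

    async def judge(dimension: str) -> float:
        prompt = (
            f"Evaluate for {dimension}. Score 1-10.\n"
            f"Input: {input_text[:500]}\n"
            f"Output: {output_text[:1000]}\n"
            "Respond with just the number."
        )
        response = await llm.chat([{"role": "user", "content": prompt}])
        return parse_score(response.content)

    scores = await asyncio.gather(*(judge(d) for d in dimensions))
    return dict(zip(dimensions, scores))


def passes_gate(scores: dict, threshold: float = 0.7) -> bool:
    """Assumed quality gate: the mean score must reach the threshold."""
    return sum(scores.values()) / len(scores) >= threshold


if __name__ == "__main__":
    result = asyncio.run(
        evaluate_dimensions(FakeJudge(), "What is 2+2?", "4", ["relevance", "accuracy"])
    )
    print(result, passes_gate(result))
```

With a real client, `FakeJudge()` would be replaced by whatever object provides an awaitable `chat` method. Gating on the mean is one design choice; requiring every dimension to clear the threshold individually is a stricter alternative.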

Repository

yonatangross/sk…e-plugin

GitHub Stars

188

First Seen

Jan 21, 2026

llm-evaluation — yonatangross/skillforge-claude-plugin