DeepEval Testing

A skill for pytest-style LLM evaluation with DeepEval.

Overview

DeepEval provides:

  • pytest-compatible LLM tests
  • Built-in metrics (faithfulness, relevance, toxicity)
  • Custom metric creation
  • Async test execution

Use When

This skill is automatically invoked when:

  • Creating pytest-style LLM tests
  • Evaluating RAG quality
  • Measuring faithfulness/relevance
  • Building custom metrics

Available Scripts

  • scripts/init-deepeval.sh: Initialize a DeepEval project
  • scripts/run-tests.sh: Run the DeepEval tests

Available Templates

  • templates/conftest.py: pytest configuration
  • templates/test_basic.py: Basic test structure
  • templates/test_rag.py: RAG evaluation tests

Built-in Metrics

  • AnswerRelevancyMetric
  • FaithfulnessMetric
  • ContextualRelevancyMetric
  • HallucinationMetric
  • ToxicityMetric
  • BiasMetric