claude-skills-benchmark
Skill Benchmarking
Evaluate Agent Skills through static analysis and evaluation-driven methodology. Source: Anthropic's skill evaluation guidance.
When to Use
Activate when:
- Assessing skill quality across a plugin or marketplace
- Measuring skill activation accuracy (false positives/negatives)
- Comparing skill versions or skill-vs-no-skill performance
- Running the /benchmark-skills command
- Reviewing skill descriptions for optimization
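
As a rough illustration of what measuring activation accuracy can look like, the sketch below counts false positives and false negatives over a small set of labeled prompts and derives precision and recall. Everything in it is an assumption made for the example: the `ActivationCase` record, the `activation_metrics` helper, and the sample prompts are hypothetical and do not come from the skill or from the /benchmark-skills command.

```python
from dataclasses import dataclass


@dataclass
class ActivationCase:
    """One labeled prompt (hypothetical record, not part of the skill)."""
    prompt: str
    expected: bool   # label: the skill should activate for this prompt
    activated: bool  # observation: the skill did activate


def activation_metrics(cases):
    """Count activation outcomes and derive precision and recall."""
    tp = sum(c.expected and c.activated for c in cases)
    fp = sum(not c.expected and c.activated for c in cases)   # false positives
    fn = sum(c.expected and not c.activated for c in cases)   # false negatives
    tn = sum(not c.expected and not c.activated for c in cases)
    # Guard against division by zero when a class is absent from the sample.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# Made-up sample data for a hypothetical testing skill.
cases = [
    ActivationCase("Write ExUnit tests for this module", True, True),
    ActivationCase("Add property-based tests", True, False),
    ActivationCase("Summarize this meeting", False, False),
    ActivationCase("Fix this CSS bug", False, True),
]
metrics = activation_metrics(cases)
print(metrics)
```

Running the same labeled set against two versions of a skill description gives a like-for-like way to compare them: a drop in false negatives suggests the description now triggers on more relevant prompts, while a rise in false positives suggests it has become too broad.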
Static Analysis Checks
Run these checks against every skill to produce a quality scorecard:
| Check | Pass Criteria |
|---|---|