Overview

This skill is for running evaluations of models hosted on the Hugging Face Hub, using local hardware for inference.

It covers:

inspect-ai with local inference
lighteval with local inference
choosing between vllm, Hugging Face Transformers, and accelerate
smoke tests, task selection, and backend fallback strategy
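
As a rough illustration of the backend fallback idea in the list above, the sketch below picks the first inference backend whose Python package is importable. The function name `pick_backend`, the `is_installed` hook, and the preference order are all hypothetical and are not taken from this skill's scripts. Checking that a package is importable also says nothing about whether a given model will actually load on it.

```python
from importlib.util import find_spec

# Hypothetical preference order (an assumption, not this skill's documented policy).
BACKEND_PREFERENCE = ("vllm", "accelerate", "transformers")


def _package_available(name):
    # find_spec returns None when a top-level package is not installed.
    return find_spec(name) is not None


def pick_backend(preference=BACKEND_PREFERENCE, is_installed=_package_available):
    """Return the first backend in `preference` that `is_installed` accepts."""
    for name in preference:
        if is_installed(name):
            return name
    tried = ", ".join(preference)
    raise RuntimeError(f"No local inference backend found; tried: {tried}")
```

Calling `pick_backend()` with no arguments checks the current environment. Passing a custom `is_installed` callable makes the selection logic easy to exercise in a smoke test without installing any backend.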

It does not cover:

Hugging Face Jobs orchestration
model-card or model-index edits
README table extraction
Artificial Analysis imports
.eval_results generation or publishing
PR creation or community-evals automation

If the user wants to run the same eval remotely on Hugging Face Jobs, hand off to the hugging-face-jobs skill and pass it one of the local scripts in this skill.

Repository

huggingface/skills

First Seen

Mar 23, 2026
