Evidence Heavy Evaluator
Run a deterministic repo evaluation and emit auditable artifacts in test-output.
Workflow
- Choose inputs:
  - target_dir: repo or subdirectory to evaluate.
  - profile: readiness, maintainability, or release-readiness.
  - depth: quick or deep.
  - execute_checks: include to run lint/test/typecheck/build evidence.
- Collect evidence:
skills/evidence-heavy-evaluator/scripts/collect_evidence.sh \
--target-dir <target_dir> \
--profile <profile> \
--depth <depth> \
[--execute-checks]
- Read outputs from <target_dir>/test-output/evidence-heavy-evaluator/:
  - readiness-scorecard.json
  - readiness-report.md
  - checks-summary.tsv
  - metrics.tsv
  - signals.tsv
- Summarize results for the user:
  - Lead with highest-impact failed criteria.
  - Cite the exact artifact paths used as evidence.
  - Separate failed checks from skipped/not-evaluated checks.
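The collection step above can be sketched in Python as a small helper that assembles the collector command and lists the artifact paths to cite afterwards. The script path, flags, allowed values, and artifact names come from this document; the function names `build_command` and `artifact_paths` are hypothetical, and the up-front validation is this sketch's own choice, not something the collector is known to do.

```python
from pathlib import Path

# Script path, flags, and artifact names as documented in this skill.
COLLECTOR = "skills/evidence-heavy-evaluator/scripts/collect_evidence.sh"
PROFILES = ("readiness", "maintainability", "release-readiness")
DEPTHS = ("quick", "deep")
ARTIFACTS = (
    "readiness-scorecard.json",
    "readiness-report.md",
    "checks-summary.tsv",
    "metrics.tsv",
    "signals.tsv",
)


def build_command(target_dir, profile, depth, execute_checks=False):
    """Return the collector argv for the chosen inputs (hypothetical helper)."""
    # Validation here is an assumption of this sketch, based on the
    # values listed under "Choose inputs".
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    if depth not in DEPTHS:
        raise ValueError(f"unknown depth: {depth!r}")
    argv = [
        COLLECTOR,
        "--target-dir", str(target_dir),
        "--profile", profile,
        "--depth", depth,
    ]
    if execute_checks:
        argv.append("--execute-checks")
    return argv


def artifact_paths(target_dir):
    """Return the documented output paths under the target directory."""
    out_dir = Path(target_dir) / "test-output" / "evidence-heavy-evaluator"
    return [out_dir / name for name in ARTIFACTS]
```

If the command is launched with `subprocess.run`, passing `check=False` keeps a non-zero exit from raising, which matches the guardrail that command failures are evidence rather than blockers.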
Guardrails
- Keep evaluation read-only: do not edit code as part of this skill.
- Treat command failures as evidence, not blockers.
- Preserve deterministic ordering in report summaries.
- If --execute-checks is omitted, call out that quality execution criteria are not evaluated.
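As a rough illustration of the reporting guardrails, the sketch below splits check records into failed and skipped groups, orders them deterministically, and adds the note about omitted execution checks. The record fields (`name`, `status`, `weight`) and the status values `fail` and `skipped` are assumptions made for this example; the real schema of checks-summary.tsv is not described here.

```python
def summarize(checks, execute_checks):
    """Build summary lines from check records (hypothetical schema).

    Each record is assumed to be a dict with "name", "status", and
    "weight" keys; these names are illustrative, not the collector's.
    """
    # Sort by descending weight, then by name, so the output does not
    # depend on the order in which the records were read.
    ordered = sorted(checks, key=lambda c: (-c["weight"], c["name"]))
    failed = [c["name"] for c in ordered if c["status"] == "fail"]
    skipped = [c["name"] for c in ordered if c["status"] == "skipped"]
    lines = [
        "Failed: " + (", ".join(failed) or "none"),
        "Skipped or not evaluated: " + (", ".join(skipped) or "none"),
    ]
    if not execute_checks:
        lines.append(
            "Note: --execute-checks was omitted, so quality execution "
            "criteria were not evaluated."
        )
    return lines
```

Sorting on a full key (weight, then name) rather than relying on input order is what keeps repeated runs over the same evidence producing identical summaries.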
Criteria
Use references/criteria-matrix.md as the source of truth for scoring criteria and profile weights.
Notes
- The collector automatically runs render_report.py after evidence collection.
- uv is required because render_report.py is executed with uv run.
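Because the report step depends on uv, a caller may want to check for it before starting the collector. The following is a minimal sketch of such a preflight check; the function name `uv_available` is hypothetical, and the check is not part of the collector itself as far as this document states.

```python
import shutil


def uv_available(which=shutil.which):
    """Return True if a uv executable is found on PATH.

    The lookup function is injectable so the check can be exercised
    without uv being installed.
    """
    return which("uv") is not None
```

A caller could report a missing uv as a finding before running the collector, rather than discovering it when report rendering fails.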
Repository
0xsero/vllm-studio