review-release-notes
# Review Release Notes
Audit a scored `features.json` file and its markdown draft from the perspective of the reader, not the implementer.
This is the editorial QA stage of the pipeline. Its job is to make sure the release notes feel curated, legible, and exciting to the right audience — not like an API inventory.
This skill reuses the shared rubric from editorial-scoring. It critiques the scoring; it does not invent a separate scoring philosophy.
## When to use
- After `generate-features` produced `features.json`
- After `release-notes` drafted the markdown
- When the selection feels too broad, too niche, or too internally focused
- When you want to compare the current draft against the examples and recalibrate the cut
## Inputs
Review these, in order:
- `changes.json` — the source of truth for what shipped
- `features.json` — the scored candidate list, if present
- Draft markdown files (`libraries.md`, `runtime.md`, `sdk.md`, etc.)
- Editorial examples in `references/examples/`
- The scoring and quality bar references: `../editorial-scoring/SKILL.md`, `references/feature-scoring.md`, and `references/quality-bar.md`
If `features.json` does not exist yet, infer the current implicit scoring from:
- which features were promoted to headings
- how much space each feature received
- what was grouped into bug-fix buckets or omitted
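For orientation, here is a minimal sketch of the kind of entry this review expects to find (or to reconstruct) per feature. The field names are illustrative assumptions, not a schema guaranteed by the pipeline:

```json
{
  "id": "http-client-retry-policy",
  "title": "Configurable retry policy for the HTTP client",
  "score": 8,
  "audience": "most users making outbound HTTP calls",
  "evidence": ["PR reference", "benchmark or repro"],
  "placement": "section"
}
```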
## Core review questions
Use the shared rubric from `../editorial-scoring/SKILL.md` rather than inventing a new one here. In particular:
- keep the same reader-centric `10 / 8 / 6 / 4 / 2 / 0` scale
- apply the same 80/20 audience filter
- compare the draft against the examples to see whether it still teaches instead of merely enumerating
## Example check
Compare the draft against the component examples:
- Does the length match the importance?
- Are medium-value items grouped instead of getting their own heavyweight sections?
- Are performance stories backed by evidence?
- Does the draft teach, not just enumerate?
## Common failure modes
- Over-scoring — too many `4-6` items are promoted to top-level sections
- Under-explaining — a real `8+` feature is present but buried or not framed clearly
- API inventory mode — the draft starts mirroring `api-diff` instead of telling a user story
- Technical novelty bias — clever implementation details outrank practical user value
- Missed revert — a promoted feature was later backed out or does not appear in the actual build
## Multi-model review pattern
For the final editorial QA pass, use this skill as a two-reviewer parallel check to get a broader range of viewpoints:
Preferred set:
- Claude Opus 4.6
- GPT-5.4
Give both reviewers the same inputs and the same requested output:
- most correct scoring
- most wrong scoring
- important omissions
- cut list
Then synthesize the overlap and disagreements. Treat consensus as a strong signal, but do not turn this into a blind vote — fidelity to `changes.json`, the shared editorial-scoring rubric, and the repo's editorial rules still wins.
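A minimal sketch of that synthesis step, assuming findings come back one per line in the requested shape. The reviewer callables here are hypothetical stand-ins for whatever harness actually runs the two models:

```python
from typing import Callable

PROMPT = (
    "Review features.json and the draft. Return findings one per line as "
    "'file | heading | issue | suggested rewrite'."
)

def synthesize(reviewers: list[Callable[[str], str]]) -> tuple[set[str], set[str]]:
    """Send every reviewer the same prompt, then split consensus from disagreement."""
    findings = [set(r(PROMPT).splitlines()) for r in reviewers]
    consensus = set.intersection(*findings)
    disagreements = set.union(*findings) - consensus
    return consensus, disagreements

# Toy usage with canned reviewers standing in for the two models.
a = lambda p: "sdk.md | Retries | no evidence for perf claim | cite the benchmark"
b = lambda p: ("sdk.md | Retries | no evidence for perf claim | cite the benchmark\n"
               "libraries.md | Intro | vague heading | name the feature")
consensus, disagreements = synthesize([a, b])
```

Consensus here is literal line overlap; in practice, match findings by file and heading rather than by exact string, and adjudicate disagreements against `changes.json` and the shared rubric instead of counting votes.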
## Reviewer checklist
Do not ask reviewers the vague question "do you like this?" Give them the same specific checks instead:
- Which headings still sound vague, passive, anthropomorphic, or promotional?
- Which sections fail the 80/20 reader-value test and should be cut, grouped, or demoted?
- Which sentences assert feelings or outcomes (`trust`, `confidence`, `easier`, `better`) instead of stating the concrete change?
- Which sections drift into API-inventory mode instead of teaching a user-facing story?
- Which promoted items were later reverted, backed out, or are missing from the actual build/package set? Cite the revert PR or verification gap.
- Which code samples or examples are weak, confusing, or unsupported by the text?
- Which links, issue/PR references, or formatting details still violate house style?
- What is the single highest-value rewrite still needed in the draft?
- Is the wording conventional, or is it inventing non-standard phrasing or terms?
- Are the subject and its adjective or adverb paired in a familiar way?
- Would this phrasing seem normal in release notes for another developer platform?
- If `release-notes/features.json` lists this feature, does the section begin with the standard preview blockquote?
Ask reviewers to answer with file + heading + issue + suggested rewrite. This produces an actionable review instead of general taste feedback.
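For example, one finding in that shape (all details hypothetical):

```text
libraries.md | "Improved developer confidence" | heading asserts a feeling, not a change | rename to "HTTP client: configurable retry policy"
```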
## Output
Return a concise review with:
- Most correct scoring — the features that are best prioritized
- Most wrong scoring — over-scored and under-scored items
- Important omissions — items from `changes.json` / `features.json` that should likely be promoted
- Cut list — what to group, compress, demote, or drop first
As a working rule of thumb:
- `8+` — section-worthy
- `6-7` — grouped paragraph or short section
- `0-5` — bug-fix bucket, one-liner, or cut
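The same thresholds as a minimal sketch; the function name and `score` input are illustrative, not part of the pipeline:

```python
def placement(score: int) -> str:
    """Map a rubric score to its default treatment in the draft."""
    if score >= 8:
        return "section"        # section-worthy
    if score >= 6:
        return "grouped"        # grouped paragraph or short section
    return "bucket-or-cut"      # bug-fix bucket, one-liner, or cut

assert placement(9) == "section"
assert placement(6) == "grouped"
assert placement(3) == "bucket-or-cut"
```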