pr-cluster
PR Cluster Finder
Find clusters of related PRs so you can review them as a group, identify the best base PR, and close duplicates. This is the pre-review triage step — run before /review-pr.
"Think in Clusters, Not Individual PRs. A single good merge often resolves 5+ related issues and PRs at once."
Arguments
$1= PR number (required)$2= owner/repo (default: auto-detect fromgh repo view --json nameWithOwner -q .nameWithOwner, falls back toopenclaw/openclaw)
Output Rules
- Always include clickable GitHub links for every PR and issue mentioned in the report:
https://github.com/{owner}/{repo}/pull/{number}for PRs,https://github.com/{owner}/{repo}/issues/{number}for issues. - Use markdown link format:
[**#123**: title](url)so links are clickable in terminal and rendered output.
Workflow
Phase 1: Analyze the Target PR
Fetch the target PR's full details. Run these in parallel:
# PR metadata
gh api repos/{owner}/{repo}/pulls/{number} \
--jq '{number, title, state, body, user: .user.login, created_at, updated_at, comments, review_comments, labels: [.labels[].name], mergeable_state}'
# Files changed
gh api repos/{owner}/{repo}/pulls/{number}/files \
--jq '[.[] | {filename, status, additions, deletions}]'
# Reviews
gh api repos/{owner}/{repo}/pulls/{number}/reviews \
--jq '[.[] | {user: .user.login, state}]'
Extract these clustering signals from the response:
- Conventional commit scope — parse
type(scope): descriptionfrom title → extractscope(e.g.,telegram,discord,gateway). This is the strongest signal. - Keywords — 3-5 meaningful words from the title after stripping the type/scope prefix and stop words.
- Labels — especially
channel: *andsize: *labels. - Files changed — exact file paths. Strong duplicate signal.
- Directories touched — parent directories of changed files (e.g.,
src/telegram/). - Issue references — scan body for
#NNNN,fixes #NNNN,closes #NNNN,issue #NNNNpatterns.
Phase 2: Multi-Signal Search
Run all searches in parallel — they're independent. Use per_page=30 and sort=updated&order=desc for broad searches.
Search 1 — Same channel/scope label:
gh api 'search/issues?q=repo:{owner}/{repo}+is:pr+label:"{label}"&per_page=30&sort=updated&order=desc' \
--jq '[.items[] | {number, title, state, user: .user.login, updated_at, labels: [.labels[].name]}]'
Search 2 — Title keywords (catches different approaches to same problem):
gh api 'search/issues?q=repo:{owner}/{repo}+is:pr+{keyword1}+{keyword2}+{keyword3}&per_page=30&sort=updated&order=desc' \
--jq '[.items[] | {number, title, state, user: .user.login, updated_at}]'
Search 3 — File-path keywords (catches direct duplicates): For each distinctive file changed, search using the filename stem (without extension):
gh api 'search/issues?q=repo:{owner}/{repo}+is:pr+{filename-stem}&per_page=20' \
--jq '[.items[] | {number, title, state, user: .user.login}]'
Then verify file overlap on top candidates:
gh api repos/{owner}/{repo}/pulls/{candidate}/files --jq '[.[].filename]'
Search 4 — Linked issues (catches PRs targeting the same bug):
For each issue #NNNN referenced in the target PR body:
gh api 'search/issues?q=repo:{owner}/{repo}+is:pr+{issue_number}&per_page=20' \
--jq '[.items[] | {number, title, state, user: .user.login}]'
Search 5 — Scope in title (broadest catch for same subsystem):
gh api 'search/issues?q=repo:{owner}/{repo}+is:pr+in:title+{scope}&per_page=30&sort=updated&order=desc' \
--jq '[.items[] | {number, title, state, user: .user.login, updated_at}]'
Important — PR state handling: All searches include both open and closed PRs. For each result, capture the state field (open or closed). For closed PRs, check if they were merged using gh api repos/{owner}/{repo}/pulls/{number} --jq '.merged'. Use these status badges in the report:
OPEN— still openMERGED— closed and merged (the fix landed; this may mean the problem is already solved!)CLOSED— closed without merge (rejected or abandoned approach — note why if comments explain)
For CLOSED PRs, read the closing comments (gh api repos/{owner}/{repo}/issues/{number}/comments --jq '.[].body'). Look for:
- References to landed commits (e.g.,
559b5eab7) — these mean the fix shipped via a different path. Always link landed commits ashttps://github.com/{owner}/{repo}/commit/{sha}. - "Duplicate of #NNNN" — follow the chain to the actual resolution.
- "Reimplemented on main" — the PR's approach was adopted but re-done by a maintainer.
- Stale/inactivity closures — the PR may still be valid, just abandoned.
This is critical context: a closed PR with a landed commit often means the problem is already solved and all open duplicates should be closed too.
Phase 3: Score and Cluster
Deduplicate all results by PR number (exclude the target PR itself). For each candidate, calculate a relevance score:
| Signal | Points | Notes |
|---|---|---|
| Shared file | +3 each | Same files = likely duplicate |
| Same issue ref | +5 | Explicitly same bug |
| Shared label | +2 each | Same channel/scope |
| Title keyword overlap | +1 each | Similar problem space |
| Same commit scope | +2 | Same subsystem |
| Updated in last 30 days | +1 | Active, not stale |
| Has reviews | +1 | Already received attention |
maintainer or trusted-contributor label |
+1 | Higher quality signal |
Tier assignment:
- Tier 1 — Likely duplicates (score >= 8): Almost certainly addressing the same problem.
- Tier 2 — Strongly related (score 4-7): Same area, possibly different aspects.
- Tier 3 — Loosely related (score 2-3): Same subsystem, different issues.
Phase 4: Deep Dive on Top Candidates
For Tier 1 and Tier 2 PRs (up to ~15), fetch full details:
gh api repos/{owner}/{repo}/pulls/{number} \
--jq '{number, title, state, body: (.body[:500]), user: .user.login, created_at, updated_at, additions, deletions, comments, review_comments, mergeable_state, labels: [.labels[].name]}'
Evaluate each PR on:
- Problem match: Is it solving the exact same problem as the target?
- Completeness: Does it fully solve the problem or is it partial?
- Size: additions/deletions, size label (smaller focused fixes are preferred)
- Freshness: Last updated, staleness (>90 days = likely superseded)
- Author trust:
maintainer>trusted-contributor>experienced-contributor> unlabeled - Review status: Approved? Changes requested? No reviews?
- Quality signals: Does the PR body explain the problem clearly? Are there tests?
Remember: Treat PRs as problem descriptions, not finished code. The best PR to base from may not have the best code — it might just describe the problem most clearly or touch the right files.
Phase 5: Cluster Report
Present results in this format:
## PR Cluster Report: #{target_number}
### Target PR
[**#{number}**: {title}](https://github.com/{owner}/{repo}/pull/{number})
- Author: @{author} | Labels: {labels} | Files: {count} changed
- Problem: {1-2 sentence summary from PR body}
### Cluster Summary
{N} related PRs found | {T1} likely duplicates | {T2} strongly related | {T3} loosely related
---
### Tier 1: Likely Duplicates (score >= 8)
`{STATUS}` [**#{number}**: {title}](https://github.com/{owner}/{repo}/pull/{number})
- Author: @{author} | Updated: {date} | Size: {label} | Reviews: {count}
- Shared files: {list}
- Why duplicate: {specific reason — same fix, same issue, same files}
- Quality: {brief assessment}
[repeat for each]
If any Tier 1 PR is MERGED, flag prominently: "This fix may already be on main — verify before proceeding."
### Tier 2: Strongly Related (score 4-7)
`{STATUS}` [**#{number}**: {title}](https://github.com/{owner}/{repo}/pull/{number})
- Author: @{author} | Updated: {date}
- Connection: {what connects it to the target}
[repeat for each]
### Tier 3: Loosely Related (score 2-3)
- `{STATUS}` [#{number}: {title}](https://github.com/{owner}/{repo}/pull/{number}) — {one-line connection}
[brief list]
---
### Recommendation
**Best base PR**: [#{number}](https://github.com/{owner}/{repo}/pull/{number}) — {why this one}
- Consider: {what makes it the strongest starting point — freshness, completeness, author trust, code quality}
**Close if best is merged**: [#{n1}](https://github.com/{owner}/{repo}/pull/{n1}), [#{n2}](...) — {these are superseded}
**Keep open** (different enough to warrant separate review): [#{n}](https://github.com/{owner}/{repo}/pull/{n})
**Related issues**: [#{issue}](https://github.com/{owner}/{repo}/issues/{issue}) — {include all issue refs found during search}
**Suggested next step**: `/review-pr {best_number}` to begin the review pipeline on the best candidate.
**Duplicate cleanup**: After merging, close superseded PRs with a polite comment thanking the contributor and linking to the merged PR.
Integration with Maintainer Pipeline
This skill is Step 0 in the maintainer workflow:
/pr-cluster {number} → find the cluster, pick the best base
/review-pr {best} → structured review with findings
/prepare-pr {best} → rebase, fix findings, run gates, push
/merge-pr {best} → squash-merge with attribution
→ close duplicates from the cluster
After identifying the best candidate, hand off to /review-pr which will:
- Create a worktree at
.worktrees/pr-{number} - Run
scripts/pr-reviewfor setup - Produce
.local/review.mdand.local/review.jsonwith structured findings
The cluster context you gather here informs the review — mention duplicate PRs and alternative approaches in the review findings so /prepare-pr can incorporate the best ideas from across the cluster.
Rate Limits
- GitHub Search API: 30 requests/minute (authenticated). The parallel searches in Phase 2 typically use 5-8 requests. Phase 4 deep dives use 1 request per candidate.
- If rate limited (HTTP 403/429), reduce parallelism and add brief pauses.
- Use
per_page=30for broad label/scope searches;per_page=100only for narrow targeted queries.
Tips
- The conventional commit scope (e.g.,
telegramfromfix(telegram): ...) is your strongest clustering signal at OpenClaw scale — always extract and search on it. - PRs in the same
src/{channel}/directory are almost certainly related even if titles look different. - Stale PRs (>90 days) that overlap with a fresh PR are almost always superseded — mark them as close candidates.
- Always follow issue references —
fixes #NNNNin the body leads to the most reliable duplicate discovery since multiple PRs often target the same issue. - Size labels help triage quickly:
size: XS/size: SPRs are usually focused fixes,size: L/size: XLare broader refactors. - If the cluster is very large (>15 Tier 1+2), further sub-cluster by exact file overlap to identify truly identical attempts vs related-but-different fixes.
Closing Message Templates
After the cluster analysis, use the appropriate template when closing superseded PRs. Adapt the details to the specific situation — these are structures, not rigid scripts. Always use single-quoted heredoc (-F - <<'EOF') for gh comments to avoid shell escaping issues.
Template 1: Already Fixed on Main
Use when the problem was already resolved by a landed commit.
Closing — this was already fixed on `main`.
**What landed:**
- {commit_sha_short} ([`{title}`](https://github.com/{owner}/{repo}/commit/{sha})) shipped on {date}, which {brief description of what the fix did}.
**Why this PR is no longer needed:**
- {Specific explanation of why the landed fix makes this PR unnecessary.}
- {Any other context — e.g., the fix took a different approach, the behavior this PR changes is now correct, etc.}
Thank you for flagging this, @{author}! {Optional: acknowledge that the problem they identified was real, even if the fix path differed.}
Template 2: Duplicate of Another Open PR
Use when closing in favor of a canonical PR that will be reviewed/merged.
Closing this as a duplicate to keep the discussion and CI signal in one place.
**Why this is duplicate:**
- This PR and #{canonical} both modify the same core path (`{shared_file}`) to {what they both do}.
- Both target the same user-visible symptom: {description of the shared problem}.
- Keeping both open fragments review/CI and increases merge-conflict risk for the same logic area.
**Canonical thread to continue on:**
- #{canonical} — {brief reason why that one was chosen: more complete, has tests, fresher rebase, etc.}
If there is any behavior in this PR that is not covered by #{canonical}, please call it out there and we can fold it in explicitly.
Thank you for the contribution, @{author}!
Template 3: Superseded by a Different Approach
Use when a maintainer chose a fundamentally different solution path.
Closing — this was addressed via a different approach.
**What shipped instead:**
- #{landed_pr} / {commit_sha} took the approach of {description}, rather than {what this PR did}.
- {Why the alternative approach was preferred, e.g., "backward compatibility", "broader fix", "addresses root cause"}.
**Related threads:**
- #{canonical_pr} for the implementation that landed.
- #{issue} for the original issue, now resolved.
Thank you for the PR, @{author} — the problem you identified was valid and helped inform the fix that shipped.
Template 4: Stale / Abandoned with Active Replacement
Use when a PR went stale and a newer attempt exists.
Closing — this went stale and a more recent PR addresses the same problem.
**Active replacement:**
- #{newer_pr} ({title}) covers the same fix with {what makes it better: tests, changelog, fresher rebase}.
If you'd like to continue contributing to this area, #{newer_pr} is the thread to follow.
Thank you for the early work on this, @{author}!
Usage Notes
- Always thank the contributor by @-mentioning them.
- Always link to the canonical PR/commit/issue so the contributor can follow up.
- Never use backticks around issue/PR refs like
#12345— use plain #12345 so GitHub auto-links them. - Post the comment via:
gh pr comment {number} -R {owner}/{repo} -F - <<'EOF'\n{message}\nEOF
Confirmation Gate
Never close PRs without explicit maintainer approval. The workflow is:
- Draft all closing comments and present them to the maintainer for review.
- Show the full list: which PRs will be closed, which template is used, and the exact comment text.
- Wait for explicit "go" / approval before posting any comments or closing any PRs.
- Close PRs one at a time with comment + close:
# Comment first, then close
gh pr comment {number} -R {owner}/{repo} -F - <<'EOF'
{closing message}
EOF
gh pr close {number} -R {owner}/{repo}
- After all closures, report what was done with links to each closed PR's comment.