resolve-pr-comments
PR Comment Resolver
Multi-agent system for fetching, clustering, and resolving PR review comments at scale.
⛔⛔⛔ FORBIDDEN: gh cli / gh api for PR comments ⛔⛔⛔
DO NOT USE gh commands to fetch PR comment data. EVER.
# ❌ FORBIDDEN - NEVER run these commands
gh pr view <N> --json ...
gh api repos/.../pulls/<N>/comments
gh api repos/.../pulls/<N>/reviews
gh api graphql -f query='...' # for PR data
Why this is forbidden:
| Method | Tokens | Quality |
|---|---|---|
| gh api / gh pr view | 10,000-50,000 | Raw, unprocessed, duplicates |
| pr-resolver.sh script | 500-2,000 | Clustered, deduplicated, actionable |
The script already fetches everything. Manual gh calls waste 10-50x tokens on garbage data.
Architecture Overview
Orchestrator (You)
│
├── pr-resolver.sh ──> data.json + actionable.json + clusters/*.md
│
├── background_task(agent: "pr-comment-reviewer") per cluster
│ │
│ ├── Validates comments against actual code
│ ├── Applies minimal fixes (VALID_FIX)
│ ├── Dismisses false positives (FALSE_POSITIVE)
│ ├── Defers risky/unclear items (VALID_DEFER)
│ └── Returns JSON output with actions taken
│
└── Handle deferred items (escalate or investigate)
Quick Start
⚠️ WAIT BY DEFAULT: The script waits for CI and AI reviews (max 10 min total) before clustering. DO NOT use --skip-wait unless explicitly requested or CI confirmed passed.
⏱️ TIMEOUT: OpenCode default is 2 min. Pass timeout: 660000 (11 min) to prevent early termination. On timeout, script outputs commands to check status - follow those to decide next steps.
# 1. Fetch and cluster comments (waits for CI + AI reviews)
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER>",
timeout: 660000,
description: "Fetch and cluster PR comments"
})
# Skip wait only when explicitly asked
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER> --skip-wait \"<reason>\"",
description: "Fetch PR comments (skip wait)"
})
Output: .ada/data/pr-resolver/pr-<N>/
- data.json: Full data, all clusters
- actionable.json: Token-efficient, actionable clusters only
- clusters/: Markdown files for subagent consumption
# 2. Fire subagents for each actionable cluster (non-blocking, returns immediately)
background_task({ agent: "pr-comment-reviewer", prompt: "Process cluster. Read: .ada/data/pr-resolver/pr-7/clusters/agents-md-suggestion.md", description: "PR #7: agents-md" })
background_task({ agent: "pr-comment-reviewer", prompt: "Process cluster. Read: .ada/data/pr-resolver/pr-7/clusters/install-sh-issue.md", description: "PR #7: install-sh" })
# ... spawn all clusters immediately, then collect results as they complete with background_output(task_id)
# 3. After all subagents complete, verify
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER>",
timeout: 660000,
description: "Fetch and cluster PR comments"
})
# Success: actionable_clusters should be 0
Output Files
| File | Purpose | When to Use |
|---|---|---|
| data.json | Full cluster data including resolved comments | Historical context, debugging |
| actionable.json | Token-efficient, only actionable clusters | Primary orchestration input |
| clusters/*.md | Individual cluster files with full context | Subagent input |
actionable.json Structure
{
"pr_number": 7,
"repository": "owner/repo",
"generated_at": "2025-01-07T09:00:00Z",
"statistics": {
"total_comments": 10,
"resolved_comments": 3,
"unresolved_comments": 7,
"actionable_clusters": 4
},
"clusters": [
{
"cluster_id": "agents-md-suggestion",
"file": "AGENTS.md",
"concern": "suggestion",
"total_comments": 3,
"resolved_count": 1,
"unresolved_count": 2,
"actionable": true,
"comments": [...]
}
]
}
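A minimal sketch of how an orchestrator might pull the workload out of this structure, assuming the field names shown above. The helper name actionable_cluster_ids and the second sample cluster (readme-doc-fix) are hypothetical; in practice the JSON would be read from .ada/data/pr-resolver/pr-<N>/actionable.json rather than an inline string.

```python
import json

# Trimmed sample mirroring the structure above (illustrative values only).
SAMPLE = """
{
  "pr_number": 7,
  "statistics": {"actionable_clusters": 1},
  "clusters": [
    {"cluster_id": "agents-md-suggestion", "file": "AGENTS.md",
     "concern": "suggestion", "unresolved_count": 2, "actionable": true},
    {"cluster_id": "readme-doc-fix", "file": "README.md",
     "concern": "doc-fix", "unresolved_count": 0, "actionable": false}
  ]
}
"""


def actionable_cluster_ids(data):
    """Return the cluster_id of every cluster flagged actionable.

    The filter is defensive: actionable.json is described as holding only
    actionable clusters, but the same helper also works on data.json.
    """
    return [c["cluster_id"] for c in data.get("clusters", []) if c.get("actionable")]


data = json.loads(SAMPLE)
ids = actionable_cluster_ids(data)
print(ids)
```

An empty list here corresponds to the actionable_clusters: 0 case, where no subagents need to be spawned.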
Complete Orchestration Workflow
Phase 1: Fetch and Analyze
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER>",
timeout: 660000,
description: "Fetch and cluster PR comments"
})
Read actionable.json to understand workload:
- How many actionable clusters?
- What concern types? (security requires special handling)
- Which files are affected?
Decision Point: If actionable_clusters: 0, you're done.
Phase 2: Spawn Subagents
⛔ DO NOT use task() in this skill. It blocks until ALL calls complete, so one slow subagent blocks everything.
Use background_task for async execution. Fire all, collect as they complete.
Parallel Execution: Fire subagents for clusters that touch different files immediately. They operate on separate files, so they cannot conflict.
Serial Execution: If multiple clusters affect the same file, wait for the earlier task to complete before spawning the next.
CRITICAL: The agent name MUST be exactly "pr-comment-reviewer". No variations.
# Fire ALL clusters immediately (non-blocking, returns task_id)
background_task({ agent: "pr-comment-reviewer", prompt: "Process cluster. Read: .ada/.../cluster-1.md", description: "Cluster 1" })
background_task({ agent: "pr-comment-reviewer", prompt: "Process cluster. Read: .ada/.../cluster-2.md", description: "Cluster 2" })
background_task({ agent: "pr-comment-reviewer", prompt: "Process cluster. Read: .ada/.../cluster-3.md", description: "Cluster 3" })
# ... all return task_ids immediately
# Collect results as they complete (system notifies on completion)
background_output({ task_id: "..." }) # Get result for specific task
Characteristics:
- Returns task_id immediately (non-blocking)
- Subagent runs in isolated sub-session
- Collect results with background_output(task_id)
- System notifies when each task completes
- Must explicitly tell subagent to read the file (no auto-injection)
Smart Scheduling: When a task completes, check if any blocked tasks can now start (e.g., same-file clusters waiting for earlier cluster to finish). Start unblocked tasks immediately rather than waiting for all results.
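The parallel/serial rule and the smart-scheduling step can be sketched as one queue per file. This is an illustration under assumptions, not part of the skill: the class name FileQueueScheduler and its methods are hypothetical, the cluster dictionaries reuse the cluster_id and file fields from actionable.json, and the sketch only decides what to start; it does not call background_task itself.

```python
from collections import defaultdict, deque


class FileQueueScheduler:
    """Hypothetical helper: one FIFO queue of cluster ids per affected file."""

    def __init__(self, clusters):
        self._queues = defaultdict(deque)
        self._file_of = {}
        for cluster in clusters:
            self._queues[cluster["file"]].append(cluster["cluster_id"])
            self._file_of[cluster["cluster_id"]] = cluster["file"]

    def initial_batch(self):
        """Head of every queue: these touch different files, so start them all."""
        return [queue[0] for queue in self._queues.values()]

    def on_complete(self, cluster_id):
        """Record a finished cluster; return the next cluster for its file, if any."""
        queue = self._queues[self._file_of[cluster_id]]
        if not queue or queue[0] != cluster_id:
            raise ValueError(f"{cluster_id} is not the running cluster for its file")
        queue.popleft()
        return queue[0] if queue else None


clusters = [
    {"cluster_id": "a1", "file": "a.py"},
    {"cluster_id": "b1", "file": "b.py"},
    {"cluster_id": "a2", "file": "a.py"},
]
scheduler = FileQueueScheduler(clusters)
first = scheduler.initial_batch()          # a1 and b1 can run in parallel
unblocked = scheduler.on_complete("a1")    # a2 becomes startable right away
print(first, unblocked)
```

When on_complete returns a cluster id, that cluster can be spawned immediately without waiting for the rest of the batch.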
Common Errors:
| Error | Cause | Fix |
|---|---|---|
| Agent not found | Typo in agent name | Use exactly "pr-comment-reviewer" |
| Empty result | Subagent didn't read file | Include "Read the cluster file at: {path}" in prompt |
Phase 3: Collect Results
Each subagent returns output in two formats:
- Markdown Summary (human-readable)
- JSON Output (for orchestrator parsing)
The JSON output includes:
{
"cluster_id": "agents-md-suggestion",
"summary": {
"fixed": 2,
"dismissed": 1,
"deferred": 1
},
"actions": [
{
"comment_id": "123456",
"classification": "VALID_FIX",
"confidence": 95,
"action": "FIXED",
"reason": "Updated import path",
"script_executed": "pr-resolver-resolve.sh 7 123456",
"script_result": "SUCCESS",
"thread_resolved": true
}
],
"deferred_items": [
{
"comment_id": "345678",
"reason": "Security concern requires human review",
"context": "..."
}
]
}
Critical Fields:
- script_executed: Actual command run (null for deferred)
- script_result: "SUCCESS" or error message
- thread_resolved: Whether GitHub thread is now resolved
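A minimal sketch of checking these fields after collecting a subagent's JSON. The function name failed_actions is hypothetical, and it assumes every action that ran a script should end with thread_resolved set to true, which the source does not spell out for dismissals. The second and third sample entries are invented for illustration.

```python
def failed_actions(result):
    """Return comment_ids whose script ran but did not leave the thread resolved."""
    failed = []
    for action in result.get("actions", []):
        if action.get("script_executed") is None:
            # Deferred items run no script; they are handled in Phase 4 instead.
            continue
        succeeded = action.get("script_result") == "SUCCESS"
        if not (succeeded and action.get("thread_resolved") is True):
            failed.append(action["comment_id"])
    return failed


# Illustrative subagent output using the field names from the contract above.
sample = {
    "cluster_id": "agents-md-suggestion",
    "actions": [
        {"comment_id": "123456", "script_executed": "pr-resolver-resolve.sh 7 123456",
         "script_result": "SUCCESS", "thread_resolved": True},
        {"comment_id": "234567", "script_executed": "pr-resolver-resolve.sh 7 234567",
         "script_result": "network error", "thread_resolved": False},
        {"comment_id": "345678", "script_executed": None,
         "script_result": None, "thread_resolved": False},
    ],
}
print(failed_actions(sample))
```

Any id returned here is a candidate for a retry or manual handling before the verification run.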
Phase 4: Handle Deferred Items
Subagents defer items they cannot safely resolve. Before escalating to human, attempt self-investigation.
Triage Deferred Items
Parse each deferred item's reason and classify investigability:
| Defer Reason | Self-Investigable? | Action |
|---|---|---|
| Low confidence (50-69) | YES | Investigate with more context |
| Conflicting evidence | YES | Gather more evidence, break tie |
| Complex refactor | MAYBE | Assess scope first |
| Security concern | YES - Research First | Research vulnerability type, check if it applies, look up mitigations. Only escalate if fix is unclear after thorough research |
| Unclear requirement | YES | Search codebase for patterns, check docs, understand intent |
Autonomy Principle: Exhaust all research options before escalating. You have tools - use them.
Self-Investigation Workflow
For each investigable deferred item:
- READ the deferred_item context from subagent output
  - What did the subagent check?
  - Why was confidence low?
- READ the file at the mentioned location (±30 lines context)
  - Understand the full function/component, not just the line
- GATHER additional evidence
  - grep: Search for patterns the subagent might have missed
  - lsp_find_references: Check how the code is used
  - lsp_hover: Understand types and signatures
  - read: Check related files (imports, callers)
- CHECK resolved comments in same cluster
  - What patterns were already validated/dismissed?
- SEARCH codebase for similar patterns
  - How is this pattern handled elsewhere?
See Investigation Guide for detailed examples and decision trees.
Post-Investigation Decision
| New Confidence | Action |
|---|---|
| ≥70 | Apply fix or dismiss with evidence |
| 50-69 | Document findings, escalate with notes |
| <50 | Escalate immediately |
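The thresholds in the table above can be written as a small function. This is a sketch only: the function name and the returned labels are hypothetical, and confidence is assumed to be an integer percentage from 0 to 100.

```python
def post_investigation_action(confidence):
    """Map a post-investigation confidence (0-100) to the action from the table."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= 70:
        return "apply_fix_or_dismiss"      # act, and record the evidence
    if confidence >= 50:
        return "escalate_with_notes"       # document findings first
    return "escalate_immediately"


for value in (95, 69, 50, 49):
    print(value, post_investigation_action(value))
```

Note the boundaries: 70 is enough to act, while 69 still goes to a human with notes.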
If you resolved it yourself, execute the scripts:
# After fixing
bash skills/resolve-pr-comments/scripts/pr-resolver-resolve.sh <PR> <COMMENT_ID>
# After dismissing with evidence
bash skills/resolve-pr-comments/scripts/pr-resolver-dismiss.sh <PR> <COMMENT_ID> "reason with evidence"
Escalation Format
When escalating to human, provide full context:
## Deferred Items Requiring Human Review
### 1. Comment {ID} - {file}:{line}
**Original Comment**: {bot's comment}
**Subagent Reason**: {why deferred}
**Subagent Confidence**: {N}%
**My Investigation**:
{what you checked and found}
**Why I'm Escalating**:
{specific uncertainty}
**Options I See**:
1. {option A} - {implications}
2. {option B} - {implications}
**My Recommendation**: {if you have one}
Phase 5: Verification
After all subagents complete and deferred items are handled:
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER>",
timeout: 660000,
description: "Fetch and cluster PR comments"
})
Success Criteria: actionable_clusters: 0
If actionable clusters remain:
- Check which threads weren't resolved
- Investigate why (subagent errors, script failures)
- Retry or handle manually
Subagent Output Contract
The @pr-comment-reviewer subagent (defined in agents/pr-comment-reviewer.md) returns:
Classification Types
| Classification | Meaning | Thread Status |
|---|---|---|
| VALID_FIX | Issue was real, fix applied | RESOLVED |
| FALSE_POSITIVE | Bot was wrong | DISMISSED |
| ALREADY_FIXED | Fixed elsewhere | DISMISSED |
| STYLE_CHOICE | Preference, not bug | DISMISSED |
| VALID_DEFER | Real issue, too risky | OPEN |
| UNCLEAR | Cannot determine | OPEN |
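For orchestrator-side parsing, the classification table can be held as a lookup. A sketch under assumptions: the dictionary mirrors the table, while the helper name needs_follow_up is hypothetical and simply treats any classification whose thread stays OPEN as work left for the deferred-item phase.

```python
# Classification -> expected thread status, copied from the table above.
THREAD_STATUS = {
    "VALID_FIX": "RESOLVED",
    "FALSE_POSITIVE": "DISMISSED",
    "ALREADY_FIXED": "DISMISSED",
    "STYLE_CHOICE": "DISMISSED",
    "VALID_DEFER": "OPEN",
    "UNCLEAR": "OPEN",
}


def needs_follow_up(classification):
    """True when the thread is expected to stay open after the subagent finishes."""
    try:
        return THREAD_STATUS[classification] == "OPEN"
    except KeyError:
        raise ValueError(f"unknown classification: {classification}") from None


open_types = sorted(name for name in THREAD_STATUS if needs_follow_up(name))
print(open_types)
```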
Scripts Reference
| Script | Purpose | Usage |
|---|---|---|
| pr-resolver.sh <PR> | Fetch + cluster + generate outputs | Main entry point (waits for CI by default) |
| pr-resolver.sh <PR> --skip-wait "reason" | Skip CI/review wait | When CI already passed |
| pr-resolver-resolve.sh <PR> <ID> [ID2...] | Resolve thread(s) after fixing | Post-fix cleanup |
| pr-resolver-dismiss.sh <PR> <ID> "reason" | Dismiss with reply | False positive handling |
Fork/Upstream Support
When working with PRs from forks to upstream repos, use --repo to specify the upstream:
# Fetch PR comments from upstream repo (when in a fork)
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER> --repo <UPSTREAM-OWNER/REPO>",
timeout: 660000,
description: "Fetch and cluster PR comments"
})
# Resolve/dismiss also support --repo
bash skills/resolve-pr-comments/scripts/pr-resolver-resolve.sh <PR_NUMBER> <ID> --repo <UPSTREAM-OWNER/REPO>
Auto-detection: If you're in a fork, the scripts automatically detect the upstream and use it. The --repo flag overrides this for explicit control.
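If the command string is assembled programmatically, building it as an argument list avoids spacing mistakes between the PR number and the flags. The sketch below uses only the flags documented here (--skip-wait and --repo). The function name is hypothetical, and passing both flags together, in this order, is an assumption the source never shows.

```python
import shlex

SCRIPT = "skills/resolve-pr-comments/scripts/pr-resolver.sh"


def build_resolver_command(pr_number, repo=None, skip_wait_reason=None):
    """Build the argv list for pr-resolver.sh from the documented options."""
    argv = ["bash", SCRIPT, str(pr_number)]
    if skip_wait_reason is not None:
        argv += ["--skip-wait", skip_wait_reason]
    if repo is not None:
        # Assumed ordering: the source does not show both flags combined.
        argv += ["--repo", repo]
    return argv


# Each argument is its own list element; shlex.join adds spacing and quoting.
command = shlex.join(build_resolver_command(7, repo="owner/repo"))
print(command)
```

The resulting string can be passed as the command field of the bash tool call, with timeout: 660000 as shown above.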
Concern Categories
Comments are auto-categorized by content:
| Category | Trigger Keywords | Auto-Fix Risk |
|---|---|---|
| security | security, vulnerability, injection, xss, csrf | Research deeply - understand the vulnerability, verify it applies, fix if confident |
| issue | bug, error, fail, incorrect, broken | Careful |
| import-fix | import, export, require, module | Safe |
| markdown-lint | markdown, md0XX, fenced, code block | Safe |
| doc-fix | doc link, documentation, readme | Careful |
| suggestion | consider, should, might, could, suggest | Careful |
| uncategorized | Everything else | Research first, then decide |
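To make the keyword table concrete, here is a rough sketch of keyword-based categorization. It is not the script's actual logic, which the source does not show. The assumptions are case-insensitive substring matching, table order as precedence (first match wins), and md0XX read as "md" followed by three digits.

```python
import re

# Keyword lists copied from the table; the ordering doubles as assumed precedence.
CATEGORY_KEYWORDS = [
    ("security", ["security", "vulnerability", "injection", "xss", "csrf"]),
    ("issue", ["bug", "error", "fail", "incorrect", "broken"]),
    ("import-fix", ["import", "export", "require", "module"]),
    ("markdown-lint", ["markdown", "fenced", "code block"]),
    ("doc-fix", ["doc link", "documentation", "readme"]),
    ("suggestion", ["consider", "should", "might", "could", "suggest"]),
]
# Assumed reading of "md0XX": a markdownlint rule id such as MD013 or MD040.
MD_RULE = re.compile(r"\bmd\d{3}\b")


def categorize(comment):
    """Return the first category whose keywords appear in the comment text."""
    text = comment.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return category
        if category == "markdown-lint" and MD_RULE.search(text):
            return category
    return "uncategorized"


print(categorize("This could allow SQL injection"))
print(categorize("Consider renaming this variable"))
```

Substring matching is deliberately loose: "fail" also matches "failure", and a comment containing both "could" and "injection" lands in security because that row is checked first.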
When to Escalate to Human
| Scenario | Escalation Required |
|---|---|
| Security concern after thorough research still unclear | YES |
| Breaking API changes | YES |
| Performance implications unclear after benchmarking/analysis | YES |
| Multiple valid interpretations | YES (ask for preference) |
| Subagent failed 2+ times on same item | YES |
Key Principle: Research first, escalate only when truly uncertain after exhausting your tools.
Anti-Patterns
⛔ FORBIDDEN (Blocking Subagents)
# ❌ FORBIDDEN - task() blocks until ALL complete, one slow subagent blocks everything
task({ subagent_type: "pr-comment-reviewer", ... })
# ✅ CORRECT - background_task returns immediately, collect results as they complete
background_task({ agent: "pr-comment-reviewer", ... })
⛔ FORBIDDEN (Context Bloat)
These commands are NEVER allowed when using this skill:
# ❌ FORBIDDEN - bloats context with raw unprocessed data
gh pr view <N> --json ...
gh api repos/.../pulls/<N>/comments
gh api repos/.../pulls/<N>/reviews
gh api graphql -f query='...' # for PR comments
# ✅ CORRECT - run this instead
bash({
command: "bash skills/resolve-pr-comments/scripts/pr-resolver.sh <PR_NUMBER>",
timeout: 660000,
description: "Fetch and cluster PR comments"
})
Other Anti-Patterns
| Don't | Do Instead |
|---|---|
| Use task() to spawn subagents | Use background_task() for non-blocking execution |
| Process clusters without reading actionable.json first | Start with actionable.json to understand scope |
| Skip deferred items | Handle each deferred item explicitly |
| Defer without researching first | Research thoroughly (docs, codebase, web), then decide |
| Ignore subagent verification failures | Investigate why verification failed |
| Run multiple subagents on same file simultaneously | Serialize per-file to avoid conflicts |
| Mark complete without refreshing | Always run pr-resolver.sh again to verify |
| Ask humans for things you can look up | Use your tools - search, read, grep before escalating |
Context Efficiency Rule: The script produces actionable.json specifically to minimize token usage. Raw gh api output is ~10-50x larger than the processed output. Never fetch what the script already fetches.
Integration with Other Skills
| Skill | Integration |
|---|---|
| docs-check | Run after code fixes to check doc impact |
| docs-write | Use when docs need updating |
| git-commit | Commit resolved changes with proper format |
| code-review | Run before pushing to catch regressions |
References
- Investigation Guide - Detailed workflow for investigating deferred items
- Documentation Guide - Documentation standards
Workflow Checklist
Before marking PR comment resolution complete:
- Ran pr-resolver.sh to fetch current state
- Read actionable.json to understand workload
- Spawned subagent for each actionable cluster
- Collected and verified subagent results
- Handled all deferred items (investigate or escalate)
- Re-ran pr-resolver.sh - confirmed actionable_clusters: 0
- Reported any items escalated to human