build-coherence
Build Coherence
Evaluate competing approaches through independent assessment, explicit reasoning-out-loud advocacy, confidence-calibrated commitment thresholds, and structured deadlock resolution — producing coherent decisions from multiple reasoning paths.
When to Use
- forage-solutions has identified multiple valid approaches and a selection must be made
- Oscillating between two approaches without committing to either
- Needing to justify a decision with structured reasoning (architecture choice, tool selection, implementation strategy)
- When a previous decision was made by gut feeling and needs evidence-based validation
- When internal reasoning is producing contradictory conclusions and coherence must be restored
- Before an irreversible action (merging, deploying, deleting) where the cost of the wrong choice is high
Inputs
- Required: Two or more competing approaches to evaluate
- Optional: Quality assessments from prior scouting (see forage-solutions)
- Optional: Decision stakes (reversible, moderate, irreversible) for threshold calibration
- Optional: Time budget for the decision
- Optional: Known failure mode (oscillation, premature commitment, groupthink)
Procedure
Step 1: Independent Evaluation
Assess each approach on its own merits before comparing them. The critical rule: do not let the assessment of approach A bias the assessment of approach B.
For each approach, evaluate independently:
Approach Evaluation Template:
┌────────────────────────┬──────────────────────────────────────────┐
│ Dimension              │ Assessment                               │
├────────────────────────┼──────────────────────────────────────────┤
│ Approach name          │                                          │
├────────────────────────┼──────────────────────────────────────────┤
│ Core mechanism         │ How does this approach solve the problem?│
├────────────────────────┼──────────────────────────────────────────┤
│ Strengths (2-3)        │ What does this approach do well?         │
├────────────────────────┼──────────────────────────────────────────┤
│ Risks (2-3)            │ What could go wrong? What is assumed?    │
├────────────────────────┼──────────────────────────────────────────┤
│ Evidence quality       │ How well-supported is this approach?     │
│                        │ (verified / inferred / speculated)       │
├────────────────────────┼──────────────────────────────────────────┤
│ Quality score (0-100)  │ Overall assessment                       │
├────────────────────────┼──────────────────────────────────────────┤
│ Confidence (0-100)     │ How confident in this assessment?        │
└────────────────────────┴──────────────────────────────────────────┘
Fill this out for each approach separately. Do not write a comparison until all individual evaluations are complete.
Expected: Independent evaluations where each approach is assessed on its own terms. The evaluation of approach B does not reference approach A. Quality scores reflect genuine assessment, not ranking.
On failure: If the evaluations are contaminated (you find yourself writing "better than A" while assessing B), reset. Assess A completely, then clear the framing and assess B from scratch. If the scores are all identical, the evaluation dimensions are too coarse — add domain-specific criteria.
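The template above can be held as one record per approach, which makes it harder to slip a comparison into an individual assessment. The following is a minimal sketch, not part of the skill itself: the names ApproachEvaluation, EVIDENCE_LEVELS, and scores_too_coarse are hypothetical, and the sample approaches and numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import List

# The three evidence labels from the template.
EVIDENCE_LEVELS = ("verified", "inferred", "speculated")


@dataclass
class ApproachEvaluation:
    """One filled-in evaluation template (hypothetical structure)."""
    name: str
    core_mechanism: str
    strengths: List[str]
    risks: List[str]
    evidence_quality: str  # one of EVIDENCE_LEVELS
    quality_score: int     # 0-100, overall assessment
    confidence: int        # 0-100, confidence in this assessment

    def __post_init__(self) -> None:
        if self.evidence_quality not in EVIDENCE_LEVELS:
            raise ValueError(f"unknown evidence quality: {self.evidence_quality!r}")
        for label in ("quality_score", "confidence"):
            value = getattr(self, label)
            if not 0 <= value <= 100:
                raise ValueError(f"{label} must be in 0-100, got {value}")


def scores_too_coarse(evaluations: List[ApproachEvaluation]) -> bool:
    """True when every approach received the same quality score,
    the signal that domain-specific criteria should be added."""
    scores = {e.quality_score for e in evaluations}
    return len(evaluations) > 1 and len(scores) == 1


# Illustrative values only.
cache = ApproachEvaluation(
    name="Add a cache",
    core_mechanism="Memoize repeated lookups",
    strengths=["Small change", "Easy to revert"],
    risks=["Stale entries"],
    evidence_quality="inferred",
    quality_score=70,
    confidence=60,
)
rewrite = ApproachEvaluation(
    name="Rewrite the query",
    core_mechanism="Remove the repeated lookups at the source",
    strengths=["Fixes the root cause"],
    risks=["Larger change", "Assumes the schema is stable"],
    evidence_quality="speculated",
    quality_score=70,
    confidence=40,
)

if scores_too_coarse([cache, rewrite]):
    print("Scores are identical: add domain-specific criteria and re-score.")
```

Each record is filled in without reference to the other, and the identical-score check only runs once all individual evaluations exist.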
Step 2: Waggle Dance — Reason Out Loud
Advocate for each approach proportionally to its quality. This is the AI equivalent of the bee waggle dance: making implicit reasoning explicit and public.
- For each approach, state the case for it — as if presenting to a skeptical user:
- "Approach A is strong because [evidence]. The main risk is [risk], which is mitigated by [mitigation]."
- Advocacy intensity should be proportional to quality score:
- High-quality approach: detailed advocacy with specific evidence
- Medium-quality approach: brief advocacy with acknowledged limitations
- Low-quality approach: mentioned for completeness, not actively advocated
- Cross-inspection: after advocating for A, actively look for evidence that supports B instead. After advocating for B, look for evidence that supports A. This counteracts confirmation bias
The purpose of reasoning-out-loud is to make the decision auditable — to yourself and to the user. If the reasoning cannot be articulated, the assessment is shallower than the score suggests.
Expected: Explicit reasoning for each approach that would be persuasive to a neutral observer. Cross-inspection reveals at least one consideration that was initially overlooked.
On failure: If advocacy feels perfunctory (going through the motions), the approaches may not be genuinely different; they may be variations of the same idea. Check: do the approaches differ in mechanism, or only in implementation detail? If the latter, the decision may not matter much, so pick either and move on.
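Proportional advocacy and cross-inspection can be sketched as two small helpers. The skill does not define numeric cutoffs between high, medium, and low quality, so the defaults of 70 and 40 below are assumptions chosen only for illustration, as are the function names.

```python
from typing import Dict, List


def advocacy_level(quality_score: int, high: int = 70, low: int = 40) -> str:
    """Map a 0-100 quality score to an advocacy intensity.
    The cutoffs `high` and `low` are illustrative assumptions."""
    if not 0 <= quality_score <= 100:
        raise ValueError(f"quality_score must be in 0-100, got {quality_score}")
    if quality_score >= high:
        return "detailed advocacy with specific evidence"
    if quality_score >= low:
        return "brief advocacy with acknowledged limitations"
    return "mentioned for completeness, not actively advocated"


def cross_inspection_targets(names: List[str]) -> Dict[str, List[str]]:
    """For each approach just advocated, list the other approaches
    whose supporting evidence should be looked for next."""
    return {name: [other for other in names if other != name] for name in names}


for name, score in (("A", 85), ("B", 45)):
    print(f"Approach {name}: {advocacy_level(score)}")
print(cross_inspection_targets(["A", "B"]))
```

With these assumed cutoffs, an approach scoring 85 gets detailed advocacy and one scoring 45 gets brief advocacy, and after advocating for A the next step is to look for evidence that supports B.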
Step 3: Set Quorum Threshold and Commit
Set the confidence threshold required to commit, calibrated to the decision's stakes.
Confidence Thresholds by Stakes:
┌─────────────────────┬───────────┬──────────────────────────────────┐
│ Decision Type       │ Threshold │ Rationale                        │
├─────────────────────┼───────────┼──────────────────────────────────┤
│ Easily reversible   │ 60%       │ Cost of trying and reverting is  │
│ (can undo)          │           │ low. Speed matters more than     │
│                     │           │ certainty                        │
├─────────────────────┼───────────┼──────────────────────────────────┤
│ Moderate stakes     │ 75%       │ Reverting has cost but is        │
│ (costly to reverse) │           │ possible. Worth investing in     │
│                     │           │ evaluation                       │
├─────────────────────┼───────────┼──────────────────────────────────┤
│ Irreversible or     │ 90%       │ Cannot undo. Must be confident.  │
│ high-stakes         │           │ If threshold not met, gather     │
│                     │           │ more information before deciding │
└─────────────────────┴───────────┴──────────────────────────────────┘
- Classify the decision stakes
- Check: does the leading approach's quality score × confidence, with both read as fractions of 100 (for example, 0.85 × 0.90 ≈ 0.77), reach the threshold?
- If yes: commit. State the decision, the reasoning, and the key risk being accepted
- If no: identify what additional information would raise confidence to the threshold
- Once committed, do not revisit unless new disqualifying evidence emerges
Expected: A clear commitment moment with stated reasoning. The decision is made at an appropriate confidence level for its stakes.
On failure: If the threshold is never met (can't reach 90% on an irreversible decision), ask: is the decision truly irreversible? Can it be decomposed into a reversible test phase + an irreversible commit? Most apparently irreversible decisions can be staged. If staging is impossible, communicate the uncertainty to the user and ask for guidance.
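The commitment check can be worked through numerically. This sketch assumes that the quality score and the confidence are both read as fractions of 100 before being multiplied, which is one interpretation of "quality score × confidence"; the names THRESHOLDS and commitment_check are hypothetical.

```python
from typing import Tuple

# Thresholds from the table, keyed by decision stakes.
THRESHOLDS = {"reversible": 0.60, "moderate": 0.75, "irreversible": 0.90}


def commitment_check(quality_score: int, confidence: int, stakes: str) -> Tuple[bool, float, float]:
    """Return (commit?, combined value, threshold).
    Assumption: both inputs are 0-100 and are scaled to 0-1 before multiplying."""
    threshold = THRESHOLDS[stakes]
    combined = (quality_score / 100) * (confidence / 100)
    return combined >= threshold, combined, threshold


# Illustrative numbers: quality 85, confidence 90.
for stakes in ("reversible", "moderate", "irreversible"):
    commit, combined, threshold = commitment_check(85, 90, stakes)
    verdict = "commit" if commit else "gather more information"
    print(f"{stakes}: {combined:.3f} vs {threshold:.2f} -> {verdict}")
```

Under this reading, a quality score of 85 held with 90 confidence combines to roughly 0.765, which clears the reversible and moderate thresholds but not the irreversible one. That gap is the cue to look for a reversible test phase or to gather more information.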
Step 4: Resolve Deadlocks
Use this step when two or more approaches have similar scores and no single approach meets the quorum threshold.
Deadlock Resolution:
┌────────────────────────┬──────────────────────────────────────────┐
│ Deadlock Type          │ Resolution                               │
├────────────────────────┼──────────────────────────────────────────┤
│ Genuine tie            │ The approaches are equivalent. Pick one  │
│ (scores within 5%)     │ and commit. The cost of deliberating     │
│                        │ exceeds the cost of picking the "wrong"  │
│                        │ equivalent option. Flip a coin mentally  │
├────────────────────────┼──────────────────────────────────────────┤
│ Information deficit    │ The tie exists because evaluation is     │
│ (scores uncertain)     │ incomplete. Invest one more specific     │
│                        │ investigation (a targeted file read, a   │
│                        │ quick test), then re-score               │
├────────────────────────┼──────────────────────────────────────────┤
│ Oscillation            │ Scoring keeps flip-flopping depending on │
│ (scores keep changing) │ which dimension gets attention. Time-box:│
│                        │ set a timer, evaluate once more, commit  │
│                        │ to the result regardless                 │
├────────────────────────┼──────────────────────────────────────────┤
│ Approach merge         │ The best parts of A and B can be         │
│ (compatible strengths) │ combined. Check for compatibility. If    │
│                        │ merge is coherent, use it. If forced,    │
│                        │ don't; pick one                          │
└────────────────────────┴──────────────────────────────────────────┘
Expected: Deadlock resolved through the appropriate mechanism. The resolution is decisive — no lingering doubt that undermines execution.
On failure: If the deadlock persists through all resolution strategies, the decision may be premature. Ask the user: "I see two equally strong approaches: [A] and [B]. [Brief case for each.] Which aligns better with your priorities?" Delegating a genuine tie to the user is not a failure — it is acknowledging that the decision depends on values the AI cannot infer.
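Choosing a resolution strategy amounts to classifying the deadlock first. The sketch below is one possible classifier under stated assumptions: the order in which the four types are checked, the reading of "within 5%" as 5 points on the 0-100 scale, and the default limits for confidence and leader changes are all illustrative choices that the skill does not specify.

```python
from enum import Enum
from typing import Optional


class Deadlock(Enum):
    GENUINE_TIE = "pick one and commit"
    INFORMATION_DEFICIT = "run one targeted investigation, then re-score"
    OSCILLATION = "time-box, evaluate once more, commit to the result"
    APPROACH_MERGE = "combine only if the merge is coherent"


def classify_deadlock(
    score_a: int,
    score_b: int,
    lowest_confidence: int,
    leader_changes: int,
    strengths_compatible: bool,
    tie_margin: int = 5,          # assumed: "within 5%" as 5 points
    min_confidence: int = 50,     # assumed cutoff for "scores uncertain"
    max_leader_changes: int = 1,  # assumed cutoff for "scores keep changing"
) -> Optional[Deadlock]:
    """Return the matching deadlock type, or None if no pattern applies.
    The precedence below is an assumption, not part of the skill."""
    if leader_changes > max_leader_changes:
        return Deadlock.OSCILLATION
    if lowest_confidence < min_confidence:
        return Deadlock.INFORMATION_DEFICIT
    if strengths_compatible:
        return Deadlock.APPROACH_MERGE
    if abs(score_a - score_b) <= tie_margin:
        return Deadlock.GENUINE_TIE
    return None


kind = classify_deadlock(72, 70, lowest_confidence=80, leader_changes=0,
                         strengths_compatible=False)
print(kind.name if kind else "no deadlock pattern", "->", kind.value if kind else "ask the user")
```

A None result corresponds to the case where no resolution strategy fits, at which point the question goes to the user.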
Step 5: Assess Coherence Quality
After committing to a decision, evaluate whether the process produced genuine coherence or just a decision.
- Was the decision evidence-based, or was it rubber-stamping an initial preference?
- Test: was the preference the same before and after evaluation? If so, did the evaluation change anything?
- Were the losing approaches genuinely considered, or were they straw men?
- Test: can you articulate the strongest case for the losing approach?
- What signal would trigger reassessment?
- Define a specific observation that would invalidate the decision ("If I discover that the API doesn't support X, then approach B becomes better")
- Is there useful information from the losing approaches that should inform implementation?
- A risk identified in approach B might apply to approach A as well
Expected: A brief quality check that either confirms the decision or identifies it as weak. If weak, return to the appropriate earlier step rather than proceeding on shaky ground.
On failure: If the quality check reveals that the decision was preference-based rather than evidence-based, acknowledge it honestly. Sometimes preference is all that is available — but it should be labeled as such, not dressed up as analysis.
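The post-decision questions can be recorded in a small structure so that a missing answer is visible. This is a hypothetical sketch: the class CoherenceCheck, its field names, and the wording of the reported weaknesses are assumptions, and the sample decision is made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoherenceCheck:
    """Answers to the Step 5 questions (hypothetical structure)."""
    initial_preference: str
    decision: str
    evaluation_changed_something: bool
    strongest_case_for_loser: str
    reassessment_trigger: str
    carried_over_risks: List[str] = field(default_factory=list)

    def weaknesses(self) -> List[str]:
        """List the reasons the decision should be treated as weak."""
        found = []
        if self.initial_preference == self.decision and not self.evaluation_changed_something:
            found.append("possible rubber-stamping of the initial preference")
        if not self.strongest_case_for_loser.strip():
            found.append("losing approach was not genuinely considered")
        if not self.reassessment_trigger.strip():
            found.append("no reassessment trigger defined")
        return found


# Illustrative values only.
check = CoherenceCheck(
    initial_preference="A",
    decision="A",
    evaluation_changed_something=True,
    strongest_case_for_loser="B avoids a new dependency",
    reassessment_trigger="If the API doesn't support X, approach B becomes better",
    carried_over_risks=["Schema may change"],
)
print(check.weaknesses() or "decision confirmed")
```

An empty list confirms the decision. Any entry points back to the earlier step that needs repeating, or to labeling the decision honestly as preference-based.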
Validation
- Each approach was evaluated independently before comparison
- Advocacy was proportional to quality (not equal attention regardless of merit)
- Cross-inspection was performed (looking for counter-evidence after advocacy)
- Quorum threshold was calibrated to decision stakes
- If deadlocked, a specific resolution strategy was applied
- Post-decision quality check was performed
- A reassessment trigger was defined
Common Pitfalls
- Premature commitment: Deciding before evaluating all approaches. The first approach considered has an anchoring advantage — it gets more mental attention simply by being first. Evaluate all before comparing
- Equal advocacy for unequal approaches: If approach A scored 85 and approach B scored 45, spending equal time advocating for both wastes effort and creates false equivalence
- Rubber-stamping: Going through the evaluation process to justify a decision already made. The test is whether the evaluation could have changed the outcome. If not, the process was theater
- Threshold avoidance: Lowering the confidence threshold to make the decision easier rather than gathering the information needed to meet the appropriate threshold
- Ignoring the losing side: The losing approach often contains warnings that apply to the winning one. Risks identified in approach B don't disappear just because approach A was chosen
Related Skills
- build-consensus: the multi-agent consensus model that this skill adapts to single-agent reasoning
- forage-solutions: scouts the solution space that coherence evaluates; typically precedes this skill
- coordinate-reasoning: manages information flow during multi-path evaluation
- center: establishes the balanced baseline needed for unbiased evaluation
- meditate: clears assumptions between evaluating different approaches