# Drift Analysis
Knowledge and patterns for analyzing project state, detecting plan drift, and creating prioritized reconstruction plans.
## Architecture Overview

```
/drift-detect
│
├─→ collectors.js (pure JavaScript)
│   ├─ scanGitHubState()
│   ├─ analyzeDocumentation()
│   └─ scanCodebase()
│
└─→ plan-synthesizer (Opus)
    └─ Deep semantic analysis with full context
```

- **Data collection**: pure JavaScript (no LLM overhead)
- **Semantic analysis**: a single Opus call with complete context
## Drift Detection Patterns
### Types of Drift
**Plan Drift**: when documented plans diverge from actual implementation
- PLAN.md items remain unchecked for extended periods
- Roadmap milestones slip without updates
- Sprint/phase goals not reflected in code changes
**Documentation Drift**: when documentation falls behind implementation
- New features exist without corresponding docs
- README describes features that don't exist
- API docs don't match actual endpoints
**Issue Drift**: when issue tracking diverges from reality
- Stale issues that no longer apply
- Completed work without closed issues
- High-priority items neglected
**Scope Drift**: when project scope expands beyond original plans
- More features documented than can be delivered
- Continuous addition without completion
- Ever-growing backlog with no pruning
### Detection Signals
```
HIGH-CONFIDENCE DRIFT INDICATORS:
- Milestone 30+ days overdue with open issues
- PLAN.md < 30% completion after 90 days
- 5+ high-priority issues stale > 60 days
- README features not found in codebase

MEDIUM-CONFIDENCE INDICATORS:
- Documentation files unchanged for 180+ days
- Draft PRs open > 30 days
- Issue themes don't match code activity
- Large gap between documented and implemented features

LOW-CONFIDENCE INDICATORS:
- Many TODOs in codebase
- Stale dependencies
- Old git branches not merged
```
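As a rough sketch, the high-value thresholds above could be checked mechanically before any semantic analysis runs. The `signals` input shape and field names below are illustrative assumptions, not the actual `collectors.js` output:

```javascript
// Sketch: score a project against the drift indicators listed above.
// Thresholds mirror the indicator lists; the input shape is hypothetical.
function scoreDriftSignals(signals) {
  const findings = [];
  if (signals.milestoneOverdueDays >= 30 && signals.milestoneOpenIssues > 0) {
    findings.push({ signal: 'overdue-milestone', confidence: 'high' });
  }
  if (signals.projectAgeDays >= 90 && signals.planCompletionRatio < 0.3) {
    findings.push({ signal: 'stalled-plan', confidence: 'high' });
  }
  if (signals.staleHighPriorityIssues >= 5) {
    findings.push({ signal: 'priority-neglect', confidence: 'high' });
  }
  if (signals.docsUnchangedDays >= 180) {
    findings.push({ signal: 'stale-docs', confidence: 'medium' });
  }
  if (signals.todoCount > 50) {
    findings.push({ signal: 'many-todos', confidence: 'low' });
  }
  return findings;
}
```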
## Prioritization Framework
### Priority Calculation
```javascript
function calculatePriority(item, weights) {
  let score = 0;

  // Severity base score
  const severityScores = {
    critical: 15,
    high: 10,
    medium: 5,
    low: 2
  };
  score += severityScores[item.severity] || 5;

  // Category multiplier
  const categoryWeights = {
    security: 2.0,       // Security issues get 2x
    bugs: 1.5,           // Bugs get 1.5x
    infrastructure: 1.3,
    features: 1.0,
    documentation: 0.8
  };
  score *= categoryWeights[item.category] || 1.0;

  // Recency boost
  if (item.createdRecently) score *= 1.2;

  // Stale penalty (old items slightly deprioritized)
  if (item.daysStale > 180) score *= 0.9;

  return Math.round(score);
}
```
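For example, a recently created high-severity security issue works out as follows:

```javascript
// high severity (10) × security multiplier (2.0) × recency boost (1.2)
const score = Math.round(10 * 2.0 * 1.2);
console.log(score); // → 24
```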
### Time Bucket Thresholds
| Bucket | Criteria | Max Items |
|---|---|---|
| Immediate | severity=critical OR priority >= 15 | 5 |
| Short-term | severity=high OR priority >= 10 | 10 |
| Medium-term | priority >= 5 | 15 |
| Backlog | everything else | 20 |
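The table's thresholds reduce to a simple cascade. The sketch below is illustrative (`assignBucket` is not a documented collectors.js function); the per-bucket caps would be enforced when the plan is rendered:

```javascript
// Sketch: map a scored item to a time bucket per the thresholds above.
// Per-bucket caps (5/10/15/20) are applied later, when rendering the plan.
function assignBucket(item) {
  if (item.severity === 'critical' || item.priority >= 15) return 'immediate';
  if (item.severity === 'high' || item.priority >= 10) return 'short-term';
  if (item.priority >= 5) return 'medium-term';
  return 'backlog';
}
```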
### Priority Weights (Default)

```yaml
security: 10      # Security issues always top priority
bugs: 8           # Bugs affect users directly
features: 5       # New functionality
documentation: 3  # Important but not urgent
tech-debt: 4      # Keeps codebase healthy
```
## Cross-Reference Patterns
### Document-to-Code Matching
```javascript
// Fuzzy matching for feature names
function featureMatch(docFeature, codeFeature) {
  const normalize = s => s
    .toLowerCase()
    .replace(/[-_\s]+/g, '')
    .replace(/s$/, ''); // Remove trailing 's'

  const docNorm = normalize(docFeature);
  const codeNorm = normalize(codeFeature);

  return docNorm.includes(codeNorm) ||
         codeNorm.includes(docNorm) ||
         levenshteinDistance(docNorm, codeNorm) < 3;
}
```
### Common Mismatches
| Documented As | Implemented As |
|---|---|
| "user authentication" | `auth/`, `login/`, `session/` |
| "API endpoints" | `routes/`, `api/`, `handlers/` |
| "database models" | `models/`, `entities/`, `schemas/` |
| "caching layer" | `cache/`, `redis/`, `memcache/` |
| "logging system" | `logger/`, `logs/`, `telemetry/` |
## Output Templates
### Drift Report Section

```markdown
## Drift Analysis

### {drift_type}

**Severity**: {severity}
**Detected In**: {source}

{description}

**Evidence**:
{evidence_items}

**Recommendation**: {recommendation}
```
### Gap Report Section

```markdown
## Gap: {gap_title}

**Category**: {category}
**Severity**: {severity}

{description}

**Impact**: {impact_description}

**To Address**:
1. {action_item_1}
2. {action_item_2}
```
### Reconstruction Plan Section

```markdown
## Reconstruction Plan

### Immediate Actions (This Week)
{immediate_items_numbered}

### Short-Term (This Month)
{short_term_items_numbered}

### Medium-Term (This Quarter)
{medium_term_items_numbered}

### Backlog
{backlog_items_numbered}
```
## Best Practices
### When Analyzing Drift
1. **Compare timestamps, not just content**
   - When was the doc last updated vs. the last code change?
   - Are milestones dated realistically?

2. **Look for patterns, not individual items**
   - One stale issue isn't drift; 10 stale issues is a pattern
   - One undocumented feature isn't drift; 5 undocumented features is a pattern

3. **Consider context**
   - Active development naturally has some drift
   - Mature projects should have minimal drift
   - Post-launch projects often have documentation lag

4. **Weight by impact**
   - User-facing drift matters more than internal drift
   - Public API drift matters more than implementation details
### When Creating Plans
1. **Be actionable, not exhaustive**
   - Top 5 immediate items, not top 50
   - Each item should be completable in a reasonable time

2. **Group related items**
   - "Update authentication docs", not "Update login page docs" + "Update signup docs"

3. **Include success criteria**
   - How do we know this drift item is resolved?

4. **Balance categories**
   - Security first, but don't ignore everything else
   - Mix quick wins with important work
## Data Collection (JavaScript)
The `collectors.js` module extracts data without LLM overhead:
### GitHub Data
- Open issues categorized by labels
- Open PRs with draft status
- Milestones with due dates
- Stale items (> 90 days inactive)
- Theme analysis from titles
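The 90-day staleness cutoff could be implemented as below. The `updated_at` field follows the GitHub API's naming; the function name is illustrative, not the actual collector:

```javascript
// Sketch: flag GitHub items inactive for more than 90 days.
const STALE_DAYS = 90;

function findStale(items, now = new Date()) {
  return items.filter(item => {
    const ageDays = (now - new Date(item.updated_at)) / (1000 * 60 * 60 * 24);
    return ageDays > STALE_DAYS;
  });
}
```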
### Documentation Data
- Parsed README, PLAN.md, CLAUDE.md, CHANGELOG.md
- Checkbox completion counts
- Section analysis
- Feature lists
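Checkbox completion can be counted with a couple of regexes; a minimal sketch (the function name is an assumption, not the collector's API):

```javascript
// Sketch: count markdown task-list checkboxes, as in the PLAN.md analysis.
function checkboxStats(markdown) {
  const total = (markdown.match(/^\s*[-*] \[[ xX]\]/gm) || []).length;
  const checked = (markdown.match(/^\s*[-*] \[[xX]\]/gm) || []).length;
  return { total, checked, ratio: total ? checked / total : 0 };
}
```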
### Code Data
- Directory structure
- Framework detection
- Test framework presence
- Health indicators (CI, linting, tests)
## Semantic Analysis (Opus)
The `plan-synthesizer` receives all collected data and performs:
- **Cross-referencing**: match documented features to implementation
- **Drift identification**: find divergence patterns
- **Gap analysis**: identify what's missing
- **Prioritization**: context-aware ranking
- **Report generation**: actionable recommendations
## Example Input/Output
### Collected Data (from `collectors.js`)

```json
{
  "github": {
    "issues": [...],
    "categorized": { "bugs": [...], "features": [...] },
    "stale": [...]
  },
  "docs": {
    "files": { "README.md": {...}, "PLAN.md": {...} },
    "checkboxes": { "total": 15, "checked": 3 }
  },
  "code": {
    "frameworks": ["Express"],
    "health": { "hasTests": true, "hasCi": true }
  }
}
```
### Analysis Output (from `plan-synthesizer`)

```markdown
# Reality Check Report

## Executive Summary
Project has moderate drift: 8 stale priority issues and 20% plan completion.
Strong code health (tests + CI), but documentation lags implementation.

## Drift Analysis

### Priority Neglect
**Severity**: high

8 high-priority issues inactive for 60+ days...

## Prioritized Plan

### Immediate
1. Close #45 (already implemented)
2. Update README API section...
```