overview
SourceAtlas: Project Overview (Stage 0 Fingerprint)
Constitution: ANALYSIS_CONSTITUTION.md v1.0
Context
Arguments: ${ARGUMENTS:-.}
Goal: Generate project fingerprint by scanning <5% of files to achieve 70-80% understanding in 10-15 minutes.
Auto-Save: Results automatically saved to .sourceatlas/overview.yaml (or subdirectory-specific path)
Time Limit: 10-15 minutes (typically 0-5 minutes)
Cache Check (Highest Priority)
If --force is NOT in arguments, check cache first:
- Calculate cache path:
  - No path argument or .: .sourceatlas/overview.yaml
  - With path (e.g., src/api): .sourceatlas/overview-src-api.yaml
- Check if cache exists:
  ls -la .sourceatlas/overview.yaml 2>/dev/null
- If cache exists:
  - Calculate days since modification
  - Use Read tool to read cache
  - Output:
    📁 Loading cache: .sourceatlas/overview.yaml (N days ago)
    💡 Add --force to re-analyze
  - If over 30 days: Show warning
  - Output cache content
  - End, do not execute analysis
- If cache does not exist: Continue with analysis
If --force is in arguments: Skip cache, execute analysis
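The cache-path rule and the 30-day warning check above could be sketched as follows. This is a minimal POSIX shell sketch under stated assumptions, not SourceAtlas's own implementation: `cache_path` and `cache_is_stale` are hypothetical helper names, and stripping a leading `./` or trailing `/` before mapping the path is an assumption.

```shell
# Hypothetical helper: map a path argument to its cache file.
cache_path() {
    target="${1:-.}"
    target="${target#./}"   # assumption: ignore a leading "./"
    target="${target%/}"    # assumption: ignore a trailing "/"
    if [ -z "$target" ] || [ "$target" = "." ]; then
        printf '%s\n' ".sourceatlas/overview.yaml"
    else
        # Replace each "/" with "-", so src/api becomes src-api
        slug=$(printf '%s' "$target" | tr '/' '-')
        printf '%s\n' ".sourceatlas/overview-${slug}.yaml"
    fi
}

# Hypothetical helper: succeed if the cache file is more than 30 days old.
cache_is_stale() {
    # find -mtime +30 matches files whose age in whole days exceeds 30
    [ -n "$(find "$1" -type f -mtime +30 2>/dev/null)" ]
}
```

For example, `cache_path src/api` prints `.sourceatlas/overview-src-api.yaml`, and `cache_is_stale "$(cache_path)"` can gate the over-30-days warning.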
Your Task
Execute Stage 0 Analysis Only - generate project fingerprint using information theory principles.
Information Theory Approach:
- High-entropy files contain disproportionate information
- Scan priority: Documentation → Configuration → Models → Entry Points → Tests
- Scale-aware: TINY/SMALL/MEDIUM/LARGE/VERY_LARGE projects need different approaches
Core Workflow
Execute these phases in order. See workflow.md for complete details.
Phase 1: Project Detection & Scale-Aware Planning (2-3 minutes)
Purpose: Detect project type, count files, determine scale, set scan limits.
Execute detection:
# Try helper script first (recommended)
if [ -f ~/.claude/scripts/atlas/detect-project.sh ]; then
    bash ~/.claude/scripts/atlas/detect-project.sh ${ARGUMENTS:-.}
elif [ -f scripts/atlas/detect-project.sh ]; then
    bash scripts/atlas/detect-project.sh ${ARGUMENTS:-.}
else
    echo "Warning: detect-project.sh not found, using manual detection"
fi
Scale-Aware Scan Limits:
- TINY (<5 files): 1-2 files (50% max)
- SMALL (5-15 files): 2-3 files (10-20%)
- MEDIUM (15-50 files): 4-6 files (8-12%)
- LARGE (50-150 files): 6-10 files (4-7%)
- VERY_LARGE (>150 files): 10-15 files (3-7%)
→ See workflow.md#phase-1 for manual fallback
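As an illustration of the scale table above, the mapping from file count to scale and scan limit might look like the sketch below. `classify_scale` is a hypothetical name, and because the listed ranges share their boundary values (15, 50, 150), the sketch assumes each boundary belongs to the smaller scale.

```shell
# Hypothetical helper: print the scale label and scan limit for a file count.
# Assumption: shared boundaries (15, 50, 150) belong to the smaller scale.
classify_scale() {
    count="$1"
    if   [ "$count" -lt 5 ];   then echo "TINY 1-2"
    elif [ "$count" -le 15 ];  then echo "SMALL 2-3"
    elif [ "$count" -le 50 ];  then echo "MEDIUM 4-6"
    elif [ "$count" -le 150 ]; then echo "LARGE 6-10"
    else                            echo "VERY_LARGE 10-15"
    fi
}
```

Calling `classify_scale 42` prints `MEDIUM 4-6`, meaning a 42-file project would have 4-6 files scanned.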
Phase 2: High-Entropy File Prioritization (5-8 minutes)
Purpose: Scan highest information-density files first.
Scan Priority Order:
- Documentation (README.md, CLAUDE.md, docs/)
- Configuration (package.json, docker-compose.yml, etc.)
- Core Models (models/, entities/, domain/) - pick 2-3 only
- Entry Points (app.ts, routes/) - pick 1-2 examples
- Tests - pick 1-2 examples
Execute scanning:
# Use helper script if available
if [ -f ~/.claude/scripts/atlas/scan-entropy.sh ]; then
    bash ~/.claude/scripts/atlas/scan-entropy.sh ${ARGUMENTS:-.}
else
    echo "Warning: scan-entropy.sh not found, scanning manually"
fi
AI Tool Detection:
# Detect AI collaboration level (Tier 1 + Tier 2)
if [ -f ~/.claude/scripts/atlas/detect-ai-tools.sh ]; then
    bash ~/.claude/scripts/atlas/detect-ai-tools.sh ${ARGUMENTS:-.}
else
    # Fallback: manual checks
    ls -la CLAUDE.md .cursorrules .windsurfrules CONVENTIONS.md AGENTS.md .aiignore 2>/dev/null
    ls -la .claude/ .cursor/rules/ .windsurf/rules/ .clinerules/ .roo/ .continue/rules/ .ruler/ 2>/dev/null
fi
→ See workflow.md#phase-2 for manual commands
Phase 3: Generate Hypotheses (3-5 minutes)
Purpose: Generate scale-appropriate hypotheses with confidence levels and evidence.
Hypothesis Categories:
- Technology Stack: Languages, frameworks, databases, testing
- Architecture: Patterns, structure, layering
- Development Practices: Code quality, testing, documentation
- AI Collaboration: Tool detection (Level 0-4)
- Business Domain: Purpose, entities, features
Scale-Aware Targets:
- TINY: 5-8 hypotheses
- SMALL: 7-10 hypotheses
- MEDIUM: 10-15 hypotheses
- LARGE: 12-18 hypotheses
- VERY_LARGE: 15-20 hypotheses
Each hypothesis must include:
- hypothesis: Clear statement
- confidence: 0.0-1.0 (aim for >0.7)
- evidence: file:line references
- validation_method: How to verify
→ See workflow.md#phase-3 for detailed guidance
Output Format
Generate output with branded header, then YAML format:
🗺️ SourceAtlas: Overview
───────────────────────────────
📁 [project_name] │ [SCALE] ([file count] files)
Then YAML content with sections:
- metadata: project_name, scan_time, total_files, scanned_files, scan_ratio, project_scale, context
- project_fingerprint: project_type, scale, primary_language, framework, architecture
- tech_stack: backend, frontend (optional), infrastructure (optional)
- hypotheses: architecture, tech_stack, development, ai_collaboration, business
- scanned_files: List with file, reason, key_insight
- summary: understanding_depth, key_findings
→ See output-template.md for complete YAML structure and examples
Critical Rules
- Scale-Aware Scanning: Follow recommended file limits from Phase 1
- Exclude Common Bloat: Never scan .venv/, node_modules/, vendor/, __pycache__/, .git/
- Time Limit: Complete in 10-15 minutes (typically 0-5 minutes)
- Hypothesis Quality: Each must have confidence >0.7 and evidence
- Scale-Aware Targets: Use hypothesis targets appropriate for project scale
- No Deep Diving: Understand structure > implementation details
- STOP after Stage 0: Do not proceed to validation or git analysis
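The bloat-exclusion rule above can be applied when counting files manually. The following is a sketch only: `count_project_files` is a hypothetical helper, not one of the SourceAtlas scripts, and it prunes exactly the five directories named in the rule.

```shell
# Hypothetical helper: count regular files under a directory,
# skipping the bloat directories named in the Critical Rules.
count_project_files() {
    find "${1:-.}" \
        \( -name .venv -o -name node_modules -o -name vendor \
           -o -name __pycache__ -o -name .git \) -prune \
        -o -type f -print | wc -l | tr -d ' '
}
```

The `-prune` action stops `find` from descending into the matched directories, so their contents never reach the count.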
Handoffs Decision Rules
Follow Constitution Article VII: Handoffs Principles
⚠️ Choose ONE output, NOT both:
Case A - End (No Table): When any condition is met:
- Project too small: TINY (<5 files)
- Findings too vague: Cannot provide high confidence (>0.7) parameters
- Goal achieved: AI collaboration Level ≥3 and scale TINY/SMALL
Output:
✅ **Analysis sufficient** - Project is small, can read all files directly
Case B - Suggestions (Table): When project scale is large enough or clear next steps exist.
| Finding | Command | Parameter |
|---|---|---|
| Clear patterns | /sourceatlas:pattern | Pattern name |
| Complex architecture | /sourceatlas:flow | Entry point file |
| Scale ≥ LARGE | /sourceatlas:history | No parameters |
| High risk areas | /sourceatlas:impact | Risk file/module |
Format:
## Recommended Next
| # | Command | Purpose |
|---|---------|---------|
| 1 | `/sourceatlas:pattern "repository"` | Found Repository pattern in 15 files |
💡 Enter a number (e.g., `1`) or copy the command to execute
→ See reference.md#handoffs for detailed logic
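The Case A / Case B choice above can be read as a small decision function. This is a hedged sketch rather than the logic in reference.md: `handoff_case` and its three arguments (scale label, AI collaboration level, and a yes/no flag for whether any suggestion has high-confidence parameters) are hypothetical.

```shell
# Hypothetical helper: print "A" (end, no table) or "B" (suggestion table).
# Arguments: scale label, AI collaboration level (0-4), and "yes"/"no" for
# whether at least one suggestion has high-confidence (>0.7) parameters.
handoff_case() {
    scale="$1"; ai_level="$2"; confident="$3"
    if [ "$scale" = "TINY" ]; then
        echo "A"    # project too small
    elif [ "$confident" != "yes" ]; then
        echo "A"    # findings too vague
    elif [ "$ai_level" -ge 3 ] && [ "$scale" = "SMALL" ]; then
        echo "A"    # goal achieved (TINY is already handled above)
    else
        echo "B"
    fi
}
```

For instance, `handoff_case LARGE 1 yes` prints `B`, while `handoff_case SMALL 3 yes` prints `A`.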
Self-Verification Phase (REQUIRED)
Purpose: Prevent hallucinated file paths, incorrect counts, fictional configs. Execute AFTER output generation, BEFORE save.
Verification Steps:
Step V1: Extract Verifiable Claims
Extract from generated YAML:
- File paths (scanned_files[].file)
- Config files (tools_detected[].config_file)
- File count (metadata.total_files)
- Git branch (metadata.context.git_branch)
- Evidence references (hypotheses.*.evidence)
Step V2: Parallel Verification
Run ALL checks in parallel:
- Verify scanned files exist: test -f path
- Verify AI tool configs exist: test -f config
- Verify file count: ±10% tolerance
- Verify git branch: git branch --show-current
- Verify evidence files exist
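Two of the checks above, evidence references and the file-count tolerance, might be sketched as below. Both helpers are hypothetical. The sketch assumes evidence takes the form `file:line` (a bare path is also accepted), that paths contain no colons, and that the ±10% tolerance is measured against the actual count.

```shell
# Hypothetical helper: succeed if a "file:line" reference points at an
# existing file that has at least that many lines.
verify_evidence() {
    ref="$1"
    case "$ref" in
        *:*) file="${ref%%:*}"; line="${ref##*:}" ;;
        *)   [ -f "$ref" ]; return ;;   # bare path: existence check only
    esac
    [ -f "$file" ] || return 1
    total=$(awk 'END { print NR }' "$file")
    [ "$line" -ge 1 ] 2>/dev/null && [ "$line" -le "$total" ]
}

# Hypothetical helper: succeed if the claimed count is within 10% of the
# actual count (assumption: tolerance is relative to the actual count).
within_tolerance() {
    claimed="$1"; actual="$2"
    diff=$((claimed - actual))
    if [ "$diff" -lt 0 ]; then diff=$((0 - diff)); fi
    # Integer form of |claimed - actual| <= 0.10 * actual
    [ $((diff * 10)) -le "$actual" ]
}
```

A failing check returns a non-zero status, so the number of failures can be tallied before deciding how to proceed.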
Step V3: Handle Results
- ✅ All pass → Continue to output/save
- ⚠️ 1-2 fail → Correct claims, note in summary
- ❌ 3+ fail → Re-execute analysis phases
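The thresholds above map a failure count to one of three actions. A minimal sketch, assuming a hypothetical `verification_action` helper and hypothetical action labels:

```shell
# Hypothetical helper: map the number of failed checks to the next action.
verification_action() {
    failures="$1"
    if [ "$failures" -eq 0 ]; then
        echo "continue"      # all checks passed: proceed to output/save
    elif [ "$failures" -le 2 ]; then
        echo "correct"       # 1-2 failures: fix claims, note in summary
    else
        echo "re-execute"    # 3+ failures: rerun the analysis phases
    fi
}
```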
Step V4: Verification Summary
Add to footer:
If all passed:
✅ Verified: [N] scanned files, [M] config paths, file count
If corrected:
🔧 Self-corrected: [list corrections]
✅ Verified: [N] scanned files, [M] config paths, file count
→ See verification-guide.md for complete checklist and examples
Auto-Save (Default Behavior)
After verification passes, automatically:
- Create directory: mkdir -p .sourceatlas
- Save YAML output to:
  - Root: .sourceatlas/overview.yaml
  - Subdirectory: .sourceatlas/overview-[path].yaml
- Confirm: 💾 Saved to .sourceatlas/overview.yaml
→ See reference.md#auto-save for details
Advanced
- Scale-aware analysis: reference.md#scale-aware-analysis
- Helper scripts: reference.md#helper-scripts
- Cache behavior: reference.md#cache-behavior
- AI collaboration detection: reference.md#ai-collaboration-detection
- Information theory: reference.md#information-theory-principles
Output Header
Start your output with:
🗺️ SourceAtlas: Overview
───────────────────────────────
📁 [project_name] │ [SCALE] ([file count] files)
Then follow YAML structure in output-template.md.