SourceAtlas: Project Overview (Stage 0 Fingerprint)

Constitution: ANALYSIS_CONSTITUTION.md v1.0

Context

Arguments: ${ARGUMENTS:-.}

Goal: Generate a project fingerprint by scanning fewer than 5% of files, aiming for 70-80% understanding in 10-15 minutes.

Auto-Save: Results are saved automatically to .sourceatlas/overview.yaml (or to a subdirectory-specific path).

Time Limit: 10-15 minutes (most runs finish within 5 minutes)


Cache Check (Highest Priority)

If --force is NOT in arguments, check cache first:

  1. Calculate cache path:

    • No path argument or .: .sourceatlas/overview.yaml
    • With path (e.g., src/api): .sourceatlas/overview-src-api.yaml
  2. Check if cache exists:

    ls -la .sourceatlas/overview.yaml 2>/dev/null
    
  3. If cache exists:

    • Calculate days since modification
    • Use Read tool to read cache
    • Output:
      📁 Loading cache: .sourceatlas/overview.yaml (N days ago)
      💡 Add --force to re-analyze
      
    • If over 30 days: Show warning
    • Output cache content
    • End, do not execute analysis
  4. If cache does not exist: Continue with analysis

If --force is in arguments: Skip cache, execute analysis
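
The cache check above can be sketched in shell roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names `cache_path`, `cache_age_days`, and `use_cache` are hypothetical, the handling of a leading `./` or trailing `/` is an assumption, and the printed messages are plain-text stand-ins for the output lines specified above.

```shell
#!/usr/bin/env bash
# Sketch only: cache_path, cache_age_days and use_cache are hypothetical names.

# Map the path argument to its cache file, following the two documented cases:
#   "" or "."  -> .sourceatlas/overview.yaml
#   "src/api"  -> .sourceatlas/overview-src-api.yaml
cache_path() {
    local target="${1:-.}"
    target="${target#./}"    # assumption: "./src/api" is treated like "src/api"
    target="${target%/}"     # assumption: "src/api/" is treated like "src/api"
    if [ -z "$target" ] || [ "$target" = "." ]; then
        echo ".sourceatlas/overview.yaml"
    else
        echo ".sourceatlas/overview-${target//\//-}.yaml"
    fi
}

# Print whole days since the file was last modified.
cache_age_days() {
    local file="$1" mtime now
    # GNU stat uses -c %Y; BSD/macOS stat uses -f %m.
    mtime=$(stat -c %Y "$file" 2>/dev/null || stat -f %m "$file" 2>/dev/null) || return 1
    now=$(date +%s)
    echo $(( (now - mtime) / 86400 ))
}

# Return 0 when the cached result should be shown instead of re-analyzing.
use_cache() {
    local target="." arg file age
    for arg in "$@"; do
        case "$arg" in
            --force) return 1 ;;        # explicit re-analysis requested
            *)       target="$arg" ;;
        esac
    done
    file=$(cache_path "$target")
    [ -f "$file" ] || return 1          # no cache: continue with analysis
    age=$(cache_age_days "$file") || return 1
    echo "Loading cache: $file ($age days ago)"
    echo "Add --force to re-analyze"
    if [ "$age" -gt 30 ]; then
        echo "Warning: cache is more than 30 days old"
    fi
    return 0
}
```

A caller would run something like `if use_cache "$@"; then cat "$(cache_path "$1")"; exit 0; fi` before starting Phase 1.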


Your Task

Execute Stage 0 Analysis Only - generate project fingerprint using information theory principles.

Information Theory Approach:

  • High-entropy files contain disproportionate information
  • Scan priority: Documentation → Configuration → Models → Entry Points → Tests
  • Scale-aware: TINY/SMALL/MEDIUM/LARGE/VERY_LARGE projects need different approaches

Core Workflow

Execute these phases in order. See workflow.md for complete details.

Phase 1: Project Detection & Scale-Aware Planning (2-3 minutes)

Purpose: Detect project type, count files, determine scale, set scan limits.

Execute detection:

# Try helper script first (recommended)
if [ -f ~/.claude/scripts/atlas/detect-project.sh ]; then
    bash ~/.claude/scripts/atlas/detect-project.sh ${ARGUMENTS:-.}
elif [ -f scripts/atlas/detect-project.sh ]; then
    bash scripts/atlas/detect-project.sh ${ARGUMENTS:-.}
else
    echo "Warning: detect-project.sh not found, using manual detection"
fi

Scale-Aware Scan Limits:

  • TINY (<5 files): 1-2 files (50% max)
  • SMALL (5-15 files): 2-3 files (10-20%)
  • MEDIUM (15-50 files): 4-6 files (8-12%)
  • LARGE (50-150 files): 6-10 files (4-7%)
  • VERY_LARGE (>150 files): 10-15 files (3-7%)

→ See workflow.md#phase-1 for manual fallback
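
As a rough illustration of the scale table, the sketch below counts files and maps the count to a scale and a scan limit. It is not the contents of detect-project.sh or workflow.md: the function names are hypothetical, counting every regular file outside the excluded directories is an assumption, and because the documented ranges share the boundary values 15 and 50, the sketch assumes those counts belong to the larger scale.

```shell
#!/usr/bin/env bash
# Sketch only: count_files, classify_scale and scan_limit are hypothetical names.

# Count regular files, pruning the directories the Critical Rules exclude.
count_files() {
    find "${1:-.}" \
        \( -name .git -o -name node_modules -o -name vendor \
           -o -name .venv -o -name __pycache__ \) -prune \
        -o -type f -print | wc -l | tr -d ' '
}

# Map a file count to a scale label.
# Assumption: shared boundaries (15, 50) fall into the larger scale;
# 150 stays LARGE because VERY_LARGE is documented as ">150".
classify_scale() {
    local n="$1"
    if   [ "$n" -lt 5 ];   then echo TINY
    elif [ "$n" -lt 15 ];  then echo SMALL
    elif [ "$n" -lt 50 ];  then echo MEDIUM
    elif [ "$n" -le 150 ]; then echo LARGE
    else                        echo VERY_LARGE
    fi
}

# Look up the documented number of files to scan for a scale.
scan_limit() {
    case "$1" in
        TINY)       echo "1-2" ;;
        SMALL)      echo "2-3" ;;
        MEDIUM)     echo "4-6" ;;
        LARGE)      echo "6-10" ;;
        VERY_LARGE) echo "10-15" ;;
        *)          return 1 ;;
    esac
}
```

For example, `scale=$(classify_scale "$(count_files .)")` followed by `scan_limit "$scale"` gives the file budget for Phase 2.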

Phase 2: High-Entropy File Prioritization (5-8 minutes)

Purpose: Scan highest information-density files first.

Scan Priority Order:

  1. Documentation (README.md, CLAUDE.md, docs/)
  2. Configuration (package.json, docker-compose.yml, etc.)
  3. Core Models (models/, entities/, domain/) - pick 2-3 only
  4. Entry Points (app.ts, routes/) - pick 1-2 examples
  5. Tests - pick 1-2 examples

Execute scanning:

# Use helper script if available
if [ -f ~/.claude/scripts/atlas/scan-entropy.sh ]; then
    bash ~/.claude/scripts/atlas/scan-entropy.sh ${ARGUMENTS:-.}
else
    echo "Warning: scan-entropy.sh not found, scanning manually"
fi

AI Tool Detection:

# Detect AI collaboration level (Tier 1 + Tier 2)
if [ -f ~/.claude/scripts/atlas/detect-ai-tools.sh ]; then
    bash ~/.claude/scripts/atlas/detect-ai-tools.sh ${ARGUMENTS:-.}
else
    # Fallback: manual checks
    ls -la CLAUDE.md .cursorrules .windsurfrules CONVENTIONS.md AGENTS.md .aiignore 2>/dev/null
    ls -la .claude/ .cursor/rules/ .windsurf/rules/ .clinerules/ .roo/ .continue/rules/ .ruler/ 2>/dev/null
fi

→ See workflow.md#phase-2 for manual commands

Phase 3: Generate Hypotheses (3-5 minutes)

Purpose: Generate scale-appropriate hypotheses with confidence levels and evidence.

Hypothesis Categories:

  • Technology Stack: Languages, frameworks, databases, testing
  • Architecture: Patterns, structure, layering
  • Development Practices: Code quality, testing, documentation
  • AI Collaboration: Tool detection (Level 0-4)
  • Business Domain: Purpose, entities, features

Scale-Aware Targets:

  • TINY: 5-8 hypotheses
  • SMALL: 7-10 hypotheses
  • MEDIUM: 10-15 hypotheses
  • LARGE: 12-18 hypotheses
  • VERY_LARGE: 15-20 hypotheses

Each hypothesis must include:

  • hypothesis: Clear statement
  • confidence: 0.0-1.0 (aim for >0.7)
  • evidence: file:line references
  • validation_method: How to verify

→ See workflow.md#phase-3 for detailed guidance
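
To make the four required fields concrete, here is a small sketch that prints one hypothesis as a YAML list item and rejects confidence values outside the documented 0.0-1.0 range. The function names are hypothetical and the exact YAML layout is an assumption; output-template.md remains the authoritative structure. The sample values in the last lines are placeholders, not findings from any real project.

```shell
#!/usr/bin/env bash
# Sketch only: emit_hypothesis and is_high_confidence are hypothetical names,
# and the YAML layout is assumed rather than taken from output-template.md.

# Succeeds when the confidence is above the 0.7 target.
is_high_confidence() {
    awk -v c="$1" 'BEGIN { exit !(c > 0.7) }'
}

# Print one hypothesis with the four required fields.
# Note: values are not YAML-escaped; quotes inside them would need escaping.
emit_hypothesis() {
    local statement="$1" confidence="$2" evidence="$3" validation="$4"
    # awk handles the floating-point comparison; plain [ ] is integer-only.
    awk -v c="$confidence" \
        'BEGIN { exit !(c ~ /^[0-9]*\.?[0-9]+$/ && c >= 0 && c <= 1) }' || {
        echo "confidence must be between 0.0 and 1.0: $confidence" >&2
        return 1
    }
    printf -- '- hypothesis: "%s"\n'        "$statement"
    printf    '  confidence: %s\n'          "$confidence"
    printf    '  evidence: "%s"\n'          "$evidence"
    printf    '  validation_method: "%s"\n' "$validation"
}

# Placeholder values for illustration only.
emit_hypothesis "Backend uses a layered architecture" 0.8 \
    "src/app.ts:10" "Trace one request from route to model"
```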


Output Format

Generate output with branded header, then YAML format:

🗺️ SourceAtlas: Overview
───────────────────────────────
🔭 [project_name] │ [SCALE] ([file count] files)

Then YAML content with sections:

  • metadata: project_name, scan_time, total_files, scanned_files, scan_ratio, project_scale, context
  • project_fingerprint: project_type, scale, primary_language, framework, architecture
  • tech_stack: backend, frontend (optional), infrastructure (optional)
  • hypotheses: architecture, tech_stack, development, ai_collaboration, business
  • scanned_files: List with file, reason, key_insight
  • summary: understanding_depth, key_findings

→ See output-template.md for complete YAML structure and examples


Critical Rules

  1. Scale-Aware Scanning: Follow recommended file limits from Phase 1
  2. Exclude Common Bloat: Never scan .venv/, node_modules/, vendor/, __pycache__/, .git/
  3. Time Limit: Complete in 10-15 minutes (most runs finish within 5 minutes)
  4. Hypothesis Quality: Each must have confidence >0.7 and evidence
  5. Scale-Aware Targets: Use hypothesis targets appropriate for project scale
  6. No Deep Diving: Understand structure > implementation details
  7. STOP after Stage 0: Do not proceed to validation or git analysis

Handoffs Decision Rules

Follow Constitution Article VII: Handoffs Principles

⚠️ Choose ONE output, NOT both:

Case A - End (No Table): When any condition is met:

  • Project too small: TINY (<10 files)
  • Findings too vague: Cannot provide high confidence (>0.7) parameters
  • Goal achieved: AI collaboration Level ≥3 and scale TINY/SMALL

Output:

✅ **Analysis sufficient** - Project is small, can read all files directly

Case B - Suggestions (Table): When project scale is large enough or clear next steps exist.

| Finding | Command | Parameter |
|---------|---------|-----------|
| Clear patterns | /sourceatlas:pattern | Pattern name |
| Complex architecture | /sourceatlas:flow | Entry point file |
| Scale ≥ LARGE | /sourceatlas:history | No parameters |
| High risk areas | /sourceatlas:impact | Risk file/module |

Format:

## Recommended Next

| # | Command | Purpose |
|---|---------|---------|
| 1 | `/sourceatlas:pattern "repository"` | Found Repository pattern in 15 files |

💡 Enter a number (e.g., `1`) or copy the command to execute

→ See reference.md#handoffs for detailed logic
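
The choice between Case A and Case B can be read as a small decision function. The sketch below is one possible reading, not the logic in reference.md: the function name `handoff_case` and its three inputs are hypothetical, and treating Case B as the default when no Case A condition holds is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: handoff_case is a hypothetical name.
# Usage: handoff_case SCALE AI_LEVEL HIGH_CONFIDENCE
#   SCALE            TINY | SMALL | MEDIUM | LARGE | VERY_LARGE
#   AI_LEVEL         AI collaboration level, 0-4
#   HIGH_CONFIDENCE  "yes" if at least one suggestion has confidence > 0.7
handoff_case() {
    local scale="$1" ai_level="$2" high_confidence="$3"

    # Project too small: end without a table.
    if [ "$scale" = "TINY" ]; then
        echo A; return
    fi
    # Findings too vague: no high-confidence parameters to suggest.
    if [ "$high_confidence" != "yes" ]; then
        echo A; return
    fi
    # Goal achieved: AI collaboration Level >= 3 on a TINY or SMALL project.
    if [ "$ai_level" -ge 3 ] && { [ "$scale" = "TINY" ] || [ "$scale" = "SMALL" ]; }; then
        echo A; return
    fi
    # Assumption: anything else has clear next steps, so emit the table.
    echo B
}
```

With this reading, `handoff_case LARGE 1 yes` prints `B` and the suggestions table follows, while `handoff_case SMALL 3 yes` prints `A` and the analysis ends.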


Self-Verification Phase (REQUIRED)

Purpose: Prevent hallucinated file paths, incorrect counts, fictional configs. Execute AFTER output generation, BEFORE save.

Verification Steps:

Step V1: Extract Verifiable Claims

Extract from generated YAML:

  • File paths (scanned_files[].file)
  • Config files (tools_detected[].config_file)
  • File count (metadata.total_files)
  • Git branch (metadata.context.git_branch)
  • Evidence references (hypotheses.*.evidence)

Step V2: Parallel Verification

Run ALL checks in parallel:

  • Verify scanned files exist: test -f path
  • Verify AI tool configs exist: test -f config
  • Verify file count: ±10% tolerance
  • Verify git branch: git branch --show-current
  • Verify evidence files exist

Step V3: Handle Results

  • ✅ All pass → Continue to output/save
  • ⚠️ 1-2 fail → Correct claims, note in summary
  • ❌ 3+ fail → Re-execute analysis phases

Step V4: Verification Summary

Add to footer:

If all passed:

✅ Verified: [N] scanned files, [M] config paths, file count

If corrected:

🔧 Self-corrected: [list corrections]
✅ Verified: [N] scanned files, [M] config paths, file count

→ See verification-guide.md for complete checklist and examples
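
The verification steps could look roughly like the sketch below. It is illustrative only and does not reproduce verification-guide.md: the function names are hypothetical, the ±10% tolerance is assumed to be measured against the actual file count, and the checks run one after another for readability even though the steps above ask for them to run in parallel.

```shell
#!/usr/bin/env bash
# Sketch only: all function names here are hypothetical.

# Count how many of the given paths do not exist as regular files.
count_missing() {
    local missing=0 path
    for path in "$@"; do
        [ -f "$path" ] || missing=$((missing + 1))
    done
    echo "$missing"
}

# Succeed when the claimed count is within 10% of the actual count.
# Assumption: the tolerance is relative to the actual count.
within_tolerance() {
    local claimed="$1" actual="$2" diff
    diff=$(( claimed - actual ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    [ $(( diff * 10 )) -le "$actual" ]
}

# Succeed when the claimed branch matches the current git branch.
branch_matches() {
    [ "$(git branch --show-current 2>/dev/null)" = "$1" ]
}

# Map the number of failed checks to the documented action.
verdict() {
    case "$1" in
        0)   echo "pass" ;;      # continue to output/save
        1|2) echo "correct" ;;   # fix the claims, note in summary
        *)   echo "re-run" ;;    # re-execute the analysis phases
    esac
}
```

A driver would add up the failures from each check and pass the total to `verdict` before deciding whether to save.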


Auto-Save (Default Behavior)

After verification passes, automatically:

  1. Create directory: mkdir -p .sourceatlas
  2. Save YAML output to:
    • Root: .sourceatlas/overview.yaml
    • Subdirectory: .sourceatlas/overview-[path].yaml
  3. Confirm: 💾 Saved to .sourceatlas/overview.yaml

→ See reference.md#auto-save for details

