reverse-engineer
Reverse Engineer (Route-Aware)
Step 2 of 6 in the Reverse Engineering to Spec-Driven Development process. The 6-step process: 1. Analyze, 2. Reverse Engineer (this skill), 3. Create Specs, 4. Gap Analysis, 5. Implementation Planning, 6. Implementation.
Estimated Time: 30-45 minutes
Prerequisites: Step 1 completed (analysis-report.md and route selection in .stackshift-state.json)
Output: 11 documentation files in docs/reverse-engineering/
Route-Dependent Behavior:
- Greenfield: Extract business logic only (framework-agnostic)
- Brownfield: Extract business logic + technical implementation details
Output is the same regardless of implementation framework (Spec Kit, BMAD, or BMAD Auto-Pilot). The framework choice only affects what happens after Step 2.
Configuration Check
Guard: Verify state file exists before proceeding.
# Abort early if Step 1 has not produced the state file.
if [ ! -f .stackshift-state.json ]; then
  echo "ERROR: .stackshift-state.json not found."
  echo "Step 1 (Initial Analysis) must be completed first. Run /stackshift.analyze to begin."
  exit 1
fi
# Read the detection type and route chosen in Step 1.
DETECTION_TYPE=$(jq -r '.detection_type' .stackshift-state.json)
ROUTE=$(jq -r '.route' .stackshift-state.json)
if [ "$DETECTION_TYPE" = "null" ] || [ -z "$DETECTION_TYPE" ]; then
  echo "ERROR: detection_type missing from state file. Re-run /stackshift.analyze."
  exit 1
fi
if [ "$ROUTE" = "null" ] || [ -z "$ROUTE" ]; then
  echo "ERROR: route missing from state file. Re-run /stackshift.analyze."
  exit 1
fi
echo "Detection: $DETECTION_TYPE"
echo "Route: $ROUTE"
# Resolve where specs are written; default to the current directory.
SPEC_OUTPUT=$(jq -r '.config.spec_output_location // "."' .stackshift-state.json)
# A leading "~/" read from JSON is not expanded by the shell, so expand it here.
case "$SPEC_OUTPUT" in
  "~"/*) SPEC_OUTPUT="$HOME/${SPEC_OUTPUT#??}" ;;
esac
echo "Writing specs to: $SPEC_OUTPUT"
if [ "$SPEC_OUTPUT" != "." ]; then
  mkdir -p "$SPEC_OUTPUT/docs/reverse-engineering"
  mkdir -p "$SPEC_OUTPUT/.specify/memory/specifications"
fi
State file structure:
{
  "detection_type": "monorepo-service",
  "route": "greenfield",
  "implementation_framework": "speckit",
  "config": {
    "spec_output_location": "~/git/my-new-app",
    "build_location": "~/git/my-new-app",
    "target_stack": "Next.js 15..."
  }
}
Capture commit hash for incremental updates:
COMMIT_HASH=$(git rev-parse HEAD 2>/dev/null || echo "unknown")
COMMIT_DATE=$(git log -1 --format=%ci 2>/dev/null || date -u +"%Y-%m-%d %H:%M:%S")
echo "Pinning docs to commit: $COMMIT_HASH"
Extraction approach based on detection + route:
| Detection Type | + Greenfield | + Brownfield |
|---|---|---|
| Monorepo Service | Business logic only (tech-agnostic) | Full implementation + shared packages (tech-prescriptive) |
| Nx App | Business logic only (framework-agnostic) | Full Nx/Angular implementation details |
| Generic App | Business logic only | Full implementation |
- detection_type determines WHAT patterns to look for (shared packages, Nx project config, monorepo structure, etc.)
- route determines HOW to document them (tech-agnostic vs tech-prescriptive)
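The split between the two fields can be sketched as a pair of lookups. This is a minimal illustration, not StackShift's own code: the function names are hypothetical, and the `nx-app` and `generic-app` identifiers are assumptions (only `monorepo-service` appears in the state file example).

```shell
# Hypothetical helper: the route decides HOW findings are documented.
extraction_style() {
  case "$1" in
    greenfield) echo "tech-agnostic" ;;
    brownfield) echo "tech-prescriptive" ;;
    *) echo "unknown route: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: the detection type decides WHAT extra patterns to look for.
# "nx-app" and "generic-app" are assumed identifiers, not taken from the source.
detection_focus() {
  case "$1" in
    monorepo-service) echo "shared packages, monorepo structure" ;;
    nx-app)           echo "Nx project config" ;;
    generic-app)      echo "no extra patterns" ;;
    *) echo "unknown detection type: $1" >&2; return 1 ;;
  esac
}
```

For example, `extraction_style "$ROUTE"` and `detection_focus "$DETECTION_TYPE"` could be called right after the configuration check to log the chosen approach.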
Phase 1: Deep Codebase Analysis
Use the Task tool with subagent_type=stackshift:stackshift-code-analyzer:AGENT to perform analysis. If the agent is unavailable, fall back to the Explore agent.
Error recovery: If a subagent fails or returns empty results for a sub-phase, retry once with the Explore agent. If the retry also fails, record the gap with an [ANALYSIS INCOMPLETE] marker and continue with remaining sub-phases.
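The retry-then-mark policy can be sketched in shell. Real subagent calls go through the Task tool rather than a shell command, so `run_subphase` and its command arguments are hypothetical stand-ins that only illustrate the control flow.

```shell
# run_subphase NAME PRIMARY_CMD FALLBACK_CMD OUTFILE   (hypothetical helper)
# Tries the primary command, then the fallback once. A command counts as
# failed if it exits non-zero or prints nothing. If both fail, the gap is
# recorded and the function still succeeds so other sub-phases can continue.
run_subphase() (
  name=$1 primary=$2 fallback=$3 outfile=$4
  for cmd in "$primary" "$fallback"; do
    if result=$($cmd 2>/dev/null) && [ -n "$result" ]; then
      printf '%s\n' "$result" > "$outfile"
      exit 0
    fi
  done
  printf '[ANALYSIS INCOMPLETE] %s: primary and fallback analysis both failed\n' \
    "$name" > "$outfile"
)
```

Returning success after writing the marker is deliberate: one missing sub-phase should not stop the rest of the run.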
Missing components: If a sub-phase finds no relevant code (e.g., no frontend in a backend-only service), document the absence in the corresponding output file rather than skipping the sub-phase.
Launch sub-phases 1.1 through 1.6 in parallel using separate subagent invocations. Collect all results before proceeding to Phase 2.
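The fan-out and collect shape looks roughly like the following. Subagent invocations are not shell commands, so this only mirrors the pattern with background jobs; `run_parallel` and the numbered output files are assumptions made for illustration.

```shell
# run_parallel OUTDIR CMD...   (hypothetical helper)
# Starts every command as a background job, writing each job's output to
# OUTDIR/<n>.out, then blocks until all jobs have exited. Prints the job count.
run_parallel() (
  outdir=$1; shift
  mkdir -p "$outdir"
  n=0
  for cmd in "$@"; do
    n=$((n + 1))
    $cmd > "$outdir/$n.out" 2>&1 &
  done
  wait          # collect: returns once every background job has finished
  echo "$n"
)
```

Nothing reads the result files until `wait` returns, which corresponds to collecting all results before Phase 2 starts.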
1.1 Backend Analysis
- Find all API endpoints and record their method, route, auth requirements, parameters, and purpose.
- Catalog every data model including schemas, types, interfaces, and field definitions.
- Inventory all configuration sources: env vars, config files, and settings.
- Map every external integration: APIs, services, and databases.
- Extract business logic from services, utilities, and algorithms.
1.2 Frontend Analysis
- List all pages and routes with their purpose and auth requirements.
- Catalog all components by category: layout, form, and UI components.
- Document state management: store structure and global state patterns.
- Map the API client layer: how the frontend calls the backend.
- Extract styling patterns: design system, themes, and component styles.
1.3 Infrastructure Analysis
- Document deployment configuration: IaC tools, cloud provider, and services.
- Map CI/CD pipelines and workflows.
- Catalog database setup: type, schema, and migrations.
- Identify storage systems: object storage, file systems, and caching.
1.4 Testing Analysis
- Locate all test files and identify the testing frameworks in use.
- Classify tests by type: unit, integration, and E2E.
- Estimate coverage percentages by module.
- Catalog test data: mocks, fixtures, and seed data.
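A first pass at locating and classifying tests could look like this sketch. The file-name patterns and the `e2e/` and `integration/` directory names are assumptions about common conventions; adjust them to what the codebase actually uses.

```shell
# classify_tests ROOT   (hypothetical helper)
# Prints "<type> <path>" for each file that looks like a test, based on
# assumed naming conventions. Anything not under e2e/ or integration/ is
# treated as a unit test.
classify_tests() (
  root=$1
  find "$root" -type f \
    \( -name '*.test.*' -o -name '*.spec.*' -o -name '*_test.go' -o -name 'test_*.py' \) \
    ! -path '*/node_modules/*' |
  while IFS= read -r f; do
    case "$f" in
      */e2e/*)         echo "e2e $f" ;;
      */integration/*) echo "integration $f" ;;
      *)               echo "unit $f" ;;
    esac
  done
)
```

Piping the output through `cut -d' ' -f1 | sort | uniq -c` gives a quick count per test type.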
1.5 Business Context Analysis
- Read README, CONTRIBUTING, and any marketing or landing pages.
- Extract package descriptions and repository metadata.
- Identify comment patterns indicating user-facing features.
- Collect error messages and user-facing strings for persona inference.
- Analyze naming conventions to reveal domain concepts.
- Examine git history for decision archaeology.
1.6 Decision Archaeology
- Inspect dependency manifests (package.json, go.mod, requirements.txt) for technology choices.
- Analyze config files (tsconfig, eslint, prettier) for design philosophy.
- Review CI/CD configuration for deployment decisions.
- Run git blame on key architectural files to identify decision points.
- Collect comments with "why" explanations (TODO, HACK, FIXME, NOTE).
- Look for rejected alternatives visible in git history or comments.
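Collecting the "why" comments can be approximated with a recursive grep. This sketch assumes a grep that supports `-r`, `-w`, and `--exclude-dir` (GNU and BSD grep both do); the function name and the excluded directories are illustrative choices.

```shell
# collect_why_comments DIR   (hypothetical helper)
# Prints file:line:text for every line containing TODO, HACK, FIXME, or NOTE
# as a whole word, skipping .git and node_modules. Succeeds even with no hits.
collect_why_comments() {
  grep -rnwE --exclude-dir=.git --exclude-dir=node_modules \
    'TODO|HACK|FIXME|NOTE' "$1" || true
}
```

The output still needs a manual read: a marker only flags a candidate, and the surrounding comment is what explains the decision.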
Progress signal: After all sub-phases complete, log: "Phase 1 complete: Analysis gathered for [list which sub-phases produced results]."
Phase 2: Generate Documentation
Create docs/reverse-engineering/ directory and generate all 11 documentation files. For each file, apply the greenfield or brownfield variant as described in operations/output-file-specs.md. Read that file now for the detailed per-file specifications.
If .stackshift-docs-meta.json already exists, overwrite it completely with fresh metadata.
Step 2.1: Write metadata file FIRST
COMMIT_HASH=$(git rev-parse HEAD 2>/dev/null || echo "unknown")
COMMIT_DATE=$(git log -1 --format=%ci 2>/dev/null || date -u +"%Y-%m-%d %H:%M:%S")
GENERATED_AT=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
Write docs/reverse-engineering/.stackshift-docs-meta.json:
{
  "commit_hash": "<COMMIT_HASH>",
  "commit_date": "<COMMIT_DATE>",
  "generated_at": "<GENERATED_AT>",
  "doc_count": 11,
  "route": "<greenfield|brownfield>",
  "docs": {
    "functional-specification.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "integration-points.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "configuration-reference.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "data-architecture.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "operations-guide.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "technical-debt-analysis.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "observability-requirements.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "visual-design-system.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "test-documentation.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "business-context.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" },
    "decision-rationale.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }
  }
}
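One way to produce this file from the shell is sketched below. `write_docs_meta` and the `DOCS` variable are hypothetical names. The sketch uses plain `printf` and does no JSON escaping, so it assumes the hash, date, and route contain no quotes or backslashes.

```shell
# The 11 doc names listed in the metadata file above.
DOCS="functional-specification.md integration-points.md configuration-reference.md
data-architecture.md operations-guide.md technical-debt-analysis.md
observability-requirements.md visual-design-system.md test-documentation.md
business-context.md decision-rationale.md"

# write_docs_meta DIR ROUTE COMMIT_HASH COMMIT_DATE   (hypothetical helper)
write_docs_meta() (
  dir=$1 route=$2 hash=$3 commit_date=$4
  generated_at=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
  set -- $DOCS            # split the list into positional parameters
  mkdir -p "$dir"
  {
    printf '{\n'
    printf '  "commit_hash": "%s",\n' "$hash"
    printf '  "commit_date": "%s",\n' "$commit_date"
    printf '  "generated_at": "%s",\n' "$generated_at"
    printf '  "doc_count": %d,\n' "$#"
    printf '  "route": "%s",\n' "$route"
    printf '  "docs": {\n'
    while [ "$#" -gt 0 ]; do
      printf '    "%s": { "generated_at": "%s", "commit_hash": "%s" }' \
        "$1" "$generated_at" "$hash"
      shift
      # Comma after every entry except the last, so the JSON stays valid.
      if [ "$#" -gt 0 ]; then printf ',\n'; else printf '\n'; fi
    done
    printf '  }\n}\n'
  } > "$dir/.stackshift-docs-meta.json"
)
```

A call such as `write_docs_meta docs/reverse-engineering "$ROUTE" "$COMMIT_HASH" "$COMMIT_DATE"` overwrites any existing metadata file, which matches the overwrite rule stated earlier.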
Step 2.2: Add metadata header to each doc
Every generated doc starts with this header after the title:
# [Document Title]
> **Generated by StackShift** | Commit: `<short-hash>` | Date: `<GENERATED_AT>`
> Run `/stackshift.refresh-docs` to update with latest changes.
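A small helper can emit this header consistently for every doc. `doc_header` is a hypothetical name, and truncating the hash to 7 characters is an assumption about what `<short-hash>` means.

```shell
# doc_header TITLE COMMIT_HASH GENERATED_AT   (hypothetical helper)
# Prints the title line followed by the two-line StackShift metadata header.
doc_header() (
  title=$1 hash=$2 generated_at=$3
  short=$(printf '%s' "$hash" | cut -c1-7)   # assumed short-hash length
  printf '# %s\n' "$title"
  printf '> **Generated by StackShift** | Commit: `%s` | Date: `%s`\n' \
    "$short" "$generated_at"
  printf '> Run `/stackshift.refresh-docs` to update with latest changes.\n'
)
```

For instance, `doc_header "Functional Specification" "$COMMIT_HASH" "$GENERATED_AT" > functional-specification.md` would start a file that the body is then appended to.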
Step 2.3: Generate files with checkpoints
Generate files in this order, logging progress after each:
Batch 1 (core architecture):
1. functional-specification.md
2. data-architecture.md
3. integration-points.md
4. configuration-reference.md
After writing files 1-4, log: "Generated 4/11 files (core architecture complete)." Verify the output directory contains 4 documentation files (not counting .stackshift-docs-meta.json) before continuing.
Batch 2 (operations and quality):
5. operations-guide.md
6. technical-debt-analysis.md
7. observability-requirements.md
8. visual-design-system.md
9. test-documentation.md
After writing files 5-9, log: "Generated 9/11 files (operations and quality complete)." Verify the output directory contains 9 documentation files (not counting .stackshift-docs-meta.json) before continuing.
Batch 3 (context and decisions):
10. business-context.md
11. decision-rationale.md
After writing files 10-11, log: "Generated 11/11 files. Phase 2 complete."
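The per-batch verification can be sketched as a file count. Both function names are hypothetical. Only `*.md` files are counted, so the hidden metadata file does not affect the total.

```shell
# count_docs DIR   (hypothetical helper)
# Prints the number of .md files directly inside DIR.
count_docs() (
  set -- "$1"/*.md
  [ -e "$1" ] || set --     # the glob matched nothing
  echo "$#"
)

# verify_checkpoint DIR EXPECTED   (hypothetical helper)
# Fails with a message on stderr unless DIR holds exactly EXPECTED docs.
verify_checkpoint() (
  dir=$1 expected=$2
  actual=$(count_docs "$dir")
  if [ "$actual" -ne "$expected" ]; then
    echo "Checkpoint failed: expected $expected docs in $dir, found $actual" >&2
    exit 1
  fi
  echo "Checkpoint passed: $actual/11 documentation files present."
)
```

Calling `verify_checkpoint docs/reverse-engineering 4` after Batch 1, then with 9 and 11 after the later batches, follows the checkpoints described above.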
Output structure:
docs/reverse-engineering/
├── .stackshift-docs-meta.json
├── functional-specification.md
├── integration-points.md
├── configuration-reference.md
├── data-architecture.md
├── operations-guide.md
├── technical-debt-analysis.md
├── observability-requirements.md
├── visual-design-system.md
├── test-documentation.md
├── business-context.md
└── decision-rationale.md
Success Criteria
- All 11 documentation files generated in docs/reverse-engineering/
- Comprehensive coverage of all application aspects
- Framework-agnostic functional specification (for greenfield)
- Complete data model documentation
- Business context captured with clear [INFERRED] / [NEEDS USER INPUT] markers
- Decision rationale documented with ADR format
- Integration points fully mapped with data flow diagrams
- .stackshift-docs-meta.json created with commit hash for incremental updates
- Each doc has metadata header with commit hash and generation date
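The metadata-header criterion is easy to check mechanically. In this sketch, `missing_headers` is a hypothetical helper, and it assumes the header line starts exactly as shown in Step 2.2.

```shell
# missing_headers DIR   (hypothetical helper)
# Prints the path of every .md file in DIR that lacks the StackShift
# metadata header line. Prints nothing when all docs carry the header.
missing_headers() (
  for f in "$1"/*.md; do
    [ -e "$f" ] || continue
    grep -q '^> \*\*Generated by StackShift\*\* | Commit: `' "$f" || echo "$f"
  done
)
```

Empty output from `missing_headers docs/reverse-engineering` means the header criterion is met. The coverage and accuracy criteria still need human review.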
Next Step
Once all documentation is generated:
For GitHub Spec Kit (implementation_framework: speckit):
Proceed to Step 3 -- use /stackshift.create-specs to transform docs into .specify/ specs.
For BMAD Method (implementation_framework: bmad):
Proceed to Step 6 -- hand off to BMAD's *workflow-init. BMAD's PM and Architect agents use the reverse-engineering docs as context.
For BMAD Auto-Pilot (implementation_framework: bmad-autopilot):
Proceed to /stackshift.bmad-synthesize to auto-generate BMAD artifacts. The 11 reverse-engineering docs provide ~90% of what BMAD needs.
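The branching above amounts to a lookup on implementation_framework. A minimal sketch follows; `next_step` is a hypothetical function name, while the framework values and commands are the ones listed above.

```shell
# next_step FRAMEWORK   (hypothetical helper)
# Prints the command or workflow to run after Step 2 for a given
# implementation_framework value.
next_step() {
  case "$1" in
    speckit)        echo "/stackshift.create-specs" ;;
    bmad)           echo "*workflow-init" ;;
    bmad-autopilot) echo "/stackshift.bmad-synthesize" ;;
    *) echo "unknown implementation_framework: $1" >&2; return 1 ;;
  esac
}
```

It could be driven from the state file with `next_step "$(jq -r '.implementation_framework' .stackshift-state.json)"`.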
DO / DON'T
DO:
- Describe WHAT the system does, not HOW (especially for greenfield)
- Use all available signals for inference: README, comments, naming, config, git history
- Mark confidence levels: no marker = confident, [INFERRED] = reasonable inference, [NEEDS USER INPUT] = genuinely unknown
- Cross-reference between docs (e.g., tech debt informs trade-offs)
- Cite specific evidence for each inference
DON'T:
- Hard-code framework names in functional specs (greenfield)
- Mix business logic with technical implementation (greenfield)
- Fabricate business goals with no supporting evidence
- State inferences as facts without marking them
- Skip a section because it requires inference -- attempt it and mark confidence
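A rough tally of confidence markers can show how much of the output rests on inference. This sketch uses a hypothetical helper name and assumes a grep with `-o` support (GNU and BSD grep both provide it).

```shell
# marker_counts DIR   (hypothetical helper)
# Prints how many [INFERRED] and [NEEDS USER INPUT] markers appear across
# all .md files in DIR.
marker_counts() (
  dir=$1
  for marker in 'INFERRED' 'NEEDS USER INPUT'; do
    n=$(cat "$dir"/*.md 2>/dev/null | grep -oF "[$marker]" | wc -l | tr -d ' ')
    echo "[$marker]: $n"
  done
)
```

A count of zero for both markers in business-context.md is a reason to re-read the file, since business context is the area most likely to need inference.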
Completeness Checklist
Verify analysis captured:
- ALL API endpoints (not just the obvious ones)
- ALL data models (including DTOs, types, interfaces)
- ALL configuration options (check multiple files)
- ALL external integrations
- ALL user-facing strings and error messages (for persona/context inference)
- ALL config files (for decision rationale inference)
Each document must be comprehensive, accurate, organized, actionable, and honest about inferred vs verified information.