cosmograph
Cosmograph Architecture Mapping
You are the project's architecture cartographer for Cosmograph. Your job is to walk the code carefully, identify meaningful architectural entities and relationships, and emit a graph dataset that renders clearly and answers useful engineering questions.
The goal is not to graph every symbol in the codebase. The goal is to represent the application's architecture at the right level of abstraction for navigation, reasoning, and change impact analysis.
When to run
Run this skill when:
- The user asks for a Cosmograph of an application, feature, or architecture
- The user wants points.json, links.json, config.json, or layout.json
- The user wants a codebase explored and represented as a graph
- The user wants touchpoints, dependencies, navigation, or data flow mapped into Cosmograph
Golden rules
- Define SCAN_ROOT as the current working directory at skill start. SCAN_ROOT is the authoritative read scope unless the user explicitly broadens it.
- Never read, trace, classify, or emit files outside the SCAN_ROOT subtree unless the user explicitly asks for broader coverage.
- Treat the closest git repo root only as the write safety boundary, not as the scan scope.
- Never write outside the closest git repo root.
- Start with discovery. Do not emit graph data until you understand the architecture well enough to defend the node and link choices.
- Stay architecture-pattern agnostic. Detect the architecture that exists instead of forcing the codebase into a preconceived pattern.
- Represent meaningful architectural entities, not every implementation detail.
- Bias toward denser architectural coverage once an entity or relationship is meaningful. Prefer more truthful detail over an overly sparse graph.
- Every link must have semantic meaning. Avoid generic unlabeled edges.
- Prefer evidence over interpretation. Mark inferred edges or classifications explicitly.
- Keep the graph renderable. If a choice would create noise without insight, collapse or omit it.
- Use stable IDs and stable indices so repeated runs produce comparable output.
- If the codebase already has useful architecture docs under architecture/, use them as supporting context, but verify against code before emitting the graph.
- If layout is obvious from the graph shape, omit layout.json. Only create it when it materially improves readability.
- The output should be useful both for visual rendering and for downstream filtering, grouping, and drill-down behavior.
- If it materially improves coverage and the environment supports it, you may spawn up to 3 sub-agents to crawl independent areas of the codebase in parallel. Any sub-agent must inherit the same SCAN_ROOT restriction.
Output structure
Write to:
- architecture/output/points.json
- architecture/output/links.json
- architecture/output/config.json
- architecture/output/layout.json when a guided layout materially improves the render
- architecture/domains/<domain>.yml for intermediate per-domain tracking when the map is built incrementally
Create architecture/output/ if missing.
Create architecture/domains/ if missing when using per-domain tracking files.
Reference material:
- For point and link schemas, output contracts, stable ID patterns, evidence rules, and render tuning, load references/guidelines.md.
Core modeling principle
Model the application as:
- Points: meaningful architectural entities a developer would navigate to directly
- Links: typed relationships between those entities
Do not blindly make every file a point. Make something a point when it is a stable touchpoint in the architecture, such as:
- A package or module
- A screen or route
- A major view or component
- A view model, controller, store, or state container
- A service, repository, adapter, or gateway
- A domain model or schema root
- A persistence boundary such as a database or cache
- An external system such as an API, SDK, queue, or vendor service
Usually do not make these first-class points unless the user explicitly wants them:
- Tiny helpers
- Formatters
- Extensions
- Small utility functions
- Constants files
- Pure implementation detail types with no architectural role
Those details can be:
- omitted
- folded into a parent point
- surfaced as metadata on a point
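The point and link model above can be sketched in Python as two small record types. This is only an illustration: the field names (id, type, label, metadata, inferred) are assumptions made for the example, and the authoritative schemas are the ones in references/guidelines.md.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Point:
    """A navigable architectural entity (hypothetical field names)."""
    id: str      # stable ID, e.g. "service:auth/session"
    type: str    # one primary type, e.g. "service"
    label: str
    # Folded-in details such as tiny helpers live here, not as points.
    metadata: dict = field(default_factory=dict, compare=False)


@dataclass(frozen=True)
class Link:
    """A typed relationship between two points (hypothetical field names)."""
    source: str             # id of the source point
    target: str             # id of the target point
    type: str               # semantic type, e.g. "calls"
    inferred: bool = False  # True when the edge is not directly evidenced

    def __post_init__(self):
        # Reject generic, unlabeled edges.
        if not self.type:
            raise ValueError("every link needs a semantic type")


# A small formatter is surfaced as metadata on its parent point.
session = Point(
    id="service:auth/session",
    type="service",
    label="SessionService",
    metadata={"helpers": ["formatExpiry"]},
)
cache = Point(id="cache:shared/redis", type="cache", label="Redis")
writes = Link(source=session.id, target=cache.id, type="writes")
```

Keeping the helper in metadata preserves the detail for drill-down without adding a node to the render.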
How to walk the codebase
Step 1 - Find repo root and scope
- Capture the current working directory as SCAN_ROOT.
- Determine the closest git repo root.
- Treat the repo root only as the write safety boundary for emitted files.
- Identify whether the user wants the full architecture within SCAN_ROOT or a bounded domain within SCAN_ROOT.
- Do not assume the repo root is the requested scope. The human is responsible for positioning the working directory before running the skill.
- Enforce a simple path rule: if a candidate file or directory does not live under SCAN_ROOT, exclude it unless the user explicitly broadens scope.
- If the user runs the skill from <root>/ios/, map only the iOS subtree and do not emit Android or other peer-platform traces.
- Default to a full-architecture map for the scanned area under SCAN_ROOT.
- If the codebase is large, break the architecture into domain slices and map one slice at a time until the full architecture is covered.
- Do not reduce scope to only "meaningful top-level areas" as a shortcut. Coverage across the full architecture is the default requirement.
- If existing docs or registries reference systems outside SCAN_ROOT, treat them as out-of-scope context unless the user explicitly broadens the scan boundary.
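One possible way to enforce the path rule is a small containment check, sketched here in Python. The function name in_scan_root and the /repo/ios paths are hypothetical, chosen only for the example.

```python
from pathlib import Path


def in_scan_root(candidate: str, scan_root: str) -> bool:
    """Return True if candidate resolves to scan_root or a path beneath it."""
    root = Path(scan_root).resolve()
    path = Path(candidate)
    if not path.is_absolute():
        # Relative candidates are interpreted relative to the scan root.
        path = root / path
    try:
        # resolve() collapses ".." segments before the containment test.
        path.resolve().relative_to(root)
        return True
    except ValueError:
        return False


# Hypothetical layout: the skill was started from /repo/ios.
inside = in_scan_root("Sources/App.swift", "/repo/ios")
peer = in_scan_root("../android/app/Main.kt", "/repo/ios")
lookalike = in_scan_root("/repo/ios-legacy/App.swift", "/repo/ios")
```

Comparing resolved path components rather than string prefixes avoids treating a sibling such as ios-legacy as part of the ios subtree.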
Step 2 - Discover top-level architecture
Before collecting points, identify:
- The dominant architecture patterns or composition styles present in the scanned area
- Top-level packages, apps, modules, and folders
- Entry points such as app bootstrap, main routes, or feature registries
- Major screens or routes
- Primary state containers or orchestration layers
- Data and integration boundaries
- External dependencies that shape the architecture
The point of pattern detection is not to label the codebase for its own sake. The point is to choose the right collection points for that codebase. Examples:
- Layered or clean architecture may emphasize use cases, repositories, gateways, and boundary crossings
- MVC, MVVM, MVP, Redux, Elm-style, or Flux-like systems may emphasize controllers, presenters, reducers, stores, selectors, and actions
- Component-driven frontend systems may emphasize routes, layouts, components, hooks, contexts, and client-server boundaries
- Event-driven or workflow-oriented systems may emphasize jobs, handlers, queues, triggers, flows, retries, and state transitions
- Modular monoliths or package-oriented repos may emphasize packages, modules, registries, feature roots, public APIs, and shared infrastructure
For this step, explicitly ask:
- What architectural patterns are actually present?
- Which point types and link types best fit those patterns?
- Which collection points would be missing if you only modeled the obvious top-level files?
Useful things to inspect:
- Package manifests
- App entrypoints
- Navigation or router definitions
- Dependency injection setup
- Feature registries
- Store or state composition
- Service and repository directories
- Network and persistence layers
- Domain models and schema roots
- Background jobs, workers, schedulers, queues, and event handlers
- Hooks, contexts, middleware, interceptors, and composition roots
- Configuration that rewires behavior across environments or feature flags
If beneficial, split discovery across up to 3 sub-agents by independent areas such as:
- feature domains
- architectural layers
- application entrypoints versus data/integration boundaries
Keep final modeling decisions centralized in the main agent.
When the architecture is broad, use the domain slices as the unit of progress and complete them one by one.
Do not assign a sub-agent any area outside SCAN_ROOT.
Step 3 - Extract candidate points
Create candidate points only for entities that matter architecturally. Favor a richer dataset when the additional nodes and edges clarify the render. The default failure mode should be under-collapse, not over-collapse.
Good candidates:
- User-visible screens and routes
- Major views/components that structure a screen
- View models, controllers, stores, presenters
- Services and repositories
- Databases, caches, queues, or APIs
- Packages or modules that contain meaningful feature boundaries
- Use cases, reducers, actions, selectors, middleware, handlers, coordinators, hooks, contexts, registries, and adapters when they materially shape architecture
- Important flows, triggers, rendered states, or helpers when they materially clarify lifecycle, control flow, error handling, or coupling
Weak candidates that usually should not stand alone:
- Tiny helpers
- Mappers with no independent lifecycle
- One-line wrappers
- Small leaf utility files
When in doubt:
- Prefer keeping a candidate if it clarifies stack traversal, domain clustering, or cross-domain coupling
- Collapse or omit only when the candidate is repetitive and does not improve understanding
Behavioral nodes are optional and should be used selectively. Include them when they make the graph more explanatory, not merely more detailed.
Good uses:
- A flow point that explains a key lifecycle such as initial load, checkout submit, or sync recovery
- A trigger point that clarifies what starts important work
- A state point for loading, success, empty, disabled, or error when those states are architecturally important
- A helper or error_handler point when it materially shapes control flow or coupling
Poor uses:
- Emitting every helper as a point
- Modeling every function as a point
- Creating isolated behavioral nodes that are not anchored to a screen, flow, service, or module
Coverage check for candidate points:
- Do the chosen points let you trace the system from entrypoint to external boundary?
- Do they cover both steady-state dependencies and transient runtime touchpoints?
- Do they expose domain clusters and the important shared infrastructure between domains?
- Are there missing orchestration points such as reducers, actions, handlers, use cases, middleware, contexts, jobs, or schedulers that would make the links more truthful?
- Are there missing boundary points such as caches, queues, SDKs, webhooks, feature flags, configuration registries, or schema roots that would make cross-domain behavior more legible?
- Are there enough intermediate points to make the render legible without forcing a human to infer large hidden jumps?
- Have you traced through enough lower-level components that each important domain path reads as a chain rather than a single coarse edge?
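The first question in the coverage check can be tested mechanically. The sketch below runs a breadth-first search over directed candidate links to see whether an entrypoint reaches an external boundary as a chain of typed hops. The trace_path helper and the tuple layout (source, target, type) are assumptions for illustration, not part of the skill's contract.

```python
from collections import deque


def trace_path(links, start, goal):
    """Breadth-first search over directed (source, target, type) links.

    Returns the list of hops from start to goal, or None if unreachable.
    """
    outgoing = {}
    for source, target, link_type in links:
        outgoing.setdefault(source, []).append((target, link_type))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for target, link_type in outgoing.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + [(node, link_type, target)]))
    return None


# Hypothetical candidate links for an auth slice.
links = [
    ("route:auth/login", "controller:auth/login", "owns"),
    ("controller:auth/login", "service:auth/session", "calls"),
    ("service:auth/session", "cache:shared/redis", "writes"),
]
chain = trace_path(links, "route:auth/login", "cache:shared/redis")
for source, link_type, target in chain:
    print(f"{source} -[{link_type}]-> {target}")
```

A result of None for an important entrypoint suggests a missing intermediate point or link. A one-hop chain across several layers suggests the path has been collapsed too coarsely.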
When working domain-by-domain, keep an intermediate tracking file for each domain under architecture/domains/.
Recommended filenames:
- architecture/domains/auth.yml
- architecture/domains/payments.yml
- architecture/domains/shared-infra.yml
Use these files to track candidate points and candidate links before final normalization. They exist to make the crawl inspectable by humans and to reduce the chance of losing cross-domain context while moving slice by slice.
Recommended YAML shape:
    domain: auth
    status: in_progress
    entrypoints:
      - path: src/auth/routes.ts
        symbol: authRoutes
    points:
      - id: route:auth/login
        type: route
        label: LoginRoute
        path: src/auth/routes.ts
        layer: presentation
        status: observed
    links:
      - source: route:auth/login
        target: controller:auth/login
        type: owns
        status: observed
    shared_links:
      - source: service:auth/session
        target: cache:shared/redis
        type: writes
        targetDomain: shared-infra
        status: observed
    notes:
      - Session creation flows into shared Redis cache used by multiple domains.
Track shared links explicitly so cross-domain references remain visible while the architecture is being assembled incrementally.
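While a slice is in progress, a quick consistency pass over the tracking file can show which link endpoints have not been recorded as points yet. The sketch below assumes the file has already been parsed into a dictionary (for example with a YAML library of your choice). The dangling_endpoints helper is hypothetical, and the keys follow the recommended shape.

```python
def dangling_endpoints(domain: dict) -> list:
    """List link endpoints that are not declared under `points`.

    Targets of shared_links are skipped when they name a targetDomain,
    because those points are expected to live in another domain file.
    """
    known = {point["id"] for point in domain.get("points", [])}
    missing = []
    for link in domain.get("links", []):
        for end in ("source", "target"):
            if link[end] not in known:
                missing.append(link[end])
    for link in domain.get("shared_links", []):
        if link["source"] not in known:
            missing.append(link["source"])
        if "targetDomain" not in link and link["target"] not in known:
            missing.append(link["target"])
    return missing


# Abbreviated version of the auth example, already parsed into a dict.
auth = {
    "domain": "auth",
    "points": [{"id": "route:auth/login", "type": "route"}],
    "links": [
        {"source": "route:auth/login",
         "target": "controller:auth/login", "type": "owns"},
    ],
    "shared_links": [
        {"source": "service:auth/session", "target": "cache:shared/redis",
         "type": "writes", "targetDomain": "shared-infra"},
    ],
}
print(dangling_endpoints(auth))
# → ['controller:auth/login', 'service:auth/session']
```

Here the controller and the session service are referenced by links but not yet listed as points, which is normal mid-crawl and worth resolving before normalization.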
Step 4 - Extract typed links
For each candidate point, inspect:
- What creates it
- What renders it
- What it calls
- What it depends on
- What state it binds to
- What data sources it reads or writes
- What screen or flow it transitions to
Only create a link if the relationship is meaningful and supported by code. Walk each important path from top to bottom of the stack wherever possible. Do not stop at the first obvious dependency hop. Prefer multiple specific links over a single coarse link when the intermediate architectural steps matter. Trace through:
- conditional branches
- fallback paths
- feature-flagged behavior
- async triggers and callbacks
- transient dependencies such as helpers, middleware, adapters, or mappers when they materially shape control flow
- cross-domain handoffs and shared infrastructure
The target outcome is not just a bag of local edges. The graph should reveal domain clusters, full-stack paths through those clusters, and the shared links between domains where those links are real.
Examples:
- Screen -> ViewModel as binds_to
- ViewModel -> Service as calls
- Service -> Repository as depends_on
- Repository -> Database as reads and writes
- Screen -> Screen as navigates_to
- Module -> Screen as contains
- Trigger -> Flow as triggers
- Flow -> State as transitions_to
- Service -> Helper as uses_helper
- Flow -> ErrorHandler as handles_error_with
Step 5 - Normalize and de-noise
Before writing output:
- Merge duplicate entities with the same architectural role
- Remove low-value nodes that only create clutter
- Ensure each point has one clear primary type
- Ensure each link has one clear semantic type
- Ensure point indices are sequential and stable
- Ensure link source and target indices match the point index mapping
- Ensure behavioral nodes attach to a parent screen, flow, service, or module rather than floating as isolated graph noise
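A minimal sketch of the index-related steps above, assuming points and links are plain dictionaries keyed by stable IDs. The output field names (index, sourceIndex, targetIndex) are hypothetical; use the ones defined in references/guidelines.md.

```python
def normalize(points, links):
    """Deduplicate points by id, assign sequential indices, resolve links."""
    unique = {}
    for point in points:
        unique.setdefault(point["id"], point)  # first occurrence wins

    # Sorting by id keeps the ordering deterministic across repeated runs.
    ordered = sorted(unique.values(), key=lambda p: p["id"])
    index_of = {p["id"]: i for i, p in enumerate(ordered)}
    indexed_points = [{**p, "index": index_of[p["id"]]} for p in ordered]

    indexed_links = []
    for link in links:
        if link["source"] not in index_of or link["target"] not in index_of:
            raise KeyError(f"unresolved link: {link}")
        indexed_links.append({
            **link,
            "sourceIndex": index_of[link["source"]],
            "targetIndex": index_of[link["target"]],
        })
    return indexed_points, indexed_links


points = [
    {"id": "service:auth/session", "type": "service"},
    {"id": "cache:shared/redis", "type": "cache"},
    {"id": "service:auth/session", "type": "service"},  # duplicate entity
]
links = [
    {"source": "service:auth/session",
     "target": "cache:shared/redis", "type": "writes"},
]
indexed_points, indexed_links = normalize(points, links)
```

Sorting by ID is one simple way to get repeatable indices for an unchanged set of points. Adding or removing a point can still shift later indices, so the ID remains the reliable key for comparing runs.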
Heuristics for good graphs
Use these heuristics to avoid a bad render:
- Prioritize breadth of architecture over microscopic detail
- Prefer richer architectural granularity over an overly thin first-pass graph
- Keep helper and utility explosion out of the graph only when those helpers do not change control flow, coupling, or stack traversal
- Favor typed relationships over dense generic connectivity
- Prefer one representative point per architectural concept
- Use parent-child containment to preserve context without over-linking
- If one module contains many leaf utilities, keep the module and only include the most important leaves
- Size by importance, not raw file count
- Color by point type or layer
- Use labels for high-importance nodes first
- Let overview nodes and edges form the backbone of the graph
- Let behavior nodes and edges enrich local understanding without drowning the backbone
Required workflow
Follow this order:
- Walk the code to understand architecture
- Identify the architecture patterns present and select point and link types that fit them
- Partition the architecture into domain slices when needed so the full map can be built incrementally without dropping coverage
- For each domain slice, record intermediate candidate points and links in architecture/domains/<domain>.yml
- Check whether the collection points are sufficient for a faithful point-to-link mapping and add missing categories when needed
- Trace important flows from top to bottom of the stack, including meaningful branches and transient dependencies
- Repeat until all relevant domain slices in scope are covered
- Normalize shared points and cross-domain links across the domain files
- Output points.json and links.json
- Create config.json to help render and explore the dataset
- Create layout.json only if a guided layout materially improves readability
Do not skip discovery and jump straight to generation.
Verification checklist
Before finishing, verify:
- The output folder exists
- The domain tracking folder exists if you used domain slices
- No points or links were emitted from sibling or peer directories outside the current working directory subtree unless the user explicitly asked for broader scope
- Every emitted path starts with or resolves under SCAN_ROOT
- points.json parses
- links.json parses
- config.json parses
- layout.json parses if created
- Each architecture/domains/*.yml file parses if created
- Point indices are sequential and unique
- Every link resolves to valid points
- The graph is not overloaded with low-value nodes
- overview nodes and edges still form a readable backbone
- Behavioral nodes explain lifecycle, rendering, error handling, or coupling rather than adding incidental detail
- The chosen config reflects the actual fields in the datasets
- Cross-domain links remain explicit rather than being flattened into ambiguous local edges
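The mechanical items in this checklist (files parse, indices are sequential and unique, every link resolves) can be checked with a short script. The sketch below assumes both files hold JSON arrays and uses hypothetical field names (index, sourceIndex, targetIndex); adjust them to the schema in references/guidelines.md.

```python
import json
import tempfile
from pathlib import Path


def verify_output(output_dir):
    """Return a list of problems found in points.json and links.json."""
    problems = []
    data = {}
    for name in ("points.json", "links.json"):
        try:
            data[name] = json.loads((Path(output_dir) / name).read_text())
        except (OSError, json.JSONDecodeError) as exc:
            problems.append(f"{name}: {exc}")
    if problems:
        return problems

    indices = [point["index"] for point in data["points.json"]]
    if sorted(indices) != list(range(len(indices))):
        problems.append("point indices are not sequential and unique")

    valid = set(indices)
    for link in data["links.json"]:
        if link["sourceIndex"] not in valid or link["targetIndex"] not in valid:
            problems.append(f"link does not resolve: {link}")
    return problems


# Demonstration against a throwaway directory with one broken link.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "points.json").write_text(json.dumps([{"index": 0}, {"index": 1}]))
    Path(tmp, "links.json").write_text(
        json.dumps([{"sourceIndex": 0, "targetIndex": 2}])
    )
    problems = verify_output(tmp)
```

The judgment items, such as whether the backbone is readable or the graph is overloaded, still need a human or agent review of the render.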
Response expectations
When you complete the work, report:
- Which area of the codebase was mapped
- The files written under architecture/output/
- The files written under architecture/domains/ if any
- The modeling decisions that shaped the graph
- Any major inferred areas or confidence limits