tooyoung:codebase-stats
Codebase Stats
Count lines of code with auto-detected project structure. No external tools required.
Tools to use: Bash for file discovery and counting, Glob and Read for config file detection.
Do NOT use cloc, tokei, scc, or any external LOC counter.
Activation
Trigger on:
- `/codebase-stats` or `/loc`
- "count lines of code", "LOC by module", "code stats", "project size"
Execution Flow
Step 1: Detect Project Structure
Use Glob and Read to inspect config files at the project root. Follow this decision tree in order (first match wins):
1a. JS/TS Workspace Monorepo
Signal files: pnpm-workspace.yaml, turbo.json, nx.json, lerna.json, or root package.json containing "workspaces"
Module extraction:
- `pnpm-workspace.yaml`: read `packages` globs, expand with Glob
- `package.json` `workspaces`: read array, expand globs
- Each directory with its own `package.json` = one module
- Label: `workspace`
1b. Rust Workspace
Signal: root Cargo.toml containing [workspace]
Module extraction: parse members array, expand globs. Label: workspace
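The member extraction for a Rust workspace can be sketched with `awk` and shell globbing, in keeping with the no-external-tools rule. This is a rough illustration, not a TOML parser: the helper names `cargo_member_patterns` and `cargo_member_dirs` are hypothetical, and the sketch assumes double-quoted entries, no quotes inside comments, and member paths without spaces.

```shell
# Hypothetical helper: print the raw entries of [workspace].members.
# Assumes double-quoted strings and no quoted text inside TOML comments.
cargo_member_patterns() {
  awk '
    /^\[/ { in_ws = ($0 ~ /^\[workspace\][ \t]*$/); in_members = 0 }
    in_ws && /^[ \t]*members[ \t]*=/ { in_members = 1 }
    in_members {
      line = $0
      # emit every quoted string on this line
      while (match(line, /"[^"]+"/)) {
        print substr(line, RSTART + 1, RLENGTH - 2)
        line = substr(line, RSTART + RLENGTH)
      }
      # a closing bracket after the last entry ends the array
      if (line ~ /\]/) in_members = 0
    }
  ' "$1"
}

# Hypothetical helper: expand member globs to directories that hold a Cargo.toml.
cargo_member_dirs() {
  local root="$1" pattern dir
  cargo_member_patterns "$root/Cargo.toml" | while IFS= read -r pattern; do
    # $pattern is left unquoted so the shell expands globs such as crates/*
    for dir in "$root"/$pattern; do
      if [ -f "$dir/Cargo.toml" ]; then
        printf '%s\n' "${dir#"$root"/}"
      fi
    done
  done
}
```

Calling `cargo_member_dirs "$ROOT"` prints one module path per line, relative to the workspace root.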
1c. Java / Kotlin Multi-Module
Signal: root pom.xml with <modules>, or settings.gradle / settings.gradle.kts
Module extraction:
- Maven: parse `<module>` tags
- Gradle: parse `include(...)` or `include '...'`, convert `:a:b` to `a/b`
- Label: `maven-module` or `gradle-module`
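The Gradle rule above (strip the leading colon, turn the remaining colons into slashes) can be sketched as follows. The function names `gradle_path_to_dir` and `gradle_modules` are hypothetical, and the sketch assumes each `include` statement sits on a single line with quoted project paths.

```shell
# Hypothetical helper: convert a Gradle project path such as ":a:b" to "a/b".
gradle_path_to_dir() {
  # drop the leading colon if present, then map remaining colons to slashes
  printf '%s\n' "${1#:}" | tr ':' '/'
}

# Hypothetical helper: list module directories named in a settings file.
# Matches "include(" and "include '...'" but not "includeBuild".
gradle_modules() {
  grep -E '^[[:space:]]*include[[:space:](]' "$1" |
    grep -oE "[\"'][^\"']+[\"']" |
    tr -d "\"'" |
    while IFS= read -r p; do
      gradle_path_to_dir "$p"
    done
}
```

For example, `gradle_modules settings.gradle.kts` prints one directory per included project.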
1d. Go Multi-Module
Signal: go.mod found in 2+ directories
Module extraction: each directory containing go.mod = one module. Label: go-module
1e. Python Multi-Package
Signal: pyproject.toml or setup.py found in 2+ directories
Module extraction: each directory containing either file = one module. Label: py-package
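The Go and Python checks both come down to "list the directories that contain a given manifest and see whether there are at least two". A minimal sketch, assuming the hypothetical helper names `dirs_containing` and `has_multiple` and a short prune list chosen for illustration:

```shell
# Hypothetical helper: print each directory under ROOT that contains any of
# the given manifest names. Usage: dirs_containing ROOT NAME [NAME...]
dirs_containing() {
  local root="$1" name
  shift
  for name in "$@"; do
    find "$root" \
      \( -name .git -o -name node_modules -o -name vendor \
         -o -name .venv -o -name venv \) -prune -o \
      -type f -name "$name" -print
  done | while IFS= read -r f; do
    dirname "$f"
  done | sort -u   # a directory holding both manifests counts once
}

# Succeeds when two or more directories contain one of the manifests.
has_multiple() {
  [ "$(dirs_containing "$@" | grep -c .)" -ge 2 ]
}
```

`has_multiple "$ROOT" go.mod` would select rule 1d, and `has_multiple "$ROOT" pyproject.toml setup.py` rule 1e; the output of `dirs_containing` is then the module list.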
1f. Full-Stack Monolith
Signal: top-level directory pairs matching known patterns:
- `src` + `src-tauri` (Tauri)
- `frontend` + `backend`
- `client` + `server`
- `web` + `api`
- `app` + `server`
Module extraction: each matched directory = one module. Label: monolith
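A possible shape for the pair check, using the patterns listed above. The function name `detect_monolith_pair` is hypothetical, and stopping at the first matching pair is an assumption; the text does not say what to do when several pairs match.

```shell
# Hypothetical helper: print the two directories of the first known pair that
# exists under ROOT, or return 1 when no pair matches.
detect_monolith_pair() {
  local root="${1:-.}" pair a b
  for pair in "src src-tauri" "frontend backend" "client server" "web api" "app server"; do
    a="${pair% *}"   # text before the space
    b="${pair#* }"   # text after the space
    if [ -d "$root/$a" ] && [ -d "$root/$b" ]; then
      printf '%s\n%s\n' "$a" "$b"
      return 0
    fi
  done
  return 1
}
```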
1g. Simple Single-Module
Fallback: none of the above matched.
Module: project root as the only module. Label: single
Always: add a <root> pseudo-module for files in the project root not belonging to any detected module.
Step 2: Count Lines
Run a single Bash script. The counting logic:
```shell
# Template - adapt ROOT, MODULES, PRUNE_DIRS as needed
find "$ROOT" \
  \( -name .git -o -name node_modules -o -name target -o -name dist \
     -o -name build -o -name out -o -name coverage -o -name .next \
     -o -name .nuxt -o -name .svelte-kit -o -name .turbo -o -name .cache \
     -o -name __pycache__ -o -name .venv -o -name venv -o -name .mypy_cache \
     -o -name .pytest_cache -o -name vendor -o -name third_party \
     -o -name .yarn -o -name .pnpm-store -o -name Pods -o -name DerivedData \
     -o -name .gradle -o -name .idea -o -name .vscode -o -name .settings \
     -o -name gen -o -name generated \
     -o -name bin -o -name obj -o -name tmp -o -name temp \
  \) -prune -o \
  -type f ! -name '*.min.js' ! -name '*.min.css' ! -name '*.bundle.js' \
  ! -name '*.map' ! -name '*.snap' ! -name '*.lock' ! -name '*.d.ts' \
  ! -name 'package-lock.json' ! -name 'pnpm-lock.yaml' ! -name 'go.sum' \
  ! -name '*.pb.go' ! -name '*_pb2.py' \
  ! -name '*.generated.*' ! -name '*.gen.*' ! -name '*.g.dart' \
  ! -name '*.designer.*' \
  -print0 | \
while IFS= read -r -d '' f; do
  # skip symlinks
  [ -L "$f" ] && continue
  # skip binary
  grep -qI . "$f" 2>/dev/null || continue
  # get extension or basename for extensionless files
  base=$(basename "$f")
  case "$base" in
    Makefile|Dockerfile|Vagrantfile|Rakefile|Gemfile|Justfile|CMakeLists.txt) ext="$base" ;;
    *.*) ext="${base##*.}" ;;
    *) ext="$base" ;;
  esac
  # count
  lines=$(wc -l < "$f" 2>/dev/null) || continue
  # grep -c prints 0 and exits 1 when no line matches, so the fallback is set
  # outside the substitution; "|| echo 0" inside it would capture "0" twice
  loc=$(grep -cve '^[[:space:]]*$' "$f" 2>/dev/null) || loc=0
  # output: path<TAB>ext<TAB>lines<TAB>loc
  printf '%s\t%s\t%d\t%d\n' "$f" "$ext" "$lines" "$loc"
done
```
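The script emits one `path<TAB>ext<TAB>lines<TAB>loc` row per file, so the per-extension totals can be produced in a single `awk` pass. A small sketch, assuming a hypothetical function name `summarize_by_ext` and that file paths contain no tab characters:

```shell
# Hypothetical helper: read path<TAB>ext<TAB>lines<TAB>loc rows on stdin and
# print ext<TAB>files<TAB>lines<TAB>loc, highest LOC first.
summarize_by_ext() {
  local tab
  tab=$(printf '\t')
  awk -F '\t' '
    { files[$2]++; lines[$2] += $3; loc[$2] += $4 }
    END {
      for (e in files) printf "%s\t%d\t%d\t%d\n", e, files[e], lines[e], loc[e]
    }
  ' | sort -t "$tab" -k4,4nr -k1,1
}
```

Piping the counting loop into `summarize_by_ext` keeps the whole count in one Bash invocation.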
File type mapping: Apply this extension-to-type table to classify each file:
| Type | Extensions |
|---|---|
| ts | .ts |
| tsx | .tsx |
| js | .js, .mjs, .cjs |
| jsx | .jsx |
| py | .py |
| rs | .rs |
| go | .go |
| java | .java |
| kt | .kt, .kts |
| swift | .swift |
| c | .c |
| cpp | .cc, .cpp, .cxx |
| h | .h, .hh, .hpp, .hxx |
| css | .css, .scss, .sass, .less |
| html | .html, .htm |
| sql | .sql |
| sh | .sh, .bash, .zsh, .fish |
| toml | .toml |
| json | .json, .jsonc |
| yaml | .yml, .yaml |
| xml | .xml |
| md | .md, .mdx |
| vue | .vue |
| svelte | .svelte |
| dart | .dart |
| rb | .rb |
| php | .php |
| cs | .cs |
| scala | .scala |
| elixir | .ex, .exs |
| docker | Dockerfile |
| make | Makefile, CMakeLists.txt |
Unmatched extensions: group into other. Skip known non-source: .png, .jpg, .jpeg, .gif, .svg, .ico, .pdf, .woff, .woff2, .ttf, .eot, .mp3, .mp4, .webm, .webp, .zip, .tar, .gz, .dmg, .exe, .dll, .so, .dylib, .DS_Store.
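The table and the skip list translate naturally into a `case` statement over the extension field produced by the counting script (which has no leading dot). The function name `file_type` and the `skip` sentinel are illustrative choices, not part of the skill.

```shell
# Hypothetical helper: map an extension (or special basename) to its type.
# Prints "skip" for known non-source files and "other" for anything unmapped.
file_type() {
  case "$1" in
    ts|tsx|jsx|py|rs|go|java|swift|c|sql|toml|xml|vue|svelte|dart|rb|php|cs|scala)
      printf '%s\n' "$1" ;;
    js|mjs|cjs)          echo js ;;
    kt|kts)              echo kt ;;
    cc|cpp|cxx)          echo cpp ;;
    h|hh|hpp|hxx)        echo h ;;
    css|scss|sass|less)  echo css ;;
    html|htm)            echo html ;;
    sh|bash|zsh|fish)    echo sh ;;
    json|jsonc)          echo json ;;
    yml|yaml)            echo yaml ;;
    md|mdx)              echo md ;;
    ex|exs)              echo elixir ;;
    Dockerfile)          echo docker ;;
    Makefile|CMakeLists.txt) echo make ;;
    png|jpg|jpeg|gif|svg|ico|pdf|woff|woff2|ttf|eot|mp3|mp4|webm|webp|zip|tar|gz|dmg|exe|dll|so|dylib|DS_Store)
      echo skip ;;
    *)                   echo other ;;
  esac
}
```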
Module assignment: for each file path, find the longest-matching module root prefix. Files not matching any module go to <root>.
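Longest-prefix assignment can be sketched as a loop over the module roots that keeps the longest root matching on a directory boundary. The name `assign_module` is hypothetical, and paths are assumed to be relative to the scan root.

```shell
# Hypothetical helper: print the module that owns FILE.
# Usage: assign_module FILE MODULE_ROOT [MODULE_ROOT...]
assign_module() {
  local file="$1" best="<root>" best_len=0 m
  shift
  for m in "$@"; do
    m="${m%/}"                      # tolerate a trailing slash on the root
    case "$file" in
      "$m"/*)                       # match only on a directory boundary
        if [ "${#m}" -gt "$best_len" ]; then
          best="$m"
          best_len=${#m}
        fi
        ;;
    esac
  done
  printf '%s\n' "$best"
}
```

Matching on `"$m"/*` rather than a bare string prefix keeps `packages/core-utils/x.ts` from being assigned to `packages/core`.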
Step 3: Render Output
Language: Match the user's prompt language. If the user writes in Chinese, use Chinese table headers and labels (e.g., "扫描根目录", "文件类型", "模块", "占比"). If in English, use English. Technical terms (module names, file types, paths) stay as-is regardless of language.
Produce four Markdown tables in this exact order:
Table 1: Project Summary
| Metric | Value |
| ------------------------ | --------------------------- |
| Scan Root | `/path/to/project` |
| Structure | `monorepo (pnpm workspace)` |
| Modules | 5 |
| Source Files | 842 |
| LOC (non-empty) | 59,040 |
| Total Lines | 72,318 |
| Skipped (binary/symlink) | 12 |
Table 2: By File Type
Sorted by LOC descending.
| Type | Files | LOC | Lines | Share |
| ---- | ----: | -----: | -----: | ----: |
| ts | 120 | 18,420 | 22,510 | 31.2% |
| tsx | 44 | 9,720 | 11,300 | 16.5% |
| rs | 28 | 5,180 | 6,420 | 8.8% |
| css | 15 | 2,340 | 2,890 | 4.0% |
| ... | | | | |
Share = LOC / total LOC * 100, 1 decimal place.
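As a worked instance of the formula, the `ts` row in the sample table is 18,420 / 59,040 * 100, which rounds to 31.2%. A minimal sketch of the computation, assuming a hypothetical function name `share` and a guard for the empty-project case where total LOC is zero:

```shell
# Hypothetical helper: print LOC as a percentage of TOTAL with one decimal.
share() {
  awk -v loc="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "0.0%"; exit }   # empty project: avoid dividing by zero
    printf "%.1f%%\n", loc / total * 100
  }'
}
```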
Table 3: By Module
Sorted by LOC descending.
| Module | Kind | Files | LOC | Share | Top Types |
| ------------- | --------- | ----: | -----: | ----: | ---------------------- |
| packages/core | workspace |   210 | 25,110 | 42.5% | ts(12.3k), tsx(10.2k)  |
| packages/ui | workspace | 160 | 18,890 | 32.0% | tsx(15.4k), css(1.9k) |
| <root> | root | 18 | 1,320 | 2.2% | json(0.5k), yaml(0.4k) |
Top Types: top 3 file types by LOC within the module, formatted as type(Xk) using abbreviation (e.g., 1.2k for 1,200).
Table 4: Module x Type Detail (Top 20)
Only show if multiple modules detected. Sorted by LOC descending, capped at 20 rows.
| Module | Type | Files | LOC |
| ------------- | ---- | ----: | -----: |
| packages/ui   | tsx  |    88 | 15,420 |
| packages/core | ts   |    95 | 12,340 |
| packages/core | tsx  |    75 | 10,210 |
If more than 20 combinations exist, append: Showing top 20 of N combinations.
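The cap and the trailing note can be handled in one pipeline. This sketch assumes a hypothetical function `top_combinations` that reads `module<TAB>type<TAB>files<TAB>loc` rows on stdin; that input layout is an illustrative choice.

```shell
# Hypothetical helper: sort combinations by LOC (column 4) descending, keep
# the first LIMIT rows, and append the note when rows were dropped.
top_combinations() {
  local limit="${1:-20}" tab
  tab=$(printf '\t')
  sort -t "$tab" -k4,4nr | awk -v limit="$limit" '
    NR <= limit { print }
    END {
      if (NR > limit) printf "Showing top %d of %d combinations.\n", limit, NR
    }
  '
}
```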
Number Formatting
- Use thousand separators for numbers >= 1,000 (e.g., `59,040`)
- Percentages: 1 decimal place (e.g., `31.2%`)
- Abbreviated form in Top Types: `1.2k` for 1,200
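Both number formats can be produced with standard tools. In this sketch the names `with_commas` and `abbrev_k` are hypothetical, and `abbrev_k` always prints the `k` form, on the assumption that values under 1,000 should also be abbreviated, as in the `json(0.5k)` sample cell.

```shell
# Hypothetical helper: add thousand separators to a non-negative integer.
with_commas() {
  # repeatedly split off the last three digits of the leading digit run
  printf '%s\n' "$1" |
    sed -e ':a' -e 's/^\([0-9]\{1,\}\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

# Hypothetical helper: abbreviate a count for the Top Types column.
abbrev_k() {
  awk -v n="$1" 'BEGIN { printf "%.1fk\n", n / 1000 }'
}
```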
Edge Cases
| Case | Behavior |
|---|---|
| Empty project | Show summary with zeros, note "No source files found" |
| Symlinks | Never follow, count in "Skipped" |
| Binary files with source extension | Detect via grep -qI, skip, count in "Skipped" |
| Very large repo (>10k files) | Print warning before counting, continue normally |
| Permission errors | Skip file, increment "Skipped" count |
| Nested monorepo (workspace inside workspace) | Use deepest module root (longest prefix match) |
| Generated directories (gen/, generated/) | Exclude by default |
| `other` share > 15% | Warn user to review unmapped extensions |
Performance Tips
- Always use `find -prune` to skip heavy directories early
- Process files in a single `find | while read` loop, not per-file tool calls
- Aggregate with `awk` or `sort | uniq -c`, not per-row processing
- Use `Read` only for config files (manifests), never for source files
- For repos with >5,000 files, run counting in a single Bash invocation