nestjs-dev-guidelines
NestJS Dev Guidelines
A complete set of production-grade NestJS and Nest-style Node.js backend standards. Apply these rules whenever working on a codebase that already uses NestJS concepts such as modules, controllers, providers/services, DTOs, pipes, and guards, or when the task is explicitly about designing those patterns. Think like a senior backend engineer: consistency over cleverness, explicit over implicit, boundaries over shortcuts.
How to use this skill
- Before writing or reviewing code, scan references/00-execution-discipline.md and the Non-Negotiables below.
- For file placement, module boundaries, or refactor decisions, use the Decision Trees.
- For deep detail on a topic, open the matching references/NN-<topic>.md file. Each is self-contained: TL;DR, rules, good/bad examples, anti-patterns, and a review checklist.
- When reviewing a PR, run through references/29-code-review-checklist.md + the topic references relevant to the diff.
Scope and precedence
Use this section when the repo is "backend TypeScript" but not obviously a standard NestJS app.
- NestJS-first scope. Apply this skill directly when the repo already uses NestJS primitives (@Module, controllers, providers/services, guards, pipes, DTOs) or when the user is asking you to design a NestJS solution.
- Plain Node backends are partial-match only. If the repo is plain Express/Fastify/Hono/Koa without NestJS primitives, use only the cross-cutting guidance here (validation, contracts, authz, DB design, testing, observability). Do not force Nest-specific APIs, decorators, or folder structure onto that codebase.
- Protect outcomes first, tooling second. The non-negotiables below are invariants first and implementation defaults second. Centralized input validation, stable response contracts, structured redacted logging, ownership checks, and test coverage matter more than the exact library used to achieve them.
- Follow equivalent-safe repo conventions. If the repo already uses a different but healthy mechanism that preserves the same safety/property boundary (for example Zod instead of class-validator, or a repo-standard structured logger instead of nestjs-pino), follow the repo rather than rewriting it mid-task.
- Escalate harmful conflicts. If the existing pattern weakens security, correctness, data integrity, or a stable contract, call it out explicitly and ask before normalizing or widening that pattern.
Execution discipline
Apply these rules before touching NestJS-specific details. They exist to reduce the usual LLM failure modes: silent assumptions, overbuilt code, broad refactors, and unverified fixes.
- Think before coding. State assumptions. If the task is ambiguous, ask or explicitly list the plausible interpretations instead of silently picking one.
- Verify volatile facts. Versions, package APIs, model IDs, install commands, and CLI flags change. Verify them against official docs and the repo before recommending; do not recall from memory. See 35-source-of-truth-freshness.md.
- Search before writing new code. Check the current module first, then shared/common/core code for an existing util, DTO, service, guard, interceptor, repository, or pattern you can reuse.
- Fix root causes, not trigger hacks. Do not patch broad behavior with one-off regexes, hardcoded phrases, or manual if/then rules unless the product requirement is literally deterministic rule-based routing.
- Prefer the smallest correct change. No speculative flags, abstractions, helpers, config, or edge-case handling unless the task or codebase clearly needs them.
- Make surgical edits. Touch only the lines needed for the request. Do not clean up unrelated code, comments, or formatting just because you are nearby.
- Search and update the full impact surface. If a shared function, contract, DTO, type, or behavior changes, scan and update callers, tests, docs, examples, and related flows until the whole change is consistent.
- Ask before changing shared behavior. If reusing existing code requires changing shared semantics, multiple callers, or a reusable contract, stop and confirm instead of silently widening the blast radius.
- Define what success looks like. Convert vague requests into checks you can verify: regression test, unit test, e2e test, typecheck, lint, build, or a concrete manual check.
- Stop when confused. Name the uncertainty early. Short clarifying questions are cheaper than rewriting the wrong code.
Open references/00-execution-discipline.md for the full checklist and examples.
Non-negotiables (protect these outcomes; prefer these patterns)
Each rule has a Why so you can reason about edge cases instead of applying it blindly.
- No business logic in a controller. Controllers validate input, delegate to a service, and shape the response. Nothing else.
  Why: controllers are thin HTTP adapters. Logic in them can't be unit-tested without booting the framework, and it pulls HTTP concerns into domain code. See 04-code-quality.md.
- Every external input goes through a DTO. Body, query, and param DTOs use class-validator; ValidationPipe is global with whitelist: true, forbidNonWhitelisted: true, transform: true.
  Why: DTOs are the one choke point where unknown fields, bad types, and injection payloads are stopped. Skipping one means you trust the client. See 09-validation.md.
- Never trust the client. Every ID from the URL is checked against the authenticated user/org's ownership. Filter and sort fields are whitelisted. Mass-assignment is prevented by whitelist.
  Why: IDOR and mass-assignment are the two most common application-level breaches. They only exist when code assumes "if the token is valid, the payload is fine." See 11, 12.
- One module owns its tables. No cross-module raw DB reads. If module B needs data from module A, call A's service (DI) or subscribe to A's events.
  Why: shared table access makes every schema change a cross-team coordination problem. Modules become coupled through the DB instead of through APIs. See 03-module-design.md.
- snake_case in the database, camelCase in code. Tables plural, columns snake_case, primary keys id, foreign keys <entity>_id.
  Why: each ecosystem has a convention; mixing them creates a lifetime of mapping bugs and makes ad-hoc SQL painful. Pick the convention of the side that's hardest to change (the DB). See 13-database-design.md.
- Responses follow one stable contract. Single-resource success returns the object itself; list success returns { data, meta }; errors return { code, message, details?, traceId } with the correct HTTP status.
  Why: a consistent contract lets clients write one error handler and one pagination handler that works everywhere, and lets support triage issues by traceId. See 07, 10.
- Pagination is required for any list endpoint. Choose cursor/keyset for sequential browsing over mutable or large data; choose offset when page-number navigation or exact totals are real product requirements. Both return meta with pagination info.
  Why: an unpaginated list is a latent OOM and a latent DB outage. The right pagination model depends on UX, consistency requirements, and scale. See 08.
- Secrets come from env only, validated with Zod at boot. No secrets in code, no secrets in logs. Invalid env = crash before serving traffic.
  Why: a missing/malformed env var caught at boot is a minor incident; caught at runtime on the hot path, it's a customer outage. See 20, 11.
- Structured logs, redacted. nestjs-pino with JSON in prod; redact authorization, cookie, set-cookie, password, token. Correlation ID on every log line.
  Why: plain-text logs can't be queried at scale, and one unredacted Authorization header is a credential leak with a long tail. See 21.
- Test at boundaries, not internals. Unit-test services with mocked dependencies; e2e-test controllers through the HTTP layer; never mock the class under test.
  Why: mocking the code under test just re-asserts the mock. Tests should verify the contract (HTTP, DB, external calls), not the implementation. See 23.
- Tenant identity is server-derived and layered. In a multi-tenant app, tenant/org id comes from the authenticated session/JWT, never from the request body. Guards, services, and repositories each filter by tenant; tests prove tenant A cannot see tenant B's data.
  Why: cross-tenant leaks almost always come from a single missing WHERE tenant_id = ? or a client-supplied id that was trusted too early. Layered enforcement means a bug in one layer does not leak data. See 33.
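The response-contract rule above can be sketched in plain TypeScript. This is a minimal, framework-free illustration, not the skill's reference implementation: only the top-level keys ({ data, meta } and { code, message, details?, traceId }) come from the rule. The helper names, the meta fields (nextCursor, hasMore), the "fetch limit + 1 rows" trick, the use of the last row's id as the cursor, and the error code format are all assumptions made for the example.

```typescript
// Hypothetical types and helpers; only the top-level keys come from the rule.

// Error wire shape: { code, message, details?, traceId }.
interface ErrorBody {
  code: string; // machine-readable code (the dotted format below is an assumption)
  message: string; // safe, human-readable text
  details?: unknown; // optional structured context, e.g. field errors
  traceId: string; // correlation ID used for support triage
}

// List wire shape: { data, meta }. The meta fields are assumed for illustration.
interface CursorMeta {
  nextCursor: string | null;
  hasMore: boolean;
}

interface ListBody<T> {
  data: T[];
  meta: CursorMeta;
}

// Wrap one page of rows. The caller is assumed to fetch limit + 1 rows so the
// extra row signals that another page exists without a separate COUNT query.
function toListBody<T extends { id: string }>(rows: T[], limit: number): ListBody<T> {
  const hasMore = rows.length > limit;
  const data = hasMore ? rows.slice(0, limit) : rows;
  const last = data[data.length - 1];
  return {
    data,
    meta: { nextCursor: hasMore && last ? last.id : null, hasMore },
  };
}

function toErrorBody(code: string, message: string, traceId: string, details?: unknown): ErrorBody {
  // Omit `details` entirely when absent so the key never appears as undefined.
  return details === undefined ? { code, message, traceId } : { code, message, details, traceId };
}

const page = toListBody([{ id: "a" }, { id: "b" }, { id: "c" }], 2);
console.log(JSON.stringify(page));
// {"data":[{"id":"a"},{"id":"b"}],"meta":{"nextCursor":"b","hasMore":true}}

const err = toErrorBody("orders.not_found", "Order not found", "trace-123");
console.log(JSON.stringify(err));
// {"code":"orders.not_found","message":"Order not found","traceId":"trace-123"}
```

In a NestJS app these shapes would typically be produced by an interceptor and the global exception filter; the point of the sketch is only that every list and every error goes through one function each, so clients see one contract.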
Senior-engineer mindset (decision trees)
Full decision trees live in references/05-thinking-decision-trees.md. Short version:
Where does this file go?
- Business feature → modules/<feature>/
- App-wide infra (auth, redis, mail) → core/<name>/
- Generic utility (pipe, decorator, type) → common/<kind>/
- External API client (Stripe, AWS) → integrations/<provider>/
- CLI/CRON job → commands/<name>.command.ts
- Domain event publisher/listener → events/<event-name>/
Should I create a new module?
- Does it own its own DB tables? → Yes, new module.
- Does it have its own lifecycle / domain logic? → Yes, new module.
- Is it just a helper used by one module? → No, put it inside that module's utils/.
- Is it used by 2+ modules but owns no data? → common/ utility.
Should I refactor this now?
- Am I already editing this code? → Yes, clean as you go.
- Is it blocking my current feature? → Yes, refactor minimally.
- Is it just "ugly"? → No, note it and move on.
Should I skip the test?
- Is it a controller? → No. E2E test always.
- Is it a service with branching logic? → No. Unit test always.
- Is it a thin pass-through (e.g., findById) with no logic? → Skipping is OK; the e2e test of the controller covers it.
- Is it a DTO / type? → No test needed; the compiler is the test.
Rule index (one line per reference)
Read the full reference file when you need detail. The number prefix is for stable ordering.
| # | File | Rule in one line |
|---|---|---|
| 00 | 00-execution-discipline.md | Think first, verify volatile facts (versions/APIs/model IDs) against docs, keep changes small, edit surgically, define success criteria, verify before claiming done |
| 01 | 01-folder-structure.md | src/{core,common,integrations,modules,events,commands}: one place for each kind of code |
| 02 | 02-naming-conventions.md | camelCase vars, PascalCase classes, snake_case DB, kebab-case.ts files, SCREAMING_SNAKE env |
| 03 | 03-module-design.md | One module per bounded context; @Global() only for true app-wide infra |
| 04 | 04-code-quality.md | SOLID, constructor DI, pure utils, small functions, no any without a reason |
| 05 | 05-thinking-decision-trees.md | How to decide: where to put code, when to refactor, when to skip a test |
| 06 | 06-api-design.md | REST, plural nouns, verbs match semantics, URI versioning /v1/..., idempotency keys |
| 07 | 07-standard-responses.md | Single success returns a plain object; lists return { data, meta }; errors return { code, message, details?, traceId } |
| 08 | 08-pagination-filters-sorting.md | Cursor/keyset for sequential browsing, offset when page numbers/exact totals are real requirements; filter[field]=, sort=-createdAt; whitelist fields |
| 09 | 09-validation.md | class-validator DTOs + global ValidationPipe; Zod for env + runtime JSON parsing |
| 10 | 10-error-handling.md | Hybrid taxonomy: HTTP status + namespaced code + traceId; domain errors extend semantic Nest exceptions; one global filter with host.getType() + headersSent guards, logs via PinoLogger |
| 11 | 11-security.md | Security review routine: OWASP Top 10, transport/CORS, injection/SSRF, password hashing, rate limits, PII/audit, and links to auth/webhooks/uploads |
| 12 | 12-authentication-patterns.md | Session cookie (browsers) or Bearer JWT (mobile/server); hash session/refresh tokens at rest; rotate refresh; use revoked_before; cookie takes precedence and invalid cookies fail closed; auth errors use { code, message } |
| 13 | 13-database-design.md | snake_case, plural tables, FK <entity>_id, indexes on FKs + query paths, deleted_at, UUIDv7 or bigint |
| 14 | 14-database-orm-patterns.md | raw pg / TypeORM / Prisma / Drizzle: side-by-side patterns |
| 15 | 15-migrations.md | Always forward-only in prod; no destructive changes without two-step rollout |
| 16 | 16-cascade-rules.md | ON DELETE CASCADE for owned data; RESTRICT for shared refs; SET NULL for optional |
| 17 | 17-pipelines-interceptors-guards.md | Order: Guard → Interceptor (pre) → Pipe → Handler → Interceptor (post) → Filter |
| 18 | 18-events.md | EventEmitter2 for in-process; outbox pattern when crossing services or queues |
| 19 | 19-background-jobs.md | BullMQ default; idempotent handlers; retries with backoff; DLQ for poison messages |
| 20 | 20-configuration.md | ConfigModule global; Zod schema; fail fast on boot if env invalid |
| 21 | 21-logging.md | nestjs-pino, JSON in prod, redact secrets, correlation ID per request |
| 22 | 22-observability.md | OpenTelemetry traces + metrics; Langfuse/Helicone for LLM traces |
| 23 | 23-testing.md | Unit beside impl (*.spec.ts); e2e in test/; mock at boundaries; real DB for integration |
| 24 | 24-performance.md | Avoid N+1; size the pool; cache selectively; stream large payloads |
| 24a | 24a-caching-patterns.md | Cache deliberately; stable namespaced keys, TTL + invalidation, stampede protection; never the sole authority for auth/quota/billing |
| 25 | 25-documentation-swagger.md | @ApiTags / @ApiOperation / @ApiResponse; DTOs auto-schema via @ApiProperty |
| 26 | 26-ai-product-patterns.md | LLM gateway with provider abstraction, retry, fallback, timeout |
| 27 | 27-ai-streaming-sse.md | SSE endpoints; cancel-aware (abort upstream); heartbeat; typed event vocab; not resumable on reconnect |
| 28 | 28-ai-usage-metering-cost.md | Per-call token + cost rows; aggregate per user/org/model; enforce quotas |
| 29 | 29-code-review-checklist.md | PR review checklist across all rules above |
| 30 | 30-code-review-anti-patterns.md | Catalog of anti-patterns with good-vs-bad snippets |
| 31 | 31-rules-rationale-examples.md | Cross-cut rule + rationale + good/bad examples for quick reference |
| 32 | 32-modern-nestjs-stack.md | Decision checklist for modernizing/starting a NestJS service; bootstrap order, module-system checks; no frozen version matrix |
| 33 | 33-multi-tenancy-patterns.md | Server-derived tenant identity enforced across auth, guard, service, and repository layers; tests prove isolation |
| 34 | 34-health-shutdown.md | Liveness vs readiness; one shutdown coordinator; drain before close; worker processes drain separately |
| 35 | 35-source-of-truth-freshness.md | Durable invariants stay local; volatile APIs/versions/models verified against official docs and the repo |
| 36 | 36-webhooks.md | Verify signature on raw bytes (raw-body config + timingSafeEqual), dedupe on (provider, event_id), ack 2xx after enqueue (incl. unhandled types), re-fetch authoritative state for high-stakes events, resolve tenant from the verified payload |
| 37 | 37-file-uploads.md | Prefer presigned direct-to-bucket uploads (PUT for clients, POST policy for browsers); cap size/MIME at the boundary; sniff magic bytes; opaque tenant-prefixed storage keys; server-compute hash/size/mime; AV scan before exposure; rate-limit upload endpoints |
| 38 | 38-decorators-scopes-dynamic-modules.md | Param decorators only extract from request; default to singleton scope; dynamic modules for configurable infra; forwardRef is a smell |
| 39 | 39-exception-filters.md | One global filter shapes every error to { code, message, details?, traceId }; throw typed HttpException subclasses; never leak internals |
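As one concrete instance of an index row, the webhook rule (row 36: verify the signature on raw bytes with timingSafeEqual, dedupe on (provider, event_id), ack after enqueue) might look roughly like the sketch below. It uses only Node's built-in crypto module. The HMAC-SHA256 hex scheme, the function names, the in-memory Set standing in for a unique index, and the event payload shape are assumptions for illustration; real providers define their own signing schemes, so check their documentation.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the provider signs the raw request body with HMAC-SHA256 and
// sends the hex digest in a header. Real providers differ.
function verifySignature(rawBody: Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

// In-memory stand-in for a unique index on (provider, event_id).
const seenEvents = new Set<string>();

type WebhookResult = "enqueued" | "duplicate" | "rejected";

function handleWebhook(
  provider: string,
  rawBody: Buffer,
  signatureHex: string,
  secret: string,
): WebhookResult {
  // Verify against the exact bytes received, before any JSON parsing.
  if (!verifySignature(rawBody, signatureHex, secret)) return "rejected";

  const event = JSON.parse(rawBody.toString("utf8")) as { id: string };
  const key = `${provider}:${event.id}`;
  if (seenEvents.has(key)) return "duplicate"; // still acknowledged with 2xx
  seenEvents.add(key);

  // A real handler would enqueue the event for async processing here,
  // then acknowledge with 2xx.
  return "enqueued";
}

const secret = "test-secret";
const body = Buffer.from(JSON.stringify({ id: "evt_1", type: "invoice.paid" }));
const signature = createHmac("sha256", secret).update(body).digest("hex");

console.log(handleWebhook("acme", body, signature, secret)); // enqueued
console.log(handleWebhook("acme", body, signature, secret)); // duplicate
console.log(handleWebhook("acme", body, "00", secret)); // rejected
```

The length check before timingSafeEqual matters: a malformed or truncated signature should produce a clean rejection, not an exception.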
When to deep-read
| You are doing... | Open... |
|---|---|
| Starting any implementation or bug fix | 00, 05 |
| Starting a new feature module | 01, 03, 04 |
| Starting or modernizing a service | 32, 34, 20 |
| Designing a new endpoint | 06, 07, 08, 09 |
| Returning errors consistently | 10 |
| Designing DB tables | 13, 14, 15, 16 |
| Adding auth to an endpoint | 11, 12, 17 |
| Adding multi-tenant isolation | 33, 11, 12, 14 |
| Adding a list endpoint | 07, 08 |
| Adding caching | 24, 24a |
| Adding a background task | 19, 34 |
| Writing tests | 23 |
| Adding observability | 21, 22 |
| Adding health/readiness/shutdown | 34, 32 |
| Building an LLM feature | 26, 27, 28, 22 |
| Handling webhooks from a third party | 36, 11, 19, 21, 33 |
| Accepting file uploads | 37, 11, 24 |
| Designing custom decorators / scoped providers / dynamic modules | 38, 03, 17 |
| Designing the global exception filter / error wire shape | 39, 07, 10 |
| Giving version/command/model advice | 35 |
| Reviewing a PR | 29, 30, and any topic relevant to the diff |
Code review mode
When the user asks you to review a PR, changes, or a diff:
- Read the diff top-to-bottom first — understand intent before critiquing.
- Walk the 29-code-review-checklist.md and mark each item pass/fail/NA.
- For each fail, cite the specific section in the relevant reference file (e.g., "business logic in controller, see 04-code-quality.md → Layering / Anti-patterns").
- Distinguish blockers (security, data-loss risk, non-negotiable rule broken) from suggestions (style, minor naming, optional refactor). Don't block on suggestions.
- If you spot an anti-pattern, link to 30-code-review-anti-patterns.md and quote the good version.
- End the review with: (a) blockers (numbered), (b) suggestions (bulleted), (c) approval conditional on blockers being resolved.
AI product appendix
If the project is an AI product backend (LLMs, agents, RAG, vector store, streaming, usage-based billing), also read:
- 26-ai-product-patterns.md: provider-agnostic LLM gateway, retry/fallback
- 27-ai-streaming-sse.md: SSE endpoint patterns, cancel, heartbeat
- 28-ai-usage-metering-cost.md: token + cost metering, quota enforcement
- 22-observability.md: Langfuse/Helicone for LLM tracing
Everything else (auth, DB, error handling, testing) applies identically to AI products — the AI bit is a module, not a framework.
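To make "provider-agnostic LLM gateway, retry/fallback" concrete, here is a minimal sketch under stated assumptions. The LlmProvider interface, the option names, and the exponential backoff policy are hypothetical and not taken from 26-ai-product-patterns.md; a real gateway would adapt vendor SDK clients to such an interface and would likely retry only transient errors instead of every failure.

```typescript
// Hypothetical provider interface; real SDK clients would be adapted to it.
interface LlmProvider {
  name: string;
  complete(prompt: string, signal: AbortSignal): Promise<string>;
}

interface GatewayOptions {
  maxAttempts: number; // attempts per provider before falling back
  timeoutMs: number; // per-attempt timeout
  backoffMs: number; // base delay, doubled after each failed attempt
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// One attempt with a timeout. On timeout the signal is aborted so a
// cooperative provider can cancel its upstream request.
async function attempt(provider: LlmProvider, prompt: string, timeoutMs: number): Promise<string> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new Error(`${provider.name} timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([provider.complete(prompt, controller.signal), timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Try providers in order; retry each with backoff, then fall back to the next.
async function completeWithFallback(
  providers: LlmProvider[],
  prompt: string,
  opts: GatewayOptions,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const provider of providers) {
    for (let i = 1; i <= opts.maxAttempts; i++) {
      try {
        const text = await attempt(provider, prompt, opts.timeoutMs);
        return { provider: provider.name, text };
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        errors.push(`${provider.name}#${i}: ${reason}`);
        if (i < opts.maxAttempts) await sleep(opts.backoffMs * 2 ** (i - 1));
      }
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Usage with two fake providers: the first always fails, the second answers.
const flaky: LlmProvider = {
  name: "primary",
  complete: async () => {
    throw new Error("503 from upstream");
  },
};
const stable: LlmProvider = {
  name: "secondary",
  complete: async (prompt) => `echo: ${prompt}`,
};

completeWithFallback([flaky, stable], "hi", { maxAttempts: 2, timeoutMs: 1000, backoffMs: 10 })
  .then((result) => console.log(result.provider, result.text)); // secondary echo: hi
```

Keeping the gateway behind one function is what lets the rest of the app treat the AI feature as an ordinary module: callers see a promise that resolves or rejects, and the choice of vendor stays an internal detail.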
Meta
- Execution discipline before framework detail. Small, explicit, verified changes beat fast, broad, clever ones.
- Rules > style preferences. These are rules because they prevent real bugs or pay back in maintainability. Style debates are not covered.
- Consistency > cleverness. If a repo has an established equivalent-safe pattern that differs from this skill, follow the repo unless the pattern is actively harmful.
- When in doubt, keep it boring. Boring code is easy to review, easy to onboard, and easy to replace. That is the point.