NestJS Dev Guidelines

A complete set of production-grade NestJS and Nest-style Node.js backend standards. Apply these rules whenever working on a codebase that already uses NestJS concepts such as modules, controllers, providers/services, DTOs, pipes, and guards, or when the task is explicitly about designing those patterns. Think like a senior backend engineer: consistency over cleverness, explicit over implicit, boundaries over shortcuts.

How to use this skill

Before writing or reviewing code, scan references/00-execution-discipline.md and the Non-Negotiables below.
For file placement, module boundaries, or refactor decisions, use the Decision Trees.
For deep detail on a topic, open the matching references/NN-<topic>.md file. Each is self-contained: TL;DR, rules, good/bad examples, anti-patterns, and a review checklist.
When reviewing a PR, run through references/29-code-review-checklist.md plus the topic references relevant to the diff.

Scope and precedence

Use this section when the repo is "backend TypeScript" but not obviously a standard NestJS app.

NestJS-first scope. Apply this skill directly when the repo already uses NestJS primitives (@Module, controllers, providers/services, guards, pipes, DTOs) or when the user is asking you to design a NestJS solution.
Plain Node backends are partial-match only. If the repo is plain Express/Fastify/Hono/Koa without NestJS primitives, use only the cross-cutting guidance here (validation, contracts, authz, DB design, testing, observability). Do not force Nest-specific APIs, decorators, or folder structure onto that codebase.
Protect outcomes first, tooling second. The non-negotiables below are invariants first and implementation defaults second. Centralized input validation, stable response contracts, structured redacted logging, ownership checks, and test coverage matter more than the exact library used to achieve them.
Follow equivalent-safe repo conventions. If the repo already uses a different but healthy mechanism that preserves the same safety/property boundary (for example Zod instead of class-validator, or a repo-standard structured logger instead of nestjs-pino), follow the repo rather than rewriting it mid-task.
Escalate harmful conflicts. If the existing pattern weakens security, correctness, data integrity, or a stable contract, call it out explicitly and ask before normalizing or widening that pattern.

Execution discipline

Apply these rules before touching NestJS-specific details. They exist to reduce the usual LLM failure modes: silent assumptions, overbuilt code, broad refactors, and unverified fixes.

Think before coding. State assumptions. If the task is ambiguous, ask or explicitly list the plausible interpretations instead of silently picking one.
Verify volatile facts. Versions, package APIs, model IDs, install commands, and CLI flags change. Verify them against official docs and the repo before recommending — do not recall from memory. See 35-source-of-truth-freshness.md.
Search before writing new code. Check the current module first, then shared/common/core code for an existing util, DTO, service, guard, interceptor, repository, or pattern you can reuse.
Fix root causes, not trigger hacks. Do not patch broad behavior with one-off regexes, hardcoded phrases, or manual if/then rules unless the product requirement is literally deterministic rule-based routing.
Prefer the smallest correct change. No speculative flags, abstractions, helpers, config, or edge-case handling unless the task or codebase clearly needs them.
Make surgical edits. Touch only the lines needed for the request. Do not clean up unrelated code, comments, or formatting just because you are nearby.
Search and update the full impact surface. If a shared function, contract, DTO, type, or behavior changes, scan and update callers, tests, docs, examples, and related flows until the whole change is consistent.
Ask before changing shared behavior. If reusing existing code requires changing shared semantics, multiple callers, or a reusable contract, stop and confirm instead of silently widening the blast radius.
Define what success looks like. Convert vague requests into checks you can verify: regression test, unit test, e2e test, typecheck, lint, build, or a concrete manual check.
Stop when confused. Name the uncertainty early. Short clarifying questions are cheaper than rewriting the wrong code.

Open references/00-execution-discipline.md for the full checklist and examples.

Non-negotiables (protect these outcomes; prefer these patterns)

Each rule has a Why so you can reason about edge cases instead of applying it blindly.

No business logic in a controller. Controllers validate input, delegate to a service, and shape the response. Nothing else. Why: controllers are thin HTTP adapters. Logic in them can't be unit-tested without booting the framework, and it pulls HTTP concerns into domain code. See 04-code-quality.md.
Every external input goes through a DTO. Body, query, and param DTOs use class-validator; ValidationPipe is global with whitelist: true, forbidNonWhitelisted: true, transform: true. Why: DTOs are the one choke point where unknown fields, bad types, and injection payloads are stopped. Skipping one means you trust the client. See 09-validation.md.
Never trust the client. Every ID from the URL is checked against the authenticated user/org's ownership. Filter and sort fields are whitelisted. Mass-assignment is prevented by whitelist. Why: IDOR and mass-assignment are among the most common application-level vulnerabilities. They arise when code assumes "if the token is valid, the payload is fine." See 11, 12.
One module owns its tables. No cross-module raw DB reads. If module B needs data from module A, call A's service (DI) or subscribe to A's events. Why: shared table access makes every schema change a cross-team coordination problem. Modules become coupled through the DB instead of through APIs. See 03-module-design.md.
snake_case in the database, camelCase in code. Tables plural, columns snake_case, primary keys id, foreign keys <entity>_id. Why: each ecosystem has a convention; mixing them creates a lifetime of mapping bugs and makes ad-hoc SQL painful. Pick the convention of the side that's hardest to change (the DB). See 13-database-design.md.
Responses follow one stable contract. Single-resource success returns the object itself; list success returns { data, meta }; errors return { code, message, details?, traceId } with the correct HTTP status. Why: a consistent contract lets clients write one error handler and one pagination handler that works everywhere, and lets support triage issues by traceId. See 07, 10.
Pagination is required for any list endpoint. Choose cursor/keyset for sequential browsing over mutable or large data; choose offset when page-number navigation or exact totals are real product requirements. Both return meta with pagination info. Why: an unpaginated list is a latent OOM and a latent DB outage. The right pagination model depends on UX, consistency requirements, and scale. See 08.
Secrets come from env only — validated with Zod at boot. No secrets in code, no secrets in logs. Invalid env = crash before serving traffic. Why: a missing/malformed env var caught at boot is a minor incident; caught at runtime on the hot path, it's a customer outage. See 20, 11.
Structured logs, redacted. nestjs-pino with JSON in prod; redact authorization, cookie, set-cookie, password, token. Correlation ID on every log line. Why: plain-text logs can't be queried at scale, and one unredacted Authorization header is a credential leak with a long tail. See 21.
Test at boundaries, not internals. Unit-test services with mocked dependencies; e2e-test controllers through the HTTP layer; never mock the class under test. Why: mocking the code under test just re-asserts the mock. Tests should verify the contract (HTTP, DB, external calls), not the implementation. See 23.
Tenant identity is server-derived and layered. In a multi-tenant app, tenant/org id comes from the authenticated session/JWT, never from the request body. Guards, services, and repositories each filter by tenant; tests prove tenant A cannot see tenant B's data. Why: cross-tenant leaks almost always come from a single missing WHERE tenant_id = ? or a client-supplied id that was trusted too early. Layered enforcement means a bug in one layer does not leak data. See 33.
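As one illustration of the "never trust the client" and tenant-identity rules above, here is a minimal, dependency-free sketch. All names (`AuthContext`, `Invoice`, `InvoiceRepository`, `InvoiceService`, `NotFoundError`) are hypothetical and the repository is an in-memory stand-in. In a real Nest app the service would be an injectable provider, and the repository would issue a query that filters on `tenant_id`.

```typescript
// Hypothetical types for illustration; none of these come from NestJS or a library.
interface AuthContext {
  userId: string;
  tenantId: string; // derived from the verified session/JWT, never from the request body
}

interface Invoice {
  id: string;
  tenantId: string;
  amountCents: number;
}

class NotFoundError extends Error {}

// In-memory stand-in; a real repository would add a tenant_id filter to every query.
class InvoiceRepository {
  private readonly rows: Invoice[];

  constructor(rows: Invoice[]) {
    this.rows = rows;
  }

  findById(tenantId: string, id: string): Invoice | undefined {
    return this.rows.find((row) => row.tenantId === tenantId && row.id === id);
  }
}

class InvoiceService {
  private readonly repo: InvoiceRepository;

  constructor(repo: InvoiceRepository) {
    this.repo = repo;
  }

  // The id comes from the URL; the tenant comes only from the authenticated context.
  getInvoice(auth: AuthContext, id: string): Invoice {
    const invoice = this.repo.findById(auth.tenantId, id);
    if (!invoice) {
      // Same error for "missing" and "owned by another tenant", so ids cannot be probed.
      throw new NotFoundError(`Invoice ${id} not found`);
    }
    return invoice;
  }
}

const invoiceService = new InvoiceService(
  new InvoiceRepository([{ id: "inv_1", tenantId: "tenant_a", amountCents: 500 }]),
);

// Tenant A can read its own invoice.
const ownInvoice = invoiceService.getInvoice({ userId: "u1", tenantId: "tenant_a" }, "inv_1");
console.log(ownInvoice.amountCents);
```

Calling `getInvoice` with a `tenantId` of `tenant_b` for the same id throws `NotFoundError`, which is the behaviour an isolation test should assert.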

Senior-engineer mindset (decision trees)

Full decision trees live in references/05-thinking-decision-trees.md. Short version:

Where does this file go?

Business feature → modules/<feature>/
App-wide infra (auth, redis, mail) → core/<name>/
Generic utility (pipe, decorator, type) → common/<kind>/
External API client (Stripe, AWS) → integrations/<provider>/
CLI/CRON job → commands/<name>.command.ts
Domain event publisher/listener → events/<event-name>/

Should I create a new module?

Does it own its own DB tables? → Yes, new module.
Does it have its own lifecycle / domain logic? → Yes, new module.
Is it just a helper used by one module? → No, put it inside that module's utils/.
Is it used by 2+ modules but owns no data? → common/ utility.

Should I refactor this now?

Am I already editing this code? → Yes, clean as you go.
Is it blocking my current feature? → Yes, refactor minimally.
Is it just "ugly"? → No, note it and move on.

Should I skip the test?

Is it a controller? → No. E2E test always.
Is it a service with branching logic? → No. Unit test always.
Is it a thin pass-through (e.g., findById) with no logic? → Skipping is OK; the e2e test of the controller covers it.
Is it a DTO / type? → No test needed; the compiler is the test.
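The "service with branching logic" case can be sketched as follows. This is a hedged, dependency-free example: `QuotaService`, `PlanRepository`, and the limits are hypothetical, and plain `node:assert` stands in for whatever test runner the repo uses (in a Nest project this would normally live in a `*.spec.ts` file). The point is that the fake replaces the dependency at the boundary, while the class under test stays real.

```typescript
import { strictEqual, throws } from "node:assert";

// Hypothetical domain types, used only for this sketch.
interface Plan {
  tier: "free" | "pro";
}

interface PlanRepository {
  findPlan(userId: string): Plan | undefined;
}

class QuotaService {
  private readonly plans: PlanRepository;

  constructor(plans: PlanRepository) {
    this.plans = plans;
  }

  // Branching logic: each branch below deserves its own assertion.
  dailyLimit(userId: string): number {
    const plan = this.plans.findPlan(userId);
    if (!plan) {
      throw new Error("PLAN_NOT_FOUND");
    }
    return plan.tier === "pro" ? 1000 : 50;
  }
}

// Fake at the boundary (the repository). The service itself is not mocked.
const knownPlans = new Map<string, Plan>([
  ["u_pro", { tier: "pro" }],
  ["u_free", { tier: "free" }],
]);
const fakePlans: PlanRepository = {
  findPlan: (userId) => knownPlans.get(userId),
};

const quotaService = new QuotaService(fakePlans);

strictEqual(quotaService.dailyLimit("u_pro"), 1000); // pro branch
strictEqual(quotaService.dailyLimit("u_free"), 50); // free branch
throws(() => quotaService.dailyLimit("u_missing"), /PLAN_NOT_FOUND/); // missing-plan branch
```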

Rule index (one line per reference)

Read the full reference file when you need detail. The number prefix is for stable ordering.

#	File	Rule in one line
00	`00-execution-discipline.md`	Think first, verify volatile facts (versions/APIs/model IDs) against docs, keep changes small, edit surgically, define success criteria, verify before claiming done
01	`01-folder-structure.md`	`src/{core,common,integrations,modules,events,commands}` — one place for each kind of code
02	`02-naming-conventions.md`	`camelCase` vars, `PascalCase` classes, `snake_case` DB, `kebab-case.ts` files, `SCREAMING_SNAKE` env
03	`03-module-design.md`	One module per bounded context; `@Global()` only for true app-wide infra
04	`04-code-quality.md`	SOLID, constructor DI, pure utils, small functions, no `any` without a reason
05	`05-thinking-decision-trees.md`	How to decide: where to put code, when to refactor, when to skip a test
06	`06-api-design.md`	REST, plural nouns, verbs match semantics, URI versioning `/v1/...`, idempotency keys
07	`07-standard-responses.md`	Single success returns a plain object; lists return `{ data, meta }`; errors return `{ code, message, details?, traceId }`
08	`08-pagination-filters-sorting.md`	Cursor/keyset for sequential browsing, offset when page numbers/exact totals are real requirements; `filter[field]=`, `sort=-createdAt`; whitelist fields
09	`09-validation.md`	class-validator DTOs + global ValidationPipe; Zod for env + runtime JSON parsing
10	`10-error-handling.md`	Hybrid taxonomy: HTTP status + namespaced code + traceId; domain errors extend semantic Nest exceptions; one global filter with `host.getType()` + `headersSent` guards, logs via `PinoLogger`
11	`11-security.md`	Security review routine: OWASP Top 10, transport/CORS, injection/SSRF, password hashing, rate limits, PII/audit, and links to auth/webhooks/uploads
12	`12-authentication-patterns.md`	Session cookie (browsers) or Bearer JWT (mobile/server); hash session/refresh tokens at rest; rotate refresh; use `revoked_before`; cookie takes precedence and invalid cookies fail closed; auth errors use `{ code, message }`
13	`13-database-design.md`	snake_case, plural tables, FK `<entity>_id`, indexes on FKs + query paths, `deleted_at`, UUIDv7 or bigint
14	`14-database-orm-patterns.md`	raw pg / TypeORM / Prisma / Drizzle — side-by-side patterns
15	`15-migrations.md`	Always forward-only in prod; no destructive changes without two-step rollout
16	`16-cascade-rules.md`	`ON DELETE CASCADE` for owned data; `RESTRICT` for shared refs; `SET NULL` for optional
17	`17-pipelines-interceptors-guards.md`	Order: Guard → Interceptor (pre) → Pipe → Handler → Interceptor (post); the exception filter runs only when a stage throws
18	`18-events.md`	EventEmitter2 for in-process; outbox pattern when crossing services or queues
19	`19-background-jobs.md`	BullMQ default; idempotent handlers; retries with backoff; DLQ for poison messages
20	`20-configuration.md`	`ConfigModule` global; Zod schema; fail fast on boot if env invalid
21	`21-logging.md`	nestjs-pino, JSON in prod, redact secrets, correlation ID per request
22	`22-observability.md`	OpenTelemetry traces + metrics; Langfuse/Helicone for LLM traces
23	`23-testing.md`	Unit beside impl (`*.spec.ts`); e2e in `test/`; mock at boundaries; real DB for integration
24	`24-performance.md`	Avoid N+1; size the pool; cache selectively; stream large payloads
24a	`24a-caching-patterns.md`	Cache deliberately; stable namespaced keys, TTL + invalidation, stampede protection; never the sole authority for auth/quota/billing
25	`25-documentation-swagger.md`	`@ApiTags` / `@ApiOperation` / `@ApiResponse`; DTOs auto-schema via `@ApiProperty`
26	`26-ai-product-patterns.md`	LLM gateway with provider abstraction, retry, fallback, timeout
27	`27-ai-streaming-sse.md`	SSE endpoints; cancel-aware (abort upstream); heartbeat; typed event vocab; not resumable on reconnect
28	`28-ai-usage-metering-cost.md`	Per-call token + cost rows; aggregate per user/org/model; enforce quotas
29	`29-code-review-checklist.md`	PR review checklist across all rules above
30	`30-code-review-anti-patterns.md`	Catalog of anti-patterns with good-vs-bad snippets
31	`31-rules-rationale-examples.md`	Cross-cut rule + rationale + good/bad examples for quick reference
32	`32-modern-nestjs-stack.md`	Decision checklist for modernizing/starting a NestJS service; bootstrap order, module-system checks; no frozen version matrix
33	`33-multi-tenancy-patterns.md`	Server-derived tenant identity enforced across auth, guard, service, and repository layers; tests prove isolation
34	`34-health-shutdown.md`	Liveness vs readiness; one shutdown coordinator; drain before close; worker processes drain separately
35	`35-source-of-truth-freshness.md`	Durable invariants stay local; volatile APIs/versions/models verified against official docs and the repo
36	`36-webhooks.md`	Verify signature on raw bytes (raw-body config + `timingSafeEqual`), dedupe on `(provider, event_id)`, ack `2xx` after enqueue (incl. unhandled types), re-fetch authoritative state for high-stakes events, resolve tenant from the verified payload
37	`37-file-uploads.md`	Prefer presigned direct-to-bucket uploads (`PUT` for clients, `POST` policy for browsers); cap size/MIME at the boundary; sniff magic bytes; opaque tenant-prefixed storage keys; server-compute hash/size/mime; AV scan before exposure; rate-limit upload endpoints
38	`38-decorators-scopes-dynamic-modules.md`	Param decorators only extract from request; default to singleton scope; dynamic modules for configurable infra; `forwardRef` is a smell
39	`39-exception-filters.md`	One global filter shapes every error to `{ code, message, details?, traceId }`; throw typed `HttpException` subclasses; never leak internals
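Row 36 above (verify the signature on raw bytes, dedupe on `(provider, event_id)`) can be sketched with Node's built-in `crypto` module. This is a minimal sketch under stated assumptions: the provider is assumed to sign the raw body with HMAC-SHA256 and send a hex digest, and `verifySignature`, `isDuplicate`, and the secret are hypothetical names. Real providers differ in header name, encoding, and timestamp handling, so check their documentation.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: HMAC-SHA256 over the raw request bytes, hex-encoded by the provider.
function verifySignature(rawBody: Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so reject mismatched lengths first.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}

// Dedupe on (provider, event_id). In production this would be a unique DB constraint,
// not an in-memory set.
const seenEvents = new Set<string>();

function isDuplicate(provider: string, eventId: string): boolean {
  const key = `${provider}:${eventId}`;
  if (seenEvents.has(key)) {
    return true;
  }
  seenEvents.add(key);
  return false;
}

const exampleSecret = "whsec_example"; // placeholder, never hardcode real secrets
const exampleBody = Buffer.from('{"id":"evt_1","type":"invoice.paid"}');
const exampleSignature = createHmac("sha256", exampleSecret).update(exampleBody).digest("hex");

console.log(verifySignature(exampleBody, exampleSignature, exampleSecret)); // true
console.log(isDuplicate("example-provider", "evt_1")); // false on first delivery
console.log(isDuplicate("example-provider", "evt_1")); // true on redelivery
```

The signature must be computed over the exact bytes received. Re-serializing a parsed JSON body can change whitespace or key order and make a valid signature fail.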

When to deep-read

You are doing...	Open...
Starting any implementation or bug fix	00, 05
Starting a new feature module	01, 03, 04
Starting or modernizing a service	32, 34, 20
Designing a new endpoint	06, 07, 08, 09
Returning errors consistently	10
Designing DB tables	13, 14, 15, 16
Adding auth to an endpoint	11, 12, 17
Adding multi-tenant isolation	33, 11, 12, 14
Adding a list endpoint	07, 08
Adding caching	24, 24a
Adding a background task	19, 34
Writing tests	23
Adding observability	21, 22
Adding health/readiness/shutdown	34, 32
Building an LLM feature	26, 27, 28, 22
Handling webhooks from a third party	36, 11, 19, 21, 33
Accepting file uploads	37, 11, 24
Designing custom decorators / scoped providers / dynamic modules	38, 03, 17
Designing the global exception filter / error wire shape	39, 07, 10
Giving version/command/model advice	35
Reviewing a PR	29, 30, and any topic relevant to the diff
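For the "Adding a list endpoint" row, the rules from 07 and 08 (whitelisted `sort=-createdAt` fields and a `{ data, meta }` envelope) might look like the sketch below. The helper names and the exact `meta` fields (`nextCursor`, `limit`) are assumptions, since this guide only fixes the outer `{ data, meta }` shape. Using `id` as the cursor also assumes rows are ordered by `id`.

```typescript
type SortDirection = "asc" | "desc";

interface SortSpec {
  field: string;
  direction: SortDirection;
}

// Hypothetical whitelist for one resource.
const SORTABLE_FIELDS = ["createdAt", "name"] as const;

// Parses "sort=-createdAt,name" style values; a leading "-" means descending.
function parseSort(raw: string | undefined, allowed: readonly string[]): SortSpec[] {
  if (!raw) {
    return [];
  }
  return raw.split(",").map((token): SortSpec => {
    const trimmed = token.trim();
    const descending = trimmed.startsWith("-");
    const field = descending ? trimmed.slice(1) : trimmed;
    if (!allowed.includes(field)) {
      // Unknown fields are rejected rather than passed into a query.
      throw new Error(`Unsupported sort field: ${field}`);
    }
    return { field, direction: descending ? "desc" : "asc" };
  });
}

// Assumed meta shape for cursor pagination.
interface ListResponse<T> {
  data: T[];
  meta: { nextCursor: string | null; limit: number };
}

// The caller fetches limit + 1 rows; the extra row only signals that another page exists.
function toListResponse<T extends { id: string }>(rows: T[], limit: number): ListResponse<T> {
  const hasMore = rows.length > limit;
  const data = hasMore ? rows.slice(0, limit) : rows;
  const last = data[data.length - 1];
  return { data, meta: { nextCursor: hasMore && last ? last.id : null, limit } };
}

console.log(parseSort("-createdAt,name", SORTABLE_FIELDS));
console.log(toListResponse([{ id: "a" }, { id: "b" }, { id: "c" }], 2).meta.nextCursor); // "b"
```

Rejecting an unknown sort field with an error, rather than silently ignoring it, keeps the contract explicit. A real endpoint would map that error to a 400 response in the standard error shape.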

Code review mode

When the user asks you to review a PR, changes, or a diff:

Read the diff top-to-bottom first — understand intent before critiquing.
Walk through 29-code-review-checklist.md and mark each item pass/fail/NA.
For each fail, cite the specific section in the relevant reference file (e.g., "business logic in controller — see 04-code-quality.md → Layering / Anti-patterns").
Distinguish blockers (security, data-loss risk, non-negotiable rule broken) from suggestions (style, minor naming, optional refactor). Don't block on suggestions.
If you spot an anti-pattern, link to 30-code-review-anti-patterns.md and quote the good version.
End the review with: (a) blockers (numbered), (b) suggestions (bulleted), (c) approval conditional on blockers being resolved.

AI product appendix

If the project is an AI product backend (LLMs, agents, RAG, vector store, streaming, usage-based billing), also read:

26-ai-product-patterns.md — provider-agnostic LLM gateway, retry/fallback
27-ai-streaming-sse.md — SSE endpoint patterns, cancel, heartbeat
28-ai-usage-metering-cost.md — token + cost metering, quota enforcement
22-observability.md — Langfuse/Helicone for LLM tracing

Everything else (auth, DB, error handling, testing) applies identically to AI products — the AI bit is a module, not a framework.
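The gateway idea from 26 (provider abstraction with retry, fallback, and timeout) can be sketched without any SDK. Everything here is hypothetical: `LlmProvider`, `GatewayOptions`, and `completeWithFallback` are illustrative names, and real provider SDKs have their own request and response types. The timeout relies on each provider honoring the `AbortSignal` it is given.

```typescript
// Hypothetical provider abstraction; not the API of any real SDK.
interface LlmProvider {
  name: string;
  complete(prompt: string, signal: AbortSignal): Promise<string>;
}

interface GatewayOptions {
  timeoutMs: number;
  retriesPerProvider: number;
}

async function completeWithFallback(
  providers: LlmProvider[],
  prompt: string,
  options: GatewayOptions,
): Promise<{ provider: string; text: string }> {
  let lastError: unknown = new Error("No providers configured");

  // Try providers in order; move to the next one only after retries are exhausted.
  for (const provider of providers) {
    for (let attempt = 0; attempt <= options.retriesPerProvider; attempt++) {
      const controller = new AbortController();
      // Assumes the provider rejects when the signal aborts.
      const timer = setTimeout(() => controller.abort(), options.timeoutMs);
      try {
        const text = await provider.complete(prompt, controller.signal);
        return { provider: provider.name, text };
      } catch (error) {
        // A fuller gateway would back off here and skip retries for non-retryable errors.
        lastError = error;
      } finally {
        clearTimeout(timer);
      }
    }
  }
  throw lastError;
}

const failingProvider: LlmProvider = {
  name: "primary",
  complete: async () => {
    throw new Error("503 from upstream");
  },
};
const workingProvider: LlmProvider = {
  name: "backup",
  complete: async (prompt) => `echo: ${prompt}`,
};

completeWithFallback([failingProvider, workingProvider], "hi", {
  timeoutMs: 1000,
  retriesPerProvider: 1,
}).then((result) => console.log(result.provider)); // "backup"
```

Keeping this logic behind one function (or one injectable service in Nest) is what lets the rest of the codebase treat the AI feature as an ordinary module.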
