Skill: cc-defensive-programming
STOP - Never Skip
| Check | Why Critical |
|---|---|
| No executable code in assertions | Code disappears in production builds |
| No empty catch blocks | Silently swallows bugs that compound |
| External input validated | Security vulnerabilities, data corruption |
CRISIS TRIAGE (2 minutes)
Production down? Use this prioritized subset:
Immediate (30 seconds each)
- Is external input validated at entry point? If no → add validation NOW
- Any empty catch blocks hiding the real error? If yes → add logging, find root cause
- Any assertions with side effects? If yes → extract to separate statement
Before Deploying Fix (60 seconds)
- Does fix match architectural error strategy? (return code vs exception vs shut down)
- Are you catching at the right abstraction level? (not EOFException from GetEmployee())
Why triage works: These 5 items catch 80% of defensive programming bugs. Full checklist (21 items) is for non-emergency review.
Empty catch blocks cause compounding failures. A suppressed error in one layer cascades into harder-to-diagnose failures in production.
Key Definitions
External Input
Any data not provably controlled by current code path:
- User input (keyboard, forms, CLI args)
- Files (config, data, uploads)
- Network (APIs, databases, inter-service calls)
- Environment variables, system properties
- Data from ANY other service, even internal ones
"Internal team API" is still external. If it crosses a network boundary or process boundary, validate it.
Assertion
Code used during development that allows a program to check itself as it runs. When true = operating as expected. When false = detected an unexpected error (bug). Use for conditions that should never occur.
Barricade
A damage-containment strategy. Interfaces designated as boundaries to "safe" areas. Data crossing these boundaries is checked for validity.
Limitation: Barricades reduce redundant validation but do NOT replace defense-in-depth for security-critical operations. If barricade validation has a bug, what happens?
Correctness vs Robustness
- Correctness: Never returning an inaccurate result; no result is better than a wrong result (safety-critical)
- Robustness: Always trying to keep the software operating, even if results are sometimes inaccurate (consumer apps)
Preconditions / Postconditions
- Preconditions: Properties client code promises are true BEFORE calling routine
- Postconditions: Properties routine promises are true AFTER executing
- [XREF: Meyer 1997, "Design by Contract"]
When NOT to Use
- Pure functions without side effects - Immutable data flows don't need defensive mutation protection (but still validate inputs at system boundaries)
- Prototype/spike code - Time-boxed exploration (max 1 week) before committing to error strategy
- Test code - Test doubles intentionally violate production patterns
- Performance-critical inner loops - Where assertion overhead matters (must show profiling data proving >5% overhead)
SECURITY EXCEPTION: Security-critical code (authentication, authorization, cryptographic operations, PII handling) is NEVER exempt from defensive programming regardless of other factors. When in doubt, validate.
Crisis Invariants - NEVER SKIP
These checks are almost always required. Exceptions need explicit justification and documentation:
| Check | Time | Why Critical |
|---|---|---|
| No executable code in assertions | 15 sec | Code disappears in production builds |
| No empty catch blocks | 15 sec | Silently swallows bugs that compound |
| External input validated | 30 sec | Security vulnerabilities, data corruption |
| Assertions for bugs only | 15 sec | Assertions disabled in production; anticipated errors need handling |
Why these four? Violations create silent failures that are nearly impossible to debug later. They don't crash loudly - they corrupt data and hide bugs.
Error handling deferred to a future session is rarely completed. Edge cases are forgotten once the code moves out of active context. If deferral is necessary, create a tracked ticket with specific scope.
Pattern Reuse Gate
BEFORE implementing any error handling, search the codebase:
| Search For | Why |
|---|---|
| Same error type elsewhere | How is it handled? Log? Throw? Return code? |
| Same module's error handling | What's the established pattern here? |
| Barricade/validation patterns | Where are the trust boundaries? |
| Exception hierarchy | What custom exceptions exist? |
Questions to answer:
- How does this codebase handle this type of error elsewhere?
- Is there an established error-handling strategy (exceptions vs return codes vs Result types)?
- What logging/monitoring patterns are used?
- Are there existing custom exception classes I should use?
If pattern found: Follow it. Consistency in error handling is critical for debugging.
If no pattern found: You're establishing one. Document your decision. Consider if this should become the pattern.
See: pattern-reuse-gate.md for full gate protocol.
Modes
CHECKER
Purpose: Execute checklists for defensive programming, assertions, exceptions, and error handling
Triggers:
- "review my error handling"
- "check for defensive programming"
- "review assertions"
- "check exception handling"
- "audit my validation code"
Non-Triggers:
- "design my error handling strategy" -> APPLIER
- "review my control flow" -> cc-control-flow-quality
- "review my class design" -> cc-routine-and-class-design
Checklist: See checklists.md
Output Format:
| Item | Status | Evidence | Location |
|------|--------|----------|----------|
Severity:
- VIOLATION: Fails checklist item (missing validation, empty catch blocks, assertions with side effects)
- WARNING: Partial compliance (inconsistent error handling, missing documentation)
- PASS: Meets requirement
APPLIER
Purpose: Apply defensive programming techniques when implementing error handling and validation
Triggers:
- "how should I handle this error"
- "should I use assertion or error handling"
- "help me design barricades"
- "set up defensive programming"
- "implement input validation"
Non-Triggers:
- "review my existing error handling" -> CHECKER
- "optimize my error handling performance" -> performance skill
Produces:
- Assertion placement recommendations
- Error-handling strategy decisions
- Barricade architecture designs
- Input validation implementations
Constraints:
- Use assertions for bugs that should never occur (p.191)
- Use error handling for anticipated error conditions (p.194)
- Design barricades at external interfaces (p.203)
- Favor development debugging aids over production constraints (p.205)
Decision Flowcharts
Assertion vs Error Handling
```dot
digraph assertion_vs_error {
  rankdir=TB;
  START [label="Handling a potentially\nbad condition" shape=doublecircle];
  never [label="Should this NEVER happen?\n(programmer bug)" shape=diamond];
  assertion [label="Use ASSERTION" shape=box style=filled fillcolor=lightblue];
  anticipated [label="Is it anticipated\nbad input?" shape=diamond];
  errorhandling [label="Use ERROR HANDLING" shape=box style=filled fillcolor=lightgreen];
  robust [label="Building highly\nrobust system?" shape=diamond];
  both [label="Use BOTH\n(assert + handle)" shape=box style=filled fillcolor=lightyellow];
  reconsider [label="Clarify requirements:\nIs it a bug or expected?\nAsk domain expert." shape=box style=filled fillcolor=lightcoral];
  START -> never;
  never -> robust [label="yes"];
  never -> anticipated [label="no"];
  robust -> both [label="yes"];
  robust -> assertion [label="no"];
  anticipated -> errorhandling [label="yes"];
  anticipated -> reconsider [label="no/unsure"];
}
```
Correctness vs Robustness
```dot
digraph correctness_vs_robustness {
  rankdir=TB;
  START [label="Choose error\nhandling philosophy" shape=doublecircle];
  safety [label="Safety-critical?\n(medical, aviation, nuclear)" shape=diamond];
  correctness [label="Favor CORRECTNESS\nShut down > wrong result" shape=box style=filled fillcolor=lightcoral];
  consumer [label="Consumer app?\n(games, word processors)" shape=diamond];
  robustness [label="Favor ROBUSTNESS\nKeep running > perfect" shape=box style=filled fillcolor=lightgreen];
  analyze [label="ANALYZE DOMAIN\n(see guidance below)" shape=box style=filled fillcolor=lightyellow];
  START -> safety;
  safety -> correctness [label="yes"];
  safety -> consumer [label="no"];
  consumer -> robustness [label="yes"];
  consumer -> analyze [label="no"];
}
```
Domain Analysis Guidance (for "Analyze domain" path):
| Domain Type | Lean Toward | Key Question |
|---|---|---|
| Enterprise/B2B | Correctness | "Would wrong data cause business decisions based on false info?" |
| SaaS platforms | Balanced | "What's the blast radius of a wrong answer vs unavailability?" |
| Internal tools | Robustness | "Is user technical enough to recover from a crash?" |
| Data pipelines | Correctness | "Does downstream processing assume data integrity?" |
| Real-time systems | Context-dependent | "Is stale data better or worse than no data?" |
Keep Debug Code in Production?
```dot
digraph debug_code_production {
  rankdir=TB;
  START [label="Should this debug\ncode stay in production?" shape=doublecircle];
  important [label="Checks important errors?\n(calculations, data integrity)" shape=diamond];
  keep1 [label="KEEP" shape=box style=filled fillcolor=lightgreen];
  crash [label="Causes hard crash?\n(no save opportunity)" shape=diamond];
  remove1 [label="REMOVE" shape=box style=filled fillcolor=lightcoral];
  diagnose [label="Helps remote diagnosis?\n(logging, state dumps)" shape=diamond];
  keep2 [label="KEEP as silent log" shape=box style=filled fillcolor=lightgreen];
  remove2 [label="REMOVE or make\nunobtrusive" shape=box style=filled fillcolor=lightyellow];
  START -> important;
  important -> keep1 [label="yes"];
  important -> crash [label="no"];
  crash -> remove1 [label="yes"];
  crash -> diagnose [label="no"];
  diagnose -> keep2 [label="yes"];
  diagnose -> remove2 [label="no"];
}
```
Decision Framework: Assertions vs Error Handling
| Condition Type | Use Assertion | Use Error Handling | Guidance |
|---|---|---|---|
| Should never occur (bug) | Yes | No | Assert documents the impossibility |
| Can occur at runtime | No | Yes | Handle gracefully |
| External input | No | Yes | Always validate external data |
| Internal interface (same module) | Yes | No | Assert for contract violations |
| Internal interface (cross-module) | Yes | Yes, if crossing trust boundary | Validate at module boundaries |
| Precondition violation | Yes | Yes, if public API | Public APIs need graceful errors |
| Security-critical | Both | Both | Defense in depth |
| Highly robust systems | Both | Both | Belt and suspenders |
Barricade Design (p.203-204)
MUST be performed in order:
- Identify external interfaces (user input, files, network, APIs, inter-service calls)
- Place validation at barricade boundary - all external data checked here
- Inside barricade: use assertions for internal bugs (data assumed validated)
- Strategy: External = error handling; Internal = assertions
Class-level barricade: Public methods validate and sanitize; private methods within that class can assume data is safe.
Critical caveat: "Trust inside barricade" means reduced redundant validation, NOT zero validation. For security-critical paths (auth, crypto, PII), validate again even inside the barricade. Bugs in barricade validation happen.
Async and Modern Patterns
Traditional exception propagation assumes synchronous call stacks. Modern patterns need different approaches:
Promises/Async-Await
```javascript
// BAD: Unhandled rejection crashes Node.js
async function fetchUser(id) {
  const response = await fetch(`/api/users/${id}`); // Can reject
  return response.json();
}

// GOOD: Explicit error handling
async function fetchUser(id) {
  try {
    const response = await fetch(`/api/users/${id}`);
    if (!response.ok) {
      throw new UserNotFoundError(id); // Domain-level exception
    }
    return response.json();
  } catch (e) {
    if (e instanceof UserNotFoundError) throw e;
    throw new UserServiceError('Failed to fetch user', { cause: e });
  }
}
```
Callbacks
- Errors don't propagate through callbacks - they must be explicitly passed
- Use the error-first callback pattern: callback(error, result)
- Wrap callback APIs in promises for better error handling
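The wrapping step can be sketched like this (readConfig is a hypothetical error-first API; Node's util.promisify does the same thing for conforming APIs):

```javascript
// A simulated async API using the error-first callback convention.
function readConfig(path, callback) {
  setTimeout(() => {
    if (path === 'missing.json') callback(new Error('not found'));
    else callback(null, { path, debug: false });
  }, 0);
}

// Wrap once; afterwards errors propagate through await/try-catch normally.
function readConfigAsync(path) {
  return new Promise((resolve, reject) => {
    readConfig(path, (err, result) => {
      if (err) reject(err);
      else resolve(result);
    });
  });
}
```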
Promise.all() Error Aggregation
```javascript
// BAD: First rejection loses other results
const results = await Promise.all(promises);

// GOOD: Collect all results including failures
const results = await Promise.allSettled(promises);
const failures = results.filter(r => r.status === 'rejected');
if (failures.length > 0) {
  logErrors(failures);
  // Decide: fail entirely or continue with partial results?
}
```
Event Handlers / React Error Boundaries
- Synchronous exceptions in event handlers don't crash the app but ARE silently swallowed
- Use React Error Boundaries to catch rendering errors
- Log errors in event handlers explicitly
Offensive Programming (p.206)
Make errors painful during development so they're found and fixed:
| Technique | Purpose |
|---|---|
| Make asserts abort | Don't let programmers bypass known problems |
| Fill allocated memory | Detect memory allocation errors immediately |
| Fill files/streams completely | Flush out file-format errors early |
| Default/else clauses fail hard | Impossible to overlook unexpected cases |
| Fill objects with junk before deletion | Detect use-after-free immediately |
| Email error logs to yourself | Get notified of errors in the field |
Paradox: During development, make errors noticeable and obnoxious. During production, make errors unobtrusive with graceful recovery.
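The "default/else clauses fail hard" technique can be sketched as follows (nextState and the order states are hypothetical):

```javascript
// An unhandled state is impossible to overlook during development
// because the default clause throws instead of falling through.
function nextState(state) {
  switch (state) {
    case 'pending':  return 'paid';
    case 'paid':     return 'shipped';
    case 'shipped':  return 'delivered';
    default:
      throw new Error(`unhandled order state: ${state}`);
  }
}

const s = nextState('paid');
```

Adding a new state without updating this switch now fails loudly on the first test run rather than silently doing nothing.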
Production Transition (p.209-210)
| Debug Code Type | Action | Rationale |
|---|---|---|
| Checks important errors (calculations, data) | KEEP | Tax calculation errors matter; messy screens don't |
| Checks trivial errors (screen updates) | REMOVE or log silently | Penalty is cosmetic only |
| Causes hard crashes | REMOVE | Users need chance to save work |
| Enables graceful crash with diagnostics | KEEP | Mars Pathfinder diagnosed issues remotely |
| Logging for tech support | KEEP | Convert assertions from halt to log |
| Exposes info to attackers | REMOVE | Error messages shouldn't help attackers |
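The "convert assertions from halt to log" row can be sketched with a small factory (makeSoftAssert and the sink are hypothetical names): the same check halts in development builds and logs silently in production builds.

```javascript
// prod: whether this is a production build.
// sink: where production failures are reported (log, monitoring, etc.).
function makeSoftAssert(prod, sink) {
  return function softAssert(condition, message) {
    if (condition) return;
    if (prod) sink(message);                              // silent log in production
    else throw new Error(`assertion failed: ${message}`); // loud halt in development
  };
}

const reports = [];
const devAssert = makeSoftAssert(false, m => reports.push(m));
const prodAssert = makeSoftAssert(true, m => reports.push(m));

prodAssert(1 > 2, 'invariant broke'); // logged, not thrown
```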
Evidence Summary
| Claim | Source | Application |
|---|---|---|
| "Garbage in, garbage out" is obsolete | McConnell p.188 | Production software must validate or reject |
| Assertions especially useful in large/complex programs | McConnell p.189 | More code = more interface mismatches to catch |
| Error handling is architectural decision | McConnell p.197 | Decide at architecture level, enforce consistently |
| Trade speed for debugging aids | McConnell p.205 | Development builds can be slow if they catch bugs |
| Exceptions weaken encapsulation | McConnell p.198 | Callers must know what exceptions called code throws |
| Dead program does less damage than crippled one | Hunt & Thomas | Fail fast, fail loud during development |
| Mars Pathfinder used debug code in production | McConnell p.209 | JPL diagnosed and fixed remotely using left-in debug aids |
| Bugs cost 100x more to fix in production | IBM Systems Sciences Institute | Validates investment in early defensive programming |
| 15-50% of development time spent on debugging | McConnell, citing multiple studies | Defensive programming reduces this significantly |
| Mars Climate Orbiter lost due to unit mismatch | NASA 1999 | 9 months of success doesn't mean code is safe |
Error-Handling Strategy Options (p.194-197)
- Return neutral value - Use for: display defaults, non-critical config
- Substitute next valid data - Use for: streaming data, sensor readings with redundancy
- Return same answer as previous - Use for: display refresh, non-critical caching (NOT financial data)
- Substitute closest legal value - Use for: input clamping, slider bounds
- Log warning and continue - Use for: non-critical degradation, feature flags
- Return error code - Use for: APIs with status conventions, C-style interfaces
- Call centralized error handler - Use for: consistent logging, monitoring integration
- Display error message - Use for: user-facing apps (but don't leak security info)
- Shut down gracefully - Use for: safety-critical, data-corrupting errors
Strategy selection is an architectural decision - be consistent throughout.
| Application Type | Favor | Avoid |
|---|---|---|
| Safety-critical (medical, aviation) | Shut down | Return guessed value |
| Consumer apps (games, word processors) | Keep running | Crash without save |
| Financial/audit | Fail with clear error | Silent substitution |
| Data pipelines | Fail and retry OR quarantine | Silent data loss |
| Real-time systems | Degrade gracefully | Hard crash |
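The "substitute closest legal value" strategy from the list above can be sketched as input clamping (clamp and setVolume are hypothetical names):

```javascript
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Out-of-range input is anticipated here; substituting the closest
// legal value beats crashing a volume slider.
function setVolume(requested) {
  return clamp(requested, 0, 100);
}

const v = setVolume(180);
```

Note this strategy is appropriate for UI bounds, not for domains like financial data where silent substitution hides errors.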
Chain
| After | Next |
|---|---|
| Validation complete | cc-control-flow-quality (CHECKER) |