# Nullable Architecture
Use this skill when refactoring code or writing tests in the Nullables style.
This skill is inspired by James Shore's writing on Nullables and testing without mocks. It is this project's interpretation of those ideas, adapted for how we build and test code here. It is not an official James Shore document or canonical definition of the pattern.
The focus is practical:
- explicit construction: `new` for the class under test, `.createNull()` for dependencies
- light refactoring guidance instead of heavy taxonomy
- infrastructure wrappers at the environment boundary
- output tracking instead of spies
- behavior simulation for pushed events
- narrow, sociable, state-based tests
- example-driven tests that teach the design
The core idea is to test right up to the line of code that calls the environment, without actually hitting the environment. Nullables are production code with an "off" switch, not mocks, and the overall style combines narrow, sociable, state-based tests with nulled infrastructure.
## What Good Looks Like
A good nullable test:
- reads like a concrete example of how the system works
- uses real production classes with nulled dependencies
- avoids mocks, spies, patched methods, and partial objects cast to richer types
- asserts on state or tracked output
- uses explicit `// arrange`, `// act`, `// assert` structure
- stays DAMP rather than aggressively DRY
- illustrates one meaningful behavior, not every branch
The goal is not coverage-by-default. The goal is confidence through a few clear examples.
## Test Shape: AAA And DAMP
Prefer tests with explicit AAA structure:
```javascript
// arrange

// act

// assert
```
Keep blank lines between those sections when they are separate.
It is fine to collapse act and assert when that reads better, especially when:
- the action is a single line
- the assertion immediately explains why that action matters
- splitting them would add ceremony without adding clarity
Typical good shape:
```javascript
// act & assert
expect(service.save("hello")).toEqual(ok(true));
```
or:
```javascript
// act
cursor.moveNext();

// assert
expect(cursor.getToken()).toBe(nextToken);
```
Prefer DAMP tests ("Descriptive And Meaningful Phrases") over DRY tests. In practice that means:
- repeat a little setup when that makes the example easier to read
- repeat a little assertion logic when that keeps the behavior obvious at the test site
- optimize for local readability over helper reuse
Do not extract helpers just to remove three or four repeated lines. A good test should read top-to-bottom as a worked example without forcing the reader to jump around.
## Test Helpers
Bias toward fewer helpers in tests.
When a helper is worth keeping, it should spell out what it does in domain language. Good helper names are specific:
- `expectFocusedElement(...)`
- `expectRootFocused(...)`
- `expectTrackedWrites(...)`
Avoid vague or over-general helpers such as:
- `expectState(...)`
- `expectFocus(...)`
- `setupThing(...)`
Good helpers usually do one of these:
- remove noisy mechanical setup that is not important to the example
- package a repeated domain assertion with a precise name
Helpers should not hide the interesting part of the test. If a helper bundles several different assertions or obscures which state is being checked, inline those assertions instead.
## Start With The Test You Want
Do not begin by classifying everything.
Start with these questions:
- What class do I want to test?
- What dependency makes that class hard to test?
- Where is the first real line of code that touches the environment?
- What would I need from a nullable version of that dependency to write a clear state-based test?
Usually the answer is to introduce or improve an infrastructure wrapper.
## Embedded Infrastructure (Anti-Pattern)
The most common cause of mock-heavy tests is embedded infrastructure — a function or method that calls the outside world on some line, with the dependency imported at the top of the file. This is pervasive in JS, TS, Python, and other languages. Because the dependency is hardwired via import, the only way to control it in tests is to patch or mock it.
Mocking frameworks (Jest's `__mocks__`, Python's `unittest.mock`) exist to work around this pattern. They can be manageable if all external calls are corralled into one place, but they still couple tests to import paths and call mechanics.
The preferred fix is injection: extract the environment call into an infrastructure wrapper and pass it in. This is what the rest of this skill describes.
## Infrastructure Wrappers
Infrastructure wrappers are the lowest-level classes that touch one external system and present a clean API to the rest of the code. In Shore's pattern language, infrastructure wrappers sit at the environment boundary and own the reusable nullability machinery.
Examples:
- `CommandLine` around `process.argv` and stdout
- `FileStore` around filesystem access
- `HttpClient` around `fetch`
- `NativeBridge` around `window.ReactNativeWebView`
- `BrowserSelection` around browser selection APIs
Good infrastructure wrappers:
- keep environment-specific code in one place
- expose behavior in the application's vocabulary
- provide `.create()` for the real environment
- provide `.createNull()` for the nulled environment
- provide configurable responses for incoming data when needed
- provide output tracking for observable writes
- provide behavior simulation when the environment pushes events inward
Shore's CommandLine example is the model: the wrapper owns configurable input, tracked output, and the real/null switch in production code.
### Embedded stubs

An embedded stub is the nulled implementation that lives inside the production infrastructure wrapper. It is not a test-local fake and it is not a mock. It is production code that the wrapper uses in `.createNull()`.
For example:
- `ViewportScroller.create()` uses the real browser APIs
- `ViewportScroller.createNull()` uses an embedded stub that does not scroll the real page, but can return configured element rects and track scroll requests
Why this matters:
- the real/null switch stays in production code, not scattered through tests
- tests do not patch methods or cast partial objects
- the same nullable implementation can be reused by many tests
The embedded stub should usually do three things:
- return safe default values with no real I/O
- accept configurable responses when tests need controlled inputs
- record observable outputs when tests need to assert what would have happened
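The three responsibilities above can be sketched in one small class. `StubbedStorage` here is hypothetical, but it shows the shape: safe defaults, configured responses, and recorded output, all with no real I/O:

```javascript
// Hypothetical embedded stub for a storage wrapper. It lives next to the
// production wrapper and is used only by createNull().
class StubbedStorage {
  constructor(configuredValues = {}) {
    this.configuredValues = configuredValues; // configurable responses
    this.items = {};                          // recorded observable output
  }
  getItem(key) {
    // safe default (null), or a configured response when the test set one
    if (key in this.items) return this.items[key];
    return this.configuredValues[key] ?? null;
  }
  setItem(key, value) {
    // record what would have been written, with no real I/O
    this.items[key] = value;
  }
}
```

The wrapper's `trackWrites()`-style methods can then expose `this.items` (or a write log) to tests.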
### Configurable responses
A configurable response is test-controlled input returned by a nullable dependency. Instead of rewriting production code to make testing easier, configure the nullable dependency to answer in a specific way.
Examples:
- a nullable repository returns a specific user
- a nullable command-line wrapper returns specific args
- a nullable viewport wrapper returns a specific rect for a matching element
- a nullable clock returns a chosen current time
Prefer configuring responses at construction time, close to the test, for example:
```javascript
const repo = UserRepo.createNull({
  findById: (id) => (id === 'u1' ? user : null)
});
```
Sometimes a nullable dependency needs to return different values over time. In that case a queue of return values is fine:
```javascript
const clock = Clock.createNull({
  now: [t1, t2, t3]
});
```
Use a queue only when the response genuinely changes over time and matching by input is not enough. Prefer matching by input when the behavior is really "return this value for this request".
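One possible way to implement the queue inside the nullable (the `Clock` here is a hypothetical sketch, not a canonical implementation) is to drain the configured values and let the last one repeat:

```javascript
// Hypothetical nullable clock that drains a queue of configured times.
class Clock {
  static createNull({ now = [] } = {}) {
    const queue = Array.isArray(now) ? [...now] : [now];
    // shift values off the queue until one remains, then repeat it
    return new Clock(() => (queue.length > 1 ? queue.shift() : queue[0]));
  }
  constructor(nowFn) { this.nowFn = nowFn; }
  now() { return this.nowFn(); }
}
```

Letting the last value repeat means tests do not have to predict the exact number of calls.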
## Preflight
Before writing tests, verify:
- The class under test has `.create()` and `.createNull()` if it participates in the nullable graph.
- Constructor dependencies also have `.createNull()`.
- `.createNull()` can be instantiated without real I/O.
- Observable writes can be asserted via tracking instead of call counts.
- External pushed events can be simulated without patching internals.
If any of these fail, refactor toward the patterns below before writing more tests.
## Light Refactoring Guide
### Move down to the environment boundary
When code is hard to test, move downward through its dependencies until you find the first line that actually touches the environment.
Wrap that line in a small infrastructure wrapper.
### Then build the nullable from the bottom up
Preferred progression:
1. Extract the environment call into an infrastructure wrapper.
2. Add `.create()` and `.createNull()`.
3. Put the real environment dependency in `.create()`.
4. Put an embedded stub in `.createNull()`.
5. Add configurable responses if tests need controlled inputs.
6. Add output tracking if tests need to observe writes.
7. Add behavior simulation if the environment pushes events inward.
8. Inject the wrapper into higher-level classes.
The most reusable machinery tends to live at this lower boundary layer.
In practice, steps 4 and 5 are where many tests become dramatically simpler:
- the embedded stub gives you a reusable nulled environment implementation
- the configurable responses let the test describe what the environment should say back
Together they replace most situations where people reach for mocks, spies, patched methods, or special-case production code.
### Keep the advice light
Do not spend time forcing every class into a taxonomy before you can improve the test.
Prefer:
- one small wrapper
- one real `.create()`
- one real `.createNull()`
- one tracker if output matters
- one simulator if pushed events matter
- one or two example tests that prove the design works
## Construction Rules
### Constructors are dumb
Constructors receive already-built dependencies.
- Do not instantiate collaborators in constructors.
- Do not branch between real and null dependencies in constructors.
- Wiring callbacks and subscriptions is fine.
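A minimal sketch of these rules, with a hypothetical `MessageBus` and `Notifier` (both invented for illustration): the constructor stores dependencies and wires a subscription, while all instantiation stays in the factories:

```javascript
// Minimal hypothetical bus so the sketch is self-contained.
class MessageBus {
  static create() { return new MessageBus(); }     // real transport would go here
  static createNull() { return new MessageBus(); } // nulled: same in-memory shape
  constructor() { this.handlers = []; }
  onMessage(fn) { this.handlers.push(fn); }
  simulateMessage(msg) { this.handlers.forEach((fn) => fn(msg)); }
}

// The consumer's constructor is dumb: it stores what it is given and wires
// a subscription, but never instantiates collaborators or branches between
// real and null.
class Notifier {
  static create() { return new Notifier(MessageBus.create()); }
  static createNull() { return new Notifier(MessageBus.createNull()); }
  constructor(bus) {
    this.bus = bus;
    this.received = [];
    this.bus.onMessage((msg) => this.received.push(msg)); // wiring is fine
  }
}
```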
### Factories choose the graph

`.create()` and `.createNull()` decide whether collaborators are real or nulled.

```javascript
static create() {
  return new Foo(Bar.create(), Baz.create());
}

static createNull() {
  return new Foo(Bar.createNull(), Baz.createNull());
}
```
If something must be built first, build it in the factory and pass it into `new`.
If a collaborator has its own real/null branching, keep that branching in its factory instead of pushing it into the consumer's constructor.
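A sketch of the build-first case, using hypothetical `Connection`, `Uploader`, and `SyncService` classes: the factory builds the shared connection first, then passes everything into `new`:

```javascript
// Minimal hypothetical collaborators so the sketch runs on its own.
class Connection {
  static createNull() { return new Connection(); }
}
class Uploader {
  static createNull(connection) { return new Uploader(connection); }
  constructor(connection) { this.connection = connection; }
}

// The connection must exist before the uploader, so the factory builds it
// first and passes both into new. The constructor still does no building.
class SyncService {
  static createNull() {
    const connection = Connection.createNull();
    return new SyncService(connection, Uploader.createNull(connection));
  }
  constructor(connection, uploader) {
    this.connection = connection;
    this.uploader = uploader;
  }
}
```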
### Use `new` for the class under test

When testing `Foo`, prefer:
```javascript
const repo = Repo.createNull({ user: existingUser });
const bus = MessageBus.createNull();
const foo = new Foo(repo, bus);
```
Why:
- the test owns the setup
- configuration is visible at the test site
- the test is not coupled to `Foo.createNull()` defaults
### Use `.createNull()` for dependencies

When `Foo` is only a collaborator of the class under test, use `Foo.createNull()`.
This keeps higher-level tests focused.
### Delayed child creation
Prefer eager instantiation by default.
Use delayed child creation only when a dependency is optional and only needed on a later path such as a click, menu action, route change, mode change, or child-flow launch.
Typical fit:
- settings screens that may open one of many child screens
- menus that launch optional tools or editors
- routers that create child app objects on demand
- flows with many possible branches where most children are never opened
Preferred shape:
```javascript
static create(ctl) {
  return new Settings(ctl, {
    BindingsEditor: (params) => BindingsEditor.create(ctl, params),
    FiltersUI: () => FiltersUI.create(ctl)
  });
}
```
Pass a small `create` object into the constructor, then call `create.X()` only when the user triggers that path.
Use this pattern only when all of these are true:
- the dependency is not needed for startup
- it is only used on specific later paths
- eager construction would add meaningful cost, setup noise, or unnecessary state
- the factory object keeps the constructor clearer than eagerly building many optional children
Do not use it when:
- the dependency is cheap and almost always used
- the factory object would make the constructor harder to understand
- the laziness has no real payoff
- the factory is only hiding muddled design
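The consuming side of this pattern might look like the following sketch (`Settings` and `BindingsEditor` here are hypothetical, simplified from the shape above): nothing is constructed until the user actually opens the path:

```javascript
// Minimal hypothetical child so the sketch runs on its own.
class BindingsEditor {
  static createNull() { return new BindingsEditor(); }
}

// The constructor stores the factory object; the child is only built
// when the user actually triggers that path.
class Settings {
  static createNull() {
    return new Settings({ BindingsEditor: () => BindingsEditor.createNull() });
  }
  constructor(create) {
    this.create = create;
    this.editor = null;
  }
  openBindingsEditor() {
    // delayed child creation: nothing existed until this click path ran
    this.editor = this.create.BindingsEditor();
  }
}
```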
## Infrastructure Wrapper Example
Copy this shape first:
```javascript
class CommandLine {
  static create() {
    return new CommandLine(process);
  }

  static createNull({ args = [] } = {}) {
    return new CommandLine(new StubbedProcess(args));
  }

  constructor(proc) {
    this.proc = proc;
    this.output = [];
  }

  args() {
    return this.proc.argv.slice(2);
  }

  writeOutput(text) {
    this.proc.stdout.write(text);
    this.output.push(text);
  }

  trackOutput() {
    return { data: this.output };
  }
}
```
This closely follows Shore's CommandLine example, where the wrapper owns configurable responses and output tracking.
## Output Tracking
Use output tracking when code causes a collaborator to do something observable.
Examples:
- app changed
- message sent
- notification shown
- record logged
- file written
Track behavior objects, not function calls. Output tracking is a first-class nullable pattern specifically for observing what would have been sent to the environment without locking tests to call mechanics.
Good:
```javascript
const appChanges = ctl.app.trackAppChanges();

expect(appChanges.data).toEqual([
  { previous: null, current: editDocument }
]);
```
Bad:
```javascript
expect(runSpy).toHaveBeenCalledTimes(1);
```
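As a runnable sketch of the good shape (the `Mailer` wrapper is hypothetical), the tracked data is a list of behavior objects describing what would have been sent:

```javascript
// Minimal hypothetical wrapper: sends are tracked as behavior objects.
class Mailer {
  static createNull() { return new Mailer(null); }
  constructor(transport) { this.transport = transport; this.sent = []; }
  send(to, subject) {
    if (this.transport !== null) this.transport.send(to, subject); // real path
    this.sent.push({ to, subject }); // tracked behavior object
  }
  trackSent() { return { data: this.sent }; }
}

const mailer = Mailer.createNull();
const sent = mailer.trackSent();
mailer.send("a@example.com", "Welcome");
```

The assertion then reads as `expect(sent.data).toEqual([{ to: "a@example.com", subject: "Welcome" }])`: what was sent, not how many times something was called.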
## Behavior Simulation
Use behavior simulation when an external system pushes an event into the code under test.
Examples:
- browser click
- websocket message
- timer firing
- native container event
Simulation methods belong on the nullable dependency:
- `simulateElementClick(...)`
- `simulateMessage(...)`
- `simulateTimeout(...)`
They should reuse the real event path as much as possible. Shore's behavior simulation pattern is specifically about simulating environment behavior through the nullable rather than patching the consumer or inventing a disconnected path.
Preferred shape:
```typescript
private handleElementClick(target: Element) {
  this.REQUEST_FOCUS(target, { scrollIntoView: false });
}

connect() {
  root.addEventListener("mousedown", (evt) => {
    evt.preventDefault();
    this.handleElementClick(evt.target as Element);
  });
}

simulateElementClick(target: Element) {
  this.handleElementClick(target);
}
```
If you are already calling the real domain API of the collaborator, that is not behavior simulation. It is just using the collaborator normally.
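A minimal end-to-end sketch, with a hypothetical nullable `Socket` wrapper: the simulate method funnels into the same handler path the real connection would use, so the test pushes the event in through the nullable instead of patching anything:

```javascript
// Hypothetical nullable socket wrapper. simulateMessage() reuses the exact
// handler path a real connection would drive.
class Socket {
  static createNull() { return new Socket(); }
  constructor() { this.handlers = []; }
  onMessage(fn) { this.handlers.push(fn); }
  handleMessage(raw) { this.handlers.forEach((fn) => fn(JSON.parse(raw))); }
  simulateMessage(raw) { this.handleMessage(raw); } // same path as production
}

// The code under test subscribes normally; the test pushes the event in
// through the nullable rather than patching internals.
const socket = Socket.createNull();
const received = [];
socket.onMessage((msg) => received.push(msg));
socket.simulateMessage('{"type":"ping"}');
```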
## State-Based Assertions
Interaction-based tests (mocks, spies, call counts) lock in design choices harder than anything else — they couple tests to how code collaborates internally, not what it achieves. Assert on outcomes instead.
Prefer asserting:
- returned values
- resulting state
- tracked output
- domain-visible changes
Avoid asserting:
- method call counts
- private helper usage
- patched method behavior
- exact interactions unless the interaction record itself is the behavior
When asserting, prefer a few direct expectations over one opaque "state" helper. The reader should be able to see what changed without reverse-engineering a utility function.
Object interactions are implementation details; the consequences of those interactions are what the tests should care about.
## Example Pattern
Use examples that show the design clearly.
```javascript
describe('EditDocument', () => {
  it('stays in the same app object when focus moves into editing', () => {
    // arrange
    const ctl = Controller.createNull();
    const appChanges = ctl.app.trackAppChanges();
    const doc = JsedDocument.createNull(root);
    const editManager = EditManager.createNull({
      document: doc,
      userInput: ctl.input,
      onError: (err) => editDocument.handleEditError(err)
    });
    const editDocument = new EditDocument(ctl, doc, editManager);
    const p1 = byId(doc, 'p1');
    editDocument.onStart();
    editManager.nav.REQUEST_FOCUS(p1);

    // act
    editManager.nav.REQUEST_FOCUS(p1);

    // assert
    expect(editManager.getMode()).toBe('editing');
    expect(appChanges.data).toEqual([]);
  });
});
```
What this example illustrates:
- explicit `new` for the class under test
- `.createNull()` for dependencies
- output tracking instead of spying on `app.run`
- assertion on resulting mode and tracked app changes
## Refactor Example
If code directly calls an environment API:
```javascript
class SaveDocument {
  save(text) {
    localStorage.setItem("draft", text);
  }
}
```
Refactor toward:
```javascript
class DraftStore {
  static create() {
    return new DraftStore(localStorage);
  }

  static createNull() {
    return new DraftStore(new StubbedStorage());
  }

  constructor(storage) {
    this.storage = storage;
    this.writes = [];
  }

  save(text) {
    this.storage.setItem("draft", text);
    this.writes.push({ key: "draft", value: text });
  }

  trackWrites() {
    return { data: this.writes };
  }
}

class SaveDocument {
  static create() {
    return new SaveDocument(DraftStore.create());
  }

  static createNull() {
    return new SaveDocument(DraftStore.createNull());
  }

  constructor(draftStore) {
    this.draftStore = draftStore;
  }

  save(text) {
    this.draftStore.save(text);
  }
}
```
Then test with:
```javascript
const draftStore = DraftStore.createNull();
const writes = draftStore.trackWrites();
const saveDocument = new SaveDocument(draftStore);

saveDocument.save("hello");

expect(writes.data).toEqual([{ key: "draft", value: "hello" }]);
```
## What To Test
This section is about unit tests. Prefer testing at the highest level the nullable graph allows. Because we inject infrastructure and write sociable tests through real code, we can test happy paths, sad paths, and edge cases deterministically and instantly at the top level without mocks or real I/O.
Avoid unit tests that exercise interactions between system components to produce a behavior. Such tests lock in implementation details: refactor the interaction and you have to rewrite the test, even though the outward behavior is unchanged. Where possible, test the component that performs the interaction instead.
The criterion: did the test have to perform several actions on several subcomponents to achieve a single outward behavior? If yes, the test is probably encoding assumptions about system internals.
This pushes unit tests toward a barbell:
- High-level and highly sociable — exercise the system at its top edge. Nullable architecture and dependency injection make this practical: real code runs end-to-end, infrastructure is nulled, behavior is deterministic.
- Low-level and narrow — test a function or class with a single responsibility directly.
Smell: mirroring the same behavior at multiple levels. If you find yourself writing a test that covers the same (or nearly the same) thing a higher-level test already covers, stop. With nullable architecture and DI, there is no point mirroring behavior into lower levels — the top-level test already exercises it through real code. Only drop to a lower level when it is meaningfully easier to write or read there, or when you are genuinely operating on the low end of the barbell (testing a single-responsibility unit for its own sake, not as a proxy for higher-level behavior).
Choose a few tests that teach the system.
### For orchestrators and app-layer classes
Test:
- the main behavior
- one or two important edge cases
- tracked outputs on dependencies
- externally visible state changes
### For infrastructure wrappers
Test:
- `.createNull()` works with defaults
- configurable responses drive the right behavior
- output tracking records observable writes
- behavior simulation covers pushed external events
### For pure/value code
Use plain inputs and outputs. Do not force nullables where they do not help.
## What Not To Do
- do not use mocks
- do not use spies
- do not patch methods in tests
- do not cast hand-built objects to richer types
- do not assert call counts
- do not hide important setup in vague helpers
- do not collapse AAA structure into large setup helpers
- do not extract generic assertion helpers when a few direct expects would read more clearly
- do not use `.create()` in unit tests unless you intentionally want real infrastructure
- do not spend time inventing a classification taxonomy when a small wrapper and one good example test would clarify the design faster
## Review Checklist
When reviewing a nullable test or refactor, ask:
- Is the class under test instantiated explicitly with `new`?
- Are dependencies coming from `.createNull()`?
- Is any spy or mock being used where tracking would be better?
- Is any pushed event being faked by patching internals instead of simulation?
- Do the assertions describe behavior instead of interactions?
- Does the test teach something real about the design?
- Is the lowest environment boundary wrapped in a small nullable infrastructure wrapper?
- Does the test use clear AAA structure?
- Are helper names specific enough to explain what they assert or set up?
- Would the test be easier to read if one of the helpers were inlined?
## Response Pattern
When asked to apply this style:
- Identify the class under test.
- Identify the lowest environment boundary that makes the code hard to test.
- Introduce or improve an infrastructure wrapper there.
- Instantiate the class under test with `new` when the test is about that class.
- Replace hand-built doubles with `.createNull()` dependencies.
- Add output tracking where the test wants to know what changed.
- Add behavior simulation where the test wants to model incoming external events.
- Rewrite assertions to focus on state and tracked outputs.
- Keep only the smallest set of examples that make the design clear.