context-degradation
Originally from sickn33/antigravity-awesome-skills
# Context Degradation Patterns
Language models exhibit predictable degradation as context grows. Understanding these patterns is essential for diagnosing failures and designing resilient systems.
## Degradation Patterns
| Pattern | Cause | Symptoms |
|---|---|---|
| Lost-in-Middle | Attention mechanics | 10-40% lower recall for middle content |
| Context Poisoning | Errors compound | Tool misalignment, persistent hallucinations |
| Context Distraction | Irrelevant info | Uses wrong information for decisions |
| Context Confusion | Mixed tasks | Responses address wrong aspects |
| Context Clash | Conflicting info | Contradictory guidance derails reasoning |
## Lost-in-Middle
Information at beginning and end receives reliable attention. Middle content suffers dramatically reduced recall.
**Mitigation:** structure the prompt so critical content sits at the edges:

```
[CURRENT TASK]        # At start (high attention)
- Goal: Generate quarterly report
- Deadline: End of week

[DETAILED CONTEXT]    # Middle (less attention)
- 50 pages of data
- Supporting evidence

[KEY FINDINGS]        # At end (high attention)
- Revenue up 15%
- Growth in Region A
```
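The edge-weighted layout above can be assembled programmatically. A minimal sketch (the function name and section labels are illustrative, not a standard API):

```python
def assemble_context(task: str, details: list[str], key_findings: list[str]) -> str:
    """Place high-priority content at the edges, bulk detail in the middle."""
    parts = ["[CURRENT TASK]", task]            # start: high attention
    parts += ["[DETAILED CONTEXT]", *details]   # middle: reduced recall
    parts += ["[KEY FINDINGS]", *key_findings]  # end: high attention
    return "\n".join(parts)

prompt = assemble_context(
    "Generate quarterly report (deadline: end of week)",
    ["50 pages of data...", "Supporting evidence..."],
    ["Revenue up 15%", "Growth in Region A"],
)
```

Keeping assembly in one place makes it easy to reorder sections later if evaluation shows a different layout works better for your model.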
## Context Poisoning
Once errors enter context, they compound through repeated reference.
**Entry pathways:**
- Tool outputs with errors
- Retrieved docs with incorrect info
- Model-generated summaries with hallucinations
**Symptoms:**
- Tool calls with wrong parameters
- Commitment to flawed strategies that take effort to undo
- Hallucinations that persist despite correction
**Recovery:**
- Truncate to before poisoning point
- Explicitly note poisoning and re-evaluate
- Restart with clean context
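The first recovery option, truncating to before the poisoning point, can be sketched over a chat-style message history (the helper and the example messages are hypothetical):

```python
def truncate_before(messages: list[dict], poison_index: int) -> list[dict]:
    """Recovery option 1: drop the first poisoned message and everything after it,
    so later turns cannot re-reference the error and compound it."""
    return messages[:poison_index]

history = [
    {"role": "user", "content": "Summarize the report"},
    {"role": "assistant", "content": "Revenue fell 15%"},  # hallucinated sign flip
    {"role": "user", "content": "Now project next quarter"},
]
history = truncate_before(history, poison_index=1)  # restart from the clean prefix
```

The alternative recovery, explicitly noting the poisoning, would instead append a correction message so the model re-evaluates the bad claim rather than reusing it.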
## Context Distraction
Even a single irrelevant document reduces performance. Models must attend to everything—they cannot "skip" irrelevant content.
**Mitigation:**
- Filter for relevance before loading
- Use namespacing for organization
- Access via tools instead of context
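"Filter for relevance before loading" can be as simple as a term-overlap gate before documents reach the context window. A minimal sketch (real systems would typically use embedding similarity instead of word overlap):

```python
def filter_relevant(docs: list[str], query_terms: list[str], min_overlap: int = 1) -> list[str]:
    """Keep only documents sharing at least min_overlap terms with the query,
    so irrelevant content never enters the context window at all."""
    terms = {t.lower() for t in query_terms}
    kept = []
    for doc in docs:
        overlap = terms & {w.lower() for w in doc.split()}
        if len(overlap) >= min_overlap:
            kept.append(doc)
    return kept

docs = ["quarterly revenue report", "office lunch menu"]
kept = filter_relevant(docs, ["revenue", "quarterly"])
```

Because even one distractor has an outsized impact, filtering at load time is cheaper than hoping the model ignores irrelevant content, which it cannot do.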
## Degradation Thresholds
| Model | Degradation Onset | Severe Degradation |
|---|---|---|
| GPT-5.2 | ~64K tokens | ~200K tokens |
| Claude Opus 4.5 | ~100K tokens | ~180K tokens |
| Claude Sonnet 4.5 | ~80K tokens | ~150K tokens |
| Gemini 3 Pro | ~500K tokens | ~800K tokens |
## The Four-Bucket Approach
| Strategy | Purpose |
|---|---|
| Write | Save context outside window |
| Select | Pull relevant context in |
| Compress | Reduce tokens, preserve info |
| Isolate | Split across sub-agents |
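The four buckets can be illustrated as methods on one manager. A sketch only: the external store is a plain dict, and compression is a stand-in for real summarization:

```python
class ContextManager:
    """Sketch of the four buckets; storage and compression are stand-ins."""

    def __init__(self):
        self.store: dict[str, str] = {}  # Write: scratchpad outside the window

    def write(self, key: str, text: str) -> None:
        self.store[key] = text  # keep bulk content out of the context window

    def select(self, keys: list[str]) -> list[str]:
        # Select: pull back only what the current step needs
        return [self.store[k] for k in keys if k in self.store]

    def compress(self, text: str, max_chars: int = 200) -> str:
        return text[:max_chars]  # Compress: stand-in for LLM summarization

    def isolate(self, subtask: str, notes: list[str]) -> dict:
        # Isolate: a sub-agent receives only its own slice of context
        return {"task": subtask, "context": notes}
```

In practice Write/Select map to file or vector-store tools, Compress to a summarization call, and Isolate to spawning sub-agents with scoped prompts.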
## Counterintuitive Findings
- **Shuffled haystacks can outperform coherent ones**: coherent context creates false associations
- **A single distractor has outsized impact**: degradation is a step function, not proportional to distractor count
- **Needle-question similarity matters**: content dissimilar to the query degrades recall faster
## When Larger Contexts Hurt
- Performance degrades non-linearly past a model-specific threshold
- Compute cost grows quadratically with context length (self-attention is O(n²))
- The model's cognitive bottleneck remains regardless of window size
## Best Practices
- Monitor context length and performance correlation
- Place critical information at beginning or end
- Implement compaction triggers before degradation
- Validate retrieved documents for accuracy
- Use versioning to prevent outdated info clash
- Segment tasks to prevent confusion
- Design for graceful degradation
- Test with progressively larger contexts
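The last two practices, monitoring the length-performance correlation and testing with progressively larger contexts, can share one harness. A sketch where `run_eval` is a caller-supplied, hypothetical evaluation function returning accuracy in [0, 1]:

```python
def sweep_context_sizes(run_eval, sizes=(8_000, 32_000, 64_000, 128_000)):
    """Evaluate at progressively larger context sizes and report the first
    size where accuracy drops below 90% of the smallest-context baseline.
    That empirical onset is where compaction triggers should be set."""
    results = {size: run_eval(size) for size in sizes}
    baseline = results[sizes[0]]
    onset = next((s for s in sizes if results[s] < 0.9 * baseline), None)
    return results, onset

# Toy eval standing in for a real benchmark run:
results, onset = sweep_context_sizes(lambda s: 1.0 if s < 64_000 else 0.7)
```

Measuring your own onset matters because the published thresholds above are approximate and vary by task.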
antigravity19