context-engineering

SKILL.md

Context Engineering

Context engineering curates the smallest high-signal token set for LLM tasks. The goal: maximize reasoning quality while minimizing token usage.

When to Activate

  • Designing/debugging agent systems
  • Context limits constrain performance
  • Optimizing cost/latency
  • Building multi-agent coordination
  • Implementing memory systems
  • Evaluating agent performance
  • Developing LLM-powered pipelines

Core Principles

  1. Context quality > quantity - High-signal tokens beat exhaustive content
  2. Attention is finite - U-shaped curve favors beginning/end positions
  3. Progressive disclosure - Load information just-in-time
  4. Isolation prevents degradation - Partition work across sub-agents
  5. Measure before optimizing - Know your baseline

Key Metrics

  • Token utilization: Warning at 70%, trigger optimization at 80%
  • Token variance: Explains 80% of agent performance variance
  • Multi-agent cost: ~15x single agent baseline
  • Compaction target: 50-70% reduction, <5% quality loss
  • Cache hit target: 70%+ for stable workloads

Four-Bucket Strategy

  1. Write: Save context externally (scratchpads, files)
  2. Select: Pull only relevant context (retrieval, filtering)
  3. Compress: Reduce tokens while preserving info (summarization)
  4. Isolate: Split across sub-agents (partitioning)

Anti-Patterns

  • Exhaustive context over curated context
  • Critical info in middle positions
  • No compaction triggers before limits
  • Single agent for parallelizable tasks
  • Tools without clear descriptions

Guidelines

  1. Place critical info at beginning/end of context
  2. Implement compaction at 70-80% utilization
  3. Use sub-agents for context isolation, not role-play
  4. Design tools with clear descriptions (what, when, inputs, returns)
  5. Optimize for tokens-per-task, not tokens-per-request
  6. Validate with probe-based evaluation
  7. Monitor token usage in production
  8. Start minimal, add complexity only when proven necessary

Skill Coordination

When multiple skills are active:

  • Load only relevant skill content
  • Use skill metadata for discovery
  • Avoid loading full skill definitions unless needed
  • Reference skills by pattern detection, not direct names

References

For detailed guidance, see:

  • references/fundamentals.md - Context anatomy, attention mechanics
  • references/degradation.md - Debugging failures, lost-in-middle, poisoning
  • references/optimization.md - Compaction, masking, caching, partitioning
  • references/compression.md - Long sessions, summarization strategies
  • references/memory.md - Cross-session persistence, knowledge graphs
  • references/multi-agent.md - Coordination patterns, context isolation
  • references/evaluation.md - Testing agents, LLM-as-Judge, metrics
  • references/tool-design.md - Tool consolidation, description engineering
Weekly Installs
1
Installed on
windsurf1
opencode1
codex1
claude-code1
antigravity1
gemini-cli1