Context Compression
Context compression is the process of reducing the size of textual context provided to a language model while retaining the information most essential to the task. As conversations grow longer and retrieved documents grow larger, compression becomes critical for staying within token limits and keeping inference costs manageable without sacrificing answer quality.
Workflow
1. Measure the Token Budget: Determine the model's total context window (e.g., 4K, 32K, 128K tokens) and subtract the tokens reserved for the system prompt, instructions, and the model's generation output. The remainder is your available context budget. If the raw context already fits, compression may be unnecessary.
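The budget arithmetic in this step can be sketched as follows. The token counts are illustrative placeholders; a real implementation would obtain them from the model's own tokenizer rather than hard-coding them.

```python
def available_budget(context_window: int, system_prompt_tokens: int,
                     instruction_tokens: int, reserved_output_tokens: int) -> int:
    """Tokens left for conversation history / retrieved documents
    after fixed overheads are subtracted from the context window."""
    return (context_window
            - system_prompt_tokens
            - instruction_tokens
            - reserved_output_tokens)

# e.g. a 32K window with 1,200 tokens of prompt overhead and
# 2,000 tokens reserved for the model's answer
budget = available_budget(32_000, 800, 400, 2_000)
print(budget)  # 28800
```

If `budget` already exceeds the raw context's token count, skip compression entirely.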
2. Score Information Density: Analyze each paragraph, sentence, or chunk of the raw context and assign an information-density score based on how many task-relevant facts it contains per token. Sentences that are purely stylistic, redundant, or off-topic receive low scores. This can be done heuristically (keyword overlap with the query) or via a lightweight classifier.
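A minimal sketch of the keyword-overlap heuristic mentioned above. Both the tokenization (whitespace split) and the scoring formula (query-term hits per token) are simplifying assumptions; a production scorer would use a proper tokenizer or a trained classifier.

```python
def density_score(chunk: str, query: str) -> float:
    """Heuristic information density: fraction of the chunk's tokens
    that also appear in the query. Higher = more task-relevant."""
    chunk_tokens = chunk.lower().split()
    if not chunk_tokens:
        return 0.0
    query_terms = set(query.lower().split())
    hits = sum(1 for tok in chunk_tokens if tok in query_terms)
    return hits / len(chunk_tokens)
```

Chunks scoring near zero are candidates for pruning in step 4.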
3. Select a Compression Strategy: Choose the most appropriate technique based on the compression ratio needed and the nature of the content:
   - Extractive summarization — select the most important sentences verbatim.
   - Abstractive summarization — rewrite content in fewer words while preserving meaning.
   - Key-point extraction — pull out only named entities, facts, and figures.
   - Selective pruning — remove low-density sentences, boilerplate, and repeated information.
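The first of these strategies, extractive summarization, can be sketched with the same overlap heuristic: score each sentence against the query, keep the top few verbatim, and preserve their original order. The regex-based sentence splitter and the scoring rule are illustrative assumptions.

```python
import re

def extractive_compress(text: str, query: str, keep: int = 2) -> str:
    """Keep the `keep` sentences sharing the most terms with the query,
    verbatim and in their original order."""
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    query_terms = set(query.lower().split())
    scored = [(sum(tok in query_terms for tok in s.lower().split()), i, s)
              for i, s in enumerate(sentences)]
    top = sorted(scored, key=lambda t: -t[0])[:keep]
    top.sort(key=lambda t: t[1])  # restore document order
    return " ".join(s for _, _, s in top)
```

Because the surviving sentences are untouched, this strategy cannot introduce paraphrasing errors — the trade-off noted in step 4.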
4. Apply Compression: Execute the chosen strategy. For aggressive compression (>80% reduction), combine techniques — for example, first prune boilerplate, then abstractively summarize the remainder. For moderate compression (40–60%), extractive selection is often sufficient and avoids introducing paraphrasing errors.
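The combined prune-then-summarize pipeline described here can be sketched as a two-stage function. The abstractive stage is passed in as a callable because the model interface is an assumption — in practice it would be an LLM call; the density threshold of 0.1 is likewise an arbitrary illustrative value.

```python
def compress_pipeline(sentences, query, summarize, threshold=0.1):
    """Stage 1: selectively prune sentences below a density threshold.
    Stage 2: hand the survivors to an abstractive `summarize` callable
    (e.g. an LLM call -- stubbed by the caller here)."""
    query_terms = set(query.lower().split())
    kept = []
    for s in sentences:
        toks = s.lower().split()
        if not toks:
            continue
        score = sum(t in query_terms for t in toks) / len(toks)
        if score >= threshold:
            kept.append(s)
    return summarize(" ".join(kept))
```

Running the cheap pruning stage first shrinks the input the expensive abstractive stage has to process.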
5. Validate Information Retention: Compare the compressed output against the original to ensure no critical facts were lost. A quick validation pass can check that key entities, numbers, and conclusions from the original are still present in the compressed version.
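The quick validation pass can be sketched as a retention ratio over the original's numbers and capitalized terms. Treating those as a proxy for "key entities and figures" is a rough assumption, not a full faithfulness metric; a stricter check would use NER or an LLM judge.

```python
import re

def retained(original: str, compressed: str) -> float:
    """Fraction of the original's numbers and capitalized terms
    that still appear in the compressed text."""
    keys = set(re.findall(r'\b(?:[A-Z][a-zA-Z]+|\d[\d.,%]*)\b', original))
    if not keys:
        return 1.0  # nothing checkable to lose
    kept = {k for k in keys if k in compressed}
    return len(kept) / len(keys)
```

A retention score well below 1.0 signals that compression should be re-run with a gentler ratio or a different strategy.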