long-context
Originally from ovachiever/droid-tings
Long Context: Extending Transformer Context Windows
When to Use This Skill
Use Long Context techniques when you need to:
- Process long documents (32k, 64k, 128k+ tokens) with transformer models
- Extend context windows of pre-trained models (LLaMA, Mistral, etc.)
- Implement efficient positional encodings (RoPE, ALiBi)
- Train models with length extrapolation capabilities
- Deploy models that handle variable-length inputs efficiently
- Fine-tune existing models for longer contexts with minimal compute
Key Techniques: RoPE (Rotary Position Embeddings), YaRN, ALiBi (Attention with Linear Biases), Position Interpolation
Papers: RoFormer (arXiv 2104.09864), YaRN (arXiv 2309.00071), ALiBi (arXiv 2108.12409), Position Interpolation (arXiv 2306.15595)
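To make the core idea behind RoPE and Position Interpolation concrete, here is a minimal pure-Python sketch. It is not taken from any of the papers' reference code. The function names (`rope_angles`, `apply_rope`, `dot`), the `scale` parameter, and the interleaved pairing of dimensions are illustrative assumptions. Many production implementations pair dimensions differently and operate on batched tensors.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Rotation angle for each dimension pair at one token position.

    scale > 1.0 applies Position Interpolation: positions are divided by
    `scale` so a longer sequence maps back into the trained position range.
    (`scale` is a hypothetical parameter name used for this sketch.)
    """
    pos = position / scale
    return [pos * base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate each pair (vec[2i], vec[2i+1]) by its position-dependent angle."""
    head_dim = len(vec)
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    out = []
    for i, theta in enumerate(rope_angles(position, head_dim, base, scale)):
        x0, x1 = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.25]

    # The score depends only on the relative offset (2 in both cases).
    near = dot(apply_rope(q, 5), apply_rope(k, 3))
    far = dot(apply_rope(q, 105), apply_rope(k, 103))
    print(abs(near - far) < 1e-9)

    # With scale=4, position 8192 reuses the angles of position 2048.
    print(rope_angles(8192, 8, scale=4.0) == rope_angles(2048, 8))
```

The first check illustrates why RoPE encodes relative position: rotating the query and key by their own positions leaves a score that depends only on the offset between them. The second shows the Position Interpolation idea of compressing positions into the trained range rather than extrapolating beyond it. In practice this is followed by fine-tuning at the longer length.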