Context Patterns
Context engineering is the delicate art and science of filling the context window with just the right information for the next step.
Patterns
Context Rot
Model quality degrades as context grows, even well within the advertised window limit: in long-context benchmarks, 11 of 13 tested models drop to half their baseline accuracy by 32k tokens. Every pattern below exists because of this.
The Pyramid
Start with general background, then progressively add specific details. Give the model altitude before asking it to land. This mirrors how experts brief each other: context first, task second.
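A minimal sketch of the ordering; the helper name and section headings are illustrative, not a fixed API:

```python
def pyramid_prompt(background: str, details: list[str], task: str) -> str:
    """Assemble a prompt general-to-specific: broad background first,
    then progressively narrower details, then the task itself."""
    sections = ["## Background", background, "## Details", *details, "## Task", task]
    return "\n\n".join(sections)

prompt = pyramid_prompt(
    background="We maintain a payments service.",
    details=["Refunds call an external API.", "Refunds over $500 need manual review."],
    task="Draft a runbook entry for failed refunds.",
)
```

The task goes last so the model reads it with the full background already in place.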
Select, Don't Dump
Choose the smallest set of high-signal tokens that maximizes the desired outcome. Surgical selection beats comprehensive inclusion.
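One way to sketch this is greedy selection under a token budget; the word-count cost is a stand-in for a real tokenizer:

```python
def select_snippets(candidates: list[tuple[str, float]], budget: int) -> list[str]:
    """Greedy selection: take the highest-relevance snippets that still fit
    the token budget (token cost approximated here by word count)."""
    chosen, used = [], 0
    for text, _score in sorted(candidates, key=lambda c: c[1], reverse=True):
        cost = len(text.split())
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen

picked = select_snippets(
    [("low-signal boilerplate " * 10, 0.2),
     ("the API returns 429 on rate limits", 0.9),
     ("retry with exponential backoff", 0.8)],
    budget=12,
)
```

The low-relevance bulk gets dropped even though it would "add context"; only the high-signal lines make the cut.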
Compress & Restart
When conversations grow long, summarize what matters and start fresh. Context quality degrades well before hitting advertised limits.
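A minimal sketch of the restart step, assuming a `summarize` callable that would be an LLM call in practice:

```python
def compress_and_restart(messages, summarize, max_turns=20, keep_recent=4):
    """Once history exceeds max_turns, replace the older turns with a
    summary message and keep only the most recent turns verbatim."""
    if len(messages) <= max_turns:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(older)  # in practice: an LLM summarization call
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + recent

history = [{"role": "user", "content": f"turn {i}"} for i in range(30)]
fresh = compress_and_restart(history,
                             summarize=lambda msgs: f"{len(msgs)} earlier turns elided")
```

The thresholds are arbitrary; the point is that the fresh history is a summary plus recent turns, not the full transcript.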
Write Outside the Window
Persist important context to external storage: scratchpads, memory files, knowledge bases. The context window is working memory, not long-term memory.
Grounding
Retrieval gets information into context. Grounding makes the model actually use it. Without explicit anchoring instructions, the model will often ignore what you retrieved and fall back to whatever it absorbed during training.
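A sketch of an anchoring wrapper; the exact instruction wording is an assumption, not a canonical prompt:

```python
def grounded_prompt(question: str, documents: list[str]) -> str:
    """Wrap retrieved documents in explicit anchoring instructions so the
    model answers from them instead of from its training data."""
    docs = "\n\n".join(f"[doc {i}]\n{d}" for i, d in enumerate(documents, 1))
    return (
        "Answer using ONLY the documents below. Cite the doc number for each claim. "
        "If the documents do not contain the answer, say you don't know.\n\n"
        f"{docs}\n\nQuestion: {question}"
    )

p = grounded_prompt("What is the refund window?",
                    ["Refunds are accepted within 30 days."])
```

The numbered doc tags give the model something concrete to cite, which makes silent fallback to training data easier to detect.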
Anchor Turn
Front-load all source reads into one turn so every subsequent turn works from cache.
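A sketch of the idea with a hypothetical message-list shape:

```python
def anchor_turn(sources: dict[str, str], question: str) -> list[dict]:
    """Bundle every source read into the first user turn; later turns refer
    back to it, so the provider can cache this shared prefix."""
    bundle = "\n\n".join(f"=== {path} ===\n{text}" for path, text in sources.items())
    return [{"role": "user", "content": f"Source material:\n\n{bundle}\n\n{question}"}]

turns = anchor_turn({"api.md": "POST /refund issues a refund."}, "Summarize the API.")
```

Reading files mid-conversation instead would insert new content into the prefix and invalidate the cache on every turn.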
Isolate
Give sub-agents their own focused contexts instead of sharing one massive window. Anthropic's multi-agent system uses 15x more tokens total but gets better results, because each agent sees only what it needs.
Recursive Delegation
Let agents spawn child agents with scoped sub-contexts. Instead of stuffing everything into one window, the parent splits work, delegates with focused context, and aggregates results.
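The shape of the recursion can be sketched framework-free; `split`, `solve`, and `merge` are placeholders for LLM-backed steps, exercised here with a toy summing task:

```python
def delegate(task, context, split, solve, merge, depth=0, max_depth=3):
    """Parent splits the work; each child runs with only its slice of
    context; the parent merges the child results."""
    subtasks = split(context) if depth < max_depth else []
    if not subtasks:
        return solve(task, context)
    results = [delegate(task, sub_ctx, split, solve, merge, depth + 1, max_depth)
               for sub_ctx in subtasks]
    return merge(results)

# Toy run: recursively "sum a dataset" by halving the context at each level.
total = delegate(
    task="sum",
    context=[1, 2, 3, 4, 5, 6],
    split=lambda ctx: [ctx[:len(ctx) // 2], ctx[len(ctx) // 2:]] if len(ctx) > 2 else [],
    solve=lambda task, ctx: sum(ctx),
    merge=sum,
)
```

No child ever sees the whole context, which is the entire point: each window stays small and focused.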
Progressive Disclosure
Start with a map, not the territory. Provide an index of what's available and let the model pull in details on demand.
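A minimal sketch: the index costs a few tokens per document, and full bodies are only loaded when asked for (the class name is illustrative):

```python
class DisclosureIndex:
    """Give the model a map first; full documents are fetched on demand."""
    def __init__(self, docs: dict[str, str]):
        self.docs = docs

    def overview(self) -> str:
        # One-line teaser per document: first sentence only.
        return "\n".join(f"- {name}: {body.split('.')[0]}."
                         for name, body in self.docs.items())

    def fetch(self, name: str) -> str:
        return self.docs[name]

idx = DisclosureIndex({
    "auth.md": "How login works. Sessions last 24 hours. Tokens rotate daily.",
    "billing.md": "How invoices are generated. Runs nightly at 02:00 UTC.",
})
```

In an agent setting, `fetch` would be exposed as a tool the model calls when the overview isn't enough.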
Schema Steering
A JSON schema tells the model what to think about, in what order, and with what vocabulary. Define the structure and the model's reasoning follows.
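A small example schema for a code-review task; the fields and their order are illustrative:

```python
import json

# Field order doubles as a reasoning order: summary and risks before verdict.
review_schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string", "description": "What the change does"},
        "risks": {"type": "array", "items": {"type": "string"}},
        "verdict": {"type": "string", "enum": ["approve", "request_changes"]},
    },
    "required": ["summary", "risks", "verdict"],
}

def schema_instruction(schema: dict) -> str:
    """Embed the schema in the prompt so output structure follows it."""
    return ("Respond with JSON matching this schema exactly:\n"
            + json.dumps(schema, indent=2))
```

Putting `verdict` last forces the model to articulate evidence before committing to a conclusion.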
Context Caching
Reuse computed context across requests to reduce costs and latency. Structure prompts so the stable prefix gets cached and only the variable part changes.
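A provider-agnostic sketch of the structure; the hash is just a way to check that the prefix really is byte-identical across requests:

```python
import hashlib

def build_request(system: str, reference_docs: str, user_query: str) -> list[dict]:
    """Stable content first (cacheable prefix), per-request content last."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": reference_docs},  # stable across requests
        {"role": "user", "content": user_query},      # the only varying part
    ]

def prefix_key(messages: list[dict]) -> str:
    """Hash everything except the final message: if this key is identical
    across requests, the shared prefix is cacheable."""
    stable = "".join(m["content"] for m in messages[:-1])
    return hashlib.sha256(stable.encode()).hexdigest()

a = build_request("You are a support bot.", "FAQ: ...", "Where is my order?")
b = build_request("You are a support bot.", "FAQ: ...", "How do I get a refund?")
```

Anything dynamic (timestamps, request IDs) inserted into the prefix breaks this equality and defeats the cache.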
Attention Anchoring
Place critical information at the start and end of context. Models over-attend to the beginning and end of their context window, a phenomenon called 'lost in the middle.' Work with this bias instead of against it.
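A minimal sketch of edge placement; repeating the critical instruction verbatim at the end is one simple tactic:

```python
def anchored_context(critical: str, bulk: list[str]) -> str:
    """Critical instructions go at both edges of the window, where models
    attend most; bulk reference material fills the middle."""
    return "\n\n".join([critical, *bulk, f"Reminder: {critical}"])

ctx = anchored_context(
    "Never include customer emails in output.",
    ["...long reference document 1...", "...long reference document 2..."],
)
```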
Guides
Context Engineering: A Practitioner's Guide
Context engineering is the discipline of deciding what information goes into an LLM's context window, how it's structured, and when to change it. This guide covers the core techniques, the patterns that keep recurring, and the mistakes that keep breaking production systems.
Memory Architectures for AI Agents
Compare memory implementations across systems. Flat files, structured databases, vector stores, and hybrid approaches. Map MemGPT, Claude, ChatGPT, and coding agents to episodic, semantic, and procedural memory concepts.
Context Rot Across Models
Data-driven comparison of how different models handle long context. NoLiMa and RULER benchmarks reveal which models maintain quality and which degrade fastest across GPT-4o, Claude, Gemini, Llama, and Mistral.
Recursive Delegation in Swarm, CrewAI, and LangGraph
How OpenAI Swarm, CrewAI, and LangGraph implement recursive delegation. Each framework handles context passing, result aggregation, and agent spawning differently.
Context Engineering for RAG Pipelines
Most RAG implementations fail not because retrieval is bad, but because nobody thought about what happens after retrieval. Bad chunking, no re-ranking, and no context budgeting waste the tokens you spent retrieving.
Context Engineering for Coding Agents
Configure Claude Code, Cursor, and Windsurf for better results. Structure your AGENTS.md and .cursorrules files to provide the right context at the right time.
Context Engineering for Code Generation
Include types, interfaces, and existing patterns in your context. Without them, the model generates code that matches its training data instead of your codebase.
Tools
Context Lens
See what your AI actually sees. A framework-agnostic proxy that intercepts LLM API calls and visualizes context window composition. Works with Claude, GPT, and any tool that calls an LLM API.
Stay Current
New patterns and research as they emerge.