Context Engineering Anti-Patterns

Most context bugs don't look like bugs. The model produces an answer; it just isn't the right one. Here are the failure modes that cause this, and how to recognize which one you're hitting.

How Context Fails

Drew Breunig’s taxonomy of context failures is a useful starting point: contexts can be poisoned (a hallucination or error enters the context and keeps getting referenced), distracting (the context grows so long that the model over-focuses on it), confusing (superfluous information influences the response), or clashing (parts of the context conflict with each other). Each maps to recognizable production failure modes, and this guide extends that framework with the specific anti-patterns that show up repeatedly in real applications.

The Anti-Patterns

Context Dumping

What it is: including everything that might be relevant rather than what actually is. The entire codebase, all documentation pages, the complete conversation history, every retrieved document regardless of relevance.

Symptom: answers are technically correct but miss the point, or the model produces generic responses when specific ones are needed. Quality is inconsistent across similar queries because of Context Rot: attention degrades as context grows, relevant information competes with noise, and dumping 50 documents when 3 are relevant actively degrades quality on the 3 that matter.

Fix: Select, Don’t Dump. For every candidate piece of context, ask whether it directly helps with this specific task, and if the answer is “maybe,” leave it out.
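A minimal sketch of the selection step, assuming a scoring function is available. The word-overlap heuristic, threshold, and document texts here are illustrative stand-ins; a production system would use embedding similarity or a reranker:

```python
def relevance(query: str, snippet: str) -> float:
    """Fraction of query words that appear in the snippet (toy heuristic)."""
    q_words = set(query.lower().split())
    s_words = set(snippet.lower().split())
    return len(q_words & s_words) / max(len(q_words), 1)

def select_context(query: str, candidates: list[str],
                   threshold: float = 0.3, max_items: int = 3) -> list[str]:
    """Keep only snippets that clearly help with this specific task.
    A 'maybe' (score below threshold) is left out entirely."""
    scored = sorted(candidates, key=lambda s: relevance(query, s), reverse=True)
    return [s for s in scored if relevance(query, s) >= threshold][:max_items]

docs = [
    "our refund policy allows returns within 30 days",
    "company history and founding story",
    "shipping rates for international orders",
]
selected = select_context("what is the refund policy", docs)
```

The key design choice is the hard threshold: a snippet that merely might be relevant scores below it and never enters the window.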


Context Starvation

What it is: the opposite problem, where so little context is included that the model fills gaps with training data. A system prompt that says “answer customer questions” with no policy documents, no scope definition, no customer information.

Symptom: answers are confident and plausible but wrong in the specifics. The model cites typical industry policies rather than your actual policies, and it uses standard terminology rather than your domain vocabulary. Without sufficient Grounding, the model draws on training data to fill in what wasn’t provided and presents those answers with exactly the same confidence as grounded ones; it won’t flag the difference for you.

Fix: ground factual content explicitly, and use Role Framing for vocabulary and scope. The minimum viable context is enough that the model doesn’t need to guess about anything that could be wrong.
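One way to enforce that minimum is to fail fast when grounding is missing rather than let the model guess. This is a sketch under assumed field names and an assumed prompt template, not a prescribed schema:

```python
# Every factual field the model could plausibly guess wrong must be
# explicitly grounded before the prompt is assembled.
REQUIRED_GROUNDING = ["policy_text", "scope", "customer_tier"]

def build_prompt(question: str, grounding: dict) -> str:
    missing = [f for f in REQUIRED_GROUNDING if not grounding.get(f)]
    if missing:
        raise ValueError(f"ungrounded fields, model would guess: {missing}")
    return (
        "You are a support agent for Acme.\n"            # role framing
        f"Scope: {grounding['scope']}\n"
        f"Customer tier: {grounding['customer_tier']}\n"
        f"Policy:\n{grounding['policy_text']}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "Can I get a refund?",
    {"policy_text": "Refunds within 30 days.",
     "scope": "billing questions only",
     "customer_tier": "pro"},
)
```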


Contradictory Context

What it is: the context window contains conflicting instructions or information. The system prompt says “be concise” while a later injected document says “provide comprehensive explanations,” or one retrieved document says the return window is 14 days and another says 30.

Symptom: inconsistent behavior where the same question gets different answers across sessions, or the model follows some instructions and ignores others with no discernible pattern. Models resolve contradictions by picking a side, often not the one you’d choose; the “winning” instruction tends to be the more recent one, the more specific one, or whichever aligns with existing model biases, and the resolution is never transparent.

Fix: audit context before assembly. For configuration and instructions, prefer a single authoritative source over multiple sources that might diverge. For retrieved documents, deduplicate or explicitly note which version takes precedence.
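An audit step can be sketched as a merge with declared precedence, where conflicts are surfaced instead of silently resolved. Source names and facts below are illustrative:

```python
PRECEDENCE = ["policy_db", "cached_docs"]  # earlier source wins

def resolve_facts(sources: dict) -> tuple[dict, list]:
    """Merge fact dicts from multiple sources, honoring precedence and
    recording (rather than hiding) any conflict encountered."""
    resolved: dict[str, str] = {}
    conflicts: list[str] = []
    for source in PRECEDENCE:
        for fact, value in sources.get(source, {}).items():
            if fact in resolved and resolved[fact] != value:
                conflicts.append(f"{fact}: {resolved[fact]!r} vs {value!r}")
            resolved.setdefault(fact, value)   # precedence order decides
    return resolved, conflicts

facts, conflicts = resolve_facts({
    "policy_db":   {"return_window": "30 days"},
    "cached_docs": {"return_window": "14 days", "shipping": "free over $50"},
})
```

The caller can then decide whether a nonempty conflict list blocks assembly or merely logs a warning; either way, the resolution is explicit rather than left to the model.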


Stale Context

What it is: context assembled from outdated sources, like the pricing page from eight months ago, the API documentation from a deprecated version, or a user profile that hasn’t been refreshed since the customer downgraded.

Symptom: answers that were correct in the past but aren’t now, which users who know the current state notice immediately. The model is accurately reflecting old information, which makes the failure harder to catch because the output looks well-grounded. There’s no built-in expiry for context, so once something is in the retrieval index or the context template, it stays there until someone removes it.

Fix: Temporal Decay for context sources that age. Index documents with timestamps and weight recent versions more heavily, schedule regular retrieval index refreshes, and for critical facts like pricing and policies, retrieve at query time so the answer reflects the current state.
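The weighting half of that fix can be sketched as exponential decay on retrieval scores. The half-life here is an assumption to tune per source, not a recommended constant:

```python
from datetime import datetime

HALF_LIFE_DAYS = 90  # a document's score halves every 90 days (tunable)

def decayed_score(base_score: float, indexed_at: datetime,
                  now: datetime) -> float:
    """Multiply a retrieval score by an exponential age penalty so
    recent versions of a document outrank stale ones."""
    age_days = (now - indexed_at).days
    return base_score * 0.5 ** (age_days / HALF_LIFE_DAYS)

now = datetime(2025, 6, 1)
fresh = decayed_score(1.0, datetime(2025, 5, 1), now)   # ~1 month old
stale = decayed_score(1.0, datetime(2024, 10, 1), now)  # ~8 months old
```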


Context Echo

What it is: model output re-injected into context as if it were ground truth. An agent loop where the model summarizes a document and that summary becomes the context for the next step, or a conversation where the model’s answer in turn 5 gets included verbatim in the context for turn 10 and treated as a reliable source.

Symptom: errors that compound across steps. An incorrect fact produced early gets amplified through subsequent iterations, and the model becomes increasingly confident about something that was wrong from the start. This is the anti-pattern that turns a single early hallucination into a systemic failure, because the model cannot distinguish between context it generated itself and context from authoritative external sources.

Fix: mark the provenance of every context element. Model-generated summaries should be labeled as such, and wherever possible, prefer original sources over model-generated representations. In agent loops, Write Outside the Window provides persistent storage for verified facts, separate from the rolling context where echo effects accumulate.
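Provenance marking can be as simple as tagging each element before it is rendered into the window. The tag wording and source names below are illustrative:

```python
from dataclasses import dataclass

@dataclass
class ContextItem:
    text: str
    source: str            # e.g. "policy_db", "model_summary"
    model_generated: bool

def render(items: list[ContextItem]) -> str:
    """Render context with explicit provenance labels, so model output
    is never presented to the model as ground truth."""
    parts = []
    for item in items:
        tag = "UNVERIFIED MODEL OUTPUT" if item.model_generated else item.source
        parts.append(f"[{tag}]\n{item.text}")
    return "\n\n".join(parts)

window = render([
    ContextItem("Returns accepted within 30 days.", "policy_db", False),
    ContextItem("Summary: customer wants a refund.", "model_summary", True),
])
```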


Format Soup

What it is: mixing structural formats within a single context window without discipline. XML task instructions, JSON tool results, Markdown documentation, and plain prose instructions coexisting in the same prompt, often without clear boundaries between them.

Symptom: the model ignores some instructions, misparses structured data, or produces output in unexpected formats. In agentic loops, tool call failures spike when the model misreads a JSON field surrounded by Markdown. Models handle format mixing imperfectly, and when XML and Markdown appear together without clear separation, attention to each format is diminished.

Fix: pick one primary structure for a given context section and stay with it. Use XML tags to delimit distinct sections from each other, put structured data in clearly marked blocks with explicit format declarations, and prioritize consistency within a context window over clever format mixing.
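A disciplined assembly might look like the sketch below: XML tags delimit sections, and embedded structured data declares its format explicitly. Tag names are illustrative, not a standard:

```python
import json

def assemble(instructions: str, tool_result: dict, docs: str) -> str:
    """Build one context window with explicit boundaries between formats."""
    return (
        f"<instructions>\n{instructions}\n</instructions>\n\n"
        '<tool_result format="json">\n'
        f"{json.dumps(tool_result, indent=2)}\n"
        "</tool_result>\n\n"
        f'<documentation format="markdown">\n{docs}\n</documentation>'
    )

prompt = assemble(
    "Answer using only the documentation below.",
    {"status": "ok", "balance": 42},
    "## Refunds\nRefunds take 5 business days.",
)
```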


Ghost Context

What it is: context that was relevant earlier in a session but is no longer relevant, kept in the window purely because nobody removed it. A retrieved document from a query five turns ago about a topic the user has moved on from, or system-level context injected at session start for a use case that has already been completed.

Symptom: the model brings up things the user moved past, answers drift toward earlier topics, and in multi-step agents, the agent re-addresses completed steps rather than progressing. This is the anti-pattern where context accumulates passively; information injected for one purpose stays in the window for all subsequent purposes unless something actively removes it.

Fix: Compress & Restart at session boundaries to discard irrelevant accumulated context. In structured agents, replace context when the task changes: swap out the prior task’s context section and bring in the new one fresh. For retrieved documents, treat retrieval as per-query rather than per-session wherever the query intent can change.
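The replace-not-accumulate idea for structured agents can be sketched as a slot-based window, where setting a slot overwrites its previous contents. Slot names are illustrative:

```python
class ContextWindow:
    """A context window organized as named slots. Writing a slot replaces
    its previous contents, so stale task context cannot linger."""

    def __init__(self) -> None:
        self.slots: dict[str, str] = {}

    def set_slot(self, name: str, text: str) -> None:
        self.slots[name] = text          # replace, never append

    def render(self) -> str:
        return "\n\n".join(f"[{name}]\n{text}"
                           for name, text in self.slots.items())

win = ContextWindow()
win.set_slot("task", "Step 1: collect the order number.")
win.set_slot("task", "Step 2: issue the refund.")  # step 1 is swapped out
```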


Diagnosing Which Anti-Pattern You’re Hitting

The fastest diagnostic is to inspect the assembled context directly: log the full context at query time and read it. Most anti-patterns are immediately obvious once you see what the model is actually reading, rather than what you assumed it was reading.
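Under the assumption of a pipeline that assembles context before each model call, the logging step might look like this sketch (field names and the in-memory log are illustrative; a real system would write to durable storage):

```python
import time

def log_context(query: str, context: str, log: list) -> None:
    """Record the full assembled context at query time, so diagnosis can
    start from what the model actually read."""
    log.append({
        "ts": time.time(),
        "query": query,
        "context": context,            # the full window, not a summary
        "context_chars": len(context),
    })

audit_log: list[dict] = []
log_context("what is the refund policy",
            "<instructions>...</instructions>", audit_log)
```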

| Symptom | Most likely anti-pattern |
| --- | --- |
| Confident wrong answers on factual questions | Context starvation or stale context |
| Inconsistent answers to the same question | Contradictory context |
| Correct but generic answers | Context dumping |
| Compounding errors across steps | Context echo |
| Model ignores explicit instructions | Format soup or contradictory context |
| Degrading quality in long sessions | Context dumping, ghost context, or context rot |
| Model references things user moved past | Ghost context |

Keep in mind that the same symptom can have multiple causes, which is why the table is a starting point rather than a decision tree. The context log is the definitive source; everything else is inference.