Context Engineering vs Prompt Engineering
Prompt engineering is about phrasing one request well. Context engineering is about assembling the information environment that makes the model capable of doing the work at all. They sound similar but they solve different problems, and confusing them is why most agent systems degrade after a few turns.
Two Different Problems
Both disciplines involve shaping what you send to an LLM, both affect output quality, and the terminology sounds adjacent, so it’s easy to assume they’re the same thing at different scales, or that context engineering is just a fancier name for prompt engineering.
One is a craft problem; the other is an engineering problem. Understanding the difference changes what you pay attention to when something breaks.
What Prompt Engineering Actually Is
Prompt engineering is the craft of wording a request to get a better response: which phrasing produces more accurate output, does adding “think step by step” help, does the ordering of few-shot examples matter, where exactly should the instruction go.
In a single-turn interaction, this is genuinely the whole problem. You have one input string and one output, and the only lever you have is how you phrase the request; prompt engineering developed as a discipline because that lever turns out to matter quite a bit, and the results aren’t always obvious in advance. The scope is narrow by design: no memory, no tools, no accumulated state. A good prompt is a well-formed question with well-chosen examples and well-specified constraints, and that’s the entire surface area.
What Context Engineering Is
Context engineering operates on the 95% of the context window that isn’t the prompt.
In a production agent system, the system prompt might account for a few thousand tokens. The rest is accumulated conversation history, retrieved documents, tool outputs, file contents, and whatever else the application assembled programmatically. Nobody is hand-crafting that 95%; it’s built by code, which means it’s an engineering problem with engineering solutions.
The decisions involved are architectural, not linguistic: what information does this agent need to do its job, how much history is worth keeping versus summarizing versus discarding, what gets included when tool output comes back at 40k tokens, how context flows when one agent hands off to another. These questions don’t have answers you discover by tweaking wording; they require system design.
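These tradeoffs show up directly in code. A minimal sketch of the kind of assembly logic involved, with hypothetical names and budgets throughout, capping oversized tool output and bounding how much history is kept verbatim:

```python
# Sketch of programmatic context assembly. All names and limits here are
# illustrative assumptions, not a real framework's API. The decisions are
# the architectural ones described above: keep, truncate, or discard.

MAX_TOOL_OUTPUT_CHARS = 8_000   # assumed budget; real systems count tokens
MAX_HISTORY_TURNS = 10          # keep only the most recent turns verbatim

def clip_tool_output(output: str) -> str:
    """Cap a tool result instead of pasting 40k tokens into the window."""
    if len(output) <= MAX_TOOL_OUTPUT_CHARS:
        return output
    head = output[:MAX_TOOL_OUTPUT_CHARS]
    dropped = len(output) - MAX_TOOL_OUTPUT_CHARS
    return head + f"\n[... truncated {dropped} chars]"

def assemble_context(system_prompt, history, retrieved_docs, tool_results, task):
    """Build the window from parts; most of it is not the prompt."""
    recent = history[-MAX_HISTORY_TURNS:]          # older turns dropped here
    tools = [clip_tool_output(t) for t in tool_results]
    parts = [system_prompt, *retrieved_docs, *tools, *recent, task]
    return "\n\n".join(parts)
```

The point is not this particular policy but that a policy exists at all: some code, somewhere, decides what the model sees.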
Birgitta Böckeler’s definition from martinfowler.com cuts through the noise: “curating what the model sees so that you get a better result.” The word “curating” matters. It implies active selection over time, not a one-shot composition.
Where They Intersect and Where They Don’t
The two disciplines overlap at the system prompt: that’s both a prompt engineering concern (how should instructions be worded and ordered?) and a context engineering concern (what information should live here vs. elsewhere, and how does this interact with everything that gets appended later?). Confusion usually starts there.
But the further you get from static instructions toward dynamic, accumulating context, the less prompt engineering applies. When a multi-step agent workflow starts degrading at turn 15, rewriting the system prompt will not fix it. The accumulated context is the problem; how you worded the initial instruction is not. Applying prompt-debugging instincts to a context-management failure wastes time and misses what's actually wrong.
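The fix for that kind of failure lives in the context-management layer, not in the instruction wording. One common remedy is compacting older turns into a summary; a rough sketch, where `summarize` is a trivial placeholder standing in for a real summarization call:

```python
# Sketch: evict old turns by compacting them into a running summary.
# `summarize` would be an LLM call in practice; the placeholder below
# just keeps the structure runnable.

def summarize(turns: list[str]) -> str:
    return f"[summary of {len(turns)} earlier turns]"

def compact_history(history: list[str], keep_recent: int = 6) -> list[str]:
    """Replace everything but the most recent turns with one summary."""
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent
```

No amount of system-prompt rewording reproduces what this does, which is the practical content of the distinction.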
The reverse is also true: if you’re building a single-turn extraction pipeline where every call is independent, context engineering concerns barely apply, and prompt engineering is the right tool for that job.
What the Shift Looks Like in Practice
A prompt engineering approach to building a coding assistant gives you something like this:
You are an expert software engineer. Fix the bug described below.
Write clean, well-documented code that follows best practices.
A context engineering approach adds the information environment the model needs to actually do the job:
System: You are a coding assistant for this FastAPI codebase.
Validation uses pydantic schemas. Password policy: 8+ chars,
one number, one special character. Email validation via
email-validator library. Errors return HTTP 422 with detail dict.
[auth.py — relevant section, 40 lines]
[failing test, 15 lines]
[error log excerpt, 5 lines]
Task: Fix the validation bug in validate_user_input().
The second version is better not because the instruction is worded better, but because the model has the right information to work with. You could spend a week perfecting the phrasing of the first version and still get worse results than the second version with a mediocre instruction.
Why the Distinction Matters
Diagnosing failures correctly depends on knowing which problem you have. Context engineering failures are consistent and positional: the model ignores retrieved documents even when they’re relevant (a Grounding problem), quality degrades after turn 10 but is fine earlier (Context Rot), or the model keeps repeating information from three turns ago that’s no longer true. Prompt engineering failures look different: wrong format, inconsistent tone, missing steps, output that doesn’t match stated constraints.
The skill sets are also different. Prompt engineering is mostly about writing and intuition about how models respond to phrasing; context engineering is mostly about information architecture: what to include, when to evict it, and how to structure the window so the model's attention lands on what matters. One transfers from editorial work, the other from software architecture and database design. If you hire someone as a "prompt engineer" and expect them to manage context lifecycle in a long-running agent, you're setting them up to fail at the wrong job.
The Honest Boundary
The terminology is still settling. "Context engineering" became common currency in 2025, partly because Karpathy used it in a widely read post; before that, people called the same problem "prompt engineering" because no better term existed, and some practitioners still use the terms interchangeably. The discipline is new enough that the vocabulary hasn't stabilized.
The underlying distinction is real regardless of what you call it, though. Phrasing a request well and managing an information environment across a multi-step workflow are different skills that require different intuitions and fail in different ways, and knowing which problem you’re solving tells you where to look when something breaks.
The patterns on this site are context engineering patterns. They assume a multi-turn, stateful environment where what you include, where you put it, and when you swap it out determines whether the system works.