Anchor Turn

Front-load all source reads into one turn so every subsequent turn works from cache.

The Problem This Solves

Agentic tasks run over many turns. Each turn, the model may need to consult source material: research files, documentation, a specification. The naive approach reads files on demand. When writing module 4, read the patterns documentation. When writing module 7, read the benchmarks file.

Every file read injects fresh tokens into that turn’s context. Fresh tokens are computed from scratch at full input price. They also break the cache prefix: providers hash the beginning of each request to find a cache hit, and inserting a large file mid-session means the hash no longer matches previous turns. The result is higher cost per turn and degraded cache utilisation across a session that might run for an hour.
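The prefix-matching behaviour can be sketched with a toy model. This is a simplified assumption about how providers work (real caches operate at token-block granularity, not whole messages); `prefix_hashes`, `fresh_messages`, and the message strings are illustrative:

```python
import hashlib

def prefix_hashes(messages):
    """Hash every prefix of the message list -- a stand-in for the
    provider's internal prefix matching."""
    hashes, running = [], hashlib.sha256()
    for m in messages:
        running.update(m.encode())
        hashes.append(running.hexdigest())
    return hashes

def fresh_messages(prev_request, new_request):
    """Messages in new_request that miss the cache, given the provider
    has already seen prev_request."""
    hits = 0
    for a, b in zip(prefix_hashes(prev_request), prefix_hashes(new_request)):
        if a != b:
            break
        hits += 1
    return len(new_request) - hits

turn_3 = ["system", "sources summary", "write module 1", "write module 2"]
# Appending only the new instruction: one fresh message, rest cached.
turn_4 = turn_3 + ["write module 3"]
# An on-demand file read adds a second, much larger fresh block.
turn_4_read = turn_3 + ["contents of benchmarks.md (12k tokens)", "write module 3"]
# Changing anything early in the history diverges the hash at that point:
# everything after it is recomputed from scratch.
turn_4_edit = ["system", "sources summary v2"] + turn_3[2:] + ["write module 3"]

print(fresh_messages(turn_3, turn_4))       # 1
print(fresh_messages(turn_3, turn_4_read))  # 2
print(fresh_messages(turn_3, turn_4_edit))  # 4
```

The toy model shows both failure modes at once: fresh content always costs full price even when appended, and any early divergence forfeits the cache for the entire remainder of the request.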

How It Works

Dedicate the first turn of the session to reading everything the task will need.

  1. Read all source material in one turn. Open every file, document, or reference the task will require. Do it now, not on demand.
  2. Write a structured summary. Produce a consolidated reference document and write it to disk.
  3. Never re-read a source file. For all subsequent turns, draw on the conversation history and the summary. The summary is the canonical reference.
  4. Use the summary as a cache anchor. Because the summary enters the conversation history on turn 1, it becomes part of the cached prefix for every subsequent turn. The provider serves it from cache rather than recomputing it.
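The four steps above can be sketched as a session loop. `summarize` and `complete` are hypothetical stand-ins for whatever model calls your agent framework provides; the structure, not the names, is the point:

```python
from pathlib import Path

def anchor_turn(source_paths, notes_path, summarize):
    """Turn 1: read every source exactly once, write the consolidated summary."""
    sources = {p: Path(p).read_text() for p in source_paths}
    notes = summarize(sources)           # structured summary of all sources
    Path(notes_path).write_text(notes)   # canonical on-disk reference
    return notes

def run_session(source_paths, notes_path, tasks, summarize, complete):
    history = []
    notes = anchor_turn(source_paths, notes_path, summarize)
    history.append(("anchor", notes))    # summary enters the cached prefix
    for task in tasks:
        # Later turns draw only on history + notes; no source file is
        # ever re-read, so the prefix stays cache-stable.
        output = complete(history, task)
        history.append((task, output))
    return history
```

Note that `anchor_turn` runs exactly once and its output is the first entry in `history`, so every subsequent request shares that prefix.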

The net effect: fresh token consumption drops to near zero for the rest of the session. Every turn after the anchor turn is cheap, fast, and working from the same stable context.

Example

A session producing 8 course modules from research files.

Without an anchor turn: Each module turn reads 1-3 research files to cite specific data. Those reads inject 5,000-15,000 fresh tokens per turn. Over 90 turns, this adds up to 1.9 million fresh input tokens and 73% cache utilisation.

With an anchor turn: Turn 1 reads all 15 source files and writes a 900-word research notes file. Turns 2-160 draw from the cached conversation history. Fresh input tokens for the entire session: 191. Cache utilisation: 100%.
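A back-of-envelope cost model makes the gap concrete. The per-token prices and per-turn token counts below are illustrative assumptions, not measurements from the session above:

```python
# Assumed prices: fresh input at $3/M tokens, cached input at 10% of that.
FRESH_PRICE = 3.00 / 1_000_000    # $ per fresh input token (assumed)
CACHED_PRICE = 0.30 / 1_000_000   # $ per cached input token (assumed)

def input_cost(turns, cached_per_turn, fresh_per_turn):
    """Total input cost over a run of turns with a fixed per-turn profile."""
    return turns * (fresh_per_turn * FRESH_PRICE + cached_per_turn * CACHED_PRICE)

# On-demand: every turn injects ~10k fresh tokens on top of a ~50k cached prefix.
on_demand = input_cost(turns=90, cached_per_turn=50_000, fresh_per_turn=10_000)

# Anchor turn: one expensive first turn reads everything; after that, fresh
# input is near zero while the whole (larger) prefix is served from cache.
anchor = input_cost(turns=1, cached_per_turn=0, fresh_per_turn=60_000)
anchor += input_cost(turns=89, cached_per_turn=60_000, fresh_per_turn=100)

print(f"on-demand: ${on_demand:.2f}  anchor: ${anchor:.2f}")
```

Even though the anchor session carries a larger cached prefix every turn, the cached rate is cheap enough that eliminating per-turn fresh reads dominates the total.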

When to Use

Long sessions over stable source material: the task will span many turns, the sources will not change mid-session, and the same material will be consulted repeatedly. Multi-part writing or generation tasks driven by a fixed research corpus fit this profile.

When Not to Use

Short sessions, where the up-front read costs more than it saves. Sessions where the sources change mid-task, because the anchored summary goes stale and re-reading breaks the cached prefix anyway. Source material too large to fit in a single turn's context, where on-demand reading is unavoidable.