Tool Descriptions as Context

Tool definitions are context: the description tells the model when to use a tool and how. Most descriptions only say what the tool does; the ones that work also say when to use it and when not to.

The Problem This Solves

Tool definitions are context. The model reads them to decide which tool to call, what arguments to pass, and what the result means. The quality of those definitions determines whether it gets any of that right.

Poor descriptions cause cascading failures. The model picks the wrong tool, or the right tool with wrong arguments, or misses an opportunity to use a tool entirely. Each failed call wastes a turn, burns tokens, and often requires the user to intervene. The problem compounds in agentic loops where the model makes dozens of tool calls in sequence; one bad description can derail an entire chain of reasoning.

How It Works

A tool definition has four parts. The description carries the most weight; the last two get the least attention:

  1. Name. Should suggest purpose. search_knowledge_base is clear. query_42 is not.
  2. Description. The most important part. Tells the model not just what the tool does, but when to use it and what it doesn’t do. This is the context that drives tool selection.
  3. Parameters. The input schema. Use Schema Steering here; typed parameters with descriptions constrain what the model passes.
  4. Return description. What comes back and how to interpret it. Often omitted, but it helps the model plan multi-step workflows where one tool’s output feeds another.
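The four parts map directly onto the JSON tool schemas that most model APIs accept. A minimal sketch, assuming a generic JSON-schema-style format (the exact field names vary by provider):

```python
# A sketch of a complete tool definition, assuming a generic
# JSON-schema-style tool format; exact field names vary by provider.
search_tool = {
    "name": "search_knowledge_base",  # 1. Name: suggests purpose
    "description": (                  # 2. Description: what, when, and when not
        "Search the internal knowledge base for policies and FAQs. "
        "Use for questions about company processes. "
        "Does NOT search code repositories. "
        "Returns the top 5 results with excerpts."  # 4. Return description folded in
    ),
    "parameters": {                   # 3. Parameters: typed and described
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Natural-language search query",
            }
        },
        "required": ["query"],
    },
}
```

Note that the return description has no dedicated field in most schemas, which is why it usually ends up in the description string, or omitted entirely.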

The description is where most of the value lives. Compare:

# The model has to guess when to use this and what it covers
def search_documents(query: str):
    """Search for documents matching the query."""

# The model knows exactly when this is appropriate
def search_documents(query: str):
    """Search the internal knowledge base for policies,
    procedures, and FAQs. Use when the user asks about
    company processes or specific procedures. Returns
    top 5 results with excerpts. Does NOT search code
    repositories or customer data."""

The second version tells the model the scope (internal knowledge base), the trigger (company processes), the output shape (top 5 with excerpts), and the boundaries (not code, not customer data). The model can make an informed decision about whether this tool is appropriate for the current query.
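One way to keep descriptions this rich without drift is to generate the tool spec from the function itself, so the description the model sees is the docstring next to the code. A sketch using the standard `inspect` module; `to_tool_spec` is a hypothetical helper, and typing every parameter as a string is a simplification:

```python
import inspect

def search_documents(query: str):
    """Search the internal knowledge base for policies,
    procedures, and FAQs. Use when the user asks about
    company processes or specific procedures. Returns
    top 5 results with excerpts. Does NOT search code
    repositories or customer data."""
    ...

def to_tool_spec(fn):
    """Build a provider-agnostic tool spec from a function's
    name, signature, and docstring (hypothetical helper;
    assumes all parameters are strings for simplicity)."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),  # the docstring IS the context
        "parameters": {
            "type": "object",
            "properties": {
                name: {"type": "string"} for name in sig.parameters
            },
            "required": list(sig.parameters),
        },
    }

spec = to_tool_spec(search_documents)
```

With this wiring, improving a description is a one-line docstring edit, and the spec can never disagree with the code.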

Example

A data analysis agent with three tools.

Bad definitions where the model has to guess:

def query_db(sql): """Run a SQL query."""
def create_chart(data, type): """Create a chart."""
def send_email(to, subject, body): """Send an email."""

Good definitions that provide the context the model needs:

def query_db(sql):
    """Execute a read-only SQL query against the analytics
    database. Use for metrics, counts, aggregations, and
    lookups by ID. Do NOT use for writes or schema info
    (use describe_tables instead).
    Returns: JSON array of matching rows."""

def create_chart(data, chart_type):
    """Create a visualization from query results.
    Use after query_db when the user wants to see data
    visually. chart_type: 'bar' | 'line' | 'pie' | 'scatter'.
    Returns: URL to the generated chart image."""

def send_email(to, subject, body):
    """Send results to a user. Use ONLY after analysis is
    complete with results to share. Do not send test
    messages or debugging output.
    Requires: valid email, non-empty subject and body."""

The model now knows the ordering (query first, then chart, then email), the constraints (read-only queries, no test emails), and the relationships between tools (chart takes query output). These descriptions function as a lightweight workflow specification embedded in the tool context.
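The descriptions only ask the model to respect these constraints; the implementations should still enforce them, since a model can ignore instructions. A minimal sketch of guards matching the docstrings above (the function bodies are hypothetical stand-ins):

```python
def query_db(sql):
    """Execute a read-only SQL query against the analytics
    database. Use for metrics, counts, aggregations, and
    lookups by ID. Do NOT use for writes or schema info.
    Returns: JSON array of matching rows."""
    # Enforce the read-only constraint the description states.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("query_db is read-only: only SELECT is allowed")
    return []  # stand-in for real database rows

def send_email(to, subject, body):
    """Send results to a user. Use ONLY after analysis is
    complete with results to share.
    Requires: valid email, non-empty subject and body."""
    # Enforce the preconditions the description states.
    if "@" not in to or not subject or not body:
        raise ValueError("send_email needs a valid address, subject, and body")
    return "sent"  # stand-in for a real delivery result
```

When the description and the guard agree, a violation surfaces as a clear error the model can read and recover from, rather than a silent wrong action.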

When to Use

Whenever an agent chooses among multiple tools, and especially in agentic loops where calls chain together and one wrong selection can derail the whole sequence. The more tools overlap in apparent scope, the more each description needs to draw explicit boundaries.

When Not to Use

There is no case where a vaguer description helps. The only trade-off is length: descriptions are context and cost tokens on every request, so with a large tool catalog, keep each one to scope, trigger, output shape, and boundaries rather than exhaustive documentation.