The Tool Layer: Connecting Agents to the World
A language model, by itself, is isolated. It can generate text about browsing the web, writing a file, or querying a database — but it cannot actually do any of those things. The tool layer is the architectural boundary that connects the agent to the external world: it translates the LLM's structured text output into real function calls, API requests, database queries, and code executions, then returns the results to the context window so the model can reason about what happened.
How a Tool Is Defined
Every tool exposed to an agent consists of three parts. First, a name and description: a human-readable name the model uses to identify the tool, plus a natural-language description of what the tool does, when to use it, and what it returns. The description is not incidental — the LLM reads it to decide whether to invoke the tool, so a vague or misleading description causes selection errors. Second, a parameter schema: a machine-readable specification (typically JSON Schema) of the arguments the tool accepts — their names, types, whether they are required, and acceptable value ranges. Third, an implementation: the actual code that executes when the tool is called. The LLM never sees the implementation; it only sees the name, description, and schema. This separation is fundamental to how tool use works.
When an agent 'decides to call a tool,' it generates a text object specifying the tool name and argument values. The orchestration loop validates this against the schema, executes the implementation, and returns the result as a new message. The model at no point reads or runs code — it reads descriptions and produces structured text. Tool use is fundamentally a language task with a structured side channel for execution.
Modern agent frameworks — OpenAI's function-calling API, Anthropic's tool use API, LangChain's tool abstraction, and others — all implement this same three-part structure, varying only in the specifics of the schema format and how results are injected back into the context. The Anthropic API, for example, requires each tool to supply a name, a description, and an input_schema following the JSON Schema Draft 7 specification. When the model decides to use a tool, the API returns a tool_use content block rather than a text block, and the caller is responsible for executing the function and returning the result in a tool_result message.
Categories of Tools
Tools fall into three functional categories. Retrieval tools fetch information without causing side-effects: web search, database queries, document lookup, weather APIs, stock price feeds. Because they are read-only, they are safe to retry and parallelize. Action tools cause side-effects in the external world: sending emails, writing files, posting to social media, submitting forms, calling payment APIs. These must be treated with care — they may be irreversible, expensive, or both, and the orchestration loop should implement confirmation, rate-limiting, and rollback mechanisms where possible. Computation tools run code or perform structured computation: a Python execution sandbox, a calculator, a SQL query runner, a regex engine. These are powerful because they allow precise operations that LLMs perform unreliably in pure text (arithmetic, sorting, data transformation), but they also introduce security considerations around code injection and sandboxing.
Match each tool example to the category it belongs to and its defining characteristic.
Terms
Definitions
Drag terms onto their definitions, or click a term then click a definition to match.
Tool Design Principles
Well-designed tools share five properties. They are narrowly scoped: each tool does one thing clearly rather than a family of things ambiguously. A tool called search_the_web_and_summarize_results is harder for the model to reason about and harder to test than two separate tools — one that searches, one that summarizes. They have unambiguous descriptions: the description must distinguish the tool from similar tools in the set. If you have both search_web and search_database, the descriptions must make crystal-clear when each is appropriate. They have safe defaults: required parameters should be minimal; optional parameters with sensible defaults reduce the chance of the model leaving a critical argument unset. They validate inputs strictly: the schema enforces types and ranges so that malformed tool calls fail loudly at validation rather than silently at execution. And they return structured, parseable results: raw API responses often contain noise the model cannot use; well-designed tools filter, format, and summarize their output before returning it to the context.
An agent has a tool called manage_file with parameters: action (one of 'read', 'write', 'delete', 'move', 'copy'), path, and content. A developer notices the agent frequently passes the wrong action or omits required parameters. What is the most likely root cause?
Retrieval tools are generally safe to run speculatively. Action tools — those with real-world side-effects — require explicit safeguards: human-in-the-loop confirmation for high-stakes actions, rate limits to prevent runaway loops, idempotency keys for payment operations, and rollback mechanisms where available. An agent that can send emails or charge cards with no confirmation layer is a serious risk even if the model itself is well-behaved.
A tool called get_weather returns a full JSON blob from the weather API, including hundreds of fields the agent never uses. What tool design principle does this violate, and what is the consequence?
Design a Tool Set
- You are building an agent that helps a small business manage customer support tickets. The agent should be able to: look up tickets by ID or status, post replies to tickets, escalate tickets to a human agent, and search a knowledge base for relevant help articles.
- Step 1: List the tools your agent needs. Give each a name (use snake_case) and a one-sentence description.
- Step 2: For each tool, specify its parameters: name, type (string, integer, boolean, etc.), and whether it is required or optional.
- Step 3: Classify each tool as retrieval, action, or computation.
- Step 4: Identify which tools are most dangerous if called incorrectly. What safeguards would you add to those tools?
- Goal: practice the full tool design process — naming, describing, schema design, and safety analysis.