AI Agents & Automation

⏱ About 15 min · 15 XP

When a Tool Fails

In an ideal world, every tool call succeeds, returns exactly what the agent expected, and the task is completed smoothly. In the real world, tools fail routinely. The network goes down. APIs return errors. Search queries return nothing useful. A service is temporarily unavailable. The input format is wrong. A rate limit is exceeded. How an agent handles these failures is just as important as how it uses tools when everything works.

Three Kinds of Tool Failure

Tool failures fall into three broad categories, and each requires a different response.

Hard errors occur when the tool call itself fails. The API returns an error code, perhaps 404 (not found), 429 (too many requests), or 500 (server error). The tool output is an error message rather than useful data. The agent must recognize this and decide: retry? try a different tool? ask the user for help?

Empty or insufficient results occur when the tool runs but returns nothing helpful. A search for a very obscure topic might return zero results. A database lookup might return an empty table. The tool did not fail technically, but the agent still has no useful information.

Wrong or misleading results occur when the tool returns data that looks correct but is not. A search result might be outdated, biased, or from an untrustworthy source. The agent accepted the output, but the data is bad.
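The three categories can be sketched as a small classifier. This is a minimal illustration, not a real framework API: ToolResult, FailureKind, and the trusted_source flag are all hypothetical names invented for this example. In practice, deciding whether a source is trustworthy is the hard part, and a single boolean stands in for that judgment here.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    NONE = "ok"
    HARD_ERROR = "hard_error"
    EMPTY = "empty_or_insufficient"
    SUSPECT = "wrong_or_misleading"


@dataclass
class ToolResult:
    # Hypothetical shape for a tool's output; real agent frameworks differ.
    status_code: int
    items: list
    # Assumption: something upstream has already judged the source.
    trusted_source: bool = True


def classify(result: ToolResult) -> FailureKind:
    """Map a tool result onto the three failure categories (or NONE)."""
    if result.status_code >= 400:
        return FailureKind.HARD_ERROR   # e.g. 404, 429, 500
    if not result.items:
        return FailureKind.EMPTY        # the tool ran but found nothing
    if not result.trusted_source:
        return FailureKind.SUSPECT      # data came back, but it may be bad
    return FailureKind.NONE
```

The order of the checks matters: an error response usually has no items either, so the hard-error test runs first to avoid mislabeling it as an empty result.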

Error Codes at a Glance

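HTTP error codes tell you what went wrong. 404 means the resource was not found, so repeating the same request unchanged will not help. 429 means you have made too many requests too quickly, so the agent should wait before trying again. 500 means something went wrong on the server's end, which may or may not be temporary. Agents (and their developers) need to handle each type differently.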
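One way to turn these codes into decisions is a small function that picks the next step. This is a sketch under stated assumptions: the action names, the limit of three retries, and the 60-second cap are illustrative choices, not a standard. The retry_after parameter stands for the number of seconds a server may suggest in a Retry-After header.

```python
from typing import Optional, Tuple

MAX_RETRIES = 3          # assumption: give up after three attempts
MAX_WAIT_SECONDS = 60.0  # assumption: never wait longer than a minute


def backoff(attempt: int) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at MAX_WAIT_SECONDS."""
    return min(2.0 ** attempt, MAX_WAIT_SECONDS)


def next_action(
    status_code: int,
    retry_after: Optional[float] = None,
    attempt: int = 0,
) -> Tuple[str, float]:
    """Return (action, seconds_to_wait) for a given HTTP status code."""
    if 200 <= status_code < 300:
        return ("use_result", 0.0)
    if status_code == 404:
        # The resource is missing; retrying the same request will not help.
        return ("fix_request_or_give_up", 0.0)
    if status_code == 429:
        # Too many requests: honor the server's hint if it gave one.
        wait = retry_after if retry_after is not None else backoff(attempt)
        return ("wait_then_retry", wait)
    if 500 <= status_code < 600 and attempt < MAX_RETRIES:
        # Server-side trouble may be temporary, so retry a few times.
        return ("wait_then_retry", backoff(attempt))
    return ("report_failure", 0.0)
```

For example, next_action(429, retry_after=5.0) returns ("wait_then_retry", 5.0), while next_action(500, attempt=3) returns ("report_failure", 0.0) because the retry budget is used up.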

Graceful Degradation: Failing Without Crashing

A well-designed agent degrades gracefully: it finds a useful path forward even when a tool fails, rather than freezing, hallucinating, or giving a confident wrong answer.

For a hard error: the agent should acknowledge the failure, explain what happened in plain language, and offer alternatives. If the weather API is down, the agent might say: "I could not reach the weather service right now. You can check weather.gov directly, or I can try again in a moment."

For empty results: the agent should try rephrasing the query, broadening the search, or switching to a different tool. If a narrow search returns nothing, a broader search might.

For wrong results: the agent should cross-check important information against a second source before relying on it, especially for facts that will drive real decisions.
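The hard-error and empty-result responses described above can be sketched in a few lines. Everything named here is hypothetical: ToolError is an invented exception that a tool wrapper might raise, and search is any function that takes a query string and returns a list of results. The caller supplies queries ordered from narrow to broad.

```python
from typing import Callable, List, Sequence


class ToolError(Exception):
    """Hypothetical exception raised by a tool wrapper on a hard error."""


def search_with_fallback(
    search: Callable[[str], List[str]],
    queries: Sequence[str],
) -> dict:
    """Try each query in turn; never crash and never invent results."""
    for query in queries:
        try:
            results = search(query)
        except ToolError as exc:
            # Hard error: say so plainly and offer a next step.
            return {
                "ok": False,
                "results": [],
                "message": (
                    f"I could not reach the search service right now ({exc}). "
                    "I can try again in a moment."
                ),
            }
        if results:
            return {
                "ok": True,
                "results": results,
                "message": f"Found {len(results)} result(s) for '{query}'.",
            }
        # Empty result: fall through and try the next, broader query.
    return {
        "ok": False,
        "results": [],
        "message": "I searched but found nothing useful, even after broadening the query.",
    }
```

Returning a message alongside the results keeps the honest explanation attached to the outcome, so the part of the agent that talks to the user does not have to guess what happened.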

Cross-Checking Critical Facts

When a tool returns a result that will drive an important decision, a careful agent searches for a second, independent source to verify it before acting. One tool call is a data point; two agreeing tool calls are much stronger evidence, provided they do not both draw on the same underlying source.
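A cross-check for a numeric fact might look like the sketch below. The function name, the returned dictionary, and the tolerance parameter are assumptions made for illustration. The two lookups are assumed to query independent sources; the code cannot verify that on its own.

```python
from typing import Callable, Optional


def cross_check(
    lookup_a: Callable[[], Optional[float]],
    lookup_b: Callable[[], Optional[float]],
    tolerance: float = 0.0,
) -> dict:
    """Compare two sources and report whether they agree within tolerance."""
    first, second = lookup_a(), lookup_b()
    if first is None and second is None:
        return {"verified": False, "value": None,
                "note": "no source returned a value"}
    if first is None or second is None:
        # One data point only: usable, but flagged as unverified.
        value = first if first is not None else second
        return {"verified": False, "value": value,
                "note": "only one source returned a value"}
    if abs(first - second) <= tolerance:
        return {"verified": True, "value": first,
                "note": "two sources agree"}
    # Disagreement: do not pick a winner silently.
    return {"verified": False, "value": None, "note": "sources disagree"}
```

When the sources disagree, the sketch returns no value at all rather than choosing one, which leaves the decision to a human or to a third source.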

The Worst Response to Failure: Hallucination

The most dangerous response to a tool failure is for the agent to fill in the gap by generating plausible-sounding text from its own training data, a behavior called hallucination. If the search tool fails and the agent pretends it got a result and answers anyway, the user has no idea the information is made up. A trustworthy agent treats tool failure honestly: it reports what happened and what it does not know. Honesty about uncertainty is a feature, not a bug. Users can deal with "I could not retrieve that information right now." They cannot easily detect a confidently stated falsehood.

Never Fabricate Results

If a tool fails and the agent invents an answer rather than admitting the failure, the user receives misinformation with no warning. This is worse than getting no answer at all. A well-designed agent always acknowledges uncertainty or failure honestly.
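In code, one simple guard against fabrication is to make the answer depend explicitly on the tool output, so that a missing result can only produce an honest message. The sketch below is illustrative: compose_answer is a hypothetical helper, and None stands for a failed or empty tool call.

```python
from typing import Optional


def compose_answer(question: str, tool_output: Optional[str]) -> str:
    """Build a reply from tool output, or admit that none is available."""
    if tool_output is None or not tool_output.strip():
        # No data: report the failure instead of guessing.
        return (
            "I could not retrieve that information right now, "
            f"so I cannot reliably answer: {question}"
        )
    return f"According to the tool result: {tool_output.strip()}"
```

A guard like this does not stop a language model from hallucinating on its own, but it keeps the failure visible at the point where the final reply is assembled.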

Complete the sentences about how agents should handle tool failures.

When a tool returns a hard ____, the agent should acknowledge the failure and offer ____. When a tool returns empty results, the agent should try ____ the query. Filling in a gap with made-up information is called ____.

An AI agent tries to look up a live sports score and receives HTTP error 429. What does this error mean, and what should the agent do?

Why is hallucination particularly dangerous as a response to a failed tool call?

Failure Recovery Playbook

  1. Step 1: You are the designer of an AI agent that helps students research for school projects. Your agent has a web_search tool, a calculator tool, and a get_current_date tool.
  2. Step 2: Write a Failure Recovery Playbook, a set of instructions for what the agent should do in each of these scenarios:
     A) web_search returns HTTP error 503 (service unavailable)
     B) web_search returns three results, but all three are from obviously unreliable sources
     C) calculator returns an error because the agent passed the input in the wrong format
     D) get_current_date succeeds, but the agent then tries to compute a date difference and gets a nonsensical negative number
  3. Step 3: For each scenario, write one sentence the agent should say to the user honestly describing what went wrong.
  4. Step 4: Which of these failures is hardest to detect automatically? Why?