AI Agents & Automation

⏱ About 15 min · 15 XP

Tool Box Challenge

You have spent the last eight lessons learning what tools are, how agents choose them, what happens when they fail, and how to keep them safe. Now it is time to put that knowledge to work. In this lesson, you will play the role of the agent — reading task descriptions, selecting the right tools, and explaining your reasoning out loud. There are no automatic answers here; the goal is thoughtful decision-making, not speed.

Your Toolbox

For this challenge, imagine you are an AI agent with access to the following six tools:

  1. web_search: Submits a query to a live web index and returns up to five current results with titles, URLs, and text snippets. Use for recent events, current facts, or anything your training data might not cover.
  2. calculator: Evaluates a mathematical expression and returns the exact numeric result. Use for any arithmetic, percentage, or unit conversion that requires precision.
  3. get_current_date: Returns today's date and time in the user's local time zone. Use whenever the current date or time is relevant to the task.
  4. read_file: Given a file path, returns the contents of the file. Reading only; no changes are made. Use when the user wants to work with an existing document.
  5. send_email: Composes and sends an email to one or more recipients. This is a doing tool. Always confirm with the user before calling.
  6. set_calendar_event: Creates a new calendar event with a title, date, time, and optional attendees. This is a doing tool. Always confirm with the user before calling.

Remember the Reading vs. Doing Distinction

web_search, calculator, get_current_date, and read_file are all reading tools — safe to call speculatively. send_email and set_calendar_event are doing tools — they change the world and require confirmation.
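One way to picture the reading vs. doing split is a small lookup table that tags each tool by kind. The sketch below is only an illustration: the names ToolKind, ToolSpec, TOOLBOX, and requires_confirmation are hypothetical and are not part of any real agent framework.

```python
from dataclasses import dataclass
from enum import Enum


class ToolKind(Enum):
    READING = "reading"  # observes only; safe to call speculatively
    DOING = "doing"      # changes the world; needs user confirmation


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical record describing one tool in the toolbox."""
    name: str
    kind: ToolKind


# The six tools from this lesson, tagged by kind.
TOOLBOX = {
    spec.name: spec
    for spec in [
        ToolSpec("web_search", ToolKind.READING),
        ToolSpec("calculator", ToolKind.READING),
        ToolSpec("get_current_date", ToolKind.READING),
        ToolSpec("read_file", ToolKind.READING),
        ToolSpec("send_email", ToolKind.DOING),
        ToolSpec("set_calendar_event", ToolKind.DOING),
    ]
}


def requires_confirmation(tool_name: str) -> bool:
    """Return True when the named tool is a doing tool."""
    return TOOLBOX[tool_name].kind is ToolKind.DOING
```

With a table like this, an agent loop could check requires_confirmation(name) before every call instead of relying on memory of which tools are risky.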

Tool Box Challenge — Part 1: Match the Tool

  1. For each task below, write: (a) which tool you would call FIRST, (b) all the required inputs you would provide, and (c) one sentence explaining why this is the right tool. If you would chain multiple tools, list them in order.
  2. Task 1: The user says: 'I have a report called project_summary.txt on my desktop. Can you read it and tell me the first paragraph?'
  3. Task 2: The user asks: 'What is 18 percent of $340?'
  4. Task 3: The user asks: 'Did anything major happen in the news today?'
  5. Task 4: The user says: 'Schedule a study session for next Monday at 4 PM. Call it Algebra Review.'
  6. Task 5: The user asks: 'How many days until summer break on June 20?'
  7. Task 6: The user asks: 'Send my teacher an email saying I finished the assignment.'

Part 2: Safety Check

Good tool use is not just about picking the right tool — it is about using that tool safely. Before calling any doing tool, a well-designed agent pauses, previews the action, and confirms with the user. In this section, you will practice that confirmation step.
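The pause, preview, confirm pattern can be sketched as a small wrapper around a doing tool. Everything here is an assumption made for illustration: build_preview, call_with_confirmation, and the stand-in fake_set_calendar_event are hypothetical names, and the event details are made up rather than taken from the scenarios in this section.

```python
from typing import Any, Callable, Dict


def build_preview(tool_name: str, args: Dict[str, Any]) -> str:
    """Describe the pending action so the user can make an informed decision."""
    details = "\n".join(f"  {key}: {value}" for key, value in args.items())
    return (
        f"I am about to call {tool_name} with:\n"
        f"{details}\n"
        "Shall I go ahead? (yes/no)"
    )


def call_with_confirmation(
    tool_name: str,
    args: Dict[str, Any],
    tool: Callable[..., Any],
    ask_user: Callable[[str], str],
) -> Dict[str, Any]:
    """Show a preview, then call the tool only if the user answers yes."""
    answer = ask_user(build_preview(tool_name, args)).strip().lower()
    if answer != "yes":
        # Anything other than an explicit yes means the tool is never called.
        return {"status": "cancelled", "result": None}
    return {"status": "done", "result": tool(**args)}


def fake_set_calendar_event(title: str, date: str, time: str) -> str:
    """Stand-in for a real calendar tool; it only returns a description."""
    return f"created '{title}' on {date} at {time}"
```

The design choice worth noticing is the default: silence, a typo, or "maybe" all count as no, so the doing tool runs only after an explicit yes.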

Tool Box Challenge — Part 2: Safety Check

  1. For each scenario below, write the exact confirmation message the agent should display to the user BEFORE calling the doing tool. Be specific — include the key details of what will happen so the user can make an informed decision.
  2. Scenario A: The user asked you to send an email to their teacher. You have drafted the email. It will go to ms_chen@school.edu with subject 'Assignment Complete' and one paragraph of body text.
  3. Scenario B: The user asked you to schedule a dentist appointment reminder. You plan to create a calendar event called 'Dentist Appointment' on Thursday, June 12 at 2:00 PM.
  4. Scenario C: The user asked you to forward last month's report to the entire team. The team email list has 47 members.
  5. After writing the confirmations: which scenario requires the most careful confirmation language, and why?

Part 3: Handle the Failure

Real agents face real failures. In this final section, you will decide how to respond to three tool failures — without hallucinating, without crashing, and without giving up on the user.
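As a rough sketch of what "without hallucinating, without crashing, and without giving up" could look like in code, the wrapper below catches a tool failure, tries one alternative, and always reports honestly what happened. ToolError and run_with_fallback are hypothetical names chosen for this illustration; a real agent would define its own error types and fallback rules.

```python
from typing import Any, Callable, Dict


class ToolError(Exception):
    """Hypothetical error raised by a tool when a call fails."""


def run_with_fallback(
    primary: Callable[[str], Any],
    fallback: Callable[[str], Any],
    query: str,
) -> Dict[str, Any]:
    """Try the primary tool, then one alternative, and never invent data."""
    try:
        return {"ok": True, "source": "primary", "data": primary(query), "note": ""}
    except ToolError as first_error:
        try:
            data = fallback(query)
        except ToolError as second_error:
            # Both attempts failed: say so plainly instead of guessing.
            return {
                "ok": False,
                "source": None,
                "data": None,
                "note": (
                    f"I could not complete this request. The first attempt "
                    f"failed ({first_error}) and so did the alternative "
                    f"({second_error})."
                ),
            }
        # The alternative worked: still tell the user the first tool failed.
        return {
            "ok": True,
            "source": "fallback",
            "data": data,
            "note": f"The first tool failed ({first_error}), so I used an alternative.",
        }
```

The note field carries the honest explanation the agent would pass on to the user, whichever branch was taken.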

Tool Box Challenge — Part 3: Handle the Failure

  1. For each failure scenario, write (a) what happened, (b) what the agent should say to the user honestly, and (c) what alternative action the agent can take next.
  2. Failure 1: The user asks about this week's weather. web_search returns HTTP error 503 (service temporarily unavailable).
  3. Failure 2: The user asks for the population of a very small, newly established town. web_search returns three results — all from personal blogs with conflicting numbers and no official source.
  4. Failure 3: The agent tries to call the calculator with the expression '340 x 18%' and receives an error: 'Invalid expression — use * for multiplication and decimal form for percentages.'
  5. Bonus question: In Failure 3, what should the agent do to fix its mistake and retry? Write the corrected tool call.

In the Tool Box Challenge, the user says: 'Send my teacher an email saying I finished the assignment.' What must the agent do BEFORE calling send_email?

A student in the Tool Box Challenge asks: 'How many days until summer break on June 20?' Suppose today is May 5 in the student's location. Which tools are needed, and in what order?