Tool Use and Computer Use
A language model in isolation is a text transformer: it reads text and produces text. Its knowledge is frozen at its training cutoff; it cannot access real-time information, cannot execute code, cannot send an email, cannot interact with any external system. But frontier AI systems deployed today are not isolated language models. They are language models equipped with tools — functions they can call to act on the world beyond their context window. Tool use is the bridge between language intelligence and real-world action, and it is one of the capabilities that most dramatically expands what AI can accomplish.
Function Calling and the Tool Use Paradigm
The mechanism underlying most tool use is called function calling. A developer provides the model with a list of available functions — their names, parameters, and descriptions — and the model can decide to call one as part of its response. Instead of (or in addition to) generating text, the model outputs a structured function call: the function name and the arguments it wants to pass. The application receives this, executes the function in the real world, and returns the result to the model, which then continues generating. For example, a customer service AI might have access to a function get_order_status(order_id: str) that queries the company's order database. When a customer asks 'Where is my order #77432?', the model does not guess — it generates a function call get_order_status('77432'), receives the real-time status from the database, and then generates a natural-language response based on the actual data. This architecture separates what the model knows from what the model can access. The model's parametric knowledge (baked into its weights during training) may be outdated, incomplete, or uncertain. But through tools, it can access authoritative, real-time information and perform authoritative, real-world actions.
A useful mental model: the language model is the brain, and tools are its hands. The brain can reason, plan, and decide — but without hands, it can only speak. Tools give the model agency in the world. The design of which tools a model has access to, and with what permissions, is one of the most consequential decisions in building an AI system.
The Tool Ecosystem
The range of tools frontier models can use is extensive and growing. Web search allows the model to access current information not in its training data — a critical capability given that training data has a cutoff date. Code execution allows the model to write code and run it, receiving the output — enabling precise numerical computation, data analysis, and automated testing of its own outputs. File system access allows reading and writing files, enabling document processing workflows. API integrations connect the model to external services: calendars, email, databases, payment processors, IoT sensors, enterprise software. Database query tools let the model retrieve precisely the data it needs from structured sources. Image generation tools let a language model produce images in response to a prompt, extending its output beyond text. The Model Context Protocol (MCP), introduced by Anthropic in late 2024, is an emerging standard for how AI models connect to tools and data sources in a structured, interoperable way — analogous to how USB standardized device connection for computers. MCP allows developers to build tool servers that any compatible model can access, accelerating the growth of the tool ecosystem.
Flashcards — click each card to reveal the answer
Computer Use: A Step Further
Function calling requires developers to pre-define the tools a model can use. Computer use is a more general capability: the model can perceive the visual state of a computer screen as an image, and take actions — mouse clicks, keyboard input, scrolling — on that screen, just as a human operator would. There is no need to pre-define a function for every possible application; the model simply operates the computer as a human would. Anthropic released a computer use API for Claude in October 2024, enabling this capability. The model receives screenshots of the current screen, reasons about what it sees, and outputs actions: click at coordinate (450, 320), type 'monthly_report.pdf', press Enter. It can operate web browsers, word processors, spreadsheets, coding environments — any software visible on a screen. This capability is extremely powerful and extremely risky in combination. A model that can operate any software can accomplish nearly any digital task — but it can also cause serious harm if it misunderstands instructions, encounters adversarial content, or is given inappropriate permissions. The current state of computer use requires careful human oversight: the model is capable but not yet reliable enough for fully autonomous deployment across arbitrary tasks.
Match each tool use scenario to the primary mechanism or concept it illustrates.
Terms
Definitions
Drag terms onto their definitions, or click a term then click a definition to match.
Every tool granted to an AI model is a potential vector for harm — from errors, misunderstandings, or adversarial manipulation. A model with write access to a production database and send access to an email system, operating autonomously, can cause severe damage from a single misunderstanding. Tool permission design should be treated as a safety decision, with the principle of least privilege applied rigorously.
A language model is asked 'What is the current price of AAPL stock?' It has no web search or API tools available. What is the most accurate prediction of its response quality?
What is the key difference between function calling and computer use as tool-use paradigms?
Design a Tool-Equipped AI System
- Design a tool-equipped AI system for a specific domain of your choice.
- Step 1: Choose a domain — healthcare information assistant, e-commerce customer service, academic research assistant, personal finance advisor, or another domain you find interesting.
- Step 2: List five specific tools your system would have access to. For each tool, specify: the function name, the input parameters, what it returns, and why it is needed for your use case.
- Step 3: For each tool, classify its risk level (low, medium, high) based on what damage could result from the tool being called incorrectly or maliciously. Justify each classification.
- Step 4: Apply the principle of least privilege. For any tool classified as high-risk, propose a mitigation: could you restrict the tool's permissions, require human approval, log all calls, or limit the scope of what the tool can do?
- Step 5: Identify one prompt injection scenario specific to your domain — how might adversarial content in your domain trick the model into calling the wrong tool with the wrong parameters?
- Present your tool design and risk analysis to the class.