Frontier & Future AI


Agents and Long-Horizon Tasks

Most interactions with AI systems follow a simple pattern: you give a prompt, the model responds, and the interaction ends. The model produces a fixed output in one step. This is powerful but limited. A model that can only respond to a single prompt cannot book your flight while also checking your calendar, emailing your hotel, and updating a shared spreadsheet, at least not without you manually carrying each output into the next step.

AI agents break this pattern. An agent is an AI system that perceives its environment, makes decisions, and takes sequences of actions over time in pursuit of a goal. Rather than producing one output and stopping, an agent plans, acts, observes what happened, and decides what to do next. This capacity for sequential, goal-directed behavior (agency) is what makes frontier AI systems capable of long-horizon tasks: jobs that unfold over many steps and can run for minutes or even hours.

The Agent Loop

An AI agent operates in a loop: Perceive, Plan, Act, Observe, repeat. In concrete terms:

Perceive: the agent receives input, such as a user instruction, the current state of a web page, the output of a tool call, or an error message from a code execution environment. This input updates its context window.

Plan: the agent uses its language model to decide what action to take next. For complex tasks, this planning may involve decomposing the goal into subgoals, considering alternatives, and prioritizing steps.

Act: the agent executes an action by calling a tool: sending an HTTP request, running code, querying a database, clicking a button on a web page, or writing a file.

Observe: the result of the action (a web page's content, the output of the code, the response from an API) is returned to the agent and added to its context.

The cycle then repeats. This loop can iterate dozens or hundreds of times for a single high-level task. The agent that books your flight might browse three airline sites, parse fare tables, check your calendar for conflicts, fill out a booking form, and send a confirmation email, with each step producing output that informs the next.
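The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular framework implements it: the tools `search_flights` and `check_calendar`, the `scripted_planner` function (a stand-in for a real language model call), and the example arguments are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical tools: in a real agent these would call external services.
def search_flights(query: str) -> str:
    return f"3 fares found for {query}"

def check_calendar(date: str) -> str:
    return f"no conflicts on {date}"

TOOLS: dict[str, Callable[[str], str]] = {
    "search_flights": search_flights,
    "check_calendar": check_calendar,
}

def scripted_planner(context: list[str]) -> Optional[tuple[str, str]]:
    """Stand-in for the language model (assumption): returns the next
    (tool_name, argument) pair, or None when the task is finished."""
    steps = [("search_flights", "NYC to SFO"), ("check_calendar", "2025-03-01")]
    taken = sum(1 for entry in context if entry.startswith("action:"))
    return steps[taken] if taken < len(steps) else None

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    context = [f"goal: {goal}"]                 # Perceive: the initial instruction
    for _ in range(max_steps):                  # cap the loop so it cannot run forever
        decision = scripted_planner(context)    # Plan: choose the next action
        if decision is None:
            break
        tool_name, argument = decision
        result = TOOLS[tool_name](argument)     # Act: call the chosen tool
        context.append(f"action: {tool_name}({argument})")
        context.append(f"observation: {result}")  # Observe: result joins the context
    return context

if __name__ == "__main__":
    for entry in run_agent("book a flight"):
        print(entry)
```

Each observation is appended to `context`, so the planner sees the full history on the next iteration. The `max_steps` cap is a design choice in this sketch to keep the loop bounded.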

The Context Window as Working Memory

An agent's context window — the sequence of text it can attend to — functions as its working memory. As the agent acts and observes, the accumulating history of actions, observations, and plans fills this window. Long tasks can exceed the context limit, requiring the agent to summarize or compress earlier history. Managing context across long-horizon tasks is one of the key engineering challenges in building reliable agents.
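One simple way to picture context management is a function that replaces older history with a summary once a token budget is exceeded. The sketch below rests on loud assumptions: `estimate_tokens` counts whitespace-separated words rather than real tokens, and `summarize` is a placeholder for what would normally be a model-generated summary. Neither reflects a specific framework's API.

```python
def estimate_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())

def summarize(entries: list[str]) -> str:
    """Placeholder for a model-written summary; it only records a count."""
    return f"[summary of {len(entries)} earlier entries]"

def compress_context(history: list[str], budget: int, keep_recent: int = 4) -> list[str]:
    """Return history unchanged if it fits the budget; otherwise replace
    everything except the most recent entries with a single summary line."""
    total = sum(estimate_tokens(entry) for entry in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent

if __name__ == "__main__":
    history = [f"observation {i}: " + "word " * 20 for i in range(10)]
    for entry in compress_context(history, budget=100):
        print(entry[:40])
```

Keeping the most recent entries verbatim is one possible policy. A real agent would likely also pin the original goal so that it is never summarized away.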

What Long-Horizon Tasks Look Like

Research and synthesis: an agent is given a question, such as 'What are the five largest lithium producers by output in 2023, and what were their year-over-year changes?', and autonomously searches multiple databases, cross-references sources for accuracy, resolves contradictions, and produces a cited report.

Software engineering: an agent is given a GitHub repository and a bug report. It reads relevant files, identifies the root cause, writes a fix, runs the test suite, iterates until tests pass, and opens a pull request.

Personal task completion: an agent manages a complex multi-step workflow (researching vendors, drafting emails, scheduling meetings, updating a shared project management tool) based on a single high-level instruction.

Scientific experimentation: an agent in a computational biology lab reads a hypothesis, identifies relevant datasets, writes analysis code, runs it, interprets results, and suggests follow-up experiments, in an autonomous loop that might run overnight.

These are not hypothetical. Capabilities of this kind have been demonstrated, with varying reliability, by current agent systems, including Anthropic's computer use capability for Claude, OpenAI's Operator, and open-source frameworks like AutoGPT and LangChain agents.

Match each agent component to its role in the agent loop.

Terms

Perceive
Plan
Act
Observe
Context window management

Definitions

Deciding to search for vendor prices before drafting an email, based on the current goal
Reading the error message returned after a failed API call and updating the plan accordingly
Receiving a web page's text content as the result of a browsing action
Summarizing earlier conversation turns when approaching the token limit in a long task
Executing a Python script in a sandboxed code interpreter


Long-horizon agentic capability introduces risks that single-turn models largely avoid. An error early in the task can compound: if the agent misidentifies a user's goal in step one, it may execute dozens of actions in the wrong direction before the user notices. Actions may be irreversible: a deleted file, a sent email, a financial transaction, or a deployed update cannot always be undone. The agent may also encounter adversarial content, such as a malicious web page containing text designed to hijack the agent's behavior, a technique called prompt injection. The field is actively developing mitigations: human approval checkpoints for irreversible actions, sandboxed execution environments, careful scope-limiting of agent permissions, and monitoring systems that flag anomalous agent behavior. But agent safety remains one of the most active research areas in frontier AI, and it is far from solved.
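Two of the mitigations above, scope-limited permissions and human approval checkpoints, can be combined in a small gate placed in front of every tool call. The following is a sketch under stated assumptions: the action names, the `ALLOWED` and `IRREVERSIBLE` sets, and the `approve` callback are hypothetical and stand in for whatever policy and review mechanism a real deployment would use.

```python
from typing import Callable

# Hypothetical policy: which actions the agent may use at all,
# and which of those need a human to sign off first.
ALLOWED = {"read_file", "search_web", "send_email"}
IRREVERSIBLE = {"send_email", "delete_file", "make_payment"}

def guarded_execute(
    action: str,
    argument: str,
    tools: dict[str, Callable[[str], str]],
    approve: Callable[[str, str], bool],
) -> str:
    """Run a tool only if it is in scope and, when irreversible, approved."""
    if action not in ALLOWED:
        return f"blocked: {action} is outside the agent's permissions"
    if action in IRREVERSIBLE and not approve(action, argument):
        return f"declined: a human rejected {action}"
    return tools[action](argument)

if __name__ == "__main__":
    tools = {
        "read_file": lambda path: f"contents of {path}",
        "send_email": lambda to: f"email sent to {to}",
        "delete_file": lambda path: f"deleted {path}",
    }
    always_reject = lambda action, argument: False  # stand-in for a human reviewer
    print(guarded_execute("read_file", "notes.txt", tools, always_reject))
    print(guarded_execute("send_email", "hotel@example.com", tools, always_reject))
    print(guarded_execute("delete_file", "notes.txt", tools, always_reject))
```

The gate returns a message instead of raising an exception so that the refusal becomes an observation the agent can plan around. That is one possible design, not the only reasonable one.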

Error Compounding and Irreversibility

An agent acting autonomously over many steps can cause harm that would not result from any single step in isolation. A mistake in understanding the user's intent, multiplied across fifty autonomous actions, can produce an outcome far from what was intended — and some of those actions may be impossible to reverse. The degree of autonomy granted to an agent should be proportional to the reliability with which its behavior can be verified.
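A rough back-of-the-envelope calculation shows why errors compound. It assumes, purely for illustration and not as a claim from this lesson, that every step succeeds independently with the same probability. Under that assumption, the chance that all steps succeed is the per-step reliability raised to the number of steps.

```python
def task_success_probability(step_reliability: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps
    with identical reliability (a simplifying assumption)."""
    return step_reliability ** steps

if __name__ == "__main__":
    for reliability in (0.99, 0.95):
        p = task_success_probability(reliability, steps=50)
        print(f"per-step {reliability:.2f} over 50 steps: {p:.3f}")
```

Under these assumptions, an agent that is 99% reliable per step completes a 50-step task without any error only about 60% of the time. Real agents can sometimes notice and recover from mistakes, so this is a simplified model rather than a prediction.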

Flashcards

An AI agent is asked to book the cheapest available flight to a given destination. At step 12 of its 50-step task, it misreads a price table and identifies a more expensive flight as cheapest. What is the most concerning property of this scenario?

What distinguishes a prompt injection attack from a standard prompt input to an agent?

Design an Agent for a Real Task

Design a hypothetical AI agent system for a task of your choosing.

  1. Choose a long-horizon task: something that requires at least 10 distinct steps. Examples: planning and booking a team trip, conducting a literature review, managing a small e-commerce inventory.
  2. Write out the complete agent loop for your task: list at least 10 Perceive-Plan-Act-Observe cycles in sequence, specifying what the agent perceives, what it plans, what action it takes, and what it observes.
  3. Identify the three highest-risk steps, where an error would be most damaging or irreversible, and propose a specific safeguard for each (e.g., human approval checkpoint, sandboxing, confirmation email).
  4. Identify one point where a prompt injection attack could occur, where external content the agent reads might contain malicious instructions. How would you defend against it?
  5. Present your design to the class and evaluate each other's risk assessments.