AI Agents & Automation

⏱ About 15 min · 15 XP

Think: Deciding What to Do

After an agent perceives its environment, it faces the hardest part of the loop: figuring out what to do next. Perception gives it raw material — words, numbers, images, sensor readings. But raw material does not come with instructions attached. The agent must reason: given what I now know, what action will best advance my goal? That process of reasoning is what we call the think stage.

What Reasoning Looks Like Inside an Agent

For a language-based agent, reasoning happens inside a large language model. The agent's context window contains the goal, the conversation history, the results of previous actions, and the information gathered during the perceive stage. The language model processes all of this together and generates a response, which might be a plan, a decision, a tool call, or a direct answer. The language model does not have a separate reasoning module that you can point to and say 'there it is.' Instead, reasoning emerges from the pattern-matching and prediction process the model acquired during training. When a model reasons well, it is drawing on patterns learned from a very large body of human-written text, including many examples of people working through similar problems.

Reasoning in Language Agents

In a language-based agent, the think stage is handled by the language model itself. It takes everything in the context window and generates a response that reflects its best judgment about what to do next.
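To make the idea of "everything in the context window" concrete, here is a minimal sketch of how an agent might bundle its four inputs into a single prompt. The AgentContext class, its field names, and the section titles are all hypothetical and chosen for illustration; real agent frameworks structure this differently.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Everything the model sees during the think stage (hypothetical structure)."""
    goal: str
    history: list[str] = field(default_factory=list)         # conversation so far
    action_results: list[str] = field(default_factory=list)  # outputs of earlier actions
    observations: list[str] = field(default_factory=list)    # gathered in the perceive stage

    def to_prompt(self) -> str:
        """Flatten all four parts into one block of text for the model."""
        sections = [
            ("GOAL", [self.goal]),
            ("CONVERSATION HISTORY", self.history),
            ("RESULTS OF PREVIOUS ACTIONS", self.action_results),
            ("NEW OBSERVATIONS", self.observations),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"## {title}")
            # Mark empty sections explicitly so the model can see nothing is there.
            lines.extend(f"- {item}" for item in items or ["(none)"])
        return "\n".join(lines)


context = AgentContext(
    goal="Report the population of the capital of Brazil.",
    history=["User: What is the population of Brazil's capital?"],
    observations=["No search has been run yet."],
)
prompt = context.to_prompt()
print(prompt)
```

The point of the sketch is that the model receives one combined input. It has no separate channel for the goal or for past results, so whatever is left out of this text is invisible to the think stage.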

Thinking in Steps

A well-established finding in AI research is that language models often reason more accurately on multi-step problems when they think in steps rather than jumping directly to an answer. Instead of perceiving a problem and immediately outputting an action, the agent first works through intermediate reasoning steps, much like jotting down scratch work on paper before writing a final answer on a test. This is called step-by-step reasoning or chain-of-thought reasoning. A well-designed agent might think: 'The user wants to know the population of the capital of Brazil. First I should identify the capital of Brazil, which is Brasilia. Then I should look up Brasilia's population.' Only after working through those steps does it decide what action to take: in this case, running a search. Step-by-step thinking reduces errors because each intermediate conclusion is written out and can be checked. If the first step is wrong, the agent may catch it before it propagates into a bad action.

Chain-of-Thought Reasoning

Chain-of-thought reasoning is a technique where an agent writes out its intermediate reasoning steps before committing to an action. Breaking a problem into steps often improves accuracy on complex, multi-step tasks.
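One way to picture chain-of-thought in an agent is as a reply that lists numbered reasoning steps and ends with a single action line. The sketch below parses such a reply. The "ACTION:" convention, the parse_reasoning function, and the hard-coded reply are assumptions made for illustration; a real agent would receive the reply from a language model call.

```python
def parse_reasoning(reply: str) -> tuple[list[str], str]:
    """Split a model reply into its reasoning steps and the final action line."""
    steps: list[str] = []
    action = ""
    for line in reply.strip().splitlines():
        line = line.strip()
        if line.startswith("ACTION:"):
            # Everything after the marker is the action the agent commits to.
            action = line[len("ACTION:"):].strip()
        elif line:
            steps.append(line)
    return steps, action


# Hard-coded stand-in for a model reply (illustrative only).
reply = """
1. The user wants the population of the capital of Brazil.
2. The capital of Brazil is Brasilia.
3. I need Brasilia's current population, so I should look it up.
ACTION: search("population of Brasilia")
"""

steps, action = parse_reasoning(reply)
for step in steps:
    print("thought:", step)
print("action: ", action)
```

Because the steps are kept separate from the action, each one can be logged or reviewed. If step 2 had named the wrong city, the mistake would be visible before the search ran.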

Choosing Among Actions

At the end of the think stage, the agent must pick its next action from the set of actions available to it. Agents are given a toolkit: a list of things they are allowed to do. A research agent might have tools for web search, reading a document, writing a file, and sending an email. A robot might have tools for moving forward, turning, picking up, and placing. The agent's reasoning determines which tool to use, with what arguments, and in what order. A poorly designed think stage leads to choosing the wrong tool, using it with the wrong inputs, or taking actions in the wrong sequence. A well-designed think stage produces a clear, well-justified choice that advances the goal.
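The toolkit idea can be sketched as a lookup table from tool names to functions, with the think stage producing a decision that names one tool and its arguments. Everything here is hypothetical: the tool names, the stub functions, and the decision format are assumptions for illustration, not the interface of any particular framework.

```python
from typing import Callable


def web_search(query: str) -> str:
    """Stub tool: a real version would call a search service."""
    return f"[stub] search results for: {query}"


def read_document(path: str) -> str:
    """Stub tool: a real version would open and read the file."""
    return f"[stub] contents of {path}"


# The toolkit: the only actions this agent is allowed to take.
TOOLS: dict[str, Callable[..., str]] = {
    "web_search": web_search,
    "read_document": read_document,
}


def run_decision(decision: dict) -> str:
    """Validate the think stage's choice, then invoke the chosen tool."""
    name = decision.get("tool")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    try:
        return TOOLS[name](**decision.get("arguments", {}))
    except TypeError as exc:
        # Wrong or missing argument names surface here instead of running silently.
        raise ValueError(f"bad arguments for {name}: {exc}") from exc


decision = {"tool": "web_search", "arguments": {"query": "population of Brasilia"}}
result = run_decision(decision)
print(result)
```

The two checks in run_decision correspond to two of the failure modes described above: choosing a tool that does not exist and calling a tool with the wrong inputs. Choosing a valid but unhelpful tool is a reasoning error that this kind of validation cannot catch.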

Match each reasoning concept to its correct description.

Terms

Chain-of-thought reasoning
Context window
Tool selection
Goal
Inference

Definitions

Running a trained model to produce a reasoning output without further learning
The full set of information available to the agent during the think stage
Working through intermediate steps before selecting a final action
The objective that guides the agent's reasoning and constrains what counts as a good action
Choosing which available capability to invoke based on the current goal


The Risk of Overconfident Thinking

Language models can be overconfident. They may produce a reasoning chain that sounds logical and authoritative yet contains a factual error or a flaw in the logic. When the output is plausible-sounding content that is factually wrong, this is often called hallucination. This is why the observe stage matters so much. When an agent acts on flawed reasoning, the observe stage gives it a chance to detect that the action failed or produced an unexpected result. Systems that skip observation and trust their reasoning blindly are far more likely to cause harm. The antidote to overconfident thinking is not more thinking alone; it is more checking.

Hallucination

A hallucination occurs when a language model generates text that sounds confident and plausible but is factually incorrect. Agents that act on hallucinated reasoning without verification can cause real-world mistakes.
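A very simple form of checking is to compare what the agent's reasoning claimed against what a lookup actually returned. The sketch below does this with plain keyword matching, which is a deliberately crude stand-in for real verification. The Observation class, the observe function, and the hard-coded lookup text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    ok: bool      # did the result agree with what the reasoning predicted?
    detail: str   # short explanation the agent can reason about next turn


def observe(expected_keyword: str, action_result: str) -> Observation:
    """Check an action's result against what the reasoning predicted."""
    if not action_result.strip():
        return Observation(False, "action returned nothing")
    if expected_keyword.lower() not in action_result.lower():
        return Observation(False, f"result never mentions {expected_keyword!r}")
    return Observation(True, "result matches the expectation")


# The think stage confidently claimed this (a hallucinated fact).
claimed_capital = "Sydney"

# Hard-coded stand-in for the text a verification lookup returned.
lookup_result = "Canberra is the capital city of Australia."

obs = observe(claimed_capital, lookup_result)
if not obs.ok:
    print("Reasoning needs revisiting:", obs.detail)
```

The check does not make the reasoning any smarter. It only gives the agent a signal that its claim and the evidence disagree, which is the cue to go back and think again before acting further.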

What is chain-of-thought reasoning?

An agent confidently states that the capital of Australia is Sydney and searches for Sydney's weather instead of Canberra's. What problem does this illustrate?

Be the Thinking Agent

  1. You receive this user request: 'Find me a recipe for a dessert I can make in under 30 minutes that uses only ingredients I already have at home.'
  2. Write out a chain-of-thought reasoning sequence for an agent trying to answer this. What does the agent need to figure out first? Second? Third? Write at least four reasoning steps.
  3. Identify which tool the agent should use at the end of its reasoning (web search, memory lookup, direct answer, or something else) and explain why.
  4. What could go wrong in the think stage here? List two possible reasoning errors.
  5. How would the observe stage help catch those errors after the agent acts?