ReAct Prompting

Also known as: Reason+Act prompting, ReAct agent loop, thought-action-observation prompting

ReAct Prompting is a framework where a language model interleaves explicit reasoning traces with external tool calls, cycling through Thought, Action, and Observation steps to solve multi-step tasks that require live information or computation.


What It Is

ReAct prompting exists because a language model on its own can reason but cannot act. It can think through a problem step by step, but it has no way to check a live price, run a query, or call an API mid-answer. ReAct fixes that by giving the model a structured format that alternates between thinking and doing — making the Thought–Action–Observation loop the core mechanism behind modern AI agents.

The name stands for “Reasoning + Acting.” The approach was introduced in a 2022 research paper, “ReAct: Synergizing Reasoning and Acting in Language Models” (Yao et al.), which reported that language models solve several benchmark tasks more accurately when they interleave reasoning traces with tool calls. A model prompted with the ReAct format does not just produce a final answer; it generates a sequence of alternating steps:

  • Thought: The model reasons about what to do next. (“I need to find the current exchange rate to convert this amount correctly.”)
  • Action: The model calls a tool — a search engine, a calculator, an API, or a code interpreter. (“search: current USD to EUR exchange rate”)
  • Observation: The tool’s result is fed back into the context. (“1 USD = 0.92 EUR as of today”)

The loop repeats until the model reaches a final answer, or a maximum step count is hit. Think of it like a developer following a whiteboard runbook: write the next step, execute it, read the output, write the next step — rather than trying to hold everything in memory at once.
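
The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production agent: `scripted_model` is a stand-in that replays canned steps instead of calling a real language model, `lookup_rate` and `multiply` are hypothetical tools, the 0.92 rate is the illustrative figure from the example above, and the `Action: tool_name[input]` and `Final Answer:` line formats are one common convention rather than a fixed standard.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Assumed text conventions: "Action: tool_name[input]" and "Final Answer: ...".
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)
FINAL_RE = re.compile(r"^Final Answer:\s*(.*)$", re.MULTILINE)
OBSERVATION_RE = re.compile(r"^Observation:\s*(.*)$", re.MULTILINE)


def react_loop(
    question: str,
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], str]:
    """Run Thought/Action/Observation rounds until a final answer or max_steps."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)              # Thought + Action, or a Final Answer
        transcript += step + "\n"
        final = FINAL_RE.search(step)
        if final:
            return final.group(1).strip(), transcript
        action = ACTION_RE.search(step)
        if action is None:
            observation = "error: no valid Action line found"
        elif action.group(1) not in tools:
            observation = f"error: unknown tool '{action.group(1)}'"
        else:
            observation = tools[action.group(1)](action.group(2))
        transcript += f"Observation: {observation}\n"   # fed back into context
    return None, transcript                   # step budget exhausted


def scripted_model(transcript: str) -> str:
    """Stand-in for a real LLM call: picks the next step from what it has seen."""
    seen = OBSERVATION_RE.findall(transcript)
    if len(seen) == 0:
        return "Thought: I need the current USD to EUR rate.\nAction: lookup_rate[USD/EUR]"
    if len(seen) == 1:
        return f"Thought: Multiply 100 by the rate {seen[0]}.\nAction: multiply[100, {seen[0]}]"
    return f"Thought: I have the converted amount.\nFinal Answer: 100 USD is about {seen[-1]} EUR."


def multiply(arg: str) -> str:
    a, b = (float(part) for part in arg.split(","))
    return str(round(a * b, 2))


TOOLS = {
    "lookup_rate": lambda pair: "0.92",   # hypothetical stub, not a live lookup
    "multiply": multiply,
}

answer, transcript = react_loop("How much is 100 USD in EUR?", scripted_model, TOOLS)
print(transcript)
```

Swapping `scripted_model` for a function that sends the transcript to a real model is the only change needed to turn the sketch into a working agent loop; the `max_steps` budget is what keeps a confused model from looping forever.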

This structure separates ReAct from two simpler approaches. Pure chain-of-thought prompting generates a reasoning trace but never calls any external tool — everything stays inside the context window. Pure action-based execution calls tools in a fixed sequence but does not let the model reason between steps about which action to take next. ReAct combines both: the model’s reasoning determines which action to take, and the action’s result shapes the next round of reasoning.

The result is a framework in which the model can solve problems that depend on information it does not have at generation time: current prices, live documentation, database records, or the output of code it just ran. In the related article on how ReAct prompting gives LLMs the ability to act, this three-phase Thought–Action–Observation cycle is not scaffolding around the main idea; it is the main idea. Every capability that makes an AI agent useful in practice (searching, checking, computing, filing) runs through this loop.

How It’s Used in Practice

The most common place people encounter ReAct is inside AI assistant tools and coding agents. When you ask an AI assistant “What’s the latest release of framework X and what changed since the previous version?”, the model does not guess from training data. A ReAct-based agent thinks through what it needs, calls a web search or documentation API, reads the result, and then composes its answer from actual retrieved content — not from stale memory.

Agent frameworks such as LangChain, LlamaIndex, and PydanticAI provide ReAct-style agent loops. These libraries handle the loop plumbing (parsing the model’s Action output, dispatching the right tool, feeding the Observation back into context), so developers mainly need to define which tools the agent can call and when to stop.
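
As a rough picture of what that plumbing involves, here is a sketch of a tool registry and dispatcher. All names (`ToolRegistry`, `register`, `describe`, `dispatch`, and the two example tools) are hypothetical and do not correspond to the API of LangChain, LlamaIndex, or PydanticAI. The point is only that unknown tools and tool failures come back to the model as Observations instead of crashing the loop.

```python
from typing import Callable, Dict, Tuple

ToolFn = Callable[[str], str]


class ToolRegistry:
    """Maps tool names to functions and turns every call into an Observation string."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[str, ToolFn]] = {}

    def register(self, name: str, description: str) -> Callable[[ToolFn], ToolFn]:
        """Decorator that adds a function to the set of callable tools."""
        def decorator(fn: ToolFn) -> ToolFn:
            self._tools[name] = (description, fn)
            return fn
        return decorator

    def describe(self) -> str:
        """One line per tool, suitable for pasting into a system prompt."""
        return "\n".join(
            f"- {name}: {desc}" for name, (desc, _) in sorted(self._tools.items())
        )

    def dispatch(self, name: str, arg: str) -> str:
        """Run a tool; report problems as text so the model can react to them."""
        if name not in self._tools:
            return f"error: unknown tool '{name}'"
        _, fn = self._tools[name]
        try:
            return fn(arg)
        except Exception as exc:  # deliberately broad: any failure becomes an Observation
            return f"error: {type(exc).__name__}: {exc}"


registry = ToolRegistry()


@registry.register("word_count", "Count the words in the input text.")
def word_count(text: str) -> str:
    return str(len(text.split()))


@registry.register("divide", "Divide two comma-separated numbers, e.g. '6, 3'.")
def divide(arg: str) -> str:
    a, b = (float(part) for part in arg.split(","))
    return str(a / b)


print(registry.describe())
print(registry.dispatch("word_count", "reason then act"))   # a normal call
print(registry.dispatch("divide", "1, 0"))                  # failure reported as text
print(registry.dispatch("send_email", "hi"))                # tool that was never defined
```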

Pro Tip: The Thought step is doing more work than it appears. If the agent makes bad decisions, read the thoughts first — they show exactly where the reasoning went wrong and which tool call followed from a false assumption. Thought traces are not verbose output; they are a built-in debugger.
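
One way to act on this tip is to split a saved transcript into steps so that each Thought sits next to the Action it produced. The sketch below assumes the plain `Thought:` / `Action:` / `Observation:` line labels used in this article. The transcript, tool names, and numbers are invented for illustration, and `split_steps` is a hypothetical helper, not part of any framework.

```python
from typing import Dict, List, NamedTuple, Optional


class Step(NamedTuple):
    thought: str
    action: Optional[str]
    observation: Optional[str]


def split_steps(transcript: str) -> List[Step]:
    """Group transcript lines into one Step per Thought."""
    steps: List[Step] = []
    current: Optional[Dict[str, Optional[str]]] = None
    for line in transcript.splitlines():
        label, sep, text = line.strip().partition(":")
        if not sep:
            continue                      # not a labelled line
        text = text.strip()
        if label == "Thought":
            if current is not None:
                steps.append(Step(**current))
            current = {"thought": text, "action": None, "observation": None}
        elif label in ("Action", "Observation") and current is not None:
            current[label.lower()] = text
    if current is not None:
        steps.append(Step(**current))
    return steps


# Invented transcript: the first Thought makes a wrong assumption about the currency.
TRANSCRIPT = """\
Question: How much is 100 USD in EUR?
Thought: The amount is probably in Canadian dollars, so I need the CAD to EUR rate.
Action: lookup_rate[CAD/EUR]
Observation: 0.68
Thought: Multiply 100 by 0.68.
Action: multiply[100, 0.68]
Observation: 68.0
"""

steps = split_steps(TRANSCRIPT)
for number, step in enumerate(steps, start=1):
    print(f"{number}. thought: {step.thought}")
    print(f"   action:  {step.action}")
```

Reading step 1 exposes the wrong currency assumption before any tool ran; the later arithmetic is correct but built on that assumption.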

When to Use / When Not

Scenario                                                            | Use | Avoid
Multi-step task needing live information (prices, docs, APIs)       | Yes |
Simple one-turn Q&A from known training data                        |     | Yes
Coding agent that needs to read files, run tests, check errors      | Yes |
Summarizing a document already in the context window                |     | Yes
Tasks where tool call results change what to do next (dynamic path) | Yes |
Tasks with a fixed, predictable sequence of steps                   |     | Yes

Common Misconception

Myth: “ReAct” means the model is reacting to user input — it’s about responsiveness or streaming behavior.

Reality: The name is short for “Reasoning + Acting.” ReAct describes a loop structure, not a speed or UI pattern. A batch job that takes several minutes and calls multiple tools in sequence is a ReAct agent. A fast streaming response that never calls a tool is not.

One Sentence to Remember

ReAct Prompting gives a language model a structured reason to pause, call a tool, and think again — making the difference between a model that answers from memory and one that can actually check.

FAQ

Q: Is ReAct prompting the same as chain-of-thought prompting?

A: No. Chain-of-thought generates reasoning steps but takes no external actions — everything happens inside the context window. ReAct extends chain-of-thought by adding tool calls between reasoning steps, so the model can retrieve live information instead of reasoning only over what it already knows.

Q: Do I need special model training to use ReAct prompting?

A: Not for basic use. Most capable language models follow the Thought/Action/Observation format when you describe it in the system prompt. Models fine-tuned for tool use tend to follow the format more reliably, but the pattern also works with general-purpose models through prompting alone.
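
A prompt-only setup might look like the sketch below, which builds a system prompt that describes the format and lists the available tools. The wording of the prompt, the `tool_name[input]` syntax, the `Final Answer:` marker, and the helper name `build_system_prompt` are all assumptions chosen for illustration; real deployments tune this text for their model.

```python
from typing import Dict


def build_system_prompt(tools: Dict[str, str]) -> str:
    """Describe the Thought/Action/Observation format and the allowed tools."""
    tool_lines = "\n".join(
        f"- {name}: {description}" for name, description in tools.items()
    )
    return (
        "Solve the task by alternating between these steps:\n"
        "Thought: reason about what to do next.\n"
        "Action: call exactly one tool, written as tool_name[input].\n"
        "Observation: the tool result. The system writes this line, not you.\n"
        "\n"
        "You may only call these tools:\n"
        f"{tool_lines}\n"
        "\n"
        "When you know the answer, stop calling tools and write:\n"
        "Final Answer: <your answer>"
    )


prompt = build_system_prompt({
    "search": "Look up current information on the web.",
    "calculator": "Evaluate an arithmetic expression.",
})
print(prompt)
```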

Q: What is the difference between a ReAct agent and prompt chaining?

A: Prompt chaining passes results from one prompt to the next in a fixed sequence. A ReAct agent is self-directing: the model’s own reasoning decides which tool to call and whether the loop is finished — the path is dynamic, not predetermined.
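
The difference in control flow can be sketched as follows. `run_chain` always executes the same steps in the same order, while `run_agent` asks a policy what to do after every observation. Here `retry_policy` is a hand-written stand-in for the model’s reasoning and `search` is a stub over a tiny in-memory dictionary; both are assumptions made to keep the example runnable without an LLM.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[str], str]
History = List[Tuple[str, str, str]]            # (tool name, input, observation)
Policy = Callable[[str, History], Optional[Tuple[str, str]]]


def run_chain(steps: List[Tool], text: str) -> str:
    """Prompt-chaining style: a fixed sequence, decided before anything runs."""
    for step in steps:
        text = step(text)
    return text


def run_agent(policy: Policy, tools: Dict[str, Tool], task: str, max_steps: int = 5) -> History:
    """ReAct style: the next action depends on the observations so far."""
    history: History = []
    for _ in range(max_steps):
        decision = policy(task, history)        # stands in for the Thought step
        if decision is None:                    # the policy decides it is done
            break
        name, arg = decision
        history.append((name, arg, tools[name](arg)))
    return history


DOCS = {"ReAct": "Reasoning + Acting"}          # stub knowledge base


def search(query: str) -> str:
    return DOCS.get(query, "no results")


def retry_policy(task: str, history: History) -> Optional[Tuple[str, str]]:
    """Search for the task; if that fails, retry once with only its first word."""
    first_word = task.split()[0]
    if not history:
        return ("search", task)
    _, last_arg, last_observation = history[-1]
    if last_observation == "no results" and last_arg != first_word:
        return ("search", first_word)
    return None


print(run_chain([str.strip, str.lower], "  ReAct  "))
print(run_agent(retry_policy, {"search": search}, "ReAct prompting"))
print(run_agent(retry_policy, {"search": search}, "ReAct"))
```

The two `run_agent` calls take a different number of steps because the first search fails in one case and succeeds in the other, which is the kind of branching a fixed chain cannot express.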

Expert Takes

The Thought–Action–Observation structure is a parsing contract between the model and its runtime. The model writes structured output — not prose — that the runtime splits on action markers to dispatch tool calls and inject observations. What looks like “the model reasoning” is a text generation task following a grammar. The loop terminates when the model’s output matches no action pattern; the remaining text becomes the answer. Reliability depends on consistent delimiter use throughout the context.

ReAct is the minimum viable agent loop I reach for before anything else. Write a system prompt that defines the format (Thought, Action, Observation), then specify the tool set and the stop condition. The plumbing frameworks handle parsing and dispatch. If the agent misbehaves, the thought trace is your first debugging read: it shows which tool was called, and why, from the model’s own stated reasoning at that step.

Every company building “AI workflows” right now is building ReAct loops, whether they know the term or not. The question is who controls the tool set. A tightly scoped ReAct agent with a handful of specific tools outperforms a general-purpose agent with dozens of options — the model makes worse decisions the more choices it has. Narrow the action space early, or you will spend months debugging hallucinated tool calls that looked plausible in a demo.

The Observation step is the one nobody questions. The model calls a tool, gets a result, and treats it as ground truth — no provenance, no verification, no audit trail in the reasoning trace. What happens when the tool returns poisoned data? The model’s subsequent Thoughts treat that observation like any other input. ReAct’s loop is persuasive enough that a bad observation rarely triggers a skeptical Thought. That gap deserves more scrutiny than it currently gets.