ReAct Framework

Quote

Reasoning enables an agent to formulate a plan, while acting enables it to interact with the environment.

Basic Concepts

ReAct (Reasoning + Acting) is an agent paradigm that interleaves reasoning with acting. It was proposed by Yao et al. (2022) in the paper ReAct: Synergizing Reasoning and Acting in Language Models.

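Traditionally, LLMs are used in one of two ways:

Reasoning-only: e.g., Chain-of-Thought (CoT). The model reasons internally and cannot interact with the external environment;
Acting-only: e.g., calling an API directly. The model outputs actions directly, with no intermediate reasoning.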

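The core idea of ReAct is to combine the two: at each step the model first reasons (Thought), then takes an action (Action), observes the result (Observation), and continues reasoning from there, forming a loop.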

Workflow

A typical ReAct execution loop looks like this:

flowchart LR
    T[Thought] --> A[Action]
    A --> O[Observation]
    O --> T

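The steps in detail:

Thought: the model analyzes the current state and reasons about what to do next and why;
Action: based on that reasoning, the model selects and executes one concrete action (e.g., a search, a calculation, or an API call);
Observation: the result of the action is returned to the model;
The cycle repeats until the model judges that it has enough information to answer the original question.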
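
The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the paper: `react_loop`, `scripted_model`, the `Lookup` tool, and the `Action: Name[argument]` output format are all hypothetical names chosen for this sketch, and the "model" simply replays a fixed script so the example runs without any LLM.

```python
import re
from typing import Callable, Dict, List, Optional

# Assumed output format for this sketch: "Action: ToolName[argument]".
ACTION_RE = re.compile(r"Action(?: \d+)?:\s*(\w+)\[(.*)\]")


def scripted_model(script: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays canned Thought/Action steps."""
    steps = iter(script)

    def model(prompt: str) -> str:
        return next(steps)

    return model


def react_loop(
    question: str,
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Optional[str]:
    """Alternate Thought/Action with Observation until Finish or max_steps."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        output = model(transcript)  # the model emits a Thought and an Action
        transcript += output + "\n"
        match = ACTION_RE.search(output)
        if match is None:
            transcript += "Observation: could not parse an action\n"
            continue
        name, argument = match.group(1), match.group(2).strip('"')
        if name == "Finish":
            return argument  # the model decided it has the answer
        tool = tools.get(name)
        observation = tool(argument) if tool else f"unknown tool {name}"
        transcript += f"Observation: {observation}\n"  # fed back next step
    return None  # step budget exhausted without a Finish action


# Toy knowledge base and tool, both invented for this example.
facts = {"capital of France": "Paris"}
tools = {"Lookup": lambda key: facts.get(key, "no result")}

model = scripted_model([
    "Thought: I need the capital of France.\nAction: Lookup[capital of France]",
    "Thought: The observation says Paris.\nAction: Finish[Paris]",
])
answer = react_loop("What is the capital of France?", model, tools)
print(answer)
```

The `max_steps` cap is a design choice of this sketch: it bounds the number of model calls when the model never emits a `Finish` action.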

Example

Take a simple question-answering task as an example:

Question: What is the elevation range for the area that the eastern 
          sector of the Colorado orogeny extends into?

Thought 1: I need to search for the Colorado orogeny and find its eastern sector.
Action 1: Search["Colorado orogeny"]
Observation 1: The Colorado orogeny was an episode of mountain building in 
               the western United States...

Thought 2: The eastern sector extends into the High Plains. I need to find 
           the elevation range of the High Plains.
Action 2: Search["High Plains elevation range"]
Observation 2: The High Plains rise in elevation from around 1,800 ft to 
               7,000 ft...

Thought 3: The elevation range of the High Plains is 1,800 ft to 7,000 ft. 
           I can now answer the question.
Action 3: Finish["1,800 ft to 7,000 ft"]
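
To execute a trace like the one above, an agent runtime has to turn each `Action N: Tool["argument"]` line into a structured call. The following is one possible way to do that, assuming the exact line format shown in the trace; `Action`, `ACTION_LINE`, and `parse_action` are hypothetical names introduced only for this sketch.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Action:
    tool: str       # e.g. "Search" or "Finish"
    argument: str   # the text between the square brackets, unquoted


# Assumed format: Action <step>: <Tool>[<argument>]
ACTION_LINE = re.compile(r"^Action (\d+): (\w+)\[(.*)\]$")


def parse_action(line: str) -> Tuple[int, Action]:
    """Parse one action line into (step number, Action)."""
    match = ACTION_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an action line: {line!r}")
    step, tool, argument = int(match.group(1)), match.group(2), match.group(3)
    # Drop one pair of surrounding double quotes, if present.
    if len(argument) >= 2 and argument[0] == argument[-1] == '"':
        argument = argument[1:-1]
    return step, Action(tool, argument)


step, action = parse_action('Action 2: Search["High Plains elevation range"]')
print(step, action.tool, action.argument)
```

Raising `ValueError` on a malformed line lets the caller decide whether to retry the model call or report the failure back as an observation.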

Comparison with Other Paradigms

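Paradigm	Reasoning	Acting	Characteristics
Reasoning-only	✅	❌	Purely internal reasoning, e.g., CoT
Acting-only	❌	✅	Outputs actions directly, with no reasoning process
ReAct	✅	✅	Interleaves reasoning and acting; highly interpretable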

Key Advantages

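Interpretability: the Thought steps make the model's decision process transparent and auditable;
Flexibility: the model can adjust its strategy dynamically based on each Observation instead of following a fixed pipeline;
Factuality: interacting with external tools lets the model retrieve real information, which helps reduce hallucination.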

Limitations

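When the reasoning chain grows long, accumulated errors can pull the agent away from its goal;
Every step requires an LLM inference call, so the overall cost is relatively high;
For simple tasks, the overhead of ReAct may be excessive.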
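
A rough back-of-the-envelope calculation can make the cost point concrete. The sketch below assumes a naive loop that resends the full transcript on every call, so each call's prompt grows by one Thought/Action/Observation step; the function name and all token counts are illustrative assumptions, not measurements.

```python
def total_prompt_tokens(base_tokens: int, tokens_per_step: int, steps: int) -> int:
    """Sum the prompt tokens over all calls of a naive ReAct loop.

    Call k (1-indexed) sees the base prompt plus the k - 1 earlier steps.
    """
    return sum(base_tokens + (k - 1) * tokens_per_step for k in range(1, steps + 1))


# Illustrative numbers only: a 200-token prompt, about 100 tokens added per step.
single_call = total_prompt_tokens(200, 100, 1)   # 200
three_steps = total_prompt_tokens(200, 100, 3)   # 200 + 300 + 400 = 900
print(single_call, three_steps)
```

Under these assumptions, total prompt tokens grow roughly quadratically with the number of steps, which is why a one-shot answer is often cheaper for simple tasks.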