{"paper":{"title":"ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models","license":"http://creativecommons.org/licenses/by/4.0/","headline":"ReWOO generates a full reasoning plan without tool observations first, then executes it in one pass to cut token use and allow smaller models.","cross_cats":["cs.AI"],"primary_cat":"cs.CL","authors_text":"Binfeng Xu, Bowen Lei, Dongkuan Xu, Subhabrata Mukherjee, Yuchen Liu, Zhiyuan Peng","submitted_at":"2023-05-23T00:16:48Z","abstract_excerpt":"Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution. Existing ALM systems trigger LLM thought processes while pulling observations from these tools in an interleaved fashion. Specifically, an LLM reasons to call an external tool, gets halted to fetch the tool's response, and then decides the next action based on all preceding response tokens. Such a paradigm, though straightforward and easy to implement, often leads to huge computation complexity from redundant prompts and repeated "},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"ReWOO achieves 5x token efficiency and 4% accuracy improvement on HotpotQA, a multi-step reasoning benchmark, while demonstrating robustness under tool-failure scenarios and enabling offloading from 175B GPT-3.5 to 7B LLaMA.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That a complete reasoning plan can be generated without any intermediate observations and that the subsequent single-pass execution of that plan will not lose critical information that interleaved observation would have supplied.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"ReWOO decouples reasoning from tool observations in augmented 
language models, delivering 5x token efficiency and 4% higher accuracy on multi-step reasoning benchmarks like HotpotQA.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"ReWOO generates a full reasoning plan without tool observations first, then executes it in one pass to cut token use and allow smaller models.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"7f84fd2836a0b373c832a80aa0b97e90b7bcb669096ffd9dcc8b501e7c5cf663"},"source":{"id":"2305.18323","kind":"arxiv","version":1},"verdict":{"id":"dc45b66a-9731-4cd5-9602-61128ca748c2","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T18:10:30.004133Z","strongest_claim":"ReWOO achieves 5x token efficiency and 4% accuracy improvement on HotpotQA, a multi-step reasoning benchmark, while demonstrating robustness under tool-failure scenarios and enabling offloading from 175B GPT-3.5 to 7B LLaMA.","one_line_summary":"ReWOO decouples reasoning from tool observations in augmented language models, delivering 5x token efficiency and 4% higher accuracy on multi-step reasoning benchmarks like HotpotQA.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That a complete reasoning plan can be generated without any intermediate observations and that the subsequent single-pass execution of that plan will not lose critical information that interleaved observation would have supplied.","pith_extraction_headline":"ReWOO generates a full reasoning plan without tool observations first, then executes it in one pass to cut token use and allow smaller models."},"references":{"count":42,"sample":[{"doi":"","year":2023,"title":"ReAct: Synergizing reasoning and acting in language models","work_id":"e6776553-76a5-43b0-9b45-b6c12ef61e44","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Augmented Language 
Models: a Survey","work_id":"6426706e-f14a-4e4b-ade6-8414697a11d2","ref_index":2,"cited_arxiv_id":"2302.07842","is_internal_anchor":true},{"doi":"","year":2023,"title":"MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action","work_id":"6dc43db8-227d-438e-8658-0c8acecba08a","ref_index":3,"cited_arxiv_id":"2303.11381","is_internal_anchor":true},{"doi":"","year":2023,"title":"Taskmatrix.ai: Completing tasks by connecting foundation models with millions of APIs. CoRR, abs/2303.16434, 2023","work_id":"90f98bad-b70c-481c-a56f-1197ee89e441","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Tool Learning with Foundation Models","work_id":"0dcd87f8-e656-45c2-a990-50c17e71e00a","ref_index":5,"cited_arxiv_id":"2304.08354","is_internal_anchor":true}],"resolved_work":42,"snapshot_sha256":"a9d641e33f5fbe23b4639c025cddba284dbfe30aa7417433cb66d4bba22a05a7","internal_anchors":22},"formal_canon":{"evidence_count":2,"snapshot_sha256":"e239fcf662330d5d2143b78b6709a081e9832ae5debee6849d93291b66b98c17"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}