Reflexion: Language agents with verbal reinforcement learning

Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao · 2023

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

LLMs Know When They Know, but Do Not Act on It: A Metacognitive Harness for Test-time Scaling

cs.LG · 2026-05-13 · conditional · novelty 6.0

A metacognitive harness uses LLMs' pre- and post-solution self-monitoring signals to control test-time reasoning, raising pooled accuracy from 48.3% to 56.9% on text, code, and multimodal benchmarks.

Agentic Systems as Boosting Weak Reasoning Models

cs.AI · 2026-05-13 · unverdicted · novelty 6.0

Verifier-backed committee search boosts a weak reasoning model from 67% to 76.4% on SWE-bench Verified, matching stronger models by using local soundness signals to select among proposals.

citing papers explorer

LLMs Know When They Know, but Do Not Act on It: A Metacognitive Harness for Test-time Scaling · cs.LG · 2026-05-13 · conditional · none · ref 27
Agentic Systems as Boosting Weak Reasoning Models · cs.AI · 2026-05-13 · unverdicted · none · ref 35
