MAR: Multi-Agent Reflexion Improves Reasoning Abilities in LLMs

Daniel Dosti; Grace Wu; Honghao Zhang; Onat Ozer; Vivi De La Rue; Yuchen Wang

arxiv: 2512.20845 · v2 · pith:ROJBQUG7new · submitted 2025-12-23 · 💻 cs.AI · cs.MA

Onat Ozer, Yuchen Wang, Grace Wu, Daniel Dosti, Honghao Zhang, Vivi De La Rue

classification 💻 cs.AI cs.MA

keywords reflections · llms · multi-agent · reasoning · same · abilities · accuracy · acting

LLMs have shown the capacity to improve their performance on reasoning tasks by reflecting on their mistakes and acting with these reflections in mind. However, continual reflection by the same LLM on itself exhibits degeneration of thought, where the LLM keeps repeating the same errors even when it knows they are wrong. To address this problem, we instead introduce a multi-agent setup with multi-persona debaters as the method to generate reflections. Through extensive experimentation, we found that this leads to greater diversity in the reflections generated by the LLM agent. We demonstrate an accuracy of 47% EM on HotPotQA (question answering) and 82.7% on HumanEval (programming), both surpassing reflection with a single LLM.
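
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: an actor retries a question while several persona critics contribute reflections after each failed attempt. Everything here is an assumption made for illustration, not the authors' method: the names `multi_agent_reflexion`, `actor`, and `critics`, the stub agents that stand in for LLM calls, and the use of exact match (EM, with the lowercase, punctuation, and article stripping commonly applied in QA scoring) against a reference answer as the success signal.

```python
import re
import string
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an agent maps a prompt to text. A real system would call an LLM.
Agent = Callable[[str], str]


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> bool:
    """EM: the normalized prediction equals the normalized reference."""
    return normalize(prediction) == normalize(gold)


def multi_agent_reflexion(
    question: str,
    gold: str,
    actor: Agent,
    critics: Dict[str, Agent],
    max_trials: int = 3,
) -> Tuple[str, int]:
    """Retry the actor, feeding it reflections from several persona critics.

    Assumption: success is judged by EM against a reference answer.
    Returns the last answer and the number of trials used.
    """
    reflections: List[str] = []
    answer = ""
    for trial in range(1, max_trials + 1):
        # The actor sees the question plus every reflection gathered so far.
        prompt = question + "\n" + "\n".join(reflections)
        answer = actor(prompt)
        if exact_match(answer, gold):
            return answer, trial
        # Each persona critiques the failed attempt independently.
        for name, critic in critics.items():
            critique = critic(f"Q: {question}\nFailed answer: {answer}")
            reflections.append(f"[{name}] {critique}")
    return answer, max_trials


# Deterministic stand-ins for LLM calls (illustrative only).
def actor_stub(prompt: str) -> str:
    return "Paris." if "capital" in prompt.lower() else "Lyon"


critics_stub: Dict[str, Agent] = {
    "skeptic": lambda p: "The answer named a large city; focus on the capital.",
    "fact-checker": lambda p: "Verify where the government sits.",
}

answer, trials = multi_agent_reflexion(
    "Which city hosts the French national government?",
    "Paris",
    actor_stub,
    critics_stub,
)
print(answer, trials)  # prints: Paris. 2
```

In this toy run the first attempt fails, the two personas each add a reflection, and the second attempt succeeds once the actor sees the hint. Keeping each critic's reflection as a separate tagged entry is one simple way to preserve differing viewpoints rather than merging them into a single self-critique.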

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Communication-Theoretic Framework for LLM Agents: Cost-Aware Adaptive Reliability
cs.LG · 2026-05 · unverdicted · novelty 6.0

LLM reliability techniques are unified as communication channel operators, with a new cost-aware router achieving superior quality-cost tradeoffs on hard tasks.

TEC: A Collection of Human Trial-and-error Trajectories for Problem Solving
cs.CL · 2026-04 · unverdicted · novelty 6.0

TEC is a new public dataset of detailed human trial-and-error trajectories and reflections on web tasks, with humans showing substantially higher accuracy than LLMs.

Security Considerations for Multi-agent Systems
cs.CR · 2026-03 · unverdicted · novelty 6.0

No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.