
arxiv: 2512.20845 · v2 · pith:ROJBQUG7new · submitted 2025-12-23 · 💻 cs.AI · cs.MA

MAR: Multi-Agent Reflexion Improves Reasoning Abilities in LLMs

classification 💻 cs.AI cs.MA
keywords reflections · llms · multi-agent · reasoning · same · abilities · accuracy · acting

LLMs have shown the capacity to improve their performance on reasoning tasks by reflecting on their mistakes and acting with these reflections in mind. However, continual reflection by the same LLM on itself exhibits degeneration of thought, where the LLM keeps repeating the same errors even when it knows they are wrong. To address this problem, we instead introduce multi-agent reflection with multi-persona debaters as the method for generating reflections. Through our extensive experimentation, we have found that this leads to greater diversity in the reflections generated by the LLM agent. We demonstrate an accuracy of 47% EM on HotpotQA (question answering) and 82.7% on HumanEval (programming), both surpassing reflection with a single LLM.
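The loop described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation: the persona names, the prompt wording, and the function names `multi_persona_reflections` and `solve_with_reflexion` are all hypothetical, the `actor`, `critic`, and `evaluate` callables stand in for whatever LLM and evaluator one plugs in, and the debate among personas is simplified here to independent critiques that are pooled into memory.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]

# Illustrative personas only; the abstract does not list the ones used in the paper.
PERSONAS = ("skeptic", "domain expert", "careful verifier")


def multi_persona_reflections(
    critic: LLM,
    task: str,
    attempt: str,
    feedback: str,
    personas: Sequence[str] = PERSONAS,
) -> List[str]:
    """Ask each persona to critique a failed attempt; return one reflection per persona."""
    reflections = []
    for persona in personas:
        prompt = (
            f"You are a {persona}. Task: {task}\n"
            f"Failed attempt: {attempt}\n"
            f"Feedback: {feedback}\n"
            "Explain what went wrong and how to fix it."
        )
        reflections.append(critic(prompt))
    return reflections


def solve_with_reflexion(
    actor: LLM,
    critic: LLM,
    evaluate: Callable[[str], Tuple[bool, str]],
    task: str,
    max_trials: int = 3,
) -> str:
    """Retry the task, feeding pooled multi-persona reflections back to the actor."""
    memory: List[str] = []
    attempt = ""
    for _ in range(max_trials):
        context = "\n".join(memory)
        attempt = actor(f"Task: {task}\nPast reflections:\n{context}\nAnswer:")
        ok, feedback = evaluate(attempt)
        if ok:
            return attempt
        # Assumption: reflections from all personas are simply appended to memory.
        memory.extend(multi_persona_reflections(critic, task, attempt, feedback))
    return attempt
```

The design point the sketch tries to capture is that reflections come from several differently prompted critics rather than from the acting LLM alone, which is the mechanism the abstract credits for more diverse reflections.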

This paper has not been read by Pith yet.


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Communication-Theoretic Framework for LLM Agents: Cost-Aware Adaptive Reliability

    cs.LG 2026-05 unverdicted novelty 6.0

    LLM reliability techniques are unified as communication channel operators, with a new cost-aware router achieving superior quality-cost tradeoffs on hard tasks.

  2. TEC: A Collection of Human Trial-and-error Trajectories for Problem Solving

    cs.CL 2026-04 unverdicted novelty 6.0

    TEC is a new public dataset of detailed human trial-and-error trajectories and reflections on web tasks, with humans showing substantially higher accuracy than LLMs.

  3. Security Considerations for Multi-agent Systems

    cs.CR 2026-03 unverdicted novelty 6.0

    No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.