Self-refine: Iterative refinement with self-feedback
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: background 1
verdicts: UNVERDICTED 3
citing papers explorer
- RAGEN-2: Reasoning Collapse in Agentic RL
  Template collapse is a distinct failure mode in agentic RL that is invisible to entropy. Mutual-information proxies diagnose it better, and SNR-aware filtering based on reward variance improves input-dependent reasoning and task performance across planning, math, navigation, and code tasks.
- From Topology to Trajectory: LLM-Driven World Models For Supply Chain Resilience
  ReflectiChain uses latent trajectory rehearsal and retrospective agentic RL inside an LLM world model to raise average step rewards by 250% and restore supply-chain operability from 13.3% to 88.5% on the Semi-Sim benchmark under extreme shocks.
- Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
  Agentic RAG embeds agents capable of reflection, planning, tool use, and collaboration into retrieval pipelines to overcome the limitations of static RAG. The survey offers a taxonomy by agent count, control, autonomy, and knowledge representation, and covers applications and open challenges.