The Annals of Mathematical Statistics
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Opponent-aware peer-learning corrections in finite-unroll Meta-MAPG increase entry probability into target stable-Nash basins relative to standard policy gradient, with annealing to recover local convergence.
citing papers explorer
- MemQ: Integrating Q-Learning into Self-Evolving Memory Agents over Provenance DAGs
  MemQ improves LLM agent performance by using eligibility traces over provenance DAGs to assign credit to dependent memories, achieving top success rates on six benchmarks with largest gains on complex multi-step tasks.
- Equilibrium Selection in Multi-Agent Policy Gradients via Opponent-Aware Basin Entry
  Opponent-aware peer-learning corrections in finite-unroll Meta-MAPG increase entry probability into target stable-Nash basins relative to standard policy gradient, with annealing to recover local convergence.