Explaining context length scaling and bounds for language models. arXiv preprint arXiv:2502.01481, 2025.
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory
  A new evaluation protocol shows agent memory reliability degrades variably with added irrelevant sessions depending on agent, memory interface, and scale.
- AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs
  AgenticAI-DialogGen uses LLM agents to generate persona-grounded, topic-guided conversations and QA pairs encoding short- and long-term memory, producing the TGC dataset that improves LLM performance on memory tasks.
- Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization
  SCM-GRPO grounds multi-hop fact verification in structural causal models and applies GRPO reinforcement learning to optimize reasoning chain length, outperforming baselines on HoVer and EX-FEVER.
- RAM: Recover Any 3D Human Motion in-the-Wild
  RAM outperforms prior methods on PoseTrack and 3DPW for zero-shot multi-person 3D motion tracking and reconstruction by fusing semantic tracking, memory-augmented pose estimation, and predictive fusion.