Momento: Evaluating Persistent Memory and Reasoning with Multi-Session Agentic Conversations

Adril Putra Merin; Ayu Purwarianti; David Anugraha; Genta Indra Winata

arxiv: 2606.00832 · v1 · pith:7RPH2DSHnew · submitted 2026-05-30 · 💻 cs.CL

keywords: agents, agentic, current, user, actions, goals, momento, multi-session

Recent advances in agentic AI have enabled agents to complete complex tasks through tool use, reasoning, and multi-step planning. Yet existing benchmarks evaluate agents within a single session, ignoring past actions, stated preferences, and prior decisions that agents must integrate to fulfill personalized user goals. We introduce Momento, a benchmark for persistent agentic task completion in multi-session service environments, requiring agents to take consequential, tool-mediated actions while resolving temporal dependencies and evolving user goals across sessions. Experimental results reveal that current agents fail primarily through misestimation of user state, treating prior session history as a reliable proxy for current context rather than stale information requiring re-validation, highlighting a substantial gap between current agent capabilities and realistic long-horizon human-agent interaction.
