When to Forget: A Memory Governance Primitive
Pith reviewed 2026-05-10 16:02 UTC · model grok-4.3
The pith
A two-counter signal per memory converges almost surely to the probability of task success given its retrieval.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Memory Worth (MW) maintains two counters per memory unit: one incremented on successful outcomes when the memory is retrieved and one on failures. The estimator is the ratio of the success counter to the total. The paper proves that MW converges almost surely to p+(m) = Pr[y_t = +1 | m in M_t] under a stationary retrieval regime with a minimum exploration condition. The quantity p+(m) measures associational co-occurrence of the memory with success rather than any causal contribution of the memory itself. The estimator uses only scalar counters and can be added to any architecture that already records retrieval events and episode outcomes.
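The two-counter update rule described above is simple enough to state in a few lines. The following is a minimal sketch under our own naming (the paper specifies only the counters and the ratio, not this interface or the 0.5 prior used before the first retrieval):

```python
from dataclasses import dataclass

@dataclass
class MemoryWorth:
    """Sketch of the two-counter MW estimator (names are ours, not the paper's)."""
    successes: int = 0
    failures: int = 0

    def update(self, outcome: int) -> None:
        # Called once per episode in which this memory was retrieved;
        # outcome is +1 for task success, -1 for failure.
        if outcome == +1:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def value(self) -> float:
        # MW = N+ / (N+ + N-); before any retrieval we report 0.5,
        # an uninformative prior (our choice, not specified by the paper).
        total = self.successes + self.failures
        return self.successes / total if total else 0.5

mw = MemoryWorth()
for y in (+1, +1, -1, +1):
    mw.update(y)
print(mw.value)  # 3 successes out of 4 retrievals -> 0.75
```

Because the state is two integers, the estimator composes with any store that can attach scalars to memory records.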
What carries the argument
Memory Worth, the two-counter per-memory estimator of conditional success probability given retrieval.
If this is right
- Memory systems can suppress retrieval of items whose MW falls below a chosen threshold.
- Deprecation policies can remove memories that remain below the low-value threshold for a sustained period.
- The same counters provide a running estimate that updates automatically as the agent's task distribution shifts.
- Only retrieval logs and binary outcome signals are needed, keeping the overhead to two scalars per memory.
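A governance loop built on these bullets might look like the sketch below; the suppression threshold and the deprecation grace period are illustrative assumptions, not values from the paper:

```python
# Sketch of threshold-based governance driven by MW values.
# SUPPRESS_BELOW and DEPRECATE_AFTER are our illustrative choices.
SUPPRESS_BELOW = 0.3    # stop retrieving while MW sits below this
DEPRECATE_AFTER = 500   # episodes continuously below threshold before removal

def governance_step(memories, episode):
    """memories: dict id -> {'mw': float, 'below_since': int | None}.
    Returns (ids still retrievable, ids to deprecate)."""
    retrievable, to_remove = [], []
    for mid, m in memories.items():
        if m['mw'] < SUPPRESS_BELOW:
            if m['below_since'] is None:
                m['below_since'] = episode     # start the low-value clock
            if episode - m['below_since'] >= DEPRECATE_AFTER:
                to_remove.append(mid)          # sustained low value -> deprecate
        else:
            m['below_since'] = None            # recovered; reset the clock
            retrievable.append(mid)
    return retrievable, to_remove

mems = {'a': {'mw': 0.8, 'below_since': None},
        'b': {'mw': 0.1, 'below_since': 0}}
print(governance_step(mems, 600))  # (['a'], ['b'])
```

Note that suppression and deprecation are distinct: a memory is suppressed immediately when its MW drops, but removed only after it stays low for the grace period.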
Where Pith is reading between the lines
- In non-stationary environments the convergence guarantee would not hold, suggesting the need for exponential decay or windowed counters as a practical extension.
- The associational nature of p+(m) means MW could be combined with intervention experiments to test whether a memory actually causes the observed outcomes.
- High-MW memories could be prioritized in retrieval ranking to improve sample efficiency in downstream learning.
- The method might generalize to continuous-valued outcomes by replacing the binary counters with summed rewards.
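The decay-based extension floated in the first bullet can be sketched directly; the decay factor (and the effective memory horizon it implies) is our choice, not the paper's:

```python
class DecayedMemoryWorth:
    """MW variant with exponentially decayed counters -- a possible
    non-stationary extension suggested by the review, not from the paper."""

    def __init__(self, decay: float = 0.995):
        self.decay = decay   # per-update forgetting factor in (0, 1)
        self.n_pos = 0.0
        self.n_neg = 0.0

    def update(self, outcome: int) -> None:
        # Old evidence is down-weighted before new evidence is added,
        # so the ratio tracks a drifting p+(m) instead of freezing.
        self.n_pos *= self.decay
        self.n_neg *= self.decay
        if outcome == +1:
            self.n_pos += 1.0
        else:
            self.n_neg += 1.0

    @property
    def value(self) -> float:
        total = self.n_pos + self.n_neg
        return self.n_pos / total if total else 0.5

dm = DecayedMemoryWorth()
for _ in range(200):
    dm.update(+1)        # memory starts out reliable
for _ in range(400):
    dm.update(-1)        # distribution shifts; memory goes stale
print(dm.value)          # drops well below 0.5 as failures dominate
```

The same decay also handles the continuous-outcome bullet: replace the +1 increments with summed rewards and the ratio becomes a decayed average reward.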
Load-bearing premise
Retrieval must follow a stationary distribution with a minimum amount of exploration so that every memory is sampled often enough for the counters to converge.
What would settle it
Run a controlled stationary retrieval process with known exploration rate and known true conditional success probabilities for each memory; after many episodes the MW values should match those probabilities within sampling error, and any systematic deviation would falsify the convergence claim.
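That falsification test is easy to mock up. The sketch below builds a stationary retrieval process with fixed retrieval probabilities (so the exploration condition holds trivially) and known conditional success rates, then checks that the counters land near the ground truth; all numeric values are our illustrative choices:

```python
import random

def simulate(n_episodes=100_000, seed=0):
    """Controlled check of the convergence claim (our construction).
    Retrieval follows a fixed distribution with every memory sampled
    with positive probability; outcomes follow known p+(m)."""
    rng = random.Random(seed)
    p_true = {"stale": 0.2, "generalist": 0.5, "specialist": 0.9}
    retrieve_prob = {"stale": 0.2, "generalist": 0.5, "specialist": 0.3}
    pos = {m: 0 for m in p_true}
    tot = {m: 0 for m in p_true}
    names = list(p_true)
    weights = [retrieve_prob[m] for m in names]
    for _ in range(n_episodes):
        m = rng.choices(names, weights=weights)[0]  # stationary retrieval
        y = rng.random() < p_true[m]                # known conditional success
        pos[m] += int(y)
        tot[m] += 1
    return {m: pos[m] / tot[m] for m in p_true}

mw = simulate()
print({m: round(v, 2) for m, v in mw.items()})
```

If any estimate deviated from its `p_true` by more than sampling error at this episode count, the convergence claim would be in trouble.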
Figures
Original abstract
Agent memory systems accumulate experience but currently lack a principled operational metric for memory quality governance -- deciding which memories to trust, suppress, or deprecate as the agent's task distribution shifts. Write-time importance scores are static; dynamic management systems use LLM judgment or structural heuristics rather than outcome feedback. This paper proposes Memory Worth (MW): a two-counter per-memory signal that tracks how often a memory co-occurs with successful versus failed outcomes, providing a lightweight, theoretically grounded foundation for staleness detection, retrieval suppression, and deprecation decisions. We prove that MW converges almost surely to the conditional success probability p+(m) = Pr[y_t = +1 | m in M_t] -- the probability of task success given that memory m is retrieved -- under a stationary retrieval regime with a minimum exploration condition. Importantly, p+(m) is an associational quantity, not a causal one: it measures outcome co-occurrence rather than causal contribution. We argue this is still a useful operational signal for memory governance, and we validate it empirically in a controlled synthetic environment where ground-truth utility is known: after 10,000 episodes, the Spearman rank-correlation between Memory Worth and true utilities reaches rho = 0.89 +/- 0.02 across 20 independent seeds, compared to rho = 0.00 for systems that never update their assessments. A retrieval-realistic micro-experiment with real text and neural embedding retrieval (all-MiniLM-L6-v2) further shows stale memories crossing the low-value threshold (MW = 0.17) while specialist memories remain high-value (MW = 0.77) across 3,000 episodes. The estimator requires only two scalar counters per memory unit and can be added to architectures that already log retrievals and episode outcomes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Memory Worth (MW), a lightweight two-counter per-memory primitive that accumulates co-occurrences with successful versus failed episode outcomes. It proves that MW converges almost surely to the associational quantity p+(m) = Pr[y_t = +1 | m in M_t] under a stationary retrieval regime plus a minimum exploration condition, and reports empirical Spearman correlations of 0.89 with ground-truth utilities in a synthetic environment plus differentiation of stale versus specialist memories in a neural-embedding micro-experiment.
Significance. If the convergence result and empirical validation hold, MW supplies a simple, parameter-free, outcome-driven signal that can be added to any architecture already logging retrievals and rewards. This addresses a genuine gap between static write-time scores and heuristic or LLM-based dynamic memory management, with clear potential for staleness detection and deprecation policies. The explicit caveat that p+ is associational rather than causal is a positive feature that appropriately bounds expectations.
major comments (2)
- [§1 and convergence theorem] §1 (Introduction) and the convergence theorem (presumably §4): the motivation centers on shifting task distributions that perturb retrieval frequencies, yet the almost-sure convergence guarantee is stated only for a stationary retrieval regime. In non-stationary regimes MW may lag or converge to a stale value, directly undermining the staleness-detection and deprecation use cases that motivate the work. A tracking-error bound or non-stationary extension is required for the central claim to support the intended applications.
- [Empirical validation] Empirical section (synthetic environment): the reported rho = 0.89 +/- 0.02 is obtained after 10,000 episodes, but it is unclear whether the simulation enforces the minimum exploration condition required by the theorem or how ground-truth utilities are defined independently of the MW counters. Without these details the experiment cannot be confirmed to validate the theorem's assumptions rather than merely showing correlation under favorable conditions.
minor comments (2)
- The two-counter update rule would benefit from an explicit equation (e.g., MW_t(m) = N_+(m) / (N_+(m) + N_-(m))) placed in the main text rather than only in prose.
- The micro-experiment with all-MiniLM-L6-v2 embeddings would be clearer if the exact retrieval threshold and episode-outcome logging protocol were stated.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the scope of the theoretical result and the need for greater clarity in the empirical validation. We address each major comment below, indicating the revisions we will incorporate.
Point-by-point responses
Referee: [§1 and convergence theorem] §1 (Introduction) and the convergence theorem (presumably §4): the motivation centers on shifting task distributions that perturb retrieval frequencies, yet the almost-sure convergence guarantee is stated only for a stationary retrieval regime. In non-stationary regimes MW may lag or converge to a stale value, directly undermining the staleness-detection and deprecation use cases that motivate the work. A tracking-error bound or non-stationary extension is required for the central claim to support the intended applications.
Authors: We agree that the almost-sure convergence is proven only under a stationary retrieval regime with the minimum exploration condition. The introduction motivates the work via shifting distributions, and MW is intended to serve as an online signal that updates with new outcomes; a shift in task distribution will cause MW to drift from its previously converged value, providing a detectable change for staleness. However, we do not derive a tracking-error bound or non-stationary extension, which would require additional assumptions on shift rates. We will add explicit discussion in the revised §1 and §5 clarifying the stationary assumption and noting that MW functions as a tracking estimator whose updates can flag distribution changes. revision: partial
Referee: [Empirical validation] Empirical section (synthetic environment): the reported rho = 0.89 +/- 0.02 is obtained after 10,000 episodes, but it is unclear whether the simulation enforces the minimum exploration condition required by the theorem or how ground-truth utilities are defined independently of the MW counters. Without these details the experiment cannot be confirmed to validate the theorem's assumptions rather than merely showing correlation under favorable conditions.
Authors: Ground-truth utilities are defined as fixed, independent per-memory success probabilities p_true(m) that are used to sample episode outcomes y_t ~ Bernoulli(p_true(m)) upon retrieval; these p_true values are set at environment initialization and never depend on the MW counters, which begin at zero. The minimum exploration condition is enforced by ensuring each memory is retrieved at least once every fixed interval with probability epsilon > 0, satisfying the theorem hypothesis. We will revise the experimental section to document these design choices, include pseudocode verifying the condition holds across the 10,000 episodes, and state the independent definition of p_true. revision: yes
- Open after rebuttal: a formal tracking-error bound or non-stationary extension of the convergence theorem under shifting task distributions.
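One way to realize the epsilon-exploration mechanism the rebuttal describes is an epsilon-mixture retrieval policy over fixed relevance scores (e.g., embedding similarities), which keeps retrieval stationary while guaranteeing every memory a positive sampling probability. This is our sketch, not the paper's implementation:

```python
import random

def retrieve(memories, scores, eps=0.05, rng=random):
    """Epsilon-mixture retrieval (our sketch of the rebuttal's mechanism).
    scores: fixed relevance scores per memory, so the mixture policy is
    stationary. With probability eps a uniformly random memory is retrieved;
    otherwise the highest-scoring one. Every memory is therefore sampled
    with probability >= eps / len(memories) each episode, so its MW
    counters keep accumulating and the theorem hypothesis is met."""
    if rng.random() < eps:
        return rng.choice(memories)
    return max(memories, key=lambda m: scores.get(m, 0.5))

mems = ['a', 'b', 'c']
sims = {'a': 0.1, 'b': 0.9, 'c': 0.5}
print(retrieve(mems, sims, eps=0.0))  # 'b': greedy on fixed scores
```

Ranking by MW instead of fixed scores would couple retrieval to the counters and break stationarity, which is exactly the gap the tracking-error question targets.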
Circularity Check
MW estimator defined from counters converges to explicitly stated conditional probability via standard LLN; no reduction to inputs by construction
Full rationale
The paper defines Memory Worth directly from two per-memory counters (success and failure co-occurrences) and proves almost-sure convergence to the independently defined associational quantity p+(m) = Pr[y_t = +1 | m in M_t] under stationarity plus minimum exploration. This is a standard application of the strong law of large numbers to the indicator sequence and does not equate the result to its own inputs or fitted parameters. No self-citation chains, ansatzes, or uniqueness theorems are invoked for the central claim. The empirical sections compare MW against ground-truth utilities in controlled environments, supplying independent content. The stationarity assumption is explicitly flagged as required, consistent with the derivation rather than hidden.
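The LLN step the rationale summarizes can be written out explicitly; this reconstruction uses the review's notation and is our paraphrase of the argument, not the paper's proof:

```latex
% Reconstruction of the LLN step (our sketch, under the paper's assumptions).
% Let T_n(m) denote the number of the first n episodes in which m is retrieved.
\[
  \mathrm{MW}_n(m)
  \;=\; \frac{N_+(m)}{N_+(m) + N_-(m)}
  \;=\; \frac{1}{T_n(m)} \sum_{t \le n,\; m \in M_t} \mathbf{1}[y_t = +1].
\]
% Under a stationary retrieval regime with exploration probability bounded
% below, T_n(m) \to \infty almost surely, and the indicators are identically
% distributed draws with mean p_+(m) = \Pr[y_t = +1 \mid m \in M_t],
% so by the strong law of large numbers
\[
  \mathrm{MW}_n(m) \;\xrightarrow{\;\text{a.s.}\;}\; p_+(m).
\]
```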
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: a stationary retrieval regime with a minimum exploration condition
invented entities (1)
- Memory Worth (MW): no independent evidence
Reference graph
Works this paper leans on
- [1] Arjona-Medina, J. A., Gillhofer, M., Widrich, M., Unterthiner, T., Brandstetter, J., and Hochreiter, S. RUDDER: Return decomposition for delayed rewards. In Advances in Neural Information Processing Systems, volume 32, pp. 13544--13555, 2019.
- [2] Hall, P. and Heyde, C. C. Martingale Limit Theory and Its Application. Academic Press, New York, 1980.
- [3] Harutyunyan, A., Dabney, W., Mesnard, T., Azar, M. G., Piot, B., Heess, N., van Hasselt, H., Wayne, G., Singh, S., Precup, D., and Munos, R. Hindsight credit assignment. In Advances in Neural Information Processing Systems, volume 32, pp. 12467--12476, 2019.
- [4] Minsky, M. Steps toward artificial intelligence. Proceedings of the IRE, 49(1): 8--30, 1961. https://doi.org/10.1109/JRPROC.1961.287775
- [5] Packer, C., Wooders, S., Lin, K., Fang, V., Patil, S. G., Stoica, I., and Gonzalez, J. E. MemGPT: Towards LLMs as operating systems. arXiv preprint arXiv:2310.08560, 2023.
- [6] Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., and Bernstein, M. S. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, UIST '23, pp. 1--22. ACM, 2023. https://doi.org/10.1145/3586183.3606763
- [7] Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K., and Yao, S. Reflexion: Language agents with verbal reinforcement learning. In Advances in Neural Information Processing Systems, volume 36, pp. 8634--8652, 2023.
- [8] Simsek, B. Memory that knows what it knows: Feedback-learned confidence and credit assignment for agent memory. Under review, 2025.
- [9] Sutton, R. S. and Barto, A. G. Reinforcement Learning: An Introduction. The MIT Press, Cambridge, Massachusetts, 2nd edition, 2018. ISBN 978-0-262-03924-6.
- [10] Xu, W., Zhu, Z., Zhan, Z., Wang, Z., Yang, Y., Zhou, J., and Liang, X. A-MEM: Agentic memory for LLM agents. In Advances in Neural Information Processing Systems, volume 38, 2025. arXiv:2502.12110.
- [11] Zhang, G., and co-authors. Adaptive memory admission control for LLM agents. arXiv preprint arXiv:2603.04549, 2026. (Cited by arXiv submission date; not yet formally published.)
- [12] Zhong, W., Guo, L., Gao, Q., Ye, H., and Wang, Y. MemoryBank: Enhancing large language models with long-term memory. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pp. 19724--19731, 2024.
- [13] Hu, Y., Liu, S., and Yue, Y. Evaluating memory in LLM agents via incremental multi-turn interactions. arXiv preprint arXiv:2507.05257, 2025.
- [14] Zhang, G., and co-authors. Memory in the age of AI agents: A survey. arXiv preprint arXiv:2512.13564, 2025.
- [15] Manning, C. D., Raghavan, P., and Schütze, H. Introduction to Information Retrieval. Cambridge University Press, 2008.
- [16] Reimers, N. and Gurevych, I. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, pp. 3982--3992, 2019.