Retroactive Interference Model of Power-Law Forgetting

Antonios Georgiou; Mikhail Katkov; Misha Tsodyks

arxiv: 1907.08946 · v1 · pith:3DWWRP77new · submitted 2019-07-21 · 🧬 q-bio.NC


Pith reviewed 2026-05-24 18:30 UTC · model grok-4.3

classification 🧬 q-bio.NC

keywords: power-law forgetting · retroactive interference · memory model · valence measure · recognition memory · analytical solution · word recognition


The pith

A retroactive interference model with one free integer parameter reproduces power-law forgetting in memory experiments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a phenomenological model of forgetting in which new memories interfere retroactively with older ones according to a multi-dimensional valence measure. A memory that has already survived many successors is likely to have high valence and is therefore hard to displace, so retention falls steeply at short lags and slowly at long ones, producing the characteristic power-law decay of retention with time. The model contains only one free integer parameter, the effective number of valence dimensions, and yields an exact analytical solution for retention probability. Recognition experiments using long streams of words show a good match between the five-dimensional version and the observed forgetting curves.

Core claim

We have devised a phenomenological model that is based on the principle of retroactive interference, driven by a multi-dimensional valence measure for acquired memories. The model has only one free integer parameter and can be solved analytically. Recognition experiments with long streams of words result in a good match to a five-dimensional version of the model.

What carries the argument

The multi-dimensional valence measure that quantifies similarity between memories and thereby sets the strength of retroactive interference.

If this is right

Retention probability follows an explicit power-law form derived from the cumulative interference count in each dimension.
Older memories become progressively more stable, because items that have survived a long run of successors tend to have high valence and are rarely displaced.
The forgetting exponent is determined solely by the integer dimensionality parameter.
The analytical solution gives retention at arbitrary times without numerical simulation.
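
To make the last point concrete, here is a worked sketch of what such a closed form could look like. It rests on two assumptions that this page does not confirm: valences are independent and uniform on [0, 1] in each dimension, and a stored item is erased only when a new item exceeds it in every dimension (a guessed generalization of the 1-D rule in Figure 1). The symbols R_n(t), p(u) and u_i are introduced here for illustration; whether the result coincides with the paper's equation (9) cannot be checked from this page.

```latex
% Assumed rule: a stored item with valences (u_1, ..., u_n) is erased
% when a new item exceeds it in every dimension.
\begin{align}
  p(u)   &= \prod_{i=1}^{n} (1 - u_i)
         && \text{(probability that one new item erases it)} \\
  R_n(t) &= \int_{[0,1]^n} \bigl(1 - p(u)\bigr)^{t} \, du
         && \text{(probability of surviving $t$ new items)} \\
  R_1(t) &= \int_0^1 u^{t} \, du = \frac{1}{t+1}
         && \text{(one dimension)}
\end{align}
```

Under these assumptions the one-dimensional case already decays as 1/t at long lags, with no rate constant to tune.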

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the valence dimensions map onto measurable semantic or perceptual features, the model predicts different forgetting rates for stimuli that differ in feature overlap.
The single-parameter structure implies that power-law forgetting may appear generically whenever memories are represented in a space with finite dimensionality and similarity-based interference.
Testing the model with controlled stimulus sets that vary the effective number of dimensions could directly measure the integer parameter from data.

Load-bearing premise

Retroactive interference driven by similarity in a multi-dimensional valence measure is the dominant mechanism producing the observed power-law forgetting.

What would settle it

An experiment in which new items are constructed to have zero similarity to prior items in every valence dimension, yet power-law forgetting is still observed, would falsify the model.

Figures

Figures reproduced from arXiv: 1907.08946 by Antonios Georgiou, Mikhail Katkov, Misha Tsodyks.

**Figure 1.** Interference model of forgetting (1-D). Each item is represented as a thin vertical bar. The height of the bar corresponds to the valence of an item. The top row bars above the black line represent items that are stored in memory just before the acquisition of a new item, shown on the right (Sample). All the items that have smaller valence (bar height) than the new item are discarded from memory (crossed by…
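
The 1-D rule in this caption can be sketched as a small simulation. This is a minimal illustration, not the authors' code: it assumes valences are independent uniform draws on [0, 1] and that every new item is stored after it erases the weaker ones. The names `present` and `retention_curve_1d` are hypothetical. Under these assumptions an item survives t successors exactly when it outranks all of them, which happens with probability 1/(t + 1), so the simulated curve can be checked against that value.

```python
import random


def present(memory, index, valence):
    """Apply the 1-D rule: drop stored items with smaller valence, then store the new item."""
    kept = [(i, v) for (i, v) in memory if v >= valence]
    kept.append((index, valence))
    return kept


def retention_curve_1d(max_lag, num_runs, seed=0):
    """Estimate the probability that the first item is still stored after each lag."""
    rng = random.Random(seed)
    survived = [0] * (max_lag + 1)
    for _ in range(num_runs):
        memory = []
        for index in range(max_lag + 1):
            # Assumption: valences are i.i.d. uniform on [0, 1).
            memory = present(memory, index, rng.random())
            if memory[0][0] == 0:
                survived[index] += 1  # item 0 is still the oldest stored item
            else:
                break  # an erased item never returns
    return [count / num_runs for count in survived]


if __name__ == "__main__":
    curve = retention_curve_1d(max_lag=20, num_runs=20000, seed=1)
    for lag in (1, 4, 9, 19):
        print(lag, round(curve[lag], 3), round(1 / (lag + 1), 3))
```

Tracking only the first item is enough because, with independent identically distributed valences, every item faces the same survival odds at a given lag.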

**Figure 2.** Theoretical results. A. Theoretical retention curves (9) for different numbers of valence dimensions n. The dashed green line shows the asymptotic approximation of equation (10) for n = 5, which converges to the exact curve from T ≈ 10^4. B. The average number of retained memories accumulated as a function of elapsed time from the beginning of acquisition. where the exponent α(t) can be estimated as α(t) = d(l…

**Figure 3.** Power fit of theoretical retention curves. A-E. Theoretical retention curve computed with equation (9) for n = 5, plotted for different time windows. In the inset, the estimated value of the power α for the corresponding window is shown. F. The dependence of α on time (orange curve). The value of α very slowly approaches −1, such that even for T = 10^8 it is still about −0.8. For comparison the asymptotic est…
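
A window-by-window exponent of this kind can be sketched as follows. The sketch makes several assumptions that this page does not confirm: the local exponent is taken as the slope of log retention against log lag (the paper's own definition is truncated above), valences are independent uniform draws, and an item is erased only when a new item exceeds it in all n dimensions. The functions `retention_mc` and `local_exponents` are hypothetical names, and the retention values come from Monte Carlo averaging rather than from the paper's equation (9).

```python
import math
import random


def retention_mc(n_dims, lags, num_samples=20000, seed=0):
    """Monte Carlo retention under the assumed all-dimensions erasure rule."""
    rng = random.Random(seed)
    totals = [0.0] * len(lags)
    for _ in range(num_samples):
        # Probability that a single new item exceeds the stored item in every
        # dimension; a product of uniforms, since 1 - u is also uniform.
        p_erase = 1.0
        for _dim in range(n_dims):
            p_erase *= rng.random()
        for k, lag in enumerate(lags):
            totals[k] += (1.0 - p_erase) ** lag
    return [total / num_samples for total in totals]


def local_exponents(lags, retention):
    """Slope of log(retention) against log(lag) between consecutive lags."""
    slopes = []
    for k in range(len(lags) - 1):
        d_log_r = math.log(retention[k + 1]) - math.log(retention[k])
        d_log_t = math.log(lags[k + 1]) - math.log(lags[k])
        slopes.append(d_log_r / d_log_t)
    return slopes


if __name__ == "__main__":
    lags = [10, 100, 1000, 10000]
    for n in (1, 5):
        curve = retention_mc(n, lags, seed=n)
        print(n, [round(a, 2) for a in local_exponents(lags, curve)])
```

Estimating the slope per window, rather than fitting one line to the whole curve, is what lets a slowly drifting exponent show up at all.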

**Figure 4.** Experimental protocol. A series of vertical bars represents word presentations. Pairs of horizontal bars represent a delayed recognition task, where participants were presented with one word shown to them previously and one lure word. Participants were requested to click on the word they felt that they saw before. In total 500 words were presented and all first 25 words were queried at different moments. A…

**Figure 5.** Experimental results. A. Recognition performance for all participants who passed the qualification task (see Methods). B. Recognition performance for participants that were perfect in the 2-back task. The experimental retention curve (recognition performance vs the lag between word presentation and delayed recognition task positions, solid green curve) declines for both groups. One can observe that performan…

Original abstract

Memory and forgetting constitute two sides of the same coin, and although the first has been rigorously investigated, the latter is often overlooked. A number of experiments in psychology and experimental neuroscience have described the properties of forgetting in humans and animals, showing that forgetting exhibits a power-law relationship with time. These results indicate a counter-intuitive property of forgetting, namely that old memories are more stable than younger ones. We have devised a phenomenological model that is based on the principle of retroactive interference, driven by a multi-dimensional valence measure for acquired memories. The model has only one free integer parameter and can be solved analytically. Recognition experiments with long streams of words were performed, resulting in a good match to a five-dimensional version of the model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The model derives power-law forgetting analytically from multi-dimensional retroactive interference with one claimed free parameter, but the five-dimensional fit raises questions about whether the dimension count is independently fixed.


The paper introduces a phenomenological model of forgetting based on retroactive interference driven by a multi-dimensional valence measure for memories. It claims to have an analytical solution with only one free integer parameter, the dimension count, and reports a good match to experimental data using a five-dimensional version in recognition tasks with long word streams. This is new in framing the power-law as emerging from interference in a high-dimensional space, which explains the greater stability of older memories without invoking separate mechanisms for different time scales.

The approach is straightforward and ties directly to observed regularities in forgetting curves from psychology and neuroscience. It does well in providing a simple construction that can be solved exactly, avoiding the need for simulations to get the time dependence. The experimental setup with streams of words seems suitable for testing cumulative effects of interference.

The soft spots are around the parameter count and the lack of detail. Selecting five dimensions to achieve the match suggests that the dimension might not be fixed independently, which could mean the model has more flexibility than stated and undercuts the single-parameter claim. The abstract also skips the derivation steps, error analysis, and data exclusion criteria, so the strength of the match is hard to assess from what's given.

This is for colleagues working on models of memory and forgetting. A reader looking for new ways to account for power-law behavior would find it useful to consider, provided the full paper clarifies the points above. It shows honest engagement with the problem of explaining the power law. I recommend sending it for peer review to allow proper checking of the analytical steps and the experimental methods.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a phenomenological model of forgetting based on retroactive interference driven by a multi-dimensional valence measure. It asserts that the model admits an analytical solution, has only one free integer parameter (the dimension count D), and yields a good match to recognition data from long word streams when instantiated in five dimensions.

Significance. If the analytical derivation is valid and the single free-parameter claim holds with D fixed independently of the fitting data, the work would supply a parsimonious account linking power-law forgetting to interference in a valence space, with potential relevance to memory models in psychology and neuroscience.

major comments (2)

[Abstract] The claim that the model has 'only one free integer parameter' is load-bearing for the central parsimony argument, yet the five-dimensional version is selected to match the experimental curves; if D=5 is not fixed a priori by independent theory or stimulus properties, the model effectively incorporates an additional tunable integer parameter, directly undermining the reported analytical prediction of power-law forgetting.
[Abstract] The abstract asserts an analytical solution and good data match but supplies no derivation steps, error analysis, exclusion criteria for the recognition experiments, or explicit equations showing how retroactive interference in D dimensions produces the power-law form; this gap prevents verification that the math supports the stated claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed review. We respond point-by-point to the major comments below, clarifying the role of the parameter D and the scope of the abstract while indicating planned revisions.


Referee: [Abstract] The claim that the model has 'only one free integer parameter' is load-bearing for the central parsimony argument, yet the five-dimensional version is selected to match the experimental curves; if D=5 is not fixed a priori by independent theory or stimulus properties, the model effectively incorporates an additional tunable integer parameter, directly undermining the reported analytical prediction of power-law forgetting.

Authors: The model is formulated for arbitrary positive integer D, with the analytical derivation showing that retroactive interference in D-dimensional valence space produces power-law forgetting whose exponent depends only on D. D is the sole free integer parameter; once chosen, no additional parameters are required to generate the functional form or to compare with data. We selected D=5 as the value providing the closest match to the word-stream recognition curves, but the power-law prediction itself is independent of that specific choice. We will revise the abstract to state explicitly that D is the single free parameter, selected by fit quality, while the analytical result holds for any integer D. revision: partial

Referee: [Abstract] The abstract asserts an analytical solution and good data match but supplies no derivation steps, error analysis, exclusion criteria for the recognition experiments, or explicit equations showing how retroactive interference in D dimensions produces the power-law form; this gap prevents verification that the math supports the stated claims.

Authors: The abstract is a high-level summary. The full analytical derivation, including the explicit multi-dimensional interference equations that reduce to the observed power-law form, appears in the Methods section. Error analysis of the fits and the exclusion criteria applied to the recognition trials are reported in the Results and supplementary material. To improve accessibility, we will expand the abstract by one sentence referencing the key interference equation and directing readers to the main text for the complete derivation and experimental details. revision: yes

Circularity Check

1 step flagged

Dimensionality selection functions as fitted parameter, making 'one free integer parameter' claim and data match non-independent

specific steps

fitted input called prediction [Abstract]
"The model has only one free integer parameter and can be solved analytically. Recognition experiments with long streams of words were performed, resulting in a good match to a five-dimensional version of the model."

The single free integer parameter is the dimension, whose value (5) is selected to optimize agreement with the recognition experiment data. The 'good match' is therefore achieved by tuning the parameter to the target curves rather than predicting an independent outcome from the analytical solution.

full rationale

The abstract asserts a model with only one free integer parameter that is solved analytically and yields a good match in its five-dimensional version to recognition data. The dimension count is the integer parameter, and its value of 5 is chosen to produce the reported agreement with the experimental curves. This reduces the claimed match and parsimony to a post-hoc fit rather than an independent prediction from the derivation. The analytical solution itself may be non-circular, but the load-bearing claim of a single free parameter plus good match is undermined by this selection process.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The model rests on one free parameter (dimension count) chosen to fit data and on the domain assumption that valence similarity drives retroactive interference; the valence construct itself is introduced without independent evidence.

free parameters (1)

dimension count D
Integer parameter representing the number of valence dimensions; set to 5 to match the word-recognition data.

axioms (1)

domain assumption: New memories interfere retroactively with older ones in proportion to their similarity in valence space
Invoked as the generative principle that produces the power-law decay.

invented entities (1)

multi-dimensional valence measure (no independent evidence)
purpose: Quantifies similarity between memories to compute interference strength
New construct introduced to drive the model; no external falsifiable handle supplied in the abstract.



Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1] B. A. Richards, P. W. Frankland, The Persistence and Transience of Memory (2017). doi:10.1016/j.neuron.2017.04.037

[2] H. Ebbinghaus, Memory: A contribution to experimental psychology, Dover, New York, 1964

[3] J. Wixted, Analyzing the empirical course of forgetting, Journal of Experimental Psychology: Learning, Memory, and Cognition 16 (1990) 927–935. doi:10.1037/0278-7393.16.5.927

[4] J. T. Wixted, E. B. Ebbesen, On the form of forgetting, Psychological Science 2 (6) (1991) 409–415. doi:10.1111/j.1467-9280.1991.tb00175.x

[5] M. J. Kahana, M. Adler, Note on the power law of forgetting, bioRxiv (2017). doi:10.1101/173765. URL https://www.biorxiv.org/content/early/2017/08/09/173765

[6] J. S. Fisher, G. Radvansky, Patterns of forgetting, Journal of Memory and Language 102 (2018) 130–141. doi:10.1016/j.jml.2018.05.008

[7] J. P. Nadal, G. Toulouse, J. P. Changeux, S. Dehaene, Networks of formal neurons and memory palimpsests, Epl 1 (10) (1986) 535–542. doi:10.1209/0295-5075/1/10/008

[8] J. T. Wixted, The psychology and neuroscience of forgetting, Annu. Rev. Psychol. 55 (2004) 235–69. doi:10.1146/annurev.psych.55.090902.141555

[9] J. T. Wixted, On common ground: Jost’s (1897) law of forgetting and Ribot’s (1881) law of retrograde amnesia (2004). doi:10.1037/0033-295X.111.4.864

[10] G. D. A. Brown, S. Lewandowsky, Forgetting in memory models: Arguments against trace decay and consolidation failure, Forgetting (2010) 49–75. doi:10.4324/9780203851647

[11] T. K. Landauer, How much do people remember? Some estimates of the quantity of learned information in long-term memory, Cognitive Science 10 (4) (1986) 477–493. doi:10.1016/S0364-0213(86)80014-3. URL https://www.sciencedirect.com/science/article/pii/S0364021386800143

[12] L. Standing, Learning 10 000 pictures, Quarterly Journal of Experimental Psychology 25 (1973) 207–222

[13] N. Cowan, C. C. Morey, Z. Chen, M. Bunting, What do estimates of working memory capacity tell us?, in: The Cognitive Neuroscience of Working Memory, Oxford University Press, 2007, pp. 43–58. doi:10.1093/acprof:oso/9780198570394.003.0003. URL http://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780198570394.001.0001/acprof-9780198570394-chapter-3

[14] A. G. Huth, W. A. de Heer, T. L. Griffiths, F. E. Theunissen, J. L. Gallant, Natural speech reveals the semantic maps that tile human cerebral cortex, Nature 532 (7600) (2016) 453–458. doi:10.1038/nature17637. URL http://www.nature.com/doifinder/10.1038/nature17637

[15] M. K. Healey, P. Crutchley, M. J. Kahana, Individual differences in memory search and their relation to intelligence, Journal of Experimental Psychology: General 143 (4) (2014) 1553–1569. doi:10.1037/a0036306

[16] D. A. Medler, J. R. Binder, MCWord: An on-line orthographic database of the English language (2005).

Appendix A. Solution of Kahana model. We analyze the version of the Kahana model [5] with linear decay of memory strength, S(t) = a − bt, with positive random coefficients a and b. Other types of passive decay produce similar res...
