Learning by training: emergent return-point memory from cyclically tuning disordered sphere packings
Pith reviewed 2026-05-18 20:00 UTC · model grok-4.3
The pith
Cyclically tuned sphere packings evolve toward a marginally absorbing manifold that remembers the training range via return-point memory.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Athermal disordered sphere packings subjected to cyclic inverse design evolve toward a marginally absorbing manifold. This manifold encodes memory of the training range and produces return-point memory that matches observations in other cyclically driven systems. The mechanism rests on gradient discontinuities in the trained elastic quantities, which the authors propose as a general route to such manifolds and their associated memory.
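The cyclic protocol named in the core claim can be sketched on a toy problem. This is a minimal illustration, not the paper's sphere-packing model: `trained_quantity` is a hypothetical two-parameter function whose gradient jumps along the line theta[0] == theta[1], and the learning rate, tolerance, and target range are arbitrary assumptions. The sketch only shows the mechanics of repeatedly tuning a non-differentiable quantity between the two ends of a target range; it makes no claim about whether a marginally absorbing manifold appears in this toy.

```python
# Toy illustration only: NOT the paper's sphere-packing model.
# `trained_quantity` is a hypothetical stand-in with a gradient discontinuity
# along the line theta[0] == theta[1].

def trained_quantity(theta):
    """Piecewise-linear quantity: the max of two linear functions."""
    a = theta[0] + 0.5 * theta[1]
    b = 0.5 * theta[0] + theta[1]
    return max(a, b)

def gradient(theta):
    """Gradient of the active branch; it jumps where the branches cross."""
    a = theta[0] + 0.5 * theta[1]
    b = 0.5 * theta[0] + theta[1]
    return (1.0, 0.5) if a >= b else (0.5, 1.0)

def train_to_target(theta, target, lr=0.1, tol=1e-8, max_steps=1000):
    """Gradient descent on 0.5 * (f(theta) - target)**2; returns (theta, steps)."""
    theta = list(theta)
    for step in range(max_steps):
        error = trained_quantity(theta) - target
        if abs(error) < tol:
            return theta, step
        g = gradient(theta)
        theta[0] -= lr * error * g[0]
        theta[1] -= lr * error * g[1]
    return theta, max_steps

def cyclic_training(theta, target_min, target_max, n_cycles):
    """Alternate between the two ends of the target range, logging each half-cycle."""
    history = []
    for _ in range(n_cycles):
        for target in (target_max, target_min):
            theta, steps = train_to_target(theta, target)
            history.append((target, tuple(theta), steps))
    return theta, history

theta_final, history = cyclic_training(
    [0.3, -0.2], target_min=0.0, target_max=1.0, n_cycles=5
)
```

Each entry of `history` records the target, the parameters reached, and the number of descent steps used, so parameter drift between successive cycles can be inspected directly.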
What carries the argument
The marginally absorbing manifold (MAM), a structure in configuration space that absorbs cyclic training trajectories and encodes the range of elastic targets through return-point memory.
If this is right
- Trained packings retain information about the full range of past target properties through their configuration on the manifold.
- Memory formation occurs automatically from the training process without requiring explicit encoding steps.
- The same gradient-discontinuity mechanism can generate analogous memory in other adaptive physical systems.
- Design of materials that adapt under repeated loading can exploit this manifold structure to store history.
Where Pith is reading between the lines
- The mechanism may apply to living systems that adapt to cyclic environmental stresses, such as cells or tissues under repeated mechanical loading.
- Testing the model with different particle interaction potentials would check whether the gradient-discontinuity route remains dominant.
- The framework could connect to machine-learning settings where models are trained on data drawn from varying distributions.
Load-bearing premise
Gradient discontinuities in the trained elastic quantities are both necessary and sufficient to produce the marginally absorbing manifold and its return-point memory, and this mechanism applies beyond the specific sphere-packing model.
What would settle it
A concrete counterexample would be a system of cyclically tuned packings that develops return-point memory while showing no gradient discontinuities in the trained elastic quantities.
Original abstract
Many living and artificial systems improve their fitness or performance by adapting to changing environments or diverse training data. However, it remains unclear how such environmental variation influences adaptation, what is learned in the process, and whether memory of past conditions is retained. In this work, we investigate these questions using athermal disordered systems that are subject to cyclic inverse design, enabling them to attain target elastic properties spanning a chosen range. We demonstrate that such systems evolve toward a marginally absorbing manifold (MAM), which encodes memory of the training range that closely resembles return-point memory observed in cyclically driven systems. We further propose a general mechanism for the formation of MAMs and the corresponding memory that is based on gradient discontinuities in the trained quantities. Our model provides a simple and broadly applicable physical framework for understanding how adaptive systems learn under environmental change and how they retain memory of past experiences.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript studies athermal disordered sphere packings subjected to cyclic inverse design that targets elastic properties over a chosen range. It reports that the packings evolve toward a marginally absorbing manifold (MAM) whose geometry encodes memory of the training range in a manner that closely resembles return-point memory. The authors propose that gradient discontinuities in the trained elastic quantities provide a general mechanism for both MAM formation and the emergence of this memory.
Significance. If the central claims hold, the work supplies a concrete, minimal physical model for how adaptive systems acquire and retain memory of past conditions under environmental variation. The link between cyclic training, gradient discontinuities, and return-point memory is conceptually interesting and could inform broader studies of learning in physical and biological systems. The sphere-packing inverse-design protocol is a reasonable choice for an athermal, disordered setting, and the introduction of the MAM as an emergent structure is a useful organizing idea.
major comments (2)
- [§4 (proposed general mechanism)] The assertion that gradient discontinuities in the trained quantities are both necessary and sufficient for MAM formation and return-point memory is not tested by any control in which the discontinuities are removed or smoothed (for example by replacing the elastic response with a differentiable surrogate) while the cyclic inverse-design protocol is held fixed. Without such a control, it remains possible that the MAM and its memory properties arise from the geometry of configuration space or the form of the inverse-design objective rather than from the non-differentiable points.
- [Results on evolution to the MAM] The claim that the systems evolve toward the MAM and encode memory of the training range is presented without quantitative support such as a distance-to-manifold metric tracked over training cycles, convergence statistics across independent realizations, or error bars. This absence makes it difficult to assess how robust or complete the evolution is.
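The smoothing control suggested in the first major comment can be illustrated generically. The sketch below is an assumption-laden stand-in, not the paper's elastic response: `hard_ramp` is a hypothetical quantity whose slope jumps at x = 0, and `soft_ramp` is a softplus surrogate with a sharpness parameter `beta` (a name introduced here). It shows that the surrogate has a continuous slope everywhere and approaches the kinked function as `beta` grows, with a maximum gap of log(2)/beta at the kink.

```python
import math

def hard_ramp(x):
    """Stand-in for a response with a kink: slope jumps from 0 to 1 at x = 0."""
    return max(0.0, x)

def soft_ramp(x, beta):
    """Softplus surrogate log(1 + exp(beta * x)) / beta, in a numerically stable form."""
    return max(0.0, x) + math.log1p(math.exp(-abs(beta * x))) / beta

def soft_ramp_slope(x, beta):
    """Derivative of the surrogate: a logistic sigmoid, continuous everywhere."""
    z = beta * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def max_gap(beta, xs):
    """Largest difference between surrogate and kinked function on the grid xs."""
    return max(abs(soft_ramp(x, beta) - hard_ramp(x)) for x in xs)

# Grid that includes the kink at x = 0.
xs = [i / 100.0 for i in range(-300, 301)]
for beta in (1.0, 10.0, 100.0):
    print(beta, max_gap(beta, xs), soft_ramp_slope(0.0, beta))
```

In a control of this kind, `beta` would be swept while the cyclic protocol is held fixed, so that any change in the memory signatures can be attributed to the smoothing alone.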
minor comments (2)
- [Methods / early Results] The definition of the MAM would be clearer if accompanied by an explicit mathematical characterization (e.g., a condition on the Hessian or on the set of admissible strains) rather than a purely descriptive statement.
- [Figure captions] Figure captions should explicitly state the number of independent realizations used for averaging and the precise definition of any shaded regions or error bars.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive comments on our manuscript. We are encouraged by the positive assessment of the significance and the usefulness of the MAM concept. We address each of the major comments below.
- Referee: §4 (proposed general mechanism): The assertion that gradient discontinuities in the trained quantities are both necessary and sufficient for MAM formation and return-point memory is not tested by any control in which the discontinuities are removed or smoothed (for example by replacing the elastic response with a differentiable surrogate) while the cyclic inverse-design protocol is held fixed. Without such a control, it remains possible that the MAM and its memory properties arise from the geometry of configuration space or the form of the inverse-design objective rather than from the non-differentiable points.
  Authors: We agree that an explicit control experiment would provide stronger support for the proposed mechanism. However, in the context of athermal sphere packings, the gradient discontinuities stem directly from the discrete nature of contact formation and breaking, which is fundamental to the system's response. Implementing a fully differentiable surrogate while maintaining the inverse-design protocol and athermal conditions is challenging and would likely require significant modifications to the physical model. In the revised manuscript, we will add a dedicated paragraph in §4 discussing this point, including why such a control is difficult to implement without changing the essence of the system, and we will present additional supporting analysis from our existing data that links the memory properties specifically to the observed discontinuities. We believe this will clarify the scope of our claims.
  Revision: partial
- Referee: Results on evolution to the MAM: The claim that the systems evolve toward the MAM and encode memory of the training range is presented without quantitative support such as a distance-to-manifold metric tracked over training cycles, convergence statistics across independent realizations, or error bars. This absence makes it difficult to assess how robust or complete the evolution is.
  Authors: We appreciate this feedback. While the manuscript includes several figures illustrating the evolution and memory encoding, we acknowledge that quantitative metrics would improve the presentation. In the revised manuscript, we will include new quantitative analyses: specifically, we will track and plot the distance to the MAM over the course of training cycles for multiple realizations, include error bars representing standard deviations across independent packings, and provide statistics on convergence rates. These additions will allow readers to better evaluate the robustness of the reported behavior.
  Revision: yes
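The quantitative analysis discussed in the second exchange could be organized roughly as follows. This is a sketch under stated assumptions: the distance-to-MAM metric itself is not defined in the material above, so `distances[r][c]` is a hypothetical array holding whatever distance is chosen for realization r after training cycle c, and the data here are synthetic placeholders rather than results from the paper. The convergence-rate estimate additionally assumes roughly exponential decay.

```python
import math
import statistics

def per_cycle_stats(distances):
    """distances[r][c]: distance for realization r after cycle c.
    Returns a list of (mean, sample standard deviation), one pair per cycle."""
    n_cycles = len(distances[0])
    stats = []
    for c in range(n_cycles):
        column = [run[c] for run in distances]
        stats.append((statistics.mean(column), statistics.stdev(column)))
    return stats

def decay_rate(means):
    """Least-squares slope of log(mean) versus cycle index, sign-flipped.
    Assumes roughly exponential decay and strictly positive means."""
    xs = list(range(len(means)))
    ys = [math.log(m) for m in means]
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return -num / den

# Synthetic placeholder data (not from the paper):
# three realizations with exact exponential decay at rate 0.5 per cycle.
amplitudes = [1.0, 1.5, 0.8]
distances = [[a * math.exp(-0.5 * c) for c in range(10)] for a in amplitudes]

stats = per_cycle_stats(distances)           # (mean, std) for error bars
rate = decay_rate([mean for mean, _ in stats])
```

The per-cycle standard deviations supply the error bars, and the fitted rate gives one simple convergence statistic; other summaries (for example, the fraction of realizations below a threshold distance) would slot into the same structure.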
Circularity Check
No circularity: MAM emerges from training dynamics without self-referential definition or fitted prediction
The paper presents the marginally absorbing manifold (MAM) as an emergent outcome of cyclic inverse design applied to athermal disordered sphere packings, with memory of the training range arising from the adaptation process rather than being presupposed in the definition. The proposed mechanism based on gradient discontinuities is offered as a general explanation derived from observed model behavior, not as a tautological fit or self-citation that reduces the central result to its inputs. No equations or derivations in the abstract or described claims show a 'prediction' that is statistically forced by construction from fitted parameters, nor does the argument rely on load-bearing self-citations or imported uniqueness theorems. The derivation remains self-contained, with the resemblance to return-point memory serving as an external analogy rather than an internal circular loop.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: athermal disordered sphere packings can be subjected to cyclic inverse design that tunes their elastic properties across a chosen range.
invented entities (1)
- marginally absorbing manifold (MAM): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We further propose a general mechanism for the formation of MAMs and the corresponding memory that is based on gradient discontinuities in the trained quantities."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.