pith. machine review for the scientific record.

arxiv: 2604.05042 · v1 · submitted 2026-04-06 · 💻 cs.LG · cond-mat.dis-nn · cs.SY · eess.SY · math.DS · q-bio.NC

Recognition: no theorem link

Energy-Based Dynamical Models for Neurocomputation, Learning, and Optimization

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 19:18 UTC · model grok-4.3

classification 💻 cs.LG · cond-mat.dis-nn · cs.SY · eess.SY · math.DS · q-bio.NC
keywords energy-based models · dynamical systems · neurocomputation · Hopfield networks · associative memory · oscillator networks · proximal descent · control theory

The pith

Energy-based dynamical models guided by control theory advance neurocomputation beyond feedforward and backpropagation methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The tutorial reviews how energy-based dynamical models perform computation through gradient flows on energy landscapes, starting from classical continuous-time Hopfield networks and Boltzmann machines. It then extends the approach to dense associative memory for high-capacity storage, oscillator-based networks for optimization, and proximal-descent dynamics for constrained problems. Control theory supplies the design principles that steer these systems toward improved scalability, robustness, and energy efficiency while bridging artificial and biological computation. A reader would care because the framework offers conceptual alternatives to dominant AI training techniques for tasks including model learning, memory retrieval, data-driven control, and optimization.
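
To make the gradient-flow picture concrete, here is a minimal sketch in the spirit of the continuous-time Hopfield networks the summary starts from: Hebbian outer-product weights, graded dynamics du/dt = -u + W tanh(u), and retrieval of a stored pattern from a corrupted cue. The network size, step size, and number of patterns are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Store a few random binary (+/-1) patterns with a Hebbian outer-product rule.
N, P = 100, 3
patterns = rng.choice([-1.0, 1.0], size=(P, N))
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)          # no self-coupling

def energy(v):
    """Interaction term of the Hopfield energy (the full Lyapunov function
    also includes a neuron-local integral term)."""
    return -0.5 * v @ W @ v

# Continuous-time graded dynamics  du/dt = -u + W tanh(u),  v = tanh(u),
# integrated with a forward-Euler step (step size is an illustrative choice).
def retrieve(cue, dt=0.05, steps=2000):
    u = cue.copy()
    for _ in range(steps):
        v = np.tanh(u)
        u += dt * (-u + W @ v)
    return np.sign(np.tanh(u))

# Corrupt 15% of the first stored pattern and let the gradient flow clean it up.
cue = patterns[0].copy()
flip = rng.choice(N, size=15, replace=False)
cue[flip] *= -1.0
recalled = retrieve(cue)
print("overlap with stored pattern:", (recalled @ patterns[0]) / N)
print("energy of cue vs. recalled :", energy(cue), energy(recalled))
```

Replacing the quadratic interaction with a steeper separation function turns this into the dense associative memory variant the tutorial covers, which is what raises storage capacity.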

Core claim

Energy-based dynamical models encode information through gradient flows and energy landscapes, and control-theoretic principles can guide their design to perform learning, memory retrieval, data-driven control, and optimization with better scalability, robustness, and energy efficiency than conventional feedforward and backpropagation approaches.

What carries the argument

Energy-based dynamical models that encode information through gradient flows and energy landscapes, extended by control-theoretic design principles to modern neurocomputing tasks.

Load-bearing premise

The energy-based extensions will deliver measurable gains in scalability, robustness, and energy efficiency when deployed in real systems.

What would settle it

A side-by-side benchmark on a standard large-scale optimization or learning task in which an oscillator-based or proximal-descent network uses more energy or converges more slowly than a conventional method would undermine the claimed practical advantages.
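
For readers unfamiliar with what an oscillator-based network looks like in such a benchmark, the sketch below is a generic oscillator Ising machine of the kind the pith describes: phase oscillators whose gradient dynamics descend an Ising-like energy to find a max-cut of a small random graph. The graph, gains, and step size are illustrative assumptions and do not reproduce the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative oscillator Ising machine for max-cut on a small random graph.
# Gradient descent on the energy
#   E(theta) = sum_{i<j} w_ij cos(theta_i - theta_j) - (Ks/2) sum_i cos(2*theta_i)
# drives connected oscillators toward anti-phase while the second
# (injection-locking) term binarizes phases toward {0, pi}.
n = 20
W = (rng.random((n, n)) < 0.3).astype(float)
W = np.triu(W, 1)
W = W + W.T                                    # symmetric 0/1 adjacency

theta = rng.uniform(0.0, 2 * np.pi, size=n)
Ks, dt = 1.0, 0.02
for _ in range(5000):
    diff = theta[:, None] - theta[None, :]
    dtheta = (W * np.sin(diff)).sum(axis=1) - Ks * np.sin(2 * theta)
    theta += dt * dtheta

spins = np.where(np.cos(theta) >= 0.0, 1, -1)          # read out binarized phases
cut = np.sum(W * (1 - np.outer(spins, spins))) / 4     # edges with opposite spins
print("cut value found:", int(cut), "of", int(W.sum() / 2), "edges")
```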

Figures

Figures reproduced from arXiv: 2604.05042 by Adilson E. Motter, Arthur N. Montanari, Dmitry Krotov, Francesco Bullo.

Figure 1. Dynamical systems used in neurocomputation, machine learning, … [image not reproduced]
Figure 2. Continuous-time Hopfield neural network for associative memory. (a) Neural network, where nodes represent individual neurons … [image not reproduced]
Figure 3. Boltzmann machines for generative modeling and sampling. [image not reproduced]
Figure 4. Oscillatory associative memory models. (a) Binary memory patterns are encoded as phase-locked configurations, which form stable equilibria due … [image not reproduced]
Figure 5. Positive competitive network implementing sparse signal reconstruction … [image not reproduced]
Figure 6. Ek-I network implementing winner-take-all competition. A set of k excitatory neurons project to a shared inhibitory interneuron, which in turn inhibits all excitatory neurons. Under the monostability condition w_EE < 1 and the functionality condition w_IE ≥ 1 + w_II, the network selects the neuron receiving the strongest input while suppressing all others. [image not reproduced]
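
The conditions quoted in the Figure 6 caption can be illustrated with a small firing-rate simulation. The sketch below is one plausible instantiation of the Ek-I architecture, not the paper's exact equations: excitatory neurons with self-gain w_EE compete through a shared inhibitory interneuron with gain w_IE and self-inhibition w_II, with parameters chosen to satisfy w_EE < 1 and w_IE ≥ 1 + w_II.

```python
import numpy as np

# One plausible firing-rate realization of the Ek-I winner-take-all circuit in
# Figure 6.  Equations and parameter values are illustrative guesses, chosen to
# respect the caption's conditions w_EE < 1 and w_IE >= 1 + w_II.
relu = lambda z: np.maximum(z, 0.0)

w_EE, w_EI, w_IE, w_II = 0.9, 1.0, 2.0, 0.5   # w_EE < 1, w_IE >= 1 + w_II
u = np.array([1.0, 0.8, 0.6])                 # external drives; neuron 0 should win

x = np.zeros_like(u)   # excitatory rates
y = 0.0                # shared inhibitory interneuron rate
dt = 0.01
for _ in range(20000):
    dx = -x + relu(u + w_EE * x - w_EI * y)
    dy = -y + relu(w_IE * x.sum() - w_II * y)
    x, y = x + dt * dx, y + dt * dy

print("steady-state excitatory rates:", np.round(x, 3))  # only the first entry stays active
```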
Original abstract

Recent advances at the intersection of control theory, neuroscience, and machine learning have revealed novel mechanisms by which dynamical systems perform computation. These advances encompass a wide range of conceptual, mathematical, and computational ideas, with applications for model learning and training, memory retrieval, data-driven control, and optimization. This tutorial focuses on neuro-inspired approaches to computation that aim to improve scalability, robustness, and energy efficiency across such tasks, bridging the gap between artificial and biological systems. Particular emphasis is placed on energy-based dynamical models that encode information through gradient flows and energy landscapes. We begin by reviewing classical formulations, such as continuous-time Hopfield networks and Boltzmann machines, and then extend the framework to modern developments. These include dense associative memory models for high-capacity storage, oscillator-based networks for large-scale optimization, and proximal-descent dynamics for composite and constrained reconstruction. The tutorial demonstrates how control-theoretic principles can guide the design of next-generation neurocomputing systems, steering the discussion beyond conventional feedforward and backpropagation-based approaches to artificial intelligence.
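
As an illustration of the "proximal-descent dynamics for composite and constrained reconstruction" mentioned in the abstract, the following sketch runs the standard proximal gradient flow dx/dt = -x + prox_{gamma*g}(x - gamma*grad f(x)) on an l1-regularized least-squares problem (sparse signal reconstruction). Problem sizes, the regularization weight, and step sizes are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sparse reconstruction: recover a sparse x* from b = A x* + noise by minimizing
# f(x) + g(x) with f = 0.5*||Ax - b||^2 (smooth) and g = lam*||x||_1.
m, n, k = 40, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true + 0.01 * rng.standard_normal(m)
lam, gamma = 0.05, 0.1

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Proximal gradient flow  dx/dt = -x + prox_{gamma*g}(x - gamma * grad f(x)),
# the continuous-time analogue of ISTA, integrated with forward Euler.
x = np.zeros(n)
dt = 0.2
for _ in range(5000):
    grad_f = A.T @ (A @ x - b)
    x += dt * (-x + soft_threshold(x - gamma * grad_f, gamma * lam))

print("relative reconstruction error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
print("nonzero entries recovered    :", np.count_nonzero(np.abs(x) > 1e-3))
```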

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript is a tutorial reviewing classical energy-based models such as continuous-time Hopfield networks and Boltzmann machines, then extending the framework to modern developments including dense associative memory models for high-capacity storage, oscillator-based networks for large-scale optimization, and proximal-descent dynamics for composite and constrained reconstruction. It frames these using control-theoretic principles to guide the design of neuro-inspired computing systems, with the goal of moving beyond conventional feedforward networks and backpropagation-based AI for tasks in model learning, memory retrieval, data-driven control, and optimization.

Significance. If the synthesis holds, the tutorial provides a coherent conceptual bridge between control theory, neuroscience, and machine learning that could help researchers reframe computation in terms of energy landscapes and gradient flows. Its integrative review of classical and modern models offers guidance for designing dynamical systems with potential advantages in scalability, robustness, and energy efficiency, though the manuscript itself contains no new theorems, derivations, or empirical benchmarks to substantiate performance gains.

minor comments (3)
  1. Abstract: the phrasing that these models 'aim to improve scalability, robustness, and energy efficiency' is presented as established motivation without any benchmarks, comparisons, or citations to supporting empirical work; consider qualifying it explicitly as a hypothesis or direction for future investigation.
  2. The tutorial would benefit from a dedicated section or subsection that explicitly contrasts the control-theoretic framing with standard backpropagation approaches, including a table or bullet list of key differences in training dynamics and stability properties.
  3. References: several classical citations (e.g., to Hopfield 1982 and Boltzmann machine literature) are appropriate, but the manuscript should add a small number of recent surveys or empirical papers on oscillator networks and proximal methods to strengthen the bridge to current practice.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript as a tutorial synthesizing energy-based dynamical models and for recommending minor revision. We appreciate the recognition that the work aims to provide a conceptual bridge between control theory, neuroscience, and machine learning.

Circularity Check

0 steps flagged

No significant circularity in this tutorial review

full rationale

The paper is a tutorial that reviews classical energy-based models (Hopfield networks, Boltzmann machines) and describes conceptual extensions (dense associative memory, oscillator networks, proximal dynamics) framed by control theory. It presents no new derivations, fitted parameters, or predictions that could reduce to self-inputs or self-citations. The central claim is guidance for design rather than a theorem or result derived from its own equations. As a review without load-bearing derivations, it makes no claims that feed back into their own support, and it carries no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a tutorial paper with no new mathematical claims, free parameters, axioms, or invented entities introduced by the authors.

pith-pipeline@v0.9.0 · 5508 in / 1060 out tokens · 41709 ms · 2026-05-10T19:18:58.720859+00:00 · methodology


Reference graph

Works this paper leans on

151 extracted references · 151 canonical work pages · 2 internal anchors

  1. [1]

    Deep learning,

    Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015

  2. [2]

    Deep Learning

    Y. Bengio, I. Goodfellow, and A. Courville, Deep Learning. MIT Press, 2017

  3. [3]

    The mythos of model interpretability,

    Z. C. Lipton, “The mythos of model interpretability,” in ICML Workshop on Human Interpretability in Machine Learning, 2016

  4. [4]

    Concrete Problems in AI Safety

    D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, “Concrete problems in AI safety,” arXiv:1606.06565, 2016

  5. [5]

    AI safety for everyone,

    B. Gyevnar and A. Kasirzadeh, “AI safety for everyone,” Nature Machine Intelligence, vol. 7, no. 4, pp. 531–542, 2025

  6. [6]

    Trends in AI inference energy consumption: Beyond the performance-vs-parameter laws of deep learning,

    R. Desislavov, F. Martínez-Plumed, and J. Hernández-Orallo, “Trends in AI inference energy consumption: Beyond the performance-vs-parameter laws of deep learning,” Sustainable Computing: Informatics and Systems, vol. 38, p. 100857, 2023

  7. [7]

    Backpropagation and the brain,

    T. P. Lillicrap, A. Santoro, L. Marris, C. J. Akerman, and G. Hinton, “Backpropagation and the brain,” Nature Reviews Neuroscience, vol. 21, no. 6, pp. 335–346, 2020

  8. [8]

    Neural networks and physical systems with emergent collective computational abilities

    J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proceedings of the National Academy of Sciences of the U.S.A., vol. 79, pp. 2554–2558, 1982

  9. [9]

    Neurons with graded response have collective computational properties like those of two-state neurons

    ——, “Neurons with graded response have collective computational properties like those of two-state neurons,” Proceedings of the National Academy of Sciences of the U.S.A., vol. 81, no. 10, pp. 3088–3092, 1984

  10. [10]

    Neuronal dynamics: From single neurons to networks and models of cognition

    W. Gerstner, W. M. Kistler, R. Naud, and L. Paninski, Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press, 2014

  11. [11]

    Adaptive pattern classification and universal recoding: I. parallel development and coding of neural feature detectors,

    S. Grossberg, “Adaptive pattern classification and universal recoding: I. parallel development and coding of neural feature detectors,” Biological Cybernetics, vol. 23, no. 3, pp. 121–134, 1976

  12. [12]

    Absolute stability of global pat- tern formation and parallel memory storage by competitive neural networks,

    M. A. Cohen and S. Grossberg, “Absolute stability of global pat- tern formation and parallel memory storage by competitive neural networks,”IEEE Transactions on Systems, Man, and Cybernetics, no. 5, pp. 815–826, 1983

  13. [13]

    Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects,

    R. P. Rao and D. H. Ballard, “Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects,”Nature Neuroscience, vol. 2, no. 1, pp. 79–87, 1999

  14. [14]

    Does predictive coding have a future?

    K. Friston, “Does predictive coding have a future?”Nature Neuro- science, vol. 21, no. 8, pp. 1019–1021, 2018

  15. [15]

    The free-energy principle: a unified brain theory?

    ——, “The free-energy principle: a unified brain theory?”Nature Reviews Neuroscience, vol. 11, no. 2, pp. 127–138, 2010

  16. [16]

    Optimal perceptual inference,

    G. E. Hinton and T. J. Sejnowski, “Optimal perceptual inference,” in IEEE Conference on Computer Vision and Pattern Recognition, vol. 448, 1983, pp. 448–453

  17. [17]

    A learning algorithm for Boltzmann machines,

    D. H. Ackley, G. E. Hinton, and T. J. Sejnowski, “A learning algorithm for Boltzmann machines,”Cognitive Science, vol. 9, no. 1, pp. 147–169, 1985

  18. [18]

    A tutorial on energy-based learning,

    Y . LeCun, S. Chopra, R. Hadsell, M. Ranzato, and F. Huang, “A tutorial on energy-based learning,”Predicting Structured Data, 2006

  19. [19]

    Your classifier is secretly an energy based model and you should treat it like one,

    W. Grathwohl, K.-C. Wang, J.-H. Jacobsen, D. Duvenaud, M. Norouzi, and K. Swersky, “Your classifier is secretly an energy based model and you should treat it like one,” inInternational Conference on Learning Representations, 2019

  20. [20]

    Generalized energy based models,

    M. Arbel, L. Zhou, and A. Gretton, “Generalized energy based models,” inInternational Conference on Learning Representations, 2021

  21. [21]

    Physical considera- tions in memory and information storage,

    M. Du, A. K. Behera, and S. Vaikuntanathan, “Physical considera- tions in memory and information storage,”Annual Review of Physical Chemistry, vol. 76, no. 1, pp. 471–495, 2025

  22. [22]

    A coherent Ising machine for 2000-node optimization problems,

    T. Inagaki, Y. Haribara, K. Igarashi, T. Sonobe, S. Tamate, T. Honjo, A. Marandi, P. L. McMahon, T. Umeki, K. Enbutsu et al., “A coherent Ising machine for 2000-node optimization problems,” Science, vol. 354, no. 6312, pp. 603–606, 2016

  23. [23]

    Combinatorial optimiza- tion by simulating adiabatic bifurcations in nonlinear Hamiltonian systems,

    H. Goto, K. Tatsumura, and A. R. Dixon, “Combinatorial optimiza- tion by simulating adiabatic bifurcations in nonlinear Hamiltonian systems,”Science Advances, vol. 5, no. 4, p. eaav2372, 2019

  24. [24]

    Ising machines as hardware solvers of combinatorial optimization problems,

    N. Mohseni, P. L. McMahon, and T. Byrnes, “Ising machines as hardware solvers of combinatorial optimization problems,”Nature Reviews Physics, vol. 4, no. 6, pp. 363–379, 2022

  25. [25]

    OIM: Oscillator-based Ising ma- chines for solving combinatorial optimisation problems,

    T. Wang and J. Roychowdhury, “OIM: Oscillator-based Ising ma- chines for solving combinatorial optimisation problems,” inInter- national Conference on Unconventional Computation and Natural Computation, 2019, pp. 232–256

  26. [26]

    Using synchronized oscillators to compute the maximum independent set,

    A. Mallick, M. K. Bashar, D. S. Truesdell, B. H. Calhoun, S. Joshi, and N. Shukla, “Using synchronized oscillators to compute the maximum independent set,”Nature Communications, vol. 11, p. 4689, 2020

  27. [27]

    Training coupled phase oscillators as a neuromorphic platform using equilibrium propaga- tion,

    Q. Wang, C. C. Wanjura, and F. Marquardt, “Training coupled phase oscillators as a neuromorphic platform using equilibrium propaga- tion,”Neuromorphic Computing and Engineering, vol. 4, no. 3, p. 034014, 2024

  28. [28]

    Supervised learning in physical networks: From machine learning to learning machines,

    M. Stern, D. Hexner, J. W. Rocks, and A. J. Liu, “Supervised learning in physical networks: From machine learning to learning machines,” Physical Review X, vol. 11, no. 2, p. 021045, 2021

  29. [29]

    Physical networks become what they learn,

    M. Stern, M. Guzman, F. Martins, A. J. Liu, and V . Balasubrama- nian, “Physical networks become what they learn,”Physical Review Letters, vol. 134, no. 14, p. 147402, 2025

  30. [30]

    Dense associative memory for pattern recognition,

    D. Krotov and J. J. Hopfield, “Dense associative memory for pattern recognition,” inAdvances in Neural Information Processing Systems, vol. 29, 2016

  31. [31]

    Dense associative memory is robust to adversarial inputs,

    D. Krotov and J. Hopfield, “Dense associative memory is robust to adversarial inputs,”Neural Computation, vol. 30, no. 12, pp. 3151– 3167, 2018

  32. [32]

    Capacity of oscillatory associative-memory networks with error-free retrieval,

    T. Nishikawa, Y .-C. Lai, and F. C. Hoppensteadt, “Capacity of oscillatory associative-memory networks with error-free retrieval,” Physical Review Letters, vol. 92, no. 10, p. 108101, 2004

  33. [33]

    Global optimization through heterogeneous oscillator Ising ma- chines,

    A. Allibhoy, A. N. Montanari, F. Pasqualetti, and A. E. Motter, “Global optimization through heterogeneous oscillator Ising ma- chines,” inIEEE Conference on Decision and Control, 2025, pp. 998–1005

  34. [34]

    Proximal gradient flow and Douglas-Rachford splitting dynamics: Global exponential stability via integral quadratic constraints,

    S. Hassan-Moghaddam and M. R. Jovanović, “Proximal gradient flow and Douglas-Rachford splitting dynamics: Global exponential stability via integral quadratic constraints,” Automatica, vol. 123, p. 109311, 2021

  35. [35]

    Proximal gradient dynam- ics: Monotonicity, exponential convergence, and applications,

    A. Gokhale, A. Davydov, and F. Bullo, “Proximal gradient dynam- ics: Monotonicity, exponential convergence, and applications,”IEEE Control Systems Letters, vol. 8, pp. 2853–2858, 2024

  36. [36]

    A review of recurrent neural net- works: LSTM cells and network architectures,

    Y . Yu, X. Si, C. Hu, and J. Zhang, “A review of recurrent neural net- works: LSTM cells and network architectures,”Neural Computation, vol. 31, no. 7, pp. 1235–1270, 2019

  37. [37]

    Hopfield Networks is All You Need

    H. Ramsauer, B. Schäfl, J. Lehner, P. Seidl, M. Widrich, T. Adler, L. Gruber, M. Holzleitner, M. Pavlović, G. K. Sandve et al., “Hopfield networks is all you need,” arXiv:2008.02217, 2020

  38. [38]

    Beyond exploding and vanishing gradients: Analysing RNN training using attractors and smoothness,

    A. H. Ribeiro, K. Tiels, L. A. Aguirre, and T. Sch ¨on, “Beyond exploding and vanishing gradients: Analysing RNN training using attractors and smoothness,” inInternational Conference on Artificial Intelligence and Statistics, 2020, pp. 2370–2380

  39. [39]

    Long short-term memory,

    S. Hochreiter and J. Schmidhuber, “Long short-term memory,”Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997

  40. [40]

    Learning phrase representations using RNN encoder–decoder for statistical machine translation,

    K. Cho, B. Van Merriënboer, Ç. Gulçehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder–decoder for statistical machine translation,” in Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1724–1734

  41. [41]

    Neural ordinary differential equations,

    R. T. Chen, Y . Rubanova, J. Bettencourt, and D. K. Duvenaud, “Neural ordinary differential equations,” inAdvances in Neural Information Processing Systems, vol. 31, 2018

  42. [42]

    Lectures on Neural Dynamics

    F. Bullo, Lectures on Neural Dynamics, 2025

  43. [43]

    New conditions for global stability of neural networks with application to linear and quadratic programming prob- lems,

    M. Forti and A. Tesi, “New conditions for global stability of neural networks with application to linear and quadratic programming prob- lems,”IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 42, no. 7, pp. 354–366, 2002

  44. [44]

    Excitatory and inhibitory interactions in localized populations of model neurons,

    H. R. Wilson and J. D. Cowan, “Excitatory and inhibitory interactions in localized populations of model neurons,”Biophysical Journal, vol. 12, no. 1, pp. 1–24, 1972

  45. [45]

    Firing rate models as associative memory: Synaptic design for robust retrieval,

    S. Betteti, G. Baggio, F. Bullo, and S. Zampieri, “Firing rate models as associative memory: Synaptic design for robust retrieval,”Neural Computation, vol. 37, no. 10, pp. 1807–1838, 2025

  46. [46]

    Stochastic processes and applications: Diffusion processes, the fokker-planck and langevin equations,

    G. A. Pavliotis, “Stochastic processes and applications: Diffusion processes, the fokker-planck and langevin equations,” inTexts in Applied Mathematics. Springer, 2014, vol. 60

  47. [47]

    Unsupervised learning of distributions on binary vectors using two layer networks,

    Y . Freund and D. Haussler, “Unsupervised learning of distributions on binary vectors using two layer networks,”Advances in Neural Information Processing Systems, vol. 4, 1991

  48. [48]

    Representational power of restricted boltzmann machines and deep belief networks,

    N. Le Roux and Y . Bengio, “Representational power of restricted boltzmann machines and deep belief networks,”Neural Computation, vol. 20, no. 6, pp. 1631–1649, 2008

  49. [49]

    Learning representations by back-propagating errors,

    D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,”Nature, vol. 323, no. 6088, pp. 533–536, 1986

  50. [50]

    K. L. Downing,Gradient Expectations: Structure, Origins, and Synthesis of Predictive Neural Networks. MIT Press, 2023

  51. [51]

    D. O. Hebb,The Organization of Behavior: A Neuropsychological Theory. Psychology Press, 2005

  52. [52]

    Mathematical formulations of Hebbian learning,

    W. Gerstner and W. M. Kistler, “Mathematical formulations of Hebbian learning,”Biological Cybernetics, vol. 87, no. 5, pp. 404– 415, 2002

  53. [53]

    Hebbian learning and development,

    Y . Munakata and J. Pfaffly, “Hebbian learning and development,” Developmental Science, vol. 7, no. 2, pp. 141–148, 2004

  54. [54]

    Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems

    P. Dayan and L. F. Abbott, Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, 2005

  55. [55]

    Simplified neuron model as a principal component analyzer,

    E. Oja, “Simplified neuron model as a principal component analyzer,” Journal of Mathematical Biology, vol. 15, no. 3, pp. 267–273, 1982

  56. [56]

    Neural networks, principal components, and subspaces,

    ——, “Neural networks, principal components, and subspaces,”In- ternational Journal of Neural Systems, vol. 1, no. 1, pp. 61–68, 1989

  57. [57]

    Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex,

    E. L. Bienenstock, L. N. Cooper, and P. W. Munro, “Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex,”Journal of Neuroscience, vol. 2, no. 1, pp. 32–48, 1982

  58. [58]

    Unsupervised learning by competing hidden units,

    D. Krotov and J. J. Hopfield, “Unsupervised learning by competing hidden units,”Proceedings of the National Academy of Sciences of the U.S.A., vol. 116, no. 16, pp. 7723–7731, 2019

  59. [59]

    A Hebbian/anti-Hebbian neural network for linear subspace learning: A derivation from multidimensional scaling of streaming data,

    C. Pehlevan, T. Hu, and D. B. Chklovskii, “A Hebbian/anti-Hebbian neural network for linear subspace learning: A derivation from multidimensional scaling of streaming data,”Neural Computation, vol. 27, no. 7, pp. 1461–1495, 2015

  60. [60]

    Local unsupervised learning for image analysis,

    L. Grinberg, J. Hopfield, and D. Krotov, “Local unsupervised learning for image analysis,”arXiv:1908.08993, 2019

  61. [61]

    Achiev- ing stable dynamics in neural circuits,

    L. Kozachkov, M. Lundqvist, J.-J. Slotine, and E. K. Miller, “Achiev- ing stable dynamics in neural circuits,”PLoS computational biology, vol. 16, no. 8, p. e1007659, 2020

  62. [62]

    Equilibrium propagation: Bridging the gap between energy-based models and backpropagation,

    B. Scellier and Y . Bengio, “Equilibrium propagation: Bridging the gap between energy-based models and backpropagation,”Frontiers in Computational Neuroscience, vol. 11, p. 24, 2017

  63. [63]

    Equivalence of backpropagation and con- trastive Hebbian learning in a layered network,

    X. Xie and H. S. Seung, “Equivalence of backpropagation and con- trastive Hebbian learning in a layered network,”Neural Computation, vol. 15, no. 2, pp. 441–454, 2003

  64. [64]

    Contrastive Hebbian learning in the continuous Hopfield model,

    J. R. Movellan, “Contrastive Hebbian learning in the continuous Hopfield model,” inConnectionist Models, 1991, pp. 10–17

  65. [65]

    Equivalence of equilibrium propagation and recurrent backpropagation,

    B. Scellier and Y . Bengio, “Equivalence of equilibrium propagation and recurrent backpropagation,”Neural Computation, vol. 31, no. 2, pp. 312–329, 2019

  66. [66]

    Modern methods in associative memory,

    D. Krotov, B. Hoover, P. Ram, and B. Pham, “Modern methods in associative memory,” arXiv:2507.06211, 2025

  67. [67]

    Long sequence Hopfield memory,

    H. Chaudhry, J. Zavatone-Veth, D. Krotov, and C. Pehlevan, “Long sequence Hopfield memory,” inAdvances in Neural Information Processing Systems, vol. 36, 2023, pp. 54 300–54 340

  68. [68]

    Memorization to generalization: Emergence of diffusion models from associative memory

    B. Pham, G. Raya, M. Negri, M. J. Zaki, L. Ambrogioni, and D. Krotov, “Memorization to generalization: Emergence of diffusion models from associative memory,” arXiv:2505.21777, 2025

  69. [69]

    Storing infinite numbers of patterns in a spin-glass model of neural networks,

    D. J. Amit, H. Gutfreund, and H. Sompolinsky, “Storing infinite numbers of patterns in a spin-glass model of neural networks,” Physical Review Letters, vol. 55, no. 14, p. 1530, 1985

  70. [70]

    On a model of associative memory with huge storage capacity,

    M. Demircigil, J. Heusel, M. L ¨owe, S. Upgang, and F. Vermet, “On a model of associative memory with huge storage capacity,”Journal of Statistical Physics, vol. 168, no. 2, pp. 288–299, 2017

  71. [71]

    Exponential capacity of dense associative memories,

    C. Lucibello and M. Mézard, “Exponential capacity of dense associative memories,” Physical Review Letters, vol. 132, p. 077301, 2024

  72. [72]

    Large associative memory problem in neurobiology and machine learning,

    D. Krotov and J. J. Hopfield, “Large associative memory problem in neurobiology and machine learning,” inInternational Conference on Learning Representations, 2021

  73. [73]

    Universal Hopfield networks: A general framework for single-shot associative memory models,

    B. Millidge, T. Salvatori, Y . Song, T. Lukasiewicz, and R. Bogacz, “Universal Hopfield networks: A general framework for single-shot associative memory models,” inInternational Conference on Machine Learning, 2022, pp. 15 561–15 583

  74. [74]

    Simplicial Hopfield networks,

    T. F. Burns and T. Fukai, “Simplicial Hopfield networks,” inInter- national Conference on Learning Representations, 2022

  75. [75]

    End-to-end dif- ferentiable clustering with associative memories,

    B. Saha, D. Krotov, M. J. Zaki, and P. Ram, “End-to-end dif- ferentiable clustering with associative memories,” inInternational Conference on Machine Learning, vol. 202, 2023, pp. 29 649–29 670

  76. [76]

    A biologically plausible dense associative memory with exponential capacity

    M. S. Kafraj, D. Krotov, and P. E. Latham, “A biologically plausible dense associative memory with exponential capacity,” arXiv:2601.00984, 2026

  77. [77]

    Dense associative memory with Epanechnikov energy,

    B. Hoover, K. Balasubramanian, D. Krotov, and P. Ram, “Dense associative memory with Epanechnikov energy,” inNew Frontiers in Associative Memories, 2025

  78. [78]

    Hierarchical associative memory

    D. Krotov, “Hierarchical associative memory,” arXiv:2107.06446, 2021

  79. [79]

    A universal abstraction for hierarchical hopfield networks,

    B. Hoover, D. H. Chau, H. Strobelt, and D. Krotov, “A universal abstraction for hierarchical hopfield networks,” inThe Symbiosis of Deep Learning and Differential Equations II, 2022

  80. [80]

    Energy transformer,

    B. Hoover, Y. Liang, B. Pham, R. Panda, H. Strobelt, D. H. Chau, M. Zaki, and D. Krotov, “Energy transformer,” in Advances in Neural Information Processing Systems, vol. 36, 2024

Showing first 80 references.