Quantifying information stored in synaptic connections rather than in firing activities of neural networks
Pith reviewed 2026-05-23 17:14 UTC · model grok-4.3
The pith
Synaptic connections store quantifiable information via mutual information with data patterns, with joint encoding exceeding the sum of individual parts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using densely connected Hebbian networks for autoassociative memory with log-normal data patterns, analytical approximations are obtained for Shannon mutual information between the data and single synaptic connections, pairs, and arbitrary n-tuples. These approximations demonstrate synergistic interactions in which the joint information encoded by groups of synapses exceeds the sum of their individual contributions, while also aligning with known limits on pattern storage capacity and the distributed nature of coding in neural activity.
What carries the argument
Analytical approximations for Shannon mutual information between stored data patterns and n-tuples of synaptic connection strengths
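The quantities named here can be written out in standard notation. The block below is a sketch only: the symbols $X$ (stored data patterns), $W_i$ (strength of synapse $i$), and $\Delta_n$ (synergy) are assumed for illustration and are not taken from the paper, whose explicit approximations are not reproduced on this page.

```latex
% Assumed notation: X = stored patterns, W_i = strength of synapse i.
\begin{align}
  % Shannon mutual information between the data and an n-tuple of synapses
  I(X; W_1,\dots,W_n)
    &= \int p(x, w_{1:n})\,
       \log\frac{p(x, w_{1:n})}{p(x)\,p(w_{1:n})}\; dx\, dw_{1:n} \\
  % Chain rule: the joint information builds up one synapse at a time
  I(X; W_1,\dots,W_n)
    &= \sum_{k=1}^{n} I(X; W_k \mid W_1,\dots,W_{k-1}) \\
  % Synergy: the joint information exceeds the sum of single-synapse terms
  \Delta_n
    &= I(X; W_1,\dots,W_n) - \sum_{i=1}^{n} I(X; W_i) \;>\; 0
\end{align}
```

The chain rule is a general identity; the sign of $\Delta_n$ is the model-dependent claim.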
Load-bearing premise
The calculations assume that input data patterns follow log-normal distributions and that the networks are densely connected Hebbian autoassociators.
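The two modeling choices can be made concrete with a minimal sketch. It assumes NumPy, an outer-product rule normalized by the number of patterns, and a zeroed diagonal (no self-connections); the function name `hebbian_weights`, the network size, and the log-normal parameters are all hypothetical choices, not values from the paper.

```python
import numpy as np

def hebbian_weights(patterns):
    """Dense Hebbian (outer-product) weights from a (P, N) array of patterns.

    Assumptions for illustration: average over the P patterns and remove
    self-connections. The paper's exact normalization is not given here.
    """
    n_patterns, _ = patterns.shape
    w = patterns.T @ patterns / n_patterns  # sum of outer products x x^T, averaged
    np.fill_diagonal(w, 0.0)                # no self-connections (assumed)
    return w

rng = np.random.default_rng(0)
# P = 20 patterns over N = 50 units, each entry drawn from a log-normal
patterns = rng.lognormal(mean=0.0, sigma=0.5, size=(20, 50))
weights = hebbian_weights(patterns)
```

Every pair of units receives a weight, which is what "densely connected" means in this sketch, and the weight matrix is symmetric because the rule is an outer product of each pattern with itself.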
What would settle it
Direct estimates of the mutual information between synaptic weights and input patterns, taken in a biological or simulated network with non-log-normal statistics, would settle it: if no synergy appears there, the measurements will not match the derived approximations, and the result does not hold beyond the modeled case.
Original abstract
A cornerstone of our understanding of both biological and artificial neural networks is that they store information in the strengths of synaptic connections among the neurons. However, in contrast to the well-established theory for quantifying information encoded by the firing activity of neural networks, there does not exist a framework for quantifying information stored in the network's connection distribution itself. Here, we develop a theoretical framework for synaptic information by using densely connected Hebbian networks performing autoassociative memory tasks and by modeling data patterns to be stored as log-normal distributions. Specifically, we derive analytical approximations for Shannon mutual information between the data and singletons, pairs, and arbitrary n-tuples of synaptic connections within the network. Our framework corroborates well-established insights regarding pattern storage capacity, supports the principle of distributed coding in neural firing activities, and formalizes the heterogeneity inherent in information encoding across synapses in a network. Notably, it discovers synergistic interactions among synapses, revealing that the information encoded jointly by all the synapses exceeds the 'sum of its parts'. Taken together, this study introduces a powerful, interpretable framework for quantitatively understanding information storage in the synapses of neural networks, one that illustrates the duality of synaptic connectivity and neural population activity in learning and memory.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a theoretical framework for quantifying Shannon mutual information stored in synaptic connections (rather than firing rates) of neural networks. It restricts attention to densely connected Hebbian autoassociative memory networks whose patterns are drawn from log-normal distributions, derives analytical approximations for the mutual information between the stored data and single synapses, pairs of synapses, and arbitrary n-tuples, and reports synergistic effects in which the joint information exceeds the sum of the marginal contributions. The framework is claimed to corroborate known capacity limits, support distributed coding, and formalize synaptic heterogeneity.
Significance. If the closed-form approximations prove accurate and the reported synergy is not an artifact of the two modeling choices, the work would supply a concrete, interpretable measure of information stored in connectivity that complements existing activity-based information theory. It would also supply a quantitative language for heterogeneity across synapses. No machine-checked proofs or reproducible code are supplied, but the attempt to obtain parameter-light analytic expressions for higher-order synaptic MI is a positive feature.
major comments (2)
- [Abstract and derivations section] Abstract and § on derivations: the analytical MI approximations are obtained only after imposing log-normal pattern statistics and a dense Hebbian outer-product rule; the manuscript contains no section that recomputes the same quantities (or the sign of the synergy) under Gaussian patterns, sparse binary patterns, or alternative plasticity rules. Because the MI quantities are functionals of the joint distribution induced by these two decisions, the synergy result is not shown to be robust, which is load-bearing for the claim that the framework reveals a general property of synaptic information storage.
- [Abstract] Abstract: the text states that 'analytical approximations exist' yet supplies neither the explicit expressions, their error bounds, nor any numerical validation against direct Monte-Carlo estimates of the mutual information. Without these steps it is impossible to judge the accuracy of the claimed closed forms or to rule out post-hoc fitting of the log-normal parameters.
minor comments (2)
- Notation for the n-tuple MI is introduced without an explicit recursive definition or pseudocode, making it difficult to verify the higher-order formulas.
- The manuscript would benefit from a short table comparing the analytic MI values to numerically estimated values for small networks.
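The kind of analytic-versus-numerical check requested above can be sketched on a toy case that is not the paper's: a single stored pattern, so that one Hebbian weight is the product w_ij = x_i * x_j of two independent log-normal activities. Taking logs gives a sum of two independent Gaussians with equal variance, and because mutual information is invariant under invertible transforms, I(x_i; w_ij) = 0.5 ln 2 nats. The histogram estimator `binned_mi`, the bin count, and the sample size are assumptions chosen for illustration.

```python
import numpy as np

def binned_mi(u, v, bins=30):
    """Plug-in mutual information (nats) from a 2-D histogram of samples."""
    joint, _, _ = np.histogram2d(u, v, bins=bins)
    p = joint / joint.sum()               # joint probabilities per bin
    pu = p.sum(axis=1, keepdims=True)     # marginal of u, shape (bins, 1)
    pv = p.sum(axis=0, keepdims=True)     # marginal of v, shape (1, bins)
    mask = p > 0                          # skip empty bins (0 log 0 = 0)
    return float(np.sum(p[mask] * np.log(p[mask] / (pu @ pv)[mask])))

rng = np.random.default_rng(0)
n_samples = 200_000
x_i = rng.lognormal(mean=0.0, sigma=1.0, size=n_samples)
x_j = rng.lognormal(mean=0.0, sigma=1.0, size=n_samples)
w_ij = x_i * x_j                          # one-pattern Hebbian weight (toy case)

mi_analytic = 0.5 * np.log(2.0)           # exact value for this toy case
# Estimate in log space, where both variables are Gaussian and binning is well behaved
mi_estimate = binned_mi(np.log(x_i), np.log(w_ij))
print(f"analytic {mi_analytic:.4f} nats, Monte-Carlo {mi_estimate:.4f} nats")
```

With many stored patterns the weight becomes a sum of log-normals, which is no longer log-normal, so this exact value does not carry over; the sketch only shows the shape such a comparison table row could take.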
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and indicate the revisions planned for the next version of the manuscript.
- Referee: [Abstract and derivations section] Abstract and § on derivations: the analytical MI approximations are obtained only after imposing log-normal pattern statistics and a dense Hebbian outer-product rule; the manuscript contains no section that recomputes the same quantities (or the sign of the synergy) under Gaussian patterns, sparse binary patterns, or alternative plasticity rules. Because the MI quantities are functionals of the joint distribution induced by these two decisions, the synergy result is not shown to be robust, which is load-bearing for the claim that the framework reveals a general property of synaptic information storage.
  Authors: We agree that the derivations and the reported synergy are obtained specifically under log-normal pattern statistics and the dense Hebbian outer-product rule. The manuscript presents a framework for this class of networks rather than claiming universality across all pattern distributions or plasticity rules. To address the robustness concern, the revised manuscript will include an expanded discussion of the modeling choices and a new subsection with numerical checks of the mutual-information approximations and synergy sign under Gaussian patterns (while retaining the analytic focus on the log-normal case).
  Revision: yes
- Referee: [Abstract] Abstract: the text states that 'analytical approximations exist' yet supplies neither the explicit expressions, their error bounds, nor any numerical validation against direct Monte-Carlo estimates of the mutual information. Without these steps it is impossible to judge the accuracy of the claimed closed forms or to rule out post-hoc fitting of the log-normal parameters.
  Authors: We acknowledge that the abstract refers to the existence of analytical approximations without displaying the explicit forms, error bounds, or Monte-Carlo validation. The revised manuscript will move the leading closed-form expressions into the abstract and main text, include the derived error bounds, and add a dedicated validation section that compares the analytic mutual-information values against direct Monte-Carlo estimates for the same parameter regimes.
  Revision: yes
Circularity Check
No significant circularity; derivations are explicit model-based approximations
The paper starts from Shannon MI definitions and derives closed-form approximations under two explicit modeling choices (log-normal pattern statistics and dense Hebbian outer-product storage). These choices induce a joint distribution from which the MI expressions are computed; the resulting formulas are not equivalent to the inputs by construction, nor are any quantities fitted and then relabeled as predictions. No self-citation chains, uniqueness theorems, or ansatzes smuggled via prior work are invoked as load-bearing steps. The framework is therefore self-contained against its stated assumptions, and the reported synergy is a direct consequence of dependence structure in the induced distribution rather than a tautology.
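How dependence structure alone can produce synergy is easiest to see in a standard textbook case, which is not the paper's model: a binary variable `x` equal to the XOR of two fair bits `a` and `b`. Here `a` and `b` loosely stand in for two synapses and `x` for the data; these names and the helper `mutual_information` are hypothetical. Each bit alone carries no information about `x`, yet the pair carries one full bit.

```python
from collections import Counter
from itertools import product
from math import log2

def mutual_information(pairs):
    """Exact MI in bits for a list of equiprobable (u, v) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    pu = Counter(u for u, _ in pairs)
    pv = Counter(v for _, v in pairs)
    return sum(
        (c / n) * log2((c / n) / ((pu[u] / n) * (pv[v] / n)))
        for (u, v), c in joint.items()
    )

# All four equiprobable outcomes of two fair bits and their XOR
outcomes = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]

mi_a = mutual_information([(a, x) for a, b, x in outcomes])        # 0 bits
mi_b = mutual_information([(b, x) for a, b, x in outcomes])        # 0 bits
mi_ab = mutual_information([((a, b), x) for a, b, x in outcomes])  # 1 bit
synergy = mi_ab - (mi_a + mi_b)                                    # 1 bit
```

The positive `synergy` follows from the joint distribution, not from any fitted quantity, which is the sense in which a synergy result of this type is a consequence of the model rather than a tautology.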
Axiom & Free-Parameter Ledger
free parameters (1)
- log-normal distribution parameters
axioms (2)
- domain assumption: Shannon mutual information is the appropriate measure for quantifying information stored in synaptic weights.
- domain assumption: Hebbian learning in a densely connected network is a sufficient model for autoassociative memory.