pith. sign in

arxiv: 2605.30882 · v1 · pith:CDO4CPYSnew · submitted 2026-05-29 · 🧬 q-bio.NC

Extended predictive coding framework as variational free-energy minimisation under exponential-family assumption

Pith reviewed 2026-06-28 20:22 UTC · model grok-4.3

classification 🧬 q-bio.NC
keywords predictive codingfree energy principleexponential familyvariational inferenceneural dynamicsperceptual inferencelocal plasticity rules
0
0 comments X

The pith

Assuming the exponential family for variational posteriors and priors extends predictive coding to exhibit nonlinearity, heterogeneity, and non-negative firing rates while preserving the free-energy principle correspondence up to the second

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper extends the link between the free-energy principle and predictive coding by replacing the Gaussian assumption with the exponential family of distributions. Under this broader assumption, a recurrent neural network naturally develops nonlinear responses, varied input-output properties across neurons, and strictly positive firing rates. The free-energy minimization still aligns with predictive coding dynamics up to the second cumulant of the posterior distribution. The setup further allows training through local plasticity rules that are biologically realistic.

Core claim

When a broader class of probability distributions, namely the exponential family of distributions, is assumed for the variational posterior and prior, the predictive coding network exhibits nonlinearity and heterogeneity of input-output properties, as well as non-negative firing rates, while maintaining the correspondence to free-energy minimization up to the second cumulant of the posterior. The model can be trained by biologically plausible local plasticity rules.

What carries the argument

The exponential-family assumption on the variational posterior and prior in a recurrent network of neurons, which enforces the free-energy principle correspondence through local dynamics.

If this is right

  • Predictive coding networks can now incorporate nonlinear and heterogeneous neuron behaviors without violating the free-energy principle.
  • Training relies only on local plasticity rules, avoiding the need for global error signals.
  • The correspondence between predictive coding and variational inference holds for distributions beyond Gaussians, limited to the second cumulant.
  • This framework better accounts for biological neural properties in perceptual inference.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach might enable modeling of inference under non-Gaussian sensory inputs, such as in natural scenes with heavy-tailed statistics.
  • It suggests that similar extensions could apply to other variational methods in computational neuroscience.
  • Testing the model on tasks requiring positive-only rates, like spike-rate coding, could validate its biological relevance.

Load-bearing premise

That the exponential family distributions for posterior and prior can be realized by the recurrent network dynamics in a way that automatically satisfies the second-cumulant match without extra approximations.

What would settle it

Simulate the recurrent network under the exponential family assumption and check whether the variance (second cumulant) of the inferred posterior matches the prediction from free-energy minimization; a mismatch would falsify the maintained correspondence.

Figures

Figures reproduced from arXiv: 2605.30882 by Asaki Kataoka, Kenji Doya.

Figure 1
Figure 1. Figure 1: Schematic illustration of EFD–FEP model derived in Section 3. Representational neurons [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
read the original abstract

The sensory cortices of the brain perform perceptual inference efficiently through their complex networks of neurons. One of the theoretical accounts of this process is the free-energy principle (FEP), which postulates that the brain performs variational Bayesian inference. Pioneering studies have shown that FEP can correspond to the predictive coding (PC) hypothesis under the Gaussian assumption and Laplace approximation. However, PC-based implementations of FEP within such a limited Gaussian regime have failed to capture several properties of biological neural networks, such as nonlinearity and heterogeneity of input--output properties within a network, and the biological implausibility of negative firing rates. This study shows that, when a broader class of probability distributions, namely the exponential family of distributions (EFD), is assumed for the variational posterior and prior, these missing characteristics are exhibited within the network, maintaining the FEP--PC correspondence up to the second cumulant of the posterior. We also show that the proposed model can be trained by biologically plausible local plasticity rules. Our results enrich the explanatory power of FEP regarding neural dynamics involved in perception as variational inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that replacing the Gaussian assumption in predictive-coding implementations of the free-energy principle with the exponential family of distributions for both variational posterior and prior allows a recurrent neural network to exhibit nonlinearity, input-output heterogeneity and strictly positive firing rates while still performing variational free-energy minimisation up to the second cumulant of the posterior; the model is also claimed to be trainable by local, biologically plausible plasticity rules.

Significance. If the explicit network construction and the claimed automatic maintenance of the FEP-PC correspondence can be verified, the result would meaningfully extend the scope of FEP-based accounts of cortical computation beyond the restrictive Gaussian/Laplace regime, providing a principled route to more realistic neural dynamics and local learning rules.

major comments (3)
  1. [Abstract] Abstract and opening paragraphs of the introduction: the central claim that the FEP-PC correspondence 'is maintained up to the second cumulant' under the exponential-family assumption is asserted without any derivation steps, explicit network equations, or verification that the second-cumulant truncation suffices; the mapping from natural parameters/sufficient statistics to firing rates and the form of the recurrent interactions that realise the free-energy gradient are not supplied, so the 'automatic' character of the correspondence cannot be assessed.
  2. [Introduction / Methods] The weakest assumption identified in the stress-test note is load-bearing: the manuscript must demonstrate that an exponential-family posterior and prior can be realised inside a recurrent network such that the dynamics implement the free-energy gradient without auxiliary normalisation, mean-field closure, or distribution-specific approximations that would be biologically non-local; no such construction is provided.
  3. [Abstract] The claim that 'biological properties emerge within the network' (nonlinearity, heterogeneity, positive rates) is presented as a direct consequence of the EFD assumption, yet the manuscript supplies neither the explicit firing-rate functions nor the interaction terms that would allow a reader to confirm that these properties arise without additional constraints.
minor comments (2)
  1. Notation for the natural parameters and cumulant-generating function should be introduced once and used consistently; currently the transition from the Gaussian case to the general EFD case is abrupt.
  2. The statement that the model 'can be trained by biologically plausible local plasticity rules' would benefit from a short explicit rule (e.g., a three-factor Hebbian update) even if the full derivation is deferred to supplementary material.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight areas where the presentation of our derivations and network construction can be strengthened. We address each major comment below and have revised the manuscript to incorporate additional explicit derivations, equations, and clarifications while preserving the core results.

read point-by-point responses
  1. Referee: [Abstract] Abstract and opening paragraphs of the introduction: the central claim that the FEP-PC correspondence 'is maintained up to the second cumulant' under the exponential-family assumption is asserted without any derivation steps, explicit network equations, or verification that the second-cumulant truncation suffices; the mapping from natural parameters/sufficient statistics to firing rates and the form of the recurrent interactions that realise the free-energy gradient are not supplied, so the 'automatic' character of the correspondence cannot be assessed.

    Authors: We agree that the abstract and introduction state the claim concisely. The full derivation from the variational free-energy under the exponential-family assumption to the network dynamics (including the second-cumulant truncation) appears in the Methods section on exponential-family variational inference. The mapping from natural parameters to firing rates and the recurrent interaction terms realizing the gradient are given in Equations (4)–(7). To improve accessibility we have added an expanded step-by-step derivation and verification of the truncation in a new Appendix A. revision: yes

  2. Referee: [Introduction / Methods] The weakest assumption identified in the stress-test note is load-bearing: the manuscript must demonstrate that an exponential-family posterior and prior can be realised inside a recurrent network such that the dynamics implement the free-energy gradient without auxiliary normalisation, mean-field closure, or distribution-specific approximations that would be biologically non-local; no such construction is provided.

    Authors: The construction is supplied in the Methods (subsection on network implementation), where the EFD posterior and prior are realized directly via the network's sufficient statistics and natural-parameter dynamics; the free-energy gradient is implemented by local recurrent connections without auxiliary normalisation or mean-field closure. The local plasticity rules close the loop. We have nevertheless expanded this subsection with explicit pseudocode and a diagram of the recurrent architecture to make the absence of non-local operations fully transparent. revision: yes

  3. Referee: [Abstract] The claim that 'biological properties emerge within the network' (nonlinearity, heterogeneity, positive rates) is presented as a direct consequence of the EFD assumption, yet the manuscript supplies neither the explicit firing-rate functions nor the interaction terms that would allow a reader to confirm that these properties arise without additional constraints.

    Authors: The firing-rate functions are the link functions of the chosen EFD (Equation 3) and the interaction terms are the off-diagonal elements of the precision-weighted connectivity matrix (Section 3.2). These directly produce nonlinearity, unit-wise heterogeneity, and strictly positive rates by the support of the EFD. We have added a new figure (Figure 2) that plots the explicit functions and interaction terms to demonstrate emergence without further constraints. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation presented as consequence of exponential-family assumption

full rationale

The abstract states that assuming the exponential family for both variational posterior and prior yields the listed network properties while maintaining FEP-PC correspondence up to the second cumulant, and that the model can be trained by local rules. No equations, self-citations, or fitted inputs are supplied in the provided text that would allow a reduction of the claimed correspondence to a definition or prior fit. The central step is therefore treated as an independent mathematical consequence of the distributional assumption rather than a renaming or self-referential construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on standard properties of the exponential family and on the variational free-energy formulation; no new entities are introduced and no parameters are reported as fitted in the abstract.

axioms (2)
  • standard math The exponential family of distributions is closed under the operations required for variational inference and yields a well-defined second cumulant.
    Invoked to maintain the FEP–PC correspondence when the Gaussian assumption is dropped.
  • domain assumption Variational free-energy minimization under the stated family produces network dynamics that can be realized with local plasticity rules.
    Required for the claim that training remains biologically plausible.

pith-pipeline@v0.9.1-grok · 5720 in / 1309 out tokens · 30020 ms · 2026-06-28T20:22:11.416638+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

27 extracted references · 3 canonical work pages

  1. [1]

    The free-energy principle: a rough guide to the brain?Trends Cogn

    Karl Friston. The free-energy principle: a rough guide to the brain?Trends Cogn. Sci., 13(7): 293–301, July 2009

  2. [2]

    Predictive coding under the free-energy principle.Philos

    Karl Friston and Stefan Kiebel. Predictive coding under the free-energy principle.Philos. Trans. R. Soc. Lond. B Biol. Sci., 364(1521):1211–1221, May 2009

  3. [3]

    The free-energy principle: a unified brain theory?Nat

    Karl Friston. The free-energy principle: a unified brain theory?Nat. Rev. Neurosci., 11(2): 127–138, February 2010. 16

  4. [4]

    A tutorial on the free-energy framework for modelling perception and learning

    Rafal Bogacz. A tutorial on the free-energy framework for modelling perception and learning. J. Math. Psychol., 76(Pt B):198–211, February 2017

  5. [5]

    9 ofAllgemeine Encyklopädie der Physik

    Hermann von Helmholtz.Handbuch der physiologischen Optik, volume Bd. 9 ofAllgemeine Encyklopädie der Physik. Leopold V oss, Leipzig, 1867

  6. [6]

    Cambridge University Press, September 1996

    David C Knill, Whitman Richards, Whitman Richard, D C Knill, D Kersten, A Yuille, D Mum- ford, A Jepson, W Richards, D C Knill, J Feldman, A L Yuille, H H Bülthoff, B M Bennett, D D Hoffman, C Prakash, S N Richman, P Mamassian, A Blake, D Sheinberg, P N Belhumeur, W T Freeman, K Nakayama, S Shimojo, E H Adelson, A P Pentland, and H Barlow.Perception as Ba...

  7. [7]

    MIT Press, 2007

    Kenji Doya, Shin Ishii, Alexandre Pouget, and Rajesh P N Rao.Bayesian Brain: Probabilistic Approaches to Neural Coding. MIT Press, 2007

  8. [8]

    Bayesian brain theory: Computational neuroscience of belief.Neuroscience, 566:198–204, February 2025

    Hugo Bottemanne. Bayesian brain theory: Computational neuroscience of belief.Neuroscience, 566:198–204, February 2025

  9. [9]

    Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects.Nat

    R P Rao and D H Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects.Nat. Neurosci., 2(1):79–87, January 1999

  10. [10]

    A new cellular mechanism for coupling inputs arriving at different cortical layers.Nature, 398(6725):338–341, March 1999

    M E Larkum, J J Zhu, and B Sakmann. A new cellular mechanism for coupling inputs arriving at different cortical layers.Nature, 398(6725):338–341, March 1999

  11. [11]

    A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex.Trends Neurosci., 36(3):141–151, March 2013

    Matthew Larkum. A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex.Trends Neurosci., 36(3):141–151, March 2013

  12. [12]

    Conjunctive input processing drives feature selectivity in hippocampal CA1 neurons.Nat

    Katie C Bittner, Christine Grienberger, Sachin P Vaidya, Aaron D Milstein, John J Macklin, Junghyup Suh, Susumu Tonegawa, and Jeffrey C Magee. Conjunctive input processing drives feature selectivity in hippocampal CA1 neurons.Nat. Neurosci., 18(8):1133–1142, August 2015

  13. [13]

    Implications of neuronal diversity on population coding

    Maoz Shamir and Haim Sompolinsky. Implications of neuronal diversity on population coding. Neural Comput., 18(8):1951–1986, August 2006

  14. [14]

    Intrinsic biophysical diversity decorrelates neuronal firing while increasing information content.Nat

    Krishnan Padmanabhan and Nathaniel N Urban. Intrinsic biophysical diversity decorrelates neuronal firing while increasing information content.Nat. Neurosci., 13(10):1276–1282, October 2010

  15. [15]

    Population diversity and function of hyperpolarization- activated current in olfactory bulb mitral cells.Sci

    Kamilla Angelo and Troy W Margrie. Population diversity and function of hyperpolarization- activated current in olfactory bulb mitral cells.Sci. Rep., 1(1):50, July 2011

  16. [16]

    Multivariate analysis of electrophysiological diversity of xenopus visual neurons during development and plasticity.Elife, 4, November 2015

    Christopher M Ciarleglio, Arseny S Khakhalin, Angelia F Wang, Alexander C Constantino, Sarah P Yip, and Carlos D Aizenman. Multivariate analysis of electrophysiological diversity of xenopus visual neurons during development and plasticity.Elife, 4, November 2015

  17. [17]

    Diversity amongst human cortical pyramidal neurons revealed via their sag currents and frequency preferences.Nat

    Homeira Moradi Chameh, Scott Rich, Lihua Wang, Fu-Der Chen, Liang Zhang, Peter L Carlen, Shreejoy J Tripathy, and Taufik A Valiante. Diversity amongst human cortical pyramidal neurons revealed via their sag currents and frequency preferences.Nat. Commun., 12(1):2497, May 2021

  18. [18]

    Learning probability distributions of sensory inputs with Monte Carlo predictive coding.PLOS Computational Biology, 20(10): e1012532, October 2024

    Gaspard Oliviers, Rafal Bogacz, and Alexander Meulemans. Learning probability distributions of sensory inputs with Monte Carlo predictive coding.PLOS Computational Biology, 20(10): e1012532, October 2024. ISSN 1553-7358. doi: 10.1371/journal.pcbi.1012532. URL https:// journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1012532

  19. [19]

    Active inference and agency.Cogn

    Karl Friston. Active inference and agency.Cogn. Neurosci., 5(2):119–121, April 2014

  20. [20]

    The markov blankets of life: autonomy, active inference and the free energy principle.J

    Michael Kirchhoff, Thomas Parr, Ensor Palacios, Karl Friston, and Julian Kiverstein. The markov blankets of life: autonomy, active inference and the free energy principle.J. R. Soc. Interface, 15(138):20170792, January 2018

  21. [21]

    Pierce.Types and Programming Languages

    Thomas Parr, Giovanni Pezzulo, and Karl J Friston. Active inference. https://mitpress. mit.edu/9780262362283/active-inference/, December 2021. Accessed: 2026-3-7. 17

  22. [22]

    Life as we know it.J

    Karl Friston. Life as we know it.J. R. Soc. Interface, 10(86):20130475, September 2013

  23. [23]

    Applied Mathematical Sciences

    Shun-Ichi Amari.Information Geometry and Its Applications. Applied Mathematical Sciences. Springer, Tokyo, Japan, 1 edition, February 2016

  24. [24]

    Laws of thermodynamics for exponential families.arXiv [cond-mat.stat- mech], January 2025

    Akshay Balsubramani. Laws of thermodynamics for exponential families.arXiv [cond-mat.stat- mech], January 2025

  25. [25]

    Thermodynamics of prediction.Phys

    Susanne Still, David A Sivak, Anthony J Bell, and Gavin E Crooks. Thermodynamics of prediction.Phys. Rev. Lett., 109(12):120604, September 2012

  26. [26]

    On the thermodynamics of prediction under dissipative adaptation.arXiv [q-bio.NC], September 2020

    Kai Ueltzhöffer. On the thermodynamics of prediction under dissipative adaptation.arXiv [q-bio.NC], September 2020

  27. [27]

    C. Beck. Superstatistics: theory and applications.Continuum Mechanics and Thermodynamics, 16(3):293–304, March 2004. ISSN 1432-0959. doi: 10.1007/s00161-003-0145-1. URL http://dx.doi.org/10.1007/s00161-003-0145-1. A On third cumulant-neglecting approximation In Section 3.2.1, we introduced an approximation where we neglect the third cumulant ∇3 ηql Aql(ηq...