Self-orthogonalizing attractor neural networks emerging from the free energy principle
Pith reviewed 2026-05-22 12:51 UTC · model grok-4.3
The pith
Attractor networks develop approximately orthogonal representations by minimizing free energy without explicit rules.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Attractor networks arise from the free energy principle applied to a universal partitioning of random dynamical systems. These networks perform collective Bayesian active inference in which attractors represent priors, sensory data updates posteriors, and couplings are tuned to minimize surprise. Analysis and simulations both show that the dynamics produce approximately orthogonal attractor representations that efficiently span the input subspace while enhancing generalization and the mutual information between hidden causes and observable effects.
What carries the argument
Free energy minimization over a universal partitioning of random dynamical systems that induces attractor dynamics, inference, and learning.
If this is right
- Attractors efficiently span the input subspace and increase mutual information between hidden causes and observable effects.
- Random data presentation yields symmetric and sparse couplings, while sequential data produces asymmetric couplings and non-equilibrium steady-state dynamics.
- The framework generalizes conventional Boltzmann Machines and supplies a unifying account of self-organizing attractor networks.
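The contrast between symmetric and asymmetric couplings can be illustrated with a toy sketch. It does not use the paper's learning rule: it swaps in a classic Hebbian outer-product rule (pattern paired with itself) and a temporally shifted variant (each pattern paired with its successor). The patterns, function names, and the asymmetry measure are all hypothetical choices for illustration, and the sketch says nothing about sparsity.

```python
# Toy illustration only: plain Hebbian rules, NOT the paper's derived learning rule.

def hebbian_couplings(pairs, n):
    """Sum of outer products post * pre^T over (post, pre) pattern pairs."""
    J = [[0.0] * n for _ in range(n)]
    for post, pre in pairs:
        for i in range(n):
            for j in range(n):
                J[i][j] += post[i] * pre[j]
    return J

def asymmetry(J):
    """Total absolute difference between J and its transpose (0 means symmetric)."""
    n = len(J)
    return sum(abs(J[i][j] - J[j][i]) for i in range(n) for j in range(n))

# Hypothetical +/-1 patterns.
patterns = [
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [-1, 1, 1, -1],
]
n = len(patterns[0])

# Order-free presentation: each pattern is associated with itself.
J_auto = hebbian_couplings([(x, x) for x in patterns], n)

# Sequential presentation: each pattern is associated with its successor.
J_seq = hebbian_couplings(list(zip(patterns[1:], patterns[:-1])), n)

print("asymmetry (auto-associative):", asymmetry(J_auto))
print("asymmetry (sequential):", asymmetry(J_seq))
```

In this toy setting the auto-associative matrix is symmetric by construction, because each term x x^T equals its own transpose, whereas the sequential matrix generally is not.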
Where Pith is reading between the lines
- The same partitioning principle might be applied to other complex systems to derive their self-organizing rules from first principles.
- This account suggests that orthogonalization is a generic consequence of joint accuracy-complexity optimization in any system governed by free energy.
- One could examine whether biological circuits exhibit the predicted shift from symmetric to asymmetric couplings when data arrive sequentially rather than randomly.
Load-bearing premise
The free energy principle applies directly to an arbitrary universal partitioning of random dynamical systems to produce plausible inference and learning dynamics without additional explicit rules.
What would settle it
A demonstration that attractor networks optimized for predictive accuracy and model complexity fail to develop approximately orthogonal representations, or that the partitioning alone does not generate the claimed self-organizing dynamics.
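One way such a test could be made quantitative is to score how far a set of attractor states departs from orthogonality. The sketch below is an assumption about how to do that, not a metric taken from the paper: it averages the absolute cosine similarity over all distinct pairs of attractor vectors, so 0 indicates mutually orthogonal attractors and 1 indicates collinear ones. The function names and example vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_abs_offdiag_cosine(vectors):
    """Mean |cosine| over all distinct pairs; needs at least two vectors."""
    pairs = [
        (i, j)
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    ]
    return sum(abs(cosine(vectors[i], vectors[j])) for i, j in pairs) / len(pairs)

# Hypothetical attractor states.
orthogonal_set = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
redundant_set = [[1, 1, 0], [1, 1, 0], [2, 2, 0]]

print(mean_abs_offdiag_cosine(orthogonal_set))
print(mean_abs_offdiag_cosine(redundant_set))
```

A score that stays well above zero after training would count against the orthogonalization claim under this particular measure; the threshold for "approximately orthogonal" is left open here.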
Original abstract
Attractor dynamics are a hallmark of many complex systems, including the brain. Understanding how such self-organizing dynamics emerge from first principles is crucial for advancing our understanding of neuronal computations and the design of artificial intelligence systems. Here we formalize how attractor networks emerge from the free energy principle applied to a universal partitioning of random dynamical systems. Our approach obviates the need for explicitly imposed learning and inference rules and identifies emergent, but efficient and biologically plausible inference and learning dynamics for such self-organizing systems. These result in a collective, multi-level Bayesian active inference process. Attractors on the free energy landscape encode prior beliefs; inference integrates sensory data into posterior beliefs; and learning fine-tunes couplings to minimize long-term surprise. Analytically and via simulations, we establish that the proposed networks favor approximately orthogonalized attractor representations, a consequence of simultaneously optimizing predictive accuracy and model complexity. These attractors efficiently span the input subspace, enhancing generalization and the mutual information between hidden causes and observable effects. Furthermore, while random data presentation leads to symmetric and sparse couplings, sequential data fosters asymmetric couplings and non-equilibrium steady-state dynamics, offering a natural generalization of conventional Boltzmann Machines. Our findings offer a unifying theory of self-organizing attractor networks, providing novel insights for AI and neuroscience.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that attractor networks emerge from the free energy principle applied to a universal partitioning of random dynamical systems. This obviates explicit learning and inference rules, yielding emergent biologically plausible dynamics that implement collective multi-level Bayesian active inference. Attractors encode priors, inference updates posteriors, and learning minimizes long-term surprise. Analytically and in simulations, the networks produce approximately orthogonalized attractor representations by trading off predictive accuracy against model complexity; these span the input subspace efficiently. Random data presentation yields symmetric sparse couplings, while sequential data produces asymmetric couplings and non-equilibrium steady-state dynamics, generalizing conventional Boltzmann machines.
Significance. If the derivation is free of hidden factorization assumptions, the result would supply a first-principles account of self-organizing attractor dynamics that directly explains the emergence of efficient, approximately orthogonal representations and biologically plausible rules without additional constraints. This would constitute a unifying theory linking the free energy principle to attractor networks in both neuroscience and AI.
major comments (1)
- [analytical derivation] The central claim that approximately orthogonalized attractor representations emerge as a direct consequence of optimizing accuracy and complexity (abstract and analytical sections) appears to rest on a specific variational form of the free energy. If this form employs a mean-field or Laplace approximation whose factorization properties suppress cross terms by construction, orthogonality may be an artifact rather than a general consequence of the universal partitioning. Please state the explicit free-energy functional and show that the orthogonality result survives removal of the approximation.
minor comments (1)
- [Abstract] The abstract summarizes conceptual steps and simulation outcomes but contains no equations, parameter values, or quantitative metrics, which hinders immediate assessment of the strength of the reported results.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity of our manuscript. We address the major comment below and outline the revisions we will make.
Point-by-point responses
- Referee: The central claim that approximately orthogonalized attractor representations emerge as a direct consequence of optimizing accuracy and complexity (abstract and analytical sections) appears to rest on a specific variational form of the free energy. If this form employs a mean-field or Laplace approximation whose factorization properties suppress cross terms by construction, orthogonality may be an artifact rather than a general consequence of the universal partitioning. Please state the explicit free-energy functional and show that the orthogonality result survives removal of the approximation.
Authors: We appreciate the referee's concern about potential artifacts from variational approximations. Our derivation starts from the exact free energy principle for random dynamical systems under the universal partitioning into internal, external, and blanket states. The free energy functional is the standard variational free energy F[q] = E_{q(s)}[ln q(s) - ln p(o,s)], where q(s) is the variational density over hidden states s; the core analytical steps impose neither a mean-field factorization across variables nor a Laplace approximation on q. Orthogonality emerges from the explicit trade-off in the complexity term (KL[q||p]), which penalizes redundant representations independently of factorization assumptions, as shown in the fixed-point attractor equations derived via gradient descent on F. We will revise the manuscript to state this functional explicitly in the analytical section and add a supplementary derivation demonstrating robustness under a non-factorized q with full covariance structure, supported by additional simulations using correlated priors.
Revision: yes
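The functional quoted in the rebuttal can be checked numerically on a small discrete model. The sketch below is illustrative only and is not the paper's model: the prior, likelihood, and trial density q are hypothetical numbers, and the complexity term is taken to be the KL divergence from q to the prior p(s). It verifies the standard identities F[q] = KL[q(s) || p(s)] - E_q[ln p(o|s)] (complexity minus accuracy) and F[q] >= -ln p(o), with equality when q is the exact posterior.

```python
import math

def variational_free_energy(q, prior, likelihood_o):
    """F[q] = E_q[ln q(s) - ln p(o, s)] for a discrete hidden state s."""
    return sum(
        qs * (math.log(qs) - math.log(ps * ls))
        for qs, ps, ls in zip(q, prior, likelihood_o)
        if qs > 0.0
    )

def kl(q, p):
    """KL[q || p] for discrete distributions on the same support."""
    return sum(qs * math.log(qs / ps) for qs, ps in zip(q, p) if qs > 0.0)

# Hypothetical three-state model and one fixed observation o.
prior = [0.5, 0.3, 0.2]           # p(s)
likelihood_o = [0.9, 0.2, 0.1]    # p(o | s)
evidence = sum(p * l for p, l in zip(prior, likelihood_o))          # p(o)
posterior = [p * l / evidence for p, l in zip(prior, likelihood_o)]  # p(s | o)

q = [0.6, 0.3, 0.1]               # an arbitrary trial density
F = variational_free_energy(q, prior, likelihood_o)
complexity = kl(q, prior)
accuracy = sum(qs * math.log(ls) for qs, ls in zip(q, likelihood_o))
F_posterior = variational_free_energy(posterior, prior, likelihood_o)

print("F[q]                  :", F)
print("complexity - accuracy :", complexity - accuracy)
print("surprise -ln p(o)     :", -math.log(evidence))
print("F[posterior]          :", F_posterior)
```

The referee's question about factorization does not arise in this single-variable toy case; it only becomes relevant once s has several components and q may or may not factorize across them.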
Circularity Check
No significant circularity; derivation introduces new application of FEP
Full rationale
The paper formalizes attractor network emergence by applying the free energy principle to a universal partitioning of random dynamical systems, deriving inference and learning dynamics without explicit rules. The claim that approximately orthogonalized representations arise from the accuracy-complexity trade-off is presented as an analytical and simulation-based result of this application. While the FEP originates in prior work by one author, the specific construction, multi-level Bayesian active inference process, and orthogonality outcome add independent content rather than reducing by definition or self-citation chain to the inputs. No equations or steps are shown to force the result tautologically.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The free energy principle governs the dynamics of random dynamical systems under a universal partitioning.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: echoes)
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Paper passage: "minimizing free energy ... simultaneously optimizing predictive accuracy and model complexity ... penalizes overlapping attractor representations and favours orthogonal representations ... anti-Hebbian term that subtracts the variance that is already explained"
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean, costAlphaLog_high_calibrated_iff (tag: echoes)
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Paper passage: "Langevin function ... continuous Bernoulli ... J† symmetric part governs stationary distribution"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.