Self-orthogonalizing attractor neural networks emerging from the free energy principle

Karl Friston; Tamas Spisak

arXiv: 2505.22749 · v2 · pith:6CBONVGSnew · submitted 2025-05-28 · 🧬 q-bio.NC · cs.AI · cs.LG · cs.NE

Pith reviewed 2026-05-22 12:51 UTC · model grok-4.3

classification: 🧬 q-bio.NC · cs.AI · cs.LG · cs.NE

keywords: attractor networks · free energy principle · self-organizing dynamics · Bayesian active inference · orthogonal representations · random dynamical systems · neural computation

The pith

Attractor networks develop approximately orthogonal representations by minimizing free energy without explicit rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that attractor networks emerge when the free energy principle is applied to a universal partitioning of random dynamical systems. This generates inference and learning dynamics that simultaneously optimize predictive accuracy and model complexity. Attractors encode prior beliefs, inference integrates data into posteriors, and learning adjusts couplings to reduce long-term surprise. The outcome is a multi-level Bayesian active inference process that favors approximately orthogonalized representations spanning the input subspace.

Core claim

Attractor networks arise from the free energy principle applied to a universal partitioning of random dynamical systems. These networks perform collective Bayesian active inference in which attractors represent priors, sensory data updates posteriors, and couplings are tuned to minimize surprise. Analytically and through simulations, the dynamics produce approximately orthogonal attractor representations that efficiently span the input space while enhancing generalization and mutual information between hidden causes and observable effects.
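One way to picture why penalizing redundancy favors orthogonal attractors is to store, for each new input, only the part that earlier stored patterns do not already explain. The sketch below does this with plain Gram-Schmidt orthogonalization. It is an illustration under that assumption, not the learning rule derived in the paper, and the input patterns and helper names (`residual`, `cosine`) are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity: 0 for orthogonal vectors, 1 for parallel ones."""
    return dot(u, v) / (dot(u, u) ** 0.5 * dot(v, v) ** 0.5)

def residual(pattern, stored):
    """Subtract from `pattern` its projection onto each stored vector.

    Assumes the stored vectors are already mutually orthogonal, which holds
    here because every stored vector is itself a residual.
    """
    r = list(pattern)
    for s in stored:
        coef = dot(r, s) / dot(s, s)
        r = [ri - coef * si for ri, si in zip(r, s)]
    return r

# Hypothetical, overlapping (correlated) input patterns.
patterns = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]

stored = []
for p in patterns:
    stored.append(residual(p, stored))  # keep only the unexplained part

raw_overlap = cosine(patterns[0], patterns[1])
stored_overlap = cosine(stored[0], stored[1])
print("overlap of first two inputs:        ", raw_overlap)
print("overlap of first two stored vectors:", stored_overlap)
```

The stored vectors span the same subspace as the inputs but no longer overlap, which is the sense of "efficiently span the input space" that this toy example is meant to convey.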

What carries the argument

Free energy minimization over a universal partitioning of random dynamical systems that induces attractor dynamics, inference, and learning.

If this is right

Attractors efficiently span the input subspace and increase mutual information between hidden causes and observable effects.
Random data presentation yields symmetric and sparse couplings, while sequential data produces asymmetric couplings and non-equilibrium steady-state dynamics.
The framework generalizes conventional Boltzmann Machines and supplies a unifying account of self-organizing attractor networks.
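The contrast between random and sequential presentation can be illustrated with a toy coupling estimate. The sketch below assumes a simple time-lagged Hebbian average (outer products of consecutive states), which is a stand-in chosen for illustration and not the rule derived in the paper; the patterns, the step count, and the names `lagged_hebbian` and `asymmetry` are hypothetical. The same rule is applied to both streams, so only the presentation order differs.

```python
import random

def lagged_hebbian(sequence):
    """Average of the outer products x[t+1] * x[t]^T over consecutive states."""
    n = len(sequence[0])
    steps = len(sequence) - 1
    J = [[0.0] * n for _ in range(n)]
    for pre, post in zip(sequence[:-1], sequence[1:]):
        for i in range(n):
            for j in range(n):
                J[i][j] += post[i] * pre[j] / steps
    return J

def asymmetry(J):
    """Largest absolute entry of the antisymmetric part (J - J^T) / 2."""
    n = len(J)
    return max(abs(J[i][j] - J[j][i]) / 2 for i in range(n) for j in range(n))

# Hypothetical binary patterns.
patterns = [[1, -1, -1, 1], [-1, 1, -1, 1], [1, 1, -1, -1]]
steps = 6000

rng = random.Random(0)
random_order = [rng.choice(patterns) for _ in range(steps + 1)]
fixed_cycle = [patterns[t % len(patterns)] for t in range(steps + 1)]

J_random = lagged_hebbian(random_order)
J_sequential = lagged_hebbian(fixed_cycle)

print("asymmetry, random order:", round(asymmetry(J_random), 3))
print("asymmetry, fixed cycle: ", round(asymmetry(J_sequential), 3))
```

With independent draws, consecutive states carry no consistent temporal structure, so the antisymmetric part shrinks toward zero as the number of steps grows. A fixed cycle leaves a persistent antisymmetric component, which is the kind of coupling that can drive directed, non-equilibrium flow between attractors.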

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same partitioning principle might be applied to other complex systems to derive their self-organizing rules from first principles.
This account suggests that orthogonalization is a generic consequence of joint accuracy-complexity optimization in any system governed by free energy.
One could examine whether biological circuits exhibit the predicted shift from symmetric to asymmetric couplings when data arrive sequentially rather than randomly.

Load-bearing premise

The free energy principle applies directly to an arbitrary universal partitioning of random dynamical systems to produce plausible inference and learning dynamics without additional explicit rules.

What would settle it

A demonstration that attractor networks optimized for predictive accuracy and model complexity fail to develop approximately orthogonal representations, or that the partitioning alone does not generate the claimed self-organizing dynamics.

Original abstract

Attractor dynamics are a hallmark of many complex systems, including the brain. Understanding how such self-organizing dynamics emerge from first principles is crucial for advancing our understanding of neuronal computations and the design of artificial intelligence systems. Here we formalize how attractor networks emerge from the free energy principle applied to a universal partitioning of random dynamical systems. Our approach obviates the need for explicitly imposed learning and inference rules and identifies emergent, but efficient and biologically plausible inference and learning dynamics for such self-organizing systems. These result in a collective, multi-level Bayesian active inference process. Attractors on the free energy landscape encode prior beliefs; inference integrates sensory data into posterior beliefs; and learning fine-tunes couplings to minimize long-term surprise. Analytically and via simulations, we establish that the proposed networks favor approximately orthogonalized attractor representations, a consequence of simultaneously optimizing predictive accuracy and model complexity. These attractors efficiently span the input subspace, enhancing generalization and the mutual information between hidden causes and observable effects. Furthermore, while random data presentation leads to symmetric and sparse couplings, sequential data fosters asymmetric couplings and non-equilibrium steady-state dynamics, offering a natural generalization of conventional Boltzmann Machines. Our findings offer a unifying theory of self-organizing attractor networks, providing novel insights for AI and neuroscience.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Attractor networks self-orthogonalize under the free energy principle (FEP) because the accuracy-complexity trade-off favors efficient spanning of the input space; random versus sequential data presentation produces symmetric versus asymmetric couplings.

The main takeaway is that this work derives attractor dynamics and roughly orthogonal representations directly from the free energy principle applied to a partitioned random dynamical system. No separate learning rules are added; the orthogonality and the random-versus-sequential distinction both fall out of the same variational setup. That is the concrete new piece relative to earlier FEP papers on active inference and attractor networks.

Referee Report

1 major / 1 minor

Summary. The manuscript claims that attractor networks emerge from the free energy principle applied to a universal partitioning of random dynamical systems. This obviates explicit learning and inference rules, yielding emergent biologically plausible dynamics that implement collective multi-level Bayesian active inference. Attractors encode priors, inference updates posteriors, and learning minimizes long-term surprise. Analytically and in simulations, the networks produce approximately orthogonalized attractor representations by trading off predictive accuracy against model complexity; these span the input subspace efficiently. Random data presentation yields symmetric sparse couplings, while sequential data produces asymmetric couplings and non-equilibrium steady-state dynamics, generalizing conventional Boltzmann machines.

Significance. If the derivation is free of hidden factorization assumptions, the result would supply a first-principles account of self-organizing attractor dynamics that directly explains the emergence of efficient, approximately orthogonal representations and biologically plausible rules without additional constraints. This would constitute a unifying theory linking the free energy principle to attractor networks in both neuroscience and AI.

major comments (1)

[analytical derivation] The central claim that approximately orthogonalized attractor representations emerge as a direct consequence of optimizing accuracy and complexity (abstract and analytical sections) appears to rest on a specific variational form of the free energy. If this form employs a mean-field or Laplace approximation whose factorization properties suppress cross terms by construction, orthogonality may be an artifact rather than a general consequence of the universal partitioning. Please state the explicit free-energy functional and show that the orthogonality result survives removal of the approximation.

minor comments (1)

[Abstract] The abstract summarizes conceptual steps and simulation outcomes but contains no equations, parameter values, or quantitative metrics, which hinders immediate assessment of the strength of the reported results.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their insightful comments, which have helped us improve the clarity of our manuscript. We address the major comment below and outline the revisions we will make.

Referee: The central claim that approximately orthogonalized attractor representations emerge as a direct consequence of optimizing accuracy and complexity (abstract and analytical sections) appears to rest on a specific variational form of the free energy. If this form employs a mean-field or Laplace approximation whose factorization properties suppress cross terms by construction, orthogonality may be an artifact rather than a general consequence of the universal partitioning. Please state the explicit free-energy functional and show that the orthogonality result survives removal of the approximation.

Authors: We appreciate the referee's concern about potential artifacts from variational approximations. Our derivation starts from the exact free energy principle for random dynamical systems under the universal partitioning into internal, external, and blanket states. The free energy functional is the standard variational free energy F[q] = E_{q(s)}[ln q(s) - ln p(o,s)], where q(s) is the variational density over hidden states s. The core analytical steps impose neither a mean-field factorization across variables nor a Laplace approximation on q(s). Orthogonality emerges from the explicit trade-off in the complexity term (KL[q||p]), which penalizes redundant representations independently of factorization assumptions, as shown in the fixed-point attractor equations derived via gradient descent on F. We will revise the manuscript to state this functional explicitly in the analytical section and to add a supplementary derivation demonstrating robustness under a non-factorized q with full covariance structure, supported by additional simulations using correlated priors.

Revision: yes
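For readers who want to check the accuracy-complexity reading of the functional quoted in the rebuttal, the following minimal sketch evaluates F[q] = E_{q(s)}[ln q(s) - ln p(o,s)] on a two-state discrete model. The prior and likelihood values are made up for illustration, and the helper names are hypothetical. It uses only the textbook identity F = KL[q(s) || p(s)] - E_q[ln p(o|s)], not any result specific to the paper.

```python
import math

# Hypothetical two-state model and a single fixed observation o.
prior = {"a": 0.7, "b": 0.3}        # p(s)
likelihood = {"a": 0.2, "b": 0.9}   # p(o | s) for the observed o

def free_energy(q):
    """F[q] = E_q[ln q(s) - ln p(o, s)], with p(o, s) = p(o | s) p(s)."""
    return sum(
        q[s] * (math.log(q[s]) - math.log(likelihood[s] * prior[s]))
        for s in q if q[s] > 0
    )

def complexity(q):
    """KL[q(s) || p(s)]: how far the beliefs move away from the prior."""
    return sum(q[s] * math.log(q[s] / prior[s]) for s in q if q[s] > 0)

def accuracy(q):
    """E_q[ln p(o | s)]: expected log-likelihood of the observation."""
    return sum(q[s] * math.log(likelihood[s]) for s in q)

evidence = sum(prior[s] * likelihood[s] for s in prior)              # p(o)
posterior = {s: prior[s] * likelihood[s] / evidence for s in prior}  # p(s | o)

for name, q in [("prior", prior), ("posterior", posterior)]:
    print(name, "F =", free_energy(q), "complexity - accuracy =", complexity(q) - accuracy(q))
print("surprise -ln p(o) =", -math.log(evidence))
```

In this toy model the free energy equals complexity minus accuracy for any q, and it reaches its minimum, the surprise -ln p(o), when q is the exact posterior.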

Circularity Check

0 steps flagged

No significant circularity; derivation introduces new application of FEP

The paper formalizes attractor network emergence by applying the free energy principle to a universal partitioning of random dynamical systems, deriving inference and learning dynamics without explicit rules. The claim that approximately orthogonalized representations arise from the accuracy-complexity trade-off is presented as an analytical and simulation-based result of this application. While the FEP originates in prior work by one author, the specific construction, the multi-level Bayesian active inference process, and the orthogonality outcome add independent content; they do not reduce to the inputs by definition or through a chain of self-citations. No equations or steps are shown to force the result tautologically.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the free energy principle and the assumption that a universal partitioning of random dynamical systems is sufficient to derive the reported dynamics.

axioms (1)

domain assumption: The free energy principle governs the dynamics of random dynamical systems under a universal partitioning.
Invoked as the starting point for all emergent inference and learning rules.

pith-pipeline@v0.9.0 · 5762 in / 1144 out tokens · 36741 ms · 2026-05-22T12:51:04.624488+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

minimizing free energy ... simultaneously optimizing predictive accuracy and model complexity ... penalizes overlapping attractor representations and favours orthogonal representations ... anti-Hebbian term that subtracts the variance that is already explained

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_high_calibrated_iff · echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Langevin function ... continuous Bernoulli ... J† symmetric part governs stationary distribution
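The quoted passage names the Langevin function and the continuous Bernoulli distribution together. As a hedged illustration of how the two can be related, the sketch below assumes a density proportional to exp(b·s) on the interval [-1, 1] (a rescaled continuous Bernoulli; the support and the bias symbol b are assumptions, not taken from the excerpt) and checks numerically that its mean equals the Langevin function L(b) = coth(b) - 1/b. The function names are hypothetical.

```python
import math

def langevin(b):
    """Langevin function L(b) = coth(b) - 1/b, with L(b) ~ b/3 near zero."""
    if abs(b) < 1e-4:
        return b / 3.0  # leading Taylor term avoids catastrophic cancellation
    return 1.0 / math.tanh(b) - 1.0 / b

def mean_on_interval(b, steps=20000):
    """Midpoint-rule mean of s under a density proportional to exp(b*s) on [-1, 1]."""
    h = 2.0 / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        s = -1.0 + (k + 0.5) * h
        w = math.exp(b * s)
        num += s * w
        den += w
    return num / den

for b in (-3.0, 0.5, 2.0):
    print(b, langevin(b), mean_on_interval(b))
```

Under this assumption the Langevin function plays the role of a bounded, sigmoid-like activation: it maps any real-valued bias to an expected state strictly between -1 and 1.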

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.