Neural policy composition from free energy minimization

2025 · math.OC · arXiv 2512.04745

2 Pith papers cite this work. Polarity classification is still indexing.



abstract

The ability to flexibly compose previously acquired skills to execute intelligent behaviors is a hallmark of natural intelligence. Such compositional flexibility is often attributed to context-dependent gating mechanisms that determine how multiple policies or behavioral primitives are combined. Yet, despite remarkable efforts, the normative objective from which such gating rules should arise, and the neural computations capable of implementing them, remain unclear. Existing approaches typically rely on prespecified design choices for the gating rules, and remain tied to specific architectures, learning paradigms, or datasets. Here, we introduce a normative framework in which policy composition emerges from the minimization of a variational free energy, providing a principled and broadly applicable objective for gating. Based on this framework, we derive a continuous-time gradient flow whose trajectories are guaranteed to converge, with explicit rate, to the optimal composition of primitives. We further show that this dynamics admits a mechanistic neural implementation as a soft-competitive recurrent circuit with context-sensitive local interactions. We evaluate the model on emerging flocking behaviors in multi-agent systems, human decision-making in bandit tasks, and control benchmarks in layered architectures. Across these settings, the model provides interpretable mechanistic accounts of policy composition, reproduces key behavioral signatures, yields insights into data, and matches or outperforms established models.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Predictive Coding with Bayesian Priors via Proximal Gradients

eess.SY · 2026-06-06 · unverdicted · novelty 7.0

Predictive coding equals proximal gradient descent on MAP problems, with priors setting nonlinearities via proximal operators and yielding leaky firing-rate networks plus hierarchical MRFs.

Energy-Based Dynamical Models for Neurocomputation, Learning, and Optimization

cs.LG · 2026-04-06 · unverdicted · novelty 3.0

The paper reviews and extends energy-based dynamical models that use gradient flows and energy landscapes for neurocomputation, learning, and optimization tasks.

citing papers explorer


Predictive Coding with Bayesian Priors via Proximal Gradients · eess.SY · 2026-06-06 · unverdicted · none · ref 30 · internal anchor
Energy-Based Dynamical Models for Neurocomputation, Learning, and Optimization · cs.LG · 2026-04-06 · unverdicted · none · ref 138 · internal anchor
