arxiv: 2512.04745 · v3 · pith:JH5ACUV3new · submitted 2025-12-04 · 🧮 math.OC · cs.AI · cs.SY · eess.SY · nlin.AO

Neural Policy Composition from Free Energy Minimization

Pith reviewed 2026-05-21 17:10 UTC · model grok-4.3

classification 🧮 math.OC · cs.AI · cs.SY · eess.SY · nlin.AO
keywords policy composition · variational free energy · gradient flow · neural gating · recurrent circuits · multi-agent flocking · bandit tasks · optimal control

The pith

Policy composition arises from minimizing a variational free energy, producing a convergent gradient flow that a neural circuit can implement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that flexible composition of behavioral primitives or policies follows directly from minimizing a variational free energy over their combinations. This supplies a single, architecture-independent objective that replaces hand-designed gating rules. From the free energy the authors derive a continuous-time gradient flow whose solutions converge at a known rate to the optimal mixing weights. The same flow admits an exact realization as a soft-competitive recurrent circuit whose connections depend on context. The resulting model accounts for observed patterns in multi-agent flocking, human bandit choices, and layered control tasks.

Core claim

Minimization of a suitably defined variational free energy over policy combinations induces a continuous-time gradient flow on the space of mixing weights; the trajectories of this flow converge, at an explicit rate, to the weights that realize the optimal composition of given primitives, and the flow itself is realized by a soft-competitive recurrent neural circuit with context-sensitive local interactions.
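
To make the shape of this claim concrete, here is a minimal sketch of one soft-competitive flow on the mixing weights. It is not the paper's functional or circuit: the quadratic cost `f`, the matrices `Q` and `c`, the temperature `eps`, and the flow `dw/dt = -w + softmax(-grad f(w) / eps)` are all assumptions chosen for illustration. The flow rests at weights satisfying `w = softmax(-grad f(w) / eps)`, which is the stationarity condition of the entropy-regularized problem `min f(w) + eps * sum_i w_i log w_i` over the simplex.

```python
import math

def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def grad_cost(w, Q, c):
    """Gradient of the illustrative cost f(w) = 0.5 * w^T Q w + c^T w (Q symmetric)."""
    n = len(w)
    return [sum(Q[i][j] * w[j] for j in range(n)) + c[i] for i in range(n)]

def simulate_flow(Q, c, eps, w0, dt=0.01, steps=5000):
    """Forward-Euler integration of dw/dt = -w + softmax(-grad f(w) / eps)."""
    w = list(w0)
    for _ in range(steps):
        target = softmax([-g / eps for g in grad_cost(w, Q, c)])
        # Each step is a convex combination, so w stays on the simplex for dt <= 1.
        w = [(1.0 - dt) * wi + dt * ti for wi, ti in zip(w, target)]
    return w

# Hypothetical problem data for three primitives (not taken from the paper).
Q = [[1.0, 0.2, 0.0],
     [0.2, 1.5, 0.1],
     [0.0, 0.1, 0.8]]
c = [0.3, -0.2, 0.1]
eps = 0.5

w_final = simulate_flow(Q, c, eps, [0.8, 0.1, 0.1])
fixed_point = softmax([-g / eps for g in grad_cost(w_final, Q, c)])
residual = max(abs(a - b) for a, b in zip(w_final, fixed_point))
print("weights:", w_final)
print("fixed-point residual:", residual)
```

The residual measures how far the final weights are from satisfying the fixed-point equation; in this toy setting it shrinks to numerical noise, which is the behavior the core claim asserts for the paper's own flow.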

What carries the argument

The variational free energy functional whose gradient flow with respect to policy mixing weights yields both the convergence guarantee and the soft-competitive recurrent circuit.

If this is right

  • The composition dynamics converges to the optimal mixing at an explicit, provable rate.
  • The dynamics admits an exact implementation as a recurrent neural circuit without additional architectural constraints.
  • The same objective reproduces key behavioral signatures across flocking, bandit decision-making, and layered control tasks.
  • Gating rules emerge mechanistically from free-energy minimization rather than from prespecified design choices.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The framework supplies a candidate normative principle that could unify gating mechanisms across reinforcement learning and active inference models.
  • Because the dynamics is continuous-time and local, it offers a natural starting point for analyzing how biological circuits might implement skill composition on short timescales.
  • The explicit convergence rate could be used to predict how quickly an agent should switch between primitive policies when the context changes.

Load-bearing premise

A variational free energy can be defined over policy combinations so that its minimization simultaneously guarantees convergence to an optimal composition and supplies a direct mechanistic neural implementation.

What would settle it

A numerical integration of the derived gradient flow on the flocking or bandit benchmark that fails to converge to the composition minimizing the free energy, or a circuit simulation whose activity patterns deviate from the predicted soft-competitive interactions.
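
A test of this kind can be sketched on a toy problem. The example below assumes an illustrative free energy `F(w) = f(w) + eps * sum_i w_i log w_i` with a hypothetical quadratic `f`, and an assumed soft-competitive flow `dw/dt = -w + softmax(-grad f(w) / eps)`; neither is taken from the paper, and the flocking and bandit benchmarks are not reproduced. It integrates the flow and compares the free energy at the endpoint against a brute-force grid search over the simplex.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def grad_cost(w, Q, c):
    """Gradient of the hypothetical cost f(w) = 0.5 * w^T Q w + c^T w."""
    n = len(w)
    return [sum(Q[i][j] * w[j] for j in range(n)) + c[i] for i in range(n)]

def free_energy(w, Q, c, eps):
    """Illustrative objective: f(w) plus eps times the negative entropy of w."""
    n = len(w)
    quad = 0.5 * sum(w[i] * Q[i][j] * w[j] for i in range(n) for j in range(n))
    lin = sum(ci * wi for ci, wi in zip(c, w))
    neg_entropy = sum(wi * math.log(wi) for wi in w if wi > 0.0)  # 0 log 0 := 0
    return quad + lin + eps * neg_entropy

def flow_endpoint(Q, c, eps, w0, dt=0.01, steps=5000):
    """Forward-Euler integration of the assumed soft-competitive flow."""
    w = list(w0)
    for _ in range(steps):
        target = softmax([-g / eps for g in grad_cost(w, Q, c)])
        w = [(1.0 - dt) * wi + dt * ti for wi, ti in zip(w, target)]
    return w

def grid_minimum(Q, c, eps, n=100):
    """Smallest free energy over a regular grid on the 3-simplex."""
    best = math.inf
    for i in range(n + 1):
        for j in range(n + 1 - i):
            w = [i / n, j / n, (n - i - j) / n]
            best = min(best, free_energy(w, Q, c, eps))
    return best

# Hypothetical problem data (not from the paper).
Q = [[1.0, 0.2, 0.0], [0.2, 1.5, 0.1], [0.0, 0.1, 0.8]]
c = [0.3, -0.2, 0.1]
eps = 0.5

w_flow = flow_endpoint(Q, c, eps, [1 / 3, 1 / 3, 1 / 3])
gap = free_energy(w_flow, Q, c, eps) - grid_minimum(Q, c, eps)
print("free-energy gap (flow minus grid):", gap)
```

A gap that is zero or slightly negative means the flow found a point at least as good as every grid point. A clearly positive gap on the paper's actual functional would be the kind of counterexample described above.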

Figures

Figures reproduced from arXiv: 2512.04745 by Francesca Rossi, Francesco Bullo, Giovanni Russo, Veronica Centorrino.

Figure 1. GateMod set-up. (A) At time step k − 1, an agent (e.g., a boid in a flock, a person in a multi-armed bandit task, or an autonomous agent) receives the state x_{k−1} from the environment and determines action u_k. Both x_{k−1} and u_k are realizations of random variables, X_{k−1} and U_k. We denote random variables with upper-case letters and their realizations with lower-case letters. Bold means that the variable is,…
Figure 2. GateMod. (A) GateFrame normative framework. At each time step, the agent computes optimal policy weights w⋆_k by solving an entropy-regularized optimization problem that minimizes a trade-off between statistical complexity and entropy. The constraints formalize the fact that the resulting policy is a linear, and hence convex, combination of primitives. The optimal weights correspond to the equilibrium of Ga…
Figure 3. (A) A boid in a flock of N boids. Position and velocity components form the 4-dimensional state x^i_k; u^i_k is the acceleration vector. We use the superscript to denote that states/actions are those of the i-th boid in the flock. The acceleration is built upon the social forces and a boid can only use information from boids within its field of view. The field angle, α, is set to 320° in the experiments. The ra…
Figure 4. (A) Comparison between the Hybrid model from [36] and GateMod in terms of PXP. Higher PXP for a given model suggests that the model provides better explanations for the data. Formally, PXP quantifies the probability that each considered model is the most frequent process that generated the data. To obtain the PXP, we start from GateMod optimal policy. The policy at each trial is used to compute the Bayesian Infor…
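
The field-of-view restriction mentioned in the Figure 3 caption can be illustrated with a short sketch. Only the 320° field angle comes from the caption. The function name `in_field_of_view`, the choice to centre the field on the boid's velocity direction, and the 2-D tuple representation are assumptions made here for illustration. The caption's truncated remainder is not filled in, so no sensing radius is modelled.

```python
import math

def in_field_of_view(pos, vel, other_pos, field_angle_deg=320.0):
    """Return True if other_pos lies within the angular field of view.

    Assumption: the field is symmetric about the velocity direction, so a
    320-degree field leaves a 40-degree blind cone directly behind the boid.
    A zero velocity falls back to heading 0 because math.atan2(0, 0) == 0.
    """
    dx = other_pos[0] - pos[0]
    dy = other_pos[1] - pos[1]
    heading = math.atan2(vel[1], vel[0])
    bearing = math.atan2(dy, dx)
    # Wrap the angular difference into [-pi, pi) before taking its magnitude.
    diff = abs((bearing - heading + math.pi) % (2.0 * math.pi) - math.pi)
    return diff <= math.radians(field_angle_deg) / 2.0

# A boid at the origin moving along +x.
ahead = in_field_of_view((0.0, 0.0), (1.0, 0.0), (1.0, 0.0))
behind = in_field_of_view((0.0, 0.0), (1.0, 0.0), (-1.0, 0.0))
rear_side = in_field_of_view((0.0, 0.0), (1.0, 0.0), (-1.0, 1.0))
print(ahead, behind, rear_side)
```

A neighbour straight ahead or off to the rear side is visible, while one directly behind falls in the blind cone.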
Original abstract

The ability to flexibly compose previously acquired skills to execute intelligent behaviors is a hallmark of natural intelligence. Such compositional flexibility is often attributed to context-dependent gating mechanisms that determine how multiple policies or behavioral primitives are combined. Yet, despite remarkable efforts, the normative objective from which such gating rules should arise, and the neural computations capable of implementing them, remain unclear. Existing approaches typically rely on prespecified design choices for the gating rules, and remain tied to specific architectures, learning paradigms, or datasets. Here, we introduce a normative framework in which policy composition emerges from the minimization of a variational free energy, providing a principled and broadly applicable objective for gating. Based on this framework, we derive a continuous-time gradient flow whose trajectories are guaranteed to converge, with explicit rate, to the optimal composition of primitives. We further show that this dynamics admits a mechanistic neural implementation as a soft-competitive recurrent circuit with context-sensitive local interactions. We evaluate the model on emerging flocking behaviors in multi-agent systems, human decision-making in bandit tasks, and control benchmarks in layered architectures. Across these settings, the model provides interpretable mechanistic accounts of policy composition, reproduces key behavioral signatures, yields insights into data, and matches or outperforms established models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a normative framework in which policy composition emerges from minimization of a variational free energy functional over combinations of primitive policies. From this objective the authors derive a continuous-time gradient flow on the policy simplex whose trajectories converge to the optimal composition with an explicit rate; they further show that the dynamics admits a mechanistic implementation as a soft-competitive recurrent neural circuit with context-sensitive interactions. The framework is evaluated on multi-agent flocking, bandit decision-making, and layered control tasks, where it reproduces behavioral signatures and matches or exceeds baseline models.

Significance. If the claimed convergence guarantees and rate hold for general primitive policies without additional convexity or Lipschitz restrictions, the work would supply a principled, architecture-agnostic objective for gating that links free-energy minimization to both dynamical systems and neural implementation. The explicit rate, mechanistic circuit, and cross-domain evaluations would constitute a substantive contribution to normative modeling of compositional control.

major comments (2)
  1. [§3] §3 (Gradient-flow derivation): The abstract and framework claim an explicit convergence rate for the continuous-time dynamics, yet the manuscript does not state or verify the strong-convexity (or geodesic-convexity) condition on the free-energy functional over the policy simplex that would be required for a uniform rate independent of the choice of primitives. Without this, the rate may hold only for restricted classes of gating variables or primitive policies.
  2. [§4] §4 (Neural implementation): The mapping from the gradient flow to the soft-competitive recurrent circuit is presented as direct, but the derivation appears to introduce local interaction weights whose stability under the claimed dynamics is not shown to follow automatically from the free-energy objective; an explicit Lyapunov or contraction argument linking the circuit equations back to the variational functional is needed.
minor comments (2)
  1. [§2] Notation for the policy simplex and the variational free-energy functional should be introduced with a single consistent definition early in the paper rather than piecemeal across sections.
  2. [Figure 2] Figure 2 (neural circuit diagram) would benefit from explicit labeling of which variables correspond to the gating weights versus the primitive policy outputs.
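
For context on major comment 1, the textbook Euclidean argument below shows how a strong-convexity constant turns into an explicit exponential rate. It is a sketch under stated assumptions (an unconstrained, differentiable, μ-strongly convex function F with minimizer w⋆), not the paper's theorem, which concerns a flow on the policy simplex.

```latex
% Assumption: F is differentiable and \mu-strongly convex on R^n, with minimizer w^\star.
\dot w(t) = -\nabla F(w(t)), \qquad \nabla F(w^\star) = 0 .

% Strong convexity gives strong monotonicity of the gradient:
\frac{d}{dt}\,\tfrac12 \lVert w - w^\star \rVert^2
  = -\langle \nabla F(w) - \nabla F(w^\star),\; w - w^\star \rangle
  \le -\mu \,\lVert w - w^\star \rVert^2 .

% Integrating the differential inequality yields the explicit rate:
\lVert w(t) - w^\star \rVert \le e^{-\mu t}\, \lVert w(0) - w^\star \rVert .
```

The rate is exactly the constant μ, which is why the referee asks where the corresponding condition on the free-energy functional is stated.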

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major point below, indicating revisions that will be incorporated to clarify the conditions and strengthen the arguments.

  1. Referee: [§3] §3 (Gradient-flow derivation): The abstract and framework claim an explicit convergence rate for the continuous-time dynamics, yet the manuscript does not state or verify the strong-convexity (or geodesic-convexity) condition on the free-energy functional over the policy simplex that would be required for a uniform rate independent of the choice of primitives. Without this, the rate may hold only for restricted classes of gating variables or primitive policies.

    Authors: We thank the referee for this observation. The explicit convergence rate in Section 3 is derived under the assumption that the variational free-energy functional is strongly convex with respect to the Fisher-Rao metric on the policy simplex. This property holds when the primitive policies satisfy suitable regularity conditions, such as bounded second derivatives or sufficient separation in the policy space. While the evaluated tasks satisfy these conditions, we agree that the assumption should be stated explicitly. In the revision we will update the statement of the main theorem to include the strong-convexity requirement and add a short discussion of sufficient conditions on the primitives, together with a verification for the bandit and layered-control examples. revision: yes

  2. Referee: [§4] §4 (Neural implementation): The mapping from the gradient flow to the soft-competitive recurrent circuit is presented as direct, but the derivation appears to introduce local interaction weights whose stability under the claimed dynamics is not shown to follow automatically from the free-energy objective; an explicit Lyapunov or contraction argument linking the circuit equations back to the variational functional is needed.

    Authors: We agree that an explicit stability argument would improve the presentation. The soft-competitive circuit is obtained by rewriting the continuous-time gradient flow in terms of local, context-dependent interactions that arise directly from the variational derivatives. To make the link rigorous, we will add a new proposition in Section 4 that constructs a Lyapunov function given by the free-energy functional itself. We will show that the time derivative of this function along the circuit trajectories is non-positive, thereby establishing that the circuit dynamics inherit the convergence guarantees of the original variational objective. The argument and its proof will be included in the revised manuscript. revision: yes
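
The kind of Lyapunov argument promised in response 2 can be illustrated on a toy model. The derivation below assumes an entropy-regularized free energy F(w) = f(w) + ε Σ_i w_i log w_i and a soft-competitive flow ẇ = s(w) − w with s(w) = softmax(−∇f(w)/ε). These are illustrative choices, not the paper's functional or circuit equations, and the computation is valid for w in the interior of the simplex.

```latex
% Assumptions: f differentiable, \varepsilon > 0, w in the interior of the simplex.
F(w) = f(w) + \varepsilon \sum_i w_i \log w_i , \qquad
s_i(w) = \frac{\exp\!\big(-\partial_i f(w)/\varepsilon\big)}{Z(w)} , \qquad
\dot w = s(w) - w .

% From the definition of s: \partial_i f(w) = -\varepsilon \log s_i - \varepsilon \log Z.
% Since \sum_i (s_i - w_i) = 0, terms constant in i drop out:
\frac{dF}{dt}
  = \sum_i \big(\partial_i f(w) + \varepsilon \log w_i + \varepsilon\big)(s_i - w_i)
  = \varepsilon \sum_i (\log w_i - \log s_i)(s_i - w_i)

  = -\varepsilon \Big[ \mathrm{KL}(s \,\|\, w) + \mathrm{KL}(w \,\|\, s) \Big] \;\le\; 0 ,

% with equality if and only if w = s(w), i.e. at an equilibrium of the flow.
```

In this toy setting the free energy itself decreases along trajectories, with the decrease rate given by a symmetrized Kullback-Leibler divergence between the current weights and their softmax target.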

Circularity Check

0 steps flagged

The derivation is self-contained: it starts from an external variational free energy principle and does not reduce to fitted inputs or self-citations.

The paper presents policy composition as emerging directly from minimization of a variational free energy functional, followed by derivation of a continuous-time gradient flow with stated convergence guarantees. No equations or steps in the provided abstract or framework description reduce the claimed results to a self-referential definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The free-energy objective is invoked as an external normative principle rather than constructed from the target gating dynamics, and the neural implementation is presented as a consequence rather than an input. This satisfies the criteria for a self-contained derivation against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Because only the abstract is available, the ledger is necessarily incomplete. The central claim rests on the existence of a variational free energy that can be defined over policy spaces and whose minimization yields both optimal composition and a realizable neural circuit.

axioms (1)
  • domain assumption: A variational free energy functional can be defined over combinations of existing policies such that its minimization produces the optimal composition.
    Stated in the abstract as the normative basis for the entire framework.

pith-pipeline@v0.9.0 · 5757 in / 1276 out tokens · 104495 ms · 2026-05-21T17:10:19.928087+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Tag legend:
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Energy-Based Dynamical Models for Neurocomputation, Learning, and Optimization

    cs.LG · 2026-04 · unverdicted · novelty 3.0

    The paper reviews and extends energy-based dynamical models that use gradient flows and energy landscapes for neurocomputation, learning, and optimization tasks.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · cited by 1 Pith paper · 1 internal anchor

  [1] B. Abbas and H. Attouch. Dynamical systems and forward-backward algorithms associated with the sum of a convex subdifferential and a monotone cocoercive operator. Optimization, 64(10):2223–2252, 2014.

  [2] H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2 edition, 2017.

  [3] A. Beck. First-Order Methods in Optimization. SIAM, 2017.

  [4] A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 31(3):167–175, 2003.

  [5] F. Bullo. Contraction Theory for Dynamical Systems. Kindle Direct Publishing, 1.2 edition, 2024.

  [6] F. Bullo, P. Cisneros-Velarde, A. Davydov, and S. Jafarpour. From contraction theory to fixed point algorithms on Riemannian and non-Euclidean spaces. In IEEE Conf. on Decision and Control, December 2021.

  [7] V. Centorrino, A. Gokhale, A. Davydov, G. Russo, and F. Bullo. Positive competitive networks for sparse reconstruction. Neural Computation, 36(6):1163–1197, 2024.

  [8] P. L. Combettes and J.-C. Pesquet. Deep neural network structures solving variational inequalities. Set-Valued and Variational Analysis, 28(3):491–518, 2020.

  [9] R. Cominetti, E. Melo, and S. Sorin. A payoff-based learning procedure and its application to traffic games. Games and Economic Behavior, 70(1):71–83, September 2010.

  [10] P. Coucheney, B. Gaujal, and P. Mertikopoulos. Penalty-regulated dynamics and robust learning procedures in games. Mathematics of Operations Research, 40(3):611–633, August 2015.

  [11] I. D. Couzin, J. Krause, R. James, G. D. Ruxton, and N. R. Franks. Collective Memory and Spatial Sorting in Animal Groups. Journal of Theoretical Biology, 218(1):1–11, 2002.

  [12] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, USA, 2006.

  [13] F. Cucker and S. Smale. Emergent behavior in flocks. IEEE Transactions on Automatic Control, 52(5):852–862, 2007.

  [14] A. Davydov, V. Centorrino, A. Gokhale, G. Russo, and F. Bullo. Time-varying convex optimization: A contraction and equilibrium tracking approach. IEEE Transactions on Automatic Control, 70(11):7446–7460, 2025.

  [15] A. Davydov, S. Jafarpour, and F. Bullo. Non-Euclidean contraction theory for robust nonlinear stability. IEEE Transactions on Automatic Control, 67(12):6667–6681, 2022.

  [16] A. Davydov, A. V. Proskurnikov, and F. Bullo. Non-Euclidean contraction analysis of continuous-time neural networks. IEEE Transactions on Automatic Control, 70(1):235–250, 2025.

  [17] I. M. Elfadel and J. L. Wyatt Jr. The "softmax" nonlinearity: Derivation using statistical mechanics and useful properties as a multiterminal analog circuit element. Advances in Neural Information Processing Systems, 6, 1993.

  [18] M. Frank and P. Wolfe. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1–2):95–110, March 1956.

  [19] B. Gao and L. Pavel. On the properties of the softmax function with application in game theory and reinforcement learning. arXiv preprint arXiv:1704.00805, 2017.

  [20] E. Garrabé and Giovanni Russo. Probabilistic design of optimal sequential decision-making algorithms in learning and control. Annual Reviews in Control, 54:81–102, 2022.

  [21] S. J. Gershman. Deconstructing the human algorithms for exploration. Cognition, 173:34–42, 2018.

  [22] A. Gokhale, A. Davydov, and F. Bullo. Proximal gradient dynamics: Monotonicity, exponential convergence, and applications. IEEE Control Systems Letters, 8:2853–2858, 2024.

  [23] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.

  [24] P. Guan, M. Raginsky, and R. M. Willett. Online Markov decision processes with Kullback–Leibler control cost. IEEE Transactions on Automatic Control, 59(6):1423–1438, June 2014.

  [25] S. Hassan-Moghaddam and M. R. Jovanović. Proximal gradient flow and Douglas-Rachford splitting dynamics: Global exponential stability via integral quadratic constraints. Automatica, 123:109311, 2021.

  [26] H. Hazimeh, Z. Zhao, A. Chowdhery, M. Sathiamoorthy, Y. Chen, R. Mazumder, L. Hong, and E. Chi. DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning. In Advances in Neural Information Processing Systems, volume 34, pages 29335–29347, 2021.

  [27] C. Heins, B. Millidge, L. Da Costa, R. P. Mann, K. J. Friston, and I. D. Couzin. Collective behavior from surprise minimization. Proceedings of the National Academy of Sciences, 121(17):e2320239121, 2024.

  [28] C. K. Hemelrijk and H. Hildenbrandt. Self-organized shape and frontal density of fish schools. Ethology, 114(3):245–254, 2008.

  [29] E. Jang, S. Gu, and B. Poole. Categorical Reparameterization with Gumbel-Softmax. In International Conference on Learning Representations, 2017.

  [30] L. Kozachkov, K. V. Kastanenka, and D. Krotov. Building transformers from neurons and astrocytes. Proceedings of the National Academy of Sciences, 120(34), 2023.

  [31] S. Kullback and R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22:79–86, 1951.

  [32] D. S. Leslie and E. J. Collins. Individual Q-learning in normal form games. SIAM Journal on Control and Optimization, 44(2):495–514, January 2005.

  [33] H. Levine, W. J. Rappel, and I. Cohen. Self-organization in systems of self-propelled particles. Phys. Rev. E, 63:017101, Dec 2000.

  [34] H. Ling, G. E. McIvor, J. Westley, K. van der Vaart, R. T. Vaughan, A. Thornton, and N. T. Ouellette. Behavioural plasticity and the transition to order in jackdaw flocks. Nature Communications, 10(1):5174, 2019.

  [35] W. Lohmiller and J.-J. E. Slotine. On contraction analysis for non-linear systems. Automatica, 34(6):683–696, 1998.

  [36] R. Lukeman, Y. Li, and L. Edelstein-Keshet. Inferring individual rules from collective behavior. Proceedings of the National Academy of Sciences, 107(28):12576–12580, June 2010.

  [37] R. D. McKelvey and T. R. Palfrey. Quantal response equilibria for normal form games. Games and Economic Behavior, 10(1):6–38, July 1995.

  [38] P. Mertikopoulos and W. H. Sandholm. Learning in games via reinforcement and regularization. Mathematics of Operations Research, 41(4):1297–1324, November 2016.

  [39] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 1928–1937. PMLR, Jun 2016.

  [40] Kevin P. Murphy. Probabilistic Machine Learning: Advanced Topics. MIT Press, 2023.

  [41] M. Nagumo. Über die Lage der Integralkurven gewöhnlicher Differentialgleichungen. Proceedings of the Physico-Mathematical Society of Japan. 3rd Series, 24:551–559, 1942.

  [42] R. Olfati-Saber. Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Transactions on Automatic Control, 51(3):401–420, 2006.

  [43] N. Parikh and S. Boyd. Proximal algorithms. Foundations and Trends in Optimization, 1(3):127–239, 2014.

  [44] J. Peters, K. Mulling, and Y. Altun. Relative entropy policy search. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1):1607–1612, July 2010.

  [45] G. Peyré and M. Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019.

  [46] A. M. Reynolds, G. E. McIvor, A. Thornton, P. Yang, and N. T. Ouellette. Stochastic modelling of bird flocks: accounting for the cohesiveness of collective motion. Journal of the Royal Society Interface, 19(189):20210745, 2022.

  [47] C. W. Reynolds. Flocks, herds and schools: A distributed behavioral model. In Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, page 25–34, 1987.

  [48] C. W. Reynolds. Flocks, herds, and schools: A distributed behavioral model. Computer Graphics, 21(4):25–34, 1987.

  [49] R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, 1970.

  [50] G. Russo, M. Di Bernardo, and E. D. Sontag. Global entrainment of transcriptional systems to periodic inputs. PLoS Computational Biology, 6(4):e1000739, 2010.

  [51] G. Russo, M. Di Bernardo, and E. D. Sontag. A contraction approach to the hierarchical analysis and design of networked systems. IEEE Transactions on Automatic Control, 58(5):1328–1331, 2013.

  [52] W. H. Sandholm. Population Games and Evolutionary Dynamics. MIT Press, 2010.

  [53] A. Shafiei, H. Jesawada, K. Friston, and G. Russo. Distributionally robust free energy principle for decision-making. In Nature Communications, 2025.

  [54] M. Snow and J. Orchard. Biological softmax: Demonstrated in modern Hopfield networks. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 44, 2022.

  [55] E. D. Sontag. Contractive systems with inputs. In J. C. Willems, S. Hara, Y. Ohta, and H. Fujioka, editors, Perspectives in Mathematical System Theory, Control, and Signal Processing, pages 217–228. Springer, 2010.

  [56] D. J. T. Sumpter. The principles of collective animal behaviour. Philosophical Transactions of The Royal Society B: Biological Sciences, 361(1465):5–22, 2006.

  [57] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.

  [58] K. Tunstrøm, Y. Katz, C. C. Ioannou, C. Huepe, M. J. Lutz, and I. D. Couzin. Collective States, Multistability and Transitional Behavior in Schooling Fish. PLOS Computational Biology, 9(2):1–11, 02 2013.

  [59] A. Ullah. Entropy, divergence and distance measures with econometric applications. Journal of Statistical Planning and Inference, 49(1):137–162, 1996.

  [60] T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen, and O. Shochet. Novel type of phase transition in a system of self-driven particles. Physical Review Letters, 75(6-7):1226–1229, 1995.

  [61] S. Xie, G. Russo, and R. H. Middleton. Scalability in nonlinear network systems affected by delays and disturbances. IEEE Transactions on Control of Network Systems, 8(3):1128–1138, 2021.

  [62] A. L. Yuille and D. Geiger. Winner-take-all mechanisms. In The Handbook of Brain Theory and Neural Networks, 1995.