
arxiv: 2606.31562 · v1 · pith:HATE65HFnew · submitted 2026-06-30 · 💻 cs.RO

Stabilization Learning: A Paradigm Transition Bridging Control Theory and Machine Learning

Pith reviewed 2026-07-01 05:36 UTC · model grok-4.3

classification 💻 cs.RO
keywords: stabilization learning · control theory · machine learning · Lyapunov analysis · data-driven feedback · unified framework · adaptive systems · stability

The pith

Stabilization learning unifies control theory and machine learning through a six-tuple framework centered on stability rather than proofs or optimality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes stabilization learning as a paradigm that lets systems adjust policies in real time under perturbations while keeping stability as the main objective. It sets this apart from certificate learning, which seeks formal proofs, and reinforcement learning, which seeks optimal performance. The central construction is a six-tuple mathematical framework that combines Lyapunov analysis, deep feature extraction, and data-driven feedback synthesis. This framework is used to recast eleven types of problems from control, observation, and recognition into a common form. The work also extends the tuple to seven-tuple versions for constrained and tracking cases and identifies two future directions: a more complete unified framework and efficient robust implementations.

Core claim

Stabilization learning constructs a unified mathematical framework based on a six-tuple that encompasses Lyapunov-based analysis, deep feature extraction, and data-driven feedback synthesis, enabling formal reformulation of problems across control, observation, and recognition domains while taking stability as the primary goal.

What carries the argument

The six-tuple (state space, controlled system, metrics, policy, and associated elements) carries the argument: it is the structure through which stability analysis is unified with data-driven adaptation.

If this is right

  • Multi-agent cooperative tracking problems can be expressed inside the six-tuple or seven-tuple models.
  • Visual servo robot position stabilization admits a data-driven treatment that retains stability guarantees.
  • Recognition tasks such as chess games become amenable to stability-focused reformulation.
  • Push-T tasks and similar manipulation problems receive a common stability-oriented treatment.
  • Constrained learning with barrier spaces and target-tracking problems are captured by the seven-tuple extensions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could be tested on high-dimensional nonlinear systems where separate Lyapunov or learning methods currently lack joint guarantees.
  • If the six-tuple works as claimed, it might allow stability considerations to be added to existing deep-learning pipelines in robotics without separate certificate steps.
  • The emphasis on real-time adaptation suggests possible extensions to time-varying environments not explicitly treated in the eleven examples.

Load-bearing premise

The six-tuple structure and its seven-tuple extensions supply a non-trivial unification that adds engineering practicality beyond separate uses of Lyapunov theory and data-driven methods.

What would settle it

An example problem from control or recognition that cannot be expressed inside the six-tuple or whose resulting policy fails to maintain stability under the perturbations the framework claims to handle.
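
A simulation-level version of this test can be sketched as follows. This is a minimal illustration, not the paper's method: the discrete-time linear system `A`, `B`, the gain `K`, the quadratic candidate `V`, and the helper names are all hypothetical choices made for the example. The check rolls out a closed loop (optionally with additive perturbations) and counts steps where the candidate Lyapunov function fails to decrease outside a small neighborhood of the origin.

```python
import numpy as np

def solve_discrete_lyapunov(A_cl, Q):
    """Solve A_cl^T P A_cl - P = -Q for P by vectorization (small systems only)."""
    n = A_cl.shape[0]
    lhs = np.eye(n * n) - np.kron(A_cl.T, A_cl.T)
    return np.linalg.solve(lhs, Q.reshape(-1)).reshape(n, n)

def count_lyapunov_violations(step, policy, V, initial_states,
                              n_steps=200, noise_std=0.0,
                              floor=1e-3, tol=1e-9, seed=0):
    """Count rollout steps where V increases while the state is outside {V <= floor}."""
    rng = np.random.default_rng(seed)
    violations = 0
    for x0 in initial_states:
        x = np.asarray(x0, dtype=float)
        for _ in range(n_steps):
            w = noise_std * rng.standard_normal(x.shape)  # additive perturbation
            x_next = step(x, policy(x)) + w
            if V(x) > floor and V(x_next) > V(x) + tol:
                violations += 1
            x = x_next
    return violations

# Hypothetical example system (assumption, not taken from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[1.0, 1.5]])                       # assumed stabilizing gain
P = solve_discrete_lyapunov(A - B @ K, np.eye(2))

step = lambda x, u: A @ x + B @ u
starts = [np.array([1.0, 1.0]), np.array([-2.0, 0.5])]

# Closed loop with the stabilizing gain and V(x) = x^T P x.
stable = count_lyapunov_violations(step, lambda x: -K @ x,
                                   lambda x: float(x @ P @ x), starts)
# Open loop (zero input) with V(x) = |x|^2: the candidate should fail.
unstable = count_lyapunov_violations(step, lambda x: np.zeros(1),
                                     lambda x: float(x @ x), starts)
print(stable, unstable)
```

A nonzero count is evidence against the candidate on the sampled rollouts; a zero count is only empirical support and does not constitute a stability proof.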

Figures

Figures reproduced from arXiv: 2606.31562 by Quan Quan.

Figure 1. Two classic equilibrium states.
  Adjacent text: Stability can be primarily divided into two categories based on the system’s dependence on initial conditions and external inputs: • Stability dependent only on initial conditions: Refers to the system’s response to minor changes in initial conditions. If initial value perturbations do not cause system instability, the system is stable with respect to initial conditions. Comm…
Figure 2. Stabilization learning problem framework
Figure 3. Two-phase illustration of the disk-pushing task
Figure 4. Metric based on Lyapunov function
Figure 5. Linear feedback system structure.
  Adjacent text: • Nyquist Stability Criterion [5]: Based on the frequency characteristics of the open-loop transfer function G(s)H(s) and the encirclement of the Nyquist curve on the complex plane relative to the critical point (−1, j0), as shown in Figure 6a, the stability of the closed-loop system is determined. The number of poles of the open-loop transfer function G(s)H(s) in the right…
Figure 6. Frequency-domain stability criteria for linear feedback systems
Figure 7. Nonlinear feedback system structure.
  Adjacent text: Popov’s theorem states that when the transfer function of the linear part satisfies the following requirements, the system is absolutely stable: $\forall \omega \ge 0,\ \exists q \ge 0$ s.t. $\operatorname{Re}\left[(1 + j\omega q)\,G(j\omega)\right] + \frac{1}{K} \ge 0$. (2.22) (2) Time-Domain Stability Criteria: For general stability problems, which may involve various complex systems, their stability can be determined by analyzing their actual …
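
The Popov inequality (2.22) quoted in the Figure 7 excerpt can be checked numerically on a frequency grid. The sketch below is an illustration under stated assumptions: the transfer function `G`, the sector bound `K`, the grids, and the function name are hypothetical and not taken from the paper. It searches for a single `q >= 0` that satisfies the inequality at every sampled frequency, which is the usual reading of the criterion; a grid check of this kind is a sanity test, not a proof, and it does not verify the theorem's standing assumptions on the linear part.

```python
import numpy as np

def find_popov_multiplier(G, K, q_grid, omega_grid):
    """Return the first q in q_grid with Re[(1 + j*w*q) G(jw)] + 1/K >= 0
    at every frequency in omega_grid, or None if no such q is found."""
    jw = 1j * np.asarray(omega_grid, dtype=float)
    G_vals = G(jw)                                  # frequency response samples
    for q in q_grid:
        margin = np.real((1.0 + jw * q) * G_vals) + 1.0 / K
        if np.all(margin >= 0.0):
            return float(q)
    return None

# Hypothetical linear part (assumption): G(s) = 1 / ((s + 1)(s + 2)).
G = lambda s: 1.0 / ((s + 1.0) * (s + 2.0))
omegas = np.logspace(-2, 3, 2000)

# Search over a range of multipliers for sector bound K = 100.
q_found = find_popov_multiplier(G, K=100.0,
                                q_grid=np.linspace(0.0, 2.0, 21),
                                omega_grid=omegas)
# Restricting to q = 0 reduces the test to a circle-criterion-style check.
q_zero_only = find_popov_multiplier(G, K=100.0, q_grid=[0.0], omega_grid=omegas)
print(q_found, q_zero_only)
```

For this example the restriction to `q = 0` fails while a positive multiplier succeeds, which shows why the free parameter `q` makes the Popov test less conservative than its `q = 0` special case.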
Figure 8. Structure of the interconnected system.
  Adjacent text: Suppose that each subsystem within the interconnected system satisfies the input-to-state stability (ISS) condition. That is, there exist ISS-Lyapunov functions $V_1(s_1)$ and $V_2(s_2)$, obtained via smooth $\mathcal{K}_\infty$ transformations, such that: $\dot{V}_1(s_1) \le -V_1(s_1) + \gamma_1(V_2(s_2))$, $\dot{V}_2(s_2) \le -V_2(s_2) + \gamma_2(V_1(s_1))$, (2.25) where $\gamma_1$ and $\gamma_2$ are class $\mathcal{K}_\infty$ gain functions that satisfy the follo…
Figure 9. Visualization of neuron activities in various 2D attractors
Figure 10. Theoretical origins and characteristic comparison of reinforcement learning and …
Figure 11. Comparison of Q-learning and D-learning […]
Figure 12. Paradigm unification and complementary relationship between Stabilization Learning …
Figure 13. Schematic diagram of multi-agent cooperative pursuit task
Figure 14. Schematic diagram of closed-loop control architecture for visual servo-based position …
Figure 15. Schematic diagram of geometric relationships in the interception task under FOV …
Figure 16. The chess game problem.
  Adjacent text: Problem Description: Consider the game problem of a player/agent in a chess match. In this game, both sides control black and white pieces on an 8 × 8 chessboard. Each side has 16 pieces divided into 6 types: Pawn (P), Knight (N), Bishop (B), Rook (R), Queen (Q), and King (K)
Figure 17. Schematic diagram of the Push-T task. The robot uses its end-effector to push a …
Figure 18. Schematic diagram of finding the kernel of a non-convex function
Figure 19. Schematic diagram of relative pose estimation of a target UAV based on visual sensors
Figure 20. Schematic diagram of ego-motion estimation of UAV based on visual sensors
Figure 21. Schematic diagram of the forward diffusion process and the backward text-guided …
Figure 22. Schematic diagram of a virtual tube.
  Adjacent text: Problem Description: Consider the problem of guiding multiple robots through a virtual tube, i.e., directing a robot swarm to travel inside a virtual tube along the direction of the tube’s generating curve, without colliding with other robots and without exceeding the boundaries of the virtual tube. The core of this problem lies in ensuring that all individuals in the r…
Figure 23. Schematic diagram of the end-to-end motion planning problem based on flow match …
Figure 24. L-learning vs. traditional learning-based control […]
Original abstract

Stabilization learning is an interdisciplinary paradigm that bridges control theory and machine learning. Its core idea is to enable systems to adjust their policies under perturbations or environmental changes through real-time feedback and adaptive mechanisms. It takes stability as its primary goal, distinguishing itself from certificate learning, which focuses on formal proofs, and reinforcement learning, which pursues optimality. It encompasses a range of methods, including Lyapunov-based analysis and design, deep feature extraction, and data-driven feedback synthesis, and is applicable to complex high-dimensional, nonlinear systems. This paper elaborates on the two major categories of stability in stabilization learning, as well as three typical application scenarios: control, observation, and recognition. It constructs a unified mathematical framework based on a six-tuple, and expands into two types of seven-tuple models: constrained learning with barrier spaces and tracking problems with targets. It also analyzes the roles, meanings, and implementation choices of key elements such as state space, controlled system, metrics, and policy. Through the formal reformulation of 11 types of problems, including multi-agent cooperative tracking, visual servo robot position stabilization, chess games, and Push-T tasks, this paper illustrates the potential applicability of the framework across multiple domains. Finally, it points out that future stabilization learning will focus on two major directions: constructing a unified problem framework and achieving efficient and robust learning, providing solutions for complex system control that combine theoretical rigor with engineering practicality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces stabilization learning as a paradigm bridging control theory and machine learning, with stability as the primary goal rather than optimality (as in RL) or formal certificates (as in certificate learning). It proposes a unified mathematical framework centered on a six-tuple (encompassing state space, controlled system, metrics, policy, and two additional elements) that integrates Lyapunov-based analysis, deep feature extraction, and data-driven feedback synthesis. This is extended to seven-tuple models for constrained problems (with barrier spaces) and tracking problems (with targets). The framework is illustrated via formal reformulations of 11 problems across control, observation, and recognition, including multi-agent cooperative tracking, visual servo position stabilization, chess, and Push-T tasks. Future work is directed toward unified frameworks and efficient robust learning for complex nonlinear systems.

Significance. If the six-tuple and seven-tuple constructions yield non-trivial stability certificates, convergence bounds, or synthesis procedures that are not recoverable from standard combinations of Lyapunov theory and data-driven control, the framework could provide a useful organizing perspective for high-dimensional systems. However, the manuscript presents the unification at a conceptual level through enumeration and reformulation claims without new derivations, algorithms, or comparative analysis demonstrating engineering practicality or formal power beyond existing separate methods.

major comments (3)
  1. [Abstract / unified mathematical framework] Abstract and unified framework section: The central claim that the six-tuple 'enables formal reformulation of problems across control, observation, and recognition domains' rests on the tuple acting as a container for existing components (Lyapunov analysis plus data-driven synthesis), but no equation or derivation is supplied showing a stability certificate, convergence bound, or policy synthesis step that is unavailable from standard Lyapunov + data-driven control.
  2. [seven-tuple models for constrained and tracking cases] Section on seven-tuple models: The extensions to constrained learning (barrier spaces) and tracking (targets) are described as expansions of the six-tuple, yet no explicit mapping or proof is given that these preserve or improve upon the stability properties already obtainable from standard barrier Lyapunov functions or reference-tracking controllers.
  3. [formal reformulation of 11 types of problems] Reformulation of 11 problems (e.g., multi-agent tracking, visual servo, chess, Push-T): The paper states that these are formally reformulated within the framework, but supplies no concrete tuple assignments, stability analysis, or comparison to baseline methods that would substantiate the claim of added applicability or rigor.
minor comments (2)
  1. [Abstract] The abstract refers to 'two major categories of stability' and 'three typical application scenarios' without defining them explicitly before invoking the six-tuple; adding a brief enumeration or reference to later sections would improve readability.
  2. [unified mathematical framework] Notation for the six-tuple elements (state space, controlled system, metrics, policy) is introduced conceptually but would benefit from a single defining equation or table that lists all six components with their mathematical roles.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments and the recommendation for major revision. We agree that the manuscript is primarily conceptual and would benefit from greater explicitness in the framework descriptions and reformulations. We will revise to add concrete mappings and example derivations while preserving the paper's focus on unification rather than novel technical results.

Point-by-point responses
  1. Referee: [Abstract / unified mathematical framework] Abstract and unified framework section: The central claim that the six-tuple 'enables formal reformulation of problems across control, observation, and recognition domains' rests on the tuple acting as a container for existing components (Lyapunov analysis plus data-driven synthesis), but no equation or derivation is supplied showing a stability certificate, convergence bound, or policy synthesis step that is unavailable from standard Lyapunov + data-driven control.

    Authors: The six-tuple is intended as an organizing abstraction to bridge domains using existing components, not as a generator of new certificates unavailable from standard methods. To improve clarity, the revised manuscript will include an explicit worked example deriving a Lyapunov-based stability certificate from the six-tuple elements for one control problem (visual servo stabilization). revision: yes

  2. Referee: [seven-tuple models for constrained and tracking cases] Section on seven-tuple models: The extensions to constrained learning (barrier spaces) and tracking (targets) are described as expansions of the six-tuple, yet no explicit mapping or proof is given that these preserve or improve upon the stability properties already obtainable from standard barrier Lyapunov functions or reference-tracking controllers.

    Authors: We agree explicit mappings are needed. The revision will add detailed element-by-element mappings for both seven-tuple cases, showing integration with barrier Lyapunov functions and tracking controllers, along with a short argument that stability properties are preserved under the extension. No improvement beyond unification is claimed. revision: yes

  3. Referee: [formal reformulation of 11 types of problems] Reformulation of 11 problems (e.g., multi-agent tracking, visual servo, chess, Push-T): The paper states that these are formally reformulated within the framework, but supplies no concrete tuple assignments, stability analysis, or comparison to baseline methods that would substantiate the claim of added applicability or rigor.

    Authors: The current text provides descriptive reformulations. The revision will supply explicit six-tuple or seven-tuple assignments and brief stability sketches for three representative problems (multi-agent tracking, visual servo, Push-T). Full baseline comparisons lie outside the conceptual scope and will be flagged as future work. revision: partial

standing simulated objections not resolved
  • Requests for stability certificates, convergence bounds, or synthesis procedures unavailable from standard combinations of Lyapunov theory and data-driven control, as these exceed the conceptual unification scope of the manuscript.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The proposal introduces the six-tuple and seven-tuple constructs as organizing devices without specifying numerical free parameters or new physical entities; relies on standard mathematical assumptions from control theory.

axioms (2)
  • domain assumption: Lyapunov stability analysis provides a valid foundation for assessing system behavior under perturbations
    Invoked when stating that Lyapunov-based analysis is one of the core methods in the framework.
  • domain assumption: Data-driven feedback synthesis can be combined with stability analysis without loss of rigor
    Assumed when claiming the framework unifies Lyapunov methods with deep feature extraction and data-driven approaches.
invented entities (2)
  • six-tuple mathematical framework (no independent evidence)
    purpose: To provide a unified structure for state space, controlled system, metrics, policy and related elements in stabilization learning
    New organizing construct introduced to encompass the described methods and problems
  • seven-tuple models for constrained and tracking cases (no independent evidence)
    purpose: To extend the six-tuple for barrier spaces and target tracking
    Additional structure postulated to handle specific application scenarios

pith-pipeline@v0.9.1-grok · 5777 in / 1601 out tokens · 29222 ms · 2026-07-01T05:36:22.950222+00:00 · methodology


Reference graph

Works this paper leans on

83 extracted references · 7 canonical work pages · 7 internal anchors

  1. [1]

    N. Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine. Cambridge, MA: The MIT Press, 2nd ed., 1961

  2. [2]

    H. K. Khalil, Nonlinear Systems. Upper Saddle River, NJ: Prentice Hall, 3rd ed., 2002

  3. [3]

    B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods. Mineola, NY: Dover Publications, 2007

  4. [4]

    C. Dawson, S. Gao, and C. Fan, “Safe control with learned certificates: A survey of neural Lyapunov, barrier, and contraction methods for robotics and control,” IEEE Transactions on Robotics, vol. 39, no. 3, pp. 1749–1767, 2023

  5. [5]

    T. Kailath, Linear Systems. Englewood Cliffs, NJ: Prentice Hall, 1980

  6. [6]

    Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” in Proceedings of the 2021 International Conference on Learning Representations (ICLR), 2021

  7. [7]

    N. Bonneel, J. Tompkin, K. Sunkavalli, D. Sun, S. Paris, and H. Pfister, “Blind video temporal consistency,” ACM Transactions on Graphics (TOG), vol. 34, no. 6, p. 196, 2015

  8. [8]

    C. Dawson, Z. Qin, S. Gao, and C. Fan, “Safe nonlinear control using robust neural Lyapunov-Barrier functions,” in Proceedings of the 2021 Conference on Robot Learning (CoRL), vol. 164, pp. 1724–1735, 2022

  9. [9]

    R. W. Beard and T. W. McLain, Small Unmanned Aircraft: Theory and Practice. Princeton, NJ: Princeton University Press, 2012

  10. [10]

    M. Benosman and G. Le Vey, “Stable inversion of SISO nonminimum phase linear systems through output planning: An experimental application to the one-link flexible manipulator,” IEEE Transactions on Control Systems Technology, vol. 11, no. 4, pp. 588–597, 2003

  11. [11]

    B. A. Francis and W. M. Wonham, “The internal model principle of control theory,” Automatica, vol. 12, no. 5, pp. 457–465, 1976

  12. [12]

    C. I. Byrnes and A. Isidori, “Nonlinear internal models for output regulation,” IEEE Transactions on Automatic Control, vol. 49, no. 12, pp. 2244–2247, 2004

  13. [13]

    M. Bin, J. Huang, A. Isidori, L. Marconi, M. Mischiati, and E. Sontag, “Internal models in control, bioengineering, and neuroscience,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 5, no. 1, pp. 55–79, 2022

  14. [14]

    A. C. Eberendu, “Unstructured Data: An overview of the data of Big Data,” International Journal of Computer Trends and Technology, vol. 38, no. 1, pp. 46–50, 2016

  15. [15]

    C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol. 6, p. 60, 2019

  16. [16]

    R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality reduction by learning an invariant mapping,” in Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 1735–1742, 2006

  17. [17]

    L. Van der Maaten, E. Postma, and J. Van den Herik, “Dimensionality reduction: A comparative review,” Tech. Rep. TiCC TR 2009-005, Tilburg University, 2009

  18. [18]

    H. Zhao, C. Dan, B. Aragam, T. S. Jaakkola, G. J. Gordon, and P. Ravikumar, “Fundamental limits and tradeoffs in invariant representation learning,” Journal of Machine Learning Research, vol. 23, p. 340, 2022

  19. [19]

    Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798–1828, 2013

  20. [20]

    D. P. Kingma and M. Welling, “Auto-Encoding variational bayes,” in Proceedings of the 2014 International Conference on Learning Representations (ICLR), 2014

  21. [21]

    S. Roweis and Z. Ghahramani, “A unifying review of linear Gaussian models,” Neural Computation, vol. 11, no. 2, pp. 305–345, 1999

  22. [22]

    R. Chen and J. S. Liu, “Mixture Kalman filters,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 62, no. 3, pp. 493–508, 2000

  23. [23]

    R. Goebel, R. G. Sanfelice, and A. R. Teel, “Hybrid dynamical systems,” IEEE Control Systems Magazine, vol. 29, no. 2, pp. 28–93, 2009

  24. [24]

    J. Ding, Y. Zhang, Y. Shang, Y. Zhang, Z. Zong, J. Feng, Y. Yuan, H. Su, N. Li, N. Sukiennik, F. Xu, and Y. Li, “Understanding world or predicting future? A comprehensive survey of world models,” ACM Computing Surveys, vol. 58, no. 3, p. 65, 2025

  25. [25]

    B. Ai, S. Tian, H. Shi, Y. Wang, T. Pfaff, C. Tan, H. I. Christensen, H. Su, J. Wu, and Y. Li, “A review of learning-based dynamics models for robotic manipulation,” Science Robotics, vol. 10, no. 106, p. eadt1497, 2025

  26. [26]

    R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, “Neural ordinary differential equations,” in Proceedings of the 2018 Advances in Neural Information Processing Systems (NeurIPS), vol. 31, pp. 6571–6583, 2018

  27. [27]

    M. M. Deza and E. Deza, Encyclopedia of Distances. Springer Berlin Heidelberg, 4th ed., 2016

  28. [28]

    Z.-P. Jiang, I. M. Y. Mareels, and Y. Wang, “A Lyapunov formulation of the nonlinear small-gain theorem for interconnected ISS systems,” Automatica, vol. 32, no. 8, pp. 1211–1215, 1996

  29. [29]

    S. Dashkovskiy, B. S. Rüffer, and F. R. Wirth, “Small gain theorems for large scale systems and construction of ISS Lyapunov functions,” SIAM Journal on Control and Optimization, vol. 48, no. 6, pp. 4089–4118, 2010

  30. [30]

    J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proceedings of the National Academy of Sciences, vol. 79, no. 8, pp. 2554–2558, 1982

  31. [31]

    C. Olson and R. Y. Li, “Handwritten character recognition using Hopfield neural network,” in Proceedings of the 1993 Applications of Artificial Neural Networks IV (SPIE), vol. 1965, pp. 412–418, 1993

  32. [32]

    S.-i. Amari, “Dynamics of pattern formation in lateral-inhibition type neural fields,” Biological Cybernetics, vol. 27, no. 2, pp. 77–87, 1977

  33. [33]

    S. Stringer, T. Trappenberg, E. Rolls, and I. Araujo, “Self-organizing continuous attractor networks and path integration: One-dimensional models of head direction cells,” Network: Computation in Neural Systems, vol. 13, no. 2, pp. 217–242, 2002

  34. [34]

    A. Samsonovich and B. L. McNaughton, “Path integration and cognitive mapping in a continuous attractor neural network model,” Journal of Neuroscience, vol. 17, no. 15, pp. 5900–5920, 1997

  35. [35]

    Y. Chen, M. Hua, X. Sun, and J. Peng, “Continuous attractor dynamics in spatial navigation: From population geometry to flexible computation,” Frontiers in Neuroscience, vol. 20, p. 1788255, 2026

  36. [36]

    D. Vrabie, K. G. Vamvoudakis, and F. L. Lewis, “Optimal adaptive control and differential games by reinforcement learning principles,” IEEE Control Systems Magazine, vol. 33, no. 3, pp. 50–91, 2013

  37. [37]

    Y.-C. Chang, N. Roohi, and S. Gao, “Neural Lyapunov control,” in Proceedings of the 2019 Advances in Neural Information Processing Systems (NeurIPS), vol. 32, pp. 3245–3254, 2019

  38. [38]

    Y.-C. Chang and S. Gao, “Stabilizing neural control using self-learned almost Lyapunov critics,” in Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 1803–1809, 2021

  39. [39]

    R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: The MIT Press, 2nd ed., 2018

  40. [40]

    D. P. Bertsekas, Reinforcement Learning and Optimal Control. Belmont, MA: Athena Scientific, 2019

  41. [41]

    C. J. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3, pp. 279–292, 1992

  42. [42]

    V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis, “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015

  43. [43]

    T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” in Proceedings of the 2016 International Conference on Learning Representations (ICLR), 2016

  44. [44]

    Q. Quan, K.-Y. Cai, and C. Wang, “Control with patterns: A D-learning method,” in Proceedings of the 8th Annual Conference on Robot Learning (CoRL), vol. 270, pp. 1384–1401, 2025

  45. [45]

    DOPT: D-learning with off-policy target toward sample efficiency and fast convergence control,

    Z. Shen and Q. Quan, “DOPT: D-learning with off-policy target toward sample efficiency and fast convergence control,” inProceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), pp. 9637–9643, 2025

  46. [46]

    DL-Clip: Online D-learning with clipping operation for fast model-free stabilizing control,

    J. Liu, C. Wang, Z. Shen, and Q. Quan, “DL-Clip: Online D-learning with clipping operation for fast model-free stabilizing control,” inProceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 15838–15844, 2025

  47. [47]

    Proximal Policy Optimization Algorithms

    J. Schulman, F. Wolski, P . Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimiza- tion algorithms,”arXiv preprint arXiv:1707.06347, 2017

  48. [48]

    Learning to Adapt: Reptile-D-Learning for Robust and Efficient Control Under Parametric Uncertainty

    H. Cao, Z. Shen, and Q. Quan, “Learning to adapt: Reptile-D-Learning for robust and efficient control under parametric uncertainty,”arXiv preprint arXiv:2606.25659, 2026

  49. [49]

    Meta-Learning in neural networks: A survey,

    T. Hospedales, A. Antoniou, P. Micaelli, and A. Storkey, “Meta-Learning in neural networks: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 9, pp. 5149–5169, 2022

  50. [50]

    Model-Agnostic Meta-Learning for fast adaptation of deep networks,

    C. Finn, P. Abbeel, and S. Levine, “Model-Agnostic Meta-Learning for fast adaptation of deep networks,” in Proceedings of the 2017 International Conference on Machine Learning (ICML), pp. 1126–1135, 2017

  51. [51]

    On First-Order Meta-Learning Algorithms

    A. Nichol, J. Achiam, and J. Schulman, “On First-Order Meta-Learning algorithms,” arXiv preprint arXiv:1803.02999, 2018

  52. [52]

    Safe policy optimization with cost practical stability: A DQ-learning method,

    C. Wang, L. Liu, and Q. Quan, “Safe policy optimization with cost practical stability: A DQ-learning method,” IEEE Robotics and Automation Letters, vol. 11, no. 6, pp. 6823–6830, 2026

  53. [53]

    Visual servo control. I. Basic approaches,

    F. Chaumette and S. Hutchinson, “Visual servo control. I. Basic approaches,” IEEE Robotics & Automation Magazine, vol. 13, no. 4, pp. 82–90, 2006

  54. [54]

    Introduction to Multicopter Design and Control

    Q. Quan, Introduction to Multicopter Design and Control. Singapore: Springer Singapore, 2017

  55. [55]

    Line-of-sight-constrained multicopter interceptability,

    K. Yang, C. Bai, and Q. Quan, “Line-of-sight-constrained multicopter interceptability,” Journal of Guidance, Control, and Dynamics, vol. 48, no. 4, pp. 951–960, 2025

  56. [56]

    High-speed interception multicopter control by image-based visual servoing,

    K. Yang, C. Bai, Z. She, and Q. Quan, “High-speed interception multicopter control by image-based visual servoing,” IEEE Transactions on Control Systems Technology, vol. 33, no. 1, pp. 119–135, 2024

  57. [57]

    A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play,

    D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. Lillicrap, K. Simonyan, and D. Hassabis, “A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play,” Science, vol. 362, no. 6419, pp. 1140–1144, 2018

  58. [58]

    Diffusion policy: Visuomotor policy learning via action diffusion,

    C. Chi, Z. Xu, S. Feng, E. Cousineau, Y. Du, B. Burchfiel, R. Tedrake, and S. Song, “Diffusion policy: Visuomotor policy learning via action diffusion,” The International Journal of Robotics Research, vol. 44, no. 10-11, pp. 1684–1704, 2025

  59. [59]

    Solving Ordinary Differential Equations I: Nonstiff Problems

    E. Hairer, S. P. Nørsett, and G. Wanner, Solving Ordinary Differential Equations I: Nonstiff Problems. Berlin, Heidelberg: Springer, 2nd ed., 1993

  60. [60]

    Estimating 6D aircraft pose from keypoints and structures,

    R. Fan, T.-B. Xu, and Z. Wei, “Estimating 6D aircraft pose from keypoints and structures,” Remote Sensing, vol. 13, no. 4, p. 663, 2021

  61. [61]

    Vision-based autonomous quadrotor landing on a moving platform,

    D. Falanga, A. Zanchettin, A. Simovic, J. Delmerico, and D. Scaramuzza, “Vision-based autonomous quadrotor landing on a moving platform,” in Proceedings of the 2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR), pp. 200–207, 2017

  62. [62]

    A unified and persistent interception control of multicopters with strapdown monocular camera,

    K. Yang, L. Liu, C. Bai, and Q. Quan, “A unified and persistent interception control of multicopters with strapdown monocular camera,” IEEE Transactions on Aerospace and Electronic Systems, vol. 62, pp. 1244–1253, 2025

  63. [63]

    Fast and omnidirectional relative position estimation with circular markers for UAV swarm,

    Z. Lu, Y. Wu, S. Yang, K. Zhang, and Q. Quan, “Fast and omnidirectional relative position estimation with circular markers for UAV swarm,” IEEE Transactions on Instrumentation and Measurement, vol. 73, p. 5034911, 2024

  64. [64]

    Consistency analysis and improvement of vision-aided inertial navigation,

    J. A. Hesch, D. G. Kottas, S. L. Bowman, and S. I. Roumeliotis, “Consistency analysis and improvement of vision-aided inertial navigation,” IEEE Transactions on Robotics, vol. 30, no. 1, pp. 158–176, 2013

  65. [65]

    ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras,

    R. Mur-Artal and J. D. Tardós, “ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras,” IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255–1262, 2017

  66. [66]

    Direct sparse odometry,

    J. Engel, V. Koltun, and D. Cremers, “Direct sparse odometry,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 3, pp. 611–625, 2017

  67. [67]

    SVO: Semidirect visual odometry for monocular and multicamera systems,

    C. Forster, Z. Zhang, M. Gassner, M. Werlberger, and D. Scaramuzza, “SVO: Semidirect visual odometry for monocular and multicamera systems,” IEEE Transactions on Robotics, vol. 33, no. 2, pp. 249–265, 2016

  68. [68]

    Deep unsupervised learning using nonequilibrium thermodynamics,

    J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, “Deep unsupervised learning using nonequilibrium thermodynamics,” in Proceedings of the 2015 International Conference on Machine Learning (ICML), vol. 37, pp. 2256–2265, 2015

  69. [69]

    Denoising diffusion probabilistic models,

    J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851, 2020

  70. [70]

    Learning transferable visual models from natural language supervision,

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever, “Learning transferable visual models from natural language supervision,” in Proceedings of the 2021 International Conference on Machine Learning (ICML), vol. 139, pp. 8748–8763, 2021

  71. [71]

    Practical distributed control for VTOL UAVs to pass a virtual tube,

    Q. Quan, R. Fu, M. Li, D. Wei, Y. Gao, and K.-Y. Cai, “Practical distributed control for VTOL UAVs to pass a virtual tube,” IEEE Transactions on Intelligent Vehicles, vol. 7, no. 2, pp. 342–353, 2021

  72. [72]

    Making robotics swarm flow more smoothly: A regular virtual tube model,

    P. Mao and Q. Quan, “Making robotics swarm flow more smoothly: A regular virtual tube model,” in Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4498–4504, 2022

  73. [73]

    Flow Matching for Generative Modeling

    Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le, “Flow matching for generative modeling,” arXiv preprint arXiv:2210.02747, 2022

  74. [74]

    Flow matching for scalable simulation-based inference,

    J. Wildberger, M. Dax, S. Buchholz, S. Green, J. H. Macke, and B. Schölkopf, “Flow matching for scalable simulation-based inference,” Advances in Neural Information Processing Systems, vol. 36, pp. 16837–16864, 2023

  75. [75]

    Improving and generalizing flow-based generative models with minibatch optimal transport,

    A. Tong, C. Meng, N. Malkin, G. Huguet, A. Carmichael, J. Rector-Brooks, K. Schramm, Y. Bengio, and G. Wolf, “Improving and generalizing flow-based generative models with minibatch optimal transport,” in Proceedings of the 2023 International Conference on Machine Learning (ICML), pp. 34337–34358, PMLR, 2023

  76. [76]

    Rectified Flow: A Marginal Preserving Approach to Optimal Transport

    Q. Liu, “Rectified flow: A marginal preserving approach to optimal transport,” arXiv preprint arXiv:2209.14577, 2022

  77. [77]

    A survey on transfer learning,

    S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2009

  78. [78]

    Deep Transfer Network: Unsupervised Domain Adaptation

    X. Zhang, F. X. Yu, S.-F. Chang, and S. Wang, “Deep transfer network: Unsupervised domain adaptation,” arXiv preprint arXiv:1503.00591, 2015

  79. [79]

    Overcoming catastrophic forgetting in neural networks,

    J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, D. Hassabis, C. Clopath, D. Kumaran, and R. Hadsell, “Overcoming catastrophic forgetting in neural networks,” Proceedings of the National Academy of Sciences (PNAS), vol. 114, no. 13, pp. 3521–3526, 2017

  80. [80]

    A continual learning survey: Defying forgetting in classification tasks,

    M. De Lange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis, G. Slabaugh, and T. Tuytelaars, “A continual learning survey: Defying forgetting in classification tasks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 7, pp. 3366–3385, 2021

Showing first 80 references.