
arxiv: 2503.17581 · v2 · pith:W6QJC2HTnew · submitted 2025-03-21 · 🧮 math.OC · cs.LG

Time-optimal neural feedback control of nilpotent systems as a binary classification problem

Pith reviewed 2026-05-22 22:07 UTC · model grok-4.3

classification 🧮 math.OC cs.LG
keywords time-optimal control, nilpotent systems, bang-bang control, neural networks, binary classification, deflated Newton method, switching surfaces

The pith

A neural network trained as a binary classifier on polynomial root data approximates the time-optimal feedback law for nilpotent integrators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to turn the synthesis of time-optimal controls for linear nilpotent systems into a root-finding task on parameter-dependent polynomials. The bang-bang theorem produces switching sequences whose times satisfy these polynomials; a deflated Newton solver, guided by the real-root count from the Hermite quadratic form, exhausts the real roots. Sampled solutions supply the labels of a supervised dataset on which a deep neural network is trained to output the correct control sign in real time.
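As a worked illustration of how switching times become polynomial roots, consider the double integrator, the lowest-dimensional nilpotent chain. This example is ours and is not taken from the paper, whose notation and equation numbering may differ. Assume dynamics \dot{x}_1 = x_2, \dot{x}_2 = u with |u| \le 1, an initial state (x_1, x_2), the origin as target, an initial control \sigma \in \{-1, +1\} held for a time t_1, and the opposite control -\sigma held for a time t_2.

```latex
% Terminal conditions x_2(t_1 + t_2) = 0 and x_1(t_1 + t_2) = 0:
\begin{aligned}
x_2 + \sigma t_1 - \sigma t_2 &= 0,\\
x_1 + x_2 t_1 + \tfrac{\sigma}{2} t_1^2 + (x_2 + \sigma t_1)\, t_2 - \tfrac{\sigma}{2} t_2^2 &= 0.
\end{aligned}

% The first equation gives t_2 = t_1 + \sigma x_2. Substituting into the second
% and using \sigma^2 = 1 leaves a single quadratic in t_1:
t_1^2 + 2 \sigma x_2\, t_1 + \sigma x_1 + \tfrac{1}{2} x_2^2 = 0,
\qquad
t_1 = -\sigma x_2 \pm \sqrt{\tfrac{1}{2} x_2^2 - \sigma x_1}.
```

Only roots with t_1 \ge 0 and t_2 \ge 0 are admissible switching times. For \sigma = +1 the number of real roots is governed by the sign of x_2^2 - 2 x_1, the same expression that appears in the caption of Figure 1; whether Example 4.2 of the paper uses exactly this parametrization is an assumption on our part.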

Core claim

By exhaustively solving the polynomial systems that encode admissible switching sequences for sampled states, one obtains a synthetic dataset on which a neural network, viewed strictly as a binary classifier, learns to reproduce the time-optimal feedback map for nilpotent chains of integrators.

What carries the argument

The deflated Newton method that finds all real roots of the parameter-dependent polynomials characterizing control switching times, whose solutions label the training set for the neural binary classifier.
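A minimal sketch of the deflation idea in the scalar case may help fix intuition. It assumes the unshifted deflation operator 1/|x - r| for each known root r, which for a univariate polynomial amounts to dividing out the roots already found; the paper works with multivariate systems and its deflation operator may differ. The function names and the argument `n_expected`, which stands in for a root count certified by the Hermite quadratic form, are hypothetical.

```python
def horner(coeffs, x):
    """Evaluate a polynomial (coefficients from highest degree) and its derivative."""
    value, slope = 0.0, 0.0
    for c in coeffs:
        slope = slope * x + value
        value = value * x + c
    return value, slope


def deflated_newton(coeffs, n_expected, x0=0.0, tol=1e-10, max_iter=100):
    """Collect real roots one at a time, restarting Newton from the same guess.

    Newton is applied to g(x) = f(x) * prod_i 1/|x - r_i|, whose step simplifies
    to f / (f' - f * sum_i 1/(x - r_i)). `n_expected` is an assumed, externally
    certified number of distinct real roots; the loop stops once it is reached.
    """
    roots = []
    while len(roots) < n_expected:
        x, converged = x0, False
        for _ in range(max_iter):
            f, df = horner(coeffs, x)
            if abs(f) < tol:
                converged = True
                break
            gaps = [x - r for r in roots]
            if any(g == 0.0 for g in gaps):
                break  # landed exactly on a known root; give up on this start
            denom = df - f * sum(1.0 / g for g in gaps)
            if denom == 0.0:
                break
            x -= f / denom
        # Reject failed runs and rediscoveries of a root that is already known.
        if not converged or any(abs(x - r) < 1e-6 for r in roots):
            break
        roots.append(x)
    return roots


# Example: (x - 1)(x - 2)(x - 3), always started from the same initial guess.
example_roots = deflated_newton([1.0, -6.0, 11.0, -6.0], n_expected=3)
```

Without deflation, every run from the same initial guess would return the same root; the deflation term is what pushes later runs towards the remaining ones.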

If this is right

  • The same pipeline produces real-time controllers for chains of integrators up to moderate dimension.
  • Accuracy of the feedback map improves directly with denser sampling of the polynomial systems.
  • The binary-classification view replaces online root solving with a single forward pass through the network.
  • Robustness to state perturbations follows from the supervised training on a dense cover of the state space.
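The third bullet, together with the figure captions that mention falling back on the polynomial solver at low-confidence points, can be sketched as follows. This is an illustration under stated assumptions, not the authors' architecture: the tanh network, the `margin` threshold, and the names `classifier_probability`, `feedback`, and `toy_params` are all hypothetical.

```python
import numpy as np


def classifier_probability(x, params):
    """Forward pass of a small tanh MLP; returns an estimate of P(u = +1 | x).

    `params` is a list of (weight, bias) pairs, the last pair being the
    single-output layer. The architecture is an assumption for illustration.
    """
    h = np.asarray(x, dtype=float)
    for weight, bias in params[:-1]:
        h = np.tanh(weight @ h + bias)
    w_out, b_out = params[-1]
    logit = (w_out @ h + b_out)[0]
    return 1.0 / (1.0 + np.exp(-logit))


def feedback(x, params, fallback, margin=0.1):
    """Return the control sign, deferring to `fallback` near the decision boundary."""
    p = classifier_probability(x, params)
    if abs(p - 0.5) < margin:
        # Low confidence: call the slower reference solver (hypothetical hook).
        return float(fallback(x))
    return 1.0 if p > 0.5 else -1.0


# Toy parameters: a single linear layer whose logit equals the first state component.
toy_params = [(np.array([[1.0, 0.0]]), np.array([0.0]))]
u_far = feedback([2.0, 0.0], toy_params, fallback=lambda x: -1.0)
u_near = feedback([0.01, 0.0], toy_params, fallback=lambda x: -1.0)
```

In this toy setup the state far from the boundary is decided by the network alone, while the state close to it is routed to the fallback.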

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be tested on systems whose switching surfaces are known analytically to quantify any coverage gaps introduced by finite sampling.
  • Retraining the classifier on data that includes small random perturbations of the polynomial coefficients would indicate sensitivity to numerical error in the root finder.
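The first suggestion can be tried on the double integrator, whose switching curve x_1 = -x_2|x_2|/2 is a textbook result. The sketch below labels states in two ways: from the analytic curve, and from the admissible real roots of the switching-time quadratic t_1^2 + 2σ x_2 t_1 + σ x_1 + x_2^2/2 = 0 with t_2 = t_1 + σ x_2. That quadratic is our own derivation for this example, not the paper's general system, and the function names and grid are hypothetical.

```python
import math


def analytic_control(x1, x2):
    """Textbook time-optimal feedback for the double integrator with |u| <= 1."""
    s = x1 + 0.5 * x2 * abs(x2)
    if s != 0.0:
        return -math.copysign(1.0, s)
    return -math.copysign(1.0, x2) if x2 != 0.0 else 0.0


def root_based_control(x1, x2, eps=1e-12):
    """Label a state with the initial control sign of the fastest admissible root."""
    best = None
    for sigma in (1.0, -1.0):
        disc = 0.5 * x2 * x2 - sigma * x1
        if disc < 0.0:
            continue  # no real switching time for this initial sign
        for t1 in (-sigma * x2 + math.sqrt(disc), -sigma * x2 - math.sqrt(disc)):
            t2 = t1 + sigma * x2
            if t1 > eps and t2 >= -eps:
                total = t1 + t2
                if best is None or total < best[0]:
                    best = (total, sigma)
    return None if best is None else best[1]


# Grid chosen to avoid points lying exactly on the switching curve.
grid = [(a, b) for a in (-1.5, -0.7, 0.3, 1.1) for b in (-1.3, -0.4, 0.6, 1.7)]
mismatches = [p for p in grid if analytic_control(*p) != root_based_control(*p)]
```

An empty `mismatches` list indicates that the root-based labels reproduce the analytic law on the grid; the same comparison with a trained classifier in place of `root_based_control` would expose coverage gaps.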

Load-bearing premise

The sampled polynomial roots faithfully trace the true switching surfaces without large numerical omissions or artifacts.

What would settle it

Run the trained network in closed loop on a nilpotent integrator and measure whether the achieved settling time matches, within a small tolerance, the time obtained by solving the polynomial system on the same initial condition.
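A sketch of this check for the double integrator follows. Since no trained network is available here, the analytic switching law stands in for the classifier (a hypothetical placeholder), and the closed-form minimum time stands in for the polynomial solve. The step size, the target-ball radius, and all function names are assumptions; a sampled controller never reaches the origin exactly, so arrival is declared inside a small ball.

```python
import math


def switching_law(x1, x2):
    """Stand-in for the trained classifier: the analytic bang-bang feedback."""
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0.0:
        return -1.0
    if s < 0.0:
        return 1.0
    return -1.0 if x2 > 0.0 else 1.0


def exact_min_time(x1, x2):
    """Closed-form minimum time to the origin for the double integrator."""
    s = x1 + 0.5 * x2 * abs(x2)
    sign = 1.0 if s >= 0.0 else -1.0
    return sign * x2 + 2.0 * math.sqrt(0.5 * x2 * x2 + sign * x1)


def closed_loop_time(feedback, x1, x2, dt=1e-3, radius=1e-2, max_steps=20000):
    """Simulate with a zero-order-hold control; return the first time inside the ball."""
    for k in range(max_steps):
        if math.hypot(x1, x2) < radius:
            return k * dt
        u = feedback(x1, x2)
        # Exact update for a control held constant over one step.
        x1 += x2 * dt + 0.5 * u * dt * dt
        x2 += u * dt
    return None


t_closed_loop = closed_loop_time(switching_law, 1.0, 0.0)
t_reference = exact_min_time(1.0, 0.0)
```

Replacing `switching_law` with the network's forward pass and comparing `t_closed_loop` with `t_reference` over many initial conditions would give the tolerance test described above.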

Figures

Figures reproduced from arXiv: 2503.17581 by Dante Kalise, Nelly Villamizar, Samuel Gue, Sara Bicego.

Figure 1. Graph of the two-dimensional polynomial system from Example 4.2. The number of real solutions of the system depends on the sign of the diagonal elements in the matrix H(J). We have three cases: x_2^2 − 2x_1 > 0 (left), x_2^2 − 2x_1 = 0 (middle), and x_2^2 − 2x_1 < 0 (right).
Figure 2. The trained classifier identifies the switching surface as a low-confidence region.
Figure 3. Comparison of the optimal trajectory computed via deflation with the one controlled with the classifier u_θ. On the left, the NN feedback is solely determined via u_θ, whilst on the right we rely on the identification of the solution of the polynomial system (6.1) to identify the feedback at low-confidence points. T denotes the exact minimum time.
Figure 4. Comparison of the optimal trajectory with the u_θ-controlled (left) and with the confidence-enhanced approximation (right). Relying on the polynomial solver for low-confidence points improves the approximation, as the time T_NN needed to reach the origin decreases.
Figure 5. Mean and variance of a Monte Carlo simulation with 1000 trajectories perturbed by Gaussian noise. The robustness of the approximated feedback law ensures that the controlled system reaches the target destination, while the open-loop solution, obtained as the solution of the polynomial system (2.5), diverges from it.
Figure 6. Monte Carlo simulation of 1000 controlled noisy trajectories via open-loop control signal vs. approximated feedback control (top). Comparison of the optimal trajectory with the one controlled via the approximated feedback law (left). Calling the polynomial solver for the low-confidence points improves the approximation performance in terms of time horizon T_NN (right).
Figure 7. Comparison of the optimal trajectory with the trajectory controlled using the approximated feedback law (left). Leveraging the polynomial solver for low-confidence points enhances the approximation's performance in terms of the time horizon T_NN (right).
Original abstract

A computational method for the synthesis of time-optimal feedback control laws for linear nilpotent systems is proposed. The method is based on the use of the bang-bang theorem, which leads to a characterization of the time-optimal trajectory as a parameter-dependent polynomial system for the control switching sequence. A deflated Newton's method is then applied to exhaust all the real roots of the polynomial system. The root-finding procedure is informed by the Hermite quadratic form, which provides a sharp estimate on the number of real roots to be found. In the second part of the paper, the polynomial systems are sampled and solved to generate a synthetic dataset for the construction of a time-optimal deep neural network -- interpreted as a binary classifier -- via supervised learning. Numerical tests in integrators of increasing dimension assess the accuracy, robustness, and real-time-control capabilities of the approximate control law.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a computational pipeline for time-optimal feedback control of linear nilpotent integrators. The bang-bang theorem yields a parameter-dependent polynomial system whose real roots label the switching surfaces; these roots are exhaustively recovered by a deflated Newton method guided by the Hermite quadratic form. The resulting labeled data train a deep neural network interpreted as a binary classifier that approximates the time-optimal feedback law for real-time use. Numerical experiments on integrators of increasing dimension are used to assess accuracy, robustness, and real-time performance.

Significance. If the synthetic dataset is provably complete and the classifier generalizes without boundary artifacts, the approach would supply a practical route to real-time time-optimal control for nilpotent systems whose exact switching surfaces become intractable in higher dimensions. The explicit use of the Hermite form to certify the number of real roots is a methodological strength that distinguishes the data-generation step from purely heuristic sampling.
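To make the certification idea concrete, here is a sketch of the Hermite quadratic form in the simplest, univariate setting. The matrix of Newton power sums of the roots, H[i][j] = p_{i+j}, has signature equal to the number of distinct real roots. The paper applies a multivariate version on a quotient ring, so this is a reduced illustration; the function names and the eigenvalue tolerance are assumptions.

```python
import numpy as np


def power_sums(coeffs, count):
    """Newton power sums p_0..p_{count-1} of the roots, via Newton's identities.

    `coeffs` lists the polynomial coefficients from the highest degree down.
    """
    a = [c / coeffs[0] for c in coeffs]  # normalise to a monic polynomial
    n = len(a) - 1
    p = [float(n)]
    for k in range(1, count):
        acc = k * a[k] if k <= n else 0.0
        for i in range(1, min(k - 1, n) + 1):
            acc += a[i] * p[k - i]
        p.append(-acc)
    return p


def count_real_roots(coeffs, rel_tol=1e-9):
    """Number of distinct real roots as the signature of the Hermite matrix."""
    n = len(coeffs) - 1
    p = power_sums(coeffs, 2 * n - 1)
    hermite = np.array([[p[i + j] for j in range(n)] for i in range(n)])
    eig = np.linalg.eigvalsh(hermite)
    cutoff = rel_tol * max(1.0, float(np.abs(eig).max()))
    return int((eig > cutoff).sum() - (eig < -cutoff).sum())


n_cubic = count_real_roots([1.0, -6.0, 11.0, -6.0])  # (x - 1)(x - 2)(x - 3)
n_circle = count_real_roots([1.0, 0.0, 1.0])         # x^2 + 1
```

The count is obtained from the coefficients alone, without locating any root, which is what allows it to serve as a stopping target for a root finder. The floating-point eigenvalue test used here is only a heuristic substitute for exact sign determination.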

major comments (3)
  1. [root-finding procedure (Section 3)] The central claim that the trained classifier furnishes an accurate approximation to the time-optimal feedback law rests on the completeness of the root set recovered by deflated Newton. No cross-validation against homotopy continuation or Gröbner-basis methods is reported to confirm that every real root is found for the sampled parameter instances, nor is any sensitivity analysis given for root perturbations near the decision boundary.
  2. [Numerical experiments (Section 5)] Numerical tests on integrators of increasing dimension are described, yet no quantitative error metrics (e.g., misclassification rate on a held-out test set, Hausdorff distance to exact switching surfaces, or closed-loop performance degradation relative to the exact bang-bang law) are supplied. Without these, it is impossible to assess whether the reported “accuracy” supports the real-time-control claim.
  3. [dataset construction (Section 4)] The parameter-sampling strategy used to generate the training set is not shown to be dense in neighborhoods of the switching surfaces. Coverage gaps or under-sampling near the decision boundary would directly degrade the classifier’s reliability precisely where the feedback law is most sensitive.
minor comments (2)
  1. [Section 3] Notation for the Hermite quadratic form and the deflated Newton iteration should be introduced with explicit equation numbers rather than inline descriptions.
  2. [Section 5] Figure captions for the switching-surface visualizations should state the integrator dimension and the number of sampled points used.
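Two of the metrics requested in major comment 2 are straightforward to compute once predicted and reference labels, or sampled surface points, are available. The sketch below is a generic illustration with hypothetical function names and toy inputs; it does not reproduce any evaluation from the paper.

```python
import numpy as np


def misclassification_rate(predicted, reference):
    """Fraction of held-out states where the predicted control sign is wrong."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    return float(np.mean(predicted != reference))


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two finite point clouds (rows are points)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return float(max(pairwise.min(axis=1).max(), pairwise.min(axis=0).max()))


# Toy inputs: four held-out labels and two small point clouds.
rate = misclassification_rate([1, -1, 1, 1], [1, 1, 1, -1])
gap = hausdorff_distance([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [1.0, 1.0]])
```

The pairwise-distance matrix grows with the product of the two cloud sizes, so for dense surface samples a tree-based nearest-neighbour query would be a more economical choice.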

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The comments highlight opportunities to strengthen the presentation of our algebraic certification and numerical validation. We address each major comment below.

point-by-point responses
  1. Referee: [root-finding procedure (Section 3)] The central claim that the trained classifier furnishes an accurate approximation to the time-optimal feedback law rests on the completeness of the root set recovered by deflated Newton. No cross-validation against homotopy continuation or Gröbner-basis methods is reported to confirm that every real root is found for the sampled parameter instances, nor is any sensitivity analysis given for root perturbations near the decision boundary.

    Authors: The completeness of the recovered roots is guaranteed by the Hermite quadratic form, which supplies a sharp, exact upper bound on the number of real roots for each parameter instance. The deflated Newton iteration is continued until this certified count is attained, providing an algebraic guarantee rather than a heuristic one. This is the methodological distinction noted in the referee summary. Cross-validation against homotopy or Gröbner methods is therefore unnecessary and would be computationally infeasible at the dimensions considered. A clarifying sentence on this certification will be added to Section 3. revision: partial

  2. Referee: [Numerical experiments (Section 5)] Numerical tests on integrators of increasing dimension are described, yet no quantitative error metrics (e.g., misclassification rate on a held-out test set, Hausdorff distance to exact switching surfaces, or closed-loop performance degradation relative to the exact bang-bang law) are supplied. Without these, it is impossible to assess whether the reported “accuracy” supports the real-time-control claim.

    Authors: We agree that explicit quantitative metrics would strengthen the evaluation. In the revised manuscript we will report misclassification rates on held-out test sets, Hausdorff distances to the exact surfaces (computable in low dimensions), and closed-loop trajectory comparisons against the exact bang-bang law. revision: yes

  3. Referee: [dataset construction (Section 4)] The parameter-sampling strategy used to generate the training set is not shown to be dense in neighborhoods of the switching surfaces. Coverage gaps or under-sampling near the decision boundary would directly degrade the classifier’s reliability precisely where the feedback law is most sensitive.

    Authors: The sampling draws parameters from the distribution of states visited by bang-bang trajectories, which inherently concentrates samples near switching surfaces. To make this explicit, the revision will include a short analysis and a supplementary figure quantifying local density in neighborhoods of the decision boundaries. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation relies on independent numerical root-finding and supervised learning

full rationale

The paper's chain proceeds from the bang-bang theorem to a parameter-dependent polynomial system, solved via deflated Newton informed by the Hermite form to label a synthetic dataset, then trains a neural binary classifier on that data. Each step uses external classical numerical methods (deflated Newton, Hermite quadratic form) whose correctness does not depend on the neural network output or any fitted parameters later renamed as predictions. No self-definitional loops, fitted inputs called predictions, or load-bearing self-citations appear in the described construction. The central claim is an approximation whose accuracy is assessed by numerical tests rather than by algebraic identity with the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; ledger entries are inferred from the described method and standard optimal-control assumptions.

axioms (1)
  • domain assumption: the bang-bang theorem applies to the linear nilpotent systems under consideration.
    Invoked to characterize the time-optimal trajectory as a parameter-dependent polynomial system.

pith-pipeline@v0.9.0 · 5681 in / 1215 out tokens · 26476 ms · 2026-05-22T22:07:53.415342+00:00 · methodology


Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

  [1] G. Albi, S. Bicego, M. Herty, Y. Huang, D. Kalise, and C. Segala. Data/moment-driven approaches for fast predictive control of collective dynamics, 2024. arXiv:2402.15611
  [2] B. Azmi, D. Kalise, and K. Kunisch. Optimal feedback law recovery by gradient-augmented sparse polynomial regression. Journal of Machine Learning Research, 22(48):1–32, 2021
  [3] S. Basu, R. Pollack, and M. Roy. Algorithms in Real Algebraic Geometry. Algorithms and Computation in Mathematics. Springer Berlin, Heidelberg, second edition, 2006
  [4] R. Bellman, I. Glicksberg, and O. Gross. On the “bang-bang” control problem. Quart. Appl. Math., 14:11–18, 1956
  [5] M. Ben-Or, D. Kozen, and J. Reif. The complexity of elementary algebra and geometry. Journal of Computer and System Sciences, 32:251–264, 1986
  [6] S. Bicego, D. Kalise, and G. A. Pavliotis. Computation and control of unstable steady states for mean field multiagent systems, 2024. arXiv:2406.11725
  [7] B. Bonnard, O. Cots, Y. Privat, and E. Trélat. Zermelo navigation on the sphere with revolution metrics. In Ivan Kupka legacy—a tour through controlled dynamics, volume 12 of AIMS Ser. Appl. Math., pages 35–65. Am. Inst. Math. Sci. (AIMS), Springfield, MO, 2023
  [8] K. M. Brown and W. B. Gearhart. Deflation techniques for the calculation of further solutions of a nonlinear system. Numer. Math., 16:334–342, 1970/71
  [9] E. Charalampidis, N. Boullé, P. Farrell, and P. Kevrekidis. Bifurcation analysis of stationary solutions of two-dimensional coupled Gross–Pitaevskii equations using deflated continuation. Communications in Nonlinear Science and Numerical Simulation, 87:105255, 2020
  [10] C.-T. Chen and C. Desoer. A proof of controllability of Jordan form state equations. IEEE Trans. Automatic Control, AC-13:195–196, 1968
  [11] C. Cipriani, A. Scagliotti, and T. Wöhrer. A minimax optimal control approach for robust neural ODEs. In 2024 European Control Conference (ECC), pages 58–64, 2024
  [12] C. Cohen. Formalization of a sign determination algorithm in real algebraic geometry. 2021
  [13] D. Cox, J. Little, and D. O’Shea. Ideals, varieties, and algorithms. Undergraduate Texts in Mathematics. Springer, Cham, fourth edition, 2015
  [14] S. Dolgov, D. Kalise, and L. Saluzzi. Data-driven tensor train gradient cross approximation for Hamilton–Jacobi–Bellman equations. SIAM Journal on Scientific Computing, 45(5):A2153–A2184, 2023
  [15] T. Dubé. The structure of polynomial ideals and Gröbner bases. SIAM Journal on Computing, 19(4):750–773, 1990
  [16] P. Farrell, A. Birkisson, and S. Funke. Deflation techniques for finding distinct solutions of nonlinear partial differential equations. SIAM Journal on Scientific Computing, 37(4):A2026–A2045, 2015
  [17] L. Gaillard and M. Safey El Din. Solving parameter-dependent semi-algebraic systems. In International Symposium on Symbolic and Algebraic Computation, pages 447–456, 2024
  [18] D. Grayson and M. Stillman. Macaulay2, a software system for research in algebraic geometry. Available at https://macaulay2.com
  [19] B. Karg and S. Lucia. Efficient representation and approximation of model predictive control laws via deep learning. IEEE Transactions on Cybernetics, 50(9):3866–3878, 2020
  [20] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In Y. Bengio and Y. LeCun, editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015
  [21] J. P. LaSalle. Time optimal control systems. Proc. Nat. Acad. Sci. U.S.A., 45:573–577, 1959
  [22] Li, Wang, and Li. An efficient spectral trust-region deflation method for multiple solutions. Journal of Scientific Computing, 95:105–255, 2020
  [23] J. Lopez Garcia, K. Maluccio, F. Sottile, and T. Yahl. RealRoots – a Macaulay2 package. Version 0.1. A Macaulay2 package available at https://macaulay2.com/doc/Macaulay2/share/doc/Macaulay2/RealRoots/html/index.html
  [24] T. Nakamura-Zimmerer, Q. Gong, and W. Kang. Adaptive deep learning for high-dimensional Hamilton–Jacobi–Bellman equations. SIAM Journal on Scientific Computing, 43(2):A1221–A1247, 2021
  [25] H. Park and G. Regensburger, editors. Gröbner Bases in Control Theory and Signal Processing. De Gruyter, Berlin, Boston, 2007
  [26] D. Patil and D. Chakraborty. Computation of time optimal feedback control using Groebner basis. IEEE Trans. Automat. Control, 59(8):2271–2276, 2014
  [27] D. Patil, A. Mulla, D. Chakraborty, and H. Pillai. Computation of feedback control for time optimal state transfer using Groebner basis. Systems & Control Letters, 79:1–7, 2015
  [28] H. Phuoc Le and M. Safey El Din. Solving parametric systems of polynomial equations over the reals through Hermite matrices. Journal of Symbolic Computation, 112:25–61, 2022
  [29] L. Pontryagin, V. Boltyanskii, R. Gamkrelidze, and E. Mishchenko. The mathematical theory of optimal processes. A Pergamon Press Book. The Macmillan Company, New York, 1964. Translated by D. E. Brown
  [30] F. Rauscher and O. Sawodny. Efficient online trajectory planning for integrator chain dynamics using polynomial elimination. IEEE Robotics and Automation Letters, 6(3), 2021
  [31] J. Sylvester. On a remarkable modification of Sturm’s theorem. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 5:446–456, 1853
  [32] D. C. Vu, M. D. Pham, T. T. H. Nguyen, T. V. A. Nguyen, and T. L. Nguyen. Time-optimal trajectory generation and observer-based hierarchical sliding mode control for ballbots with system constraints. Internat. J. Robust Nonlinear Control, 34(11):7580–7610, 2024
  [33] U. Walther, T. Georgiou, and A. Tannenbaum. On the computation of switching surfaces in optimal control: a Gröbner basis approach. IEEE Trans. Automat. Control, 46(4):534–540, 2001
  [34] Y. Wang, C. Hu, Z. Li, S. Lin, S. He, and Y. Zhu. Time-optimal control for high-order chain-of-integrators systems with full state constraints and arbitrary terminal states. IEEE Transactions on Automatic Control, 70(3):1499–1514, 2025