Kernel-Based LMI Approaches to Solving the Hamilton-Jacobi-Bellman Equation and Nonlinear Optimal Control
Pith reviewed 2026-05-21 12:07 UTC · model grok-4.3
The pith
An explicit Riccati-Hessian equality constraint at equilibrium lets a kernel LMI formulation approximate solutions to the Hamilton-Jacobi-Bellman equation without collapsing to the trivial solution.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Representing the gradient of the value function in a reproducing kernel Hilbert space and converting the Hamilton-Jacobi-Bellman inequality into a Schur-complement linear matrix inequality yields a convex semidefinite program in the kernel coefficients. The novel Riccati-Hessian equality constraint imposed at the equilibrium removes the trivial zero solution and enforces consistency with the algebraic Riccati equation of the linearized dynamics. The resulting approximation satisfies the suboptimality relation J(x0; û) − V*(x0) ≤ ε T(x0), where T(x0) is determined solely by the system data and the working domain.
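As a sketch of the Schur-complement step, assume control-affine dynamics $\dot x = f(x) + g(x)u$ and a running cost $q(x) + u^\top R u$ with $R \succ 0$. This summary does not state the paper's exact setting or the direction of its inequality, so the symbols $f$, $g$, $q$, $R$ and the sign convention below are assumptions chosen for illustration.

```latex
% Assumed setting (not taken from the paper):
%   \dot x = f(x) + g(x) u,  cost = \int_0^\infty q(x) + u^\top R u \, dt,  R \succ 0.
\begin{aligned}
&\nabla V(x)^\top f(x) + q(x)
  - \tfrac14\, \nabla V(x)^\top g(x) R^{-1} g(x)^\top \nabla V(x) \;\ge\; 0 \\[4pt]
&\iff\quad
\begin{bmatrix}
\nabla V(x)^\top f(x) + q(x) & \tfrac12\, \nabla V(x)^\top g(x) \\[2pt]
\tfrac12\, g(x)^\top \nabla V(x) & R
\end{bmatrix} \succeq 0 .
\end{aligned}
```

Every entry of the block matrix is affine in $\nabla V$, so if $\nabla V$ is a linear combination of kernel sections, the constraint is a linear matrix inequality in the coefficients. Under these assumptions $\nabla V \equiv 0$ satisfies the inequality whenever $q \ge 0$, which is the trivial solution the equality constraint is meant to exclude.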
What carries the argument
The Riccati-Hessian equality constraint at the equilibrium point, which forces the Hessian of the kernel approximant to satisfy the algebraic Riccati equation of the linearized system.
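A minimal Python sketch of what the constraint checks, on a hypothetical scalar linearisation. The coefficients `a`, `b`, `q`, `r` and the convention V(x) ≈ p·x² near the origin (so that V''(0) = 2p) are assumptions made for illustration, not values from the paper.

```python
import math

# Hypothetical scalar linearisation: x' = a*x + b*u, running cost q*x**2 + r*u**2.
a, b, q, r = 1.0, 1.0, 1.0, 1.0

def are_residual(p):
    """Scalar algebraic Riccati residual 2*a*p - (b*p)**2 / r + q."""
    return 2.0 * a * p - (b * p) ** 2 / r + q

# Stabilising root of the scalar Riccati equation (closed loop a - b*b*p/r < 0).
p_star = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def hessian_gap(dV, h=1e-5):
    """|V''(0) - 2*p_star|, with V''(0) from a central difference of the gradient dV."""
    v_xx = (dV(h) - dV(-h)) / (2.0 * h)
    return abs(v_xx - 2.0 * p_star)

def trivial(x):
    """Gradient of the trivial candidate V = 0."""
    return 0.0

def anchored(x):
    """Gradient of the quadratic candidate V = p_star * x**2."""
    return 2.0 * p_star * x

print("ARE residual at p_star:", are_residual(p_star))
print("Hessian gap, trivial candidate:", hessian_gap(trivial))
print("Hessian gap, anchored candidate:", hessian_gap(anchored))
```

The trivial candidate has a Hessian gap of 2·p_star, so an equality constraint of this kind rules it out, while the quadratic candidate built from the Riccati root passes.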
If this is right
- The semidefinite program is convex and can be solved by standard interior-point solvers.
- The suboptimality bound remains valid even when the exact value function is not contained in the reproducing kernel Hilbert space.
- On the benchmark problems, the method recovers the exact polynomial solution to within 3×10⁻⁷ when it lies in the chosen space, and on the Van der Pol oscillator it attains the smallest HJB residual of any method tested.
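The convexity claim in the first bullet rests on the constraint being affine in the kernel coefficients. The Python sketch below checks that property numerically on a hypothetical 1D problem. The dynamics, the cubic polynomial kernel, the centres, and the gradient ansatz `grad_v` are all illustrative assumptions; the paper's actual parametrisation is not given in this summary.

```python
# Hypothetical 1D data: x' = f(x) + g(x)*u, running cost q(x) + r*u**2.
f = lambda x: x - x**3
g = lambda x: 1.0
q = lambda x: x**2
r = 1.0

centres = [-1.0, -0.5, 0.5, 1.0]           # assumed kernel centres
kernel = lambda x, c: (1.0 + x * c) ** 3   # assumed polynomial kernel

def grad_v(alpha, x):
    """Illustrative gradient ansatz dV(x) = sum_i alpha_i * k(x, c_i)."""
    return sum(a_i * kernel(x, c) for a_i, c in zip(alpha, centres))

def lmi_block(alpha, x):
    """2x2 Schur block [[dV*f + q, dV*g/2], [dV*g/2, r]] at the point x."""
    dv = grad_v(alpha, x)
    off = 0.5 * dv * g(x)
    return [[dv * f(x) + q(x), off], [off, r]]

def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are nonnegative."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] >= -tol and m[1][1] >= -tol and det >= -tol

def combo(m1, m2, t):
    """Entrywise combination t*m1 + (1 - t)*m2."""
    return [[t * u + (1 - t) * v for u, v in zip(row1, row2)]
            for row1, row2 in zip(m1, m2)]

alpha = [0.3, -0.2, 0.5, 0.1]
beta = [-1.0, 0.4, 0.0, 2.0]
t, x = 0.25, 0.7
mixed = [t * ai + (1 - t) * bi for ai, bi in zip(alpha, beta)]

lhs = lmi_block(mixed, x)
rhs = combo(lmi_block(alpha, x), lmi_block(beta, x), t)
max_diff = max(abs(u - v) for r1, r2 in zip(lhs, rhs) for u, v in zip(r1, r2))
print("affinity defect:", max_diff)
print("zero coefficients feasible:", is_psd_2x2(lmi_block([0.0] * 4, x)))
```

Because the block is affine in the coefficients, the set of coefficients that keep it positive semidefinite at every sample point is convex. The last line also shows that, in this toy setup, all-zero coefficients are feasible for the inequality alone.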
Where Pith is reading between the lines
- The same LMI-plus-constraint structure could be applied to other reproducing kernel spaces or to polynomial sum-of-squares bases for systems whose value functions admit known structure.
- Because the bound T(x0) depends only on data and domain size, the approach may remain useful for real-time receding-horizon implementations where exact dynamic programming is intractable.
- Scaling the number of kernel centers while monitoring both residual and actual closed-loop cost offers a practical diagnostic for when further refinement ceases to improve performance.
Load-bearing premise
The gradient of the value function lies sufficiently close to the chosen reproducing kernel Hilbert space that the linear matrix inequality and the Riccati-Hessian constraint together produce an approximation whose suboptimality stays bounded.
What would settle it
Simulate the closed loop from a chosen initial state, compute the cost J(x0; û), and check whether the gap J(x0; û) − V*(x0) exceeds the predicted bound ε T(x0) by more than numerical tolerance; separately, check whether the Hessian of the approximant at the equilibrium matches the algebraic Riccati solution of the linearized dynamics.
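The cost comparison can be sketched in Python on a toy problem where V* is available in closed form. The system x' = x − x³ + u with running cost x² + u² is a hypothetical example, not the paper's benchmark, and the LQR policy stands in for an arbitrary approximate controller û. The quantity ε T(x0) is not computed here because this summary does not define T; the printed gap is what one would compare against it.

```python
import math

# Hypothetical 1D problem (not the paper's): x' = f(x) + u, cost = integral of x**2 + u**2.
f = lambda x: x - x**3

def dv_star(x):
    """Exact HJB gradient: root of dV*f + x**2 - dV**2/4 = 0 with x*dV >= 0."""
    return 2.0 * (f(x) + math.copysign(math.sqrt(f(x) ** 2 + x * x), x))

def u_star(x):
    """Optimal feedback u = -dV*/2 for unit control weight."""
    return -0.5 * dv_star(x)

def u_lqr(x):
    """LQR from the linearisation x' = x + u, with Riccati root p = 1 + sqrt(2)."""
    return -(1.0 + math.sqrt(2.0)) * x

def value_star(x0, n=2000):
    """V*(x0) as the integral of dv_star over [0, x0], composite Simpson rule (n even)."""
    h = x0 / n
    s = dv_star(0.0) + dv_star(x0)
    s += sum((4 if i % 2 else 2) * dv_star(i * h) for i in range(1, n))
    return s * h / 3.0

def closed_loop_cost(x0, policy, horizon=20.0, dt=1e-3):
    """RK4 on the augmented state (x, accumulated cost) over a finite horizon."""
    def rhs(x):
        u = policy(x)
        return f(x) + u, x * x + u * u
    x, cost = x0, 0.0
    for _ in range(int(horizon / dt)):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1[0])
        k3 = rhs(x + 0.5 * dt * k2[0])
        k4 = rhs(x + dt * k3[0])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        cost += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return cost

x0 = 1.0
v = value_star(x0)
for name, policy in [("optimal", u_star), ("LQR", u_lqr)]:
    gap = closed_loop_cost(x0, policy) - v
    print(f"{name}: J - V* = {gap:.3e}")
```

The optimal policy should reproduce V*(x0) up to integration error, while any other stabilising policy gives a nonnegative gap. For a kernel-based controller, that gap is the quantity to compare against ε T(x0).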
Original abstract
We present a kernel-based linear matrix inequality (LMI) approach for the approximate solution of Hamilton--Jacobi--Bellman (HJB) equations arising in nonlinear optimal control. The method represents the gradient of the value function in a reproducing kernel Hilbert space (RKHS) and uses a Schur-complement reformulation to convert the quadratic HJB inequality into an LMI that is linear in the kernel coefficients, yielding a convex semidefinite program. The novel ingredient is an explicit Riccati--Hessian \emph{equality} constraint at the equilibrium, which removes the trivial solution and forces the Hessian of the approximation to match the algebraic Riccati equation solution of the linearised system. We give a suboptimality bound $J(x_0;\hat u) - V^*(x_0)\le \varepsilon\,T(x_0)$ in which $T(x_0)$ depends only on the problem data and the working domain (not on the approximation), and an RKHS approximation rate. Numerical experiments on a corrected 1D polynomial benchmark and on the Van der Pol oscillator measure $\varepsilon$, the RKHS approximation error, and the closed-loop cost $J(x_0;\hat u)$ versus the optimal value $V^*(x_0)$. On the 1D problem with $V^*$ in the polynomial-kernel RKHS the method recovers $V^*$ to within $3\times10^{-7}$ and achieves $0.000\%$ suboptimality. On Van der Pol it achieves the smallest HJB residual ($\varepsilon\approx 2.62$) of any method tested, beats LQR on every initial condition, and is within $0.42\%$ of the best per-IC cost (Albrekht order 6). When $V^*$ is not in the chosen RKHS, the method degrades gracefully: residuals stop improving with more centres but suboptimality remains bounded ($\le 13\%$ on the 1D test).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a kernel-based LMI approach to approximately solve the Hamilton-Jacobi-Bellman equation for nonlinear optimal control. The gradient of the value function is represented in an RKHS, the quadratic HJB inequality is converted to a convex LMI via Schur complement, and a novel Riccati-Hessian equality constraint is imposed at the equilibrium to exclude the trivial zero solution and enforce consistency with the linearized algebraic Riccati equation. A suboptimality bound J(x0; û) - V*(x0) ≤ ε T(x0) is derived where T(x0) depends only on problem data and domain, together with an RKHS approximation rate. Numerical results are reported on a corrected 1D polynomial system (recovery to 3e-7, 0% suboptimality) and the Van der Pol oscillator (smallest HJB residual among tested methods, within 0.42% of best per-IC cost).
Significance. If the central derivations hold, the work supplies a convex SDP formulation for nonlinear optimal control with an explicit, data-dependent suboptimality guarantee that remains valid even when the true value function lies outside the chosen RKHS. This combination of convexity, a non-triviality constraint that anchors the approximation to the linearization, and a bound independent of the particular kernel coefficients is a meaningful contribution to approximate dynamic programming and could enable reliable controller synthesis for systems where exact HJB solution is intractable.
minor comments (3)
- [Abstract] The phrase 'corrected 1D polynomial benchmark' should briefly indicate what the original benchmark was and what correction was applied, so that readers can immediately place the reported 3e-7 recovery in context.
- [Numerical experiments] Numerical experiments section: the specific kernel (e.g., polynomial degree or Gaussian bandwidth) and the placement rule for the kernel centers should be stated explicitly, together with the number of centers used in each example, to support reproducibility.
- [Numerical experiments] The suboptimality percentages (0.000% on the 1D case, 0.42% on Van der Pol) are given relative to different references; a short clarifying sentence on how each percentage is computed would remove any ambiguity.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our work, for recognizing the combination of convexity, the Riccati-Hessian anchoring constraint, and the data-dependent suboptimality bound as a meaningful contribution, and for recommending minor revision. The referee's description of the method and results is accurate.
Circularity Check
No significant circularity; derivation is self-contained
The paper derives its LMI formulation by representing the value-function gradient in an RKHS, applying a Schur-complement transformation to the HJB inequality, and imposing an explicit linear Riccati-Hessian equality at the origin to exclude the zero solution. The suboptimality bound is stated to depend only on problem data and domain (not on the kernel coefficients or approximation error), and the numerical examples measure residuals and costs directly against known optima without reducing any central claim to a fitted input or self-referential definition. No load-bearing self-citation chains, uniqueness theorems imported from the authors, or ansatzes smuggled via prior work are used to justify the core steps; the approach remains convex and internally consistent even when V* lies outside the RKHS.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The gradient of the optimal value function lies in the reproducing kernel Hilbert space spanned by the chosen kernel and centres.
- standard math: The linearised system around the equilibrium admits a stabilising solution to the algebraic Riccati equation.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Passage: The novel ingredient is an explicit Riccati-Hessian equality constraint at the equilibrium, which removes the trivial solution and forces the Hessian of the approximation to match the algebraic Riccati equation solution of the linearised system.
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean, J_uniquely_calibrated_via_higher_derivative (tag: unclear)
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Passage: We give a suboptimality bound J(x0;û) - V*(x0) ≤ ε T(x0) in which T(x0) depends only on the problem data and the working domain.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.