
arXiv: 2604.26172 · v2 · submitted 2026-04-28 · eess.SY · cs.AI · cs.LG · cs.SY · math.OC · stat.ML

Co-Learning Port-Hamiltonian Systems and Optimal Energy-Shaping Control

Pith reviewed 2026-05-08 03:03 UTC · model grok-4.3

classification: eess.SY · cs.AI · cs.LG · cs.SY · math.OC · stat.ML
keywords: port-Hamiltonian systems · energy-shaping control · physics-informed learning · passivity-based control · trajectory data · neural networks · pendulum systems · optimal control

The pith

A framework co-learns port-Hamiltonian models and energy-shaping controllers from trajectory data to produce inherently passive and stable closed-loop behavior.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method that jointly learns a port-Hamiltonian representation of system dynamics and an energy-balancing passivity-based controller. Both are represented by neural networks and refined through alternating optimization, where trajectory data collected under the current policy updates the model and the controller is then re-optimized on the new model. This structure embeds energy interactions so the resulting controller exploits the plant's passive dynamics without canceling its natural potential. A dissipation term is added during training to enforce strict energy decay, supporting transfer from simulation to real hardware. The approach is demonstrated on regulation and swing-up tasks for planar and torsional pendulums.
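The alternating loop described above can be illustrated with a deliberately small stand-in. The sketch below is not the paper's implementation: a unit-mass spring-damper with two unknown scalars replaces the neural port-Hamiltonian model, least squares replaces gradient-based model training, and a grid search replaces controller optimization. All names and constants (`K_TRUE`, `D_TRUE`, `co_learn`, the gain grids, the cost weights) are hypothetical.

```python
import numpy as np

K_TRUE, D_TRUE = 2.0, 0.3          # hypothetical plant parameters, hidden from the learner
Q_STAR, DT, STEPS = 1.0, 0.01, 300

def policy(q, p, gains, k_model):
    """Energy-shaping law u = -dVa/dq - kd*p, where
    Va(q) = 0.5*ka*(q - Q_STAR)**2 - k_model*Q_STAR*q moves the minimum of V + Va to Q_STAR."""
    ka, kd = gains
    return -ka * (q - Q_STAR) + k_model * Q_STAR - kd * p

def rollout(gains, k_model, k_plant, d_plant, q=0.0, p=0.0):
    """Simulate q' = p, p' = -k*q - d*p + u with semi-implicit Euler and log the data."""
    log = []
    for _ in range(STEPS):
        u = policy(q, p, gains, k_model)
        p_next = p + DT * (-k_plant * q - d_plant * p + u)
        log.append((q, p, u, (p_next - p) / DT))
        q, p = q + DT * p_next, p_next
    return np.array(log)           # columns: q, p, u, finite-difference p_dot

def fit_model(log):
    """Model step: least-squares estimate of (k, d) from p_dot - u = -k*q - d*p."""
    q, p, u, p_dot = log.T
    A = np.column_stack([-q, -p])
    k_hat, d_hat = np.linalg.lstsq(A, p_dot - u, rcond=None)[0]
    return float(k_hat), float(d_hat)

def optimize_controller(k_model, d_model):
    """Controller step: grid search over nonnegative gains on the learned model only."""
    best_gains, best_cost = None, np.inf
    for ka in np.linspace(0.5, 8.0, 8):
        for kd in np.linspace(0.0, 4.0, 9):
            q, p, u, _ = rollout((ka, kd), k_model, k_model, d_model).T
            cost = DT * np.sum((q - Q_STAR) ** 2 + 0.1 * u ** 2)
            if cost < best_cost:
                best_gains, best_cost = (float(ka), float(kd)), cost
    return best_gains

def co_learn(iterations=3):
    k_hat, d_hat, gains = 1.0, 0.0, (1.0, 0.5)       # crude initial model and controller
    for _ in range(iterations):
        log = rollout(gains, k_hat, K_TRUE, D_TRUE)  # data collected under the current policy
        k_hat, d_hat = fit_model(log)                # refine the model on that data
        gains = optimize_controller(k_hat, d_hat)    # re-optimize the controller on the model
    return k_hat, d_hat, gains
```

Because the toy data are noise-free and the model class matches the plant exactly, the fit recovers the parameters almost immediately; that is an artifact of the simplification and says nothing about convergence of the neural version.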

Core claim

By parameterizing a port-Hamiltonian model and an energy-balancing passivity-based controller with neural networks and alternating between model refinement on policy-generated trajectories and controller re-optimization, the method produces a controller that renders the closed-loop system passive and provably stable while preserving the plant's natural energy structure.

What carries the argument

Alternating optimization of neural-network-parameterized port-Hamiltonian dynamics and energy-balancing passivity-based controllers, combined with dissipation regularization to enforce energy decay.

Load-bearing premise

The true dynamics admit an accurate port-Hamiltonian representation that the neural networks can capture, and the alternating optimization converges to a solution that preserves the stability and passivity guarantees.

What would settle it

A closed-loop experiment in which total energy increases over time or the system becomes unstable under the learned controller would show that the passivity and stability claims do not hold.
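A check of this kind could be run on logged trajectories roughly as follows. This is a sketch under stated assumptions: a damped pendulum stands in for the closed loop, its mechanical energy stands in for the shaped closed-loop energy, and the function names and the tolerance `tol` are hypothetical.

```python
import numpy as np

def pendulum_rhs(x, damping):
    """Unit pendulum: q' = p, p' = -sin(q) - damping * p."""
    q, p = x
    return np.array([p, -np.sin(q) - damping * p])

def simulate(x0, damping, dt=0.01, steps=1000):
    """Fixed-step RK4 rollout; returns an array of shape (steps + 1, 2)."""
    xs = [np.array(x0, dtype=float)]
    for _ in range(steps):
        x = xs[-1]
        k1 = pendulum_rhs(x, damping)
        k2 = pendulum_rhs(x + 0.5 * dt * k1, damping)
        k3 = pendulum_rhs(x + 0.5 * dt * k2, damping)
        k4 = pendulum_rhs(x + dt * k3, damping)
        xs.append(x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0)
    return np.array(xs)

def energy(xs):
    """Kinetic plus potential energy, with q measured from the downward position."""
    return 0.5 * xs[:, 1] ** 2 + (1.0 - np.cos(xs[:, 0]))

def energy_violations(energies, tol=1e-9):
    """Indices k where the energy rises from sample k to k + 1 by more than tol."""
    return np.flatnonzero(np.diff(energies) > tol)
```

An empty result from `energy_violations` is consistent with passivity on that trajectory but does not prove it; a non-empty result on hardware data, beyond what sensor noise explains, would be the kind of counterexample described above.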

Figures

Figures reproduced from arXiv: 2604.26172 by Ankur Kamboj, Biswadip Dey, Vaibhav Srivastava.

Figure 1. Left: The complete computational graph with different subsystems. Pink blocks and arrows represent NN parameterization. Green blocks are various system states. Purple arrows indicate automatic differentiation to obtain gradients. Right: The training consists of a warm-up phase for initial pH system learning from step-excited data, followed by alternating optimization iterations that refine the system mode…
Figure 2. Learned system matrices for 1-link planar pendulum: (a), (b), and (c) show that $M_{\theta_1}$, $V_{\theta_2}(q)$, and $g_{\theta_3}(q)$ match the ground truth with an offset (invariance owing to training on $\dot{q}$ instead of $p$). (d) The relative error in states shows rollout for 3 s with 95% confidence bands; the y-axis is log-scaled for clearer representation.
5.2 Training Details. For learning the system model, we minimize the integrated error…
Figure 3. The resulting closed-loop system is stabilized by the proposed controller for the target configuration $(q^*, p^*) = (0, 0)$; here the pendulum angle $q$ is measured relative to the negative y-axis.
Figure 4. Swing-up Optimal EB-PBC for 1-link Planar Pendulum: Optimal EB-PBC controlled system trajectories for (a) learned system, and (b) true system for 100 initial conditions sampled from $P_{z_0}$.
[Plot: "Optimal EB-PBC Potential Shaping"; horizontal axis $q$; curves $V^*_\phi(q)$, $V(q)$, and $V(q) + V^*_\phi(q)$.]
Figure 5. Energy Shaping for Swing-up Control of 1-link Planar Pendulum: The proposed framework learns the added desired potential $V^*_\phi$ to place the minimum at $\pm\pi$ for pendulum swing-up.
5.5 Comparing Optimal EB-PBC with PD + Potential Compensation for Swing-up Control of Torsional Pendulum. We now show the performance of the optimal EB-PBC controller for the swing-up of a torsional pendulum. The Hamiltonian of this …
Figure 6. Optimal EB-PBC on 1-link Torsional Pendulum: (a) shows the learned pH system with perfectly converged system parameters. Reshaped potential landscape in (b) shows how optimal EB-PBC learns to utilize the natural potential to achieve the minimum at $q^* = \pi$ compared to the PD+ controller that cancels and adds a quadratic potential. (c) The performance of optimal EB-PBC versus the optimized PD+ control for sw…
Figure 7. Snapshots of the learned state-dependent damping
Figure 8. Optimal EB-PBC on 2-link Torsional Pendulum: (a) and (b) show the learned pH system with perfectly converged system parameters. Reshaped potential landscape in (c) shows how optimal EB-PBC learns to utilize the natural potential to achieve the minimum at $q^* = [\pi, 0]$. (d) The performance of optimal EB-PBC versus the optimized standard EB-PBC control for swing-up of the torsional pendulum shows how optimall…
Original abstract

We develop a physics-informed learning framework for energy-shaping control of port-Hamiltonian (pH) systems from trajectory data. The proposed approach co-learns a pH system model and an optimal energy-balancing passivity-based controller (EB-PBC) through alternating optimization with policy-aware data collection. At each iteration, the system model is refined using trajectory data collected under the current control policy, and the controller is re-optimized on the updated model. Both components are parameterized by neural networks that embed the pH dynamics and EB-PBC structure, ensuring interpretability in terms of energy interactions. The learned controller renders the closed-loop system inherently passive and provably stable, and exploits passive plant dynamics without canceling the natural potential. A dissipation regularization enforces strict energy decay during training, thereby enhancing robustness to sim-to-real gaps. The proposed framework is validated on state-regulation and swing-up tasks for planar and torsional pendulum systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript develops a physics-informed co-learning framework that alternates between fitting a neural-network-parameterized port-Hamiltonian (pH) model to trajectory data and re-optimizing an energy-balancing passivity-based controller (EB-PBC) on the updated model. Both the pH dynamics (skew-symmetric J, positive-semidefinite R, convex Hamiltonian H) and the EB-PBC law are embedded in the network architectures. A dissipation regularization term is added during training. The central claim is that the resulting controller renders the closed-loop system inherently passive and provably stable while exploiting rather than canceling the plant's natural potential; the method is demonstrated on state-regulation and swing-up tasks for planar and torsional pendulums.

Significance. If the stability and passivity guarantees can be shown to transfer from the learned model to the true plant under bounded approximation error, the approach would provide a principled route to structure-preserving, interpretable controllers for underactuated mechanical systems without requiring exact first-principles models. The combination of alternating optimization, policy-aware data collection, and dissipation regularization directly targets the sim-to-real gap while preserving energy-based passivity arguments.

major comments (2)
  1. [Abstract] Abstract: The claim that the learned controller is 'provably stable' and renders the closed-loop system 'inherently passive' is load-bearing for the contribution, yet the provided text establishes these properties only with respect to the learned pH model. No Lipschitz bounds on the neural-network approximation of J, R, and H, nor any robustness margin for the EB-PBC design under model mismatch, are referenced; without such arguments the transfer of stability to the true plant remains unproven.
  2. [Abstract] Abstract and alternating-optimization description: The framework relies on the alternating loop converging to a fixed point that preserves the pH structure (skew-symmetry of J, R ≽ 0, convexity of H) so that the EB-PBC passivity proof continues to hold. No convergence analysis, contraction mapping, or even empirical monitoring of these structural invariants across iterations is supplied; this omission directly affects whether the optimality and stability claims survive the co-learning procedure.
minor comments (1)
  1. [Abstract] The abstract states that the controller 'exploits passive plant dynamics without canceling the natural potential,' but the precise mechanism (e.g., how the learned Hamiltonian is used inside the EB-PBC law) is not expanded; a short clarifying sentence or equation reference would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. The points raised concerning the scope of the stability and passivity claims, as well as the convergence properties of the alternating optimization, are important for clarifying the manuscript's contributions. We address each major comment below and specify the revisions that will be made.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that the learned controller is 'provably stable' and renders the closed-loop system 'inherently passive' is load-bearing for the contribution, yet the provided text establishes these properties only with respect to the learned pH model. No Lipschitz bounds on the neural-network approximation of J, R, and H, nor any robustness margin for the EB-PBC design under model mismatch, are referenced; without such arguments the transfer of stability to the true plant remains unproven.

    Authors: We agree that the passivity and stability properties are formally established only for the learned port-Hamiltonian model, since the EB-PBC design and closed-loop analysis are performed with respect to the parameterized dynamics. The abstract will be revised to explicitly qualify these guarantees as holding for the learned model. The dissipation regularization term is introduced precisely to promote robustness against model mismatch, and the empirical validation on the pendulum systems provides supporting evidence for practical transfer. We will add a dedicated discussion paragraph addressing approximation errors and the absence of explicit robustness margins, while avoiding any unsubstantiated transfer claims.
    Revision: partial

  2. Referee: [Abstract] Abstract and alternating-optimization description: The framework relies on the alternating loop converging to a fixed point that preserves the pH structure (skew-symmetry of J, R ≽ 0, convexity of H) so that the EB-PBC passivity proof continues to hold. No convergence analysis, contraction mapping, or even empirical monitoring of these structural invariants across iterations is supplied; this omission directly affects whether the optimality and stability claims survive the co-learning procedure.

    Authors: The neural-network architectures are explicitly constructed to enforce the required pH structure (skew-symmetric interconnection matrix, positive-semidefinite dissipation matrix, and convex Hamiltonian) at every iteration by design. While a theoretical convergence analysis of the alternating procedure is not provided, we will augment the manuscript with empirical monitoring of the structural invariants (specifically, the skew-symmetry residual norm, the minimum eigenvalue of the dissipation matrix, and convexity checks on the Hamiltonian) across co-learning iterations. These results will be reported to demonstrate that the invariants are preserved in practice, thereby supporting the applicability of the EB-PBC stability arguments to the learned models.
    Revision: partial
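The three monitored quantities named in this response could be computed along the following lines. This is a hedged sketch, not the authors' code: the function names, the finite-difference Hessian, and the choice of sample points are assumptions, and a learned model would supply `J`, `R`, and `H` as evaluated network outputs.

```python
import numpy as np

def hessian_fd(H, x, eps=1e-4):
    """Central finite-difference Hessian of a scalar function H at the point x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    E = eps * np.eye(n)
    hess = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            hess[i, j] = (H(x + E[i] + E[j]) - H(x + E[i] - E[j])
                          - H(x - E[i] + E[j]) + H(x - E[i] - E[j])) / (4 * eps ** 2)
    return 0.5 * (hess + hess.T)

def invariant_report(J, R, H, points):
    """Structural diagnostics to log once per co-learning iteration."""
    sym_R = 0.5 * (R + R.T)
    return {
        # Zero when J is exactly skew-symmetric.
        "skew_residual": float(np.linalg.norm(J + J.T)),
        # Nonnegative when R is positive semidefinite.
        "min_eig_R": float(np.linalg.eigvalsh(sym_R).min()),
        # Nonnegative when H is locally convex at every sampled point.
        "min_eig_hess_H": float(min(np.linalg.eigvalsh(hessian_fd(H, x)).min()
                                    for x in points)),
    }
```

A sampled convexity check only covers the points supplied to it. A negative `min_eig_hess_H` is conclusive, as it would be for a potential such as 1 - cos q evaluated near q = π, whereas a nonnegative value only says that no violation was found at those points.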

Circularity Check

0 steps flagged

No significant circularity; derivation grounded in external pH theory

Full rationale

The paper parameterizes both the pH model and EB-PBC controller with neural networks that embed the standard port-Hamiltonian structure (skew-symmetric J, positive-semidefinite R, convex H) and energy-balancing passivity-based control form. Stability and passivity claims follow directly from classical pH passivity theory applied to the learned model, which is fitted to external trajectory data via alternating optimization. No central quantity is defined in terms of itself, no fitted parameter is renamed as a prediction, and no load-bearing step reduces to a self-citation chain or ansatz smuggled from prior author work. The dissipation regularization is an added training term, not a definitional loop. The framework remains self-contained against the independent benchmarks of pH theory and trajectory data.
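One common way to embed this structure "by construction" is to build J and R from unconstrained matrices, so that skew-symmetry and positive semidefiniteness hold for any parameter values. The sketch below assumes that construction, which the source does not confirm is the paper's, and uses a quadratic Hamiltonian in place of a neural one. It then checks the power balance numerically. All names are hypothetical.

```python
import numpy as np

def make_skew(A):
    """J = A - A^T is skew-symmetric for any square A."""
    return A - A.T

def make_psd(L):
    """R = L L^T is symmetric positive semidefinite for any L."""
    return L @ L.T

def ph_rhs(x, u, J, R, Q, G):
    """pH vector field for H(x) = 0.5 x^T Q x: x' = (J - R) grad H + G u, y = G^T grad H."""
    grad_H = Q @ x
    x_dot = (J - R) @ grad_H + G @ u
    y = G.T @ grad_H
    return x_dot, y, grad_H

rng = np.random.default_rng(0)
n, m = 4, 2
J = make_skew(rng.normal(size=(n, n)))
R = make_psd(rng.normal(size=(n, n)))
C = rng.normal(size=(n, n))
Q = C @ C.T + 1e-3 * np.eye(n)        # positive definite, so H is convex
G = rng.normal(size=(n, m))
x, u = rng.normal(size=n), rng.normal(size=m)

x_dot, y, grad_H = ph_rhs(x, u, J, R, Q, G)
stored_power = grad_H @ x_dot          # dH/dt along the vector field
supplied_power = y @ u                 # power entering through the port
dissipated_power = grad_H @ R @ grad_H # nonnegative because R is PSD
```

The identity `stored_power = supplied_power - dissipated_power` holds for every state and input because the J term drops out of the quadratic form; this is the passivity argument that the guarantees on the learned model rest on.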

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The approach rests on the domain assumption that physical systems admit port-Hamiltonian representations and on the standard passivity-based control result that EB-PBC yields stability; one regularization hyperparameter is introduced to enforce energy decay.

free parameters (1)
  • dissipation regularization weight
    Hyperparameter that enforces strict energy decay during training to improve robustness.
axioms (2)
  • [domain assumption] System dynamics admit a port-Hamiltonian representation
    Invoked to justify embedding pH structure inside the neural networks.
  • [domain assumption] EB-PBC applied to a pH plant yields passivity and stability
    Standard result from passivity-based control theory used to claim closed-loop guarantees.
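For reference, the second axiom can be written out in the standard textbook form. The notation below is an assumption, since the source page gives no equations: $\beta$ is the energy-shaping feedback, $v$ a new input, $H_a$ the energy added by the controller, and $H_d$ the shaped energy.

```latex
\begin{aligned}
\dot{x} &= \bigl[J(x) - R(x)\bigr]\nabla H(x) + g(x)\,u, \qquad y = g(x)^{\top}\nabla H(x),\\
& J(x) = -J(x)^{\top}, \qquad R(x) = R(x)^{\top} \succeq 0,\\
\dot{H} &= \nabla H^{\top}\dot{x} = -\nabla H^{\top} R\,\nabla H + y^{\top}u \;\le\; y^{\top}u,\\
u &= \beta(x) + v, \qquad \dot{H}_a = -y^{\top}\beta(x), \qquad H_d = H + H_a,\\
\dot{H}_d &= -\nabla H^{\top} R\,\nabla H + y^{\top}v \;\le\; y^{\top}v .
\end{aligned}
```

If $x^*$ is a strict local minimum of $H_d$, then $H_d$ serves as a Lyapunov function for $v = 0$, and damping injection $v = -K_d\,y$ with $K_d \succ 0$ adds further dissipation. Both statements concern the model on which $\beta$ was designed, which is why the referee's point about model mismatch matters.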

pith-pipeline@v0.9.0 · 5479 in / 1333 out tokens · 62285 ms · 2026-05-08T03:03:48.988478+00:00 · methodology

