arxiv: 2604.13325 · v1 · submitted 2026-04-14 · 💻 cs.RO · cs.SY · eess.SY

Boundary Sampling to Learn Predictive Safety Filters via Pontryagin's Maximum Principle

Pith reviewed 2026-05-10 14:23 UTC · model grok-4.3

classification 💻 cs.RO · cs.SY · eess.SY
keywords safety filters · Pontryagin Maximum Principle · Hamilton-Jacobi reachability · control barrier functions · boundary sampling · autonomous systems · shared control

The pith

Pontryagin's Maximum Principle identifies boundary trajectories that guide efficient data sampling for learning predictive safety filters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to improve the efficiency of learning safety filters by focusing data collection on the most relevant states. It does this by using the Pontryagin Maximum Principle to characterize trajectories that come close to violating safety constraints. These boundary trajectories direct the sampling process for training a model of the Control Barrier Value Function based on Hamilton-Jacobi reachability. The learned function is then used to filter controls and enforce safety. Validation on a shared-control racing car application shows faster convergence, lower failure rates, and more accurate reconstruction of the safe set.

Core claim

The authors establish that trajectories which marginally satisfy the safety constraints, as characterized by the Pontryagin Maximum Principle, provide highly informative samples for learning the Control Barrier Value Function. This learned function approximates the solution to a Hamilton-Jacobi reachability problem and serves as the basis for a predictive safety filter that can be applied in real time. The result is improved learning performance compared to uniform sampling, as demonstrated through simulations and physical experiments.
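
For orientation, one common way to write the Control Barrier Value Function and the filtering constraint it induces is sketched here. The notation follows the cited CBVF literature and is an assumption; the paper's own symbols, time convention, and handling of disturbances may differ. Here the dynamics are \dot{x} = f(x,u), the safe set is S = {x : \ell(x) \ge 0}, T is the horizon, and \gamma \ge 0 is the discount rate.

```latex
% Assumed notation, not taken verbatim from the paper.
B_\gamma(x,t) \;=\; \max_{u(\cdot)} \; \min_{s \in [t,T]} \; e^{\gamma (s-t)} \, \ell\big(x(s)\big),
\qquad
u^\star(x,t) \;=\; \arg\min_{u \in U} \; \lVert u - u_{\mathrm{ref}} \rVert^2
\quad \text{s.t.} \quad
\partial_t B_\gamma(x,t) + \nabla_x B_\gamma(x,t) \cdot f(x,u) + \gamma \, B_\gamma(x,t) \;\ge\; 0 .
```

Read this way, the learned network stands in for B_\gamma, and the filter changes the requested control u_ref only when the constraint would otherwise be violated.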

What carries the argument

Boundary trajectories generated by applying the Pontryagin Maximum Principle to the safety-constrained optimal control problem, used to guide sampling when training the Control Barrier Value Function.
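
To make the idea concrete, here is a minimal sketch on a toy double integrator (position p, velocity v, bounded acceleration u) with a wall at p = P_MAX. This is not the paper's vehicle model or its exact boundary-value formulation; the system, the constants, and all function names are illustrative assumptions. The state and costate are integrated backward from the point where the trajectory just touches the wall, with the control chosen to maximize the Hamiltonian.

```python
import numpy as np

U_MAX = 1.0  # control bound |u| <= U_MAX (toy value, an assumption)
P_MAX = 5.0  # safe set is {p <= P_MAX} (toy value, an assumption)

def extremal_control(lam_v):
    # PMP: u maximizes H = lam_p * v + lam_v * u over |u| <= U_MAX.
    # At the tangency instant lam_v = 0, so we tie-break toward braking.
    return U_MAX * np.sign(lam_v) if lam_v != 0.0 else -U_MAX

def augmented_dynamics(z):
    p, v, lam_p, lam_v = z
    u = extremal_control(lam_v)
    # State: p' = v, v' = u. Costate: lam' = -dH/dx = (0, -lam_p).
    return np.array([v, u, 0.0, -lam_p])

def boundary_trajectory(horizon, dt=1e-3):
    """Integrate state and costate backward from the tangency point (P_MAX, 0)."""
    # Terminal costate is the inward normal of {p <= P_MAX}.
    z = np.array([P_MAX, 0.0, -1.0, 0.0])
    traj = [z.copy()]
    for _ in range(int(round(horizon / dt))):
        z = z - dt * augmented_dynamics(z)  # explicit Euler step, backward in time
        traj.append(z.copy())
    return np.array(traj)[::-1]  # rows ordered forward in time

# Barely-safe states for several horizons, usable as training samples.
samples = np.vstack([boundary_trajectory(T)[:, :2] for T in (0.5, 1.0, 2.0)])
```

In this toy case every sampled state lies, up to discretization error, on the curve p + v^2 / (2 U_MAX) = P_MAX, which is the boundary of the set from which braking can still avoid the wall. That is the sense in which extremal trajectories concentrate samples where the value function changes sign.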

If this is right

  • Learning of the safety filter converges with fewer samples and reaches target performance in less time.
  • The deployed filter produces fewer safety violations during operation.
  • The reconstructed safe set more closely matches the true reachable safe set.
  • The resulting filter supports real-time use with computation times around 3 milliseconds.
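
One way to quantify the safe-set claim is sketched below, assuming both the learned and the reference value functions have been evaluated on the same grid of states. The metric choices (intersection over union and a false-safe fraction) and the function names are illustrative assumptions, not necessarily the metrics reported in the paper.

```python
import numpy as np

def safe_set_iou(v_learned, v_true):
    """Intersection over union of the zero-superlevel sets {V >= 0}."""
    learned = np.asarray(v_learned) >= 0.0
    true = np.asarray(v_true) >= 0.0
    union = np.logical_or(learned, true).sum()
    if union == 0:
        return 1.0  # both sets are empty, so they agree trivially
    return np.logical_and(learned, true).sum() / union

def false_safe_fraction(v_learned, v_true):
    """Fraction of grid points the learned function calls safe but the reference does not."""
    learned = np.asarray(v_learned) >= 0.0
    true = np.asarray(v_true) >= 0.0
    return np.logical_and(learned, ~true).mean()
```

The false-safe fraction is the more safety-relevant of the two, since it counts states where a filter built on the learned function would fail to intervene.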

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same boundary-sampling idea could be applied to train other types of learned safety certificates beyond Hamilton-Jacobi value functions.
  • In multi-vehicle or multi-agent settings the approach might supply targeted data for coordinated safety constraints.
  • Combining PMP sampling with online adaptation could allow safety filters to maintain performance as the environment changes.

Load-bearing premise

Trajectories identified by the Pontryagin Maximum Principle as boundary cases are representative of the safety-critical states needed to learn an accurate Control Barrier Value Function without bias or gaps.

What would settle it

A test case in which a safety filter trained on PMP boundary samples permits constraint violations that a filter trained on uniform random samples prevents.
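
A search for such a test case could be organized as in the sketch below. It uses a toy double integrator with a wall at P_MAX and hand-written stand-in filters in place of trained networks; the dynamics, constants, and function names are all illustrative assumptions. Each filter is rolled out against a driver who always requests full acceleration toward the wall, and the harness reports initial states from which one filter violates the constraint while the other does not.

```python
P_MAX, U_MAX = 5.0, 1.0  # toy wall position and control bound (assumptions)

def violates(filter_fn, x0, steps=6000, dt=1e-3):
    """Roll out a double integrator under a wall-seeking driver; True if p exceeds P_MAX."""
    p, v = x0
    for _ in range(steps):
        u = max(-U_MAX, min(U_MAX, filter_fn(p, v, U_MAX)))  # driver requests +U_MAX
        p, v = p + dt * v, v + dt * u                         # explicit Euler step
        if p > P_MAX:
            return True
    return False

def make_brake_filter(margin):
    """Stand-in for a learned filter: brake once the stopping point nears the wall."""
    def filter_fn(p, v, u_ref):
        stopping_point = p + max(v, 0.0) ** 2 / (2.0 * U_MAX)
        return -U_MAX if stopping_point >= P_MAX - margin else u_ref
    return filter_fn

def find_counterexamples(initial_states, filter_a, filter_b):
    """Initial states from which filter_a violates the constraint but filter_b does not."""
    return [x0 for x0 in initial_states
            if violates(filter_a, x0) and not violates(filter_b, x0)]
```

With a filter trained on PMP boundary samples passed as filter_a and a filter trained on uniform samples as filter_b, a non-empty result would be the kind of evidence this section describes.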

Figures

Figures reproduced from arXiv: 2604.13325 by James Dallas, John Subosits, John Talbot, Jonathan DeCastro, Somil Bansal, Thomas Lew.

Figure 1
Figure 1. Left: Extremal trajectory that barely stays in the safe set S and reachable sets from x0. Right: Generation of boundary trajectories to learn a safety filter.
Figure 2
Figure 2. Tangent ball.
Figure 3
Figure 3. Experimental platform.
Figure 4
Figure 4. Failure rate between PMP and uniform sampling. Performance vs training epochs and prediction horizon (top). Performance vs dataset size (bottom). PMP consistently improves failure rate. Shaded regions denote ±1 S.D. Significance shown with ∗ p < 0.05 and ∗∗ p < 0.01.
Figure 5
Figure 5. Value function at a speed of 10.5 m/s for 50k epochs and 30k samples for PMP sampling (center) and uniform sampling (right). Ground truth (Sec. V-A) is shown on left. Brown solid line is the 0-level set of the value function and black dashed lines are track edges and centerline.
Figure 6
Figure 6. Shared control experiment with γ = 0.1 for steering difference (left), torque difference (center), and velocity (right). Color depicts intervention (left, center) and velocity magnitude (right).
Figure 7
Figure 7. Segment of γ = 0.1 experiment for steering (left) and position (right). CBVF-QP output is shown in blue and driver request in green. Track edges are shown in red.
Original abstract

Safety filters provide a practical approach for enforcing safety constraints in autonomous systems. While learning-based tools scale to high-dimensional systems, their performance depends on informative data that includes states likely to lead to constraint violation, which can be difficult to efficiently sample in complex, high-dimensional systems. In this work, we characterize trajectories that barely avoid safety violations using the Pontryagin Maximum Principle. These boundary trajectories are used to guide data collection for learned Hamilton-Jacobi Reachability, concentrating learning efforts near safety-critical states to improve efficiency. The learned Control Barrier Value Function is then used directly for safety filtering. Simulations and experimental validation on a shared-control automotive racing application demonstrate PMP sampling improves learning efficiency, yielding faster convergence, reduced failure rates, and improved safe set reconstruction, with wall times around 3ms.
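
The filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes control-affine dynamics x' = f(x) + g(x) u, a time-invariant value function, a single constraint, and no input bounds, in which case the quadratic program has a closed-form solution. The function name and arguments are hypothetical, and the paper's CBVF-QP may include terms omitted here.

```python
import numpy as np

def cbvf_filter(u_ref, value, grad_value, f_x, g_x, gamma):
    """Minimally modify u_ref so that grad_V . (f + g u) + gamma * V >= 0."""
    u_ref = np.atleast_1d(np.asarray(u_ref, dtype=float))
    a = grad_value @ g_x                  # constraint row: a . u + b >= 0
    b = grad_value @ f_x + gamma * value
    slack = a @ u_ref + b
    if slack >= 0.0 or not np.any(a):
        # Constraint already holds, or it does not depend on u at this state.
        return u_ref
    # Euclidean projection of u_ref onto the half-space {u : a . u + b >= 0}.
    return u_ref - (slack / (a @ a)) * a
```

As a usage example under the same toy assumptions: for a double integrator at p = 4, v = 1 with V = 5 - p - v^2/2 (so V = 0.5 and grad V = (-1, -1)), f = (1, 0), g = (0, 1)^T and γ = 0.1, a requested acceleration of 1.0 is replaced by approximately -0.95, while a requested braking input of -1.0 passes through unchanged.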

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes using Pontryagin's Maximum Principle (PMP) to characterize and sample boundary trajectories that marginally avoid safety violations. These trajectories guide data collection for learning a Control Barrier Value Function via Hamilton-Jacobi reachability, which is then deployed directly as a predictive safety filter. The approach is evaluated in simulation and on hardware for a shared-control automotive racing task, with claims of faster convergence, lower failure rates, improved safe-set reconstruction, and real-time inference around 3 ms.

Significance. If the central claim holds, the work provides a principled, PMP-based mechanism for concentrating samples near safety boundaries, which could improve data efficiency for learning-based safety filters in autonomous systems where uniform or random sampling is inefficient. The explicit use of an established optimal-control principle for sampling, combined with hardware validation on a real-time task, strengthens the contribution relative to purely heuristic sampling strategies.

major comments (2)
  1. [§3] §3 (Boundary Trajectory Characterization): The manuscript applies PMP to generate extremal trajectories from selected initial conditions but provides no formal coverage guarantee or density bound showing that these trajectories densely sample the relevant safety boundary manifold. In high-dimensional state spaces this leaves open the possibility of systematic gaps, directly undermining the claim that PMP sampling improves safe-set reconstruction without bias.
  2. [§5] §5 (Experimental Validation): All quantitative results (convergence speed, failure rates, safe-set metrics) are reported for an automotive racing system whose state dimension is modest (position/velocity/heading). No scaling experiments or analysis address whether the observed efficiency gains persist when state dimension increases and the boundary manifold becomes harder to cover with PMP trajectories.
minor comments (2)
  1. [Figure 4] Figure 4 (safe-set reconstruction): The plots would benefit from overlaying multiple independent runs with shaded variance to demonstrate consistency of the reported improvement.
  2. [§2] Notation: The distinction between the learned Control Barrier Value Function and the true Hamilton-Jacobi value function is occasionally blurred in the text; a short clarifying paragraph in §2 would help.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the detailed and constructive review of our manuscript. Below, we provide point-by-point responses to the major comments, clarifying our approach and indicating the revisions made to address the concerns raised.

  1. Referee: [§3] §3 (Boundary Trajectory Characterization): The manuscript applies PMP to generate extremal trajectories from selected initial conditions but provides no formal coverage guarantee or density bound showing that these trajectories densely sample the relevant safety boundary manifold. In high-dimensional state spaces this leaves open the possibility of systematic gaps, directly undermining the claim that PMP sampling improves safe-set reconstruction without bias.

    Authors: We thank the referee for highlighting this important theoretical aspect. It is correct that the manuscript does not provide a formal coverage guarantee or density bound for the PMP-sampled trajectories. Deriving such guarantees for the manifold coverage in general nonlinear systems is a complex problem that would require additional theoretical development, such as analysis of the reachability of the boundary value problems. Our contribution focuses on the practical utility of using PMP to generate boundary trajectories that are guaranteed to be extremal with respect to the safety constraints by the maximum principle. This ensures that the data is concentrated on the safety boundary rather than being biased towards safe or unsafe regions as in uniform sampling. In the revised manuscript, we have expanded Section 3 to discuss the selection of initial conditions and acknowledge the potential for incomplete coverage in very high-dimensional spaces, while emphasizing that the method remains unbiased in targeting the boundary. We believe this addresses the concern without requiring a full density proof at this stage. revision: partial

  2. Referee: [§5] §5 (Experimental Validation): All quantitative results (convergence speed, failure rates, safe-set metrics) are reported for an automotive racing system whose state dimension is modest (position/velocity/heading). No scaling experiments or analysis address whether the observed efficiency gains persist when state dimension increases and the boundary manifold becomes harder to cover with PMP trajectories.

    Authors: We agree that the experiments are conducted on a system with modest state dimension, specifically the 6D kinematic bicycle model used in the automotive racing task. The manuscript does not include explicit scaling experiments to higher dimensions. However, the racing application involves complex, nonlinear dynamics and tight safety boundaries that are representative of challenges in autonomous systems. The PMP sampling is an offline process, and the learned safety filter achieves real-time performance. In the revision, we have added a subsection discussing the computational requirements of PMP trajectory generation as a function of state dimension and how the number of samples can be adapted. We also note that the hardware validation demonstrates the method's effectiveness in a real-world setting where safety is critical. We believe these additions provide context on scalability while the core results stand. revision: partial

Circularity Check

0 steps flagged

No significant circularity: PMP is external principle applied to sampling


The paper's derivation applies the standard Pontryagin Maximum Principle (an established result from optimal control theory, independent of the authors) to characterize boundary trajectories, then uses those trajectories to guide data collection for learning the Control Barrier Value Function via Hamilton-Jacobi reachability. This sampled data trains the safety filter without any load-bearing step reducing by construction to a fitted parameter, self-defined quantity, or self-citation chain. The central claims (improved learning efficiency and safe-set reconstruction) rest on empirical validation rather than tautological re-labeling of inputs. No self-definitional, fitted-prediction, or uniqueness-imported patterns appear in the provided abstract or derivation outline.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the applicability of Pontryagin's Maximum Principle to characterize safety boundary trajectories in the given control problems; no free parameters or new invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Pontryagin's Maximum Principle can characterize trajectories that barely avoid safety violations in the relevant optimal control problems.
    Directly invoked in the abstract to generate boundary samples for learning.

pith-pipeline@v0.9.0 · 5448 in / 1150 out tokens · 30508 ms · 2026-05-10T14:23:54.404133+00:00 · methodology


Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages

  1. [1]

    A predictive safety filter for learning-based control of constrained nonlinear dynamical systems,

    K. P. Wabersich and M. N. Zeilinger, “A predictive safety filter for learning-based control of constrained nonlinear dynamical systems,” Automatica, vol. 129, p. 109597, 2021

  2. [2]

    Control barrier functions: Theory and applications,

    A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” in European Control Conference, 2019, pp. 3420–3431

  3. [3]

    Safety filter for lane-keeping control,

    C. Jiang, H. Gan, I. Vörös, D. Takács, and G. Orosz, “Safety filter for lane-keeping control,” in 16th International Symposium on Advanced Vehicle Control, 2024, pp. 1–6

  4. [4]

    Control barrier functions for shared control and vehicle safety,

    J. Dallas, J. Talbot, M. Suminaka, M. Thompson, T. Lew, G. Orosz, and J. Subosits, “Control barrier functions for shared control and vehicle safety,” in American Control Conference. IEEE, Jul. 2025, pp. 4203–4210

  5. [5]

    Advances in the theory of control barrier functions: Addressing practical challenges in safe control synthesis for autonomous and robotic systems,

    K. Garg, J. Usevitch, J. Breeden, M. Black, D. Agrawal, H. Parwana, and D. Panagou, “Advances in the theory of control barrier functions: Addressing practical challenges in safe control synthesis for autonomous and robotic systems,” Annual Reviews in Control, vol. 57, p. 100945, 2024

  6. [6]

    Safety-critical model predictive control with discrete-time control barrier function,

    J. Zeng, B. Zhang, and K. Sreenath, “Safety-critical model predictive control with discrete-time control barrier function,” in American Control Conference, 2021, pp. 3882–3889

  7. [7]

    Forward invariance in trajectory spaces for safety-critical control,

    M. Vahs, R. I. C. Muchacho, F. T. Pokorny, and J. Tumova, “Forward invariance in trajectory spaces for safety-critical control,” in IEEE International Conference on Robotics and Automation, 2025, pp. 3926–3932

  8. [8]

    Hamilton-Jacobi reachability: A brief overview and recent advances,

    S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-Jacobi reachability: A brief overview and recent advances,” in IEEE Conference on Decision and Control, 2017

  9. [9]

    Robust control barrier–value functions for safety-critical control,

    J. J. Choi, D. Lee, K. Sreenath, C. J. Tomlin, and S. L. Herbert, “Robust control barrier–value functions for safety-critical control,” in IEEE Conference on Decision and Control, 2021

  10. [10]

    Deepreach: A deep learning approach to high-dimensional reachability,

    S. Bansal and C. J. Tomlin, “Deepreach: A deep learning approach to high-dimensional reachability,” in IEEE International Conference on Robotics and Automation, 2021, pp. 1817–1824

  11. [11]

    Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,

    M. Raissi, P. Perdikaris, and G. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686–707, 2019

  12. [12]

    Self-adaptive physics-informed neural networks,

    L. D. McClenny and U. M. Braga-Neto, “Self-adaptive physics-informed neural networks,” Journal of Computational Physics, vol. 474, p. 111722, 2023

  13. [13]

    Bridging model predictive control and deep learning for scalable reachability analysis,

    Z. Feng, L. Qiu, and S. Bansal, “Bridging model predictive control and deep learning for scalable reachability analysis,” in Robotics: Science and Systems, 2025

  14. [14]

    Safety with agency: Human-centered safety filter with application to ai-assisted motorsports,

    D. D. Oh, J. Lidard, H. Hu, H. Sinhmar, E. Lazarski, D. Gopinath, E. S. Sumner, J. A. DeCastro, G. Rosman, N. E. Leonard, and J. F. Fisac, “Safety with agency: Human-centered safety filter with application to ai-assisted motorsports,” 2025

  15. [15]

    A. A. Agrachev and Y. L. Sachkov, Control Theory from the Geometric Viewpoint. Springer Berlin Heidelberg, 2004

  16. [16]

    Optimal control and applications to aerospace: Some results and challenges,

    E. Trélat, “Optimal control and applications to aerospace: Some results and challenges,” Journal of Optimization Theory & Applications, vol. 154, no. 3, pp. 713–758, 2012

  17. [17]

    Singular Trajectories and their Role in Control Theory

    B. Bonnard and M. Chyba, Singular Trajectories and their Role in Control Theory. Springer Berlin Heidelberg, 2003

  18. [18]

    Ellipsoidal techniques for reachability analysis,

    A. B. Kurzhanski and P. Varaiya, “Ellipsoidal techniques for reachability analysis,” in Hybrid Systems: Computation and Control, 2000

  19. [19]

    Study of methods for computing directional reachability,

    R. A. Natherson and D. J. Scheeres, “Study of methods for computing directional reachability,” Journal of Guidance, Control, and Dynamics, vol. 48, no. 12, pp. 2860–2869, 2025

  20. [20]

    Convex hulls of reachable sets,

    T. Lew, R. Bonalli, and M. Pavone, “Convex hulls of reachable sets,” IEEE Transactions on Automatic Control, vol. 70, no. 12, pp. 8195–8209, 2025

  21. [21]

    Reachability barrier networks: Learning Hamilton-Jacobi solutions for smooth and flexible control barrier functions,

    M. Kim, W. Sharpless, H. J. Jeong, S. Tonkens, S. Bansal, and S. Herbert, “Reachability barrier networks: Learning Hamilton-Jacobi solutions for smooth and flexible control barrier functions,” 2025

  22. [22]

    Estimating the convex hull of the image of a set with smooth boundary: Error bounds and applications,

    T. Lew, R. Bonalli, L. Janson, and M. Pavone, “Estimating the convex hull of the image of a set with smooth boundary: Error bounds and applications,” Discrete & Computational Geometry, vol. 74, no. 1, pp. 203–241, 2024

  23. [23]

    J. M. Lee, Introduction to Smooth Manifolds, 2nd ed. Springer New York, 2012

  24. [24]

    Time-optimal switching surfaces for triple integrator under full box constraints,

    Y. Wang, C. Hu, and Z. Jin, “Time-optimal switching surfaces for triple integrator under full box constraints,” in American Control Conference, 2026

  25. [25]

    An introduction to optimal control,

    U. Boscain and B. Piccoli, “An introduction to optimal control,” Contrôle non linéaire et applications, pp. 19–66, 2005

  26. [26]

    Adaptive nonlinear model predictive control: Maximizing tire force and obstacle avoidance in autonomous vehicles,

    M. Thompson, J. Dallas, J. Y. M. Goh, and A. Balachandran, “Adaptive nonlinear model predictive control: Maximizing tire force and obstacle avoidance in autonomous vehicles,” IEEE Transactions on Field Robotics, vol. 1, pp. 318–331, 2024

  27. [27]

    Tire and Vehicle Dynamics

    H. Pacejka, Tire and Vehicle Dynamics. Elsevier, 2005